CN117580962A - Methods and systems for detecting kidney disease or damage by gene expression analysis - Google Patents

Info

Publication number
CN117580962A
Authority
CN
China
Prior art keywords
kidney disease
subject
sample
diabetic nephropathy
injury
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180099114.4A
Other languages
Chinese (zh)
Inventor
刘颖冰
海伦·赵
阮威明
艾伦·H·吴
伊丽莎白·J·墨菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Momentum Biotechnology Co ltd
University of California
Original Assignee
Momentum Biotechnology Co ltd
University of California
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Momentum Biotechnology Co ltd, University of California

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/158: Expression markers

Abstract

The present invention provides methods and systems for detecting kidney disease or damage. A method may comprise (a) analyzing a body sample of a subject to produce a data set comprising levels of one or more gene expression products in the body sample, the levels corresponding to a set of genes associated with kidney disease or damage; (b) processing the data set with a computer to determine whether the subject has kidney disease or damage, or is at elevated risk of developing it; and (c) electronically outputting a report indicating whether the subject has, or is at elevated risk of having, kidney disease or damage.

Description

Methods and systems for detecting kidney disease or damage by gene expression analysis
[ Background Art ]
The International Society of Nephrology estimates that 850 million people worldwide are affected by kidney disease. Diabetic nephropathy is a major cause of kidney disease and the most common cause of end-stage kidney disease (ESRD). Diabetic nephropathy is also associated with higher cardiovascular and all-cause morbidity and mortality, so timely diagnosis and treatment are of paramount importance. Diabetic nephropathy is kidney disease induced by diabetes and generally refers to kidney damage caused by the high blood glucose levels of diabetes. It progresses slowly, and effective early treatment can slow or even stop its progression. Diabetic nephropathy may be associated with measurable biomarkers, such as proteinuria and/or a low estimated glomerular filtration rate (eGFR) in a subject's body sample; however, these biomarkers may be non-specific for diabetic nephropathy and may instead result from other diseases or damage, such as diabetes, hypertension, IgA nephropathy, membranous nephropathy, lupus nephritis, minimal change disease, rheumatoid arthritis, use of non-steroidal anti-inflammatory drugs, smoking, excessive alcohol consumption, drug abuse, obesity, urinary tract infection, kidney stones, benign prostatic hyperplasia, and the like. In addition, patients with well-controlled diabetes may still have diabetic nephropathy, while patients with poorly controlled diabetes may have little kidney damage. Diabetic nephropathy may also coexist with other types of kidney disease or damage.
[ Summary of the Invention ]
Diabetic nephropathy is both under-diagnosed and misdiagnosed. For example, it may be under-diagnosed because diagnostic methods such as kidney biopsy can be risky (e.g., a mortality rate of 1.8%), expensive, and time consuming; as a result, many patients choose not to undergo such diagnostic tests. As another example, diabetic nephropathy may be misdiagnosed because diagnostic methods may lack sufficient sensitivity and specificity. For example, urinary albumin detection may be insensitive and non-specific. As another example, tests of kidney structure and size, such as X-ray imaging and ultrasound, can be time consuming and indirect, and other tests, such as urine sediment examination, urine protein electrophoresis (UPEP), serum protein electrophoresis (SPEP), blood urine tests, antinuclear antibody (ANA) tests, HBV detection, HCV detection, and HIV detection, provide limited information and are therefore indirect and inefficient. In vitro diagnostic techniques, such as those based on proteomics, genomics, and protein biomarkers, can struggle to detect, assess, and monitor kidney disease or damage with high sensitivity, specificity, and accuracy. Early-stage diabetic changes are also difficult to identify. A common biomarker is proteinuria; however, proteinuria, particularly at low levels, can be confounded by many other factors, such as hypertension, smoking, alcoholism, drug abuse, obesity, infection, obstruction, and the like. Thus, many patients may miss the best opportunity for early drug intervention or may receive treatment unrelated to the underlying etiology. Receiving such unnecessary and/or ineffective therapy can be costly and time consuming, and can delay the delivery of other, effective therapies to the patient.
In view of the recognized need for improved methods of detecting, assessing, and monitoring kidney disease (e.g., diabetic nephropathy) that are rapid, inexpensive, non-invasive, highly sensitive, specific, and accurate, the present disclosure provides methods, systems, and kits for detecting kidney disease or damage by processing biological samples (e.g., tissue samples, cell samples, and/or body fluid samples) obtained or derived from a subject. For example, nucleic acids, proteins, or cells of a biological sample may be analyzed. Biological samples obtained from subjects can be analyzed to determine the presence or absence of kidney disease or damage, or to make a related assessment. The analysis may be performed on a set of genomic regions, such as genes or genomic loci associated with kidney disease. Subjects may include subjects with kidney disease or damage (e.g., patients with kidney disease or damage) and subjects without kidney disease or damage (e.g., normal or healthy controls).
The methods of the present disclosure may have many advantages over current methods, including convenient, safe, and non-invasive sample collection; the possibility of repeated testing; direct analysis of kidney damage using urine samples; suitability for analysis; suitability for monitoring disease progression and treatment efficacy; suitability for sample collection in a home environment; the ability to perform detection without a detailed medical history of the subject; and the ability to find early diabetic changes (e.g., in asymptomatic subjects).
Using the methods and systems of the present invention, kidney disease or damage can be detected accurately in biological samples (e.g., urine samples) with high sensitivity and specificity. In a urine-based assay, a set of biomarkers can be analyzed using a machine learning algorithm to accurately distinguish control samples from kidney disease samples at different stages of disease (e.g., early, mid, or late). Furthermore, urine-based assays may provide high specificity, which facilitates the non-invasive use of biomarkers associated with kidney disease or damage to monitor the treatment of patients with kidney disease or damage. Urine-based assays may also have higher sensitivity and specificity than kidney biopsy, which is currently considered the gold standard for the diagnosis of kidney disease.
In one aspect, the present disclosure provides a method of processing or analyzing a body sample of a subject, comprising (a) analyzing the body sample to produce a dataset comprising levels of one or more gene expression products in the body sample, wherein the levels of the one or more gene expression products correspond to a set of genes associated with kidney disease or damage; (b) computer processing the dataset from (a) to determine, with an accuracy of at least about 80%, whether the subject has, or is at elevated risk of, kidney disease or damage; and (c) electronically outputting a report indicating whether the subject was determined in (b) to have, or to be at elevated risk of, kidney disease or damage.
In some embodiments, the body sample is selected from the group consisting of a blood sample, a serum sample, a plasma sample, a saliva sample, a stool sample, a sputum sample, a urine sample, a semen sample, a transvaginal fluid sample, a cerebrospinal fluid sample, a sweat sample, a cell sample, and a tissue sample. In some cases, the body sample is a urine sample. In certain embodiments, the urine sample is a fresh urine sample. In some embodiments, the urine sample is a frozen sample (e.g., a fresh frozen sample). In certain embodiments, the urine sample is a preserved urine sample (e.g., preserved in a preservative at room temperature to prevent degradation).
In certain embodiments, (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from a body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and sequencing at least a portion of the cDNA molecules to produce sequencing reads. In certain embodiments, sequencing reads are mapped to a reference sequence (e.g., a reference genome, such as a human genome) to generate a dataset, wherein the dataset comprises gene transcript counts. In certain embodiments, each count of a transcript is indicative of a gene expression event.
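As a minimal sketch of the counting step described above, the following Python assumes that an upstream aligner has already assigned each sequencing read to a gene (or to no gene). The tuple layout, the read IDs, and the gene names `GENE_A`, `GENE_B`, and `GENE_C` are hypothetical placeholders, not taken from the disclosure.

```python
from collections import Counter


def count_transcripts(read_assignments):
    """Tally mapped reads per gene; reads with no gene (None) are skipped."""
    return Counter(gene for _read_id, gene in read_assignments if gene is not None)


# Hypothetical aligner output: (read ID, gene the read mapped to, or None).
read_assignments = [
    ("read1", "GENE_A"),
    ("read2", "GENE_B"),
    ("read3", "GENE_A"),
    ("read4", None),  # did not map to any annotated gene
    ("read5", "GENE_C"),
]

counts = count_transcripts(read_assignments)
print(dict(counts))
```

Each increment of a gene's count corresponds to one mapped read, which is the sense in which a count is indicative of a gene expression event.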
In some embodiments, (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from a body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and analyzing at least a portion of the cDNA molecules by real-time polymerase chain reaction (RT-PCR, also known as qPCR) to produce the dataset. In certain embodiments, the dataset comprises cycle threshold (Ct) values, which are inversely related to the level of gene expression.
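To illustrate why a lower Ct value means higher expression, the sketch below applies the widely used comparative Ct (2^-ΔΔCt) calculation. This calculation is not named in the disclosure and is shown only as one common way to turn Ct values into relative expression; it assumes roughly 100% amplification efficiency, and the function name, reference gene, and Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_ref, ct_target_ctrl, ct_ref_ctrl):
    """Fold change of a target gene versus a control sample (2^-ddCt).

    Assumes about 100% PCR efficiency, so one cycle equals a two-fold change.
    """
    delta_sample = ct_target - ct_ref             # normalize to reference gene
    delta_control = ct_target_ctrl - ct_ref_ctrl  # same for the control sample
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold 2 cycles earlier
# in the test sample than in the control, with an unchanged reference gene.
fold = relative_expression(ct_target=24.0, ct_ref=18.0,
                           ct_target_ctrl=26.0, ct_ref_ctrl=18.0)
print(fold)  # 4.0
```

A two-cycle decrease in Ct corresponds to a four-fold increase in expression under these assumptions, so the relationship is inverse and logarithmic rather than linear.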
In some embodiments, (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from a body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and analyzing at least a portion of the cDNA molecules by microarray analysis (e.g., an Affymetrix microarray) to produce a dataset. In certain embodiments, the dataset comprises counts of gene transcripts. In certain embodiments, each count is indicative of a gene expression event. In certain embodiments, (a) comprises hybridizing ribonucleic acid (RNA) molecules obtained or derived from a body sample with a specific set of probes, and analyzing the hybridized RNA molecules using a Luminex platform to generate a dataset. In certain embodiments, the dataset comprises counts of gene transcripts.
In certain embodiments, (a) comprises selectively enriching at least a portion of the cDNA molecules or RNA molecules for a set of genomic loci associated with kidney disease or damage. In certain embodiments, (a) comprises amplifying at least a portion of the cDNA molecules or RNA molecules. In certain embodiments, (a) comprises aligning at least a portion of the sequencing reads to a reference sequence. In certain embodiments, (a) comprises generating a count of gene transcripts. In certain embodiments, the reference sequence is at least a portion of a human reference genome. In some embodiments, the counts of gene transcripts are normalized to generate normalized gene transcript counts for downstream differential gene expression analysis. In some embodiments, normalization includes determining CPM (counts per million), TPM (transcripts per million), or RPKM/FPKM (reads or fragments per kilobase of exon per million mapped reads or fragments).
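The normalizations named above follow standard definitions, sketched here in plain Python. The raw counts, gene lengths, and gene names are hypothetical illustration data; a production pipeline would typically use an established package rather than these helper functions.

```python
def cpm(counts):
    """Counts per million: scale each count by the library size."""
    total = sum(counts.values())
    return {g: c * 1e6 / total for g, c in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads per kilobase of exon per million mapped reads."""
    total = sum(counts.values())
    return {g: c * 1e9 / (lengths_bp[g] * total) for g, c in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {g: c / (lengths_bp[g] / 1000.0) for g, c in counts.items()}
    scale = sum(rate.values())
    return {g: r * 1e6 / scale for g, r in rate.items()}


# Hypothetical raw counts and exon lengths (base pairs).
counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 600}
lengths_bp = {"GENE_A": 1000, "GENE_B": 2000, "GENE_C": 3000}

print(cpm(counts))
print(rpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

CPM corrects only for sequencing depth, whereas RPKM/FPKM and TPM also correct for gene length. TPM values always sum to one million within a sample, which makes them easier to compare across samples.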
In some embodiments, the kidney disease or injury is selected from the group consisting of early-stage kidney disease, mid-stage kidney disease, end-stage kidney disease, asymptomatic kidney disease, diabetic nephropathy, hypertensive kidney disease, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), non-steroidal anti-inflammatory drug nephrotoxicity, thin basement membrane nephropathy, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 kidney disease, interstitial nephritis, lithium nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis (kidney infection), renal artery stenosis, renal cyst, kidney disease associated with rheumatoid arthritis, and kidney stones. In some cases, the kidney disease or injury is diabetic nephropathy. In some cases, the diabetic nephropathy is early-stage diabetic nephropathy. In some embodiments, the subject has asymptomatic diabetic nephropathy.
In certain embodiments, (b) comprises processing the data set using a trained algorithm. In certain embodiments, the trained algorithm comprises a trained machine learning algorithm. In some embodiments, the trained machine learning algorithm is selected from the group consisting of a support vector machine (SVM), Bayesian classification, linear regression, quantile regression, logistic regression, nonlinear regression, random forests, neural networks, ensemble learning methods, boosting algorithms, AdaBoost algorithms, recursive feature elimination (RFE), and any combination thereof. In certain embodiments, the trained machine learning algorithm includes a recursive feature elimination (RFE) algorithm. In certain embodiments, the trained machine learning algorithm is trained on a plurality of training samples, including a first set of body samples from subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and a second set of body samples from subjects without kidney disease or damage, wherein the first and second sets of body samples used for training are different from the body sample of the subject. In some embodiments, the trained machine learning algorithm is trained on a plurality of training samples, including a first set of body samples from subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and a second set of body samples from subjects having other kidney diseases or damage, wherein the first and second sets of body samples used for training are different from the body sample of the subject.
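Recursive feature elimination, one of the listed algorithms, can be sketched as repeatedly refitting a model and discarding the least informative gene. The following is a minimal pure-Python illustration that pairs RFE with a small logistic regression. The choice of logistic regression as the base model, the hyperparameters, the toy expression values, and the gene names are all assumptions made for illustration and do not reflect the classifier actually trained in the disclosure.

```python
import math


def fit_logistic(X, y, epochs=500, lr=0.1):
    """Fit logistic regression by batch gradient descent; return (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi  # predicted prob minus label
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rfe(X, y, gene_names, n_keep):
    """Refit on the remaining genes, drop the smallest |weight|, and repeat."""
    kept = list(range(len(gene_names)))
    while len(kept) > n_keep:
        X_kept = [[row[j] for j in kept] for row in X]
        w, _ = fit_logistic(X_kept, y)
        weakest = min(range(len(kept)), key=lambda i: abs(w[i]))
        del kept[weakest]
    return [gene_names[j] for j in kept]


# Hypothetical standardized expression values; only GENE_A tracks the label.
gene_names = ["GENE_A", "GENE_B", "GENE_C"]
X = [[2.0, 0.1, -0.1], [1.8, -0.1, 0.1], [2.2, 0.1, 0.1],
     [-2.0, 0.1, -0.1], [-1.8, -0.1, 0.1], [-2.2, 0.1, 0.1]]
y = [1, 1, 1, 0, 0, 0]  # 1 = disease sample, 0 = control sample

selected = rfe(X, y, gene_names, n_keep=1)
print(selected)  # ['GENE_A']
```

Because the model is refit after every removal, the ranking of the remaining genes can change from round to round, which is what distinguishes RFE from a one-pass univariate filter.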
In some embodiments, (a) comprises comparing the levels of the one or more gene expression products to a reference value. In some embodiments, the reference value corresponds to a first set of gene expression products from body samples of subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and/or a second set of gene expression products from body samples of subjects without kidney disease or damage. In some embodiments, the reference value corresponds to a first set of gene expression products from body samples of subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and/or a second set of gene expression products from body samples of subjects having other kidney diseases or damage.
In some embodiments, the method detects the presence or increased risk of kidney disease or injury in the subject with a sensitivity of at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%. In some embodiments, the method detects the presence or increased risk of kidney disease or injury in the subject with a specificity of at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%. In some embodiments, the method detects the presence or increased risk of kidney disease or injury in the subject with a positive predictive value of at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%. In some embodiments, the method detects the presence or increased risk of kidney disease or injury in the subject with a negative predictive value of at least about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99%.
In some embodiments, the method detects the presence or increased risk of kidney disease or injury in the subject with an area under the curve (AUC) of at least about 0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98, or 0.99.
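For reference, the performance measures recited above (sensitivity, specificity, positive and negative predictive value, and AUC) can be computed from labeled samples as in the following sketch. The labels, classifier scores, and the 0.5 decision threshold are hypothetical illustration data, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation; the code assumes each class contains at least one sample.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels and calls."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),  # diseased samples correctly called
        "specificity": tn / (tn + fp),  # controls correctly called
        "ppv": tp / (tp + fp),          # positive calls that are correct
        "npv": tn / (tn + fn),          # negative calls that are correct
    }


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = disease) and classifier scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = confusion_metrics(y_true, y_pred)
print(metrics)
print(auc(y_true, scores))  # 0.9375
```

Sensitivity, specificity, PPV, and NPV depend on the chosen threshold, while the AUC summarizes ranking quality across all thresholds.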
In some embodiments, the method further comprises determining a clinical intervention for the subject based at least in part on the presence or increased risk of kidney disease or damage determined in (b). In some embodiments, the clinical intervention is selected from the group consisting of drug treatment, enhanced glycemic control, hypertension control, lowering of high cholesterol, promotion of bone health, diet control, lifestyle modification, weight loss, exercise, smoking cessation, control of alcohol intake, reduction or cessation of drug abuse, and avoidance of non-steroidal anti-inflammatory drugs. In some embodiments, the drug treatment relies on blockade of the renin-angiotensin-aldosterone system (RAAS), for example with angiotensin-converting enzyme (ACE) inhibitors or angiotensin II receptor blockers (ARBs). In certain embodiments, the drug is an experimental treatment, such as a drug targeting vascular/hemodynamic effects, a drug targeting inflammation, or a drug targeting oxidative stress.
In some embodiments, (b) comprises analyzing a first set of genes to distinguish samples with a first type of kidney disease or damage from samples without kidney disease or damage (negative, NEG), and a second set of genes to distinguish samples with the first type of kidney disease or damage from samples with a second type of kidney disease or damage. In some embodiments, (b) comprises analyzing a first set of genes to distinguish diabetic nephropathy (DN) samples from negative (NEG) samples, and a second set of genes to distinguish DN samples from samples of other, non-diabetic chronic kidney disease (CKD). In certain embodiments, the first set of genes is selected from the gene sets listed in Tables 3 and 5, and the second set of genes is selected from the gene sets listed in Tables 4 and 6. In some embodiments, (b) comprises generating a first score, diabetic nephropathy vs. diabetic negative control, based on the first set of genes, and a second score, diabetic nephropathy vs. CKD, based on the second set of genes. In some embodiments, a first (diabetic nephropathy vs. diabetic negative control) score greater than 0.5 indicates glomerular injury, and a score less than 0.5 indicates tubular injury. In some embodiments, (b) comprises analyzing different male-specific or female-specific gene sets based on the sex of the subject.
In some embodiments, the method further comprises first removing from the diabetic negative control group at least one subset of subjects with a predetermined condition (e.g., obesity, morbid obesity, nicotine dependence, alcohol dependence, drug abuse, kidney stones, severe hypertension, urinary tract infection, heart disease, hepatitis B, hepatitis C, HIV, psoriasis, rheumatoid arthritis, use of non-steroidal anti-inflammatory drugs, etc.), then optionally adding samples from other diabetic negative control subjects without the predetermined condition, to generate a modified diabetic negative control-X group (e.g., wherein X is the predetermined condition). In some cases, if the diabetic nephropathy vs. diabetic negative control-X comparison score is substantially higher than the diabetic nephropathy vs. diabetic negative control comparison score (e.g., by more than 0.1), this indicates that the predetermined condition X causes kidney damage.
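The comparison described above can be sketched as a simple decision rule. The sketch assumes that "> 0.1" refers to the gain in score after subjects with condition X are removed from the control group; that reading, the function name, and the score values are assumptions for illustration only.

```python
def condition_contributes(score_baseline, score_without_x, min_gain=0.1):
    """True if removing condition X from the controls raises the
    diabetic nephropathy vs. control score by more than min_gain."""
    return (score_without_x - score_baseline) > min_gain


# Hypothetical scores: baseline control group vs. control groups with
# subjects having each predetermined condition X removed.
baseline = 0.80
scores_without = {"obesity": 0.93, "psoriasis": 0.85}

flagged = [x for x, s in scores_without.items()
           if condition_contributes(baseline, s)]
print(flagged)  # ['obesity']
```

In this toy case only obesity is flagged, because excluding obese controls improves the separation by 0.13, while excluding controls with psoriasis improves it by only 0.05.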
In some embodiments, the method further comprises removing from the non-diabetic chronic kidney disease (CKD) group at least one subset of subjects having a predetermined condition (e.g., obesity, morbid obesity, nicotine dependence, alcohol dependence, drug dependence, kidney stones, severe hypertension, urinary tract infection, heart disease, hepatitis B, hepatitis C, HIV, psoriasis, rheumatoid arthritis, use of non-steroidal anti-inflammatory drugs, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), thin basement membrane nephropathy, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 kidney disease, interstitial nephritis, lithium nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis (kidney infection), renal artery stenosis, renal cyst, kidney disease associated with rheumatoid arthritis, etc.), and optionally replacing them with other CKD samples not having the predetermined condition, to generate a modified CKD-Y group (e.g., wherein Y is the predetermined condition). In some cases, if the diabetic nephropathy vs. CKD-Y comparison score is substantially higher than the diabetic nephropathy vs. CKD comparison score (e.g., by more than 0.1), this indicates that the predetermined condition Y results in kidney damage.
In some embodiments, the method further comprises analyzing body samples of the subject collected at two or more different points in time to generate two or more datasets, and computer processing the two or more datasets to determine whether a particular kidney disease or injury is present, absent, or at increased risk in the subject. In some embodiments, the method further comprises determining the presence, absence, or increased risk of another type of kidney disease or damage, and electronically outputting a report indicating the presence, absence, or increased risk of that other type of kidney disease or damage in the subject. In some embodiments, the method further comprises analyzing body samples of the subject at two or more different points in time to produce two or more datasets, and computer processing the two or more datasets to determine whether the other type of kidney disease or injury is present, absent, or at increased risk.
In another aspect, the present disclosure provides a method of processing or analyzing a body sample of a subject, comprising (a) analyzing the body sample to produce a dataset comprising levels of one or more gene expression products in the body sample, wherein the levels of the one or more gene expression products correspond to a set of genes associated with kidney disease or damage; (b) computer processing the dataset from (a) to determine, with a sensitivity of at least about 80%, whether the subject has kidney disease or damage or is at elevated risk thereof; and (c) electronically outputting a report indicating whether the subject was determined in (b) to have kidney disease or damage or to be at elevated risk thereof.
In some embodiments, the body sample is selected from: blood samples, serum samples, plasma samples, saliva samples, stool samples, sputum samples, urine samples, semen samples, transvaginal fluid samples, cerebrospinal fluid samples, sweat samples, cell samples, and tissue samples. In some embodiments, the body sample is a urine sample. In some embodiments, the urine sample is a fresh urine sample. In some embodiments, the urine sample is a frozen sample (e.g., a freshly frozen sample). In some embodiments, the urine sample is a preserved urine sample (e.g., stored in a preservative at room temperature to prevent degradation).
In some embodiments, (a) includes reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from a body sample to produce complementary deoxyribonucleic acid (cDNA) molecules and sequencing at least a portion of the cDNA molecules to produce sequencing reads. In some embodiments, each count of transcripts is indicative of a gene expression event.
In some embodiments, (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from a body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and analyzing at least a portion of the cDNA molecules by real-time polymerase chain reaction (RT-PCR, also known as qPCR) to generate the dataset. In some embodiments, the dataset includes cycle threshold (Ct) values, which are inversely related to the level of gene expression.
In some embodiments, (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from a body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and analyzing at least a portion of the cDNA molecules by microarray analysis (e.g., an Affymetrix microarray) to produce the dataset. In some embodiments, the dataset comprises counts of gene transcripts. In some embodiments, each count is indicative of a gene expression event. In some embodiments, (a) comprises hybridizing ribonucleic acid (RNA) molecules obtained or derived from a body sample with a specific probe set, and analyzing the hybridized RNA molecules using a Luminex platform to generate a dataset. In some embodiments, the dataset comprises counts of gene transcripts. In certain embodiments, (a) comprises selectively enriching at least a portion of the cDNA molecules or RNA molecules for a set of genomic loci associated with kidney disease or damage. In certain embodiments, (a) comprises amplifying at least a portion of the cDNA molecules or RNA molecules. In certain embodiments, (a) comprises aligning at least a portion of the sequencing reads to a reference sequence. In certain embodiments, (a) comprises generating a count of gene transcripts. In certain embodiments, the reference sequence is at least a portion of a human reference genome. In some embodiments, the counts of gene transcripts are normalized to generate normalized gene transcript counts for downstream differential gene expression analysis. In some embodiments, normalization includes determining CPM (counts per million), TPM (transcripts per million), or RPKM/FPKM (reads or fragments per kilobase of exon per million mapped reads or fragments).
In certain embodiments, the kidney disease or injury is selected from the group consisting of early-stage kidney disease, mid-stage kidney disease, end-stage kidney disease, asymptomatic kidney disease, diabetic nephropathy, hypertensive kidney disease, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), non-steroidal anti-inflammatory drug nephrotoxicity, thin basement membrane nephropathy, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 kidney disease, interstitial nephritis, lithium nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis (kidney infection), renal artery stenosis, renal cyst, kidney disease associated with rheumatoid arthritis, and kidney stones. In some cases, the kidney disease or injury is diabetic nephropathy. In some cases, the diabetic nephropathy is early-stage diabetic nephropathy. In some embodiments, the subject has asymptomatic diabetic nephropathy.
In certain embodiments, (b) comprises processing the data set using a trained algorithm. In certain embodiments, the trained algorithm comprises a trained machine learning algorithm. In some embodiments, the trained machine learning algorithm is selected from the group consisting of a support vector machine (SVM), Bayesian classification, linear regression, quantile regression, logistic regression, nonlinear regression, random forests, neural networks, ensemble learning methods, boosting algorithms, AdaBoost algorithms, recursive feature elimination (RFE), and any combination thereof. In certain embodiments, the trained machine learning algorithm includes a recursive feature elimination (RFE) algorithm. In certain embodiments, the trained machine learning algorithm is trained on a plurality of training samples, including a first set of body samples from subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and a second set of body samples from subjects without kidney disease or damage, wherein the first and second sets of body samples used for training are different from the body sample of the subject. In some embodiments, the trained machine learning algorithm is trained on a plurality of training samples, including a first set of body samples from subjects having a particular kidney disease or injury (e.g., diabetic nephropathy) and a second set of body samples from subjects having other types of kidney disease or injury, wherein the first and second sets of body samples used for training are different from the body sample of the subject.
In some embodiments, (a) comprises comparing the levels of the one or more gene expression products to a reference value. In some embodiments, the reference value corresponds to a first set of gene expression products from body samples of subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and/or a second set of gene expression products from body samples of subjects without kidney disease or damage. In some embodiments, the reference value corresponds to a first set of gene expression products from body samples of subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and/or a second set of gene expression products from body samples of subjects having other types of kidney disease or damage.
In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or injury in the subject with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In certain embodiments, the method further comprises detecting the presence or increased risk of kidney disease or injury in the subject with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or injury in the subject with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or injury in the subject with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%.
In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or injury in the subject with an area under the curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least 0.82, at least 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least 0.98, or at least 0.99.
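The performance measures recited above can be illustrated with a minimal Python sketch. It assumes binary labels (1 = kidney disease or injury present or at increased risk, 0 = absent) from an independent test set; the function names are hypothetical and are not part of the disclosure. The AUC is computed in its pairwise form, as the fraction of positive/negative sample pairs in which the positive sample receives the higher score.

```python
def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correctly ranked pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return _ratio(wins, len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a small test cohort; a rank-based implementation would be preferable for large data sets.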
In some embodiments, the method further comprises determining a clinical intervention for the subject based at least in part on the presence or increased risk of kidney disease or damage determined in (b). In some embodiments, the clinical intervention is selected from the group consisting of drug treatment, enhancing glycemic control, hypertension control, lowering high cholesterol, promoting bone health, diet control, lifestyle modification, weight loss, exercise, smoking cessation, controlling alcohol intake, reducing or stopping drug abuse, and avoiding non-steroidal anti-inflammatory drugs. In some embodiments, the drug treatment relies on blockade of the renin-angiotensin-aldosterone system (RAAS), for example with angiotensin-converting enzyme (ACE) inhibitors and angiotensin II receptor blockers (ARBs). In certain embodiments, the drug treatment is an experimental treatment, such as agents that target vascular/hemodynamic effects, agents that target inflammation, and agents that target oxidative stress.
In some embodiments, (b) comprises analyzing a first set of genes to distinguish samples having a first type of kidney disease or damage from samples without kidney disease or damage (diabetic negative control), and a second set of genes to distinguish samples having the first type of kidney disease or damage from samples having a second type of kidney disease or damage. In some embodiments, (b) comprises analyzing a first set of genes to distinguish subjects with diabetic nephropathy from subjects without kidney disease (diabetic negative control), and a second set of genes to distinguish subjects with diabetic nephropathy from subjects with other chronic kidney disease (CKD). In certain embodiments, the first set of genes is selected from the gene sets listed in Tables 3 and 5, and the second set of genes is selected from the gene sets listed in Tables 4 and 6. In certain embodiments, (b) comprises generating a first score (diabetic nephropathy vs. diabetic negative control) based on the first set of genes and a second score (diabetic nephropathy vs. CKD) based on the second set of genes. In some embodiments, a first score greater than 0.5 indicates glomerular injury and a first score less than 0.5 indicates tubular injury. In some embodiments, (b) comprises analyzing different male-specific or female-specific gene sets based on the sex of the subject.
In some embodiments, the method further comprises analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine whether a particular kidney disease or injury is present, absent, or at increased risk in the subject. In some embodiments, the method further comprises determining whether another type of kidney disease or injury is present, absent, or at increased risk, and electronically outputting a report identifying the presence, absence, or increased risk of the other type of kidney disease or injury. In some embodiments, the method further comprises analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine whether another type of kidney disease or damage is present, absent, or at increased risk.
In some cases, the sensitivity and specificity of determining the presence or increased risk of kidney disease or damage in a subject are determined from the percentage of independent test samples in which the presence or increased risk of kidney disease or damage is correctly determined.
In some embodiments, the method further comprises determining, with a specificity of at least about 80%, whether kidney disease or damage is present, absent, or at increased risk in the subject. In certain embodiments, the method further comprises (d) electronically outputting a report identifying whether the subject has another type of kidney disease or damage.
In another aspect, the present disclosure provides a method of processing or analyzing a body sample of a subject, comprising (a) sequencing ribonucleic acid (RNA) molecules obtained or derived from the body sample to obtain sequencing reads indicative of counts of gene transcripts in the body sample, wherein the gene transcripts correspond to a set of genes associated with kidney disease or damage; (b) computer processing the gene transcript counts from (a) to determine whether kidney disease or damage is present, absent, or at increased risk in the subject; and (c) electronically outputting a report identifying whether the kidney disease or damage determined in (b) is present, absent, or at increased risk in the subject.
In some embodiments, the body sample is selected from a blood sample, a serum sample, a plasma sample, a saliva sample, a stool sample, a sputum sample, a urine sample, a semen sample, a transvaginal fluid sample, a cerebrospinal fluid sample, a sweat sample, a cell sample, and a tissue sample. In some cases, the body sample is a urine sample. In certain embodiments, the urine sample is a fresh urine sample. In some embodiments, the urine sample is a frozen sample (e.g., a fresh frozen sample). In certain embodiments, the urine sample is a preserved urine sample (e.g., preserved in a preservative at room temperature to prevent degradation).
In certain embodiments, (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from a body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and sequencing at least a portion of the cDNA molecules to produce sequencing reads. In certain embodiments, the sequencing reads are mapped to a reference sequence (e.g., a reference genome, such as a human genome) to generate a data set, wherein the data set comprises gene transcript counts. In certain embodiments, each count of a transcript is indicative of a gene expression event.
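As an illustration of how mapped sequencing reads can be reduced to gene transcript counts, the following minimal Python sketch tallies one count per read assigned to a gene. The input record layout (read identifier, assigned gene or None, mapping quality) and the mapping-quality cutoff are assumptions made for this example only; the disclosure does not specify them.

```python
from collections import Counter


def count_transcripts(alignments, min_mapq=10):
    """Tally reads per gene from (read_id, gene_id, mapq) records.

    gene_id is None for reads that could not be assigned to a gene.
    min_mapq is a hypothetical quality cutoff, not a value from the disclosure.
    """
    counts = Counter()
    for _read_id, gene_id, mapq in alignments:
        if gene_id is None or mapq < min_mapq:
            continue  # skip unassigned or low-confidence alignments
        counts[gene_id] += 1  # each retained read is one expression event
    return counts
```

In practice the alignment records would come from a read aligner and a gene annotation; the sketch only shows the counting step.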
In certain embodiments, (a) comprises selectively enriching at least a portion of the cDNA molecules for a set of genomic loci associated with kidney disease or damage. In certain embodiments, (a) comprises amplifying at least a portion of the cDNA molecules. In certain embodiments, (a) comprises aligning at least a portion of the sequencing reads with a reference sequence to generate gene transcript counts. In certain embodiments, the reference sequence is at least a portion of a human reference genome. In certain embodiments, (a) comprises generating gene transcript counts. In some embodiments, the gene transcript counts are normalized to generate normalized gene transcript counts for downstream differential gene expression analysis. In some embodiments, normalization includes determining CPM (counts per million), TPM (transcripts per million), or RPKM/FPKM (reads per kilobase of exon per million mapped reads / fragments per kilobase of exon per million mapped fragments).
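The normalization options named above can be sketched in Python using their standard definitions: CPM scales each count by the library size, RPKM additionally divides by gene length in kilobases, and TPM first divides by gene length and then rescales so that the values sum to one million. The dictionary-based inputs and the function names are assumptions chosen for illustration.

```python
def cpm(counts):
    """Counts per million: count * 1e6 / total counts in the sample."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def rpkm(counts, lengths_bp):
    """Reads per kilobase per million mapped reads.

    Equivalent to count * 1e9 / (total counts * gene length in bp).
    """
    total = sum(counts.values())
    return {gene: c * 1e9 / (total * lengths_bp[gene]) for gene, c in counts.items()}


def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rate = {gene: c / (lengths_bp[gene] / 1000.0) for gene, c in counts.items()}
    scale = sum(rate.values())
    return {gene: r * 1e6 / scale for gene, r in rate.items()}
```

TPM values always sum to one million within a sample, which makes them more directly comparable across samples than RPKM/FPKM values.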
In certain embodiments, the kidney disease or injury is selected from the group consisting of early-stage kidney disease, mid-stage kidney disease, end-stage kidney disease, asymptomatic kidney disease, diabetic nephropathy, hypertensive kidney disease, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), non-steroidal anti-inflammatory drug nephrotoxicity, thin basement membrane nephropathy, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 nephropathy, interstitial nephritis, lithium nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis (kidney infection), renal artery stenosis, renal cysts, kidney disease associated with rheumatoid arthritis, and kidney stones. In some cases, the kidney disease or injury is diabetic nephropathy. In some cases, the diabetic nephropathy is early-stage diabetic nephropathy. In some embodiments, the subject has asymptomatic diabetic nephropathy.
In certain embodiments, (b) comprises processing the data set using a trained algorithm. In certain embodiments, the trained algorithm comprises a trained machine learning algorithm. In some embodiments, the trained machine learning algorithm is selected from the group consisting of a support vector machine (SVM), Bayesian classification, linear regression, quantile regression, logistic regression, nonlinear regression, random forests, neural networks, ensemble learning methods, boosting algorithms, AdaBoost algorithms, recursive feature elimination (RFE), and any combination thereof. In certain embodiments, the trained machine learning algorithm includes a recursive feature elimination (RFE) algorithm. In certain embodiments, the trained machine learning algorithm is trained on a plurality of training samples, including a first set of body samples from subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and a second set of body samples from subjects without kidney disease or damage, wherein the first and second sets of body samples used for training are different from the subject's body sample. In some embodiments, the trained machine learning algorithm is trained on a plurality of training samples, including a first set of body samples from subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and a second set of body samples from subjects having other types of kidney disease or damage, wherein the first and second sets of body samples used for training are different from the body sample of the subject.
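Recursive feature elimination, named above, can be sketched as follows: a model is fitted, the feature with the smallest absolute weight is discarded, and the process repeats until the desired number of features remains. This Python sketch is an illustration only; it assumes a plain logistic regression fitted by batch gradient descent as the underlying model, features already scaled to comparable ranges, and hypothetical function names and hyperparameters. The disclosure does not state which base model or settings are used with RFE.

```python
import math


def fit_logistic(X, y, epochs=500, lr=0.1):
    """Fit logistic regression weights and bias by batch gradient descent."""
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, row))
            z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - label
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n_samples for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b


def recursive_feature_elimination(X, y, names, n_keep):
    """Refit repeatedly, dropping the feature with the smallest |weight|."""
    active = list(range(len(names)))  # column indices still in the model
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = fit_logistic(X_sub, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return [names[j] for j in active]
```

Ranking by absolute weight is only meaningful when the features share a common scale, which is why normalized expression values are assumed as input.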
In some embodiments, (a) comprises comparing a count of gene transcripts (e.g., one or more levels of gene expression products) to a reference value. In some examples, the reference value corresponds to a first set of gene expression products from body samples of subjects having a particular kidney disease or damage (e.g., diabetic nephropathy) and/or a second set of gene expression products from body samples of subjects without kidney disease or damage. In some embodiments, the reference value corresponds to a first set of gene expression products from body samples of subjects having a particular kidney disease or injury (e.g., diabetic nephropathy) and/or a second set of gene expression products from body samples of subjects having other types of kidney disease or injury.
In some embodiments, the method further comprises detecting the presence, absence, or increased risk of kidney disease or damage in the subject with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or damage in the subject with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or damage in the subject with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or damage in the subject with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%.
In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or injury in the subject with an area under the receiver operating characteristic curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least about 0.82, at least about 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least 0.90, at least 0.91, at least 0.92, at least 0.93, at least 0.94, at least 0.95, at least 0.96, at least 0.97, at least 0.98, or at least 0.99.
In some embodiments, the method further comprises determining a clinical intervention for the subject based at least in part on the presence or increased risk of kidney disease or damage determined in (b). In some embodiments, the clinical intervention is selected from the group consisting of drug treatment, enhancing glycemic control, hypertension control, lowering high cholesterol, promoting bone health, diet control, lifestyle modification, weight loss, exercise, smoking cessation, controlling alcohol intake, reducing or stopping drug abuse, and avoiding non-steroidal anti-inflammatory drugs. In some embodiments, the drug treatment relies on blockade of the renin-angiotensin-aldosterone system (RAAS), for example with angiotensin-converting enzyme (ACE) inhibitors and angiotensin II receptor blockers (ARBs). In certain embodiments, the drug treatment is an experimental treatment, such as agents that target vascular/hemodynamic effects, agents that target inflammation, and agents that target oxidative stress.
In some embodiments, (b) comprises analyzing a first set of genes to distinguish samples having a first type of kidney disease or damage from samples without kidney disease or damage (diabetic negative control), and a second set of genes to distinguish samples having the first type of kidney disease or damage from samples having a second type of kidney disease or damage. In some embodiments, (b) comprises analyzing a first set of genes to distinguish subjects with diabetic nephropathy (DN) from negative (diabetic negative control) subjects, and a second set of genes to distinguish DN from other chronic kidney disease (CKD). In certain embodiments, the first set of genes is selected from the gene sets listed in Tables 3 and 5, and the second set of genes is selected from the gene sets listed in Tables 4 and 6. In certain embodiments, (b) comprises generating a first score (diabetic nephropathy vs. diabetic negative control) based on the first set of genes and a second score (diabetic nephropathy vs. CKD) based on the second set of genes. In some cases, a first score greater than 0.5 indicates glomerular injury and a first score less than 0.5 indicates tubular injury. In some embodiments, (b) comprises analyzing different male-specific or female-specific gene sets based on the sex of the subject.
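One way the two scores described above could be produced is sketched below in Python. The sketch assumes, purely for illustration, that each score is a logistic function of a weighted sum of normalized expression values for the genes in the corresponding set; the gene names, weights, bias terms, and the handling of a score of exactly 0.5 are hypothetical and are not taken from the disclosure or from Tables 3 to 6. Only the 0.5 cutoff and its interpretation follow the text.

```python
import math


def panel_score(expression, weights, bias=0.0):
    """Logistic score in (0, 1) from a weighted sum over one gene set.

    Genes absent from `expression` contribute 0 (an assumption of this sketch).
    """
    z = bias + sum(w * expression.get(gene, 0.0) for gene, w in weights.items())
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def interpret_first_score(score, cutoff=0.5):
    """Apply the 0.5 cutoff stated in the text to the first score."""
    if score > cutoff:
        return "glomerular injury"
    if score < cutoff:
        return "tubular injury"
    return "indeterminate"  # exactly at the cutoff: not addressed in the text


def two_panel_scores(expression, first_panel, second_panel):
    """Return the DN vs. negative control and DN vs. CKD scores."""
    first = panel_score(expression, first_panel["weights"], first_panel["bias"])
    second = panel_score(expression, second_panel["weights"], second_panel["bias"])
    return {"dn_vs_control": first, "dn_vs_ckd": second}
```

Sex-specific analysis could be accommodated by passing a different pair of panels depending on the sex of the subject.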
In some embodiments, the method further comprises analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine whether a particular kidney disease or injury is present, absent, or at increased risk in the subject. In some embodiments, the method further comprises determining whether another type of kidney disease or injury is present, absent, or at increased risk, and electronically outputting a report identifying whether the other type of kidney disease or injury is present, absent, or at increased risk. In some embodiments, the method further comprises analyzing body samples of the subject at two or more different time points to generate two or more data sets, and computer processing the two or more data sets to determine whether other types of kidney disease or damage are present, absent, or at increased risk in the subject.
In another aspect, the present disclosure provides a method of processing or analyzing a body sample of a subject, comprising (a) analyzing a plurality of cells obtained from the body sample to generate a data set comprising quantitative measurements of a set of cell-based biomarkers in the plurality of cells, the biomarkers comprising proteins associated with kidney disease or damage; (b) computer processing the data set from (a) to determine whether the subject has, or is at increased risk of, kidney disease or damage; and (c) electronically outputting a report identifying whether the subject has, or is at increased risk of, kidney disease or damage, as determined in (b). In certain embodiments, the method further comprises electronically outputting a report identifying whether the subject has, or is at increased risk of, another type of kidney disease or damage.
In some embodiments, the body sample is selected from the group consisting of a blood sample, a serum sample, a plasma sample, a saliva sample, a stool sample, a sputum sample, a urine sample, a semen sample, a transvaginal fluid sample, a cerebrospinal fluid sample, a sweat sample, a cell sample, and a tissue sample. In some cases, the body sample is a urine sample. In certain embodiments, the urine sample is a fresh urine sample, a frozen urine sample (e.g., a fresh frozen urine sample), or a preserved urine sample (e.g., preserved with a preservative at room temperature to prevent degradation).
In some embodiments, (a) comprises using a cell analysis method selected from the group consisting of ELISA, flow cytometry, LC/MS, and confocal microscopy. In certain embodiments, the set of cell-based biomarkers comprising proteins associated with kidney disease or damage comprises at least one protein selected from the group consisting of proteins encoded by genomic loci associated with kidney disease or damage. In certain embodiments, the kidney disease or injury is selected from the group consisting of early-stage kidney disease, mid-stage kidney disease, end-stage kidney disease, asymptomatic kidney disease, diabetic nephropathy, hypertensive kidney disease, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), non-steroidal anti-inflammatory drug nephrotoxicity, thin basement membrane nephropathy, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 nephropathy, interstitial nephritis, lithium nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis (kidney infection), renal artery stenosis, renal cysts, kidney disease associated with rheumatoid arthritis, and kidney stones. In certain embodiments, the kidney disease or injury is diabetic nephropathy. In certain embodiments, the diabetic nephropathy is early-stage diabetic nephropathy. In some embodiments, the subject has asymptomatic diabetic nephropathy.
In certain embodiments, (b) comprises processing the data set using a trained algorithm. In certain embodiments, the trained algorithm comprises a trained machine learning algorithm. In some embodiments, the trained machine learning algorithm is selected from the group consisting of a support vector machine (SVM), Bayesian classification, linear regression, quantile regression, logistic regression, nonlinear regression, random forests, neural networks, ensemble learning methods, boosting algorithms, AdaBoost algorithms, recursive feature elimination (RFE), and any combination thereof. In some embodiments, the trained machine learning algorithm includes a recursive feature elimination algorithm. In some embodiments, the trained machine learning algorithm is trained on a plurality of samples, including a first set of body samples from subjects having a particular kidney disease or injury (e.g., diabetic nephropathy) and a second set of body samples from subjects without kidney disease or injury, wherein the first and second sets of body samples used for training are different from the body sample of the subject. In some embodiments, the trained machine learning algorithm is trained on a plurality of samples, including a first set of body samples from subjects having a particular kidney disease or injury (e.g., diabetic nephropathy) and a second set of body samples from subjects having other types of kidney disease or injury, wherein the first and second sets of body samples used for training are different from the body sample of the subject.
In some embodiments, (a) comprises comparing one or more cellular biomarker levels to a reference value. In some embodiments, the reference value corresponds to a first set of body samples from subjects having a particular kidney disease or injury (e.g., diabetic nephropathy) and/or a second set of body samples from subjects without kidney disease or injury. In some embodiments, the reference value corresponds to a first set of body samples from subjects having a particular kidney disease or injury (e.g., diabetic nephropathy) and/or a second set of body samples from subjects having other types of kidney disease or injury, wherein the first and second sets of body samples are different from the body sample of the subject.
In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or damage in the subject with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or damage in the subject with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or damage in the subject with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or damage in the subject with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%.
In some embodiments, the method further comprises detecting the presence or increased risk of kidney disease or damage in the subject with an area under the curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least 0.82, at least 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least 0.98, or at least 0.99.
In some embodiments, the method further comprises determining a clinical intervention for the subject based at least in part on the presence or increased risk of kidney disease or damage determined in (b). In some embodiments, the clinical intervention is selected from the group consisting of drug treatment, enhancing glycemic control, hypertension control, lowering high cholesterol, promoting bone health, diet control, lifestyle modification, weight loss, exercise, smoking cessation, controlling alcohol intake, reducing or stopping drug abuse, and avoiding non-steroidal anti-inflammatory drugs. In some embodiments, the drug treatment relies on blockade of the renin-angiotensin-aldosterone system (RAAS), for example with angiotensin-converting enzyme (ACE) inhibitors and angiotensin II receptor blockers (ARBs). In some cases, the drug is an experimental treatment, such as a drug targeting vascular/hemodynamic effects, a drug targeting inflammation, or a drug targeting oxidative stress.
In some embodiments, (b) comprises analyzing a first set of genes to distinguish a first type of kidney disease or damage from no kidney disease (NEG, negative control), and a second set of genes to distinguish the first type of kidney disease or damage from a second type of kidney disease or damage. In some embodiments, (b) comprises analyzing a first set of genes to distinguish subjects with diabetic nephropathy (DN) from negative (NEG) subjects, and a second set of genes to distinguish diabetic nephropathy from other chronic kidney disease (CKD). In certain embodiments, the first set of genes is selected from the gene sets listed in Tables 3 and 5, and the second set of genes is selected from the gene sets listed in Tables 4 and 6. In certain embodiments, (b) comprises generating a first score (diabetic nephropathy vs. diabetic negative control) based on the first set of genes and a second score (diabetic nephropathy vs. CKD) based on the second set of genes. In some cases, a first score greater than 0.5 indicates glomerular injury and a first score less than 0.5 indicates tubular injury. In some embodiments, (b) comprises analyzing different male-specific or female-specific gene sets based on the sex of the subject.
In some embodiments, the method further comprises analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine the presence, absence, or increased risk of kidney disease or damage. In some embodiments, the method further comprises determining the presence, absence, or increased risk of another type of kidney disease or damage in the subject, and electronically outputting a report identifying the presence, absence, or increased risk of the other type of kidney disease or damage. In some embodiments, the method further comprises analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine the presence, absence, or increased risk of another type of kidney disease or damage.
In another aspect, the present disclosure provides a method of processing or analyzing a body sample of a subject, comprising (a) sequencing nucleic acid molecules obtained or derived from the body sample to obtain data comprising one or more quantitative measurements of a set of genes associated with kidney disease or damage, wherein the sequencing comprises using an Illumina RNA Prep with Enrichment, Tagmentation kit, an Illumina TruSeq exome kit, an Agilent SureSelect target enrichment system, or a Roche KAPA RNA HyperPrep kit; (b) computer processing the data from (a) to determine whether kidney disease or damage is present or at increased risk in the subject; and (c) electronically outputting a report identifying the presence or increased risk of kidney disease or damage in the subject determined in (b).
In another aspect, the present disclosure provides a method of processing or analyzing a body sample of a subject, comprising (a) analyzing the body sample or a derivative thereof to produce a data set comprising one or more quantitative measurements of a set of biomarkers associated with kidney disease or damage, wherein the set of biomarkers comprises at least 5 markers selected from the group consisting of the biomarkers listed in Table 3, the biomarkers listed in Table 4, the biomarkers listed in Table 5, and the biomarkers listed in Table 6; (b) computer processing the data set from (a) to determine whether kidney disease or damage is present, absent, or at increased risk in the subject; and (c) electronically outputting a report identifying whether the kidney disease or damage determined in (b) is present, absent, or at increased risk in the subject.
In another aspect, the present disclosure provides a kit for processing or analyzing a body sample of a subject, comprising a set of probes for identifying the presence, absence, or relative amount of regions of genes associated with kidney disease or damage, wherein the probes correspond to at least 5 markers selected from the group consisting of the biomarkers listed in Table 3, the biomarkers listed in Table 4, the biomarkers listed in Table 5, and the biomarkers listed in Table 6.
In another aspect, the invention provides a method of diagnosing kidney disease or damage in a subject, comprising (a) analyzing a body sample or a derivative thereof to produce a dataset comprising one or more quantitative measurements of a set of biomarkers associated with kidney disease or damage, wherein the set of biomarkers comprises at least 5 markers selected from the group consisting of the biomarkers set forth in Table 3, Table 4, Table 5, and Table 6; and (b) providing a diagnosis of kidney disease or damage based on a comparison of the set of biomarkers to a set of reference values.
In another aspect, the present disclosure provides a method for processing or analyzing a urine sample of a subject, comprising (a) sequencing ribonucleic acid (RNA) molecules obtained or derived from the urine sample to produce sequencing reads indicative of counts of gene transcripts in the urine sample, wherein the gene transcripts correspond to a set of genes associated with diabetic nephropathy; (b) computer processing the gene transcript counts from (a) to determine the presence, absence, or increased risk of diabetic nephropathy in the subject; and (c) electronically outputting a report that identifies the presence, absence, or increased risk of diabetic nephropathy determined in (b).
In some embodiments, the body sample may be selected from a blood sample, a serum sample, a plasma sample, a saliva sample, a stool sample, a sputum sample, a urine sample, a semen sample, a transvaginal fluid sample, a cerebrospinal fluid sample, a sweat sample, a cell sample, and a tissue sample. In some cases, the body sample is a urine sample. In certain embodiments, the urine sample is a fresh urine sample. In some embodiments, the urine sample is a frozen sample (e.g., a fresh frozen sample). In certain embodiments, the urine sample is a preserved urine sample (e.g., preserved in a preservative at room temperature to prevent degradation).
In certain embodiments, (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from a body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and sequencing at least a portion of the cDNA molecules to produce sequencing reads. In certain embodiments, sequencing reads are mapped to a reference sequence (e.g., a reference genome, such as a human genome) to generate a dataset, wherein the dataset comprises gene transcript counts. In certain embodiments, each count of a transcript is indicative of a gene expression event. In certain embodiments, (a) comprises selectively enriching at least a portion of the cDNA molecules of a set of genomic loci associated with diabetic nephropathy.
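By way of non-limiting illustration, the generation of gene transcript counts from mapped sequencing reads may be sketched as follows. The sketch assumes that reads have already been aligned to a reference and are represented only by their genomic intervals; the `Region` type, the gene names, and the coordinates are hypothetical and are not taken from the tables of this disclosure. Reads overlapping no gene or more than one gene are simply skipped, which is one of several possible conventions.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class Region(NamedTuple):
    """Genomic interval: chromosome, 0-based start (inclusive), end (exclusive)."""
    chrom: str
    start: int
    end: int


def overlaps(a: Region, b: Region) -> bool:
    """Return True when two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def count_transcripts(read_alignments: Iterable[Region],
                      gene_regions: Dict[str, Region]) -> Dict[str, int]:
    """Count aligned reads per gene; each count stands for one expression event."""
    counts = Counter({gene: 0 for gene in gene_regions})
    for read in read_alignments:
        hits = [g for g, region in gene_regions.items() if overlaps(read, region)]
        if len(hits) == 1:  # ignore unassigned and ambiguous reads
            counts[hits[0]] += 1
    return dict(counts)


# Hypothetical annotation and alignments, for illustration only.
genes = {
    "GENE_A": Region("chr1", 100, 200),
    "GENE_B": Region("chr1", 150, 300),
    "GENE_C": Region("chr2", 0, 50),
}
reads = [
    Region("chr1", 110, 140),  # GENE_A only
    Region("chr1", 160, 180),  # overlaps GENE_A and GENE_B, skipped
    Region("chr1", 250, 280),  # GENE_B only
    Region("chr2", 10, 40),    # GENE_C
    Region("chr2", 40, 60),    # GENE_C (partial overlap)
    Region("chr3", 0, 10),     # no annotated gene
]
transcript_counts = count_transcripts(reads, genes)
```

In practice an established aligner and read summarization tool would likely be used; the sketch only shows the bookkeeping that turns mapped reads into a dataset of per-gene counts.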
In some embodiments, (a) comprises selectively enriching at least a portion of the cDNA molecules for genomic loci associated with diabetic nephropathy. In certain embodiments, (a) comprises amplifying at least a portion of the cDNA molecules. In certain embodiments, (a) comprises aligning at least a portion of the sequencing reads with a reference sequence to generate gene transcript counts. In certain embodiments, the reference sequence is at least a portion of a human reference genome. In certain embodiments, (a) comprises generating counts of gene transcripts. In some embodiments, the gene transcript counts are normalized to generate normalized gene transcript counts for downstream differential gene expression analysis. In some embodiments, normalization includes determining CPM (counts per million), TPM (transcripts per million), or RPKM/FPKM (reads or fragments per kilobase of exon per million mapped reads or fragments).
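As a non-limiting illustration of the normalization options named above, the following sketch computes CPM, RPKM, and TPM from raw counts using their standard definitions. The gene names, counts, and gene lengths are hypothetical, and the function names are chosen for this example only.

```python
from typing import Dict


def cpm(counts: Dict[str, int]) -> Dict[str, float]:
    """Counts per million: scale each count by the library size."""
    total = sum(counts.values())
    return {g: c / total * 1e6 for g, c in counts.items()}


def rpkm(counts: Dict[str, int], lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Reads per kilobase per million mapped reads."""
    total = sum(counts.values())
    return {g: c / (lengths_bp[g] / 1e3) / (total / 1e6) for g, c in counts.items()}


def tpm(counts: Dict[str, int], lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Transcripts per million: length-normalize first, then rescale to 1e6."""
    rate = {g: c / (lengths_bp[g] / 1e3) for g, c in counts.items()}
    scale = sum(rate.values())
    return {g: r / scale * 1e6 for g, r in rate.items()}


# Hypothetical raw counts and gene lengths (base pairs), for illustration only.
raw_counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 600}
gene_lengths = {"GENE_A": 1000, "GENE_B": 2000, "GENE_C": 3000}

cpm_values = cpm(raw_counts)
rpkm_values = rpkm(raw_counts, gene_lengths)
tpm_values = tpm(raw_counts, gene_lengths)
```

TPM values always sum to one million within a sample, which makes them easier to compare across samples than RPKM/FPKM; which measure suits a given differential expression analysis is a design choice not fixed by this sketch.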
In some embodiments, the set of genes associated with diabetic nephropathy comprises at least one gene selected from the group listed in table 3, table 4, table 5, and table 6. In some embodiments, the diabetic nephropathy comprises early stage diabetic nephropathy, mid-stage diabetic nephropathy, late stage diabetic nephropathy, end-stage diabetic nephropathy, or asymptomatic diabetic nephropathy.
In certain embodiments, (b) comprises processing the data set using a trained algorithm. In some embodiments, the trained algorithm comprises a trained machine learning algorithm. In some embodiments, the trained machine learning algorithm is selected from the group consisting of a Support Vector Machine (SVM), Bayesian classification, linear regression, quantile regression, logistic regression, nonlinear regression, random forests, neural networks, ensemble learning methods, boosting algorithms, AdaBoost algorithms, Recursive Feature Elimination (RFE), and any combination thereof. In certain embodiments, the trained machine learning algorithm includes a Recursive Feature Elimination (RFE) algorithm. In some embodiments, the trained machine learning algorithm is trained using a plurality of training samples, including a first set of urine samples from diabetic nephropathy patients and a second set of urine samples from non-nephropathy patients, wherein the first set of urine samples and the second set of urine samples are different from the urine sample of the subject. In certain embodiments, the trained machine learning algorithm is trained using a plurality of training samples, including a first set of urine samples from diabetic nephropathy patients and a second set of urine samples from patients with other types of nephropathy, wherein the first and second sets of urine samples are different from the urine sample of the subject.
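As a non-limiting illustration of Recursive Feature Elimination, the following sketch repeatedly fits a small logistic regression model and discards the gene with the smallest absolute weight until a target number of genes remains. This is a minimal sketch under stated assumptions and is not asserted to be the trained algorithm of the disclosure: the logistic model, the learning rate, the epoch count, the gene names, and the expression values (imagined as centered, log-scaled values) are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def fit_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                 epochs: int = 200, lr: float = 0.1) -> Tuple[List[float], float]:
    """Fit logistic regression by plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - yi  # gradient of the log-loss with respect to z
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def recursive_feature_elimination(X: Sequence[Sequence[float]], y: Sequence[int],
                                  names: Sequence[str], n_keep: int) -> List[str]:
    """Drop the weakest feature (smallest |weight|) until n_keep remain."""
    kept = list(range(len(names)))
    while len(kept) > n_keep:
        X_kept = [[row[j] for j in kept] for row in X]
        w, _ = fit_logistic(X_kept, y)
        weakest = min(range(len(kept)), key=lambda i: abs(w[i]))
        del kept[weakest]
    return [names[j] for j in kept]


# Hypothetical training data: label 1 = diabetic nephropathy, 0 = control.
gene_names = ["GENE_A", "GENE_B", "GENE_C"]
X_train = [
    [2.0, 0.0, 0.10],
    [3.0, 0.0, 0.15],
    [-2.0, 0.0, -0.10],
    [-3.0, 0.0, -0.15],
]
y_train = [1, 1, 0, 0]

top_one = recursive_feature_elimination(X_train, y_train, gene_names, n_keep=1)
top_two = recursive_feature_elimination(X_train, y_train, gene_names, n_keep=2)
```

Here GENE_B carries no signal and is eliminated first, GENE_C carries a weak signal and is eliminated second, and GENE_A is retained. A production pipeline would more likely rely on an established machine learning library and cross-validation to choose the number of retained genes.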
In certain embodiments, (a) comprises comparing a count of gene transcripts (e.g., one or more levels of gene expression products) to a reference value. In certain embodiments, the reference value corresponds to a gene expression product from a first set of urine samples from a diabetic nephropathy patient and/or a second set of urine samples from a non-nephropathy patient. In certain embodiments, the reference value corresponds to a gene expression product from a first set of urine samples from diabetic nephropathy patients and/or a second set of urine samples from other types of nephropathy patients.
In some embodiments, the method further comprises detecting the presence or increased risk of diabetic nephropathy in the subject with a sensitivity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of diabetic nephropathy with a specificity of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of diabetic nephropathy in the subject with a positive predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%. In some embodiments, the method further comprises detecting the presence or increased risk of diabetic nephropathy in the subject with a negative predictive value of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%.
In some embodiments, the method further comprises detecting the presence or increased risk of diabetic nephropathy in the subject with an area under the curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least about 0.82, at least about 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, or at least about 0.99.
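For clarity, the sensitivity, specificity, positive predictive value, and negative predictive value recited above may be computed from the true and predicted status of a set of subjects as sketched below. The labels shown are hypothetical example data (1 = diabetic nephropathy present, 0 = absent), and the function name is chosen for this illustration only.

```python
from typing import Dict, Optional, Sequence


def diagnostic_metrics(y_true: Sequence[int],
                       y_pred: Sequence[int]) -> Dict[str, Optional[float]]:
    """Compute sensitivity, specificity, PPV and NPV from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: int, den: int) -> Optional[float]:
        # Undefined when the denominator is zero (e.g. no positive calls).
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical ground truth and test calls for eight subjects.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
calls = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = diagnostic_metrics(truth, calls)
```

Unlike sensitivity and specificity, the positive and negative predictive values depend on the prevalence of disease in the tested population, so they should be interpreted relative to the cohort in which they were measured.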
In some embodiments, the method further comprises determining a clinical intervention for the subject based at least in part on the presence or increased risk of diabetic nephropathy determined in (b). In some embodiments, the clinical intervention is selected from the group consisting of drug treatment, enhanced glycemic control, hypertension control, lowering high cholesterol, promoting bone health, diet control, lifestyle modification, weight loss, exercise, smoking cessation, controlling alcohol intake, reducing or stopping drug abuse, and avoiding non-steroidal anti-inflammatory drugs. In some embodiments, the drug treatment relies on blockade of the renin-angiotensin-aldosterone system (RAAS), such as with angiotensin converting enzyme (ACE) inhibitors and angiotensin II receptor blockers (ARBs). In certain embodiments, the drug treatment is an experimental treatment, such as agents that target vascular/hemodynamic effects, agents that target inflammation, and agents that target oxidative stress.
In some embodiments, (b) comprises analyzing a first set of genes to distinguish between diabetic nephropathy (DN) subjects and non-renal disease (NEG) subjects, and a second set of genes to distinguish between diabetic nephropathy and another renal disease or injury. In certain embodiments, (b) comprises analyzing a first set of differentially expressed genes to distinguish between diabetic nephropathy and non-renal disease (NEG) subjects, and a second set of differentially expressed genes to distinguish between diabetic nephropathy and other chronic kidney disease (CKD). In certain embodiments, the first set of genes is selected from the groups of genes listed in Tables 3 and 5, and the second set of genes is selected from the groups of genes listed in Tables 4 and 6. In certain embodiments, (b) comprises generating a first diabetic nephropathy vs. diabetes negative control score based on the first set of genes and generating a second diabetic nephropathy vs. CKD score based on the second set of genes. In some embodiments, a first diabetic nephropathy vs. diabetes negative control score greater than 0.5 indicates glomerular injury, and a score less than 0.5 indicates tubular injury. In some embodiments, (b) comprises analyzing different sets of male-specific or female-specific genes based on the gender of the subject.
In some embodiments, the method further comprises analyzing the urine sample of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine the presence or increased risk of diabetic nephropathy. In some embodiments, the method further comprises determining the presence, absence, or increased risk of another type of kidney disease or injury, and electronically outputting a report that identifies the presence, absence, or increased risk of the other type of kidney disease or injury. In some embodiments, the method further comprises analyzing the urine sample of the subject at two or more different time points to generate two or more data sets, and computer processing the two or more data sets to determine the presence, absence, or increased risk of another type of kidney disease or injury. In some embodiments, the method further comprises analyzing the body sample of the subject at two or more different time points to produce two or more data sets, and electronically outputting a report that identifies the presence, absence, or increased risk of another type of kidney disease or damage in the subject.
In another aspect, the present disclosure provides a method of treating kidney disease or damage in a subject, comprising (a) diagnosing kidney disease or damage in a subject according to the methods of the present disclosure; (b) treating kidney disease or damage in the subject.
In another aspect, the present disclosure provides a method or platform for evaluating the efficacy of a novel drug for treating kidney disease or damage in a subject, comprising (a) comparing the change in score before and after drug treatment; and (b) computer processing the expression levels of genes targeted by the novel drug.
Another aspect of the disclosure provides a non-transitory computer-readable medium comprising machine-executable code that, when executed by one or more computer processors, implements any of the methods described above or elsewhere herein.
Another aspect of the present disclosure provides a system comprising one or more computer processors and computer memory coupled thereto. The computer memory includes machine executable code for execution by one or more computer processors, the code implementing any of the methods described above or elsewhere herein.
Other aspects and advantages of the present disclosure will become readily apparent to those skilled in the art from the following detailed description, wherein only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.
Incorporation by reference
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference. To the extent that publications, patents, or patent applications incorporated by reference contradict the disclosure contained in this specification, the specification is intended to supersede and/or take precedence over any such contradictory material.
[ description of the drawings ]
The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description, which sets forth illustrative embodiments in which the principles of the invention are utilized, and the accompanying drawings (also referred to herein as "Figures" or "FIG."), in which:
FIG. 1 shows a flow chart of a method 100 for identifying kidney disease or damage.
FIGS. 2A-2D show examples of dimension reduction analyses of diabetic nephropathy (DN) samples by age group (FIG. 2A), gender (FIG. 2B), and race/ethnicity (FIG. 2C), including an illustration of batch effects (FIG. 2D).
FIG. 3 illustrates a computer system programmed or otherwise configured to implement the methods provided herein.
[ detailed description of the invention ]
While various embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed.
In the description and claims, the singular forms "a", "an" and "the" include plural referents unless the context clearly dictates otherwise. For example, the term "a nucleic acid" includes a plurality of nucleic acids, including mixtures thereof.
As used herein, "nucleic acid" generally refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides (dNTPs) or ribonucleotides (rNTPs), or analogs thereof. The nucleic acid may have any three-dimensional structure and may perform any known or unknown function. Non-limiting examples of nucleic acids include DNA, RNA, coding or non-coding regions of a gene or gene fragment, loci defined by linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short hairpin RNA (shRNA), microRNA (miRNA), ribozymes, cDNA, recombinant nucleic acids, branched nucleic acids, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A nucleic acid may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. Modifications to the nucleotide structure, if present, may be made before or after assembly of the nucleic acid. The nucleotide sequence of the nucleic acid may be interrupted by non-nucleotide components. The nucleic acid may be further modified after polymerization, such as by conjugation or binding to a reporter agent.
As used herein, the term "target nucleic acid" generally refers to a nucleic acid molecule in a starting population of nucleic acid molecules that has a nucleotide sequence whose presence, amount, and/or sequence, or changes in one or more of these, are desired to be determined. The target nucleic acid may be any type of nucleic acid, including DNA, RNA, and the like. As used herein, "target ribonucleic acid (RNA)" generally refers to a target nucleic acid that is RNA. As used herein, "target deoxyribonucleic acid (DNA)" generally refers to a target nucleic acid that is DNA.
The term "target" as used herein generally refers to a genomic region within a gene or marker region. As used herein, the term "reference" generally refers to samples obtained or derived from a subject who has been diagnosed with kidney disease or damage (e.g., a kidney disease or damage patient) or who has received a negative clinical indication of kidney disease or damage (e.g., a healthy or control subject without kidney disease or damage).
The terms "site" or "region" as used herein are generally interchangeable and refer to a particular genomic region on a genome, represented by a chromosome number, a start position, and an end position.
The term "subject" as used herein generally refers to an entity or medium that has testable or detectable genetic information. The subject may be a person or individual, such as a patient. The subject may be a vertebrate, such as a mammal. Non-limiting examples of mammals include mice, apes, humans, farm animals, sport animals, and pets.
The term "sample" or "biological sample" as used herein generally refers to a body sample or portion obtained from a subject and analyzed to measure or determine characteristics of the whole, such as a sample of tissue, cells, blood, urine, or derivatives thereof. The term "biomarker" as used herein generally refers to any substance, structure, or process that can be measured in a subject's body or products thereof and used to predict a clinical outcome, indicate whether treatment is warranted, select an appropriate treatment (or predict whether a treatment will be effective), or monitor a current treatment regimen and potentially change it.
The terms "amplifying" and "amplification" are used interchangeably herein and generally refer to the production of one or more copies of a nucleic acid, or "amplification product". The term "DNA amplification" generally refers to the production of one or more copies of a DNA molecule, or "amplified DNA product". The term "reverse transcription amplification" generally refers to the production of deoxyribonucleic acid (DNA) from a ribonucleic acid (RNA) template by the action of a reverse transcriptase. Amplification may be performed by polymerase chain reaction (PCR), which is based on the use of a DNA polymerase to synthesize a new DNA strand complementary to the original template strand.
The term "polymerase chain reaction (PCR)" as used herein generally refers to a method of increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. The process of amplifying the target sequence may involve introducing a large excess of two oligonucleotide primers into a DNA mixture comprising the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers may be complementary to their respective strands of the double-stranded target sequence. For amplification, the mixture may be denatured and the primers may anneal to their complementary sequences within the target molecule. After annealing, the primers may be extended with a polymerase to form a pair of new complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated multiple times (e.g., denaturation, annealing, and extension constitute one "cycle", and there can be many "cycles") to obtain a high concentration of an amplified segment of the target sequence. The length of the amplified segment of the target sequence may be determined by the relative positions of the primers with respect to each other and is therefore a controllable parameter. Because of the repeating nature of the process, the method is known as the "polymerase chain reaction" (PCR). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be "PCR amplified" and are "PCR products" or "amplicons".
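To illustrate why repeated cycles yield a high concentration of the target segment, the idealized copy number after n cycles can be written as N0 x (1 + E)^n, where N0 is the starting copy number and E is the per-cycle amplification efficiency (1.0 meaning perfect doubling). The sketch below encodes this textbook relationship; it is an assumption-laden idealization that ignores reagent depletion and the plateau phase of real reactions, and the function name is hypothetical.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized amplicon copy number after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (0.0 to 1.0);
    plateau effects in late cycles are deliberately not modeled.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Ten starting templates, perfect doubling for three cycles.
ideal = pcr_copies(10, 3)
# The same templates at 50% efficiency for two cycles.
partial = pcr_copies(10, 2, efficiency=0.5)
```

Even a single template exceeds one billion copies after 30 ideal cycles, which is why the amplified segment comes to dominate the mixture.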
The term "DNA template" as used herein generally refers to sample DNA comprising a sequence of interest. At the beginning of the reaction, high temperatures act on the original double stranded DNA molecule, separating the two strands from each other.
As used herein, "primer" generally refers to a short fragment of single-stranded DNA that is complementary to a DNA template. The polymerase synthesizes new DNA starting from the primer ends.
The term "AUC" or "AUROC" as used herein is generally an abbreviation for the area under the receiver operating characteristic (ROC) curve. The ROC curve may be a plot of the true positive rate (TPR) versus the false positive rate (FPR) across the different possible thresholds or cut points of a diagnostic test, illustrating the trade-off between sensitivity and specificity as the cut point is varied (e.g., any increase in sensitivity is accompanied by a decrease in specificity). The area under the ROC curve (AUC) can be used to measure the accuracy of a diagnostic test (e.g., the larger the area, the more accurate the test), with an optimum of 1. In contrast, the ROC curve of a random test may lie on the diagonal with an AUC of 0.5 (e.g., representing a random or uninformative test).
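As a non-limiting illustration, the AUC can be computed directly from test scores and true labels using its rank interpretation: it equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative sample, with ties counted as one half. The scores and labels below are hypothetical, and the function name is chosen for this example only.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores (higher = more likely diseased).
auc_mixed = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
auc_perfect = roc_auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
auc_random = roc_auc([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of samples and is shown for readability; it gives the same value as integrating the ROC curve with the trapezoidal rule.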
The International Society of Nephrology estimates that 850 million people worldwide are affected by kidney disease. Diabetic nephropathy (DN) is a major cause of kidney disease and is also the most common cause of end-stage renal disease (ESRD). In addition, diabetic nephropathy is associated with higher cardiovascular and all-cause morbidity and mortality, and therefore timely diagnosis and treatment are of paramount importance. Diabetic nephropathy is a form of diabetic kidney disease and generally refers to kidney damage caused by the high blood glucose levels of diabetes. Diabetic nephropathy progresses slowly, and with effective early treatment the progression of the disease can be slowed or even stopped. DN may be associated with measurable biomarkers, such as proteinuria and/or a low estimated glomerular filtration rate (eGFR) in a subject; however, these biomarkers may be non-specific for diabetic nephropathy and may be attributable to other diseases or damage, such as diabetes, hypertension, IgA nephropathy, membranous nephropathy, lupus nephritis, minimal change disease, rheumatoid arthritis, use of non-steroidal anti-inflammatory drugs, smoking, excessive alcohol consumption, drug abuse, obesity, urinary tract infection, kidney stones, benign prostatic hyperplasia, and the like. In addition, some diabetic patients may have diabetic nephropathy, while some poorly controlled diabetic patients may have little kidney damage. Diabetic nephropathy patients may also have other types of kidney disease or damage.
Diabetic nephropathy (DN) is both under-diagnosed and misdiagnosed in patients. For example, diabetic nephropathy may be under-diagnosed because diagnostic methods such as kidney biopsy may be risky (e.g., a mortality rate of 1.8%), expensive, and time-consuming; thus, many patients choose not to undergo such diagnostic tests. As another example, diabetic nephropathy may be misdiagnosed because diagnostic methods may lack sufficient sensitivity and specificity. For example, urinary albumin detection may be insensitive and non-specific. As another example, imaging tests that assess the structure and size of the kidneys, such as X-ray and ultrasound examinations, can be time-consuming and indirect, and other tests, such as urine sediment examination, urine protein electrophoresis (UPEP), serum protein electrophoresis (SPEP), blood and urine tests, antinuclear antibody (ANA) tests, HBV detection, HCV detection, and HIV detection, may provide only limited information and are therefore indirect and inefficient. In vitro diagnostic techniques, such as those based on proteomics, genomics, and protein biomarkers, can face challenges in detecting, assessing, and monitoring kidney disease or damage with high sensitivity, specificity, and accuracy. Identifying early-stage diabetic changes can also be difficult because the only commonly used biomarker is proteinuria; however, proteinuria, particularly low-level proteinuria, can be confounded by many other factors, such as hypertension, smoking, alcoholism, drug abuse, obesity, infection, obstruction, and the like. Thus, many patients may miss the best opportunity for early drug intervention or may receive treatment that does not address the underlying etiology. Receiving such unnecessary and/or ineffective therapy can be costly and time-consuming and can delay the delivery of other, effective therapies to the patient.
In view of the recognized need for improved methods of detecting, assessing, and monitoring kidney disease or damage (e.g., DN) that are rapid, inexpensive, non-invasive, highly sensitive, specific, and accurate, the present disclosure provides methods, systems, and kits for detecting kidney disease or damage by processing biological samples (e.g., tissue samples, cell samples, and/or body fluid samples) obtained or derived from a subject. For example, nucleic acids, proteins, or cells of a biological sample may be analyzed. Biological samples obtained from subjects can be analyzed to measure the presence or absence of kidney disease or damage, or to make a related assessment. The analysis may be performed on a set of genomic regions, such as genes or genomic loci associated with kidney disease. The subjects may include subjects with kidney disease or damage (e.g., kidney disease or damage patients) and subjects without kidney disease or damage (e.g., normal or healthy controls).
The methods of the present disclosure may have many advantages over current methods, including convenient, safe, and non-invasive sample collection from the subject; the possibility of repeated testing; the suitability of kidney cells in urine samples for analysis; a direct way of analyzing kidney damage; suitability for monitoring disease progression and treatment efficacy; suitability for sample collection in a home setting; the ability to perform detection without a detailed medical history of the subject; and the ability to discover early diabetic changes (e.g., in asymptomatic subjects).
Using the methods and systems of the present invention, kidney disease or damage can be accurately detected in biological samples (e.g., urine samples) with high sensitivity and specificity. In a urine-based assay, a set of biomarkers can be analyzed using a machine learning algorithm to accurately distinguish samples of kidney disease at different stages (e.g., early, mid, or late) from control samples. Furthermore, urine-based assays may provide high specificity, thereby facilitating the non-invasive use of biomarkers associated with kidney disease or damage to monitor the treatment of patients with kidney disease or damage. Furthermore, urine-based assays may have higher sensitivity and specificity than kidney biopsy, which is currently considered the gold standard for the diagnosis of kidney disease.
Biological sample processing
FIG. 1 shows a flow chart of a method 100 for identifying kidney disease or damage. The method 100 may include analyzing a biological sample (e.g., a body sample composed of cells, tissue, blood, urine, or derivatives thereof) from a subject (e.g., a patient) to generate a data set comprising levels of gene expression products in the body sample (e.g., operation 102). The levels of gene expression products may correspond to a set of genes associated with kidney disease or damage. Next, the method 100 may include computer processing the data set to determine whether the subject has, or is at increased risk of, kidney disease or damage (e.g., operation 104). Next, the method 100 may include electronically outputting a report that identifies whether the subject has, or is at increased risk of, kidney disease or damage.
The biological sample may be obtained or derived from a blood sample, a serum sample, a plasma sample, a saliva sample, a stool sample, a sputum sample, a urine sample, a semen sample, a transvaginal fluid sample, a cerebrospinal fluid sample, a sweat sample, a cell sample, or a tissue sample of a subject. The biological sample may be stored under various storage conditions prior to processing, such as different temperatures (e.g., room temperature, refrigerated or frozen conditions such as 4 °C, -18 °C, -20 °C, or -80 °C, or liquid nitrogen) or different preservatives (e.g., alcohol, formaldehyde, or potassium dichromate, or urine collection and storage tubes from Norgen Biotek Inc.). The biological sample may be a fresh sample (e.g., processed within an appropriate time to avoid substantial RNA degradation) or a frozen or preserved sample.
The biological sample may be obtained from a subject having a disease or injury, a subject suspected of having a disease or injury, or a subject not having or not suspected of having a disease or injury. The disease or injury may be an infectious disease, an immune disorder or disease, a cancer, a genetic disease, a degenerative disease, a lifestyle disease, an injury, a rare disease, or an age-related disease. Infectious diseases may be caused by bacteria, viruses, fungi, and/or parasites. The disease or injury may be kidney disease or damage. The sample may be taken before and/or after the subject is treated for the disease or injury, or during the treatment or treatment regimen. Multiple samples may be taken from a subject to monitor the effect of the treatment, to monitor the progression of the disease over time, or to assess the likelihood of concurrent kidney disease or injury. Samples may also be taken from subjects known or suspected to have kidney disease or damage for whom clinical tests have failed to yield a definitive positive or negative diagnosis.
The sample may be taken from a subject suspected of having a disease or injury (e.g., kidney disease or injury). Samples may be taken from subjects presenting with unexplained symptoms such as fatigue, nausea, weight loss, pain, weakness, or memory loss. The sample may be taken from a subject who has explained symptoms. The sample may be taken from a subject with asymptomatic disease or damage (e.g., kidney disease or damage). The sample may be taken from a subject who is likely to have a disease or injury (e.g., renal disease or injury) due to family history, age, environmental exposure, lifestyle risk factors, or the presence of other known risk factors.
After a biological sample is obtained from a subject, the sample can be processed to generate gene expression data indicative of the presence, absence, or relative assessment of kidney disease or injury in the subject. For example, quantitative measurement of gene expression at a set of kidney disease or injury-associated genomic loci may be indicative of kidney disease or injury in the subject. Processing the biological sample obtained from the subject may include (i) subjecting the biological sample to conditions sufficient to isolate, enrich, or extract a plurality of nucleic acid molecules, and (ii) analyzing the plurality of nucleic acid molecules to generate a gene expression profile at the kidney disease or injury-associated genomic loci.
A plurality of nucleic acid molecules can be extracted from the biological sample and further analyzed (e.g., sequenced to identify a plurality of gene transcripts). The nucleic acid molecules may comprise deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). Nucleic acid molecules (e.g., RNA or DNA) can be extracted from biological samples by a variety of methods, such as the FastDNA Kit protocol from MP Biomedicals or the AllPrep DNA/RNA kit from QIAGEN. The extraction method may extract all RNA or DNA molecules from the sample. Alternatively, the extraction method may selectively extract a portion of the RNA or DNA molecules from the sample. RNA molecules extracted from the sample may be converted to cDNA molecules by reverse transcription (RT).
The method may include any of a variety of assays suitable for assessing gene expression of kidney disease or injury-specific markers in a biological sample, including next-generation sequencing (NGS), real-time PCR, microarray analysis, and Luminex-based gene expression analysis. Nucleic acid sequencing may employ any suitable sequencing method, such as shotgun sequencing, single-molecule sequencing, nanopore sequencing, semiconductor sequencing, pyrosequencing, sequencing by synthesis (SBS), multiplex polymerase chain reaction (PCR)-based methods, and exome-targeted sequencing.
In the multiplex PCR workflow, total RNA may first be reverse transcribed into cDNA. Multiplexed gene-specific primers can be hybridized to target sites (e.g., one or more kidney disease or injury biomarkers or kidney disease or injury-associated genomic loci) and then PCR amplified to create amplicons. The primer sequences may be removed. A second PCR can use sequencing primers to create sequencing-ready fragments. Sequencing can include the use of commercially available kits and protocols, such as AmpliSeq from Thermo Fisher or Illumina, or the CleanPlex DNA/RNA amplicon sequencing kits from Paragon Genomics. In the exome-targeted sequencing workflow, total RNA may first be reverse transcribed into cDNA. An appropriate number of PCR cycles may be performed to amplify the initial amount of cDNA to the desired input amount. The initially amplified cDNA may be hybridized to gene-specific probes (e.g., for one or more kidney disease or injury biomarkers or kidney disease or injury-associated genomic loci). The target gene fragments may be selected and enriched. A second PCR amplification of the enriched product may be performed to obtain a sufficient amount for sequencing. Sequencing can include the use of commercial kits and protocols such as the SureSelect XT HS2 kit from Agilent, the TruSeq Exome kit from Illumina, and the HyperPrep kit from Roche.
The RNA or DNA molecules may be labeled, for example, with identifiable tags, to allow multiplexing of multiple samples. Any number of RNA or DNA samples may be multiplexed. For example, a multiplex reaction may comprise at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, or more than 100 initial samples. For example, multiple samples may be labeled with sample barcodes, so that each RNA or DNA molecule can be traced back to the sample (and subject) from which it originated. These tags may be attached to the RNA or DNA molecules by ligation or by PCR amplification with primers.
In some embodiments, real-time fluorescent quantitative PCR (RT-PCR) can be used to assess gene expression of kidney disease or injury-specific markers in biological samples. Within a population of nucleic acids, only specific target nucleic acids may be amplified (e.g., one or more kidney disease or injury biomarkers or kidney disease or injury-associated genomic loci). In certain embodiments, up to 5 gene-specific primer/probe sets may be used to selectively amplify certain targets in each well. During amplification, the fluorescent label carried by the probe can fluoresce, and the fluorescence can be captured by a camera detector. The cycle threshold (Ct), the cycle at which the fluorescence intensity crosses a defined threshold, may be inversely related to the level of gene expression. RT-PCR can be performed using any of a number of commercial kits, such as those provided by Life Technologies, Bio-Rad, Promega, New England Biolabs, and the like.
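The inverse relationship between Ct and expression level can be illustrated with a minimal Python sketch. It uses the widely used 2^-ΔΔCt calculation, which the present text does not prescribe; the reference gene, the control sample, the assumption of roughly 100% amplification efficiency, and all Ct values are illustrative assumptions.

```python
def relative_expression(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Fold change of a target gene versus a control sample (2^-ddCt).

    Assumes the PCR product roughly doubles each cycle (about 100% efficiency).
    """
    delta_ct_sample = ct_target - ct_reference             # normalize to reference gene
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl  # same for the control sample
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: relative to the reference gene, the target crosses the
# threshold 2 cycles earlier in the test sample than in the control sample.
fold = relative_expression(ct_target=24.0, ct_reference=18.0,
                           ct_target_ctrl=26.0, ct_reference_ctrl=18.0)
```

Under these assumptions, a Ct that is two cycles lower corresponds to a fourfold higher expression estimate, which is why a lower Ct indicates higher expression.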
In some embodiments, microarray analysis can be used to assess gene expression of kidney disease or injury-specific markers in a biological sample. Probes for genes of interest (e.g., one or more kidney disease or injury biomarkers or kidney disease or injury-associated genomic loci) can be printed on a microarray chip. Total RNA may first be reverse transcribed into cDNA. Fluorescent dyes may be added during reverse transcription. The labeled cDNA products can be hybridized to the probes on the microarray chip. After hybridization, the microarray can be dried and scanned by an instrument that uses a laser to excite the dye and a detector to measure the emission level. The amount of fluorescence may be proportional to the level of gene expression.
In certain embodiments, Luminex-based gene expression assays can be used to assess gene expression of kidney disease or injury-specific markers in biological samples. The method can directly quantify 3 to 80 RNA targets using branched DNA (bDNA) signal amplification. Urine samples can be lysed to release RNA and incubated overnight with the target-specific probe sets and Luminex capture beads. A signal amplification tree is then built by sequential hybridization of pre-amplifier, amplifier, and label probes. Each amplification unit may provide a 400-fold signal, and with up to 6 amplification units per copy of target RNA, this yields 2400-fold signal amplification per RNA copy. The signal can be detected by reading the fluorescent reporter phycoerythrin on a Luminex instrument. The Luminex-based gene expression analysis may use a commercial system such as the QuantiGene Plex gene expression assay from Thermo Fisher.
After the nucleic acid molecules are sequenced, the sequence reads may be subjected to appropriate bioinformatics processing to generate gene expression data indicative of the presence, absence, or relative assessment of kidney disease or injury. For example, the sequence reads may be aligned to one or more reference genomes (e.g., genomes of one or more species, such as a human genome). Aligned sequence reads can be quantified at a set of genomic loci to generate data indicative of the presence, absence, or relative assessment of kidney disease or injury. For example, sequences corresponding to a set of kidney disease or injury-associated genomic loci can be quantified to generate gene expression data indicative of the presence, absence, or relative assessment of kidney disease or injury.
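The quantification of aligned reads at a set of loci can be sketched as follows. This is a minimal illustration, not the disclosed pipeline: the locus names and coordinates are hypothetical, alignments are reduced to (chromosome, start position) pairs, and counts-per-million scaling is one common normalization chosen here as an assumption.

```python
# Hypothetical loci as (name, chromosome, start, end); coordinates are illustrative only.
LOCI = [("LOCUS_A", "chr1", 100, 200), ("LOCUS_B", "chr2", 500, 650)]


def count_reads_per_locus(alignments, loci=LOCI):
    """Count aligned reads whose start position falls within each locus [start, end)."""
    counts = {name: 0 for name, _, _, _ in loci}
    for chrom, pos in alignments:
        for name, locus_chrom, start, end in loci:
            if chrom == locus_chrom and start <= pos < end:
                counts[name] += 1
    return counts


def counts_per_million(counts, total_aligned_reads):
    """Scale raw counts by sequencing depth so that samples can be compared."""
    return {name: n * 1_000_000 / total_aligned_reads for name, n in counts.items()}


# Toy alignments; the read on chr3 maps outside every locus of interest.
alignments = [("chr1", 120), ("chr1", 150), ("chr2", 510), ("chr3", 42)]
raw = count_reads_per_locus(alignments)
cpm = counts_per_million(raw, total_aligned_reads=len(alignments))
```

A production workflow would normally rely on an established aligner and read-counting tool; the sketch only shows how per-locus counts become depth-normalized expression values.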
Kidney disease or injury can be identified or monitored by using probes configured to selectively enrich for nucleic acid (e.g., RNA or DNA) molecules corresponding to a set of genomic loci (e.g., kidney disease or injury-associated genomic loci). The probes may be oligonucleotides. The probes may have sequence complementarity to a nucleic acid sequence (e.g., of about 5 nucleotides, about 10 nucleotides, about 15 nucleotides, about 20 nucleotides, about 25 nucleotides, about 30 nucleotides, about 35 nucleotides, about 40 nucleotides, about 45 nucleotides, about 50 nucleotides, about 55 nucleotides, about 60 nucleotides, about 65 nucleotides, about 70 nucleotides, about 75 nucleotides, about 80 nucleotides, about 85 nucleotides, about 90 nucleotides, about 95 nucleotides, about 100 nucleotides, or more than 100 nucleotides) from one or more genomic loci (e.g., kidney disease or injury-associated genomic loci). The one or more genomic loci (e.g., kidney disease or injury-associated genomic loci) can comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 95, about 100, or more than 100 different genomic loci. In some embodiments, the set of genomic loci comprises one or more of the kidney disease or injury-associated genomic loci listed in Table 3, Table 4, Table 5, and/or Table 6.
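Sequence complementarity between a probe and a target can be checked with a short sketch. It assumes DNA sequences and perfect complementarity over the whole probe; actual probe design typically also considers mismatches, melting temperature, and specificity, none of which is modeled here. The sequences are made up for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def probe_binds(probe: str, target: str) -> bool:
    """True if the probe is perfectly complementary to some stretch of the target."""
    return reverse_complement(probe) in target.upper()


# Illustrative target containing the stretch GATTACA; the probe is its reverse complement.
target = "GGATCCGATTACAGGCT"
probe = reverse_complement("GATTACA")
```

Here `probe_binds(probe, target)` holds because the probe pairs base-for-base with the GATTACA stretch, whereas an unrelated probe such as `"AAAAAAA"` does not bind.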
The biological sample may be processed without extracting any nucleic acid. For example, the processing may comprise analyzing the biological sample using probes selective for a set of genomic loci (e.g., kidney disease or injury-associated genomic loci). The set of genomic loci (e.g., kidney disease or injury-associated genomic loci) can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 95, about 100, or more than 100 different genomic loci. In some embodiments, the set of genomic loci comprises one or more of the kidney disease or injury-associated genomic loci listed in Table 3, Table 4, Table 5, and/or Table 6.
The processing may include selective analysis of one or more genomic loci (e.g., kidney disease or injury-associated genomic loci) in the biological sample using the probes. The probes may be nucleic acid molecules (e.g., RNA or DNA) having sequence complementarity to a nucleic acid sequence (e.g., of about 5 nucleotides, about 10 nucleotides, about 15 nucleotides, about 20 nucleotides, about 25 nucleotides, about 30 nucleotides, about 35 nucleotides, about 40 nucleotides, about 45 nucleotides, about 50 nucleotides, about 55 nucleotides, about 60 nucleotides, about 65 nucleotides, about 70 nucleotides, about 75 nucleotides, about 80 nucleotides, about 85 nucleotides, about 90 nucleotides, about 95 nucleotides, about 100 nucleotides, or more than 100 nucleotides) from one or more genomic loci (e.g., kidney disease or injury-associated genomic loci). These nucleic acid molecules may be oligonucleotides or enrichment sequences. Analysis of a biological sample using probes selective for one or more genomic loci (e.g., kidney disease or injury-associated genomic loci) may include the use of array hybridization, polymerase chain reaction (PCR), or nucleic acid sequencing (e.g., DNA sequencing or RNA sequencing).
The assay readouts can be quantified at one or more genomic loci (e.g., kidney disease or injury-associated genomic loci) to generate gene expression data indicative of the presence, absence, or relative assessment of kidney disease or injury. For example, quantification of array hybridization or polymerase chain reaction (PCR) corresponding to kidney disease or injury-associated genomic loci can generate gene expression data for those loci in a biological sample. The assay readouts may include quantitative PCR (qPCR) values, digital PCR (dPCR) values, droplet digital PCR (ddPCR) values, fluorescence values, and the like.
Kits
The present disclosure provides kits for identifying or monitoring kidney disease or injury in a subject. A kit may include probes for identifying the presence, absence, or relative amount of sequences at a set of kidney disease or injury-associated genomic loci in a biological sample from the subject, which may be indicative of kidney disease or injury. Such probes can selectively detect sequences at the kidney disease or injury-associated genomic loci in the biological sample. The kit may comprise instructions for processing the biological sample using the probes to generate gene expression data at the set of kidney disease or injury-associated genomic loci in the biological sample of the subject.
The probes in the kit may be selective for sequences at a plurality of kidney disease or injury-associated genomic loci in a biological sample. The probes in the kit may be configured to selectively enrich for nucleic acid (e.g., RNA or DNA) molecules corresponding to a set of kidney disease or injury-associated genomic loci. The probes in the kit may be oligonucleotides. The probes in the kit may have sequence complementarity to nucleic acid sequences of one or more kidney disease or injury-associated genomic loci. The one or more genomic loci (e.g., kidney disease or injury-associated genomic loci) may comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, about 15, about 20, about 25, about 30, about 35, about 40, about 45, about 50, about 55, about 60, about 65, about 70, about 75, about 80, about 85, about 90, about 95, about 100, or more than 100 different genomic loci. In some embodiments, the one or more genomic loci comprise one or more of the kidney disease or injury-associated genomic loci listed in Table 3, Table 4, Table 5, and/or Table 6.
The instructions in the kit may include instructions for analyzing the biological sample using probes that are selective for sequences at kidney disease or injury-associated genomic loci in the biological sample. The probes may be nucleic acid molecules (e.g., RNA or DNA) having sequence complementarity to a nucleic acid sequence (e.g., of about 5 nucleotides, about 10 nucleotides, about 15 nucleotides, about 20 nucleotides, about 25 nucleotides, about 30 nucleotides, about 35 nucleotides, about 40 nucleotides, about 45 nucleotides, about 50 nucleotides, about 55 nucleotides, about 60 nucleotides, about 65 nucleotides, about 70 nucleotides, about 75 nucleotides, about 80 nucleotides, about 85 nucleotides, about 90 nucleotides, about 95 nucleotides, about 100 nucleotides, or more than 100 nucleotides) from one or more genomic loci (e.g., kidney disease or injury-associated genomic loci). These nucleic acid molecules may be oligonucleotides or enrichment sequences. Analysis of the biological sample may involve performing array hybridization, polymerase chain reaction (PCR), or nucleic acid sequencing (e.g., RNA sequencing or DNA sequencing) to generate gene expression data indicative of the presence, absence, or relative amount of sequences at the set of kidney disease or injury-associated genomic loci in the biological sample, which may be indicative of kidney disease or injury. Nucleic acid sequencing may include single-cell sequencing (e.g., single-cell RNA-Seq or single-cell DNA-Seq).
The instructions in the kit may include instructions for measuring and interpreting assay readouts, which can be quantified at one or more kidney disease or injury-associated genomic loci to produce gene expression data indicative of the presence, absence, or relative amount of sequences at the set of kidney disease or injury-associated genomic loci in the biological sample. For example, quantification of array hybridization or polymerase chain reaction (PCR) corresponding to kidney disease or injury-associated genomic loci can produce gene expression data for the set of kidney disease or injury-associated genomic loci in the biological sample. Assay readouts may include quantitative PCR (qPCR) values, digital PCR (dPCR) values, droplet digital PCR (ddPCR) values, fluorescence values, and the like, or normalized values or ratios thereof.
Classification
After the biological sample of the subject is processed, a classifier can be used to process the gene expression data at the kidney disease or injury-associated genomic loci to classify the biological sample and thereby identify or assess kidney disease or injury in the subject. In some embodiments, the classifier may be configured to identify kidney disease or injury with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than 99% across at least about 25, at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, at least about 500, or more than 500 independent samples.
The classifier may include a supervised machine learning algorithm or an unsupervised machine learning algorithm. The classifier may include a classification and regression tree (CART) algorithm. For example, the classifier may include a support vector machine (SVM), linear regression, logistic regression, nonlinear regression, a neural network, an ensemble learning method, a boosting algorithm, an AdaBoost algorithm, a random forest, a deep learning algorithm, a naive Bayes classifier, a recursive feature elimination algorithm, or a combination thereof.
The classifier may be configured to accept a plurality of input variables and to generate one or more output values based on the plurality of input variables. The plurality of input variables may include data indicative of the presence, absence, or relative amount of sequences or transcripts corresponding to a plurality of kidney disease or injury-associated genomic loci. For example, the input variables may include the number of sequences or transcripts that correspond to or align with each of a plurality of kidney disease or injury-associated genomic loci.
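A classifier that takes per-locus expression values as input variables and returns a probability as its output value can be sketched as follows. Logistic regression is one of the algorithm families listed above; this dependency-free implementation, the learning rate, the epoch count, and the two-locus toy data are all illustrative assumptions rather than the disclosed model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_proba(w, b, x):
    """Output value: estimated probability that the sample is disease-positive."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy input variables: normalized expression at two hypothetical loci per sample.
X = [[0.2, 0.1], [0.4, 0.3], [0.1, 0.2], [2.5, 2.0], [3.0, 2.2], [2.2, 2.8]]
y = [0, 0, 0, 1, 1, 1]  # 1 = known kidney disease or injury, 0 = known absence
w, b = train_logistic(X, y)
p_high = predict_proba(w, b, [2.8, 2.4])
p_low = predict_proba(w, b, [0.3, 0.2])
```

In practice an established machine learning library would usually replace the hand-written training loop; the point of the sketch is only the mapping from a vector of input variables to a continuous output value.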
The classifier may have one or more possible output values, each comprising one of a fixed number of possible values (e.g., a linear classifier, a logistic regression classifier, etc.), indicating the classification of the biological sample by the classifier. The classifier may be a binary classifier, such that each of the one or more output values is one of two values (e.g., {0, 1}, {positive, negative}, or {cancerous, non-cancerous}) indicating the classification of the biological sample by the classifier. The classifier may be another type of classifier, such that each of the one or more output values is one of more than two values (e.g., {0, 1, 2}, {positive, negative, indeterminate}, or {diseased, non-diseased, indeterminate}) indicating the classification of the biological sample by the classifier. The output values may include descriptive labels, numerical values, or a combination thereof. Some output values may include descriptive labels. Such descriptive labels may provide an identification or indication of the disease state of the subject and may include, for example, positive, negative, diseased, non-diseased, or indeterminate. Such descriptive labels may provide an identification of a treatment for the disease state of the subject and may include, for example, a therapeutic intervention, a duration of the therapeutic intervention, and/or a dosage of the therapeutic intervention. Such descriptive labels may provide an identification of secondary clinical tests that may be appropriate to perform on the subject and may include, for example, a biopsy, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, or a PET-CT scan. Such descriptive labels may provide a prognosis of the disease state of the subject.
Some descriptive labels may be mapped to numerical values, for example, by mapping "positive" to 1 and "negative" to 0.
Some output values may include numerical values, such as binary, integer, or continuous values. Such binary output values may comprise, for example, {0, 1}. Such integer output values may include, for example, {0, 1, 2}. Such continuous output values may comprise, for example, a probability value of at least 0 and no more than 1, or a non-normalized probability value of at least 0. Such continuous output values may be indicative of a prognosis of the disease or injury state of the subject and may include, for example, an indication of the expected or average risk or severity of kidney disease or injury in the subject. Such continuous output values may represent a prediction of a course of treatment for the disease or injury state of the subject and may include, for example, an indication of the expected duration of efficacy of the course of treatment. Some numerical values may be mapped to descriptive labels, for example, by mapping 1 to "positive" and 0 to "negative".
Some output values may be assigned according to one or more cutoff values. For example, if the sample indicates that the subject has at least a 50% probability of having the disease, the binary classification of the sample may assign an output value of "positive" or 1. For example, if the sample indicates that the subject has less than a 50% probability of having the disease, the binary classification of the sample may assign an output value of "negative" or 0. In this case, a single cutoff value of 50% is used to classify the sample into one of two possible binary output values. Examples of single cutoff values may include about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 25%, about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 95%, about 98%, and about 99%.
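The single-cutoff rule can be written as a short sketch. The function and dictionary names are hypothetical; the default cutoff of 50% and the label-to-value mapping follow the example in the text.

```python
# Mapping between descriptive labels and numerical output values, as in the text.
LABEL_TO_VALUE = {"negative": 0, "positive": 1}


def classify_binary(probability: float, cutoff: float = 0.5) -> str:
    """Return 'positive' when the probability is at least the cutoff, else 'negative'."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return "positive" if probability >= cutoff else "negative"


label = classify_binary(0.62)    # default cutoff of 50%
value = LABEL_TO_VALUE[label]    # numerical form of the same output
```

Raising the cutoff (for example, to 95%) trades sensitivity for specificity, since fewer samples are then called positive.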
As another example, the classification of the sample may assign an output value of "positive" or 1 if the sample indicates that the subject has a probability of having the disease of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, or at least about 99%. The classification of the sample may assign an output value of "positive" or 1 if the sample indicates that the subject has a probability of having the disease of greater than 50%, greater than 55%, greater than 60%, greater than 65%, greater than 70%, greater than 75%, greater than 80%, greater than 85%, greater than 90%, greater than 95%, greater than 98%, or greater than 99%. The classification of the sample may assign an output value of "negative" or 0 if the sample indicates that the subject has a probability of having the disease of less than 50%, less than 45%, less than 40%, less than 35%, less than 30%, less than 25%, less than 20%, less than 10%, less than 5%, less than 2%, or less than 1%. The classification of the sample may assign an output value of "negative" or 0 if the sample indicates that the subject has a probability of having the disease of no more than 50%, no more than 45%, no more than 40%, no more than 35%, no more than 30%, no more than 25%, no more than 20%, no more than 10%, no more than 5%, no more than 2%, or no more than 1%. If the sample is not classified as "positive", "negative", 1, or 0, the classification of the sample may assign an output value of "indeterminate" or 2. In this case, a set of two cutoff values is used to classify the sample into one of three possible output values. Examples of sets of cutoff values may include {1%, 99%}, {2%, 98%}, {5%, 95%}, {10%, 90%}, {15%, 85%}, {20%, 80%}, {25%, 75%}, {30%, 70%}, {35%, 65%}, {40%, 60%}, and {45%, 55%}. Similarly, a set of n cutoff values may be used to classify a sample into one of n+1 possible output values, where n is any positive integer.
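The generalization from n cutoff values to n+1 output values can be sketched with a sorted list of cutoffs and a binary search. The function name and label list are hypothetical; a probability exactly equal to a cutoff is placed in the higher bin, which matches the "at least" wording used for positive calls.

```python
from bisect import bisect_right


def classify_by_cutoffs(probability, cutoffs, labels):
    """Map a probability to one of len(cutoffs) + 1 ordered labels."""
    if len(labels) != len(cutoffs) + 1:
        raise ValueError("need exactly one more label than cutoffs")
    if list(cutoffs) != sorted(cutoffs):
        raise ValueError("cutoffs must be in ascending order")
    # bisect_right counts the cutoffs that are <= probability, so a value equal
    # to a cutoff falls into the higher bin ("at least" semantics).
    return labels[bisect_right(cutoffs, probability)]


# Two cutoffs {20%, 80%} yield three possible output values.
THREE_WAY = ["negative", "indeterminate", "positive"]
call = classify_by_cutoffs(0.50, [0.20, 0.80], THREE_WAY)
```

With the cutoff set {20%, 80%}, probabilities below 20% are called negative, probabilities of at least 80% are called positive, and everything in between is indeterminate.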
The classifier may be trained with a plurality of independent training samples. Each independent training sample may include a biological sample from a subject, associated data obtained by processing the biological sample (as described elsewhere herein), and one or more known output values corresponding to the biological sample (e.g., a clinical diagnosis, prognosis, treatment efficacy, or absence of a disease or injury, such as kidney disease or injury, in the subject). The independent training samples may include biological samples and associated data and outputs obtained from a plurality of different subjects. The independent training samples may include biological samples and associated data and outputs obtained at a plurality of different time points from the same subject (e.g., before, after, and/or during a course of treatment for a disease or injury in the subject). The independent training samples may be associated with the presence of kidney disease or injury (e.g., training samples including biological samples and associated data and outputs obtained from a plurality of subjects known to have kidney disease or injury). The independent training samples may be associated with the absence of kidney disease or injury (e.g., training samples including biological samples and associated data and outputs obtained from a plurality of subjects who are known not to have previously been diagnosed with kidney disease or injury, or who are otherwise asymptomatic for kidney disease or injury).
The classifier may be trained with at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, at least about 350, at least about 400, at least about 450, or at least about 500 independent training samples. The independent training samples may include samples associated with the presence of kidney disease or injury and/or samples associated with the absence of kidney disease or injury. The classifier may be trained with no more than 500, no more than 450, no more than 400, no more than 350, no more than 300, no more than 250, no more than 200, no more than 150, no more than 100, or no more than 50 independent training samples associated with the presence of kidney disease or injury. In some embodiments, the biological sample being classified is independent of the samples used to train the classifier.
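One way to keep evaluated samples independent of the training samples, when a subject may contribute samples at several time points, is to split by subject rather than by sample. The sketch below illustrates that idea; the function name, the tuple layout, the test fraction, and the toy records are assumptions and are not taken from the disclosure.

```python
import random


def split_by_subject(samples, test_fraction=0.25, seed=0):
    """Split (subject_id, features, label) records so no subject is in both sets."""
    subjects = sorted({record[0] for record in samples})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train = [r for r in samples if r[0] not in test_subjects]
    test = [r for r in samples if r[0] in test_subjects]
    return train, test


# Toy records: subject S1 contributes two time points, which must stay together.
samples = [("S1", [0.2, 0.1], 0), ("S1", [0.3, 0.2], 0),
           ("S2", [2.5, 2.0], 1), ("S3", [0.1, 0.2], 0), ("S4", [3.0, 2.2], 1)]
train, test = split_by_subject(samples)
```

Splitting at the subject level avoids the case where one time point of a subject is used for training and another for evaluation, which would inflate the apparent performance.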
The classifier may be trained with a first number of independent training samples associated with the presence of kidney disease or injury and a second number of independent training samples associated with the absence of kidney disease or injury. The first number of independent training samples associated with the presence of kidney disease or injury may be no more than the second number of independent training samples associated with the absence of kidney disease or injury. The first number may be equal to the second number. The first number may be greater than the second number.
The classifier may be configured to identify kidney disease or injury with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%, across at least about 50, at least about 100, at least about 150, at least about 200, at least about 250, at least about 300, or more than 300 independent samples. The accuracy with which the classifier identifies kidney disease or injury can be calculated as the percentage of independent test samples (e.g., from subjects with kidney disease or injury, or from apparently healthy subjects with negative clinical test results for kidney disease or injury) that are correctly identified or classified as having or not having kidney disease or injury.
The classifier may be configured to identify kidney disease or injury with a positive predictive value (PPV) of at least about 5%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than 99%. The PPV with which the classifier identifies kidney disease or injury can be calculated as the percentage of biological samples identified or classified as having kidney disease or injury that correspond to subjects who actually have kidney disease or injury. PPV may also be referred to as precision.
The classifier may be configured to identify kidney disease or injury with a negative predictive value (NPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The NPV with which the classifier identifies kidney disease or injury can be calculated as the percentage of biological samples identified or classified as not having kidney disease or injury that correspond to subjects who truly do not have kidney disease or injury.
The classifier may be configured to identify kidney disease or damage with a clinical sensitivity of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than 99%. The clinical sensitivity of identifying kidney disease or damage using the classifier can be calculated as the percentage of independent test samples associated with the presence of kidney disease or damage (e.g., subjects known to have kidney disease or damage) that are correctly identified or classified as having kidney disease or damage. Clinical sensitivity may also be referred to as recall.
The classifier may be configured to identify kidney disease or damage with a clinical specificity of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than 99%. The clinical specificity of identifying kidney disease or damage using the classifier can be calculated as the percentage of independent test samples associated with the absence of kidney disease or damage (e.g., apparently healthy subjects with negative clinical test results for kidney disease or damage) that are correctly identified or classified as not having kidney disease or damage.
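By way of non-limiting illustration, the PPV, NPV, clinical sensitivity, and clinical specificity defined above can be computed from counts of true and false positives and negatives. The following is a minimal sketch only; the names `ConfusionCounts` and `count_outcomes`, the 0/1 label encoding, and the sample data are assumptions made for the example and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # classified as having disease, truly has disease
    fp: int  # classified as having disease, truly does not
    tn: int  # classified as not having disease, truly does not
    fn: int  # classified as not having disease, truly has disease


def count_outcomes(truth: Sequence[int], predicted: Sequence[int]) -> ConfusionCounts:
    """Tally outcomes; 1 = kidney disease or damage, 0 = none (assumed encoding)."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    pairs = list(zip(truth, predicted))
    return ConfusionCounts(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
    )


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Returns None when the metric is undefined (empty denominator).
    return numerator / denominator if denominator else None


def ppv(c: ConfusionCounts) -> Optional[float]:
    return _ratio(c.tp, c.tp + c.fp)  # also called precision


def npv(c: ConfusionCounts) -> Optional[float]:
    return _ratio(c.tn, c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> Optional[float]:
    return _ratio(c.tp, c.tp + c.fn)  # also called recall


def specificity(c: ConfusionCounts) -> Optional[float]:
    return _ratio(c.tn, c.tn + c.fp)


# Illustrative data only: 4 subjects with disease, 6 without.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
counts = count_outcomes(truth, predicted)
```

Each function returns a fraction between 0 and 1; multiplying by 100 gives the percentage form used in the text.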
The classifier may be configured to identify kidney disease or damage with an Area Under the Curve (AUC) of at least about 0.50, at least about 0.55, at least about 0.60, at least about 0.65, at least about 0.70, at least about 0.75, at least about 0.80, at least about 0.81, at least about 0.82, at least about 0.83, at least about 0.84, at least about 0.85, at least about 0.86, at least about 0.87, at least about 0.88, at least about 0.89, at least about 0.90, at least about 0.91, at least about 0.92, at least about 0.93, at least about 0.94, at least about 0.95, at least about 0.96, at least about 0.97, at least about 0.98, at least about 0.99, or more than 0.99. The AUC may be calculated as an integral of the Receiver Operating Characteristic (ROC) curve (e.g., the area under the ROC curve) associated with the classifier in classifying biological samples as having or not having kidney disease or damage.
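By way of non-limiting illustration, the AUC can be obtained by building the ROC curve from per-sample scores and integrating it with the trapezoidal rule. This is a sketch under stated assumptions: the function names, the 0/1 label encoding, and the sample scores are hypothetical, and the disclosure does not prescribe a particular numerical integration method.

```python
from typing import List, Sequence, Tuple


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points of the ROC curve.

    labels: 1 = kidney disease or damage, 0 = none (assumed encoding).
    scores: higher score = more likely to have kidney disease or damage.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        score = pairs[i][0]
        # Consume all samples tied at this score before emitting one point.
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Illustrative data only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))
```

For this sample data, 8 of the 9 positive/negative pairs are ranked correctly, so the area is 8/9.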
The classifier may be adjusted or tuned to improve one or more of the performance, accuracy, PPV, NPV, clinical sensitivity, clinical specificity, or AUC of identifying kidney disease or damage. The classifier may be adjusted or tuned by adjusting its parameters (e.g., a set of cutoff values used to classify samples, as described herein, or the weights of a neural network). The classifier may be adjusted or tuned continuously during the training process or after the training process is completed.
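By way of non-limiting illustration, one way to tune a cutoff value is to sweep candidate cutoffs and keep the one that maximizes a chosen performance measure. The sketch below uses Youden's J statistic (sensitivity + specificity - 1) as that measure; this choice, the function name `tune_cutoff`, and the sample data are assumptions made for the example, and the disclosure does not specify a tuning criterion.

```python
from typing import Sequence, Tuple


def tune_cutoff(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing Youden's J over the observed scores.

    A sample is classified as having kidney disease or damage when its
    score is >= cutoff. Ties in J keep the lowest cutoff, which favors
    sensitivity (an assumed tie-breaking rule).
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative sample")
    best_cutoff, best_j = None, float("-inf")
    for cutoff in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Illustrative data only: one diseased subject scores below one healthy subject.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
cutoff, j = tune_cutoff(labels, scores)
```

A different criterion (e.g., the highest sensitivity subject to a minimum specificity) could be substituted by changing the quantity compared inside the loop.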
After the classifier is initially trained, a subset of the inputs may be identified as the most influential or most important to include for making high-quality classifications. For example, a subset of the plurality of genomic loci associated with kidney disease or damage may be identified as the most influential or most important for high-quality classification or identification of kidney disease or damage. The plurality of genomic loci associated with kidney disease or damage, or a subset thereof, may be ranked based on metrics indicative of the influence or importance of each genomic locus toward high-quality classification or identification of kidney disease or damage. In some cases, such metrics may be used to significantly reduce the number of input variables (e.g., predictor variables) used to train the classifier to a desired performance level (e.g., based on a desired minimum accuracy, PPV, NPV, clinical sensitivity, clinical specificity, or AUC). For example, if training the classifier with several dozen or hundreds of input variables yields a classification accuracy greater than 99%, then training the classifier instead with only a selected subset of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, no more than 10, no more than 15, no more than 20, no more than 25, no more than 30, no more than 35, no more than 40, no more than 45, no more than 50, or no more than about 100 of the most influential or most important input variables (e.g., marker genes, marker regions, or other genomic loci) may yield a decreased but still acceptable classification accuracy (e.g., at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 96%, at least about 97%, or at least about 98%).
The subset may be selected by ranking all input variables and selecting a predetermined number (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, no more than 30, no more than 35, no more than 40, no more than 45, no more than 50, or no more than 100) of the highest-ranked input variables. In certain embodiments, the selected subset of most influential or most important input variables comprises one or more of the genomic loci listed in Table 2.
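By way of non-limiting illustration, selecting a predetermined number of the highest-ranked input variables can be sketched as follows. The function name `select_top_inputs`, the locus names, and the importance values are hypothetical placeholders; the disclosure does not specify which importance metric is used, and the loci shown are not those of Table 2.

```python
from typing import Dict, List


def select_top_inputs(importance: Dict[str, float], k: int) -> List[str]:
    """Rank input variables by descending importance and keep the top k.

    Ties are broken alphabetically so the result is deterministic
    (an assumed rule for this sketch).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(importance.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical importance scores for placeholder loci.
importance = {
    "locus_a": 0.05,
    "locus_b": 0.40,
    "locus_c": 0.25,
    "locus_d": 0.30,
}
selected = select_top_inputs(importance, k=2)
```

The classifier would then be retrained using only the selected inputs and its performance compared against the desired minimum level.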
Identifying or monitoring kidney disease or damage
After the biological sample is classified by using the classifier to process gene expression data at genomic loci associated with kidney disease or damage, a quantitative measure indicative of the presence, absence, or relative assessment of the kidney disease or damage (e.g., a probability of kidney disease or damage) may be determined. The kidney disease or damage of the subject may then be identified, or its progression or regression monitored, based at least in part on the quantitative measure.
The kidney disease or damage of the subject may be identified with an accuracy of at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The accuracy of identifying kidney disease or damage using the classifier can be calculated as the percentage of independent test samples (e.g., subjects known to have kidney disease or damage, or apparently healthy subjects with negative clinical test results for kidney disease or damage) that are correctly identified or classified as having or not having kidney disease or damage.
The kidney disease or damage of the subject may be identified with a positive predictive value (PPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The PPV of identifying kidney disease or damage using the classifier can be calculated as the percentage of biological samples identified or classified as having kidney disease or damage that correspond to subjects who truly have kidney disease or damage. PPV may also be referred to as precision.
The kidney disease or damage of the subject may be identified with a negative predictive value (NPV) of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than 99%. The NPV of identifying kidney disease or damage using the classifier can be calculated as the percentage of biological samples identified or classified as not having kidney disease or damage that correspond to subjects who truly do not have kidney disease or damage.
The kidney disease or damage of the subject may be identified with a clinical sensitivity of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than about 99%. The clinical sensitivity of identifying kidney disease or damage using the classifier can be calculated as the percentage of independent test samples associated with the presence of kidney disease or damage (e.g., subjects known to have kidney disease or damage) that are correctly identified or classified as having kidney disease or damage. Clinical sensitivity may also be referred to as recall.
The kidney disease or damage of the subject may be identified with a clinical specificity of at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 81%, at least about 82%, at least about 83%, at least about 84%, at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more than 99%. The clinical specificity of identifying kidney disease or damage using the classifier can be calculated as the percentage of independent test samples associated with the absence of kidney disease or damage (e.g., apparently healthy subjects with negative clinical test results for kidney disease or damage) that are correctly identified or classified as not having kidney disease or damage.
After the kidney disease or damage is identified, a stage (e.g., early, intermediate, or late) of the kidney disease or damage may be further determined. The stage of the kidney disease or damage may be determined based at least in part on gene expression data for the genomic loci associated with kidney disease or damage (e.g., quantitative measures of differential gene expression at such loci). For example, early-stage kidney disease or damage may refer to a stage of kidney disease or damage before clinical symptoms appear in the subject. As another example, late-stage kidney disease or damage may refer to a subject having severe kidney disease or damage and/or severely impaired kidney function (e.g., requiring dialysis, kidney transplantation, strict diabetes management, or strict blood pressure control).
After the subject is identified as having kidney disease or damage, the subject may be provided with a therapeutic intervention (e.g., prescribing an appropriate course of treatment for the kidney disease or damage of the subject). The therapeutic intervention may include an effective dose of a drug such as an ACE inhibitor or ARB; vascular-targeting drugs such as Tie-2 activators, sodium-glucose transporter 2 (SGLT2) inhibitors, and glucagon-like peptide 1 (GLP-1) agonists; anti-inflammatory therapies, including inflammatory factor inhibitors, pentoxifylline, and anti-transforming growth factor alpha/epiregulin therapies; anti-oxidative stress therapies, including nicotinamide adenine dinucleotide phosphate (NADPH) oxidase inhibitors and allopurinol; an effective dose of insulin for diabetes management; changes to diet or exercise regimens; surgery; smoking cessation; avoidance of non-steroidal anti-inflammatory drugs; or combinations thereof. If the subject is currently being treated for kidney disease or damage with a course of treatment, the therapeutic intervention may include a subsequent different course of treatment (e.g., to increase treatment efficacy because the current course of treatment is ineffective or the subject is not responding to it).
The therapeutic intervention may include recommending that the subject undergo a secondary clinical test to confirm the diagnosis of kidney disease or damage. Such a secondary clinical test may include a kidney biopsy, an imaging test, a blood test, a computed tomography (CT) scan, a magnetic resonance imaging (MRI) scan, an ultrasound scan, a chest X-ray, a positron emission tomography (PET) scan, a PET-CT scan, or any combination thereof.
After the subject is identified as having kidney disease or damage, the subject may be treated. Treating the subject may include administering an appropriate therapeutic intervention to treat the kidney disease or damage of the subject. The therapeutic intervention may include an effective dose of a drug, an effective dose of insulin for diabetes management, changes to diet or exercise regimens, surgery, or a combination thereof. If the subject is currently being treated for kidney disease or damage with a course of treatment, the therapeutic intervention administered may include a subsequent different course of treatment (e.g., to increase treatment efficacy because the current course of treatment is ineffective or the subject is not responding to it).
If the subject is determined not to have the kidney disease or damage (e.g., diabetic nephropathy), or if a different type of kidney disease is suspected of causing greater injury, the medical intervention may include recommending that the subject undergo a secondary clinical test to determine the cause of the kidney disease. The secondary clinical test may include a kidney biopsy. If the subject is a heavy smoker, a heavy drinker, or a drug abuser, the medical intervention may include advising the subject to stop smoking, drinking, or using drugs, and performing a second test after several months of abstinence. If the subject is morbidly obese, the medical intervention may include weight management, and the subject may undergo a second test after several months of weight loss. If the subject has a persistent kidney or urinary tract infection, the medical intervention may include advising that the infection be treated first, and the subject may undergo a second test after several months of treatment.
Where a subject is determined to be at elevated risk of kidney disease or damage (e.g., diabetic nephropathy), the subject may be provided with a therapeutic intervention (e.g., prescribing and/or administering an appropriate prophylactic course of treatment to protect the subject's kidneys). The therapeutic intervention may include an effective dose of a drug such as an ACE inhibitor or ARB, better glycemic control, an effective dose of insulin for diabetes management, changes to diet or exercise regimens, better blood pressure control, avoidance of non-steroidal anti-inflammatory drugs, weight management, or a combination thereof.
Gene expression data for genomic loci associated with kidney disease or damage (e.g., quantitative measures of gene expression at such loci) can be assessed over a period of time to monitor a patient (e.g., a subject at elevated risk of developing kidney disease, a subject who has kidney disease or damage, or a subject being treated for kidney disease or damage). In such cases, the quantitative measures of gene expression at the genomic loci associated with kidney disease or damage of the patient may change during the course of intervention, treatment, or care. For example, the quantitative measures of gene expression of a patient whose kidney disease or damage is regressing due to an effective intervention or treatment may shift toward the gene expression profile or distribution of a healthy subject. Conversely, for example, the quantitative measures of gene expression of a patient whose kidney disease or damage is progressing due to an ineffective intervention or treatment (or not receiving any intervention or treatment) may shift toward the gene expression profile or distribution of a subject with a more advanced stage of kidney disease or damage.
As another example, the quantitative measures of gene expression at the genomic loci associated with kidney disease or damage of a patient at elevated risk of kidney disease or damage, whose risk is decreasing due to effective prophylactic treatment, may shift toward the gene expression profile or distribution of a healthy subject. Conversely, for example, the quantitative measures of gene expression of a patient at elevated risk of developing kidney disease, who is receiving an ineffective intervention or treatment (or not receiving any intervention or treatment), may shift toward the gene expression profile or distribution of a subject with kidney disease. As another example, if the quantitative measures of gene expression at the genomic loci associated with kidney disease or damage give the patient a lower score while laboratory results indicate that proteinuria or eGFR has not improved, this may suggest the coexistence or progression of another chronic kidney disease.
Progression or regression of kidney disease or damage in a subject may be monitored by monitoring the intervention or course of treatment for the kidney disease or damage of the subject. Monitoring may include assessing the kidney disease or damage of the subject at two or more time points. The assessment may be based at least on gene expression data for the genomic loci associated with kidney disease or damage determined at each of the two or more time points (e.g., quantitative measures of gene expression at such loci).
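By way of non-limiting illustration, an assessment at two time points can be reduced to the sign of the difference between the quantitative measures determined at each point. In the sketch below, the function name `compare_time_points`, the use of a probability as the quantitative measure, and the tolerance of 0.05 are assumptions made for the example; the disclosure does not specify a numerical threshold for treating a change as meaningful.

```python
def compare_time_points(earlier: float, later: float, tolerance: float = 0.05) -> str:
    """Classify the change in a quantitative measure between two time points.

    earlier, later: e.g., probability of kidney disease or damage (0 to 1)
    determined at the earlier and later time point.
    tolerance: hypothetical margin within which the measure is treated as
    unchanged.
    """
    if not (0.0 <= earlier <= 1.0 and 0.0 <= later <= 1.0):
        raise ValueError("measures are expected to be probabilities in [0, 1]")
    difference = later - earlier
    if difference > tolerance:
        return "progression"  # positive difference
    if difference < -tolerance:
        return "regression"  # negative difference
    return "no change"  # zero (or negligible) difference


# Illustrative values only.
after_effective_treatment = compare_time_points(0.80, 0.30)
after_ineffective_treatment = compare_time_points(0.20, 0.70)
stable = compare_time_points(0.50, 0.52)
```

With more than two time points, the same comparison could be applied to each consecutive pair of assessments.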
A difference in the gene expression data for the panel of genomic loci associated with kidney disease or damage (e.g., quantitative measures of gene expression at such loci) determined between the two or more time points may be indicative of one or more clinical indications, such as (i) a diagnosis of the kidney disease or damage of the subject, (ii) a prognosis of the kidney disease or damage of the subject, (iii) a progression of the kidney disease or damage of the subject, (iv) a regression of the kidney disease or damage of the subject, (v) an efficacy of an intervention or course of treatment for treating the kidney disease or damage of the subject, (vi) a non-efficacy of an intervention or course of treatment for treating the kidney disease or damage of the subject, (vii) another type of kidney disease or damage, (viii) a possible coexistence of another kidney disease that is regressing or progressing, and (ix) kidney damage related to tobacco, alcohol, or drug abuse.
A difference in the gene expression data for the panel of genomic loci associated with kidney disease or damage (e.g., quantitative measures of gene expression at such loci) determined between the two or more time points may be indicative of a diagnosis of the kidney disease or damage of the subject. For example, if kidney disease or damage was not detected in the subject at an earlier time point but was detected in the subject at a later time point, then the difference may be indicative of a diagnosis of kidney disease or damage in the subject. A clinical action or decision may be made based on this indication of diagnosis, for example, prescribing a new therapeutic intervention for the subject.
A difference in the gene expression data for the panel of genomic loci associated with kidney disease or damage (e.g., quantitative measures of gene expression at such loci) determined between the two or more time points may be indicative of a prognosis of the kidney disease or damage of the subject.
A difference in the gene expression data for the panel of genomic loci associated with kidney disease or damage (e.g., quantitative measures of gene expression at such loci) determined between the two or more time points may be indicative of a regression of the kidney disease or damage of the subject. For example, if kidney disease or damage was detected in the subject at both an earlier time point and a later time point, and if the difference is negative (e.g., the presence, absence, or relative assessment of gene expression at the panel of genomic loci associated with kidney disease or damage decreased from the earlier time point to the later time point), then the difference may be indicative of a regression (e.g., tumor burden, reduced tumor burden, or tumor size) of the kidney disease or damage of the subject. A clinical action or decision may be made based on this indication of regression, for example, continuing or ending a current therapeutic intervention for the subject.
For example, if kidney disease or damage was detected in the subject at an earlier time point but was not detected in the subject at a later time point, then the difference may be indicative of the efficacy of the course of treatment for treating the kidney disease or damage of the subject.
For example, if kidney disease or damage was detected in the subject at both an earlier time point and a later time point, and if the difference is positive or zero (e.g., the presence, absence, or relative assessment of gene expression at the panel of genomic loci associated with kidney disease or damage increased or remained at a constant level from the earlier time point to the later time point), then the difference may be indicative of the non-efficacy of the course of treatment for treating the kidney disease or damage of the subject. A clinical action or decision may be made based on this indication of non-efficacy.
Outputting a report of kidney disease or damage
After the kidney disease or damage is identified, or its progression or regression is monitored, a report may be electronically output that identifies or provides an indication of the identification, prediction, regression, or elevated risk of kidney disease or damage, or the likelihood that the subject has another type of kidney disease or damage. The subject may not display kidney disease or damage (e.g., the subject may be asymptomatic for kidney disease or damage). The report may be displayed on a graphical user interface (GUI) of an electronic device of a user. The user may be the subject, an administrator, a doctor, a nurse, or another health care worker.
The report may include one or more clinical indications, such as (i) a diagnosis of the kidney disease or damage of the subject, (ii) a prognosis of the kidney disease or damage of the subject, (iii) a progression of the kidney disease or damage of the subject, (iv) a regression of the kidney disease or damage of the subject, (v) an efficacy of an intervention or course of treatment for treating the kidney disease or damage of the subject, (vi) a non-efficacy of an intervention or course of treatment for treating the kidney disease or damage of the subject, (vii) another type of kidney disease or damage, (viii) a possible coexistence of another kidney disease that is regressing or progressing, and (ix) kidney damage related to tobacco, alcohol, or drug abuse. The report may include one or more clinical actions or decisions made based on these clinical indications.
For example, a clinical indication of a diagnosis of kidney disease or damage in a subject may be accompanied by a clinical action of prescribing a new therapeutic intervention for the subject. As another example, a clinical indication of a progression of kidney disease or damage in a subject may be accompanied by a clinical action of prescribing a new therapeutic intervention or switching therapeutic interventions for the subject (e.g., ending the current treatment and prescribing a new treatment). As another example, a clinical indication of a regression of kidney disease or damage in a subject may be accompanied by a clinical action of continuing or ending the current therapeutic intervention for the subject. As another example, a clinical indication of the efficacy of a course of treatment for treating kidney disease or damage in a subject may be accompanied by a clinical action of continuing or ending the current therapeutic intervention for the subject. As another example, a clinical indication of the non-efficacy of a course of treatment for treating kidney disease or damage in a subject may be accompanied by a clinical action of ending the current therapeutic intervention and/or switching to (e.g., prescribing) a different new therapeutic intervention.
Computer system
The present disclosure provides a computer system programmed to implement the methods of the present disclosure. FIG. 3 shows a computer system 301 that is programmed or otherwise configured to, for example, determine quantitative measures of gene expression to generate a gene expression profile of RNA molecules in a genomic region; determine a quantitative measure indicative of the presence, absence, elevated risk, or relative assessment of kidney disease or damage in a subject; analyze gene expression data; identify or provide an indication of kidney disease or damage; and electronically output a report that identifies or provides an indication of kidney disease or damage.
Computer system 301 may regulate various aspects of the analysis, calculation, and generation of the present disclosure, for example, determining quantitative measures of gene expression to generate a gene expression profile of RNA molecules in a genomic region; determining a quantitative measure indicative of the presence, absence, elevated risk, or relative assessment of kidney disease or damage; analyzing gene expression data; identifying or providing an indication of kidney disease or damage; and electronically outputting a report that identifies or provides an indication of kidney disease or damage. The computer system 301 may be an electronic device of a user or a computer system that is remotely located with respect to the electronic device. The electronic device may be a mobile electronic device.
Computer system 301 includes a central processing unit (CPU, also referred to herein as a "processor" and a "computer processor") 305, which may be a single-core or multi-core processor, or a plurality of processors for parallel processing. Computer system 301 also includes memory or memory location 310 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 315 (e.g., hard disk), communication interface 320 (e.g., network adapter) for communicating with one or more other systems, and peripheral devices 325, such as cache, other memory, data storage, and/or electronic display adapters. The memory 310, storage unit 315, interface 320, and peripheral devices 325 communicate with the CPU 305 through a communication bus (solid lines), such as a motherboard. The storage unit 315 may be a data storage unit (or data repository) for storing data. Computer system 301 may be operatively coupled to a computer network ("network") 330 via the communication interface 320. The network 330 may be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
The network 330 is in some cases a telecommunications and/or data network. The network 330 may include one or more computer servers, which may enable distributed computing, such as cloud computing. For example, one or more computer servers may enable cloud computing over the network 330 ("the cloud") to perform various aspects of the analysis, calculation, and generation of the present disclosure, for example, determining quantitative measures of gene expression to generate a gene expression profile of RNA molecules in a genomic region; determining a quantitative measure indicative of the presence, absence, elevated risk, or relative assessment of kidney disease or damage; analyzing gene expression data; identifying or providing an indication of kidney disease or damage; and electronically outputting a report that identifies or provides an indication of kidney disease or damage. Such cloud computing may be provided by cloud computing platforms such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and IBM Cloud. The network 330, in some cases with the aid of the computer system 301, may implement a peer-to-peer network, which may enable devices coupled to the computer system 301 to behave as a client or a server.
The CPU 305 may include one or more computer processors and/or one or more graphics processing units (GPUs). The CPU 305 may execute a sequence of machine-readable instructions, which may be embodied in a program or software. The instructions may be stored in a memory location, such as the memory 310. The instructions may be directed to the CPU 305, and may subsequently program or otherwise configure the CPU 305 to implement the methods of the present disclosure. Operations performed by the CPU 305 may include fetch, decode, execute, and writeback.
The CPU 305 may be part of a circuit, such as an integrated circuit. One or more other components of system 301 may be included in the circuit. In some cases, the circuit is an Application Specific Integrated Circuit (ASIC).
The storage unit 315 may store files such as drivers, libraries, and saved programs. The storage unit 315 may store user data such as user preferences and user programs. In some cases, the computer system 301 may include one or more additional data storage units that are external to the computer system 301, such as on a remote server that communicates with the computer system 301 via an intranet or the Internet.
Computer system 301 may communicate with one or more remote computer systems through the network 330. For example, computer system 301 may communicate with a remote computer system of a user. Examples of remote computer systems include personal computers (e.g., portable PCs), tablet computers, telephones, smart phones (e.g., Android-enabled devices), or personal digital assistants. The user may access computer system 301 via the network 330.
The methods described herein may be implemented by machine (e.g., a computer processor) executable code stored on an electronic storage location of computer system 301, for example, on memory 310 or electronic storage unit 315. The machine-executable or machine-readable code may be provided in the form of software. During use, code may be executed by processor 305. In some cases, code may be retrieved from storage 315 and stored in memory 310 for ready access by processor 305. In some cases, electronic storage unit 315 may be eliminated and machine executable instructions stored on memory 310.
The code may be pre-compiled and configured for use on a machine having a processor adapted to execute the code, or may be compiled at run-time. The code may be provided in an alternative programming language such that the code may be executed in a pre-compiled or compiled form.
Certain aspects of the systems and methods provided herein, such as computer system 301, may be embodied in programming. Various aspects of the technology may be thought of as "products" or "articles of manufacture", typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine-readable medium. The machine-executable code may be stored on an electronic storage unit, such as a memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk. "Storage" type media may include any or all of the tangible memory of computers, processors, or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives, and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server. Thus, another type of media that may bear the software elements includes optical, electrical, and electromagnetic waves, such as those used across physical interfaces between local devices, through wired and optical landline networks, and over various air links. The physical elements that carry such waves, such as wired or wireless links, optical links, or the like, may also be considered as media bearing the software. As used herein, unless restricted to non-transitory, tangible "storage" media, terms such as computer or machine "readable medium" refer to any medium that participates in providing instructions to a processor for execution.
Hence, a machine-readable medium, such as one bearing computer-executable code, may take many forms, including but not limited to a tangible storage medium, a carrier wave medium, or a physical transmission medium. Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer or the like, such as may be used to implement the databases shown in the drawings. Volatile storage media include dynamic memory, such as the main memory of such a computer platform. Tangible transmission media include coaxial cables, copper wire, and fiber optics, including the wires that comprise a bus within a computer system. Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves, such as those generated during radio frequency (RF) and infrared (IR) data communications. Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM, a DVD or DVD-ROM, any other optical medium, punch cards, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM, an EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data. Many of these forms of computer-readable media may be involved in carrying one or more sequences of one or more instructions to a processor for execution.
The computer system 301 may include or be in communication with an electronic display 335 that comprises a user interface (UI) 340 for providing, for example, a visual display of data (e.g., gene expression data) indicative of the presence, absence, elevated risk, or relative assessment of kidney disease or damage in a subject; a determination of the presence, absence, elevated risk, or relative assessment of kidney disease or damage in the subject; an identification of the subject as having kidney disease or damage; or an electronic report identifying or providing an indication of kidney disease or damage. Examples of UIs include, but are not limited to, graphical user interfaces (GUIs) and web-based user interfaces.
The methods and systems of the present disclosure may be implemented by way of one or more algorithms. An algorithm may be implemented by way of software upon execution by the central processing unit 305. For example, the algorithm may determine quantitative measures of gene expression, thereby generating gene expression profiles of RNA molecules at genomic regions; determine a quantitative measure indicative of the presence, absence, elevated risk, or relative assessment of kidney disease or damage in the subject; analyze the gene expression data; identify or provide an indication of kidney disease or damage in the subject; and electronically output a report identifying or providing an indication of the kidney disease or damage.
Examples of the invention
Example 1
Using the presently disclosed methods and systems, urine samples of subjects were non-invasively assessed for diabetic nephropathy (DN). First, fresh urine specimens were collected from more than 300 subjects using urine collection and preservation cups (120 cc, from Norgen Biotek Corp.). The subjects included both men and women, and comprised patients with diabetic nephropathy (DN), patients with diabetes but without nephropathy or renal manifestations (diabetic negative controls, NEG), and patients with non-diabetic chronic kidney disease (CKD).
Next, the subjects' medical records, including physicians' notes and test results, were examined. Stringent selection criteria were used to select the most representative cases of diabetic nephropathy (DN), diabetic negative controls (NEG; diabetes without nephropathy or renal manifestations), and non-diabetic chronic kidney disease (CKD). Then, total RNA was isolated from the urine samples and subjected to whole-transcriptome RNA sequencing. For example, the library preparation kit may be selected from the Illumina Nextera RNA concentration labelling kit, the Illumina TruSeq RNA Exome kit, the Agilent SureSelect XT HS2 RNA preparation kit, and the KAPA RNA HyperPrep kit from Roche.
Next, data analysis was performed on the sequencing reads, including dimension reduction by principal component analysis (PCA), using parameters such as gender, race, age, and batch effect. Gender was the only parameter observed to form two distinct clusters; thus, the dataset was divided into two groups, male and female.
Next, to determine the gene signatures associated with diabetic nephropathy, we performed two comparisons: DN vs. CKD and DN vs. diabetic negative control. The DN samples used in the two comparisons were identical. In each comparison, a list of differentially expressed genes was generated using the DESeq2 package in RStudio. In addition, all genes in our dataset (about 13,000 after filtering) were also used.
Figures 2A-2C illustrate the dimension reduction analysis for diabetic nephropathy (DN) across different age groups (Fig. 2A), genders (Fig. 2B), and races/ethnicities (Fig. 2C). These figures indicate that no distinct clusters were formed among the different age groups (Fig. 2A) or the different racial/ethnic groups (Fig. 2C). Male and female subjects were observed to form two distinct clusters (Fig. 2B), so the data were separated into male and female groups for separate analysis.
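By way of non-limiting illustration, the dimension reduction step described above may be sketched in Python as follows. The sketch assumes NumPy and uses synthetic log-expression values; the function name `pca_scores`, the group sizes, and the magnitude of the simulated gender effect are hypothetical and are not taken from the disclosed dataset.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples (rows) onto the top principal components."""
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # sample coordinates on PC1, PC2, ...

rng = np.random.default_rng(0)
n_per_group, n_genes = 30, 50

# Synthetic data (assumption): group 1 differs from group 0 in the first 10 genes.
group0 = rng.normal(0.0, 1.0, (n_per_group, n_genes))
group1 = rng.normal(0.0, 1.0, (n_per_group, n_genes))
group1[:, :10] += 4.0

X = np.vstack([group0, group1])
labels = np.array([0] * n_per_group + [1] * n_per_group)

scores = pca_scores(X)
# Distance between the two group means along the first principal component.
pc1_gap = abs(scores[labels == 0, 0].mean() - scores[labels == 1, 0].mean())
print(f"PC1 gap between groups: {pc1_gap:.2f}")
```

A large gap along the first principal component relative to the within-group spread would correspond to the two distinct clusters described for Fig. 2B, in which case the data may be split by that parameter before further analysis.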
Next, feature selection tests were performed in Python on the patients and the corresponding gene lists using different classifiers. In each test, 80% of the samples were randomly selected as the training dataset, and the remaining 20% of the samples were used as the test dataset.
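As a minimal sketch of the random 80%/20% partition described above, using only the Python standard library, the following may be considered. The function name `split_80_20`, the sample identifiers, and the seed are hypothetical.

```python
import random

def split_80_20(sample_ids, seed=None):
    """Randomly assign 80% of samples to training and 20% to testing."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)     # seeded shuffle for reproducibility
    n_train = int(round(0.8 * len(ids)))
    return ids[:n_train], ids[n_train:]

# Hypothetical sample identifiers.
train_ids, test_ids = split_80_20([f"sample_{i}" for i in range(100)], seed=42)
print(len(train_ids), len(test_ids))
```

Changing the seed yields a different random partition, which allows the same test to be repeated over many independent splits.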
Among the classifiers tested, the recursive feature elimination classifier was determined to yield the best prediction score. Next, a scoring system was generated in Python using the trained logistic regression classifier, which outputs a probability score (between 0 and 1) describing the probability that a sample belongs to a certain group (e.g., with or without diabetic nephropathy). If the probability score is above a threshold (e.g., 0.5), the sample is classified as "yes" or "positive" for DN (e.g., the subject has DN); otherwise, the sample is classified as "no" or "negative" (e.g., the subject does not have DN).
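The following non-limiting sketch illustrates one way such a scoring system might be assembled with scikit-learn, combining recursive feature elimination (`RFE`) with a logistic regression classifier. The data are synthetic, and the number of genes, the number of retained features, and the 0.5 cut-off are assumptions chosen for illustration rather than values from the disclosed models.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples, n_genes = 200, 40

# Synthetic expression matrix (assumption): only the first 5 genes carry signal.
X = rng.normal(size=(n_samples, n_genes))
y = (X[:, :5].sum(axis=1) + rng.normal(scale=0.5, size=n_samples) > 0).astype(int)

# 80% training / 20% test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Recursive feature elimination around a logistic regression estimator.
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5)
selector.fit(X_train, y_train)

# Train the scoring model on the retained features only.
clf = LogisticRegression(max_iter=1000)
clf.fit(selector.transform(X_train), y_train)

# Probability score (between 0 and 1) of belonging to the positive group.
prob_dn = clf.predict_proba(selector.transform(X_test))[:, 1]
calls = np.where(prob_dn > 0.5, "positive", "negative")
accuracy = float(((prob_dn > 0.5).astype(int) == y_test).mean())
print(f"test accuracy on synthetic data: {accuracy:.2f}")
```

Keeping the feature selector and the scoring model as separate fitted objects makes it straightforward to apply the same retained gene set to a new sample before computing its probability score.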
Diabetic nephropathy is determined only if the scores of both the DN vs. CKD model and the DN vs. diabetic negative control model are above a predetermined threshold (e.g., about 0.70). For example, when only one biomarker score is considered, a DN vs. diabetic negative control score greater than 0.5 may indicate glomerular injury, while a DN vs. diabetic negative control score of less than 0.5 may indicate tubular injury. Thus, specificity can be significantly improved using this dual biomarker assessment method, as shown in Table 1.
TABLE 1. Clinical interpretation of patients' DN vs. diabetic negative control and DN vs. CKD scores
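The dual biomarker decision rule described above may be sketched as follows. The function name `classify_dn` is hypothetical, and the default threshold of 0.70 is simply the example value given in the text.

```python
def classify_dn(score_dn_vs_neg, score_dn_vs_ckd, threshold=0.70):
    """Call DN only when BOTH model scores exceed the threshold."""
    return score_dn_vs_neg > threshold and score_dn_vs_ckd > threshold

# Both scores high: DN is called.
print(classify_dn(0.90, 0.80))   # True
# Only one score high: DN is not called.
print(classify_dn(0.90, 0.60))   # False
```

Requiring both scores to exceed the threshold trades some sensitivity for specificity, because a sample that resembles DN in only one of the two comparisons is not called positive.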
Some patients did not have significant kidney damage (e.g., they had normal urinary protein and eGFR levels) when the samples were taken, but developed microalbuminuria soon thereafter. These subjects were observed to have a scoring pattern similar to the DN scoring pattern. Based on the patient's history, which may include the presence or absence of other diabetic complications, we used the model scores to reasonably predict that these patients were at high risk of developing DN. Table 2 shows the clinical interpretation of the subjects' scores, including typical cases in which diabetic changes were predicted in patients who still had normal urinary protein levels.
TABLE 2 clinical analysis of scores for diabetic nephropathy and diabetic negative controls and non-diabetic chronic kidney disease patients
If a patient is found to be at high risk of developing diabetic nephropathy, the patient may be given therapeutic interventions such as near-normal glycemic control, antihypertensive therapy, and dietary protein restriction. Various drugs may be administered, such as hormones (e.g., insulin), sulfonylureas, biguanides, angiotensin-converting enzyme (ACE) inhibitors, angiotensin receptor blockers (ARBs), beta-adrenergic blockers, calcium channel blockers, and diuretics. More importantly, however, patients who do not have diabetic nephropathy but who present confounding symptoms or biomarkers can be ruled out, so that effective treatment can be focused on other causes at an earlier time point, thereby improving efficacy and reducing costs and side effects.
Further, biopsy-confirmed samples were scored, and most of the scores matched the biopsy results. In each comparison, whether diabetic nephropathy vs. non-diabetic chronic kidney disease or diabetic nephropathy vs. diabetic negative control, thousands of random splits (e.g., 2000 to 5000) were performed. The most frequently selected genes were chosen as the gene signatures for diabetic nephropathy. In this example, four sets of gene signatures were identified, two for male subjects and two for female subjects (diabetic nephropathy vs. non-diabetic chronic kidney disease and diabetic nephropathy vs. diabetic negative control, respectively), as shown in Tables 3, 4, 5, and 6. For each of these four tables, four or five sets (of different sizes) of strongly predictive differential gene expression signatures are provided.
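The idea of repeating random splits and retaining the most frequently selected genes may be illustrated by the sketch below. For brevity, each split ranks genes by the absolute difference in class means on the training subset; this is a simplified stand-in for the recursive feature elimination described in this example, not the disclosed procedure. The function name `stable_genes`, the placeholder gene names, the number of splits, and the synthetic data are all hypothetical.

```python
from collections import Counter
import numpy as np

def stable_genes(X, y, gene_names, n_splits=200, top_k=5, train_frac=0.8, seed=0):
    """Count how often each gene ranks in the top_k across random training subsets."""
    rng = np.random.default_rng(seed)
    counts = Counter()
    n_train = int(train_frac * len(y))
    for _ in range(n_splits):
        idx = rng.permutation(len(y))[:n_train]          # random 80% training subset
        Xt, yt = X[idx], y[idx]
        # Stand-in ranking (assumption): absolute difference of class means.
        diff = np.abs(Xt[yt == 1].mean(axis=0) - Xt[yt == 0].mean(axis=0))
        top = np.argsort(diff)[::-1][:top_k]
        counts.update(gene_names[i] for i in top)
    return counts.most_common(top_k)

rng = np.random.default_rng(1)
y = np.array([0] * 50 + [1] * 50)
X = rng.normal(size=(100, 30))
X[y == 1, :3] += 3.0                                     # three truly informative genes
gene_names = [f"gene_{i}" for i in range(30)]            # placeholder names

result = stable_genes(X, y, gene_names)
print(result)
```

Genes that are selected in nearly every split are less likely to reflect the idiosyncrasies of any single training subset, which is the rationale for using selection frequency as a screening criterion.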
TABLE 3. Differential gene expression signatures for assessing diabetic nephropathy vs. diabetic negative control (female subjects)
TABLE 4. Differential gene expression signatures for assessing diabetic nephropathy vs. non-diabetic chronic kidney disease (female subjects)
TABLE 5. Differential gene expression signatures for assessing diabetic nephropathy vs. diabetic negative control (male subjects)
TABLE 6. Differential gene expression signatures for assessing diabetic nephropathy vs. non-diabetic chronic kidney disease (male subjects)
As described above, using the methods and systems of the present disclosure, urine samples of subjects were non-invasively assessed for diabetic nephropathy (DN). Notably, we utilized several improvements to obtain superior performance, including unique DN selection criteria, gender-specific DN gene signatures, a dual biomarker DN assessment method, the use of four gene signatures (combining the gender-specific signatures with the dual biomarker method), and the use of the recursive feature elimination classifier together with the number of features used. For example, two gene signatures may be used for male subjects (a male-specific DN vs. diabetic negative control gene signature and a male-specific DN vs. non-diabetic chronic kidney disease gene signature), while two different gene signatures may be used for female subjects (a female-specific DN vs. diabetic negative control gene signature and a female-specific DN vs. non-diabetic chronic kidney disease gene signature).
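As a non-limiting illustration of how gender-specific signatures might be looked up before scoring, the following sketch maps each gender and comparison to the table from which its signature is drawn. The registry `SIGNATURE_TABLES`, the comparison labels, and the function name `signatures_for` are hypothetical; the gene lists themselves are those of Tables 3 to 6 and are not reproduced here.

```python
# Hypothetical registry: (gender, comparison) -> source table of the gene signature.
SIGNATURE_TABLES = {
    ("female", "DN_vs_NEG"): "Table 3",
    ("female", "DN_vs_CKD"): "Table 4",
    ("male", "DN_vs_NEG"): "Table 5",
    ("male", "DN_vs_CKD"): "Table 6",
}

def signatures_for(gender):
    """Return the two signature sources used for a subject of the given gender."""
    key = gender.strip().lower()
    if key not in ("male", "female"):
        raise ValueError(f"unsupported gender label: {gender!r}")
    return {cmp: SIGNATURE_TABLES[(key, cmp)] for cmp in ("DN_vs_NEG", "DN_vs_CKD")}

print(signatures_for("Male"))
```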
In addition to using a dual biomarker scoring method (e.g., using a first DN vs. diabetic negative control score and a second DN vs. CKD score), a multi-class classifier may be trained directly on a training dataset that includes positive cases of two or more kidney diseases or injuries (e.g., diabetic nephropathy and CKD) as well as negative controls, using the gene signatures listed in Tables 3, 4, 5, and 6 as input features to the multi-class classifier. The multi-class classifier can be trained to distinguish among three or more classes: positive for a first kidney disease or damage, positive for a second kidney disease or damage, and negative (e.g., no kidney disease or damage).
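A minimal sketch of such a multi-class classifier, assuming scikit-learn and synthetic data, is given below. The class labels, the number of features, and the simulated class centers are hypothetical, and logistic regression is used only as one possible choice of multi-class model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
classes = np.array(["NEG", "DN", "CKD"])
n_per_class, n_genes = 60, 6

# Synthetic class centers (assumption): each disease shifts a different pair of genes.
centers = np.array([
    [0, 0, 0, 0, 0, 0],   # NEG: no kidney disease or damage
    [3, 3, 0, 0, 0, 0],   # DN
    [0, 0, 3, 3, 0, 0],   # CKD
], dtype=float)

X = np.vstack([rng.normal(c, 1.0, (n_per_class, n_genes)) for c in centers])
y = np.repeat(classes, n_per_class)

clf = LogisticRegression(max_iter=1000).fit(X, y)

# One probability per class for a sample; the probabilities sum to 1.
probs = clf.predict_proba(X[:1])
train_accuracy = clf.score(X, y)   # sanity check on training data only
print(dict(zip(clf.classes_, probs[0].round(3))))
```

A single multi-class model returns one probability per class for each sample, whereas the dual biomarker approach combines the outputs of two separate binary models.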
Diabetic nephropathy patients may benefit from therapeutic methods developed using regenerative medicine. These techniques may help to reverse or slow the kidney damage caused by the disease. For example, if a patient's diabetes is cured by islet cell transplantation or stem cell therapy, their renal function may be improved. In addition, new therapies such as stem cell therapies and new drugs may be developed for the treatment of diabetic nephropathy.
Other types of kidney damage (e.g., non-diabetic nephropathy) can be identified using the dual biomarker method by modifying the negative samples (e.g., the NO group). For example, if the score of a certain kidney disease vs. negative increases significantly (e.g., by more than 0.1) when nicotine-dependent subjects in the NO group are replaced with non-smoking subjects, the albuminuria may be a result of nicotine use. The method can be used to identify various kidney diseases and injuries, such as hypertensive nephropathy, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), non-steroidal anti-inflammatory drug (NSAID)-induced nephrotoxicity, thin basement membrane nephropathy, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 nephropathy, interstitial nephritis, lithium nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis (kidney infection), renal artery stenosis, renal cysts, rheumatoid arthritis-associated kidney disease, and kidney stones.
While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. It is not intended that the invention be limited by the specific examples provided within the specification. While the invention has been described with reference to the aforementioned specification, the descriptions and illustrations of the embodiments herein are not meant to be construed in a limiting sense. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. Furthermore, it shall be understood that all aspects of the invention are not limited to the specific depictions, configurations, or relative proportions set forth herein, which depend upon a variety of conditions and variables. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is therefore contemplated that the invention shall also cover any such alternatives, modifications, variations, or equivalents. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
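The comparison of scores before and after modifying the negative group may be sketched as follows. The function name `score_shift_flag` and the example scores are hypothetical, and the comparison of mean scores is an assumption about how an increase of more than 0.1 might be measured.

```python
from statistics import mean

def score_shift_flag(scores_before, scores_after, min_increase=0.1):
    """Return the change in mean score and whether it exceeds min_increase."""
    shift = mean(scores_after) - mean(scores_before)
    return shift, shift > min_increase

# Hypothetical scores with the original and the modified negative group.
shift, flagged = score_shift_flag([0.50, 0.60], [0.70, 0.80])
print(round(shift, 2), flagged)
```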

Claims (190)

1. A method of processing or analyzing a body sample of a subject, comprising:
(a) Analyzing the body sample to generate a dataset comprising one or more levels of gene expression products in the body sample, wherein one or more levels of gene expression products correspond to a set of genes associated with kidney disease or damage;
(b) Computer processing the dataset from (a) to determine a presence, an absence, or an elevated risk of the kidney disease or damage in the subject with an accuracy of at least about 80%; and
(c) Electronically outputting a report indicative of said presence, said absence, or said elevated risk of the kidney disease or damage in the subject determined in (b).
2. The method of claim 1, wherein the body sample is selected from the group consisting of a blood sample, a serum sample, a plasma sample, a saliva sample, a stool sample, a sputum sample, a urine sample, a semen sample, a transvaginal fluid sample, a cerebrospinal fluid sample, a sweat sample, a cell sample, and a tissue sample.
3. The method of claim 2, wherein the body sample is a urine sample.
4. The method of claim 3, wherein the urine sample is a fresh urine sample, a frozen urine sample, or a preserved urine sample.
5. The method of claim 1, wherein (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from the body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and sequencing at least a portion of the cDNA molecules to produce the dataset, wherein the dataset comprises sequencing reads.
6. The method of claim 1, wherein (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from the body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and assaying at least a portion of the cDNA molecules by real-time polymerase chain reaction (RT-PCR) to produce the dataset.
7. The method of claim 1, wherein (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from the body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and analyzing at least a portion of the cDNA molecules by microarray analysis to produce the dataset; wherein the dataset comprises counts of gene transcripts.
8. The method of claim 1, wherein (a) comprises hybridizing ribonucleic acid (RNA) molecules obtained or derived from the body sample using a specific probe set, and analyzing the hybridized RNA molecules using a Luminex platform to generate the dataset.
9. The method of any one of claims 5 to 8, wherein (a) comprises selectively enriching at least a portion of said cDNA molecules or said RNA molecules for a set of genomic loci associated with said kidney disease or damage.
10. The method of any one of claims 5 to 8, wherein (a) comprises amplifying at least a portion of the cDNA molecules or the RNA molecules.
11. The method of any one of claims 5 to 8, wherein (a) comprises aligning at least a portion of the sequencing reads to a reference sequence.
12. The method of claim 11, wherein the reference sequence is at least a portion of a human reference genome.
13. The method of claim 11, wherein (a) comprises generating a count of gene transcripts.
14. The method of claim 13, wherein the count of gene transcripts is normalized to produce a normalized count of gene transcripts.
15. The method of claim 1, wherein the kidney disease or injury is selected from the group consisting of early-stage kidney disease, mid-stage kidney disease, end-stage kidney disease, asymptomatic kidney disease, diabetic nephropathy, hypertensive nephropathy, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), non-steroidal anti-inflammatory drug (NSAID)-induced nephrotoxicity, thin basement membrane nephropathy, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 nephropathy, interstitial nephritis, lithium nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis, renal artery stenosis, renal cysts, rheumatoid arthritis-associated kidney disease, and kidney stones.
16. The method of claim 15, wherein the kidney disease or injury is diabetic nephropathy.
17. The method of claim 16, wherein the diabetic nephropathy is early stage diabetic nephropathy.
18. The method of claim 16, wherein the subject is free of diabetic nephropathy symptoms.
19. The method of claim 16, wherein the set of genes comprises at least one gene selected from the group consisting of the genes listed in table 3, the genes listed in table 4, the genes listed in table 5, and the genes listed in table 6.
20. The method of claim 1, wherein (b) comprises processing the data set using a trained algorithm.
21. The method of claim 20, wherein the trained algorithm comprises a trained machine learning algorithm.
22. The method of claim 21, wherein said trained machine learning algorithm is selected from the group consisting of a support vector machine (SVM), a Bayes classification, a linear regression, a quantile regression, a logistic regression, a nonlinear regression, a random forest, a neural network, an ensemble learning method, a boosting algorithm, an AdaBoost algorithm, a recursive feature elimination (RFE) algorithm, and any combination thereof.
23. The method of claim 22, wherein the trained machine learning algorithm comprises the Recursive Feature Elimination (RFE) algorithm.
24. The method of claim 21, wherein the trained machine learning algorithm is trained using a plurality of training samples comprising a first set of body samples from subjects having the kidney disease or damage and a second set of body samples from subjects not having the kidney disease or damage, wherein the first set of body samples and the second set of body samples are different from the body sample of the subject.
25. The method of claim 21, wherein the trained machine learning algorithm is trained using a plurality of training samples comprising a first set of body samples from subjects having the kidney disease or damage and a second set of body samples from subjects having other types of kidney disease or damage, wherein the first set of body samples and the second set of body samples are different from the body sample of the subject.
26. The method of claim 1, wherein (a) comprises comparing the one or more gene expression product levels to a reference value.
27. The method of claim 26, wherein the reference value corresponds to levels of gene expression products of a first set of body samples from subjects having the kidney disease or damage and/or a second set of body samples from subjects not having the kidney disease or damage.
28. The method of claim 26, wherein the reference value corresponds to levels of gene expression products of a first set of body samples from subjects having the kidney disease or damage and/or a second set of body samples from subjects having other kidney disease or damage.
29. The method of claim 1, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a sensitivity of at least about 80%.
30. The method of claim 1, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a sensitivity of at least about 90%.
31. The method of claim 1, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a specificity of at least about 80%.
32. The method of claim 1, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a specificity of at least about 90%.
33. The method of claim 1, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a positive predictive value of at least about 80%.
34. The method of claim 1, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a negative predictive value of at least about 80%.
35. The method of claim 1, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with an area under the curve (AUC) of at least 0.80.
36. The method of claim 1, further comprising determining a clinical intervention for said subject based at least in part on said presence or said increased risk of said kidney disease or damage determined in (b).
37. The method of claim 36, wherein the clinical intervention is selected from the group consisting of drug treatment, enhancing glycemic control, hypertension control, lowering high cholesterol, promoting bone health, diet control, lifestyle changes, weight loss, exercise, smoking cessation, controlling alcohol intake, reducing/stopping abuse of drugs, and avoiding non-steroidal anti-inflammatory drugs.
38. The method of claim 37, wherein the drug treatment is based on blockade of the renin-angiotensin-aldosterone system.
39. The method of claim 1, wherein (b) comprises analyzing a first set of genes for distinguishing between subjects having a first type of kidney disease or damage and subjects having no kidney disease or damage, and a second set of genes for distinguishing between the first type of kidney disease or damage and a second type of kidney disease or damage.
40. The method of claim 39, wherein (b) comprises analyzing a first set of genes for distinguishing diabetic nephropathy (DN) from subjects without kidney disease (diabetic negative control) and a second set of genes for distinguishing diabetic nephropathy (DN) from other chronic kidney disease (CKD).
41. The method of claim 40, wherein the first set of genes is selected from the genes listed in tables 3 and 5 and the second set of genes is selected from the genes listed in tables 4 and 6.
42. The method of claim 40, wherein (b) comprises generating a first diabetic nephropathy vs. diabetes negative control score based on said first set of genes and a second diabetic nephropathy vs. CKD score based on said second set of genes.
43. The method of claim 42, wherein glomerular injury is indicated when the first diabetic nephropathy vs. diabetes negative control score is greater than 0.5, and tubular injury is indicated when the score is less than 0.5.
44. The method of claim 1, wherein (b) comprises analyzing different sets of male-specific or female-specific genes based on the gender of the subject.
45. The method of claim 1, further comprising analyzing the body sample of the subject at two or more different time points to produce two or more datasets, and computer processing the two or more datasets to determine the presence, absence, or elevated risk of the kidney disease or damage in the subject.
46. The method of claim 1, further comprising determining a presence, an absence, or an elevated risk of another type of kidney disease or injury in the subject, and electronically outputting a report indicative of said presence, said absence, or said elevated risk of said another type of kidney disease or injury in said subject.
47. The method of claim 1, further comprising analyzing the body sample of the subject at two or more different time points to produce two or more datasets, and computer processing the two or more datasets to determine the presence, absence, or elevated risk of another kidney disease or injury in the subject.
48. A method of processing or analyzing a body sample of a subject, comprising:
(a) Analyzing the body sample to generate a dataset comprising one or more levels of gene expression products in the body sample, wherein the one or more levels of gene expression products correspond to a set of genes associated with kidney disease or damage;
(b) Computer processing the dataset from (a) to determine a presence, an absence, or an elevated risk of the kidney disease or damage in the subject with a sensitivity of at least 80%; and
(c) Electronically outputting a report indicative of said presence, said absence, or said elevated risk of the kidney disease or damage in the subject determined in (b).
49. The method of claim 48, wherein said body sample is selected from the group consisting of a blood sample, a serum sample, a plasma sample, a saliva sample, a stool sample, a sputum sample, a urine sample, a semen sample, a transvaginal fluid sample, a cerebrospinal fluid sample, a sweat sample, a cell sample, and a tissue sample.
50. The method of claim 49, wherein the body sample is a urine sample.
51. The method of claim 48, wherein the urine sample is a fresh urine sample, a frozen urine sample, or a preserved urine sample.
52. The method of claim 48, wherein (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from said body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and sequencing at least a portion of the cDNA molecules to produce the dataset, wherein the dataset comprises sequencing reads.
53. The method of claim 48, wherein (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from said body sample to obtain complementary deoxyribonucleic acid (cDNA) molecules, and analyzing at least a portion of said cDNA molecules by real-time polymerase chain reaction (RT-PCR) to obtain said dataset.
54. The method of claim 48, wherein (a) comprises reverse transcribing ribonucleic acid (RNA) molecules obtained or derived from the body sample to produce complementary deoxyribonucleic acid (cDNA) molecules, and analyzing at least a portion of the cDNA molecules by microarray analysis to produce the dataset, wherein the dataset comprises counts of gene transcripts.
55. The method of claim 48, wherein (a) comprises hybridizing ribonucleic acid (RNA) molecules obtained or derived from said body sample using a specific probe set, and analyzing said hybridized RNA molecules using a Luminex platform to obtain said dataset.
56. The method of any one of claims 52 to 55, wherein (a) comprises selectively enriching at least a portion of said cDNA molecules or said RNA molecules for a set of genomic loci associated with said kidney disease or damage.
57. The method of any one of claims 52 to 55, wherein (a) comprises amplifying at least a portion of said cDNA molecules or said RNA molecules.
58. The method of any one of claims 52 to 55, wherein (a) comprises aligning at least a portion of the sequencing reads to a reference sequence.
59. The method of claim 58, wherein the reference sequence is at least a portion of a human reference genome.
60. The method of claim 58, wherein (a) comprises generating a count of gene transcripts.
61. The method of claim 60, wherein the count of gene transcripts is normalized to yield a normalized count of gene transcripts.
62. The method of claim 48, wherein the kidney disease or injury is selected from the group consisting of early kidney disease, mid-stage kidney disease, end-stage kidney disease, asymptomatic kidney disease, diabetic nephropathy, hypertensive kidney disease, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), non-steroidal anti-inflammatory drug nephrotoxicity, thin basement membrane kidney disease, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 nephropathy, interstitial nephritis, lithium salt nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis, renal artery stenosis, renal cyst, kidney disease associated with rheumatoid arthritis, and kidney stones.
63. The method of claim 62, wherein the kidney disease or injury is diabetic nephropathy.
64. The method of claim 63, wherein the diabetic nephropathy is early stage diabetic nephropathy.
65. The method of claim 62, wherein the subject is asymptomatic for the diabetic nephropathy.
66. The method of claim 63, wherein said set of genes comprises at least one gene selected from the group consisting of the genes listed in Table 3, the genes listed in Table 4, the genes listed in Table 5, and the genes listed in Table 6.
67. The method of claim 49, wherein (b) comprises processing the data set using a trained algorithm.
68. The method of claim 67, wherein the trained algorithm comprises a trained machine learning algorithm.
69. The method of claim 68, wherein said trained machine learning algorithm is selected from the group consisting of support vector machines (SVMs), Bayesian classification, linear regression, quantile regression, logistic regression, nonlinear regression, random forests, neural networks, ensemble learning methods, boosting algorithms, AdaBoost algorithms, recursive feature elimination (RFE) algorithms, and any combination thereof.
70. The method of claim 69, wherein said trained machine learning algorithm comprises said Recursive Feature Elimination (RFE) algorithm.
71. The method of claim 68, wherein said trained machine learning algorithm is trained with a plurality of training samples comprising a first set of body samples from a subject suffering from said kidney disease or impairment and a second set of body samples from a subject not suffering from kidney disease or impairment, wherein said first and second sets of body samples are different from said subject's body samples.
72. The method of claim 67, wherein said trained machine learning algorithm is trained from a plurality of training samples comprising a first set of body samples from a subject suffering from said kidney disease or impairment and a second set of body samples from a subject suffering from other types of kidney disease or impairment, wherein said first and second sets of body samples are different from said subject's body samples.
73. The method of claim 48, wherein (a) comprises comparing the one or more gene expression product levels to a reference value.
74. The method of claim 72, wherein said reference value corresponds to a set of gene expression products from a first set of body samples from subjects suffering from said kidney disease or damage and/or a second set of body samples from subjects not suffering from kidney disease or damage.
75. The method of claim 72, wherein said reference value corresponds to a set of gene expression products from a first set of body samples from subjects suffering from said kidney disease or damage and/or a second set of body samples from subjects suffering from other types of kidney disease or damage.
76. The method of claim 72, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a sensitivity of at least about 80%.
77. The method of claim 48, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a sensitivity of at least about 90%.
78. The method of claim 48, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a specificity of at least about 80%.
79. The method of claim 48, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a specificity of at least about 90%.
80. The method of claim 48, further comprising detecting said presence of or said increased risk of said kidney disease or injury in said subject with a positive predictive value of at least about 80%.
81. The method of claim 48, further comprising detecting said presence of or said increased risk of said kidney disease or injury in said subject with a negative predictive value of at least about 80%.
82. The method of claim 48, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject at an area under the curve (AUC) of at least 0.80.
83. The method of claim 48, further comprising determining clinical intervention in said subject based at least in part on said presence or said increased risk of said kidney disease or injury determined in (b).
84. The method of claim 83, wherein the clinical intervention is selected from the group consisting of drug treatment, enhancing glycemic control, hypertension control, lowering high cholesterol, promoting bone health, diet control, lifestyle changes, weight loss, exercise, smoking cessation, controlling alcohol intake, reducing/stopping abuse of drugs, and avoiding non-steroidal anti-inflammatory drugs.
85. The method of claim 84, wherein the drug treatment relies on blockade of the renin-angiotensin-aldosterone system.
86. The method of claim 48, wherein (b) comprises analyzing a first set of genes for distinguishing subjects having a first type of kidney disease or damage from subjects without kidney disease or damage (diabetic negative control), and analyzing a second set of genes for distinguishing subjects having the first type of kidney disease or damage from subjects having a second type of kidney disease or damage.
87. The method of claim 86, wherein (b) comprises analyzing a first set of genes for distinguishing diabetic nephropathy (DN) subjects from non-kidney disease (diabetic negative control) subjects, and analyzing a second set of genes for distinguishing diabetic nephropathy from other chronic kidney disease (CKD).
88. The method of claim 87, wherein the first set of genes is selected from the genes listed in tables 3 and 5 and the second set of genes is selected from the genes listed in tables 4 and 6.
89. The method of claim 87, wherein (b) comprises generating a first diabetic nephropathy vs. diabetes negative control score based on said first set of genes and a second diabetic nephropathy vs. CKD score based on said second set of genes.
90. The method of claim 89, wherein a first diabetic nephropathy vs. diabetes negative control score greater than 0.5 indicates glomerular injury and a score less than 0.5 indicates tubular injury.
91. The method of claim 48, wherein (b) comprises analyzing different male-specific or female-specific gene sets according to the sex of the subject.
92. The method of claim 48, further comprising analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine said presence, said absence, or said increased risk of said kidney disease or injury in said subject.
93. The method of claim 48, further comprising determining a presence, an absence, or an increased risk of another type of kidney disease or injury in said subject, and electronically outputting a report indicating the presence, the absence, or the increased risk of the other type of kidney disease or injury in the subject.
94. The method of claim 48, further comprising analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine a presence, an absence, or an increased risk of another type of kidney disease or injury in the subject.
95. The method of claim 48, wherein the sensitivity of determining said presence or said increased risk of said kidney disease or injury in said subject is determined from the proportion of cases in which said presence or said increased risk of said kidney disease or injury is correctly determined in independent test samples.
96. A method of processing or analyzing a body sample of a subject, comprising:
(a) sequencing ribonucleic acid (RNA) molecules obtained or derived from the body sample to produce sequencing reads indicative of counts of gene transcripts in the body sample, wherein the counts of gene transcripts correspond to a set of genes associated with kidney disease or damage;
(b) computer processing the counts of gene transcripts from (a) to determine a presence, an absence, or an increased risk of the kidney disease or injury in the subject; and
(c) electronically outputting a report indicating said presence, said absence, or said increased risk of said kidney disease or injury in said subject determined in (b).
97. The method of claim 96, wherein the body sample is selected from the group consisting of a blood sample, a serum sample, a plasma sample, a saliva sample, a stool sample, a sputum sample, a urine sample, a semen sample, a transvaginal fluid sample, a cerebrospinal fluid sample, a sweat sample, a cell sample, and a tissue sample.
98. The method of claim 97, wherein the body sample is a urine sample.
99. The method of claim 98, wherein the urine sample is a fresh urine sample, a frozen urine sample, or a preserved urine sample.
100. The method of claim 96, wherein (a) comprises reverse transcribing the RNA molecules to produce complementary deoxyribonucleic acid (cDNA) molecules, and sequencing at least a portion of the cDNA molecules to produce the sequencing reads.
101. The method of claim 100, wherein (a) comprises selectively enriching at least a portion of said cDNA molecules for a set of genomic loci associated with said kidney disease or disorder.
102. The method of claim 100, wherein (a) comprises amplifying at least a portion of the cDNA molecules.
103. The method of claim 96, wherein (a) comprises aligning at least a portion of the sequencing reads to a reference sequence.
104. The method of claim 103, wherein the reference sequence is at least a portion of a human reference genome.
105. The method of claim 96, wherein the kidney disease or injury is selected from the group consisting of early kidney disease, mid-stage kidney disease, end-stage kidney disease, asymptomatic kidney disease, diabetic nephropathy, hypertensive kidney disease, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), non-steroidal anti-inflammatory drug nephrotoxicity, thin basement membrane kidney disease, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 nephropathy, interstitial nephritis, lithium salt nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis, renal artery stenosis, renal cyst, kidney disease associated with rheumatoid arthritis, and kidney stones.
106. The method of claim 105, wherein the kidney disease or injury is diabetic nephropathy.
107. The method of claim 106, wherein the diabetic nephropathy is early stage diabetic nephropathy.
108. The method of claim 105, wherein the subject is free of diabetic nephropathy symptoms.
109. The method of claim 106, wherein the set of genes comprises at least one gene selected from the group consisting of the genes listed in table 3, the genes listed in table 4, the genes listed in table 5, and the genes listed in table 6.
110. The method of claim 96, wherein (b) comprises processing the data set using a trained algorithm.
111. The method of claim 110, wherein the trained algorithm comprises a trained machine learning algorithm.
112. The method of claim 111, wherein said trained machine learning algorithm is selected from the group consisting of support vector machines (SVMs), Bayesian classification, linear regression, quantile regression, logistic regression, nonlinear regression, random forests, neural networks, ensemble learning methods, boosting algorithms, AdaBoost algorithms, recursive feature elimination (RFE) algorithms, and any combination thereof.
113. The method of claim 112, wherein the trained machine learning algorithm comprises the recursive feature elimination (RFE) algorithm.
114. The method of claim 111, wherein the trained machine learning algorithm is trained using a plurality of training samples comprising a first set of body samples from subjects suffering from the kidney disease or damage and a second set of body samples from subjects not suffering from the kidney disease or damage, wherein the first and second sets of body samples are different from the body sample of the subject.
115. The method of claim 111, wherein the trained machine learning algorithm is trained using a plurality of training samples comprising a first set of body samples from subjects having the kidney disease or damage and a second set of body samples from subjects having other types of kidney disease or damage, wherein the first set of body samples and the second set of body samples are different from the body sample of the subject.
116. The method of claim 96, wherein (a) comprises comparing the one or more gene expression product levels to a reference value.
117. The method of claim 115, wherein the reference value corresponds to a set of gene expression products from a first set of body samples from subjects suffering from the kidney disease or damage and/or a second set of body samples from subjects not suffering from kidney disease or damage.
118. The method of claim 115, wherein the reference value corresponds to a set of gene expression products from a first set of body samples from subjects suffering from the kidney disease or damage and/or a second set of body samples from subjects suffering from other types of kidney disease or damage.
119. The method of claim 96, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with an accuracy of at least about 80%.
120. The method of claim 96, further comprising detecting said presence of or said increased risk of said kidney disease or injury in said subject with a sensitivity of at least about 80%.
121. The method of claim 96, further comprising detecting said presence of or said increased risk of said kidney disease or injury in said subject with a sensitivity of at least about 90%.
122. The method of claim 96, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a specificity of at least about 80%.
123. The method of claim 96, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a specificity of at least about 90%.
124. The method of claim 96, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a positive predictive value of at least about 80%.
125. The method of claim 96, further comprising detecting said presence of or said increased risk of said kidney disease or injury in said subject with a negative predictive value of at least about 80%.
126. The method of claim 96, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject at an area under the curve (AUC) of at least 0.80.
127. The method of claim 96, further comprising determining clinical intervention in the subject based at least in part on the presence or the elevated risk of kidney disease or damage determined in (b).
128. The method of claim 127, wherein the clinical intervention is selected from the group consisting of drug treatment, enhancing glycemic control, hypertension control, lowering high cholesterol, promoting bone health, diet control, lifestyle changes, weight loss, exercise, smoking cessation, controlling alcohol intake, reducing/stopping abuse of drugs, and avoiding non-steroidal anti-inflammatory drugs.
129. The method of claim 127, wherein the drug treatment relies on blockade of the renin-angiotensin-aldosterone system.
130. The method of claim 96, wherein (b) comprises analyzing a first set of genes for distinguishing subjects having a first type of kidney disease or damage from subjects without kidney disease or damage, and a second set of genes for distinguishing the first type of kidney disease or damage from a second type of kidney disease or damage.
131. The method of claim 130, wherein (b) comprises analyzing a first set of genes for differentially distinguishing Diabetic Nephropathy (DN) from non-renal disease (NEG) subjects and a second set of genes for differentially distinguishing diabetic nephropathy from other Chronic Kidney Disease (CKD).
132. The method of claim 131, wherein the first set of genes is selected from the genes listed in tables 3 and 5 and the second set of genes is selected from the genes listed in tables 4 and 6.
133. The method of claim 131, wherein (b) comprises generating a first diabetic nephropathy vs. diabetes negative control score based on the first set of genes and a second diabetic nephropathy vs. CKD score based on the second set of genes.
134. The method of claim 133, wherein a first diabetic nephropathy vs. diabetes negative control score greater than 0.5 indicates glomerular injury and less than 0.5 indicates tubular injury.
135. The method of claim 96, wherein (b) comprises analyzing different male-specific or female-specific gene sets according to the sex of the subject.
136. The method of claim 96, further comprising analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine the presence, absence, or increased risk of the kidney disease or injury in the subject.
137. The method of claim 96, further comprising determining a presence, an absence, or an increased risk of another type of kidney disease or injury in the subject, and electronically outputting a report indicating the presence, the absence, or the increased risk of the other type of kidney disease or injury in said subject.
138. The method of claim 96, further comprising analyzing body samples of the subject at two or more different time points to produce two or more data sets, and computer processing the two or more data sets to determine a presence, an absence, or an increased risk of another type of kidney disease or injury in the subject.
139. A method of processing or analyzing a body sample of a subject, comprising:
(a) analyzing a plurality of cells obtained or derived from the body sample to generate a dataset comprising quantitative measurements of a set of cell-based biomarkers of the plurality of cells, the set comprising proteins associated with kidney disease or damage;
(b) computer processing the dataset from (a) to determine a presence or an increased risk of the kidney disease or damage in the subject; and
(c) electronically outputting a report indicating said presence or said increased risk of said kidney disease or damage in the subject determined in (b).
140. A method of processing or analyzing a body sample of a subject, comprising:
(a) analyzing the body sample or a derivative thereof to obtain data comprising one or more quantitative measurements of a set of biomarkers associated with kidney disease or damage, wherein the set of biomarkers comprises at least 5 biomarkers selected from the group consisting of the biomarkers set forth in table 3, the biomarkers set forth in table 4, the biomarkers set forth in table 5, and the biomarkers set forth in table 6;
(b) computer processing the data from (a) to determine a presence or an increased risk of the kidney disease or damage in the subject; and
(c) electronically outputting a report indicating said presence or said increased risk of said kidney disease or damage in the subject determined in (b).
141. The method of claim 140, wherein the set of biomarkers comprises no more than 20 biomarkers selected from the group consisting of the biomarkers set forth in table 3, the biomarkers set forth in table 4, the biomarkers set forth in table 5, and the biomarkers set forth in table 6.
142. A kit for processing or analyzing a body sample of a subject, comprising a set of probes for identifying the presence, absence, or relative amount of a set of genomic regions associated with kidney disease or damage in said body sample or a derivative thereof, wherein said set of biomarkers comprises at least 5 biomarkers selected from the group consisting of the biomarkers set forth in table 3, the biomarkers set forth in table 4, the biomarkers set forth in table 5, and the biomarkers set forth in table 6.
143. A method of diagnosing kidney disease or damage comprising:
(a) analyzing a body sample of a subject or a derivative thereof to generate a dataset comprising one or more quantitative measurements of a set of biomarkers associated with kidney disease or damage, wherein the set of biomarkers comprises at least 5 biomarkers selected from the group consisting of the biomarkers set forth in table 3, the biomarkers set forth in table 4, the biomarkers set forth in table 5, and the biomarkers set forth in table 6; and
(b) providing a diagnosis of said kidney disease or damage based on a comparison of the quantitative measurements of the set of biomarkers with a set of reference values.
144. A method of treating kidney disease or disorder comprising:
(a) diagnosing kidney disease or damage in a subject by the method of claim 122; and
(b) treating said kidney disease or damage in said subject.
145. A method of processing or analyzing a urine sample of a subject, comprising:
(a) sequencing ribonucleic acid (RNA) molecules obtained or derived from the urine sample to produce sequencing reads indicative of counts of gene transcripts in the urine sample, wherein the counts of gene transcripts correspond to a set of genes associated with diabetic nephropathy;
(b) computer processing the counts of gene transcripts from (a) to determine a presence, an absence, or an increased risk of diabetic nephropathy in the subject; and
(c) electronically outputting a report indicating the presence, absence, or increased risk of said diabetic nephropathy in the subject determined in (b).
146. The method of claim 145, wherein the urine sample is a fresh urine sample, a frozen urine sample, or a preserved urine sample.
147. The method of claim 145, wherein (a) comprises reverse transcribing the RNA molecules to produce complementary deoxyribonucleic acid (cDNA) molecules, and sequencing at least a portion of the cDNA molecules to produce the sequencing reads.
148. The method of claim 145, wherein (a) comprises selectively enriching at least a portion of said cDNA molecules for a set of genomic loci associated with said diabetic nephropathy.
149. The method of claim 145, wherein (a) comprises amplifying at least a portion of the cDNA molecules.
150. The method of claim 145, wherein (a) comprises aligning at least a portion of the sequencing reads to a reference sequence.
151. The method of claim 145, wherein the reference sequence is at least a portion of a human reference genome.
152. The method of claim 145, wherein the diabetic nephropathy comprises early stage diabetic nephropathy, late stage diabetic nephropathy, end stage diabetic nephropathy, or asymptomatic diabetic nephropathy.
153. The method of claim 152, wherein the diabetic nephropathy is early stage diabetic nephropathy.
154. The method of claim 145, wherein the subject is free of diabetic nephropathy symptoms.
155. The method of claim 145, wherein the set of genes comprises at least one gene selected from the group consisting of the genes listed in table 3, the genes listed in table 4, the genes listed in table 5, and the genes listed in table 6.
156. The method of claim 145, wherein (b) comprises processing the dataset using a trained algorithm.
157. The method of claim 152, wherein the trained algorithm comprises a trained machine learning algorithm.
158. The method of claim 156, wherein said trained machine learning algorithm is selected from the group consisting of support vector machines (SVMs), Bayesian classification, linear regression, quantile regression, logistic regression, nonlinear regression, random forests, neural networks, ensemble learning methods, boosting algorithms, AdaBoost algorithms, recursive feature elimination (RFE) algorithms, and any combination thereof.
159. The method of claim 158, wherein the trained machine learning algorithm comprises the recursive feature elimination (RFE) algorithm.
160. The method of claim 157, wherein the trained machine learning algorithm is trained using a plurality of training samples comprising a first set of urine samples from diabetic nephropathy patients and a second set of urine samples from patients without kidney disease or damage, wherein the first set of urine samples and the second set of urine samples are different from the urine sample of the subject.
161. The method of claim 157, wherein the trained machine learning algorithm is trained using a plurality of training samples comprising a first set of urine samples from diabetic nephropathy patients and a second set of urine samples from patients with other types of kidney disease or damage, wherein the first set of urine samples and the second set of urine samples are different from the urine sample of the subject.
162. The method of claim 145, wherein (a) comprises comparing the one or more gene expression product levels to a reference value.
163. The method of claim 162, wherein the reference value corresponds to a set of gene expression products from a first set of urine samples from diabetic nephropathy subjects and/or a second set of urine samples from subjects without kidney disease or damage.
164. The method of claim 162, wherein the reference value corresponds to a set of gene expression products from a first set of urine samples from diabetic nephropathy subjects and/or a second set of urine samples from subjects with other types of kidney disease or damage.
165. The method of claim 145, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with an accuracy of at least about 80%.
166. The method of claim 145, further comprising detecting said presence of or said increased risk of said kidney disease or injury in said subject with a sensitivity of at least about 80%.
167. The method of claim 145, further comprising detecting said presence of or said increased risk of said kidney disease or injury in said subject with a sensitivity of at least about 90%.
168. The method of claim 145, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a specificity of at least about 80%.
169. The method of claim 145, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a specificity of at least about 90%.
170. The method of claim 145, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a positive predictive value of at least about 80%.
171. The method of claim 145, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject with a negative predictive value of at least about 80%.
172. The method of claim 145, further comprising detecting said presence or said increased risk of said kidney disease or injury in said subject at an area under the curve (AUC) of at least 0.80.
173. The method of claim 145, further comprising determining clinical intervention in the subject based at least in part on the presence or the increased risk of diabetic nephropathy determined in (b).
174. The method of claim 173, wherein the clinical intervention is selected from the group consisting of drug treatment, enhancing glycemic control, hypertension control, lowering high cholesterol, promoting bone health, diet control, lifestyle changes, weight loss, exercise, smoking cessation, controlling alcohol intake, reducing/stopping abuse of drugs, and avoiding non-steroidal anti-inflammatory drugs.
175. The method of claim 174, wherein the drug treatment relies on blockade of the renin-angiotensin-aldosterone system.
176. The method of claim 145, wherein (b) comprises analyzing a first set of genes for distinguishing diabetic nephropathy (DN) subjects from non-kidney disease (NEG) subjects, and a second set of genes for distinguishing diabetic nephropathy (DN) from a second kidney disease or injury.
177. The method of claim 176, wherein (b) comprises analyzing a first set of genes for differentially distinguishing diabetic nephropathy from non-renal disease (NEG) subjects and a second set of genes for differentially distinguishing diabetic nephropathy from other Chronic Kidney Disease (CKD) subjects.
178. The method of claim 177, wherein the first set of genes is selected from the genes listed in tables 3 and 5 and the second set of genes is selected from the genes listed in tables 4 and 6.
179. The method of claim 177, wherein (b) comprises generating a first diabetic nephropathy vs. diabetes negative control score based on the first set of genes and generating a second diabetic nephropathy vs. CKD score based on the second set of genes.
180. The method of claim 179, wherein a first diabetic nephropathy vs. diabetes negative control score greater than 0.5 indicates glomerular injury and less than 0.5 indicates tubular injury.
181. The method of claim 145, wherein (b) comprises analyzing different male-specific or female-specific gene sets according to the sex of the subject.
182. The method of claim 145, further comprising analyzing urine samples from the subject at two or more different time points to generate two or more data sets, and computer processing the two or more data sets to determine the presence, absence, or increased risk of the diabetic nephropathy in the subject.
183. The method of claim 145, further comprising analyzing urine samples from the subject at two or more different time points to generate two or more data sets, and computer processing the two or more data sets to determine a presence, an absence, or an increased risk of another type of kidney disease or injury in the subject.
184. The method of claim 145, further comprising determining whether another type of kidney disease or injury is present, absent, or at increased risk in the subject, and electronically outputting a report indicating the presence, absence, or increased risk of the other type of kidney disease or injury in the subject.
185. The method of any one of the preceding claims, further comprising removing from the non-kidney disease (NEG) subjects at least a subset of subjects exhibiting a predetermined condition, and then selectively adding additional samples from non-kidney disease subjects who do not exhibit the condition, to generate a modified set of diabetic negative control-X subjects, wherein X is the predetermined condition.
186. The method of claim 185, wherein the predetermined condition is selected from obesity, morbid obesity, nicotine dependence, alcohol dependence, drug abuse, kidney stones, severe hypertension, urinary tract infection, heart disease, hepatitis B, hepatitis C, AIDS, psoriasis, rheumatoid arthritis, and use of non-steroidal anti-inflammatory drugs.
187. The method of claim 185, wherein if the diabetic nephropathy vs. diabetic negative control-X score is substantially higher than the diabetic nephropathy vs. diabetic negative control score, then it is indicative that the subject suffers from the predetermined condition X causing kidney damage.
188. The method of any one of the preceding claims, further comprising removing at least one subset of other kidney disease (CKD) subjects who manifest a predetermined condition, and then selectively adding samples from additional other kidney disease subjects who do not manifest the condition, to generate a modified set of CKD-Y subjects, wherein Y is the predetermined condition.
189. The method of claim 188, wherein the predetermined condition is obesity, morbid obesity, nicotine dependence, alcohol dependence, drug abuse, kidney stones, severe hypertension, urinary tract infection, heart disease, hepatitis B, hepatitis C, AIDS, psoriasis, rheumatoid arthritis, use of non-steroidal anti-inflammatory drugs, IgA nephropathy, membranous nephropathy, minimal change disease, focal segmental glomerulosclerosis (FSGS), thin basement membrane nephropathy, amyloidosis, ANCA vasculitis associated with endocarditis and other infections, cardiorenal syndrome, IgG4 nephropathy, interstitial nephritis, lithium nephrotoxicity, lupus nephritis, multiple myeloma, polycystic kidney disease, pyelonephritis (kidney infection), renal artery stenosis, renal cyst, or rheumatoid arthritis-associated kidney disease.
190. The method of claim 188, wherein a diabetic nephropathy vs. CKD-Y score that is significantly higher than the diabetic nephropathy vs. CKD score indicates that the subject has the predetermined condition Y causing kidney damage.
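The cohort modification recited in claims 185 to 190 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The `Subject` record, the function names `build_modified_set` and `suggests_condition`, and the `margin` threshold are all hypothetical, and the claims do not define how much higher a score must be to count as "substantially" or "significantly" higher. The sketch also assumes the comparison scores are computed elsewhere and passed in as plain numbers.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Subject:
    """Hypothetical record for one reference subject."""
    subject_id: str
    group: str  # e.g. "NEG" (non-kidney disease) or "CKD" (other kidney disease)
    conditions: frozenset = field(default_factory=frozenset)


def build_modified_set(reference, condition, candidates):
    """Remove reference subjects that manifest `condition`, then refill the set
    with candidates from the same group(s) that do not manifest it."""
    kept = [s for s in reference if condition not in s.conditions]
    n_removed = len(reference) - len(kept)
    groups = {s.group for s in reference}
    kept_ids = {s.subject_id for s in kept}
    refill = [
        s for s in candidates
        if s.group in groups
        and condition not in s.conditions
        and s.subject_id not in kept_ids
    ]
    # Add back at most as many subjects as were removed (an assumption;
    # the claims only say additional samples are "selectively" added).
    return kept + refill[:n_removed]


def suggests_condition(score_vs_modified, score_vs_baseline, margin=0.1):
    """True when the score against the modified set exceeds the baseline score
    by more than `margin`, a hypothetical stand-in for "substantially higher"."""
    return (score_vs_modified - score_vs_baseline) > margin


# Example: build a NEG-X set with X = "obesity".
neg = [
    Subject("n1", "NEG", frozenset({"obesity"})),
    Subject("n2", "NEG"),
    Subject("n3", "NEG", frozenset({"psoriasis"})),
]
pool = [Subject("n4", "NEG"), Subject("n5", "NEG", frozenset({"obesity"}))]
neg_x = build_modified_set(neg, "obesity", pool)
print([s.subject_id for s in neg_x])
```

The same function would build a CKD-Y set when given CKD reference subjects and a condition Y. Capping the refill at the number of removed subjects keeps the modified set the same size as the original, which is a design choice of this sketch and not something the claims require.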
CN202180099114.4A 2021-05-19 2021-05-19 Methods and systems for detecting kidney disease or damage by gene expression analysis Pending CN117580962A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2021/033084 WO2022245342A1 (en) 2021-05-19 2021-05-19 Methods and systems for detection of kidney disease or disorder by gene expression analysis

Publications (1)

Publication Number Publication Date
CN117580962A true CN117580962A (en) 2024-02-20

Family

ID=84141579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180099114.4A Pending CN117580962A (en) 2021-05-19 2021-05-19 Methods and systems for detecting kidney disease or damage by gene expression analysis

Country Status (2)

Country Link
CN (1) CN117580962A (en)
WO (1) WO2022245342A1 (en)

Also Published As

Publication number Publication date
WO2022245342A1 (en) 2022-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination