WO2022261039A2 - Cancer detection method, kit, and system - Google Patents

Cancer detection method, kit, and system

Info

Publication number
WO2022261039A2
Authority
WO
WIPO (PCT)
Prior art keywords
mir
hsa
cancer
mirna
kit
Prior art date
Application number
PCT/US2022/032423
Other languages
English (en)
French (fr)
Other versions
WO2022261039A3 (en)
Inventor
Andrew Zhang
Hai HU
Original Assignee
Mironcol Diagnostics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mironcol Diagnostics, Inc.
Priority to CN202280041034.8A (CN117500941A, zh)
Priority to CA3221494A (CA3221494A1, en)
Priority to EP22820856.7A (EP4352266A2, en)
Priority to AU2022289858A (AU2022289858A1, en)
Publication of WO2022261039A2 (en)
Publication of WO2022261039A3 (en)

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/178 Oligonucleotides characterized by their use miRNA, siRNA or ncRNA

Definitions

  • the present invention relates generally to the technical field of disease screening, detection and diagnosis, and more specifically relates to a method, a kit, a system, and a non-transitory storage medium for the detection of one or multiple human cancers.
  • miRNAs are small single-stranded non-coding RNA molecules of an average of 22 nucleotides long encoded by their corresponding genes in the human genome.
  • the miRNAs function in negative post-transcriptional regulation of gene expression primarily by binding with complementary sequences in the 3’ untranslated region (3’ UTR) of mRNA molecules.
  • miRNAs appear to regulate more than 50% of human genes, and abnormal expression of miRNAs has been implicated in many human cancers. miRNAs are also abundant as extracellular circulating molecules released into circulation by tumor cells either through cell death or by exosome-mediated signaling. Given their remarkable stability in the blood and other body fluids, circulating cell-free miRNAs have the potential to serve as noninvasive biomarkers for cancer screening and diagnosis.
  • the present disclosure provides a multi-cancer detection approach (i.e. method, kit, and system) by means of an miRNA biomarker set consisting of at least one miRNA biomarker.
  • the approach is substantially based on the expression profile of the miRNA biomarker set, which can be determined from a biological sample obtained from a human subject.
  • a biological sample can notably be a liquid biopsy sample, including a blood sample, a serum sample, a plasma sample, a urine sample, a saliva sample, or a sputum sample, to thereby allow non-invasive or minimally invasive detection of the cancer.
  • the approach can be employed to accurately and reliably detect whether a human subject has one of the cancers including lung cancer, biliary tract cancer, bladder cancer, colorectal cancer, esophageal cancer, gastric cancer, glioma cancer, liver cancer, pancreatic cancer, prostate cancer, ovarian cancer, and sarcoma.
  • a method for detecting a cancer from a biological sample obtained from a subject substantially includes the following three steps (1)-(3): Step (1): determining an expression profile of an miRNA biomarker set consisting of at least one miRNA from the biological sample.
  • the miRNA biomarker set comprises hsa-miR-5100.
  • Step (2): calculating a diagnostic index of the biological sample based on the expression profile of the miRNA biomarker set.
  • Step (3): classifying the subject as having the cancer or not based on the value of the calculated diagnostic index. If the calculated diagnostic index is greater than or equal to a pre-determined threshold, the subject is classified as having the cancer; otherwise, the subject is classified as not having the cancer.
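The three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the excerpt does not give the formula for the diagnostic index, so a plain sum of expression levels is assumed here, and the function names, expression values, and threshold are hypothetical.

```python
from typing import Mapping, Sequence


def diagnostic_index(profile: Mapping[str, float], biomarkers: Sequence[str]) -> float:
    """Step (2), simplified: sum the expression levels of the biomarker set.

    Assumption: a plain (unweighted) sum; the source does not state the formula here.
    """
    return sum(profile[name] for name in biomarkers)


def classify(index: float, threshold: float) -> bool:
    """Step (3): True means the subject is classified as having the cancer."""
    return index >= threshold


# Step (1) is assumed to have produced this expression profile (made-up numbers).
biomarkers = ["hsa-miR-5100", "hsa-miR-1343-3p"]
profile = {"hsa-miR-5100": 620.0, "hsa-miR-1343-3p": 510.0}

index = diagnostic_index(profile, biomarkers)
has_cancer = classify(index, threshold=1110.0)  # illustrative threshold
print(index, has_cancer)
```

Note that the comparison is inclusive: an index exactly equal to the threshold is classified as positive, matching the "greater than or equal to" wording of Step (3).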
  • the method is capable of achieving diagnostic accuracy having an AUC value greater than approximately 0.780.
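For readers unfamiliar with the AUC figure quoted above, the following minimal sketch computes it directly from its rank-based definition: the probability that a randomly chosen cancer subject has a higher diagnostic index than a randomly chosen non-cancer subject, with ties counted as one half. The index values are invented for illustration and are not data from the disclosure.

```python
from typing import Sequence


def auc(case_indices: Sequence[float], control_indices: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    wins = 0.0
    for case in case_indices:
        for control in control_indices:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_indices) * len(control_indices))


# Hypothetical diagnostic index values.
cases = [1250.0, 1180.0, 1090.0]
controls = [900.0, 1100.0, 950.0, 1000.0]
value = auc(cases, controls)
print(round(value, 4))
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which the 0.780 figure should be read.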
  • the expression profile of an miRNA biomarker set is substantially a dataset containing expression level data that has been determined for each and every miRNA member contained in the miRNA biomarker set.
  • the term “pre-determined threshold” refers to a cut-point value of the diagnostic index that can be used to determine, with a given specificity/sensitivity, whether a subject has the cancer type or not. It is typically pre-determined based on an existing dataset comprising a range of diagnostic index values that have been obtained and calculated for an existing population of subjects known to have, and/or known to be free of, the disease. For example, in EXAMPLE 1 provided below, when the miRNA biomarker set consists of any one of the top 100 miRNAs (corresponding to SEQ ID NOS: 1-100), the AUC can reach a level that is greater than 0.780 (i.e.
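One way such a cut-point could be derived from an existing population is sketched below. It is an assumption-laden illustration, not the procedure of the disclosure: the cut-point is placed just above the k-th smallest index among subjects known not to have the disease, where k is the number of such subjects that must fall below the cut-point to reach the target specificity. All names and values are hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math
from typing import Sequence


def specificity(control_indices: Sequence[float], threshold: float) -> float:
    """Fraction of disease-free subjects whose index is below the threshold."""
    below = sum(1 for x in control_indices if x < threshold)
    return below / len(control_indices)


def threshold_for_specificity(control_indices: Sequence[float], target: float) -> float:
    """Return a cut-point just above the k-th smallest control index."""
    ordered = sorted(control_indices)
    # Number of controls that must fall below the cut-point (small tolerance for float error).
    k = max(1, math.ceil(target * len(ordered) - 1e-9))
    return math.nextafter(ordered[k - 1], math.inf)


# Hypothetical diagnostic index values from subjects known not to have the disease.
controls = [700.0, 820.0, 760.0, 905.0, 880.0, 940.0, 1010.0, 990.0, 1105.0, 1230.0]
cut = threshold_for_specificity(controls, target=0.90)
achieved = specificity(controls, cut)
print(cut, achieved)
```

In practice the resulting sensitivity on the known-positive subjects would be checked at the same cut-point, since the two quantities trade off against each other.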
  • hsa-miR-1343-3p, hsa-miR-1290, hsa-miR-4787-3p, hsa-miR-6877-5p, hsa-miR-17-3p, hsa-miR-6765-5p, hsa-miR-1268b, hsa-miR-4258, hsa-miR-451a, hsa-miR-1228-5p, hsa-miR-8073, hsa-miR-4454, hsa-miR-187-5p, hsa-miR-4286, hsa-miR-6746-5p, hsa-miR-663b, hsa-miR-6075, hsa-miR-5001-5p, hsa-miR-6789-5p, hsa-miR-4513, hs
  • the miRNA biomarker set further comprises, in addition to hsa-miR-5100, one or more of the other top 50 miRNAs listed in Table 1, i.e. hsa-miR-1343-3p, hsa-miR-1290, hsa-miR-4787-3p, hsa-miR-6877-5p, hsa-miR-17-3p, hsa-miR-6765-5p, hsa-miR-1268b, hsa-miR-4258, hsa-miR-451a, hsa-miR-1228-5p, hsa-miR-8073, hsa-miR-4454, hsa-miR-187-5p, hsa-miR-4286, hsa-miR-6746-5p, hsa-m
  • the miRNA biomarker set further comprises, in addition to hsa-miR-5100, one or more of the other top 20 miRNAs listed in Table 1, i.e. hsa-miR-1343-3p, hsa-miR-1290, hsa-miR-4787-3p, hsa-miR-6877-5p, hsa-miR-17-3p, hsa-miR-6765-5p, hsa-miR-1268b, hsa-miR-4258, hsa-miR-451a, hsa-miR-1228-5p, hsa-miR-8073, hsa-miR-4454, hsa-miR-187-5p, hsa-miR-4286, hsa-miR-6746-5p, hs
  • the miRNA biomarker set further comprises, in addition to hsa-miR-5100, one or more of the other top 4 miRNAs listed in Table 1, i.e. hsa-miR-1343-3p, hsa-miR-1290, and hsa-miR-4787-3p, which are ranked based on the adjusted P value and correspond to SEQ ID NOS: 2-4 respectively.
  • the miRNA biomarker set consists of the top 4 miRNAs listed in Table 1, i.e.
  • the method can optionally be further configured to be capable of achieving diagnostic accuracy having a higher AUC value.
  • the method is configured to be capable of achieving diagnostic accuracy having an AUC value greater than approximately 0.850.
  • the cancer that can be detected can be selected from a group consisting of lung cancer, biliary tract cancer, bladder cancer, colorectal cancer, esophageal cancer, gastric cancer, glioma cancer, liver cancer, pancreatic cancer, prostate cancer, ovarian cancer, and sarcoma.
  • the method is configured to be capable of achieving diagnostic accuracy having an AUC value greater than approximately 0.950.
  • the cancer that can be detected can be selected from a group consisting of lung cancer, biliary tract cancer, bladder cancer, colorectal cancer, esophageal cancer, gastric cancer, glioma cancer, liver cancer, ovarian cancer, pancreatic cancer, and prostate cancer.
  • the method is configured to be capable of achieving diagnostic accuracy having an AUC value greater than approximately 0.990.
  • the cancer that can be detected can be selected from a group consisting of lung cancer, biliary tract cancer, bladder cancer, esophageal cancer, gastric cancer, glioma cancer, and prostate cancer.
  • the method is configured to be capable of achieving a diagnostic accuracy having an AUC value greater than approximately 0.999.
  • the cancer that can be detected can be lung cancer or gastric cancer.
  • the method can optionally be configured to be capable of achieving diagnostic accuracy having different sensitivity and specificity levels.
  • the method is configured to be capable of achieving diagnostic accuracy having a sensitivity greater than approximately 68.0% while having a specificity greater than approximately 99.0%.
  • the cancer that can be detected can be selected from a group consisting of lung cancer, biliary tract cancer, bladder cancer, colorectal cancer, esophageal cancer, gastric cancer, glioma cancer, liver cancer, pancreatic cancer, prostate cancer, ovarian cancer, and sarcoma.
  • the method is configured to be capable of achieving diagnostic accuracy having a sensitivity greater than approximately 83.0% while having a specificity greater than approximately 99.0%.
  • the cancer that can be detected can be selected from a group consisting of lung cancer, biliary tract cancer, bladder cancer, colorectal cancer, esophageal cancer, gastric cancer, glioma cancer, liver cancer, pancreatic cancer, and prostate cancer.
  • the method is configured to be capable of achieving diagnostic accuracy having a sensitivity greater than approximately 99.0% and having a specificity greater than approximately 99.0%.
  • the cancer that can be detected can be lung cancer or gastric cancer.
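The sensitivity and specificity figures quoted above follow from counting how many known-positive and known-negative subjects fall on each side of the threshold. The sketch below shows that bookkeeping with invented index values; the function name and data are hypothetical.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    case_indices: Sequence[float],
    control_indices: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Sensitivity and specificity for the rule 'index >= threshold means cancer'."""
    true_positives = sum(1 for x in case_indices if x >= threshold)
    true_negatives = sum(1 for x in control_indices if x < threshold)
    return true_positives / len(case_indices), true_negatives / len(control_indices)


# Hypothetical diagnostic index values.
cases = [1250.0, 1180.0, 1090.0, 1300.0]
controls = [900.0, 1100.0, 950.0, 1000.0, 1210.0]
sens, spec = sensitivity_specificity(cases, controls, threshold=1200.0)
print(sens, spec)
```

Raising the threshold generally increases specificity at the cost of sensitivity, which is why the text states the two values together.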
  • the diagnostic index is calculated via a weighted model using weights from one model selected from a group consisting of a Linear Models for Microarray Data (limma) model, a logistic regression model, a linear discriminant analysis (LDA) model, a conditional logistic regression model, a lasso regression model, a ridge regression model, a random forest, a support vector machine, and a probit regression model. Further optionally, the diagnostic index is calculated via a weighted model using weights from the limma model.
  • the terms “unweighted model” and “weighted model” are to be understood within the common definition as well appreciated by people skilled in the art.
  • the term “unweighted model” refers to a situation where no weight is applied to any miRNA in the miRNA biomarker set when calculating the diagnostic index.
  • the term “weighted model” refers to a situation where a corresponding weight is applied to each miRNA in the miRNA biomarker set when calculating the diagnostic index.
  • the phrase “the diagnostic index is calculated via a weighted model” can be understood such that, across the miRNAs in the miRNA biomarker set, not all of the weights are equal (i.e. there are at least two miRNAs which have different weights).
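The distinction between the two models can be made concrete with a small sketch. It assumes, purely for illustration, that the diagnostic index is a linear combination of expression levels; the excerpt does not give the actual formula, and the miRNA labels, expression levels, and weights below are hypothetical.

```python
from typing import Mapping, Optional


def diagnostic_index(
    expression: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Unweighted: plain sum. Weighted: sum of weight * expression level (assumed form)."""
    if weights is None:
        return sum(expression.values())
    return sum(weights[name] * level for name, level in expression.items())


def is_weighted(weights: Mapping[str, float]) -> bool:
    """A model counts as weighted when at least two miRNAs have different weights."""
    return len(set(weights.values())) > 1


expression = {"miRNA_a": 8.0, "miRNA_b": 6.5, "miRNA_c": 7.0}
weights = {"miRNA_a": 2.0, "miRNA_b": 1.0, "miRNA_c": 0.5}

unweighted = diagnostic_index(expression)
weighted = diagnostic_index(expression, weights)
print(unweighted, weighted, is_weighted(weights))
```

In a real analysis the weights would come from a fitted model such as limma or logistic regression rather than being chosen by hand.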
  • each of the “Linear Models for Microarray Data (limma) model” (Ritchie et al. 2015), “logistic regression model” (Venables and Ripley 2002), “linear discriminant analysis (LDA) model” (Venables and Ripley 2002), “conditional logistic regression model” (Venables and Ripley 2002), “lasso regression model” (Tibshirani 1996), “ridge regression model” (Hoerl and Kennard 1970), “random forest” (Ripley 1996), “support vector machine” (Ripley 1996), and “probit regression model” (Venables and Ripley 2002) is substantially a probability-modeling statistical model that abides by the definition commonly appreciated by people skilled in the field, the details of which can be found in the reference cited immediately after each term.
  • step (3) comprises: classifying the subject as having the cancer if the normalized diagnostic index is equal to or greater than a preset cut-point; or classifying the subject as not having the cancer if otherwise.
  • the normalized diagnostic index is calculated based on formula (II):
  • the param_location and param_scale are respectively a location parameter and a scale parameter configured to allow the normalized diagnostic index to be within a range no less than a first preset value and no greater than a second preset value.
  • the param_location is substantially a location parameter configured to shift the minimum of the normalized diagnostic index to the first preset value
  • param_scale is substantially a scale parameter configured to scale the maximum of the normalized diagnostic index to the second preset value.
  • first preset value and the second preset value are respectively the minimum and maximum in the range of normalized diagnostic index values that have been obtained and calculated from an existing population of subjects known to have and known not to have the cancer, with outliers excluded.
  • multiple settings can be applied.
  • the diagnostic index values are determined to have a range of 600 to 1600 excluding outliers (see)
  • the param_location and param_scale can be respectively set to 600 and 100 so that the final normalized diagnostic index is no less than 0 and no greater than 10. It is noted that this normalization scheme was employed in EXAMPLE 1 below.
  • the param_location and param_scale can be respectively set to 600 and 1000, so that the final normalized diagnostic index is no less than 0 and no greater than 1. Further alternatively, the param_location and param_scale can be respectively set to 600 and 10, so that the final normalized diagnostic index is no less than 0 and no greater than 100. Further alternatively, the param_location and param_scale can be respectively set to 350 and 250, so that the final normalized diagnostic index is no less than 1 and no greater than 5.
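The four parameter settings above are all consistent with a simple shift-and-scale normalization, (diagnostic index - location) / scale. Formula (II) itself is not reproduced in this excerpt, so that form is an inference from the stated settings and ranges, not a quotation. The sketch below checks each setting against the stated 600 to 1600 range of raw values; the function and variable names are hypothetical.

```python
def normalize(index: float, location: float, scale: float) -> float:
    """Shift-and-scale normalization (form inferred from the stated settings)."""
    return (index - location) / scale


RAW_MIN, RAW_MAX = 600.0, 1600.0  # raw diagnostic index range stated in the text

# (location, scale) -> expected (first preset value, second preset value)
settings = {
    (600.0, 100.0): (0.0, 10.0),
    (600.0, 1000.0): (0.0, 1.0),
    (600.0, 10.0): (0.0, 100.0),
    (350.0, 250.0): (1.0, 5.0),
}

for (location, scale), (low, high) in settings.items():
    observed = (normalize(RAW_MIN, location, scale), normalize(RAW_MAX, location, scale))
    print(location, scale, observed, observed == (low, high))
```

Every setting maps the raw endpoints exactly onto the stated normalized range, which supports the inferred form.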
  • the pre-set cut-point can optionally be set as 5.1 to thereby allow the method to have a specificity that is greater than approximately 0.95, or optionally can be set as 6.0 to thereby allow the method to have a specificity that is greater than approximately 0.99.
  • the biological sample is a liquid biopsy sample selected from a group consisting of a blood sample, a serum sample, a plasma sample, a urine sample (Yun et al. 2012), a saliva sample (Park et al. 2009), and a sputum sample.
  • the expression profile of the miRNA biomarker set in step (1) of determining an expression profile of an miRNA biomarker set consisting of at least one miRNA from the biological sample, can optionally be obtained by means of Northern Blotting, microarray analysis, RNA-sequencing, or RNA in-situ hybridization, or can optionally be obtained by means of a nucleic acid amplification procedure, comprising reverse-transcription PCR (RT-PCR), quantitative RT-PCR (qRT-PCR), or digital RT-PCR.
  • each of the above miRNA detection approaches is to be understood within the common definition well appreciated by people of ordinary skill in the field. More details for implementing these approaches to determine the expression profile of the miRNA biomarker set will be provided below.
  • the phrase “diagnosis of the cancer” refers to the detection of the cancer in a subject previously known not to have the cancer, whereas the phrase “recurrence of the cancer” refers to the detection of the cancer again in a subject who was previously treated to remove the cancer and became cancer-free.
  • the present disclosure further provides a kit for detecting a cancer from a biological sample obtained from a subject, which is substantially employed for implementing the method described in the first aspect.
  • the term “kit” refers to a collection of articles and/or instructions.
  • An article included in the kit can be a physical entity or a component thereof.
  • articles that can be included in the kit as disclosed herein can include one or more nucleic acids (e.g. polynucleotides), or one or more devices, apparatuses, or pieces of equipment (e.g. a molecular array or microarray that comprises the one or more nucleic acids).
  • An instruction included in the kit can be a description of the specific steps to be performed (e.g. a manual), which can be printed on a physical medium (e.g. paper, card, etc.), stored on a computer-readable storage medium (e.g. hard disc, compact disc or CD, flash drive, etc.), or even stored on the internet (e.g. in an accessible cloud space), etc.
  • Component (1) at least one nucleic acid, each capable of specifically recognizing each miRNA in an miRNA biomarker set to thereby allow an expression profile of the miRNA biomarker set to be obtained from the biological sample.
  • the miRNA biomarker set comprises hsa-miR-5100 (SEQ ID NO: 1).
  • Component (2) at least one instruction, comprising a first instruction and a second instruction.
  • the second instruction is configured for classifying the subject as having the cancer or not, wherein the subject is classified as having the cancer if the calculated diagnostic index is greater than or equal to a pre-determined threshold or as not having the cancer if otherwise.
  • the at least one nucleic acid can optionally comprise a polynucleotide capable of specifically hybridizing under a stringent condition to: either (a) a polynucleotide comprising or consisting of a nucleotide sequence of SEQ ID NO: 1, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consisting of a nucleotide sequence complementary to a nucleotide sequence of SEQ ID NO: 1, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides.
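The "at least 80% sequence identity" criterion can be illustrated with a deliberately simplified calculation. The sketch below compares two equal-length sequences position by position; real identity calculations normally rely on a sequence alignment that allows gaps, which is not attempted here. The sequences are made up and are not taken from the sequence listing.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Position-by-position identity for equal-length sequences (no gaps, a simplification)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this simplified sketch requires sequences of equal length")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Hypothetical 10-nucleotide sequences differing at the last two positions.
reference = "ACGUACGUAC"
candidate = "ACGUACGUGG"
identity = percent_identity(reference, candidate)
print(identity, identity >= 80.0)
```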
  • the miRNA biomarker set further comprises, in addition to hsa-miR-5100, one or more of the other 99 miRNAs listed in Table 1.
  • the at least one nucleic acid can optionally further comprise at least one polynucleotide, each capable of specifically hybridizing under a stringent condition to: either (a) a polynucleotide comprising or consisting of a nucleotide sequence of any one of SEQ ID NOS: 2-100, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consisting of a nucleotide sequence complementary to a nucleotide sequence of any one of SEQ ID NOS: 2-100, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucle
  • the miRNA biomarker set further comprises, in addition to hsa-miR-5100, one or more of the other top 50 miRNAs listed in Table 1.
  • the at least one nucleic acid can optionally further comprise at least one polynucleotide, each capable of specifically hybridizing under a stringent condition to: either (a) a polynucleotide comprising or consisting of a nucleotide sequence of any one of SEQ ID NOS: 2-50, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consisting of a nucleotide sequence complementary to a nucleotide sequence of any one of SEQ ID NOS: 2-50, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consist
  • the miRNA biomarker set further comprises, in addition to hsa-miR-5100, one or more of the other top 20 miRNAs listed in Table 1.
  • the at least one nucleic acid can optionally further comprise at least one polynucleotide, each capable of specifically hybridizing under a stringent condition to: either (a) a polynucleotide comprising or consisting of a nucleotide sequence of any one of SEQ ID NOS: 2-20, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consisting of a nucleotide sequence complementary to a nucleotide sequence of any one of SEQ ID NOS: 2-20, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consist
  • the miRNA biomarker set consists of the top 20 miRNAs in Table 1, and correspondingly, in component (1) of the kit, the at least one nucleic acid consists of a total of 20 polynucleotides which are respectively capable of specifically hybridizing under a stringent condition to: either (a) polynucleotides respectively comprising or consisting of nucleotide sequences of SEQ ID NOS: 1-20, derivatives thereof, variants thereof each having at least 80% sequence identity, or fragments thereof each comprising 15 or more consecutive nucleotides; or (b) polynucleotides respectively comprising or consisting of nucleotide sequences which are respectively complementary to nucleotide sequences of SEQ ID NOS: 1-20, derivatives thereof, variants thereof each having at least 80% sequence identity, or fragments thereof each comprising 15 or more consecutive nucleotides.
  • the miRNA biomarker set further comprises, in addition to hsa-miR-5100, one or more of the other top 4 miRNAs listed in Table 1.
  • the at least one nucleic acid can optionally further comprise at least one polynucleotide, each capable of specifically hybridizing under a stringent condition to: either (a) a polynucleotide comprising or consisting of a nucleotide sequence of any one of SEQ ID NOS: 2-4, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consisting of a nucleotide sequence complementary to a nucleotide sequence of any one of SEQ ID NOS: 2-4, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consist
  • the miRNA biomarker set consists of the top 4 miRNAs in Table 1, i.e. hsa-miR-5100, hsa-miR-1343-3p, hsa-miR-1290, and hsa-miR-4787-3p, and correspondingly, in component (1) of the kit, the at least one nucleic acid consists of a total of 4 polynucleotides which are respectively capable of specifically hybridizing under a stringent condition to: either (a) polynucleotides respectively comprising or consisting of nucleotide sequences of SEQ ID NOS: 1-4, derivatives thereof, variants thereof each having at least 80% sequence identity, or fragments thereof each comprising 15 or more consecutive nucleotides; or (b) polynucleotides respectively comprising or consisting of nucleotide sequences which are respectively complementary to nucleotide sequences of SEQ ID NOS: 1-4, derivatives thereof, variant
  • the diagnostic index in the first sub-instruction of the first instruction in component (2), can be calculated via an unweighted model, or alternatively via a weighted model using weights from one of the probability-modeling statistical models that have been provided above in the first aspect.
  • the diagnostic index is calculated via a weighted model using weights from the limma model.
  • the pre-determined threshold can be set as 1110, and the second instruction further indicates that the classification using 1110 as the pre-determined threshold has a specificity > 0.95.
  • the pre-determined threshold can be set as 1200, and the second instruction further indicates that such classification using 1200 as the pre-determined threshold has a specificity > 0.99.
  • the first instruction further comprises a second sub-instruction for obtaining a normalized diagnostic index based on the diagnostic index calculated according to the first sub-instruction, and in the second instruction, the subject is classified as having the cancer if the normalized diagnostic index is greater than or equal to a preset cut-point or as not having the cancer if otherwise.
  • the normalization process is substantially identical to the normalization process described above in the first (method) aspect, and its description is not repeated here.
  • the normalized diagnostic index is calculated via a weighted model using weights from the limma model, and the first preset value is 0, and the second preset value is 10.
  • the preset cut-point can be set optionally as 5.1 or 6.0, to thereby allow the classification using the preset cut-point to have a specificity that is > 0.95 or > 0.99, respectively.
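The raw thresholds (1110 and 1200) and the normalized cut-points (5.1 and 6.0) quoted in this section appear to describe the same decision boundaries on two scales. The sketch below checks this under the assumption that normalization is (diagnostic index - 600) / 100, a form inferred from the parameter settings stated for the method rather than quoted from formula (II). The names are hypothetical.

```python
import math

PARAM_LOCATION = 600.0  # location parameter stated for the 0-to-10 scale
PARAM_SCALE = 100.0     # scale parameter stated for the 0-to-10 scale


def normalize(index: float) -> float:
    """Assumed shift-and-scale normalization onto the 0-to-10 scale."""
    return (index - PARAM_LOCATION) / PARAM_SCALE


# (raw pre-determined threshold, normalized preset cut-point) pairs from the text.
pairs = [(1110.0, 5.1), (1200.0, 6.0)]
normalized = [normalize(raw) for raw, _ in pairs]
consistent = all(math.isclose(normalize(raw), cut) for raw, cut in pairs)
print(normalized, consistent)
```

Under that assumption, classifying on the raw index with 1110 or 1200 and classifying on the normalized index with 5.1 or 6.0 give the same result for any subject.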
  • the at least one instruction in component (2) in the kit may further comprise a third instruction for performing an evaluation of the subject, wherein said evaluation comprises a diagnosis of the cancer or a detection of a recurrence of the cancer; or may further comprise a fourth instruction for administering to the subject a therapeutic regimen when the subject is classified as having the cancer.
  • the at least one instruction in component (2) in the kit may further comprise a first additional instruction for obtaining the expression profile of the miRNA biomarker set, comprising a procedure for performing Northern Blotting, microarray analysis, RNA-sequencing, or RNA in-situ hybridization by means of the at least one nucleic acid.
  • the at least one nucleic acid may optionally be arranged on a molecular array.
  • the kit may further comprise at least one set of amplification primers, each set capable of specifically amplifying each of the at least one miRNA in the miRNA biomarker set from the biological sample.
  • the at least one instruction in component (2) in the kit may further comprise a second additional instruction for obtaining the expression profile of the miRNA biomarker set, comprising a procedure for performing reverse-transcription PCR (RT-PCR), quantitative RT-PCR (qRT-PCR), or digital RT-PCR by means of the at least one nucleic acid and the at least one set of amplification primers.
  • the biological sample can be a liquid biopsy sample selected from a group consisting of a blood sample, a serum sample, a plasma sample, a urine sample, a saliva sample, and a sputum sample.
  • the present disclosure further provides a system for detecting a cancer in a subject.
  • the system is substantially a computerized system comprising a collection of hardware (e.g. processor, memory, I/O interface, storage medium, etc.) and software (i.e. computer programs, including operating system software, specific program software, etc.), which are configured to work collaboratively so as to collectively implement all or some steps of the method as described above in the first aspect.
  • the system comprises a processor and a non-transitory storage medium.
  • the non-transitory storage medium is configured to contain software (i.e. program instructions) for execution by the processor, and the program instructions are configured to cause the processor to execute the various steps of the method according to the various different embodiments of the method that are described above in the first aspect.
  • the present disclosure further provides a non-transitory storage medium, configured to store computer-executable program instructions which, when executed by a processor, cause the processor to execute the method according to the various different embodiments of the method that are described above in the first aspect.
  • a “subject” means a mammal such as a primate including a human and a chimpanzee, a pet animal including a dog and a cat, a livestock animal including cattle, a horse, sheep, and a goat, and a rodent including a mouse and a rat.
  • the term “healthy subject” also means such a mammal without the cancer to be detected. It is to be noted that the whole disclosure concerns more specifically human subjects, but can optionally be applied to other non-human mammals as well.
  • as used herein, the term “polynucleotide” is interchangeable with “nucleic acid”, and refers to a nucleic acid including all of RNA, DNA, and RNA/DNA (chimera).
  • the DNA includes all of cDNA, genomic DNA, and synthetic DNA.
  • the RNA includes all of total RNA, mRNA, rRNA, miRNA, siRNA, snoRNA, snRNA, non-coding RNA and synthetic RNA.
  • fragment is a polynucleotide having a nucleotide sequence having a consecutive portion of a polynucleotide and desirably has a length of 15 or more nucleotides, e.g. 15, 16, 17, 18, 19, etc. nucleotides.
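To make the fragment definition concrete, the sketch below enumerates every run of 15 or more consecutive nucleotides in a sequence. The sequence is a made-up 17-nucleotide string, not one from the sequence listing, and treating the full-length sequence as one of its own fragments is an assumption of this sketch.

```python
from typing import List


def fragments(sequence: str, min_length: int = 15) -> List[str]:
    """All contiguous sub-sequences with at least `min_length` nucleotides."""
    total = len(sequence)
    return [
        sequence[start:start + length]
        for length in range(min_length, total + 1)
        for start in range(0, total - length + 1)
    ]


# Hypothetical 17-nucleotide RNA sequence.
sequence = "ACGUACGUACGUACGUA"
found = fragments(sequence)
print(len(found), found[0], found[-1])
```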
  • gene is intended to include not only RNA and double-stranded DNA but also each single-stranded DNA such as a plus strand (or a sense strand) or a complementary strand (or an antisense strand) constituting the duplex. The gene is not particularly limited by its length.
  • nucleic acid encoding a congener, a variant, or a derivative
  • a “nucleic acid” having a nucleotide sequence hybridizing under stringent conditions described later to a complementary sequence of a nucleotide sequence represented by any of SEQ ID NOs: 1 to 100 or a nucleotide sequence derived from the nucleotide sequence by the replacement of the nucleotide “U” (or “u”) with the nucleotide “T” (or “t”).
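The two sequence operations mentioned here, replacing U with T and taking a complementary sequence, can be sketched as follows. The input is a short made-up RNA string rather than any sequence from the listing, and the complement is returned in reverse orientation, as is conventional when writing the complementary strand 5' to 3'.

```python
def rna_to_dna(sequence: str) -> str:
    """Replace U with T (and u with t), leaving other characters unchanged."""
    return sequence.translate(str.maketrans("Uu", "Tt"))


def reverse_complement(dna: str) -> str:
    """Complement each base (A<->T, C<->G) and reverse the result."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return dna.translate(table)[::-1]


# Hypothetical RNA sequence for illustration only.
rna = "UUCAGAUC"
dna = rna_to_dna(rna)
complementary = reverse_complement(dna)
print(dna, complementary)
```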
  • the “gene” is not particularly limited by its functional region and can contain, for example, an expression control region, a coding region, an exon, or an intron.
  • the “gene” may be contained in a cell or may exist alone after being released into the outside of a cell. Alternatively, the “gene” may be in a state enclosed in a vesicle called exosome.
  • microRNA is intended to mean a 15- to 25-nucleotide non-coding RNA that is transcribed as an RNA precursor having a hairpin-like structure, cleaved by a dsRNA-cleaving enzyme which has RNase III cleavage activity, integrated into a protein complex called RISC, and involved in the suppression of translation of mRNA, unless otherwise specified.
  • miRNA as used herein includes not only a “miRNA” represented by a particular nucleotide sequence (or SEQ ID NO) but a precursor of the “miRNA” (pre-miRNA or pri-miRNA), and miRNAs having biological functions equivalent thereto, for example, a congener (i.e., a homolog or an ortholog), a variant (e.g., a genetic polymorph), and a derivative.
  • Such a precursor, a congener, a variant, or a derivative can be specifically identified using miRBase Release 20 (Kozomara and Griffiths-Jones, 2010), and examples thereof can include an “miRNA” having a nucleotide sequence hybridizing under stringent conditions described later to a complementary sequence of any particular nucleotide sequence represented by any of SEQ ID NOS: 1 to 100.
  • miRNA as used herein may be a gene product of a miRNA gene.
  • Such a gene product includes a mature miRNA (e.g., a 15- to 25-nucleotide or 19- to 25-nucleotide non-coding RNA involved in the suppression of translation of mRNA as described above) or a miRNA precursor (e.g., pre-miRNA or pri-miRNA).
  • the term “probe” includes a polynucleotide that is used for specifically detecting an RNA resulting from the expression of a gene or a polynucleotide derived from the RNA, and/or a polynucleotide complementary thereto.
  • primer includes a polynucleotide that specifically recognizes and amplifies an RNA resulting from the expression of a gene or a polynucleotide derived from the RNA, and/or a polynucleotide complementary thereto.
  • the term “variant” means, in the case of a nucleic acid, a natural variant attributed to polymorphism, mutation, or the like; a variant containing the deletion, substitution, addition, or insertion of 1, 2, or 3 or more nucleotides in a nucleotide sequence represented by any of SEQ ID NOs: 1 to 100 or a nucleotide sequence derived from the nucleotide sequence by the replacement of the nucleotide "U” (or “u") with the nucleotide "T” (or “t"), or a partial sequence thereof; a variant containing the deletion, substitution, addition, or insertion of 1 or 2 or more nucleotides in a nucleotide sequence of a premature miRNA of a sequence represented by any of SEQ ID NOs: 1 to 100 or a nucleotide sequence derived from the nucleotide sequence by the replacement of the nucleotide "U” (or “u”) with the nucleo
  • derivative is meant to include a modified nucleic acid, for example, a derivative labeled with a fluorophore or the like, a derivative containing a modified nucleotide (e.g., a nucleotide containing a group such as halogen, alkyl such as methyl, alkoxy such as methoxy, thio, or carboxymethyl, and a nucleotide that has undergone base rearrangement, double bond saturation, deamination, replacement of an oxygen molecule with a sulfur atom, etc.), PNA (peptide nucleic acid; Nielsen et al. 1991), and LNA (locked nucleic acid; Obika et al. 1998) without any limitation.
  • the “nucleic acid” capable of specifically binding to a polynucleotide selected from the miRNAs described above is a synthesized or prepared nucleic acid and specifically includes a “nucleic acid probe” or a “primer”.
  • the “nucleic acid” is utilized directly or indirectly for detecting the presence or absence of cancer in a subject, for diagnosing the severity, the degree of amelioration, or the therapeutic sensitivity of cancer, or for screening for a candidate substance useful in the prevention, amelioration, or treatment of cancer.
  • nucleic acid includes a nucleotide, an oligonucleotide, and a polynucleotide capable of specifically recognizing and binding to a transcript represented by any of SEQ ID NOs: 1 to 100, or a synthetic cDNA nucleic acid thereof in vivo, particularly, in a sample such as a body fluid (e.g., blood or urine), in relation to the development of cancer.
  • the nucleotide, the oligonucleotide, and the polynucleotide can be effectively used as probes for detecting the aforementioned gene expressed in vivo, in tissues, in cells, or the like on the basis of the properties described above, or as primers for amplifying the aforementioned gene expressed in vivo.
  • As used within the scope of the disclosure, each of the terms “E-value”, “accuracy”, “AUC”, “sensitivity”, and “specificity” is generally to be understood to have the common definition that is well appreciated by people skilled in the art, and is specifically defined as follows:
  • “P-value” or “P” is considered to be interchangeable with “p-value” or “p”, and refers to the probability, under the null hypothesis, of observing a statistic at least as extreme as the one actually calculated from the data in a statistical test. Thus, a smaller “P” or “P-value” indicates a more significant difference between the subjects being compared.
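  • The performance terms named above can be illustrated with a minimal Python sketch. It assumes the common textbook definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), accuracy = correct calls over all calls, and AUC as the probability that a cancer sample scores higher than a non-cancer sample); the function names are hypothetical and are not taken from the disclosure.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for binary labels, 1 = cancer and 0 = non-cancer."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true cancer samples that are called cancer."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn)


def specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of true non-cancer samples that are called non-cancer."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of all samples that are called correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: chance that a cancer sample outscores a control (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

  • The pairwise AUC above is quadratic in the number of samples, which is acceptable for illustration but not for large datasets.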
  • determination of the expression profile of the miRNA biomarker set substantially includes the determination of the expression level of each and every miRNA contained in the miRNA biomarker set.
  • expression levels for all of the miRNAs contained in the miRNA biomarker set can be determined simultaneously in one single, well-controlled experiment. Optionally, the expression levels of these miRNAs can be determined in more than one experiment and by different experimental procedures.
  • measuring or detecting the expression of any of the miRNAs contained in the miRNA biomarker set comprises measuring or detecting any nucleic acid transcript corresponding to the miRNA.
  • expression can be detected or measured on the basis of miRNA or corresponding reverse transcribed cDNA levels.
  • Any quantitative or qualitative method for measuring RNA levels, or cDNA levels can be used. Suitable methods of detecting or measuring miRNA or cDNA levels include, for example, Northern Blotting, microarray analysis, RNA-sequencing, RNA in-situ hybridization, or a nucleic acid amplification procedure, such as reverse-transcription PCR (RT-PCR) or real-time RT-PCR, also known as quantitative RT-PCR (qRT-PCR), or digital RT-PCR. Such methods are well known in the art (see e.g., Green and Sambrook et al. 2012). Other techniques include digital, multiplexed analysis of gene expression, such as the nCounter® (NanoString Technologies, Seattle, WA) gene expression assays, which are further described in US20100112710 and US20100047924.
  • Detecting a nucleic acid of interest generally involves hybridization between a target (e.g. miRNA or cDNA) and a probe. Sequences of the miRNAs used in various cancer gene expression profiles are known. Therefore, one of skill in the art can readily design hybridization probes for detecting those miRNAs (see e.g., Green and Sambrook et al. 2012). For example, polynucleotide probes that specifically bind to the miRNA transcripts described herein (or cDNA synthesized therefrom) can be created using the nucleic acid sequences of the miRNA or cDNA targets themselves by routine techniques (e.g., PCR or synthesis).
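  • As a rough illustration of deriving a probe from the target sequence itself, the sketch below builds the antisense DNA sequence (reverse complement, with U replaced by T) for an RNA target. The function names and the example sequence are hypothetical placeholders and do not correspond to any SEQ ID NO of the disclosure; real probe design would also consider length, specificity, and modifications.

```python
# Watson-Crick complement for DNA bases.
_DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def rna_to_dna(seq: str) -> str:
    """Replace U (or u) with T and upper-case the sequence."""
    return seq.upper().replace("U", "T")


def antisense_dna_probe(mirna_seq: str) -> str:
    """Return the DNA reverse complement of an RNA target, written 5' to 3'."""
    dna = rna_to_dna(mirna_seq)
    unknown = set(dna) - set(_DNA_COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected characters in sequence: {sorted(unknown)}")
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(dna))


if __name__ == "__main__":
    # Made-up 6-nt target used only to show the transformation.
    print(antisense_dna_probe("AUGCUU"))
```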
  • the term “probe” means a part or portion of a polynucleotide sequence comprising about 10 or more contiguous nucleotides, about 15 or more contiguous nucleotides, or about 20 or more contiguous nucleotides.
  • the polynucleotide probes will comprise 10 or more nucleic acids, 15 or more nucleic acids, or 20 or more nucleic acids.
  • Stringency of hybridization reactions is readily determinable by one of ordinary skill in the art, and generally is an empirical calculation dependent upon probe length, washing temperature, and salt concentration. In general, longer probes may require higher temperatures for proper annealing, while shorter probes may require lower temperatures.
  • Hybridization generally depends on the ability of denatured nucleic acid sequences to reanneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and hybridizable sequence, the higher the relative temperature that can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions more stringent, while lower temperatures less so.
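  • The dependence of annealing temperature on probe length and composition can be illustrated with the Wallace rule of thumb, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C). This rule is not part of the disclosure; it is a commonly quoted approximation for short oligonucleotides (roughly 14 to 20 nt) and ignores salt and formamide concentration, so the sketch below is only a first estimate. The function name is hypothetical.

```python
def wallace_tm(seq: str) -> float:
    """Estimate melting temperature (degrees C) of a short oligo via the Wallace rule."""
    s = seq.upper().replace("U", "T")
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    if at + gc != len(s):
        raise ValueError("sequence must contain only A, C, G, T, or U")
    # 2 degrees per A/T pair, 4 degrees per G/C pair.
    return 2.0 * at + 4.0 * gc


if __name__ == "__main__":
    # A longer or more GC-rich probe gives a higher estimate.
    print(wallace_tm("ATGCATGCATGCATGC"))
```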
  • Moderately stringent conditions are described by, but not limited to, those in Sambrook et al. 1989, and include the use of washing solution and hybridization conditions (e.g., temperature, ionic strength and % SDS) less stringent than those described above.
  • An example of moderately stringent conditions is overnight incubation at 37°C in a solution comprising: 20% formamide, 5xSSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 mg/mL denatured sheared salmon sperm DNA, followed by washing the filters in 1xSSC at about 37-50°C.
  • measuring the expression of the foregoing miRNAs in a biological sample can comprise, for instance, contacting a sample containing or suspected of containing cancer cells with polynucleotide probes specific to the miRNAs of interest, or with primers designed to amplify a portion of the miRNAs of interest, and detecting binding of the probes to the nucleic acid targets or amplification of the nucleic acids, respectively.
  • PCR primers are known in the art (see e.g., Green and Sambrook et al. 2012).
  • RNA-sequencing (RNA-seq), also called Whole Transcriptome Shotgun Sequencing
  • RNA-seq refers to any of a variety of high-throughput sequencing techniques used to detect the presence and quantity of RNA transcripts in real time. See Wang, Z., M. Gerstein, and M. Snyder, RNA-Seq: a revolutionary tool for transcriptomics, NAT REV GENET, 2009. 10(1): p. 57-63.
  • RNA-seq can be used to reveal a snapshot of a sample’s miRNAs from a genome at a given moment in time.
  • miRNA is converted to cDNA fragments via reverse transcription prior to sequencing, and, in certain embodiments, miRNA can be directly sequenced without conversion to cDNA.
  • Adaptors may be attached to the 5’ and/or 3’ ends of the miRNAs, and the miRNA or cDNA may optionally be amplified, for example by PCR.
  • the fragments are then sequenced using high-throughput sequencing technology, such as, for example, those available from Roche (e.g., the 454 platform), Illumina, Inc., and Applied Biosystems (e.g., the SOLiD system).
  • FIGS. 1A-1C show a case flow diagram for the lung cancer dataset (FIG. 1A, split into a discovery and a validation set) and for the ovarian, liver and bladder cancer datasets (FIG. 1B, combined into a single validation dataset after removing redundant samples), and summarize the patient and tumor characteristics of patients with lung, bladder, ovarian, and liver cancers and demographic information of the corresponding controls (FIG. 1C);
  • FIGS. 2A-2G show the development and validation of the 4-miRNA diagnostic model in the lung cancer data set, with FIG. 2A showing determination of the optimal number (dotted line) of miRNAs for the diagnostic model by 10-fold cross validation in the discovery set; FIG. 2B showing ROC analysis in the discovery set; FIG. 2C showing distribution of normalized diagnostic index in the discovery set; FIG. 2D showing ROC analysis in the validation set; FIG. 2E showing distribution of normalized diagnostic index in the validation set; FIG. 2F showing comparison of normalized diagnostic index of paired serum samples (pre- vs. post-surgery) of 180 lung cancer patients; and FIG. 2G showing distribution of normalized diagnostic index in the clinical subsets of the validation set. Dotted horizontal lines represent the cut-point for the normalized diagnostic index of our model. The percentages shown in the graph were sensitivities in each cancer subgroup.
  • FIGS. 3A and 3B show the performance of the 4-miRNA diagnostic model in the datasets of additional cancers, with FIG. 3A showing ROC analysis, and FIG. 3B showing distribution of the normalized diagnostic index of the 4-miRNA model.
  • the percentages shown in the graph were sensitivities of each cancer type and specificity of non-cancer controls;
  • FIGS. 4A and 4B show the ROC analysis and distribution of normalized diagnostic index across age and gender groups in the lung cancer dataset.
  • Step (1) determining the expression profile of the miRNA biomarker set
  • Step (2) calculating a diagnostic index of the biological sample based on the expression profile of the miRNA biomarker set.
  • Step (3) classifying the subject as having the cancer or not based on the value of the calculated diagnostic index. If the calculated diagnostic index is greater than or equal to a pre-determined threshold, the subject is classified as having the cancer; otherwise, the subject is classified as not having the cancer.
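  • The decision rule of Step (3) can be sketched in a few lines of Python. The function name and the returned labels are hypothetical; the only behavior taken from the text is that an index greater than or equal to the threshold is classified as cancer.

```python
def classify_subject(diagnostic_index: float, threshold: float) -> str:
    """Apply the Step (3) rule: index >= threshold means the cancer call."""
    return "cancer" if diagnostic_index >= threshold else "non-cancer"


if __name__ == "__main__":
    # The threshold value here is a placeholder chosen only for the demonstration.
    for index in (950.0, 1200.0, 1350.0):
        print(index, classify_subject(index, threshold=1200.0))
```

  • Note that a sample whose index equals the threshold exactly falls on the cancer side of the rule.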
  • the miRNA biomarker set includes hsa-miR-5100, and optionally can further include any one or a combination of the miRNAs listed in Table 1 (see EXAMPLE 1).
  • the miRNA biomarker set may further include miRNA(s) from the top 2-100 miRNAs, or alternatively may further include miRNA(s) from the top 2-50 miRNAs, or alternatively may further include miRNA(s) from the top 2-20 miRNAs, or alternatively may further include miRNA(s) from the top 2-4 miRNAs, in Table 1.
  • the miRNA biomarker set consists of the top 4 miRNAs (i.e. hsa-miR-5100, hsa-miR-1343-3p, hsa-miR-1290, and hsa-miR-4787-3p); there can be different AUC cut-off levels (e.g. 0.780, 0.850, 0.950, 0.990, and 0.999), or different sensitivity-specificity levels (e.g. 68%-99%, 68%-99%, 83%-99%, and 99%-99%), at or above which the method is capable of accurately detecting certain cancer types.
  • the method can accurately detect lung cancer and gastric cancer at an AUC > 0.999, and/or at a sensitivity > 99.0% and a specificity > 99.0%.
  • the diagnostic index can be calculated based on formula (I).
  • the calculation can be based on an unweighted model or on a weighted model.
  • different models e.g. limma model, logistic regression model, etc.
  • the diagnostic index is calculated via a weighted model using weights from the limma model.
  • the pre-determined threshold can be set as 1110 to thereby allow the method to have a specificity >0.95; or optionally, the pre-determined threshold can be set as 1200 to thereby allow the method to have a specificity >0.99.
  • the param_location and param_scale can be selected as 600 and 1000 respectively to thereby allow the normalized diagnostic index to be between 0 and 10, and under such normalization, the preset cut-point can be set as 5.1 to give a specificity > 0.95 or as 6.0 to give a specificity > 0.99.
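  • The normalization formula itself is not reproduced in this passage, so the sketch below assumes a simple linear rescaling, normalized = 10 × (index − location) / scale. This assumption is chosen because, with location 600 and scale 1000, it maps the raw thresholds 1110 and 1200 onto the stated cut-points 5.1 and 6.0; the function and parameter names are hypothetical.

```python
def normalize_index(index: float, location: float = 600.0, scale: float = 1000.0) -> float:
    """Assumed linear rescaling of a raw diagnostic index onto a 0-10 range."""
    return 10.0 * (index - location) / scale


if __name__ == "__main__":
    # Raw thresholds quoted in the text and their normalized counterparts.
    for raw in (1110.0, 1200.0):
        print(raw, normalize_index(raw))
```

  • Under this assumption, raw indices of 600 and 1600 correspond to the ends of the 0 to 10 range; values outside that interval would fall outside the range unless clipped.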
  • the biological sample can advantageously be a liquid biopsy sample such as a blood sample, a serum sample, a plasma sample, a urine sample, a saliva sample, or a sputum sample, etc.
  • Determination of the expression profile of the miRNA biomarker set can be realized by means of a variety of probe-based approaches including Northern Blotting, microarray analysis, RNA-sequencing, or RNA in-situ hybridization, or by means of a variety of amplification-dependent approaches including reverse-transcription PCR (RT-PCR), quantitative RT-PCR (qRT-PCR), or digital RT-PCR.
  • the method may further comprise a step of performing an evaluation of the subject, so as to determine if the subject is diagnosed as having the cancer (if the subject was previously free of cancer) or if the subject has recurrence of the cancer (if the subject has previously been treated to remove, or be free of, the cancer).
  • the evaluation may further include physical examination, pathological examination of a biopsy from the subject, immunohistochemistry examination, or imaging examination including x-rays, computed tomography (CT), ultrasonography, magnetic resonance imaging, etc.
  • a kit that can be employed to specifically implement the various steps of the method according to the different embodiments as described above in the first aspect of this section is further provided.
  • the kit substantially includes certain articles (i.e. component (1), including one or more nucleic acids that can specifically recognize each miRNA in the miRNA biomarker set, and optionally one or more amplification primers) that can be used to determine the expression profile of the miRNA biomarker set and certain instructions (i.e. component (2)) for calculating the diagnostic index and for cancer classification.
  • each of the nucleic acids in component (1) may comprise a polynucleotide capable of specifically hybridizing under a stringent condition to (a) a polynucleotide comprising or consisting of a nucleotide sequence as set forth in SEQ ID NOS: 1-100, 1-50, 1-20 or 1-4, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides; or (b) a polynucleotide comprising or consisting of a nucleotide sequence complementary to a nucleotide sequence of SEQ ID NOS: 1-100, 1-50, 1-20 or 1-4, a derivative thereof, a variant thereof having at least 80% sequence identity, or a fragment thereof comprising 15 or more consecutive nucleotides.
  • embodiments of the kit may differ with regard to the following elements/features, including: which miRNAs are included in the miRNA biomarker set; whether and how the diagnostic index is normalized; how the subject is classified as having the cancer or not; what samples can be used as the biological sample; and what detection accuracy level is to be achieved, etc.
  • the specific details for these different embodiments can be referenced to the various embodiments of the method as described above, and will be skipped herein for conciseness.
  • a computerized solution is further provided, which substantially serves, in a computerized and automatic manner, to implement the various steps of the method as described above in the first aspect of this section.
  • Such a computerized solution may be applied in a situation where the implementation of the various steps (1)-(3) of the method described above is to be automated by running a software program comprising program instructions in a computer, which brings about advantages such as high efficiency and great convenience.
  • such a computerized solution may include a computerized system or computer system, which comprises a processor (i.e. controller) and a computer-readable non-transitory storage medium that is communicatively coupled to the processor.
  • the computer-readable non-transitory storage medium is configured to store program instructions that are executable by the processor, thereby causing the processor to execute the various different steps in the method as described above, including:
  • Step (1) determining the expression profile of the miRNA biomarker set
  • Step (2) calculating a diagnostic index of the biological sample based on the expression profile of the miRNA biomarker set and according to formula (I); and
  • Step (3) classifying the subject as having the cancer or not based on the value of the calculated diagnostic index.
  • processor is interpreted to be interchangeable with “central controller” or “central processing unit (CPU)”, and can be deemed to be a single-core or multi-core processor, or a plurality of processors for parallel processing.
  • non-transitory is intended to describe a tangible computer-readable storage medium excluding propagating electromagnetic signals, but is not intended to otherwise limit the type of physical computer-readable storage device that is encompassed by the phrase. Examples may include any tangible or non-transitory storage media or memory media such as electronic, magnetic, or optical media (e.g., disk or CD/DVD-ROM), or non-volatile memory storage (e.g., “flash” memory), etc.
  • the system 100 can, in addition to the processor 10 and the computer-readable non-transitory storage medium 20, further comprise a bus 30, a memory 40, an I/O interface 50, and a communication interface 60.
  • the processor 10, the storage medium 20, the memory 40, the I/O interface 50 and the communication interface 60 are all communicatively coupled with one another through the bus 30.
  • the storage medium 20 stores computer-executable program instructions which, when executed by the processor 10, cause the processor 10 to execute steps (1)-(3) of the method as described above.
  • the memory 40 is configured to transiently store the program instructions obtained from the storage medium 20, and the processor 10 is configured to execute the program instructions transiently stored in the memory 40.
  • the I/O interface 50 allows an input/output between the system 100 and a user, realizing the control of the system 100.
  • the communication interface 60 can allow the system 100 to be communicatively connected to another computing device to exchange data. It is to be noted that these computer hardware components can be locally arranged, or can be remotely arranged via a network, such as an intranet, an internet, or a cloud.
  • the original lung cancer study established a 2-miRNA diagnostic model (referred to as the “original 2-miRNA model” in this study) with high sensitivity and specificity for the detection of lung cancer (Asakura et al. 2020).
  • the objective of the current study was initially set to use this dataset to develop and validate a new diagnostic model that may out-perform the original 2-miRNA model for lung cancer detection. As datasets for additional cancer types were identified, the new model was evaluated for performance to detect other cancers.
  • Serum sample collection has been previously described in the original publications (Asakura et al. 2020; Yokoi et al. 2018; Usuba et al. 2019; Yamamoto et al. 2020). Briefly, serum samples were collected from cancer patients who were referred or admitted to the National Cancer Center Hospital (NCCH) between 2008 and 2016 prior to surgical operation, and stored at 4 °C for one week before being stored at -20 °C until further use. Cancer patients who were treated with preoperative chemotherapy and radiotherapy prior to serum collection were excluded.
  • NCGG National Center for Geriatrics and Gerontology
  • YMC Yokohama Minoru Clinic
  • labeled with the 3D-Gene® miRNA Labeling kit and hybridized to the 3D-Gene® Human miRNA Oligo Chip (Toray Industries, Kanagawa, Japan), designed to investigate 2588 miRNA sequences registered in miRBase release 21.
  • RNA was determined when signal intensity was greater than the mean plus two times the standard deviation of the negative control signals, where the top and bottom 5% of the ranked negative control signal intensities were removed before the calculation. Background subtraction was performed by subtracting the mean of the negative control signals (after removing the top and bottom 5% as ranked by signal intensity) from the miRNA signal. Normalization across microarrays was achieved by calibrating according to three pre-selected internal control miRNAs (miR-149-3p, miR-2861, and miR-4463).
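  • The detection call and background subtraction described above can be sketched as follows. This is a minimal illustration under stated assumptions: the 5% trimming is applied by rank with the count rounded down, and the sample standard deviation is used, neither of which is specified in the text. The function names are hypothetical, and the calibration against internal control miRNAs is not covered because its exact procedure is not given here.

```python
import statistics
from typing import List, Sequence


def trimmed_negative_controls(signals: Sequence[float], trim_fraction: float = 0.05) -> List[float]:
    """Drop the top and bottom trim_fraction of the ranked negative control signals."""
    ranked = sorted(signals)
    k = int(len(ranked) * trim_fraction)  # assumption: round the trimmed count down
    return ranked[k:len(ranked) - k]


def detection_threshold(negative_controls: Sequence[float]) -> float:
    """Mean plus two standard deviations of the trimmed negative controls."""
    trimmed = trimmed_negative_controls(negative_controls)
    # assumption: sample standard deviation (n - 1 in the denominator)
    return statistics.mean(trimmed) + 2.0 * statistics.stdev(trimmed)


def is_detected(signal: float, negative_controls: Sequence[float]) -> bool:
    """A signal counts as present only if it exceeds the detection threshold."""
    return signal > detection_threshold(negative_controls)


def background_subtract(signal: float, negative_controls: Sequence[float]) -> float:
    """Subtract the mean of the trimmed negative controls from the miRNA signal."""
    return signal - statistics.mean(trimmed_negative_controls(negative_controls))
```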
  • Linear Model for Microarray Data (limma) (Ritchie et al. 2015) was performed in the discovery set to evaluate the statistical significance of differential miRNA expression between lung cancer vs. non-cancer.
  • a diagnostic index was calculated as a linear sum of miRNA expression levels weighted by limma statistics. The cut-point for the diagnostic index was chosen to ensure no misclassification of non-cancer controls in the discovery set to minimize false positives as the diagnostic model may potentially be used as a screening test in the at-risk general public.
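  • A weighted linear sum and the check that a cut-point produces no false positives among the controls can be sketched as below. The weights shown are arbitrary placeholders, not the limma statistics of the study, and the function names are hypothetical; only the four miRNA names and the "greater than or equal to" decision rule come from the text.

```python
from typing import Iterable, Mapping


def diagnostic_index(expression: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Linear sum of expression levels, each multiplied by its weight."""
    missing = set(weights) - set(expression)
    if missing:
        raise KeyError(f"expression values missing for: {sorted(missing)}")
    return sum(weights[name] * expression[name] for name in weights)


def has_no_false_positives(cutpoint: float, control_indices: Iterable[float]) -> bool:
    """True if every non-cancer control falls below the cut-point.

    Because an index >= cut-point is called cancer, the cut-point must be
    strictly greater than the largest control index.
    """
    return all(index < cutpoint for index in control_indices)


if __name__ == "__main__":
    # Placeholder weights and expression values, for illustration only.
    weights = {"hsa-miR-5100": 2.0, "hsa-miR-1343-3p": 1.0,
               "hsa-miR-1290": -1.0, "hsa-miR-4787-3p": 0.5}
    sample = {"hsa-miR-5100": 10.0, "hsa-miR-1343-3p": 8.0,
              "hsa-miR-1290": 6.0, "hsa-miR-4787-3p": 4.0}
    print(diagnostic_index(sample, weights))
```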
  • the diagnostic performance for identifying cancer vs. non-cancer was determined by the AUC of the ROC curve analysis, sensitivity, and specificity. The AUCs of two ROC curves were compared using the roc.test function (bootstrap method) from the pROC package. Paired sensitivities for the lung cancer clinical subsets of paired pre- vs. post-surgical samples were compared by the McNemar test. The limma analysis was carried out using the Bioconductor package limma (The Bioconductor Open Source Software For Bioinformatics, accessed on August 27, 2020). All statistical analysis was performed using R version 4.0.5 (The R Project for Statistical Computing, accessed on July 15, 2020).
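  • For readers unfamiliar with the McNemar test, a minimal Python sketch of its exact (binomial) form is given below. The study itself used R, and the text does not say whether the exact or the chi-square version was applied, so this is an illustrative variant rather than the analysis actually performed. Here b and c are the two discordant pair counts (for example, positive before and negative after surgery, and the reverse); the function name is hypothetical.

```python
from math import comb


def mcnemar_exact_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant pair counts."""
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) probability of a count as small as the smaller discordant count.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    print(mcnemar_exact_p(1, 9))
```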
  • the lung cancer dataset included 1566 lung cancer patients and 2178 non-cancer controls (FIG. 1A) (Asakura et al. 2020).
  • the ovarian cancer dataset consisted of 333 ovarian cancer patients and 2759 non-cancer controls, as well as patients with breast, colorectal, esophageal, gastric, liver, lung, pancreatic, and sarcoma cancers (FIG. 1B) (Yokoi et al. 2018).
  • liver and bladder cancer datasets included 345 liver cancer/1033 non-cancer and 392 bladder cancer/100 non-cancer participants, respectively, in addition to patients with biliary tract, breast, colorectal, esophageal, gastric, glioma, lung, ovarian, pancreatic, prostate, and sarcoma cancers (FIG. 1B) (Usuba et al. 2019; Yamamoto et al. 2020). With the lung cancer dataset left intact, redundant samples within the other three datasets were removed, i.e. samples whose correlation with another sample in those datasets or with a sample in the lung cancer dataset was greater than 0.99.
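  • The redundancy filter described above can be sketched as follows. The text does not state which correlation measure was used or which member of a correlated pair was discarded; the sketch assumes Pearson correlation and a greedy pass that keeps the first sample seen, with the reference (lung cancer) profiles always left intact. All names are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def remove_redundant(reference: Dict[str, Sequence[float]],
                     candidates: Dict[str, Sequence[float]],
                     cutoff: float = 0.99) -> List[str]:
    """Return IDs of candidate samples that are not near-duplicates.

    A candidate is dropped if its correlation with any reference sample, or
    with any candidate already kept, exceeds the cutoff.
    """
    kept: List[Tuple[str, Sequence[float]]] = []
    for sample_id, profile in candidates.items():
        compared = list(reference.values()) + [p for _, p in kept]
        if all(pearson(profile, other) <= cutoff for other in compared):
            kept.append((sample_id, profile))
    return [sample_id for sample_id, _ in kept]
```

  • The pass is order-dependent, so a different sample ordering could keep a different member of a redundant pair.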
  • the discovery set included 208 lung cancer patients and 208 non-cancer controls, matched by age, sex, and smoking status (Asakura et al. 2020).
  • the validation set included 1358 lung cancer patients and 1970 non-cancer controls.
  • the patients with lung cancer included 57% male, 62% former or current smokers, 78% adenocarcinoma, 14% squamous carcinoma, 72% stage I, 15% stage II, and 13% stage III (FIG. 1C).
  • the 392 bladder cancer patients were of mean age 68 y, 72% male, 5% metastatic, 12% nodal positive, 77% T2 or below, and 80% high grade (FIG. 1C).
  • the 333 ovarian cancer patients were of mean age 57 y, 25% stage I, 10% stage II, 55% serous, 19% clear cell, and 13% endometrioid histology (FIG. 1C).
  • the 348 liver cancer patients were of mean age 68 y, 78% male, 37% stage I, and 33% stage II (FIG. 1C). No detailed demographic information and tumor characteristics for the other cancers were provided by the original studies.
  • Table 1 cancer discovery set.
  • the cut-point of six was chosen to ensure no misclassification of the non-cancer controls in the discovery set to minimize the false positives, which resulted in 98% sensitivity and 100% specificity (FIG. 2C), compared to 99% for both sensitivity and specificity for the original 2-miRNA model (Asakura et al. 2020).
  • the performance of the 4-miRNA model was assessed in clinical subsets of the validation set, as defined by clinical stage, T stage, N stage, M stage, and Histology. Across all clinical subsets, the 4-miRNA model showed sensitivities of approximately 99% or above (FIG. 2G, Table 2), which were superior to the sensitivities of the original 2-miRNA model (Table 2). In particular for early stage lung cancer, e.g., for both patients with stage I lung cancer and patients with T1 tumors, the 4-miRNA model demonstrated >99% sensitivity (FIG. 2G, Table 2), compared to the sensitivities of 95.4 and 95.9%, respectively, for the 2-miRNA model (Table 2).
  • In the prevalent histological types of adenocarcinoma and squamous cell carcinoma, the 4-miRNA model also demonstrated superior performance (FIG. 2G, Table 2), compared to the original 2-miRNA model (Table 2).
  • Table 2. Comparison of sensitivities in the clinical subsets of the lung cancer validation set between the original 2-miRNA model and the new 4-miRNA model, while maintaining a specificity of >99%.
  • Data on paired serum samples (pre- vs. post-surgery) were also available for 180 patients. The diagnostic indices of the 4-miRNA model for post-surgery serum samples were reduced to normal levels below the diagnostic index cut-point (FIG. 2F).
  • the 4-miRNA model demonstrated high sensitivities in the range from 83.2 to 100% for biliary tract, bladder, colorectal, esophageal, gastric, glioma, liver, pancreatic, and prostate cancers, and reasonable sensitivities of 68.2 and 72.0% for ovarian cancer and sarcoma, respectively (FIG. 3B).
  • the 4-miRNA model maintained a high specificity of 99.3%.
  • Galleri and PanSeer are developed as methylation-based epigenetic signatures (Klein et al. 2021; Chen 2020).
  • CCGA Circulating Cell-free Genome Atlas
  • the current MCED tests in development generally showed sensitivities in the range of 60-70% when a high specificity of 99% was mandated.

PCT/US2022/032423 2021-06-09 2022-06-07 Cancer detection method, kit, and system WO2022261039A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202280041034.8A CN117500941A (zh) 2021-06-09 2022-06-07 Cancer detection method, kit, and system
CA3221494A CA3221494A1 (en) 2021-06-09 2022-06-07 Cancer detection method, kit, and system
EP22820856.7A EP4352266A2 (en) 2021-06-09 2022-06-07 Cancer detection method, kit, and system
AU2022289858A AU2022289858A1 (en) 2021-06-09 2022-06-07 Cancer detection method, kit, and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202163208506P 2021-06-09 2021-06-09
US63/208,506 2021-06-09

Publications (2)

Publication Number Publication Date
WO2022261039A2 true WO2022261039A2 (en) 2022-12-15
WO2022261039A3 WO2022261039A3 (en) 2023-01-19

Family

ID=84426392

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/032423 WO2022261039A2 (en) 2021-06-09 2022-06-07 Cancer detection method, kit, and system

Country Status (5)

Country Link
EP (1) EP4352266A2 (zh)
CN (1) CN117500941A (zh)
AU (1) AU2022289858A1 (zh)
CA (1) CA3221494A1 (zh)
WO (1) WO2022261039A2 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110799648A (zh) * 2017-06-29 2020-02-14 Toray Industries, Inc. Kit, device, and method for detecting lung cancer

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9495515B1 (en) * 2009-12-09 2016-11-15 Veracyte, Inc. Algorithms for disease diagnostics
US20120041274A1 (en) * 2010-01-07 2012-02-16 Myriad Genetics, Incorporated Cancer biomarkers
WO2013107459A2 (en) * 2012-01-16 2013-07-25 Herlev Hospital Microrna for diagnosis of pancreatic cancer and/or prognosis of patients with pancreatic cancer by blood samples
US9708667B2 (en) * 2014-05-13 2017-07-18 Rosetta Genomics, Ltd. MiRNA expression signature in the classification of thyroid tumors
WO2016038119A1 (en) * 2014-09-09 2016-03-17 Istituto Europeo Di Oncologia S.R.L. Methods for lung cancer detection
CA3059480A1 (en) * 2017-04-28 2018-11-01 Toray Industries, Inc. Kit, device, and method for detecting ovarian tumor

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110799648A (zh) * 2017-06-29 2020-02-14 Toray Industries, Inc. Kit, device, and method for detecting lung cancer
CN110799648B (zh) * 2017-06-29 2024-03-22 Toray Industries, Inc. Kit, device, and method for detecting lung cancer

Also Published As

Publication number Publication date
EP4352266A2 (en) 2024-04-17
CA3221494A1 (en) 2022-12-15
AU2022289858A1 (en) 2024-01-04
WO2022261039A3 (en) 2023-01-19
CN117500941A (zh) 2024-02-02

Similar Documents

Publication Publication Date Title
JP5843840B2 (ja) New cancer marker
ES2656487T3 (es) Evaluation of the response to therapy of gastroenteropancreatic neuroendocrine neoplasms (GEP-NEN)
JP6408380B2 (ja) Method and kit for diagnosing a subject at risk of having cancer
JP2014509189A (ja) Colon cancer gene expression signatures and methods of use
US11198909B2 (en) Risk scores based on human phosphodiesterase 4D variant 7 expression
WO2015073949A1 (en) Method of subtyping high-grade bladder cancer and uses thereof
US10287634B2 (en) RNA-biomarkers for diagnosing prostate cancer
EP3122905B1 (en) Circulating micrornas as biomarkers for endometriosis
US20240093312A1 (en) Detection method
EP3548631B1 (en) Risk scores based on human phosphodiesterase 4d variant 7 expression
AU2022289858A1 (en) Cancer detection method, kit, and system
US20130084241A1 (en) DEVELOPMENT OF miRNA DIAGNOSTICS TOOLS IN BLADDER CANCER
JP6611411B2 (ja) Pancreatic cancer detection kit and detection method
US20210079479A1 (en) Compositions and methods for diagnosing lung cancers using gene expression profiles
WO2019245587A1 (en) Methods and compositions for the analysis of cancer biomarkers
US11427874B1 (en) Methods and systems for detection of prostate cancer by DNA methylation analysis
CN111315897A (zh) Methods for melanoma detection
JP2024519082A (ja) DNA methylation biomarkers for hepatocellular carcinoma

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase

Ref document number: 3221494

Country of ref document: CA

ENP Entry into the national phase

Ref document number: 2023576034

Country of ref document: JP

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: 202280041034.8

Country of ref document: CN

WWE WIPO information: entry into national phase

Ref document number: 2022289858

Country of ref document: AU

Ref document number: AU2022289858

Country of ref document: AU

ENP Entry into the national phase

Ref document number: 2022289858

Country of ref document: AU

Date of ref document: 20220607

Kind code of ref document: A

WWE WIPO information: entry into national phase

Ref document number: 2022820856

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2022820856

Country of ref document: EP

Effective date: 20240109

121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22820856

Country of ref document: EP

Kind code of ref document: A2