WO2010045462A1 - System for identification of multiple nucleic acid targets in a single sample and use thereof

Info

Publication number
WO2010045462A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
biomarker
virus
sample
target nucleic
Application number
PCT/US2009/060848
Other languages
French (fr)
Inventor
Tom Morrison
Original Assignee
Biotrove, Inc.
Application filed by Biotrove, Inc.
Priority to EP09821256A (EP2344619A4)
Publication of WO2010045462A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6851 Quantitative amplification

Definitions

  • GEP gene expression profiling
  • Hybridization arrays are quite appealing for their ability to collect many measurements per sample. However, they suffer from low assay specificity, poor sensitivity, narrow dynamic range, poor signal-to-analyte response and complex sample processing making hybridization microarrays less attractive as a platform for diagnostic gene expression profiling relative to their well-established research utility.
  • QPCR has excellent lower detection threshold, signal-to-analyte response, and dynamic range.
  • RNA yield is often low from clinical samples, especially formalin fixed paraffin embedded tissues, and this low RNA yield limits the number of assays per test.
  • more tests consume expensive reagents and entail complicated workflows requiring highly skilled labor, making the test expensive and possibly slowing widespread adoption and deployment, despite its intrinsic clinical value.
  • hybridization microarrays exhibit similar performance limitations that make them less than ideal as a GEP diagnostic platform.
  • Examples of multivariate GEP tests showing both promise and limitations are two high-profile commercial GEP breast cancer prognostic tests: MammaPrint, a 70-gene microarray test, and Oncotype DX, a 21-gene real-time QPCR test.
  • Each of these tests provides sufficient clinical accuracy to improve breast cancer patient outcome enabling selection of the best treatment plan on an individual basis.
  • these tests cannot be widely deployed in a kit format and must be performed at their respective company's laboratories as a clinical testing service, because their reliability depends on the specific expertise and processes developed by each company for each test, which are therefore not exportable to other laboratories.
  • a clear benefit to improving human health care capabilities would be a GEP platform that provides the analytic sensitivity and linear dynamic range of QPCR and the assay scalability of microarrays while minimizing inter-laboratory analytical variation, cost and sample consumption, and enabling analysis of FFPE samples. This would enable widespread deployment in regional pathology laboratories for clinical diagnostic testing.
  • compositions and methods that provide for gene expression profiling using a pre-amplification step that enhances detection of multiple nucleic acid targets in a single sample, such as a biologic sample, in a nanoplatform system.
  • the invention provides a method for detecting a gene expression profile in a biological sample, the method involving the steps of preamplifying a biomarker in the presence of a defined competitive reference biomarker; individually exponentially amplifying the biomarker in the presence of the reference biomarker in a reaction volume of at least about 1, 10, 100, 500, or 1000 nl; identifying binding of a first detectable nucleic acid probe to the biomarker and any one of: binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and the melting temperature of the first detectable nucleic acid probe to the biomarker; and determining, respectively, any one of: the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker, where the half maximal effective concentration is used to determine the quantity of the biomarker in the sample.
  • the invention provides a method for detecting a gene expression profile in a biological sample, the method involving the steps of preamplifying a biomarker in the presence of a defined competitive reference biomarker; individually exponentially amplifying the biomarker in the presence of the reference biomarker in a set of reactions, each reaction having a volume of at least about 1, 10, 100, 500, or 1000 nl; identifying binding of a first detectable nucleic acid probe to the biomarker and any one of: binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and the melting temperature of the first detectable nucleic acid probe to the biomarker; determining, respectively, any one of: the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker; and plotting the ratio against the molar ratio of the reference nucleic acid for the set of reactions, where the half maximal effective concentration
  • the invention provides a method for identifying or monitoring a subject as having a pathological condition characterized by an alteration in gene expression, the method involving the steps of preamplifying a biomarker in the presence of a defined competitive reference biomarker; individually exponentially amplifying the biomarker in the presence of the reference biomarker in a reaction volume of at least about 1, 10, 100, 500, or 1000 nl; and detecting the presence or absence of the biomarker and the corresponding reference biomarker, where detection of the biomarker indicates that the biomarker is present; and failure to detect the biomarker when the corresponding reference biomarker is detected indicates that the biomarker is absent from the sample.
  • the invention provides a method for detecting two or more target nucleic acid molecules in a single sample, the method involving the steps of preamplifying the target nucleic acid molecules in the presence of a defined reference nucleic acid molecule; individually exponentially amplifying each of the target nucleic acid molecules in the presence of the reference nucleic acid molecule in a reaction volume of at least about 1, 10, 100, 500, or 1000 nl; and detecting the presence or absence of the target nucleic acid molecules and the reference nucleic acid molecule, where detection of the target nucleic acid molecules indicates that the target nucleic acid is present; and failure to detect the target nucleic acid molecule when the reference nucleic acid molecule is detected indicates that the target nucleic acid molecule is absent from the sample.
  • the invention provides a method for detecting two or more target nucleic acid molecules in a single sample, the method involving the steps of preamplifying a target nucleic acid molecule in the presence of a defined reference nucleic acid molecule for each target; individually exponentially amplifying each of the target nucleic acid molecules in the presence of the reference nucleic acid molecule in a set of reactions, each reaction having a volume of at least about 1, 10, 100, 500, or 1000 nl; identifying binding of a first detectable nucleic acid probe to the target nucleic acid molecule and any one of: binding of a second detectable nucleic acid probe to the corresponding reference nucleic acid molecule, and the melting temperature of the first detectable nucleic acid probe to the target nucleic acid; determining, respectively, any one of: the ratio of binding to the target nucleic acid and binding to the corresponding reference nucleic acid, and the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomark
  • the invention provides a method for characterizing cancer, the method involving the steps of preamplifying a biomarker in the presence of a defined reference biomarker in a set of reactions, where the biomarker is selected from the group consisting of ERBB3, LCK, DUSP6, STAT1, MMD, CPEB4, RNF4, STAT2, NF1, FRAP1, DLG2, IRF4, ANXA5, HMMR, HGF, and ZNF264; individually exponentially amplifying the biomarker in a reaction having a volume of at least about 1, 10, 100, 500, or 1000 nl; identifying binding of a first detectable nucleic acid probe to the biomarker and any one of: binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and the melting temperature of the first detectable nucleic acid probe to the biomarker; and determining, respectively, any one of: the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and the ratio
  • the invention provides a nanofluidic system having a high density array of nanoliter-scale through-holes having a 10-50 nl reaction volume containing a standardized mixture of internal standards, at least two (e.g., 2, 3, 4, 5, etc.) pairs of detectable target nucleic acid probes, each of which is complementary to a target nucleic acid sequence, and a pair of detectable reference nucleic acid probes complementary to a competitive template internal standard, where each primer pair coamplifies a template and its respective competitive internal standard template with equal efficiency.
  • the invention provides a kit containing a high density array of nanoliter-scale through-holes having a 10-50 nl reaction volume containing a standardized mixture of internal standards, at least two (e.g., 2, 3, 4, 5, etc.) pairs of detectable target nucleic acid probes, each of which is complementary to a target nucleic acid sequence, and a pair of detectable reference nucleic acid probes complementary to a competitive template internal standard, where each primer pair coamplifies a template and its respective competitive internal standard template with equal efficiency, and written directions for using the kit to detect a gene expression profile in a biological sample.
  • the sample is detected for a condition selected from the group consisting of neoplasia, inflammation, pathogen infection, immune response, sepsis, the presence of liver metabolites, and the presence of a genetically modified organism.
  • detecting the neoplasia is for diagnosing a neoplasia, characterizing a neoplasia to identify tissue of origin, monitoring response of neoplasia to treatment, or predicting the risk of developing a neoplasia.
  • the target nucleic acid or biomarker is RNA or DNA.
  • the step of preamplifying a biomarker in the presence of a defined competitive reference biomarker involves preamplifying the target nucleic acids using primer sets specific for the target nucleic acids.
  • the primer set used in the step for preamplifying the target nucleic molecules is used in the step for amplifying the target nucleic molecules.
  • a first set of primers is used in the step for preamplifying the target nucleic molecules and a second set of primers is used in the step for amplifying the target nucleic acid molecules.
  • the step of preamplifying a biomarker in the presence of a defined competitive reference biomarker involves reverse transcriptase polymerase chain reaction (RT-PCR).
  • RT-PCR reverse transcriptase polymerase chain reaction
  • the nucleic acid probe to the target nucleic acid and the nucleic acid probe to the corresponding reference nucleic acid are fluorogenic.
  • the reaction occurs in a through-hole of a platen.
  • the target nucleic acid is derived from a bacterium, a virus, a spore, or a eukaryotic cell.
  • the eukaryotic cell is a neoplastic cell derived from lung, breast, prostate, thyroid, or pancreas.
  • the target nucleic acid molecule is derived from a bacterial pathogen selected from the list consisting of Aerobacter, Aeromonas, Acinetobacter, Actinomyces israelii, Agrobacterium, Bacillus, Bacillus anthracis, Bacteroides, Bartonella, Bordetella, Bortella, Borrelia, Brucella, Burkholderia, Calymmatobacterium, Campylobacter, Citrobacter, Clostridium, Clostridium perfringens, Clostridium tetani, Corynebacterium, Corynebacterium diphtheriae, Corynebacterium sp., Enterobacter, Enterobacter aerogenes, Enterococcus, Erysipelothrix rhusiopathiae, Escherichia, Francisella, Fusobacterium nucleatum, Gardnerella, Haemophilus, Hafnia, Helicobacter,
  • the bacterial pathogen is antibiotic resistant.
  • the target nucleic acid molecule is derived from a virus selected from the list consisting of hepatitis C virus, human immunodeficiency virus, Retrovirus, Picornavirus, polio virus, hepatitis A virus, Enterovirus, human Coxsackie virus, rhinovirus, echovirus, Calicivirus, Togavirus, equine encephalitis virus, rubella virus, Flavivirus, dengue virus, encephalitis virus, yellow fever virus, Coronavirus, Rhabdovirus, vesicular stomatitis virus, rabies virus, Filovirus, ebola virus, Paramyxovirus, parainfluenza virus, mumps virus, measles virus, respiratory syncytial virus, Orthomyxovirus, influenza virus, Hantaan virus, bunga virus, phlebovirus, Nairo virus, Arena virus, hemorrhagic fever virus, reovirus, orbivirus, Rot
  • the sample is a biological fluid or tissue sample derived from a patient.
  • the sample is selected from blood, serum, urine, semen and saliva.
  • the tissue sample is selected from tissue biopsy, formaldehyde fixed paraffin embedded tissue, fine needle aspirate (FNA) biopsy and laser capture micro-dissected samples.
  • the sample contains at least about 1-1000 (e.g., 1-10, 1-100, 1-500) cells.
  • the sample contains at least about 1-1000 (e.g., 1-10, 1-100, 1-500) ng of RNA.
  • one target nucleic acid molecule can be detected in at least about 50, 25, or 10 copies per reaction or when at least about 50-100 copies of a competing target nucleic acid are present.
  • the method detects a target present at about 1-100 copies/reaction. In various embodiments of any of the above aspects, the method detects a target present at about 5-50 starting copies/reaction. In various embodiments of any of the above aspects, the method detects a target present at about 10 starting copies/reaction. In various embodiments of any of the above aspects, the method detects a target present at about 1 starting copy/reaction.
  • an absolute gene copy number is generated by curve fitting a plot of the ratio of the native/standard signals vs. standard concentration and the concentration (EC50) is used to determine the quantity of the target nucleic acid in the sample.
  • an absolute gene copy number is generated by curve fitting a plot of the ratio of the signal/melting temperature of the detectable nucleic acid probe vs. standard concentration and the concentration (EC50) is used to determine the quantity of the target nucleic acid in the sample.
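The EC50-based quantification described in the two embodiments above can be sketched in a few lines. This is a minimal illustration, not the method of the patent: it assumes an idealized competitive model in which the end-point fraction of native template equals NT / (NT + IS), so that the logit of the fraction is linear in log10 of the internal standard input and the EC50 is the standard input at which the fraction is 0.5. The function name `estimate_ec50` and the synthetic data are hypothetical.

```python
import math


def estimate_ec50(standard_copies, fraction_native):
    """Estimate the internal-standard input at which the native fraction is 0.5.

    Assumption (not from the source): logit(fraction) is linear in
    log10(standard copies), as in an ideal competitive amplification.
    """
    xs, ys = [], []
    for copies, frac in zip(standard_copies, fraction_native):
        if 0.0 < frac < 1.0:  # the logit is undefined at exactly 0 or 1
            xs.append(math.log10(copies))
            ys.append(math.log10(frac / (1.0 - frac)))
    if len(xs) < 2:
        raise ValueError("need at least two points with 0 < fraction < 1")

    # Ordinary least-squares line through the logit-transformed points.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # The fraction is 0.5 where the logit is zero.
    return 10 ** (-intercept / slope)


# Synthetic titration: 5000 native copies against 10^1..10^8 standard copies.
true_native = 5000.0
standards = [10.0 ** k for k in range(1, 9)]
fractions = [true_native / (true_native + s) for s in standards]
ec50 = estimate_ec50(standards, fractions)
print(round(ec50))
```

Under this idealized model the recovered EC50 equals the native copy number; real data would need a fit that tolerates noise and non-ideal amplification efficiency.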
  • the detectable target and reference nucleic acid probes have a distinct fluorometric dye that provides for the separate detection of amplified target and internal standards.
  • the tissue sample is selected from the group consisting of tissue biopsy, formaldehyde fixed paraffin embedded (FFPE) tissue, fine needle aspirate (FNA) biopsy and laser capture micro-dissected samples.
  • alteration is meant an increase or decrease.
  • An alteration may be by as little as 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, or by 40%, 50%, 60%, or even by as much as 75%, 80%, 90%, or 100%.
  • amplify is meant to increase the number of copies of a molecule.
  • the polymerase chain reaction (PCR) is used to amplify nucleic acids.
  • preamplify is meant to increase the number of copies of a molecule (e.g., a biomarker or nucleic acid molecule) before exponentially amplifying the molecule.
  • preamplification may involve a linear increase in the number of copies of a molecule
  • binding is meant having a physicochemical affinity for a molecule. Binding is measured by any of the methods of the invention, e.g., hybridization of a detectable nucleic acid probe, such as a TaqMan-based probe or a Pleiades-based probe.
  • biological sample is meant any tissue, cell, fluid, or other material derived from an organism (e.g., human subject).
  • biomarker is meant a polypeptide or polynucleotide that is differentially present in a sample taken from a subject having a disease or disorder relative to a reference.
  • exemplary biomarkers include nucleic acid molecules.
  • detect refers to identifying the presence, absence, or level of an agent.
  • detectable is meant a moiety that when linked to a molecule of interest renders the latter detectable. Such detection may be via spectroscopic, photochemical, biochemical, immunochemical, or chemical means.
  • useful labels include radioactive isotopes, magnetic beads, metallic beads, colloidal particles, fluorescent dyes, electron-dense reagents, enzymes (for example, as commonly used in an ELISA), biotin, digoxigenin, or haptens.
  • half-maximal effective concentration or "EC50" is the response halfway between the baseline and maximum of the ratio of target molecule to a reference molecule, which corresponds to the inflection point of a sigmoidal curve fit when the ratio of target molecule to internal standard is plotted against the molar ratio of the reference molecule.
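As an illustration of the EC50 definition above, one common four-parameter form of a sigmoidal curve is written out below. The parameterization is an assumption (the source does not state which sigmoid is fitted): x is log10 of the internal standard amount, y is the plotted response, y_min and y_max are the baseline and maximum, and h is a slope factor.

```latex
y(x) = y_{\min} + \frac{y_{\max} - y_{\min}}{1 + 10^{\,h\,(x - \log_{10}\mathrm{EC}_{50})}},
\qquad
y\bigl(\log_{10}\mathrm{EC}_{50}\bigr) = \frac{y_{\min} + y_{\max}}{2}
```

In the idealized competitive case, where the native fraction is NT / (NT + IS), this reduces to y_min = 0, y_max = 1, h = 1 and EC50 = NT, so the inflection point reads out the native template amount directly.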
  • an internal standard is meant a competitive template or molecule that is amplified in the presence of a native template or molecule.
  • gene expression profile is meant a characterization of the expression or expression level of two or more polynucleotides.
  • melting temperature is meant the lowest temperature at which a detection probe does not bind or hybridize to a target nucleic acid.
  • the melting temperature is determined by the inflection point of melting curve profile, which measures hybridization as a function of temperature.
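The inflection-point rule above can be sketched numerically. This is a minimal example under stated assumptions: the probe signal falls as the probe dissociates with rising temperature, and the inflection is taken as the midpoint of the interval with the steepest drop. The function name `melting_temperature` and the synthetic curve are hypothetical.

```python
import math


def melting_temperature(temps, signal):
    """Return the temperature of steepest signal loss (maximum of -dF/dT).

    Uses finite differences between neighbouring readings and reports the
    midpoint of the steepest interval.
    """
    if len(temps) != len(signal) or len(temps) < 2:
        raise ValueError("need two or more paired readings")
    best_temp, best_rate = None, float("-inf")
    for i in range(len(temps) - 1):
        rate = -(signal[i + 1] - signal[i]) / (temps[i + 1] - temps[i])
        if rate > best_rate:
            best_rate = rate
            best_temp = (temps[i] + temps[i + 1]) / 2.0
    return best_temp


# Synthetic melt curve with its inflection at 65.5 degrees C.
temps = [50.0 + i for i in range(31)]
signal = [1.0 / (1.0 + math.exp((t - 65.5) / 1.5)) for t in temps]
tm = melting_temperature(temps, signal)
print(tm)
```

Measured curves are usually smoothed before differentiation; that step is omitted here to keep the sketch short.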
  • “native” is meant endogenous, or originating in a sample.
  • nucleic acid or oligonucleotide probe is defined as a nucleic acid capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation.
  • a probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.).
  • the bases in a probe may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. It will be understood by one of skill in the art that probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions.
  • the probes are preferably directly labeled with isotopes, for example, chromophores, lumiphores, chromogens, or indirectly labeled with biotin to which a streptavidin complex may later bind.
  • platen is meant a device having a high-density array of holes for holding and/or analyzing a plurality of liquid samples, e.g., described in US Patent Nos. 6,716,629; 6,027,873; 6,306,578; or 6,436,632, all of which are herein incorporated by reference.
  • a “competitive reference biomarker” is a reference biomarker that competes with the biomarker of interest in a chemical reaction (e.g., competes with the biomarker of interest for probe binding).
  • standardized mixture of internal standards is meant a mixture that contains internal standards having a defined concentration or a defined number of molecules of the internal standards.
  • target nucleic acid molecule is meant a nucleic acid or biomarker of the sample that is to be detected.
  • Figures 1A-C are schematic diagrams showing a workflow for the detection of multiple nucleic acid targets in a single sample, e.g., a formalin fixed paraffin embedded (FFPE) sample.
  • Figure 2 is a graph, which shows that preamplification does not increase replicate variation within and across separate experiments.
  • Levels of three poorly expressed genes (DPP4, SCNN1A, and WNT1) were measured in Stratagene Universal Human Reference RNA (SUHRRNA) under multiple conditions: with or without preamplification (Pre-Amp), with 1/5 or 1/10 typical primer concentration during pre-amplification (1/5 or 1/10 primers, respectively), or with a 100-fold dilution prior to the 2nd round of amplification (1/100 dil).
  • Figure 3 is a graph, which shows that two-step Standardized RT-PCR (StaRT-PCR) allows significant decrease of sample consumption while increasing the number of target nucleic acids that can be assayed per sample.
  • StaRT-PCR Standardized RT-PCR
  • Expression levels of fourteen genes in Stratagene Universal Human Reference RNA (SUHRRNA) were measured with and without preamplification. At least three replicate measurements were performed for all but measurement of 9SF5 with preamplification. A mixture of 96 primers was used in the preamplification step.
  • Figures 4A-4B are graphs, which show that analytical variation was less than biological variation among formalin fixed paraffin embedded (FFPE) RNA samples and matched fresh frozen (FF) RNA samples as assayed by Standardized RT-PCR (StaRT-PCR).
  • Figure 4A shows the ratio of transcript levels for two genes measured in matched pairs of formalin fixed paraffin embedded (FFPE) and in fresh frozen (FF) samples.
  • Figure 4B shows the measure of degradation as determined by the number of β-actin (ACTB) molecules obtained per 1 ng of RNA during reverse transcription, and the difference in Gene A/Gene B ratio for the matched pairs is related to this measure of RNA degradation.
  • ACTB β-actin
  • Figures 5A-5C are graphs, which show that nanoliter-scale PCR using the OpenArray® system can detect differences in gene expression between normal breast tissue samples and breast tumor samples and has less analytic variation compared to biological variation.
  • cDNA was generated by using random hexamers to amplify Total RNA (Clontech).
  • a human kinase OpenArray® plate (508 kinase genes and 13 reference genes for normalization) was loaded at 1 ng RNA equivalence per hole in LightCycler FastStart DNA Master SYBR Green I and subjected to 32 thermal cycles. Array images were collected every cycle, hole intensities plotted against cycle number, and cycle threshold (Ct) automatically calculated by Biotrove NT Cycler software.
  • Figures 5A and 5C show technical replicate performance when comparing cycle number and Ct in normal breast tissues and breast tumor samples.
  • Figure 5B shows variability between sets of matched samples when comparing tumor matched normal samples and breast tumor samples.
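The cycle threshold (Ct) mentioned in the Figure 5 description can be illustrated with a simple threshold-crossing calculation. This is only a sketch: the algorithm used by the NT Cycler software is not described in the source, so a fixed threshold with linear interpolation between cycles is assumed here, and the function name `cycle_threshold` is hypothetical.

```python
def cycle_threshold(intensities, threshold):
    """Return the fractional cycle at which the signal first crosses threshold.

    intensities[i] is the hole intensity measured after cycle i + 1.
    Returns None when the trace never rises through the threshold.
    """
    for i in range(1, len(intensities)):
        prev, cur = intensities[i - 1], intensities[i]
        if prev < threshold <= cur:
            # prev belongs to cycle i and cur to cycle i + 1 (1-based cycles).
            return i + (threshold - prev) / (cur - prev)
    return None


# Synthetic 32-cycle trace that doubles each cycle until it plateaus.
trace = [min(1000.0, 0.001 * 2 ** cycle) for cycle in range(1, 33)]
ct = cycle_threshold(trace, threshold=1.0)
print(ct)

# A flat trace (no amplification) yields no Ct.
print(cycle_threshold([0.01] * 32, threshold=1.0))
```

A lower Ct corresponds to more starting template, which is why Ct differences between matched tumor and normal samples can be read as expression differences.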
  • Figures 6A-6F show that standardized Nano Array PCR was able to detect low levels of multiple target nucleic acids and distinguish them from corresponding internal standard nucleic acids. The points on each plot represent technical replicates of initial two pre-amplification PCR. In each reaction, ten (10) copies of the native template nucleic acid or internal standard nucleic acid were used.
  • Figure 6A shows detection of Dusp6 native template (NT) (left panel) and Dusp6 internal standard (IS) (right panel).
  • Figure 6B shows detection of erbb3 native template (left panel) and erbb3 internal standard (right panel).
  • Figure 6C shows detection of lck native template (left panel) and lck internal standard (right panel).
  • Figure 6D shows detection of mmdlO native template (left panel) and mmdlO internal standard (right panel).
  • Figure 6E shows detection of stat native template (left panel) and stat internal standard (right panel).
  • Figure 6F shows detection of tbp native template (left panel) and tbp internal standard (right panel).
  • FAM on the y-axis is the raw sample fluorescence data of the native template (NT) detected using a 6-carboxyfluorescein (FAM) labeled probe and VIC on the x-axis is the raw sample fluorescence data of the internal standard (IS) detected using a VIC (proprietary probe to Applera/Applied Biosystems) labeled probe.
  • Figures 7A-7E show that one template at 10 copies can be detected by OpenArray® StaRT-PCR when up to 100 copies of a competitive template are present in the reaction.
  • Figure 7A shows the detection of 10 copies of a template in the absence of a competitive template.
  • Figure 7B shows the detection of 10 copies of a template when 1 copy of a competitive template is present in the reaction.
  • Figure 7C shows the detection of 10 copies of a template when 10 copies of a competitive template are present in the reaction.
  • Figure 7D shows the detection of 10 copies of a template when 100 copies of a competitive template are present in the reaction.
  • Figure 7E shows the results of a reaction in which neither the native template nor the competitive template nucleic acid is present in the reaction, simulating a failed PCR.
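The presence/absence logic that the reference template makes possible (target detected means present; reference detected without target means absent; neither detected means the reaction itself failed, as in Figure 7E) can be written as a small decision function. The names `Call` and `call_target` are hypothetical, and how a probe signal is judged "detected" is left outside the sketch.

```python
from enum import Enum


class Call(Enum):
    PRESENT = "target present"
    ABSENT = "target absent"
    FAILED = "reaction failed"


def call_target(target_detected: bool, reference_detected: bool) -> Call:
    """Classify one reaction from its target and reference probe results."""
    if target_detected:
        return Call.PRESENT
    if reference_detected:
        # The reference amplified, so the reaction worked and the target
        # is genuinely missing from the sample.
        return Call.ABSENT
    # Neither template amplified: no conclusion about the target is possible.
    return Call.FAILED


print(call_target(True, True).value)
print(call_target(False, True).value)
print(call_target(False, False).value)
```

The third outcome is the point of the internal standard: without it, a failed PCR would be indistinguishable from a true negative.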
  • Figure 8 is a graph, which shows that little variation was observed among replicate high density, high throughput Standardized Nano Array PCR experiments.
  • Figure inset depicts raw sample fluorescence of the data.
  • FAM on the y-axis is the raw sample fluorescence data of the native template (NT) detected using a 6-carboxyfluorescein (FAM) labeled probe and VIC on the x-axis is the raw sample fluorescence data of the internal standard (IS) detected using a VIC (proprietary probe to Applera/Applied Biosystems) labeled probe.
  • NT native template
  • FAM 6-carboxyfluorescein
  • IS internal standard
  • Kinase gene expression comparison of matched breast tumor/normal tissue.
  • cDNA was generated using random hexamers to amplify a commercially available total RNA sample (Clontech Total RNA).
  • the human kinase OpenArray® plate (608 kinase genes and 13 reference genes for normalization) was loaded at 1 ng RNA equivalence per hole in LightCycler FastStart DNA Master SYBR Green I, subjected to 32 thermal cycles with array images collected every cycle, hole intensities plotted against cycle number and cycle threshold (Ct) automatically calculated by BioTrove NT Cycler software.
  • Figures 9A and 9B are schematic diagrams showing a workflow for the detection of multiple nucleic acid targets in a single sample.
  • Figure 9A shows a SNAP workflow using a TaqMan single nucleotide polymorphism (SNP) assay in the OpenArray® platform (OA) in the detection step of the SNAP assay.
  • SNP single nucleotide polymorphism
  • OA OpenArray
  • FAM raw sample fluorescence data of the native template (NT) detected using a 6-carboxyfluorescein (FAM) labeled probe is graphed against VIC raw sample fluorescence data of the internal standard (IS) detected using a VIC (proprietary probe to Applera/Applied Biosystems) labeled probe.
  • VIC proprietary probe to Applera/Applied Biosystems
  • Figure 9B shows a SNAP workflow using Pleiades probes in the OpenArray® platform (OA) in the detection step of the SNAP assay.
  • Figure 10 is a graph that shows the amount of variation in the measurement of numbers of copies of the target sequences DUSP6, ERBB3, LCK, MMD, STAT1, and TBP1 when performed in replicate in the SNAP assay.
  • Figures 11A and 11B are graphs that show the determination of the number of copies of a biomarker over several input concentrations of cDNA by SNAP assay using TaqMan probes in the detection step (Figure 11A) or Pleiades probes in the detection step (Figure 11B).
  • the input concentration of cDNA is shown on the x-axis, and the number of copies determined from the half-maximal concentration is shown on the y-axis.
  • Figures 12A-12D are graphs that show representative real-time PCR standard curves for 4 target genes (Figure 12A, Dlg2; Figure 12B, DUSP6; Figure 12C, FRAP1; Figure 12D, HGF) from the 16 lung cancer prognostic gene panel. Data were generated using 3x serial dilutions of cDNA, dual labeled hydrolysis probes and a Roche LightCycler 480 with second derivative analysis to eliminate user bias in analysis settings.
  • Figures 13A-13C are graphs that depict data obtained in the SNAP process workflow. Samples are amplified in the presence of increasing amounts of internal standard.
  • a fluorescent melting probe is used to characterize the end product ratios between native template and internal standard (Figure 13A) as a response to increasing internal standard (legend, starting copies of internal standard).
  • the contribution of native template (blue line) and internal standard (green line) to the melting curve is deconvolved through curve fitting (Figure 13B) to estimate the internal standard and native template molar ratio.
  • Transcript abundance is derived from the EC50 of a sigmoid curve fit to fraction native template vs. log internal standard concentration (Figure 13C).
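The EC50 step described for Figure 13C can be illustrated with a minimal sketch. It assumes an idealized competitive PCR in which fraction NT follows a sigmoid in log [IS] and equals 0.5 when the internal standard and native template copy numbers are equal. The function name `estimate_ec50`, the logit linearization, and the synthetic titration values are assumptions made for illustration; the source does not specify its fitting routine, and this is not that routine.

```python
import math

def estimate_ec50(is_copies, fraction_nt):
    """Estimate EC50 (IS copies at which fraction NT = 0.5) by linear
    regression of logit(fraction NT) on log10(IS copies)."""
    xs, ys = [], []
    for copies, f in zip(is_copies, fraction_nt):
        if 0.0 < f < 1.0:  # the logit is undefined at exactly 0 or 1
            xs.append(math.log10(copies))
            ys.append(math.log10(f / (1.0 - f)))
    if len(xs) < 2:
        raise ValueError("need at least two points with 0 < fraction < 1")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The fitted line crosses logit = 0 (fraction NT = 0.5) at log10(EC50).
    return 10 ** (-intercept / slope)

# Synthetic titration (hypothetical): 5,000 native copies measured against
# 10^1 to 10^8 internal standard copies, assuming equal amplification efficiency.
true_nt = 5000.0
is_series = [10.0 ** k for k in range(1, 9)]
fractions = [true_nt / (true_nt + c) for c in is_series]
ec50 = estimate_ec50(is_series, fractions)
```

With noise-free synthetic data the estimate returns the simulated native template copy number. With real melt-curve data a nonlinear sigmoid fit would likely be preferable, because the logit transform amplifies noise near fractions of 0 and 1.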
  • Figure 14 is a graph showing precision of results obtained by the SNAP assay.
  • Six half-log serial dilutions of lung tumor cDNA (160 to 0.50 ng) were prepared, and three aliquots were frozen. A dilution series was thawed and distributed into eight amplification tubes containing mastermix, log dilutions of internal standards (10^8 to 10^1 copies per tube), and 80 nM primer pairs for the assays indicated in the table. Following 34 thermal cycles, PCR products were diluted 1000-fold into mastermix and transferred into an OpenArray plate in which each through-hole was preloaded with a single assay (primers and a target-specific Pleiades probe). The plate was put through 30 thermal cycles in the OpenArray NT Cycler, and end products were measured by melting curve analysis.
  • Melt curve data were converted into EC50 values using a MATLAB script that uses the raw melting curve data to calculate the ratio of internal standard to native template products, then calculates [NT] from the EC50 derived from a sigmoid curve fit to a plot of fraction NT vs. log [IS]. Values in the table were derived from the average of three sample replicates; linear regression (plot lines) was used to calculate slope and R^2. The CV for each assay was calculated by normalizing the transcript abundance for each sample to the five reference genes (underlined assays), and then combining all results from samples in which total input cDNA was greater than 1000 starting copies.
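The normalization and CV calculation can be sketched as follows. The source states only that transcript abundance was normalized to the five reference genes; dividing by the geometric mean of the reference genes is an assumption made here, and the gene names and copy numbers are hypothetical.

```python
import math
import statistics

def normalize_to_references(sample, reference_genes):
    """Divide each gene's copy number by the geometric mean of the
    reference genes measured in the same sample (assumed scheme)."""
    ref_logs = [math.log(sample[g]) for g in reference_genes]
    ref_geomean = math.exp(sum(ref_logs) / len(ref_logs))
    return {gene: copies / ref_geomean for gene, copies in sample.items()}

def percent_cv(values):
    """Coefficient of variation (sample standard deviation / mean), in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical copy numbers: the same cDNA measured at two input amounts (2x apart).
replicates = [
    {"TARGET": 1200.0, "REF1": 4000.0, "REF2": 1000.0},
    {"TARGET": 2400.0, "REF1": 8000.0, "REF2": 2000.0},
]
refs = ["REF1", "REF2"]
normalized = [normalize_to_references(s, refs)["TARGET"] for s in replicates]
cv = percent_cv(normalized)
```

Because the reference genes scale with input in this made-up example, the normalized target value is the same at both input amounts and the CV is essentially zero, which is the behavior normalization is meant to produce.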
  • Figure 15 is a graph showing precision of results obtained by the SNAP assay from FFPE sample.
  • Lung tumor FFPE RNA was isolated and converted into cDNA.
  • Three serial dilutions (120, 60 and 30ng) of cDNA (RNA equivalence) were measured by SNAP.
  • the plot shows the transcript abundance (y-axis) of three sample replicates for each assay (x-axis).
  • Figure 17 is a graph showing a hypothetical example of how a prediction interval may be used to identify a range of risk scores (Gene Signature) that may be classified as inconclusive.
  • Figure 18 is a graph showing the effect of sample number and assay replicates on the lower 95% confidence bound of the ICC.
  • Figure 19 depicts a two sigmoid curve equation.
  • the molar ratio between IS and NT is estimated by fitting a curve to the top equation.
  • Tm_IS and Tm_NT refer to the melting points of the IS and NT products.
  • TOP_IS, TOP_NT, Bottom_IS, and Bottom_NT refer to the maximum and minimum fluorescence of each sigmoid curve.
  • the TOP_IS, Tm_IS, Tm_NT, HILLSLOPE_IS, and HILLSLOPE_NT parameters are fixed, and the solver reduces the residuals by fitting the melting curve data to Bottom_NT and Bottom_IS.
  • the middle equation is used to generate the fraction NT in the sample.
  • the bottom equation is used to estimate the S/N for an assay.
  • F_NT(NT) and F_NT(IS) indicate fraction NT results for replicate NT or IS samples, respectively.
  • RMS, root mean squares; STD, standard deviation; Mean, average.
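The fitting strategy described for Figure 19 (hold the Tm, top, and Hill slope parameters fixed and fit only the two bottoms) can be sketched as below. The equations of Figure 19 are not reproduced in the text, so this block rests on several assumptions: each component is a standard variable-slope sigmoid, the melting curve is the sum of the IS and NT components, the NT top is fixed along with the IS top, and fraction NT is taken as the NT share of the two fitted amplitudes. All parameter values are hypothetical. Under these assumptions the model is linear in the two bottoms, so ordinary least squares with a 2x2 solve is sufficient.

```python
def sigmoid(temp, tm, hill):
    """Variable-slope sigmoid; falls from 1 to 0 with temperature when hill < 0."""
    return 1.0 / (1.0 + 10.0 ** ((tm - temp) * hill))

def two_sigmoid(temp, p):
    """Sum of the IS and NT sigmoid components at one temperature."""
    total = 0.0
    for k in ("is", "nt"):
        s = sigmoid(temp, p["tm_" + k], p["hill_" + k])
        total += p["bottom_" + k] + (p["top_" + k] - p["bottom_" + k]) * s
    return total

def fit_bottoms(temps, signal, fixed):
    """Least-squares estimate of (bottom_is, bottom_nt) with all other parameters
    fixed, using F = top_is*s_is + top_nt*s_nt + bottom_is*(1-s_is) + bottom_nt*(1-s_nt)."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for t, f in zip(temps, signal):
        s_is = sigmoid(t, fixed["tm_is"], fixed["hill_is"])
        s_nt = sigmoid(t, fixed["tm_nt"], fixed["hill_nt"])
        x1 = 1.0 - s_is
        x2 = 1.0 - s_nt
        y = f - fixed["top_is"] * s_is - fixed["top_nt"] * s_nt
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b1 += x1 * y
        b2 += x2 * y
    det = a11 * a22 - a12 * a12  # normal equations solved by Cramer's rule
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

def fraction_nt(p):
    """NT share of the two fitted amplitudes (an assumed definition)."""
    amp_nt = p["top_nt"] - p["bottom_nt"]
    amp_is = p["top_is"] - p["bottom_is"]
    return amp_nt / (amp_nt + amp_is)

# Synthetic, noise-free melting curve with hypothetical parameters.
truth = {"top_is": 1.0, "top_nt": 1.0, "tm_is": 55.0, "tm_nt": 62.0,
         "hill_is": -0.5, "hill_nt": -0.5, "bottom_is": 0.7, "bottom_nt": 0.4}
temps = [40.0 + 0.5 * i for i in range(81)]
signal = [two_sigmoid(t, truth) for t in temps]
fixed = {k: v for k, v in truth.items() if not k.startswith("bottom")}
b_is, b_nt = fit_bottoms(temps, signal, fixed)
frac = fraction_nt(dict(fixed, bottom_is=b_is, bottom_nt=b_nt))
```

Fixing the shape parameters leaves only two free values per curve, which keeps the fit stable even when one component contributes little signal.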
  • the present invention features compositions and methods that provide for gene expression profiling using a pre-amplification step that enhances detection of multiple nucleic acid targets in a biologic sample, in a nanoplatform system.
  • the present invention provides for the quantitative measurement of gene expression in a low yield test sample and minimizes instrument-to-instrument variation.
  • Gene expression profiles generated in accordance with the methods of the invention are useful for the diagnosis, monitoring, or characterization of virtually any disease characterized by an alteration in gene expression including, for example, neoplasia, inflammation, and a variety of infectious diseases.
  • the invention is based, at least in part, on the discovery that a pre-amplification step, which provides for the enhanced detection of alterations in gene expression in a low yield test sample, can be used to enhance the number of transcripts of each gene in that sample. The number of such transcripts can then be measured relative to a known number of internal standard molecules within a standardized mixture of internal standards (SMIS) on a nanofluidic PCR platform.
  • the invention enables multivariate quantitative competitive PCR assays (e.g., StaRT-PCR) in a simplified and streamlined workflow with substantial reductions in reagent and sample consumption, leading to low cost, reliable, and accurate clinical analyses.
  • the invention employs a nanofluidic system that comprises a high density array of nanoliter-scale through-holes or chambers for implementing a large number (e.g., at least about 500, 1000, 2000, 3000, 4000, 5000) of PCR analyses in less than about a microliter of fluid.
  • the invention employs the BioTrove nanofluidic system, a high density array of nanoliter-scale through-holes or chambers for implementing up to 3072 PCR analyses with 33 nl per reaction on an array the size of a microscope slide.
  • Such arrays are described, for example, by U.S. Patent No. 6,716,629, which is incorporated herein by reference.
  • the OpenArray (R) plate is a steel platen that comprises 3072 through-holes having a diameter of about 320 µm. Each of the through-holes is treated with a polymer to make the inside surface of each hole hydrophilic and the exterior surface hydrophobic. Liquid is dispensed and retained in each through-hole by means of surface force differentials between the liquid surface tension and the polymer coatings. The through-holes are grouped in forty-eight subarrays of sixty-four through-holes each. The spacing between each subarray is about 4.5 mm.
  • the invention provides a platen comprising a high density array of nanoliter-scale through-holes or chambers comprising less than about 1000 nl, 750 nl, 500 nl, 250 nl, 100 nl, or even 50 nl of the reagents and samples for PCR analyses.
  • Methods for loading the array with a small volume of reagents are described, for example, in U.S. Patent Nos. 6,716,629 and 6,812,030, and in U.S. Patent Publication Nos. 20080108112, 20030180807, and 20030124716.
  • the hydrophobic exterior surface of the platen is not wetted, keeping the liquid in each through-hole isolated from its neighbor.
  • PCR arrays are preloaded with PCR primers and probes.
  • Such reagents are typically transferred from 384-well plates into the through-holes with an array of 48 pins manipulated by a 4-axis robot, such that each through-hole of an OpenArray (R) plate has a different primer set.
  • the solvent is then removed resulting in the primers or primer/probes being immobilized on the inside surface of each hole.
  • Co-loading of a passive fluorescent reference dye allows detection of holes that failed to load assay.
  • the arrays are readily configurable as the assay configuration is based on the 384-well source plate layout.
  • the 3072 holes of the OpenArray (R) plate may be configured based on analytical needs; for example a sample can be interrogated by 16, 32, 64, multiples of 64, up to 3072 assays.
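The plate arithmetic behind these configurations can be worked through in a few lines. The counts (3072 through-holes, 48 subarrays of 64, 33 nl per reaction) come from the text; the assumption that each assay occupies exactly one through-hole per sample, with no replicates or control holes, is a simplification made for this sketch.

```python
THROUGH_HOLES = 3072
SUBARRAYS = 48
HOLES_PER_SUBARRAY = 64
REACTION_VOLUME_NL = 33

def samples_per_plate(assays_per_sample):
    """Samples that fit on one plate when each sample is interrogated by
    `assays_per_sample` assays, one through-hole per assay (assumed)."""
    if assays_per_sample <= 0:
        raise ValueError("assay count must be positive")
    return THROUGH_HOLES // assays_per_sample

# Configurations named in the text, plus one multiple of 64.
layout = {n: samples_per_plate(n) for n in (16, 32, 64, 128, 3072)}
total_volume_ul = THROUGH_HOLES * REACTION_VOLUME_NL / 1000.0
```

Under that assumption, 16 assays per sample corresponds to 192 samples per plate, 64 assays to 48 samples (one per subarray), and a fully loaded plate holds about 101 µl of reaction mix in total.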
  • a pair of detectable target probes each of which is complementary to a target nucleic acid sequence, and capable of amplifying that sequence is used in combination with a pair of detectable reference probes, each of which is complementary to a competitive template internal standard nucleic acid sequence and capable of amplifying that standard sequence.
  • each primer pair coamplifies a native template and its respective competitive internal standard template with equal efficiency.
  • the target and reference probes are detectably labeled.
  • the detectable target and reference probes each comprises a distinct fluorometric dye that provides for the separate detection of amplified target and internal standards.
  • the amplified target and internal standards are detected using a TaqMan two fluorescent dye assay to quantitatively measure the endpoint ratio between native template and internal standard.
  • the TaqMan assay uses two hybridization probes; each probe has a unique fluorescently quenched dye and specifically hybridizes to a PCR template sequence, as described by Livak et al. ("Allelic discrimination using fluorogenic probes and the 5' nuclease assay," Genet Anal. 1999 Feb;14(5-6):143-9), which is incorporated by reference in its entirety.
  • the hybridized probe is digested by the exonuclease activity of the Taq polymerase, resulting in release of the fluorescent dye specific for that probe.
  • the amplified target and internal standards may also be detected using a Pleiades fluorescent probe detection assay to quantitatively measure the ratio between native template and the melting temperature of the first detectable probe.
  • the Pleiades assay uses a hybridization probe; each probe specifically hybridizes to a target DNA sequence and has a fluorescent dye at the 5' terminus, which is quenched by the interactions of a 3' quencher and a 5' minor groove binder (MGB) when the probe is not hybridized to the target DNA sequence, as described by Lukhtanov et al. ("Novel DNA probes with low background and high hybridization-triggered fluorescence," Nucl. Acids Res. 2007 Jan;35(5):e30), which is incorporated by reference in its entirety.
  • the fluorescent emissions from the released dyes reflect the molar ratio of the sample.
  • Methods for assaying such emissions are known in the art, and described, for example, by Fabienne Hermitte, "Myeloproliferative Biomarkers", Molecular Diagnostic World Congress, 2007.
  • Standardized reverse transcription PCR (StaRT-PCR (TM)) was developed with the goal of optimizing gene transcript measurement.
  • StaRT-PCR assays have a sensitive detection threshold ( ⁇ 10 molecules/assay) and signal-to-analyte response (100%), high precision (mean gene copy CV across all genes was 6% and 3.2% with > 6000 starting RNA copy number), and a large linear dynamic range (> 6 orders of magnitude, the full range of gene expression in the MAQC samples) (Shi et al., "The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements." Nat Biotechnol 2006 September;24(9):1151-61; Shippy et al., "Using RNA sample titrations to assess microarray platform performance and normalization techniques." Nat Biotechnol.
  • Sample aliquots are added to a series of tubes (2, 3, 4, 5, 6, 7, 8, 9, 10) containing increasing numbers of copies of synthetic competitive template internal standard, and primers. Each primer pair coamplifies a native template and its respective competitive internal standard template with equal efficiency.
  • StaRT-PCR controls for all known sources of variation, including inter-sample variation in loading due to pipetting, interfering substances such as PCR inhibitors, inter-gene variation in amplification efficiency, and false negatives.
  • StaRT-PCR has been used successfully to identify patterns of gene expression associated with diagnosis of lung cancer (Warner et al., J Mol Diagn 2003 August;5(3):176-83), risk of lung cancer (Crawford et al., Carcinogenesis 2007 December;28(12):2552-9), pulmonary sarcoidosis (Allen et al., Am J Respir Cell Mol Biol 1999 December;21(6):693-700), cystic fibrosis (Loitsch et al., Clin Chem 1999 May;45(5):619-24), chemoresistance in lung cancer (Harr et al., Mol Cancer 2005;4:23; Weaver et al., Mol Cancer 2005;4(1):18), childhood leukemias (Rots et al., Leukemia 2000 December;14(12):2166-75), staging of bladder cancer (Mitra et al., BMC Cancer 2006;6:159), and to develop databases of normal range of expression of
  • the primers of the invention embrace oligonucleotides of sufficient length and appropriate sequence so as to provide specific initiation of polymerization on a significant number of nucleic acids in the polymorphic locus.
  • the term "primer" as used herein refers to a sequence comprising two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and most preferably more than 8, which sequence is capable of initiating synthesis of a primer extension product, which is substantially complementary to a polymorphic locus strand.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent for polymerization. The exact length of primer will depend on many factors, including temperature, buffer, and nucleotide composition.
  • the oligonucleotide primer typically contains between 12 and 27 or more nucleotides, although it may contain fewer nucleotides.
  • Primers of the invention are designed to be "substantially" complementary to each strand of the genomic locus to be amplified and include the appropriate G or C nucleotides as discussed above. This means that the primers must be sufficiently complementary to hybridize with their respective strands under conditions that allow the agent for polymerization to perform. In other words, the primers should have sufficient complementarity with the 5' and 3' flanking sequences to hybridize therewith and permit amplification of the genomic locus. While exemplary primers are provided herein, it is understood that any primer that hybridizes with the target sequences of the invention is useful in the method of the invention for detecting a target nucleic acid.
  • the target nucleic acid may be present in a sample, e.g., clinical samples and biological samples. If high quality clinical samples are not used, amplification primers are designed to recognize shorter target sequences. For example, because RNA extracted from FFPE samples is typically fragmented, primers may be designed using criteria for FFPE sample amplification as described by Cronin et al. ("Measurement of Gene Expression in Archival Paraffin-Embedded Tissues." AJP 2004, 164(1): 35-42). Because homogeneous product sizes have better inter-transcript correlation for degraded samples, the PCR product sizes are 70-85 base pairs. Primer Tm is about 60 +/- 1 °C.
  • Amplification primers are compared by homology against the human transcriptome to ensure binding specificity. Despite the use of DNAse in the RNA purification protocol, primers are designed, when possible, to span RNA intron/exon splice junctions. Amplification of genomic contaminants will therefore be inhibited by failure to produce full length products (typically >6 kb).
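The size and Tm criteria stated above for degraded samples can be expressed as a simple screening check. This is a sketch only: the `AssayDesign` record and the candidate designs are hypothetical, and primer Tm values are assumed to come from whatever primer design tool is in use rather than being computed here.

```python
from dataclasses import dataclass

@dataclass
class AssayDesign:
    name: str
    amplicon_bp: int
    forward_tm_c: float
    reverse_tm_c: float

def meets_ffpe_criteria(design, min_bp=70, max_bp=85, target_tm_c=60.0, tm_tol_c=1.0):
    """True if the amplicon is 70-85 bp and both primer Tm values are
    within 1 degree C of 60 degrees C (defaults taken from the text)."""
    size_ok = min_bp <= design.amplicon_bp <= max_bp
    tm_ok = all(
        abs(tm - target_tm_c) <= tm_tol_c
        for tm in (design.forward_tm_c, design.reverse_tm_c)
    )
    return size_ok and tm_ok

# Hypothetical candidate designs.
designs = [
    AssayDesign("assay_ok", 78, 59.6, 60.4),
    AssayDesign("assay_long", 140, 60.0, 60.1),
    AssayDesign("assay_tm", 80, 57.5, 60.0),
]
passing = [d.name for d in designs if meets_ffpe_criteria(d)]
```

Specificity screening against the transcriptome and placement across splice junctions are separate steps that this check does not cover.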
  • the respective synthetic internal standard will match the native template in all but 1, 2, or 3 nucleotides within the probe binding sequence of the native nucleic acid molecule or biomarker.
  • the probe sequence for the internal standard will be based on this rearrangement, and therefore is predicted to bind only to the internal standard sequence, but not the corresponding native template.
  • Multiple internal standards are formulated into a mixture that contains the internal standards at a defined concentration or number of molecules of the internal standards.
  • such internal standards are also referred to as a "defined reference nucleic acid molecule", having a known concentration of the nucleic acid molecule or a known number of nucleic acid molecules.
  • each internal standard is synthesized in mg quantity, quantified by Hoechst dye fluorometry, and stored in TE buffer. qPCR in which each native template is measured relative to a known number of internal standard molecules requires that the native template and internal standard molecule numbers be within a 100-fold range of each other. Because genes may be expressed over 6 orders of magnitude, SMIS are prepared so that expression of each gene can be compared regardless of expression level. To accomplish this, the internal standards for all genes are mixed together at the same concentration, then serially diluted (e.g., 10-fold in TE).
  • the highest concentration of internal standard in working solution SMIS is 100-fold above the highest copy number estimated by qPCR (e.g., normalized to 10 ng RNA).
  • serial dilutions (e.g., five 10-fold serial dilutions in TE) are prepared.
  • SMIS mixtures are further diluted to working stocks (e.g., 10X) prior to addition (e.g., 9 µl); using all SMIS (e.g., 3, 4, 5, 6, or more) enables measurement of each gene transcript relative to a known quantity of its respective internal standard, while simultaneously enabling reliable comparison of the measurement for each gene to that for each other gene.
  • Use of SMIS enables reliable, reproducible measurement through two rounds of PCR amplification, including a pre-amplification, which benefits low yield test samples (e.g., FFPE).
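The reasoning behind the SMIS dilution series can be made concrete with a short sketch: build a 10-fold series and find which mixtures place the internal standard within 100-fold of a given native template abundance. The function names and the copy numbers are hypothetical; the 10-fold steps, five dilutions, and 100-fold window are taken from the text.

```python
def smis_series(top_copies, n_dilutions=5, fold=10):
    """Copies of each internal standard per reaction in the top SMIS and in
    n_dilutions successive fold-dilutions of it."""
    return [top_copies / fold ** i for i in range(n_dilutions + 1)]

def usable_mixtures(native_copies, series, max_fold=100.0):
    """Mixtures whose internal standard copy number lies within max_fold
    of the native template copy number."""
    return [
        c for c in series
        if max(c, native_copies) / min(c, native_copies) <= max_fold
    ]

series = smis_series(1e7)                    # 1e7, 1e6, ..., 1e2 copies
for_rare = usable_mixtures(50.0, series)     # a low-abundance transcript
for_abundant = usable_mixtures(2e6, series)  # a high-abundance transcript
```

In this example the rare transcript can be measured against the two most dilute mixtures and the abundant transcript against the three most concentrated ones, which is how one mixture series covers genes expressed over several orders of magnitude.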
  • a PCR product (i.e., amplicon) or real-time PCR product is detected by probe binding.
  • probe binding generates a fluorescent signal, for example, by coupling a fluorogenic dye molecule and a quencher moiety to the same or different oligonucleotide substrates (e.g., TaqMan® (Applied Biosystems, Foster City, CA, USA), Pleiades (Nanogen, Inc., Bothell, WA, USA), Molecular Beacons (see, for example, Tyagi et al, Nature Biotechnology 14(3):303-8, 1996), Scorpions® (Molecular Probes Inc., Eugene, OR, USA)).
  • a PCR product is detected by the binding of a fluorogenic dye that emits a fluorescent signal upon binding (e.g., SYBR® Green (Molecular Probes)).
  • Such detection methods are useful for the detection of a target specific PCR product.
  • concentration of the native template is calculated from the ratio (native template: internal standard template) versus known copies of internal standard included in the reaction.
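For a single reaction, the calculation reduces to scaling the known internal standard copy number by the measured product ratio. The sketch below assumes that the two signals are proportional to the molar amounts of the two products and that both templates amplify with equal efficiency; the function name and the numbers are illustrative only.

```python
def native_template_copies(nt_signal, is_signal, is_copies):
    """Starting native template copies from the NT:IS product ratio and the
    known number of internal standard copies added to the reaction."""
    if is_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    return (nt_signal / is_signal) * is_copies

# Hypothetical reaction: NT signal is twice the IS signal, 1000 IS copies added.
copies = native_template_copies(nt_signal=3.0, is_signal=1.5, is_copies=1000)
```

Here the 2:1 ratio against 1000 internal standard copies corresponds to 2000 starting copies of native template.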
  • StaRT-PCR controls for all known sources of variation, including inter-sample variation in loading due to pipetting, interfering substances, such as PCR inhibitors, inter-gene variation in amplification efficiency, and false negatives.
  • the present invention provides compositions and methods for carrying out StaRT-PCR and other analytic methods that involve the preamplification of cDNA in the presence of a standardized mixture of internal standards on a nanoliter scale.
  • the use of the preamplification step markedly reduces the amounts of starting sample (e.g., cDNA) and reagents required for each PCR reaction.
  • measuring each gene relative to a known number of internal standard molecules within a standardized mixture of internal standards in each reaction controls for unpredictable inter-sample variation in the efficiency of pre-amplification caused by reagent consumption, PCR inhibitors, and/or product inhibition.
  • a standardized mixture of internal standards controls for preferential amplification of one transcript over another due to differences in amplification efficiencies.
  • the use of nanofluidic technology in combination with pre-amplification with multiple sets of primers and internal standards in the same reaction provides for the measurement of many genes (>100) using the RNA quantity normally required for six measurements. This allows for higher throughput that is virtually unrestricted by RNA input.
  • a sample is split into a number (2, 4, 6, 8, 10, 20) of tubes comprising standardized mixtures of internal standards. Each of these tubes then undergoes up to 35 cycles of multiplexed pre-amplification prior to individual analyte detection.
  • the RNA quantity required for 6 typical real-time PCR assays is increased such that the amount of RNA after pre-amplification is sufficient for the detection of target nucleic acid molecules in at least about 100, 200, 300, 400, or 500 assays.
  • the resulting increased gene target concentration eliminates issues of insufficient message when using nanofluidic measurement methods. Overall, the methods described herein provide robustness and quality controls lacking in current real-time qPCR and hybridization array approaches.
  • the native and internal standard targets can be detected using any method known in the art. For example, as little as a 10% size difference between analyte and internal standard templates allows them to be quantified individually by size separation methods (e.g., capillary electrophoresis, agarose gel mobility, HPLC, or MALDI-TOF). Because such methods may be cumbersome when measuring tens to hundreds of targets, signal detection may be carried out using a two-color fluorometric dye system. Specifically, the signals from the two fluorophores correspond to the relative molar ratio of native template and internal standard.
  • Measuring each gene relative to a known number of internal standard templates enables reliable quantitative endpoint or real time measurement.
  • 1, 2, or 3 nucleotide differences in the nucleotide sequence between the internal standard and native template will provide sufficient specificity difference for TaqMan or Pleiades probe discrimination.
  • Such 1, 2, or 3 nucleotide differences at the internal standard probe binding site include deletions, insertions, or changing the order of the 2-3 nucleotides relative to the native template.
  • a 2 nucleotide change in the internal standard relative to the target sequence provides enough similarity so that it has the same PCR kinetics as the target sequence, but distinguishes the internal standard enough to provide probe discrimination in TaqMan and Pleiades based assays.
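One of the changes described above, reversing the order of two nucleotides inside the probe-binding site, can be sketched as follows. The sequence, the probe coordinates, and the choice to swap the differing adjacent pair closest to the middle of the site are all assumptions made for illustration; the source does not prescribe where within the site the change is placed.

```python
def make_internal_standard(native, probe_start, probe_end):
    """Return a copy of `native` in which two adjacent, non-identical
    nucleotides near the middle of the probe-binding site
    native[probe_start:probe_end] are swapped."""
    mid = (probe_start + probe_end) // 2
    # Search outward from the centre for an adjacent pair that differs.
    candidates = sorted(range(probe_start, probe_end - 1), key=lambda i: abs(i - mid))
    for i in candidates:
        if native[i] != native[i + 1]:
            return native[:i] + native[i + 1] + native[i] + native[i + 2:]
    raise ValueError("probe site is a homopolymer; no swap changes the sequence")

def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

native = "ACGTTGCAAGGCTTACCGATGCA"  # made-up sequence
internal = make_internal_standard(native, probe_start=6, probe_end=18)
n_diff = mismatches(native, internal)
```

The swap leaves length and base composition unchanged and alters exactly two positions, which fits the stated aim of keeping PCR kinetics similar while giving the probes something to discriminate.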
  • the invention provides a high-density array of 33 nanoliter through-holes designed for parallel PCR detection of multiple genomic targets. These through-holes are pre-loaded with specific amplification primers and detectable probes. Each through hole measures the analyte: internal standard ratio for one PCR target based on the ratio of signal from the two TaqMan or Pleiades probes.
  • the present invention can be employed to measure gene expression or a gene expression profile in a biological sample.
  • the methods of the invention require much less starting material than conventional diagnostic methods and may be employed to measure gene expression of biomarkers in blood or other tissues.
  • the invention provides for the identification of patterns of gene expression useful in virtually any clinical setting where conventional methods of analysis are used.
  • the present methods provide for the analysis of biomarkers associated with lung cancer (Warner et al., J Mol Diagn 2003;5:176-83), risk of lung cancer (Crawford et al., Cancer Res 2000;60:1609-18), pulmonary sarcoidosis (Allen et al., Am. J. Respir. Cell. Mol. Biol.
  • samples of suspected cancerous lesions in the lung, breast, prostate, thyroid, and pancreas are commonly obtained by fine needle aspirate (FNA) biopsy. These samples often comprise fewer than 100 cells.
  • samples from anatomically small, but functionally important tissues of the brain, developing embryo, and animal models including laser capture micro-dissected samples.
  • Measurement of gene expression profiles in rare cells, such as circulating tumor cells (CTC) enriched from flow-sorted cell populations will also potentially benefit from this technique.
  • the biologic sample is a tissue sample that includes cells of a tissue or organ (e.g., lung, breast, prostatic tissue cells). Such tissue is obtained, for example, from a biopsy of the tissue or organ.
  • the biologic sample is a biologic fluid sample.
  • Biological fluid samples include blood, blood serum, plasma, urine, seminal fluids, and ejaculate, or any other biological fluid useful in the methods of the invention.
  • the tissue sample is a cytologic fine needle aspirate biopsy or formalin fixed paraffin embedded tissue. Use of the methods of the invention is particularly advantageous for such samples, where RNA often is limited by sample size or degradation.
  • the present invention provides a number of diagnostic assays that are useful for characterizing the gene expression profile of a biological sample.
  • the invention provides methods for the detection of alterations in gene expression associated with neoplasia.
  • the invention provides for the characterization of a gene expression profile from a sample that contains very little genetic material. This provides an advantage over conventional methods for assaying gene expression, which require 10-100 times as much starting material to reliably detect alterations in gene expression.
  • the size of biopsies obtained in many clinical situations is small and includes minimal amounts of genetic material. For example, samples of suspected cancerous lesions in the lung, breast, prostate, thyroid, and pancreas, are commonly obtained by fine needle aspirate (FNA) biopsy. These samples often comprise fewer than 100 cells.
  • the invention provides for the detection of genes listed in Table 1 (below).
  • the invention provides for the detection and diagnosis of a pathogen in a biological sample.
  • a variety of bacterial and viral pathogens may be detected using the system and methods of the invention.
  • Exemplary bacterial pathogens include, but are not limited to, Aerobacter, Aeromonas, Acinetobacter, Actinomyces israelii, Agrobacterium, Bacillus, Bacillus anthracis, Bacteroides, Bartonella, Bordetella, Bortella, Borrelia, Brucella, Burkholderia, Calymmatobacterium, Campylobacter, Citrobacter, Clostridium, Clostridium perfringens, Clostridium tetani, Corynebacterium, Corynebacterium diphtheriae, Corynebacterium sp., Enterobacter, Enterobacter aerogenes, Enterococcus, Erysipelothrix rhusiopathiae, Escherichia, Francisella, Fusobacter
  • Retroviridae (e.g. human immunodeficiency viruses, such as HIV-1 (also referred to as HTLV-III, LAV or HTLV-III/LAV, or HIV-III); and other isolates, such as HIV-LP); Picornaviridae (e.g. polio viruses, hepatitis A virus; enteroviruses, human Coxsackie viruses, rhinoviruses, echoviruses); Caliciviridae (e.g. strains that cause gastroenteritis); Togaviridae (e.g. equine encephalitis viruses, rubella viruses); Flaviviridae (e.g.
  • Coronaviridae (e.g. coronaviruses)
  • Rhabdoviridae (e.g. vesicular stomatitis viruses, rabies viruses)
  • Filoviridae (e.g. ebola viruses)
  • Paramyxoviridae (e.g. parainfluenza viruses, mumps virus, measles virus, respiratory syncytial virus)
  • Orthomyxoviridae (e.g. influenza viruses)
  • Bunyaviridae e.g.
  • Hepadnaviridae (Hepatitis B virus)
  • Parvoviridae (parvoviruses)
  • Papovaviridae (papilloma viruses, polyoma viruses)
  • Adenoviridae (most adenoviruses)
  • Herpesviridae (herpes simplex virus (HSV) 1 and 2, varicella zoster virus, cytomegalovirus (CMV), herpes virus)
  • Poxviridae (variola viruses, vaccinia viruses, pox viruses)
  • Iridoviridae (e.g. African swine fever virus)
  • unclassified viruses e.g.
  • Other infectious organisms (i.e., protists) include Plasmodium spp. such as Plasmodium falciparum, Plasmodium malariae, Plasmodium ovale, and Plasmodium vivax, and Toxoplasma gondii.
  • Blood-borne and/or tissue parasites include Plasmodium spp., Babesia microti, Babesia divergens, Leishmania tropica, Leishmania spp., Leishmania braziliensis, Leishmania donovani, Trypanosoma gambiense and Trypanosoma rhodesiense (African sleeping sickness), Trypanosoma cruzi (Chagas' disease), and Toxoplasma gondii.
  • kits for the detection of a gene expression profile are useful for the diagnosis, characterization, or monitoring of a neoplasia in a biological sample obtained from a subject.
  • the invention provides for the detection of a pathogen gene or genes in a biological sample.
  • the kit includes at least one primer pair that identifies a target sequence, together with instructions for using the primers to identify a gene expression profile in a biological sample.
  • the primers are provided in combination with a standardized mixture of internal standards on a nanofluidic PCR platform (e.g., a high density array).
  • the kit further comprises a pair of primers capable of binding to and amplifying a reference sequence.
  • the kit comprises a sterile container which contains the primers; such containers can be boxes, ampules, bottles, vials, tubes, bags, pouches, blister-packs, or other suitable container form known in the art.
  • Such containers can be made of plastic, glass, laminated paper, metal foil, or other materials suitable for holding nucleic acids.
  • the instructions will generally include information about the use of the compositions of the invention in detecting a gene expression profile.
  • the gene expression profile diagnoses or characterizes a neoplasia.
  • the kit further comprises any one or more of the reagents useful for an analytical method described herein (e.g., standardized reverse transcriptase PCR).
  • the instructions include at least one of the following: descriptions of the primer; methods for using the enclosed materials for the diagnosis of a neoplasia; precautions; warnings; indications; clinical or research studies; and/or references.
  • the instructions may be printed directly on the container (when present), or as a label applied to the container, or as a separate sheet, pamphlet, card, or folder supplied in or with the container.
  • Example 1 Preamplification of genes expressed at low levels resulted in sensitive detection of target gene expression.
  • SNAP Standardized Nanoliter Array PCR
  • a multiplex preamplification step is performed before samples are loaded into the nanofluidic array for a second round of singleplex PCR.
  • SEM Standardized Expression Measurement
  • Levels of three genes expressed at low level were measured in a commercially available reference RNA sample (Stratagene Universal Human Reference RNA; SUHRRNA), under multiple conditions ( Figure 2).
  • the conditions included the typical no pre-amplification method as a control, or pre-amplification with a 96-gene primer mixture.
  • Two different primer concentrations (1/6 or 1/10 usual concentration) were used in preamplification.
  • Preamplification of PCR products provides a sufficient concentration of cDNA for thousands of individual quantitation reactions.
  • the preamplification PCR products were diluted 10- or 100-fold prior to the second round of PCR.
  • the measured expression levels for the experiments included a 1:100 dilution of preamplification products. Both 10-fold and 100-fold dilutions of the preamplification products produced similar variability in the measured transcript levels.
  • Preamplification protocols dramatically increase the number of transcript markers that can be measured in a fixed amount of cDNA by markedly reducing the amount of cDNA consumed per assay.
  • Sample size is particularly limited for clinical samples, including those derived from fine needle aspirate (FNA) biopsies or formalin fixed paraffin embedded (FFPE) material.
  • Example 2 StaRT-PCR detected differences in gene expression in Formalin Fixed Paraffin Embedded (FFPE) Samples.
  • Standardized RT-PCR (StaRT-PCR) analysis was performed on formalin fixed paraffin embedded (FFPE) RNA samples and matched fresh frozen (FF) RNA samples (Figure 4A). Seven (7) cell line cultures were split and either frozen or formalin fixed to obtain pairs of matched FFPE and FF RNA samples.
  • the biomarker was a ratio of Gene A/Gene B.
  • the expected ratio of the FFPE Gene A/Gene B value to the FF Gene A/Gene B value is about 1.0 for every matched pair.
  • In Figure 4A, the ratio of the FFPE Gene A/Gene B value to the FF Gene A/Gene B value varied four-fold, from 0.4 in the second matched pair to 1.2 in the sixth matched pair.
  • Example 3 Nanoliter-scale PCR using the OpenArray (R) system Detected Differences In Gene Expression Between Normal Breast Tissue Samples And Breast Tumor Samples.
  • OpenArray (R) nanofluidic technology is desirable for diverse applications because the technology uses fewer resources than other methods. OpenArray (R) nanoliter reaction volumes and streamlined analytical workflow greatly increase the number of analyses one can make and reduce the cost of qPCR analysis per sample when compared to a microplate.
  • OpenArray (R) technology was used for the real-time (qPCR) measurement of 608 human kinase genes in matched breast tumor/normal samples (Figures 5A-5C). Comparing cycle number and cycle threshold (Ct) showed good correlation in normal breast tissues and breast tumor samples, indicating technical replicate performance (Figures 5A and 5C). Comparing tumor matched normal samples and breast tumor samples indicated detectable variability between sets of matched samples (Figure 5B). These results clearly indicated that the biological difference between tumor and normal samples is far greater than the analytic variation of the OpenArray (R) platform.
  • Two 6 µl PCR reactions containing six primer pairs at 80 nM and either 10 starting copies of all six internal standard (IS) nucleic acids or 10 starting copies of all six native template (NT) nucleic acids were prepared, cycled 36 times (multiplex preamplification), diluted 1000-fold in mastermix, and applied to an OpenArray (R) plate, which is manufactured with each hole containing an individual TaqMan SNP assay for one of the six gene targets (i.e., IS/NT detection is spatially multiplexed).
  • Each primer pair was optimized for SYBR green QPCR only (high efficiency, singleplex). Probes were added later to allow detection of either NT or IS amplicon in an OpenArray (R) plate.
  • NT was detected by FAM labeled probe and IS was detected by VIC labeled probe. After 20 cycles, arrays were imaged and multiplex signals detected from the ten starting copies of either NT or IS were compared in plots. The points on each plot represent technical replicates of the initial two preamplification PCR reactions. The experiments showed that both internal standard (IS) and native template (NT) gave unique signals at very low copies in the presence of non-optimized multiplex preamplification, and all six assays demonstrated 10 starting copy sensitivity.
  • Example 5 Standardized NanoArray PCR showed sensitive detection of low quantities of native template in the presence of an internal standard.
  • a preferred positive control is a template that is nearly identical to the native sequence of target, but differing by a few base pairs (i.e., an internal standard (IS)).
  • following PCR one distinguishes which product is made (competitor or native sequence) with a probe.
  • Fluorescent probe-specific technologies, e.g., TaqMan SNP, Pleiades, or Beacons, detect the sequence difference between the IS and NT.
  • the positive control mimics the NT as closely as possible in the reaction.
  • Each probe has a unique fluorescently quenched dye and specifically hybridizes to the PCR template sequence (Livak, "Allelic discrimination using fluorogenic probes and the 5' nuclease assay." Genet Anal. 1999 Feb; 14(5-6):143-9).
  • the hybridized probe was digested by the exonuclease activity of the Taq polymerase, resulting in release of the fluorescent dye specific for that probe.
  • the fluorescent emissions from the released dyes reflect the molar ratio of the sample.
  • Loaded OpenArray (R) plates were inserted into a glass case containing immiscible fluid and sealed with a light sensitive epoxy to prevent evaporation during thermal cycling.
  • the resulting PCR array was then cycled in a commercially available flat block thermal cycler (e.g., the BioTrove NT Imager).
  • thermal cyclers for high throughput PCR are also available, e.g., the BioRad ALD-021 IG which can cycle up to 32 slides every four hours (>98,000 PCR).
  • Two color fluorescent images were collected following PCR using a commercially available microarray scanner (e.g., BioTrove NT Imager, Tecan LS Reloaded).
  • the FAM:VIC fluorescent ratio was plotted against internal standard preamplification input quantity, and the inflection point from a sigmoidal curve fit (EC50) was used to indicate native template nucleic acid concentration.
  • genomic human DNA was used to simulate the competitive PCR titration curve behavior (Figure 8). Similar to capillary electrophoresis and gel based methods used for StaRT-PCR endpoint measurements, TaqMan SNP assays are not analytically sensitive to a <10% native:control template ratio. Therefore, for this experiment the indicated log molar ratios <-1 and >1 used replicate homozygous genomic DNA instead of the indicated molar ratio.
  • the experiment was performed with heterozygous genomic human DNA to simulate the expected FAM/VIC ratio and variation for such samples.
  • the raw fluorescent results are depicted in the inset figure as five clusters corresponding to, in a clockwise direction, the dilutions (-3 & -2), -1, 0, +1, (+2 & +3).
  • Eight (8) technical replicates were used to generate eight (8) sigmoid curve fits, resulting in an average of 0.066 +/- 0.007 for the value of the half maximal effective concentration (EC50) (Figure 8).
  • An 11% coefficient of variation (CV) obtained for technical replicates was not substantial considering that most of this variation can be accounted for by Poisson noise (~6% given the 300 genomic template input). It is expected that the CV can be greatly reduced when using more highly expressed prognostic RNA targets.
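The Poisson contribution quoted above can be checked with a short calculation. The sketch below assumes that template sampling follows a Poisson distribution, so that the expected CV is 1/sqrt(N) for N starting copies; the function name is illustrative and not part of the described method.

```python
import math

def poisson_cv(mean_copies: float) -> float:
    """CV expected from Poisson sampling alone: 1 / sqrt(mean copies)."""
    if mean_copies <= 0:
        raise ValueError("mean_copies must be positive")
    return 1.0 / math.sqrt(mean_copies)

# 300 genomic template copies, as stated in the text (about 0.058, i.e. ~6%).
print(f"Poisson CV at 300 copies: {poisson_cv(300):.1%}")
```

Under the same assumption, the sampling floor rises as the input is split into smaller aliquots, which is why low-copy reactions dominate the observed variation.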
  • this study shows that high density, high throughput SNAP-PCR showed little variation among replicate experiments and that SNAP-PCR reactions are reproducible.
  • Example 7 Standardized Nanoliter Array PCR Gene Expression assay of individual biomarkers was able to detect 4-fold difference in nanoscale quantities.
  • the ratio of the signal of the biomarker to the signal of the internal standard was determined (NT:IS ratio) for the through hole reactions.
  • the NT:IS ratio was graphed against the concentration of the internal standard, and the half-maximal concentration (EC50) was determined.
  • the ratio of the signal from the biomarker to the melting temperature of the probe to the biomarker was determined.
  • the ratio of the signal from the biomarker to the melting temperature of the probe to the biomarker was graphed against the concentration of the internal standard, and the half-maximal concentration (EC50) was determined.
  • the SNAP assays showed enough sensitivity to detect a difference in the amount of the target sequence, resulting from a 4-fold difference in the amount of the input cDNA (Table 2).
  • a Standardized Nanoliter Array PCR (SNAP) Gene Expression assay is used to detect or quantify one or more nucleic acid targets by implementing preamplification and Standardized Mixtures of Internal Standards (SMIS) in a commercially available high-throughput PCR format, e.g., the OpenArray (R) plate format.
  • SNAP provides better prognostic cancer gene expression profile consistency between labs.
  • the assay is based on competitive RT-PCR between a dilution series of known concentrations of synthetic gene-specific internal standard copies and the unknown number of target sequence copies.
  • A diagram of the workflow for an assay is shown in Figures 1A-1C.
  • This example shows how the Oncotype Dx workflow can be adapted to detect gene expression in a tumor biopsy.
  • a tumor biopsy sample e.g., breast biopsy, is taken from a patient.
  • a pathologist confirms sample pathology and selects tumor enriched sections for RNA extraction.
  • the method also measures many genes in low yielding samples obtained by fine needle aspirate (FNA) biopsies, laser capture micro-dissection or flow-sorted cytometry. Low yield samples benefit from preamplification and distribution of sample into fewer reaction tubes.
  • RNA 100 ng
  • Tube 1 (Table 3) contains the calibrant, a reagent that contains the internal standard for the β-actin (ACTB) loading control gene.
  • the ratio of native template (NT) to internal standard (IS) must be greater than 1:10 and less than 10:1 for the measurement to be within assay range.
  • Initial calibration of each cDNA sample to a known quantity of β-actin (ACTB) internal standard ensures that the ACTB NT/IS is within this range for each subsequent measurement.
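The assay range stated above (NT:IS greater than 1:10 and less than 10:1) can be expressed as a simple check. This is a minimal sketch; the function name and the `max_fold` parameter are illustrative assumptions, not part of the described kit or software.

```python
def within_assay_range(nt_copies: float, is_copies: float, max_fold: float = 10.0) -> bool:
    """True when 1:max_fold < NT:IS < max_fold:1 (strict bounds, as in the text)."""
    if nt_copies <= 0 or is_copies <= 0:
        return False
    ratio = nt_copies / is_copies
    return 1.0 / max_fold < ratio < max_fold

print(within_assay_range(500, 1000))   # True: ratio 0.5 lies inside the range
print(within_assay_range(50, 1000))    # False: ratio 0.05 is below 1:10
```

A calibration step can use such a check on the ACTB NT/IS ratio before the cDNA is distributed to the preamplification tubes.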
  • the calibrated cDNA sample is then used in the preamplification step.
  • the calibrated cDNA is evenly distributed among 6 StaRT-PCR tubes.
  • a passive dye added during the first strand cDNA synthesis could be used to detect if relatively equal volumes of cDNA were added to the preamplification tube.
  • Prior to addition of the calibrated cDNA, the 6 StaRT-PCR tubes are loaded with PCR master mix, PCR primer pairs for gene targets (e.g., 21 primer pairs for Oncotype DX, 17 prognostic and 4 reference) and their internal standard competitive templates formulated into SMIS (e.g., serial 10-fold dilution of internal standards). Because genes are expressed over more than six orders of magnitude in human tissues, Tubes 2 to 7 (Table 3) are 10-fold serially diluted relative to the loading control gene β-actin (ACTB) internal standard, in a system of six (6) SMIS™, A-F. Inclusion of SMIS removes dependence on real-time instrument calibration and provides a self-referenced quality standard that controls for analytical false negatives and false positives.
  • these tubes undergo 16 cycles of PCR preamplification. Two microliters from each preamplification tube are added to a 384-well plate containing 18 µl master mix (no primer or probes), to eliminate issues associated with nonspecific product preamplification and to prepare samples for loading into the OpenArray (R) plate provided in the kit. Following 16 cycles of PCR, the ratios of fluorescent emissions are measured and native template concentration estimated from the fluorescence ratio compared to an internal standard curve. Reference gene copies are used to correct for cDNA yield prior to gene expression profile (GEP) calculation.
  • the OpenArray (R) plate is pre-loaded with amplification primers and two differentially labeled probes specific for either native template or internal standard corresponding to each of the Oncotype Dx gene expression targets.
  • two differentially labeled fluorescent dye exonuclease probes specific for either native template or internal standard, e.g., TaqMan® probes (Applied Biosystems, Foster City, CA, USA), are pre-loaded in the OpenArray (R) plate.
  • the probe could also be a number of different types: e.g., Pleiades (Nanogen, Inc., Bothell, WA, USA), Molecular Beacons (see, for example, Tyagi et al., Nature Biotechnology 14(3):303-8, 1996), Scorpions® (Molecular Probes Inc., Eugene, OR, USA).
  • a 6-carboxyfluorescein (FAM) labeled probe recognizes sequences specific to the native template and a 5'-Tetrachloro-Fluorescein (TET) labeled probe recognizes internal standard specific sequences.
  • the synthetic gene-specific internal sequence could be, for example, a 1, 2, or 3 nucleotide difference compared to the target sequence in the probe binding sequence.
  • the 1, 2, or 3 nucleotide change in the internal standard may produce mismatches with the probe hybridizing to the native template or target sequences, and thus reduces or prevents hybridization of the native template probe to the internal standard.
  • 1, 2, or 3 nucleotide changes include deletions, insertions, or placing 2-3 nucleotides of the target sequence in a different order.
  • 2 nucleotide changes are effective in differentiating probe binding in TaqMan and Pleiades based SNAP assays.
  • the PCR efficiency for amplification of the synthetic sequence should be similar to the amplification for the target sequence.
  • Competitive PCR allows a multiplex pre-amplification step without consequence to assay accuracy. Applying the pre-amplified sample to a commercially available high-throughput PCR format, e.g., the OpenArray (R) plate, simplifies the detection step of the 378 individual PCR assays.
  • Table 3 BioTrove Oncotype DX Workflow Test Kit
  • PCR buffer MgCl2
  • Taq polymerase dNTPs
  • Each preamplification dilution is transferred to its own OpenArray (R) plate subarray for amplification to detect target nucleic acids (Figure 2).
  • All OpenArray (R) plate subarrays are identical; a subarray consists of assays for the 21 gene targets in three replicates.
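The figure of 378 individual PCR assays is consistent with the counts given in this example, assuming it is the product of 21 gene targets, three replicates per subarray, and six internal standard dilution tubes. The sketch below only illustrates that arithmetic; the constant and function names are hypothetical.

```python
# Counts taken from the text; names are illustrative only.
GENE_TARGETS = 21          # 17 prognostic + 4 reference genes
REPLICATES_PER_ASSAY = 3   # technical replicates in each subarray
SMIS_TUBES = 6             # internal standard dilutions A-F

def reactions_per_subarray(genes: int = GENE_TARGETS,
                           replicates: int = REPLICATES_PER_ASSAY) -> int:
    """Through-hole reactions needed for one preamplification dilution."""
    return genes * replicates

def reactions_per_test(tubes: int = SMIS_TUBES) -> int:
    """Total reactions when every dilution tube gets its own subarray."""
    return tubes * reactions_per_subarray()

print(reactions_per_subarray())  # 63
print(reactions_per_test())      # 378
```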
  • the arrays are loaded, sealed and subjected to 30 PCR cycles in an approved flat block thermal cycler.
  • the cycled arrays are imaged in an NT imager or compatible slide scanner.
  • Absolute gene copy number is generated by curve fitting a plot of the ratio of the native/standard signals vs. standard concentration. Generally, a sigmoid curve fit is used.
  • Melting curves detected by saturating DNA dyes can also be substituted for probe-based ratio calculations.
  • the half maximal effective concentration (EC50) is used to determine the native quantity of the target nucleic acid in the sample. A more accurate quantification could be obtained if more standard concentrations are used.
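The curve fit described above can be sketched in a few lines. The example assumes that native template and internal standard amplify with equal efficiency, so that the fraction of signal from native template is NT/(NT + IS) and the EC50 on the log[IS] axis equals the native template quantity. It fits only that one parameter with a golden-section search; a variable-slope sigmoid fit, as a production analysis might use, is not shown. All names and the synthetic data are illustrative.

```python
import math

def fraction_nt(log_is: float, log_ec50: float) -> float:
    """Expected NT share of total signal at a given log10 IS input."""
    return 1.0 / (1.0 + 10.0 ** (log_is - log_ec50))

def fit_log_ec50(log_is_values, observed, lo=0.0, hi=10.0, tol=1e-6):
    """Least-squares estimate of log10(EC50) by golden-section search."""
    def sse(log_ec50):
        return sum((fraction_nt(x, log_ec50) - y) ** 2
                   for x, y in zip(log_is_values, observed))
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if sse(c) < sse(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

# Synthetic titration: IS inputs of 10^2..10^7 copies, true NT = 10^4.5 copies.
log_is = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
signal = [fraction_nt(x, 4.5) for x in log_is]
estimate = fit_log_ec50(log_is, signal)
print(f"estimated NT copies: {10 ** estimate:.0f}")
```

Adding more standard concentrations adds more points to the sum of squared errors, which is one way to see why the text expects a more accurate quantification from a longer dilution series.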
  • the differential cost between performing 7 and 12 standard concentration measurements subarrays per test is small. Alternatively, assays compatible with 6 internal standard tubes would allow up to 8 tests per OpenArray (R) plate, further decreasing test cost.
  • Automated software analyzes images, calculates signal intensities, estimates copy number, Quality Assurance (QA) assay results, and outputs a Recurrence Score report.
  • the method provides an absolute quantification of multiple target nucleic acids in a sample. Preamplification results in at least a ten-fold improvement in sensitivity for low RNA yield samples. Transcript quantitation using internal standards is a robust and desirable method. However, prior to the OpenArray (R) plate, use of internal standards increased the test cost and complexity of target nucleic acid quantitation nearly ten-fold. Internal standards reduce variation issues brought about by instrument, pipetting, preamplification, change in cycle threshold (ΔCt) estimates and sample contaminants. Internal standards provide QA data for each assay data point. Compared with the existing real-time Oncotype DX test, the method improves test yield and provides an assay QA. Further benefits to be obtained from the present method include a reduction in the number of liquid handling steps and instrument requirements. All these benefits would occur at roughly the same price as the current test.
  • Example 9 Standardized NanoArray PCR (SNAP) analytic performance was demonstrated using a panel of 16 lung cancer prognostic genes and up to 5 endogenous control genes.
  • Real-time TaqMan non-standardized QPCR assays were developed for 16 lung cancer prognostic genes (Chen et al., N Engl J Med 2007 January 4;356(1):11-20), and 10 reference gene targets (up to 5 endogenous control genes).
  • ΔCt normalized prognostic measurement
  • >95% linearity and <20% CV at >1000 starting copies were demonstrated using data calculated from 6-point standard curves of cDNA from flash frozen lung samples.
  • the amplification primer sequences in the SNAP process were generated to provide high quality TaqMan real-time qPCR assays.
  • three different primer pairs were designed to unique regions of the gene and with the primer binding sites spanning an intron/exon boundary (>1000 bp intron where possible).
  • Evaluation criteria included cycle threshold value, ΔRn, and amplification specificity as determined by melt curves and gel electrophoresis using amplification products derived from three annealing temperatures (55, 60 and 65°C).
  • all primer sets were tested for cDNA specificity using 50ng genomic DNA as template.
  • oligonucleotides were obtained with sequence matching the predicted amplicon product for each gene (80-100bp long oligonucleotides). These oligonucleotides were mixed in equimolar concentrations and then serially diluted over seven orders of magnitude and again used for PCR efficiency tests. In this experiment all genes demonstrated PCR efficiency >98% with correlation coefficients >0.99. Furthermore, all PCR assays demonstrated strong amplification signal down to eight copies of specific oligonucleotide template. To complete the panel, five of the ten reference genes (GUSB, MLN, PPIA, TBP and UCBH) were selected based on the Vandesompele method of selecting targets with minimal covariance.
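PCR efficiency and the correlation coefficient are commonly derived from a standard curve of cycle threshold against log10 template copies, using efficiency = 10^(-1/slope) - 1. The sketch below assumes that conventional calculation, which the text does not spell out; the Ct values are hypothetical and the function names are illustrative.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares: returns slope, intercept and Pearson r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)

def pcr_efficiency(slope: float) -> float:
    """Fractional efficiency from the standard-curve slope (1.0 = perfect doubling)."""
    return 10.0 ** (-1.0 / slope) - 1.0

# Hypothetical Ct values for a dilution series spanning seven orders of magnitude.
log_copies = [1, 2, 3, 4, 5, 6, 7]
ct_values = [33.1, 29.8, 26.4, 23.1, 19.8, 16.4, 13.1]
slope, intercept, r = fit_line(log_copies, ct_values)
print(f"slope={slope:.3f}  efficiency={pcr_efficiency(slope):.1%}  r^2={r * r:.4f}")
```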
  • the SNAP protocol has evolved over the course of experimentation, but essentially remains as depicted in Figure 1A.
  • Samples are split into tubes containing log dilutions of an internal standard (IS) pool (e.g., 10^2 to 10^7 copies IS per reaction), mastermix, and all twenty-one PCR primer pairs (80 nM each primer), and then amplified by subjecting them to 34 PCR thermal cycles.
  • the internal standard pool is a mixture of 21 synthetic oligos that function as competitive templates as each differs from the native gene target by two bases in the probe binding sequence.
  • the preamplified PCR products are diluted 500-fold in mastermix, loaded into an OpenArray and undergo 30 additional PCR cycles followed by melt curve analysis.
  • OpenArray nanoplates were manufactured such that primers/probe for all 21 assays were individually loaded into 63 separate wells in the nanoplate (i.e., each assay has three technical replicates).
  • the probe hybridization difference between IS and native template (NT) is somewhat adjustable, but a ΔTm of 15°C between products produced good melting curve separation, which in turn improved the ability to estimate the IS:NT molar ratio in each sample.
  • the IS and NT melting curve separation, or signal-to-noise (S/N), needs to be greater than ten, and preferably greater than fifty (see the Materials and Methods section for the S/N calculation). However, the current designs proved sufficient for demonstrating >95% linearity calculated from a 6-point standard curve of cDNA from flash frozen lung samples and <20% CV for genes with >1000 starting copies for each reference normalized prognostic measurement (ΔCt).
  • multiple rounds of Pleiades probe design and testing can be used to optimize the preferential S/N for each assay.
  • the conversion of melting curve data into transcript abundance is based on data establishing melting curve parameters for each NT and IS.
  • Pleiades probe melting curves of samples with either IS or NT were fit to a variable sloped sigmoid curve, and the resulting Tm and Hill coefficient saved as input parameters for SNAP analysis.
  • Figures 13A-13C depict the SNAP sample analysis workflow, beginning with determining the individual contribution of NT and IS from the results of fitting the melting curve for each sample-assay-IS combination. From this, the transcript abundance for each sample-assay combination is derived from the EC50 value derived from a sigmoid curve fit to a Fraction NT vs. log[IS] plot.
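One way to picture the first step of this workflow is as a two-component mixture fit. The sketch below assumes each pure melting curve is a variable-slope sigmoid described by its Tm and Hill coefficient, that the observed curve is normalised to the same scale, and that it is a linear mix of the NT and IS reference curves. The functional form, parameter values and names are assumptions for illustration, not the analysis software described here.

```python
def melt_signal(temp_c: float, tm_c: float, hill: float) -> float:
    """Normalised probe signal: near 1 below Tm, near 0 above it."""
    return 1.0 / (1.0 + 10.0 ** (hill * (temp_c - tm_c)))

def fraction_nt_from_melt(temps, observed, nt_params, is_params) -> float:
    """Closed-form least-squares NT fraction for observed = f*NT + (1-f)*IS."""
    num = 0.0
    den = 0.0
    for t, y in zip(temps, observed):
        nt = melt_signal(t, *nt_params)
        is_ = melt_signal(t, *is_params)
        diff = nt - is_
        num += (y - is_) * diff
        den += diff * diff
    if den == 0.0:
        raise ValueError("NT and IS reference curves are identical")
    return min(1.0, max(0.0, num / den))  # clamp to a physical fraction

# Hypothetical reference parameters (Tm in deg C, Hill slope) with a 15 deg C delta-Tm.
NT_PARAMS = (65.0, 0.3)
IS_PARAMS = (50.0, 0.3)
temps = [float(t) for t in range(40, 81)]
mixed = [0.3 * melt_signal(t, *NT_PARAMS) + 0.7 * melt_signal(t, *IS_PARAMS)
         for t in temps]
print(round(fraction_nt_from_melt(temps, mixed, NT_PARAMS, IS_PARAMS), 3))
```

The resulting fraction for each internal standard dilution would then be the input to the Fraction NT vs. log[IS] fit.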
  • nf1 1.06 1.00 11%
  • ppia 1.07 1.00 10%
  • rnf 1.08 1.00 21%
  • stat1 1.08 1.00 14%
  • stat2 1.14 1.00 27%
  • tbp 1.01 1.00 19%
  • ucbh 1.07 1.00 23%
  • znf 1.17 1.00 24%
  • SNAP precision is another requirement for analytic accuracy.
  • SNAP precision was estimated by requiring any sample with >1000 starting copies to show ⁇ 50% CV.
  • the SNAP method divides the sample into eight internal standard amplification tubes, therefore, each of the individual amplifications may have as low as 125 starting copies.
  • the SNAP transcript abundance of any assay-sample combination with >1000 starting copies was reference normalized (similar to ΔCT for real-time data), and a STD/MEAN calculation made by pooling the normalized values (Table 5).
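The precision criterion above reduces to a STD/MEAN calculation on reference-normalized values. The sketch below assumes the sample standard deviation is intended and applies the <50% CV threshold only to measurements with more than 1000 starting copies; the function names and example values are hypothetical.

```python
import statistics

def percent_cv(values) -> float:
    """Coefficient of variation (sample STD / MEAN) as a percentage."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("mean of zero: CV is undefined")
    return 100.0 * statistics.stdev(values) / mean

def passes_precision(values, starting_copies, min_copies=1000, max_cv=50.0):
    """None when the criterion does not apply, else True/False."""
    if starting_copies <= min_copies:
        return None
    return percent_cv(values) < max_cv

# Hypothetical reference-normalized abundances from four replicates.
replicates = [1.0, 1.2, 0.8, 1.0]
print(f"CV = {percent_cv(replicates):.1f}%")
print(passes_precision(replicates, starting_copies=5000))
```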
  • Example 11 SNAP requires at least 5-fold less RNA input than real-time QPCR.
  • Using the RNA test input at which at least one assay fails the precision requirement (50% CV), it is demonstrated that SNAP requires at least 5-fold less RNA input than real-time QPCR.
  • This study compares the analytic sensitivity of SNAP and real-time qPCR. Analytic sensitivity is important when working with highly degraded or limited samples such as formalin fixed tissue specimens or fine needle aspirate biopsies. In this study, the analytic sensitivity of a platform was defined as the sample quantity that resulted in a >50% CV for four inter day sample replicates.
  • each platform made 21 transcript measurements from serial dilutions of cDNA (8ng to 160pg).
  • the real-time platform measured only the nine lowest expressing transcript targets; the rationale being that these assays would fail the precision criteria first as their templates are diluted near single copy.
  • the real-time measurements used 42% (9/21) of the cDNA input (5.76ng to 69pg). Tables 6 and 7 provide the results from this experiment.
  • Table 6 Four inter day sample replicates of cDNA (ng input, top row of tables) were measured by SNAP.
  • Table 7 Four inter day sample replicates of cDNA (ng input, top row of tables) were measured by TaqMan real-time qPCR.
  • the ICC for each prognostic assay across three laboratory sites can demonstrate better inter laboratory correlation than non-standardized real-time QPCR.
  • the inter-laboratory concordance of the SNAP and real-time qPCR platforms can be studied. Seven cDNA samples are divided into 8 ng aliquots and distributed to three laboratory sites each for SNAP and qPCR, where four inter-day measurements of 21 transcripts are made. An intraclass correlation (ICC) is used for comparison as an assessment of quantitative reproducibility by different sites. Owing to the standardized nature of SNAP, the inter-laboratory measurements for SNAP are more similar than those for real-time qPCR.
  • Example 12 SNAP using RNA from FFPE tissue
  • the lung prognostic panel SNAP assays were designed to work with highly degraded samples. In initial studies, the tasks of measuring SNAP analytic performance were simplified by using the higher yields and transcript concentration of RNA isolated from fresh frozen tumor samples. Thus, it was important to demonstrate the test panel response to RNA derived from FFPE samples.
  • RNA isolated from FFPE lung resection tumor blocks was converted into cDNA using random priming and MMLV reverse transcriptase. Three serial two-fold dilutions were measured by SNAP in triplicate (Figure 15). Per nanogram RNA equivalence, the average transcript abundance was ~130-fold less than for the fresh frozen samples used in Examples 10 and 11.
  • SNAP transcript abundance assays can be designed, constructed, and tested to validate the analytic performance of SNAP assays for genes in a lung adenocarcinoma prognostic test panel for a set of 60 genes. These assays can be used to identify a gene expression signature predictive of high-risk patients with lung adenocarcinoma. Steps for SNAP assay panel construction are outlined below.
  • the resulting panel measures transcript abundance of the FFPE samples.
  • Analytic specificity of the SNAP assay is measured and determined as follows. Primer pair characterization is performed by examining microplate PCR product of lung tumor cDNA (10 ng), human genomic DNA (10 ng) and NTC. SNAP involves a single PCR amplification condition for all assays. Primer design algorithms, as are known in the art, yield assays with required performance in several standard commercial master mixes (ABI Fast Sybr Green master mix, ABI GeneAmp Fast PCR MasterMix, Roche Taq Gold/3 mM Mg) at 60°C annealing.
  • the specificity of each assay is determined by amplifying each assay-sample combination using ABI Fast Sybr Green master mix, collecting melting curve data for later analysis, subjecting the PCR products to polyacrylamide gel electrophoresis and observing products migrating with the correct mobility for cDNA samples, and absence of non specific products in all samples.
  • PCR template (see, e.g., Materials and Methods) can be isolated from leftover PCR by QIAquick PCR Purification kits, and quantified by NanoDrop spectrophotometry. Analytic sensitivity of the SNAP assay is measured and determined as follows. Primer pairs passing the specificity screen are measured for their ability to detect less than ten input copies of template.
  • the goal for selecting primer pairs to move into a SNAP assay is to generate primers that amplify less than ten starting copies in a multiplex PCR environment.
  • OpenArrays are prepared preloaded with a unique primer pair per through hole (well), allowing a single OpenArray plate to screen up to 256 primer pairs against 12 samples.
  • Limiting dilution PCR is used to measure the sensitivity of each assay.
  • Limiting dilution PCR is an endpoint qPCR technique that uses serial dilutions flanking single starting copy per reaction to estimate sample copy number (34). This method requires identification of positive/negative PCR reactions by SYBR Green melting curve analysis. The melting curve specific for each product will be pre-determined from the specificity experiment described above.
  • the purified amplicons isolated above will be pooled at 1e6 copies per µl in carrier DNA (10 pg/µl Salmon Sperm DNA). Melting curve analysis of two-fold serial dilutions of NT pool (64 to 1/16 copies per OpenArray hole (33 nl)) in four replicates determines presence or absence of PCR products (experiment requires 4 OpenArray Plates). A copy number estimate with 95% confidence intervals is obtained by entering PCR positive reactions and dilution factors into the software POISSON9 (created by N. Iscove Jan 1996). By comparing the limiting dilution derived copy number to the NanoDrop estimated copy number, one can establish if an assay is capable of sub-ten starting copy analytic sensitivity. When this approach was applied to the 21 SNAP assays, all assays demonstrated near single copy sensitivity. All assays demonstrating less than ten copy sensitivity are tested for multiplex sensitivity.
  • Multiplex analytic sensitivity of the SNAP assay is measured and determined as follows. All primer pairs demonstrating sensitivity of ten copies are characterized for their multiplex analytic sensitivity. Primer pairs meeting the singleplex analytic sensitivity criteria above are pooled at 80 nM each and melting curve analysis of two-fold serial dilutions of native template (NT) pool (64 to 1/16 copies per OpenArray hole (33 nl)) in four replicates determines presence or absence of PCR products. As above, the limiting dilution derived copy number is compared with the NanoDrop estimated copy number to establish if an assay is capable of sub-ten starting copy analytic sensitivity. For each gene target, the assay demonstrating the best analytic sensitivity is moved to a probe design phase.
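The principle behind a limiting dilution estimate can be sketched with a single-hit Poisson model, in which a reaction is negative with probability exp(-copies). The code below is an illustrative maximum-likelihood sketch under that assumption and is not a reproduction of the POISSON9 software; it omits confidence intervals, and the positive counts are hypothetical.

```python
import math

def log_likelihood(scale, nominal_copies, replicates, positives):
    """Log-likelihood of positive/negative counts when true copies = scale * nominal."""
    total = 0.0
    for d, n, k in zip(nominal_copies, replicates, positives):
        lam = scale * d
        if k:
            total += k * math.log(-math.expm1(-lam))  # log P(at least one copy)
        total -= (n - k) * lam                        # log P(zero copies) per negative
    return total

def estimate_scale(nominal_copies, replicates, positives, lo=1e-4, hi=1e4, tol=1e-7):
    """Golden-section maximisation over log10(scale).

    All-positive or all-negative data push the estimate to a search bound.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(lo), math.log10(hi)
    def neg_ll(log_scale):
        return -log_likelihood(10.0 ** log_scale, nominal_copies, replicates, positives)
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if neg_ll(c) < neg_ll(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return 10.0 ** ((a + b) / 2.0)

# Two-fold dilutions from 64 to 1/16 nominal copies per hole, four replicates each.
nominal = [64.0 / 2 ** i for i in range(11)]
reps = [4] * 11
pos = [4, 4, 4, 4, 4, 4, 3, 2, 1, 1, 0]  # hypothetical melt-curve calls
scale = estimate_scale(nominal, reps, pos)
print(f"true copies are about {scale:.2f} x the nominal estimate")
```

A scale near 1 would indicate that the limiting dilution result agrees with the spectrophotometric copy number estimate.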
  • a new set of assays are designed and tested as described herein.
  • Half of the SNAP assays tested in this manner (11 assays) showed near single copy sensitivity.
  • minimal performance metrics required for the 60 gene SNAP panel are established. Briefly, six half log serial dilutions of lung tumor cDNA will be measured by SNAP. Each assay should demonstrate >95% linear response, routinely less than 50% CV for samples with greater than 1000 starting copies. Further, no native template signal should be detected in No Template Controls. Failing assays are replaced with new assays.
  • SNAP assays for the 60 gene panel can be externally scaled up.
  • the reproducible manufacture of the internal standard strip tubes aids in the standardization of the SNAP measurements.
  • the synthetic oligo internal standard pool is replaced with a more stable cloned IS library.
  • plasmids containing a desired DNA sequence e.g., the ISO9001 certified GenScript, Piscataway, NJ. Plasmids to each internal standard may be obtained from one of these companies. These plasmids are linearized (typically using the rare NotI restriction site), quantified by both NanoDrop fluorometry and limiting dilution PCR (averaging the results) and then pooled at 10^8 per µl in preparation for internal standard strip tube manufacture.
  • Example 14 SNAP assays for a test panel of 60 lung adenocarcinoma prognostic genes measures transcripts in FFPE samples
  • the prognostic value of the selected 60-gene SNAP panel is evaluated and a prognostic signature is identified to classify patients into high and low risk groups based on overall survival.
  • the effect of adding standard clinical variables into the risk classification algorithm can be evaluated. This is achieved in two steps: all, or a subset of genes are used to identify an expression signature that is univariately correlated with overall patient survival and a cut-off value to classify patients into high and low risk groups is determined.
  • the gene expression signature in the presence of clinical covariates is determined and independent prognostic factors are combined to develop a multifactorial risk classification.
  • the prognostic potential in lung adenocarcinoma patients of the selected 60-gene SNAP assay is determined, and a risk classification scheme is identified that can be tested in much larger, independent patient cohorts for true test validation.
  • Tissues for this study are obtained from patients treated for lung cancer by Surgeons in the Division of Thoracic and Foregut Surgery at the University of Rochester Medical Center (URMC) between 2003 and 2007.
  • FFPE tissue blocks from these patients are stored in the URMC Department of Pathology archives
  • the majority of the cohort (58%) had stage I disease and 65% had no detected lymph node involvement. This cohort is very similar to those studied in the Directors Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma (3), and has the sample diversity and numbers to support the prognostic signature development.
  • the first requirement is to review the original H&E slides of the tumors and to identify tissue blocks that 1) are representative of the tumor and 2) contain an area of tissue comprised of at least 70% tumor cells.
  • the primary tumor histology and staging information is reviewed. From this review, one or two tumor blocks are identified for each patient and a request is made to retrieve the blocks from the archives.
  • a single 4 µm section from each tissue block is stained with H&E and evaluated before making a decision regarding which tissue block is used for the molecular analysis.
  • the basis for this evaluation includes histologic criteria (good tumor representation, no necrosis, lack of contaminating tissues/structures etc.), tissue size, and the need to macrodissect in order to obtain >70% tumor cellularity.
  • a high resolution photograph may be taken of the H&E stained slide from each tissue block for later use.
  • Transcript abundance for the 60 gene prognostic panel for up to 250 cDNA samples is measured by SNAP. Based on previous SNAP transcript abundance measurements of FFPE samples described herein and the expected protocol improvements to both RNA isolation from FFPE and cDNA conversion efficiency, 100 ng RNA equivalence is expected to be sufficient for accurate transcript abundance measurement of all 60 prognostic genes. Reference normalized gene expression data are provided for further statistical analysis.
  • Determination of a prognostic signature from the 60 gene panel can be used to generate a risk score that is associated with patient prognosis, either with or without clinical covariates.
  • a risk score cut-off value may be determined that classifies patients into high and low risk groups for overall survival. This classifier can be used to determine the reproducibility of the SNAP assay for patient risk classification.
  • a first method, the compound covariate method is a standard prediction approach that has been used successfully in microarray studies with specific applications to lung adenocarcinoma (1).
  • the number of patients required to enable 80% power to detect a significant gene signature in the cohort of 150 - 187 patients with up to 9 years of observation is computed.
  • the individual gene signature GS j for patient j serves as a risk score for each patient. Patients are classified into two cohorts by the risk score median.
  • the power for a test of survival differences among the 2 groups is provided for the gene signature alone and for the addition of the gene signature to N stage. This approach permits the estimate of power when the gene signature serves as a covariate by itself and along with an important clinical covariate.
  • Differences between two groups can be characterized by the hazard ratio. Based upon AJCC stage distribution of the cohort with 5 years of accrual and 4 years of additional follow-up, a probability of 5 year overall survival of 0.21 is assumed. Applying the expected censoring pattern and the probability of 5 year survival to the cohort, 73% of study patients are predicted to die by January 2012. Because the number of patients who are eligible may be variable, the minimum number of patient accruals is provided to achieve 80% power for various 2 group hazard ratios.
  • Table 9 Cohort size requirements to identify hazard ratios >1.5 with or without inclusion of clinical covariates with the gene signature.
  • the table also shows the number of patients required for testing the gene signature in the presence of a strong clinical covariate.
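As a rough illustration of the kind of calculation that could sit behind such a table, the sketch below uses Schoenfeld's approximation for the number of deaths needed by a two-group log-rank comparison with equal group sizes, then divides by the 73% predicted death fraction quoted above. The choice of Schoenfeld's formula, the two-sided 0.05 significance level, and the function names are assumptions made for illustration; the text does not state which method produced Table 9.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Deaths needed to detect `hazard_ratio` between two equal-sized groups.

    Schoenfeld approximation with a two-sided test (an assumed method,
    not one named in the source).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)


def required_patients(hazard_ratio, event_fraction=0.73, alpha=0.05, power=0.80):
    """Patients to accrue so that the expected number of deaths is reached."""
    events = required_events(hazard_ratio, alpha=alpha, power=power)
    return math.ceil(events / event_fraction)


# Hypothetical hazard ratios, chosen only to show the trend.
for hr in (1.5, 2.0, 2.5):
    print(hr, required_events(hr), required_patients(hr))
```

Under these assumptions the required cohort shrinks quickly as the hazard ratio grows, which is why the minimum accrual is reported per hazard ratio.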
  • Example 15 Inter-site reproducibility and deployability of SNAP assays for a test panel of 60 lung adenocarcinoma prognostic genes.
  • the SNAP platform has inter-site reproducibility and deployability. These features can be achieved regardless of any prognostic utility of the specific, 60-gene lung cancer signature and this feature is important independent of clinical value. Multi-gene expression signatures for many different disease related endpoints can be refined and validated to generate a reproducible and deployable SNAP assay. Thus, SNAP assay precision can be used to determine patient classification.
  • Assay precision is a combination of variability in the measurement itself at the same institution and the variability that would result from different labs conducting the same measurement.
  • a reliability experiment to evaluate intra and inter-site reproducibility is conducted as follows. From an initial site, cDNA from 15 FFPE tissue samples is divided into three subsamples and shipped to three sites, A, B and C. At each site, each of the 15 samples is further subdivided into 3 subsamples and the 60 gene SNAP assay is run on each subsample on separate days. The intra-site reproducibility is then estimated for each lab. A mixed effects linear model is used to fit all data, setting the sites as fixed effects and the original samples and subsamples at each site as random effects.
  • the dependent variable is the continuously scaled gene signature (risk score).
  • The equality of outcome by site is tested and the contribution to total variance is estimated for subject, site and replicate within site.
  • the SNAP assay risk score reliability is characterized by coefficient of variation, intra-class correlation coefficient, standard error of measurement and prediction interval for a new observation.
  • MAQC Microarray Quality Control
  • the frequency of an inconclusive test result due to lack of test precision is evaluated, in addition to the standard metrics described above.
  • the frequency of an inconclusive test result due to lack of test precision is an intuitive endpoint that can be evaluated and has practical utility. This endpoint and its evaluation are described below.
  • the inter-site standard error of measurement is used to estimate a prediction interval for a new observation.
  • This interval places a high probability (e.g. .95 or .99) that the true gene signature score falls within the stated interval.
  • the width of this prediction interval is related to risk classification, and the proportion of patients at risk of misclassification is estimated.
  • Hypothetical figure ( Figure 17) illustrates this concept.
  • 100 patients with a gene signature score from 10 to 50 are split into high/low risk groups at the median of 30.
  • a prediction interval for a new observation at the median is superimposed on the distributions. All values of the gene signature are considered consistent with the new observation.
  • the width of the prediction interval is ±4 and represents about 20% of the data.
  • the actual risk of misclassification depends upon (1) the intra- lab variability, (2) the inter- lab variability, and (3) the observed distribution of gene signature scores (see, e.g., Example 14).
  • the prediction interval method is a probability statement and cannot therefore identify which observations are truly inconclusive, just those that have a high probability of being so.
  • Quantitatively assessing the probability of an inconclusive new test result near the classification boundary thereby estimates the implications of imprecision in the clinical use of the gene signature. This is more intuitive than metrics such as CV, standard error of measurement, etc. and has more practical utility. For example, for a patient with a test result too close to the classification boundary, the test could be repeated or the patient could be placed into an intermediate risk group. In this regard, there is at least one benchmark for comparison. In a 2004 NEJM study of 675 breast cancer patients, the Oncotype DX test from Genomic Health placed 22% of patients into an intermediate risk group (14).
  • Although the Oncotype DX intermediate risk group was not determined using the same approach described herein, the SNAP assay reproducibility is such that it can achieve no greater than 20% "inconclusive or intermediate" patient classification, such that the test has sufficient reproducibility to be of clinical value. Furthermore, this feature represents reproducibility of SNAP performed at multiple sites rather than at a single site as is the case for all currently available multi-gene expression tests.
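The inconclusive-result endpoint described above can be sketched in a few lines. The example below assumes a normal-theory interval whose half-width is a z-value times the standard error of measurement, and it counts the share of patients whose score lies within that half-width of the median cut-off. The function names and the evenly spaced cohort (100 scores from 10 to 50, echoing the hypothetical figure) are illustrative assumptions, not part of the described method.

```python
from statistics import NormalDist, median


def prediction_half_width(sem, level=0.95):
    """Half-width of a normal-theory interval: z * standard error of measurement."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sem


def fraction_inconclusive(scores, half_width):
    """Share of scores lying within `half_width` of the median cut-off."""
    cutoff = median(scores)
    near = sum(1 for s in scores if abs(s - cutoff) <= half_width)
    return near / len(scores)


# Hypothetical cohort: 100 evenly spaced gene signature scores from 10 to 50.
scores = [10 + 40 * i / 99 for i in range(100)]

# With a half-width of 4 around the median of 30, about 20% of patients are affected.
print(fraction_inconclusive(scores, 4.0))
```

With real data the score distribution is unlikely to be uniform, so the inconclusive fraction depends on how densely scores cluster near the cut-off as well as on the interval width.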
  • SNAP reliability can be determined by comparing results obtained at three sites.
  • the recommended sample size for such a study is derived from the association between number of independent samples, number of replicates and the intra-class correlation coefficient (ICC).
  • ICC intra-class correlation coefficient
  • Figure 18, prepared for a true ICC of 0.90 shows the tradeoff between number of replicates and precision in estimating the ICC.
  • the lower bound of an ICC estimate has been limited to 0.10 below the estimated ICC. If the observed ICC is 0.90, with high probability, the lower bound is at least 0.80 with 3 replicates of 15 samples.
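For readers unfamiliar with the ICC, the following minimal sketch computes a one-way random-effects ICC from replicate measurements of each sample. This is a simplification: it ignores the site effect that the mixed model described earlier would include, and the function name and example values are hypothetical.

```python
def icc_oneway(data):
    """One-way random-effects ICC for `data`, a list of samples,
    each a list of k replicate measurements (same k for every sample)."""
    n = len(data)
    k = len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    means = [sum(row) / k for row in data]
    # Between-sample and within-sample mean squares from a one-way ANOVA.
    ms_between = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(data, means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical risk scores: 4 samples, 3 replicates each.
replicates = [
    [12.1, 11.8, 12.4],
    [25.0, 24.6, 25.3],
    [33.9, 34.5, 34.1],
    [47.2, 46.8, 47.5],
]
print(round(icc_oneway(replicates), 3))
```

An ICC near 1 indicates that almost all variance comes from true differences between samples rather than from replicate noise.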
  • a 16 gene panel of lung cancer diagnostic genes identified by Chen et al. was selected. While the 16-gene signature of Chen et al. fared as well as most other signatures (particularly when clinical covariates were included) it did not consistently provide significant risk classification in all analysis cohorts. However, a panel of lung cancer diagnostic genes identified in a much larger study (Beer et al., Nat Med 2002 August;8(8):816-24) is also used. Beer et al. independently evaluated several new and previously published gene-sets (including the Chen et al. signature). Although there are some indications that prognostic signatures in lung cancer may have value across different tumor histologies (2;32), the vast majority of studies have focused on lung adenocarcinoma. Even so, there is extensive heterogeneity (in fact, almost no overlap) in the gene sets identified in different studies/patient cohorts and even between gene sets identified using the same patient cohort and array data (3).
  • Beer et al. (1) previously identified 50 survival-related genes for identifying high-risk patients with lung adenocarcinoma and more recently examined a data set composed of 442 adenocarcinomas (3). Subsequently, Beer et al. have used a combination of statistical and biological approaches to identify subset(s) of genes from this large dataset that are prognostic for survival of patients with lung adenocarcinoma based on the following assumptions. Genes whose expression are highly correlated should be separated into clusters (i.e. into similar biological functional groups); and there exist some clusters and subsets of genes in each selected cluster which are the most prognostic for survival. This approach is analogous to the commonly used approach of principal components to reduce dimension. The cluster approach to dimension reduction is more likely to be effective because it more closely mimics the underlying biology.
  • Simultaneous cluster and gene selection A two-stage selection procedure is implemented; the first selection on cluster level and the second one among genes within each selected cluster.
  • the Cox proportional hazard model is used to implement the proposed method.
  • the selection scheme is conducted in a Bayesian framework using an iterative algorithm.
  • the first step is to select C, much less than K, clusters which are relevant to the survival outcome.
  • the second selection identifies a subset of genes prognostic for survival. Both the clusters and genes within clusters are selected based on current probabilities in the Bayesian model.
  • the parameters of the model are updated for each selected set of genes. These steps are repeated until the simulation chain converges.
  • the empirical frequencies of the visits by the models with different number of clusters are calculated. For this, the most promising clusters for survival outcomes are ranked, and prognostic genes within each of selected clusters are identified.
  • the same estimation scheme is conducted for the identification of prognostic genes within each of selected clusters. Threshold values were chosen to end up with approximately 40-60 clusters and a total of approximately 300 genes and then performed qRT-PCR on a subset of 50 adenocarcinomas. Those genes with the most significant association between the Affymetrix and qRT-PCR measures were selected. This has identified an enriched set of 90 genes which are being evaluated in another, independent patient cohort.
  • This data will be available prior to the proposed start date of this grant period and will be used, in consultation with Dr Beer, to select a subset of up to 60 genes that will be utilized in this proposal.
  • This set may or may not include genes from the original 16- gene panel used in previous studies, but it will include the four endogenous control genes identified as being the most stable. Thus, a total of 64 genes are analyzed using SNAP.
  • RNA control is used to ensure the reverse transcription proceeded with the expected yield.
  • the FDA is exploring the use of such controls in QA for RNA quantification.
  • These RNA molecules are being developed by the External RNA Control Consortium (ERCC) as a tool for standardizing RNA quantification. These sequences have no homology to known species, so they will be unique in any RNA sample. While the application of RNA standards is still in development, the ERCC goal is for commercial vendors to manufacture and distribute the RNA standards for the purposes of standardizing RNA quantification.
  • ERCC External RNA Control Consortium
  • RNA for one of ERCC controls is spiked into the RNA isolated from FFPE prior to reverse transcription.
  • the efficiency of the RT step is estimated. While this control does not account for the chemical damage to the FFPE RNA molecules, it can be used to ensure the RT process meets minimal yield and efficiencies.
  • three amplification primer pairs are designed for a prognostic assay incorporating an ERCC target. Metrics for using the RT control may be established using methods known in the art. ERCC development of these standards may be followed for implementing the RNA reverse transcription control. QC metrics are based on measurements of FFPE samples. The assay is developed to be useful as a QC for equal cDNA distribution into the multiplex IS PCR strip tubes.
  • Accurate SNAP measurement may be obtained by distributing an equal amount of cDNA into the six IS multiplex amplification strip tubes. An unequal cDNA distribution may distort the transcript abundance result as SNAP quantification assumes equivalent DNA load in each multiplex reaction.
  • Another use for the RT control is as a cDNA sample distribution control. If stable RNA controls are not available, a DNA analog may be substituted for the RNA and a known quantity (e.g., 6x10^5 copies) spiked into the cDNA sample.
  • an ERCC IS is added at a single concentration (e.g., 10^5 per tube) to the IS strip tubes during manufacture.
  • RNA extracted from FFPE samples is highly fragmented and therefore, amplification primers for high specificity and sensitivity assays to be used on samples derived from FFPE lung tumor blocks are designed to recognize shorter target sequences (33). For this reason, the primer design restricts PCR product sizes to 70-85 base pairs, in keeping with findings that homogeneous product sizes have better inter-transcript correlation for degraded samples (16). Primer design annealing temperature will be 60 ± 1 °C.
  • the predicted amplicon sequence is BLASTed against the human transcriptome to ensure the uniqueness of primer and probe binding specificity.
  • primers are designed to span RNA intron/exon splice junctions of > 1000 nucleotides; therefore, amplification of genomic contaminants is inhibited by failure to produce full length products.
  • Three primer designs per gene target are expected to be sufficient to yield at least one primer pair meeting the analytic requirements described herein. Primer pairs are redesigned should an assay fail to meet the success metrics established below.
  • Reference gene selection Eleven endogenous control genes were evaluated and five selected as being the most stable in a panel of lung adenocarcinoma specimens. Five reference genes were developed to normalize for cDNA load. In a prognostic gene expression panel, reference genes are introduced into the panel construction at the multiplex analytic sensitivity stage.
  • Algorithms used for converting melting curve information into molar ratio measurements are known in the art. Briefly, conversion of melting curve data into transcript abundance begins with establishing melting curve parameters for each NT and IS template. Pleiades probe melting curves of samples with either IS or NT template are fit to a variable sloped sigmoid curve, and the resulting Tm and Hill coefficient saved as input parameters for SNAP analysis. Next, the melting curves for each sample-assay combination are fit to a two-sigmoid curve using the parameter inputs defined above, allowing the Bottom_IS and Bottom_NT to be adjusted to minimize the residuals (Figure 19). The fraction NT is calculated from the Bottom_IS and Bottom_NT solutions.
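To make the two-sigmoid fitting step more concrete, here is a minimal sketch under stated assumptions. It models the melting curve as a weighted sum of two sigmoids whose Tm and Hill slope are fixed from single-template runs, solves for the two weights by linear least squares, and reports the NT weight as a fraction of the total. The exact sigmoid form, the treatment of the adjustable "Bottom" parameters as amplitudes, the fraction formula, and all names and numbers are illustrative assumptions rather than the SNAP algorithm itself.

```python
def sigmoid(temp, tm, hill):
    """Assumed melt shape: fraction of probe still hybridized at `temp`
    (falls from 1 toward 0 as temp passes tm)."""
    return 1.0 / (1.0 + 10 ** ((temp - tm) * hill))


def fit_fraction_nt(temps, signal, tm_is, hill_is, tm_nt, hill_nt):
    """Least-squares amplitudes for signal = a_is*S_is + a_nt*S_nt.

    Returns (a_is, a_nt, fraction_nt), with fraction_nt = a_nt / (a_is + a_nt).
    """
    s_is = [sigmoid(t, tm_is, hill_is) for t in temps]
    s_nt = [sigmoid(t, tm_nt, hill_nt) for t in temps]
    # 2x2 normal equations, solved with Cramer's rule.
    a11 = sum(x * x for x in s_is)
    a12 = sum(x * y for x, y in zip(s_is, s_nt))
    a22 = sum(y * y for y in s_nt)
    b1 = sum(x * f for x, f in zip(s_is, signal))
    b2 = sum(y * f for y, f in zip(s_nt, signal))
    det = a11 * a22 - a12 * a12
    a_is = (b1 * a22 - a12 * b2) / det
    a_nt = (a11 * b2 - a12 * b1) / det
    return a_is, a_nt, a_nt / (a_is + a_nt)


# Synthetic, noise-free curve: hypothetical Tm values of 58 and 66 degrees C,
# with the IS contributing three times the signal of the NT.
temps = [40 + 0.5 * i for i in range(81)]
signal = [3.0 * sigmoid(t, 58.0, 0.4) + 1.0 * sigmoid(t, 66.0, 0.4) for t in temps]
a_is, a_nt, frac_nt = fit_fraction_nt(temps, signal, 58.0, 0.4, 66.0, 0.4)
print(round(frac_nt, 3))
```

Fixing the Tm and Hill slope beforehand keeps the per-sample fit linear in the two remaining parameters, which is one plausible reason for establishing those parameters on single-template samples first.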
  • the SNAP protocol involves the distribution of samples into tubes containing high concentrations of internal standard. Following SNAP pre-amplification, product is transferred into the OpenArray for detection. Both activities are possible sources of laboratory template contamination. Standard operating protocols for the manufacture of SNAP reagents and measurement of samples by SNAP minimize the possibility of SNAP reagents and products contaminating the workplace. To minimize the contamination risk, four separate work areas for SNAP, each with equipment reserved for handling the SNAP reagents, are used:
  • a clean hood space is designated for making aliquots of primer and probes.
  • a chemical fume hood is used for handling internal standard DNA at >10^8 copies per μl.
  • the multiplex amplification incorporates Uracil-N-glycosylase treatment prior to thermal cycling to degrade Uracil containing amplicons. All SNAP amplifications incorporate Uracil into the DNA template, thus reducing the chance that these products will interfere with SNAP measurements.
  • the SNAP OpenArray detection step does not require UNG treatment as the samples added to the OpenArray are insensitive to low levels of contamination. For example, ten starting copies in the cDNA, the approximate LOD for SNAP, end up being amplified to >10^6 copies/μl; this sample would only be affected by very high levels of contamination (e.g., >10^5/μl).
  • FFPE tissue blocks from these patients are stored in the URMC Department of Pathology archives. Due to the relatively recent surgery dates, all of these tissue blocks are stored on-site at URMC and are therefore easily accessible.
  • the specific patient cohort has already been identified and all relevant clinical information has been retrieved and reviewed by a study coordinator with assistance from two Thoracic surgery attendings, Dr. Carolyn Jones and Dr. Daniel Raymond.
  • Clinical covariates already determined include tumor histology, pathologic TNM tumor staging, neoadjuvant and adjuvant therapies, surgical resection type, gender, smoking history (never, past or active smoker at time of surgery), age at surgery and history of prior cancers.
  • the identified cohort is considerably larger than actually needed for adequate statistical power in the study (discussed below).
  • Inclusion criteria for this study are essentially the same as those used in a previous lung adenocarcinoma study (3).
  • Patients included are those diagnosed with primary lung adenocarcinoma (all histologic subtypes) who were surgically treated with curative intent (complete resection with negative surgical margins). This includes patients in AJCC6 stage groups I-III. No patients to be studied received preoperative chemotherapy or radiation and there is a minimum of 4-years follow-up available at the time of data analysis. Patients are excluded if they have a history of prior malignant disease or if death occurred within one month of surgery.
  • Tissue cutting and RNA isolation. Once the specific tumor blocks to be analyzed are identified, tissues are handed off for molecular analysis. For each tissue block, ten 5-micron sections are cut into each of two Eppendorf tubes containing RNA isolation buffer. One tube is used for RNA isolation and the other is stored at -80 °C as a backup if required for any reason. Protocols for RNA isolation from FFPE tissues are labor intensive and time consuming (33). Many commercially available kits (e.g., High Pure RNA Paraffin Kit; Roche Applied Science, Indianapolis, IN) are available for RNA isolation from FFPE tissues, and several of these have been tested and compared (36). However, RNA isolation may be performed using minor modifications to the protocol recommended in the High Pure RNA Paraffin Kit.
  • RNA isolation for gene expression studies from FFPE tissues by investigators at Genomic Health (Redwood City, CA). Once isolated, RNA is quantified using a NanoDrop spectrophotometer and the 260/280 nm absorbance ratio is calculated to assess purity. If necessary, RNA is further purified by phenol-chloroform extraction and precipitation, although this step is optional.
  • electrophoretic analysis (e.g., Agilent Bioanalyzer)
  • Madabusi et al. (38) demonstrated that RNA with an RNA Integrity Number (RIN) >1.4 could be successfully used for gene expression analysis. This was also the case in the study by Rebeiro-Silva et al.
  • the Roche RNA isolation kit provided RIN > 1.4 in 100% of the samples tested. Isolated RNA can be stored at -80 °C until needed.
  • cDNA synthesis cDNA is synthesized from 100-500ng of total RNA using a combination of random (octamers) and gene-specific primers.
  • the reverse transcription primer is designed 20-25 bases from the gene specific PCR primers for all genes in the study. These primers are short (10-15 bases) oligonucleotides with an annealing temperature of 30-35 °C.
  • This combination of random and gene-specific priming can significantly improve the detection of gene expression from FFPE tissues, and should increase the sensitivity and reproducibility of the assays.
  • the specific reverse transcriptase to be used may also impact the sensitivity and will be determined in preliminary experiments testing and comparing several enzymes.
  • a risk score cut-off value can be assigned that classifies patients into high and low risk groups for overall survival by determining a risk score that is associated with patient prognosis, either with or without clinical covariates.
  • a primary goal however is to identify at least one approach that provides significant association with prognosis and that can be used to stratify patients into risk groups.
  • three methods of increasing complexity to predict overall survival in the cohort of up to 187 lung adenocarcinomas are available.
  • a first method, the compound covariate method is a standard prediction approach that has been used successfully in microarray studies with specific applications to lung adenocarcinoma (1).
  • Alternative approaches are also available that have characteristics unique from the compound covariate method and from each other.
  • a second method semi-supervised clustering (39) exploits the correlation among groups of individual genes.
  • a third method, random forests, allows for nonlinear effects and interactions among predictors.
  • the compound covariate method is a linear combination of all genes in the gene signature multiplied by their respective Cox proportional hazards coefficient. Specifically, if q genes $G_1, G_2, G_3, \ldots, G_q$ are selected for the gene signature then each gene will have an associated regression coefficient $\beta_1, \beta_2, \beta_3, \ldots, \beta_q$. The combination of these coefficients and the individual gene expression for patient j for each of the q genes will yield a gene signature ($GS_j$) for the jth patient of
  • $GS_j = \beta_1 G_{1j} + \beta_2 G_{2j} + \beta_3 G_{3j} + \ldots + \beta_q G_{qj}$
  • the gene signature GS j is the predicted log relative hazard of death for patient j.
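The compound covariate score and the median split can be sketched as follows. The coefficients and expression values are invented for illustration, and the function names are hypothetical; in practice the coefficients would come from fitted Cox proportional hazards models.

```python
from statistics import median


def gene_signature(coefs, expression):
    """Compound covariate score: sum of coefficient * expression over the genes."""
    return sum(b * g for b, g in zip(coefs, expression))


def classify_by_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical Cox coefficients for q = 3 genes and expression for 4 patients.
coefs = [0.8, -0.5, 0.3]
patients = [
    [1.0, 2.0, 0.5],
    [2.0, 0.5, 1.0],
    [0.2, 1.5, 0.1],
    [1.5, 1.0, 2.0],
]
scores = [gene_signature(coefs, p) for p in patients]
groups = classify_by_median(scores)
print(list(zip(scores, groups)))
```

Because the score is a predicted log relative hazard, a higher value corresponds to a higher predicted risk of death, so patients above the median form the high-risk group.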
  • the number of genes selected for the gene signature will be based on cross validated risk stratification. Once patient gene signatures are estimated they are sorted and divided into high and low risk groups. The extent of separation of Kaplan-Meier estimates of survival will provide assessment of gene signature prognosis. Genes that are individually associated with survival will be used to supply 5 to 10 lists of the top n genes for evaluation. For example a potential list of candidate gene lists may consider the top 5, 10, 15, 20, 25, etc genes. Each set of top genes will be used to dichotomize the survival data at the median. The list with the best leave- 10-out cross-validated separation of high and low risk Kaplan-Meier curves will be selected.
  • a final compound covariate gene signature will be constructed.
  • Cross validation of the compound covariate predictor will use bootstrap resampling. Specifically, 200-500 bootstrap samples will be drawn, the gene signature generated, sorted and divided into 2 equal groups. Then the bootstrap gene signature will be applied to the original data and again divided into 2 risk groups. The agreement in classification accuracy over the B samples will constitute a bootstrap cross-validated classification accuracy.
  • bootstrap measures of model validation will be estimated such as R², the proportion of explained variance; slope calibration, the extent to which the slope would have to be changed so predicted survival matches observed survival; and Somers' D, a concordance coefficient for binary data.
  • Semi-supervised clustering combines supervised learning, in which the patient status is known (vital status, survival time, etc.) and is used to find a classifier, and unsupervised learning, in which a classifier ignores patient status.
  • the "semi-supervised" method of Bair and Tibshirani (39) applies principal components analysis to the set of individual genes selected at the univariate level in method 1 (Compound Covariate) due to their association with survival. Principal components analysis reduces dimensionality by selecting gene subsets for one principal component that are correlated with one another but uncorrelated with genes in other principal components.
  • the random forests method is a competitor to both the compound covariate and semi-supervised methods. It would be expected to give results which are comparable to the compound covariate method except if there are substantial nonlinear effects and unexpected interactions among genes. Because recursive partitioning is more sensitive to these phenomena it may provide improved prediction over the compound covariate.
  • Recursive partitioning also known as classification and regression trees
  • recursive partitioning recursively searches among all covariates for the cutpoint and single covariate providing the maximum separation between groups.
  • To adapt recursive partitioning to right-censored survival data one scales the data to the parametric exponential distribution and uses the resulting cumulative hazard as the dependent variable.
  • Recursive partitioning classifies each patient into a distinct risk group based upon the similarity of their relative risk to the average relative risk of the terminal nodes. Random forests provide a more robust classification tree. A bootstrap sample (with replacement of the data) is obtained. Prior to each split, a random sample of the predictors is obtained to generate the next split. Patients are placed in classes based on similarity to the average node. Trees can be evaluated by measuring the separation of the Kaplan-Meier survival plots for each terminal node with a log rank test.
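The log rank comparison used to evaluate separation between groups can be illustrated with a small, dependency-free sketch. It computes the standard two-group log-rank chi-square statistic (one degree of freedom) from survival times and event indicators. The function name and example data are hypothetical, and comparing more than two terminal nodes would need the multi-group form of the test.

```python
def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic.

    events_* hold 1 for an observed death and 0 for a censored time.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    diff = 0.0  # observed minus expected deaths in group A
    var = 0.0   # hypergeometric variance accumulated over event times
    for t in event_times:
        at_risk = [(ti, e, g) for ti, e, g in data if ti >= t]
        n = len(at_risk)
        n_a = sum(1 for _, _, g in at_risk if g == 0)
        d = sum(1 for ti, e, _ in at_risk if ti == t and e)
        d_a = sum(1 for ti, e, g in at_risk if ti == t and e and g == 0)
        diff += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    return diff * diff / var if var > 0 else 0.0


# Hypothetical survival times in years: group A dies early, group B later.
early = logrank_chi2([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
same = logrank_chi2([1, 2, 3], [1, 1, 1], [1, 2, 3], [1, 1, 1])
print(round(early, 3), same)
```

A larger statistic indicates wider separation of the Kaplan-Meier curves; identical groups give a statistic of zero.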
  • Wolmark N. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004 December 30;351(27):2817-26.
  • CEBPG regulates ERCC5/XPG expression in human bronchial epithelial cells and this regulation is modified by E2F1/YY1 interactions. Carcinogenesis 2007 December;28(12):2552-9.
  • Reverse transcription-competitive multiplex PCR improves quantification of mRNA in clinical samples—application to the low abundance CFTR mRNA.
  • Clin Chem 1999 May;45(5):619-24. (28) Harr MW, Graves TG, Crawford EL, Warner KA, Reed CA, Willey JC. Variation in transcriptional regulation of cyclin dependent kinase inhibitor p21waf1/cip1 among human bronchogenic carcinomas. Mol Cancer 2005;4:23.
  • Bair E, Tibshirani R. Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol 2004 April;2(4):E108.

Abstract

The invention features compositions and methods that are useful for the detection of multiple nucleic acid targets in a single sample.

Description

SYSTEM FOR IDENTIFICATION OF MULTIPLE NUCLEIC ACID TARGETS IN A SINGLE SAMPLE AND USE THEREOF
CROSS-REFERENCE TO RELATED APPLICATION This application claims the benefit of U.S. Provisional Application No.:
61/105,701, filed on October 15, 2008, the entire contents of which are incorporated herein by reference.
STATEMENT OF RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH
This work was supported, in part, by National Cancer Institute (NCI) grant no. 1R21CA132806-01. The Government has certain rights in the invention.
BACKGROUND OF THE INVENTION Molecular characterization of cancer, in particular by gene expression profiling has great potential to improve prognosis, therapeutic selection and clinical outcomes by transforming them from population-based risk assessment and empirical treatment to a predictive, individualized model. Comprehensive clinical cancer characterization will need to encompass the diversity of genetic alterations that accompany neoplastic development. Promising prognostic tests for sub classes of breast, colorectal and lung cancers require measurement of tens of transcripts, and as advantages of further classification become known, gene expression profiling will require additional transcript markers to fully capture tumor diversity. However, molecular diagnostics and pharmaceutical companies, clinicians and FDA are struggling with how to deploy gene expression profiling (GEP) based diagnostics. The most significant barriers to GEP diagnostic implementation are reliable clinical sample RNA isolation from tumor biopsies or surgical resections, and accurate and robust gene transcript quantification.
Commercially available platforms for measuring gene expression, including those based on hybridization arrays or qPCR methods lacking an internal standard, do not have the inter-site concordance required for accurate outcome score calculation. Hybridization arrays are quite appealing for their ability to collect many measurements per sample. However, they suffer from low assay specificity, poor sensitivity, narrow dynamic range, poor signal-to-analyte response and complex sample processing, making hybridization microarrays less attractive as a platform for diagnostic gene expression profiling relative to their well-established research utility. QPCR has excellent lower detection threshold, signal-to-analyte response, and dynamic range. However, most commercially available real-time qPCR platforms are limited in their suitability for diagnostics due to instrument-to-instrument variability, insufficient quality control (including lack of control for PCR inhibitors) and fluidic complexity. Most methods rely on replicate measurements to provide some control for false negative and positive results; however, this approach requires additional sample consumption and does not control for sample-specific interfering substances such as assay specific inhibitors. This problem is exacerbated by the fact that RNA yield is often low from clinical samples, especially formalin fixed paraffin embedded tissues, and this low RNA yield limits the number of assays per test. Furthermore, more tests consume expensive reagents and entail complicated workflows that require highly skilled labor, making the test expensive and possibly slowing widespread adoption and deployment, despite its intrinsic clinical value.
Along with QPCR, hybridization microarrays exhibit similar performance limitations that make them less than ideal as a GEP diagnostic platform.
Examples of multivariate GEP tests showing both promise and limitations are two high profile commercial GEP breast cancer prognostic tests, MamaPrint, a 70 gene microarray test and Oncotype Dx, a 21 gene real-time QPCR test. Each of these tests provides sufficient clinical accuracy to improve breast cancer patient outcome enabling selection of the best treatment plan on an individual basis. However, these tests cannot be widely deployed in a kit format and must be performed at their respective company's laboratories as a clinical testing service because their reliability depends on the specific expertise and processes developed by each company for each test and are therefore not exportable to other laboratories. A clear benefit to improving human health care capabilities would be a GEP platform that provides the analytic sensitivity and linear dynamic range of QPCR and assay scalability of microarrays while minimizing inter- laboratory analytical variation, cost and sample consumption, and enabling analysis of FFPE samples. This would enable widespread deployment in regional pathology laboratories for clinical diagnostic testing.
The detection of multiple nucleic acid targets in a single sample has the potential to improve diagnostic testing of biologic samples. Despite rapid progress in polynucleotide detection methods, current methods to detect multiple nucleic acid targets rely on multiple single diagnostic tests or on a high level of specific expertise for development and implementation of a single diagnostic test. Thus, there is an urgent need for diagnostic methods based on the detection of multiple nucleic acid targets in a single sample that are clinically deployable and have increased analytic sensitivity, simplified workflow, and improved quality control measures.
SUMMARY OF THE INVENTION
As described below, the present invention features compositions and methods that provide for gene expression profiling using a pre-amplification step that enhances detection of multiple nucleic acid targets in a single sample, such as a biologic sample, in a nanoplatform system.
In one aspect, the invention provides a method for detecting a gene expression profile in a biological sample, the method involving the steps of preamplifying a biomarker in the presence of a defined competitive reference biomarker; individually exponentially amplifying the biomarker in the presence of the reference biomarker in a reaction volume of at least about 1, 10, 100, 500, or 1000 nl; identifying binding of a first detectable nucleic acid probe to the biomarker and any one of: binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and the melting temperature of the first detectable nucleic acid probe to the biomarker; and determining, respectively, any one of: the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker, where the half maximal effective concentration is used to determine the quantity of the biomarker in the sample. 
In another aspect, the invention provides a method for detecting a gene expression profile in a biological sample, the method involving the steps of preamplifying a biomarker in the presence of a defined competitive reference biomarker; individually exponentially amplifying the biomarker in the presence of the reference biomarker in a set of reactions, each reaction having a volume of at least about 1, 10, 100, 500, or 1000 nl; identifying binding of a first detectable nucleic acid probe to the biomarker and any one of: binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and the melting temperature of the first detectable nucleic acid probe to the biomarker; determining, respectively, any one of: the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker; and plotting the ratio against the molar ratio of the reference nucleic acid for the set of reactions, where the half maximal effective concentration is used to determine the quantity of the biomarker in the sample.
In yet another aspect, the invention provides a method for identifying or monitoring a subject as having a pathological condition characterized by an alteration in gene expression, the method involving the steps of preamplifying a biomarker in the presence of a defined competitive reference biomarker; individually exponentially amplifying the biomarker in the presence of the reference biomarker in a reaction volume of at least about 1, 10, 100, 500, or 1000 nl; and detecting the presence or absence of the biomarker and the corresponding reference biomarker, where detection of the biomarker indicates that the biomarker is present; and failure to detect the biomarker when the corresponding reference biomarker is detected indicates that the biomarker is absent from the sample.
In still another aspect, the invention provides a method for detecting two or more target nucleic acid molecules in a single sample, the method involving the steps of preamplifying the target nucleic acid molecules in the presence of a defined reference nucleic acid molecule; individually exponentially amplifying each of the target nucleic acid molecules in the presence of the reference nucleic acid molecule in a reaction volume of at least about 1, 10, 100, 500, or 1000 nl; and detecting the presence or absence of the target nucleic acid molecules and the reference nucleic acid molecule, where detection of the target nucleic acid molecules indicates that the target nucleic acid is present; and failure to detect the target nucleic acid molecule when the reference nucleic acid molecule is detected indicates that the target nucleic acid molecule is absent from the sample.
In yet another aspect, the invention provides a method for detecting two or more target nucleic acid molecules in a single sample, the method involving the steps of preamplifying a target nucleic acid molecule in the presence of a defined reference nucleic acid molecule for each target; individually exponentially amplifying each of the target nucleic acid molecules in the presence of the reference nucleic acid molecule in a set of reactions, each reaction having a volume of at least about 1, 10, 100, 500, or 1000 nl; identifying binding of a first detectable nucleic acid probe to the target nucleic acid molecule and any one of: binding of a second detectable nucleic acid probe to the corresponding reference nucleic acid molecule, and the melting temperature of the first detectable nucleic acid probe to the target nucleic acid; determining, respectively, any one of: the ratio of binding to the target nucleic acid and binding to the corresponding reference nucleic acid, and the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker; and plotting the ratio against the molar ratio of the reference nucleic acid for the set of reactions, where the half maximal effective concentration is used to determine the quantity of the target nucleic acid in the sample.
In still another aspect, the invention provides a method for characterizing cancer, the method involving the steps of preamplifying a biomarker in the presence of a defined reference biomarker in a set of reactions, where the biomarker is selected from the group consisting of ERBB3, LCK, DUSP6, STAT1, MMD, CPEB4, RNF4, STAT2, NF1, FRAP1, DLG2, IRF4, ANXA5, HMMR, HGF, and ZNF264; individually exponentially amplifying the biomarker in a reaction having a volume of at least about 1, 10, 100, 500, or 1000 nl; identifying binding of a first detectable nucleic acid probe to the biomarker and any one of: binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and the melting temperature of the first detectable nucleic acid probe to the biomarker; determining, respectively, any one of: the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker; and plotting the ratio against the molar ratio of the reference nucleic acid for the set of reactions, where the half maximal effective concentration is used to determine the quantity of the biomarker in the sample.
In yet another aspect, the invention provides a nanofluidic system having a high density array of nanoliter-scale through-holes having a 10-50 nl reaction volume containing a standardized mixture of internal standards, at least two (e.g., 2, 3, 4, 5, etc.) pairs of detectable target nucleic acid probes, each of which is complementary to a target nucleic acid sequence, and a pair of detectable reference nucleic acid probes complementary to a competitive template internal standard, where each primer pair coamplifies a template and its respective competitive internal standard template with equal efficiency.
In still another aspect, the invention provides a kit containing a high density array of nanoliter-scale through-holes having a 10-50 nl reaction volume containing a standardized mixture of internal standards, at least two (e.g., 2, 3, 4, 5, etc.) pairs of detectable target nucleic acid probes, each of which is complementary to a target nucleic acid sequence, and a pair of detectable reference nucleic acid probes complementary to a competitive template internal standard, where each primer pair coamplifies a template and its respective competitive internal standard template with equal efficiency, and written directions for using the kit to detect a gene expression profile in a biological sample.
In various embodiments of any of the above aspects, the sample is tested for a condition selected from the group consisting of neoplasia, inflammation, pathogen infection, immune response, sepsis, the presence of liver metabolites, and the presence of a genetically modified organism. In specific embodiments, detecting the neoplasia is for diagnosing a neoplasia, characterizing a neoplasia to identify tissue of origin, monitoring response of neoplasia to treatment, or predicting the risk of developing a neoplasia. In various embodiments of any of the above aspects, the target nucleic acid or biomarker is RNA or DNA. In various embodiments of any of the above aspects, the step of preamplifying a biomarker in the presence of a defined competitive reference biomarker involves preamplifying the target nucleic acids using primer sets specific for the target nucleic acids. In specific embodiments, the primer set used in the step for preamplifying the target nucleic acid molecules is used in the step for amplifying the target nucleic acid molecules. In other specific embodiments, a first set of primers is used in the step for preamplifying the target nucleic acid molecules and a second set of primers is used in the step for amplifying the target nucleic acid molecules.
In various embodiments of any of the above aspects, the step of preamplifying a biomarker in the presence of a defined competitive reference biomarker involves reverse transcriptase polymerase chain reaction (RT-PCR). In various embodiments of any of the above aspects, the steps of individually exponentially amplifying the biomarker in the presence of the reference biomarker in a set of reactions, each reaction having a volume of at least about 1, 10, 100, 500, or 1000 nl and of identifying binding of a first detectable nucleic acid probe to the biomarker and any one of: binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and the melting temperature of the first detectable nucleic acid probe to the biomarker, involve real-time PCR. In various embodiments of any of the above aspects, the nucleic acid probe to the target nucleic acid and the nucleic acid probe to the corresponding reference nucleic acid are fluorogenic. In various embodiments of any of the above aspects, the reaction occurs in a through-hole of a platen. In various embodiments of any of the above aspects, the target nucleic acid is derived from a bacterium, a virus, a spore, or a eukaryotic cell. In specific embodiments, the eukaryotic cell is a neoplastic cell derived from lung, breast, prostate, thyroid, and pancreas.
In various embodiments of any of the above aspects, the target nucleic acid molecule is derived from a bacterial pathogen selected from the list consisting of Aerobacter, Aeromonas, Acinetobacter, Actinomyces israelii, Agrobacterium, Bacillus, Bacillus anthracis, Bacteroides, Bartonella, Bordetella, Bortella, Borrelia, Brucella, Burkholderia, Calymmatobacterium, Campylobacter, Citrobacter, Clostridium, Clostridium perfringens, Clostridium tetani, Corynebacterium, Corynebacterium diphtheriae, Corynebacterium sp., Enterobacter, Enterobacter aerogenes, Enterococcus, Erysipelothrix rhusiopathiae, Escherichia, Francisella, Fusobacterium nucleatum, Gardnerella, Haemophilus, Hafnia, Helicobacter, Klebsiella, Klebsiella pneumoniae, Lactobacillus, Legionella, Leptospira, Listeria, Morganella, Moraxella, Mycobacterium, Neisseria, Pasteurella, Pasteurella multocida, Proteus, Providencia, Pseudomonas, Rickettsia, Salmonella, Serratia, Shigella, Staphylococcus, Stenotrophomonas, Streptococcus, Streptobacillus moniliformis, Treponema, Treponema pallidum, Treponema pertenue, Xanthomonas, Vibrio, and Yersinia. In specific embodiments, the bacterial pathogen is antibiotic resistant.
In various embodiments of any of the above aspects, the target nucleic acid molecule is derived from a virus selected from the list consisting of hepatitis C virus, human immunodeficiency virus, Retrovirus, Picornavirus, polio virus, hepatitis A virus, Enterovirus, human Coxsackie virus, rhinovirus, echovirus, Calicivirus, Togavirus, equine encephalitis virus, rubella virus, Flavivirus, dengue virus, encephalitis virus, yellow fever virus, Coronavirus, Rhabdovirus, vesicular stomatitis virus, rabies virus, Filovirus, ebola virus, Paramyxovirus, parainfluenza virus, mumps virus, measles virus, respiratory syncytial virus, Orthomyxovirus, influenza virus, Hantaan virus, bunya virus, phlebovirus, Nairo virus, Arena virus, hemorrhagic fever virus, reovirus, orbivirus, Rotavirus, Birnavirus, Hepadnavirus, hepatitis B virus, Parvovirus, Papovavirus, papilloma virus, polyoma virus, adenovirus, herpes simplex virus 1, herpes simplex virus 2, varicella zoster virus, cytomegalovirus, herpes virus, variola virus, vaccinia virus, pox virus, African swine fever virus, Norwalk virus, and astrovirus. In various embodiments of any of the above aspects, the sample is a biological fluid or tissue sample derived from a patient. In specific embodiments, the sample is selected from blood, serum, urine, semen and saliva. In other embodiments, the tissue sample is selected from tissue biopsy, formaldehyde fixed paraffin embedded tissue, fine needle aspirate (FNA) biopsy and laser capture micro-dissected samples. In various embodiments of any of the above aspects, the sample contains at least about 1-1000 (e.g., 1-10, 1-100, 1-500) cells.
In various embodiments of any of the above aspects, the sample contains at least about 1-1000 (e.g., 1-10, 1-100, 1-500) ng of RNA. In various embodiments of any of the above aspects, one target nucleic acid molecule can be detected in at least about 50, 25, or 10 copies per reaction or when at least about 50-100 copies of a competing target nucleic acid are present. In various embodiments of any of the above aspects, the method detects a target present at about 1-100 copies/reaction. In various embodiments of any of the above aspects, the method detects a target present at about 5-50 starting copies/reaction. In various embodiments of any of the above aspects, the method detects a target present at about 10 starting copies/reaction. In various embodiments of any of the above aspects, the method detects a target present at about 1 starting copy/reaction.
In various embodiments of any of the above aspects, an absolute gene copy number is generated by curve fitting a plot of the ratio of the native/standard signals vs. standard concentration and the concentration (EC50) is used to determine the quantity of the target nucleic acid in the sample. In various embodiments of any of the above aspects, an absolute gene copy number is generated by curve fitting a plot of the ratio of the signal/melting temperature of the detectable nucleic acid probe vs. standard concentration and the concentration (EC50) is used to determine the quantity of the target nucleic acid in the sample. In various embodiments of any of the above aspects, the detectable target and reference nucleic acid probes have a distinct fluorometric dye that provides for the separate detection of amplified target and internal standards. In various embodiments, the tissue sample is selected from the group consisting of tissue biopsy, formaldehyde fixed paraffin embedded (FFPE) tissue, fine needle aspirate (FNA) biopsy and laser capture micro-dissected samples. In specific embodiments, the sample contains at least about 1-1000 (e.g., 1-10, 1-100, 1-500) cells. In specific embodiments, the sample contains at least about 1-1000 (e.g., 1-10, 1-100, 1-500) ng of RNA.
Other features and advantages of the invention will be apparent from the detailed description, and from the claims.
Definitions
Unless defined otherwise, all technical and scientific terms used herein have the meaning commonly understood by a person skilled in the art to which this invention belongs. The following references provide one of skill with a general definition of many of the terms used in this invention: Singleton et al., Dictionary of Microbiology and Molecular Biology (2nd ed. 1994); The Cambridge Dictionary of Science and Technology (Walker ed., 1988); The Glossary of Genetics, 5th Ed., R. Rieger et al. (eds.), Springer Verlag (1991); and Hale & Margham, The Harper Collins Dictionary of Biology (1991). As used herein, the following terms have the meanings ascribed to them below, unless specified otherwise.
By "alteration" is meant an increase or decrease. An alteration may be by as little as 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, or by 40%, 50%, 60%, or even by as much as 75%, 80%, 90%, or 100%.
By "amplify" is meant to increase the number of copies of a molecule. In one example, the polymerase chain reaction (PCR) is used to amplify nucleic acids. As used herein, "preamplify" is meant to increase the number of copies of a molecule (e.g., a biomarker or nucleic acid molecule) before exponentially amplifying the molecule. For example, preamplification may involve a linear increase in the number of copies of a molecule.
By "binding" is meant having a physicochemical affinity for a molecule. Binding is measured by any of the methods of the invention, e.g., hybridization of a detectable nucleic acid probe, such as a TaqMan-based probe or a Pleiades-based probe. By "biological sample" is meant any tissue, cell, fluid, or other material derived from an organism (e.g., human subject).
By "biomarker" is meant a polypeptide or polynucleotide that is differentially present in a sample taken from a subject having a disease or disorder relative to a reference. Exemplary biomarkers include nucleic acid molecules. By "detect" is meant identifying the presence, absence, or level of an agent.
By "detectable" is meant a moiety that when linked to a molecule of interest renders the latter detectable. Such detection may be via spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include radioactive isotopes, magnetic beads, metallic beads, colloidal particles, fluorescent dyes, electron-dense reagents, enzymes (for example, as commonly used in an ELISA), biotin, digoxigenin, or haptens.
By "half-maximal effective concentration" or "EC50" is meant the concentration that produces a response halfway between the baseline and maximum of the ratio of target molecule to reference molecule, which corresponds to the inflection point of a sigmoidal curve fit when the ratio of target molecule to internal standard is plotted against the molar ratio of the reference molecule.
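As an illustration of the EC50 definition above, the following Python sketch estimates an EC50 from a titration in which a fixed amount of native template is co-amplified with increasing amounts of internal standard. It is a minimal sketch under stated assumptions and is not the curve-fitting procedure of the invention: the function name `estimate_ec50`, the example numbers, and the simplification that the sigmoid has a baseline of 0 and a maximum of 1 (so that a straight-line fit on a logit scale can stand in for a full sigmoidal fit) are all assumptions introduced here for illustration.

```python
import math


def estimate_ec50(is_copies, fraction_nt):
    """Estimate the EC50 (internal standard copies at which fraction NT = 0.5).

    Assumption (not from the source): the response follows a sigmoid with
    baseline 0 and maximum 1, so log10(f / (1 - f)) is linear in log10(IS).
    Points with f <= 0 or f >= 1 carry no logit information and are skipped.
    """
    xs, ys = [], []
    for copies, f in zip(is_copies, fraction_nt):
        if copies > 0 and 0.0 < f < 1.0:
            xs.append(math.log10(copies))
            ys.append(math.log10(f / (1.0 - f)))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two usable titration points")

    # Ordinary least-squares line: y = slope * x + intercept.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0 or sxy == 0.0:
        raise ValueError("titration points do not define a slope")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Fraction NT = 0.5 where the logit is zero, i.e. x = -intercept / slope.
    return 10.0 ** (-intercept / slope)


if __name__ == "__main__":
    # Hypothetical titration: 1000 native copies against 10^1..10^6 IS copies,
    # idealized so that fraction NT = NT / (NT + IS).
    standards = [10.0 ** k for k in range(1, 7)]
    fractions = [1000.0 / (1000.0 + s) for s in standards]
    print(estimate_ec50(standards, fractions))
```

In this idealized case the fraction of native template falls to one half exactly when the internal standard copies equal the native copies, which is why the EC50 can be read as the quantity of the target in the sample.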
By "internal standard" is meant a competitive template or molecule that is amplified in the presence of a native template or molecule. By "gene expression profile" is meant a characterization of the expression or expression level of two or more polynucleotides.
By "melting temperature" is meant the lowest temperature at which a detection probe does not bind or hybridize to a target nucleic acid. The melting temperature is determined by the inflection point of the melting curve profile, which measures hybridization as a function of temperature.
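The inflection-point reading of a melting curve described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `melting_temperature` and the use of central finite differences on evenly sampled data are assumptions made here, not a description of any instrument software. The steepest point of the curve is taken regardless of sign, because whether fluorescence rises or falls on melting depends on the probe chemistry.

```python
def melting_temperature(temps, fluorescence):
    """Return the sampled temperature where |dF/dT| is largest.

    Uses central differences at interior points, so the first and last
    samples are never returned. Assumes temps are strictly increasing.
    """
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need at least three matched (temperature, signal) points")

    best_tm = None
    best_slope = -1.0
    for i in range(1, len(temps) - 1):
        rise = fluorescence[i + 1] - fluorescence[i - 1]
        run = temps[i + 1] - temps[i - 1]
        slope = abs(rise / run)
        if slope > best_slope:
            best_tm = temps[i]
            best_slope = slope
    return best_tm
```

Real melting data are noisy, so a practical analysis would typically smooth the curve or fit a model before locating the inflection point; that step is omitted here.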
By "native" is meant endogenous, or originating in a sample.
As used herein, a "nucleic acid or oligonucleotide probe" is defined as a nucleic acid capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. It will be understood by one of skill in the art that probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. The probes are preferably directly labeled, for example with isotopes, chromophores, lumiphores, or chromogens, or indirectly labeled, for example with biotin to which a streptavidin complex may later bind. By assaying for the presence or absence of the probe, one can detect the presence or absence of a target gene of interest.
By "platen" is meant a device having a high-density array of holes for holding and/or analyzing a plurality of liquid samples, e.g., described in US Patent Nos. 6,716,629; 6,027,873; 6,306,578; or 6,436,632, all of which are herein incorporated by reference.
By "reference" is meant a standard or control condition. As used herein, a "competitive reference biomarker" is a reference biomarker that competes with the biomarker of interest in a chemical reaction (e.g., competes with the biomarker of interest for probe binding).
The phrase "selectively (or specifically) hybridizes to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent hybridization conditions when that sequence is present in a complex mixture (for example, total cellular or library DNA or RNA). By "standardized mixture of internal standards" is meant a mixture that contains internal standards having a defined concentration or a defined number of molecules of the internal standards.
By "target nucleic acid molecule" is meant a nucleic acid or biomarker of the sample that is to be detected.
BRIEF DESCRIPTION OF THE DRAWINGS
Figures 1A-1C are schematic diagrams showing a workflow for the detection of multiple nucleic acid targets in a single sample, e.g., a formalin fixed paraffin embedded (FFPE) sample.
Figure 2 is a graph, which shows that preamplification does not increase replicate variation within and across separate experiments. Levels of three poorly expressed genes (DPP4, SCNN1A, and WNT1) were measured in Stratagene Universal Human Reference RNA (SUHRRNA) under multiple conditions: with or without preamplification (Pre-Amp), with 1/5 or 1/10 typical primer concentration during pre-amplification (1/5 or 1/10 primers, respectively), or with a 100-fold dilution prior to the 2nd round of amplification (1/100 dil). The amount of cDNA used in each reaction was equivalent to the amount typically derived from 10 ng of RNA. The PCR reaction volumes were 20 µl.
Figure 3 is a graph, which shows that two step Standardized RT-PCR (StaRT-PCR) allows significant decrease of sample consumption while increasing the number of target nucleic acids that can be assayed per sample. Expression levels of fourteen genes in Stratagene Universal Human Reference RNA (SUHRRNA) were measured with and without preamplification. At least three replicate measurements were performed for all but measurement of 9SF5 with preamplification. A mixture of 96 primers was used in the preamplification step.
Figures 4A-4B are graphs, which show that analytical variation was less than biological variation among formalin fixed paraffin embedded (FFPE) RNA samples and matched fresh frozen (FF) RNA samples as assayed by Standardized RT-PCR (StaRT-PCR). Figure 4A shows the ratio of transcript levels for two genes measured in matched pairs of formalin fixed paraffin embedded (FFPE) and in fresh frozen (FF) samples. Figure 4B shows the measure of degradation as determined by the number of β-actin (ACTB) molecules obtained per 1 ng of RNA during reverse transcription, and the difference in Gene A/Gene B ratio for the matched pairs is related to this measure of RNA degradation.
Figures 5A-5C are graphs, which show that nanoliter-scale PCR using the OpenArray® system can detect differences in gene expression between normal breast tissue samples and breast tumor samples and has less analytic variation compared to biological variation. cDNA was generated by using random hexamers to amplify Total RNA (Clontech). A human kinase OpenArray® plate (508 kinase genes and 13 reference genes for normalization) was loaded at 1 ng RNA equivalence per hole in LightCycler FastStart DNA Master SYBR Green I and subjected to 32 thermal cycles. Array images were collected every cycle, hole intensities plotted against cycle number, and cycle threshold (Ct) automatically calculated by BioTrove NT Cycler software. Figures 5A and 5C show technical replicate performance when comparing cycle number and Ct in normal breast tissues and breast tumor samples. Figure 5B shows variability between sets of matched samples when comparing tumor matched normal samples and breast tumor samples.
Figures 6A-6F show that standardized Nano Array PCR was able to detect low levels of multiple target nucleic acids and distinguish them from corresponding internal standard nucleic acids. The points on each plot represent technical replicates of initial two pre-amplification PCR. In each reaction, ten (10) copies of the native template nucleic acid or internal standard nucleic acid were used. Figure 6A shows detection of Dusp6 native template (NT) (left panel) and Dusp6 internal standard (IS) (right panel). Figure 6B shows detection of erbb3 native template (left panel) and erbb3 internal standard (right panel). Figure 6C shows detection of lck native template (left panel) and lck internal standard (right panel). Figure 6D shows detection of mmdlO native template (left panel) and mmdlO internal standard (right panel). Figure 6E shows detection of stat native template (left panel) and stat internal standard (right panel). Figure 6F shows detection of tbp native template (left panel) and tbp internal standard (right panel). FAM on the y-axis is the raw sample fluorescence data of the native template (NT) detected using a 6-carboxyfluorescein (FAM) labeled probe and VIC on the x-axis is the raw sample fluorescence data of the internal standard (IS) detected using a VIC (proprietary probe to Applera/Applied Biosystems) labeled probe.
Figures 7A-7E show that one template at 10 copies can be detected by OpenArray® StaRT-PCR when up to 100 copies of a competitive template are present in the reaction. Figure 7A shows the detection of 10 copies of a template in the absence of a competitive template. Figure 7B shows the detection of 10 copies of a template when 1 copy of a competitive template is present in the reaction. Figure 7C shows the detection of 10 copies of a template when 10 copies of a competitive template are present in the reaction. Figure 7D shows the detection of 10 copies of a template when 100 copies of a competitive template are present in the reaction. Figure 7E shows the results of a reaction in which neither the native template nor the competitive template nucleic acid is present in the reaction, simulating a failed PCR.
Figure 8 is a graph, which shows that little variation was observed among replicate high density, high throughput Standardized Nano Array PCR experiments. To simulate native and control template titration, human genomic heterozygous DNA was used to represent the 1:1 (log10 = 0) sample, the -1 and 1 samples were mixtures of homozygous DNA, and the remaining samples were homozygous replicates. Figure inset depicts raw sample fluorescence of the data. FAM on the y-axis is the raw sample fluorescence data of the native template (NT) detected using a 6-carboxyfluorescein (FAM) labeled probe and VIC on the x-axis is the raw sample fluorescence data of the internal standard (IS) detected using a VIC (proprietary probe to Applera/Applied Biosystems) labeled probe. Kinase gene expression comparison matched breast tumor/normal tissue. cDNA was generated using random hexamers to amplify a commercially available total RNA sample (Clontech Total RNA). The human kinase OpenArray® plate (608 kinase genes and 13 reference genes for normalization) was loaded at 1 ng RNA equivalence per hole in LightCycler FastStart DNA Master SYBR Green I, subjected to 32 thermal cycles with array images collected every cycle, hole intensities plotted against cycle number and cycle threshold (Ct) automatically calculated by BioTrove NT Cycler software.
Figures 9A and 9B are schematic diagrams showing a workflow for the detection of multiple nucleic acid targets in a single sample. Figure 9A shows a SNAP workflow using a TaqMan single nucleotide polymorphism (SNP) Assay in the OpenArray® platform (OA) in the detection step in the SNAP assay. In the workflow, FAM raw sample fluorescence data of the native template (NT) detected using a 6-carboxyfluorescein (FAM) labeled probe is graphed against VIC raw sample fluorescence data of the internal standard (IS) detected using a VIC (proprietary probe to Applera/Applied Biosystems) labeled probe. Figure 9B shows a SNAP workflow using Pleiades probes in the OpenArray® platform (OA) in the detection step in the SNAP assay. In the workflow, FAM raw sample fluorescence data of the native template (NT) detected using a 6-carboxyfluorescein (FAM) labeled probe is graphed against temperature.
Figure 10 is a graph that shows the amount of variation in the measurement of numbers of copies of the target sequences DUSP6, ERBB3, LCK, MMD, STAT1, and TBP1 when performed in replicate in the SNAP assay.
Figures 11A and 11B are graphs that show the determination of the number of copies of a biomarker over several input concentrations of cDNA by SNAP assay using TaqMan probes in the detection step (Figure 11A) or Pleiades probes in the detection step (Figure 11B). The input concentration of cDNA is shown on the x-axis, and the number of copies determined from the half-maximal concentration is shown on the y-axis.
Figures 12A-12D are graphs that show representative real-time PCR standard curves for 4 target genes (Figure 12A, DLG2; Figure 12B, DUSP6; Figure 12C, FRAP1; Figure 12D, HGF) from the 16 lung cancer prognostic gene panel. Data were generated using 3x serial dilutions of cDNA, dual labeled hydrolysis probes and a Roche LightCycler 480 with second derivative analysis to eliminate user bias in analysis settings.
Figures 13A-13C are graphs that depict data obtained in the SNAP process workflow. Samples are amplified in the presence of increasing amounts of internal standard. A fluorescent melting probe is used to characterize the end product ratios between native template and internal standard (Figure 13A) as a response to increasing internal standard (legend, starting copies internal standard). The contribution of native template (blue line) and internal standard (green line) to the melting curve is deconvolved through curve fitting (Figure 13B) to estimate the internal standard and native template molar ratio. Transcript abundance is derived from the EC50 of a sigmoid curve fit to fraction native template vs. log internal standard concentration (Figure 13C).
Figure 14 is a graph showing precision of results obtained by the SNAP assay. Six half log serial dilutions of lung tumor cDNA (160 to 0.50 ng) were prepared and three aliquots frozen. A dilution series was thawed and distributed into eight amplification tubes containing mastermix, log dilutions of internal standards (10^8 to 10^1 per tube) and 80 nM primer pairs to the assays indicated in the table. Following 34 thermal cycles, PCR products were diluted 1000-fold into mastermix, transferred into an OpenArray with each assay (primers and target specific Pleiades probe) preloaded as a single assay in each hole, put through 30 thermal cycles in the OpenArray NT Cycler, and end products measured by melting curve analysis. Melt curve data were converted into EC50 using a MATLAB script that uses the raw melting curve data to calculate the ratio of internal standard to native template products, then calculates the [NT] from an EC50 derived from a sigmoid curve fit to a Fraction NT vs. Log [IS] plot. Values in the table were derived from the average of three sample replicates; linear regression (plot lines) was used to calculate slope and R^2. The CV for each assay was calculated by normalizing the transcript abundance for each sample to the five reference genes (underlined assays), and then combining all results from samples in which total input cDNA was greater than 1000 starting copies.
Figure 15 is a graph showing precision of results obtained by the SNAP assay from an FFPE sample. Lung tumor FFPE RNA was isolated and converted into cDNA. Three serial dilutions (120, 60 and 30 ng) of cDNA (RNA equivalence) were measured by SNAP. The plot shows the transcript abundance (y-axis) of three sample replicates for each assay (x-axis).
Figure 16 is a pie graph showing the distribution of histologic subtype for the study cohort (n = 187).
Figure 17 is a graph showing a hypothetical example of how a prediction interval may be used to identify a range of risk scores (Gene Signature) that may be classified as inconclusive.
Figure 18 is a graph showing the effect of sample number and assay replicates on the lower 95% confidence bound of the ICC.
Figure 19 depicts a two sigmoid curve equation. The molar ratio between IS and NT is estimated by fitting a curve to the top equation. Tm_IS and Tm_NT refer to the melting points of the IS and NT products. Top_IS, Top_NT, Bottom_IS, and Bottom_NT refer to the maximum and minimum fluorescence of each sigmoid curve. The Top_IS, Tm_IS, Tm_NT, HillSlope_IS, and HillSlope_NT parameters are fixed, and the solver reduces the residuals by fitting the melting curve data to Bottom_NT and Bottom_IS. Top_IS = Fmax; T = temperature. The middle equation is used to generate the fraction NT in the sample. The bottom equation is used to estimate the S/N for an assay. FNT(NT) and FNT(IS) indicate fraction NT results for replicate NT or IS samples, respectively. RMS = root mean square, STD = standard deviation, Mean = average.
Figure 20 is a pie graph showing the distribution of surgical resection type for the study cohort (n = 187).
DETAILED DESCRIPTION OF THE INVENTION
As described below, the present invention features compositions and methods that provide for gene expression profiling using a pre-amplification step that enhances detection of multiple nucleic acid targets in a biologic sample, in a nanoplatform system. Advantageously, the present invention provides for the quantitative measurement of gene expression in a low yield test sample and minimizes instrument-to-instrument variation. Gene expression profiles generated in accordance with the methods of the invention are useful for the diagnosis, monitoring, or characterization of virtually any disease characterized by an alteration in gene expression including, for example, neoplasia, inflammation, and a variety of infectious diseases.
The invention is based, at least in part, on the discovery that a pre-amplification step, which provides for the enhanced detection of alterations in gene expression in a low yield test sample, can be used to enhance the number of transcripts of each gene in that sample. The number of such transcripts can then be measured relative to a known number of internal standard molecules within a standardized mixture of internal standards (SMIS) on a nanofluidic PCR platform. The invention enables multivariate quantitative competitive PCR assays (e.g., StaRT-PCR) in a simplified and streamlined workflow with substantial reductions in reagent and sample consumption, leading to low cost, reliable and accurate clinical analyses.
Nanofluidic System
Advantageously, the invention employs a nanofluidic system that comprises a high density array of nanoliter-scale through-holes or chambers for implementing a large number (e.g., at least about 500, 1000, 2000, 3000, 4000, 5000) of PCR analyses in less than about a microliter of fluid. In particular embodiments, the invention employs the BioTrove nanofluidic system, a high density array of nanoliter-scale through-holes or chambers for implementing up to 3072 PCR analyses with 33 nl per reaction on an array the size of a microscope slide. Such arrays are described, for example, by U.S. Patent No. 6,716,629, which is incorporated herein by reference. An illustration of a BioTrove OpenArray (R) plate is provided at Figure 1. The OpenArray (R) plate is a steel platen that comprises 3072 through-holes having a diameter of about 320 μm. Each of the through-holes is treated with a polymer to make the inside surface of each hole hydrophilic and the exterior surface hydrophobic. Liquid is dispensed and retained in each through-hole by means of surface force differentials between the liquid surface tension and the polymer coatings. The through-holes are grouped in forty-eight subarrays of sixty-four through-holes each. The spacing between each subarray is about 4.5 mm.
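As a simple check of the plate geometry given above, the following sketch multiplies out the subarray layout and the per-reaction volume. The constant names are hypothetical. The total volume assumes every through-hole is filled with exactly 33 nl.

```python
SUBARRAYS = 48             # subarrays per plate, as stated above
HOLES_PER_SUBARRAY = 64    # through-holes per subarray, as stated above
REACTION_VOLUME_NL = 33    # nanoliters per through-hole, as stated above

# Total number of through-holes (one PCR analysis per hole)
total_holes = SUBARRAYS * HOLES_PER_SUBARRAY

# Total liquid held by a fully loaded plate, converted from nl to microliters
total_volume_ul = total_holes * REACTION_VOLUME_NL / 1000
```

Under these figures a fully loaded plate holds 3072 reactions in roughly 101 μl of liquid in total.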
In particular, the invention provides a platen comprising a high density array of nanoliter-scale through-holes or chambers comprising less than about 1000 nl, 750 nl, 500 nl, 250 nl, 100 nl, or even 50 nl of the reagents and samples for PCR analyses. Methods for loading the array with a small volume of reagents are described, for example, in U.S. Patent Nos. 6,716,629 and 6,812,030, and in U.S. Patent Publication Nos. 20080108112, 20030180807, and 20030124716. The hydrophobic exterior surface of the platen is not wetted, keeping the liquid in each through-hole isolated from its neighbor. The differential surface coating combined with the dimensional precision of the etched through-holes results in accurate and precise self-metered loading of liquid into each hole. PCR arrays are preloaded with PCR primers and probes. Such reagents are typically transferred from 384-well plates into the through-holes with an array of 48 pins manipulated by a 4-axis robot, such that each through-hole of an OpenArray (R) plate has a different primer set. The solvent is then removed, resulting in the primers or primer/probes being immobilized on the inside surface of each hole. Co-loading of a passive fluorescent reference dye allows detection of holes that failed to load an assay. The arrays are readily configurable, as the assay configuration is based on the 384-well source plate layout. The 3072 holes of the OpenArray (R) plate may be configured based on analytical needs; for example, a sample can be interrogated by 16, 32, 64, multiples of 64, up to 3072 assays.
In one embodiment, a pair of detectable target probes, each of which is complementary to a target nucleic acid sequence, and capable of amplifying that sequence is used in combination with a pair of detectable reference probes, each of which is complementary to a competitive template internal standard nucleic acid sequence and capable of amplifying that standard sequence. Desirably, each primer pair coamplifies a native template and its respective competitive internal standard template with equal efficiency. If desired, the target and reference probes are detectably labeled. In one embodiment, the detectable target and reference probes each comprises a distinct fluorometric dye that provides for the separate detection of amplified target and internal standards. For example, the amplified target and internal standards are detected using a TaqMan two fluorescent dye assay to quantitatively measure the endpoint ratio between native template and internal standard. The TaqMan assay uses two hybridization probes; each probe has a unique fluorescently quenched dye and specifically hybridizes to a PCR template sequence, as described by Livak et al. ("Allelic discrimination using fluorogenic probes and the 5' nuclease assay," Genet Anal. 1999 Feb;14(5-6):143-9), which is incorporated by reference in its entirety. During the PCR extension phase, the hybridized probe is digested by the exonuclease activity of the Taq polymerase, resulting in release of the fluorescent dye specific for that probe. The amplified target and internal standards may also be detected using a Pleiades fluorescent probe detection assay to quantitatively measure the ratio between native template and the melting temperature of the first detectable probe.
The Pleiades assay uses a hybridization probe, and each probe specifically hybridizes to a target DNA sequence and has a fluorescent dye at the 5' terminus which is quenched by the interactions of a 3' quencher and a 5' minor groove binder (MGB) when the probe is not hybridized to the target DNA sequence, as described by Lukhtanov et al. ("Novel DNA probes with low background and high hybridization-triggered fluorescence," Nucl. Acids Res. 2007 Jan;35(5):e30), which is incorporated by reference in its entirety. By the end of PCR, the fluorescent emissions from the released dyes reflect the molar ratio of the sample. Methods for assaying such emissions are known in the art, and described, for example, by Fabienne Hermitte, "Myeloproliferative Biomarkers", Molecular Diagnostic World Congress, 2007.
Assay System
Standardized reverse transcription PCR (StaRT-PCR™) was developed with the goal of optimizing gene transcript measurement. In particular, StaRT-PCR assays have a sensitive detection threshold (< 10 molecules/assay) and signal-to-analyte response (100%), high precision (mean gene copy CV across all genes was 6% and 3.2% with > 6000 starting RNA copy number), and a large linear dynamic range (> 6 orders of magnitude, the full range of gene expression in the MAQC samples) (Shi et al., "The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements." Nat Biotechnol 2006 September;24(9):1151-61; Shippy et al., "Using RNA sample titrations to assess microarray platform performance and normalization techniques." Nat Biotechnol. 2006 Sep;24(9):1123-31, which are each incorporated herein by reference in their entireties). The combination of high signal-to-analyte response and high reproducibility routinely enabled the detection of as little as 20% differences in copy number. With this approach, measurement is quantitative and instrument-to-instrument variation is minimized when measured at endpoint. As such, cost may be reduced and the method simplified by not using expensive real-time PCR instrumentation. In StaRT-PCR, the endpoint PCR product for each gene cDNA (analyte) is quantified relative to a known number of molecules of its respective internal standard within the standardized mixture of internal standards. Sample aliquots are added to a series of tubes (2, 3, 4, 5, 6, 7, 8, 9, 10) containing increasing numbers of copies of synthetic competitive template internal standard, and primers. Each primer pair coamplifies a native template and its respective competitive internal standard template with equal efficiency.
When gene measurements are normalized to a coamplified reference gene, StaRT-PCR controls for all known sources of variation, including inter-sample variation in loading due to pipetting, interfering substances such as PCR inhibitors, inter-gene variation in amplification efficiency, and false negatives. Recent reports have described the successful use of StaRT-PCR to measure the gene expression of several promising biomarkers in samples of blood (Rots et al., Leukemia 2000 December;14(12):2166-75; Peters et al., Clin Chem 2007 June;53(6):1030-7) or other tissues. StaRT-PCR has been used successfully to identify patterns of gene expression associated with diagnosis of lung cancer (Warner et al., J Mol Diagn 2003 August;5(3):176-83), risk of lung cancer (Crawford et al., Carcinogenesis 2007 December;28(12):2552-9), pulmonary sarcoidosis (Allen et al., Am J Respir Cell Mol Biol 1999 December;21(6):693-700), cystic fibrosis (Loitsch et al., Clin Chem 1999 May;45(5):619-24), chemoresistance in lung cancer (Harr et al., Mol Cancer 2005;4:23; Weaver et al., Mol Cancer 2005;4(1):18), childhood leukemias (Rots et al., Leukemia 2000 December;14(12):2166-75), staging of bladder cancer (Mitra et al., BMC Cancer 2006;6:159), and to develop databases of normal range of expression of inflammatory genes in peripheral blood samples (Peters et al., Clin Chem 2007 June;53(6):1030-7).
The primers of the invention embrace oligonucleotides of sufficient length and appropriate sequence so as to provide specific initiation of polymerization on a significant number of nucleic acids in the polymorphic locus. Specifically, the term "primer" as used herein refers to a sequence comprising two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and most preferably more than 8, which sequence is capable of initiating synthesis of a primer extension product, which is substantially complementary to a polymorphic locus strand.
The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent for polymerization. The exact length of primer will depend on many factors, including temperature, buffer, and nucleotide composition. The oligonucleotide primer typically contains between 12 and 27 or more nucleotides, although it may contain fewer nucleotides. Primers of the invention are designed to be "substantially" complementary to each strand of the genomic locus to be amplified and include the appropriate G or C nucleotides as discussed above. This means that the primers must be sufficiently complementary to hybridize with their respective strands under conditions that allow the agent for polymerization to perform. In other words, the primers should have sufficient complementarity with the 5' and 3' flanking sequences to hybridize therewith and permit amplification of the genomic locus. While exemplary primers are provided herein, it is understood that any primer that hybridizes with the target sequences of the invention is useful in the method of the invention for detecting a target nucleic acid.
The target nucleic acid may be present in a sample, e.g., clinical samples and biological samples. If high quality clinical samples are not used, amplification primers are designed to recognize shorter target sequences. For example, because RNA extracted from FFPE samples is typically fragmented, primers may be designed using criteria for FFPE sample amplification as described by Cronin et al. ("Measurement of Gene Expression in Archival Paraffin-Embedded Tissues." AJP 2004, 164(1):35-42). Because homogeneous product sizes have better inter-transcript correlation for degraded samples, the PCR product sizes are 70-85 base pairs. Primer Tm is about 60 ± 1 °C. Amplification primers are compared by homology against the human transcriptome to ensure the binding specificity. Despite the use of DNAse in the RNA purification protocol, when possible, primers are designed to span RNA intron/exon splice junctions. Therefore, amplification of genomic contaminants will be inhibited by failure to produce full length products (typically >6 kb).
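The two numerical design criteria stated above (PCR product size of 70-85 base pairs and primer Tm of about 60 ± 1 °C) lend themselves to a simple screening check. The following is a minimal sketch only. The function name and the example values are hypothetical, and the primer Tm values are assumed to have been computed elsewhere by whatever method the designer prefers.

```python
def meets_ffpe_design_criteria(amplicon_length_bp, primer_tms_c):
    """Return True if a candidate assay satisfies both stated criteria.

    amplicon_length_bp: PCR product size in base pairs
    primer_tms_c: melting temperatures (deg C) of the forward and reverse primers
    """
    size_ok = 70 <= amplicon_length_bp <= 85          # 70-85 bp product
    tm_ok = all(abs(tm - 60.0) <= 1.0 for tm in primer_tms_c)  # 60 +/- 1 deg C
    return size_ok and tm_ok

# Hypothetical candidates
good = meets_ffpe_design_criteria(78, [59.6, 60.4])      # passes both criteria
too_long = meets_ffpe_design_criteria(120, [60.0, 60.0])  # product too long
tm_off = meets_ffpe_design_criteria(80, [57.5, 60.0])     # one primer Tm too low
```

Transcriptome specificity and splice-junction placement, also described above, are not covered by this check.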
For each target nucleic acid molecule or biomarker, the respective synthetic internal standard will match the native template in all but 1, 2, or 3 nucleotides within the probe binding sequence of the native nucleic acid molecule or biomarker. The probe sequence for the internal standard will be based on this rearrangement, and therefore is predicted to bind only to the internal standard sequence, but not the corresponding native template. Multiple internal standards are formulated into a mixture that contains the internal standards at a defined concentration or number of molecules of the internal standards. Such internal standards are also referred to as a "defined reference nucleic acid molecule", having a known concentration of the nucleic acid molecule or a known number of nucleic acid molecules.
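One way to picture such an internal standard is to exchange two adjacent nucleotides inside the probe binding site, leaving the remainder of the amplicon untouched. The sketch below is illustrative only. The sequence, the probe-site coordinates, and the function names are hypothetical and are not taken from any assay described herein.

```python
def make_internal_standard(native, probe_start, probe_len, swap_offset):
    """Swap two adjacent nucleotides inside the probe binding site."""
    i = probe_start + swap_offset
    if not (probe_start <= i and i + 1 < probe_start + probe_len):
        raise ValueError("swap must fall inside the probe binding site")
    if native[i] == native[i + 1]:
        raise ValueError("swapping identical nucleotides changes nothing")
    # Rebuild the sequence with positions i and i + 1 exchanged
    return native[:i] + native[i + 1] + native[i] + native[i + 2:]

def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

native = "ATGGCCTTAGCATCGGATCCAAGTCTGACC"  # hypothetical 30-nt amplicon
probe_start, probe_len = 10, 12            # hypothetical probe binding site
internal_standard = make_internal_standard(native, probe_start, probe_len, 5)
```

Because length and base composition are unchanged, the two templates share primer sites and differ at only two positions within the probe site.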
In standardized mixtures of internal standards (SMIS), each internal standard is synthesized in mg quantity, quantified by Hoechst dye fluorometry, and stored in TE buffer. qPCR in which each native template is measured relative to a known number of internal standard molecules requires that the native template and internal standard molecule numbers be within a 100-fold range of each other. Because genes may be expressed over 6 orders of magnitude, SMIS are prepared so that expression of each gene can be compared regardless of expression level. To accomplish this, the internal standards for all genes are mixed together at the same concentration, then serially diluted (e.g., 10-fold in TE). For the PCR step in the assay, the highest concentration of internal standard in working solution SMIS is 100-fold above the highest copy number estimated by qPCR (e.g., normalized to 10 ng RNA). To make 1000X SMIS stocks, a tube is prepared that combines all internal standards 10^3-fold above this highest working solution concentration. To make the other SMIS dilutions, serial dilutions (e.g., five 10-fold serial dilutions in TE) are prepared. In preparation for a defined PCR volume (e.g., 20 μl), SMIS mixtures are further diluted to working stocks (e.g., 10X) prior to addition (e.g., 9 μl); using all SMIS (e.g., 3, 4, 5, 6, or more) enables measurement of each gene transcript relative to a known quantity of its respective internal standard, while simultaneously enabling reliable comparison of measurement for each gene to each other gene. Use of SMIS enables reliable, reproducible measurement through two rounds of PCR amplification, including a pre-amplification which will be a benefit to low yield test samples (e.g., FFPE). A PCR product (i.e., amplicon) or real-time PCR product is detected by probe binding.
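The SMIS dilution scheme described above can be laid out numerically as follows. This is a sketch under stated assumptions: concentrations are treated as integer copy numbers in arbitrary volume units, the 10,000-copy starting figure is hypothetical, and the function names are illustrative only.

```python
def smis_stock_series(highest_native_copies, n_dilutions=5, stock_factor=1000):
    """Return stock concentrations for the top SMIS and its 10-fold dilutions."""
    # Highest working concentration: 100-fold above the highest native estimate
    top_working = 100 * highest_native_copies
    # Stock: stock_factor-fold (1000X) above the highest working concentration
    top_stock = stock_factor * top_working
    # Top stock followed by n_dilutions serial 10-fold dilutions
    return [top_stock // 10 ** k for k in range(n_dilutions + 1)]

def within_competitive_range(native_copies, is_copies, max_fold=100):
    """Check that native template and internal standard are within 100-fold."""
    hi, lo = max(native_copies, is_copies), min(native_copies, is_copies)
    return lo > 0 and hi / lo <= max_fold

# Hypothetical highest native copy number estimated by qPCR
series = smis_stock_series(10_000)
```

For a gene at any expression level inside the covered span, at least one member of the series then lies within the 100-fold window required for measurement.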
In one embodiment, probe binding generates a fluorescent signal, for example, by coupling a fluorogenic dye molecule and a quencher moiety to the same or different oligonucleotide substrates (e.g., TaqMan® (Applied Biosystems, Foster City, CA, USA), Pleiades (Nanogen, Inc., Bothell, WA, USA), Molecular Beacons (see, for example, Tyagi et al., Nature Biotechnology 14(3):303-8, 1996), Scorpions® (Molecular Probes Inc., Eugene, OR, USA)). In another example, a PCR product is detected by the binding of a fluorogenic dye that emits a fluorescent signal upon binding (e.g., SYBR® Green (Molecular Probes)). Such detection methods are useful for the detection of a target specific PCR product. Following PCR, the concentration of the native template is calculated from the ratio (native template:internal standard template) versus known copies of internal standard included in the reaction. When gene measurements are normalized to a coamplified reference gene, StaRT-PCR controls for all known sources of variation, including inter-sample variation in loading due to pipetting, interfering substances, such as PCR inhibitors, inter-gene variation in amplification efficiency, and false negatives.
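The calculation described above reduces to two steps: multiplying the measured native template:internal standard ratio by the known internal standard copies, and dividing the target result by that of a coamplified reference gene. A minimal sketch follows. The function names and all numerical values are hypothetical.

```python
def native_copies(nt_to_is_ratio, is_copies):
    """Native template copies from the endpoint NT:IS ratio and known IS copies."""
    return nt_to_is_ratio * is_copies

def normalized_expression(target_copies, reference_copies):
    """Target abundance expressed per copy of the coamplified reference gene."""
    return target_copies / reference_copies

# Hypothetical measurements from one reaction
target = native_copies(0.5, 10_000)       # target gene: ratio 0.5 vs 10^4 IS
reference = native_copies(2.0, 100_000)   # reference gene: ratio 2.0 vs 10^5 IS
relative_abundance = normalized_expression(target, reference)
```

Because both genes are measured in the same reaction, a loading error that scales both copy numbers by the same factor cancels in the final quotient.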
The present invention provides compositions and methods for carrying out StaRT-PCR and other analytic methods that involve the preamplification of cDNA in the presence of a standardized mixture of internal standards on a nanoliter scale. As described in more detail below, the use of the preamplification step markedly reduces the amounts of starting sample (e.g., cDNA) and reagents required for each PCR reaction. In this two-round StaRT-PCR method, measuring each gene relative to a known number of internal standard molecules within a standardized mixture of internal standards in each reaction controls for unpredictable inter-sample variation in the efficiency of pre-amplification caused by reagent consumption, PCR inhibitors, and/or product inhibition. A standardized mixture of internal standards controls for preferential amplification of one transcript over another due to differences in amplification efficiencies. The use of nanofluidic technology in combination with pre-amplification with multiple sets of primers and internal standards in the same reaction provides for the measurement of many genes (>100) using the RNA quantity normally required for six measurements. This allows for higher throughput that is virtually unrestricted by RNA input.
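The reason an internal standard controls for variable pre-amplification efficiency can be shown with a toy model. The sketch below assumes simple exponential amplification in which native template and internal standard share the same per-cycle efficiency within a given reaction; real reactions plateau, and that is not modeled. All names and numbers are hypothetical.

```python
def amplify(copies, efficiency, cycles):
    """Copies after `cycles` rounds, each multiplying by (1 + efficiency)."""
    return copies * (1.0 + efficiency) ** cycles

def endpoint_ratio(nt_copies, is_copies, efficiency, cycles):
    """NT:IS ratio after co-amplification at a shared efficiency."""
    return amplify(nt_copies, efficiency, cycles) / amplify(is_copies, efficiency, cycles)

nt_start, is_start = 5_000.0, 10_000.0  # hypothetical starting copies

# Two hypothetical samples: one clean, one partly inhibited
ratio_clean = endpoint_ratio(nt_start, is_start, efficiency=0.95, cycles=35)
ratio_inhibited = endpoint_ratio(nt_start, is_start, efficiency=0.70, cycles=35)

# Absolute yields differ greatly even though the ratios agree
yield_clean = amplify(nt_start, 0.95, 35)
yield_inhibited = amplify(nt_start, 0.70, 35)
```

In this model the absolute yields differ by more than a hundredfold while both ratios remain at the starting value of 0.5, which is the property the ratio-based measurement relies on.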
A sample is split into a number (2, 4, 6, 8, 10, 20) of tubes comprising standardized mixtures of internal standards. Each of these tubes then undergoes up to 35 cycles of multiplexed pre-amplification prior to individual analyte detection. Through pre-amplification, the RNA quantity required for 6 typical real-time PCR assays is increased such that the amount of RNA after pre-amplification is sufficient for the detection of target nucleic acid molecules in at least about 100, 200, 300, 400, or 500 assays. Further, the resulting increased gene target concentration eliminates issues of insufficient message when using nanofluidic measurement methods. Overall, the methods described herein provide robustness and quality controls lacking in current real-time qPCR and hybridization array approaches. Following multiplex PCR, the native and internal standard targets can be detected using any method known in the art. For example, as little as a 10% size difference between analyte and internal standard templates allows them to be quantified individually by size separation methods (e.g., capillary electrophoresis, agarose gel mobility, HPLC or MALDI-TOF). Because such methods may be cumbersome when measuring tens to hundreds of targets, signal detection may be carried out using a two-color fluorometric dye system. Specifically, signal from each fluorophore corresponds to the native and internal standard relative molar ratios.
Measuring each gene relative to a known number of internal standard templates enables reliable quantitative endpoint or real time measurement. Instead of 10% size differences between internal standard and native template, 1, 2, or 3 nucleotide differences in the nucleotide sequence between the internal standard and native template will provide sufficient specificity difference for TaqMan or Pleiades probe discrimination. Such 1, 2, or 3 nucleotide differences at the internal standard probe binding site include deletions, insertions, or changing the order of the 2-3 nucleotides relative to the native template. In particular, a 2 nucleotide change in the internal standard relative to the target sequence provides enough similarity so that it has the same PCR kinetics as the target sequence, but distinguishes the internal standard enough to provide probe discrimination in TaqMan and Pleiades based assays. Accordingly, the invention provides a high-density array of 33 nanoliter through-holes designed for parallel PCR detection of multiple genomic targets. These through-holes are pre-loaded with specific amplification primers and detectable probes. Each through hole measures the analyte: internal standard ratio for one PCR target based on the ratio of signal from the two TaqMan or Pleiades probes. Workflow is streamlined as a single pipetting step that introduces the preamplified sample into a number of individual nanoliter scale reaction vessels, allowing many analyses to be performed per sample in parallel. Hermetically sealed in an optically transparent glass cassette after loading the sample, the nanoplates are temperature cycled in commercial flatblock thermocyclers and then imaged, for example, in a BioTrove NT Imager. The PCR assay performance in the nanoplates is equivalent to the same assay in microplates, but with over 150-fold lower reaction volume (33 nanoliter versus 5 microliter reaction volumes). 
Further, the high density and low volume of OpenArray (R) plate provides >150,000 PCR data points per day at a fraction of the cost of the same number of data points generated with PCR in microplates. As a result, gene expression profiling is readily scalable to higher throughputs with little incremental cost increase.
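The volume and throughput figures quoted above can be checked with a short calculation. The variable names are hypothetical; the 3072-hole plate capacity is taken from the description of the OpenArray (R) plate.

```python
import math

NANOPLATE_VOLUME_NL = 33        # reaction volume per through-hole
MICROPLATE_VOLUME_NL = 5_000    # 5 microliter microplate reaction
HOLES_PER_PLATE = 3072          # through-holes per plate
DATA_POINTS_PER_DAY = 150_000   # lower bound quoted above

# Fold reduction in reaction volume relative to a microplate reaction
fold_reduction = MICROPLATE_VOLUME_NL / NANOPLATE_VOLUME_NL

# Whole plates needed to reach the quoted daily number of data points
plates_per_day = math.ceil(DATA_POINTS_PER_DAY / HOLES_PER_PLATE)
```

The result is a reduction of about 151.5-fold, consistent with "over 150-fold", and 49 fully used plates per day.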
Diagnostic Methods
The present invention can be employed to measure gene expression or a gene expression profile in a biological sample. Desirably, the methods of the invention require much less starting material than conventional diagnostic methods and may be employed to measure gene expression of biomarkers in blood or other tissues. Accordingly, the invention provides for the identification of patterns of gene expression useful in virtually any clinical setting where conventional methods of analysis are used. For example, the present methods provide for the analysis of biomarkers associated with lung cancer (Warner et al., J Mol Diagn 2003;5:176-83), risk of lung cancer (Crawford et al., Cancer Res 2000;60:1609-18), pulmonary sarcoidosis (Allen et al., Am. J. Respir. Cell. Mol. Biol. 1999:21, 693-700), cystic fibrosis (Loitsch et al., Clin. Chem. 1999:45, 619-624), chemoresistance in lung cancer (Weaver et al., Molecular Cancer, 4, 18, 2005; Harr et al., Molecular Cancer, 4, 23, 2005), childhood leukemias (Rots et al., Leukemia, 14, 2166-2175, 2000), staging of bladder cancer (Mitra et al., BMC Cancer 2006;6:159), and to develop databases of normal range of expression of inflammatory genes in peripheral blood samples (Peters et al., Clinical Chemistry 53: 1030-1037, 2007).
The size of biopsies obtained in many clinical situations is decreasing as cytologic methods improve and economic pressure to reduce costs increases. For example, samples of suspected cancerous lesions in the lung, breast, prostate, thyroid, and pancreas, are commonly obtained by fine needle aspirate (FNA) biopsy. These samples often comprise fewer than 100 cells. Thus, there is increasing need for the ability to obtain tens to hundreds of expression measurements from 10 ng RNA. There also is a need to evaluate expression patterns in samples from anatomically small, but functionally important tissues of the brain, developing embryo, and animal models, including laser capture micro-dissected samples. Measurement of gene expression profiles in rare cells, such as circulating tumor cells (CTC) enriched from flow-sorted cell populations will also potentially benefit from this technique.
Implementation of the 70-gene Mammaprint diagnostic using the OpenArray (R) plate combined with the methods described herein is estimated to require 8 hours to prep and analyze the sample with no sophisticated automation and consume 85 μl of PCR reagents and 60 ng of RNA. This level of sample consumption will better enable gene expression profile analysis in samples having only a limited amount of genetic material.
Types of Biological Samples
Gene expression profiling involves measuring the level of a biomarker of interest in a biological sample. In one embodiment, the biologic sample is a tissue sample that includes cells of a tissue or organ (e.g., lung, breast, prostatic tissue cells). Such tissue is obtained, for example, from a biopsy of the tissue or organ. In another embodiment, the biologic sample is a biologic fluid sample. Biological fluid samples include blood, blood serum, plasma, urine, seminal fluids, and ejaculate, or any other biological fluid useful in the methods of the invention. Alternatively, the tissue sample is a cytologic fine needle aspirate biopsy or formalin fixed paraffin embedded tissue. Use of the methods of the invention is particularly advantageous for such samples, where RNA often is limited by sample size or degradation.
Diagnostic assays
The present invention provides a number of diagnostic assays that are useful for characterizing the gene expression profile of a biological sample. In particular, the invention provides methods for the detection of alterations in gene expression associated with neoplasia. The invention provides for the characterization of a gene expression profile from a sample that contains very little genetic material. This provides an advantage over conventional methods for assaying gene expression, which require 10-100 times as much starting material to reliably detect alterations in gene expression. The size of biopsies obtained in many clinical situations is small and includes minimal amounts of genetic material. For example, samples of suspected cancerous lesions in the lung, breast, prostate, thyroid, and pancreas are commonly obtained by fine needle aspirate (FNA) biopsy. These samples often comprise fewer than 100 cells. Thus, there is increasing need for the ability to obtain tens to hundreds of expression measurements from 10 ng RNA. There also is a need to evaluate expression patterns in samples from anatomically small, but functionally important tissues of the brain, developing embryo, and animal models, including laser capture micro-dissected samples. Measurement of gene expression profiles in rare cells, such as circulating tumor cells (CTC) enriched from flow-sorted cell populations, is also likely to benefit from this technique. In particular embodiments, the invention provides for the detection of genes listed in Table 1 (below).
Table 1: Exemplary Target Genes for Detection of Neoplasia
(Table 1 appears as images imgf000028_0001 and imgf000029_0001 in the original publication.)
Alternatively, the invention provides for the detection and diagnosis of a pathogen in a biological sample. A variety of bacterial and viral pathogens may be detected using the system and methods of the invention. Exemplary bacterial pathogens include, but are not limited to, Aerobacter, Aeromonas, Acinetobacter, Actinomyces israelii, Agrobacterium, Bacillus, Bacillus anthracis, Bacteroides, Bartonella, Bordetella, Bortella, Borrelia, Brucella, Burkholderia, Calymmatobacterium, Campylobacter, Citrobacter, Clostridium, Clostridium perfringens, Clostridium tetani, Corynebacterium, Corynebacterium diphtheriae, Corynebacterium sp., Enterobacter, Enterobacter aerogenes, Enterococcus, Erysipelothrix rhusiopathiae, Escherichia, Francisella, Fusobacterium nucleatum, Gardnerella, Haemophilus, Hafnia, Helicobacter, Klebsiella, Klebsiella pneumoniae, Lactobacillus, Legionella, Leptospira, Listeria, Morganella, Moraxella, Mycobacterium, Neisseria, Pasteurella, Pasteurella multocida, Proteus, Providencia, Pseudomonas, Rickettsia, Salmonella, Serratia, Shigella, Staphylococcus, Stenotrophomonas, Streptococcus, Streptobacillus moniliformis, Treponema, Treponema pallidum, Treponema pertenue, Xanthomonas, Vibrio, and Yersinia.
Examples of viruses detectable using the system and methods of the invention include Retroviridae (e.g., human immunodeficiency viruses, such as HIV-1 (also referred to as HTLV-III, LAV or HTLV-III/LAV, or HIV-III) and other isolates, such as HIV-LP); Picornaviridae (e.g., polio viruses, hepatitis A virus, enteroviruses, human Coxsackie viruses, rhinoviruses, echoviruses); Caliciviridae (e.g., strains that cause gastroenteritis); Togaviridae (e.g., equine encephalitis viruses, rubella viruses); Flaviviridae (e.g., dengue viruses, encephalitis viruses, yellow fever viruses); Coronaviridae (e.g., coronaviruses); Rhabdoviridae (e.g., vesicular stomatitis viruses, rabies viruses); Filoviridae (e.g., ebola viruses); Paramyxoviridae (e.g., parainfluenza viruses, mumps virus, measles virus, respiratory syncytial virus); Orthomyxoviridae (e.g., influenza viruses); Bunyaviridae (e.g., Hantaan viruses, bunya viruses, phleboviruses and Nairo viruses); Arenaviridae (hemorrhagic fever viruses); Reoviridae (e.g., reoviruses, orbiviruses and rotaviruses); Birnaviridae; Hepadnaviridae (Hepatitis B virus); Parvoviridae (parvoviruses); Papovaviridae (papilloma viruses, polyoma viruses); Adenoviridae (most adenoviruses); Herpesviridae (herpes simplex virus (HSV) 1 and 2, varicella zoster virus, cytomegalovirus (CMV), herpes virus); Poxviridae (variola viruses, vaccinia viruses, pox viruses); Iridoviridae (e.g., African swine fever virus); and unclassified viruses (e.g., the agent of delta hepatitis (thought to be a defective satellite of hepatitis B virus), the agents of non-A, non-B hepatitis (class 1 = enterally transmitted; class 2 = parenterally transmitted (i.e., Hepatitis C)), Norwalk and related viruses, and astroviruses).
Other infectious organisms (i.e., protists) include Plasmodium spp. such as Plasmodium falciparum, Plasmodium malariae, Plasmodium ovale, and Plasmodium vivax, and Toxoplasma gondii. Blood-borne and/or tissue parasites include Plasmodium spp., Babesia microti, Babesia divergens, Leishmania tropica, Leishmania spp., Leishmania braziliensis, Leishmania donovani, Trypanosoma gambiense and Trypanosoma rhodesiense (African sleeping sickness), Trypanosoma cruzi (Chagas' disease), and Toxoplasma gondii.
Kits
The invention also provides kits for the detection of a gene expression profile. Such kits are useful for the diagnosis, characterization, or monitoring of a neoplasia in a biological sample obtained from a subject. Alternatively, the invention provides for the detection of a pathogen gene or genes in a biological sample. In various embodiments, the kit includes at least one primer pair that identifies a target sequence, together with instructions for using the primers to identify a gene expression profile in a biological sample. Preferably, the primers are provided in combination with a standardized mixture of internal standards on a nanofluidic PCR platform (e.g., a high density array). In yet another embodiment, the kit further comprises a pair of primers capable of binding to and amplifying a reference sequence. In yet other embodiments, the kit comprises a sterile container which contains the primers; such containers can be boxes, ampules, bottles, vials, tubes, bags, pouches, blister-packs, or other suitable container form known in the art. Such containers can be made of plastic, glass, laminated paper, metal foil, or other materials suitable for holding nucleic acids.
The instructions will generally include information about the use of the compositions of the invention in detecting a gene expression profile. In particular embodiments, the gene expression profile diagnoses or characterizes a neoplasia. Preferably, the kit further comprises any one or more of the reagents useful for an analytical method described herein (e.g., standardized reverse transcriptase PCR). In other embodiments, the instructions include at least one of the following: descriptions of the primer; methods for using the enclosed materials for the diagnosis of a neoplasia; precautions; warnings; indications; clinical or research studies; and/or references. The instructions may be printed directly on the container (when present), or as a label applied to the container, or as a separate sheet, pamphlet, card, or folder supplied in or with the container.
The following examples are offered by way of illustration, not by way of limitation. While specific examples have been provided, the above description is illustrative and not restrictive. Any one or more of the features of the previously described embodiments can be combined in any manner with one or more features of any other embodiments in the present invention. Furthermore, many variations of the invention will become apparent to those skilled in the art upon review of the specification. The scope of the invention should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims along with their full scope of equivalents.
It should be appreciated that the invention should not be construed to be limited to the examples that are now described; rather, the invention should be construed to include any and all applications provided herein and all equivalent variations within the skill of the ordinary artisan. The practice of the present invention employs, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are well within the purview of the skilled artisan. Such techniques are explained fully in the literature, such as, "Molecular Cloning: A Laboratory Manual", second edition (Sambrook, 1989); "Oligonucleotide Synthesis" (Gait, 1984); "Animal Cell Culture" (Freshney, 1987); "Methods in Enzymology"; "Handbook of Experimental Immunology" (Weir, 1996); "Gene Transfer Vectors for Mammalian Cells" (Miller and Calos, 1987); "Current Protocols in Molecular Biology" (Ausubel, 1987); "PCR: The Polymerase Chain Reaction" (Mullis, 1994); "Current Protocols in Immunology" (Coligan, 1991). These techniques are applicable to the production of the polynucleotides and polypeptides of the invention, and, as such, may be considered in making and practicing the invention.
The following examples are put forth so as to provide those of ordinary skill in the art with a complete disclosure and description of how to make and use the assay, screening, and therapeutic methods of the invention, and are not intended to limit the scope of what the inventors regard as their invention.
EXAMPLES
Example 1. Preamplification of genes expressed at low levels resulted in sensitive detection of target gene expression.
In the Standardized Nanoliter Array PCR (SNAP) test workflow, a multiplex preamplification step is performed before samples are loaded into the nanofluidic array for a second round of singleplex PCR. To determine the reproducibility and robustness of the preamplification StaRT-PCR method, experiments were performed in the NCI-funded (CA 96806) Standardized Expression Measurement (SEM) Center at the University of Toledo Health Sciences Campus. Levels of three genes expressed at low level (DPP4, SCNN1A, and WNT1) were measured in a commercially available reference RNA sample (Stratagene Universal Human Reference RNA; SUHRRNA) under multiple conditions (Figure 2). In each experiment the conditions included the typical method without preamplification as a control, or preamplification with a 96-gene primer mixture. Two different primer concentrations (1/6 or 1/10 of the usual concentration) were used in preamplification. Preamplification provides a sufficient concentration of cDNA for thousands of individual quantitation reactions. In this example, the preamplification PCR products were diluted 10- or 100-fold prior to the second round of PCR. The measured expression levels for the experiments included a 1:100 dilution of preamplification products. Both 10-fold and 100-fold dilutions of the preamplification products produced similar variability in the measured transcript levels. These results demonstrate that replicate variation is at an acceptable level with the preamplification protocol, both within and across experiments, even with genes expressed at low level, which are subject to random sampling error (Canales et al., "Evaluation of DNA microarray results with quantitative gene expression platforms." Nat Biotechnol. 2006 Sep;24(9):1116-22).
Preamplification protocols dramatically increase the number of transcript markers that can be measured in a fixed amount of cDNA by markedly reducing the amount of cDNA consumed per assay. Sample size is particularly limited for clinical samples, including those derived from fine needle aspirate (FNA) biopsies or formalin fixed paraffin embedded (FFPE) material. Compounding the problem is that these clinical samples are often degraded and/or resistant to reverse transcription, resulting in small cDNA samples. Consequently only 1/100th of the usual amount of cDNA may be available for analysis of small clinical samples. To enable multivariate analysis for such samples, some form of preamplification/multiplex PCR is required because there is not enough sample to be distributed between individual gene tests.
To determine assay conditions for the analysis of a series of transthoracic fine needle aspirate (FNA) samples, Stratagene Universal Human Reference RNA (SUHRRNA) was assessed undiluted, 10-fold diluted, or 100-fold diluted, with or without preamplification. The best conditions, a 10-fold dilution of the primer mixture and a 100-fold dilution of preamplification products (1,000-fold overall dilution), were used to repeat the analysis of SUHRRNA against 14 genes. With preamplification, the average coefficient of variation (CV) for the 14 genes was 14.8%, similar to the average CV of 16.4% obtained for the same genes without preamplification (Figure 3). In addition, the mean values for five previously measured genes were within 16% of the values obtained during the optimization experiment. These same conditions were then used to assess clinical samples obtained by transthoracic FNA biopsy. These results demonstrate that two-step StaRT-PCR allows reduced sample consumption while expanding the assay repertoire per sample.
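By way of illustration only, the average CV reported above can be computed as sketched below. The function names and the replicate values are hypothetical and are not taken from the experiments; the sketch assumes the sample standard deviation is used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample CV (standard deviation / mean) as a percentage."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / m


def average_cv(replicates_by_gene):
    """Average the per-gene CVs across all genes in the panel."""
    return mean(coefficient_of_variation(v) for v in replicates_by_gene.values())


# Hypothetical replicate measurements in arbitrary units (not patent data).
example = {
    "GENE_A": [100.0, 110.0, 90.0],
    "GENE_B": [2000.0, 2400.0, 1600.0],
}
print(f"average CV = {average_cv(example):.1f}%")
```

Averaging per-gene CVs, rather than pooling all measurements, keeps highly expressed genes from dominating the summary.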
Multiplex StaRT-PCR was used to assess primary bronchogenic carcinoma FNA biopsies. Forty-two transthoracic FNA biopsy samples were assessed for four genes that putatively improve diagnostic discrimination between non-small cell lung cancer (NSCLC) and small cell lung carcinoma (SCLC). As is common with samples obtained from FNA, many of these samples were highly degraded and/or included PCR inhibitors. In this context, the CV was 48%, higher than for nondegraded RNA samples, but still accurate enough for multivariate biomarkers.
Example 2. StaRT-PCR detected differences in gene expression in Formalin Fixed Paraffin Embedded (FFPE) Samples.
Standardized RT-PCR (StaRT-PCR) analysis was performed on formalin fixed paraffin embedded (FFPE) RNA samples and matched fresh frozen (FF) RNA samples (Figure 4A). Seven (7) cell line cultures were split and either frozen or formalin fixed to obtain pairs of matched FFPE and FF RNA samples. In this study, the biomarker was a ratio of Gene A/Gene B. The expected ratio of the FFPE Gene A/Gene B value to the FF Gene A/Gene B value is about 1.0 for every matched pair. In Figure 4A, the ratio of the FFPE Gene A/Gene B value to the FF Gene A/Gene B value varied four-fold, from 0.4 in the second matched pair to 1.2 in the sixth matched pair. These results show that it is possible to obtain results from formalin fixed paraffin embedded (FFPE) RNA samples that are less than 2-fold different from matched fresh frozen (FF) RNA samples. In contrast, the biological variation was greater than 13-fold among the FF samples. Thus, the biological variation exceeded the analytical variation between FF and FFPE RNA samples by more than three-fold.
One likely reason for the higher analytical variation in Gene A/Gene B ratio for matched pairs 2 and 3 is that a higher level of RNA degradation occurs in FFPE samples. Degradation was assessed from the number of β-actin molecules obtained from 1 ng of RNA during reverse transcription (Figure 4B). The difference in Gene A/Gene B ratio for the matched pairs is related to this measure of RNA degradation. Based on these results, a similar cut-off strategy can be used to compare reference gene copies to input quantity of RNA to ensure RNA quality. Altogether these studies show that StaRT-PCR analysis of FFPE RNA samples is comparable to that obtained for FF RNA samples and that StaRT-PCR is sensitive enough to distinguish biological variation in RNA samples.
Example 3. Nanoliter-scale PCR using the OpenArray (R) system Detected Differences In Gene Expression Between Normal Breast Tissue Samples And Breast Tumor Samples.
OpenArray (R) nanofluidic technology is desirable for diverse applications because the technology uses fewer resources than other methods. OpenArray (R) nanoliter reaction volumes and streamlined analytical workflow greatly increase the number of analyses one can make and reduce the cost of qPCR analysis per sample when compared to a microplate. OpenArray (R) technology was used for the real-time (qPCR) measurement of 608 human kinase genes in matched breast tumor/normal samples (Figures 5A-5C). Comparing cycle number and cycle threshold (Ct) showed good correlation in normal breast tissues and breast tumor samples, indicating technical replicate performance (Figures 5A and 5C). Comparing tumor matched normal samples and breast tumor samples indicated detectable variability between sets of matched samples (Figure 5B). These results clearly indicated that the biological difference between tumor and normal samples is far greater than the analytic variation of the OpenArray (R) platform.
Example 4. Standardized NanoArray PCR detected low quantities of template nucleic acids in multiplex reactions.
To demonstrate the feasibility of 100-plex QPCR using internal standards ("Two step StaRT-PCR"), Standardized NanoArray PCR was performed in multiplex reactions with low quantities of either native template (NT) nucleic acids or internal standard (IS) nucleic acids (Figures 6A-6F). Two 6 μl PCR reactions containing six primer pairs at 80 nM and either 10 starting copies of all six internal standard (IS) nucleic acids or 10 starting copies of all six native template (NT) nucleic acids were prepared, cycled 36 times (multiplex preamplification), diluted 1000-fold in mastermix, and applied to an OpenArray (R) plate, which is manufactured with each hole containing an individual TaqMan SNP assay for one of the six gene targets (i.e., IS/NT detection is spatially multiplexed). Each primer pair was optimized for SYBR green QPCR only (high efficiency, singleplex). Probes were added later to allow detection of either NT or IS amplicon in an OpenArray (R) plate. NT was detected by a FAM labeled probe and IS was detected by a VIC labeled probe. After 20 cycles, arrays were imaged and multiplex signals detected from the ten starting copies of either NT or IS were compared in plots. The points on each plot represent technical replicates of the initial two preamplification PCRs. The experiments showed that both internal standard (IS) and native template (NT) gave unique signals at very low copies in the presence of non-optimized multiplex preamplification, and all six assays demonstrated 10 starting copy sensitivity.
Example 5. Standardized NanoArray PCR showed sensitive detection of low quantities of native template in the presence of an internal standard.
For clinical applications, it is highly desirable to have a control that indicates a negative PCR result was not due to PCR inhibitors within the sample (i.e., a positive control). A preferred positive control is a template that is nearly identical to the native sequence of the target, but differing by a few base pairs (i.e., an internal standard (IS)). The IS allows equal PCR efficiency between the internal standard (IS) and a native template (NT). Following PCR, one distinguishes which product is made (competitor or native sequence) with a probe. Fluorescent probe-specific technologies, e.g., TaqMan SNP, Pleiades, or Beacons, detect the sequence difference between the IS and NT. The positive control mimics the NT as closely as possible in the reaction.
To test the limit of detection in the presence of an internal standard, Standardized NanoArray PCR reactions were performed with a known quantity of a template and increasing titers of another competitive template (Figures 7A-7D). The data show that one template can be detected at 10 copies per 6 μl reaction even when up to 100 copies of a competing target nucleic acid are present. A failed PCR was simulated by performing a reaction lacking template (Figure 7E). An IS can be put in at the assay's limit of detection to detect the presence of PCR inhibitors. For many clinical cases, one simply wants a positive or negative result, which can be determined by measuring the endpoint color ratio of NT:IS. To quantify the amount of native target, the Standardized NanoArray PCR approach can be used. These studies show that little variation was observed among replicate high density, high throughput Standardized NanoArray PCR experiments.
Example 6. High density, high throughput Standardized NanoArray PCR showed little variation among replicate experiments.
Both real-time PCR and TaqMan endpoint PCR provide results in the OpenArray (R) platform equivalent to those in a microplate format (Morrison et al., (2006), "Nanoliter high throughput quantitative PCR." Nucleic Acids Research 34(18): e123). For the purposes of the Standardized NanoArray PCR (SNAP) gene expression profiling method, a TaqMan two fluorescent dye assay was used to quantitatively measure the endpoint ratio between native template and internal standard. Similar to the TaqMan SNP assay, this TaqMan assay uses two hybridization probes. Each probe has a unique fluorescently quenched dye and specifically hybridizes to the PCR template sequence (Livak, "Allelic discrimination using fluorogenic probes and the 5' nuclease assay." Genet Anal. 1999 Feb;14(5-6):143-9). During the PCR extension phase, the hybridized probe was digested by the exonuclease activity of the Taq polymerase, resulting in release of the fluorescent dye specific for that probe. By the end of the PCR, the fluorescent emissions from the released dyes reflect the molar ratio of the sample. Loaded OpenArray (R) plates were inserted into a glass case containing immiscible fluid and sealed with a light sensitive epoxy to prevent evaporation during thermal cycling. The resulting PCR array, 64-fold denser than 384-well microplates, was then cycled in a commercially available flat block thermal cycler (e.g., the BioTrove NT Imager). Commercially available thermal cyclers for high throughput PCR are also available, e.g., the BioRad ALD-021 IG which can cycle up to 32 slides every four hours (>98,000 PCR). Two color fluorescent images were collected following PCR using a commercially available microarray scanner (e.g., BioTrove NT Imager, Tecan LS Reloaded). The FAM:VIC fluorescent ratio was plotted against internal standard preamplification input quantity, and the inflection point from a sigmoidal curve fit (EC50) was used to indicate native template nucleic acid concentration.
To examine the SNAP-OpenArray (R) workflow, genomic human DNA was used to simulate the competitive PCR titration curve behavior (Figure 8). Similar to capillary electrophoresis and gel based methods used for StaRT-PCR endpoint measurements, TaqMan SNP assays are not analytically sensitive to a native:control template ratio below 10%. Therefore, for this experiment the indicated log molar ratios <-1 and >1 used replicate homozygous genomic DNA instead of the indicated molar ratio. The experiment was performed with heterozygous genomic human DNA to simulate the expected FAM/VIC ratio and variation for such samples. The raw fluorescent results are depicted in the inset figure as five clusters corresponding to, in a clockwise direction, the dilutions (-3 & -2), -1, 0, +1, (+2 & +3). Eight (8) technical replicates were used to generate eight (8) sigmoid curve fits, resulting in an average of 0.066 +/- 0.007 for the value of the half maximal effective concentration (EC50) (Figure 8). The 11% coefficient of variation (CV) obtained for technical replicates was not substantial considering that most of this variation can be accounted for by Poisson noise (~6% given the 300 genomic template input). It is expected that the CV can be greatly reduced when using more highly expressed prognostic RNA targets. Thus, this study shows that high density, high throughput SNAP-PCR showed little variation among replicate experiments and that SNAP-PCR reactions are reproducible.
Example 7. Standardized Nanoliter Array PCR Gene Expression assay of individual biomarkers was able to detect 4-fold difference in nanoscale quantities.
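The two percentages quoted above can be reproduced with the short sketch below. It assumes that the Poisson estimate is the usual 1/sqrt(N) sampling CV for N input template molecules and that the 11% figure is the standard deviation of the EC50 divided by its mean; the function names are illustrative only.

```python
import math


def poisson_cv_percent(n_templates):
    """CV (%) expected from Poisson sampling of n template molecules."""
    if n_templates <= 0:
        raise ValueError("template count must be positive")
    return 100.0 / math.sqrt(n_templates)


def cv_percent(sd, mean_value):
    """CV (%) from a standard deviation and a mean."""
    return 100.0 * sd / mean_value


# Values quoted in the text: 300 genomic templates, EC50 = 0.066 +/- 0.007.
print(f"Poisson CV for 300 templates: {poisson_cv_percent(300):.1f}%")
print(f"Observed EC50 CV: {cv_percent(0.007, 0.066):.1f}%")
```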
An experiment was performed to determine the sensitivity of an assay implementing preamplification and Standardized Mixtures of Internal Standards (SMIS) in the OpenArray (R) platform to detect nanoscale differences in the quantity of target sequences. Two SNAP workflows using either TaqMan probes or Pleiades probes in the detection step are respectively shown in Figures 9A and 9B. Samples containing cDNA (2.5 ng and 10 ng) from a biologic sample, e.g., lung biopsy, were preamplified by PCR for 35 cycles in the presence of primers specific for the detection of DUSP6, ERBB3, LCK, MMD, STAT1, and TBP1 target sequences. For each target sequence, seven (7) preamplification reactions were performed with a defined quantity of the corresponding internal standard, with concentrations of each internal standard increasing 10-fold across the set of 7 preamplification reactions (i.e., 10^1 to 10^7). Aliquots of the reactions were loaded into the through holes of an OpenArray (R) plate, and the through-hole reactions were amplified using PCR for 20 cycles in the presence of either TaqMan probes or Pleiades probes. For the TaqMan-based SNAP assay, the ratio of the signal of the biomarker to the signal of the internal standard was determined (NT:IS ratio) for the through hole reactions. The NT:IS ratio was graphed against the concentration of the internal standard, and the half-maximal concentration (EC50) was determined. For the Pleiades-based SNAP assay, the ratio of the signal from the biomarker to the melting temperature of the probe to the biomarker was determined. The ratio of the signal from the biomarker to the melting temperature of the probe to the biomarker was graphed against the concentration of the internal standard, and the half-maximal concentration (EC50) was determined.
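The titration described above can be illustrated with a simplified sketch. Instead of the sigmoid curve fit used in the examples, it estimates the equivalence point (the internal standard input at which NT:IS equals 1) by linear interpolation on log-log axes, which is a common simplification for competitive PCR and is an assumption here. The function name and the idealized data are hypothetical.

```python
import math


def equivalence_point(is_copies, nt_to_is_ratios):
    """Estimate the IS input at which NT:IS = 1 by log-log interpolation.

    is_copies: increasing internal standard inputs (e.g. 10^1 to 10^7).
    nt_to_is_ratios: measured NT:IS signal ratio at each input.
    """
    xs = [math.log10(c) for c in is_copies]
    ys = [math.log10(r) for r in nt_to_is_ratios]
    for i in range(len(xs) - 1):
        x0, y0, x1, y1 = xs[i], ys[i], xs[i + 1], ys[i + 1]
        if y0 == 0:
            return 10 ** x0
        if y0 > 0 >= y1:  # the ratio crosses 1 between these two points
            x = x0 + (0 - y0) * (x1 - x0) / (y1 - y0)
            return 10 ** x
    raise ValueError("NT:IS ratio never crosses 1 in the tested range")


# Idealized example: 5,000 native copies titrated against 10^1..10^7 IS copies.
standards = [10 ** k for k in range(1, 8)]
ratios = [5000 / s for s in standards]
print(round(equivalence_point(standards, ratios)))
```

With real data the ratio saturates at both ends of the series, which is why a sigmoid fit over all points is generally preferable to interpolating between two of them.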
The SNAP assays showed enough sensitivity to detect a difference in the amount of the target sequence, resulting from a 4-fold difference in the amount of the input cDNA (Table 2).
Table 2: Detection of fold difference in target sequences from 4-fold difference in input cDNA concentration
Figure imgf000039_0001
Figure imgf000040_0001
In the SNAP assays using TaqMan probes, several assays of individual target sequences came close to showing the expected 4-fold difference in the quantity of the target sequence (DUSP6, STAT1, TBP). For the SNAP assays using TaqMan probes, the coefficient of variation (CV) was under 50% for each target sequence in most of the assays. In particular, the SNAP assays performed with the Pleiades probes showed on average the expected 4-fold difference in the quantity of the target sequence. In SNAP assays using the Pleiades probes, several assays of individual target sequences were able to detect the 4-fold difference in concentration (DUSP6, ERBB3, MMD, and TBP). Furthermore, the SNAP assays using the Pleiades probes showed low CV (<20% in all cases except for DUSP6). Thus, this example shows that SNAP assays were able to detect differences in the input concentrations of cDNA, and that SNAP assays using Pleiades probes, on average, were able to detect the expected 4-fold difference with low CV in most
Example 8. Standardized Nanoliter Array PCR Gene Expression Signature Test.
A Standardized Nanoliter Array PCR (SNAP) Gene Expression assay is used to detect or quantify one or more nucleic acid targets by implementing preamplification and Standardized Mixtures of Internal Standards (SMIS) in a commercially available high-throughput PCR format, e.g., the OpenArray (R) plate format. Compared to commercially available nucleic acid diagnostic workflows for diagnosing a pathology, selecting treatment, or monitoring treatment, e.g., Oncotype Dx, SNAP provides better prognostic cancer gene expression profile consistency between labs. The assay is based on competitive RT-PCR between a dilution series of known concentrations of synthetic gene-specific internal standard copies and the unknown number of target sequence copies. When used in the OpenArray (R) platform, SNAP provides a prognostic assay with the clinical accuracy, throughput, and low test cost required for personalized medicine in a simplified and robust assay platform. A diagram of the workflow for an assay is shown in Figures 1A-1C. This example shows how the Oncotype Dx workflow can be adapted to detect gene expression in a tumor biopsy. A tumor biopsy sample, e.g., breast biopsy, is taken from a patient. A pathologist confirms sample pathology and selects tumor enriched sections for RNA extraction. The method also measures many genes in low yielding samples obtained by fine needle aspirate (FNA) biopsies, laser capture micro-dissection or flow-sorted cytometry. Low yield samples benefit from preamplification and distribution of sample into fewer reaction tubes. Preamplification is carried out as shown in Figure 1B using components provided in a kit (e.g., the components shown in Table 3). Total RNA (100 ng) from the formalin fixed, paraffin-embedded tumor biopsy blocks is reverse transcribed using random hexamers. The method provides quantitative calibration of each sample. Tube 1 (Table 3) contains the calibrant, a reagent that contains the internal standard for the β-actin (ACTB) loading control gene. The ratio of native template (NT) to internal standard (IS) must be greater than 1:10 and less than 10:1 for the measurement to be within assay range. Initial calibration of each cDNA sample to a known quantity of β-actin (ACTB) internal standard ensures that the ACTB NT/IS is within this range for each subsequent measurement. The calibrated cDNA sample is then used in the preamplification step.
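The assay-range criterion stated above (NT:IS greater than 1:10 and less than 10:1) can be expressed as a simple check. This is a minimal sketch; the function name and the example copy numbers are hypothetical.

```python
def within_assay_range(nt_signal, is_signal, low=0.1, high=10.0):
    """Return True when NT:IS is strictly between 1:10 and 10:1."""
    if is_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    ratio = nt_signal / is_signal
    return low < ratio < high


# Hypothetical ACTB calibration against 600,000 internal standard molecules.
print(within_assay_range(600_000, 600_000))    # balanced sample
print(within_assay_range(50_000, 600_000))     # too little native template
print(within_assay_range(5_000_000, 600_000))  # still below 10:1
```

A sample that fails the check would be re-calibrated, for example by adjusting the cDNA input, before preamplification.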
The calibrated cDNA is evenly distributed among 6 StaRT-PCR tubes. A passive dye added during the first strand cDNA synthesis could be used to detect whether relatively equal volumes of cDNA were added to each preamplification tube. Prior to addition of the calibrated cDNA, the 6 StaRT-PCR tubes are loaded with PCR master mix, PCR primer pairs for gene targets (e.g., 21 primer pairs for Oncotype DX, 17 prognostic and 4 reference) and their internal standard competitive templates formulated into SMIS (e.g., serial 10-fold dilution of internal standards). Because genes are expressed over more than six orders of magnitude in human tissues, Tubes 2 to 7 (Table 3) are 10-fold serially diluted relative to the loading control gene β-actin (ACTB) internal standard, in a system of six (6) SMIS™, A-F. Inclusion of SMIS removes dependence on real-time instrument calibration and provides a self-referenced quality standard that controls for analytical false negatives and false positives.
In a separate room, these tubes undergo 16 cycles of PCR preamplification. Two microliters from each preamplification tube is added to a 384-well plate containing 18 μl master mix (no primers or probes), to eliminate issues associated with nonspecific product preamplification and to prepare samples for loading into the OpenArray (R) plate provided in the kit. Following 16 cycles of PCR, the ratios of fluorescent emissions are measured and native template concentration estimated from the fluorescence ratio compared to an internal standard curve. Reference gene copies are used to correct for cDNA yield prior to gene expression profile (GEP) calculation. The OpenArray (R) plate is pre-loaded with amplification primers and two differentially labeled probes specific for either native template or internal standard corresponding to each of the Oncotype Dx gene expression targets. In the present example, two differentially labeled fluorescent dye exonuclease probes specific for either native template or internal standard, e.g., TaqMan® probes (Applied Biosystems, Foster City, CA, USA), are pre-loaded in the OpenArray (R) plate. The probe could also be of a number of different types, e.g., Pleiades (Nanogen, Inc., Bothell, WA, USA), Molecular Beacons (see, for example, Tyagi et al., Nature Biotechnology 14(3):303-8, 1996), or Scorpions® (Molecular Probes Inc., Eugene, OR, USA). For example, a 6-carboxyfluorescein (FAM) labeled probe recognizes sequences specific to the native template and a 5'-tetrachloro-fluorescein (TET) labeled probe recognizes internal standard specific sequences. The synthetic gene-specific internal standard sequence could have, for example, a 1, 2, or 3 nucleotide difference compared to the target sequence in the probe binding sequence.
The 1, 2, or 3 nucleotide change in the internal standard may produce mismatches with the probe hybridizing to the native template or target sequences, and thus reduces or prevents hybridization of the native template probe to the internal standard. For example, 1, 2, or 3 nucleotide changes include deletions, insertions, or placing 2-3 nucleotides of the target sequence in a different order. In particular, 2 nucleotide changes are effective in differentiating probe binding in TaqMan and Pleiades based SNAP assays. In this way the PCR efficiency for amplification of the synthetic sequence should be similar to the amplification for the target sequence. Competitive PCR allows a multiplex preamplification step without consequence to assay accuracy. Applying the preamplified sample to a commercially available high-throughput PCR format, e.g., the OpenArray (R) plate, simplifies the detection step of the 378 individual PCR assays.
Table 3: BioTrove Oncotype DX Workflow Test Kit
Four x 7 tubes Oncotype DX preamplification strips containing lyophilized preamplification primers, and internal standards formulated into the SMIS Dilution Systems A through F that correspond to the tubes, as listed below:
1. Certified and quality-assessed calibrant (ACTB 10^-12 M (600,000 molecules) / GAPD 10^-13 M (60,000 molecules) / GAPD 10^-14 M (6,000 molecules))
2. SMIS System A (Target genes 10^-11 M (6,000,000 molecules) / ACTB 10^-12 M (600,000 molecules) / GAPD 10^-13 M (60,000 molecules) / GAPD 10^-14 M (6,000 molecules))
3. SMIS System B (Target genes 10^-12 M (600,000 molecules) / ACTB 10^-12 M (600,000 molecules) / GAPD 10^-13 M (60,000 molecules) / GAPD 10^-14 M (6,000 molecules))
4. SMIS System C (Target genes 10^-13 M (60,000 molecules) / ACTB 10^-12 M (600,000 molecules) / GAPD 10^-13 M (60,000 molecules) / GAPD 10^-14 M (6,000 molecules))
5. SMIS System D (Target genes 10^-14 M (6,000 molecules) / ACTB 10^-12 M (600,000 molecules) / GAPD 10^-13 M (60,000 molecules) / GAPD 10^-14 M (6,000 molecules))
6. SMIS System E (Target genes 10^-15 M (600 molecules) / ACTB 10^-12 M (600,000 molecules) / GAPD 10^-13 M (60,000 molecules) / GAPD 10^-14 M (6,000 molecules))
7. SMIS System F (Target genes 10^-16 M (60 molecules) / ACTB 10^-12 M (600,000 molecules) / GAPD 10^-13 M (60,000 molecules) / GAPD 10^-14 M (6,000 molecules))
One Master Mix tube: PCR buffer, MgCl2, Taq polymerase, dNTPs
One positive control tube for laboratory test QC
Calibrated SUHRcDNA from SUHRRNA (Stratagene Universal Human Reference RNA) as positive controls for RNA quality, Reverse Transcription (RT), calibration, PCR, and reproducibility, in the following quantity: One vial of calibrated SUHRcDNA (enough for 90-100 96-well plates)
One Oncotype DX OpenArray (R) plate that allows four samples per array.
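The molecule counts listed for each tube in Table 3 follow from the molar concentrations if a volume of about 1 μl is assumed. That volume is an inference from the listed numbers and is not stated in the table. A minimal sketch of the conversion, with an illustrative function name:

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules(molar_concentration, volume_litres):
    """Number of molecules in a given volume at a given molarity."""
    return molar_concentration * volume_litres * AVOGADRO


ASSUMED_VOLUME_L = 1e-6  # 1 microlitre; an assumption, not stated in Table 3

# Target gene internal standards in SMIS Systems A through F.
for system, exponent in zip("ABCDEF", range(-11, -17, -1)):
    count = molecules(10.0 ** exponent, ASSUMED_VOLUME_L)
    print(f"System {system}: 10^{exponent} M -> about {count:,.0f} molecules")
```

Under this assumption 10^-12 M corresponds to roughly 600,000 molecules, in agreement with the ACTB entry in each row.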
Each preamplification dilution is transferred to its own OpenArray (R) plate subarray for amplification to detect target nucleic acids (Figure 2). All OpenArray (R) plate subarrays are identical; a subarray consists of assays for the 21 gene targets in three replicates. The arrays are loaded, sealed and subjected to 30 PCR cycles in an approved flat block thermal cycler. The cycled arrays are imaged in an NT imager or compatible slide scanner. Absolute gene copy number is generated by curve fitting a plot of the ratio of the native/standard signals vs. standard concentration. Generally, a sigmoid curve fit is used. Melting curves detected by saturating DNA dyes, e.g., LC Green 2, can also be substituted for probe-based ratio calculations. The half maximal effective concentration (EC50) is used to determine the native quantity of the target nucleic acid in the sample. A more accurate quantification could be obtained if more standard concentrations are used. The differential cost between performing 7 and 12 standard concentration measurement subarrays per test is small. Alternatively, assays compatible with 6 internal standard tubes would allow up to 8 tests per OpenArray (R) plate, further decreasing test cost. Automated software analyzes images, calculates signal intensities, estimates copy number, performs quality assurance (QA) checks on assay results, and outputs a Recurrence Score report.
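The figure of 378 individual PCR assays per test is consistent with 21 gene targets measured in three replicates for each of six SMIS dilutions. The short sketch below shows that arithmetic; treating the six SMIS systems A through F as the dilutions counted is an assumption, and the function name is illustrative.

```python
def reactions_per_test(targets, replicates, dilutions):
    """Through-hole reactions needed for one sample."""
    per_subarray = targets * replicates
    return per_subarray, per_subarray * dilutions


# 21 gene targets, three replicates, six SMIS dilutions (Systems A-F).
per_subarray, total = reactions_per_test(21, 3, 6)
print(per_subarray, total)
```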
The method provides an absolute quantification of multiple target nucleic acids in a sample. Preamplification results in at least a ten-fold improvement in sensitivity for low RNA yield samples. Transcript quantitation using internal standards is a robust and desirable method. However, prior to the OpenArray (R) plate, use of internal standards increased the test cost and complexity of target nucleic acid quantitation nearly ten-fold. Internal standards reduce variation issues brought about by instrument, pipetting, preamplification, change in cycle threshold (ΔCt) estimates and sample contaminants. Internal standards provide QA data for each assay data point. Compared with the existing real-time Oncotype DX test, the present method improves test yield and provides an assay QA. Further benefits to be obtained from the present method include a reduction in the number of liquid handling steps and instrument requirements. All these benefits would occur at roughly the same price as the current test.
Example 9. Standardized NanoArray PCR (SNAP) analytic performance was demonstrated using a panel of 16 lung cancer prognostic genes and up to 5 endogenous control genes. Real-time TaqMan non-standardized QPCR assays were developed for 16 lung cancer prognostic genes (Chen et al., N Engl J Med 2007 January 4;356(1):11-20), and 10 reference gene targets (up to 5 endogenous control genes). For each reference normalized prognostic measurement (ΔCt), >95% linearity and <20% CV at >1000 starting copies was demonstrated using data calculated from 6-point standard curves of cDNA from flash frozen lung samples.
The amplification primer sequences in the SNAP process were generated to provide high quality TaqMan real-time qPCR assays. For each gene target, three different primer pairs were designed to unique regions of the gene and with the primer binding sites spanning an intron/exon boundary (>1000 bp intron where possible). Evaluation criteria included cycle threshold value, ΔRn, and amplification specificity as determined by melt curves and gel electrophoresis using amplification products derived from three annealing temperatures (55, 60 and 65°C). In addition, all primer sets were tested for cDNA specificity using 50ng genomic DNA as template.
For DUSP6 none of the initial three primer sets were satisfactory, and three additional primer sets for DUSP6 were designed and tested. Based on the results of these experiments, 60°C was selected as the best annealing temperature. One, or in some cases two, primer sets were chosen for each gene and used for further testing. Using SYBR green detection chemistry, PCR efficiency was measured on serial dilutions of cDNA for each selected primer set. From these data one primer set for each gene was selected where the estimated PCR efficiency was greater than 95%. These primer sets were used for the design and purchase of matching dual labeled hydrolysis probes. PCR efficiency was again tested on serially diluted cDNA using the hydrolysis probes and efficiencies >95% were achieved for all genes (Figures 12A-12D). In addition, synthetic oligonucleotides were purchased with sequence matching the predicted amplicon product for each gene (80-100bp long oligonucleotides). These oligonucleotides were mixed in equimolar concentrations and then serially diluted over seven orders of magnitude and again used for PCR efficiency tests. In this experiment all genes demonstrated PCR efficiency >98% with correlation coefficients >0.99. Furthermore, all PCR assays demonstrated strong amplification signal down to eight copies of specific oligonucleotide template. To complete the panel, five of the ten reference genes (GUSB, MLN, PPIA, TBP and UCBH) were selected based on the Vandesompele method of selecting targets with minimal covariance. This was achieved using the cycle threshold (Ct) data from eight lung tumor cDNA samples measured by real-time qPCR in order to establish each assay's covariance (Vandesompele 2002). Therefore this study shows the design and validation of quantitative PCR assays for 21 target genes and 5 selected endogenous control genes.
Example 10. Preamplification Standardized (StaRT)-PCR fluorescent assays in the OpenArray PCR platform for 16 prognostic and 5 reference cDNA targets were developed.
Based on promising initial evaluations with individual technologies, StaRT-PCR with fluorescent probes, preamplification, and OpenArray nanoplate technology were combined into a SNAP workflow. Epoch BioScience Pleiades hybridization probes provided exceptional results compared with TaqMan hydrolysis probes. Melting curve analysis of Pleiades probes produced improved signal-to-noise, sufficient in principle to discriminate 6% template differences. Despite the extra effort required to replace TaqMan with Pleiades (adjustments to instrumentation, array manufacturing, software and analysis), the Pleiades performance justified switching detection chemistries.
The SNAP protocol has evolved over the course of experimentation, but essentially remains as depicted in Figure 1A. Samples are split into tubes containing log dilutions of an internal standard (IS) pool (e.g., 10^2 to 10^7 copies IS per reaction), mastermix, and all twenty-one PCR primer pairs (80 nM each primer), and then amplified by subjecting them to 34 PCR thermal cycles. The internal standard pool is a mixture of 21 synthetic oligos that function as competitive templates, as each differs from the native gene target by two bases in the probe binding sequence. Next, the preamplified PCR products are diluted 500-fold in mastermix, loaded into an OpenArray and undergo 30 additional PCR cycles followed by melt curve analysis. OpenArray nanoplates were manufactured such that primers/probe for all 21 assays were individually loaded into 63 separate wells in the nanoplate (i.e., each assay has three technical replicates). The probe hybridization difference between IS and native template (NT) is somewhat adjustable, but a ΔTm of 15°C between products produced good melting curve separation, which in turn improved the ability to estimate the IS:NT molar ratio in each sample. For accurate SNAP measurements, the IS and NT melting curve separation, or signal-to-noise (S/N), needs to be greater than ten, and preferably greater than fifty (e.g., see Materials and Methods section for S/N calculation). Nevertheless, the current designs proved sufficient for demonstrating >95% linearity calculated from a 6-point standard curve of cDNA from flash frozen lung samples and <20% CV for genes with >1000 starting copies for each reference normalized prognostic measurement (ΔCt). Multiple rounds of Pleiades probe design and testing can be used to reach the preferred S/N for each assay.
The conversion of melting curve data into transcript abundance is based on data establishing melting curve parameters for each NT and IS. Pleiades probe melting curves of samples with either IS or NT were fit to a variable-slope sigmoid curve, and the resulting Tm and Hill coefficient were saved as input parameters for SNAP analysis. Figures 13A-13C depict the SNAP sample analysis workflow, beginning with determining the individual contributions of NT and IS by fitting the melting curve for each sample-assay-IS combination. From this, the transcript abundance for each sample-assay combination is derived from the EC50 value of a sigmoid curve fit to a Fraction NT vs. log[IS] plot. MatLab scripts were written to take the OpenArray melting curve output files and generate transcript abundance measurements for each sample-assay combination. Excel was used to perform the final analysis for each milestone.
The accuracy of a platform/technology is a combination of the observed change in signal generated in response to sample input, and reproducibility. Thus, the experiments in this study demonstrated the SNAP signal response and reproducibility using six half-log cDNA dilutions (Figure 14). For all assays, the linearity of the SNAP response was >0.99 as determined by Pearson correlation. Signal response was estimated by the slope, which averaged 1.12 ± 0.08. Ideally the slope should be 1.0, but we suspect the 12% bias resulted from systematic errors in handling the cDNA and internal standard dilutions. The success of this effort can be seen in the 1.03 average slope (Table 4). Precautions can be implemented to ensure more accurate dilutions; these include transferring larger volumes, use of siliconized tubes, inclusion of carrier salmon genomic DNA (10 pg/ul) and heating DNA solutions prior to dilution (5 min @ 95°C). Thus, the SNAP method achieves >95% linearity in response.
Table 4: Linear regression and CV for assays (Figure 14).
ASSAY   SLOPE   R^2    CV
anxa    1.08    1.00   9%
cpeb    1.03    1.00   14%
dlg     1.15    0.99   21%
dusp    1.11    1.00   22%
erbb    1.36    1.00   59%
frap    1.10    1.00   14%
gusb    1.12    1.00   12%
hgf     1.29    0.99   17%
hmmrp   1.12    1.00   16%
irf4    1.25    0.99   15%
lck     1.07    1.00   17%
mln     1.07    1.00   16%
mmd     1.14    1.00   23%
nf1     1.06    1.00   11%
ppia    1.07    1.00   10%
rnf     1.08    1.00   21%
stat1   1.08    1.00   14%
stat2   1.14    1.00   27%
tbp     1.01    1.00   19%
ucbh    1.07    1.00   23%
znf     1.17    1.00   24%
Values in Table 4 were derived from the average of three sample replicates; linear regression (plot lines) was used to calculate slope and R^2. The CV for each assay was calculated by normalizing the transcript abundance for each sample to the five reference genes (underlined assays), and then combining all results from samples whose total input cDNA was greater than 1000 starting copies.
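The reference normalization and CV calculation described above can be illustrated with a short sketch. It assumes that the reference genes are combined as a geometric mean, which the text does not specify, and the function names and copy numbers are hypothetical.

```python
import math
import statistics


def reference_normalize(copies, reference_genes):
    """Divide each copy number by the geometric mean of the reference genes.

    The geometric mean is an assumption made for illustration.
    """
    log_refs = [math.log(copies[gene]) for gene in reference_genes]
    reference_level = math.exp(sum(log_refs) / len(log_refs))
    return {gene: value / reference_level for gene, value in copies.items()}


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (STD/MEAN)."""
    return statistics.stdev(values) / statistics.mean(values)


# Three made-up sample replicates for one target and two reference genes.
replicates = [
    {"erbb": 1900.0, "gusb": 400.0, "tbp": 100.0},
    {"erbb": 2100.0, "gusb": 420.0, "tbp": 95.0},
    {"erbb": 2000.0, "gusb": 380.0, "tbp": 105.0},
]
normalized = [reference_normalize(r, ["gusb", "tbp"])["erbb"] for r in replicates]
print(coefficient_of_variation(normalized))
```

Normalizing before pooling removes variation in total input between replicates, so the resulting CV reflects assay precision rather than loading differences.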
Precision is another requirement for analytic accuracy. SNAP precision was estimated by requiring any sample with >1000 starting copies to show <50% CV. Importantly in this regard, the SNAP method divides the sample into eight internal standard amplification tubes; therefore, each of the individual amplifications may have as few as 125 starting copies. To calculate precision, the SNAP transcript abundance of any assay-sample combination with >1000 starting copies was reference normalized (similar to ΔCt for real-time data), and a STD/MEAN calculation was made by pooling the normalized values (Table 5).
Table 5: Sample replicate precision for reference normalized copies (>50% = red boxes).
Assay   120   60    30
anxa    2%    17%   9%
cpeb    3%    18%   1%
dlg     1     17%   12%
dusp    3%    8%    25%
erbb    4%    15%   8%
frap    3%    20%   39%
Figure imgf000049_0001
mln     3%    4%    7%
mmd     7%    13%   18%
nf1     6%    12%   3%
ppia    5%    12%   24%
rnf     4%    8%    29%
stat1   6%    5%    8%
stat2   6%    29%   31%
tbp     9%    5%    13%
ucbh    4%    7%    8%
znf     23%   21%   (illegible)
The <50% precision criteria were observed in 20 of the 21 assays (mean 19%± 10%). As ERBB had been demonstrated to perform under similar cDNA input concentrations, some distortion resulting from either IS or sample preparation resulted in ERBB not performing to <50% precision criteria. Together, these results indicate that SNAP meets the accuracy requirements of <50% CV at >1000 starting copies from three sample replicates of pooled cDNA from flash frozen lung samples.
Example 11. SNAP requires at least 5-fold less RNA input than real-time QPCR.
By determining the minimal RNA test input at which at least one assay fails the precision requirement (50% CV), it is demonstrated that SNAP requires at least 5-fold less RNA input than real-time QPCR. This study compares the analytic sensitivity of SNAP and real-time qPCR. Analytic sensitivity is important when working with highly degraded or limited samples such as formalin fixed tissue specimens or fine needle aspirate biopsies. In this study, the analytic sensitivity of a platform was defined as the sample quantity that resulted in a >50% CV for four inter-day sample replicates.
The study was designed such that each platform (SNAP and Real-time) made 21 transcript measurements from serial dilutions of cDNA (8ng to 160pg). To reduce labor and reagent expense, the real-time platform measured only the nine lowest expressing transcript targets; the rationale being that these assays would fail the precision criteria first as their templates are diluted near single copy. To adjust for this experimental difference, the real-time measurements used 42% (9/21) of the cDNA input (5.76ng to 69pg). Tables 6 and 7 provide the results from this experiment.
Table 6: Four inter day sample replicates of cDNA (ng input, top row of tables) were measured by SNAP.
Figure imgf000050_0001
Table 7: Four inter day sample replicates of cDNA (ng input, top row of tables) were measured by TaqMan real-time qPCR.
8.0 3.6 1.6 0.72
DLG2
HGF 13% 28% 33%
HMMRP 34% 17% 37%
IRF4 23% 24% 24%
LCK 20% 27% 44%
MMD 22% 20% 19%
TBP 22% 46%
ZNF 15% (illegible) 50%
Gus-81 24% 20%
Four inter-day sample replicates of cDNA (ng input, top row of tables) were measured by SNAP (Table 6) or TaqMan real-time qPCR (Table 7). For real-time qPCR, Cts from three technical replicates were averaged each day. Inter-day precision was calculated from the sample replicate STD of copy number for SNAP, and of Ct for real-time. Red shading indicates where the replicate precision (CV) rose above 50%. To reduce labor, the real-time measurements were performed on only the nine lowest transcript targets in the cDNA sample, and only the top 4 sample concentrations are shown as all others failed; 'nd' indicates no detectable product.
The real-time experiment failed in the top cDNA input when measuring DLG2, the lowest abundance cDNA transcript. As expected, the real-time precision continued to fail when targets were diluted to near single copy. SNAP DLG2 precision failure occurred in the 11-fold more dilute sample. On average, SNAP precision failed at >5-fold less sample input than qPCR. The improved SNAP sensitivity was not surprising.
While both technologies are sensitive to Poisson noise at low copy inputs, by necessity, the real-time reactions parsed the input sample into 63 aliquots (21 genes x 3 replicates) in order to generate 63 measurements. SNAP however parsed the sample into 8 initial aliquots (8 internal standard, multiplex pre-amplifications), which, following PCR, produced sufficient quantities to make 504 melting curve measurements (8 IS x 21 genes x 3 replicates). Thus the SNAP method had an 8-fold increase in input concentration and was therefore less susceptible to Poisson noise. This improved sensitivity of SNAP relative to real-time PCR will only increase as the number of targets in a SNAP test panel increases and this is a major advantage of the technology. These results show that SNAP demonstrated >5 fold improved analytic sensitivity over real-time qPCR.
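The Poisson argument above can be made concrete with a short calculation. The sketch assumes that the number of template copies landing in each aliquot is Poisson distributed, so that the sampling CV is one over the square root of the mean copies per aliquot. The total copy number used here is an arbitrary illustrative value.

```python
import math


def copies_per_aliquot(total_copies, n_aliquots):
    """Mean starting copies per reaction when a sample is split evenly."""
    return total_copies / n_aliquots


def poisson_cv(mean_copies):
    """Sampling CV of a Poisson count with the given mean."""
    return 1.0 / math.sqrt(mean_copies)


def probability_empty(mean_copies):
    """Probability that an aliquot receives no template copy at all."""
    return math.exp(-mean_copies)


total = 1000.0  # illustrative copies of one transcript in the whole sample
for label, aliquots in (("real-time, 63 aliquots", 63), ("SNAP, 8 aliquots", 8)):
    mean = copies_per_aliquot(total, aliquots)
    print(label, round(mean, 1), round(poisson_cv(mean), 3), probability_empty(mean))
```

Under this assumption the sampling CV of the real-time reactions is larger than that of the SNAP preamplifications by a factor of sqrt(63/8), roughly 2.8, for any total copy number, and the gap widens as more targets are added to the real-time split.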
By testing cDNA isolated from seven stage I/II lung tumor resections, the ICC for each prognostic assay across three laboratory sites can demonstrate better inter-laboratory correlation for SNAP than for non-standardized real-time QPCR. Thus, the inter-laboratory concordance of the SNAP and real-time qPCR platforms can be studied. Seven cDNA samples are divided into 8ng aliquots and distributed to three laboratory sites each for SNAP and qPCR, where four inter-day measurements of 21 transcripts are made. An intraclass correlation coefficient (ICC) is used for comparison as an assessment of quantitative reproducibility by different sites. Owing to the standardized nature of SNAP, the inter-laboratory measurements for SNAP are more similar than those for real-time qPCR.
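As an illustration of how an intraclass correlation can summarize agreement between sites, the following sketch computes a one-way random-effects ICC from a table with one row per sample and one column per site. The text does not state which form of the ICC is intended, so the one-way form is an assumption, and the measurements shown are made up.

```python
def icc_oneway(table):
    """One-way random-effects ICC for a list of rows (samples) by columns (sites)."""
    n = len(table)
    k = len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    # Between-sample and within-sample mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(table, row_means) for x in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Made-up log copy numbers for four samples measured at three sites.
measurements = [
    [3.1, 3.0, 3.2],
    [4.0, 4.1, 3.9],
    [2.5, 2.6, 2.4],
    [3.6, 3.5, 3.7],
]
print(icc_oneway(measurements))
```

Values near 1 indicate that most of the variance comes from true differences between samples rather than from differences between sites.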
Example 12. SNAP using RNA from FFPE tissue
The lung prognostic panel SNAP assays were designed to work with highly degraded samples. In initial studies, the tasks of measuring SNAP analytic performance were simplified by using the higher yields and transcript concentration of RNA isolated from fresh frozen tumor samples. Thus, it was important to demonstrate the test panel response to RNA derived from FFPE samples. RNA isolated from FFPE lung resection tumor blocks was converted into cDNA using random priming and MMLV reverse transcriptase. Three serial twofold dilutions were measured by SNAP in triplicate (Figure 15). Per nanogram RNA equivalence, the average transcript abundance was ~130-fold less than for the fresh frozen samples used in Examples 10 and 11. This observation indicates the relatively poor quality of the FFPE RNA for reverse transcription, arising from fragmented transcripts and cross linking to protein. However, SNAP measurements of 60ng FFPE and 0.7ng fresh frozen sample RNAs had a similar copy number and precision. Significantly, under these conditions, the FFPE sample was too diluted for real-time qPCR to reliably detect product, whereas SNAP was able to detect all products with a reasonable precision. Based on these results, SNAP is capable of enabling high quality transcript measurement of many targets with limiting sample, with the expectation that the SNAP assay reliably scales up to >100 transcript measurements.
Example 13. SNAP transcript abundance assays for a test panel of 60 lung adenocarcinoma prognostic genes.
SNAP transcript abundance assays can be designed, constructed, and tested to validate the analytic performance of SNAP assays for genes in a lung adenocarcinoma prognostic test panel for a set of 60 genes. These assays can be used to identify a gene expression signature predictive of high-risk patients with lung adenocarcinoma. Steps for SNAP assay panel construction are outlined below:
1. Gene-set Selection
2. Select ERCC RT/Sample load control RNA target
3. Amplification primer design
4. Amplification primer specificity testing
5. Amplification primer singleplex sensitivity testing
6. Amplification primer multiplex sensitivity testing
7. Design and test Pleiades probes and internal standards
8. Test entire panel accuracy
9. Scale-up internal standard pool
10. Re-test entire panel accuracy
The resulting panel measures transcript abundance of the FFPE samples.
Analytic specificity of the SNAP assay is measured and determined as follows. Primer pair characterization is performed by examining microplate PCR product of lung tumor cDNA (10 ng), human genomic DNA (10 ng) and no template control (NTC). SNAP involves a single PCR amplification condition for all assays. Primer design algorithms, as are known in the art, yield assays with the required performance in several standard commercial master mixes (ABI Fast Sybr Green master mix, ABI GeneAmp Fast PCR MasterMix, Roche Taq Gold/3mM Mg) at 60°C annealing. The specificity of each assay is determined by amplifying each assay-sample combination using ABI Fast Sybr Green master mix, collecting melting curve data for later analysis, subjecting the PCR products to polyacrylamide gel electrophoresis and observing products migrating with the correct mobility for cDNA samples, and the absence of non-specific products in all samples. PCR template (see, e.g., Materials and Methods) can be isolated from leftover PCR by QIAquick PCR Purification kits, and quantified by NanoDrop spectrophotometry.
Analytic sensitivity of the SNAP assay is measured and determined as follows. Primer pairs passing the specificity screen are measured for their ability to detect less than ten input copies of template. The goal for selecting primer pairs to move into a SNAP assay is to generate primers that amplify less than ten starting copies in a multiplex PCR environment. To facilitate primer pair screening, OpenArrays are prepared preloaded with a unique primer pair per through hole (well), allowing a single OpenArray plate to screen up to 256 primer pairs against 12 samples. Limiting dilution PCR is used to measure the sensitivity of each assay. Limiting dilution PCR is an endpoint qPCR technique that uses serial dilutions flanking a single starting copy per reaction to estimate sample copy number (34). This method requires identification of positive/negative PCR reactions by SYBR Green melting curve analysis.
The melting curve specific for each product will be pre-determined from the specificity experiment described above.
The purified amplicons isolated above will be pooled at 10^6 copies per μl in carrier DNA (10 pg/μl Salmon Sperm DNA). Melting curve analysis of two-fold serial dilutions of NT pool (64 to 1/16 copies per OpenArray hole (33nl)) in four replicates determines presence or absence of PCR products (the experiment requires 4 OpenArray plates). A copy number estimate with 95% confidence intervals is obtained by entering PCR positive reactions and dilution factors into the software POISSON9 (created by N. Iscove Jan 1996). By comparing the limiting dilution derived copy number to the NanoDrop estimated copy number, one can establish whether an assay is capable of analytic sensitivity below ten starting copies. When this approach was applied to the 21 SNAP assays, all assays demonstrated near single copy sensitivity. All assays demonstrating less than ten copy sensitivity are tested for multiplex sensitivity.
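The principle behind a limiting dilution copy number estimate can be sketched with the Poisson zero term: if template copies are distributed at random, the fraction of negative reactions at a given dilution is exp(-m), where m is the mean number of copies per reaction. The following is a single-dilution illustration with hypothetical function names and made-up counts. It is not the POISSON9 software, which combines several dilutions and reports confidence intervals.

```python
import math


def mean_copies_per_reaction(n_positive, n_total):
    """Estimate mean copies per reaction from the fraction of negative reactions."""
    if not 0 < n_positive < n_total:
        raise ValueError("need at least one positive and one negative reaction")
    fraction_negative = 1.0 - n_positive / n_total
    return -math.log(fraction_negative)


def stock_copies_per_ul(n_positive, n_total, dilution_factor, reaction_volume_ul):
    """Back-calculate the undiluted concentration from one dilution level."""
    per_reaction = mean_copies_per_reaction(n_positive, n_total)
    return per_reaction * dilution_factor / reaction_volume_ul


# Made-up example: 30 of 48 through-holes positive at a given dilution.
print(mean_copies_per_reaction(30, 48))
```

Dilutions at which every reaction is positive or every reaction is negative carry no information in this single-dilution form, which is why the serial dilutions are chosen to flank one copy per reaction.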
Multiplex analytic sensitivity of the SNAP assay is measured and determined as follows. All primer pairs demonstrating sensitivity of ten copies are characterized for their multiplex analytic sensitivity. Primer pairs meeting the singleplex analytic sensitivity criteria above are pooled at 80 nM each, and melting curve analysis of two-fold serial dilutions of native template (NT) pool (64 to 1/16 copies per OpenArray hole (33nl)) in four replicates determines presence or absence of PCR products. As above, the limiting dilution derived copy number is compared with the NanoDrop estimated copy number to establish whether an assay is capable of analytic sensitivity below ten starting copies. For each gene target, the assay demonstrating the best analytic sensitivity is moved to a probe design phase. If at this stage a gene target does not have a candidate primer pair, a new set of assays is designed and tested as described herein. Half of the SNAP assays tested in this manner (11 assays) showed near single copy sensitivity.
For SNAP Panel Analytic Validation, minimal performance metrics required for the 60 gene SNAP panel are established. Briefly, six half-log serial dilutions of lung tumor cDNA will be measured by SNAP. Each assay should demonstrate >95% linear response and routinely less than 50% CV for samples with greater than 1000 starting copies. Further, no native template signal should be detected in No Template Controls. Failing assays are replaced with new assays.
SNAP assays for the 60 gene panel can be externally scaled up. The reproducible manufacture of the internal standard strip tubes aids in the standardization of the SNAP measurements. To ensure reproducible manufacture, the synthetic oligo internal standard pool is replaced with a more stable cloned IS library. There are several commercial suppliers of plasmids containing a desired DNA sequence (e.g., the ISO9001 certified GenScript, Piscataway, NJ). Plasmids for each internal standard may be obtained from one of these companies. These plasmids are linearized (typically using the rare NotI restriction site), quantified by both NanoDrop fluorometry and limiting dilution PCR (averaging the results) and then pooled at 10^8 per μl in preparation for internal standard strip tube manufacture. A small scale manufacture protocol has been developed to ensure that multiple lots of internal standards generate consistent results. Using this internal standard pool, the SNAP panel analytic validation experiment described above is repeated. Any issues with the new IS pool are addressed and corrected. Once the SNAP assay passes this final validation, the panel is ready for measuring the FFPE samples.
Example 14. SNAP assays for a test panel of 60 lung adenocarcinoma prognostic genes measures transcripts in FFPE samples
The prognostic value of the selected 60-gene SNAP panel is evaluated and a prognostic signature is identified to classify patients into high and low risk groups based on overall survival. In addition, the effect of adding standard clinical variables into the risk classification algorithm can be evaluated. This is achieved in two steps: all, or a subset of genes are used to identify an expression signature that is univariately correlated with overall patient survival and a cut-off value to classify patients into high and low risk groups is determined. Next, the gene expression signature in the presence of clinical covariates is determined and independent prognostic factors are combined to develop a multifactorial risk classification. The prognostic potential in lung adenocarcinoma patients of the selected 60-gene SNAP assay is determined, and a risk classification scheme is identified that can be tested in much larger, independent patient cohorts for true test validation.
Tissues for this study are obtained from patients treated for lung cancer by surgeons in the Division of Thoracic and Foregut Surgery at the University of Rochester Medical Center (URMC) between 2003 and 2007. FFPE tissue blocks from these patients are stored in the URMC Department of Pathology archives.
All patients in the cohort had a primary diagnosis of lung adenocarcinoma. However, this is a heterogeneous tumor histology made up of several sub-histologies as shown in Figure 16. The predominant subtype is simple adenocarcinoma (60%), followed by adenocarcinoma with bronchioloalveolar features (BAC) and then mixed histologic subtype. Other subtypes make up approximately 10% of the cohort. Pathologic staging of the cohort is shown in Table 8.
Table 8: Pathologic stage of the study cohort (n = 187).
Pathologic Stage of Potential Study Patients (n=187)
                          n      %
Stage I                   109    58.3%
Stage II                  27     14.4%
Stage III                 36     19.3%
Unknown (pNxMx)           15     8.0%
Pathologic T-Stage
pT1                       76     40.6%
pT2                       46     24.6%
pT3                       7      3.7%
pT4                       9      4.8%
Pathologic Nodal Stage
pN0                       121    64.7%
pN1                       22     11.8%
pN2                       28     15.0%
pN3                       0      0.0%
pNX                       16     8.6%
The majority of the cohort (58%) had stage I disease and 65% had no detected lymph node involvement. This cohort is very similar to those studied in the Directors Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma (3), and has the sample diversity and numbers to support the prognostic signature development.
For all eligible patients described above, the first requirement is to review the original H&E slides of the tumors and to identify tissue blocks that 1) are representative of the tumor and 2) contain an area of tissue comprising at least 70% tumor cells. The primary tumor histology and staging information is reviewed. From this review, one or two tumor blocks are identified for each patient and a request is made to retrieve the blocks from the archives. A single 4μm section from each tissue block is stained with H&E and evaluated before making a decision regarding which tissue block is used for the molecular analysis. The basis for this evaluation includes histologic criteria (good tumor representation, no necrosis, lack of contaminating tissues/structures etc.), tissue size, and the need to macrodissect in order to obtain >70% tumor cellularity. A high resolution photograph may be taken of the H&E stained slide from each tissue block for later use.
Transcript abundance for the 60 gene prognostic panel is measured by SNAP for up to 250 cDNA samples. Based on previous SNAP transcript abundance measurements of FFPE samples described herein and the expected protocol improvements to both RNA isolation from FFPE and cDNA conversion efficiency, 100 ng RNA equivalence is expected to be sufficient for accurate transcript abundance measurement of all 60 prognostic genes. Reference normalized gene expression data are provided for further statistical analysis.
Determination of a prognostic signature from the 60 gene panel can be used to generate a risk score that is associated with patient prognosis, either with or without clinical covariates. A risk score cut-off value may be determined that classifies patients into high and low risk groups for overall survival. This classifier can be used to determine the reproducibility of the SNAP assay for patient risk classification. There are multiple statistical methods available by which a risk score can be generated from the 60 gene panel. Three methods of increasing complexity to predict overall survival in the cohort of up to 187 lung adenocarcinomas are available. A first method, the compound covariate method, is a standard prediction approach that has been used successfully in microarray studies with specific applications to lung adenocarcinoma (1). Alternative approaches may be used because they have characteristics distinct from the compound covariate method and from each other. Specifically, a second method, semi-supervised clustering (39), exploits the correlation among groups of individual genes. A third method, random forests, allows for nonlinear effects and interactions among predictors. Any of these methods allows full prediction of outcome (overall survival) at the patient level with and without clinical covariates and permits rigorous internal cross-validation. Any approach identified as providing a significant association with prognosis can be used to stratify patients into risk groups.
To ensure there is adequate power in the data to derive a meaningful prediction rule the number of patients required to enable 80% power to detect a significant gene signature in the cohort of 150 - 187 patients with up to 9 years of observation is computed. The individual gene signature GSj for patient j serves as a risk score for each patient. Patients are classified into two cohorts by the risk score median. The power for a test of survival differences among the 2 groups is provided for the gene signature alone and for the addition of the gene signature to N stage. This approach permits the estimate of power when the gene signature serves as a covariate by itself and along with an important clinical covariate.
Differences between two groups can be characterized by the hazard ratio. Based upon AJCC stage distribution of the cohort with 5 years of accrual and 4 years of additional follow-up, a probability of 5 year overall survival of .21 is assumed. Applying the expected censoring pattern and the probability of 5 year survival to the cohort, 73% of study patients are predicted to die by January 2012. Because the number of patients who are eligible may be variable, the minimum number of patient accruals needed to achieve 80% power is provided for various 2 group hazard ratios.
Sample size is shown for a one tailed test of the significance of the proportional hazards regression coefficient at α= .05 for testing the difference between equally sized low risk and high risk groups (Table 9).
Table 9: Cohort size requirements to identify hazard ratios >1.5 with or without inclusion of clinical covariates with the gene signature.
Figure imgf000059_0001
The table also shows the number of patients required for testing the gene signature in the presence of a strong clinical covariate. Sample size was calculated for the effect of the gene signature alone and in the presence of a second set of clinical or pathologic covariate(s) (e.g., AJCC stage) (40). Assuming at least 150 accruals, hazard ratios between 1.5 and 3 and a correlation of .25 between the gene signature and one or more clinical/pathologic covariates, we estimate that we can detect gene signature-associated hazard ratios of 1.75 or more using a one tailed test at α = .05.
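The kind of calculation behind such a cohort size table can be sketched with a widely used approximation for proportional hazards regression: for two equally sized groups, the required number of events is 4(z_alpha + z_beta)^2 / (ln HR)^2, inflated by 1/(1 - rho^2) when the covariate of interest is correlated with other covariates. Whether this is the exact method of reference (40) or of Table 9 is an assumption, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def required_events(hazard_ratio, alpha=0.05, power=0.80, correlation=0.0):
    """Approximate number of deaths needed for a one-tailed test, two equal groups."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1.0 - alpha)  # one-tailed critical value
    z_beta = standard_normal.inv_cdf(power)
    events = 4.0 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2
    # Variance inflation for correlation with other covariates (assumed form).
    return events / (1.0 - correlation ** 2)


def required_patients(hazard_ratio, event_fraction, **kwargs):
    """Convert required events into accruals, given the expected fraction who die."""
    return math.ceil(required_events(hazard_ratio, **kwargs) / event_fraction)


for hr in (1.5, 1.75, 2.0, 3.0):
    alone = required_patients(hr, 0.73)
    adjusted = required_patients(hr, 0.73, correlation=0.25)
    print(hr, alone, adjusted)
```

Under these assumptions a hazard ratio of 1.75 needs roughly 80 events, which at the 73% expected mortality stated above corresponds to fewer than 150 accruals with or without the correlation adjustment.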
Example 15. Inter-site reproducibility and deployability of SNAP assays for a test panel of 60 lung adenocarcinoma prognostic genes.
The SNAP platform has inter-site reproducibility and deployability. These features can be achieved regardless of any prognostic utility of the specific, 60-gene lung cancer signature and this feature is important independent of clinical value. Multi-gene expression signatures for many different disease related endpoints can be refined and validated to generate a reproducible and deployable SNAP assay. Thus, SNAP assay precision can be used to determine patient classification.
Assay precision is a combination of variability in the measurement itself at the same institution and the variability that would result from different labs conducting the same measurement. To address this for SNAP, a reliability experiment to evaluate intra- and inter-site reproducibility is conducted as follows. From an initial site, cDNA from 15 FFPE tissue samples is divided into three subsamples and shipped to three sites, A, B and C. At each site, each of the 15 samples is further subdivided into 3 subsamples and the 60 gene SNAP assay is run on each subsample on separate days. The intra-site reproducibility is then estimated for each lab. A mixed effects linear model is used to fit all data, setting the sites as fixed effects and the original samples and subsamples at each site as random effects. The dependent variable is the continuously scaled gene signature (risk score). The equality of outcome by site is tested and the contribution to total variance is estimated for subject, site and replicate within site. The SNAP assay risk score reliability is characterized by coefficient of variation, intra-class correlation coefficient, standard error of measurement and prediction interval for a new observation. Unfortunately, no publicly available benchmark data exist with which to compare SNAP reproducibility metrics for measuring risk score. The closest such data set is derived from the Microarray Quality Control (MAQC)(18) studies, but this work focused on gene-level reproducibility and not the reproducibility of multi-gene signatures. To measure success or adequacy in the absence of an accepted standard, the frequency of an inconclusive test result due to lack of test precision is evaluated, in addition to the standard metrics described above. The frequency of an inconclusive test result due to lack of test precision is an intuitive endpoint that can be evaluated and has practical utility. This endpoint and its evaluation are described below.
First, the inter-site standard error of measurement is used to estimate a prediction interval for a new observation. This interval contains the true gene signature score with high probability (e.g., 0.95 or 0.99). The width of this prediction interval is related to risk classification, and the proportion of patients at risk of misclassification is estimated. A hypothetical example (Figure 17) illustrates this concept.
In this example, 100 patients with gene signature scores from 10 to 50 are split into high and low risk groups at the median of 30. A prediction interval for a new observation at the median is superimposed on the distributions. All values of the gene signature within this interval are considered consistent with the new observation. In this example, the prediction interval is 30 ± 4 and covers about 20% of the data. Thus one may conclude in this hypothetical example that the 20% of measurements (patients) nearest the cutoff of 30 will be at risk of misclassification. For these patients, the test result may more accurately be described as inconclusive, as the error in measurement precludes definitive positioning in one or the other risk group. The actual risk of misclassification depends upon (1) the intra-lab variability, (2) the inter-lab variability, and (3) the observed distribution of gene signature scores (see, e.g., Example 14). The prediction interval method is a probability statement and therefore cannot identify which observations are truly inconclusive, only those that have a high probability of being so.
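The hypothetical example above can be reproduced with a short calculation. The following is a minimal sketch only: it assumes the 100 scores are evenly spaced between 10 and 50 (the text does not state the distribution), and the function name `inconclusive_fraction` is illustrative rather than part of SNAP.

```python
def inconclusive_fraction(scores, cutoff, half_width):
    """Fraction of scores lying within cutoff +/- half_width.

    Scores in this band cannot be assigned to a risk group with confidence,
    because the prediction interval around them straddles the cutoff.
    """
    near_cutoff = [s for s in scores if abs(s - cutoff) <= half_width]
    return len(near_cutoff) / len(scores)


# Assumption: 100 hypothetical patients, scores evenly spaced over 10..50.
scores = [10 + 40 * i / 99 for i in range(100)]
frac = inconclusive_fraction(scores, cutoff=30, half_width=4)
print(f"{frac:.0%} of patients fall in the inconclusive band")
```

With a different score distribution (for example, one concentrated near the cutoff) the same interval width would flag a larger share of patients, which is why the observed distribution of scores matters.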
Quantitatively assessing the probability of an inconclusive new test result near the classification boundary thereby estimates the implications of imprecision in the clinical use of the gene signature. This is more intuitive than metrics such as CV, standard error of measurement, etc. and has more practical utility. For example, for a patient with a test result too close to the classification boundary, the test could be repeated or the patient could be placed into an intermediate risk group. In this regard, there is at least one benchmark for comparison. In a 2004 NEJM study of 675 breast cancer patients, the Oncotype DX test from Genomic Health placed 22% of patients into an intermediate risk group (14). Although the Oncotype DX intermediate risk group was not determined using the same approach described herein, the SNAP assay reproducibility is such that no more than 20% of patients receive an "inconclusive or intermediate" classification, so the test has sufficient reproducibility to be of clinical value. Furthermore, this feature represents reproducibility of SNAP performed at multiple sites rather than at a single site, as is the case for all currently available multi-gene expression tests.
SNAP reliability can be determined by comparing results obtained at three sites. The recommended sample size for such a study is derived from the association between the number of independent samples, the number of replicates and the intra-class correlation coefficient (ICC). To ensure a high probability that the ICC exceeds 0.80, the joint effect of the number of samples and the number of replicates per sample on the width of a one-sided confidence interval for the ICC has been calculated. Figure 18, prepared for a true ICC of 0.90, shows the tradeoff between number of replicates and precision in estimating the ICC. The lower bound of an ICC estimate has been limited to 0.10 below the estimated ICC. If the observed ICC is 0.90, then with high probability the lower bound is at least 0.80 with 3 replicates of 15 samples.
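For orientation, the ICC and the standard error of measurement can be estimated from replicate risk scores as sketched below. This is a simplified stand-in, not the mixed effects model described above: it uses a one-way random-effects ANOVA on replicates from a single site, and the function name `one_way_icc` and the example scores are hypothetical.

```python
from statistics import mean


def one_way_icc(scores):
    """One-way random-effects ICC and standard error of measurement.

    scores: list of per-sample lists, each holding the same number of
    replicate risk scores (e.g., 15 samples x 3 replicates).
    """
    n = len(scores)          # number of samples
    k = len(scores[0])       # replicates per sample
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    # Between-sample and within-sample mean squares
    ms_between = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    icc = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    sem = ms_within ** 0.5   # standard error of measurement
    return icc, sem


# Hypothetical risk scores: 3 samples, 3 replicates each
icc, sem = one_way_icc([[9, 10, 11], [19, 20, 21], [29, 30, 31]])
print(round(icc, 3), sem)
```

An ICC near 1 indicates that almost all variance is between samples rather than between replicates of the same sample.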
Results reported herein were obtained using the following methods and materials unless indicated otherwise.
Gene-set selection
A 16-gene panel of lung cancer diagnostic genes identified by Chen et al. was selected. While the 16-gene signature of Chen et al. fared as well as most other signatures (particularly when clinical covariates were included), it did not consistently provide significant risk classification in all analysis cohorts. However, a panel of lung cancer diagnostic genes identified in a much larger study (Beer et al., Nat Med 2002 August;8(8):816-24) is also used. Beer et al. independently evaluated several new and previously published gene-sets (including the Chen et al. signature). Although there are some indications that prognostic signatures in lung cancer may have value across different tumor histologies (2;32), the vast majority of studies have focused on lung adenocarcinoma. Even so, there is extensive heterogeneity (in fact, almost no overlap) in the gene sets identified in different studies and patient cohorts, and even between gene sets identified using the same patient cohort and array data (3).
Beer et al. (1) previously identified 50 survival-related genes for identifying high-risk patients with lung adenocarcinoma and more recently examined a data set composed of 442 adenocarcinomas (3). Subsequently, Beer et al. have used a combination of statistical and biological approaches to identify subset(s) of genes from this large dataset that are prognostic for survival of patients with lung adenocarcinoma, based on the following assumptions: genes whose expression is highly correlated should be separated into clusters (i.e., into similar biological functional groups); and there exist some clusters, and subsets of genes in each selected cluster, which are the most prognostic for survival. This approach is analogous to the commonly used approach of principal components to reduce dimension. The cluster approach to dimension reduction is more likely to be effective because it more closely mimics the underlying biology.
Simultaneous cluster and gene selection
A two-stage selection procedure is implemented: the first selection is at the cluster level and the second is among genes within each selected cluster. The Cox proportional hazards model is used to implement the proposed method. The selection scheme is conducted in a Bayesian framework using an iterative algorithm. The first step is to select C clusters (C much less than K, the total number of clusters) that are relevant to the survival outcome. Within each of the C selected clusters, the second selection identifies a subset of genes prognostic for survival. Both the clusters and the genes within clusters are selected based on current probabilities in the Bayesian model. The parameters of the model are updated for each selected set of genes. These steps are repeated until the simulation chain converges. At the end of the analysis, the empirical frequencies of the visits by the models with different numbers of clusters are calculated. From these, the most promising clusters for survival outcomes are ranked, and prognostic genes within each of the selected clusters are identified. The same estimation scheme is conducted for the identification of prognostic genes within each of the selected clusters. Threshold values were chosen to yield approximately 40-60 clusters and a total of approximately 300 genes, and qRT-PCR was then performed on a subset of 50 adenocarcinomas. Those genes with the most significant association between the Affymetrix and qRT-PCR measures were selected. This has identified an enriched set of 90 genes which are being evaluated in another, independent patient cohort. These data will be available prior to the proposed start date of this grant period and will be used, in consultation with Dr. Beer, to select a subset of up to 60 genes that will be utilized in this proposal. This set may or may not include genes from the original 16-gene panel used in previous studies, but it will include the four endogenous control genes identified as being the most stable.
Thus, a total of 64 genes are analyzed using SNAP.
Reverse Transcription Control.
A synthetic RNA control is used to ensure the reverse transcription proceeded with the expected yield. The FDA is exploring the use of such controls in QA for RNA quantification. These RNA molecules are being developed by the External RNA Control Consortium (ERCC) as a tool for standardizing RNA quantification. These sequences have no homology to known species, so they will be unique in any RNA sample. While the application of RNA standards is still in development, the ERCC goal is for commercial vendors to manufacture and distribute the RNA standards for the purposes of standardizing RNA quantification.
For SNAP, a fixed quantity of RNA (e.g., 10^5 copies) for one of the ERCC controls is spiked into the RNA isolated from FFPE tissue prior to reverse transcription. By measuring the amount of ERCC control in the cDNA, the efficiency of the RT step is estimated. While this control does not account for the chemical damage to the FFPE RNA molecules, it can be used to ensure the RT process meets minimum yield and efficiency requirements. As with the other prognostic assays, three amplification primer pairs are designed for a prognostic assay incorporating an ERCC target. Metrics for using the RT control may be established using methods known in the art. ERCC development of these standards may be followed for implementing the RNA reverse transcription control. QC metrics are based on measurements of FFPE samples. The assay is developed to be useful as a QC for equal cDNA distribution into the multiplex IS PCR strip tubes.
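The RT efficiency estimate described above amounts to a ratio of measured to spiked copies. A minimal sketch follows; the spike-in amount of 1e5 copies is taken from the example quantity in the text, while the function names and the 10% acceptance threshold are purely illustrative assumptions, since the text leaves the actual metric to be established.

```python
def rt_efficiency(measured_cdna_copies, spiked_rna_copies=1e5):
    """Fraction of spiked-in control RNA recovered as cDNA."""
    if spiked_rna_copies <= 0:
        raise ValueError("spiked_rna_copies must be positive")
    return measured_cdna_copies / spiked_rna_copies


def passes_rt_qc(measured_cdna_copies, spiked_rna_copies=1e5, min_efficiency=0.1):
    """Assumed QC rule: accept the sample if efficiency meets a minimum.

    min_efficiency is a placeholder value, not a validated SNAP criterion.
    """
    return rt_efficiency(measured_cdna_copies, spiked_rna_copies) >= min_efficiency


print(rt_efficiency(2.5e4))   # 25,000 of 100,000 copies recovered
print(passes_rt_qc(5e3))      # 5% recovery fails the assumed 10% threshold
```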
Sample loading control
Accurate SNAP measurement may be obtained by distributing an equal amount of cDNA into the six IS multiplex amplification strip tubes. An unequal cDNA distribution may distort the transcript abundance result, as SNAP quantification assumes an equivalent DNA load in each multiplex reaction. Another use for the RT control is as a cDNA sample distribution control. If stable RNA controls are not available, a DNA analog may be substituted for the RNA and a known quantity (e.g., 6x10^5 copies) spiked into the cDNA sample. Unlike the other IS, which range from 10^2 to 10^7 copies per tube, an ERCC IS is added at a single concentration (e.g., 10^5 per tube) to the IS strip tubes during manufacture. Consequently, an equal distribution of cDNA into the six IS multiplex amplification tubes is reflected in a uniform IS:NT ratio measured in each tube. Any significant variation in the ratio would indicate an error in cDNA distribution, and the sample result would be discarded. As with the RT efficiency control, the acceptance criteria for the sample loading control are established over time as a wide variety of SNAP measurements are made.
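One way to express the uniformity check on the six tubes is through the coefficient of variation of the measured IS:NT ratios. The sketch below is an assumption-laden illustration: the function names and the 15% CV limit are hypothetical, because the text states that acceptance criteria are established over time.

```python
from statistics import mean, stdev


def loading_cv(ratios):
    """Coefficient of variation of the control IS:NT ratios across tubes."""
    return stdev(ratios) / mean(ratios)


def loading_ok(ratios, max_cv=0.15):
    """Assumed rule: accept the sample if the six ratios are uniform enough.

    max_cv is a placeholder threshold, not a validated acceptance criterion.
    """
    if len(ratios) != 6:
        raise ValueError("expected one ratio per IS multiplex tube (six tubes)")
    return loading_cv(ratios) <= max_cv


even = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
uneven = [1.0, 1.0, 1.0, 1.0, 1.0, 3.0]   # one tube received far less cDNA
print(loading_ok(even), loading_ok(uneven))
```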
Amplification primer design
RNA extracted from FFPE samples is highly fragmented; therefore, amplification primers for high specificity and sensitivity assays to be used on samples derived from FFPE lung tumor blocks are designed to recognize shorter target sequences (33). For this reason, the primer design restricts PCR product sizes to 70-85 base pairs, in keeping with findings that homogeneous product sizes have better inter-transcript correlation for degraded samples (16). The primer design annealing temperature will be 60 ± 1 °C. The predicted amplicon sequence is BLASTed against the human transcriptome to ensure the uniqueness of primer and probe binding specificity. Despite the use of DNAse in the RNA purification protocol, when possible, primers are designed to span RNA intron/exon splice junctions of >1000 nucleotides; therefore, amplification of genomic contaminants is inhibited by failure to produce full length products. Three primer designs per gene target are expected to be sufficient to yield at least one primer pair meeting the analytic requirements described herein. Primer pairs are redesigned should an assay fail to meet the success metrics established below.
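The design constraints above (70-85 bp product, 60 ± 1 °C annealing temperature, splice-junction spanning when possible) lend themselves to a simple filter over candidate designs. The following is a sketch under assumptions: the `PrimerPair` record, the function names and the candidate values are hypothetical, and the annealing temperatures would come from whatever primer design software is used.

```python
from dataclasses import dataclass


@dataclass
class PrimerPair:
    name: str
    amplicon_length: int          # predicted PCR product size, base pairs
    tm_forward: float             # predicted annealing temperature, deg C
    tm_reverse: float
    spans_splice_junction: bool   # preferred, but only "when possible"


def meets_design_rules(pair, min_len=70, max_len=85, target_tm=60.0, tm_tol=1.0):
    """Hard constraints: product size and annealing temperature window."""
    size_ok = min_len <= pair.amplicon_length <= max_len
    tm_ok = all(
        abs(tm - target_tm) <= tm_tol for tm in (pair.tm_forward, pair.tm_reverse)
    )
    return size_ok and tm_ok


def rank_candidates(pairs):
    """Keep passing designs, listing splice-junction-spanning ones first."""
    passing = [p for p in pairs if meets_design_rules(p)]
    return sorted(passing, key=lambda p: not p.spans_splice_junction)


# Three hypothetical designs for one gene target
candidates = [
    PrimerPair("geneX_a", 78, 60.2, 59.5, False),
    PrimerPair("geneX_b", 95, 60.0, 60.0, True),    # product too long
    PrimerPair("geneX_c", 72, 60.8, 59.1, True),
]
best = rank_candidates(candidates)
print([p.name for p in best])
```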
Reference gene selection
Eleven endogenous control genes were evaluated and five selected as being the most stable in a panel of lung adenocarcinoma specimens. Five reference genes were developed to normalize for cDNA load. In a prognostic gene expression panel, reference genes are introduced into the panel construction at the multiplex analytic sensitivity stage.
Pleiades Probe and Internal Standard Design
In generating SNAP assays, melting probes and internal standards for each amplification target are constructed. Epoch BioSciences, the maker of Pleiades probes, will design and deliver probes with a native template Tm of 63 °C ± 3 °C. Synthetic template oligo internal standards are designed with mutations in the probe binding site that lower the IS binding Tm by 15 °C ± 3 °C. A sequence analysis program (e.g., DINAMelt Server (35)) is used to select the appropriate IS mutations. The metric for determining successful probe and IS design is the signal-to-noise ratio (S/N) in the assay. The S/N of each assay is measured by comparing the signals generated by four replicates of pure NT vs. pure IS samples.
Algorithms used for converting melting curve information into molar ratio measurements are known in the art. Briefly, conversion of melting curve data into transcript abundance begins with establishing melting curve parameters for each NT and IS template. Pleiades probe melting curves of samples with either IS or NT template are fit to a variable-slope sigmoid curve, and the resulting Tm and Hill coefficient are saved as input parameters for SNAP analysis. Next, the melting curves for each sample-assay combination are fit to a two-sigmoid curve using the parameter inputs defined above, allowing Bottom_IS and Bottom_NT to be adjusted to minimize the residuals (Figure 19). The fraction NT is calculated from the Bottom_IS and Bottom_NT solutions.
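The two-sigmoid fit described above can be illustrated with a simplified model. The sketch below makes several assumptions that are not taken from the text: each template contributes a unit sigmoid with fixed Tm and Hill coefficient, the two adjustable "Bottom" parameters are treated as amplitudes scaling those sigmoids, the fit is an ordinary linear least-squares solve, and the fraction NT is taken as the NT amplitude divided by the sum of both amplitudes. All function names and parameter values are hypothetical.

```python
def unit_melt(temp, tm, hill):
    """Variable-slope sigmoid falling from 1 (probe bound) to 0 (melted)."""
    return 1.0 / (1.0 + 10.0 ** ((temp - tm) * hill))


def fit_amplitudes(temps, signal, tm_is, hill_is, tm_nt, hill_nt):
    """Least-squares amplitudes for signal ~ a_is * s_IS(T) + a_nt * s_NT(T).

    Tm and Hill values are held fixed (as if taken from pure-template fits),
    so the problem is linear and is solved via the 2x2 normal equations.
    """
    u = [unit_melt(t, tm_is, hill_is) for t in temps]
    v = [unit_melt(t, tm_nt, hill_nt) for t in temps]
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suy = sum(a * y for a, y in zip(u, signal))
    svy = sum(b * y for b, y in zip(v, signal))
    det = suu * svv - suv * suv
    if det == 0:
        raise ValueError("IS and NT melting profiles are indistinguishable")
    a_is = (suy * svv - suv * svy) / det
    a_nt = (suu * svy - suv * suy) / det
    return a_is, a_nt


def fraction_nt(a_is, a_nt):
    """Assumed definition: NT share of the total fitted amplitude."""
    return a_nt / (a_is + a_nt)


# Synthetic noise-free curve: IS melts 15 deg C below NT, mixed 3:1 (IS:NT)
temps = [40 + 0.5 * i for i in range(81)]
signal = [3.0 * unit_melt(t, 48.0, 0.3) + 1.0 * unit_melt(t, 63.0, 0.3) for t in temps]
a_is, a_nt = fit_amplitudes(temps, signal, 48.0, 0.3, 63.0, 0.3)
print(round(fraction_nt(a_is, a_nt), 3))
```

Fixing the curve shapes in advance and fitting only the two amplitudes keeps the per-sample fit well conditioned, which is one plausible reason for establishing the pure-template parameters first.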
Lastly, the S/N is calculated for each sample based on the four sample replicates. Accurate SNAP measurement requires an S/N > 10. Assays failing to meet this criterion likely require changes, which can be made by selecting different internal standard mutations. Occasionally the probe, as designed, does not generate sufficient on/off signal and is replaced. Three internal standards per probe and ten additional probes are generated for the panel's proposed 60 genes. With the wide latitude in probe placement and design (Epoch uses Minor Groove Binders and modified nucleotides to adjust binding Tm) and numerous options for internal standard probe binding site mutation type and placement, assays with an S/N > 50 can be routinely designed. Assays passing this metric are ready to be measured in the complete panel.
Internal Standard and Amplicon Handling Precautions
The SNAP protocol involves the distribution of samples into tubes containing high concentrations of internal standard. Following SNAP pre-amplification, product is transferred into the OpenArray for detection. Both activities are possible sources of laboratory template contamination. Standard operating protocols for the manufacture of SNAP reagents and the measurement of samples by SNAP minimize the possibility of SNAP reagents and products contaminating the workplace. To minimize the contamination risk, four separate work areas are used for SNAP, each with equipment reserved for handling the SNAP reagents:
1. A clean hood space is designated for making aliquots of primers and probes.
2. A chemical fume hood is used for handling internal standard DNA at >10^8 copies per μl.
3. A bench space is used for loading cDNA samples into SNAP IS strip tubes.
4. A bench space is used for distributing amplified SNAP products into the OpenArray plates.
To reduce the impact of SNAP amplicon contamination, the multiplex amplification incorporates Uracil-N-glycosylase (UNG) treatment prior to thermal cycling to degrade uracil-containing amplicons. All SNAP amplifications incorporate uracil into the DNA template, thus reducing the chance that these products will interfere with SNAP measurements. The SNAP OpenArray detection step does not require UNG treatment, as the samples added to the OpenArray are insensitive to low levels of contamination. For example, ten starting copies in the cDNA, the approximate LOD for SNAP, end up being amplified to >10^6 copies/μl; this sample would only be affected by very high levels of contamination (e.g., >10^5/μl). To ensure clean amplification primers, separate vendors for primers and synthetic template may be used. The SNAP test precautions (work areas 3 and 4 above) are expected to be compatible with existing high complexity clinical laboratories, as these already have separate workspaces for setting up PCR and handling amplicon. Lastly, as the SNAP platform technology evolves, it is reasonable to expect that automation will replace the hands-on steps, thereby making the process more resistant to mishandling.
Patients and tissues
Tissues for this study are obtained from patients treated for lung cancer by surgeons in the Division of Thoracic and Foregut Surgery at the University of Rochester Medical Center (URMC) between 2003 and 2007. FFPE tissue blocks from these patients are stored in the URMC Department of Pathology archives. Due to the relatively recent surgery dates, all of these tissue blocks are stored on-site at URMC and are therefore easily accessible. The specific patient cohort has already been identified and all relevant clinical information has been retrieved and reviewed by a study coordinator with assistance from two Thoracic surgery attendings, Dr. Carolyn Jones and Dr. Daniel Raymond. Clinical covariates already determined include tumor histology, pathologic TNM tumor staging, neoadjuvant and adjuvant therapies, surgical resection type, gender, smoking history (never, past or active smoker at time of surgery), age at surgery and history of prior cancers. Furthermore, in anticipation of attrition in numbers at various steps in the processing and quality control (QC) of specimens, the identified cohort is considerably larger than actually needed for adequate statistical power in the study (discussed below).
Patient eligibility
Inclusion criteria for this study are essentially the same as those used in a previous lung adenocarcinoma study (3). Patients included are those diagnosed with primary lung adenocarcinoma (all histologic subtypes) who were surgically treated with curative intent (complete resection with negative surgical margins). This includes patients in AJCC6 stage groups I-III. No patients to be studied received preoperative chemotherapy or radiation, and there is a minimum of 4 years of follow-up available at the time of data analysis. Patients are excluded if they have a history of prior malignant disease or if death occurred within one month of surgery.
Patient Cohort
From a review of medical records and the Cancer Registry database at URMC, 286 patients were identified who underwent surgery for primary lung adenocarcinoma between 2003 and 2007. Of these, 52 patients can be excluded due to prior tumor history or neoadjuvant therapy and another 30 can be excluded due to stage IV disease and/or positive surgical margins. In addition, 17 patients presented with coincident, multiple lung nodules. Due to questions of accurate staging (synchronous versus metastatic nodules) and which of the tumor tissues should be analyzed for prognostic purposes, it is also prudent to exclude these patients from the study. For the remaining 187 eligible patients, the predominant surgical procedure performed was lobectomy followed by wedge resection, pneumonectomy, bilobectomy and segmentectomy (Figure 20). Other relevant characteristics of the 187 patients and their tumors are described herein.
Cohort Demographics
Of the 187 study patients, 112 (60%) are female and 75 (40%) are male. The average age is 68 with a range from 41-87 years. Racial and ethnic backgrounds of the study cohort have not yet been collated. However, this cohort reflects the population of the Upstate New York area and Rochester, in particular. Thus, the cohort is predominantly white with a small representation of African Americans and few individuals of Hispanic, Asian, or other origin.
Tissue cutting and RNA isolation.
Once the specific tumor blocks to be analyzed are identified, tissues are handed off for molecular analysis. For each tissue block, ten 5-micron sections are cut into each of two Eppendorf tubes containing RNA isolation buffer. One tube is used for RNA isolation and the other is stored at -80 °C as a backup if required for any reason. Protocols for RNA isolation from FFPE tissues are labor intensive and time consuming (33). Many commercially available kits (e.g., High Pure RNA Paraffin Kit; Roche Applied Science, Indianapolis, IN) are available for RNA isolation from FFPE tissues, and several of these have been tested and compared (36). RNA isolation may be performed using minor modifications to the protocol recommended in the High Pure RNA Paraffin Kit. These modifications were determined to provide optimal RNA isolation for gene expression studies from FFPE tissues by investigators at Genomic Health (Redwood City, CA). Once isolated, RNA is quantified using a NanoDrop spectrophotometer and the 260/280 nm absorbance ratio is calculated to assess purity. If necessary, RNA is further purified by phenol-chloroform extraction and precipitation, although this step is optional. Although RNA from FFPE tissues is anticipated to be highly degraded, there is some utility in assessing RNA integrity by electrophoretic analysis (e.g., Agilent Bioanalyzer) (36-38). Madabusi et al. (38) demonstrated that RNA with an RNA Integrity Number (RIN) >1.4 could be successfully used for gene expression analysis. This was also the case in the study by Ribeiro-Silva et al. Furthermore, the Roche RNA isolation kit provided RIN >1.4 in 100% of the samples tested. Isolated RNA can be stored at -80 °C until needed.
cDNA synthesis.
cDNA is synthesized from 100-500 ng of total RNA using a combination of random (octamer) and gene-specific primers. The reverse transcription primer is designed 20-25 bases from the gene-specific PCR primers for all genes in the study. These primers are short (10-15 bases) oligonucleotides with an annealing temperature of 30-35 °C. This combination of random and gene-specific priming can significantly improve the detection of gene expression from FFPE tissues, and should increase the sensitivity and reproducibility of the assays. The specific reverse transcriptase to be used may also impact the sensitivity and will be determined in preliminary experiments testing and comparing several enzymes. Specifically, up to five different commercially available reverse transcriptases (e.g., Omniscript, Superscript II and III, etc.) are tested by reverse transcribing 10 randomly chosen samples with each enzyme, and then performing quantitative PCR for 5 commonly used endogenous control genes spanning a wide range of expression. Absolute cycle threshold (Ct) values are compared statistically in order to determine if any enzyme(s) appear to be consistently better (lower Ct values) than the others.
Statistical analysis.
A risk score cut-off value can be assigned that classifies patients into high and low risk groups for overall survival by determining a risk score that is associated with patient prognosis, either with or without clinical covariates.
There are multiple methods available by which a risk score can be generated from a gene panel. A primary goal, however, is to identify at least one approach that provides significant association with prognosis and that can be used to stratify patients into risk groups. Thus, three methods of increasing complexity to predict overall survival in the cohort of up to 187 lung adenocarcinomas are available. A first method, the compound covariate method, is a standard prediction approach that has been used successfully in microarray studies with specific applications to lung adenocarcinoma (1). Alternative approaches are also available that have characteristics distinct from the compound covariate method and from each other. Specifically, a second method, semi-supervised clustering (39), exploits the correlation among groups of individual genes. A third method, random forests, allows for nonlinear effects and interactions among predictors. All methods allow full prediction of outcome (overall survival) at the patient level with and without clinical covariates and permit rigorous internal cross-validation. The compound covariate method is a linear combination of all genes in the gene signature multiplied by their respective Cox proportional hazards coefficients. Specifically, if q genes G1, G2, G3,..., Gq are selected for the gene signature, then each gene will have an associated regression coefficient β1, β2, β3,..., βq. The combination of these coefficients and the individual gene expression for patient j for each of the q genes will yield a gene signature (GSj) for the jth patient of
GSj = β1G1j + β2G2j + β3G3j + ... + βqGqj
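The linear combination above can be sketched in a few lines. This is an illustration only: the coefficients and expression values are made up, the function names are hypothetical, and the median split shown is simply one way to stratify the resulting scores into two risk groups.

```python
from statistics import median


def gene_signature(betas, expression):
    """Compound covariate score for one patient: sum of beta_i * G_ij."""
    if len(betas) != len(expression):
        raise ValueError("one coefficient is required per gene")
    return sum(b * g for b, g in zip(betas, expression))


def split_at_median(scores):
    """Label each patient 'high' or 'low' risk relative to the median score."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical Cox coefficients for q = 3 genes and expression for 4 patients
betas = [0.5, -1.0, 2.0]
patients = [[1, 2, 3], [2, 0, 1], [0, 1, 0], [4, 1, 2]]
scores = [gene_signature(betas, p) for p in patients]
groups = split_at_median(scores)
print(scores, groups)
```

Because each coefficient comes from a Cox proportional hazards fit, a larger score corresponds to a larger predicted log relative hazard.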
The gene signature GSj is the predicted log relative hazard of death for patient j. The number of genes selected for the gene signature will be based on cross-validated risk stratification. Once patient gene signatures are estimated, they are sorted and divided into high and low risk groups. The extent of separation of the Kaplan-Meier estimates of survival will provide an assessment of gene signature prognosis. Genes that are individually associated with survival will be used to supply 5 to 10 lists of the top n genes for evaluation. For example, the candidate gene lists may consist of the top 5, 10, 15, 20, 25, etc. genes. Each set of top genes will be used to dichotomize the survival data at the median. The list with the best leave-10-out cross-validated separation of high and low risk Kaplan-Meier curves will be selected. After the best number of genes is found, a final compound covariate gene signature will be constructed. Cross validation of the compound covariate predictor will use bootstrap resampling. Specifically, B = 200-500 bootstrap samples will be drawn, and for each the gene signature will be generated, sorted and divided into 2 equal groups. Then the bootstrap gene signature will be applied to the original data and again divided into 2 risk groups. The agreement in classification over the B samples will constitute a bootstrap cross-validated classification accuracy. In addition, bootstrap measures of model validation will be estimated, such as R^2 (the proportion of explained variance), slope calibration (the extent to which the slope would have to be changed so that predicted survival matches observed survival), and Somers' D (a concordance coefficient for binary data).
Semi-supervised clustering combines supervised learning, in which the patient status (vital status, survival time, etc.) is known and is used to find a classifier, and unsupervised learning, in which a classifier ignores patient status.
The "semi-supervised" method of Bair and Tibshirani (39) applies principal components analysis to the set of individual genes selected at the univariate level in method 1 (compound covariate) due to their association with survival. Principal components analysis reduces dimensionality by selecting gene subsets for one principal component that are correlated with one another but uncorrelated with genes in other principal components.
The random forests method is a competitor to both the compound covariate and semi-supervised methods. It would be expected to give results which are comparable to the compound covariate method except if there are substantial nonlinear effects and unexpected interactions among genes. Because recursive partitioning is more sensitive to these phenomena it may provide improved prediction over the compound covariate. Recursive partitioning (also known as classification and regression trees) recursively searches among all covariates for the cutpoint and single covariate providing the maximum separation between groups. To adapt recursive partitioning to right-censored survival data, one scales the data to the parametric exponential distribution and uses the resulting cumulative hazard as the dependent variable. Recursive partitioning classifies each patient into a distinct risk group based upon the similarity of their relative risk to the average relative risk of the terminal nodes. Random forests provide a more robust classification tree. A bootstrap sample (with replacement of the data) is obtained. Prior to each split, a random sample of the predictors is obtained to generate the next split. Patients are placed in classes based on similarity to the average node. Trees can be evaluated by measuring the separation of the Kaplan-Meier survival plots for each terminal node with a log rank test.
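Since each of the methods above is judged by the separation of Kaplan-Meier curves between risk groups, a minimal sketch of the Kaplan-Meier product-limit estimator may be helpful. The function name and the small data set are hypothetical; the log rank test mentioned above is not implemented here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times:  follow-up time for each patient
    events: True if death was observed, False if right-censored
    Returns a list of (time, survival probability) pairs.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        leaving = 0
        # Gather every patient whose follow-up ends at time t
        while i < len(order) and order[i][0] == t:
            if order[i][1]:
                deaths += 1
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving   # censored patients also leave the risk set
    return curve


# Four hypothetical patients; the second is censored at time 2
curve = kaplan_meier([1, 2, 3, 4], [True, False, True, True])
print(curve)
```

Computing one such curve per risk group (or per terminal node of a tree) gives the survival plots whose separation is then assessed.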
Other Embodiments
From the foregoing description, it will be apparent that variations and modifications may be made to the invention described herein to adapt it to various usages and conditions. Such embodiments are also within the scope of the following claims.
The recitation of a listing of elements in any definition of a variable herein includes definitions of that variable as any single element or combination (or subcombination) of listed elements. The recitation of an embodiment herein includes that embodiment as any single embodiment or in combination with any other embodiments or portions thereof.
All patents and publications mentioned in this specification are herein incorporated by reference to the same extent as if each independent patent and publication was specifically and individually indicated to be incorporated by reference.
References
(1) Beer DG, Kardia SL, Huang CC, Giordano TJ, Levin AM, Misek DE, Lin L, Chen G, Gharib TG, Thomas DG, Lizyness ML, Kuick R, Hayasaka S, Taylor JM, Iannettoni MD, Orringer MB, Hanash S. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med 2002 August;8(8):816-24.
(2) Chen HY, Yu SL, Chen CH, Chang GC, Chen CY, Yuan A, Cheng CL, Wang CH, Terng HJ, Kao SF, Chan WK, Li HN, Liu CC, Singh S, Chen WJ, Chen JJ, Yang PC. A five-gene signature and clinical outcome in non-small-cell lung cancer. N Engl J Med 2007 January 4;356(1):11-20.
(3) Shedden K, Taylor JM, Enkemann SA, Tsao MS, Yeatman TJ, Gerald WL, Eschrich S, Jurisica I, Giordano TJ, Misek DE, Chang AC, Zhu CQ, Strumpf D, Hanash S, Shepherd FA, Ding K, Seymour L, Naoki K, Pennell N, Weir B, Verhaak R, Ladd-Acosta C, Golub T, Gruidl M, Sharma A et al. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat Med 2008 August;14(8):822-7.
(4) Sun Z, Wigle DA, Yang P. Non-overlapping and non-cell-type-specific gene expression signatures predict lung cancer survival. J Clin Oncol 2008 February 20;26(6):877-83.
(5) Potti A, Mukherjee S, Petersen R, Dressman HK, Bild A, Koontz J, Kratzke R, Watson MA, Kelley M, Ginsburg GS, West M, Harpole DH, Jr., Nevins JR. A genomic strategy to refine prognosis in early-stage non-small-cell lung cancer. N Engl J Med 2006 August 10;355(6):570-80.
(6) Jemal A, Murray T, Ward E, Samuels A, Tiwari RC, Ghafoor A, Feuer EJ, Thun MJ. Cancer statistics, 2005. CA Cancer J Clin 2005 January;55(1):10-30.
(7) Ries LAG, Melbert D, Krapcho M, Mariotto A, Miller BA, Feuer EJ, Clegg L, Horner MJ, Howlader N, Eisner MP, Reichman M, Edwards BK (eds). SEER Cancer Statistics Review, 1975-2004. Bethesda, MD: National Cancer Institute; 2006.
(8) Little AG, Rusch VW, Bonner JA, Gaspar LE, Green MR, Webb WR, Stewart AK. Patterns of surgical care of lung cancer patients. Ann Thorac Surg 2005 December;80(6):2051-6.
(9) Nesbitt JC, Putnam JB, Jr., Walsh GL, Roth JA, Mountain CF. Survival in early-stage non-small cell lung cancer. Ann Thorac Surg 1995 August;60(2):466-72.
(10) Ramaswamy S, Ross KN, Lander ES, Golub TR. A molecular signature of metastasis in primary solid tumors. Nat Genet 2003 January;33(1):49-54.
(11) Guo L, Ma Y, Ward R, Castranova V, Shi X, Qian Y. Constructing molecular classifiers for the accurate prognosis of lung adenocarcinoma. Clin Cancer Res 2006 June 1;12(11 Pt 1):3344-54.
(12) Ramaswamy S. Translating cancer genomics into clinical oncology. N Engl J Med 2004 April 29;350(18):1814-6.
(13) Sjoblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, Szabo S, Buckhaults P, Farrell C, Meeh P, Markowitz SD, Willis J, Dawson D, Willson JK, Gazdar AF, Hartigan J, Wu L, Liu C, Parmigiani G, Park BH, Bachman KE et al. The consensus coding sequences of human breast and colorectal cancers. Science 2006 October 13;314(5797):268-74.
(14) Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T, Hiller W, Fisher ER, Wickerham DL, Bryant J, Wolmark N. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med 2004 December 30;351(27):2817-26.
(15) Vendrell E, Ribas M, Valls J, Sole X, Grau M, Moreno V, Capella G, Peinado MA. Genomic and transcriptomic prognostic factors in R0 Dukes B and C colorectal cancer patients. Int J Oncol 2007 May;30(5):1099-107.
(16) Cronin M, Pho M, Dutta D, Stephans JC, Shak S, Kiefer MC, Esteban JM, Baker JB. Measurement of gene expression in archival paraffin-embedded tissues: development and performance of a 92-gene reverse transcriptase-polymerase chain reaction assay. Am J Pathol 2004 January;164(1):35-42.
(17) Vandesompele J, De PK, Pattyn F, Poppe B, Van RN, De PA, Speleman F. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biol 2002 June 18;3(7):RESEARCH0034.
(18) Shi L, Reid LH, Jones WD, Shippy R, Warrington JA, Baker SC, Collins PJ, de LF, Kawasaki ES, Lee KY, Luo Y, Sun YA, Willey JC, Setterquist RA, Fischer GM, Tong W, Dragan YP, Dix DJ, Frueh FW, Goodsaid FM, Herman D, Jensen RV, Johnson CD, Lobenhofer EK, Puri RK et al. The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol 2006 September;24(9):1151-61.
(19) Canales RD, Luo Y, Willey JC, Austermiller B, Barbacioru CC, Boysen C, Hunkapiller K, Jensen RV, Knight CR, Lee KY, Ma Y, Maqsodi B, Papallo A, Peters EH, Poulter K, Ruppel PL, Samaha RR, Shi L, Yang W, Zhang L, Goodsaid FM. Evaluation of DNA microarray results with quantitative gene expression platforms. Nat Biotechnol 2006 September;24(9):1115-22.
(20) Crawford EL, Peters GJ, Noordhuis P, Rots MG, Vondracek M, Grafstrom RC, Lieuallen K, Lennon G, Zahorchak RJ, Georgeson MJ, Wali A, Lechner JF, Fan PS, Kahaleh MB, Khuder SA, Warner KA, Weaver DA, Willey JC. Reproducible gene expression measurement among multiple laboratories obtained in a blinded study using standardized RT (StaRT)-PCR. Mol Diagn 2001 December;6(4):217-25.
(21 ) Crawford EL, Warner KA, Khuder SA, Zahorchak RJ, Willey JC . Multiplex standardized RT-PCR for expression analysis of many genes in small samples. Biochem Biophys Res Commun 2002 April 26;293(l):509-16.
(22) Rots MG, Willey JC, Jansen G, Van Zantwijk CH, Noordhuis P, DeMuth JP, Kuiper E, Veerman AJ, Pieters R, Peters GJ. mRNA expression levels of methotrexate resistance-related proteins in childhood leukemia as determined by a standardized competitive template-based RT-PCR method. Leukemia 2000 December;14(12):2166-75.
(23) Peters EH, Rojas-Caro S, Brigell MG, Zahorchak RJ, des Etages SA, Ruppel PL, Knight CR, Austermiller B, Graham MC, Wowk S, Banks S, Madabusi LV, Turk P, Wilder D, Kempfer C, Osborn TW, Willey JC. Quality-controlled measurement methods for quantification of variations in transcript abundance in whole blood samples from healthy volunteers. Clin Chem 2007 June;53(6): 1030-7.
(24) Warner KA, Crawford EL, Zaher A, Coombs RJ, Elsamaloty H, Roshong-Denk SL, Sharief I, Amurao GV, Yoon Y, Al-Astal AY, Assaly RA, Hernandez DA, Graves TG, Knight CR, Harr MW, Sheridan TB, DeMuth JP, Zahorchak RJ, Hammersley JR, Olson DE, Durham SJ, Willey JC. The c-myc x E2F- l/p21 interactive gene expression index augments cytomorphologic diagnosis of lung cancer in fine-needle aspirate specimens. J MoI Diagn 2003 August;5(3): 176-83.
(25) Crawford EL, Blomquist T, Mullins DN, Yoon Y, Hernandez DR, Al- Bagdhadi M, Ruiz J, Hammersley J, Willey JC. CEBPG regulates ERCC5/XPG expression in human bronchial epithelial cells and this regulation is modified by E2F1/YY1 interactions. Carcinogenesis 2007 December;28(12):2552-9.
(26) Allen JT, Knight RA, Bloor CA, Spiteri MA. Enhanced insulin-like growth factor binding protein-related protein 2 (Connective tissue growth factor) expression in patients with idiopathic pulmonary fibrosis and pulmonary sarcoidosis. Am J Respir Cell MoI Biol 1999 December;21(6):693-700. (27) Loitsch SM, Kippenberger S, Dauletbaev N, Wagner TO, Bargon J.
Reverse transcription-competitive multiplex PCR improves quantification of mRNA in clinical samples—application to the low abundance CFTR mRNA. Clin Chem 1999 May;45(5):619-24. (28) Harr MW, Graves TG, Crawford EL, Warner KA, Reed CA, Willey JC. Variation in transcriptional regulation of cyclin dependent kinase inhibitor p21wafl/cipl among human bronchogenic carcinomas. MoI Cancer 2005;4:23.
(29) Weaver DA, Crawford EL, Warner KA, Elkhairi F, Khuder SA, Willey JC. ABCC5, ERCC2, XPA and XRCC 1 transcript abundance levels correlate with cisp latin chemoresistance in non- small cell lung cancer cell lines. MoI Cancer 2005;4(l):18.
(30) Mitra AP, Almal AA, George B, Fry DW, Lenehan PF, Pagliarulo V, Cote RJ, Datar RH, Worzel WP. The use of genetic programming in the analysis of quantitative gene expression profiles for identification of nodal status in bladder cancer. BMC Cancer 2006;6:159.
(31) Morrison T, Hurley J, Garcia J, Yoder K, Katz A, Roberts D, Cho J, Kanigan T, Ilyin SE, Horowitz D, Dixon JM, Brenan CJ. Nanoliter high throughput quantitative PCR. Nucleic Acids Res 2006;34(18):el23. (32) Guo NL, Wan YW, Tosun K, Lin H, Msiska Z, Flynn DC, Remick SC,
Vallyathan V, Dowlati A, Shi X, Castranova V, Beer DG, Qian Y. Confirmation of gene expression-based prediction of survival in non-small cell lung cancer. Clin Cancer Res 2008 December 15;14(24):8213-20.
(33) Godfrey TE, Kim S-H, Chavira M, Ruff DW, Warren RS, Gray JW, Jensen RH. Quantitative mRNA expression analysis from formalin- fixed, paraffin- embedded tissues using 5' nuclease quantitative RT-PCR. Journal of Molecular Diagnostics 2000 May 1;2(2): 84-91.
(34) Morrison TB, Weis JJ, Wittwer CT. Quantification of low-copy transcripts by continuous SYBR Green I monitoring during amplification. Biotechniques 1998 June;24(6):954-8, 960, 962.
(35) Markham NR, Zuker M. DINAMeIt web server for nucleic acid melting prediction. Nucleic Acids Res 2005 July l;33(Web Server issue):W577- W581.
(36) Ribeiro-Silva A, Zhang H, Jeffrey SS. RNA extraction from ten year old formalin- fixed paraffin-embedded breast cancer samples: a comparison of column purification and magnetic bead-based technologies. BMC MoI Biol 2007;8:l 18.
(37) Chung JY, Braunschweig T, Hewitt SM. Optimization of recovery of RNA from formalin- fixed, paraffin-embedded tissue. Diagn MoI Pathol 2006 December; 15(4):229-36. (38) Madabusi LV, Latham GJ, Andruss BF. RNA extraction for arrays. Methods Enzymol 2006;411 :1-14.
(39) Bair E, Tibshirani R. Semi-supervised methods to predict patient survival from gene expression data. PLoS Biol 2004 April;2(4):E108.
(40) Schmoor C, Sauerbrei W, Schumacher M. Sample size considerations for the evaluation of prognostic factors in survival analysis. Stat Med 2000 February 29;19(4):441-52.

Claims

What is claimed is:
1. A method for detecting a gene expression profile in a biological sample, the method comprising the steps of:
(a) preamplifying a biomarker in the presence of a defined competitive reference biomarker;
(b) individually exponentially amplifying the biomarker in the presence of the reference biomarker in a reaction volume of at least about 1-1000 nl;
(c) identifying binding of a first detectable nucleic acid probe to the biomarker and any one of:
(i) binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and
(ii) the melting temperature of the first detectable nucleic acid probe to the biomarker;
(d) determining, respectively, any one of:
(i) the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and
(ii) the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker, wherein the half maximal effective concentration is used to determine the quantity of the biomarker in the sample.
2. A method for detecting a gene expression profile in a biological sample, the method comprising the steps of:
(a) preamplifying a biomarker in the presence of a defined competitive reference biomarker;
(b) individually exponentially amplifying the biomarker in the presence of the reference biomarker in a set of reactions, each reaction having a volume of at least about 1-1000 nl;
(c) identifying binding of a first detectable nucleic acid probe to the biomarker and any one of:
(i) binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and
(ii) the melting temperature of the first detectable nucleic acid probe to the biomarker;
(d) determining, respectively, any one of:
(i) the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and
(ii) the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker; and
(e) plotting the ratio of step (d) against the molar ratio of the reference nucleic acid for the set of reactions, wherein the half maximal effective concentration is used to determine the quantity of the biomarker in the sample.
3. A method for identifying or monitoring a subject as having a pathological condition characterized by an alteration in gene expression, the method comprising the steps of:
(a) preamplifying a biomarker in the presence of a defined competitive reference biomarker;
(b) individually exponentially amplifying the biomarker in the presence of the reference biomarker in a reaction volume of at least about 1-1000 nl; and
(c) detecting the presence or absence of the biomarker and the corresponding reference biomarker, wherein detection of the biomarker indicates that the biomarker is present; and failure to detect the biomarker when the corresponding reference biomarker is detected indicates that the biomarker is absent from the sample.
4. A method for detecting two or more target nucleic acid molecules in a single sample, the method comprising the steps of:
(a) preamplifying the target nucleic acid molecules in the presence of a defined reference nucleic acid molecule;
(b) individually exponentially amplifying each of the target nucleic acid molecules in the presence of the reference nucleic acid molecule in a reaction volume of at least about 1-1000 nl; and
(c) detecting the presence or absence of the target nucleic acid molecules and the reference nucleic acid molecule, wherein detection of the target nucleic acid molecules indicates that the target nucleic acid is present; and failure to detect the target nucleic acid molecule when the reference nucleic acid molecule is detected indicates that the target nucleic acid molecule is absent from the sample.
5. A method for detecting two or more target nucleic acid molecules in a single sample, the method comprising the steps of:
(a) preamplifying a target nucleic acid molecule in the presence of a defined reference nucleic acid molecule for each target;
(b) individually exponentially amplifying each of the target nucleic acid molecules in the presence of the reference nucleic acid molecule in a set of reactions, each reaction having a volume of at least about 1-1000 nl;
(c) identifying binding of a first detectable nucleic acid probe to the target nucleic acid molecule and any one of:
(i) binding of a second detectable nucleic acid probe to the corresponding reference nucleic acid molecule, and
(ii) the melting temperature of the first detectable nucleic acid probe to the target nucleic acid;
(d) determining, respectively, any of:
(i) the ratio of binding to the target nucleic acid and binding to the corresponding reference nucleic acid, and
(ii) the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker; and
(e) plotting the ratio of step (d) against the molar ratio of the reference nucleic acid for the set of reactions, wherein the half maximal effective concentration is used to determine the quantity of the target nucleic acid in the sample.
6. A method for characterizing cancer, the method comprising the steps of:
(a) preamplifying a biomarker in the presence of a defined reference biomarker in a set of reactions, wherein the biomarker is selected from the group consisting of ERBB3, LCK, DUSP6, STAT1, MMD, CPEB4, RNF4, STAT2, NF1, FRAP1, DLG2, IRF4, ANXA5, HMMR, HGF, and ZNF264;
(b) individually exponentially amplifying the biomarker in a reaction having a volume of at least about 1-1000 nl;
(c) identifying binding of a first detectable nucleic acid probe to the biomarker and any one of:
(i) binding of a second detectable nucleic acid probe to the corresponding reference biomarker, and
(ii) the melting temperature of the first detectable nucleic acid probe to the biomarker;
(d) determining, respectively, any one of:
(i) the ratio of binding to the biomarker and binding to the corresponding reference biomarker, and
(ii) the ratio of binding to the biomarker and the melting temperature of the first detectable nucleic acid probe to the biomarker; and
(e) plotting the ratio of step (d) against the molar ratio of the reference nucleic acid for the set of reactions, wherein the half maximal effective concentration is used to determine the quantity of the biomarker in the sample.
7. The method of any of claims 1-5, wherein the sample is detected for a condition selected from the group consisting of neoplasia, inflammation, pathogen infection, immune response, sepsis, the presence of liver metabolites, and the presence of a genetically modified organism.
8. The method of claim 7, wherein detecting the neoplasia is for diagnosing a neoplasia, characterizing a neoplasia to identify tissue of origin, monitoring response of neoplasia to treatment, or predicting the risk of developing a neoplasia.
9. The method of any one of claims 1-6, wherein the target nucleic acid or biomarker is RNA or DNA.
10. The method of any one of claims 4 or 5, wherein step (a) comprises preamplifying the target nucleic acids using primer sets specific for the target nucleic acids.
11. The method of claim 10, wherein the primer set used in step (a) for preamplifying the target nucleic acid molecules is used in step (b) for amplifying the target nucleic acid molecules.
12. The method of claim 10, wherein a first set of primers is used in step (a) for preamplifying the target nucleic acid molecules and a second set of primers is used in step (b) for amplifying the target nucleic acid molecules.
13. The method of any one of claims 1-6, wherein step (a) comprises reverse transcriptase polymerase chain reaction (RT-PCR).
14. The method of any one of claims 1-6, wherein steps (b) and (c) comprise real-time PCR.
15. The method of any one of claims 4 or 5, wherein the nucleic acid probe to the target nucleic acid and the nucleic acid probe to the corresponding reference nucleic acid are fluorogenic.
16. The method of any one of claims 1-6, wherein the reaction occurs in a through-hole of a platen.
17. The method of any one of claims 4 or 5, wherein said target nucleic acid is derived from a bacterium, a virus, a spore, or a eukaryotic cell.
18. The method of claim 8, wherein the eukaryotic cell is a neoplastic cell derived from lung, breast, prostate, thyroid, and pancreas.
19. The method of any one of claims 1-5, wherein the target nucleic acid molecule is derived from a bacterial pathogen selected from the list consisting of Aerobacter, Aeromonas, Acinetobacter, Actinomyces israelii, Agrobacterium, Bacillus, Bacillus anthracis, Bacteroides, Bartonella, Bordetella, Bortella, Borrelia, Brucella, Burkholderia, Calymmatobacterium, Campylobacter, Citrobacter, Clostridium, Clostridium perfringens, Clostridium tetani, Corynebacterium, Corynebacterium diphtheriae, Corynebacterium sp., Enterobacter, Enterobacter aerogenes, Enterococcus, Erysipelothrix rhusiopathiae, Escherichia, Francisella, Fusobacterium nucleatum, Gardnerella, Haemophilus, Hafnia, Helicobacter, Klebsiella, Klebsiella pneumoniae, Lactobacillus, Legionella, Leptospira, Listeria, Morganella, Moraxella, Mycobacterium, Neisseria, Pasteurella, Pasteurella multocida, Proteus, Providencia, Pseudomonas, Rickettsia, Salmonella, Serratia, Shigella, Staphylococcus, Stenotrophomonas, Streptococcus, Streptobacillus moniliformis, Treponema, Treponema pallidum, Treponema pertenue, Xanthomonas, Vibrio, and Yersinia.
20. The method of claim 19, wherein the bacterial pathogen is antibiotic resistant.
21. The method of any one of claims 4 or 5, wherein the target nucleic acid molecule is derived from a virus selected from the list consisting of hepatitis C virus, human immunodeficiency virus, Retrovirus, Picornavirus, polio virus, hepatitis A virus, Enterovirus, human Coxsackie virus, rhinovirus, echovirus, Calicivirus, Togavirus, equine encephalitis virus, rubella virus, Flavivirus, dengue virus, encephalitis virus, yellow fever virus, Coronavirus, Rhabdovirus, vesicular stomatitis virus, rabies virus, Filovirus, ebola virus, Paramyxovirus, parainfluenza virus, mumps virus, measles virus, respiratory syncytial virus, Orthomyxovirus, influenza virus, Hantaan virus, bunya virus, phlebovirus, Nairovirus, Arenavirus, hemorrhagic fever virus, reovirus, orbivirus, Rotavirus, Birnavirus, Hepadnavirus, hepatitis B virus, Parvovirus, Papovavirus, papilloma virus, polyoma virus, adenovirus, herpes simplex virus 1, herpes simplex virus 2, varicella zoster virus, cytomegalovirus, herpes virus, variola virus, vaccinia virus, pox virus, African swine fever virus, Norwalk virus, and astrovirus.
22. The method of any one of claims 1-6, wherein the sample is a biological fluid or tissue sample derived from a patient.
23. The method of claim 22, wherein the sample is selected from the group consisting of blood, serum, urine, semen and saliva.
24. The method of claim 22, wherein the tissue sample is selected from the group consisting of tissue biopsy, formaldehyde fixed paraffin embedded tissue, fine needle aspirate (FNA) biopsy and laser capture micro-dissected samples.
25. The method of any one of claims 1-6, wherein the sample comprises at least about 1-1000 cells.
26. The method of any one of claims 1-6, wherein the sample comprises at least about 1-1000 ng of RNA.
27. The method of any one of claims 1-6, wherein one target nucleic acid molecule can be detected in at least about 50, 25, or 10 copies per reaction or when at least about 50-100 copies of a competing target nucleic acid are present.
28. The method of any one of claims 4 or 5, wherein the method detects a target present at about 1-100 copies/reaction.
29. The method of any one of claims 4 or 5, wherein the method detects a target present at about 5-50 starting copies/reaction.
30. The method of any one of claims 4 or 5, wherein the method detects a target present at about 10 starting copies/reaction.
31. The method of any one of claims 4 or 5, wherein the method detects a target present at about 1 starting copy/reaction.
32. The method of any one of claims 1, 2 or 5, wherein an absolute gene copy number is generated by curve fitting a plot of the ratio of the native/standard signals vs. standard concentration and the concentration (EC50) is used to determine the quantity of the target nucleic acid in the sample.
33. The method of any one of claims 1, 2 or 5, wherein an absolute gene copy number is generated by curve fitting a plot of the ratio of the signal/melting temperature of the detectable nucleic acid probe vs. standard concentration and the concentration (EC50) is used to determine the quantity of the target nucleic acid in the sample.
34. A nanofluidic system comprising a high density array of nanoliter-scale through-holes comprising a 10-50 nl reaction volume comprising a standardized mixture of internal standards, at least two pairs of detectable target nucleic acid probes, each of which is complementary to a target nucleic acid sequence, and a pair of detectable reference nucleic acid probes complementary to a competitive template internal standard, wherein each primer pair coamplifies a template and its respective competitive internal standard template with equal efficiency.
35. The nanofluidic system of claim 34, wherein the detectable target and reference nucleic acid probes each comprises a distinct fluorometric dye that provides for the separate detection of amplified target and internal standards.
36. A kit comprising a high density array of nanoliter-scale through-holes comprising a 10-50 nl reaction volume comprising a standardized mixture of internal standards, at least two pairs of detectable target nucleic acid probes, each of which is complementary to a target nucleic acid sequence, and a pair of detectable reference nucleic acid probes complementary to a competitive template internal standard, wherein each primer pair coamplifies a template and its respective competitive internal standard template with equal efficiency, and written directions for using the kit to detect a gene expression profile in a biological sample.
37. The kit of claim 36, wherein the detectable target and reference nucleic acid probes comprise a distinct fluorometric dye that provides for the separate detection of amplified target and internal standards.
38. The kit of claim 36, wherein the tissue sample is selected from the group consisting of tissue biopsy, formaldehyde fixed paraffin embedded (FFPE) tissue, fine needle aspirate (FNA) biopsy and laser capture micro-dissected samples.
39. The kit of claim 36, wherein the sample comprises at least about 1-1000 cells.
40. The kit of claim 36, wherein the sample comprises at least about 1-1000 ng of RNA.
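
Claims 2, 5, 32 and 33 determine the quantity of a target from a fitted plot of the native/standard signal ratio against the amount of competitive standard across a set of reactions. The following is a minimal sketch of that calculation, not the applicant's implementation. It assumes the native template and its competitive standard amplify with equal efficiency, so the ratio falls to 1 (the half-maximal point, where the native signal is half of the total) when the standard input equals the native input. A straight-line fit on log-log axes stands in for whatever curve fit is actually used; the function name `estimate_ec50` and the titration values are hypothetical and synthetic.

```python
import math


def estimate_ec50(standard_copies, native_to_standard_ratios):
    """Return the standard input (copies) at which native/standard ratio == 1.

    Fits log10(ratio) = intercept + slope * log10(standard) by ordinary
    least squares and solves for the point where log10(ratio) is zero.
    """
    if len(standard_copies) != len(native_to_standard_ratios):
        raise ValueError("inputs must have the same length")
    if len(standard_copies) < 2:
        raise ValueError("at least two titration points are required")
    if any(v <= 0 for v in list(standard_copies) + list(native_to_standard_ratios)):
        raise ValueError("copies and ratios must be positive")

    xs = [math.log10(s) for s in standard_copies]
    ys = [math.log10(r) for r in native_to_standard_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("standard inputs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    if slope == 0:
        raise ValueError("ratio does not change with standard input")
    intercept = mean_y - slope * mean_x

    # Equivalence point: log10(ratio) == 0, i.e. native fraction == 0.5.
    return 10 ** (-intercept / slope)


# Synthetic, noise-free illustration: 200 native copies per reaction,
# titrated against four known inputs of the competitive standard.
true_native_copies = 200.0
standard_series = [10.0, 100.0, 1000.0, 10000.0]
ratios = [true_native_copies / s for s in standard_series]

ec50 = estimate_ec50(standard_series, ratios)
print(f"estimated native copies per reaction: {ec50:.1f}")
```

With real measurements the fitted slope will deviate from -1 when amplification efficiencies differ, which is one reason to inspect the fit before relying on the estimate.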
PCT/US2009/060848 2008-10-15 2009-10-15 System for identification of multiple nucleic acid targets in a single sample and use thereof WO2010045462A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP09821256A EP2344619A4 (en) 2008-10-15 2009-10-15 System for identification of multiple nucleic acid targets in a single sample and use thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10570108P 2008-10-15 2008-10-15
US61/105,701 2008-10-15

Publications (1)

Publication Number Publication Date
WO2010045462A1 true WO2010045462A1 (en) 2010-04-22

Family

ID=42106894

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2009/060848 WO2010045462A1 (en) 2008-10-15 2009-10-15 System for identification of multiple nucleic acid targets in a single sample and use thereof

Country Status (2)

Country Link
EP (1) EP2344619A4 (en)
WO (1) WO2010045462A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102634610A (en) * 2012-05-07 2012-08-15 镇江和创生物科技有限公司 Primer probe combination for specific detection of measles virus and rubella virus and kit
CN102778568A (en) * 2012-04-24 2012-11-14 万里明 Preparation and application of total antibody ELISA (enzyme linked immunosorbent assay) kit for detecting fever accompanied by thrombocytopenia syndrome virus
CN102809653A (en) * 2012-04-24 2012-12-05 万里明 Preparation and application of ELISA (Enzyme-Linked Immunosorbent Assay) kit for detecting novel bunyavirus antigen
CN102809649A (en) * 2012-04-24 2012-12-05 万里明 Preparation and application of enzyme-linked immuno sorbent assay (ELISA) kit for detecting virus antibody IgM of fever with throbocytopenia associated syndrome
US9376711B2 (en) 2011-07-13 2016-06-28 Qiagen Mansfield, Inc. Multimodal methods for simultaneous detection and quantification of multiple nucleic acids in a sample
WO2016160823A1 (en) * 2015-04-03 2016-10-06 Becton, Dickinson And Company Methods of amplifying nucleic acids and compositions and kits for practicing the same
CN106033087A (en) * 2015-03-18 2016-10-19 王峥 Method system for detecting number of molecules of material through built-in type standard curve
EP3553181A1 (en) * 2012-05-25 2019-10-16 Accugenomics, Inc. Nucleic acid amplification and use thereof
CN110709522A (en) * 2017-04-04 2020-01-17 建喾立嗣股份公司 Method for measuring nucleic acid mass of biological sample
CN110724769A (en) * 2019-12-03 2020-01-24 广东省农业科学院动物卫生研究所 PCR primer group, kit and detection method for detecting African swine fever virus MGF360-505R gene
CN112980979A (en) * 2021-04-15 2021-06-18 上海市临床检验中心 Fusobacterium nucleatum fluorescent quantitative detection kit and preparation method and detection method thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060183144A1 (en) * 2005-01-21 2006-08-17 Medical College Of Ohio Methods and compositions for assessing nucleic acids
US7354713B2 (en) * 2002-09-05 2008-04-08 Wisconsin Alumni Research Foundation Method of using estrogen-related receptor gamma (ERRγ) status to determine prognosis and treatment strategy for breast cancer, method of using ERRγ as a therapeutic target for treating breast cancer, method of using ERRγ to diagnose breast cancer, and method of using ERRγ to identify individuals predisposed to breast cancer

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030186246A1 (en) * 2002-03-28 2003-10-02 Willey James C. Multiplex standardized reverse transcriptase-polymerase chain reacton method for assessment of gene expression in small biological samples
AU2003302264A1 (en) * 2002-12-20 2004-09-09 Biotrove, Inc. Assay apparatus and method using microfluidic arrays

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHAN ET AL.: "Inhibition of Inducible Nitric Oxide Synthase Gene Expression and Enzyme activity by Epigallocatechin Gallate, a Natural product from Green Tea'", BIOCHEMICAL PHARRNACOLOGY, vol. 54, 1997, pages 1281 - 1286, XP027449907 *
See also references of EP2344619A4 *

Also Published As

Publication number Publication date
EP2344619A1 (en) 2011-07-20
EP2344619A4 (en) 2012-05-16

Similar Documents

Publication Publication Date Title
EP2344619A1 (en) System for identification of multiple nucleic acid targets in a single sample and use thereof
US8697363B2 (en) Methods for detecting multiple target nucleic acids in multiple samples by use nucleotide tags
EP2569453B1 (en) Nucleic acid isolation methods
US8921049B2 (en) Determination of copy number differences by amplification
US20140186827A1 (en) Assays for the detection of genotype, mutations, and/or aneuploidy
US20080020379A1 (en) Diagnosis and prognosis of infectious diseases clinical phenotypes and other physiologic states using host gene expression biomarkers in blood
US20110092387A1 (en) Expression profiling using microarrays
US20050191636A1 (en) Detection of STRP, such as fragile X syndrome
US20220088560A1 (en) High-level multiplex amplification
JP4989493B2 (en) Method for detecting nucleic acid sequence by intramolecular probe
US9689867B2 (en) Assays for affinity profiling of nucleic acid binding proteins
WO2022090521A1 (en) Generic cartridge and method for multiplex nucleic acid detection
CN116783308A (en) Universal cartridge and method for multiplex nucleic acid detection
Deharvengt et al. Molecular assessment of human diseases in the clinical laboratory
CN105392903B (en) Pre-amplification assay
US20170081713A1 (en) Multivalent probes having single nucleotide resolution
Jain et al. Biomarkers and Molecular Diagnostics
Class et al. Patent application title: ASSAYS FOR THE DETECTION OF GENOTYPE, MUTATIONS, AND/OR ANEUPLOIDY

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09821256

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2009821256

Country of ref document: EP