WO2000032820A1 - Method for detecting target sequences in small proportions in heterogeneous samples - Google Patents

Info

Publication number
WO2000032820A1
WO2000032820A1 PCT/US1999/028064 US9928064W WO0032820A1 WO 2000032820 A1 WO2000032820 A1 WO 2000032820A1 US 9928064 W US9928064 W US 9928064W WO 0032820 A1 WO0032820 A1 WO 0032820A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
sample
molecules
mutant
target nucleic
Prior art date
Application number
PCT/US1999/028064
Other languages
French (fr)
Inventor
Stanley N. Lapidus
Anthony P. Shuber
Original Assignee
Exact Laboratories, Inc.
Priority date
Filing date
Publication date
Application filed by Exact Laboratories, Inc.
Priority to AU15264/00A
Publication of WO2000032820A1

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 - Nucleic acid amplification reactions
    • C12Q1/686 - Polymerase chain reaction [PCR]
    • C12Q1/6851 - Quantitative amplification
    • C12Q1/6858 - Allele-specific amplification

Definitions

  • the invention relates broadly to methods for detection and identification of nucleic acids that exist in a heterogeneous biological sample in low frequency.
  • Detection of a mutant DNA in the specimen using conventional techniques is often difficult because the specimen does not contain the DNA of interest, or the signal associated with such low-frequency DNA is undetectable even if the target DNA is present in the specimen, or in a sample derived from the specimen.
  • disease-associated DNA is present in large amounts, and is easily detected in specimens, such as tumors, that are typically obtained by invasive means.
  • PCR: polymerase chain reaction
  • PCR becomes a stochastic process (i.e., a process subject to the laws of probability) when it is not 100% efficient. Once the PCR reactants are in place (e.g., targets, sufficient primer, polymerase, etc.), whether any specific target nucleic acid molecule is amplified is determined by the laws of probability.
  • a sample obtained from a complex biological specimen e.g., a sample, such as stool, in which a target DNA is present in low frequency relative to other DNA, protein, etc. in the sample
  • If the mutant molecule is not amplified in the first round, its concentration in the sample will be reduced from 1 in 100 to about 1 in 130. If the mutant nucleic acid is not amplified in the first two rounds, it will exist in the sample at an even lower ratio (about 1/169) with respect to the wild-type.
  • Even if the mutant nucleic acid is amplified in every subsequent round of PCR in proportion to the wild type, its ratio in the sample will never be better than about 0.6% (1/169) of the sample (an approximately 40% reduction from its representation before the two rounds of amplification).
  • If an assay to detect the mutant nucleic acid in the sample has a sensitivity limit for the mutant of 1%, it is unlikely that the mutant will be detected, even after amplification.
  • Methods of the invention solve the problem of detecting low-number, low-frequency molecular events in heterogeneous specimens.
  • Methods of the invention comprise determining the number of molecules in a sample that must be analyzed in order to maximize the probability that a low-frequency species will be detected in the sample.
  • Methods of the invention are based upon a modeling of the stochastic effects in PCR.
  • the principles disclosed herein are applicable to the identification, detection and/or quantification of any low-frequency molecule, especially in a heterogeneous sample.
  • a large specimen by weight or volume
  • steps are not taken to assure that the minimum number of molecules are processed from the specimen into a sample to be analyzed.
  • the invention recognizes that there are two types of heterogeneity in a complex biological sample, such as stool, sputum, and others.
  • a first type of heterogeneity is reflected in the relatively small amount of human DNA in such samples relative to other types of RNA and DNA (bacterial, viral, plant, and animal), proteins, etc. in the sample, and relative to other material such as mucus, fiber, etc.
  • a second type of heterogeneity is reflected in the relatively low amounts of a low-frequency human DNA (e.g., a mutant) with respect to the total human DNA in such samples.
  • a low-frequency human DNA e.g., a mutant
  • the detection of low-frequency human DNA is limited by the availability of such DNA in a sample prepared from a complex biological specimen.
  • Methods of the invention teach that the limited target DNA (corresponding, for example, to about 1% of the human DNA of a biological specimen) must be made available in a sample in order for amplification and detection to occur with high confidence.
  • the number of molecules analyzed in a sample taken from a specimen determines the ability of the analysis to reliably detect low-frequency DNA.
  • the number of input molecules (mutant plus wild-type) must be about 500 or greater if the PCR efficiency is close to 100%, the low-frequency DNA exists as about 1% of the total sample DNA, and a 0.5% detection threshold is used. As PCR efficiency goes down, the required number of input molecules goes up. Analyzing the minimum number of input molecules determined to be necessary by methods of the invention reduces the probability that a low-frequency event is not detected in PCR because it is not presented to the PCR or is not amplified in the first few rounds. Methods of the invention comprise determining a threshold number of sample molecules that must be analyzed in order to detect a low-frequency molecular event at a prescribed level of confidence. Methods of the invention also address the threshold number of molecules necessary for detection of a low-frequency species given a predetermined level of assay sensitivity.
  • methods of the invention are applied to PCR analysis.
  • Stochastic errors in the diagnosis of a mutant nucleic acid result from a failure to present sufficient relevant nucleic acid to the PCR, or from the failure to amplify a relevant nucleic acid.
  • Using a model of stochastic errors in PCR, the invention provides a method for determining the minimum number of molecules that must be analyzed in order to provide confidence that: 1) the detection of signal associated with a low-frequency molecule is indicative of the actual presence in the sample of that molecule, and is not due to background "noise"; and 2) the absence of signal is indicative of the absence of the target molecule, and not a failure to detect the low-frequency molecule.
  • PCR is not a noise-free amplifier.
  • In any PCR that is not 100% efficient, there is some level of stochastic noise (failure to amplify a target DNA due to failure to prime template).
  • primer hybridization conditions typically are set so that as little non-specific binding as possible occurs.
  • the higher the specificity of primer hybridization the lower, necessarily, the efficiency of the PCR.
  • PCR efficiency is usually between about 2% and about 40%, especially when working with highly heterogeneous samples like stool, sputum, cervical scrapings, etc.
  • A PCR efficiency of 30% means that, in any one round of PCR, 30% of the target will be amplified, producing about 1.3X molecules as compared to the previous round (assume that PCR primers are placed outside the region of mutation, and amplify through the mutation). If the number of mutant molecules is high as, for example, in a tumor specimen, mutant DNA will almost certainly be amplified.
  • It is only in the case of a heterogeneous sample in which the mutant DNA exists in small proportion that the stochastic effects described herein play a role in reducing the probability of amplifying the mutant DNA.
  • a typical cancer-associated mutant DNA in the early stages of oncogenesis represents about 1% of the DNA in a heterogeneous sample. If PCR efficiency is set at 30% because of constraints needed to assure specific amplification, each mutant DNA molecule has only a 30% chance of being amplified in any round of PCR. If no mutant is amplified in the first round, the mutant DNA will represent only about 0.7% of the DNA in the sample after round 1.
  • If the mutant is again not amplified in round two, the mutant DNA will represent about 0.6% of the DNA in the sample going into round three of the PCR. If the post-amplification assay used to detect the mutant has a sensitivity of no more than 0.5% for the mutant, it may not be possible to reliably detect the presence of the mutant. This is not the case when a mutant DNA species is present in large amounts relative to the wild-type DNA in a specimen (e.g., in a tumor) because there will be numerically sufficient mutant material in any prepared sample, thereby increasing the likelihood of target amplification. Nor is this the case when analyzing a heterogeneous sample in which a great deal of material is present.
  • Methods of the invention provide means for determining the threshold number of molecules that must be involved in, for example, a PCR, in order to assure, within a predefined degree of statistical confidence, that low-frequency molecules actually are detected.
  • Methods of the invention are used to determine the minimum number of molecules that must be analyzed to assure detection of a low-frequency sample molecule in any assay system or systems in which stochastic processes operate.
  • methods of the invention will be provided in the context of conducting PCR in a heterogeneous sample. Once the minimum number of molecules that must be analyzed is determined, the skilled artisan can use any method available in the art to prepare (from a biological specimen) a sample having at least that number of molecules.
  • One method is to homogenize the specimen, or a portion thereof, in a physiologically-compatible buffer at a volume (ml) to mass (mg) ratio of at least 5:1, and preferably about 10:1 or 20:1, and extract DNA therefrom.
  • Sample dilution assists in releasing DNA from the complex elements present in a heterogeneous sample, and is one way in which to ensure that the number of mutants is sufficient for detection.
  • the sample is enriched for human DNA using techniques known in the art such as sequence-specific capture prior to amplification. Other methods for increasing overall DNA (e.g., total human DNA) are also applicable for use in methods of the invention.
  • methods of the invention comprise detecting and/or quantifying a target nucleic acid in a biological sample such as, for example, tissue or body fluids.
  • Methods of the invention may be practiced by preparing a sample comprising a minimum number of nucleic acid molecules sufficient to detect a target nucleic acid and then detecting said target nucleic acid and/or quantifying the number of target nucleic acid molecules in a sample.
  • the target nucleic acid is amplified prior to the step of detecting/quantifying the target nucleic acid.
  • the target nucleic acid is a low-frequency molecule such as a mutant nucleic acid.
  • the target nucleic acid is present in said sample at between about 0.5% and about 10% of the total species-specific nucleic acid in the sample.
  • Methods of the invention further comprise amplifying a target nucleic acid known or suspected to be present in a biological specimen.
  • a method of amplifying a target nucleic acid comprises preparing a sample comprising a minimum number of target nucleic acids present in said sample at between about 0.5% and about 10% of the total species-specific nucleic acid in the sample and amplifying the target nucleic acid. The method may further comprise the step of detecting the amplified target nucleic acid.
  • An alternative method for amplifying a mutant nucleic acid comprises selecting an amplification efficiency, level of statistical confidence, and suspected ratio of nucleic acid having a mutation to total nucleic acid in said specimen; determining a minimum number of nucleic acid molecules that must enter an amplification reaction in order to assure that a mutant nucleic acid will be amplified at a defined level of statistical confidence; preparing a sample comprising the minimum number of molecules sufficient to detect a mutant nucleic acid; and amplifying the mutant nucleic acid.
  • the mutant nucleic acid is amplified by PCR.
  • the present invention also provides methods for detecting loss of heterozygosity in nucleic acid molecules in a biological specimen.
  • Methods of the invention comprise preparing a sample comprising a minimum number of nucleic acid molecules necessary to detect a loss of heterozygosity, enumerating a number of target nucleic acid molecules suspected of having a loss of heterozygosity and a reference number of non-target nucleic acid molecules, and comparing the target number to the reference number. Methods of the invention determine whether the difference between the number of target and reference nucleic acid molecules is statistically significant, a statistically significant difference being indicative of a loss in heterozygosity.
  • any method for identifying low-frequency molecules may be employed.
  • the low-frequency molecules are amplified by, for example, PCR prior to detecting the low-frequency molecules.
  • preferred methods include those disclosed in U.S. Patent No. 5,670,325, incorporated by reference herein.
  • a highly-preferred post-amplification detection means is the use of single-base extension assays to detect and/or identify a single nucleotide at, for example, a polymorphic locus.
  • Methods of the invention may be performed on any biological specimen. Methods of the invention are most advantageous when performed on a heterogeneous sample such as tissue and body fluid in which the detection is desired of a molecule that is present in the sample in small amount relative to other molecules in the sample.
  • a stool sample is a good example of a heterogeneous sample in which a mutant DNA, for example a mutant oncogene or tumor suppressor, is present at very low levels relative to other nucleic acids in the sample at early stages of oncogenesis. Diagnosis of such mutant DNA at early stages in the development of, for example, colorectal cancer is advantageous because colorectal cancer is highly-curable if detected at early stages.
  • Methods of the invention provide means to increase the likelihood of detection of mutant DNA indicative of the early stages of disease, such as cancer.
  • Particularly preferred biological specimens include blood, biopsy tissue, sputum, pus, semen, saliva, stool, lymph, cerebrospinal fluid, and urine.
  • Methods of the invention are also useful in the detection of a low-frequency molecule in specimens, especially heterogeneous tissue or body fluid specimens, obtained by pooling samples from multiple individuals or from identified populations (e.g. healthy, diseased, heterozygotes, etc.). Pooled samples may be used to identify clinically-relevant loci (e.g., single nucleotide variants associated with disease or pharmacological efficacy, safety, etc.), or to screen numerous patients simultaneously for a mutation. DNA isolated from pooled specimens or samples may also be used. An example of the use of methods of the invention is provided below.
  • Figure 1A is a flow chart of a model program for determining the minimum number of molecules that must be analyzed to assure detection of low frequency molecules in a heterogeneous sample.
  • Figure 1B is a flow chart of the stochastic PCR sampling routine shown as "Take
  • Figure 1C is a flow chart of the stochastic PCR routine shown as "Perform stochastic PCR cycle" in Figure 1A.
  • the invention provides methods for determining the minimum number of molecules that must be analyzed in order to provide statistical confidence that a low-frequency molecule or molecular event will be detected in a sample prepared from a biological specimen, especially a heterogeneous biological specimen.
  • Methods of the invention capitalize on the realization that a sample containing a minimum number of molecules overcomes stochastic sampling errors. Identifying a minimum threshold of molecules for analysis assures, within a defined level of statistical confidence, that a low-frequency molecule is detected if it is present.
  • methods of the invention provide statistical confidence that a low-frequency DNA will be amplified in at least the early rounds of PCR, thereby preserving the ratio of that DNA with respect to the total molecules in the sample - even after further rounds of PCR amplification.
  • methods of the invention are especially useful when primers for PCR are designed to hybridize with template in a region outside the suspected mutation. The primers will be extended through the region of mutation, thus producing amplicon that corresponds to either the wild-type sequence or to the mutant sequence (depending on which template the primer anneals to).
  • If the mutant sequence exists in low proportion relative to the wild-type, and if PCR is run at below 100% efficiency, stochastic effects begin to take over, as primer may anneal to the wild-type nucleic acid more frequently than to the nucleic acid containing a mutation in the region to be amplified.
  • Methods of the invention are also useful if primers are designed to hybridize in the region of a mutation that differs from wild-type by only one or two bases, and annealing stringency is such that the mutant-directed probes non-specifically hybridize with the wild-type sequence.
  • Exemplification of the invention is based upon a model of stochastic processes in PCR. The model operates by iterating stochastic processes over a number of PCRs.
  • the model incorporates a preset PCR efficiency (established to meet separate specificity requirements) and a preset ratio of mutant DNA to total DNA in the sample to be analyzed (which is a property of the disease to be detected and the nature of the sample; for example, in stool samples, it is thought that a >1% ratio of mutant DNA to total human DNA is associated with disease). Based upon those input values, the model predicts the number of molecules that must be presented to the PCR in order to ensure, within a predefined level of statistical confidence, that a low-frequency molecule will be amplified and detected.
  • sample size e.g., the weight, volume, etc.
  • characteristics of the sample e.g., its source, molecular makeup, etc.
  • a model used to exemplify methods of the invention is presented. Other methods and models are available to the skilled artisan, and can be used to implement the invention to determine the minimum number of molecules necessary to detect a low-frequency DNA.
  • the model according to methods of the invention solves the problems associated with amplification of low-frequency DNA.
  • the model dictates the number of molecules that must be presented to the PCR in order to reliably ensure amplification and detection.
  • the exemplary model simulates selection of DNA for amplification through several rounds of PCR.
  • a sample is chosen that contains a ratio of mutant-to-total DNA of 1:100, which is assumed to lie at the clinical threshold for disease.
  • 1% of the human DNA in a specimen e.g., stool
  • mutated i.e., has a deletion, substitution, rearrangement, inversion, or other sequence that is different than a corresponding wild-type sequence.
  • both the mutant and wild-type molecules will be selected (i.e., amplified) according to their ratio in the specimen (here, nominally 1 in 100), assuming there are any abnormal molecules in the sample.
  • In any one round, the number of each species that is amplified is determined according to a Poisson distribution. Over many rounds, the process is subject to stochastic errors that, as described above, reduce the ability to detect low-frequency mutant DNA. However, the earlier rounds of PCR (principally, the first two rounds) are proportionately more important when a low-frequency species is to be detected (for the reasons discussed above), and any rounds after round 10 are virtually unimportant.
  • the model determines the combined probability of (1) sufficient mutant molecules being presented to the PCR, and (2) the effects of stochastic amplification on those molecules so that at the output of the PCR there will be a sufficient number of molecules and a sufficient ratio of mutant to total molecules to assure reliable detection.
  • the model used to determine the number of molecules necessary at the first round of PCR was generated as a "Monte Carlo" simulation of a thousand experiments, each experiment consisting of 10 cycles of PCR operating on each molecule in the sample.
  • the simulation analyzed (1) taking a sample from the specimen; and (2) each round of PCR iteratively to determine whether, for each round, a mutant DNA if present in the sample was amplified.
  • Upon completion of the iterative sampling, the model determined the percent of rounds in which a mutant strand was amplified, the percent of mutants exceeding a predetermined threshold for detection (in this example, 0.5%, based upon the mutant:total ratio of 1%), the coefficient of variation (CV) for stochastic sampling in each round alone, and the coefficient of variation for stochastic sampling and PCR in combination.
  • a predetermined threshold for detection: in this example, 0.5%, based upon the mutant:total ratio of 1%
  • CV: coefficient of variation
  • Stochastic noise is created in PCR if the PCR efficiency is anything other than 0% or 100% (these two cases represent, respectively, no amplification at all and perfect fidelity of specific amplification).
  • the noise, or background, signal level in a PCR whose efficiency is between 0% and 100% varies with the efficiency of the PCR.
  • Table 1 presents results obtained for iterative samplings with PCR efficiency set at 100% and 20%, and a mutant:total ratio of 0.5%. Table 1 represents output from the model in 12 experiments conducted under various conditions.
  • the first row shows the nominal number of molecules entering the first round of PCR (i.e., the total number of molecules available for amplification).
  • the second row shows the percent of molecules (DNA) in the biological specimen that is expected to be mutant.
  • the threshold for clinical relevance in the detection of early stage cancer is 1%. That is, 1% of the DNA in a sample derived from a heterogeneous specimen (e.g., stool) contains a mutation associated with colorectal cancer.
  • the 6th row is the threshold of detection of the assay used to measure PCR product after completion of PCR. That number is significant, as will be seen below, because sufficient mutant DNA must be produced by PCR to be detectable over aberrant signal from wild-type and random background noise.
  • the first line provides the likelihood that at least one mutant molecule is presented to the first round of PCR.
  • the second line under the Output heading provides the likelihood of detection of mutants (after PCR) above the predetermined threshold for detection. For example, in experiment 4, the results indicate that in 87.9% of experiments run under the conditions specified for experiment 4, the number of mutants will exceed the threshold number for detection. Finally, the last two rows provide the coefficient of variation for sampling, and for the combination of sampling and PCR.
  • mutant DNA is detected in only 97.1% of the samples when 1000 input molecules are used (i.e., 1000 DNA molecules are available for priming at the initial PCR cycle), even though 100% of the DNA is amplified in any given round of PCR.
  • Stochastic errors due to variation in the number of input molecules become less significant at about 500 input molecules and higher (i.e., the CV for stochastic variations is about the same regardless of whether PCR efficiency is 20% or 100%).
  • the optimal number of molecules to be presented to the PCR is determined by selecting a PCR efficiency (or determining the efficiency by empirical means), and selecting a percentage of the sample suspected to be mutant DNA associated with disease. This, in turn, dictates a threshold of detection. Not all detection strategies have similar underlying detection thresholds, so an appropriate technology must be selected.
  • the percentage mutant DNA may be determined by clinical considerations as outlined above for colorectal cancer.
  • a sample comprising that number of molecules (or greater) is prepared for PCR according to standard methods.
  • the number of molecules in a sample may be determined directly by, for example, enumerative methods such as those taught in U.S. Patent No. 5,670,325, incorporated by reference herein.
  • the number of molecules in a complex sample may be determined by molar concentration, molecular weight, or by other means known in the art.
  • the amount of DNA in a sample may be determined by mass spectrometry, optical density, or other means known in the art.
  • the number of molecules in a sample derived from a biological specimen may be determined by numerous means in the art, including those disclosed in U.S. Patent Nos. 5,741,650 and 5,670,325, both of which are incorporated by reference herein.
  • a sample is prepared from a stool specimen by homogenizing in a physiologically-compatible buffer at a buffer volume to stool mass ratio of about 20:1 in order to maximize the amount of DNA in the sample available for amplification.
  • Physiologically acceptable buffers include those solvents generally known to those skilled in the art as suitable for dispersion of biological sample material. Such solvents include phosphate-buffered saline comprising a salt, such as 20-100 mM NaCl or KCl, and optionally a detergent, such as 1-10% SDS or Triton™, and/or a proteinase, such as proteinase K (at, e.g., about 20 mg/ml).
  • a preferred solvent is a physiologically-compatible buffer comprising, for example, 1 M Tris, 0.5 M EDTA, 5 M NaCl and water to a final concentration of 500 mM Tris, 16 mM EDTA and 10 mM NaCl at pH 9.
  • the buffer acts as a solvent to disperse the solid stool sample during homogenization and to facilitate separation of the DNA from the bacterial and fibrous components. Increasing the volume of solvent in relation to solid mass of the sample results in increased yields of DNA.
  • Buffer is added to the solid sample in a solvent volume to solid mass ratio of at least about 5:1.
  • the solvent volume to solid mass ratio is preferably in the range of about 10:1 to about 30:1, and more preferably in the range of about 10:1 to about 20:1. Most preferably, the solvent volume to solid mass ratio is about 10:1.
  • solvent volume may be measured in milliliters, and solid mass measured in milligrams, but the practitioner will appreciate that the ratio of volume to mass remains constant, regardless of scale up or down of the particular mass and volume units. That is, solvent volume to solid mass ratios may be measured as liters:grams or µl:µg.
  • the minimum number of DNA molecules in the prepared sample may be verified by molarity, optical density, enumeration, or other means known in the art.
  • assays are performed to detect the presence of mutant DNA in the amplified sample.
  • mutant DNA may be detected in enumerative methods (see above) or by bulk detection using, for example, fluorescent markers, mass markers, radioactive markers, and the like.
  • the means for measuring the presence in the amplified sample of the low-frequency DNA is immaterial to the invention. Such means may be chosen by the skilled artisan in accordance with available materials, convenience, and clinical or diagnostic requirements.
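
The exemplary model described in the list above (sampling from the specimen, then 10 stochastic PCR cycles, repeated over a thousand experiments) can be illustrated with a minimal sketch. This is not the model program of Figures 1A to 1C, which is not reproduced here. It assumes Poisson sampling of mutant molecules, independent per-molecule amplification with probability equal to the PCR efficiency, and a normal approximation for large counts. All function and parameter names (`poisson`, `binomial`, `detection_rate`, `n_input`, and so on) are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def binomial(n, p, rng):
    """Number of molecules, out of n, that are copied when each is copied with probability p."""
    if p >= 1.0:
        return n
    if p <= 0.0 or n == 0:
        return 0
    if n < 200:
        # Exact: one Bernoulli trial per molecule.
        return sum(rng.random() < p for _ in range(n))
    # Assumption made for speed: normal approximation for large counts.
    draw = round(rng.gauss(n * p, math.sqrt(n * p * (1.0 - p))))
    return min(n, max(0, draw))


def detection_rate(n_input, mutant_fraction, efficiency, threshold,
                   cycles=10, trials=1000, seed=0):
    """Fraction of simulated experiments in which the mutant share of the
    PCR product reaches the assay's detection threshold."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        # Step 1: stochastic sampling of mutant molecules from the specimen.
        mutant = poisson(n_input * mutant_fraction, rng)
        wild = max(n_input - mutant, 0)
        # Step 2: stochastic PCR; each molecule is copied with probability `efficiency`.
        for _ in range(cycles):
            mutant += binomial(mutant, efficiency, rng)
            wild += binomial(wild, efficiency, rng)
        if mutant / (mutant + wild) >= threshold:
            detected += 1
    return detected / trials


if __name__ == "__main__":
    for eff in (1.0, 0.2):
        rate = detection_rate(1000, 0.01, eff, 0.005)
        print(f"efficiency {eff:.0%}: detected in {rate:.1%} of experiments")
```

Under these assumptions, the 100% efficiency case reduces to a Poisson tail probability: with 1000 input molecules, a 1% mutant fraction, and a 0.5% threshold, at least 5 mutants must be sampled, which happens with probability of about 0.971. That agrees with the 97.1% figure quoted in the list above. Lower efficiencies add amplification noise on top of sampling noise, so the simulated detection rate falls.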

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biophysics (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides methods for detecting and analyzing molecules that exist in a heterogeneous specimen or sample in small proportion relative to corresponding molecules.

Description

METHOD FOR DETECTING TARGET SEQUENCES IN SMALL PROPORTIONS IN HETEROGENEOUS SAMPLES
Field of the Invention
The invention relates broadly to methods for detection and identification of nucleic acids that exist in a heterogeneous biological sample in low frequency.
Background of the Invention
It is often desirable to detect, in a complex biological sample, one or more molecules that are present at low frequency. For example, the detection of mutations in oncogenes at an early stage of oncogenesis is useful for early diagnosis of cancer. Such detection preferably is done in a specimen obtained through non-invasive or minimally invasive means. Such specimens include stool, sputum, and other specimens that have a complex mixture of cellular components. DNA from cells having mutations indicative of early-stage cancer is present in such specimens in low frequency with respect to wild-type DNA. Detection of a mutant DNA in the specimen using conventional techniques is often difficult because the specimen does not contain the DNA of interest, or because the signal associated with such low-frequency DNA is undetectable even if the target DNA is present in the specimen or in a sample derived from the specimen. In contrast, disease-associated DNA is present in large amounts, and is easily detected, in specimens such as tumors that are typically obtained by invasive means.
With the advent of the polymerase chain reaction (PCR), detection of nucleic acids became more routine, as the PCR allowed one to amplify vast quantities of a DNA of interest. Theoretically, PCR amplifies 100% of target, doubling the quantity of analyte with each cycle. Even with the abundance of material produced during PCR, careful attention must be paid to the amount of material presented to the PCR, and the representative nature of the input sample (that is, abnormalities must be sufficiently represented in the input sample to assure detection). Practical PCR is not 100% efficient. In order to assure that PCR is being run with a reasonable level of specificity, the Tm must be adjusted to reduce non-specific hybridization of primer. A consequence of increased specificity is a reduction in efficiency of the reaction. PCR becomes a stochastic process when it is not 100% efficient (i.e. a process subject to the laws of probability). Once the PCR reactants are in place (e.g., targets, sufficient primer, polymerase, etc.), whether any specific target nucleic acid molecule is amplified is determined by the laws of probability.
For example, in a PCR having 30% efficiency (which is in the typical range for most PCRs), and in which 99 wild-type nucleic acids and 1 mutant nucleic acid are present in a sample obtained from a complex biological specimen (e.g., a sample, such as stool, in which a target DNA is present in low frequency relative to other DNA, protein, etc. in the sample), there is nominally a 30% chance that the 1 mutant molecule will be amplified in the first round. If the mutant molecule is not amplified in the first round, its concentration in the sample will be reduced from 1 in 100 to about 1 in 130. If the mutant nucleic acid is not amplified in the first two rounds, it will exist in the sample at an even lower ratio (about 1/169) with respect to the wild-type. Even if the mutant nucleic acid is amplified in every subsequent round of PCR in proportion to the wild type, its ratio in the sample will never be better than about 0.6% (1/169) of the sample (an approximately 40% reduction from its representation as compared to that before the amplification of two rounds). Thus, if an assay to detect the mutant nucleic acid in the sample has a sensitivity limit for the mutant of 1%, it is unlikely that the mutant will be detected, even after amplification.
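The dilution arithmetic in the preceding paragraph can be checked with a short sketch. It assumes, for illustration only, that the wild-type population grows deterministically by a factor of (1 + efficiency) per round while the single mutant molecule is missed; the function name is hypothetical and is not part of the disclosed model.

```python
def mutant_fraction_after_missed_rounds(wild_type, mutant, efficiency, missed_rounds):
    """Mutant fraction when the mutant is NOT amplified for `missed_rounds`
    rounds while wild type grows by (1 + efficiency) per round (expected value)."""
    wild_type_after = wild_type * (1.0 + efficiency) ** missed_rounds
    return mutant / (wild_type_after + mutant)


# 99 wild-type molecules, 1 mutant molecule, 30% efficiency, as in the text.
for rounds in (0, 1, 2):
    frac = mutant_fraction_after_missed_rounds(99, 1, 0.30, rounds)
    print(f"missed rounds = {rounds}: about 1 in {1 / frac:.0f} ({frac:.2%})")
```

Under these assumptions the mutant fraction falls from 1 in 100 to roughly 1 in 130 after one missed round and to slightly below 0.6% after two, in line with the figures quoted above.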
Similar problems exist in the detection of other low-frequency molecular species. For example, in determining the relative amounts of high- and low-expression proteins, a low-expression protein may be undetectable over highly-expressed protein. A similar situation exists in detecting RNA and other cellular molecules. Accordingly, there is a need in the art for methods of detecting low-frequency molecular events, especially in heterogeneous biological samples. Such methods are presented by the invention, a brief description of which follows.
Summary of the Invention
Methods of the invention solve the problem of detecting low-number, low-frequency molecular events in heterogeneous specimens. Methods of the invention comprise determining the number of molecules in a sample that must be analyzed in order to maximize the probability that a low-frequency species will be detected in the sample. Methods of the invention are based upon a modeling of the stochastic effects in PCR. However, the principles disclosed herein are applicable to the identification, detection and/or quantification of any low-frequency molecule, especially in a heterogeneous sample. Merely obtaining a large specimen (by weight or volume) is not sufficient if the specimen does not contain a sufficient number of target molecules, or if steps are not taken to assure that the minimum number of molecules are processed from the specimen into a sample to be analyzed. The invention recognizes that there are two types of heterogeneity in a complex biological sample, such as stool, sputum, and others. A first type of heterogeneity is reflected in the relatively small amount of human DNA in such samples relative to other types of RNA and DNA (bacterial, viral, plant, and animal), proteins, etc. in the sample, and relative to other material such as mucus, fiber, etc. A second type of heterogeneity is reflected in the relatively low amounts of a low-frequency human DNA (e.g., a mutant) with respect to the total human DNA in such samples. Thus, the detection of low-frequency human DNA (e.g., a mutant at the threshold of clinical relevance) is limited by the availability of such DNA in a sample prepared from a complex biological specimen. Methods of the invention teach that the limited target DNA (corresponding, for example, to about 1% of the human DNA of a biological specimen) must be made available in a sample in order for amplification and detection to occur with high confidence.
According to the invention, the number of molecules analyzed in a sample taken from a specimen determines the ability of the analysis to reliably detect low-frequency DNA. In the case of PCR, the number of input molecules (mutant plus wild-type) must be about 500 or greater if the PCR efficiency is close to 100%, the low-frequency DNA exists as about 1% of the total sample DNA, and a 0.5% detection threshold is used. As PCR efficiency goes down, the required number of input molecules goes up. Analyzing the minimum number of input molecules determined to be necessary by methods of the invention reduces the probability that a low-frequency event is not detected in PCR because it is not presented to the PCR or is not amplified in the first few rounds. Methods of the invention comprise determining a threshold number of sample molecules that must be analyzed in order to detect a low-frequency molecular event at a prescribed level of confidence. Methods of the invention also address the threshold number of molecules necessary for detection of a low-frequency species given a predetermined level of assay sensitivity.
In a preferred embodiment, methods of the invention are applied to PCR analysis. Stochastic errors in the diagnosis of a mutant nucleic acid result from a failure to present sufficient relevant nucleic acid to the PCR, or from the failure to amplify a relevant nucleic acid. Using a model of stochastic errors in PCR, the invention provides a method for determining the minimum number of molecules that must be analyzed in order to provide confidence that: 1) the detection of signal associated with a low-frequency molecule is indicative of the actual presence in the sample of that molecule, and is not due to background "noise"; and 2) the absence of signal is indicative of the absence of the target molecule, and not a failure to detect the low-frequency molecule.
Practical (i.e., non-theoretical) PCR is not a noise-free amplifier. In any PCR that is not 100% efficient there is some level of stochastic noise (failure to amplify a target DNA due to failure to prime template). In order to reduce the level of noise due to nonspecific primer binding, primer hybridization conditions typically are set so that as little non-specific binding as possible occurs. However, the higher the specificity of primer hybridization, the lower, necessarily, the efficiency of the PCR. Thus, in order to assure appropriate specificity, PCR efficiency is usually between about 2% and about 40%, especially when working with highly heterogeneous samples like stool, sputum, cervical scrapings, etc. Greater PCR efficiencies are routinely achieved when amplifying, for example, plasmid DNA which does not have the heterogeneity of samples used for human diagnostics and screening. According to the invention, PCR at those efficiencies inevitably introduces stochastic errors when a target for amplification is in low frequency in the sample due to a failure to prime the low frequency DNA. A PCR efficiency of 30% means that, in any one round of PCR, 30% of the target will be amplified, producing about 1.3X molecules as compared to the previous round (assume that PCR primers are placed outside the region of mutation, and amplify through the mutation). If the number of mutant molecules is high as, for example, in a tumor specimen, mutant DNA will almost certainly be amplified. It is only in the case of a heterogeneous sample in which the mutant DNA exists in small proportion that stochastic effects described herein play a role in reducing the probability of amplifying the mutant DNA. However, a typical cancer-associated mutant DNA in the early stages of oncogenesis represents about 1% of the DNA in a heterogeneous sample. 
If PCR efficiency is set at 30% because of constraints needed to assure specific amplification, each mutant DNA molecule has only a 30% chance of being amplified in any round of PCR. If no mutant is amplified in the first round, the mutant DNA will represent only about 0.7% of the DNA in the sample after round 1. If no mutant is amplified in the first two rounds (0.7 x 0.7, or a 49% probability), the mutant DNA will represent about 0.6% of the DNA in the sample going into round three of the PCR. If the post-amplification assay used to detect the mutant has a sensitivity of no more than 0.5% for the mutant, it may not be possible to reliably detect the presence of the mutant. This is not the case when a mutant DNA species is present in large amounts relative to the wild-type DNA in a specimen (e.g., in a tumor) because there will be numerically sufficient mutant material in any prepared sample, thereby increasing the likelihood of target amplification. Nor is this the case when analyzing a heterogeneous sample in which a great deal of material is present. Intuitively, consider 10,000,000 total input molecules: if 1% are mutant, then 100,000 mutant molecules exist. Those 100,000 molecules will be more or less faithfully amplified in early rounds of PCR (even at low efficiency) in a way that may not be the case for 1 or 2 mutant molecules. Methods of the invention are also applied to the detection and analysis of infectious organisms (e.g., the presence of minimum residual disease (for example, HIV) in blood).
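The contrast drawn above between one or two mutant molecules and 100,000 mutant molecules can be made concrete with a minimal sketch. It assumes that each mutant molecule is amplified independently in each round with probability equal to the PCR efficiency; the function name is hypothetical.

```python
def prob_no_mutant_amplified(n_mutants, efficiency, rounds):
    """Probability that none of `n_mutants` molecules is amplified in any of
    the first `rounds` rounds, assuming independent per-molecule amplification."""
    # If nothing is amplified, the mutant count stays at n_mutants, so each
    # round contributes a factor of (1 - efficiency) ** n_mutants.
    return (1.0 - efficiency) ** (n_mutants * rounds)


# One mutant molecule at 30% efficiency, missed in both of the first two rounds.
print(prob_no_mutant_amplified(1, 0.30, 2))
# 100,000 mutant molecules: the chance that all are missed in round one is negligible.
print(prob_no_mutant_amplified(100_000, 0.30, 1))
```

For a single mutant molecule the result reproduces the 0.7 x 0.7 figure given in the text; for large mutant counts the probability collapses toward zero, which is the intuition the paragraph describes.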
The problems associated with detecting low-frequency molecules have been overcome by methods of the invention which provide means for determining the threshold number of molecules that must be involved in, for example, a PCR, in order to assure, within a predefined degree of statistical confidence, that low-frequency molecules actually are detected. Methods of the invention are used to determine the minimum number of molecules that must be analyzed to assure detection of a low- frequency sample molecule in any assay system or systems in which stochastic processes operate. However, for ease of exemplification, methods of the invention will be provided in the context of conducting PCR in a heterogeneous sample. Once the minimum number of molecules that must be analyzed is determined, the skilled artisan can use any method available in the art to prepare (from a biological specimen) a sample having at least that number of molecules. One method, exemplified herein, is to homogenize the specimen, or a portion thereof, in a physiologically-compatible buffer at a volume (ml) to mass (mg) ratio of at least 5:1 , and preferably about 10:1 or 20:1 and extract DNA therefrom. Sample dilution assists in releasing DNA from the complex elements present in a heterogeneous sample, and is one way in which to ensure that the number of mutants is sufficient for detection. In a preferred method, the sample is enriched for human DNA using techniques known in the art such as sequence-specific capture prior to amplification. Other methods for increasing overall DNA (e.g., total human DNA) are also applicable for use in methods of the invention. In general, methods of the invention comprise detecting and/or quantifying a target nucleic acid in a biological sample such as, for example, tissue or body fluids. 
Methods of the invention may be practiced by preparing a sample comprising a minimum number of nucleic acid molecules sufficient to detect a target nucleic acid and then detecting said target nucleic acid and/or quantifying the number of target nucleic acid molecules in a sample. In a preferred method, the target nucleic acid is amplified prior to the step of detecting/quantifying the target nucleic acid.
In preferred methods of detecting and/or quantifying a target nucleic acid, the target nucleic acid is a low-frequency molecule such as a mutant nucleic acid. In a highly preferred embodiment, the target nucleic acid is present in said sample at between about 0.5% and about 10% of the total species-specific nucleic acid in the sample. Methods of the invention further comprise amplifying a target nucleic acid known or suspected to be present in a biological specimen. In one embodiment, a method of amplifying a target nucleic acid comprises preparing a sample comprising a minimum number of target nucleic acids present in said sample at between about 0.5% and about 10% of the total species-specific nucleic acid in the sample and amplifying the target nucleic acid. The method may further comprise the step of detecting the amplified target nucleic acid.
An alternative method for amplifying a mutant nucleic acid comprises selecting an amplification efficiency, level of statistical confidence, and suspected ratio of nucleic acid having a mutation to total nucleic acid in said specimen; determining a minimum number of nucleic acid molecules that must enter an amplification reaction in order to assure that a mutant nucleic acid will be amplified at a defined level of statistical confidence; preparing a sample comprising the minimum number of molecules sufficient to detect a mutant nucleic acid; and amplifying the mutant nucleic acid. In a preferred embodiment, the mutant nucleic acid is amplified by PCR.
The present invention also provides methods for detecting loss of heterozygosity in nucleic acid molecules in a biological specimen. Methods of the invention comprise preparing a sample comprising a minimum number of nucleic acid molecules necessary to detect a loss of heterozygosity, enumerating a number of target nucleic acid molecules suspected of having a loss of heterozygosity and a reference number of non-target nucleic acid molecules, and comparing the target number to the reference number. Methods of the invention determine whether the difference between the number of target and reference nucleic acid molecules is statistically significant, a statistically significant difference being indicative of a loss in heterozygosity.
According to preferred embodiments of the invention, any method for identifying low-frequency molecules may be employed. In a preferred embodiment, the low-frequency molecules are amplified by, for example, PCR prior to detecting the low-frequency molecules. Examples of preferred methods include those disclosed in U.S. Patent No. 5,670,325, incorporated by reference herein. A highly-preferred post-amplification detection means is the use of single-base extension assays to detect and/or identify a single nucleotide at, for example, a polymorphic locus.
Methods of the invention may be performed on any biological specimen. Methods of the invention are most advantageous when performed on a heterogeneous sample such as tissue and body fluid in which the detection is desired of a molecule that is present in the sample in small amount relative to other molecules in the sample. A stool sample is a good example of a heterogeneous sample in which a mutant DNA, for example a mutant oncogene or tumor suppressor, is present at very low levels relative to other nucleic acids in the sample at early stages of oncogenesis. Diagnosis of such mutant DNA at early stages in the development of, for example, colorectal cancer is advantageous because colorectal cancer is highly-curable if detected at early stages. Methods of the invention provide means to increase the likelihood of detection of mutant DNA indicative of the early stages of disease, such as cancer. Particularly preferred biological specimens include blood, biopsy tissue, sputum, pus, semen, saliva, stool, lymph, cerebrospinal fluid, and urine.
Methods of the invention are also useful in the detection of a low-frequency molecule in specimens, especially heterogeneous tissue or body fluid specimens, obtained by pooling samples from multiple individuals or from identified populations (e.g. healthy, diseased, heterozygotes, etc.). Pooled samples may be used to identify clinically-relevant loci (e.g., single nucleotide variants associated with disease or pharmacological efficacy, safety, etc.), or to screen numerous patients simultaneously for a mutation. DNA isolated from pooled specimens or samples may also be used. An example of the use of methods of the invention is provided below. The skilled artisan recognizes that the principles of the invention are applicable to a wide range of assays, including amplification reactions, competitive hybridizations, and other assays in which a low-frequency molecule is detected in a heterogeneous specimen or sample. The inventive methods are provided in the context of PCR for exemplification and illustration of a preferred embodiment for practice of the methods.
Description of the Drawings
Figure 1A is a flow chart of a model program for determining the minimum number of molecules that must be analyzed to assure detection of low-frequency molecules in a heterogeneous sample.
Figure 1B is a flow chart of the stochastic PCR sampling routine shown as "Take Stochastic Sample of Mutant to be Presented to PCR" in Figure 1A.
Figure 1C is a flow chart of the stochastic PCR routine shown as "Perform stochastic PCR cycle" in Figure 1A.
Detailed Description of the Invention
The invention provides methods for determining the minimum number of molecules that must be analyzed in order to provide statistical confidence that a low-frequency molecule or molecular event will be detected in a sample prepared from a biological specimen, especially a heterogeneous biological specimen. Methods of the invention capitalize on the realization that a sample containing a minimum number of molecules overcomes stochastic sampling errors. Identifying a minimum threshold of molecules for analysis assures, within a defined level of statistical confidence, that a low-frequency molecule is detected if it is present.
In the context of the PCR, methods of the invention, as described below, provide statistical confidence that a low-frequency DNA will be amplified in at least the early rounds of PCR, thereby preserving the ratio of that DNA with respect to the total molecules in the sample, even after further rounds of PCR amplification. Thus, methods of the invention are especially useful when primers for PCR are designed to hybridize with template in a region outside the suspected mutation. The primers will be extended through the region of mutation, thus producing amplicon that corresponds to either the wild-type sequence or to the mutant sequence (depending on which template the primer anneals to). If the mutant sequence exists in low proportion relative to the wild-type, and if PCR is run at below 100% efficiency, stochastic effects begin to take over, as primer may anneal to the wild-type nucleic acid more frequently than to the nucleic acid containing a mutation in the region to be amplified. Methods of the invention are also useful if primers are designed to hybridize in the region of a mutation that differs from wild-type by only one or two bases, and annealing stringency is such that the mutant-directed probes non-specifically hybridize with the wild-type sequence.
Exemplification of the invention is based upon a model of stochastic processes in PCR. The model operates by iterating stochastic processes over a number of PCRs. The model incorporates a preset PCR efficiency (established to meet separate specificity requirements), and a preset ratio of mutant DNA to total DNA in the sample to be analyzed (which is a property of the disease to be detected and the nature of the sample). For example, in stool samples, it is thought that a >1% ratio of mutant DNA to total human DNA is associated with disease.
Based upon those input values, the model predicts the number of molecules that must be presented to the PCR in order to ensure, within a predefined level of statistical confidence, that a low-frequency molecule will be amplified and detected. Once the number of molecules is determined, the skilled artisan can determine the sample size to be used (e.g., the weight, volume, etc.), depending on the characteristics of the sample (e.g., its source, molecular makeup, etc.).
I. Model of Stochastic Processes in PCR
A model used to exemplify methods of the invention is presented. Other methods and models are available to the skilled artisan, and can be used to implement the invention to determine the minimum number of molecules necessary to detect a low-frequency DNA. The model according to methods of the invention solves the problems associated with amplification of low-frequency DNA. The model dictates the number of molecules that must be presented to the PCR in order to reliably ensure amplification and detection.
The exemplary model simulates selection of DNA for amplification through several rounds of PCR. For purposes of the model, a sample is chosen that contains a ratio of mutant-to-total DNA of 1:100, which is assumed to lie at the clinical threshold for disease. For example, in colorectal cancer 1% of the human DNA in a specimen (e.g., stool) is mutated (i.e., has a deletion, substitution, rearrangement, inversion, or other sequence that is different from a corresponding wild-type sequence). Over a large number of PCR rounds, both the mutant and wild-type molecules will be selected (i.e., amplified) according to their ratio in the specimen (here, nominally 1 in 100), assuming there are any abnormal molecules in the sample. However, in any one round, the number of each species that is amplified is determined according to a Poisson distribution. Over many rounds, the process is subject to stochastic errors that, as described above, reduce the ability to detect low-frequency mutant DNA. However, the earlier rounds of PCR (principally, the first two rounds) are proportionately more important when a low-frequency species is to be detected (for the reasons discussed above), and any rounds after round 10 are virtually unimportant. Thus, the model determines the combined probability of (1) sufficient mutant molecules being presented to the PCR, and (2) the effects of stochastic amplification on those molecules, so that at the output of the PCR there will be a sufficient number of molecules and a sufficient ratio of mutant to total molecules to assure reliable detection. The model used to determine the number of molecules necessary at the first round of PCR was generated as a "Monte Carlo" simulation of a thousand experiments, each experiment consisting of 10 cycles of PCR operating on each molecule in the sample.
The simulation analyzed (1) taking a sample from the specimen; and (2) each round of PCR iteratively to determine whether, for each round, a mutant DNA, if present in the sample, was amplified. Upon completion of the iterative sampling, the model determined the percent of rounds in which a mutant strand was amplified, the percent of mutants exceeding a predetermined threshold for detection (in this example 0.5%, based upon the mutant:total ratio of 1%), the coefficient of variation (CV) for stochastic sampling in each round alone, and the coefficient of variation for stochastic sampling and PCR in combination.
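The two simulated stages described above (stochastic sampling, then stochastic PCR cycles) can be sketched as a small Monte Carlo program. This is an illustrative reconstruction under stated assumptions, not the original Visual Basic for Applications model: sampling and per-cycle amplification are drawn from binomial distributions, large draws use a normal approximation for speed, and all function and parameter names are hypothetical. Its output should not be expected to reproduce the values in the tables of this document.

```python
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p): exact for small n, normal approximation otherwise."""
    if n < 1000:
        return sum(1 for _ in range(n) if rng.random() < p)
    mean = n * p
    sd = (n * p * (1.0 - p)) ** 0.5
    return min(n, max(0, round(rng.gauss(mean, sd))))


def run_experiment(rng, n_input, mutant_fraction, efficiency, cycles=10):
    """One experiment: sample n_input molecules, run `cycles` stochastic PCR
    cycles, and return the mutant fraction of the final product."""
    mutant = binomial(rng, n_input, mutant_fraction)  # stochastic sampling step
    wild = n_input - mutant
    for _ in range(cycles):
        # Each molecule is copied with probability `efficiency` in each cycle.
        mutant += binomial(rng, mutant, efficiency)
        wild += binomial(rng, wild, efficiency)
    return mutant / (mutant + wild)


def detection_rate(n_input, mutant_fraction, efficiency, threshold,
                   experiments=1000, seed=0):
    """Fraction of experiments whose final mutant fraction meets the threshold."""
    rng = random.Random(seed)
    hits = sum(
        run_experiment(rng, n_input, mutant_fraction, efficiency) >= threshold
        for _ in range(experiments)
    )
    return hits / experiments


if __name__ == "__main__":
    # 1% mutant, 0.5% detection threshold, 20% efficiency, varying input size.
    for n in (50, 500, 10_000):
        print(n, detection_rate(n, 0.01, 0.20, 0.005, experiments=300))
```

Running the sketch over a grid of input sizes and efficiencies shows the same qualitative behaviour the text describes: small inputs frequently fail either at the sampling step or in the early cycles, while large inputs rarely do.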
Stochastic noise is created in PCR if the PCR efficiency is anything other than 0% or 100% (these two cases represent, respectively, no amplification at all and perfect fidelity of specific amplification). The noise, or background, signal level in a PCR whose efficiency is between 0% and 100% varies with the efficiency of the PCR. The standard deviation of stochastic noise, S, in a PCR is given by the equation $S = \sqrt{npq}$, where n is the number of molecules in the sample, p is the efficiency of PCR, and q is 1-p. Table 1 presents results obtained for iterative samplings with PCR efficiency set at 100% and 20%, and a mutant:total ratio of 0.5%. Table 1 represents output from the model in 12 experiments conducted under various conditions. The first row shows the nominal number of molecules entering the first round of PCR (i.e., the total number of molecules available for amplification). The second row shows the percent of molecules (DNA) in the biological specimen that is expected to be mutant. For colorectal cancer indicia in DNA recovered from stool, the threshold for clinical relevance in the detection of early stage cancer is 1%. That is, 1% of the DNA in a sample derived from a heterogeneous specimen (e.g., stool) contains a mutation associated with colorectal cancer. The 6th row is the threshold of detection of the assay used to measure PCR product after completion of PCR. That number is significant, as will be seen below, because sufficient mutant DNA must be produced by PCR to be detectable over aberrant signal from wild-type and random background noise. Under the heading "Outputs", the first line provides the likelihood that at least one mutant molecule is presented to the first round of PCR. The second line under the "Outputs" heading provides the likelihood of detection of mutants (after PCR) above the predetermined threshold for detection.
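Reading the noise expression above as the binomial standard deviation, the square root of n times p times q, a brief sketch shows how the noise vanishes at 0% and 100% efficiency and how its size relative to the expected number of amplified molecules shrinks as n grows. The function names are hypothetical, and the relative-noise helper is an added illustration rather than a quantity defined in the text.

```python
import math


def stochastic_noise_sd(n, p):
    """Standard deviation S = sqrt(n * p * q), with q = 1 - p."""
    return math.sqrt(n * p * (1.0 - p))


def relative_noise(n, p):
    """S divided by the expected number of amplified molecules, n * p."""
    return stochastic_noise_sd(n, p) / (n * p)


for n in (50, 500, 10_000):
    print(n, round(stochastic_noise_sd(n, 0.20), 2), round(relative_noise(n, 0.20), 3))
```

Because the relative noise scales as one over the square root of n, increasing the number of input molecules a hundredfold reduces it tenfold at a fixed efficiency.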
For example, in experiment 4, the results indicate that in 87.9% of experiments run under the conditions specified for experiment 4, the number of mutants will exceed the threshold number for detection. Finally, the last two rows provide the coefficient of variation for sampling, and for the combination of sampling and PCR.
TABLE 1
[Table 1 appears as an image (imgf000014_0001) in the original publication.]
As shown in Table 1, even at 100% PCR efficiency, mutant DNA is detected in only 97.1% of the samples when 1000 input molecules are used (i.e., 1000 DNA molecules are available for priming at the initial PCR cycle), even though 100% of the DNA is amplified in any given round of PCR. When 10,000 molecules are presented, it is virtually certain that the mutant DNA will be amplified and detected, as shown in the results for experiment 6 in Table 1. Stochastic errors due to variation in the number of input molecules become less significant at about 500 input molecules and higher (i.e., the CV for stochastic variations is about the same regardless of whether PCR efficiency is 20% or 100%). At lower PCR efficiency (20% in Table 1), the model shows that introducing 50, 100, 200, 500, or even 1000 molecules into the PCR does not assure either amplification or detection. As shown in experiment 12, introducing 10,000 molecules results in amplification of the mutant target, and a high likelihood of its subsequent detection. Thus, even with 100% efficient PCR, significant false negative events occur when input molecules fall below 500.
The foregoing analysis shows that there is a unique range for the number of molecules that must be presented to a PCR in order to achieve amplification of a low-frequency DNA, and to allow its detection. That range is a function of the PCR efficiency, the percentage of low-frequency (mutant) DNA in the sample, and the detection threshold.
The aforementioned model was developed and run in Visual Basic for Applications code (Microsoft, Office 97) to simulate a PCR as described above. A flow chart containing the programming steps is provided in Figure 1. The statistical confidence level within which results were measured was held constant at approximately 99%. Only the PCR efficiency and percent mutant DNA were varied.
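The sampling half of the problem, whether any mutant molecule is presented to the PCR at all, can be estimated in closed form. The sketch below assumes that the number of mutant molecules in a sample of N molecules is Poisson distributed with mean N times the mutant fraction; it ignores amplification and the detection threshold entirely, so it gives only a lower bound on the input needed. The function names are hypothetical.

```python
import math


def prob_at_least_one_mutant(n_input, mutant_fraction):
    """P(at least one mutant in the sample) under a Poisson sampling assumption."""
    return 1.0 - math.exp(-n_input * mutant_fraction)


def min_input_for_presentation(mutant_fraction, confidence):
    """Smallest N with P(at least one mutant presented) >= confidence."""
    return math.ceil(-math.log(1.0 - confidence) / mutant_fraction)


print(prob_at_least_one_mutant(500, 0.01))
print(min_input_for_presentation(0.01, 0.99))
```

Under this assumption a 1% mutant fraction and 99% confidence require several hundred input molecules merely to present one mutant, which is the same order of magnitude as the figure of about 500 given in the text for near-100% efficiency; lower efficiencies and a nonzero detection threshold push the requirement higher.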
As discussed above, the model iteratively samples DNA in a "Monte Carlo" simulation over a thousand experiments, each experiment consisting of 10 rounds of PCR. The results are shown below in Table II.
TABLE II
[Table II appears as an image (imgf000016_0001) in the original publication.]
Regression of the data obtained using the model as described above produced the set of curves set forth below in Table III.
TABLE III
Molecules needed to overcome stochastic effects with about 99% confidence
[Table III appears as an image (imgf000017_0001) in the original publication: curves for 1%, 2%, 5%, and 10% mutant, plotted against PCR efficiency.]
Using Table III, the optimal number of molecules to be presented to the PCR is determined by selecting a PCR efficiency (or determining the efficiency by empirical means), and selecting a percentage of the sample suspected to be mutant DNA associated with disease. This, in turn, dictates a threshold of detection. Not all detection strategies have similar underlying detection thresholds, so an appropriate technology must be selected. The percentage mutant DNA may be determined by clinical considerations as outlined above for colorectal cancer.
In practice of the invention, one may determine the PCR efficiency and percent expected mutant in order to maximize the probability of obtaining amplified, detectable mutant DNA. For example, one may select N, the number of input molecules, from the "1%" curve in Table III when 5% of the sample is expected to be mutant DNA, in order to increase the confidence of the assay result.
Once the number of molecules for input to the PCR is determined, a sample comprising that number of molecules (or greater) is prepared for PCR according to standard methods. The number of molecules in a sample may be determined directly by, for example, enumerative methods such as those taught in U.S. Patent No. 5,670,325, incorporated by reference herein. Alternatively, the number of molecules in a complex sample may be determined by molar concentration, molecular weight, or by other means known in the art. The amount of DNA in a sample may be determined by mass spectrometry, optical density, or other means known in the art. The number of molecules in a sample derived from a biological specimen may be determined by numerous means in the art, including those disclosed in U.S. Patent Nos. 5,741,650 and 5,670,325, both of which are incorporated by reference herein.
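As one illustration of estimating molecule number from mass and molecular weight, the sketch below converts a measured mass of double-stranded DNA into an approximate copy number. It assumes an average molar mass of about 650 g/mol per base pair, a commonly used approximation that is not taken from this document; the function name and the example values are hypothetical.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0    # assumed average for double-stranded DNA


def dsdna_copies(mass_ng, length_bp):
    """Approximate number of double-stranded DNA molecules of a given length
    contained in `mass_ng` nanograms."""
    moles = (mass_ng * 1e-9) / (length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO


# Example: copies of a 1000 bp fragment in 1 ng of purified DNA.
print(f"{dsdna_copies(1.0, 1000):.3e}")
```

An estimate of this kind applies to purified DNA of known length; for a complex specimen, the fraction of the measured mass that is human target DNA would have to be established separately before comparing against the required minimum number of molecules.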
In one preferred embodiment, a sample is prepared from a stool specimen by homogenizing in a physiologically-compatible buffer at a buffer volume to stool mass ratio of about 20:1 in order to maximize the amount of DNA in the sample available for amplification. Physiologically acceptable buffers include those solvents generally known to those skilled in the art as suitable for dispersion of biological sample material. Such solvents include phosphate-buffered saline comprising a salt, such as 20-100 mM NaCl or KCl, and optionally a detergent, such as 1-10% SDS or Triton™, and/or a proteinase, such as proteinase K (at, e.g., about 20 mg/ml). A preferred solvent is a physiologically-compatible buffer comprising, for example, 1 M Tris, 0.5 M EDTA, 5 M NaCl and water to a final concentration of 500 mM Tris, 16 mM EDTA and 10 mM NaCl at pH 9. The buffer acts as a solvent to disperse the solid stool sample during homogenization and to facilitate separation of the DNA from the bacterial and fibrous components. Increasing the volume of solvent in relation to solid mass of the sample results in increased yields of DNA.
Buffer is added to the solid sample in a solvent volume to solid mass ratio of at least about 5:1. The solvent volume to solid mass ratio is preferably in the range of about 10:1 to about 30:1, and more preferably in the range of about 10:1 to about 20:1. Most preferably, the solvent volume to solid mass ratio is about 10:1. Typically, solvent volume may be measured in milliliters, and solid mass measured in milligrams, but the practitioner will appreciate that the ratio of volume to mass remains constant, regardless of scale up or down of the particular mass and volume units. That is, solvent volume to solid mass ratios may be measured as liters:grams or μl:μg. The minimum number of DNA molecules in the prepared sample may be verified by molarity, optical density, enumeration, or other means known in the art.
After PCR amplification, assays are performed to detect the presence of mutant DNA in the amplified sample. Such mutant DNA may be detected in enumerative methods (see above) or by bulk detection using, for example, fluorescent markers, mass markers, radioactive markers, and the like. Once methods of the invention are used to ensure that low-frequency material, if present, will be amplified for detection, the means for measuring the presence in the amplified sample of the low-frequency DNA is immaterial to the invention. Such means may be chosen by the skilled artisan in accordance with available materials, convenience, and clinical or diagnostic requirements.
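The requirement that enough molecules enter the PCR for low-frequency material to be amplified can be illustrated with a simple binomial sampling model. This is a minimal sketch under stated assumptions, not necessarily the calculation used elsewhere in this document: each input molecule is assumed to be mutant with probability equal to the mutant fraction, and each mutant molecule is assumed to be amplified independently with probability equal to the amplification efficiency. The function and parameter names are hypothetical.

```python
import math


def min_input_molecules(mutant_fraction, efficiency, confidence):
    """Smallest N such that P(at least one mutant molecule is amplified) >= confidence.

    Assumed model: each molecule independently contributes an amplified mutant
    with probability p = mutant_fraction * efficiency, so the probability of
    at least one success among N molecules is 1 - (1 - p) ** N.
    """
    p = mutant_fraction * efficiency
    if not 0 < p < 1 or not 0 < confidence < 1:
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))


def detection_probability(n, mutant_fraction, efficiency):
    """P(at least one mutant molecule is amplified) among n input molecules."""
    return 1 - (1 - mutant_fraction * efficiency) ** n


# Example: 1% mutant fraction, every molecule amplified, 99% confidence.
n = min_input_molecules(mutant_fraction=0.01, efficiency=1.0, confidence=0.99)
print(n, detection_probability(n, 0.01, 1.0))
```

Under this model, a 1% mutant fraction at 99% confidence calls for 459 input molecules when every molecule is amplified, and lower efficiencies raise that number. Amplifying a single mutant molecule is not the same as detecting it, so a practical assay would likely need a larger margin.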

Claims

1. A method for detecting a target nucleic acid known or suspected to be present in a biological specimen, the method comprising the steps of: preparing a sample comprising a minimum number of nucleic acid molecules sufficient to detect a target nucleic acid; and detecting said target nucleic acid in said sample.
2. The method of claim 1, wherein said biological specimen is a tissue or body fluid.
3. The method of claim 1, wherein said target nucleic acid is a mutant nucleic acid.
4. The method of claim 1, wherein said target nucleic acid is present in said sample at about between 0.5% and about 10% of the total species-specific nucleic acid in said sample.
5. The method of claim 1, further comprising the step of amplifying said target nucleic acid prior to detecting said target nucleic acid.
6. A method for quantifying the amount of a target nucleic acid in a biological specimen, the method comprising the steps of: preparing a sample comprising a minimum number of nucleic acid molecules necessary to detect a target nucleic acid; and enumerating the number of target nucleic acid molecules in said sample.
7. A method for preparing a heterogeneous sample for detection of an analyte, the method comprising the steps of: determining a minimum number of analyte molecules that must be present in said sample for detection of said analyte at a defined level of statistical confidence; and preparing a sample comprising said minimum number of analyte molecules.
8. A method for amplifying a target nucleic acid known or suspected to be present in a biological specimen, the method comprising the steps of: (a) preparing a sample comprising a minimum number of molecules sufficient for detection, within a defined degree of statistical confidence, of a target nucleic acid present in said sample at between about 0.5% and about 10% of the total species-specific nucleic acid in said sample; and (b) amplifying said target nucleic acid.
9. The method of claim 8, further comprising the step of detecting amplified target nucleic acid.
10. The method of claim 8, wherein said biological specimen is a tissue or body fluid.
11. The method of claim 8, wherein said specimen is stool.
12. The method of claim 11, wherein said preparing step comprises homogenizing said stool specimen in buffer at a stool sample mass-to-buffer volume ratio of about 20:1.
13. The method of claim 11, wherein said preparing step comprises enriching said specimen for human DNA.
14. The method of claim 13, wherein said enriching step comprises sequence-specific capture of human DNA.
15. A method for amplifying a mutant nucleic acid in a sample prepared from a tissue or body fluid specimen, comprising the steps of: (a) selecting an amplification efficiency, level of statistical confidence, and suspected ratio of nucleic acid comprising a mutation to total nucleic acid in said specimen; (b) determining, based upon said efficiency, said ratio, a minimum number of nucleic acid molecules that must enter an amplification reaction in order to assure within said level of statistical confidence, that a nucleic acid comprising said mutation will be amplified; (c) preparing a sample comprising said minimum number of nucleic acid molecules; and (d) amplifying a region of said nucleic acid suspected to contain said mutation.
16. The method of claim 15, wherein said amplifying step comprises a polymerase chain reaction.
17. A method for detecting loss of heterozygosity in nucleic acid molecules in a biological specimen, the method comprising the steps of: preparing a sample comprising a minimum number of nucleic acid molecules necessary to detect a loss of heterozygosity; enumerating a number of target nucleic acid molecules in said sample a subset of which is suspected of having a loss of heterozygosity; enumerating a reference number of non-target nucleic acid molecules in said sample; and comparing said target number to said reference number, a statistically-significant difference between said target number and said reference number being indicative of a loss of heterozygosity.
18. The method of claims 1 or 8, wherein said biological specimen is obtained from a pooled patient population.
19. The method of claim 18 wherein said pooled biological specimen comprises a stool sample obtained from members of a patient population.
20. A method for detecting a mutant nucleic acid known or suspected to be present in a biological specimen, the method comprising the steps of: preparing a sample comprising a number of total nucleic acid copies sufficient to detect a mutant nucleic acid with a predetermined level of statistical confidence if said mutant nucleic acid is present in said sample; and detecting said mutant nucleic acid in said sample.
PCT/US1999/028064 1998-11-23 1999-11-23 Method for detecting target sequences in small proportions in heterogeneous samples WO2000032820A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU15264/00A AU1526400A (en) 1998-11-23 1999-11-23 Method for detecting target sequences in small proportions in heterogeneous samples

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10956798P 1998-11-23 1998-11-23
US60/109,567 1998-11-23

Publications (1)

Publication Number Publication Date
WO2000032820A1 true WO2000032820A1 (en) 2000-06-08

Family

ID=22328360

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/028064 WO2000032820A1 (en) 1998-11-23 1999-11-23 Method for detecting target sequences in small proportions in heterogeneous samples

Country Status (2)

Country Link
AU (1) AU1526400A (en)
WO (1) WO2000032820A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6586177B1 (en) 1999-09-08 2003-07-01 Exact Sciences Corporation Methods for disease detection
US6818404B2 (en) 1997-10-23 2004-11-16 Exact Sciences Corporation Methods for detecting hypermethylated nucleic acid in heterogeneous biological samples
US9109256B2 (en) 2004-10-27 2015-08-18 Esoterix Genetic Laboratories, Llc Method for monitoring disease progression or recurrence
US9777314B2 (en) 2005-04-21 2017-10-03 Esoterix Genetic Laboratories, Llc Analysis of heterogeneous nucleic acid samples

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0648845A2 (en) * 1993-07-08 1995-04-19 Johnson & Johnson Clinical Diagnostics, Inc. Method for coamplification of two different nucleic acid sequences using polymerase chain reaction
US5506105A (en) * 1991-12-10 1996-04-09 Dade International Inc. In situ assay of amplified intracellular mRNA targets
WO1997023651A1 (en) * 1995-12-22 1997-07-03 Exact Laboratories, Inc. Methods for the detection of clonal populations of transformed cells in a genomically heterogeneous cellular sample
WO1999007894A1 (en) * 1997-08-05 1999-02-18 Wisconsin Alumni Research Foundation Direct quantitation of low copy number rna

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN M -S ET AL: "DETECTION OF SINGLE-BASE MUTATIONS BY A COMPETITIVE MOBILITY SHIFT ASSAY", ANALYTICAL BIOCHEMISTRY,US,ACADEMIC PRESS, SAN DIEGO, CA, vol. 239, no. 1, 15 July 1996 (1996-07-15), pages 61 - 69, XP000598275, ISSN: 0003-2697 *

Also Published As

Publication number Publication date
AU1526400A (en) 2000-06-19

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref country code: AU

Ref document number: 2000 15264

Kind code of ref document: A

Format of ref document f/p: F

AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase