WO2017112738A1 - Methods for measuring microsatellite instability - Google Patents

Methods for measuring microsatellite instability

Info

Publication number
WO2017112738A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
bases
microsatellite
sample
indel
Prior art date
Application number
PCT/US2016/067952
Other languages
French (fr)
Inventor
Michael Perry
Kirsten Timms
Alexander Gutin
Original Assignee
Myriad Genetics, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Myriad Genetics, Inc.

Classifications

    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • CCHEMISTRY; METALLURGY
    • C12BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12QMEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00Oligonucleotides characterized by their use
    • C12Q2600/156Polymorphic or mutational markers

Definitions

  • the present disclosure relates to methods, kits, systems and compositions for detecting microsatellite instability in cancer cells.
  • the disclosure includes new markers that can be used in, e.g., laboratory assays to detect microsatellite instability with high sensitivity.
  • Microsatellites are genomic regions containing tandem sequence repeats.
  • While other types of low complexity sequences may be encompassed by the term "microsatellite," most microsatellites are mononucleotide repeats (also called homopolymers) or dinucleotide repeats. Microsatellite instability ("MSI") occurs when one or more of these regions contains more or fewer repeats in a somatic cell than expected based on the number of repeats in the germline. An early example of this is reported in Thibodeau et al., Microsatellite instability in cancer of the proximal colon, SCIENCE (1993) 260:816-819.
  • MSI Microsatellite instability
  • MSI is thought to be caused by slippage of the polymerase enzyme during
  • MMR DNA mismatch repair
  • MMR-deficiency in tumors may indicate that the underlying cancer is familial in nature (e.g., caused by or contributed to by a germline mutation in an MMR or MMR-related gene). MMR-deficiency may also be used to predict response or resistance to certain chemotherapies (e.g., 5-fluorouracil or alkylating agents such as temozolomide).
  • One panel known in the art is sometimes called the Bethesda panel, consisting of three dinucleotide repeats (D2S123, D5S346, D17S250) and two mononucleotide repeats (BAT26, BAT25).
  • a tumor is typically considered MSI-positive if 40% or more of these markers are unstable (i.e., show a repeat length different from expected).
  • Tumors that test negative for all five markers are sometimes referred to as microsatellite stable ("MSS").
  • MSS microsatellite stable
  • a tumor that tests positive for one locus (or on <30% of loci) is referred to as MSI-L.
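The five-marker interpretation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `classify_bethesda` and the boolean input format are hypothetical, and any result between zero unstable markers and the 40% cutoff is treated as MSI-L, which matches the one-locus case for a five-marker panel.

```python
def classify_bethesda(unstable_flags):
    """Classify a tumor from per-marker instability calls (True = unstable).

    Hypothetical helper: MSI-positive at 40% or more unstable markers,
    MSS when no marker is unstable, MSI-L otherwise (assumption).
    """
    total = len(unstable_flags)
    if total == 0:
        raise ValueError("at least one marker is required")
    unstable = sum(1 for flag in unstable_flags if flag)
    # Integer comparison avoids floating-point rounding at the 40% boundary.
    if unstable * 100 >= 40 * total:
        return "MSI-positive"
    if unstable == 0:
        return "MSS"
    return "MSI-L"


# Example with the five markers named in the text (calls are made up).
calls = {"D2S123": True, "D5S346": False, "D17S250": False,
         "BAT26": True, "BAT25": False}
print(classify_bethesda(list(calls.values())))
```

With five markers, two unstable markers is exactly 40%, so the example is classified as MSI-positive.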
  • the Bethesda panel is considered by some to have low sensitivity. See, e.g.,
  • NGS next generation sequencing
  • the present disclosure provides methods of analyzing microsatellite regions and detecting microsatellite instability. In an embodiment, a method of analyzing microsatellite regions is provided.
  • the method comprises: analyzing DNA derived from a patient sample to determine the nucleotide sequence of the DNA at a plurality of microsatellite regions, wherein (a) the plurality of microsatellite regions comprises at least one test microsatellite region; (b) the at least one test microsatellite region comprise(s) the nucleotide sequence of any one of SEQ ID NOs: 1-35; and (c) the sequence of the at least one test microsatellite region is analyzed to detect at least one indel at a homopolymer subregion comprising: (1) bases 7-26 of SEQ ID NO: 1; (2) bases 11-27 of SEQ ID NO: 2; (3) bases 10-27 of SEQ ID NO: 3; (4) bases 1-6 of SEQ ID NO: 4; (5) bases 10-28 of SEQ ID NO: 5; (6) bases 1-18 of SEQ ID NO: 6; (7) bases 8-28 of SEQ ID NO: 7; (8) bases 7-23 of SEQ ID NO: 8; (9)
  • the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
  • the sample is a bodily fluid.
  • the sample is a tumor sample.
  • the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors.
  • the indel is detected by next generation sequencing.
  • a method of detecting microsatellite instability levels comprises (a) assaying DNA derived from a patient sample according to claim 1; and (b) detecting (1) high microsatellite instability in a sample in which at least 60% of the plurality of microsatellite regions comprise an indel in the homopolymer subregion; or (2) intermediate or low microsatellite instability in a sample in which fewer than 60% and more than 10% of the plurality of microsatellite regions comprise an indel in the homopolymer subregion; or
  • the at least one test microsatellite region is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions.
  • the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 indels.
  • between 20 and 33 indels is indicative of MSI.
  • the sample is a bodily fluid.
  • the sample is a tumor sample.
  • the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors.
  • the indel is detected by next generation sequencing.
  • a method of detecting microsatellite instability comprises: (a) analyzing at least one microsatellite region present in one of SEQ ID NOs: 1-35 in DNA derived from a patient sample; (b) detecting at least one indel in the at least one microsatellite region; and (c) detecting microsatellite instability in a sample in which the at least one microsatellite region comprises at least one indel in a homopolymer region wherein the at least one microsatellite region comprises: (1) bases 7-26 of SEQ ID NO: 1; (2) bases 11-27 of SEQ ID NO: 2; (3) bases 10-27 of SEQ ID NO: 3;
  • the at least one test microsatellite region is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions.
  • the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 indels.
  • between 20 and 33 indels is indicative of MSI.
  • the sample is a bodily fluid.
  • the sample is a tumor sample.
  • the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors.
  • the indel is detected by next generation sequencing.
  • Figure 1A shows a homopolymer that was excluded following secondary selection.
  • Figure 1B shows a homopolymer exemplifying secondary criteria, which was included in the final selection of 35 homopolymers used for the MSI assay.
  • Microsatellite instability is evidence of a loss of function in the DNA mismatch repair (MMR) pathway, and the present disclosure is based on the discovery that analyzing certain homopolymer repeats provides a powerful test of MSI.
  • algorithm refers to any formula, model, mathematical equation, algorithmic, analytical or programmed process, or statistical technique or classification analysis that takes one or more inputs or parameters, whether continuous or categorical, and calculates an output value, index, index value or score.
  • algorithms include but are not limited to ratios, sums, regression operators such as exponents or coefficients, biomarker value transformations and normalizations (including, without limitation, normalization schemes that are based on clinical parameters such as age, gender, ethnicity, etc.), rules and guidelines, statistical classification models, and neural networks trained on populations.
  • linear and non-linear equations and statistical classification analyses to determine the relationship between (a) the presence of indels detected in a subject sample and (b) the level of the respective subject's MSI.
  • diagnosis refers to methods by which a determination can be made as to whether an individual is likely to be suffering from a given disease or condition, including but not limited to diseases or conditions resulting from microsatellite instability.
  • the skilled artisan often makes a diagnosis on the basis of one or more diagnostic indicators, e.g., a biomarker, the presence, absence, amount, or change in amount of which is indicative of the presence, severity, or absence of the condition.
  • diagnostic indicators can include patient history; physical symptoms, e.g., unexplained weight loss, fever, fatigue, pains, or skin anomalies; phenotype; genotype; or environmental or heredity factors.
  • diagnostic refers to an increased probability that a certain course or outcome will occur; that is, that a course or outcome is more likely to occur in a patient exhibiting a given characteristic, e.g., the presence or level of a diagnostic indicator, when compared to individuals not exhibiting the characteristic. Diagnostic methods can be used independently, or in combination with other diagnosing methods known in the art to determine whether a course or outcome is more likely to occur in a patient exhibiting a given characteristic.
  • disease can encompass any disorder, condition, sickness, ailment, etc. that manifests in, e.g., a disordered or incorrectly functioning organ, part, structure, or system of the body, and results from, e.g., genetic or developmental errors, MSI, infection, poisons, nutritional deficiency or imbalance, toxicity, or unfavorable environmental factors.
  • homopolymer refers to a microsatellite that is a mononucleotide repeat of at least 6 bases (e.g., a stretch of at least 6 consecutive A, C, T or G residues in the DNA).
  • a “homopolymer region” is a microsatellite region (as defined below) in which the microsatellite is a homopolymer.
  • a “homopolymer subregion” refers to a homopolymer microsatellite located within a larger genomic region (e.g., a homopolymer region).
  • SEQ ID NO: 1 is a microsatellite region of 31 nucleotides as follows:
  • the string of thymines at positions 7-26 in SEQ ID NO: 1 is a homopolymer subregion within the microsatellite or homopolymer region of SEQ ID NO: 1.
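The definitions above (a homopolymer is a run of at least 6 identical bases, located by 1-based positions within a larger region) can be illustrated with a short sketch. The function name `find_homopolymers` is hypothetical, and the 31-nucleotide example sequence is invented for illustration; it is not SEQ ID NO: 1, whose sequence is not reproduced here.

```python
import re


def find_homopolymers(sequence, min_length=6):
    """Return (start, end, base) for each homopolymer run in `sequence`.

    Positions are 1-based and inclusive, matching the "bases 7-26"
    style of coordinates used in the text. Hypothetical helper.
    """
    if min_length < 2:
        raise ValueError("min_length must be at least 2")
    # One base followed by at least (min_length - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start() + 1, m.end(), m.group(1))
            for m in pattern.finditer(sequence.upper())]


# Invented 31-nt region: 6 flanking bases, 20 thymines, 5 flanking bases.
region = "GATTCA" + "T" * 20 + "GCAGC"
print(find_homopolymers(region))
```

For this invented region the single reported run is the T homopolymer at positions 7-26; the short "TT" inside the left flank is ignored because it is below the 6-base minimum.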
  • indels refers to a mutation in a nucleic acid whereby one or more nucleotides are either inserted or deleted, resulting in a net gain or loss of nucleotides, which can include any combination of insertions and deletions. Aberrant homopolymer lengths often result from indels.
  • Lynch syndrome refers to an autosomal dominant genetic condition which has a high risk of colon cancer as well as other cancers including endometrium, ovary, stomach, small intestine, hepatobiliary tract, upper urinary tract, brain, and skin cancer.
  • HNPCC old name for the condition
  • microsatellite refers to a genetic locus comprising a short tandemly repeated sequence motif comprising a minimal total length of 6 bases.
  • a "mononucleotide microsatellite” or “homopolymer” refers to a genetic locus comprising a repeated single nucleotide (e.g., poly-A) and is a specific subclass of microsatellites particularly relevant to the present disclosure.
  • a “dinucleotide microsatellite” refers to a genetic locus comprising a motif of two nucleotides that are tandemly repeated
  • a “trinucleotide microsatellite” refers to a genetic locus comprising three nucleotides that are tandemly repeated
  • a “tetranucleotide microsatellite” refers to a genetic locus comprising a motif of four nucleotides that are tandemly repeated.
  • Additional microsatellite motifs can comprise penta- and hexanucleotide repeats.
  • a “monomorphic microsatellite” is one in which all (or substantially all) individuals, particularly all individuals of a given population, share the same number of repeat units.
  • BAT26 is a monomorphic microsatellite in Europeans and a polymorphic microsatellite in Africans.
  • genomic DNA of a sample e.g., genomic DNA of a cancer cell present in or obtained from the subject.
  • MSI status means the presence of microsatellite instability
  • detecting MSI in a cancer cell sample may include classifying MSI status in the cancer cell, in which case the method may include a classification step.
  • classifying MSI status in a cancer cell sample means categorizing the sample based on its MSI status, e.g., the degree to which it comprises cancer cells harboring molecular features indicative of instability at microsatellite sites.
  • NGS next generation sequencing
  • DNA sequencing libraries are generated by clonal amplification by PCR in vitro
  • the DNA is sequenced by synthesis, such that the DNA sequence is determined by the addition of nucleotides to the complementary strand rather than through the chain-termination chemistry typical of Sanger sequencing
  • third, the spatially segregated, amplified DNA templates are sequenced simultaneously in a massively parallel fashion, typically without the requirement for a physical separation step.
  • NGS parallelization of sequencing reactions can generate hundreds of megabases to gigabases of nucleotide sequence reads in a single instrument run.
  • conventional sequencing techniques such as Sanger sequencing, which typically report the average genotype of an aggregate collection of molecules
  • NGS technologies typically digitally tabulate the sequence of numerous individual DNA fragments (sequence reads discussed in detail below), such that low frequency variants (e.g., variants present at less than about 10%, 5% or 1% frequency in a heterogeneous population of nucleic acid molecules) can be detected.
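Because NGS tabulates individual reads, the fraction of reads carrying a non-reference homopolymer length can be estimated by simple counting. The sketch below assumes the homopolymer length has already been measured for each read; the function name `variant_read_fraction` and the example numbers are hypothetical.

```python
from collections import Counter


def variant_read_fraction(read_lengths, expected_length):
    """Tabulate per-read homopolymer lengths and return
    (fraction of reads differing from expected_length, length counts).
    Hypothetical helper for illustration only.
    """
    counts = Counter(read_lengths)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    variant_reads = total - counts.get(expected_length, 0)
    return variant_reads / total, counts


# Made-up example: 95 reads at the expected length of 20, 5 reads at 19.
lengths = [20] * 95 + [19] * 5
fraction, counts = variant_read_fraction(lengths, expected_length=20)
print(f"{fraction:.2%} of reads differ from the expected length")
print(dict(counts))
```

In this made-up example the variant length is present at about 5% frequency, the kind of low-frequency signal that an averaged Sanger trace would typically not resolve.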
  • the term "massively parallel” can also be used to refer to the simultaneous generation of sequence information from many different template molecules by NGS.
  • NGS strategies can include several methodologies, including, but not limited to: (i) microelectrophoretic methods; (ii) sequencing by hybridization; (iii) real-time observation of single molecules, and (iv) cyclic-array sequencing.
  • Cyclic-array sequencing refers to technologies in which a sequence of a dense array of DNA is obtained by iterative cycles of template extension and imaging-based data collection.
  • cyclic-array sequencing technologies include, but are not limited to 454 sequencing, for example, used in 454 Genome Sequencers (Roche Applied Science; Basel), Solexa technology, for example, used in the Illumina Genome Analyzer, Illumina HiSeq, MiSeq, and NextSeq (San Diego, CA), the SOLiD platform (Applied Biosystems; Foster City, CA), the Polonator (Dover/Harvard) and HeliScope Single Molecule Sequencer technology (Helicos; Cambridge, MA).
  • Other NGS methods include single molecule real time sequencing (e.g., Pacific Bio) and ion semiconductor sequencing (e.g., Ion Torrent sequencing). See, e.g., Shendure & Ji, Next Generation DNA Sequencing, NAT. BIOTECH. (2008) 26: 1135-1145 for a more detailed discussion of NGS sequencing technologies.
  • patient or “individual” or “subject” generally refers to a human.
  • a subject can be male or female.
  • a subject can be one who has been previously diagnosed or identified as having a disease characterized by MSI.
  • a subject can be one who has already undergone, or is undergoing, a therapeutic intervention for disease characterized by MSI.
  • a subject can also be one who has not been previously diagnosed with a disease characterized by MSI.
  • sample refers to a physical specimen that is, or is derived from, a biological tissue, including biological fluids.
  • a sample is "derived from” a biological tissue when the sample is the result of some process applied to the biological tissue, e.g., serum is derived from whole blood. Examples include, but are not limited to, biopsy or tissue samples, frozen samples, blood and blood fractions or products (e.g., serum, platelets, red blood cells, and the like), tumor samples, sputum, bronchoalveolar lavage, cultured cells, e.g., primary cultures, explants, and transformed cells, stool, urine, etc.
  • a “biopsy” refers to the process of removing a tissue sample for diagnostic or prognostic evaluation, and to the tissue specimen itself. Any suitable biopsy technique can be applied to the diagnostic methods of the present disclosure. The biopsy technique applied will depend on the tissue type to be evaluated (e.g., lung etc.), the size and type of the tumor, among other factors. Representative biopsy techniques include, but are not limited to, excisional biopsy, incisional biopsy, needle biopsy, surgical biopsy, and bone marrow biopsy.
  • An “excisional biopsy” refers to the removal of an entire tumor mass with a small margin of normal tissue surrounding it.
  • An “incisional biopsy” refers to the removal of a wedge of tissue that includes a cross-sectional diameter of the tumor.
  • a diagnosis made by endoscopy or fluoroscopy can require a "core-needle biopsy", or a “fine-needle aspiration biopsy” which generally obtains a suspension of cells from within a target tissue.
  • a "bodily fluid" includes all fluids obtained from a mammalian body, either processed (e.g., serum) or unprocessed, which can include, for example, blood, plasma, urine, lymph, gastric juices, bile, serum, saliva, sweat, and spinal and brain fluids.
  • cancer cell samples or “tumor sample” means a specimen comprising either at least one cancer cell or biomolecules derived therefrom. Non-limiting examples of such biomolecules include nucleic acids and proteins.
  • Biomolecules "derived” from a cancer cell sample include endogenous molecules extracted from the sample as well as artificially synthesized copies or versions of such endogenous biomolecules.
  • One illustrative, non-limiting example of such artificially synthesized molecules includes PCR amplification products in which nucleic acids (“biomolecules”) from the sample serve as PCR templates.
  • Nucleic acids of a cancer cell sample include nucleic acids located in a cancer cell or biomolecules derived from a cancer cell.
  • score means a value or set of values selected so as to provide a quantitative measure of a variable or characteristic of a subject's condition or the degree of MSI in a sample, and/or to discriminate, differentiate or otherwise characterize MSI.
  • the value(s) comprising the score can be based on, for example, quantitative data resulting in a measured amount of one or more sample constituents obtained from the subject.
  • the score can be derived from a single constituent, parameter or assessment, while in other embodiments the score is derived from multiple constituents, parameters and/or assessments.
  • the score can be based upon or derived from an interpretation function; e.g., an interpretation function derived from a particular predictive model using any of various statistical algorithms known in the art.
  • a "change in score” can refer to the absolute change in score, e.g. from one time point to the next, or the percent change in score, or the change in the score per unit time (i.e., the rate of score change).
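The three notions of "change in score" named above (absolute change, percent change, and rate of change) reduce to simple arithmetic. The sketch below is illustrative only; the function name `score_change` and the time unit are assumptions.

```python
def score_change(previous, current, elapsed):
    """Return (absolute change, percent change, rate of change).

    `elapsed` is the time between the two measurements in any consistent
    unit (assumption). Percent change is None when the previous score is 0.
    """
    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    absolute = current - previous
    percent = None if previous == 0 else absolute / previous * 100
    rate = absolute / elapsed
    return absolute, percent, rate


# Made-up example: score rises from 10 to 15 over 5 time units.
print(score_change(10, 15, 5))
```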
  • test microsatellite region is a microsatellite region whose nucleotide sequence is analyzed according to the present disclosure.
  • a test microsatellite region includes non-homopolymer sequences immediately surrounding the homopolymer region that can be used to determine the precise homopolymer position within a genome.
  • SEQ ID NOs: l-35 are exemplary test microsatellite regions.
  • treatment or “therapy” or “therapeutic regimen” includes all clinical management of a subject and interventions, whether biological, chemical, physical, or a combination thereof, intended to sustain, ameliorate, improve, or otherwise alter the condition of a subject. These terms may be used synonymously herein.
  • Treatments include but are not limited to administration of prophylactics or therapeutic compounds (including small molecule and biologic drugs), exercise regimens, physical therapy, dietary modification and/or supplementation, bariatric surgical intervention, administration of therapeutic compounds (prescription or over-the-counter), and any other treatments known in the art as efficacious in preventing, delaying the onset of, or ameliorating disease characterized by MSI.
  • a "response to treatment” includes a subject's response to any of the above-described treatments, whether biological, chemical, physical, or a combination of the foregoing.
  • a “treatment course” relates to the dosage, duration, extent, etc. of a particular treatment or therapeutic regimen.
  • An initial therapeutic regimen as used herein is the first line of treatment.
  • the present disclosure relates to analyzing homopolymer repeat regions to determine levels of microsatellite instability (MSI), which can in turn be evidence of a loss of function in the DNA mismatch repair (MMR) pathway.
  • MSI microsatellite instability
  • MMR DNA mismatch repair
  • the present disclosure provides methods of analyzing microsatellite regions comprising analyzing DNA derived from a patient sample to determine the nucleotide sequences of the DNA at a plurality of microsatellite regions. In some embodiments the present disclosure provides methods of detecting microsatellite instability comprising analyzing microsatellite regions.
  • the microsatellite regions can be any region of a genome known to contain a microsatellite as defined herein.
  • the microsatellite region that is analyzed (or tested) to determine a specific nucleotide sequence is a test microsatellite region, which may contain nucleotide sequences immediately adjacent to the homopolymer subregion in order to determine its precise location within the genome.
  • Test microsatellite regions can be any sequence in a genome that includes a microsatellite, which can include the test microsatellite regions listed in Table 1. The homopolymer subregions within each SEQ ID are indicated. Table 1
  • Table 2 provides expanded sequence context for the homopolymers listed in
  • SEQ ID NOs 36-70 show additional 5' and 3' genomic sequence surrounding the sequences in SEQ ID NOs 1-35 (e.g., SEQ ID NO: 36 shows additional flanking 5' and 3' genomic sequence surrounding SEQ ID NO: 1, SEQ ID NO: 37 shows additional flanking 5' and 3' genomic sequence surrounding SEQ ID NO: 2, and so forth).
  • the homopolymer subregions within each SEQ ID are indicated.
  • the method comprises determining whether nucleic acids of the sample have an aberrant homopolymer length for a plurality of microsatellite regions comprising at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions chosen from Table 1 or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions chosen from Table 2.
  • the method comprises determining whether at least one indel is present in any homopolymer subregion included in the test microsatellite regions chosen from Table 1 or Table 2, wherein said at least one indel comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, or more indels.
  • the presence of an indel can be established by comparing the nucleotide sequence of the test microsatellite region to a reference, which in some embodiments includes the nucleotide sequence of DNA in which the indel is not present (e.g., a known reference DNA sequence, the sequence of germline DNA from the subject with the tumor, etc.).
  • the presence of an indel can be established by comparing the measured/detected length of the homopolymer subregion in the test microsatellite region to the expected length of the homopolymer (e.g., the length of this specific homopolymer in a reference genome).
  • Determining whether nucleic acids of a sample have an aberrant homopolymer length for at least one of the test microsatellite regions described herein, for example the test microsatellite regions of Table 1, can comprise assaying the sample and comparing the test microsatellite sequence(s) obtained to a reference sequence from a sample where MSI is not present, in order to identify a difference in length between the test microsatellite region obtained from a patient and the reference sequence.
  • this determining comprises obtaining sequence data from such an assay and performing the comparison to identify the difference in length.
  • this determining comprises obtaining data from such a comparison and identifying a difference in length between the test sequence and the reference sequence.
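One way to picture the length comparison described above is to locate the homopolymer in a read by its flanking sequence, measure it, and subtract the expected (reference) length. This is a minimal sketch, not the assay's actual pipeline: the function names, the exact-match flank search, and the example sequences are all assumptions, and it presumes the left flank does not end with, and the right flank does not start with, the homopolymer base.

```python
def homopolymer_length(read, left_flank, right_flank, base):
    """Measure the run of `base` between the two flanks in `read`.

    Returns None when the flanks are not found in the expected
    arrangement (e.g., the read does not span the whole homopolymer).
    """
    anchor = read.find(left_flank)
    if anchor < 0:
        return None
    start = anchor + len(left_flank)
    end = start
    while end < len(read) and read[end] == base:
        end += 1
    # Require the right flank immediately after the run.
    if not read.startswith(right_flank, end):
        return None
    return end - start


def indel_size(read, left_flank, right_flank, base, expected_length):
    """Signed length difference: negative = deletion, positive = insertion,
    0 = no indel, None = read not informative."""
    observed = homopolymer_length(read, left_flank, right_flank, base)
    if observed is None:
        return None
    return observed - expected_length


# Invented read with an 18-T run where the reference length is 20.
read = "GATTCA" + "T" * 18 + "GCAGC"
print(indel_size(read, "GATTCA", "GCAGC", "T", expected_length=20))
```

Requiring both flanks keeps reads that end inside the homopolymer from being miscounted as deletions; real pipelines typically work from aligned reads instead of exact string matching.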
  • Microsatellite instability can be detected, or measured, by identifying indels as described herein.
  • This measurement may be quantitative to varying degrees— e.g., assigning a score (e.g., numerical score) that lies along a substantially continuous spectrum and that reflects the quantitative level of MSI (e.g., the number or proportion of microsatellite markers showing instability).
  • this measurement may be a blending of two approaches— e.g., assigning such a score (e.g., numerical score) to a sample and then assigning MSI status in the sample into a discrete set of categories according to that score.
  • a score below X may be categorized as MSI-negative, a score above Y may be categorized as MSI-high, and a score between X and Y may be categorized as MSI-low or intermediate.
  • Y in this case the number of test microsatellite regions chosen from Table 1 or Table 2 having an aberrant homopolymer length
  • Y in this case the percentage of test microsatellite regions chosen from Table 1 or Table 2 having an aberrant homopolymer length
  • X the percentage of test microsatellite regions chosen from Table 1 or Table 2 having an aberrant homopolymer length
  • no microsatellite instability in a sample in which 10% or fewer of the plurality of microsatellite regions comprises an indel in the homopolymer subregion.
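The percentage cutoffs stated in this disclosure (at least 60% of regions with an indel for high MSI, more than 10% and fewer than 60% for intermediate or low MSI, and 10% or fewer for no MSI) can be expressed as a small classification function. The function name `classify_msi` and the returned labels are hypothetical; only the cutoffs come from the text.

```python
def classify_msi(regions_with_indel, regions_tested):
    """Map the share of tested microsatellite regions carrying an indel
    to an MSI category using the 60% and 10% cutoffs from the text."""
    if regions_tested <= 0:
        raise ValueError("regions_tested must be positive")
    if not 0 <= regions_with_indel <= regions_tested:
        raise ValueError("regions_with_indel out of range")
    # Integer arithmetic keeps the boundary cases exact.
    if regions_with_indel * 100 >= 60 * regions_tested:
        return "MSI-high"
    if regions_with_indel * 100 > 10 * regions_tested:
        return "MSI-intermediate/low"
    return "no MSI"


# With a 35-region panel, 21 regions is exactly 60%.
for n in (3, 4, 20, 21):
    print(n, classify_msi(n, 35))
```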
  • the genome in which MSI status is being assessed can be from a cancer cell present in a tumor sample as defined herein.
  • the cancer cell can include any type of malignant solid tumor known to have metastatic potential, including without limitation, lung cancer (e.g., non-small cell lung cancer (NSCLC)), bone cancer, pancreatic cancer, cancer of the head or neck, melanoma, skin cancer, lymphoblast cancer, uterine cancer, ovarian cancer, cervical cancer, colorectal cancer, gastric cancer, colon cancer, stomach cancer, breast cancer, endometrial cancer, thyroid cancer, prostate cancer, rectal cancer, bladder cancer, kidney cancer (e.g., renal cell carcinoma), liver cancer (e.g., hepatocellular carcinoma), and cancers of the central nervous system (CNS), (e.g., glioma, glioblastoma multiforme or astrocytoma).
  • NSCLC non-small cell lung cancer
  • CNS central nervous system
  • MSI status can be determined independent of cancer type. Thus, in principle, diagnosing MSI status can be done for every type of cancer. However, since MSI is most often present in cancers with a deficiency in mismatch repair genes, the methods of detecting MSI status described herein may be particularly useful in a tumor sample of a cancer where MMR deficiency occurs more frequently than in other types of cancer. Accordingly, the cancer sample may be a sample selected from colorectal cancer, endometrial cancer, ovarian cancer, gastric cancer, and leukemia.
  • MMR-deficient tumors have therefore been proposed. Synthetic lethality approaches have shown, for instance, that increased oxidative damage (by methotrexate exposure or PINK1 silencing) or interference with the base excision repair (BER) pathway (by DNA polymerase β or γ inhibition) sensitizes MMR-deficient tumors.
  • oxidative damage induces 8-oxoguanine (8-oxoG) DNA lesions, which fail to be sufficiently repaired either by the BER or MMR pathway, generating mainly GC to TA dinucleotide transversions at the DNA level, leading to cell death.
  • MMR-deficient tumors are often also resistant to targeted cancer therapies, including anti-EGFR and anti-VEGF therapies. Although the precise reasons for this resistance are unknown, the presence of secondary mutations in established tumor driver genes as a consequence of MMR-deficiency might be responsible.
  • MMR-deficient tumors can acquire mutations in double-strand break repair genes (e.g., MRE11, ATR and RAD50), known oncogenes or tumor suppressors (e.g., PIK3CA or PTEN). Since presence of MMR-deficiency mainly in colorectal and endometrial tumors represent a familial form of cancer, and since tumors exhibiting mutation spectra characteristic of MMR-deficiency, diagnostic tests assessing MMR-deficiency are commonly used.
  • the present methods generally relate to the detection of indels in genomic microsatellites.
  • Techniques for preparing nucleic acids in or derived from a cancer cell sample in a form that is suitable for indel detection can include, but are not limited to, PCR, detectable probes, sequencing and single base extensions, reverse transcriptase-PCR (RT-PCR), real-time PCR, allele-specific hybridization, reverse transcription quantitative real-time PCR (RT-qPCR), ligase chain reaction, strand displacement amplification (SDA), self-sustained sequence replication (3SR), or in situ PCR.
  • Indels e.g., homopolymer lengths
  • Indels can be detected by direct sequencing.
  • Non-limiting examples of sequence analysis useful in methods of the present disclosure include NGS (e.g., Chen et al., Genome Res. (2008) 18:1143-1149; Srivatsan et al., PLoS Genet. (2008) 4:e1000139), Maxam-Gilbert sequencing, Sanger sequencing, capillary array DNA sequencing, thermal cycle sequencing (Sears et al., Biotechniques (1992) 13:626-633), solid-phase sequencing (Zimmerman et al., Methods Mol. Cell Biol.
  • sequencing with mass spectrometry such as matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF/MS; Fu et al., Nat. Biotechnol. (1998) 16:381-384), sequencing by hybridization (Chee et al., Science (1996) 274:610-614; Drmanac et al., Science (1993) 260:1649-1652; Drmanac et al., Nat. Biotechnol. (1998) 16:54-58), Polony sequencing (Porreca et al., Curr. Protoc. Mol. Biol. (2006) Chp.
  • Other detection methods include Pyrosequencing™ of oligonucleotide-length products. Such methods often employ amplification techniques such as PCR. For example, in pyrosequencing, a sequencing primer is hybridized to a single-stranded, PCR-amplified DNA template and incubated with the enzymes DNA polymerase, ATP sulfurylase, luciferase and apyrase, and the substrates adenosine 5' phosphosulfate (APS) and luciferin. The first of four deoxynucleotide triphosphates (dNTP) is added to the reaction.
  • DNA polymerase catalyzes the incorporation of the deoxynucleotide triphosphate into the DNA strand, if it is complementary to the base in the template strand. Each incorporation event is accompanied by release of pyrophosphate (PPi) in a quantity equimolar to the amount of incorporated nucleotide.
  • ATP sulfurylase quantitatively converts PPi to ATP in the presence of adenosine 5' phosphosulfate. This ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin that generates visible light in amounts that are proportional to the amount of ATP.
  • the light produced in the luciferase-catalyzed reaction is detected by a charge coupled device (CCD) camera and seen as a peak in a Pyrogram™. Each light signal is proportional to the number of nucleotides incorporated.
  • Apyrase, a nucleotide-degrading enzyme, continuously degrades unincorporated dNTPs and excess ATP. When degradation is complete, another dNTP is added.
  • Another similar method for characterizing homopolymer sequence/length does not require use of a complete PCR, but typically uses only the extension of a primer by a single, fluorescence-labeled dideoxyribonucleic acid molecule (ddNTP) that is complementary to the nucleotide to be investigated.
  • the nucleotide at the polymorphic site can be identified via detection of a primer that has been extended by one base and is fluorescently labeled (e.g., Kobayashi et al., MOL. CELL. PROBES (1995) 9:175-182).
  • Sample DNA can be analyzed using techniques including, without limitation, electrophoretic analysis or sequence analysis.
  • electrophoretic analysis include slab gel electrophoresis such as agarose or polyacrylamide gel electrophoresis, capillary electrophoresis, and denaturing gradient gel electrophoresis (DGGE).
  • Other methods of nucleic acid analysis include, but are not limited to, hybridization with allele-specific oligonucleotide probes (Wallace et al., NUCL. ACIDS RES.
  • This technique, also commonly referred to as allele specific oligonucleotide hybridization (ASO) (e.g., Stoneking et al., AM. J. HUM. GENET. (1991) 48:70-382; Saiki et al., NATURE (1986) 324:163-166; EP 235,726; and WO/1989/011548), relies on distinguishing between two DNA molecules differing, typically, by one base by hybridizing an oligonucleotide probe that is specific for one of the variants to nucleic acid in or derived from a sample.
  • This method typically employs short oligonucleotides, e.g., 15-20 bases in length, but may also employ longer oligonucleotides, e.g., 20-100 bases in length.
  • the probes are designed to differentially hybridize to one variant versus another. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, producing an essentially binary response, whereby a probe hybridizes efficiently to only one of the alleles.
  • Some probes are designed to hybridize to a segment of target DNA such that the variable site aligns with a central position (e.g., in a 15-base oligonucleotide at the 7 position; in a 16-base oligonucleotide at either the 8 or 9 position) of the probe, but this design is not required.
  • the amount and/or presence of an allele can be determined by measuring the amount of allele-specific oligonucleotide that is hybridized to the sample.
  • the oligonucleotide is labeled with a detectable moiety, e.g., a fluorescent label.
  • the homopolymer of SEQ ID NO: 1 can be detected as follows. First, DNA from a cancer cell sample may be extracted and cleaved or sheared into roughly uniform fragments of a desired length (e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1,000 or more nucleotides).
  • This sheared DNA can be enriched for fragments containing the test microsatellite region(s) of interest (e.g., SEQ ID NO: 1) by either (a) oligonucleotide probe hybridization capture (specific for the test microsatellite region(s) of interest) followed by amplification (which can be but need not be specific) or (b) PCR amplification using primers specific for the test microsatellite region(s) of interest.
  • One or more oligonucleotides that hybridize only (or only efficiently hybridize) to a specific homopolymer of a specific length for each test microsatellite region can be applied to this sample of enriched DNA.
  • Hybridization levels (e.g., fluorescence intensity) can be measured for each allele-specific oligonucleotide, thereby detecting and optionally quantitating each homopolymer length in the sample.
  • Suitable assay formats for detecting hybrids formed between probes and target nucleic acid sequences in a sample include the immobilized target (dot-blot) format and immobilized probe (reverse dot-blot or line-blot) assay formats.
  • Dot blot and reverse dot blot assay formats are described in U.S. Pat. Nos. 5,310,893; 5,451,512; 5,468,613; and 5,604,099; each incorporated herein by reference.
  • amplified target DNA is immobilized on a solid support, such as a nylon membrane.
  • the membrane-target complex is incubated with labeled probe under suitable hybridization conditions, unhybridized probe is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound probe.
  • Indels can also be detected using allele-specific amplification or primer extension methods. These reactions typically involve use of primers that are designed to selectively target one or another variant, where selectivity is achieved by failure to amplify target DNA that contains a mismatch corresponding to the 3'-end of a primer. The presence of a mismatch affects the ability of a polymerase to extend a primer when the polymerase lacks error-correcting activity.
  • a primer complementary to a reference allele of a microsatellite i.e., without indel
  • the 3'-terminal nucleotide hybridizes with the sequence containing the reference number/length of repeats.
  • the presence of the particular allele can be determined by the ability of the primer to initiate extension. If the 3'-terminus is mismatched (e.g., the sample contains an aberrant homopolymer length as compared to the reference), extension is impeded.
  • the primer can be used in conjunction with a second primer in an amplification reaction.
  • the second primer can hybridize at a site unrelated to the homopolymer.
  • Amplification proceeds from the two primers leading to a detectable product signifying the particular allelic form (e.g., specific homopolymer of a specific length) is present.
  • Allele-specific amplification- or extension-based methods are described in, for example, WO/1993/022456; U.S. Pat. Nos. 5,137,806; 5,595,890; 5,639,611; and 4,851,331.
  • identification of the alleles requires only detection of the presence or absence of amplified target sequences.
  • Methods for the detection of amplified target sequences include, but are not limited to, gel electrophoresis and probe hybridization assays (e.g., oligonucleotide microarray).
  • the amplified nucleic acid is detected by monitoring the increase in the total amount of double-stranded DNA in the reaction mixture as described, e.g., in U.S. Pat. No. 5,994,056.
  • the detection of double-stranded target DNA relies on the increased fluorescence various DNA-binding dyes, e.g., SYBR Green, exhibit when bound to double-stranded DNA.
  • Allele-specific amplification methods can be performed in a reaction that employs multiple allele-specific primers to target particular alleles. Primers for such multiplex applications are generally labeled with distinguishable labels or are selected such that the amplification products produced from the alleles are distinguishable by size. Thus, for example, both alleles in a single sample can be identified using a single amplification by gel analysis of the amplification product.
  • an allele-specific oligonucleotide primer may be exactly complementary to one of the variant alleles in the hybridizing region or may have some mismatches at positions other than the 3'-terminus of the oligonucleotide, which mismatches occur at non-variable sites in both allele sequences.
  • Genotyping can be performed using a "TaqMan™" or "5'-nuclease assay", as described in U.S. Pat. Nos. 5,210,015; 5,487,972; and 5,804,375; and Holland et al., PROC. NATL. ACAD. SCI. (1988) 88:7276-7280.
  • In the TaqMan™ assay, labeled detection probes that hybridize within the amplified region are added during the amplification reaction. The probes are modified so as to prevent the probes from acting as primers for DNA synthesis.
  • the amplification is performed using a DNA polymerase having 5'- to 3'-exonuclease activity.
  • any probe which successfully hybridizes to the target nucleic acid downstream from the primer being extended is degraded by the 5'- to 3'-exonuclease activity of the DNA polymerase.
  • the synthesis of a new target strand also results in the degradation of a probe, and the accumulation of degradation product provides a measure of the synthesis of target sequences.
  • the hybridization probe can be an allele-specific probe that discriminates between alleles with and without indels, similar to the allele-specific oligonucleotides described above.
  • the method can be performed using an allele-specific primer, similar to those described above, and a labeled probe that binds to amplified product.
  • the probes can be at least about 12, 15, 16, 18, 20, 22, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 or more nucleotide portions of a contiguous sequence comprising the homopolymer of interest.
  • the probes can be produced by, for example, chemical synthesis, PCR amplification, generation from longer polynucleotides using restriction enzymes, etc.
  • the probes can be made completely complementary to the target nucleic acid or portion thereof (e.g., to all or a portion of any of SEQ ID NOs: 1-70). Therefore, usually high stringency conditions are desirable in order to prevent or at least minimize off-target hybridization. Conditions of high stringency are generally suitable for applications where the probes are complementary to regions of the target which lack heterogeneity.
  • Nucleic acid probes, or alternatively nucleic acid from the samples, can be provided in solution for such assays, or can be affixed to a support (e.g., solid or semi-solid support).
  • Examples of supports include nitrocellulose (e.g., in membrane or microtiter well form), polyvinyl chloride (e.g., in sheets or microtiter wells), polystyrene latex (e.g., in beads or microtiter plates), polyvinylidene fluoride, diazotized paper, nylon membranes, activated beads, and Protein A beads.
  • any method suitable for detecting degradation product e.g., fluorescence
  • the nucleic acid from the sample can be subjected to gel electrophoresis or other size separation techniques; alternatively, the nucleic acid sample can be dot blotted without size separation.
  • the detection probe is labeled with two fluorescent dyes, one of which is capable of quenching the fluorescence of the other dye.
  • the dyes are attached to the probe, usually one attached to the 5'-terminus and the other attached to an internal site, such that quenching occurs when the probe is in an unhybridized state and such that cleavage of the probe by the 5'- to 3'-exonuclease activity of the DNA polymerase occurs in between the two dyes.
  • Amplification results in cleavage of the probe between the dyes with a concomitant elimination of quenching and an increase in the fluorescence observable from the initially quenched dye.
  • the accumulation of degradation product is monitored by measuring the increase in reaction fluorescence.
  • U.S. Pat. Nos. 5,491,063 and 5,571,673, both incorporated herein by reference, describe alternative methods for detecting the degradation of probe which occur concomitant with amplification.
  • Probes detectable upon a secondary structural change are also suitable for detection of a test locus variant.
  • Exemplified secondary structure or stem-loop structure probes include molecular beacons or Scorpion® primer/probes.
  • Molecular beacon probes are single-stranded oligonucleic acid probes that can form a hairpin structure in which a fluorophore and a quencher are usually placed on the opposite ends of the oligonucleotide. At either end of the probe, short complementary sequences allow for the formation of an intramolecular stem, which enables the fluorophore and the quencher to come into close proximity.
  • the loop portion of the molecular beacon is complementary to a target nucleic acid of interest.
  • Binding of this probe to its target nucleic acid of interest forms a hybrid that forces the stem apart. This causes a conformation change that moves the fluorophore and the quencher away from each other and leads to a more intense fluorescent signal. See, e.g., Tyagi & Kramer, NAT. BIOTECHNOL. (1996) 14:303-308; Tyagi et al., NAT. BIOTECHNOL. (1998) 16:49-53; Piatek et al., NAT. BIOTECHNOL.
  • Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described, e.g., in Orita et al., PROC. NAT. ACAD. SCI. (1989) 86:2766-2770.
  • Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single stranded amplification products.
  • Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence.
  • the different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence difference between alleles of target genes.
  • Indel detection methods often employ labeled oligonucleotides.
  • Oligonucleotides can be labeled by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means.
  • useful labels include fluorescent dyes, radioactive labels, e.g., 32P, electron-dense reagents, enzymes, such as peroxidase or alkaline phosphatase, biotin, or haptens and proteins for which antisera or monoclonal antibodies are available. Labeling techniques are well known in the art (see, e.g., Sambrook et al., supra).
  • the present disclosure also provides methods of administering, recommending, prescribing, etc. specific therapeutic regimens to patients whose samples are found to have specific MSI status as disclosed herein. These embodiments of the present disclosure thus will provide patient-specific biological information, which will be informative for therapy selection and will facilitate therapy response prediction.
  • the levels of MSI in a sample are compared to a reference ("reference standard” or "reference level”) in order to direct treatment decisions.
  • Indel determinations in homopolymers can be manipulated into a score, which can represent MSI status.
  • the reference standard used for any embodiment disclosed herein may comprise average, mean, or median levels of microsatellite instability (or average, mean, or median numbers of aberrant homopolymer lengths) in a control population.
  • the reference standard may additionally comprise cutoff values or any other statistical attribute of the control population, or earlier time points of the same subject, such as a standard deviation from the mean levels of aberrant homopolymers.
  • the control population may comprise healthy individuals, cancer patients having a particular response profile, or the same test patient prior to the administration of any or a specific therapy.
  • a test patient is treated more or less aggressively than a reference therapy based on the difference between the test patient's level of MSI (e.g., level of aberrant homopolymer lengths) and the reference level of MSI.
  • a reference therapy is any therapy that is the standard of care for the patient's disease. The standard of care can vary temporally and geographically, and a skilled person can easily determine the appropriate standard of care by consulting the relevant medical literature.
  • a more aggressive therapy than the standard therapy comprises beginning treatment earlier than in the standard therapy. In some embodiments, a more aggressive therapy than the standard therapy comprises administering additional treatments beyond the standard therapy. In some embodiments, a more aggressive therapy than the standard therapy comprises administering alternative treatments instead of the standard therapy. In some embodiments, a more aggressive therapy than the standard therapy comprises treating on an accelerated schedule compared to the standard therapy. In one embodiment a more aggressive therapy comprises increased length of therapy. In one embodiment a more aggressive therapy comprises increased frequency of the dose schedule. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs and increasing drug dosage. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs and accelerating dose schedule.
  • more aggressive therapy comprises selecting and administering more potent drugs and increasing length of therapy. In one embodiment, more aggressive therapy comprises increasing drug dosage and accelerating dose schedule. In one embodiment, more aggressive therapy comprises increasing drug dosage and increasing length of therapy. In one embodiment, more aggressive therapy comprises accelerating dose schedule and increasing length of therapy. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs, increasing drug dosage, and accelerating dose schedule. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs, increasing drug dosage, and increasing length of therapy. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs, accelerating dose schedule, and increasing length of therapy. In one embodiment, more aggressive therapy comprises increasing drug dosage, accelerating dose schedule, and increasing length of therapy. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs, increasing drug dosage, accelerating dose schedule, and increasing length of therapy. In some embodiments, a more aggressive therapy comprises administering a combination of drug-based and non-drug-based therapies.
  • a less aggressive therapy than the standard therapy comprises delaying treatment relative to the standard therapy.
  • a less aggressive therapy than the standard therapy comprises administering less treatment (e.g., lower dosage of one or more standard therapy agents) than in the standard therapy.
  • a less aggressive therapy than the standard therapy comprises administering a treatment regimen lacking one or more components of the standard therapy.
  • a less aggressive therapy than the standard therapy comprises administering treatment on a decelerated schedule compared to the standard therapy.
  • a less aggressive therapy than the standard therapy comprises administering no treatment (e.g., no therapeutic agents, watchful waiting, active surveillance, etc.).
  • a less aggressive therapy comprises delaying treatment.
  • a less aggressive therapy comprises selecting and administering less potent drugs. In one embodiment a less aggressive therapy comprises decreasing the frequency of treatment. In one embodiment a less aggressive therapy comprises shortening length of therapy. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs and decreasing drug dosage. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs and decelerating dose schedule. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs and shortening length of therapy. In one embodiment, less aggressive therapy comprises decreasing drug dosage and decelerating dose schedule. In one embodiment, less aggressive therapy comprises decreasing drug dosage and shortening length of therapy. In one embodiment, less aggressive therapy comprises decelerating dose schedule and shortening length of therapy.
  • less aggressive therapy comprises selecting and administering less potent drugs, decreasing drug dosage, and decelerating dose schedule. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs, decreasing drug dosage, and shortening length of therapy. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs, decelerating dose schedule, and shortening length of therapy. In one embodiment, less aggressive therapy comprises decreasing drug dosage, decelerating dose schedule, and shortening length of therapy. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs, decreasing drug dosage, decelerating dose schedule, and shortening length of therapy. In some embodiments, a less aggressive therapy comprises administering only non-drug-based therapies.
  • MSI status of a tumor may further comprise a step of choosing the treatment regimen based on the MSI status (i.e., based on whether the tumor was found to be MSI-H, MSI-L or MSS).
  • kits for determining MSI in a sample, e.g., a tumor sample, comprising the tools to genotype the biomarker panel (e.g., the microsatellite regions shown in the sequences listed in Table 1 and/or Table 2).
  • kits comprise oligonucleotides that specifically identify one or more microsatellites described herein.
  • the oligonucleotide sequences may correspond to fragments of the biomarker nucleic acids.
  • the oligonucleotides can be more than 250, 200, 150, 100, 50, 25, 10, or fewer than 10 nucleotides in length.
  • the kit can contain in separate containers a nucleic acid, control formulations (positive and/or negative), and/or a detectable label, such as but not limited to fluorescein, green fluorescent protein, rhodamine, cyanine dyes, Alexa dyes, luciferase, and radiolabels, among others.
  • Instructions for carrying out the assay can be included in the kit.
  • the kit can contain a nucleic acid substrate array comprising one or more nucleic acid sequences.
  • the nucleic acids on the array specifically identify one or more of the microsatellite regions described herein.
  • the sequence (e.g., homopolymer length) of one or more of the microsatellite regions can be identified by virtue of binding to the array.
  • the substrate array can be on a solid substrate, such as what is known as a "chip.” See, e.g., U.S. Pat. No. 5,744,305.
  • the substrate array can be a solution array; e.g., xMAP (Luminex, Austin, TX), Cyvera (Illumina, San Diego, CA), RayBio Antibody Arrays (RayBiotech, Inc., Norcross, GA), CellCard (Vitra Bioscience, Mountain View, CA) and Quantum Dots' Mosaic (Invitrogen, Carlsbad, CA).
  • a machine-readable storage medium can comprise, for example, a data storage material that is encoded with machine-readable data or data arrays.
  • the data and machine-readable storage medium are capable of being used for a variety of purposes, when using a machine programmed with instructions for using said data. Such purposes include, without limitation, storing, accessing and manipulating information relating to MSI of a subject or population over time, or disease activity characterized by MSI in response to treatment, or for drug discovery.
  • Data comprising the presence of indels can be implemented in computer programs that are executing on programmable computers, which comprise a processor, a data storage system, one or more input devices, one or more output devices, etc.
  • Program code can be applied to the input data to perform the functions described herein, and to generate output information. This output information can then be applied to one or more output devices, according to methods well-known in the art.
  • the computer can be, for example, a personal computer, a microcomputer, or a workstation of conventional design.
  • the computer programs can be implemented in a high-level procedural or object-oriented programming language, to communicate with a computer system.
  • the programs can also be implemented in machine or assembly language.
  • the programming language can also be a compiled or interpreted language.
  • Each computer program can be stored on storage media or a device such as ROM, magnetic diskette, etc., and can be readable by a programmable computer for configuring and operating the computer when the storage media or device is read by the computer to perform the described procedures.
  • Any health-related data management systems of the present teachings can be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium causes a computer to operate in a specific manner to perform various functions, as described herein.
  • the assays disclosed herein can be used to generate a "subject MSI profile.”
  • the subject MSI profiles can then be compared to a reference profile.
  • the biomarker profiles, reference and subject, of embodiments of the present teachings can be contained in a machine-readable medium, such as analog tapes like those readable by a CD-ROM or USB flash media, among others.
  • the machine-readable media can also comprise subject information, e.g., the subject's medical or family history.
  • MSI MSI assay that generates a score useful for determining whether an individual has disease characterized by MSI.
  • Microsatellites are regions of tandem repeats, and can be single nucleotide repeats, known as homopolymers, or dinucleotide repeats.
  • MSI occurs when the number of repeats varies, which can include either insertions or deletions of nucleotide repeats in a microsatellite region.
  • MSI can be evidence of a loss of function in the DNA mismatch repair (MMR) pathway, which can result in a variety of diseases including cancer.
  • a library useful for determining homologous recombination deficiency (HRD) having 54,091 single nucleotide polymorphisms (SNPs) spaced across the entire genome was used to discover homopolymers useful for an MSI assay.
  • the criteria for useful homopolymers were 1) that the homopolymer was within 100 bases of the SNP position, 2) that the homopolymer was 15-20 nucleotide bases in length, and 3) that the average SNP coverage was over 100 independent reads, to increase the likelihood of good coverage of nearby homopolymers.
  • Figure 1A shows a homopolymer that was excluded following secondary selection.
  • Figure 1B shows a homopolymer exemplifying secondary criteria, which was included in the final selection of 35 homopolymers used for the MSI assay.
  • This example thus provides an assay that generates a score useful for determining whether an individual has MSI.
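
The three primary selection criteria listed above (distance to the library SNP, homopolymer length, and average SNP coverage) can be sketched as a simple filter. This is a minimal illustration only, not the procedure actually used: the `Homopolymer` record, its field names, and the function names are hypothetical, "within 100 bases" is assumed to be inclusive and measured from the nearest end of the repeat, and the secondary selection criteria are not modeled because they are not specified in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Homopolymer:
    """Hypothetical record for one candidate mononucleotide repeat."""
    name: str                 # illustrative identifier, not from the source
    start: int                # 1-based coordinate of the first repeated base
    length: int               # number of repeated bases
    snp_position: int         # coordinate of the nearby library SNP
    mean_snp_coverage: float  # average independent reads at that SNP


def distance_to_snp(h: Homopolymer) -> int:
    """Bases between the SNP and the nearest end of the repeat (0 if inside)."""
    end = h.start + h.length - 1
    if h.start <= h.snp_position <= end:
        return 0
    return min(abs(h.snp_position - h.start), abs(h.snp_position - end))


def passes_primary_criteria(h: Homopolymer,
                            max_distance: int = 100,
                            min_length: int = 15,
                            max_length: int = 20,
                            min_coverage: float = 100.0) -> bool:
    """Apply criteria 1-3: SNP proximity, repeat length, and SNP coverage."""
    return (distance_to_snp(h) <= max_distance            # assumed inclusive
            and min_length <= h.length <= max_length
            and h.mean_snp_coverage > min_coverage)       # "over 100" reads


def select_candidates(candidates):
    """Return the candidates that satisfy all three primary criteria."""
    return [h for h in candidates if passes_primary_criteria(h)]


if __name__ == "__main__":
    example = [
        Homopolymer("hp_a", start=1000, length=17, snp_position=1050, mean_snp_coverage=150.0),
        Homopolymer("hp_b", start=1000, length=12, snp_position=1050, mean_snp_coverage=150.0),
        Homopolymer("hp_c", start=1000, length=18, snp_position=1200, mean_snp_coverage=150.0),
        Homopolymer("hp_d", start=1000, length=17, snp_position=1050, mean_snp_coverage=100.0),
    ]
    for h in select_candidates(example):
        print(h.name)
```

In this sketch only `hp_a` is retained: `hp_b` is too short, `hp_c` lies more than 100 bases from its SNP, and `hp_d` has coverage that is not over 100 reads. The thresholds are exposed as parameters so that a different length window or coverage floor can be tried without changing the logic.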

Abstract

Methods for detecting microsatellite instability in nucleic acids derived from a patient sample are provided comprising identifying insertions or deletions in microsatellite regions of the nucleic acid. The methods can be used on samples derived from tumors, and are useful for determining whether the sample has no, intermediate, or high degrees of microsatellite instability.

Description

METHODS FOR MEASURING MICROSATELLITE INSTABILITY
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent Application Serial No. 62/271,184 filed December 22, 2015 and U.S. Provisional Patent Application Serial No. 62/287,042 filed January 26, 2016, the entire contents of which are hereby incorporated by reference.
FIELD OF THE DISCLOSURE
[0002] The present disclosure relates to methods, kits, systems and compositions for detecting microsatellite instability in cancer cells. In particular, the disclosure includes new markers that can be used in, e.g., laboratory assays to detect microsatellite instability with high sensitivity.
BACKGROUND OF THE DISCLOSURE
[0003] Microsatellites are genomic regions containing tandem sequence repeats. While other types of low complexity sequences may be encompassed by the term "microsatellite," most microsatellites are mononucleotide repeats (also called homopolymers) or dinucleotide repeats. Microsatellite instability ("MSI") occurs when one or more of these regions contains more or fewer repeats in a somatic cell than expected based on the number of repeats in the germline. An early example of this is reported in Thibodeau et al., Microsatellite instability in cancer of the proximal colon, SCIENCE (1993) 260:816-819.
[0004] MSI is thought to be caused by slippage of the polymerase enzyme during DNA replication and subsequent failure of the cell to detect and/or correct such slippage, resulting in more or fewer repeats in the daughter genome than in the parent. One cause for this failure of correction is a deficiency in the DNA mismatch repair ("MMR") pathway in the cell. MMR deficient cells tend to accumulate microsatellite mutations, often in parallel to advancement to cancer. MSI can thus be used to characterize cancer cells as MMR-deficient, which may have implications for prognosis, therapy selection, etc.
[0005] MMR-deficiency in tumors may indicate that the underlying cancer is familial in nature (e.g., caused by or contributed to by a germline mutation in an MMR or MMR-related gene). MMR-deficiency may also be used to predict response or resistance to certain chemotherapies (e.g., 5-fluorouracil or alkylating agents such as temozolomide).
[0006] There are hundreds of thousands of microsatellites located throughout the human genome. However, MMR deficiency does not necessarily affect all microsatellites in a given cancer cell. Thus, it is important to carefully select which microsatellites are to be assayed in order to detect MSI with the desired level of sensitivity and specificity. One panel known in the art is sometimes called the Bethesda panel, consisting of three dinucleotide repeats (D2S123, D5S346, D17S250) and two mononucleotide repeats (BAT26, BAT25). See, e.g., Boland et al., A National Cancer Institute Workshop on Microsatellite Instability for cancer detection and familial predisposition: Development of international criteria for the determination of microsatellite instability in colorectal cancer, CANCER RES. (1998) 58:5248-5257. In this panel, a tumor is typically considered MSI-positive if 40% or more of these markers are unstable (i.e., show a repeat length different from expected). This is sometimes referred to as MSI-high or MSI-H. Tumors that test negative for all five markers are sometimes referred to as microsatellite stable ("MSS"). In some cases a tumor that tests positive for one locus (or on <30% of loci) is referred to as MSI-L.
[0007] The Bethesda panel is considered by some to have low sensitivity. See, e.g., Palomaki et al., EGAPP supplementary evidence review: DNA testing strategies aimed at reducing morbidity and mortality from Lynch syndrome, GENETICS IN MEDICINE (2009) 11:42-65. This is important since MSI status can be influential in prognosis (typically better for MSI-H patients) and treatment selection (MSI-H tumors often do not respond to fluorouracil-based adjuvant therapy) and because newly diagnosed colorectal cancer ("CRC") patients are routinely tested for MSI. A further disadvantage lies in the fact that some markers in the Bethesda panel contain long repeats, and a typical PCR product used to detect such long markers may need to be well over 100 base pairs. While these lengths are readily manageable in Sanger sequencing, they can be more difficult in massively parallel sequencing techniques (such as so-called "next generation sequencing" or "NGS"). The shorter reads typical of NGS can make detecting microsatellite instability using the Bethesda panel challenging.
[0008] Thus, there is a clinical need for better markers for MSI, e.g., markers that are more sensitive and/or specific than the Bethesda panel and better suited to analysis on an NGS platform.
BRIEF SUMMARY OF THE DISCLOSURE
[0009] The present disclosure provides methods of analyzing microsatellite regions and detecting microsatellite instability.
[0010] In an embodiment, a method of analyzing microsatellite regions is provided. The method comprises: analyzing DNA derived from a patient sample to determine the nucleotide sequence of the DNA at a plurality of microsatellite regions, wherein (a) the plurality of microsatellite regions comprises at least one test microsatellite region; (b) the at least one test microsatellite region comprise(s) the nucleotide sequence of any one of SEQ ID NOs: 1-35; and (c) the sequence of the at least one test microsatellite region is analyzed to detect at least one indel at a homopolymer subregion comprising: (1) bases 7-26 of SEQ ID NO: 1; (2) bases 11-27 of SEQ ID NO: 2; (3) bases 10-27 of SEQ ID NO: 3; (4) bases 1-6 of SEQ ID NO: 4; (5) bases 10-28 of SEQ ID NO: 5; (6) bases 1-18 of SEQ ID NO: 6; (7) bases 8-28 of SEQ ID NO: 7; (8) bases 7-23 of SEQ ID NO: 8; (9) bases 6-21 of SEQ ID NO: 9; (10) bases 6-21 of SEQ ID NO: 10; (11) bases 7-25 of SEQ ID NO: 11; (12) bases 5-22 of SEQ ID NO: 12; (13) bases 6-25 of SEQ ID NO: 13; (14) bases 10-30 of SEQ ID NO: 14; (15) bases 1-17 of SEQ ID NO: 15; (16) bases 1-20 of SEQ ID NO: 16; (17) bases 8-24 of SEQ ID NO: 17; (18) bases 12-28 of SEQ ID NO: 18; (19) bases 11-26 of SEQ ID NO: 19; (20) bases 1-19 of SEQ ID NO: 20; (21) bases 9-24 of SEQ ID NO: 21; (22) bases 11-28 of SEQ ID NO: 22; (23) bases 7-26 of SEQ ID NO: 23; (24) bases 12-30 of SEQ ID NO: 24; (25) bases 6-25 of SEQ ID NO: 25; (26) bases 8-23 of SEQ ID NO: 26; (27) bases 20-39 of SEQ ID NO: 27; (28) bases 6-23 of SEQ ID NO: 28; (29) bases 7-23 of SEQ ID NO: 29; (30) bases 8-30 of SEQ ID NO: 30; (31) bases 10-25 of SEQ ID NO: 31; (32) bases 7-22 of SEQ ID NO: 32; (33) bases 8-23 of SEQ ID NO: 33; (34) bases 9-27 of SEQ ID NO: 34; or (35) bases 1-21 of SEQ ID NO: 35. In an embodiment, the at least one test microsatellite region is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions. In an embodiment, the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 indels. In an embodiment, between 20 and 33 indels is indicative of MSI. In an embodiment, the sample is a bodily fluid. In an embodiment, the sample is a tumor sample. In an embodiment, the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors. In an embodiment, the indel is detected by next generation sequencing.
[0011] In another embodiment, a method of detecting microsatellite instability levels is provided. The method comprises (a) assaying DNA derived from a patient sample according to claim 1; and (b) detecting (1) high microsatellite instability in a sample in which at least 60% of the plurality of microsatellite regions comprise an indel in the homopolymer subregion; or (2) intermediate or low microsatellite instability in a sample in which fewer than 60% and more than 10% of the plurality of microsatellite regions comprise an indel in the homopolymer subregion; or (3) no microsatellite instability in a sample in which 10% or fewer of the plurality of microsatellite regions comprises an indel in the homopolymer subregion. In an embodiment, the at least one test microsatellite region is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions. In an embodiment, the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 indels. In an embodiment, between 20 and 33 indels is indicative of MSI. In an embodiment, the sample is a bodily fluid. In an embodiment, the sample is a tumor sample. In an embodiment, the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors. In an embodiment, the indel is detected by next generation sequencing.
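The 60% and 10% cut-offs in this embodiment can be sketched as a small classifier. This is an illustration under stated assumptions, not the disclosed implementation: the function name `classify_msi` and the returned labels are hypothetical, and integer arithmetic is used so that exact boundary percentages are not affected by floating-point rounding.

```python
def classify_msi(n_with_indel: int, n_regions: int) -> str:
    """Classify a sample from the share of tested regions carrying an indel.

    n_with_indel: number of tested microsatellite regions with an indel
                  in the homopolymer subregion
    n_regions:    total number of microsatellite regions tested
    """
    if n_regions <= 0 or not 0 <= n_with_indel <= n_regions:
        raise ValueError("counts must satisfy 0 <= n_with_indel <= n_regions, n_regions > 0")

    # Compare percentages without division: p >= 60%  <=>  100*n >= 60*N.
    if 100 * n_with_indel >= 60 * n_regions:
        return "high MSI"
    if 100 * n_with_indel > 10 * n_regions:
        return "intermediate or low MSI"
    return "no MSI"


print(classify_msi(21, 35))  # high MSI (exactly 60%)
print(classify_msi(4, 35))   # intermediate or low MSI (about 11.4%)
print(classify_msi(3, 35))   # no MSI (about 8.6%)
```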
[0012] In another embodiment, a method of detecting microsatellite instability is provided. The method comprises: (a) analyzing at least one microsatellite region present in one of SEQ ID NOs: 1-35 in DNA derived from a patient sample; (b) detecting at least one indel in the at least one microsatellite region; and (c) detecting microsatellite instability in a sample in which the at least one microsatellite region comprises at least one indel in a homopolymer region wherein the at least one microsatellite region comprises: (1) bases 7-26 of SEQ ID NO: 1; (2) bases 11-27 of SEQ ID NO: 2; (3) bases 10-27 of SEQ ID NO: 3; (4) bases 1-6 of SEQ ID NO: 4; (5) bases 10-28 of SEQ ID NO: 5; (6) bases 1-18 of SEQ ID NO: 6; (7) bases 8-28 of SEQ ID NO: 7; (8) bases 7-23 of SEQ ID NO: 8; (9) bases 6-21 of SEQ ID NO: 9; (10) bases 6-21 of SEQ ID NO: 10; (11) bases 7-25 of SEQ ID NO: 11; (12) bases 5-22 of SEQ ID NO: 12; (13) bases 6-25 of SEQ ID NO: 13; (14) bases 10-30 of SEQ ID NO: 14; (15) bases 1-17 of SEQ ID NO: 15; (16) bases 1-20 of SEQ ID NO: 16; (17) bases 8-24 of SEQ ID NO: 17; (18) bases 12-28 of SEQ ID NO: 18; (19) bases 11-26 of SEQ ID NO: 19; (20) bases 1-19 of SEQ ID NO: 20; (21) bases 9-24 of SEQ ID NO: 21; (22) bases 11-28 of SEQ ID NO: 22; (23) bases 7-26 of SEQ ID NO: 23; (24) bases 12-30 of SEQ ID NO: 24; (25) bases 6-25 of SEQ ID NO: 25; (26) bases 8-23 of SEQ ID NO: 26; (27) bases 20-39 of SEQ ID NO: 27; (28) bases 6-23 of SEQ ID NO: 28; (29) bases 7-23 of SEQ ID NO: 29; (30) bases 8-30 of SEQ ID NO: 30; (31) bases 10-25 of SEQ ID NO: 31; (32) bases 7-22 of SEQ ID NO: 32; (33) bases 8-23 of SEQ ID NO: 33; (34) bases 9-27 of SEQ ID NO: 34; or (35) bases 1-21 of SEQ ID NO: 35. In an embodiment, the at least one test microsatellite region is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions. In an embodiment, the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 indels. In an embodiment, between 20 and 33 indels is indicative of MSI. In an embodiment, the sample is a bodily fluid. In an embodiment, the sample is a tumor sample. In an embodiment, the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors. In an embodiment, the indel is detected by next generation sequencing.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] Figure 1A shows a homopolymer that was excluded following secondary selection.
[0014] Figure 1B shows a homopolymer exemplifying secondary criteria, which was included in the final selection of 35 homopolymers used for the MSI assay.
DETAILED DESCRIPTION OF THE DISCLOSURE
[0015] Microsatellite instability (MSI) is evidence of a loss of function in the DNA mismatch repair (MMR) pathway, and the present disclosure is based on the discovery that analyzing certain homopolymer repeats provides a powerful test of MSI.
[0016] The following terms or definitions are provided solely to aid in the understanding of the disclosure. Unless specifically defined herein, all terms used herein have the same meaning as they would to one skilled in the art of the present disclosure. Practitioners are particularly directed to Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, Plainview, N.Y. (1989); and Ausubel et al., Current Protocols in Molecular Biology (Supplement 47), John Wiley & Sons, New York (1999), for definitions and terms of the art. Unless expressly defined otherwise herein, the terms used herein should not be construed to have a scope less than understood by a person of ordinary skill in the art.
[0017] As used herein, "algorithm" refers to any formula, model, mathematical equation, algorithmic, analytical or programmed process, or statistical technique or classification analysis that takes one or more inputs or parameters, whether continuous or categorical, and calculates an output value, index, index value or score. Examples of algorithms include but are not limited to ratios, sums, regression operators such as exponents or coefficients, biomarker value transformations and normalizations (including, without limitation, normalization schemes that are based on clinical parameters such as age, gender, ethnicity, etc.), rules and guidelines, statistical classification models, and neural networks trained on populations. Also of use in the context of MSI as described herein are linear and non-linear equations and statistical classification analyses to determine the relationship between (a) the presence of indels detected in a subject sample and (b) the level of the respective subject's MSI.
[0018] As used herein, the term "analyze" or "analyzing" includes "measure," "measuring," "detect," "detecting," "identify," "identifying," "assay," "assaying," "quantify," or "quantifying," and refers to the process of determining a value or set of values associated with a sample by measurement of indels in a sample, and may further comprise comparing a test sequence to a reference sequence, which may include constituent nucleotides in a sample or set of samples from the same subject or other subject(s), to detect or identify indels.
[0019] As used herein, the term "diagnosis" refers to methods by which a determination can be made as to whether an individual is likely to be suffering from a given disease or condition, including but not limited to diseases or conditions resulting from microsatellite instability. The skilled artisan often makes a diagnosis on the basis of one or more diagnostic indicators, e.g., a biomarker, the presence, absence, amount, or change in amount of which is indicative of the presence, severity, or absence of the condition. Other diagnostic indicators can include patient history; physical symptoms, e.g., unexplained weight loss, fever, fatigue, pains, or skin anomalies; phenotype; genotype; or environmental or heredity factors. A skilled artisan will understand that the term "diagnosis" refers to an increased probability that a certain course or outcome will occur; that is, that a course or outcome is more likely to occur in a patient exhibiting a given characteristic, e.g., the presence or level of a diagnostic indicator, when compared to individuals not exhibiting the characteristic. Diagnostic methods can be used independently, or in combination with other diagnosing methods known in the art to determine whether a course or outcome is more likely to occur in a patient exhibiting a given characteristic.
[0020] As used herein, "disease" can encompass any disorder, condition, sickness, ailment, etc. that manifests in, e.g., a disordered or incorrectly functioning organ, part, structure, or system of the body, and results from, e.g., genetic or developmental errors, MSI, infection, poisons, nutritional deficiency or imbalance, toxicity, or unfavorable environmental factors.
[0021] As used herein, "homopolymer" refers to a microsatellite that is a mononucleotide repeat of at least 6 bases (e.g., a stretch of at least 6 consecutive A, C, T or G residues in the DNA). A "homopolymer region" is a microsatellite region (as defined below) in which the microsatellite is a homopolymer. A "homopolymer subregion" refers to a homopolymer microsatellite located within a larger genomic region (e.g., a homopolymer region). For example, SEQ ID NO: 1 is a microsatellite region of 31 nucleotides as follows:
SEQ ID NO: 1 - 5'-cccctcttttttttttttttttttttctgag-3'
The string of thymines at positions 7-26 in SEQ ID NO: 1 is a homopolymer subregion within the microsatellite or homopolymer region of SEQ ID NO: 1.
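As an illustration of this definition, the following sketch locates homopolymer subregions (runs of at least 6 identical bases) in a sequence and reports them with 1-based, inclusive coordinates. The function name and the regular-expression approach are illustrative assumptions, not part of the disclosure; SEQ ID NO: 1 is written as three concatenated pieces so that the 20-thymine run is explicit.

```python
import re

# SEQ ID NO: 1 (31 nt): 6 flanking bases, 20 thymines, 5 flanking bases.
SEQ_ID_NO_1 = "cccctc" + "t" * 20 + "ctgag"


def homopolymer_subregions(seq: str, min_len: int = 6):
    """Return (start, end, base) for each run of >= min_len identical bases.

    Coordinates are 1-based and inclusive, matching the "bases 7-26" style
    used in the text.
    """
    # ([acgt]) captures one base; \1{n,} requires at least n further copies.
    pattern = re.compile(r"([acgt])\1{%d,}" % (min_len - 1), re.IGNORECASE)
    return [(m.start() + 1, m.end(), m.group(1).lower())
            for m in pattern.finditer(seq)]


print(len(SEQ_ID_NO_1))                     # 31
print(homopolymer_subregions(SEQ_ID_NO_1))  # [(7, 26, 't')]
```

The leading run of four cytosines is shorter than the 6-base minimum, so only the thymine run is reported.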
[0022] As used herein, "indel" refers to a mutation in a nucleic acid whereby one or more nucleotides are either inserted or deleted, resulting in a net gain or loss of nucleotides, which can include any combination of insertions and deletions. Aberrant homopolymer lengths often result from indels.
[0023] The term "Lynch syndrome" as used herein refers to an autosomal dominant genetic condition which has a high risk of colon cancer as well as other cancers including endometrium, ovary, stomach, small intestine, hepatobiliary tract, upper urinary tract, brain, and skin cancer. The increased risk for these cancers is due to inherited mutations that impair DNA mismatch repair. The old name for the condition is HNPCC.
[0024] As used herein, "microsatellite" refers to a genetic locus comprising a short (e.g., 1-20), tandemly repeated sequence motif comprising a minimal total length of 6 bases. A "mononucleotide microsatellite" or "homopolymer" refers to a genetic locus comprising a repeated single nucleotide (e.g., poly-A) and is a specific subclass of microsatellites particularly relevant to the present disclosure. A "dinucleotide microsatellite" refers to a genetic locus comprising a motif of two nucleotides that are tandemly repeated, a "trinucleotide microsatellite" refers to a genetic locus comprising a motif of three nucleotides that are tandemly repeated, and a "tetranucleotide microsatellite" refers to a genetic locus comprising a motif of four nucleotides that are tandemly repeated. Additional microsatellite motifs can comprise penta- and hexanucleotide repeats. A "monomorphic microsatellite" is one in which all (or substantially all) individuals, particularly all individuals of a given population, share the same number of repeat units. This is in contrast to a "polymorphic microsatellite", which is used to refer to microsatellites in which more than ~1% of individuals in a given population display a different number of repeat units in at least one of their alleles (often heterozygous for a major and different minor allele). By way of example, the BAT26 marker is comprised of 26 adenines in more than 99% of ethnic Europeans, whereas alleles with different numbers of adenines at this location (e.g., 15, 20, 22, 23) are seen in up to 25% of ethnic Africans, including African Americans. Thus, BAT26 is a monomorphic microsatellite in Europeans and a polymorphic microsatellite in Africans. When analyzing microsatellites, one may look at genomic DNA of a sample (e.g., genomic DNA of a cancer cell present in or obtained from the subject). "Microsatellite region" refers to the genomic context of a microsatellite: a genomic region containing a microsatellite (i.e., the microsatellite and flanking genomic nucleotides).
[0025] As used herein, "MSI status" means the presence of microsatellite instability (MSI), a clonal or somatic change in the number of repeated DNA nucleotide units in microsatellites. In some embodiments detecting MSI in a cancer cell sample may include classifying MSI status in the cancer cell, in which case the method may include a classification step. As used herein, "classifying MSI status" in a cancer cell sample means categorizing the sample based on its MSI status, e.g., the degree to which it comprises cancer cells harboring molecular features indicative of instability at microsatellite sites.
[0026] As used herein, "next generation sequencing" or "NGS" refers to a variety of high-throughput sequencing technologies that parallelize the sequencing process, producing thousands or millions of sequences at once. NGS is generally conducted with the following steps: First, DNA sequencing libraries are generated by clonal amplification by PCR in vitro; second, the DNA is sequenced by synthesis, such that the DNA sequence is determined by the addition of nucleotides to the complementary strand rather than through chain-termination chemistry typical of Sanger sequencing; third, the spatially segregated, amplified DNA templates are sequenced simultaneously in a massively parallel fashion, typically without the requirement for a physical separation step. NGS parallelization of sequencing reactions can generate hundreds of megabases to gigabases of nucleotide sequence reads in a single instrument run. Unlike conventional sequencing techniques, such as Sanger sequencing, which typically report the average genotype of an aggregate collection of molecules, NGS technologies typically digitally tabulate the sequence of numerous individual DNA fragments (sequence reads discussed in detail below), such that low frequency variants (e.g., variants present at less than about 10%, 5% or 1% frequency in a heterogeneous population of nucleic acid molecules) can be detected. The term "massively parallel" can also be used to refer to the simultaneous generation of sequence information from many different template molecules by NGS.
[0027] NGS strategies can include several methodologies, including, but not limited to: (i) microelectrophoretic methods; (ii) sequencing by hybridization; (iii) real-time observation of single molecules, and (iv) cyclic-array sequencing. Cyclic-array sequencing refers to technologies in which a sequence of a dense array of DNA is obtained by iterative cycles of template extension and imaging-based data collection. Commercially available cyclic-array sequencing technologies include, but are not limited to 454 sequencing, for example, used in 454 Genome Sequencers (Roche Applied Science; Basel), Solexa technology, for example, used in the Illumina Genome Analyzer, Illumina HiSeq, MiSeq, and NextSeq (San Diego, CA), the SOLiD platform (Applied Biosystems; Foster City, CA), the Polonator (Dover/Harvard) and HeliScope Single Molecule Sequencer technology (Helicos; Cambridge, MA). Other NGS methods include single molecule real time sequencing (e.g., Pacific Bio) and ion semiconductor sequencing (e.g., Ion Torrent sequencing). See, e.g., Shendure & Ji, Next Generation DNA Sequencing, NAT. BIOTECH. (2008) 26:1135-1145 for a more detailed discussion of NGS sequencing technologies.
[0028] As used herein, "patient" or "individual" or "subject" generally refers to a human. A subject can be male or female. A subject can be one who has been previously diagnosed or identified as having a disease characterized by MSI. A subject can be one who has already undergone, or is undergoing, a therapeutic intervention for disease characterized by MSI. A subject can also be one who has not been previously diagnosed with a disease characterized by MSI.
[0029] As used herein, "sample" or "biological sample" refers to a physical specimen that is, or is derived from, a biological tissue, including biological fluids. A sample is "derived from" a biological tissue when the sample is the result of some process applied to the biological tissue, e.g., serum is derived from whole blood. Examples include, but are not limited to, biopsy or tissue samples, frozen samples, blood and blood fractions or products (e.g., serum, platelets, red blood cells, and the like), tumor samples, sputum, bronchoalveolar lavage, cultured cells, e.g., primary cultures, explants, and transformed cells, stool, urine, etc. A "biopsy" refers to the process of removing a tissue sample for diagnostic or prognostic evaluation, and to the tissue specimen itself. Any suitable biopsy technique can be applied to the diagnostic methods of the present disclosure. The biopsy technique applied will depend on the tissue type to be evaluated (e.g., lung etc.) and the size and type of the tumor, among other factors. Representative biopsy techniques include, but are not limited to, excisional biopsy, incisional biopsy, needle biopsy, surgical biopsy, and bone marrow biopsy. An "excisional biopsy" refers to the removal of an entire tumor mass with a small margin of normal tissue surrounding it. An "incisional biopsy" refers to the removal of a wedge of tissue that includes a cross-sectional diameter of the tumor. A diagnosis made by endoscopy or fluoroscopy can require a "core-needle biopsy", or a "fine-needle aspiration biopsy" which generally obtains a suspension of cells from within a target tissue. A "bodily fluid" includes all fluids obtained from a mammalian body, either processed (e.g., serum) or unprocessed, which can include, for example, blood, plasma, urine, lymph, gastric juices, bile, serum, saliva, sweat, and spinal and brain fluids.
As used herein, "cancer cell sample" or "tumor sample" means a specimen comprising either at least one cancer cell or biomolecules derived therefrom. Non-limiting examples of such biomolecules include nucleic acids and proteins. Biomolecules "derived" from a cancer cell sample include endogenous molecules extracted from the sample as well as artificially synthesized copies or versions of such endogenous biomolecules. One illustrative, non-limiting example of such artificially synthesized molecules includes PCR amplification products in which nucleic acids ("biomolecules") from the sample serve as PCR templates. "Nucleic acids" of a cancer cell sample include nucleic acids located in a cancer cell or biomolecules derived from a cancer cell.
[0030] As used herein, "score" means a value or set of values selected so as to provide a quantitative measure of a variable or characteristic of a subject's condition or the degree of MSI in a sample, and/or to discriminate, differentiate or otherwise characterize MSI. The value(s) comprising the score can be based on, for example, quantitative data resulting in a measured amount of one or more sample constituents obtained from the subject. In certain embodiments the score can be derived from a single constituent, parameter or assessment, while in other embodiments the score is derived from multiple constituents, parameters and/or assessments. The score can be based upon or derived from an interpretation function; e.g., an interpretation function derived from a particular predictive model using any of various statistical algorithms known in the art. A "change in score" can refer to the absolute change in score, e.g. from one time point to the next, or the percent change in score, or the change in the score per unit time (i.e., the rate of score change).
[0031] As used herein, a "test microsatellite region" is a microsatellite region whose nucleotide sequence is analyzed according to the present disclosure. A test microsatellite region includes non-homopolymer sequences immediately surrounding the homopolymer region that can be used to determine the precise homopolymer position within a genome. SEQ ID NOs: 1-35 are exemplary test microsatellite regions.
[0032] As used herein, the term "treatment" or "therapy" or "therapeutic regimen" includes all clinical management of a subject and interventions, whether biological, chemical, physical, or a combination thereof, intended to sustain, ameliorate, improve, or otherwise alter the condition of a subject. These terms may be used synonymously herein. Treatments include but are not limited to administration of prophylactics or therapeutic compounds (including small molecule and biologic drugs), exercise regimens, physical therapy, dietary modification and/or supplementation, bariatric surgical intervention, administration of therapeutic compounds (prescription or over-the-counter), and any other treatments known in the art as efficacious in preventing, delaying the onset of, or ameliorating disease characterized by MSI. A "response to treatment" includes a subject's response to any of the above-described treatments, whether biological, chemical, physical, or a combination of the foregoing. A "treatment course" relates to the dosage, duration, extent, etc. of a particular treatment or therapeutic regimen. An initial therapeutic regimen as used herein is the first line of treatment.
Microsatellite instability
[0033] The present disclosure relates to analyzing homopolymer repeat regions to determine levels of microsatellite instability (MSI), which can in turn be evidence of a loss of function in the DNA mismatch repair (MMR) pathway.
[0034] In some embodiments the present disclosure provides methods of analyzing microsatellite regions comprising analyzing DNA derived from a patient sample to determine the nucleotide sequences of the DNA at a plurality of microsatellite regions. In some embodiments the present disclosure provides methods of detecting microsatellite instability comprising analyzing microsatellite regions.
[0035] The microsatellite regions can be any region of a genome known to contain a microsatellite as defined herein. The microsatellite region that is analyzed (or tested) to determine a specific nucleotide sequence is a test microsatellite region, which may contain nucleotide sequences immediately adjacent to the homopolymer subregion in order to determine its precise location within the genome.
[0036] Test microsatellite regions can be any sequence in a genome that includes a microsatellite, which can include the test microsatellite regions listed in Table 1. The homopolymer subregions within each SEQ ID are indicated.
Table 1
[Table 1 is reproduced as an image in the original publication.]
[0037] Table 2 provides expanded sequence context for the homopolymers listed in Table 1, wherein SEQ ID NOs: 36-70 show additional 5' and 3' genomic sequence surrounding the sequences in SEQ ID NOs: 1-35 (e.g., SEQ ID NO: 36 shows additional flanking 5' and 3' genomic sequence surrounding SEQ ID NO: 1, SEQ ID NO: 37 shows additional flanking 5' and 3' genomic sequence surrounding SEQ ID NO: 2, and so forth). The homopolymer subregions within each SEQ ID are indicated.
Table 2
[Table 2 is reproduced as images in the original publication.]
[0038] In some embodiments of any of the methods described herein the method comprises determining whether nucleic acids of the sample have an aberrant homopolymer length for a plurality of microsatellite regions comprising at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions chosen from Table 1 or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions chosen from Table 2.
[0039] In some embodiments of any of the methods described herein the method comprises determining whether at least one indel is present in any homopolymer subregion included in the test microsatellite regions chosen from Table 1 or Table 2, wherein said at least one indel comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, or more indels.
[0040] The presence of an indel can be established by comparing the nucleotide sequence of the test microsatellite region to a reference, which in some embodiments includes the nucleotide sequence of DNA in which the indel is not present (e.g., a known reference DNA sequence, the sequence of germline DNA from the subject with the tumor, etc.). In the case of homopolymers, in some embodiments the presence of an indel can be established by comparing the measured/detected length of the homopolymer subregion in the test microsatellite region to the expected length of the homopolymer (e.g., the length of this specific homopolymer in a reference genome).
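The length comparison described in this paragraph can be sketched as follows. This is a simplified illustration, not the disclosed assay: the helper names are hypothetical, the observed sequence is assumed to contain both flanks exactly as in the reference, and real sequencing data would need per-read handling and error tolerance. The flanks and expected length are taken from SEQ ID NO: 1 (20 thymines between "cccctc" and "ctgag").

```python
import re
from typing import Optional


def observed_run_length(read: str, left_flank: str, right_flank: str,
                        base: str) -> Optional[int]:
    """Length of the run of `base` between the two flanks, or None if absent."""
    pattern = (re.escape(left_flank)
               + "((?:" + re.escape(base) + ")*)"
               + re.escape(right_flank))
    match = re.search(pattern, read)
    return len(match.group(1)) if match else None


def indel_size(read: str, left_flank: str, right_flank: str, base: str,
               expected_length: int) -> Optional[int]:
    """Observed minus expected length: negative = deletion, positive = insertion,
    0 = no indel, None = flanks not found in the read."""
    observed = observed_run_length(read, left_flank, right_flank, base)
    return None if observed is None else observed - expected_length


LEFT, RIGHT, BASE, EXPECTED = "cccctc", "ctgag", "t", 20

reference_like = LEFT + BASE * 20 + RIGHT  # same length as the reference
shortened = LEFT + BASE * 18 + RIGHT       # two thymines lost

print(indel_size(reference_like, LEFT, RIGHT, BASE, EXPECTED))  # 0
print(indel_size(shortened, LEFT, RIGHT, BASE, EXPECTED))       # -2
```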
[0041] Determining whether nucleic acids of a sample have an aberrant homopolymer length for at least one of the test microsatellite regions described herein, for example the test microsatellite regions of Table 1, can comprise assaying the sample and comparing the test microsatellite region or test microsatellite sequence(s) obtained to a reference sequence from a sample where MSI is not present in order to identify a difference in length between the test microsatellite region obtained from a patient and the reference sequence. In some embodiments this determining comprises obtaining sequence data from such an assay and performing the comparison to identify the difference in length. In some embodiments this determining comprises obtaining data from such a comparison and identifying a difference in length between the test sequence and the reference sequence.
[0042] Microsatellite instability can be detected, or measured, by identifying indels as described herein. This measurement may be quantitative to varying degrees— e.g., assigning a score (e.g., numerical score) that lies along a substantially continuous spectrum and that reflects the quantitative level of MSI (e.g., the number or proportion of microsatellite markers showing instability). In some embodiments this measurement may be a blending of two approaches— e.g., assigning such a score (e.g., numerical score) to a sample and then assigning MSI status in the sample into a discrete set of categories according to that score.
[0043] As a non-limiting example of this blended measurement, a score below X may be categorized as MSI-negative, a score above Y may be categorized as MSI-high, and a score between X and Y may be categorized as MSI-low or intermediate. In some embodiments Y (in this case the number of test microsatellite regions chosen from Table 1 or Table 2 having an aberrant homopolymer length) = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35. In some embodiments Y (in this case the percentage of test microsatellite regions chosen from Table 1 or Table 2 having an aberrant homopolymer length) = 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or 100%. In some embodiments X (the percentage of test microsatellite regions chosen from Table 1 or Table 2 having an aberrant homopolymer length) = 0%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14% or 15%. In some embodiments an MSI-low status is detected (e.g., the sample is classified as MSI-low, intermediate or equivalent) when Y = 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14% or 15% and X = 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, or 70%.
[0044] In some embodiments, high microsatellite instability is detected in a sample in which at least 60% of the plurality of microsatellite regions comprise an indel in the homopolymer subregion. In some embodiments, intermediate or low microsatellite instability is detected in a sample in which fewer than 60% and more than 10% of the plurality of microsatellite regions comprise an indel in the homopolymer subregion. In some embodiments, no microsatellite instability is detected in a sample in which 10% or fewer of the plurality of microsatellite regions comprise an indel in the homopolymer subregion.
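The percentage thresholds in this embodiment can be sketched as a simple classifier. This is an illustrative sketch only. The function name and the labels MSI-H, MSI-L and MSS are chosen here for convenience, and the 60% and 10% defaults are just one of the threshold combinations contemplated above. Integer arithmetic is used so that a sample exactly at a threshold is classified without floating-point rounding.

```python
def classify_msi(n_unstable: int, n_tested: int,
                 high_pct: int = 60, low_pct: int = 10) -> str:
    """Classify MSI status from the share of regions with an indel.

    n_unstable: regions whose homopolymer subregion contains an indel.
    n_tested:   total microsatellite regions evaluated.
    """
    if n_tested <= 0 or not 0 <= n_unstable <= n_tested:
        raise ValueError("need 0 <= n_unstable <= n_tested and n_tested > 0")
    # Compare n_unstable / n_tested against the cutoffs without dividing.
    if n_unstable * 100 >= high_pct * n_tested:
        return "MSI-H"   # at least 60% of regions unstable
    if n_unstable * 100 > low_pct * n_tested:
        return "MSI-L"   # more than 10% and fewer than 60%
    return "MSS"         # 10% or fewer


# Hypothetical 35-region panel.
for unstable in (3, 4, 20, 21):
    print(unstable, classify_msi(unstable, 35))
```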
[0045] In some embodiments the genome in which MSI status is being assessed can be from a cancer cell present in a tumor sample as defined herein. The cancer cell can include any type of malignant solid tumor known to have metastatic potential, including without limitation, lung cancer (e.g., non-small cell lung cancer (NSCLC)), bone cancer, pancreatic cancer, cancer of the head or neck, melanoma, skin cancer, lymphoblast cancer, uterine cancer, ovarian cancer, cervical cancer, colorectal cancer, gastric cancer, colon cancer, stomach cancer, breast cancer, endometrial cancer, thyroid cancer, prostate cancer, rectal cancer, bladder cancer, kidney cancer (e.g., renal cell carcinoma), liver cancer (e.g., hepatocellular carcinoma), and cancers of the central nervous system (CNS) (e.g., glioma, glioblastoma multiforme or astrocytoma).
[0046] MSI status can be determined independent of cancer type. Thus, in principle, diagnosing MSI status can be done for every type of cancer. However, since MSI is most often present in cancers with a deficiency in mismatch repair genes, the methods of detecting MSI status described herein may be particularly useful in a tumor sample of a cancer where MMR deficiency occurs more frequently than in other types of cancer. Accordingly, the cancer sample may be a sample selected from colorectal cancer, endometrial cancer, ovarian cancer, gastric cancer, and leukemia.
[0047] Alternative approaches focusing on the aberrant DNA repair processes of MMR-deficient tumors have therefore been proposed. Synthetic lethality approaches have shown, for instance, that increased oxidative damage (by methotrexate exposure or PINK1 silencing) or interference with the base excision repair (BER) pathway (by DNA polymerase γ or β inhibition) sensitizes MMR-deficient tumors. In particular, in MMR-deficient tumors, oxidative damage induces 8-oxoguanine (8-oxoG) DNA lesions, which fail to be sufficiently repaired either by the BER or MMR pathway, generating mainly GC to TA dinucleotide transversions at the DNA level, leading to cell death. Additionally, it has been hypothesized that there is a maximum mutation frequency that a tumor can tolerate, above which a further increase in mutations would be detrimental. It has therefore been proposed to additionally treat MMR-deficient tumors with mutagenic nucleoside analogues until a critical level of mutations is obtained, resulting in error catastrophe-like ablation of the tumor. MMR-deficient tumors are often also resistant to targeted cancer therapies, including anti-EGFR and anti-VEGF therapies. Although the precise reasons for this resistance are unknown, the presence of secondary mutations in established tumor driver genes as a consequence of MMR-deficiency might be responsible. For instance, MMR-deficient tumors can acquire mutations in double-strand break repair genes (e.g., MRE11, ATR and RAD50), known oncogenes or tumor suppressors (e.g., PIK3CA or PTEN). Since presence of MMR-deficiency mainly in colorectal and endometrial tumors represents a familial form of cancer, and since tumors exhibiting mutation spectra characteristic of MMR-deficiency, diagnostic tests assessing MMR-deficiency are commonly used.
Detecting microsatellite regions
[0048] The present methods generally relate to the detection of indels in genomic microsatellites. Techniques for preparing nucleic acids in or derived from a cancer cell sample in a form that is suitable for indel detection can include, but are not limited to, PCR, detectable probes, sequencing and single base extensions, reverse transcriptase-PCR (RT-PCR), real-time PCR, allele-specific hybridization, reverse transcription quantitative real-time PCR (RT-qPCR), ligase chain reaction, strand displacement amplification (SDA), self-sustained sequence replication (3SR), or in situ PCR. Exemplary, but non-limiting, techniques for analysis of nucleic acid samples to detect indels are briefly described below.
DNA Sequencing
[0049] Indels (e.g., homopolymer lengths) can be detected by direct sequencing. Non-limiting examples of sequence analysis useful in methods of the present disclosure include NGS (e.g., Chen et al, Genome Res. (2008) 18:1143-1149; Srivatsan et al, PLoS Genet. (2008) 4:e1000139), Maxam-Gilbert sequencing, Sanger sequencing, capillary array DNA sequencing, thermal cycle sequencing (Sears et al, Biotechniques (1992) 13:626-633), solid-phase sequencing (Zimmerman et al, Methods Mol. Cell Biol. (1992) 3:39-42), sequencing with mass spectrometry such as matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF/MS; Fu et al, Nat. Biotechnol. (1998) 16:381-384), sequencing by hybridization (Chee et al, Science (1996) 274:610-614; Drmanac et al, Science (1993) 260:1649-1652; Drmanac et al, Nat. Biotechnol. (1998) 16:54-58), Polony sequencing (Porreca et al, Curr. Protoc. Mol. Biol. (2006) Chp. 7; Unit 7.8), ion semiconductor sequencing (Elliott et al, J. Biomol. Tech. 1:24-30 (2010)), DNA nanoball sequencing (Kaji et al, Chem. Soc. Rev. (2010) 39:948-56), single molecule real-time sequencing (Flusberg et al, Nat. Methods (2010) 6:461-5), or nanopore DNA sequencing (Wanunu, Phys. Life Rev. (2012) 9:125-58).
[0050] Other detection methods include Pyrosequencing™ of oligonucleotide-length products. Such methods often employ amplification techniques such as PCR. For example, in pyrosequencing, a sequencing primer is hybridized to a single stranded, PCR-amplified, DNA template; and incubated with the enzymes, DNA polymerase, ATP sulfurylase, luciferase and apyrase, and the substrates, adenosine 5' phosphosulfate (APS) and luciferin. The first of four deoxynucleotide triphosphates (dNTP) is added to the reaction. DNA polymerase catalyzes the incorporation of the deoxynucleotide triphosphate into the DNA strand, if it is complementary to the base in the template strand. Each incorporation event is accompanied by release of pyrophosphate (PPi) in a quantity equimolar to the amount of incorporated nucleotide. ATP sulfurylase quantitatively converts PPi to ATP in the presence of adenosine 5' phosphosulfate. This ATP drives the luciferase-mediated conversion of luciferin to oxyluciferin that generates visible light in amounts that are proportional to the amount of ATP. The light produced in the luciferase-catalyzed reaction is detected by a charge coupled device (CCD) camera and seen as a peak in a Pyrogram™. Each light signal is proportional to the number of nucleotides incorporated. Apyrase, a nucleotide degrading enzyme, continuously degrades unincorporated dNTPs and excess ATP. When degradation is complete, another dNTP is added.
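Because each light signal is proportional to the number of nucleotides incorporated, a homopolymer appears as a single peak whose height reflects its length. The following idealized sketch illustrates that relationship. It is an assumption-laden model, not a description of any instrument's software: the function name is hypothetical, the input is the strand to be synthesized (the complement of the template), signal is taken as exactly one unit per incorporated nucleotide, and noise is ignored.

```python
from typing import List, Tuple


def simulate_pyrogram(strand_to_synthesize: str,
                      dispensation_order: str) -> List[Tuple[str, int]]:
    """Return (dispensed nucleotide, idealized peak height) per dispensation.

    Peak height equals the number of consecutive bases incorporated when
    that nucleotide is dispensed; 0 means nothing was incorporated.
    """
    position = 0
    peaks = []
    for nucleotide in dispensation_order:
        incorporated = 0
        # The polymerase keeps extending while the next base matches.
        while (position < len(strand_to_synthesize)
               and strand_to_synthesize[position] == nucleotide):
            incorporated += 1
            position += 1
        peaks.append((nucleotide, incorporated))
    return peaks


# Hypothetical target with a five-base thymine homopolymer.
print(simulate_pyrogram("ACTTTTTG", "ACTG"))
# [('A', 1), ('C', 1), ('T', 5), ('G', 1)]
```

In this model a deletion of one thymine would lower the T peak from 5 to 4. In practice, peak height is generally less reliable for long homopolymers, so this linear model should be read as a simplification.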
[0051] Another similar method for characterizing homopolymer sequence/length does not require use of a complete PCR, but typically uses only the extension of a primer by a single, fluorescence-labeled dideoxyribonucleic acid molecule (ddNTP) that is complementary to the nucleotide to be investigated. The nucleotide at the polymorphic site can be identified via detection of a primer that has been extended by one base and is fluorescently labeled (e.g., Kobayashi et al., MOL. CELL. PROBES (1995) 9:175-182).
[0052] Sample DNA can be analyzed using techniques including, without limitation, electrophoretic analysis or sequence analysis. Non-limiting examples of electrophoretic analysis include slab gel electrophoresis such as agarose or polyacrylamide gel electrophoresis, capillary electrophoresis, and denaturing gradient gel electrophoresis (DGGE). Other methods of nucleic acid analysis include, but are not limited to, hybridization with allele-specific oligonucleotide probes (Wallace et al, NUCL. ACIDS RES. (1978) 6:3543-3557), including immobilized oligonucleotides (Saiki et al, PNAS (1989) 86:6230-6234), oligonucleotide arrays (Maskos & Southern, NUCL. ACIDS RES. (1993) 21:2269-2270), oligonucleotide-ligation assay (OLA) (Landegren et al, SCIENCE (1988) 241:1077), allele-specific ligation chain reaction (LCR) (Barany, PNAS (1991) 88:189-193), gap-LCR (Abravaya et al, NUCL. ACIDS RES. (1995) 23:675-682), single-strand-conformation-polymorphism detection (Orita et al, GENOMICS (1983) 5:874-879), RNAase cleavage at mismatched base-pairs (Myers et al, SCIENCE (1985) 230:1242), genetic bit analysis (GBA) (Nikiforov et al, NUCL. ACIDS RES. (1994) 22:4167-4175), in situ hybridization, and denaturing high performance liquid chromatography (DHPLC) (Kim et al, GENETIC TESTING (2008) 12:295-298).
Allele-Specific Hybridization
[0053] This technique, also commonly referred to as allele specific oligonucleotide hybridization (ASO) (e.g., Stoneking et al, AM. J. HUM. GENET. (1991) 48:70-382; Saiki et al, NATURE (1986) 324:163-166; EP 235,726; and WO/1989/011548), relies on distinguishing between two DNA molecules differing, typically, by one base by hybridizing an oligonucleotide probe that is specific for one of the variants to nucleic acid in or derived from a sample. This method typically employs short oligonucleotides, e.g., 15-20 bases in length, but may also employ longer oligonucleotides, e.g., 20-100 bases in length. The probes are designed to differentially hybridize to one variant versus another. Hybridization conditions should be sufficiently stringent that there is a significant difference in hybridization intensity between alleles, producing an essentially binary response, whereby a probe hybridizes efficiently to only one of the alleles. Some probes are designed to hybridize to a segment of target DNA such that the variable site aligns with a central position (e.g., in a 15-base oligonucleotide at the 7 position; in a 16-base oligonucleotide at either the 8 or 9 position) of the probe, but this design is not required.
[0054] The amount and/or presence of an allele can be determined by measuring the amount of allele-specific oligonucleotide that is hybridized to the sample. Typically, the oligonucleotide is labeled with a detectable moiety, e.g., a fluorescent label. In one specific embodiment, the homopolymer of SEQ ID NO:1 can be detected as follows. First, DNA from a cancer cell sample may be extracted and cleaved or sheared into roughly uniform fragments of a desired length (e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, or 1,000 or more nucleotides). This sheared DNA can be enriched for fragments containing the test microsatellite region(s) of interest (e.g., SEQ ID NO:1) by either (a) oligonucleotide probe hybridization capture (specific for the test microsatellite region(s) of interest) followed by amplification (which can be but need not be specific) or (b) PCR amplification using primers specific for the test microsatellite region(s) of interest. One or more oligonucleotides that hybridize only (or only efficiently hybridize) to a specific homopolymer of a specific length for each test microsatellite region (e.g., the 20 thymines at positions 7-26 in SEQ ID NO:1) can be applied to this sample of enriched DNA. After stringent hybridization and washing conditions, hybridization levels (e.g., fluorescence intensity) can be measured for each allele-specific oligonucleotide, thereby detecting and optionally quantitating each homopolymer length in the sample. If only the oligonucleotide specific for the reference homopolymer length for a test microsatellite shows significant hybridization, then that test microsatellite can be counted as not having an aberrant homopolymer length. If, however, an oligonucleotide specific for an aberrant homopolymer length (i.e., length that differs from the reference length) for a test microsatellite shows significant hybridization, then that test microsatellite can be counted as having an aberrant homopolymer length.
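The counting rule at the end of this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the intensity values, and the numeric cutoff standing in for "significant hybridization" are all hypothetical, and the disclosure does not specify how significance is determined.

```python
from typing import Dict


def has_aberrant_length(intensity_by_length: Dict[int, float],
                        reference_length: int,
                        significance_cutoff: float) -> bool:
    """Apply the counting rule to one test microsatellite.

    intensity_by_length maps the homopolymer length targeted by each
    allele-specific oligonucleotide to its measured hybridization signal.
    Returns True when any probe for a non-reference length reaches the
    (hypothetical) significance cutoff.
    """
    return any(
        signal >= significance_cutoff
        for length, signal in intensity_by_length.items()
        if length != reference_length
    )


# Hypothetical fluorescence readings for a region whose reference
# homopolymer is 20 bases long.
stable = {20: 950.0, 19: 35.0, 18: 40.0}
unstable = {20: 500.0, 19: 30.0, 18: 610.0}
print(has_aberrant_length(stable, 20, 200.0))    # False
print(has_aberrant_length(unstable, 20, 200.0))  # True
```

The per-region results from such a rule could then feed the count or percentage of regions with an aberrant homopolymer length.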
[0055] Suitable assay formats for detecting hybrids formed between probes and target nucleic acid sequences in a sample include the immobilized target (dot-blot) format and immobilized probe (reverse dot-blot or line-blot) assay formats. Dot blot and reverse dot blot assay formats are described in U.S. Pat. Nos. 5,310,893; 5,451,512; 5,468,613; and 5,604,099; each incorporated herein by reference.
[0056] In a dot-blot format, amplified target DNA is immobilized on a solid support, such as a nylon membrane. The membrane-target complex is incubated with labeled probe under suitable hybridization conditions, unhybridized probe is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound probe.
[0057] In the reverse dot-blot (or line-blot) format, the probes are immobilized on a solid support, such as a nylon membrane or a microtiter plate. The target DNA is labeled, typically during amplification by the incorporation of labeled primers. One or both of the primers can be labeled. The membrane-probe complex is incubated with the labeled amplified target DNA under suitable hybridization conditions, unhybridized target DNA is removed by washing under suitably stringent conditions, and the membrane is monitored for the presence of bound target DNA.
Allele-Specific Primers
[0058] Indels can also be detected using allele-specific amplification or primer extension methods. These reactions typically involve use of primers that are designed to selectively target one or another variant, where selectivity is achieved by failure to amplify target DNA that contains a mismatch corresponding to the 3'-end of a primer. The presence of a mismatch affects the ability of a polymerase to extend a primer when the polymerase lacks error-correcting activity. For example, to detect an allele sequence using an allele-specific amplification- or extension-based method, a primer complementary to a reference allele of a microsatellite (i.e., without indel) is designed such that the 3'-terminal nucleotide hybridizes with the sequence containing the reference number/length of repeats. The presence of the particular allele can be determined by the ability of the primer to initiate extension. If the 3'-terminus is mismatched (e.g., the sample contains an aberrant homopolymer length as compared to the reference), extension is impeded.
[0059] The primer can be used in conjunction with a second primer in an amplification reaction. The second primer can hybridize at a site unrelated to the homopolymer. Amplification proceeds from the two primers leading to a detectable product signifying the particular allelic form (e.g., specific homopolymer of a specific length) is present. Allele-specific amplification- or extension-based methods are described in, for example, WO/1993/022456; U.S. Pat. Nos. 5,137,806; 5,595,890; 5,639,611; and 4,851,331.
[0060] Using allele-specific amplification-based genotyping, identification of the alleles requires only detection of the presence or absence of amplified target sequences. Methods for the detection of amplified target sequences include, but are not limited to, gel electrophoresis and probe hybridization assays (e.g., oligonucleotide microarray).
[0061] In an alternative probe-less method, the amplified nucleic acid is detected by monitoring the increase in the total amount of double-stranded DNA in the reaction mixture as described, e.g., in U.S. Pat. No. 5,994,056. In this example the detection of double-stranded target DNA relies on the increased fluorescence various DNA-binding dyes, e.g., SYBR Green, exhibit when bound to double-stranded DNA.
[0062] Allele-specific amplification methods can be performed in a reaction that employs multiple allele-specific primers to target particular alleles. Primers for such multiplex applications are generally labeled with distinguishable labels or are selected such that the amplification products produced from the alleles are distinguishable by size. Thus, for example, both alleles in a single sample can be identified using a single amplification by gel analysis of the amplification product.
[0063] As in the case of allele-specific probes, an allele-specific oligonucleotide primer may be exactly complementary to one of the variant alleles in the hybridizing region or may have some mismatches at positions other than the 3'-terminus of the oligonucleotide, which mismatches occur at non-variable sites in both allele sequences.
Detectable Probes
[0064] Genotyping can be performed using a "TaqMan™" or "5'-nuclease assay", as described in U.S. Pat. Nos. 5,210,015; 5,487,972; and 5,804,375; and Holland et al, PROC. NATL. ACAD. SCI. (1988) 88:7276-7280. In the TaqMan™ assay, labeled detection probes that hybridize within the amplified region are added during the amplification reaction. The probes are modified so as to prevent the probes from acting as primers for DNA synthesis. The amplification is performed using a DNA polymerase having 5'- to 3'-exonuclease activity. During each synthesis step of the amplification, any probe which successfully hybridizes to the target nucleic acid downstream from the primer being extended is degraded by the 5'- to 3'-exonuclease activity of the DNA polymerase. Thus, the synthesis of a new target strand also results in the degradation of a probe, and the accumulation of degradation product provides a measure of the synthesis of target sequences.
[0065] The hybridization probe can be an allele-specific probe that discriminates between alleles with and without indels, similar to the allele-specific oligonucleotides described above. Alternatively, the method can be performed using an allele-specific primer, similar to those described above, and a labeled probe that binds to amplified product. The probes can be at least about 12, 15, 16, 18, 20, 22, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 or more nucleotide portions of a contiguous sequence comprising the homopolymer of interest (e.g., at least about 12, 15, 16, 18, 20, 22, 24, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 or more nucleotide portions of the sequences listed in SEQ ID NOs:1-70). The probes can be produced by, for example, chemical synthesis, PCR amplification, generation from longer polynucleotides using restriction enzymes, etc. The probes can be made completely complementary to the target nucleic acid or portion thereof (e.g., to all or a portion of any of SEQ ID NOs:1-70). Therefore, usually high stringency conditions are desirable in order to prevent or at least minimize off-target hybridization. Conditions of high stringency are generally suitable for applications where the probes are complementary to regions of the target which lack heterogeneity. The stringency of hybridization is determined by a number of factors during hybridization and during the washing procedure, including temperature, ionic strength, length of time, and concentration of formamide (Sambrook et al. "Molecular Cloning; A Laboratory Manual," Second Edition (Cold Spring Harbor Press, Cold Spring Harbor, N.Y., 1989)).
[0066] Nucleic acid probes, or alternatively nucleic acid from the samples, can be provided in solution for such assays, or can be affixed to a support (e.g., solid or semi-solid support). Examples of supports that can be used include nitrocellulose (e.g., in membrane or microtiter well form), polyvinyl chloride (e.g., in sheets or microtiter wells), polystyrene latex (e.g., in beads or microtiter plates), polyvinylidene fluoride, diazotized paper, nylon membranes, activated beads, and Protein A beads.
[0067] Any method suitable for detecting degradation product (e.g., fluorescence) can be used in a 5'-nuclease assay. The nucleic acid from the sample can be subjected to gel electrophoresis or other size separation techniques; alternatively, the nucleic acid sample can be dot blotted without size separation. Often, the detection probe is labeled with two fluorescent dyes, one of which is capable of quenching the fluorescence of the other dye. The dyes are attached to the probe, usually one attached to the 5'-terminus and the other attached to an internal site, such that quenching occurs when the probe is in an unhybridized state and such that cleavage of the probe by the 5'- to 3'-exonuclease activity of the DNA polymerase occurs in between the two dyes. Amplification results in cleavage of the probe between the dyes with a concomitant elimination of quenching and an increase in the fluorescence observable from the initially quenched dye. The accumulation of degradation product is monitored by measuring the increase in reaction fluorescence. U.S. Pat. Nos. 5,491,063 and 5,571,673, both incorporated herein by reference, describe alternative methods for detecting the degradation of probe which occurs concomitant with amplification.
[0068] Probes detectable upon a secondary structural change are also suitable for detection of a test locus variant. Exemplary secondary structure or stem-loop structure probes include molecular beacons or Scorpion® primer/probes. Molecular beacon probes are single-stranded oligonucleic acid probes that can form a hairpin structure in which a fluorophore and a quencher are usually placed on the opposite ends of the oligonucleotide. At either end of the probe short complementary sequences allow for the formation of an intramolecular stem, which enables the fluorophore and the quencher to come into close proximity. The loop portion of the molecular beacon is complementary to a target nucleic acid of interest. Binding of this probe to its target nucleic acid of interest forms a hybrid that forces the stem apart. This causes a conformation change that moves the fluorophore and the quencher away from each other and leads to a more intense fluorescent signal. See, e.g., Tyagi & Kramer, NAT. BIOTECHNOL. (1996) 14:303-308; Tyagi et al., NAT. BIOTECHNOL. (1998) 16:49-53; Piatek et al., NAT. BIOTECHNOL. (1998) 16:359-363; Marras et al, GENETIC ANALYSIS: BIOMOLECULAR ENGINEERING (1999) 14:151-156; Täpp et al, BIOTECHNIQUES (2000) 28:732-738.
Single-Strand Conformation Polymorphism Analysis
[0069] Alleles of target sequences can be differentiated using single-strand conformation polymorphism analysis, which identifies base differences by alteration in electrophoretic migration of single stranded PCR products, as described, e.g., in Orita et al., PROC. NAT. ACAD. SCI. (1989) 86:2766-2770. Amplified PCR products can be generated as described above, and heated or otherwise denatured, to form single stranded amplification products. Single-stranded nucleic acids may refold or form secondary structures which are partially dependent on the base sequence. The different electrophoretic mobilities of single-stranded amplification products can be related to base-sequence difference between alleles of target genes.
[0070] Indel detection methods often employ labeled oligonucleotides. Oligonucleotides can be labeled by incorporating a label detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. Useful labels include fluorescent dyes, radioactive labels, e.g., 32P, electron-dense reagents, enzymes, such as peroxidase or alkaline phosphatase, biotin, or haptens and proteins for which antisera or monoclonal antibodies are available. Labeling techniques are well known in the art (see e.g., Sambrook et al., supra).
Therapeutic regimens
[0071] The present disclosure also provides methods of administering, recommending, prescribing, etc. specific therapeutic regimens to patients whose samples are found to have specific MSI status as disclosed herein. These embodiments of the present disclosure thus will provide patient-specific biological information, which will be informative for therapy selection and will facilitate therapy response prediction.
Reference Standards for Treatment
[0072] In many embodiments, the levels of MSI in a sample are compared to a reference ("reference standard" or "reference level") in order to direct treatment decisions. Indel determinations in homopolymers can be manipulated into a score, which can represent MSI status. The reference standard used for any embodiment disclosed herein may comprise average, mean, or median levels of microsatellite instability (or average, mean, or median numbers of aberrant homopolymer lengths) in a control population. The reference standard may additionally comprise cutoff values or any other statistical attribute of the control population, or earlier time points of the same subject, such as a standard deviation from the mean levels of aberrant homopolymers. In some embodiments, the control population may comprise healthy individuals, cancer patients having a particular response profile, or the same test patient prior to the administration of any or a specific therapy.
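One way such a reference standard could be expressed numerically is a cutoff set at the control mean plus a multiple of the standard deviation. The sketch below is illustrative only. The function names, the control counts, and the choice of two standard deviations are assumptions made here and are not values given in this disclosure.

```python
import statistics
from typing import Sequence


def reference_cutoff(control_counts: Sequence[float],
                     n_standard_deviations: float = 2.0) -> float:
    """Cutoff = control mean + k * sample standard deviation.

    control_counts holds the number of aberrant homopolymer lengths
    observed in each member of a (hypothetical) control population.
    """
    mean = statistics.mean(control_counts)
    sd = statistics.stdev(control_counts)  # requires at least two values
    return mean + n_standard_deviations * sd


def exceeds_reference(sample_count: float, cutoff: float) -> bool:
    """True when the test sample's count is above the reference cutoff."""
    return sample_count > cutoff


# Hypothetical control population and test samples.
controls = [0, 1, 2, 1, 1]
cutoff = reference_cutoff(controls)
print(round(cutoff, 3), exceeds_reference(3, cutoff), exceeds_reference(2, cutoff))
```

A median-based or percentile-based cutoff would fit the same interface; the mean and standard deviation are used here only because the paragraph names them explicitly.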
Reference Therapy for Treatment
[0073] In some embodiments, a test patient is treated more or less aggressively than a reference therapy based on the difference between the test patient's level of MSI (e.g., level of aberrant homopolymer lengths) and the reference level of MSI. In some embodiments a reference therapy is any therapy that is the standard of care for the patient's disease. The standard of care can vary temporally and geographically, and a skilled person can easily determine the appropriate standard of care by consulting the relevant medical literature.
[0074] In some embodiments, a more aggressive therapy than the standard therapy comprises beginning treatment earlier than in the standard therapy. In some embodiments, a more aggressive therapy than the standard therapy comprises administering additional treatments beyond the standard therapy. In some embodiments, a more aggressive therapy than the standard therapy comprises administering alternative treatments instead of the standard therapy. In some embodiments, a more aggressive therapy than the standard therapy comprises treating on an accelerated schedule compared to the standard therapy. In one embodiment a more aggressive therapy comprises increased length of therapy. In one embodiment a more aggressive therapy comprises increased frequency of the dose schedule. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs and increasing drug dosage. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs and accelerating dose schedule. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs and increasing length of therapy. In one embodiment, more aggressive therapy comprises increasing drug dosage and accelerating dose schedule. In one embodiment, more aggressive therapy comprises increasing drug dosage and increasing length of therapy. In one embodiment, more aggressive therapy comprises accelerating dose schedule and increasing length of therapy. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs, increasing drug dosage, and accelerating dose schedule. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs, increasing drug dosage, and increasing length of therapy. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs, accelerating dose schedule, and increasing length of therapy. 
In one embodiment, more aggressive therapy comprises increasing drug dosage, accelerating dose schedule, and increasing length of therapy. In one embodiment, more aggressive therapy comprises selecting and administering more potent drugs, increasing drug dosage, accelerating dose schedule, and increasing length of therapy. In some embodiments, a more aggressive therapy comprises administering a combination of drug-based and non-drug-based therapies.
[0075] In some embodiments, a less aggressive therapy than the standard therapy comprises delaying treatment relative to the standard therapy. In some embodiments, a less aggressive therapy than the standard therapy comprises administering less treatment (e.g., lower dosage of one or more standard therapy agents) than in the standard therapy. In some embodiments, a less aggressive therapy than the standard therapy comprises administering a treatment regimen lacking one or more components of the standard therapy. In some embodiments, a less aggressive therapy than the standard therapy comprises administering treatment on a decelerated schedule compared to the standard therapy. In some embodiments, a less aggressive therapy than the standard therapy comprises administering no treatment (e.g., no therapeutic agents, watchful waiting, active surveillance, etc.). In one embodiment a less aggressive therapy comprises delaying treatment. In one embodiment a less aggressive therapy comprises selecting and administering less potent drugs. In one embodiment a less aggressive therapy comprises decreasing the frequency of treatment. In one embodiment a less aggressive therapy comprises shortening length of therapy. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs and decreasing drug dosage. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs and decelerating dose schedule. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs and shortening length of therapy. In one embodiment, less aggressive therapy comprises decreasing drug dosage and decelerating dose schedule. In one embodiment, less aggressive therapy comprises decreasing drug dosage and shortening length of therapy. In one embodiment, less aggressive therapy comprises decelerating dose schedule and shortening length of therapy. 
In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs, decreasing drug dosage, and decelerating dose schedule. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs, decreasing drug dosage, and shortening length of therapy. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs, decelerating dose schedule, and shortening length of therapy. In one embodiment, less aggressive therapy comprises decreasing drug dosage, decelerating dose schedule, and shortening length of therapy. In one embodiment, less aggressive therapy comprises selecting and administering less potent drugs, decreasing drug dosage, decelerating dose schedule, and shortening length of therapy. In some embodiments, a less aggressive therapy comprises administering only non-drug-based therapies.
[0076] Thus, according to some particular embodiments, the methods of diagnosing MSI status of a tumor, as presented herein, may further comprise a step of choosing the treatment regimen based on the MSI status (i.e., based on whether the tumor was found to be MSI-H, MSI-L or MSS).
Kits
[0077] In another embodiment, a kit is provided for determining MSI in a sample, e.g., a tumor sample, comprising the tools to genotype the biomarker panel (e.g., the microsatellite regions shown in the sequences listed in Table 1 and/or Table 2).
[0078] Other embodiments of the present teachings comprise detection reagents packaged together in the form of a kit for conducting any of the assays of the present teachings. In certain embodiments, the kits comprise oligonucleotides that specifically identify one or more microsatellites described herein. The oligonucleotide sequences may correspond to fragments of the biomarker nucleic acids. For example, the oligonucleotides can be more than 250, 200, 150, 100, 50, 25, 10, or fewer than 10 nucleotides in length. The kit can contain in separate containers a nucleic acid, control formulations (positive and/or negative), and/or a detectable label, such as but not limited to fluorescein, green fluorescent protein, rhodamine, cyanine dyes, Alexa dyes, luciferase, and radiolabels, among others. Instructions for carrying out the assay can optionally be included in the kit.
[0079] In other embodiments of the present teachings, the kit can contain a nucleic acid substrate array comprising one or more nucleic acid sequences. The nucleic acids on the array specifically identify one or more of the microsatellite regions described herein. In various embodiments, the sequence (e.g., homopolymer length) of one or more of the microsatellite regions can be identified by virtue of binding to the array. In some embodiments the substrate array can be on a solid substrate, such as what is known as a "chip." See, e.g., U.S. Pat. No. 5,744,305. In some embodiments the substrate array can be a solution array; e.g., xMAP (Luminex, Austin, TX), Cyvera (Illumina, San Diego, CA), RayBio Antibody Arrays (RayBiotech, Inc., Norcross, GA), CellCard (Vitra Bioscience, Mountain View, CA) and Quantum Dots' Mosaic (Invitrogen, Carlsbad, CA).
Machine-readable storage medium
[0080] A machine-readable storage medium can comprise, for example, a data storage material that is encoded with machine-readable data or data arrays. The data and machine-readable storage medium are capable of being used for a variety of purposes, when using a machine programmed with instructions for using said data. Such purposes include, without limitation, storing, accessing and manipulating information relating to MSI of a subject or population over time, or disease activity characterized by MSI in response to treatment, or for drug discovery. Data comprising the presence of indels can be implemented in computer programs that are executing on programmable computers, which comprise a processor, a data storage system, one or more input devices, one or more output devices, etc. Program code can be applied to the input data to perform the functions described herein, and to generate output information. This output information can then be applied to one or more output devices, according to methods well-known in the art. The computer can be, for example, a personal computer, a microcomputer, or a workstation of conventional design.
[0081] The computer programs can be implemented in a high-level procedural or object-oriented programming language, to communicate with a computer system. The programs can also be implemented in machine or assembly language. The programming language can also be a compiled or interpreted language. Each computer program can be stored on storage media or a device such as ROM, magnetic diskette, etc., and can be readable by a programmable computer for configuring and operating the computer when the storage media or device is read by the computer to perform the described procedures. Any health-related data management systems of the present teachings can be considered to be implemented as a computer-readable storage medium, configured with a computer program, where the storage medium causes a computer to operate in a specific manner to perform various functions, as described herein.
[0082] The assays disclosed herein can be used to generate a "subject MSI profile." The subject MSI profiles can then be compared to a reference profile. The biomarker profiles, reference and subject, of embodiments of the present teachings can be contained in a machine-readable medium, such as analog tapes, CD-ROM, or USB flash media, among others. The machine-readable media can also comprise subject information, e.g., the subject's medical or family history.
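As a purely illustrative sketch of how a subject MSI profile might be stored in machine-readable form and compared to a reference profile, the following Python example serializes a profile to JSON and reports the regions whose homopolymer length differs from the reference. The profile layout (a mapping from a region label to an observed homopolymer length), the region labels, the length values, and the function name `compare_profiles` are all assumptions made for this example and are not specified by the present disclosure.

```python
import json
from typing import Dict

def compare_profiles(subject: Dict[str, int], reference: Dict[str, int]) -> Dict[str, int]:
    """Return region -> signed length difference for regions that differ.

    Negative values indicate a shorter homopolymer in the subject (deletion),
    positive values a longer one (insertion). Regions missing from the
    subject profile are skipped.
    """
    return {
        region: subject[region] - ref_len
        for region, ref_len in reference.items()
        if region in subject and subject[region] != ref_len
    }

# Hypothetical profiles; labels and lengths are illustrative only.
reference_profile = {"SEQ_ID_NO_1": 20, "SEQ_ID_NO_2": 17}
subject_profile = {"SEQ_ID_NO_1": 18, "SEQ_ID_NO_2": 17}

# Round-trip through JSON to mimic storage on a machine-readable medium.
encoded = json.dumps(subject_profile, sort_keys=True)
decoded = json.loads(encoded)

differences = compare_profiles(decoded, reference_profile)
print(differences)  # {'SEQ_ID_NO_1': -2}
```

JSON is used here only because it is a common text format; any encoding that preserves the per-region values would serve the same purpose.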
EXAMPLES
Example 1-Development of a microsatellite instability assay
[0083] This example demonstrates the development of a microsatellite instability (MSI) assay that generates a score useful for determining whether an individual has disease characterized by MSI.
Background
[0084] Microsatellites are regions of tandem repeats, and can be single nucleotide repeats, known as homopolymers, or dinucleotide repeats. MSI occurs when the number of repeats varies, which can include either insertions or deletions of nucleotide repeats within a microsatellite region. MSI can be evidence of a loss of function in the DNA mismatch repair (MMR) pathway, which can result in a variety of diseases including cancer.
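To make the notion of a change in repeat number concrete, the following Python sketch measures the longest homopolymer run in a reference sequence and in an observed sequence and reports the signed difference. This is a minimal illustration only: the sequences are invented, the function names are hypothetical, and taking the longest single-base run as the homopolymer of interest is a simplifying assumption rather than the method of the present disclosure.

```python
from itertools import groupby
from typing import Tuple

def longest_homopolymer(seq: str) -> Tuple[str, int]:
    """Return (base, length) of the longest single-base run in seq."""
    best_base, best_len = "", 0
    for base, run in groupby(seq.upper()):
        run_len = sum(1 for _ in run)
        if run_len > best_len:
            best_base, best_len = base, run_len
    return best_base, best_len

def indel_size(reference: str, observed: str) -> int:
    """Signed change in repeat count: >0 insertion, <0 deletion, 0 stable."""
    ref_base, ref_len = longest_homopolymer(reference)
    obs_base, obs_len = longest_homopolymer(observed)
    if ref_base != obs_base:
        raise ValueError("longest runs are of different bases; cannot compare")
    return obs_len - ref_len

# Invented sequences: a 17-base A homopolymer with short flanks.
reference = "GCT" + "A" * 17 + "CGG"
shortened = "GCT" + "A" * 15 + "CGG"

print(indel_size(reference, shortened))  # -2
```

A real pipeline would anchor the homopolymer using its flanking sequence and aggregate over many reads; this sketch ignores both concerns.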
Methods
[0085] A library useful for determining homologous recombination deficiency (HRD) having 54,091 single nucleotide polymorphisms (SNPs) spaced across the entire genome was used to discover homopolymers useful for an MSI assay. The criteria for useful homopolymers were 1) that the homopolymer was within 100 bases of the SNP position, 2) that the homopolymer was 15-20 nucleotide bases in length, and 3) that the average SNP coverage was over 100 independent reads, to increase the likelihood of good coverage of nearby homopolymers.
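The three initial selection criteria can be expressed as a simple filter. The following Python sketch is a hedged illustration: the `Candidate` record, its field names, and the example values are hypothetical, and treating "within 100 bases" and "15-20 bases" as inclusive bounds is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str                 # hypothetical label for the homopolymer
    distance_to_snp: int      # bases between the homopolymer and the SNP position
    length: int               # homopolymer length in bases
    mean_snp_coverage: float  # average number of independent reads at the SNP

def passes_initial_criteria(c: Candidate) -> bool:
    """Apply the three initial criteria (inclusive bounds are assumed)."""
    return (
        c.distance_to_snp <= 100       # 1) within 100 bases of the SNP
        and 15 <= c.length <= 20       # 2) 15-20 bases in length
        and c.mean_snp_coverage > 100  # 3) coverage over 100 independent reads
    )

# Invented candidates, one failing each criterion in turn.
candidates = [
    Candidate("hp_a", 40, 17, 250.0),
    Candidate("hp_b", 140, 17, 250.0),  # too far from the SNP
    Candidate("hp_c", 40, 12, 250.0),   # too short
    Candidate("hp_d", 40, 17, 80.0),    # coverage too low
]

selected = [c.name for c in candidates if passes_initial_criteria(c)]
print(selected)  # ['hp_a']
```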
[0086] Following the initial selection of homopolymers useful for an MSI assay, a second round of selection was conducted using a cohort of gastric cancer samples. The cohort included 80 gastric cancer samples. Eleven samples were MSI positive using the above metrics. Secondary selection criteria required that the homopolymer be of a consistent length among normal tissue samples, with low noise around the default length (CV less than 20%), and that two or more samples exhibit instability at the same homopolymer location.
Results
[0087] Out of the 54,091 SNP regions analyzed, 236 homopolymers were found falling within the established initial criteria. Each of the initial 236 homopolymers was evaluated using a cohort of gastric cancer samples, and the secondary criteria were used to select further homopolymer sequences for an MSI assay. Of the initial 236 homopolymers selected from the initial 54,091 SNP regions analyzed, 35 homopolymers were found to meet the secondary criteria. These 35 homopolymers were incorporated into the MSI assay.
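The secondary criteria described in the Methods (a coefficient of variation below 20% among normal tissue samples, and instability in two or more samples at the same homopolymer location) can be sketched as follows. This is an illustration under stated assumptions: the disclosure does not specify how the CV is computed, so the example takes it as the population standard deviation of the observed lengths divided by their mean, and all function names and data values are hypothetical.

```python
from statistics import mean, pstdev
from typing import Sequence

def coefficient_of_variation(lengths: Sequence[float]) -> float:
    """CV taken as population standard deviation over mean (an assumption)."""
    return pstdev(lengths) / mean(lengths)

def passes_secondary_criteria(
    normal_lengths: Sequence[float],
    unstable_sample_count: int,
    max_cv: float = 0.20,
    min_unstable: int = 2,
) -> bool:
    """True if lengths are consistent in normal tissue and instability recurs."""
    return (
        coefficient_of_variation(normal_lengths) < max_cv
        and unstable_sample_count >= min_unstable
    )

# Invented homopolymer lengths observed in normal tissue samples.
consistent = [17, 17, 17, 16, 17]
noisy = [10, 20, 15, 25, 5]

print(passes_secondary_criteria(consistent, unstable_sample_count=3))  # True
print(passes_secondary_criteria(consistent, unstable_sample_count=1))  # False
print(passes_secondary_criteria(noisy, unstable_sample_count=3))       # False
```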
[0088] Homopolymer selection is illustrated in Figure 1. Figure 1A shows a homopolymer that was excluded following secondary selection. Figure 1B shows a homopolymer exemplifying secondary criteria, which was included in the final selection of 35 homopolymers used for the MSI assay.
[0089] Gastric samples were then tested against the 35 homopolymer MSI assay. Twelve gastric samples were found to have MSI. On average 28 of the 35 homopolymers exhibited microsatellite instability, with the range being 20-33 affected homopolymers.
[0090] This example thus provides an assay that generates a score useful for determining whether an individual has MSI.
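One way such a score could be turned into a category is to compare the fraction of analyzed homopolymers carrying an indel against the percentage thresholds recited in the claims of this document (at least 60% for high instability; fewer than 60% and more than 10% for intermediate or low instability; 10% or fewer for no instability). The Python sketch below is illustrative only; the function name and the returned labels are hypothetical, and integer arithmetic is used simply to avoid floating-point edge cases at the thresholds.

```python
def classify_msi(unstable: int, total: int) -> str:
    """Classify by the share of analyzed regions that carry an indel.

    Thresholds: >= 60% high; < 60% and > 10% intermediate or low;
    <= 10% none.
    """
    if total <= 0 or not 0 <= unstable <= total:
        raise ValueError("need 0 <= unstable <= total and total > 0")
    if unstable * 10 >= 6 * total:  # unstable / total >= 0.60
        return "high"
    if unstable * 10 > total:       # unstable / total > 0.10
        return "intermediate or low"
    return "none"

# 28 of 35 homopolymers (the average reported above) is 80%.
print(classify_msi(28, 35))  # high
print(classify_msi(4, 35))   # intermediate or low
print(classify_msi(3, 35))   # none
```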
[0091] Although the foregoing disclosure has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended claims.

Claims

What is claimed is:
1. A method of analyzing microsatellite regions comprising:
analyzing DNA derived from a patient sample to determine the nucleotide sequence of the DNA at a plurality of microsatellite regions, wherein
(a) the plurality of microsatellite regions comprises at least one test microsatellite region;
(b) the at least one test microsatellite region comprise(s) the nucleotide sequence of any one of SEQ ID NOs: 1-35; and
(c) the sequence of the at least one test microsatellite region is analyzed to detect at least one indel at a homopolymer subregion comprising:
(1) bases 7-26 of SEQ ID NO: 1;
(2) bases 11-27 of SEQ ID NO: 2;
(3) bases 10-27 of SEQ ID NO: 3;
(4) bases 1-6 of SEQ ID NO: 4;
(5) bases 10-28 of SEQ ID NO: 5;
(6) bases 1-18 of SEQ ID NO: 6;
(7) bases 8-28 of SEQ ID NO: 7;
(8) bases 7-23 of SEQ ID NO: 8;
(9) bases 6-21 of SEQ ID NO: 9;
(10) bases 6-21 of SEQ ID NO: 10;
(11) bases 7-25 of SEQ ID NO: 11;
(12) bases 5-22 of SEQ ID NO: 12;
(13) bases 6-25 of SEQ ID NO: 13;
(14) bases 10-30 of SEQ ID NO: 14;
(15) bases 1-17 of SEQ ID NO: 15;
(16) bases 1-20 of SEQ ID NO: 16;
(17) bases 8-24 of SEQ ID NO: 17;
(18) bases 12-28 of SEQ ID NO: 18;
(19) bases 11-26 of SEQ ID NO: 19;
(20) bases 1-19 of SEQ ID NO: 20;
(21) bases 9-24 of SEQ ID NO: 21;
(22) bases 11-28 of SEQ ID NO: 22;
(23) bases 7-26 of SEQ ID NO: 23;
(24) bases 12-30 of SEQ ID NO: 24;
(25) bases 6-25 of SEQ ID NO: 25;
(26) bases 8-23 of SEQ ID NO: 26;
(27) bases 20-39 of SEQ ID NO: 27;
(28) bases 6-23 of SEQ ID NO: 28;
(29) bases 7-23 of SEQ ID NO: 29;
(30) bases 8-30 of SEQ ID NO: 30;
(31) bases 10-25 of SEQ ID NO: 31;
(32) bases 7-22 of SEQ ID NO: 32;
(33) bases 8-23 of SEQ ID NO: 33;
(34) bases 9-27 of SEQ ID NO: 34; or
(35) bases 1-21 of SEQ ID NO: 35.
2. The method of claim 1 wherein the at least one test microsatellite region is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions.
3. The method of claim 1 wherein the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 indels.
4. The method of claim 3 wherein between 20 and 33 indels is indicative of MSI.
5. The method of claim 1 wherein the sample is a bodily fluid.
6. The method of claim 1 wherein the sample is a tumor sample.
7. The method of claim 6 wherein the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors.
8. The method of claim 1 wherein the indel is detected using next generation sequencing.
9. A method of detecting microsatellite instability levels comprising:
(a) assaying DNA derived from a patient sample according to claim 1; and
(b) detecting (1) high microsatellite instability in a sample in which at least 60% of the plurality of microsatellite regions comprise an indel in the homopolymer subregion; or
(2) intermediate or low microsatellite instability in a sample in which fewer than 60% and more than 10% of the plurality of microsatellite regions comprise an indel in the homopolymer subregion; or
(3) no microsatellite instability in a sample in which 10% or fewer of the plurality of microsatellite regions comprises an indel in the homopolymer subregion.
10. The method of claim 9 wherein the at least one test microsatellite region is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions.
11. The method of claim 9 wherein the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 indels.
12. The method of claim 11 wherein between 20 and 33 indels is indicative of MSI.
13. The method of claim 9 wherein the sample is a bodily fluid.
14. The method of claim 9 wherein the sample is a tumor sample.
15. The method of claim 14 wherein the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors.
16. The method of claim 9 wherein the indel is detected using next generation sequencing.
17. A method of detecting microsatellite instability comprising:
(a) analyzing at least one microsatellite region present in one of SEQ ID NOs: 1-35 in DNA derived from a patient sample;
(b) detecting at least one indel in the at least one microsatellite region; and
(c) detecting microsatellite instability in a sample in which the at least one microsatellite region comprises at least one indel in a homopolymer region
wherein the at least one microsatellite region comprises:
(1) bases 7-26 of SEQ ID NO: 1;
(2) bases 11-27 of SEQ ID NO: 2;
(3) bases 10-27 of SEQ ID NO: 3;
(4) bases 1-6 of SEQ ID NO: 4;
(5) bases 10-28 of SEQ ID NO: 5;
(6) bases 1-18 of SEQ ID NO: 6;
(7) bases 8-28 of SEQ ID NO: 7;
(8) bases 7-23 of SEQ ID NO: 8;
(9) bases 6-21 of SEQ ID NO: 9;
(10) bases 6-21 of SEQ ID NO: 10;
(11) bases 7-25 of SEQ ID NO: 11;
(12) bases 5-22 of SEQ ID NO: 12;
(13) bases 6-25 of SEQ ID NO: 13;
(14) bases 10-30 of SEQ ID NO: 14;
(15) bases 1-17 of SEQ ID NO: 15;
(16) bases 1-20 of SEQ ID NO: 16;
(17) bases 8-24 of SEQ ID NO: 17;
(18) bases 12-28 of SEQ ID NO: 18;
(19) bases 11-26 of SEQ ID NO: 19;
(20) bases 1-19 of SEQ ID NO: 20;
(21) bases 9-24 of SEQ ID NO: 21;
(22) bases 11-28 of SEQ ID NO: 22;
(23) bases 7-26 of SEQ ID NO: 23;
(24) bases 12-30 of SEQ ID NO: 24;
(25) bases 6-25 of SEQ ID NO: 25;
(26) bases 8-23 of SEQ ID NO: 26;
(27) bases 20-39 of SEQ ID NO: 27;
(28) bases 6-23 of SEQ ID NO: 28;
(29) bases 7-23 of SEQ ID NO: 29;
(30) bases 8-30 of SEQ ID NO: 30;
(31) bases 10-25 of SEQ ID NO: 31;
(32) bases 7-22 of SEQ ID NO: 32;
(33) bases 8-23 of SEQ ID NO: 33;
(34) bases 9-27 of SEQ ID NO: 34; or
(35) bases 1-21 of SEQ ID NO: 35.
18. The method of claim 17 wherein the at least one test microsatellite region is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 test microsatellite regions.
19. The method of claim 17 wherein the at least one indel is 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 indels.
20. The method of claim 19 wherein between 20 and 33 indels is indicative of MSI.
21. The method of claim 17 wherein the sample is a bodily fluid.
22. The method of claim 17 wherein the sample is a tumor sample.
23. The method of claim 22 wherein the tumor is selected from the group of bladder, breast, colon, skin, ovary, endometrium, lung, lymphoblast, pancreas, prostate, rectum, and stomach tumors.
24. The method of claim 17 wherein the indel is detected using next generation sequencing.
PCT/US2016/067952 2015-12-22 2016-12-21 Methods for measuring microsatellite instability WO2017112738A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201562271184P 2015-12-22 2015-12-22
US62/271,184 2015-12-22
US201662287042P 2016-01-26 2016-01-26
US62/287,042 2016-01-26

Publications (1)

Publication Number Publication Date
WO2017112738A1 true WO2017112738A1 (en) 2017-06-29

Family

ID=59091195

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/067952 WO2017112738A1 (en) 2015-12-22 2016-12-21 Methods for measuring microsatellite instability

Country Status (1)

Country Link
WO (1) WO2017112738A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20020086078A (en) * 2001-05-11 2002-11-18 김호근 Diagnostic Kits For The Detection Of Microsatellite Instability
WO2013153130A1 (en) * 2012-04-10 2013-10-17 Vib Vzw Novel markers for detecting microsatellite instability in cancer and determining synthetic lethality with inhibition of the dna base excision repair pathway

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
LU ET AL.: "A novel approach for characterizing microsatellite instability in cancer cells", PLOS ONE, vol. 8, no. 5, 6 May 2013 (2013-05-06), pages 1 - 10 *
MORANDI ET AL.: "T([20]) repeat in the 3'-untranslated region of the MT1X gene : a marker with high sensitivity and specificity to detect microsatellite instability in colorectal cancer", INTERNATIONAL JOURNAL OF COLORECTAL DISEASE, vol. 27, 2012, pages 647 - 656, XP002698588, DOI: doi:10.1007/S00384-011-1365-7 *
MURPHY ET AL.: "Comparison of the microsatellite instability analysis system and the Bethesda Panel for the determination of microsatellite instability in colorectal cancers", JOURNAL OF MOLECULAR DIAGNOSTICS, vol. 8, no. 3, July 2006 (2006-07-01), pages 305 - 311, XP055395422 *
RUGGIERO ET AL.: "Deletion in a (T)8 microsatellite abrogates expression regulation by 3' -UTR", NUCLEIC ACIDS RESEARCH, vol. 31, no. 22, 2003, pages 6561 - 6569, XP002698591, DOI: doi:10.1093/NAR/GKG858 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10699802B2 (en) 2017-10-09 2020-06-30 Strata Oncology, Inc. Microsatellite instability characterization
WO2019074963A1 (en) * 2017-10-09 2019-04-18 Strata Oncology, Inc. Microsatellite instability characterization
AU2018367488B2 (en) * 2017-11-16 2021-09-16 Illumina, Inc. Systems and methods for determining microsatellite instability
WO2019099529A1 (en) * 2017-11-16 2019-05-23 Illumina, Inc. Systems and methods for determining microsatellite instability
KR20200015913A (en) * 2017-11-16 2020-02-13 일루미나, 인코포레이티드 System and method for determining microsatellite instability
CN110800061A (en) * 2017-11-16 2020-02-14 伊鲁米那股份有限公司 System and method for determining microsatellite instability
JP2020527337A (en) * 2017-11-16 2020-09-10 イルミナ インコーポレイテッド Systems and methods for determining microsatellite instability
KR102402002B1 (en) 2017-11-16 2022-05-25 일루미나, 인코포레이티드 Systems and Methods for Determining Microsatellite Instability
WO2019108807A1 (en) 2017-12-01 2019-06-06 Personal Genome Diagnositics Inc. Process for microsatellite instability detection
WO2020081607A1 (en) * 2018-10-15 2020-04-23 Tempus Labs, Inc. Microsatellite instability determination system and related methods
US20210139950A1 (en) * 2019-11-08 2021-05-13 Life Technologies Corporation Microsatellite instability measurement
WO2021092523A1 (en) * 2019-11-08 2021-05-14 Life Technologies Corporation Microsatellite instability measurement
WO2021092299A1 (en) * 2019-11-08 2021-05-14 Life Technologies Corporation Systems and assays for assessing microsatellite instability
CN114651072A (en) * 2019-11-08 2022-06-21 生命科技股份有限公司 Microsatellite instability measurement
JP2023500366A (en) * 2019-11-08 2023-01-05 ライフ テクノロジーズ コーポレーション Microsatellite instability measurements
JP7407284B2 (en) 2019-11-08 2023-12-28 ライフ テクノロジーズ コーポレーション Microsatellite instability measurements
EP3863019A1 (en) 2020-02-07 2021-08-11 Sophia Genetics S.A. Methods for detecting and characterizing microsatellite instability with high throughput sequencing
WO2021156486A1 (en) 2020-02-07 2021-08-12 Sophia Genetics S.A. Methods for detecting and characterizing microsatellite instability with high throughput sequencing

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16880007

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16880007

Country of ref document: EP

Kind code of ref document: A1