WO2021252603A1 - Methods for identifying modified bases in a polynucleotide - Google Patents

Methods for identifying modified bases in a polynucleotide

Info

Publication number
WO2021252603A1
Authority
WO
WIPO (PCT)
Prior art keywords
modified
dna
damaged
enzyme
glycosylase
Application number
PCT/US2021/036577
Other languages
French (fr)
Inventor
Charles Rodi
Original Assignee
Rhodx, Inc.
Application filed by Rhodx, Inc.
Publication of WO2021252603A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844: Nucleic acid amplification reactions

Definitions

  • This invention generally relates to molecular biology and genetic engineering.
  • methods comprising a unique series of steps to mark sites of modification or damage within a nucleic acid (or polynucleotide, or deoxyribonucleic acid (DNA)) sequence for analysis and measurement.
  • the difference is detected by sequencing, hybridization, mass spectrometry, or any other technology that can detect sequence changes, including those with nucleotide resolution.
  • the base moieties of deoxyribonucleic acid can be damaged by a variety of means, including UV radiation, ionizing radiation, and chemical mutagens, or modified by naturally occurring enzymes that modulate nucleic acid function. Only modifications of cytosines and adenines are currently known in DNA, though a variety of damaged bases occur.
  • Damaged or modified bases in DNA include thymine glycol (Tg), 8-oxo-7,8-dihydroguanine (8oxoG), 3-methyladenine (3mA), 7-methylguanine (7mG), deoxyinosine (dI), deoxyxanthosine (dX), deoxyuridine (dU), 5-hydroxyuridine (5-hoU), 5-hydroxymethyluridine (5-hmU), 5-formyluridine (5-fU), cyclobutane pyrimidine dimers (CPDs), 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), 5-carboxylcytosine (5caC), and N6-methyladenine (6mA).
  • TLC thin layer chromatography
  • HPLC high-pressure liquid chromatography
  • LC-MS/MS liquid chromatography coupled with mass spectrometry
  • a modified or a damaged base moiety in a nucleic acid or polynucleotide, or deoxyribonucleic acid (DNA)
  • making or generating a reverse complement copy of the nucleic acid strand that contains or comprises the at least one made or generated abasic site wherein the making or generating comprises incorporating in the reverse complement copy opposite the abasic site a nucleotide that has a different identity from what the original modified or damaged nucleotide would have coded for.
  • the modified or damaged base moiety comprises a modified or damaged adenine (A), cytosine (C), guanine (G), thymine (T), or uracil (U);
  • the modified or damaged base moiety is 5-methylcytosine (5meC), 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), 5-carboxylcytosine (5caC), or N6-methyladenine (6mA, also written m6A);
  • nucleotide base moiety is modified enzymatically or chemically
  • a modified or damaged cytosine is oxidized to a 5fC or a 5caC, or the modified or damaged cytosine is oxidized chemically to a 5fC or a 5caC, or the modified or damaged cytosine is oxidized enzymatically to a 5fC or a 5caC, and optionally the enzyme (for enzymatic oxidation) is a Ten-eleven Translocation (TET) enzyme, optionally a naturally occurring TET enzyme, or the enzyme (for enzymatic oxidation) is a genetically engineered form of a Ten-eleven Translocation (TET) enzyme, or the enzyme (for enzymatic oxidation) is TET1, TET2, or TET3, and optionally the 5fC or 5caC is removed or modified enzymatically to create an abasic site, and optionally the enzyme for generating an abasic site is thymine DNA glycosylase (TDG), and optionally the enzyme for generating an a
  • the modified base moiety is N6-methyladenine (6mA), or the modified base moiety 6mA is converted to hypoxanthine by a 6mA deaminase;
  • the method further comprises (or is followed by) base excision by an AlkA family enzyme (optionally a hypoxanthine DNA glycosylase) to create an abasic site;
  • an AlkA family enzyme optionally a hypoxanthine DNA glycosylase
  • the reverse complement copy is made using a DNA polymerase with lesion bypass activity that allows it to incorporate a nucleotide opposite the abasic site;
  • the DNA polymerase is naturally occurring, or the DNA polymerase is a Y-family polymerase, or the DNA polymerase is genetically engineered, or the genetically engineered polymerase is a Y-family DNA polymerase;
  • the DNA polymerase is a DNA polymerase that follows the A rule and inserts an adenine (A) base moiety opposite an abasic site for marking a site where the original damaged or modified base would have coded for a C, G, T, or U;
  • A adenine
  • the DNA polymerase is a DNA polymerase that does not follow the A rule, but instead inserts a cytosine (C) nucleotide opposite an abasic site for marking a site where the original damaged or modified base would have coded for an A, G, T, or U;
  • C cytosine
  • the genetically engineered DNA polymerase is a TAQ polymerase without the 5'-3' exonuclease domain, or KLENTAQ™ or a derivative thereof
  • the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain, or genetically engineered KLENTAQ™ or derivative thereof comprises one or more variants chosen from M747K, I614K, I614N, L616I, G84A, D144G, K314R, E520G, F598L, A608V, E742G, E742K, D58G, R74P, A109T, L245R, R343G, G370D, E520G, N583S, E694K, and A743P
  • the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain or KLENTAQ™ or derivative thereof comprises M747K, or the genetically engineered TAQ polymerase without the 5'-3' exonu
  • the modified base moiety is removed by a naturally occurring or genetically engineered glycosylase, or the glycosylase is a monofunctional glycosylase, or the glycosylase is a bifunctional glycosylase where the lyase activity has been ablated.
  • kits comprising materials for practicing methods as provided herein, and optionally also comprising instructions for practicing methods as provided herein.
  • Enzymatic treatment removes the modified or damaged base leaving the phosphodiester backbone intact.
  • Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site. Amplification leads to a polymorphism at the original site of modification or damage.
  • FIG. 2 schematically illustrates an exemplary method as provided herein for detecting methylated cytosines.
  • Tet2 oxidizes 5mC and 5hmC to 5fC and subsequently to 5caC, which is subsequently excised by thymine DNA glycosylase (TDG), leaving an abasic site; unmodified C’s are unchanged.
  • TDG thymine DNA glycosylase
  • the “A rule” is followed and an A is placed opposite the abasic site.
  • the A codes for a T, resulting in C>T and G>A transitions in the complementary strands.
  • Two distinct amplicons are produced: one from the top strand and one from the bottom strand.
  • NexGen Sequencing (NGS) would always detect two amplicons with each confirming the other as a site of DNA methylation; compared to a reference sequence, the method affects adjacent base pairs.
  • NGS NexGen Sequencing
  • FIG. 3A-B schematically illustrates an exemplary method as provided herein for methylation and mutation assays that can be performed in the same container or reaction vessel, e.g., the same tube.
  • the exemplary method can be performed in the same tube as mutation assays since the two are never confused: mutations produce only one amplicon type since mutations affect the same base pair.
  • In FIG. 3A, the C>T change is clearly a mutation since only one amplicon type is produced. Even with a dinucleotide mutation, still only one amplicon type per mutation is seen; see FIG. 3B.
  • FIG. 4 schematically illustrates an exemplary method as provided herein for detecting N6-adenine Methylation.
  • 6mA deaminase or another appropriate amidohydrolase is used to convert 6-methyladenine to hypoxanthine, which is subsequently acted on by an AlkA family enzyme (hypoxanthine DNA glycosylases) to create an abasic site in the DNA.
  • AlkA family enzyme hypoxanthine DNA glycosylases
  • FIG. 5 illustrates a gel showing that treatment with TET2 protein enzymatically modifies 5-methylcytosine-containing DNA so that it can no longer be cut by the restriction endonuclease MspI, demonstrating successful oxidation to 5fC and/or 5caC: a PCR amplicon containing a single HpaII/MspI site was methylated using HpaII methyltransferase; cutting by MspI is unaffected by methylation of CpG islands by HpaII methyltransferase, as demonstrated in lane 1, where MspI digestion is complete; DNA used for lane 2 was further treated with TET2 protein; because MspI can cut 5mC- or 5hmC-containing DNA, but not 5fC- or 5caC-containing DNA, FIG. 5 demonstrates oxidation of methylated DNA to 5fC and/or 5caC, neither of which can be cut by MspI but both of which are suitable substrates for thymine DNA glycosylase
  • FIG. 6A-B schematically illustrates a 60-base pair, double-stranded DNA containing a single 5caC nucleotide and experimental results that demonstrate thymine DNA glycosylase (TDG) excision of the modified base from the duplex:
  • FIG. 6A schematically illustrates a synthetic double-stranded DNA containing a 5caC nucleotide at the position marked by an asterisk (*), where the (upper) 5’ to 3’ strand is SEQ ID NO: 1, 5’-
  • FIG. 6B illustrates a gel that displays results on an AGILENT TAPESTATION™ after the DNA duplex of FIG. 6A has been denatured by treatment with sodium hydroxide plus heat, then neutralized with hydrochloric acid (Lane 1) to yield two separate, intact DNA strands that migrate as an unresolved, broad band; Lane 2 shows the same duplex that is first treated with thymine DNA glycosylase (TDG) to remove the 5caC base moiety from the duplex, leaving an abasic site; subsequent treatment with sodium hydroxide plus heat results in strand scission at the abasic site in addition to denaturation of the DNA duplex, resulting in one intact DNA strand of 60 nucleotides (the bottom strand depicted in FIG.
  • FIG. 7A-B schematically illustrate a primer-template pair, where the template contains an abasic site; a full-length, extended primer; and experimental results that demonstrate two DNA polymerases that can bypass the abasic site to produce full-length extended products:
  • FIG. 7A schematically illustrates a partially double-stranded DNA consisting of a 60-nucleotide template: 3’-GTGCGTCGAGTACGGGAAGCCGACGGAGGACCTGATACAGG*CCTTGTGTTTCTGTTATA-5’ (SEQ ID NO:3) containing an abasic site (marked by an asterisk (*)) and a 21-nucleotide primer 5’-ATGCCCTTCGGCTGCCTCCTG-3’ (SEQ ID NO:4) annealed to the template; and the expected product of 50 nucleotides produced by a DNA polymerase with lesion bypass activity extending the primer to the end of the template (i.e., a 50-nucleotide product (SEQ ID NO:5) results from the extension of the 21-nucleotide primer (SEQ ID NO:4) on the 60-nucleotide template (SEQ ID NO:3): 5’-
  • FIG. 7B illustrates a gel that shows experimental results using the 21-nucleotide primer (SEQ ID NO:4) annealed to the 60-nucleotide template (SEQ ID NO:3); Lane 1: no polymerase; Lane 2: a double mutant of a klentaq (a TAQ polymerase without the 5'-3' exonuclease domain) as described by Patel et al. J Biol Chem. 2001; 276(7):5044-5051; Lane 3: a single mutant of klentaq as described by Gloeckner C, et al. Angew Chem Int Ed Engl.
  • methods as provided herein comprise the removal of one or more damaged or modified bases in the nucleic acid by an enzyme (e.g., by enzymatic or equivalent means) to create one or more abasic sites in a manner that leaves the phosphodiester backbone of the nucleic acid intact; followed by synthesis of a reverse complement to the original, now abasic, nucleic acid strand or strands using an enzyme with a DNA polymerase activity that possesses lesion bypass activity such that a nucleotide residue is incorporated in the reverse complement that has a different identity from the nucleotide that would have been incorporated if the abasic site had not been created in the original DNA molecule.
  • an enzyme e.g., by enzymatic or equivalent means
  • the removal of the damaged or modified base is effected by, or is facilitated by, the use of one or more enzymes or polypeptides having glycosylase activity, for example, a monofunctional glycosylase, i.e., an enzyme having only glycosylase activity.
  • the monofunctional glycosylase removes its cognate base (i.e., the damaged or modified base) to create an abasic site in a manner that leaves the phosphodiester backbone of the nucleic acid intact.
  • the monofunctional glycosylase may be naturally occurring or the product of genetic engineering.
  • One example of a genetically engineered monofunctional glycosylase is a bifunctional glycosylase that has been mutated in such a manner so as to retain its glycosylase activity, but not its lyase activity.
  • the modified or damaged base may first need to be acted on by an enzyme without glycosylase activity that converts the modified or damaged base into a form that is subsequently recognized and acted upon by the glycosylase to create an abasic site.
  • Tet (ten eleven translocation) proteins can convert 5mC to 5hmC, then to 5fC and then to 5caC (see e.g., Ito S et al. Science. 2011; 333(6047): 1300-1303; Shi DQ, et al. Front Genet. 2017; 8:100; Yin et al. J Am Chem Soc.
  • both 5fC and 5caC can be removed from double-stranded DNA by thymine DNA glycosylase (TDG) leaving abasic sites; TDG does not affect unmodified cytosines or 5mC or 5hmC (see e.g., Maiti A, Drohat AC. J Biol Chem. 2011, 286(41):35334-35338; He YF, et al. Science 2011; 333 (6047): 1303-1307; Bennett MT et al. 2006, J. Am. Chem. Soc. 128, 12510-12519).
  • Another example, depicted in FIG. 4, is the conversion of 6mA to hypoxanthine by 6mA deaminase, followed by base excision by an AlkA family enzyme (e.g., a hypoxanthine DNA glycosylase) to create an abasic site in the DNA (see e.g., O'Brown ZK, Greer EL. N6-Methyladenine: A Conserved and Dynamic DNA Mark. Adv Exp Med Biol. 2016; 945:213-246).
  • AlkA family enzyme e.g., a hypoxanthine DNA glycosylase
  • the enzyme with a first modifying activity may recognize more than one type of modified or damaged base.
  • prior treatment by an enzyme with a second modifying activity may protect one or more of the types of modified or damaged bases while leaving one or more of the types of modified or damaged bases available for the first modifying enzyme, which converts the available base to a form that may be subsequently removed to create an abasic site (note that in this alternative exemplary embodiment the “second modifying activity” is used before the “first modifying activity”).
  • Comparison of abasic sites without protective pretreatment to abasic sites with protective pretreatment helps define the identity of the modified or damaged bases.
  • the method then further comprises heat inactivation of the GT, e.g., at 65°C for 10 minutes, followed by treatment with a Tet protein to oxidize only the 5mC (not the 5ghmC) to 5fC, then 5caC (see, e.g., Yu M, et al. Cell. 2012; 149(6): 1368 - 1380), both of which can be removed by the monofunctional glycosylase, TDG, to leave an abasic site.
  • GT T4 β-glucosyltransferase
  • a reverse complement of the nucleic acid containing one or more, or all, of the abasic sites in the nucleic acid is made in a manner that incorporates in the reverse complement opposite the abasic site(s) a nucleotide that has a different identity from what the original modified nucleotide would have coded for.
  • the reverse complement is made by providing a sequence-specific nucleic acid primer that anneals to the nucleic acid strand containing the abasic site and is extended across it by a DNA polymerase that has lesion bypass activity, for example, as described in Gloeckner C, et al. Angew Chem Int Ed Engl.
  • a DNA polymerase that has lesion bypass activity is a Y-family DNA polymerase, e.g., a naturally occurring Y-family DNA polymerase, such as the Sulfolobus solfataricus P2 DNA Polymerase IV, which follows the “A rule” and inserts an A opposite an abasic site, e.g., as described by Boudsocq F et al. Nucleic Acids Res.
  • DNA polymerases that follow the A rule and insert an A opposite an abasic site are used in marking sites where the original damaged or modified base would have coded for a C, G, T, or U; and in alternative embodiments the DNA polymerase that has lesion bypass activity is combined with one or more other (or additional) DNA polymerases to increase efficiency in making the reverse complement, for example, the other DNA polymerase can be the highly processive PHUSION HOT START II™ DNA polymerase (ThermoFisher) or Q5® HOT START HIGH-FIDELITY™ DNA polymerase (New England BioLabs).
  • a Saccharomyces cerevisiae Y-family DNA polymerase is used; for example, a Rev1 enzyme is used which specifically incorporates dCMP opposite abasic sites (Nair DT, et al. J Mol Biol.
  • DNA polymerases that insert a C opposite an abasic site are used for marking sites where the original damaged or modified base would have coded for an A, G, T, or U.
  • the S. cerevisiae Y-family DNA polymerase used is Pol η (encoded by RAD30), which can bypass a T-T cyclobutane pyrimidine dimer (CPD) accurately and efficiently.
  • Pol η encoded by RAD30
  • CPD T-T cyclobutane pyrimidine dimer
  • a Homo sapiens Y-family DNA polymerase is used; for example, a REV1 enzyme is used which specifically incorporates dCMP opposite abasic sites, and acts as a scaffold protein that interacts with the Y-family polymerases Pol η, Pol ι and Pol κ, as well as the B-family TLS polymerase Pol ζ (which is composed of Rev3 and Rev7)
  • the Homo sapiens Y-family DNA polymerase used is Pol η (encoded by POLH or RAD30), which bypasses a T-T CPD efficiently and with the same accuracy as undamaged DNA, and generates mutations at A-T base pairs during immunoglobulin gene somatic hypermutation.
  • the Homo sapiens Y-family DNA polymerase used is Pol ι, encoded by POLI (also known as RAD30B), which has unique replication fidelity, replicating template dA reasonably accurately, but replicating template dT in a highly error-prone manner.
  • the Homo sapiens Y-family DNA polymerase used is Pol κ, encoded by POLK, an enzyme that is prone to making -1 frameshift mutations but can accurately and efficiently bypass a number of N2-dG lesions.
  • a genetically engineered DNA polymerase with enhanced bypass activity is used, for example, as described by Gloeckner C, et al. Angew Chem Int Ed Engl. 2007; 46(17):3115-3117.
  • the difference(s) in nucleotide sequence that occurs through the creation of abasic sites followed by making a reverse complement of the nucleic acid containing one or more abasic sites in a manner that incorporates in the reverse complement opposite the abasic site(s) a nucleotide that has a different identity from what the original modified nucleotide would have coded for are detected by e.g., sequencing, hybridization, mass spectrometry, TaqMan and equivalents or any other technology that can detect sequence changes, including those with nucleotide resolution.
  • provided are products of manufacture and kits for practicing methods as provided herein; optionally, the products of manufacture and kits can further comprise instructions for practicing methods as provided herein.
  • the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. About can be understood as within 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value.
  • This example demonstrates exemplary methods for detecting methylated cytosine base residues.
  • the presence of 5mC and 5hmC in DNA can be detected by treatment of the sample with an oxidizing enzyme that would convert 5mC to 5hmC; 5hmC to 5fC which might be further oxidized to 5caC; a non-limiting example of such an enzyme is ten-eleven-translocation enzyme, type 2 (TET2). Subsequent treatment with another enzyme that removes both 5fC and 5caC would leave an abasic site.
  • an oxidizing enzyme that would convert 5mC to 5hmC; 5hmC to 5fC which might be further oxidized to 5caC; a non-limiting example of such an enzyme is ten-eleven-translocation enzyme, type 2 (TET2).
  • TDG has long been known to remove thymine from G/T mismatches, but is now known to efficiently remove the oxidized residues 5fC and 5caC from DNA when paired with G, both in vivo and in vitro, leaving abasic sites.
  • Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site; non-limiting examples of such an enzyme include Sulfolobus DNA polymerase IV, a thermostable Y-family polymerase that bypasses lesions (including abasic sites) in the template DNA; and genetically engineered KLENTAQ™ mutants, including M747K. Since the original 5mC or 5hmC would have directed the insertion of a G residue in the reverse complement, the presence of the A residue acts to mark the site of 5mC or 5hmC.
  • βGT β-glucosyltransferase
  • This exemplary method includes distinct advantages over current methods [5,6], including using the entirety of a sample for the simultaneous detection of damaged or modified bases without interfering with the detection of alternative alleles, including mutations, at other sites.
  • Taking methylated cytosines in DNA as an example, bisulfite-based approaches and newer enzymatic approaches both convert all unmethylated cytosines (C) to thymine (T) signals while leaving 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) as C signals in subsequent DNA sequencing analyses.
  • the present invention does just the opposite: it leaves all unmodified C’s untouched and converts only the 5mC and 5hmC residues into T signals. This simplifies assay design and sequence analysis.
  • the present invention provides compatibility with assays for allelic variation and mutation detection as follows: with the new method, a C>T change due to a methylated C in a CpG island would be correctly scored as a methylated C since it would produce two amplicon types as shown in Figure 2; similarly, any C>T change caused by allelic variation or by mutation would also be correctly scored (i.e., as an allelic variation or mutation) since each would produce only one amplicon type as shown in Figure 3. In bisulfite and other methods that convert unmodified C’s to T’s, that conversion would confound mutation assays where a C>T mutation occurs.
  • the EGFR T790M mutation an important drug resistance marker in lung cancer
  • current methods that convert the unmethylated wild type C to T, identical to the mutant, would render the assay uninterpretable.
  • G to A mutations also could not be scored using such methods.
  • Table 1 includes a short list of important mutation assays affected by C>T or G>A transitions.
  • mutation and methylation assays are multiplexed and the entire DNA sample used. More data points would mean a higher probability of detecting a mutation if present (increased sensitivity) and increase confidence that an individual is indeed disease-free if the mutation is not detected (specificity). Since only one assay is needed for both mutation and methylation analysis, both workload and costs are dramatically decreased.
  • Table 1 Major mutation assays confounded by treatment with current methylation analysis assays.
  • This example demonstrates exemplary methods for detecting N6-adenine methylation.
  • N6-adenine methylation (6mA) of DNA may play an important role in early development and is significantly decreased in abundance in cancer cells and tissues compared to normal counterparts.
  • 6mA deaminase or another appropriate amidohydrolase is used to convert 6-methyladenine to hypoxanthine, which is subsequently acted on by AlkA family enzymes (hypoxanthine DNA glycosylases) to create an abasic site in the DNA.
  • 8-oxoguanine (8-oxo-dG) is an example of oxidative base damage that is associated with several forms of cancer.
  • 8-oxo-dG is recognized by the E. coli enzyme Fpg, the yeast enzyme Ogg1 and the human enzyme hOGG1. All three are bifunctional glycosylases with both glycosylase activity and lyase activity.
  • Elimination of the lyase activity by mutagenesis or inhibition that leaves the N-glycosylase activity intact allows the damaged 8-oxo-dG to be removed, leaving an abasic site.
  • Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site; non-limiting examples of such an enzyme include Sulfolobus DNA polymerase IV, a thermostable Y-family polymerase that bypasses lesions (including abasic sites) in the template DNA; and genetically engineered KLENTAQ™ mutants, including M747K. Since an undamaged G would have directed the insertion of a C residue in the reverse complement, the presence of the A residue acts to mark the site of 8-oxo-dG.
  • This example demonstrates exemplary methods for detection of damaged bases in DNA.
  • hAAG human alkyladenine-DNA glycosylase
  • 3MeA 3-methyladenine
  • 7MeG 7-methylguanine
  • εA 1,N6-ethenoadenine
  • Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site; non-limiting examples of such an enzyme include Sulfolobus DNA polymerase IV, a thermostable Y-family polymerase that bypasses lesions (including abasic sites) in the template DNA; and genetically engineered KLENTAQ™ mutants, including M747K; and the double mutant M747K. Since an undamaged G would have directed the insertion of a C residue in the reverse complement, and an undamaged A would have directed the insertion of a T residue in the reverse complement, the presence of the A residue acts to mark the site of damaged DNA bases recognized by hAAG.
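
The amplicon logic described for FIG. 2 and FIG. 3 can be illustrated with a toy model. The sketch below is not taken from the application: the function names, the six-base example sequence, and the scoring rule are hypothetical, and it assumes an A-rule polymerase and a CpG that is methylated on both strands. It shows why a methylated CpG gives two amplicon types (C>T from the top strand, G>A at the adjacent base pair from the bottom strand), while a C>T mutation gives only one.

```python
def amplicons_after_marking(top, top_sites, bottom_sites):
    """Return the set of amplicon sequences, written in top-strand orientation.

    top_sites / bottom_sites are 0-based positions (top-strand coordinates)
    of bases excised from the top and bottom strand respectively.
    Assumption: an A-rule polymerase, so an excised base reads out as T
    on its own strand.
    """
    # Amplicon derived from the top strand: the excised base reads as T.
    from_top = "".join("T" if i in top_sites else b for i, b in enumerate(top))
    # Amplicon derived from the bottom strand: the excised bottom-strand base
    # reads as T on that strand, i.e. as A at the same top-strand coordinate.
    from_bottom = "".join("A" if i in bottom_sites else b for i, b in enumerate(top))
    return {from_top, from_bottom}


def classify(reference, amplicons):
    """Hypothetical scoring rule based on the number of altered amplicon types."""
    altered = {a for a in amplicons if a != reference}
    if len(altered) >= 2:
        return "methylation"
    if len(altered) == 1:
        return "mutation"
    return "unchanged"


reference = "AACGTT"  # toy sequence with a CpG at positions 2-3

# Methylated CpG: the top-strand C (position 2) and the bottom-strand C
# (opposite the top-strand G at position 3) are both excised.
methylated = amplicons_after_marking(reference, {2}, {3})

# C>T mutation: the duplex itself differs, and nothing is excised.
mutated = amplicons_after_marking("AATGTT", set(), set())

print(sorted(methylated), classify(reference, methylated))
print(sorted(mutated), classify(reference, mutated))
```

A hemimethylated site or a heterozygous variant would need a richer model than this two-outcome rule; the sketch covers only the two cases contrasted in the figures.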

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Provided are methods comprising a series of steps to mark sites of modification or damage within a nucleic acid (or polynucleotide, or deoxyribonucleic acid (DNA)) sequence for analysis and measurement. Provided are methods for determining sites of nucleic acid damage and/or modification that comprise the removal (or replacement) of a damaged or modified base moiety from the nucleic acid to generate or create an abasic site, followed by making a reverse complement of the nucleic acid now containing one or more abasic sites (corresponding to the damaged or modified base moieties) in a manner that incorporates in the reverse complement copy opposite the abasic site(s) a nucleotide that has a different identity from what the original modified nucleotide would have coded for; the difference can be detected by sequencing, hybridization, mass spectrometry, or any other technology that can detect sequence changes, including those with nucleotide resolution.

Description

METHODS FOR IDENTIFYING MODIFIED BASES IN A
POLYNUCLEOTIDE
RELATED APPLICATIONS
This Patent Cooperation Treaty (PCT) International Application claims the benefit of priority under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application Serial No. (USSN) 63/037,408, filed June 10, 2020. The aforementioned application is expressly incorporated herein by reference in its entirety and for all purposes.
TECHNICAL FIELD
This invention generally relates to molecular biology and genetic engineering. In alternative embodiments, provided are methods comprising a unique series of steps to mark sites of modification or damage within a nucleic acid (or polynucleotide, or deoxyribonucleic acid (DNA)) sequence for analysis and measurement. In alternative embodiments, provided are methods for determining sites of nucleic acid damage and/or modification that comprise the removal (or replacement) of a damaged or modified base moiety from the nucleic acid to generate or create an abasic site, followed by making a reverse complement of the nucleic acid now containing one or more abasic sites (corresponding to the damaged or modified base moieties) in a manner that incorporates in the reverse complement copy opposite the abasic site(s) a nucleotide that has a different identity from what the original modified nucleotide would have coded for. In alternative embodiments, the difference is detected by sequencing, hybridization, mass spectrometry, or any other technology that can detect sequence changes, including those with nucleotide resolution.
BACKGROUND
The base moieties of deoxyribonucleic acid (DNA) can be damaged by a variety of means, including UV radiation, ionizing radiation, and chemical mutagens, or modified by naturally occurring enzymes that modulate nucleic acid function. Only modifications of cytosines and adenines are currently known in DNA, though a variety of damaged bases occur. These modifications play important roles in normal development [1] and in disease, including cancer, where they are important biomarkers [2]. Damaged or modified bases in DNA include thymine glycol (Tg), 8-oxo-7,8-dihydroguanine (8oxoG), 3-methyladenine (3mA), 7-methylguanine (7mG), deoxyinosine (dI), deoxyxanthosine (dX), deoxyuridine (dU), 5-hydroxyuridine (5-hoU), 5-hydroxymethyluridine (5-hmU), 5-formyluridine (5-fU), cyclobutane pyrimidine dimers (CPDs), 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), 5-carboxylcytosine (5caC), and N6-methyladenine (6mA).
Current detection and quantification methods range from chromatography approaches such as thin layer chromatography (TLC), high-pressure liquid chromatography (HPLC) and liquid chromatography coupled with mass spectrometry (LC-MS/MS). Although useful in detection and quantification, none of these methods are able to locate the damage or modification within a nucleic sequence. Some newer methods use antibodies against a particular base to isolate or immunoprecipitate nucleic acid fragments, but not with sufficient nucleotide resolution. Still others, such as bisulfite methods, destroy much of the sample by harsh chemical and temperature treatments and can be incompatible with the detection of other information encoded in the DNA, such as the presence of specific alleles or mutations.
SUMMARY
In alternative embodiments, provided are methods for detecting a modified or a damaged base moiety in a nucleic acid (or polynucleotide, or deoxyribonucleic acid (DNA)), comprising: a. Removing, or modifying and then removing, at least one modified or damaged base moiety from a nucleic acid strand to create a corresponding abasic site (wherein optionally a damaged base moiety is any structural modification to the base not present in the wild type (WT) base moiety); and, b. making or generating a reverse complement copy of the nucleic acid strand that contains or comprises the at least one made or generated abasic site, wherein the making or generating comprises incorporating in the reverse complement copy opposite the abasic site a nucleotide that has a different identity from what the original modified or damaged nucleotide would have coded for.
In alternative embodiments of methods as provided herein:
- the modified or damaged base moiety comprises a modified or damaged adenine (A), cytosine (C), guanine (G), thymine (T), or uracil (U);
- the modified or damaged base moiety is 5-methylcytosine (5meC), 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), 5-carboxylcytosine (5caC), or N6-methyladenine (6mA, or N6-methyladenosine, m6A);
- the nucleotide base moiety is modified enzymatically or chemically;
- a modified or damaged cytosine is oxidized to a 5fC or a 5caC, or the modified or damaged cytosine is oxidized chemically to a 5fC or a 5caC, or the modified or damaged cytosine is oxidized enzymatically to a 5fC or a 5caC, and optionally the enzyme (for enzymatic oxidation) is a Ten-eleven Translocation (TET) enzyme, optionally a naturally occurring TET enzyme, or the enzyme (for enzymatic oxidation) is a genetically engineered form of a Ten-eleven Translocation (TET) enzyme, or the enzyme (for enzymatic oxidation) is TET1, TET2, or TET3, and optionally the 5fC or 5caC is removed or modified enzymatically to create an abasic site, and optionally the enzyme for generating an abasic site is thymine DNA glycosylase (TDG), and optionally the enzyme for generating an abasic site is a naturally occurring form of TDG or a genetically engineered form of TDG;
- the modified base moiety is N6-methyladenine (6mA), or the modified base moiety 6mA is converted to hypoxanthine by a 6mA deaminase;
- the method further comprises (or is followed by) base excision by an AlkA family enzyme (optionally a hypoxanthine DNA glycosylase) to create an abasic site;
- the reverse complement copy is made using a DNA polymerase with lesion bypass activity, allowing it to incorporate a nucleotide opposite the abasic site;
- the DNA polymerase is naturally occurring, or the DNA polymerase is a Y-family polymerase, or the DNA polymerase is genetically engineered, or the genetically engineered polymerase is a Y-family DNA polymerase;
- the DNA polymerase is a DNA polymerase that follows the A rule and inserts an adenine (A) base moiety opposite an abasic site for marking a site where the original damaged or modified base would have coded for a C, G, T, or U;
- the DNA polymerase is a DNA polymerase that does not follow the A rule, but instead inserts a cytosine (C) nucleotide opposite an abasic site for marking a site where the original damaged or modified base would have coded for an A, G, T, or U;
- the genetically engineered DNA polymerase is a TAQ polymerase without the 5'-3' exonuclease domain, or KLENTAQ™ or a derivative thereof, or the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain, or genetically engineered KLENTAQ™ or derivative thereof, comprises one or more variants chosen from M747K, I614K, I614N, L616I, G84A, D144G, K314R, E520G, F598L, A608V, E742G, E742K, D58G, R74P, A109T, L245R, R343G, G370D, E520G, N583S, E694K, and A743P, or the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain or KLENTAQ™ or derivative thereof comprises M747K, or the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain or KLENTAQ™ or derivative thereof comprises I614N and L616I, or the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain or KLENTAQ™ or derivative thereof comprises M747K, I614N, and L616I; and/or
- the modified base moiety is removed by a naturally occurring or genetically engineered glycosylase, or the glycosylase is a monofunctional glycosylase, or the glycosylase is a bifunctional glycosylase where the lyase activity has been ablated.
In alternative embodiments, provided are kits comprising materials for practicing methods as provided herein, and optionally also comprising instructions for practicing methods as provided herein.
The details of one or more exemplary embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
All publications, patents, patent applications cited herein are hereby expressly incorporated by reference in their entireties for all purposes.
DESCRIPTION OF DRAWINGS
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
The drawings set forth herein are illustrative of exemplary embodiments provided herein and are not meant to limit the scope of the invention as encompassed by the claims.
FIG. 1 schematically illustrates an exemplary method as provided herein for detecting abasic sites in double-stranded DNA. In this example, a DNA sequence contains a modified or damaged base that is an A, C, or G (=V*); the unmodified or undamaged base would normally base pair with a T, G, or C, respectively (=B). Enzymatic treatment removes the modified or damaged base, leaving the phosphodiester backbone intact. Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site. Amplification leads to a polymorphism at the original site of modification or damage.
FIG. 2 schematically illustrates an exemplary method as provided herein for detecting methylated cytosines. Tet2 oxidizes 5mC and 5hmC to 5fC and subsequently to 5caC, which is subsequently excised by thymine DNA glycosylase (TDG), leaving an abasic site; unmodified C’s are unchanged. When copied by Sulfolobus DNA polymerase IV or other appropriate polymerase with lesion bypass activity, the “A rule” is followed and an A is placed opposite the abasic site. In the return PCR reaction, the A codes for a T, resulting in C>T and G>A transitions in the complementary strands. Two distinct amplicons are produced: one from the top strand and one from the bottom strand. Next-generation sequencing (NGS) would always detect two amplicons, with each confirming the other as a site of DNA methylation; compared to a reference sequence, the method affects adjacent base pairs.
FIG. 3A-B schematically illustrates an exemplary method as provided herein for methylation and mutation assays that can be performed in the same container or reaction vessel, e.g., the same tube. The exemplary method can be performed in the same tube as mutation assays since the two are never confused: mutations produce only one amplicon type since mutations affect the same base pair. In FIG. 3A, the C>T change is clearly a mutation since only one amplicon type is produced. Even with a dinucleotide mutation, still only one amplicon type per mutation is seen, see FIG. 3B.
FIG. 4 schematically illustrates an exemplary method as provided herein for detecting N6-adenine methylation. 6mA deaminase or another appropriate amidohydrolase is used to convert 6-methyladenine to hypoxanthine, which is subsequently acted on by an AlkA family enzyme (hypoxanthine DNA glycosylases) to create an abasic site in the DNA. When copied by Sulfolobus DNA polymerase IV or other appropriate polymerase with lesion bypass activity, the “A rule” is followed and an A is placed opposite the abasic site where a T would have otherwise been inserted opposite the 6mA.
FIG. 5 illustrates a gel showing that treatment with TET2 protein enzymatically modifies 5-methylcytosine-containing DNA so that it can no longer be cut using the restriction endonuclease MspI, demonstrating successful oxidation to 5fC and/or 5caC: a PCR amplicon containing a single HpaII/MspI site was methylated using HpaII methyltransferase; cutting by MspI is unaffected by methylation of CpG islands by HpaII methyltransferase, as demonstrated in lane 1, where MspI digestion is complete; DNA used for lane 2 was further treated with TET2 protein; because MspI can cut 5mC- or 5hmC-containing DNA, but not 5fC- or 5caC-containing DNA, FIG. 5 demonstrates oxidation of methylated DNA to 5fC and/or 5caC, both of which cannot be cut by MspI but are suitable substrates for thymine DNA glycosylase (TDG).
FIG. 6A-B schematically illustrates a 60-base pair, double-stranded DNA containing a single 5caC nucleotide and experimental results that demonstrate thymine DNA glycosylase (TDG) excision of the modified base from the duplex:
FIG. 6A schematically illustrates a synthetic double-stranded DNA containing a 5caC nucleotide at the position marked by an asterisk (*), where the (upper) 5’ to 3’ strand is 5’-CACGCAGCTCATGCCCTTCGGCTGCCTCCTGGACTATGTC*GGGAACACAAAGACAATAT-3’ (SEQ ID NO:1), and the (lower) 3’ to 5’ strand is 3’-GTGCGTCGAGTACGGGAAGCCGACGGAGGACCTGATACAGGCCCTTGTGTTTCTGTTATA-5’ (SEQ ID NO:2); and
FIG. 6B illustrates a gel that displays results on an AGILENT TAPE STATION™ after the DNA duplex of FIG. 6A has been denatured by treatment with sodium hydroxide plus heat, then neutralized with hydrochloric acid (Lane 1) to yield two separate, intact DNA strands that migrate as an unresolved, broad band; Lane 2 shows the same duplex that is first treated with thymine DNA glycosylase (TDG) to remove the 5caC base moiety from the duplex, leaving an abasic site; subsequent treatment with sodium hydroxide plus heat results in strand scission at the abasic site in addition to denaturation of the DNA duplex, resulting in one intact DNA strand of 60 nucleotides (the bottom strand depicted in FIG. 6A) and two fragments of 39 nucleotides and 19 nucleotides derived from the top strand depicted in FIG. 6A; after neutralization with hydrochloric acid, the 60-nucleotide and 39-nucleotide bands are now resolved on the AGILENT TAPE STATION™ and the 19-nucleotide fragment is seen running at about the same position as the lower gel marker, as expected.
FIG. 7A-B schematically illustrate a primer-template pair, where the template contains an abasic site; a full-length, extended primer; and experimental results that demonstrate two DNA polymerases that can bypass the abasic site to produce full-length extended products:
FIG. 7A schematically illustrates a partially double-stranded DNA consisting of a 60-nucleotide template, 3’-GTGCGTCGAGTACGGGAAGCCGACGGAGGACCTGATACAGG*CCTTGTGTTTCTGTTATA-5’ (SEQ ID NO:3), containing an abasic site (marked by an asterisk (*)), and a 21-nucleotide primer, 5’-ATGCCCTTCGGCTGCCTCCTG-3’ (SEQ ID NO:4), annealed to the template; and the expected product of 50 nucleotides produced by a DNA polymerase with lesion bypass activity extending the primer to the end of the template (i.e., a 50-nucleotide product (SEQ ID NO:5) results from the extension of the 21-nucleotide primer (SEQ ID NO:4) on the 60-nucleotide template (SEQ ID NO:3)): 5’-ATGCCCTTCGGCTGCCTCCTGGACTATGTCCAGGAACACAAAGACAATAT-3’ (SEQ ID NO:5); and
FIG. 7B illustrates a gel that shows experimental results using the 21-nucleotide primer (SEQ ID NO:4) annealed to the 60-nucleotide template (SEQ ID NO:3); Lane 1: no polymerase; Lane 2: a double mutant of a klentaq (a TAQ polymerase without the 5'-3' exonuclease domain) as described by Patel et al. JBC. 2001; 276(7):5044-5051; Lane 3: a single mutant of klentaq as described by Gloeckner C, et al. Angew Chem Int Ed Engl. 2007; 46(17):3115-3117. Both polymerases exhibit bypass activity, producing the full-length 50-nucleotide product (SEQ ID NO:5) from the 21-nucleotide primer (SEQ ID NO:4) annealed to the 60-nucleotide template (SEQ ID NO:3).
Like reference symbols in the various drawings indicate like elements.
DETAILED DESCRIPTION
In alternative embodiments, provided is a new approach to determining sites of damaged or modified bases in a nucleic acid (or polynucleotide) or deoxyribonucleic acid (DNA). In alternative embodiments, methods as provided herein comprise the removal of one or more damaged or modified bases in the nucleic acid by an enzyme (e.g., by enzymatic or equivalent means) to create one or more abasic sites in a manner that leaves the phosphodiester backbone of the nucleic acid intact; followed by synthesis of a reverse complement to the original, now abasic, nucleic acid strand or strands using an enzyme with a DNA polymerase activity that possesses lesion bypass activity such that a nucleotide residue is incorporated in the reverse complement that has a different identity from the nucleotide that would have been incorporated if the abasic site had not been created in the original DNA molecule.
In alternative embodiments, the removal of the damaged or modified base is affected by, or is facilitated by, the use of one or more enzymes or polypeptides having glycosylase activity, for example, a monofunctional glycosylase, i.e., an enzyme having only glycosylase activity. The monofunctional glycosylase removes its cognate base (i.e., the damaged or modified base) to create an abasic site in a manner that leaves the phosphodiester backbone of the nucleic acid intact. The monofunctional glycosylase may be naturally occurring or the product of genetic engineering. One example of a genetically engineered monofunctional glycosylase is a bifunctional glycosylase that has been mutated in such a manner so as to retain its glycosylase activity, but not its lyase activity.
In alternative embodiments, the modified or damaged base may first need to be acted on by an enzyme without glycosylase activity that converts the modified or damaged base into a form that is subsequently recognized and acted upon by the glycosylase to create an abasic site. For example, Tet (ten eleven translocation) proteins can convert 5mC to 5hmC, then to 5fC and then to 5caC (see e.g., Ito S et al. Science. 2011; 333(6047): 1300-1303; Shi DQ, et al. Front Genet. 2017; 8:100; Yin et al. J Am Chem Soc. 2013;135(28):10396-10403); both 5fC and 5caC can be removed from double-stranded DNA by thymine DNA glycosylase (TDG) leaving abasic sites; TDG does not affect unmodified cytosines or 5mC or 5hmC (see e.g., Maiti A, Drohat AC. J Biol Chem. 2011, 286(41):35334-35338; He YF, et al. Science 2011; 333 (6047): 1303-1307; Bennett MT et al. 2006, J. Am. Chem. Soc. 128, 12510-12519).
Another example, depicted in Figure 4, is the conversion of 6mA to hypoxanthine by 6mA deaminase, followed by base excision by an AlkA family enzyme (e.g., a hypoxanthine DNA glycosylase) to create an abasic site in the DNA (see e.g., O'Brown ZK, Greer EL. N6-Methyladenine: A Conserved and Dynamic DNA Mark. Adv Exp Med Biol. 2016;945:213-246).
In alternative embodiments, the enzyme with a first modifying activity (such as for example a glycosylase) may recognize more than one type of modified or damaged base. In such instances, prior treatment by an enzyme with a second modifying activity may protect one or more of the types of modified or damaged bases while leaving one or more of the types of modified or damaged bases available for the first modifying enzyme, which converts the available base to a form that may be subsequently removed to create an abasic site (note that in this alternative exemplary embodiment the “second modifying activity” is used before the “first modifying activity”). Comparison of abasic sites without protective pretreatment to abasic sites with protective pretreatment helps define the identity of the modified or damaged bases. An example of this would be treatment of a DNA that contains both 5mC and 5hmC with T4 β-glucosyltransferase (βGT). 5hmC, but not 5mC, is converted to 5-glucosyl-hmC (5ghmC), which protects it from oxidation by Tet proteins. In alternative embodiments, the method then further comprises heat inactivation of the βGT, e.g., at 65°C for 10 minutes, followed by treatment with a Tet protein to oxidize only the 5mC (not the 5ghmC) to 5fC, then 5caC (see, e.g., Yu M, et al. Cell. 2012; 149(6):1368-1380), both of which can be removed by the monofunctional glycosylase, TDG, to leave an abasic site.
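The comparison of the protected and unprotected treatments described above can be sketched as simple set logic. This is only an illustration, assuming that each arm of a split sample has already been reduced to a set of marked positions; the function name and the position values are hypothetical and not part of the disclosure.

```python
def assign_cytosine_marks(marked_without_bgt, marked_with_bgt):
    """Assign marked positions to 5mC or 5hmC from a split-sample comparison.

    marked_without_bgt: positions marked when both 5mC and 5hmC are oxidized
                        and excised (no protective pretreatment).
    marked_with_bgt:    positions marked after glucosylation has protected
                        5hmC, so that only 5mC is oxidized and excised.
    Both inputs are hypothetical sets of integer positions.
    """
    without = set(marked_without_bgt)
    five_mc = set(marked_with_bgt)
    return {
        "5mC": five_mc,                   # still marked despite protection
        "5hmC": without - five_mc,        # marked only when unprotected
        "unexplained": five_mc - without, # seen only in the protected arm
    }


calls = assign_cytosine_marks({10, 25, 40}, {10, 40})
print(calls["5mC"], calls["5hmC"])
```

Positions that appear only in the protected arm are reported separately, since under this simplified model they cannot be explained by either modification.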
Subsequent to the generation (or creation) of the abasic sites, a reverse complement of the nucleic acid containing one or more, or all, of the abasic sites in the nucleic acid is made in a manner that incorporates in the reverse complement opposite the abasic site(s) a nucleotide that has a different identity from what the original modified nucleotide would have coded for. In some instances, the reverse complement is made by providing a sequence-specific nucleic acid primer that anneals to the nucleic acid strand containing the abasic site and is extended across it by a DNA polymerase that has lesion bypass activity, for example, as described in Gloeckner C, et al. Angew Chem Int Ed Engl. 2007; 46(17):3115-3117 or by Patel et al. JBC. 2001; 276(7):5044-5051. In alternative embodiments, a DNA polymerase that has lesion bypass activity is a Y-family DNA polymerase, e.g., a naturally occurring Y-family DNA polymerase, e.g., the Sulfolobus solfataricus P2 DNA Polymerase IV, which follows the “A rule” and inserts an A opposite an abasic site, e.g., as described by Boudsocq F et al. Nucleic Acids Res. 2001; 29(22):4607-4616; and in alternative embodiments DNA polymerases that follow the A rule and insert an A opposite an abasic site are used in marking sites where the original damaged or modified base would have coded for a C, G, T, or U; and in alternative embodiments the DNA polymerase that has lesion bypass activity is combined with one or more other (or additional) DNA polymerases to increase efficiency in making the reverse complement, for example, the other DNA polymerase can be the highly processive PHUSION HOT START II™ DNA polymerase (ThermoFisher) or Q5® HOT START HIGH-FIDELITY™ DNA polymerase (New England BioLabs).
In alternative embodiments, a Saccharomyces cerevisiae Y-family DNA polymerase is used; for example, a Rev1 enzyme is used which specifically incorporates dCMP opposite abasic sites (Nair DT, et al. J Mol Biol. 2011;406(1):18-28).
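The primer extension step can be illustrated with the sequences of FIG. 7A. The following is a minimal sketch, not a model of any enzyme: it assumes complete extension to the end of the template and that a single fixed base (A by default, per the “A rule”) is inserted opposite the abasic site. The function and variable names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
ABASIC = "*"  # abasic site marker, as used in FIG. 6A and FIG. 7A

# Sequences from FIG. 7A. The template is written 3' to 5', as in the figure.
TEMPLATE_SEQ_ID_3 = "GTGCGTCGAGTACGGGAAGCCGACGGAGGACCTGATACAGG*CCTTGTGTTTCTGTTATA"
PRIMER_SEQ_ID_4 = "ATGCCCTTCGGCTGCCTCCTG"
PRODUCT_SEQ_ID_5 = "ATGCCCTTCGGCTGCCTCCTGGACTATGTCCAGGAACACAAAGACAATAT"


def extend_primer(template_3to5, primer_5to3, bypass_base="A"):
    """Extend a primer to the end of a template, inserting `bypass_base`
    opposite any abasic site. Returns the product written 5' to 3'."""
    # Complementing a 3'->5' template position by position gives the
    # 5'->3' strand that pairs with it; 'N' stands in at the abasic site.
    paired = "".join(COMPLEMENT.get(base, "N") for base in template_3to5)
    start = paired.find(primer_5to3)
    if start < 0:
        raise ValueError("primer does not anneal to the template")
    product = list(primer_5to3)
    for base in template_3to5[start + len(primer_5to3):]:
        product.append(bypass_base if base == ABASIC else COMPLEMENT[base])
    return "".join(product)


product = extend_primer(TEMPLATE_SEQ_ID_3, PRIMER_SEQ_ID_4)
print(len(product), product == PRODUCT_SEQ_ID_5)
```

Under these assumptions the simulated product is 50 nucleotides long and matches SEQ ID NO:5, with an A at the position opposite the abasic site. Passing `bypass_base="C"` models a polymerase that inserts a C instead.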
In alternative embodiments, DNA polymerases that insert a C opposite an abasic site are used for marking sites where the original damaged or modified base would have coded for an A, G, T, or U.
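The choice between a polymerase that inserts an A and one that inserts a C follows from the requirement that the inserted nucleotide differ from the one the original base would have coded for. A short sketch of that reasoning, using only standard Watson-Crick pairing and a hypothetical helper name:

```python
# Nucleotide that each unmodified template base would normally code for.
NORMAL_PARTNER = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def detectable_template_bases(inserted_base):
    """Template bases whose abasic site is marked by a sequence change when
    the polymerase inserts `inserted_base` opposite the abasic site."""
    return sorted(
        base
        for base, partner in NORMAL_PARTNER.items()
        if partner != inserted_base
    )


print(detectable_template_bases("A"))  # polymerase that follows the A rule
print(detectable_template_bases("C"))  # polymerase that inserts C
```

For an A-inserting polymerase this gives template A, C, and G, which corresponds to the V* position of FIG. 1; a damaged T or U would not be marked, because an A would have been inserted there anyway. A C-inserting polymerase covers T and U but not G.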
In alternative embodiments, the S. cerevisiae Y-family DNA polymerase used is Pol η (encoded by RAD30), which can bypass a T-T cyclobutane pyrimidine dimer (CPD) accurately and efficiently.
In alternative embodiments, a Homo sapiens Y-family DNA polymerase is used; for example, a REV1 enzyme is used which specifically incorporates dCMP opposite abasic sites, and acts as a scaffold protein that interacts with the Y-family polymerases Pol η, Pol ι and Pol κ, as well as the B-family TLS polymerase Pol ζ (which is comprised of Rev3 and Rev7).
In alternative embodiments, the Homo sapiens Y-family DNA polymerase used is Pol η (encoded by POLH or RAD30), which bypasses a T-T CPD efficiently and with the same accuracy as undamaged DNA, and generates mutations at A-T base pairs during immunoglobulin gene somatic hypermutation.
In alternative embodiments, the Homo sapiens Y-family DNA polymerase used is Pol ι, encoded by POLI (also known as RAD30B), which has unique replication fidelity, replicating template dA reasonably accurately, but replicating template dT in a highly error-prone manner. In alternative embodiments, the Homo sapiens Y-family DNA polymerase used is Pol κ, encoded by POLK, an enzyme that is prone to making -1 frameshift mutations but can accurately and efficiently bypass a number of N2-dG lesions.
In alternative embodiments, a genetically engineered DNA polymerase with enhanced bypass activity is used, for example, as described by Gloeckner C, et al. Angew Chem Int Ed Engl. 2007; 46(17):3115-3117.
In alternative embodiments, the difference(s) in nucleotide sequence that occurs through the creation of abasic sites followed by making a reverse complement of the nucleic acid containing one or more abasic sites in a manner that incorporates in the reverse complement opposite the abasic site(s) a nucleotide that has a different identity from what the original modified nucleotide would have coded for are detected by e.g., sequencing, hybridization, mass spectrometry, TaqMan and equivalents or any other technology that can detect sequence changes, including those with nucleotide resolution.
Products of manufacture and Kits
Provided are products of manufacture and kits for practicing methods as provided herein; and optionally, products of manufacture and kits can further comprise instructions for practicing methods as provided herein.
Any of the above aspects and embodiments can be combined with any other aspect or embodiment as disclosed here in the Summary, Figures and/or Detailed Description sections.
As used in this specification and the claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise.
Unless specifically stated or obvious from context, as used herein, the term “or” is understood to be inclusive and covers both “or” and “and”.
Unless specifically stated or obvious from context, as used herein, the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. About can be understood as within 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value. Unless otherwise clear from the context, all numerical values provided herein are modified by the term “about.” Unless specifically stated or obvious from context, as used herein, the terms “substantially all”, “substantially most of”, “substantially all of”, or “majority of” encompass at least about 90%, 95%, 97%, 98%, 99% or 99.5%, or more of a referenced amount of a composition.
The entirety of each patent, patent application, publication and document referenced herein hereby is incorporated by reference. Citation of the above patents, patent applications, publications and documents is not an admission that any of the foregoing is pertinent prior art, nor does it constitute any admission as to the contents or date of these publications or documents. Incorporation by reference of these documents, standing alone, should not be construed as an assertion or admission that any portion of the contents of any document is considered to be essential material for satisfying any national or regional statutory disclosure requirement for patent applications. Notwithstanding, the right is reserved for relying upon any of such documents, where appropriate, for providing material deemed essential to the claimed subject matter by an examining authority or court.
Modifications may be made to the foregoing without departing from the basic aspects of the invention. Although the invention has been described in substantial detail with reference to one or more specific embodiments, those of ordinary skill in the art will recognize that changes may be made to the embodiments specifically disclosed in this application, and yet these modifications and improvements are within the scope and spirit of the invention. The invention illustratively described herein suitably may be practiced in the absence of any element(s) not specifically disclosed herein. Thus, for example, in each instance herein any of the terms "comprising", "consisting essentially of", and "consisting of" may be replaced with either of the other two terms. Thus, the terms and expressions which have been employed are used as terms of description and not of limitation, equivalents of the features shown and described, or portions thereof, are not excluded, and it is recognized that various modifications are possible within the scope of the invention. Embodiments of the invention are set forth in the following claims.
The invention will be further described with reference to the examples described herein; however, it is to be understood that the invention is not limited to such examples.
EXAMPLES
Unless stated otherwise in the Examples, all recombinant DNA techniques are carried out according to standard protocols, for example, as described in Sambrook et al. (2012) Molecular Cloning: A Laboratory Manual, 4th Edition, Cold Spring Harbor Laboratory Press, NY and in Volumes 1 and 2 of Ausubel et al. (1994) Current Protocols in Molecular Biology, Current Protocols, USA. Other references for standard molecular biology techniques include Sambrook and Russell (2001) Molecular Cloning: A Laboratory Manual, Third Edition, Cold Spring Harbor Laboratory Press, NY, Volumes I and II of Brown (1998) Molecular Biology LabFax, Second Edition, Academic Press (UK). Standard materials and methods for polymerase chain reactions can be found in Dieffenbach and Dveksler (1995) PCR Primer: A Laboratory Manual, Cold Spring Harbor Laboratory Press, and in McPherson et al. (2000) PCR - Basics: From Background to Bench, First Edition, Springer Verlag, Germany.
Example 1: Detection of Methylated Cytosines
This example demonstrates exemplary methods for detecting methylated cytosine base residues.
The presence of 5mC and 5hmC in DNA can be detected by treatment of the sample with an oxidizing enzyme that would convert 5mC to 5hmC, and 5hmC to 5fC, which might be further oxidized to 5caC; a non-limiting example of such an enzyme is ten-eleven-translocation enzyme, type 2 (TET2). Subsequent treatment with another enzyme that removes both 5fC and 5caC would leave an abasic site. A non-limiting example of such an enzyme is thymine DNA glycosylase; TDG has long been known to remove thymine from G/T mismatches, but is now known to efficiently remove the oxidized residues 5fC and 5caC from DNA when paired with G, both in vivo and in vitro, leaving abasic sites. Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site; non-limiting examples of such an enzyme include Sulfolobus DNA polymerase IV, a thermostable Y-family polymerase that bypasses lesions (including abasic sites) in the template DNA; and genetically engineered KLENTAQ™ mutants, including M747K. Since the original 5mC or 5hmC would have directed the insertion of a G residue in the reverse complement, the presence of the A residue acts to mark the site of 5mC or 5hmC. One can differentiate between 5mC and 5hmC by pretreatment of the DNA with the T4 phage-derived enzyme β-glucosyltransferase (βGT), which converts 5hmC to 5-glucosyl-hmC (5ghmC), protecting it from oxidation and thus from subsequent base excision by the glycosylase. Comparison of a split sample, one part treated with βGT and one untreated, would allow one to differentiate between 5mC and 5hmC sites.
This exemplary method includes distinct advantages over current methods⁵,⁶, including using the entirety of a sample for the simultaneous detection of damaged or modified bases without interfering with the detection of alternative alleles, including mutations, at other sites. Using methylated cytosines in DNA as an example, bisulfite-based approaches and newer enzymatic approaches both convert all unmethylated cytosines (C) to thymine (T) signals while leaving 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) as C signals in subsequent DNA sequencing analyses. The present invention does just the opposite: it leaves all unmodified C’s untouched and converts only the 5mC and 5hmC residues into T signals. This simplifies assay design and sequence analysis. The present invention provides compatibility with assays for allelic variation and mutation detection as follows: with the new method, a C>T change due to a methylated C in a CpG island would be correctly scored as a methylated C since it would produce two amplicon types as shown in Figure 2; similarly, any C>T change caused by allelic variation or by mutation would also be correctly scored (i.e., as an allelic variation or mutation) since each would produce only one amplicon type as shown in Figure 3. In bisulfite and other methods that convert unmodified C’s to T’s, the conversion would confound mutation assays where a C>T mutation occurs. By way of example, the EGFR T790M mutation, an important drug resistance marker in lung cancer, results from the 2369C>T transition; current methods that convert the unmethylated wild type C to T, identical to the mutant, would render the assay uninterpretable. Similarly, since a C>T transition in one strand results in a G>A transition in the other strand, G to A mutations also could not be scored using such methods. Table 1 includes a short list of important mutation assays affected by C>T or G>A transitions.
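The distinction between two amplicon types (methylation) and one amplicon type (mutation) can be made concrete with a toy simulation. This sketch assumes complete conversion of a symmetrically methylated CpG and ignores unmethylated molecules; the 8-base reference sequence and all function names are hypothetical.

```python
COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a 5'->3' DNA string."""
    return seq.translate(COMP)[::-1]


def amplicons_after_marking(top, methylated_cpg_starts):
    """Amplicon types (in top-strand orientation) derived from the top and
    bottom strands when each methylated C of a CpG is read as T."""
    bottom = revcomp(top)
    top_read, bottom_read = list(top), list(bottom)
    for i in methylated_cpg_starts:
        if top[i:i + 2] != "CG":
            raise ValueError(f"no CpG at position {i}")
        top_read[i] = "T"                    # C>T on the top strand
        bottom_read[len(top) - 2 - i] = "T"  # paired CpG cytosine, bottom strand
    return "".join(top_read), revcomp("".join(bottom_read))


def differences(reference, amplicon):
    """List of (position, reference base, amplicon base) mismatches."""
    return [(i, r, a) for i, (r, a) in enumerate(zip(reference, amplicon)) if r != a]


def classify(reference, amplicon_types):
    """Two complementary amplicon types at adjacent base pairs indicate a
    methylated CpG; a single changed amplicon type indicates a mutation."""
    changed = sorted({tuple(differences(reference, a)) for a in amplicon_types} - {()})
    if len(changed) == 2 and all(len(d) == 1 for d in changed):
        (i, r1, a1), (j, r2, a2) = changed[0][0], changed[1][0]
        if (r1, a1, r2, a2) == ("C", "T", "G", "A") and j == i + 1:
            return "methylated CpG"
    return "mutation or allelic variant" if changed else "no change"


reference = "GATCGTCA"
print(classify(reference, amplicons_after_marking(reference, [3])))
print(classify(reference, ["GATTGTCA"]))  # C>T present on both strands
```

In this toy case the methylated CpG yields a C>T amplicon from the top strand and a G>A amplicon, one base pair over, from the bottom strand, whereas a true C>T mutation or a dinucleotide mutation yields a single amplicon type.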
With current methods, DNA samples would have to be split into separate methylation and mutation processes. This would double the number of samples to be analyzed and cut in half the amount of DNA available for each analysis. This would especially impact assays where scant quantities of DNA are available (e.g. circulating cell-free DNA). In alternative embodiments, mutation and methylation assays are multiplexed and the entire DNA sample used. More data points would mean a higher probability of detecting a mutation if present (increased sensitivity) and increase confidence that an individual is indeed disease-free if the mutation is not detected (specificity). Since only one assay is needed for both mutation and methylation analysis, both workload and costs are dramatically decreased.
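The effect of splitting a scarce sample can be illustrated with a simple sampling model. The model is an assumption introduced here, not part of the disclosure: each analyzed DNA molecule is treated as independently mutant with a fixed probability, and detection is taken to require at least one mutant molecule in the analyzed portion. The numbers are purely illustrative.

```python
def detection_probability(mutant_fraction, molecules_sampled):
    """Probability that at least one mutant molecule is present among the
    sampled molecules, assuming independent sampling (a simplification)."""
    return 1.0 - (1.0 - mutant_fraction) ** molecules_sampled


# Hypothetical figures: 1000 template molecules, 0.1% of which are mutant.
whole_sample = detection_probability(0.001, 1000)
half_sample = detection_probability(0.001, 500)
print(round(whole_sample, 2), round(half_sample, 2))
```

With these illustrative figures, the chance of capturing at least one mutant molecule is about 0.63 for the whole sample and about 0.39 for half of it, which is the sense in which analyzing the entire sample in one assay increases sensitivity.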
[Table 1: image not reproduced]
Table 1: Major mutation assays confounded by treatment with current methylation analysis assays.
Example 2: Detection of N6-adenine methylation
This example demonstrates exemplary methods for detecting N6-adenine methylation.
N6-adenine methylation (6mA) of DNA may play an important role in early development and is significantly decreased in abundance in cancer cells and tissues compared to normal counterparts. One method of application of the present invention to the detection of 6mA is depicted in Figure 4 where 6mA deaminase or another appropriate amidohydrolase is used to convert 6-methyladenine to hypoxanthine, which is subsequently acted on by AlkA family enzymes (hypoxanthine DNA glycosylases) to create an abasic site in the DNA. Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site; non-limiting examples of such an enzyme include Sulfolobus DNA polymerase IV, a thermostable Y-family polymerase that bypasses lesions (including abasic sites) in the template DNA; and genetically engineered KLENTAQ™ mutants, including M747K. Since the original 6mA would have directed the insertion of a T residue in the reverse complement, the presence of the A residue acts to mark the site of 6mA. Example 3 : Detection of 8-oxo-dG in DNA
This example demonstrates exemplary methods for detection of 8-oxo-dG in DNA.
8-oxoguanine (8-oxo-dG) is an example of oxidative base damage that is associated with several forms of cancer. 8-oxo-dG is recognized by the E. coli enzyme Fpg, the yeast enzyme Ogg1 and the human enzyme hOGG1. All three are bifunctional glycosylases with both glycosylase activity and lyase activity.
Elimination of the lyase activity by mutagenesis or inhibition that leaves the N-glycosylase activity intact allows the damaged 8-oxo-dG to be removed, leaving an abasic site. Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site; non-limiting examples of such an enzyme include Sulfolobus DNA polymerase IV, a thermostable Y-family polymerase that bypasses lesions (including abasic sites) in the template DNA; and genetically engineered KLENTAQ™ mutants, including M747K. Since an undamaged G would have directed the insertion of a C residue in the reverse complement, the presence of the A residue acts to mark the site of 8-oxo-dG.
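The marking logic described in this example can be sketched in a few lines of code. The following is a toy, sequence-level simulation only: the function names, the `_` placeholder for an abasic site, and the example sequence are all hypothetical choices made for illustration, and the sketch assumes complete excision and strict A-rule insertion, neither of which is guaranteed in practice.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
ABASIC = "_"  # placeholder symbol for an abasic site (illustrative choice)


def excise(template: str, damaged_positions: set) -> str:
    """Replace each damaged base (0-based position) with an abasic placeholder."""
    return "".join(
        ABASIC if i in damaged_positions else base
        for i, base in enumerate(template)
    )


def copy_with_a_rule(template: str) -> str:
    """Reverse complement of `template`, inserting A opposite abasic sites."""
    return "".join(
        "A" if base == ABASIC else COMPLEMENT[base]
        for base in reversed(template)
    )


def call_marked_sites(reference: str, copy: str) -> list:
    """Template positions where the copy differs from the expected reverse complement."""
    expected = copy_with_a_rule(reference)  # the reference has no abasic sites
    n = len(reference)
    return sorted(
        n - 1 - j
        for j, (e, c) in enumerate(zip(expected, copy))
        if e != c
    )


reference = "ATCGGTAC"   # hypothetical template; the G at position 4 is 8-oxo-dG
damaged = {4}
copy = copy_with_a_rule(excise(reference, damaged))
print(copy)
print(call_marked_sites(reference, copy))
```

Comparing the copy against the reverse complement expected from an undamaged reference recovers the damaged position, because the copy carries an A where a C would otherwise appear.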
Example 4: Detection of Damaged Bases in DNA
This example demonstrates exemplary methods for detection of damaged bases in DNA.
In contrast to many glycosylases that are involved in the excision of specific lesions, the human alkyladenine-DNA glycosylase (hAAG) catalyzes the excision of a variety of modified bases, including 3-methyladenine (3MeA), 7-methylguanine (7MeG), and 1,N6-ethenoadenine (εA). Excision of these damaged bases by hAAG would leave abasic sites. Synthesis of a reverse complement by a DNA polymerase with lesion bypass activity that follows the “A rule” would result in the insertion of an A residue opposite the abasic site; non-limiting examples of such an enzyme include Sulfolobus DNA polymerase IV, a thermostable Y-family polymerase that bypasses lesions (including abasic sites) in the template DNA; and genetically engineered KLENTAQ™ mutants, including M747K; and the double mutant M747K. Since an undamaged G would have directed the insertion of a C residue in the reverse complement, and an undamaged A would have directed the insertion of a T residue in the reverse complement, the presence of the A residue acts to mark the site of damaged DNA bases recognized by hAAG.
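The readout for damaged A and damaged G sites can be tabulated with a short sketch. The helper name `a_rule_readout` and its return format are hypothetical, and the sketch assumes an idealized polymerase that always inserts A opposite an abasic site.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def a_rule_readout(template_base: str) -> dict:
    """Expected vs. observed copy base when `template_base` becomes an abasic site."""
    expected = COMPLEMENT[template_base]  # base an undamaged template would direct
    observed = "A"                        # A-rule insertion opposite the abasic site
    return {
        "expected": expected,
        "observed": observed,
        "informative": expected != observed,
    }


# The hAAG substrates named in the text derive from A (3MeA, ethenoadenine) and G (7MeG).
for base in "AG":
    print(base, a_rule_readout(base))
```

Calling `a_rule_readout("T")` returns an uninformative result in this sketch, since an undamaged T already directs an A; under the A rule, only sites whose undamaged base would direct something other than A produce a visible mark.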
A number of embodiments of the invention have been described. Nevertheless, it can be understood that various modifications may be made without departing from the spirit and scope of the invention. Accordingly, other embodiments are within the scope of the following claims.

Claims

WHAT IS CLAIMED IS:
1. A method for detecting a modified or a damaged base moiety in a nucleic acid (or polynucleotide, or deoxyribonucleic acid (DNA)), comprising:
(a) removing, or modifying and then removing, at least one modified or damaged base moiety from a nucleic acid strand to create a corresponding abasic site; and,
(b) making or generating a reverse complement copy of the nucleic acid strand that contains or comprises the at least one made or generated abasic site, wherein the making or generating comprises incorporating in the reverse complement copy opposite the abasic site a nucleotide that has a different identity from what the original modified or damaged nucleotide would have coded for.
2. The method of claim 1, wherein the modified or damaged base moiety comprises a modified or damaged adenine (A), cytosine (C), guanine (G), thymine (T), or uracil (U).
3. The method of claim 2, wherein the modified or damaged base moiety is 5-methylcytosine (5meC), 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), or 5-carboxylcytosine (5caC).
4. The method of claim 3, wherein a modified or damaged cytosine is oxidized to a 5fC or a 5caC.
5. The method of claim 4, wherein the modified or damaged cytosine is oxidized chemically to a 5fC or a 5caC.
6. The method of claim 4, wherein the modified or damaged cytosine is oxidized enzymatically to a 5fC or a 5caC.
7. The method of claim 6, wherein the enzyme is a Ten-eleven Translocation (TET) enzyme, optionally a naturally occurring TET enzyme.
8. The method of claim 7, wherein the enzyme is a genetically engineered form of a Ten-eleven Translocation (TET) enzyme.
9. The method of claims 7 or 8, wherein the enzyme is TET1, TET2, or TET3.
10. The method of claim 5 or 6, wherein the 5fC or 5caC is removed, or modified enzymatically and then removed, to create an abasic site.
11. The method of claim 10, wherein the enzyme is thymine DNA glycosylase (TDG), optionally a naturally occurring form of thymine DNA glycosylase (TDG).
12. The method of claim 11, wherein the enzyme is a genetically engineered form of thymine DNA glycosylase (TDG).
13. The method of claim 2 wherein the modified base moiety is N6-methyladenine (6mA).
14. The method of claim 13, wherein the modified base moiety 6mA is converted to hypoxanthine by a 6mA deaminase.
15. The method of claim 14, further comprising (followed by) base excision by an AlkA family enzyme (optionally a hypoxanthine DNA glycosylase) to create an abasic site.
16. The method of claim 1, wherein the reverse complement copy is made by comprising use of a DNA polymerase with lesion bypass activity allowing it to incorporate a nucleotide opposite to the abasic site.
17. The method of claim 16, wherein the DNA polymerase is naturally occurring.
18. The method of claim 16, wherein the DNA polymerase is a Y-family polymerase.
19. The method of claim 16, wherein the DNA polymerase is a DNA polymerase that follows the A rule and inserts an adenine (A) nucleotide opposite an abasic site for marking a site where the original damaged or modified base would have coded for a C, G, T, or U.
20. The method of claim 16, wherein the DNA polymerase is a DNA polymerase that does not follow the A rule, but instead inserts a cytosine (C) nucleotide opposite an abasic site for marking a site where the original damaged or modified base would have coded for an A, G, T, or U.
21. The method of claim 16, wherein the DNA polymerase is genetically engineered.
22. The method of claim 21, wherein the genetically engineered polymerase is a Y-family DNA polymerase.
23. The method of claim 21, wherein the genetically engineered DNA polymerase is TAQ polymerase without the 5'-3' exonuclease domain, or KLENTAQ™ or a derivative thereof.
24. The method of claim 22, wherein the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain or KLENTAQ™ or derivative thereof comprises one or more variants chosen from M747K, I614K, I614N, L616I, G84A, D144G, K314R, E520G, F598L, A608V, E742G, E742K, D58G, R74P, A109T, L245R, R343G, G370D, E520G, N583S, E694K, and A743P.
25. The method of claim 22, wherein the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain or KLENTAQ™ or derivative thereof comprises M747K.
26. The method of claim 22, where the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain or KLENTAQ™ or derivative thereof comprises I614N and L616I.
27. The method of claim 22, wherein the genetically engineered TAQ polymerase without the 5'-3' exonuclease domain or KLENTAQ™ or derivative thereof comprises M747K, I614N, and L616I.
28. The method of claim 1, wherein the modified base moiety is removed by a naturally occurring or genetically engineered glycosylase.
29. The method of claim 28, wherein the glycosylase is a monofunctional glycosylase.
30. The method of claim 28, wherein the glycosylase is a bifunctional glycosylase where the lyase activity has been ablated.
31. A kit comprising materials, optionally enzymes, for practicing a method of any of the preceding claims, and optionally further comprising instructions for practicing a method of any of the preceding claims.
PCT/US2021/036577 2020-06-10 2021-06-09 Methods for identifying modified bases in a polynucleotide WO2021252603A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202063037408P 2020-06-10 2020-06-10
US63/037,408 2020-06-10

Publications (1)

Publication Number Publication Date
WO2021252603A1 true WO2021252603A1 (en) 2021-12-16

Family

ID=78846513

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/036577 WO2021252603A1 (en) 2020-06-10 2021-06-09 Methods for identifying modified bases in a polynucleotide

Country Status (1)

Country Link
WO (1) WO2021252603A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024083982A1 (en) * 2022-10-21 2024-04-25 F. Hoffmann-La Roche Ag Detection of modified nucleobases in nucleic acid samples

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020132249A1 (en) * 2000-01-14 2002-09-19 Patel Premal H. DNA polymerase mutant having one or more mutations in the active site
US20060134631A1 (en) * 1996-01-09 2006-06-22 Krokan Hans E Novel DNA glycosylases and their use
US20170233802A1 (en) * 2008-12-11 2017-08-17 Pacific Biosciences Of California, Inc. Classification of nucleic acid templates
US20190040457A1 (en) * 2015-05-12 2019-02-07 Wake Forest University Health Sciences Identification of genetic modifications
US20190185919A1 (en) * 2015-10-30 2019-06-20 New England Biolabs, Inc. Compositions and Methods for Analyzing Modified Nucleotides
US20190249239A1 (en) * 2016-09-12 2019-08-15 Technische Hochschule Wildau Convertible adapters

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ITO ET AL.: "Tet proteins can convert 5-methylcytosine to 5-formylcytosine and 5-carboxylcytosine", SCIENCE, vol. 333, no. 6047, 21 July 2011 (2011-07-21), pages 1300 - 1303, XP055101432, DOI: 10.1126/science.1210597 *
O’BROWN ZACH KLAPHOLZ, GREER ERIC LIEBERMAN: "N6-methyladenine: a conserved and dynamic DNA mark", ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY, vol. 945, 3 February 2017 (2017-02-03), pages 213 - 246, XP055883337 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21822387

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21822387

Country of ref document: EP

Kind code of ref document: A1