WO2021163561A1 - Variant family d dna polymerases - Google Patents

Variant family d dna polymerases

Info

Publication number
WO2021163561A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
sequence
identity
positions
polymerase
Prior art date
Application number
PCT/US2021/017956
Other languages
French (fr)
Inventor
Andrew F. Gardner
Kelly M. ZATOPEK
Thomas C. Evans
Ece ALPASLAN
Original Assignee
New England Biolabs, Inc.
Priority date
Filing date
Publication date
Application filed by New England Biolabs, Inc.
Priority to EP21710368.8A (EP4103701A1)
Publication of WO2021163561A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1252 DNA-directed DNA polymerase (2.7.7.7), i.e. DNA replicase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y207/00 Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/07 Nucleotidyltransferases (2.7.7)
    • C12Y207/07007 DNA-directed DNA polymerase (2.7.7.7), i.e. DNA replicase

Definitions

  • DNA polymerases are classified into Families A, B, C, D, X, Y and RT according to their amino acid sequences. DNA polymerases have several properties that contribute to their replicative fidelity. For example, some DNA polymerases of Families A, B, and D have proofreading 3' to 5' (3'-5') exonuclease activity. When a DNA polymerase incorporates an incorrect or modified nucleotide, for example, in a primer strand, it detects structural perturbations caused by mispairing or nucleotide modification and transfers the primer strand from the polymerase domain to the 3'-5' exonuclease active site.
  • DNA polymerases restrict access to their active sites to prevent incorporation of ribonucleotides.
  • Families A, B, X, Y, and reverse transcriptases have a steric gate that excludes rNTPs from the active site by a steric clash between a bulky amino acid side chain in the steric gate and the 2’-OH of such rNTPs. Reducing the size of the side chain at the steric gate position allows DNA polymerases so modified to incorporate a single rNTP as efficiently as dNTP.
  • Polymerases with such modified steric gates have been extensively employed in molecular biology applications such as single-molecule sequencing, sequencing by synthesis, and single nucleotide polymorphism (SNP) detection.
  • SNP single nucleotide polymorphism
  • Family D DNA polymerases have properties that differ from some or all other polymerases. These distinguishing properties may include one or more of the following: (a) having little to no strand displacement activity, (b) reading through uracils encountered in a template strand and continuing polymerization, (c) synthesizing DNA at a slower rate, (d) binding other replication enzymes, including core replisome proteins (e.g., mini-chromosome maintenance (MCM) helicase, DNA ligase, the archaeal Cdc45 protein (GAN) and the processivity factor PCNA), and (e) activity in the presence of aphidicolin.
  • MCM mini-chromosome maintenance
  • GAN archaeal Cdc45 protein
  • PCNA processivity factor
  • PolD is a heterodimeric enzyme consisting of a large 5'-3' polymerase subunit and a small MRE11-like 3'-5' exonuclease subunit. The activity of each subunit requires the other subunit to be present.
  • Family D DNA polymerases preferentially incorporate dNTPs over rNTPs. The molecular basis for this selectivity has not yet been identified, which may have limited the use of wild type Family D polymerases.
  • a variant Family D polymerase may comprise, for example, (a) an amino acid sequence comprising at least two (or all three) of (optionally, in order from amino- to carboxy terminus and/or otherwise corresponding to) (i) a sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 1, (ii) a sequence having at least 85% identity to positions 808-816 of SEQ ID NO: 1, and (iii) a sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 1; and (b) a substitution at a position corresponding to (i) position 930, 931, or 932 of SEQ ID NO: 1, (ii) position 822, 954, 959, or 963 of SEQ ID NO: 3, or (iii) position 13, 203
  • a variant Family D polymerase may comprise, in some embodiments, (a) an amino acid sequence comprising at least two (or all three) of (optionally, in order from amino- to carboxy terminus and/or otherwise corresponding to) (i) a sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 1, (ii) a sequence having at least 85% identity to positions 808-816 of SEQ ID NO: 1, and (iii) a sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 1; and (b)(i) a substitution at a position corresponding to position 930, 931, or 932 of SEQ ID NO: 1, (ii) any amino acid other than tyrosine at one or more positions corresponding to position 13, 203, 822, 885, 890, 954, 984, 1020, 1026, and/or 1165 of SEQ ID NO: 1, (iii) any amino acid other than lysine at one or more positions corresponding to position 392,
  • a variant Family D polymerase may comprise (a) an amino acid sequence comprising at least two sequences selected from (i) a sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 1, (ii) a sequence having at least 85% identity to positions 808-816 of SEQ ID NO: 1, and (iii) a sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 1; and (b) any amino acid other than (i) tyrosine at one or more positions corresponding to position 13, 203, 822, 885, 890, 954, 984, 1020, 1026, and/or 1165 of SEQ ID NO: 1, (ii) lysine at one or more positions corresponding to position 392, 398 and/or 959 of SEQ ID NO: 1, (iii) cysteine at a position corresponding to position 963 of SEQ ID NO: 1, (iv) phenylalanine at one or more positions corresponding to position 452 and/or 9
  • a variant Family D polymerase may comprise, for example, (a) an amino acid sequence comprising at least one (and any two of up to all six) of (optionally, in order from amino- to carboxy terminus and/or otherwise corresponding to) (i) a sequence having at least 90% identity to positions 145-169 of SEQ ID NO: 3 or 4, (ii) a sequence having at least 90% identity to positions 245-271 of SEQ ID NO: 3 or 4, (iii) a sequence having at least 90% identity to positions 354-364 of SEQ ID NO: 3 or 4, (iv) a sequence having at least 90% identity to positions 651-679 of SEQ ID NO: 3 or 4, (v) a sequence having at least 90% identity to positions 821-841 of SEQ ID NO: 3 or 4, and (vi) a sequence having at least 80% identity to positions 947-963 of SEQ ID NO: 3 or 4; and (b) a substitution at a position corresponding to (i) position 930, 931, or 932 of SEQ ID NO: 1, (
  • a variant Family D polymerase having such a substitution may comprise, for example, a substitution corresponding to position 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4.
  • a variant Family D polymerase having from one to six of the foregoing sequences may further comprise a sequence having at least 80% identity to positions 808-816 of SEQ ID NO: 4.
  • a variant Family D polymerase may comprise a sequence identical to positions 927-930 of SEQ ID NO: 1 and/or positions 932-934 of SEQ ID NO: 1 at, for example, corresponding positions along its length.
  • a variant DNA polymerase comprising a substitution at a position corresponding to position 931 of SEQ ID NO: 1 may be selected from an alanine substitution, an aspartate substitution, a cysteine substitution, a glutamate substitution, a glutamine substitution, a glycine substitution, an isoleucine substitution, a leucine substitution, a methionine substitution, a phenylalanine substitution, a proline substitution, a serine substitution, a threonine substitution, and a valine substitution.
  • a substitution at a position corresponding to position 931 may be alanine.
  • a variant Family D polymerase may comprise a sequence having at least 90%, at least 95% or at least 98% identity (but in each case, less than 100% identity) to SEQ ID NO: 1.
  • a variant Family D polymerase may comprise a sequence having at least 80%, at least 85%, at least 90%, at least 95% or at least 98% identity (but in each case, less than 100% identity) to SEQ ID NO: 2, 3, or 4.
  • a variant Family D polymerase may comprise a sequence having (a) at least 70%, at least 75%, or at least 80%, at least 85% identity (but in each case, less than 100% identity) to SEQ ID NO: 2, 3, or 4 and (b) at least one (and any two of up to all six) of (optionally, in order from amino- to carboxy terminus and/or otherwise corresponding to) (i) a sequence having at least 90% identity to positions 145-169 of SEQ ID NO: 3 or 4, (ii) a sequence having at least 90% identity to positions 245-271 of SEQ ID NO: 3 or 4, (iii) a sequence having at least 90% identity to positions 354-364 of SEQ ID NO: 3 or 4, (iv) a sequence having at least 90% identity to positions 651-679 of SEQ ID NO: 3 or 4, (v) a sequence having at least 90% identity to positions 821-841 of SEQ ID NO: 3 or 4, and (vi) a sequence having at least 80% identity to positions 947-963 of SEQ ID NO:
  • a variant Family D polymerase having such a substitution may comprise, for example, a substitution corresponding to position 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4.
  • a variant Family D polymerase having from one to six of the foregoing sequences may further comprise a sequence having at least 80% identity to positions 808-816 of SEQ ID NO:
  • a composition comprising a variant Family D polymerase may further include one or more substrates, intermediates, and/or products of such polymerase.
  • a composition may comprise one or more ribonucleoside triphosphates, one or more deoxyribonucleoside triphosphates, or one or more ribonucleoside triphosphates and one or more deoxyribonucleoside triphosphates.
  • a composition may comprise one or more modified nucleotides (e.g., modified nucleotides selected from the group consisting of dideoxynucleoside triphosphates, acyclic-nucleoside triphosphates, 3'-O-azidomethyl-ddNTPs, 3'-O-amino-ddNTPs, 3'-OH unblocked dNTPs, biotin-deoxyuridine triphosphates, rNTP, 3'-dNTP, 2'-amino-NTP, 2'-azido-NTP, and 2'-O-methyl-NTP).
  • a product of a variant Family D polymerase included in a composition or arising from a method disclosed herein may comprise, for example, a polynucleotide having at least one ribonucleotide.
  • a variant Family D polymerase may have or lack an exonuclease subunit (DP1). If present, the exonuclease subunit may have or lack exonuclease activity (e.g., 3’ to 5’ exonuclease activity). For example, an exonuclease subunit may comprise one or more substitutions that reduce or eliminate exonuclease activity relative to the corresponding wild type sequence.
  • a composition may comprise a fusion protein comprising a variant Family D polymerase and any desired polypeptide.
  • a fusion protein may comprise a polypeptide selected from a DNA-binding peptide or protein, a maltose binding domain, a chitin binding domain, or a SNAP-Tag®.
  • a method may comprise contacting a variant Family D polymerase with a template polynucleotide (e.g., a DNA template), deoxyribonucleoside triphosphates, and optionally a primer having a sequence complementary to at least a portion of the sequence of the template under suitable conditions (e.g., time, temperature, pH) to produce at least one polynucleotide copy of the template.
  • a method may further include contacting a variant Family D polymerase with one or more ribonucleoside triphosphates and/or one or more modified nucleotides.
  • a template polynucleotide may comprise at least one ribonucleotide.
  • a template may comprise a uracil.
  • a polynucleotide copy produced upon or following contact with a variant Family D polymerase may comprise a deoxyribonucleotide at positions complementary to any template positions occupied by a ribonucleotide.
  • a DNA template comprising one uracil contacted with a variant Family D polymerase may give rise to a polynucleotide copy of the template, the copy comprising an adenine at the position that is complementary to the uracil.
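The pairing outcome described in the preceding items can be illustrated with a minimal Python sketch. It assumes idealized Watson-Crick pairing in which a template uracil is read like thymine, and it always emits deoxyribonucleotide letters in the copy. The function name `copy_template` and the plain-string representation of strands are hypothetical choices made for illustration; they are not taken from the disclosure.

```python
# Watson-Crick partner (written as a DNA base) for each template base.
# Uracil pairs like thymine, so the copy carries adenine opposite it.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def copy_template(template_5to3: str) -> str:
    """Return the complementary DNA copy of a template, written 5'->3'."""
    # The copy is antiparallel to the template, so walk the template backwards.
    return "".join(PARTNER[base] for base in reversed(template_5to3.upper()))
```

For a template written as `GAUC`, `copy_template` returns `GATC`: the adenine in the copy sits opposite the template uracil.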
  • methods for filling a gap in a double-stranded polynucleotide are also provided.
  • a method may comprise contacting a variant Family D polymerase and one or more deoxyribonucleoside triphosphates with a double-stranded polynucleotide comprising a gap on one strand that is at least one nucleotide in length to produce a product polynucleotide that has no such gap.
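As a rough illustration of the gap-filling outcome, the sketch below models a double-stranded polynucleotide as two equal-length strings and fills the gap from the intact strand. The representation is an assumption made for this example only: the intact strand is written 3'->5' so that it lines up column by column with the gapped strand written 5'->3', and `-` marks each missing nucleotide. The function name `fill_gap` is hypothetical.

```python
# DNA base-pairing partners used to fill the missing positions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def fill_gap(intact_3to5: str, gapped_5to3: str) -> str:
    """Fill every '-' in the gapped strand using the opposite intact strand."""
    if len(intact_3to5) != len(gapped_5to3):
        raise ValueError("strands must be written with the same length")
    return "".join(
        COMPLEMENT[template_base] if base == "-" else base
        for template_base, base in zip(intact_3to5, gapped_5to3)
    )
```

With the intact strand `TACGGT` and the gapped strand `AT--CA`, the product is `ATGCCA`, which has no remaining gap.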
  • kits including variant Family D polymerases are also provided.
  • a kit may comprise a variant Family D polymerase and one or more ribonucleoside triphosphates, one or more deoxyribonucleoside triphosphates, one or more modified nucleotides, one or more primers, one or more adapters, one or more buffering agents, or combinations thereof.
  • FIGURE 1 shows examples of Family D polymerase sequences.
  • SEQ ID NO: 1 contains Family D polymerase consensus sequences identified according to Example 1, in which each X independently can be any amino acid, except SEQ ID NO: 1 residues X927, which is G or T, X928, which is L, M, or T, X929, which is A or S, X933, which is S, F, or C, and X934, which is G, A, C, or V.
  • SEQ ID NOS: 2-3 contain Family D polymerase consensus sequences identified according to Example 1, in which each X independently can be any amino acid.
  • SEQ ID NOS: 4 and 5 are Family D polymerase sequences of Euryarchaeota (9°N) and Pyrococcus abyssi, respectively.
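The residue constraints stated for positions 927-929 and 933-934 of SEQ ID NO: 1 can be written as a pattern. The sketch below is only an illustration: it checks an eight-residue window (positions 927-934) against those flanking constraints and reports which of positions 930-932 differ from the Pro-His-Thr triplet that the specification elsewhere assigns to those positions. The function name `inspect_window` and the example windows are hypothetical and are not sequences from the sequence listing.

```python
import re

# Flanking constraints for positions 927-929 and 933-934 of SEQ ID NO: 1;
# the three captured residues are positions 930-932.
FLANK_PATTERN = re.compile(r"[GT][LMT][AS](...)[SFC][GACV]")
WILD_TYPE_TRIPLET = "PHT"  # Pro-His-Thr at positions 930-932


def inspect_window(window: str):
    """Return (triplet, changed_positions), or None if the flanks do not match."""
    match = FLANK_PATTERN.fullmatch(window)
    if match is None:
        return None
    triplet = match.group(1)
    changed = [
        930 + offset
        for offset, (wild, seen) in enumerate(zip(WILD_TYPE_TRIPLET, triplet))
        if wild != seen
    ]
    return triplet, changed
```

A window such as `GLAPATSG` satisfies the flanking constraints and reports position 931 as changed, whereas `GLAPHTSG` reports no change.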
  • FIGURE 2 shows that Family D polymerase variants incorporated rNTPs relative to dNTPs more efficiently than wild type PolD.
  • FIGURE 3 shows that additional Family D polymerase variants incorporated rNTPs relative to dNTPs more efficiently than wild type PolD.
  • FIGURE 4 shows the ability of variant Family D polymerases to incorporate rNTPs successively.
  • FIGURE 4A is a sketch illustrating the reaction and FIGURE 4B illustrates data obtained in accordance with Example 5.
  • the wild type polD shown incorporates one rNTP, while polD H931A incorporates up to four successive rNTPs. As controls, both polD and polD H931A incorporate successive dNTPs.
  • compositions, methods and kits are provided here that improve, among other things, the synthesis of DNA that contains modified nucleotides such as ribonucleotides using a class of polymerases that do not belong to the DNA polymerase A or B families.
  • Family D polymerases preferentially incorporate deoxyribonucleoside triphosphates relative to larger ribonucleoside triphosphates, which may be attributed to a steric gate motif limiting substrate access to the catalytic site.
  • Variant Family D polymerases with modifications in or modifications impacting this steric gate are provided in some embodiments disclosed herein. For example, variant Family D polymerases may have an enhanced or reduced ability to exclude ribonucleotides from the active site.
  • Sources of commonly understood terms and symbols may include: standard treatises and texts such as Kornberg and Baker, DNA Replication, Second Edition (W.H. Freeman, New York, 1992); Lehninger, Biochemistry, Second Edition (Worth Publishers, New York, 1975); Strachan and Read, Human Molecular Genetics, Second Edition (Wiley-Liss, New York, 1999); Eckstein, editor, Oligonucleotides and Analogs: A Practical Approach (Oxford University Press, New York, 1991); Gait, editor, Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, 1984); Singleton, et al., Dictionary of Microbiology and Molecular Biology, 2d ed., John Wiley and Sons, New York (1994), and Hale & Markham, the Harper Collins Dictionary of Biology, Harper Perennial, N.Y. (1991) and the like.
  • a protein refers to one or more proteins, i.e., a single protein and multiple proteins.
  • claims can be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements or use of a “negative” limitation.
  • Numeric ranges are inclusive of the numbers defining the range. All numbers should be understood to encompass the midpoint of the integer above and below the integer i.e., the number 2 encompasses 1.5-2.5. The number 2.5 encompasses 2.45-2.55 etc. When sample numerical values are provided, each alone may represent an intermediate value in a range of values and together may represent the extremes of a range unless specified.
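The rounding convention stated above (2 encompasses 1.5-2.5; 2.5 encompasses 2.45-2.55) amounts to half a unit in the last stated digit on either side. A minimal sketch follows, assuming the number is supplied as a string so that its stated precision is preserved; the function name `encompassed_range` is hypothetical.

```python
from decimal import Decimal


def encompassed_range(number: str):
    """Return (low, high): the stated number plus or minus half a unit
    of its last stated digit."""
    value = Decimal(number)
    # Exponent of the last stated digit: 0 for "2", -1 for "2.5".
    exponent = value.as_tuple().exponent
    half_step = Decimal(5).scaleb(exponent - 1)
    return value - half_step, value + half_step
```

Using `Decimal` rather than `float` keeps the endpoints exact, so `encompassed_range("2.5")` yields 2.45 and 2.55 without binary rounding noise.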
  • buffering agent refers to an agent that allows a solution to resist changes in pH when acid or alkali is added to the solution.
  • suitable non-naturally occurring buffering agents include, for example, any of Tris, HEPES, TAPS, MOPS, tricine, and MES.
  • corresponding to refers to positions that lie across from one another when sequences are aligned, e.g., by the BLAST algorithm.
  • An amino acid position in a functional or structural motif in one polymerase may correspond to a position within a functionally equivalent functional or structural motif in another polymerase.
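The notion of a "corresponding" position can be made concrete once two sequences have already been aligned. The sketch below assumes the alignment is supplied as two equal-length gapped strings (for example, copied from the output of an alignment tool) with `-` as the gap character, and it uses 1-based residue numbering from the amino terminus. The function name `corresponding_position` and the toy sequences in the usage note are hypothetical.

```python
def corresponding_position(aligned_ref: str, aligned_cmp: str, ref_pos: int):
    """Map a 1-based position in the reference to the comparator position
    in the same alignment column, or None if the comparator has a gap there."""
    if len(aligned_ref) != len(aligned_cmp):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    cmp_count = 0
    for ref_char, cmp_char in zip(aligned_ref, aligned_cmp):
        if ref_char != "-":
            ref_count += 1
        if cmp_char != "-":
            cmp_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            return cmp_count if cmp_char != "-" else None
    raise IndexError("ref_pos is outside the reference sequence")
```

For the toy alignment `MK-LHT` over `MKALAT`, reference position 4 corresponds to comparator position 5, because the comparator carries one extra residue upstream of that column.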
  • DNA polymerase refers to an enzyme that is capable of replicating DNA and optionally may have exonuclease activity.
  • DNA template refers to the DNA strand read by a DNA polymerase and of which a copy is synthesized.
  • “Family D polymerase” refers to a heterodimeric archaeal DNA polymerase having a small exonuclease subunit (DP1) and a large polymerase subunit (DP2).
  • Family D polymerases may be produced in cells in an immature form and then undergo post-translational processing (e.g., removal of amino-terminal sequences, inteins, or both). Examples of Family D polymerases may include 9°N PolD (Euryarchaeota), Genbank Accession No. KPV61551.1 (Bathyarchaeota), Accession No.
  • fusion protein refers to protein composed of a plurality of polypeptide components that are un-joined in their native state.
  • Fusion proteins may be a combination of two, three or four or more different proteins.
  • the term polypeptide is not intended to be limited to a fusion of two heterologous amino acid sequences.
  • a fusion protein may have one or more heterologous domains added to the N-terminus, C-terminus, and/or the middle portion of the protein. If two parts of a fusion protein are “heterologous”, they are not part of the same protein in its natural state.
  • fusion proteins include a variant Family D polymerase fused to an SS07 DNA binding peptide (see for example, US Patent 6,627,424), a transcription factor (see for example, US patent 10,041,051), a binding protein suitable for immobilization such as maltose binding domain (MBP), a histidine tag (“His-tag”), chitin binding domain (CBD) or a SNAP-Tag® (New England Biolabs, Ipswich, MA (see for example US patents 7,939,284 and 7,888,090)).
  • MBP maltose binding domain
  • His-tag histidine tag
  • CBD chitin binding domain
  • fusion proteins include a heterologous targeting sequence, a linker, an epitope tag, a detectable fusion partner, such as a fluorescent protein, β-galactosidase, luciferase and functionally similar peptides.
  • modified nucleotide refers to a non-canonical nucleotide that may be incorporated into a growing polynucleotide strand.
  • modified nucleotides include nucleotide terminators (dideoxynucleoside triphosphates (ddNTPs) and acyclic-nucleoside triphosphates (acycloNTPs)), reversible nucleotide terminators (3'-O-azidomethyl-ddNTPs, 3'-O-amino-ddNTPs, and 3'-OH unblocked dNTPs), and tagged nucleotides (biotin-deoxyuridine triphosphates (biotin-dUTPs)).
  • ddNTPs dideoxynucleoside triphosphates
  • acycloNTPs acyclic-nucleoside triphosphates
  • modified nucleotides include nucleotides with any atom or group other than a hydrogen at the 2' position (e.g., rATP, 3'-dATP, 2'-amino-ATP, 2'-azido-ATP and 2'-O-methyl-ATP).
  • 3’-OH unblocked dNTPs include, e.g., Lightning Terminators™ (Agilent Technologies, Inc., Houston, Tex.) (Gardner AF, Wang J, Wu W, Karouby J, Li H, Stupi BP, Jack WE, Hersh MN, Metzker ML. (2012) Rapid incorporation kinetics and improved fidelity of a novel class of 3'-OH unblocked reversible terminators. Nucleic Acids Res., 40, 7404-7415).
  • NTP refers to a nucleoside triphosphate including, for example, any deoxyribonucleoside triphosphate (“dNTP”) and any ribonucleoside triphosphate (“rNTP”).
  • dNTP deoxyribonucleoside triphosphate
  • rNTP ribonucleoside triphosphate
  • non-naturally occurring refers to a polynucleotide, polypeptide, carbohydrate, lipid, or composition that does not exist in nature.
  • a polynucleotide, polypeptide, carbohydrate, lipid, or composition may differ from naturally occurring polynucleotides polypeptides, carbohydrates, lipids, or compositions in one or more respects.
  • a polymer (e.g., a polynucleotide, polypeptide, or carbohydrate) may differ from a naturally occurring polymer in its component building blocks (e.g., nucleotide sequence, amino acid sequence, or sugar molecules).
  • a polymer may differ from a naturally occurring polymer with respect to the molecule(s) to which it is linked.
  • a “non-naturally occurring” protein may differ from naturally occurring proteins in its secondary, tertiary, or quaternary structure, by having a chemical bond (e.g., a covalent bond including a peptide bond, a phosphate bond, a disulfide bond, an ester bond, an ether bond, and others) to a polypeptide (e.g., a fusion protein), a lipid, a carbohydrate, or any other molecule.
  • a “non-naturally occurring” polynucleotide or nucleic acid may contain one or more other modifications (e.g., an added label or other moiety) to the 5’-end, the 3’-end, and/or between the 5’- and 3’-ends (e.g., methylation) of the nucleic acid.
  • a “non-naturally occurring” composition may differ from naturally occurring compositions in one or more of the following respects: (a) having components that are not combined in nature, (b) having components in concentrations not found in nature, (c) omitting one or more components otherwise found in naturally occurring compositions, (d) having a form not found in nature, e.g., dried, freeze dried, crystalline, aqueous, and (e) having one or more additional components beyond those found in nature (e.g., buffering agents, a detergent, a dye, a solvent or a preservative).
  • All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
  • polynucleotide copy refers to the product of polymerization activity of a DNA polymerase.
  • a polynucleotide copy may comprise deoxyribonucleotides with or without ribonucleotides.
  • position refers to the place such amino acid occupies in the primary sequence of a peptide or polypeptide numbered from its amino terminus to its carboxy terminus.
  • substitution at a position in a comparator amino acid sequence refers to any difference at that position relative to the corresponding position in a reference sequence, including a deletion, an insertion, and a different amino acid, where the comparator and reference sequences are at least 80% identical to each other.
  • a substitution in a comparator sequence in addition to being different than the reference sequence, may differ from all corresponding positions in naturally-occurring sequences that are at least 80% identical to the comparator sequence.
  • variant Family D polymerase refers to a non-naturally occurring archaeal Family D DNA polymerase that has an amino acid sequence that is less than 100% identical to the amino acid sequence of a naturally occurring DNA polymerase from archaea or has a non-naturally occurring chemical modification (e.g., a polypeptide fused to its amino terminal or carboxy terminal end or other chemical modification).
  • a variant amino acid sequence may have at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98% or at least 99% identity to the amino acid sequence of a naturally occurring Family D polymerase without being 100% identical to any known naturally occurring polymerase. Sequence differences may include insertions, deletions and substitutions of one or more amino acids.
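The identity thresholds above presuppose some way of computing percent identity, which this text does not spell out. The sketch below uses one common convention as an assumption: identical residues divided by the number of aligned columns (columns where both sequences have a gap are ignored), computed from two pre-aligned, equal-length gapped strings. The function names and the 80% default are illustrative; the default mirrors the lowest threshold listed above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column that is a gap in both carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns


def within_variant_window(identity: float, minimum: float = 80.0) -> bool:
    """True when identity is at least `minimum` percent but below 100 percent."""
    return minimum <= identity < 100.0
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values for the same alignment, so the denominator should be stated whenever a threshold is applied.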
  • a variant Family D polymerase may have a small exonuclease subunit (DP1) and a large polymerase subunit (DP2).
  • DP1 small exonuclease subunit
  • DP2 large polymerase subunit
  • a variant Family D polymerase may have exonuclease activity or may be exonuclease deficient, where exonuclease activity is 3’-5’.
  • Exonuclease deficient variants may have one or more amino acid substitutions in the small subunit (DP1) including, for example, substitutions at positions corresponding to D507A, H554A, or both D507A and H554A of 9°N (SEQ ID NO: 6).
  • a variant Family D polymerase lacking exonuclease activity may comprise only such portion of the DP1 subunit as necessary to support catalytic activity of the polymerase (DP2) subunit.
  • Variant Family D polymerases are provided here that differ from wild type Family D DNA polymerases in their abilities to incorporate rNTPs.
  • the incorporation ratio of adenosine triphosphate to deoxyadenosine triphosphate (rA:dA) may be at least 0.19, less than 0.26 (or less than wild type), more than 0.26 (or more than wild type), at least 0.28, at least 0.3, at least 0.4, at least 0.5, at least 0.8, or at least about 0.9.
  • Use of wild type Family D polymerases may have been limited in the past, in part, by their rNTP/dNTP selectivity.
  • Family D DNA polymerases may have little to no strand displacement activity. When polD encounters a downstream strand of DNA hybridized to the template strand during synthesis, it pauses rather than displacing the hybridized strand. Accordingly, polD may be used in applications for which strand displacement is not desired or required, including, for example, gap-filling applications. While Family B polymerases stall when they encounter a uracil in the template strand, variant Family D polymerases are able to read through and continue polymerization. Accordingly, variant Family D polymerases could be useful for sequencing polynucleotides that have or have the potential to have one or more deaminated cytosines.
  • variant Family D polymerases may synthesize DNA at a slower rate than many other polymerases.
  • DNA sequencing by synthesis applications typically require use of a reversible terminator. While faster reaction rates are commonly preferred, the slower rate of variant Family D polymerases could be optimized for real time detection of base incorporation in the growing DNA strand.
  • a fusion protein may comprise a variant Family D polymerase and a replisome protein.
  • Variant Family D polymerases may bind other replication enzymes, including core replisome proteins, for example, mini-chromosome maintenance (MCM1 and MCM2) helicase, DNA ligase, the archaeal Cdc45 protein (GAN), Cdc6 and the processivity factor PCNA, whereas Family B DNA polymerases may bind MCM2 and single-stranded binding protein RPA3.
  • Combining a replisome protein and a variant Family D polymerase in a fusion protein may enhance the activity of one or both proteins (e.g., by making more efficient use of substrates, improving the quantity/purity of products, and/or facilitating delivery of both molecules to a desired site). Binding or fusing polD with accessory proteins may enable coupling polD synthesis with other activities such as MCM helicase DNA unwinding, ligation of nicks, or increases in processivity with PCNA clamps.
  • Aphidicolin is an antibiotic that acts on Family B polymerases to reversibly inhibit eukaryotic DNA replication.
  • Variant Family D polymerases, like PolD itself, may remain active, polymerizing DNA in the presence of aphidicolin. Accordingly, where both polB and a variant Family D polymerase are present, but polD activity is preferred or required, a reaction may be performed in the presence of aphidicolin.
  • Variant Family D polymerases may have an amino acid sequence comprising one or more of the consensus domains shown in Table 1, each of which may have the indicated number of substitutions. Each of these domains corresponds to the indicated portion of wild type Family D polymerase (e.g., Euryarchaeota 9°N PolD-L; SEQ ID NO: 4) and one or more of the degenerate sequences disclosed herein (SEQ ID NOS: 1-3). Variant Family D polymerases have at least one substitution relative to wild type Family D polymerase (e.g., in domain G1, G2, G3, or G4) and are not naturally occurring.
  • a variant Family D polymerase may have an amino acid sequence comprising domains as shown in Table 2 with domain C referring to either of domains C1 and C2, domain F referring to either of domains F1 and F2, and domain G referring to any of domains G1, G2, G3, and G4 of Table 1.
  • Each domain when present, may appear in alphabetical order from the amino terminal end to the carboxy terminal end of a variant Family D polymerase as shown in Table 2 and further may appear in a position corresponding to the position of the comparable domain in a wild type Family D polymerase.
  • Table 2 Variant Family D Polymerase Domains
  • variant Family D polymerases have been engineered and tested for their ability to preferentially incorporate ribonucleotides into polynucleotides.
  • a variant Family D polymerase may have an amino acid sequence comprising one or more substitutions at one or more positions selected from positions corresponding to positions 145-169, 245-271, 336, 354-364, 368, 392, 395, 398, 452-455, 651-679, 808-816, 821-841, 918-943 (e.g., 930, 931, and 932), 947-964 (e.g., 954, 959, and 963), 966, and 1020 of SEQ ID NO: 4.
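The list of positions and ranges above lends itself to a simple membership check. The sketch below parses the list as written (relative to SEQ ID NO: 4) into a set of integers and tests whether a given position falls within it. The helper names `parse_positions` and `is_listed_position` are hypothetical, and the check says nothing about whether a substitution at that position has any particular effect.

```python
# Positions and inclusive ranges as listed relative to SEQ ID NO: 4.
POSITION_SPEC = (
    "145-169, 245-271, 336, 354-364, 368, 392, 395, 398, 452-455, "
    "651-679, 808-816, 821-841, 918-943, 947-964, 966, 1020"
)


def parse_positions(spec: str) -> set:
    """Expand a comma-separated list of positions and inclusive ranges."""
    positions = set()
    for item in spec.split(","):
        item = item.strip()
        if "-" in item:
            start, end = (int(part) for part in item.split("-"))
            positions.update(range(start, end + 1))
        else:
            positions.add(int(item))
    return positions


LISTED_POSITIONS = parse_positions(POSITION_SPEC)


def is_listed_position(position: int) -> bool:
    """True when the position is among the listed positions of SEQ ID NO: 4."""
    return position in LISTED_POSITIONS
```

Position 931 is covered by the 918-943 range, while position 965 falls between the 947-964 range and the single position 966 and is therefore not listed.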
  • a variant Family D polymerase may have an amino acid sequence comprising one or more sequences having (a) at least 90% identity to domain A, B, D or F2, (b) at least 85% identity to domain E or H, (c) at least 80% identity to domain C1, C2, or F1, (d) up to 97% identity to G1, G2, G3, or G4, in some embodiments.
  • a variant DNA polymerase may have an amino acid sequence comprising one or more substitutions at one or more positions selected from positions corresponding to positions 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4.
  • variant Family D polymerases may have a substitution at a position corresponding to position 931 of SEQ ID NO: 1, 2, 3, or 4 or one or more substitutions in the triplet Pro-His-Thr (SEQ ID NO: 15) corresponding to positions 930-932 of SEQ ID NO: 1, 2, 3, or 4 (domain G4).
  • a variant Family D polymerase comprises at least one substitution in G1, G2, G3, or G4.
  • Such variants may further include a non-naturally occurring sequence having at least 80%, 85%, 90%, 95% or 97% or 100% sequence identity with one or more of the other sequence motifs described in Table 1 and Fig. 1.
  • a variant Family D polymerase may have an amino acid sequence comprising a sequence having (a) at least 85% identity to domain C1 or C2, (b) at least 85% identity to domain E, (c) at least 80% identity to domain F1 or F2, and (d) up to 97% identity to domain G1, G2, G3, or G4 (e.g., Table 2, numbers 4, 61, 69, 72, 79, 89, 103, 105, 108, 110, 113, 120, 124, 125, 127, 130, and 131).
  • a variant Family D polymerase may have an amino acid sequence comprising, for example, (a) a sequence having up to 97% identity to domain G1, G2, G3, or G4, and (b) a sequence having (i) at least 90% identity to domain A (e.g., Table 2, number 5), (ii) at least 90% identity to domain B (e.g., Table 2, number 6), (iii) at least 80% identity to domain C1 or C2 (e.g., Table 2, number 7), (iv) at least 90% identity to domain D (e.g., Table 2, number 8), (v) at least 85% identity to domain E (e.g., Table 2, number 9), (vi) at least 90% identity to domain F1 or F2 (e.g., Table 2, number 10), (vii) at least 85% identity to domain H (e.g., Table 2, number 11), or any combinations thereof (e.g., Table 2, numbers 12-17, 20-23, 30, 31, 33, 34, 40-42, 48, 50-52, 60,
  • a variant Family D polymerase may comprise a substitution at a position corresponding to position 931 of any of SEQ ID NOS: 1-4.
  • Substitutions at position 931 may include H931D, H931E, H931C, H931L, H931I, H931F, H931G, H931T, H931V, H931Q, H931M, H931P, H931S, and H931A, may be limited to H931D and H931E (e.g., where a rA:dA ratio below wild type is desired), or may be limited to H931C, H931L, H931I, H931F, H931G, H931T, H931V, H931Q, H931M, H931P, H931S, and H931A (e.g., where a rA:dA ratio above wild type is desired).
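The substitution labels above follow the usual wild-type-residue, position, new-residue pattern (H931D means histidine at position 931 replaced by aspartate). A minimal Python sketch of parsing and applying such labels is given here for illustration; the function names, the `Substitution` type, and the three-residue toy sequence are assumptions made for this example and do not come from the disclosure. The two residue groups simply restate the lists given above.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # 1-based position in the reference sequence
    variant: str    # one-letter code of the replacement residue


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'H931D' into its three parts."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of `sequence` carrying the substitution (1-based position)."""
    index = sub.position - 1
    if sequence[index] != sub.wild_type:
        raise ValueError(f"expected {sub.wild_type} at position {sub.position}")
    return sequence[:index] + sub.variant + sequence[index + 1:]


# Residue groups restated from the text above (position 931 only).
LOWER_RA_DA = set("DE")             # rA:dA ratio below wild type
HIGHER_RA_DA = set("CLIFGTVQMPSA")  # rA:dA ratio above wild type


def expected_direction(label: str) -> str:
    """Classify an H931X label by the grouping described in the text."""
    sub = parse_substitution(label)
    if (sub.wild_type, sub.position) != ("H", 931):
        raise ValueError("only H931X labels are classified here")
    if sub.variant in LOWER_RA_DA:
        return "below wild type"
    if sub.variant in HIGHER_RA_DA:
        return "above wild type"
    return "not listed"


# Toy example: the Pro-His-Thr triplet numbered 1-3 (hypothetical numbering).
print(apply_substitution("PHT", parse_substitution("H2A")))
print(expected_direction("H931D"), "/", expected_direction("H931A"))
```

For a real sequence, the position must first be mapped to the residue corresponding to position 931 of the reference, which this sketch does not attempt.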
  • a variant Family D polymerase may have one or more substitutions at one or more positions corresponding to any of the positions of the Pyrococcus abyssi Family D polymerase disclosed in Table 1 of U.S. Provisional Application No. 62/976,039 filed February 13, 2020 (e.g., positions 106-161, 243-267, 326-330, 361-365, 385-397, 441-451, 657-667, 822-829, 919-928, 940-962, and 981-997 of SEQ ID NO: 5 or positions 109-164, 246-270, 332-336, 367-371, 391-403, 447-447, 665-675, 830-837, 927-936, 948-970, 989-1005 of SEQ ID NO: 4).
  • a variant Family D polymerase may have any desired length.
  • a variant Family D polymerase may be shorter (e.g., up to 10% shorter) or longer (e.g., up to 10% longer) than a wild-type sequence.
  • a variant Family D polymerase may have the same or about the same length as a wild-type Family D polymerase (e.g., 1250-1300 amino acids).
  • a Family D polymerase, in some embodiments, may have fewer (e.g., 1-20 fewer) or may have more (e.g., 1-20 more) amino acids than SEQ ID NO: 1, 2, 3, or 4.
  • a variant Family D polymerase may have an amino acid sequence having at least 90% identity to SEQ ID NOS: 1-4.
  • a variant Family D polymerase having a substitution at a position corresponding to position 931 of any of SEQ ID NOS: 1-4 may comprise a sequence having at least 90% identity to any of SEQ ID NOS: 1-4, at least 95% identity to any of SEQ ID NOS: 1-4, at least 98% identity to any of SEQ ID NOS: 1-4, at least 99% identity to any of SEQ ID NOS: 1-4, or at least 99.5% identity to any of SEQ ID NOS: 1-4, in each case, including all substitutions in the percent identity calculation.
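The identity thresholds above can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned (equal length, with '-' marking gaps) and counts identical columns over all alignment columns, so substituted positions lower the score. That denominator is one common convention and is an assumption here, as are the function names and the toy eight-residue sequences; the disclosure does not prescribe a particular algorithm.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Columns where both sequences have a gap are ignored; every other column
    counts toward the denominator, so substitutions and gaps reduce identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float) -> bool:
    """True if the identity is at least `minimum` percent."""
    return percent_identity(aligned_a, aligned_b) >= minimum


# Toy sequences (hypothetical): one substitution in eight positions.
reference = "ACDEFGHI"
variant = "ACDEFGHK"
print(percent_identity(reference, variant))       # 7 of 8 columns identical
print(meets_threshold(reference, variant, 85.0))  # compare with an 85% cutoff
```

An eight-residue domain with a single substitution scores 87.5% under this convention, which shows why short motifs tolerate only one or two changes at the 80-90% thresholds.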
  • Variant Family D polymerases may have improved properties including, for example, an increased ability to incorporate ribonucleotides, increased replicative fidelity, or both. Additional properties may include, for example, no strand displacement activity, no aphidicolin sensitivity, ability to read through uracil on a template strand, slower rate of DNA synthesis relative to other families of polymerases, ability to bind other replication enzymes, and combinations thereof.
  • a variant Family D polymerase may be included in a composition with other materials.
  • a composition may comprise, for example, a variant Family D polymerase and one or more of a buffering agent, a crowding agent (such as polyethylene glycol included in a storage or reaction mixture), a single strand binding protein or portions thereof, an unwinding agent (e.g., a helicase), a detergent (e.g., a nonionic, cationic, anionic or zwitterionic detergent), an additive (e.g., albumin), glycerol, salt (e.g., KCl), EDTA, a dye, a reaction enhancer or inhibitor, an oxidizing agent, a reducing agent, a solvent and/or a preservative.
  • the buffering agent combined with additional reagents that are standard in the art may be formulated for storage of a variant Family D polymerase and/or for the desired reaction mixture.
  • the formulation for storage of the polymerase variant or for an amplification reaction may be the same or different.
  • a variant Family D polymerase may be prepared for storage as a reagent or included in a mastermix for storage.
  • a polymerase may be included in a reaction buffer or mastermix.
  • the reaction mix and the storage mix may be the same or different.
  • a storage mix may be a concentrated form of the reagent for dilution into a reaction mix.
  • the polymerase variant may be in a lyophilized form.
  • a polymerase variant may be in a master mix containing deoxyribose nucleoside triphosphates (e.g., one, two, three or all four of dATP, dTTP, dGTP and dCTP and/or one or more modified dNTPs) and, optionally, ribonucleoside triphosphates.
  • concentration of the one or more labeled NTP in a composition may be in the range of 10 nM to 200 mM.
  • the molar ratio of a labeled NTP to the corresponding unlabeled version of the same NTP (e.g., biotin-dCTP to dCTP) in the composition may be in the range of 1:1000 to 1000:1, e.g., 1:100 to 100:1 or 1:10 to 10:1.
  • the molar ratio of the labeled dNTP to the corresponding unlabeled dNTP (e.g., biotin-dCTP to dCTP) in the nucleotide mix may be in the range of 1:1000 to 1:100, 1:100 to 1:10, 1:10 to 1:1, 1:1 to 1:10, 1:10 to 1:100, or 1:100 to 100:1000.
  • the composition may optionally comprise primers.
  • the primers may be partially or completely random.
  • the primers may be exonuclease-resistant.
  • primers may comprise one or more chemical modifications, for example, phosphorothioate modifications.
  • the chemical modifications on the primers may occur at the 3’ terminus or at both the 3’ and 5’ termini of the primers.
  • the chemical modifications on the primers may further occur at one or more non-terminal positions of the primers.
  • a variant Family D polymerase or, optionally, a composition comprising a variant Family D polymerase may be prepared in any desired form. Examples include a liquid or an emulsion. Alternatively, the variant or composition containing the variant Family D polymerase may be formulated as a solid, granule, lyophilized state, gel, pellet, or powder. A variant may be immobilized on a solid surface such as a matrix, bead, paper, plastic, resin, column, chip, microfluidic device or other instrument platform where the variant Family D polymerase adheres directly or via an affinity binding domain.
  • a variant Family D polymerase may be included in a fusion protein with an affinity binding domain such as a maltose binding protein, a chitin binding domain, a his tag, a self-labeling protein tag (a SNAP-Tag®, New England Biolabs), biotin or combinations thereof.
  • An affinity binding domain may be associated with a variant Family D polymerase non-covalently such as via an antibody.
  • a composition comprising a variant Family D polymerase may further comprise a second polymerase wherein the second polymerase differs from the variant Family D polymerase.
  • the present disclosure further relates to methods of using variant Family D polymerases.
  • methods of polynucleotide synthesis including amplification (e.g., isothermal amplification, helicase dependent amplification, rolling circle amplification (RCA), multiple strand displacement amplification (MDA), whole genome amplification), quantitative amplification, and sequencing by synthesis.
  • a method of synthesizing a copy of a polynucleotide template may comprise contacting the polynucleotide template with (a) a variant Family D polymerase, (b) dNTPs and at least one of an rNTP and a modified NTP, and (c) optionally, at least one primer having a sequence complementary to at least a portion of the sequence of the polynucleotide template for a time and at a temperature to produce the copy of the polynucleotide template.
  • the need for a primer may be obviated where the template polynucleotide is hybridized to a complementary strand.
  • a method of polynucleotide synthesis may include combining a polynucleotide template (e.g., a target DNA) to be amplified with a variant Family D polymerase, dNTPs, one or more rNTPs, and optionally, one or more primers to produce a reaction mixture; and incubating the reaction mixture to synthesize one or more copies of the template.
  • a polynucleotide template may comprise one or more uracils (e.g., cytosines that have been oxidized to uracil). Synthesized fragments may be used for any analytical purpose (e.g., sequencing), synthetic purpose (e.g., cloning), or diagnostic purpose (e.g., single nucleotide polymorphism (SNP) detection).
  • a method of filling in a gap in an otherwise double-stranded DNA molecule comprises contacting the DNA molecule with (a) a variant Family D polymerase, (b) dNTPs, and (c) optionally, rNTPs for a time and at a temperature to fill in the gap.
  • a variant Family D polymerase may be used to synthesize synthetic molecules containing modified nucleotides.
  • a variant Family D polymerase may be used to synthesize modified nucleic aptamers for use as inhibitors or therapeutics.
  • a method may comprise contacting an aptamer template polynucleotide with a variant Family D polymerase, dNTPs, and at least one rNTP. Aptamers synthesized by variant Family D polymerase with partial or total substitutions with ribonucleotides increase the chemical diversity and functionality of the aptamer molecule.
  • a method may comprise contacting an aptamer template polynucleotide with a variant Family D polymerase, dNTPs, and at least one modified NTP. Aptamers synthesized by variant Family D polymerase with partial or total substitutions with 2'-modified nucleotides increase the chemical diversity and functionality of the aptamer molecule.
  • Variant Family D polymerases may be used to site-specifically label DNA by incorporation of a nucleotide containing a 2’ modification, for example 2’-azido-ATP.
  • This 2’ modification could be a molecular handle which allows for reactivity of the nucleotide, and in turn, DNA into which it has been incorporated.
  • a modified nucleotide could be incorporated into a double-stranded polynucleotide at a primer/template junction, or at a nick or gap site, by incubating (a) a variant Family D polymerase, (b) a DNA molecule, and (c) a modified nucleotide for a time and at a temperature to allow for incorporation of the modified nucleotide into a synthesized copy of the DNA.
  • the molecular handle (e.g., an azido group) present on the modified nucleotide can be a reactive group that allows for DNA labeling, e.g., by click chemistry (Nat Chem Biol. 2017 Sep 19; 13(10): 1057. doi: 10.1038/nchembio.2482. PMID: 28926550).
  • the incorporation of modified nucleotides by a variant Family D polymerase could render the newly synthesized molecule resistant to exonucleases, and therefore protect the molecule from degradation in vivo.
  • variant Family D polymerases with the ability to readily incorporate ribonucleotides during synthesis may be used in RNA sequencing.
  • variant Family D polymerase synthesizes DNA molecules with partial ribonucleotide substitution by replacing a single dN with a rN.
  • a variant Family D polymerase is contacted with an RNA template and dATP, rATP, dCTP, dGTP and dTTP so that the synthetic products contain dA, rA, dC, dG, and dT.
  • the corresponding C reaction produces molecules with dA, dC, rC, dG, and dT.
  • the corresponding G reaction produces molecules with dA, dC, dG, rG, and dT.
  • the corresponding T reaction produces molecules with dA, dC, dG, dT, and rU.
  • the polynucleotide molecules produced are treated with alkali or RNase H2 to cleave at ribonucleotide sites, resulting in a population of products, each terminating at a ribonucleotide. Products are separated by electrophoresis and analyzed to determine the sequence pattern.
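The four-reaction scheme above resembles a classical sequencing ladder, and the read-out logic can be sketched in Python. The sketch is idealized: it assumes that, across the population of molecules, every occurrence of the chosen base is ribo-substituted in some molecule, that cleavage leaves a 5'-labelled fragment ending at the ribonucleotide, and that fragment lengths are resolved to single-nucleotide precision. The function names and the toy product sequence are assumptions for illustration only.

```python
def ladder_lengths(product: str, base: str) -> list[int]:
    """Lengths of 5'-labelled fragments ending at each occurrence of `base`.

    Models one reaction (A, C, G, or T) after cleavage at ribonucleotide sites.
    """
    return [i + 1 for i, b in enumerate(product) if b == base]


def read_sequence(lanes: dict[str, list[int]]) -> str:
    """Rebuild the product sequence from the fragment lengths of the four lanes."""
    calls: dict[int, str] = {}
    for base, lengths in lanes.items():
        for length in lengths:
            if length in calls:
                raise ValueError(f"two lanes report a fragment of length {length}")
            calls[length] = base
    # Reading fragments from shortest to longest gives the sequence 5' to 3'.
    return "".join(calls[length] for length in sorted(calls))


# Toy synthesized strand (hypothetical); 'T' stands for the position that
# carries rU in the T reaction.
product = "GATTACA"
lanes = {base: ladder_lengths(product, base) for base in "ACGT"}
print(lanes["A"])            # fragment lengths expected in the A reaction
print(read_sequence(lanes))  # sequence recovered from all four reactions
```

Real data would also need peak calling and size calibration, which this sketch leaves out.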
  • variant Family D polymerase may be used to create polynucleotides resistant to one or more restriction enzymes. Restriction enzymes recognize and cleave specific DNA sequences. Modifying that recognition sequence (e.g., by methylating cytosine in the recognition sequence) may block restriction enzyme cleavage.
  • a method for producing a restriction enzyme resistant polynucleotide may comprise contacting a template polynucleotide having a recognition sequence for a restriction enzyme with a variant Family D polymerase, deoxyribonucleotides as dNTPs and a ribonucleotide as rNTP to produce a polynucleotide having a modified recognition sequence for the restriction enzyme, the modified recognition sequence comprising the ribonucleotide.
  • Incorporation of rNTPs by variant Family D polymerases may modify restriction enzyme recognition sequences from DNA to partially or fully ribosubstituted molecules that may block restriction enzyme cleavage. This technique could enable selective restriction enzyme cleavage of certain polynucleotides in populations of molecules.
  • Variant Family D polymerase can also introduce modifications to enable site-specific cleavage.
  • a template polynucleotide comprising a primer binding sequence and a target nucleotide adjacent to the 5’ end of the primer binding sequence may be contacted with a primer having a 5’ end, a 3’ end, and a sequence complementary to the primer binding sequence, a variant Family D polymerase, and a ribonucleotide as rNTP complementary to the target nucleotide to form a reaction mixture and produce an extended primer comprising the ribonucleotide at its 3’ end.
  • the reaction mixture may be contacted with dNTPs to further extend the extended primer.
  • Variant Family D polymerases may have an even higher capacity to discriminate against rNTP compared to wild type.
  • the incorporation ratio of adenosine to deoxyadenosine (rA:dA) may be less than 0.25, less than 0.24, less than 0.23, less than 0.22, less than 0.21, or less than 0.20.
  • a variant Family D polymerase may comprise an aspartate or glutamate substitution at the position corresponding to position 931 of SEQ ID NOS: 1, 2, 3 or 4.
  • products produced by variant Family D polymerases (e.g., those comprising H931D and H931E substitutions) may include fewer rNMPs than products produced by wild type polymerases or other variant Family D polymerases.
  • kits including a variant Family D DNA polymerase as described herein.
  • a kit may include a variant Family D DNA polymerase and dNTPs, rNTPs, primers, other enzymes (e.g., other polymerases, enzymes other than polymerases, or both), buffering agents, or combinations thereof.
  • a variant Family D polymerase may be included in a storage buffer (e.g., comprising glycerol and a buffering agent).
  • a kit may include a reaction buffer which may be in concentrated form, and the buffer may contain additives (e.g., glycerol), salt (e.g., KCl), reducing agent, EDTA or detergents, among others.
  • a kit comprising dNTPs may include one, two, three, or all four of dATP, dTTP, dGTP and dCTP.
  • a kit comprising rNTPs may include one, two, three, or all four of rATP, rUTP, rGTP and rCTP.
  • a kit may further comprise one or more modified nucleotides.
  • the kit may optionally comprise one or more primers (random primers, bump primers, exonuclease-resistant primers, chemically-modified primers, custom sequence primers, or combinations thereof).
  • One or more components of a kit may be included in one container for a single step reaction, or one or more components may be contained in one container, but separated from other components for sequential use or parallel use. The contents of a kit may be formulated for use in a desired method or process.
  • a kit contains: (i) a variant Family D polymerase; and (ii) a buffer.
  • the variant polymerase may have a lyophilized form or may be included in a buffer (e.g., a storage buffer or a reaction buffer in concentrated form).
  • the kit may contain the variant polymerase in a mastermix suitable for receiving and amplifying a template nucleic acid.
  • the DNA polymerase may be a purified enzyme so as to contain substantially no DNA or RNA and no nucleases.
  • the reaction buffer in (ii) and/or storage buffers containing the DNA polymerase in (i) may include non-ionic or ionic (e.g., anionic or zwitterionic) surfactants and crowding agents.
  • a kit may further include one or more dNTPs including dNTPs with large adducts such as a fluorescent-label or biotin-modified nucleotide, or a methylated nucleotide or other modified nucleotide.
  • the kit may include the DNA polymerase and reaction buffer in a single tube or in different tubes.
  • Steric gate motifs of wild type Family D polymerases from representative members of the archaeal phyla were aligned using the Multiple Sequence Alignment Tool Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/).
  • Table 3 shows that the Pro-His-Thr triplet (SEQ ID NO: 15) is conserved among Euryarchaeota, Bathyarchaeota, Thaumarchaeota, Aigarchaeota, Korarchaeota, Odinarchaeota, Lokiarchaeota, Thorarchaeota and Heimdallarchaeota.
  • Invariant motif amino acids P930 and T932 are underlined and invariant steric gate amino acid H931 is shown in bold.
  • a steric gate amino acid will block ribonucleotide incorporation at low ribonucleotide concentrations. Altering the steric gate amino acid may reduce discrimination against ribonucleotides at low concentration.
  • In Family D DNA polymerases, the position of the rNTP 2'-OH in the active site had not been identified. To identify the determinants of ribonucleotide discrimination in Family D DNA polymerases, active site amino acid variants were screened for increased incorporation at low ribonucleotide (1 mM) concentration.
  • a nucleotide incorporation assay was performed to determine the ratio of rATP to dATP incorporated by wild type (H931) and variant Family D polymerases (H931X).
  • a primer-template was used to monitor rATP and dATP incorporation and was prepared by annealing the 50mer 5’-FAM primer (10 µM) (5'-FAM-AGT GAA TTC GAG CTC GGT ACC CGG GGA TCC TCT AGA GTC GAC CTG CAG GT-3' (SEQ ID NO: 29)) to the 67mer template (15 µM) (5'-AAG CAC GAA AGC AGG GTA CCT GCA GGT CGA CTC TAG AGG ATC CCC GGG TAC CGA GCT CGA ATT CAC T-3' (SEQ ID NO: 30)) in 1X ThermoPol buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 mM KCl, 2 m
  • a 10 µl aliquot of polD (SEQ ID NO: 4) or H931X with DNA was prepared by mixing ThermoPol buffer (1X final concentration), primer/template DNA (40 nM final concentration) and polD (2 µl of heat lysate containing over-expressed polD or H931X).
  • a second 10 µl aliquot was prepared by mixing ThermoPol buffer (1X final concentration) and dATP or ATP (2 mM final concentration).
  • the polD/DNA and dATP or ATP aliquots were mixed (20 µL reaction) and placed at 65 °C for 10 min, after which 20 µL of 50 mM EDTA was added.
  • a negative control reaction was performed in which 10 µL of 1X ThermoPol buffer was added to the polD/DNA mixture instead of nucleotides.
  • Reaction products were separated by capillary electrophoresis using a 3730xl Genetic Analyzer (Applied Biosystems) and fluorescent peaks were analyzed using Peak Scanner software version 1.0 (Applied Biosystems). The percentage of ribonucleotide incorporation was divided by the percentage of deoxyribonucleotide incorporation to obtain the rA:dA ratio. All assays were performed in triplicate to ensure reproducibility.
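The ratio calculation described above (percent ribonucleotide incorporation divided by percent deoxyribonucleotide incorporation, averaged over triplicates) can be written as a short Python sketch. The function names are assumptions, and the replicate values are made-up numbers chosen only to exercise the arithmetic; they are not data from the disclosure.

```python
from statistics import mean


def ra_da_ratio(percent_ribo: float, percent_deoxy: float) -> float:
    """rA:dA ratio = percent rATP incorporation / percent dATP incorporation."""
    if percent_deoxy <= 0:
        raise ValueError("deoxyribonucleotide incorporation must be positive")
    return percent_ribo / percent_deoxy


def mean_ratio(replicates: list[tuple[float, float]]) -> float:
    """Average the ratio over replicate (percent_ribo, percent_deoxy) pairs."""
    return mean(ra_da_ratio(ribo, deoxy) for ribo, deoxy in replicates)


# Hypothetical triplicate measurements (percent primer extended); not real data.
replicates = [(20.0, 80.0), (22.0, 80.0), (18.0, 80.0)]
print(round(mean_ratio(replicates), 3))
```

Whether ratios are averaged per replicate, as here, or computed from averaged percentages is a design choice that the text does not specify.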
  • Results are shown in FIGURE 2, in which it can be seen that wild type polD incorporated rATP and dATP at a ratio of 0.26 and variant Family D polymerases incorporated rATP and dATP at a ratio ranging from 0.2 to 0.92 with polD H931A having the highest ratio of rATP to dATP incorporation.
  • a nucleotide incorporation assay was performed to determine the ability of polD and polD H931A to incorporate successive ribonucleotides into a primer/template (FIGURES 4A and 4B).
  • the 50mer DNA primer/ 67mer DNA template was created as described above.
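The geometry of the primer/template construct can be checked from the two sequences given above (SEQ ID NOS: 29 and 30). The Python sketch below locates the primer binding site on the template, reports the single-stranded 5' overhang that remains to be copied, and derives the first nucleotide the polymerase would need to add. The helper names are assumptions for illustration; only the two sequences are taken from the text.

```python
# Sequences copied from the text (5' to 3'); spaces are removed on joining.
PRIMER = "".join(
    "AGT GAA TTC GAG CTC GGT ACC CGG GGA TCC TCT AGA GTC GAC CTG CAG GT".split()
)
TEMPLATE = "".join(
    "AAG CAC GAA AGC AGG GTA CCT GCA GGT CGA CTC TAG AGG ATC CCC GGG TAC "
    "CGA GCT CGA ATT CAC T".split()
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def binding_site_start(primer: str, template: str) -> int:
    """0-based index on the template where the primer binding site begins."""
    start = template.find(reverse_complement(primer))
    if start < 0:
        raise ValueError("primer is not fully complementary to the template")
    return start


def first_run(primer: str, template: str) -> tuple[str, int]:
    """First incoming nucleotide and how many times in a row it is templated."""
    start = binding_site_start(primer, template)
    if start == 0:
        raise ValueError("no single-stranded template left to copy")
    # The polymerase reads the template toward its 5' end, i.e. leftward here.
    upstream = template[:start][::-1]
    incoming = upstream[0].translate(COMPLEMENT)
    run = len(upstream) - len(upstream.lstrip(upstream[0]))
    return incoming, run


start = binding_site_start(PRIMER, TEMPLATE)
print(len(PRIMER), len(TEMPLATE), start)  # primer length, template length, overhang
print(first_run(PRIMER, TEMPLATE))        # first templated nucleotide and run length
```

With these sequences the primer pairs with the 3' end of the template and the first templated nucleotide is A, which fits the use of dATP or rATP alone to score a single incorporation event.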
  • a construct with a 50 nucleotide RNA primer (same sequence) was also annealed to the 67mer DNA template as described above.
  • a 20 µL reaction containing 20 nM nucleotide construct, 100 nM polD or polD H931A, and 1 µM dNTPs or rNTPs in 1X ThermoPol buffer was placed at 65 °C for 15 min. The reaction was quenched with an equal volume of 50 mM EDTA and analyzed by capillary electrophoresis.
  • FIGURE 4 illustrates the ability of variant Family D polymerases to incorporate successive dNTPs and rNTPs.
  • Figure 4A is a sketch illustrating binding of a variant Family D polymerase to a template strand (lower strand) and a primer (upper strand) having a fluorescent label. In the presence of NTPs and dNTPs, incubation at an appropriate temperature for an appropriate time results in extension of the primer.
  • the traces shown on the left side of FIGURE 4B show that both polD and polD H931A can readily incorporate successive dNTPs onto DNA and RNA templates. However, the traces on the right show that variant polD H931A can incorporate 4 successive ribonucleotides onto a DNA template, while polD can only incorporate a single ribonucleotide onto a DNA primer.
  • EXAMPLE 5 Enhanced incorporation of 2'-modified nucleotides by variant Family D polymerase
  • Modified nucleotides rATP, 3'-dATP (Cordycepin), 2'-amino-ATP, 2'-azido-ATP and 2'-O-methyl-ATP were from Trilink Biotechnologies.
  • the primer-template used to monitor modified nucleotide incorporation was prepared by annealing the 50mer 5’-FAM primer (10 µM) (5'-FAM-AGT GAA TTC GAG CTC GGT ACC CGG GGA TCC TCT AGA GTC GAC CTG CAG GT-3' (SEQ ID NO: 29)) to the 67mer template (15 µM) (5'-AAG CAC GAA AGC AGG GTA CCT GCA GGT CGA CTC TAG AGG ATC CCC GGG TAC CGA GCT CGA ATT CAC T-3' (SEQ ID NO: 30)) in 1X ThermoPol buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 m
  • a 100 µl reaction mix was prepared containing 1X ThermoPol Buffer, 10 nM primer-template, 25 nM wild type Family D DNA polymerase (polD) or variant Family D DNA polymerase (polD/H931A) and 1 µM modified nucleotide. Reactions were incubated for 1 minute at 65°C and 10 µl aliquots were removed and mixed with 50 mM EDTA to halt the reaction. Reaction products were separated by capillary electrophoresis using a 3730xl Genetic Analyzer (Applied Biosystems) and fluorescent peaks were analyzed using Peak Scanner software version 1.0 (Applied Biosystems). The concentration of product (51 nt DNA with a FAM label) was graphed as a function of time (Table).
  • PolD incorporated rATP, 3'-dATP, 2'-amino-ATP, 2'-azido-ATP and 2'-O-methyl-ATP poorly (Table 4, middle column).
  • PolD/H931A incorporated 2'-modified nucleotides rATP, 3'-dATP, 2'-amino-ATP, 2'-azido-ATP and 2'-O-methyl-ATP with higher yield than polD (Table 4, right column).
  • EXAMPLE 6 Enhanced incorporation of modified nucleotides by variant Family D polymerase
  • Additional variant positions with the potential to impact activity and/or selectivity of a variant Family D polymerase were prepared as summarized in Table 5.

Abstract

The present disclosure relates to polymerases (e.g., variants of Family D DNA polymerases) for polynucleotide synthesis, polynucleotide amplification, polynucleotide sequencing, cloning a polynucleotide, or combinations thereof. Variant Family D polymerases have one or more substitutions relative to wild type Family D polymerases (e.g., substitutions impacting substrate selectivity) and may incorporate ribonucleotides and/or modified nucleotides into synthesized or extended strands.

Description

VARIANT FAMILY D DNA POLYMERASES
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to U.S. Provisional Application No. 62/976,039 filed February 13, 2020 and U.S. Provisional Application No. 62/976,055 filed February 13, 2020, the entire contents of which are hereby incorporated by reference.
SEQUENCE LISTING STATEMENT
This disclosure includes a Sequence Listing submitted electronically in ASCII format under the file name “NEB-425-2_ST25.txt”. This Sequence Listing is incorporated herein in its entirety by this reference.
BACKGROUND
DNA polymerases are classified into Families A, B, C, D, X, Y and RT according to their amino acid sequences. DNA polymerases have several properties that contribute to their replicative fidelity. For example, some DNA polymerases of Families A, B, and D have proofreading 3' to 5' (3'-5') exonuclease activity. When a DNA polymerase incorporates an incorrect or modified nucleotide, for example, in a primer strand, it detects structural perturbations caused by mispairing or nucleotide modification and transfers the primer strand from the polymerase domain to the 3'-5' exonuclease active site. In addition to correcting errors after they have arisen, DNA polymerases restrict access to their active sites to prevent incorporation of ribonucleotides. For example, Families A, B, X, Y, and reverse transcriptases have a steric gate that excludes rNTPs from the active site by a steric clash between a bulky amino acid side chain in the steric gate and the 2’-OH of such rNTPs. Reducing the size of the side chain at the steric gate position allows DNA polymerases so modified to incorporate a single rNTP as efficiently as a dNTP.
Polymerases with such modified steric gates have been extensively employed in molecular biology applications such as single-molecule sequencing, sequencing by synthesis, and single nucleotide polymorphism (SNP) detection. Despite the range of polymerases available now, additional DNA polymerases with different properties and features may further expand the uses to which polymerases may be put.
Family D DNA polymerases have properties that differ from some or all other polymerases. These distinguishing properties may include one or more of the following: (a) having little to no strand displacement activity, (b) reading through uracils encountered in a template strand and continuing polymerization, (c) synthesizing DNA at a slower rate, (d) binding other replication enzymes, including core replisome proteins (e.g., mini-chromosome maintenance (MCM) helicase, DNA ligase, the archaeal Cdc45 protein (GAN) and the processivity factor PCNA), and (e) activity in the presence of aphidicolin.
Structurally, PolD is a heterodimeric enzyme consisting of a large 5’-3’ polymerase subunit and a small MRE11-like 3'-5' exonuclease subunit. The activity of each subunit requires the other subunit to be present. Despite having a catalytic core that resembles an RNA polymerase, Family D DNA polymerases preferentially incorporate dNTPs over rNTPs. The molecular basis for this selectivity has not yet been identified, which may have limited the use of wild type Family D polymerases. Cann, I.K., Komori, K., Toh, H., Kanai, S. and Ishino, Y. (1998) A heterodimeric DNA polymerase: evidence that members of Euryarchaeota possess a distinct DNA polymerase. Proc Natl Acad Sci U S A, 95, 14250-14255; Raia, P., Carroni, M., Henry, E., Pehau-Arnaudet, G., Brule, S., Beguin, P., Henneke, G., Lindahl, E., Delarue, M. and Sauguet, L. (2019) Structure of the DP1-DP2 PolD complex bound with DNA and its implications for the evolutionary history of DNA and RNA polymerases. PLoS Biol, 17, e3000122; Takashima, N., Ishino, S., Oki, K., Takafuji, M., Yamagami, T., Matsuo, R., Mayanagi, K. and Ishino, Y. (2019) Elucidating functions of DP1 and DP2 subunits from the Thermococcus kodakarensis family D DNA polymerase. Extremophiles, 23, 161-172; Ishino, Y., Komori, K., Cann, I.K. and Koga, Y. (1998) A novel DNA polymerase family found in Archaea. Journal of Bacteriology, 180, 2232-2236; Greenough, L., Menin, J.F., Desai, N.S., Kelman, Z. and Gardner, A.F. (2014) Characterization of Family D DNA polymerase from Thermococcus sp. 9 degrees N. Extremophiles, 18, 653-664; Schermerhorn, K.M. and Gardner, A.F. (2015) Pre-steady-state Kinetic Analysis of a Family D DNA Polymerase from Thermococcus sp. 9°N Reveals Mechanisms for Archaeal Genomic Replication and Maintenance. J. Biol. Chem., 290, 21800-21810; Astatke, M., Ng, K., Grindley, N.D. and Joyce, C.M. (1998) A single side chain prevents Escherichia coli DNA polymerase I (Klenow fragment) from incorporating ribonucleotides. Proc Natl Acad Sci U S A, 95, 3402-3407; Shen, Y., Musti, K., Hiramoto, M., Kikuchi, H., Kawarabayashi, Y. and Matsui, I. (2001) Invariant Asp-1122 and Asp-1124 are essential residues for polymerization catalysis of family D DNA polymerase from Pyrococcus horikoshii. J. Biol. Chem., 276(29), 27376-27383. http://doi.org/10.1074/jbc.M011762200.
SUMMARY
The present disclosure relates to compositions, methods, and kits comprising variant Family D polymerases that incorporate ribonucleotides and/or modified nucleotides into the polynucleotides they synthesize. A variant Family D polymerase may comprise, for example, (a) an amino acid sequence comprising at least two (or all three) of (optionally, in order from amino- to carboxy terminus and/or otherwise corresponding to) (i) a sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 1, (ii) a sequence having at least 85% identity to positions 808-816 of SEQ ID NO: 1, and (iii) a sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 1; and (b) a substitution at a position corresponding to (i) position 930, 931, or 932 of SEQ ID NO: 1, (ii) position 822, 954, 959, or 963 of SEQ ID NO: 3, or (iii) position 13, 203, 336, 368, 453, 455, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4. A variant Family D polymerase having such a substitution may comprise, for example, a substitution corresponding to position 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4.
A variant Family D polymerase may comprise, in some embodiments, (a) an amino acid sequence comprising at least two (or all three) of (optionally, in order from amino- to carboxy terminus and/or otherwise corresponding to) (i) a sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 1, (ii) a sequence having at least 85% identity to positions 808- 816 of SEQ ID NO: 1, and (iii) a sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 1; and (b)(i) a substitution at a position corresponding to position 930, 931, or 932 of SEQ ID NO: 1, (ii) any amino acid other than tyrosine at one or more positions corresponding to position 13, 203, 822, 885, 890, 954, 984, 1020, 1026, and/or 1165 of SEQ ID NO: 1, (iii) any amino acid other than lysine at one or more positions corresponding to position 392, 398 and/or 959 of SEQ ID NO: 1 , (iv) any amino acid other than cystine at a position corresponding to position 963 of SEQ ID NO: 1, (v) any amino acid other than phenylalanine at one or more positions corresponding to position 452 and/or 980 of SEQ ID NO: 1, (vi) any amino acid other than arginine at a position corresponding to position 395 of SEQ ID NO: 1, (vii) any amino acid other than glutamate at a position corresponding to position 454 of SEQ ID NO: 1, or (viii) any amino acid other than aspartate at one or more positions corresponding to position 964 and/or 966 of SEQ ID NO: 1. 
In some embodiments, a variant Family D polymerase may comprise (a) an amino acid sequence comprising at least two sequences selected from (i) a sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 1, (ii) a sequence having at least 85% identity to positions 808-816 of SEQ ID NO: 1, and (iii) a sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 1; and (b) any amino acid other than (i) tyrosine at one or more positions corresponding to position 13, 203, 822, 885, 890, 954, 984, 1020, 1026, and/or 1165 of SEQ ID NO: 1, (ii) lysine at one or more positions corresponding to position 392, 398 and/or 959 of SEQ ID NO: 1, (iii) cysteine at a position corresponding to position 963 of SEQ ID NO: 1, (iv) phenylalanine at one or more positions corresponding to position 452 and/or 980 of SEQ ID NO: 1, (v) arginine at a position corresponding to position 395 of SEQ ID NO: 1, (vi) glutamate at a position corresponding to position 454 of SEQ ID NO: 1, and/or (vii) aspartate at one or more positions corresponding to position 964 or 966 of SEQ ID NO: 1.
A variant Family D polymerase may comprise, for example, (a) an amino acid sequence comprising at least one (and any two of up to all six) of (optionally, in order from amino- to carboxy terminus and/or otherwise corresponding to) (i) a sequence having at least 90% identity to positions 145-169 of SEQ ID NO: 3 or 4, (ii) a sequence having at least 90% identity to positions 245-271 of SEQ ID NO: 3 or 4, (iii) a sequence having at least 90% identity to positions 354-364 of SEQ ID NO: 3 or 4, (iv) a sequence having at least 90% identity to positions 651-679 of SEQ ID NO: 3 or 4, (v) a sequence having at least 90% identity to positions 821-841 of SEQ ID NO: 3 or 4, and (vi) a sequence having at least 80% identity to positions 947-963 of SEQ ID NO: 3 or 4; and (b) a substitution at a position corresponding to (i) position 930, 931, or 932 of SEQ ID NO: 1, (ii) position 822, 954, 959, or 963 of SEQ ID NO: 3, or (iii) position 13, 203, 336, 368, 453, 455, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4. A variant Family D polymerase having such a substitution may comprise, for example, a substitution corresponding to position 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4. A variant Family D polymerase having from one to six of the foregoing sequences may further comprise a sequence having at least 80% identity to positions 808-816 of SEQ ID NO: 4.
A variant Family D polymerase may comprise a sequence identical to positions 927-930 of SEQ ID NO: 1 and/or positions 932-934 of SEQ ID NO: 1 at, for example, corresponding positions along its length. In a variant DNA polymerase comprising a substitution at a position corresponding to position 931 of SEQ ID NO: 1, the substitution may be selected from an alanine substitution, an aspartate substitution, a cysteine substitution, a glutamate substitution, a glutamine substitution, a glycine substitution, an isoleucine substitution, a leucine substitution, a methionine substitution, a phenylalanine substitution, a proline substitution, a serine substitution, a threonine substitution, and a valine substitution. For example, a substitution at a position corresponding to position 931 may be alanine.
A variant Family D polymerase may comprise a sequence having at least 90%, at least 95% or at least 98% identity (but in each case, less than 100% identity) to SEQ ID NO: 1. A variant Family D polymerase may comprise a sequence having at least 80%, at least 85%, at least 90%, at least 95% or at least 98% identity (but in each case, less than 100% identity) to SEQ ID NO: 2, 3, or 4. A variant Family D polymerase may comprise a sequence having (a) at least 70%, at least 75%, at least 80%, or at least 85% identity (but in each case, less than 100% identity) to SEQ ID NO: 2, 3, or 4 and (b) at least one (and any two of up to all six) of (optionally, in order from amino- to carboxy terminus and/or otherwise corresponding to) (i) a sequence having at least 90% identity to positions 145-169 of SEQ ID NO: 3 or 4, (ii) a sequence having at least 90% identity to positions 245-271 of SEQ ID NO: 3 or 4, (iii) a sequence having at least 90% identity to positions 354-364 of SEQ ID NO: 3 or 4, (iv) a sequence having at least 90% identity to positions 651-679 of SEQ ID NO: 3 or 4, (v) a sequence having at least 90% identity to positions 821-841 of SEQ ID NO: 3 or 4, and (vi) a sequence having at least 80% identity to positions 947-963 of SEQ ID NO: 3 or 4. A variant Family D polymerase may further comprise a substitution at a position corresponding to (i) position 930, 931, or 932 of SEQ ID NO: 1, (ii) position 822, 954, 959, or 963 of SEQ ID NO: 3, or (iii) position 13, 203, 336, 368, 453, 455, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4. A variant Family D polymerase having such a substitution may comprise, for example, a substitution corresponding to position 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4. A variant Family D polymerase having from one to six of the foregoing sequences may further comprise a sequence having at least 80% identity to positions 808-816 of SEQ ID NO: 4.
A composition comprising a variant Family D polymerase may further include one or more substrates, intermediates, and/or products of such polymerase. For example, a composition may comprise one or more ribonucleoside triphosphates, one or more deoxyribonucleoside triphosphates, or one or more ribonucleoside triphosphates and one or more deoxyribonucleoside triphosphates. A composition may comprise one or more modified nucleotides (e.g., modified nucleotides selected from the group consisting of dideoxynucleoside triphosphates, acyclic-nucleoside triphosphates, 3'-O-azidomethyl-ddNTPs, 3'-O-amino-ddNTPs, 3'-OH unblocked dNTPs, biotin-deoxyuridine triphosphates, rNTP, 3'-dNTP, 2'-amino-NTP, 2'-azido-NTP, and 2'-O-methyl-NTP). A product of a variant Family D polymerase included in a composition or arising from a method disclosed herein may comprise, for example, a polynucleotide having at least one ribonucleotide.
A variant Family D polymerase may have or lack an exonuclease subunit (DP1). If present, the exonuclease subunit may have or lack exonuclease activity (e.g., 3’ to 5’ exonuclease activity). For example, an exonuclease subunit may comprise one or more substitutions that reduce or eliminate exonuclease activity relative to the corresponding wild type sequence. A composition may comprise a fusion protein comprising a variant Family D polymerase and any desired polypeptide. For example, a fusion protein may comprise a polypeptide selected from a DNA-binding peptide or protein, a maltose binding domain, a chitin binding domain, or a SNAP-Tag®.
The present disclosure provides various methods for producing copies of a template polynucleotide. For example, a method may comprise contacting a variant Family D polymerase with a template polynucleotide (e.g., a DNA template), deoxyribonucleoside triphosphates, and optionally a primer having a sequence complementary to at least a portion of the sequence of the template under suitable conditions (e.g., time, temperature, pH) to produce at least one polynucleotide copy of the template. A method may further include contacting a variant Family D polymerase with one or more ribonucleoside triphosphates and/or one or more modified nucleotides. A template polynucleotide may comprise at least one ribonucleotide. For example, a template may comprise a uracil. A polynucleotide copy produced upon or following contact with a variant Family D polymerase may comprise a deoxyribonucleotide at positions complementary to any template positions occupied by a ribonucleotide. For example, a DNA template comprising one uracil contacted with a variant Family D polymerase may give rise to a polynucleotide copy of the template, the copy comprising an adenine at the position that is complementary to the uracil. Also provided are methods for filling a gap in a double-stranded polynucleotide. For example, a method may comprise contacting a variant Family D polymerase and one or more deoxyribonucleoside triphosphates with a double-stranded polynucleotide comprising a gap on one strand that is at least one nucleotide in length to produce a product polynucleotide that has no such gap.
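As a purely illustrative sketch of the base-pairing logic described above (not a model of the enzymatic reaction), the following Python snippet derives the sequence of a copy strand from a template that may contain uracil. The function name `copy_strand` and the toy template are hypothetical and are not part of the disclosure; the snippet assumes the copy is composed only of deoxyribonucleotides.

```python
# Pairing rules for a DNA copy strand; a template uracil pairs with adenine.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def copy_strand(template: str) -> str:
    """Return the copy strand (written 5'->3') for a template written 5'->3'."""
    try:
        paired = [COMPLEMENT[base] for base in template.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported template base: {err.args[0]}") from None
    # The copy is antiparallel to the template, so reverse the paired bases.
    return "".join(reversed(paired))


template = "ACGUT"  # hypothetical template containing a single uracil
print(copy_strand(template))
```

In this toy case the uracil at the fourth template position is paired with an adenine in the copy, consistent with the example given above.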
The present disclosure further provides kits including variant Family D polymerases. For example, a kit may comprise a variant Family D polymerase and one or more ribonucleoside triphosphates, one or more deoxyribonucleoside triphosphates, one or more modified nucleotides, one or more primers, one or more adapters, one or more buffering agents, or combinations thereof.
BRIEF DESCRIPTION OF THE FIGURES
FIGURE 1 shows examples of Family D polymerase sequences. SEQ ID NO: 1 contains Family D polymerase consensus sequences identified according to Example 1, in which each X independently can be any amino acid, except SEQ ID NO: 1 residues X927, which is G or T, X928, which is L, M, or T, X929, which is A or S, X933, which is S, F, or C, and X934, which is G, A, C, or V. SEQ ID NOS: 2-3 contain Family D polymerase consensus sequences identified according to Example 1, in which each X independently can be any amino acid. SEQ ID NOS: 4 and 5 are Family D polymerase sequences of Euryarchaeota (9°N) and Pyrococcus abyssi, respectively.
FIGURE 2 shows that Family D polymerase variants incorporated rNTPs relative to dNTPs more efficiently than wild type PolD. The example Family D polymerase variants illustrated incorporated rNTPs relative to dNTPs at a ratio ranging from 0.20 to 0.92 compared with wild type polD that incorporated rATP and dATP at a ratio of 0.26.
FIGURE 3 shows that additional Family D polymerase variants incorporated rNTPs relative to dNTPs more efficiently than wild type PolD. The variant Family D polymerases illustrated incorporated rNTPs relative to dNTPs at a ratio ranging from 0.30 to 0.63 compared with wild type polD that incorporated rATP and dATP at a ratio of 0.26.
FIGURE 4 shows the ability of variant Family D polymerases to incorporate rNTPs successively. FIGURE 4A is a sketch illustrating the reaction and FIGURE 4B illustrates data obtained in accordance with Example 5. The wild type polD shown incorporates one rNTP, while a polD H931A incorporates up to four successive rNTPs. As controls, both polD and polD H931A incorporate successive dNTPs.
DETAILED DESCRIPTION
Compositions, methods and kits are provided here that improve, among other things, the synthesis of DNA that contains modified nucleotides such as ribonucleotides using a class of polymerases that do not belong to the DNA polymerase A or B families. Family D polymerases preferentially incorporate deoxyribonucleoside triphosphates relative to larger ribonucleoside triphosphates, which may be attributed to a steric gate motif limiting substrate access to the catalytic site. Variant Family D polymerases with modifications in or modifications impacting this steric gate are provided in some embodiments disclosed herein. For example, variant Family D polymerases may have an enhanced or reduced ability to exclude ribonucleotides from the active site.
Aspects of the present disclosure can be further understood in light of the embodiments, section headings, figures, descriptions and examples, none of which should be construed as limiting the entire scope of the present disclosure in any way. Accordingly, the claims set forth below should be construed in view of the full breadth and spirit of the disclosure.
Each of the individual embodiments described and illustrated herein has discrete components and features which can be readily separated from or combined with the features of any of the other several embodiments without departing from the scope or spirit of the present teachings. Any recited method can be carried out in the order of events recited or in any other order which is logically possible.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Still, certain terms are defined herein with respect to embodiments of the disclosure and for the sake of clarity and ease of reference.
Sources of commonly understood terms and symbols may include: standard treatises and texts such as Kornberg and Baker, DNA Replication, Second Edition (W.H. Freeman, New York, 1992); Lehninger, Biochemistry, Second Edition (Worth Publishers, New York, 1975); Strachan and Read, Human Molecular Genetics, Second Edition (Wiley-Liss, New York, 1999); Eckstein, editor, Oligonucleotides and Analogs: A Practical Approach (Oxford University Press, New York, 1991); Gait, editor, Oligonucleotide Synthesis: A Practical Approach (IRL Press, Oxford, 1984); Singleton, et al., Dictionary of Microbiology and Molecular Biology, 2d ed., John Wiley and Sons, New York (1994), and Hale & Markham, The Harper Collins Dictionary of Biology, Harper Perennial, N.Y. (1991) and the like.
As used herein and in the appended claims, the singular forms “a” and “an” include plural referents unless the context clearly dictates otherwise. For example, the term “a protein” refers to one or more proteins, i.e., a single protein and multiple proteins. It is further noted that the claims can be drafted to exclude any optional element. As such, this statement is intended to serve as antecedent basis for use of such exclusive terminology as “solely,” “only” and the like in connection with the recitation of claim elements or use of a “negative” limitation.
Numeric ranges are inclusive of the numbers defining the range. All numbers should be understood to encompass the midpoint of the integer above and below the integer i.e., the number 2 encompasses 1.5-2.5. The number 2.5 encompasses 2.45-2.55 etc. When sample numerical values are provided, each alone may represent an intermediate value in a range of values and together may represent the extremes of a range unless specified.
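The rounding convention in the preceding paragraph can be illustrated with a short Python sketch. It assumes that the encompassed range extends half a unit of the last written decimal place on either side of the number, which reproduces the two examples given (2 and 2.5); the function name `encompassed_range` is hypothetical.

```python
from decimal import Decimal


def encompassed_range(number: str) -> tuple[Decimal, Decimal]:
    """Return the (low, high) values encompassed by a number as written."""
    value = Decimal(number)
    if not value.is_finite():
        raise ValueError("number must be finite")
    # Exponent of the last written digit: 0 for "2", -1 for "2.5".
    exponent = value.as_tuple().exponent
    # Half a unit of that last place: 0.5 for "2", 0.05 for "2.5".
    half_unit = Decimal(5).scaleb(exponent - 1)
    return value - half_unit, value + half_unit


for written in ("2", "2.5"):
    low, high = encompassed_range(written)
    print(f"{written} encompasses {low}-{high}")
```

The number is passed as a string so that the count of written decimal places is preserved.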
As used herein, “buffering agent” refers to an agent that allows a solution to resist changes in pH when acid or alkali is added to the solution. Examples of suitable non-naturally occurring buffering agents that may be used in the compositions, kits, and methods of the disclosure include, for example, any of Tris, HEPES, TAPS, MOPS, tricine, and MES.
With respect to an amino acid residue or a nucleotide base position, “corresponding to” refers to positions that lie across from one another when sequences are aligned, e.g., by the BLAST algorithm. An amino acid position in a functional or structural motif in one polymerase may correspond to a position within a functionally equivalent functional or structural motif in another polymerase.
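By way of illustration only, the notion of "corresponding" positions can be sketched in Python as a lookup through a pairwise alignment. The snippet assumes the two sequences have already been aligned (e.g., by BLAST or another alignment tool) and are supplied as equal-length strings with "-" marking gaps; the function name and the short toy sequences are hypothetical and are not taken from SEQ ID NOS: 1-5.

```python
def corresponding_positions(aligned_ref: str, aligned_query: str, gap: str = "-") -> dict[int, int]:
    """Map 1-based reference positions to 1-based query positions.

    Only alignment columns in which both sequences have a residue are mapped.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    mapping: dict[int, int] = {}
    ref_pos = 0
    query_pos = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res != gap:
            ref_pos += 1
        if query_res != gap:
            query_pos += 1
        if ref_res != gap and query_res != gap:
            mapping[ref_pos] = query_pos
    return mapping


# Hypothetical aligned fragments (reference on top, comparator below).
aligned_ref = "MKT-PHTG"
aligned_query = "M-TAPATG"
mapping = corresponding_positions(aligned_ref, aligned_query)
print(mapping)
```

A reference position that lies across from a gap has no entry in the mapping, which is one simple way to represent a position with no counterpart in the comparator sequence.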
As used herein, “DNA polymerase” refers to an enzyme that is capable of replicating DNA and optionally may have exonuclease activity.
As used herein, “DNA template” refers to the DNA strand read by a DNA polymerase and of which a copy is synthesized.
As used herein, “Family D polymerase” (also, “polD” or “Family D DNA polymerase”) refers to a heterodimeric archaeal DNA polymerase having a small exonuclease subunit (DP1) and a large polymerase subunit (DP2). Family D polymerases may be produced in cells in an immature form and then undergo post-translational processing (e.g., removal of amino-terminal sequences, inteins, or both). Examples of Family D polymerases may include 9°N PolD (Euryarchaeota), Genbank Accession No. KPV61551.1 (Bathyarchaeota), Accession No. RNJ72434.1 (Thaumarchaeota), Accession No. PUA31350.1 (Aigarchaeota), Accession No. RLG53083.1 (Korarchaeota), Accession No. OLS17959.1 (Odinarchaeota), Accession No. TET59439.1 (Lokiarchaeota), Accession No. TFH09197.1 (Thorarchaeota), and Accession No. OLS25708.1 (Heimdallarchaeota).
As used herein, “fusion protein” refers to a protein composed of a plurality of polypeptide components that are un-joined in their native state. Fusion proteins may be a combination of two, three or four or more different proteins. The term polypeptide is not intended to be limited to a fusion of two heterologous amino acid sequences. A fusion protein may have one or more heterologous domains added to the N-terminus, C-terminus, and/or the middle portion of the protein. If two parts of a fusion protein are “heterologous”, they are not part of the same protein in its natural state. Examples of fusion proteins include a variant Family D polymerase fused to an Sso7 DNA binding peptide (see for example, US Patent 6,627,424), a transcription factor (see for example, US Patent 10,041,051), a binding protein suitable for immobilization such as maltose binding domain (MBP), a histidine tag (“His-tag”), chitin binding domain (CBD) or a SNAP-Tag® (New England Biolabs, Ipswich, MA (see for example US Patents 7,939,284 and 7,888,090)). The binding peptide may be used to improve solubility or yield of the polymerase variant during the production of the protein reagent.
Other examples of fusion proteins include a heterologous targeting sequence, a linker, an epitope tag, a detectable fusion partner, such as a fluorescent protein, β-galactosidase, luciferase and the functionally similar peptides.
As used herein, “modified nucleotide” refers to a non-canonical nucleotide that may be incorporated into a growing polynucleotide strand. Examples of modified nucleotides include nucleotide terminators (dideoxy nucleoside triphosphates (ddNTPs) and acyclic-nucleoside triphosphates (acycloNTPs)), reversible nucleotide terminators (3'-O-azidomethyl-ddNTPs, 3'-O-amino-ddNTPs, and 3'-OH unblocked dNTPs), and tagged nucleotides (biotin-deoxyuridine triphosphates (biotin-dUTPs)). Examples of modified nucleotides include nucleotides with any atom or group other than a hydrogen at the 2' position (e.g., rATP, 3'-dATP, 2'-amino-ATP, 2'-azido-ATP and 2'-O-methyl-ATP). Examples further include 3'-OH unblocked dNTPs (e.g., Lightning Terminators™ (Agilent Technologies, Inc., Houston, Tex.) (Gardner AF, Wang J, Wu W, Karouby J, Li H, Stupi BP, Jack WE, Hersh MN, Metzker ML. (2012) Rapid incorporation kinetics and improved fidelity of a novel class of 3'-OH unblocked reversible terminators. Nucleic Acids Res., 40, 7404-7415)).
As used herein, “NTP” refers to a nucleoside triphosphate including, for example, any deoxyribonucleoside triphosphate (“dNTP”) and any ribonucleoside triphosphate (“rNTP”).
As used herein, “non-naturally occurring” refers to a polynucleotide, polypeptide, carbohydrate, lipid, or composition that does not exist in nature. Such a polynucleotide, polypeptide, carbohydrate, lipid, or composition may differ from naturally occurring polynucleotides, polypeptides, carbohydrates, lipids, or compositions in one or more respects. For example, a polymer (e.g., a polynucleotide, polypeptide, or carbohydrate) may differ in the kind and arrangement of the component building blocks (e.g., nucleotide sequence, amino acid sequence, or sugar molecules). A polymer may differ from a naturally occurring polymer with respect to the molecule(s) to which it is linked. For example, a “non-naturally occurring” protein may differ from naturally occurring proteins in its secondary, tertiary, or quaternary structure, by having a chemical bond (e.g., a covalent bond including a peptide bond, a phosphate bond, a disulfide bond, an ester bond, an ether bond, and others) to a polypeptide (e.g., a fusion protein), a lipid, a carbohydrate, or any other molecule. Similarly, a “non-naturally occurring” polynucleotide or nucleic acid may contain one or more other modifications (e.g., an added label or other moiety) to the 5’-end, the 3’-end, and/or between the 5’- and 3’-ends (e.g., methylation) of the nucleic acid. A “non-naturally occurring” composition may differ from naturally occurring compositions in one or more of the following respects: (a) having components that are not combined in nature, (b) having components in concentrations not found in nature, (c) omitting one or more components otherwise found in naturally occurring compositions, (d) having a form not found in nature, e.g., dried, freeze dried, crystalline, aqueous, and (e) having one or more additional components beyond those found in nature (e.g., buffering agents, a detergent, a dye, a solvent or a preservative).
All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
As used herein, “polynucleotide copy” refers to the product of polymerization activity of a DNA polymerase. A polynucleotide copy may comprise deoxyribonucleotides with or without ribonucleotides.
With reference to an amino acid, “position” refers to the place such amino acid occupies in the primary sequence of a peptide or polypeptide numbered from its amino terminus to its carboxy terminus.
As used herein, “substitution” at a position in a comparator amino acid sequence refers to any difference at that position relative to the corresponding position in a reference sequence, including a deletion, an insertion, and a different amino acid, where the comparator and reference sequences are at least 80% identical to each other. A substitution in a comparator sequence, in addition to being different than the reference sequence, may differ from all corresponding positions in naturally-occurring sequences that are at least 80% identical to the comparator sequence.
As used herein, “variant Family D polymerase” (also, “variant Family D DNA polymerase”) refers to a non-naturally occurring archaeal Family D DNA polymerase that has an amino acid sequence that is less than 100% identical to the amino acid sequence of a naturally occurring DNA polymerase from archaea or has a non-naturally occurring chemical modification (e.g., a polypeptide fused to its amino terminal or carboxy terminal end or other chemical modification). A variant amino acid sequence may have at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98% or at least 99% identity to the amino acid sequence of a naturally occurring Family D polymerase without being 100% identical to any known naturally occurring polymerase. Sequence differences may include insertions, deletions and substitutions of one or more amino acids.
A variant Family D polymerase may have a small exonuclease subunit (DP1) and a large polymerase subunit (DP2). A variant Family D polymerase may have exonuclease activity or may be exonuclease deficient, where exonuclease activity is 5’-3’. Exonuclease deficient variants may have one or more amino acid substitutions in the small subunit (DP1) including, for example, substitutions at positions corresponding to D507A, H554A, or both D507A and H554A of 9°N (SEQ ID NO: 6). A variant Family D polymerase lacking exonuclease activity may comprise only such portion of the DP1 subunit as necessary to support catalytic activity of the polymerase (DP2) subunit.
Variant Family D polymerases are provided here that differ from wild type Family D DNA polymerases in their abilities to incorporate rNTPs. For example, the incorporation ratio of adenosine triphosphate to deoxyadenosine triphosphate (rA:dA) may be at least 0.19, less than 0.26 (or less than wild type), more than 0.26 (or more than wild type), at least 0.28, at least 0.3, at least 0.4, at least 0.5, at least 0.8, or at least about 0.9. Use of wild type Family D polymerases may have been limited in the past, in part, by their rNTP/dNTP selectivity.
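As a minimal arithmetic sketch of how an rA:dA incorporation ratio might be compared against the wild type value, consider the following Python snippet. The wild type ratio of 0.26 is the figure stated herein; the function names and the input quantities (arbitrary units of incorporated rATP and dATP) are hypothetical and do not represent the assay of the Examples.

```python
WILD_TYPE_RA_DA = 0.26  # wild type rA:dA ratio stated in the figure descriptions


def incorporation_ratio(r_incorporated: float, d_incorporated: float) -> float:
    """Return the rA:dA ratio from two measurements in the same arbitrary units."""
    if d_incorporated <= 0:
        raise ValueError("dATP incorporation must be positive")
    return r_incorporated / d_incorporated


def compare_to_wild_type(ratio: float, wild_type: float = WILD_TYPE_RA_DA) -> str:
    """Classify a ratio as above, below, or equal to the wild type ratio."""
    if ratio > wild_type:
        return "above wild type"
    if ratio < wild_type:
        return "below wild type"
    return "equal to wild type"


# Hypothetical measurements for two variants.
for name, r_signal, d_signal in [("variant 1", 46.0, 50.0), ("variant 2", 10.0, 50.0)]:
    ratio = incorporation_ratio(r_signal, d_signal)
    print(f"{name}: rA:dA = {ratio:.2f} ({compare_to_wild_type(ratio)})")
```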
Like T4 and T7 polymerases, Family D DNA polymerases may have little to no strand displacement activity. When polD encounters a downstream strand of DNA hybridized to the template strand during synthesis, it pauses rather than displacing the hybridized strand. Accordingly, polD may be used in applications for which strand displacement is not desired or required, including, for example, gap-filling applications. While Family B polymerases stall when they encounter a uracil in the template strand, variant Family D polymerases are able to read through and continue polymerization. Accordingly, variant Family D polymerases could be useful for sequencing polynucleotides that have or have the potential to have one or more deaminated cytosines.
Like PolD, variant Family D polymerases may synthesize DNA at a slower rate than many other polymerases. DNA sequencing by synthesis applications typically require use of a reversible terminator. While faster reaction rates are commonly preferred, the slower rate of variant Family D polymerases could be optimized for real time detection of base incorporation in the growing DNA strand.
In some embodiments, a fusion protein may comprise a variant Family D polymerase and a replisome protein. Variant Family D polymerases may bind other replication enzymes, including core replisome proteins, for example, mini-chromosome maintenance (MCM1 and MCM2) helicase, DNA ligase, the archaeal Cdc45 protein (GAN), Cdc6 and the processivity factor PCNA, whereas Family B DNA polymerases may bind MCM2 and single-stranded binding protein RPA3. Combining a replisome protein and a variant Family D polymerase in a fusion protein may enhance the activity of one or both proteins (e.g., by making more efficient use of substrates, improving the quantity/purity of products, and/or facilitating delivery of both molecules to a desired site). Binding or fusing polD with accessory proteins may enable coupling polD synthesis with other activities such as MCM helicase DNA unwinding, ligation of nicks, or increases in processivity with PCNA clamps.
Aphidicolin is an antibiotic that acts on Family B polymerases to reversibly inhibit eukaryotic DNA replication. Variant Family D polymerases, like PolD itself, may remain active, polymerizing DNA in the presence of aphidicolin. Accordingly, where both polB and a variant Family D polymerase are present, but polD activity is preferred or required, a reaction may be performed in the presence of aphidicolin.
Variant Family D polymerases may have an amino acid sequence comprising one or more of the consensus domains shown in Table 1, each of which may have the indicated number of substitutions. Each of these domains corresponds to the indicated portion of wild type Family D polymerase (e.g., Euryarchaeota 9°N PolD-L; SEQ ID NO: 4) and one or more of the degenerate sequences disclosed herein (SEQ ID NOS: 1-3). Variant Family D polymerases have at least one substitution relative to wild type Family D polymerase (e.g., in domain G1, G2, G3, or G4) and are not naturally occurring.
Table 1: Family D Polymerase Domains
Figure imgf000015_0001
* At least one of the substitutions is at the position corresponding to position 931 of SEQ ID NO: 1, 2, 3, or 4.
For example, a variant Family D polymerase may have an amino acid sequence comprising domains as shown in Table 2 with domain C referring to either of domains C1 and C2, domain F referring to either of domains F1 and F2, and domain G referring to any of domains G1, G2, G3, and G4 of Table 1. Each domain, when present, may appear in alphabetical order from the amino terminal end to the carboxy terminal end of a variant Family D polymerase as shown in Table 2 and further may appear in a position corresponding to the position of the comparable domain in a wild type Family D polymerase.
Table 2: Variant Family D Polymerase Domains
Figure imgf000016_0001
Figure imgf000016_0002
Figure imgf000017_0001
Figure imgf000017_0002
Figure imgf000018_0002
* C1 or C2 † F1 or F2
* G1, G2, G3, or G4
Figure imgf000018_0001
Various variant Family D polymerases have been engineered and tested for their ability to preferentially incorporate ribonucleotides into polynucleotides. In some embodiments, a variant Family D polymerase may have an amino acid sequence comprising one or more substitutions at one or more positions selected from positions corresponding to positions 145-169, 245-271, 336, 354-364, 368, 392, 395, 398, 452-455, 651-679, 808-816, 821-841, 918-943 (e.g., 930, 931, and 932), 947-964 (e.g., 954, 959, and 963), 966, and 1020 of SEQ ID NO: 4. A variant Family D polymerase may have an amino acid sequence comprising one or more sequences having (a) at least 90% identity to domain A, B, D or F2, (b) at least 85% identity to domain E or H, (c) at least 80% identity to domain C1, C2, or F1, (d) up to 97% identity to G1, G2, G3, or G4, in some embodiments. A variant DNA polymerase may have an amino acid sequence comprising one or more substitutions at one or more positions selected from positions corresponding to positions 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4.
It may be desirable, in some embodiments, for variant Family D polymerases to have a substitution at a position corresponding to position 931 of SEQ ID NO: 1, 2, 3, or 4 or one or more substitutions in the triplet Pro-His-Thr (SEQ ID NO: 15) corresponding to positions 930-932 of SEQ ID NO: 1, 2, 3, or 4 (domain G4). For example, a variant Family D polymerase comprises at least one substitution in G1, G2, G3, or G4. Such variants may further include a non-naturally occurring sequence having at least 80%, 85%, 90%, 95% or 97% or 100% sequence identity with one or more of the other sequence motifs described in Table 1 and Fig. 1. For example, a variant Family D polymerase may have an amino acid sequence comprising a sequence having (a) at least 85% identity to domain C1 or C2, (b) at least 85% identity to domain E, (c) at least 80% identity to domain F1 or F2, and (d) up to 97% identity to domain G1, G2, G3, or G4 (e.g., Table 2, numbers 4, 61, 69, 72, 79, 89, 103, 105, 108, 110, 113, 120, 124, 125, 127, 130, and 131). A variant Family D polymerase may have an amino acid sequence comprising, for example, (a) a sequence having up to 97% identity to domain G1, G2, G3, or G4, and (b) a sequence having (i) at least 90% identity to domain A (e.g., Table 2, number 5), (ii) at least 90% identity to domain B (e.g., Table 2, number 6), (iii) at least 80% identity to domain C1 or C2 (e.g., Table 2, number 7), (iv) at least 90% identity to domain D (e.g., Table 2, number 8), (v) at least 85% identity to domain E (e.g., Table 2, number 9), (vi) at least 90% identity to domain F1 or F2 (e.g., Table 2, number 10), (vii) at least 85% identity to domain H (e.g., Table 2, number 11), or any combinations thereof (e.g., Table 2, numbers 12-17, 20-23, 30, 31, 33, 34, 40-42, 48, 50-52, 60, 61, 70, 71, 73, 80-82, 83, 90-92, 93, 100-103, 110, 111, 114, 120, 121, 130, and 131).
In some embodiments, a variant Family D polymerase may comprise a substitution at a position corresponding to position 931 of any of SEQ ID NOS: 1-4. Substitutions at position 931 may include H931D, H931E, H931C, H931L, H931I, H931F, H931G, H931T, H931V, H931Q, H931M, H931P, H931S, and H931A, may be limited to H931D and H931E (e.g., where a rA:dA ratio below wild type is desired), or may be limited to H931C, H931L, H931I, H931F, H931G, H931T, H931V, H931Q, H931M, H931P, H931S, and H931A (e.g., where a rA:dA ratio above wild type is desired).
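The grouping of position 931 substitutions described above can be sketched, for illustration only, as a small Python lookup. The two groups are taken directly from the preceding paragraph; the function names are hypothetical, and the returned labels merely restate the text rather than predict the behavior of any particular enzyme.

```python
import re

# Replacement residues listed above for position 931, grouped by the stated use.
LOWER_RATIO = set("DE")             # e.g., where an rA:dA ratio below wild type is desired
HIGHER_RATIO = set("CLIFGTVQMPSA")  # e.g., where an rA:dA ratio above wild type is desired

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'H931A' into (wild type residue, position, replacement)."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a single-residue substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def listed_group(label: str) -> str:
    """Return the group in which a position 931 substitution is listed above."""
    wild_type, position, replacement = parse_substitution(label)
    if (wild_type, position) != ("H", 931):
        return "not covered by this sketch"
    if replacement in LOWER_RATIO:
        return "below wild type"
    if replacement in HIGHER_RATIO:
        return "above wild type"
    return "not listed"


for label in ("H931A", "H931D", "H931K"):
    print(label, "->", listed_group(label))
```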
A variant Family D polymerase may have one or more substitutions at one or more positions corresponding to any of the positions of the Pyrococcus abyssi Family D polymerase disclosed in Table 1 of U.S. Provisional Application No. 62/976,039 filed February 13, 2020 (e.g., positions 106-161, 243-267, 326-330, 361-365, 385-397, 441-451, 657-667, 822-829, 919-928, 940-962, and 981-997 of SEQ ID NO: 5 or positions 109-164, 246-270, 332-336, 367-371, 391-403, 447-447, 665-675, 830-837, 927-936, 948-970, 989-1005 of SEQ ID NO: 4).
A variant Family D polymerase may have any desired length. For example, a variant Family D polymerase may be shorter (e.g., up to 10% shorter) or longer (e.g., up to 10% longer) than a wild-type sequence. A variant Family D polymerase may have the same or about the same length as a wild-type Family D polymerase (e.g., 1250-1300 amino acids). A Family D polymerase, in some embodiments, may have fewer (e.g., 1-20 fewer) or may have more (e.g., 1-20 more) amino acids than SEQ ID NO: 1, 2, 3, or 4. A variant Family D polymerase may have an amino acid sequence having at least 90% identity to any of SEQ ID NOS: 1-4. For example, a variant Family D polymerase having a substitution at a position corresponding to position 931 of any of SEQ ID NOS: 1-4 may comprise a sequence having at least 90% identity to any of SEQ ID NOS: 1-4, at least 95% identity to any of SEQ ID NOS: 1-4, at least 98% identity to any of SEQ ID NOS: 1-4, at least 99% identity to any of SEQ ID NOS: 1-4, or at least 99.5% identity to any of SEQ ID NOS: 1-4, in each case, including all substitutions in the percent identity calculation.
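One way a percent identity calculation that counts every substitution might be sketched is shown below in Python. This is an assumption-laden illustration: it takes two pre-aligned, equal-length sequences with "-" as the gap character, counts insertions and deletions as differences, and divides by the number of aligned columns. Alignment tools may define percent identity differently, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns with identical residues (gaps count as differences)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns that are gaps in both sequences; they carry no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def is_variant_within(identity: float, minimum: float) -> bool:
    """True if identity is at least `minimum` percent but less than 100 percent."""
    return minimum <= identity < 100.0


# Hypothetical eight-residue fragments differing at a single position.
identity = percent_identity("MKTPHTGA", "MKTPATGA")
print(identity, is_variant_within(identity, 85.0))
```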
Variant Family D polymerases may have improved properties including, for example, an increased ability to incorporate ribonucleotides, increased replicative fidelity, or both. Additional properties may include, for example, no strand displacement activity, no aphidicolin sensitivity, ability to read through uracil on a template strand, slower rate of DNA synthesis relative to other families of polymerases, ability to bind other replication enzymes, and combinations thereof.
A variant Family D polymerase may be included in a composition with other materials. A composition may comprise, for example, a variant Family D polymerase and one or more of a buffering agent, a crowding agent (such as polyethylene glycol included in a storage or reaction mixture), a single strand binding protein or portions thereof, an unwinding agent (e.g., a helicase), a detergent (e.g., a nonionic, cationic, anionic or zwitterionic detergent), an additive (e.g., albumin), glycerol, salt (e.g., KCl), EDTA, a dye, a reaction enhancer or inhibitor, an oxidizing agent, a reducing agent, a solvent and/or a preservative. The buffering agent combined with additional reagents that are standard in the art may be formulated for storage of a variant Family D polymerase and/or for the desired reaction mixture. The formulation for storage of the polymerase variant or for an amplification reaction may be the same or different.
A variant Family D polymerase may be prepared for storage as a reagent or included in a mastermix for storage. Alternatively, a polymerase may be included in a reaction buffer or mastermix. The reaction mix and the storage mix may be the same or different. For example, a storage mix may be a concentrated form of the reagent for dilution into a reaction mix. In one example, the polymerase variant may be in a lyophilized form. In another example, a polymerase variant may be in a master mix containing deoxyribose nucleoside triphosphates (e.g., one, two, three or all four of dATP, dTTP, dGTP and dCTP and/or one or more modified dNTPs) and, optionally, ribonucleoside triphosphates. The concentration of the one or more labeled NTPs in a composition may be in the range of 10 nM to 200 mM. In some embodiments, the molar ratio of a labeled NTP to the corresponding unlabeled version of the same NTP (e.g., biotin-dCTP to dCTP) in the composition may be in the range of 1:1000 to 1000:1, e.g., 1:100 to 100:1 or 1:10 to 10:1. For example, the molar ratio of the labeled dNTP to the corresponding unlabeled dNTP (e.g., biotin-dCTP to dCTP) in the nucleotide mix may be in the range of 1:1000 to 1:100, 1:100 to 1:10, 1:10 to 1:1, 1:1 to 1:10, 1:10 to 1:100, or 1:100 to 100:1000. The composition may optionally comprise primers. In some embodiments, the primers may be partially or completely random. In some embodiments, the primers may be exonuclease-resistant. In some embodiments, primers may comprise one or more chemical modifications, for example, phosphorothioate modifications. In some embodiments, the chemical modifications on the primers may occur at the 3’ terminal or at both the 3’ and 5’ terminals of the primers. In some embodiments, the chemical modifications on the primers may further occur at one or more non-terminal positions of the primers.
A variant Family D polymerase or, optionally, a composition comprising a variant Family D polymerase may be prepared in any desired form. Examples include a liquid or an emulsion. Alternatively, the variant or composition containing the variant Family D polymerase may be formulated as a solid, granule, lyophilized state, gel, pellet, or powder. A variant may be immobilized on a solid surface such as a matrix, bead, paper, plastic, resin, column, chip, microfluidic device or other instrument platform where the variant Family D polymerase adheres directly or via an affinity binding domain. A variant Family D polymerase may be included in a fusion protein with an affinity binding domain such as a maltose binding protein, a chitin binding domain, a His tag, a self-labeling protein tag (a SNAP-Tag®, New England Biolabs), biotin or combinations thereof. An affinity binding domain may be associated with a variant Family D polymerase non-covalently such as via an antibody. A composition comprising a variant Family D polymerase may further comprise a second polymerase wherein the second polymerase differs from the variant Family D polymerase.
The present disclosure further relates to methods of using variant Family D polymerases. For example, methods of polynucleotide synthesis are provided, including amplification (e.g., isothermal amplification, helicase dependent amplification, rolling circle amplification (RCA), multiple strand displacement amplification (MDA), whole genome amplification), quantitative amplification, and sequencing by synthesis. A method of synthesizing a copy of a polynucleotide template may comprise contacting the polynucleotide template with (a) a variant Family D polymerase, (b) dNTPs and at least one of an rNTP and a modified NTP, and (c) optionally, at least one primer having a sequence complementary to at least a portion of the sequence of the polynucleotide template for a time and at a temperature to produce the copy of the polynucleotide template. The need for a primer may be obviated where the template polynucleotide is hybridized to a complementary strand. A method of polynucleotide synthesis may include combining a polynucleotide template (e.g., a target DNA) to be amplified with a variant Family D polymerase, dNTPs, one or more rNTPs, and optionally, one or more primers to produce a reaction mixture; and incubating the reaction mixture to synthesize one or more copies of the template. A polynucleotide template may comprise one or more uracils (e.g., cytosines that have been oxidized to uracil). Synthesized fragments may be used for any analytical purpose (e.g., sequencing), synthetic purpose (e.g., cloning), or diagnostic purpose (e.g., single nucleotide polymorphism (SNP) detection).
Methods of polynucleotide repair (e.g., gap-filling) are provided. A method of filling in a gap in an otherwise double-stranded DNA molecule comprises contacting the DNA molecule with (a) a variant Family D polymerase, (b) dNTPs, and (c) optionally, rNTPs for a time and at a temperature to fill in the gap.
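The labeled-to-unlabeled molar ratios above translate into concentrations by simple proportion. The following is a minimal sketch of that arithmetic; the function name and the 220 µM total dCTP pool are hypothetical values chosen only for illustration and do not come from the disclosure.

```python
def split_by_molar_ratio(total_um, labeled_parts, unlabeled_parts):
    """Split a total nucleotide concentration (µM) into labeled and
    unlabeled portions for a labeled:unlabeled molar ratio."""
    if total_um < 0 or labeled_parts <= 0 or unlabeled_parts <= 0:
        raise ValueError("concentration must be >= 0 and ratio parts > 0")
    parts = labeled_parts + unlabeled_parts
    labeled = total_um * labeled_parts / parts
    return labeled, total_um - labeled


# Hypothetical example: a 220 µM dCTP pool at a 1:10 ratio of
# biotin-dCTP to dCTP.
biotin_dctp_um, dctp_um = split_by_molar_ratio(220.0, 1, 10)
print(f"biotin-dCTP: {biotin_dctp_um:.1f} µM, dCTP: {dctp_um:.1f} µM")
```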
A variant Family D polymerase may be used to synthesize synthetic molecules containing modified nucleotides. For example, a variant Family D polymerase may be used to synthesize modified nucleic aptamers for use as inhibitors or therapeutics. A method may comprise contacting an aptamer template polynucleotide with a variant Family D polymerase, dNTPs, and at least one rNTP. Aptamers synthesized by variant Family D polymerase with partial or total substitutions with ribonucleotides increase the chemical diversity and functionality of the aptamer molecule.
A variant Family D polymerase may be used to synthesize synthetic molecules containing modified nucleotides. For example, a variant Family D polymerase may be used to synthesize modified nucleic aptamers for use as inhibitors or therapeutics. A method may comprise contacting an aptamer template polynucleotide with a variant Family D polymerase, dNTPs, and at least one modified NTP. Aptamers synthesized by variant Family D polymerase with partial or total substitutions with 2'-modified nucleotides increase the chemical diversity and functionality of the aptamer molecule.
Variant Family D polymerases may be used to site-specifically label DNA by incorporation of a nucleotide containing a 2’ modification, for example 2’-azido-ATP. This 2’ modification could be a molecular handle which allows for reactivity of the nucleotide and, in turn, of the DNA into which it has been incorporated. A modified nucleotide could be incorporated into a double-stranded polynucleotide at a primer/template junction, or at a nick or gap site, by incubating (a) a variant Family D polymerase, (b) a DNA molecule, and (c) a modified nucleotide for a time and at a temperature to allow for incorporation of the modified nucleotide into a synthesized copy of the DNA. The molecular handle (e.g., an azido group) present on the modified nucleotide can be a reactive group that allows for DNA labeling, e.g., by click chemistry (Nat Chem Biol. 2017 Sep 19; 13(10): 1057. doi: 10.1038/nchembio.2482. PMID: 28926550). Further, the incorporation of modified nucleotides by a variant Family D polymerase could render the newly synthesized molecule resistant to exonucleases, and therefore protect the molecule from degradation in vivo.
In some embodiments, variant Family D polymerases with the ability to readily incorporate ribonucleotides during synthesis may be used in RNA sequencing. In these techniques, a variant Family D polymerase synthesizes DNA molecules with partial ribonucleotide substitution by partially replacing a single dN with the corresponding rN. For example, in "A" reactions, a variant Family D polymerase is contacted with an RNA template and dATP, rATP, dCTP, dGTP and dTTP so that synthetic products contain dA, rA, dC, dG, and dT. Correspondingly, "C" reactions produce molecules with dA, dC, rC, dG, dT, "G" reactions produce molecules with dA, dC, dG, rG, dT, and "T" reactions produce molecules with dA, dC, dG, dT, rU. Following synthesis, the produced polynucleotide molecules are treated with alkali or RNaseH2 to cleave at ribonucleotide sites, resulting in a population of products, each terminating with a ribonucleotide. Products are separated by electrophoresis and analyzed to determine the sequence pattern.
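The readout logic of the four reactions can be sketched in a few lines. This is a simplified model under stated assumptions, not the disclosed protocol: each labeled fragment is assumed to end exactly at the substituted nucleotide, every site is assumed to be sampled, the "T" reaction is written with T rather than rU, and the function names and the seven-base product are hypothetical.

```python
def ladder_lengths(product, base):
    """Fragment lengths expected when a 5'-labeled product is cleaved at
    every occurrence of `base` (fragment assumed to end at that base)."""
    return [i + 1 for i, nt in enumerate(product) if nt == base]


def read_sequence(ladders):
    """Rebuild the sequence by ordering fragments from all four
    reactions by length, as on an electrophoresis trace."""
    by_length = {length: base
                 for base, lengths in ladders.items()
                 for length in lengths}
    return "".join(by_length[n] for n in sorted(by_length))


# Hypothetical 5'->3' synthesized product.
product = "GATTACA"
ladders = {base: ladder_lengths(product, base) for base in "ACGT"}
print(ladders)
print(read_sequence(ladders))
```

Each reaction contributes only the fragment lengths for its own base, so merging the four ladders by length recovers the order of bases in the product.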
In some embodiments, variant Family D polymerase may be used to create polynucleotides resistant to one or more restriction enzymes. Restriction enzymes recognize and cleave specific DNA sequences. Modifying that recognition sequence (e.g., by methylating cytosine in the recognition sequence) may block restriction enzyme cleavage. A method for producing a restriction enzyme resistant polynucleotide may comprise contacting a template polynucleotide having a recognition sequence for a restriction enzyme with a variant Family D polymerase, deoxyribonucleotides as dNTPs and a ribonucleotide as rNTP to produce a polynucleotide having a modified recognition sequence for the restriction enzyme, the modified recognition sequence comprising the ribonucleotide. Incorporation of rNTPs by variant Family D polymerases may modify restriction enzyme recognition sequences from DNA to partially or fully ribosubstituted molecules that may block restriction enzyme cleavage. This technique could enable selective restriction enzyme cleavage of certain polynucleotides in populations of molecules.
Variant Family D polymerase can also introduce modifications to enable site-specific cleavage. For example, a template polynucleotide, comprising a primer binding sequence and a target nucleotide adjacent to the 5’ end of the primer binding sequence, may be contacted with a primer having a 5’ end, a 3’ end, and a sequence complementary to the primer binding sequence, a variant Family D polymerase, and a ribonucleotide as rNTP complementary to the target nucleotide to form a reaction mixture and produce an extended primer comprising the ribonucleotide at its 3’ end. The reaction mixture may be contacted with dNTPs to further extend the extended primer. The resulting product can then be specifically cleaved at the rN by RNaseH2 to create a site-specific nick. One application is USER® cloning (Bitinaite J, Rubino M, Varma KH, Schildkraut I, Vaisvila R, Vaiskunaite R. USER friendly DNA engineering and cloning method by uracil excision. Nucleic Acids Res. 2007;35(6):1992-2002).
Variant Family D polymerases, in some embodiments, may have an even higher capacity to discriminate against rNTP compared to wild type. For example, the incorporation ratio of adenosine to deoxyadenosine (rA:dA) may be less than 0.25, less than 0.24, less than 0.23, less than 0.22, less than 0.21, or less than 0.20. A variant Family D polymerase may comprise an aspartate or glutamate substitution at the position corresponding to position 931 of SEQ ID NOS: 1, 2, 3 or 4. Products produced by variant Family D polymerases (e.g., comprising H931D and H931E substitutions) in the presence of both rNTPs and dNTPs may include fewer rNMPs than products produced by wild type polymerases or other variant Family D polymerases.
The present disclosure further relates to kits including a variant Family D DNA polymerase as described herein. For example, a kit may include a variant Family D DNA polymerase and dNTPs, rNTPs, primers, other enzymes (e.g., other polymerases, enzymes other than polymerases, or both), buffering agents, or combinations thereof. A variant Family D polymerase may be included in a storage buffer (e.g., comprising glycerol and a buffering agent). A kit may include a reaction buffer which may be in concentrated form, and the buffer may contain additives (e.g., glycerol), salt (e.g., KCl), reducing agent, EDTA or detergents, among others. A kit comprising dNTPs may include one, two, three or all four of dATP, dTTP, dGTP and dCTP. A kit comprising rNTPs may include one, two, three or all four of rATP, rUTP, rGTP and rCTP. A kit may further comprise one or more modified nucleotides. The kit may optionally comprise one or more primers (random primers, bump primers, exonuclease-resistant primers, chemically-modified primers, custom sequence primers, or combinations thereof). One or more components of a kit may be included in one container for a single step reaction, or one or more components may be contained in one container, but separated from other components for sequential use or parallel use. The contents of a kit may be formulated for use in a desired method or process.
A kit is provided that contains: (i) a variant Family D polymerase; and (ii) a buffer. The variant polymerase may have a lyophilized form or may be included in a buffer (e.g., a storage buffer or a reaction buffer in concentrated form). The kit may contain the variant polymerase in a mastermix suitable for receiving and amplifying a template nucleic acid. The DNA polymerase may be a purified enzyme so as to contain substantially no DNA or RNA and no nucleases. The reaction buffer in (ii) and/or storage buffers containing the DNA polymerase in (i) may include non-ionic or ionic (e.g., anionic or zwitterionic) surfactants and crowding agents. A kit may further include one or more dNTPs including dNTPs with large adducts such as a fluorescent-label or biotin-modified nucleotide, or a methylated nucleotide or other modified nucleotide. The kit may include the DNA polymerase and reaction buffer in a single tube or in different tubes.
EXAMPLES
Some specific embodiments are further illustrated by the examples that follow. All reagents identified herein are available from New England Biolabs (Ipswich, MA).
EXAMPLE 1: Family D Polymerases Have Conserved Domains
Motifs common to Family D polymerases were identified in three steps. First, proteins similar to SEQ ID NO: 4 were identified using Geneious Software with the following BLAST settings:
Database: GenBank nucleotide collection (nr/nt)
Program: blastp
Matrix: BLOSUM62
Word Size: 6
Max E-value: 10
Gap cost (Open Extend): 11 1
Sequences similar to 9°N polD (SEQ ID NO: 4) identified in the previous step were aligned using Geneious Software with the following settings:
MUSCLE Alignment 3.8.425 Settings:
Maximum number of Iterations: 8
Iteration 1: kmer6_6; Subsequent: pctid_kimura
Iteration 1 & 2: UPGMG; Subsequent: UPGMB
Iteration 1 & 2: pseudo; Subsequent: pseudo
Iteration 1 & 2: CLUSTALW; Subsequent: CLUSTALW
Objective score: spm
Anchor spacing: 32
Min length: 24
Multiplier: 1.2
Window Size: 5
Margin: 5
Finally, from the multiple sequence alignment, conserved amino acid motifs among the 50 or 200 closest matches were identified using Geneious software. The motifs identified are shown in Table 1 and Figure 1.
Steric gate motifs of wild type Family D polymerases from representative members of the archaeal phyla were aligned using the Multiple Sequence Alignment Tool Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo/). Table 3 shows that the Pro-His-Thr triplet (SEQ ID NO: 15) is conserved among Euryarchaeota, Bathyarchaeota, Thaumarchaeota, Aigarchaeota, Korarchaeota, Odinarchaeota, Lokiarchaeota, Thorarchaeota and Heimdallarchaeota. Invariant motif amino acids P930 and T932 are underlined and invariant steric gate amino acid H931 is shown in bold.
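The conservation described above can be checked directly from the ten-residue motifs and start positions listed in Table 3. The sketch below is illustrative only; it assumes the motifs are already aligned without gaps, as they appear in the table, and the variable names are hypothetical.

```python
# Start position and steric gate motif for each phylum, from Table 3.
motifs = {
    "Euryarchaeota": (927, "GLAPHTSAGI"),
    "Bathyarchaeota": (852, "GLAPHTSVGI"),
    "Thaumarchaeota": (829, "GLAPHTSVGI"),
    "Aigarchaeota": (868, "SLSPHTYVAV"),
    "Korarchaeota": (846, "TMAPHTFTAI"),
    "Odinarchaeota": (862, "GLAPHTSAGI"),
    "Lokiarchaeota": (886, "GLAPHTSAGI"),
    "Thorarchaeota": (951, "GLAPHTSAGI"),
    "Heimdallarchaeota": (864, "GLAPHTSAGV"),
}

sequences = [motif for _, motif in motifs.values()]

# Columns (0-based) where every phylum has the same residue.
invariant = [i for i, column in enumerate(zip(*sequences))
             if len(set(column)) == 1]

# Express the invariant residues in 9°N PolD numbering (motif starts at 927).
start, reference = motifs["Euryarchaeota"]
labels = [f"{reference[i]}{start + i}" for i in invariant]
print(labels)
```

Under these assumptions the only invariant columns are the fourth to sixth residues of the motif, which correspond to P930, H931 and T932 in the 9°N PolD numbering.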
Table 1
Table 3. DNA polymerase D, large subunit steric gate motif alignment
Phylum              Steric gate motif     GenBank       SEQ ID NO
Euryarchaeota       927 GLAPHTSAGI 936    9°N PolD      20
Bathyarchaeota      852 GLAPHTSVGI 861    KPV61551.1    21
Thaumarchaeota      829 GLAPHTSVGI 838    RNJ72434.1    22
Aigarchaeota        868 SLSPHTYVAV 877    PUA31350.1    23
Korarchaeota        846 TMAPHTFTAI 855    RLG53083.1    24
Odinarchaeota       862 GLAPHTSAGI 871    OLS17959.1    25
Lokiarchaeota       886 GLAPHTSAGI 895    TET59439.1    26
Thorarchaeota       951 GLAPHTSAGI 960    TFH09197.1    27
Heimdallarchaeota   864 GLAPHTSAGV 873    OLS25708.1    28
EXAMPLE 2: Identification of the Family D Polymerase Steric Gate
In all other DNA polymerase families, a steric gate amino acid blocks ribonucleotide incorporation at low ribonucleotide concentrations. Altering the steric gate amino acid may reduce discrimination against ribonucleotides at low concentration. In Family D DNA polymerases, the position of the rNTP 2'-OH in the active site had not been identified. To identify the determinants of ribonucleotide discrimination in Family D DNA polymerases, active site amino acid variants were screened for increased incorporation at low ribonucleotide concentration (1 mM). Amino acid positions predicted to be within a reasonable distance of the catalytic site of the polymerase domain (DP2) were identified from available structural data (Protein Data Bank Accession number 6HMS; Raia, P., Carroni, M., Henry, E., Pehau-Arnaudet, G., Brule, S., Beguin, P., et al. (2019). Structure of the DP1-DP2 PolD complex bound with DNA and its implications for the evolutionary history of DNA and RNA polymerases. PLoS Biology, 17(1), e3000122) using PyMol software (Schrödinger, LLC) and selected for further evaluation. Specifically, amino acids at selected positions were substituted and the resulting variants were tested for changes in rNTP incorporation (Example 3).
EXAMPLE 3: Variant Family D Polymerase Incorporates Ribonucleotides
A nucleotide incorporation assay was performed to determine the ratio of rATP to dATP incorporated by wild type (H931) and variant Family D polymerases (H931X). A primer-template was used to monitor rATP and dATP incorporation and was prepared by annealing the 50mer 5’-FAM primer (10 µM) (5'-FAM-AGT GAA TTC GAG CTC GGT ACC CGG GGA TCC TCT AGA GTC GAC CTG CAG GT-3' (SEQ ID NO: 29)) to the 67mer template (15 µM) (5'-AAG CAC GAA AGC AGG GTA CCT GCA GGT CGA CTC TAG AGG ATC CCC GGG TAC CGA GCT CGA ATT CAC T-3' (SEQ ID NO: 30)) in 1X ThermoPol buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 mM KCl, 2 mM MgSO4, 0.1% Triton X-100, pH 8.8 at 25 °C) by heating to 95 °C for three minutes followed by cooling to room temperature.
A 10 µL polD/DNA aliquot (polD (SEQ ID NO: 4) or H931X) was prepared by mixing ThermoPol buffer (1X final concentration), primer/template DNA (40 nM final concentration) and polD (2 µL of heat lysate containing over-expressed polD or H931X). A second 10 µL aliquot was prepared by mixing ThermoPol buffer (1X final concentration) and dATP or ATP (2 mM final concentration). The polD/DNA and dATP or ATP aliquots were mixed (20 µL reaction) and placed at 65 °C for 10 min, after which 20 µL of 50 mM EDTA was added. A negative control reaction was performed in which 10 µL of 1X ThermoPol buffer was added to the polD/DNA mixture instead of nucleotides.
Reaction products were separated by capillary electrophoresis using a 3730xl Genetic Analyzer (Applied Biosystems) and fluorescent peaks were analyzed using Peak Scanner software version 1.0 (Applied Biosystems). The percentage of ribonucleotide incorporation was divided by the percentage of deoxyribonucleotide incorporation to obtain the rA:dA ratio. All assays were performed in triplicate to ensure experimental reproducibility. Results are shown in FIGURE 2, in which it can be seen that wild type polD incorporated rATP and dATP at a ratio of 0.26 and variant Family D polymerases incorporated rATP and dATP at a ratio ranging from 0.2 to 0.92, with polD H931A having the highest ratio of rATP to dATP incorporation.
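The design of the primer-template can be verified from the two sequences given above. The sketch below is illustrative (the variable and function names are hypothetical): it confirms that the primer is complementary to the 3' portion of the template and reports the first templating base in the single-stranded overhang, which determines the first nucleotide to be incorporated.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO: 29 (primer) and SEQ ID NO: 30 (template), written 5'->3'.
primer = ("AGT GAA TTC GAG CTC GGT ACC CGG GGA TCC TCT AGA "
          "GTC GAC CTG CAG GT").replace(" ", "")
template = ("AAG CAC GAA AGC AGG GTA CCT GCA GGT CGA CTC TAG AGG ATC "
            "CCC GGG TAC CGA GCT CGA ATT CAC T").replace(" ", "")

# The primer anneals where the template matches its reverse complement.
binding_site = reverse_complement(primer)
anneal_start = template.find(binding_site)  # 0-based index, -1 if absent

# Template bases 5' of the binding site remain single-stranded and are
# copied 3'->5' along the template, starting next to the primer 3' end.
overhang = template[:anneal_start]
first_template_base = overhang[-1]
first_incorporated = first_template_base.translate(COMPLEMENT)

print(len(primer), len(template), anneal_start)
print(f"first templating base: {first_template_base}, "
      f"first incorporated nucleotide: {first_incorporated}")
```

The first templating base is T, so extension by a single nucleotide requires dATP or rATP, consistent with the use of this construct to compare rA and dA incorporation.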
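The concentrations in the assembled reaction follow from the volumes stated above. A minimal sketch of the dilution arithmetic, using the volumes and concentrations as given in the protocol text (the function name is hypothetical):

```python
def dilute(concentration, volume_ul, final_volume_ul):
    """Concentration after `volume_ul` of a solution is brought to
    `final_volume_ul`; units of concentration are preserved."""
    if volume_ul <= 0 or final_volume_ul < volume_ul:
        raise ValueError("final volume must be at least the starting volume")
    return concentration * volume_ul / final_volume_ul


# Two 10 µL aliquots are combined into one 20 µL reaction.
reaction_ul = 10 + 10
template_nm = dilute(40, 10, reaction_ul)    # primer/template, nM
nucleotide_mm = dilute(2, 10, reaction_ul)   # dATP or ATP, mM

# Quenching adds 20 µL of 50 mM EDTA to the 20 µL reaction.
quenched_ul = reaction_ul + 20
edta_mm = dilute(50, 20, quenched_ul)        # EDTA, mM

print(f"primer/template: {template_nm} nM, nucleotide: {nucleotide_mm} mM, "
      f"EDTA after quench: {edta_mm} mM")
```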
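The ratio calculation described above can be written out as follows. This is a sketch under stated assumptions: percent incorporation is assumed to be the extended-product peak area as a share of the total (product plus unextended primer) peak area, which the text does not specify, and the peak areas are made-up numbers for illustration, not measured data.

```python
from statistics import mean


def percent_incorporation(product_area, primer_area):
    """Extended product as a percentage of total fluorescent signal
    (assumed definition)."""
    total = product_area + primer_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * product_area / total


def ra_da_ratio(ribo_percents, deoxy_percents):
    """Mean percent rA incorporation divided by mean percent dA
    incorporation across replicates."""
    deoxy_mean = mean(deoxy_percents)
    if deoxy_mean == 0:
        raise ValueError("no dA incorporation; ratio is undefined")
    return mean(ribo_percents) / deoxy_mean


# Hypothetical (product, primer) peak areas for triplicate reactions.
ribo_peaks = [(280, 720), (300, 700), (320, 680)]
deoxy_peaks = [(900, 100), (900, 100), (900, 100)]

ribo = [percent_incorporation(p, s) for p, s in ribo_peaks]
deoxy = [percent_incorporation(p, s) for p, s in deoxy_peaks]
ratio = ra_da_ratio(ribo, deoxy)
print(f"rA:dA = {ratio:.2f}")
```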
EXAMPLE 4: Successive ribonucleotide incorporation
A nucleotide incorporation assay was performed to determine the ability of polD and polD H931A to incorporate successive ribonucleotides into a primer/template (FIGURES 4A and 4B). The 50mer DNA primer/67mer DNA template was created as described above. In addition to the 50mer DNA primer, a construct with a 50 nucleotide RNA primer (same sequence) was also annealed to the 67mer DNA template as described above. A 20 µL reaction containing 20 nM nucleotide construct, 100 nM polD or polD H931A, and 1 µM dNTPs or rNTPs in 1X ThermoPol buffer was placed at 65 °C for 15 min. The reaction was quenched with an equal volume of 50 mM EDTA and analyzed by capillary electrophoresis.
FIGURE 4 illustrates the ability of variant Family D polymerases to incorporate successive dNTPs and rNTPs. Figure 4A is a sketch illustrating binding of a variant Family D polymerase to a template strand (lower strand) and a primer (upper strand) having a fluorescent label. In the presence of NTPs and dNTPs, incubation at an appropriate temperature for an appropriate time results in extension of the primer. The traces shown on the left side of FIGURE 4B show that both polD and polD H931A can readily incorporate successive dNTPs onto DNA and RNA templates. However, the traces on the right show that variant polD H931A can incorporate 4 successive ribonucleotides onto a DNA template, while polD can only incorporate a single ribonucleotide onto a DNA primer.
EXAMPLE 5: Enhanced incorporation of 2'-modified nucleotides by variant Family D polymerase
Modified nucleotides rATP, 3'-dATP (Cordycepin), 2'-amino-ATP, 2'-azido-ATP and 2'-O-methyl-ATP were from TriLink Biotechnologies. The primer-template used to monitor modified nucleotide incorporation was prepared by annealing the 50mer 5’-FAM primer (10 µM) (5'-FAM-AGT GAA TTC GAG CTC GGT ACC CGG GGA TCC TCT AGA GTC GAC CTG CAG GT-3' (SEQ ID NO: 29)) to the 67mer template (15 µM) (5'-AAG CAC GAA AGC AGG GTA CCT GCA GGT CGA CTC TAG AGG ATC CCC GGG TAC CGA GCT CGA ATT CAC T-3' (SEQ ID NO: 30)) in 1X ThermoPol buffer (20 mM Tris-HCl, 10 mM (NH4)2SO4, 10 mM KCl, 2 mM MgSO4, 0.1% Triton X-100, pH 8.8 at 25 °C) by heating to 95 °C for three minutes followed by cooling to room temperature.
For each modified nucleotide incorporation reaction, a 100 µL reaction mix was prepared containing 1X ThermoPol Buffer, 10 nM primer-template, 25 nM wild type Family D DNA polymerase (polD) or variant Family D DNA polymerase (polD/H931A) and 1 µM modified nucleotide. Reactions were incubated for 1 minute at 65 °C and 10 µL aliquots were removed and mixed with 50 mM EDTA to halt the reaction. Reaction products were separated by capillary electrophoresis using a 3730xl Genetic Analyzer (Applied Biosystems) and fluorescent peaks were analyzed using Peak Scanner software version 1.0 (Applied Biosystems). The concentration of product (51 nt DNA with a FAM label) was graphed as a function of time (Table 4).
PolD incorporated rATP, 3'-dATP, 2'-amino-ATP, 2'-azido-ATP and 2'-O-methyl-ATP poorly (Table 4, middle column). PolD/H931A incorporated the modified nucleotides rATP, 3'-dATP, 2'-amino-ATP, 2'-azido-ATP and 2'-O-methyl-ATP with higher yield than polD (Table 4, right column).
Table 4: Incorporation of 2'-modified ATP by polD exo- and polD exo-/H931A
[Table 4 is presented as images (imgf000029_0001 and imgf000030_0001) in the original publication.]
(a) Reactions were performed in triplicate and average product (nM) is reported.
EXAMPLE 6: Enhanced incorporation of modified nucleotides by variant Family D polymerase
Additional variants at positions with the potential to impact activity and/or selectivity of a variant Family D polymerase were prepared as summarized in Table 5.
Table 5: Variant Family D Polymerases
Polymerase(a)   Motif         Active   Feature
WT              N/A           Y
K392A           basic         Y(b)     Primer stabilization
R395A           basic         Y
K398A           basic         Y
P930A           selectivity   Y
H931A           selectivity   Y        Steric Gate
H931S           selectivity   Y        Steric Gate
H931V           selectivity   Y        Steric Gate
H931D           selectivity   Y        Steric Gate
H931R           selectivity   N        Steric Gate
H931Y           selectivity   N        Steric Gate
T932A           selectivity   Y
K959A           acidic        Y
R960A           acidic        Y(b)     Nucleotide Binding
R961A           acidic        Y(b)     Nucleotide Binding
N962A           acidic        Y(b)     Mg2+ binding/Intein splicing
N962D           acidic        Y
C963A           acidic        Y        Intein splicing
D964A           acidic        N        Mg2+ binding
D966A           acidic        N        Mg2+ binding
(a) All polymerases contain DPH1 H554A mutation
(b) Reduced polymerase activity observed
Modified nucleotide incorporation by the additional variant Family D polymerases shown in Table 6 was assessed. Briefly, polD variants were incubated with a labeled 50mer primer/67mer template and 1 mM of either dATP or rATP for 15 min at 65 °C. Incorporation of a single dA or rA was visualized by capillary electrophoresis.
Table 6: Variant Family D Polymerases
PolD mutants screened for ribonucleotide discrimination activity
Polymerase(a)   Selection Reason        rA/dA Activity(b)
WT                                      +
F452A(c)        FVEN Motif in DPBB-1    +
V453A(c)        FVEN Motif in DPBB-1    +
E454A(d)        FVEN Motif in DPBB-1    +
N455A(d)        FVEN Motif in DPBB-1    +
K959A(d)        Acidic Motif            +
R960A(d)        Acidic Motif
R961A(d)        Acidic Motif
N962A(d)        Acidic Motif            +
C963A(d)        Acidic Motif            +
T156A(d)        KH Domain               +
P930A(d)        PHT Motif               +
H931A(d)        PHT Motif               ++++
T932A(d)        PHT Motif               +
(a) All polymerases contain DPH1 H554A mutation
(b) The rA/dA activity +, ++ and ++++ respectively indicate no observed rA incorporation, 1-25%, 25-50% and >90% ratio of rA incorporation compared to dA incorporation
(c) Residue predicted to be within 20 Å of incoming nucleotide sugar
(d) Residue predicted to be within 12 Å of incoming nucleotide sugar

Claims

CLAIMS What is claimed is:
1. A composition, comprising a variant Family D polymerase comprising:
(a) an amino acid sequence comprising at least two sequences selected from:
(i) a sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 1,
(ii) a sequence having at least 85% identity to positions 808-816 of SEQ ID NO: 1, and
(iii) a sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 1; and
(b) a substitution at a position corresponding to position 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4.
2. The composition of Claim 1, wherein the amino acid sequence comprises all three of (i) the sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 4, (ii) the sequence having at least 85% identity to positions 808-816 of SEQ ID NO: 4, and (iii) the sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 4.
3. The composition of Claim 1 or Claim 2, wherein the amino acid sequence further comprises the sequence of positions 927-930 of SEQ ID NO: 4 and the sequence of positions 932- 934 of SEQ ID NO: 4.
4. The composition of any of Claims 1-3, wherein the variant Family D polymerase has at least 80% sequence identity to SEQ ID NO: 2, 3, or 4.
5. The composition of any of Claims 1-4, wherein the variant Family D polymerase has at least 90% sequence identity to SEQ ID NO: 1, 2, 3, or 4.
6. The composition of any of Claims 1-6, wherein the variant Family D polymerase lacks exonuclease activity.
7. A composition, comprising a variant Family D polymerase comprising:
(a) an amino acid sequence having at least one sequence selected from:
(i) a sequence having at least 90% identity to positions 145-169 of SEQ ID NO: 4,
(ii) a sequence having at least 90% identity to positions 245-271 of SEQ ID NO: 4,
(iii) a sequence having at least 90% identity to positions 354-364 of SEQ ID NO: 4,
(iv) a sequence having at least 90% identity to positions 651-679 of SEQ ID NO: 4,
(v) a sequence having at least 90% identity to positions 821-841 of SEQ ID NO: 4, and
(vi) a sequence having at least 80% identity to positions 947-963 of SEQ ID NO: 4; and
(b) a substitution at a position corresponding to position 13, 203, 822, 885, 890, 980, 984, 1020, 1026, or 1165 of SEQ ID NO: 4.
8. The composition of Claim 7, wherein the amino acid sequence comprises at least two of (i) the sequence having at least 90% identity to positions 145-169 of SEQ ID NO: 4, (ii) the sequence having at least 90% identity to positions 245-271 of SEQ ID NO: 4, (iii) the sequence having at least 90% identity to positions 354-364 of SEQ ID NO: 4, (iv) the sequence having at least 90% identity to positions 651-679 of SEQ ID NO: 4, (v) the sequence having at least 90% identity to positions 821-841 of SEQ ID NO: 4, and (vi) the sequence having at least 90% identity to positions 947-963 of SEQ ID NO: 4.
9. The composition of Claim 7 or Claim 8 further comprising proline at a position corresponding to 930 of SEQ ID NO: 3 and threonine at a position corresponding to 932 of SEQ ID NO: 4.
10. The composition of any of Claims 7-9, wherein the variant Family D polymerase lacks exonuclease activity.
11. A composition, comprising a variant Family D polymerase comprising:
(a) an amino acid sequence comprising at least two sequences selected from:
(i) a sequence having at least 85% identity to positions 355-362 of SEQ ID NO: 1,
(ii) a sequence having at least 85% identity to positions 808-816 of SEQ ID NO: 1, and
(iii) a sequence having at least 80% identity to positions 824-828 of SEQ ID NO: 1; and
(b) any amino acid other than
(i) tyrosine at a position corresponding to position 13, 203, 822, 885, 890, 954, 984, 1020, 1026, or 1165 of SEQ ID NO: 1,
(ii) lysine at a position corresponding to position 392, 398 or 959 of SEQ ID NO: 1,
(iii) cysteine at a position corresponding to position 963 of SEQ ID NO: 1,
(iv) phenylalanine at a position corresponding to position 452 or 980 of SEQ ID NO: 1,
(v) arginine at a position corresponding to position 395 of SEQ ID NO: 1,
(vi) glutamate at a position corresponding to position 454 of SEQ ID NO: 1, or
(vii) aspartate at a position corresponding to position 964 or 966 of SEQ ID NO: 1.
12. The composition of any of Claims 1-11 further comprising one or more ribonucleoside triphosphates, one or more deoxyribonucleoside triphosphates, or one or more ribonucleoside triphosphates and one or more deoxyribonucleoside triphosphates.
13. The composition of any of Claims 1-12 further comprising one or more modified nucleotides.
14. The composition of any of Claims 7-13, wherein the amino acid sequence further comprises a sequence having at least 80% identity to positions 808-816 of SEQ ID NO: 4.
15. The composition of Claim 1, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, or 14, wherein the variant Family D polymerase has 3’ to 5’ exonuclease activity.
16. A method comprising contacting a composition according to any of Claims 1-15 with a DNA template, deoxyribonucleoside triphosphates, and optionally a primer having a sequence complementary to at least a portion of the sequence of the DNA template to produce at least one polynucleotide copy of the template.
17. The method of Claim 16 further comprising contacting the composition with one or more ribonucleoside triphosphates.
18. The method of Claim 16 or 17 further comprising contacting the composition with one or more modified nucleotides.
19. The method of any of Claims 16-18, wherein the polynucleotide copy of the template comprises at least one ribonucleotide.
20. The method of any of Claims 16-19, wherein the DNA template comprises at least one uracil and the polynucleotide copy comprises an adenine at the position that corresponds to the at least one uracil.
21. A composition according to any of Claims 1-15 further comprising a fusion protein, the fusion protein comprising the variant Family D polymerase and a polypeptide selected from the group consisting of a DNA-binding peptide, a maltose binding domain, a chitin binding domain, or a SNAP-Tag®.
22. A kit comprising:
(a) a composition according to any of Claims 1-14; and
(b) one or more ribonucleoside triphosphates, one or more deoxyribonucleoside triphosphates, one or more modified nucleotides, one or more primers, one or more adapters, one or more buffering agents, or combinations thereof.
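The "at least 80% identity" limitation in Claim 14 applies to a nine-residue window (positions 808-816 of SEQ ID NO: 4), so the arithmetic is worth spelling out: 80% of 9 is 7.2, meaning 8 of 9 matching residues (about 88.9%) satisfies the threshold while 7 of 9 (about 77.8%) does not. The following is a minimal sketch under stated assumptions, not the method used in the application: it assumes the two segments are already aligned without gaps, and the reference string is a hypothetical placeholder, not the actual sequence of SEQ ID NO: 4. The function names are likewise illustrative.

```python
def percent_identity(query: str, reference: str) -> float:
    """Ungapped percent identity between two pre-aligned, equal-length segments.

    Assumption: identity is counted position by position with no gaps.
    Identity reported by an alignment tool may be computed differently.
    """
    if len(query) != len(reference):
        raise ValueError("segments must have equal length for an ungapped comparison")
    if not reference:
        raise ValueError("segments must be non-empty")
    matches = sum(q == r for q, r in zip(query, reference))
    return 100.0 * matches / len(reference)


def meets_identity_threshold(query: str, reference: str, threshold: float = 80.0) -> bool:
    """Return True when the ungapped identity is at least `threshold` percent."""
    return percent_identity(query, reference) >= threshold


# Hypothetical 9-residue placeholder; NOT positions 808-816 of SEQ ID NO: 4.
REFERENCE_WINDOW = "ACDEFGHIK"

one_mismatch = "ACDEFGHIR"    # 8 of 9 residues match
two_mismatches = "ACDEFGHRR"  # 7 of 9 residues match

if __name__ == "__main__":
    for candidate in (one_mismatch, two_mismatches):
        pid = percent_identity(candidate, REFERENCE_WINDOW)
        verdict = meets_identity_threshold(candidate, REFERENCE_WINDOW)
        print(f"{candidate}: {pid:.1f}% identity, meets 80% threshold: {verdict}")
```

For a window this short, a single substitution is tolerated and a second is not, which is why the percentage threshold behaves like a hard cap of one mismatch.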
PCT/US2021/017956 2020-02-13 2021-02-12 Variant family d dna polymerases WO2021163561A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP21710368.8A EP4103701A1 (en) 2020-02-13 2021-02-12 Variant family d dna polymerases

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202062976055P 2020-02-13 2020-02-13
US202062976039P 2020-02-13 2020-02-13
US62/976,055 2020-02-13
US62/976,039 2020-02-13

Publications (1)

Publication Number Publication Date
WO2021163561A1 true WO2021163561A1 (en) 2021-08-19

Family

ID=74859517

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/017956 WO2021163561A1 (en) 2020-02-13 2021-02-12 Variant family d dna polymerases

Country Status (2)

Country Link
EP (1) EP4103701A1 (en)
WO (1) WO2021163561A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6627424B1 (en) 2000-05-26 2003-09-30 Mj Bioworks, Inc. Nucleic acid modifying enzymes
EP1196583B1 (en) * 1999-04-21 2006-07-12 Centre National De La Recherche Scientifique (Cnrs) Type ii dna polymerase from pyrococcus abyssi
US7888090B2 (en) 2004-03-02 2011-02-15 Ecole Polytechnique Federale De Lausanne Mutants of O6-alkylguanine-DNA alkyltransferase
US7939284B2 (en) 2001-04-10 2011-05-10 Ecole Polytechnique Federale De Lausanne Methods using O6-alkylguanine-DNA alkyltransferases
US10041051B2 (en) 2014-08-27 2018-08-07 New England Biolabs, Inc. Fusion polymerase and method for using the same

Non-Patent Citations (29)

* Cited by examiner, † Cited by third party
Title
"Genbank", Database accession no. OLS 17959.1
"Oligonucleotide Synthesis: A Practical Approach", 1984, IRL PRESS
ASTATKE, M., NG, K., GRINDLEY, N.D., JOYCE, C.M.: "A single side chain prevents Escherichia coli DNA polymerase I (Klenow fragment) from incorporating ribonucleotides", PROC NATL ACAD SCI USA, vol. 95, 1998, pages 3402 - 3407, XP055242309, DOI: 10.1073/pnas.95.7.3402
BITINAITE J, RUBINO M, VARMA KH, SCHILDKRAUT I, VAISVILA R, VAISKUNAITE R: "USER friendly DNA engineering and cloning method by uracil excision", NUCLEIC ACIDS RES., vol. 35, no. 6, 2007, pages 1992 - 2002, XP055170276, DOI: 10.1093/nar/gkm041
CANN, I.K., KOMORI, K., TOH, H., KANAI, S., ISHINO, Y.: "A heterodimeric DNA polymerase: evidence that members of Euryarchaeota possess a distinct DNA polymerase", PROC NATL ACAD SCI USA, vol. 95, 1998, pages 14250 - 14255, XP000858986, DOI: 10.1073/pnas.95.24.14250
CASTREC B ET AL: "The Glycine-Rich Motif of Pyrococcus abyssi DNA Polymerase D Is Critical for Protein Stability", JOURNAL OF MOLECULAR BIOLOGY, ACADEMIC PRESS, UNITED KINGDOM, vol. 396, no. 4, 5 March 2010 (2010-03-05), pages 840 - 848, XP027234814, ISSN: 0022-2836, [retrieved on 20100305], DOI: 10.1016/J.JMB.2010.01.006 *
GARDNER AF, WANG J, WU W, KAROUBY J, LI H, STUPI BP, JACK WE, HERSH MN, METZKER ML: "Rapid incorporation kinetics and improved fidelity of a novel class of 3'-OH unblocked reversible terminators", NUCLEIC ACIDS RES., vol. 40, 2012, pages 7404 - 7415, XP055043174, DOI: 10.1093/nar/gks330
GREENOUGH, L., MENIN, J.F., DESAI, N.S., KELMAN, Z., GARDNER, A.F.: "Characterization of Family D DNA polymerase from Thermococcus sp. 9°N", EXTREMOPHILES, vol. 18, 2014, pages 653 - 664, XP055364100, DOI: 10.1007/s00792-014-0646-9
HALE, MARKHAM: "Oligonucleotides and Analogs: A Practical Approach", 1991, OXFORD UNIVERSITY PRESS
ISHINO, Y., KOMORI, K., CANN, I.K., KOGA, Y.: "A novel DNA polymerase family found in Archaea", JOURNAL OF BACTERIOLOGY, vol. 180, 1998, pages 2232 - 2236, XP000858985
JESSICA A. BROWN ET AL: "Unlocking the Sugar "Steric Gate" of DNA Polymerases", BIOCHEMISTRY, vol. 50, no. 7, 22 February 2011 (2011-02-22), pages 1135 - 1142, XP055009914, ISSN: 0006-2960, DOI: 10.1021/bi101915z *
KORNBERG, BAKER: "DNA Replication", 1992, W.H. FREEMAN
LEHNINGER: "Biochemistry", 1975, WORTH PUBLISHERS
LEMOR MÉLANIE ET AL: "Differential Activities of DNA Polymerases in Processing Ribonucleotides during DNA Synthesis in Archaea", JOURNAL OF MOLECULAR BIOLOGY, vol. 430, no. 24, 1 December 2018 (2018-12-01), United Kingdom, pages 4908 - 4924, XP055805599, ISSN: 0022-2836, DOI: 10.1016/j.jmb.2018.10.004 *
LUCIA GREENOUGH ET AL: "Characterization of Family D DNA polymerase from Thermococcus sp. 9°N", EXTREMOPHILES, vol. 18, no. 4, 3 May 2014 (2014-05-03), JP, pages 653 - 664, XP055364100, ISSN: 1431-0651, DOI: 10.1007/s00792-014-0646-9 *
NAT CHEM BIOL., vol. 13, no. 10, 19 September 2017 (2017-09-19), pages 1057
RAIA PIERRE ET AL: "Structure of the DP1-DP2 PolD complex bound with DNA and its implications for the evolutionary history of DNA and RNA polymerases", PLOS BIOLOGY, vol. 17, no. 1, 18 January 2019 (2019-01-18), pages e3000122, XP055805594, Retrieved from the Internet <URL:https://storage.googleapis.com/plos-corpus-prod/10.1371/journal.pbio.3000122/2/pbio.3000122.pdf?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=wombat-sa@plos-prod.iam.gserviceaccount.com/20210519/auto/storage/goog4_request&X-Goog-Date=20210519T063243Z&X-Goog-Expires=86400&X-Goog-SignedHeaders=h> DOI: 10.1371/journal.pbio.3000122 *
RAIA, P., CARRONI, M., HENRY, E., PEHAU-ARNAUDET, G., BRULE, S., BEGUIN, P. ET AL.: "Structure of the DP1-DP2 PolD complex bound with DNA and its implications for the evolutionary history of DNA and RNA polymerases", PLOS BIOLOGY, vol. 17, no. 1, 2019, pages e3000122
RAIA, P., CARRONI, M., HENRY, E., PEHAU-ARNAUDET, G., BRULE, S., BEGUIN, P., HENNEKE, G., LINDAHL, E., DELARUE, M., SAUGUET, L.: "Structure of the DP1-DP2 PolD complex bound with DNA and its implications for the evolutionary history of DNA and RNA polymerases", PLOS BIOL, vol. 17, 2019, pages e3000122
SAUGUET LUDOVIC ET AL: "Shared active site architecture between archaeal PolD and multi-subunit RNA polymerases revealed by X-ray crystallography", NATURE COMMUNICATIONS, vol. 7, no. 1, 1 November 2016 (2016-11-01), XP055805589, Retrieved from the Internet <URL:https://www.nature.com/articles/ncomms12227.pdf> DOI: 10.1038/ncomms12227 *
SCHERMERHORN, K.M., GARDNER, A.F.: "Pre-steady-state Kinetic Analysis of a Family D DNA Polymerase from Thermococcus sp. 9°N Reveals Mechanisms for Archaeal Genomic Replication and Maintenance", J. BIOL. CHEM., vol. 290, 2015, pages 21800 - 21810
SHEN Y ET AL: "Invariant Asp-1122 and Asp-1124 are essential residues for polymerization catalysis of family D DNA polymerase from Pyrococcus horikoshii", JOURNAL OF BIOLOGICAL CHEMISTRY, AMERICAN SOCIETY FOR BIOCHEMISTRY AND MOLECULAR BIOLOGY, US, vol. 276, no. 29, 20 July 2001 (2001-07-20), pages 27376 - 27383, XP003017377, ISSN: 0021-9258, DOI: 10.1074/JBC.M011762200 *
SHEN Y ET AL: "Subunit interaction and regulation of activity through terminal domains of the family D DNA polymerase from Pyrococcus horikoshii", JOURNAL OF BIOLOGICAL CHEMISTRY, AMERICAN SOCIETY FOR BIOCHEMISTRY AND MOLECULAR BIOLOGY, US, vol. 278, no. 23, 6 June 2003 (2003-06-06), pages 21247 - 21257, XP003017376, ISSN: 0021-9258, DOI: 10.1074/JBC.M212286200 *
SHEN Y.: "A 21-amino acid peptide from the cysteine cluster II of the family D DNA polymerase from Pyrococcus horikoshii stimulates its nuclease activity which is Mre11-like and prefers manganese ion as the cofactor", NUCLEIC ACIDS SYMPOSIUM SERIES, vol. 32, no. 1, 2 January 2004 (2004-01-02), GB, pages 158 - 168, XP055805583, ISSN: 0261-3166, DOI: 10.1093/nar/gkh153 *
SHEN, Y., MUSTI, K., HIRAMOTO, M., KIKUCHI, H., KAWARABAYASHI, Y., MATSUI, I.: "Invariant Asp-1122 and Asp-1124 are essential residues for polymerization catalysis of family D DNA polymerase from Pyrococcus horikoshii", vol. 276, no. 29, 2001, pages 27376 - 27383, Retrieved from the Internet <URL:http://doi.org/10.1074/jbc.M011762200>
SINGLETON ET AL.: "Dictionary of Microbiology and Molecular biology", 1994, JOHN WILEY AND SONS
STRACHAN, READ: "Human Molecular Genetics", 1999, WILEY-LISS
TAKASHIMA, N., ISHINO, S., OKI, K., TAKAFUJI, M., YAMAGAMI, T., MATSUO, R., MAYANAGI, K., ISHINO, Y.: "Elucidating functions of DP1 and DP2 subunits from the Thermococcus kodakarensis family D DNA polymerase", EXTREMOPHILES, vol. 23, 2019, pages 161 - 172
ZATOPEK KELLY M ET AL: "Novel ribonucleotide discrimination in the RNA polymerase-like two-barrel catalytic core of Family D DNA polymerases", NUCLEIC ACIDS RESEARCH, vol. 48, no. 21, 2 December 2020 (2020-12-02), GB, pages 12204 - 12218, XP055805586, ISSN: 0305-1048, DOI: 10.1093/nar/gkaa986 *

Also Published As

Publication number Publication date
EP4103701A1 (en) 2022-12-21

Similar Documents

Publication Publication Date Title
US10662413B2 (en) Recombinant DNA polymerase for improved incorporation of nucleotide analogues
EP2097523B1 (en) Compositions and methods using split polymerases
JP6579204B2 (en) Modified thermostable DNA polymerase
JP2010528669A (en) Polymerase stabilization
JP6701450B2 (en) DNA production method and DNA fragment ligation kit
US11371028B2 (en) Variant DNA polymerases having improved properties and method for improved isothermal amplification of a target DNA
JP2022543569A (en) Templateless Enzymatic Synthesis of Polynucleotides Using Poly(A) and Poly(U) Polymerases
EP1463809B1 (en) Mutation of dna polymerases from archaeobacteria
JP2020182463A (en) Nucleic acid amplification reagent
WO2021163561A1 (en) Variant family d dna polymerases
WO2023038145A1 (en) Method for producing circular dna
US20230133012A1 (en) Nucleic acid polymerase variants, kits and methods for template-independent rna synthesis
KR20240024924A (en) Use with polymerase mutants and 3'-OH non-blocking reversible terminators
WO2021163559A1 (en) Variant family d dna polymerases
JP2004135628A (en) Method for transducing variation in dna
Nayak et al. PRODUCTION OF TAQPOLYMERASE FROM E. COLI: A TREMENDOUS APPROACH OF CLONING AND EXPRESSION OF TAQPOLYMERASE-I GENE IN E. COLI
JP2006042643A (en) Method for screening new nuclease by using cell-free protein synthesis
WO2007117857A2 (en) Novel dna polymerase from caldicellulosiruptor kristjanssonii
JP2006042642A (en) NEW RESTRICTION ENZYME PabI

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21710368; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2021710368; Country of ref document: EP; Effective date: 20220913)