WO2020072715A1 - Compositions and methods comprising mutants of terminal deoxynucleotidyl transferase - Google Patents

Compositions and methods comprising mutants of terminal deoxynucleotidyl transferase

Info

Publication number
WO2020072715A1
Authority
WO
WIPO (PCT)
Prior art keywords
tdt
modified
polypeptide
enzyme
amino acid
Prior art date
Application number
PCT/US2019/054398
Other languages
French (fr)
Inventor
George M. Church
Nicholas J. CONWAY
Richard E. KOHMAN
Erkin KURU
Jonathan RITTICHIER
Daniel Jordan WIEGAND
Original Assignee
President And Fellows Of Harvard College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by President And Fellows Of Harvard College
Publication of WO2020072715A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C12N9/1264 DNA nucleotidylexotransferase (2.7.7.31), i.e. terminal nucleotidyl transferase
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/26 Preparation of nitrogen-containing carbohydrates
    • C12P19/28 N-glycosides
    • C12P19/30 Nucleotides
    • C12P19/34 Polynucleotides, e.g. nucleic acids, oligoribonucleotides
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y207/00 Transferases transferring phosphorus-containing groups (2.7)
    • C12Y207/07 Nucleotidyltransferases (2.7.7)
    • C12Y207/07031 DNA nucleotidylexotransferase (2.7.7.31), i.e. terminal deoxynucleotidyl transferase

Definitions

  • TdT modified terminal deoxynucleotidyl transferase
  • Terminal deoxynucleotidyl transferase is a very useful template-independent DNA polymerase for major biotechnological applications such as the storage of digital information and de novo oligonucleotide synthesis.
  • TdT has the unique ability to rapidly catalyze the synthesis of long DNA oligonucleotides in the presence of only a small initiator sequence, cofactors and nucleoside triphosphate monomers.
  • a modified TdT polypeptide sequence can modify the function, reaction catalysis and substrate binding of the TdT polypeptide.
  • certain amino acid mutations can alter the cofactor preference for the TdT polypeptide, such that a modified TdT may be more efficient in the presence of Mn2+ or Mg2+ than in the presence of the endogenously preferred Co2+ cofactor.
  • a modified TdT polypeptide can comprise different temperature or pH sensitivities, altered rates of DNA synthesis, and the ability to incorporate non-natural nucleotides.
  • a modified TdT enzyme comprises a reduced substrate bias towards a preferred initiator sequence or nucleoside triphosphate base (A, G, C, or T) compared to an unmodified TdT enzyme from which it is derived.
  • TdT terminal deoxynucleotidyl transferase
  • the modified TdT polypeptide comprises a sequence having at least one amino acid mutation and retains at least 10% of the template-independent DNA polymerase activity of the TdT polypeptide from which the modified TdT is derived.
  • the template-independent DNA polymerase activity of the modified TdT polypeptide and the wild-type TdT polypeptide from which it is derived is measured using the same enzymatic reaction conditions (e.g., co-factor and co-factor concentration, temperature, time, pH, nucleotide(s) and nucleotide(s) concentration etc.).
  • the template-independent polymerase activity of a given modified TdT polypeptide is assessed under reaction conditions that are different from the reaction conditions that are used to assess the activity of the unmodified TdT polypeptide.
  • modifications to a TdT polypeptide can alter the co-factor preference or degree of nucleotide bias, thus it may be desirable to compare the activity of the modified TdT polypeptide under conditions preferred by the modified enzyme to the activity of the wild-type enzyme under conditions preferred by the wild-type enzyme. This is because under the preferred conditions of the modified TdT polypeptide, the wild-type TdT may have substantially reduced activity.
  • the modified TdT comprises at least 50% of the template-independent DNA polymerase activity of the TdT polypeptide from which it is derived.
  • the protein sequence of the modified TdT comprises at least two amino acid mutations compared to the TdT polypeptide from which it is derived.
  • the TdT polypeptide from which the modified TdT is derived is a human TdT polypeptide.
  • the human TdT polypeptide comprises SEQ ID NO: 1.
  • the modified TdT comprises a cofactor preference that is different from the cofactor preference of the TdT polypeptide from which it is derived.
  • the modified TdT comprises a reduced degree of bias compared to the degree of bias of the TdT polypeptide from which it is derived under substantially similar enzyme assay conditions.
  • the modified TdT comprises at least one amino acid mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459.
  • the modified TdT comprises a mutation at R453 and at least one additional mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459.
  • the mutation at R453 is R453A.
  • the at least one additional mutation occurs at amino acid residue V432.
  • the mutation at amino acid residue V432 is V432G.
  • the modified TdT polypeptide comprises a sequence selected from those listed in any one of Tables 3-8.
  • the modified TdT polypeptide is a variant comprising the mutations R453A-V432G.
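The mutation labels used above (for example R453A, or the double mutant R453A-V432G) follow the common convention of wild-type residue, 1-based position, replacement residue. A minimal sketch of how such labels could be parsed and applied to a sequence string is given below. The function names and the five-residue toy sequence are hypothetical and are not taken from the patent; real use would require the full SEQ ID NO: 1 sequence.

```python
import re

# Label format assumed here: one-letter wild-type residue, 1-based position,
# one-letter replacement residue (e.g. "R453A").
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label):
    """Split a label such as 'R453A' into ('R', 453, 'A')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(sequence, variant_label):
    """Apply a hyphen-joined set of substitutions (e.g. 'R453A-V432G').

    Raises ValueError if a position is out of range or if the residue found
    at that position does not match the wild-type residue in the label.
    """
    residues = list(sequence)
    for label in variant_label.split("-"):
        wild_type, position, replacement = parse_mutation(label)
        if position < 1 or position > len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy sequence, used only to illustrate the call.
toy_parent = "MKVRL"
toy_variant = apply_mutations(toy_parent, "R4A-V3G")
```

Checking the wild-type residue before substituting guards against numbering mismatches, which matters here because the residue numbers are stated relative to the wild-type human TdT sequence.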
  • Another aspect provided herein relates to a method for generating a polynucleotide sequence de novo or in vitro, the method comprising: incubating a modified TdT enzyme as described herein in the presence of an initiator sequence, a cofactor and at least one nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3’ end of a polynucleotide strand.
  • the modified TdT enzyme or the initiator sequence are conjugated to a solid support.
  • the solid support comprises a bead, a membrane, or a column.
  • the cofactor is a divalent cation.
  • the divalent cation is Co2+, Mn2+, Mg2+ or Zn2+.
  • the modified TdT enzyme of claim 1 is incubated in the presence of 2, 3, or 4 nucleoside triphosphates.
  • the method further comprises a second step of incubating a modified TdT enzyme of claim 1 in the presence of an initiator sequence, a cofactor and at least one different nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3’ end of a polynucleotide strand.
  • the step is repeated once or twice each in the presence of at least one different nucleotide.
  • Another aspect provided herein relates to a nucleic acid molecule encoding any one of the modified TdT enzymes described herein (e.g., encoding any one of the polypeptide sequences listed in Tables 3-8).
  • the nucleic acid encodes a modified TdT polypeptide variant comprising the mutations R453A-V432G.
  • Another aspect provided herein relates to a vector comprising a nucleic acid molecule encoding any one of the modified TdT enzymes described herein (e.g., encoding any one of the polypeptide sequences listed in Tables 3-8).
  • the vector comprises a nucleic acid that encodes a modified TdT polypeptide variant comprising the mutations R453A-V432G.
  • Another aspect provided herein relates to a cell comprising the modified TdT polypeptide, a nucleic acid molecule encoding a modified TdT polypeptide, and/or the vector comprising such a nucleic acid molecule as described herein.
  • the cell is a bacterial cell.
  • a solid support comprising a modified TdT enzyme as described herein (e.g., any one of the protein sequences listed in Tables 3-8).
  • the solid support comprises a modified TdT polypeptide variant comprising the mutations R453A-V432G.
  • FIG. 1 View of catalytic pocket of murine TdT with large arginine at residue position 453 protruding near where nucleotide binds.
  • FIGs. 2A-2B Heat-map of single-codon mutant variants at amino acid residue R453.
  • FIG. 2B 6% TBE-Urea denaturing gel electrophoresis analysis of human TdT R453H DNA oligonucleotide synthesis reactions compared to control, wild-type human TdT. 200-nt ssDNA ladder was used to determine the size of the produced DNA oligonucleotide. Gels were stained with 1x GelStar Nucleic Acid Stain.
  • FIG. 3 Heat-map of double-codon mutant variants carrying the constant amino acid change R453A.
  • RFU values for each cofactor evaluated were normalized by the total protein concentration of the human TdT mutant variants as determined by a reducing agent microBCA assay. The final concentration of cofactor was 0.25 mM for all reactions.
  • FIGs. 4A-4B Heat-maps of double-codon mutant variants carrying the constant amino acid change R453A.
  • RFU values for each natural nucleotide evaluated were normalized by the total protein concentration of the human TdT mutant variants as determined by a reducing agent microBCA assay. The final concentration for each nucleotide was 1 mM and the cofactor was 0.25 mM Mn2+.
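The figure descriptions state that RFU values were normalized by total protein concentration. The sketch below shows one plausible reading of that step, a simple division of each raw RFU reading by the protein concentration of the variant preparation. The function names, the units and the numbers are assumptions for illustration; the patent text does not give the exact formula.

```python
def normalize_rfu(raw_rfu, protein_concentration):
    """Divide a raw fluorescence reading by total protein concentration.

    The concentration unit (e.g. ug/mL from a microBCA assay) is an assumption;
    any unit works as long as it is the same for every variant compared.
    """
    if protein_concentration <= 0:
        raise ValueError("protein concentration must be positive")
    return raw_rfu / protein_concentration


def normalize_panel(raw_rfu_by_condition, protein_concentration):
    """Normalize every condition (cofactor or nucleotide) for one variant."""
    return {
        condition: normalize_rfu(value, protein_concentration)
        for condition, value in raw_rfu_by_condition.items()
    }


# Made-up readings for a single hypothetical variant preparation.
example = normalize_panel({"dATP": 1000.0, "dCTP": 500.0}, 2.0)
```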
  • FIG. 5 6% TBE-Urea denaturing gel electrophoresis analysis of natural nucleotide incorporation by wild-type human TdT (WT-hTDT) compared to double-codon mutant variant human TdT R453A-V432G. Nucleotide concentration was 1 mM and the initiator oligonucleotide sequence was a Poly-T-15-mer at 10 pmol per reaction. Wild-type human TdT was supplemented with 0.25 mM Co2+ cofactor and the double mutant variant was supplemented with 0.25 mM Mn2+ cofactor.
  • FIGs. 6A-6B 15% TBE-Urea denaturing gel electrophoresis analysis of natural nucleotide incorporation by single-codon mutant R453A human TdT compared to double-codon mutant R453A-V432G with varying DNA oligonucleotide initiator sequences.
  • FIG. 6A indicates reactions supplemented with 10 pmol of Poly-T-15-mer.
  • FIG. 6B indicates reactions supplemented with 10 pmol of Poly-A-15-mer. Both mutant variants were supplemented with 0.25 mM Mn2+ cofactor.
  • FIG. 7 15% TBE-Urea denaturing gel electrophoresis analysis of natural ribonucleotide incorporation by single-codon mutant R453A human TdT compared to double-codon mutant R453A-V432G. Reactions were supplemented with 1 mM of each ribonucleotide, 0.25 mM Mn2+, and 10 pmol of DNA oligonucleotide initiator sequence Poly-T-15-mer.
  • TdT polypeptides having at least one amino acid mutation at a desired residue but retaining at least 25% of the template-independent DNA polymerase activity of the unmodified TdT polypeptide.
  • TdT variants are contemplated for use in the generation of nucleic acid sequences for the storage of digital information, de novo oligonucleotide synthesis, or the like.
  • template-independent DNA polymerase activity refers to the ability of a TdT polypeptide, variant or mutant to add at least one nucleotide to a growing polynucleotide strand in the absence of a template DNA strand.
  • the term “substantially retains TdT activity” means that a variant or modified TdT polypeptide will retain at least 10% of the template-independent DNA polymerase activity (as assessed by measuring in vitro TdT enzyme activity) of the polypeptide or peptide from which it is derived (e.g., wild-type human TdT).
  • the activity of the derivative and the activity of wild-type TdT are assessed under substantially similar conditions, for example, in the presence of the same cofactor (e.g., Co2+).
  • the activity of the derivative can be determined under different conditions (e.g., in the presence of an alternative co-factor, such as Mn2+, Zn2+ or Mg2+) and compared to the activity of the wild-type TdT enzyme determined under its preferred native conditions (e.g., in the presence of Co2+).
  • the derivative will retain at least 25%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99% or even 100% of the template-independent DNA polymerase activity of the peptide/polypeptide from which it is derived.
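The retention thresholds above reduce to a simple ratio of variant activity to parent activity. The following is a minimal sketch of that arithmetic, assuming both activities are expressed in the same units (for example, normalized RFU). The function names are hypothetical, and the 10% default mirrors the “substantially retains” threshold stated above.

```python
def percent_retained(variant_activity, parent_activity):
    """Variant activity expressed as a percentage of the parent activity."""
    if parent_activity <= 0:
        raise ValueError("parent activity must be positive")
    return 100.0 * variant_activity / parent_activity


def substantially_retains(variant_activity, parent_activity, threshold_percent=10.0):
    """True if the variant meets the chosen retention threshold."""
    return percent_retained(variant_activity, parent_activity) >= threshold_percent


# Made-up activities: a variant at 25 units against a parent at 100 units.
retained = percent_retained(25.0, 100.0)
```

Whether the two activities are measured under the same conditions or under each enzyme's preferred conditions is a choice made upstream of this calculation, as the text describes both options.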
  • the term “cofactor preference” refers to the cofactor that permits the highest enzymatic activity of a given TdT variant in the same assay conditions and using the same concentration of cofactor (e.g., 0.25 mM).
  • the cofactor preference is expressed in descending order, such as the cofactor preference for endogenous wild-type TdT which is expressed as Co2+ > Mg2+, Mn2+.
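A cofactor preference expressed in descending order can be derived by sorting cofactors by measured activity. The sketch below assumes the activities were obtained under the same assay conditions and cofactor concentration, as the definition requires. The function names and the activity values are hypothetical and do not come from the patent's data.

```python
def cofactor_preference(activity_by_cofactor):
    """Return cofactors ordered from highest to lowest activity."""
    if not activity_by_cofactor:
        raise ValueError("at least one cofactor measurement is required")
    return sorted(activity_by_cofactor, key=activity_by_cofactor.get, reverse=True)


def format_preference(ranking):
    """Render a ranking in the 'A > B > C' style used in the text."""
    return " > ".join(ranking)


# Made-up activities for illustration only.
ranking = cofactor_preference({"Mn2+": 250.0, "Co2+": 900.0, "Mg2+": 300.0})
summary = format_preference(ranking)
```

This sketch does not group near-equal cofactors together (the text writes “Mg2+, Mn2+” for such a case); doing so would need an explicit tolerance, which the source does not specify.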
  • the term “increased activity” refers to an increase in template-independent DNA polymerase activity of a derivative compared to that of the parent peptide/polypeptide, for example, the derivative can have at least a 2-fold increase, at least 5-fold, at least 10-fold, at least 20-fold, at least 50-fold, at least 100-fold, at least 1000-fold or more increase in template-independent DNA polymerase activity compared to the parent peptide/polypeptide from which it is derived.
  • the terms “derivative,” “variant,” or “mutant” as used herein refer to a polypeptide or nucleic acid that comprises at least one mutation but is “substantially similar” to a wild-type human TdT polypeptide.
  • a molecule is said to be “substantially similar” to another molecule if both molecules have substantially similar structures (i.e., they are at least 50% similar in amino acid sequence as determined by BLASTp alignment set at default parameters) and are substantially similar in at least one relevant function (e.g., template-independent DNA polymerase activity).
  • a variant differs from the naturally occurring polypeptide or nucleic acid by one or more amino acid or nucleic acid deletions, additions, substitutions or side-chain modifications, yet retains one or more specific functions or biological activities of the naturally occurring molecule.
  • Amino acid substitutions include alterations in which an amino acid is replaced with a different naturally-occurring or a non-conventional amino acid residue. Some substitutions can be classified as “conservative,” in which case an amino acid residue contained in a polypeptide is replaced with another naturally occurring amino acid of similar character either in relation to polarity, side chain functionality or size.
  • substitutions encompassed by variants as described herein can also be “non-conservative,” in which an amino acid residue which is present in a peptide is substituted with an amino acid having different properties (e.g., substituting a charged or hydrophobic amino acid with an uncharged or hydrophilic amino acid), or alternatively, in which a naturally-occurring amino acid is substituted with a non-conventional amino acid.
  • variants, when used with reference to a polynucleotide or polypeptide, are variations in primary, secondary, or tertiary structure, as compared to a reference polynucleotide or polypeptide, respectively (e.g., as compared to a wild-type polynucleotide or polypeptide). Polynucleotide changes can result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference sequence.
  • Variants can also include insertions, deletions or substitutions of amino acids (including insertions and substitutions of amino acids and other molecules) that do not normally occur in the peptide sequence that is the basis of the variant, including but not limited to insertion of ornithine which does not normally occur in human proteins.
  • the terms “statistically significant” or “significantly” refer to statistical significance and generally mean a two standard deviation (2SD) or greater difference relative to a reference value.
  • “decrease”, “reduced”, “reduction”, or “inhibit” are all used herein to mean a decrease by a statistically significant amount.
  • “reduce,” “reduction” or “decrease” or “inhibit” typically means a decrease by at least 10% as compared to a reference level (e.g.
  • a decrease can be preferably down to a level accepted as within the range of normal for an individual without a given disorder.
  • the terms “increased”, “increase”, “enhance”, or “activate” are all used herein to mean an increase by a statistically significant amount.
  • the terms “increased”, “increase”, “enhance”, or “activate” can mean an increase of at least 10% as compared to a reference level, for example an increase of at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% increase or any increase between 10-100% as compared to a reference level, or at least about a 2-fold, or at least about a 3-fold, or at least about a 4-fold, or at least about a 5-fold or at least about a 10-fold increase, or any increase between 2-fold and 10-fold or greater as compared to a reference level.
  • an “increase” is a statistically significant increase in such level.
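The two-standard-deviation criterion for statistical significance can be sketched as a comparison of a measured value against the mean and standard deviation of a set of reference replicates. This is one possible reading only: the text does not say which standard deviation is meant, so the use of the sample standard deviation of the reference replicates is an assumption, as are the function name and the numbers.

```python
from statistics import mean, stdev


def differs_by_two_sd(value, reference_values, n_sd=2.0):
    """True if value lies n_sd or more sample standard deviations from the reference mean.

    Assumes reference_values holds at least two replicate measurements.
    """
    if len(reference_values) < 2:
        raise ValueError("need at least two reference replicates")
    reference_mean = mean(reference_values)
    reference_sd = stdev(reference_values)
    if reference_sd == 0:
        # With identical replicates, any departure from the mean counts.
        return value != reference_mean
    return abs(value - reference_mean) >= n_sd * reference_sd


# Made-up replicates: mean 12, sample standard deviation 2.
reference = [10.0, 12.0, 14.0]
increased = differs_by_two_sd(17.0, reference)
unchanged = differs_by_two_sd(13.0, reference)
```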
  • oligonucleotide encompass double- or triple-stranded nucleic acids, as well as single-stranded molecules.
  • nucleic acid strands need not be coextensive (i.e., a double-stranded nucleic acid need not be double-stranded along the entire length of both strands).
  • Nucleic acid sequences, when provided, are listed in the 5' to 3' direction, unless stated otherwise. Methods described herein provide for the generation of isolated nucleic acids. Methods described herein additionally provide for the generation of isolated and purified nucleic acids.
  • An“oligonucleotide,”“polynucleotide,” and“nucleic acid” as referred to herein can comprise at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, or more bases in length.
  • the term “comprising” or “comprises” is used in reference to compositions, methods, and respective component(s) thereof, that are essential to the method or composition, yet open to the inclusion of unspecified elements, whether essential or not.
  • the term “consisting of” refers to compositions, methods, and respective components thereof as described herein, which are exclusive of any element not recited in that description of the embodiment.
  • the term “consisting essentially of” refers to those elements required for a given embodiment. The term permits the presence of elements that do not materially affect the basic and novel or functional characteristic(s) of that embodiment.
  • TdT Terminal deoxynucleotidyl Transferase
  • Terminal deoxynucleotidyl transferase is a template-independent DNA polymerase that catalyzes the addition of nucleotides to the 3’ terminus of a DNA molecule (e.g., a single stranded DNA strand).
  • TdT plays a role in introducing minor changes into the genetic material by randomly adding nucleotides to single-stranded DNA during recombination.
  • TdT activity is important in adaptation of the vertebrate immune system by increasing antigen receptor diversity.
  • There are two known isoforms of TdT: (i) a short form having 509 amino acids (TdTS), and (ii) a long form having 529 amino acids (TdTL).
  • TdTS and TdTL comprise the domains necessary to bind nucleotides, DNA, and metal ion cofactors.
  • the derivatives or mutants of TdT described herein can be derived from either TdT isoform.
  • Two functionally independent human TdT regions have been identified: breast cancer susceptibility protein BRCA1 C-terminal (BRCT) domain at the N-terminus and the polymerase-like domain at the C-terminus.
  • BRCT domain of TdT is involved in protein-protein and protein-DNA interactions during DNA repair and cell cycle checkpoint pathways.
  • the pol-like domain is the catalytic core of the enzyme and contains the active site of the phosphoryl transfer reaction.
  • NLS nuclear localization signal
  • the protein domain structure and crystal structure of TdT is known to those of skill in the art and is not described in further detail herein.
  • TdT is unique in its ability to use a variety of other divalent cations such as Mn2+, Zn2+ and Mg2+.
  • the extension rate in vitro with dATP in the presence of divalent metal ions is ranked in the following order: Mg2+ > Zn2+ > Co2+ > Mn2+.
  • each metal ion has different effects on the kinetics of nucleotide incorporation.
  • Mg2+ facilitates the preferential utilization of dGTP and dATP whereas Co2+ increases the catalytic polymerization efficiency of the pyrimidines, dCTP and dTTP.
  • Zn2+ behaves as a unique positive effector for TdT since reaction rates with Mg2+ are stimulated by the addition of micromolar quantities of Zn2+. This enhancement may reflect the ability of Zn2+ to induce conformational changes in TdT that yield higher catalytic efficiencies. Polymerization rates are lower in the presence of Mn2+ compared to Mg2+, suggesting that Mn2+ does not support the reaction as efficiently as Mg2+. Further description of TdT is provided in Biochim Biophys Acta., May 2010; 1804(5): 1151-1166, hereby incorporated by reference in its entirety.
  • Table 1 Exemplary unmodified polypeptide sequences of TdT in different species
  • the modified TdT polypeptides can be derived from any one of SEQ ID NOs: 1-9.
  • the modified TdT polypeptide is derived from a human polypeptide sequence, for example, SEQ ID NO: 1.
  • TdT e.g., human TdT
  • variants of TdT that comprise at least one amino acid mutation compared to the TdT protein from which they are derived and retain at least 10% of the functional template-independent DNA polymerase activity of the unmodified TdT (e.g., using an enzymatic TdT test as described herein).
  • the variants of TdT retain at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 98%, at least 99% of the template-independent DNA polymerase activity of the unmodified TdT from which the variant is derived.
  • the TdT variant comprises a template-independent DNA polymerase activity that is substantially similar to the activity of the TdT from which it is derived.
  • the term “substantially similar” refers to an activity of a TdT variant that comprises an activity that is not statistically significant (i.e., p ⁇ 0.05) as compared to the activity from unmodified TdT protein from which the variant is derived (e.g., as assessed using a TdT enzyme assay as described herein).
  • TdT variant has more than 100% of the activity of a wild-type or native polypeptide, e.g., 110%, 125%, 150%, 175%, 200%, 500%, 1000% or more.
  • Variant TdTs can comprise at least 2, at least 3, at least 4, at least 5 amino acid mutations or more (e.g., 6, 7, 8, 9, 10 etc.) as compared to the TdT enzyme from which they are derived.
  • the variant of TdT comprises a “single-codon” mutation (i.e., a single amino acid mutation).
  • the TdT variant comprises a “double-codon” (i.e., two amino acid substitutions) or “triple-codon” mutation (i.e., three amino acid substitutions).
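The single-, double- and triple-codon terminology above amounts to counting the amino acid substitutions between a variant and the sequence it is derived from. A minimal sketch follows, assuming the two sequences have equal length (substitutions only, no insertions or deletions). The function names and the toy sequences are hypothetical.

```python
def substitutions(parent, variant):
    """List substitutions as labels such as 'V3G', using 1-based positions."""
    if len(parent) != len(variant):
        raise ValueError("sequences must have equal length for this comparison")
    return [
        f"{p}{position}{v}"
        for position, (p, v) in enumerate(zip(parent, variant), start=1)
        if p != v
    ]


def codon_class(parent, variant):
    """Name the variant by its number of amino acid substitutions."""
    count = len(substitutions(parent, variant))
    names = {0: "unmodified", 1: "single-codon", 2: "double-codon", 3: "triple-codon"}
    return names.get(count, f"{count} substitutions")


# Hypothetical toy sequences, not TdT.
labels = substitutions("MKVRL", "MKGAL")
kind = codon_class("MKVRL", "MKGAL")
```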
  • the amino acid substitutions can comprise a conservative amino acid substitution.
  • The concept of conservative amino acid substitution is well known in the art and relates to substitution of a particular amino acid by one having a similar characteristic (e.g., similar charge or hydrophobicity, similar bulkiness). Examples include aspartic acid for glutamic acid, or isoleucine for leucine. A list of exemplary conservative amino acid substitutions is given in the table below.
  • a conservative substitution mutant or variant will 1) have only conservative amino acid substitutions relative to the parent sequence, 2) will have at least 90% sequence identity with respect to the parent sequence, preferably at least 95% identity, 96% identity, 97% identity, 98% identity or 99% or greater identity; and 3) will retain TdT template-independent DNA polymerase activity (e.g., enzyme activity) as that term is defined herein.
  • non-conservative amino acid substitution may be preferred, for example, when a TdT variant with differing cofactor binding, enzyme activity, or reduced substrate bias is desired.
  • “Non-conservative substitution” refers to the substitution of an amino acid in one class with an amino acid from another class; for example, substitution of an Ala, a class II residue, with a class III residue such as Asp, Asn, Glu, or Gln.
  • non-conservative substitutions include the substitution of a non-polar (hydrophobic) amino acid residue such as isoleucine, valine, leucine, alanine, methionine for a polar (hydrophilic) residue such as cysteine, glutamine, glutamic acid or lysine and/or a polar residue for a non-polar residue.
  • a TdT variant as described herein can have a mixture of conservative and non-conservative amino acid substitutions in any desired configuration.
  • the TdT variant can be tested for activity, co-factor preference and nucleotide bias using methods known in the art or described in the Examples.
  • the amino acid residue to be mutated is an amino acid that plays a role in maintaining protein structural integrity, reaction catalysis or substrate binding (cofactor, initiator sequence and nucleoside triphosphate).
  • the one or more amino acids targeted for mutation to a different amino acid include, but are not limited to, S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, or L459 (numbering based on the wild-type human TdT sequence, Uniprot #P04053).
  • the TdT is a non-human mammalian TdT or TdT from other non-mammalian species.
  • the TdT is a member of the archaeo-eukaryotic primase (AEP) superfamily.
  • the TdT is a PolpTN2 or a C-terminal truncated PolpTN2, a PriS, a nonhomologous end joining archaeo-eukaryotic primase, a mammalian Polθ, or a eukaryotic PrimPol.
  • the variant does not comprise a mutation(s) that would require TdT to use a template for synthesis of a polynucleotide strand.
  • Amino acid sequence alignment of a polypeptide of interest with a reference can provide guidance regarding not only residues likely to be necessary for function but also, conversely, those residues likely to tolerate change. Where, for example, an alignment shows two identical or similar amino acids at corresponding positions, it is more likely that that site is important functionally. Where, conversely, alignment shows residues in corresponding positions to differ significantly in size, charge, hydrophobicity, etc., it is more likely that that site can tolerate variation in a functional polypeptide.
  • Such alignments are readily created by one of ordinary skill in the art, e.g., using the default settings of the alignment tool of the BLASTP program.
  • homologs of any given polypeptide or nucleic acid sequence can be found using BLAST programs, e.g., by searching freely available databases of sequence for homologous sequences, or by querying those databases for annotations indicating a homolog (e.g., search strings that comprise a gene name or describe the activity of a gene).
  • the variant amino acid sequence (or corresponding DNA sequence) can be at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or more, identical to a native or reference sequence.
  • the degree of homology (percent identity) between a native and a mutant sequence can be determined, for example, by comparing the two sequences using freely available computer programs commonly employed for this purpose on the world wide web.
  • the variant amino acid or DNA sequence can be at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or more, similar to the sequence from which it is derived (referred to herein as an “original” sequence).
  • the degree of similarity (percent similarity) between an original and a mutant sequence can be determined, for example, by using a similarity matrix.
  • Similarity matrices are well known in the art and a number of tools for comparing two sequences using similarity matrices are freely available online, e.g., BLASTp (available on the world wide web), with default parameters set.
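The percent identity figures discussed above can be illustrated on two sequences that have already been aligned. The sketch below counts identical, non-gap columns and divides by the alignment length, which is one common convention; other tools use different denominators, so this is an assumption rather than the method the patent relies on. It also computes identity only; percent similarity would additionally need a substitution matrix (for example BLOSUM62, the BLASTp default), which is not implemented here. The function name and sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned strings of equal length, with gaps
    written as `gap`. Gap columns never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical ten-residue alignment with a single substitution.
identity = percent_identity("MKVRLMKVRL", "MKVRLMKVRA")
meets_90 = identity >= 90.0
```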
  • an amino acid mutation is introduced using any method known in the art, for example, site-directed mutagenesis where targeted mutations are introduced into one or more desired positions of a template TdT polynucleotide.
  • This may be achieved by classic primer extension mutagenesis using a mutagenesis primer containing one or more desired mutations relative to the template polynucleotide.
  • the mutagenesis primer can be a synthetic oligonucleotide or a PCR product, and it may include one or more desired substitutions, deletions, additions or any desired combination thereof. Means and methods for producing such primers are readily available in the art.
  • the oligonucleotide or PCR product used as primer must be 5'-phosphorylated for ligation. This can be achieved by enzymatic phosphorylation reaction, by enzymatic digestion of the 5' end of the DNA or by conjugation in a chemical reaction.
  • Kits for site-directed mutagenesis can be obtained commercially from, e.g., New England Biolabs (Ipswich, MA), Thermo Fisher Scientific (Waltham, MA), Agilent (Santa Clara, CA), TransgenBiotech (Beijing, China), Biogene (Cambridge, MA), etc.
  • an insertion comprises at least one additional residue but does not exceed 20 additional residues, for example, 1-18 residues, 1-16 residues, 1-15 residues, 1-14 residues, 1-12 residues, 1-10 residues, 1-9 residues, 1-8 residues, 1-7 residues, 1-6 residues, 1-5 residues, 1-4 residues, 1-3 residues or 1-2 residues are inserted.
  • a deletion comprises removal of at least one residue but does not exceed 10 residues, for example, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or 2 residues are deleted.
  • a TdT polypeptide can be modified, e.g., by addition of a moiety to one or more of the amino acids that together comprise the peptide.
  • a polypeptide as described herein can comprise one or more moiety molecules, e.g., 1 or more moiety molecules per polypeptide, 2 or more moiety molecules per polypeptide, 5 or more moiety molecules per polypeptide, 10 or more moiety molecules per polypeptide or more.
  • a polypeptide as described herein can comprise one or more types of modifications and/or moieties, e.g. 1 type of modification, 2 types of modifications, 3 types of modifications or more types of modifications.
  • Non-limiting examples of modifications and/or moieties include PEGylation; glycosylation; HESylation; ELPylation; lipidation; acetylation; amidation; end-capping modifications; cyano groups; phosphorylation; albumin, and cyclization.
  • an end-capping modification can comprise acetylation at the N- terminus, N-terminal acylation, and N-terminal formylation.
  • an end-capping modification can comprise amidation at the C-terminus, introduction of C-terminal alcohol, aldehyde, ester, and thioester moieties.
  • the TdT polypeptide variant comprises a single-codon mutation, for example, a single-codon mutation at amino acid residue R453.
  • Exemplary single-codon mutants with confirmed activity include the variants listed in Table 3.
  • the single-codon mutant is selected from the variants in Table 3, Table 4 or Table 5.
  • a single-codon mutant at R453 can exhibit an altered preference for divalent cation in comparison to wild type. Examples of such single-codon mutants and their preferred substrate are found in Table 5. Table 5: Single-codon mutants at R453 with altered preference for divalent cation in comparison to wild type.
  • the TdT polypeptide variant comprises a double-codon mutation.
  • one of the two codon mutations in a double-codon mutation comprises a mutation at residue R453 (e.g., R453A).
  • Exemplary double-codon mutants having activity confirmed as described in the working Examples include those listed in Table 6, 7 or 8.
  • Table 8 Top performing double-codon mutant variants (with R453A constant)
  • the modified TdT polypeptide variant comprises the mutations R453A-V432G.
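The variant names used above follow the common substitution notation: wild-type residue, position, replacement residue, with the two substitutions of a double-codon mutant joined by a hyphen (e.g., R453A-V432G). The following is a minimal Python sketch of how such names could be parsed and checked against a sequence. The function names and the four-residue toy sequence are hypothetical and are not taken from the disclosure.

```python
import re

# One substitution: wild-type residue, 1-based position, replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(name):
    """Split a name such as 'R453A-V432G' into (wild_type, position, mutant) tuples."""
    mutations = []
    for token in name.replace(" ", "").split("-"):
        match = _SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        wild_type, position, mutant = match.groups()
        mutations.append((wild_type, int(position), mutant))
    return mutations


def apply_variant(sequence, name):
    """Return the mutated sequence, verifying each stated wild-type residue first."""
    residues = list(sequence)
    for wild_type, position, mutant in parse_variant(name):
        out_of_range = position < 1 or position > len(residues)
        if out_of_range or residues[position - 1] != wild_type:
            raise ValueError(f"{wild_type}{position} does not match the sequence")
        residues[position - 1] = mutant
    return "".join(residues)


# Toy four-residue sequence (hypothetical); it is not a TdT sequence.
print(parse_variant("R453A-V432G"))
print(apply_variant("MKRV", "R3A-V4G"))
```

Checking the stated wild-type residue before substituting guards against numbering mismatches, for example between orthologs or truncated constructs.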
  • the technology described herein relates to a nucleic acid encoding a modified or variant TdT polypeptide as described herein.
  • the term “nucleic acid” or “nucleic acid sequence” refers to any molecule, preferably a polymeric molecule, incorporating units of ribonucleic acid, deoxyribonucleic acid or an analog thereof.
  • the nucleic acid can be either single-stranded or double-stranded.
  • a single-stranded nucleic acid can be one nucleic acid strand of a denatured double-stranded DNA.
  • the nucleic acid is DNA.
  • the nucleic acid is RNA.
  • Suitable nucleic acid molecules include DNA, including genomic DNA or cDNA. Other suitable nucleic acid molecules include RNA, including mRNA.
  • a nucleic acid encoding a modified or variant TdT polypeptide as described herein is comprised by a vector.
  • a nucleic acid sequence encoding a modified TdT polypeptide as described herein is operably linked to a vector.
  • the term “vector”, as used herein, refers to a nucleic acid construct designed for delivery to a host cell or for transfer between different host cells.
  • a vector can be viral or non-viral.
  • the term “vector” encompasses any genetic element that is capable of replication when associated with the proper control elements and that can transfer gene sequences to cells.
  • a vector can include, but is not limited to, a cloning vector, an expression vector, a plasmid, phage, transposon, cosmid, chromosome, virus, virion, etc.
  • expression vector refers to a vector that directs expression of an RNA or polypeptide from sequences linked to transcriptional regulatory sequences on the vector.
  • sequences expressed will often, but not necessarily, be heterologous to the cell.
  • An expression vector may comprise additional elements, for example, the expression vector may have two replication systems, thus allowing it to be maintained in two organisms, for example in human cells for expression and in a prokaryotic host for cloning and amplification.
  • the term “viral vector” refers to a nucleic acid vector construct that includes at least one element of viral origin and has the capacity to be packaged into a viral vector particle.
  • the viral vector can contain a nucleic acid encoding a mutant TdT polypeptide as described herein in place of non-essential viral genes.
  • the vector and/or particle may be utilized for the purpose of transferring nucleic acids into cells either in vitro or in vivo. Numerous forms of viral vectors are known in the art.
  • Production of TdT polypeptides or variants
  • a TdT variant, as that term is used herein, can be produced chemically by, e.g., solution or solid-phase peptide synthesis, or semi-synthesis in solution beginning with protein fragments coupled through conventional solution methods, as described by Dugas et al. (1981). However, given the size and complexity of an enzyme, it is generally preferred to produce a TdT polypeptide or variant using, e.g., recombinant methods.
  • the TdT polypeptide or variant is produced recombinantly.
  • Systems for cloning and expressing polypeptides useful with the methods and compositions described herein include various microorganisms and cells that are well known in recombinant technology and thus are not described in detail herein. These include, for example, various strains of E. coli, Bacillus, Streptomyces, and Saccharomyces, as well as mammalian, yeast and insect cells.
  • a TdT peptide or variant can be produced as a peptide or fusion protein, if so desired.
  • Suitable vectors for producing peptides and polypeptides are known and available from private and public laboratories and depositories and from commercial vendors.
  • Recipient cells capable of expressing the gene product are then transfected.
  • the transfected recipient cells are cultured under conditions that permit expression of the recombinant gene products, which are recovered from the culture.
  • Host mammalian cells, such as Chinese Hamster ovary cells (CHO) or COS-1 cells, can be used. These hosts can be used in connection with poxvirus vectors, such as vaccinia or swinepox. Suitable non-pathogenic viruses that can be engineered to carry the synthetic gene into the cells of the host include poxviruses, such as vaccinia, adenovirus, retroviruses and the like.
  • non-pathogenic viruses are commonly used for human gene therapy, and as carriers for other vaccine agents, and are known and selectable by one of skill in the art.
  • the selection of other suitable host cells and methods for transformation, culture, amplification, screening and product production and purification can be performed by one of skill in the art by reference to known techniques.
  • it may be desirable to isolate and/or purify a synthesized TdT polypeptide or variant.
  • Protein purification techniques are well known to those of skill in the art and as such are not described in detail herein. These techniques can involve, at one level, the homogenization and crude fractionation of the cells, tissue or organ to polypeptide and non-polypeptide fractions.
  • the TdT peptide or variant can be further purified using chromatographic and electrophoretic techniques to achieve partial or complete purification (or purification to homogeneity).
  • Analytical methods particularly suited to the preparation of a pure peptide or polypeptide are ion-exchange chromatography, gel exclusion chromatography, polyacrylamide gel electrophoresis, affinity chromatography, immunoaffinity chromatography and isoelectric focusing.
  • a particularly efficient method of purifying peptides/polypeptides is fast protein liquid chromatography (FPLC) or even high performance liquid chromatography (HPLC).
  • A “purified TdT peptide/polypeptide or variant” is intended to refer to a composition, isolatable from other components, wherein the TdT peptide or variant is purified to any degree relative to the organism producing recombinant protein or in its naturally-obtainable state.
  • An isolated or purified polypeptide, therefore, also refers to a peptide/polypeptide free from the environment in which it may naturally occur.
  • “purified” will refer to a TdT polypeptide composition that has been subjected to fractionation to remove various other components, and which composition substantially retains its expressed biological activity (i.e., TdT DNA polymerase activity).
  • where the term “substantially purified” is used, this designation will refer to a composition in which the TdT polypeptide forms the major component of the composition, such as constituting about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, or more of the proteins in the composition.
  • there is no general requirement that the TdT polypeptide or variant be provided in the most purified state. Indeed, it is contemplated that less purified products will have utility in certain embodiments. Partial purification can be accomplished by using fewer purification steps in combination, or by utilizing different forms of the same general purification scheme. For example, it is appreciated that cation-exchange column chromatography performed utilizing an HPLC apparatus will generally result in a greater “-fold” purification than the same technique utilizing a low pressure chromatography system. Methods exhibiting a lower degree of relative purification may have advantages in total recovery of protein product, or in maintaining the activity of an expressed protein.
  • Various methods for quantifying the degree of purification of a given TdT polypeptide or variant are known to those of skill in the art and include, for example, determining the specific activity of an active fraction, or assessing the amount of polypeptides within a fraction by SDS/PAGE analysis.
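The trade-off described above between “-fold” purification and total recovery can be illustrated with a short calculation. The following Python sketch assumes the conventional definitions (specific activity as activity units per milligram of protein, fold purification as the ratio of specific activities, recovery as the fraction of total activity retained). The function names and all numeric values are hypothetical and are not data from the disclosure.

```python
def specific_activity(total_activity_units, total_protein_mg):
    """Activity per unit mass of protein (units/mg)."""
    if total_protein_mg <= 0:
        raise ValueError("total protein must be positive")
    return total_activity_units / total_protein_mg


def purification_summary(crude, fraction):
    """Compare a purified fraction to the crude extract.

    Each argument is a (total_activity_units, total_protein_mg) pair.
    Returns (fold_purification, percent_recovery).
    """
    fold = specific_activity(*fraction) / specific_activity(*crude)
    recovery = 100.0 * fraction[0] / crude[0]
    return fold, recovery


# Hypothetical numbers for illustration only.
crude_extract = (1000.0, 100.0)  # 10 units/mg
column_eluate = (600.0, 3.0)     # 200 units/mg
fold, recovery = purification_summary(crude_extract, column_eluate)
print(f"{fold:.1f}-fold purification, {recovery:.0f}% recovery")
```

In this toy case the eluate is far purer than the crude extract, yet a substantial share of the total activity was lost, which is the situation in which a less stringent scheme may be preferable.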
  • the enzymatic activity of TdT or a variant can be determined using any assay known to those of skill in the art; such assays are not described in detail herein.
  • the activity of TdT or a variant thereof is described by the amount of protein that is needed to catalyze the incorporation of a certain concentration of natural or non-natural nucleotides into a single-stranded polynucleotide sequence using an initiator strand that exists in-solution or bound to a surface (e.g., de novo or in vitro).
  • multiple enzyme assays can be run in parallel under a variety of different conditions, such as in the presence of different metal ion cofactors.
  • the activity of a given TdT variant is compared to the activity of the protein from which it was derived.
  • a human TdT variant can be compared to the activity of wild-type human TdT (e.g., as a positive control).
  • the enzyme assay is performed using the preferred co-factor of endogenous TdT, Co2+.
  • the enzyme assay is performed in the presence of an alternative cofactor (e.g., Mn2+, Mg2+, Zn2+ etc.).
  • the activity of the TdT variant and the activity of the e.g., wild-type TdT are measured using the same set of reaction conditions, which can directly show the effect of a given mutation(s) on the functional activity of TdT.
  • one can compare the activity of the variant TdT in the presence of Mg2+ to the activity of wild-type TdT in the presence of Co2+.
  • Such an assay can be useful to determine the effects of a given mutation on function of the variant TdT as compared to the wild-type TdT under preferred, endogenous conditions (i.e., in the presence of Co2+ as a co-factor).
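The comparisons described above, a variant measured with several cofactors against wild-type TdT measured with Co2+, amount to normalizing each variant rate to a single reference rate. A minimal Python sketch is given below. The function names, the rate values and their units are hypothetical and chosen only for illustration.

```python
def relative_activity(variant_rates, wild_type_reference_rate):
    """Express each variant rate as a fraction of the wild-type reference rate."""
    if wild_type_reference_rate <= 0:
        raise ValueError("reference rate must be positive")
    return {
        cofactor: rate / wild_type_reference_rate
        for cofactor, rate in variant_rates.items()
    }


def preferred_cofactor(variant_rates):
    """Return the cofactor giving the highest measured rate."""
    return max(variant_rates, key=variant_rates.get)


# Hypothetical rates (arbitrary units); not measurements from the disclosure.
variant_rates = {"Co2+": 4.0, "Mn2+": 6.0, "Mg2+": 3.0}
wild_type_with_co = 5.0

print(relative_activity(variant_rates, wild_type_with_co))
print(preferred_cofactor(variant_rates))
```

Normalizing to the wild-type rate under its preferred, endogenous condition puts every variant and cofactor combination on one scale, so values above 1 indicate a condition that outperforms the reference.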
  • the terms “multiplex” and “high-throughput” are used to describe the parallelizable nature of a kinetic assay, i.e., its ability to determine the individual enzymatic activity of fewer than, equal to or greater than 96 protein variants and/or reaction conditions in a single experiment.
  • the activity of multiple purified TdT variants can be determined by the rate at which long single-stranded polynucleotide sequences are produced by measuring the fluorescent response in Relative Fluorescence Units (RFU) of a nucleic acid stain that is highly specific for single-stranded DNA.
  • the accuracy of the kinetic assay is characterized by a minimal observable fluorescent response if double-stranded DNA contaminants are present in the reaction vessel or if single-stranded polynucleotide sequences form unintended secondary structures such as hairpins, stem-loop structures or G-quadruplexes and the like. Terminal deoxynucleotidyl transferase activity is only present if an observable increase in fluorescent signal occurs in comparison to a negative control consisting of only initiator strand, free nucleotides, cofactor and appropriate buffers.
  • a positive control consisting of commercially available terminal transferase, such as bovine terminal deoxynucleotidyl transferase (New England Biolabs, Inc.) may also be used to relatively gauge the activity of purified template independent DNA polymerase variants or complexes.
  • Single stranded nucleic acid fluorescent stains suitable for kinetic assays are known to those of skill in the art and are described in (ThermoFisher Scientific Inc., The Molecular Probes Handbook, Nucleic Acid Detection and Analysis— Chapter 8, Nucleic Acid Stains— Section 8.1, hereby incorporated by reference in its entirety).
  • a concentration curve consisting of a single polynucleotide sequence greater than 10 nucleotides can be generated to yield a set of standardized fluorescent signals. Because the fluorescent response in the presence of TdT activity is directly correlated to the amount of single-stranded polynucleotide present at a given reaction time interval, the exact amount of polynucleotide in terms of mass can be interpolated from the concentration versus RFU curve and tracked throughout the progression of the reaction. This produces a rate unit for a particular amount of protein in terms of “mass increase in single-stranded polynucleotide per minute”.
  • the rate unit for this kinetic assay can be further quantitated given additional reaction parameters such as free nucleotide composition, cofactors, and initiator sequence composition as well as each component's respective concentration.
  • This kinetic assay provides a highly accurate and standardized method to specifically determine the best-candidate TdT variants in a cost-efficient and high-throughput activity screen.
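The conversion from fluorescence to a rate unit described above can be sketched as two steps: interpolate polynucleotide mass from the standard curve, then fit mass against time. The Python sketch below uses piecewise-linear interpolation and an ordinary least-squares slope. Both choices, the function names, and every number in the example are assumptions made for illustration, not details taken from the disclosure.

```python
from bisect import bisect_right


def interpolate_mass(rfu, std_rfu, std_mass_ng):
    """Piecewise-linear interpolation of mass (ng) from RFU.

    std_rfu must be sorted in ascending order and paired with std_mass_ng.
    """
    if not std_rfu[0] <= rfu <= std_rfu[-1]:
        raise ValueError("RFU lies outside the standard curve")
    i = min(bisect_right(std_rfu, rfu), len(std_rfu) - 1)
    x0, x1 = std_rfu[i - 1], std_rfu[i]
    y0, y1 = std_mass_ng[i - 1], std_mass_ng[i]
    return y0 + (y1 - y0) * (rfu - x0) / (x1 - x0)


def mass_rate_ng_per_min(times_min, masses_ng):
    """Least-squares slope of mass versus time (ng of ssDNA per minute)."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_m = sum(masses_ng) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in zip(times_min, masses_ng))
    den = sum((t - mean_t) ** 2 for t in times_min)
    return num / den


# Hypothetical standard curve and reaction trace, for illustration only.
std_rfu = [100.0, 600.0, 1100.0, 2100.0, 4100.0]
std_mass_ng = [0.0, 25.0, 50.0, 100.0, 200.0]
times_min = [0.0, 5.0, 10.0, 15.0]
trace_rfu = [100.0, 350.0, 600.0, 850.0]

masses = [interpolate_mass(r, std_rfu, std_mass_ng) for r in trace_rfu]
rate = mass_rate_ng_per_min(times_min, masses)
print(f"{rate:.2f} ng ssDNA per minute")
```

Dividing the resulting slope by the amount of enzyme in the well would give a rate per unit of protein. Readings outside the standard curve are rejected rather than extrapolated.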
  • compositions comprising a TdT polypeptide or variant
  • the TdT polypeptide or variant (e.g., isolated, synthetic, or recombinant peptide) is attached to, or enclosed or enveloped by, a macromolecular complex.
  • the macromolecular complex can be, without limitation, a virus, a bacteriophage, a bacterium, a liposome, a microparticle, a targeting sequence, a nanoparticle (e.g., a gold nanoparticle), a magnetic bead, a yeast cell, a mammalian cell, a cell or a microdevice.
  • macromolecular complexes within the scope of the methods and compositions described herein can include virtually any complex that can attach or enclose a peptide/polypeptide and be used in methods for de novo synthesis of oligonucleotides or other nucleic acids.
  • the isolated TdT polypeptide or variant can be attached to a solid support, e.g., for purification of the TdT polypeptide or variant and/or for ease of removing nucleic acid products generated in an enzymatic reaction mix from the TdT polypeptide when desired. Examples of such solid supports include magnetic beads, Sepharose beads, agarose beads, nanoparticles, a nitrocellulose membrane, a nylon membrane, a column chromatography matrix, a high performance liquid chromatography (HPLC) matrix or a fast protein liquid chromatography (FPLC) matrix for purification.
  • a solid support may be biological, nonbiological, organic, inorganic, or any combination thereof.
  • Supports for use with TdT polypeptides or variants can be any shape, size, or geometry as desired.
  • the support may be square, rectangular, round, flat, planar, circular, tubular, spherical, and the like.
  • the support may be physically separated into regions, for example, with trenches, grooves, wells, or chemical barriers (e.g., hydrophobic coatings, etc.).
  • Supports may be made from glass (silicon dioxide), metal, ceramic, polymer or other materials known to those of skill in the art.
  • Supports may be a solid, semi-solid, elastomer or gel.
  • TdT polypeptide sequences can be bound to such supports or substrates using methods, linkers (cleavable or non-cleavable) and chemistry known to those of skill in the art.
  • the TdT polypeptide or variant comprises a fusion protein.
  • These molecules generally have all or a substantial portion of the TdT peptide/variant, linked at the N- or C- terminus, to all or a portion of a second polypeptide or protein.
  • fusions may employ leader sequences from other species to permit the recombinant expression of a protein in a heterologous host.
  • Another useful fusion includes the addition of an immunologically active domain, such as an antibody epitope, to facilitate purification of the fusion protein. Inclusion of a cleavage site at or near the fusion junction will facilitate removal of the extraneous polypeptide after purification.
  • Other useful fusions include linking of functional domains, such as, for example, active sites from enzymes, glycosylation domains, cellular targeting signals or transmembrane regions.
  • TdT variants described herein can be used in the synthesis of nucleic acids for the purpose of storing digital information in nucleic acids, such as DNA.
  • DNA has the capacity to hold vast amounts of information, readily stored for long periods in a compact form.
  • the high capacity of DNA to store information stably under easily achieved conditions has made DNA an attractive target for information storage since the mid-1990s.
  • DNA molecules have a longevity that permits long-term storage with little to no deterioration of the encoded information.
  • Data storage systems based on both living vector DNA (in vivo DNA molecules) and synthesized DNA (in vitro DNA) have been proposed. Given that in vivo DNA storage systems have constraints on the quantity, genomic elements and locations that can be manipulated without affecting viability of the DNA molecules in the living vector organisms (e.g., bacteria), in vivo DNA storage is not the preferred method for high capacity data storage.
  • the methods and compositions provided herein relate to an enzymatic method of making a polynucleotide or nucleic acid sequence.
  • provided herein are methods for de novo synthesis of nucleic acid sequences using a TdT variant as described herein for the purpose of storing digital information in DNA.
  • the method includes combining at least one selected nucleotide triphosphate, one or more cations, and a TdT variant in an aqueous reaction medium including a target substrate comprising an initiator sequence and having a 3' terminal nucleotide attached to a single-stranded portion, such that the template-independent polymerase interacts with the target substrate under conditions which covalently add one or more of the selected nucleotide triphosphate to the 3' terminal nucleotide.
  • the method can further include repeatedly introducing an additional subsequent selected nucleotide triphosphate to the aqueous reaction medium under conditions which enzymatically add one or more of the subsequent selected nucleotide triphosphate to the target substrate until the polynucleotide is formed.
  • a TdT variant as described herein can be conjugated to a macromolecule or solid support in a method of generating a polynucleotide.
  • a TdT variant can be contacted with the components necessary for an enzymatic reaction that produces polynucleotide products in a flow-through manner, while the TdT variant is conjugated to a solid support.
  • the solid support can comprise a growing polynucleotide strand and an untethered TdT variant can be removed from the solid support to stop the enzymatic reaction.
  • Other methods where one or more of the reaction components are attached to a solid support can be readily envisioned by one of skill in the art.
  • conditions sufficient to synthesize one or more nucleic acid molecules using the TdT variants described herein can include one or more nucleotides, one or more buffers or buffering salts, and one or more cofactors (e.g., divalent metal ions).
  • conditions sufficient to synthesize one or more nucleic acid molecules according to the invention may include incubating at an elevated temperature (e.g., greater than about 37° C., 40° C., 45° C., 50° C., 55° C., 60° C., 65° C., 70° C., 75° C., 80° C., 85° C., 90° C., or 95° C.) and/or in the presence of one or more deoxy- or dideoxyribonucleoside triphosphates.
  • Suitable deoxy- and dideoxyribonucleoside triphosphates include, but are not limited to, dATP, dCTP, dGTP, dTTP, dITP, 7-deaza-dGTP, 7-deaza-dATP, ddUTP, ddATP, ddCTP, ddGTP, ddITP, ddTTP, [α-S]dATP, [α-S]dTTP, [α-S]dGTP, and [α-S]dCTP.
  • the conditions may comprise a suitable concentration of at least one divalent metal cofactor. In some embodiments, the conditions may comprise more than one divalent metal cofactor.
  • Nucleic acids synthesized using the methods and compositions described herein can be applied to the storage of digital information in DNA as known to those of skill in the art.
  • TdT variants as described herein can be used with any method known in the art for the purpose of generating oligonucleotides de novo.
  • Oligonucleotides synthesized using the methods and/or TdT variants described herein comprise, in various embodiments, at least about 5, 10, 15, 20, 30, 40, 50, 60, 70, 75, 80, 90, 100, 120, 150 or more bases.
  • oligonucleotide synthesis is performed on a surface to allow for synthesis at a fast rate.
  • As an example, at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 125, 150, 175, 200 or more nucleotides per hour are synthesized.
  • libraries of oligonucleotides are synthesized in parallel on a substrate.
  • kits comprising wild type TdT or variant TdT polypeptides can be configured for use in any procedure known to those skilled in the art.
  • Suitable kits can be prepared, for example, for cDNA synthesis and/or amplification, detectably labeling DNA molecules, and DNA sequencing.
  • kits can comprise a carrier that can be compartmentalized to receive in close confinement one or more containers such as vials, test tubes, wells, solid supports, chips and the like.
  • at least one of such containers contains components or a mixture of components needed to perform de novo oligonucleotide or DNA synthesis.
  • a kit as described herein comprises a container having a substantially purified sample of a TdT variant of the invention.
  • the kit comprises a container(s) having one or more nucleotides needed to synthesize a DNA molecule.
  • a kit comprises a container having one or a number of different types of dideoxynucleoside triphosphates, optionally labeled with one or more detectable groups.
  • a kit as described herein can comprise pyrophosphatase.
  • Kits for DNA synthesis can comprise a first container containing a TdT variant polymerase as described herein, and one or more containers each having one, two, three or four dNTPs. Of course, it is also possible to combine one or more of these reagents in a single tube or other containers.
  • the kit of the present invention may include one or more containers that contain detectably labeled nucleotides that may be used during the synthesis or sequencing of a DNA molecule.
  • labels include, but are not limited to, radioactive isotopes, fluorescent labels, chemiluminescent labels, nuclear tags, bioluminescent labels and enzyme labels.
  • the invention may be as claimed in any one of the following numbered paragraphs.
  • modified TdT enzyme of any one of paragraphs 1-8 wherein the modified TdT comprises a mutation at R453 and at least one additional mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and F459.
  • a method for generating a polynucleotide sequence de novo or in vitro, comprising: incubating a modified TdT enzyme of paragraph 1 in the presence of an initiator sequence, a cofactor and at least one nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3' end of a polynucleotide strand.
  • a modified TdT enzyme or the initiator sequence is conjugated to a solid support.
  • A vector comprising the nucleic acid molecule of paragraph 22.
  • a cell comprising the modified TdT polypeptide of paragraphs 1-13, the nucleic acid molecule of paragraph 22, and/or the vector of paragraph 23.
  • a solid support comprising a modified TdT enzyme of paragraphs 1-13.
  • Terminal deoxynucleotidyl transferase is an exceedingly useful template-independent DNA polymerase for major biotechnological applications such as the storage of digital information and de novo oligonucleotide synthesis.
  • TdT has the ability to rapidly catalyze the synthesis of long DNA oligonucleotides in the presence of only a small initiator sequence, cofactors, and nucleoside triphosphate monomers.
  • TdT exhibits substrate bias towards both the preferred initiator sequence composition and nucleoside triphosphate base (A, G, C, or T). This bias greatly inhibits the ability for TdT to be utilized as a universal template-independent DNA polymerase and severely limits the capacity for the precise control of TdT in any oligonucleotide synthesis scheme.
  • functional mutant variants of TdT that display improved enzymatic phenotypes, such as reduced substrate bias, are highly desired to enable a multitude of commercially viable biotechnological applications. Because the protein functional landscape of TdT remains largely unexplored, it was sought to begin mapping it by generating libraries of TdT mutant variants via site-directed mutagenesis with the overall intention of improving the wild-type enzyme.
  • EXAMPLE 1 ENZYME ENGINEERING APPROACH
  • Rational site-directed mutagenesis was employed as a primary method for generating TdT mutant variant libraries in that the technique is low-cost and well-practiced in the protein engineering field. Based on a combination of protein structural analysis and previous reports of single-codon TdT mutant variant functionality analysis, several amino acid residues were identified that are important for maintaining protein structural integrity, reaction catalysis, and substrate binding (cofactor, initiator sequence, & nucleoside triphosphate).
  • Mutant variant generation proceeded hierarchically: single codon-mutants were first generated and evaluated for initial activity and/or the desired enzymatic phenotype followed by double-codon mutants and then, if necessary, triple-codon mutants.
  • mutant variant libraries were designed and built from TdT sequences originating from species including but not limited to wild-type Mus musculus, Bos taurus, Monodelphis domestica, Eulemur macaco, Xenopus laevis, Ambystoma mexicanum, Oncorhynchus mykiss, and Gallus. Additional mutant variation may arise from truncations or removal of protein domains associated with in vivo enzymatic activities and DNA repair mechanisms that are unnecessary for in vitro DNA oligonucleotide synthesis.
  • TdT requires the presence of a divalent cation cofactor for optimal enzymatic activity.
  • Synthesis reactions are typically supplemented with Co2+; however, alternative divalent cations with varying properties such as Mg2+, Mn2+, Zn2+, and combinations thereof have been reported to be compatible with TdT.
  • because divalent cations are directly involved in the binding and catalysis of the nucleoside triphosphates onto the growing oligonucleotide, it was hypothesized that any human TdT mutant variants generated may have altered cofactor requirements due to changes in structure, polarity, or hydrophobicity of the catalytic pocket.
  • enzymatic activity was screened in the presence of 0.25 mM Co2+, Mn2+ or Mg2+. Reaction supplementation with Zn2+ is least tolerated by TdT and deviations from a concentration of 0.25 mM cofactor generally result in decreased enzymatic activity.
  • the single-codon mutant R453H displayed the highest activity when the reaction was supplemented with Mn2+; however, supplementation with Co2+ and Mg2+ still resulted in appreciable activity.
  • Denaturing gel electrophoresis analysis of these reactions indicates that long oligonucleotides (>400-nt) were synthesized in the presence of each cofactor, as compared to the wild-type, no-change human TdT, where long oligonucleotides were only synthesized in the presence of Co2+ (FIG. 2B).
  • a large array of different cofactor preferences was observed when screening double-codon mutant variants carrying the constant amino acid change R453A (FIG. 3).
  • mutant variants of human TdT indicate that improvements over the wild-type enzyme may be observed as increased flexibility in the presence of variable reaction conditions or substrates. Therefore, novel mutant variants may be, for example, less temperature or pH sensitive, highly processive, able to synthesize DNA oligonucleotide at faster rates, able to incorporate non-natural nucleotides or may only display their enzymatic phenotypes when reactions are supplemented with the preferred cofactor. However, mutant variants such as R453H are highly desired as they may function optimally regardless of reaction and substrate type or composition.
  • each double-codon mutant variant was evaluated with respect to its ability to incorporate each of the four natural nucleoside triphosphate bases, with reactions supplemented with Mn2+.
  • of the 100 mutant variants produced, it was found that several displayed an enhanced ability to incorporate all four bases given a particular oligonucleotide initiator sequence (FIG. 4).
  • TdT is very active when adding dATP to an initiator oligonucleotide consisting of a homopolymer dT sequence but not active in the presence of dTTP under similar conditions; however, the mutant variant R453A-V432G allowed all four natural nucleoside triphosphate bases to be incorporated at similar rates, producing very long fragments of ssDNA (>1400-nt) using the same homopolymer dT oligonucleotide (FIG. 5).
  • this mutant variant of hTdT retained this decreased substrate bias and specificity in the presence of a homopolymer dA oligonucleotide, which could be active when adding dTTPs but not active in the presence of dATP (FIG. 6). While R453A-V432G was the best performing mutant variant identified to date in terms of most significantly decreased substrate bias and specificity, other mutant variants of human TdT that display similar enzymatic phenotypes are also of particular interest.
  • TdT can incorporate ribonucleotides in addition to deoxyribonucleotides; however, generally only 1-2 ribonucleotides can be added, as the growing DNA oligonucleotide becomes a less preferred substrate for TdT.
  • double-codon mutants display decreased substrate bias for deoxynucleotides
  • top performing variants can efficiently incorporate all four natural ribonucleoside triphosphates.
  • R453A-V432G produced long fragments of ssRNA in comparison to single-codon mutant R453A (FIG. 7).
  • the primary sequences of wild-type or mutant enzymes of interest were codon optimized for E. coli expression using a custom optimization algorithm and ordered as gBlocks® (IDT) with 20-nt overlap sequences for Gibson Assembly into the pET-28-c-(+) His-tag expression vector (EMD Millipore 69866-3).
  • IDT gBlocks®
  • the gBlocks® were PCR amplified with Phusion High Fidelity (HF) Polymerase (NEB M05030).
  • PCR thermocycling was performed as follows: initial denaturation at 98°C for 30 seconds; 18 cycles of denaturation at 98°C for 10 seconds, annealing at 68°C for 10 seconds, and extension at 72°C for 60 seconds; and a final extension of 5 minutes at 72°C.
  • PCR reactions were purified and concentrated using a QIAquick PCR Purification Kit (Qiagen 28106).
  • the pET-28-c-(+) expression vector was prepared for gBlocks® insertion by digesting the circular DNA with 40 U of NdeI (NEB R0111) per 500 ng vector at 37°C for 90 minutes.
  • the linear DNA was separated from undigested material with 2% agarose gel electrophoresis and extracted by incubating agarose containing the bands corresponding to the linear DNA in Buffer QG (Qiagen 19063) at 55°C rotating at 1000 RPM for 2 hours. The resultant mixture was cleaned and concentrated with the QIAquick PCR Purification Kit. The PCR amplified insert and vector sequences were combined at a ratio of 1:3 with 0.1 pmol of total material and assembled with Gibson Assembly Master Mix (NEB E5510S) at 50°C for 1 hour. T7 Express chemically competent E.
  • Cultures were then pelleted at 3500 x g for 10 minutes and then His-Tag purified using a HisTalon Resin Kit as per manufacturer’s instructions (Clontech 635654).
  • the eluted enzyme samples were then buffer exchanged into an optimal 2X protein storage buffer using 15-mL filter columns (Millipore) at the appropriate MWCO by centrifugation at 5000 x g for 15 minutes at 4°C. This process was repeated twice. On the third spin, samples were spun for 30 minutes in order to concentrate the protein into a smaller volume.
  • Single or multiple amino acids can be mutagenized for improvement by rational design or by high-throughput methods such as error-prone PCR.
  • Plasmids carrying the target protein were harvested and purified from sequence-verified liquid bacterial cultures grown overnight in LB-kanamycin media at 37°C using a MiniPrep Kit (Qiagen 27104).
  • Oligonucleotide primers were ordered from IDT and were designed to PCR amplify the protein expression plasmid while simultaneously mutagenizing the plasmid at the predetermined location, yielding linearized DNA.
  • the protein expression plasmid was PCR amplified using the Q5 Hot Start High-Fidelity 2x Master Mix with the following thermocycling conditions: initial denaturation at 98°C for 30 seconds; 25 cycles of denaturation at 98°C for 10 seconds, annealing at 68°C for 10 seconds, and extension at 72°C for 120 seconds; and a final extension of 2 minutes at 72°C. 1 µL of the resulting PCR amplification reaction was then treated with the kit’s enzyme reaction cocktail to re-circularize the protein expression plasmid while digesting away the unsubstituted plasmid sequences remaining in the reaction mixture.
  • the length of the ssDNA produced in these reactions was determined by comparing products to a 100-nt ssDNA ladder (Simplex Biosciences) using a 15% TBE-Urea denaturing gel (Thermo EC6885) following the manufacturer’s protocol. Approximately 8 µL of the initial activity screen reaction volume was loaded onto the gels and run at 185V for 60 minutes unless otherwise specified. Gels were then stained with a solution of 1x GelStar Nucleic Acid Gel Stain for 15 minutes with gentle agitation. The resultant gel was then imaged on a Typhoon FLA 9500 system (GE Healthcare Life Sciences) using imaging parameters for SYBR Gold. For extension reactions using initiator oligonucleotides labeled with a 5’-fluorophore such as FAM, Cy5, Cy3, etc., gels were not stained and were imaged directly using the appropriate parameters.

Abstract

Provided herein are modified TdT polypeptides and uses thereof.

Description

COMPOSITIONS AND METHODS COMPRISING MUTANTS OF TERMINAL
DEOXYNUCLEOTIDYL TRANSFERASE
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This Application claims benefit under 35 U.S.C. § 119(e) of the U.S. Provisional Application No. 62/741,143 filed October 4, 2018, the contents of which are incorporated herein by reference in their entirety.
FIELD OF THE DISCLOSURE
[0002] The present disclosure described herein relates to modified terminal deoxynucleotidyl transferase (TdT) polypeptides and uses thereof.
BACKGROUND
[0003] Terminal deoxynucleotidyl transferase (TdT) is a very useful template-independent DNA polymerase for major biotechnological applications such as the storage of digital information and de novo oligonucleotide synthesis. TdT has the unique ability to rapidly catalyze the synthesis of long DNA oligonucleotides in the presence of only a small initiator sequence, cofactors and nucleoside triphosphate monomers.
SUMMARY
[0004] The methods and compositions described herein are based, in part, on the discovery that one or more amino acid mutations in the TdT polypeptide sequence can modify the function, reaction catalysis and substrate binding of the TdT polypeptide. For example, certain amino acid mutations can alter the cofactor preference for the TdT polypeptide, such that a modified TdT may be more efficient in the presence of Mn2+ or Mg2+ than in the presence of the endogenously preferred Co2+ cofactor. In other examples, a modified TdT polypeptide can comprise different temperature or pH sensitivities, altered rates of DNA synthesis, and the ability to incorporate non-natural nucleotides. In one preferred embodiment, a modified TdT enzyme comprises a reduced substrate bias towards a preferred initiator sequence or nucleoside triphosphate base (A, G, C, or T) compared to an unmodified TdT enzyme from which it is derived.
[0005] In one aspect, provided herein is a modified terminal deoxynucleotidyl transferase (TdT) polypeptide (i.e., enzyme), wherein the modified TdT polypeptide comprises a sequence having at least one amino acid mutation and retains at least 10% of the template-independent DNA polymerase activity of the TdT polypeptide from which the modified TdT is derived. In one embodiment, the template-independent DNA polymerase activity of the modified TdT polypeptide and the wild-type TdT polypeptide from which it is derived is measured using the same enzymatic reaction conditions (e.g., co-factor and co-factor concentration, temperature, time, pH, nucleotide(s) and nucleotide(s) concentration etc.). In other embodiments, the template-independent polymerase activity of a given modified TdT polypeptide is assessed under reaction conditions that are different from the reaction conditions that are used to assess the activity of the unmodified TdT polypeptide. One of skill in the art will recognize that modifications to a TdT polypeptide can alter the co-factor preference or degree of nucleotide bias, thus it may be desirable to compare the activity of the modified TdT polypeptide under conditions preferred by the modified enzyme to the activity of the wild-type enzyme under conditions preferred by the wild-type enzyme. This is because under the preferred conditions of the modified TdT polypeptide, the wild-type TdT may have substantially reduced activity.
[0006] In one embodiment of this aspect and all other aspects provided herein, the modified TdT comprises at least 50% of the template-independent DNA polymerase activity of the TdT polypeptide from which it is derived.
[0007] In another embodiment of this aspect and all other aspects provided herein, the protein sequence of the modified TdT comprises at least two amino acid mutations compared to the TdT polypeptide from which it is derived.
[0008] In another embodiment of this aspect and all other aspects provided herein, the TdT polypeptide from which the modified TdT is derived is a human TdT polypeptide.
[0009] In another embodiment of this aspect and all other aspects provided herein, the human TdT polypeptide comprises SEQ ID NO: 1.
[0010] In another embodiment of this aspect and all other aspects provided herein, the modified TdT comprises a cofactor preference that is different from the cofactor preference of the TdT polypeptide from which it is derived.
[0011] In another embodiment of this aspect and all other aspects provided herein, the modified TdT comprises a reduced degree of bias compared to the degree of bias of the TdT polypeptide from which it is derived under substantially similar enzyme assay conditions.
[0012] In another embodiment of this aspect and all other aspects provided herein, the modified TdT comprises at least one amino acid mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459.
[0013] In another embodiment of this aspect and all other aspects provided herein, the modified TdT comprises a mutation at R453 and at least one additional mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459.
[0014] In another embodiment of this aspect and all other aspects provided herein, the mutation at R453 is R453A.
[0015] In another embodiment of this aspect and all other aspects provided herein, the at least one additional mutation occurs at amino acid residue V432.
[0016] In another embodiment of this aspect and all other aspects provided herein, the mutation at amino acid residue V432 is V432G.
[0017] In another embodiment of this aspect and all other aspects provided herein, the modified TdT polypeptide comprises a sequence selected from those listed in any one of Tables 3-8.
[0018] In another embodiment of this aspect and all other aspects provided herein, the modified TdT polypeptide is a variant comprising the mutations R453A-V432G.
[0019] Another aspect provided herein relates to a method for generating a polynucleotide sequence de novo or in vitro, the method comprising: incubating a modified TdT enzyme as described herein in the presence of an initiator sequence, a cofactor and at least one nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3’ end of a polynucleotide strand.
[0020] In one embodiment of this aspect and all other aspects provided herein, the modified TdT enzyme or the initiator sequence are conjugated to a solid support.
[0021] In another embodiment of this aspect and all other aspects provided herein, the solid support comprises a bead, a membrane, or a column.
[0022] In another embodiment of this aspect and all other aspects provided herein, the cofactor is a divalent cation.
[0023] In another embodiment of this aspect and all other aspects provided herein, the divalent cation is Co2+, Mn2+, Mg2+ or Zn2+.
[0024] In another embodiment of this aspect and all other aspects provided herein, the modified TdT enzyme of claim 1 is incubated in the presence of 2, 3, or 4 nucleoside triphosphates.
[0025] In another embodiment of this aspect and all other aspects provided herein, the method further comprises a second step of incubating a modified TdT enzyme of claim 1 in the presence of an initiator sequence, a cofactor and at least one different nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3’ end of a polynucleotide strand.
[0026] In another embodiment of this aspect and all other aspects provided herein, the step is repeated once or twice each in the presence of at least one different nucleotide.
[0027] Another aspect provided herein relates to a nucleic acid molecule encoding any one of the modified TdT enzymes described herein (e.g., encoding any one of the polypeptide sequences listed in Tables 3-8). In another embodiment of this aspect and all other aspects provided herein, the nucleic acid encodes a modified TdT polypeptide variant comprising the mutations R453A-V432G.
[0028] Another aspect provided herein relates to a vector comprising a nucleic acid molecule encoding any one of the modified TdT enzymes described herein (e.g., encoding any one of the polypeptide sequences listed in Tables 3-8). In one embodiment of this aspect and all other aspects provided herein, the vector comprises a nucleic acid that encodes a modified TdT polypeptide variant comprising the mutations R453A-V432G.
[0029] Another aspect provided herein relates to a cell comprising the modified TdT polypeptide, a nucleic acid molecule encoding a modified TdT polypeptide, and/or the vector comprising such a nucleic acid molecule as described herein.
[0030] In one embodiment of this aspect and all other aspects provided herein, the cell is a bacterial cell.
[0031] Another aspect provided herein relates to a solid support comprising a modified TdT enzyme as described herein (e.g., any one of the protein sequences listed in Tables 3-8). In one embodiment of this aspect and all other aspects provided herein, the solid support comprises a modified TdT polypeptide variant comprising the mutations R453A-V432G.
BRIEF DESCRIPTION OF THE DRAWINGS
[0032] FIG. 1. View of catalytic pocket of murine TdT with large arginine at residue position 453 protruding near where nucleotide binds. Model generated with PyMol software (Schrodinger, NY, NY) using PDB: 1KEJ.
[0033] FIGs. 2A-2B. (FIG. 2A) Heat-map of single-codon mutant variants at amino acid residue R453. RFU values for each cofactor evaluated were normalized by the total protein concentration of the human TdT mutant variants as determined by a reducing agent microBCA assay. The final concentration of cofactor was 0.25 mM for all reactions. The heat-map is sorted by increasing average RFU at the completion of reaction incubation (top has most activity and bottom has least activity). Values represent the mean with (N=2). (FIG. 2B) 6% TBE-Urea denaturing gel electrophoresis analysis of human TdT R453H DNA oligonucleotide synthesis reactions compared to control, wild-type human TdT. 200-nt ssDNA ladder was used to determine the size of the produced DNA oligonucleotide. Gels were stained with 1x GelStar Nucleic Acid Stain.
[0034] FIG. 3. Heat-map of double-codon mutant variants carrying the constant amino acid change R453A. RFU values for each cofactor evaluated were normalized by the total protein concentration of the human TdT mutant variants as determined by a reducing agent microBCA assay. The final concentration of cofactor was 0.25 mM for all reactions. The heat-map is sorted by increasing average RFU at the completion of reaction incubation (top has most activity and bottom has least activity). Values represent the mean with (N=2).
[0035] FIGs. 4A-4B. Heat-maps of double-codon mutant variants carrying the constant amino acid change R453A. RFU values for each natural nucleotide evaluated were normalized by the total protein concentration of the human TdT mutant variants as determined by a reducing agent microBCA assay. The final concentration for each nucleotide was 1 mM and the cofactor was 0.25 mM Mn2+. Heat-maps are sorted by increasing average RFU at the completion of reaction incubation, where (FIG. 4A) is the top 1-50 mutant variants and (FIG. 4B) is top mutant variants 51-100. Values represent the mean with (N=2).
[0036] FIG. 5. 6% TBE-Urea denaturing gel electrophoresis analysis of natural nucleotide incorporation by wild-type human TdT (WT-hTDT) compared to double-codon mutant variant human TdT R453A-V432G. Nucleotide concentration was 1 mM and the initiator oligonucleotide sequence was a Poly-T-15-mer at 10 pmol per reaction. Wild-type human TdT was supplemented with 0.25 mM Co2+ cofactor and the double mutant variant was supplemented with 0.25 mM Mn2+ cofactor.
[0037] FIGs. 6A-6B. 15% TBE-Urea denaturing gel electrophoresis analysis of natural nucleotide incorporation by single-codon mutant R453A human TdT compared to double-codon mutant R453A-V432G with varying DNA oligonucleotide initiator sequences. (FIG. 6A) indicates reactions supplemented with 10 pmol of Poly-T-15-mer and (FIG. 6B) indicates reactions supplemented with 10 pmol of Poly-A-15-mer. Both mutant variants were supplemented with 0.25 mM Mn2+ cofactor.
[0038] FIG. 7. 15% TBE-Urea denaturing gel electrophoresis analysis of natural ribonucleotide incorporation by single-codon mutant R453A human TdT compared to double-codon mutant R453A-V432G. Reactions were supplemented with 1 mM of each ribonucleotide, 0.25 mM Mn2+, and 10 pmol of DNA oligonucleotide initiator sequence Poly-T-15-mer.
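By way of a non-limiting illustration, the normalization and ranking described for the heat-maps of FIGs. 2A, 3 and 4A-4B can be sketched as follows. The function names, the data layout, and the choice to average across cofactors before ranking are assumptions made for illustration only and are not taken from the disclosure.

```python
from statistics import mean


def normalized_rfu(rfu_replicates, protein_conc):
    """Mean RFU across replicates (e.g., N=2) divided by total protein concentration."""
    if protein_conc <= 0:
        raise ValueError("protein concentration must be positive")
    return mean(rfu_replicates) / protein_conc


def rank_variants(readings):
    """Rank variants from most to least average normalized RFU.

    readings maps variant name -> {condition: (rfu_replicates, protein_conc)},
    where a condition is a cofactor or nucleotide (hypothetical layout).
    """
    scores = {}
    for variant, by_condition in readings.items():
        per_condition = [
            normalized_rfu(reps, conc) for reps, conc in by_condition.values()
        ]
        scores[variant] = mean(per_condition)
    # Highest average first, matching "top has most activity".
    return sorted(scores, key=scores.get, reverse=True)
```

A call such as `rank_variants({"variant_1": {"Mn2+": ([100, 120], 2.0)}, "variant_2": {"Mn2+": ([300, 320], 2.0)}})` would place `variant_2` first; the variant names and RFU values here are placeholders, not measured data.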
DETAILED DESCRIPTION
[0039] The technology described herein is based, in part, on the generation of modified TdT polypeptides having at least one amino acid mutation at a desired residue but retaining at least 25% of the template-independent DNA polymerase activity of the unmodified TdT polypeptide. Such TdT variants are contemplated for use in the generation of nucleic acid sequences for the storage of digital information, de novo oligonucleotide synthesis, or the like.
Definitions
[0040] As used herein, the term “template-independent DNA polymerase activity” refers to the ability of a TdT polypeptide, variant or mutant to add at least one nucleotide to a growing polynucleotide strand in the absence of a template DNA strand.
[0041] As used herein, the term "substantially retains TdT activity" means that a variant or modified TdT polypeptide will retain at least 10% of the template-independent DNA polymerase activity (as assessed by measuring in vitro TdT enzyme activity) of the polypeptide or peptide from which it is derived (e.g., wild-type human TdT). In some embodiments, the activity of the derivative and the activity of wild-type TdT are assessed under substantially similar conditions, for example, in the presence of the same cofactor (e.g., Co2+). However, the activity of the derivative can be determined under different conditions (e.g., in the presence of an alternative co-factor, such as Mn2+, Zn2+ or Mg2+) and compared to the activity of the wild-type TdT enzyme determined under its preferred native conditions (e.g., in the presence of Co2+). In other embodiments, the derivative will retain at least 25%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, at least 99% or even 100% of the template-independent DNA polymerase activity of the peptide/polypeptide from which it is derived.
[0042] As used herein, the term “cofactor preference” refers to the cofactor that permits the highest enzymatic activity of a given TdT variant in the same assay conditions and using the same concentration of cofactor (e.g., 0.25 mM). In some embodiments, the cofactor preference is expressed in descending order, such as the cofactor preference for endogenous wild-type TdT which is expressed as Co2+ > Mg2+, Mn2+.
[0043] The term "increased activity" refers to an increase in template-independent DNA polymerase activity of a derivative compared to that of the parent peptide/polypeptide, for example, the derivative can have at least a 2-fold increase, at least 5-fold, at least 10-fold, at least 20-fold, at least 50-fold, at least 100-fold, at least 1000-fold or more increase in template-independent DNA polymerase activity compared to the parent peptide/polypeptide from which it is derived.
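As a non-limiting sketch, the retained-activity threshold in this definition can be expressed as a simple ratio test. The function names and the default threshold argument are hypothetical; the activity values are assumed to be any consistent in vitro readout (for example, normalized RFU), measured under whichever conditions the embodiment specifies.

```python
def percent_retained(variant_activity, parent_activity):
    """Activity of the variant as a percentage of the parent TdT activity."""
    if parent_activity <= 0:
        raise ValueError("parent activity must be positive")
    return 100.0 * variant_activity / parent_activity


def substantially_retains(variant_activity, parent_activity, threshold_pct=10.0):
    """True if the variant retains at least threshold_pct of parent activity."""
    return percent_retained(variant_activity, parent_activity) >= threshold_pct
```

Passing `threshold_pct=25.0` or `threshold_pct=50.0` would correspond to the stricter embodiments recited above.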
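For illustration only, a cofactor preference in descending order can be derived from activities measured at a common cofactor concentration as sketched below. The function names are hypothetical, the sketch assumes no exact ties between cofactors, and any values supplied to it are placeholders rather than measurements from the disclosure.

```python
def cofactor_preference(activity_by_cofactor):
    """Return cofactors ordered from highest to lowest measured activity.

    activity_by_cofactor maps cofactor name -> activity, with all activities
    assumed to be measured at the same cofactor concentration (e.g., 0.25 mM).
    """
    return sorted(activity_by_cofactor, key=activity_by_cofactor.get, reverse=True)


def format_preference(order):
    """Render an ordered list of cofactors as 'X > Y > Z'."""
    return " > ".join(order)
```

The first element of the returned list is the preferred cofactor in the sense of the definition above.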
[0044] The terms “derivative,” "variant," or “mutant” as used herein refer to a polypeptide or nucleic acid that comprises at least one mutation but is "substantially similar" to a wild-type human TdT polypeptide. A molecule is said to be "substantially similar" to another molecule if both molecules have substantially similar structures (i.e., they are at least 50% similar in amino acid sequence as determined by BLASTp alignment set at default parameters) and are substantially similar in at least one relevant function (e.g., template-independent DNA polymerase activity). A variant differs from the naturally occurring polypeptide or nucleic acid by one or more amino acid or nucleic acid deletions, additions, substitutions or side-chain modifications, yet retains one or more specific functions or biological activities of the naturally occurring molecule. Amino acid substitutions include alterations in which an amino acid is replaced with a different naturally-occurring or a non-conventional amino acid residue. Some substitutions can be classified as “conservative,” in which case an amino acid residue contained in a polypeptide is replaced with another naturally occurring amino acid of similar character either in relation to polarity, side chain functionality or size. Substitutions encompassed by variants as described herein can also be “non-conservative,” in which an amino acid residue which is present in a peptide is substituted with an amino acid having different properties (e.g., substituting a charged or hydrophobic amino acid with an uncharged or hydrophilic amino acid), or alternatively, in which a naturally-occurring amino acid is substituted with a non-conventional amino acid.
Also encompassed within the term “variant,” when used with reference to a polynucleotide or polypeptide, are variations in primary, secondary, or tertiary structure, as compared to a reference polynucleotide or polypeptide, respectively (e.g., as compared to a wild-type polynucleotide or polypeptide). Polynucleotide changes can result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference sequence. Variants can also include insertions, deletions or substitutions of amino acids, including insertions and substitutions of amino acids (and other molecules) that do not normally occur in the peptide sequence that is the basis of the variant, including but not limited to insertion of ornithine which does not normally occur in human proteins.
[0045] The terms “modified TdT enzyme,” “modified TdT polypeptide,” “TdT variant,” and “mutant TdT” are used interchangeably herein.
[0046] The terms “statistically significant" or “significantly" refer to statistical significance and generally mean a two standard deviation (2SD) or greater difference relative to a reference value.
[0047] The terms “decrease”, “reduced”, “reduction”, or “inhibit” are all used herein to mean a decrease by a statistically significant amount. In some embodiments, “reduce,” “reduction" or “decrease" or “inhibit” typically means a decrease by at least 10% as compared to a reference level (e.g. the absence of a given treatment) and can include, for example, a decrease by at least about 10%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or more. As used herein, “reduction” or “inhibition” does not encompass a complete inhibition or reduction as compared to a reference level. “Complete inhibition” is a 100% inhibition as compared to a reference level. A decrease can be preferably down to a level accepted as within the range of normal for an individual without a given disorder.
[0048] The terms “increased”, “increase”, “enhance”, or “activate” are all used herein to mean an increase by a statistically significant amount. In some embodiments, the terms “increased”, “increase”, “enhance”, or “activate” can mean an increase of at least 10% as compared to a reference level, for example an increase of at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% increase or any increase between 10-100% as compared to a reference level, or at least about a 2-fold, or at least about a 3-fold, or at least about a 4-fold, or at least about a 5-fold or at least about a 10-fold increase, or any increase between 2-fold and 10-fold or greater as compared to a reference level. In the context of a marker or symptom, an “increase” is a statistically significant increase in such level.
[0049] As used herein, the terms “oligonucleotide,” “polynucleotide,” and “nucleic acid” encompass double- or triple-stranded nucleic acids, as well as single-stranded molecules. In double- or triple-stranded nucleic acids, the nucleic acid strands need not be coextensive (i.e., a double-stranded nucleic acid need not be double-stranded along the entire length of both strands). Nucleic acid sequences, when provided, are listed in the 5' to 3' direction, unless stated otherwise. Methods described herein provide for the generation of isolated nucleic acids. Methods described herein additionally provide for the generation of isolated and purified nucleic acids. An “oligonucleotide,” “polynucleotide,” and “nucleic acid” as referred to herein can comprise at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, or more bases in length.
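One way the two standard deviation (2SD) criterion could be applied is sketched below as a non-limiting example. The sketch assumes that the reference value is the mean of a set of reference measurements and that the sample standard deviation of those measurements is used; neither choice is specified in the definition above, and the function name is hypothetical.

```python
from statistics import mean, stdev


def is_significant(value, reference_values, n_sd=2.0):
    """True if value differs from the reference mean by at least n_sd
    sample standard deviations of the reference measurements."""
    ref_mean = mean(reference_values)
    ref_sd = stdev(reference_values)  # requires at least two reference values
    return abs(value - ref_mean) >= n_sd * ref_sd
```

For a reference set of 10, 12 and 14 (mean 12, sample standard deviation 2), a value of 17 meets the 2SD criterion while a value of 13 does not; these numbers are illustrative only.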
[0050] As used herein the term "comprising" or "comprises" is used in reference to compositions, methods, and respective component(s) thereof, that are essential to the method or composition, yet open to the inclusion of unspecified elements, whether essential or not.
[0051] The term "consisting of" refers to compositions, methods, and respective components thereof as described herein, which are exclusive of any element not recited in that description of the embodiment.
[0052] As used herein the term "consisting essentially of" refers to those elements required for a given embodiment. The term permits the presence of elements that do not materially affect the basic and novel or functional characteristic(s) of that embodiment.
[0053] Other than in the operating examples, or where otherwise indicated, all numbers expressing quantities of ingredients or reaction conditions used herein should be understood as modified in all instances by the term “about.” The term “about” when used in connection with percentages can mean ±1%.
[0054] The singular terms "a," "an," and "the" include plural referents unless context clearly indicates otherwise. Similarly, the word "or" is intended to include "and" unless the context clearly indicates otherwise. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of this disclosure, suitable methods and materials are described below. The abbreviation, "e.g." is derived from the Latin exempli gratia, and is used herein to indicate a non-limiting example. Thus, the abbreviation "e.g." is synonymous with the term "for example."
[0055] Definitions of common terms in cell biology and molecular biology can be found in “The Merck Manual of Diagnosis and Therapy”, 19th Edition, published by Merck Research Laboratories, 2006 (ISBN 0-911910-19-0); Robert S. Porter et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0-632-02182-9); Benjamin Lewin, Genes X, published by Jones & Bartlett Publishing, 2009 (ISBN-10: 0763766321); Kendrew et al. (eds.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8) and Current Protocols in Protein Sciences 2009, Wiley Intersciences, Coligan et al., eds.
[0056] Unless otherwise stated, the present invention was performed using standard procedures, as described, for example in Sambrook et al., Molecular Cloning: A Laboratory Manual (3 ed.), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., USA (2001); Davis et al., Basic Methods in Molecular Biology, Elsevier Science Publishing, Inc., New York, USA (1995); or Methods in Enzymology: Guide to Molecular Cloning Techniques Vol. 152, S. L. Berger and A. R. Kimmel Eds., Academic Press Inc., San Diego, USA (1987); Current Protocols in Protein Science (CPPS) (John E. Coligan, et. al., ed., John Wiley and Sons, Inc.), Current Protocols in Cell Biology (CPCB) (Juan S. Bonifacino et. al. ed., John Wiley and Sons, Inc.), and Culture of Animal Cells: A Manual of Basic Technique by R. Ian Freshney, Publisher: Wiley-Liss; 5th edition (2005), Animal Cell Culture Methods (Methods in Cell Biology, Vol. 57, Jennie P. Mather and David Barnes editors, Academic Press, 1st edition, 1998) which are all incorporated by reference herein in their entireties.
Terminal deoxynucleotidyl Transferase (TdT)
[0057] Terminal deoxynucleotidyl transferase (TdT) is a template-independent DNA polymerase that catalyzes the addition of nucleotides to the 3’ terminus of a DNA molecule (e.g., a single stranded DNA strand). TdT plays a role in introducing minor changes into the genetic material by randomly adding nucleotides to single-stranded DNA during recombination. TdT activity is important in adaptation of the vertebrate immune system by increasing antigen receptor diversity. There are two known isoforms of TdT: (i) a short form having 509 amino acids (TdTS), and (ii) a long form having 529 amino acids (TdTL). Both TdTS and TdTL comprise the domains necessary to bind nucleotides, DNA, and metal ion cofactors. Thus, it is specifically contemplated that the derivatives or mutants of TdT described herein can be derived from either TdT isoform.
[0058] Two functionally independent human TdT regions have been identified: breast cancer susceptibility protein BRCA1 C-terminal (BRCT) domain at the N-terminus and the polymerase-like domain at the C-terminus. The BRCT domain of TdT is involved in protein-protein and protein-DNA interactions during DNA repair and cell cycle checkpoint pathways. The pol-like domain is the catalytic core of the enzyme and contains the active site of the phosphoryl transfer reaction. In addition, a nuclear localization signal (NLS) motif is found at the N-terminus. The protein domain structure and crystal structure of TdT are known to those of skill in the art and are not described in further detail herein.
[0059] While endogenous TdT prefers to use cobalt ion (Co2+) as a co-factor, TdT is unique in its ability to use a variety of other divalent cations such as Mn2+, Zn2+ and Mg2+. In general, the extension rate in vitro with dATP in the presence of divalent metal ions is ranked in the following order: Mg2+ > Zn2+ > Co2+ > Mn2+. In addition, each metal ion has different effects on the kinetics of nucleotide incorporation. For example, Mg2+ facilitates the preferential utilization of dGTP and dATP whereas Co2+ increases the catalytic polymerization efficiency of the pyrimidines, dCTP and dTTP. Zn2+ behaves as a unique positive effector for TdT since reaction rates with Mg2+ are stimulated by the addition of micromolar quantities of Zn2+. This enhancement may reflect the ability of Zn2+ to induce conformational changes in TdT that yield higher catalytic efficiencies. Polymerization rates are lower in the presence of Mn2+ compared to Mg2+, suggesting that Mn2+ does not support the reaction as efficiently as Mg2+. Further description of TdT is provided in Biochim Biophys Acta., May 2010; 1804(5): 1151-1166, hereby incorporated by reference in its entirety.
Table 1: Exemplary unmodified polypeptide sequences of TdT in different species
Figure imgf000012_0001
Figure imgf000013_0001
Figure imgf000014_0001
Figure imgf000015_0001
[0060] The modified TdT polypeptides (also referred to herein as “TdT variants”) can be derived from any one of SEQ ID NOs: 1-9. In one embodiment, the modified TdT polypeptide is derived from a human polypeptide sequence, for example, SEQ ID NO: 1.
Modified TdT/Variants of TdT
[0061] Provided herein are variants of TdT (e.g., human TdT) that comprise at least one amino acid mutation compared to the TdT protein from which they are derived and retain at least 10% of the functional template-independent DNA polymerase activity of the unmodified TdT (e.g., using an enzymatic TdT test as described herein). In some embodiments, the variants of TdT retain at least 15%, at least 20%, at least 25%, at least 30%, at least 35%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 98%, or at least 99% of the template-independent DNA polymerase activity of the unmodified TdT from which the variant is derived. In some embodiments, the TdT variant comprises a template-independent DNA polymerase activity that is substantially similar to the activity of the TdT from which it is derived. As used herein, the term “substantially similar” refers to an activity of a TdT variant that is not statistically significantly different (i.e., at a significance threshold of p < 0.05) from the activity of the unmodified TdT protein from which the variant is derived (e.g., as assessed using a TdT enzyme assay as described herein).
[0062] It is contemplated that some mutations can potentially improve the relevant activity, such that a TdT variant has more than 100% of the activity of a wild-type or native polypeptide, e.g., 110%, 125%, 150%, 175%, 200%, 500%, 1000% or more.
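The retained-activity thresholds above can be illustrated with a short calculation. The following is a minimal sketch only: the function names and the 10% default are illustrative assumptions, and the activity values may be in any unit (for example, a rate from a kinetic assay) provided the variant and the parent TdT are measured in the same unit.

```python
def relative_activity_percent(variant_activity: float, parent_activity: float) -> float:
    """Express variant activity as a percentage of the parent (unmodified) TdT activity."""
    if parent_activity <= 0:
        raise ValueError("parent activity must be positive")
    return 100.0 * variant_activity / parent_activity


def retains_activity(variant_activity: float, parent_activity: float,
                     threshold_percent: float = 10.0) -> bool:
    """Return True if the variant retains at least threshold_percent of parent activity.

    The 10% default mirrors the lowest threshold recited above; other
    thresholds (15%, 20%, ...) can be passed explicitly.
    """
    return relative_activity_percent(variant_activity, parent_activity) >= threshold_percent


# Hypothetical activity values, for illustration only.
print(relative_activity_percent(30.0, 20.0))  # a variant more active than its parent
print(retains_activity(1.0, 20.0))            # a variant at 5% of parent activity
```

A value above 100 corresponds to a variant whose activity exceeds that of the polypeptide from which it is derived.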
[0063] Variant TdTs can comprise at least 2, at least 3, at least 4, at least 5 amino acid mutations or more (e.g., 6, 7, 8, 9, 10 etc.) as compared to the TdT enzyme from which they are derived. In one embodiment, the variant of TdT comprises a “single-codon” mutation (i.e., a single amino acid mutation). In other embodiments the TdT variant comprises a “double-codon” (i.e., two amino acid substitutions) or “triple-codon” mutation (i.e., three amino acid substitutions).
[0064] In some embodiments, the amino acid substitutions can comprise a conservative amino acid substitution. The terminology "conservative amino acid substitutions" is well known in the art, which relates to substitution of a particular amino acid by one having a similar characteristic (e.g., similar charge or hydrophobicity, similar bulkiness). Examples include aspartic acid for glutamic acid, or isoleucine for leucine. A list of exemplary conservative amino acid substitutions is given in the table below. A conservative substitution mutant or variant will 1) have only conservative amino acid substitutions relative to the parent sequence, 2) will have at least 90% sequence identity with respect to the parent sequence, preferably at least 95% identity, 96% identity, 97% identity, 98% identity or 99% or greater identity; and 3) will retain TdT template-independent DNA polymerase activity (e.g., enzyme activity) as that term is defined herein.
Figure imgf000016_0001
Figure imgf000017_0001
[0065] Alternatively, a non-conservative amino acid substitution may be preferred, for example, when a TdT variant with differing cofactor binding, enzyme activity, or reduced substrate bias is desired. "Non-conservative substitution" refers to the substitution of an amino acid in one class with an amino acid from another class; for example, substitution of an Ala, a class II residue, with a class III residue such as Asp, Asn, Glu, or Gln. Additional non-limiting examples of non-conservative substitutions include the substitution of a non-polar (hydrophobic) amino acid residue such as isoleucine, valine, leucine, alanine, methionine for a polar (hydrophilic) residue such as cysteine, glutamine, glutamic acid or lysine and/or a polar residue for a non-polar residue.
[0066] As will be appreciated by those of skill in the art, a TdT variant as described herein can have a mixture of conservative and non-conservative amino acid substitutions in any desired configuration. The TdT variant can be tested for activity, co-factor preference and nucleotide bias using methods known in the art or described in the Examples.
[0067] In some embodiments, the amino acid residue to be mutated is an amino acid that plays a role in maintaining protein structural integrity, reaction catalysis and substrate binding (cofactor, initiator sequence and nucleoside triphosphate). In certain embodiments, the one or more amino acids targeted for mutation to a different amino acid include, but are not limited to, S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, or L459 (numbering based on the wild-type human TdT sequence, UniProt #P04053). In other embodiments, the TdT is a non-human mammalian TdT or TdT from other non-mammalian species. In some embodiments, the TdT is a member of the archaeo-eukaryotic primase (AEP) superfamily. In other embodiments, the TdT is a PolpTN2 or a C-terminal truncated PolpTN2, a PriS, a nonhomologous end joining archaeo-eukaryotic primase, a mammalian Polθ, or a eukaryotic PrimPol. In one embodiment, the variant does not comprise a mutation(s) that would require TdT to use a template for synthesis of a polynucleotide strand.
[0068] Amino acid sequence alignment of a polypeptide of interest with a reference, e.g., from another species can provide guidance regarding not only residues likely to be necessary for function but also, conversely, those residues likely to tolerate change. Where, for example, an alignment shows two identical or similar amino acids at corresponding positions, it is more likely that that site is important functionally. Where, conversely, alignment shows residues in corresponding positions to differ significantly in size, charge, hydrophobicity, etc., it is more likely that that site can tolerate variation in a functional polypeptide. Such alignments are readily created by one of ordinary skill in the art, e.g., using the default settings of the alignment tool of the BLASTP program. Furthermore, homologs of any given polypeptide or nucleic acid sequence can be found using BLAST programs, e.g., by searching freely available databases of sequence for homologous sequences, or by querying those databases for annotations indicating a homolog (e.g., search strings that comprise a gene name or describe the activity of a gene).
[0069] The variant amino acid sequence (or corresponding DNA sequence) can be at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or more, identical to a native or reference sequence. The degree of homology (percent identity) between a native and a mutant sequence can be determined, for example, by comparing the two sequences using freely available computer programs commonly employed for this purpose on the world wide web. The variant amino acid or DNA sequence can be at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or more, similar to the sequence from which it is derived (referred to herein as an “original” sequence). The degree of similarity (percent similarity) between an original and a mutant sequence can be determined, for example, by using a similarity matrix. Similarity matrices are well known in the art and a number of tools for comparing two sequences using similarity matrices are freely available online, e.g., BLASTp (available on the world wide web), with default parameters set.
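As an illustration of the percent identity comparison described above, the following minimal sketch computes identity over two sequences that have already been aligned (e.g., by an alignment tool), with gaps written as "-". The function name and the choice of denominator (all alignment columns that are not gap-in-both) are assumptions made for illustration; alignment programs differ in how they define the denominator, so values may not match a given tool exactly.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy sequences for illustration only; these are not TdT sequences.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

Percent similarity would additionally require a similarity matrix to score non-identical but similar residues, which this sketch does not attempt.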
[0070] In some embodiments, an amino acid mutation is introduced using any method known in the art, for example, site-directed mutagenesis where targeted mutations are introduced into one or more desired positions of a template TdT polynucleotide. This may be achieved by classic primer extension mutagenesis using a mutagenesis primer containing one or more desired mutations relative to the template polynucleotide. The mutagenesis primer can be a synthetic oligonucleotide or a PCR product, and it may include one or more desired substitutions, deletions, additions or any desired combination thereof. Means and methods for producing such primers are readily available in the art. The oligonucleotide or PCR product used as primer must be 5'-phosphorylated for ligation. This can be achieved by enzymatic phosphorylation reaction, by enzymatic digestion of the 5' end of the DNA or by conjugation in a chemical reaction. Kits for site-directed mutagenesis can be obtained commercially from, e.g., New England Biolabs (Ipswich, MA), Thermo Fisher Scientific (Waltham, MA), Agilent (Santa Clara, CA), TransgenBiotech (Beijing, China), Biogene (Cambridge, MA), etc.
[0071] Amino acid insertions and deletions are specifically contemplated herein in a modified TdT polypeptide provided that such insertions or deletions do not unduly impair the template-independent DNA polymerase activity. In certain embodiments, an insertion comprises at least one additional residue but does not exceed 20 additional residues, for example, 1-18 residues, 1-16 residues, 1-15 residues, 1-14 residues, 1-12 residues, 1-10 residues, 1-9 residues, 1-8 residues, 1-7 residues, 1-6 residues, 1-5 residues, 1-4 residues, 1-3 residues or 1-2 residues are inserted. In other embodiments, a deletion comprises removal of at least one residue but does not exceed 10 residues, for example, less than 9, less than 8, less than 7, less than 6, less than 5, less than 4, less than 3, or 2 residues are deleted.
[0072] In some embodiments, a TdT polypeptide can be modified, e.g., by addition of a moiety to one or more of the amino acids that together comprise the peptide. In some embodiments, a polypeptide as described herein can comprise one or more moiety molecules, e.g., 1 or more moiety molecules per polypeptide, 2 or more moiety molecules per polypeptide, 5 or more moiety molecules per polypeptide, 10 or more moiety molecules per polypeptide or more. In some embodiments, a polypeptide as described herein can comprise one or more types of modifications and/or moieties, e.g., 1 type of modification, 2 types of modifications, 3 types of modifications or more types of modifications. Non-limiting examples of modifications and/or moieties include PEGylation; glycosylation; HESylation; ELPylation; lipidation; acetylation; amidation; end-capping modifications; cyano groups; phosphorylation; albumin; and cyclization. In some embodiments, an end-capping modification can comprise acetylation at the N-terminus, N-terminal acylation, and N-terminal formylation. In some embodiments, an end-capping modification can comprise amidation at the C-terminus, or introduction of C-terminal alcohol, aldehyde, ester, and thioester moieties.
[0073] In some embodiments, the TdT polypeptide variant comprises a single-codon mutation, for example, a single-codon mutation at amino acid residue R453. Exemplary single-codon mutants with confirmed activity (as assessed using methods described e.g., in the working Examples) include the variants listed in Table 3. In some embodiments, the single-codon mutant is selected from the variants in Table 3, Table 4 or Table 5.
Table 3 : Single-codon mutant variants with confirmed activity
Figure imgf000020_0001
Table 4: Top-performing single-codon mutant variants
Figure imgf000020_0002
[0074] In some embodiments, a single-codon mutant at R453 can exhibit an altered preference for divalent cation in comparison to wildtype. Examples of such single-codon mutants and their preferred substrate are found in Table 5.
Table 5: Single-codon mutants at R453 with altered preference for divalent cation in comparison to wildtype
Figure imgf000021_0001
[0075] In other embodiments, the TdT polypeptide variant comprises a double-codon mutation. In one embodiment, one of the two codon mutations in a double-codon mutation comprises a mutation at residue R453 (e.g., R453A). Exemplary double-codon mutants having activity confirmed as described in the working Examples include those listed in Table 6, 7 or 8.
Table 6: Double-codon mutant variants with confirmed activity
Figure imgf000021_0002
Figure imgf000022_0001
Table 7: Double codon-mutant variants with robust activity and some degree of bias
Figure imgf000022_0002
Table 8: Top performing double-codon mutant variants (with R453A constant)
Figure imgf000022_0003
[0076] In one embodiment, the modified TdT polypeptide variant comprises the mutations R453A-V432G.
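The mutation notation used above (wild-type residue, position, substituted residue, with multiple mutations joined by a hyphen) can be parsed and applied to a parent sequence as sketched below. This is an illustrative sketch only: the helper names are hypothetical, and the parent sequence is a placeholder of 509 "X" residues (the length stated above for the short isoform) with only positions 432 and 453 filled in. It is not the actual TdT sequence.

```python
import re

# One mutation label: wild-type residue, 1-based position, new residue (e.g., R453A).
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'R453A' into ('R', 453, 'A')."""
    match = MUTATION_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(sequence: str, labels: list[str]) -> str:
    """Apply substitutions to a sequence, checking the expected wild-type residue."""
    residues = list(sequence)
    for label in labels:
        wild_type, position, new_residue = parse_mutation(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: found {residues[position - 1]!r} at {position}")
        residues[position - 1] = new_residue
    return "".join(residues)


# Placeholder parent sequence (NOT a real TdT sequence): 'X' everywhere
# except the two positions needed for this example.
placeholder = ["X"] * 509
placeholder[453 - 1] = "R"
placeholder[432 - 1] = "V"
parent = "".join(placeholder)

variant = apply_mutations(parent, "R453A-V432G".split("-"))
print(variant[453 - 1], variant[432 - 1])
```

Checking the wild-type residue before substituting guards against numbering mismatches, for example when a sequence from a different isoform or species is supplied.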
Nucleic Acids and Vectors
[0077] In some embodiments, the technology described herein relates to a nucleic acid encoding a modified or variant TdT polypeptide as described herein. As used herein, the term “nucleic acid” or “nucleic acid sequence” refers to any molecule, preferably a polymeric molecule, incorporating units of ribonucleic acid, deoxyribonucleic acid or an analog thereof. The nucleic acid can be either single-stranded or double-stranded. A single-stranded nucleic acid can be one nucleic acid strand of a denatured double-stranded DNA. In one aspect, the nucleic acid is DNA. In another aspect, the nucleic acid is RNA. Suitable nucleic acid molecules include DNA, including genomic DNA or cDNA. Other suitable nucleic acid molecules include RNA, including mRNA.
[0078] In some embodiments, a nucleic acid encoding a modified or variant TdT polypeptide as described herein is comprised by a vector. In some of the aspects described herein, a nucleic acid sequence encoding a modified TdT polypeptide as described herein is operably linked to a vector. The term "vector", as used herein, refers to a nucleic acid construct designed for delivery to a host cell or for transfer between different host cells. As used herein, a vector can be viral or non-viral. The term “vector” encompasses any genetic element that is capable of replication when associated with the proper control elements and that can transfer gene sequences to cells. A vector can include, but is not limited to, a cloning vector, an expression vector, a plasmid, phage, transposon, cosmid, chromosome, virus, virion, etc.
[0079] As used herein, the term "expression vector" refers to a vector that directs expression of an RNA or polypeptide from sequences linked to transcriptional regulatory sequences on the vector. The sequences expressed will often, but not necessarily, be heterologous to the cell. An expression vector may comprise additional elements, for example, the expression vector may have two replication systems, thus allowing it to be maintained in two organisms, for example in human cells for expression and in a prokaryotic host for cloning and amplification.
[0080] As used herein, the term “viral vector” refers to a nucleic acid vector construct that includes at least one element of viral origin and has the capacity to be packaged into a viral vector particle. The viral vector can contain a nucleic acid encoding a mutant TdT polypeptide as described herein in place of non-essential viral genes. The vector and/or particle may be utilized for the purpose of transferring nucleic acids into cells either in vitro or in vivo. Numerous forms of viral vectors are known in the art.
Production of TdT polypeptides or variants
[0081] A TdT variant, as that term is used herein, can be produced chemically by, e.g., solution or solid-phase peptide synthesis, or semi-synthesis in solution beginning with protein fragments coupled through conventional solution methods, as described by Dugas et al. (1981). However, given the size and complexity of an enzyme, it is generally preferred to produce a TdT polypeptide or variant using, e.g., recombinant methods.
[0082] Thus, in one embodiment, the TdT polypeptide or variant is produced recombinantly. Systems for cloning and expressing polypeptides useful with the methods and compositions described herein include various microorganisms and cells that are well known in recombinant technology and thus are not described in detail herein. These include, for example, various strains of E. coli, Bacillus, Streptomyces, and Saccharomyces, as well as mammalian, yeast and insect cells. A TdT peptide or variant can be produced as a peptide or fusion protein, if so desired. Suitable vectors for producing peptides and polypeptides are known and available from private and public laboratories and depositories and from commercial vendors. Recipient cells capable of expressing the gene product are then transfected. The transfected recipient cells are cultured under conditions that permit expression of the recombinant gene products, which are recovered from the culture. Host mammalian cells, such as Chinese Hamster ovary cells (CHO) or COS-1 cells, can be used. These hosts can be used in connection with poxvirus vectors, such as vaccinia or swinepox. Suitable non-pathogenic viruses that can be engineered to carry the synthetic gene into the cells of the host include poxviruses, such as vaccinia, adenovirus, retroviruses and the like. A number of such non-pathogenic viruses are commonly used for human gene therapy, and as carriers for other vaccine agents, and are known and selectable by one of skill in the art. The selection of other suitable host cells and methods for transformation, culture, amplification, screening and product production and purification can be performed by one of skill in the art by reference to known techniques.
Purification of TdT polypeptides or variants
[0083] In some embodiments, it may be desirable to isolate and/or purify a synthesized TdT polypeptide or variant. Protein purification techniques are well known to those of skill in the art and as such are not described in detail herein. These techniques can involve, at one level, the homogenization and crude fractionation of the cells, tissue or organ to polypeptide and non-polypeptide fractions. The TdT peptide or variant can be further purified using chromatographic and electrophoretic techniques to achieve partial or complete purification (or purification to homogeneity). Analytical methods particularly suited to the preparation of a pure peptide or polypeptide are ion-exchange chromatography, gel exclusion chromatography, polyacrylamide gel electrophoresis, affinity chromatography, immunoaffinity chromatography and isoelectric focusing. A particularly efficient method of purifying peptides/polypeptides is fast performance liquid chromatography (FPLC) or even high performance liquid chromatography (HPLC).
[0084] A “purified TdT peptide/polypeptide or variant” is intended to refer to a composition, isolatable from other components, wherein the TdT peptide or variant is purified to any degree relative to the organism producing recombinant protein or in its naturally-obtainable state. An isolated or purified polypeptide, therefore, also refers to a peptide/polypeptide free from the environment in which it may naturally occur. In one embodiment, “purified” will refer to a TdT polypeptide composition that has been subjected to fractionation to remove various other components, and which composition substantially retains its expressed biological activity (i.e., TdT DNA polymerase activity). Where the term “substantially purified” is used, this designation will refer to a composition in which the TdT polypeptide forms the major component of the composition, such as constituting about 50%, about 60%, about 70%, about 80%, about 90%, about 95%, or more of the proteins in the composition.
[0085] There is no general requirement that the TdT polypeptide or variant be provided in the most purified state. Indeed, it is contemplated that less purified products will have utility in certain embodiments. Partial purification can be accomplished by using fewer purification steps in combination, or by utilizing different forms of the same general purification scheme. For example, it is appreciated that a cation-exchange column chromatography performed utilizing an HPLC apparatus will generally result in a greater “-fold” purification than the same technique utilizing a low pressure chromatography system. Methods exhibiting a lower degree of relative purification may have advantages in total recovery of protein product, or in maintaining the activity of an expressed protein.
[0086] Various methods for quantifying the degree of purification of a given TdT polypeptide or variant are known to those of skill in the art and include, for example, determining the specific activity of an active fraction, or assessing the amount of polypeptides within a fraction by SDS/PAGE analysis.
Measuring TdT activity
[0087] The enzymatic activity of TdT or a variant can be determined using any assay known to those of skill in the art; such assays are not described in detail herein. In one embodiment, the activity of TdT or a variant thereof is described by the amount of protein that is needed to catalyze the incorporation of a certain concentration of natural or non-natural nucleotides into a single-stranded polynucleotide sequence using an initiator strand that exists in solution or bound to a surface (e.g., de novo or in vitro). As will be appreciated by one of skill in the art, multiple enzyme assays can be run in parallel under a variety of different conditions, such as in the presence of different metal ion cofactors. In certain embodiments, the activity of a given TdT variant is compared to the activity of the protein from which it was derived. For example, the activity of a human TdT variant can be compared to the activity of wild-type human TdT (e.g., as a positive control). One of skill in the art will readily recognize that the activity of TdT and variants thereof will vary depending on the specific conditions in which the assay is performed. In one embodiment, the enzyme assay is performed using the preferred co-factor of endogenous TdT, Co2+. In other embodiments, the enzyme assay is performed in the presence of an alternative cofactor (e.g., Mn2+, Mg2+, Zn2+ etc.). In some embodiments, the activity of the TdT variant and the activity of, e.g., the wild-type TdT are measured using the same set of reaction conditions, which can directly show the effect of a given mutation(s) on the functional activity of TdT. In other embodiments, it may be desirable to compare the activity of a variant TdT determined under one set of conditions to the activity of an unmutated TdT under a second set of conditions. This is particularly useful when the cofactor preference of the TdT variant is suspected to be altered due to the mutation introduced. For example, one can compare the activity of the variant TdT in the presence of Mg2+ to the activity of wild-type TdT in the presence of Co2+. Such an assay can be useful to determine the effects of a given mutation on function of the variant TdT as compared to the wild-type TdT under preferred, endogenous conditions (i.e., in the presence of Co2+ as a co-factor).
[0088] The terms "multiplex" and "high-throughput" are used to describe the parallelizable nature of a kinetic assay by having the ability to determine the individual enzymatic activity of less than, equal to or greater than 96 protein variants and/or reaction conditions in a single experiment. In an exemplary multiplex kinetic assay, the activity of multiple purified TdT variants can be determined by the rate at which long single-stranded polynucleotide sequences are produced by measuring the fluorescent response in Relative Fluorescence Units (RFU) of a nucleic acid stain that is highly specific for single-stranded DNA. The accuracy of the kinetic assay is characterized by a minimal observable fluorescent response if double-stranded DNA contaminants are present in the reaction vessel or if single-stranded polynucleotide sequences form unintended secondary structures such as hairpins, stem-loop structures or G-quadruplexes and the like. Terminal deoxynucleotidyl transferase activity is only present if an observable increase in fluorescent signal occurs in comparison to a negative control consisting of only initiator strand, free nucleotides, cofactor and appropriate buffers. In addition, a positive control consisting of commercially available terminal transferase, such as bovine terminal deoxynucleotidyl transferase (New England Biolabs, Inc.), may also be used to relatively gauge the activity of purified template-independent DNA polymerase variants or complexes. Single-stranded nucleic acid fluorescent stains suitable for kinetic assays are known to those of skill in the art and are described in ThermoFisher Scientific Inc., The Molecular Probes Handbook, Nucleic Acid Detection and Analysis, Chapter 8, Nucleic Acid Stains, Section 8.1, hereby incorporated by reference in its entirety.
[0089] To further quantitate the rate unit of each individual purified template-independent DNA polymerase variant, a concentration curve consisting of a single polynucleotide sequence greater than 10 nucleotides can be generated to yield a set of standardized fluorescent signals. Because the fluorescent response in the presence of TdT activity is directly correlated to the amount of single-stranded polynucleotide present at a given reaction time interval, the exact amount of polynucleotide in terms of mass can be interpolated from the concentration versus RFU curve and tracked throughout the progression of the reaction. This produces a rate unit for a particular amount of protein in terms of "mass increase in single-stranded polynucleotide per minute". The rate unit for this kinetic assay can be further quantitated given additional reaction parameters such as free nucleotide composition, cofactors, and initiator sequence composition as well as each component's respective concentration. This kinetic assay provides a highly accurate and standardized method to specifically determine the best-candidate TdT variants in a cost-efficient and high-throughput activity screen.
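The interpolation and rate calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the assay's actual data analysis: the standard curve is treated as piecewise linear between measured points, the rate is taken from the first and last readings only, and all function names, units (ng, minutes) and numeric values are hypothetical.

```python
from bisect import bisect_right


def interpolate_mass(rfu: float, standard_curve: list[tuple[float, float]]) -> float:
    """Interpolate polynucleotide mass from an RFU reading.

    standard_curve holds (mass_ng, rfu) pairs from a concentration series of a
    known polynucleotide; linear interpolation between neighbors is an assumption.
    """
    points = sorted(standard_curve, key=lambda p: p[1])
    rfus = [p[1] for p in points]
    if rfu < rfus[0] or rfu > rfus[-1]:
        raise ValueError("reading lies outside the standard curve")
    i = bisect_right(rfus, rfu)
    if i == len(points):
        return points[-1][0]  # reading equals the highest standard
    (m0, r0), (m1, r1) = points[i - 1], points[i]
    return m0 + (m1 - m0) * (rfu - r0) / (r1 - r0)


def mass_rate_per_minute(times_min: list[float], rfu_readings: list[float],
                         standard_curve: list[tuple[float, float]]) -> float:
    """Mass increase in single-stranded polynucleotide per minute (endpoint estimate)."""
    masses = [interpolate_mass(r, standard_curve) for r in rfu_readings]
    return (masses[-1] - masses[0]) / (times_min[-1] - times_min[0])


# Hypothetical standard curve and time course, for illustration only.
curve = [(0.0, 100.0), (10.0, 200.0), (20.0, 300.0)]
print(mass_rate_per_minute([0.0, 5.0, 10.0], [100.0, 150.0, 200.0], curve))
```

Dividing the resulting rate by the amount of protein in the reaction would give a rate per unit of enzyme, which allows variants assayed at different protein concentrations to be compared.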
Compositions comprising a TdT polypeptide or variant
[0090] In one embodiment, the TdT polypeptide or variant (e.g., isolated, synthetic, or recombinant peptide) is attached to, or enclosed or enveloped by, a macromolecular complex. The macromolecular complex can be, without limitation, a virus, a bacteriophage, a bacterium, a liposome, a microparticle, a targeting sequence, a nanoparticle (e.g., a gold nanoparticle), a magnetic bead, a yeast cell, a mammalian cell, a cell or a microdevice. These are representative examples only and macromolecular complexes within the scope of the methods and compositions described herein can include virtually any complex that can attach or enclose a peptide/polypeptide and be used in methods for de novo synthesis of oligonucleotides or other nucleic acids.
[0091] If desired, the isolated TdT polypeptide or variant can be attached to a solid support e.g., for purification of the TdT polypeptide or variant and/or ease of removing nucleic acid products generated in an enzymatic reaction mix from the TdT polypeptide when desired, such as, for example, magnetic beads, Sepharose beads, agarose beads, nanoparticles, a nitrocellulose membrane, a nylon membrane, a column chromatography matrix, a high performance liquid chromatography (HPLC) matrix or a fast performance liquid chromatography (FPLC) matrix for purification. Additional suitable supports include, but are not limited to, slides, beads, chips, particles, strands, gels, sheets, tubing, spheres, containers, capillaries, pads, slices, films, plates and the like. In various embodiments, a solid support may be biological, nonbiological, organic, inorganic, or any combination thereof. Supports for use with TdT polypeptides or variants can be any shape, size, or geometry as desired. For example, the support may be square, rectangular, round, flat, planar, circular, tubular, spherical, and the like. When using a support that is substantially planar, the support may be physically separated into regions, for example, with trenches, grooves, wells, or chemical barriers (e.g., hydrophobic coatings, etc.). Supports may be made from glass (silicon dioxide), metal, ceramic, polymer or other materials known to those of skill in the art. Supports may be a solid, semi-solid, elastomer or gel.
[0092] TdT polypeptide sequences can be bound to such supports or substrates using methods, linkers (cleavable or non-cleavable) and chemistry known to those of skill in the art.
[0093] In one embodiment, the TdT polypeptide or variant comprises a fusion protein. These molecules generally have all or a substantial portion of the TdT peptide/variant, linked at the N- or C-terminus, to all or a portion of a second polypeptide or protein. For example, fusions may employ leader sequences from other species to permit the recombinant expression of a protein in a heterologous host. Another useful fusion includes the addition of an immunologically active domain, such as an antibody epitope, to facilitate purification of the fusion protein. Inclusion of a cleavage site at or near the fusion junction will facilitate removal of the extraneous polypeptide after purification. Other useful fusions include linking of functional domains, such as, for example, active sites from enzymes, glycosylation domains, cellular targeting signals or transmembrane regions.
High Capacity Data Storage
[0094] The TdT variants described herein can be used in the synthesis of nucleic acids for the purpose of storing digital information in nucleic acids, such as DNA.
[0095] DNA has the capacity to hold vast amounts of information, readily stored for long periods in a compact form. The high capacity of DNA to store information stably under easily achieved conditions has made DNA an attractive target for information storage since the mid-1990s. In addition to information density, DNA molecules have a longevity that permits long-term storage with little to no deterioration of the encoded information. Data storage systems based on both living vector DNA (in vivo DNA molecules) and synthesized DNA (in vitro DNA) have been proposed. Given that in vivo DNA storage systems have constraints on the quantity, genomic elements and locations that can be manipulated without affecting viability of the DNA molecules in the living vector organisms (e.g., bacteria), in vivo DNA storage is not the preferred method for high capacity data storage.
[0096] In contrast, “isolated DNA” or “in vitro or de novo synthesized DNA” is more easily “written”, and routine recovery of examples of the non-living DNA from samples that are tens of thousands of years old indicates that a well-prepared non-living DNA sample can have an exceptionally long lifespan in easily achieved low-maintenance environments (i.e., cold, dry and dark environments).
[0097] Previous work on the storage of information (also termed data) in DNA has typically focused on “writing” a human-readable message into the DNA in encoded form, and then “reading” the encoded human-readable message by determining the sequence of the DNA and decoding the sequence. Work in the field of DNA computing has given rise to schemes that in principle permit large-scale associative (content-addressed) memory, but there have been no attempts to develop this work as practical DNA-storage schemes. Other methods for storing information in DNA include using an encoding method in which 4 DNA bases represent each character of an extended ASCII character set. A synthetic DNA molecule is then produced, which includes the digital information and an encryption key, and is flanked on each side by a primer sequence. Finally, the synthesized DNA is incorporated in a storage DNA. In the event that the amount of DNA is too large, the information can be fragmented into a number of segments.
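By way of illustration, an encoding in which four DNA bases represent each character of an extended ASCII character set can be sketched as below. The specific mapping is an assumption made for this example and is not taken from the prior work referenced above: each base carries two bits (A=00, C=01, G=10, T=11, most significant bits first), and Latin-1 is used as a stand-in for an 8-bit extended ASCII set. The sketch omits the encryption key, flanking primer sequences and segmentation.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def encode_text(text: str) -> str:
    """Encode text as DNA, four bases per 8-bit character (Latin-1 assumed)."""
    dna = []
    for byte in text.encode("latin-1"):
        # Emit the byte two bits at a time, most significant pair first.
        for shift in (6, 4, 2, 0):
            dna.append(BASES[(byte >> shift) & 0b11])
    return "".join(dna)


def decode_dna(dna: str) -> str:
    """Decode a DNA string produced by encode_text back into text."""
    if len(dna) % 4 != 0:
        raise ValueError("DNA length must be a multiple of 4")
    data = bytearray()
    for i in range(0, len(dna), 4):
        value = 0
        for base in dna[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        data.append(value)
    return data.decode("latin-1")


message = "TdT"
encoded = encode_text(message)
print(encoded)
print(decode_dna(encoded))
```

Because four bases at two bits each give 256 combinations, every 8-bit character has exactly one four-base codeword, and decoding is a direct reversal of encoding.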
[0098] In one embodiment, the methods and compositions provided herein relate to an enzymatic method of making a polynucleotide or nucleic acid sequence. Thus, provided herein are methods for de novo synthesis of nucleic acid sequences using a TdT variant as described herein for the purpose of storing digital information in DNA. In one embodiment, the method includes combining at least one selected nucleotide triphosphate, one or more cations, and a TdT variant in an aqueous reaction medium including a target substrate comprising an initiator sequence and having a 3' terminal nucleotide attached to a single stranded portion, such that the template-independent polymerase interacts with the target substrate under conditions which covalently add one or more of the selected nucleotide triphosphate to the 3' terminal nucleotide.
[0099] In another embodiment, for methods that employ a first reaction in the presence of a single selected nucleotide triphosphate, the method can further include repeatedly introducing an additional subsequent selected nucleotide triphosphate to the aqueous reaction medium under conditions which enzymatically add one or more of the subsequent selected nucleotide triphosphate to the target substrate until the polynucleotide is formed.
[00100] In some embodiments, it may be desirable to attach a TdT variant as described herein to a macromolecule or solid support in a method of generating a polynucleotide. As one of skill in the art will appreciate, a TdT variant can be contacted with the components necessary for an enzymatic reaction that produces polynucleotide products in a flow-through manner, while the TdT variant is conjugated to a solid support. Alternatively, the solid support can comprise a growing polynucleotide strand and an untethered TdT variant can be removed from the solid support to stop the enzymatic reaction. Other methods where one or more of the reaction components are attached to a solid support can be readily envisioned by one of skill in the art.
[00101] In some embodiments, conditions sufficient to synthesize one or more nucleic acid molecules using the TdT variants described herein can include one or more nucleotides, one or more buffers or buffering salts, and one or more cofactors (e.g., divalent metal ions). In some embodiments, conditions sufficient to synthesize one or more nucleic acid molecules according to the invention may include incubating at an elevated temperature (e.g., greater than about 37° C., 40° C., 45° C., 50° C., 55° C., 60° C., 65° C., 70° C., 75° C., 80° C., 85° C., 90° C., or 95° C.) and/or in the presence of one or more deoxy- or dideoxyribonucleoside triphosphates. Suitable deoxy- and dideoxyribonucleoside triphosphates include, but are not limited to, dATP, dCTP, dGTP, dTTP, dITP, 7-deaza-dGTP, 7-deaza-dATP, ddUTP, ddATP, ddCTP, ddGTP, ddITP, ddTTP, [α-S]dATP, [α-S]dTTP, [α-S]dGTP, and [α-S]dCTP. In some embodiments, the conditions may comprise a suitable concentration of at least one divalent metal cofactor. In some embodiments, the conditions may comprise more than one divalent metal cofactor.
[00102] Nucleic acids synthesized using the methods and compositions described herein can be applied to the storage of digital information in DNA as known to those of skill in the art.
De novo Oligonucleotide Synthesis
[00103] The TdT variants as described herein can be used with any method known in the art for the purpose of generating oligonucleotides de novo. Oligonucleotides synthesized using the methods and/or TdT variants described herein comprise, in various embodiments, at least about 5, 10, 15, 20, 30, 40, 50, 60, 70, 75, 80, 90, 100, 120, 150 or more bases. In some embodiments, at least about 1 pmol, 10 pmol, 20 pmol, 30 pmol, 40 pmol, 50 pmol, 60 pmol, 70 pmol, 80 pmol, 90 pmol, 100 pmol, 150 pmol, 200 pmol, 300 pmol, 400 pmol, 500 pmol, 600 pmol, 700 pmol, 800 pmol, 900 pmol, 1 nmol, 5 nmol, 10 nmol, 100 nmol or more of a desired oligonucleotide is synthesized.
[00104] In some embodiments, oligonucleotide synthesis is performed on a surface to allow for synthesis at a fast rate. As an example, at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 125, 150, 175, 200 nucleotides per hour, or more are synthesized. In some embodiments, libraries of oligonucleotides are synthesized in parallel on a substrate.
Kits
[00105] The wild type and variant TdT polymerases are well suited for the preparation of a kit, for example, for de novo DNA or oligonucleotide synthesis. Kits comprising wild type TdT or variant TdT polypeptides can be configured for use in any procedure known to those skilled in the art. Suitable kits can be prepared for, for example, cDNA synthesis and/or amplification, detectably labeling DNA molecules, and DNA sequencing. Such kits can comprise a carrier that can be compartmentalized to receive in close confinement one or more containers such as vials, test tubes, wells, solid supports, chips and the like. Preferably at least one of such containers contains components or a mixture of components needed to perform de novo oligonucleotide or DNA synthesis.
[00106] In some embodiments, a kit as described herein comprises a container having a substantially purified sample of a TdT variant of the invention. In another embodiment, the kit comprises a container(s) having one or more nucleotides needed to synthesize a DNA molecule. In another embodiment, a kit comprises a container having one or a number of different types of dideoxynucleoside triphosphates, optionally labeled with one or more detectable groups. In some embodiments, a kit as described herein can comprise pyrophosphatase.
[00107] Kits for DNA synthesis can comprise a first container containing a TdT variant polymerase as described herein, and one or more containers each having one, two, three or four dNTPs. Of course, it is also possible to combine one or more of these reagents in a single tube or other containers. When desired, the kit of the present invention may include one or more containers that contain detectably labeled nucleotides that may be used during the synthesis or sequencing of a DNA molecule. One or a number of labels may be used to detect such nucleotides. Illustrative labels include, but are not limited to, radioactive isotopes, fluorescent labels, chemiluminescent labels, nuclear tags, bioluminescent labels and enzyme labels.
[00108] All patents and other publications, including literature references, issued patents, published patent applications, and co-pending patent applications, cited throughout this application are expressly incorporated herein by reference for the purpose of describing and disclosing, for example, the methodologies described in such publications that might be used in connection with the technology described herein. These publications are provided solely for their disclosure prior to the filing date of the present application. Nothing in this regard should be construed as an admission that the inventors are not entitled to antedate such disclosure by virtue of prior invention or for any other reason. All statements as to the date or representation as to the contents of these documents are based on the information available to the applicants and do not constitute any admission as to the correctness of the dates or contents of these documents.
[00109] The description of embodiments of the disclosure is not intended to be exhaustive or to limit the disclosure to the precise form disclosed. While specific embodiments of, and examples for, the disclosure are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the disclosure, as those skilled in the relevant art will recognize. For example, while method steps or functions are presented in a given order, alternative embodiments may perform functions in a different order, or functions may be performed substantially concurrently. The teachings of the disclosure provided herein can be applied to other procedures or methods as appropriate. The various embodiments described herein can be combined to provide further embodiments. Aspects of the disclosure can be modified, if necessary, to employ the compositions, functions and concepts of the above references and application to provide yet further embodiments of the disclosure. Moreover, due to biological functional equivalency considerations, some changes can be made in protein structure without affecting the biological or chemical action in kind or amount. These and other changes can be made to the disclosure in light of the detailed description. All such modifications are intended to be included within the scope of the appended claims.
[00110] Specific elements of any of the foregoing embodiments can be combined or substituted for elements in other embodiments. Furthermore, while advantages associated with certain embodiments of the disclosure have been described in the context of these embodiments, other embodiments may also exhibit such advantages, and not all embodiments need necessarily exhibit such advantages to fall within the scope of the disclosure.
[00111] The technology described herein is further illustrated by the following examples which in no way should be construed as being further limiting.
[00112] In some embodiments, the invention may be as claimed in any one of the following numbered paragraphs.
[00113] 1. A modified terminal deoxynucleotidyl transferase (TdT) polypeptide, wherein the modified TdT polypeptide comprises a sequence having at least one amino acid mutation and retains at least 10% of the template-independent DNA polymerase activity of the TdT polypeptide from which the modified TdT is derived.
[00114] 2. The modified TdT enzyme of paragraph 1, wherein the modified TdT comprises at least 50% of the template-independent DNA polymerase activity of the TdT polypeptide from which it is derived.
[00115] 3. The modified TdT enzyme of paragraph 1 or 2, wherein the protein sequence of the modified TdT comprises at least two amino acid mutations compared to the TdT polypeptide from which it is derived.
[00116] 4. The modified TdT enzyme of paragraph 1, 2, or 3, wherein the TdT polypeptide from which the modified TdT is derived is a human TdT polypeptide.
[00117] 5. The modified TdT enzyme of paragraph 4, wherein the human TdT polypeptide comprises SEQ ID NO: 1.
[00118] 6. The modified TdT enzyme of any one of paragraphs 1-5, wherein the modified TdT comprises a cofactor preference that is different from the cofactor preference of the TdT polypeptide from which it is derived.
[00119] 7. The modified TdT enzyme of any one of paragraphs 1-6, wherein the modified TdT comprises a reduced degree of bias compared to the degree of bias of the TdT polypeptide from which it is derived under substantially similar enzyme assay conditions.
[00120] 8. The modified TdT enzyme of any one of paragraphs 1-7, wherein the modified TdT comprises at least one amino acid mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459.
[00121] 9. The modified TdT enzyme of any one of paragraphs 1-8, wherein the modified TdT comprises a mutation at R453 and at least one additional mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459.
[00122] 10. The modified TdT enzyme of paragraph 9, wherein the mutation at R453 is R453A.
[00123] 11. The modified TdT enzyme of paragraph 9 or 10, wherein the at least one additional mutation occurs at amino acid residue V432.
[00124] 12. The modified TdT enzyme of any one of paragraphs 9-11, wherein the mutation at amino acid residue V432 is V432G.
[00125] 13. The modified TdT enzyme of any one of paragraphs 1-12, comprising a sequence selected from Tables 3-8.
[00126] 14. A method for generating a polynucleotide sequence de novo or in vitro, the method comprising: incubating a modified TdT enzyme of paragraph 1 in the presence of an initiator sequence, a cofactor and at least one nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3’ end of a polynucleotide strand.
[00127] 15. The method of paragraph 14, wherein the modified TdT enzyme or the initiator sequence is conjugated to a solid support.
[00128] 16. The method of paragraph 15, wherein the solid support comprises a bead, a membrane, or a column.
[00129] 17. The method of any one of paragraphs 14-16, wherein the cofactor is a divalent cation.
[00130] 18. The method of paragraph 17, wherein the divalent cation is Co2+, Mn2+, Mg2+ or Zn2+.
[00131] 19. The method of any one of paragraphs 14-18, wherein the modified TdT enzyme of paragraph 1 is incubated in the presence of 2, 3, or 4 nucleoside triphosphates.
[00132] 20. The method of any one of paragraphs 14-19, further comprising a second step of incubating a modified TdT enzyme of paragraph 1 in the presence of an initiator sequence, a cofactor and at least one different nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3’ end of a polynucleotide strand.
[00133] 21. The method of paragraph 20, wherein the step is repeated once or twice each in the presence of at least one different nucleotide.
[00134] 22. A nucleic acid molecule encoding any one of the modified TdT enzymes of paragraphs 1-13.
[00135] 23. A vector comprising the nucleic acid molecule of paragraph 22.
[00136] 24. A cell comprising the modified TdT polypeptide of paragraphs 1-13, the nucleic acid molecule of paragraph 22, and/or the vector of paragraph 23.
[00137] 25. The cell of paragraph 24, wherein the cell is a bacterial cell.
[00138] 26. A solid support comprising a modified TdT enzyme of paragraphs 1-13.
EXAMPLES
[00139] Terminal deoxynucleotidyl transferase (TdT) is an exceedingly useful template-independent DNA polymerase for major biotechnological applications such as the storage of digital information and de novo oligonucleotide synthesis. TdT has the ability to rapidly catalyze the synthesis of long DNA oligonucleotides in the presence of only a small initiator sequence, cofactors, and nucleoside triphosphate monomers. While it is well-known that wild-type TdT can synthesize DNA oligonucleotides with lengths >1400-nt without a template sequence and is highly promiscuous as compared to other polymerases, TdT exhibits substrate bias towards both the preferred initiator sequence composition and nucleoside triphosphate base (A, G, C, or T). This bias greatly inhibits the ability for TdT to be utilized as a universal template-independent DNA polymerase and severely limits the capacity for the precise control of TdT in any oligonucleotide synthesis scheme. Thus, functional mutant variants of TdT that display improved enzymatic phenotypes, such as reduced substrate bias, are highly desired to enable a multitude of commercially viable biotechnological applications. Because the protein functional landscape of TdT remains largely unexplored, it was sought to begin mapping it by generating libraries of TdT mutant variants via site-directed mutagenesis with the overall intention of improving the wild-type enzyme.
EXAMPLE 1: ENZYME ENGINEERING APPROACH
[00140] Rational site-directed mutagenesis was employed as a primary method for generating TdT mutant variant libraries in that the technique is low-cost and well-practiced in the protein engineering field. Based on a combination of protein structural analysis and previous reports of single-codon TdT mutant variant functionality analysis, several amino acid residues were identified that are important for maintaining protein structural integrity, reaction catalysis, and substrate binding (cofactor, initiator sequence, & nucleoside triphosphate). These amino acid residues, based on the wild-type (WT) human TdT sequence (Uniprot #P04053), included, but are not limited to, S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459. Mutant variant generation proceeded hierarchically: single-codon mutants were first generated and evaluated for initial activity and/or the desired enzymatic phenotype followed by double-codon mutants and then, if necessary, triple-codon mutants.
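The hierarchical design described above, with single-codon substitutions first and further substitutions stacked afterward, can be sketched in a few lines of Python. This is only an illustration of the mutation notation (for example R453A); the five-residue toy sequence, the positions, and the function names are hypothetical and are not taken from SEQ ID NO: 1.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")  # e.g. "R453A"


def apply_mutation(seq: str, mutation: str) -> str:
    """Apply one substitution written as <wild-type><1-based position><new>."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    wild_type, pos, new = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(seq) or seq[pos - 1] != wild_type:
        raise ValueError(f"{mutation} does not match the sequence")
    return seq[:pos - 1] + new + seq[pos:]


def apply_mutations(seq: str, mutations: list) -> str:
    """Apply several substitutions in order (double- or triple-codon mutants)."""
    for mutation in mutations:
        seq = apply_mutation(seq, mutation)
    return seq


def single_mutants(seq: str, pos: int) -> list:
    """List every possible single substitution at a 1-based position."""
    wild_type = seq[pos - 1]
    return [f"{wild_type}{pos}{aa}" for aa in AMINO_ACIDS if aa != wild_type]


if __name__ == "__main__":
    toy = "MDPRV"  # hypothetical toy sequence, NOT human TdT
    print(single_mutants(toy, 4)[:3])            # ['R4A', 'R4C', 'R4D']
    print(apply_mutations(toy, ["R4A", "V5G"]))  # MDPAG
```

Checking the wild-type letter before substituting is a deliberate choice here: it catches off-by-one numbering errors between a sequence file and the residue numbering used in the text.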
[00141] While the focus was initially on generating mutant variants of wild-type human TdT, many of the chosen residues are highly conserved among a multitude of species. It is expected that the TdT mutant variants derived from other organisms will function similarly with varying degrees of magnitude or potentially have radically different enzymatic phenotypes. Furthermore, generating these cross-species mutant variants allowed the expansion of the diversity of sequences while minimizing the difficult task of processing large-scale, random mutations. Thus, mutant variant libraries were designed and built from TdT sequences originating from species including but not limited to wild-type Mus musculus, Bos taurus, Monodelphis domestica, Eulemur macaco, Xenopus laevis, Ambystoma mexicanum, Oncorhynchus mykiss, and Gallus. Additional mutant variation may arise from truncations or removal of protein domains associated with in vivo enzymatic activities, such as DNA repair mechanisms, that are unnecessary for in vitro DNA oligonucleotide synthesis.
Mutagenesis of WT Human TdT Alters the Preference for Divalent Cation Cofactors Needed for Enzymatic Activity
[00142] It is well established that TdT requires the presence of a divalent cation cofactor for optimal enzymatic activity. Synthesis reactions are typically supplemented with Co2+; however, alternative divalent cations with varying properties such as Mg2+, Mn2+, Zn2+, and combinations thereof have been reported to be compatible with TdT. Because divalent cations are directly involved in the binding and catalysis of the nucleoside triphosphates onto the growing oligonucleotide, it was hypothesized that any human TdT mutant variants generated may have altered cofactor requirements due to changes in structure, polarity, or hydrophobicity of the catalytic pocket. Thus, in order to perform a thorough evaluation of TdT mutant variants, enzymatic activity was screened in the presence of 0.25 mM Co2+, Mn2+ or Mg2+. Reaction supplementation with Zn2+ is least tolerated by TdT, and deviations from a concentration of 0.25 mM cofactor generally result in decreased enzymatic activity.
[00143] Mutagenesis of human TdT was first performed at the amino acid residue R453, as it protrudes into the catalytic pocket and appears to directly interact with the incoming nucleoside triphosphate (FIG. 1). Of the 18 single-codon mutant variants generated, 7 were on average better than the “no-change” control sample in terms of the normalized rate and overall concentration of single-stranded (ss)DNA generation after 60 minutes of reaction time (FIG. 2A). The remaining 11 mutant variants were observed to have lower activity, which indicates that mutagenesis at this residue is well tolerated by the enzyme. While it was found that the majority of the single-codon mutants were most active in the presence of Co2+, many were more active with Mn2+ or Mg2+. For example, the single-codon mutant R453H displayed the highest activity when the reaction was supplemented with Mn2+; however, supplementation with Co2+ and Mg2+ still resulted in appreciable activity. Denaturing gel electrophoresis analysis of these reactions indicates that long oligonucleotides >400-nt were synthesized in the presence of each cofactor, as compared to the wild-type, no-change human TdT, where long oligonucleotides were only synthesized in the presence of Co2+ (FIG. 2B). Similarly, a large array of different cofactor preferences was observed when screening double-codon mutant variants carrying the constant amino acid change R453A (FIG. 3).
[00144] The results obtained from the initial screen of both single- and double-codon mutant variants of human TdT indicate that improvements over the wild-type enzyme may be observed as increased flexibility in the presence of variable reaction conditions or substrates. Therefore, novel mutant variants may be, for example, less temperature or pH sensitive, highly processive, able to synthesize DNA oligonucleotides at faster rates, able to incorporate non-natural nucleotides, or may only display their enzymatic phenotypes when reactions are supplemented with the preferred cofactor. However, mutant variants such as R453H are highly desired as they may function optimally regardless of reaction and substrate type or composition.
Double-Codon Mutant Variants of human TdT Display Decreased Substrate Bias & Specificity
[00145] In addition to evaluating the cofactor preference of single- and double-codon mutant variants, it was sought to determine if mutagenesis also induced changes in the preferred nucleoside triphosphate or initiator oligonucleotide sequence substrate. Using the structural modeling information as a guide (FIG. 1), the large protruding R453 amino acid residue was substituted with the small, aliphatic residue alanine (R453A). As previously described, single substitution at this residue position yielded mutant variants with altered functionality in terms of cofactor specificity; however, the nucleoside triphosphate or initiator oligonucleotide sequence substrate bias observed in wild-type human TdT extension reactions remained in those of single-codon mutant variants. Interestingly, radical changes in several of our double-codon mutant variants (with R453A a constant) were observed in which either all four nucleoside triphosphates were incorporated equally and/or the composition of the oligonucleotide appeared to affect enzyme bias significantly less. This mirrored the previous findings in which double-codon mutant variants were observed to display decreased specificity towards the preferred divalent cofactor metal.
[00146] Using a ssDNA activity assay, each double-codon mutant variant was evaluated for its ability to incorporate each of the four natural nucleoside triphosphate bases in reactions supplemented with Mn2+. Of the 100 mutant variants produced, it was found that several displayed an enhanced ability to incorporate all four bases given a particular oligonucleotide initiator sequence (FIG. 4). For example, it is generally accepted that human TdT is very active when adding dATP to a homopolymer dT initiator oligonucleotide but not active in the presence of dTTP under similar conditions; however, the mutant variant R453A-V432G allowed all four natural nucleoside triphosphate bases to be incorporated at similar rates, producing very long fragments of ssDNA (>1400-nt) using the same homopolymer dT oligonucleotide (FIG. 5). Interestingly, this mutant variant of hTdT retained this decreased substrate bias and specificity in the presence of a homopolymer dA oligonucleotide, with which TdT could be active when adding dTTP but not active in the presence of dATP (FIG. 6). While R453A-V432G was the best performing mutant variant identified to date in terms of most significantly decreased substrate bias and specificity, other mutant variants of human TdT that display similar enzymatic phenotypes are also of particular interest. These results are a primer for further mutagenesis of human TdT that will be useful for incorporating 3’-modified reversible terminator nucleotides used in DNA de novo synthesis or for improved general enzymatic oligonucleotide synthesis for applications like the digital encoding of information in DNA.
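One way to put a number on "incorporated at similar rates" is a simple dispersion score across the four nucleoside triphosphates. The Python sketch below uses the coefficient of variation of per-base incorporation rates. This metric, the function name, and the placeholder values are assumptions chosen for illustration; the text does not state how bias was quantified, and the numbers are not measured data.

```python
from statistics import mean, pstdev
from typing import Mapping


def bias_score(rates: Mapping[str, float]) -> float:
    """Coefficient of variation of per-nucleotide incorporation rates.

    0.0 means all bases are incorporated at the same rate; larger values
    mean the enzyme favors some bases over others. This is an assumed,
    illustrative metric, not the one used in the described experiments.
    """
    values = list(rates.values())
    average = mean(values)
    if average <= 0:
        raise ValueError("at least one rate must be positive")
    return pstdev(values) / average


if __name__ == "__main__":
    # Placeholder rates in arbitrary units, invented for this example only.
    unbiased = {"dATP": 1.0, "dCTP": 1.0, "dGTP": 1.0, "dTTP": 1.0}
    one_base_only = {"dATP": 4.0, "dCTP": 0.0, "dGTP": 0.0, "dTTP": 0.0}
    print(bias_score(unbiased))       # 0.0
    print(bias_score(one_base_only))  # about 1.732
```

Because the score is normalized by the mean rate, it compares variants by how evenly they use the four bases rather than by how fast they are overall, so activity would still need to be reported alongside it.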
Double-Codon Mutant Variants of human TdT Display an Ability to Efficiently Incorporate Ribonucleotides
[00147] It was previously established that TdT can incorporate ribonucleotides in addition to deoxyribonucleotides; however, generally only 1-2 ribonucleotides can be added, as the growing DNA oligonucleotide becomes a less preferred substrate for TdT. In addition to finding that the double-codon mutants display decreased substrate bias for deoxynucleotides, it was found that top performing variants can efficiently incorporate all four natural ribonucleoside triphosphates. For example, R453A-V432G produced long fragments of ssRNA in comparison to single-codon mutant R453A (FIG. 7).
Materials & Methods
Enzyme Expression and Purification
[00148] The primary sequences of wild-type or mutant enzymes of interest were codon optimized for E. coli expression using a custom optimization algorithm and ordered as gBlocks® (IDT) with 20-nt overlap sequences for Gibson Assembly into the pET-28-c-(+) His-tag expression vector (EMD Millipore 69866-3). Using forward and reverse primers from IDT, the gBlocks® were PCR amplified with Phusion High Fidelity (HF) Polymerase (NEB M05030). The PCR thermocycling was performed as follows: initial denaturation at 98°C for 30 seconds; 18 cycles of denaturation at 98°C for 10 seconds, annealing at 68°C for 10 seconds, and extension at 72°C for 60 seconds; and a final extension of 5 minutes at 72°C. PCR reactions were purified and concentrated using a QIAquick PCR Purification Kit (Qiagen 28106). The pET-28-c-(+) expression vector was prepared for gBlocks® insertion by digesting the circular DNA with 40U of NdeI (NEB R0111) per 500 ng vector at 37°C for 90 minutes. The linear DNA was separated from undigested material with 2% agarose gel electrophoresis and extracted by incubating agarose containing the bands corresponding to the linear DNA in Buffer QG (Qiagen 19063) at 55°C, rotating at 1000 RPM, for 2 hours. The resultant mixture was cleaned and concentrated with the QIAquick PCR Purification Kit. The PCR amplified insert and vector sequences were combined at a ratio of 1:3 with 0.1 pmol of total material and assembled with Gibson Assembly Master Mix (NEB E5510S) at 50°C for 1 hour. T7 Express chemically competent E. coli (NEB C2566I) were transformed with the fully assembled plasmid as per manufacturer’s instructions and positive transformants were selected for on LB-kanamycin plates (50 µg/mL kanamycin).
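For planning purposes, the insert-amplification program above can be tallied with a short Python sketch. It sums only the programmed hold times (initial denaturation, 18 cycles, final extension) and ignores ramp times between temperatures, which depend on the instrument. The `Step` class and `program_hold_seconds` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Step:
    """One hold in a thermocycling program."""
    name: str
    temp_c: float
    seconds: int


def program_hold_seconds(initial: Step, cycle: Sequence[Step],
                         n_cycles: int, final: Step) -> int:
    """Total programmed hold time in seconds, excluding temperature ramps."""
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + n_cycles * per_cycle + final.seconds


# Values transcribed from the insert-amplification program described above.
INITIAL = Step("initial denaturation", 98.0, 30)
CYCLE = (
    Step("denaturation", 98.0, 10),
    Step("annealing", 68.0, 10),
    Step("extension", 72.0, 60),
)
FINAL = Step("final extension", 72.0, 5 * 60)

if __name__ == "__main__":
    total = program_hold_seconds(INITIAL, CYCLE, 18, FINAL)
    print(total, "s =", total / 60, "min")  # 1770 s = 29.5 min
```

The actual run will take longer than this figure once ramping is included.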
[00149] Bacterial colonies were sequenced (Genewiz, T7 - Forward Primer, T7 Term - Reverse Primer) and those with perfect matches were grown in liquid LB-kanamycin media (50 µg/mL kanamycin) overnight, diluted 1:400 in fresh liquid LB-kanamycin, and induced with 1 mM IPTG (Sigma 16758) at approximately OD600 = 0.8. The induced liquid cultures were incubated overnight at 15°C, shaking at 250 RPM. Cultures were then pelleted at 3500 x g for 10 minutes and then His-Tag purified using a HisTalon Resin Kit as per manufacturer’s instructions (Clontech 635654). The eluted enzyme samples were then buffer exchanged into an optimal 2X protein storage buffer using 15-mL filter columns (Millipore) at the appropriate MWCO by centrifugation at 5000 x g for 15 minutes at 4°C. This process was repeated twice. On the third spin, samples were spun for 30 minutes in order to concentrate the protein into a smaller volume. Two small aliquots were taken to determine the overall protein concentration using a Reducing Agent Compatible MicroBCA kit (Thermo 23252) and to determine the size of the His-Tag purified protein using a 16% Tris-Gly denaturing gel (Thermo XP00165) with a 10-250 kDa protein ladder (Thermo 26619). Post gel-electrophoresis, gels were stained with Coomassie Orange Fluor (Thermo C33250) for 20 minutes at room temperature under gentle agitation and visualized using a GelDoc Image Station (Biorad). The remaining concentrated stock of protein was diluted 1:2 with sterile glycerol and stored at -20°C.
Site-Directed Mutagenesis of Starting Plasmid
[00150] Single or multiple amino acids can be mutagenized for improvement by rational design or by high-throughput methods such as error-prone PCR. Plasmids carrying the target protein were harvested and purified from sequence-verified liquid bacterial cultures grown overnight in LB-kanamycin media at 37°C using a MiniPrep Kit (Qiagen 27104). Oligonucleotide primers were ordered from IDT and were designed to PCR amplify the protein expression plasmid while simultaneously mutagenizing the plasmid at the predetermined location, yielding linearized DNA. Using the reagents from the Q5 Site-Directed Mutagenesis Kit (NEB E0554S), the protein expression plasmid was PCR amplified using the Q5 Hot Start High-Fidelity 2x Master Mix with the following thermocycling conditions: initial denaturation at 98°C for 30 seconds; 25 cycles of denaturation at 98°C for 10 seconds, annealing at 68°C for 10 seconds, and extension at 72°C for 120 seconds; and a final extension of 2 minutes at 72°C. 1 µL of the resulting PCR amplification reaction was then treated with the kit’s enzyme reaction cocktail to re-circularize the protein expression plasmid while digesting away the unsubstituted plasmid sequences remaining in the reaction mixture. After bacterial transformation and sequence verification, colonies with perfect sequence matches were used to express and analyze the site-directed mutant protein with methods previously outlined. The resultant purified mutagenized protein was concentrated and buffer exchanged into the appropriate 2X storage buffer as previously mentioned, diluted 1:2 with sterile glycerol, and stored at -20°C.
Initial Activity Screen with Natural dNTPs
[00151] Expressed proteins with terminal transferase activity were screened by determining the rate of ssDNA generation in terms of total ssDNA concentration and the length/distribution of ssDNA produced by the protein after incubation with natural dNTPs. In order to measure the rate of ssDNA generation, a 10 µL bulk extension reaction consisting of 10 pmol of a short 5’-Cy5 labeled initiator oligonucleotide (15-20-nt), 100 mM of dNTPs, 0.25 mM divalent cation cofactor (such as Co2+, Mg2+, Mn2+, Zn2+, or combinations thereof), 1x Reaction Buffer, 1x SYBR Dye (GelStar (Lonza 50535), Qubit ssDNA Dye (Thermo Q10212), or SYBR Green II RNA gel stain (Thermo S7564)), and 1 µL of purified enzyme was monitored on a plate reader (EX:598 nm, EM:522 nm) over 30 minutes at 37°C, taking signal reads every 1 minute in triplicate (N=3). The length of the ssDNA produced in these reactions was determined by comparing products to a 100-nt ssDNA ladder (Simplex Biosciences) using a 15% TBE-Urea denaturing gel (Thermo EC6885) following the manufacturer’s protocol. Approximately 8 µL of the initial activity screen reaction volume was loaded onto the gels and run at 185V for 60 minutes unless otherwise specified. Gels were then stained with a solution of 1x GelStar Nucleic Acid gel stain for 15 minutes with gentle agitation. The resultant gel was then imaged on a Typhoon FLA 9500 system (GE Healthcare Life Sciences) using imaging parameters for SYBR Gold. For extension reactions using initiator oligonucleotides labeled with a 5’-fluorophore such as FAM, Cy5, Cy3, etc., gels were not stained and were imaged directly using the appropriate parameters.
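A rate of ssDNA generation can be estimated from the plate-reader time course in several ways. The following Python sketch fits an ordinary least-squares line to each replicate's fluorescence trace and averages the slopes over the triplicate. The choice of a linear fit, the function names, and the synthetic traces are assumptions for illustration; the text does not specify how the rate was computed, and converting fluorescence to ssDNA concentration would require a standard curve that is not described here.

```python
from statistics import mean
from typing import Sequence


def slope(times: Sequence[float], signal: Sequence[float]) -> float:
    """Ordinary least-squares slope of signal versus time."""
    if len(times) != len(signal) or len(times) < 2:
        raise ValueError("need two or more paired time/signal points")
    t_bar, s_bar = mean(times), mean(signal)
    numerator = sum((t - t_bar) * (s - s_bar) for t, s in zip(times, signal))
    denominator = sum((t - t_bar) ** 2 for t in times)
    if denominator == 0:
        raise ValueError("time points must not all be identical")
    return numerator / denominator


def mean_rate(times: Sequence[float],
              replicates: Sequence[Sequence[float]]) -> float:
    """Average per-replicate slope (signal units per minute)."""
    return mean([slope(times, trace) for trace in replicates])


if __name__ == "__main__":
    # One read per minute over 30 minutes, as in the assay description.
    minutes = list(range(31))
    # Synthetic, perfectly linear traces used only to exercise the code.
    traces = [[1.0 * t + 5 for t in minutes],
              [2.0 * t + 5 for t in minutes],
              [3.0 * t + 5 for t in minutes]]
    print(mean_rate(minutes, traces))  # 2.0
```

If the signal plateaus within the 30 minutes, restricting the fit to the early, approximately linear portion of the trace would give a more meaningful initial rate.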

Claims

1. A modified terminal deoxynucleotidyl transferase (TdT) polypeptide, wherein the modified TdT polypeptide comprises a sequence having at least one amino acid mutation and retains at least 10% of the template-independent DNA polymerase activity of the TdT polypeptide from which the modified TdT is derived.
2. The modified TdT enzyme of claim 1, wherein the modified TdT comprises at least 50% of the template-independent DNA polymerase activity of the TdT polypeptide from which it is derived.
3. The modified TdT enzyme of claim 1 or 2, wherein the protein sequence of the modified TdT comprises at least two amino acid mutations compared to the TdT polypeptide from which it is derived.
4. The modified TdT enzyme of claim 1, 2, or 3, wherein the TdT polypeptide from which the modified TdT is derived is a human TdT polypeptide.
5. The modified TdT enzyme of claim 4, wherein the human TdT polypeptide comprises SEQ ID NO: 1.
6. The modified TdT enzyme of any one of claims 1-5, wherein the modified TdT comprises a cofactor preference that is different from the cofactor preference of the TdT polypeptide from which it is derived.
7. The modified TdT enzyme of any one of claims 1-6, wherein the modified TdT comprises a reduced degree of bias compared to the degree of bias of the TdT polypeptide from which it is derived under substantially similar enzyme assay conditions.
8. The modified TdT enzyme of any one of claims 1-7, wherein the modified TdT comprises at least one amino acid mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459.
9. The modified TdT enzyme of any one of claims 1-8, wherein the modified TdT comprises a mutation at R453 and at least one additional mutation at an amino acid residue selected from the group consisting of: S279, G341, H342, D343, V344, D345, A396, A429, I430, R431, V432, D433, R442, F444, R453, Q454, and L459.
10. The modified TdT enzyme of claim 9, wherein the mutation at R453 is R453A.
11. The modified TdT enzyme of claim 9 or 10, wherein the at least one additional mutation occurs at amino acid residue V432.
12. The modified TdT enzyme of claim 9, 10, or 11, wherein the mutation at amino acid residue V432 is V432G.
13. The modified TdT enzyme of any one of claims 1-12, comprising a sequence selected from Tables 3-8.
14. A method for generating a polynucleotide sequence de novo or in vitro, the method comprising: incubating a modified TdT enzyme of claim 1 in the presence of an initiator sequence, a cofactor and at least one nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3’ end of a polynucleotide strand.
15. The method of claim 14, wherein the modified TdT enzyme or the initiator sequence are conjugated to a solid support.
16. The method of claim 15, wherein the solid support comprises a bead, a membrane, or a column.
17. The method of any one of claims 14-16, wherein the cofactor is a divalent cation.
18. The method of claim 17, wherein the divalent cation is Co2+, Mn2+, Mg2+ or Zn2+.
19. The method of any one of claims 14-18, wherein the modified TdT enzyme of claim 1 is incubated in the presence of 2, 3, or 4 nucleoside triphosphates.
20. The method of any one of claims 14-19, further comprising a second step of incubating a modified TdT enzyme of claim 1 in the presence of an initiator sequence, a cofactor and at least one different nucleoside triphosphate under conditions and for a time sufficient to add at least one nucleotide to the 3’ end of a polynucleotide strand.
21. The method of claim 20, wherein the step is repeated once or twice each in the presence of at least one different nucleotide.
22. A nucleic acid molecule encoding any one of the modified TdT enzymes of claims 1-13.
23. A vector comprising the nucleic acid molecule of claim 22.
24. A cell comprising the modified TdT polypeptide of claims 1-13, the nucleic acid molecule of claim 22, and/or the vector of claim 23.
25. The cell of claim 24, wherein the cell is a bacterial cell.
26. A solid support comprising a modified TdT enzyme of claims 1-13.
PCT/US2019/054398 2018-10-04 2019-10-03 Compositions and methods comprising mutants of terminal deoxynucleotidyl transferase WO2020072715A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201862741143P 2018-10-04 2018-10-04
US62/741,143 2018-10-04

Publications (1)

Publication Number Publication Date
WO2020072715A1 (en) 2020-04-09

Family

ID=70055838

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2019/054398 WO2020072715A1 (en) 2018-10-04 2019-10-03 Compositions and methods comprising mutants of terminal deoxynucleotidyl transferase

Country Status (1)

Country Link
WO (1) WO2020072715A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116836955A (en) * 2023-05-17 2023-10-03 中国科学院深圳先进技术研究院 Terminal deoxynucleotidyl transferase and preparation method thereof

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19869865; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19869865; Country of ref document: EP; Kind code of ref document: A1)