NZ770892A - Polymerases, compositions, and methods of use - Google Patents
Polymerases, compositions, and methods of use
- Publication number
- NZ770892A
- Authority
- NZ
- New Zealand
- Prior art keywords
- amino acid
- polymerase
- dna polymerase
- mutation
- acid sequence
- Prior art date
Abstract
Presented herein are altered polymerase enzymes for improved incorporation of nucleotides and nucleotide analogues, in particular altered polymerases that maintain high fidelity under reduced incorporation times, as well as methods and kits using the same.
Description
POLYMERASES, COMPOSITIONS, AND METHODS OF USE
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application Serial No.
62/753,558, filed October 31, 2018, which is incorporated by reference herein in its
entirety.
SEQUENCE LISTING
This application contains a Sequence Listing electronically submitted via EFS-Web
to the United States Patent and Trademark Office as an ASCII text file entitled
“IP-1546-PCT_ST25.txt” having a size of 224 kilobytes and created on October 31,
2019. The information contained in the Sequence Listing is incorporated by
reference herein.
FIELD
The present disclosure relates to, among other things, altered polymerases for use in
performing a nucleotide incorporation reaction, particularly in the context of nucleic
acid sequencing by synthesis.
BACKGROUND
Next-generation sequencing (NGS) technology relies on DNA polymerases as a
critical component of the sequencing process. Reduction of the time for sequencing
a template while maintaining high fidelity is desirable. Shortening each cycle of a
sequencing by synthesis (SBS) process is a useful step toward achieving a shorter
sequencing run time. One approach to reduce cycle time is to reduce the time of the
incorporation step. However, while reductions in incorporation time could offer
significant improvement to the overall run time, they typically do so at the expense
of fidelity. For instance, phasing rates, pre-phasing rates, and/or bypass rates
increase, and as a consequence error rate is increased. At low error rates, during a
sequencing run most template molecules in a cluster terminate in the same labeled
nucleotide and the signal is clear. In contrast, at reduced fidelity, during a
sequencing run an increasing number of template molecules in a cluster terminate in
the incorrect labeled nucleotide and the signal can become too noisy to accurately
determine which nucleotide was incorporated.
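The link between per-cycle phasing and signal quality can be illustrated with a simplified model. The sketch below is an assumption for illustration only and is not taken from the disclosure: each strand is assumed to fall out of step in any cycle with probability `phasing + prephasing`, independently of other cycles, and never to resynchronize. The function name and the example rates are hypothetical.

```python
def in_phase_fraction(cycles: int, phasing: float, prephasing: float) -> float:
    """Fraction of strands in a cluster still in step after `cycles` cycles.

    Simplified assumption: a strand stays in phase in a given cycle with
    probability (1 - phasing - prephasing), independently of other cycles,
    and a strand that has fallen out of phase never returns.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= phasing + prephasing <= 1.0:
        raise ValueError("phasing + prephasing must lie in [0, 1]")
    return (1.0 - phasing - prephasing) ** cycles


if __name__ == "__main__":
    # Illustrative, made-up per-cycle rates; these are not measured values.
    for label, ph, pre in [("lower rates", 0.001, 0.001),
                           ("higher rates", 0.005, 0.005)]:
        frac = in_phase_fraction(150, ph, pre)
        print(f"{label}: {frac:.3f} of strands in phase after 150 cycles")
```

Under this model the in-phase fraction decays geometrically with cycle number, so even a small increase in the per-cycle rates compounds over a long read.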
SUMMARY
Provided herein are recombinant DNA polymerases. One example of a polymerase
of the present disclosure includes an amino acid sequence that is at least 80%
identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO: 1.
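As an illustration of a percent-identity threshold such as "at least 80% identical", the following sketch computes identity over two sequences that have already been aligned. The convention used here (identical residues divided by the number of alignment columns that contain at least one residue) is an assumption; other conventions exist, and the definition that governs the disclosure is the one given in its own text. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: identical residues divided by the number of
    alignment columns in which at least one sequence has a residue.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column is empty in both sequences; ignore it
        columns += 1
        if a == b:  # both are residues here, since gap/gap was skipped
            matches += 1
    if columns == 0:
        raise ValueError("no aligned residues to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, `meets_identity_threshold("AC-GT", "ACTGT")` compares five columns, four of which match, and so sits exactly at the 80% boundary.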
In one embodiment, a polymerase also includes an amino acid substitution mutation
at a position functionally equivalent to Tyr497 and at least one amino acid
substitution mutation at a position functionally equivalent to Phe152, Val278,
Met329, Val471, Thr514, Leu631, or Glu734 in the 9°N DNA polymerase amino
acid sequence, and optionally further includes amino acid substitution mutations at
positions functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223,
Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid
sequence.
In one embodiment, a polymerase also includes an amino acid substitution mutation
at a position functionally equivalent to Tyr497 and at least one amino acid
substitution mutation at a position functionally equivalent to Lys476, Lys477,
Thr514, Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence, and
optionally further includes amino acid substitution mutations at positions
functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408,
Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
In one embodiment, a polymerase also includes an amino acid substitution mutation
at a position functionally equivalent to Tyr497 and at least one amino acid
substitution mutation at a position functionally equivalent to Arg247, Glu599,
Lys620, His633, or Val661 in the 9°N DNA polymerase amino acid sequence, and
optionally further includes amino acid substitution mutations at positions
functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408,
Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
In one embodiment, a polymerase also includes (i) an amino acid substitution
mutation at a position functionally equivalent to Tyr497; (ii) at least one amino acid
substitution mutation at a position functionally equivalent to Phe152, Val278,
Met329, Val471, Leu631, or Glu734 in the 9°N DNA polymerase amino acid
sequence; and (iii) at least one amino acid substitution mutation at a position
functionally equivalent to Lys476, Lys477, Thr514, Ile521, or Thr590 in the 9°N
DNA polymerase amino acid sequence, and optionally further includes amino acid
substitution mutations at positions functionally equivalent to amino acids Met129,
Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA
polymerase amino acid sequence.
In one embodiment, a polymerase also includes (i) an amino acid substitution
mutation at a position functionally equivalent to Tyr497; (ii) at least one amino acid
substitution mutation at a position functionally equivalent to Lys476, Lys477,
Thr514, Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence; and
(iii) at least one amino acid substitution mutation at a position functionally
equivalent to Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA
polymerase amino acid sequence, and optionally further includes amino acid
substitution mutations at positions functionally equivalent to amino acids Met129,
Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA
polymerase amino acid sequence.
In one embodiment, a polymerase also includes (i) an amino acid substitution
mutation at a position functionally equivalent to Tyr497; (ii) at least one amino acid
substitution mutation at a position functionally equivalent to Phe152, Val278,
Met329, Val471, Leu631, or Glu734 in the 9°N DNA polymerase amino acid
sequence, (iii) at least one amino acid substitution mutation at a position
functionally equivalent to Lys476, Lys477, Thr514, Ile521, or Thr590 in the 9°N
DNA polymerase amino acid sequence, and (iv) at least one amino acid substitution
mutation at a position functionally equivalent to Arg247, Glu599, Lys620, His633,
or Val661 in the 9°N DNA polymerase amino acid sequence, and optionally further
includes amino acid substitution mutations at positions functionally equivalent to
amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and
Ala485 in the 9°N DNA polymerase amino acid sequence.
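One way to locate a position "functionally equivalent" to a numbered 9°N residue in another polymerase is to read it off a sequence alignment. The sketch below is a minimal illustration under that assumption; it takes two already-aligned, gapped sequences and maps a 1-based residue number in the reference to the residue number in the other sequence. The function name and the toy sequences in the usage note are hypothetical, and the disclosure may rely on additional criteria for functional equivalence.

```python
from typing import Optional


def equivalent_position(aligned_ref: str, aligned_other: str,
                        ref_pos: int, gap: str = "-") -> Optional[int]:
    """Map a 1-based residue number in the reference sequence to the
    residue number aligned with it in the other sequence.

    Returns None when the reference residue is aligned to a gap.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have equal length")
    if ref_pos < 1:
        raise ValueError("ref_pos is 1-based and must be >= 1")
    ref_count = 0    # residues seen so far in the reference
    other_count = 0  # residues seen so far in the other sequence
    for r, o in zip(aligned_ref, aligned_other):
        if r != gap:
            ref_count += 1
        if o != gap:
            other_count += 1
        if r != gap and ref_count == ref_pos:
            return other_count if o != gap else None
    raise ValueError("ref_pos is beyond the end of the reference sequence")
```

With the invented toy alignment `"MK-LYT"` (reference) and `"MKALY-"` (other), reference residue 3 maps to residue 4 of the other sequence because of the one-residue insertion, and reference residue 5 has no equivalent because it faces a gap.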
Another example of a polymerase of the present disclosure includes an amino acid
sequence that is at least 80% identical to a 9°N DNA polymerase amino acid
sequence SEQ ID NO:8, and also includes (i) amino acid substitution mutations at
positions functionally equivalent to Tyr497, Phe152, Val278, Met329, Val471, and
Thr514 in the 9°N DNA polymerase amino acid sequence; (ii) amino acid
substitution mutations at positions functionally equivalent to Tyr497, Met329,
Val471, and Glu734 in the 9°N DNA polymerase amino acid sequence; (iii) amino
acid substitution mutations at positions functionally equivalent to Tyr497, Arg247,
Glu599, and His633 in the 9°N DNA polymerase amino acid sequence; (iv) amino
acid substitution mutations at positions functionally equivalent to Tyr497, Arg247,
Glu599, Lys620, and His633 in the 9°N DNA polymerase amino acid sequence; (v)
amino acid substitution mutations at positions functionally equivalent to Tyr497,
Met329, Thr514, Lys620, and Val661 in the 9°N DNA polymerase amino acid
sequence; or (vi) amino acid substitution mutations at positions functionally
equivalent to Tyr497, Val278, Val471, Arg247, Glu599, and His633 in the 9°N
DNA polymerase amino acid sequence.
Also provided herein is a recombinant DNA polymerase that includes the amino
acid sequence of any one of SEQ ID NOs: 10-34, a nucleic acid molecule that
encodes a polymerase described herein, an expression vector that includes the
nucleic acid molecule, and a host cell that includes the vector.
The present disclosure also includes methods. In one embodiment, a method is for
incorporating modified nucleotides into a growing DNA strand. The method
includes allowing the following components to interact: (i) a polymerase described
herein; (ii) a DNA template; and (iii) a nucleotide solution.
Also provided herein is a kit. In one embodiment, the kit is for performing a
nucleotide incorporation reaction. The kit can include, for instance, a polymerase
described herein and a nucleotide solution.
BRIEF DESCRIPTION OF THE VIEWS OF THE DRAWINGS
is a schematic showing alignment of polymerase amino acid sequences from
Thermococcus sp. 9°N-7 (9°N, SEQ ID NO:1), Thermococcus litoralis (Vent, SEQ
ID NO:2 and Deep Vent, SEQ ID NO:3), Thermococcus waiotapuensis (Twa, SEQ
ID NO:7), Thermococcus kodakaraenis (KOD, SEQ ID NO:5), Pyrococcus furiosus
(Pfu, SEQ ID NO:4), and Pyrococcus abyssi (Pab, SEQ ID NO:6). An “*” (asterisk)
indicates positions that have a single, fully conserved residue across all
polymerases. A “:” (colon) indicates conservation between groups of strongly
similar properties as below - roughly equivalent to scoring > 0.5 in the Gonnet PAM
250 matrix. A “.” (period) indicates conservation between groups of weakly similar
properties as below - roughly equivalent to scoring =< 0.5 and > 0 in the Gonnet
PAM 250 matrix.
shows reduced phasing and cumulative error rates at short incorporation
times demonstrated by one of the altered polymerases of the present disclosure, Pol
1558 (SEQ ID NO:11), when compared to a Pol 812 (SEQ ID NO:8) control (left
panels). The two enzymes show comparable phasing and error rates at standard
incorporation times (right panels).
shows reduced R1 phasing and cumulative E. coli error rates at short
incorporation times demonstrated by selected altered polymerases of the present
disclosure, Pol 1558 (SEQ ID NO:11), Pol 1671 (SEQ ID NO:23), Pol 1682 (SEQ
ID NO:25), and Pol 1745 (SEQ ID NO:28), when compared to Pol 812 (SEQ ID
NO:8) and Pol 963 (SEQ ID NO:9) controls. The broken lines in the top and
bottom panels indicate the cumulative E. coli error and R1 phasing rates
demonstrated by Pol 812 at standard incorporation times. compares the
phasing and prephasing rates of the same altered polymerases in reference to Pol
812 and Pol 963 controls.
compares cumulative PhiX error rates of Pol 1550 (SEQ ID NO:10) and Pol
1558 (SEQ ID NO:11) with those of the Pol 812 (SEQ ID NO:8) control at standard and
short incorporation times during long sequencing reads (2x250 cycles). Both
mutants show notable reductions in error rates following the paired-end turn.
shows a comparison between NovaSeq™ sequencing metrics of one of the
altered polymerases of the present invention, Pol 1671 (SEQ ID NO:23),
demonstrated at short incorporation times, and those of Pol 812 (SEQ ID NO:8)
control demonstrated at standard and short incorporation times. The top panels
show the percentages of clusters passing filter (“Clusters PF”); the bottom panels
show the cumulative PhiX error rates. The light open circles denote Pol 812 metrics
at the standard incorporation times, whereas the dark open circles denote Pol 812
metrics at the short incorporation times. All of the Pol 1671 metrics denoted by the
solid circles are at the short incorporation times. summarizes the
cumulative PhiX error rates, Q30 values, and phasing rates shown by Pol 1671 in
reference to Pol 812 control for NovaSeq™ reads 1 and 2 at standard and short
incorporation times. Significant improvements in the quality of both reads were
observed when Pol 1671 was used.
DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
The term "and/or" means one or all of the listed elements or a combination of any
two or more of the listed elements.
The words "preferred" and "preferably" refer to embodiments of the invention that
may afford certain benefits, under certain circumstances. However, other
embodiments may also be preferred, under the same or other circumstances.
Furthermore, the recitation of one or more preferred embodiments does not imply
that other embodiments are not useful, and is not intended to exclude other
embodiments from the scope of the invention.
The terms "comprises" and variations thereof do not have a limiting meaning where
these terms appear in the description and claims.
It is understood that wherever embodiments are described herein with the language
“include,” “includes,” or “including,” and the like, otherwise analogous
embodiments described in terms of “consisting of” and/or “consisting essentially of”
are also provided.
Unless otherwise specified, "a," "an," "the," and "at least one" are used
interchangeably and mean one or more than one.
Conditions that are “suitable” for an event to occur or “suitable” conditions are
conditions that do not prevent such events from occurring. Thus, these conditions
permit, enhance, facilitate, and/or are conducive to the event.
As used herein, “providing” in the context of a composition, an article, a nucleic
acid, or a nucleus means making the composition, article, nucleic acid, or nucleus,
purchasing the composition, article, nucleic acid, or nucleus, or otherwise obtaining
the compound, composition, article, or nucleus.
Also herein, the recitations of numerical ranges by endpoints include all numbers
subsumed within that range (e.g., 1 to 5 includes 1, 1.5, 2, 2.75, 3, 3.80, 4, 5, etc.).
Reference throughout this specification to “one embodiment,” “an embodiment,”
“certain embodiments,” or “some embodiments,” etc., means that a particular
feature, configuration, composition, or characteristic described in connection with
the embodiment is included in at least one embodiment of the disclosure. Thus, the
appearances of such phrases in various places throughout this specification are not
necessarily referring to the same embodiment of the disclosure. Furthermore, the
particular features, configurations, compositions, or characteristics may be
combined in any suitable manner in one or more embodiments.
Maintaining or surpassing current levels of performance at faster incorporation
times can be aided by a new generation of polymerases. Presented herein are
polymerase enzymes having significantly improved performance under sequencing
by synthesis (SBS) fast cycle time conditions. The inventors have surprisingly
identified certain altered polymerases which exhibit improved characteristics
including improved accuracy during short incorporation times. Improved accuracy
includes reduced error rate and reduced phasing. The altered polymerases have a
number of other associated advantages, including reduced prephasing, reduced
bypass rate, and improved quality metrics in SBS reactions. This improvement is
maintained even when a polymerase is used at lower concentrations. Accordingly,
in one embodiment, the concentration of a DNA polymerase in an SBS reaction can
be from 120 ng/µl to 80 ng/µl. In one embodiment, the concentration of a DNA
polymerase in an SBS reaction can be no greater than 120 ng/µl, no greater than 110
ng/µl, no greater than 100 ng/µl, or no greater than 90 ng/µl. In one embodiment,
the concentration of a DNA polymerase in an SBS reaction can be at least 80 ng/µl,
at least 90 ng/µl, at least 100 ng/µl, or at least 110 ng/µl.
Error rate refers to a measurement of the frequency of error in the identification of
the correct base, i.e., the complement of the template sequence at a specific position,
during a sequencing reaction. The fidelity with which a sequenced library matches
the original genome sequence can vary depending on the frequency of base mutation
occurring at any stage from the extraction of the nucleic acid to its sequencing on a
sequencing platform. This frequency places an upper limit on the probability of a
sequenced base being correct. In some embodiments, the quality score is presented
as a numerical value. For example, the quality score can be quoted as QXX where
the XX is the score and it means that that particular call has a probability of error of
10^(-XX/10). Thus, as an example, Q30 equates to an error rate of 1 in 1000, or 0.1%,
and Q40 equates to an error rate of 1 in 10,000, or 0.01%.
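The relationship between a quality score and its error probability can be illustrated with a short sketch. The function names are hypothetical and chosen only for this example; the arithmetic is the standard base-10 logarithmic quality score relationship that the Q30 and Q40 figures above follow.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Return the probability that a base call with quality score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Return the quality score corresponding to an error probability p (0 < p <= 1)."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    for q in (30, 40):
        p = phred_to_error_probability(q)
        print(f"Q{q}: error probability {p:.4%}")
```

Under this relationship, each 10-point increase in the quality score corresponds to a tenfold reduction in the probability of an incorrect call.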
Phasing and pre-phasing are terms known to those of skill in the art and are used to
describe the loss of synchrony in the readout of the sequence copies of a cluster.
Phasing and pre-phasing cause the extracted intensities for a specific cycle to
include the signal of the current cycle and noise from the preceding and following
cycles. Thus, as used herein, the term “phasing” refers to a phenomenon in SBS
that is caused by incomplete incorporation of a nucleotide in some portion of DNA
strands within clusters by polymerases at a given sequencing cycle, and is thus a
measure of the rate at which single molecules within a cluster lose sync with each
other. Phasing can be measured during detection of cluster signal at each cycle and
can be reported as a percentage of detectable signal from a cluster that is out of
synchrony with the signal in the cluster. As an example, a cluster is detected by a
“green” fluorophore signal during cycle N. In the subsequent cycle (cycle N+1),
99.9% of the cluster signal is detected in the “red” channel and 0.1% of the signal
remains from the previous cycle and is detected in the “green” channel. This result
would indicate that phasing is occurring, and can be reported as a numerical value,
such as a phasing value of 0.1, indicating that 0.1% of the molecules in the cluster
are falling behind at each cycle.
The term “pre-phasing” as used herein refers to a phenomenon in SBS that is caused
by the incorporation of nucleotides without effective 3' terminators, causing the
incorporation event to go one cycle ahead. As the number of cycles increases, the
fraction of sequences per cluster affected by phasing increases, hampering the
identification of the correct base. Pre-phasing can be detected by a sequencing
instrument and reported as a numerical value, such as a pre-phasing value of 0.1,
indicating that 0.1% of the molecules in the cluster are running ahead at each cycle.
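As a rough illustration of why small per-cycle phasing and pre-phasing values matter over long reads, the sketch below assumes a simplified model that is not part of this disclosure: a constant percentage of molecules falls behind and a constant percentage runs ahead at every cycle, independently of earlier cycles, and molecules that have lost synchrony never regain it. The function name and parameters are hypothetical.

```python
def in_phase_fraction(phasing_pct: float, prephasing_pct: float, cycles: int) -> float:
    """Estimate the fraction of molecules in a cluster still in synchrony.

    phasing_pct and prephasing_pct are per-cycle percentages (for example,
    0.1 means 0.1% of molecules fall behind or run ahead at each cycle).
    Assumes a constant, independent loss at every cycle (simplified model).
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    stay_in_phase = 1.0 - (phasing_pct + prephasing_pct) / 100.0
    return stay_in_phase ** cycles


if __name__ == "__main__":
    for n in (50, 150, 250):
        fraction = in_phase_fraction(0.1, 0.1, n)
        print(f"after {n} cycles: {fraction:.1%} of molecules in phase")
```

Under these assumptions, phasing and pre-phasing values of 0.1 each would leave roughly three quarters of the molecules in synchrony after 150 cycles, which is why even small reductions in either value can improve the quality of longer reads.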
Detection of phasing and pre-phasing can be performed and reported according to
any suitable methodology as is known in the art, for example, as described in U.S.
Patent No. 8,965,076. For example, as described in the Examples below, phasing is
detected and reported routinely during SBS sequencing runs on sequencing
instruments such as HiSeq™, Genome Analyzer™, NextSeq™, NovaSeq™, iSeq™,
MiniSeq™, or MiSeq™ sequencing platforms from Illumina, Inc. (San Diego, CA)
or any other suitable instrument known in the art.
Reduced cycle times can increase the occurrence of phasing, pre-phasing, and/or
bypass rate, each of which contributes to error rate. The discovery of altered
polymerases which decrease the incidence of phasing, pre-phasing, and/or bypass
rate, even when used in fast cycle time conditions, is surprising and provides a great
advantage in SBS applications. For example, the altered polymerases can provide
faster SBS cycle time, lower phasing and pre-phasing values, and/or longer
sequencing read length. The characterization of error rate and phasing for altered
polymerases as provided herein is set forth in the Example section below.
Polymerases
Provided herein are polymerases, compositions including a polymerase, and
methods of using a polymerase. A polymerase described herein is a DNA
polymerase. In one embodiment, a polymerase of the present disclosure, also
referred to herein as an “altered polymerase,” is based on the amino acid sequence
of a reference polymerase. An altered polymerase includes substitution mutations at
one or more residues when compared to the reference polymerase. A substitution
mutation can be at the same position or a functionally equivalent position compared
to the reference polymerase. Reference polymerases and functionally equivalent
positions are described in detail herein. The skilled person will readily appreciate
that an altered polymerase described herein is not naturally occurring.
A reference polymerase described herein has error rates that are useful in SBS
reactions; however, using a reference polymerase in SBS reactions with shorter
incorporation times increases the error rate. An altered polymerase described herein
maintains the superior error rates observed with reference polymerases even when
the altered polymerase is used in SBS reactions with shorter incorporation times. In
one embodiment, reduced error rates occur when the altered polymerase is tested
using fast incorporation times. Incorporation time refers to the amount of time a DNA
polymerase is in contact with a template. As used herein, a slow incorporation time
is the incorporation time used under a standard cycle using a MiniSeq™ benchtop
sequencing system. Slow incorporation times include from 40 seconds to 50
seconds. As used herein, a fast cycle time refers to an incorporation step that is
from 10 seconds to 40 seconds. In one embodiment, a fast cycle time is an
incorporation time of no greater than 40 seconds, no greater than 30 seconds, no
greater than 20 seconds, no greater than 18 seconds, no greater than 16 seconds, no
greater than 14 seconds, or no greater than 12 seconds. In one embodiment, a fast
cycle time is an incorporation time of at least 10 seconds, at least 12 seconds, at
least 14 seconds, at least 16 seconds, at least 18 seconds, at least 20 seconds, or at
least 30 seconds. In one embodiment, a fast cycle time is an incorporation time of
less than 40 seconds, less than 30 seconds, less than 20 seconds, less than 18
seconds, less than 16 seconds, less than 14 seconds, less than 12 seconds, or less
than 10 seconds.
An altered polymerase described herein can be used in SBS reactions for runs of
different lengths. A “run” refers to the number of nucleotides that are identified on
a template. A run typically includes a run based on the first primer (e.g., a read1
primer) which reads one strand of a template and a run based on the second primer
(e.g., a read2 primer) which reads the complementary strand of the template. In one
embodiment, the number of nucleotides identified using the first primer or the
second primer can be from 10 to 150 nucleotides. In one embodiment, the number
of nucleotides identified using the first primer or the second primer can be no
greater than 150 nucleotides, no greater than 130 nucleotides, no greater than 110
nucleotides, no greater than 90 nucleotides, no greater than 70 nucleotides, no
greater than 50 nucleotides, no greater than 30 nucleotides, or no greater than 20
nucleotides. In one embodiment, the number of nucleotides identified using the first
primer or the second primer can be at least 10, at least 20, at least 30, at least 50, at
least 70, at least 90, at least 110, or at least 130 nucleotides.
In certain embodiments, an altered polymerase is based on a family B type DNA
polymerase. An altered polymerase can be based on, for example, a family B
archaeal DNA polymerase, a human DNA polymerase-α, or a phage polymerase.
Family B archaeal DNA polymerases are well known in the art as exemplified by
the disclosure of U.S. Patent No. 8,283,149. In certain embodiments, an archaeal
DNA polymerase is from a hyperthermophilic archaeon and is thermostable.
In certain embodiments, a family B archaeal DNA polymerase is from a genus such
as, for example, Thermococcus, Pyrococcus, or Methanococcus. Members of the
genus Thermococcus are well known in the art and include, but are not limited to T.
4557, T. barophilus, T. gammatolerans, T. onnurineus, T. sibiricus, T. kodakarensis,
T. gorgonarius, and T. waiotapuensis. Members of the genus Pyrococcus are well
known in the art and include, but are not limited to P. NA2, P. abyssi, P. furiosus, P.
horikoshii, P. yayanosii, P. endeavori, P. glycovorans, and P. woesei. Members of
the genus Methanococcus are well known in the art and include, but are not limited
to M. aeolicus, M. maripaludis, M. vannielii, M. voltae, M. thermolithotrophicus,
and M. jannaschii.
In one embodiment an altered polymerase is based on Vent®, Deep Vent®, 9°N,
Pfu, KOD, or a Pab polymerase. Vent® and Deep Vent® are commercial names
used for family B DNA polymerases isolated from the hyperthermophilic archaeon
Thermococcus litoralis. 9°N polymerase is a family B polymerase isolated from
Thermococcus sp. Pfu polymerase is a family B polymerase isolated from
Pyrococcus furiosus. KOD polymerase is a family B polymerase isolated from
Thermococcus kodakaraenis. Pab polymerase is a family B polymerase isolated
from Pyrococcus abyssi. Twa is a family B polymerase isolated from T.
waiotapuensis. Examples of Vent®, Deep Vent®, 9°N, Pfu, KOD, Pab, and Twa
polymerases are disclosed in
In certain embodiments, a family B archaeal DNA polymerase is from a phage such
as, for example, T4, RB69, or phi29 phage.
shows a sequence alignment for proteins having the amino acid sequences
shown in SEQ ID NOs: 1-7. The alignment indicates amino acids that are conserved
in the different family B polymerases. The skilled person will appreciate that the
conserved amino acids and conserved regions are most likely conserved because
they are important to the function of the polymerases, and therefore show a
correlation between structure and function of the polymerases. The alignment also
shows regions of variability across the different family B polymerases. A person of
ordinary skill in the art can deduce from such data regions of a polymerase in which
substitutions, particularly conservative substitutions, may be permitted without
unduly affecting biological activity of the altered polymerase.
An altered polymerase described herein is based on the amino acid sequence of a
known polymerase (also referred to herein as a reference polymerase) and further
includes substitution mutations at one or more residues. In one embodiment, a
substitution mutation is at a position functionally equivalent to an amino acid of a
reference polymerase. By "functionally equivalent" it is meant that the altered
polymerase has the amino acid substitution at the amino acid position in the
reference polymerase that has the same functional role in both the reference
polymerase and the altered polymerase.
In general, functionally equivalent substitution mutations in two or more different
polymerases occur at homologous amino acid positions in the amino acid sequences
of the polymerases. Hence, use herein of the term “functionally equivalent” also
encompasses mutations that are “positionally equivalent” or “homologous” to a
given mutation, regardless of whether or not the particular function of the mutated
amino acid is known. It is possible to identify the locations of functionally
equivalent and positionally equivalent amino acid residues in the amino acid
sequences of two or more different polymerases on the basis of sequence alignment
and/or molecular modelling. An example of sequence alignment to identify
positionally equivalent and/or functionally equivalent residues is set forth in
For example, the residues in the Twa, KOD, Pab, Pfu, Deep Vent, and Vent
polymerases of that are vertically aligned are considered positionally
equivalent as well as functionally equivalent to the corresponding residue in the 9°N
polymerase amino acid sequence. Thus, for example residue 349 of the 9°N, Twa,
KOD, Pfu, Deep Vent, and Pab polymerases and residue 351 of the Vent
polymerase are functionally equivalent and positionally equivalent. Likewise, for
example residue 633 of the 9°N, Twa, KOD, and Pab polymerases, residue 634 of
the Pfu and Deep Vent polymerases, and residue 636 of the Vent polymerase are
functionally equivalent and positionally equivalent. The skilled person can easily
identify functionally equivalent residues in DNA polymerases.
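The identification of a positionally equivalent residue from a sequence alignment can be sketched as follows. This is a minimal illustration only: the function name is hypothetical, the two aligned strings are toy sequences rather than any of SEQ ID NOs: 1-7, and gaps are assumed to be written as "-". The sketch walks the alignment columns, counts residues in each sequence, and reports which residue number in the second sequence shares a column with a given residue number in the reference.

```python
from typing import Optional


def equivalent_position(aligned_ref: str, aligned_other: str, ref_position: int) -> Optional[int]:
    """Map a 1-based residue number in the reference to the other sequence.

    Both inputs are rows of the same alignment (equal length, '-' for gaps).
    Returns None when the reference residue is aligned to a gap.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have the same length")
    ref_count = 0
    other_count = 0
    for ref_char, other_char in zip(aligned_ref, aligned_other):
        if ref_char != "-":
            ref_count += 1
        if other_char != "-":
            other_count += 1
        if ref_char != "-" and ref_count == ref_position:
            return other_count if other_char != "-" else None
    raise ValueError("ref_position is beyond the end of the reference sequence")


if __name__ == "__main__":
    # Toy alignment: the second sequence carries a two-residue insertion,
    # so downstream residues are offset by two, as with residue 349 versus 351 above.
    reference = "MK--VLTY"
    other = "MKAAVLTY"
    print(equivalent_position(reference, other, 3))
```

An offset of this kind is the reason the same functional position can carry different residue numbers in different polymerases.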
In certain embodiments, the substitution mutation comprises a mutation to a residue
having a non-polar side chain. Amino acids having non-polar side chains are well-known
in the art and include, for example: alanine, glycine, isoleucine, leucine,
methionine, phenylalanine, proline, tryptophan, and valine.
In certain embodiments, the substitution mutation comprises a mutation to a residue
having a polar side chain. Amino acids having polar side chains are well-known in
the art and include, for example: arginine, asparagine, aspartic acid, glutamine,
glutamic acid, histidine, lysine, serine, cysteine, tyrosine, and threonine.
In certain embodiments, the substitution mutation comprises a mutation to a residue
having a hydrophobic side chain. Amino acids having hydrophobic side chains are
well-known in the art and include, for example: glycine, alanine, valine, leucine,
isoleucine, proline, phenylalanine, methionine, and tryptophan.
In certain embodiments, the substitution mutation comprises a mutation to a residue
having an uncharged side chain. Amino acids having uncharged side chains are
well-known in the art and include, for example: glycine, serine, cysteine,
asparagine, glutamine, tyrosine, and threonine, among others.
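The four side-chain groupings listed above overlap, so a single residue can satisfy more than one of them. The following sketch encodes the lists exactly as recited (the uncharged list is given "among others" and is therefore not exhaustive) and reports the groupings to which a replacement residue belongs. The dictionary and function names are hypothetical.

```python
# Groupings as recited in the text, using three-letter amino acid codes.
SIDE_CHAIN_CLASSES = {
    "non-polar": {"Ala", "Gly", "Ile", "Leu", "Met", "Phe", "Pro", "Trp", "Val"},
    "polar": {"Arg", "Asn", "Asp", "Gln", "Glu", "His", "Lys", "Ser", "Cys", "Tyr", "Thr"},
    "hydrophobic": {"Gly", "Ala", "Val", "Leu", "Ile", "Pro", "Phe", "Met", "Trp"},
    # The text lists these "among others", so this set is not exhaustive.
    "uncharged": {"Gly", "Ser", "Cys", "Asn", "Gln", "Tyr", "Thr"},
}


def classes_of(residue: str) -> list:
    """Return the sorted names of every listed grouping that contains the residue."""
    return sorted(name for name, members in SIDE_CHAIN_CLASSES.items() if residue in members)


if __name__ == "__main__":
    for residue in ("Gly", "Ser", "Leu", "His"):
        print(residue, classes_of(residue))
```

For example, glycine appears in the non-polar, hydrophobic, and uncharged lists, whereas histidine appears only in the polar list.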
In one embodiment, an altered polymerase has an amino acid sequence that is
structurally similar to a reference polymerase disclosed herein. In one embodiment,
a reference polymerase is one that includes the amino acid sequence of 9°N (SEQ
ID NO: 1). Optionally, the reference polymerase is SEQ ID NO:1 with the
following substitution mutations: Met129Ala, Asp141Ala, Glu143Ala, Cys223Ser,
Leu408Ala, Tyr409Ala, Pro410Ile, and Ala485Val. A polymerase having the
amino acid sequence of 9°N (SEQ ID NO: 1) with substitution mutations
Met129Ala, Asp141Ala, Glu143Ala, Cys223Ser, Leu408Ala, Tyr409Ala,
Pro410Ile, and Ala485Val is disclosed at SEQ ID NO:8, and is also referred to
herein as the Pol812 polymerase. Other reference sequences include SEQ ID NO:2,
3, 4, 5, 6, or 7. Optionally, a reference polymerase is SEQ ID NO: 2, 3, 4, 5, 6, or 7
with substitution mutations functionally and positionally equivalent to the following
substitution mutations in SEQ ID NO: 1: Met129Ala, Asp141Ala, Glu143Ala,
Cys223Ser, Leu408Ala, Tyr409Ala, Pro410Ile, and Ala485Val.
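Substitution labels such as Ala485Val name the original residue, its 1-based position, and the replacement residue. A minimal sketch of how such labels could be parsed and applied to a one-letter amino acid sequence is shown below. The function names are hypothetical, and the example uses a short toy sequence rather than SEQ ID NO:1; the check that the named original residue is actually present at the stated position is a design choice of this sketch.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

LABEL_PATTERN = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_substitution(label: str) -> tuple:
    """Split a label such as 'Ala485Val' into ('A', 485, 'V')."""
    match = LABEL_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label!r}")
    original, position, replacement = match.groups()
    return THREE_TO_ONE[original], int(position), THREE_TO_ONE[replacement]


def apply_substitutions(sequence: str, labels: list) -> str:
    """Return a copy of sequence with each labelled substitution applied."""
    residues = list(sequence)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if residues[position - 1] != original:
            raise ValueError(f"{label}: position {position} is not {original}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    toy_sequence = "MDCLY"  # toy sequence for illustration only
    print(apply_substitutions(toy_sequence, ["Asp2Ala", "Cys3Ser"]))
```

Verifying the original residue before substituting guards against applying a label numbered for one polymerase to a different polymerase in which the equivalent position has a different residue number.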
As used herein, an altered polymerase may be “structurally similar” to a reference
polymerase if the amino acid sequence of the altered polymerase possesses a
specified amount of sequence similarity and/or sequence identity compared to the
reference polymerase.
Structural similarity of two amino acid sequences can be determined by aligning the
residues of the two sequences (for example, a candidate polymerase and a reference
polymerase described herein) to optimize the number of identical amino acids along
the lengths of their sequences; gaps in either or both sequences are permitted in
making the alignment in order to optimize the number of identical amino acids,
although the amino acids in each sequence must nonetheless remain in their proper
order. A candidate polymerase is the polymerase being compared to the reference
polymerase. A candidate polymerase that has structural similarity with a reference
polymerase and polymerase activity is an altered polymerase.
Unless modified as otherwise described herein, a pair-wise comparison analysis of
amino acid sequences or nucleotide sequences can be conducted, for instance, by the
local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981),
by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol.
48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc.
Nat’l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these
algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics
Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or
by visual inspection (see generally Current Protocols in Molecular Biology, Ausubel
et al., eds., Current Protocols, a joint venture between Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc., supplemented through 2004).
One example of an algorithm that is suitable for determining structural similarity is
the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-
410 (1990). Software for performing BLAST analyses is publicly available through
the National Center for Biotechnology Information. This algorithm involves first
identifying high scoring sequence pairs (HSPs) by identifying short words of length
W in the query sequence, which either match or satisfy some positive-valued
threshold score T when aligned with a word of the same length in a database
sequence. T is referred to as the neighborhood word score threshold (Altschul et al.,
J. Mol. Biol. 215:403-410 (1990)). These initial neighborhood word hits act as seeds
for initiating searches to find longer HSPs containing them. The word hits are then
extended in both directions along each sequence for as far as the cumulative
alignment score can be increased. Cumulative scores are calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of matching
residues; always >0) and N (penalty score for mismatching residues; always <0).
For amino acid sequences, a scoring matrix is used to calculate the cumulative
score. Extension of the word hits in each direction is halted when: the cumulative
alignment score falls off by the quantity X from its maximum achieved value; the
cumulative score goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence is reached. The
BLAST algorithm parameters W, T, and X determine the sensitivity and speed of
the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a
wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a
comparison of both strands. For amino acid sequences, the BLASTP program uses
as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62
scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA
89:10915).
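The three halting conditions for extending a word hit can be illustrated with a simplified, ungapped, one-directional sketch for nucleotide sequences. This is not the BLAST implementation: the function name is hypothetical, the value chosen for the drop-off quantity X is an arbitrary assumption, and only the reward M=5 and penalty N=-4 are taken from the defaults recited above.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 match: int = 5, mismatch: int = -4, x_drop: int = 20) -> tuple:
    """Extend a hit to the right without gaps.

    Returns (best_score, best_length), where best_length is the number of
    aligned positions that produced the maximum cumulative score.
    The x_drop default is an assumed value for illustration only.
    """
    score = 0
    best_score = 0
    best_length = 0
    offset = 0
    # Halting condition 3: the end of either sequence is reached.
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch
        offset += 1
        if score > best_score:
            best_score, best_length = score, offset
        # Halting condition 1: score fell by X from its maximum.
        # Halting condition 2: cumulative score went to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
    return best_score, best_length


if __name__ == "__main__":
    print(extend_right("ACGTACGT", "ACGTTTTT", 0, 0))
```

A smaller X stops extension sooner after a run of mismatches, which trades sensitivity for speed, consistent with the statement that W, T, and X determine the sensitivity and speed of the alignment.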
In addition to calculating percent sequence identity, the BLAST algorithm also
performs a statistical analysis of the similarity between two sequences (see, e.g.,
Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure
of similarity provided by the BLAST algorithm is the smallest sum probability
(P(N)), which provides an indication of the probability by which a match between
two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum
probability in a comparison of the test nucleic acid to the reference nucleic acid is
less than about 0.1, more preferably less than about 0.01, and most preferably less
than about 0.001.
In the comparison of two amino acid sequences, structural similarity may be
referred to by percent “identity” or may be referred to by percent “similarity.”
“Identity” refers to the presence of identical amino acids. “Similarity” refers to the
presence of not only identical amino acids but also the presence of conservative
substitutions. A conservative substitution for an amino acid in a protein may be
selected from other members of the class to which the amino acid belongs. For
example, it is well-known in the art of protein biochemistry that an amino acid
belonging to a grouping of amino acids having a particular size or characteristic
(such as charge, hydrophobicity, or hydrophilicity) can be substituted for another
amino acid without altering the activity of a protein, particularly in regions of the
protein that are not directly associated with biological activity. For example, non-
polar amino acids include alanine, glycine, isoleucine, leucine, methionine,
phenylalanine, proline, tryptophan, and valine. Hydrophobic amino acids include
glycine, alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, and
tryptophan. Polar amino acids include arginine, asparagine, aspartic acid, glutamine,
glutamic acid, histidine, lysine, serine, cysteine, tyrosine, and threonine. The
uncharged amino acids include glycine, serine, cysteine, asparagine, glutamine,
tyrosine, and threonine, among others.
Thus, as used herein, reference to a polymerase as described herein, such as
reference to the amino acid sequence of one or more SEQ ID NOs described herein
can include a protein with at least 80%, at least 85%, at least 86%, at least 87%, at
least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at
least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%
amino acid sequence similarity to the reference polymerase.
Alternatively, as used herein, reference to a polymerase as described herein, such as
reference to the amino acid sequence of one or more SEQ ID NOs described herein
can include a protein with at least 80%, at least 85%, at least 86%, at least 87%, at
least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at
least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99%
amino acid sequence identity to the reference polymerase.
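A percent identity threshold of this kind can be checked once two sequences have been aligned. The sketch below is illustrative only: the function names are hypothetical, the aligned strings are toy sequences, gaps are assumed to be written as "-", and the denominator is taken to be the full alignment length including gap columns, which is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both rows.

    Assumes both rows come from one alignment (equal length, '-' for gaps)
    and counts every column, including gap columns, in the denominator.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    identical = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * identical / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold_pct: float = 80.0) -> bool:
    """Return True when percent identity is at least threshold_pct."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct


if __name__ == "__main__":
    candidate = "MKVLSYA-GG"   # toy candidate sequence, aligned
    reference = "MKVLTYAAGG"   # toy reference sequence, aligned
    print(percent_identity(candidate, reference))
    print(meets_identity_threshold(candidate, reference, 80.0))
```

Percent similarity could be computed the same way by also counting columns in which the two residues are conservative substitutions for one another.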
WO 2020/092830 PCT/US2019/059246
[060] The present disclosure describes a collection of mutations that result in a
polymerase having one or more of the activities described herein. A polymerase
described herein can include any number of mutations, e.g., at least 1, at least 2, at
least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at
least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, or at
least 18 mutations compared to a reference polymerase, such as SEQ ID NO:1 or
SEQ ID NO:8. Likewise, a polymerase described herein can include the mutations
in any combination. For example, Table 1 sets out examples of specific altered
polymerases that include different combinations of mutations described herein. A
check mark (✓) indicates the presence of the listed mutation. The listed mutations,
e.g., Y497G, F152G, V278L, etc., are mutations at positions on SEQ ID NO:1.
Table 1: Examples of altered polymerases (table contents not recoverable)
An altered polymerase of the present disclosure includes a substitution mutation at a
position functionally equivalent to Tyr497 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Tyr497 is a mutation to a non-polar, hydrophobic, or uncharged amino acid, for
example Gly.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Phe152 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Phe152 is a mutation to a non-polar, hydrophobic, or uncharged amino acid, for
example Gly.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Val278 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Val278 is a mutation to a non-polar or hydrophobic amino acid, for example Leu.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Met329 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Met329 is a mutation to a polar amino acid, for example His.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Val471 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Val471 is a mutation to a polar or uncharged amino acid, for example Ser.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Thr514 in a 9°N polymerase (SEQ ID NO: 1) as
is known in the art and exemplified by U.S. Patent Application No. 2016/0032377.
In one embodiment, the substitution mutation at a position functionally equivalent to
Thr514 is a mutation to a non-polar or hydrophobic amino acid, for example Ala. In
some embodiments, other substitution mutations that can be used in combination
[068]
[069]
[070]
[071]
[072]
with a non-polar or hydrophobic amino acid at a position functionally equivalent to
Thr5 14 include Phel52, Val278, M329, Val47l, Lue63 l, Glu734, or a combination
thereof. In one embodiment, the substitution mutation at a position functionally
equivalent to Thr5 14 is a mutation to a polar or uncharged amino acid, for example
Ser. In some embodiments, other substitution mutations that can be used in
combination with a polar or uncharged amino acid at a position functionally
equivalent to Thr5 14 include Lys47 6, Lys47 7 , Ile52l, Thr590, or a combination
thereof.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Leu631 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Leu631 is a mutation to a non-polar or hydrophobic amino acid, for example Met.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Glu734 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Glu734 is a mutation to a polar or uncharged amino acid, for example Arg.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Lys476 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Lys476 is a mutation to a non-polar or hydrophobic amino acid, for example Trp.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Lys477 in a 9°N polymerase (SEQ ID NO: 1), as
is known in the art and exemplified by the disclosure of US Patent No. 9,765,309.
In one embodiment, the substitution mutation at a position functionally equivalent to
Lys477 is a mutation to a non-polar or hydrophobic amino acid, for example Met.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Ile521 in a 9°N polymerase (SEQ ID NO: 1), as is
known in the art and exemplified by U.S. Patent Application No. 2016/0032377. In
one embodiment, the substitution mutation at a position functionally equivalent to
Ile521 is a mutation to a non-polar amino acid, for example Leu.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Thr590 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Thr590 is a mutation to a non-polar or hydrophobic amino acid, for example Ile.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Arg247 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Arg247 is a mutation to a non-polar or uncharged amino acid, for example Tyr.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Glu599 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Glu599 is a mutation to a polar or uncharged amino acid, for example Asp.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Lys620 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Lys620 is a mutation to a polar or uncharged amino acid, for example Arg.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to His633 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
His633 is a mutation to a non-polar, hydrophobic, or uncharged amino acid, for
example Gly.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Val661 in a 9°N polymerase (SEQ ID NO: 1). In
one embodiment, the substitution mutation at a position functionally equivalent to
Val661 is a mutation to a polar or uncharged amino acid, for example Asp.
In one embodiment, an altered polymerase includes at least one substitution
mutation at a position functionally equivalent to amino acids Met129, Asp141,
Glu143, Cys223, Lys408, Tyr409, Pro410, Ala485, or a combination thereof. In
one embodiment, the substitution mutation at a position functionally equivalent to
Met129, Asp141, Glu143, Lys408, or Tyr409 is a mutation to a non-polar or
hydrophobic amino acid, for example Ala. In one embodiment, the substitution
mutation at a position functionally equivalent to Cys223 is a mutation to a polar or
uncharged amino acid, for example Ser. In one embodiment, the substitution
mutation at a position functionally equivalent to Pro410 is a mutation to a non-polar
or hydrophobic amino acid, for example Ile. In one embodiment, the substitution
mutation at a position functionally equivalent to Ala485 is a mutation to a non-polar
or hydrophobic amino acid, for example Val.
In one embodiment, an altered polymerase includes an amino acid substitution
mutation at a position functionally equivalent to Tyr497 and at least one, at least
two, at least three, at least four, at least five, at least six, or seven amino acid
substitution mutations at positions functionally equivalent to an amino acid selected
from Phe152, Val278, Met329, Val471, Thr514, Leu631, and Glu734 in the 9°N
DNA polymerase amino acid sequence. In one embodiment, the altered polymerase
also includes amino acid substitution mutations at positions functionally equivalent
to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and
Ala485 in the 9°N DNA polymerase amino acid sequence.
In one embodiment, an altered polymerase includes an amino acid substitution
mutation at a position functionally equivalent to Tyr497 and at least one, at least two,
at least three, at least four, or five amino acid substitution mutations at positions
functionally equivalent to an amino acid selected from Lys476, Lys477, Thr514,
Ile521, and Thr590 in the 9°N DNA polymerase amino acid sequence. In one
embodiment, the altered polymerase also includes amino acid substitution mutations
at positions functionally equivalent to amino acids Met129, Asp141, Glu143,
Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino
acid sequence.
In one embodiment, an altered polymerase includes an amino acid substitution
mutation at a position functionally equivalent to Tyr497 and at least two, at least
three, at least four, or five amino acid substitution mutations at positions
functionally equivalent to an amino acid selected from Arg247, Glu599, Lys620,
His633, and Val661 in the 9°N DNA polymerase amino acid sequence. In one
embodiment, the altered polymerase also includes amino acid substitution mutations
at positions functionally equivalent to amino acids Met129, Asp141, Glu143,
Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino
acid sequence.
In one embodiment, an altered polymerase includes an amino acid substitution
mutation at a position functionally equivalent to Tyr497, at least one, at least two, at
least three, at least four, at least five, at least six, or seven amino acid substitution
mutations at a position functionally equivalent to Phe152, Val278, Met329, Val471,
Thr514, Leu631, or Glu734 in the 9°N DNA polymerase amino acid sequence, and
at least one, at least two, at least three, at least four, or five amino acid
substitution mutations at a position functionally equivalent to Lys476, Lys477,
Thr514, Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence. In one
embodiment, the altered polymerase also includes amino acid substitution mutations
at positions functionally equivalent to amino acids Met129, Asp141, Glu143,
Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino
acid sequence.
In one embodiment, an altered polymerase includes an amino acid substitution
mutation at a position functionally equivalent to Tyr497, at least one, at least two, at
least three, at least four, at least five, at least six, or seven amino acid substitution
mutations at positions functionally equivalent to an amino acid selected from
Phe152, Val278, Met329, Val471, Thr514, Leu631, and Glu734 in the 9°N DNA
polymerase amino acid sequence, and at least one, at least two, at least three, at least
four, or five amino acid substitution mutations at positions functionally equivalent
to an amino acid selected from Arg247, Glu599, Lys620, His633, or Val661 in the
9°N DNA polymerase amino acid sequence. In one embodiment, the altered
polymerase also includes amino acid substitution mutations at positions functionally
equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409,
Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
In one embodiment, an altered polymerase includes an amino acid substitution
mutation at a position functionally equivalent to Tyr497, at least one, at least two, at
least three, at least four, or five amino acid substitution mutations at positions
functionally equivalent to an amino acid selected from Lys476, Lys477, Thr514,
Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence, and at least
one, at least two, at least three, at least four, or five amino acid substitution
mutations at positions functionally equivalent to an amino acid selected from
Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA polymerase amino
acid sequence. In one embodiment, the altered polymerase also includes amino acid
substitution mutations at positions functionally equivalent to amino acids Met129,
Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA
polymerase amino acid sequence.
In one embodiment, an altered polymerase includes an amino acid substitution
mutation at a position functionally equivalent to Tyr497, at least one, at least two, at
least three, at least four, at least five, or six amino acid substitution mutations at
positions functionally equivalent to an amino acid selected from Phe152, Val278,
Met329, Val471, Leu631, and Glu734 in the 9°N DNA polymerase amino acid
sequence, at least one, at least two, at least three, at least four, or five amino acid
substitution mutations at positions functionally equivalent to an amino acid selected
from Lys476, Lys477, Thr514, Ile521, or Thr590 in the 9°N DNA polymerase
amino acid sequence, and at least one, at least two, at least three, at least four, or
five amino acid substitution mutations at positions functionally equivalent to an
amino acid selected from Arg247, Glu599, Lys620, His633, or Val661 in the 9°N
DNA polymerase amino acid sequence. In one embodiment, the altered polymerase
also includes amino acid substitution mutations at positions functionally equivalent
to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and
Ala485 in the 9°N DNA polymerase amino acid sequence.
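The combinations described in the preceding embodiments can be read as set-membership rules on mutated positions. The following Python sketch is illustrative only: the names GROUP_A, GROUP_B, GROUP_C, BACKGROUND, and summarize_variant are hypothetical labels introduced here and do not appear in the disclosure. Positions use 9°N numbering, and mapping another polymerase onto functionally equivalent positions (for example by sequence alignment) is assumed to have been done beforehand.

```python
# Hypothetical grouping of the 9°N positions named in the text.
# The group labels are assumptions made for this sketch only.
TYR497 = 497
GROUP_A = {152, 278, 329, 471, 514, 631, 734}            # Phe152 ... Glu734
GROUP_B = {476, 477, 514, 521, 590}                      # Lys476 ... Thr590
GROUP_C = {247, 599, 620, 633, 661}                      # Arg247 ... Val661
BACKGROUND = {129, 141, 143, 223, 408, 409, 410, 485}    # Met129 ... Ala485


def summarize_variant(mutated_positions):
    """Count how many mutated positions fall into each group.

    Thr514 is listed in both GROUP_A and GROUP_B in the text, so a
    mutation at 514 is counted once in each of those groups.
    """
    positions = set(mutated_positions)
    return {
        "has_tyr497": TYR497 in positions,
        "group_a": len(positions & GROUP_A),
        "group_b": len(positions & GROUP_B),
        "group_c": len(positions & GROUP_C),
        "background": len(positions & BACKGROUND),
    }


if __name__ == "__main__":
    # Example: Tyr497 plus two group A positions and one background position.
    print(summarize_variant({497, 152, 278, 129}))
```

A variant would match, for instance, the first combination above when has_tyr497 is true and group_a is at least one.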
Specific examples of altered polymerases include Pol 1550 (SEQ ID NO:10), Pol
1558 (SEQ ID NO:11), Pol 1563 (SEQ ID NO:12), Pol 1565 (SEQ ID NO:13), Pol
1630 (SEQ ID NO:14), Pol 1634 (SEQ ID NO:15), Pol 1641 (SEQ ID NO:16), Pol
1573 (SEQ ID NO:17), Pol 1576 (SEQ ID NO:18), Pol 1584 (SEQ ID NO:19), Pol
1586 (SEQ ID NO:20), Pol 1601 (SEQ ID NO:21), Pol 1611 (SEQ ID NO:22), Pol
1671 (SEQ ID NO:23), Pol 1677 (SEQ ID NO:24), Pol 1682 (SEQ ID NO:25), Pol
1680 (SEQ ID NO:27), Pol 1745 (SEQ ID NO:28), Pol 1758 (SEQ ID NO:29), Pol
1761 (SEQ ID NO:30), Pol 1762 (SEQ ID NO:31), Pol 1765 (SEQ ID NO:32), Pol
1769 (SEQ ID NO:33), and Pol 1770 (SEQ ID NO:34).
An altered polymerase described herein can include additional mutations that are
known to affect polymerase activity. One such substitution mutation is at a position
functionally equivalent to Arg713 in the 9°N polymerase (SEQ ID NO: 1). Any of a
variety of substitution mutations at one or more positions known to result in
reduced exonuclease activity can be made, as is known in the art and exemplified by
US Patent No. 8,623,628. In one embodiment, the substitution mutation at position
Arg713 is a mutation to a non-polar, hydrophobic, or uncharged amino acid, for
example Gly, Met, or Ala.
In one embodiment, an altered polymerase includes a substitution mutation at a
position functionally equivalent to Arg743 or Lys705, or a combination thereof, in
the 9°N polymerase (SEQ ID NO: 1), as is known in the art and exemplified by the
disclosure of US Patent No. 8,623,628. In one embodiment, the substitution
mutation at position Arg743 or Lys705 is a mutation to a non-polar or hydrophobic
amino acid, for example Ala.
The present disclosure also provides compositions that include an altered
polymerase described herein. The composition can include other components in
addition to the altered polymerase. For example, the composition can include a
buffer, a nucleotide solution, or a combination thereof. The nucleotide solution can
include nucleotides, such as nucleotides that are labelled, synthetic, modified, or a
combination thereof. In one embodiment, a composition includes target nucleic
acids, such as a library of target nucleic acids.
Mutating Polymerases
Various types of mutagenesis are optionally used in the present disclosure, e.g., to
modify polymerases to produce variants, e.g., in accordance with polymerase
models and model predictions as discussed above, or using random or semi-random
mutational approaches. In general, any available mutagenesis procedure can be used
for making polymerase mutants. Such mutagenesis procedures optionally include
selection of mutant nucleic acids and polypeptides for one or more activities of
interest (e.g., reduced pyrophosphorolysis or increased turnover, e.g., for a given
nucleotide analog). Procedures that can be used include, but are not limited to: site-
directed point mutagenesis, random point mutagenesis, in vitro or in vivo
homologous recombination (DNA shuffling and combinatorial overlap PCR),
mutagenesis using uracil containing templates, oligonucleotide-directed
mutagenesis, phosphorothioate-modified DNA mutagenesis, mutagenesis using
gapped duplex DNA, point mismatch repair, mutagenesis using repair-deficient host
strains, restriction-selection and restriction-purification, deletion mutagenesis,
mutagenesis by total gene synthesis, degenerate PCR, double-strand break repair,
and many others known to persons of skill. The starting polymerase for mutation
can be any of those noted herein, including available polymerase mutants such as
those identified e.g., in US Patent No. 8,460,910 and US Patent No. 8,623,628, each
of which is incorporated by reference in its entirety.
Optionally, mutagenesis can be guided by known information from a naturally
occurring polymerase molecule, or of a known altered or mutated polymerase (e.g.,
using an existing mutant polymerase), e.g., sequence, sequence comparisons,
physical properties, crystal structure and/or the like as discussed above. However, in
another class of embodiments, modification can be essentially random (e.g., as in
classical or "family" DNA shuffling, see, e.g., Crameri et al. (1998) "DNA shuffling
of a family of genes from diverse species accelerates directed evolution" Nature
391:288-291).
Additional information on mutation formats is found in: Sambrook et al., Molecular
Cloning--A Laboratory Manual (3rd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory,
WO 2020/092830 PCT/US2019/059246
Cold Spring Harbor, N.Y., 2000 ("Sambrook"); Current Protocols in Molecular
Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between
Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented
through 2011) ("Ausubel"); and PCR Protocols: A Guide to Methods and
Applications (Innis et al. eds) Academic Press Inc. San Diego, Calif. (1990)
("Innis"). The following publications and references cited within provide additional
detail on mutation formats: Arnold, Protein engineering for unusual environments,
Current Opinion in Biotechnology 4:450-455 (1993), Bass et al., Mutant Trp
repressors with new DNA-binding specificities, Science 242:240-245 (1988), Bordo
and Argos (1991) Suggestions for "Safe" Residue Substitutions in Site-directed
Mutagenesis 217:721-729, Botstein & Shortle, Strategies and applications of in
vitro mutagenesis, Science 229:1193-1201 (1985); Carter et al., Improved
oligonucleotide site-directed mutagenesis using M13 vectors, Nucl. Acids Res. 13:
4431-4443 (1985), Carter, Site-directed mutagenesis, Biochem. J. 237: 1-7 (1986);
Carter, Improved oligonucleotide-directed mutagenesis using M13 vectors, Methods
in Enzymol. 154: 382-403 (1987); Dale et al., Oligonucleotide-directed random
mutagenesis using the phosphorothioate method, Methods Mol. Biol. 57:369-374
(1996); Eghtedarzadeh & Henikoff, Use of oligonucleotides to generate large
deletions, Nucl. Acids Res. 14: 5115 (1986), Fritz et al., Oligonucleotide-directed
construction of mutations: a gapped duplex DNA procedure without enzymatic
reactions in vitro, Nucl. Acids Res. 16: 6987-6999 (1988), Grundstrom et al.,
Oligonucleotide-directed mutagenesis by microscale 'shot-gun' gene synthesis,
Nucl. Acids Res. 13: 3305-3316 (1985); Hayes (2002) Combining Computational
and Experimental Screening for rapid Optimization of Protein Properties PNAS
99(25) 15926-15931, Kunkel, The efficiency of oligonucleotide directed
mutagenesis, in Nucleic Acids & Molecular Biology (Eckstein, F. and Lilley, D. M.
J. eds., Springer Verlag, Berlin)) (1987), Kunkel, Rapid and efficient site-specific
mutagenesis without phenotypic selection, Proc. Natl. Acad. Sci. USA 82:488-492
(1985); Kunkel et al., Rapid and efficient site-specific mutagenesis without
phenotypic selection, Methods in Enzymol. 154:367-382 (1987); Kramer et al., The
gapped duplex DNA approach to oligonucleotide-directed mutation construction,
Nucl. Acids Res. 12: 9441-9456 (1984); Kramer & Fritz Oligonucleotide-directed
construction of mutations via gapped duplex DNA, Methods in Enzymol. 154:350-
367 (1987), Kramer et al., Point Mismatch Repair, Cell 38:879-887 (1984), Kramer
et al., Improved enzymatic in vitro reactions in the gapped duplex DNA approach to
oligonucleotide-directed construction of mutations, Nucl. Acids Res. 16: 7207
(1988); Ling et al., Approaches to DNA mutagenesis: an overview, Anal Biochem.
254(2): 157-178 (1997); Lorimer and Pastan Nucleic Acids Res. 23, 3067-8 (1995);
Mandecki, Oligonucleotide-directed double-strand break repair in plasmids of
Escherichia coli: a method for site-specific mutagenesis, Proc. Natl. Acad. Sci.
USA, 83:7177-7181 (1986), Nakamaye & Eckstein, Inhibition of restriction
endonuclease Nci I cleavage by phosphorothioate groups and its application to
oligonucleotide-directed mutagenesis, Nucl. Acids Res. 14: 9679-9698 (1986),
Nambiar et al., Total synthesis and cloning of a gene coding for the ribonuclease S
protein, Science 223:1299-1301 (1984); Sakmar and Khorana, Total synthesis and
expression of a gene for the α-subunit of bovine rod outer segment guanine
nucleotide-binding protein (transducin), Nucl. Acids Res. 14: 6361-6372 (1988);
Sayers et al., 5'-3' Exonucleases in phosphorothioate-based oligonucleotide-directed
mutagenesis, Nucl. Acids Res. 16:791-802 (1988); Sayers et al., Strand specific
cleavage of phosphorothioate-containing DNA by reaction with restriction
endonucleases in the presence of ethidium bromide, (1988) Nucl. Acids Res. 16:
803-814, Sieber, et al., Nature Biotechnology, 19:456-460 (2001), Smith, In vitro
mutagenesis, Ann. Rev. Genet. 19:423-462 (1985); Methods in Enzymol. 100: 468-
500 (1983), Methods in Enzymol. 154: 329-350 (1987), Stemmer, Nature 370, 389-
91 (1994); Taylor et al., The use of phosphorothioate-modified DNA in restriction
enzyme reactions to prepare nicked DNA, Nucl. Acids Res. 13: 8749-8764 (1985),
Taylor et al., The rapid generation of oligonucleotide-directed mutations at high
frequency using phosphorothioate-modified DNA, Nucl. Acids Res. 13: 8765-8787
(1985); Wells et al., Importance of hydrogen-bond formation in stabilizing the
transition state of subtilisin, Phil. Trans. R. Soc. Lond. A 317: 415-423 (1986),
Wells et al., Cassette mutagenesis: an efficient method for generation of multiple
mutations at defined sites, Gene 34:315-323 (1985), Zoller & Smith,
Oligonucleotide-directed mutagenesis using M13-derived vectors: an efficient and
general procedure for the production of point mutations in any DNA fragment,
Nucleic Acids Res. 10:6487-6500 (1982), Zoller & Smith, Oligonucleotide-directed
mutagenesis of DNA fragments cloned into M13 vectors, Methods in Enzymol.
100:468-500 (1983), Zoller & Smith, Oligonucleotide-directed mutagenesis: a
simple method using two oligonucleotide primers and a single-stranded DNA
template, Methods in Enzymol. 154:329-350 (1987), Clackson et al. (1991)
"Making antibody fragments using phage display libraries" Nature 352:624-628,
Gibbs et al. (2001) "Degenerate oligonucleotide gene shuffling (DOGS): a method
for enhancing the frequency of recombination with family shuffling" Gene 271:13-,
and Hiraga and Arnold (2003) "General method for sequence-independent site-
directed chimeragenesis" J. Mol. Biol. 330:287-296. Additional details on many of
the above methods can be found in Methods in Enzymology Volume 154, which
also describes useful controls for trouble-shooting problems with various
mutagenesis methods.
Making and Isolating Recombinant Polymerases
Generally, nucleic acids encoding a polymerase as presented herein can be made by
cloning, recombination, in vitro synthesis, in vitro amplification and/or other
available methods. A variety of recombinant methods can be used for expressing an
expression vector that encodes a polymerase as presented herein. Methods for
making recombinant nucleic acids, expression and isolation of expressed products
are well known and described in the art. A number of exemplary mutations and
combinations of mutations, as well as strategies for design of desirable mutations,
are described herein. Methods for making and selecting mutations in the active site
of polymerases, including for modifying steric features in or near the active site to
permit improved access by nucleotide analogs, are found herein and, e.g., in WO
2007/076057 and WO 2008/051530.
Additional useful references for mutation, recombinant and in vitro nucleic acid
manipulation methods (including cloning, expression, PCR, and the like) include
Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in
Enzymology volume 152 Academic Press, Inc., San Diego, Calif. (Berger);
Kaufman et al. (2003) Handbook of Molecular and Cellular Methods in Biology and
Medicine Second Edition Ceske (ed) CRC Press (Kaufman); The Nucleic Acid
Protocols Handbook Ralph Rapley (ed) (2000) Cold Spring Harbor, Humana Press
Inc (Rapley); Chen et al. (ed) PCR Cloning Protocols, Second Edition (Methods in
Molecular Biology, Volume 192) Humana Press; and in Viljoen et al. (2005)
Molecular Diagnostic PCR Handbook Springer, ISBN 1402034032.
In addition, a plethora of kits are commercially available for the purification of
plasmids or other relevant nucleic acids from cells (see, e.g., EasyPrep™ and
FlexiPrep™, both from Pharmacia Biotech; StrataClean™, from Stratagene; and
QIAprep™ from Qiagen). Any isolated and/or purified nucleic acid can be further
manipulated to produce other nucleic acids, used to transfect cells, incorporated into
related vectors to infect organisms for expression, and/or the like. Typical cloning
vectors contain transcription and translation terminators, transcription and
translation initiation sequences, and promoters useful for regulation of the
expression of the particular target nucleic acid. The vectors optionally comprise
generic expression cassettes containing at least one independent terminator
sequence, sequences permitting replication of the cassette in eukaryotes, or
prokaryotes, or both (e.g., shuttle vectors), and selection markers for both
prokaryotic and eukaryotic systems. Vectors are suitable for replication and
integration in prokaryotes, eukaryotes, or both.
Other useful references, e.g., for cell isolation and culture (e.g., for subsequent
nucleic acid isolation) include Freshney (1994) Culture of Animal Cells, a Manual
of Basic Technique, third edition, Wiley-Liss, New York and the references cited
therein; Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems John
Wiley & Sons, Inc. New York, N.Y.; Gamborg and Phillips (eds) (1995) Plant Cell,
Tissue and Organ Culture; Fundamental Methods Springer Lab Manual, Springer-
Verlag (Berlin Heidelberg New York), and Atlas and Parks (eds) The Handbook of
Microbiological Media (1993) CRC Press, Boca Raton, Fla.
The present disclosure also includes nucleic acids encoding the altered polymerases
disclosed herein. A particular amino acid can be encoded by multiple codons, and
certain translation systems (e.g., prokaryotic or eukaryotic cells) often exhibit codon
bias, e.g., different organisms often prefer one of the several synonymous codons
that encode the same amino acid. As such, nucleic acids presented herein are
optionally "codon optimized," meaning that the nucleic acids are synthesized to
include codons that are preferred by the particular translation system being
employed to express the polymerase. For example, when it is desirable to express
the polymerase in a bacterial cell (or even a particular strain of bacteria), the nucleic
acid can be synthesized to include codons most frequently found in the genome of
that bacterial cell, for efficient expression of the polymerase. A similar strategy can
be employed when it is desirable to express the polymerase in a eukaryotic cell, e.g.,
the nucleic acid can include codons preferred by that eukaryotic cell.
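As a minimal sketch of the codon optimization idea described above, the following Python example back-translates a short peptide using one preferred codon per amino acid. The table PREFERRED_CODON and the function codon_optimize are hypothetical names introduced for illustration, and the codon choices are assumptions made for this example rather than measured codon-usage data for any organism; each codon does encode the listed amino acid under the standard genetic code.

```python
# Illustrative preference table only: one codon per amino acid, chosen for
# this example and NOT taken from codon-usage data for any real host.
PREFERRED_CODON = {
    "M": "ATG",  # Met
    "G": "GGC",  # Gly
    "L": "CTG",  # Leu
    "H": "CAT",  # His
    "S": "AGC",  # Ser
    "A": "GCG",  # Ala
    "*": "TAA",  # stop
}


def codon_optimize(protein, preferred=PREFERRED_CODON):
    """Return a coding sequence that uses the preferred codon for each residue."""
    missing = sorted(set(protein) - set(preferred))
    if missing:
        raise KeyError(f"no preferred codon listed for: {missing}")
    return "".join(preferred[residue] for residue in protein)


if __name__ == "__main__":
    print(codon_optimize("MGLHSA*"))
```

In practice the table would be built from the codon frequencies of the chosen expression host, and a full table would cover all twenty amino acids.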
A variety of protein isolation and detection methods are known and can be used to
isolate polymerases, e.g., from recombinant cultures of cells expressing the
recombinant polymerases presented herein. Such methods are well known in the
art and include, e.g., those set forth in R.
Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in
Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y.
(1990); Sandana (1997) Bioseparation of Proteins, Academic Press, Inc.; Bollag et
al. (1996) Protein Methods, 2nd Edition Wiley-Liss, NY; Walker (1996) The
Protein Protocols Handbook Humana Press, NJ, Harris and Angal (1990) Protein
Purification Applications: A Practical Approach IRL Press at Oxford, Oxford,
England; Harris and Angal Protein Purification Methods: A Practical Approach IRL
Press at Oxford, Oxford, England; Scopes (1993) Protein Purification: Principles
and Practice 3rd Edition Springer Verlag, NY, Janson and Ryden (1998) Protein
Purification: Principles, High Resolution Methods and Applications, Second Edition
Wiley-VCH, NY, and Walker (1998) Protein Protocols on CD-ROM Humana Press,
NJ, and the references cited therein. Additional details regarding protein purification
and detection methods can be found in Satinder Ahuja ed., Handbook of
Bioseparations, Academic Press (2000).
Methods of Use
[0100] The altered polymerases presented herein can be used in a sequencing procedure,
such as a sequencing-by-synthesis (SBS) technique. Briefly, SBS can be initiated
by contacting the target nucleic acids with one or more nucleotides (e.g., labelled,
synthetic, modified, or a combination thereof), DNA polymerase, etc. Those
features where a primer is extended using the target nucleic acid as template will
incorporate a labeled nucleotide that can be detected. The incorporation time used
in a sequencing run can be significantly reduced using the altered polymerases
described herein. Optionally, the labeled nucleotides can further include a
reversible termination property that terminates further primer extension once a
nucleotide has been added to a primer. For example, a nucleotide analog having a
reversible terminator moiety can be added to a primer such that subsequent
extension cannot occur until a deblocking agent is delivered to remove the moiety.
Thus, for embodiments that use reversible termination, a deblocking reagent can be
delivered to the flow cell (before or after detection occurs). Washes can be carried
out between the various delivery steps. The cycle can then be repeated n times to
extend the primer by n nucleotides, thereby detecting a sequence of length n.
Exemplary SBS procedures, fluidic systems, and detection platforms that can be
readily adapted for use with an array produced by the methods of the present
disclosure are described, for example, in Bentley et al., Nature 456:53-59 (2008),
WO 04/018497; WO 91/06678; WO 07/123744; US Patent Nos. 7,057,026,
7,329,492, 7,211,414, 7,315,019, 7,405,281, and 8,343,746.
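The cyclic structure of SBS with reversible terminators described in this paragraph can be sketched as a simple loop. The Python example below is a toy model under stated assumptions: the function run_sbs and its arguments are hypothetical, the template is given as the bases in the order the polymerase encounters them, and incorporation, detection, and deblocking are each reduced to a single idealized step with no chemistry, errors, or phasing modeled.

```python
# Watson-Crick pairing used to pick the nucleotide that is incorporated.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def run_sbs(template, n_cycles):
    """Toy model of n cycles of sequencing-by-synthesis.

    `template` lists bases in the order the polymerase encounters them.
    Each cycle adds exactly one blocked, labelled nucleotide, records it,
    and then removes the block so the next cycle can proceed.
    """
    read = []
    blocked = False
    for cycle in range(min(n_cycles, len(template))):
        if blocked:
            raise RuntimeError("primer is still blocked; deblocking was skipped")
        base = COMPLEMENT[template[cycle]]  # incorporation of one nucleotide
        blocked = True                      # reversible terminator halts extension
        read.append(base)                   # detection of the label
        blocked = False                     # deblocking reagent removes the moiety
    return "".join(read)


if __name__ == "__main__":
    print(run_sbs("ATGC", 3))
```

Repeating the cycle n times yields a read of length n, limited here by the length of the template.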
[0101] Other sequencing procedures that use cyclic reactions can be used, such as
pyrosequencing. Pyrosequencing detects the release of inorganic pyrophosphate
(PPi) as particular nucleotides are incorporated into a nascent nucleic acid strand
(Ronaghi, et al., Analytical Biochemistry 242(1), 84-9 (1996), Ronaghi, Genome
Res. 11(1), 3-11 (2001); Ronaghi et al. Science 281(5375), 363 (1998); US Pat. Nos.
6,210,891, 6,258,568 and 6,274,320). In pyrosequencing, released PPi can be
detected by being converted to adenosine triphosphate (ATP) by ATP sulfurylase,
and the resulting ATP can be detected via luciferase-produced photons. Thus, the
sequencing reaction can be monitored via a luminescence detection system.
Excitation radiation sources used for fluorescence based detection systems are not
necessary for pyrosequencing procedures. Useful fluidic systems, detectors and
procedures that can be used for application of pyrosequencing to arrays of the
present disclosure are described, for example, in WO 2012/058096, US Pat. App.
Pub. No. 2005/0191698 A1, US Patent Nos. 7,595,883 and 7,244,559.
[0102] Some embodiments can use methods involving the real-time monitoring of DNA
polymerase activity. For example, nucleotide incorporations can be detected
through fluorescence resonance energy transfer (FRET) interactions between a
fluorophore-bearing polymerase and γ-phosphate-labeled nucleotides, or with
zero-mode waveguides. Techniques and reagents for FRET-based sequencing are
described, for example, in Levene et al. Science 299, 682-686 (2003); Lundquist et
al. Opt. Lett. 33, 1026-1028 (2008); Korlach et al. Proc. Natl. Acad. Sci. USA 105,
1176-1181 (2008).
[0103] Some SBS embodiments include detection of a proton released upon incorporation
of a nucleotide into an extension product. For example, sequencing based on
detection of released protons can use an electrical detector and associated
techniques that are commercially available from Ion Torrent (Guilford, CT, a Life
Technologies subsidiary) or sequencing methods and systems described in US
Patent Nos. 8,262,900, 7,948,015, 8,349,167, and US Published Patent Application
No. 2010/0137143 A1.
[0104] Accordingly, presented herein are methods for incorporating nucleotide analogues
into DNA including allowing the following components to interact: (i) an altered
polymerase according to any of the above embodiments, (ii) a DNA template, and
(iii) a nucleotide solution. In certain embodiments, the DNA template includes a
clustered array. In certain embodiments, the nucleotides are modified at the 3' sugar
hydroxyl, and include modifications at the 3' sugar hydroxyl such that the
substituent is larger in size than the naturally occurring 3' hydroxyl group.
Nucleic Acids Encoding Altered Polymerases
[0105] The present disclosure also includes nucleic acid molecules encoding the altered
polymerases described herein. For any given altered polymerase which is a mutant
version of a polymerase for which the amino acid sequence and preferably also the
wild type nucleotide sequence encoding the polymerase is known, it is possible to
obtain a nucleotide sequence encoding the mutant according to the basic principles
of molecular biology. For example, given that the wild type nucleotide sequence
encoding 9°N polymerase is known, it is possible to deduce a nucleotide sequence
encoding any given mutant version of 9°N having one or more amino acid
substitutions using the standard genetic code. Similarly, nucleotide sequences can
readily be derived for mutant versions of other polymerases such as, for example,
Vent® polymerase, Deep Vent® polymerase, Pfu polymerase, KOD polymerase,
Pab polymerase, etc. Nucleic acid molecules having the required nucleotide
sequence may then be constructed using standard molecular biology techniques
known in the art.
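The deduction described above can be sketched in a few lines of Python. This is a minimal illustration only, not the cloning workflow used in the Examples: the nine-base coding sequence is a hypothetical toy input (not the 9°N sequence), the function names are invented for the sketch, and the replacement codon is simply the first codon for the target residue in the standard genetic code, with no attempt at host codon optimization.

```python
# Standard genetic code for DNA codons, built from the canonical T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence (length divisible by 3) into one-letter amino acids."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def substitute(cds: str, position: int, new_aa: str) -> str:
    """Replace the codon at a 1-based amino acid position with a codon for new_aa."""
    if not 1 <= position <= len(cds) // 3:
        raise ValueError("position is outside the coding sequence")
    # Pick the first codon in table order that encodes the requested residue.
    codon = next(c for c, aa in CODON_TABLE.items() if aa == new_aa)
    start = 3 * (position - 1)
    return cds[:start] + codon + cds[start + 3:]


# Hypothetical toy coding sequence: Met-Tyr-Ala.
toy_cds = "ATGTACGCT"
mutant_cds = substitute(toy_cds, 2, "G")  # Tyr -> Gly at position 2
print(translate(toy_cds), "->", translate(mutant_cds))  # prints "MYA -> MGA"
```

Because several codons usually encode the same residue, any synonymous codon could be chosen in place of the one selected here.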
[0106] In accordance with the embodiments presented herein, a defined nucleic acid
includes not only the identical nucleic acid but also any minor base variations
including, in particular, substitutions in cases which result in a synonymous codon
(a different codon specifying the same amino acid residue) due to the degenerate
code in conservative amino acid substitutions. The term “nucleic acid sequence”
also includes the complementary sequence to any single stranded sequence given
regarding base variations.
[0107] The nucleic acid molecules described herein may also, advantageously, be included
in a suitable expression vector to express the polymerase proteins encoded
therefrom in a suitable host. Incorporation of cloned DNA into a suitable
expression vector for subsequent transformation of said cell and subsequent
selection of the transformed cells is well known to those skilled in the art as
provided in Sambrook et al. (1989), Molecular cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory.
[0108] Such an expression vector includes a vector having a nucleic acid according to the
embodiments presented herein operably linked to regulatory sequences, such as
promoter regions, that are capable of effecting expression of said DNA fragments.
The term “operably linked” refers to a juxtaposition wherein the components
described are in a relationship permitting them to function in their intended manner.
Such vectors may be transformed into a suitable host cell to provide for the
expression of a protein according to the embodiments presented herein.
[0109] The nucleic acid molecule may encode a mature protein or a protein having a pro-
sequence, including that encoding a leader sequence on the preprotein which is then
cleaved by the host cell to form a mature protein. The vectors may be, for example,
plasmid, virus or phage vectors provided with an origin of replication, and
optionally a promoter for the expression of said nucleotide and optionally a
regulator of the promoter. The vectors may contain one or more selectable markers,
such as, for example, an antibiotic resistance gene.
[0110] Regulatory elements required for expression include promoter sequences to bind
RNA polymerase and to direct an appropriate level of transcription initiation and
also translation initiation sequences for ribosome binding. For example, a bacterial
expression vector may include a promoter such as the lac promoter and for
translation initiation the Shine-Dalgarno sequence and the start codon AUG.
Similarly, a eukaryotic expression vector may include a heterologous or
homologous promoter for RNA polymerase II, a downstream polyadenylation
signal, the start codon AUG, and a termination codon for detachment of the
ribosome. Such vectors may be obtained commercially or be assembled from the
sequences described by methods well known in the art.
[0111] Transcription of DNA encoding the polymerase by higher eukaryotes may be
optimized by including an enhancer sequence in the vector. Enhancers are cis-
acting elements of DNA that act on a promoter to increase the level of transcription.
Vectors will also generally include origins of replication in addition to the selectable
markers.
[0112] The present disclosure also provides a kit for performing a nucleotide incorporation
reaction. The kit includes at least one altered polymerase described herein and a
nucleotide solution in a suitable packaging material in an amount sufficient for at
least one nucleotide incorporation reaction. Optionally, other reagents such as
buffers and solutions needed to use the altered polymerase and nucleotide solution
are also included. Instructions for use of the packaged components are also
typically included.
[0113] In certain embodiments, the nucleotide solution includes labelled nucleotides. In
certain embodiments, the nucleotides are synthetic nucleotides. In certain
embodiments, the nucleotides are modified nucleotides. In certain embodiments, a
modified nucleotide has been modified at the 3' sugar hydroxyl such that the
substituent is larger in size than the naturally occurring 3' hydroxyl group. In
certain embodiments, the modified nucleotides include a modified nucleotide or
nucleoside molecule that includes a purine or pyrimidine base and a ribose or
deoxyribose sugar moiety having a removable 3'-OH blocking group covalently
attached thereto, such that the 3' carbon atom has attached a group of the structure
-O-Z
wherein Z is any of -C(R’)2-O-R”, -C(R’)2-N(R”)2, -C(R’)2-N(H)R”, -
C(R’)2-S-R” and -C(R’)2-F,
wherein each R” is or is part of a removable protecting group,
each R’ is independently a hydrogen atom, an alkyl, substituted alkyl,
arylalkyl, alkenyl, alkynyl, aryl, heteroaryl, heterocyclic, acyl, cyano, alkoxy,
aryloxy, heteroaryloxy or amido group, or a detectable label attached through a
linking group, or (R’)2 represents an alkylidene group of formula =C(R”’)2 wherein
each R”’ may be the same or different and is selected from the group comprising
hydrogen and halogen atoms and alkyl groups, and
wherein the molecule may be reacted to yield an intermediate in which each
R” is exchanged for H or, where Z is -C(R’)2-F, the F is exchanged for OH, SH or
NH2, preferably OH, which intermediate dissociates under aqueous conditions to
afford a molecule with a free 3'OH,
with the proviso that where Z is -C(R’)2-S-R”, both R’ groups are not H.
In certain embodiments, R’ of the modified nucleotide or nucleoside is an alkyl or
substituted alkyl. In certain embodiments, -Z of the modified nucleotide or
nucleoside is of formula -C(R’)2-N3. In certain embodiments, Z is an azidomethyl
group.
[0114] In certain embodiments, the modified nucleotides are fluorescently labelled to allow
their detection. In certain embodiments, the modified nucleotides include a
nucleotide or nucleoside having a base attached to a detectable label via a cleavable
linker. In certain embodiments, the detectable label includes a fluorescent label.
[0115] As used herein, the phrase “packaging material” refers to one or more physical
structures used to house the contents of the kit. The packaging material is
constructed by known methods, preferably to provide a sterile, contaminant-free
environment. The packaging material has a label which indicates that the
components can be used for conducting a nucleotide incorporation reaction. In
addition, the packaging material contains instructions indicating how the materials
within the kit are employed to practice a nucleotide incorporation reaction. As used
herein, the term “package” refers to a solid matrix or material such as glass, plastic,
paper, foil, and the like, capable of holding within fixed limits the polypeptides.
“Instructions for use” typically include a tangible expression describing the reagent
concentration or at least one assay method parameter, such as the relative amounts
of reagent and sample to be admixed, maintenance time periods for reagent/sample
admixtures, temperature, buffer conditions, and the like.
[0116] The complete disclosure of the patents, patent documents, and publications cited in
the Background, the Detailed Description of Exemplary Embodiments, and
elsewhere herein are incorporated by reference in their entirety as if each were
individually incorporated.
[0117] Illustrative embodiments of this invention are discussed, and reference has been
made to possible variations within the scope of this invention. These and other
variations, combinations, and modifications in the invention will be apparent to
those skilled in the art without departing from the scope of the invention, and it
should be understood that this invention is not limited to the illustrative
embodiments set forth herein. Accordingly, the invention is to be limited only by
the claims provided below and equivalents thereof.
EXAMPLES
[0118] The present invention is illustrated by the following examples. It is to be
understood that the particular examples, materials, amounts, and procedures are to
be interpreted broadly in accordance with the scope and spirit of the invention as set
forth herein.
Example 1
General Assay Methods and Conditions
[0119] Unless otherwise noted, this describes the general assay conditions used in the
Examples described herein.
A. Cloning and Expression of Polymerases
[0120] Methods for making recombinant nucleic acids, expression, and isolation of
expressed products are known and described in the art. Mutagenesis was performed
on the coding region encoding a 9°N polymerase (SEQ ID NO: 1) using standard
site-directed mutagenesis methodology. PCR-based approaches were used to
amplify mutated coding regions and add a His-tag. For each mutation made, the
proper sequence of the altered coding region was confirmed by determining the
sequence of the cloned DNA.
[0121] His-tagged mutant polymerase coding regions were subcloned into pET11a vector
and transformed into BL21 Star (DE3) expression cells (Invitrogen). Overnight
cultures from single-picked colonies were used to inoculate expression cultures in
2.8 L flasks. Cultures were grown at 37°C until an OD600 of about 0.8; protein
expression was then induced with 0.2 mM IPTG, followed by 4 hours of
additional growth. Cultures were centrifuged at 7000 rpm for 20 minutes. Cell
pellets were stored at -20°C until purification.
[0122] Pellets were freeze-thawed and lysed with 5x w/v lysis buffer (50 mM Tris-HCl
pH 7.5, 1 mM EDTA, 0.1% BME, and 5% Glycerol) in the presence of Ready-Lyse
and Omnicleave reagents (Epicentre) according to manufacturer recommendations.
The final NaCl concentration was raised to 500 mM and the lysate was incubated on ice
for 5 minutes. Following centrifugation, the supernatant was incubated at 80°C for
about 70 minutes. All further purification was performed at 4°C. Supernatant was
iced for 30 min before being centrifuged and purified using 5 mL Ni Sepharose HP
columns (GE). Columns were pre-equilibrated with Buffer A (50 mM Tris-HCl pH
7.5, 1 mM EDTA, 5% Glycerol, 500 mM NaCl, and 20 mM Imidazole). The
column was eluted using a 75 mL gradient from 20 to 500 mM imidazole. Peak
fractions were pooled and diluted with 10% glycerol to match the conductivity of
SP Buffer A (50 mM Tris-HCl pH 7.5, 150 mM NaCl, 1 mM EDTA, 5% Glycerol)
and loaded onto 5 mL SP Sepharose columns (GE). The column was eluted using a
100 mL gradient from 150 to 1000 mM NaCl. Peak fractions were pooled, dialyzed
into storage buffer (10 mM Tris-HCl pH 7.5, 300 mM KCl, 0.1 mM EDTA, and
50% Glycerol) and stored at -20°C.
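As a rough aid for planning the two elution steps above, the nominal eluent concentration at a given point in a gradient can be estimated as follows. This sketch assumes the gradients are linear, which the text does not state, and the function name is hypothetical; real elution profiles also depend on column dead volume and mixer behavior.

```python
def gradient_concentration(volume_ml: float, start_mm: float,
                           end_mm: float, gradient_ml: float) -> float:
    """Nominal concentration (mM) after volume_ml of an assumed linear gradient."""
    if not 0 <= volume_ml <= gradient_ml:
        raise ValueError("volume must lie within the gradient")
    return start_mm + (end_mm - start_mm) * volume_ml / gradient_ml


# Ni Sepharose step: 75 mL gradient from 20 to 500 mM imidazole.
imidazole_midpoint = gradient_concentration(37.5, 20, 500, 75)   # 260.0 mM

# SP Sepharose step: 100 mL gradient from 150 to 1000 mM NaCl.
nacl_midpoint = gradient_concentration(50, 150, 1000, 100)       # 575.0 mM

print(imidazole_midpoint, nacl_midpoint)
```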
B. Error Rate and Phasing Analysis
[0123] Sequencing experiments were used to compare error rates and phasing values.
Unless indicated otherwise, the experiments were carried out on a MiniSeq™
system (Illumina, Inc., San Diego, Calif.), according to manufacturer instructions.
For example, for each polymerase, a separate incorporation mix (IMX) was
prepared and used in a short run (35 cycles in read 1) or long run (227 cycle run of
151 in read 1 and 76 in read 2). Standard MiniSeq Mid Output Reagent Cartridge
formulations were used, with the standard polymerase substituted with the
polymerase being tested, at a concentration of 90 µg/mL. The time for incubation of
IMX on the flowcell varied as noted in the Examples herein. The DNA library used
was made following the standard TruSeq™ Nano protocol (Illumina, Inc.), with 350
bp target insert size, using E. coli genomic DNA; PhiX DNA (Illumina, Inc.) was
added to the resulting library in ~1:10 molar ratio. Illumina RTA Software was used to
evaluate error rate on both genomes as well as phasing levels.
Example 2
Sequencing Performance of Altered Polymerases
[0124] A number of altered polymerases were identified that had error rates and phasing
levels in a short run under a short incorporation time that were not substantially
greater than a control polymerase used in a short run under a standard incorporation
time. Table 2 below compares sequencing performance of the altered polymerases
listed in Table 1 at 14 sec incorporation time relative to a control polymerase
represented by Pol 812 (SEQ ID NO:8). The quality metrics used to evaluate the
altered polymerases were the phasing rates of read 1 (“R1 Phasing”) and cumulative
error rates of E. coli (“E. coli Error”) and bacteriophage PhiX (“PhiX Error”)
sequencing controls. The metrics were normalized to corresponding Pol 812
phasing and error rates at the standard (46 sec) incorporation time (“812 STD”).
For example, the R1 phasing rate of Pol 812 at the short incorporation time (“812
Fast”) is 6 fold higher than its R1 phasing at the standard incorporation time,
whereas the cumulative E. coli and PhiX error rates are 12.5 and 6.4 fold higher,
respectively. Similarly, the R1 phasing rate of Pol 963 (SEQ ID NO:9) at the short
incorporation time is 5.9 fold higher than the R1 phasing of Pol 812 at the standard
incorporation time, whereas the cumulative E. coli and PhiX error rates of Pol 963
are 4.6 and 3.6 fold higher, respectively.
[0125] Table 2: Performance metrics of the altered polymerases listed in Table 1.
                      Sequencing Performance
Pol (SEQ ID NO:)    R1 Phasing    E. coli Error    PhiX Error
812 STD (8)         1.0           1.0              1.0
812 Fast (8)        6.0           12.5             6.4
963 (9)             5.9           4.6              3.6
*1550 (10)          3.0           1.5              1.4
*1558 (11)          2.9           1.5              1.5
1563 (12)           3.5           1.7              1.6
1565 (13)           3.3           2.3              1.6
1630 (14)           3.3           1.9              1.5
1634 (15)           3.1           1.7              1.7
1641 (16)           3.1           2.0              1.5
1573 (17)           3.2           1.6              1.4
1576 (18)           3.6           1.5              1.4
1584 (19)           3.5           2.0              1.5
1586 (20)           3.9           2.9              2.2
1601 (21)           2.9           3.1              1.9
1611 (22)           3.7           3.8              2.4
*1671 (23)          2.9           1.3              1.2
*1677 (24)          3.0           1.9              1.7
*1682 (25)          2.8           1.6              1.4
1700 (26)           3.3           1.7              1.7
*1680 (27)          2.8           1.7              1.6
*1745 (28)          2.7           1.6              1.2
*1758 (29)          2.8           1.6              1.3
*1761 (30)          2.7           1.5              1.4
*1762 (31)          2.8           1.6              1.4
*1765 (32)          2.8           1.3              1.2
*1769 (33)          2.6           1.4              1.2
*1770 (34)          2.7           2.0              1.3
[0126] Altered polymerases characterized by relative phasing rates no greater than 3.0 and
cumulative error rates no greater than 2.0 at short incorporation times represent
particularly attractive candidates for fast SBS applications. Examples of such
polymerases are denoted in bold font and with an asterisk in Table 2. Additional results
are shown in FIGS. 2 and 3.
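The selection criterion in the preceding paragraph can be expressed as a simple filter over the Table 2 values. The sketch below is illustrative only: the variable and function names are hypothetical, polymerase numbers are given without the asterisk marker, and the 812 reference rows are excluded because they are controls rather than altered polymerases.

```python
# (polymerase, SEQ ID NO, R1 phasing, E. coli error, PhiX error), all relative to 812 STD.
TABLE_2 = [
    ("812 STD", 8, 1.0, 1.0, 1.0), ("812 Fast", 8, 6.0, 12.5, 6.4),
    ("963", 9, 5.9, 4.6, 3.6), ("1550", 10, 3.0, 1.5, 1.4),
    ("1558", 11, 2.9, 1.5, 1.5), ("1563", 12, 3.5, 1.7, 1.6),
    ("1565", 13, 3.3, 2.3, 1.6), ("1630", 14, 3.3, 1.9, 1.5),
    ("1634", 15, 3.1, 1.7, 1.7), ("1641", 16, 3.1, 2.0, 1.5),
    ("1573", 17, 3.2, 1.6, 1.4), ("1576", 18, 3.6, 1.5, 1.4),
    ("1584", 19, 3.5, 2.0, 1.5), ("1586", 20, 3.9, 2.9, 2.2),
    ("1601", 21, 2.9, 3.1, 1.9), ("1611", 22, 3.7, 3.8, 2.4),
    ("1671", 23, 2.9, 1.3, 1.2), ("1677", 24, 3.0, 1.9, 1.7),
    ("1682", 25, 2.8, 1.6, 1.4), ("1700", 26, 3.3, 1.7, 1.7),
    ("1680", 27, 2.8, 1.7, 1.6), ("1745", 28, 2.7, 1.6, 1.2),
    ("1758", 29, 2.8, 1.6, 1.3), ("1761", 30, 2.7, 1.5, 1.4),
    ("1762", 31, 2.8, 1.6, 1.4), ("1765", 32, 2.8, 1.3, 1.2),
    ("1769", 33, 2.6, 1.4, 1.2), ("1770", 34, 2.7, 2.0, 1.3),
]


def is_fast_sbs_candidate(r1_phasing: float, ecoli_error: float, phix_error: float,
                          max_phasing: float = 3.0, max_error: float = 2.0) -> bool:
    """Apply the thresholds: relative phasing <= 3.0 and both error rates <= 2.0."""
    return (r1_phasing <= max_phasing
            and ecoli_error <= max_error
            and phix_error <= max_error)


candidates = [
    name
    for name, _seq_id, r1, ecoli, phix in TABLE_2
    if not name.startswith("812") and is_fast_sbs_candidate(r1, ecoli, phix)
]
print(candidates)
```

With the values above, the filter returns 13 polymerases, including Pols 1550, 1558, 1671, 1682, and 1745 discussed in this Example.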
[0127] shows reduced phasing and cumulative error rates at short incorporation
times demonstrated by one of the altered polymerases identified in Example 2, Pol
1558 (SEQ ID NO:11), when compared to a Pol 812 control (left panels). The two
enzymes show comparable phasing and error rates at standard incorporation times
(right panels).
[0128] shows reduced R1 phasing and cumulative E. coli error rates at short
incorporation times demonstrated by selected altered polymerases identified in
Example 2, Pol 1558, Pol 1671 (SEQ ID NO:23), Pol 1682 (SEQ ID NO:25), and
Pol 1745 (SEQ ID NO:28), when compared to Pol 812 and Pol 963 controls. The
broken lines in the top and bottom panels indicate the cumulative E. coli error and
R1 phasing rates demonstrated by Pol 812 at standard incorporation times.
[0129] compares the phasing and prephasing rates of the same altered polymerases
in reference to Pol 812 and Pol 963 controls, showing reductions in the phasing
rates by Pols 1558, 1671, 1682, and 1745.
Example 3
Activity of Altered Polymerases under Long Run Conditions
[0130] Selected altered polymerases identified in Example 2 were evaluated using different
run lengths. Cumulative PhiX error rates for each of the altered polymerases at
standard incorporation times (46 sec) were compared to phasing and cumulative
error rates at short incorporation times (22 sec) during long sequencing runs (250
cycles in read 1 followed by 250 cycles in read 2, for a total of 500 cycles). The
longer run conditions result in more sequence information (i.e., the identity of more
nucleotides is determined) per run and are similar to the conditions often used in
sequencing platforms, for instance when sequencing a whole genome. The results
are shown in
[0131] compares cumulative PhiX error rates of Pol 1550 (SEQ ID NO:10) and Pol
1558 (SEQ ID NO:11) with that of Pol 812 (SEQ ID NO:8) control at standard and
short incorporation times during long sequencing reads (2x250 cycles). Both
mutants show notable reductions in error rates following the paired-end turn. In
addition, Pol 1550 shows a significant reduction in error rate compared to Pol 812
during the first sequencing read.
Example 4
Evaluation of Sequencing Metrics on NovaSeq™
[0132] One of the altered polymerases identified in Example 2, Pol 1671, was evaluated on
Illumina’s NovaSeq™ platform using the S1 flow cell and NovaSeq™ sequencing
chemistry. Error rates, phasing levels, and other sequencing quality metrics were
determined for reads 1 and 2 at short and standard incorporation times. The results
for Pol 1671 and Pol 812 are summarized in
[0133] shows a comparison between NovaSeq™ sequencing metrics of Pol 1671
(SEQ ID NO:23), demonstrated at short incorporation times (10 sec), and those of
Pol 812 (SEQ ID NO:8) control demonstrated at standard (40 sec) and short (10 sec)
incorporation times. The top panels show the percentages of clusters passing filter
(“Clusters PF”), and the bottom panels show the cumulative PhiX error rates. The light
open circles denote the Pol 812 metrics at the standard incorporation times, whereas
the dark open circles denote the Pol 812 metrics at the short incorporation times.
All of the Pol 1671 metrics denoted by the solid circles are at the short incorporation
times. The results indicate that Pol 1671 shows comparable performance in both
reads at the short incorporation times to that of Pol 812 at the standard incorporation
times.
[0134] summarizes the cumulative PhiX error rates, Q30 values, and phasing rates
shown by Pol 1671 in reference to Pol 812 control for NovaSeq™ reads 1 and 2 at
standard (40 sec) and short (22 sec) incorporation times. Significant improvements
in the sequencing quality of both reads were observed when Pol 1671 was used.
[0135] The complete disclosure of all patents, patent applications, and publications, and
electronically available material (including, for instance, nucleotide sequence
submissions in, e.g., GenBank and RefSeq, and amino acid sequence submissions
in, e.g., SwissProt, PIR, PRF, PDB, and translations from annotated coding regions
in GenBank and RefSeq) cited herein are incorporated by reference in their entirety.
Supplementary materials referenced in publications (such as supplementary tables,
supplementary figures, supplementary materials and methods, and/or supplementary
experimental data) are likewise incorporated by reference in their entirety. In the
event that any inconsistency exists between the disclosure of the present application
and the disclosure(s) of any document incorporated herein by reference, the
disclosure of the present application shall govern. The foregoing detailed
description and examples have been given for clarity of understanding only. No
unnecessary limitations are to be understood therefrom. The invention is not limited
to the exact details shown and described, for variations obvious to one skilled in the
art will be included within the invention defined by the claims.
[0136] Unless otherwise indicated, all numbers expressing quantities of components,
molecular weights, and so forth used in the specification and claims are to be
understood as being modified in all instances by the term "about." Accordingly,
unless otherwise indicated to the contrary, the numerical parameters set forth in the
specification and claims are approximations that may vary depending upon the
desired properties sought to be obtained by the present invention. At the very least,
and not as an attempt to limit the doctrine of equivalents to the scope of the claims,
each numerical parameter should at least be construed in light of the number of
reported significant digits and by applying ordinary rounding techniques.
[0137] Notwithstanding that the numerical ranges and parameters setting forth the broad
scope of the invention are approximations, the numerical values set forth in the
specific examples are reported as precisely as possible. All numerical values,
however, inherently contain a range necessarily resulting from the standard
deviation found in their respective testing measurements.
[0138] All headings are for the convenience of the reader and should not be used to limit
the meaning of the text that follows the heading, unless so specified.
Claims (98)
1. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO: 1, wherein the recombinant DNA polymerase comprises an amino acid substitution mutation at a position functionally equivalent to Tyr497 and at least one amino acid substitution mutation at a position functionally equivalent to Phe152, Val278, Met329, Val471, Thr514, Leu631, or Glu734 in the 9°N DNA polymerase amino acid sequence.
2. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Tyr497 comprises a mutation to a non-polar, hydrophobic, or uncharged amino acid.
3. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Tyr497 comprises a mutation to Gly.
4. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Phe152 comprises a mutation to a non-polar, hydrophobic, or uncharged amino acid.
5. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Phe152 comprises a mutation to Gly.
6. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Val278 comprises a mutation to a non-polar or hydrophobic amino acid.
7. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Val278 comprises a mutation to Leu.
8. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Met329 comprises a mutation to a polar amino acid.
9. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Met329 comprises a mutation to His.
10. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Val471 comprises a mutation to a polar or uncharged amino acid.
11. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Val471 comprises a mutation to Ser.
12. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Thr514 comprises a mutation to a non-polar or hydrophobic amino acid.
13. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Thr514 comprises a mutation to Ala.
14. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Leu631 comprises a mutation to a non-polar or hydrophobic amino acid.
15. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Leu631 comprises a mutation to Met.
16. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Glu734 comprises a mutation to a polar amino acid.
17. The polymerase of claim 1, wherein the substitution mutation at the position functionally equivalent to Glu734 comprises a mutation to Arg.
18. The polymerase of claim 1, wherein the polymerase comprises at least two, at least three, at least four, at least five, at least six, or seven amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Phe152, Val278, Met329, Val471, Thr514, Leu631, and Glu734 in the 9°N DNA polymerase amino acid sequence.
19. The polymerase of any one of claims 1-18, wherein the polymerase further comprises amino acid substitution mutations at positions functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
20. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO: 1, wherein the recombinant DNA polymerase comprises an amino acid substitution mutation at position functionally equivalent to Tyr497 and at least one amino acid substitution mutation at a position functionally equivalent to Lys476, Lys477, Thr514, Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence.
21. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Lys476 comprises a mutation to a non-polar or hydrophobic amino acid.
22. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Lys476 comprises a mutation to Trp.
23. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Lys477 comprises a mutation to a non-polar or hydrophobic amino acid.
24. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Lys477 comprises a mutation to Met.
25. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Thr514 comprises a mutation to a polar or uncharged amino acid.
26. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Thr514 comprises a mutation to Ser.
27. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Ile521 comprises a mutation to a non-polar or hydrophobic amino acid.
28. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Ile521 comprises a mutation to Leu.
29. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Thr590 comprises a mutation to a non-polar or hydrophobic amino acid.
30. The polymerase of claim 20, wherein the substitution mutation at the position functionally equivalent to Thr590 comprises a mutation to Ile.
31. The polymerase of claim 20, wherein the polymerase comprises at least two, at least three, at least four, or five amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Lys476, Lys477, Thr514, Ile521, and Thr590 in the 9°N DNA polymerase amino acid sequence.
32. The polymerase of any one of claims 20-31, wherein the polymerase further comprises amino acid substitution mutations at positions functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
33. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO: 1, wherein the recombinant DNA polymerase comprises an amino acid substitution mutation at position functionally equivalent to Tyr497 and at least one amino acid substitution mutation at a position functionally equivalent to Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA polymerase amino acid sequence.
34. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to Arg247 comprises a mutation to a non-polar or uncharged amino acid.
35. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to Arg247 comprises a mutation to Tyr.
36. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to Glu599 comprises a mutation to a polar amino acid.
37. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to Glu599 comprises a mutation to Asp.
38. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to Lys620 comprises a mutation to a polar amino acid.
39. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to Lys620 comprises a mutation to Arg.
40. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to His633 comprises a mutation to a non-polar, hydrophobic, or uncharged amino acid.
41. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to His633 comprises a mutation to Gly.
42. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to Val661 comprises a mutation to a polar amino acid.
43. The polymerase of claim 33, wherein the substitution mutation at the position functionally equivalent to Val661 comprises a mutation to Asp.
44. The polymerase of claim 33, wherein the polymerase comprises at least two, at least three, at least four, or five amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Arg247, Glu599, Lys620, His633, and Val661 in the 9°N DNA polymerase amino acid sequence.
45. The polymerase of any one of claims 33-44, wherein the polymerase further comprises amino acid substitution mutations at positions functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
46. A recombinant DNA polymerase comprising an amino acid sequence that isat least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ IDNO: 1, wherein the recombinant DNA polymerase comprises (i) an amino acidsubstitution mutation at position functionally equivalent to Tyr497, (ii) at least oneamino acid substitution mutation at a position functionally equivalent to Phel52,Val278, Met329, Val47l, Leu63 l, or Glu734 in the 9°N DNA polymerase aminoacid sequence, and (iii) at least one amino acid substitution mutation at a positionfunctionally equivalent to Lys476, Lys477, Thr5l4, Ile52l, or Thr59O in the 9°NDNA polymerase amino acid sequence.
47. The polymerase of claim 46, wherein the polymerase comprises at least two,at least three, at least four, at least five, or six amino acid substitution mutations atpositions functionally equivalent to an amino acid selected from Phel52, Val278,Met329, Val47l, Leu63 l, and Glu7 34 in the 9°N DNA polymerase amino acidSCCIUBIICC.
48. The polymerase of claim 46, wherein the polymerase comprises at least two, at least three, at least four, or five amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Lys476, Lys477, Thr514, Ile521, and Thr590 in the 9°N DNA polymerase amino acid sequence.
49. The polymerase of any one of claims 46-48, wherein the polymerase further comprises amino acid substitution mutations at positions functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
50. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:1, wherein the recombinant DNA polymerase comprises (i) an amino acid substitution mutation at position functionally equivalent to Tyr497; (ii) at least one amino acid substitution mutation at a position functionally equivalent to Phe152, Val278, Met329, Val471, Thr514, Leu631, or Glu734 in the 9°N DNA polymerase amino acid sequence, and (iii) at least one amino acid substitution mutation at a position functionally equivalent to Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA polymerase amino acid sequence.
51. The polymerase of claim 50, wherein the polymerase comprises at least two, at least three, at least four, at least five, at least six, or seven amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Phe152, Val278, Met329, Val471, Thr514, Leu631, and Glu734 in the 9°N DNA polymerase amino acid sequence.
52. The polymerase of claim 50, wherein the polymerase comprises at least two, at least three, at least four, or five amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA polymerase amino acid sequence.
53. The polymerase of any one of claims 50-52, wherein the polymerase further comprises amino acid substitution mutations at positions functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
54. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:1, wherein the recombinant DNA polymerase comprises (i) an amino acid substitution mutation at position functionally equivalent to Tyr497, (ii) at least one amino acid substitution mutation at a position functionally equivalent to Lys476, Lys477, Thr514, Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence, and (iii) at least one amino acid substitution mutation at a position functionally equivalent to Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA polymerase amino acid sequence.
55. The polymerase of claim 54, wherein the polymerase comprises at least two, at least three, at least four, or five amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Lys476, Lys477, Thr514, Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence.
56. The polymerase of claim 54, wherein the polymerase comprises at least two, at least three, at least four, or five amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA polymerase amino acid sequence.
57. The polymerase of any one of claims 54-56, wherein the polymerase further comprises amino acid substitution mutations at positions functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
58. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:1, wherein the recombinant DNA polymerase comprises (i) an amino acid substitution mutation at position functionally equivalent to Tyr497, (ii) at least one amino acid substitution mutation at a position functionally equivalent to Phe152, Val278, Met329, Val471, Leu631, or Glu734 in the 9°N DNA polymerase amino acid sequence, (iii) at least one amino acid substitution mutation at a position functionally equivalent to Lys476, Lys477, Thr514, Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence, and (iv) at least one amino acid substitution mutation at a position functionally equivalent to Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA polymerase amino acid sequence.
59. The polymerase of claim 58, wherein the polymerase comprises at least two, at least three, at least four, at least five, or six amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Phe152, Val278, Met329, Val471, Leu631, and Glu734 in the 9°N DNA polymerase amino acid sequence.
60. The polymerase of claim 58, wherein the polymerase comprises at least two, at least three, at least four, or five amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Lys476, Lys477, Thr514, Ile521, or Thr590 in the 9°N DNA polymerase amino acid sequence.
61. The polymerase of claim 58, wherein the polymerase comprises at least two, at least three, at least four, or five amino acid substitution mutations at positions functionally equivalent to an amino acid selected from Arg247, Glu599, Lys620, His633, or Val661 in the 9°N DNA polymerase amino acid sequence.
62. The polymerase of any one of claims 58-61, wherein the polymerase further comprises amino acid substitution mutations at positions functionally equivalent to amino acids Met129, Asp141, Glu143, Cys223, Leu408, Tyr409, Pro410, and Ala485 in the 9°N DNA polymerase amino acid sequence.
63. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Phe152, Val278, Met329, Val471, and Thr514 in the 9°N DNA polymerase amino acid sequence.
64. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Met329, Val471, and Glu734 in the 9°N DNA polymerase amino acid sequence.
65. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Arg247, Glu599, and His633 in the 9°N DNA polymerase amino acid sequence.
66. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Arg247, Glu599, Lys620, and His633 in the 9°N DNA polymerase amino acid sequence.
67. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Met329, Thr514, Lys620, and Val661 in the 9°N DNA polymerase amino acid sequence.
68. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Val278, Val471, Arg247, Glu599, and His633 in the 9°N DNA polymerase amino acid sequence.
69. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Arg247, His633, and Val661 in the 9°N DNA polymerase amino acid sequence.
70. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Phe152, Val278, Val471, Arg247, Glu599, Lys620, His633, and Val661 in the 9°N DNA polymerase amino acid sequence.
71. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Val471, Thr514, Arg247, and Lys620 in the 9°N DNA polymerase amino acid sequence.
72. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Met329, Val471, Thr514, Arg247, and His633 in the 9°N DNA polymerase amino acid sequence.
73. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Val471, Thr514, Arg247, Glu599, and Lys620 in the 9°N DNA polymerase amino acid sequence.
74. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Val278, Met329, Val471, Arg247, and His633 in the 9°N DNA polymerase amino acid sequence.
75. A recombinant DNA polymerase comprising an amino acid sequence that is at least 80% identical to a 9°N DNA polymerase amino acid sequence SEQ ID NO:8, wherein the recombinant DNA polymerase comprises amino acid substitution mutations at positions functionally equivalent to Tyr497, Val278, Met329, Val471, Arg247, Glu599, Lys620, and His633 in the 9°N DNA polymerase amino acid sequence.
76. The polymerase of any one of claims 1-75, wherein the polymerase is a family B type DNA polymerase.
77. The polymerase of claim 76, wherein the polymerase is selected from the group consisting of a family B archaeal DNA polymerase, a human DNA polymerase-α, T4 polymerase, RB69 polymerase, and phi29 phage DNA polymerase.
78. The polymerase of claim 77, wherein the family B archaeal DNA polymerase is from a genus selected from the group consisting of Thermococcus, Pyrococcus, and Methanococcus.
79. The polymerase of any of claims 1-78, wherein the polymerase comprises reduced exonuclease activity as compared to a wild type polymerase.
80. A recombinant DNA polymerase comprising the amino acid sequence of any one of SEQ ID NOs:10-34.
81. A nucleic acid molecule encoding a polymerase as defined in any of claims 1-75 and 80.
82. An expression vector comprising the nucleic acid molecule of claim 81.
83. A host cell comprising the vector of claim 82.
84. A method for incorporating modified nucleotides into a growing DNA strand, the method comprising allowing the following components to interact: (i) a polymerase according to any one of claims 1-75 and 80, (ii) a DNA template, and (iii) a nucleotide solution.
85. The method of claim 84, wherein the DNA template comprises a clustered array.
86. A kit for performing a nucleotide incorporation reaction, the kit comprising: a polymerase as defined in any one of claims 1-75 and 80 and a nucleotide solution.
87. The kit of claim 86, wherein the nucleotide solution comprises labelled nucleotides.
88. The kit of claim 86, wherein the nucleotides comprise synthetic nucleotides.
89. The kit of claim 86, wherein the nucleotides comprise modified nucleotides.
90. The kit of claim 86, wherein the modified nucleotides have been modified at the 3' sugar hydroxyl such that the substituent is larger in size than the naturally occurring 3' hydroxyl group.
91. The kit of claim 89, wherein modified nucleotides comprise a modified nucleotide or nucleoside molecule comprising a purine or pyrimidine base and a ribose or deoxyribose sugar moiety having a removable 3'-OH blocking group covalently attached thereto, such that the 3' carbon atom has attached a group of the structure -O-Z, wherein Z is any of -C(R')2-O-R", -C(R')2-N(R")2, -C(R')2-N(H)R", -C(R')2-S-R" and -C(R')2-F, wherein each R" is or is part of a removable protecting group; each R' is independently a hydrogen atom, an alkyl, substituted alkyl, arylalkyl, alkenyl, alkynyl, aryl, heteroaryl, heterocyclic, acyl, cyano, alkoxy, aryloxy, heteroaryloxy or amido group, or a detectable label attached through a linking group; or (R')2 represents an alkylidene group of formula =C(R''')2 wherein each R''' may be the same or different and is selected from the group comprising hydrogen and halogen atoms and alkyl groups, and wherein said molecule may be reacted to yield an intermediate in which each R" is exchanged for H or, where Z is -C(R')2-F, the F is exchanged for OH, SH or NH2, preferably OH, which intermediate dissociates under aqueous conditions to afford a molecule with a free 3'OH, with the proviso that where Z is -C(R')2-S-R", both R' groups are not H.
92. The kit of claim 91, wherein R' of the modified nucleotide or nucleoside is an alkyl or substituted alkyl.
93. The kit of claim 91, wherein -Z of the modified nucleotide or nucleoside is of formula -C(R')2-N3.
94. The kit of claim 93, wherein Z is an azidomethyl group.
95. The kit of claim 89, wherein the modified nucleotides are fluorescently labelled to allow their detection.
96. The kit of claim 89, wherein the modified nucleotides comprise a nucleotide or nucleoside having a base attached to a detectable label via a cleavable linker.
97. The kit of claim 96, wherein the detectable label comprises a fluorescent label.
98. The kit of claim 86, further comprising one or more DNA template molecules and/or primers.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US62/753,558 | 2018-10-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
NZ770892A (en) | |