US20230374567A1 - Method for modifying a template double stranded polynucleotide
- Publication number
- US20230374567A1 (US18/194,062)
- Authority
- US
- United States
- Prior art keywords
- polynucleotide
- seq
- helicase
- pore
- hel308
- Legal status (an assumption, not a legal conclusion)
- Pending
Concepts
- 102000040430 polynucleotide Human genes 0.000 title claims abstract description 480
- 108091033319 polynucleotide Proteins 0.000 title claims abstract description 480
- 239000002157 polynucleotide Substances 0.000 title claims abstract description 480
- 238000000034 method Methods 0.000 title claims abstract description 151
- 108060004795 Methyltransferase Proteins 0.000 claims description 267
- 239000011148 porous material Substances 0.000 claims description 192
- 102100022536 Helicase POLQ-like Human genes 0.000 claims description 146
- 101000899334 Homo sapiens Helicase POLQ-like Proteins 0.000 claims description 146
- 239000012528 membrane Substances 0.000 claims description 118
- 239000000758 substrate Substances 0.000 claims description 102
- 102000008579 Transposases Human genes 0.000 claims description 70
- 108010020764 Transposases Proteins 0.000 claims description 70
- 102000004169 proteins and genes Human genes 0.000 claims description 46
- 108090000623 proteins and genes Proteins 0.000 claims description 46
- 230000035772 mutation Effects 0.000 claims description 42
- 239000012634 fragment Substances 0.000 claims description 39
- 230000008878 coupling Effects 0.000 claims description 29
- 238000010168 coupling process Methods 0.000 claims description 29
- 238000005859 coupling reaction Methods 0.000 claims description 29
- 238000005259 measurement Methods 0.000 claims description 25
- 108060002716 Exonuclease Proteins 0.000 claims description 6
- 102000013165 exonuclease Human genes 0.000 claims description 6
- 108010078791 Carrier Proteins Proteins 0.000 claims description 2
- 108010077544 Chromatin Proteins 0.000 claims description 2
- 102100039095 Chromatin-remodeling ATPase INO80 Human genes 0.000 claims description 2
- 101001033682 Homo sapiens Chromatin-remodeling ATPase INO80 Proteins 0.000 claims description 2
- 101710196562 RNA helicase NPH-II Proteins 0.000 claims description 2
- 210000003483 chromatin Anatomy 0.000 claims description 2
- 238000007634 remodeling Methods 0.000 claims description 2
- 238000012512 characterization method Methods 0.000 abstract description 17
- 238000007672 fourth generation sequencing Methods 0.000 abstract description 3
- 125000003729 nucleotide group Chemical group 0.000 description 158
- 239000002773 nucleotide Substances 0.000 description 152
- 235000001014 amino acid Nutrition 0.000 description 147
- 229940024606 amino acid Drugs 0.000 description 146
- 150000001413 amino acids Chemical class 0.000 description 139
- 239000000523 sample Substances 0.000 description 93
- 102220580933 Induced myeloid leukemia cell differentiation protein Mcl-1_F56V_mutation Human genes 0.000 description 64
- 150000002632 lipids Chemical class 0.000 description 54
- 102000004190 Enzymes Human genes 0.000 description 52
- 108090000790 Enzymes Proteins 0.000 description 52
- 125000003275 alpha amino acid group Chemical group 0.000 description 52
- 125000000151 cysteine group Chemical group N[C@@H](CS)C(=O)* 0.000 description 51
- 102000053602 DNA Human genes 0.000 description 46
- 108020004414 DNA Proteins 0.000 description 46
- 239000004327 boric acid Substances 0.000 description 45
- 235000018102 proteins Nutrition 0.000 description 45
- 239000004328 sodium tetraborate Substances 0.000 description 41
- 239000010410 layer Substances 0.000 description 39
- 230000004048 modification Effects 0.000 description 37
- 238000012986 modification Methods 0.000 description 37
- 230000027455 binding Effects 0.000 description 36
- 235000018417 cysteine Nutrition 0.000 description 35
- -1 polymerases Proteins 0.000 description 34
- 239000000232 Lipid Bilayer Substances 0.000 description 31
- 239000000178 monomer Substances 0.000 description 29
- 108090000765 processed proteins & peptides Proteins 0.000 description 29
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 26
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 26
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 23
- 230000000295 complement effect Effects 0.000 description 22
- 239000003550 marker Substances 0.000 description 22
- 102000004196 processed proteins & peptides Human genes 0.000 description 22
- 229910052799 carbon Inorganic materials 0.000 description 21
- 239000002777 nucleoside Substances 0.000 description 21
- 229920001184 polypeptide Polymers 0.000 description 21
- NYHBQMYGNKIUIF-UUOKFMHZSA-N Guanosine Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O NYHBQMYGNKIUIF-UUOKFMHZSA-N 0.000 description 20
- OIRDTQYFTABQOQ-KQYNXXCUSA-N adenosine Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O OIRDTQYFTABQOQ-KQYNXXCUSA-N 0.000 description 20
- 239000002342 ribonucleoside Substances 0.000 description 20
- 238000012163 sequencing technique Methods 0.000 description 20
- 102220533243 Glycophorin-B_Y51A_mutation Human genes 0.000 description 19
- 102000035160 transmembrane proteins Human genes 0.000 description 18
- 108091005703 transmembrane proteins Proteins 0.000 description 18
- 102000014914 Carrier Proteins Human genes 0.000 description 17
- 108010090804 Streptavidin Proteins 0.000 description 17
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 17
- 108091008324 binding proteins Proteins 0.000 description 17
- 239000000872 buffer Substances 0.000 description 17
- 238000006243 chemical reaction Methods 0.000 description 17
- 102000039446 nucleic acids Human genes 0.000 description 17
- 108020004707 nucleic acids Proteins 0.000 description 17
- 150000007523 nucleic acids Chemical class 0.000 description 17
- 229910052720 vanadium Inorganic materials 0.000 description 17
- XUJNEKJLAYXESH-REOHCLBHSA-N L-Cysteine Chemical compound SC[C@H](N)C(O)=O XUJNEKJLAYXESH-REOHCLBHSA-N 0.000 description 16
- HVYWMOMLDIMFJA-DPAQBDIFSA-N cholesterol Chemical compound C1C=C2C[C@@H](O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H]([C@H](C)CCCC(C)C)[C@@]1(C)CC2 HVYWMOMLDIMFJA-DPAQBDIFSA-N 0.000 description 16
- 239000005549 deoxyribonucleoside Substances 0.000 description 16
- 229960002685 biotin Drugs 0.000 description 15
- 239000011616 biotin Substances 0.000 description 15
- 229910052731 fluorine Inorganic materials 0.000 description 15
- 229920001223 polyethylene glycol Polymers 0.000 description 15
- 238000006467 substitution reaction Methods 0.000 description 15
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 14
- GYOZYWVXFNDGLU-XLPZGREQSA-N dTMP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)C1 GYOZYWVXFNDGLU-XLPZGREQSA-N 0.000 description 14
- 229910052739 hydrogen Inorganic materials 0.000 description 14
- 150000003833 nucleoside derivatives Chemical class 0.000 description 14
- 229920002477 rna polymer Polymers 0.000 description 14
- 150000003839 salts Chemical class 0.000 description 14
- 239000000126 substance Substances 0.000 description 14
- 229910052717 sulfur Inorganic materials 0.000 description 14
- YKBGVTZYEHREMT-KVQBGUIXSA-N 2'-deoxyguanosine Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](CO)O1 YKBGVTZYEHREMT-KVQBGUIXSA-N 0.000 description 13
- WCUXLLCKKVVCTQ-UHFFFAOYSA-M Potassium chloride Chemical compound [Cl-].[K+] WCUXLLCKKVVCTQ-UHFFFAOYSA-M 0.000 description 13
- 235000020958 biotin Nutrition 0.000 description 13
- 229920001400 block copolymer Polymers 0.000 description 13
- 239000000203 mixture Substances 0.000 description 13
- XEBWQGVWTUSTLN-UHFFFAOYSA-M phenylmercury acetate Chemical compound CC(=O)O[Hg]C1=CC=CC=C1 XEBWQGVWTUSTLN-UHFFFAOYSA-M 0.000 description 13
- 239000011780 sodium chloride Substances 0.000 description 13
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 12
- CKTSBUTUHBMZGZ-SHYZEUOFSA-N 2'‐deoxycytidine Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 CKTSBUTUHBMZGZ-SHYZEUOFSA-N 0.000 description 12
- ZKHQWZAMYRWXGA-KQYNXXCUSA-J ATP(4-) Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KQYNXXCUSA-J 0.000 description 12
- ZKHQWZAMYRWXGA-UHFFFAOYSA-N Adenosine triphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)C(O)C1O ZKHQWZAMYRWXGA-UHFFFAOYSA-N 0.000 description 12
- 102000012410 DNA Ligases Human genes 0.000 description 12
- 108010061982 DNA Ligases Proteins 0.000 description 12
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 12
- 230000015572 biosynthetic process Effects 0.000 description 12
- 239000003153 chemical reaction reagent Substances 0.000 description 12
- 230000000694 effects Effects 0.000 description 12
- 230000002209 hydrophobic effect Effects 0.000 description 12
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 12
- 239000002202 Polyethylene glycol Substances 0.000 description 11
- 229910052740 iodine Inorganic materials 0.000 description 11
- 229920000428 triblock copolymer Polymers 0.000 description 11
- 108091006146 Channels Proteins 0.000 description 10
- 241000588724 Escherichia coli Species 0.000 description 10
- 230000006870 function Effects 0.000 description 10
- 239000011859 microparticle Substances 0.000 description 10
- 235000000346 sugar Nutrition 0.000 description 10
- 230000001052 transient effect Effects 0.000 description 10
- 229910052727 yttrium Inorganic materials 0.000 description 10
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 9
- 239000004971 Cross linker Substances 0.000 description 9
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 9
- 239000001257 hydrogen Substances 0.000 description 9
- 229920000642 polymer Polymers 0.000 description 9
- 239000000047 product Substances 0.000 description 9
- KHWCHTKSEGGWEX-RRKCRQDMSA-N 2'-deoxyadenosine 5'-monophosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(O)=O)O1 KHWCHTKSEGGWEX-RRKCRQDMSA-N 0.000 description 8
- NCMVOABPESMRCP-SHYZEUOFSA-N 2'-deoxycytosine 5'-monophosphate Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)C1 NCMVOABPESMRCP-SHYZEUOFSA-N 0.000 description 8
- LTFMZDNNPPEQNG-KVQBGUIXSA-N 2'-deoxyguanosine 5'-monophosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@H]1C[C@H](O)[C@@H](COP(O)(O)=O)O1 LTFMZDNNPPEQNG-KVQBGUIXSA-N 0.000 description 8
- CKTSBUTUHBMZGZ-UHFFFAOYSA-N Deoxycytidine Natural products O=C1N=C(N)C=CN1C1OC(CO)C(O)C1 CKTSBUTUHBMZGZ-UHFFFAOYSA-N 0.000 description 8
- TWRXJAOTZQYOKJ-UHFFFAOYSA-L Magnesium chloride Chemical compound [Mg+2].[Cl-].[Cl-] TWRXJAOTZQYOKJ-UHFFFAOYSA-L 0.000 description 8
- 239000007983 Tris buffer Substances 0.000 description 8
- 150000001540 azides Chemical class 0.000 description 8
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 8
- 235000012000 cholesterol Nutrition 0.000 description 8
- 238000010438 heat treatment Methods 0.000 description 8
- FDGQSTZJBFJUBT-UHFFFAOYSA-N hypoxanthine Chemical compound O=C1NC=NC2=C1NC=N2 FDGQSTZJBFJUBT-UHFFFAOYSA-N 0.000 description 8
- 230000003993 interaction Effects 0.000 description 8
- 239000000463 material Substances 0.000 description 8
- 239000007787 solid Substances 0.000 description 8
- 239000000243 solution Substances 0.000 description 8
- 150000003573 thiols Chemical class 0.000 description 8
- 229940104230 thymidine Drugs 0.000 description 8
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 8
- 210000004027 cell Anatomy 0.000 description 7
- 239000013068 control sample Substances 0.000 description 7
- JSRLJPSBLDHEIO-SHYZEUOFSA-N dUMP Chemical compound O1[C@H](COP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(=O)NC(=O)C=C1 JSRLJPSBLDHEIO-SHYZEUOFSA-N 0.000 description 7
- 238000012217 deletion Methods 0.000 description 7
- 230000037430 deletion Effects 0.000 description 7
- 238000000338 in vitro Methods 0.000 description 7
- 238000011534 incubation Methods 0.000 description 7
- 229910052757 nitrogen Inorganic materials 0.000 description 7
- 239000002245 particle Substances 0.000 description 7
- 125000006850 spacer group Chemical group 0.000 description 7
- 229910052721 tungsten Inorganic materials 0.000 description 7
- NZJKEQFPRPAEPO-UHFFFAOYSA-N 1h-benzimidazol-4-amine Chemical compound NC1=CC=CC2=C1N=CN2 NZJKEQFPRPAEPO-UHFFFAOYSA-N 0.000 description 6
- YZEUHQHUFTYLPH-UHFFFAOYSA-N 2-nitroimidazole Chemical compound [O-][N+](=O)C1=NC=CN1 YZEUHQHUFTYLPH-UHFFFAOYSA-N 0.000 description 6
- NEJMFSBXFBFELK-UHFFFAOYSA-N 4-nitro-1h-benzimidazole Chemical compound [O-][N+](=O)C1=CC=CC2=C1N=CN2 NEJMFSBXFBFELK-UHFFFAOYSA-N 0.000 description 6
- LAVZKLJDKGRZJG-UHFFFAOYSA-N 4-nitro-1h-indole Chemical compound [O-][N+](=O)C1=CC=CC2=C1C=CN2 LAVZKLJDKGRZJG-UHFFFAOYSA-N 0.000 description 6
- XORHNJQEWQGXCN-UHFFFAOYSA-N 4-nitro-1h-pyrazole Chemical compound [O-][N+](=O)C=1C=NNC=1 XORHNJQEWQGXCN-UHFFFAOYSA-N 0.000 description 6
- WSGURAYTCUVDQL-UHFFFAOYSA-N 5-nitro-1h-indazole Chemical compound [O-][N+](=O)C1=CC=C2NN=CC2=C1 WSGURAYTCUVDQL-UHFFFAOYSA-N 0.000 description 6
- OZFPSOBLQZPIAV-UHFFFAOYSA-N 5-nitro-1h-indole Chemical compound [O-][N+](=O)C1=CC=C2NC=CC2=C1 OZFPSOBLQZPIAV-UHFFFAOYSA-N 0.000 description 6
- PSWCIARYGITEOY-UHFFFAOYSA-N 6-nitro-1h-indole Chemical compound [O-][N+](=O)C1=CC=C2C=CNC2=C1 PSWCIARYGITEOY-UHFFFAOYSA-N 0.000 description 6
- 241000588921 Enterobacteriaceae Species 0.000 description 6
- 108091093037 Peptide nucleic acid Proteins 0.000 description 6
- DJJCXFVJDGTHFX-UHFFFAOYSA-N Uridinemonophosphate Natural products OC1C(O)C(COP(O)(O)=O)OC1N1C(=O)NC(=O)C=C1 DJJCXFVJDGTHFX-UHFFFAOYSA-N 0.000 description 6
- 239000007864 aqueous solution Substances 0.000 description 6
- IERHLVCPSMICTF-XVFCMESISA-N cytidine 5'-monophosphate Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(O)=O)O1 IERHLVCPSMICTF-XVFCMESISA-N 0.000 description 6
- IERHLVCPSMICTF-UHFFFAOYSA-N cytidine monophosphate Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(COP(O)(O)=O)O1 IERHLVCPSMICTF-UHFFFAOYSA-N 0.000 description 6
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 6
- 239000012530 fluid Substances 0.000 description 6
- RQFCJASXJCIDSX-UUOKFMHZSA-N guanosine 5'-monophosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@H]1O RQFCJASXJCIDSX-UUOKFMHZSA-N 0.000 description 6
- 235000013928 guanylic acid Nutrition 0.000 description 6
- 150000002500 ions Chemical class 0.000 description 6
- 238000004519 manufacturing process Methods 0.000 description 6
- 125000003835 nucleoside group Chemical group 0.000 description 6
- 125000001997 phenyl group Chemical group [H]C1=C([H])C([H])=C(*)C([H])=C1[H] 0.000 description 6
- 239000001103 potassium chloride Substances 0.000 description 6
- 235000011164 potassium chloride Nutrition 0.000 description 6
- DJJCXFVJDGTHFX-XVFCMESISA-N uridine 5'-monophosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 DJJCXFVJDGTHFX-XVFCMESISA-N 0.000 description 6
- NEMHIKRLROONTL-QMMMGPOBSA-N (2s)-2-azaniumyl-3-(4-azidophenyl)propanoate Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N=[N+]=[N-])C=C1 NEMHIKRLROONTL-QMMMGPOBSA-N 0.000 description 5
- 108700035208 EC 7.-.-.- Proteins 0.000 description 5
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 5
- 102000003960 Ligases Human genes 0.000 description 5
- 108090000364 Ligases Proteins 0.000 description 5
- 108010076504 Protein Sorting Signals Proteins 0.000 description 5
- 102220608764 Sorting nexin-29_Y51L_mutation Human genes 0.000 description 5
- 108091046915 Threose nucleic acid Proteins 0.000 description 5
- 239000002253 acid Substances 0.000 description 5
- UDMBCSSLTHHNCD-KQYNXXCUSA-N adenosine 5'-monophosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(O)=O)[C@@H](O)[C@H]1O UDMBCSSLTHHNCD-KQYNXXCUSA-N 0.000 description 5
- 150000001345 alkine derivatives Chemical class 0.000 description 5
- 125000000539 amino acid group Chemical group 0.000 description 5
- 239000012472 biological sample Substances 0.000 description 5
- 230000008859 change Effects 0.000 description 5
- VGONTNSXDCQUGY-UHFFFAOYSA-N desoxyinosine Natural products C1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 VGONTNSXDCQUGY-UHFFFAOYSA-N 0.000 description 5
- 125000001183 hydrocarbyl group Chemical group 0.000 description 5
- 239000000543 intermediate Substances 0.000 description 5
- 239000000787 lecithin Substances 0.000 description 5
- 239000003446 ligand Substances 0.000 description 5
- 239000002502 liposome Substances 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 238000000746 purification Methods 0.000 description 5
- 239000002904 solvent Substances 0.000 description 5
- VGONTNSXDCQUGY-RRKCRQDMSA-N 2'-deoxyinosine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(N=CNC2=O)=C2N=C1 VGONTNSXDCQUGY-RRKCRQDMSA-N 0.000 description 4
- LOJNBPNACKZWAI-UHFFFAOYSA-N 3-nitro-1h-pyrrole Chemical compound [O-][N+](=O)C=1C=CNC=1 LOJNBPNACKZWAI-UHFFFAOYSA-N 0.000 description 4
- 208000035657 Abasia Diseases 0.000 description 4
- 101710092462 Alpha-hemolysin Proteins 0.000 description 4
- 239000001904 Arabinogalactan Substances 0.000 description 4
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 4
- IVOMOUWHDPKRLL-KQYNXXCUSA-N Cyclic adenosine monophosphate Chemical compound C([C@H]1O2)OP(O)(=O)O[C@H]1[C@@H](O)[C@@H]2N1C(N=CN=C2N)=C2N=C1 IVOMOUWHDPKRLL-KQYNXXCUSA-N 0.000 description 4
- 229920000858 Cyclodextrin Polymers 0.000 description 4
- 102100033215 DNA nucleotidylexotransferase Human genes 0.000 description 4
- 108010008286 DNA nucleotidylexotransferase Proteins 0.000 description 4
- RTZKZFJDLAIYFH-UHFFFAOYSA-N Diethyl ether Chemical compound CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 description 4
- 108091093094 Glycol nucleic acid Proteins 0.000 description 4
- HNDVDQJCIGZPNO-YFKPBYRVSA-N L-histidine Chemical compound OC(=O)[C@@H](N)CC1=CN=CN1 HNDVDQJCIGZPNO-YFKPBYRVSA-N 0.000 description 4
- 101710174798 Lysenin Proteins 0.000 description 4
- 108010052285 Membrane Proteins Proteins 0.000 description 4
- 229910019142 PO4 Inorganic materials 0.000 description 4
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 4
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 4
- 239000000427 antigen Substances 0.000 description 4
- 108091007433 antigens Proteins 0.000 description 4
- 102000036639 antigens Human genes 0.000 description 4
- 125000004429 atom Chemical group 0.000 description 4
- 230000008901 benefit Effects 0.000 description 4
- 102220418127 c.164A>G Human genes 0.000 description 4
- AIYUHDOJVYHVIT-UHFFFAOYSA-M caesium chloride Chemical compound [Cl-].[Cs+] AIYUHDOJVYHVIT-UHFFFAOYSA-M 0.000 description 4
- 239000004303 calcium sorbate Substances 0.000 description 4
- 229940106189 ceramide Drugs 0.000 description 4
- 239000002800 charge carrier Substances 0.000 description 4
- 125000003636 chemical group Chemical group 0.000 description 4
- ZOOGRGPOEVQQDX-KHLHZJAASA-N cyclic guanosine monophosphate Chemical compound C([C@H]1O2)O[P@](O)(=O)O[C@@H]1[C@H](O)[C@H]2N1C(N=C(NC2=O)N)=C2N=C1 ZOOGRGPOEVQQDX-KHLHZJAASA-N 0.000 description 4
- 230000007831 electrophysiology Effects 0.000 description 4
- 238000002001 electrophysiology Methods 0.000 description 4
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 4
- IPCSVZSSVZVIGE-UHFFFAOYSA-M hexadecanoate Chemical compound CCCCCCCCCCCCCCCC([O-])=O IPCSVZSSVZVIGE-UHFFFAOYSA-M 0.000 description 4
- IPCSVZSSVZVIGE-UHFFFAOYSA-N hexadecanoic acid Chemical compound CCCCCCCCCCCCCCCC(O)=O IPCSVZSSVZVIGE-UHFFFAOYSA-N 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 235000018977 lysine Nutrition 0.000 description 4
- 229910001629 magnesium chloride Inorganic materials 0.000 description 4
- 239000011159 matrix material Substances 0.000 description 4
- 230000003287 optical effect Effects 0.000 description 4
- 239000003960 organic solvent Substances 0.000 description 4
- 229940068917 polyethylene glycols Drugs 0.000 description 4
- 238000003752 polymerase chain reaction Methods 0.000 description 4
- 239000000276 potassium ferrocyanide Substances 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- 102220282693 rs1308150896 Human genes 0.000 description 4
- 102220217675 rs370859268 Human genes 0.000 description 4
- 239000002356 single layer Substances 0.000 description 4
- 238000010561 standard procedure Methods 0.000 description 4
- 210000002784 stomach Anatomy 0.000 description 4
- XOGGUFAVLNCTRS-UHFFFAOYSA-N tetrapotassium;iron(2+);hexacyanide Chemical compound [K+].[K+].[K+].[K+].[Fe+2].N#[C-].N#[C-].N#[C-].N#[C-].N#[C-].N#[C-] XOGGUFAVLNCTRS-UHFFFAOYSA-N 0.000 description 4
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 4
- 230000017105 transposition Effects 0.000 description 4
- LWIHDJKSTIGBAC-UHFFFAOYSA-K tripotassium phosphate Chemical compound [K+].[K+].[K+].[O-]P([O-])([O-])=O LWIHDJKSTIGBAC-UHFFFAOYSA-K 0.000 description 4
- JWDFQMWEFLOOED-UHFFFAOYSA-N (2,5-dioxopyrrolidin-1-yl) 3-(pyridin-2-yldisulfanyl)propanoate Chemical compound O=C1CCC(=O)N1OC(=O)CCSSC1=CC=CC=N1 JWDFQMWEFLOOED-UHFFFAOYSA-N 0.000 description 3
- AUTOLBMXDDTRRT-JGVFFNPUSA-N (4R,5S)-dethiobiotin Chemical compound C[C@@H]1NC(=O)N[C@@H]1CCCCCC(O)=O AUTOLBMXDDTRRT-JGVFFNPUSA-N 0.000 description 3
- XQUPVDVFXZDTLT-UHFFFAOYSA-N 1-[4-[[4-(2,5-dioxopyrrol-1-yl)phenyl]methyl]phenyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1C(C=C1)=CC=C1CC1=CC=C(N2C(C=CC2=O)=O)C=C1 XQUPVDVFXZDTLT-UHFFFAOYSA-N 0.000 description 3
- 108091006112 ATPases Proteins 0.000 description 3
- 102000057290 Adenosine Triphosphatases Human genes 0.000 description 3
- 239000004475 Arginine Substances 0.000 description 3
- 241000894006 Bacteria Species 0.000 description 3
- UDMBCSSLTHHNCD-UHFFFAOYSA-N Coenzym Q(11) Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(O)=O)C(O)C1O UDMBCSSLTHHNCD-UHFFFAOYSA-N 0.000 description 3
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 3
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 3
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Natural products NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 3
- UGQMRVRMYYASKQ-UHFFFAOYSA-N Hypoxanthine nucleoside Natural products OC1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 UGQMRVRMYYASKQ-UHFFFAOYSA-N 0.000 description 3
- ODKSFYDXXFIFQN-BYPYZUCNSA-P L-argininium(2+) Chemical compound NC(=[NH2+])NCCC[C@H]([NH3+])C(O)=O ODKSFYDXXFIFQN-BYPYZUCNSA-P 0.000 description 3
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 3
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 3
- 239000004472 Lysine Substances 0.000 description 3
- PEEHTFAAVSWFBL-UHFFFAOYSA-N Maleimide Chemical compound O=C1NC(=O)C=C1 PEEHTFAAVSWFBL-UHFFFAOYSA-N 0.000 description 3
- 241001486996 Methanocaldococcus Species 0.000 description 3
- 241000205276 Methanosarcina Species 0.000 description 3
- 101710144111 Non-structural protein 3 Proteins 0.000 description 3
- 101710163270 Nuclease Proteins 0.000 description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 3
- XYFCBTPGUUZFHI-UHFFFAOYSA-N Phosphine Natural products P XYFCBTPGUUZFHI-UHFFFAOYSA-N 0.000 description 3
- RWRDLPDLKQPQOW-UHFFFAOYSA-N Pyrrolidine Chemical compound C1CCNC1 RWRDLPDLKQPQOW-UHFFFAOYSA-N 0.000 description 3
- 239000004283 Sodium sorbate Substances 0.000 description 3
- 241000205188 Thermococcus Species 0.000 description 3
- 101710183280 Topoisomerase Proteins 0.000 description 3
- LNQVTSROQXJCDD-UHFFFAOYSA-N adenosine monophosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(CO)C(OP(O)(O)=O)C1O LNQVTSROQXJCDD-UHFFFAOYSA-N 0.000 description 3
- 230000003321 amplification Effects 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 3
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 3
- 125000003118 aryl group Chemical group 0.000 description 3
- 238000000231 atomic layer deposition Methods 0.000 description 3
- 230000004888 barrier function Effects 0.000 description 3
- 239000002585 base Substances 0.000 description 3
- 239000011324 bead Substances 0.000 description 3
- 239000004330 calcium propionate Substances 0.000 description 3
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 3
- 210000000170 cell membrane Anatomy 0.000 description 3
- 150000001783 ceramides Chemical class 0.000 description 3
- 150000001875 compounds Chemical class 0.000 description 3
- 229920001577 copolymer Polymers 0.000 description 3
- 229940104302 cytosine Drugs 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- VHJLVAABSRFDPM-QWWZWVQMSA-N dithiothreitol Chemical compound SC[C@@H](O)[C@H](O)CS VHJLVAABSRFDPM-QWWZWVQMSA-N 0.000 description 3
- 238000005516 engineering process Methods 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 239000001530 fumaric acid Substances 0.000 description 3
- 125000000524 functional group Chemical group 0.000 description 3
- 230000002779 inactivation Effects 0.000 description 3
- 230000000670 limiting effect Effects 0.000 description 3
- 125000003588 lysine group Chemical group [H]N([H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C([H])(N([H])[H])C(*)=O 0.000 description 3
- 230000001404 mediated effect Effects 0.000 description 3
- 229910052751 metal Inorganic materials 0.000 description 3
- 239000002184 metal Substances 0.000 description 3
- 239000002105 nanoparticle Substances 0.000 description 3
- 238000003199 nucleic acid amplification method Methods 0.000 description 3
- 239000004177 patent blue V Substances 0.000 description 3
- 239000010452 phosphate Substances 0.000 description 3
- 239000008363 phosphate buffer Substances 0.000 description 3
- 239000004331 potassium propionate Substances 0.000 description 3
- 230000009257 reactivity Effects 0.000 description 3
- 239000002151 riboflavin Substances 0.000 description 3
- 102200160490 rs1800299 Human genes 0.000 description 3
- 102220289580 rs33916541 Human genes 0.000 description 3
- 102200114133 rs368386747 Human genes 0.000 description 3
- 102200026914 rs730882246 Human genes 0.000 description 3
- 102200037599 rs749038326 Human genes 0.000 description 3
- 102220143003 rs753997345 Human genes 0.000 description 3
- 235000004400 serine Nutrition 0.000 description 3
- 239000004324 sodium propionate Substances 0.000 description 3
- 239000004250 tert-Butylhydroquinone Substances 0.000 description 3
- 239000003053 toxin Substances 0.000 description 3
- 239000001226 triphosphate Substances 0.000 description 3
- 235000011178 triphosphate Nutrition 0.000 description 3
- UNXRWKVEANCORM-UHFFFAOYSA-N triphosphoric acid Chemical compound OP(O)(=O)OP(O)(=O)OP(O)(O)=O UNXRWKVEANCORM-UHFFFAOYSA-N 0.000 description 3
- 229960004441 tyrosine Drugs 0.000 description 3
- 229930195735 unsaturated hydrocarbon Natural products 0.000 description 3
- OILXMJHPFNGGTO-UHFFFAOYSA-N (22E)-(24xi)-24-methylcholesta-5,22-dien-3beta-ol Natural products C1C=C2CC(O)CCC2(C)C2C1C1CCC(C(C)C=CC(C)C(C)C)C1(C)CC2 OILXMJHPFNGGTO-UHFFFAOYSA-N 0.000 description 2
- WETFHJRYOTYZFD-YIZRAAEISA-N (2r,3s,5s)-2-(hydroxymethyl)-5-(3-nitropyrrol-1-yl)oxolan-3-ol Chemical compound C1[C@H](O)[C@@H](CO)O[C@@H]1N1C=C([N+]([O-])=O)C=C1 WETFHJRYOTYZFD-YIZRAAEISA-N 0.000 description 2
- WRIDQFICGBMAFQ-UHFFFAOYSA-N (E)-8-Octadecenoic acid Natural products CCCCCCCCCC=CCCCCCCC(O)=O WRIDQFICGBMAFQ-UHFFFAOYSA-N 0.000 description 2
- TZCPCKNHXULUIY-RGULYWFUSA-N 1,2-distearoyl-sn-glycero-3-phosphoserine Chemical compound CCCCCCCCCCCCCCCCCC(=O)OC[C@H](COP(O)(=O)OC[C@H](N)C(O)=O)OC(=O)CCCCCCCCCCCCCCCCC TZCPCKNHXULUIY-RGULYWFUSA-N 0.000 description 2
- AASYSXRGODIQGY-UHFFFAOYSA-N 1-[1-(2,5-dioxopyrrol-1-yl)hexyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1C(CCCCC)N1C(=O)C=CC1=O AASYSXRGODIQGY-UHFFFAOYSA-N 0.000 description 2
- SGVWDRVQIYUSRA-UHFFFAOYSA-N 1-[2-[2-(2,5-dioxopyrrol-1-yl)ethyldisulfanyl]ethyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1CCSSCCN1C(=O)C=CC1=O SGVWDRVQIYUSRA-UHFFFAOYSA-N 0.000 description 2
- WHEOHCIKAJUSJC-UHFFFAOYSA-N 1-[2-[bis[2-(2,5-dioxopyrrol-1-yl)ethyl]amino]ethyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1CCN(CCN1C(C=CC1=O)=O)CCN1C(=O)C=CC1=O WHEOHCIKAJUSJC-UHFFFAOYSA-N 0.000 description 2
- WXXSHAKLDCERGU-UHFFFAOYSA-N 1-[4-(2,5-dioxopyrrol-1-yl)butyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1CCCCN1C(=O)C=CC1=O WXXSHAKLDCERGU-UHFFFAOYSA-N 0.000 description 2
- QRZUPJILJVGUFF-UHFFFAOYSA-N 2,8-dibenzylcyclooctan-1-one Chemical compound C1CCCCC(CC=2C=CC=CC=2)C(=O)C1CC1=CC=CC=C1 QRZUPJILJVGUFF-UHFFFAOYSA-N 0.000 description 2
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 2
- FZWGECJQACGGTI-UHFFFAOYSA-N 2-amino-7-methyl-1,7-dihydro-6H-purin-6-one Chemical compound NC1=NC(O)=C2N(C)C=NC2=N1 FZWGECJQACGGTI-UHFFFAOYSA-N 0.000 description 2
- TWJNQYPJQDRXPH-UHFFFAOYSA-N 2-cyanobenzohydrazide Chemical compound NNC(=O)C1=CC=CC=C1C#N TWJNQYPJQDRXPH-UHFFFAOYSA-N 0.000 description 2
- ASJSAQIRZKANQN-CRCLSJGQSA-N 2-deoxy-D-ribose Chemical compound OC[C@@H](O)[C@@H](O)CC=O ASJSAQIRZKANQN-CRCLSJGQSA-N 0.000 description 2
- LQJBNNIYVWPHFW-UHFFFAOYSA-N 20:1omega9c fatty acid Natural products CCCCCCCCCCC=CCCCCCCCC(O)=O LQJBNNIYVWPHFW-UHFFFAOYSA-N 0.000 description 2
- KIUMMUBSPKGMOY-UHFFFAOYSA-N 3,3'-Dithiobis(6-nitrobenzoic acid) Chemical group C1=C([N+]([O-])=O)C(C(=O)O)=CC(SSC=2C=C(C(=CC=2)[N+]([O-])=O)C(O)=O)=C1 KIUMMUBSPKGMOY-UHFFFAOYSA-N 0.000 description 2
- FPQQSJJWHUJYPU-UHFFFAOYSA-N 3-(dimethylamino)propyliminomethylidene-ethylazanium;chloride Chemical group Cl.CCN=C=NCCCN(C)C FPQQSJJWHUJYPU-UHFFFAOYSA-N 0.000 description 2
- FSASIHFSFGAIJM-UHFFFAOYSA-N 3-methyladenine Chemical compound CN1C=NC(N)=C2N=CN=C12 FSASIHFSFGAIJM-UHFFFAOYSA-N 0.000 description 2
- XTWYTFMLZFPYCI-KQYNXXCUSA-N 5'-adenylphosphoric acid Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O XTWYTFMLZFPYCI-KQYNXXCUSA-N 0.000 description 2
- OFJNVANOCZHTMW-UHFFFAOYSA-N 5-hydroxyuracil Chemical compound OC1=CNC(=O)NC1=O OFJNVANOCZHTMW-UHFFFAOYSA-N 0.000 description 2
- DPRSKJHWKNHBOW-UHFFFAOYSA-N 7-Deazainosine Natural products OC1C(O)C(CO)OC1N1C(NC=NC2=O)=C2C=C1 DPRSKJHWKNHBOW-UHFFFAOYSA-N 0.000 description 2
- OQMZNAMGEHIHNN-UHFFFAOYSA-N 7-Dehydrostigmasterol Natural products C1C(O)CCC2(C)C(CCC3(C(C(C)C=CC(CC)C(C)C)CCC33)C)C3=CC=C21 OQMZNAMGEHIHNN-UHFFFAOYSA-N 0.000 description 2
- LSMBOEFDMAIXTM-UUOKFMHZSA-N 7-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-1h-imidazo[4,5-d]triazin-4-one Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NN=NC(O)=C2N=C1 LSMBOEFDMAIXTM-UUOKFMHZSA-N 0.000 description 2
- DPRSKJHWKNHBOW-KCGFPETGSA-N 7-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-1h-pyrrolo[2,3-d]pyrimidin-4-one Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(NC=NC2=O)=C2C=C1 DPRSKJHWKNHBOW-KCGFPETGSA-N 0.000 description 2
- QFFLRMDXYQOYKO-KVQBGUIXSA-N 7-[(2r,4s,5r)-4-hydroxy-5-(hydroxymethyl)oxolan-2-yl]-1h-imidazo[4,5-d]triazin-4-one Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NN=NC(O)=C2N=C1 QFFLRMDXYQOYKO-KVQBGUIXSA-N 0.000 description 2
- RGKBRPAAQSHTED-UHFFFAOYSA-N 8-oxoadenine Chemical compound NC1=NC=NC2=C1NC(=O)N2 RGKBRPAAQSHTED-UHFFFAOYSA-N 0.000 description 2
- QSBYPNXLFMSGKH-UHFFFAOYSA-N 9-Heptadecenoic acid Natural products CCCCCCCC=CCCCCCCCC(O)=O QSBYPNXLFMSGKH-UHFFFAOYSA-N 0.000 description 2
- 101000777504 Actinia fragacea DELTA-actitoxin-Afr1a Proteins 0.000 description 2
- XTWYTFMLZFPYCI-UHFFFAOYSA-N Adenosine diphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(O)=O)C(O)C1O XTWYTFMLZFPYCI-UHFFFAOYSA-N 0.000 description 2
- 241000607534 Aeromonas Species 0.000 description 2
- 108091023037 Aptamer Proteins 0.000 description 2
- 239000002126 C01EB10 - Adenosine Substances 0.000 description 2
- ZWIADYZPOWUWEW-XVFCMESISA-N CDP Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O1 ZWIADYZPOWUWEW-XVFCMESISA-N 0.000 description 2
- 108020004705 Codon Proteins 0.000 description 2
- PCDQPRRSZKQHHS-CCXZUQQUSA-N Cytarabine Triphosphate Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 PCDQPRRSZKQHHS-CCXZUQQUSA-N 0.000 description 2
- AHCYMLUZIRLXAA-SHYZEUOFSA-N Deoxyuridine 5'-triphosphate Chemical compound O1[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(=O)NC(=O)C=C1 AHCYMLUZIRLXAA-SHYZEUOFSA-N 0.000 description 2
- 239000004258 Ethoxyquin Substances 0.000 description 2
- LYCAIKOWRPUZTN-UHFFFAOYSA-N Ethylene glycol Chemical compound OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 2
- 102100037123 Exosome RNA helicase MTR4 Human genes 0.000 description 2
- QGWNDRXFNXRZMB-UUOKFMHZSA-N GDP Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O QGWNDRXFNXRZMB-UUOKFMHZSA-N 0.000 description 2
- JZNWSCPGTDBMEW-UHFFFAOYSA-N Glycerophosphorylethanolamine Natural products NCCOP(O)(=O)OCC(O)CO JZNWSCPGTDBMEW-UHFFFAOYSA-N 0.000 description 2
- ZWZWYGMENQVNFU-UHFFFAOYSA-N Glycerophosphorylserine Natural products OC(=O)C(N)COP(O)(=O)OCC(O)CO ZWZWYGMENQVNFU-UHFFFAOYSA-N 0.000 description 2
- 239000004471 Glycine Substances 0.000 description 2
- XKMLYUALXHKNFT-UUOKFMHZSA-N Guanosine-5'-triphosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O XKMLYUALXHKNFT-UUOKFMHZSA-N 0.000 description 2
- 101001029120 Homo sapiens Exosome RNA helicase MTR4 Proteins 0.000 description 2
- 229930010555 Inosine Natural products 0.000 description 2
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 2
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 2
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 2
- 241001465754 Metazoa Species 0.000 description 2
- TUNFSRHWOTWDNC-UHFFFAOYSA-N Myristic acid Natural products CCCCCCCCCCCCCC(O)=O TUNFSRHWOTWDNC-UHFFFAOYSA-N 0.000 description 2
- 235000021360 Myristic acid Nutrition 0.000 description 2
- 108091028043 Nucleic acid sequence Proteins 0.000 description 2
- 239000005642 Oleic acid Substances 0.000 description 2
- ZQPPMHVWECSIRJ-UHFFFAOYSA-N Oleic acid Natural products CCCCCCCCC=CCCCCCCCC(O)=O ZQPPMHVWECSIRJ-UHFFFAOYSA-N 0.000 description 2
- 101710116435 Outer membrane protein Proteins 0.000 description 2
- 235000021314 Palmitic acid Nutrition 0.000 description 2
- 229920003171 Poly (ethylene oxide) Polymers 0.000 description 2
- 239000004952 Polyamide Substances 0.000 description 2
- 108010021757 Polynucleotide 5'-Hydroxyl-Kinase Proteins 0.000 description 2
- 102000008422 Polynucleotide 5'-hydroxyl-kinase Human genes 0.000 description 2
- 108010013381 Porins Proteins 0.000 description 2
- 102000017033 Porins Human genes 0.000 description 2
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 2
- 108091093078 Pyrimidine dimer Proteins 0.000 description 2
- 101710086015 RNA ligase Proteins 0.000 description 2
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 2
- 102220498565 Serine/threonine-protein kinase N2_E94D_mutation Human genes 0.000 description 2
- 108020004682 Single-Stranded DNA Proteins 0.000 description 2
- 229930182558 Sterol Natural products 0.000 description 2
- PZBFGYYEXUXCOF-UHFFFAOYSA-N TCEP Chemical compound OC(=O)CCP(CCC(O)=O)CCC(O)=O PZBFGYYEXUXCOF-UHFFFAOYSA-N 0.000 description 2
- RZCIEJXAILMSQK-JXOAFFINSA-N TTP Chemical compound O=C1NC(=O)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 RZCIEJXAILMSQK-JXOAFFINSA-N 0.000 description 2
- 229920004890 Triton X-100 Polymers 0.000 description 2
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 2
- 101150025199 Upf1 gene Proteins 0.000 description 2
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 2
- 102100037111 Uracil-DNA glycosylase Human genes 0.000 description 2
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 description 2
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 2
- XCCTYIAWTASOJW-XVFCMESISA-N Uridine-5'-Diphosphate Chemical compound O[C@@H]1[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O[C@H]1N1C(=O)NC(=O)C=C1 XCCTYIAWTASOJW-XVFCMESISA-N 0.000 description 2
- 241000607598 Vibrio Species 0.000 description 2
- BZDVTEPMYMHZCR-JGVFFNPUSA-N [(2s,5r)-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methyl phosphono hydrogen phosphate Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(O)=O)CC1 BZDVTEPMYMHZCR-JGVFFNPUSA-N 0.000 description 2
- ATBOMIWRCZXYSZ-XZBBILGWSA-N [1-[2,3-dihydroxypropoxy(hydroxy)phosphoryl]oxy-3-hexadecanoyloxypropan-2-yl] (9e,12e)-octadeca-9,12-dienoate Chemical compound CCCCCCCCCCCCCCCC(=O)OCC(COP(O)(=O)OCC(O)CO)OC(=O)CCCCCCC\C=C\C\C=C\CCCCC ATBOMIWRCZXYSZ-XZBBILGWSA-N 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 125000002015 acyclic group Chemical group 0.000 description 2
- 239000000654 additive Substances 0.000 description 2
- 229960000643 adenine Drugs 0.000 description 2
- 229960005305 adenosine Drugs 0.000 description 2
- 230000006154 adenylylation Effects 0.000 description 2
- AWUCVROLDVIAJX-UHFFFAOYSA-N alpha-glycerophosphate Natural products OCC(O)COP(O)(O)=O AWUCVROLDVIAJX-UHFFFAOYSA-N 0.000 description 2
- PNEYBMLMFCGWSK-UHFFFAOYSA-N aluminium oxide Inorganic materials [O-2].[O-2].[O-2].[Al+3].[Al+3] PNEYBMLMFCGWSK-UHFFFAOYSA-N 0.000 description 2
- 150000001412 amines Chemical class 0.000 description 2
- 239000012491 analyte Substances 0.000 description 2
- 238000004873 anchoring Methods 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- QVGXLLKOCUKJST-UHFFFAOYSA-N atomic oxygen Chemical compound [O] QVGXLLKOCUKJST-UHFFFAOYSA-N 0.000 description 2
- 230000006399 behavior Effects 0.000 description 2
- LGJMUZUPVCAVPU-UHFFFAOYSA-N beta-Sitostanol Natural products C1CC2CC(O)CCC2(C)C2C1C1CCC(C(C)CCC(CC)C(C)C)C1(C)CC2 LGJMUZUPVCAVPU-UHFFFAOYSA-N 0.000 description 2
- 125000002915 carbonyl group Chemical group [*:2]C([*:1])=O 0.000 description 2
- 150000001768 cations Chemical class 0.000 description 2
- 150000003841 chloride salts Chemical class 0.000 description 2
- 238000003271 compound fluorescence assay Methods 0.000 description 2
- 229910052593 corundum Inorganic materials 0.000 description 2
- GVJHHUAWPYXKBD-UHFFFAOYSA-N d-alpha-tocopherol Natural products OC1=C(C)C(C)=C2OC(CCCC(C)CCCC(C)CCCC(C)C)(C)CCC2=C1C GVJHHUAWPYXKBD-UHFFFAOYSA-N 0.000 description 2
- DAEAPNUQQAICNR-RRKCRQDMSA-K dADP(3-) Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP([O-])(=O)OP([O-])([O-])=O)O1 DAEAPNUQQAICNR-RRKCRQDMSA-K 0.000 description 2
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 2
- FTDHDKPUHBLBTL-SHYZEUOFSA-K dCDP(3-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 FTDHDKPUHBLBTL-SHYZEUOFSA-K 0.000 description 2
- RGWHQCVHVJXOKC-SHYZEUOFSA-N dCTP Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](CO[P@](O)(=O)O[P@](O)(=O)OP(O)(O)=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-N 0.000 description 2
- CIKGWCTVFSRMJU-KVQBGUIXSA-N dGDP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(O)=O)O1 CIKGWCTVFSRMJU-KVQBGUIXSA-N 0.000 description 2
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 2
- UJLXYODCHAELLY-XLPZGREQSA-N dTDP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 UJLXYODCHAELLY-XLPZGREQSA-N 0.000 description 2
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 2
- QHWZTVCCBMIIKE-SHYZEUOFSA-N dUDP Chemical compound O1[C@H](COP(O)(=O)OP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(=O)NC(=O)C=C1 QHWZTVCCBMIIKE-SHYZEUOFSA-N 0.000 description 2
- 239000005547 deoxyribonucleotide Substances 0.000 description 2
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 2
- 235000014113 dietary fatty acids Nutrition 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 239000000539 dimer Substances 0.000 description 2
- 238000007598 dipping method Methods 0.000 description 2
- ZGSPNIOCEDOHGS-UHFFFAOYSA-L disodium [3-[2,3-di(octadeca-9,12-dienoyloxy)propoxy-oxidophosphoryl]oxy-2-hydroxypropyl] 2,3-di(octadeca-9,12-dienoyloxy)propyl phosphate Chemical compound [Na+].[Na+].CCCCCC=CCC=CCCCCCCCC(=O)OCC(OC(=O)CCCCCCCC=CCC=CCCCCC)COP([O-])(=O)OCC(O)COP([O-])(=O)OCC(OC(=O)CCCCCCCC=CCC=CCCCCC)COC(=O)CCCCCCCC=CCC=CCCCCC ZGSPNIOCEDOHGS-UHFFFAOYSA-L 0.000 description 2
- POULHZVOKOAJMA-UHFFFAOYSA-N dodecanoic acid Chemical compound CCCCCCCCCCCC(O)=O POULHZVOKOAJMA-UHFFFAOYSA-N 0.000 description 2
- 238000002337 electrophoretic mobility shift assay Methods 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- 150000002148 esters Chemical class 0.000 description 2
- 239000013604 expression vector Substances 0.000 description 2
- 229930195729 fatty acid Natural products 0.000 description 2
- 239000000194 fatty acid Substances 0.000 description 2
- 150000004665 fatty acids Chemical class 0.000 description 2
- 150000002190 fatty acyls Chemical group 0.000 description 2
- 238000013467 fragmentation Methods 0.000 description 2
- 238000006062 fragmentation reaction Methods 0.000 description 2
- 239000011521 glass Substances 0.000 description 2
- 150000004676 glycans Chemical class 0.000 description 2
- 229910021389 graphene Inorganic materials 0.000 description 2
- QGWNDRXFNXRZMB-UHFFFAOYSA-N guanidine diphosphate Natural products C1=2NC(N)=NC(=O)C=2N=CN1C1OC(COP(O)(=O)OP(O)(O)=O)C(O)C1O QGWNDRXFNXRZMB-UHFFFAOYSA-N 0.000 description 2
- CJNBYAVZURUTKZ-UHFFFAOYSA-N hafnium(IV) oxide Inorganic materials O=[Hf]=O CJNBYAVZURUTKZ-UHFFFAOYSA-N 0.000 description 2
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 2
- 125000000487 histidyl group Chemical group [H]N([H])C(C(=O)O*)C([H])([H])C1=C([H])N([H])C([H])=N1 0.000 description 2
- 229960003786 inosine Drugs 0.000 description 2
- PGLTVOMIXTUURA-UHFFFAOYSA-N iodoacetamide Chemical compound NC(=O)CI PGLTVOMIXTUURA-UHFFFAOYSA-N 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- QXJSBBXBKPUZAA-UHFFFAOYSA-N isooleic acid Natural products CCCCCCCC=CCCCCCCCCC(O)=O QXJSBBXBKPUZAA-UHFFFAOYSA-N 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 239000013554 lipid monolayer Substances 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 238000004811 liquid chromatography Methods 0.000 description 2
- 150000002678 macrocyclic compounds Chemical class 0.000 description 2
- 125000005439 maleimidyl group Chemical group C1(C=CC(N1*)=O)=O 0.000 description 2
- 230000011987 methylation Effects 0.000 description 2
- 238000007069 methylation reaction Methods 0.000 description 2
- 244000005700 microbiome Species 0.000 description 2
- 238000002715 modification method Methods 0.000 description 2
- 230000007935 neutral effect Effects 0.000 description 2
- ZQPPMHVWECSIRJ-KTKRTIGZSA-N oleic acid Chemical compound CCCCCCCC\C=C/CCCCCCCC(O)=O ZQPPMHVWECSIRJ-KTKRTIGZSA-N 0.000 description 2
- 229910052760 oxygen Inorganic materials 0.000 description 2
- 239000001301 oxygen Substances 0.000 description 2
- 235000021317 phosphate Nutrition 0.000 description 2
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 2
- WTJKGGKOPKCXLL-RRHRGVEJSA-N phosphatidylcholine Chemical compound CCCCCCCCCCCCCCCC(=O)OC[C@H](COP([O-])(=O)OCC[N+](C)(C)C)OC(=O)CCCCCCCC=CCCCCCCCC WTJKGGKOPKCXLL-RRHRGVEJSA-N 0.000 description 2
- 150000008104 phosphatidylethanolamines Chemical class 0.000 description 2
- 150000003905 phosphatidylinositols Chemical class 0.000 description 2
- 229910000073 phosphorus hydride Inorganic materials 0.000 description 2
- 239000013612 plasmid Substances 0.000 description 2
- 229920002647 polyamide Polymers 0.000 description 2
- 229920001282 polysaccharide Polymers 0.000 description 2
- 239000005017 polysaccharide Substances 0.000 description 2
- 229910000160 potassium phosphate Inorganic materials 0.000 description 2
- 235000011009 potassium phosphates Nutrition 0.000 description 2
- 230000001915 proofreading effect Effects 0.000 description 2
- 239000013635 pyrimidine dimer Substances 0.000 description 2
- 238000003259 recombinant expression Methods 0.000 description 2
- 230000002829 reductive effect Effects 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000002441 reversible effect Effects 0.000 description 2
- 102220030196 rs398123907 Human genes 0.000 description 2
- 229930195734 saturated hydrocarbon Natural products 0.000 description 2
- HFHDHCJBZVLPGP-UHFFFAOYSA-N schardinger α-dextrin Chemical compound O1C(C(C2O)O)C(CO)OC2OC(C(C2O)O)C(CO)OC2OC(C(C2O)O)C(CO)OC2OC(C(O)C2O)C(CO)OC2OC(C(C2O)O)C(CO)OC2OC2C(O)C(O)C1OC2CO HFHDHCJBZVLPGP-UHFFFAOYSA-N 0.000 description 2
- 229960001153 serine Drugs 0.000 description 2
- 239000002689 soil Substances 0.000 description 2
- 239000007858 starting material Substances 0.000 description 2
- 150000003432 sterols Chemical class 0.000 description 2
- 235000003702 sterols Nutrition 0.000 description 2
- KZNICNPSHKQLFF-UHFFFAOYSA-N succinimide Chemical compound O=C1CCC(=O)N1 KZNICNPSHKQLFF-UHFFFAOYSA-N 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 239000012085 test solution Substances 0.000 description 2
- HLZKNKRTKFSKGZ-UHFFFAOYSA-N tetradecan-1-ol Chemical compound CCCCCCCCCCCCCCO HLZKNKRTKFSKGZ-UHFFFAOYSA-N 0.000 description 2
- 229940113082 thymine Drugs 0.000 description 2
- 235000010384 tocopherol Nutrition 0.000 description 2
- 229960001295 tocopherol Drugs 0.000 description 2
- 229930003799 tocopherol Natural products 0.000 description 2
- 239000011732 tocopherol Substances 0.000 description 2
- 231100000765 toxin Toxicity 0.000 description 2
- 108700012359 toxins Proteins 0.000 description 2
- MQAYPFVXSPHGJM-UHFFFAOYSA-M trimethyl(phenyl)azanium;chloride Chemical compound [Cl-].C[N+](C)(C)C1=CC=CC=C1 MQAYPFVXSPHGJM-UHFFFAOYSA-M 0.000 description 2
- BYGOPQKDHGXNCD-UHFFFAOYSA-N tripotassium;iron(3+);hexacyanide Chemical compound [K+].[K+].[K+].[Fe+3].N#[C-].N#[C-].N#[C-].N#[C-].N#[C-].N#[C-] BYGOPQKDHGXNCD-UHFFFAOYSA-N 0.000 description 2
- 229940035893 uracil Drugs 0.000 description 2
- 229910001845 yogo sapphire Inorganic materials 0.000 description 2
- GVJHHUAWPYXKBD-IEOSBIPESA-N α-tocopherol Chemical compound OC1=C(C)C(C)=C2O[C@@](CCC[C@H](C)CCC[C@H](C)CCCC(C)C)(C)CCC2=C1C GVJHHUAWPYXKBD-IEOSBIPESA-N 0.000 description 2
- KZJWDPNRJALLNS-VPUBHVLGSA-N (-)-beta-Sitosterol Natural products O[C@@H]1CC=2[C@@](C)([C@@H]3[C@H]([C@H]4[C@@](C)([C@H]([C@H](CC[C@@H](C(C)C)CC)C)CC4)CC3)CC=2)CC1 KZJWDPNRJALLNS-VPUBHVLGSA-N 0.000 description 1
- BQPPJGMMIYJVBR-UHFFFAOYSA-N (10S)-3c-Acetoxy-4.4.10r.13c.14t-pentamethyl-17c-((R)-1.5-dimethyl-hexen-(4)-yl)-(5tH)-Delta8-tetradecahydro-1H-cyclopenta[a]phenanthren Natural products CC12CCC(OC(C)=O)C(C)(C)C1CCC1=C2CCC2(C)C(C(CCC=C(C)C)C)CCC21C BQPPJGMMIYJVBR-UHFFFAOYSA-N 0.000 description 1
- JSHOVKSMJRQOGY-UHFFFAOYSA-N (2,5-dioxopyrrolidin-1-yl) 4-(pyridin-2-yldisulfanyl)butanoate Chemical compound O=C1CCC(=O)N1OC(=O)CCCSSC1=CC=CC=N1 JSHOVKSMJRQOGY-UHFFFAOYSA-N 0.000 description 1
- CSVWWLUMXNHWSU-UHFFFAOYSA-N (22E)-(24xi)-24-ethyl-5alpha-cholest-22-en-3beta-ol Natural products C1CC2CC(O)CCC2(C)C2C1C1CCC(C(C)C=CC(CC)C(C)C)C1(C)CC2 CSVWWLUMXNHWSU-UHFFFAOYSA-N 0.000 description 1
- RQOCXCFLRBRBCS-UHFFFAOYSA-N (22E)-cholesta-5,7,22-trien-3beta-ol Natural products C1C(O)CCC2(C)C(CCC3(C(C(C)C=CCC(C)C)CCC33)C)C3=CC=C21 RQOCXCFLRBRBCS-UHFFFAOYSA-N 0.000 description 1
- KMIDUNAIKQUUGE-NTZNESFSSA-N (2S)-2-amino-8-[(2R,3S)-3-ethynyloxolan-2-yl]-8-oxooctanoic acid Chemical compound N[C@H](C(=O)O)CCCCCC(=O)[C@@H]1OCC[C@H]1C#C KMIDUNAIKQUUGE-NTZNESFSSA-N 0.000 description 1
- HRGXDARRSCSGOG-MRVPVSSYSA-N (2r)-2-amino-3-[4-[3-(trifluoromethyl)diazirin-3-yl]phenyl]propanoic acid Chemical compound C1=CC(C[C@@H](N)C(O)=O)=CC=C1C1(C(F)(F)F)N=N1 HRGXDARRSCSGOG-MRVPVSSYSA-N 0.000 description 1
- QLUTZSRKFWJIGM-QMMMGPOBSA-N (2r)-2-azaniumyl-3-[(2-nitrophenyl)methylsulfanyl]propanoate Chemical compound OC(=O)[C@@H](N)CSCC1=CC=CC=C1[N+]([O-])=O QLUTZSRKFWJIGM-QMMMGPOBSA-N 0.000 description 1
- DTERQYGMUDWYAZ-SSDOTTSWSA-N (2r)-6-acetamido-2-azaniumylhexanoate Chemical compound CC(=O)NCCCC[C@@H]([NH3+])C([O-])=O DTERQYGMUDWYAZ-SSDOTTSWSA-N 0.000 description 1
- XSYUPRQVAHJETO-WPMUBMLPSA-N (2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-amino-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidazol-5-yl)propanoyl]amino]-3-(1h-imidaz Chemical compound C([C@H](N)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CC=1NC=NC=1)C(O)=O)C1=CN=CN1 XSYUPRQVAHJETO-WPMUBMLPSA-N 0.000 description 1
- POGSZHUEECCEAP-ZETCQYMHSA-N (2s)-2-amino-3-(3-amino-4-hydroxyphenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C(N)=C1 POGSZHUEECCEAP-ZETCQYMHSA-N 0.000 description 1
- ZHUOMTMPTNZOJE-VIFPVBQESA-N (2s)-2-amino-3-(3-cyanophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=CC(C#N)=C1 ZHUOMTMPTNZOJE-VIFPVBQESA-N 0.000 description 1
- PEMUHKUIQHFMTH-QMMMGPOBSA-N (2s)-2-amino-3-(4-bromophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(Br)C=C1 PEMUHKUIQHFMTH-QMMMGPOBSA-N 0.000 description 1
- KWIPUXXIFQQMKN-VIFPVBQESA-N (2s)-2-amino-3-(4-cyanophenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(C#N)C=C1 KWIPUXXIFQQMKN-VIFPVBQESA-N 0.000 description 1
- VPRPVNNXDCMVQT-JTQLQIEISA-N (2s)-2-amino-3-(4-ethylsulfanylcarbonylphenyl)propanoic acid Chemical compound CCSC(=O)C1=CC=C(C[C@H](N)C(O)=O)C=C1 VPRPVNNXDCMVQT-JTQLQIEISA-N 0.000 description 1
- PHUOJEKTSKQBNT-NSHDSACASA-N (2s)-2-amino-3-(4-prop-2-enoxyphenyl)propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(OCC=C)C=C1 PHUOJEKTSKQBNT-NSHDSACASA-N 0.000 description 1
- RKXGYYMTJRQUQQ-NSHDSACASA-N (2s)-2-amino-3-(4-propan-2-ylsulfanylcarbonylphenyl)propanoic acid Chemical compound CC(C)SC(=O)C1=CC=C(C[C@H](N)C(O)=O)C=C1 RKXGYYMTJRQUQQ-NSHDSACASA-N 0.000 description 1
- HIAVWJOQCVNAQC-LBPRGKRZSA-N (2s)-2-amino-3-(naphthalen-2-ylamino)propanoic acid Chemical compound C1=CC=CC2=CC(NC[C@H](N)C(O)=O)=CC=C21 HIAVWJOQCVNAQC-LBPRGKRZSA-N 0.000 description 1
- VTERJWKRLSSHIC-QMMMGPOBSA-N (2s)-2-amino-3-[(2-nitrophenyl)methoxy]propanoic acid Chemical compound OC(=O)[C@@H](N)COCC1=CC=CC=C1[N+]([O-])=O VTERJWKRLSSHIC-QMMMGPOBSA-N 0.000 description 1
- HYSPNOMZFGNKBR-QMMMGPOBSA-N (2s)-2-amino-3-[(4,5-dimethoxy-2-nitrophenyl)methoxy]propanoic acid Chemical compound COC1=CC(COC[C@H](N)C(O)=O)=C([N+]([O-])=O)C=C1OC HYSPNOMZFGNKBR-QMMMGPOBSA-N 0.000 description 1
- LJHYWUVYIKCPGU-VIFPVBQESA-N (2s)-2-amino-3-[4-(carboxymethyl)phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(CC(O)=O)C=C1 LJHYWUVYIKCPGU-VIFPVBQESA-N 0.000 description 1
- AWYGHHYADLYNRW-RGURZIINSA-N (2s)-2-amino-3-[4-[(2-amino-3-sulfanylpropanoyl)amino]phenyl]propanoic acid Chemical compound OC(=O)[C@@H](N)CC1=CC=C(NC(=O)C(N)CS)C=C1 AWYGHHYADLYNRW-RGURZIINSA-N 0.000 description 1
- NLFOHNAFILVHGM-AWEZNQCLSA-N (2s)-2-amino-3-[4-[(2-nitrophenyl)methoxy]phenyl]propanoic acid Chemical compound C1=CC(C[C@H](N)C(O)=O)=CC=C1OCC1=CC=CC=C1[N+]([O-])=O NLFOHNAFILVHGM-AWEZNQCLSA-N 0.000 description 1
- QEQAKQQRJFWPOR-JTQLQIEISA-N (2s)-2-amino-4-(7-hydroxy-2-oxochromen-4-yl)butanoic acid Chemical compound C1=C(O)C=CC2=C1OC(=O)C=C2CC[C@H](N)C(O)=O QEQAKQQRJFWPOR-JTQLQIEISA-N 0.000 description 1
- TUMGFDAVJXBHMU-BYPYZUCNSA-N (2s)-2-amino-5-sulfanylpentanoic acid Chemical compound OC(=O)[C@@H](N)CCCS TUMGFDAVJXBHMU-BYPYZUCNSA-N 0.000 description 1
- RPLCQQYRZLXMKL-ZETCQYMHSA-N (2s)-2-amino-6-(2-azidoethoxycarbonylamino)hexanoic acid Chemical compound OC(=O)[C@@H](N)CCCCNC(=O)OCCN=[N+]=[N-] RPLCQQYRZLXMKL-ZETCQYMHSA-N 0.000 description 1
- FQFIGHYUHCWZRZ-JTQLQIEISA-N (2s)-2-amino-6-(cyclopentanecarbonylamino)hexanoic acid Chemical compound OC(=O)[C@@H](N)CCCCNC(=O)C1CCCC1 FQFIGHYUHCWZRZ-JTQLQIEISA-N 0.000 description 1
- KRFMMSZGIQEBIJ-QMMMGPOBSA-N (2s)-2-amino-6-(prop-2-ynoxycarbonylamino)hexanoic acid Chemical compound OC(=O)[C@@H](N)CCCCNC(=O)OCC#C KRFMMSZGIQEBIJ-QMMMGPOBSA-N 0.000 description 1
- VVQIIIAZJXTLRE-QMMMGPOBSA-N (2s)-2-amino-6-[(2-methylpropan-2-yl)oxycarbonylamino]hexanoic acid Chemical compound CC(C)(C)OC(=O)NCCCC[C@H](N)C(O)=O VVQIIIAZJXTLRE-QMMMGPOBSA-N 0.000 description 1
- WORDWOPJMYWZSB-NSHDSACASA-N (2s)-2-amino-6-[(2-nitrophenyl)methoxycarbonylamino]hexanoic acid Chemical compound OC(=O)[C@@H](N)CCCCNC(=O)OCC1=CC=CC=C1[N+]([O-])=O WORDWOPJMYWZSB-NSHDSACASA-N 0.000 description 1
- HBMWPJLCTYKAGL-YFKPBYRVSA-N (2s)-2-amino-6-sulfanylhexanoic acid Chemical compound OC(=O)[C@@H](N)CCCCS HBMWPJLCTYKAGL-YFKPBYRVSA-N 0.000 description 1
- SDZGVFSSLGTJAJ-ZETCQYMHSA-N (2s)-2-azaniumyl-3-(2-nitrophenyl)propanoate Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1[N+]([O-])=O SDZGVFSSLGTJAJ-ZETCQYMHSA-N 0.000 description 1
- GTVVZTAFGPQSPC-QMMMGPOBSA-N (2s)-2-azaniumyl-3-(4-nitrophenyl)propanoate Chemical compound OC(=O)[C@@H](N)CC1=CC=C([N+]([O-])=O)C=C1 GTVVZTAFGPQSPC-QMMMGPOBSA-N 0.000 description 1
- CYHRSNOITZHLJN-NSHDSACASA-N (2s)-2-azaniumyl-3-(4-propan-2-ylphenyl)propanoate Chemical compound CC(C)C1=CC=C(C[C@H](N)C(O)=O)C=C1 CYHRSNOITZHLJN-NSHDSACASA-N 0.000 description 1
- NKQWPQYVVKUEPV-QMMMGPOBSA-N (2s)-2-hydroxy-6-[(2-methylpropan-2-yl)oxycarbonylamino]hexanoic acid Chemical compound CC(C)(C)OC(=O)NCCCC[C@H](O)C(O)=O NKQWPQYVVKUEPV-QMMMGPOBSA-N 0.000 description 1
- ZXSBHXZKWRIEIA-JTQLQIEISA-N (2s)-3-(4-acetylphenyl)-2-azaniumylpropanoate Chemical compound CC(=O)C1=CC=C(C[C@H](N)C(O)=O)C=C1 ZXSBHXZKWRIEIA-JTQLQIEISA-N 0.000 description 1
- XKZCXMNMUMGDJG-AWEZNQCLSA-N (2s)-3-[(6-acetylnaphthalen-2-yl)amino]-2-aminopropanoic acid Chemical compound C1=C(NC[C@H](N)C(O)=O)C=CC2=CC(C(=O)C)=CC=C21 XKZCXMNMUMGDJG-AWEZNQCLSA-N 0.000 description 1
- YZJSUQQZGCHHNQ-BYPYZUCNSA-N (2s)-6-amino-2-azaniumyl-6-oxohexanoate Chemical compound OC(=O)[C@@H](N)CCCC(N)=O YZJSUQQZGCHHNQ-BYPYZUCNSA-N 0.000 description 1
- CHGIKSSZNBCNDW-UHFFFAOYSA-N (3beta,5alpha)-4,4-Dimethylcholesta-8,24-dien-3-ol Natural products CC12CCC(O)C(C)(C)C1CCC1=C2CCC2(C)C(C(CCC=C(C)C)C)CCC21 CHGIKSSZNBCNDW-UHFFFAOYSA-N 0.000 description 1
- ALSTYHKOOCGGFT-KTKRTIGZSA-N (9Z)-octadecen-1-ol Chemical compound CCCCCCCC\C=C/CCCCCCCCO ALSTYHKOOCGGFT-KTKRTIGZSA-N 0.000 description 1
- VLSDXINSOMDCBK-BQYQJAHWSA-N (E)-1,1'-azobis(N,N-dimethylformamide) Chemical compound CN(C)C(=O)\N=N\C(=O)N(C)C VLSDXINSOMDCBK-BQYQJAHWSA-N 0.000 description 1
- JVGVDSSUAVXRDY-MRVPVSSYSA-N (R)-3-(4-hydroxyphenyl)lactic acid Chemical compound OC(=O)[C@H](O)CC1=CC=C(O)C=C1 JVGVDSSUAVXRDY-MRVPVSSYSA-N 0.000 description 1
- UKGJZDSUJSPAJL-YPUOHESYSA-N (e)-n-[(1r)-1-[3,5-difluoro-4-(methanesulfonamido)phenyl]ethyl]-3-[2-propyl-6-(trifluoromethyl)pyridin-3-yl]prop-2-enamide Chemical compound CCCC1=NC(C(F)(F)F)=CC=C1\C=C\C(=O)N[C@H](C)C1=CC(F)=C(NS(C)(=O)=O)C(F)=C1 UKGJZDSUJSPAJL-YPUOHESYSA-N 0.000 description 1
- CMUHFUGDYMFHEI-UHFFFAOYSA-N -2-Amino-3-(4-aminophenyl)propanoic acid Natural products OC(=O)C(N)CC1=CC=C(N)C=C1 CMUHFUGDYMFHEI-UHFFFAOYSA-N 0.000 description 1
- WWJWZQKUDYKLTK-UHFFFAOYSA-N 1,n6-ethenoadenine Chemical compound C1=NC2=NC=N[C]2C2=NC=CN21 WWJWZQKUDYKLTK-UHFFFAOYSA-N 0.000 description 1
- LUTLAXLNPLZCOF-UHFFFAOYSA-N 1-Methylhistidine Natural products OC(=O)C(N)(C)CC1=NC=CN1 LUTLAXLNPLZCOF-UHFFFAOYSA-N 0.000 description 1
- RVRLFABOQXZUJX-UHFFFAOYSA-N 1-[1-(2,5-dioxopyrrol-1-yl)ethyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1C(C)N1C(=O)C=CC1=O RVRLFABOQXZUJX-UHFFFAOYSA-N 0.000 description 1
- FERLGYOHRKHQJP-UHFFFAOYSA-N 1-[2-[2-[2-(2,5-dioxopyrrol-1-yl)ethoxy]ethoxy]ethyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1CCOCCOCCN1C(=O)C=CC1=O FERLGYOHRKHQJP-UHFFFAOYSA-N 0.000 description 1
- OYRSKXCXEFLTEY-UHFFFAOYSA-N 1-[2-[2-[2-[2-(2,5-dioxopyrrol-1-yl)ethoxy]ethoxy]ethoxy]ethyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1CCOCCOCCOCCN1C(=O)C=CC1=O OYRSKXCXEFLTEY-UHFFFAOYSA-N 0.000 description 1
- VNJBTKQBKFMEHH-UHFFFAOYSA-N 1-[4-(2,5-dioxopyrrol-1-yl)-2,3-dihydroxybutyl]pyrrole-2,5-dione Chemical compound O=C1C=CC(=O)N1CC(O)C(O)CN1C(=O)C=CC1=O VNJBTKQBKFMEHH-UHFFFAOYSA-N 0.000 description 1
- PBFKSBAPGGMKKJ-UHFFFAOYSA-N 1-[6-(2,5-dioxopyrrolidin-1-yl)hexyl]pyrrolidine-2,5-dione Chemical compound O=C1CCC(=O)N1CCCCCCN1C(=O)CCC1=O PBFKSBAPGGMKKJ-UHFFFAOYSA-N 0.000 description 1
- UHDGCWIWMRVCDJ-UHFFFAOYSA-N 1-beta-D-Xylofuranosyl-NH-Cytosine Natural products O=C1N=C(N)C=CN1C1C(O)C(O)C(CO)O1 UHDGCWIWMRVCDJ-UHFFFAOYSA-N 0.000 description 1
- BMQZYMYBQZGEEY-UHFFFAOYSA-M 1-ethyl-3-methylimidazolium chloride Chemical compound [Cl-].CCN1C=C[N+](C)=C1 BMQZYMYBQZGEEY-UHFFFAOYSA-M 0.000 description 1
- XYTLYKGXLMKYMV-UHFFFAOYSA-N 14alpha-methylzymosterol Natural products CC12CCC(O)CC1CCC1=C2CCC2(C)C(C(CCC=C(C)C)C)CCC21C XYTLYKGXLMKYMV-UHFFFAOYSA-N 0.000 description 1
- SBNOTUDDIXOFSN-UHFFFAOYSA-N 1h-indole-2-carbaldehyde Chemical compound C1=CC=C2NC(C=O)=CC2=C1 SBNOTUDDIXOFSN-UHFFFAOYSA-N 0.000 description 1
- PHNGFPPXDJJADG-RRKCRQDMSA-N 2'-deoxyinosine-5'-monophosphate Chemical compound O1[C@H](COP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(N=CNC2=O)=C2N=C1 PHNGFPPXDJJADG-RRKCRQDMSA-N 0.000 description 1
- GIMRVVLNBSNCLO-UHFFFAOYSA-N 2,6-diamino-5-formamido-4-hydroxypyrimidine Chemical compound NC1=NC(=O)C(NC=O)C(N)=N1 GIMRVVLNBSNCLO-UHFFFAOYSA-N 0.000 description 1
- VYMHBQQZUYHXSS-UHFFFAOYSA-N 2-(3h-dithiol-3-yl)pyridine Chemical compound C1=CSSC1C1=CC=CC=N1 VYMHBQQZUYHXSS-UHFFFAOYSA-N 0.000 description 1
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 1
- XLOULZPUVVVWES-UHFFFAOYSA-N 2-amino-3-(8-hydroxyquinolin-3-yl)propanoic acid Chemical compound OC1=CC=CC2=CC(CC(N)C(O)=O)=CN=C21 XLOULZPUVVVWES-UHFFFAOYSA-N 0.000 description 1
- RZGAAVNNBWANKD-UHFFFAOYSA-N 2-amino-3-[[5-(dimethylamino)naphthalen-1-yl]sulfonylamino]propanoic acid Chemical compound C1=CC=C2C(N(C)C)=CC=CC2=C1S(=O)(=O)NCC(N)C(O)=O RZGAAVNNBWANKD-UHFFFAOYSA-N 0.000 description 1
- GOJUJUVQIVIZAV-UHFFFAOYSA-N 2-amino-4,6-dichloropyrimidine-5-carbaldehyde Chemical group NC1=NC(Cl)=C(C=O)C(Cl)=N1 GOJUJUVQIVIZAV-UHFFFAOYSA-N 0.000 description 1
- JINGUCXQUOKWKH-UHFFFAOYSA-N 2-aminodecanoic acid Chemical compound CCCCCCCCC(N)C(O)=O JINGUCXQUOKWKH-UHFFFAOYSA-N 0.000 description 1
- YZXUCQCJZKJMIR-UHFFFAOYSA-N 2-azaniumyl-3-[4-(trifluoromethoxy)phenyl]propanoate Chemical compound OC(=O)C(N)CC1=CC=C(OC(F)(F)F)C=C1 YZXUCQCJZKJMIR-UHFFFAOYSA-N 0.000 description 1
- CKGCFBNYQJDIGS-UHFFFAOYSA-N 2-azaniumyl-6-(phenylmethoxycarbonylamino)hexanoate Chemical compound OC(=O)C(N)CCCCNC(=O)OCC1=CC=CC=C1 CKGCFBNYQJDIGS-UHFFFAOYSA-N 0.000 description 1
- JVPFOKXICYJJSC-UHFFFAOYSA-N 2-azaniumylnonanoate Chemical compound CCCCCCCC(N)C(O)=O JVPFOKXICYJJSC-UHFFFAOYSA-N 0.000 description 1
- KLEXDBGYSOIREE-UHFFFAOYSA-N 24xi-n-propylcholesterol Natural products C1C=C2CC(O)CCC2(C)C2C1C1CCC(C(C)CCC(CCC)C(C)C)C1(C)CC2 KLEXDBGYSOIREE-UHFFFAOYSA-N 0.000 description 1
- XMTQQYYKAHVGBJ-UHFFFAOYSA-N 3-(3,4-DICHLOROPHENYL)-1,1-DIMETHYLUREA Chemical compound CN(C)C(=O)NC1=CC=C(Cl)C(Cl)=C1 XMTQQYYKAHVGBJ-UHFFFAOYSA-N 0.000 description 1
- UQTZMGFTRHFAAM-ZETCQYMHSA-N 3-iodo-L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C(I)=C1 UQTZMGFTRHFAAM-ZETCQYMHSA-N 0.000 description 1
- FBTSQILOGYXGMD-LURJTMIESA-N 3-nitro-L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C([N+]([O-])=O)=C1 FBTSQILOGYXGMD-LURJTMIESA-N 0.000 description 1
- FPTJELQXIUUCEY-UHFFFAOYSA-N 3beta-Hydroxy-lanostan Natural products C1CC2C(C)(C)C(O)CCC2(C)C2C1C1(C)CCC(C(C)CCCC(C)C)C1(C)CC2 FPTJELQXIUUCEY-UHFFFAOYSA-N 0.000 description 1
- CMUHFUGDYMFHEI-QMMMGPOBSA-N 4-amino-L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(N)C=C1 CMUHFUGDYMFHEI-QMMMGPOBSA-N 0.000 description 1
- XWHHYOYVRVGJJY-QMMMGPOBSA-N 4-fluoro-L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(F)C=C1 XWHHYOYVRVGJJY-QMMMGPOBSA-N 0.000 description 1
- PZNQZSRPDOEBMS-QMMMGPOBSA-N 4-iodo-L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(I)C=C1 PZNQZSRPDOEBMS-QMMMGPOBSA-N 0.000 description 1
- NHOKUDODDWSIAJ-UHFFFAOYSA-N 5,6-dihydroxy-1,3-diazinane-2,4-dione Chemical compound OC1NC(=O)NC(=O)C1O NHOKUDODDWSIAJ-UHFFFAOYSA-N 0.000 description 1
- OHAMXGZMZZWRCA-UHFFFAOYSA-N 5-formyluracil Chemical compound OC1=NC=C(C=O)C(O)=N1 OHAMXGZMZZWRCA-UHFFFAOYSA-N 0.000 description 1
- QALVYNJUHYOTCW-UHFFFAOYSA-N 5-hydroxy-5-methyl-1,3-diazinane-2,4,6-trione Chemical compound CC1(O)C(=O)NC(=O)N=C1O QALVYNJUHYOTCW-UHFFFAOYSA-N 0.000 description 1
- JDBGXEHEIRGOBU-UHFFFAOYSA-N 5-hydroxymethyluracil Chemical compound OCC1=CNC(=O)NC1=O JDBGXEHEIRGOBU-UHFFFAOYSA-N 0.000 description 1
- NJQONZSFUKNYOY-JXOAFFINSA-N 5-methylcytidine 5'-monophosphate Chemical compound O=C1N=C(N)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(O)=O)O1 NJQONZSFUKNYOY-JXOAFFINSA-N 0.000 description 1
- NLLCDONDZDHLCI-UHFFFAOYSA-N 6-amino-5-hydroxy-1h-pyrimidin-2-one Chemical compound NC=1NC(=O)N=CC=1O NLLCDONDZDHLCI-UHFFFAOYSA-N 0.000 description 1
- CLGFIVUFZRGQRP-UHFFFAOYSA-N 7,8-dihydro-8-oxoguanine Chemical compound O=C1NC(N)=NC2=C1NC(=O)N2 CLGFIVUFZRGQRP-UHFFFAOYSA-N 0.000 description 1
- IHLOTZVBEUFDMD-UUOKFMHZSA-N 7-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-2,2-dioxo-1h-imidazo[4,5-c][1,2,6]thiadiazin-4-one Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(NS(=O)(=O)NC2=O)=C2N=C1 IHLOTZVBEUFDMD-UUOKFMHZSA-N 0.000 description 1
- UBKVUFQGVWHZIR-UHFFFAOYSA-N 8-oxoguanine Chemical compound O=C1NC(N)=NC2=NC(=O)N=C21 UBKVUFQGVWHZIR-UHFFFAOYSA-N 0.000 description 1
- 241000589291 Acinetobacter Species 0.000 description 1
- 241000251468 Actinopterygii Species 0.000 description 1
- 229930024421 Adenine Natural products 0.000 description 1
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 1
- 239000012099 Alexa Fluor family Substances 0.000 description 1
- 229920000856 Amylose Polymers 0.000 description 1
- 241000205046 Archaeoglobus Species 0.000 description 1
- 235000007319 Avena orientalis Nutrition 0.000 description 1
- 244000075850 Avena orientalis Species 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- 239000004184 Avoparcin Substances 0.000 description 1
- 241000193738 Bacillus anthracis Species 0.000 description 1
- 239000001878 Bakers yeast glycan Substances 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 235000014698 Brassica juncea var multisecta Nutrition 0.000 description 1
- 235000006008 Brassica napus var napus Nutrition 0.000 description 1
- 240000000385 Brassica napus var. napus Species 0.000 description 1
- 235000011299 Brassica oleracea var botrytis Nutrition 0.000 description 1
- 235000017647 Brassica oleracea var italica Nutrition 0.000 description 1
- 240000003259 Brassica oleracea var. botrytis Species 0.000 description 1
- 235000006618 Brassica rapa subsp oleifera Nutrition 0.000 description 1
- 235000004977 Brassica sinapistrum Nutrition 0.000 description 1
- 239000004255 Butylated hydroxyanisole Substances 0.000 description 1
- 102220484866 C-type lectin domain family 4 member A_W21A_mutation Human genes 0.000 description 1
- YDNKGFDKKRUKPY-JHOUSYSJSA-N C16 ceramide Natural products CCCCCCCCCCCCCCCC(=O)N[C@@H](CO)[C@H](O)C=CCCCCCCCCCCCCC YDNKGFDKKRUKPY-JHOUSYSJSA-N 0.000 description 1
- 102220548139 Calpain-2 catalytic subunit_D22E_mutation Human genes 0.000 description 1
- 241000282472 Canis lupus familiaris Species 0.000 description 1
- 239000004215 Carbon black (E152) Substances 0.000 description 1
- 241000205484 Cenarchaeum Species 0.000 description 1
- 241000819038 Chichester Species 0.000 description 1
- 241001247823 Citromicrobium Species 0.000 description 1
- LPZCCMIISIBREI-MTFRKTCUSA-N Citrostadienol Natural products CC=C(CC[C@@H](C)[C@H]1CC[C@H]2C3=CC[C@H]4[C@H](C)[C@@H](O)CC[C@]4(C)[C@H]3CC[C@]12C)C(C)C LPZCCMIISIBREI-MTFRKTCUSA-N 0.000 description 1
- 108091028732 Concatemer Proteins 0.000 description 1
- 229920000742 Cotton Polymers 0.000 description 1
- 241000989055 Cronobacter Species 0.000 description 1
- MIKUYHXYGGJMLM-GIMIYPNGSA-N Crotonoside Natural products C1=NC2=C(N)NC(=O)N=C2N1[C@H]1O[C@@H](CO)[C@H](O)[C@@H]1O MIKUYHXYGGJMLM-GIMIYPNGSA-N 0.000 description 1
- VMQMZMRVKUZKQL-UHFFFAOYSA-N Cu+ Chemical compound [Cu+] VMQMZMRVKUZKQL-UHFFFAOYSA-N 0.000 description 1
- 241000159506 Cyanothece Species 0.000 description 1
- 108010069514 Cyclic Peptides Proteins 0.000 description 1
- 102000001189 Cyclic Peptides Human genes 0.000 description 1
- UHDGCWIWMRVCDJ-PSQAKQOGSA-N Cytidine Natural products O=C1N=C(N)C=CN1[C@@H]1[C@@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-PSQAKQOGSA-N 0.000 description 1
- NYHBQMYGNKIUIF-UHFFFAOYSA-N D-guanosine Natural products C1=2NC(N)=NC(=O)C=2N=CN1C1OC(CO)C(O)C1O NYHBQMYGNKIUIF-UHFFFAOYSA-N 0.000 description 1
- HMFHBZSHGGEWLO-SOOFDHNKSA-N D-ribofuranose Chemical compound OC[C@H]1OC(O)[C@H](O)[C@@H]1O HMFHBZSHGGEWLO-SOOFDHNKSA-N 0.000 description 1
- OUYCCCASQSFEME-MRVPVSSYSA-N D-tyrosine Chemical compound OC(=O)[C@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-MRVPVSSYSA-N 0.000 description 1
- 229930195709 D-tyrosine Natural products 0.000 description 1
- 108090000133 DNA helicases Proteins 0.000 description 1
- 102000003844 DNA helicases Human genes 0.000 description 1
- 238000000018 DNA microarray Methods 0.000 description 1
- 239000003155 DNA primer Substances 0.000 description 1
- 230000033616 DNA repair Effects 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 230000004568 DNA-binding Effects 0.000 description 1
- 101710096438 DNA-binding protein Proteins 0.000 description 1
- ARVGMISWLZPBCH-UHFFFAOYSA-N Dehydro-beta-sitosterol Natural products C1C(O)CCC2(C)C(CCC3(C(C(C)CCC(CC)C(C)C)CCC33)C)C3=CC=C21 ARVGMISWLZPBCH-UHFFFAOYSA-N 0.000 description 1
- 102000004099 Deoxyribonuclease (Pyrimidine Dimer) Human genes 0.000 description 1
- 108010082610 Deoxyribonuclease (Pyrimidine Dimer) Proteins 0.000 description 1
- 101100310856 Drosophila melanogaster spri gene Proteins 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 102000004533 Endonucleases Human genes 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 101800001466 Envelope glycoprotein E1 Proteins 0.000 description 1
- 241000283086 Equidae Species 0.000 description 1
- DNVPQKQSNYMLRS-NXVQYWJNSA-N Ergosterol Natural products CC(C)[C@@H](C)C=C[C@H](C)[C@H]1CC[C@H]2C3=CC=C4C[C@@H](O)CC[C@]4(C)[C@@H]3CC[C@]12C DNVPQKQSNYMLRS-NXVQYWJNSA-N 0.000 description 1
- 239000004249 Erythorbin acid Substances 0.000 description 1
- 241000190844 Erythrobacter Species 0.000 description 1
- 241000588722 Escherichia Species 0.000 description 1
- 241000701959 Escherichia virus Lambda Species 0.000 description 1
- 241000701533 Escherichia virus T4 Species 0.000 description 1
- 108010007577 Exodeoxyribonuclease I Proteins 0.000 description 1
- 108010009832 Exodeoxyribonucleases Proteins 0.000 description 1
- 102000009788 Exodeoxyribonucleases Human genes 0.000 description 1
- 102100029075 Exonuclease 1 Human genes 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- 241001280345 Ferroplasma Species 0.000 description 1
- MBMLMWLHJBBADN-UHFFFAOYSA-N Ferrous sulfide Chemical compound [Fe]=S MBMLMWLHJBBADN-UHFFFAOYSA-N 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 241000287828 Gallus gallus Species 0.000 description 1
- BKLIAINBCQPSOV-UHFFFAOYSA-N Gluanol Natural products CC(C)CC=CC(C)C1CCC2(C)C3=C(CCC12C)C4(C)CCC(O)C(C)(C)C4CC3 BKLIAINBCQPSOV-UHFFFAOYSA-N 0.000 description 1
- 244000068988 Glycine max Species 0.000 description 1
- 235000010469 Glycine max Nutrition 0.000 description 1
- 244000299507 Gossypium hirsutum Species 0.000 description 1
- 239000007995 HEPES buffer Substances 0.000 description 1
- 241001477024 Haladaptatus Species 0.000 description 1
- 241000204942 Halobacterium sp. Species 0.000 description 1
- 241000204991 Haloferax Species 0.000 description 1
- 241000557006 Halorubrum Species 0.000 description 1
- 241000526120 Haloterrigena Species 0.000 description 1
- 241001559576 Halothiobacillus Species 0.000 description 1
- 108010006464 Hemolysin Proteins Proteins 0.000 description 1
- 102100039869 Histone H2B type F-S Human genes 0.000 description 1
- 101001035372 Homo sapiens Histone H2B type F-S Proteins 0.000 description 1
- 101001094545 Homo sapiens Retrotransposon-like protein 1 Proteins 0.000 description 1
- 240000005979 Hordeum vulgare Species 0.000 description 1
- 235000007340 Hordeum vulgare Nutrition 0.000 description 1
- 238000006736 Huisgen cycloaddition reaction Methods 0.000 description 1
- AVXURJPOCDRRFD-UHFFFAOYSA-N Hydroxylamine Chemical class ON AVXURJPOCDRRFD-UHFFFAOYSA-N 0.000 description 1
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 1
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 1
- XEEYBQQBJWHFJM-UHFFFAOYSA-N Iron Chemical group [Fe] XEEYBQQBJWHFJM-UHFFFAOYSA-N 0.000 description 1
- 241000588748 Klebsiella Species 0.000 description 1
- WTDRDQBEARUVNC-LURJTMIESA-N L-DOPA Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C(O)=C1 WTDRDQBEARUVNC-LURJTMIESA-N 0.000 description 1
- FFFHZYDWPBMWHY-UHFFFAOYSA-N L-Homocysteine Natural products OC(=O)C(N)CCS FFFHZYDWPBMWHY-UHFFFAOYSA-N 0.000 description 1
- ONIBWKKTOPOVIA-BYPYZUCNSA-N L-Proline Chemical compound OC(=O)[C@@H]1CCCN1 ONIBWKKTOPOVIA-BYPYZUCNSA-N 0.000 description 1
- FBWIRBFZWNIGJC-LURJTMIESA-N L-dihomomethionine zwitterion Chemical compound CSCCCC[C@H](N)C(O)=O FBWIRBFZWNIGJC-LURJTMIESA-N 0.000 description 1
- FFFHZYDWPBMWHY-VKHMYHEASA-N L-homocysteine Chemical compound OC(=O)[C@@H](N)CCS FFFHZYDWPBMWHY-VKHMYHEASA-N 0.000 description 1
- ZFOMKMMPBOQKMC-KXUCPTDWSA-N L-pyrrolysine Chemical compound C[C@@H]1CC=N[C@H]1C(=O)NCCCC[C@H]([NH3+])C([O-])=O ZFOMKMMPBOQKMC-KXUCPTDWSA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- LOPKHWOTGJIQLC-UHFFFAOYSA-N Lanosterol Natural products CC(CCC=C(C)C)C1CCC2(C)C3=C(CCC12C)C4(C)CCC(C)(O)C(C)(C)C4CC3 LOPKHWOTGJIQLC-UHFFFAOYSA-N 0.000 description 1
- 239000005639 Lauric acid Substances 0.000 description 1
- 240000004322 Lens culinaris Species 0.000 description 1
- 235000014647 Lens culinaris subsp culinaris Nutrition 0.000 description 1
- 108010014603 Leukocidins Proteins 0.000 description 1
- 108090001030 Lipoproteins Proteins 0.000 description 1
- 102000004895 Lipoproteins Human genes 0.000 description 1
- 235000007688 Lycopersicon esculentum Nutrition 0.000 description 1
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 1
- 244000070406 Malus silvestris Species 0.000 description 1
- 241000124008 Mammalia Species 0.000 description 1
- 241000204999 Methanococcoides Species 0.000 description 1
- 241000203353 Methanococcus Species 0.000 description 1
- 241000203390 Methanogenium Species 0.000 description 1
- 241000204639 Methanohalobium Species 0.000 description 1
- 241000203006 Methanohalophilus Species 0.000 description 1
- 241000900014 Methanoregula Species 0.000 description 1
- 241001487033 Methanosalsum Species 0.000 description 1
- 241000205265 Methanospirillum Species 0.000 description 1
- 241000010754 Methanothermococcus Species 0.000 description 1
- 241000205011 Methanothrix Species 0.000 description 1
- 241001486995 Methanotorris Species 0.000 description 1
- 101001067830 Mus musculus Peptidyl-prolyl cis-trans isomerase A Proteins 0.000 description 1
- 240000005561 Musa balbisiana Species 0.000 description 1
- 241000187480 Mycobacterium smegmatis Species 0.000 description 1
- BRMWTNUJHUMWMS-LURJTMIESA-N N(tele)-methyl-L-histidine Chemical compound CN1C=NC(C[C@H](N)C(O)=O)=C1 BRMWTNUJHUMWMS-LURJTMIESA-N 0.000 description 1
- OKIZCWYLBDKLSU-UHFFFAOYSA-M N,N,N-Trimethylmethanaminium chloride Chemical compound [Cl-].C[N+](C)(C)C OKIZCWYLBDKLSU-UHFFFAOYSA-M 0.000 description 1
- CRJGESKKUOMBCT-VQTJNVASSA-N N-acetylsphinganine Chemical compound CCCCCCCCCCCCCCC[C@@H](O)[C@H](CO)NC(C)=O CRJGESKKUOMBCT-VQTJNVASSA-N 0.000 description 1
- 238000005481 NMR spectroscopy Methods 0.000 description 1
- 241000894751 Natrialba Species 0.000 description 1
- 241000588653 Neisseria Species 0.000 description 1
- CAHGCLMLTWQZNJ-UHFFFAOYSA-N Nerifoliol Natural products CC12CCC(O)C(C)(C)C1CCC1=C2CCC2(C)C(C(CCC=C(C)C)C)CCC21C CAHGCLMLTWQZNJ-UHFFFAOYSA-N 0.000 description 1
- 244000061176 Nicotiana tabacum Species 0.000 description 1
- 235000002637 Nicotiana tabacum Nutrition 0.000 description 1
- GEYBMYRBIABFTA-VIFPVBQESA-N O-methyl-L-tyrosine Chemical compound COC1=CC=C(C[C@H](N)C(O)=O)C=C1 GEYBMYRBIABFTA-VIFPVBQESA-N 0.000 description 1
- VIIDDSZOUXRNDY-IUCAKERBSA-N OC(=O)[C@@H](N)CCCCNC(=O)[C@@H]1CCCN1 Chemical compound OC(=O)[C@@H](N)CCCCNC(=O)[C@@H]1CCCN1 VIIDDSZOUXRNDY-IUCAKERBSA-N 0.000 description 1
- OZBFLQITCMCIOY-FOUAGVGXSA-N OC[C@H]([C@H]([C@@H]([C@H]1O)O)O[C@H]2O[C@@H]([C@@H](O[C@H]3O[C@H](CO)[C@H]([C@@H]([C@H]3O)O)O[C@H]3O[C@H](CO)[C@H]([C@@H]([C@H]3O)O)O[C@H]3O[C@H](CO)[C@H]([C@@H]([C@H]3O)O)O[C@H]3O[C@H](CO)[C@H]([C@@H]([C@H]3O)O)O3)[C@H](O)[C@H]2O)CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O)[C@@H]3O[C@@H]1CO Chemical compound OC[C@H]([C@H]([C@@H]([C@H]1O)O)O[C@H]2O[C@@H]([C@@H](O[C@H]3O[C@H](CO)[C@H]([C@@H]([C@H]3O)O)O[C@H]3O[C@H](CO)[C@H]([C@@H]([C@H]3O)O)O[C@H]3O[C@H](CO)[C@H]([C@@H]([C@H]3O)O)O[C@H]3O[C@H](CO)[C@H]([C@@H]([C@H]3O)O)O3)[C@H](O)[C@H]2O)CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O)[C@@H]3O[C@@H]1CO OZBFLQITCMCIOY-FOUAGVGXSA-N 0.000 description 1
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 1
- 240000007594 Oryza sativa Species 0.000 description 1
- 235000007164 Oryza sativa Nutrition 0.000 description 1
- 241001494479 Pecora Species 0.000 description 1
- 102000005877 Peptide Initiation Factors Human genes 0.000 description 1
- 108010044843 Peptide Initiation Factors Proteins 0.000 description 1
- 244000046052 Phaseolus vulgaris Species 0.000 description 1
- 235000010627 Phaseolus vulgaris Nutrition 0.000 description 1
- YGYAWVDWMABLBF-UHFFFAOYSA-N Phosgene Chemical compound ClC(Cl)=O YGYAWVDWMABLBF-UHFFFAOYSA-N 0.000 description 1
- 101710124239 Poly(A) polymerase Proteins 0.000 description 1
- 229920002594 Polyethylene Glycol 8000 Polymers 0.000 description 1
- 108010039918 Polylysine Proteins 0.000 description 1
- 229920001213 Polysorbate 20 Polymers 0.000 description 1
- 241000979017 Pseudomonas sp. Lz4W Species 0.000 description 1
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 1
- 241000205160 Pyrococcus Species 0.000 description 1
- 108090000944 RNA Helicases Proteins 0.000 description 1
- 102000004409 RNA Helicases Human genes 0.000 description 1
- 241001506137 Rapa Species 0.000 description 1
- 102220637688 Ras-related protein Rab-33A_D91G_mutation Human genes 0.000 description 1
- 108010012737 RecQ Helicases Proteins 0.000 description 1
- 102000019196 RecQ Helicases Human genes 0.000 description 1
- 102100035123 Retrotransposon-like protein 1 Human genes 0.000 description 1
- 244000299790 Rheum rhabarbarum Species 0.000 description 1
- 235000009411 Rheum rhabarbarum Nutrition 0.000 description 1
- 241001148569 Rhodothermus Species 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 1
- 240000000111 Saccharum officinarum Species 0.000 description 1
- 235000007201 Saccharum officinarum Nutrition 0.000 description 1
- 241001312748 Salinibacter Species 0.000 description 1
- 241000607142 Salmonella Species 0.000 description 1
- 102220528872 Serum amyloid A-4 protein_A96D_mutation Human genes 0.000 description 1
- 241000607768 Shigella Species 0.000 description 1
- 229910052581 Si3N4 Inorganic materials 0.000 description 1
- 240000003768 Solanum lycopersicum Species 0.000 description 1
- 244000061456 Solanum tuberosum Species 0.000 description 1
- 235000002595 Solanum tuberosum Nutrition 0.000 description 1
- 239000004187 Spiramycin Substances 0.000 description 1
- 235000021355 Stearic acid Nutrition 0.000 description 1
- 241000122971 Stenotrophomonas Species 0.000 description 1
- 102220600132 Succinate dehydrogenase [ubiquinone] cytochrome b small subunit, mitochondrial_Q42R_mutation Human genes 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- 241000205101 Sulfolobus Species 0.000 description 1
- 241001164579 Sulfurimonas Species 0.000 description 1
- 102220539152 Superoxide dismutase [Cu-Zn]_A96G_mutation Human genes 0.000 description 1
- 229920006362 Teflon® Polymers 0.000 description 1
- 235000009470 Theobroma cacao Nutrition 0.000 description 1
- 244000299461 Theobroma cacao Species 0.000 description 1
- 241000039733 Thermoproteus thermophilus Species 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 101800001690 Transmembrane protein gp41 Proteins 0.000 description 1
- 235000021307 Triticum Nutrition 0.000 description 1
- 244000098338 Triticum aestivum Species 0.000 description 1
- 102000004142 Trypsin Human genes 0.000 description 1
- 108090000631 Trypsin Proteins 0.000 description 1
- 239000004182 Tylosin Substances 0.000 description 1
- 108010073429 Type V Secretion Systems Proteins 0.000 description 1
- HZYXFRGVBOPPNZ-UHFFFAOYSA-N UNPD88870 Natural products C1C=C2CC(O)CCC2(C)C2C1C1CCC(C(C)=CCC(CC)C(C)C)C1(C)CC2 HZYXFRGVBOPPNZ-UHFFFAOYSA-N 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 241000219094 Vitaceae Species 0.000 description 1
- 238000002441 X-ray diffraction Methods 0.000 description 1
- 241000607734 Yersinia <bacteria> Species 0.000 description 1
- 240000008042 Zea mays Species 0.000 description 1
- 235000016383 Zea mays subsp huehuetenangensis Nutrition 0.000 description 1
- 235000002017 Zea mays subsp mays Nutrition 0.000 description 1
- KILNVBDSWZSGLL-PWXLRKPBSA-N [(2r)-2,3-bis(2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10,11,11,12,12,13,13,14,14,15,15,16,16,16-hentriacontadeuteriohexadecanoyloxy)propyl] 2-(trimethylazaniumyl)ethyl phosphate Chemical compound [2H]C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C(=O)OC[C@H](COP([O-])(=O)OCC[N+](C)(C)C)OC(=O)C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])C([2H])([2H])[2H] KILNVBDSWZSGLL-PWXLRKPBSA-N 0.000 description 1
- NMRGXROOSPKRTL-SUJDGPGCSA-N [(2r)-2,3-bis(3,7,11,15-tetramethylhexadecoxy)propyl] 2-(trimethylazaniumyl)ethyl phosphate Chemical compound CC(C)CCCC(C)CCCC(C)CCCC(C)CCOC[C@H](COP([O-])(=O)OCC[N+](C)(C)C)OCCC(C)CCCC(C)CCCC(C)CCCC(C)C NMRGXROOSPKRTL-SUJDGPGCSA-N 0.000 description 1
- IDBJTPGHAMAEMV-OIVUAWODSA-N [(2r)-2,3-di(tricosa-10,12-diynoyloxy)propyl] 2-(trimethylazaniumyl)ethyl phosphate Chemical compound CCCCCCCCCCC#CC#CCCCCCCCCC(=O)OC[C@H](COP([O-])(=O)OCC[N+](C)(C)C)OC(=O)CCCCCCCCC#CC#CCCCCCCCCCC IDBJTPGHAMAEMV-OIVUAWODSA-N 0.000 description 1
- GFHJCDJVUAFINE-KXQOOQHDSA-N [(2r)-2-(16-fluorohexadecanoyloxy)-3-hexadecanoyloxypropyl] 2-(trimethylazaniumyl)ethyl phosphate Chemical compound CCCCCCCCCCCCCCCC(=O)OC[C@H](COP([O-])(=O)OCC[N+](C)(C)C)OC(=O)CCCCCCCCCCCCCCCF GFHJCDJVUAFINE-KXQOOQHDSA-N 0.000 description 1
- KPUOHXMVCZBWQC-JXOAFFINSA-N [(2r,3s,4r,5r)-5-[4-amino-5-(hydroxymethyl)-2-oxopyrimidin-1-yl]-3,4-dihydroxyoxolan-2-yl]methyl dihydrogen phosphate Chemical compound C1=C(CO)C(N)=NC(=O)N1[C@H]1[C@H](O)[C@H](O)[C@@H](COP(O)(O)=O)O1 KPUOHXMVCZBWQC-JXOAFFINSA-N 0.000 description 1
- PGNICAOCNIVZRV-NSHDSACASA-N [2-[[(5s)-5-amino-5-carboxypentyl]carbamoyloxymethyl]phenyl]-diazonioazanide Chemical compound OC(=O)[C@@H](N)CCCCNC(=O)OCC1=CC=CC=C1[N-][N+]#N PGNICAOCNIVZRV-NSHDSACASA-N 0.000 description 1
- PGAVKCOVUIYSFO-UHFFFAOYSA-N [[5-(2,4-dioxopyrimidin-1-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate Chemical compound OC1C(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)OC1N1C(=O)NC(=O)C=C1 PGAVKCOVUIYSFO-UHFFFAOYSA-N 0.000 description 1
- ASJWEHCPLGMOJE-LJMGSBPFSA-N ac1l3rvh Chemical class N1C(=O)NC(=O)[C@@]2(C)[C@@]3(C)C(=O)NC(=O)N[C@H]3[C@H]21 ASJWEHCPLGMOJE-LJMGSBPFSA-N 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 150000001266 acyl halides Chemical class 0.000 description 1
- 230000010933 acylation Effects 0.000 description 1
- 238000005917 acylation reaction Methods 0.000 description 1
- 238000013006 addition curing Methods 0.000 description 1
- NLTUCYMLOPLUHL-KQYNXXCUSA-N adenosine 5'-[gamma-thio]triphosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=S)[C@@H](O)[C@H]1O NLTUCYMLOPLUHL-KQYNXXCUSA-N 0.000 description 1
- 229960003190 adenosine monophosphate Drugs 0.000 description 1
- OQIQSTLJSLGHID-WNWIJWBNSA-N aflatoxin B1 Chemical compound C=1([C@@H]2C=CO[C@@H]2OC=1C=C(C1=2)OC)C=2OC(=O)C2=C1CCC2=O OQIQSTLJSLGHID-WNWIJWBNSA-N 0.000 description 1
- 229960003767 alanine Drugs 0.000 description 1
- HAXFWIACAGNFHA-UHFFFAOYSA-N aldrithiol Chemical compound C=1C=CC=NC=1SSC1=CC=CC=N1 HAXFWIACAGNFHA-UHFFFAOYSA-N 0.000 description 1
- 125000001931 aliphatic group Chemical group 0.000 description 1
- 229910052783 alkali metal Inorganic materials 0.000 description 1
- 229910001514 alkali metal chloride Inorganic materials 0.000 description 1
- 125000002355 alkine group Chemical group 0.000 description 1
- 125000005336 allyloxy group Chemical group 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 210000004381 amniotic fluid Anatomy 0.000 description 1
- 235000021016 apples Nutrition 0.000 description 1
- 239000012736 aqueous medium Substances 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 239000011668 ascorbic acid Substances 0.000 description 1
- 235000010323 ascorbic acid Nutrition 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 150000001508 asparagines Chemical class 0.000 description 1
- 229940009098 aspartate Drugs 0.000 description 1
- CKLJMWTZIZZHCS-REOHCLBHSA-L aspartate group Chemical class N[C@@H](CC(=O)[O-])C(=O)[O-] CKLJMWTZIZZHCS-REOHCLBHSA-L 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 238000000429 assembly Methods 0.000 description 1
- 230000000712 assembly Effects 0.000 description 1
- 230000003416 augmentation Effects 0.000 description 1
- IVRMZWNICZWHMI-UHFFFAOYSA-N azide group Chemical group [N-]=[N+]=[N-] IVRMZWNICZWHMI-UHFFFAOYSA-N 0.000 description 1
- 150000001541 aziridines Chemical class 0.000 description 1
- 125000004069 aziridinyl group Chemical group 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 235000021015 bananas Nutrition 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 1
- MJVXAPPOFPTTCA-UHFFFAOYSA-N beta-Sitosterol Natural products CCC(CCC(C)C1CCC2C3CC=C4C(C)C(O)CCC4(C)C3CCC12C)C(C)C MJVXAPPOFPTTCA-UHFFFAOYSA-N 0.000 description 1
- NJKOMDUNNDKEAI-UHFFFAOYSA-N beta-sitosterol Natural products CCC(CCC(C)C1CCC2(C)C3CC=C4CC(O)CCC4C3CCC12C)C(C)C NJKOMDUNNDKEAI-UHFFFAOYSA-N 0.000 description 1
- 125000004057 biotinyl group Chemical group [H]N1C(=O)N([H])[C@]2([H])[C@@]([H])(SC([H])([H])[C@]12[H])C([H])([H])C([H])([H])C([H])([H])C([H])([H])C(*)=O 0.000 description 1
- JCZLABDVDPYLRZ-AWEZNQCLSA-N biphenylalanine Chemical compound C1=CC(C[C@H](N)C(O)=O)=CC=C1C1=CC=CC=C1 JCZLABDVDPYLRZ-AWEZNQCLSA-N 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 235000019282 butylated hydroxyanisole Nutrition 0.000 description 1
- 239000006227 byproduct Substances 0.000 description 1
- 210000004899 c-terminal region Anatomy 0.000 description 1
- 239000001527 calcium lactate Substances 0.000 description 1
- VTJUKNSKBAOEHE-UHFFFAOYSA-N calixarene Chemical class COC(=O)COC1=C(CC=2C(=C(CC=3C(=C(C4)C=C(C=3)C(C)(C)C)OCC(=O)OC)C=C(C=2)C(C)(C)C)OCC(=O)OC)C=C(C(C)(C)C)C=C1CC1=C(OCC(=O)OC)C4=CC(C(C)(C)C)=C1 VTJUKNSKBAOEHE-UHFFFAOYSA-N 0.000 description 1
- 239000004202 carbamide Substances 0.000 description 1
- 150000001721 carbon Chemical group 0.000 description 1
- 239000002041 carbon nanotube Substances 0.000 description 1
- 229910021393 carbon nanotube Inorganic materials 0.000 description 1
- 239000000969 carrier Substances 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- ZVEQCJWYRWKARO-UHFFFAOYSA-N ceramide Natural products CCCCCCCCCCCCCCC(O)C(=O)NC(CO)C(O)C=CCCC=C(C)CCCCCCCCC ZVEQCJWYRWKARO-UHFFFAOYSA-N 0.000 description 1
- 235000013339 cereals Nutrition 0.000 description 1
- 229960000541 cetyl alcohol Drugs 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 238000003508 chemical denaturation Methods 0.000 description 1
- 238000007385 chemical modification Methods 0.000 description 1
- 235000013330 chicken meat Nutrition 0.000 description 1
- BBJQPKLGPMQWBU-JADYGXMDSA-N cholesteryl palmitate Chemical compound C([C@@H]12)C[C@]3(C)[C@@H]([C@H](C)CCCC(C)C)CC[C@H]3[C@@H]1CC=C1[C@]2(C)CC[C@H](OC(=O)CCCCCCCCCCCCCCC)C1 BBJQPKLGPMQWBU-JADYGXMDSA-N 0.000 description 1
- ALSTYHKOOCGGFT-UHFFFAOYSA-N cis-oleyl alcohol Natural products CCCCCCCCC=CCCCCCCCCO ALSTYHKOOCGGFT-UHFFFAOYSA-N 0.000 description 1
- KRKNYBCHXYNGOX-UHFFFAOYSA-N citric acid Substances OC(=O)CC(O)(C(O)=O)CC(O)=O KRKNYBCHXYNGOX-UHFFFAOYSA-N 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 150000003983 crown ethers Chemical class 0.000 description 1
- 239000013078 crystal Substances 0.000 description 1
- 238000002425 crystallisation Methods 0.000 description 1
- 230000008025 crystallization Effects 0.000 description 1
- 208000030381 cutaneous melanoma Diseases 0.000 description 1
- 229940097362 cyclodextrins Drugs 0.000 description 1
- 125000000640 cyclooctyl group Chemical group [H]C1([H])C([H])([H])C([H])([H])C([H])([H])C([H])(*)C([H])([H])C([H])([H])C1([H])[H] 0.000 description 1
- 125000001887 cyclopentyloxy group Chemical group C1(CCCC1)O* 0.000 description 1
- UHDGCWIWMRVCDJ-ZAKLUEHWSA-N cytidine Chemical compound O=C1N=C(N)C=CN1[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O1 UHDGCWIWMRVCDJ-ZAKLUEHWSA-N 0.000 description 1
- NHFQNAGPXIVKND-UHFFFAOYSA-N dbco-maleimide Chemical compound C1C2=CC=CC=C2C#CC2=CC=CC=C2N1C(=O)CCNC(=O)CCN1C(=O)C=CC1=O NHFQNAGPXIVKND-UHFFFAOYSA-N 0.000 description 1
- XCEBOJWFQSQZKR-UHFFFAOYSA-N dbco-nhs Chemical compound C1C2=CC=CC=C2C#CC2=CC=CC=C2N1C(=O)CCC(=O)ON1C(=O)CCC1=O XCEBOJWFQSQZKR-UHFFFAOYSA-N 0.000 description 1
- VVFZXPZWVJMYPX-UHFFFAOYSA-N dbco-peg4-maleimide Chemical compound C1C2=CC=CC=C2C#CC2=CC=CC=C2N1C(=O)CCNC(=O)CCOCCOCCOCCOCCNC(=O)CCN1C(=O)C=CC1=O VVFZXPZWVJMYPX-UHFFFAOYSA-N 0.000 description 1
- KTIOBJVNCOFWCL-UHFFFAOYSA-N dbco-peg4-amine Chemical compound NCCOCCOCCOCCOCCC(=O)NCCC(=O)N1CC2=CC=CC=C2C#CC2=CC=CC=C12 KTIOBJVNCOFWCL-UHFFFAOYSA-N 0.000 description 1
- RRCXYKNJTKJNTD-UHFFFAOYSA-N dbco-peg4-nhs ester Chemical compound C1C2=CC=CC=C2C#CC2=CC=CC=C2N1C(=O)CCC(=O)NCCOCCOCCOCCOCCC(=O)ON1C(=O)CCC1=O RRCXYKNJTKJNTD-UHFFFAOYSA-N 0.000 description 1
- ZJVGOGQIAYMKAS-MZOCQUDTSA-N dbco-s-s-peg3-biotin Chemical compound C1C2=CC=CC=C2C#CC2=CC=CC=C2N1C(=O)CCC(=O)NCCSSCCC(=O)NCCOCCOCCOCCNC(=O)CCCC[C@H]1[C@H]2NC(=O)N[C@H]2CS1 ZJVGOGQIAYMKAS-MZOCQUDTSA-N 0.000 description 1
- DTPCFIHYWYONMD-UHFFFAOYSA-N decaethylene glycol Chemical compound OCCOCCOCCOCCOCCOCCOCCOCCOCCOCCO DTPCFIHYWYONMD-UHFFFAOYSA-N 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- 238000010511 deprotection reaction Methods 0.000 description 1
- 238000011033 desalting Methods 0.000 description 1
- ROSVUDLPLHNVHN-UHFFFAOYSA-N dibenzocyclooctynol Chemical compound C1#CCCC2=CC=CC=C2C2=C1C=CC=C2O ROSVUDLPLHNVHN-UHFFFAOYSA-N 0.000 description 1
- 238000002050 diffraction method Methods 0.000 description 1
- 238000009792 diffusion process Methods 0.000 description 1
- QBSJHOGDIUQWTH-UHFFFAOYSA-N dihydrolanosterol Natural products CC(C)CCCC(C)C1CCC2(C)C3=C(CCC12C)C4(C)CCC(C)(O)C(C)(C)C4CC3 QBSJHOGDIUQWTH-UHFFFAOYSA-N 0.000 description 1
- 239000003085 diluting agent Substances 0.000 description 1
- 239000001177 diphosphate Substances 0.000 description 1
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 1
- 235000011180 diphosphates Nutrition 0.000 description 1
- KPUWHANPEXNPJT-UHFFFAOYSA-N disiloxane Chemical class [SiH3]O[SiH3] KPUWHANPEXNPJT-UHFFFAOYSA-N 0.000 description 1
- 238000010494 dissociation reaction Methods 0.000 description 1
- 230000005593 dissociations Effects 0.000 description 1
- 238000004821 distillation Methods 0.000 description 1
- AFOSIXZFDONLBT-UHFFFAOYSA-N divinyl sulfone Chemical class C=CS(=O)(=O)C=C AFOSIXZFDONLBT-UHFFFAOYSA-N 0.000 description 1
- 239000003651 drinking water Substances 0.000 description 1
- 235000020188 drinking water Nutrition 0.000 description 1
- 235000013399 edible fruits Nutrition 0.000 description 1
- 229920001971 elastomer Polymers 0.000 description 1
- 239000000806 elastomer Substances 0.000 description 1
- 230000005611 electricity Effects 0.000 description 1
- 238000002848 electrochemical method Methods 0.000 description 1
- 239000008151 electrolyte solution Substances 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000009088 enzymatic function Effects 0.000 description 1
- 150000002118 epoxides Chemical class 0.000 description 1
- DNVPQKQSNYMLRS-SOWFXMKYSA-N ergosterol Chemical compound C1[C@@H](O)CC[C@]2(C)[C@H](CC[C@]3([C@H]([C@H](C)/C=C/[C@@H](C)C(C)C)CC[C@H]33)C)C3=CC=C21 DNVPQKQSNYMLRS-SOWFXMKYSA-N 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 1
- 229960005542 ethidium bromide Drugs 0.000 description 1
- 108010052305 exodeoxyribonuclease III Proteins 0.000 description 1
- 238000001125 extrusion Methods 0.000 description 1
- 230000006126 farnesylation Effects 0.000 description 1
- 150000002191 fatty alcohols Chemical class 0.000 description 1
- 238000000198 fluorescence anisotropy Methods 0.000 description 1
- 239000000446 fuel Substances 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 150000002306 glutamic acid derivatives Chemical class 0.000 description 1
- 235000004554 glutamine Nutrition 0.000 description 1
- 150000002309 glutamines Chemical class 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 235000021021 grapes Nutrition 0.000 description 1
- 229940029575 guanosine Drugs 0.000 description 1
- 125000005179 haloacetyl group Chemical group 0.000 description 1
- 239000003228 hemolysin Substances 0.000 description 1
- 230000002949 hemolytic effect Effects 0.000 description 1
- 125000000623 heterocyclic group Chemical group 0.000 description 1
- BXWNKGSJHAJOGX-UHFFFAOYSA-N hexadecan-1-ol Chemical compound CCCCCCCCCCCCCCCCO BXWNKGSJHAJOGX-UHFFFAOYSA-N 0.000 description 1
- KYYWBEYKBLQSFW-UHFFFAOYSA-N hexadecanoic acid Chemical compound CCCCCCCCCCCCCCCC(O)=O.CCCCCCCCCCCCCCCC(O)=O KYYWBEYKBLQSFW-UHFFFAOYSA-N 0.000 description 1
- 238000004128 high performance liquid chromatography Methods 0.000 description 1
- 229920001519 homopolymer Polymers 0.000 description 1
- 150000002429 hydrazines Chemical class 0.000 description 1
- 229930195733 hydrocarbon Natural products 0.000 description 1
- 150000002430 hydrocarbons Chemical class 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 229920001600 hydrophobic polymer Polymers 0.000 description 1
- WGCNASOHLSPBMP-UHFFFAOYSA-N hydroxyacetaldehyde Natural products OCC=O WGCNASOHLSPBMP-UHFFFAOYSA-N 0.000 description 1
- 238000002847 impedance measurement Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 230000000415 inactivating effect Effects 0.000 description 1
- 230000001939 inductive effect Effects 0.000 description 1
- 229910010272 inorganic material Inorganic materials 0.000 description 1
- 239000011147 inorganic material Substances 0.000 description 1
- 229920000592 inorganic polymer Polymers 0.000 description 1
- 238000003780 insertion Methods 0.000 description 1
- 230000037431 insertion Effects 0.000 description 1
- 239000011810 insulating material Substances 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 239000002608 ionic liquid Substances 0.000 description 1
- 229910052742 iron Inorganic materials 0.000 description 1
- 150000002540 isothiocyanates Chemical class 0.000 description 1
- 238000009533 lab test Methods 0.000 description 1
- 229940058690 lanosterol Drugs 0.000 description 1
- CAHGCLMLTWQZNJ-RGEKOYMOSA-N lanosterol Chemical compound C([C@]12C)C[C@@H](O)C(C)(C)[C@H]1CCC1=C2CC[C@]2(C)[C@H]([C@H](CCC=C(C)C)C)CC[C@@]21C CAHGCLMLTWQZNJ-RGEKOYMOSA-N 0.000 description 1
- 235000021374 legumes Nutrition 0.000 description 1
- 229960004502 levodopa Drugs 0.000 description 1
- 210000002751 lymph Anatomy 0.000 description 1
- 229920002521 macromolecule Polymers 0.000 description 1
- 235000009973 maize Nutrition 0.000 description 1
- 239000001630 malic acid Substances 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 201000001441 melanoma Diseases 0.000 description 1
- 125000000956 methoxy group Chemical group [H]C([H])([H])O* 0.000 description 1
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 description 1
- PQIOSYKVBBWRRI-UHFFFAOYSA-N methylphosphonyl difluoride Chemical group CP(F)(F)=O PQIOSYKVBBWRRI-UHFFFAOYSA-N 0.000 description 1
- 239000011325 microbead Substances 0.000 description 1
- 238000004377 microelectronic Methods 0.000 description 1
- 238000000386 microscopy Methods 0.000 description 1
- 239000004005 microsphere Substances 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 150000004712 monophosphates Chemical class 0.000 description 1
- 210000003097 mucus Anatomy 0.000 description 1
- DDBRXOJCLVGHLX-UHFFFAOYSA-N n,n-dimethylmethanamine;propane Chemical compound CCC.CN(C)C DDBRXOJCLVGHLX-UHFFFAOYSA-N 0.000 description 1
- WQEPLUUGTLDZJY-UHFFFAOYSA-N n-Pentadecanoic acid Natural products CCCCCCCCCCCCCCC(O)=O WQEPLUUGTLDZJY-UHFFFAOYSA-N 0.000 description 1
- VVGIYYKRAMHVLU-UHFFFAOYSA-N newbouldiamide Natural products CCCCCCCCCCCCCCCCCCCC(O)C(O)C(O)C(CO)NC(=O)CCCCCCCCCCCCCCCCC VVGIYYKRAMHVLU-UHFFFAOYSA-N 0.000 description 1
- 230000001293 nucleolytic effect Effects 0.000 description 1
- 239000012038 nucleophile Substances 0.000 description 1
- QIQXTHQIDYTFRH-UHFFFAOYSA-N octadecanoic acid Chemical compound CCCCCCCCCCCCCCCCCC(O)=O QIQXTHQIDYTFRH-UHFFFAOYSA-N 0.000 description 1
- OQCDKBAXFALNLD-UHFFFAOYSA-N octadecanoic acid Natural products CCCCCCCC(C)CCCCCCCCC(O)=O OQCDKBAXFALNLD-UHFFFAOYSA-N 0.000 description 1
- 239000000574 octyl gallate Substances 0.000 description 1
- 235000021313 oleic acid Nutrition 0.000 description 1
- 239000002751 oligonucleotide probe Substances 0.000 description 1
- 239000011368 organic material Substances 0.000 description 1
- 229920000620 organic polymer Polymers 0.000 description 1
- 150000002907 osmium Chemical class 0.000 description 1
- 108010014203 outer membrane phospholipase A Proteins 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 238000012856 packing Methods 0.000 description 1
- 238000010422 painting Methods 0.000 description 1
- TVIDEEHSOPHZBR-AWEZNQCLSA-N para-(benzoyl)-phenylalanine Chemical compound C1=CC(C[C@H](N)C(O)=O)=CC=C1C(=O)C1=CC=CC=C1 TVIDEEHSOPHZBR-AWEZNQCLSA-N 0.000 description 1
- 150000002972 pentoses Chemical class 0.000 description 1
- 230000002093 peripheral effect Effects 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 150000003904 phospholipids Chemical class 0.000 description 1
- 150000008300 phosphoramidites Chemical class 0.000 description 1
- 150000003013 phosphoric acid derivatives Chemical class 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 230000004962 physiological condition Effects 0.000 description 1
- 210000002381 plasma Anatomy 0.000 description 1
- 239000004033 plastic Substances 0.000 description 1
- 229920003023 plastic Polymers 0.000 description 1
- 229920003192 poly(bis maleimide) Polymers 0.000 description 1
- 229920002704 polyhistidine Polymers 0.000 description 1
- 229920000656 polylysine Polymers 0.000 description 1
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 1
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 1
- 239000004175 ponceau 4R Substances 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 239000000737 potassium alginate Substances 0.000 description 1
- 235000012015 potatoes Nutrition 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 239000013615 primer Substances 0.000 description 1
- 238000000358 protein NMR Methods 0.000 description 1
- 150000003212 purines Chemical class 0.000 description 1
- 150000003230 pyrimidines Chemical class 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- 230000010076 replication Effects 0.000 description 1
- 230000000717 retained effect Effects 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 210000003705 ribosome Anatomy 0.000 description 1
- DWRXFEITVBNRMK-JXOAFFINSA-N ribothymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 DWRXFEITVBNRMK-JXOAFFINSA-N 0.000 description 1
- 235000009566 rice Nutrition 0.000 description 1
- 102220198221 rs1057519921 Human genes 0.000 description 1
- 102200152572 rs1060503493 Human genes 0.000 description 1
- 102220266761 rs143103316 Human genes 0.000 description 1
- 102220344476 rs1555685159 Human genes 0.000 description 1
- 102220036433 rs35389822 Human genes 0.000 description 1
- 102200076325 rs5658 Human genes 0.000 description 1
- 102220058097 rs730881898 Human genes 0.000 description 1
- 102220188881 rs747642461 Human genes 0.000 description 1
- 102220213938 rs768671254 Human genes 0.000 description 1
- 102220100740 rs878854050 Human genes 0.000 description 1
- 102220158369 rs886047453 Human genes 0.000 description 1
- 210000003296 saliva Anatomy 0.000 description 1
- 229920006395 saturated elastomer Polymers 0.000 description 1
- 239000013535 sea water Substances 0.000 description 1
- 238000007789 sealing Methods 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 239000013049 sediment Substances 0.000 description 1
- 150000003355 serines Chemical class 0.000 description 1
- 210000002966 serum Anatomy 0.000 description 1
- 239000003579 shift reagent Substances 0.000 description 1
- LIVNPJMFVYWSIS-UHFFFAOYSA-N silicon monoxide Inorganic materials [Si-]#[O+] LIVNPJMFVYWSIS-UHFFFAOYSA-N 0.000 description 1
- 229920002379 silicone rubber Polymers 0.000 description 1
- 239000004945 silicone rubber Substances 0.000 description 1
- KZJWDPNRJALLNS-VJSFXXLFSA-N sitosterol Chemical compound C1C=C2C[C@@H](O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H]([C@H](C)CC[C@@H](CC)C(C)C)[C@@]1(C)CC2 KZJWDPNRJALLNS-VJSFXXLFSA-N 0.000 description 1
- 229950005143 sitosterol Drugs 0.000 description 1
- 235000015500 sitosterol Nutrition 0.000 description 1
- NLQLSVXGSXCXFE-UHFFFAOYSA-N sitosterol Natural products CC=C(/CCC(C)C1CC2C3=CCC4C(C)C(O)CCC4(C)C3CCC2(C)C1)C(C)C NLQLSVXGSXCXFE-UHFFFAOYSA-N 0.000 description 1
- 201000003708 skin melanoma Diseases 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
- 239000004320 sodium erythorbate Substances 0.000 description 1
- 239000001488 sodium phosphate Substances 0.000 description 1
- 229910000162 sodium phosphate Inorganic materials 0.000 description 1
- 238000000527 sonication Methods 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 238000004611 spectroscopical analysis Methods 0.000 description 1
- 230000002269 spontaneous effect Effects 0.000 description 1
- 239000008117 stearic acid Substances 0.000 description 1
- 230000000707 stereoselective effect Effects 0.000 description 1
- 229940032091 stigmasterol Drugs 0.000 description 1
- HCXVJBMSMIARIN-PHZDYDNGSA-N stigmasterol Chemical compound C1C=C2C[C@@H](O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H]([C@H](C)/C=C/[C@@H](CC)C(C)C)[C@@]1(C)CC2 HCXVJBMSMIARIN-PHZDYDNGSA-N 0.000 description 1
- 235000016831 stigmasterol Nutrition 0.000 description 1
- BFDNMXAIBMJLBB-UHFFFAOYSA-N stigmasterol Natural products CCC(C=CC(C)C1CCCC2C3CC=C4CC(O)CCC4(C)C3CCC12C)C(C)C BFDNMXAIBMJLBB-UHFFFAOYSA-N 0.000 description 1
- 230000035892 strand transfer Effects 0.000 description 1
- 229960002317 succinimide Drugs 0.000 description 1
- YBBRCQOCSYXUOC-UHFFFAOYSA-N sulfuryl dichloride Chemical compound ClS(Cl)(=O)=O YBBRCQOCSYXUOC-UHFFFAOYSA-N 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 238000001847 surface plasmon resonance imaging Methods 0.000 description 1
- 239000004094 surface-active agent Substances 0.000 description 1
- 229920001059 synthetic polymer Polymers 0.000 description 1
- 150000003568 thioethers Chemical group 0.000 description 1
- 125000003396 thiol group Chemical group [H]S* 0.000 description 1
- 235000008521 threonine Nutrition 0.000 description 1
- 150000003588 threonines Chemical class 0.000 description 1
- GUKSGXOLJNWRLZ-UHFFFAOYSA-N thymine glycol Chemical compound CC1(O)C(O)NC(=O)NC1=O GUKSGXOLJNWRLZ-UHFFFAOYSA-N 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 230000005945 translocation Effects 0.000 description 1
- 239000013638 trimer Substances 0.000 description 1
- RYFMWSXOAZQYPI-UHFFFAOYSA-K trisodium phosphate Chemical compound [Na+].[Na+].[Na+].[O-]P([O-])([O-])=O RYFMWSXOAZQYPI-UHFFFAOYSA-K 0.000 description 1
- 239000012588 trypsin Substances 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 1
- 229940045145 uridine Drugs 0.000 description 1
- 210000002700 urine Anatomy 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
- 239000011534 wash buffer Substances 0.000 description 1
- JPZXHKDZASGCLU-LBPRGKRZSA-N β-(2-naphthyl)-alanine Chemical compound C1=CC=CC2=CC(C[C@H](N)C(O)=O)=CC=C21 JPZXHKDZASGCLU-LBPRGKRZSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2521/00—Reaction characterised by the enzymatic activity
- C12Q2521/10—Nucleotidyl transfering
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2521/00—Reaction characterised by the enzymatic activity
- C12Q2521/50—Other enzymatic activities
- C12Q2521/513—Winding/unwinding enzyme, e.g. helicase
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2522/00—Reaction characterised by the use of non-enzymatic proteins
- C12Q2522/10—Nucleic acid binding proteins
- C12Q2522/101—Single or double stranded nucleic acid binding proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2525/00—Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
- C12Q2525/10—Modifications characterised by
- C12Q2525/155—Modifications characterised by incorporating/generating a new priming site
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2525/00—Reactions involving modified oligonucleotides, nucleic acids, or nucleotides
- C12Q2525/10—Modifications characterised by
- C12Q2525/191—Modifications characterised by incorporating an adaptor
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2535/00—Reactions characterised by the assay type for determining the identity of a nucleotide base or a sequence of oligonucleotides
- C12Q2535/122—Massive parallel sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2565/00—Nucleic acid analysis characterised by mode or means of detection
- C12Q2565/60—Detection means characterised by use of a special device
- C12Q2565/631—Detection means characterised by use of a special device being a biochannel or pore
Definitions
- the invention relates to a method for modifying a template double stranded polynucleotide, especially for characterisation using nanopore sequencing.
- There are many commercial situations which require the preparation of a nucleic acid library. This is frequently achieved using a transposase. Depending on the transposase which is used to prepare the library it may be necessary to repair the transposition events in vitro before the library can be used, for example in sequencing.
- Transmembrane pores (nanopores) have great potential as direct, electrical biosensors for polymers and a variety of small molecules.
- recent focus has been given to nanopores as a potential DNA sequencing technology.
- Nanopore detection of the nucleotide gives a current change of known signature and duration.
- Strand sequencing can involve the use of a molecular brake to control the movement of the polynucleotide through the pore.
- the method produces from the template a plurality of modified double stranded polynucleotides. These modified polynucleotides can then be characterised.
- the inventors have surprisingly demonstrated that it is possible to remove a MuA transposase from modified polynucleotides using a translocase. This avoids the need to heat inactivate the MuA transposase, which may also inactivate any other enzymes or proteins being used in the preparation or characterisation of the modified polynucleotides. Removing the heat inactivation step also dispenses with the need for additional equipment such as a thermal cycler or water bath, used for heating up the sample.
- the invention therefore provides a method for modifying a template double stranded polynucleotide, comprising:
- FIG. 1 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- No PhiX peak was observed between the upper and lower markers for transpososome 1 (labelled 1) or transpososome 2 (labelled 2) when incubated at room temperature in the absence of an enzyme.
- FIG. 2 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- a PhiX peak was observed between the upper and lower markers for transpososome 1 (labelled 1) when incubated at 75° C. for 10 minutes.
- FIG. 3 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- a PhiX peak was observed between the upper and lower markers for transpososome 2 (labelled 1) when incubated at 75° C. for 10 minutes.
- FIG. 4 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- a PhiX peak was not observed between the upper and lower markers for transpososome 1 (labelled 1) when incubated with Hel308Mbu-E284C/S615C-STrEP(C) (SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus).
- FIG. 5 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- a PhiX peak was observed between the upper and lower markers for transpososome 2 (labelled 1) when incubated with Hel308Mbu-E284C/S615C-STrEP (SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus).
- FIG. 6 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- a PhiX peak was observed between the upper and lower markers for transpososome 2 (labelled 1) when incubated with either A) Hel308Mbu-E284C/S615C-STrEP(C) (SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus) or B) at 75° C. for 10 minutes.
- FIG. 7 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- Line 1 corresponds to control sample (i) which has been incubated at room temperature in the absence of a translocase. No tagmentation peak was observed for sample (i).
- Line 2 corresponds to sample (ii) which has been incubated at 75° C. A tagmentation peak was observed between the upper and lower markers with sample (ii).
- FIG. 8 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- Line 1 corresponds to control sample (i) which has been incubated at room temperature in the absence of a translocase. No tagmentation peak was observed for sample (i).
- Line 3 corresponds to sample (iii) which has been incubated at room temperature with Hel308Mbu-E284C-STrEP(C) (SEQ ID NO: 10 with mutation E284C with a streptavidin tag attached at its C terminus). A tagmentation peak was observed between the upper and lower markers with sample (iii).
- FIG. 9 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- Line 1 corresponds to control sample (i) which has been incubated at room temperature in the absence of a translocase. No tagmentation peak was observed for sample (i).
- a tagmentation peak was observed between the upper and lower markers with sample (iv).
- FIG. 10 shows an Agilent 2100 Bioanalyser trace.
- the lower marker is labelled X and the upper marker is labelled Y.
- Line 1 corresponds to control sample (i) which has been incubated at room temperature in the absence of a translocase. No tagmentation peak was observed for sample (i).
- Line 5 corresponds to sample (v) which has been incubated at room temperature with UvrD Eco-(E117C/M380C)-STrEP (SEQ ID NO: 122 with mutations E177C/M380C with a streptavidin tag attached at the C terminus). A tagmentation peak was observed between the upper and lower markers with sample (v).
- FIG. 12 shows a cartoon representation of a translocase being used to remove a MuA transposase from a construct.
- the MuA transposase (labelled A) is bound to a double stranded MuA substrate (labelled B) which has two overhangs labelled C at each end of one of the strands.
- the MuA fragments the template polynucleotide and ligates a double stranded MuA substrate to one end producing construct D.
- the translocase (labelled E) was allowed to bind to the construct at one of the overhangs.
- the translocase removes the MuA from the construct producing a modified double stranded polynucleotide.
- a leader was attached to the double stranded polynucleotide which had an enzyme (labelled F) pre-bound which was capable of controlling the movement of the polynucleotide through a nanopore.
- SEQ ID NO: 1 shows the codon optimised polynucleotide sequence encoding the MS-B1 mutant MspA monomer. This mutant lacks the signal sequence and includes the following mutations: D90N, D91N, D93N, D118R, D134R and E139K.
- SEQ ID NO: 2 shows the amino acid sequence of the mature form of the MS-B1 mutant of the MspA monomer. This mutant lacks the signal sequence and includes the following mutations: D90N, D91N, D93N, D118R, D134R and E139K.
- SEQ ID NO: 3 shows the polynucleotide sequence encoding one monomer of α-hemolysin-E111N/K147N (α-HL-NN; Stoddart et al., PNAS, 2009; 106(19): 7702-7707).
- SEQ ID NO: 4 shows the amino acid sequence of one monomer of α-HL-NN.
- SEQ ID NOs: 5 to 7 show the amino acid sequences of MspB, C and D.
- SEQ ID NO: 8 shows the amino acid sequence of the Hel308 motif.
- SEQ ID NO: 9 shows the amino acid sequence of the extended Hel308 motif.
- SEQ ID NOs: 10 to 58 show the amino acid sequences of Hel308 helicases in Table 1.
- SEQ ID NO: 59 shows the RecD-like motif I.
- SEQ ID NOs: 60 to 62 show the extended RecD-like motif I.
- SEQ ID NO: 63 shows the RecD motif I.
- SEQ ID NO: 64 shows a preferred RecD motif I, namely G G P G T G K T.
- SEQ ID NOs: 65 to 67 show the extended RecD motif I.
- SEQ ID NO: 68 shows the RecD-like motif V.
- SEQ ID NO: 69 shows the RecD motif V.
- SEQ ID NOs: 70 to 77 show the MobF motif III.
- SEQ ID NOs: 78 to 84 show the MobQ motif III.
- SEQ ID NO: 85 shows the amino acid sequence of TraI Eco.
- SEQ ID NO: 86 shows the RecD-like motif I of TraI Eco.
- SEQ ID NO: 87 shows the RecD-like motif V of TraI Eco.
- SEQ ID NO: 88 shows the MobF motif III of TraI Eco.
- SEQ ID NO: 89 shows the XPD motif V.
- SEQ ID NO: 90 shows XPD motif VI.
- SEQ ID NO: 91 shows the amino acid sequence of XPD Mbu.
- SEQ ID NO: 92 shows the XPD motif V of XPD Mbu.
- SEQ ID NO: 93 shows XPD motif VI of XPD Mbu.
- SEQ ID NO: 94 shows the polynucleotide sequence of the double stranded portion of a MuA substrate of the invention.
- SEQ ID NO: 95 shows the polynucleotide sequence of the double stranded portion of a MuA substrate of the invention. This sequence is complementary to SEQ ID NO: 94 except that it contains a U at the 3′ end.
- SEQ ID NO: 96 shows the polynucleotide sequence of the overhang strand of the double stranded MuA substrate of the invention.
- SEQ ID NO: 97 shows the amino acid sequence of Dda 1993.
- SEQ ID NOs: 98 to 112 show the amino acid sequences of other Dda helicases for use in the invention.
- SEQ ID NO: 113 shows the codon optimised polynucleotide sequence encoding the wild-type CsgG monomer from Escherichia coli Str. K-12 substr. MC4100. This monomer lacks the signal sequence.
- SEQ ID NO: 114 shows the amino acid sequence of the mature form of the wild-type CsgG monomer from Escherichia coli Str. K-12 substr. MC4100. This monomer lacks the signal sequence.
- the abbreviation used for this CsgG is CsgG-Eco.
- SEQ ID NOs: 115 to 121 show polynucleotide sequences used in the Examples.
- SEQ ID NO: 122 shows the amino acid sequence of UvrD-Eco wild-type.
- reference to “a polynucleotide” includes “polynucleotides”,
- reference to “a substrate” includes two or more such substrates,
- reference to “a transmembrane protein pore” includes two or more such pores, and the like.
- the “/” symbol means “and”: E94/P108 means E94 and P108, and E94D/P108K means E94D and P108K.
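The residue notation used throughout (for example E284C/S615C in the figure descriptions) can be split mechanically on the “/” separator. The following is a minimal sketch, not part of the application; the names `Residue` and `parse_residues` are hypothetical, and it assumes each token is a one-letter original residue, a position, and an optional one-letter replacement.

```python
import re
from typing import List, NamedTuple, Optional


class Residue(NamedTuple):
    original: str                # one-letter code of the residue in the reference sequence
    position: int                # position number as written in the notation
    replacement: Optional[str]   # substituted residue, or None when only a position is named


# One token: original residue, position, optional replacement (e.g. "E94" or "E94D").
_TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z])?$")


def parse_residues(notation: str) -> List[Residue]:
    """Split a '/'-separated notation such as 'E284C/S615C' into its residues."""
    residues = []
    for token in notation.split("/"):
        match = _TOKEN.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised residue token: {token!r}")
        original, position, replacement = match.groups()
        residues.append(Residue(original, int(position), replacement))
    return residues
```

Under these assumptions, `parse_residues("E284C/S615C")` yields two substitutions, while `parse_residues("E94/P108")` yields two positions with no replacement residue.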
- the present invention provides a method of modifying a template polynucleotide.
- the template may be modified for any purpose.
- the method is preferably for modifying a template polynucleotide for characterisation, such as for strand sequencing.
- the template polynucleotide is typically the polynucleotide that will ultimately be characterised, or sequenced, in accordance with the invention. This is discussed in more detail below.
- the method provided is a method for modifying a double stranded polynucleotide template, comprising: (a) contacting the polynucleotide template with a MuA transposase in the presence of a double stranded MuA substrate that comprises an overhang at one or both ends of one strand, such that the MuA transposase (i) processes the template polynucleotide to produce a plurality of double stranded fragments and (ii) ligates the double stranded MuA substrate to one or both ends of a double stranded fragment of the plurality, thereby producing a ligation product to which is bound a MuA transposase; and (b) contacting the ligation product with a translocase, such that the translocase processes the ligation product to remove the MuA transposase, thereby producing a plurality of modified double stranded polynucleotides.
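The two steps of the method can be pictured with a deliberately simplified toy model. This sketch is an illustration only and makes several assumptions not stated in the application: cut sites are chosen uniformly at random, every fragment receives a substrate at both ends (the method also allows one end), and the translocase removes every bound transposase. The names `Fragment`, `step_a_fragment_and_ligate` and `step_b_translocase` are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Fragment:
    sequence: str
    substrate_ends: int = 0          # number of ends carrying a MuA substrate (toy field)
    transposase_bound: bool = False  # True while MuA is still bound to the ligation product


def step_a_fragment_and_ligate(template: str, n_cuts: int, rng: random.Random) -> List[Fragment]:
    """Toy stand-in for step (a): cut at random sites and tag both ends of each fragment."""
    n_cuts = max(0, min(n_cuts, len(template) - 1))
    cut_sites = sorted(rng.sample(range(1, len(template)), n_cuts))
    bounds = [0] + cut_sites + [len(template)]
    return [
        Fragment(template[start:end], substrate_ends=2, transposase_bound=True)
        for start, end in zip(bounds, bounds[1:])
    ]


def step_b_translocase(products: List[Fragment]) -> List[Fragment]:
    """Toy stand-in for step (b): the translocase removes the bound transposase."""
    return [
        Fragment(p.sequence, p.substrate_ends, transposase_bound=False)
        for p in products
    ]


if __name__ == "__main__":
    template = "ACGT" * 25  # placeholder sequence, not taken from the application
    ligation_products = step_a_fragment_and_ligate(template, n_cuts=4, rng=random.Random(0))
    modified = step_b_translocase(ligation_products)
    for fragment in modified:
        print(len(fragment.sequence), fragment.substrate_ends, fragment.transposase_bound)
```

The model captures only the bookkeeping: the fragments together cover the template, each is shorter than the template, and the transposase is bound after step (a) and absent after step (b).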
- the method involves the formation of a plurality of modified double stranded polynucleotides.
- modified double stranded polynucleotides are typically easier to characterise than the template polynucleotide, especially using strand sequencing.
- the plurality of modified double stranded polynucleotides may themselves be characterised in order to facilitate the characterisation of the template polynucleotide.
- the sequence of the template polynucleotide can be determined by sequencing each of the modified double stranded polynucleotides.
- the modified double stranded polynucleotides are shorter than the template polynucleotide and so it is more straightforward to characterise them using strand sequencing.
- the modified double stranded polynucleotides may be of any length. The length is determined by the length of the template polynucleotide and the action of the MuA transposase which fragments the polynucleotide.
- the modified double stranded polynucleotide is less than about 5000 kb.
- the modified double stranded polynucleotides can be selectively labelled by including the labels in the MuA substrates. Labelling is selective in that only the modified double stranded polynucleotides produced by the MuA transposase are labelled.
- a label is an entity that enables sample identification, barcoding and/or tracking of the modified double stranded polynucleotide.
- Suitable labels include, but are not limited to, calibration sequences, coupling moieties and adaptor bound enzymes. Examples of coupling moieties include azide, DBCO, pyridyldithiol and maleimide.
- Calibration sequences include any sequence of a known composition.
- Adaptor bound enzymes include, for example, translocases, polymerases, helicases and other polynucleotide binding proteins.
- the method introduces into the double stranded polynucleotides modifications which facilitate their characterisation using strand sequencing. It is well-established that coupling a polynucleotide to the membrane containing the nanopore lowers by several orders of magnitude the amount of polynucleotide required to allow its characterisation or sequencing. This is discussed in International Application No. PCT/GB2012/051191 (published as WO 2012/164270).
- the method of the invention allows the production of a plurality of double stranded polynucleotides, each of which includes a means for coupling the polynucleotide to a membrane. This is discussed in more detail below.
- the characterisation of double stranded polynucleotides using a nanopore typically requires the presence of a leader sequence designed to preferentially thread into the nanopore.
- the method of the invention allows the production of a plurality of double stranded polynucleotides, each of which includes a single stranded leader sequence. This is discussed in more detail below.
- the method of the invention modifies a template double stranded polynucleotide, preferably for characterisation.
- the template polynucleotide is typically the polynucleotide that will ultimately be characterised, or sequenced, in accordance with the invention. It may also be called the target double stranded polynucleotide or the double stranded polynucleotide of interest.
- a polynucleotide, such as a nucleic acid, is a macromolecule comprising two or more nucleotides.
- the polynucleotide or nucleic acid may comprise any combination of any nucleotides.
- the nucleotides can be naturally occurring or artificial.
- One or more nucleotides in the template polynucleotide can be oxidized or methylated.
- One or more nucleotides in the template polynucleotide may be damaged.
- the polynucleotide may comprise a pyrimidine dimer. Such dimers are typically associated with damage by ultraviolet light and are the primary cause of skin melanomas.
- One or more nucleotides in the template polynucleotide may be modified, for instance with a label or a tag. Suitable labels are described below.
- the template polynucleotide may comprise one or more spacers.
- a nucleotide typically contains a nucleobase, a sugar and at least one phosphate group.
- the nucleobase and sugar form a nucleoside.
- the nucleobase is typically heterocyclic. Nucleobases include, but are not limited to, purines and pyrimidines and more specifically adenine (A), guanine (G), thymine (T), uracil (U) and cytosine (C).
- the sugar is typically a pentose sugar.
- Nucleotide sugars include, but are not limited to, ribose and deoxyribose.
- the sugar is preferably a deoxyribose.
- the template double stranded polynucleotide preferably comprises the following nucleosides: deoxyadenosine (dA), deoxyuridine (dU) and/or thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC).
- the nucleotide is typically a ribonucleotide or deoxyribonucleotide.
- the nucleotide is preferably a deoxyribonucleotide.
- the nucleotide typically contains a monophosphate, diphosphate or triphosphate. Phosphates may be attached on the 5′ or 3′ side of a nucleotide.
- Nucleotides include, but are not limited to, adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), 5-methylcytidine monophosphate, 5-hydroxymethylcytidine monophosphate, cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP) and deoxycytidine monophosphate (dCMP).
- the nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP, dCMP and dUMP.
- the nucleotides are most preferably selected from dAMP, dTMP, dGMP, dCMP and dUMP.
- the template double stranded polynucleotide preferably comprises the following nucleotides: dAMP, dUMP and/or dTMP, dGMP and dCMP.
- a nucleotide may be abasic (i.e. lack a nucleobase).
- a nucleotide may also lack a nucleobase and a sugar (i.e. is a C3 spacer).
- the nucleotides in the template polynucleotide may be attached to each other in any manner.
- the nucleotides are typically attached by their sugar and phosphate groups as in nucleic acids.
- the nucleotides may be connected via their nucleobases as in pyrimidine dimers.
- the template polynucleotide is double stranded.
- the template polynucleotide may contain some single stranded regions, but at least a portion of the template polynucleotide is double stranded.
- the template polynucleotide can be a nucleic acid, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA).
- the template polynucleotide can comprise one strand of RNA hybridised to one strand of DNA.
- the polynucleotide may be any synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA) or other synthetic polymers with nucleotide side chains.
- the template polynucleotide can be any length.
- the polynucleotide can be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400 or at least 500 nucleotide pairs in length.
- the polynucleotide can be 1000 or more nucleotide pairs, 5000 or more nucleotide pairs in length or 100000 or more nucleotide pairs in length.
- the template polynucleotide is typically present in any suitable sample.
- the invention is typically carried out on a sample that is known to contain or suspected to contain the template polynucleotide. Alternatively, the invention may be carried out on a sample to confirm the identity of one or more template polynucleotides whose presence in the sample is known or expected.
- the sample may be a biological sample.
- the invention may be carried out in vitro using at least one sample obtained from or extracted from any organism or microorganism.
- the organism or microorganism is typically archaeal, prokaryotic or eukaryotic and typically belongs to one of the five kingdoms: plantae, animalia, fungi, monera and protista.
- the invention may be carried out in vitro on at least one sample obtained from or extracted from any virus.
- the sample is preferably a fluid sample.
- the sample typically comprises a body fluid of the patient.
- the sample may be urine, lymph, saliva, mucus or amniotic fluid but is preferably blood, plasma or serum.
- the sample is typically human in origin, but alternatively it may be from another animal, such as commercially farmed animals (for example horses, cattle, sheep, fish, chickens or pigs) or pets such as cats or dogs.
- the sample may be of plant origin, such as a sample obtained from a commercial crop, such as a cereal, legume, fruit or vegetable, for example wheat, barley, oats, canola, maize, soya, rice, rhubarb, bananas, apples, tomatoes, potatoes, grapes, tobacco, beans, lentils, sugar cane, cocoa, broccoli or cotton.
- the sample may be a non-biological sample.
- the non-biological sample is preferably a fluid sample.
- Examples of non-biological samples include surgical fluids, water such as drinking water, sea water or river water, and reagents for laboratory tests.
- the sample is typically processed prior to being used in the invention, for example by centrifugation or by passage through a membrane that filters out unwanted molecules or cells, such as red blood cells.
- the sample may be measured immediately upon being taken.
- the sample may also be stored prior to assay, preferably below −70° C.
- the template polynucleotide is contacted with a MuA transposase.
- This contacting occurs under conditions which allow the transposase to function, e.g. to fragment the template polynucleotide and to ligate MuA substrates to one or both ends of the fragments.
- MuA transposase is commercially available, for instance from Thermo Scientific (Catalogue Number F-750C, 20 μL (1.1 μg/μL)).
- the MuA transposase may be a wild type MuA transposase or a modified MuA transposase. Conditions under which MuA transposase will function are known in the art. Examples of suitable conditions are described in the Examples.
- the template polynucleotide is contacted with a population of double stranded MuA substrates.
- the MuA substrates contain a known MuA recognition sequence. Incubation of the template polynucleotide and MuA substrates with MuA results in adaptor formation.
- the double stranded substrates are polynucleotide substrates and may be formed from any of the nucleotides or nucleic acids discussed above.
- the MuA substrates are typically formed from the same nucleotides as the template polynucleotide, except for the universal nucleotides or at least one nucleotide which comprises a nucleoside that is not present in the template polynucleotide.
- the population of substrates is typically homogenous (i.e. typically contains a plurality of identical substrates).
- the population of substrates may be heterogeneous (i.e. may contain a plurality of different substrates).
- Suitable substrates for a MuA transposase are known in the art (Saariaho and Savilahti, Nucleic Acids Research, 2006; 34(10): 3139-3149 and Lee and Harshey, J. Mol. Biol., 2001; 314: 433-444).
- Each substrate typically comprises a double stranded portion which provides its activity as a substrate for MuA transposase.
- the double stranded portion is typically the same in each substrate.
- the population of substrates may comprise different double stranded portions.
- the double stranded portion in each substrate is typically at least 50 nucleotide pairs in length, such as at least 55, at least 60 or at least 65 nucleotide pairs in length.
- the double stranded portion may have a length of up to 10 kb, such as 5 kb, 1 kb or 100 base pairs.
- the double stranded portion in each substrate preferably comprises a dinucleotide comprising deoxycytidine (dC) and deoxyadenosine (dA) at the 3′ end of each strand.
- the dC and dA are typically in different orientations in the two strands of the double stranded portion, i.e. one strand has dC/dA and the other strand has dA/dC at the 3′ end when reading from 5′ to 3′.
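The 3′ dinucleotide arrangement described above can be sketched as a simple check. This is a minimal illustration only: the function name and the example strands are hypothetical and are not taken from SEQ ID NO: 94, and both strands are assumed to be written 5′ to 3′ using one-letter DNA codes.

```python
def three_prime_dinucleotides_ok(strand_a: str, strand_b: str) -> bool:
    """Return True if one strand ends in CA and the other ends in AC.

    Both strands are assumed (for this sketch) to be written 5' to 3',
    so the last two letters of each string are its 3' dinucleotide.
    """
    ends = {strand_a.upper()[-2:], strand_b.upper()[-2:]}
    return ends == {"CA", "AC"}


# Hypothetical strands, used only to exercise the check.
print(three_prime_dinucleotides_ok("GGTTCA", "TTGGAC"))
print(three_prime_dinucleotides_ok("GGTTCA", "TTGGCA"))
```

The check only inspects the two 3′ termini; it says nothing about whether the strands are complementary or whether they contain a MuA recognition sequence.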
- One strand of the double stranded portion preferably comprises the sequence shown in SEQ ID NO: 94 and the other strand of the double stranded portion preferably comprises a sequence which is complementary to the sequence shown in SEQ ID NO: 94.
- Each substrate comprises an overhang at one or both ends of one strand, i.e. at least one overhang on one strand.
- the one strand in the double stranded substrate having an overhang at one or both ends is also called the one substrate strand.
- constructs comprising a fragment of the template polynucleotide and one or more MuA substrates are formed.
- a translocase that moves in the 5′ to 3′ direction may be used to remove the MuA transposases from the constructs.
- a translocase that moves in either direction, i.e. from 5′ to 3′ or from 3′ to 5′, may be used to remove the MuA transposases from the constructs.
- Each substrate preferably comprises a double stranded portion which comprises the sequence shown in SEQ ID NO: 94 hybridised to a sequence which is complementary to the sequence shown in SEQ ID NO: 94.
- the one overhang is preferably at the 5′ end of the sequence which is complementary to the sequence shown in SEQ ID NO: 94.
- the sequence complementary to the sequence shown in SEQ ID NO: 94 may have overhangs at both ends.
- the sequence complementary to the sequence shown in SEQ ID NO: 94 is the one substrate strand.
- the overhang may be at least 3, at least 4, at least 5, at least 6 or at least 7 nucleotides in length.
- the overhang may have a length of up to about 200 nucleotides, such as about 100, 50, 25 or 10 nucleotides.
- the overhang is preferably 5 nucleotides in length.
- the overhang may comprise any of the nucleotides discussed above.
- the translocase will remove both the MuA transposase and the one substrate strand, i.e. the substrate strand with the overhang. If the overhang at the 5′ end of the one substrate strand is closed after formation of the constructs, the translocase will remove only the MuA transposase.
- Closure of the overhang occurs for example where the 5′ end of the overhang is ligated to the adjacent 3′ end of a strand of the template polynucleotide fragment.
- each substrate comprises an overhang at both ends of one strand and the overhang at the 5′ end is formed from universal nucleotides.
- the overhang preferably consists of universal nucleotides. This allows the overhang to be closed after formation of the constructs.
- Each substrate preferably comprises a double stranded portion which comprises the sequence shown in SEQ ID NO: 94 hybridised to a sequence which is complementary to the sequence shown in SEQ ID NO: 94.
- the overhang formed from universal nucleotides is at the 5′ end of the sequence which is complementary to the sequence shown in SEQ ID NO: 94.
- the overhangs may be at least 3, at least 4, at least 5, at least 6 or at least 7 nucleotides in length.
- the overhangs are preferably 5 nucleotides in length.
- a universal nucleotide is one which will hybridise to some degree to all of the nucleotides in the template polynucleotide.
- a universal nucleotide is preferably one which will hybridise to some degree to nucleotides comprising the nucleosides adenosine (A), thymine (T), uracil (U), guanine (G) and cytosine (C).
- the universal nucleotide may hybridise more strongly to some nucleotides than to others.
- the universal nucleotide used in the oligomers hybridises to all of the nucleotides in the template polynucleotide.
- the universal nucleotide preferably comprises one of the following nucleobases: hypoxanthine, 4-nitroindole, 5-nitroindole, 6-nitroindole, 3-nitropyrrole, nitroimidazole, 4-nitropyrazole, 4-nitrobenzimidazole, 5-nitroindazole, 4-aminobenzimidazole or phenyl (C6-aromatic ring).
- the universal nucleotide more preferably comprises one of the following nucleosides: 2′-deoxyinosine, inosine, 7-deaza-2′-deoxyinosine, 7-deaza-inosine, 2-aza-deoxyinosine, 2-aza-inosine, 4-nitroindole 2′-deoxyribonucleoside, 4-nitroindole ribonucleoside, 5-nitroindole 2′-deoxyribonucleoside, 5-nitroindole ribonucleoside, 6-nitroindole 2′-deoxyribonucleoside, 6-nitroindole ribonucleoside, 3-nitropyrrole 2′-deoxyribonucleoside, 3-nitropyrrole ribonucleoside, an acyclic sugar analogue of hypoxanthine, nitroimidazole 2′-deoxyribonucleoside, nitroimidazole ribonucleoside, 4-nitropyrazole 2′
- the universal nucleotides in each overhang may be different from one another.
- the universal nucleotides in each overhang are preferably the same. All of the universal nucleotides in the population of substrates are preferably the same universal nucleotide.
- the method of the invention preferably comprises
- the overhang(s) of universal nucleotides may further comprise a reactive group, preferably at the 5′ end.
- the reactive group may be used to ligate the overhangs to the fragments in the constructs as discussed below.
- the reactive group may be used to ligate the fragments to the overhangs using click chemistry.
- Click chemistry is a term first introduced by Kolb et al. in 2001 to describe an expanding set of powerful, selective, and modular building blocks that work reliably in both small- and large-scale applications (Kolb H C, Finn, MG, Sharpless K B, Click chemistry: diverse chemical function from a few good reactions, Angew. Chem. Int. Ed. 40 (2001) 2004-2021).
- “the reaction must be modular, wide in scope, give very high yields, generate only inoffensive by-products that can be removed by nonchromatographic methods, and be stereospecific (but not necessarily enantioselective).
- the required process characteristics include simple reaction conditions (ideally, the process should be insensitive to oxygen and water), readily available starting materials and reagents, the use of no solvent or a solvent that is benign (such as water) or easily removed, and simple product isolation. Purification if required must be by nonchromatographic methods, such as crystallization or distillation, and the product must be stable under physiological conditions”.
- Suitable examples of click chemistry include, but are not limited to, the following:
- the reactive group may be one that is suitable for click chemistry.
- the reactive group may be any of those disclosed in International Application No. PCT/GB10/000132 (published as WO 2010/086602), particularly in Table 4 of that application.
- the modification method uses a MuA transposase and a population of MuA substrates each comprising at least one overhang comprising a reactive group.
- the overhang(s) may be any length and may comprise any combination of any nucleotide(s). Suitable lengths and nucleotides are disclosed above. Suitable reactive groups are discussed above. Accordingly, the invention provides a method for modifying a template double stranded polynucleotide, comprising:
- each substrate comprises (i) an overhang at both ends of one strand and (ii) at least one nucleotide 10 nucleotides or fewer from the overhang at the 5′ end of the one strand which comprises a nucleoside that is not present in the template polynucleotide.
- the nucleotide that is not present in the template polynucleotide is typically a non-natural nucleotide where the template polynucleotide comprises only natural nucleotides.
- the double stranded portion in each substrate preferably comprises a dinucleotide comprising deoxycytidine (dC) and deoxyadenosine (dA) at the 3′ end of each strand and a dinucleotide comprising thymidine (dT) and deoxyguanosine (dG) at the 5′ end of each strand.
- one or both of the nucleotides in the dT and dG dinucleotide of the one substrate strand may be replaced with a nucleotide comprising a nucleoside that is not present in the template polynucleotide as discussed below.
- the template polynucleotide comprises deoxyadenosine (dA), thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC), but not deoxyuridine (dU) and the dT in the dT and dG dinucleotide of the one substrate strand is replaced with a nucleotide comprising deoxyuridine (dU).
- the double stranded portion preferably comprises the sequence shown in SEQ ID NO: 94 and a sequence which is complementary to the sequence shown in SEQ ID NO: 94 and which is modified to include at least one nucleotide that is not present in the template polynucleotide.
- the sequence complementary to SEQ ID NO: 94 further comprises the overhang, i.e. is the one substrate strand.
- the double stranded portion comprises the sequence shown in SEQ ID NO: 94 and the sequence shown in SEQ ID NO: 95 (see below).
- In SEQ ID NO: 95, the dT in the dT and dG dinucleotide at the 5′ end has been replaced with dU.
- This double stranded portion may be used when the template polynucleotide comprises deoxyadenosine (dA), thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC), but not deoxyuridine (dU).
- the overhangs may be at least 3, at least 4, at least 5, at least 6 or at least 7 nucleotides in length.
- the overhangs are preferably 5 nucleotides in length.
- the overhangs may comprise any of the nucleotides discussed above.
- Each substrate comprises at least one nucleotide in the one substrate strand which is 10 nucleotides or fewer from the overhang at the 5′ end and which comprises a nucleoside that is not present in the template polynucleotide.
- Each substrate may comprise any number of nucleotides which comprise a nucleoside that is not present in the template polynucleotide, such as 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. If a substrate comprises more than one nucleotide that is not present in the template polynucleotide, those nucleotides are typically the same, but may be different.
- the template polynucleotide comprises deoxyadenosine (dA), thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC) but not deoxyuridine (dU)
- the nucleoside that is not present in the template polynucleotide is preferably deoxyuridine (dU).
- one strand of the double stranded portion comprises the sequence shown in SEQ ID NO: 94 and the other strand of the double stranded portion comprises the sequence shown in SEQ ID NO: 95 (see above).
- In SEQ ID NO: 95, the dT in the dT and dG dinucleotide at the 5′ end has been replaced with dU.
- the overhang at the 5′ end of SEQ ID NO: 95 is attached to the U.
- each substrate comprises the sequence shown in SEQ ID NO: 94 and the sequence shown in SEQ ID NO: 96 (see below).
- This substrate may be used when the template polynucleotide comprises deoxyadenosine (dA), thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC), but not deoxyuridine (dU).
- Each substrate also comprises an overhang at the 3′ end of the sequence shown in SEQ ID NO: 96.
- the template polynucleotide comprises deoxyadenosine (dA), deoxyuridine (dU), deoxyguanosine (dG) and deoxycytidine (dC) but not thymidine (dT)
- the nucleoside that is not present in the template polynucleotide is preferably thymidine (dT).
- the nucleoside that is not present in the template polynucleotide is preferably abasic, adenosine (A), uridine (U), 5-methyluridine (m5U), cytidine (C) or guanosine (G) or preferably comprises urea, 5,6-dihydroxythymine, thymine glycol, 5-hydroxy-5-methylhydantoin, uracil glycol, 6-hydroxy-5,6-dihydrothymine, methyltartronylurea, 7,8-dihydro-8-oxoguanine (8-oxoguanine), 8-oxoadenine, fapy-guanine, methyl-fapy-guanine, fapy-adenine, aflatoxin B1-fapy-guanine, 5-hydroxy-cytosine, 5-hydroxy-uracil, 3-methyladenine, 7-methylguanine, 1,N6-ethenoadenine, hypoxanthin
- the at least one nucleotide is 10 nucleotides or fewer from the overhang at the 5′ end, such as 9, 8, 7, 6, 5, 4, 3, 2, 1 or 0 nucleotides from the overhang. In other words, the at least one nucleotide is preferably at any of positions A to K in the Example below.
- the at least one nucleotide is preferably 0 nucleotides from the overhang (i.e. is adjacent to the overhang). In other words, the at least one nucleotide is preferably at position K in the Example below.
- the at least one nucleotide may be the first nucleotide in the overhang.
- the at least one nucleotide may be at position A in the Example below.
- nucleotides in the overhang may comprise a nucleoside that is not present in the template polynucleotide.
- a person skilled in the art is capable of designing suitable substrates.
- the method of the invention preferably comprises
- the method comprises ligating the overhangs to the fragments in the constructs. This may be done using any method of ligating nucleotides known in the art. For instance, it may be done using a ligase, such as a DNA ligase. Alternatively, if the overhangs comprise a reactive group, the reactive group may be used to ligate the overhangs to the fragments in the constructs.
- a nucleotide comprising a complementary reactive group may be attached to the fragments and the two reactive groups may be reacted together to ligate the overhangs to the fragments.
- Click chemistry may be used as discussed above.
- nucleotide(s) which comprise(s) a nucleoside that is not present in the template polynucleotide from the ligated constructs.
- Nucleotides are selectively removed if they are removed (or excised) from the ligated constructs, but the other nucleotides in the ligated constructs (i.e. those comprising different nucleosides) are not removed (or excised).
- Nucleotides comprising deoxyuridine may be selectively removed using Uracil-Specific Excision Reagent (USER®), which is a mixture of Uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase Endonuclease VIII.
- the gaps can be repaired using a polymerase and a ligase, such as DNA polymerase and a DNA ligase.
- the gaps can be repaired using random oligonucleotides of sufficient length to bridge the gaps and a ligase.
- Any translocase that is capable of removing the MuA transposase may be used in the invention. Removal may occur, for example, as a result of the unwinding of double stranded polynucleotide by a translocase.
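The selective removal step described above can be pictured with a small string model. This is a toy sketch under stated assumptions, not a model of the enzymology: the function name is hypothetical, strands are written 5′ to 3′ in one-letter codes, and "U" stands for a nucleotide comprising deoxyuridine that is absent from the template.

```python
from typing import List


def excise_nucleotide(strand: str, target: str = "U") -> List[str]:
    """Drop every occurrence of `target` and return the remaining segments.

    Each removed nucleotide leaves a single nucleotide gap, so the strand is
    returned as its surviving segments in 5' to 3' order. All other
    nucleotides are left untouched, which is what makes the removal selective.
    """
    return [segment for segment in strand.split(target) if segment]


# Hypothetical strand with dU at the 5' end and at one internal position.
print(excise_nucleotide("UGAAGCUGGC"))
```

A strand containing no target nucleotide comes back as a single unbroken segment, reflecting that nucleotides comprising other nucleosides are not excised.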
- the translocase is preferably a helicase.
- Suitable helicases are well-known in the art (M. E. Fairman-Williams et al., Curr. Opin. Struct. Biol., 2010, 20 (3), 313-324; T. M. Lohman et al., Nature Reviews Molecular Cell Biology, 2008, 9, 391-401).
- the helicase is preferably a member of superfamily 1 or superfamily 2.
- the helicase is more preferably a member of one of the following families: Pif1-like, Upf1-like, UvrD/Rep, Ski-like, Rad3/XPD, NS3/NPH-II, DEAD, DEAH/RHA, RecG-like, RecQ-like, T1R-like, Swi/Snf-like and Rig-I-like.
- the first three of those families are in superfamily 1 and the second ten families are in superfamily 2.
- the helicase is more preferably a member of one of the following subfamilies: RecD, Upf1 (RNA), PcrA, Rep, UvrD, Hel308, Mtr4 (RNA), XPD, NS3 (RNA), Mss116 (RNA), Prp43 (RNA), RecG, RecQ, T1R, RapA and Hef (RNA).
- the first five of those subfamilies are in superfamily 1 and the second eleven subfamilies are in superfamily 2.
- Members of the Upf1, Mtr4, NS3, Mss116, Prp43 and Hef subfamilies are RNA helicases.
- Members of the remaining subfamilies are DNA helicases.
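The subfamily groupings listed above can be captured as a small lookup. The sketch below simply restates the lists in the text; the constant and function names are illustrative assumptions.

```python
# Subfamilies as listed in the text: the first five are in superfamily 1,
# the remaining eleven are in superfamily 2.
SF1_SUBFAMILIES = ("RecD", "Upf1", "PcrA", "Rep", "UvrD")
SF2_SUBFAMILIES = (
    "Hel308", "Mtr4", "XPD", "NS3", "Mss116", "Prp43",
    "RecG", "RecQ", "T1R", "RapA", "Hef",
)
# Subfamilies whose members are described as RNA helicases.
RNA_SUBFAMILIES = {"Upf1", "Mtr4", "NS3", "Mss116", "Prp43", "Hef"}


def classify_subfamily(name: str):
    """Return (superfamily number, 'RNA' or 'DNA') for a listed subfamily."""
    if name in SF1_SUBFAMILIES:
        superfamily = 1
    elif name in SF2_SUBFAMILIES:
        superfamily = 2
    else:
        raise KeyError(f"subfamily not in the listed set: {name}")
    substrate = "RNA" if name in RNA_SUBFAMILIES else "DNA"
    return superfamily, substrate


print(classify_subfamily("Hel308"))
print(classify_subfamily("Upf1"))
```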
- the helicase may be Srs2.
- the helicase may be RecBCD.
- the helicase is preferably a Hel308 helicase. Any Hel308 helicase may be used in accordance with the invention. Hel308 helicases are also known as ski2-like helicases and the two terms can be used interchangeably. Suitable Hel308 helicases are disclosed in Table 4 of International Application No. PCT/GB2012/052579 (published as WO 2013/057495).
- the Hel308 helicase typically comprises the amino acid motif Q-X1-X2-G-R-A-G-R (hereinafter called the Hel308 motif; SEQ ID NO: 8).
- the Hel308 motif is typically part of the helicase motif VI (Tuteja and Tuteja, Eur. J. Biochem. 271, 1849-1863 (2004)).
- X1 may be C, M or L.
- X1 is preferably C.
- X2 may be any amino acid residue.
- X2 is typically a hydrophobic or neutral residue.
- X2 may be A, F, M, C, V, L, I, S, T, P or R.
- X2 is preferably A, F, M, C, V, L, I, S, T or P.
- X2 is more preferably A, M or L.
- X2 is most preferably A or M.
- the Hel308 helicase preferably comprises the motif Q-X1-X2-G-R-A-G-R-P (hereinafter called the extended Hel308 motif; SEQ ID NO: 9) wherein X1 and X2 are as described above.
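The Hel308 motif and extended Hel308 motif defined above can be expressed as regular expressions over one-letter amino acid codes. This is a minimal sketch: the pattern names, the helper function and the test sequence are hypothetical, X1 is restricted to C, M or L, and X2 is assumed to be any of the 20 standard residues.

```python
import re

STANDARD_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"

# Q-X1-X2-G-R-A-G-R with X1 = C, M or L and X2 = any standard residue.
HEL308_MOTIF = re.compile(rf"Q[CML][{STANDARD_RESIDUES}]GRAGR")
# Q-X1-X2-G-R-A-G-R-P (the extended motif).
EXTENDED_HEL308_MOTIF = re.compile(rf"Q[CML][{STANDARD_RESIDUES}]GRAGRP")


def find_hel308_motif(protein: str, extended: bool = False):
    """Return (0-based start index, matched motif) or None if absent."""
    pattern = EXTENDED_HEL308_MOTIF if extended else HEL308_MOTIF
    match = pattern.search(protein.upper())
    return None if match is None else (match.start(), match.group())


# Hypothetical peptide used only to exercise the patterns.
print(find_hel308_motif("MKKQMAGRAGRPLL"))
print(find_hel308_motif("MKKQMAGRAGRPLL", extended=True))
```

Narrowing the X2 character class (for example to A or M) would select only the more preferred forms of the motif.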
- the most preferred Hel308 helicases, Hel308 motifs and extended Hel308 motifs are shown in Table 1 below.
- the most preferred Hel308 motif is shown in SEQ ID NO: 17.
- the most preferred extended Hel308 motif is shown in SEQ ID NO: 18.
- the Hel308 helicase preferably comprises the sequence of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 or a variant thereof.
- a variant of a Hel308 helicase is an enzyme that has an amino acid sequence which varies from that of the wild-type helicase and which retains polynucleotide binding activity.
- a variant of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 and which retains polynucleotide binding activity.
- Polynucleotide binding activity can be determined using methods known in the art. Suitable methods include, but are not limited to, fluorescence anisotropy, tryptophan fluorescence and electrophoretic mobility shift assay (EMSA). For instance, the ability of a variant to bind a single stranded polynucleotide can be determined as described in the Examples.
- the variant retains helicase activity. This can be measured in various ways. For instance, the ability of the variant to translocate along a polynucleotide can be measured using electrophysiology, a fluorescence assay or ATP hydrolysis.
- the variant may include modifications that facilitate handling of the polynucleotide encoding the helicase and/or facilitate its activity at high salt concentrations and/or room temperature.
- Variants typically differ from the wild-type helicase in regions outside of the Hel308 motif or extended Hel308 motif discussed above. However, variants may include modifications within these motif(s).
- a variant will preferably be at least 30% homologous to that sequence based on amino acid identity.
- the variant polypeptide may be at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 over the entire sequence.
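The identity thresholds above can be made concrete with a short calculation. This sketch assumes the two sequences have already been aligned by some external tool and padded with "-" for gaps; the function name and the toy sequences are hypothetical, and the choice to divide by the ungapped length of the reference is one possible reading of "over the entire sequence".

```python
def percent_identity(aligned_reference: str, aligned_variant: str) -> float:
    """Percentage of reference residues matched by the aligned variant.

    Both inputs must be the same length (alignment columns). Gap characters
    ('-') in the reference are not counted towards its length.
    """
    if len(aligned_reference) != len(aligned_variant):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for r in aligned_reference if r != "-")
    if reference_length == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1
        for r, v in zip(aligned_reference, aligned_variant)
        if r != "-" and r == v
    )
    return 100.0 * matches / reference_length


# Toy ten-residue alignment with a single substitution.
print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

A variant would then be compared against a chosen threshold, for instance `percent_identity(ref, var) >= 95.0`.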
- the variant may differ from the wild-type sequence in any of the ways discussed below with reference to SEQ ID NOs: 2 and 4.
- a variant of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 preferably comprises the Hel308 motif or extended Hel308 motif of the wild-type sequence as shown in Table 1 above.
- a variant may comprise the Hel308 motif or extended Hel308 motif from a different wild-type sequence.
- a variant of SEQ ID NO: 10 may comprise the Hel308 motif or extended Hel308 motif from SEQ ID NO: 13 (i.e. SEQ ID NO: 14 or 15).
- Variants of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 may also include modifications within the Hel308 motif or extended Hel308 motif of the relevant wild-type sequence. Suitable modifications at X1 and X2 are discussed above when defining the two motifs.
- a variant of SEQ ID NO: 10 may lack the first 19 amino acids of SEQ ID NO: 10 and/or lack the last 33 amino acids of SEQ ID NO: 10.
- a variant of SEQ ID NO: 10 preferably comprises a sequence which is at least 70%, at least 75%, at least 80%, at least 85%, at least 90% or more preferably at least 95%, at least 97% or at least 99% homologous based on amino acid identity with amino acids 20 to 211 or 20 to 727 of SEQ ID NO: 10.
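The residue ranges quoted above use 1-based, inclusive numbering, which is easy to get wrong when slicing a sequence in code. The sketch below shows the conversion; the function name is hypothetical and the sequence is a placeholder of 760 residues (the length implied if positions 20 to 727 correspond to removing the first 19 and the last 33 amino acids), not SEQ ID NO: 10 itself.

```python
def subsequence(sequence: str, first: int, last: int) -> str:
    """Return residues `first` to `last` using 1-based, inclusive positions."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("positions out of range")
    return sequence[first - 1:last]


# Placeholder sequence only; real residues are not reproduced here.
placeholder = "A" * 760

core = subsequence(placeholder, 20, 727)
print(len(core))  # residues remaining after dropping 19 N-terminal and 33 C-terminal positions
```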
- the Hel308 helicase may be modified as described in International Application No. PCT/GB2013/051925 (published as WO 2014/013260).
- two or more parts on the helicase may be connected to reduce the size of the opening in the polynucleotide binding domain through which a polynucleotide can unbind from the helicase, wherein the helicase retains its ability to control the movement of the polynucleotide.
- the polynucleotide binding domain and opening can be found between domain 2 (one of the ATPase domains) and domain 4 (the ratchet domain) and between domain 2 and domain 5 (the molecular brake).
- the two or more parts connected in accordance with the invention are preferably (a) any amino acid in domain 2 and any amino acid in domain 4 or (b) any amino acid in domain 2 and any amino acid in domain 5.
- the amino acid residues which define domains 2, 4 and 5 in various Hel308 helicases are listed in Table 2 below.
- the Hel308 helicase preferably comprises the sequence of Hel308 Mbu (i.e. SEQ ID NO: 10) or a variant thereof.
- the polynucleotide domain and opening can be found between domain 2 (one of the ATPase domains) and domain 4 (the ratchet domain) and domain 2 and domain 5 (the molecular brake).
- the two or more parts of Hel308 Mbu connected are preferably (a) any amino acid in domain 2 and any amino acid in domain 4 or (b) any amino acid in domain 2 and any amino acid in domain 5.
- the amino acid residues which define domains 2, 4 and 5 for Hel308 Mbu are listed in Table 2 above.
- the two or more parts of Hel308 Mbu connected are preferably amino acids 284 and 615 in SEQ ID NO: 10. These amino acids are preferably substituted with cysteine (i.e. E284C and S615C) such that they can be connected by cysteine linkage.
- the invention may use a mutant Hel308 Mbu protein which comprises a variant of SEQ ID NO: 10 in which E284 and S615 are modified.
- E284 and S615 are preferably substituted.
- E284 and S615 are more preferably substituted with cysteine (i.e. E284C and S615C).
- the variant may differ from SEQ ID NO: 10 at positions other than E284 and S615 as long as E284 and S615 are modified.
- the variant will preferably be at least 30% homologous to SEQ ID NO: based on amino acid identity as discussed in more detail below.
- E284 and S615 do not have to be connected. Alternatively, E284 and S615 may be connected.
- the Hel308 helicase more preferably comprises (a) the sequence of Hel308 Tga (i.e. SEQ ID NO: 33) or a variant thereof, (b) the sequence of Hel308 Csy (i.e. SEQ ID NO: 22) or a variant thereof or (c) the sequence of Hel308 Mhu (i.e. SEQ ID NO: 52) or a variant thereof.
- SEQ ID NO: 10 (Hel308 Mbu) contains five natural cysteine residues. However, all of these residues are located within or around the DNA binding groove of the enzyme. Once a DNA strand is bound within the enzyme, these natural cysteine residues become less accessible for external modifications. This allows specific cysteine mutants of SEQ ID NO: 10 to be designed and attached to the moiety using cysteine linkage as discussed above.
- Preferred variants of SEQ ID NO: 10 have one or more of the following substitutions: A29C, Q221C, Q442C, T569C, A577C, A700C and S708C. The introduction of a cysteine residue at one or more of these positions facilitates cysteine linkage as discussed above.
- SEQ ID NO: have one or more of the following substitutions: M2Faz, R10Faz, F15Faz, A29Faz, R185Faz, A268Faz, E284Faz, Y387Faz, F400Faz, Y455Faz, E464Faz, E573Faz, A577Faz, E649Faz, A700Faz, Y720Faz, Q442Faz and S708Faz.
- the introduction of a Faz residue at one or more of these positions facilitates Faz linkage as discussed above.
- the Hel308 helicase is modified by the introduction of one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724 in Hel308 Mbu (SEQ ID NO: 10), wherein the helicase retains its ability to control the movement of a polynucleotide.
- the one or more cysteine residues and/or one or more non-natural amino acids are preferably introduced by substitution.
- the helicase may bind to a polynucleotide via internal nucleotides or at one of its termini.
- These modifications decrease the ability of the polynucleotide to unbind or disengage from the helicase, particularly from internal nucleotides of the polynucleotide.
- the one or more modifications increase the processivity of the Hel308 helicase by preventing dissociation from the polynucleotide strand.
- the thermal stability of the enzyme is also increased by the one or more modifications giving it an improved structural stability that is beneficial in Strand Sequencing.
- the modified Hel308 helicases of the invention have all of the advantages and uses discussed above.
- the modified Hel308 helicase has the ability to control the movement of a polynucleotide. This can be measured as discussed above.
- the modified Hel308 helicase is artificial or non-natural.
- the Hel308 helicase preferably comprises a variant of one of the helicases shown in Table 1 above which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724 in Hel308 Mbu (SEQ ID NO: 10).
- the Hel308 helicase preferably comprises a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724 in Hel308 Mbu (SEQ ID NO: 10).
- the Hel308 helicase preferably comprises a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D274, E284, E285, E287, S288, T289, G290, E291, N316, K319, S615, K717 or Y720 in Hel308 Mbu (SEQ ID NO: 10).
- Table 3a and 3b below show the positions in other Hel308 helicases which correspond to D274, E284, E285, S288, S615, K717, Y720, E287, T289, G290, E291, N316 and K319 in Hel308 Mbu (SEQ ID NO: 10).
- E283 corresponds to D274 in Hel308 Mbu
- E293 corresponds to E284 in Hel308 Mbu
- I294 corresponds to E285 in Hel308 Mbu
- V297 corresponds to S288 in Hel308 Mbu
- D671 corresponds to S615 in Hel308 Mbu
- K775 corresponds to K717 in Hel308 Mbu
- E778 corresponds to Y720 in Hel308 Mbu.
- the lack of a corresponding position in another Hel308 helicase is marked as a “-”.
- the Hel308 helicase more preferably comprises a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10).
- the relevant positions are shown in columns A to G in Table 3a above.
- the helicase may comprise a cysteine residue at one, two, three, four, five, six or seven of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of these positions may be substituted with cysteine.
- the helicase of the invention may comprise a cysteine at any of the following combinations of the positions labelled A to G in that row: {A}, {B}, {C}, {D}, {G}, {E}, {F}, {A and B}, {A and C}, {A and D}, {A and G}, {A and E}, {A and F}, {B and C}, {B and D}, {B and G}, {B and E}, {B and F}, {C and D}, {C and G}, {C and E}, {C and F}, {D and G}, {D and E}, {D and F}, {G and E}, {G and F}, {E and F}, {A, B and C}, {A, B and D}, {A, B and G}, {E and F}, {A, B and
- the helicase may comprise a non-natural amino acid, such as Faz, at one, two, three, four, five, six or seven of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of these positions may be substituted with a non-natural amino acid, such as Faz.
- the helicase of the invention may comprise a non-natural amino acid, such as Faz, at any of the combinations of the positions labelled A to G above.
- the helicase may comprise a combination of one or more cysteines and one or more non-natural amino acids, such as Faz, at two or more of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of one or more cysteine residues and one or more non-natural amino acids, such as Faz, may be present at the relevant positions. For instance, for each row of Table 3a and 3b above, the helicase of the invention may comprise one or more cysteines and one or more non-natural amino acids, such as Faz, at any of the combinations of the positions labelled A to G above.
- the Hel308 helicase more preferably comprises a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D274, E284, E285, S288 and S615 in Hel308 Mbu (SEQ ID NO: 10).
- the relevant positions are shown in columns A to E in Table 3a above.
- the helicase may comprise a cysteine residue at one, two, three, four or five, six or seven of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of these positions may be substituted with cysteine.
- the helicase of the invention may comprise a cysteine at any of the following combinations of the positions labelled A to E in that row: {A}, {B}, {C}, {D}, {E}, {A and B}, {A and C}, {A and D}, {A and E}, {B and C}, {B and D}, {B and E}, {C and D}, {C and E}, {D and E}, {A, B and C}, {A, B and D}, {A, B and E}, {A, C and D}, {A, C and E}, {A, D and E}, {B, C and D}, {B, C and E}, {B, D and E}, {C, D and E}, {A, B, C and D}, {A, B, C and D}, {B, D and E}, {C, D and
- the helicase may comprise a non-natural amino acid, such as Faz, at one, two, three, four or five of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of these positions may be substituted with a non-natural amino acid, such as Faz.
- the helicase of the invention may comprise a non-natural amino acid, such as Faz, at any of the combinations of the positions labelled A to E above.
- the helicase may comprise a combination of one or more cysteines and one or more non-natural amino acids, such as Faz, at two or more of the positions which correspond to D274, E284, E285, S288 and S615 in Hel308 Mbu (SEQ ID NO: 10). Any combination of one or more cysteine residues and one or more non-natural amino acids, such as Faz, may be present at the relevant positions.
- the helicase of the invention may comprise one or more cysteines and one or more non-natural amino acids, such as Faz, at any of the combinations of the positions labelled A to E above.
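- The combinations of labelled positions referred to above (A to E, or A to G) are the non-empty subsets of the column labels. A minimal sketch of how they could be enumerated is given below; the helper name `position_combinations` is an assumption introduced for illustration, and the single-letter labels stand in for the table columns.

```python
from itertools import combinations


def position_combinations(labels):
    """Yield every non-empty subset of the labelled positions, smallest first."""
    for size in range(1, len(labels) + 1):
        for combo in combinations(labels, size):
            yield combo


a_to_e = list(position_combinations("ABCDE"))
a_to_g = list(position_combinations("ABCDEFG"))

print(len(a_to_e))  # 31 combinations of five positions
print(len(a_to_g))  # 127 combinations of seven positions
print(a_to_e[:6])   # the five single positions, then ('A', 'B')
```

- For n labelled positions the count is 2^n - 1, which is why five columns give 31 combinations and seven columns give 127.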
- the Hel308 helicase preferably comprises a variant of the sequence of Hel308 Mbu (i.e. SEQ ID NO: 10) which comprises one or more cysteine residues and/or one or more non-natural amino acids at D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724.
- the variant preferably comprises D272C, N273C, D274C, G281C, E284C, E285C, E287C, S288C, T289C, G290C, E291C, D293C, T294C, N300C, R303C, K304C, N314C, S315C, N316C, H317C, R318C, K319C, L320C, E322C, R326C, N328C, S615C, K717C, Y720C, N721C or S724C.
- the variant preferably comprises D272Faz, N273Faz, D274Faz, G281Faz, E284Faz, E285Faz, E287Faz, S288Faz, T289Faz, G290Faz, E291Faz, D293Faz, T294Faz, N300Faz, R303Faz, K304Faz, N314Faz, S315Faz, N316Faz, H317Faz, R318Faz, K319Faz, L320Faz, E322Faz, R326Faz, N328Faz, S615Faz, K717Faz, Y720Faz, N721Faz or S724Faz.
- the Hel308 helicase preferably comprises a variant of the sequence of Hel308 Mbu (i.e. SEQ ID NO: 10) which comprises one or more cysteine residues and/or one or more non-natural amino acids at D274, E284, E285, S288, S615, K717 and Y720.
- the helicase of the invention may comprise one or more cysteines, one or more non-natural amino acids, such as Faz, or a combination thereof at any of the combinations of the positions labelled A to G above.
- the Hel308 helicase preferably comprises a variant of the sequence of Hel308 Mbu (i.e. SEQ ID NO: 10) which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of D274, E284, E285, S288 and S615.
- the helicase of the invention may comprise a cysteine or a non-natural amino acid, such as Faz, at any of the following combinations of positions: {D274}, {E284}, {E285}, {S288}, {S615}, {D274 and E284}, {D274 and E285}, {D274 and S288}, {D274 and S615}, {E284 and E285}, {E284 and S288}, {E284 and S615}, {E285 and S288}, {E285 and S615}, {S288 and S615}, {D274, E284 and E285}, {D274, E284 and S288}, {D274, E284 and S615}, {D274, E285 and S288}, {D274, E284 and S615}, {D274, E285 and
- the helicase preferably comprises a variant of SEQ ID NO: 10 which comprises (a) E284C and S615C, (b) E284Faz and S615Faz, (c) E284C and S615Faz or (d) E284Faz and S615C.
- the helicase more preferably comprises the sequence shown in SEQ ID NO: 10 with E284C and S615C.
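- The substitution notation used above (wild-type residue, position, replacement, as in E284C or S615Faz) can be applied to a sequence mechanically. The following is a hedged sketch: the name `apply_substitutions`, the list-of-residues representation (chosen so that a multi-letter label such as Faz fits in one position) and the four-residue toy sequence are assumptions for illustration, not part of the described method.

```python
import re

# Wild-type one-letter code, 1-based position, replacement label (e.g. C or Faz).
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Za-z]+)$")


def apply_substitutions(residues, mutations):
    """Return a copy of `residues` with each substitution applied.

    `residues` is a list of residue labels; index 0 holds position 1.
    A ValueError is raised if the notation is malformed, the position is
    out of range, or the stated wild-type residue is not found there.
    """
    mutated = list(residues)
    for mutation in mutations:
        match = MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        wild_type, replacement = match.group(1), match.group(3)
        position = int(match.group(2))
        if not 1 <= position <= len(mutated):
            raise ValueError(f"position out of range: {mutation}")
        if mutated[position - 1] != wild_type:
            raise ValueError(f"wild-type residue mismatch: {mutation}")
        mutated[position - 1] = replacement
    return mutated


# Toy sequence (hypothetical); a real use would load the full reference sequence.
toy = list("MESA")
print(apply_substitutions(toy, ["E2C", "S3Faz"]))  # ['M', 'C', 'Faz', 'A']
```

- Checking the stated wild-type residue before substituting guards against numbering errors, for example when positions are carried over from one helicase to a corresponding position in another.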
- Preferred non-natural amino acids for use in the invention include, but are not limited to, 4-Azido-L-phenylalanine (Faz), 4-Acetyl-L-phenylalanine, 3-Acetyl-L-phenylalanine, 4-Acetoacetyl-L-phenylalanine, O-Allyl-L-tyrosine, 3-(Phenylselanyl)-L-alanine, O-2-Propyn-1-yl-L-tyrosine, 4-(Dihydroxyboryl)-L-phenylalanine, 4-[(Ethylsulfanyl)carbonyl]-L-phenylalanine, (2S)-2-amino-3-{4-[(propan-2-ylsulfanyl)carbonyl]phenyl}propanoic acid, (2S)-2-amino-3-{4-[(2-amino-3-sulfanylpropanoyl)amino
- the most preferred non-natural amino acid is 4-azido-L-phenylalanine (Faz).
- a variant of a Hel308 helicase is an enzyme that has an amino acid sequence which varies from that of the wild-type helicase and which retains polynucleotide binding activity.
- a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 may comprise additional modifications as long as it comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724 in
- a variant may comprise the mutations in domain 5 disclosed in Woodman et al. (J. Mol. Biol. (2007) 374, 1139-1144). These mutations correspond to R685A, R687A and R689A in SEQ ID NO: 10.
- connection can be transient, for example non-covalent. Even transient connection will reduce the size of the opening and reduce unbinding of the polynucleotide from the helicase through the opening.
- the two or more parts are preferably connected by affinity molecules.
- Suitable affinity molecules are known in the art.
- the affinity molecules are preferably (a) complementary polynucleotides (International Application No. PCT/GB10/000132 (published as WO 2010/086602)), (b) an antibody or a fragment thereof and the complementary epitope (Biochemistry 6th Ed, W. H. Freeman and co (2007) pp 953-954), (c) peptide zippers (O'Shea et al., Science 254 (5031): 539-544), (d) capable of interacting by β-sheet augmentation (Remaut and Waksman Trends Biochem. Sci.
- the two or more parts may be transiently connected by a hexa-his tag or Ni-NTA.
- the two or more parts may also be modified such that they transiently connect to each other.
- the two or more parts are preferably permanently connected.
- a connection is permanent if it is not broken while the helicase is used or cannot be broken without intervention on the part of the user, such as using reduction to open —S—S— bonds.
- the two or more parts are preferably covalently-attached.
- the two or more parts may be covalently attached using any method known in the art.
- the two or more parts may be covalently attached via their naturally occurring amino acids, such as cysteines, threonines, serines, aspartates, asparagines, glutamates and glutamines.
- Naturally occurring amino acids may be modified to facilitate attachment.
- the naturally occurring amino acids may be modified by acylation, phosphorylation, glycosylation or farnesylation. Other suitable modifications are known in the art. Modifications to naturally occurring amino acids may be post-translation modifications.
- the two or more parts may be attached via amino acids that have been introduced into their sequences. Such amino acids are preferably introduced by substitution.
- the introduced amino acid may be cysteine or a non-natural amino acid that facilitates attachment.
- Suitable non-natural amino acids include, but are not limited to, 4-azido-L-phenylalanine (Faz), any one of the amino acids numbered 1-71 included in FIG. 1 of Liu C. C. and Schultz P. G., Annu. Rev. Biochem., 2010, 79, 413-444 or any one of the amino acids listed below.
- the introduced amino acids may be modified as discussed above.
- the two or more parts are connected using linkers.
- Linker molecules are discussed in more detail below.
- One suitable method of connection is cysteine linkage. This is discussed in more detail below.
- the two or more parts are preferably connected using one or more, such as two or three, linkers.
- the one or more linkers may be designed to reduce the size of, or close, the opening as discussed above. If one or more linkers are being used to close the opening as discussed above, at least a part of the one or more linkers is preferably oriented such that it is not parallel to the polynucleotide when it is bound by the helicase. More preferably, all of the linkers are oriented in this manner.
- At least a part of the one or more linkers preferably crosses the opening in an orientation that is not parallel to the polynucleotide when it is bound by the helicase. More preferably, all of the linkers cross the opening in this manner. In these embodiments, at least a part of the one or more linkers may be perpendicular to the polynucleotide. Such orientations effectively close the opening such that the polynucleotide cannot unbind from the helicase through the opening.
- Each linker may have two or more functional ends, such as two, three or four functional ends. Suitable configurations of ends in linkers are well known in the art.
- One or more ends of the one or more linkers are preferably covalently attached to the helicase. If one end is covalently attached, the one or more linkers may transiently connect the two or more parts as discussed above. If both or all ends are covalently attached, the one or more linkers permanently connect the two or more parts.
- At least one of the two or more parts is preferably modified to facilitate the attachment of the one or more linkers. Any modification may be made.
- the linkers may be attached to one or more reactive cysteine residues, reactive lysine residues or non-natural amino acids in the two or more parts.
- the non-natural amino acid may be any of those discussed above.
- the non-natural amino acid is preferably 4-azido-L-phenylalanine (Faz).
- At least one amino acid in the two or more parts is preferably substituted with cysteine or a non-natural amino acid, such as Faz.
- the one or more linkers are preferably amino acid sequences and/or chemical crosslinkers.
- Suitable amino acid linkers such as peptide linkers, are known in the art.
- the length, flexibility and hydrophilicity of the amino acid or peptide linker are typically designed such that it reduces the size of the opening, but does not disturb the functions of the helicase.
- Preferred flexible peptide linkers are stretches of 2 to 20, such as 4, 6, 8, 10 or 16, serine and/or glycine amino acids. More preferred flexible linkers include (SG)₁, (SG)₂, (SG)₃, (SG)₄, (SG)₅, (SG)₈, (SG)₁₀, (SG)₁₅ or (SG)₂₀ wherein S is serine and G is glycine.
- Preferred rigid linkers are stretches of 2 to 30, such as 4, 6, 8, 16 or 24, proline amino acids. More preferred rigid linkers include (P)₁₂ wherein P is proline.
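- The repeat notation used for these peptide linkers expands to plain one-letter sequences, as the short sketch below illustrates. The function names `flexible_linker` and `rigid_linker` are assumptions introduced here, and the sketch covers only the sequence strings, not any structural property of the linker.

```python
def flexible_linker(repeats: int) -> str:
    """Expand (SG)n into a one-letter sequence, e.g. n = 3 gives SGSGSG."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return "SG" * repeats


def rigid_linker(length: int = 12) -> str:
    """Expand (P)n into a one-letter polyproline sequence; the default is (P)12."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "P" * length


print(flexible_linker(3))         # SGSGSG
print(len(flexible_linker(10)))   # 20 residues
print(rigid_linker())             # PPPPPPPPPPPP
```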
- the amino acid sequence of a linker preferably comprises a polynucleotide binding moiety. Such moieties and the advantages associated with their use are discussed below.
- Suitable chemical crosslinkers are well-known in the art. Suitable chemical crosslinkers include, but are not limited to, those including the following functional groups: maleimide, active esters, succinimide, azide, alkyne (such as dibenzocyclooctynol (DIBO or DBCO), difluoro cycloalkynes and linear alkynes), phosphine (such as those used in traceless and non-traceless Staudinger ligations), haloacetyl (such as iodoacetamide), phosgene type reagents, sulfonyl chloride reagents, isothiocyanates, acyl halides, hydrazines, disulphides, vinyl sulfones, aziridines and photoreactive reagents (such as aryl azides, diaziridines).
- Reactions between amino acids and functional groups may be spontaneous, such as cysteine/maleimide, or may require external reagents, such as Cu(I) for linking azide and linear alkynes.
- Linkers can comprise any molecule that stretches across the distance required. Linkers can vary in length from one carbon (phosgene-type linkers) to many Angstroms. Examples of linear molecules include, but are not limited to, polyethyleneglycols (PEGs), polypeptides, polysaccharides, deoxyribonucleic acid (DNA), peptide nucleic acid (PNA), threose nucleic acid (TNA), glycerol nucleic acid (GNA), saturated and unsaturated hydrocarbons and polyamides. These linkers may be inert or reactive; in particular, they may be chemically cleavable at a defined position, or may themselves be modified with a fluorophore or ligand. The linker is preferably resistant to dithiothreitol (DTT).
- Preferred crosslinkers include 2,5-dioxopyrrolidin-1-yl 3-(pyridin-2-yldisulfanyl)propanoate, 2,5-dioxopyrrolidin-1-yl 4-(pyridin-2-yldisulfanyl)butanoate and 2,5-dioxopyrrolidin-1-yl 8-(pyridin-2-yldisulfanyl)octanoate, di-maleimide PEG 1k, di-maleimide PEG 3.4k, di-maleimide PEG 5k, di-maleimide PEG 10k, bis(maleimido)ethane (BMOE), bis-maleimidohexane (BMH), 1,4-bis-maleimidobutane (BMB), 1,4 bis-maleimidyl-2,3-dihydroxybutane (BMDB), BM[PEO]2 (1,8-bis-maleimidodiethyleneglycol),
- the one or more linkers may be cleavable. This is discussed in more detail below.
- the two or more parts may be connected using two different linkers that are specific for each other. One of the linkers is attached to one part and the other is attached to another part. The linkers should react to form a modified helicase of the invention.
- the two or more parts may be connected using the hybridization linkers described in International Application No. PCT/GB10/000132 (published as WO 2010/086602).
- the two or more parts may be connected using two or more linkers each comprising a hybridizable region and a group capable of forming a covalent bond.
- the hybridizable regions in the linkers hybridize and link the two or more parts.
- the linked parts are then coupled via the formation of covalent bonds between the groups. Any of the specific linkers disclosed in International Application No. PCT/GB10/000132 (published as WO 2010/086602) may be used in accordance with the invention.
- the two or more parts may be modified and then attached using a chemical crosslinker that is specific for the two modifications. Any of the crosslinkers discussed above may be used.
- the linkers may be labeled. Suitable labels include, but are not limited to, fluorescent molecules (such as Cy3 or AlexaFluor®555), radioisotopes (e.g. ¹²⁵I or ³⁵S), enzymes, antibodies, antigens, polynucleotides and ligands such as biotin. Such labels allow the amount of linker to be quantified.
- the label could also be a cleavable purification tag, such as biotin, or a specific sequence to show up in an identification method, such as a peptide that is not present in the protein itself, but that is released by trypsin digestion.
- a preferred method of connecting the two or more parts is via cysteine linkage. This can be mediated by a bi-functional chemical crosslinker or by an amino acid linker with a terminal presented cysteine residue. Linkage can occur via natural cysteines in the helicase. Alternatively, cysteines can be introduced into the two or more parts of the helicase. If the two or more parts are connected via cysteine linkage, the one or more cysteines have preferably been introduced to the two or more parts by substitution.
- any bi-functional linker may be designed to ensure that the size of the opening is reduced sufficiently and the function of the helicase is retained.
- Suitable linkers include bismaleimide crosslinkers, such as 1,4-bis(maleimido)butane (BMB) or bis(maleimido)hexane.
- One drawback of bi-functional linkers is the requirement of the helicase to contain no further surface accessible cysteine residues if attachment at specific sites is preferred, as binding of the bi-functional linker to surface accessible cysteine residues may be difficult to control and may affect substrate binding or activity.
- the reactivity of cysteine residues may be enhanced by modification of the adjacent residues, for example on a peptide linker. For instance, the basic groups of flanking arginine, histidine or lysine residues will change the pKa of the cysteine's thiol group to that of the more reactive S⁻ group.
- cysteine residues may be protected by thiol protective groups such as 5,5′-dithiobis-(2-nitrobenzoic acid) (dTNB). These may be reacted with one or more cysteine residues of the helicase before a linker is attached. Selective deprotection of surface accessible cysteines may be possible using reducing reagents immobilized on beads (for example immobilized tris(2-carboxyethyl) phosphine, TCEP). Cysteine linkage of the two or more parts is discussed in more detail below.
- Another preferred method of attaching the two or more parts is via 4-azido-L-phenylalanine (Faz) linkage.
- This can be mediated by a bi-functional chemical linker or by a polypeptide linker with a terminal presented Faz residue.
- the one or more Faz residues have preferably been introduced to the helicase by substitution. Faz linkage of two or more helicases is discussed in more detail below.
- the helicase is preferably a RecD helicase. Any RecD helicase may be used in accordance with the invention.
- the structures of RecD helicases are known in the art (FEBS J. 2008 April; 275(8):1835-51. Epub 2008 Mar. 9. ATPase activity of RecD is essential for growth of the Antarctic Pseudomonas syringae Lz4W at low temperature. Satapathy A K, Pavankumar T L, Bhattacharjya S, Sankaranarayanan R, Ray M K; FEMS Microbiol Rev. 2009 May; 33(3):657-87. The diversity of conjugative relaxases and its application in plasmid classification.
- the RecD helicase typically comprises the amino acid motif X1-X2-X3-G-X4-X5-X6-X7 (hereinafter called the RecD-like motif I; SEQ ID NO: 59), wherein X1 is G, S or A, X2 is any amino acid, X3 is P, A, S or G, X4 is T, A, V, S or C, X5 is G or A, X6 is K or R and X7 is T or S.
- X1 is preferably G.
- X2 is preferably G, I, Y or A.
- X2 is more preferably G.
- X3 is preferably P or A.
- X4 is preferably T, A, V or C.
- X4 is more preferably T, V or C.
- X5 is preferably G.
- X6 is preferably K.
- X7 is preferably T or S.
- the RecD helicase preferably comprises Q-(X8)₁₆₋₁₈-X1-X2-X3-G-X4-X5-X6-X7 (hereinafter called the extended RecD-like motif I; SEQ ID NOs: 60, 61 and 62), wherein X1 to X7 are as defined above and X8 is any amino acid.
- Suitable sequences for (X8)₁₆ can be identified in SEQ ID NOs: 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47 and 50 of U.S. Patent Application No. 61/581,332 and SEQ ID NOs: 18, 21, 24, 25, 28, 30, 32, 35, 37, 39, 41, 42 and 44 of International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
- the RecD helicase preferably comprises the amino acid motif G-G-P-G-Xa-G-K-Xb (hereinafter called the RecD motif I; SEQ ID NO: 63) wherein Xa is T, V or C and Xb is T or S. Xa is preferably T. Xb is preferably T.
- the RecD helicase preferably comprises the sequence G-G-P-G-T-G-K-T (SEQ ID NO: 64).
- the RecD helicase more preferably comprises the amino acid motif Q-(X8)₁₆₋₁₈-G-G-P-G-Xa-G-K-Xb (hereinafter called the extended RecD motif I; SEQ ID NO: 65, 66 and 67), wherein Xa and Xb are as defined above and X8 is any amino acid.
- Suitable sequences for (X8)₁₆ can be identified in SEQ ID NOs: 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47 and 50 of U.S. Patent Application No. 61/581,332 and SEQ ID NOs: 18, 21, 24, 25, 28, 30, 32, 35, 37, 39, 41, 42 and 44 of International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
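- The motif I definitions above translate directly into regular expressions over one-letter amino acid codes. The sketch below is illustrative only: the constant names, the helper `find_motif` and the padded test sequence are assumptions introduced here, "any amino acid" is approximated as any uppercase letter, and the two short peptides are the motif sequences quoted in this document (GYAGVGKT for TraI Eco and G-G-P-G-T-G-K-T).

```python
import re

# RecD-like motif I: X1-X2-X3-G-X4-X5-X6-X7
RECD_LIKE_MOTIF_I = r"[GSA][A-Z][PASG]G[TAVSC][GA][KR][TS]"
# RecD motif I: G-G-P-G-Xa-G-K-Xb
RECD_MOTIF_I = r"GGPG[TVC]GK[TS]"
# Extended forms: Q, then 16 to 18 residues of any kind, then the motif.
EXTENDED_RECD_LIKE_MOTIF_I = r"Q[A-Z]{16,18}" + RECD_LIKE_MOTIF_I
EXTENDED_RECD_MOTIF_I = r"Q[A-Z]{16,18}" + RECD_MOTIF_I


def find_motif(pattern: str, sequence: str):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)]


print(find_motif(RECD_LIKE_MOTIF_I, "GYAGVGKT"))  # [(1, 'GYAGVGKT')]
print(find_motif(RECD_MOTIF_I, "GYAGVGKT"))       # [] (stricter motif, no match)
print(find_motif(RECD_MOTIF_I, "GGPGTGKT"))       # [(1, 'GGPGTGKT')]

# Hypothetical padded sequence: Q, sixteen filler residues, then the motif.
padded = "Q" + "A" * 16 + "GGPGTGKT"
print(len(find_motif(EXTENDED_RECD_MOTIF_I, padded)))  # 1
```

- Because `re.finditer` reports non-overlapping matches, a scan of a full-length protein with these patterns would need a lookahead form if overlapping occurrences mattered.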
- the RecD helicase typically comprises the amino acid motif X1-X2-X3-X4-X5-(X6)₃-Q-X7 (hereinafter called the RecD-like motif V; SEQ ID NO: 68), wherein X1 is Y, W or F, X2 is A, T, S, M, C or V, X3 is any amino acid, X4 is T, N or S, X5 is A, T, G, S, V or I, X6 is any amino acid and X7 is G or S.
- X1 is preferably Y.
- X2 is preferably A, M, C or V.
- X2 is more preferably A.
- X3 is preferably I, M or L.
- X3 is more preferably I or L.
- X4 is preferably T or S. X4 is more preferably T.
- X5 is preferably A, V or I.
- X5 is more preferably V or I.
- X5 is most preferably V.
- (X6)₃ is preferably H-K-S, H-M-A, H-G-A or H-R-S.
- (X6)₃ is more preferably H-K-S.
- X7 is preferably G.
- the RecD helicase preferably comprises the amino acid motif Xa-Xb-Xc-Xd-Xe-H-K-S-Q-G (hereinafter called the RecD motif V; SEQ ID NO: 69), wherein Xa is Y, W or F, Xb is A, M, C or V, Xc is I, M or L, Xd is T or S and Xe is V or I.
- Xa is preferably Y.
- Xb is preferably A.
- Xd is preferably T.
- Xe is preferably V.
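- The motif V definitions can be sketched as regular expressions in the same way. The constant names and the helper `find_motif` are assumptions introduced for illustration, "any amino acid" is approximated as any uppercase letter, YAITAHGAQG is the TraI Eco motif V sequence quoted in this document, and YAITVHKSQG is a hypothetical peptide built from the preferred residues.

```python
import re

# RecD-like motif V: X1-X2-X3-X4-X5-(X6)3-Q-X7
RECD_LIKE_MOTIF_V = r"[YWF][ATSMCV][A-Z][TNS][ATGSVI][A-Z]{3}Q[GS]"
# RecD motif V: Xa-Xb-Xc-Xd-Xe-H-K-S-Q-G
RECD_MOTIF_V = r"[YWF][AMCV][IML][TS][VI]HKSQG"


def find_motif(pattern: str, sequence: str):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)]


print(find_motif(RECD_LIKE_MOTIF_V, "YAITAHGAQG"))  # [(1, 'YAITAHGAQG')]
print(find_motif(RECD_MOTIF_V, "YAITAHGAQG"))       # [] (fifth residue is not V or I)
print(find_motif(RECD_MOTIF_V, "YAITVHKSQG"))       # [(1, 'YAITVHKSQG')]
```

- The second call shows that a sequence can satisfy the broader RecD-like motif V without satisfying the narrower RecD motif V.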
- Preferred RecD motifs I are shown in Table 5 of U.S. Patent Application No. 61/581,332.
- Preferred RecD-like motifs I are shown in Table 7 of U.S. Patent Application No.
- the RecD helicase is preferably one of the helicases shown in Table 4 or 5 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562) or a variant thereof. Variants are described in U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
- the RecD helicase is preferably a TraI helicase or a TraI subgroup helicase.
- TraI helicases and TraI subgroup helicases may contain two RecD helicase domains, a relaxase domain and a C-terminal domain.
- the TraI subgroup helicase is preferably a TrwC helicase.
- the TraI helicase or TraI subgroup helicase is preferably one of the helicases shown in Table 6 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562) or a variant thereof. Variants are described in U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
- the TraI helicase or a TraI subgroup helicase typically comprises a RecD-like motif I as defined above (SEQ ID NO: 59) and/or a RecD-like motif V as defined above (SEQ ID NO: 68).
- the TraI helicase or a TraI subgroup helicase preferably comprises both a RecD-like motif I (SEQ ID NO: 59) and a RecD-like motif V (SEQ ID NO: 68).
- the TraI helicase or a TraI subgroup helicase typically further comprises one of the following two motifs:
- the TraI helicase or TraI subgroup helicase is more preferably one of the helicases shown in Table 6 or 7 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2013/098562) or a variant thereof.
- the TraI helicase most preferably comprises the sequence shown in SEQ ID NO: 85 or a variant thereof.
- SEQ ID NO: 85 is TraI Eco (NCBI Reference Sequence: NP_061483.1; Genbank AAQ98619.1; SEQ ID NO: 85).
- TraI Eco comprises the following motifs: RecD-like motif I (GYAGVGKT; SEQ ID NO: 86), RecD-like motif V (YAITAHGAQG; SEQ ID NO: 87) and Mob F motif III (HDTSRDQEPQLHTH; SEQ ID NO: 88).
- the TraI helicase or TraI subgroup helicase more preferably comprises the sequence of one of the helicases shown in Table 4 below, i.e. one of SEQ ID NOs: 85, 126, 134 and 138, or a variant thereof.
- two or more parts on the RecD helicase, TraI helicase or TraI subgroup helicase may be connected to reduce the size of the opening in the polynucleotide domain through which a polynucleotide can unbind from the helicase and wherein the helicase retains its ability to control the movement of the polynucleotide.
- Any of the embodiments discussed above for Hel308 helicases equally apply to RecD helicases, TraI helicases or TraI subgroup helicases.
- The two or more parts of TrwC Cba that are connected are preferably (a) amino acids 691 and 346 in SEQ ID NO: 126; (b) amino acids 657 and 339 in SEQ ID NO: 126; (c) amino acids 691 and 350 in SEQ ID NO: 126; or (d) amino acids 690 and 350 in SEQ ID NO: 126.
- These amino acids are preferably substituted with cysteine such that they can be connected by cysteine linkage.
- the invention may use a mutant TrwC Cba protein which comprises a variant of SEQ ID NO: 126 in which amino acids 691 and 346; 657 and 339; 691 and 350; or 690 and 350 are modified.
- the amino acids are preferably substituted.
- the amino acids are more preferably substituted with cysteine.
- the variant may differ from SEQ ID NO: 126 at positions other than 691 and 346; 657 and 339; 691 and 350; or 690 and 350 as long as the relevant amino acids are modified.
- the variant will preferably be at least 10% homologous to SEQ ID NO: 126 based on amino acid identity as discussed in more detail below. Amino acids 691 and 346; 657 and 339; 691 and 350; or 690 and 350 are not connected.
- These mutant TrwC Cba proteins may be used to form a modified helicase in which the modified amino acids are connected.
- a variant of a RecD helicase, TraI helicase or TraI subgroup helicase is an enzyme that has an amino acid sequence which varies from that of the wild-type helicase and which retains polynucleotide binding activity. This can be measured as described above.
- a variant of SEQ ID NO: 85, 126, 134 or 138 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 85, 126, 134 or 138 and which retains polynucleotide binding activity.
- the variant retains helicase activity.
- the variant must work in at least one of the two modes discussed below. Preferably, the variant works in both modes.
- the variant may include modifications that facilitate handling of the polynucleotide encoding the helicase and/or facilitate its activity at high salt concentrations and/or room temperature.
- Variants typically differ from the wild-type helicase in regions outside of the motifs discussed above. However, variants may include modifications within these motif(s).
- a variant will preferably be at least 10% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 20%, at least 25%, at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of any one of SEQ ID NOs: 85, 126, 134 and 138 over the entire sequence.
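The homology thresholds above are stated in terms of amino acid identity over the entire sequence. A minimal sketch of how such a percentage might be computed is given below. It assumes the two sequences have already been aligned by some external tool and are supplied as equal-length strings with `-` marking gaps; the convention used (identical columns divided by total alignment columns) is an assumption, as are the function names.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: identity = identical columns / total alignment columns.
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)

def meets_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the aligned pair reaches the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent
```

For example, `meets_threshold(a, b, 95)` would test a candidate variant against the "at least 95%" level under the stated convention.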
- the variant may differ from the wild-type sequence in any of the ways discussed above with reference to SEQ ID NOs: 2 and 4.
- a variant of any one of SEQ ID NOs: 85, 126, 134 and 138 preferably comprises the RecD-like motif I and/or RecD-like motif V of the wild-type sequence.
- a variant of SEQ ID NO: 85, 126, 134 or 138 may comprise the RecD-like motif I and/or extended RecD-like motif V from a different wild-type sequence.
- a variant may comprise any one of the preferred motifs shown in Tables 5 and 7 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2013/098562).
- Variants of SEQ ID NOs: 85, 126, 134 and 138 may also include modifications within the RecD-like motifs I and V of the wild-type sequence.
- a variant of SEQ ID NO: 85, 126, 134 or 138 preferably comprises one or more substituted cysteine residues and/or one or more substituted Faz residues to facilitate attachment as discussed above.
- the helicase is preferably an XPD helicase. Any XPD helicase may be used in accordance with the invention. XPD helicases are also known as Rad3 helicases and the two terms can be used interchangeably.
- The structures of XPD helicases are known in the art (Cell. 2008 May 30; 133(5):801-12. Structure of the DNA repair helicase XPD. Liu H, Rudolf J, Johnson K A, McMahon S A, Oke M, Carter L, McRobbie A M, Brown S E, Naismith J H, White M F).
- the XPD helicase typically comprises the amino acid motif X1-X2-X3-G-X4-X5-X6-E-G (hereinafter called XPD motif V; SEQ ID NO: 89).
- X1, X2, X5 and X6 are independently selected from any amino acid except D, E, K and R.
- X1, X2, X5 and X6 are independently selected from G, P, A, V, L, I, M, C, F, Y, W, H, Q, N, S and T.
- X1, X2, X5 and X6 are preferably not charged.
- X1, X2, X5 and X6 are preferably not H.
- X1 is more preferably V, L, I, S or Y.
- X5 is more preferably V, L, I, N or F.
- X6 is more preferably S or A.
- X3 and X4 may be any amino acid residue.
- X4 is preferably K, R or T.
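The XPD motif V definition above can be sketched as a regular expression. The pattern below encodes only the base definition (X1, X2, X5 and X6 drawn from the sixteen uncharged residues listed, X3 and X4 any residue) and ignores the "preferably" refinements. The function name `has_xpd_motif_v` is hypothetical, and restricting "any amino acid" to the twenty standard residues is an assumption.

```python
import re

# XPD motif V: X1-X2-X3-G-X4-X5-X6-E-G
NOT_DEKR = "[GPAVLIMCFYWHQNST]"    # X1, X2, X5, X6: any residue except D, E, K, R
ANY = "[ACDEFGHIKLMNPQRSTVWY]"     # X3, X4: any of the 20 standard residues (assumption)
XPD_MOTIF_V = re.compile(
    NOT_DEKR + NOT_DEKR + ANY + "G" + ANY + NOT_DEKR + NOT_DEKR + "EG"
)

def has_xpd_motif_v(sequence):
    """True if the sequence contains at least one XPD motif V instance."""
    return XPD_MOTIF_V.search(sequence.upper()) is not None
```

The motif V sequence given in this document for XPD Mbu, YLWGTLSEG (SEQ ID NO: 92), satisfies this pattern.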
- the XPD helicase typically comprises the amino acid motif Q-Xa-Xb-G-R-Xc-Xd-R-(Xe)3-Xf-(Xg)7-D-Xh-R (hereinafter called XPD motif VI; SEQ ID NO: 90).
- Xa, Xe and Xg may be any amino acid residue.
- Xb, Xc and Xd are independently selected from any amino acid except D, E, K and R.
- Xb, Xc and Xd are typically independently selected from G, P, A, V, L, I, M, C, F, Y, W, H, Q, N, S and T.
- Xb, Xc and Xd are preferably not charged.
- Xb, Xc and Xd are preferably not H.
- Xb is more preferably V, A, L, I or M.
- Xc is more preferably V, A, L, I, M or C.
- Xd is more preferably I, H, L, F, M or V.
- Xf may be D or E.
- (Xg)7 is Xg1, Xg2, Xg3, Xg4, Xg5, Xg6 and Xg7.
- Xg2 is preferably G, A, S or C.
- Xg5 is preferably F, V, L, I, M, A, W or Y.
- Xg6 is preferably L, F, Y, M, I or V.
- Xg7 is preferably A, C, V, L, I, M or S.
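The XPD motif VI definition above can be sketched in the same way, with the seven Xg positions captured so that the stated preferences for Xg2, Xg5, Xg6 and Xg7 can be checked separately. Several choices here are assumptions: Xh is not defined in this passage and is treated as any residue, "Xf may be D or E" is encoded as a strict requirement, and the name `check_xpd_motif_vi` is hypothetical.

```python
import re

NOT_DEKR = "[GPAVLIMCFYWHQNST]"    # Xb, Xc, Xd: any residue except D, E, K, R
ANY = "[ACDEFGHIKLMNPQRSTVWY]"     # Xa, Xe, Xg, and Xh by assumption

# Q-Xa-Xb-G-R-Xc-Xd-R-(Xe)3-Xf-(Xg)7-D-Xh-R, with (Xg)7 as capture group 1.
XPD_MOTIF_VI = re.compile(
    "Q" + ANY + NOT_DEKR + "GR" + NOT_DEKR + NOT_DEKR + "R"
    + ANY + "{3}" + "[DE]" + "(" + ANY + "{7})" + "D" + ANY + "R"
)

# Preferred residues for individual Xg positions, as listed in the text.
PREFERRED_XG = {2: "GASC", 5: "FVLIMAWY", 6: "LFYMIV", 7: "ACVLIMS"}

def check_xpd_motif_vi(sequence):
    """Return None if the motif is absent, else the (Xg)7 run and, for each
    listed Xg position, whether its residue is among the preferred ones."""
    match = XPD_MOTIF_VI.search(sequence.upper())
    if match is None:
        return None
    xg = match.group(1)
    preferred = {pos: xg[pos - 1] in allowed for pos, allowed in PREFERRED_XG.items()}
    return {"xg": xg, "preferred": preferred}

result = check_xpd_motif_vi("QAMGRVVRSPTDYGARILLDGR")
```

The input used here is the motif VI sequence given in this document for XPD Mbu (SEQ ID NO: 93).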
- the XPD helicase preferably comprises XPD motifs V and VI.
- the most preferred XPD motifs V and VI are shown in Table 5 of U.S. Patent Application No. 61/581,340 and International Application No. PCT/GB2012/053273 (published as WO 2013/098561).
- the XPD helicase preferably further comprises an iron sulphide (FeS) core between two Walker A and B motifs (motifs I and II).
- An FeS core typically comprises an iron atom coordinated between the sulphide groups of cysteine residues.
- the FeS core is typically tetrahedral.
- the XPD helicase is preferably one of the helicases shown in Table 4 or 5 of International Application No. PCT/GB2012/053273 (published as WO 2013/098561) or a variant thereof.
- the XPD helicase most preferably comprises the sequence shown in SEQ ID NO: 91 or a variant thereof.
- SEQ ID NO: 91 is XPD Mbu ( Methanococcoides burtonii ; YP_566221.1; GI:91773529).
- XPD Mbu comprises YLWGTLSEG (Motif V; SEQ ID NO: 92) and QAMGRVVRSPTDYGARILLDGR (Motif VI; SEQ ID NO: 93).
- a variant of a XPD helicase is an enzyme that has an amino acid sequence which varies from that of the wild-type helicase and which retains polynucleotide binding activity. This can be measured as described above.
- a variant of SEQ ID NO: 91 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 91 and which retains polynucleotide binding activity.
- the variant retains helicase activity.
- the variant must work in at least one of the two modes discussed below. Preferably, the variant works in both modes.
- the variant may include modifications that facilitate handling of the polynucleotide encoding the helicase and/or facilitate its activity at high salt concentrations and/or room temperature. Variants typically differ from the wild-type helicase in regions outside of XPD motifs V and VI discussed above. However, variants may include modifications within one or both of these motifs.
- a variant will preferably be at least 10%, preferably 30% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 91 over the entire sequence.
- the variant may differ from the wild-type sequence in any of the ways discussed above with reference to SEQ ID NOs: 2 and 4.
- a variant of SEQ ID NO: 91 preferably comprises the XPD motif V and/or the XPD motif VI of the wild-type sequence.
- a variant of SEQ ID NO: 91 more preferably comprises both XPD motifs V and VI of SEQ ID NO: 91.
- a variant of SEQ ID NO: 91 may comprise XPD motifs V and/or VI from a different wild-type sequence.
- a variant of SEQ ID NO: 91 may comprise any one of the preferred motifs shown in Table 5 of U.S. Patent Application No. 61/581,340 and International Application No. PCT/GB2012/053273 (published as WO 2013/098561).
- Variants of SEQ ID NO: 91 may also include modifications within XPD motif V and/or XPD motif VI of the wild-type sequence. Suitable modifications to these motifs are discussed above when defining the two motifs.
- two or more parts on the XPD helicase may be connected to reduce the size of the opening in the polynucleotide domain through which a polynucleotide can unbind from the helicase and wherein the helicase retains its ability to control the movement of the polynucleotide. Any of the embodiments discussed above for Hel308 helicases equally apply to XPD helicases.
- a variant of SEQ ID NO: 91 preferably comprises one or more substituted cysteine residues and/or one or more substituted Faz residues to facilitate attachment as discussed above.
- the helicase is preferably a UvrD helicase. Any UvrD helicase may be used in the invention.
- the UvrD helicase preferably comprises the sequence shown in SEQ ID NO: 122 or a variant thereof. Variants are defined above. Over the entire length of the amino acid sequence of SEQ ID NO: 122, a variant will preferably be at least 20% homologous to that sequence based on amino acid similarity or identity.
- the variant polypeptide may be at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 122 over the entire sequence.
- the helicase is preferably a Dda helicase. Any Dda helicase may be used in the invention.
- Dda helicases typically comprise the following five domains: 1A (RecA-like motor) domain, 2A (RecA-like motor) domain, tower domain, pin domain and hook domain (Xiaoping He et al., 2012, Structure; 20: 1189-1200).
- the domains may be identified using protein modelling, x-ray diffraction measurement of the protein in a crystalline state (Rupp B (2009). Biomolecular Crystallography: Principles, Practice and Application to Structural Biology.
- Preferred Dda helicases are shown in Table 5 below.
- the Dda helicase more preferably comprises the sequence of one of the helicases shown in Table 5, i.e. one of SEQ ID NOs: 97 to 112, or a variant thereof. Variants are defined above. Over the entire length of the amino acid sequence of any one of SEQ ID NOs: 97 to 112, a variant will preferably be at least 20% homologous to that sequence based on amino acid similarity or identity.
- the variant polypeptide may be at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of any one of SEQ ID NOs: 97 to 112 over the entire sequence.
- Preferred variants of any one of SEQ ID NOs: 97 to 112 have a non-natural amino acid, such as Faz, at the amino- (N-) terminus and/or carboxy (C-) terminus.
- Preferred variants of any one of SEQ ID NOs: 8 to 23 have a cysteine residue at the amino- (N-) terminus and/or carboxy (C-) terminus.
- Preferred variants of any one of SEQ ID NOs: 8 to 23 have a cysteine residue at the amino- (N-) terminus and a non-natural amino acid, such as Faz, at the carboxy (C-) terminus or vice versa.
- Preferred variants of SEQ ID NO: 8 contain one or more of, such as all of, the following modifications E54G, D151E, I196N and G357A.
- the Dda helicase preferably comprises any of the modifications disclosed in International Application Nos. PCT/GB2014/052736 and PCT/GB2015/052916 (published as WO 2015/055981 and WO 2016/055777).
- a preferred variant of SEQ ID NO: 97 comprises (a) E94C and A360C or (b) E94C, A360C, C109A and C136A and then optionally (ΔM1)G1 (i.e. deletion of M1 and then addition of G1). It may also be termed M1G. Any of the variants discussed above may further comprise M1G.
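The substitution notation used above (for example E94C: wild-type residue E at position 94 replaced by C) can be applied to a sequence mechanically. The sketch below is illustrative only. The function name `apply_substitutions` and the six-residue toy sequence are hypothetical, positions are assumed to be 1-based, and M1G is treated as a plain substitution at position 1, which gives the same net sequence as deleting M1 and then adding G1.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")

def apply_substitutions(sequence, mutations):
    """Apply substitutions such as 'E94C' to a protein sequence.

    Raises ValueError if the notation is malformed, the position is out of
    range, or the stated wild-type residue does not match the sequence.
    """
    residues = list(sequence)
    for mutation in mutations:
        parsed = SUBSTITUTION.match(mutation)
        if parsed is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        wild_type, position, new = parsed.group(1), int(parsed.group(2)), parsed.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position out of range: {mutation}")
        if residues[position - 1] != wild_type:
            raise ValueError(f"wild-type residue mismatch: {mutation}")
        residues[position - 1] = new
    return "".join(residues)

# Toy sequence (hypothetical, not a real helicase): M A E K C A
variant = apply_substitutions("MAEKCA", ["E3C", "C5A", "M1G"])
```

Checking the wild-type residue before substituting guards against applying a mutation list to the wrong reference sequence or numbering scheme.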
- two or more parts on the Dda helicase may be connected to reduce the size of the opening in the polynucleotide domain through which a polynucleotide can unbind from the helicase and wherein the helicase retains its ability to control the movement of the polynucleotide.
- Any of the embodiments discussed above for Hel308 helicases equally apply to Dda helicases.
- the translocase is preferably a strippase.
- the strippase is preferably the INO80 chromatin remodeling complex or a FtsK/SpoIIIE transporter.
- the translocase is contacted with the constructs after they are created by the MuA transposase. In another embodiment, the translocase is bound to the substrates before the substrates are contacted with the template polynucleotide.
- constructs comprising a fragment of the template polynucleotide and one or more MuA substrates are formed.
- the two strands of each construct are preferably linked at one end by a hairpin loop.
- a hairpin loop is added to each of the fragments of the template polynucleotide generated by the MuA transposase.
- Suitable hairpin loops can be designed using methods known in the art.
- the hairpin loop may be any length.
- the hairpin loop is typically 110 or fewer nucleotides, such as 100 or fewer nucleotides, 90 or fewer nucleotides, 80 or fewer nucleotides, 70 or fewer nucleotides, 60 or fewer nucleotides, 50 or fewer nucleotides, 40 or fewer nucleotides, 30 or fewer nucleotides, 20 or fewer nucleotides or 10 or fewer nucleotides, in length.
- the hairpin loop is preferably from about 1 to 110, from 2 to 100, from 5 to 80 or from 6 to 50 nucleotides in length.
- Longer lengths of the hairpin loop, such as from 50 to 110 nucleotides, are preferred if the loop is involved in the differential selectability of the adaptor. Similarly, shorter lengths of the hairpin loop, such as from 1 to 5 nucleotides, are preferred if the loop is not involved in the selectable binding as discussed below.
- the hairpin loop preferably comprises a selectable binding moiety.
- a selectable binding moiety is a moiety that can be selected on the basis of its binding properties.
- a selectable binding moiety is preferably a moiety that specifically binds to a surface.
- a selectable binding moiety specifically binds to a surface if it binds to the surface to a much greater degree than any other moiety used in the invention.
- the moiety binds to a surface to which no other moiety used in the invention binds.
- Suitable selective binding moieties are known in the art.
- Preferred selective binding moieties include, but are not limited to, biotin, a polynucleotide sequence, antibodies, antibody fragments, such as Fab and ScFv, antigens, polynucleotide binding proteins, poly histidine tails and GST tags.
- the most preferred selective binding moieties are biotin and a selectable polynucleotide sequence. Biotin specifically binds to a surface coated with avidins.
- Selectable polynucleotide sequences specifically bind (i.e. hybridise) to a surface coated with homologous sequences.
- selectable polynucleotide sequences specifically bind to a surface coated with polynucleotide binding proteins.
- the hairpin loop and/or the selectable binding moiety may comprise a region that can be cut, nicked, cleaved or hydrolysed. Such a region can be designed to allow the constructs to be removed from the surface to which they are bound following purification or isolation. Suitable regions are known in the art. Suitable regions include, but are not limited to, an RNA region, a region comprising desthiobiotin and streptavidin, a disulphide bond and a photocleavable region.
- the hairpin loop may be provided at either end of the polynucleotide, i.e. the 5′ or the 3′ end.
- the hairpin loop may be ligated to the polynucleotide using any method known in the art.
- the hairpin loop may be ligated using a ligase, such as T4 DNA ligase, E. coli DNA ligase, Taq DNA ligase, Tma DNA ligase and 9°N DNA ligase.
- the hairpin loop may be added to the constructs as described in International Application No. PCT/GB2014/052505 (published as WO 2015/022544).
- the method preferably further comprises attaching one or more molecular brakes to a non-substrate strand.
- a non-substrate strand is a strand of a MuA double stranded substrate that does not comprise an overhang.
- the molecular brakes may be attached to the non-substrate strands in the substrates before they are contacted with the template polynucleotide and the MuA transposase.
- the molecular brakes may be attached to the other strands from the substrates remaining in the constructs after they are created by the MuA transposase.
- the molecular brakes are preferably bound to Y adaptors comprising a leader sequence and/or one or more anchors capable of coupling the adaptor to a membrane and the Y adaptors are attached to the other strands in step (c).
- the Y adaptors are typically polynucleotide adaptors. They may be formed from any of the polynucleotides discussed above.
- the Y adaptor typically comprises (a) a double stranded region and (b) a single stranded region or a region that is not complementary at the other end.
- the Y adaptor may be described as having an overhang if it comprises a single stranded region.
- the presence of a non-complementary region in the Y adaptor gives the adaptor its Y shape since the two strands typically do not hybridise to each other unlike the double stranded portion.
- the Y adaptor may comprise one or more anchors.
- the Y adaptor and/or the hairpin loop may be ligated to the polynucleotide using any method known in the art.
- One or both of the adaptors may be ligated using a ligase, such as T4 DNA ligase, E. coli DNA ligase, Taq DNA ligase, Tma DNA ligase and 9°N DNA ligase.
- the adaptors may be added to the constructs as described in International Application No. PCT/GB2014/052505 (published as WO 2015/022544).
- the Y adaptor may be provided with a leader sequence which preferentially threads into the pore.
- the leader sequence facilitates the method of the invention.
- the leader sequence is designed to preferentially thread into the transmembrane pore and thereby facilitate the movement of polynucleotide through the pore.
- the leader sequence can also be used to link the polynucleotide to the one or more anchors as discussed below.
- the leader sequence typically comprises a polymer.
- the polymer is preferably negatively charged.
- the polymer is preferably a polynucleotide, such as DNA or RNA, a modified polynucleotide (such as abasic DNA), PNA, LNA, polyethylene glycol (PEG) or a polypeptide.
- the leader preferably comprises a polynucleotide and more preferably comprises a single stranded polynucleotide.
- the leader sequence can comprise any of the polynucleotides discussed above.
- the single stranded leader sequence most preferably comprises a single strand of DNA, such as a poly dT section.
- the leader sequence preferably comprises the one or more spacers.
- the leader sequence can be any length, but is typically 10 to 150 nucleotides in length, such as from 20 to 150 nucleotides in length.
- the length of the leader typically depends on the transmembrane pore used in the method.
- the Y adaptor preferably comprises a selectable binding moiety as discussed above.
- the Y adaptor and/or the selectable binding moiety may comprise a region that can be cut, nicked, cleaved or hydrolysed as discussed above.
- the method comprises contacting the target polynucleotide with a molecular brake which controls the movement of the target polynucleotide through the pore.
- Any molecular brake may be used including any of those disclosed in International Application No. PCT/GB2014/052737 (published as WO 2015/110777).
- the molecular brake is preferably a polynucleotide binding protein.
- the polynucleotide binding protein may be any protein that is capable of binding to the polynucleotide and controlling its movement through a transmembrane pore as discussed in more detail below. It is straightforward in the art to determine whether or not a protein binds to a polynucleotide.
- the protein typically interacts with and modifies at least one property of the polynucleotide.
- the protein may modify the polynucleotide by cleaving it to form individual nucleotides or shorter chains of nucleotides, such as di- or trinucleotides.
- the protein may modify the polynucleotide by orienting it or moving it to a specific position, i.e. controlling its movement.
- the polynucleotide binding protein is preferably derived from a polynucleotide handling enzyme.
- a polynucleotide handling enzyme is a polypeptide that is capable of interacting with and modifying at least one property of a polynucleotide.
- the enzyme may modify the polynucleotide by cleaving it to form individual nucleotides or shorter chains of nucleotides, such as di- or trinucleotides.
- the enzyme may modify the polynucleotide by orienting it or moving it to a specific position.
- the polynucleotide handling enzyme does not need to display enzymatic activity as long as it is capable of binding the polynucleotide and controlling its movement through the pore. For instance, the enzyme may be modified to remove its enzymatic activity or may be used under conditions which prevent it from acting as an enzyme. Such conditions are discussed in more detail below.
- the polynucleotide handling enzyme is preferably derived from a nucleolytic enzyme.
- the polynucleotide handling enzyme used in the construct of the enzyme is more preferably derived from a member of any of the Enzyme Classification (EC) groups 3.1.11, 3.1.13, 3.1.14, 3.1.15, 3.1.16, 3.1.21, 3.1.22, 3.1.25, 3.1.26, 3.1.27, 3.1.30 and 3.1.31.
- the enzyme may be any of those disclosed in International Application No. PCT/GB10/000133 (published as WO 2010/086603).
- Preferred enzymes are polymerases, exonucleases, helicases, translocases and topoisomerases, such as gyrases.
- Suitable enzymes include, but are not limited to, exonuclease I from E. coli (SEQ ID NO: 11), exonuclease III enzyme from E. coli (SEQ ID NO: 13), RecJ from T. thermophilus (SEQ ID NO: 15) and bacteriophage lambda exonuclease (SEQ ID NO: 17), TatD exonuclease and variants thereof.
- Three subunits comprising the sequence shown in SEQ ID NO: or a variant thereof interact to form a trimer exonuclease.
- the polymerase may be PyroPhage® 3173 DNA Polymerase (which is commercially available from Lucigen® Corporation), SD Polymerase (commercially available from Bioron®) or variants thereof.
- the enzyme is preferably Phi29 DNA polymerase (SEQ ID NO: 9) or a variant thereof.
- the topoisomerase is preferably a member of any of the Enzyme Classification (EC) groups 5.99.1.2 and 5.99.1.3.
- the enzyme is most preferably derived from a helicase.
- the helicase may be or be derived from a Hel308 helicase, a RecD helicase, such as TraI helicase or a TrwC helicase, a XPD helicase or a Dda helicase.
- the helicase may be or be derived from Hel308 Mbu (SEQ ID NO: 18), Hel308 Csy (SEQ ID NO: 19), Hel308 Tga (SEQ ID NO: 20), Hel308 Mhu (SEQ ID NO: 21), TraI Eco (SEQ ID NO: 22), XPD Mbu (SEQ ID NO: 23) or a variant thereof.
- the helicase may be any of the helicases, modified helicases or helicase constructs disclosed in International Application Nos. PCT/GB2012/052579 (published as WO 2013/057495); PCT/GB2012/053274 (published as WO 2013/098562); PCT/GB2012/053273 (published as WO 2013/098561); PCT/GB2013/051925 (published as WO 2014/013260); PCT/GB2013/051924 (published as WO 2014/013259); PCT/GB2013/051928 (published as WO 2014/013262) and PCT/GB2014/052736 (published as WO 2015/055981).
- the helicase preferably comprises the sequence shown in SEQ ID NO: 25 (TrwC Cba) or a variant thereof, the sequence shown in SEQ ID NO: 18 (Hel308 Mbu) or a variant thereof or the sequence shown in SEQ ID NO: 24 (Dda) or a variant thereof.
- Variants may differ from the native sequences in any of the ways discussed below for transmembrane pores.
- a preferred variant of SEQ ID NO: 24 comprises (a) E94C and A360C or (b) E94C, A360C, C109A and C136A and then optionally (ΔM1)G1 (i.e. deletion of M1 and then addition of G1). It may also be termed M1G. Any of the variants discussed above may further comprise M1G.
- the Dda helicase preferably comprises any of the modifications disclosed in International Application Nos. PCT/GB2014/052736 and PCT/GB2015/052916 (published as WO 2015/055981 and WO 2016/055777).
- Any number of helicases may be used in accordance with the invention. For instance, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more helicases may be used. In some embodiments, different numbers of helicases may be used.
- the method of the invention preferably comprises attaching two or more helicases to the other strands.
- the two or more helicases are typically the same helicase.
- the two or more helicases may be different helicases.
- the two or more helicases may be any combination of the helicases mentioned above.
- the two or more helicases may be two or more Dda helicases.
- the two or more helicases may be one or more Dda helicases and one or more TrwC helicases.
- the two or more helicases may be different variants of the same helicase.
- the two or more helicases are preferably attached to one another.
- the two or more helicases are more preferably covalently attached to one another.
- the helicases may be attached in any order and using any method.
- Preferred helicase constructs for use in the invention are described in International Application Nos. PCT/GB2013/051925 (published as WO 2014/013260); PCT/GB2013/051924 (published as WO 2014/013259); PCT/GB2013/051928 (published as WO 2014/013262) and PCT/GB2014/052736.
- a variant of SEQ ID NO: 9, 11, 13, 15, 17, 18, 19, 20, 21, 22, 23, 24 or 25 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 9, 11, 13, 15, 17, 18, 19, 20, 21, 22, 23, 24 or 25 and which retains polynucleotide binding ability. This can be measured using any method known in the art. For instance, the variant can be contacted with a polynucleotide and its ability to bind to and move along the polynucleotide can be measured. The variant may include modifications that facilitate binding of the polynucleotide and/or facilitate its activity at high salt concentrations and/or room temperature. Variants may be modified such that they bind polynucleotides (i.e.
- a variant will preferably be at least 50% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 9, 11, 13, 15, 17, 18, 19, 20, 21, 22, 23, 24 or 25 over the entire sequence.
- the enzyme may be covalently attached to the pore. Any method may be used to covalently attach the enzyme to the pore.
- TrwC Cba-Q594A SEQ ID NO: 25 with the mutation Q594A.
- This variant does not function as a helicase (i.e. binds polynucleotides but does not move along them when provided with all the necessary components to facilitate movement, e.g. ATP and Mg2+).
- the polynucleotide is translocated through the pore either with or against an applied potential.
- Exonucleases that act progressively or processively on double stranded polynucleotides can be used on the cis side of the pore to feed the remaining single strand through under an applied potential or the trans side under a reverse potential.
- a helicase that unwinds the double stranded DNA can also be used in a similar manner.
- a polymerase may also be used.
- Some sequencing applications require strand translocation against an applied potential, but the DNA must first be “caught” by the enzyme under a reverse or no potential.
- the single strand DNA exonucleases or single strand DNA dependent polymerases can act as molecular motors to pull the recently translocated single strand back through the pore in a controlled stepwise manner, trans to cis, against the applied potential.
- Helicases may work in two modes with respect to the pore.
- the method is preferably carried out using a helicase such that it moves the polynucleotide through the pore with the field resulting from the applied voltage.
- the helicase moves the polynucleotide into the pore such that it is passed through the pore with the field until it finally translocates through to the trans side of the membrane.
- the method is preferably carried out such that a helicase moves the polynucleotide through the pore against the field resulting from the applied voltage.
- the 3′ end of the polynucleotide is first captured in the pore, and the helicase moves the polynucleotide through the pore such that it is pulled out of the pore against the applied field until finally ejected back to the cis side of the membrane.
- the method may also be carried out in the opposite direction.
- the 3′ end of the polynucleotide may be first captured in the pore and the helicase may move the polynucleotide into the pore such that it is passed through the pore with the field until it finally translocates through to the trans side of the membrane.
- When the helicase is not provided with the necessary components to facilitate movement, or is modified to hinder or prevent its movement, it can bind to the polynucleotide and act as a brake, slowing the movement of the polynucleotide when it is pulled into the pore by the applied field.
- In the inactive mode, it does not matter whether the polynucleotide is captured 3′ or 5′ down; the applied field pulls the polynucleotide into the pore towards the trans side, with the enzyme acting as a brake.
- the movement control of the polynucleotide by the helicase can be described in a number of ways including ratcheting, sliding and braking. Helicase variants which lack helicase activity can also be used in this way.
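- The modes described above can be summarised in a short sketch. The names `HelicaseMode`, `exit_side` and `driven_by` are hypothetical labels introduced only for this illustration; the mapping simply restates the text and is not a model of enzyme behaviour.

```python
from enum import Enum


class HelicaseMode(Enum):
    """Hypothetical labels for the modes described in the text."""
    WITH_FIELD = "with_field"        # active helicase feeds the strand into the pore
    AGAINST_FIELD = "against_field"  # active helicase pulls the strand back out
    BRAKE = "brake"                  # inactive helicase only slows field-driven movement


def exit_side(mode: HelicaseMode) -> str:
    """Return the side of the membrane on which the strand finally ends up."""
    if mode is HelicaseMode.AGAINST_FIELD:
        return "cis"   # pulled out of the pore against the applied field
    return "trans"     # passed through the pore with the applied field


def driven_by(mode: HelicaseMode) -> str:
    """Return what moves the strand: the helicase, or the field when braking."""
    return "applied field" if mode is HelicaseMode.BRAKE else "helicase"


for mode in HelicaseMode:
    print(mode.name, "->", exit_side(mode), "| moved by", driven_by(mode))
```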
- the molecular brake may function as the translocase that removes the MuA transposase.
- the molecular brake is used in addition to a translocase.
- the molecular brake and translocase may be the same enzyme or different enzymes. Where the molecular brake and translocase are the same enzyme, one molecule of the enzyme may act as a molecular brake and another molecule of the enzyme may act as a translocase to remove the MuA transposase.
- the polynucleotide may be contacted with the molecular brake and the pore in any order. It is preferred that, when the polynucleotide is contacted with the molecular brake, such as a helicase, and the pore, the polynucleotide firstly forms a complex with the protein. When the voltage is applied across the pore, the polynucleotide/protein complex then forms a complex with the pore and controls the movement of the polynucleotide through the pore.
- Any steps in the method using a polynucleotide binding protein are typically carried out in the presence of free nucleotides or free nucleotide analogues and an enzyme cofactor that facilitates the action of the polynucleotide binding protein.
- the free nucleotides may be one or more of any of the individual nucleotides discussed above.
- the free nucleotides include, but are not limited to, adenosine monophosphate (AMP), adenosine diphosphate (ADP), adenosine triphosphate (ATP), guanosine monophosphate (GMP), guanosine diphosphate (GDP), guanosine triphosphate (GTP), thymidine monophosphate (TMP), thymidine diphosphate (TDP), thymidine triphosphate (TTP), uridine monophosphate (UMP), uridine diphosphate (UDP), uridine triphosphate (UTP), cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyadenosine diphosphate (dADP), deoxyaden
- the free nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP or dCMP.
- the free nucleotides are preferably adenosine triphosphate (ATP).
- the enzyme cofactor is a factor that allows the construct to function.
- the enzyme cofactor is preferably a divalent metal cation.
- the divalent metal cation is preferably Mg²⁺, Mn²⁺, Ca²⁺ or Co²⁺.
- the enzyme cofactor is most preferably Mg²⁺.
- the molecular brakes may be any compound or molecule which binds to the polynucleotide and slows the movement of the polynucleotide through the pore.
- the molecular brake may be any of those discussed above.
- the molecular brake preferably comprises a compound which binds to the polynucleotide.
- the compound is preferably a macrocycle. Suitable macrocycles include, but are not limited to, cyclodextrins, calixarenes, cyclic peptides, crown ethers, cucurbiturils, pillararenes, derivatives thereof or a combination thereof.
- the cyclodextrin or derivative thereof may be any of those disclosed in Eliseev, A. V., and Schneider, H-J. (1994) J.
- the cyclodextrin is more preferably heptakis-6-amino-β-cyclodextrin (am7-βCD), 6-monodeoxy-6-monoamino-β-cyclodextrin (am1-βCD) or heptakis-(6-deoxy-6-guanidino)-cyclodextrin (gu7-βCD).
- the method of the invention preferably does not comprise heat inactivating the MuA transposase.
- Heat inactivation may also inactivate any other enzymes or proteins being used in the preparation or characterisation of the modified polynucleotides. Removing the heat inactivation step also dispenses with the need for additional equipment required for heating, such as a thermal cycler, hot block, or water bath, used for heating up the sample.
- the method of the invention can therefore be used in a variety of different settings including those without an electricity supply.
- the invention also provides a population of double stranded MuA substrates for modifying a template polynucleotide, wherein each substrate comprises an overhang at one or both ends and a translocase bound to an overhang. Any of the embodiments discussed above equally apply to the population of the invention.
- the invention also provides a plurality of polynucleotides modified using the method of the invention.
- the plurality of polynucleotides may be in any of the forms discussed above.
- the population or plurality may be isolated, substantially isolated, purified or substantially purified.
- a population or plurality is isolated or purified if it is completely free of any other components, such as the template polynucleotide, lipids or pores.
- a population or plurality is substantially isolated if it is mixed with carriers or diluents which will not interfere with its intended use.
- a population or plurality is substantially isolated or substantially purified if it is present in a form that comprises less than 10%, less than 5%, less than 2% or less than 1% of other components, such as lipids or pores.
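- The purity tiers above can be expressed as a simple check. This is a minimal sketch: the function names are hypothetical, `other_fraction` is assumed to be the fraction (0 to 1) of the preparation made up of other components, and the separate criterion about non-interfering carriers or diluents is not modelled.

```python
# Thresholds for "other components" taken from the text: 1%, 2%, 5% and 10%.
THRESHOLDS = (0.01, 0.02, 0.05, 0.10)


def strictest_threshold_met(other_fraction: float):
    """Return the tightest threshold the preparation falls under, or None."""
    if not 0.0 <= other_fraction <= 1.0:
        raise ValueError("other_fraction must lie between 0 and 1")
    for limit in THRESHOLDS:  # ordered from strictest to loosest
        if other_fraction < limit:
            return limit
    return None


def classify(other_fraction: float) -> str:
    """Label a preparation using the definitions given in the text."""
    if other_fraction == 0.0:
        return "isolated or purified"  # completely free of other components
    if strictest_threshold_met(other_fraction) is not None:
        return "substantially purified"
    return "not substantially purified"


print(classify(0.03), strictest_threshold_met(0.03))
```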
- the invention also comprises a method of characterising at least one polynucleotide modified using a method of the invention.
- the modified polynucleotide is contacted with a transmembrane pore such that at least one strand of the polynucleotide moves through the pore.
- One or more measurements which are indicative of one or more characteristics of the polynucleotide are taken as the at least one strand moves with respect to the pore.
- the invention also provides a method of characterising a template polynucleotide.
- the template polynucleotide is modified using the method of the invention to produce a plurality of modified polynucleotides.
- Each modified polynucleotide is contacted with a transmembrane pore such that at least one strand of each polynucleotide moves through the pore.
- One or more measurements which are indicative of one or more characteristics of the polynucleotide are taken as the at least one strand of each polynucleotide moves with respect to the pore.
- the method preferably comprises contacting the/each modified polynucleotide with a transmembrane pore such that both strands of the polynucleotide move through the pore. If molecular brakes are present on the/each modified polynucleotides, the molecular brakes may control the movement of the/each modified polynucleotide through the pore and/or separate the two strands of the/each modified polynucleotide.
- the transmembrane pore is typically in a membrane. Any membrane may be used in accordance with the invention. Suitable membranes are well-known in the art.
- the membrane is preferably an amphiphilic layer.
- An amphiphilic layer is a layer formed from amphiphilic molecules, such as phospholipids, which have both hydrophilic and lipophilic properties.
- the amphiphilic molecules may be synthetic or naturally occurring. Non-naturally occurring amphiphiles and amphiphiles which form a monolayer are known in the art and include, for example, block copolymers (Gonzalez-Perez et al., Langmuir, 2009, 25, 10447-10450).
- Block copolymers are polymeric materials in which two or more monomer sub-units are polymerised together to create a single polymer chain. Block copolymers typically have properties that are contributed by each monomer sub-unit. However, a block copolymer may have unique properties that polymers formed from the individual sub-units do not possess. Block copolymers can be engineered such that one of the monomer sub-units is hydrophobic (i.e. lipophilic), whilst the other sub-unit(s) are hydrophilic whilst in aqueous media. In this case, the block copolymer may possess amphiphilic properties and may form a structure that mimics a biological membrane.
- the block copolymer may be a diblock (consisting of two monomer sub-units), but may also be constructed from more than two monomer sub-units to form more complex arrangements that behave as amphiphiles.
- the copolymer may be a triblock, tetrablock or pentablock copolymer.
- the membrane is preferably a triblock copolymer membrane.
- Archaebacterial bipolar tetraether lipids are naturally occurring lipids that are constructed such that the lipid forms a monolayer membrane. These lipids are generally found in extremophiles that survive in harsh biological environments, thermophiles, halophiles and acidophiles. Their stability is believed to derive from the fused nature of the final bilayer. It is straightforward to construct block copolymer materials that mimic these biological entities by creating a triblock polymer that has the general motif hydrophilic-hydrophobic-hydrophilic. This material may form monomeric membranes that behave similarly to lipid bilayers and encompass a range of phase behaviours from vesicles through to laminar membranes. Membranes formed from these triblock copolymers hold several advantages over biological lipid membranes. Because the triblock copolymer is synthesised, the exact construction can be carefully controlled to provide the correct chain lengths and properties required to form membranes and to interact with pores and other proteins.
- Block copolymers may also be constructed from sub-units that are not classed as lipid sub-materials; for example a hydrophobic polymer may be made from siloxane or other non-hydrocarbon based monomers.
- the hydrophilic sub-section of block copolymer can also possess low protein binding properties, which allows the creation of a membrane that is highly resistant when exposed to raw biological samples.
- This head group unit may also be derived from non-classical lipid head-groups.
- Triblock copolymer membranes also have increased mechanical and environmental stability compared with biological lipid membranes, for example a much higher operational temperature or pH range.
- the synthetic nature of the block copolymers provides a platform to customise polymer based membranes for a wide range of applications.
- the membrane is most preferably one of the membranes disclosed in International Application No. PCT/GB2013/052766 or PCT/GB2013/052767.
- amphiphilic molecules may be chemically-modified or functionalised to facilitate coupling of the polynucleotide.
- the amphiphilic layer may be a monolayer or a bilayer.
- the amphiphilic layer is typically planar.
- the amphiphilic layer may be curved.
- the amphiphilic layer may be supported.
- the amphiphilic layer may be concave.
- the amphiphilic layer may be suspended from raised pillars such that the peripheral region of the amphiphilic layer is higher than the amphiphilic layer region in the centre. This may allow the microparticle to travel, move, slide or roll along the membrane as described above.
- Amphiphilic membranes are typically naturally mobile, essentially acting as two dimensional fluids with lipid diffusion rates of approximately 10⁻⁸ cm s⁻¹. This means that the pore and coupled polynucleotide can typically move within an amphiphilic membrane.
- the membrane may be a lipid bilayer.
- Lipid bilayers are models of cell membranes and serve as excellent platforms for a range of experimental studies.
- lipid bilayers can be used for in vitro investigation of membrane proteins by single-channel recording.
- lipid bilayers can be used as biosensors to detect the presence of a range of substances.
- the lipid bilayer may be any lipid bilayer. Suitable lipid bilayers include, but are not limited to, a planar lipid bilayer, a supported bilayer or a liposome.
- the lipid bilayer is preferably a planar lipid bilayer. Suitable lipid bilayers are disclosed in International Application No. PCT/GB08/000563 (published as WO 2008/102121), International Application No. PCT/GB08/004127 (published as WO 2009/077734) and International Application No. PCT/GB2006/001057 (published as WO 2006/100484).
- Lipid bilayers are commonly formed by the method of Montal and Mueller (Proc. Natl. Acad. Sci. USA., 1972; 69: 3561-3566), in which a lipid monolayer is carried on aqueous solution/air interface past either side of an aperture which is perpendicular to that interface.
- the lipid is normally added to the surface of an aqueous electrolyte solution by first dissolving it in an organic solvent and then allowing a drop of the solvent to evaporate on the surface of the aqueous solution on either side of the aperture. Once the organic solvent has evaporated, the solution/air interfaces on either side of the aperture are physically moved up and down past the aperture until a bilayer is formed.
- Planar lipid bilayers may be formed across an aperture in a membrane or across an opening into a recess.
- The method of Montal & Mueller is popular because it is a cost-effective and relatively straightforward method of forming good quality lipid bilayers that are suitable for protein pore insertion.
- Other common methods of bilayer formation include tip-dipping, painting bilayers and patch-clamping of liposome bilayers.
- Tip-dipping bilayer formation entails touching the aperture surface (for example, a pipette tip) onto the surface of a test solution that is carrying a monolayer of lipid. Again, the lipid monolayer is first generated at the solution/air interface by allowing a drop of lipid dissolved in organic solvent to evaporate at the solution surface. The bilayer is then formed by the Langmuir-Schaefer process and requires mechanical automation to move the aperture relative to the solution surface.
- For painted bilayers, lipid dissolved in organic solvent is applied directly to the aperture, which is submerged in an aqueous test solution.
- the lipid solution is spread thinly over the aperture using a paintbrush or an equivalent. Thinning of the solvent results in formation of a lipid bilayer.
- complete removal of the solvent from the bilayer is difficult and consequently the bilayer formed by this method is less stable and more prone to noise during electrochemical measurement.
- Patch-clamping is commonly used in the study of biological cell membranes.
- the cell membrane is clamped to the end of a pipette by suction and a patch of the membrane becomes attached over the aperture.
- the method has been adapted for producing lipid bilayers by clamping liposomes which then burst to leave a lipid bilayer sealing over the aperture of the pipette.
- the method requires stable, giant and unilamellar liposomes and the fabrication of small apertures in materials having a glass surface.
- Liposomes can be formed by sonication, extrusion or the Mozafari method (Colas et al. (2007) Micron 38:841-847).
- the lipid bilayer is formed as described in International Application No. PCT/GB08/004127 (published as WO 2009/077734).
- the lipid bilayer is formed from dried lipids.
- the lipid bilayer is formed across an opening as described in WO2009/077734 (PCT/GB08/004127).
- a lipid bilayer is formed from two opposing layers of lipids.
- the two layers of lipids are arranged such that their hydrophobic tail groups face towards each other to form a hydrophobic interior.
- the hydrophilic head groups of the lipids face outwards towards the aqueous environment on each side of the bilayer.
- the bilayer may be present in a number of lipid phases including, but not limited to, the liquid disordered phase (fluid lamellar), liquid ordered phase, solid ordered phase (lamellar gel phase, interdigitated gel phase) and planar bilayer crystals (lamellar sub-gel phase, lamellar crystalline phase).
- Any lipid composition that forms a lipid bilayer may be used.
- the lipid composition is chosen such that a lipid bilayer having the required properties, such as surface charge, ability to support membrane proteins, packing density or mechanical properties, is formed.
- the lipid composition can comprise one or more different lipids.
- the lipid composition can contain up to 100 lipids.
- the lipid composition preferably contains 1 to 10 lipids.
- the lipid composition may comprise naturally-occurring lipids and/or artificial lipids.
- the lipids typically comprise a head group, an interfacial moiety and two hydrophobic tail groups which may be the same or different.
- Suitable head groups include, but are not limited to, neutral head groups, such as diacylglycerides (DG) and ceramides (CM); zwitterionic head groups, such as phosphatidylcholine (PC), phosphatidylethanolamine (PE) and sphingomyelin
- Suitable interfacial moieties include, but are not limited to, naturally-occurring interfacial moieties, such as glycerol-based or ceramide-based moieties.
- Suitable hydrophobic tail groups include, but are not limited to, saturated hydrocarbon chains, such as lauric acid (n-Dodecanoic acid), myristic acid (n-Tetradecanoic acid), palmitic acid (n-Hexadecanoic acid), stearic acid (n-Octadecanoic acid) and arachidic acid (n-Eicosanoic acid); unsaturated hydrocarbon chains, such as oleic acid (cis-9-Octadecenoic acid); and branched hydrocarbon chains, such as phytanoyl.
- the length of the chains and the position and number of the branches, such as methyl groups, in the branched hydrocarbon chains can vary.
- the hydrophobic tail groups can be linked to the interfacial moiety as an ether or an ester.
- the lipids may be mycolic acid.
- the lipids can also be chemically-modified.
- the head group or the tail group of the lipids may be chemically-modified.
- Suitable lipids whose head groups have been chemically-modified include, but are not limited to, PEG-modified lipids, such as 1,2-Diacyl-sn-Glycero-3-Phosphoethanolamine-N-[Methoxy(Polyethylene glycol)-2000]; functionalised PEG Lipids, such as 1,2-Distearoyl-sn-Glycero-3 Phosphoethanolamine-N-[Biotinyl(Polyethylene Glycol) 2000]; and lipids modified for conjugation, such as 1,2-Dioleoyl-sn-Glycero-3-Phosphoethanolamine-N-(succinyl) and 1,2-Dipalmitoyl-sn-Glycero-3-Phosphoethanolamine-N-(Biotinyl).
- Suitable lipids whose tail groups have been chemically-modified include, but are not limited to, polymerisable lipids, such as 1,2-bis(10,12-tricosadiynoyl)-sn-Glycero-3-Phosphocholine; fluorinated lipids, such as 1-Palmitoyl-2-(16-Fluoropalmitoyl)-sn-Glycero-3-Phosphocholine; deuterated lipids, such as 1,2-Dipalmitoyl-D62-sn-Glycero-3-Phosphocholine; and ether linked lipids, such as 1,2-Di-O-phytanyl-sn-Glycero-3-Phosphocholine.
- the lipids may be chemically-modified or functionalised to facilitate coupling of the polynucleotide.
- the amphiphilic layer typically comprises one or more additives that will affect the properties of the layer.
- Suitable additives include, but are not limited to, fatty acids, such as palmitic acid, myristic acid and oleic acid; fatty alcohols, such as palmitic alcohol, myristic alcohol and oleic alcohol; sterols, such as cholesterol, ergosterol, lanosterol, sitosterol and stigmasterol; lysophospholipids, such as 1-Acyl-2-Hydroxy-sn-Glycero-3-Phosphocholine; and ceramides.
- the membrane is a solid state layer.
- Solid state layers can be formed from both organic and inorganic materials including, but not limited to, microelectronic materials, insulating materials such as HfO₂, Si₃N₄, Al₂O₃, and SiO, organic and inorganic polymers such as polyamide, plastics such as Teflon® or elastomers such as two-component addition-cure silicone rubber, and glasses.
- the solid state layer may be formed by atomic layer deposition (ALD).
- the ALD solid state layer may comprise alternating layers of HfO₂ and Al₂O₃.
- the solid state layer may be formed from monatomic layers, such as graphene, or layers that are only a few atoms thick.
- Suitable graphene layers are disclosed in International Application No. PCT/US2008/010637 (published as WO 2009/035647). Yusko et al., Nature Nanotechnology, 2011; 6: 253-260 and US Patent Application No. 2013/0048499 describe the delivery of proteins to transmembrane pores in solid state layers without the use of microparticles. The method of the invention may be used to improve the delivery in the methods disclosed in these documents.
- the method is typically carried out using (i) an artificial amphiphilic layer comprising a pore, (ii) an isolated, naturally-occurring lipid bilayer comprising a pore, or (iii) a cell having a pore inserted therein.
- the method is typically carried out using an artificial amphiphilic layer, such as an artificial triblock copolymer layer.
- the layer may comprise other transmembrane and/or intramembrane proteins as well as other molecules in addition to the pore. Suitable apparatus and conditions are discussed below.
- the method of the invention is typically carried out in vitro.
- a transmembrane pore is a structure that crosses the membrane to some degree.
- a transmembrane pore comprises a first opening and a second opening with a lumen extending between the first opening and the second opening.
- the transmembrane pore permits hydrated ions driven by an applied potential to flow across or within the membrane.
- the transmembrane pore typically crosses the entire membrane so that hydrated ions may flow from one side of the membrane to the other side of the membrane.
- the transmembrane pore does not have to cross the membrane. It may be closed at one end.
- the pore may be a well, gap, channel, trench or slit in the membrane along which or into which hydrated ions may flow.
- the pore may be biological or artificial. Suitable pores include, but are not limited to, protein pores, polynucleotide pores and solid state pores.
- the pore may be a DNA origami pore (Langecker et al., Science, 2012; 338: 932-936).
- the transmembrane pore is preferably a transmembrane protein pore.
- a transmembrane protein pore is a polypeptide or a collection of polypeptides that permits hydrated ions, such as polynucleotide, to flow from one side of a membrane to the other side of the membrane.
- the transmembrane protein pore is capable of forming a pore that permits hydrated ions driven by an applied potential to flow from one side of the membrane to the other.
- the transmembrane protein pore preferably permits polynucleotides to flow from one side of the membrane, such as a triblock copolymer membrane, to the other.
- the transmembrane protein pore allows a polynucleotide, such as DNA or RNA, to be moved through the pore.
- the transmembrane protein pore may be a monomer or an oligomer.
- the pore is preferably made up of several repeating subunits, such as at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, or at least 16 subunits.
- the pore is preferably a hexameric, heptameric, octameric or nonameric pore.
- the pore may be a homo-oligomer or a hetero-oligomer.
- the transmembrane protein pore typically comprises a barrel or channel through which the ions may flow.
- the subunits of the pore typically surround a central axis and contribute strands to a transmembrane β barrel or channel or a transmembrane α-helix bundle or channel.
- the barrel or channel of the transmembrane protein pore typically comprises amino acids that facilitate interaction with analytes, such as nucleotides, polynucleotides or nucleic acids. These amino acids are preferably located near a constriction of the barrel or channel.
- the transmembrane protein pore typically comprises one or more positively charged amino acids, such as arginine, lysine or histidine, or aromatic amino acids, such as tyrosine or tryptophan. These amino acids typically facilitate the interaction between the pore and nucleotides, polynucleotides or nucleic acids.
- Transmembrane protein pores for use in accordance with the invention can be derived from β-barrel pores or α-helix bundle pores.
- β-barrel pores comprise a barrel or channel that is formed from β-strands.
- Suitable β-barrel pores include, but are not limited to, β-toxins, such as α-hemolysin, anthrax toxin and leukocidins, and outer membrane proteins/porins of bacteria, such as Mycobacterium smegmatis porin (Msp), for example MspA, MspB, MspC or MspD, CsgG, outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A and Neisseria autotransporter lipoprotein (NalP) and other pores, such as lysenin.
- α-helix bundle pores comprise a barrel or channel that is formed from α-helices.
- Suitable α-helix bundle pores include, but are not limited to, inner membrane proteins and α outer membrane proteins, such as WZA and ClyA toxin.
- the transmembrane pore may be derived from lysenin.
- Suitable pores derived from CsgG are disclosed in International Application No. PCT/EP2015/069965.
- Suitable pores derived from lysenin are disclosed in International Application No. PCT/GB2013/050667 (published as WO 2013/153359).
- the transmembrane pore may be derived from or based on Msp, α-hemolysin (α-HL), lysenin, CsgG, ClyA, Sp1 and haemolytic protein fragaceatoxin C (FraC).
- the wild type α-hemolysin pore is formed of 7 identical monomers or sub-units (i.e., it is heptameric).
- the sequence of one monomer or sub-unit of α-hemolysin-NN is shown in SEQ ID NO: 4.
- the transmembrane protein pore is preferably derived from Msp, more preferably from MspA. Such a pore will be oligomeric and typically comprises 7, 8, 9 or 10 monomers derived from Msp.
- the pore may be a homo-oligomeric pore derived from Msp comprising identical monomers. Alternatively, the pore may be a hetero-oligomeric pore derived from Msp comprising at least one monomer that differs from the others.
- the pore is derived from MspA or a homolog or paralog thereof.
- a monomer derived from Msp typically comprises the sequence shown in SEQ ID NO: 2 or a variant thereof.
- SEQ ID NO: 2 is the MS-(B1)8 mutant of the MspA monomer. It includes the following mutations: D90N, D91N, D93N, D118R, D134R and E139K.
- a variant of SEQ ID NO: 2 is a polypeptide that has an amino acid sequence which varies from that of SEQ ID NO: 2 and which retains its ability to form a pore. The ability of a variant to form a pore can be assayed using any method known in the art.
- the variant may be inserted into an amphiphilic layer along with other appropriate subunits and its ability to oligomerise to form a pore may be determined.
- Methods are known in the art for inserting subunits into membranes, such as amphiphilic layers.
- subunits may be suspended in a purified form in a solution containing a triblock copolymer membrane such that it diffuses to the membrane and is inserted by binding to the membrane and assembling into a functional state.
- subunits may be directly inserted into the membrane using the “pick and place” method described in M. A. Holden, H. Bayley. J. Am. Chem. Soc. 2005, 127, 6502-6503 and International Application No. PCT/GB2006/001057 (published as WO 2006/100484).
- a variant will preferably be at least 50% homologous to that sequence based on amino acid similarity or identity. More preferably, the variant may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid similarity or identity to the amino acid sequence of SEQ ID NO: 2 over the entire sequence. There may be at least 80%, for example at least 85%, 90% or 95%, amino acid similarity or identity over a stretch of 100 or more, for example 125, 150, 175 or 200 or more, contiguous amino acids (“hard homology”).
- Standard methods in the art may be used to determine homology.
- the UWGCG Package provides the BESTFIT program which can be used to calculate homology, for example used on its default settings (Devereux et al (1984) Nucleic Acids Research 12, p 387-395).
- the PILEUP and BLAST algorithms can be used to calculate homology or line up sequences (such as identifying equivalent residues or corresponding sequences (typically on their default settings)), for example as described in Altschul S. F. (1993) J Mol Evol 36:290-300; Altschul, S. F et al (1990) J Mol Biol 215:403-10.
- Similarity can be measured using pairwise identity or by applying a scoring matrix such as BLOSUM62 and converting to an equivalent identity. Since they represent functional rather than evolved changes, deliberately mutated positions would be masked when determining homology. Similarity may be determined more sensitively by the application of position-specific scoring matrices using, for example, PSIBLAST on a comprehensive database of protein sequences. A different scoring matrix could be used that reflects amino acid chemico-physical properties rather than frequency of substitution over evolutionary time scales (e.g. charge).
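- As an illustration of pairwise identity with masked positions, the following minimal sketch scores two sequences that are assumed to be already aligned and of equal length. It is not BESTFIT, PILEUP, BLAST or PSIBLAST; the function names, the toy sequences and the choice to skip gap columns are assumptions made for the example.

```python
def percent_identity(a: str, b: str, masked=frozenset()) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    `masked` holds 1-based positions (for example deliberately mutated
    residues) that are ignored. Columns with a gap ('-') are also skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = compared = 0
    for pos, (x, y) in enumerate(zip(a, b), start=1):
        if pos in masked or x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def best_window_identity(a: str, b: str, window: int) -> float:
    """Highest identity over any contiguous stretch of `window` columns."""
    if len(a) != len(b) or not 0 < window <= len(a):
        raise ValueError("invalid alignment or window length")
    return max(
        percent_identity(a[i:i + window], b[i:i + window])
        for i in range(len(a) - window + 1)
    )


# Toy sequences, not taken from any SEQ ID NO.
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"
print(percent_identity(reference, variant))               # whole alignment
print(percent_identity(reference, variant, masked={10}))  # mutated site masked
```

- With real sequences, the `window` argument would correspond to the stretch of 100 or more contiguous amino acids mentioned above.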
- SEQ ID NO: 2 is the MS-(B1)8 mutant of the MspA monomer.
- the variant may comprise any of the mutations in the MspB, C or D monomers compared with MspA.
- the mature forms of MspB, C and D are shown in SEQ ID NOs: 5 to 7.
- the variant may comprise the following substitution present in MspB: A138P.
- the variant may comprise one or more of the following substitutions present in MspC: A96G, N102E and A138P.
- the variant may comprise one or more of the following mutations present in MspD: Deletion of G1, L2V, E5Q, L8V, D13G, W21A, D22E, K47T, I49H, I68V, D91G, A96Q, N102D, S103T, V104I, S136K and G141A.
- the variant may comprise combinations of one or more of the mutations and substitutions from Msp B, C and D.
- the variant preferably comprises the mutation L88N.
- a variant of SEQ ID NO: 2 has the mutation L88N in addition to all the mutations of MS-B1 and is called MS-(B2)8.
- the pore used in the invention is preferably MS-(B2)8.
- the variant of SEQ ID NO: 2 preferably comprises one or more of D56N, D56F, E59R, G75S, G77S, A96D and Q126R.
- a variant of SEQ ID NO: 2 has the mutations G75S/G77S/L88N/Q126R in addition to all the mutations of MS-B1 and is called MS-B2C.
- the pore used in the invention is preferably MS-(B2)8 or MS-(B2C)8.
- the variant of SEQ ID NO: 2 preferably comprises N93D.
- the variant more preferably comprises the mutations G75S/G77S/L88N/N93D/Q126R.
- Amino acid substitutions may be made to the amino acid sequence of SEQ ID NO: 2 in addition to those discussed above, for example up to 1, 2, 3, 4, 5, 10, 20 or 30 substitutions.
- amino acids may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality or charge to the amino acids they replace.
- conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid.
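- The substitution notation used above (original residue, 1-based position, new residue, with combinations joined by "/") can be parsed and applied mechanically. The sketch below uses hypothetical helper names and a toy five-residue sequence, not SEQ ID NO: 2; deletions such as "Deletion of G1" are not handled.

```python
import re

# One substitution token: original residue, position, replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(token: str):
    """Split a token such as 'D90N' into ('D', 90, 'N')."""
    match = _SUBSTITUTION.match(token.strip())
    if match is None:
        raise ValueError(f"not a substitution: {token!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence: str, spec: str) -> str:
    """Apply a '/'-separated set of substitutions to a sequence."""
    residues = list(sequence)
    for token in spec.split("/"):
        old, position, new = parse_substitution(token)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(f"expected {old} at position {position}")
        residues[position - 1] = new
    return "".join(residues)


print(parse_substitution("D90N"))
print(apply_substitutions("MKDLG", "D3N/L4N"))  # toy sequence only
```

- Checking the original residue before replacing it catches numbering mistakes, which matters when positions are quoted relative to a particular reference sequence.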
- the transmembrane protein pore is preferably derived from CsgG, more preferably from CsgG from E. coli Str. K-12 substr. MC4100. Such a pore will be oligomeric and typically comprises 7, 8, 9 or 10 monomers derived from CsgG.
- the pore may be a homo-oligomeric pore derived from CsgG comprising identical monomers. Alternatively, the pore may be a hetero-oligomeric pore derived from CsgG comprising at least one monomer that differs from the others.
- a monomer derived from CsgG typically comprises the sequence shown in SEQ ID NO: 114 or a variant thereof.
- a variant of SEQ ID NO: 114 is a polypeptide that has an amino acid sequence which varies from that of SEQ ID NO: 114 and which retains its ability to form a pore.
- a variant will preferably be at least 50% homologous to that sequence based on amino acid similarity or identity. More preferably, the variant may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid similarity or identity to the amino acid sequence of SEQ ID NO: 114 over the entire sequence.
- the variant of SEQ ID NO: 114 may comprise any of the mutations disclosed in International Application No. PCT/GB2015/069965 (published as WO 2016/034591).
- the variant of SEQ ID NO: 114 preferably comprises one or more of the following (i) one or more mutations at the following positions (i.e. mutations at one or more of the following positions) N40, D43, E44, S54, S57, Q62, R97, E101, E124, E131, R142, T150 and R192, such as one or more mutations at the following positions (i.e.
- the variant may comprise any combination of (i) to (xi). If the variant comprises any one of (i) and (iii) to (xi), it may further comprise a mutation at one or more of Y51, N55 and F56, such as at Y51, N55, F56, Y51/N55, Y51/F56, N55/F56 or Y51/N55/F56.
- Preferred variants of SEQ ID NO: 114 which form pores in which fewer nucleotides contribute to the current as the polynucleotide moves through the pore comprise Y51A/F56A, Y51A/F56N, Y51I/F56A, Y51L/F56A, Y51T/F56A, Y51I/F56N, Y51L/F56N or Y51T/F56N or more preferably Y51I/F56A, Y51L/F56A or Y51T/F56A.
- Preferred variants of SEQ ID NO: 114 which form pores displaying an increased range comprise mutations at the following positions:
- Preferred variants of SEQ ID NO: 114 which form pores displaying an increased range comprise:
- Preferred variants of SEQ ID NO: 114 which form pores in which fewer nucleotides contribute to the current as the polynucleotide moves through the pore comprise mutations at the following positions:
- Preferred variants of SEQ ID NO: 114 which form pores displaying an increased throughput comprise mutations at the following positions:
- Preferred variants which form pores displaying an increased throughput comprise:
- Preferred variants of SEQ ID NO: 7 which form pores in which capture of the polynucleotide is increased comprise the following mutations:
- Preferred variants of SEQ ID NO: 114 comprise the following mutations:
- Preferred variants of SEQ ID NO: 114 comprise the following mutations:
- the variant of SEQ ID NO: 114 may comprise any of the substitutions present in another CsgG homologue.
- Preferred CsgG homologues are shown in SEQ ID NOs: 3 to 7 and 26 to 41 of International Application No. PCT/GB2015/069965 (published as WO 2016/034591).
- any of the proteins described herein may be modified to assist their identification or purification, for example by the addition of histidine residues (a his tag), aspartic acid residues (an asp tag), a streptavidin tag, a flag tag, a SUMO tag, a GST tag or a MBP tag, or by the addition of a signal sequence to promote their secretion from a cell where the polypeptide does not naturally contain such a sequence.
- An alternative to introducing a genetic tag is to chemically react a tag onto a native or engineered position on the pore or construct. An example of this would be to react a gel-shift reagent to a cysteine
- the pore may be labelled with a revealing label.
- the revealing label may be any suitable label which allows the pore to be detected. Suitable labels include, but are not limited to, fluorescent molecules, radioisotopes, e.g. ¹²⁵I, ³⁵S, enzymes, antibodies, antigens, polynucleotides and ligands such as biotin.
- any of the proteins described herein may be made synthetically or by recombinant means.
- the pore may be synthesised by in vitro translation and transcription (IVTT).
- the amino acid sequence of the pore may be modified to include non-naturally occurring amino acids or to increase the stability of the protein.
- amino acids may be introduced during production.
- the pore may also be altered following either synthetic or recombinant production.
- any of the proteins described herein, such as the transmembrane protein pores, can be produced using standard methods known in the art.
- Polynucleotide sequences encoding a pore or construct may be derived and replicated using standard methods in the art.
- Polynucleotide sequences encoding a pore or construct may be expressed in a bacterial host cell using standard techniques in the art.
- the pore may be produced in a cell by in situ expression of the polypeptide from a recombinant expression vector.
- the expression vector optionally carries an inducible promoter to control the expression of the polypeptide.
- the pore may be produced in large scale following purification by any protein liquid chromatography system from protein producing organisms or after recombinant expression.
- Typical protein liquid chromatography systems include FPLC, AKTA systems, the Bio-Cad system, the Bio-Rad BioLogic system and the Gilson HPLC system.
- The/each modified polynucleotide preferably comprises one or more anchors which are capable of coupling to the membrane.
- the method preferably further comprises coupling the target polynucleotide to the membrane using the one or more anchors.
- the anchor comprises a group which couples (or binds) to the polynucleotide and a group which couples (or binds) to the membrane.
- Each anchor may covalently couple (or bind) to the polynucleotide and/or the membrane.
- the group may be a chemical group and/or a functional group.
- the polynucleotide may be coupled to the membrane using any number of anchors, such as 2, 3, 4 or more anchors.
- the polynucleotide may be coupled to the membrane using two anchors each of which separately couples (or binds) to both the polynucleotide and membrane.
- the one or more anchors may comprise one or more molecular brakes or polynucleotide binding proteins.
- Each anchor may comprise one or more molecular brakes or polynucleotide binding proteins.
- the molecular brake(s) or polynucleotide binding protein(s) may be any of those discussed below.
- the one or more anchors preferably comprise a polypeptide anchor present in the membrane and/or a hydrophobic anchor present in the membrane.
- the hydrophobic anchor is preferably a lipid, fatty acid, sterol, carbon nanotube, polypeptide, protein or amino acid, for example cholesterol, palmitate or tocopherol.
- the one or more anchors are not the pore.
- the components of the membrane may be chemically-modified or functionalised to form the one or more anchors. Examples of suitable chemical modifications and suitable ways of functionalising the components of the membrane are discussed in more detail below. Any proportion of the membrane components may be functionalised, for example at least 0.01%, at least 0.1%, at least 1%, at least 10%, at least 25%, at least 50% or 100%.
- the polynucleotide may be coupled directly to the membrane.
- the one or more anchors used to couple the polynucleotide to the membrane preferably comprise a linker.
- the one or more anchors may comprise one or more, such as 2, 3, 4 or more, linkers.
- One linker may be used to couple more than one, such as 2, 3, 4 or more, polynucleotides to the membrane.
- Preferred linkers include, but are not limited to, polymers, such as polynucleotides, polyethylene glycols (PEGs), polysaccharides and polypeptides. These linkers may be linear, branched or circular. For instance, the linker may be a circular polynucleotide. The polynucleotide may hybridise to a complementary sequence on the circular polynucleotide linker.
- the one or more anchors or one or more linkers may comprise a component that can be cut or broken down, such as a restriction site or a photolabile group.
- linkers functionalised with maleimide groups will react with and attach to cysteine residues in proteins.
- the protein may be present in the membrane, may be the polynucleotide itself or may be used to couple (or bind) to the polynucleotide. This is discussed in more detail below.
- Crosslinkage of polynucleotides can be avoided using a “lock and key” arrangement. Only one end of each linker may react together to form a longer linker and the other ends of the linker each react with the polynucleotide or membrane respectively.
- linkers are described in International Application No. PCT/GB10/000132 (published as WO 2010/086602).
- The use of a linker is preferred in the sequencing embodiments discussed below. If a polynucleotide is permanently coupled directly to the membrane in the sense that it does not uncouple when interacting with the pore, then some sequence data will be lost as the sequencing run cannot continue to the end of the polynucleotide due to the distance between the membrane and the pore. If a linker is used, then the polynucleotide can be processed to completion.
- the coupling may be permanent or stable.
- the coupling may be such that the polynucleotide remains coupled to the membrane when interacting with the pore.
- the coupling may be transient.
- the coupling may be such that the polynucleotide may decouple from the membrane when interacting with the pore.
- the transient nature of the coupling is preferred. If a permanent or stable linker is attached directly to either the 5′ or 3′ end of a polynucleotide and the linker is shorter than the distance between the membrane and the transmembrane pore's channel, then some sequence data will be lost as the sequencing run cannot continue to the end of the polynucleotide. If the coupling is transient, then when the coupled end randomly becomes free of the membrane, the polynucleotide can be processed to completion.
- the polynucleotide may be transiently coupled to an amphiphilic layer or triblock copolymer membrane using cholesterol or a fatty acyl chain. Any fatty acyl chain having a length of from 6 to 30 carbon atoms, such as hexadecanoic acid, may be used.
- a polynucleotide, such as a nucleic acid, is coupled to an amphiphilic layer such as a triblock copolymer membrane or lipid bilayer. Coupling of nucleic acids to synthetic lipid bilayers has been carried out previously with various different tethering strategies. These are summarised in Table 3 below.
- Synthetic polynucleotides and/or linkers may be functionalised using a modified phosphoramidite in the synthesis reaction, which is easily compatible for the direct addition of suitable anchoring groups, such as cholesterol, tocopherol, palmitate, thiol, lipid and biotin groups.
- Coupling of polynucleotides to a linker or to a functionalised membrane can also be achieved by a number of other means provided that a complementary reactive group or an anchoring group can be added to the polynucleotide.
- a thiol group can be added to the 5′ of ssDNA or dsDNA using T4 polynucleotide kinase and ATPγS (Grant, G. P. and P. Z. Qin (2007). “A facile method for attaching nitroxide spin labels at the 5′ terminus of nucleic acids.” Nucleic Acids Res 35(10): e77).
- An azide group can be added to the 5′-phosphate of ssDNA or dsDNA using T4 polynucleotide kinase and γ-[2-Azidoethyl]-ATP or γ-[6-Azidohexyl]-ATP.
- a tether containing either a thiol, iodoacetamide, OPSS or maleimide group (reactive to thiols) or a DIBO (dibenzocyclooctyne) or alkyne group (reactive to azides) can be covalently attached to the polynucleotide.
- a more diverse selection of chemical groups such as biotin, thiols and fluorophores, can be added using terminal transferase to incorporate modified oligonucleotides to the 3′ of ssDNA (Kumar, A., P. Tchen, et al. (1988). “Nonradioactive labeling of synthetic oligonucleotide probes with terminal deoxynucleotidyl transferase.” Anal Biochem 169(2): 376-82). Streptavidin/biotin and/or streptavidin/desthiobiotin coupling may be used for any other polynucleotide.
- a polynucleotide can be coupled to a membrane using streptavidin/biotin and streptavidin/desthiobiotin. It may also be possible that anchors may be directly added to polynucleotides using terminal transferase with suitably modified nucleotides (e.g. cholesterol or palmitate).
- the one or more anchors preferably couple the polynucleotide to the membrane via hybridisation.
- the hybridisation may be present in any part of the one or more anchors, such as between the one or more anchors and the polynucleotide, within the one or more anchors or between the one or more anchors and the membrane.
- Hybridisation in the one or more anchors allows coupling in a transient manner as discussed above.
- a linker may comprise two or more polynucleotides, such as 3, 4 or 5 polynucleotides, hybridised together.
- the one or more anchors may hybridise to the polynucleotide.
- the one or more anchors may hybridise directly to the polynucleotide, directly to a Y adaptor and/or leader sequence attached to the polynucleotide or directly to a hairpin loop adaptor attached to the polynucleotide (as discussed in more detail below).
- the one or more anchors may be hybridised to one or more, such as 2 or 3, intermediate polynucleotides (or “splints”) which are hybridised to the polynucleotide, to a Y adaptor and/or leader sequence attached to the polynucleotide or to a hairpin loop adaptor attached to the polynucleotide (as discussed in more detail below).
- the one or more anchors may comprise a single stranded or double stranded polynucleotide.
- One part of the anchor may be ligated to a single stranded or double stranded polynucleotide analyte.
- Ligation of short pieces of ssDNA has been reported using T4 RNA ligase I (Troutt, A. B., M. G. McHeyzer-Williams, et al. (1992). “Ligation-anchored PCR: a simple amplification technique with single-sided specificity.” Proc Natl Acad Sci USA 89(20): 9823-5).
- either a single stranded or double stranded polynucleotide can be ligated to a double stranded polynucleotide and then the two strands separated by thermal or chemical denaturation.
- For a double stranded polynucleotide, it is possible to add either a piece of single stranded polynucleotide to one or both of the ends of the duplex, or a double stranded polynucleotide to one or both ends.
- T4 RNA ligase I for ligation to other regions of single stranded polynucleotides.
- ligation can be “blunt-ended”, with complementary 3′ dA/dT tails on the polynucleotide and added polynucleotide respectively (as is routinely done for many sample prep applications to prevent concatemer or dimer formation) or using “sticky-ends” generated by restriction digestion of the polynucleotide and ligation of compatible adapters.
- each single strand will have either a 5′ or 3′ modification if a single stranded polynucleotide was used for ligation or a modification at the 5′ end, the 3′ end or both if a double stranded polynucleotide was used for ligation.
- the one or more anchors can be incorporated during the chemical synthesis of the polynucleotide.
- the polynucleotide can be synthesised using a primer having a reactive group attached to it.
- Adenylated polynucleotides are intermediates in ligation reactions, where an adenosine-monophosphate is attached to the 5′-phosphate of the polynucleotide.
- kits are available for generation of this intermediate, such as the 5′ DNA Adenylation Kit from NEB.
- reactive groups such as thiols, amines, biotin, azides, etc.
- anchors could be directly added to polynucleotides using a 5′ DNA adenylation kit with suitably modified nucleotides (e.g. cholesterol or palmitate).
- Using polymerase chain reaction (PCR) with two synthetic oligonucleotide primers, a number of copies of the same section of DNA can be generated, where for each copy the 5′ of each strand in the duplex will be a synthetic polynucleotide.
- Single or multiple nucleotides can be added to the 3′ end of single or double stranded DNA by employing a polymerase. Examples of polymerases which could be used include, but are not limited to, Terminal Transferase, Klenow and E. coli Poly(A) polymerase.
- anchors such as cholesterol, thiol, amine, azide, biotin or lipid, can be incorporated into double stranded polynucleotides. Therefore, each copy of the amplified polynucleotide will contain an anchor.
- the polynucleotide is coupled to the membrane without having to functionalise the polynucleotide.
- This can be achieved by coupling the one or more anchors, such as a polynucleotide binding protein or a chemical group, to the membrane and allowing the one or more anchors to interact with the polynucleotide or by functionalizing the membrane.
- the one or more anchors may be coupled to the membrane by any of the methods described herein.
- the one or more anchors may comprise one or more linkers, such as maleimide functionalised linkers.
- the polynucleotide is typically RNA, DNA, PNA, TNA or LNA and may be double or single stranded. This embodiment is particularly suited to genomic DNA polynucleotides.
- the one or more anchors can comprise any group that couples to, binds to or interacts with single or double stranded polynucleotides, specific nucleotide sequences within the polynucleotide or patterns of modified nucleotides within the polynucleotide, or any other ligand that is present on the polynucleotide.
- Suitable binding proteins for use in anchors include, but are not limited to, E. coli single stranded binding protein, P5 single stranded binding protein, T4 gp32 single stranded binding protein, the TOPO V dsDNA binding region, human histone proteins, E. coli HU DNA binding protein and other archaeal, prokaryotic or eukaryotic single stranded or double stranded polynucleotide (or nucleic acid) binding proteins, including those listed below.
- the specific nucleotide sequences could be sequences recognised by transcription factors, ribosomes, endonucleases, topoisomerases or replication initiation factors.
- the patterns of modified nucleotides could be patterns of methylation or damage.
- the one or more anchors can comprise any group which couples to, binds to, intercalates with or interacts with a polynucleotide.
- the group may intercalate or interact with the polynucleotide via electrostatic, hydrogen bonding or Van der Waals interactions.
- Such groups include a lysine monomer, poly-lysine (which will interact with ssDNA or dsDNA), ethidium bromide (which will intercalate with dsDNA), universal bases or universal nucleotides (which can hybridise with any polynucleotide) and osmium complexes (which can react to methylated bases).
- a polynucleotide may therefore be coupled to the membrane using one or more universal nucleotides attached to the membrane.
- Each universal nucleotide may be coupled to the membrane using one or more linkers.
- the universal nucleotide preferably comprises one of the following nucleobases: hypoxanthine, 4-nitroindole, 5-nitroindole, 6-nitroindole, formylindole, 3-nitropyrrole, nitroimidazole, 4-nitropyrazole, 4-nitrobenzimidazole, 5-nitroindazole, 4-aminobenzimidazole or phenyl (C6-aromatic ring).
- the universal nucleotide more preferably comprises one of the following nucleosides: 2′-deoxyinosine, inosine, 7-deaza-2′-deoxyinosine, 7-deaza-inosine, 2-aza-deoxyinosine, 2-aza-inosine, 2′-O-methylinosine, 4-nitroindole 2′-deoxyribonucleoside, 4-nitroindole ribonucleoside, 5-nitroindole 2′-deoxyribonucleoside, 5-nitroindole ribonucleoside, 6-nitroindole 2′-deoxyribonucleoside, 6-nitroindole ribonucleoside, 3-nitropyrrole 2′-deoxyribonucleoside, 3-nitropyrrole ribonucleoside, an acyclic sugar analogue of hypoxanthine, nitroimidazole 2′-deoxyribonucleoside, nitroimidazole ribonucleo
- the universal nucleotide more preferably comprises 2′-deoxyinosine.
- the universal nucleotide is more preferably IMP or dIMP.
- the universal nucleotide is most preferably dPMP (2′-Deoxy-P-nucleoside monophosphate) or dKMP (N6-methoxy-2,6-diaminopurine monophosphate).
- the one or more anchors may couple to (or bind to) the polynucleotide via Hoogsteen hydrogen bonds (where two nucleobases are held together by hydrogen bonds) or reversed Hoogsteen hydrogen bonds (where one nucleobase is rotated through 180° with respect to the other nucleobase).
- the one or more anchors may comprise one or more nucleotides, one or more oligonucleotides or one or more polynucleotides which form Hoogsteen hydrogen bonds or reversed Hoogsteen hydrogen bonds with the polynucleotide. These types of hydrogen bonds allow a third polynucleotide strand to wind around a double stranded helix and form a triplex.
- the one or more anchors may couple to (or bind to) a double stranded polynucleotide by forming a triplex with the double stranded duplex.
- At least 1%, at least 10%, at least 25%, at least 50% or 100% of the membrane components may be functionalised.
- Where the one or more anchors comprise a protein, they may be able to anchor directly into the membrane without further functionalisation, for example if the protein already has an external hydrophobic region which is compatible with the membrane.
- proteins include, but are not limited to, transmembrane proteins, intramembrane proteins and membrane proteins.
- the protein may be expressed with a genetically fused hydrophobic region which is compatible with the membrane. Such hydrophobic protein regions are known in the art.
- the one or more anchors are preferably mixed with the polynucleotide before delivery to the membrane, but the one or more anchors may be contacted with the membrane and subsequently contacted with the polynucleotide.
- the polynucleotide may be functionalised, using methods described above, so that it can be recognised by a specific binding group.
- the polynucleotide may be functionalised with a ligand such as biotin (for binding to streptavidin), amylose (for binding to maltose binding protein or a fusion protein), Ni-NTA (for binding to poly-histidine or poly-histidine tagged proteins) or peptides (such as an antigen).
- the one or more anchors may be used to couple a polynucleotide to the membrane when the polynucleotide is attached to a leader sequence which preferentially threads into the pore.
- Leader sequences are discussed in more detail below.
- the polynucleotide is attached (such as ligated) to a leader sequence which preferentially threads into the pore.
- a leader sequence may comprise a homopolymeric polynucleotide or an abasic region.
- the leader sequence is typically designed to hybridise to the one or more anchors either directly or via one or more intermediate polynucleotides (or splints).
- the one or more anchors typically comprise a polynucleotide sequence which is complementary to a sequence in the leader sequence or a sequence in the one or more intermediate polynucleotides (or splints).
- the one or more splints typically comprise a polynucleotide sequence which is complementary to a sequence in the leader sequence.
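The complementarity requirement between an anchor (or splint) and the leader sequence can be checked with a reverse-complement comparison. The sketch below assumes plain DNA sequences written 5′ to 3′ and tests only for a perfect, contiguous match; real hybridisation tolerates mismatches and depends on conditions that are not modelled here. The function names are hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def can_hybridise(anchor: str, leader: str) -> bool:
    """True if the anchor is perfectly complementary to some stretch of the leader.

    Both sequences are given 5' to 3'. The anchor pairs with the leader when
    the anchor's reverse complement occurs as a substring of the leader.
    """
    return reverse_complement(anchor) in leader.upper()
```

The same check applies to a splint: it would need one region complementary to the leader and another complementary to the anchor, each tested separately.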
- an amino acid, peptide, polypeptide or protein is coupled to an amphiphilic layer, such as a triblock copolymer layer or lipid bilayer.
- Various methodologies for the chemical attachment of such polynucleotides are available.
- An example of a molecule used in chemical attachment is EDC (1-ethyl-3-[3-dimethylaminopropyl]carbodiimide hydrochloride).
- Reactive groups can also be added to the 5′ of polynucleotides using commercially available kits (Thermo Pierce, Part No. 22980). Suitable methods include, but are not limited to, transient affinity attachment using histidine residues and Ni-NTA, as well as more robust covalent attachment by reactive cysteines, lysines or non natural amino acids.
- the method of the invention may concern characterising two or more polynucleotides, such as 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 20 or more, 30 or more, 50 or more, 100 or more, 500 or more, 1,000 or more, 5,000 or more, 10,000 or more, 100,000 or more, 1,000,000 or more or 5,000,000 or more, polynucleotides.
- the two or more polynucleotides may be delivered using the same microparticle or different microparticles.
- a microparticle is a microscopic particle whose size is typically measured in micrometres (μm). Microparticles may also be known as microspheres or microbeads. The microparticle may be a nanoparticle. A nanoparticle is a microscopic particle whose size is typically measured in nanometres (nm).
- a microparticle typically has a particle size of from about 0.001 μm to about 500 μm.
- a nanoparticle may have a particle size of from about 0.01 μm to about 200 μm or about 0.1 μm to about 100 μm. More often, a microparticle has a particle size of from about 0.5 μm to about 100 μm, or for instance from about 1 μm to about 50 μm.
- the microparticle may have a particle size of from about 1 nm to about 1000 nm, such as from about 10 nm to about 500 nm, about 20 nm to about 200 nm or from about 30 nm to about 100 nm.
- If two or more polynucleotides are characterised, they may be different from one another.
- the two or more polynucleotides may be two or more instances of the same polynucleotide. This allows proof reading.
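One simple way to picture proof reading across repeated instances of the same polynucleotide is a per-position majority vote. The sketch below is an illustrative assumption, not the method of the patent: it presumes the reads have already been aligned to equal length, and the function name is hypothetical.

```python
from collections import Counter
from typing import Sequence


def consensus(reads: Sequence[str]) -> str:
    """Return the per-position majority base across equal-length reads.

    When two bases tie at a position, the one seen first in read order wins,
    because Counter.most_common keeps first-encountered order for equal counts.
    """
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("all reads must have the same length")
    called = []
    for column in zip(*reads):
        base, _count = Counter(column).most_common(1)[0]
        called.append(base)
    return "".join(called)
```

With three reads, a single miscalled base in one read is outvoted by the other two, which is the sense in which repeated measurement allows errors to be corrected.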
- the polynucleotides can be naturally occurring or artificial.
- the method may be used to verify the sequence of two or more manufactured oligonucleotides.
- the methods are typically carried out in vitro.
- the method may involve measuring two, three, four or five or more characteristics of each polynucleotide.
- the one or more characteristics are preferably selected from (i) the length of the polynucleotide, (ii) the identity of the polynucleotide, (iii) the sequence of the polynucleotide, (iv) the secondary structure of the polynucleotide and (v) whether or not the polynucleotide is modified.
- any combination of (i) to (v) may be measured in accordance with the invention, such as {i}, {ii}, {iii}, {iv}, {v}, {i,ii}, {i,iii}, {i,iv}, {i,v}, {ii,iii}, {ii,iv}, {ii,v}, {iii,iv}, {iii,v}, {iv,v}, {i,ii,iii}, {i,ii,iv}, {i,ii,v}, {i,iii,iv}, {i,iii,v}, {i,iv,v}, {ii,iii,iv}, {ii,iii,v}, {ii,iv,v}, {iii,iv,v}, {i,ii,iii,iv}, {i,ii,iii,v}, {i,ii,iv,v}, {i,iii,iv,v}, {ii,iii,iv,v} or {i,ii,iii,iv,v}.
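"Any combination of (i) to (v)" means every non-empty subset of the five characteristics, which gives 2^5 − 1 = 31 combinations. The sketch below enumerates them in order of increasing size; the function name and the brace formatting are illustrative choices.

```python
from itertools import combinations
from typing import List, Sequence

# Labels of the five characteristics listed in the passage.
CHARACTERISTICS = ("i", "ii", "iii", "iv", "v")


def all_combinations(labels: Sequence[str] = CHARACTERISTICS) -> List[str]:
    """List every non-empty subset of the labels, smallest subsets first."""
    result = []
    for size in range(1, len(labels) + 1):
        for combo in combinations(labels, size):
            result.append("{" + ",".join(combo) + "}")
    return result
```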
- the length of the polynucleotide may be measured for example by determining the number of interactions between the polynucleotide and the pore or the duration of interaction between the polynucleotide and the pore.
- the identity of the polynucleotide may be measured in a number of ways.
- the identity of the polynucleotide may be measured in conjunction with measurement of the sequence of the polynucleotide or without measurement of the sequence of the polynucleotide.
- the former is straightforward; the polynucleotide is sequenced and thereby identified.
- the latter may be done in several ways. For instance, the presence of a particular motif in the polynucleotide may be measured (without measuring the remaining sequence of the polynucleotide). Alternatively, the measurement of a particular electrical and/or optical signal in the method may identify the polynucleotide as coming from a particular source.
- the sequence of the polynucleotide can be determined as described previously. Suitable sequencing methods, particularly those using electrical measurements, are described in Stoddart D et al., Proc Natl Acad Sci, 12; 106(19):7702-7, Lieberman K R et al, J Am Chem Soc. 2010; 132(50):17961-72, and International Application WO 2000/28312.
- the secondary structure may be measured in a variety of ways. For instance, if the method involves an electrical measurement, the secondary structure may be measured using a change in dwell time or a change in current flowing through the pore. This allows regions of single-stranded and double-stranded polynucleotide to be distinguished.
- the presence or absence of any modification may be measured.
- the method preferably comprises determining whether or not the polynucleotide is modified by methylation, by oxidation, by damage, with one or more proteins or with one or more labels, tags or spacers. Specific modifications will result in specific interactions with the pore which can be measured using the methods described below. For instance, methylcytosine may be distinguished from cytosine on the basis of the current flowing through the pore during its interaction with each nucleotide.
- the methods may be carried out using any apparatus that is suitable for investigating a membrane/pore system in which a pore is present in a membrane.
- the method may be carried out using any apparatus that is suitable for transmembrane pore sensing.
- the apparatus comprises a chamber comprising an aqueous solution and a barrier that separates the chamber into two sections.
- the barrier typically has an aperture in which the membrane containing the pore is formed.
- the barrier forms the membrane in which the pore is present.
- the methods may be carried out using the apparatus described in International Application No. PCT/GB08/000562 (WO 2008/102120).
- a variety of different types of measurements may be made. This includes without limitation: electrical measurements and optical measurements.
- a suitable optical method involving the measurement of fluorescence is disclosed by J. Am. Chem. Soc. 2009, 131 1652-1653.
- Possible electrical measurements include: current measurements, impedance measurements, tunnelling measurements (Ivanov A P et al., Nano Lett. 2011 Jan. 12; 11(1):279-85), and FET measurements (International Application WO 2005/124888).
- Optical measurements may be combined with electrical measurements (Soni G V et al., Rev Sci Instrum. 2010 January; 81(1):014301).
- the measurement may be a transmembrane current measurement such as measurement of ionic current flowing through the pore.
- Electrical measurements may be made using standard single channel recording equipment as described in Stoddart D et al., Proc Natl Acad Sci, 12; 106(19):7702-7, Lieberman K R et al, J Am Chem Soc. 2010; 132(50):17961-72, and International Application WO 2000/28312.
- electrical measurements may be made using a multi-channel system, for example as described in International Application WO 2009/077734 and International Application WO 2011/067559.
- the method is preferably carried out with a potential applied across the membrane.
- the applied potential may be a voltage potential.
- the applied potential may be a chemical potential.
- An example of this is using a salt gradient across a membrane, such as an amphiphilic layer. A salt gradient is disclosed in Holden et al., J Am Chem Soc. 2007 Jul. 11; 129(27):8650-5.
- the current passing through the pore as a polynucleotide moves with respect to the pore is used to estimate or determine the sequence of the polynucleotide. This is strand sequencing.
- the methods may involve measuring the current passing through the pore as the polynucleotide moves with respect to the pore. Therefore the apparatus may also comprise an electrical circuit capable of applying a potential and measuring an electrical signal across the membrane and pore.
- the methods may be carried out using a patch clamp or a voltage clamp.
- the methods preferably involve the use of a voltage clamp.
- the method comprises:
- the methods of the invention may involve the measuring of a current passing through the pore as the polynucleotide moves with respect to the pore.
- Suitable conditions for measuring ionic currents through transmembrane protein pores are known in the art and disclosed in the Example.
- the method is typically carried out with a voltage applied across the membrane and pore.
- the voltage used is typically from +5 V to −5 V, such as from +4 V to −4 V, +3 V to −3 V or +2 V to −2 V.
- the voltage used is typically from −600 mV to +600 mV or −400 mV to +400 mV.
- the voltage used is preferably in a range having a lower limit selected from −400 mV, −300 mV, −200 mV, −150 mV, −100 mV, −50 mV, −20 mV and 0 mV and an upper limit independently selected from +10 mV, +20 mV, +50 mV, +100 mV, +150 mV, +200 mV, +300 mV and +400 mV.
- the voltage used is more preferably in the range 100 mV to 240 mV and most preferably in the range of 120 mV to 220 mV. It is possible to increase discrimination between different nucleotides by a pore by using an increased applied potential.
- the methods are typically carried out in the presence of any charge carriers, such as metal salts, for example alkali metal salt, halide salts, for example chloride salts, such as alkali metal chloride salt.
- Charge carriers may include ionic liquids or organic salts, for example tetramethyl ammonium chloride, trimethylphenyl ammonium chloride, phenyltrimethyl ammonium chloride, or 1-ethyl-3-methyl imidazolium chloride.
- the salt is present in the aqueous solution in the chamber.
- Potassium chloride (KCl), sodium chloride (NaCl), caesium chloride (CsCl) or a mixture of potassium ferrocyanide and potassium ferricyanide is typically used.
- KCl, NaCl and a mixture of potassium ferrocyanide and potassium ferricyanide are preferred.
- the charge carriers may be asymmetric across the membrane. For instance, the type and/or concentration of the charge carriers may be different on each side of the membrane.
- the salt concentration may be at saturation.
- the salt concentration may be 3 M or lower and is typically from 0.1 to 2.5 M, from 0.3 to 1.9 M, from 0.5 to 1.8 M, from 0.7 to 1.7 M, from 0.9 to 1.6 M or from 1 M to 1.4 M.
- the salt concentration is preferably from 150 mM to 1 M.
- the method is preferably carried out using a salt concentration of at least 0.3 M, such as at least 0.4 M, at least 0.5 M, at least 0.6 M, at least 0.8 M, at least 1.0 M, at least 1.5 M, at least 2.0 M, at least 2.5 M or at least 3.0 M.
- High salt concentrations provide a high signal to noise ratio and allow for currents indicative of the presence of a nucleotide to be identified against the background of normal current fluctuations.
- the methods are typically carried out in the presence of a buffer.
- the buffer is present in the aqueous solution in the chamber. Any buffer may be used in the method of the invention.
- the buffer is phosphate buffer.
- Other suitable buffers are HEPES and Tris-HCl buffer.
- the methods are typically carried out at a pH of from 4.0 to 12.0, from 4.5 to 10.0, from 5.0 to 9.0, from 5.5 to 8.8, from 6.0 to 8.7 or from 7.0 to 8.8 or 7.5 to 8.5.
- the pH used is preferably about 7.5.
- the methods may be carried out at from 0° C. to 100° C., from 15° C. to 95° C., from 16° C. to 90° C., from 17° C. to 85° C., from 18° C. to 80° C., 19° C. to 70° C., or from 20° C. to 60° C.
- the methods are typically carried out at room temperature.
- the methods are optionally carried out at a temperature that supports enzyme function, such as about 37° C.
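- The conditions listed above can be collected into a simple check. The sketch below is illustrative only: it takes the narrowest range quoted in the text for each parameter (120 mV to 220 mV, 150 mM to 1 M salt, pH 7.5 to 8.5, 20° C. to 60° C.), and the names `Range`, `PREFERRED` and `out_of_range` are assumptions introduced here, not part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Narrowest ranges quoted in the text; chosen here purely for illustration.
PREFERRED = {
    "voltage_mV": Range(120, 220),
    "salt_M": Range(0.15, 1.0),
    "pH": Range(7.5, 8.5),
    "temperature_C": Range(20, 60),
}


def out_of_range(conditions):
    """Return the names of conditions that fall outside the ranges above."""
    return [
        name
        for name, value in conditions.items()
        if name in PREFERRED and not PREFERRED[name].contains(value)
    ]


# Hypothetical run conditions, not taken from the Examples.
run = {"voltage_mV": 180, "salt_M": 0.5, "pH": 7.5, "temperature_C": 37}
print(out_of_range(run))
```

Values outside these ranges are still permitted by the broader ranges in the text; the check only flags departures from the narrowest ones.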
- the present invention also provides a kit for modifying a template polynucleotide.
- the kit comprises (a) a population of MuA substrates as defined above, (b) a MuA transposase and (c) a translocase. Any of the embodiments discussed above with reference to the methods and products of the invention equally apply to the kits.
- the kit may further comprise the components of a membrane, such as the components of an amphiphilic layer or a lipid bilayer.
- the kit may further comprise the components of a transmembrane pore.
- the kit may further comprise a molecular brake. Suitable membranes, pores and molecular brakes are discussed above.
- the kit may further comprise a Y adaptor comprising a leader sequence and/or one or more anchors capable of coupling the adaptor to a membrane.
- Suitable Y adaptors, leader sequences and anchors are discussed above.
- the kit of the invention may additionally comprise one or more other reagents or instruments which enable any of the embodiments mentioned above to be carried out.
- reagents or instruments include one or more of the following: suitable buffer(s) (aqueous solutions), means to obtain a sample from a subject (such as a vessel or an instrument comprising a needle), means to amplify and/or express polynucleotides, a membrane as defined above or voltage or patch clamp apparatus.
- Reagents may be present in the kit in a dry state such that a fluid sample resuspends the reagents.
- the kit may also, optionally, comprise instructions to enable the kit to be used in the method of the invention or details regarding which patients the method may be used for.
- the kit may, optionally, comprise nucleotides.
- MuA binds to the transposon as a tetramer and is extremely stable, remaining tightly bound after strand transfer of the transposon. If the MuA is not removed from the DNA, this can inhibit characterisation using a nanopore system. MuA can be removed by heating to 75° C. However, this relies on the use of a thermal cycler or water bath and could damage other components in the solution. Here we describe an alternative technique for removing MuA without needing to heat the reaction, using a helicase.
- Hel308Mbu-E284C/S615C-STrEP(C) (SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus) is a processive helicase which binds to single stranded DNA and moves in a 3′ to 5′ direction.
- Hel308Mbu-E284C/S615C-STrEP(C) (20 uM, SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus) was reduced using 10 mM DTT in a 2 ml protein low bind Eppendorf and rotated on a Hula shaker (ThermoFisher Scientific) for 1 h, at 10 rpm with no vibration.
- the enzyme was then buffer exchanged into 100 mM sodium phosphate, 500 mM NaCl, 5 mM EDTA and 0.1% Tween-20 pH8.0, using Zeba spin desalting columns 7K MWCO, 0.5 ml (ThermoFisher Scientific) according to the manufacturer's protocol.
- the sample was diluted to 10 uM and 50 uM 1,11-bis(maleimido)triethylene glycol was added.
- the sample was then rotated on a Hula shaker for a further 2 hours. This resulted in a closed complex helicase which was able to load onto DNA at the 3′ end.
- the sequence for the transposon top strand was (SEQ ID NO: 115). This was annealed with either SEQ ID NO: 116 to form transposon 1 or annealed with SEQ ID NO: 117 to form transposon 2 which has a 3′ overhang on the bottom strand.
- the transposon top strand was also annealed with the transposon leader (30 iSpC3 spacers attached at the 3′ end to the 5′ of SEQ ID NO: 118, which is attached at its 3′ end to the 5′ end of four iSp18 spacers which are attached at the 3′ end to the 5′ end of SEQ ID NO: 119).
- Transposons (10 uM) were annealed in 50 mM NaCl, 10 mM Tris-HCl pH8.0. The transposon sequences were heated to 95° C. for 2 minutes and then slow cooled (6 seconds for every 0.1° C. decrease) to 4° C.
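- As a rough check on the annealing profile, the slow-cool step can be converted into a total ramp time. The sketch below assumes a linear ramp from 95° C. to 4° C. in 0.1° C. steps of 6 seconds each; the function name `ramp_duration_s` is illustrative.

```python
def ramp_duration_s(start_c, end_c, step_c, seconds_per_step):
    """Time in seconds to cool linearly from start_c to end_c."""
    steps = round((start_c - end_c) / step_c)
    return steps * seconds_per_step


total_s = ramp_duration_s(95.0, 4.0, 0.1, 6)
print(f"{total_s} s, about {total_s / 60:.0f} min")
```

Under these assumptions the cooling step alone takes about 91 minutes, in addition to the 2 minute hold at 95° C.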
- Transposon 1, transposon 2 and leader transposon were each mixed to 2 uM in 40 ul, with concentrated MuA transposase (20 ul, 1.1 mg/ml, ThermoFisher Scientific) in 25 mM Tris-HCl pH8, 110 mM NaCl, 0.5 mM EDTA, 10% glycerol and 0.05% Triton-X100. These were then incubated at 30° C. for 90 minutes to form transpososome 1, transpososome 2 and leader transpososome respectively, at 2 uM.
- Transpososome 1 and transpososome 2 were each mixed to 50 nM with 1.5 ug of PhiX174 RFI DNA (New England Biolabs) in 25 mM Tris-HCl pH8, 110 mM NaCl and 10 mM MgCl2 in a 30 ul reaction in a 0.2 ml PCR tube. Each reaction was incubated at room temperature for 2 minutes before being split into 3 tubes of 10 ul each. 1 tube of each transpososome was incubated at 75° C. for 5 minutes, 1 tube of each transpososome was left at room temperature for 5 minutes with nothing added.
- Hel308Mbu-E284C/S615C-STrEP(C) (1 uM) was added to the final tubes along with 10 mM of ATP (Sigma-Aldrich) and incubated at room temperature for 5 minutes. 1 ul of each reaction was then analysed on the Agilent 2100 Bioanalyser 12,000 bp setting, along with 1 ul of unmodified PhiX.
- a 60 ul sample was made with 1.5 ug of lambda DNA (New England Biolabs) and 120 nM of leader transpososome in 25 mM Tris-HCl pH8, 110 mM NaCl and 10 mM MgCl2 and the sample mixed by inversion. The sample was incubated at room temperature for 10 minutes. The sample was then split into 3 sets of 20 ul reactions. nH20 (4 ul, ThermoFisher Scientific) was added to sample 1 and the sample was heated at 75° C. for 10 minutes.
- Hel308Mbu-E284C/S615C-STrEP(C) (2 ul, 10 uM) and ATP (2 ul, 100 mM, Sigma-Aldrich) were added to sample 2 and it was incubated at room temperature for 10 minutes.
- nH20 (4 ul, ThermoFisher Scientific) was added to sample 3 and the sample was incubated at room temperature for 10 minutes.
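- The final concentrations in the helicase-treated sample follow from the volumes given above. The sketch below applies C1·V1 = C2·V2 to the 20 ul reaction after addition of 2 ul of 10 uM helicase and 2 ul of 100 mM ATP; the function name `final_concentration` is illustrative and the calculation ignores any pipetting loss.

```python
def final_concentration(stock_conc, stock_vol_ul, total_vol_ul):
    """Solve C1*V1 = C2*V2 for C2 (same units as stock_conc)."""
    return stock_conc * stock_vol_ul / total_vol_ul


sample_vol_ul = 20.0  # one of the three 20 ul reactions
hel308_vol_ul = 2.0   # 10 uM stock
atp_vol_ul = 2.0      # 100 mM stock
total_ul = sample_vol_ul + hel308_vol_ul + atp_vol_ul

hel308_uM = final_concentration(10.0, hel308_vol_ul, total_ul)
atp_mM = final_concentration(100.0, atp_vol_ul, total_ul)
print(f"Hel308: {hel308_uM:.2f} uM, ATP: {atp_mM:.2f} mM in {total_ul:.0f} ul")
```

The two water-only samples also reach 24 ul, so all three samples are compared at the same DNA concentration.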
- Agencourt AMPure XP SPRI beads 24 ul
- Buffer was added to each sample (50 ul, 750 mM NaCl, 10% PEG8000 and 50 mM Tris-HCl pH8). The wash buffer was then removed and discarded from each sample. Buffer 1 (6 ul, 10 mM Tris-HCl, 20 mM NaCl) was then added to each sample and each sample was then mixed in order to resuspend the beads. Each sample was then spun down and returned to the magnetic rack. 6 ul of each sample was then removed and 1.5 ul of buffer 2 (1 uM of SEQ ID NO: 20 (which has 6 iSp18 spacers attached at its 3′ end), 750 mM KCl, 5 mM EDTA, 125 mM Kpi pH8) was added to each sample.
- Buffer (1.25 ul, 800 uM TMAD) was then added to each sample and then each was incubated at room temperature for 5 minutes. Finally, 6 ul of fuel mix (75 mM ATP, 75 mM MgCl2) and 284 ul of buffer (25 mM potassium phosphate, 500 mM potassium chloride, pH8) were added to each sample.
- FIG. 6 shows transpososome 2 after treatment with Hel308Mbu-E284C/S615C-STrEP(C) and heat treatment.
- the two PhiX peaks are of a similar height, indicating that Hel308 was just as efficient as heat at removing MuA transposase.
- FIG. 11 shows a graph of throughput for samples 1-3.
- Sample 3 shows a throughput of around 20 kb/nanopore/hr which is significantly lower than samples 1 and 2 showing that by not removing the MuA transposase characterisation using a nanopore system was inhibited.
- Sample 2 (heat treatment) and sample 1 produce much higher throughput values, around 80 kb/nanopore/hr for sample 2 and 85 kb/nanopore/hr for sample 1. This shows that removal of MuA transposase using Hel308Mbu-E284C/S615C-STrEP(C) was as efficient as heat treatment. Removal of MuA transposase using Hel308Mbu-E284C/S615C-STrEP(C) resulted in improved characterisation using a nanopore system.
- This example describes using a number of different translocases to remove MuA transposase.
- a MuA adapter consisting of SEQ ID NOs: 117 and 121 was annealed at 10 uM in 10 mM Tris-HCl (pH 7.5), 50 mM NaCl, from 95° C. to 22° C. at 2° C. per minute.
- This adapter contained the minimal MuA recognition sequence, with the pre-formed 5′ bottom strand flap, as well as a 12 nt 5′ tail on the top strand and a 10 nt 3′ tail on the bottom strand.
- a transposome complex was formed by combining 1 ul of the MuA adapter, 4.5 ul of nuclease free water, 2 ul of 5× transposome buffer (125 mM Tris pH 8, 550 mM NaCl, 2.5 mM EDTA, 50% glycerol, 0.25% Triton-X100) and 2.5 ul of concentrated MuA transposase (Thermofisher). The mixture was then incubated at 30° C. for 1.5 hours.
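- The volumes above give a 10 ul reaction in which the 5× transposome buffer is diluted to 1×. The sketch below works through that dilution; it assumes that the adapter, water and transposase contribute none of the listed buffer components, and the names `STOCK_5X` and `dilute` are illustrative.

```python
# Composition of the 5x transposome buffer as listed in the text.
STOCK_5X = {
    "Tris pH 8 (mM)": 125.0,
    "NaCl (mM)": 550.0,
    "EDTA (mM)": 2.5,
    "glycerol (%)": 50.0,
    "Triton-X100 (%)": 0.25,
}

# Volumes combined to form the transposome complex, in ul.
volumes_ul = {
    "MuA adapter": 1.0,
    "nuclease free water": 4.5,
    "5x buffer": 2.0,
    "MuA transposase": 2.5,
}


def dilute(stock, stock_vol_ul, total_vol_ul):
    """Scale every component of a stock buffer by stock_vol / total_vol."""
    factor = stock_vol_ul / total_vol_ul
    return {name: round(conc * factor, 4) for name, conc in stock.items()}


total_ul = sum(volumes_ul.values())
final = dilute(STOCK_5X, volumes_ul["5x buffer"], total_ul)
for name, conc in final.items():
    print(f"{name}: {conc}")
```

The resulting 1× composition (25 mM Tris, 110 mM NaCl, 0.5 mM EDTA, 10% glycerol, 0.05% Triton-X100) is the same buffer used to form the transpososomes in the preceding Example.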
- a transposition reaction containing 10 ul of 5× transposase buffer (125 mM Tris pH 8, 550 mM NaCl, 50 mM MgCl2), 5 ul transposome, 2.5 ug PhiX RFI (NEB) and nuclease free water to 50 ul was then carried out at room temperature for 10 minutes. After 10 minutes, 6.25 ul of 100 mM rATP was added and the reaction was split into 5 × 11.25 ul aliquots.
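- The aliquot volume quoted above can be checked from the reaction volumes. The sketch below assumes the 50 ul reaction and the 6.25 ul of rATP are combined without loss and divided equally; the variable names are illustrative.

```python
reaction_ul = 50.0     # transposition reaction volume
ratp_ul = 6.25         # volume of rATP added
ratp_stock_mM = 100.0  # rATP stock concentration
n_aliquots = 5
phix_ug = 2.5          # PhiX RFI DNA in the reaction

total_ul = reaction_ul + ratp_ul
aliquot_ul = total_ul / n_aliquots
ratp_mM = ratp_stock_mM * ratp_ul / total_ul
phix_per_aliquot_ug = phix_ug / n_aliquots

print(f"{n_aliquots} aliquots of {aliquot_ul} ul")
print(f"rATP: {ratp_mM:.1f} mM, PhiX: {phix_per_aliquot_ug} ug per aliquot")
```

Each aliquot therefore starts with the same DNA and rATP content, so differences between samples (i) to (v) reflect the treatment applied to each.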
- FIGS. 7 to 10 show a number of Agilent traces for samples (i)-(v).
- Sample (i) was a control where no translocase was added and the sample was not heated.
- FIGS. 7 to 10 all show the control, for which no tagmentation peak was observed. This was because the MuA was still bound to the DNA, which prevented the transpososome from moving into the gel matrix of the Agilent 2100 Bioanalyser system.
- FIG. 7 also shows sample (ii) (line 2) which shows a clear tagmentation peak when the sample was heated to 75° C. in order to remove the MuA transposase.
- FIG. 8 shows sample (iii, line 3) and the control sample (i, line 1).
- Sample (iii) shows a clear tagmentation peak when the sample was incubated with Hel308Mbu-E284C-STrEP(C) in order to remove the MuA transposase. This indicated that Hel308Mbu-E284C-STrEP(C) was able to successfully remove MuA transposase from transposons.
- FIG. 9 shows sample (iv, line 4) and the control sample (i, line 1).
- Sample (iv) shows a clear tagmentation peak when the sample was incubated with T4 Dda-(E94C/F98W/C109A/C136A/A360C) in order to remove the MuA transposase. This indicated that T4 Dda-(E94C/F98W/C109A/C136A/A360C) was able to successfully remove MuA transposase from transposons.
- FIG. 10 shows sample (v, line 5) and the control sample (i, line 1).
- Sample (v) shows a clear tagmentation peak when the sample was incubated with UvrD Eco-(E117C/M380C)-STrEP in order to remove the MuA transposase. This indicated that UvrD Eco-(E117C/M380C)-STrEP was able to successfully remove MuA transposase from transposons.
Abstract
The invention relates to a method for modifying a template double stranded polynucleotide, especially for characterisation using nanopore sequencing. The method produces from the template a plurality of modified double stranded polynucleotides. These modified polynucleotides can then be characterised.
Description
- The invention relates to a method for modifying a template double stranded polynucleotide, especially for characterisation using nanopore sequencing.
- There are many commercial situations which require the preparation of a nucleic acid library. This is frequently achieved using a transposase. Depending on the transposase which is used to prepare the library it may be necessary to repair the transposition events in vitro before the library can be used, for example in sequencing.
- There is currently a need for rapid and cheap polynucleotide (e.g. DNA or RNA) sequencing and identification technologies across a wide range of applications. Existing technologies are slow and expensive mainly because they rely on amplification techniques to produce large volumes of polynucleotide and require a high quantity of specialist fluorescent chemicals for signal detection.
- Transmembrane pores (nanopores) have great potential as direct, electrical biosensors for polymers and a variety of small molecules. In particular, recent focus has been given to nanopores as a potential DNA sequencing technology.
- When a potential is applied across a nanopore, there is a change in the current flow when an analyte, such as a nucleotide, resides transiently in the barrel for a certain period of time. Nanopore detection of the nucleotide gives a current change of known signature and duration. In the strand sequencing method, a single polynucleotide strand is passed through the pore and the identities of the nucleotides are derived. Strand sequencing can involve the use of a molecular brake to control the movement of the polynucleotide through the pore.
- International Application No. PCT/GB2014/052505 (published as WO 2015/022544) discloses using a MuA transposase and a population of MuA substrates to produce a plurality of shorter, modified double stranded polynucleotides from a template double stranded polynucleotide. The modified polynucleotides can be designed such that they are each easier to characterise, such as by strand sequencing, than the original template polynucleotide. The MuA transposase is inactivated by heat.
- The invention relates to a method for modifying a template double stranded polynucleotide, especially for characterisation using nanopore sequencing. The method produces from the template a plurality of modified double stranded polynucleotides. These modified polynucleotides can then be characterised.
- The inventors have surprisingly demonstrated that it is possible to remove a MuA transposase from modified polynucleotides using a translocase. This avoids the need to heat inactivate the MuA transposase, which may also inactivate any other enzymes or proteins being used in the preparation or characterisation of the modified polynucleotides. Removing the heat inactivation step also dispenses with the need for additional equipment such as a thermal cycler or water bath, used for heating up the sample.
- The invention therefore provides a method for modifying a template double stranded polynucleotide, comprising:
- (a) contacting the template polynucleotide with a MuA transposase and a population of double stranded MuA substrates each comprising an overhang at one or both ends of one strand such that the transposase fragments the template polynucleotide and ligates a substrate to one or both ends of the double stranded fragments and thereby producing a plurality of fragment/substrate constructs; and
- (b) using a translocase to remove the MuA transposases from the constructs and thereby producing a plurality of modified double stranded polynucleotides.
- FIG. 1 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. No PhiX peak was observed between the upper and lower markers for transpososome 1 (labelled 1) or transpososome 2 (labelled 2) when incubated at room temperature in the absence of an enzyme.
- FIG. 2 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. A PhiX peak was observed between the upper and lower markers for transpososome 1 (labelled 1) when incubated at 75° C. for 10 minutes.
- FIG. 3 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. A PhiX peak was observed between the upper and lower markers for transpososome 2 (labelled 1) when incubated at 75° C. for 10 minutes.
- FIG. 4 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. A PhiX peak was not observed between the upper and lower markers for transpososome 1 (labelled 1) when incubated with Hel308Mbu-E284C/S615C-STrEP(C) (SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus).
- FIG. 5 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. A PhiX peak was observed between the upper and lower markers for transpososome 2 (labelled 1) when incubated with Hel308Mbu-E284C/S615C-STrEP(C) (SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus).
- FIG. 6 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. A PhiX peak was observed between the upper and lower markers for transpososome 2 (labelled 1) when incubated with either A) Hel308Mbu-E284C/S615C-STrEP(C) (SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus) or B) at 75° C. for 10 minutes. The trace compares PhiX with transpososome 2 treated with heat and with Hel308. Red is heat treated, blue is Hel308 treated.
- FIG. 7 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. Line 1 corresponds to control sample (i) which has been incubated at room temperature in the absence of a translocase. No tagmentation peak was observed for sample (i). Line 2 corresponds to sample (ii) which has been incubated at 75° C. A tagmentation peak was observed between the upper and lower markers with sample (ii).
- FIG. 8 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. Line 1 corresponds to control sample (i) which has been incubated at room temperature in the absence of a translocase. No tagmentation peak was observed for sample (i). Line 3 corresponds to sample (iii) which has been incubated at room temperature with Hel308Mbu-E284C-STrEP(C) (SEQ ID NO: 10 with mutation E284C with a streptavidin tag attached at its C terminus). A tagmentation peak was observed between the upper and lower markers with sample (iii).
- FIG. 9 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. Line 1 corresponds to control sample (i) which has been incubated at room temperature in the absence of a translocase. No tagmentation peak was observed for sample (i). Line 4 corresponds to sample (iv) which has been incubated at room temperature with T4 Dda-(E94C/F98W/C109A/C136A/A360C) (SEQ ID NO: 97 with mutations E94C/F98W/C109A/C136A/A360C and then (ΔM1)G1G2, where (ΔM1)G1G2=deletion of M1 and then addition of G1 and G2). A tagmentation peak was observed between the upper and lower markers with sample (iv).
- FIG. 10 shows an Agilent 2100 Bioanalyser trace. The lower marker is labelled X and the upper marker is labelled Y. Line 1 corresponds to control sample (i) which has been incubated at room temperature in the absence of a translocase. No tagmentation peak was observed for sample (i). Line 5 corresponds to sample (v) which has been incubated at room temperature with UvrD Eco-(E117C/M380C)-STrEP (SEQ ID NO: 122 with mutations E117C/M380C with a streptavidin tag attached at the C terminus). A tagmentation peak was observed between the upper and lower markers with sample (v).
- FIG. 11 shows a bar chart of throughput (y-axis label=kb/nanopore/hr) for samples 1-3 (sample 1=incubation at room temperature with Hel308Mbu-E284C/S615C-STrEP(C) using transpososome with 3′ overhang, sample 2=incubation at 75° C. for 10 minutes and sample 3=incubation at room temperature in absence of Hel308Mbu-E284C/S615C-STrEP(C)).
- FIG. 12 shows a cartoon representation of a translocase being used to remove a MuA transposase from a construct. The MuA transposase (labelled A) is bound to a double stranded MuA substrate (labelled B) which has two overhangs labelled C at each end of one of the strands. In step 1 the MuA fragments the template polynucleotide and ligates a double stranded MuA substrate to one end producing construct D. In step 2 the translocase (labelled E) was allowed to bind to the construct at one of the overhangs. In step 3 the translocase removes the MuA from the construct producing a modified double stranded polynucleotide. In step 4 a leader was attached to the double stranded polynucleotide which had an enzyme (labelled F) pre-bound which was capable of controlling the movement of the polynucleotide through a nanopore.
- It is to be understood that the Figures are for the purpose of illustrating particular embodiments of the invention only, and are not intended to be limiting.
- SEQ ID NO: 1 shows the codon optimised polynucleotide sequence encoding the MS-B1 mutant MspA monomer. This mutant lacks the signal sequence and includes the following mutations: D90N, D91N, D93N, D118R, D134R and E139K.
- SEQ ID NO: 2 shows the amino acid sequence of the mature form of the MS-B1 mutant of the MspA monomer. This mutant lacks the signal sequence and includes the following mutations: D90N, D91N, D93N, D118R, D134R and E139K.
- SEQ ID NO: 3 shows the polynucleotide sequence encoding one monomer of α-hemolysin-E111N/K147N (α-HL-NN; Stoddart et al., PNAS, 2009; 106(19): 7702-7707).
- SEQ ID NO: 4 shows the amino acid sequence of one monomer of α-HL-NN.
- SEQ ID NOs: 5 to 7 show the amino acid sequences of MspB, C and D.
- SEQ ID NO: 8 shows the amino acid sequence of the Hel308 motif.
- SEQ ID NO: 9 shows the amino acid sequence of the extended Hel308 motif.
- SEQ ID NOs: 10 to 58 show the amino acid sequences of Hel308 helicases in Table 1.
- SEQ ID NO: 59 shows the RecD-like motif I.
- SEQ ID NOs: 60 to 62 show the extended RecD-like motif I.
- SEQ ID NO: 63 shows the RecD motif I.
- SEQ ID NO: 64 shows a preferred RecD motif I, namely G G P G T G K T.
- SEQ ID NOs: 65 to 67 show the extended RecD motif I.
- SEQ ID NO: 68 shows the RecD-like motif V.
- SEQ ID NO: 69 shows the RecD motif V.
- SEQ ID NOs: 70 to 77 show the MobF motif III.
- SEQ ID NOs: 78 to 84 show the MobQ motif III.
- SEQ ID NO: 85 shows the amino acid sequence of TraI Eco.
- SEQ ID NO: 86 shows the RecD-like motif I of TraI Eco.
- SEQ ID NO: 87 shows the RecD-like motif V of TraI Eco.
- SEQ ID NO: 88 shows the MobF motif III of TraI Eco.
- SEQ ID NO: 89 shows the XPD motif V.
- SEQ ID NO: 90 shows XPD motif VI.
- SEQ ID NO: 91 shows the amino acid sequence of XPD Mbu.
- SEQ ID NO: 92 shows the XPD motif V of XPD Mbu.
- SEQ ID NO: 93 shows XPD motif VI of XPD Mbu.
- SEQ ID NO: 94 shows the polynucleotide sequence of the double stranded portion of a MuA substrate of the invention.
- SEQ ID NO: 95 shows the polynucleotide sequence of the double stranded portion of a MuA substrate of the invention. This sequence is complementary to SEQ ID NO: 94 except that it contains a U at the 3′ end.
- SEQ ID NO: 96 shows the polynucleotide sequence of the overhang strand of the double stranded MuA substrate of the invention.
- SEQ ID NO: 97 shows the amino acid sequence of Dda 1993.
- SEQ ID NOs: 98 to 112 show the amino acid sequences of other Dda helicases for use in the invention.
- SEQ ID NO: 113 shows the codon optimised polynucleotide sequence encoding the wild-type CsgG monomer from Escherichia coli Str. K-12 substr. MC4100. This monomer lacks the signal sequence.
- SEQ ID NO: 114 shows the amino acid sequence of the mature form of the wild-type CsgG monomer from Escherichia coli Str. K-12 substr. MC4100. This monomer lacks the signal sequence. The abbreviation used for this CsgG=CsgG-Eco.
- SEQ ID NOs: 115 to 121 show polynucleotide sequences used in the Examples.
- SEQ ID NO: 122 shows the amino acid sequence of UvrD-Eco wild-type.
- It is to be understood that the sequences are not intended to be limiting.
- It is to be understood that different applications of the disclosed products and methods may be tailored to the specific needs in the art. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments of the invention only, and is not intended to be limiting.
- In addition as used in this specification and the appended claims, the singular forms “a”, “an”, and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “a polynucleotide” includes “polynucleotides”, reference to “a substrate” includes two or more such substrates, reference to “a transmembrane protein pore” includes two or more such pores, and the like.
- In this specification, where different amino acids at a specific position are separated by the symbol “/”, the symbol “/” means “or”. For instance, P108R/K means P108R or P108K. In this specification, where different positions or different substitutions are separated by the symbol “/”, the “/” symbol means “and”. For example, E94/P108 means E94 and P108 or E94D/P108K means E94D and P108K.
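- The two uses of the slash in this notation can be told apart mechanically: a token containing a position number starts a new position ("and"), whereas a bare amino acid letter is an alternative at the preceding position ("or"). The parser below is a minimal sketch of that reading; the function name `parse_mutations` and the tuple layout are assumptions introduced for illustration.

```python
import re

# One-letter wild-type residue, position number, optional substituted residue.
TOKEN = re.compile(r"^([A-Z])(\d+)([A-Z]?)$")


def parse_mutations(spec):
    """Parse e.g. 'E94C/F98W' or 'P108R/K' into a list of
    (wild_type, position, [allowed substitutions]) tuples."""
    result = []
    for token in spec.split("/"):
        match = TOKEN.match(token)
        if match:
            # A token with a position number introduces a new position ("and").
            wild_type, position, sub = match.group(1), int(match.group(2)), match.group(3)
            result.append((wild_type, position, [sub] if sub else []))
        elif len(token) == 1 and token.isalpha() and result:
            # A bare letter is an alternative at the previous position ("or").
            result[-1][2].append(token)
        else:
            raise ValueError(f"unrecognised token: {token!r}")
    return result


print(parse_mutations("P108R/K"))
print(parse_mutations("E94C/F98W/C109A/C136A/A360C"))
```

Prefixes such as the helicase name, and deletions such as (ΔM1), are outside the scope of this sketch and would need to be stripped or handled separately.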
- All publications, patents and patent applications cited herein, whether supra or infra, are hereby incorporated by reference in their entirety.
- The present invention provides a method of modifying a template polynucleotide. The template may be modified for any purpose. The method is preferably for modifying a template polynucleotide for characterisation, such as for strand sequencing. The template polynucleotide is typically the polynucleotide that will ultimately be characterised, or sequenced, in accordance with the invention. This is discussed in more detail below.
- The method provided is a method for modifying a double stranded polynucleotide template, comprising: (a) contacting the polynucleotide template with a MuA transposase in the presence of a double stranded MuA substrate that comprises an overhang at one or both ends of one strand, such that the MuA transposase (i) processes the template polynucleotide to produce a plurality of double stranded fragments and (ii) ligates the double stranded MuA substrate to one or both ends of a double stranded fragment of the plurality, thereby producing a ligation product to which is bound a MuA transposase; and (b) contacting the ligation product with a translocase, such that the translocase processes the ligation product to remove the MuA transposase, thereby producing a plurality of modified double stranded polynucleotides.
- The method involves the formation of a plurality of modified double stranded polynucleotides. These modified double stranded polynucleotides are typically easier to characterise than the template polynucleotide, especially using strand sequencing. The plurality of modified double stranded polynucleotides may themselves be characterised in order to facilitate the characterisation of the template polynucleotide. For instance, the sequence of the template polynucleotide can be determined by sequencing each of the modified double stranded polynucleotides.
- The modified double stranded polynucleotides are shorter than the template polynucleotide and so it is more straightforward to characterise them using strand sequencing. The modified double stranded polynucleotides may be of any length. The length is determined by the length of the template polynucleotide and the action of the MuA transposase which fragments the polynucleotide. Typically, the modified double stranded polynucleotide is less than about 5000 kb.
- The modified double stranded polynucleotides can be selectively labelled by including the labels in the MuA substrates. Labelling is selective in that only the modified double stranded polynucleotides produced by the MuA transposase are labelled. A label is an entity that enables sample identification, barcoding and/or tracking of the modified double stranded polynucleotide. Suitable labels include, but are not limited to, calibration sequences, coupling moieties and adaptor bound enzymes. Examples of coupling moieties include azide, DBCO, pyridyldithiol and maleimide. Calibration sequences include any sequence of a known composition. Adaptor bound enzymes include, for example, translocases, polymerases, helicases and other polynucleotide binding proteins.
- In some embodiments, the method introduces into the double stranded polynucleotides modifications which facilitate their characterisation using strand sequencing. It is well-established that coupling a polynucleotide to the membrane containing the nanopore lowers by several orders of magnitude the amount of polynucleotide required to allow its characterisation or sequencing. This is discussed in International Application No. PCT/GB2012/051191 (published as WO 2012/164270). The method of the invention allows the production of a plurality of double stranded polynucleotides each of which includes a means for coupling the polynucleotides to a membrane. This is discussed in more detail below.
- The characterisation of double stranded polynucleotides using a nanopore typically requires the presence of a leader sequence designed to preferentially thread into the nanopore. The method of the invention allows the production of a plurality of double stranded polynucleotides each of which includes a single stranded leader sequence. This is discussed in more detail below.
- It is also well established that linking the two strands of a double stranded polynucleotide by a bridging moiety, such as a hairpin loop, allows both strands of the polynucleotide to be characterised or sequenced by a nanopore. This is advantageous because it doubles the amount of information obtained from a single double stranded polynucleotide. Moreover, because the sequence in the template complement strand is necessarily orthogonal to the sequence of the template strand, the information from the two strands can be combined informatically. Thus, this mechanism provides an orthogonal proof-reading capability that provides higher confidence observations. This is discussed in International Application No. PCT/GB2012/051786 (published as WO 2013/014451). The method of the invention allows the production of a plurality of modified double stranded polynucleotides in which the two strands of each polynucleotide are linked using a hairpin loop.
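- One simple way to picture how information from the two linked strands might be combined is to reverse complement the complement-strand read and compare it with the template-strand read position by position. The sketch below is an illustration only and is not the informatic method referred to above; it assumes two error-free-length DNA reads of equal length, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def cross_check(template_read, complement_read):
    """Return the positions at which the template read disagrees with the
    template sequence inferred from the complement read."""
    inferred = reverse_complement(complement_read)
    if len(inferred) != len(template_read):
        raise ValueError("reads must be the same length for this simple check")
    return [
        i for i, (a, b) in enumerate(zip(template_read, inferred)) if a != b
    ]


# Hypothetical reads: the second complement read carries one miscalled base.
print(cross_check("ATGC", "GCAT"))
print(cross_check("ATGC", "GCAA"))
```

Positions where the two reads agree can be treated with higher confidence, while disagreements flag bases that need further evidence. Real reads would first need to be aligned, which this sketch does not attempt.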
- The method of the invention modifies a template double stranded polynucleotide, preferably for characterisation. The template polynucleotide is typically the polynucleotide that will ultimately be characterised, or sequenced, in accordance with the invention. It may also be called the target double stranded polynucleotide or the double stranded polynucleotide of interest. A polynucleotide, such as a nucleic acid, is a macromolecule comprising two or more nucleotides. The polynucleotide or nucleic acid may comprise any combination of any nucleotides. The nucleotides can be naturally occurring or artificial. One or more nucleotides in the template polynucleotide can be oxidized or methylated. One or more nucleotides in the template polynucleotide may be damaged. For instance, the polynucleotide may comprise a pyrimidine dimer. Such dimers are typically associated with damage by ultraviolet light and are the primary cause of skin melanomas. One or more nucleotides in the template polynucleotide may be modified, for instance with a label or a tag. Suitable labels are described below. The template polynucleotide may comprise one or more spacers.
- A nucleotide typically contains a nucleobase, a sugar and at least one phosphate group. The nucleobase and sugar form a nucleoside.
- The nucleobase is typically heterocyclic. Nucleobases include, but are not limited to, purines and pyrimidines and more specifically adenine (A), guanine (G), thymine (T), uracil (U) and cytosine (C).
- The sugar is typically a pentose sugar. Nucleotide sugars include, but are not limited to, ribose and deoxyribose. The sugar is preferably a deoxyribose.
- The template double stranded polynucleotide preferably comprises the following nucleosides: deoxyadenosine (dA), deoxyuridine (dU) and/or thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC).
- The nucleotide is typically a ribonucleotide or deoxyribonucleotide. The nucleotide is preferably a deoxyribonucleotide. The nucleotide typically contains a monophosphate, diphosphate or triphosphate. Phosphates may be attached on the 5′ or 3′ side of a nucleotide.
- Nucleotides include, but are not limited to, adenosine monophosphate (AMP), guanosine monophosphate (GMP), thymidine monophosphate (TMP), uridine monophosphate (UMP), 5-methylcytidine monophosphate, 5-hydroxymethylcytidine monophosphate, cytidine monophosphate (CMP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyguanosine monophosphate (dGMP), deoxythymidine monophosphate (dTMP), deoxyuridine monophosphate (dUMP) and deoxycytidine monophosphate (dCMP). The nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP, dCMP and dUMP. The nucleotides are most preferably selected from dAMP, dTMP, dGMP, dCMP and dUMP.
- The template double stranded polynucleotide preferably comprises the following nucleotides: dAMP, dUMP and/or dTMP, dGMP and dCMP.
- A nucleotide may be abasic (i.e. lack a nucleobase). A nucleotide may also lack a nucleobase and a sugar (i.e. is a C3 spacer).
- The nucleotides in the template polynucleotide may be attached to each other in any manner. The nucleotides are typically attached by their sugar and phosphate groups as in nucleic acids. The nucleotides may be connected via their nucleobases as in pyrimidine dimers.
- The template polynucleotide is double stranded. The template polynucleotide may contain some single stranded regions, but at least a portion of the template polynucleotide is double stranded.
- The template polynucleotide can be a nucleic acid, such as deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). The template polynucleotide can comprise one strand of RNA hybridised to one strand of DNA. The polynucleotide may be any synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA) or other synthetic polymers with nucleotide side chains.
- The template polynucleotide can be any length. For example, the polynucleotide can be at least 10, at least 50, at least 100, at least 150, at least 200, at least 250, at least 300, at least 400 or at least 500 nucleotide pairs in length. The polynucleotide can be 1000 or more nucleotide pairs, 5000 or more nucleotide pairs in length or 100000 or more nucleotide pairs in length.
- The template polynucleotide is typically present in any suitable sample. The invention is typically carried out on a sample that is known to contain or suspected to contain the template polynucleotide. Alternatively, the invention may be carried out on a sample to confirm the identity of one or more template polynucleotides whose presence in the sample is known or expected.
- The sample may be a biological sample. The invention may be carried out in vitro using at least one sample obtained from or extracted from any organism or microorganism. The organism or microorganism is typically archaeal, prokaryotic or eukaryotic and typically belongs to one of the five kingdoms: plantae, animalia, fungi, monera and protista. The invention may be carried out in vitro on at least one sample obtained from or extracted from any virus. The sample is preferably a fluid sample. The sample typically comprises a body fluid of the patient. The sample may be urine, lymph, saliva, mucus or amniotic fluid but is preferably blood, plasma or serum. Typically, the sample is human in origin, but alternatively it may be from another animal, such as a commercially farmed animal, for example a horse, cow, sheep, fish, chicken or pig, or from a pet such as a cat or dog. Alternatively, the sample may be of plant origin, such as a sample obtained from a commercial crop, such as a cereal, legume, fruit or vegetable, for example wheat, barley, oats, canola, maize, soya, rice, rhubarb, bananas, apples, tomatoes, potatoes, grapes, tobacco, beans, lentils, sugar cane, cocoa, broccoli or cotton.
- The sample may be a non-biological sample. The non-biological sample is preferably a fluid sample. Examples of non-biological samples include surgical fluids, water such as drinking water, sea water or river water, and reagents for laboratory tests.
- The sample is typically processed prior to being used in the invention, for example by centrifugation or by passage through a membrane that filters out unwanted molecules or cells, such as red blood cells. The sample may be measured immediately upon being taken. The sample may also be stored prior to assay, preferably below −70° C.
- The template polynucleotide is contacted with a MuA transposase. This contacting occurs under conditions which allow the transposase to function, e.g. to fragment the template polynucleotide and to ligate MuA substrates to one or both ends of the fragments. MuA transposase is commercially available, for instance from Thermo Scientific (Catalogue Number F-750C, 20 μL (1.1 μg/μL)). The MuA transposase may be a wild type MuA transposase or a modified MuA transposase. Conditions under which MuA transposase will function are known in the art. Examples of suitable conditions are described in the Examples.
- The template polynucleotide is contacted with a population of double stranded MuA substrates. The MuA substrates contain a known MuA recognition sequence. Incubation of the template polynucleotide and MuA substrates with MuA results in adaptor formation. The double stranded substrates are polynucleotide substrates and may be formed from any of the nucleotides or nucleic acids discussed above. The MuA substrates are typically formed from the same nucleotides as the template polynucleotide, except for the universal nucleotides or at least one nucleotide which comprises a nucleoside that is not present in the template polynucleotide.
- The population of substrates is typically homogenous (i.e. typically contains a plurality of identical substrates). The population of substrates may be heterogeneous (i.e. may contain a plurality of different substrates).
- Suitable substrates for a MuA transposase are known in the art (Saariaho and Savilahti, Nucleic Acids Research, 2006; 34(10): 3139-3149 and Lee and Harshey, J. Mol. Biol., 2001; 314: 433-444).
- Each substrate typically comprises a double stranded portion which provides its activity as a substrate for MuA transposase. The double stranded portion is typically the same in each substrate. The population of substrates may comprise different double stranded portions.
- The double stranded portion in each substrate is typically at least 50 nucleotide pairs in length, such as at least 55, at least 60 or at least 65 nucleotide pairs in length. The double stranded portion may have a length of up to 10 kb, such as 5 kb, 1 kb or 100 base pairs. The double stranded portion in each substrate preferably comprises a dinucleotide comprising deoxycytidine (dC) and deoxyadenosine (dA) at the 3′ end of each strand. The dC and dA are typically in different orientations in the two strands of the double stranded portion, i.e. one strand has dC/dA and the other strand has dA/dC at the 3′ end when reading from 5′ to 3′.
- One strand of the double stranded portion preferably comprises the sequence shown in SEQ ID NO: 94 and the other strand of the double stranded portion preferably comprises a sequence which is complementary to the sequence shown in SEQ ID NO: 94.
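- The dinucleotide orientation described above can be checked directly on the SEQ ID NO: 94 duplex. The following is a minimal Python sketch under stated assumptions: the sequence string is copied from the substrate shown later in this description, the complementary strand is assumed to be the unmodified Watson-Crick complement, and the helper name `reverse_complement` is illustrative only.

```python
# SEQ ID NO: 94 as written 5'->3' in this description.
SEQ_94 = "GTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTTCGTGCGCCGCTTCA"

# Watson-Crick pairing for DNA bases.
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return "".join(_PAIR[base] for base in reversed(seq))


top = SEQ_94
bottom = reverse_complement(SEQ_94)  # assumed unmodified complement

# Last two nucleotides of each strand, read 5'->3'.
top_3prime = top[-2:]
bottom_3prime = bottom[-2:]

# First two nucleotides of each strand, read 5'->3'.
top_5prime = top[:2]
bottom_5prime = bottom[:2]

print("3' ends:", top_3prime, bottom_3prime)
print("5' ends:", top_5prime, bottom_5prime)
```

- In this sketch one strand ends in dC/dA and the other in dA/dC at the 3′ end, and the 5′ ends carry the dG and dT pair in the opposite orders.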
- Each substrate comprises an overhang at one or both ends of one strand, i.e. at least one overhang on one strand. The one strand in the double stranded substrate having an overhang at one or both ends is also called the one substrate strand.
- If there is only one overhang, it is preferably located at the 5′ end of the one substrate strand. After fragmentation of the template polynucleotide and ligation of the MuA substrate to the fragments of the template polynucleotide (tagmentation), constructs comprising a fragment of the template polynucleotide and one or more MuA substrates are formed. In such embodiments, a translocase that moves in the 5′ to 3′ direction may be used to remove the MuA transposases from the constructs.
- If there are two overhangs, i.e. one at each end of one substrate strand, a translocase that moves in either direction, i.e. from 5′ to 3′ or from 3′ to 5′, may be used to remove the MuA transposases from the constructs.
- Each substrate preferably comprises a double stranded portion which comprises the sequence shown in SEQ ID NO: 94 hybridised to a sequence which is complementary to the sequence shown in SEQ ID NO: 94. The one overhang is preferably at the 5′ end of the sequence which is complementary to the sequence shown in SEQ ID NO: 94. The sequence complementary to the sequence shown in SEQ ID NO: 94 may have overhangs at both ends. The sequence complementary to the sequence shown in SEQ ID NO: 94 is the one substrate strand.
- The overhang may be at least 3, at least 4, at least 5, at least 6 or at least 7 nucleotides in length. The overhang may have a length of up to about 200 nucleotides, such as about 100, 50, 25 or 10 nucleotides. The overhang is preferably 5 nucleotides in length. The overhang may comprise any of the nucleotides discussed above.
- If the overhang at the 5′ end of the one substrate strand is not closed after formation of the constructs, the translocase will remove both the MuA transposase and the one substrate strand, i.e. the substrate strand with the overhang. If the overhang at the 5′ end of the one substrate strand is closed after formation of the constructs, the translocase will remove only the MuA transposase.
- Closure of the overhang occurs for example where the 5′ end of the overhang is ligated to the adjacent 3′ end of a strand of the template polynucleotide fragment.
- In one embodiment, each substrate comprises an overhang at both ends of one strand and the overhang at the 5′ end is formed from universal nucleotides. The overhang preferably consists of universal nucleotides. This allows the overhang to be closed after formation of the constructs. Each substrate preferably comprises a double stranded portion which comprises the sequence shown in SEQ ID NO: 94 hybridised to a sequence which is complementary to the sequence shown in SEQ ID NO: 94. The overhang formed from universal nucleotides is at the 5′ end of the sequence which is complementary to the sequence shown in SEQ ID NO: 94.
- The overhangs may be at least 3, at least 4, at least 5, at least 6 or at least 7 nucleotides in length. The overhangs are preferably 5 nucleotides in length.
- A universal nucleotide is one which will hybridise to some degree to all of the nucleotides in the template polynucleotide. A universal nucleotide is preferably one which will hybridise to some degree to nucleotides comprising the nucleosides adenosine (A), thymine (T), uracil (U), guanine (G) and cytosine (C). The universal nucleotide may hybridise more strongly to some nucleotides than to others. For instance, a universal nucleotide (I) comprising the nucleoside, 2′-deoxyinosine, will show a preferential order of pairing of I-C>I-A>I-G approximately =I-T. For the purposes of the invention, it is only necessary that the universal nucleotide used in the oligomers hybridises to all of the nucleotides in the template polynucleotide.
- The universal nucleotide preferably comprises one of the following nucleobases: hypoxanthine, 4-nitroindole, 5-nitroindole, 6-nitroindole, 3-nitropyrrole, nitroimidazole, 4-nitropyrazole, 4-nitrobenzimidazole, 5-nitroindazole, 4-aminobenzimidazole or phenyl (C6-aromatic ring). The universal nucleotide more preferably comprises one of the following nucleosides: 2′-deoxyinosine, inosine, 7-deaza-2′-deoxyinosine, 7-deaza-inosine, 2-aza-deoxyinosine, 2-aza-inosine, 4-nitroindole 2′-deoxyribonucleoside, 4-nitroindole ribonucleoside, 5-nitroindole 2′-deoxyribonucleoside, 5-nitroindole ribonucleoside, 6-nitroindole 2′-deoxyribonucleoside, 6-nitroindole ribonucleoside, 3-nitropyrrole 2′-deoxyribonucleoside, 3-nitropyrrole ribonucleoside, an acyclic sugar analogue of hypoxanthine, nitroimidazole 2′-deoxyribonucleoside, nitroimidazole ribonucleoside, 4-nitropyrazole 2′-deoxyribonucleoside, 4-nitropyrazole ribonucleoside, 4-nitrobenzimidazole 2′-deoxyribonucleoside, 4-nitrobenzimidazole ribonucleoside, 5-nitroindazole 2′-deoxyribonucleoside, 5-nitroindazole ribonucleoside, 4-aminobenzimidazole 2′-deoxyribonucleoside, 4-aminobenzimidazole ribonucleoside, phenyl C-ribonucleoside or phenyl C-2′-deoxyribosyl nucleoside. The universal nucleotide most preferably comprises 2′-deoxyinosine.
- The universal nucleotides in each overhang may be different from one another. The universal nucleotides in each overhang are preferably the same. All of the universal nucleotides in the population of substrates are preferably the same universal nucleotide.
- The method of the invention preferably comprises
-
- (a) contacting the template polynucleotide with a MuA transposase and a population of double stranded MuA substrates each comprising an overhang at both ends of one strand, wherein the overhang at the 5′ end of the one strand consists of universal nucleotides, such that the transposase fragments the template polynucleotide into fragments and ligates a substrate to one or both ends of the double stranded fragments and thereby producing a plurality of fragment/substrate constructs;
- (b) allowing the overhangs consisting of universal nucleotides to hybridise to the opposite fragment strands in the constructs;
- (c) ligating the overhangs consisting of universal nucleotides to the adjacent fragment strands in the constructs; and
- (d) using a translocase to remove the MuA transposases from the constructs and thereby producing a plurality of modified double stranded polynucleotides. In this embodiment, the translocase binds to the overhangs at the 3′ ends of the one substrate strands in the constructs and moves 3′ to 5′ to remove the MuA transposase. Since the 5′ overhang is closed, the one substrate strands remain in the constructs.
- The overhang(s) of universal nucleotides may further comprise a reactive group, preferably at the 5′ end. The reactive group may be used to ligate the overhangs to the fragments in the constructs as discussed below. The reactive group may be used to ligate the fragments to the overhangs using click chemistry. Click chemistry is a term first introduced by Kolb et al. in 2001 to describe an expanding set of powerful, selective, and modular building blocks that work reliably in both small- and large-scale applications (Kolb H C, Finn, MG, Sharpless K B, Click chemistry: diverse chemical function from a few good reactions, Angew. Chem. Int. Ed. 40 (2001) 2004-2021). They have defined the set of stringent criteria for click chemistry as follows: “The reaction must be modular, wide in scope, give very high yields, generate only inoffensive by-products that can be removed by nonchromatographic methods, and be stereospecific (but not necessarily enantioselective). The required process characteristics include simple reaction conditions (ideally, the process should be insensitive to oxygen and water), readily available starting materials and reagents, the use of no solvent or a solvent that is benign (such as water) or easily removed, and simple product isolation. Purification if required must be by nonchromatographic methods, such as crystallization or distillation, and the product must be stable under physiological conditions”.
- Suitable examples of click chemistry include, but are not limited to, the following:
-
- (a) copper-free variant of the 1,3 dipolar cycloaddition reaction, where an azide reacts with an alkyne under strain, for example in a cyclooctane ring;
- (b) the reaction of an oxygen nucleophile on one linker with an epoxide or aziridine reactive moiety on the other; and
- (c) the Staudinger ligation, where the alkyne moiety can be replaced by an aryl phosphine, resulting in a specific reaction with the azide to give an amide bond.
- Any reactive group may be used in the invention. The reactive group may be one that is suitable for click chemistry. The reactive group may be any of those disclosed in International Application No. PCT/GB10/000132 (published as WO 2010/086602), particularly in Table 4 of that application.
- In a further embodiment, the modification method uses a MuA transposase and a population of MuA substrates each comprising at least one overhang comprising a reactive group. The overhang(s) may be any length and may comprise any combination of any nucleotide(s). Suitable lengths and nucleotides are disclosed above. Suitable reactive groups are discussed above. Accordingly, the invention provides a method for modifying a template double stranded polynucleotide, comprising:
-
- (a) contacting the template polynucleotide with a MuA transposase and a population of double stranded MuA substrates each comprising an overhang at both ends of one strand, wherein the overhang at the 5′ end of the one strand comprises a reactive group, such that the transposase fragments the template polynucleotide and ligates a substrate to one or both ends of the double stranded fragments and thereby producing a plurality of fragment/substrate constructs;
- (b) ligating the overhangs to the fragments in the constructs using the reactive group; and
- (c) using a translocase to remove the MuA transposases from the constructs and thereby producing a plurality of modified double stranded polynucleotides. In this embodiment, the translocase binds to the overhangs at the 3′ ends of the one substrate strands in the constructs and moves 3′ to 5′ to remove the MuA transposase. Since the 5′ overhang is closed, the one substrate strands remain in the constructs.
Nucleosides that are not Present in the Template Polynucleotide
- In one embodiment, each substrate comprises (i) an overhang at both ends of one strand and (ii) at least one nucleotide 10 nucleotides or fewer from the overhang at the 5′ end of the one strand which comprises a nucleoside that is not present in the template polynucleotide. For example, the nucleotide that is not present in the template polynucleotide is typically a non-natural nucleotide where the template polynucleotide comprises only natural nucleotides.
- As discussed above, the double stranded portion in each substrate preferably comprises a dinucleotide comprising deoxycytidine (dC) and deoxyadenosine (dA) at the 3′ end of each strand and a dinucleotide comprising thymidine (dT) and deoxyguanosine (dG) at the 5′ end of each strand. In some embodiments, one or both of the nucleotides in the dT and dG dinucleotide of the one substrate strand may be replaced with a nucleotide comprising a nucleoside that is not present in the template polynucleotide as discussed below. In a preferred embodiment, the template polynucleotide comprises deoxyadenosine (dA), thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC), but not deoxyuridine (dU) and the dT in the dT and dG dinucleotide of the one substrate strand is replaced with a nucleotide comprising deoxyuridine (dU). This is exemplified below.
- The double stranded portion preferably comprises the sequence shown in SEQ ID NO: 94 and a sequence which is complementary to the sequence shown in SEQ ID NO: 94 and which is modified to include at least one nucleotide that is not present in the template polynucleotide. The sequence complementary to SEQ ID NO: 94 further comprises the overhang, i.e. is the one substrate strand. In a more preferred embodiment, the double stranded portion comprises the sequence shown in SEQ ID NO: 94 and the sequence shown in SEQ ID NO: 95 (see below). In SEQ ID NO: 95, the dT in the dT and dG dinucleotide at the 5′ end has been replaced with dU. This double stranded portion (shown below) may be used when the template polynucleotide comprises deoxyadenosine (dA), thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC), but not deoxyuridine (dU).
-
(SEQ 94) 5′-GTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTTCGTGCGCCGCTTCA-3′
(SEQ 95) 3′-CAAAAGCGTAAATAGCACTTTGCGAAAGCGCAAAAAGCACGCGGCGAAGU-5′
- The overhangs may be at least 3, at least 4, at least 5, at least 6 or at least 7 nucleotides in length. The overhangs are preferably 4 nucleotides in length. The overhangs may comprise any of the nucleotides discussed above.
- Each substrate comprises at least one nucleotide in the one substrate strand which is 10 nucleotides or fewer from the overhang at the 5′ end and which comprises a nucleoside that is not present in the template polynucleotide. Each substrate may comprise any number of nucleotides which comprise a nucleoside that is not present in the template polynucleotide, such as 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. If a substrate comprises more than one nucleotide that is not present in the template polynucleotide, those nucleotides are typically the same, but may be different.
- If the template polynucleotide comprises deoxyadenosine (dA), thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC) but not deoxyuridine (dU), the nucleoside that is not present in the template polynucleotide is preferably deoxyuridine (dU).
- In a preferred embodiment, one strand of the double stranded portion comprises the sequence shown in SEQ ID NO: 94 and the other strand of the double stranded portion comprises the sequence shown in SEQ ID NO: 95 (see above). In SEQ ID NO: 95, the dT in the dT and dG dinucleotide at the 5′ end has been replaced with dU. The overhang at the 5′ end of SEQ ID NO: 95 is attached to the U.
- In a most preferred embodiment, each substrate comprises the sequence shown in SEQ ID NO: 94 and the sequence shown in SEQ ID NO: 96 (see below). This substrate (shown below) may be used when the template polynucleotide comprises deoxyadenosine (dA), thymidine (dT), deoxyguanosine (dG) and deoxycytidine (dC), but not deoxyuridine (dU).
-
(SEQ 94) 5′-GTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTTCGTGCGCCGCTTCA-3′
(SEQ 96) 3′-CAAAAGCGTAAATAGCACTTTGCGAAAGCGCAAAAAGCACGCGGCGAAGUCTAG-5′
- Each substrate also comprises an overhang at the 3′ end of the sequence shown in SEQ ID NO: 96.
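- The relationship between SEQ ID NOs: 94, 95 and 96 described above can be confirmed with a short comparison. This is only an illustrative Python sketch: the strings are copied from the displays above with the display spacing removed, and all variable names are hypothetical.

```python
# Strands as displayed in this description. SEQ 94 is written 5'->3';
# SEQ 95 and SEQ 96 are written 3'->5', exactly as shown.
SEQ_94_5TO3 = "GTTTTCGCATTTATCGTGAAACGCTTTCGCGTTTTTCGTGCGCCGCTTCA"
SEQ_95_3TO5 = "CAAAAGCGTAAATAGCACTTTGCGAAAGCGCAAAAAGCACGCGGCGAAGU"
SEQ_96_3TO5 = "CAAAAGCGTAAATAGCACTTTGCGAAAGCGCAAAAAGCACGCGGCGAAGUCTAG"

# Watson-Crick pairing for DNA bases.
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Position-by-position complement of SEQ 94, written 3'->5'.
plain_complement = "".join(_PAIR[base] for base in SEQ_94_5TO3)

# Positions (0-based, counted from the 3' end of the lower strand) where
# SEQ 95 departs from the unmodified complement.
differences = [
    (i, plain_complement[i], SEQ_95_3TO5[i])
    for i in range(len(plain_complement))
    if plain_complement[i] != SEQ_95_3TO5[i]
]

# The 5' overhang of SEQ 96 is whatever extends beyond SEQ 95.
overhang_3to5 = SEQ_96_3TO5[len(SEQ_95_3TO5):]

print(differences)
print(overhang_3to5)
```

- The only difference found is the final position, where dT is replaced by dU at the 5′ end, and the remaining four nucleotides of SEQ ID NO: 96 form the 5′ overhang attached to that dU.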
- If the template polynucleotide comprises deoxyadenosine (dA), deoxyuridine (dU), deoxyguanosine (dG) and deoxycytidine (dC) but not thymidine (dT), the nucleoside that is not present in the template polynucleotide is preferably thymidine (dT).
- The nucleoside that is not present in the template polynucleotide is preferably abasic, adenosine (A), uridine (U), 5-methyluridine (m5U), cytidine (C) or guanosine (G) or preferably comprises urea, 5,6-dihydroxythymine, thymine glycol, 5-hydroxy-5-methylhydantoin, uracil glycol, 6-hydroxy-5,6-dihydrothymine, methyltartronylurea, 7,8-dihydro-8-oxoguanine (8-oxoguanine), 8-oxoadenine, fapy-guanine, methyl-fapy-guanine, fapy-adenine, aflatoxin B1-fapy-guanine, 5-hydroxy-cytosine, 5-hydroxy-uracil, 3-methyladenine, 7-methylguanine, 1,N6-ethenoadenine, hypoxanthine, 5-hydroxyuracil, 5-hydroxymethyluracil, 5-formyluracil or a cis-syn-cyclobutane pyrimidine dimer.
- The at least one nucleotide is 10 nucleotides or fewer from the overhang at the 5′ end, such as 9, 8, 7, 6, 5, 4, 3, 2, 1 or 0 nucleotides from the overhang. In other words, the at least one nucleotide is preferably at any of positions A to K in the Example below. The at least one nucleotide is preferably 0 nucleotides from the overhang (i.e. is adjacent to the overhang). In other words, the at least one nucleotide is preferably at position K in the Example below.
- The at least one nucleotide may be the first nucleotide in the overhang. In other words, the at least one nucleotide may be at position A in the Example below.
- All of the nucleotides in the overhang may comprise a nucleoside that is not present in the template polynucleotide. A person skilled in the art is capable of designing suitable substrates.
- The method of the invention preferably comprises
-
- (a) contacting the template polynucleotide with a MuA transposase and a population of double stranded MuA substrates each comprising (i) an overhang at both ends of one strand and (ii) at least one nucleotide 10 nucleotides or fewer from the overhang at the 5′ end of the one strand which comprises a nucleoside that is not present in the template polynucleotide such that the transposase fragments the template polynucleotide into fragments and ligates a substrate at one or both ends of the double stranded fragments and thereby producing a plurality of fragment/substrate constructs;
- (b) removing the overhangs at the 5′ end of the one substrate strands from the constructs by selectively removing the at least one nucleotide and thereby producing a plurality of double stranded constructs comprising single stranded gaps;
- (c) repairing the single stranded gaps in the constructs; and
- (d) using a translocase to remove the MuA transposases from the constructs and thereby producing a plurality of modified double stranded polynucleotides.
- In those embodiments in which the MuA substrates comprise overhangs of universal nucleotides, the method comprises ligating the overhangs to the fragments in the constructs. This may be done using any method of ligating nucleotides known in the art. For instance, it may be done using a ligase, such as a DNA ligase. Alternatively, if the overhangs comprise a reactive group, the reactive group may be used to ligate the overhangs to the fragments in the constructs.
- For instance, a nucleotide comprising a complementary reactive group may be attached to the fragments and the two reactive groups may be reacted together to ligate the overhangs to the fragments. Click chemistry may be used as discussed above.
- Methods are known in the art for selectively removing the nucleotide(s) which comprise(s) a nucleoside that is not present in the template polynucleotide from the ligated constructs. Nucleotides are selectively removed if they are removed (or excised) from the ligated constructs, but the other nucleotides in the ligated constructs (i.e. those comprising different nucleosides) are not removed (or excised).
- Nucleotides comprising deoxyuridine (dU) may be selectively removed using Uracil-Specific Excision Reagent (USER®), which is a mixture of Uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase Endonuclease VIII.
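- The effect of selectively removing dU on the one substrate strand can be pictured with a toy string model. The sketch below is not a biochemical simulation: the function name `excise_du` is hypothetical, the model simply treats each dU as a clean break in the strand, and SEQ ID NO: 96 is copied from the display above with spacing removed.

```python
def excise_du(strand_5to3: str) -> list[str]:
    """Split a strand at every dU, discarding the dU itself.

    Toy model only: it treats excision of each dU as a clean break in
    the strand and ignores end chemistry and reaction kinetics.
    """
    return [piece for piece in strand_5to3.split("U") if piece]


# SEQ ID NO: 96 is displayed 3'->5' in this description; reverse it to 5'->3'.
SEQ_96_3TO5 = "CAAAAGCGTAAATAGCACTTTGCGAAAGCGCAAAAAGCACGCGGCGAAGUCTAG"
seq_96_5to3 = SEQ_96_3TO5[::-1]

# With a single dU next to the 5' overhang, excision yields two pieces:
# the short 5' overhang and the rest of the one substrate strand.
released_overhang, retained_strand = excise_du(seq_96_5to3)

print(released_overhang)
print(len(retained_strand))
```

- In this model the 5′ overhang is separated from the rest of the one substrate strand, which corresponds to the single stranded gap that is subsequently repaired.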
- Methods are known in the art for repairing the single stranded gaps in the double stranded constructs. For instance, the gaps can be repaired using a polymerase and a ligase, such as DNA polymerase and a DNA ligase. Alternatively, the gaps can be repaired using random oligonucleotides of sufficient length to bridge the gaps and a ligase.
- Any translocase that is capable of removing the MuA transposase may be used in the invention. This may occur, for example, as a result of the unwinding of double stranded polynucleotide by a translocase.
- The translocase is preferably a helicase. Suitable helicases are well-known in the art (M. E. Fairman-Williams et al., Curr. Opin. Struct Biol., 2010, 20 (3), 313-324, T. M. Lohman et al., Nature Reviews Molecular Cell Biology, 2008, 9, 391-401).
- The helicase is preferably a member of superfamily 1 or superfamily 2. The helicase is more preferably a member of one of the following families: Pif1-like, Upf1-like, UvrD/Rep, Ski-like, Rad3/XPD, NS3/NPH-II, DEAD, DEAH/RHA, RecG-like, RecQ-like, T1R-like, Swi/Snf-like and Rig-I-like. The first three of those families are in superfamily 1 and the second ten families are in superfamily 2. The helicase is more preferably a member of one of the following subfamilies: RecD, Upf1 (RNA), PcrA, Rep, UvrD, Hel308, Mtr4 (RNA), XPD, NS3 (RNA), Mss116 (RNA), Prp43 (RNA), RecG, RecQ, T1R, RapA and Hef (RNA). The first five of those subfamilies are in superfamily 1 and the second eleven subfamilies are in superfamily 2. Members of the Upf1, Mtr4, NS3, Mss116, Prp43 and Hef subfamilies are RNA helicases. Members of the remaining subfamilies are DNA helicases. The helicase may be Srs2. The helicase may be RecBCD.
- The helicase is preferably a Hel308 helicase. Any Hel308 helicase may be used in accordance with the invention. Hel308 helicases are also known as ski2-like helicases and the two terms can be used interchangeably. Suitable Hel308 helicases are disclosed in Table 4 of International Application No. PCT/GB2012/052579 (published as WO 2013/057495).
- The Hel308 helicase typically comprises the amino acid motif Q-X1-X2-G-R-A-G-R (hereinafter called the Hel308 motif; SEQ ID NO: 8). The Hel308 motif is typically part of the helicase motif VI (Tuteja and Tuteja, Eur. J. Biochem. 271, 1849-1863 (2004)). X1 may be C, M or L. X1 is preferably C. X2 may be any amino acid residue. X2 is typically a hydrophobic or neutral residue. X2 may be A, F, M, C, V, L, I, S, T, P or R. X2 is preferably A, F, M, C, V, L, I, S, T or P. X2 is more preferably A, M or L. X2 is most preferably A or M.
- The Hel308 helicase preferably comprises the motif Q-X1-X2-G-R-A-G-R-P (hereinafter called the extended Hel308 motif; SEQ ID NO: 9) wherein X1 and X2 are as described above.
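- The Hel308 motif and the extended Hel308 motif defined above lend themselves to a simple pattern match. The following Python sketch is illustrative only: it assumes the 20 standard one-letter amino acid codes for "any amino acid residue" at X2, and the function names are hypothetical. The example strings are motifs listed in Table 1.

```python
import re

# The 20 standard amino acids in one-letter code (an assumption for X2).
_ANY_RESIDUE = "ACDEFGHIKLMNPQRSTVWY"

# Q-X1-X2-G-R-A-G-R, with X1 restricted to C, M or L and X2 any residue.
HEL308_MOTIF = re.compile(rf"Q[CML][{_ANY_RESIDUE}]GRAGR")
# Extended motif: the same eight residues followed by P.
EXTENDED_HEL308_MOTIF = re.compile(rf"Q[CML][{_ANY_RESIDUE}]GRAGRP")


def has_hel308_motif(protein: str) -> bool:
    """True if the sequence contains Q-X1-X2-G-R-A-G-R."""
    return HEL308_MOTIF.search(protein) is not None


def has_extended_hel308_motif(protein: str) -> bool:
    """True if the sequence contains Q-X1-X2-G-R-A-G-R-P."""
    return EXTENDED_HEL308_MOTIF.search(protein) is not None


# Motifs taken from Table 1, plus one string that breaks the X1 rule.
for candidate in ["QMAGRAGRP", "QLCGRAGR", "QCIGRAGRP", "QAAGRAGR"]:
    print(candidate, has_hel308_motif(candidate),
          has_extended_hel308_motif(candidate))
```

- A full-length helicase sequence can be passed to the same functions, since `search` scans the whole string rather than requiring the motif at the start.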
- The most preferred Hel308 helicases, Hel308 motifs and extended Hel308 motifs are shown in Table 1 below.
-
TABLE 1 Preferred Hel308 helicases and their motifs
SEQ ID NO: | Helicase | Names | % Identity Hel308 Pfu | % Identity Hel308 Mbu | Hel308 motif | Extended Hel308 motif
10 | Hel308 Mbu | Methanococcoides burtonii | 37% | - | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
13 | Hel308 Pfu | Pyrococcus furiosus DSM 3638 | - | 37% | QMLGRAGR (SEQ ID NO: 14) | QMLGRAGRP (SEQ ID NO: 15)
16 | Hel308 Hvo | Haloferax volcanii | 34% | 41% | QMMGRAGR (SEQ ID NO: 17) | QMMGRAGRP (SEQ ID NO: 18)
19 | Hel308 Hla | Halorubrum lacusprofundi | 35% | 42% | QMCGRAGR (SEQ ID NO: 20) | QMCGRAGRP (SEQ ID NO: 21)
22 | Hel308 Csy | Cenarchaeum symbiosum | 34% | 34% | QLCGRAGR (SEQ ID NO: 23) | QLCGRAGRP (SEQ ID NO: 24)
25 | Hel308 Sso | Sulfolobus solfataricus | 35% | 33% | QMSGRAGR (SEQ ID NO: 26) | QMSGRAGRP (SEQ ID NO: 27)
28 | Hel308 Mfr | Methanogenium frigidum | 37% | 44% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
29 | Hel308 Mok | Methanothermococcus okinawensis | 37% | 34% | QCIGRAGR (SEQ ID NO: 30) | QCIGRAGRP (SEQ ID NO: 31)
32 | Hel308 Mig | Methanotorris igneus Kol 5 | 40% | 35% | QCIGRAGR (SEQ ID NO: 30) | QCIGRAGRP (SEQ ID NO: 31)
33 | Hel308 Tga | Thermococcus gammatolerans EJ3 | 60% | 38% | QMMGRAGR (SEQ ID NO: 17) | QMMGRAGRP (SEQ ID NO: 18)
34 | Hel308 Tba | Thermococcus barophilus MP | 57% | 35% | QMIGRAGR (SEQ ID NO: 35) | QMIGRAGRP (SEQ ID NO: 36)
37 | Hel308 Tsi | Thermococcus sibiricus MM 739 | 56% | 35% | QMMGRAGR (SEQ ID NO: 17) | QMMGRAGRP (SEQ ID NO: 18)
38 | Hel308 Mba | Methanosarcina barkeri str. Fusaro | 39% | 60% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
39 | Hel308 Mac | Methanosarcina acetivorans | 38% | 60% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
40 | Hel308 Mmah | Methanohalophilus mahii DSM 5219 | 38% | 60% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
41 | Hel308 Mmaz | Methanosarcina mazei | 38% | 60% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
42 | Hel308 Mth | Methanosaeta thermophila PT | 39% | 46% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
43 | Hel308 Mzh | Methanosalsum zhilinae DSM 4017 | 39% | 57% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
44 | Hel308 Mev | Methanohalobium evestigatum Z-7303 | 38% | 61% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
45 | Hel308 Mma | Methanococcus maripaludis | 36% | 32% | QCIGRAGR (SEQ ID NO: 30) | QCIGRAGRP (SEQ ID NO: 31)
46 | Hel308 Nma | Natrialba magadii | 37% | 43% | QMMGRAGR (SEQ ID NO: 17) | QMMGRAGRP (SEQ ID NO: 18)
47 | Hel308 Mbo | Methanoregula boonei 6A8 | 38% | 45% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
48 | Hel308 Fac | Ferroplasma acidarmanus | 34% | 32% | QMIGRAGR (SEQ ID NO: 35) | QMIGRAGRP (SEQ ID NO: 36)
49 | Hel308 Mfe | Methanocaldococcus fervens AG86 | 40% | 35% | QCIGRAGR (SEQ ID NO: 30) | QCIGRAGRP (SEQ ID NO: 31)
50 | Hel308 Mja | Methanocaldococcus jannaschii | 24% | 22% | QCIGRAGR (SEQ ID NO: 30) | QCIGRAGRP (SEQ ID NO: 31)
51 | Hel308 Min | Methanocaldococcus infernus | 41% | 33% | QCIGRAGR (SEQ ID NO: 30) | QCIGRAGRP (SEQ ID NO: 31)
52 | Hel308 Mhu | Methanospirillum hungatei JF-1 | 36% | 40% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
53 | Hel308 Afu | Archaeoglobus fulgidus DSM 4304 | 40% | 40% | QMAGRAGR (SEQ ID NO: 11) | QMAGRAGRP (SEQ ID NO: 12)
54 | Hel308 Htu | Haloterrigena turkmenica | 35% | 43% | QMAGRAGR (SEQ ID NO: 11) | QMMGRAGRP (SEQ ID NO: 12)
55 | Hel308 Hpa | Haladaptatus paucihalophilus DX253 | 38% | 45% | QMFGRAGR (SEQ ID NO: 56) | QMFGRAGRP (SEQ ID NO: 57)
58 | Hel308 Hsp ski2-like helicase | Halobacterium sp. NRC-1 | 36.8% | 42.0% | QMFGRAGR (SEQ ID NO: 56) | QMFGRAGRP (SEQ ID NO: 57)
- The most preferred Hel308 motif is shown in SEQ ID NO: 17. The most preferred extended Hel308 motif is shown in SEQ ID NO: 18.
- The Hel308 helicase preferably comprises the sequence of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 or a variant thereof.
- A variant of a Hel308 helicase is an enzyme that has an amino acid sequence which varies from that of the wild-type helicase and which retains polynucleotide binding activity. In particular, a variant of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 and which retains polynucleotide binding activity. Polynucleotide binding activity can be determined using methods known in the art. Suitable methods include, but are not limited to, fluorescence anisotropy, tryptophan fluorescence and electrophoretic mobility shift assay (EMSA). For instance, the ability of a variant to bind a single stranded polynucleotide can be determined as described in the Examples.
- The variant retains helicase activity. This can be measured in various ways. For instance, the ability of the variant to translocate along a polynucleotide can be measured using electrophysiology, a fluorescence assay or ATP hydrolysis.
- The variant may include modifications that facilitate handling of the polynucleotide encoding the helicase and/or facilitate its activity at high salt concentrations and/or room temperature. Variants typically differ from the wild-type helicase in regions outside of the Hel308 motif or extended Hel308 motif discussed above. However, variants may include modifications within these motif(s).
- Over the entire length of the amino acid sequence of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58, a variant will preferably be at least 30% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 over the entire sequence. There may be at least 70%, for example at least 80%, at least 85%, at least 90% or at least 95%, amino acid identity over a stretch of 150 or more, for example 200, 300, 400, 500, 600, 700, 800, 900 or 1000 or more, contiguous amino acids (“hard homology”). Homology is determined as described below. The variant may differ from the wild-type sequence in any of the ways discussed below with reference to SEQ ID NOs: 2 and 4.
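The identity thresholds above can be illustrated with a short calculation. This is only a sketch under stated assumptions: the two sequences are taken to be already aligned to equal length with `-` marking gaps, identity is counted over alignment columns, and the function names, window size and test strings are hypothetical. It is not the homology method the specification refers to elsewhere.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def has_stretch_identity(aligned_a, aligned_b, window=150, threshold=70.0):
    """True if any run of `window` contiguous columns reaches `threshold` % identity."""
    for start in range(0, len(aligned_a) - window + 1):
        end = start + window
        if percent_identity(aligned_a[start:end], aligned_b[start:end]) >= threshold:
            return True
    return False


# Hypothetical toy alignment: three of four columns are identical.
print(percent_identity("ACDE", "ACDF"))
```

The choice of denominator (alignment columns rather than the length of one sequence) is one convention among several and changes the reported percentage.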
- A variant of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 preferably comprises the Hel308 motif or extended Hel308 motif of the wild-type sequence as shown in Table 1 above. However, a variant may comprise the Hel308 motif or extended Hel308 motif from a different wild-type sequence. For instance, a variant of SEQ ID NO: 10 may comprise the Hel308 motif or extended Hel308 motif from SEQ ID NO: 13 (i.e. SEQ ID NO: 14 or 15). Variants of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 may also include modifications within the Hel308 motif or extended Hel308 motif of the relevant wild-type sequence. Suitable modifications at X1 and X2 are discussed above when defining the two motifs. A variant of SEQ ID NO: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 or 58 preferably comprises one or more substituted cysteine residues and/or one or more substituted Faz residues to facilitate attachment as discussed above.
- A variant of SEQ ID NO: 10 may lack the first 19 amino acids of SEQ ID NO: 10 and/or lack the last 33 amino acids of SEQ ID NO: 10. A variant of SEQ ID NO: 10 preferably comprises a sequence which is at least 70%, at least 75%, at least 80%, at least 85%, at least 90% or more preferably at least 95%, at least 97% or at least 99% homologous based on amino acid identity with amino acids 20 to 211 or 20 to 727 of SEQ ID NO: 10.
- The Hel308 helicase may be modified as described in International Application No. PCT/GB2015/051925 (published as WO 2014/013260). In particular, two or more parts on the helicase may be connected to reduce the size of the opening in the polynucleotide domain through which a polynucleotide can unbind from the helicase and wherein the helicase retains its ability to control the movement of the polynucleotide. In Hel308 helicases, the polynucleotide domain and opening can be found between domain 2 (one of the ATPase domains) and domain 4 (the ratchet domain) and domain 2 and domain 5 (the molecular brake). The two or more parts connected in accordance with the invention are preferably (a) any amino acid in domain 2 and any amino acid in domain 4 or (b) any amino acid in domain 2 and any amino acid in domain 5. The amino acid residues which define domains 2, 4 and 5 in various Hel308 helicases are shown in Table 2 below.
-
TABLE 2 Amino acid residues which correspond to domains 2, 4 and 5 in various Hel308 helicases.
SEQ ID NO: | Hel308 Homologue | Domain 2 Start | Domain 2 End | Domain 4 Start | Domain 4 End | Domain 5 Start | Domain 5 End
10 | Mbu | W200 | E409 | Y506 | G669 | S670 | Q760
13 | Pfu | W198 | F398 | Y490 | G640 | I641 | S720
16 | Hvo | W201 | W418 | Y509 | G725 | V726 | E827
19 | Hla | W201 | W418 | Y513 | G725 | V726 | R824
22 | Csy | W205 | G414 | Y504 | G644 | I645 | K705
25 | Sso | W204 | L420 | Y506 | G651 | I652 | S717
28 | Mfr | W193 | E397 | Y488 | G630 | I631 | I684
29 | Mok | W198 | G415 | Y551 | G706 | A707 | I775
32 | Mig | W200 | E408 | Y495 | G632 | A633 | I699
33 | Tga | W198 | R399 | Y491 | G639 | V640 | R720
34 | Tba | W219 | F420 | Y512 | G660 | V661 | K755
37 | Tsi | W221 | L422 | Y514 | G662 | V663 | K744
38 | Mba | W200 | E409 | Y498 | G643 | A644 | Y729
39 | Mac | W200 | E409 | Y499 | G644 | A645 | F730
40 | Mmah | W196 | G405 | Y531 | G678 | A679 | N747
41 | Mmaz | W200 | E409 | Y499 | G644 | A645 | Y730
42 | Mth | W203 | M404 | Y491 | G629 | A630 | A693
43 | Mzh | W200 | N409 | Y505 | G651 | I652 | T739
44 | Mev | W200 | D409 | Y499 | G643 | V644 | F733
45 | Mma | W196 | G405 | Y531 | G678 | A679 | N747
46 | Nma | W201 | W413 | Y541 | G688 | V689 | F799
47 | Mbo | W197 | E402 | Y493 | G637 | I638 | G723
48 | Fac | F197 | T390 | Y480 | G613 | V614 | R681
49 | Mfe | W199 | Q408 | Y494 | G629 | A630 | F696
50 | Mja | W197 | Q406 | Y492 | G627 | A628 | F694
51 | Min | W189 | Q390 | Y476 | G604 | A605 | I670
52 | Mhu | W198 | D402 | Y493 | G637 | V638 | C799
53 | Afu | W201 | F399 | Y487 | G626 | V627 | E696
54 | Htu | W201 | W413 | Y533 | G680 | V681 | F791
55 | Hpa | W201 | W412 | Y502 | G657 | V658 | E752
58 | Hsp (ski2-like helicase) | W210 | Y421 | Y512 | G687 | V688 | S783
- The Hel308 helicase preferably comprises the sequence of Hel308 Mbu (i.e. SEQ ID NO: 10) or a variant thereof. In Hel308 Mbu, the polynucleotide domain and opening can be found between domain 2 (one of the ATPase domains) and domain 4 (the ratchet domain) and domain 2 and domain 5 (the molecular brake). The two or more parts of Hel308 Mbu connected are preferably (a) any amino acid in domain 2 and any amino acid in domain 4 or (b) any amino acid in domain 2 and any amino acid in domain 5. The amino acid residues which define domains 2, 4 and 5 of Hel308 Mbu are shown in Table 2 above.
- The invention may use a mutant Hel308 Mbu protein which comprises a variant of SEQ ID NO: 10 in which E284 and S615 are modified. E284 and S615 are preferably substituted. E284 and S615 are more preferably substituted with cysteine (i.e. E284C and S615C). The variant may differ from SEQ ID NO: 10 at positions other than E284 and S615 as long as E284 and S615 are modified. The variant will preferably be at least 30% homologous to SEQ ID NO: based on amino acid identity as discussed in more detail below. E284 and S615 do not have to be connected. Alternatively, E284 and S615 may be connected.
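The domain boundaries given for Hel308 Mbu in Table 2 can be used to check which domain a given residue falls in. The sketch below is illustrative only; the dictionary and function names are hypothetical, and the boundaries are copied from the Mbu row of Table 2 (domain 2: W200 to E409, domain 4: Y506 to G669, domain 5: S670 to Q760).

```python
# Domain boundaries for Hel308 Mbu (SEQ ID NO: 10), taken from Table 2.
MBU_DOMAINS = {
    "domain 2": (200, 409),
    "domain 4": (506, 669),
    "domain 5": (670, 760),
}


def domain_of(position, domains=MBU_DOMAINS):
    """Return the name of the domain containing a 1-based residue position, or None."""
    for name, (start, end) in domains.items():
        if start <= position <= end:
            return name
    return None


# Positions 284 and 615 are the two residues discussed for Hel308 Mbu.
print(domain_of(284))
print(domain_of(615))
```

With these boundaries, position 284 falls in domain 2 and position 615 in domain 4, which is consistent with option (a) above, an amino acid in domain 2 connected to an amino acid in domain 4.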
- The Hel308 helicase more preferably comprises (a) the sequence of Hel308 Tga (i.e. SEQ ID NO: 33) or a variant thereof, (b) the sequence of Hel308 Csy (i.e. SEQ ID NO: 22) or a variant thereof or (c) the sequence of Hel308 Mhu (i.e. SEQ ID NO: 52) or a variant thereof.
- SEQ ID NO: 10 (Hel308 Mbu) contains five natural cysteine residues. However, all of these residues are located within or around the DNA binding groove of the enzyme. Once a DNA strand is bound within the enzyme, these natural cysteine residues become less accessible for external modifications. This allows specific cysteine mutants of SEQ ID NO: 10 to be designed and attached to the moiety using cysteine linkage as discussed above. Preferred variants of SEQ ID NO: 10 have one or more of the following substitutions: A29C, Q221C, Q442C, T569C, A577C, A700C and S708C. The introduction of a cysteine residue at one or more of these positions facilitates cysteine linkage as discussed above. Other preferred variants of SEQ ID NO: have one or more of the following substitutions: M2Faz, R10Faz, F15Faz, A29Faz, R185Faz, A268Faz, E284Faz, Y387Faz, F400Faz, Y455Faz, E464Faz, E573Faz, A577Faz, E649Faz, A700Faz, Y720Faz, Q442Faz and S708Faz. The introduction of a Faz residue at one or more of these positions facilitates Faz linkage as discussed above.
- The Hel308 helicase is modified by the introduction of one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724 in Hel308 Mbu (SEQ ID NO: 10), wherein the helicase retains its ability to control the movement of a polynucleotide. The one or more cysteine residues and/or one or more non-natural amino acids are preferably introduced by substitution.
- These modifications do not prevent the helicase from binding to a polynucleotide. For instance, the helicase may bind to a polynucleotide via internal nucleotides or at one of its termini. These modifications decrease the ability of the polynucleotide to unbind or disengage from the helicase, particularly from internal nucleotides of the polynucleotide. In other words, the one or more modifications increase the processivity of the Hel308 helicase by preventing dissociation from the polynucleotide strand. The thermal stability of the enzyme is also increased by the one or more modifications giving it an improved structural stability that is beneficial in Strand Sequencing. The modified Hel308 helicases of the invention have all of the advantages and uses discussed above.
- The modified Hel308 helicase has the ability to control the movement of a polynucleotide. This can be measured as discussed above. The modified Hel308 helicase is artificial or non-natural.
- The Hel308 helicase preferably comprises a variant of one of the helicases shown in Table 1 above which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724 in Hel308 Mbu (SEQ ID NO: 10). The Hel308 helicase preferably comprises a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724 in Hel308 Mbu (SEQ ID NO: 10).
- The Hel308 helicase preferably comprises a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D274, E284, E285, E287, S288, T289, G290, E291, N316, K319, S615, K717 or Y720 in Hel308 Mbu (SEQ ID NO: 10).
- Table 3a and 3b below show the positions in other Hel308 helicases which correspond to D274, E284, E285, S288, S615, K717, Y720, E287, T289, G290, E291, N316 and K319 in Hel308 Mbu (SEQ ID NO: 10). For instance, in Hel308 Hvo (SEQ ID NO:16), E283 corresponds to D274 in Hel308 Mbu, E293 corresponds to E284 in Hel308 Mbu, I294 corresponds to E285 in Hel308 Mbu, V297 corresponds to S288 in Hel308 Mbu, D671 corresponds to S615 in Hel308 Mbu, K775 corresponds to K717 in Hel308 Mbu and E778 corresponds to Y720 in Hel308 Mbu. The lack of a corresponding position in another Hel308 helicase is marked as a “-”.
-
TABLE 3a Positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10).
SEQ ID NO: | Hel308 homologue | A | B | C | D | E | F | G
10 | Mbu | D274 | E284 | E285 | S288 | S615 | K717 | Y720
13 | Pfu | L265 | E275 | L276 | S279 | P585 | K690 | E693
16 | Hvo | E283 | E293 | I294 | V297 | D671 | K775 | E778
19 | Hla | E283 | E293 | I294 | G297 | D668 | R775 | E778
22 | Csy | D280 | K290 | I291 | S294 | P589 | T694 | N697
25 | Sso | L281 | K291 | Q292 | D295 | D596 | K702 | Q705
28 | Mfr | H264 | E272 | K273 | A276 | G576 | K678 | E681
29 | Mok | S279 | L289 | S290 | D293 | P649 | K753 | R756
32 | Mig | Y276 | L286 | S287 | D290 | P579 | K679 | K682
33 | Tga | L266 | S276 | L277 | Q280 | P583 | K689 | D692
34 | Tba | L287 | E297 | L298 | S301 | S604 | K710 | E713
37 | Tsi | L289 | Q299 | L300 | G303 | N606 | G712 | E715
38 | Mba | E274 | D284 | E285 | E288 | S589 | K691 | D694
39 | Mac | E274 | D284 | E285 | E288 | P590 | K692 | E695
40 | Mmah | H272 | L282 | S283 | D286 | P621 | K725 | K728
41 | Mmaz | E274 | D284 | E285 | E288 | P590 | K692 | E698
42 | Mth | A269 | L279 | A280 | L283 | H575 | K677 | E680
43 | Mzh | H274 | Q284 | E285 | E288 | P596 | K699 | Q702
44 | Mev | G274 | E284 | E285 | E288 | T590 | K691 | Y694
45 | Mma | H272 | L282 | S283 | D286 | P621 | K725 | K728
46 | Nma | G277 | T287 | E288 | E291 | D634 | K737 | E740
47 | Mbo | A270 | E277 | R278 | E281 | S583 | G685 | E688
48 | Fac | Q264 | F267 | E268 | E271 | P559 | K663 | K666
49 | Mfe | R275 | L285 | S286 | E289 | P576 | K676 | K679
50 | Mja | I273 | L283 | S284 | E287 | P574 | K674 | K677
51 | Min | R257 | L267 | S268 | D271 | P554 | K651 | K654
52 | Mhu | S269 | Q277 | E278 | R281 | S583 | G685 | R688
53 | Afu | K268 | K277 | A278 | E281 | D575 | R677 | E680
54 | Htu | D277 | D287 | D288 | D291 | D626 | K729 | E732
55 | Hpa | D276 | D286 | Q287 | D290 | D595 | K707 | E710
58 | Hsp (ski2-like helicase) | E286 | E296 | I297 | V300 | D633 | A737 | E740
-
TABLE 3b Positions which correspond to E287, T289, G290, E291, N316 and K319 in Hel308 Mbu (SEQ ID NO: 10).
SEQ ID NO: | Hel308 homologue | H | I | J | K | L | M
10 | Mbu | E287 | T289 | G290 | E291 | N316 | K319
13 | Pfu | D278 | L280 | E281 | E282 | D307 | V310
16 | Hvo | D296 | S298 | D299 | T300 | E324 | T327
19 | Hla | S296 | S298 | D299 | T300 | E324 | A327
22 | Csy | S293 | G295 | G296 | E297 | D322 | S325
25 | Sso | D294 | I296 | E297 | E298 | A325 | D328
28 | Mfr | E275 | A277 | A278 | E279 | M304 | T307
29 | Mok | L292 | N294 | P295 | T296 | E320 | K323
32 | Mig | L289 | P291 | P292 | T293 | E317 | K320
33 | Tga | S279 | L281 | E282 | D283 | V308 | T311
34 | Tba | E300 | L302 | E303 | S304 | A329 | T332
37 | Tsi | D302 | L304 | D305 | T306 | T331 | S334
38 | Mba | L287 | N289 | S290 | E291 | P316 | E319
39 | Mac | L287 | N289 | S290 | E291 | P316 | E319
40 | Mmah | L285 | R287 | P288 | V289 | K313 | K316
41 | Mmaz | I287 | N289 | S290 | E291 | P316 | E319
42 | Mth | R282 | S284 | G285 | E286 | E311 | R314
43 | Mzh | G287 | A289 | G290 | E291 | E316 | R319
44 | Mev | L287 | T289 | S290 | D291 | A316 | K319
45 | Mma | L285 | R287 | P288 | V289 | K313 | K316
46 | Nma | R290 | D292 | S293 | D294 | T319 | S322
47 | Mbo | L280 | G282 | T283 | P284 | K309 | S312
48 | Fac | L270 | I272 | P273 | P274 | D299 | T302
49 | Mfe | L288 | P290 | P291 | T292 | Q316 | K319
50 | Mja | L286 | P288 | P289 | T290 | Q314 | K317
51 | Min | F270 | P272 | P273 | T274 | E298 | K301
52 | Mhu | R280 | L282 | R283 | D284 | Q309 | T312
53 | Afu | L280 | E282 | N283 | E284 | G309 | R312
54 | Htu | R290 | D292 | S293 | D294 | T319 | S322
55 | Hpa | R289 | V291 | S292 | D293 | D318 | S321
58 | Hsp (ski2-like helicase) | G299 | S301 | D302 | T303 | E327 | E330
- The Hel308 helicase more preferably comprises a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). The relevant positions are shown in columns A to G in Table 3a above.
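The correspondence in Table 3a amounts to a column-wise lookup: each homologue row lists, in columns A to G, the residue that corresponds to a given Hel308 Mbu position. A minimal sketch follows, using only the Mbu, Pfu and Hvo rows of Table 3a; the variable and function names are hypothetical.

```python
# Columns A to G of Table 3a, expressed as the Hel308 Mbu reference positions.
MBU_REFERENCE = ("D274", "E284", "E285", "S288", "S615", "K717", "Y720")

# Three rows copied from Table 3a (other homologues omitted for brevity).
TABLE_3A = {
    "Mbu": ("D274", "E284", "E285", "S288", "S615", "K717", "Y720"),
    "Pfu": ("L265", "E275", "L276", "S279", "P585", "K690", "E693"),
    "Hvo": ("E283", "E293", "I294", "V297", "D671", "K775", "E778"),
}


def corresponding_position(homologue, mbu_position):
    """Return the residue in `homologue` that corresponds to a Hel308 Mbu position."""
    column = MBU_REFERENCE.index(mbu_position)  # raises ValueError if not in columns A-G
    return TABLE_3A[homologue][column]


print(corresponding_position("Hvo", "S615"))
print(corresponding_position("Pfu", "E284"))
```

For Hel308 Hvo this returns D671 for S615, matching the worked example given in the text for that homologue.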
- The helicase may comprise a cysteine residue at one, two, three, four, five, six or seven of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of these positions may be substituted with cysteine. For instance, for each row of Table 3a above, the helicase of the invention may comprise a cysteine at any of the following combinations of the positions labelled A to G in that row: {A}, {B}, {C}, {D}, {G}, {E}, {F}, {A and B}, {A and C}, {A and D}, {A and G}, {A and E}, {A and F}, {B and C}, {B and D}, {B and G}, {B and E}, {B and F}, {C and D}, {C and G}, {C and E}, {C and F}, {D and G}, {D and E}, {D and F}, {G and E}, {G and F}, {E and F}, {A, B and C}, {A, B and D}, {A, B and G}, {A, B and E}, {A, B and F}, {A, C and D}, {A, C and G}, {A, C and E}, {A, C and F}, {A, D and G}, {A, D and E}, {A, D and F}, {A, G and E}, {A, G and F}, {A, E and F}, {B, C and D}, {B, C and G}, {B, C and E}, {B, C and F}, {B, D and G}, {B, D and E}, {B, D and F}, {B, G and E}, {B, G and F}, {B, E and F}, {C, D and G}, {C, D and E}, {C, D and F}, {C, G and E}, {C, G and F}, {C, E and F}, {D, G and E}, {D, G and F}, {D, E and F}, {G, E and F}, {A, B, C and D}, {A, B, C and G}, {A, B, C and E}, {A, B, C and F}, {A, B, D and G}, {A, B, D and E}, {A, B, D and F}, {A, B, G and E}, {A, B, G and F}, {A, B, E and F}, {A, C, D and G}, {A, C, D and E}, {A, C, D and F}, {A, C, G and E}, {A, C, G and F}, {A, C, E and F}, {A, D, G and E}, {A, D, G and F}, {A, D, E and F}, {A, G, E and F}, {B, C, D and G}, {B, C, D and E}, {B, C, D and F}, {B, C, G and E}, {B, C, G and F}, {B, C, E and F}, {B, D, G and E}, {B, D, G and F}, {B, D, E and F}, {B, G, E and F}, {C, D, G and E}, {C, D, G and F}, {C, D, E and F}, {C, G, E and F}, {D, G, E and F}, {A, B, C, D and G}, {A, B, C, D and E}, {A, B, C, D and F}, {A, B, C, G and E}, {A, B, C, G and F}, {A, B, C, E and F}, {A, B, D, G and E}, {A, B, D, G and F}, {A, B, D, E and F}, {A, B, G, E and F}, {A, C, D, G and E}, {A, C, D, G and F}, {A, C, D, E and F}, {A, C, G, E and F}, {A, D, G, E and F}, {B, C, D, G and E}, {B, C, D, G and F}, {B, C, D, E and F}, {B, C, G, E and F}, {B, D, G, E and F}, {C, D, G, E and F}, {A, B, C, D, G and E}, {A, B, C, D, G and F}, {A, B, C, D, E and F}, {A, B, C, G, E and F}, {A, B, D, G, E and F}, {A, C, D, G, E and F}, {B, C, D, G, E and F}, or {A, B, C, D, G, E and F}.
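The combinations enumerated above are the non-empty subsets of the seven column labels. As an illustrative sketch (the function name is hypothetical, and it assumes the enumeration is meant to be exhaustive), the same sets can be generated programmatically rather than listed by hand:

```python
from itertools import combinations


def position_combinations(labels):
    """Return every non-empty combination of the given position labels."""
    result = []
    for size in range(1, len(labels) + 1):
        result.extend(combinations(labels, size))
    return result


seven = position_combinations("ABCDEFG")  # columns A to G of Table 3a
five = position_combinations("ABCDE")     # columns A to E only

print(len(seven))  # 2**7 - 1 non-empty subsets
print(len(five))   # 2**5 - 1 non-empty subsets
```

Each tuple is an unordered set of columns, so `('A', 'B', 'G')` here stands for the entry written {A, B and G} in the text.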
- The helicase may comprise a non-natural amino acid, such as Faz, at one, two, three, four, five, six or seven of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of these positions may be substituted with a non-natural amino acid, such as Faz. For instance, for each row of Table 3a above, the helicase of the invention may comprise a non-natural amino acid, such as Faz, at any of the combinations of the positions labelled A to G above.
- The helicase may comprise a combination of one or more cysteines and one or more non-natural amino acids, such as Faz, at two or more of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of one or more cysteine residues and one or more non-natural amino acids, such as Faz, may be present at the relevant positions. For instance, for each row of Table 3a and 3b above, the helicase of the invention may comprise one or more cysteines and one or more non-natural amino acids, such as Faz, at any of the combinations of the positions labelled A to G above.
- The Hel308 helicase more preferably comprises a variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D274, E284, E285, S288 and S615 in Hel308 Mbu (SEQ ID NO: 10). The relevant positions are shown in columns A to E in Table 3a above.
- The helicase may comprise a cysteine residue at one, two, three, four or five, six or seven of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of these positions may be substituted with cysteine. For instance, for each row of Table 3a above, the helicase of the invention may comprise a cysteine at any of the following combinations of the positions labelled A to E in that row: {A}, {B}, {C}, {D}, {E}, {A and B}, {A and C}, {A and D}, {A and E}, {B and C}, {B and D}, {B and E}, {C and D}, {C and E}, {D and E}, {A, B and C}, {A, B and D}, {A, B and E}, {A, C and D}, {A, C and E}, {A, D and E}, {B, C and D}, {B, C and E}, {B, D and E}, {C, D and E}, {A, B, C and D}, {A, B, C and E}, {A, B, D and E}, {A, C, D and E}, {B, C, D and E} or {A, B, C, D and E}.
- The helicase may comprise a non-natural amino acid, such as Faz, at one, two, three, four or five of the positions which correspond to D274, E284, E285, S288, S615, K717 and Y720 in Hel308 Mbu (SEQ ID NO: 10). Any combination of these positions may be substituted with a non-natural amino acid, such as Faz. For instance, for each row of Table 3a above, the helicase of the invention may comprise a non-natural amino acid, such as Faz, at any of the combinations of the positions labelled A to E above.
- The helicase may comprise a combination of one or more cysteines and one or more non-natural amino acids, such as Faz, at two or more of the positions which correspond to D274, E284, E285, S288 and S615 in Hel308 Mbu (SEQ ID NO: 10). Any combination of one or more cysteine residues and one or more non-natural amino acids, such as Faz, may be present at the relevant positions. For instance, for each row of Table 3a above, the helicase of the invention may comprise one or more cysteines and one or more non-natural amino acids, such as Faz, at any of the combinations of the positions labelled A to E above.
- The Hel308 helicase preferably comprises a variant of the sequence of Hel308 Mbu (i.e. SEQ ID NO: 10) which comprises one or more cysteine residues and/or one or more non-natural amino acids at D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724. The variant preferably comprises D272C, N273C, D274C, G281C, E284C, E285C, E287C, S288C, T289C, G290C, E291C, D293C, T294C, N300C, R303C, K304C, N314C, S315C, N316C, H317C, R318C, K319C, L320C, E322C, R326C, N328C, S615C, K717C, Y720C, N721C or S724C. The variant preferably comprises D272Faz, N273Faz, D274Faz, G281Faz, E284Faz, E285Faz, E287Faz, S288Faz, T289Faz, G290Faz, E291Faz, D293Faz, T294Faz, N300Faz, R303Faz, K304Faz, N314Faz, S315Faz, N316Faz, H317Faz, R318Faz, K319Faz, L320Faz, E322Faz, R326Faz, N328Faz, S615Faz, K717Faz, Y720Faz, N721Faz or S724Faz.
- The Hel308 helicase preferably comprises a variant of the sequence of Hel308 Mbu (i.e. SEQ ID NO: 10) which comprises one or more cysteine residues and/or one or more non-natural amino acids at D274, E284, E285, S288, S615, K717 and Y720. The helicase of the invention may comprise one or more cysteines, one or more non-natural amino acids, such as Faz, or a combination thereof at any of the combinations of the positions labelled A to G above.
- The Hel308 helicase preferably comprises a variant of the sequence of Hel308 Mbu (i.e. SEQ ID NO: 10) which comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of D274, E284, E285, S288 and S615. For instance, for Hel308 Mbu (SEQ ID NO: 10), the helicase of the invention may comprise a cysteine or a non-natural amino acid, such as Faz, at any of the following combinations of positions: {D274}, {E284}, {E285}, {S288}, {S615}, {D274 and E284}, {D274 and E285}, {D274 and S288}, {D274 and S615}, {E284 and E285}, {E284 and S288}, {E284 and S615}, {E285 and S288}, {E285 and S615}, {S288 and S615}, {D274, E284 and E285}, {D274, E284 and S288}, {D274, E284 and S615}, {D274, E285 and S288}, {D274, E285 and S615}, {D274, S288 and S615}, {E284, E285 and S288}, {E284, E285 and S615}, {E284, S288 and S615}, {E285, S288 and S615}, {D274, E284, E285 and S288}, {D274, E284, E285 and S615}, {D274, E284, S288 and S615}, {D274, E285, S288 and S615}, {E284, E285, S288 and S615} or {D274, E284, E285, S288 and S615}.
- The helicase preferably comprises a variant of SEQ ID NO: 10 which comprises (a) E284C and S615C, (b) E284Faz and S615Faz, (c) E284C and S615Faz or (d) E284Faz and S615C.
- The helicase more preferably comprises the sequence shown in SEQ ID NO: 10 with E284C and S615C.
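Substitutions written in the form used above (wild-type residue, position, replacement, as in E284C or S615Faz) can be applied to a sequence string mechanically. The sketch below is illustrative only: the function name is hypothetical, the toy sequence is not SEQ ID NO: 10, and writing a non-natural amino acid as a bracketed token such as `[Faz]` is an assumed convention for this example.

```python
import re

# Wild-type one-letter code, 1-based position, then a one-letter code or a
# name such as "Faz" for a non-natural amino acid.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Za-z]+)$")


def apply_substitutions(sequence, mutations):
    """Apply point substitutions, checking each wild-type residue first."""
    residues = list(sequence)
    for mutation in mutations:
        match = MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if position < 1 or position > len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        # Multi-letter names (for example Faz) are kept as bracketed tokens.
        residues[position - 1] = replacement if len(replacement) == 1 else f"[{replacement}]"
    return "".join(residues)


# Hypothetical five-residue toy sequence: M1 A2 E3 S4 K5.
print(apply_substitutions("MAESK", ["E3C", "S4Faz"]))
```

Checking the wild-type residue before substituting guards against off-by-one errors when positions are taken from a different numbering scheme or homologue.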
- Preferred non-natural amino acids for use in the invention include, but are not limited to, 4-Azido-L-phenylalanine (Faz), 4-Acetyl-L-phenylalanine, 3-Acetyl-L-phenylalanine, 4-Acetoacetyl-L-phenylalanine, O-Allyl-L-tyrosine, 3-(Phenylselanyl)-L-alanine, O-2-Propyn-1-yl-L-tyrosine, 4-(Dihydroxyboryl)-L-phenylalanine, 4-[(Ethylsulfanyl)carbonyl]-L-phenylalanine, (2S)-2-amino-3-{4-[(propan-2-ylsulfanyl)carbonyl]phenyl}propanoic acid, (2S)-2-amino-3-{4-[(2-amino-3-sulfanylpropanoyl)amino]phenyl}propanoic acid, O-Methyl-L-tyrosine, 4-Amino-L-phenylalanine, 4-Cyano-L-phenylalanine, 3-Cyano-L-phenylalanine, 4-Fluoro-L-phenylalanine, 4-Iodo-L-phenylalanine, 4-Bromo-L-phenylalanine, O-(Trifluoromethyl)tyrosine, 4-Nitro-L-phenylalanine, 3-Hydroxy-L-tyrosine, 3-Amino-L-tyrosine, 3-Iodo-L-tyrosine, 4-Isopropyl-L-phenylalanine, 3-(2-Naphthyl)-L-alanine, 4-Phenyl-L-phenylalanine, (2S)-2-amino-3-(naphthalen-2-ylamino)propanoic acid, 6-(Methylsulfanyl)norleucine, 6-Oxo-L-lysine, D-tyrosine, (2R)-2-Hydroxy-3-(4-hydroxyphenyl)propanoic acid, (2R)-2-Ammoniooctanoate, 3-(2,2′-Bipyridin-5-yl)-D-alanine, 2-amino-3-(8-hydroxy-3-quinolyl)propanoic acid, 4-Benzoyl-L-phenylalanine, S-(2-Nitrobenzyl)cysteine, (2R)-2-amino-3-[(2-nitrobenzyl)sulfanyl]propanoic acid, (2S)-2-amino-3-[(2-nitrobenzyl)oxy]propanoic acid, O-(4,5-Dimethoxy-2-nitrobenzyl)-L-serine, (2S)-2-amino-6-({[(2-nitrobenzyl)oxy]carbonyl}amino)hexanoic acid, O-(2-Nitrobenzyl)-L-tyrosine, 2-Nitrophenylalanine, 4-[(E)-Phenyldiazenyl]-L-phenylalanine, 4-[3-(Trifluoromethyl)-3H-diaziren-3-yl]-D-phenylalanine, 2-amino-3-[[5-(dimethylamino)-1-naphthyl]sulfonylamino]propanoic acid, (2S)-2-amino-4-(7-hydroxy-2-oxo-2H-chromen-4-yl)butanoic acid, (2S)-3-[(6-acetylnaphthalen-2-yl)amino]-2-aminopropanoic acid, 4-(Carboxymethyl)phenylalanine, 3-Nitro-L-tyrosine, O-Sulfo-L-tyrosine, (2R)-6-Acetamido-2-ammoniohexanoate, 1-Methylhistidine, 2-Aminononanoic acid, 2-Aminodecanoic acid, L-Homocysteine, 5-Sulfanylnorvaline, 6-Sulfanyl-L-norleucine, 5-(Methylsulfanyl)-L-norvaline, N6-{[(2R,3R)-3-Methyl-3,4-dihydro-2H-pyrrol-2-yl]carbonyl}-L-lysine, N6-[(Benzyloxy)carbonyl]lysine, (2S)-2-amino-6-[(cyclopentylcarbonyl)amino]hexanoic acid, N6-[(Cyclopentyloxy)carbonyl]-L-lysine, (2S)-2-amino-6-({[(2R)-tetrahydrofuran-2-ylcarbonyl]amino}hexanoic acid, (2S)-2-amino-8-[(2R,3S)-3-ethynyltetrahydrofuran-2-yl]-8-oxooctanoic acid, N6-(tert-Butoxycarbonyl)-L-lysine, (2S)-2-Hydroxy-6-({[(2-methyl-2-propanyl)oxy]carbonyl}amino)hexanoic acid, N6-[(Allyloxy)carbonyl]lysine, (2S)-2-amino-6-({[(2-azidobenzyl)oxy]carbonyl}amino)hexanoic acid, N6-L-Prolyl-L-lysine, (2S)-2-amino-6-{[(prop-2-yn-1-yloxy)carbonyl]amino}hexanoic acid and N6-[(2-Azidoethoxy)carbonyl]-L-lysine.
- The most preferred non-natural amino acid is 4-azido-L-phenylalanine (Faz).
- As discussed above, a variant of a Hel308 helicase is an enzyme that has an amino acid sequence which varies from that of the wild-type helicase and which retains polynucleotide binding activity. A variant of one of SEQ ID NOs: 10, 13, 16, 19, 22, 25, 28, 29, 32, 33, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 and 58 may comprise additional modifications as long as it comprises one or more cysteine residues and/or one or more non-natural amino acids at one or more of the positions which correspond to D272, N273, D274, G281, E284, E285, E287, S288, T289, G290, E291, D293, T294, N300, R303, K304, N314, S315, N316, H317, R318, K319, L320, E322, R326, N328, S615, K717, Y720, N721 and S724 in Hel308 Mbu (SEQ ID NO: 10). Suitable modifications and variants are discussed above with reference to the embodiments with two or more parts connected.
- A variant may comprise the mutations in domain 5 disclosed in Woodman et al. (J. Mol. Biol. (2007) 374, 1139-1144). These mutations correspond to R685A, R687A and R689A in SEQ ID NO: 10.
- The two or more parts may be connected in any way. The connection can be transient, for example non-covalent. Even transient connection will reduce the size of the opening and reduce unbinding of the polynucleotide from the helicase through the opening.
- The two or more parts are preferably connected by affinity molecules. Suitable affinity molecules are known in the art. The affinity molecules are preferably (a) complementary polynucleotides (International Application No. PCT/GB10/000132 (published as WO 2010/086602)), (b) an antibody or a fragment thereof and the complementary epitope (Biochemistry 6th Ed, W. H. Freeman and co (2007) pp 953-954), (c) peptide zippers (O'Shea et al., Science 254 (5031): 539-544), (d) capable of interacting by β-sheet augmentation (Remaut and Waksman Trends Biochem. Sci. (2006) 31 436-444), (e) capable of hydrogen bonding, pi-stacking or forming a salt bridge, (f) rotaxanes (Xiang Ma and He Tian Chem. Soc. Rev., 2010, 39, 70-80), (g) an aptamer and the complementary protein (James, W. in Encyclopedia of Analytical Chemistry, R. A. Meyers (Ed.) pp. 4848-4871 John Wiley & Sons Ltd, Chichester, 2000) or (h) half-chelators (Hammerstein et al. J Biol Chem. 2011 Apr. 22; 286(16): 14324-14334). For (e), hydrogen bonding occurs between a proton bound to an electronegative atom and another electronegative atom. Pi-stacking requires two aromatic rings that can stack together where the planes of the rings are parallel. Salt bridges are between groups that can delocalize their electrons over several atoms, e.g. between aspartate and arginine.
- The two or more parts may be transiently connected by a hexa-his tag or Ni-NTA. The two or more parts may also be modified such that they transiently connect to each other.
- The two or more parts are preferably permanently connected. In the context of the invention, a connection is permanent if it is not broken while the helicase is used or cannot be broken without intervention on the part of the user, such as using reduction to open -S-S- bonds.
- The two or more parts are preferably covalently attached. The two or more parts may be covalently attached using any method known in the art.
- The two or more parts may be covalently attached via their naturally occurring amino acids, such as cysteines, threonines, serines, aspartates, asparagines, glutamates and glutamines.
- Naturally occurring amino acids may be modified to facilitate attachment. For instance, the naturally occurring amino acids may be modified by acylation, phosphorylation, glycosylation or farnesylation. Other suitable modifications are known in the art. Modifications to naturally occurring amino acids may be post-translational modifications. The two or more parts may be attached via amino acids that have been introduced into their sequences. Such amino acids are preferably introduced by substitution. The introduced amino acid may be cysteine or a non-natural amino acid that facilitates attachment. Suitable non-natural amino acids include, but are not limited to, 4-azido-L-phenylalanine (Faz), any one of the amino acids numbered 1-71 included in FIG. 1 of Liu C. C. and Schultz P. G., Annu. Rev. Biochem., 2010, 79, 413-444 or any one of the amino acids listed below. The introduced amino acids may be modified as discussed above.
- In a preferred embodiment, the two or more parts are connected using linkers. Linker molecules are discussed in more detail below. One suitable method of connection is cysteine linkage. This is discussed in more detail below. The two or more parts are preferably connected using one or more, such as two or three, linkers. The one or more linkers may be designed to reduce the size of, or close, the opening as discussed above. If one or more linkers are being used to close the opening as discussed above, at least a part of the one or more linkers is preferably oriented such that it is not parallel to the polynucleotide when it is bound by the helicase. More preferably, all of the linkers are oriented in this manner. If one or more linkers are being used to close the opening as discussed above, at least a part of the one or more linkers preferably crosses the opening in an orientation that is not parallel to the polynucleotide when it is bound by the helicase. More preferably, all of the linkers cross the opening in this manner. In these embodiments, at least a part of the one or more linkers may be perpendicular to the polynucleotide. Such orientations effectively close the opening such that the polynucleotide cannot unbind from the helicase through the opening.
- Each linker may have two or more functional ends, such as two, three or four functional ends. Suitable configurations of ends in linkers are well known in the art.
- One or more ends of the one or more linkers are preferably covalently attached to the helicase. If one end is covalently attached, the one or more linkers may transiently connect the two or more parts as discussed above. If both or all ends are covalently attached, the one or more linkers permanently connect the two or more parts.
- At least one of the two or more parts is preferably modified to facilitate the attachment of the one or more linkers. Any modification may be made. The linkers may be attached to one or more reactive cysteine residues, reactive lysine residues or non-natural amino acids in the two or more parts. The non-natural amino acid may be any of those discussed above. The non-natural amino acid is preferably 4-azido-L-phenylalanine (Faz). At least one amino acid in the two or more parts is preferably substituted with cysteine or a non-natural amino acid, such as Faz.
- The one or more linkers are preferably amino acid sequences and/or chemical crosslinkers.
- Suitable amino acid linkers, such as peptide linkers, are known in the art. The length, flexibility and hydrophilicity of the amino acid or peptide linker are typically designed such that it reduces the size of the opening, but does not disturb the functions of the helicase. Preferred flexible peptide linkers are stretches of 2 to 20, such as 4, 6, 8, 10 or 16, serine and/or glycine amino acids. More preferred flexible linkers include (SG)1, (SG)2, (SG)3, (SG)4, (SG)5, (SG)8, (SG)10, (SG)15 or (SG)20 wherein S is serine and G is glycine. Preferred rigid linkers are stretches of 2 to 30, such as 4, 6, 8, 16 or 24, proline amino acids. More preferred rigid linkers include (P)12 wherein P is proline. The amino acid sequence of a linker preferably comprises a polynucleotide binding moiety. Such moieties and the advantages associated with their use are discussed below.
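- By way of illustration only, the repeat notation used above for peptide linkers can be expanded into one-letter amino acid sequences as in the following sketch. The function name `repeat_linker` and the dictionary layout are assumptions made for this example and are not part of the disclosure.

```python
def repeat_linker(unit: str, n: int) -> str:
    """Expand repeat notation such as (SG)3 into a one-letter sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return unit * n


# Repeat counts of the more preferred flexible serine/glycine linkers.
FLEXIBLE_REPEATS = (1, 2, 3, 4, 5, 8, 10, 15, 20)
flexible_linkers = {f"(SG){n}": repeat_linker("SG", n) for n in FLEXIBLE_REPEATS}

# The more preferred rigid proline linker, (P)12.
rigid_linker = repeat_linker("P", 12)

print(flexible_linkers["(SG)3"])
print(rigid_linker)
```

Expanding the notation in this way gives the literal residue string that would be placed between the two connected parts when designing a construct.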
- Suitable chemical crosslinkers are well-known in the art. Suitable chemical crosslinkers include, but are not limited to, those including the following functional groups: maleimide, active esters, succinimide, azide, alkyne (such as dibenzocyclooctynol (DIBO or DBCO), difluoro cycloalkynes and linear alkynes), phosphine (such as those used in traceless and non-traceless Staudinger ligations), haloacetyl (such as iodoacetamide), phosgene type reagents, sulfonyl chloride reagents, isothiocyanates, acyl halides, hydrazines, disulphides, vinyl sulfones, aziridines and photoreactive reagents (such as aryl azides, diaziridines).
- Reactions between amino acids and functional groups may be spontaneous, such as cysteine/maleimide, or may require external reagents, such as Cu(I) for linking azide and linear alkynes.
- Linkers can comprise any molecule that stretches across the distance required. Linkers can vary in length from one carbon (phosgene-type linkers) to many Angstroms. Examples of linear molecules include, but are not limited to, polyethyleneglycols (PEGs), polypeptides, polysaccharides, deoxyribonucleic acid (DNA), peptide nucleic acid (PNA), threose nucleic acid (TNA), glycerol nucleic acid (GNA), saturated and unsaturated hydrocarbons and polyamides. These linkers may be inert or reactive; in particular, they may be chemically cleavable at a defined position, or may themselves be modified with a fluorophore or ligand. The linker is preferably resistant to dithiothreitol (DTT).
- Preferred crosslinkers include 2,5-dioxopyrrolidin-1-yl 3-(pyridin-2-yldisulfanyl)propanoate, 2,5-dioxopyrrolidin-1-yl 4-(pyridin-2-yldisulfanyl)butanoate and 2,5-dioxopyrrolidin-1-yl 8-(pyridin-2-yldisulfanyl)octanoate, di-maleimide PEG 1k, di-maleimide PEG 3.4k, di-maleimide PEG 5k, di-maleimide PEG 10k, bis(maleimido)ethane (BMOE), bis-maleimidohexane (BMH), 1,4-bis-maleimidobutane (BMB), 1,4-bis-maleimidyl-2,3-dihydroxybutane (BMDB), BM[PEO]2 (1,8-bis-maleimidodiethyleneglycol), BM[PEO]3 (1,11-bis-maleimidotriethylene glycol), tris[2-maleimidoethyl]amine (TMEA), dithiobismaleimidoethane (DTME), bis-maleimide PEG3, bis-maleimide PEG11, DBCO-maleimide, DBCO-PEG4-maleimide, DBCO-PEG4-NH2, DBCO-PEG4-NHS, DBCO-NHS, DBCO-PEG-DBCO 2.8 kDa, DBCO-PEG-DBCO 4.0 kDa, DBCO-15 atoms-DBCO, DBCO-26 atoms-DBCO, DBCO-35 atoms-DBCO, DBCO-PEG4-S-S-PEG3-biotin, DBCO-S-S-PEG3-biotin, DBCO-S-S-PEG11-biotin, succinimidyl 3-(2-pyridyldithio)propionate (SPDP) and maleimide-PEG (2 kDa)-maleimide (ALPHA,OMEGA-BIS-MALEIMIDO POLY (ETHYLENE GLYCOL)). The most preferred crosslinker is maleimide-propyl-SRDFWRS-(1,2-diaminoethane)-propyl-maleimide as used in the Examples.
- The one or more linkers may be cleavable. This is discussed in more detail below.
- The two or more parts may be connected using two different linkers that are specific for each other. One of the linkers is attached to one part and the other is attached to another part. The linkers should react to form a modified helicase of the invention. The two or more parts may be connected using the hybridization linkers described in International Application No. PCT/GB10/000132 (published as WO 2010/086602). In particular, the two or more parts may be connected using two or more linkers each comprising a hybridizable region and a group capable of forming a covalent bond. The hybridizable regions in the linkers hybridize and link the two or more parts. The linked parts are then coupled via the formation of covalent bonds between the groups. Any of the specific linkers disclosed in International Application No. PCT/GB10/000132 (published as WO 2010/086602) may be used in accordance with the invention.
- The two or more parts may be modified and then attached using a chemical crosslinker that is specific for the two modifications. Any of the crosslinkers discussed above may be used.
- The linkers may be labeled. Suitable labels include, but are not limited to, fluorescent molecules (such as Cy3 or AlexaFluor®555), radioisotopes, e.g. 125I, 35S, enzymes, antibodies, antigens, polynucleotides and ligands such as biotin. Such labels allow the amount of linker to be quantified. The label could also be a cleavable purification tag, such as biotin, or a specific sequence to show up in an identification method, such as a peptide that is not present in the protein itself, but that is released by trypsin digestion.
- A preferred method of connecting the two or more parts is via cysteine linkage. This can be mediated by a bi-functional chemical crosslinker or by an amino acid linker with a terminal presented cysteine residue. Linkage can occur via natural cysteines in the helicase. Alternatively, cysteines can be introduced into the two or more parts of the helicase. If the two or more parts are connected via cysteine linkage, the one or more cysteines have preferably been introduced to the two or more parts by substitution.
- The length, reactivity, specificity, rigidity and solubility of any bi-functional linker may be designed to ensure that the size of the opening is reduced sufficiently and the function of the helicase is retained. Suitable linkers include bismaleimide crosslinkers, such as 1,4-bis(maleimido)butane (BMB) or bis(maleimido)hexane. One drawback of bi-functional linkers is the requirement of the helicase to contain no further surface accessible cysteine residues if attachment at specific sites is preferred, as binding of the bi-functional linker to surface accessible cysteine residues may be difficult to control and may affect substrate binding or activity. If the helicase does contain several accessible cysteine residues, modification of the helicase may be required to remove them while ensuring the modifications do not affect the folding or activity of the helicase. This is discussed in International Application No. PCT/GB10/000133 (published as WO 2010/086603). The reactivity of cysteine residues may be enhanced by modification of the adjacent residues, for example on a peptide linker. For instance, the basic groups of flanking arginine, histidine or lysine residues will change the pKa of the cysteine's thiol group to that of the more reactive S⁻ group. The reactivity of cysteine residues may be protected by thiol protective groups such as 5,5′-dithiobis-(2-nitrobenzoic acid) (dTNB). These may be reacted with one or more cysteine residues of the helicase before a linker is attached. Selective deprotection of surface accessible cysteines may be possible using reducing reagents immobilized on beads (for example immobilized tris(2-carboxyethyl) phosphine, TCEP). Cysteine linkage of the two or more parts is discussed in more detail below.
- Another preferred method of attaching the two or more parts is via 4-azido-L-phenylalanine (Faz) linkage. This can be mediated by a bi-functional chemical linker or by a polypeptide linker with a terminal presented Faz residue. The one or more Faz residues have preferably been introduced to the helicase by substitution. Faz linkage of two or more helicases is discussed in more detail below.
- The helicase is preferably a RecD helicase. Any RecD helicase may be used in accordance with the invention. The structures of RecD helicases are known in the art (FEBS J. 2008 April; 275(8):1835-51. Epub 2008 Mar. 9. ATPase activity of RecD is essential for growth of the Antarctic Pseudomonas syringae Lz4W at low temperature. Satapathy A K, Pavankumar T L, Bhattacharjya S, Sankaranarayanan R, Ray M K; FEMS Microbiol Rev. 2009 May; 33(3):657-87. The diversity of conjugative relaxases and its application in plasmid classification. Garcillán-Barcia M P, Francia M V, de la Cruz F; J Biol Chem. 2011 Apr. 8; 286(14):12670-82. Epub 2011 Feb. 2. Functional characterization of the multidomain F plasmid TraI relaxase-helicase. Cheng Y, McNamara D E, Miley M J, Nash R P, Redinbo M R).
- The RecD helicase typically comprises the amino acid motif X1-X2-X3-G-X4-X5-X6-X7 (hereinafter called the RecD-like motif I; SEQ ID NO: 59), wherein X1 is G, S or A, X2 is any amino acid, X3 is P, A, S or G, X4 is T, A, V, S or C, X5 is G or A, X6 is K or R and X7 is T or S. X1 is preferably G. X2 is preferably G, I, Y or A. X2 is more preferably G. X3 is preferably P or A. X4 is preferably T, A, V or C. X4 is more preferably T, V or C. X5 is preferably G. X6 is preferably K. X7 is preferably T or S. The RecD helicase preferably comprises Q-(X8)16-18-X1-X2-X3-G-X4-X5-X6-X7 (hereinafter called the extended RecD-like motif I; SEQ ID NOs: 60, 61 and 62), wherein X1 to X7 are as defined above and X8 is any amino acid. There are preferably 16 X8 residues (i.e. (X8)16) in the extended RecD-like motif I (SEQ ID NO: 60). Suitable sequences for (X8)16 can be identified in SEQ ID NOs: 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47 and 50 of U.S. Patent Application No. 61/581,332 and SEQ ID NOs: 18, 21, 24, 25, 28, 30, 32, 35, 37, 39, 41, 42 and 44 of International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
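- As a non-limiting sketch, the RecD-like motif I and the extended RecD-like motif I defined above can be written as regular expressions and used to scan a protein sequence. The example assumes sequences are given as upper-case one-letter codes without gaps; the names `RECD_LIKE_MOTIF_I`, `EXTENDED_RECD_LIKE_MOTIF_I` and `find_motif` are hypothetical. The three test strings are the RecD-like motifs I listed for the helicases of Table 4.

```python
import re

# RecD-like motif I: X1-X2-X3-G-X4-X5-X6-X7
#   X1 = G/S/A, X2 = any, X3 = P/A/S/G, X4 = T/A/V/S/C,
#   X5 = G/A, X6 = K/R, X7 = T/S
RECD_LIKE_MOTIF_I = re.compile(r"[GSA].[PASG]G[TAVSC][GA][KR][TS]")

# Extended RecD-like motif I: Q-(X8)16-18 followed by RecD-like motif I.
EXTENDED_RECD_LIKE_MOTIF_I = re.compile(
    r"Q.{16,18}[GSA].[PASG]G[TAVSC][GA][KR][TS]"
)


def find_motif(pattern, sequence):
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


# RecD-like motifs I quoted for TraI Eco, TrwC Cba/TrwC Eli and TrwC Hne.
for motif in ("GYAGVGKT", "GIAGAGKS", "GAAGAGKT"):
    print(motif, bool(RECD_LIKE_MOTIF_I.fullmatch(motif)))
```

A regular expression captures only the position-by-position residue constraints; it does not indicate whether a matching protein actually has helicase activity.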
- The RecD helicase preferably comprises the amino acid motif G-G-P-G-Xa-G-K-Xb (hereinafter called the RecD motif I; SEQ ID NO: 63) wherein Xa is T, V or C and Xb is T or S. Xa is preferably T. Xb is preferably T. The RecD helicase preferably comprises the sequence G-G-P-G-T-G-K-T (SEQ ID NO: 64). The RecD helicase more preferably comprises the amino acid motif Q-(X8)16-18-G-G-P-G-Xa-G-K-Xb (hereinafter called the extended RecD motif I; SEQ ID NOs: 65, 66 and 67), wherein Xa and Xb are as defined above and X8 is any amino acid. There are preferably 16 X8 residues (i.e. (X8)16) in the extended RecD motif I (SEQ ID NO: 65). Suitable sequences for (X8)16 can be identified in SEQ ID NOs: 14, 17, 20, 23, 26, 29, 32, 35, 38, 41, 44, 47 and 50 of U.S. Patent Application No. 61/581,332 and SEQ ID NOs: 18, 21, 24, 25, 28, 30, 32, 35, 37, 39, 41, 42 and 44 of International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
- The RecD helicase typically comprises the amino acid motif X1-X2-X3-X4-X5-(X6)3-Q-X7 (hereinafter called the RecD-like motif V; SEQ ID NO: 68), wherein X1 is Y, W or F, X2 is A, T, S, M, C or V, X3 is any amino acid, X4 is T, N or S, X5 is A, T, G, S, V or I, X6 is any amino acid and X7 is G or S. X1 is preferably Y. X2 is preferably A, M, C or V. X2 is more preferably A. X3 is preferably I, M or L. X3 is more preferably I or L. X4 is preferably T or S. X4 is more preferably T. X5 is preferably A, V or I. X5 is more preferably V or I. X5 is most preferably V. (X6)3 is preferably H-K-S, H-M-A, H-G-A or H-R-S. (X6)3 is more preferably H-K-S. X7 is preferably G. The RecD helicase preferably comprises the amino acid motif Xa-Xb-Xc-Xd-Xe-H-K-S-Q-G (hereinafter called the RecD motif V; SEQ ID NO: 69), wherein Xa is Y, W or F, Xb is A, M, C or V, Xc is I, M or L, Xd is T or S and Xe is V or I. Xa is preferably Y. Xb is preferably A. Xd is preferably T. Xe is preferably V. Preferred RecD motifs I are shown in Table 5 of U.S. Patent Application No. 61/581,332. Preferred RecD-like motifs I are shown in Table 7 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562). Preferred RecD-like motifs V are shown in Tables 5 and 7 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
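- Purely as an illustration, the RecD-like motif V and the narrower RecD motif V can likewise be expressed as regular expressions. The pattern names and the helper `classify_motif_v` are assumptions for this sketch. The first four test strings are the RecD-like motifs V listed in Table 4; the last one, `YAITVHKSQG`, is a hypothetical string built from the preferred residues and is not taken from any sequence listing.

```python
import re

# RecD-like motif V: X1-X2-X3-X4-X5-(X6)3-Q-X7
#   X1 = Y/W/F, X2 = A/T/S/M/C/V, X3 = any, X4 = T/N/S,
#   X5 = A/T/G/S/V/I, X6 = any (three residues), X7 = G/S
RECD_LIKE_MOTIF_V = re.compile(r"[YWF][ATSMCV].[TNS][ATGSVI].{3}Q[GS]")

# RecD motif V: Xa-Xb-Xc-Xd-Xe-H-K-S-Q-G
#   Xa = Y/W/F, Xb = A/M/C/V, Xc = I/M/L, Xd = T/S, Xe = V/I
RECD_MOTIF_V = re.compile(r"[YWF][AMCV][IML][TS][VI]HKSQG")


def classify_motif_v(candidate):
    """Report which of the two motif definitions a 10-residue string satisfies."""
    return {
        "RecD-like motif V": bool(RECD_LIKE_MOTIF_V.fullmatch(candidate)),
        "RecD motif V": bool(RECD_MOTIF_V.fullmatch(candidate)),
    }


for candidate in ("YAITAHGAQG", "YALNVHMAQG", "YCITIHRSQG",
                  "YALNAHMAQG", "YAITVHKSQG"):
    print(candidate, classify_motif_v(candidate))
```

Every string accepted by the narrower pattern is also accepted by the broader one, which mirrors the relationship between the preferred and the typical motif definitions.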
- The RecD helicase is preferably one of the helicases shown in Table 4 or 5 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562) or a variant thereof. Variants are described in U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
- The RecD helicase is preferably a TraI helicase or a TraI subgroup helicase. TraI helicases and TraI subgroup helicases may contain two RecD helicase domains, a relaxase domain and a C-terminal domain. The TraI subgroup helicase is preferably a TrwC helicase. The TraI helicase or TraI subgroup helicase is preferably one of the helicases shown in Table 6 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562) or a variant thereof. Variants are described in U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
- The TraI helicase or a TraI subgroup helicase typically comprises a RecD-like motif I as defined above (SEQ ID NO: 59) and/or a RecD-like motif V as defined above (SEQ ID NO: 68). The TraI helicase or a TraI subgroup helicase preferably comprises both a RecD-like motif I (SEQ ID NO: 59) and a RecD-like motif V (SEQ ID NO: 68). The TraI helicase or a TraI subgroup helicase typically further comprises one of the following two motifs:
-
- The amino acid motif H-(X1)2-X2-R-(X3)5-12-H-X4-H (hereinafter called the MobF motif III; SEQ ID NOs: 70 to 77), wherein X1 and X3 are any amino acid and X2 and X4 are independently selected from any amino acid except D, E, K and R. (X1)2 is of course X1a-X1b. X1a and X1b can be the same or different amino acids. X1a is preferably D or E. X1b is preferably T or D. (X1)2 is preferably DT or ED. (X1)2 is most preferably DT. The 5 to 12 amino acids in (X3)5-12 can be the same or different. X2 and X4 are independently selected from G, P, A, V, L, I, M, C, F, Y, W, H, Q, N, S and T. X2 and X4 are preferably not charged. X2 and X4 are preferably not H. X2 is more preferably N, S or A. X2 is most preferably N. X4 is most preferably F or T. (X3)5-12 is preferably 6 or 10 residues in length. Suitable embodiments of (X3)5-12 can be derived from SEQ ID NOs: 58, 62, 66 and 70 shown in Table 7 of U.S. Patent Application No. 61/581,332 and SEQ ID NOs: 61, 65, 69, 73, 74, 82, 86, 90, 94, 98, 102, 110, 112, 113, 114, 117, 121, 124, 125, 129, 133, 136, 140, 144, 147, 151, 152, 156, 160, 164 and 168 of International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
- The amino acid motif G-X1-X2-X3-X4-X5-X6-X7-H-(X8)6-12-H-X9 (hereinafter called the MobQ motif III; SEQ ID NOs: 78 to 84), wherein X1, X2, X3, X5, X6, X7 and X9 are independently selected from any amino acid except D, E, K and R, X4 is D or E and X8 is any amino acid. X1, X2, X3, X5, X6, X7 and X9 are independently selected from G, P, A, V, L, I, M, C, F, Y, W, H, Q, N, S and T. X1, X2, X3, X5, X6, X7 and X9 are preferably not charged. X1, X2, X3, X5, X6, X7 and X9 are preferably not H. The 6 to 12 amino acids in (X8)6-12 can be the same or different. Preferred MobF motifs III are shown in Table 7 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562).
- The TraI helicase or TraI subgroup helicase is more preferably one of the helicases shown in Table 6 or 7 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562) or a variant thereof. The TraI helicase most preferably comprises the sequence shown in SEQ ID NO: 85 or a variant thereof. SEQ ID NO: 85 is TraI Eco (NCBI Reference Sequence: NP_061483.1; Genbank AAQ98619.1). TraI Eco comprises the following motifs: RecD-like motif I (GYAGVGKT; SEQ ID NO: 86), RecD-like motif V (YAITAHGAQG; SEQ ID NO: 87) and MobF motif III (HDTSRDQEPQLHTH; SEQ ID NO: 88).
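- The two motifs above may be sketched as regular expressions as follows. This is an illustrative reading only: it assumes that in the MobF motif III X1 and X3 are unrestricted while X2 and X4 are drawn from the sixteen listed residues, and it assumes upper-case one-letter input. The names `NEUTRAL`, `MOBF_MOTIF_III` and `MOBQ_MOTIF_III` are hypothetical. The MobF test strings are the MobF motifs III listed in Table 4; no MobQ example is given in this section, so none is shown.

```python
import re

# Residues allowed where the text excludes D, E, K and R.
NEUTRAL = "GPAVLIMCFYWHQNST"

# MobF motif III: H-(X1)2-X2-R-(X3)5-12-H-X4-H
MOBF_MOTIF_III = re.compile(
    rf"H.{{2}}[{NEUTRAL}]R.{{5,12}}H[{NEUTRAL}]H"
)

# MobQ motif III: G-X1-X2-X3-X4-X5-X6-X7-H-(X8)6-12-H-X9, where X4 is D or E
# and X1-X3, X5-X7 and X9 come from the NEUTRAL set.
MOBQ_MOTIF_III = re.compile(
    rf"G[{NEUTRAL}]{{3}}[DE][{NEUTRAL}]{{3}}H.{{6,12}}H[{NEUTRAL}]"
)

# MobF motifs III quoted for TraI Eco, TrwC Cba/TrwC Eli and TrwC Hne.
for motif in ("HDTSRDQEPQLHTH", "HDTNRNQEPNLHFH", "HEDARTVDDIADPQLHTH"):
    print(motif, bool(MOBF_MOTIF_III.fullmatch(motif)))
```

The variable-length stretch (X3)5-12 is 6 residues long in the first two test strings and 10 residues long in the third, consistent with the preferred lengths stated above.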
- The TraI helicase or TraI subgroup helicase more preferably comprises the sequence of one of the helicases shown in Table 4 below, i.e. one of SEQ ID NOs: 85, 126, 134 and 138, or a variant thereof.
TABLE 4: More preferred TraI helicase and TraI subgroup helicases

| SEQ ID NO | Name | Strain | NCBI ref | % Identity to TraI Eco | RecD-like motif I (SEQ ID NO:) | RecD-like motif V (SEQ ID NO:) | MobF motif III (SEQ ID NO:) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 85 | TraI Eco | Escherichia coli | NCBI Reference Sequence: NP_061483.1; Genbank AAQ98619.1 | - | GYAGVGKT (86) | YAITAHGAQG (87) | HDTSRDQEPQLHTH (88) |
| 126 | TrwC Cba | Citromicrobium bathyomarinum JL354 | NCBI Reference Sequence: ZP_06861556.1 | 15% | GIAGAGKS (131) | YALNVHMAQG (132) | HDTNRNQEPNLHFH (133) |
| 134 | TrwC Hne | Halothiobacillus neapolitanus c2 | NCBI Reference Sequence: YP_003262832.1 | 11.5% | GAAGAGKT (135) | YCITIHRSQG (136) | HEDARTVDDIADPQLHTH (137) |
| 138 | TrwC Eli | Erythrobacter litoralis HTCC2594 | NCBI Reference Sequence: YP_457045.1 | 16% | GIAGAGKS (131) | YALNAHMAQG (139) | HDTNRNQEPNLHFH (133) |

- As discussed above for Hel308 helicases, two or more parts on the RecD helicase, TraI helicase or TraI subgroup helicase may be connected to reduce the size of the opening in the polynucleotide domain through which a polynucleotide can unbind from the helicase and wherein the helicase retains its ability to control the movement of the polynucleotide. Any of the embodiments discussed above for Hel308 helicases equally apply to RecD helicases, TraI helicases or TraI subgroup helicases. The two or more parts of TrwC Cba that are connected are preferably (a) amino acids 691 and 346 in SEQ ID NO: 126; (b) amino acids 657 and 339 in SEQ ID NO: 126; (c) amino acids 691 and 350 in SEQ ID NO: 126; or (d) amino acids 690 and 350 in SEQ ID NO: 126. These amino acids are preferably substituted with cysteine such that they can be connected by cysteine linkage.
- The invention may use a mutant TrwC Cba protein which comprises a variant of SEQ ID NO: 126 in which amino acids 691 and 346; 657 and 339; 691 and 350; or 690 and 350 are modified. The amino acids are preferably substituted. The amino acids are more preferably substituted with cysteine. The variant may differ from SEQ ID NO: 126 at positions other than 691 and 346; 657 and 339; 691 and 350; or 690 and 350 as long as the relevant amino acids are modified. The variant will preferably be at least 10% homologous to SEQ ID NO: 126 based on amino acid identity as discussed in more detail below. Amino acids 691 and 346; 657 and 339; 691 and 350; or 690 and 350 are not connected. These mutant TrwC Cba proteins may be used to form a modified helicase in which the modified amino acids are connected.
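- For illustration only, the pairwise cysteine substitutions described above can be applied to a sequence string as in the sketch below. SEQ ID NO: 126 is not reproduced in this section, so the example uses a placeholder sequence of 700 alanines that is purely hypothetical and has no biological meaning; the function names are likewise assumptions. Positions are 1-based, following the residue numbering used in the text.

```python
# Residue pairs of SEQ ID NO: 126 named in the text (1-based positions).
TRWC_CBA_PAIRS = [(691, 346), (657, 339), (691, 350), (690, 350)]


def substitute_cysteines(sequence, positions):
    """Return a copy of `sequence` with cysteine at each 1-based position."""
    residues = list(sequence)
    for pos in positions:
        if not 1 <= pos <= len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        residues[pos - 1] = "C"
    return "".join(residues)


def mutation_labels(sequence, positions):
    """Name each substitution in the usual form, e.g. A691C."""
    return [f"{sequence[pos - 1]}{pos}C" for pos in positions]


# Hypothetical placeholder, NOT the real TrwC Cba sequence.
placeholder = "A" * 700

for pair in TRWC_CBA_PAIRS:
    variant = substitute_cysteines(placeholder, pair)
    print(pair, mutation_labels(placeholder, pair), variant.count("C"))
```

With the real sequence in place of the placeholder, the labels would carry the actual wild-type residues at each position.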
- A variant of a RecD helicase, TraI helicase or TraI subgroup helicase is an enzyme that has an amino acid sequence which varies from that of the wild-type helicase and which retains polynucleotide binding activity. This can be measured as described above. In particular, a variant of SEQ ID NO: 85, 126, 134 or 138 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 85, 126, 134 or 138 and which retains polynucleotide binding activity. The variant retains helicase activity. The variant must work in at least one of the two modes discussed below. Preferably, the variant works in both modes. The variant may include modifications that facilitate handling of the polynucleotide encoding the helicase and/or facilitate its activity at high salt concentrations and/or room temperature. Variants typically differ from the wild-type helicase in regions outside of the motifs discussed above. However, variants may include modifications within these motif(s).
- Over the entire length of the amino acid sequence of any one of SEQ ID NOs: 85, 126, 134 and 138, a variant will preferably be at least 10% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 20%, at least 25%, at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of any one of SEQ ID NOs: 85, 126, 134 and 138 over the entire sequence. There may be at least 70%, for example at least 80%, at least 85%, at least 90% or at least 95%, amino acid identity over a stretch of 150 or more, for example 200, 300, 400, 500, 600, 700, 800, 900 or 1000 or more, contiguous amino acids (“hard homology”). Homology is determined as described below. The variant may differ from the wild-type sequence in any of the ways discussed above with reference to SEQ ID NOs: 2 and 4.
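- As a minimal sketch of the two kinds of threshold described above, the following functions compute percent identity over a whole pairwise alignment and test whether any contiguous window reaches a given identity. This is not the homology method of the specification. It assumes the two sequences have already been aligned to equal length with `-` as the gap character, counts gap columns as mismatches, uses the alignment length as the denominator and measures windows in alignment columns; other conventions exist. The toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns where both sequences share a residue."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def has_identity_stretch(aligned_a, aligned_b, window, threshold, gap="-"):
    """True if some run of `window` contiguous columns reaches `threshold` %."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    n = len(aligned_a)
    if window < 1 or window > n:
        return False
    flags = [
        1 if a == b and a != gap else 0 for a, b in zip(aligned_a, aligned_b)
    ]
    count = sum(flags[:window])  # matches in the first window
    best = count
    for i in range(window, n):   # slide the window one column at a time
        count += flags[i] - flags[i - window]
        best = max(best, count)
    return 100.0 * best / window >= threshold


# Hypothetical toy alignment (10 columns, one substitution, one gap).
a = "ACDEFGHIKL"
b = "ACDQFGHI-L"
print(percent_identity(a, b))
print(has_identity_stretch(a, b, window=4, threshold=100.0))
```

For the thresholds quoted in the text, `window` would be 150 or more and `threshold` 70 or more.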
- A variant of any one of SEQ ID NOs: 85, 126, 134 and 138 preferably comprises the RecD-like motif I and/or RecD-like motif V of the wild-type sequence. However, a variant of SEQ ID NO: 85, 126, 134 or 138 may comprise the RecD-like motif I and/or extended RecD-like motif V from a different wild-type sequence. For instance, a variant may comprise any one of the preferred motifs shown in Tables 5 and 7 of U.S. Patent Application No. 61/581,332 and International Application No. PCT/GB2012/053274 (published as WO 2012/098562). Variants of SEQ ID NOs: 85, 126, 134 and 138 may also include modifications within the RecD-like motifs I and V of the wild-type sequence. A variant of SEQ ID NO: 85, 126, 134 or 138 preferably comprises one or more substituted cysteine residues and/or one or more substituted Faz residues to facilitate attachment as discussed above.
- The helicase is preferably an XPD helicase. Any XPD helicase may be used in accordance with the invention. XPD helicases are also known as Rad3 helicases and the two terms can be used interchangeably.
- The structures of XPD helicases are known in the art (Cell. 2008 May 30; 133(5):801-12. Structure of the DNA repair helicase XPD. Liu H, Rudolf J, Johnson K A, McMahon S A, Oke M, Carter L, McRobbie A M, Brown S E, Naismith J H, White M F). The XPD helicase typically comprises the amino acid motif X1-X2-X3-G-X4-X5-X6-E-G (hereinafter called XPD motif V; SEQ ID NO: 89). X1, X2, X5 and X6 are independently selected from any amino acid except D, E, K and R. X1, X2, X5 and X6 are independently selected from G, P, A, V, L, I, M, C, F, Y, W, H, Q, N, S and T. X1, X2, X5 and X6 are preferably not charged. X1, X2, X5 and X6 are preferably not H. X1 is more preferably V, L, I, S or Y. X5 is more preferably V, L, I, N or F. X6 is more preferably S or A. X3 and X4 may be any amino acid residue. X4 is preferably K, R or T.
- The XPD helicase typically comprises the amino acid motif Q-Xa-Xb-G-R-Xc-Xd-R-(Xe)3-Xf-(Xg)7-D-Xh-R (hereinafter called XPD motif VI; SEQ ID NO: 90). Xa, Xe and Xg may be any amino acid residue. Xb, Xc and Xd are independently selected from any amino acid except D, E, K and R. Xb, Xc and Xd are typically independently selected from G, P, A, V, L, I, M, C, F, Y, W, H, Q, N, S and T. Xb, Xc and Xd are preferably not charged. Xb, Xc and Xd are preferably not H. Xb is more preferably V, A, L, I or M. Xc is more preferably V, A, L, I, M or C. Xd is more preferably I, H, L, F, M or V. Xf may be D or E. (Xg)7 is Xg1, Xg2, Xg3, Xg4, Xg5, Xg6 and Xg7. Xg2 is preferably G, A, S or C. Xg5 is preferably F, V, L, I, M, A, W or Y. Xg6 is preferably L, F, Y, M, I or V. Xg7 is preferably A, C, V, L, I, M or S.
- The XPD helicase preferably comprises XPD motifs V and VI. The most preferred XPD motifs V and VI are shown in Table 5 of U.S. Patent Application No. 61/581,340 and International Application No. PCT/GB2012/053273 (published as WO 2012/098561).
- The XPD helicase preferably further comprises an iron sulphide (FeS) core between two Walker A and B motifs (motifs I and II). An FeS core typically comprises an iron atom coordinated between the sulphide groups of cysteine residues. The FeS core is typically tetrahedral.
- The XPD helicase is preferably one of the helicases shown in Table 4 or 5 of International Application No. PCT/GB2012/053273 (published as WO 2012/098561) or a variant thereof. The XPD helicase most preferably comprises the sequence shown in SEQ ID NO: 91 or a variant thereof. SEQ ID NO: 91 is XPD Mbu (Methanococcoides burtonii; YP_566221.1; GI:91773529). XPD Mbu comprises YLWGTLSEG (Motif V; SEQ ID NO: 92) and QAMGRVVRSPTDYGARILLDGR (Motif VI; SEQ ID NO: 93).
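- By way of a non-limiting sketch, XPD motifs V and VI as defined above can be expressed as regular expressions and checked against the two XPD Mbu motifs quoted in this section. The definition of motif VI does not restrict Xh, so the sketch assumes Xh may be any amino acid; the names `NEUTRAL`, `XPD_MOTIF_V` and `XPD_MOTIF_VI` are hypothetical, and upper-case one-letter input is assumed.

```python
import re

# Residues allowed where the text excludes D, E, K and R.
NEUTRAL = "GPAVLIMCFYWHQNST"

# XPD motif V: X1-X2-X3-G-X4-X5-X6-E-G
#   X1, X2, X5, X6 from NEUTRAL; X3 and X4 any amino acid.
XPD_MOTIF_V = re.compile(rf"[{NEUTRAL}]{{2}}.G.[{NEUTRAL}]{{2}}EG")

# XPD motif VI: Q-Xa-Xb-G-R-Xc-Xd-R-(Xe)3-Xf-(Xg)7-D-Xh-R
#   Xa, Xe, Xg any; Xb, Xc, Xd from NEUTRAL; Xf = D or E;
#   Xh is assumed to be unrestricted.
XPD_MOTIF_VI = re.compile(
    rf"Q.[{NEUTRAL}]GR[{NEUTRAL}]{{2}}R.{{3}}[DE].{{7}}D.R"
)

# Motifs quoted for XPD Mbu (SEQ ID NOs: 92 and 93).
print(bool(XPD_MOTIF_V.fullmatch("YLWGTLSEG")))
print(bool(XPD_MOTIF_VI.fullmatch("QAMGRVVRSPTDYGARILLDGR")))
```

The patterns encode only the typical residue constraints; the narrower "more preferably" residue choices could be substituted into the character classes in the same way.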
- A variant of an XPD helicase is an enzyme that has an amino acid sequence which varies from that of the wild-type helicase and which retains polynucleotide binding activity. This can be measured as described above. In particular, a variant of SEQ ID NO: 91 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 91 and which retains polynucleotide binding activity. The variant retains helicase activity. The variant must work in at least one of the two modes discussed below. Preferably, the variant works in both modes. The variant may include modifications that facilitate handling of the polynucleotide encoding the helicase and/or facilitate its activity at high salt concentrations and/or room temperature. Variants typically differ from the wild-type helicase in regions outside of XPD motifs V and VI discussed above. However, variants may include modifications within one or both of these motifs.
- Over the entire length of the amino acid sequence of SEQ ID NO: 91, a variant will preferably be at least 10%, preferably at least 30%, homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 91 over the entire sequence. There may be at least 70%, for example at least 80%, at least 85%, at least 90% or at least 95%, amino acid identity over a stretch of 150 or more, for example 200, 300, 400, 500, 600, 700, 800, 900 or 1000 or more, contiguous amino acids (“hard homology”). Homology is determined as described below. The variant may differ from the wild-type sequence in any of the ways discussed above with reference to SEQ ID NOs: 2 and 4.
- A variant of SEQ ID NO: 91 preferably comprises the XPD motif V and/or the XPD motif VI of the wild-type sequence. A variant of SEQ ID NO: 91 more preferably comprises both XPD motifs V and VI of SEQ ID NO: 91. However, a variant of SEQ ID NO: 91 may comprise XPD motifs V and/or VI from a different wild-type sequence. For instance, a variant of SEQ ID NO: 91 may comprise any one of the preferred motifs shown in Table 5 of U.S. Patent Application No. 61/581,340 and International Application No. PCT/GB2012/053273 (published as WO 2012/098561). Variants of SEQ ID NO: 91 may also include modifications within XPD motif V and/or XPD motif VI of the wild-type sequence. Suitable modifications to these motifs are discussed above when defining the two motifs. As discussed above for Hel308 helicases, two or more parts on the XPD helicase may be connected to reduce the size of the opening in the polynucleotide domain through which a polynucleotide can unbind from the helicase and wherein the helicase retains its ability to control the movement of the polynucleotide. Any of the embodiments discussed above for Hel308 helicases equally apply to XPD helicases. A variant of SEQ ID NO: 91 preferably comprises one or more substituted cysteine residues and/or one or more substituted Faz residues to facilitate attachment as discussed above.
- The helicase is preferably a UvrD helicase. Any UvrD helicase may be used in the invention. The UvrD helicase preferably comprises the sequence shown in SEQ ID NO: 122 or a variant thereof. Variants are defined above. Over the entire length of the amino acid sequence of SEQ ID NO: 122, a variant will preferably be at least 20% homologous to that sequence based on amino acid similarity or identity. More preferably, the variant polypeptide may be at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 122 over the entire sequence. There may be at least 70%, for example at least 80%, at least 85%, at least 90% or at least 95%, amino acid identity over a stretch of 100 or more, for example 150, 200, 300, 400 or 500 or more, contiguous amino acids (“hard homology”). Homology or similarity is determined as described below.
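- The two thresholds used throughout these variant definitions (identity over the entire sequence, and identity over a contiguous stretch, i.e. “hard homology”) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts gap columns as mismatches; these are simplifying assumptions, and the function names `percent_identity` and `has_hard_homology` are hypothetical. It is not the homology program referred to elsewhere in this document.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned, equal-length strings with '-' for gaps.
    Gap columns are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def has_hard_homology(aligned_a: str, aligned_b: str,
                      window: int = 100, threshold: float = 70.0) -> bool:
    """True if some contiguous stretch of `window` columns reaches `threshold` % identity."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    match = [1 if x == y and x != "-" else 0 for x, y in zip(aligned_a, aligned_b)]
    if len(match) < window:
        return False
    # Sliding-window count of matching columns.
    count = sum(match[:window])
    best = count
    for i in range(window, len(match)):
        count += match[i] - match[i - window]
        best = max(best, count)
    return 100.0 * best / window >= threshold
```

The default `window=100` and `threshold=70.0` mirror the lowest “hard homology” figures quoted above; in practice the alignment itself would come from a dedicated alignment tool.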
- The helicase is preferably a Dda helicase. Any Dda helicase may be used in the invention. Dda helicases typically comprise the following five domains: 1A (RecA-like motor) domain, 2A (RecA-like motor) domain, tower domain, pin domain and hook domain (Xiaoping He et al., 2012, Structure; 20: 1189-1200). The domains may be identified using protein modelling, X-ray diffraction measurement of the protein in a crystalline state (Rupp B (2009). Biomolecular Crystallography: Principles, Practice and Application to Structural Biology. New York: Garland Science), nuclear magnetic resonance (NMR) spectroscopy of the protein in solution (Mark Rance; Cavanagh, John; Wayne J. Fairbrother; Arthur W. Hunt III; Skelton, Nicholas J. (2007). Protein NMR spectroscopy: principles and practice (2nd ed.). Boston: Academic Press.) or cryo-electron microscopy of the protein in a frozen-hydrated state (van Heel M, Gowen B, Matadeen R, Orlova E V, Finn R, Pape T, Cohen D, Stark H, Schmidt R, Schatz M, Patwardhan A (2000). “Single-particle electron cryo-microscopy: towards atomic resolution”. Q Rev Biophys. 33: 307-69). Structural information on proteins determined by the above-mentioned methods is publicly available from the Protein Data Bank (PDB) database.
- Preferred Dda helicases are shown in Table 5 below.
| Dda homologue (SEQ ID NO:) | Organism | Habitat | Uniprot | Length | Sequence identity to 1993/% | Number of D/E vs. K/R amino acids | # C |
|---|---|---|---|---|---|---|---|
| Rma-DSM (SEQ ID NO: 98) | Rhodothermus marinus | Mild halophile, moderate thermophile >65° C. | D0MKQ2 | 678 | 21 | −84/+85 | 2 |
| Csp (SEQ ID NO: 99) | Cyanothece sp. (strain ATCC 51142) | Marine bacterium | B1X365 | 496 | 24 | −76/+76 | 5 |
| Sru (SEQ ID NO: 100) | Salinibacter ruber | Extremely halophilic, 35-45° C. | Q2S429 | 421 | 26 | −78/+54 | 3 |
| Sgo (SEQ ID NO: 101) | Sulfurimonas gotlandica GD1 | Hydrothermal vents, coastal sediments | B6BJ43 | 500 | 27 | −72/+64 | 2 |
| Vph12B8 (SEQ ID NO: 102) | Vibrio phage henriette 12B8 | Host found in saltwater, stomach bug | M4MBC3 | 450 | 27 | −62/+47 | 6 |
| Vph (SEQ ID NO: 103) | Vibrio phage phi-pp2 | Host found in saltwater, stomach bug | I6XGX8 | 421 | 39 | −55/+45 | 5 |
| Aph65 (SEQ ID NO: 104) | Aeromonas phage 65 | Host found in fresh/brackish water, stomach bug | E5DRP6 | 434 | 40 | −57/+48 | 4 |
| AphCC2 (SEQ ID NO: 105) | Aeromonas phage CC2 | Host found in fresh/brackish water, stomach bug | I6XH64 | 420 | 41 | −53/+44 | 4 |
| Cph (SEQ ID NO: 106) | Cronobacter phage vB CsaM GAP161 | Host member of enterobacteriaceae | K4FBD0 | 443 | 42 | −59/+57 | 4 |
| Kph (SEQ ID NO: 107) | Klebsiella phage KP15 | Host member of enterobacteriaceae | D5JF67 | 442 | 44 | −59/+58 | 5 |
| SphlME13 (SEQ ID NO: 108) | Stenotrophomonas phage IME13 | Host found in soil | J7HXT5 | 438 | 51 | −58/+59 | 7 |
| AphAc42 (SEQ ID NO: 109) | Acinetobacter phage Ac42 | Host found in soil | E5EYE6 | 442 | 59 | −53/+49 | 9 |
| SphSP18 (SEQ ID NO: 110) | Shigella phage SP18 | Host member of enterobacteriaceae | E3SFA5 | 442 | 59 | −55/+55 | 9 |
| Yph (SEQ ID NO: 111) | Yersinia phage phiR1-RT | Host member of enterobacteriaceae | I7J3V8 | 439 | 64 | −52/+52 | 7 |
| SphS16 (SEQ ID NO: 112) | Salmonella phage S16 | Host member of enterobacteriaceae | M1EA88 | 441 | 72 | −56/+55 | 5 |
| 1993 (SEQ ID NO: 97) | Enterobacteria phage T4 | Host member of enterobacteriaceae | P32270 | 439 | 100 | −57/+58 | 5 |

- The Dda helicase more preferably comprises the sequence of one of the helicases shown in the Table 5 above, i.e. one of SEQ ID NOs: 97 to 112, or a variant thereof. 
Variants are defined above. Over the entire length of the amino acid sequence of any one of SEQ ID NOs: 97 to 112, a variant will preferably be at least 20% homologous to that sequence based on amino acid similarity or identity. More preferably, the variant polypeptide may be at least 30%, at least 40%, at least 45%, at least 50%, at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of any one of SEQ ID NOs: 97 to 112 over the entire sequence. There may be at least 70%, for example at least 80%, at least 85%, at least 90% or at least 95%, amino acid identity over a stretch of 100 or more, for example 150, 200, 300, 400 or 500 or more, contiguous amino acids (“hard homology”). Homology or similarity is determined as described below.
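- The composition columns of Table 5 (number of D/E vs. K/R amino acids, and number of cysteines) can be derived directly from a one-letter amino acid sequence. The sketch below is illustrative only: it assumes the charge column is written as −(D+E)/+(K+R), the names `Composition`, `composition` and `format_charge_column` are hypothetical, and the example sequence in the usage note is a toy string rather than any sequence of the invention.

```python
from typing import NamedTuple


class Composition(NamedTuple):
    acidic: int      # number of D and E residues
    basic: int       # number of K and R residues
    cysteines: int   # number of C residues


def composition(sequence: str) -> Composition:
    """Count acidic (D/E), basic (K/R) and cysteine residues in a one-letter sequence."""
    seq = sequence.upper()
    return Composition(
        acidic=seq.count("D") + seq.count("E"),
        basic=seq.count("K") + seq.count("R"),
        cysteines=seq.count("C"),
    )


def format_charge_column(sequence: str) -> str:
    """Render the D/E vs. K/R count in the assumed Table 5 style, e.g. '−57/+58'."""
    counts = composition(sequence)
    return f"\u2212{counts.acidic}/+{counts.basic}"
```

For the toy sequence "MDEKRC", `composition` gives two acidic residues, two basic residues and one cysteine.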
- Preferred variants of any one of SEQ ID NOs: 97 to 112 have a non-natural amino acid, such as Faz, at the amino- (N-) terminus and/or carboxy (C-) terminus. Preferred variants of any one of SEQ ID NOs: 97 to 112 have a cysteine residue at the amino- (N-) terminus and/or carboxy (C-) terminus. Preferred variants of any one of SEQ ID NOs: 97 to 112 have a cysteine residue at the amino- (N-) terminus and a non-natural amino acid, such as Faz, at the carboxy (C-) terminus or vice versa. Preferred variants of SEQ ID NO: 97 contain one or more of, such as all of, the following modifications: E54G, D151E, I196N and G357A.
- The Dda helicase preferably comprises any of the modifications disclosed in International Application Nos. PCT/GB2014/052736 and PCT/GB2015/052916 (published as WO 2015/055981 and WO 2016/055777).
- A preferred variant of SEQ ID NO: 97 comprises (a) E94C and A360C or (b) E94C, A360C, C109A and C136A and then optionally (ΔM1)G1 (i.e. deletion of M1 and then addition of G1). It may also be termed M1G. Any of the variants discussed above may further comprise M1G.
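- The substitution notation used above (wild-type residue, 1-based position, replacement residue, e.g. E94C) can be applied to a sequence string mechanically. The following is a minimal sketch under stated assumptions: the helper names `apply_substitutions` and `replace_m1_with_g1` are hypothetical, only single-residue substitutions are parsed, and the usage note uses a toy sequence rather than SEQ ID NO: 97.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter replacement (e.g. "E94C").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, substitutions: list) -> str:
    """Apply point substitutions, checking that each stated wild-type residue is present."""
    residues = list(sequence)
    for sub in substitutions:
        m = _SUBSTITUTION.match(sub)
        if m is None:
            raise ValueError(f"unrecognised substitution: {sub!r}")
        wild_type, position, replacement = m.group(1), int(m.group(2)), m.group(3)
        index = position - 1  # notation is 1-based
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


def replace_m1_with_g1(sequence: str) -> str:
    """(ΔM1)G1: delete the initial methionine and add glycine at position 1 (M1G)."""
    if not sequence.startswith("M"):
        raise ValueError("sequence does not start with M")
    return "G" + sequence[1:]
```

With the toy sequence "MKTAYIAKQR", applying ["K2C", "A4C"] and then `replace_m1_with_g1` yields "GCTCYIAKQR". Because (ΔM1)G1 leaves the residue numbering unchanged, the two steps can be applied in either order.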
- As discussed above for Hel308 helicases, two or more parts on the Dda helicase may be connected to reduce the size of the opening in the polynucleotide domain through which a polynucleotide can unbind from the helicase and wherein the helicase retains its ability to control the movement of the polynucleotide. Any of the embodiments discussed above for Hel308 helicases equally apply to Dda helicases.
- The translocase is preferably a strippase. The strippase is preferably the INO80 chromatin remodeling complex or a FtsK/SpoIIIE transporter.
- In one embodiment, the translocase is contacted with the constructs after they are created by the MuA transposase. In another embodiment, the translocase is bound to the substrates before the substrates are contacted with the template polynucleotide.
- After fragmentation of the template polynucleotide and ligation of the MuA substrate to the fragments of the template polynucleotide (tagmentation), constructs comprising a fragment of the template polynucleotide and one or more MuA substrates are formed. The two strands of each construct are preferably linked at one end by a hairpin loop. In this embodiment, a hairpin loop is added to each of the fragments of the template polynucleotide generated by the MuA transposase. Suitable hairpin loops can be designed using methods known in the art. The hairpin loop may be any length. The hairpin loop is typically 110 or fewer nucleotides, such as 100 or fewer nucleotides, 90 or fewer nucleotides, 80 or fewer nucleotides, 70 or fewer nucleotides, 60 or fewer nucleotides, 50 or fewer nucleotides, 40 or fewer nucleotides, 30 or fewer nucleotides, 20 or fewer nucleotides or 10 or fewer nucleotides, in length. The hairpin loop is preferably from about 1 to 110, from 2 to 100, from 5 to 80 or from 6 to 50 nucleotides in length. Longer lengths of the hairpin loop, such as from 50 to 110 nucleotides, are preferred if the loop is involved in the differential selectability of the adaptor. Similarly, shorter lengths of the hairpin loop, such as from 1 to 5 nucleotides, are preferred if the loop is not involved in the selectable binding as discussed below.
- The hairpin loop preferably comprises a selectable binding moiety. This allows the constructs to be purified or isolated. A selectable binding moiety is a moiety that can be selected on the basis of its binding properties. Hence, a selectable binding moiety is preferably a moiety that specifically binds to a surface. A selectable binding moiety specifically binds to a surface if it binds to the surface to a much greater degree than any other moiety used in the invention. In preferred embodiments, the moiety binds to a surface to which no other moiety used in the invention binds.
- Suitable selective binding moieties are known in the art. Preferred selective binding moieties include, but are not limited to, biotin, a polynucleotide sequence, antibodies, antibody fragments, such as Fab and ScSv, antigens, polynucleotide binding proteins, poly histidine tails and GST tags. The most preferred selective binding moieties are biotin and a selectable polynucleotide sequence. Biotin specifically binds to a surface coated with avidins. Selectable polynucleotide sequences specifically bind (i.e. hybridise) to a surface coated with homologous sequences. Alternatively, selectable polynucleotide sequences specifically bind to a surface coated with polynucleotide binding proteins.
- The hairpin loop and/or the selectable binding moiety may comprise a region that can be cut, nicked, cleaved or hydrolysed. Such a region can be designed to allow the constructs to be removed from the surface to which they are bound following purification or isolation. Suitable regions are known in the art. Suitable regions include, but are not limited to, an RNA region, a region comprising desthiobiotin and streptavidin, a disulphide bond and a photocleavable region.
- The hairpin loop may be provided at either end of the polynucleotide, i.e. the 5′ or the 3′ end. The hairpin loop may be ligated to the polynucleotide using any method known in the art. The hairpin loop may be ligated using a ligase, such as T4 DNA ligase, E. coli DNA ligase, Taq DNA ligase, Tma DNA ligase and 9° N DNA ligase. The hairpin loop may be added to the constructs as described in International Application No. PCT/GB2014/052505 (published as WO 2015/022544).
- The method preferably further comprises attaching one or more molecular brakes to a non-substrate strand. A non-substrate strand is a strand of a MuA double stranded substrate that does not comprise an overhang. The molecular brakes may be attached to the non-substrate strands in the substrates before they are contacted with the template polynucleotide and the MuA transposase. The molecular brakes may be attached to the other strands from the substrates remaining in the constructs after they are created by the MuA transposase.
- The molecular brakes are preferably bound to Y adaptors comprising a leader sequence and/or one or more anchors capable of coupling the adaptor to a membrane and the Y adaptors are attached to the other strands in step (c).
- The Y adaptors are typically polynucleotide adaptors. They may be formed from any of the polynucleotides discussed above.
- The Y adaptor typically comprises (a) a double stranded region and (b) a single stranded region or a region that is not complementary at the other end. The Y adaptor may be described as having an overhang if it comprises a single stranded region. The presence of a non-complementary region in the Y adaptor gives the adaptor its Y shape since the two strands typically do not hybridise to each other unlike the double stranded portion. The Y adaptor may comprise one or more anchors.
- The Y adaptor and/or the hairpin loop may be ligated to the polynucleotide using any method known in the art. One or both of the adaptors may be ligated using a ligase, such as T4 DNA ligase, E. coli DNA ligase, Taq DNA ligase, Tma DNA ligase and 9° N DNA ligase. Alternatively, the adaptors may be added to the constructs as described in International Application No. PCT/GB2014/052505 (published as WO 2015/022544).
- The Y adaptor may be provided with a leader sequence which preferentially threads into the pore. The leader sequence facilitates the method of the invention. The leader sequence is designed to preferentially thread into the transmembrane pore and thereby facilitate the movement of polynucleotide through the pore. The leader sequence can also be used to link the polynucleotide to the one or more anchors as discussed below.
- The leader sequence typically comprises a polymer. The polymer is preferably negatively charged. The polymer is preferably a polynucleotide, such as DNA or RNA, a modified polynucleotide (such as abasic DNA), PNA, LNA, polyethylene glycol (PEG) or a polypeptide. The leader preferably comprises a polynucleotide and more preferably comprises a single stranded polynucleotide. The leader sequence can comprise any of the polynucleotides discussed above. The single stranded leader sequence most preferably comprises a single strand of DNA, such as a poly dT section. The leader sequence preferably comprises the one or more spacers.
- The leader sequence can be any length, but is typically 10 to 150 nucleotides in length, such as from 20 to 150 nucleotides in length. The length of the leader typically depends on the transmembrane pore used in the method.
- The Y adaptor preferably comprises a selectable binding moiety as discussed above. The Y adaptor and/or the selectable binding moiety may comprise a region that can be cut, nicked, cleaved or hydrolysed as discussed above.
- The method comprises contacting the target polynucleotide with a molecular brake which controls the movement of the target polynucleotide through the pore. Any molecular brake may be used including any of those disclosed in International Application No. PCT/GB2014/052737 (published as WO 2015/110777).
- The molecular brake is preferably a polynucleotide binding protein. The polynucleotide binding protein may be any protein that is capable of binding to the polynucleotide and controlling its movement through a transmembrane pore as discussed in more detail below. It is straightforward in the art to determine whether or not a protein binds to a polynucleotide. The protein typically interacts with and modifies at least one property of the polynucleotide. The protein may modify the polynucleotide by cleaving it to form individual nucleotides or shorter chains of nucleotides, such as di- or trinucleotides. The moiety may modify the polynucleotide by orienting it or moving it to a specific position, i.e. controlling its movement.
- The polynucleotide binding protein is preferably derived from a polynucleotide handling enzyme. A polynucleotide handling enzyme is a polypeptide that is capable of interacting with and modifying at least one property of a polynucleotide. The enzyme may modify the polynucleotide by cleaving it to form individual nucleotides or shorter chains of nucleotides, such as di- or trinucleotides. The enzyme may modify the polynucleotide by orienting it or moving it to a specific position. The polynucleotide handling enzyme does not need to display enzymatic activity as long as it is capable of binding the polynucleotide and controlling its movement through the pore. For instance, the enzyme may be modified to remove its enzymatic activity or may be used under conditions which prevent it from acting as an enzyme. Such conditions are discussed in more detail below.
- The polynucleotide handling enzyme is preferably derived from a nucleolytic enzyme. The polynucleotide handling enzyme used in the construct of the enzyme is more preferably derived from a member of any of the Enzyme Classification (EC) groups 3.1.11, 3.1.13, 3.1.14, 3.1.15, 3.1.16, 3.1.21, 3.1.22, 3.1.25, 3.1.26, 3.1.27, 3.1.30 and 3.1.31. The enzyme may be any of those disclosed in International Application No. PCT/GB10/000133 (published as WO 2010/086603).
- Preferred enzymes are polymerases, exonucleases, helicases, translocases and topoisomerases, such as gyrases. Suitable enzymes include, but are not limited to, exonuclease I from E. coli (SEQ ID NO: 11), exonuclease III enzyme from E. coli (SEQ ID NO: 13), RecJ from T. thermophilus (SEQ ID NO: 15) and bacteriophage lambda exonuclease (SEQ ID NO: 17), TatD exonuclease and variants thereof. Three subunits comprising the sequence shown in SEQ ID NO: or a variant thereof interact to form a trimer exonuclease. The polymerase may be PyroPhage® 3173 DNA Polymerase (which is commercially available from Lucigen® Corporation), SD Polymerase (commercially available from Bioron®) or variants thereof. The enzyme is preferably Phi29 DNA polymerase (SEQ ID NO: 9) or a variant thereof. The topoisomerase is preferably a member of any of the Enzyme Classification (EC) groups 5.99.1.2 and 5.99.1.3.
- The enzyme is most preferably derived from a helicase. The helicase may be or be derived from a Hel308 helicase, a RecD helicase, such as TraI helicase or a TrwC helicase, a XPD helicase or a Dda helicase. The helicase may be or be derived from Hel308 Mbu (SEQ ID NO: 18), Hel308 Csy (SEQ ID NO: 19), Hel308 Tga (SEQ ID NO: 20), Hel308 Mhu (SEQ ID NO: 21), TraI Eco (SEQ ID NO: 22), XPD Mbu (SEQ ID NO: 23) or a variant thereof.
- The helicase may be any of the helicases, modified helicases or helicase constructs disclosed in International Application Nos. PCT/GB2012/052579 (published as WO 2013/057495); PCT/GB2012/053274 (published as WO 2013/098562); PCT/GB2012/053273 (published as WO 2013/098561); PCT/GB2013/051925 (published as WO 2014/013260); PCT/GB2013/051924 (published as WO 2014/013259); PCT/GB2013/051928 (published as WO 2014/013262) and PCT/GB2014/052736 (published as WO 2015/055981).
- The helicase preferably comprises the sequence shown in SEQ ID NO: 25 (TrwC Cba) or a variant thereof, the sequence shown in SEQ ID NO: 18 (Hel308 Mbu) or a variant thereof or the sequence shown in SEQ ID NO: 24 (Dda) or a variant thereof. Variants may differ from the native sequences in any of the ways discussed below for transmembrane pores. A preferred variant of SEQ ID NO: 24 comprises (a) E94C and A360C or (b) E94C, A360C, C109A and C136A and then optionally (ΔM1)G1 (i.e. deletion of M1 and then addition of G1). It may also be termed M1G. Any of the variants discussed above may further comprise M1G.
- The Dda helicase preferably comprises any of the modifications disclosed in International Application Nos. PCT/GB2014/052736 and PCT/GB2015/052916 (published as WO 2015/055981 and WO 2016/055777).
- Any number of helicases may be used in accordance with the invention. For instance, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more helicases may be used. In some embodiments, different numbers of helicases may be used.
- The method of the invention preferably comprises attaching two or more helicases to the other strands. The two or more helicases are typically the same helicase. The two or more helicases may be different helicases.
- The two or more helicases may be any combination of the helicases mentioned above. The two or more helicases may be two or more Dda helicases. The two or more helicases may be one or more Dda helicases and one or more TrwC helicases. The two or more helicases may be different variants of the same helicase.
- The two or more helicases are preferably attached to one another. The two or more helicases are more preferably covalently attached to one another. The helicases may be attached in any order and using any method. Preferred helicase constructs for use in the invention are described in International Application Nos. PCT/GB2013/051925 (published as WO 2014/013260); PCT/GB2013/051924 (published as WO 2014/013259); PCT/GB2013/051928 (published as WO 2014/013262) and PCT/GB2014/052736.
- A variant of SEQ ID NO: 9, 11, 13, 15, 17, 18, 19, 20, 21, 22, 23, 24 or 25 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 9, 11, 13, 15, 17, 18, 19, 20, 21, 22, 23, 24 or 25 and which retains polynucleotide binding ability. This can be measured using any method known in the art. For instance, the variant can be contacted with a polynucleotide and its ability to bind to and move along the polynucleotide can be measured. The variant may include modifications that facilitate binding of the polynucleotide and/or facilitate its activity at high salt concentrations and/or room temperature. Variants may be modified such that they bind polynucleotides (i.e. retain polynucleotide binding ability) but do not function as a helicase (i.e. do not move along polynucleotides when provided with all the necessary components to facilitate movement, e.g. ATP and Mg2+). Such modifications are known in the art. For instance, modification of the Mg2+ binding domain in helicases typically results in variants which do not function as helicases. These types of variants may act as molecular brakes (see below).
- Over the entire length of the amino acid sequence of SEQ ID NO: 9, 11, 13, 15, 17, 18, 19, 20, 21, 22, 23, 24 or 25, a variant will preferably be at least 50% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 9, 11, 13, 15, 17, 18, 19, 20, 21, 22, 23, 24 or 25 over the entire sequence. There may be at least 80%, for example at least 85%, 90% or 95%, amino acid identity over a stretch of 200 or more, for example 230, 250, 270, 280, 300, 400, 500, 600, 700, 800, 900 or 1000 or more, contiguous amino acids (“hard homology”). Homology is determined as described above. The variant may differ from the wild-type sequence in any of the ways discussed above with reference to SEQ ID NOs: 2 and 4. The enzyme may be covalently attached to the pore. Any method may be used to covalently attach the enzyme to the pore.
- A preferred molecular brake is TrwC Cba-Q594A (SEQ ID NO: 25 with the mutation Q594A). This variant does not function as a helicase (i.e. binds polynucleotides but does not move along them when provided with all the necessary components to facilitate movement, e.g. ATP and Mg2+).
- In strand sequencing, the polynucleotide is translocated through the pore either with or against an applied potential. Exonucleases that act progressively or processively on double stranded polynucleotides can be used on the cis side of the pore to feed the remaining single strand through under an applied potential or the trans side under a reverse potential. Likewise, a helicase that unwinds the double stranded DNA can also be used in a similar manner. A polymerase may also be used. There are also possibilities for sequencing applications that require strand translocation against an applied potential, but the DNA must be first “caught” by the enzyme under a reverse or no potential. With the potential then switched back following binding the strand will pass cis to trans through the pore and be held in an extended conformation by the current flow. The single strand DNA exonucleases or single strand DNA dependent polymerases can act as molecular motors to pull the recently translocated single strand back through the pore in a controlled stepwise manner, trans to cis, against the applied potential.
- Any helicase may be used in the method. Helicases may work in two modes with respect to the pore. First, the method is preferably carried out using a helicase such that it moves the polynucleotide through the pore with the field resulting from the applied voltage. In this mode the 5′ end of the polynucleotide is first captured in the pore, and the helicase moves the polynucleotide into the pore such that it is passed through the pore with the field until it finally translocates through to the trans side of the membrane. Alternatively, the method is preferably carried out such that a helicase moves the polynucleotide through the pore against the field resulting from the applied voltage. In this mode the 3′ end of the polynucleotide is first captured in the pore, and the helicase moves the polynucleotide through the pore such that it is pulled out of the pore against the applied field until finally ejected back to the cis side of the membrane.
- The method may also be carried out in the opposite direction. The 3′ end of the polynucleotide may be first captured in the pore and the helicase may move the polynucleotide into the pore such that it is passed through the pore with the field until it finally translocates through to the trans side of the membrane.
- When the helicase is not provided with the necessary components to facilitate movement or is modified to hinder or prevent its movement, it can bind to the polynucleotide and act as a brake slowing the movement of the polynucleotide when it is pulled into the pore by the applied field. In the inactive mode, it does not matter whether the polynucleotide is captured either 3′ or 5′ down, it is the applied field which pulls the polynucleotide into the pore towards the trans side with the enzyme acting as a brake. When in the inactive mode, the movement control of the polynucleotide by the helicase can be described in a number of ways including ratcheting, sliding and braking. Helicase variants which lack helicase activity can also be used in this way.
- The molecular brake may function as the translocase that removes the MuA transposase. Preferably, the molecular brake is used in addition to a translocase. The molecular brake and translocase may be the same enzyme or different enzymes. Where the molecular brake and translocase are the same enzyme, one molecule of the enzyme may act as a molecular brake and another molecule of the enzyme may act as a translocase to remove the MuA transposase.
- The polynucleotide may be contacted with the molecular brake and the pore in any order. It is preferred that, when the polynucleotide is contacted with the molecular brake, such as a helicase, and the pore, the polynucleotide firstly forms a complex with the protein. When the voltage is applied across the pore, the polynucleotide/protein complex then forms a complex with the pore and controls the movement of the polynucleotide through the pore.
- Any steps in the method using a polynucleotide binding protein are typically carried out in the presence of free nucleotides or free nucleotide analogues and an enzyme cofactor that facilitates the action of the polynucleotide binding protein. The free nucleotides may be one or more of any of the individual nucleotides discussed above. The free nucleotides include, but are not limited to, adenosine monophosphate (AMP), adenosine diphosphate (ADP), adenosine triphosphate (ATP), guanosine monophosphate (GMP), guanosine diphosphate (GDP), guanosine triphosphate (GTP), thymidine monophosphate (TMP), thymidine diphosphate (TDP), thymidine triphosphate (TTP), uridine monophosphate (UMP), uridine diphosphate (UDP), uridine triphosphate (UTP), cytidine monophosphate (CMP), cytidine diphosphate (CDP), cytidine triphosphate (CTP), cyclic adenosine monophosphate (cAMP), cyclic guanosine monophosphate (cGMP), deoxyadenosine monophosphate (dAMP), deoxyadenosine diphosphate (dADP), deoxyadenosine triphosphate (dATP), deoxyguanosine monophosphate (dGMP), deoxyguanosine diphosphate (dGDP), deoxyguanosine triphosphate (dGTP), deoxythymidine monophosphate (dTMP), deoxythymidine diphosphate (dTDP), deoxythymidine triphosphate (dTTP), deoxyuridine monophosphate (dUMP), deoxyuridine diphosphate (dUDP), deoxyuridine triphosphate (dUTP), deoxycytidine monophosphate (dCMP), deoxycytidine diphosphate (dCDP) and deoxycytidine triphosphate (dCTP). The free nucleotides are preferably selected from AMP, TMP, GMP, CMP, UMP, dAMP, dTMP, dGMP or dCMP. The free nucleotides are preferably adenosine triphosphate (ATP). The enzyme cofactor is a factor that allows the construct to function. The enzyme cofactor is preferably a divalent metal cation. The divalent metal cation is preferably Mg2+, Mn2+, Ca2+ or Co2+. The enzyme cofactor is most preferably Mg2+.
- The molecular brakes may be any compound or molecule which binds to the polynucleotide and slows the movement of the polynucleotide through the pore. The molecular brake may be any of those discussed above. The molecular brake preferably comprises a compound which binds to the polynucleotide. The compound is preferably a macrocycle. Suitable macrocycles include, but are not limited to, cyclodextrins, calixarenes, cyclic peptides, crown ethers, cucurbiturils, pillararenes, derivatives thereof or a combination thereof. The cyclodextrin or derivative thereof may be any of those disclosed in Eliseev, A. V., and Schneider, H-J. (1994) J. Am. Chem. Soc. 116, 6081-6088. The cyclodextrin is more preferably heptakis-6-amino-β-cyclodextrin (am7-βCD), 6-monodeoxy-6-monoamino-β-cyclodextrin (am1-βCD) or heptakis-(6-deoxy-6-guanidino)-cyclodextrin (gu7-βCD).
- The method of the invention preferably does not comprise heat inactivating the MuA transposase. Heat inactivation may also inactivate any other enzymes or proteins being used in the preparation or characterisation of the modified polynucleotides. Removing the heat inactivation step also dispenses with the need for additional equipment required for heating, such as a thermal cycler, hot block, or water bath, used for heating up the sample. The method of the invention can therefore be used in a variety of different settings including those without an electricity supply.
- The invention also provides a population of double stranded MuA substrates for modifying a template polynucleotide, wherein each substrate comprises an overhang at one or both ends and a translocase bound to an overhang. Any of the embodiments discussed above equally apply to the population of the invention.
- The invention also provides a plurality of polynucleotides modified using the method of the invention. The plurality of polynucleotides may be in any of the forms discussed above.
- The population or plurality may be isolated, substantially isolated, purified or substantially purified. A population or plurality is isolated or purified if it is completely free of any other components, such as the template polynucleotide, lipids or pores. A population or plurality is substantially isolated if it is mixed with carriers or diluents which will not interfere with its intended use. For instance, a population or plurality is substantially isolated or substantially purified if it is present in a form that comprises less than 10%, less than 5%, less than 2% or less than 1% of other components, such as lipids or pores.
- The invention also comprises a method of characterising at least one polynucleotide modified using a method of the invention. The modified polynucleotide is contacted with a transmembrane pore such that at least one strand of the polynucleotide moves through the pore. One or more measurements which are indicative of one or more characteristics of the polynucleotide are taken as the at least one strand moves with respect to the pore.
- The invention also provides a method of characterising a template polynucleotide. The template polynucleotide is modified using the method of the invention to produce a plurality of modified polynucleotides. Each modified polynucleotide is contacted with a transmembrane pore such that at least one strand of each polynucleotide moves through the pore. One or more measurements which are indicative of one or more characteristics of the polynucleotide are taken as the at least one strand of each polynucleotide moves with respect to the pore.
- If the/each modified polynucleotide comprises a hairpin loop, the method preferably comprises contacting the/each modified polynucleotide with a transmembrane pore such that both strands of the polynucleotide move through the pore. If molecular brakes are present on the/each modified polynucleotide, the molecular brakes may control the movement of the/each modified polynucleotide through the pore and/or separate the two strands of the/each modified polynucleotide.
- The transmembrane pore is typically in a membrane. Any membrane may be used in accordance with the invention. Suitable membranes are well-known in the art. The membrane is preferably an amphiphilic layer. An amphiphilic layer is a layer formed from amphiphilic molecules, such as phospholipids, which have both hydrophilic and lipophilic properties. The amphiphilic molecules may be synthetic or naturally occurring. Non-naturally occurring amphiphiles and amphiphiles which form a monolayer are known in the art and include, for example, block copolymers (Gonzalez-Perez et al., Langmuir, 2009, 25, 10447-10450). Block copolymers are polymeric materials in which two or more monomer sub-units are polymerised together to create a single polymer chain. Block copolymers typically have properties that are contributed by each monomer sub-unit. However, a block copolymer may have unique properties that polymers formed from the individual sub-units do not possess. Block copolymers can be engineered such that one of the monomer sub-units is hydrophobic (i.e. lipophilic), whilst the other sub-unit(s) are hydrophilic whilst in aqueous media. In this case, the block copolymer may possess amphiphilic properties and may form a structure that mimics a biological membrane. The block copolymer may be a diblock (consisting of two monomer sub-units), but may also be constructed from more than two monomer sub-units to form more complex arrangements that behave as amphiphiles. The copolymer may be a triblock, tetrablock or pentablock copolymer. The membrane is preferably a triblock copolymer membrane.
- Archaebacterial bipolar tetraether lipids are naturally occurring lipids that are constructed such that the lipid forms a monolayer membrane. These lipids are generally found in extremophiles that survive in harsh biological environments, thermophiles, halophiles and acidophiles. Their stability is believed to derive from the fused nature of the final bilayer. It is straightforward to construct block copolymer materials that mimic these biological entities by creating a triblock polymer that has the general motif hydrophilic-hydrophobic-hydrophilic. This material may form monomeric membranes that behave similarly to lipid bilayers and encompass a range of phase behaviours from vesicles through to laminar membranes. Membranes formed from these triblock copolymers hold several advantages over biological lipid membranes. Because the triblock copolymer is synthesised, the exact construction can be carefully controlled to provide the correct chain lengths and properties required to form membranes and to interact with pores and other proteins.
- Block copolymers may also be constructed from sub-units that are not classed as lipid sub-materials; for example a hydrophobic polymer may be made from siloxane or other non-hydrocarbon based monomers. The hydrophilic sub-section of block copolymer can also possess low protein binding properties, which allows the creation of a membrane that is highly resistant when exposed to raw biological samples. This head group unit may also be derived from non-classical lipid head-groups.
- Triblock copolymer membranes also have increased mechanical and environmental stability compared with biological lipid membranes, for example a much higher operational temperature or pH range. The synthetic nature of the block copolymers provides a platform to customise polymer based membranes for a wide range of applications.
- The membrane is most preferably one of the membranes disclosed in International Application No. PCT/GB2013/052766 or PCT/GB2013/052767.
- The amphiphilic molecules may be chemically-modified or functionalised to facilitate coupling of the polynucleotide.
- The amphiphilic layer may be a monolayer or a bilayer. The amphiphilic layer is typically planar. The amphiphilic layer may be curved. The amphiphilic layer may be supported. The amphiphilic layer may be concave. The amphiphilic layer may be suspended from raised pillars such that the peripheral region of the amphiphilic layer is higher than the amphiphilic layer region in the centre. This may allow the microparticle to travel, move, slide or roll along the membrane as described above.
- Amphiphilic membranes are typically naturally mobile, essentially acting as two dimensional fluids with lipid diffusion rates of approximately 10⁻⁸ cm s⁻¹. This means that the pore and coupled polynucleotide can typically move within an amphiphilic membrane.
- The membrane may be a lipid bilayer. Lipid bilayers are models of cell membranes and serve as excellent platforms for a range of experimental studies. For example, lipid bilayers can be used for in vitro investigation of membrane proteins by single-channel recording. Alternatively, lipid bilayers can be used as biosensors to detect the presence of a range of substances. The lipid bilayer may be any lipid bilayer. Suitable lipid bilayers include, but are not limited to, a planar lipid bilayer, a supported bilayer or a liposome. The lipid bilayer is preferably a planar lipid bilayer. Suitable lipid bilayers are disclosed in International Application No. PCT/GB08/000563 (published as WO 2008/102121), International Application No. PCT/GB08/004127 (published as WO 2009/077734) and International Application No. PCT/GB2006/001057 (published as WO 2006/100484).
- Methods for forming lipid bilayers are known in the art. Lipid bilayers are commonly formed by the method of Montal and Mueller (Proc. Natl. Acad. Sci. USA, 1972; 69: 3561-3566), in which a lipid monolayer is carried on an aqueous solution/air interface past either side of an aperture which is perpendicular to that interface. The lipid is normally added to the surface of an aqueous electrolyte solution by first dissolving it in an organic solvent and then allowing a drop of the solvent to evaporate on the surface of the aqueous solution on either side of the aperture. Once the organic solvent has evaporated, the solution/air interfaces on either side of the aperture are physically moved up and down past the aperture until a bilayer is formed. Planar lipid bilayers may be formed across an aperture in a membrane or across an opening into a recess.
- The method of Montal & Mueller is popular because it is a cost-effective and relatively straightforward method of forming good quality lipid bilayers that are suitable for protein pore insertion. Other common methods of bilayer formation include tip-dipping, painting bilayers and patch-clamping of liposome bilayers.
- Tip-dipping bilayer formation entails touching the aperture surface (for example, a pipette tip) onto the surface of a test solution that is carrying a monolayer of lipid. Again, the lipid monolayer is first generated at the solution/air interface by allowing a drop of lipid dissolved in organic solvent to evaporate at the solution surface. The bilayer is then formed by the Langmuir-Schaefer process and requires mechanical automation to move the aperture relative to the solution surface.
- For painted bilayers, a drop of lipid dissolved in organic solvent is applied directly to the aperture, which is submerged in an aqueous test solution. The lipid solution is spread thinly over the aperture using a paintbrush or an equivalent. Thinning of the solvent results in formation of a lipid bilayer. However, complete removal of the solvent from the bilayer is difficult and consequently the bilayer formed by this method is less stable and more prone to noise during electrochemical measurement.
- Patch-clamping is commonly used in the study of biological cell membranes. The cell membrane is clamped to the end of a pipette by suction and a patch of the membrane becomes attached over the aperture. The method has been adapted for producing lipid bilayers by clamping liposomes which then burst to leave a lipid bilayer sealing over the aperture of the pipette. The method requires stable, giant and unilamellar liposomes and the fabrication of small apertures in materials having a glass surface.
- Liposomes can be formed by sonication, extrusion or the Mozafari method (Colas et al. (2007) Micron 38:841-847).
- In a preferred embodiment, the lipid bilayer is formed as described in International Application No. PCT/GB08/004127 (published as WO 2009/077734). Advantageously in this method, the lipid bilayer is formed from dried lipids. In a most preferred embodiment, the lipid bilayer is formed across an opening as described in WO2009/077734 (PCT/GB08/004127).
- A lipid bilayer is formed from two opposing layers of lipids. The two layers of lipids are arranged such that their hydrophobic tail groups face towards each other to form a hydrophobic interior. The hydrophilic head groups of the lipids face outwards towards the aqueous environment on each side of the bilayer. The bilayer may be present in a number of lipid phases including, but not limited to, the liquid disordered phase (fluid lamellar), liquid ordered phase, solid ordered phase (lamellar gel phase, interdigitated gel phase) and planar bilayer crystals (lamellar sub-gel phase, lamellar crystalline phase).
- Any lipid composition that forms a lipid bilayer may be used. The lipid composition is chosen such that a lipid bilayer having the required properties, such as surface charge, ability to support membrane proteins, packing density or mechanical properties, is formed. The lipid composition can comprise one or more different lipids. For instance, the lipid composition can contain up to 100 lipids. The lipid composition preferably contains 1 to 10 lipids. The lipid composition may comprise naturally-occurring lipids and/or artificial lipids.
- The lipids typically comprise a head group, an interfacial moiety and two hydrophobic tail groups which may be the same or different. Suitable head groups include, but are not limited to, neutral head groups, such as diacylglycerides (DG) and ceramides (CM); zwitterionic head groups, such as phosphatidylcholine (PC), phosphatidylethanolamine (PE) and sphingomyelin (SM); negatively charged head groups, such as phosphatidylglycerol (PG), phosphatidylserine (PS), phosphatidylinositol (PI), phosphatidic acid (PA) and cardiolipin (CA); and positively charged head groups, such as trimethylammonium-Propane (TAP). Suitable interfacial moieties include, but are not limited to, naturally-occurring interfacial moieties, such as glycerol-based or ceramide-based moieties. Suitable hydrophobic tail groups include, but are not limited to, saturated hydrocarbon chains, such as lauric acid (n-Dodecanoic acid), myristic acid (n-Tetradecanoic acid), palmitic acid (n-Hexadecanoic acid), stearic acid (n-Octadecanoic acid) and arachidic acid (n-Eicosanoic acid); unsaturated hydrocarbon chains, such as oleic acid (cis-9-Octadecenoic acid); and branched hydrocarbon chains, such as phytanoyl. The length of the chain and the position and number of the double bonds in the unsaturated hydrocarbon chains can vary. The length of the chains and the position and number of the branches, such as methyl groups, in the branched hydrocarbon chains can vary. The hydrophobic tail groups can be linked to the interfacial moiety as an ether or an ester. The lipids may be mycolic acid.
- The lipids can also be chemically-modified. The head group or the tail group of the lipids may be chemically-modified. Suitable lipids whose head groups have been chemically-modified include, but are not limited to, PEG-modified lipids, such as 1,2-Diacyl-sn-Glycero-3-Phosphoethanolamine-N-[Methoxy(Polyethylene glycol)-2000]; functionalised PEG Lipids, such as 1,2-Distearoyl-sn-Glycero-3 Phosphoethanolamine-N-[Biotinyl(Polyethylene Glycol) 2000]; and lipids modified for conjugation, such as 1,2-Dioleoyl-sn-Glycero-3-Phosphoethanolamine-N-(succinyl) and 1,2-Dipalmitoyl-sn-Glycero-3-Phosphoethanolamine-N-(Biotinyl). Suitable lipids whose tail groups have been chemically-modified include, but are not limited to, polymerisable lipids, such as 1,2-bis(10,12-tricosadiynoyl)-sn-Glycero-3-Phosphocholine; fluorinated lipids, such as 1-Palmitoyl-2-(16-Fluoropalmitoyl)-sn-Glycero-3-Phosphocholine; deuterated lipids, such as 1,2-Dipalmitoyl-D62-sn-Glycero-3-Phosphocholine; and ether linked lipids, such as 1,2-Di-O-phytanyl-sn-Glycero-3-Phosphocholine. The lipids may be chemically-modified or functionalised to facilitate coupling of the polynucleotide.
- The amphiphilic layer, for example the lipid composition, typically comprises one or more additives that will affect the properties of the layer. Suitable additives include, but are not limited to, fatty acids, such as palmitic acid, myristic acid and oleic acid; fatty alcohols, such as palmitic alcohol, myristic alcohol and oleic alcohol; sterols, such as cholesterol, ergosterol, lanosterol, sitosterol and stigmasterol; lysophospholipids, such as 1-Acyl-2-Hydroxy-sn-Glycero-3-Phosphocholine; and ceramides.
- In another preferred embodiment, the membrane is a solid state layer. Solid state layers can be formed from both organic and inorganic materials including, but not limited to, microelectronic materials, insulating materials such as HfO2, Si3N4, Al2O3, and SiO, organic and inorganic polymers such as polyamide, plastics such as Teflon® or elastomers such as two-component addition-cure silicone rubber, and glasses. The solid state layer may be formed by atomic layer deposition (ALD). The ALD solid state layer may comprise alternating layers of HfO2 and Al2O3. The solid state layer may be formed from monatomic layers, such as graphene, or layers that are only a few atoms thick. Suitable graphene layers are disclosed in International Application No. PCT/US2008/010637 (published as WO 2009/035647). Yusko et al., Nature Nanotechnology, 2011; 6: 253-260 and US Patent Application No. 2013/0048499 describe the delivery of proteins to transmembrane pores in solid state layers without the use of microparticles. The method of the invention may be used to improve the delivery in the methods disclosed in these documents.
- The method is typically carried out using (i) an artificial amphiphilic layer comprising a pore, (ii) an isolated, naturally-occurring lipid bilayer comprising a pore, or (iii) a cell having a pore inserted therein. The method is typically carried out using an artificial amphiphilic layer, such as an artificial triblock copolymer layer. The layer may comprise other transmembrane and/or intramembrane proteins as well as other molecules in addition to the pore. Suitable apparatus and conditions are discussed below. The method of the invention is typically carried out in vitro.
- A transmembrane pore is a structure that crosses the membrane to some degree. Typically, a transmembrane pore comprises a first opening and a second opening with a lumen extending between the first opening and the second opening. The transmembrane pore permits hydrated ions driven by an applied potential to flow across or within the membrane. The transmembrane pore typically crosses the entire membrane so that hydrated ions may flow from one side of the membrane to the other side of the membrane. However, the transmembrane pore does not have to cross the membrane. It may be closed at one end. For instance, the pore may be a well, gap, channel, trench or slit in the membrane along which or into which hydrated ions may flow.
- Any transmembrane pore may be used in the invention. The pore may be biological or artificial. Suitable pores include, but are not limited to, protein pores, polynucleotide pores and solid state pores. The pore may be a DNA origami pore (Langecker et al., Science, 2012; 338: 932-936).
- The transmembrane pore is preferably a transmembrane protein pore. A transmembrane protein pore is a polypeptide or a collection of polypeptides that permits hydrated ions, such as polynucleotide, to flow from one side of a membrane to the other side of the membrane. In the present invention, the transmembrane protein pore is capable of forming a pore that permits hydrated ions driven by an applied potential to flow from one side of the membrane to the other. The transmembrane protein pore preferably permits polynucleotides to flow from one side of the membrane, such as a triblock copolymer membrane, to the other. The transmembrane protein pore allows a polynucleotide, such as DNA or RNA, to be moved through the pore.
- The transmembrane protein pore may be a monomer or an oligomer. The pore is preferably made up of several repeating subunits, such as at least 6, at least 7, at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, or at least 16 subunits. The pore is preferably a hexameric, heptameric, octameric or nonameric pore. The pore may be a homo-oligomer or a hetero-oligomer.
- The transmembrane protein pore typically comprises a barrel or channel through which the ions may flow. The subunits of the pore typically surround a central axis and contribute strands to a transmembrane β barrel or channel or a transmembrane α-helix bundle or channel.
- The barrel or channel of the transmembrane protein pore typically comprises amino acids that facilitate interaction with nucleotides, polynucleotides or nucleic acids. These amino acids are preferably located near a constriction of the barrel or channel. The transmembrane protein pore typically comprises one or more positively charged amino acids, such as arginine, lysine or histidine, or aromatic amino acids, such as tyrosine or tryptophan. These amino acids typically facilitate the interaction between the pore and nucleotides, polynucleotides or nucleic acids.
- Transmembrane protein pores for use in accordance with the invention can be derived from β-barrel pores or α-helix bundle pores. β-barrel pores comprise a barrel or channel that is formed from β-strands. Suitable β-barrel pores include, but are not limited to, β-toxins, such as α-hemolysin, anthrax toxin and leukocidins, and outer membrane proteins/porins of bacteria, such as Mycobacterium smegmatis porin (Msp), for example MspA, MspB, MspC or MspD, CsgG, outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A and Neisseria autotransporter lipoprotein (NalP) and other pores, such as lysenin. α-helix bundle pores comprise a barrel or channel that is formed from α-helices. Suitable α-helix bundle pores include, but are not limited to, inner membrane proteins and α outer membrane proteins, such as WZA and ClyA toxin. The transmembrane pore may be derived from lysenin. Suitable pores derived from CsgG are disclosed in International Application No. PCT/EP2015/069965. Suitable pores derived from lysenin are disclosed in International Application No. PCT/GB2013/050667 (published as WO 2013/153359). The transmembrane pore may be derived from or based on Msp, α-hemolysin (α-HL), lysenin, CsgG, ClyA, Sp1 and haemolytic protein fragaceatoxin C (FraC). The wild type α-hemolysin pore is formed of 7 identical monomers or sub-units (i.e., it is heptameric). The sequence of one monomer or sub-unit of α-hemolysin-NN is shown in SEQ ID NO: 4.
- The transmembrane protein pore is preferably derived from Msp, more preferably from MspA. Such a pore will be oligomeric and typically comprises 7, 8, 9 or 10 monomers derived from Msp. The pore may be a homo-oligomeric pore derived from Msp comprising identical monomers. Alternatively, the pore may be a hetero-oligomeric pore derived from Msp comprising at least one monomer that differs from the others. Preferably the pore is derived from MspA or a homolog or paralog thereof.
- A monomer derived from Msp typically comprises the sequence shown in SEQ ID NO: 2 or a variant thereof. SEQ ID NO: 2 is the MS-(B1)8 mutant of the MspA monomer. It includes the following mutations: D90N, D91N, D93N, D118R, D134R and E139K. A variant of SEQ ID NO: 2 is a polypeptide that has an amino acid sequence which varies from that of SEQ ID NO: 2 and which retains its ability to form a pore. The ability of a variant to form a pore can be assayed using any method known in the art. For instance, the variant may be inserted into an amphiphilic layer along with other appropriate subunits and its ability to oligomerise to form a pore may be determined. Methods are known in the art for inserting subunits into membranes, such as amphiphilic layers. For example, subunits may be suspended in a purified form in a solution containing a triblock copolymer membrane such that it diffuses to the membrane and is inserted by binding to the membrane and assembling into a functional state. Alternatively, subunits may be directly inserted into the membrane using the “pick and place” method described in M. A. Holden, H. Bayley. J. Am. Chem. Soc. 2005, 127, 6502-6503 and International Application No. PCT/GB2006/001057 (published as WO 2006/100484).
- Over the entire length of the amino acid sequence of SEQ ID NO: 2, a variant will preferably be at least 50% homologous to that sequence based on amino acid similarity or identity. More preferably, the variant may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid similarity or identity to the amino acid sequence of SEQ ID NO: 2 over the entire sequence. There may be at least 80%, for example at least 85%, 90% or 95%, amino acid similarity or identity over a stretch of 100 or more, for example 125, 150, 175 or 200 or more, contiguous amino acids (“hard homology”).
- Standard methods in the art may be used to determine homology. For example the UWGCG Package provides the BESTFIT program which can be used to calculate homology, for example used on its default settings (Devereux et al (1984) Nucleic Acids Research 12, p 387-395). The PILEUP and BLAST algorithms can be used to calculate homology or line up sequences (such as identifying equivalent residues or corresponding sequences (typically on their default settings)), for example as described in Altschul S. F. (1993) J Mol Evol 36:290-300; Altschul, S. F et al (1990) J Mol Biol 215:403-10. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). Similarity can be measured using pairwise identity or by applying a scoring matrix such as BLOSUM62 and converting to an equivalent identity. Since they represent functional rather than evolved changes, deliberately mutated positions would be masked when determining homology. Similarity may be determined more sensitively by the application of position-specific scoring matrices using, for example, PSIBLAST on a comprehensive database of protein sequences. A different scoring matrix could be used that reflects amino acid chemico-physical properties rather than frequency of substitution over evolutionary time scales (e.g. charge).
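- As an illustration of the pairwise identity measure mentioned above, the following Python sketch computes percent identity over the full length of two sequences and the highest identity over any contiguous stretch of a given length. It assumes the two sequences have already been aligned (equal length, with "-" marking a gap). The function names, the gap convention and the toy sequences are assumptions made for illustration only. It is not the BESTFIT, PILEUP or BLAST algorithm, it does not apply a scoring matrix such as BLOSUM62, and it does not mask deliberately mutated positions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns with identical residues.

    Both inputs are assumed to be pre-aligned: equal length, '-' marks a gap.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def best_window_identity(aligned_a: str, aligned_b: str, window: int) -> float:
    """Highest percent identity over any contiguous stretch of `window` columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or window > len(aligned_a):
        raise ValueError("window must be between 1 and the alignment length")
    return max(
        percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
        for i in range(len(aligned_a) - window + 1)
    )


# Toy, hypothetical sequences (not taken from any SEQ ID NO).
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
print(percent_identity(reference, variant))         # identity over the whole length
print(best_window_identity(reference, variant, 5))  # best stretch of 5 columns
```

In practice the window would be set to the stretch length of interest, for example 100 or more contiguous amino acids, and the alignment would come from one of the programs named above.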
- SEQ ID NO: 2 is the MS-(B1)8 mutant of the MspA monomer. The variant may comprise any of the mutations in the MspB, C or D monomers compared with MspA. The mature forms of MspB, C and D are shown in SEQ ID NOs: 5 to 7. In particular, the variant may comprise the following substitution present in MspB: A138P. The variant may comprise one or more of the following substitutions present in MspC: A96G, N102E and A138P. The variant may comprise one or more of the following mutations present in MspD: Deletion of G1, L2V, E5Q, L8V, D13G, W21A, D22E, K47T, I49H, I68V, D91G, A96Q, N102D, S103T, V104I, S136K and G141A.
- The variant may comprise combinations of one or more of the mutations and substitutions from Msp B, C and D. The variant preferably comprises the mutation L88N. A variant of SEQ ID NO: 2 has the mutation L88N in addition to all the mutations of MS-B1 and is called MS-(B2)8. The pore used in the invention is preferably MS-(B2)8. The variant of SEQ ID NO: 2 preferably comprises one or more of D56N, D56F, E59R, G75S, G77S, A96D and Q126R. A variant of SEQ ID NO: 2 has the mutations G75S/G77S/L88N/Q126R in addition to all the mutations of MS-B1 and is called MS-B2C. The pore used in the invention is preferably MS-(B2)8 or MS-(B2C)8. The variant of SEQ ID NO: 2 preferably comprises N93D. The variant more preferably comprises the mutations G75S/G77S/L88N/N93D/Q126R.
- Amino acid substitutions may be made to the amino acid sequence of SEQ ID NO: 2 in addition to those discussed above, for example up to 1, 2, 3, 4, 5, 10, 20 or 30 substitutions.
- Conservative substitutions replace amino acids with other amino acids of similar chemical structure, similar chemical properties or similar side-chain volume. The amino acids introduced may have similar polarity, hydrophilicity, hydrophobicity, basicity, acidity, neutrality or charge to the amino acids they replace. Alternatively, the conservative substitution may introduce another amino acid that is aromatic or aliphatic in the place of a pre-existing aromatic or aliphatic amino acid.
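- The substitution notation used above (for example D90N, or G75S/G77S/L88N/Q126R for a combination) names the original residue, its 1-based position and the replacement residue. A minimal Python sketch of how such a string could be applied to a sequence is given below. The function name and the toy sequence are hypothetical and are not taken from this document; the sketch handles single-residue substitutions only and does not handle deletions such as "Deletion of G1".

```python
import re

# One-letter original residue, 1-based position, one-letter replacement, e.g. "L88N".
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, notation: str) -> str:
    """Apply slash-separated substitutions such as "G75S/G77S" to a sequence.

    Positions are 1-based. A ValueError is raised if the residue found at a
    position does not match the original residue named in the notation.
    """
    residues = list(sequence)
    for token in notation.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        original = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(
                f"expected {original} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy, hypothetical 10-residue sequence; it is not SEQ ID NO: 2.
toy = "MDLGADEQGK"
print(apply_substitutions(toy, "D2N/E7R"))  # MNLGADRQGK
```

Checking the original residue before replacing it guards against applying a mutation list to the wrong reference sequence or with an offset in numbering.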
- The transmembrane protein pore is preferably derived from CsgG, more preferably from CsgG from E. coli Str. K-12 substr. MC4100. Such a pore will be oligomeric and typically comprises 7, 8, 9 or 10 monomers derived from CsgG. The pore may be a homo-oligomeric pore derived from CsgG comprising identical monomers. Alternatively, the pore may be a hetero-oligomeric pore derived from CsgG comprising at least one monomer that differs from the others.
- A monomer derived from CsgG typically comprises the sequence shown in SEQ ID NO: 114 or a variant thereof. A variant of SEQ ID NO: 114 is a polypeptide that has an amino acid sequence which varies from that of SEQ ID NO: 114 and which retains its ability to form a pore.
- The ability of a variant to form a pore can be assayed using any method known in the art as discussed above.
- Over the entire length of the amino acid sequence of SEQ ID NO: 114, a variant will preferably be at least 50% homologous to that sequence based on amino acid similarity or identity. More preferably, the variant may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid similarity or identity to the amino acid sequence of SEQ ID NO: 114 over the entire sequence. There may be at least 80%, for example at least 85%, 90% or 95%, amino acid similarity or identity over a stretch of 100 or more, for example 125, 150, 175 or 200 or more, contiguous amino acids (“hard homology”). Homology can be measured as discussed above.
- The variant of SEQ ID NO: 114 may comprise any of the mutations disclosed in International Application No. PCT/GB2015/069965 (published as WO 2016/034591). The variant of SEQ ID NO: 114 preferably comprises one or more of the following (i) one or more mutations at the following positions (i.e. mutations at one or more of the following positions) N40, D43, E44, S54, S57, Q62, R97, E101, E124, E131, R142, T150 and R192, such as one or more mutations at the following positions (i.e. mutations at one or more of the following positions) N40, D43, E44, S54, S57, Q62, E101, E131 and T150 or N40, D43, E44, E101 and E131; (ii) mutations at Y51/N55, Y51/F56, N55/F56 or Y51/N55/F56; (iii) Q42R or Q42K; (iv) K49R; (v) N102R, N102F, N102Y or N102W; (vi) D149N, D149Q or D149R; (vii) E185N, E185Q or E185R; (viii) D195N, D195Q or D195R; (ix) E201N, E201Q or E201R; (x) E203N, E203Q or E203R; and (xi) deletion of one or more of the following positions F48, K49, P50, Y51, P52, A53, S54, N55, F56 and S57. The variant may comprise any combination of (i) to (xi). If the variant comprises any one of (i) and (iii) to (xi), it may further comprise a mutation at one or more of Y51, N55 and F56, such as at Y51, N55, F56, Y51/N55, Y51/F56, N55/F56 or Y51/N55/F56.
- Preferred variants of SEQ ID NO: 114 which form pores in which fewer nucleotides contribute to the current as the polynucleotide moves through the pore comprise Y51A/F56A, Y51A/F56N, Y51I/F56A, Y51L/F56A, Y51T/F56A, Y51I/F56N, Y51L/F56N or Y51T/F56N or more preferably Y51I/F56A, Y51L/F56A or Y51T/F56A.
- Preferred variants of SEQ ID NO: 114 which form pores displaying an increased range comprise mutations at the following positions:
- Y51, F56, D149, E185, E201 and E203;
- N55 and F56;
- Y51 and F56;
- Y51, N55 and F56; or
- F56 and N102.
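- Whether a given variant comprises mutations at one of the listed combinations of positions can be checked mechanically. The Python sketch below extracts the mutated positions from a slash-separated mutation string and reports which of the position sets listed above are fully covered. The function names are hypothetical, and reading "comprise" as "contains at least these positions" is an assumption of the sketch.

```python
import re

# Position sets copied from the list above (original residue plus position).
POSITION_SETS = [
    {"Y51", "F56", "D149", "E185", "E201", "E203"},
    {"N55", "F56"},
    {"Y51", "F56"},
    {"Y51", "N55", "F56"},
    {"F56", "N102"},
]


def mutated_positions(notation: str) -> set:
    """Return the positions named in a string such as "Y51A/F56Q"."""
    positions = set()
    for token in notation.split("/"):
        match = re.fullmatch(r"([A-Z]\d+)[A-Z]", token.strip())
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        positions.add(match.group(1))
    return positions


def matching_sets(notation: str) -> list:
    """Return every listed position set that the variant fully covers."""
    found = mutated_positions(notation)
    return [positions for positions in POSITION_SETS if positions <= found]


for positions in matching_sets("Y51A/N55S/F56A"):
    print(sorted(positions))
```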
- Preferred variants of SEQ ID NO: 114 which form pores displaying an increased range comprise:
- Y51N, F56A, D149N, E185R, E201N and E203N;
- N55S and F56Q;
- Y51A and F56A;
- Y51A and F56N;
- Y51I and F56A;
- Y51L and F56A;
- Y51T and F56A;
- Y51I and F56N;
- Y51L and F56N;
- Y51T and F56N;
- Y51T and F56Q;
- Y51A, N55S and F56A;
- Y51A, N55S and F56N;
- Y51T, N55S and F56Q; or
- F56Q and N102R.
- Preferred variants of SEQ ID NO: 114 which form pores in which fewer nucleotides contribute to the current as the polynucleotide moves through the pore comprise mutations at the following positions:
- N55 and F56, such as N55X and F56Q, wherein X is any amino acid; or
- Y51 and F56, such as Y51X and F56Q, wherein X is any amino acid.
- Preferred variants of SEQ ID NO: 114 which form pores displaying an increased throughput comprise mutations at the following positions:
- D149, E185 and E203;
- D149, E185, E201 and E203; or
- D149, E185, D195, E201 and E203.
- Preferred variants which form pores displaying an increased throughput comprise:
- D149N, E185N and E203N;
- D149N, E185N, E201N and E203N;
- D149N, E185R, D195N, E201N and E203N; or
- D149N, E185R, D195N, E201R and E203N.
- Preferred variants of SEQ ID NO: 114 which form pores in which capture of the polynucleotide is increased comprise the following mutations:
- D43N/Y51T/F56Q;
- E44N/Y51T/F56Q;
- D43N/E44N/Y51T/F56Q;
- Y51T/F56Q/Q62R;
- D43N/Y51T/F56Q/Q62R;
- E44N/Y51T/F56Q/Q62R; or
- D43N/E44N/Y51T/F56Q/Q62R.
- Preferred variants of SEQ ID NO: 114 comprise the following mutations:
- D149R/E185R/E201R/E203R or Y51T/F56Q/D149R/E185R/E201R/E203R;
- D149N/E185N/E201N/E203N or Y51T/F56Q/D149N/E185N/E201N/E203N;
- E201R/E203R or Y51T/F56Q/E201R/E203R;
- E201N/E203R or Y51T/F56Q/E201N/E203R;
- E203R or Y51T/F56Q/E203R;
- E203N or Y51T/F56Q/E203N;
- E201R or Y51T/F56Q/E201R;
- E201N or Y51T/F56Q/E201N;
- E185R or Y51T/F56Q/E185R;
- E185N or Y51T/F56Q/E185N;
- D149R or Y51T/F56Q/D149R;
- D149N or Y51T/F56Q/D149N;
- R142E or Y51T/F56Q/R142E;
- R142N or Y51T/F56Q/R142N;
- R192E or Y51T/F56Q/R192E; or
- R192N or Y51T/F56Q/R192N.
- Preferred variants of SEQ ID NO: 114 comprise the following mutations:
- Y51A/F56Q/E101N/N102R;
- Y51A/F56Q/R97N/N102G;
- Y51A/F56Q/R97N/N102R;
- Y51A/F56Q/R97N;
- Y51A/F56Q/R97G;
- Y51A/F56Q/R97L;
- Y51A/F56Q/N102R;
- Y51A/F56Q/N102F;
- Y51A/F56Q/N102G;
- Y51A/F56Q/E101R;
- Y51A/F56Q/E101F;
- Y51A/F56Q/E101N; or
- Y51A/F56Q/E101G.
- The variant of SEQ ID NO: 114 may comprise any of the substitutions present in another CsgG homologue. Preferred CsgG homologues are shown in SEQ ID NOs: 3 to 7 and 26 to 41 of International Application No. PCT/GB2015/069965 (published as WO 2016/034591).
- Any of the proteins described herein, such as the transmembrane protein pores, may be modified to assist their identification or purification, for example by the addition of histidine residues (a his tag), aspartic acid residues (an asp tag), a streptavidin tag, a flag tag, a SUMO tag, a GST tag or a MBP tag, or by the addition of a signal sequence to promote their secretion from a cell where the polypeptide does not naturally contain such a sequence. An alternative to introducing a genetic tag is to chemically react a tag onto a native or engineered position on the pore or construct. An example of this would be to react a gel-shift reagent to a cysteine engineered on the outside of the pore. This has been demonstrated as a method for separating hemolysin hetero-oligomers (Chem Biol. 1997 July; 4(7):497-505).
- The pore may be labelled with a revealing label. The revealing label may be any suitable label which allows the pore to be detected. Suitable labels include, but are not limited to, fluorescent molecules, radioisotopes, e.g. 125I, 35S, enzymes, antibodies, antigens, polynucleotides and ligands such as biotin.
- Any of the proteins described herein, such as the transmembrane protein pores, may be made synthetically or by recombinant means. For example, the pore may be synthesised by in vitro translation and transcription (IVTT). The amino acid sequence of the pore may be modified to include non-naturally occurring amino acids or to increase the stability of the protein. When a protein is produced by synthetic means, such amino acids may be introduced during production. The pore may also be altered following either synthetic or recombinant production.
- Any of the proteins described herein, such as the transmembrane protein pores, can be produced using standard methods known in the art. Polynucleotide sequences encoding a pore or construct may be derived and replicated using standard methods in the art. Polynucleotide sequences encoding a pore or construct may be expressed in a bacterial host cell using standard techniques in the art. The pore may be produced in a cell by in situ expression of the polypeptide from a recombinant expression vector. The expression vector optionally carries an inducible promoter to control the expression of the polypeptide. These methods are described in Sambrook, J. and Russell, D. (2001). Molecular Cloning: A Laboratory Manual, 3rd Edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
- The pore may be produced in large scale following purification by any protein liquid chromatography system from protein producing organisms or after recombinant expression. Typical protein liquid chromatography systems include FPLC, AKTA systems, the Bio-Cad system, the Bio-Rad BioLogic system and the Gilson HPLC system.
- The/each modified polynucleotide preferably comprises one or more anchors which are capable of coupling to the membrane. The method preferably further comprises coupling the target polynucleotide to the membrane using the one or more anchors.
- The anchor comprises a group which couples (or binds) to the polynucleotide and a group which couples (or binds) to the membrane. Each anchor may covalently couple (or bind) to the polynucleotide and/or the membrane. The group may be a chemical group and/or a functional group.
- The polynucleotide may be coupled to the membrane using any number of anchors, such as 2, 3, 4 or more anchors. For instance, the polynucleotide may be coupled to the membrane using two anchors each of which separately couples (or binds) to both the polynucleotide and membrane.
- The one or more anchors may comprise one or more molecular brakes or polynucleotide binding proteins. Each anchor may comprise one or more molecular brakes or polynucleotide binding proteins. The molecular brake(s) or polynucleotide binding protein(s) may be any of those discussed below.
- If the membrane is an amphiphilic layer, such as a triblock copolymer membrane, the one or more anchors preferably comprise a polypeptide anchor present in the membrane and/or a hydrophobic anchor present in the membrane. The hydrophobic anchor is preferably a lipid, fatty acid, sterol, carbon nanotube, polypeptide, protein or amino acid, for example cholesterol, palmitate or tocopherol. In preferred embodiments, the one or more anchors are not the pore.
- The components of the membrane, such as the amphiphilic molecules, copolymer or lipids, may be chemically-modified or functionalised to form the one or more anchors. Examples of suitable chemical modifications and suitable ways of functionalising the components of the membrane are discussed in more detail below. Any proportion of the membrane components may be functionalised, for example at least 0.01%, at least 0.1%, at least 1%, at least 10%, at least 25%, at least 50% or 100%.
- The polynucleotide may be coupled directly to the membrane. The one or more anchors used to couple the polynucleotide to the membrane preferably comprise a linker. The one or more anchors may comprise one or more, such as 2, 3, 4 or more, linkers. One linker may be used to couple more than one, such as 2, 3, 4 or more, polynucleotides to the membrane.
- Preferred linkers include, but are not limited to, polymers, such as polynucleotides, polyethylene glycols (PEGs), polysaccharides and polypeptides. These linkers may be linear, branched or circular. For instance, the linker may be a circular polynucleotide. The polynucleotide may hybridise to a complementary sequence on the circular polynucleotide linker.
- The one or more anchors or one or more linkers may comprise a component that can be cut or broken down, such as a restriction site or a photolabile group.
- Functionalised linkers and the ways in which they can couple molecules are known in the art. For instance, linkers functionalised with maleimide groups will react with and attach to cysteine residues in proteins. In the context of this invention, the protein may be present in the membrane, may be the polynucleotide itself or may be used to couple (or bind) to the polynucleotide. This is discussed in more detail below.
- Crosslinkage of polynucleotides can be avoided using a “lock and key” arrangement. Only one end of each linker may react together to form a longer linker and the other ends of the linker each react with the polynucleotide or membrane respectively. Such linkers are described in International Application No. PCT/GB10/000132 (published as WO 2010/086602).
- The use of a linker is preferred in the sequencing embodiments discussed below. If a polynucleotide is permanently coupled directly to the membrane in the sense that it does not uncouple when interacting with the pore, then some sequence data will be lost as the sequencing run cannot continue to the end of the polynucleotide due to the distance between the membrane and the pore. If a linker is used, then the polynucleotide can be processed to completion.
- The coupling may be permanent or stable. In other words, the coupling may be such that the polynucleotide remains coupled to the membrane when interacting with the pore.
- The coupling may be transient. In other words, the coupling may be such that the polynucleotide may decouple from the membrane when interacting with the pore. For certain applications, such as aptamer detection and polynucleotide sequencing, the transient nature of the coupling is preferred. If a permanent or stable linker is attached directly to either the 5′ or 3′ end of a polynucleotide and the linker is shorter than the distance between the membrane and the transmembrane pore's channel, then some sequence data will be lost as the sequencing run cannot continue to the end of the polynucleotide. If the coupling is transient, then when the coupled end randomly becomes free of the membrane, the polynucleotide can be processed to completion. Chemical groups that form permanent/stable or transient links are discussed in more detail below. The polynucleotide may be transiently coupled to an amphiphilic layer or triblock copolymer membrane using cholesterol or a fatty acyl chain. Any fatty acyl chain having a length of from 6 to 30 carbon atoms, such as hexadecanoic acid, may be used.
- In preferred embodiments, a polynucleotide, such as a nucleic acid, is coupled to an amphiphilic layer such as a triblock copolymer membrane or lipid bilayer. Coupling of nucleic acids to synthetic lipid bilayers has been carried out previously with various different tethering strategies. These are summarised in Table 3 below.
-
TABLE 3
Anchor comprising | Type of coupling | Reference
Thiol | Stable | Yoshina-Ishii, C. and S. G. Boxer (2003). “Arrays of mobile tethered vesicles on supported lipid bilayers.” J Am Chem Soc 125(13): 3696-7.
Biotin | Stable | Nikolov, V., R. Lipowsky, et al. (2007). “Behavior of giant vesicles with anchored DNA molecules.” Biophys J 92(12): 4356-68
Cholesterol | Transient | Pfeiffer, I. and F. Hook (2004). “Bivalent cholesterol-based coupling of oligonucletides to lipid membrane assemblies.” J Am Chem Soc 126(33): 10224-5
Surfactant (e.g. Lipid, Palmitate, etc) | Stable | van Lengerich, B., R. J. Rawle, et al. “Covalent attachment of lipid vesicles to a fluid-supported bilayer allows observation of DNA-mediated vesicle interactions.” Langmuir 26(11): 8666-72
- Synthetic polynucleotides and/or linkers may be functionalised using a modified phosphoramidite in the synthesis reaction, which is readily compatible with the direct addition of suitable anchoring groups, such as cholesterol, tocopherol, palmitate, thiol, lipid and biotin groups. These different attachment chemistries give a suite of options for attachment to polynucleotides. Each different modification group couples the polynucleotide in a slightly different way, and coupling is not always permanent, thereby giving different dwell times of the polynucleotide on the membrane. The advantages of transient coupling are discussed above.
- Coupling of polynucleotides to a linker or to a functionalised membrane can also be achieved by a number of other means provided that a complementary reactive group or an anchoring group can be added to the polynucleotide. The addition of reactive groups to either end of a polynucleotide has been reported previously. A thiol group can be added to the 5′ end of ssDNA or dsDNA using T4 polynucleotide kinase and ATPγS (Grant, G. P. and P. Z. Qin (2007). “A facile method for attaching nitroxide spin labels at the 5′ terminus of nucleic acids.” Nucleic Acids Res 35(10): e77). An azide group can be added to the 5′-phosphate of ssDNA or dsDNA using T4 polynucleotide kinase and γ-[2-Azidoethyl]-ATP or γ-[6-Azidohexyl]-ATP. Using thiol or Click chemistry, a tether containing either a thiol, iodoacetamide, OPSS or maleimide group (reactive to thiols) or a DIBO (dibenzocyclooctyne) or alkyne group (reactive to azides) can be covalently attached to the polynucleotide. A more diverse selection of chemical groups, such as biotin, thiols and fluorophores, can be added using terminal transferase to incorporate modified oligonucleotides to the 3′ end of ssDNA (Kumar, A., P. Tchen, et al. (1988). “Nonradioactive labeling of synthetic oligonucleotide probes with terminal deoxynucleotidyl transferase.” Anal Biochem 169(2): 376-82). Streptavidin/biotin and/or streptavidin/desthiobiotin coupling may be used for any other polynucleotide. A polynucleotide can be coupled to a membrane using streptavidin/biotin and streptavidin/desthiobiotin. It may also be possible that anchors may be directly added to polynucleotides using terminal transferase with suitably modified nucleotides (e.g. cholesterol or palmitate).
- The one or more anchors preferably couple the polynucleotide to the membrane via hybridisation. The hybridisation may be present in any part of the one or more anchors, such as between the one or more anchors and the polynucleotide, within the one or more anchors or between the one or more anchors and the membrane. Hybridisation in the one or more anchors allows coupling in a transient manner as discussed above. For instance, a linker may comprise two or more polynucleotides, such as 3, 4 or 5 polynucleotides, hybridised together. The one or more anchors may hybridise to the polynucleotide. The one or more anchors may hybridise directly to the polynucleotide, directly to a Y adaptor and/or leader sequence attached to the polynucleotide or directly to a hairpin loop adaptor attached to the polynucleotide (as discussed in more detail below). Alternatively, the one or more anchors may be hybridised to one or more, such as 2 or 3, intermediate polynucleotides (or “splints”) which are hybridised to the polynucleotide, to a Y adaptor and/or leader sequence attached to the polynucleotide or to a hairpin loop adaptor attached to the polynucleotide (as discussed in more detail below).
- The one or more anchors may comprise a single stranded or double stranded polynucleotide. One part of the anchor may be ligated to a single stranded or double stranded polynucleotide analyte. Ligation of short pieces of ssDNA has been reported using T4 RNA ligase I (Troutt, A. B., M. G. McHeyzer-Williams, et al. (1992). “Ligation-anchored PCR: a simple amplification technique with single-sided specificity.” Proc Natl Acad Sci USA 89(20): 9823-5). Alternatively, either a single stranded or double stranded polynucleotide can be ligated to a double stranded polynucleotide and then the two strands separated by thermal or chemical denaturation. To a double stranded polynucleotide, it is possible to add either a piece of single stranded polynucleotide to one or both of the ends of the duplex, or a double stranded polynucleotide to one or both ends. Addition of single stranded polynucleotides to the double stranded polynucleotide can be achieved using T4 RNA ligase I as for ligation to other regions of single stranded polynucleotides. For addition of double stranded polynucleotides to a double stranded polynucleotide, ligation can be “blunt-ended”, with complementary 3′ dA/dT tails on the polynucleotide and added polynucleotide respectively (as is routinely done for many sample prep applications to prevent concatemer or dimer formation) or using “sticky-ends” generated by restriction digestion of the polynucleotide and ligation of compatible adapters. Then, when the duplex is melted, each single strand will have either a 5′ or 3′ modification if a single stranded polynucleotide was used for ligation or a modification at the 5′ end, the 3′ end or both if a double stranded polynucleotide was used for ligation.
- If the polynucleotide is a synthetic strand, the one or more anchors can be incorporated during the chemical synthesis of the polynucleotide. For instance, the polynucleotide can be synthesised using a primer having a reactive group attached to it.
- Adenylated polynucleotides are intermediates in ligation reactions, where an adenosine-monophosphate is attached to the 5′-phosphate of the polynucleotide. Various kits are available for generation of this intermediate, such as the 5′ DNA Adenylation Kit from NEB. By substituting ATP in the reaction for a modified nucleotide triphosphate, then addition of reactive groups (such as thiols, amines, biotin, azides, etc) to the 5′ of a polynucleotide can be possible. It may also be possible that anchors could be directly added to polynucleotides using a 5′ DNA adenylation kit with suitably modified nucleotides (e.g. cholesterol or palmitate).
- A common technique for the amplification of sections of genomic DNA is the polymerase chain reaction (PCR). Here, using two synthetic oligonucleotide primers, a number of copies of the same section of DNA can be generated, where for each copy the 5′ end of each strand in the duplex will be a synthetic polynucleotide. Single or multiple nucleotides can be added to the 3′ end of single or double stranded DNA by employing a polymerase. Examples of polymerases which could be used include, but are not limited to, Terminal Transferase, Klenow and E. coli Poly(A) polymerase. By substituting ATP in the reaction for a modified nucleotide triphosphate, anchors, such as cholesterol, thiol, amine, azide, biotin or lipid, can be incorporated into double stranded polynucleotides. Therefore, each copy of the amplified polynucleotide will contain an anchor.
- Ideally, the polynucleotide is coupled to the membrane without having to functionalise the polynucleotide. This can be achieved by coupling the one or more anchors, such as a polynucleotide binding protein or a chemical group, to the membrane and allowing the one or more anchors to interact with the polynucleotide or by functionalizing the membrane. The one or more anchors may be coupled to the membrane by any of the methods described herein. In particular, the one or more anchors may comprise one or more linkers, such as maleimide functionalised linkers.
- In this embodiment, the polynucleotide is typically RNA, DNA, PNA, TNA or LNA and may be double or single stranded. This embodiment is particularly suited to genomic DNA polynucleotides.
- The one or more anchors can comprise any group that couples to, binds to or interacts with single or double stranded polynucleotides, specific nucleotide sequences within the polynucleotide or patterns of modified nucleotides within the polynucleotide, or any other ligand that is present on the polynucleotide.
- Suitable binding proteins for use in anchors include, but are not limited to, E. coli single stranded binding protein, P5 single stranded binding protein, T4 gp32 single stranded binding protein, the TOPO V dsDNA binding region, human histone proteins, E. coli HU DNA binding protein and other archaeal, prokaryotic or eukaryotic single stranded or double stranded polynucleotide (or nucleic acid) binding proteins, including those listed below.
- The specific nucleotide sequences could be sequences recognised by transcription factors, ribosomes, endonucleases, topoisomerases or replication initiation factors. The patterns of modified nucleotides could be patterns of methylation or damage.
- The one or more anchors can comprise any group which couples to, binds to, intercalates with or interacts with a polynucleotide. The group may intercalate or interact with the polynucleotide via electrostatic, hydrogen bonding or Van der Waals interactions. Such groups include a lysine monomer, poly-lysine (which will interact with ssDNA or dsDNA), ethidium bromide (which will intercalate with dsDNA), universal bases or universal nucleotides (which can hybridise with any polynucleotide) and osmium complexes (which can react to methylated bases). A polynucleotide may therefore be coupled to the membrane using one or more universal nucleotides attached to the membrane. Each universal nucleotide may be coupled to the membrane using one or more linkers. The universal nucleotide preferably comprises one of the following nucleobases: hypoxanthine, 4-nitroindole, 5-nitroindole, 6-nitroindole, formylindole, 3-nitropyrrole, nitroimidazole, 4-nitropyrazole, 4-nitrobenzimidazole, 5-nitroindazole, 4-aminobenzimidazole or phenyl (C6-aromatic ring). The universal nucleotide more preferably comprises one of the following nucleosides: 2′-deoxyinosine, inosine, 7-deaza-2′-deoxyinosine, 7-deaza-inosine, 2-aza-deoxyinosine, 2-aza-inosine, 2-O′-methylinosine, 4-nitroindole 2′-deoxyribonucleoside, 4-nitroindole ribonucleoside, 5-nitroindole 2′-deoxyribonucleoside, 5-nitroindole ribonucleoside, 6-nitroindole 2′-deoxyribonucleoside, 6-nitroindole ribonucleoside, 3-nitropyrrole 2′-deoxyribonucleoside, 3-nitropyrrole ribonucleoside, an acyclic sugar analogue of hypoxanthine, nitroimidazole 2′-deoxyribonucleoside, nitroimidazole ribonucleoside, 4-nitropyrazole 2′-deoxyribonucleoside, 4-nitropyrazole ribonucleoside, 4-nitrobenzimidazole 2′-deoxyribonucleoside, 4-nitrobenzimidazole ribonucleoside, 5-nitroindazole 2′-deoxyribonucleoside, 5-nitroindazole ribonucleoside, 4-aminobenzimidazole 2′-deoxyribonucleoside, 4-aminobenzimidazole ribonucleoside, phenyl C-ribonucleoside, phenyl C-2′-deoxyribosyl nucleoside, 2′-deoxynebularine, 2′-deoxyisoguanosine, K-2′-deoxyribose, P-2′-deoxyribose and pyrrolidine. The universal nucleotide more preferably comprises 2′-deoxyinosine. The universal nucleotide is more preferably IMP or dIMP. The universal nucleotide is most preferably dPMP (2′-Deoxy-P-nucleoside monophosphate) or dKMP (N6-methoxy-2,6-diaminopurine monophosphate).
- The one or more anchors may couple to (or bind to) the polynucleotide via Hoogsteen hydrogen bonds (where two nucleobases are held together by hydrogen bonds) or reversed Hoogsteen hydrogen bonds (where one nucleobase is rotated through 180° with respect to the other nucleobase). For instance, the one or more anchors may comprise one or more nucleotides, one or more oligonucleotides or one or more polynucleotides which form Hoogsteen hydrogen bonds or reversed Hoogsteen hydrogen bonds with the polynucleotide. These types of hydrogen bonds allow a third polynucleotide strand to wind around a double stranded helix and form a triplex. The one or more anchors may couple to (or bind to) a double stranded polynucleotide by forming a triplex with the double stranded duplex.
- In this embodiment at least 1%, at least 10%, at least 25%, at least 50% or 100% of the membrane components may be functionalised.
- Where the one or more anchors comprise a protein, they may be able to anchor directly into the membrane without further functionalisation, for example if the protein already has an external hydrophobic region which is compatible with the membrane. Examples of such proteins include, but are not limited to, transmembrane proteins, intramembrane proteins and membrane proteins. Alternatively the protein may be expressed with a genetically fused hydrophobic region which is compatible with the membrane. Such hydrophobic protein regions are known in the art.
- The one or more anchors are preferably mixed with the polynucleotide before delivery to the membrane, but the one or more anchors may be contacted with the membrane and subsequently contacted with the polynucleotide.
- In another aspect the polynucleotide may be functionalised, using methods described above, so that it can be recognised by a specific binding group. Specifically the polynucleotide may be functionalised with a ligand such as biotin (for binding to streptavidin), amylose (for binding to maltose binding protein or a fusion protein), Ni-NTA (for binding to poly-histidine or poly-histidine tagged proteins) or peptides (such as an antigen).
- According to a preferred embodiment, the one or more anchors may be used to couple a polynucleotide to the membrane when the polynucleotide is attached to a leader sequence which preferentially threads into the pore. Leader sequences are discussed in more detail below. Preferably, the polynucleotide is attached (such as ligated) to a leader sequence which preferentially threads into the pore. Such a leader sequence may comprise a homopolymeric polynucleotide or an abasic region. The leader sequence is typically designed to hybridise to the one or more anchors either directly or via one or more intermediate polynucleotides (or splints). In such instances, the one or more anchors typically comprise a polynucleotide sequence which is complementary to a sequence in the leader sequence or a sequence in the one or more intermediate polynucleotides (or splints). In such instances, the one or more splints typically comprise a polynucleotide sequence which is complementary to a sequence in the leader sequence.
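- The requirement that an anchor (or splint) comprises a sequence complementary to a sequence in the leader can be illustrated with a short sketch. It assumes DNA written 5′ to 3′ using only A, C, G and T, and treats hybridisation as exact Watson-Crick complementarity over a window. The function names and the `min_length` threshold are illustrative assumptions, not values taken from the text.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def can_hybridise(anchor, target, min_length=8):
    """True if some stretch of `anchor` (at least min_length nt) is exactly
    complementary to a stretch of `target`.

    `min_length` is an illustrative threshold chosen for this sketch.
    """
    anchor = anchor.upper()
    target = target.upper()
    # Slide a fixed-size window along the anchor; any longer complementary
    # stretch necessarily contains a complementary window of this size.
    for start in range(len(anchor) - min_length + 1):
        window = anchor[start:start + min_length]
        if reverse_complement(window) in target:
            return True
    return False
```

- In this sketch an anchor designed as `reverse_complement(region)` for a region of a hypothetical leader returns True against that leader, while an unrelated sequence returns False. Real hybridisation also depends on temperature, salt and mismatches, which this exact-match check ignores.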
- Any of the methods discussed above for coupling polynucleotides to membranes, such as amphiphilic layers, can of course be applied to other polynucleotide and membrane combinations. In some embodiments, an amino acid, peptide, polypeptide or protein is coupled to an amphiphilic layer, such as a triblock copolymer layer or lipid bilayer. Various methodologies for the chemical attachment of such polynucleotides are available. An example of a molecule used in chemical attachment is EDC (1-ethyl-3-[3-dimethylaminopropyl]carbodiimide hydrochloride). Reactive groups can also be added to the 5′ of polynucleotides using commercially available kits (Thermo Pierce, Part No. 22980). Suitable methods include, but are not limited to, transient affinity attachment using histidine residues and Ni-NTA, as well as more robust covalent attachment by reactive cysteines, lysines or non natural amino acids.
- Any number of polynucleotides can be investigated. For instance, the method of the invention may concern characterising two or more polynucleotides, such as 3 or more, 4 or more, 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 20 or more, 30 or more, 50 or more, 100 or more, 500 or more, 1,000 or more, 5,000 or more, 10,000 or more, 100,000 or more, 1,000,000 or more or 5,000,000 or more, polynucleotides. The two or more polynucleotides may be delivered using the same microparticle or different microparticles.
- A microparticle is a microscopic particle whose size is typically measured in micrometres (μm). Microparticles may also be known as microspheres or microbeads. The microparticle may be a nanoparticle. A nanoparticle is a microscopic particle whose size is typically measured in nanometres (nm).
- A microparticle typically has a particle size of from about 0.001 μm to about 500 μm. For instance, a nanoparticle may have a particle size of from about 0.01 μm to about 200 μm or about 0.1 μm to about 100 μm. More often, a microparticle has a particle size of from about 0.5 μm to about 100 μm, or for instance from about 1 μm to about 50 μm. The microparticle may have a particle size of from about 1 nm to about 1000 nm, such as from about 10 nm to about 500 nm, about 20 nm to about 200 nm or from about 30 nm to about 100 nm.
- If two or more polynucleotides are characterised, they may be different from one another. The two or more polynucleotides may be two or more instances of the same polynucleotide. This allows proof reading.
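- One simple way to picture the proof reading mentioned above is a per-position majority vote across repeated reads of the same polynucleotide. The sketch below is only an illustration under strong assumptions: the reads are already aligned and of equal length, and the function name and the use of "N" for ties are choices made here, not part of the described method.

```python
from collections import Counter


def majority_consensus(reads):
    """Return the per-position majority base across equal-length, pre-aligned reads.

    Positions where the top two bases are tied are reported as 'N'.
    """
    if not reads:
        raise ValueError("at least one read is required")
    length = len(reads[0])
    if any(len(read) != length for read in reads):
        raise ValueError("reads must be pre-aligned to the same length")
    consensus = []
    for column in zip(*reads):
        counts = Counter(column).most_common()
        top_base, top_count = counts[0]
        tied = len(counts) > 1 and counts[1][1] == top_count
        consensus.append("N" if tied else top_base)
    return "".join(consensus)
```

- For three reads "ACGT", "ACGA" and "ACGT", the vote recovers "ACGT" because the single discordant base at the last position is outvoted.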
- The polynucleotides can be naturally occurring or artificial. For instance, the method may be used to verify the sequence of two or more manufactured oligonucleotides. The methods are typically carried out in vitro.
- The method may involve measuring two, three, four or five or more characteristics of each polynucleotide. The one or more characteristics are preferably selected from (i) the length of the polynucleotide, (ii) the identity of the polynucleotide, (iii) the sequence of the polynucleotide, (iv) the secondary structure of the polynucleotide and (v) whether or not the polynucleotide is modified. Any combination of (i) to (v) may be measured in accordance with the invention, such as {i}, {ii}, {iii}, {iv}, {v}, {i,ii}, {i,iii}, {i,iv}, {i,v}, {ii,iii}, {ii,iv}, {ii,v}, {iii,iv}, {iii,v}, {iv,v}, {i,ii,iii}, {i,ii,iv}, {i,ii,v}, {i,iii,iv}, {i,iii,v}, {i,iv,v}, {ii,iii,iv}, {ii,iii,v}, {ii,iv,v}, {iii,iv,v}, {i,ii,iii,iv}, {i,ii,iii,v}, {i,ii,iv,v}, {i,iii,iv,v}, {ii,iii,iv,v} or {i,ii,iii,iv,v}.
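- The combinations listed above are all the non-empty subsets of the five characteristics (i) to (v). A brief sketch that enumerates them, using hypothetical names, is:

```python
from itertools import combinations

CHARACTERISTICS = ("i", "ii", "iii", "iv", "v")


def characteristic_combinations(items=CHARACTERISTICS):
    """Yield every non-empty combination of the given characteristics, smallest first."""
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield combo


all_combos = list(characteristic_combinations())
```

- With five characteristics this yields 2^5 − 1 = 31 combinations, from ("i",) through to ("i", "ii", "iii", "iv", "v"), matching the number of sets written out in the text.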
- For (i), the length of the polynucleotide may be measured for example by determining the number of interactions between the polynucleotide and the pore or the duration of interaction between the polynucleotide and the pore.
- For (ii), the identity of the polynucleotide may be measured in a number of ways. The identity of the polynucleotide may be measured in conjunction with measurement of the sequence of the polynucleotide or without measurement of the sequence of the polynucleotide. The former is straightforward; the polynucleotide is sequenced and thereby identified. The latter may be done in several ways. For instance, the presence of a particular motif in the polynucleotide may be measured (without measuring the remaining sequence of the polynucleotide). Alternatively, the measurement of a particular electrical and/or optical signal in the method may identify the polynucleotide as coming from a particular source.
- For (iii), the sequence of the polynucleotide can be determined as described previously. Suitable sequencing methods, particularly those using electrical measurements, are described in Stoddart D et al., Proc Natl Acad Sci, 12; 106(19):7702-7, Lieberman K R et al, J Am Chem Soc. 2010; 132(50):17961-72, and International Application WO 2000/28312.
- For (iv), the secondary structure may be measured in a variety of ways. For instance, if the method involves an electrical measurement, the secondary structure may be measured using a change in dwell time or a change in current flowing through the pore. This allows regions of single-stranded and double-stranded polynucleotide to be distinguished.
- For (v), the presence or absence of any modification may be measured. The method preferably comprises determining whether or not the polynucleotide is modified by methylation, by oxidation, by damage, with one or more proteins or with one or more labels, tags or spacers. Specific modifications will result in specific interactions with the pore which can be measured using the methods described below. For instance, methylcytosine may be distinguished from cytosine on the basis of the current flowing through the pore during its interaction with each nucleotide.
- The methods may be carried out using any apparatus that is suitable for investigating a membrane/pore system in which a pore is present in a membrane. The method may be carried out using any apparatus that is suitable for transmembrane pore sensing. For example, the apparatus comprises a chamber comprising an aqueous solution and a barrier that separates the chamber into two sections. The barrier typically has an aperture in which the membrane containing the pore is formed. Alternatively the barrier forms the membrane in which the pore is present.
- The methods may be carried out using the apparatus described in International Application No. PCT/GB08/000562 (WO 2008/102120).
- A variety of different types of measurements may be made. This includes without limitation: electrical measurements and optical measurements. A suitable optical method involving the measurement of fluorescence is disclosed by J. Am. Chem. Soc. 2009, 131 1652-1653. Possible electrical measurements include: current measurements, impedance measurements, tunnelling measurements (Ivanov A P et al., Nano Lett. 2011 Jan. 12; 11(1):279-85), and FET measurements (International Application WO 2005/124888). Optical measurements may be combined with electrical measurements (Soni G V et al., Rev Sci Instrum. 2010 January; 81(1):014301). The measurement may be a transmembrane current measurement such as measurement of ionic current flowing through the pore.
- Electrical measurements may be made using standard single channel recording equipment as described in Stoddart D et al., Proc Natl Acad Sci, 12; 106(19):7702-7, Lieberman K R et al, J Am Chem Soc. 2010; 132(50):17961-72, and International Application WO 2000/28312. Alternatively, electrical measurements may be made using a multi-channel system, for example as described in International Application WO 2009/077734 and International Application WO 2011/067559.
- The method is preferably carried out with a potential applied across the membrane. The applied potential may be a voltage potential. Alternatively, the applied potential may be a chemical potential. An example of this is using a salt gradient across a membrane, such as an amphiphilic layer. A salt gradient is disclosed in Holden et al., J Am Chem Soc. 2007 Jul. 11; 129(27):8650-5. In some instances, the current passing through the pore as a polynucleotide moves with respect to the pore is used to estimate or determine the sequence of the polynucleotide. This is strand sequencing.
- The methods may involve measuring the current passing through the pore as the polynucleotide moves with respect to the pore. Therefore the apparatus may also comprise an electrical circuit capable of applying a potential and measuring an electrical signal across the membrane and pore. The methods may be carried out using a patch clamp or a voltage clamp. The methods preferably involve the use of a voltage clamp.
- In a preferred embodiment, the method comprises:
-
- (a) contacting the/each modified polynucleotide with a transmembrane pore such that at least one strand of the/each polynucleotide moves through the pore; and
- (b) measuring the current passing through the pore as at least one strand of the/each polynucleotide moves with respect to the pore wherein the current is indicative of one or more characteristics of the at least one strand of the/each polynucleotide and thereby characterising the modified/template polynucleotide.
- The methods of the invention may involve the measuring of a current passing through the pore as the polynucleotide moves with respect to the pore. Suitable conditions for measuring ionic currents through transmembrane protein pores are known in the art and disclosed in the Example. The method is typically carried out with a voltage applied across the membrane and pore. The voltage used is typically from +5 V to −5 V, such as from +4 V to −4 V, +3 V to −3 V or +2 V to −2 V. The voltage used is typically from −600 mV to +600 mV or −400 mV to +400 mV. The voltage used is preferably in a range having a lower limit selected from −400 mV, −300 mV, −200 mV, −150 mV, −100 mV, −50 mV, −20 mV and 0 mV and an upper limit independently selected from +10 mV, +20 mV, +50 mV, +100 mV, +150 mV, +200 mV, +300 mV and +400 mV. The voltage used is more preferably in the
range 100 mV to 240 mV and most preferably in the range of 120 mV to 220 mV. It is possible to increase discrimination between different nucleotides by a pore by using an increased applied potential.
- The methods are typically carried out in the presence of any charge carriers, such as metal salts, for example alkali metal salt, halide salts, for example chloride salts, such as alkali metal chloride salt. Charge carriers may include ionic liquids or organic salts, for example tetramethyl ammonium chloride, trimethylphenyl ammonium chloride, phenyltrimethyl ammonium chloride, or 1-ethyl-3-methyl imidazolium chloride. In the exemplary apparatus discussed above, the salt is present in the aqueous solution in the chamber. Potassium chloride (KCl), sodium chloride (NaCl), caesium chloride (CsCl) or a mixture of potassium ferrocyanide and potassium ferricyanide is typically used. KCl, NaCl and a mixture of potassium ferrocyanide and potassium ferricyanide are preferred. The charge carriers may be asymmetric across the membrane. For instance, the type and/or concentration of the charge carriers may be different on each side of the membrane.
- The salt concentration may be at saturation. The salt concentration may be 3 M or lower and is typically from 0.1 to 2.5 M, from 0.3 to 1.9 M, from 0.5 to 1.8 M, from 0.7 to 1.7 M, from 0.9 to 1.6 M or from 1 M to 1.4 M. The salt concentration is preferably from 150 mM to 1 M. The method is preferably carried out using a salt concentration of at least 0.3 M, such as at least 0.4 M, at least 0.5 M, at least 0.6 M, at least 0.8 M, at least 1.0 M, at least 1.5 M, at least 2.0 M, at least 2.5 M or at least 3.0 M. High salt concentrations provide a high signal to noise ratio and allow for currents indicative of the presence of a nucleotide to be identified against the background of normal current fluctuations.
- The methods are typically carried out in the presence of a buffer. In the exemplary apparatus discussed above, the buffer is present in the aqueous solution in the chamber. Any buffer may be used in the method of the invention. Typically, the buffer is phosphate buffer. Other suitable buffers are HEPES and Tris-HCl buffer. The methods are typically carried out at a pH of from 4.0 to 12.0, from 4.5 to 10.0, from 5.0 to 9.0, from 5.5 to 8.8, from 6.0 to 8.7 or from 7.0 to 8.8 or 7.5 to 8.5. The pH used is preferably about 7.5.
- The methods may be carried out at from 0° C. to 100° C., from 15° C. to 95° C., from 16° C. to 90° C., from 17° C. to 85° C., from 18° C. to 80° C., 19° C. to 70° C., or from 20° C. to 60° C. The methods are typically carried out at room temperature. The methods are optionally carried out at a temperature that supports enzyme function, such as about 37° C.
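The operating ranges listed above can be collected into a simple check. The following is a minimal sketch only: the `RunConditions` container and the function names are hypothetical, the choice of which of the listed ranges to test is arbitrary, and comparing the voltage by magnitude (ignoring polarity) is an assumption rather than something the text states.

```python
from dataclasses import dataclass

# Ranges transcribed from the description. Which of the many listed ranges
# to check is a choice made for this sketch, not a requirement of the method.
MOST_PREFERRED_VOLTAGE_MV = (120.0, 220.0)  # "most preferably ... 120 mV to 220 mV"
PREFERRED_SALT_M = (0.15, 1.0)              # "preferably from 150 mM to 1 M"
WIDEST_PH = (4.0, 12.0)                     # widest pH range listed
WIDEST_TEMPERATURE_C = (0.0, 100.0)         # widest temperature range listed


@dataclass
class RunConditions:
    """Hypothetical container for one set of measurement conditions."""
    voltage_mv: float
    salt_m: float
    ph: float
    temperature_c: float


def in_range(value, bounds):
    """Return True if value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def check_conditions(c):
    """Report, per parameter, whether the conditions fall in the chosen range."""
    return {
        # Assumption: the voltage range is treated as a magnitude.
        "voltage": in_range(abs(c.voltage_mv), MOST_PREFERRED_VOLTAGE_MV),
        "salt": in_range(c.salt_m, PREFERRED_SALT_M),
        "ph": in_range(c.ph, WIDEST_PH),
        "temperature": in_range(c.temperature_c, WIDEST_TEMPERATURE_C),
    }


# Values of -140 mV, 500 mM KCl and pH 8.0 appear in the Example below;
# 20 degrees C stands in for "room temperature" and is an assumption.
example_run = RunConditions(voltage_mv=-140.0, salt_m=0.5, ph=8.0, temperature_c=20.0)
report = check_conditions(example_run)
```

Under these assumptions every entry of `report` is `True` for the example run, whereas a run at 600 mV and 3 M salt would fail the voltage and salt checks while still lying inside the broader ranges the text also permits.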
- The present invention also provides a kit for modifying a template polynucleotide. The kit comprises (a) a population of MuA substrates as defined above, (b) a MuA transposase and (c) a translocase. Any of the embodiments discussed above with reference to the methods and products of the invention equally apply to the kits.
- The kit may further comprise the components of a membrane, such as the components of an amphiphilic layer or a lipid bilayer. The kit may further comprise the components of a transmembrane pore. The kit may further comprise a molecular brake. Suitable membranes, pores and molecular brakes are discussed above.
- The kit may further comprise a Y adaptor comprising a leader sequence and/or one or more anchors capable of coupling the adaptor to a membrane. Suitable Y adaptors, leader sequences and anchors are discussed above.
- The kit of the invention may additionally comprise one or more other reagents or instruments which enable any of the embodiments mentioned above to be carried out. Such reagents or instruments include one or more of the following: suitable buffer(s) (aqueous solutions), means to obtain a sample from a subject (such as a vessel or an instrument comprising a needle), means to amplify and/or express polynucleotides, a membrane as defined above or voltage or patch clamp apparatus. Reagents may be present in the kit in a dry state such that a fluid sample resuspends the reagents. The kit may also, optionally, comprise instructions to enable the kit to be used in the method of the invention or details regarding which patients the method may be used for. The kit may, optionally, comprise nucleotides.
- The following Example illustrates the invention.
- MuA binds to the transposon as a tetramer and is extremely stable; remaining tightly bound after strand transfer of the transposon. If the MuA is not removed from the DNA, this can inhibit characterisation using a nanopore system. MuA can be removed by heating to 75° C. However, this relies on the use of a thermal cycler or water bath and could damage other components in the solution. Here we describe an alternative technique for removing MuA without needing to heat the reaction, using a helicase. Hel308Mbu-E284C/S615C-STrEP(C) (SEQ ID NO: with mutations E284C/S615C with a streptavidin tag attached at its C terminus) is a processive helicase which binds to single stranded DNA and moves in a 3′ to 5′ direction. When the transposon has a 3′ overhang on the bottom strand, Hel308Mbu-E284C/S615C-STrEP(C) (SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus) can bind and, upon moving along the DNA, force the MuA complex to dissociate from the DNA.
- Hel308Mbu-E284C/S615C-STrEP(C) (20 uM, SEQ ID NO: 10 with mutations E284C/S615C with a streptavidin tag attached at its C terminus) was reduced using 10 mM DTT in a 2 ml protein low bind Eppendorf and rotated on a Hula shaker (ThermoFisher Scientific) for 1 h, at 10 rpm with no vibration. The enzyme was then buffer exchanged into 100 mM sodium phosphate, 500 mM NaCl, 5 mM EDTA and 0.1% Tween-20 pH8.0, using Zeba spin desalting columns 7K MWCO, 0.5 ml (ThermoFisher Scientific) according to the manufacturer's protocol. The sample was diluted to 10 uM and 50 uM 1,11-bis(maleimido)triethylene glycol was added. The sample was then rotated on a Hula shaker for a further 2 hours. This resulted in a closed complex helicase which was able to load onto DNA at the 3′ end.
- The sequence for the transposon top strand was (SEQ ID NO: 115). This was annealed with either SEQ ID NO: 116 to form transposon 1 or annealed with SEQ ID NO: 117 to form transposon 2 which has a 3′ overhang on the bottom strand.
- The transposon top strand was also annealed with the transposon leader (30 iSpC3 spacers attached at the 3′ end to the 5′ of SEQ ID NO: 118, which is attached at its 3′ end to the 5′ end of four iSp18 spacers which are attached at the 3′ end to the 5′ end of SEQ ID NO: 119).
- Transposons (10 uM) were annealed in 50 mM NaCl, 10 mM Tris·HCl pH8.0. The transposon sequences were heated to 95° C. for 2 minutes and then slow cooled (6 seconds for every 0.1° C. decrease) to 4° C.
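For planning purposes, the length of the slow-cool step above follows directly from the stated ramp (6 seconds per 0.1° C. from 95° C. to 4° C.). The short sketch below works through that arithmetic; the function name and its parameters are illustrative and not taken from the source.

```python
def slow_cool_duration_s(start_c, end_c, step_c=0.1, seconds_per_step=6.0):
    """Total time in seconds for a stepwise cool from start_c down to end_c."""
    # round() guards against floating-point error in the step count.
    steps = round((start_c - end_c) / step_c)
    return steps * seconds_per_step


# 95 C to 4 C in 0.1 C steps is 910 steps of 6 s each.
ramp_seconds = slow_cool_duration_s(95.0, 4.0)
ramp_minutes = ramp_seconds / 60  # 91.0 minutes
```

The ramp therefore takes about 91 minutes, in addition to the initial 2 minute hold at 95° C.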
- Transposon 1, transposon 2 and leader transposon were each mixed to 2 uM in 40 ul, with concentrated MuA transposase (20 ul, 1.1 mg/ml, ThermoFisher Scientific) in 25 mM Tris·HCl pH8, 110 mM NaCl, 0.5 mM EDTA, 10% glycerol and 0.05% Triton-X100. These were then incubated at 30° C. for 90 minutes to form transpososome 1, transpososome 2 and leader transpososome respectively, at 2 uM.
- Transpososome 1 and transpososome 2 were each mixed to 50 nM with 1.5 ug of PhiX174 RFI DNA (New England Biolabs) in 25 mM Tris·HCl pH8, 110 mM NaCl and 10 mM MgCl2 in a 30 ul reaction in a 0.2 ml PCR tube. Each reaction was incubated at room temperature for 2 minutes before being split into 3 tubes of 10 ul each. One tube of each transpososome was incubated at 75° C. for 5 minutes, and one tube of each transpososome was left at room temperature for 5 minutes with nothing added. Hel308Mbu-E284C/S615C-STrEP(C) (1 uM) was added to the final tubes along with 10 mM of ATP (Sigma-Aldrich) and incubated at room temperature for 5 minutes. 1 ul of each reaction was then analysed on the Agilent 2100 Bioanalyser 12,000 bp setting, along with 1 ul of unmodified PhiX.
- A 60 ul sample was made with 1.5 ug of lambda DNA (New England Biolabs) and 120 nM of leader transpososome in 25 mM Tris·HCl pH8, 110 mM NaCl and 10 mM MgCl2 and the sample mixed by inversion. The sample was incubated at room temperature for 10 minutes. The sample was then split into 3 sets of 20 ul reactions. nH2O (4 ul, ThermoFisher Scientific) was added to sample 1 and the sample was heated at 75° C. for 10 minutes. Hel308Mbu-E284C/S615C-STrEP(C) (2 ul, 10 uM) and ATP (2 ul, 100 mM, Sigma-Aldrich) were added to sample 2 and it was incubated at room temperature for 10 minutes. nH2O (4 ul, ThermoFisher Scientific) was added to sample 3 and the sample was incubated at room temperature for 10 minutes. Agencourt AMPure XP SPRI beads (24 ul) were added to each sample (1-3) and the samples were incubated at room temperature for 5 minutes. The samples were then transferred to a magnetic rack and incubated for 2 minutes at room temperature. The supernatant was then removed and discarded from each sample. Wash buffer (50 ul, 750 mM NaCl, 10% PEG8000 and 50 mM Tris·HCl pH8) was added to each sample. The wash buffer was then removed and discarded from each sample. Buffer 1 (6 ul, 10 mM Tris·HCl, 20 mM NaCl) was then added to each sample and each sample was then mixed in order to resuspend the beads. Each sample was then spun down and returned to the magnetic rack. 6 ul of each sample was then removed and 1.5 ul of buffer 2 (1 uM of SEQ ID NO: 20 (which has 6 iSp18 spacers attached at its 3′ end), 750 mM KCl, 5 mM EDTA, 125 mM Kpi pH8) was added to each sample. The samples were then incubated at room temperature for 10 minutes. T4 Dda-(E94C/C109A/C136A/A360C) (SEQ ID NO: 97 with mutations E94C/C109A/C136A/A360C and then (ΔM1)G1G2 (where (ΔM1)G1G2=deletion of M1 and then addition of G1 and G2); 1.25 ul, 5 uM, in 25 mM potassium phosphate, 150 mM KCl, 5% glycerol, 1 mM EDTA, pH7) was then added to each sample and then each sample was incubated at room temperature for 5 minutes. Buffer (1.25 ul, 800 uM TMAD) was then added to each sample and then each was incubated at room temperature for 5 minutes. Finally, 6 ul of fuel mix (75 mM ATP, 75 mM MgCl2) and 284 ul of buffer (25 mM potassium phosphate, 500 mM potassium chloride, pH8) were added to each sample.
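The working concentrations in the helicase-treated sample are not stated explicitly, but they can be estimated from the volumes given above using C1·V1 = C2·V2. The sketch below is illustrative only: it assumes the additions are made to a 20 ul aliquot and that volumes are simply additive, and the function name is hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution by C1*V1 = C2*V2; the result is in the units of stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


# Sample 2: 20 ul aliquot + 2 ul helicase (10 uM) + 2 ul ATP (100 mM).
total_ul = 20.0 + 2.0 + 2.0
hel308_um = final_concentration(10.0, 2.0, total_ul)  # roughly 0.83 uM
atp_mm = final_concentration(100.0, 2.0, total_ul)    # roughly 8.3 mM

# 24 ul of SPRI beads added to a 24 ul sample is a 1:1 bead-to-sample ratio.
bead_ratio = 24.0 / total_ul
```

Samples 1 and 3 reach the same 24 ul volume through the 4 ul water addition, so the bead-to-sample ratio is the same for all three samples under these assumptions.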
- Electrical measurements were acquired from single MspA nanopores inserted in block co-polymer in buffer (25 mM K Phosphate buffer, 150 mM Potassium Ferrocyanide (II), 150 mM Potassium Ferricyanide (III), pH 8.0). After achieving a single pore inserted in the block co-polymer, then buffer (2 mL, 25 mM K Phosphate buffer, 150 mM Potassium Ferrocyanide (II), 150 mM Potassium Ferricyanide (III), pH 8.0) was flowed through the system to remove any excess MspA nanopores. 150 uL of 500 mM KCl, 25 mM K Phosphate, pH8.0 was then flowed through the system. After 10 minutes a further 150 uL of the sample described above was then flowed into the single nanopore experimental system. The experiment was run at −140 mV and helicase-controlled DNA movement was monitored.
- When the MuA transposase is not removed from transpososome 1 (FIG. 1, line labelled 1) or transpososome 2 (FIG. 1, line labelled 2), e.g. the control where both transpososomes are incubated at room temperature (sample 3), no peak was seen on the trace between the upper marker (labelled Y) and the lower marker (labelled X). This was because the MuA was still bound to the DNA, which prevented both transpososomes (1 and 2) from moving into the gel matrix of the Agilent 2100 Bioanalyser system.
- When the sample was heated to 75° C. for 10 minutes, a peak can be seen for both transpososomes (FIG. 2 (transpososome 1) and FIG. 3 (transpososome 2)) between the upper (Y) and lower (X) markers. This represents linearised PhiX with no MuA transposase bound to it.
- Treatment with Hel308Mbu-E284C/S615C-STrEP(C) does not result in a PhiX peak for transpososome 1, as there was no 3′ overhang for the enzyme to load onto, so the MuA remained bound (see FIG. 4). For transpososome 2, a PhiX peak was seen after addition of Hel308Mbu-E284C/S615C-STrEP(C) because transpososome 2 had a 3′ overhang for the enzyme to load onto (see FIG. 5). This indicated that Hel308 was able to successfully remove MuA transposase from transposons.
- FIG. 6 shows transpososome 2 after treatment with Hel308Mbu-E284C/S615C-STrEP(C) and heat treatment. The two PhiX peaks are of a similar height, indicating that Hel308 was just as efficient as heat at removing MuA transposase.
- Electrophysiology experiments were carried out as described above and the throughput of the experiments was compared (kilobases per nanopore per hour) for sample 3 (incubation at room temp in absence of Hel308Mbu-E284C/S615C-STrEP(C)), sample 2 (incubation at 75° C. for 10 minutes) and sample 1 (incubation at room temperature with Hel308Mbu-E284C/S615C-STrEP(C) using transpososome with 3′ overhang). FIG. 11 shows a graph of throughput for samples 1-3. Sample 3 shows a throughput of around 20 kb/nanopore/hr, which is significantly lower than that of samples 1 and 2, which produce much higher throughput values of around 80 kb/nanopore/hr and 85 kb/nanopore/hr. This shows that removal of MuA transposase using Hel308Mbu-E284C/S615C-STrEP(C) was as efficient as heat treatment. Removal of MuA transposase using Hel308Mbu-E284C/S615C-STrEP(C) resulted in improved characterisation using a nanopore system.
- This example describes using a number of different translocases to remove MuA transposase.
- A MuA adapter consisting of SEQ ID NO: 117 and 121 was annealed at 10 uM in 10 mM Tris-HCl (pH 7.5), 50 mM NaCl, by cooling from 95° C. to 22° C. at 2° C. per minute. This adapter contained the minimal MuA recognition sequence, with the pre-formed 5′ bottom strand flap, as well as a 12 nt 5′ tail on the top strand and a 10 nt 3′ tail on the bottom strand.
- A transposome complex was formed by addition of 1 ul of the MuA adapter, 4.5 ul of nuclease free water, 2 ul of 5× transposome buffer (125 mM Tris pH 8, 550 mM NaCl, 2.5 mM EDTA, 50% glycerol, 0.25% Triton-X100) and 2.5 ul of concentrated MuA transposase (Thermofisher). The mixture was then incubated at 30° C. for 1.5 hours.
- A transposition reaction, containing 10 ul of 5× transposase buffer (125 mM Tris pH 8, 550 mM NaCl, 50 mM MgCl2), 5 ul transposome, 2.5 ug PhiX RFI (NEB) and nuclease free water to 50 ul, was then carried out at room temperature for 10 minutes. After 10 mins 6.25 ul of 100 mM rATP was added and the reaction was split into 5×11.25 ul. To samples (i) and (ii) 1.25 ul of nuclease free water was added; to sample (iii) 1.25 ul of Hel308Mbu-E284C-STrEP(C) (SEQ ID NO: 10 with mutation E284C with a streptavidin tag attached at its C terminus) was added; to sample (iv) 1.25 ul of T4 Dda-(E94C/F98W/C109A/C136A/A360C) (SEQ ID NO: 97 with mutations E94C/F98W/C109A/C136A/A360C and then (ΔM1)G1G2 (where (ΔM1)G1G2=deletion of M1 and then addition of G1 and G2)) was added; to sample (v) 1.25 ul of UvrD Eco-(E117C/M380C)-STrEP (SEQ ID NO: 122 with mutations E177C/M380C with a streptavidin tag attached at the C terminus) was added. Samples (i), (iii), (iv) and (v) were then left at room temperature for 10 mins while sample (ii) was left at 75° C. for 10 mins. All samples were then loaded onto a 12000 Agilent DNA chip to look for tagmentation products.
- FIGS. 7 to 10 show a number of Agilent traces for samples (i)-(v). Sample (i) was a control where no translocase was added and the sample was not heated. FIGS. 7 to 10 all include this control. No tagmentation peak was observed for this sample because the MuA was still bound to the DNA, which prevented the transpososome from moving into the gel matrix of the Agilent 2100 Bioanalyser system. FIG. 7 also shows sample (ii) (line 2), which shows a clear tagmentation peak when the sample was heated to 75° C. in order to remove the MuA transposase.
- FIG. 8 shows sample (iii, line 3) and the control sample (i, line 1). Sample (iii) shows a clear tagmentation peak when the sample was incubated with Hel308Mbu-E284C-STrEP(C) in order to remove the MuA transposase. This indicated that Hel308Mbu-E284C-STrEP(C) was able to successfully remove MuA transposase from transposons.
- FIG. 9 shows sample (iv, line 4) and the control sample (i, line 1). Sample (iv) shows a clear tagmentation peak when the sample was incubated with T4 Dda-(E94C/F98W/C109A/C136A/A360C) in order to remove the MuA transposase. This indicated that T4 Dda-(E94C/F98W/C109A/C136A/A360C) was able to successfully remove MuA transposase from transposons.
- FIG. 10 shows sample (v, line 5) and the control sample (i, line 1). Sample (v) shows a clear tagmentation peak when the sample was incubated with UvrD Eco-(E117C/M380C)-STrEP in order to remove the MuA transposase. This indicated that UvrD Eco-(E117C/M380C)-STrEP was able to successfully remove MuA transposase from transposons.
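The volumes given for the transposition reaction in this example can be checked for internal consistency, and the rATP concentration seen by each translocase can be estimated from them. The following is a sketch under the assumption that volumes are additive; the variable names are illustrative and not from the source.

```python
# Transposition reaction: made up to 50 ul, then 6.25 ul of 100 mM rATP added.
reaction_ul = 50.0
atp_stock_mm = 100.0
atp_added_ul = 6.25

pooled_ul = reaction_ul + atp_added_ul               # 56.25 ul in total
atp_pooled_mm = atp_stock_mm * atp_added_ul / pooled_ul

# The pooled reaction is split into 11.25 ul aliquots.
aliquot_ul = 11.25
n_aliquots = pooled_ul / aliquot_ul                  # five aliquots

# Each aliquot then receives 1.25 ul of water or translocase.
final_ul = aliquot_ul + 1.25                         # 12.5 ul per sample
atp_final_mm = atp_pooled_mm * aliquot_ul / final_ul  # approximately 10 mM
```

On these assumptions the five 11.25 ul aliquots account for the full 56.25 ul, and each sample contains roughly 10 mM rATP once the 1.25 ul addition has been made.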
Claims (25)
1. A method for modifying a template double stranded polynucleotide, comprising:
(a) contacting the template polynucleotide with a MuA transposase and a population of double stranded MuA substrates each comprising an overhang at one or both ends of one strand such that the transposase fragments the template polynucleotide and ligates a substrate to one or both ends of the double stranded fragments and thereby producing a plurality of fragment/substrate constructs; and
(b) using a translocase to remove the MuA transposases from the constructs and thereby producing a plurality of modified double stranded polynucleotides.
2. A method according to claim 1 , wherein the translocase is contacted with the constructs after they are created by the MuA transposase.
3. A method according to claim 1 , wherein the translocase is bound to the substrates before the substrates are contacted with the template polynucleotide.
4. A method according to claim 1 , wherein the translocase is a helicase.
5. A method according to claim 4 , wherein the helicase is from superfamily 1 or superfamily 2; optionally wherein the helicase is a member of one of the following families: Pif1-like, Upf1-like, UvrD/Rep, Ski-like, Rad3/XPD, NS3/NPH-II, DEAD, DEAH/RHA, RecG-like, RecQ-like, T1R-like, Swi/Snf-like and Rig-I-like; or wherein the helicase is a UvrD helicase, a Hel308 helicase, a TraI helicase, a TraI subgroup helicase, an XPD helicase or a Dda helicase.
6-7. (canceled)
8. A method according to claim 4 , wherein the helicase is a Hel308 helicase; optionally wherein the Hel308 helicase is Hel308 Mbu (E284C/S615C)-bismaleimidePEG11 (SEQ ID NO: 10 with mutations E284C/S615C connected by a bismaleimidePEG11 linker).
9. (canceled)
10. A method according to claim 1 , wherein the translocase is a strippase; optionally wherein the strippase is the INO80 chromatin remodeling complex or a FtsK/SpoIIIE transporter.
11. (canceled)
12. A method according to claim 1 , wherein the two strands of each construct are linked at one end by a hairpin loop.
13. A method according to claim 1 , wherein the method further comprises attaching molecular brakes to the other strands in the substrates.
14. A method according to claim 13 , wherein the molecular brakes are attached to the other strands in the substrates before they are contacted with the template polynucleotide and the MuA transposase.
15. A method according to claim 13 , wherein the molecular brakes are attached to the other strands from the substrates remaining in the constructs after they are created by the MuA transposase.
16. A method according to claim 13 , wherein the molecular brakes are bound to Y adaptors comprising a leader sequence and/or one or more anchors capable of coupling the adaptor to a membrane and the Y adaptors are attached to the other strands in step (c).
17. A method according to claim 13 , wherein the molecular brakes are derived from a polymerase, a helicase or an exonuclease.
18. (canceled)
19. A population of double stranded MuA substrates for modifying a template polynucleotide, wherein each substrate comprises an overhang at one or both ends of one strand and a translocase bound to an overhang.
20. A plurality of polynucleotides modified using a method according to claim 1 .
21. A method of characterising at least one polynucleotide modified using a method according to claim 1 , comprising:
a) contacting the modified polynucleotide with a transmembrane pore such that at least one strand of the polynucleotide moves through the pore; and
b) taking one or more measurements which are indicative of one or more characteristics of the polynucleotide as the at least one strand moves with respect to the pore and thereby characterising the modified polynucleotide.
22. A method of characterising a template polynucleotide, comprising:
a) modifying the template polynucleotide using a method according to claim 1 to produce a plurality of modified polynucleotides;
b) contacting each modified polynucleotide with a transmembrane pore such that at least one strand of each polynucleotide moves through the pore; and
c) taking one or more measurements which are indicative of one or more characteristics of the polynucleotide as the at least one strand of each polynucleotide moves with respect to the pore and thereby characterising the template polynucleotide.
23. A method according to claim 21 , wherein the one or more characteristics are selected from (i) the length of the polynucleotide, (ii) the identity of the polynucleotide, (iii) the sequence of the polynucleotide, (iv) the secondary structure of the polynucleotide and (v) whether or not the polynucleotide is modified.
24. A method according to claim 21 , wherein the method comprises measuring the current passing through the pore as the at least one strand or each polynucleotide moves with respect to the pore.
25. A kit for modifying a template polynucleotide comprising (a) a population of MuA substrates as defined in claim 1 , (b) a MuA transposase and (c) a translocase; optionally wherein the kit further comprises a polynucleotide protein and/or a Y adaptor comprising a leader sequence and/or one or more anchors capable of coupling the adaptor to a membrane.
26. (canceled)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/194,062 US20230374567A1 (en) | 2016-05-25 | 2023-03-31 | Method for modifying a template double stranded polynucleotide |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1609220.7 | 2016-05-25 | ||
GBGB1609220.7A GB201609220D0 (en) | 2016-05-25 | 2016-05-25 | Method |
PCT/GB2017/051490 WO2017203267A1 (en) | 2016-05-25 | 2017-05-25 | Method |
US201816304114A | 2018-11-21 | 2018-11-21 | |
US18/194,062 US20230374567A1 (en) | 2016-05-25 | 2023-03-31 | Method for modifying a template double stranded polynucleotide |
Related Parent Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/304,114 Continuation US11649480B2 (en) | 2016-05-25 | 2017-05-25 | Method for modifying a template double stranded polynucleotide |
PCT/GB2017/051490 Continuation WO2017203267A1 (en) | 2016-05-25 | 2017-05-25 | Method |
Publications (1)
Publication Number | Publication Date |
---|---|
US20230374567A1 (en) | 2023-11-23 |
Family
ID=56369957
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/304,114 Active 2040-06-17 US11649480B2 (en) | 2016-05-25 | 2017-05-25 | Method for modifying a template double stranded polynucleotide |
US18/194,062 Pending US20230374567A1 (en) | 2016-05-25 | 2023-03-31 | Method for modifying a template double stranded polynucleotide |
Country Status (5)
Country | Link |
---|---|
US (2) | US11649480B2 (en) |
EP (1) | EP3464615B1 (en) |
CN (1) | CN109219665B (en) |
GB (1) | GB201609220D0 (en) |
WO (1) | WO2017203267A1 (en) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010086622A1 (en) | 2009-01-30 | 2010-08-05 | Oxford Nanopore Technologies Limited | Adaptors for nucleic acid constructs in transmembrane sequencing |
IN2014DN00221A (en) | 2011-07-25 | 2015-06-05 | Oxford Nanopore Tech Ltd | |
WO2014013259A1 (en) | 2012-07-19 | 2014-01-23 | Oxford Nanopore Technologies Limited | Ssb method |
GB201314695D0 (en) | 2013-08-16 | 2013-10-02 | Oxford Nanopore Tech Ltd | Method |
KR102168813B1 (en) | 2013-03-08 | 2020-10-22 | 옥스포드 나노포어 테크놀로지즈 리미티드 | Enzyme stalling method |
GB201403096D0 (en) | 2014-02-21 | 2014-04-09 | Oxford Nanopore Tech Ltd | Sample preparation method |
GB201418159D0 (en) | 2014-10-14 | 2014-11-26 | Oxford Nanopore Tech Ltd | Method |
GB201609220D0 (en) | 2016-05-25 | 2016-07-06 | Oxford Nanopore Tech Ltd | Method |
CA3044782A1 (en) | 2017-12-29 | 2019-06-29 | Clear Labs, Inc. | Automated priming and library loading device |
GB201807793D0 (en) | 2018-05-14 | 2018-06-27 | Oxford Nanopore Tech Ltd | Method |
WO2021253410A1 (en) * | 2020-06-19 | 2021-12-23 | 北京齐碳科技有限公司 | Pif1-like helicase and application thereof |
GB202205617D0 (en) * | 2022-04-14 | 2022-06-01 | Oxford Nanopore Tech Plc | Novel modified protein pores and enzymes |
WO2024023219A1 (en) * | 2022-07-27 | 2024-02-01 | Illumina Cambridge Limited | Tagmentation workflow |
Family Cites Families (194)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
FI82266C (en) | 1982-10-19 | 1991-02-11 | Cetus Corp | Process for Preparation of IL-2 Mutein |
GB8924338D0 (en) | 1989-10-28 | 1989-12-13 | Atomic Energy Authority Uk | Electrodes |
US5215899A (en) | 1989-11-09 | 1993-06-01 | Miles Inc. | Nucleic acid amplification employing ligatable hairpin probe and transcription |
US5424413A (en) | 1992-01-22 | 1995-06-13 | Gen-Probe Incorporated | Branched nucleic acid probes |
FR2703693B1 (en) | 1993-04-06 | 1995-07-13 | Pasteur Institut | Rapid method of determining a DNA sequence and application to sequencing and diagnosis. |
US5714320A (en) | 1993-04-15 | 1998-02-03 | University Of Rochester | Rolling circle synthesis of oligonucleotides and amplification of select randomized circular oligonucleotides |
US5777078A (en) | 1993-04-28 | 1998-07-07 | Worcester Foundation For Experimental Biology | Triggered pore-forming agents |
EP0753071A1 (en) | 1993-04-28 | 1997-01-15 | Worcester Foundation For Experimental Biology | Cell-targeted lytic pore-forming agents |
DE4320201A1 (en) | 1993-06-18 | 1995-01-12 | Asta Medica Ag | Use of Cetrorelix and other nona and decapeptides for the manufacture of a medicament for combating AIDS and for growth stimulation |
US7569341B2 (en) | 1994-01-31 | 2009-08-04 | Trustees Of Boston University | Nucleic acid directed immobilization arrays and methods of assembly |
US5561043A (en) | 1994-01-31 | 1996-10-01 | Trustees Of Boston University | Self-assembling multimeric nucleic acid constructs |
US5795782A (en) | 1995-03-17 | 1998-08-18 | President & Fellows Of Harvard College | Characterization of individual polymer molecules based on monomer-interface interactions |
US6362002B1 (en) | 1995-03-17 | 2002-03-26 | President And Fellows Of Harvard College | Characterization of individual polymer molecules based on monomer-interface interactions |
US6395887B1 (en) | 1995-08-01 | 2002-05-28 | Yale University | Analysis of gene expression by display of 3'-end fragments of CDNAS |
US5866336A (en) | 1996-07-16 | 1999-02-02 | Oncor, Inc. | Nucleic acid amplification oligonucleotides with molecular energy transfer labels and methods based thereon |
DE19648625A1 (en) | 1996-11-13 | 1998-05-14 | Soft Gene Gmbh | Microprojectile for the introduction of substances into cells by ballistic transfer |
WO1999005167A1 (en) | 1997-07-25 | 1999-02-04 | University Of Massachusetts | Designed protein pores as components for biosensors |
US6087099A (en) | 1997-09-08 | 2000-07-11 | Myriad Genetics, Inc. | Method for sequencing both strands of a double stranded DNA in a single sequencing reaction |
US6127166A (en) | 1997-11-03 | 2000-10-03 | Bayley; Hagan | Molluscan ligament polypeptides and genes encoding them |
JPH11137260A (en) | 1997-11-06 | 1999-05-25 | Soyaku Gijutsu Kenkyusho:Kk | Anti-influenza viral cyclic dumbbell type rna-dna chimera compound and anti-influenza viral agent |
US6123819A (en) | 1997-11-12 | 2000-09-26 | Protiveris, Inc. | Nanoelectrode arrays |
DE19826758C1 (en) | 1998-06-15 | 1999-10-21 | Soft Gene Gmbh | Production of closed, double-stranded DNA molecules for use in gene therapy or genetic vaccination |
US6743605B1 (en) | 1998-06-24 | 2004-06-01 | Enzo Life Sciences, Inc. | Linear amplification of specific nucleic acid sequences |
US6787308B2 (en) | 1998-07-30 | 2004-09-07 | Solexa Ltd. | Arrayed biomolecules and their use in sequencing |
US6235502B1 (en) | 1998-09-18 | 2001-05-22 | Molecular Staging Inc. | Methods for selectively isolating DNA using rolling circle amplification |
US6267872B1 (en) | 1998-11-06 | 2001-07-31 | The Regents Of The University Of California | Miniature support for thin films containing single channels or nanopores and methods for using same |
US6426231B1 (en) | 1998-11-18 | 2002-07-30 | The Texas A&M University System | Analyte sensing mediated by adapter/carrier molecules |
WO2000034527A2 (en) | 1998-12-11 | 2000-06-15 | The Regents Of The University Of California | Targeted molecular bar codes |
NO986133D0 (en) | 1998-12-23 | 1998-12-23 | Preben Lexow | Method of DNA Sequencing |
US7056661B2 (en) | 1999-05-19 | 2006-06-06 | Cornell Research Foundation, Inc. | Method for sequencing nucleic acid molecules |
EP1192103A1 (en) | 1999-06-22 | 2002-04-03 | President And Fellows of Harvard College | Control of solid state dimensional features |
WO2001002425A2 (en) | 1999-06-29 | 2001-01-11 | University Health Network | Peptide conjugates for the stabilization of membrane proteins and interactions with biological membranes |
CA2383264A1 (en) | 1999-08-13 | 2001-02-22 | Yale University | Binary encoded sequence tags |
US6274320B1 (en) | 1999-09-16 | 2001-08-14 | Curagen Corporation | Method of sequencing a nucleic acid |
US6682649B1 (en) | 1999-10-01 | 2004-01-27 | Sophion Bioscience A/S | Substrate and a method for determining and/or monitoring electrophysiological properties of ion channels |
AU1804001A (en) | 1999-12-02 | 2001-06-12 | Molecular Staging, Inc. | Generation of single-strand circular dna from linear self-annealing segments |
EP2083015B1 (en) | 2000-02-11 | 2016-04-06 | The Texas A & M University System | Biosensor compositions and methods of use |
US20020132978A1 (en) | 2000-03-21 | 2002-09-19 | Hans-Peter Gerber | VEGF-modulated genes and methods employing them |
DE60132075T2 (en) | 2000-03-22 | 2009-03-12 | Curagen Corp., New Haven | WNT-1 RELATED POLYPEPTIDES AND NUCLEIC ACIDS THEREFOR |
US6596488B2 (en) | 2000-03-30 | 2003-07-22 | City Of Hope | Tumor suppressor gene |
US6387624B1 (en) | 2000-04-14 | 2002-05-14 | Incyte Pharmaceuticals, Inc. | Construction of uni-directionally cloned cDNA libraries from messenger RNA for improved 3′ end DNA sequencing |
US7001792B2 (en) | 2000-04-24 | 2006-02-21 | Eagle Research & Development, Llc | Ultra-fast nucleic acid sequencing device and a method for making and using the same |
US20020132350A1 (en) | 2000-09-14 | 2002-09-19 | Pioneer Hi-Bred International, Inc. | Targeted genetic manipulation using Mu bacteriophage cleaved donor complex |
US6709861B2 (en) | 2000-11-17 | 2004-03-23 | Lucigen Corp. | Cloning vectors and vector components |
AU2002239284A1 (en) | 2000-11-27 | 2002-06-03 | The Regents Of The University Of California | Methods and devices for characterizing duplex nucleic acid molecules |
US20020197618A1 (en) | 2001-01-20 | 2002-12-26 | Sampson Jeffrey R. | Synthesis and amplification of unstructured nucleic acids for rapid sequencing |
US20030087232A1 (en) | 2001-01-25 | 2003-05-08 | Fred Christians | Methods for screening polypeptides |
US7807408B2 (en) | 2001-03-19 | 2010-10-05 | President & Fellows Of Harvard College | Directed evolution of proteins |
US6863833B1 (en) | 2001-06-29 | 2005-03-08 | The Board Of Trustees Of The Leland Stanford Junior University | Microfabricated apertures for supporting bilayer lipid membranes |
JP2005505262A (en) | 2001-07-03 | 2005-02-24 | ザ リージェンツ オブ ザ ユニバーシティ オブ カリフォルニア | Mammalian sweet taste and amino acid heterodimeric taste receptors |
US6852492B2 (en) | 2001-09-24 | 2005-02-08 | Intel Corporation | Nucleic acid sequencing by raman monitoring of uptake of precursors during molecular replication |
EP1487978B1 (en) | 2002-03-15 | 2008-11-19 | Nuevolution A/S | An improved method for synthesising templated molecules |
US20030215881A1 (en) | 2002-05-10 | 2003-11-20 | Hagan Bayley | Stochastic sensing through covalent interactions |
US7452699B2 (en) | 2003-01-15 | 2008-11-18 | Dana-Farber Cancer Institute, Inc. | Amplification of DNA in a hairpin structure, and applications |
WO2004072294A2 (en) | 2003-02-12 | 2004-08-26 | Genizon Svenska Ab | Methods and means for nucleic acid sequencing |
US20100035254A1 (en) | 2003-04-08 | 2010-02-11 | Pacific Biosciences Of California, Inc. | Composition and method for nucleic acid sequencing |
US7745116B2 (en) | 2003-04-08 | 2010-06-29 | Pacific Biosciences Of California, Inc. | Composition and method for nucleic acid sequencing |
US7163658B2 (en) | 2003-04-23 | 2007-01-16 | Rouvain Bension | Rapid sequencing of polymers |
US7344882B2 (en) | 2003-05-12 | 2008-03-18 | Bristol-Myers Squibb Company | Polynucleotides encoding variants of the TRP channel family member, LTRPC3 |
US20070122885A1 (en) | 2005-08-22 | 2007-05-31 | Fermalogic, Inc. | Methods of increasing production of secondary metabolites by manipulating metabolic pathways that include methylmalonyl-coa |
WO2005056750A2 (en) | 2003-12-11 | 2005-06-23 | Quark Biotech, Inc. | Inversion-duplication of nucleic acids and libraries prepared thereby |
GB0400584D0 (en) | 2004-01-12 | 2004-02-11 | Solexa Ltd | Nucleic acid chacterisation |
WO2006028508A2 (en) | 2004-03-23 | 2006-03-16 | President And Fellows Of Harvard College | Methods and apparatus for characterizing polynucleotides |
GB2413796B (en) | 2004-03-25 | 2006-03-29 | Global Genomics Ab | Methods and means for nucleic acid sequencing |
US20050227239A1 (en) | 2004-04-08 | 2005-10-13 | Joyce Timothy H | Microarray based affinity purification and analysis device coupled with solid state nanopore electrodes |
US7618778B2 (en) | 2004-06-02 | 2009-11-17 | Kaufman Joseph C | Producing, cataloging and classifying sequence tags |
WO2005124888A1 (en) | 2004-06-08 | 2005-12-29 | President And Fellows Of Harvard College | Suspended carbon nanotube field effect transistor |
US7700281B2 (en) | 2004-06-30 | 2010-04-20 | Usb Corporation | Hot start nucleic acid amplification |
AU2005272823B2 (en) | 2004-08-13 | 2012-04-12 | President And Fellows Of Harvard College | An ultra high-throughput opti-nanopore DNA readout platform |
US20060086626A1 (en) | 2004-10-22 | 2006-04-27 | Joyce Timothy H | Nanostructure resonant tunneling with a gate voltage source |
US7867716B2 (en) | 2004-12-21 | 2011-01-11 | The Texas A&M University System | High temperature ion channels and pores |
US7890268B2 (en) | 2004-12-28 | 2011-02-15 | Roche Molecular Systems, Inc. | De-novo sequencing of nucleic acids |
GB0505971D0 (en) | 2005-03-23 | 2005-04-27 | Isis Innovation | Delivery of molecules to a lipid bilayer |
US7507575B2 (en) | 2005-04-01 | 2009-03-24 | 3M Innovative Properties Company | Multiplex fluorescence detection device having removable optical modules |
EP1910537A1 (en) | 2005-06-06 | 2008-04-16 | 454 Life Sciences Corporation | Paired end sequencing |
US20070020640A1 (en) | 2005-07-21 | 2007-01-25 | Mccloskey Megan L | Molecular encoding of nucleic acid templates for PCR and other forms of sequence analysis |
WO2007018601A1 (en) | 2005-08-02 | 2007-02-15 | Rubicon Genomics, Inc. | Compositions and methods for processing and amplification of dna, including using multiple enzymes in a single reaction |
GB0523282D0 (en) | 2005-11-15 | 2005-12-21 | Isis Innovation | Methods using pores |
EP1963530B1 (en) | 2005-12-22 | 2011-07-27 | Pacific Biosciences of California, Inc. | Active surface coupled polymerases |
US7932029B1 (en) | 2006-01-04 | 2011-04-26 | Si Lok | Methods for nucleic acid mapping and identification of fine-structural-variations in nucleic acids and utilities |
WO2007098427A2 (en) | 2006-02-18 | 2007-08-30 | Michael Strathmann | Massively multiplexed sequencing |
US8673567B2 (en) | 2006-03-08 | 2014-03-18 | Atila Biosystems, Inc. | Method and kit for nucleic acid sequence detection |
EP2002017B1 (en) | 2006-04-04 | 2015-06-10 | Keygene N.V. | High throughput detection of molecular markers based on restriction fragments |
WO2007146158A1 (en) | 2006-06-07 | 2007-12-21 | The Trustees Of Columbia University In The City Of New York | Dna sequencing by nanopore using modified nucleotides |
JP4876766B2 (en) | 2006-08-10 | 2012-02-15 | トヨタ自動車株式会社 | Fuel cell |
US20110039776A1 (en) | 2006-09-06 | 2011-02-17 | Ashutosh Chilkoti | Fusion peptide therapeutic compositions |
US20100311602A1 (en) | 2006-10-13 | 2010-12-09 | J. Craig Venter Institute, Inc. | Sequencing method |
CA2666517A1 (en) | 2006-10-23 | 2008-05-02 | Pacific Biosciences Of California, Inc. | Polymerase enzymes and reagents for enhanced nucleic acid sequencing |
GB2445016B (en) | 2006-12-19 | 2012-03-07 | Microsaic Systems Plc | Microengineered ionisation device |
WO2008102121A1 (en) | 2007-02-20 | 2008-08-28 | Oxford Nanopore Technologies Limited | Formation of lipid bilayers |
EP3798317B1 (en) | 2007-04-04 | 2024-01-03 | The Regents of the University of California | Compositions, devices, systems, and methods for using a nanopore |
EP3540436B1 (en) | 2007-09-12 | 2023-11-01 | President And Fellows Of Harvard College | High-resolution molecular sensor |
GB2453377A (en) | 2007-10-05 | 2009-04-08 | Isis Innovation | Transmembrane protein pores and molecular adapters therefore. |
KR101414713B1 (en) | 2007-10-11 | 2014-07-03 | 삼성전자주식회사 | Method of amplifying target nucleic acids by rolling circle amplification in the presence of ligase and endonuclease |
GB0724736D0 (en) | 2007-12-19 | 2008-01-30 | Oxford Nanolabs Ltd | Formation of layers of amphiphilic molecules |
JP5288143B2 (en) | 2007-12-31 | 2013-09-11 | 富士レビオ株式会社 | Cluster of microresonators for cavity mode optical detection |
WO2009092035A2 (en) | 2008-01-17 | 2009-07-23 | Sequenom, Inc. | Methods and compositions for the analysis of biological molecules |
US8263367B2 (en) | 2008-01-25 | 2012-09-11 | Agency For Science, Technology And Research | Nucleic acid interaction analysis |
US8231969B2 (en) | 2008-03-26 | 2012-07-31 | University Of Utah Research Foundation | Asymmetrically functionalized nanoparticles |
US8236499B2 (en) | 2008-03-28 | 2012-08-07 | Pacific Biosciences Of California, Inc. | Methods and compositions for nucleic acid sample preparation |
US8628940B2 (en) | 2008-09-24 | 2014-01-14 | Pacific Biosciences Of California, Inc. | Intermittent detection during analytical reactions |
EP3425060B1 (en) | 2008-03-28 | 2021-10-27 | Pacific Biosciences of California, Inc. | Compositions and methods for nucleic acid sequencing |
US20090269771A1 (en) | 2008-04-24 | 2009-10-29 | Life Technologies Corporation | Method of sequencing and mapping target nucleic acids |
WO2009132124A2 (en) | 2008-04-24 | 2009-10-29 | The Trustees Of Columbia University In The City Of New York | Geometric patterns and lipid bilayers for dna molecule organization and uses thereof |
US20110229877A1 (en) | 2008-07-07 | 2011-09-22 | Oxford Nanopore Technologies Limited | Enzyme-pore constructs |
JP2011527191A (en) | 2008-07-07 | 2011-10-27 | オックスフォード ナノポア テクノロジーズ リミテッド | Base detection pore |
US20100092960A1 (en) | 2008-07-25 | 2010-04-15 | Pacific Biosciences Of California, Inc. | Helicase-assisted sequencing with molecular beacons |
CN102203273A (en) | 2008-09-09 | 2011-09-28 | 生命技术公司 | Methods of generating gene specific libraries |
US8481264B2 (en) | 2008-09-19 | 2013-07-09 | Pacific Biosciences Of California, Inc. | Immobilized nucleic acid complexes for sequence analysis |
EP3029467B1 (en) | 2008-09-22 | 2020-01-08 | University of Washington | Msp nanopores and related methods |
US8383369B2 (en) | 2008-09-24 | 2013-02-26 | Pacific Biosciences Of California, Inc. | Intermittent detection during analytical reactions |
WO2010036287A1 (en) | 2008-09-24 | 2010-04-01 | Pacific Biosciences Of California, Inc. | Intermittent detection during analytical reactions |
PL2963709T3 (en) | 2008-10-24 | 2017-11-30 | Epicentre Technologies Corporation | Transposon end compositions and methods for modifying nucleic acids |
US9080211B2 (en) | 2008-10-24 | 2015-07-14 | Epicentre Technologies Corporation | Transposon end compositions and methods for modifying nucleic acids |
US8486630B2 (en) | 2008-11-07 | 2013-07-16 | Industrial Technology Research Institute | Methods for accurate sequence data and modified base position determination |
GB0820927D0 (en) | 2008-11-14 | 2008-12-24 | Isis Innovation | Method |
CA2746632C (en) | 2008-12-11 | 2020-06-30 | Pacific Biosciences Of California, Inc. | Characterization of modified nucleic acids |
US9222082B2 (en) | 2009-01-30 | 2015-12-29 | Oxford Nanopore Technologies Limited | Hybridization linkers |
WO2010086622A1 (en) | 2009-01-30 | 2010-08-05 | Oxford Nanopore Technologies Limited | Adaptors for nucleic acid constructs in transmembrane sequencing |
DK2396430T3 (en) | 2009-02-16 | 2013-07-15 | Epict Technologies Corp | TEMPLATE-INDEPENDENT LIGATION OF SINGLE-STRANDED DNA |
CA2753294A1 (en) | 2009-02-23 | 2010-08-26 | Cytomx Therapeutics, Inc. | Proproteins and methods of use thereof |
FR2943656A1 (en) | 2009-03-25 | 2010-10-01 | Air Liquide | HYDROGEN PRODUCTION METHOD AND PLANT USING A THERMOKINETIC COMPRESSOR |
GB0905140D0 (en) | 2009-03-25 | 2009-05-06 | Isis Innovation | Method |
US8986928B2 (en) | 2009-04-10 | 2015-03-24 | Pacific Biosciences Of California, Inc. | Nanopore sequencing devices and methods |
BRPI1012752B1 (en) | 2009-04-20 | 2019-06-25 | Oxford Nanopore Technologies Limited | METHOD AND APPARATUS FOR DETECTING AN INTERACTION OF A MOLECULAR ENTITY WITH A MEMBRANE PROTEIN IN A LAYER OF AMPHIPHILIC MOLECULES |
GB0910302D0 (en) | 2009-06-15 | 2009-07-29 | Lumora Ltd | Nucleic acid amplification |
US20120015821A1 (en) | 2009-09-09 | 2012-01-19 | Life Technologies Corporation | Methods of Generating Gene Specific Libraries |
WO2011067559A1 (en) | 2009-12-01 | 2011-06-09 | Oxford Nanopore Technologies Limited | Biochemical analysis instrument |
US20120010085A1 (en) | 2010-01-19 | 2012-01-12 | Rava Richard P | Methods for determining fraction of fetal nucleic acids in maternal samples |
FR2955773B1 (en) | 2010-02-01 | 2017-05-26 | Commissariat A L'energie Atomique | MOLECULAR COMPLEX FOR TARGETING ANTIGENS TO ANTIGEN-PRESENTING CELLS AND ITS APPLICATIONS FOR VACCINATION |
KR20110100963A (en) | 2010-03-05 | 2011-09-15 | 삼성전자주식회사 | Microfluidic device and method for determining sequences of target nucleic acids using the same |
WO2011112718A1 (en) | 2010-03-10 | 2011-09-15 | Ibis Biosciences, Inc. | Production of single-stranded circular nucleic acid |
US8652779B2 (en) | 2010-04-09 | 2014-02-18 | Pacific Biosciences Of California, Inc. | Nanopore sequencing using charge blockade labels |
US20120244525A1 (en) | 2010-07-19 | 2012-09-27 | New England Biolabs, Inc. | Oligonucleotide Adapters: Compositions and Methods of Use |
CN103392008B (en) | 2010-09-07 | 2017-10-20 | 加利福尼亚大学董事会 | Control of DNA movement in a nanopore at one-nucleotide precision by a processive enzyme |
CA2821299C (en) | 2010-11-05 | 2019-02-12 | Frank J. Steemers | Linking sequence reads using paired code tags |
US10443096B2 (en) | 2010-12-17 | 2019-10-15 | The Trustees Of Columbia University In The City Of New York | DNA sequencing by synthesis using modified nucleotides and nanopore detection |
US20130291392A1 (en) | 2011-01-18 | 2013-11-07 | R.K. Swamy | Multipurpose instrument for triangle solutions, measurements and geometrical applications called triometer |
US9402808B2 (en) | 2011-01-19 | 2016-08-02 | Panacea Biotec Limited | Liquid oral composition of lanthanum salts |
EP3037536B1 (en) | 2011-01-28 | 2019-11-27 | Illumina, Inc. | Oligonucleotide replacement for di-tagged and directional libraries |
US20120196279A1 (en) | 2011-02-02 | 2012-08-02 | Pacific Biosciences Of California, Inc. | Methods and compositions for nucleic acid sample preparation |
WO2012107778A2 (en) | 2011-02-11 | 2012-08-16 | Oxford Nanopore Technologies Limited | Mutant pores |
US9347929B2 (en) | 2011-03-01 | 2016-05-24 | The Regents Of The University Of Michigan | Controlling translocation through nanopores with fluid wall |
EP3633370A1 (en) | 2011-05-27 | 2020-04-08 | Oxford Nanopore Technologies Limited | Coupling method |
US20130017978A1 (en) | 2011-07-11 | 2013-01-17 | Finnzymes Oy | Methods and transposon nucleic acids for generating a dna library |
US9145623B2 (en) | 2011-07-20 | 2015-09-29 | Thermo Fisher Scientific Oy | Transposon nucleic acids comprising a calibration sequence for DNA sequencing |
IN2014DN00221A (en) | 2011-07-25 | 2015-06-05 | Oxford Nanopore Tech Ltd | |
US9632102B2 (en) | 2011-09-25 | 2017-04-25 | Theranos, Inc. | Systems and methods for multi-purpose analysis |
JP6457811B2 (en) | 2011-09-23 | 2019-01-23 | オックスフォード ナノポール テクノロジーズ リミテッド | Analysis of polymers containing polymer units |
US20140308661A1 (en) | 2011-09-25 | 2014-10-16 | Theranos, Inc. | Systems and methods for multi-analysis |
US9810704B2 (en) | 2013-02-18 | 2017-11-07 | Theranos, Inc. | Systems and methods for multi-analysis |
EP2987870B1 (en) | 2011-10-21 | 2020-02-19 | Oxford Nanopore Technologies Limited | Method of characterizing a target polynucleotide using a transmembrane pore and molecular motor |
US9404147B2 (en) | 2011-12-19 | 2016-08-02 | Gen-Probe Incorporated | Closed nucleic acid structures |
EP2798083B1 (en) | 2011-12-29 | 2017-08-09 | Oxford Nanopore Technologies Limited | Method for characterising a polynucelotide by using a xpd helicase |
CN104126018B (en) | 2011-12-29 | 2021-09-14 | 牛津纳米孔技术公司 | Enzymatic process |
NO2694769T3 (en) | 2012-03-06 | 2018-03-03 | ||
WO2013153359A1 (en) | 2012-04-10 | 2013-10-17 | Oxford Nanopore Technologies Limited | Mutant lysenin pores |
GB2559073A (en) | 2012-06-08 | 2018-07-25 | Pacific Biosciences California Inc | Modified base detection with nanopore sequencing |
TWI655213B (en) | 2012-07-13 | 2019-04-01 | 目立康股份有限公司 | Method for producing self-organizing peptide derivative |
US10808231B2 (en) | 2012-07-19 | 2020-10-20 | Oxford Nanopore Technologies Limited | Modified helicases |
CA2879355C (en) | 2012-07-19 | 2021-09-21 | Oxford Nanopore Technologies Limited | Helicase construct and its use in characterising polynucleotides |
WO2014013259A1 (en) | 2012-07-19 | 2014-01-23 | Oxford Nanopore Technologies Limited | Ssb method |
US9551023B2 (en) | 2012-09-14 | 2017-01-24 | Oxford Nanopore Technologies Ltd. | Sample preparation method |
GB201313121D0 (en) | 2013-07-23 | 2013-09-04 | Oxford Nanopore Tech Ltd | Array of volumes of polar medium |
AU2013336430B2 (en) | 2012-10-26 | 2018-02-15 | Oxford Nanopore Technologies Limited | Droplet interfaces |
US9670526B2 (en) | 2012-11-09 | 2017-06-06 | Stratos Genomics, Inc. | Concentrating a target molecule for sensing by a nanopore |
US9683230B2 (en) | 2013-01-09 | 2017-06-20 | Illumina Cambridge Limited | Sample preparation on a solid support |
US20140206842A1 (en) | 2013-01-22 | 2014-07-24 | Muhammed Majeed | Peptides Modified with Triterpenoids and Small Organic Molecules: Synthesis and use in Cosmeceutical |
GB201314695D0 (en) | 2013-08-16 | 2013-10-02 | Oxford Nanopore Tech Ltd | Method |
KR102168813B1 (en) | 2013-03-08 | 2020-10-22 | 옥스포드 나노포어 테크놀로지즈 리미티드 | Enzyme stalling method |
GB201318465D0 (en) | 2013-10-18 | 2013-12-04 | Oxford Nanopore Tech Ltd | Method |
US9567632B2 (en) | 2013-03-19 | 2017-02-14 | New England Biolabs, Inc. | Enrichment of target sequences |
ES2896017T3 (en) | 2013-08-30 | 2022-02-23 | Univ Washington Through Its Center For Commercialization | Selective modification of polymer subunits to improve nanopore-based analysis |
EP3575410A3 (en) | 2013-10-18 | 2020-03-04 | Oxford Nanopore Technologies Limited | Modified enzymes |
GB201406151D0 (en) | 2014-04-04 | 2014-05-21 | Oxford Nanopore Tech Ltd | Method |
GB201321123D0 (en) | 2013-11-29 | 2014-01-15 | Linea Ab Q | Amplification of circular molecules |
US10385389B2 (en) | 2014-01-22 | 2019-08-20 | Oxford Nanopore Technologies Ltd. | Method for attaching one or more polynucleotide binding proteins to a target polynucleotide |
US20170226503A1 (en) | 2014-02-06 | 2017-08-10 | MetaMixis Biologics, Inc. | Methods for Sequential Screening with Co-Culture Based Detection of Metagenomic Elements Conferring Heterologous Metabolite Secretion |
GB201403096D0 (en) | 2014-02-21 | 2014-04-09 | Oxford Nanopore Tech Ltd | Sample preparation method |
US10131944B2 (en) | 2014-03-24 | 2018-11-20 | The Regents Of The University Of California | Molecular adapter for capture and manipulation of transfer RNA |
GB201417712D0 (en) | 2014-10-07 | 2014-11-19 | Oxford Nanopore Tech Ltd | Method |
US10443087B2 (en) | 2014-06-13 | 2019-10-15 | Illumina Cambridge Limited | Methods and compositions for preparing sequencing libraries |
US10017759B2 (en) | 2014-06-26 | 2018-07-10 | Illumina, Inc. | Library preparation of tagged nucleic acid |
WO2016003814A1 (en) | 2014-06-30 | 2016-01-07 | Illumina, Inc. | Methods and compositions using one-sided transposition |
WO2016022557A1 (en) | 2014-08-05 | 2016-02-11 | Twist Bioscience Corporation | Cell free cloning of nucleic acids |
WO2016028887A1 (en) | 2014-08-19 | 2016-02-25 | Pacific Biosciences Of California, Inc. | Compositions and methods for enrichment of nucleic acids |
US10435685B2 (en) | 2014-08-19 | 2019-10-08 | Pacific Biosciences Of California, Inc. | Compositions and methods for enrichment of nucleic acids |
CN117164682A (en) | 2014-09-01 | 2023-12-05 | 弗拉芒区生物技术研究所 | Mutant CsgG pores |
GB201418159D0 (en) | 2014-10-14 | 2014-11-26 | Oxford Nanopore Tech Ltd | Method |
WO2016138080A1 (en) | 2015-02-24 | 2016-09-01 | Trustees Of Boston University | Protection of barcodes during dna amplification using molecular hairpins |
US10612073B2 (en) | 2015-02-26 | 2020-04-07 | Hitachi High-Technologies Corporation | Method for constructing nucleic acid molecule |
GB201609220D0 (en) | 2016-05-25 | 2016-07-06 | Oxford Nanopore Tech Ltd | Method |
CN107488656B (en) | 2016-06-13 | 2020-07-17 | 陆欣华 | Nucleic acid isothermal self-amplification method |
GB201807793D0 (en) | 2018-05-14 | 2018-06-27 | Oxford Nanopore Tech Ltd | Method |
2016
- 2016-05-25 GB GBGB1609220.7A (GB201609220D0): Ceased (not active)
2017
- 2017-05-25 EP EP17727331.5A (EP3464615B1): Active
- 2017-05-25 CN CN201780034561.5A (CN109219665B): Active
- 2017-05-25 WO PCT/GB2017/051490 (WO2017203267A1): status unknown
- 2017-05-25 US US16/304,114 (US11649480B2): Active
2023
- 2023-03-31 US US18/194,062 (US20230374567A1): Pending
Also Published As
Publication number | Publication date |
---|---|
EP3464615A1 (en) | 2019-04-10 |
US20190194722A1 (en) | 2019-06-27 |
CN109219665A (en) | 2019-01-15 |
WO2017203267A1 (en) | 2017-11-30 |
GB201609220D0 (en) | 2016-07-06 |
EP3464615B1 (en) | 2020-11-25 |
US11649480B2 (en) | 2023-05-16 |
CN109219665B (en) | 2022-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230374567A1 (en) | | Method for modifying a template double stranded polynucleotide |
US11525126B2 (en) | | Modified helicases |
US11649490B2 (en) | | Method of target molecule characterisation using a molecular pore |
US20240026441A1 (en) | | Method for attaching one or more polynucleotide binding proteins to a target polynucleotide |
KR102436445B1 (en) | | Modified enzymes |
US20200102608A1 (en) | | Method for characterising a double stranded nucleic acid using a nano-pore and anchor molecules at both ends of said nucleic acid |
US20190345550A1 (en) | | Method for controlling the movement of a polynucleotide through a transmembrane pore |
CN117264925A (en) | | Modified enzymes |
US11965183B2 (en) | | Modified enzymes |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| AS | Assignment | Owner name: OXFORD NANOPORE TECHNOLOGIES PLC, UNITED KINGDOM; Free format text: CHANGE OF NAME; ASSIGNOR: OXFORD NANOPORE TECHNOLOGIES, LTD.; REEL/FRAME: 066880/0540; Effective date: 20210924. Owner name: OXFORD NANOPORE TECHNOLOGIES LTD, UNITED KINGDOM; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: STODDART, DAVID JACKSON; WHITE, JAMES; REEL/FRAME: 066880/0438; Effective date: 20190311 |