WO2006073504A2 - Wobble sequencing - Google Patents
- Publication number
- WO2006073504A2 (PCT/US2005/027695)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequencing
- primer
- anchor
- bases
- sequence
- Prior art date
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
Definitions
- the present invention relates to novel methods and compositions for DNA sequencing.
- the methods described herein are useful for sequencing homopolymeric regions of DNA.
- a second problem with the FISSEQ approach is that the polymerases typically utilized in such reactions do not efficiently incorporate nucleotides due to the high density of modified nucleotides. For that reason, a large fraction of unlabeled nucleotides is introduced, thus reducing the overall density of modification and extending read-lengths. This results in less labeled nucleotide and, accordingly, less signal. The present invention is therefore directed to novel methods of sequencing that circumvent these problems and provide advantages over methods of sequencing known in the art.
- the present invention provides novel sequencing methods designed to circumvent problems associated with sequencing-by-synthesis methods known in the art.
- Although the methods described herein are based on sequencing by polymerase-extension, they differ from FISSEQ and pyrosequencing in that base-additions are not "progressive." Instead, after a given single-base-extension (SBE), the sequencing primer is stripped from the bead-immobilized templates and a new primer is hybridized. Thus, to get beyond the first base, each sequencing primer in the set "reaches" out to a defined position in the unknown unique sequence of the template (e.g., to the fourth base or the fifth base).
- a sequencing primer, from 5' to 3', thus consists of an "anchor sequence" that is complementary to the constant sequence on the template, followed by a defined number of additional bases (e.g., universal, degenerate and/or natural bases) that will hybridize to the unknown sequence regardless of what it is. If, for example, there are three fixed universal bases, then the sequencing primer is positioned to sequence the fourth base via SBE with labeled nucleotides. After a single-base-extension and data acquisition, extended and unextended primers are stripped (e.g., with heat) and a new primer is annealed that has a different number of universal bases, thus querying a different base-position within the unknown sequence. Thus, in this simplest iteration of the scheme, one needs only a set of N primers to achieve a read-length of N.
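The non-progressive scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `WobblePrimer`, `primer_set` and `read_template` are hypothetical, the anchor string is a placeholder, universal bases are assumed to pair perfectly with any template base, and the unknown sequence is written in the order in which the primer encounters it.

```python
from dataclasses import dataclass

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


@dataclass(frozen=True)
class WobblePrimer:
    """Hypothetical model of one sequencing primer: anchor plus universal bases."""
    anchor: str        # complementary to the constant region of the template
    n_universal: int   # universal bases that pair with any template base

    @property
    def queried_position(self) -> int:
        # 1-based position in the unknown sequence read by the single-base-extension
        return self.n_universal + 1


def primer_set(anchor: str, read_length: int) -> list:
    """N primers (0 .. N-1 universal bases) give a read-length of N."""
    return [WobblePrimer(anchor, k) for k in range(read_length)]


def read_template(unknown: str, primers) -> str:
    """Simulate strip-and-reanneal cycles; return the incorporated (labeled) bases."""
    calls = []
    for primer in primers:
        template_base = unknown[primer.queried_position - 1]
        # The polymerase adds the nucleotide complementary to the template base.
        calls.append(COMPLEMENT[template_base])
        # The primer is then stripped, so an error here does not carry forward.
    return "".join(calls)
```

Because each cycle starts from a freshly annealed primer, the order in which the primers are used does not matter in this model; shuffling `primers` only permutes the order of the calls.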
- the present invention provides many advantages over sequencing methods known in the art.
- the methods described herein 1) provide a quick solution to the problem of sequencing homopolymers; 2) enable manual mistakes and biochemical inefficiencies to be non-cumulative; 3) greatly expedite the technology development for longer reads (i.e., one does not have to cycle out to test a method for improving read-lengths); 4) provide better signals than are obtained by the FISSEQ system currently used in the art (i.e., in which a desire for signal has to be balanced against a desire to minimize the fraction of extended templates with cleaved linker, as it inhibits the polymerase); and 5) greatly increase the choice and amounts of enzyme (polymerase or ligase) due to the lack of a requirement to take extensions to completion.
- Figure 1 depicts primer information.
- the first column of numbers indicates the cycle number assigned to a given query.
- the second and third columns indicate the sequencing primer used, and the fourth column indicates the conditions of hybridization.
- the fifth column indicates the base(s) used to extend, and the sixth column indicates the templates expected to add.
- the remaining columns indicate the best-fit slope coefficient for adders and non-adders, and finally the ratio of these values.
- TR: Texas Red.
- Figure 2 depicts an extension with 37C.8N.CG, sequencing bases 10, 11 and 12 on T4. Blue indicates bases that were sequenced; yellow indicates bases attempted and failed; uncolored indicates bases that were not attempted.
- Figure 3 depicts sequencing on emulsion beads.
- Figure 4 depicts primer information for primers that extended either T2, T3 or T4.
- Figure 5 depicts bases that were sequenced. Blue indicates bases that were sequenced; yellow indicates bases attempted and failed; uncolored indicates bases that were not attempted.
- Figure 6 depicts sequencing on emulsion beads.
- Figure 7 is a schematic depicting query of tag positions (-5) by mismatch ligation.
- Figures 8A and 8B are schematics depicting unique tags and queries that will ligate.
- Figures 9A and 9B are schematics of the method of the present invention.
- Figure 10 is a four-color depiction of four possible base calls.
- Figure 11 is a graph showing variation in accuracy over each of 26 cycles of nonprogressive sequencing.
- DNA sequences of numerous features are obtained in parallel by cycles of hybridization of sequencing primers that contain universal, degenerate, and/or specific bases at positions of unknown sequence, followed by single-base-extension with polymerase and nucleotide.
- As polymerases generally only extend from terminally-matched nucleotides, when an extension occurs, the identity of the bases complementary to specific bases present at the 3' terminus of a given sequencing primer is revealed.
- use of modified nucleotides with different fluorescent labels reveals the identity of the incorporated nucleotide.
- Because a given sequencing primer is designed with a known number of universal or degenerate nucleotides, and a known number of specific nucleotides, one knows the specific position within the unknown template that one is sequencing.
- the methods of the invention include the use of "degenerate bases," which are intended to include, but are not limited to, primer mixes that contain all possible sequences at unknown positions.
- the methods of the invention also include the use of universal bases at some or all of the primer positions.
- Universal bases are intended to include, but are not limited to, synthetic nucleotide analogs that ideally pair with equal affinities to each of the natural nucleotides, and are readily accepted as substrates by natural enzymes. Examples of universal bases include 5-nitroindole, 3-nitropyrrole, deoxyinosine, and the like.
- the methods of the invention further include the use of natural bases, wherein sequencing primer oligonucleotides are synthesized with fully degenerate positions, such that all possible sequencing primers (or some random subset of all possibilities) are present during hybridization.
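To make "fully degenerate positions" concrete, the sketch below enumerates every primer in such a pool. It assumes the standard IUPAC convention that `N` stands for any of the four natural bases; the function name `expand_degenerate` and the example sequences are hypothetical and not taken from the patent.

```python
from itertools import product

BASES = "ACGT"


def expand_degenerate(primer: str) -> list:
    """Expand every 'N' in the primer into all four natural bases.

    A primer with k degenerate positions yields 4**k distinct sequences,
    all of which would be present together during hybridization.
    """
    choices = [BASES if ch == "N" else ch for ch in primer]
    return ["".join(combo) for combo in product(*choices)]
```

For example, `expand_degenerate("GGNN")` lists the 16 sequences from `GGAA` to `GGTT`, which shows how quickly pool complexity grows with the number of degenerate positions.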
- overall efficiency could be improved by enzyme engineering for greater permissiveness with respect to mismatches (e.g., the M1/M4 variants of Taq) or alterations to the primer design strategy.
- methods of the invention are directed to fixing the terminal two bases of a given sequencing primer, but allowing the remainder of bases at "universal" positions to be synthesized with fully-degenerate natural bases.
- Non-terminator FISSEQ yields approximately 0.50 bases-per-cycle (assuming no homopolymer resolution and thus counting multi-base runs as single extensions). By this consideration, achieving an identical read-length would require approximately 2.67 times as many cycles in the 2 bp-matched-wobble-sequencing system.
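The factor of roughly 2.67 can be reproduced with a short calculation. The sketch assumes that each cycle of the 2 bp-matched scheme reads three bases on 1/16 of the templates (the figure the text gives for a primer such as 37C.2N.CA); treating that as the per-cycle yield is an interpretation, and the variable names are illustrative.

```python
from fractions import Fraction

# Stated in the text: non-terminator FISSEQ yields about 0.50 bases per cycle.
fisseq_bases_per_cycle = Fraction(1, 2)

# Assumption: a 2 bp-matched wobble primer reads 3 bases on 1/16 of templates,
# i.e. 3/16 of a base per template per cycle on a random library.
wobble_bases_per_cycle = 3 * Fraction(1, 16)

# How many times more cycles wobble sequencing needs for the same read-length.
cycle_ratio = fisseq_bases_per_cycle / wobble_bases_per_cycle

print(cycle_ratio, round(float(cycle_ratio), 2))
```

Under these assumptions the ratio is exactly 8/3, which rounds to the 2.67 quoted above.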
- a typical primer-name below is "37C.2N.CA".
- the anchor sequence is a trimmed version of the original FISSEQ primer for the T1..T5 template.
- the "37C" indicates the extent to which it has been trimmed (i.e., 37C is the Tm of the anchor sequence if it were a stand-alone primer).
- the "2N" indicates that the anchor-sequence is followed by two full "wobble" or degenerate bases, and the CA indicates the fixed two terminal bases. This primer would extend to the fifth base, thus sequencing 3 bases (base 3, 4 and 5) on 1/16th of the templates of a random library.
- primers with even numbers of "wobble" or degenerate bases and terminal bases that match at least one of the five T1..T5 templates were focused on to ensure extension at every cycle. For a given "reach-length," this was approximately 1/4th of the primers that would be required in a real sequencing experiment involving sequencing of genomic fragments. However, this estimate is slightly conservative in that one could do multiples of three for the number of "wobble" or degenerate bases, rather than multiples of two. Some optional redundancy was built in. For example, 37C.2N.XX sequences bases 3, 4 and 5, and 37C.4N.XX sequences bases 5, 6 and 7. Thus, base 5 is sequenced twice (as are base 7, base 9, etc.).
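The naming convention and the built-in redundancy can be checked with a small parser. This is a sketch under assumptions: `PrimerName` and `parse_primer_name` are hypothetical helpers, the name format is inferred from the single example "37C.2N.CA", and the positions follow the rule stated above (wobble bases first, then the fixed terminal bases, then the extended base).

```python
import re
from dataclasses import dataclass
from fractions import Fraction

# Inferred format: <anchor Tm>C.<number of wobble bases>N.<fixed terminal bases>
_NAME = re.compile(r"^(\d+)C\.(\d+)N\.([ACGT]+)$")


@dataclass(frozen=True)
class PrimerName:
    anchor_tm_c: int   # Tm of the trimmed anchor, in degrees C
    n_wobble: int      # fully degenerate bases after the anchor
    terminal: str      # fixed bases at the 3' terminus

    @property
    def bases_read(self) -> list:
        # Fixed terminal bases plus the one base added by extension (1-based).
        first = self.n_wobble + 1
        return list(range(first, first + len(self.terminal) + 1))

    @property
    def template_fraction(self) -> Fraction:
        # Share of a random library whose sequence matches the fixed bases.
        return Fraction(1, 4 ** len(self.terminal))


def parse_primer_name(name: str) -> PrimerName:
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return PrimerName(int(match.group(1)), int(match.group(2)), match.group(3))
```

With this rule, a 2N primer covers bases 3 to 5 and a 4N primer covers bases 5 to 7, so stepping the wobble count by two reads every odd-numbered base from 5 onward twice, while stepping by three would remove the overlap.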
- Figure 1 depicts results from top-layered, 1 μm beads with loaded T1..T5 templates. These are primers that would be required in a full sequencing experiment on unknown sequence. Primers were ordered to sequence through to the 11th base on all five templates (37C.0N.XX through 37C.8N.XX). Only one primer was ordered for 37C.10N.XX through 37C.18N.XX.
- an embodiment of the invention referred to as "Wobble Ligation."
- Several of the principles are identical or similar to Wobble Extension as previously described herein. These principles are distinguishable from FISSEQ and other sequencing methods, such as that described in Macevicz US Patent No. 5,750,341.
- In Wobble Extension, a single primer is hybridized and extended; degenerate bases within the oligonucleotide primer are included to 'reach' a specific distance into the unknown sequence.
- In Wobble Ligation, a single primer is hybridized that is universal (the 'anchor' primer) and sits such that either its 5' or 3' end is immediately adjacent to the unknown sequence.
- the position to be queried is encoded in a pool of degenerate nonamers (9-mer) that are ligated to the anchor primer.
- anchor primers having one or several degenerate positions at the terminus to be ligated to can serve as substrates for ligation and so can be used to position the query even further into the unknown sequence.
- the assays are always identical, in that the full pool of possible nonamers is being ligated to the anchor primer. What changes between the assays (and determines whether one is sequencing base 4 or base 7 in a particular cycle, for example) is the correlation between specific positions in the degenerate nonamer and fluorescent labels at its end.
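The idea that the ligation itself never changes, and only the label-to-position correlation does, can be illustrated as follows. The sketch rests on simplifying assumptions: only the perfectly matched nonamer ligates, strings are written so that index i of the nonamer pairs with index i of the template window, and the dye names `dye1` to `dye4` are placeholders rather than the fluorophores used in the patent.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Hypothetical dye assignment: the label reports the nonamer base at the query position.
LABEL_FOR_BASE = {"A": "dye1", "C": "dye2", "G": "dye3", "T": "dye4"}
BASE_FOR_LABEL = {label: base for base, label in LABEL_FOR_BASE.items()}


def matched_nonamer(template_window: str) -> str:
    """The one member of the degenerate pool that pairs with all nine template bases."""
    if len(template_window) != 9:
        raise ValueError("expected the nine template bases next to the anchor")
    return "".join(COMPLEMENT[base] for base in template_window)


def observed_label(template_window: str, query_index: int) -> str:
    """Label seen on the bead when the pool is labeled according to query_index."""
    nonamer = matched_nonamer(template_window)  # same ligation product in every assay
    return LABEL_FOR_BASE[nonamer[query_index]]


def call_base(label: str) -> str:
    """Convert the observed label back into the template base at the queried position."""
    return COMPLEMENT[BASE_FOR_LABEL[label]]
```

Calling `observed_label` with different `query_index` values on the same window models running the same ligation with differently labeled pools, one queried position per assay.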
- Figure 7 depicts, for example, the querying of position (-4) relative to the anchor primer.
- Such error establishes an upper limit on the accuracy of any sequencing method which operates on material that is the product of the amplification.
- template is diluted to the point where one template molecule and one bead will be trapped in an emulsion compartment, and PCR will proceed from this single molecule, resulting in many copies bound to the bead.
- An error arising early during the amplification will result in a bead having either a homogeneous population of amplicons bearing the error, or a heterogeneous population of amplicons, some bearing the error and some not. In either case, the accuracy of the sequence derived from such a bead will be low.
- To avoid this, emulsion PCR will be started with multiple copies of a given template molecule in a compartment. Then, PCR will initiate from each copy independently, and the product bound to the bead in that compartment will be largely homogeneous and error-free, even if errors arise early during amplification from one of the copies of the template.
- the first approach is to clone the template desired to be sequenced into a plasmid, transform into bacteria or yeast, and perform emulsion PCR not with naked single-copy template DNA, but rather with individual cells, each of which includes multiple copies of the template. During PCR the cells will rupture and amplification will proceed from each copy of the plasmid present. Since multiple copies of the template were present, and since each was copied independently by the host cell's low-error replication machinery, the probability of obtaining a PCR-based error in a preponderance of amplicons is very low.
- the second approach uses linear rolling circle amplification to prepare template molecules which are linear concatemers of independent copies of the original template. PCR then initiates from each site on the concatemer independently.
- the important constraint (regardless of the method used to get multiple copies of a template into an emulsion compartment or otherwise to initiate a spatially-clustered exponential amplification) is that the initial copies made of the original template are independent of each other and so the probability of two such copies bearing the same error is very low.
- the original template (a circular molecule) is iterated over many times, such that all copies are copies of the original template (unlike PCR, which makes copies of copies).
- Embodiments of the present invention are directed to methods to determine, with single-base resolution, the length of the unique region of a library molecule.
- a paired-tag genomic library is constructed where each library molecule is comprised of a unique region flanked by common primer sites.
- the type Hs restriction enzyme Mmel is used. Mmel cuts either 17bp or 18bp from its recognition sequence, and in the embodiment described here thus produces inserts of 17bp or 18bp at a ratio of about 50:50 with little to no sequence-dependence. Knowing the exact length of each insert is advantageous since sequencing methods described herein include the step of reading a certain number of bases from each side of the 17-18bp tag. In order to generate a contiguous sequence from such reads, knowing the exact length of the insert would be beneficial.
Abstract
Novel methods and compositions for DNA sequencing are provided. The methods described herein are useful for sequencing homopolymeric regions of DNA. The methods also prevent the accumulation of mistakes and inefficiencies in the sequencing reaction.
Description
PATENT ATTORNEY DOCKET NO. 10498-00091
WOBBLE SEQUENCING
STATEMENT OF GOVERNMENT INTERESTS
This invention was made with Government support under Award Numbers 1P50 HG003170, awarded by the Centers of Excellence in Genomic Science (CEGS); and DE-FG02-02ER63445, awarded by Genomes to Life (GTL). The Government has certain rights in the invention.
FIELD OF THE INVENTION
The present invention relates to novel methods and compositions for DNA sequencing. The methods described herein are useful for sequencing homopolymeric regions of DNA.
BACKGROUND OF THE INVENTION
Current state-of-the-art in sequencing-by-synthesis relies on a single sequencing primer, with a known sequence, followed by cyclic additions of a single nucleotide species at each cycle and detection of incorporation events (e.g., C-A-G-T-C-A-G-T...) via fluorescence or light. Examples of these methods include fluorescent in situ sequencing (FISSEQ) and pyrosequencing. A major problem for both of these approaches is that it is very difficult to decode consecutive runs of the same base in the unknown sequence (i.e., homopolymeric runs), because it is difficult to distinguish single from multiple incorporation events. As approximately 44% of nucleotides are part of a homopolymeric run, this is obviously a major consideration. Most efforts to circumvent this problem involve the development of reversibly terminating nucleotides, which cause a variety of difficulties.
A second problem with the FISSEQ approach is that the set of polymerases typically utilized in such reactions do not efficiently incorporate nucleotides due to the high density of modified nucleotides. For that reason, a large fraction of unlabeled nucleotides are introduced, thus reducing the overall density of modification and extending read-lengths. This results in less labeled nucleotide and, accordingly, less signal. Accordingly, the present invention is directed to novel methods of sequencing that circumvent these problems and provide advantages over methods of sequencing known in the art.
SUMMARY
The present invention provides novel sequencing methods designed to circumvent problems associated with sequencing-by-synthesis methods known in the art. Although the methods described herein are based on sequencing by polymerase-extension, they differ from FISSEQ and pyrosequencing in that base-additions are not "progressive." Instead, after a given single-base-extension (SBE), the sequencing primer is stripped from the bead-immobilized templates and a new primer is hybridized. Thus to get beyond the first base, each sequencing primer in the set "reaches" out to a defined position in the unknown unique sequence of the template (e.g., to the fourth base or the fifth base). A sequencing primer, from 5' to 3', thus consists of an "anchor sequence" that is complementary to the constant sequence on the template, and a defined number of additional bases (e.g., universal, degenerate and/or natural bases), that will hybridize to the unknown sequence regardless of what it is. If, for example, there are three fixed universal bases, then the sequencing primer is positioned to sequence the fourth base via SBE with labeled nucleotides. After a single-base-extension and data acquisition, extended and unextended primers are stripped (e.g., with heat) and a new primer is annealed that has a different number of universal bases, thus querying a different base-position within the unknown sequence. Thus in this simplest iteration of the scheme, one only needs a set of N primers to achieve a read-length of N.
The present invention provides many advantages over sequencing methods known in the art. The methods described herein: 1) provide a quick solution to the problem of sequencing homopolymers; 2) enable manual mistakes and biochemical inefficiencies to be non-cumulative; 3) greatly expedite the technology development for longer reads (i.e. don't have to cycle out to test a method for improving read-lengths); 4) provide better signals than are obtained by the FISSEQ system currently used in the art (i.e., in which a desire for signal has to be balanced against a desire to minimize the fraction of extended templates with cleaved linker as it inhibits the polymerase); and 5) greatly increase the choice and amounts of enzyme (polymerase or ligase) due to the lack of a requirement to take extensions to completion.
BRIEF DESCRIPTION OF THE DRAWINGS
The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawing(s) will be provided by the Office upon request and payment of the necessary fee.
The foregoing and other features and advantages of the present invention will be more fully understood from the following detailed description of illustrative embodiments taken in conjunction with the accompanying drawings in which:
Figure 1 depicts primer information. The first column of numbers indicates the cycle number assigned to a given query. The second and third columns indicate the sequencing primer used, and the fourth column indicates the conditions of hybridization. The fifth column indicates the base(s) used to extend, and the 6th column indicates the templates expected to add. The remaining columns indicate the best-fit slope coefficient for adders and non-adders, and finally the ratio of these values. TR = Texas Red.
Figure 2 depicts an extension with 37C.8N.CG, sequencing bases 10,11,12 on T4. Blue indicates bases that were sequenced; yellow indicates bases attempted and failed; uncolored indicates bases that were not attempted.
Figure 3 depicts sequencing on emulsion beads.
Figure 4 depicts primer information for primers that extended either T2, T3 or T4.
Figure 5 depicts bases that were sequenced. Blue indicates bases that were sequenced; yellow indicates bases attempted and failed; uncolored indicates bases that were not attempted.
Figure 6 depicts sequencing on emulsion beads.
Figure 7 is a schematic depicting query of tag positions (-5) by mismatch ligation.
Figures 8A and 8B are schematics depicting unique tags and queries that will ligate.
Figures 9A and 9B are schematics of the method of the present invention.
Figure 10 is a four color depiction of four possible base calls.
Figure 11 is a graph showing variation in accuracy over each of 26 cycles of non-progressive sequencing.
DETAILED DESCRIPTION
In the methods described herein, DNA sequences of numerous features are obtained in parallel by cycles of hybridization of sequencing primers that contain universal, degenerate, and/or specific bases at positions of unknown sequence, followed by single-base-extension with polymerase and nucleotide. As polymerases generally only extend from terminally-matched nucleotides, when an extension occurs, the identity of the bases complementary to specific bases present at the 3' terminus of a given sequencing primer is revealed. Furthermore, use of modified nucleotides with different fluorescent labels reveals the identity of the incorporated nucleotide. As a given sequencing primer is designed with a known number of universal or degenerate nucleotides, and a known number of specific nucleotides, one knows the specific position within the unknown template that one is sequencing.
The methods of the invention include the use of "degenerate bases" which are intended to include, but are not limited to, primer mixes that contain all possible sequences at unknown positions. The methods of the invention also include the use of universal bases at some or all of the primer positions. "Universal bases" are intended to include, but are not limited to, synthetic nucleotide analogs that ideally pair with equal affinities to each of the natural nucleotides, and are readily accepted as substrates by natural enzymes. Examples of universal bases include 5-nitroindole, 3-nitropyrrole, deoxyinosine, and the like. The methods of the invention further include the use of natural bases, wherein sequencing primer oligonucleotides are synthesized with fully degenerate positions, such that all possible sequencing primers (or some random subset of all possibilities) are present during hybridization. Without intending to be bound by theory, overall efficiency could be improved by enzyme engineering for greater permissiveness with respect to mismatches (e.g., the M1/M4 variants of Taq) or alterations to the primer design strategy.
In one embodiment, methods of the invention are directed to fixing the terminal two bases of a given sequencing primer, but allowing the remainder of bases at "universal" positions to be synthesized with fully-degenerate natural bases. The disadvantage of this compromise is that 16 separate hybridizations are required for each "reach" length (4^2 combinations of the two terminal bases). This is mitigated by the fact that polymerases don't extend off of mispaired termini very well, so a given extension set reveals the identity of both the two terminal bases and the extended base. The average efficiency of the process is therefore 3/16 = 0.188 bases per cycle.
Non-terminator FISSEQ, by comparison, yields approximately 0.50 bases-per-cycle (assuming no homopolymer resolution and thus counting multi-base runs as single extensions). By this consideration, achieving an identical read-length would require approximately 2.67 times as many cycles in the 2 bp-matched-wobble-sequencing system.
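The arithmetic behind this comparison can be checked with a short sketch. The variable names are illustrative only and are not part of the method; the 0.50 bases-per-cycle figure for non-terminator FISSEQ is simply taken from the text above.

```python
from fractions import Fraction

# Two fixed terminal bases give 4**2 primer pools per "reach" length.
FIXED_TERMINAL_BASES = 2
pools_per_reach = 4 ** FIXED_TERMINAL_BASES          # 16 hybridizations

# For a given template, the one pool that extends reveals the two
# terminal bases plus the single extended base.
bases_revealed = FIXED_TERMINAL_BASES + 1            # 3 bases

wobble_rate = Fraction(bases_revealed, pools_per_reach)  # bases per cycle
fisseq_rate = Fraction(1, 2)                             # figure quoted in the text

# Cycles needed by the 2 bp-matched scheme relative to FISSEQ
# for the same read-length.
cycle_ratio = fisseq_rate / wobble_rate

print(f"wobble efficiency: {float(wobble_rate):.3f} bases per cycle")
print(f"relative cycle count: {float(cycle_ratio):.2f}x")
```

The ratio works out to 8/3, which is the approximately 2.67-fold figure stated above.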
This invention is further illustrated by the following examples, which should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are hereby incorporated by reference in their entirety for all purposes.
EXAMPLE I Cycle Protocol
Typical cycles were as follows:
1. Hybridize sequencing primer (15 minutes, 10 μM primer in 6x SSPE, 40-50 °C)
2. Extend (4 minutes, SSB + polymerase + nucleotide)
3. Wash (2 minutes)
4. Image acquisition
5. Strip primer (5 minutes, Wash IE, 70 °C)
If the wobble-bases were fixed (poly-A, poly-G, poly-C, or poly-T instead of poly-N), extensions were no longer efficient. Without intending to be bound by theory, this indicates that some degree of "sorting" is going on during the hybridization that is critical to the overall process working. Hoping for this to occur, the "anchor sequence" is purposefully short (Tm = 37 °C if it were alone), weighting the hybridization process to depend to a greater degree on the "wobble" or degenerate sequences. Initial data indicated that SEQUENASE™ was significantly better than Klenow for this approach. Primer-stripping was initially very inefficient with beads. It only started working when the bead array was fabricated such that the beads were embedded in the gel near the gel-liquid interface (opposite the glass surface, or "top-layered").
EXAMPLE II Primer Nomenclature
A typical primer-name below is "37C.2N.CA". For the primers described herein, the anchor sequence is a trimmed version of the original FISSEQ primer for the T1..T5 template. The "37C" (or "23C" or the like) indicates the extent to which it has been trimmed (i.e. 37C is the Tm of the anchor sequence if it were a stand-alone primer). The "2N" indicates that the anchor-sequence is followed by two full "wobble" or degenerate bases, and the CA indicates the fixed two terminal bases. This primer would extend to the 5th base, thus sequencing 3 bases (base 3, 4 and 5) on 1/16th of the templates of a random library.
In the examples below, primers with even numbers of "wobble" or degenerate bases and terminal bases that match at least one of the five T1..T5 templates were focused on to ensure extension at every cycle. For a given "reach-length," this was approximately 1/4th of the primers that would be required in a real sequencing experiment involving sequencing of genomic fragments. However, this estimate is slightly conservative in that one could do multiples of three for the number of "wobble" or degenerate bases, rather than multiples of two. Some optional redundancy was built in. For example, 37C.2N.XX sequences bases 3, 4 and 5. 37C.4N.XX sequences bases 5, 6 and 7. Thus, base 5 was sequenced twice (as is base 7, base 9, etc.).
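The naming convention can be made concrete with a small parsing sketch. The class and function names here are hypothetical, and the position rule simply follows the pattern stated in this example (2N gives bases 3, 4 and 5; 4N gives bases 5, 6 and 7); it is an illustration of the nomenclature rather than software used in the experiments.

```python
import re
from typing import List, NamedTuple


class WobblePrimer(NamedTuple):
    anchor_tm_c: int   # Tm of the trimmed anchor alone, in degrees C
    n_wobble: int      # number of degenerate ("N") positions after the anchor
    terminal: str      # the two fixed 3'-terminal bases ("XX" = any pair)


# Names look like "37C.2N.CA": <Tm>C.<wobble count>N.<two terminal bases>
_NAME = re.compile(r"^(\d+)C\.(\d+)N\.([ACGTX]{2})$")


def parse_primer(name: str) -> WobblePrimer:
    """Split a primer name into anchor Tm, wobble count and terminal bases."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return WobblePrimer(int(m.group(1)), int(m.group(2)), m.group(3))


def bases_sequenced(primer: WobblePrimer) -> List[int]:
    """Positions read by one primer: two terminal bases plus the extended base."""
    first_terminal = primer.n_wobble + 1
    return [first_terminal, first_terminal + 1, first_terminal + 2]


for name in ("37C.2N.CA", "37C.4N.XX"):
    print(name, bases_sequenced(parse_primer(name)))
```

Stepping the wobble count by two makes consecutive primers overlap by one position, which is the optional redundancy described above.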
EXAMPLE III Proof of Principle on Loaded Beads
Figure 1 depicts results from top-layered, 1 μm beads with loaded T1..T5 templates. These are primers that would be required in a full sequencing experiment on unknown sequence. Primers were ordered to sequence through to the 11th base on all five templates (37C.0N.XX through 37C.8N.XX). Only one primer was ordered for 37C.10N.XX through 37C.18N.XX.
Failures are listed in yellow. Without intending to be bound by theory, the first failure (cycle 17), was likely due to manual error in preparing the extension reagent mix, as its repeat (cycle 24) was successful, and this primer worked well in the emulsion-bead experiment below. Without intending to be bound by theory, the remaining failures correlate with attempts at longer reads. The 37C.12N.CG primer, interestingly, works quite well for one template but not another. In a subsequent experiment, using SEQUENASE™ instead of Klenow resulted in both templates working with this primer. SEQUENASE™ also yields greater signal in general than Klenow in this protocol.
Without intending to be bound by theory, several trends emerged: a) there was poor performance of "G" extensions, which was improved using SEQUENASE™; and b) poor performance of the T5 template in terms of signal yield at any given cycle when it was expected to extend. This outcome may be explained by the shortening of the anchor of the sequencing primer.
Approximately 11-base-pair reads were obtained from all five templates, and all observations appear consistent. A 15-bp read was obtained on one of the templates (T4), but results were not consistent (i.e. cycle 28) and failure was experienced beyond base 15 (cycles 29-31). Extension was performed with 37C.8N.CG, sequencing bases 10,11,12 on T4 (Figure 2).
Since the above worked so well, the experiment was repeated on emulsion-generated beads top-layered (Figure 3). The templates were diluted independently, only mixing them as they went into the emulsion mix. The reason for this is that they are single-stranded, and this procedure minimizes their binding to one another, which confounds results. However, the ratios of the five templates deviated from 1:1. The initial set of primers used on these templates were the 37C.0N.XX series, which essentially establishes the identity of each bead. As the fraction of beads with 1 or more template was high, it was not surprising that a high fraction of non-clonal beads was observed. Only approximately 1% of the gel (25 frames) was imaged at each cycle. The overall numbers were as follows: no template, 29,658; weakly amplified, 10,164; strong clonal, 13,350; T1 = 57; T2 = 8,945; T3 = 2,165; T4 = 1,834; T5 = 349; strong non-clonal, 7,668; and total, 60,840.
The numbers are generally consistent with what one would expect from Poisson statistics, but with a modest excess of non-clonal beads. Without intending to be bound by theory, these data indicate that some fraction of the "no template" beads actually don't participate in the distribution (e.g., they are excluded because they are in the oil compartment, or in a compartment that is too small to initiate PCR and the like).
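One way to make the Poisson comparison explicit is sketched below. It rests on assumptions that are not in the original analysis: the "no template" beads are treated as the Poisson zero class, the weakly amplified beads are set aside, and any bead seeded by two or more molecules is counted as potentially non-clonal (an overestimate, since two copies of the same template would still appear clonal). The bead counts are those reported above.

```python
import math

# Bead counts reported for the emulsion experiment.
counts = {
    "no template": 29_658,
    "weakly amplified": 10_164,
    "strong clonal": 13_350,
    "strong non-clonal": 7_668,
}
clonal_by_template = {"T1": 57, "T2": 8_945, "T3": 2_165, "T4": 1_834, "T5": 349}

total = sum(counts.values())
clonal_total = sum(clonal_by_template.values())

# Assumption: empty beads are the zero class of a Poisson distribution.
p0 = counts["no template"] / total
lam = -math.log(p0)                 # implied mean templates per bead
p1 = lam * math.exp(-lam)           # exactly one template
p_multi = 1.0 - p0 - p1             # two or more templates

# Share of loaded beads expected to carry more than one molecule,
# versus the observed non-clonal share among strongly amplified beads.
expected_multi = p_multi / (1.0 - p0)
strong = counts["strong clonal"] + counts["strong non-clonal"]
observed_nonclonal = counts["strong non-clonal"] / strong

print(f"total beads: {total}, clonal beads by template: {clonal_total}")
print(f"implied mean templates per bead: {lam:.3f}")
print(f"expected multi-template share: {expected_multi:.3f}")
print(f"observed non-clonal share:     {observed_nonclonal:.3f}")
```

Under these assumptions the observed non-clonal share comes out somewhat above the simple Poisson expectation, in line with the modest excess noted above; a different treatment of the weakly amplified beads would shift both figures.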
EXAMPLE IV Primers That Extended Either T2, T3, or T4
The initial analysis of clonality and identity, which were based on the 37C.0N.XX primers, led to the focus on primers that extended either T2, T3, or T4, as these dominated the slide (Figures 4 and 6). Relative to the above there are also changes to the hybridization conditions and modified nucleotides, but the most important difference (other than the fact that these are emulsion-generated beads) was that SEQUENASE™ was utilized instead of Klenow. Extension was performed with 37C.8N.CG, sequencing bases 10,11,12 on T4 using emulsion-beads instead of loaded beads (Figure 5).
On cycle 19/20 (Figure 4), stripping was performed before reading the Cy3 signal out. Interestingly, less than 30 seconds in Wash IE at 70 °C was sufficient for stripping, or at least for redistribution of signal amongst the beads. Thus, cycles 22 and 23 were repeated with 37C.12N.CG.
What worked and what didn't work was based on visual inspection of the graphs. Thus, without intending to be bound by theory, even though 37C.12N.CG->T had lower "ratios" than 37C.14N.AT->C, it still appears to have worked, whereas 37C.14N.AT->C appeared not to have worked.
The slide was stripped and sequencing primer was re-annealed at the conclusion to determine to what extent the templates had fallen off due to heat exposure and the like. The difference between the two sets of images (pre-sequencing and post-sequencing) was negligible. The two sets of images were strikingly consistent with one another, which indicated that template was not being lost over the course of the experiment. This inspection also demonstrated quite clearly that the extent of gel warping over the approximately 20 cycles was negligible. Good signal was obtained for nearly all of the cycles.
An additional experiment was performed using the same primer, 37C.8N.CG, sequencing bases 10,11,12 on T4 (except with emulsion beads instead of loaded beads, and showing only well-amplified, clonal beads). The signal on these beads was higher than the loaded beads. Without intending to be bound by theory, reasons for this include: a) more template on amplified beads; and (b) the switch to SEQUENASE™ from Klenow.
EXAMPLE V Wobble Ligation Method
The following describes an embodiment of the invention referred to as "Wobble Ligation." Several of the principles are identical or similar to Wobble Extension as previously described herein. These principles are distinguishable from FISSEQ and other sequencing methods, such as that described in Macevicz US Patent No. 5,750,341.
According to the Wobble Ligation embodiment described herein:
(a) At each step of the sequencing, a single base position in the unknown sequence is being queried.
(b) Which base is being queried is directly a function of the structure of the oligonucleotides used in the reaction.
(c) After each cycle of enzymatic treatment and imaging, these oligonucleotides are stripped from the DNA attached to the beads; the method is thus non-progressive, in that any given cycle is not dependent on the efficiency of previous cycles.
There are several differences between Wobble Extension and Wobble Ligation:
(a) Ligases, rather than polymerases, are used as the discriminatory enzyme.
(b) In Wobble Extension, a single primer is hybridized and extended; degenerate bases within the oligonucleotide primer are included to 'reach' a specific distance into the unknown sequence. In Wobble Ligation, a single primer is hybridized that is universal (the 'anchor' primer) and sits such that either its 5' or 3' end is immediately adjacent to the unknown sequence. The position to be queried is encoded in a pool of degenerate nonamers (9-mers) that are ligated to the anchor primer. However, anchor primers having one or several degenerate positions at the terminus to be ligated to can serve as substrates for ligation and so can be used to position the query even further into the unknown sequence.
(c) The assays are always identical, in that the full pool of possible nonamers is being ligated to the anchor primer. What changes between the assays (and determines whether one is sequencing base 4 or base 7 in a particular cycle, for example) is the correlation between specific positions in the degenerate nonamer and fluorescent labels at its end. Figure 7 depicts, for example, the querying of position (-4) relative to the anchor primer.
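The way a query position is encoded in the nonamer pool can be sketched as follows. The base-to-dye assignment, the function names and the choice to count positions from the ligation junction are assumptions made for illustration; only the general scheme (one position fixed per label, all other positions degenerate) comes from the description above.

```python
from typing import Dict, Tuple

NONAMER_LENGTH = 9

# Hypothetical base-to-dye assignment, for illustration only.
DYE_FOR_BASE: Dict[str, str] = {
    "A": "Cy5",
    "C": "Cy3",
    "G": "Texas Red",
    "T": "FITC",
}

# The nonamer base pairs with its complement on the template strand.
COMPLEMENT: Dict[str, str] = {"A": "T", "C": "G", "G": "C", "T": "A"}


def query_subpools(query_position: int) -> Dict[str, Tuple[str, str]]:
    """Return {fixed base: (nonamer pattern, dye)} for one query position.

    query_position is 1-based, counted from the end of the nonamer that
    ligates to the anchor primer (an assumed convention). 'N' marks a fully
    degenerate position.
    """
    if not 1 <= query_position <= NONAMER_LENGTH:
        raise ValueError("query position must lie within the nonamer")
    pools = {}
    for base, dye in DYE_FOR_BASE.items():
        pattern = ["N"] * NONAMER_LENGTH
        pattern[query_position - 1] = base
        pools[base] = ("".join(pattern), dye)
    return pools


def template_base_from_dye(dye: str) -> str:
    """Infer the template base from the dye observed on a bead."""
    for base, assigned in DYE_FOR_BASE.items():
        if assigned == dye:
            return COMPLEMENT[base]
    raise KeyError(dye)


for base, (pattern, dye) in query_subpools(5).items():
    print(f"{pattern}  labeled with {dye}")
```

Each sub-pool still contains every sequence at the eight degenerate positions (4^8 = 65,536 species), so the ligation itself is the same in every cycle; only the position tied to the label changes.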
EXAMPLE VI Ultra Low-Error PCR colonies
There is generally a high error rate for any pre-sequencing amplification method which starts from single templates and employs exponential amplification, including PCR, emulsion PCR, bead emulsion PCR, in situ polonies, digital PCR, bridge PCR, multiple displacement amplification (MDA) and the like. Such methods are described in C. P. Adams, S. J. Kron. (U.S. Patent 5,641,658, Mosaic Technologies, Inc.; Whitehead Institute for Biomedical Research, USA, 1997); D. Dressman, H. Yan, G. Traverso, K. W. Kinzler, B. Vogelstein, Proc. Natl. Acad. Sci. USA, 100, 8817 (July 22, 2003); D. S. Tawfik, A. D. Griffiths, Nat. Biotechnol., 16, 652 (July, 1998); F. J. Ghadessy, J. L. Ong, P. Holliger, Proc. Natl. Acad. Sci. USA, 98, 4552 (April 10, 2001); M. Nakano et al., J. Biotechnol., 102, 117 (April 24, 2003); R. D. Mitra, G. M. Church, Nucleic Acids Res., 27, e34 (Dec 15, 1999); and F. B. Dean et al., Proc. Natl. Acad. Sci. USA, 99, 5261 (April 16, 2002), each of which is hereby incorporated by reference.
Such error establishes an upper limit on the accuracy of any sequencing method which operates on material that is the product of the amplification. For example, during bead emulsion PCR, template is diluted to the point where 1 template molecule and 1 bead will be trapped in an emulsion compartment, and PCR will proceed from this single molecule resulting in many copies bound to the bead. An error arising early during the amplification will result in a bead having either a homogenous population of amplicons bearing the error, or a heterogenous population of amplicons, some bearing the error and some not. In either case, the accuracy of the sequence derived from such a bead will be low.
According to embodiments of the present invention, emulsion PCR will be started with multiple copies of a given template molecule in a compartment. Then, PCR will initiate from each copy independently, and the product bound to the bead in that compartment will be largely homogenous and error-free, even if errors arise early during amplification from 1 of the copies of the template.
To achieve this goal, two techniques are useful. The first is to clone the template desired to be sequenced into a plasmid, transform into bacteria or yeast, and perform emulsion PCR not with naked single-copy template DNA, but rather with individual cells, each of which includes multiple copies of the template. During PCR the cells will rupture and amplification will proceed from each copy of the plasmid present. Since multiple copies of the template were present, and since each was copied independently by the host cell's low-error replication machinery, the probability of obtaining a PCR-based error in a preponderance of amplicons is very low.
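The benefit of starting from several independent template copies can be illustrated with a small calculation. The sketch below is not part of the described method: it assumes a hypothetical per-position probability `p_error` that any one founder copy acquires an error in its first round of copying, assumes the founder copies are independent (as the text requires), and counts any error at that position as "the same error", which makes the result an upper bound.

```python
from math import comb


def prob_majority_error(n_copies: int, p_error: float) -> float:
    """Upper bound on the chance that more than half of n independent
    founder copies carry an error at one given position.

    p_error is an assumed, illustrative per-copy error probability;
    it is not a value taken from the source.
    """
    threshold = n_copies // 2 + 1  # smallest count that is a strict majority
    return sum(
        comb(n_copies, k) * p_error**k * (1 - p_error) ** (n_copies - k)
        for k in range(threshold, n_copies + 1)
    )


# Compare a single-template compartment with multi-copy compartments.
for n in (1, 3, 10):
    print(n, prob_majority_error(n, 1e-3))
```

With a single founder copy the bound equals `p_error` itself; with three or more independent copies it falls at least quadratically in `p_error`, which is the intuition behind loading multiple copies per compartment.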
The second approach uses linear rolling circle amplification to prepare template molecules which are linear concatemers of independent copies of the original template. PCR then initiates from each site on the concatemer independently. The important constraint (regardless of the method used to get multiple copies of a template into an emulsion compartment or otherwise to initiate a spatially-clustered exponential amplification) is that the initial copies made of the original template are independent of each other and so the probability of two such copies bearing the same error is very low. With a linear rolling circle amplification, the original template (a circular molecule) is iterated over many times, such that all copies are copies of the original template (unlike PCR, which makes copies of copies).
EXAMPLE VII Ligase-Driven DNA Molecular Ruler
Embodiments of the present invention are directed to methods to determine, with single-base resolution, the length of the unique region of a library molecule. To perform polony sequencing, a paired-tag genomic library is constructed where each library molecule is comprised of a unique region flanked by common primer sites. In order to generate a library where all inserts are short and of strictly defined length (which is important for signal homogeneity when using emulsion PCR to load the templates to sequencing beads), the type IIs restriction enzyme MmeI is used. MmeI cuts either 17bp or 18bp from its recognition sequence, and in the embodiment described here thus produces inserts of 17bp or 18bp at a ratio of about 50:50 with little to no sequence-dependence. Knowing the exact length of each insert is advantageous since sequencing methods described herein include the step of reading a certain number of bases from each side of the 17-18bp tag. In order to generate a contiguous sequence from such reads, knowing the exact length of the insert would be beneficial.
According to this embodiment, a ligation-query scheme is used which relies on the specificity of the ligase reaction catalyzed by Ampligase, or some other ligase capable of yielding sufficient base-pairing specificity, to first 'walk' across the insert with fully degenerate nonamers and then query the identity of a base in the opposing universal primer sequence. An 'anchor' primer complementary to sequence in universal primer A is first hybridized; degenerate nonamer ligation is then performed to span the unique insert; and finally the length of the insert is queried with a pair of fluorescently-labeled query primers, where each possible length (17 or 18) is coded by a different fluorophore, as depicted in Figures 8A and 8B.
EXAMPLE VIII
An additional embodiment of the present invention is described in the following method.
1. Hybridize 5'-phosphorylated, deoxyuridine-containing anchor-primer to target sequence
Anchor primer: 3'-AGAGUCUACUCA-/5'Phos/
Target: 5'-TCTCAGATGAGT???????????????...
2. Perform a base-query by ligating to this, with T4 DNA ligase, fully degenerate nonamers, where an internal base correlates with the identity of one of four fluorophores (four color nonamers) as illustrated in Figure 7.
3. Collect data by four-color imaging or some other means.
4. To remove the primer:degenerate-sequence:fluorophore complex before beginning the next cycle, treat with both Endonuclease 8 and E. coli Uracil-DNA Glycosylase ("UDG"). The UDG will cleave the uracils in the anchor primer, leaving abasic sites that will be cleaved by Endonuclease 8, leaving short fragments with low Tm's that will melt off the immobilized DNA strands at ambient temperatures. Heat, chemical denaturants, or other chemically or enzymatically labile bonds in the anchor primer could also be used in place of deoxyuridines to remove the primer:degenerate-sequence:fluorophore complex.
This embodiment can be carried out in the 5'->3' direction by using a degenerate nonamer population that is phosphorylated at the 5' end (such that that end will ligate to the anchor primer), and the fluorophore resides on its 3' end.
A kit including endonuclease 8 and UDG is commercially available from New England Biolabs under the tradename USER. A schematic of a sample UDG reaction is provided in the figure below.
[Figure: Base Excision Operates Where a Single Damaged Base Occurs (uracil deglycosylase step)]
EXAMPLE IX
Non-Progressive Cycling as Described in Example V
Certain polymerase- and ligase-driven cyclic sequencing methods are termed "progressive," in that they interrogate the sequencing template by incorporating onto the end of a growing polynucleotide chain, digesting from the end of the template, or ligating to a growing oligonucleotide primer. See, for example, I. Braslavsky, B. Hebert, E. Kartalov, S. R. Quake, Proc. Natl. Acad. Sci. USA, 100, 3960 (April 1, 2003); R. D. Mitra, J. Shendure, J. Olejnik, O. Edyta Krzymanska, G. M. Church, Anal. Biochem., 320, 55 (Sep 1, 2003); M. Ronaghi, S. Karamohamed, B. Pettersson, M. Uhlen, P. Nyren, Anal. Biochem., 242, 84 (Nov 1, 1996); S. C. C. Macevicz. (U.S. Patent 5,750,341, Lynx Therapeutics, Inc., USA, 1998), and S. Brenner et al., Nat. Biotechnol., 18:630 (Jun, 2000), each of which is hereby incorporated by reference. These "progressive" methods, however, are disadvantageous in that they exhibit amplicon dephasing, which results in decreased sequencing fidelity as the number of bases sequenced into the template increases.
The non-progressive cycling method of the present invention reduces, or in certain embodiments, eliminates, the adverse effects of amplicon dephasing in existing sequencing by synthesis methods (both polymerase- and ligase-driven) by removing the sequencing primer periodically (as often as after each base-position is interrogated). Thus, enzymatic and chemical inefficiencies and other errors do not accumulate as the sequencing run proceeds. Rather, each cycle is independent of previous inefficiencies or misincorporations (assuming the primer is removed after each sequencing cycle). The non-progressive cycling method of the present invention has the added advantage of allowing one to know, with reasonable certainty, which position in the template is being interrogated. This advantageously allows one to resolve homopolymers since the interrogation event has been de-coupled from the positioning event. Furthermore, it allows one to sequence a template out-of-order, rather than requiring one to sequentially query positions 5' to 3' or 3' to 5'.
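The contrast between progressive and non-progressive cycling can be made concrete with a deliberately simplified model. The sketch below assumes a single hypothetical per-cycle `efficiency` (the fraction of strands on a bead that complete a cycle correctly) and ignores every other error source; the function names and the numbers are illustrative only and do not come from the source.

```python
def progressive_in_phase(efficiency: float, cycle: int) -> float:
    """Fraction of strands still in phase at a given cycle when every
    cycle builds on the product of all previous cycles."""
    return efficiency ** cycle


def non_progressive_in_phase(efficiency: float, cycle: int) -> float:
    """Fraction of strands giving a correct signal when the primer is
    stripped and re-hybridized before each cycle, so that earlier
    inefficiencies do not carry over."""
    return efficiency


# Assumed 99% per-cycle efficiency over a 26-cycle run.
for cycle in (1, 13, 26):
    print(
        cycle,
        round(progressive_in_phase(0.99, cycle), 3),
        round(non_progressive_in_phase(0.99, cycle), 3),
    )
```

Under this model the progressive signal decays geometrically with cycle number, while the non-progressive signal stays at the single-cycle level, which is the behavior the paragraph above describes.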
According to the non-progressive cycling method of the present invention, the primer can be removed in a number of ways. Heat can be used to melt the primer off the template. Alkali can be used to chemically denature the primer from the template. Numerous other chemical denaturants can be used, which include: methanol, ethanol, isopropanol, n-propanol, allyl alcohol, sec-butyl alcohol, tert-butyl alcohol, isobutyl alcohol, n-butyl alcohol, tert-amyl alcohol, ethylene glycol, glycerol, dithioglycerol, propylene glycol, cyclohexyl alcohol, benzyl alcohol, inositol, phenol, p-methoxyphenol, aniline, pyridine, purine, 1,4-dioxane, gamma-butyrolactone, 3-aminotriazole, formamide, N-ethyl formamide, N,N-dimethylformamide, acetamide, N-ethyl acetamide, N,N-dimethyl acetamide, propionamide, butyramide, hexamide, glycolamide, thioacetamide, delta-valerolactam, urethan, N-methyl urethan, N-propylurethan, cyanoguanidine, sulfamide, glycine, acetonitrile, urea, Tween 40, Triton X-100, sodium trichloroacetate, sodium perchlorate, lithium bromide, cesium chloride, lithium chloride, potassium thiocyanate, sodium trifluoroacetate, sodium dodecyl sulfate, salicylate, dimethylsulfoxide, dioxane, and the like. Suitable denaturation methods are described in L. Levine, J. A. Gordon, W. P. Jencks, Biochem. 2:168 (Jan 1963); and J. Shendure et al., Science (published online Aug. 4, 2005).
Chemically-labile linkages, such as phosphorothioate with heavy-metal ion cleavage treatment as described in M. Mag, S. Luking, J. W. Engels, Nucleic Acids Res., 19:1437 (April 11, 1991) can be included in the primer to allow it to be fragmented into many pieces, each of which has a Tm low enough to cause the primer:query complex to denature from the template. Primers can be made enzymatically-labile by the inclusion of ribonucleotides or ribonucleotide stretches (susceptible to cleavage by RNase H or alkali) or the inclusion of deoxyuridines (subject to cleavage by a mixture of uracil DNA glycosylase and endonuclease VIII) or abasic sites (subject to cleavage by endonuclease VIII). The primer can also be removed enzymatically by the use of a suitable exonuclease.
Non-Progressive Sequencing By Ligation Using Deoxyuridine Stripping
According to one aspect of the present invention, the following steps were carried out cyclically to interrogate each base of the template sequentially. An 'anchor primer' was hybridized complementary to common library sequence. A pool of fluorescently-labeled 'query primers' specific to one tag-position was then ligated to the template. Imaging was then used to determine which primer pool ligated to which bead. The anchor: :query primer complex was then stripped. The process was then repeated.
Anchor primers used had the following sequences (U = deoxyuridine):
• T30UIA 5'-GGGCCGUACGUCCAACT-3'
• T30UIB 5'-CGCCUUGGCCUCCGACT-3'
• PRlUION 5'-CCCGGGUUCCUCAUUCUCT-3'
• LIGFIXDD 5'-Phos/AUCACCGACUGCCCA-3'
• LIGFIXD2T30A 5'-Phos/AGUUGGAGGUACGGC-3'
• LIGFIXD2T30B 5'-Phos/AGUCGGAGGCCAAGC-3'
Query primers used were nonamers which were degenerate at all positions except the query position. At the query position, only one base was present for a given fluorophore. For example, the pool of probes used to query position five was composed of the following four label-subpools:
• Cy54NA 5'-Phos/NNNNANNNN/Cy5-3'
• Cy34NG 5'-Phos/NNNNGNNNN/Cy3-3'
• TexasRed4NC 5'-Phos/NNNNCNNNN/TR-3'
• FRET4NT 5'-Phos/NNNNTNNNN/FRET-3'
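The structure of these label-subpools generalizes to any query position. The following sketch builds the four subpool patterns for a chosen position in a nonamer; the base-to-label mapping is copied from the list above, while the function name `query_pool` and the convention of counting positions from the 5'-phosphorylated end are assumptions made for illustration.

```python
# Base-to-label mapping as given in the position-five example above.
LABEL_FOR_BASE = {"A": "Cy5", "G": "Cy3", "C": "TR", "T": "FRET"}


def query_pool(position: int, length: int = 9) -> dict[str, str]:
    """Return {label: probe pattern} for the four subpools that query
    one position (1-based, counted from the 5' end) of a degenerate
    oligonucleotide of the given length."""
    if not 1 <= position <= length:
        raise ValueError("position must lie within the oligonucleotide")
    pool = {}
    for base, label in LABEL_FOR_BASE.items():
        # Degenerate (N) everywhere except the single query position.
        pattern = "N" * (position - 1) + base + "N" * (length - position)
        pool[label] = f"5'-Phos/{pattern}/{label}-3'"
    return pool


for label, probe in query_pool(5).items():
    print(label, probe)
```

Calling `query_pool(5)` reproduces the four patterns listed above; other positions simply move the fixed base along the nonamer while the ligation chemistry stays the same.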
Anchor primers were hybridized in a flowcell (100uM primer in 6x SSPE) for 5 minutes at 56C, then cooled to 42C and held for 2 minutes. Excess primer was then washed out at room temperature with Wash IE (10mM Tris-HCl pH 7.5, 50mM KCl, 2mM EDTA pH 8.0, 0.01% Triton X-100) for 2 minutes.
Query primers were ligated in the flowcell (8uM query primer mix (2uM each subpool), 6000U T4 DNA ligase (NEB), 1x T4 DNA ligase buffer (NEB)) at 35C and held for 30 minutes. At the end of the reaction, excess query primer was washed out at room temperature with Wash IE for 5 minutes.
Four-color imaging was performed on an epifluorescence microscope with filters appropriate to the fluorophores attached to the nonamers.
Anchor:: query primer complex was stripped with USER (NEB), a combination of uracil DNA glycosylase and endonuclease VIII. To perform the stripping reaction, the following protocol was executed in the flowcell:
• Incubate 150 ul stripping mix (3 ul USER (NEB), 150 ul TE) for 5 minutes at 37C
• Raise temperature to 56C and hold 1 minute
• Wash for 1 minute with Wash IE; temperature gradually decreases
• Incubate 150 ul fresh stripping mix for 5 minutes at 37C
• Wash for 5 minutes with Wash IE; temperature gradually decreases
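To see why cleavage at deoxyuridines helps the complex melt off, one can split an anchor primer at its U positions and compare rough melting temperatures. The sketch below uses two of the anchor sequences listed earlier and the Wallace rule of thumb (2 degrees per A/T/U, 4 degrees per G/C), which is a swapped-in approximation and not a calculation from the source; it is only indicative for short oligonucleotides. It also models the anchor portion alone, whereas in the flowcell the fragment next to the ligation junction remains joined to the nonamer.

```python
# Two anchor primers from the list above (U = deoxyuridine).
ANCHOR_PRIMERS = {
    "T30UIA": "GGGCCGUACGUCCAACT",
    "LIGFIXDD": "AUCACCGACUGCCCA",
}


def fragments_after_stripping(primer: str) -> list[str]:
    """Pieces left once each uracil is excised and the backbone is cut
    at the resulting abasic site (simplified: U positions are removed)."""
    return [piece for piece in primer.split("U") if piece]


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in degrees C: 2 per A/T/U plus 4 per G/C."""
    weak = sum(seq.count(base) for base in "ATU")
    strong = sum(seq.count(base) for base in "GC")
    return 2 * weak + 4 * strong


for name, seq in ANCHOR_PRIMERS.items():
    pieces = fragments_after_stripping(seq)
    print(name, wallace_tm(seq), [(p, wallace_tm(p)) for p in pieces])
```

Under this approximation every fragment has a much lower estimated Tm than the intact primer, consistent with the statement that the short fragments melt off the immobilized strands.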
With reference to Figure 9A, the cycles consist of the following four steps: (a) hybridization of one of four anchor primers, (b) ligation of fluorescent, degenerate nonamers, (c) four color imaging on epifluorescence microscope, (d) stripping of the anchor primer:nonamer complexes prior to beginning the next cycle. The anchor primers are each designed to be complementary to universal sequence immediately 5' or 3' to one of the two tags. A1, A2, A3 and A4 indicate the four locations to which anchor primers are targeted relative to the amplicon. Arrows indicate the direction sequenced into the tag from each anchor primer. From anchor primers A1 and A3, 7 bases are sequenced into each tag, and from anchor primers A2 and A4, 6 bases are sequenced into each tag. Thus, 13 bp per tag are obtained, and 26 bp per amplicon, with 4 to 5 bp gaps within each tag sequence.
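The read-length arithmetic in this paragraph can be checked with a short sketch. The per-anchor base counts are taken from the text; the pairing of A1 with A2 on one tag and A3 with A4 on the other is inferred from the "13 bp per tag" statement, and the helper names and the "." gap character are hypothetical.

```python
# Bases read into the tag from each anchor primer (from the text).
BASES_READ = {"A1": 7, "A2": 6, "A3": 7, "A4": 6}
# Assumed pairing: two anchors flank each tag, one on either side.
ANCHORS_PER_TAG = {"tag 1": ("A1", "A2"), "tag 2": ("A3", "A4")}


def bases_per_tag() -> dict[str, int]:
    """Total bases read per tag across its two flanking anchors."""
    return {
        tag: sum(BASES_READ[a] for a in anchors)
        for tag, anchors in ANCHORS_PER_TAG.items()
    }


def total_cycles() -> int:
    """One ligation cycle per interrogated base position."""
    return sum(BASES_READ.values())


def gapped_tag(read_a: str, read_b: str, tag_length: int) -> str:
    """Join the reads from both ends of a tag, padding the unread
    middle with '.' so the result has the known tag length."""
    gap = tag_length - len(read_a) - len(read_b)
    if gap < 0:
        raise ValueError("reads are longer than the tag")
    return read_a + "." * gap + read_b


print(bases_per_tag(), total_cycles())
print(gapped_tag("ACGTACG", "TTGCAA", 17))
print(gapped_tag("ACGTACG", "TTGCAA", 18))
```

A 17 bp tag leaves a 4 bp gap and an 18 bp tag leaves a 5 bp gap, which is why knowing the exact insert length matters when placing the two reads relative to each other.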
With reference to Figure 9B, each cycle involves performing a ligation reaction with T4 DNA ligase and a fully degenerate population of nonamers. The nonamer molecules are individually labeled with one of four fluorophores (e.g., Texas Red, Cy5, Cy3, FITC). Depending on which position a given cycle is aiming to interrogate, the nonamers are structured differently. Specifically, a single position within each nonamer is correlated with the identity of the fluorophore with which it is labeled. Additionally, the fluorophore molecule is attached at the opposite end of the nonamer relative to the end targeted to the ligation junction. For example, in Figure 9B, the anchor primer is hybridized such that its 3' end is adjacent to the genomic tag. To query a position five bases in to the tag sequence, the four-color population of nonamers is used.
Referring to Figure 10, four-color data from each cycle can be visualized in tetrahedral space, where each point represents a single bead, and the four clusters correspond to the four possible base calls. Figure 11 shows data from a single cycle of non-progressive sequencing by ligation, and in particular is the sequencing data from position (-1) of the proximal tag of a complex E. coli derived library. Figure 11 shows variation in accuracy over each of 26 cycles of non-progressive sequencing by ligation in a single experiment resequencing an E. coli genome. The cumulative distribution of raw error is shown as a function of rank-ordered quality, with each of 26 sequencing-by-ligation cycles in a single sequencing experiment treated as an independent data-set. The x-axis indicates percentile bins of beads, sorted on the basis of a confidence metric. The y-axis (log scale) indicates the raw base-calling accuracy of each cumulative bin.
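A minimal sketch of how per-bead four-color data might be turned into base calls and a rank-ordered accuracy curve is given below. It is not the analysis used to produce the figures: the normalization (which places each bead on a tetrahedron whose corners are the pure colors), the confidence metric (best minus second-best channel), and all names are assumptions, and the channel-to-base mapping follows the query-primer list given earlier.

```python
# Channel-to-base mapping taken from the query-primer subpools.
BASE_FOR_CHANNEL = {"Cy5": "A", "Cy3": "G", "TexasRed": "C", "FRET": "T"}


def call_base(intensities: dict[str, float]) -> tuple[str, float]:
    """Return (base, confidence) for one bead in one cycle.

    Intensities are normalized to sum to 1, so each bead is a point in
    a tetrahedron; the call is the nearest corner and the confidence is
    the margin between the two strongest channels.
    """
    total = sum(intensities.values())
    if total <= 0:
        return "N", 0.0
    norm = {ch: value / total for ch, value in intensities.items()}
    ranked = sorted(norm, key=norm.get, reverse=True)
    return BASE_FOR_CHANNEL[ranked[0]], norm[ranked[0]] - norm[ranked[1]]


def cumulative_accuracy(calls: list[tuple[float, bool]]) -> list[float]:
    """Accuracy of the top-k beads for k = 1..n, with beads sorted from
    most to least confident. Each item is (confidence, call_is_correct)."""
    ordered = sorted(calls, key=lambda item: item[0], reverse=True)
    accuracies, correct = [], 0
    for rank, (_, is_correct) in enumerate(ordered, start=1):
        correct += is_correct
        accuracies.append(correct / rank)
    return accuracies


print(call_base({"Cy5": 80.0, "Cy3": 10.0, "TexasRed": 5.0, "FRET": 5.0}))
print(cumulative_accuracy([(0.9, True), (0.5, False), (0.7, True)]))
```

Plotting the output of `cumulative_accuracy` against the percentile rank gives a curve of the kind described for the accuracy figure, with one curve per cycle if each cycle is treated as its own data-set.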
References
Housby JN, Southern EM., "Thermus scotoductus and Rhodothermus marinus DNA ligases have higher ligation efficiencies than Thermus thermophilus DNA ligase," Anal Biochem., 2002 March 1; 302(1):88-94.
Housby JN, Thorbjarnardottir SH, Jonsson ZO, Southern EM., "Optimised ligation of oligonucleotides by thermal ligases: comparison of Thermus scotoductus and Rhodothermus marinus DNA ligases to other thermophilic ligases," Nucleic Acids Res., 2000 Feb. 1; 28(3):E10.
Housby JN, Southern EM., "Fidelity of DNA ligation: a novel experimental approach based on the polymerisation of libraries of oligonucleotides," Nucleic Acids Res., 1998 Sept. 15; 26(18):4259-4266.
Pritchard CE, Southern EM., "Effects of base mismatches on joining of short oligodeoxynucleotides by DNA ligases," Nucleic Acids Res., 1997 Sept. 1; 25(17):3403-3407.
Claims
1. A method described above for DNA sequencing, useful for sequencing homopolymeric regions of DNA.
2. A method of sequencing a target nucleic acid comprising: a. providing a sequencing primer, wherein the sequencing primer has at least one anchor sequence and a universal base; b. hybridizing the sequencing primer to a target nucleic acid; and c. extending the sequencing primer.
3. A method of sequencing a target nucleic acid comprising: a. providing a sequencing primer, wherein the sequencing primer has at least one anchor sequence and a degenerate base; b. hybridizing the sequencing primer to a target nucleic acid; and c. extending the sequencing primer.
4. A method of sequencing a target nucleic acid comprising: a. providing a sequencing primer, wherein the sequencing primer has at least one anchor sequence and a natural base; b. hybridizing the sequencing primer to a target nucleic acid; and c. extending the sequencing primer.
5. A method for sequencing a target nucleic acid comprising:
(a) hybridization of one of several anchor primers to a common sequence adjacent to an unknown sequence,
(b) ligation of fluorescently labeled, degenerate oligonucleotides to the anchor primer, such that identity of the fluorophore is informative of the identity of one or more positions within the degenerate oligonucleotide,
(c) imaging to determine primer ligation,
(d) stripping of the anchor primer:degenerate oligonucleotide complexes, and
(e) repeating steps (a)-(d) one or more times.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/670,588 US20070207482A1 (en) | 2004-08-04 | 2007-02-02 | Wobble sequencing |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US59861004P | 2004-08-04 | 2004-08-04 | |
US60/598,610 | 2004-08-04 | ||
US69271805P | 2005-06-22 | 2005-06-22 | |
US60/692,718 | 2005-06-22 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/670,588 Continuation US20070207482A1 (en) | 2004-08-04 | 2007-02-02 | Wobble sequencing |
Publications (3)
Publication Number | Publication Date |
---|---|
WO2006073504A2 WO2006073504A2 (en) | 2006-07-13 |
WO2006073504A3 WO2006073504A3 (en) | 2007-04-12 |
WO2006073504A8 WO2006073504A8 (en) | 2007-09-27 |
Family
ID=36647934
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2005/027695 WO2006073504A2 (en) | 2004-08-04 | 2005-08-04 | Wobble sequencing |
Country Status (2)
Country | Link |
---|---|
US (1) | US20070207482A1 (en) |
WO (1) | WO2006073504A2 (en) |
Cited By (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2009094583A1 (en) * | 2008-01-23 | 2009-07-30 | Complete Genomics, Inc. | Methods and compositions for preventing bias in amplification and sequencing reactions |
EP2189793A1 (en) * | 2008-11-21 | 2010-05-26 | Micronas GmbH | Method for regenerating a biosensor |
US7811810B2 (en) | 2007-10-25 | 2010-10-12 | Industrial Technology Research Institute | Bioassay system including optical detection apparatuses, and method for detecting biomolecules |
WO2010148039A2 (en) | 2009-06-15 | 2010-12-23 | Complete Genomics, Inc. | Methods and compositions for long fragment read sequencing |
US7906285B2 (en) | 2003-02-26 | 2011-03-15 | Callida Genomics, Inc. | Random array DNA analysis by hybridization |
EP2362209A2 (en) | 2009-03-11 | 2011-08-31 | Industrial Technology Research Institute | Apparatus and method for detection and discrimination of the type of a molecular object |
EP2511843A2 (en) | 2009-04-29 | 2012-10-17 | Complete Genomics, Inc. | Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence |
EP2565279A1 (en) | 2007-12-05 | 2013-03-06 | Complete Genomics, Inc. | Efficient base determination in sequencing reactions |
US8431691B2 (en) | 2005-02-01 | 2013-04-30 | Applied Biosystems Llc | Reagents, methods, and libraries for bead-based sequencing |
WO2013066975A1 (en) | 2011-11-02 | 2013-05-10 | Complete Genomics, Inc. | Treatment for stabilizing nucleic acid arrays |
EP2657869A2 (en) | 2007-08-29 | 2013-10-30 | Applied Biosystems, LLC | Alternative nucleic acid sequencing methods |
WO2013166517A1 (en) | 2012-05-04 | 2013-11-07 | Complete Genomics, Inc. | Methods for determining absolute genome-wide copy number variations of complex tumors |
US8615365B2 (en) | 2009-02-03 | 2013-12-24 | Complete Genomics, Inc. | Oligomer sequences mapping |
US8725422B2 (en) | 2010-10-13 | 2014-05-13 | Complete Genomics, Inc. | Methods for estimating genome-wide copy number variations |
US8731843B2 (en) | 2009-02-03 | 2014-05-20 | Complete Genomics, Inc. | Oligomer sequences mapping |
US8738296B2 (en) | 2009-02-03 | 2014-05-27 | Complete Genomics, Inc. | Indexing a reference sequence for oligomer sequence mapping |
WO2014145820A2 (en) | 2013-03-15 | 2014-09-18 | Complete Genomics, Inc. | Multiple tagging of long dna fragments |
US8865078B2 (en) | 2010-06-11 | 2014-10-21 | Industrial Technology Research Institute | Apparatus for single-molecule detection |
US8945835B2 (en) | 2006-02-08 | 2015-02-03 | Illumina Cambridge Limited | Method for sequencing a polynucleotide template |
US9023769B2 (en) | 2009-11-30 | 2015-05-05 | Complete Genomics, Inc. | cDNA library for nucleic acid sequencing |
US9222132B2 (en) | 2008-01-28 | 2015-12-29 | Complete Genomics, Inc. | Methods and compositions for efficient base calling in sequencing reactions |
US9267172B2 (en) | 2007-11-05 | 2016-02-23 | Complete Genomics, Inc. | Efficient base determination in sequencing reactions |
EP3037553A1 (en) | 2007-10-25 | 2016-06-29 | Industrial Technology Research Institute | Bioassay system including optical detection apparatuses, and method for detecting biomolecules |
US9382585B2 (en) | 2007-10-30 | 2016-07-05 | Complete Genomics, Inc. | Apparatus for high throughput sequencing of nucleic acids |
EP3043319A1 (en) | 2010-04-30 | 2016-07-13 | Complete Genomics, Inc. | Method and system for accurate alignment and registration of array for dna sequencing |
US9482615B2 (en) | 2010-03-15 | 2016-11-01 | Industrial Technology Research Institute | Single-molecule detection system and methods |
US9524369B2 (en) | 2009-06-15 | 2016-12-20 | Complete Genomics, Inc. | Processing and analysis of complex nucleic acid sequence data |
US9551026B2 (en) | 2007-12-03 | 2017-01-24 | Complete Genomincs, Inc. | Method for nucleic acid detection using voltage enhancement |
US9637784B2 (en) | 2005-06-15 | 2017-05-02 | Complete Genomics, Inc. | Methods for DNA sequencing and analysis using multiple tiers of aliquots |
US9803239B2 (en) | 2012-03-29 | 2017-10-31 | Complete Genomics, Inc. | Flow cells for high density array chips |
US9880089B2 (en) | 2010-08-31 | 2018-01-30 | Complete Genomics, Inc. | High-density devices with synchronous tracks for quad-cell based alignment correction |
WO2018129214A1 (en) | 2017-01-04 | 2018-07-12 | Complete Genomics, Inc. | Stepwise sequencing by non-labeled reversible terminators or natural nucleotides |
US10227647B2 (en) | 2015-02-17 | 2019-03-12 | Complete Genomics, Inc. | DNA sequencing using controlled strand displacement |
WO2019071471A1 (en) | 2017-10-11 | 2019-04-18 | 深圳华大智造科技有限公司 | Method for improving loading and stability of nucleic acid on solid support |
US10385391B2 (en) | 2009-09-22 | 2019-08-20 | President And Fellows Of Harvard College | Entangled mate sequencing |
US10392726B2 (en) | 2010-10-08 | 2019-08-27 | President And Fellows Of Harvard College | High-throughput immune sequencing |
US10726942B2 (en) | 2013-08-23 | 2020-07-28 | Complete Genomics, Inc. | Long fragment de novo assembly using short reads |
WO2020180813A1 (en) | 2019-03-06 | 2020-09-10 | Qiagen Sciences, Llc | Compositions and methods for adaptor design and nucleic acid library construction for rolony-based sequencing |
EP3746564A1 (en) | 2018-01-29 | 2020-12-09 | St. Jude Children's Research Hospital, Inc. | Method for nucleic acid amplification |
WO2021103695A1 (en) * | 2019-11-25 | 2021-06-03 | 齐鲁工业大学 | Single-base continuous extension flow-type targeted sequencing method |
WO2021185320A1 (en) | 2020-03-18 | 2021-09-23 | Mgi Tech Co., Ltd. | Restoring phase in massively parallel sequencing |
US11198855B2 (en) | 2014-11-13 | 2021-12-14 | The Board Of Trustees Of The University Of Illinois | Bio-engineered hyper-functional “super” helicases |
US11389779B2 (en) | 2007-12-05 | 2022-07-19 | Complete Genomics, Inc. | Methods of preparing a library of nucleic acid fragments tagged with oligonucleotide bar code sequences |
Families Citing this family (108)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
SG10201405158QA (en) | 2006-02-24 | 2014-10-30 | Callida Genomics Inc | High throughput genome sequencing on dna arrays |
US7910302B2 (en) | 2006-10-27 | 2011-03-22 | Complete Genomics, Inc. | Efficient arrays of amplified polynucleotides |
US20090111705A1 (en) | 2006-11-09 | 2009-04-30 | Complete Genomics, Inc. | Selection of dna adaptor orientation by hybrid capture |
US8716190B2 (en) | 2007-09-14 | 2014-05-06 | Affymetrix, Inc. | Amplification and analysis of selected targets on solid supports |
US9388457B2 (en) | 2007-09-14 | 2016-07-12 | Affymetrix, Inc. | Locus specific amplification using array probes |
WO2009052214A2 (en) | 2007-10-15 | 2009-04-23 | Complete Genomics, Inc. | Sequence analysis using decorated nucleic acids |
US8298768B2 (en) | 2007-11-29 | 2012-10-30 | Complete Genomics, Inc. | Efficient shotgun sequencing methods |
EP2285979B2 (en) | 2008-05-27 | 2020-02-19 | Dako Denmark A/S | Hybridization compositions and methods |
CN102186992A (en) * | 2008-08-15 | 2011-09-14 | 康奈尔大学 | Device for rapid identification of nucleic acids for binding to specific chemical targets |
US9303287B2 (en) | 2009-02-26 | 2016-04-05 | Dako Denmark A/S | Compositions and methods for RNA hybridization applications |
US20190300945A1 (en) | 2010-04-05 | 2019-10-03 | Prognosys Biosciences, Inc. | Spatially Encoded Biological Assays |
US10787701B2 (en) | 2010-04-05 | 2020-09-29 | Prognosys Biosciences, Inc. | Spatially encoded biological assays |
PT2556171E (en) | 2010-04-05 | 2015-12-21 | Prognosys Biosciences Inc | Spatially encoded biological assays |
US8865077B2 (en) | 2010-06-11 | 2014-10-21 | Industrial Technology Research Institute | Apparatus for single-molecule detection |
US10167508B2 (en) | 2010-08-06 | 2019-01-01 | Ariosa Diagnostics, Inc. | Detection of genetic abnormalities |
US11031095B2 (en) | 2010-08-06 | 2021-06-08 | Ariosa Diagnostics, Inc. | Assay systems for determination of fetal copy number variation |
US20120034603A1 (en) | 2010-08-06 | 2012-02-09 | Tandem Diagnostics, Inc. | Ligation-based detection of genetic variants |
US20130040375A1 (en) | 2011-08-08 | 2013-02-14 | Tandem Diagnotics, Inc. | Assay systems for genetic analysis |
US20130261003A1 (en) | 2010-08-06 | 2013-10-03 | Ariosa Diagnostics, In. | Ligation-based detection of genetic variants |
US8700338B2 (en) | 2011-01-25 | 2014-04-15 | Ariosa Diagnosis, Inc. | Risk calculation for evaluation of fetal aneuploidy |
US10533223B2 (en) | 2010-08-06 | 2020-01-14 | Ariosa Diagnostics, Inc. | Detection of target nucleic acids using hybridization |
US20140342940A1 (en) | 2011-01-25 | 2014-11-20 | Ariosa Diagnostics, Inc. | Detection of Target Nucleic Acids using Hybridization |
US11203786B2 (en) | 2010-08-06 | 2021-12-21 | Ariosa Diagnostics, Inc. | Detection of target nucleic acids using hybridization |
US9671344B2 (en) | 2010-08-31 | 2017-06-06 | Complete Genomics, Inc. | High-density biochemical array chips with asynchronous tracks for alignment correction by moiré averaging |
WO2012103031A2 (en) | 2011-01-25 | 2012-08-02 | Ariosa Diagnostics, Inc. | Detection of genetic abnormalities |
US11270781B2 (en) | 2011-01-25 | 2022-03-08 | Ariosa Diagnostics, Inc. | Statistical analysis for non-invasive sex chromosome aneuploidy determination |
US10131947B2 (en) | 2011-01-25 | 2018-11-20 | Ariosa Diagnostics, Inc. | Noninvasive detection of fetal aneuploidy in egg donor pregnancies |
US9994897B2 (en) | 2013-03-08 | 2018-06-12 | Ariosa Diagnostics, Inc. | Non-invasive fetal sex determination |
US8756020B2 (en) | 2011-01-25 | 2014-06-17 | Ariosa Diagnostics, Inc. | Enhanced risk probabilities using biomolecule estimations |
WO2012118745A1 (en) | 2011-02-28 | 2012-09-07 | Arnold Oliphant | Assay systems for detection of aneuploidy and sex determination |
EP2694709B1 (en) | 2011-04-08 | 2016-09-14 | Prognosys Biosciences, Inc. | Peptide constructs and assay systems |
GB201106254D0 (en) | 2011-04-13 | 2011-05-25 | Frisen Jonas | Method and product |
US8712697B2 (en) | 2011-09-07 | 2014-04-29 | Ariosa Diagnostics, Inc. | Determination of copy number variations using binomial probability calculations |
EP2761028A1 (en) | 2011-09-30 | 2014-08-06 | Dako Denmark A/S | Hybridization compositions and methods using formamide |
EP2768974B1 (en) * | 2011-10-21 | 2017-07-19 | Dako Denmark A/S | Hybridization compositions and methods |
US10289800B2 (en) | 2012-05-21 | 2019-05-14 | Ariosa Diagnostics, Inc. | Processes for calculating phased fetal genomic sequences |
US9884893B2 (en) | 2012-05-21 | 2018-02-06 | Distributed Bio, Inc. | Epitope focusing by variable effective antigen surface concentration |
CA2874413A1 (en) | 2012-05-21 | 2013-11-28 | The Scripps Research Institute | Methods of sample preparation |
US9488823B2 (en) | 2012-06-07 | 2016-11-08 | Complete Genomics, Inc. | Techniques for scanned illumination |
US9628676B2 (en) | 2012-06-07 | 2017-04-18 | Complete Genomics, Inc. | Imaging systems with movable scan mirrors |
CN104583421A (en) | 2012-07-19 | 2015-04-29 | 阿瑞奥萨诊断公司 | Multiplexed sequential ligation-based detection of genetic variants |
USRE50065E1 (en) | 2012-10-17 | 2024-07-30 | 10X Genomics Sweden Ab | Methods and product for optimising localised or spatial detection of gene expression in a tissue sample |
WO2014200579A1 (en) | 2013-06-13 | 2014-12-18 | Ariosa Diagnostics, Inc. | Statistical analysis for non-invasive sex chromosome aneuploidy determination |
DK3013984T3 (en) | 2013-06-25 | 2023-06-06 | Prognosys Biosciences Inc | METHOD FOR DETERMINING SPATIAL PATTERNS IN BIOLOGICAL TARGETS IN A SAMPLE |
KR102160389B1 (en) | 2013-08-05 | 2020-09-28 | 트위스트 바이오사이언스 코포레이션 | De novo synthesized gene libraries |
WO2015042708A1 (en) | 2013-09-25 | 2015-04-02 | Bio-Id Diagnostic Inc. | Methods for detecting nucleic acid fragments |
EP3191604B1 (en) | 2014-09-09 | 2021-04-14 | Igenomx International Genomics Corporation | Methods and compositions for rapid nucleic acid library preparation |
WO2016126987A1 (en) | 2015-02-04 | 2016-08-11 | Twist Bioscience Corporation | Compositions and methods for synthetic gene assembly |
WO2016126882A1 (en) | 2015-02-04 | 2016-08-11 | Twist Bioscience Corporation | Methods and devices for de novo oligonucleic acid assembly |
FI3901281T3 (en) | 2015-04-10 | 2023-01-31 | Spatially distinguished, multiplex nucleic acid analysis of biological specimens | |
US9981239B2 (en) | 2015-04-21 | 2018-05-29 | Twist Bioscience Corporation | Devices and methods for oligonucleic acid library synthesis |
EP3350314A4 (en) | 2015-09-18 | 2019-02-06 | Twist Bioscience Corporation | Oligonucleic acid variant libraries and synthesis thereof |
KR20180058772A (en) | 2015-09-22 | 2018-06-01 | 트위스트 바이오사이언스 코포레이션 | Flexible substrate for nucleic acid synthesis |
CN115920796A (en) | 2015-12-01 | 2023-04-07 | 特韦斯特生物科学公司 | Functionalized surfaces and preparation thereof |
CA3034769A1 (en) | 2016-08-22 | 2018-03-01 | Twist Bioscience Corporation | De novo synthesized nucleic acid libraries |
US10417457B2 (en) | 2016-09-21 | 2019-09-17 | Twist Bioscience Corporation | Nucleic acid based data storage |
GB2573069A (en) | 2016-12-16 | 2019-10-23 | Twist Bioscience Corp | Variant libraries of the immunological synapse and synthesis thereof |
CA3054303A1 (en) | 2017-02-22 | 2018-08-30 | Twist Bioscience Corporation | Nucleic acid based data storage |
US10894959B2 (en) | 2017-03-15 | 2021-01-19 | Twist Bioscience Corporation | Variant libraries of the immunological synapse and synthesis thereof |
WO2018231864A1 (en) | 2017-06-12 | 2018-12-20 | Twist Bioscience Corporation | Methods for seamless nucleic acid assembly |
AU2018284227B2 (en) | 2017-06-12 | 2024-05-02 | Twist Bioscience Corporation | Methods for seamless nucleic acid assembly |
CN111566125A (en) | 2017-09-11 | 2020-08-21 | 特韦斯特生物科学公司 | GPCR binding proteins and synthesis thereof |
GB2583590A (en) | 2017-10-20 | 2020-11-04 | Twist Bioscience Corp | Heated nanowells for polynucleotide synthesis |
AU2019205269A1 (en) | 2018-01-04 | 2020-07-30 | Twist Bioscience Corporation | DNA-based digital information storage |
CN112639130B (en) | 2018-05-18 | 2024-08-09 | 特韦斯特生物科学公司 | Polynucleotides, reagents and methods for nucleic acid hybridization |
US11519033B2 (en) | 2018-08-28 | 2022-12-06 | 10X Genomics, Inc. | Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample |
US11649485B2 (en) | 2019-01-06 | 2023-05-16 | 10X Genomics, Inc. | Generating capture probes for spatial analysis |
US11926867B2 (en) | 2019-01-06 | 2024-03-12 | 10X Genomics, Inc. | Generating capture probes for spatial analysis |
WO2020176678A1 (en) | 2019-02-26 | 2020-09-03 | Twist Bioscience Corporation | Variant nucleic acid libraries for glp1 receptor |
JP2022522668A (en) | 2019-02-26 | 2022-04-20 | ツイスト バイオサイエンス コーポレーション | Mutant nucleic acid library for antibody optimization |
WO2020243579A1 (en) | 2019-05-30 | 2020-12-03 | 10X Genomics, Inc. | Methods of detecting spatial heterogeneity of a biological sample |
CA3144644A1 (en) | 2019-06-21 | 2020-12-24 | Twist Bioscience Corporation | Barcode-based nucleic acid sequence assembly |
AU2020356471A1 (en) | 2019-09-23 | 2022-04-21 | Twist Bioscience Corporation | Variant nucleic acid libraries for CRTH2 |
WO2021091611A1 (en) | 2019-11-08 | 2021-05-14 | 10X Genomics, Inc. | Spatially-tagged analyte capture agents for analyte multiplexing |
EP4025711A2 (en) | 2019-11-08 | 2022-07-13 | 10X Genomics, Inc. | Enhancing specificity of analyte binding |
WO2021133842A1 (en) | 2019-12-23 | 2021-07-01 | 10X Genomics, Inc. | Compositions and methods for using fixed biological samples in partition-based assays |
EP4424843A3 (en) | 2019-12-23 | 2024-09-25 | 10X Genomics, Inc. | Methods for spatial analysis using rna-templated ligation |
US11702693B2 (en) | 2020-01-21 | 2023-07-18 | 10X Genomics, Inc. | Methods for printing cells and generating arrays of barcoded cells |
US11732299B2 (en) | 2020-01-21 | 2023-08-22 | 10X Genomics, Inc. | Spatial assays with perturbed cells |
US11821035B1 (en) | 2020-01-29 | 2023-11-21 | 10X Genomics, Inc. | Compositions and methods of making gene expression libraries |
US12076701B2 (en) | 2020-01-31 | 2024-09-03 | 10X Genomics, Inc. | Capturing oligonucleotides in spatial transcriptomics |
US12110541B2 (en) | 2020-02-03 | 2024-10-08 | 10X Genomics, Inc. | Methods for preparing high-resolution spatial arrays |
US11898205B2 (en) | 2020-02-03 | 2024-02-13 | 10X Genomics, Inc. | Increasing capture efficiency of spatial assays |
US11732300B2 (en) | 2020-02-05 | 2023-08-22 | 10X Genomics, Inc. | Increasing efficiency of spatial analysis in a biological sample |
US11835462B2 (en) | 2020-02-11 | 2023-12-05 | 10X Genomics, Inc. | Methods and compositions for partitioning a biological sample |
US11891654B2 (en) | 2020-02-24 | 2024-02-06 | 10X Genomics, Inc. | Methods of making gene expression libraries |
US11926863B1 (en) | 2020-02-27 | 2024-03-12 | 10X Genomics, Inc. | Solid state single cell method for analyzing fixed biological cells |
US11768175B1 (en) | 2020-03-04 | 2023-09-26 | 10X Genomics, Inc. | Electrophoretic methods for spatial analysis |
CN115916999A (en) | 2020-04-22 | 2023-04-04 | 10X基因组学有限公司 | Methods for spatial analysis using targeted RNA depletion |
AU2021275906A1 (en) | 2020-05-22 | 2022-12-22 | 10X Genomics, Inc. | Spatial analysis to detect sequence variants |
EP4414459A3 (en) | 2020-05-22 | 2024-09-18 | 10X Genomics, Inc. | Simultaneous spatio-temporal measurement of gene expression and cellular activity |
WO2021242834A1 (en) | 2020-05-26 | 2021-12-02 | 10X Genomics, Inc. | Method for resetting an array |
AU2021283184A1 (en) | 2020-06-02 | 2023-01-05 | 10X Genomics, Inc. | Spatial transcriptomics for antigen-receptors |
EP4025692A2 (en) | 2020-06-02 | 2022-07-13 | 10X Genomics, Inc. | Nucleic acid library methods |
US12031177B1 (en) | 2020-06-04 | 2024-07-09 | 10X Genomics, Inc. | Methods of enhancing spatial resolution of transcripts |
WO2021252499A1 (en) | 2020-06-08 | 2021-12-16 | 10X Genomics, Inc. | Methods of determining a surgical margin and methods of use thereof |
EP4165207B1 (en) | 2020-06-10 | 2024-09-25 | 10X Genomics, Inc. | Methods for determining a location of an analyte in a biological sample |
EP4450639A2 (en) | 2020-06-25 | 2024-10-23 | 10X Genomics, Inc. | Spatial analysis of dna methylation |
US11981960B1 (en) | 2020-07-06 | 2024-05-14 | 10X Genomics, Inc. | Spatial analysis utilizing degradable hydrogels |
US11761038B1 (en) | 2020-07-06 | 2023-09-19 | 10X Genomics, Inc. | Methods for identifying a location of an RNA in a biological sample |
US11981958B1 (en) | 2020-08-20 | 2024-05-14 | 10X Genomics, Inc. | Methods for spatial analysis using DNA capture |
US11926822B1 (en) | 2020-09-23 | 2024-03-12 | 10X Genomics, Inc. | Three-dimensional spatial analysis |
US11827935B1 (en) | 2020-11-19 | 2023-11-28 | 10X Genomics, Inc. | Methods for spatial analysis using rolling circle amplification and detection probes |
AU2021409136A1 (en) | 2020-12-21 | 2023-06-29 | 10X Genomics, Inc. | Methods, compositions, and systems for capturing probes and/or barcodes |
WO2022178267A2 (en) | 2021-02-19 | 2022-08-25 | 10X Genomics, Inc. | Modular assay support devices |
EP4301870A1 (en) | 2021-03-18 | 2024-01-10 | 10X Genomics, Inc. | Multiplex capture of gene and protein expression from a biological sample |
EP4347879A1 (en) | 2021-06-03 | 2024-04-10 | 10X Genomics, Inc. | Methods, compositions, kits, and systems for enhancing analyte capture for spatial analysis |
EP4196605A1 (en) | 2021-09-01 | 2023-06-21 | 10X Genomics, Inc. | Methods, compositions, and kits for blocking a capture probe on a spatial array |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6528288B2 (en) * | 1999-04-21 | 2003-03-04 | Genome Technologies, Llc | Shot-gun sequencing and amplification without cloning |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5641658A (en) * | 1994-08-03 | 1997-06-24 | Mosaic Technologies, Inc. | Method for performing amplification of nucleic acid with two primers bound to a single solid support |
US5750341A (en) * | 1995-04-17 | 1998-05-12 | Lynx Therapeutics, Inc. | DNA sequencing by parallel oligonucleotide extensions |
- 2005-08-04: WO application PCT/US2005/027695, published as WO2006073504A2 (Application Filing, active)
- 2007-02-02: US application US11/670,588, published as US20070207482A1 (Abandoned, not active)
Non-Patent Citations (7)
Title |
---|
DAY J.P. ET AL.: 'Nucleotide Analogs and new buffers improve a generalized method to enrich for low abundance mutations' NUCLEIC ACIDS RESEARCH vol. 27, no. 8, 1999, pages 1819 - 1827, XP002159669 * |
KACZOROWSKI ET AL.: 'Automated four-color DNA sequencing using primers assembled by hexamer ligation' GENE vol. 179, 1996, pages 195 - 198 * |
KACZOROWSKI ET AL.: 'Genomic DNA sequencing by SPEL-6 primer walking using hexamer ligation' GENE vol. 223, 1998, pages 83 - 91 * |
KACZOROWSKI T. ET AL.: 'Co-operativity of hexamer ligation' GENE vol. 179, 1996, pages 189 - 193, XP004071982 * |
PASTINEN T. ET AL.: 'A System for Specific, High-throughput Genotyping by Allele-Specific Primer Extension on Microarrays' GENOME RESEARCH vol. 10, 2000, pages 1031 - 1042, XP008013561 * |
RAJA M.C. ET AL.: 'DNA sequencing using differential extension with nucleotide subsets (DENS)' NUCLEIC ACIDS RESEARCH vol. 25, no. 4, 1997, pages 800 - 805, XP003010704 * |
ROSE T.M. ET AL.: 'Consensus-degenerate hybrid oligonucleotide primers for amplification of distantly related sequences' NUCLEIC ACIDS RESEARCH vol. 26, no. 7, 1998, pages 1628 - 1635, XP002141299 * |
Cited By (75)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7906285B2 (en) | 2003-02-26 | 2011-03-15 | Callida Genomics, Inc. | Random array DNA analysis by hybridization |
US8431691B2 (en) | 2005-02-01 | 2013-04-30 | Applied Biosystems Llc | Reagents, methods, and libraries for bead-based sequencing |
US10323277B2 (en) | 2005-02-01 | 2019-06-18 | Applied Biosystems, Llc | Reagents, methods, and libraries for bead-based sequencing |
US11414702B2 (en) | 2005-06-15 | 2022-08-16 | Complete Genomics, Inc. | Nucleic acid analysis by random mixtures of non-overlapping fragments |
US10125392B2 (en) | 2005-06-15 | 2018-11-13 | Complete Genomics, Inc. | Preparing a DNA fragment library for sequencing using tagged primers |
US9637784B2 (en) | 2005-06-15 | 2017-05-02 | Complete Genomics, Inc. | Methods for DNA sequencing and analysis using multiple tiers of aliquots |
US10351909B2 (en) | 2005-06-15 | 2019-07-16 | Complete Genomics, Inc. | DNA sequencing from high density DNA arrays using asynchronous reactions |
US9637785B2 (en) | 2005-06-15 | 2017-05-02 | Complete Genomics, Inc. | Tagged fragment library configured for genome or cDNA sequence analysis |
US9944984B2 (en) | 2005-06-15 | 2018-04-17 | Complete Genomics, Inc. | High density DNA array |
US9650673B2 (en) | 2005-06-15 | 2017-05-16 | Complete Genomics, Inc. | Single molecule arrays for genetic and chemical analysis |
US8945835B2 (en) | 2006-02-08 | 2015-02-03 | Illumina Cambridge Limited | Method for sequencing a polynucleotide template |
US9994896B2 (en) | 2006-02-08 | 2018-06-12 | Illumina Cambridge Limited | Method for sequencing a polynucleotide template |
US10876158B2 (en) | 2006-02-08 | 2020-12-29 | Illumina Cambridge Limited | Method for sequencing a polynucleotide template |
EP2657869A2 (en) | 2007-08-29 | 2013-10-30 | Applied Biosystems, LLC | Alternative nucleic acid sequencing methods |
US7811810B2 (en) | 2007-10-25 | 2010-10-12 | Industrial Technology Research Institute | Bioassay system including optical detection apparatuses, and method for detecting biomolecules |
EP3037553A1 (en) | 2007-10-25 | 2016-06-29 | Industrial Technology Research Institute | Bioassay system including optical detection apparatuses, and method for detecting biomolecules |
US10017815B2 (en) | 2007-10-30 | 2018-07-10 | Complete Genomics, Inc. | Method for high throughput screening of nucleic acids |
US9382585B2 (en) | 2007-10-30 | 2016-07-05 | Complete Genomics, Inc. | Apparatus for high throughput sequencing of nucleic acids |
US9267172B2 (en) | 2007-11-05 | 2016-02-23 | Complete Genomics, Inc. | Efficient base determination in sequencing reactions |
US9551026B2 (en) | 2007-12-03 | 2017-01-24 | Complete Genomics, Inc. | Method for nucleic acid detection using voltage enhancement |
US11389779B2 (en) | 2007-12-05 | 2022-07-19 | Complete Genomics, Inc. | Methods of preparing a library of nucleic acid fragments tagged with oligonucleotide bar code sequences |
EP2610351A1 (en) | 2007-12-05 | 2013-07-03 | Complete Genomics, Inc. | Efficient base determination in sequencing reactions |
EP2565279A1 (en) | 2007-12-05 | 2013-03-06 | Complete Genomics, Inc. | Efficient base determination in sequencing reactions |
WO2009094583A1 (en) * | 2008-01-23 | 2009-07-30 | Complete Genomics, Inc. | Methods and compositions for preventing bias in amplification and sequencing reactions |
US9222132B2 (en) | 2008-01-28 | 2015-12-29 | Complete Genomics, Inc. | Methods and compositions for efficient base calling in sequencing reactions |
US11214832B2 (en) | 2008-01-28 | 2022-01-04 | Complete Genomics, Inc. | Methods and compositions for efficient base calling in sequencing reactions |
US10662473B2 (en) | 2008-01-28 | 2020-05-26 | Complete Genomics, Inc. | Methods and compositions for efficient base calling in sequencing reactions |
US11098356B2 (en) | 2008-01-28 | 2021-08-24 | Complete Genomics, Inc. | Methods and compositions for nucleic acid sequencing |
US9523125B2 (en) | 2008-01-28 | 2016-12-20 | Complete Genomics, Inc. | Methods and compositions for efficient base calling in sequencing reactions |
EP2189793A1 (en) * | 2008-11-21 | 2010-05-26 | Micronas GmbH | Method for regenerating a biosensor |
US8518709B2 (en) | 2008-11-21 | 2013-08-27 | Endress+Hauser Conducta Gesellschaft Fuer Mess-Und Regeltechnik Mbh+Co. Kg | Method for regenerating a biosensor |
US8731843B2 (en) | 2009-02-03 | 2014-05-20 | Complete Genomics, Inc. | Oligomer sequences mapping |
US8615365B2 (en) | 2009-02-03 | 2013-12-24 | Complete Genomics, Inc. | Oligomer sequences mapping |
US8738296B2 (en) | 2009-02-03 | 2014-05-27 | Complete Genomics, Inc. | Indexing a reference sequence for oligomer sequence mapping |
US9778188B2 (en) | 2009-03-11 | 2017-10-03 | Industrial Technology Research Institute | Apparatus and method for detection and discrimination molecular object |
US10996166B2 (en) | 2009-03-11 | 2021-05-04 | Industrial Technology Research Institute | Apparatus and method for detection and discrimination molecular object |
EP3159678A1 (en) | 2009-03-11 | 2017-04-26 | Industrial Technology Research Institute | Apparatus and method for detection and discrimination of a molecular object |
EP2362209A2 (en) | 2009-03-11 | 2011-08-31 | Industrial Technology Research Institute | Apparatus and method for detection and discrimination of the type of a molecular object |
EP2511843A2 (en) | 2009-04-29 | 2012-10-17 | Complete Genomics, Inc. | Method and system for calling variations in a sample polynucleotide sequence with respect to a reference polynucleotide sequence |
EP2977455A1 (en) | 2009-06-15 | 2016-01-27 | Complete Genomics, Inc. | Methods and compositions for long fragment read sequencing |
US9524369B2 (en) | 2009-06-15 | 2016-12-20 | Complete Genomics, Inc. | Processing and analysis of complex nucleic acid sequence data |
WO2010148039A2 (en) | 2009-06-15 | 2010-12-23 | Complete Genomics, Inc. | Methods and compositions for long fragment read sequencing |
US10385391B2 (en) | 2009-09-22 | 2019-08-20 | President And Fellows Of Harvard College | Entangled mate sequencing |
US9023769B2 (en) | 2009-11-30 | 2015-05-05 | Complete Genomics, Inc. | cDNA library for nucleic acid sequencing |
US9777321B2 (en) | 2010-03-15 | 2017-10-03 | Industrial Technology Research Institute | Single molecule detection system and methods |
US9482615B2 (en) | 2010-03-15 | 2016-11-01 | Industrial Technology Research Institute | Single-molecule detection system and methods |
EP3043319A1 (en) | 2010-04-30 | 2016-07-13 | Complete Genomics, Inc. | Method and system for accurate alignment and registration of array for dna sequencing |
US8865078B2 (en) | 2010-06-11 | 2014-10-21 | Industrial Technology Research Institute | Apparatus for single-molecule detection |
US9995683B2 (en) | 2010-06-11 | 2018-06-12 | Industrial Technology Research Institute | Apparatus for single-molecule detection |
US9880089B2 (en) | 2010-08-31 | 2018-01-30 | Complete Genomics, Inc. | High-density devices with synchronous tracks for quad-cell based alignment correction |
US12012670B2 (en) | 2010-10-08 | 2024-06-18 | President And Fellows Of Harvard College | High-throughput immune sequencing |
US10392726B2 (en) | 2010-10-08 | 2019-08-27 | President And Fellows Of Harvard College | High-throughput immune sequencing |
US8725422B2 (en) | 2010-10-13 | 2014-05-13 | Complete Genomics, Inc. | Methods for estimating genome-wide copy number variations |
WO2013066975A1 (en) | 2011-11-02 | 2013-05-10 | Complete Genomics, Inc. | Treatment for stabilizing nucleic acid arrays |
US11835437B2 (en) | 2011-11-02 | 2023-12-05 | Complete Genomics, Inc. | Treatment for stabilizing nucleic acid arrays |
US10837879B2 (en) | 2011-11-02 | 2020-11-17 | Complete Genomics, Inc. | Treatment for stabilizing nucleic acid arrays |
US9803239B2 (en) | 2012-03-29 | 2017-10-31 | Complete Genomics, Inc. | Flow cells for high density array chips |
WO2013166517A1 (en) | 2012-05-04 | 2013-11-07 | Complete Genomics, Inc. | Methods for determining absolute genome-wide copy number variations of complex tumors |
EP3741872A1 (en) | 2013-03-15 | 2020-11-25 | Complete Genomics, Inc. | Multiple tagging of long dna fragments |
WO2014145820A2 (en) | 2013-03-15 | 2014-09-18 | Complete Genomics, Inc. | Multiple tagging of long dna fragments |
US10726942B2 (en) | 2013-08-23 | 2020-07-28 | Complete Genomics, Inc. | Long fragment de novo assembly using short reads |
US11198855B2 (en) | 2014-11-13 | 2021-12-14 | The Board Of Trustees Of The University Of Illinois | Bio-engineered hyper-functional “super” helicases |
US10227647B2 (en) | 2015-02-17 | 2019-03-12 | Complete Genomics, Inc. | DNA sequencing using controlled strand displacement |
US11319588B2 (en) | 2015-02-17 | 2022-05-03 | Mgi Tech Co., Ltd. | DNA sequencing using controlled strand displacement |
EP4112741A1 (en) | 2017-01-04 | 2023-01-04 | MGI Tech Co., Ltd. | Stepwise sequencing by non-labeled reversible terminators or natural nucleotides |
WO2018129214A1 (en) | 2017-01-04 | 2018-07-12 | Complete Genomics, Inc. | Stepwise sequencing by non-labeled reversible terminators or natural nucleotides |
WO2019071471A1 (en) | 2017-10-11 | 2019-04-18 | 深圳华大智造科技有限公司 | Method for improving loading and stability of nucleic acid on solid support |
EP3995590A1 (en) | 2017-10-11 | 2022-05-11 | MGI Tech Co., Ltd. | Method for improving loading and stability of nucleic acid |
US11905553B2 (en) | 2018-01-29 | 2024-02-20 | St. Jude Children's Research Hospital, Inc. | Method for nucleic acid amplification |
EP3746564A1 (en) | 2018-01-29 | 2020-12-09 | St. Jude Children's Research Hospital, Inc. | Method for nucleic acid amplification |
US11643682B2 (en) | 2018-01-29 | 2023-05-09 | St. Jude Children's Research Hospital, Inc. | Method for nucleic acid amplification |
EP4183886A1 (en) | 2018-01-29 | 2023-05-24 | St. Jude Children's Research Hospital, Inc. | Method for nucleic acid amplification |
WO2020180813A1 (en) | 2019-03-06 | 2020-09-10 | Qiagen Sciences, Llc | Compositions and methods for adaptor design and nucleic acid library construction for rolony-based sequencing |
WO2021103695A1 (en) * | 2019-11-25 | 2021-06-03 | 齐鲁工业大学 | Single-base continuous extension flow-type targeted sequencing method |
WO2021185320A1 (en) | 2020-03-18 | 2021-09-23 | Mgi Tech Co., Ltd. | Restoring phase in massively parallel sequencing |
Also Published As
Publication number | Publication date |
---|---|
WO2006073504A3 (en) | 2007-04-12 |
WO2006073504A8 (en) | 2007-09-27 |
US20070207482A1 (en) | 2007-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20070207482A1 (en) | | Wobble sequencing |
US20210062186A1 (en) | | Next-generation sequencing libraries |
US20220267845A1 (en) | | Selective Amplification of Nucleic Acid Sequences |
US8753816B2 (en) | | Sequencing methods |
US7622281B2 (en) | | Methods and compositions for clonal amplification of nucleic acid |
JP7240337B2 (en) | | Library preparation methods and compositions and uses thereof |
EP2694679A2 (en) | | Methods and systems for sequencing long nucleic acids |
US20120107878A1 (en) | | Multiplex assembly of high fidelity DNA |
US20190106744A1 (en) | | DNA sequencing |
EP3956445B1 (en) | | Multiplex assembly of nucleic acid molecules |
US20210017596A1 (en) | | Sequential sequencing methods and compositions |
US20200377935A1 (en) | | Polynucleotide adapters and methods of use thereof |
KR20230124636A (en) | | Compositions and methods for highly sensitive detection of target sequences in multiplex reactions |
US20200123604A1 (en) | | DNA sequencing |
US20230323451A1 (en) | | Selective amplification of molecularly identifiable nucleic acid sequences |
WO2008127901A1 (en) | | Region-specific hyperbranched amplification |
So | | Universal Sequence Tag Array (U-STAR) platform: strategies towards the development of a universal platform for the absolute quantification of gene expression on a global scale |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
WWE | WIPO information: entry into national phase | Ref document number: 11670588; Country of ref document: US |
NENP | Non-entry into the national phase | Ref country code: DE |
WWP | WIPO information: published in national office | Ref document number: 11670588; Country of ref document: US |
122 | EP: PCT application non-entry in European phase | |