US20220064628A1 - Compositions and methods for synthetic gene assembly - Google Patents
- Publication number
- US20220064628A1 (application US17/320,127)
- Authority
- US
- United States
- Prior art keywords
- sequence
- nucleic acid
- fragment
- fragments
- site
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- 238000000034 method Methods 0.000 title claims abstract description 182
- 239000000203 mixture Substances 0.000 title abstract description 27
- 108700005078 Synthetic Genes Proteins 0.000 title 1
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 384
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 207
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 207
- 239000012634 fragment Substances 0.000 claims description 458
- 108020004414 DNA Proteins 0.000 claims description 139
- 238000006243 chemical reaction Methods 0.000 claims description 93
- 108090000623 proteins and genes Proteins 0.000 claims description 82
- 102000053602 DNA Human genes 0.000 claims description 56
- 230000000694 effects Effects 0.000 claims description 50
- 125000003729 nucleotide group Chemical group 0.000 claims description 40
- 239000002773 nucleotide Substances 0.000 claims description 36
- 230000027455 binding Effects 0.000 claims description 17
- 210000004027 cell Anatomy 0.000 claims description 16
- 239000013598 vector Substances 0.000 claims description 14
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 claims description 12
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 claims description 12
- 238000009396 hybridization Methods 0.000 claims description 11
- 239000002299 complementary DNA Substances 0.000 claims description 7
- 108090000652 Flap endonucleases Proteins 0.000 claims description 6
- 102000004150 Flap endonucleases Human genes 0.000 claims description 5
- 108020004682 Single-Stranded DNA Proteins 0.000 claims description 5
- 238000011534 incubation Methods 0.000 claims description 4
- 230000002194 synthesizing effect Effects 0.000 claims description 4
- 108010061982 DNA Ligases Proteins 0.000 claims description 3
- 102000012410 DNA Ligases Human genes 0.000 claims description 3
- 230000001580 bacterial effect Effects 0.000 claims description 3
- 238000005382 thermal cycling Methods 0.000 claims description 3
- 108700008625 Reporter Genes Proteins 0.000 claims description 2
- 210000004962 mammalian cell Anatomy 0.000 claims description 2
- 239000003550 marker Substances 0.000 claims description 2
- 230000010076 replication Effects 0.000 claims description 2
- 210000003705 ribosome Anatomy 0.000 claims description 2
- 230000003612 virological effect Effects 0.000 claims description 2
- 210000005253 yeast cell Anatomy 0.000 claims description 2
- 108020004638 Circular DNA Proteins 0.000 claims 1
- 230000008569 process Effects 0.000 abstract description 12
- 238000012986 modification Methods 0.000 abstract description 7
- 230000004048 modification Effects 0.000 abstract description 5
- 102000004190 Enzymes Human genes 0.000 description 233
- 108090000790 Enzymes Proteins 0.000 description 233
- 238000003776 cleavage reaction Methods 0.000 description 171
- 230000007017 scission Effects 0.000 description 168
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 167
- 239000002243 precursor Substances 0.000 description 159
- 229940035893 uracil Drugs 0.000 description 82
- 230000004927 fusion Effects 0.000 description 64
- 239000003795 chemical substances by application Substances 0.000 description 58
- 101710147059 Nicking endonuclease Proteins 0.000 description 51
- 108091028043 Nucleic acid sequence Proteins 0.000 description 51
- 230000000295 complement effect Effects 0.000 description 42
- 230000002441 reversible effect Effects 0.000 description 41
- 108010042407 Endonucleases Proteins 0.000 description 40
- 230000015572 biosynthetic process Effects 0.000 description 40
- 238000003786 synthesis reaction Methods 0.000 description 40
- 239000002253 acid Substances 0.000 description 38
- 239000013612 plasmid Substances 0.000 description 38
- 230000003321 amplification Effects 0.000 description 37
- 238000003199 nucleic acid amplification method Methods 0.000 description 37
- 238000003752 polymerase chain reaction Methods 0.000 description 37
- 239000000047 product Substances 0.000 description 36
- 108010063362 DNA-(Apurinic or Apyrimidinic Site) Lyase Proteins 0.000 description 34
- 102100031780 Endonuclease Human genes 0.000 description 33
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 32
- 102100037111 Uracil-DNA glycosylase Human genes 0.000 description 32
- 102000010719 DNA-(Apurinic or Apyrimidinic Site) Lyase Human genes 0.000 description 30
- 238000000137 annealing Methods 0.000 description 27
- 150000007513 acids Chemical class 0.000 description 22
- 108091008146 restriction endonucleases Proteins 0.000 description 22
- 102000040430 polynucleotide Human genes 0.000 description 21
- 108091033319 polynucleotide Proteins 0.000 description 21
- 239000002157 polynucleotide Substances 0.000 description 21
- FDGQSTZJBFJUBT-UHFFFAOYSA-N hypoxanthine Chemical compound O=C1NC=NC2=C1NC=N2 FDGQSTZJBFJUBT-UHFFFAOYSA-N 0.000 description 20
- 108010082610 Deoxyribonuclease (Pyrimidine Dimer) Proteins 0.000 description 19
- 241000588724 Escherichia coli Species 0.000 description 19
- 238000012545 processing Methods 0.000 description 18
- 230000002255 enzymatic effect Effects 0.000 description 17
- 230000008439 repair process Effects 0.000 description 17
- 102000004099 Deoxyribonuclease (Pyrimidine Dimer) Human genes 0.000 description 16
- 239000002777 nucleoside Substances 0.000 description 16
- 108060002716 Exonuclease Proteins 0.000 description 15
- 102000013165 exonuclease Human genes 0.000 description 15
- 230000006846 excision repair Effects 0.000 description 14
- -1 N6-adenine Chemical compound 0.000 description 13
- 230000000670 limiting effect Effects 0.000 description 13
- 208000035657 Abasia Diseases 0.000 description 12
- 108020001738 DNA Glycosylase Proteins 0.000 description 12
- 102000011724 DNA Repair Enzymes Human genes 0.000 description 12
- 108010076525 DNA Repair Enzymes Proteins 0.000 description 12
- 102000028381 DNA glycosylase Human genes 0.000 description 12
- 108700034637 EC 3.2.-.- Proteins 0.000 description 12
- 102000003960 Ligases Human genes 0.000 description 12
- 108090000364 Ligases Proteins 0.000 description 12
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 11
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 11
- 102000004317 Lyases Human genes 0.000 description 11
- 108090000856 Lyases Proteins 0.000 description 11
- 108091034117 Oligonucleotide Proteins 0.000 description 11
- 102000004169 proteins and genes Human genes 0.000 description 11
- FZWGECJQACGGTI-UHFFFAOYSA-N 2-amino-7-methyl-1,7-dihydro-6H-purin-6-one Chemical compound NC1=NC(O)=C2N(C)C=NC2=N1 FZWGECJQACGGTI-UHFFFAOYSA-N 0.000 description 10
- FSASIHFSFGAIJM-UHFFFAOYSA-N 3-methyladenine Chemical compound CN1C=NC(N)=C2N=CN=C12 FSASIHFSFGAIJM-UHFFFAOYSA-N 0.000 description 10
- UGQMRVRMYYASKQ-UHFFFAOYSA-N Hypoxanthine nucleoside Natural products OC1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 UGQMRVRMYYASKQ-UHFFFAOYSA-N 0.000 description 10
- 230000002093 peripheral effect Effects 0.000 description 9
- 238000007858 polymerase cycling assembly Methods 0.000 description 9
- 238000002360 preparation method Methods 0.000 description 9
- 239000011541 reaction mixture Substances 0.000 description 9
- 239000000758 substrate Substances 0.000 description 9
- OIVLITBTBDPEFK-UHFFFAOYSA-N 5,6-dihydrouracil Chemical compound O=C1CCNC(=O)N1 OIVLITBTBDPEFK-UHFFFAOYSA-N 0.000 description 8
- LRFVTYWOQMYALW-UHFFFAOYSA-N 9H-xanthine Chemical compound O=C1NC(=O)NC2=C1NC=N2 LRFVTYWOQMYALW-UHFFFAOYSA-N 0.000 description 8
- 108020004635 Complementary DNA Proteins 0.000 description 8
- 102000004533 Endonucleases Human genes 0.000 description 8
- 229960000643 adenine Drugs 0.000 description 8
- 150000003833 nucleoside derivatives Chemical class 0.000 description 8
- 125000003835 nucleoside group Chemical group 0.000 description 8
- 231100000241 scar Toxicity 0.000 description 8
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 8
- 108091093088 Amplicon Proteins 0.000 description 7
- 108010036364 Deoxyribonuclease IV (Phage T4-Induced) Proteins 0.000 description 7
- KDCGOANMDULRCW-UHFFFAOYSA-N Purine Natural products N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 7
- 108010064978 Type II Site-Specific Deoxyribonucleases Proteins 0.000 description 7
- 239000003153 chemical reaction reagent Substances 0.000 description 7
- 238000010586 diagram Methods 0.000 description 7
- NKKLCOFTJVNYAQ-UHFFFAOYSA-N formamidopyrimidine Chemical compound O=CNC1=CN=CN=C1 NKKLCOFTJVNYAQ-UHFFFAOYSA-N 0.000 description 7
- 230000035772 mutation Effects 0.000 description 7
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 7
- 238000003860 storage Methods 0.000 description 7
- 235000000346 sugar Nutrition 0.000 description 7
- HPZMWTNATZPBIH-UHFFFAOYSA-N 1-methyladenine Chemical compound CN1C=NC2=NC=NC2=C1N HPZMWTNATZPBIH-UHFFFAOYSA-N 0.000 description 6
- RFLVMTUMFYRZCB-UHFFFAOYSA-N 1-methylguanine Chemical compound O=C1N(C)C(N)=NC2=C1N=CN2 RFLVMTUMFYRZCB-UHFFFAOYSA-N 0.000 description 6
- OVONXEQGWXGFJD-UHFFFAOYSA-N 4-sulfanylidene-1h-pyrimidin-2-one Chemical compound SC=1C=CNC(=O)N=1 OVONXEQGWXGFJD-UHFFFAOYSA-N 0.000 description 6
- JDBGXEHEIRGOBU-UHFFFAOYSA-N 5-hydroxymethyluracil Chemical compound OCC1=CNC(=O)NC1=O JDBGXEHEIRGOBU-UHFFFAOYSA-N 0.000 description 6
- ZLAQATDNGLKIEV-UHFFFAOYSA-N 5-methyl-2-sulfanylidene-1h-pyrimidin-4-one Chemical compound CC1=CNC(=S)NC1=O ZLAQATDNGLKIEV-UHFFFAOYSA-N 0.000 description 6
- 229930010555 Inosine Natural products 0.000 description 6
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 6
- BVIAOQMSVZHOJM-UHFFFAOYSA-N N(6),N(6)-dimethyladenine Chemical compound CN(C)C1=NC=NC2=C1N=CN2 BVIAOQMSVZHOJM-UHFFFAOYSA-N 0.000 description 6
- HYVABZIGRDEKCD-UHFFFAOYSA-N N(6)-dimethylallyladenine Chemical compound CC(C)=CCNC1=NC=NC2=C1N=CN2 HYVABZIGRDEKCD-UHFFFAOYSA-N 0.000 description 6
- 229910019142 PO4 Inorganic materials 0.000 description 6
- 238000010804 cDNA synthesis Methods 0.000 description 6
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 6
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 6
- 229960003786 inosine Drugs 0.000 description 6
- 239000010452 phosphate Substances 0.000 description 6
- CLGFIVUFZRGQRP-UHFFFAOYSA-N 7,8-dihydro-8-oxoguanine Chemical compound O=C1NC(N)=NC2=C1NC(=O)N2 CLGFIVUFZRGQRP-UHFFFAOYSA-N 0.000 description 5
- 101710081048 Endonuclease III Proteins 0.000 description 5
- 108010035344 Thymine DNA Glycosylase Proteins 0.000 description 5
- 102100020779 UV excision repair protein RAD23 homolog B Human genes 0.000 description 5
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 5
- 230000037429 base substitution Effects 0.000 description 5
- 238000004891 communication Methods 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 229910052739 hydrogen Inorganic materials 0.000 description 5
- 239000001257 hydrogen Substances 0.000 description 5
- 238000005192 partition Methods 0.000 description 5
- 150000003230 pyrimidines Chemical class 0.000 description 5
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 4
- LRSASMSXMSNRBT-UHFFFAOYSA-N 5-methylcytosine Chemical compound CC1=CNC(=O)N=C1N LRSASMSXMSNRBT-UHFFFAOYSA-N 0.000 description 4
- 102100035886 Adenine DNA glycosylase Human genes 0.000 description 4
- 108020004705 Codon Proteins 0.000 description 4
- 102100037696 Endonuclease V Human genes 0.000 description 4
- 102100026406 G/T mismatch-specific thymine DNA glycosylase Human genes 0.000 description 4
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 4
- 102000006943 Uracil-DNA Glycosidase Human genes 0.000 description 4
- DRTQHJPVMGBUCF-XVFCMESISA-N Uridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-XVFCMESISA-N 0.000 description 4
- 125000000539 amino acid group Chemical group 0.000 description 4
- 238000012937 correction Methods 0.000 description 4
- 238000013500 data storage Methods 0.000 description 4
- 230000009615 deamination Effects 0.000 description 4
- 238000006481 deamination reaction Methods 0.000 description 4
- 238000006911 enzymatic reaction Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 230000014509 gene expression Effects 0.000 description 4
- 238000003780 insertion Methods 0.000 description 4
- 230000037431 insertion Effects 0.000 description 4
- 238000001668 nucleic acid synthesis Methods 0.000 description 4
- 238000003908 quality control method Methods 0.000 description 4
- 238000007789 sealing Methods 0.000 description 4
- 238000012163 sequencing technique Methods 0.000 description 4
- 230000005783 single-strand break Effects 0.000 description 4
- 239000007858 starting material Substances 0.000 description 4
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 4
- 229940075420 xanthine Drugs 0.000 description 4
- PJXVQPWEQYWHRL-UHFFFAOYSA-N 1-acetyl-4-aminopyrimidin-2-one Chemical compound CC(=O)N1C=CC(N)=NC1=O PJXVQPWEQYWHRL-UHFFFAOYSA-N 0.000 description 3
- SATCOUWSAZBIJO-UHFFFAOYSA-N 1-methyladenine Natural products N=C1N(C)C=NC2=C1NC=N2 SATCOUWSAZBIJO-UHFFFAOYSA-N 0.000 description 3
- WJNGQIYEQLPJMN-IOSLPCCCSA-N 1-methylinosine Chemical compound C1=NC=2C(=O)N(C)C=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O WJNGQIYEQLPJMN-IOSLPCCCSA-N 0.000 description 3
- MXHRCPNRJAMMIM-SHYZEUOFSA-N 2'-deoxyuridine Chemical compound C1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-SHYZEUOFSA-N 0.000 description 3
- WYDKPTZGVLTYPG-UHFFFAOYSA-N 2,8-diamino-3,7-dihydropurin-6-one Chemical compound N1C(N)=NC(=O)C2=C1N=C(N)N2 WYDKPTZGVLTYPG-UHFFFAOYSA-N 0.000 description 3
- HLYBTPMYFWWNJN-UHFFFAOYSA-N 2-(2,4-dioxo-1h-pyrimidin-5-yl)-2-hydroxyacetic acid Chemical compound OC(=O)C(O)C1=CNC(=O)NC1=O HLYBTPMYFWWNJN-UHFFFAOYSA-N 0.000 description 3
- SGAKLDIYNFXTCK-UHFFFAOYSA-N 2-[(2,4-dioxo-1h-pyrimidin-5-yl)methylamino]acetic acid Chemical compound OC(=O)CNCC1=CNC(=O)NC1=O SGAKLDIYNFXTCK-UHFFFAOYSA-N 0.000 description 3
- YSAJFXWTVFGPAX-UHFFFAOYSA-N 2-[(2,4-dioxo-1h-pyrimidin-5-yl)oxy]acetic acid Chemical compound OC(=O)COC1=CNC(=O)NC1=O YSAJFXWTVFGPAX-UHFFFAOYSA-N 0.000 description 3
- CRYCZDRIXVHNQB-UHFFFAOYSA-N 2-amino-8-bromo-3,7-dihydropurin-6-one Chemical compound N1C(N)=NC(=O)C2=C1N=C(Br)N2 CRYCZDRIXVHNQB-UHFFFAOYSA-N 0.000 description 3
- YCFWZXAEOXKNHL-UHFFFAOYSA-N 2-amino-8-chloro-3,7-dihydropurin-6-one Chemical compound N1C(N)=NC(=O)C2=C1N=C(Cl)N2 YCFWZXAEOXKNHL-UHFFFAOYSA-N 0.000 description 3
- DJGMEMUXTWZGIC-UHFFFAOYSA-N 2-amino-8-methyl-3,7-dihydropurin-6-one Chemical compound N1C(N)=NC(=O)C2=C1N=C(C)N2 DJGMEMUXTWZGIC-UHFFFAOYSA-N 0.000 description 3
- MWBWWFOAEOYUST-UHFFFAOYSA-N 2-aminopurine Chemical compound NC1=NC=C2N=CNC2=N1 MWBWWFOAEOYUST-UHFFFAOYSA-N 0.000 description 3
- XMSMHKMPBNTBOD-UHFFFAOYSA-N 2-dimethylamino-6-hydroxypurine Chemical compound N1C(N(C)C)=NC(=O)C2=C1N=CN2 XMSMHKMPBNTBOD-UHFFFAOYSA-N 0.000 description 3
- SMADWRYCYBUIKH-UHFFFAOYSA-N 2-methyl-7h-purin-6-amine Chemical compound CC1=NC(N)=C2NC=NC2=N1 SMADWRYCYBUIKH-UHFFFAOYSA-N 0.000 description 3
- KOLPWZCZXAMXKS-UHFFFAOYSA-N 3-methylcytosine Chemical compound CN1C(N)=CC=NC1=O KOLPWZCZXAMXKS-UHFFFAOYSA-N 0.000 description 3
- GJAKJCICANKRFD-UHFFFAOYSA-N 4-acetyl-4-amino-1,3-dihydropyrimidin-2-one Chemical compound CC(=O)C1(N)NC(=O)NC=C1 GJAKJCICANKRFD-UHFFFAOYSA-N 0.000 description 3
- BLXGZIDBSXVMLU-UHFFFAOYSA-N 5-(2-bromoethenyl)-1h-pyrimidine-2,4-dione Chemical compound BrC=CC1=CNC(=O)NC1=O BLXGZIDBSXVMLU-UHFFFAOYSA-N 0.000 description 3
- MQJSSLBGAQJNER-UHFFFAOYSA-N 5-(methylaminomethyl)-1h-pyrimidine-2,4-dione Chemical compound CNCC1=CNC(=O)NC1=O MQJSSLBGAQJNER-UHFFFAOYSA-N 0.000 description 3
- LQLQRFGHAALLLE-UHFFFAOYSA-N 5-bromouracil Chemical compound BrC1=CNC(=O)NC1=O LQLQRFGHAALLLE-UHFFFAOYSA-N 0.000 description 3
- VKLFQTYNHLDMDP-PNHWDRBUSA-N 5-carboxymethylaminomethyl-2-thiouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=S)NC(=O)C(CNCC(O)=O)=C1 VKLFQTYNHLDMDP-PNHWDRBUSA-N 0.000 description 3
- ZFTBZKVVGZNMJR-UHFFFAOYSA-N 5-chlorouracil Chemical compound ClC1=CNC(=O)NC1=O ZFTBZKVVGZNMJR-UHFFFAOYSA-N 0.000 description 3
- RHIULBJJKFDJPR-UHFFFAOYSA-N 5-ethyl-1h-pyrimidine-2,4-dione Chemical compound CCC1=CNC(=O)NC1=O RHIULBJJKFDJPR-UHFFFAOYSA-N 0.000 description 3
- OFJNVANOCZHTMW-UHFFFAOYSA-N 5-hydroxyuracil Chemical compound OC1=CNC(=O)NC1=O OFJNVANOCZHTMW-UHFFFAOYSA-N 0.000 description 3
- KSNXJLQDQOIRIP-UHFFFAOYSA-N 5-iodouracil Chemical compound IC1=CNC(=O)NC1=O KSNXJLQDQOIRIP-UHFFFAOYSA-N 0.000 description 3
- KELXHQACBIUYSE-UHFFFAOYSA-N 5-methoxy-1h-pyrimidine-2,4-dione Chemical compound COC1=CNC(=O)NC1=O KELXHQACBIUYSE-UHFFFAOYSA-N 0.000 description 3
- JHEKLAXXCHLMNM-UHFFFAOYSA-N 5-propyl-1h-pyrimidine-2,4-dione Chemical compound CCCC1=CNC(=O)NC1=O JHEKLAXXCHLMNM-UHFFFAOYSA-N 0.000 description 3
- DCPSTSVLRXOYGS-UHFFFAOYSA-N 6-amino-1h-pyrimidine-2-thione Chemical compound NC1=CC=NC(S)=N1 DCPSTSVLRXOYGS-UHFFFAOYSA-N 0.000 description 3
- CZJGCEGNCSGRBI-UHFFFAOYSA-N 6-amino-5-ethyl-1h-pyrimidin-2-one Chemical compound CCC1=CNC(=O)N=C1N CZJGCEGNCSGRBI-UHFFFAOYSA-N 0.000 description 3
- NLLCDONDZDHLCI-UHFFFAOYSA-N 6-amino-5-hydroxy-1h-pyrimidin-2-one Chemical compound NC=1NC(=O)N=CC=1O NLLCDONDZDHLCI-UHFFFAOYSA-N 0.000 description 3
- CKOMXBHMKXXTNW-UHFFFAOYSA-N 6-methyladenine Chemical compound CNC1=NC=NC2=C1N=CN2 CKOMXBHMKXXTNW-UHFFFAOYSA-N 0.000 description 3
- FVXHPCVBOXMRJP-UHFFFAOYSA-N 8-bromo-7h-purin-6-amine Chemical compound NC1=NC=NC2=C1NC(Br)=N2 FVXHPCVBOXMRJP-UHFFFAOYSA-N 0.000 description 3
- MSSXOMSJDRHRMC-UHFFFAOYSA-N 9H-purine-2,6-diamine Chemical compound NC1=NC(N)=C2NC=NC2=N1 MSSXOMSJDRHRMC-UHFFFAOYSA-N 0.000 description 3
- 208000010200 Cockayne syndrome Diseases 0.000 description 3
- 102100033697 DNA cross-link repair 1A protein Human genes 0.000 description 3
- 108010060616 DNA-3-methyladenine glycosidase II Proteins 0.000 description 3
- 102100028778 Endonuclease 8-like 1 Human genes 0.000 description 3
- 102100028779 Endonuclease 8-like 2 Human genes 0.000 description 3
- 102100021710 Endonuclease III-like protein 1 Human genes 0.000 description 3
- GHASVSINZRGABV-UHFFFAOYSA-N Fluorouracil Chemical compound FC1=CNC(=O)NC1=O GHASVSINZRGABV-UHFFFAOYSA-N 0.000 description 3
- 101710153534 G/U mismatch-specific DNA glycosylase Proteins 0.000 description 3
- 101001000351 Homo sapiens Adenine DNA glycosylase Proteins 0.000 description 3
- 101000871548 Homo sapiens DNA cross-link repair 1A protein Proteins 0.000 description 3
- 101001123824 Homo sapiens Endonuclease 8-like 1 Proteins 0.000 description 3
- 101001123823 Homo sapiens Endonuclease 8-like 2 Proteins 0.000 description 3
- 101000970385 Homo sapiens Endonuclease III-like protein 1 Proteins 0.000 description 3
- 101000717424 Homo sapiens UV excision repair protein RAD23 homolog B Proteins 0.000 description 3
- SGSSKEDGVONRGC-UHFFFAOYSA-N N(2)-methylguanine Chemical compound O=C1NC(NC)=NC2=C1N=CN2 SGSSKEDGVONRGC-UHFFFAOYSA-N 0.000 description 3
- CBCQWVQNMGNYEO-UHFFFAOYSA-N N(6)-hydroxyadenine Chemical compound ONC1=NC=NC2=C1NC=N2 CBCQWVQNMGNYEO-UHFFFAOYSA-N 0.000 description 3
- 238000012408 PCR amplification Methods 0.000 description 3
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 3
- 230000003044 adaptive effect Effects 0.000 description 3
- 238000003491 array Methods 0.000 description 3
- 239000007795 chemical reaction product Substances 0.000 description 3
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical class NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 3
- MXHRCPNRJAMMIM-UHFFFAOYSA-N desoxyuridine Natural products C1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 MXHRCPNRJAMMIM-UHFFFAOYSA-N 0.000 description 3
- 238000001962 electrophoresis Methods 0.000 description 3
- 229960002949 fluorouracil Drugs 0.000 description 3
- 239000000543 intermediate Substances 0.000 description 3
- 230000008018 melting Effects 0.000 description 3
- 238000002844 melting Methods 0.000 description 3
- GLVAUDGFNGKCSF-UHFFFAOYSA-N mercaptopurine Chemical compound S=C1NC=NC2=C1NC=N2 GLVAUDGFNGKCSF-UHFFFAOYSA-N 0.000 description 3
- IZAGSTRIDUNNOY-UHFFFAOYSA-N methyl 2-[(2,4-dioxo-1h-pyrimidin-5-yl)oxy]acetate Chemical compound COC(=O)COC1=CNC(=O)NC1=O IZAGSTRIDUNNOY-UHFFFAOYSA-N 0.000 description 3
- XJVXMWNLQRTRGH-UHFFFAOYSA-N n-(3-methylbut-3-enyl)-2-methylsulfanyl-7h-purin-6-amine Chemical compound CSC1=NC(NCCC(C)=C)=C2NC=NC2=N1 XJVXMWNLQRTRGH-UHFFFAOYSA-N 0.000 description 3
- 230000036961 partial effect Effects 0.000 description 3
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 3
- 150000004713 phosphodiesters Chemical group 0.000 description 3
- UEZVMMHDMIWARA-UHFFFAOYSA-M phosphonate Chemical compound [O-]P(=O)=O UEZVMMHDMIWARA-UHFFFAOYSA-M 0.000 description 3
- 238000006116 polymerization reaction Methods 0.000 description 3
- 230000001915 proofreading effect Effects 0.000 description 3
- GUKSGXOLJNWRLZ-UHFFFAOYSA-N thymine glycol Chemical compound CC1(O)C(O)NC(=O)NC1=O GUKSGXOLJNWRLZ-UHFFFAOYSA-N 0.000 description 3
- 229960003087 tioguanine Drugs 0.000 description 3
- 230000033587 transcription-coupled nucleotide-excision repair Effects 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- VUDQSRFCCHQIIU-UHFFFAOYSA-N 1-(3,5-dichloro-2,6-dihydroxy-4-methoxyphenyl)hexan-1-one Chemical compound CCCCCC(=O)C1=C(O)C(Cl)=C(OC)C(Cl)=C1O VUDQSRFCCHQIIU-UHFFFAOYSA-N 0.000 description 2
- YLDISKWYZFVQIK-UHFFFAOYSA-N 1-hydroxypyrimidine-2,4-dione Chemical compound ON1C=CC(=O)NC1=O YLDISKWYZFVQIK-UHFFFAOYSA-N 0.000 description 2
- GIMRVVLNBSNCLO-UHFFFAOYSA-N 2,6-diamino-5-formamido-4-hydroxypyrimidine Chemical compound NC1=NC(=O)C(NC=O)C(N)=N1 GIMRVVLNBSNCLO-UHFFFAOYSA-N 0.000 description 2
- XHBSBNYEHDQRCP-UHFFFAOYSA-N 2-amino-3-methyl-3,7-dihydro-6H-purin-6-one Chemical compound O=C1NC(=N)N(C)C2=C1N=CN2 XHBSBNYEHDQRCP-UHFFFAOYSA-N 0.000 description 2
- ASJSAQIRZKANQN-CRCLSJGQSA-N 2-deoxy-D-ribose Chemical compound OC[C@@H](O)[C@@H](O)CC=O ASJSAQIRZKANQN-CRCLSJGQSA-N 0.000 description 2
- DQDFTGKLWKBNCB-UHFFFAOYSA-N 4-amino-1-hydroxypyrimidin-2-one Chemical compound NC=1C=CN(O)C(=O)N=1 DQDFTGKLWKBNCB-UHFFFAOYSA-N 0.000 description 2
- 102100025915 5' exonuclease Apollo Human genes 0.000 description 2
- RGKBRPAAQSHTED-UHFFFAOYSA-N 8-oxoadenine Chemical compound NC1=NC=NC2=C1NC(=O)N2 RGKBRPAAQSHTED-UHFFFAOYSA-N 0.000 description 2
- 102100027938 ATR-interacting protein Human genes 0.000 description 2
- 229930024421 Adenine Natural products 0.000 description 2
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 2
- 229930195730 Aflatoxin Natural products 0.000 description 2
- XWIYFDMXXLINPU-UHFFFAOYSA-N Aflatoxin G Chemical compound O=C1OCCC2=C1C(=O)OC1=C2C(OC)=CC2=C1C1C=COC1O2 XWIYFDMXXLINPU-UHFFFAOYSA-N 0.000 description 2
- QGZKDVFQNNGYKY-UHFFFAOYSA-N Ammonia Chemical compound N QGZKDVFQNNGYKY-UHFFFAOYSA-N 0.000 description 2
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 2
- 101100290380 Caenorhabditis elegans cel-1 gene Proteins 0.000 description 2
- 102100037631 Centrin-2 Human genes 0.000 description 2
- SRBFZHDQGSBBOR-IOVATXLUSA-N D-xylopyranose Chemical compound O[C@@H]1COC(O)[C@H](O)[C@H]1O SRBFZHDQGSBBOR-IOVATXLUSA-N 0.000 description 2
- 102000012698 DDB1 Human genes 0.000 description 2
- 102100021122 DNA damage-binding protein 2 Human genes 0.000 description 2
- 102100031866 DNA excision repair protein ERCC-5 Human genes 0.000 description 2
- 102100031868 DNA excision repair protein ERCC-8 Human genes 0.000 description 2
- 230000033616 DNA repair Effects 0.000 description 2
- 102100029094 DNA repair endonuclease XPF Human genes 0.000 description 2
- 102100035619 DNA-(apurinic or apyrimidinic site) lyase Human genes 0.000 description 2
- 102100039128 DNA-3-methyladenine glycosylase Human genes 0.000 description 2
- 241000224495 Dictyostelium Species 0.000 description 2
- 101100170004 Dictyostelium discoideum repE gene Proteins 0.000 description 2
- 101100170005 Drosophila melanogaster pic gene Proteins 0.000 description 2
- 102100028773 Endonuclease 8-like 3 Human genes 0.000 description 2
- 102000007122 Fanconi Anemia Complementation Group G protein Human genes 0.000 description 2
- 108010033305 Fanconi Anemia Complementation Group G protein Proteins 0.000 description 2
- 102100026121 Flap endonuclease 1 Human genes 0.000 description 2
- 101000720953 Homo sapiens 5' exonuclease Apollo Proteins 0.000 description 2
- 101000880516 Homo sapiens Centrin-2 Proteins 0.000 description 2
- 101001041466 Homo sapiens DNA damage-binding protein 2 Proteins 0.000 description 2
- 101000920778 Homo sapiens DNA excision repair protein ERCC-8 Proteins 0.000 description 2
- 101001123819 Homo sapiens Endonuclease 8-like 3 Proteins 0.000 description 2
- 101000977270 Homo sapiens MMS19 nucleotide excision repair protein homolog Proteins 0.000 description 2
- 101000807668 Homo sapiens Uracil-DNA glycosylase Proteins 0.000 description 2
- 102100023474 MMS19 nucleotide excision repair protein homolog Human genes 0.000 description 2
- BAVYZALUXZFZLV-UHFFFAOYSA-N Methylamine Chemical compound NC BAVYZALUXZFZLV-UHFFFAOYSA-N 0.000 description 2
- 108010086093 Mung Bean Nuclease Proteins 0.000 description 2
- IOVCWXUNBOPUCH-UHFFFAOYSA-N Nitrous acid Chemical class ON=O IOVCWXUNBOPUCH-UHFFFAOYSA-N 0.000 description 2
- 108091081548 Palindromic sequence Proteins 0.000 description 2
- NQRYJNQNLNOLGT-UHFFFAOYSA-N Piperidine Chemical compound C1CCNCC1 NQRYJNQNLNOLGT-UHFFFAOYSA-N 0.000 description 2
- 102100025391 Pre-mRNA-splicing factor SYF1 Human genes 0.000 description 2
- 102000001708 Protein Isoforms Human genes 0.000 description 2
- 108010029485 Protein Isoforms Proteins 0.000 description 2
- 108010090804 Streptavidin Proteins 0.000 description 2
- 102100024855 Three-prime repair exonuclease 1 Human genes 0.000 description 2
- 102100020845 UV excision repair protein RAD23 homolog A Human genes 0.000 description 2
- 101710204645 UV excision repair protein RAD23 homolog B Proteins 0.000 description 2
- 241000700605 Viruses Species 0.000 description 2
- 239000005409 aflatoxin Substances 0.000 description 2
- 230000029936 alkylation Effects 0.000 description 2
- 238000005804 alkylation reaction Methods 0.000 description 2
- HIMXGTXNXJYFGB-UHFFFAOYSA-N alloxan Chemical compound O=C1NC(=O)C(=O)C(=O)N1 HIMXGTXNXJYFGB-UHFFFAOYSA-N 0.000 description 2
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 239000011324 bead Substances 0.000 description 2
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 2
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 2
- DRTQHJPVMGBUCF-PSQAKQOGSA-N beta-L-uridine Natural products O[C@H]1[C@@H](O)[C@H](CO)O[C@@H]1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-PSQAKQOGSA-N 0.000 description 2
- 238000007068 beta-elimination reaction Methods 0.000 description 2
- 230000001588 bifunctional effect Effects 0.000 description 2
- 125000002680 canonical nucleotide group Chemical group 0.000 description 2
- 238000007385 chemical modification Methods 0.000 description 2
- 239000013599 cloning vector Substances 0.000 description 2
- 101150077768 ddb1 gene Proteins 0.000 description 2
- 238000012217 deletion Methods 0.000 description 2
- 230000037430 deletion Effects 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 230000029087 digestion Effects 0.000 description 2
- 230000006862 enzymatic digestion Effects 0.000 description 2
- 238000011156 evaluation Methods 0.000 description 2
- 108010052305 exodeoxyribonuclease III Proteins 0.000 description 2
- 238000001502 gel electrophoresis Methods 0.000 description 2
- 102000054767 gene variant Human genes 0.000 description 2
- 230000000977 initiatory effect Effects 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 238000005304 joining Methods 0.000 description 2
- 238000007834 ligase chain reaction Methods 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 238000004519 manufacturing process Methods 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 238000002493 microarray Methods 0.000 description 2
- 238000005457 optimization Methods 0.000 description 2
- 150000008300 phosphoramidites Chemical class 0.000 description 2
- 229920000642 polymer Polymers 0.000 description 2
- 230000004952 protein activity Effects 0.000 description 2
- 230000000717 retained effect Effects 0.000 description 2
- 230000002269 spontaneous effect Effects 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 238000001308 synthesis method Methods 0.000 description 2
- 230000008685 targeting Effects 0.000 description 2
- 229940104230 thymidine Drugs 0.000 description 2
- DRTQHJPVMGBUCF-UHFFFAOYSA-N uracil arabinoside Natural products OC1C(O)C(CO)OC1N1C(=O)NC(=O)C=C1 DRTQHJPVMGBUCF-UHFFFAOYSA-N 0.000 description 2
- 229940045145 uridine Drugs 0.000 description 2
- CADQNXRGRFJSQY-UOWFLXDJSA-N (2r,3r,4r)-2-fluoro-2,3,4,5-tetrahydroxypentanal Chemical compound OC[C@@H](O)[C@@H](O)[C@@](O)(F)C=O CADQNXRGRFJSQY-UOWFLXDJSA-N 0.000 description 1
- GUKSGXOLJNWRLZ-DUZGATOHSA-N (5r,6r)-5,6-dihydroxy-5-methyl-1,3-diazinane-2,4-dione Chemical compound C[C@@]1(O)[C@@H](O)NC(=O)NC1=O GUKSGXOLJNWRLZ-DUZGATOHSA-N 0.000 description 1
- WWJWZQKUDYKLTK-UHFFFAOYSA-N 1,n6-ethenoadenine Chemical compound C1=NC2=NC=N[C]2C2=NC=CN21 WWJWZQKUDYKLTK-UHFFFAOYSA-N 0.000 description 1
- LMXHFXAFDMNKIM-UHFFFAOYSA-N 1-(2-hydroxyethyl)-5-nitropyrrole-2-carbonitrile Chemical compound OCCN1C(C#N)=CC=C1[N+]([O-])=O LMXHFXAFDMNKIM-UHFFFAOYSA-N 0.000 description 1
- HXDYHJGZXSPLJE-UHFFFAOYSA-N 2,4-dioxopyrimidine-1-carbaldehyde Chemical compound O=CN1C=CC(=O)NC1=O HXDYHJGZXSPLJE-UHFFFAOYSA-N 0.000 description 1
- KMEBCRWKZZSRRT-UHFFFAOYSA-N 2-methyl-7h-purine Chemical compound CC1=NC=C2NC=NC2=N1 KMEBCRWKZZSRRT-UHFFFAOYSA-N 0.000 description 1
- GICKXGZWALFYHZ-UHFFFAOYSA-N 3,N(4)-ethenocytosine Chemical compound O=C1NC=CC2=NC=CN12 GICKXGZWALFYHZ-UHFFFAOYSA-N 0.000 description 1
- 108010034927 3-methyladenine-DNA glycosylase Proteins 0.000 description 1
- FAUCUWNCMBVOPX-UHFFFAOYSA-N 3-methylpurine Chemical compound CN1C=NC=C2N=CN=C12 FAUCUWNCMBVOPX-UHFFFAOYSA-N 0.000 description 1
- UXMXLZZMYVUDLU-UHFFFAOYSA-N 4,5-dihydropyrimidin-4-ol Chemical class OC1CC=NC=N1 UXMXLZZMYVUDLU-UHFFFAOYSA-N 0.000 description 1
- MVYUVUOSXNYQLL-UHFFFAOYSA-N 4,6-diamino-5-formamidopyrimidine Chemical compound NC1=NC=NC(N)=C1NC=O MVYUVUOSXNYQLL-UHFFFAOYSA-N 0.000 description 1
- NBAKTGXDIBVZOO-UHFFFAOYSA-N 5,6-dihydrothymine Chemical compound CC1CNC(=O)NC1=O NBAKTGXDIBVZOO-UHFFFAOYSA-N 0.000 description 1
- NHOKUDODDWSIAJ-UHFFFAOYSA-N 5,6-dihydroxy-1,3-diazinane-2,4-dione Chemical compound OC1NC(=O)NC(=O)C1O NHOKUDODDWSIAJ-UHFFFAOYSA-N 0.000 description 1
- RFKUZJDCLCWCDQ-UHFFFAOYSA-N 5-Hydroxydihydro-2,4(1H,3H)-pyrimidinedione Chemical compound OC1CNC(=O)NC1=O RFKUZJDCLCWCDQ-UHFFFAOYSA-N 0.000 description 1
- UIHWKXHRHOBLKQ-UHFFFAOYSA-N 5-hydroxy-5-methyl-1,3-diazinane-2,4-dione Chemical compound CC1(O)CNC(=O)NC1=O UIHWKXHRHOBLKQ-UHFFFAOYSA-N 0.000 description 1
- CNQHZBFQVYOFGD-UHFFFAOYSA-N 5-hydroxy-5-methylimidazolidine-2,4-dione Chemical compound CC1(O)NC(=O)NC1=O CNQHZBFQVYOFGD-UHFFFAOYSA-N 0.000 description 1
- KBDWGFZSICOZSJ-UHFFFAOYSA-N 5-methyl-2,3-dihydro-1H-pyrimidin-4-one Chemical compound N1CNC=C(C1=O)C KBDWGFZSICOZSJ-UHFFFAOYSA-N 0.000 description 1
- 108010057896 5-methylcytosine-DNA glycosylase Proteins 0.000 description 1
- PGSPUKDWUHBDKJ-UHFFFAOYSA-N 6,7-dihydro-3h-purin-2-amine Chemical compound C1NC(N)=NC2=C1NC=N2 PGSPUKDWUHBDKJ-UHFFFAOYSA-N 0.000 description 1
- 108050007143 AP endonuclease 1 Proteins 0.000 description 1
- 102000018054 AP endonuclease 1 Human genes 0.000 description 1
- 101710165113 ATR-interacting protein Proteins 0.000 description 1
- 101710181213 Adenine DNA glycosylase Proteins 0.000 description 1
- 240000007087 Apium graveolens Species 0.000 description 1
- 235000015849 Apium graveolens Dulce Group Nutrition 0.000 description 1
- 235000010591 Appio Nutrition 0.000 description 1
- 101000884448 Arabidopsis thaliana DNA-(apurinic or apyrimidinic site) endonuclease, chloroplastic Proteins 0.000 description 1
- 101150023831 Atrip gene Proteins 0.000 description 1
- 108090001008 Avidin Proteins 0.000 description 1
- 235000014469 Bacillus subtilis Nutrition 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 241000244203 Caenorhabditis elegans Species 0.000 description 1
- 101100115215 Caenorhabditis elegans cul-2 gene Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 208000032544 Cicatrix Diseases 0.000 description 1
- HMFHBZSHGGEWLO-SOOFDHNKSA-N D-ribofuranose Chemical compound OC[C@H]1OC(O)[C@H](O)[C@@H]1O HMFHBZSHGGEWLO-SOOFDHNKSA-N 0.000 description 1
- 102100035186 DNA excision repair protein ERCC-1 Human genes 0.000 description 1
- 108010035476 DNA excision repair protein ERCC-5 Proteins 0.000 description 1
- 102100031867 DNA excision repair protein ERCC-6 Human genes 0.000 description 1
- 102100029995 DNA ligase 1 Human genes 0.000 description 1
- 230000007018 DNA scission Effects 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 108010000577 DNA-Formamidopyrimidine Glycosylase Proteins 0.000 description 1
- 230000004568 DNA-binding Effects 0.000 description 1
- 108010046855 DNA-deoxyinosine glycosidase Proteins 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 241001397104 Dima Species 0.000 description 1
- 108700043035 Drosophila Rrp1 Proteins 0.000 description 1
- 208000037595 EN1-related dorsoventral syndrome Diseases 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 101710086833 Endonuclease 4 homolog Proteins 0.000 description 1
- 101710094010 Endonuclease II Proteins 0.000 description 1
- 101710190882 Endonuclease V Proteins 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 108010007577 Exodeoxyribonuclease I Proteins 0.000 description 1
- 108010046914 Exodeoxyribonuclease V Proteins 0.000 description 1
- 102100029075 Exonuclease 1 Human genes 0.000 description 1
- 102100037091 Exonuclease V Human genes 0.000 description 1
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 1
- 101800001941 Hippocampal cholinergic neurostimulating peptide Proteins 0.000 description 1
- 101000697966 Homo sapiens ATR-interacting protein Proteins 0.000 description 1
- 101000851684 Homo sapiens Chimeric ERCC6-PGBD3 protein Proteins 0.000 description 1
- 101000897441 Homo sapiens Cyclin-O Proteins 0.000 description 1
- 101000876529 Homo sapiens DNA excision repair protein ERCC-1 Proteins 0.000 description 1
- 101000920783 Homo sapiens DNA excision repair protein ERCC-6 Proteins 0.000 description 1
- 101000863770 Homo sapiens DNA ligase 1 Proteins 0.000 description 1
- 101000806846 Homo sapiens DNA-(apurinic or apyrimidinic site) endonuclease Proteins 0.000 description 1
- 101001137256 Homo sapiens DNA-(apurinic or apyrimidinic site) lyase Proteins 0.000 description 1
- 101000729474 Homo sapiens DNA-directed RNA polymerase I subunit RPA1 Proteins 0.000 description 1
- 101000650600 Homo sapiens DNA-directed RNA polymerase I subunit RPA2 Proteins 0.000 description 1
- 101000880860 Homo sapiens Endonuclease V Proteins 0.000 description 1
- 101000913035 Homo sapiens Flap endonuclease 1 Proteins 0.000 description 1
- 101000619640 Homo sapiens Leucine-rich repeats and immunoglobulin-like domains protein 1 Proteins 0.000 description 1
- 101000825217 Homo sapiens Meiotic recombination protein SPO11 Proteins 0.000 description 1
- 101000647571 Homo sapiens Pre-mRNA-splicing factor SYF1 Proteins 0.000 description 1
- 101000709305 Homo sapiens Replication protein A 14 kDa subunit Proteins 0.000 description 1
- 101001092206 Homo sapiens Replication protein A 32 kDa subunit Proteins 0.000 description 1
- 101001092125 Homo sapiens Replication protein A 70 kDa DNA-binding subunit Proteins 0.000 description 1
- 101000830956 Homo sapiens Three-prime repair exonuclease 1 Proteins 0.000 description 1
- 101000717428 Homo sapiens UV excision repair protein RAD23 homolog A Proteins 0.000 description 1
- 101000777240 Micrococcus luteus (strain ATCC 4698 / DSM 20030 / JCM 1464 / NBRC 3333 / NCIMB 9278 / NCTC 2665 / VKM Ac-2230) Ultraviolet N-glycosylase/AP lyase Proteins 0.000 description 1
- 241001291091 Mimivirus Species 0.000 description 1
- ICMSBKUUXMDWAQ-UHFFFAOYSA-N N3-Methyladenine Chemical compound CN1C=NC(N)C2=C1N=CN2 ICMSBKUUXMDWAQ-UHFFFAOYSA-N 0.000 description 1
- 241000221960 Neurospora Species 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 239000012807 PCR reagent Substances 0.000 description 1
- 101000889620 Plutella xylostella Aminopeptidase N Proteins 0.000 description 1
- 101710128131 Probable endonuclease 4 Proteins 0.000 description 1
- 101710155288 Putative endonuclease 4 Proteins 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 102100034372 Replication protein A 14 kDa subunit Human genes 0.000 description 1
- 102100035525 Replication protein A 32 kDa subunit Human genes 0.000 description 1
- 102100035729 Replication protein A 70 kDa DNA-binding subunit Human genes 0.000 description 1
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 1
- 101000777243 Schizosaccharomyces pombe (strain 972 / ATCC 24843) UV-damage endonuclease Proteins 0.000 description 1
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 1
- 101150061787 Trex1 gene Proteins 0.000 description 1
- 241000012469 Trimerotropis maritima Species 0.000 description 1
- 101150013568 US16 gene Proteins 0.000 description 1
- 101710204576 UV excision repair protein RAD23 homolog A Proteins 0.000 description 1
- 101710163493 Uracil-DNA glycosylase 1 Proteins 0.000 description 1
- 241000221566 Ustilago Species 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 239000008186 active pharmaceutical agent Substances 0.000 description 1
- 108010039040 adenine glycosylase Proteins 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 125000001931 aliphatic group Chemical group 0.000 description 1
- 125000000217 alkyl group Chemical group 0.000 description 1
- 239000002168 alkylating agent Substances 0.000 description 1
- 229940100198 alkylating agent Drugs 0.000 description 1
- 125000003275 alpha amino acid group Chemical group 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 150000001412 amines Chemical class 0.000 description 1
- 150000001413 amino acids Chemical class 0.000 description 1
- 229910021529 ammonia Inorganic materials 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- PYMYPHUHKUWMLA-WDCZJNDASA-N arabinose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 1
- 210000004436 artificial bacterial chromosome Anatomy 0.000 description 1
- 210000001106 artificial yeast chromosome Anatomy 0.000 description 1
- 230000033590 base-excision repair Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 125000001369 canonical nucleoside group Chemical group 0.000 description 1
- 239000013043 chemical agent Substances 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 230000002153 concerted effect Effects 0.000 description 1
- 238000005520 cutting process Methods 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 238000013499 data model Methods 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 108010004031 deoxyribonuclease A Proteins 0.000 description 1
- 230000008021 deposition Effects 0.000 description 1
- 230000001627 detrimental effect Effects 0.000 description 1
- ANCLJVISBRWUTR-UHFFFAOYSA-N diaminophosphinic acid Chemical compound NP(N)(O)=O ANCLJVISBRWUTR-UHFFFAOYSA-N 0.000 description 1
- RJBIAAZJODIFHR-UHFFFAOYSA-N dihydroxy-imino-sulfanyl-$l^{5}-phosphane Chemical compound NP(O)(O)=S RJBIAAZJODIFHR-UHFFFAOYSA-N 0.000 description 1
- 239000000539 dimer Substances 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-K dioxido-sulfanylidene-sulfido-$l^{5}-phosphane Chemical compound [O-]P([O-])([S-])=S NAGJZTKCGNOGPW-UHFFFAOYSA-K 0.000 description 1
- 230000009977 dual effect Effects 0.000 description 1
- 108010064144 endodeoxyribonuclease VII Proteins 0.000 description 1
- 230000009483 enzymatic pathway Effects 0.000 description 1
- 150000002170 ethers Chemical class 0.000 description 1
- 210000003527 eukaryotic cell Anatomy 0.000 description 1
- 108010086271 exodeoxyribonuclease II Proteins 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 239000012467 final product Substances 0.000 description 1
- 230000002538 fungal effect Effects 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 239000007789 gas Substances 0.000 description 1
- 125000005843 halogen group Chemical group 0.000 description 1
- 150000002402 hexoses Chemical class 0.000 description 1
- 229920001519 homopolymer Polymers 0.000 description 1
- 102000050321 human APEX1 Human genes 0.000 description 1
- 210000000688 human artificial chromosome Anatomy 0.000 description 1
- 108010061664 human oxoguanine glycosylase 1 Proteins 0.000 description 1
- 102000012201 human oxoguanine glycosylase 1 Human genes 0.000 description 1
- QAOWNCQODCNURD-UHFFFAOYSA-M hydrogensulfate Chemical compound OS([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-M 0.000 description 1
- 230000007062 hydrolysis Effects 0.000 description 1
- 238000006460 hydrolysis reaction Methods 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 238000000126 in silico method Methods 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 230000005865 ionizing radiation Effects 0.000 description 1
- 238000006977 lyase reaction Methods 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 238000013178 mathematical model Methods 0.000 description 1
- 230000037353 metabolic pathway Effects 0.000 description 1
- 125000002496 methyl group Chemical group [H]C([H])([H])* 0.000 description 1
- 230000001035 methylating effect Effects 0.000 description 1
- YACKEPLHDIMKIO-UHFFFAOYSA-N methylphosphonic acid Chemical compound CP(O)(O)=O YACKEPLHDIMKIO-UHFFFAOYSA-N 0.000 description 1
- 230000033607 mismatch repair Effects 0.000 description 1
- 231100000219 mutagenic Toxicity 0.000 description 1
- 230000003505 mutagenic effect Effects 0.000 description 1
- 238000007857 nested PCR Methods 0.000 description 1
- 238000002515 oligonucleotide synthesis Methods 0.000 description 1
- 229910000489 osmium tetroxide Inorganic materials 0.000 description 1
- 239000012285 osmium tetroxide Substances 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 230000004792 oxidative damage Effects 0.000 description 1
- 230000001590 oxidative effect Effects 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 230000012846 protein folding Effects 0.000 description 1
- 230000004850 protein–protein interaction Effects 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 230000005855 radiation Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000012552 review Methods 0.000 description 1
- 108091092562 ribozyme Proteins 0.000 description 1
- 229920006395 saturated elastomer Polymers 0.000 description 1
- 230000037387 scars Effects 0.000 description 1
- 238000000926 separation method Methods 0.000 description 1
- 229910052710 silicon Inorganic materials 0.000 description 1
- 239000010703 silicon Substances 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 239000002344 surface layer Substances 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 239000001226 triphosphate Substances 0.000 description 1
- 235000011178 triphosphate Nutrition 0.000 description 1
- 241001515965 unidentified phage Species 0.000 description 1
- 201000008549 xeroderma pigmentosum group E Diseases 0.000 description 1
- 108010073629 xeroderma pigmentosum group F protein Proteins 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
- C12N15/1031—Mutagenizing nucleic acids mutagenesis by gene assembly, e.g. assembly by oligonucleotide extension PCR
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
- C12N15/1027—Mutagenizing nucleic acids by DNA shuffling, e.g. RSR, STEP, RPR
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1093—General methods of preparing gene libraries, not provided for in other subgroups
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/66—General methods for inserting a gene into a vector to form a recombinant vector using cleavage and ligation; Use of non-functional linkers or adaptors, e.g. linkers containing the sequence for a restriction endonuclease
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/26—Preparation of nitrogen-containing carbohydrates
- C12P19/28—N-glycosides
- C12P19/30—Nucleotides
- C12P19/34—Polynucleotides, e.g. nucleic acids, oligoribonucleotides
Definitions
- nucleic acid synthesis is a powerful tool for basic biological research and biotechnology applications. While various methods are known for the synthesis of relatively short fragments of nucleic acids on a small scale, these techniques suffer from limitations in scalability, automation, speed, accuracy, and cost. In many cases, the assembly of nucleic acids from shorter segments is limited by the availability of non-degenerate overhangs that can be annealed to join the segments.
- nucleic acid assembly comprising: providing a predetermined nucleic acid sequence; providing a plurality of precursor double-stranded nucleic acid fragments, each precursor double-stranded nucleic acid fragment having two strands, wherein each of the two strands comprises a sticky end sequence of 5′-A (N x ) T-3′ (SEQ ID NO.: 1) or 5′-G (N x )C-3′ (SEQ ID NO.: 16), wherein N is a nucleotide, wherein x is the number of nucleotides between nucleotides A and T or between G and C, and wherein x is 1 to 10, and wherein no more than two precursor double-stranded nucleic acid fragments comprise the same sticky end sequence; providing primers comprising a nicking endonuclease recognition site and a sequence comprising (i) 5′-A (N x ) U-3′ (SEQ ID NO.: 80) corresponding to each of the
- Methods are further provided wherein x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 100 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 25 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is 2 kb to 20 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is at least 2 kb in length. Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 100 bases in length. Methods are further provided wherein the double-stranded nucleic acid fragments are each at least 500 bases in length.
- double-stranded nucleic acid fragments are each at least 1 kb in length. Methods are further provided wherein the double-stranded nucleic acid fragments are each at least 20 kb in length. Methods are further provided wherein the sticky ends are at least 4 bases long. Methods are further provided wherein the sticky ends are 6 bases long.
- step c further comprises providing (i) a forward primer comprising, in order 5′ to 3′: a first outer adaptor region and nucleic acid sequence from a first terminal portion of predetermined nucleic acid sequence; and (ii) a reverse primer, comprising, in order 5′ to 3′: a second outer adaptor region and nucleic acid sequence from a second terminal portion of predetermined nucleic acid sequence.
- a forward primer comprising, in order 5′ to 3′: a first outer adaptor region and nucleic acid sequence from a first terminal portion of predetermined nucleic acid sequence
- a reverse primer comprising, in order 5′ to 3′: a second outer adaptor region and nucleic acid sequence from a second terminal portion of predetermined nucleic acid sequence.
- the annealed double-stranded nucleic acid fragments comprise the first outer adaptor region and the second outer adaptor region.
- the nicking and cleavage reagents comprise a nicking endonuclease.
- the nicking endonuclease comprises endonuclease VIII.
- Methods are further provided wherein the nicking endonuclease is selected from the list consisting of Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII.
- Methods are further provided wherein the method further comprises ligating the annealed double-stranded nucleic acid fragments.
- annealing comprises: thermocycling between a maximum and a minimum temperature, thereby generating a first overhang from a first double-stranded DNA fragment and a second overhang from a second double-stranded DNA fragment, wherein the first and the second overhangs are complementary; hybridizing the first and second overhangs to each other; and ligating.
- a polymerase lacking 3′ to 5′ proofreading activity is added during the polynucleotide extension reaction.
- the polymerase is a Family A polymerase.
- the polymerase is a Family B high fidelity polymerase engineered to tolerate base pairs comprising uracil.
- Methods are further provided wherein the precursor double-stranded nucleic acid fragments comprise an adaptor sequence comprising the nicking endonuclease recognition site. Methods are further provided wherein one of the plurality of precursor double-stranded nucleic acid fragments is a linear vector. In some aspects, provided herein is a nucleic acid library generated by any of the aforementioned methods.
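The sticky end motifs named above, 5′-A(N x)T-3′ and 5′-G(N x)C-3′, can be located in a predetermined sequence with a simple pattern scan. The following is a minimal sketch, not the method of the disclosure: the function names are hypothetical, only the four canonical bases are handled, and overlapping sites are reported. It also illustrates that the reverse complement of an A(N x)T site again starts with A and ends with T, so the same motif class appears on both strands.

```python
import re

# Complement table for the four canonical DNA bases (an assumption of this sketch).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sticky_end_motifs(sequence: str, x: int = 4):
    """Return (position, motif) for every A(N_x)T or G(N_x)C site.

    A lookahead is used so that overlapping sites are all reported.
    """
    if not 1 <= x <= 10:
        raise ValueError("x must be between 1 and 10")
    pattern = re.compile(r"(?=(A[ACGT]{%d}T|G[ACGT]{%d}C))" % (x, x))
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    for position, motif in find_sticky_end_motifs("TTAACCGTTT", x=4):
        # The reverse complement keeps the A...T (or G...C) framing.
        print(position, motif, reverse_complement(motif))
```

Candidate sites found this way would still need to be filtered so that each sticky end sequence is used at only one junction.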
- nucleic acid assembly comprising: providing a predetermined nucleic acid sequence; synthesizing a plurality of precursor double-stranded nucleic acid fragments, each precursor double-stranded nucleic acid fragment having two strands, wherein each of the two strands comprises a sticky end sequence of 5′-A (Nx) T-3′ (SEQ ID NO.: 1) or 5′-G (Nx)C-3′ (SEQ ID NO.: 16), wherein N is a nucleotide, wherein x is the number of nucleotides between nucleotides A and T or between G and C, and wherein x is 1 to 10, and wherein no more than two precursor double-stranded nucleic acid fragments comprise the same sticky end sequence; providing primers comprising a nicking endonuclease recognition site and a sequence comprising (i) 5′-A (Nx) M-3′ (SEQ ID NO.: 82) corresponding to each of the different sticky end sequence;
- x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. Methods are further provided wherein x is 4. Methods are further provided wherein the non-canonical base is uracil, inosine, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, acetylcytosine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N-6-isopentenyl adenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 1-methyladenine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, 5-ethylcytos
- non-canonical base is incorporated into the double-stranded nucleic acid fragments by performing a nucleic acid extension reaction from a primer comprising the non-canonical nucleotide.
- the non-canonical base is a uracil.
- the uracil is in a deoxyuridine-deoxyadenosine base pair.
- the primers are 10 to 30 bases in length.
- one of the plurality of precursor double-stranded nucleic acid fragments comprises a portion of linear vector.
- no more than 2 N nucleotides of the sticky end sequence have the same identity.
- the precursor double-stranded nucleic acid fragments comprise an adaptor sequence comprising the nicking endonuclease recognition site.
- the predetermined nucleic acid sequence is 1 kb to 100 kb in length.
- the plurality of precursor nucleic acid fragments are each at least 100 bases in length.
- the sticky ends are at least 4 bases long in each precursor nucleic acid.
- a nucleic acid library generated by any of the aforementioned methods.
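Several of the primer constraints listed above can be checked mechanically. The sketch below is illustrative only: the function names, the adaptor string, and the annealing region are hypothetical placeholders, and the reading of "no more than 2 N nucleotides of the sticky end sequence have the same identity" as "no base occurs more than twice among the N positions" is an assumption. The primer is laid out 5′ to 3′ as adaptor (carrying the nicking endonuclease recognition site), then the A(N x)U motif, then a template-annealing region.

```python
def to_uracil_motif(motif: str) -> str:
    """Turn 5'-A(N_x)T-3' into 5'-A(N_x)U-3' by replacing the terminal T with U."""
    motif = motif.upper()
    if not (motif.startswith("A") and motif.endswith("T")):
        raise ValueError("expected a 5'-A(N_x)T-3' motif")
    return motif[:-1] + "U"


def n_identity_ok(motif: str, max_same: int = 2) -> bool:
    """True if no base occurs more than max_same times among the inner N positions.

    This is one interpretation of the constraint, not a statement of the claim scope.
    """
    inner = motif.upper()[1:-1]
    return all(inner.count(base) <= max_same for base in set(inner))


def build_primer(adaptor: str, motif: str, annealing: str,
                 min_len: int = 10, max_len: int = 30) -> str:
    """Assemble adaptor + A(N_x)U + annealing region and check the stated limits."""
    if not n_identity_ok(motif):
        raise ValueError("a base occurs more than twice among the N positions")
    primer = adaptor.upper() + to_uracil_motif(motif) + annealing.upper()
    if not min_len <= len(primer) <= max_len:
        raise ValueError("primer length outside the 10 to 30 base range")
    return primer


if __name__ == "__main__":
    # Placeholder adaptor and annealing sequences; substitute real ones as needed.
    print(build_primer("GCATCCTCAGC", "AACGTT", "GGTACC"))
```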
- nucleic acid assembly comprising: providing a predetermined nucleic acid sequence; synthesizing a plurality of single-stranded nucleic acid fragments, wherein each single-stranded nucleic acid fragment encodes for a portion of the predetermined nucleic acid sequence and comprises at least one sticky end motif, wherein the sticky end motif comprises a sequence of 5′-A(N x )T-3′ (SEQ ID NO.: 1) or 5′-G(N x )C-3′ (SEQ ID NO.: 16) in the predetermined nucleic acid sequence, wherein N is a nucleotide, wherein x is the number of nucleotides between nucleotides A and T or between G and C, and wherein x is 1 to 10, and wherein no more than two single-stranded nucleic acid fragments comprise the same sticky end sequence; amplifying the plurality of single-stranded nucleic acid fragments to generate a plurality of double-stranded nucleic
- Methods are further provided wherein x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 100 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 25 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is 2 kb to 20 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is at least 2 kb in length. Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 100 bases in length. Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 500 bases in length.
- Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 1 kb in length. Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 20 kb in length. Methods are further provided wherein the sticky ends are at least 4 bases long. Methods are further provided wherein the sticky ends are 6 bases long.
- non-canonical base is uracil, inosine, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, acetylcytosine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N-6-isopentenyl adenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 1-methyladenine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, 5-ethylcytosine, N6-adenine, N6-methyladenine, N,N-dimethyladenine, 8-brom
- non-canonical base is incorporated into the double-stranded nucleic acid by performing a nucleic acid extension reaction from a primer comprising the non-canonical nucleotide.
- the non-canonical base is a uracil.
- the uracil is in a deoxyuridine-deoxyadenosine base pair.
- the nicking recognition site is a nicking endonuclease recognition site.
- the distance between the non-canonical base and the nicking enzyme cleavage site is less than 12 base pairs.
- Methods are further provided wherein the distance between the non-canonical base and the nicking enzyme cleavage site is at least 5 base pairs. Methods are further provided wherein the first nicking enzyme comprises a base excision activity. Methods are further provided wherein the first nicking enzyme comprises uracil-DNA glycosylase (UDG). Methods are further provided wherein the first nicking enzyme comprises an AP endonuclease. Methods are further provided wherein the first nicking enzyme comprises endonuclease VIII. Methods are further provided wherein the second nicking enzyme is a nicking endonuclease.
- nicking endonuclease is selected from the list consisting of Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII.
- each of the plurality of double-stranded nucleic acid fragments further comprises two sticky ends.
- each of the two sticky ends has a different sequence from the other.
- the sticky ends comprise a 3′ overhang.
- Methods are further provided wherein the method further comprises ligating the annealed double-stranded nucleic acid fragments.
- annealing comprises: thermocycling between a maximum and a minimum temperature, thereby generating a first overhang from a first double-stranded DNA fragment and a second overhang from a second double-stranded DNA fragment, wherein the first and the second overhangs are complementary; hybridizing the first and second overhangs to each other; and ligating.
- Methods are further provided wherein the annealed double-stranded nucleic acid fragments comprise a 5′ outer adaptor region and a 3′ outer adaptor region.
- Methods are further provided wherein at least two non-identical single-stranded nucleic acid fragments are synthesized. Methods are further provided wherein at least 5 non-identical single-stranded nucleic acid fragments are synthesized. Methods are further provided wherein at least 20 non-identical single-stranded nucleic acid fragments are synthesized. Methods are further provided wherein a polymerase lacking 3′ to 5′ proofreading activity is added during the amplification step. Methods are further provided wherein the polymerase is a Family A polymerase. Methods are further provided wherein the polymerase is a Family B high fidelity polymerase engineered to tolerate base pairs comprising uracil. Methods are further provided wherein the amplified plurality of single-stranded nucleic acid fragments are not naturally occurring. Provided herein are nucleic acid libraries generated by any of the aforementioned methods.
- DNA libraries comprising n DNA fragments, each comprising a first strand and a second strand, each of the n DNA fragments comprising, in order 5′ to 3′: a first nicking endonuclease recognition site, a first sticky end motif, a template region, a second sticky end motif, and a second nicking endonuclease recognition site, wherein the first sticky end motif comprises a sequence of 5′-A (N x ) i,1 U-3′ (SEQ ID NO.: 13) in the first strand; and wherein the second sticky end motif comprises a sequence of 5′-A (N x ) i,2 U-3′ (SEQ ID NO.: 14) in the second strand; wherein N x denotes x nucleosides, wherein (N x ) i,2 is reverse complementary to (N x ) i,1 and different from every other N′ found in any sticky end motif sequence within the fragment library, wherein the first nicking
- Libraries are further provided wherein the first nicking endonuclease recognition site, the first sticky end motif, the variable insert, the second sticky end motif site, and the second nicking endonuclease recognition site are ordered as recited.
- Libraries are further provided wherein the library further comprises a starter DNA fragment comprising a template region, a second sticky end motif, and a second nicking endonuclease recognition site; wherein the second sticky end motif comprises a sequence of 5′-A (N x ) s,2 (SEQ ID NO.: 20) and wherein (N x ) s,2 is reverse complementary to (N x ) 1,1 .
- Libraries are further provided wherein the library further comprises a finishing DNA fragment comprising a first nicking endonuclease recognition site, a first sticky end motif, and a template region; wherein the first sticky end motif comprises a sequence of 5′-A (N x ) f,1 U-3′ (SEQ ID NO.: 21) and wherein (N x ) f,1 is reverse complementary to (N x ) n,2 . Libraries are further provided wherein the first and second nicking endonuclease recognition sites are the same. Libraries are further provided wherein n is at least 2. Libraries are further provided wherein n is less than 10. Libraries are further provided wherein x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10.
- Libraries are further provided wherein x is 4. Libraries are further provided wherein the template region of each of the n DNA fragments encodes a different nucleic acid sequence from the template region of every other of the n DNA fragments. Libraries are further provided wherein the sequences of the n DNA fragments are not naturally occurring. Libraries are further provided wherein the first nicking endonuclease recognition site is not naturally adjacent to the first sticky end motif.
- FIG. 1 depicts a workflow through which a nucleic acid product is assembled from 1 kbp nucleic acid fragments.
- FIG. 2 depicts the assembly of a longer nucleic acid fragment from the ligation of two oligonucleic acid fragments having complementary overhangs and discloses SEQ ID NOs.: 4, 6, 3, 5, 3, 6, 3 and 6, respectively, in order of appearance.
- FIG. 3 depicts a uracil-containing universal primer pair, and discloses SEQ ID NOs.: 7, 2, 8 and 2, respectively, in order of appearance.
- FIG. 4 depicts the assembly of a nucleic acid product from oligonucleic acid fragments having complementary overhangs.
- FIGS. 5A-5B depict the assembly of a recombinatorial library from a library of nucleic acid fragments each having at least one unspecified base.
- FIG. 6 depicts a diagram of steps demonstrating a process workflow for oligonucleic acid synthesis and assembly.
- FIG. 7 illustrates an example of a computer system.
- FIG. 8 is a block diagram illustrating an example architecture of a computer system.
- FIG. 9 is a diagram demonstrating a network configured to incorporate a plurality of computer systems, a plurality of cell phones and personal data assistants, and Network Attached Storage (NAS).
- FIG. 10 is a block diagram of a multiprocessor computer system using a shared virtual address memory space.
- FIG. 11 shows an image of an electrophoresis gel resolving amplicons of a LacZ gene assembled in a plasmid using scar-free assembly methods described herein.
- nucleic acid fragments into longer nucleic acid molecules of desired predetermined sequence and length without leaving inserted nucleic acid sequence at assembly points, aka “scar” sequence.
- amplification steps are provided during the synthesis of the fragments which provide a means for increasing the mass of a long nucleic acid sequence to be amplified by amplifying the shorter fragments and then rejoining them in a processive manner such that the long nucleic acid is assembled.
- the term “about” in reference to a number or range of numbers is understood to mean the stated number and numbers +/-10% thereof, or 10% below the lower listed limit and 10% above the higher listed limit for the values listed for a range.
- the terms “preselected sequence”, “predefined sequence” or “predetermined sequence” are used interchangeably. The terms mean that the sequence of the polymer is known and chosen before synthesis or assembly of the polymer. In particular, various aspects of the invention are described herein primarily with regard to the preparation of nucleic acid molecules, the sequence of the oligonucleotide or polynucleotide being known and chosen before the synthesis or assembly of the nucleic acid molecules.
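- The definition of “about” given above amounts to simple arithmetic on the bounds. The following is a minimal sketch of that arithmetic; the function name `about` and its signature are hypothetical and are not part of the disclosure, and positive quantities are assumed.

```python
def about(low, high=None):
    """Return the (lower, upper) bounds implied by "about".

    A single value v maps to (0.9 * v, 1.1 * v). A range low..high maps to
    (0.9 * low, 1.1 * high), i.e. 10% below the lower limit and 10% above
    the higher limit. Positive quantities are assumed in this sketch.
    """
    if high is None:
        high = low
    if low <= 0 or high < low:
        raise ValueError("expected positive values with low <= high")
    return 0.9 * low, 1.1 * high


if __name__ == "__main__":
    print(about(200))       # bounds for "about 200 bp"
    print(about(100, 500))  # bounds for "about 100 to 500 bases"
```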
- nucleic acid refers broadly to any type of coding or non-coding, long polynucleotide or polynucleotide analog.
- complementary refers to the capacity for precise pairing between two nucleotides. If a nucleotide at a given position of a nucleic acid is capable of hydrogen bonding with a nucleotide of another nucleic acid, then the two nucleic acids are considered to be complementary to one another (or, more specifically in some usage, “reverse complementary”) at that position.
- Complementarity between two single-stranded nucleic acid molecules may be “partial,” in which only some of the nucleotides bind, or it may be complete when total complementarity exists between the single-stranded molecules.
- the degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands.
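- The distinction between partial and complete complementarity can be illustrated with a short sketch. It assumes plain Watson-Crick pairing between A/T and G/C on two equal-length DNA strands read antiparallel; it does not model Hoogsteen pairing, mismatches with partial stability, or hybridization thermodynamics. The function names are hypothetical.

```python
# Watson-Crick partners for the four canonical DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def complementarity(strand_a, strand_b):
    """Fraction of positions at which strand_a pairs with strand_b.

    Both strands are written 5' to 3'; strand_b is reversed so that the two
    strands are compared in antiparallel orientation.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch compares equal-length strands only")
    partner = strand_b.upper()[::-1]
    paired = sum(COMPLEMENT[a] == b for a, b in zip(strand_a.upper(), partner))
    return paired / len(strand_a)


if __name__ == "__main__":
    print(complementarity("AAGTCT", "AGACTT"))  # complete complementarity
    print(complementarity("AAGTCT", "AGACTA"))  # partial complementarity
```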
- Hybridization and “annealing” refer to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.
- the term “hybridized” as applied to a polynucleotide is a polynucleotide in a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.
- the hydrogen bonding may occur by Watson Crick base pairing, Hoogsteen binding, or in any other sequence specific manner.
- the complex may comprise two strands forming a duplex structure, three or more strands forming a multi stranded complex, a single self-hybridizing strand, or any combination of these.
- a hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR or other amplification reactions, or the enzymatic cleavage of a polynucleotide by a ribozyme.
- a first sequence that can be stabilized via hydrogen bonding with the bases of the nucleotide residues of a second sequence is said to be “hybridizable” to the second sequence.
- the second sequence can also be said to be hybridizable to the first sequence.
- a sequence hybridized with a given sequence is the “complement” of the given sequence.
- a “target nucleic acid” is a desired molecule of predetermined sequence to be synthesized, and any fragment thereof.
- primer refers to an oligonucleotide that is capable of hybridizing (also termed “annealing”) with a nucleic acid and serving as an initiation site for nucleotide (RNA or DNA) polymerization under appropriate conditions (i.e. in the presence of four different nucleoside triphosphates and an agent for polymerization, such as DNA or RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature.
- the appropriate length of a primer depends on the intended use of the primer. In some instances, primers are at least 7 nucleotides long. In some instances, primers range from 7 to 70 nucleotides, 10 to 30 nucleotides, or from 15 to 30 nucleotides in length.
- primers are from 30 to 50 or 40 to 70 nucleotides long. Oligonucleotides of various lengths as further described herein are used as primers or precursor fragments for amplification and/or gene assembly reactions.
- “primer length” refers to the portion of an oligonucleotide or nucleic acid that hybridizes to a complementary “target” sequence and primes nucleotide synthesis. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with a template.
- primer site or “primer binding site” refers to the segment of the target nucleic acid to which a primer hybridizes.
- An exemplary workflow illustrating the generation of a target nucleic acid using a scar-free nucleic acid assembly method is shown in FIG. 1 .
- the predetermined sequence of a double-stranded target nucleic acid 100 is analyzed to find short sequences, such as sequences of 3, 4, 5, 6, 7, 8, 9, or 10 bases, to serve as sticky end motifs 101 a - 101 g .
- Each sticky end motif 101 a - 101 g identified in the target nucleic acid need not comprise a sequence unique from another sequence in the target nucleic acid, but each sticky end sequence involved in target nucleic acid assembly is used only once, that is, at only one pair of precursor nucleic acid fragment ends.
- a sticky end motif comprises the sequence A(N x )T (SEQ ID NO.: 1), wherein x indicates from about 1 to about 10, N deoxyribonucleic acid bases of any sequence. For example, x is 4, 5 or 6 and each N may be the same or different from another N in the motif. In some cases, a sticky end motif comprises an ANNNNT (SEQ ID NO.: 2) sequence.
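- The search for candidate sticky end motifs of the form A(N x )T described above can be sketched as a simple pattern scan. This is an illustrative sketch only: the function name `find_sticky_end_motifs` is hypothetical, only one strand is scanned, and the choice of which candidates to use (each at only one junction) is left to the caller.

```python
import re


def find_sticky_end_motifs(target, x=4):
    """Return (position, motif) for every A(N_x)T occurrence in target.

    A lookahead is used so that overlapping candidates are also reported.
    Positions are 0-based offsets into the 5' to 3' sequence.
    """
    pattern = re.compile(r"(?=(A[ACGT]{%d}T))" % x)
    return [(m.start(), m.group(1)) for m in pattern.finditer(target.upper())]


if __name__ == "__main__":
    # Toy sequence containing two ANNNNT candidates.
    for position, motif in find_sticky_end_motifs("GGAATGCTCCAAGTCTGG"):
        print(position, motif)
```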
- the fragments are synthesized 115 with the sticky end motifs from the target nucleic acid 100 , for example, by de novo synthesis.
- synthesis 115 results in double-stranded precursor nucleic acid fragments 120 a - 120 c .
- Each double-stranded precursor nucleic acid fragment 120 a - 120 c includes an adaptor sequence positioned at either end of the target fragment sequence.
- the outer terminal portions of the double-stranded precursor nucleic acid fragments each comprise an outer adaptor 121 a - 121 b .
- Each double-stranded precursor nucleic acid fragment 120 a - 120 c is synthesized 115 such that it overlaps with another region of another fragment sequence via sticky end motifs 101 a - 101 g in a processed order. As illustrated in FIG.
- synthesis also results in including additional sequence in a connecting adaptor region 123 a - 123 d .
- the “sticky end motif” occurs at a desired frequency in the nucleic acid sequence.
- the connecting adaptor region 123 a - 123 d includes a sticky end motif 101 a - 101 b and a first nicking enzyme recognition site 125 .
- Further processing of the double-stranded precursor nucleic acid fragments 120 a - 120 c is done via primers in an amplification reaction 130 to insert a non-canonical base 131 .
- connecting adaptor regions 123 a - 123 d and/or outer adaptors 121 a - 121 b are appended to either end of the fragments during a processing step, for example, via primers in an amplification reaction 130 .
- the double-stranded precursor nucleic acid fragments 120 a - 120 c are subjected to enzymatic processing 140 , which entails cleaving portions of the connecting adaptor regions 123 a - 123 d .
- a first nicking enzyme binds at a first nicking enzyme recognition site 125 , and then cleaves the opposite strand.
- a second nicking enzyme cleaves the non-canonical base 131 .
- the enzymatic reaction results in fragments having sticky ends 140 a - 140 d wherein pairs of sticky ends are reverse complementary and correspond to sticky end motifs 101 a - 101 b in the original sequence.
- the fragments are subjected to an annealing and a ligation reaction 150 to form a reaction product 155 comprising target sequence.
- the annealing and ligation reactions 150 can include rounds of annealing, ligating and melting under conditions such that only desired sticky ends 140 a - 140 d are able to anneal and ligate, while cleaved end fragments remain unligated.
- Ordered assembly of nucleic acid fragments includes linear and circular assembly, for example, fragments are assembled with a vector into a plasmid.
- each double-stranded fragment is flanked on a terminal side by a double-stranded connecting adaptor comprising: a double-stranded sticky end motif derived from the target nucleic acid sequence, a nicking enzyme cleavage site located on only a first strand of the adaptor, and a double-stranded nicking enzyme recognition sequence, such that upon incubation with a first nicking enzyme specific for the nicking enzyme recognition sequence, a single-strand break is introduced at the nicking enzyme cleavage site in the first strand.
- the sticky end motif of the connecting adaptor is located directly at the 5′ or 3′ end of a fragment so that each sticky end motif-fragment or fragment-sticky end motif construct comprises sequence native to the predetermined target nucleic acid sequence.
- the target nucleic acid sequences 100 may be partitioned at sticky end motifs 101 a - 101 g into fragments of about 200 bp or other lengths, such as less than or about 50 bp, about 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1000 bp, or more.
- double-stranded nucleic acids comprising a first strand having a first cleavage site and a second strand having a second cleavage site; wherein the cleavage sites are positioned one or more bases from one another in sequence.
- double-stranded nucleic acids comprising a first strand comprising a non-canonical base and a second strand comprising a nicking enzyme cleavage site; wherein the non-canonical base and nicking enzyme cleavage site are positioned one or more bases from one another in sequence.
- By using nicking enzymes directed to act in tandem at adjacent or near adjacent positions on opposite strands of a double-stranded nucleic acid, one may effect the generation of a sticky end at or near the end of a first nucleic acid fragment, wherein the sticky end sequence is unique and complementary only to the sticky end of a second nucleic acid fragment sequentially adjacent thereto in a predetermined sequence of a full-length target nucleic acid to be assembled from the fragments.
- An example workflow illustrating the generation of a nick at a non-canonical base in a nucleic acid is shown in FIGS. 2A-2B .
- a predetermined sequence of a target nucleic acid is partitioned in silico into fragments, where the sequence of each fragment is separated from an adjacent fragment by an identified sticky end motif.
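- The in silico partitioning step can be sketched as a greedy scan that cuts the target at A(N x )T motifs. This is a minimal sketch under stated assumptions, not the disclosed implementation: the function name `partition_target`, the `min_len` parameter, and the greedy selection are hypothetical, and the rule that skips a motif whose sequence or reverse complement was already used (or that is its own reverse complement) is an assumption added here so that each sticky end pairs at only one junction.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def partition_target(target, x=4, min_len=100):
    """Split target into fragments separated by unique A(N_x)T motifs.

    Returns (fragments, motifs) such that
    fragments[0] + motifs[0] + fragments[1] + ... + fragments[-1] == target.
    """
    target = target.upper()
    pattern = re.compile(r"(?=(A[ACGT]{%d}T))" % x)
    fragments, motifs, used = [], [], set()
    start = 0
    for match in pattern.finditer(target):
        pos, motif = match.start(), match.group(1)
        end = pos + len(motif)
        # Keep every fragment, including the last one, at least min_len long.
        if pos - start < min_len or len(target) - end < min_len:
            continue
        rc = reverse_complement(motif)
        # Assumption of this sketch: avoid motifs that could pair elsewhere.
        if motif in used or rc in used or motif == rc:
            continue
        fragments.append(target[start:pos])
        motifs.append(motif)
        used.add(motif)
        start = end
    fragments.append(target[start:])
    return fragments, motifs


if __name__ == "__main__":
    toy = "GGGCCAATGCTCCCGGGAAGTCTGGGCCC"
    print(partition_target(toy, x=4, min_len=5))
```

- In practice `min_len` would be set near the fragment length for which synthesis is reliable; the value of 5 above is only for the toy sequence.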
- the connecting adaptor regions 123 a - 123 d appended to an end of a fragment include a sticky end motif corresponding to the sticky end motif 101 a - 101 g adjacent to the fragment such that each motif can processively be aligned during enzymatic processing.
- the 3′ end of a first fragment 201 is configured for connection to the 5′ end of fragment 2 202 via a sticky end motif X 211 a .
- fragment 2 202 is configured for connection to fragment 3 203 in the target sequence via sticky end motif Y 211 b and fragment 3 203 is configured for connection to fragment 4 204 in the target sequence via sticky end motif Z 211 c.
- a connecting adaptor comprises a first and a second nicking enzyme recognition site such that tandem nicks made to the connecting adaptor do not affect the sequence of the fragment to which the adaptor is connected.
- a detailed view of precursor fragments 203 and 204 having such connecting adaptors is shown in FIG. 2 ( 220 and 215 , respectively).
- the 5′ connecting adaptor of the fragment 4 204 comprises a first double-stranded nicking enzyme recognition site 225 , a first nicking enzyme cleavage site 227 located on a first single-strand 221 , and a double-stranded sticky end motif Z (AAGTCT, SEQ ID NO.: 3) modified with a uracil (AAGTCU, SEQ ID NO.: 4) on a second single-strand 223 .
- the 3′ connecting adaptor of fragment 3 230 comprises the double-stranded sticky end motif Z 211 c (SEQ ID NO.: 3) modified with a uracil (AGACTU, SEQ ID NO.: 5) on a first single-strand 229 , the first nicking enzyme cleavage site 227 on a second single-strand 231 , and the first double-stranded nicking enzyme recognition site 225 .
- each of the connecting adaptors comprises two nicking sites, a first nicking enzyme cleavage site and a uracil, located at different positions and on different strands of the adaptor sequence.
- the first nicking enzyme cleavage site 227 is located at the backbone of a single-strand of each connecting adaptor, adjacent to a first nicking enzyme recognition sequence 225 .
- the cleavage site is located at a position adjacent to a 5′ or 3′ end of a nicking enzyme recognition site by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 bases.
- Fragments are treated with a first nicking enzyme, in this case, a strand-adjacent nicking enzyme, which cleaves a single-strand of the connecting adaptor at the first nicking enzyme cleavage site; and a second nicking enzyme which excises uracil and cleaves a single-strand of the connecting adaptor at the excised uracil site.
- Cleaved fragments 241 , 242 comprise sticky end overhangs. Fragments comprising complementary sticky end overhangs are annealed and ligated 250 .
- the ligation product 260 comprises predetermined target nucleic acid sequence comprising adjacent fragments separated by a sticky end motif, without the introduction of extraneous scar sequence.
- a sticky end motif includes forward, reverse, and reverse complements of a sticky end sequence.
- a first strand of sticky end motif Z comprises SEQ ID NO.: 3 and a second strand of sticky end motif Z comprises the reverse complement of SEQ ID NO.: 3, AGACTT (SEQ ID NO.: 6), FIG. 2 .
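- The relationship between the four motif Z sequences recited above (SEQ ID NOs.: 3 to 6) can be checked with a few lines of code. This is an illustrative sketch; the helper names are hypothetical, and the T to U substitution is applied to the 3′ terminal base of each strand's motif as in the example sequences.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def with_terminal_uracil(motif):
    """Replace the 3' terminal T of a sticky end motif with U."""
    if not motif.endswith("T"):
        raise ValueError("motif is expected to end in T")
    return motif[:-1] + "U"


if __name__ == "__main__":
    motif_z = "AAGTCT"                          # SEQ ID NO.: 3
    second_strand = reverse_complement(motif_z)  # SEQ ID NO.: 6
    print(second_strand)
    print(with_terminal_uracil(motif_z))         # SEQ ID NO.: 4
    print(with_terminal_uracil(second_strand))   # SEQ ID NO.: 5
```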
- precursor fragments are either synthesized with one or both sites, assembled from smaller nucleic acids comprising one or both sites, amplified with a primer comprising one or both sites, or any combination of the methods described or known in the art.
- a precursor fragment can comprise a sticky end sequence and a primer is synthesized comprising a sequence that is complementary to the sticky end sequence, yet comprises a non-canonical base substitution at the 3′ end of the sticky end sequence.
- Amplification of precursor nucleic acid fragments comprising sticky end sequences with the primer may introduce the non-canonical base to the precursor fragment sequence so that the precursor fragment amplicons comprise a nicking enzyme cleavage site defined by the position of the non-canonical base.
- a double-stranded precursor fragment comprising, in 5′ to 3′ or 3′ to 5′ order: a first double-stranded nicking enzyme recognition sequence, a first nicking enzyme cleavage site on a first single-strand, a double-stranded sticky end motif, and a double-stranded fragment of predetermined target sequence; wherein amplification of the precursor fragment with a non-canonical base-containing primer as described introduces a second nicking enzyme cleavage site between the sticky end motif and fragment of predetermined target sequence on a second single-strand.
- a collection of precursor nucleic acid fragments comprising a fragment sequence of a predetermined sequence of a target nucleic acid and a 5′ and/or 3′ connecting adaptor, wherein each connecting adaptor comprises a shared sequence among the precursor fragments and optionally one or more bases variable among the precursor fragments.
- Amplification of collective fragments comprising a shared sequence can be performed using a universal primer targeting shared sequence of the adaptors.
- An exemplary universal primer is one that comprises a base or sequence of bases which differs from a shared adaptor sequence of precursor nucleic acid fragments.
- a universal primer comprises a non-canonical base as an addition and/or base substitution to shared adaptor sequence, and amplification of precursor fragments comprising the shared adaptor sequence with the primer introduces the non-canonical base into each adaptor sequence.
- An illustration of an exemplary universal primer pair comprising a non-canonical base substitution is shown in FIG. 3 .
- Each primer comprises, in 5′ to 3′ order: one or more adaptor bases 301 a , 301 b , a nicking enzyme recognition site 302 a , 302 b , and a sticky end motif comprising a T to U base substitution (sticky end motif in forward primer 305 : AATGCU, SEQ ID NO.: 7 303 a ; sticky end motif in reverse primer 310 : AGCATU, SEQ ID NO.: 8 303 b ).
- Amplification of a first precursor nucleic acid having an adaptor comprising sticky end motif AATGCT (SEQ ID NO.: 9) with the forward primer introduces a uracil to a single-strand of the adaptors in the resulting amplicons.
- Amplification of a second precursor nucleic acid having an adaptor comprising sticky end motif AGCATT (SEQ ID NO.: 10) with the reverse primer introduces a uracil to a single-strand of the adaptors in the resulting amplicons.
- the amplification products, after the cleavage steps described herein, have compatible sticky ends that are suitable for annealing and ligating.
- a set of two or more universal primer pairs is used in a method disclosed herein, wherein each pair comprises a universal forward primer and a universal reverse primer, and wherein the forward primers in the set each comprise a shared forward sequence and a variable forward sequence and the reverse primers in the set each comprise a shared reverse sequence and a variable reverse sequence.
- a set of universal primers designed to amplify the collection of nucleic acids may comprise differences within each set of universal forward and reverse primers relating to one or more bases of the sticky end motif sequence.
- a universal primer pair incorporates a universal primer sequence 5′ to a sticky end motif sequence in a nucleic acid.
- a universal primer sequence comprises a universal nicking enzyme recognition sequence to be incorporated at the end of each fragment in a library of precursor nucleic acid fragments.
- a primer fusion site comprises four bases 3′ to an adenine (A) and 5′ to a uracil (U).
- the 5′-A (N 4 ) U-3′ (SEQ ID NO.: 11) primer fusion sequence is located at the very 3′ end of the exemplary primers, which conclude with a 3′ uracil.
- the primer fusion sequence can be 5′-G (N 4 ) U-3′ (SEQ ID NO.: 12).
- N 4 represents any configuration of 4 bases (N), where each base N has the same or different identity than another base N. In some cases, the number of N bases is greater than or less than 4.
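- The primer layout described above (adaptor bases, nicking enzyme recognition site, then a sticky end motif ending in a 3′ uracil) can be sketched as follows. The function name `build_universal_primer`, the adaptor bases "GC", and the recognition site string are hypothetical placeholders chosen for illustration; only the sticky end motifs AATGCT and AGCATT come from the example of FIG. 3.

```python
import re

# 5'-A (N4) U-3' primer fusion site located at the very 3' end of a primer.
FUSION_SITE = re.compile(r"A[ACGT]{4}U$")


def build_universal_primer(adaptor_bases, recognition_site, sticky_motif):
    """Concatenate primer parts 5' to 3', substituting the motif's final T with U."""
    if not sticky_motif.endswith("T"):
        raise ValueError("sticky end motif is expected to end in T")
    return adaptor_bases + recognition_site + sticky_motif[:-1] + "U"


if __name__ == "__main__":
    site = "CCTCAGC"  # placeholder recognition sequence, an assumption of this sketch
    forward = build_universal_primer("GC", site, "AATGCT")
    reverse = build_universal_primer("GC", site, "AGCATT")
    for primer in (forward, reverse):
        print(primer, bool(FUSION_SITE.search(primer)))
```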
- Each precursor fragment 401 - 404 comprises at least one connecting adaptor and optionally an outer adaptor at each end of a target fragment sequence, wherein each of the connecting and outer adaptors comprise a shared sequence.
- the precursor fragments 401 - 404 are modified to include non-canonical bases 410 , subjected to enzymatic digestion 415 to generate fragments with overhangs 420 , and subjected to annealing and ligation 430 .
- the primers may be universal primers described herein.
- the nucleic acids comprising fragment 1 401 and fragment 2 402 are appended at their 3′ or 5′ ends, respectively, with sticky end motif X, wherein the sequence: fragment 1-sticky end motif X-fragment 2 occurs in the predetermined target sequence.
- the nucleic acids comprising fragment 2 402 and fragment 3 403 are appended at their 3′ or 5′ ends, respectively, with sticky end motif Y, wherein the sequence fragment 2-sticky end motif Y-fragment 3 occurs in the predetermined target sequence.
- the nucleic acids comprising fragment 3 403 and fragment 4 404 are appended at their 3′ or 5′ ends, respectively, with sticky end motif Z, wherein the sequence fragment 3-sticky end motif Z-fragment 4 occurs in the predetermined target sequence.
- the ligation product is then amplified by PCR 440 using primers 445 , 446 complementary to outer adaptor regions.
- the resulting final product is a plurality of nucleic acids which lack adaptor regions 450 .
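- The ordered joining of fragments 1 to 4 through motifs X, Y and Z can be modeled abstractly. The sketch below is a simplification under stated assumptions: each processed fragment is represented as a tuple of (left motif, body, right motif) with `None` marking an outer end, overhang chemistry and strand orientation are not modeled, and the function name `assemble`, the fragment bodies, and the motif sequences are hypothetical examples.

```python
def assemble(fragments):
    """Join fragments in the order dictated by their shared sticky end motifs.

    fragments: iterable of (left_motif, body, right_motif); None marks an
    outer end. Each motif appears once in the product, so no scar is added.
    """
    by_left = {}
    for left, body, right in fragments:
        if left in by_left:
            raise ValueError("a sticky end motif may be used at only one junction")
        by_left[left] = (body, right)
    if None not in by_left:
        raise ValueError("no starting fragment with an outer 5' end")
    body, right = by_left.pop(None)
    product = body
    while right is not None:
        if right not in by_left:
            raise ValueError("no fragment carries the matching sticky end")
        body, next_right = by_left.pop(right)
        product += right + body
        right = next_right
    if by_left:
        raise ValueError("unassembled fragments remain")
    return product


if __name__ == "__main__":
    X, Y, Z = "AATGCT", "ACTGAT", "AAGTCT"  # illustrative motif sequences
    pool = [                                 # deliberately out of order
        (Z, "TTTT", None),
        (None, "GGGG", X),
        (Y, "CCCC", Z),
        (X, "AAAA", Y),
    ]
    print(assemble(pool))
```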
- Connecting adaptors disclosed herein may comprise a Type II restriction endonuclease recognition sequence.
- a sticky end motif shared between adjacent fragments in a predetermined sequence is a Type II restriction endonuclease recognition sequence.
- sticky end motif X is a first Type II restriction endonuclease recognition sequence so that upon digesting with the appropriate Type II restriction enzyme, a sticky end is produced at the ends of nucleic acids 401 and 402 .
- sticky end motifs Y and Z are also two different Type II restriction endonuclease recognition sequences native to the predetermined target nucleic acid sequence. In such cases a target nucleic acid having no scar sites is assembled from the Type II-digested fragments.
- fragments assembled using Type II restriction endonucleases are small, for example, less than about 500, 200, or 100 bases so as to reduce the possibility of cleavage at a site within the fragment sequence.
- a combination of tandem, single-strand breaks and Type II restriction endonuclease cleavage is used to prepare precursor fragments for assembly.
- tandem nicking of a double-stranded nucleic acid and/or double-stranded cleavage by a Type II restriction endonuclease results in undesired sequences terminal to cleavage sites remaining in the cleavage reaction.
- These terminal bases are optionally removed to facilitate downstream ligation.
- Cleaved termini are removed, for example, through size-exclusion column purification.
- terminal ends are tagged with an affinity tag such as biotin such that the cleaved ends are removed from the reaction using avidin or streptavidin, such as streptavidin coated on beads.
- cleaved ends of precursor fragments are retained throughout annealing of the fragments to a larger target nucleic acid.
- precursor fragments comprise a first nicking enzyme cleavage site defined by a first nicking enzyme recognition sequence, and a non-canonical base.
- precursor fragments are treated with a first enzyme activity that excises the non-canonical base and a second enzyme activity that cleaves single-stranded nucleic acids at the abasic site and first nicking enzyme cleavage site.
- Some of the cleaved ends produced at the first nicking enzyme cleavage site are able to reanneal to cleaved sticky end overhangs, and may re-ligate.
- Upon ligation, the molecule formed thereby will not have the first nicking enzyme cleavage site, as the sequence that specifies cleavage is in the cleaved-off terminal fragment rather than in the adjacent fragment sequence. Subsequently, ligated ends will not be re-cleaved by strand-adjacent nicking enzyme. Additionally, as neither strand has a gap position corresponding to the excised non-canonical base position, sticky ends of precursor nucleic acid fragments that are end pairs intended to assemble into a larger target are capable of annealing to one another across both strands.
- Sticky ends of cleaved precursor nucleic acid fragments are allowed to anneal to one another under conditions promoting stringent hybridization, such that in some cases, only perfectly reverse complementary sticky ends anneal. In some cases, less stringent annealing is permitted. Annealed sticky ends are ligated to form either complete target nucleic acid molecules, or larger fragment target nucleic acid molecules. Larger fragment molecules are in turn subjected to one or more additional rounds of assembly, using either methods described herein and additional sticky end sites, or one or more assembly techniques known in the art.
- the target molecules are assembled from precursor nucleic acid fragments that are in many cases synthesized to a length that is within a target level of sequence confidence—that is, they are synthesized to a length for which the synthesis method provides a high degree of confidence in sequence integrity. In some cases, this length is about 100, 150, 200, 250, 300, 350, 400, 450, or 500 nucleic acid bases.
- the methods provided herein generate a specific target sequence for a recombinatorial library, e.g., a chimeric construct or a construct comprising at least one targeted variation for codon mutation.
- Positions to vary include, without limitation, codons at residues of interest in an encoded protein, codons of residues of unknown function in an encoded protein, and pairs or larger combinations of codons encoding residues known or suspected to work in concert to influence a characteristic of a protein such as enzymatic activity, thermostability, protein folding, antigenicity, protein-protein interactions, solubility or other characteristics.
- a library of variants may be prepared by synthesizing target nucleic acids from fragments having at least one indeterminate or partially determinate position among members of the library.
- target fragments are synthesized having combinations of variants.
- multiple combinations of variations at a first position and variations at a second position may be present in the library.
- all possible combinations of variants are represented in a library.
- the library may be constructed such that variant base positions are each found on different target fragments, or alternately, multiple variant base positions are found on the same target fragment library.
- FIGS. 5A-5B illustrate an exemplary workflow for recombinatorial library synthesis of a target gene.
- the target gene is partitioned into fragments 1-4 by motifs X, Y, and Z 500 , each fragment comprising one or two indeterminate sites ( FIG. 5A ).
- Precursor fragments 501 comprise an outer adaptor, a variant of fragment 1 comprising one indeterminate site, and a connecting adaptor comprising motif X.
- Precursor fragments 502 comprise a connecting adaptor comprising motif X, a variant of fragment 2 comprising one indeterminate site, and a connecting adaptor comprising motif Y.
- Precursor fragments 503 comprise a connecting adaptor comprising motif Y, a variant of fragment 3 comprising two indeterminate sites, and a connecting adaptor comprising motif Z.
- Precursor fragments 504 comprise a connecting adaptor comprising motif Z, a variant of fragment 4 comprising one indeterminate site, and a second outer adaptor.
- PCR is used to generate amplicons 510 of each precursor fragment, collectively 500, in some cases using a universal primer pair(s) ( FIG. 5B ).
- Precursor nucleic acids are digested at their connecting adaptor sequence to generate sticky ends, complements of which are annealed and ligated together to form a series of target genes comprising: fragment 1 sequence comprising one indeterminate site, motif X, fragment 2 sequence comprising one indeterminate site, motif Y, fragment 3 sequence comprising two indeterminate sites, motif Z, and fragment 4 sequence comprising one indeterminate site 520 .
- the number of possible target gene variants is 4^5, or 1,024 different genes.
- FIG. 5B , part 530 shows a conceptual depiction of some of these target gene variants after PCR amplification.
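- By way of a non-limiting illustration, the variant count stated above follows from the five indeterminate sites of FIG. 5A (1 + 1 + 2 + 1 across fragments 1-4). The Python sketch below assumes that every indeterminate site varies independently over the four bases A, C, G, and T; the function and variable names are illustrative only and do not appear in the disclosure.

```python
from itertools import product

BASES = "ACGT"

def count_variants(sites_per_fragment, alternatives=len(BASES)):
    """Number of assembled genes when every indeterminate site varies independently."""
    total_sites = sum(sites_per_fragment)
    return alternatives ** total_sites

# Fragments 1-4 of FIG. 5A carry 1, 1, 2, and 1 indeterminate sites.
fig5_sites = [1, 1, 2, 1]
n_variants = count_variants(fig5_sites)

# Cross-check the closed-form count by enumerating every base combination.
enumerated = sum(1 for _ in product(BASES, repeat=sum(fig5_sites)))
```

If a site is only partially indeterminate, so that two or three bases are permitted, the count for that site drops accordingly and the product is taken over per-site alternatives rather than a single power of four.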
- Methods described herein comprise assembling double-stranded DNA (“dsDNA”) target nucleic acid from shorter target nucleic acid fragments that are building block precursors. Assembly may proceed by hybridizing uniquely complementary pairs of overhangs. Such uniquely complementary pairs may be formed by incorporating sticky ends from two precursor fragments that appear successively in the assembled nucleic acid. In some cases, the pair of overhangs does not involve complete complementarity, but rather sufficient partial complementarity that allows for selective hybridization of successive precursor fragments under designated reaction conditions.
- a cleavage agent includes any molecule with enzymatic activity for base excision and/or single-strand cleavage of a double-stranded nucleic acid.
- a cleavage agent is a nicking enzyme or has nicking enzymatic activity.
- a cleavage agent recognizes a cleavage or nicking enzyme recognition sequence, mismatched base pair, atypical base, non-canonical or modified nucleoside or nucleobase to be directed to a specific cleavage site.
- two cleavage agents have independent recognition sites and cleavage sites.
- a cleavage agent generates a single-stranded cleavage, e.g., a nick or a gap, involving removal of one or more nucleosides from a single-strand of a double-stranded nucleic acid.
- a cleavage agent cleaves a phosphodiester bond of a single-strand in a double-stranded nucleic acid.
- methods for creating a sticky end on a double-stranded nucleic acid comprising: (a) providing a linear double-stranded nucleic acid comprising in order an insert region, a first fusion site, and a first adaptor region; (b) creating a first nick on a first strand of the double-stranded nucleic acid with a first cleavage agent having a first recognition site and a first specific cleavage site; and (c) creating a second nick on a second strand of the double-stranded nucleic acid with a second cleavage agent having a second recognition site and a second specific cleavage site; wherein the method produces a sticky end at the first fusion site; wherein the first recognition site is in the first fusion site or the first adaptor region; and wherein the second recognition site is in the first fusion site or first adaptor region.
- the first adaptor region or first fusion site comprises a sticky end motif. In some cases, the first adaptor region or first fusion site comprises a strand-adjacent nicking enzyme recognition sequence. In some cases, a precursor nucleic acid sequence comprises a fusion site and adaptor region that are not naturally adjacent to each other.
- methods for creating sticky ends on double-stranded nucleic acid comprising: (a) providing a plurality of double-stranded nucleic acids each comprising in order an insert region, a fusion site, and an adaptor region, wherein each of the plurality of double-stranded nucleic acids have a different fusion site; (b) creating a first nick on a first strand of each of the plurality of double-stranded nucleic acids with a first cleavage agent having a first recognition site and a first specific cleavage site; and (c) creating a second nick on a second strand of each of the plurality of double-stranded nucleic acids with a second cleavage agent having a second recognition site and a second specific cleavage site; wherein the method produces a sticky end at each fusion site of the plurality of double-stranded nucleic acids; wherein the first recognition site is in the fusion site or the adaptor region of the plurality of double-
- a polynucleotide comprising: (a) providing a reaction mixture comprising a first dsDNA fragment comprising a uracil base on its first strand; a second dsDNA fragment comprising a uracil base on its first strand; a first cleaving agent that cuts dsDNA on a single-strand at the site of a uracil; a second cleaving agent that cuts dsDNA on a single-strand, wherein the cleavage site of the second cleaving agent is within k bp of the uracil in an opposite strand and wherein k is between 2 and 10; and a ligase; and (b) thermocycling the reaction mixture between a maximum and a minimum temperature, thereby generating a first overhang from the first dsDNA fragment and a second overhang from the second dsDNA fragment, wherein the first and the second overhangs are complementary, hybridizing the first and second overhang
- a polynucleotide comprising: (a) providing a reaction mixture comprising n dsDNA fragments each comprising a first and a second strand, and a first nicking endonuclease recognition site, a first fusion site, a variable insert, a second fusion site, and a second nick enzyme recognition site, wherein the second fusion site comprises a uracil base on the first strand and the first fusion site comprises a uracil base on the second strand; a first cleaving agent that cuts dsDNA on a single-strand at the site of a uracil; a second cleaving agent that cuts dsDNA on a single-strand, wherein the cleavage site of the second cleaving agent is within k bp of the uracil in an opposite strand and wherein k is between 2 and 10; and a ligase; and (b) thermocycling the reaction mixture between
- fragment libraries comprising n DNA fragments, each comprising a first strand and a second strand, each ith DNA fragment comprising a first nicking endonuclease recognition site, a first fusion site, a variable insert, a second fusion site, and a second nick enzyme recognition site; wherein the first fusion site comprises a sequence of 5′-A (Nx) i,1 U-3′ (SEQ ID NO.: 13) in the first strand; and wherein the second fusion site comprises a sequence of 5′-A (Nx) i,2 U-3′ (SEQ ID NO.: 14) in the second strand; wherein Nx denotes x nucleosides; wherein (Nx) i,2 is reverse complementary to (N x ) i+1,1 and different from every other Nx found in any fusion site sequence within the fragment library; wherein the first nicking endonuclease recognition sites are positioned such that there is a corresponding cle
- primer libraries comprising n primers, each comprising a nicking endonuclease recognition sequence and a fusion sequence comprising 5′-A (Nx) i U-3′ (SEQ ID NO.: 15), wherein the nicking endonuclease recognition sequence is positioned 5′ of the fusion sequence.
- the nicking endonuclease recognition sites are positioned such that the nicking endonuclease recognition site in a primer is capable of generating a corresponding cleavage site in a reverse complementary DNA strand 3′ of a first fusion site in the reverse complementary DNA strand, if the primer were hybridized to the reverse complementary DNA strand such that the fusion sequence hybridizes to the first fusion site in the reverse complementary DNA strand.
- x is selected from the list consisting of the integers 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10.
- n is at least 2. In some cases, n is less than 10.
- the sequences of the n primers are not naturally occurring.
- the primers are in a kit further comprising a nicking endonuclease, UDG, and an AP endonuclease.
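- By way of a non-limiting illustration, a primer library of the kind described above can be sketched in Python as strings that place a nicking endonuclease recognition sequence 5′ of a fusion sequence 5′-A (Nx) U-3′. The recognition string, the spacer, and the requirement that each Nx be distinct within the primer library are assumptions made for this sketch; the recognition string is a placeholder and is not a recommendation for any particular enzyme, and the spacing actually required depends on the enzyme selected.

```python
def build_fusion_primer(recognition_seq, nx, spacer=""):
    """Return a primer 5'->3': recognition sequence, optional spacer, then A(Nx)U."""
    nx = nx.upper()
    if not 1 <= len(nx) <= 10:
        raise ValueError("x, the length of Nx, is expected to be 1 to 10")
    if set(nx) - set("ACGT"):
        raise ValueError("Nx must contain only A, C, G, or T")
    return recognition_seq.upper() + spacer.upper() + "A" + nx + "U"

def build_primer_library(recognition_seq, nx_list, spacer=""):
    """Build n primers that share a recognition sequence but differ in Nx."""
    if len(nx_list) < 2:
        raise ValueError("n is expected to be at least 2")
    if len({s.upper() for s in nx_list}) != len(nx_list):
        # Assumption of this sketch: each Nx is distinct within the library.
        raise ValueError("each Nx should be distinct within the library")
    return [build_fusion_primer(recognition_seq, nx, spacer) for nx in nx_list]

# Hypothetical placeholder inputs, for illustration only.
PLACEHOLDER_SITE = "GAGTC"
library = build_primer_library(PLACEHOLDER_SITE, ["CCGT", "GGCA"], spacer="TTTT")
```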
- a primer is said to anneal to another nucleic acid if the primer, or a portion thereof, hybridizes to a nucleotide sequence within the nucleic acid.
- the statement that a primer hybridizes to a particular nucleotide sequence is not intended to imply that the primer hybridizes either completely or exclusively to that nucleotide sequence.
- a sticky end is an end of a double-stranded nucleic acid having a 5′ or 3′ overhang, wherein a first strand of the nucleic acid comprises one or more bases at its 5′ or 3′ end, respectively, which are collectively not involved in a base-pair with bases of the second strand of the double-stranded nucleic acid.
- An overhang is capable of annealing to a complementary overhang under suitable reaction conditions.
- “sticky end” and “overhang” are used interchangeably.
- overhang lengths include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more bases. For example, an overhang has 4 to 10 bases, 4 to 8 bases, or 4 to 6 bases.
- Sticky end motifs are generally identified from a predetermined sequence of a target nucleic acid to be synthesized from fragments partitioned by selected identified sticky end motifs.
- ANNNNT (SEQ ID NO.: 2)
- GNNNNC (SEQ ID NO.: 17)
- Selected sticky ends serve as fusion sites for annealing and ligating together two fragments via complementary sticky ends.
- a sticky end comprises a sequence of A (N x ) T (SEQ ID NO.: 1), wherein N x is x number of N bases of any sequence.
- a sticky end comprises a sequence of G (N x ) C (SEQ ID NO.: 16), wherein N x is x number of N bases of any sequence.
- a sticky end motif is a sequence of double-stranded polynucleotides in a nucleic acid that when treated with an appropriate cleavage agent make up a sticky end.
- the N x sequence or full sequence of a sticky end at the 3′ end of a first nucleic acid fragment is completely or partially reverse complementary to the N x sequence of a sticky end at the 5′ end of a second nucleic acid fragment.
- the 3′ end of the second nucleic acid fragment has a sticky end that is completely or partially reverse complementary to the N x sequence of sticky end at the 5′ end of a third nucleic acid fragment, and so on.
- the motif of the sticky end complementary between the first and second nucleic acids is the same as the motif of the sticky end complementary between the second and third nucleic acids.
- sticky end motifs include motifs having identical base number and sequence identities.
- sticky end motifs of a plurality of nucleic acids are the same, yet have variable identities.
- each motif shares the sequence ANNNNT (SEQ ID NO.: 2), but two or more motifs differ in the identity of the sequence of the 4 N bases.
- a plurality of nucleic acid fragments to be assembled may each comprise a sticky end motif of A (N x ) T (SEQ ID NO.: 1), wherein the sequence of a given motif is only shared among two of the fragments adjacent to one another in a target nucleic acid sequence.
- these nucleic acid fragments, under appropriate conditions, anneal to each other in a linear sequence without degeneracy in the pairing of overhangs and hence the nucleic acid order within the linear sequence.
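- By way of a non-limiting illustration, the requirement that overhangs pair without degeneracy can be checked computationally. The Python sketch below assumes one overhang sequence is recorded per junction (the partner overhang being its reverse complement) and assumes perfect complementarity is required. It additionally rejects self-complementary (palindromic) overhangs, a check that is not recited above but follows because such an end could pair with another copy of itself. All names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def overhangs_are_unambiguous(junction_overhangs):
    """True if every junction overhang can pair only with its intended partner."""
    seen = set()
    for overhang in junction_overhangs:
        overhang = overhang.upper()
        partner = reverse_complement(overhang)
        if overhang == partner:
            return False  # palindromic: the end could anneal to itself
        if overhang in seen or partner in seen:
            return False  # would cross-pair with an earlier junction
        seen.add(overhang)
    return True
```

For example, the junction set ACCGTT, AGGCAT, ATTCGT passes this check, whereas a set containing both AGGCAT and ATGCCT does not, because each is the reverse complement of the other.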
- the number of bases x in N x in a sticky end motif described herein may be the same for all sticky end motifs for a number of nucleic acids within a plurality of nucleic acids.
- sticky end motifs belonging to a number of nucleic acids within a plurality of nucleic acids comprise sequences of A (N x ) T (SEQ ID NO.: 1), G (N x ) C (SEQ ID NO.: 16), or combinations thereof, wherein the number of bases x in N x is the same or varies among the plurality of nucleic acids.
- the number of bases x in N x may be more than or equal to about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more.
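- By way of a non-limiting illustration, candidate sticky end motifs of the form A (N x ) T or G (N x ) C can be located in a predetermined target sequence with a regular expression. The Python sketch below uses a lookahead so that overlapping candidates are all reported; the function name and the example sequence are hypothetical and chosen only for demonstration.

```python
import re

def find_sticky_end_motifs(sequence, x=4, first="A", last="T"):
    """Return (start_index, motif) for every first-(N x)-last motif, overlaps included."""
    # The lookahead (?=...) matches without consuming, so overlapping motifs are found.
    pattern = re.compile(rf"(?=({first}[ACGT]{{{x}}}{last}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

# Hypothetical target sequence, for illustration only.
example_target = "GGACCGTTCAGGCATCC"
at_motifs = find_sticky_end_motifs(example_target, x=4, first="A", last="T")
gc_motifs = find_sticky_end_motifs(example_target, x=4, first="G", last="C")
```

Candidates found this way would still need to be screened so that a given N x sequence is shared only between the two fragments that are adjacent in the target.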
- FIG. 2 depicts the preparation and annealing of two sticky ends in a plurality of precursor nucleic acid fragments.
- a plurality of fragments spanning a predetermined target nucleic acid sequence is generated for which sticky end motif sequences have been selected (sticky end motifs X, Y, and Z) such that only two fragments will share a particular compatible sticky end.
- Each precursor fragment comprises target nucleic acid fragment sequence, flanked by sticky end motif sequence ANNNNT (SEQ ID NO.: 2), wherein NNNN are specific to an end pair, and having a U in place of the T at the 3′ end of one strand.
- the sequence is GNNNNC (SEQ ID NO.: 17), wherein NNNN are specific to an end pair, and having a U in place of the C at the 3′ end of one strand.
- Another non-limiting depiction of sticky end use is shown in the example workflow of FIG. 4 , which generally depicts the assembly of target nucleic acids from precursor nucleic acid fragments via assembly of complementary sticky ends in the precursor fragments.
- Connecting adaptors of two or more fragments may be synthesized to be flanked by Type II restriction endonuclease sites that are unique to a fragment pair.
- Compatible ends are ligated and PCR is used to amplify the full length target nucleic acids.
- methods and compositions described herein use two independent cleavage events that have to occur within a distance that allows for separation of a cleaved end sequence under specified reaction conditions.
- two different cleaving agents are used that both cut DNA only at a single-strand.
- one or both of the cleaving agents cuts outside of its recognition sequence (a “strand-adjacent nicking enzyme”). This makes the process independent of the actual sequence of the overhangs which are to be assembled at sticky end sites.
- one or more of the cleavage agents recognizes or cleaves at non-canonical bases that are not part of the Watson-Crick base pairs or typical base pairs, including, but not limited to a uracil, a mismatch, and a modified base.
- methods for generation of a sticky end in a double-stranded nucleic acid having a sticky end motif comprise cleaving a first strand of the nucleic acid at a first position adjacent to one end of the sticky end motif and cleaving a second strand of the nucleic acid at a second position adjacent to the other end of the sticky end motif.
- the first and/or second position are defined by their location next to a nicking enzyme recognition sequence.
- the first and/or second position are defined by the presence of a non-canonical base, wherein excision and cleavage at the non-canonical base site occurs via one or more nicking enzymes collectively having excision and endonuclease activities.
- two nicks on opposite strands of a nucleic acid are within a short nick-to-nick distance from each other, e.g., a distance equal to or less than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 base pairs.
- a nicking enzyme recognition sequence is positioned such that its cleavage site is at the desired nick-to-nick distance from the other cleavage activity that is used together to create an overhang.
- a single-strand of a sticky end motif may be modified with or comprises a non-canonical base positioned directly adjacent to a target nucleic acid sequence.
- a non-canonical base identifies a cleavage site.
- an adaptor sequence comprising a sticky end motif further comprises a nicking enzyme recognition sequence adjacent to the terminal end of the sticky end motif. In this configuration, if the nicking enzyme recognition sequence defines a cleavage site adjacent to the recognition sequence and is located next to the sticky end motif, treatment with a strand-adjacent nicking enzyme introduces a nick on a single-strand between the nicking enzyme recognition sequence and sticky end motif.
- non-canonical bases for inclusion in a modified sticky end motif are, without limitation, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, acetylcytosine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N-6-isopentenyl adenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 1-methyladenine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, 5-ethylcytosine, N6-adenine, N6-methyladenine, N,N-dimethyladenine, 8-bromoadenine
- nucleoside and nucleotide include those moieties that contain not only conventional ribose and deoxyribose sugars, but other sugars as well. Modified nucleosides or nucleotides also include modifications on the sugar moiety, e.g., wherein one or more of the hydroxyl groups are replaced with halogen atoms or aliphatic groups, or are functionalized as ethers, amines, or the like.
- modified sugar moieties which can be used to modify nucleosides or nucleotides at any position on their structures include, but are not limited to, arabinose, 2-fluoroarabinose, xylose, and hexose, or a modified component of the phosphate backbone, such as phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, or a formacetal or analog thereof.
- a nucleic acid described herein may be treated with a chemical agent, or synthesized using modified nucleotides, thereby creating a modified nucleic acid.
- a modified nucleic acid is cleaved, for example at the site of the modified base.
- a nucleic acid may comprise alkylated bases, such as N3-methyladenine and N3-methylguanine, which may be recognized and cleaved by an alkyl purine DNA-glycosylase, such as DNA glycosylase I ( E. coli TAG) or AlkA.
- uracil residues may be introduced site specifically, for example by the use of a primer comprising uracil at a specific site.
- the modified nucleic acid may be cleaved at the site of the uracil residue, for example by a uracil N-glycosylase. Guanine in its oxidized form, 8-hydroxyguanine, may be cleaved by formamidopyrimidine DNA N-glycosylase.
- Examples of chemical cleavage processes include without limitation alkylation (e.g., alkylation of phosphorothioate-modified nucleic acid); cleavage of acid lability of P3′-N5′-phosphoroamidate-containing nucleic acid; and osmium tetroxide and piperidine treatment of nucleic acid.
- Methods described herein provide for synthesis of a precursor nucleic acid sequence, or a target fragment sequence thereof, that has a length of about or at least about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 10000, 20000, or 30000 bases.
- a plurality of precursor nucleic acid fragments are prepared with sticky ends, and the sticky ends are annealed and ligated to generate the predetermined target nucleic acid sequence having a base length of about, or at least about, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 15000, 20000, 30000, 50000, or 100000 bases.
- a precursor nucleic acid sequence is assembled with another precursor nucleic acid sequence via annealing and ligation of complementary sticky ends, followed by additional rounds of sticky end generation and assembly with other precursor fragment(s) to generate a long target nucleic acid sequence.
- 2, 3, 4, 5, 6, 7, 8, 9, or 10 rounds of sticky end generation and assembly are performed to generate a long target nucleic acid of predetermined sequence.
- the precursor nucleic acid fragment or a plurality of precursor nucleic acid fragments may span a predetermined sequence of a target gene, or portion thereof.
- the precursor nucleic acid fragment or a plurality of precursor nucleic acid fragments may span a vector and a plasmid sequence, or portion thereof.
- a precursor nucleic acid fragment comprises a sequence of a cloning vector from a plasmid.
- a cloning vector is generated using de novo synthesis and an assembly method described herein, and is subsequently assembled with a precursor nucleic acid fragment or fragments of a target gene to generate an expression plasmid harboring the target gene.
- a vector may be a nucleic acid, optionally derived from a virus, a plasmid, or a cell, which comprises features for expression in a host cell, including, for example, an origin of replication, selectable marker, reporter gene, promoter, and/or ribosomal binding site.
- a host cell includes, without limitation, a bacterial cell, a viral cell, a yeast cell, and a mammalian cell.
- Cloning vectors useful as precursor nucleic acid fragments include, without limitation, those derived from plasmids, bacteriophages, cosmids, bacterial artificial chromosomes, yeast artificial chromosomes, and human artificial chromosomes.
- target nucleic acid fragments having an error rate of less than 1/500, 1/1000, 1/10,000 or less compared to a predetermined sequence(s).
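- By way of a non-limiting illustration, the relationship between fragment length and sequence confidence can be estimated with a simple model. The Python sketch below assumes that the stated error rate is a per-base rate and that errors occur independently at each position, so that the expected fraction of error-free molecules of length L is (1 - rate)^L. Both assumptions are simplifications introduced for this sketch and are not recited above.

```python
def fraction_error_free(length, error_rate):
    """Expected fraction of molecules with no error, assuming independent per-base errors."""
    if length < 0 or not 0.0 <= error_rate <= 1.0:
        raise ValueError("length must be non-negative and error_rate within [0, 1]")
    return (1.0 - error_rate) ** length

# Compare the error rates named above for a hypothetical 200-base fragment.
rates = [1 / 500, 1 / 1000, 1 / 10000]
estimates = {rate: fraction_error_free(200, rate) for rate in rates}
```

Under this model, shorter fragments and lower error rates both raise the fraction of perfect molecules, which is consistent with synthesizing fragments to a length that provides high confidence in sequence integrity.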
- target fragment length is selected in light of the location of desired sticky ends, such that target fragment length varies among fragments in light of the occurrence of desired sticky ends among target fragments.
- target nucleic acid fragments are synthesized to a size of at least 20 but less than 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 500, 1000, 5000, 10000, or 30000 bases.
- target fragments are synthesized de novo, such as through nonenzymatic nucleic acid synthesis.
- target nucleic acid fragments are synthesized from template nucleic acids, such as templates of nucleic acids that are to be assembled into a single target nucleic acid but which, in some cases, do not naturally occur adjacent to one another.
- by synthesizing target nucleic acid fragments having at least one indeterminate position, followed by ligation at sticky ends to adjacent target nucleic acid fragments also having at least one indeterminate position, one can synthesize a target nucleic acid population that comprises a recombinant library of all possible combinations of the base identities at the varying positions.
- at least one base position is partially indeterminate in some cases, such that two or three base alternatives are permitted.
- target nucleic acid fragments are selected such that only one base varies within a given target nucleic acid fragment, which in turn allows for each position to independently vary in the target nucleic acid library.
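- By way of a non-limiting illustration, a recombinant library built from fragments with indeterminate or partially indeterminate positions can be enumerated in silico. The Python sketch below writes indeterminate positions with the standard IUPAC nucleotide ambiguity codes (for example, N for any base and R for A or G) and, as a simplifying assumption, models ligation of adjacent fragments as plain string concatenation, ignoring sticky end sequences. The function names and example templates are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_fragment(template):
    """List every concrete sequence encoded by a template with ambiguity codes."""
    choices = [IUPAC[base] for base in template.upper()]
    return ["".join(combo) for combo in product(*choices)]

def combine_fragments(templates):
    """All combinations of variants across ordered fragments (ligation as concatenation)."""
    pools = [expand_fragment(t) for t in templates]
    return ["".join(combo) for combo in product(*pools)]

# Two hypothetical fragments: one fully indeterminate site and one two-base site.
library = combine_fragments(["AN", "RT"])
```

Because each fragment pool is combined with every other pool, each varying position varies independently in the resulting library, as described above.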
- An example workflow of nucleic acid synthesis is shown in FIG. 6 .
- Methods of synthesis using this workflow are, in some instances, performed to generate a plurality of target nucleic acid fragments, or oligonucleotides thereof, for assembly using sticky end methods described herein.
- oligonucleotides are prepared and assembled into precursor fragments using the methods depicted in FIG. 6 .
- the workflow is divided generally into the following processes: (1) de novo synthesis of a single stranded oligonucleic acid library, (2) joining oligonucleic acids to form larger fragments, (3) error correction, (4) quality control, and (5) shipment.
- an intended nucleic acid sequence or group of nucleic acid sequences is preselected. For example, a library of precursor nucleic acid fragments is preselected for generation.
- a structure comprising a surface layer 601 is provided.
- chemistry of the surface is functionalized in order to improve the oligonucleic acid synthesis process. Areas of low surface energy are generated to repel liquid while areas of high surface energy are generated to attract liquids.
- the surface itself may be in the form of a planar surface or contain variations in shape, such as protrusions or nanowells which increase surface area.
- high surface energy molecules selected support oligonucleic acid attachment and synthesis.
- a device, such as a material deposition device, is designed to release reagents in a step wise fashion such that multiple oligonucleic acids extend from an actively functionalized surface region, in parallel, one residue at a time to generate oligomers with a predetermined nucleic acid sequence.
- oligonucleic acids are cleaved from the surface at this stage.
- Cleavage includes gas cleavage, e.g., with ammonia or methylamine.
- the generated oligonucleic acid libraries are placed in a reaction chamber.
- the reaction chamber, also referred to as a “nanoreactor”, is a silicon coated well containing PCR reagents lowered onto the oligonucleic acid library 603 .
- a reagent is added to release the oligonucleic acids from the surface.
- the oligonucleic acids are released subsequent to sealing of the nanoreactor 605 . Once released, fragments of single-stranded oligonucleic acids hybridize in order to span an entire long range sequence of DNA. Partial hybridization 605 is possible because each synthesized oligonucleic acid is designed to have a small portion overlapping with at least one other oligonucleic acid in the pool.
- oligonucleic acids are assembled in a PCA reaction.
- the oligonucleic acids anneal to complementary fragments and gaps are filled in by a polymerase.
- Each cycle increases the length of various fragments randomly depending on which oligonucleic acids find each other. Complementarity amongst the fragments allows for forming a complete large span of double-stranded DNA 606 , in some instances, a fragment of DNA to be assembled into a target nucleic acid.
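- By way of a non-limiting illustration, the overlapping oligonucleic acid design that makes this partial hybridization possible can be sketched in Python. The sketch assumes uniform oligonucleic acid length, a fixed overlap, and alternating strands, and it ignores melting temperature balancing and other considerations a practical design would address. The function names and the example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def design_overlapping_oligos(target, oligo_len, overlap):
    """Tile the target with windows sharing `overlap` bases; odd windows use the opposite strand."""
    step = oligo_len - overlap
    if overlap <= 0 or step <= 0:
        raise ValueError("overlap must be positive and shorter than oligo_len")
    oligos, start, index = [], 0, 0
    while True:
        window = target[start:start + oligo_len].upper()
        oligos.append(window if index % 2 == 0 else reverse_complement(window))
        if start + oligo_len >= len(target):
            break
        start += step
        index += 1
    return oligos

def reassemble(oligos, overlap):
    """Rebuild the sense strand from the tiled oligos to confirm full coverage."""
    sense = [o if i % 2 == 0 else reverse_complement(o) for i, o in enumerate(oligos)]
    return sense[0] + "".join(s[overlap:] for s in sense[1:])

# Hypothetical 32-base target, for illustration only.
target = "ATGGCTAGCTAGGATCCGATCGTACGATTAGC"
oligos = design_overlapping_oligos(target, oligo_len=12, overlap=4)
```

Each oligonucleic acid produced this way is complementary to the end of its neighbor over the overlap region, which is what allows a polymerase to fill the remaining gaps.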
- the nanoreactor is separated from the surface 607 and positioned for interaction with a surface having primers for PCR 608 .
- the nanoreactor is subject to PCR 609 and the larger nucleic acids are amplified.
- the nanochamber is opened 611 , error correction reagents are added 612 , the chamber is sealed 613 and an error correction reaction occurs to remove mismatched base pairs and/or strands with poor complementarity from the double-stranded PCR amplification products 614 .
- the nanoreactor is opened and separated 615 . Error corrected product is next subject to additional processing steps, such as PCR, nucleic acid sorting, and/or molecular bar coding, and then packaged 622 for shipment 623 .
- quality control measures are taken.
- quality control steps include, for example, interaction with a wafer having sequencing primers for amplification of the error corrected product 616 , sealing the wafer to a chamber containing error corrected amplification product 617 , and performing an additional round of amplification 618 .
- the nanoreactor is opened 619 and the products are pooled 620 and sequenced 621 .
- nucleic acid sorting is performed prior to sequencing.
- the packaged product 622 is approved for shipment 623 .
- the product is a library of precursor nucleic acids to be assembled using scar-free assembly methods and compositions described herein.
- the primer binding sequence is a universal primer binding sequence shared among all primers in a reaction.
- different sets of primers are used for generating different final nucleic acids.
- multiple populations of primers each have their own “universal” primer binding sequence that is directed to hybridize with universal primer binding sites on multiple nucleic acids in a library. In such a configuration, different nucleic acids within a population share a universal primer binding site, but differ in other sequence elements. Thus, multiple populations of nucleic acids may be used as a template in primer extension reactions in parallel through the use of different universal primer binding sites.
- Universal primers may comprise a fusion site sequence that is partially or completely complementary to a sticky end motif of one of the nucleic acids. The combination of a primer binding sequence and the sticky end motif sequence is used to hybridize the primer to template nucleic acids.
- primers and/or adaptor sequences further comprise a recognition sequence for a cleavage agent, such as a nicking enzyme.
- primers and/or primer binding sequences in an adaptor sequence further comprise a recognition sequence for a cleavage agent, such as a nicking enzyme.
- a nicking enzyme recognition sequence is introduced to extension products by a primer.
- Primer extension may be used to introduce a sequence element other than a typical DNA or RNA Watson-Crick base pair, including, without being limited to, a uracil, a mismatch, a loop, or a modified nucleoside; and thus creates a non-canonical base pair in a double-stranded target nucleic acid or fragment thereof.
- Primers are designed to contain such sequences in a way that still allows efficient hybridization that leads to primer extension.
- Such non-Watson-Crick sequence elements may be used to create a nick on one strand of the resulting double-stranded nucleic acid amplicon.
- a primer extension reaction is used to produce extension products incorporating uracil into a precursor nucleic acid fragment sequence.
- primer extension reactions may be performed linearly or exponentially.
- a polymerase in a primer extension reaction is a ‘Family A’ polymerase lacking 3′-5′ proofreading activity.
- a polymerase in a primer extension reaction is a Family B high fidelity polymerase engineered to tolerate base pairs comprising uracil.
- a polymerase in a primer extension reaction is a Kappa Uracil polymerase, a FusionU polymerase, or a Pfu turbo polymerase as commercially available.
- the generation of an overhang described herein in a double-stranded nucleic acid may comprise creating two independent single-stranded nicks at an end of the double-stranded nucleic acid.
- the two independent single-stranded nicks are generated by two cleavage agents having cleavage activities independent from each other.
- a nick is created by including a recognition site for a cleavage agent, for example in an adaptor region or fusion site.
- a cleavage agent is a nicking endonuclease using a nicking endonuclease recognition sequence or any other agent that produces a site-specific single-stranded cut.
- a mismatch repair agent that creates a gap at the site of a mismatched base-pair, or a base excision system that creates a gap at the site of a recognized nucleoside, such as a deoxy-uridine, is used to create a single-stranded cut.
- a deoxy-uridine is a non-canonical base in a non-canonical base pair formed with a deoxy-adenine, a deoxy-guanine, or a deoxy-thymine.
- a nucleic acid comprises a deoxy-uridine/deoxy-adenine base pair.
- a glycosylase such as UDG
- an AP endonuclease such as endonuclease VIII
- a second nick is created similarly using any suitable single-stranded site-specific cleavage agent; wherein the second nick is created at a site not directly across from the first nick in the double-stranded nucleic acid.
- Such pairs of staggered nicks, when in proximity to each other and under appropriate reaction conditions, cause a sticky end when parts of the original nucleic acid melt away from each other.
- one or more of the cleavage sites are situated apart from the sequence of the fusion site.
- Two nicks in a double-stranded nucleic acid may be created such that the resulting overhang is co-extensive with the span of a sticky end site.
- a first nick is created at the juncture between sticky end site and adaptor region at one end of a nucleic acid; and a second nick is created at the other end of the sticky end site.
- a mixture of enzymatic uracil excision activity and nicking endonuclease activity may be provided in a mixture of engineered fragments.
- a strand-adjacent nicking enzyme is provided, such that sticky ends that reanneal to their cleaved terminal ends and are re-ligated across a single-strand will be re-subjected to single-strand nicking due to the reconstitution of the strand-adjacent nicking site.
- Overhangs of various sizes are prepared by adjusting the distance between two nicks on opposite strands of the end of a double-stranded nucleic acid.
- the distance or the length of an overhang is equal to or less than 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 bases.
- Overhangs may be 3′ or 5′ overhangs.
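- As a rough illustration of how the spacing of two staggered nicks sets the overhang, the following Python sketch treats the duplex as its first strand written 5′ to 3′ and each nick as a coordinate on that strand. The function name `sticky_end`, the coordinate convention, and the example sequence are assumptions made for illustration and are not part of the disclosure; the 15-base ceiling mirrors the overhang lengths listed above. The sketch also assumes that the short terminal duplex beyond the nicks melts away.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def sticky_end(top: str, top_nick: int, bottom_nick: int, max_len: int = 15) -> tuple[str, str]:
    """Describe the overhang left on the retained (left-hand) fragment.

    top         -- first strand of the duplex, written 5'->3' (hypothetical input)
    top_nick    -- nick on the first strand, between bases top_nick-1 and top_nick
    bottom_nick -- nick on the second strand, in the same first-strand coordinates
    Assumption: the terminal piece to the right of both nicks dissociates.
    """
    lo, hi = sorted((top_nick, bottom_nick))
    if not 0 <= lo <= hi <= len(top):
        raise ValueError("nick coordinates must lie within the duplex")
    length = hi - lo
    if length == 0:
        return ("blunt", "")
    if length > max_len:
        raise ValueError("nicks are too far apart for the overhang range considered")
    if top_nick > bottom_nick:
        # The first strand extends further; its exposed end is a 3' end.
        return ("3' overhang", top[lo:hi])
    # The second strand extends further; its exposed end is a 5' end.
    return ("5' overhang", reverse_complement(top[lo:hi]))
```

- Moving either nick by one base changes the overhang length by one base, and swapping which strand carries the more distal nick switches between a 3′ and a 5′ overhang.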
- the cleavage site of a cleavage agent is a fixed distance away from its recognition site. In some cases, the fixed distance between a cleavage agent's cleavage site and recognition site is more than or equal to about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 bases or more.
- the fixed distance between a cleavage agent's cleavage site and recognition site is 2-10 bases, 3-9 bases, or 4-8 bases.
- the cleavage site of a cleaving agent may be outside of its recognition site, for example, it is adjacent to its recognition site and the agent is a strand-adjacent nicking enzyme. In some cases, the recognition site of a cleavage agent is not cleaved.
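- The relationship between a recognition site and a cleavage site located a fixed distance away can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `nick_positions`, the `offset` parameter, and the example strand are hypothetical, and the offset used in the usage example is illustrative only and is not the documented cut distance of any particular enzyme.

```python
def nick_positions(strand: str, recognition: str, offset: int) -> list[int]:
    """Return cut coordinates a fixed distance 3' of each recognition site.

    strand      -- DNA strand written 5'->3' (hypothetical input)
    recognition -- recognition sequence of the cleavage agent
    offset      -- bases between the end of the recognition site and the cut;
                   0 models a cut immediately adjacent to the recognition site
    A coordinate c means the cut falls between bases c-1 and c.
    """
    if offset < 0:
        raise ValueError("offset must be non-negative in this sketch")
    cuts = []
    start = strand.find(recognition)
    while start != -1:
        cut = start + len(recognition) + offset
        if cut <= len(strand):  # ignore cuts that would fall off the strand
            cuts.append(cut)
        start = strand.find(recognition, start + 1)
    return cuts
```

- For example, `nick_positions("TTCTGAAGACGTACGTAA", "CTGAAG", 4)` places the cut four bases past the recognition site, while an offset of 0 places it directly adjacent; in both cases the recognition sequence itself is left uncut.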
- a double-stranded nucleic acid disclosed herein may be modified to comprise a non-canonical base.
- a nucleic acid fragment having a sticky end motif such as A (N x ) T (SEQ ID NO.: 1) or G (N x ) C (SEQ ID NO.: 16) is prepared.
- the fragment further comprises a recognition site for a single-strand cleavage agent, such as a nicking endonuclease, having a cleavage site immediately adjacent to the last base in the sticky end motif sequence.
- the recognition site is introduced by a primer in a nucleic acid extension reaction using a strand of the fragment comprising the sticky end motif as a template.
- the recognition site is appended to the end of the fragment in an adaptor region.
- a nucleic acid extension reaction using the strand of the fragment comprising the sticky end motif, such as A (N x ) T (SEQ ID NO.: 1) or G (N x ) C (SEQ ID NO.: 16), as a template is primed with a primer comprising a sticky end sequence comprising a non-canonical base substitution.
- for a sticky end motif of A (N x ) T (SEQ ID NO.: 1) in a template, one such primer comprises the sequence A (N x )′ U (SEQ ID NO.: 18), wherein (N x )′ is partially or completely reverse complementary to (N x ).
- one such primer comprises the sequence A (N x ) U (SEQ ID NO.: 19).
- the A (N x )′ U (SEQ ID NO.: 18) and/or A (N x ) U (SEQ ID NO.: 19) sequence on the primer is located at the very 3′ end of the primer.
- a plurality of such primers each having a sequence of A (N x )′ U (SEQ ID NO.: 18) and/or A (N x ) U (SEQ ID NO.: 19) corresponding to a sequence of A (N x ) T in one strand of a fragment may be used to perform a nucleic acid extension reaction.
- the exemplary sequences described have a sticky end motif comprising a first A or G and a terminal T or C prior to non-canonical base incorporation. However, any sticky end motif sequence is useful with the methods described herein.
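- The A (N x ) T motif and its uracil-substituted A (N x ) U counterpart can be illustrated with a short Python sketch. The helper names `find_motifs` and `substitute_terminal_uracil` and the example strand are assumptions introduced for illustration; the sketch covers only the A (N x ) T form and simply replaces the terminal T with U in the same sequence orientation.

```python
def find_motifs(strand: str, x: int, first: str = "A", last: str = "T") -> list[int]:
    """Return start indices of every first-(N_x)-last window in a 5'->3' strand."""
    width = x + 2  # first base + x internal bases + last base
    return [
        i
        for i in range(len(strand) - width + 1)
        if strand[i] == first and strand[i + width - 1] == last
    ]


def substitute_terminal_uracil(motif: str) -> str:
    """Turn an A(N_x)T motif into the corresponding A(N_x)U primer sequence."""
    if not (motif.startswith("A") and motif.endswith("T")):
        raise ValueError("expected a motif of the form A(N_x)T")
    return motif[:-1] + "U"
```

- With the hypothetical strand `"GGACCGTTCC"` and x = 4, the only A (N 4 ) T window starts at index 2 (`ACCGTT`), and the substituted sequence is `ACCGTU`.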
- each double-stranded nucleic acid precursor fragment of the n double-stranded nucleic acid fragments comprises a first nicking endonuclease recognition site, a first fusion site, a variable insert of predetermined fragment sequence, a second fusion site, and a second nick enzyme recognition site, optionally in that order.
- the first fusion site comprises or is a first sticky end motif and the second fusion site comprises or is a second sticky end motif.
- the first fusion site has the sequence of 5′-A (N x ) i,1 U-3′ (SEQ ID NO.: 13) in the first strand, wherein N x denotes x bases or nucleosides and the subscript “ i,1 ” in (N x ) i,1 denotes the first strand of the ith fragment.
- the second fusion site has the sequence of 5′-A (N x ) i,2 U-3′ (SEQ ID NO.: 14) in the second strand, wherein N x denotes x bases or nucleosides and the subscript “ i,2 ” in (N x ) i,2 denotes the second strand of the ith fragment.
- (N x ) i,2 is completely or partially reverse complementary to (N x ) i+1,1 in the first strand of the i+1'th fragment.
- Each N x found in the fusion site sequences is the same as or different from the N x in any other fusion site sequence found within the fragment library.
- the first nicking endonuclease recognition site is positioned such that there is a corresponding cleavage site immediately 3′ of the first fusion site in the second strand and the second nicking endonuclease recognition site is positioned such that there is a corresponding cleavage site immediately 3′ of the second fusion site in the first strand.
- a fragment library may comprise a starter DNA fragment comprising a variable insert, a second fusion site, and a second nick enzyme recognition site.
- the second fusion site of the starter DNA fragment comprises a sequence of 5′-A (N x ) s,2 U-3′ (SEQ ID NO.: 20), wherein the subscript “ s,2 ” in (N x ) s,2 denotes the second strand of the starter fragment and (N x ) s,2 is reverse complementary to (N x ) 1,1 in one of the fusion sites of the first nucleic acid fragment in the library.
- the fragment library may also comprise a finishing DNA fragment comprising a first nicking endonuclease recognition site, a first fusion site, and a variable insert.
- the first fusion site comprises a sequence of 5′-A (N x ) f,1 U-3′ (SEQ ID NO.: 21), wherein the subscript “ f,1 ” in (N x ) f,1 denotes the first strand of the finishing fragment and (N x ) f,1 is reverse complementary to (N x ) n,2 in one of the fusion sites of the nth nucleic acid fragment in the library.
- the first and/or the second nicking endonuclease recognition sites are the same in all the fragments in the fragment library.
- the fragment library comprises about or at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 75, 100, 125, 150, 200, 250, 500, or more nucleic acid fragments.
- the fragment library comprises 2-75 fragments, 4-125 fragments, or 5-10 fragments.
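- The pairing rule between neighboring fragments, in which (N x ) i,2 is reverse complementary to (N x ) i+1,1 , can be checked with a small Python sketch. The `Fragment` class, its field names, and the example sequences are hypothetical and introduced only for illustration. The sketch tests complete reverse complementarity only, although partial complementarity is also contemplated above.

```python
from dataclasses import dataclass
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass
class Fragment:
    name: str
    first_nx: Optional[str]   # (N_x) of the first fusion site; None for a starter fragment
    second_nx: Optional[str]  # (N_x) of the second fusion site; None for a finishing fragment


def junction_errors(ordered: list[Fragment]) -> list[tuple[str, str]]:
    """Return the (left, right) name pairs whose fusion sites do not pair."""
    errors = []
    for left, right in zip(ordered, ordered[1:]):
        compatible = (
            left.second_nx is not None
            and right.first_nx is not None
            and reverse_complement(left.second_nx) == right.first_nx
        )
        if not compatible:
            errors.append((left.name, right.name))
    return errors
```

- An empty result indicates that every junction from the starter fragment through the finishing fragment pairs as intended for the given order; any listed pair points to a junction whose N x sequences would need to be redesigned.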
- Each primer within the library may comprise a recognition sequence, such as a nicking endonuclease recognition sequence, and a fusion sequence comprising a sticky end motif.
- a sticky end motif having the sequence 5′-A (N x ) i U-3′ (SEQ ID NO.: 15).
- the recognition sequence is positioned 5′ of the fusion site sequence.
- the recognition sequence is positioned such that the recognition site in a primer is capable of generating a corresponding cleavage site in a reverse complementary DNA strand 3′ of a first fusion site in the reverse complementary DNA strand, if the primer were hybridized to the reverse complementary DNA strand such that the fusion sequence hybridizes to the first fusion site in the reverse complementary DNA strand.
- a primer library described herein comprises about or at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 75, 100, 125, 150, 200, 250, 500, or more primers.
- nick generally refers to enzymatic cleavage of only one strand of a double-stranded nucleic acid at a particular region, while leaving the other strand intact, regardless of whether one or more bases are removed. In some cases, one or more bases are removed while in other cases no bases are removed and only phosphodiester bonds are broken.
- such cleavage events leave behind intact double-stranded regions lacking nicks that are a short distance apart from each other on the double-stranded nucleic acid, for example a distance of about or at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 bases or more.
- the distance between the intact double-stranded regions is equal to or less than 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 bases.
- the distance between the intact double-stranded regions is 2 to 10 bases, 3 to 9 bases, or 4 to 8 bases.
- Cleavage agents used in methods described herein may be selected from nicking endonucleases, DNA glycosylases, or any single-stranded cleavage agents described in further detail elsewhere herein.
- Enzymes for cleavage of single-stranded DNA may be used for cleaving heteroduplexes in the vicinity of mismatched bases, D-loops, heteroduplexes formed between two strands of DNA which differ by a single base, an insertion or deletion.
- Mismatch recognition proteins that cleave one strand of the mismatched DNA in the vicinity of the mismatch site may be used as cleavage agents.
- Nonenzymatic cleaving may also be done through photodegradation of a linker introduced through a custom oligonucleotide used in a PCR reaction.
- fragments are designed and synthesized such that the inherent cleavage sites are utilized in the preparation of fragments for assembly. For instance, these inherent cleavage sites are supplemented with a cleavage site that is introduced, e.g., by recognition sites in adaptor sequences, by a mismatch, by a uracil, and/or by an un-natural nucleoside.
- described herein is a plurality of double stranded nucleic acids such as dsDNA, comprising an atypical DNA base pair comprising a non-canonical base in a fusion site and a recognition site for a single-strand cleaving agent.
- compositions according to embodiments described herein, in many cases, comprise two or more cleaving agents.
- a first cleaving agent has the atypical DNA base pair as its recognition site and the cleaving agent cleaves a single-strand at or a fixed distance away from the atypical DNA base pair.
- a second cleaving agent has an independent single-strand cleaving and/or recognition activity from the first cleaving agent.
- the nucleic acid molecules in the composition are such that the recognition site for the second single-strand cleaving agent is not naturally adjacent to the fusion site or the remainder of the nucleic acid in any of the plurality of double stranded nucleic acids in the composition.
- the cleavage sites of two cleavage agents are located on opposite strands.
- a Type II restriction endonuclease is used as a cleavage agent.
- Type II enzymes cleave within or at short specific distances from a recognition site.
- Type II restriction endonucleases comprise many sub-types with varying activities.
- Exemplary Type II restriction endonucleases include, without limitation, Type IIP, Type IIF, Type IIB (e.g. BcgI and BplI), Type IIE (e.g. NaeI), and Type IIM (DpnI) restriction endonucleases.
- Type II enzymes are those like HhaI, HindIII, and NotI that cleave DNA within their recognition sequences. Many recognize DNA sequences that are symmetric, because, without being bound by theory, they bind to DNA as homodimers, but a few (e.g., BbvCI: CCTCAGC (SEQ ID NO.: 22)) recognize asymmetric DNA sequences, because, without being bound by theory, they bind as heterodimers.
- Some enzymes recognize continuous sequences (e.g., EcoRI: GAATTC (SEQ ID NO.: 23)) in which the two half-sites of the recognition sequence are adjacent, while others recognize discontinuous sequences (e.g., BglI: GCC GGC (SEQ ID NO.: 24)) in which the half-sites are separated.
- Type II enzymes are those like FokI and AlwI that cleave outside of their recognition sequence to one side.
- Type IIS enzymes recognize sequences that are continuous and asymmetric.
- Type IIS restriction endonucleases e.g. FokI
- Type IIS enzymes typically comprise two distinct domains, one for DNA binding, and the other for DNA cleavage.
- Type IIA restriction endonucleases recognize asymmetric sequences but can cleave symmetrically within the recognition sequences (e.g. BbvCI cleaves 2 bases downstream of the 5′-end of each strand of CCTCAGC (SEQ ID NO.: 25)). Similar to Type IIS restriction endonucleases, Type IIT restriction enzymes (e.g., Bpu10I and BslI) are composed of two different subunits. Type IIG restriction enzymes, the third major kind of Type II enzyme, are large, combination restriction-and-modification enzymes. Type IIG restriction endonucleases (e.g. Eco57I) do have a single subunit, like classical Type II restriction enzymes. The two enzymatic activities typically reside in the same protein chain.
- These enzymes cleave outside of their recognition sequences and can be classified as those that recognize continuous sequences (e.g., AcuI: CTGAAG (SEQ ID NO.: 26)) and cleave on just one side; and those that recognize discontinuous sequences (e.g., BcgI: CGA TGC (SEQ ID NO.: 27)) and cleave on both sides releasing a small fragment containing the recognition sequence.
- these enzymes may switch into either restriction mode to cleave the DNA, or modification mode to methylate it.
- Type III enzymes are also large combination restriction-and-modification enzymes. They cleave outside of their recognition sequences and require two such sequences in opposite orientations within the same DNA molecule to accomplish cleavage.
- Type IV enzymes recognize modified DNA, e.g. methylated, hydroxymethylated and glucosyl-hydroxymethylated DNA and are exemplified by the McrBC and Mrr systems of E. coli.
- nicking endonucleases typically recognize non-palindromes. They can be bona fide nicking enzymes, such as the frequent cutters Nt.CviPII and Nt.CviQII, or rare-cutting homing endonucleases (HEases) I-BasI and I-HmuI, both of which recognize a degenerate 24-bp sequence.
- isolated large subunits of heterodimeric Type IIS REases such as BtsI, BsrDI and BstNBI/BspD6I display nicking activity.
- restriction endonucleases that make double-strand cuts may be retained by engineering variants of these enzymes such that they make single-strand breaks.
- recognition sequence-specific nicking endonucleases are used as cleavage agents that cleave only a single-strand of double-stranded DNA at a cleavage site.
- Nicking endonucleases useful in various embodiments of methods and compositions described herein include Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII, used either alone or in various combinations.
- nicking endonucleases that cleave outside of their recognition sequence, e.g. Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII, are used.
- nicking endonucleases that cut within their recognition sequences e.g. Nb.BbvCI, Nb.BsmI, or Nt.BbvCI are used.
- Recognition sites for the various specific cleavage agents used herein, such as the nicking endonucleases, comprise a specific nucleic acid sequence.
- nickase Nb.BbvCI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site (with “
- nickase Nb.BsmI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
- nickase Nb.BsrDI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
- nickase Nb.BtsI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
- nickase Nt.AlwI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
- nickase Nt.BbvCI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
- nickase Nt.BsmAI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
- nickase Nt.BspQI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
- nickase Nt.BstNBI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
- nickase Nt.CviPII (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site (wherein D denotes A or G or T and wherein H denotes A or C or T):
- a non-canonical base and/or a non-canonical base pair in a sticky end motif and/or adaptor sequence may be recognized by an enzyme for cleavage at its 5′ or 3′ end.
- the non-canonical base and/or non-canonical base pair comprises a uracil base.
- the enzyme is a DNA repair enzyme.
- the base and/or non-canonical base pair is recognized by an enzyme that catalyzes a first step in base excision, for example, a DNA glycosylase.
- a DNA glycosylase is useful for removing a base from a nucleic acid while leaving the backbone of the nucleic acid intact, generating an apurinic or apyrimidinic site, or AP site. This removal is accomplished by flipping the base out of a double-stranded nucleic acid followed by cleavage of the N-glycosidic bond.
- the non-canonical base or non-canonical base pair may be recognized by a bifunctional glycosylase.
- the glycosylase removes a non-canonical base from a nucleic acid by N-glycosylase activity.
- the resulting apurinic/apyrimidinic (AP) site is then incised by the AP lyase activity of the bifunctional glycosylase via β-elimination of the 3′ phosphodiester bond.
- the glycosylase and/or DNA repair enzyme may recognize a uracil or a non-canonical base pair comprising uracil, for example U:G and/or U:A.
- Nucleic acid base substrates recognized by a glycosylase include, without limitation, uracil, 3-meA (3-methyladenine), hypoxanthine, 8-oxoG, FapyG, FapyA, Tg (thymine glycol), hoU (hydroxyuracil), hmU (hydroxymethyluracil), fU (formyluracil), hoC (hydroxycytosine), fC (formylcytosine), oxidized base, alkylated base, deaminated base, methylated base, and any non-canonical nucleobase provided herein or known in the art.
- the glycosylase and/or DNA repair enzyme recognizes oxidized bases such as 2,6-diamino-4-hydroxy-5-formamidopyrimidine (FapyG) and 8-oxoguanine (8-oxo).
- Glycosylases and/or DNA repair enzymes which recognize oxidized bases include, without limitation, OGG1 (8-oxoG DNA glycosylase 1) or E. coli Fpg (recognizes 8-oxoG:C pair), MYH (MutY homolog DNA glycosylase) or E. coli MutY (recognizes 8-oxoG:A), NEIL1, NEIL2 and NEIL3.
- the glycosylase and/or DNA repair enzyme recognizes methylated bases such as 3-methyladenine.
- Examples of glycosylases that recognize methylated bases are E. coli AlkA or 3-methyladenine DNA glycosylase II, Mag1, and MPG (methylpurine glycosylase).
- Additional non-limiting examples of glycosylases include SMUG1 (single-strand specific monofunctional uracil DNA glycosylase 1), TDG (thymine DNA glycosylase), MBD4 (methyl-binding domain glycosylase 4), and NTHL1 (endonuclease III-like 1).
- Exemplary DNA glycosylases include, without limitation, uracil DNA glycosylases (UDGs), helix-hairpin-helix (HhH) glycosylases, 3-methyl-purine glycosylase (MPG) and endonuclease VIII-like (NEIL) glycosylases.
- Helix-hairpin-helix (HhH) glycosylases include, without limitation, Nth (homologs of the E. coli EndoIII protein), OggI (8-oxoG DNA glycosylase I), MutY/Mig (A/G-mismatch-specific adenine glycosylase), AlkA (alkyladenine-DNA glycosylase), MpgII (N-methylpurine-DNA glycosylase II), and OggII (8-oxoG DNA glycosylase II).
- Exemplary 3-methyl-purine glycosylase (MPG) substrates include, in non-limiting examples, alkylated bases including 3-meA, 7-meG, 3-meG and ethylated bases.
- Endonuclease VIII-like glycosylase substrates include, without limitation, oxidized pyrimidines (e.g., Tg, 5-hC, FaPyA, FaPyG), 5-hU and 8-oxoG.
- Exemplary uracil DNA glycosylases include, without limitation, thermophilic uracil DNA glycosylases, uracil-N glycosylases (UNGs), mismatch-specific uracil DNA glycosylases (MUGs) and single-strand specific monofunctional uracil DNA glycosylases (SMUGs).
- UNGs include UNG1 isoforms and UNG2 isoforms.
- MUGs include thymine DNA glycosylase (TDG).
- the non-canonical base pair included in a fragment disclosed herein is a mismatch base pair, for example a homopurine pair or a heteropurine pair.
- a primer described herein comprises one or more bases which form a mismatch base pair with a base of a target nucleic acid or with a base of an adaptor sequence connected to a target nucleic acid.
- an endonuclease, exonuclease, glycosylase, DNA repair enzyme, or any combination thereof recognizes the mismatch pair for subsequent removal and cleavage.
- the TDG enzyme is capable of excising thymine from G:T mismatches.
- the non-canonical base is released from a dsDNA molecule by a DNA glycosylase resulting in an abasic site.
- This abasic site (AP site) is further processed by an endonuclease which cleaves the phosphate backbone at the abasic site.
- Endonucleases included in methods herein may be AP endonucleases.
- the endonuclease is a class I or class II AP endonuclease which incises DNA at the phosphate groups 3′ and 5′ to the baseless site leaving 3′ OH and 5′ phosphate termini.
- the endonuclease may also be a class III or class IV AP endonuclease which cleaves DNA at the phosphate groups 3′ and 5′ to the baseless site to generate 3′ phosphate and 5′ OH.
- an endonuclease cleaving a fragment disclosed herein is an AP endonuclease which is grouped in a family based on sequence similarity and structure, for example, AP endonuclease family 1 or AP endonuclease family 2.
- Examples of AP endonuclease family 1 members include, without limitation, E. coli exonuclease III, S. pneumoniae and B. subtilis exonuclease A, mammalian AP endonuclease 1 (AP1), Drosophila recombination repair protein 1, Arabidopsis thaliana apurinic endonuclease-redox protein, Dictyostelium DNA-(apurinic or apyrimidinic site) lyase, enzymes comprising one or more domains thereof, and enzymes having at least 75% sequence identity to one or more domains or regions thereof.
- AP endonuclease family 2 members include, without limitation, bacterial endonuclease IV, fungal and Caenorhabditis elegans apurinic endonuclease APN1, Dictyostelium endonuclease 4 homolog, Archaeal probable endonuclease 4 homologs, mimivirus putative endonuclease 4, enzymes comprising one or more domains thereof, and enzymes having at least 75% sequence identity to one or more domains or regions thereof.
- endonucleases include endonucleases derived from both Prokaryotes (e.g., endonuclease IV, RecBCD endonuclease, T7 endonuclease, endonuclease II) and Eukaryotes (e.g., Neurospora endonuclease, S1 endonuclease, P1 endonuclease, Mung bean nuclease I, Ustilago nuclease).
- the endonuclease is S1 endonuclease. In some instances, the endonuclease is endonuclease III. The endonuclease may be an endonuclease IV. In some cases, an endonuclease is a protein comprising an endonuclease domain having endonuclease activity, i.e., it cleaves a phosphodiester bond.
- a non-canonical base is removed with a DNA excision repair enzyme and endonuclease or lyase, wherein the endonuclease or lyase activity is optionally from an excision repair enzyme or a region of the excision repair enzyme.
- Excision repair enzymes include, without limitation, Methyl Purine DNA Glycosylase (recognizes methylated bases), 8-Oxo-GuanineGlycosylase 1 (recognizes 8-oxoG:C pairs and has lyase activity), Endonuclease Three Homolog 1 (recognizes T-glycol, C-glycol, and formamidopyrimidine and has lyase activity), inosine, hypoxanthine-DNA glycosylase; 5-Methylcytosine, 5-Methylcytosine DNA glycosylase; Formamidopyrimidine-DNA-glycosylase (excision of oxidized residue from DNA: hydrolysis of the N-glycosidic bond (DNA glycosylase), and beta-elimination (AP-lyase reaction)).
- DNA excision repair enzyme is uracil DNA glycosylase.
- DNA excision repair enzymes also include, without limitation, Aag (catalyzes excision of 3-methyladenine, 3-methylguanine, 7-methylguanine, hypoxanthine, 1,N6-ethenoadenine), endonuclease III (catalyzes excision of cis- and trans-thymine glycol, 5,6-dihydrothymine, 5,6-dihydroxydihydrothymine, 5-hydroxy-5-methylhydantoin, 6-hydroxy-5,6-dihydropyrimidines, 5-hydroxycytosine and 5-hydroxyuracil, 5-hydroxy-6-hydrothymine, 5,6-dihydrouracil, 5-hydroxy-6-hydrouracil, AP sites, uracil glycol, methyltartronylurea, alloxan), and endonuclease V (cleaves AP sites on dsDNA and ssDNA).
- Non-limiting DNA excision repair enzymes are listed in Curr Protoc Mol Biol. 2008 October; Chapter 3: Unit3.9.
- DNA excision repair enzymes such as endonucleases, may be selected to excise a specific non-canonical base.
- endonuclease V, T. maritima is a 3′-endonuclease which initiates the removal of deaminated bases such as uracil, hypoxanthine, and xanthine.
- a DNA excision repair enzyme having endonuclease activity functions to remove a modified or non-canonical base from a strand of a dsDNA molecule without the use of an enzyme having glycosylase activity.
- DNA repair enzyme comprises glycosylase activity, lyase activity, endonuclease activity, or any combination thereof.
- one or more DNA excision repair enzymes are used in the methods described herein, for example one or more glycosylases or a combination of one or more glycosylases and one or more endonucleases.
- Fpg formamidopyrimidine [fapy]-DNA glycosylase
- 8-oxoguanine DNA glycosylase acts both as a N-glycosylase and an AP-lyase.
- the N-glycosylase activity releases a non-canonical base (e.g., 8-oxoguanine, 8-oxoadenine, fapy-guanine, methyl-fapy-guanine, fapy-adenine, aflatoxin B1-fapy-guanine, 5-hydroxy-cytosine, 5-hydroxy-uracil) from dsDNA, generating an abasic site.
- the lyase activity then cleaves both 3′ and 5′ to the abasic site thereby removing the abasic site and leaving a 1 base gap or nick.
- Additional enzymes which comprise more than one enzymatic activity include, without limitation, endonuclease III (Nth) protein from E. coli (N-glycosylase and AP-lyase) and Tma endonuclease III (N-glycosylase and AP-lyase).
- mismatch endonucleases are used to nick DNA in the region of mismatches or damaged DNA, including but not limited to T7 Endonuclease I, E. coli Endonuclease V, T4 Endonuclease VII, mung bean nuclease, Cel-1 endonuclease, E. coli Endonuclease IV and UVDE.
- such enzymes can detect polynucleotide loops and insertions, detect mismatches in base pairing, recognize sequence differences in polynucleotide strands between about 100 bp and 3 kb in length and recognize such mutations in a target polynucleotide sequence without substantial adverse effects of flanking DNA sequences.
- exonuclease comprises 3′ DNA polymerase activity.
- Exonucleases include those enzymes in the following groups: exonuclease I, exonuclease II, exonuclease III, exonuclease IV, exonuclease V, exonuclease VI, exonuclease VII, and exonuclease VIII.
- an exonuclease has AP endonuclease activity.
- the exonuclease is any enzyme comprising one or more domains or amino acid regions suitable for cleaving nucleotides from either 5′ or 3′ end or both ends, of a nucleic acid chain.
- Exonucleases include wild-type exonucleases and derivatives, chimeras, and/or mutants thereof.
- Mutant exonucleases include enzymes comprising one or more mutations, insertions, deletions or any combination thereof within the amino acid or nucleic acid sequence of an exonuclease.
- a polymerase is provided to a reaction comprising an enzyme treated dsDNA molecule, wherein one or more non-canonical bases of the dsDNA molecule has been excised, for example, by treatment with one or more DNA repair enzymes.
- the DNA product has been treated with a glycosylase and endonuclease to remove a non-canonical base.
- one or more nucleotides e.g., dNTPs
- the DNA product has been treated with a UDG and endonuclease VIII to remove at least one uracil.
- one or more nucleotides are provided to a reaction comprising the treated dsDNA molecule and the polymerase.
- site-specific base excision reagents comprising one or more enzymes are used as cleavage agents that cleave only a single strand of double-stranded DNA at a cleavage site.
- a number of repair enzymes are suitable alone or in combination with other agents to generate such nicks.
- a DNA repair enzyme is a native or an in vitro-created chimeric protein with one or more activities.
- Cleavage agents, in various embodiments, comprise enzymatic activities, including enzyme mixtures, which include one or more of nicking endonucleases, AP endonucleases, glycosylases and lyases involved in base excision repair.
- a damaged base is removed by a DNA enzyme with glycosylase activity, which hydrolyses an N-glycosylic bond between the deoxyribose sugar moiety and the base.
- an E. coli glycosylase and a UDG endonuclease act upon deaminated cytosine while two 3-mAde glycosylases from E. coli (TagI and TagII) act upon alkylated bases.
- the product of removal of a damaged base by a glycosylase is an AP site (apurinic/apyrimidinic site), also known as an abasic site, which is a location in a nucleic acid that has neither a purine nor a pyrimidine base.
- DNA repair systems are often used to correctly replace the AP site. This is achieved in various instances by an AP endonuclease that nicks the sugar phosphate backbone adjacent to the AP site, after which the abasic sugar is removed. Some naturally occurring or synthetic repair systems include activities, such as the DNA polymerase/DNA ligase activity, to insert a new nucleotide.
- AP endonucleases are classified according to their sites of incision. Class I AP endonucleases and class II AP endonucleases incise DNA at the phosphate groups 3′ and 5′ to the baseless site leaving 3′-OH and 5′-phosphate termini. Class III and class IV AP endonucleases also cleave DNA at the phosphate groups 3′ and 5′ to the baseless site, but they generate a 3′-phosphate and a 5′-OH.
- AP endonucleases remove moieties attached to the 3′ OH that inhibit polynucleotide polymerization. For example a 3′ phosphate is converted to a 3′ OH by E. coli endonuclease IV.
- AP endonucleases work in conjunction with glycosylases to engineer nucleic acids at a site of mismatch, a non-canonical nucleoside or a base that is not one of the major nucleosides for a nucleic acid, such as a uracil in a DNA strand.
- glycosylase substrates include, without limitation, uracil, hypoxanthine, 3-methyladenine (3-mAde), formamidopyrimidine (FAPY), 7,8 dihydro-8-oxyguanine and hydroxymethyluracil.
- glycosylase substrates are incorporated into DNA site-specifically by nucleic acid extension from a primer comprising the substrate.
- glycosylase substrates are introduced by chemical modification of a nucleoside, for example by deamination of cytosine, e.g. by bisulfite, nitrous acids, or spontaneous deamination, producing uracil, or by deamination of adenine by nitrous acids or spontaneous deamination, producing hypoxanthine.
- nucleic acids include generating 3-mAde as a product of alkylating agents, FAPY (7-mGua) as product of methylating agents of DNA, 7,8-dihydro-8 oxoguanine as a mutagenic oxidation product of guanine, 4,6-diamino-5-FAPY produced by gamma radiation, and hydroxymethyluracil produced by ionizing radiation or oxidative damage to thymidine.
- Some enzymes comprise AP endonuclease and glycosylase activities that are coordinated either in a concerted manner or sequentially.
- polynucleotide cleavage enzymes used to generate single-stranded nicks include the following types of enzymes derived from but not limited to any particular organism or virus or non-naturally occurring variants thereof: E. coli endonuclease IV, Tth endonuclease IV, human AP endonuclease, glycosylases, such as UDG, E. coli 3-methyladenine DNA glycosylase (AlkA) and human Aag, glycosylase/lyases, such as E. coli endonuclease III, E. coli endonuclease VIII, E. coli Fpg, human OGG1, and T4 PDG, and lyases.
- Exemplary additional DNA repair enzymes are listed in Table 1.
- USER Uracil-Specific Excision Reagent; New England BioLabs
- UDG Uracil DNA glycosylase
- apyrimidinic site while leaving the phosphodiester backbone intact.
- Endonuclease VIII is used to break the phosphodiester backbone at the 3′ and 5′ sides of the abasic site so that the base-free deoxyribose is released, creating a one nucleotide gap at the site of uracil nucleotide.
- nucleic acid fragments are treated prior to assembly into a target nucleic acid of predetermined sequence.
- nucleic acid fragments are treated to create a sticky end, such as a sticky end with a 3′ overhang or a 5′ overhang.
- uracil bases are incorporated into one or both strands of the target nucleic acids, which are chewed off upon treatment with Uracil DNA glycosylase (UDG) and Endonuclease VIII (EndoVIII).
- uracil bases are incorporated near the 5′ ends (or 3′ ends), such as at least or at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 bases from the 5′ end (or 3′ end), of one or both strands. In some cases, uracil bases are incorporated near the 5′ ends such as at most or at most about 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 base from the 5′ end, of one or both strands.
- uracil bases are incorporated near the 5′ end such as between 1-20, 2-19, 3-18, 4-17, 5-16, 6-15, 7-14, 8-13, 9-12, 10-13, 11-14 bases from the 5′ end, of one or both strands.
- the uracil bases may be incorporated near the 5′ end such that the distance between the uracil bases and the 5′ end of one or both strands may fall within a range bound by any of these values, for example from 7-19 bases.
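- As an illustration of how uracil placement sets the overhang, the following minimal Python sketch models UDG/Endonuclease VIII treatment of a blunt duplex whose top strand carries one uracil near its 5′ end. The function name `user_digest` and the simplification that the short oligo 5′ of the uracil always dissociates are assumptions made for this example, not part of the disclosure.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}

def reverse_complement(seq):
    """Return the reverse complement; U pairs with A, as T does."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))

def user_digest(top):
    """Model excision of the first uracil in `top` (written 5'->3').

    Returns (trimmed_top, overhang): the top strand left after the uracil
    and everything 5' of it are lost, and the stretch of the complementary
    strand left single-stranded as a 3' overhang (written 5'->3').
    """
    u = top.find("U")
    if u < 0:
        return top, ""            # no uracil: the duplex end stays blunt
    removed = top[: u + 1]        # short 5' oligo plus the excised uracil
    trimmed_top = top[u + 1:]
    overhang = reverse_complement(removed)
    return trimmed_top, overhang

trimmed, overhang = user_digest("ATATGUCTGGAAAGAACATGTG")
print(trimmed)    # CTGGAAAGAACATGTG
print(overhang)   # ACATAT
```

In this model a uracil placed at the sixth base from the 5′ end leaves a six-base 3′ overhang on the opposite strand; moving the uracil changes the overhang length accordingly.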
- the reaction is thermocycled between a maximum and minimum temperature to repeatedly enhance cleavage, melting, annealing, and/or ligation.
- the temperature ranges from a high of 80 degrees Celsius. In some cases, the temperature ranges to a low of 4 degrees Celsius. In some cases, the temperature ranges from 4 degrees Celsius to 80 degrees Celsius. In some cases, the temperature ranges among intermediates in this range.
- the temperature ranges from a high of 60 degrees Celsius. In some cases, the temperature ranges to a low of 16 degrees Celsius. In some cases, the temperature ranges from a high of 60 degrees Celsius to a low of 16 degrees Celsius. In some cases, the mixture is temperature cycled to allow for the removal of cleaved sticky ended distal fragments from precursor fragments at elevated temperatures and to allow for the annealing of the fragments with complementary sticky ends at a lower temperature. In some cases, alternative combinations or alternative temperatures are used. In yet more alternate cases the reactions occur at a single temperature. In some cases, palindromic sequences are excluded from overhangs. The number of fragment populations to anneal in a reaction varies across target nucleic acids.
- a ligation reaction comprises 2, 3, 4, 5, 6, 7, 8, or more than 8 types of target fragments to be assembled.
- portions of the entire nucleic acid are synthesized in separate reactions.
- intermediate nucleic acids are used in a subsequent assembly round that uses the same or a different method to assemble larger intermediates or the final target nucleic acid.
- the same or different cleavage agents, recognition sites, and cleavage sites are used in subsequent rounds of assembly.
- consecutive rounds of assembly e.g. pooled or parallel assembly, are used to synthesize larger fragments in a hierarchical manner.
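- The hierarchical scheme can be sketched in a few lines of Python. The function name `hierarchical_rounds`, the default of 8 pieces per reaction, and the use of string concatenation as a stand-in for ligation are assumptions for illustration only.

```python
def hierarchical_rounds(fragments, max_per_reaction=8):
    """Group fragments into successive assembly rounds.

    Each round joins up to `max_per_reaction` adjacent pieces into one
    intermediate; rounds repeat until a single product remains.
    Returns a list of rounds, each a list of groups (lists of pieces).
    """
    if max_per_reaction < 2:
        raise ValueError("need at least 2 pieces per reaction")
    rounds = []
    pieces = list(fragments)
    while len(pieces) > 1:
        groups = [pieces[i:i + max_per_reaction]
                  for i in range(0, len(pieces), max_per_reaction)]
        rounds.append(groups)
        # Concatenation stands in for annealing and ligation of a group.
        pieces = ["".join(group) for group in groups]
    return rounds

rounds = hierarchical_rounds(list("ABCDEFGHIJ"), max_per_reaction=4)
print(len(rounds))   # 2
print(rounds[1])     # [['ABCD', 'EFGH', 'IJ']]
```

Ten fragments joined at most four at a time need two rounds here: three intermediates in the first round, then one final assembly.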
- described herein are methods and compositions for the preparation of a target nucleic acid, wherein the target nucleic acid is a gene, using assembly of shorter fragments.
- PCR Polymerase chain reaction
- PCA non-polymerase-cycling-assembly
- PT PCR-based and non-polymerase-cycling-assembly
- Amplification reactions described herein can be performed by any means known in the art.
- the nucleic acids are amplified by polymerase chain reaction (PCR).
- Other methods of nucleic acid amplification include, for example, ligase chain reaction, oligonucleotide ligation assay, and hybridization assay.
- DNA polymerases described herein include enzymes that have DNA polymerase activity even though they may have other activities.
- a single DNA polymerase or a plurality of DNA polymerases may be used throughout the repair and copying reactions. The same DNA polymerase or set of DNA polymerases may be used at different stages of the present methods or the DNA polymerases may be varied or additional polymerase added during various steps.
- Amplification may be achieved through any process by which the copy number of a target sequence is increased, e.g. PCR. Amplification can be performed at any point during a multi reaction procedure, e.g. before or after pooling of sequencing libraries from independent reaction volumes and may be used to amplify any suitable target molecule described herein.
- Oligonucleic acids serving as target nucleic acids for assembly may be synthesized de novo in parallel.
- the oligonucleic acids may be assembled into precursor fragments which are then assembled into target nucleic acids. In some cases, greater than about 100, 1000, 16,000, 50,000 or 250,000 or even greater than about 1,000,000 different oligonucleic acids are synthesized together. In some cases, these oligonucleic acids are synthesized in less than 20, 10, 5, 1, 0.1 cm², or smaller surface area. In some instances, oligonucleic acids are synthesized on a support, e.g. surfaces, such as microarrays, beads, miniwells, channels, or substantially planar devices.
- oligonucleic acids are synthesized using phosphoramidite chemistry.
- the surface of the oligonucleotide synthesis loci of a substrate in some instances is chemically modified to provide a proper site for the linkage of the growing nucleotide chain to the surface.
- Various types of surface modification chemistry exist which allow a nucleotide to attach to the substrate surface.
- the DNA and RNA synthesized according to the methods described herein may be used to express proteins in vivo or in vitro.
- the nucleic acids may be used alone or in combination to express one or more proteins each having one or more protein activities. Such protein activities may be linked together to create a naturally occurring or non-naturally occurring metabolic/enzymatic pathway. Further, proteins with binding activity may be expressed using the nucleic acids synthesized according to the methods described herein. Such binding activity may be used to form scaffolds of varying sizes.
- the methods and systems described herein may comprise and/or are performed using a software program on a computer system. Accordingly, computerized control for the optimization of design algorithms described herein and the synthesis and assembly of nucleic acids are within the bounds of this disclosure. For example, supply of reagents and control of PCR reaction conditions are controlled with a computer. In some instances, a computer system is programmed to search for sticky end motifs in a user specified predetermined nucleic acid sequence, interface these motifs with a list of suitable nicking enzymes, and/or determine one or more assembly algorithms to assemble fragments defined by the sticky end motifs.
- a computer system described herein accepts as an input one or more orders for one or more nucleic acids of predetermined sequence, devises an algorithm(s) for the synthesis and/or assembly of the one or more nucleic acid fragments, provides an output in the form of instructions to a peripheral device(s) for the synthesis and/or assembly of the one or more nucleic acid fragments, and/or instructs for the production of the one or more nucleic acid fragments by the peripheral devices to form the desired nucleic acid of predetermined sequence.
- a computer system operates without human intervention during one or more steps for the production of a target nucleic acid of predetermined sequence or nucleic acid fragment thereof.
- a software system is used to identify sticky end motif sequence for use in a target sequence assembly reaction consistent with the disclosure herein.
- J is about 200. In some cases, J is about 1000.
- J is a number selected from about 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, or more than 1000. In some cases, J is a value in the range from 70-250.
- I/J is the number of fragments to be assembled (x). x − 1 breakpoints are added along the target sequence, reflecting the number of junctions in the target sequence to be assembled. In some cases, junctions are selected at equal intervals or at approximately equal intervals throughout the target sequence.
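- The fragment count and evenly spaced breakpoints can be worked through in a short Python sketch. It assumes that I is the target length in bases and J the nominal fragment length, and that a non-integer I/J is rounded up; the function name `ideal_breakpoints` is hypothetical.

```python
import math

def ideal_breakpoints(target_length, fragment_length):
    """Return (x, breakpoints) for a target of I bases and fragment size J.

    x = ceil(I / J) fragments; the x - 1 breakpoints are spaced at
    approximately equal intervals along the target (0-based positions).
    """
    if target_length <= 0 or fragment_length <= 0:
        raise ValueError("lengths must be positive")
    x = math.ceil(target_length / fragment_length)
    breakpoints = [round(k * target_length / x) for k in range(1, x)]
    return x, breakpoints

print(ideal_breakpoints(3000, 1000))   # (3, [1000, 2000])
print(ideal_breakpoints(1000, 200))    # (5, [200, 400, 600, 800])
```

These positions are only starting points; the actual junction is moved to a nearby site that satisfies the motif requirements.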
- the nearest breakpoint site candidate is identified, for example having ANNNNT (SEQ ID NO.: 2), or GNNNNC (SEQ ID NO.: 17).
- the breakpoint has a 6 base sequence in some cases, while in other cases the junction sequence is 1, 2, 3, 4, or 5 bases, and in other cases the junction is 7, 8, 9, 10, or more than 10 bases.
- the breakpoint site candidate comprises a purine at a first position, a number of bases ranging from 0 to 8 or greater, preferably 1 or greater in some cases, and a pyrimidine at a final position such that the first position purine and the final position pyrimidine are a complementary base pair (either AT or GC).
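- A search for the nearest 6-base candidate can be sketched with a regular expression. The sketch below covers only the ANNNNT and GNNNNC forms; the function name `nearest_site` and the tie-breaking rule (prefer the upstream site) are assumptions for illustration.

```python
import re

# Purine first, pyrimidine last, forming a complementary pair (A..T or G..C).
# The lookahead lets overlapping candidates be reported.
SITE = re.compile(r"(?=(A[ACGT]{4}T|G[ACGT]{4}C))")

def nearest_site(sequence, ideal):
    """Return (start, site) of the candidate closest to `ideal`, or None."""
    hits = [(m.start(), m.group(1)) for m in SITE.finditer(sequence)]
    if not hits:
        return None
    # Closest start position wins; ties go to the upstream candidate.
    return min(hits, key=lambda hit: (abs(hit[0] - ideal), hit[0]))

seq = "CCCCACCCCTCCCCCCCCGTTTTCCCCC"
print(nearest_site(seq, 14))   # (18, 'GTTTTC')
print(nearest_site(seq, 6))    # (4, 'ACCCCT')
```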
- breakpoint selection is continued for sites up to and, in some cases, including each breakpoint or near each breakpoint.
- Site candidates are evaluated so as to reduce the presence of at least one of palindromic sequences, homopolymers, extreme GC content, and extreme AT content. Sites are assessed in light of at least one of these criteria, optionally in combination with or alternatively viewing additional criteria for site candidate evaluation. If a site is determined or calculated to have undesirable qualities, then the next site in a vicinity is subjected to a comparable evaluation.
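- The screening criteria can be expressed as a simple filter. In the Python sketch below, the thresholds (`max_run`, `gc_min`, `gc_max`) and the function names are illustrative assumptions; the disclosure does not fix particular cutoff values.

```python
def reverse_complement(seq):
    """Reverse complement of an A/C/G/T sequence."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))

def longest_run(seq):
    """Length of the longest homopolymer run in a non-empty sequence."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def site_is_acceptable(site, max_run=3, gc_min=0.15, gc_max=0.85):
    """Screen one candidate site; all thresholds are assumed values."""
    if site == reverse_complement(site):
        return False                       # palindromic: can self-anneal
    if longest_run(site) > max_run:
        return False                       # homopolymer stretch
    gc = sum(base in "GC" for base in site) / len(site)
    return gc_min <= gc <= gc_max          # reject extreme GC or AT content

for site in ["ACGCGT", "AAAAAT", "AGCCAT", "ATATGT"]:
    print(site, site_is_acceptable(site))
```

A rejected site would be replaced by the next candidate in the vicinity, which is screened the same way.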
- Site candidates are further evaluated for cross-site similarity, for example excluding sites that share more than L bases in common at common positions or in common sequence. In some cases, L is 2, such that the central NNNN of some selected sticky ends must not share similar bases at similar positions.
- L is 2, such that the central NNNN of some selected sticky ends must not share similar bases in similar patterns. In alternate cases, L is 3, 4, 5, 6, or greater than 6. Site candidates are evaluated individually or in combination, until a satisfactory sticky end system or group of distinct sticky ends is identified for a given assembly reaction. Alternate methods employ at least one of the steps recited above, alone or in combination with additional steps recited above or in combination with at least one step not recited above, or in combination with a plurality of steps recited above and at least one step not recited above.
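- The cross-site comparison can be sketched as follows, comparing the central NNNN of each 6-base site position by position. The greedy selection order and the function names are assumptions for illustration; other selection strategies would satisfy the same criterion.

```python
def shared_positions(a, b):
    """Count positions at which two equal-length cores carry the same base."""
    return sum(x == y for x, y in zip(a, b))

def select_distinct_sites(candidates, max_shared=2):
    """Greedily keep sites whose central NNNN cores share at most
    `max_shared` (L) positions with every site already kept."""
    kept = []
    for site in candidates:
        core = site[1:-1]
        if all(shared_positions(core, k[1:-1]) <= max_shared for k in kept):
            kept.append(site)
    return kept

print(select_distinct_sites(["ACGTAT", "ACGTTT", "GTACAC"]))
# ['ACGTAT', 'GTACAC']
```

Here ACGTTT is dropped because its core CGTT matches the core CGTA of the first site at three positions, which exceeds L = 2.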
- a method described herein may be operably linked to a computer, either remotely or locally.
- a method described herein is performed using a software program on a computer.
- a system described herein comprises a software program for performing and/or analyzing a method or product of a method described herein. Accordingly, computerized control of a process step of any method described herein is envisioned.
- the computer system 700 illustrated in FIG. 7 depicts a logical apparatus that reads instructions from media 711 and/or a network port 705 , which is optionally connected to server 709 having fixed media 712 .
- a computer system such as shown in FIG. 7 , includes a CPU 701 , disk drive 703 , optional input devices such as keyboard 715 and/or mouse 716 and optional monitor 707 .
- Data communication can be achieved through the indicated communication medium to a server at a local or a remote location.
- Communication medium includes any means of transmitting and/or receiving data.
- communication medium is a network connection, a wireless connection, and/or an internet connection. Such a connection can provide for communication over the World Wide Web. It is envisioned that data relating to the present disclosure is transmittable over such networks or connections for reception and/or review by a user 722 , as illustrated in FIG. 7 .
- A block diagram illustrating a first example architecture of a computer system 800 for use in connection with example embodiments of the disclosure is shown in FIG. 8 .
- the example computer system of FIG. 8 includes a processor 802 for processing instructions.
- processors include: Intel Xeon™ processor, AMD Opteron™ processor, Samsung 32-bit RISC ARM 1176JZ(F)-S v1.0™ processor, ARM Cortex-A8 Samsung S5PC100™ processor, ARM Cortex-A8 Apple A4™ processor, Marvell PXA 930™ processor, and a functionally-equivalent processor. Multiple threads of execution can be used for parallel processing. In some instances, multiple processors or processors with multiple cores are used, whether in a single computer system, in a cluster, or distributed across systems over a network comprising a plurality of computers, cell phones, and/or personal data assistant devices.
- a high speed cache 804 is connected to, or incorporated in, the processor 802 to provide a high speed memory for instructions or data that have been recently, or are frequently, used by processor 802 .
- the processor 802 is connected to a north bridge 806 by a processor bus 808 .
- the north bridge 806 is connected to random access memory (RAM) 810 by a memory bus 812 and manages access to the RAM 810 by the processor 802 .
- the north bridge 806 is also connected to a south bridge 814 by a chipset bus 816 .
- the south bridge 814 is, in turn, connected to a peripheral bus 818 .
- the peripheral bus is, for example, PCI, PCI-X, PCI Express, or another peripheral bus.
- the north bridge and south bridge are often referred to as a processor chipset and manage data transfer between the processor, RAM, and peripheral components on the peripheral bus 818 .
- the functionality of the north bridge is incorporated into the processor instead of using a separate north bridge chip.
- system 800 includes an accelerator card 822 attached to the peripheral bus 818 .
- the accelerator may include field programmable gate arrays (FPGAs) or other hardware for accelerating certain processing.
- an accelerator is used for adaptive data restructuring or to evaluate algebraic expressions used in extended set processing.
- System 800 includes an operating system for managing system resources.
- operating systems include: Linux, Windows™, MACOS™, BlackBerry OS™, iOS™, and other functionally-equivalent operating systems, as well as application software running on top of the operating system for managing data storage and optimization in accordance with example embodiments of the present disclosure.
- System 800 includes network interface cards (NICs) 820 and 821 connected to the peripheral bus for providing network interfaces to external storage, such as Network Attached Storage (NAS) and other computer systems that can be used for distributed parallel processing.
- FIG. 9 is a diagram showing a network 900 with a plurality of computer systems 902 a , and 902 b , a plurality of cell phones and personal data assistants 902 c , and Network Attached Storage (NAS) 904 a , and 904 b .
- systems 902 a , 902 b , and 902 c manage data storage and optimize data access for data stored in NAS 904 a and 904 b .
- a mathematical model can be used for the data and be evaluated using distributed parallel processing across computer systems 902 a and 902 b , and cell phone and personal data assistant system 902 c .
- Computer systems 902 a and 902 b , and cell phone and personal data assistant system 902 c can provide parallel processing for adaptive data restructuring of the data stored in NAS 904 a and 904 b .
- FIG. 9 illustrates an example only, and a wide variety of other computer architectures and systems can be used in conjunction with the various embodiments of the present disclosure.
- a blade server can be used to provide parallel processing.
- Processor blades can be connected through a back plane to provide parallel processing.
- Storage can also be connected to the back plane or as NAS through a separate network interface.
- processors maintain separate memory spaces and transmit data through network interfaces, back plane or other connectors for parallel processing by other processors. In some instances, some or all of the processors use a shared virtual address memory space.
- FIG. 10 is a block diagram of a multiprocessor computer system 1000 using a shared virtual address memory space in accordance with an example embodiment.
- the system includes a plurality of processors 1002 a - f that can access a shared memory subsystem 1004 .
- the system incorporates a plurality of programmable hardware memory algorithm processors (MAPs) 1006 a - f in the memory subsystem 1004 .
- Each MAP 1006 a - f can comprise a memory 1008 a - f and one or more field programmable gate arrays (FPGAs) 1010 a - f .
- the MAP provides a configurable functional unit and particular algorithms or portions of algorithms can be provided to the FPGAs 1010 a - f for processing in close coordination with a respective processor.
- the MAPs are used to evaluate algebraic expressions regarding a data model and to perform adaptive data restructuring in example embodiments.
- each MAP is globally accessible by all of the processors for these purposes.
- each MAP uses Direct Memory Access (DMA) to access an associated memory 1008 a - f , allowing it to execute tasks independently of, and asynchronously from, the respective microprocessor 1002 a - f .
- a MAP can feed results directly to another MAP for pipelining and parallel execution of algorithms.
- the above computer architectures and systems are examples only, and a wide variety of other computer, cell phone, and personal data assistant architectures and systems can be used in connection with example embodiments, including systems using any combination of general processors, co-processors, FPGAs and other programmable logic devices, system on chips (SOCs), application specific integrated circuits (ASICs), and other processing and logic elements.
- all or part of the computer system can be implemented in software or hardware.
- Any variety of data storage media can be used in connection with example embodiments, including random access memory, hard drives, flash memory, tape drives, disk arrays, Network Attached Storage (NAS) and other local or distributed data storage devices and systems.
- a gene of about 1 kB (the “1 kB Gene Construct”) was selected to perform restriction enzyme-free ligation with a vector:
- the 1 kB Gene Construct which is an assembled gene fragment with heterogeneous sequence populations, was purchased as a single gBlock (Integrated DNA Technologies).
- the 1 kB Gene Construct was amplified in a PCR reaction with uracil-containing primers.
- the PCR reaction components were prepared according to Table 2.
- the 1 kB Gene Construct was amplified with the uracil-containing primers in a PCR reaction performed using the thermal cycling conditions described in Table 3.
| Step | Cycle |
| --- | --- |
| 1 | 1 cycle: 98° C., 30 sec |
| 2 | 20 cycles: 98° C., 10 sec; 68° C., 15 sec; 72° C., 60 sec |
| 3 | 1 cycle: 72° C., 5 min |
| 4 | Hold: 4° C. |
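- For planning purposes, the programmed duration of these cycling conditions can be totaled as in the sketch below. The data layout and the function name `programmed_seconds` are assumptions for illustration; ramp times between temperatures and the open-ended 4° C. hold are not counted.

```python
# (cycles, [(temperature_C, seconds), ...]) for steps 1-3 of the program above.
PROGRAM = [
    (1, [(98, 30)]),
    (20, [(98, 10), (68, 15), (72, 60)]),
    (1, [(72, 300)]),
]

def programmed_seconds(program):
    """Sum the programmed dwell times, ignoring ramping between temperatures."""
    return sum(cycles * sum(seconds for _, seconds in segments)
               for cycles, segments in program)

total = programmed_seconds(PROGRAM)
print(total)        # 2030
print(total / 60)   # about 33.8 minutes
```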
- the uracil-containing PCR products were purified using Qiagen MinElute column, eluted in 10 μL EB buffer, analyzed by electrophoresis (BioAnalyzer), and quantified on a NanoDrop to be 93 ng/μL.
- the uracil-containing PCR products of the 1 kB Gene Construct were incubated with a mixture of Uracil DNA glycosylase (UDG) and Endonuclease VIII to generate sticky ends. The incubation occurred at 37° C. for 30 min in a reaction mixture as described in Table 4.
- Two synthetic oligonucleotides having 3′ overhangs when annealed together (“Artificial Vector”) were hybridized and ligated to the digested uracil-containing 1 kB Gene Construct (“Sticky-end Construct”).
- the first oligo (“Upper Oligo”, SEQ ID NO.: 51) contains a 5′ phosphate for ligation.
- the second oligonucleotide (“Lower Oligo”, SEQ ID NO.: 52) lacks a base on the 5′ end such that it leaves a nucleotide gap after hybridizing to the Sticky-end Construct with the Upper Oligo. Further, the Lower Oligo lacks a 5′ phosphate to ensure that no ligation occurs at this juncture.
- the first six phosphate bonds on the Lower Oligo are phosphorothioated to prevent exonuclease digestion from the gap.
- Oligonucleic acid sequences of the Artificial Vector are shown in Table 5. An asterisk denotes a phosphorothioate bond.
- the Sticky-end Construct was mixed with Upper Oligo and Lower Oligo (5 μM each) in 1× CutSmart buffer (NEB). The mixture was heated to 95° C. for 5 min, and then slowly cooled to anneal. The annealed product comprised a circularized gene construct comprising the 1 kB Gene Construct. This construct was generated without the remnants of any restriction enzyme cleavage sites and thus lacked any associated enzymatic “scars.”
- a LacZ gene was assembled into a 5 kb plasmid from three precursor LacZ fragments and 1 precursor plasmid fragment. Assembly was performed using 9 different reaction conditions.
- a 5 kb plasmid was amplified with two different sets of primers for introducing a sticky end motif comprising a non-canonical base (SEQ ID NO.: 53): set A (SEQ ID NOs.: 54 and 55) and set B (SEQ ID NOs.: 56 and 57), shown in Table 6, to produce plasmid precursor fragments A and B, respectively.
| Sequence identity | Primer name | Sequence |
| --- | --- | --- |
| SEQ ID NO.: 54 | plasmid-Fa | TGATCGGCAATGATATG/ideoxyU/CTGGAAAGAACATGTG |
| SEQ ID NO.: 55 | plasmid-Ra | TGATCGGCAATGATGGC/ideoxyU/TATAATGCGACAAACAACAG |
| SEQ ID NO.: 56 | plasmid-Fb | TGATCGGCAATGATATG/ideoxyU/CGCTGGAAAGAACATG |
| SEQ ID NO.: 57 | plasmid-Rb | TGATCGGCAATGATGGC/ideoxyU/CGTATAATGCGACAAACAAC |
- Each primer set comprises, in 5′ to 3′ order: 6 adaptor bases (TGATCG, SEQ ID NO.: 58), a first nicking enzyme recognition site (GCAATG, SEQ ID NO.: 59), a sticky end motif comprising a non-canonical base (ANNNNU, SEQ ID NO.: 53), and plasmid sequence.
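- The primer layout can be checked by concatenating its parts. The Python sketch below rebuilds the set A forward primer (SEQ ID NO.: 54), writing the internal deoxyuridine as `U`; the function name `build_primer` and its validation rule are assumptions for illustration.

```python
ADAPTOR = "TGATCG"         # adaptor bases, SEQ ID NO.: 58
NICKING_SITE = "GCAATG"    # nicking enzyme recognition site, SEQ ID NO.: 59

def build_primer(sticky_motif, template_seq):
    """Concatenate primer parts 5'->3'; the motif must have the form ANNNNU."""
    if (len(sticky_motif) != 6 or sticky_motif[0] != "A"
            or sticky_motif[-1] != "U"):
        raise ValueError("sticky end motif must have the form ANNNNU")
    return ADAPTOR + NICKING_SITE + sticky_motif + template_seq

primer = build_primer("ATATGU", "CTGGAAAGAACATGTG")
print(primer)   # TGATCGGCAATGATATGUCTGGAAAGAACATGTG
```

Passing the plasmid sequence `CGCTGGAAAGAACATG` instead reproduces the set B forward primer, which differs only by the leading CG.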
- the first two bases of the plasmid sequence in the forward and reverse primers of set B are a CG. These two bases are absent from the forward and reverse primers of set A.
- Two plasmid fragments, plasmid A and plasmid B, were amplified using primer set A and primer set B, respectively.
- the composition of the amplification reaction is shown in Table 7.
- the amplification reaction conditions are shown in Table 8.
| PCR component | Quantity | Concentration in mixture |
| --- | --- | --- |
| Phusion U (2 U/μL) | 1 | 1 U/50 μL |
| 5x Phusion HF buffer | 20 | 1x |
| 10 mM dNTP | 4 | 400 μM |
| Plasmid template (50 pg/μL) | 4 | 100 pg/50 μL |
| plasmid-Fa or plasmid-Fb (200 μM) | 0.25 | 0.5 μM |
| plasmid-Ra or plasmid-Rb (200 μM) | 0.25 | 0.5 μM |
| Water | 70.5 | |

| Step | Cycle |
| --- | --- |
| 1 | 1 cycle: 98° C., 30 sec |
| 2 | 30 cycles: 98° C., 10 sec; 49° C., 15 sec; 72° C., 90 sec |
| 3 | 1 cycle: 72° C., 5 min |
| 4 | Hold: 4° C., 15-30 sec per kb |
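- The listed concentrations follow from the dilution relation C_final = C_stock × V_added / V_total. The sketch below checks two rows of the reaction composition above, assuming the quantities are volumes in μL (the table does not state the unit); the function name `final_concentration` is hypothetical.

```python
def final_concentration(stock, volume_added, total_volume):
    """C_final = C_stock * V_added / V_total (consistent units assumed)."""
    return stock * volume_added / total_volume

# Quantities from the reaction composition, taken to be microliters.
volumes = [1, 20, 4, 4, 0.25, 0.25, 70.5]
total = sum(volumes)
print(total)                                    # 100.0

print(final_concentration(200, 0.25, total))    # primer, in uM: 0.5
print(final_concentration(10_000, 4, total))    # dNTP, in uM: 400.0
```

Under this assumption the volumes sum to 100 μL, and the computed primer and dNTP concentrations agree with the listed 0.5 μM and 400 μM.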
- the precursor plasmid fragment was treated with DpnI, denatured and purified.
- LacZ sequence was analyzed to identify two sticky end motifs which partition the sequence into three fragments of roughly 1 kb each: LacZ fragments 1-3. Sequence identities of the two sticky end motifs and the LacZ fragments are shown in Table 9.
- SEQ ID NO.: 60 shows the complete LacZ gene, wherein motifs are italicized, fragment 1 is underlined with a single line, fragment 2 is underlined with a squiggly line, and fragment 3 is underlined with a double line.
| Sequence identity | Sequence name | Sequence |
| --- | --- | --- |
| SEQ ID NO.: 61 | fragment 1 | ATGACCATGATTACGGATTCACTGGCCGTCGTTTTACAACG TCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCC TTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAA GAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCT GAATGGCGAATGGCGCTTTGCCTGGTTTCCGGCACCAGAAG CGGTGCCGGAAAGCTGGCTGGAGTGCGATCTTCCTGAGGCC GATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTA CGATGCGCCCATCTACACCAACGTGACCTATCCCATTACGG TCAATCCGCCGTTTGTTCCCACGGAGAATCCGACGGGTTGT TACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGA AGGCCAGACGCGAATTATTTTTGATGGCGTTAACTCG |

- LacZ fragments 1-3 were assembled from smaller, synthesized oligonucleic acids. During fragment preparation, the 5′ and/or 3′ end of each fragment was appended with a connecting adaptor to generate adaptor-modified fragments 1-3. To prepare LacZ for assembly with the precursor plasmid fragments, the 5′ end of fragment 1 and the 3′ end of fragment 3 were appended with a first outer adaptor comprising outer adaptor motif 1 (AGCCAT, SEQ ID NO.: 66) and a second outer adaptor comprising outer adaptor motif 2 (TTATGT, SEQ ID NO.: 67), respectively. The sequences of modified fragments 1-3 are shown in Table 10.
- Each modified fragment comprises a first adaptor sequence (GTATGCTGACTGCT, SEQ ID NO.: 68) at the first end and second adaptor sequence (TTGCCCTACGGTCT, SEQ ID NO.: 69) at the second end, indicated by a dashed underline.
- Each modified fragment comprises a nicking enzyme recognition site (GCAATG, SEQ ID NO.: 59), indicated by a dotted underline.
- Each modified fragment comprises an ANNNNT motif (SEQ ID NO.: 2), indicated by italics.
| Sequence identity | Sequence name | Sequence |
| --- | --- | --- |
| SEQ ID NO.: 70 | modified fragment 1 | AAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATC CCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCG ATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGC GCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCT GGCTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGTCGTCC CCTCAAACTGGCAGATGCACGGTTACGATGCGCCCATCTACA CCAACGTGACCTATCCCATTACGGTCAATCCGCCGTTTGTTC CCACGGAGAATCCGACGGGTTGTTACTCGCTCACATTTAATG TTGATGAAAGCTGGCTACAGGAAGGCCAGACGCGAATTATTT TTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGC GCTGGGTCGGTTACGGTTACGGTTACGGTTTCATCTGTGGTGCAACGGGC GCTGG |
- each modified fragment was amplified using the universal primers shown in Table 11.
- An asterisk indicates a phosphorothioated bond.
- Each primer set comprises, in 5′ to 3′ order: adaptor sequence, a first nicking enzyme recognition site (GCAATG, SEQ ID NO.: 59), and a sticky end motif comprising a non-canonical base (ANNNNU, SEQ ID NO.: 53).
- Modified fragments 1-3 were amplified using their corresponding primers modfrag1F/modfrag1R, modfrag2F/modfrag2R and modfrag3F/modfrag3R, respectively.
- the composition of the amplification reaction is shown in Table 12.
- the amplification reaction conditions are shown in Table 13.
| PCR component | Quantity | Concentration in mixture |
| --- | --- | --- |
| Phusion U (2 U/μL) | 1 | 1 U/50 μL |
| 5x Phusion HF buffer | 20 | 1x |
| 10 mM dNTP | 2 | 200 μM |
| Plasmid template (50 pg/μL) | 2 | 100 pg/100 μL |
| Forward primer (200 μM) | 0.25 | 0.5 μM |
| Reverse primer (200 μM) | 0.25 | 0.5 μM |
| Water | 70.5 | |
- LacZ precursor fragments were annealed and ligated with the plasmid fragment according to reactions 1 and 2 under conditions A-I shown in Table 14.
- the nicking enzyme Nb.BsrDI was used to generate a nick adjacent to the nicking recognition site (GCAATG, SEQ ID NO.: 59) on one strand during reaction 1.
- USER UDG and endonuclease VIII
- Reaction 2 comprised three steps: cleavage of uracil, ligation, and enzymatic inactivation.
- Assembled fragments comprise LacZ inserted into the 5 kb plasmid.
- FIG. 11 shows an image of a gel electrophoresis of LacZ amplified inserts generated from assembly conditions A-I.
| Condition | Fragments | Reaction 1 | Reaction 2 | |
| --- | --- | --- | --- | --- |
| | LacZ precursor fragments 1-3; plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 37° C. for 30 min, 16° C. for 60 min, and 80° C. for 20 min | 9/10 |
| C | LacZ precursor fragments 1-3; plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 37° C. for 30 min, 16° C. for 60 min, and 80° C. | 8/10 |
- An enzyme of interest having an activity to be improved is selected. Specific amino acid residues relevant to enzyme activity and stability are identified. The nucleic acid sequence encoding the enzyme is obtained. Bases corresponding to the specific amino acid residues are identified, and the nucleic acid is partitioned into fragments such that each fragment spans a single base position corresponding to a specific amino acid residue.
- Target nucleic acid fragments are synthesized such that identified bases corresponding to the specific amino acid residues are indeterminate.
- Target nucleic acid fragments are amplified using a uridine primer and treated with a sequence adjacent nick enzyme and a uridine-specific nick enzyme. Cleaved end sequence is removed and target nucleic acid fragments are assembled to generate a target nucleic acid library. Aliquots of the library are sequenced to confirm success of the assembly, and aliquoted molecules of the library are individually cloned and transformed into a host cell for expression. Expressed enzymes are isolated and assayed for activity and stability.
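- The size of such a library grows as 4 raised to the number of indeterminate positions. The Python sketch below enumerates the variants, assuming an indeterminate base is written `N` and may be any of A, C, G, or T, and using plain concatenation as a stand-in for sticky-end assembly; the function names are hypothetical.

```python
from itertools import product

def expand_indeterminate(fragment):
    """Yield every sequence obtained by replacing each N with A, C, G or T."""
    slots = [("ACGT" if base == "N" else base) for base in fragment]
    for combo in product(*slots):
        yield "".join(combo)

def library(fragments):
    """Join one variant of each fragment, in order, for all combinations."""
    variant_sets = [list(expand_indeterminate(f)) for f in fragments]
    return ["".join(parts) for parts in product(*variant_sets)]

members = library(["ANG", "CN"])
print(len(members))   # 16
print(members[0])     # AAGCA
```

Two fragments with one indeterminate position each give 4 × 4 = 16 library members, including combinations of mutations that a one-at-a-time screen would not sample together.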
- Enzymes having increased stability due to single point mutations are identified. Enzymes having increased activity due to single point mutations are identified. Also identified are enzymes having increased stability and/or activity due to combinations of point mutations, each of which individually is detrimental to enzyme activity or stability, and which would be unlikely to be pursued by more traditional, ‘one mutation at a time’ approaches.
- a 3 kb double-stranded target gene of predetermined sequence is prepared using a de novo synthesis and assembly method described herein.
- the predetermined gene sequence is first analyzed to identify fragments which will be synthesized and assembled into the final gene product.
- the target nucleic acid sequence is analyzed to identify sticky end motifs having an ANNNNT sequence (SEQ ID NO.: 2). Two of the identified motifs are selected according to their position in the sequence, so that the first identified motif is located at roughly 1 kb and the second identified motif is located at roughly 2 kb. The two selected motifs thus partition the target sequence into three, approximately 1 kb precursor fragments, denoted fragments 1, 2 and 3.
- Fragments 1, 2 and 3 are prepared by de novo synthesis and PCA assembly of oligonucleic acids. During this process, outer adaptor sequences are added to the 5′ end of fragment 1 and the 3′ end of fragment 3, and connecting adaptor sequences are added to the 3′ end of fragment 1, the 5′ and 3′ ends of fragment 2, and the 5′ end of fragment 3.
- the connecting adaptor sequences located at the 3′ end of fragment 1 and the 5′ end of fragment 2 comprise the sequence of the first identified ANNNNT motif (SEQ ID NO.: 2).
- the connecting adaptor sequences located at the 3′ end of fragment 2 and the 5′ end of fragment 3 comprise the sequence of the second identified ANNNNT motif (SEQ ID NO.: 2).
- Each connecting adaptor comprises, in order: a sequence of 1-10 bases (adaptor bases), a first nicking enzyme recognition site comprising a first nicking enzyme cleavage site on one strand, and a sticky end motif.
- the adaptor bases and first nicking enzyme cleavage site comprise the same bases for each connecting adaptor.
- Fragment 1 prepared with adaptor sequence comprises, in 5′ to 3′ order: a first outer adaptor sequence; fragment 1 sequence; and a first connecting adaptor sequence comprising, in 5′ to 3′ order, the first ANNNNT motif (SEQ ID NO.: 2), the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on a first strand, and the sequence of adaptor bases.
- Fragment 2 prepared with adaptor sequence comprises, in 5′ to 3′ order: the first connecting adaptor sequence comprising, in 5′ to 3′ order, the sequence of adaptor bases, the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on a second strand, and the first ANNNNT motif (SEQ ID NO.: 2); fragment 2 sequence; and a second connecting adaptor sequence comprising, in 5′ to 3′ order, the second ANNNNT motif (SEQ ID NO.: 2), the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on a first strand, and the sequence of adaptor bases.
- Fragment 3 prepared with adaptor sequence comprises, in 5′ to 3′ order: the second connecting adaptor sequence comprising, in 5′ to 3′ order, the sequence of adaptor bases, the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on a second strand, and the second ANNNNT motif (SEQ ID NO.: 2); fragment 3 sequence; and a second outer adaptor sequence.
- Each of the prepared fragments is amplified to incorporate a second nicking enzyme cleavage site on a single strand of each fragment such that the second nicking enzyme cleavage site is located from 1 to 10 bases away from the first nicking enzyme cleavage site of each fragment and on a different strand from the first nicking enzyme cleavage site.
- the second nicking enzyme cleavage site comprises a non-canonical base.
- the non-canonical base is added to each fragment during PCR via a primer comprising the sequence of adaptor bases, the first nicking enzyme recognition site, a sticky end motif ANNNNT (SEQ ID NO.: 2), and the non-canonical base.
- Fragment 1 comprises, in 5′ to 3′ order: the first outer adaptor sequence, fragment 1 sequence, the non-canonical base on the second strand, the first ANNNNT motif (SEQ ID NO.: 2), the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on the first strand, and the sequence of adaptor bases.
- Fragment 2 comprises, in 5′ to 3′ order: the sequence of adaptor bases, the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on the second strand, the first ANNNNT motif (SEQ ID NO.: 2), the non-canonical base on the first strand, fragment 2 sequence, the non-canonical base on the second strand, the second ANNNNT motif (SEQ ID NO.: 2), the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on the first strand, and the sequence of adaptor bases.
- Fragment 3 comprises, in 5′ to 3′ order: the sequence of adaptor bases, the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on the second strand, the second ANNNNT motif (SEQ ID NO.: 2), the non-canonical base on the first strand, fragment 3 sequence, and a second outer adaptor sequence.
- Each of the three fragments comprising two nicking enzyme cleavage sites is treated with a first nicking enzyme and a second nicking enzyme.
- the first nicking enzyme creates a nick at the first nicking enzyme cleavage site by cleaving a single-strand of the fragment.
- the second nicking enzyme creates a nick by removing the non-canonical base from the fragment.
- the enzyme-treated fragments have an overhang comprising a sticky end motif ANNNNT (SEQ ID NO.: 2).
- Enzyme-treated fragment 1 comprises, in 5′ to 3′ order: the first outer adaptor, fragment 1 sequence, and on the first strand, the first sticky end motif ANNNNT (SEQ ID NO.: 2).
- Enzyme-treated fragment 2 comprises, in 5′ to 3′ order: on the second strand, the first sticky end motif ANNNNT (SEQ ID NO.: 2); fragment 2 sequence; and on the first strand, the second sticky end motif ANNNNT (SEQ ID NO.: 2).
- Enzyme-treated fragment 3 comprises, in 5′ to 3′ order: on the second strand, the second sticky end motif ANNNNT (SEQ ID NO.: 2); fragment 3 sequence; and the second outer adaptor.
- the first sticky ends of fragments 1 and 2 are annealed and the second sticky ends of fragments 2 and 3 are annealed, generating a gene comprising, in 5′ to 3′ order: the first outer adaptor, fragment 1 sequence, the first sticky end motif, fragment 2 sequence, the second sticky end motif, fragment 3 sequence, and the second outer adaptor.
- the assembled product comprises the predetermined sequence of the target gene without any scar sites.
- the assembled product is amplified using primers to the outer adaptors to generate desired quantities of the target gene.
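- The partitioning step at the start of this example can be sketched in a few lines of Python. This is a minimal illustration, not the method's actual software: the function names `find_motifs`, `partition`, and `reassemble` are hypothetical, and the sketch assumes the target is a plain A/C/G/T string and that the motif nearest each target coordinate is an acceptable choice.

```python
import re

# Lookahead so that overlapping ANNNNT candidates are all reported.
MOTIF = re.compile(r"(?=(A[ACGT]{4}T))")

def find_motifs(seq):
    """Return (start, motif) for every ANNNNT hexamer in seq."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq.upper())]

def partition(seq, targets):
    """Split seq at the motif nearest each target coordinate.

    Returns the chosen motifs and the fragment sequences between them,
    so that seq == frag1 + motif1 + frag2 + motif2 + frag3 ...
    """
    seq = seq.upper()
    motifs = find_motifs(seq)
    if not motifs:
        raise ValueError("no ANNNNT motif found")
    chosen = sorted({min(motifs, key=lambda hit: abs(hit[0] - t)) for t in targets})
    fragments, start = [], 0
    for pos, motif in chosen:
        if pos < start:
            raise ValueError("selected motifs overlap")
        fragments.append(seq[start:pos])
        start = pos + len(motif)
    fragments.append(seq[start:])
    return chosen, fragments

def reassemble(chosen, fragments):
    """Rejoin fragments through their shared motifs; no extra bases are added."""
    out = fragments[0]
    for (_, motif), frag in zip(chosen, fragments[1:]):
        out += motif + frag
    return out
```

- For a 3 kb target, a call such as `partition(gene, targets=[1000, 2000])` would yield two motifs and three fragments of roughly 1 kb each, and `reassemble` returning the original string reflects the absence of scar sites.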
- Example 5 Generation of Precursor Nucleic Acid Fragments Using Uracil as a Non-Canonical Base
- a double-stranded target gene of predetermined sequence is prepared using a de novo synthesis and assembly method described herein.
- the predetermined gene sequence is first analyzed to identify fragments which will be synthesized and assembled into the final gene product.
- the target nucleic acid sequence is analyzed to identify sticky end motifs. Three of the identified motifs are selected according to their position in the sequence, so that the motifs partition the predetermined sequence into four fragments having roughly similar sequence lengths.
- the sticky end motifs are designated sticky end motif x, sticky end motif y, and sticky end motif z.
- the precursor fragments are designated fragment 1, fragment 2, fragment 3, and fragment 4. Accordingly, the predetermined sequence comprises, in order: fragment 1 sequence, sticky end motif x, fragment 2 sequence, sticky end motif y, fragment 3 sequence, sticky end motif z, and fragment 4 sequence.
- Fragments 1-4 are prepared by de novo synthesis and PCA assembly of oligonucleic acids. During this process connecting adaptor sequences are added to the 3′ end of fragment 1, the 5′ and 3′ ends of fragments 2 and 3, and the 5′ end of fragment 4.
- the connecting adaptor sequences located at the 3′ end of fragment 1 and the 5′ end of fragment 2 comprise sticky end motif x.
- the connecting adaptor sequences located at the 3′ end of fragment 2 and the 5′ end of fragment 3 comprise sticky end motif y.
- the connecting adaptor sequences located at the 3′ end of fragment 3 and the 5′ end of fragment 4 comprise sticky end motif z.
- Each connecting adaptor comprises, in order: a sequence of 1-10 bases (adaptor bases), a first nicking enzyme recognition site comprising a first nicking enzyme cleavage site on a first strand, and a sticky end motif comprising a second nicking enzyme cleavage site on the 3′ base of the second strand.
- the second nicking enzyme cleavage site comprises the non-canonical base uracil.
- the connecting adaptor sequences are positioned at the 5′ and/or 3′ end of a fragment such that the 3′ uracil of the connecting adaptor is positioned directly next to the 5′ and/or 3′ end of the fragment.
- the adaptor bases and first nicking enzyme cleavage site comprise the same bases for each connecting adaptor.
- Precursor fragment 1 comprises fragment 1 sequence and a first connecting adaptor comprising sticky end motif x.
- Precursor fragment 2 comprises the first connecting adaptor comprising sticky end motif x, fragment 2 sequence, and a second connecting adaptor comprising sticky end motif y.
- Precursor fragment 3 comprises the second connecting adaptor comprising sticky end motif y, fragment 3 sequence, and a third connecting adaptor comprising sticky end motif z.
- Precursor fragment 4 comprises the third connecting adaptor comprising sticky end motif z and fragment 4 sequence.
- Each of the four precursor fragments comprises one or two connecting adaptors, each connecting adaptor comprising: a first nicking enzyme recognition site comprising a first nicking enzyme cleavage site on a first strand, and a uracil base on the second strand.
- the precursor fragments are treated with a first nicking enzyme which recognizes the first nicking enzyme recognition site to generate a nick at the first nicking enzyme cleavage site.
- the precursor fragments are treated with a second nicking enzyme, USER, which excises the uracil from the second strand, generating a nick where the uracil used to reside.
- USER comprises Uracil DNA glycosylase (UDG) and DNA glycosylase-lyase Endonuclease VIII (EndoVIII). Each precursor fragment now comprises an overhang consisting of a sticky end motif.
- Precursor fragment 1 now comprises fragment 1 sequence and a 5′ overhang consisting of sequence motif x.
- Precursor fragment 2 now comprises a 3′ overhang consisting of sequence motif x, fragment 2 sequence, and a 5′ overhang consisting of sequence motif y.
- Precursor fragment 3 now comprises a 3′ overhang consisting of sequence motif y, fragment 3 sequence, and a 5′ overhang consisting of sequence motif z.
- Precursor fragment 4 now comprises a 3′ overhang consisting of sequence motif z and fragment 4 sequence.
- the sticky end motif x overhangs of precursor fragments 1 and 2 are annealed, the sticky end motif y overhangs of precursor fragments 2 and 3 are annealed, and the sticky end motif z overhangs of precursor fragments 3 and 4 are annealed, generating a gene comprising, in 5′ to 3′ order: fragment 1 sequence, sticky end motif x, fragment 2 sequence, sticky end motif y, fragment 3 sequence, sticky end motif z and fragment 4 sequence.
- the product to be assembled comprises the predetermined sequence of the target gene without any scar sites.
- the assembled product is optionally amplified to generate desired quantities of the target gene.
- precursor fragments are generated at sufficient quantities such that amplification of the final gene is unnecessary. Such instances allow for the generation of large genes which are unable to be amplified using traditional amplification methods.
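- The annealing step of this example, in which each overhang finds its single partner, can be modeled as a chain lookup. The sketch below is illustrative only: `anneal` is a hypothetical helper, each processed fragment is represented as a `(left_motif, body, right_motif)` tuple with `None` for an end that has no overhang, and strand orientation is deliberately ignored.

```python
def anneal(fragments):
    """Order processed fragments by matching sticky end motifs.

    Each fragment is (left_motif, body, right_motif); None marks an end
    with no overhang. Returns the joined sequence, with each shared motif
    written once.
    """
    starts = [f for f in fragments if f[0] is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment without a left overhang")
    by_left = {}
    for left, body, right in fragments:
        if left is None:
            continue
        if left in by_left:
            raise ValueError(f"motif {left} is not unique")
        by_left[left] = (body, right)
    _, product, right = starts[0]
    while right is not None:
        if right not in by_left:
            raise ValueError(f"no partner for overhang {right}")
        body, next_right = by_left.pop(right)
        product += right + body
        right = next_right
    if by_left:
        raise ValueError("unused fragments remain")
    return product
```

- Because every motif is required to be unique, the order of fragments 1 to 4 is determined by the overhangs alone, regardless of the order in which the fragments are supplied.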
- a population of precursor nucleic acid fragments is amplified using a set of universal primer pairs, wherein each universal primer introduces a non-canonical base uracil to a single strand of a precursor nucleic acid.
- a predetermined sequence of a target gene is analyzed to select sticky end motifs that partition the gene into precursor fragments of desired size.
- the sticky end motifs have the sequence ANNNNT (SEQ ID NO.: 2), where each selected sticky end motif has a different NNNN sequence.
- the NNNN sequence for each selected sticky end motif is noted.
- Universal forward primers are synthesized to comprise, in 5′ to 3′ order: 1-20 forward adaptor bases, a nicking enzyme recognition site, and a sticky end motif comprising ANNNNU (SEQ ID NO.: 53).
- a subpopulation of forward primers is generated so that each subpopulation comprises a NNNN sequence of a different sticky end motif selected from the target gene.
- Universal reverse primers are synthesized to comprise, in 5′ to 3′ order: 1-20 reverse adaptor bases, a nicking enzyme recognition site, and a sticky end motif comprising ANNNNU (SEQ ID NO.: 53).
- a subpopulation of reverse primers is generated so that each subpopulation comprises the reverse complement of a NNNN sequence of a different sticky end motif selected from the target gene.
- the nicking enzyme recognition site sequence in the universal primers is designed such that when the universal primers are incorporated into precursor fragments during an amplification reaction, the reverse complement sequence of the nicking enzyme recognition site sequence in the universal primer comprises a nicking enzyme cleavage site. Accordingly, upon treating with a nicking enzyme specific for the nicking enzyme cleavage site, a nick is generated on a strand of the fragment not comprising the uracil base.
- Precursor fragments partitioned by the selected sticky end motifs are assembled from smaller, synthesized nucleic acids.
- the precursor fragments are amplified using the set of universal primers comprising the sticky end motif ANNNNT (SEQ ID NO.: 2), wherein the T is replaced with the non-canonical base uracil.
- the precursor fragments each comprise a nicking enzyme recognition site comprising a nicking enzyme cleavage site on one strand and a uracil base on the other strand.
- Precursor fragments amplified with universal primers are treated with a first nicking enzyme to create a nick at the nicking enzyme cleavage site and a second nicking enzyme comprising UDG and Endonuclease VIII activity to generate a nick at the uracil base site.
- the precursor fragments comprise overhangs with the sticky end motif ANNNNT (SEQ ID NO.: 2).
- Fragments comprising complementary overhangs are annealed to generate the target gene.
- the target gene comprises the predetermined sequence, with no extraneous scar sites.
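- The universal primer layout described above (adaptor bases, nicking enzyme recognition site, then ANNNNU) lends itself to a short generator. The following Python sketch is an assumption-laden illustration: `universal_primers` and `revcomp` are hypothetical names, and the adaptor and recognition site strings must be supplied by the user for whichever nicking enzyme is actually chosen.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]

def universal_primers(nnnn_list, fwd_adaptor, rev_adaptor, nick_site):
    """Build one forward and one reverse primer per selected NNNN.

    Forward: adaptor + recognition site + A + NNNN + U.
    Reverse: adaptor + recognition site + A + revcomp(NNNN) + U.
    """
    if len(set(nnnn_list)) != len(nnnn_list):
        raise ValueError("each sticky end motif needs a different NNNN")
    for adaptor in (fwd_adaptor, rev_adaptor):
        if not 1 <= len(adaptor) <= 20:
            raise ValueError("adaptor must be 1-20 bases")
    primers = {}
    for nnnn in nnnn_list:
        if len(nnnn) != 4 or set(nnnn) - set("ACGT"):
            raise ValueError(f"{nnnn!r} is not four A/C/G/T bases")
        primers[nnnn] = {
            "forward": fwd_adaptor + nick_site + "A" + nnnn + "U",
            "reverse": rev_adaptor + nick_site + "A" + revcomp(nnnn) + "U",
        }
    return primers
```

- Each entry of the returned dictionary corresponds to one subpopulation of forward and reverse primers for one sticky end motif selected from the target gene.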
- a double-stranded target gene of predetermined sequence is prepared using a de novo synthesis and assembly method described herein.
- the predetermined gene sequence is first analyzed to identify fragments which will be synthesized and assembled into the final gene product.
- the target nucleic acid sequence is analyzed to identify sticky end motifs having a Type II restriction endonuclease recognition sequence. Three of the identified motifs are selected according to their position in the sequence, so that the motifs partition the predetermined sequence into four fragments having roughly similar sequence lengths of about 200 kb.
- the sticky end motifs are designated sticky end motif x, sticky end motif y, and sticky end motif z.
- the precursor fragments are designated fragment 1, fragment 2, fragment 3, and fragment 4. Accordingly, the predetermined sequence comprises, in order: fragment 1 sequence, sticky end motif x, fragment 2 sequence, sticky end motif y, fragment 3 sequence, sticky end motif z, and fragment 4 sequence.
- Precursor fragments 1-4 are prepared by de novo synthesis and PCA assembly of oligonucleic acids. During this process connecting adaptor sequences are added to the 3′ end of fragment 1, the 5′ and 3′ ends of fragments 2 and 3, and the 5′ end of fragment 4.
- the connecting adaptor sequences located at the 3′ end of fragment 1 and the 5′ end of fragment 2 comprise sticky end motif x.
- the connecting adaptor sequences located at the 3′ end of fragment 2 and the 5′ end of fragment 3 comprise sticky end motif y.
- the connecting adaptor sequences located at the 3′ end of fragment 3 and the 5′ end of fragment 4 comprise sticky end motif z.
- Each connecting adaptor comprises a sequence of 1-10 adaptor bases and sticky end motif comprising a Type II restriction endonuclease recognition sequence. Also during preparation of precursor fragments 1-4, outer adaptors comprising 1-10 adaptor bases are added to the 5′ and 3′ ends of fragments 1 and 4, respectively.
- the adaptor bases comprise the same bases for each connecting adaptor and outer adaptor.
- Precursor fragment 1 comprises outer adaptor sequence 1, fragment 1 sequence and a first connecting adaptor comprising sticky end motif x.
- Precursor fragment 2 comprises the first connecting adaptor comprising sticky end motif x, fragment 2 sequence, and a second connecting adaptor comprising sticky end motif y.
- Precursor fragment 3 comprises the second connecting adaptor comprising sticky end motif y, fragment 3 sequence, and a third connecting adaptor comprising sticky end motif z.
- Precursor fragment 4 comprises the third connecting adaptor comprising sticky end motif z, fragment 4 sequence, and outer adaptor sequence 2.
- Each of the four precursor fragments comprises one or two connecting adaptors, each connecting adaptor having a sticky end motif comprising a Type II restriction endonuclease recognition sequence.
- the precursor fragments are treated with three Type II restriction enzymes, each enzyme specific for a Type II recognition sequence in sticky end motifs x-z, to generate four precursor fragments with sticky ends.
- the sticky end motif x overhangs of precursor fragments 1 and 2 are annealed, the sticky end motif y overhangs of precursor fragments 2 and 3 are annealed, and the sticky end motif z overhangs of precursor fragments 3 and 4 are annealed, generating a gene comprising, in 5′ to 3′ order: fragment 1 sequence, sticky end motif x, fragment 2 sequence, sticky end motif y, fragment 3 sequence, sticky end motif z and fragment 4 sequence.
- the product to be assembled comprises the predetermined sequence of the target gene without any scar sites.
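- The motif search in this example amounts to locating restriction enzyme recognition sequences that already occur in the target. A minimal Python sketch follows; the enzyme table is an illustrative assumption (three common enzymes and their well-known recognition sequences), not a list taken from this disclosure, and `find_sites`, `single_cut_junctions`, and `fragment_lengths` are hypothetical names.

```python
# Illustrative table only; any set of recognition sequences could be used.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

def find_sites(seq, sites=SITES):
    """Return sorted (position, enzyme, site) for every occurrence in seq."""
    seq = seq.upper()
    hits = []
    for enzyme, site in sites.items():
        pos = seq.find(site)
        while pos != -1:
            hits.append((pos, enzyme, site))
            pos = seq.find(site, pos + 1)
    return sorted(hits)

def single_cut_junctions(seq, sites=SITES):
    """Keep only enzymes whose site occurs exactly once in seq."""
    hits = find_sites(seq, sites)
    counts = {}
    for _, enzyme, _ in hits:
        counts[enzyme] = counts.get(enzyme, 0) + 1
    return [hit for hit in hits if counts[hit[1]] == 1]

def fragment_lengths(seq, junctions):
    """Distances between consecutive junction starts, as a rough size check."""
    bounds = [0] + [pos for pos, _, _ in junctions] + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]
```

- Restricting the choice to enzymes that cut the target once is a design assumption of the sketch: it keeps each junction specific to one enzyme, which matches the use of three enzymes for motifs x, y, and z.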
Abstract
Description
- This application is a Continuation of U.S. patent application Ser. No. 16/530,717, filed Aug. 2, 2019, which is a Continuation of U.S. patent application Ser. No. 15/433,909, filed Feb. 15, 2017, now abandoned, which is a Continuation of U.S. patent application Ser. No. 15/154,879, filed May 13, 2016, now U.S. Pat. No. 9,677,067, issued on Jun. 13, 2017, which is a Continuation of PCT/US16/16636, filed Feb. 4, 2016, which claims the benefit of U.S. Provisional Application No. 62/112,022, filed Feb. 4, 2015, which are herein incorporated by reference in their entirety.
- The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Jan. 25, 2016, is named 44854_709_303_SL.txt and is 41,173 bytes in size.
- De novo nucleic acid synthesis is a powerful tool for basic biological research and biotechnology applications. While various methods are known for the synthesis of relatively short fragments of nucleic acids on a small scale, these techniques suffer from limitations in scalability, automation, speed, accuracy, and cost. In many cases, the assembly of nucleic acids from shorter segments is limited by the availability of non-degenerate overhangs that can be annealed to join the segments.
- Provided herein are methods for nucleic acid assembly, comprising: providing a predetermined nucleic acid sequence; providing a plurality of precursor double-stranded nucleic acid fragments, each precursor double-stranded nucleic acid fragment having two strands, wherein each of the two strands comprises a sticky end sequence of 5′-A(Nx)T-3′ (SEQ ID NO.: 1) or 5′-G(Nx)C-3′ (SEQ ID NO.: 16), wherein N is a nucleotide, wherein x is the number of nucleotides between nucleotides A and T or between G and C, and wherein x is 1 to 10, and wherein no more than two precursor double-stranded nucleic acid fragments comprise the same sticky end sequence; providing primers comprising a nicking endonuclease recognition site and a sequence comprising (i) 5′-A(Nx)U-3′ (SEQ ID NO.: 80) corresponding to each of the different sticky end sequences of 5′-A(Nx)T-3′ (SEQ ID NO.: 1) or (ii) 5′-G(Nx)U-3′ (SEQ ID NO.: 81) corresponding to each of the different sticky end sequences of 5′-G(Nx)C-3′ (SEQ ID NO.: 16); and performing a polynucleotide extension reaction to form double-stranded nucleic acid fragments; subjecting the polynucleotide extension reaction product to nicking and cleavage reactions to form double-stranded nucleic acid fragments with 3′ overhangs; and annealing the double-stranded nucleic acid fragments to form a nucleic acid encoding for the predetermined nucleic acid sequence that does not include the nicking endonuclease recognition site. Methods are further provided wherein x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 100 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 25 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is 2 kb to 20 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is at least 2 kb in length. 
Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 100 bases in length. Methods are further provided wherein the double-stranded nucleic acid fragments are each at least 500 bases in length. Methods are further provided wherein the double-stranded nucleic acid fragments are each at least 1 kb in length. Methods are further provided wherein the double-stranded nucleic acid fragments are each at least 20 kb in length. Methods are further provided wherein the sticky ends are at least 4 bases long. Methods are further provided wherein the sticky ends are 6 bases long. Methods are further provided wherein step c further comprises providing (i) a forward primer comprising, in order 5′ to 3′: a first outer adaptor region and nucleic acid sequence from a first terminal portion of predetermined nucleic acid sequence; and (ii) a reverse primer, comprising, in order 5′ to 3′: a second outer adaptor region and nucleic acid sequence from a second terminal portion of predetermined nucleic acid sequence. Methods are further provided wherein the annealed double-stranded nucleic acid fragments comprise the first outer adaptor region and the second outer adaptor region. Methods are further provided wherein the nicking and cleavage reagents comprise a nicking endonuclease. Methods are further provided wherein the nicking endonuclease comprises endonuclease VIII. Methods are further provided wherein the nicking endonuclease is selected from the list consisting of Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII. Methods are further provided wherein the method further comprises ligating the annealed double-stranded nucleic acid fragments. 
Methods are further provided wherein annealing comprises: thermocycling between a maximum and a minimum temperature, thereby generating a first overhang from a first double-stranded DNA fragment and a second overhang from a second double-stranded DNA fragment, wherein the first and the second overhangs are complementary; hybridizing the first and second overhangs to each other; and ligating. Methods are further provided wherein a polymerase lacking 3′ to 5′ proofreading activity is added during the polynucleotide extension reaction. Methods are further provided wherein the polymerase is a Family A polymerase. Methods are further provided wherein the polymerase is a Family B high fidelity polymerase engineered to tolerate base pairs comprising uracil. Methods are further provided wherein the precursor double-stranded nucleic acid fragments comprise an adaptor sequence comprising the nicking endonuclease recognition site. Methods are further provided wherein one of the plurality of precursor double-stranded nucleic acid fragments is a linear vector. In some aspects, provided herein is a nucleic acid library generated by any of the aforementioned methods.
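- Two constraints stated in the preceding paragraph, the A(Nx)T or G(Nx)C form with x from 1 to 10 and the limit of two fragments per sticky end sequence, can be checked mechanically. The Python sketch below is a hypothetical helper (`overused_sticky_ends`), assuming each fragment is described simply by a list of its sticky end sequences written on one reference strand.

```python
import re
from collections import Counter

# A(Nx)T or G(Nx)C with x = 1 to 10.
STICKY_END = re.compile(r"A[ACGT]{1,10}T|G[ACGT]{1,10}C")

def overused_sticky_ends(fragment_ends):
    """Return sticky end sequences carried by more than two fragments.

    fragment_ends maps a fragment name to the list of its sticky end
    sequences. A malformed sticky end raises ValueError.
    """
    counts = Counter()
    for name, ends in fragment_ends.items():
        for end in set(ends):
            if not STICKY_END.fullmatch(end):
                raise ValueError(
                    f"{name}: {end!r} is not A(Nx)T or G(Nx)C with x = 1 to 10"
                )
            counts[end] += 1
    return sorted(end for end, n in counts.items() if n > 2)
```

- An empty return value indicates that every sticky end is shared by at most the two fragments it is meant to join.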
- Methods are provided herein for nucleic acid assembly, comprising: providing a predetermined nucleic acid sequence; synthesizing a plurality of precursor double-stranded nucleic acid fragments, each precursor double-stranded nucleic acid fragment having two strands, wherein each of the two strands comprises a sticky end sequence of 5′-A(Nx)T-3′ (SEQ ID NO.: 1) or 5′-G(Nx)C-3′ (SEQ ID NO.: 16), wherein N is a nucleotide, wherein x is the number of nucleotides between nucleotides A and T or between G and C, and wherein x is 1 to 10, and wherein no more than two precursor double-stranded nucleic acid fragments comprise the same sticky end sequence; providing primers comprising a nicking endonuclease recognition site and a sequence comprising (i) 5′-A(Nx)M-3′ (SEQ ID NO.: 82) corresponding to each of the different sticky end sequences of 5′-A(Nx)T-3′ (SEQ ID NO.: 1) or (ii) 5′-G(Nx)M-3′ (SEQ ID NO.: 83) corresponding to each of the different sticky end sequences of 5′-G(Nx)C-3′ (SEQ ID NO.: 16), wherein M is a non-canonical base, wherein the primers are each 7 to 70 bases in length; and performing a polynucleotide extension reaction to form double-stranded nucleic acid fragments; subjecting the polynucleotide extension reaction product to nicking and cleavage reactions to form double-stranded nucleic acid fragments with 3′ overhangs; and annealing the double-stranded nucleic acid fragments to form a nucleic acid encoding for the predetermined nucleic acid sequence that does not include the nicking endonuclease recognition site. Methods are further provided wherein x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. Methods are further provided wherein x is 4. 
Methods are further provided wherein the non-canonical base is uracil, inosine, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, acetylcytosine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N-6-isopentenyl adenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 1-methyladenine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, 5-ethylcytosine, N6-adenine, N6-methyladenine, N,N-dimethyladenine, 8-bromoadenine, 7-methylguanine, 8-bromoguanine, 8-chloroguanine, 8-aminoguanine, 8-methylguanine, 8-thioguanine, 5-ethyluracil, 5-propyluracil, 5-methylaminomethyluracil, methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid, pseudouracil, 1-methylpseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-hydroxymethyluracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl)uracil, 5-(2-bromovinyl)uracil, 2-aminopurine, 6-hydroxyaminopurine, 6-thiopurine, or 2,6-diaminopurine. Methods are further provided wherein the non-canonical base is incorporated into the double-stranded nucleic acid fragments by performing a nucleic acid extension reaction from a primer comprising the non-canonical nucleotide. Methods are further provided wherein the non-canonical base is a uracil. Methods are further provided wherein the uracil is in a deoxyuridine-deoxyadenosine base pair. Methods are further provided wherein the primers are 10 to 30 bases in length. Methods are further provided wherein one of the plurality of precursor double-stranded nucleic acid fragments comprises a portion of linear vector. 
Methods are further provided wherein no more than 2 N nucleotides of the sticky end sequence have the same identity. Methods are further provided wherein the precursor double-stranded nucleic acid fragments comprise an adaptor sequence comprising the nicking endonuclease recognition site. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 100 kb in length. Methods are further provided wherein the plurality of precursor nucleic acid fragments are each at least 100 bases in length. Methods are further provided wherein the sticky ends are at least 4 bases long in each precursor nucleic acid. In some aspects, provided herein is a nucleic acid library generated by any of the aforementioned methods.
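- The limit of no more than 2 N nucleotides sharing the same identity, together with the 7 to 70 base primer length, can be expressed as simple filters. The Python sketch below rests on one reading of that limit, namely that no single base may occur more than twice among the N positions; the names `interior_ok`, `primer_length_ok`, and `allowed_interiors` are hypothetical.

```python
from collections import Counter
from itertools import product

def interior_ok(sticky_end, max_repeat=2):
    """True if no base occurs more than max_repeat times among the N positions."""
    if len(sticky_end) < 3:
        raise ValueError("sticky end needs at least one N between its end bases")
    interior = sticky_end[1:-1]  # drop the flanking A/T or G/C
    return max(Counter(interior).values()) <= max_repeat

def primer_length_ok(primer, low=7, high=70):
    """True if the primer length lies in the stated 7 to 70 base range."""
    return low <= len(primer) <= high

def allowed_interiors(x=4, max_repeat=2):
    """Enumerate every N-run of length x that passes the repeat limit."""
    return [
        "".join(bases)
        for bases in product("ACGT", repeat=x)
        if max(Counter(bases).values()) <= max_repeat
    ]
```

- Under this reading, `allowed_interiors()` lists the NNNN choices available for x = 4, which bounds how many distinct junctions a single assembly can use.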
- Methods are provided herein for nucleic acid assembly, comprising: providing a predetermined nucleic acid sequence; synthesizing a plurality of single-stranded nucleic acid fragments, wherein each single-stranded nucleic acid fragment encodes for a portion of the predetermined nucleic acid sequence and comprises at least one sticky end motif, wherein the sticky end motif comprises a sequence of 5′-A(Nx)T-3′ (SEQ ID NO.: 1) or 5′-G(Nx)C-3′ (SEQ ID NO.: 16) in the predetermined nucleic acid sequence, wherein N is a nucleotide, wherein x is the number of nucleotides between nucleotides A and T or between G and C, and wherein x is 1 to 10, and wherein no more than two single-stranded nucleic acid fragments comprise the same sticky end sequence; amplifying the plurality of single-stranded nucleic acid fragments to generate a plurality of double-stranded nucleic acid fragments, wherein the plurality of double-stranded nucleic acid fragments are modified from the predetermined nucleic acid sequence to comprise (i) a non-canonical base located at a 3′ end of the sticky end motif on a first strand and (ii) a first adaptor region located 5′ of the non-canonical base on the first strand, wherein the first adaptor region comprises a nicking enzyme recognition site; creating sticky ends, wherein creating sticky ends comprises: treating the plurality of double-stranded fragments with a first nicking enzyme that nicks the non-canonical base on a first strand of each double-stranded fragment, and cleaving the nicked non-canonical base; and treating the plurality of double-stranded fragments with a second nicking enzyme, wherein the second nicking enzyme binds to the first strand at the nicking enzyme recognition site and cleaves a second strand of each double-stranded fragment, wherein a cleavage site for the nicking enzyme is located at a junction between the sticky end motif and a sequence reverse complementary to the first adaptor region of the first strand; and annealing the 
double-stranded nucleic acid fragments to form a nucleic acid encoding for the predetermined nucleic acid sequence that does not include the nicking endonuclease recognition site. Methods are further provided wherein x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 100 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is 1 kb to 25 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is 2 kb to 20 kb in length. Methods are further provided wherein the predetermined nucleic acid sequence is at least 2 kb in length. Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 100 bases in length. Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 500 bases in length. Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 1 kb in length. Methods are further provided wherein the plurality of single-stranded nucleic acid fragments are each at least 20 kb in length. Methods are further provided wherein the sticky ends are at least 4 bases long. Methods are further provided wherein the sticky ends are 6 bases long. 
Methods are further provided wherein the non-canonical base is uracil, inosine, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, acetylcytosine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N-6-isopentenyl adenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 1-methyladenine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, 5-ethylcytosine, N6-adenine, N6-methyladenine, N,N-dimethyladenine, 8-bromoadenine, 7-methylguanine, 8-bromoguanine, 8-chloroguanine, 8-aminoguanine, 8-methylguanine, 8-thioguanine, 5-ethyluracil, 5-propyluracil, 5-methylaminomethyluracil, methoxyarninomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid, pseudouracil, 1-methylpseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-hydroxymethyluracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-S-oxyacetic acid, 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, 5-(2-bromovinyl)uracil, 2-aminopurine, 6-hydroxyaminopurine, 6-thiopurine, or 2,6-diaminopurine. Methods are further provided wherein the non-canonical base is incorporated into the double-stranded nucleic acid by performing a nucleic acid extension reaction from a primer comprising the non-canonical nucleotide. Methods are further provided wherein the non-canonical base is a uracil. Methods are further provided wherein the uracil is in a deoxyuridine-deoxyadenosine base pair. Methods are further provided wherein the nicking recognition site is a nicking endonuclease recognition site. Methods are further provided wherein the distance between the non-canonical base the nicking enzyme cleavage site is less than 12 base pairs. 
Methods are further provided wherein the distance between the non-canonical base and the nicking enzyme cleavage site is at least 5 base pairs. Methods are further provided wherein the first nicking enzyme comprises a base excision activity. Methods are further provided wherein the first nicking enzyme comprises uracil-DNA glycosylase (UDG). Methods are further provided wherein the first nicking enzyme comprises an AP endonuclease. Methods are further provided wherein the first nicking enzyme comprises endonuclease VIII. Methods are further provided wherein the second nicking enzyme is a nicking endonuclease. Methods are further provided wherein the nicking endonuclease is selected from the list consisting of Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII. Methods are further provided wherein each of the plurality of double-stranded nucleic acid fragments further comprises two sticky ends. Methods are further provided wherein each of the two sticky ends has a different sequence from the other. Methods are further provided wherein the sticky ends comprise a 3′ overhang. Methods are further provided wherein the method further comprises ligating the annealed double-stranded nucleic acid fragments. Methods are further provided wherein annealing comprises: thermocycling between a maximum and a minimum temperature, thereby generating a first overhang from a first double-stranded DNA fragment and a second overhang from a second double-stranded DNA fragment, wherein the first and the second overhangs are complementary; hybridizing the first and second overhangs to each other; and ligating. Methods are further provided wherein the annealed double-stranded nucleic acid fragments comprise a 5′ outer adaptor region and a 3′ outer adaptor region. Methods are further provided wherein at least two non-identical single-stranded nucleic acid fragments are synthesized. 
Methods are further provided wherein at least 5 non-identical single-stranded nucleic acid fragments are synthesized. Methods are further provided wherein at least 20 non-identical single-stranded nucleic acid fragments are synthesized. Methods are further provided wherein a polymerase lacking 3′ to 5′ proofreading activity is added during the amplification step. Methods are further provided wherein the polymerase is a Family A polymerase. Methods are further provided wherein the polymerase is a Family B high fidelity polymerase engineered to tolerate base pairs comprising uracil. Methods are further provided wherein the amplified plurality of single-stranded nucleic acid fragments are not naturally occurring. Provided herein are nucleic acid libraries generated by any of the aforementioned methods.
- Provided herein are DNA libraries comprising n DNA fragments, each comprising a first strand and a second strand, each of the n DNA fragments comprising, in order 5′ to 3′: a first nicking endonuclease recognition site, a first sticky end motif, a template region, a second sticky end motif, and a second nicking endonuclease recognition site, wherein the first sticky end motif comprises a sequence of 5′-A (Nx)i,1U-3′ (SEQ ID NO.: 13) in the first strand; and wherein the second sticky end motif comprises a sequence of 5′-A (Nx)i,2U-3′ (SEQ ID NO.: 14) in the second strand; wherein Nx denotes x nucleosides, wherein (Nx)i,2 is reverse complementary to (Nx)i,1 and different from every other Nx found in any sticky end motif sequence within the fragment library, wherein the first nicking endonuclease recognition site in each of the DNA fragments is positioned such that there is a corresponding cleavage site immediately 3′ of the sticky end motif in the second strand, and wherein the second nicking endonuclease recognition sites are positioned such that there is a corresponding cleavage site immediately 3′ of the second sticky end motif in the first strand. Libraries are further provided wherein the first nicking endonuclease recognition site, the first sticky end motif, the variable insert, the second sticky end motif site, and the second nicking endonuclease recognition site are ordered as recited. Libraries are further provided wherein the library further comprises a starter DNA fragment comprising a template region, a second sticky end motif, and a second nicking endonuclease recognition site; wherein the second sticky end motif comprises a sequence of 5′-A (Nx)s,2 (SEQ ID NO.: 20) and wherein (Nx)s,2 is reverse complementary to (Nx)1,1. 
Libraries are further provided wherein the library further comprises a finishing DNA fragment comprising a first nicking endonuclease recognition site, a first sticky end motif, and a template region; wherein the first sticky end motif comprises a sequence of 5′-A (Nx)f,1U-3′ (SEQ ID NO.: 21) and wherein (Nx)f,1 is reverse complementary to (Nx)n,2. Libraries are further provided wherein the first and second nicking endonuclease recognition sites are the same. Libraries are further provided wherein n is at least 2. Libraries are further provided wherein n is less than 10. Libraries are further provided wherein x is 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10. Libraries are further provided wherein x is 4. Libraries are further provided wherein the template region of each of the n DNA fragments encodes for a different nucleic acid sequence from the template region of every other of the n DNA fragments. Libraries are further provided wherein the sequences of the n DNA fragments are not naturally occurring. Libraries are further provided wherein the first nicking endonuclease recognition site is not naturally adjacent to the first sticky end motif.
- All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.
-
FIG. 1 depicts a workflow through which a nucleic acid product is assembled from 1 kbp nucleic acid fragments. -
FIG. 2 depicts the assembly of a longer nucleic acid fragment from the ligation of two oligonucleic acid fragments having complementary overhangs and discloses SEQ ID NOs.: 4, 6, 3, 5, 3, 6, 3 and 6, respectively, in order of appearance. -
FIG. 3 depicts a uracil-containing universal primer pair, and discloses SEQ ID NOs.: 7, 2, 8 and 2, respectively, in order of appearance. -
FIG. 4 depicts the assembly of a nucleic acid product from oligonucleic acid fragments having complementary overhangs. -
FIGS. 5A-5B depict the assembly of a recombinatorial library from a library of nucleic acid fragments each having at least one unspecified base. -
FIG. 6 depicts a diagram of steps demonstrating a process workflow for oligonucleic acid synthesis and assembly. -
FIG. 7 illustrates an example of a computer system. -
FIG. 8 is a block diagram illustrating an example architecture of a computer system. -
FIG. 9 is a diagram demonstrating a network configured to incorporate a plurality of computer systems, a plurality of cell phones and personal data assistants, and Network Attached Storage (NAS). -
FIG. 10 is a block diagram of a multiprocessor computer system using a shared virtual address memory space. -
FIG. 11 shows an image of an electrophoresis gel resolving amplicons of a LacZ gene assembled in a plasmid using scar-free assembly methods described herein. - Disclosed herein are methods and compositions for the assembly of nucleic acid fragments into longer nucleic acid molecules of desired predetermined sequence and length without leaving inserted nucleic acid sequence at assembly points, aka “scar” sequence. In addition, amplification steps are provided during the synthesis of the fragments which provide a means for increasing the mass of a long nucleic acid sequence to be amplified by amplifying the shorter fragments and then rejoining them in a processive manner such that the long nucleic acid is assembled.
- The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of any embodiment. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Unless specifically stated or obvious from context, as used herein, the term “about” in reference to a number or range of numbers is understood to mean the stated number and numbers +/−10% thereof, or 10% below the lower listed limit and 10% above the higher listed limit for the values listed for a range. As used herein, the terms “preselected sequence”, “predefined sequence” or “predetermined sequence” are used interchangeably. The terms mean that the sequence of the polymer is known and chosen before synthesis or assembly of the polymer. In particular, various aspects of the invention are described herein primarily with regard to the preparation of nucleic acid molecules, the sequence of the oligonucleotide or polynucleotide being known and chosen before the synthesis or assembly of the nucleic acid molecules.
- The term “nucleic acid” as used herein refers broadly to any type of coding or non-coding, long polynucleotide or polynucleotide analog. As used herein, the term “complementary” refers to the capacity for precise pairing between two nucleotides. If a nucleotide at a given position of a nucleic acid is capable of hydrogen bonding with a nucleotide of another nucleic acid, then the two nucleic acids are considered to be complementary to one another (or, more specifically in some usage, “reverse complementary”) at that position. Complementarity between two single-stranded nucleic acid molecules may be “partial,” in which only some of the nucleotides bind, or it may be complete when total complementarity exists between the single-stranded molecules. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands.
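- The distinction drawn above between partial and complete complementarity can be made concrete with a small calculation. The following is a minimal sketch, not part of the disclosed methods: the function name `complementarity` and the equal-length, gap-free alignment are assumptions made for illustration only. Both strands are written 5′ to 3′, and the second strand is read in the antiparallel orientation.

```python
# Watson-Crick pairs. A/U is included because the text describes uracil
# in a deoxyuridine-deoxyadenosine base pair.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that pair when strand_b is aligned antiparallel
    to strand_a. Both strands are given 5' to 3' and must be the same length
    (a simplifying assumption for this illustration)."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    a = strand_a.upper()
    b_antiparallel = reversed(strand_b.upper())
    paired = sum((x, y) in PAIRS for x, y in zip(a, b_antiparallel))
    return paired / len(a)
```

A value of 1.0 corresponds to complete complementarity and any lower value to partial complementarity; for instance, AAGTCT and AGACTT pair at every position.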
- “Hybridization” and “annealing” refer to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The term “hybridized” as applied to a polynucleotide is a polynucleotide in a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The hydrogen bonding may occur by Watson Crick base pairing, Hoogsteen binding, or in any other sequence specific manner. The complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these. A hybridization reaction may constitute a step in a more extensive process, such as the initiation of a PCR or other amplification reactions, or the enzymatic cleavage of a polynucleotide by a ribozyme. A first sequence that can be stabilized via hydrogen bonding with the bases of the nucleotide residues of a second sequence is said to be “hybridizable” to the second sequence. In such a case, the second sequence can also be said to be hybridizable to the first sequence. In many cases a sequence hybridized with a given sequence is the “complement” of the given sequence.
- In general, a “target nucleic acid” is a desired molecule of predetermined sequence to be synthesized, and any fragment thereof.
- The term “primer” refers to an oligonucleotide that is capable of hybridizing (also termed “annealing”) with a nucleic acid and serving as an initiation site for nucleotide (RNA or DNA) polymerization under appropriate conditions (i.e., in the presence of four different nucleoside triphosphates and an agent for polymerization, such as DNA or RNA polymerase or reverse transcriptase) in an appropriate buffer and at a suitable temperature. The appropriate length of a primer depends on the intended use of the primer. In some instances, primers are at least 7 nucleotides long. In some instances, primers range from 7 to 70 nucleotides, 10 to 30 nucleotides, or from 15 to 30 nucleotides in length. In some instances, primers are from 30 to 50 or 40 to 70 nucleotides long. Oligonucleotides of various lengths as further described herein are used as primers or precursor fragments for amplification and/or gene assembly reactions. In this context, “primer length” refers to the portion of an oligonucleotide or nucleic acid that hybridizes to a complementary “target” sequence and primes nucleotide synthesis. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template. A primer need not reflect the exact sequence of the template but must be sufficiently complementary to hybridize with a template. The term “primer site” or “primer binding site” refers to the segment of the target nucleic acid to which a primer hybridizes.
- Scar-Free Nucleic Acid Assembly
- An exemplary workflow illustrating the generation of a target nucleic acid using a scar-free nucleic acid assembly method is shown in
FIG. 1. In a first step, the predetermined sequence of a double-stranded target nucleic acid 100 is analyzed to find short sequences, such as sequences of 3, 4, 5, 6, 7, 8, 9, or 10 bases, to serve as sticky end motifs 101 a-101 g. Each sticky end motif 101 a-101 g identified in the target nucleic acid need not comprise a sequence unique from another sequence in the target nucleic acid, but each sticky end sequence involved in target nucleic acid assembly is used only once, that is, at only one pair of precursor nucleic acid fragment ends. Sticky end motifs are generally used more than once, that is, at more than one pair of precursor nucleic acid fragment ends. A sticky end motif comprises the sequence A(Nx)T (SEQ ID NO.: 1), wherein x indicates from about 1 to about 10, N deoxyribonucleic acid bases of any sequence. For example, x is 4, 5 or 6 and each N may be the same or different from another N in the motif. In some cases, a sticky end motif comprises an ANNNNT (SEQ ID NO.: 2) sequence. After the target nucleic acid sequence 100 is analyzed to identify sticky end motifs 101 a-101 g and fragment sequences 110 a-110 c are selected 105, the fragments are synthesized 115 with the sticky end motifs from the target nucleic acid 100, for example, by de novo synthesis. - In one example of the de novo synthesis process as illustrated in
FIG. 1, synthesis 115 results in double-stranded precursor nucleic acid fragments 120 a-120 c. Each of the double-stranded precursor nucleic acid fragments 120 a-120 c includes an adaptor sequence positioned at either end of the target fragment sequence. The outer terminal portions of the double-stranded precursor nucleic acid fragments each comprise an outer adaptor 121 a-121 b. Each double-stranded precursor nucleic acid fragment 120 a-120 c is synthesized 115 such that it overlaps with another region of another fragment sequence via sticky end motifs 101 a-101 g in a processed order. As illustrated in FIG. 1, at the region of the synthesized double-stranded precursor nucleic acid fragment comprising a sticky end motif 101 a-101 b, synthesis also results in including additional sequence in a connecting adaptor region 123 a-123 d. The “sticky end motif” occurs at a desired frequency in the nucleic acid sequence. The connecting adaptor region 123 a-123 d includes a sticky end motif 101 a-101 b and a first nicking enzyme recognition site 125. - Further processing of the double-stranded precursor nucleic acid fragments 120 a-120 c is done via primers in an
amplification reaction 130 to insert a non-canonical base 131. In an alternative method, connecting adaptor regions 123 a-123 d and/or outer adaptors 121 a-121 b are appended to either end of the fragments during a processing step, for example, via primers in an amplification reaction 130. - To generate fragments capable of annealing, the double-stranded precursor nucleic acid fragments 120 a-120 c are subjected to
enzymatic processing 140. Enzymatic processing 140, as illustrated in FIG. 1, entails cleaving portions of the connecting adaptor regions 123 a-123 d. In a first enzymatic reaction, a first nicking enzyme binds at a first nicking enzyme recognition site 125, and then cleaves the opposite strand. In a second enzymatic reaction, a second nicking enzyme cleaves the non-canonical base 131. The enzymatic reaction results in fragments having sticky ends 140 a-140 d wherein pairs of sticky ends are reverse complementary and correspond to sticky end motifs 101 a-101 b in the original sequence. Finally, the fragments are subjected to an annealing and a ligation reaction 150 to form a reaction product 155 comprising the target sequence. The annealing and ligation reactions 150 can include rounds of annealing, ligating and melting under conditions such that only desired sticky ends 140 a-140 d are able to anneal and ligate, while cleaved end fragments remain unligated. Ordered assembly of nucleic acid fragments includes linear and circular assembly, for example, fragments are assembled with a vector into a plasmid. - In one example, each double-stranded fragment is flanked on a terminal side by a double-stranded connecting adaptor comprising: a double-stranded sticky end motif derived from the target nucleic acid sequence, a nicking enzyme cleavage site located on only a first strand of the adaptor, and a double-stranded nicking enzyme recognition sequence, such that upon incubation with a first nicking enzyme specific for the nicking enzyme recognition sequence, a single-strand break is introduced at the nicking enzyme cleavage site in the first strand. In exemplary cases, the sticky end motif of the connecting adaptor is located directly at the 5′ or 3′ end of a fragment so that each sticky end motif-fragment or fragment-sticky end motif construct comprises sequence native to the predetermined target nucleic acid sequence. The target
nucleic acid sequences 100 may be partitioned in sticky end motifs 101 a-101 g of about 200 bp or other lengths, such as less than or about 50 bp, about 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, or 1000 bp, or more bp. - In various aspects, described herein are double-stranded nucleic acids comprising a first strand having a first cleavage site and a second strand having a second cleavage site; wherein the cleavage sites are positioned one or more bases from one another in sequence. As a non-limiting example, provided are double-stranded nucleic acids comprising a first strand comprising a non-canonical base and a second strand comprising a nicking enzyme cleavage site; wherein the non-canonical base and nicking enzyme cleavage site are positioned one or more bases from one another in sequence. Through the combined action of nicking enzymes directed to act in tandem at adjacent or near adjacent positions on opposite strands of a double-stranded nucleic acid, one may impact the generation of a sticky end at or near the end of a first nucleic acid fragment, wherein the sticky end sequence is unique and complementary only to the sticky end of a second nucleic acid fragment sequentially adjacent thereto in a predetermined sequence of a full-length target nucleic acid to be assembled from the fragments.
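- The sequence analysis step described for FIG. 1, in which a target sequence is scanned for A(Nx)T sticky end motifs and each sticky end sequence is used at only one junction, can be sketched in software. This is a minimal illustration under stated assumptions, not the disclosed design procedure: the function names, the `min_spacing` parameter (a stand-in for a minimum fragment length), and the greedy left-to-right selection are all hypothetical choices made for this example.

```python
import re


def find_sticky_end_motifs(target: str, x: int = 4):
    """Return (position, motif) for every A(N)x T occurrence in the target,
    including overlapping occurrences (found via a zero-width lookahead)."""
    pattern = re.compile(r"(?=(A[ACGT]{%d}T))" % x)
    return [(m.start(), m.group(1)) for m in pattern.finditer(target.upper())]


def pick_junctions(target: str, x: int = 4, min_spacing: int = 20):
    """Greedily choose junction motifs so that no motif sequence is used twice
    and consecutive junctions are at least min_spacing bases apart.
    min_spacing is a hypothetical parameter, not a value from the text."""
    chosen, used, last = [], set(), 0
    for pos, motif in find_sticky_end_motifs(target, x):
        if motif in used or pos - last < min_spacing:
            continue
        chosen.append((pos, motif))
        used.add(motif)
        last = pos
    return chosen
```

The greedy pass simply skips a candidate whose sequence has already been assigned to a junction; a practical design tool would likely weigh further criteria that are outside the scope of this sketch.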
- An example workflow illustrating the generation of a nick at a non-canonical base in a nucleic acid is shown in
FIGS. 2A-2B. As a preliminary step, as illustrated in FIG. 1, a predetermined sequence of a target nucleic acid is partitioned in silico into fragments, where the sequence of each fragment is separated from an adjacent fragment by an identified sticky end motif. The connecting adaptor regions 123 a-123 d appended to an end of a fragment include a sticky end motif corresponding to the sticky end motif 101 a-101 g adjacent to the fragment such that each motif can processively be aligned during enzymatic processing. For example, the 3′ end of a first fragment 201 is configured for connection to the 5′ end of fragment 2 202 via a sticky end motif X 211 a. Similarly, fragment 2 202 is configured for connection to fragment 3 203 in the target sequence via sticky end motif Y 211 b and fragment 3 203 is configured for connection to fragment 4 204 in the target sequence via sticky end motif Z 211 c. - In some instances, a connecting adaptor comprises a first and a second nicking enzyme recognition site such that tandem nicks made to the connecting adaptor do not affect the sequence of the fragment to which the adaptor is connected. For example, a detailed view of precursor fragments 203 and 204 having such connecting adaptors is shown in
FIG. 2 (220 and 215, respectively). The 5′ connecting adaptor of fragment 4 204 comprises a first double-stranded nicking enzyme recognition site 225, a first nicking enzyme cleavage site 227 located on a first single-strand 221, and a double-stranded sticky end motif Z (AAGTCT, SEQ ID NO.: 3) modified with a uracil (AAGTCU, SEQ ID NO.: 4) on a second single-strand 223. The 3′ connecting adaptor of fragment 3 203 comprises the double-stranded sticky end motif Z 211 c (SEQ ID NO.: 3) modified with a uracil (AGACTU, SEQ ID NO.: 5) on a first single-strand 229, the first nicking enzyme cleavage site 227 on a second single-strand 231, and the first double-stranded nicking enzyme recognition site 225. Accordingly, each of the connecting adaptors comprises two nicking sites (a first nicking enzyme cleavage site and a uracil) located at different positions and strands in the adaptor sequence. - Continuing this exemplary workflow, nicking reactions 240 are next described. The first nicking
enzyme cleavage site 227 is located at the backbone of a single-strand of each connecting adaptor, adjacent to a first nicking enzyme recognition sequence 225. In some instances, the cleavage site is located at a position adjacent to a 5′ or 3′ end of a nicking enzyme recognition site by 1, 2, 3, 4, 5, 6, 7, 8, 9 or 10 bases. Fragments are treated with a first nicking enzyme, in this case, a strand-adjacent nicking enzyme, which cleaves a single-strand of the connecting adaptor at the first nicking enzyme cleavage site; and a second nicking enzyme which excises uracil and cleaves a single-strand of the connecting adaptor at the excised uracil site. Cleaved fragments 241, 242 comprise sticky end overhangs. Fragments comprising complementary sticky end overhangs are annealed and ligated 250. The ligation product 260 comprises predetermined target nucleic acid sequence comprising adjacent fragments separated by a sticky end motif, without the introduction of extraneous scar sequence. - As used herein, a sticky end motif includes forward, reverse, and reverse complements of a sticky end sequence. For example, a first strand of sticky end motif Z comprises SEQ ID NO.: 3 and a second strand of sticky end motif Z comprises the reverse complement of SEQ ID NO.: 3, AGACTT (SEQ ID NO.: 6),
FIG. 2 . - To prepare double-stranded precursor fragments with one or two nicking enzyme cleavage sites, precursor fragments are either synthesized with one or both sites, assembled from smaller nucleic acids comprising one or both sites, amplified with a primer comprising one or both sites, or any combination of the methods described or known in the art. For example, a precursor fragment can comprise a sticky end sequence and a primer is synthesized comprising a sequence that is complementary to the sticky end sequence, yet comprises a non-canonical base substitution at the 3′ end of the sticky end sequence. Amplification of precursor nucleic acid fragments comprising sticky end sequences with the primer may introduce the non-canonical base to the precursor fragment sequence so that the precursor fragment amplicons comprise a nicking enzyme cleavage site defined by the position of the non-canonical base. In one example, a double-stranded precursor fragment is prepared comprising, in 5′ to 3′ or 3′ to 5′ order: a first double-stranded nicking enzyme recognition sequence, a first nicking enzyme cleavage site on a first single-strand, a double-stranded sticky end motif, and a double-stranded fragment of predetermined target sequence; wherein amplification of the precursor fragment with a non-canonical base-containing primer as described introduces a second nicking enzyme cleavage site between the sticky end motif and fragment of predetermined target sequence on a second single-strand.
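- The relationship between the two strands of a sticky end motif and the uracil-substituted sequences carried by the primers can be checked computationally. The sketch below is illustrative only: the helper names are hypothetical, and it assumes the motif has the A(Nx)T form so that both strands, read 5′ to 3′, begin with A and end with T. It derives the reverse complement of a motif and replaces the 3′-terminal T on each strand with U, as described for the non-canonical base substitution at the 3′ end of the sticky end sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def uracil_substituted(motif: str) -> str:
    """Replace the 3'-terminal T of an A(N)x T motif with U."""
    motif = motif.upper()
    if not (motif.startswith("A") and motif.endswith("T")):
        raise ValueError("expected a motif of the form A(N)xT")
    return motif[:-1] + "U"


def primer_motif_pair(motif: str):
    """Uracil-substituted motif for each strand, both written 5' to 3'."""
    return uracil_substituted(motif), uracil_substituted(reverse_complement(motif))
```

Applied to motif Z (AAGTCT), this reproduces the sequences recited above: AGACTT as the reverse complement, and AAGTCU and AGACTU as the uracil-modified strands.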
- In some cases, a collection of precursor nucleic acid fragments is provided, each precursor nucleic acid fragment comprising a fragment sequence of a predetermined sequence of a target nucleic acid and a 5′ and/or 3′ connecting adaptor, wherein each connecting adaptor comprises a shared sequence among the precursor fragments and optionally one or more bases variable among the precursor fragments. Amplification of collective fragments comprising a shared sequence can be performed using a universal primer targeting shared sequence of the adaptors.
- An exemplary universal primer is one that comprises a base or sequence of bases which differs from a shared adaptor sequence of precursor nucleic acid fragments. For example, a universal primer comprises a non-canonical base as an addition and/or base substitution to shared adaptor sequence, and amplification of precursor fragments comprising the shared adaptor sequence with the primer introduces the non-canonical base into each adaptor sequence. An illustration of an exemplary universal primer pair comprising a non-canonical base substitution is shown in
FIG. 3. Each primer comprises, in 5′ to 3′ order: one or more adaptor bases 301 a, 301 b, a nicking enzyme recognition site 302 a, 302 b, and a sticky end motif comprising a T to U base substitution (sticky end motif in forward primer 305: AATGCU, SEQ ID NO.: 7 303 a; sticky end motif in reverse primer 310: AGCATU, SEQ ID NO.: 8 303 b). Amplification of a first precursor nucleic acid having an adaptor comprising sticky end motif AATGCT (SEQ ID NO.: 9) with the forward primer introduces a uracil to a single-strand of the adaptors in the resulting amplicons. Amplification of a second precursor nucleic acid having an adaptor comprising sticky end motif AGCATT (SEQ ID NO.: 10) with the reverse primer introduces a uracil to a single-strand of the adaptors in the resulting amplicons. The amplification products, following the cleavage steps described herein, have compatible sticky ends suitable for annealing and ligating. In some cases, a set of two or more universal primer pairs is used in a method disclosed herein, wherein each pair comprises a universal forward primer and a universal reverse primer, and wherein the forward primers in the set each comprise a shared forward sequence and a variable forward sequence and the reverse primers in the set each comprise a shared reverse sequence and a variable reverse sequence. A set of universal primers designed to amplify the collection of nucleic acids may comprise differences within each set of universal forward and reverse primers relating to one or more bases of the sticky end motif sequence. - Provided herein are methods where a universal primer pair incorporates a universal primer sequence 5′ to a sticky end motif sequence in a nucleic acid. As a non-limiting example, a universal primer sequence comprises a universal nicking enzyme recognition sequence to be incorporated at the end of each fragment in a library of precursor nucleic acid fragments. For the universal primers shown in
FIG. 3, as one example, a primer fusion site comprises four bases 3′ to an adenine (A) and 5′ to a uracil (U). The 5′-A (N4) U-3′ (SEQ ID NO.: 11) primer fusion sequence is located at the very 3′ end of the exemplary primers, which conclude with a 3′ uracil. Alternatively, the primer fusion sequence is 5′-G (N4) U-3′ (SEQ ID NO.: 12). For some assembly reactions with precursor nucleic acid fragments, a number of such primers with varying N4 sequences are used within a reaction mixture, each targeting a complementary fusion site on one end of one of the fragments that are to be assembled. N4 represents any configuration of 4 bases (N), where each base N has the same or different identity than another base N. In some cases, the number of N bases is greater than or less than 4. Without being bound by theory, since mismatched base pairs toward the 3′ end of a primer significantly reduce the efficiency of a nucleic acid extension reaction, placement of variable regions that target different fusion sites increases the specificity between the primer fusion site sequences and fragment fusion site sequences. - A plurality of precursor nucleic acid fragments comprising shared and variable regions of sequence is shown in
FIG. 4. Each precursor fragment 401-404 comprises at least one connecting adaptor and optionally an outer adaptor at each end of a target fragment sequence, wherein each of the connecting and outer adaptors comprise a shared sequence. Following PCR amplification 405 with primers (designated by arrows above and below “401” in FIG. 4), the precursor fragments 401-404 are modified to include non-canonical bases 410, subjected to enzymatic digestion 415 to generate fragments with overhangs 420, and subjected to annealing and ligation 430. The primers may be universal primers described herein. The nucleic acids comprising fragment 1 401 and fragment 2 402 are appended at their 3′ or 5′ ends, respectively, with sticky end motif X, wherein the sequence fragment 1-sticky end motif X-fragment 2 occurs in the predetermined target sequence. The nucleic acids comprising fragment 2 402 and fragment 3 403 are appended at their 3′ or 5′ ends, respectively, with sticky end motif Y, wherein the sequence fragment 2-sticky end motif Y-fragment 3 occurs in the predetermined target sequence. The nucleic acids comprising fragment 3 403 and fragment 4 404 are appended at their 3′ or 5′ ends, respectively, with sticky end motif Z, wherein the sequence fragment 3-sticky end motif Z-fragment 4 occurs in the predetermined target sequence. The ligation product is then amplified by PCR 440 using primers 445, 446 complementary to outer adaptor regions. The resulting final product is a plurality of nucleic acids which lack adaptor regions 450. - Connecting adaptors disclosed herein may comprise a Type II restriction endonuclease recognition sequence. In such instances, a sticky end motif shared between adjacent fragments in a predetermined sequence is a Type II restriction endonuclease recognition sequence. 
As a non-limiting example, sticky end motif X is a first Type II restriction endonuclease recognition sequence so that upon digesting with the appropriate Type II restriction enzyme, a sticky end is produced at the ends of
nucleic acids - In some cases, tandem nicking of a double-stranded nucleic acid and/or double-stranded cleavage by a Type II restriction endonuclease, results in undesired sequences terminal to cleavage sites remaining in the cleavage reaction. These terminal bases are optionally removed to facilitate downstream ligation. Cleaved termini are removed, for example, through size-exclusion column purification. Alternately or in combination, terminal ends are tagged with an affinity tag such as biotin such that the cleaved ends are removed from the reaction using avidin or streptavidin, such as streptavidin coated on beads. Alternately, for tandem nicking reactions, cleaved ends of precursor fragments are retained throughout annealing of the fragments to a larger target nucleic acid.
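- The ordered assembly described for FIG. 4, in which the sequence fragment 1-sticky end motif X-fragment 2-sticky end motif Y-fragment 3 is reconstituted because each motif joins exactly one pair of fragment ends, can be modeled as a simple chaining problem. The following is a sketch under assumptions: the tuple layout `(name, left_motif, body, right_motif)` and the function name `assemble` are hypothetical, each motif is represented by its top-strand sequence only, and enzymatic processing is abstracted away entirely.

```python
def assemble(fragments):
    """Order fragments by matching each fragment's right-hand motif to the
    left-hand motif of the next fragment, and return (order, product).

    fragments: iterable of (name, left_motif, body, right_motif), where
    left_motif is None for the starter fragment and right_motif is None
    for the finishing fragment (a representation assumed for this sketch).
    """
    by_left = {}
    for frag in fragments:
        left = frag[1]
        if left in by_left:
            raise ValueError(f"motif {left!r} is used at more than one junction")
        by_left[left] = frag
    if None not in by_left:
        raise ValueError("no starter fragment (left_motif=None) supplied")

    name, _, body, right = by_left[None]
    order, product = [name], body
    while right is not None:
        if right not in by_left:
            raise ValueError(f"no fragment begins with motif {right!r}")
        name, _, body, next_right = by_left[right]
        if name in order:
            raise ValueError("motifs form a cycle; linear assembly is ambiguous")
        order.append(name)
        # The motif is native target sequence, so it is kept in the product
        # and no extra scar sequence is introduced at the junction.
        product += right + body
        right = next_right
    return order, product
```

Because every motif appears at only one junction, the input order of the fragments does not matter; the chain is fully determined by the motifs themselves.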
- Provided herein are methods where the precursor fragments comprise a first nicking enzyme cleavage site defined by a first nicking enzyme recognition sequence, and a non-canonical base. In these cases, precursor fragments are treated with a first enzyme activity that excises the non-canonical base and a second enzyme activity that cleaves single-stranded nucleic acids at the abasic site and first nicking enzyme cleavage site. Some of the cleaved ends produced at the first nicking enzyme cleavage site are able to reanneal to cleaved sticky end overhangs, and may re-ligate. However, such re-ligation will also reconstitute the cleavage site, and will be re-cleaved if the single-strand nicking enzyme activity is included in the reaction. The opposite strand, from which the non-canonical base has been excised and the phosphodiester backbone cleaved at that site, is incapable of re-ligation to the cleaved end because of the gap created at the now abasic site. However, sticky ends of precursor nucleic acid fragments that are end pairs intended to assemble into a larger fragment are capable of annealing to one another and ligating. Upon ligation, the molecule formed thereby will not have the first nicking enzyme cleavage site, as the sequence that specifies cleavage is in the cleaved-off terminal fragment rather than in the adjacent fragment sequence. Subsequently, ligated ends will not be re-cleaved by strand-adjacent nicking enzyme. Additionally, as neither strand has a gap position corresponding to the excised non-canonical base position, sticky ends of precursor nucleic acid fragments that are end pairs intended to assemble into a larger target are capable of annealing to one another across both strands.
- Following successive rounds of thermocycling through annealing, ligation and denaturing, optionally in the presence of a nicking enzyme, sticky ends that bind to their partner ends will be ligated and drawn out of the sticky end pool, while sticky ends that bind to cleaved terminator sequence will remain available for ligation in successive rounds. Through successive iterations of annealing, ligation and melting, cleaved ends remain unligated while junction binding events become ligated to one another.
- Sticky ends of cleaved precursor nucleic acid fragments are allowed to anneal to one another under conditions promoting stringent hybridization, such that in some cases, only perfectly reverse complementary sticky ends anneal. In some cases, less stringent annealing is permitted. Annealed sticky ends are ligated to form either complete target nucleic acid molecules, or larger fragment target nucleic acid molecules. Larger fragment molecules are in turn subjected to one or more additional rounds of assembly, using either methods described herein and additional sticky end sites, or one or more assembly techniques known in the art.
- Methods and compositions described herein allow assembly of large nucleic acid target molecules with a high degree of confidence as to sequence integrity. The target molecules are assembled from precursor nucleic acid fragments that are in many cases synthesized to a length that is within a target level of sequence confidence—that is, they are synthesized to a length for which the synthesis method provides a high degree of confidence in sequence integrity. In some cases, this length is about 100, 150, 200, 250, 300, 350, 400, 450, or 500 nucleic acid bases.
- In some cases, the methods provided herein generate a specific target sequence for a recombinatorial library, e.g., a chimeric construct or a construct comprising at least one targeted variation for codon mutation. Positions to vary include, without limitation, codons at residues of interest in an encoded protein, codons of residues of unknown function in an encoded protein, and pairs or larger combinations of codons encoding residues known or suspected to work in concert to influence a characteristic of a protein such as enzymatic activity, thermostability, protein folding, antigenicity, protein-protein interactions, solubility or other characteristics.
- A library of variants may be prepared by synthesizing target nucleic acids from fragments having at least one indeterminate or partially determinate position among members of the library. In some cases, target fragments are synthesized having combinations of variants. Upon assembly of a target nucleic acid library, multiple combinations of variations at a first position and variations at a second position may be present in the library. In some instances, all possible combinations of variants are represented in a library. The library may be constructed such that variant base positions are each found on different target fragments, or alternately, multiple variant base positions are found on the same target fragment library.
-
FIGS. 5A-5B illustrate an exemplary workflow for recombinatorial library synthesis of a target gene. The target gene is partitioned into fragments 1-4 by motifs X, Y, and Z 500, each fragment comprising one or two indeterminate sites (FIG. 5A). In some instances, not all fragments of a target gene comprise an indeterminate site. Precursor fragments 501 comprise an outer adaptor, a variant of fragment 1 comprising one indeterminate site, and a connecting adaptor comprising motif X. Precursor fragments 502 comprise a connecting adaptor comprising motif X, a variant of fragment 2 comprising one indeterminate site, and a connecting adaptor comprising motif Y. Precursor fragments 503 comprise a connecting adaptor comprising motif Y, a variant of fragment 3 comprising two indeterminate sites, and a connecting adaptor comprising motif Z. Precursor fragments 504 comprise a connecting adaptor comprising motif Z, a variant of fragment 4 comprising one indeterminate site, and a second outer adaptor. PCR is used to generate amplicons 510 of each precursor fragment, collectively, 500, in some cases using a universal primer pair(s) (FIG. 5B). Precursor nucleic acids are digested at their connecting adaptor sequence to generate sticky ends, complements of which are annealed and ligated together to form a series of target genes comprising: fragment 1 sequence comprising one indeterminate site, motif X, fragment 2 sequence comprising one indeterminate site, motif Y, fragment 3 sequence comprising two indeterminate sites, motif Z, and fragment 4 sequence comprising one indeterminate site 520. With five indeterminate sites in total, each open to any of four bases, the number of possible target gene variants is 4^5, or 1,024 different genes. FIG. 5B, part 530, shows a conceptual depiction of some of these target gene variants after PCR amplification. - Methods described herein comprise assembling double-stranded DNA (“dsDNA”) target nucleic acid from shorter target nucleic acid fragments that are building block precursors. 
Assembly may proceed by hybridizing uniquely complementary pairs of overhangs. Such uniquely complementary pairs may be formed by incorporating sticky ends from two precursor fragments that appear successively in the assembled nucleic acid. In some cases, the pair of overhangs does not involve complete complementarity, but rather sufficient partial complementarity to allow selective hybridization of successive precursor fragments under designated reaction conditions.
- Generation of an overhang on a double-stranded nucleic acid is generally performed with two cleavage agents. A cleavage agent includes any molecule with enzymatic activity for base excision and/or single-strand cleavage of a double-stranded nucleic acid. For example, a cleavage agent is a nicking enzyme or has nicking enzymatic activity. A cleavage agent recognizes a cleavage or nicking enzyme recognition sequence, mismatched base pair, atypical base, non-canonical or modified nucleoside or nucleobase to be directed to a specific cleavage site. In some cases, two cleavage agents have independent recognition sites and cleavage sites. In some cases, a cleavage agent generates a single-stranded cleavage, e.g., a nick or a gap, involving removal of one or more nucleosides from a single-strand of a double-stranded nucleic acid. In some cases, a cleavage agent cleaves a phosphodiester bond of a single-strand in a double-stranded nucleic acid.
- Provided herein are methods for creating a sticky end on a double-stranded nucleic acid comprising: (a) providing a linear double-stranded nucleic acid comprising in order an insert region, a first fusion site, and a first adaptor region; (b) creating a first nick on a first strand of the double-stranded nucleic acid with a first cleavage agent having a first recognition site and a first specific cleavage site; and (c) creating a second nick on a second strand of the double-stranded nucleic acid with a second cleavage agent having a second recognition site and a second specific cleavage site; wherein the method produces a sticky end at the first fusion site; wherein the first recognition site is in the first fusion site or the first adaptor region; and wherein the second recognition site is in the first fusion site or first adaptor region. In some cases, the first adaptor region or first fusion site comprises a sticky end motif. In some cases, the first adaptor region or first fusion site comprises a strand-adjacent nicking enzyme recognition sequence. In some cases, a precursor nucleic acid sequence comprises a fusion site and an adaptor region that are not naturally adjacent to each other.
- Provided herein are methods for creating sticky ends on double-stranded nucleic acid comprising: (a) providing a plurality of double-stranded nucleic acids each comprising in order an insert region, a fusion site, and an adaptor region, wherein each of the plurality of double-stranded nucleic acids have a different fusion site; (b) creating a first nick on a first strand of each of the plurality of double-stranded nucleic acids with a first cleavage agent having a first recognition site and a first specific cleavage site; and (c) creating a second nick on a second strand of each of the plurality of double-stranded nucleic acids with a second cleavage agent having a second recognition site and a second specific cleavage site; wherein the method produces a sticky end at each fusion site of the plurality of double-stranded nucleic acids; wherein the first recognition site is in the fusion site or the adaptor region of the plurality of double-stranded nucleic acids; and wherein the second recognition site is in the fusion site or adaptor region of the plurality of double-stranded nucleic acids. In some cases, the first adaptor region or first fusion site comprises a sticky end motif. In some cases, the first adaptor region or first fusion site comprises a strand-adjacent nicking enzyme recognition sequence.
- Provided herein are methods for assembling a polynucleotide comprising: (a) providing a reaction mixture comprising a first dsDNA fragment comprising a uracil base on its first strand; a second dsDNA fragment comprising a uracil base on its first strand; a first cleaving agent that cuts dsDNA on a single-strand at the site of a uracil; a second cleaving agent that cuts dsDNA on a single-strand, wherein the cleavage site of the second cleaving agent is within k bp of the uracil in an opposite strand and wherein k is between 2 and 10; and a ligase; and (b) thermocycling the reaction mixture between a maximum and a minimum temperature, thereby generating a first overhang from the first dsDNA fragment and a second overhang from the second dsDNA fragment, wherein the first and the second overhangs are complementary, hybridizing the first and second overhangs to each other, and ligating.
- Provided herein are methods for assembling a polynucleotide comprising: (a) providing a reaction mixture comprising n dsDNA fragments each comprising a first and a second strand, and a first nicking endonuclease recognition site, a first fusion site, a variable insert, a second fusion site, and a second nick enzyme recognition site, wherein the second fusion site comprises a uracil base on the first strand and the first fusion site comprises a uracil base on the second strand; a first cleaving agent that cuts dsDNA on a single-strand at the site of a uracil; a second cleaving agent that cuts dsDNA on a single-strand, wherein the cleavage site of the second cleaving agent is within k bp of the uracil in an opposite strand and wherein k is between 2 and 10; and a ligase; and (b) thermocycling the reaction mixture between a maximum and a minimum temperature, thereby generating a first overhang and a second overhang on each end of the n dsDNA fragments, wherein the second overhang on the ith of the n dsDNA fragments is reverse complementary to the first overhang on the i+1st of the n dsDNA fragments, hybridizing the complementary overhangs to each other, and ligating.
- Provided herein are fragment libraries comprising n DNA fragments, each comprising a first strand and a second strand, each ith DNA fragment comprising a first nicking endonuclease recognition site, a first fusion site, a variable insert, a second fusion site, and a second nick enzyme recognition site; wherein the first fusion site comprises a sequence of 5′-A (Nx)i,1U-3′ (SEQ ID NO.: 13) in the first strand; and wherein the second fusion site comprises a sequence of 5′-A (Nx)i,2U-3′ (SEQ ID NO.: 14) in the second strand; wherein Nx denotes x nucleosides; wherein (Nx)i,2 is reverse complementary to (Nx)i+1,1 and different from every other Nx found in any fusion site sequence within the fragment library; wherein the first nicking endonuclease recognition sites are positioned such that there is a corresponding cleavage site immediately 3′ of the first fusion site in the second strand; and wherein the second nicking endonuclease recognition sites are positioned such that there is a corresponding cleavage site immediately 3′ of the second fusion site in the first strand.
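- The pairing rule stated for the fragment library, in which (Nx)i,2 is reverse complementary to (Nx)i+1,1 and is not reused at any other junction, can be checked mechanically. The following is a minimal sketch only, not the method of the disclosure: the function names, the tuple layout (first_nx, second_nx), and the choice to treat a sequence and its reverse complement as the same junction are assumptions made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def check_fusion_chain(fragments):
    """Check an ordered list of (first_nx, second_nx) tuples, one per fragment.

    first_nx stands for the hypothetical (Nx)i,1 and second_nx for (Nx)i,2.
    Returns a list of problem descriptions; an empty list means both rules hold.
    """
    problems = []
    seen = {}  # canonical junction sequence -> junction label
    for i in range(len(fragments) - 1):
        label = f"{i + 1}/{i + 2}"
        second = fragments[i][1].upper()
        next_first = fragments[i + 1][0].upper()
        # Rule 1: (Nx)i,2 must be the reverse complement of (Nx)i+1,1.
        if reverse_complement(second) != next_first:
            problems.append(f"junction {label}: {second} does not pair with {next_first}")
        # Rule 2 (assumed reading): a junction sequence, in either orientation,
        # may be used at only one junction in the library.
        key = min(second, reverse_complement(second))
        if key in seen:
            problems.append(f"junction {label} reuses the sequence of junction {seen[key]}")
        else:
            seen[key] = label
    return problems
```

- For example, check_fusion_chain([("", "ACGA"), ("TCGT", "GGTC"), ("GACC", "")]) returns an empty list, because each second Nx is the reverse complement of the next fragment's first Nx and no junction sequence repeats.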
- Provided herein are primer libraries comprising n primers, each comprising a nicking endonuclease recognition sequence and a fusion sequence comprising 5′-A (Nx)i U-3′ (SEQ ID NO.: 15), wherein the nicking endonuclease recognition sequence is positioned 5′ of the fusion sequence. In some cases, the nicking endonuclease recognition sites are positioned such that the nicking endonuclease recognition site in a primer is capable of generating a corresponding cleavage site in a reverse
complementary DNA strand 3′ of a first fusion site in the reverse complementary DNA strand, if the primer were hybridized to the reverse complementary DNA strand such that the fusion sequence hybridizes to the first fusion site in the reverse complementary DNA strand. In some cases, x is selected from the list consisting of the integers - A primer is said to anneal to another nucleic acid if the primer, or a portion thereof, hybridizes to a nucleotide sequence within the nucleic acid. The statement that a primer hybridizes to a particular nucleotide sequence is not intended to imply that the primer hybridizes either completely or exclusively to that nucleotide sequence.
- Provided herein are methods for the creation of a sticky end on a nucleic acid using a combination of independently acting single-strand cleaving enzymes rather than a single restriction endonuclease. In some cases, a sticky end is an end of a double-stranded nucleic acid having a 5′ or 3′ overhang, wherein a first strand of the nucleic acid comprises one or more bases at its 5′ or 3′ end, respectively, which are collectively not involved in a base-pair with bases of the second strand of the double-stranded nucleic acid. An overhang is capable of annealing to a complementary overhang under suitable reaction conditions. In some cases, “sticky end” and “overhang” are used interchangeably. Non-limiting examples of overhang lengths include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more bases. For example an overhang has 4 to 10 bases, 4 to 8 bases, or 4 to 6 bases.
- Sticky end motifs are generally identified from a predetermined sequence of a target nucleic acid to be synthesized from fragments partitioned by selected identified sticky end motifs. In some cases, ANNNNT (SEQ ID NO.: 2) motifs are identified as sources of potential sticky ends in a target sequence. In some cases, GNNNNC (SEQ ID NO.: 17) motifs are identified as a source of potential sticky ends in a target sequence. Each N is independently any base. Selected sticky ends serve as fusion sites for annealing and ligating together two fragments via complementary sticky ends.
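- Identifying candidate sticky end motifs in a predetermined target sequence amounts to a pattern search for A (Nx) T and G (Nx) C sites. The sketch below is illustrative only; the function name and the default x=4 are assumptions, and it reports every candidate site without choosing among them. A lookahead is used so that overlapping candidates are all reported. Because the reverse complement of A (Nx) T is again of the form A (Nx) T (and likewise for G (Nx) C), scanning one strand is sufficient.

```python
import re


def find_sticky_end_motifs(target: str, x: int = 4):
    """Return (position, motif) pairs for every A(Nx)T or G(Nx)C site.

    Positions are 0-based offsets into the target sequence. Overlapping
    sites are included because the pattern is wrapped in a lookahead.
    """
    pattern = re.compile(r"(?=(A[ACGT]{%d}T|G[ACGT]{%d}C))" % (x, x))
    return [(m.start(), m.group(1)) for m in pattern.finditer(target.upper())]
```

- Selecting which of the reported sites to use as fusion sites, for example so that fragments fall within a desired length range, is a separate design step that this sketch does not attempt.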
- In some cases, a sticky end comprises a sequence of A (Nx) T (SEQ ID NO.: 1), wherein Nx is x number of N bases of any sequence. In some cases, a sticky end comprises a sequence of G (Nx) C (SEQ ID NO.: 16), wherein Nx is x number of N bases of any sequence. A sticky end motif is a sequence of double-stranded polynucleotides in a nucleic acid that, when treated with an appropriate cleavage agent, makes up a sticky end. For reactions comprising a plurality of double-stranded nucleic acid fragments to be assembled, in some instances the Nx sequence or full sequence of a sticky end at the 3′ end of a first nucleic acid fragment is completely or partially reverse complementary to the Nx sequence of a sticky end at the 5′ end of a second nucleic acid fragment. In similar instances the 3′ end of the second nucleic acid fragment has a sticky end that is completely or partially reverse complementary to the Nx sequence of the sticky end at the 5′ end of a third nucleic acid fragment, and so on. In some instances, the motif of the sticky end complementary between the first and second nucleic acids is the same as the motif of the sticky end complementary between the second and third nucleic acids. This sequence similarity between sticky end motifs includes motifs having identical base number and sequence identities. In some cases, sticky end motifs of a plurality of nucleic acids share the same form, yet have variable identities. For example, each motif shares the sequence ANNNNT (SEQ ID NO.: 2), but two or more motifs differ in the identity of the 4 N bases. A plurality of nucleic acid fragments to be assembled may each comprise a sticky end motif of A (Nx) T (SEQ ID NO.: 1), wherein the sequence of a given motif is only shared among two of the fragments adjacent to one another in a target nucleic acid sequence. 
Thus, these nucleic acid fragments, under appropriate conditions, anneal to each other in a linear sequence without degeneracy in the pairing of overhangs and hence the nucleic acid order within the linear sequence.
- The number of bases x in Nx in a sticky end motif described herein may be the same for all sticky end motifs for a number of nucleic acids within a plurality of nucleic acids. In some instances, sticky end motifs belonging to a number of nucleic acids within a plurality of nucleic acids comprise sequences of A (Nx) T (SEQ ID NO.: 1), G (Nx) C (SEQ ID NO.: 16), or combinations thereof, wherein the number of bases x in Nx is the same or varies among the plurality of nucleic acids. The number of bases x in Nx may be more than or equal to about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more. In some cases, the number of bases x in Nx sticky end motifs of a plurality of nucleic acids is less than or equal to 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 bases. In some cases, the number of bases x in Nx in sticky end motifs is 2-10 bases, 3-9 bases, 4-8 bases, or 5-10 bases. In some cases, a sequence of N bases in a sticky end motif described herein comprises no more than 4, 3, 2, or 1 of the same base. For example, in a sticky end motif comprising x=4 N bases, no more than 1, 2, 3 or 4 bases have the same identity. In some cases, no more than 2, 3 or 4 bases in a sticky end motif sequence have the same identity. In some cases, a sequence adjacent to a sticky end motif in a nucleic acid described herein does not comprise a G or C in the first two positions adjacent to the 3′ end of the sticky end motif.
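- The two composition constraints described above (a cap on how many N bases share one identity, and no G or C in the two positions adjacent to the 3′ end of the motif) can be expressed as a simple filter. This is a sketch under stated assumptions: the function name, the argument layout, and the default cap of 2 are hypothetical choices for illustration, not values fixed by the disclosure.

```python
from collections import Counter


def passes_motif_rules(n_bases: str, downstream: str, max_same: int = 2) -> bool:
    """Screen one candidate sticky end motif.

    n_bases    -- the Nx portion of the motif (e.g. the NNNN of ANNNNT)
    downstream -- target sequence immediately 3' of the motif
    max_same   -- assumed cap on how many N bases may share one identity
    """
    counts = Counter(n_bases.upper())
    # Reject motifs in which any single base occurs more often than allowed.
    if counts and max(counts.values()) > max_same:
        return False
    # Reject motifs followed by G or C in either of the next two positions.
    return not any(base in "GC" for base in downstream.upper()[:2])
```

- A filter of this kind would typically be applied to the candidate sites found in the target sequence before fusion sites are assigned.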
- Referring to the figures,
FIG. 2 depicts the preparation and annealing of two sticky ends in a plurality of precursor nucleic acid fragments. In FIG. 2, a plurality of fragments spanning a predetermined target nucleic acid sequence is generated for which sticky end motif sequences have been selected (sticky end motifs X, Y, and Z) such that only two fragments will share a particular compatible sticky end. Each precursor fragment comprises target nucleic acid fragment sequence, flanked by sticky end motif sequence ANNNNT (SEQ ID NO.: 2), wherein NNNN are specific to an end pair, and having a U in place of the T at the 3′ end of one strand. In alternate embodiments the sequence is GNNNNC (SEQ ID NO.: 17), wherein NNNN are specific to an end pair, and having a U in place of the C at the 3′ end of one strand. - Another non-limiting depiction of sticky end use is shown in the example workflow of
FIG. 4 , which generally depicts the assembly of target nucleic acids from precursor nucleic acid fragments via assembly of complementary sticky ends in the precursor fragments. Connecting adaptors of two or more fragments may be synthesized to be flanked by Type II restriction endonuclease sites that are unique to a fragment pair. Compatible ends are ligated and PCR is used to amplify the full length target nucleic acids. - Position-Specific Sticky End Generation
- In some cases, methods and compositions described herein use two independent cleavage events that have to occur within a distance that allows for separation of a cleaved end sequence under specified reaction conditions. For example, two different cleaving agents are used that both cut DNA only at a single-strand. In some cases, one or both of the cleaving agents cut outside of its recognition sequence (a “strand-adjacent nicking enzyme”). This makes the process independent of the actual sequence of the overhangs which are to be assembled at sticky end sites. In some cases, one or more of the cleavage agents recognizes or cleaves at non-canonical bases that are not part of the Watson-Crick base pairs or typical base pairs, including, but not limited to, a uracil, a mismatch, and a modified base.
- Further provided herein are methods for generation of a sticky end in a double-stranded nucleic acid having a sticky end motif, comprising cleaving a first strand of the nucleic acid at a first position adjacent to one end of the sticky end motif and cleaving a second strand of the nucleic acid at a second position adjacent to the other end of the sticky end motif. In some cases, the first and/or second position are defined by their location next to a nicking enzyme recognition sequence. For example, a strand-adjacent nicking enzyme recognizes the nicking enzyme recognition sequence and cleaves a single-strand adjacent to the recognition sequence. In some cases, the first and/or second position are defined by the presence of a non-canonical base, wherein excision and cleavage at the non-canonical base site occurs via one or more nicking enzymes collectively having excision and endonuclease activities. In some cases, two nicks on opposite strands of a nucleic acid are within a short nick-to-nick distance from each other, e.g., a distance equal to or less than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 base pairs. A nicking enzyme recognition sequence is positioned such that its cleavage site is at the desired nick-to-nick distance from the other cleavage activity that is used together to create an overhang.
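- The relationship between two nick positions on opposite strands and the resulting overhang can be illustrated with a small coordinate model. The sketch below is an assumption-laden illustration: the function name, the convention of expressing both nick positions as 0-based boundaries on the top strand, and the default limit of 15 base pairs (the largest nick-to-nick distance listed above) are choices made here, and the model ignores melting temperature and reaction conditions entirely.

```python
def overhang_from_nicks(top_strand: str, top_nick: int, bottom_nick: int,
                        max_distance: int = 15):
    """Describe the overhang left by one nick on each strand.

    top_strand  -- top strand sequence written 5' to 3'
    top_nick    -- boundary (0-based) where the top strand is nicked
    bottom_nick -- boundary, in top-strand coordinates, where the bottom
                   strand is nicked
    Returns (distance, overhang_type, top_strand_sequence_between_nicks),
    or None when the nicks coincide or are farther apart than max_distance.
    """
    length = len(top_strand)
    if not (0 < top_nick < length and 0 < bottom_nick < length):
        raise ValueError("nick positions must fall inside the sequence")
    distance = abs(top_nick - bottom_nick)
    if distance == 0 or distance > max_distance:
        return None
    start, end = sorted((top_nick, bottom_nick))
    # A top-strand nick upstream of the bottom-strand nick leaves 5' overhangs;
    # the opposite arrangement leaves 3' overhangs.
    kind = "5' overhang" if top_nick < bottom_nick else "3' overhang"
    return distance, kind, top_strand[start:end]
```

- As a sanity check against a familiar staggered cut, overhang_from_nicks("GAATTC", 1, 5) returns (4, "5' overhang", "AATT").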
- A single-strand of a sticky end motif may be modified with, or may comprise, a non-canonical base positioned directly adjacent to a target nucleic acid sequence. In some cases, a non-canonical base identifies a cleavage site. In an exemplary arrangement, an adaptor sequence comprising a sticky end motif further comprises a nicking enzyme recognition sequence adjacent to the terminal end of the sticky end motif. In this configuration, if the nicking enzyme recognition sequence defines a cleavage site adjacent to the recognition sequence and is located next to the sticky end motif, treatment with a strand-adjacent nicking enzyme introduces a nick on a single-strand between the nicking enzyme recognition sequence and sticky end motif. Examples of non-canonical bases for inclusion in a modified sticky end motif are, without limitation, 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, acetylcytosine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N-6-isopentenyl adenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 1-methyladenine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, 5-ethylcytosine, N6-adenine, N6-methyladenine, N,N-dimethyladenine, 8-bromoadenine, 7-methylguanine, 8-bromoguanine, 8-chloroguanine, 8-aminoguanine, 8-methylguanine, 8-thioguanine, 5-ethyluracil, 5-propyluracil, 5-methylaminomethyluracil, methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid, pseudouracil, 1-methylpseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-hydroxymethyluracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid, 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, 5-(2-bromovinyl)uracil, 
2-aminopurine, 6-hydroxyaminopurine, 6-thiopurine, and 2,6-diaminopurine.
- In addition, the terms “nucleoside” and “nucleotide” include those moieties that contain not only conventional ribose and deoxyribose sugars, but other sugars as well. Modified nucleosides or nucleotides also include modifications on the sugar moiety, e.g., wherein one or more of the hydroxyl groups are replaced with halogen atoms or aliphatic groups, or are functionalized as ethers, amines, or the like. Examples of modified sugar moieties which can be used to modify nucleosides or nucleotides at any position on their structures include, but are not limited to, arabinose, 2-fluoroarabinose, xylose, and hexose, or a modified component of the phosphate backbone, such as a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphorodiamidate, a methylphosphonate, an alkyl phosphotriester, or a formacetal or analog thereof.
- A nucleic acid described herein may be treated with a chemical agent, or synthesized using modified nucleotides, thereby creating a modified nucleic acid. In various embodiments, a modified nucleic acid may be cleaved, for example at the site of the modified base. For example, a nucleic acid may comprise alkylated bases, such as N3-methyladenine and N3-methylguanine, which may be recognized and cleaved by an alkyl purine DNA-glycosylase, such as DNA glycosylase I (E. coli TAG) or AlkA. Similarly, uracil residues may be introduced site specifically, for example by the use of a primer comprising uracil at a specific site. The modified nucleic acid may be cleaved at the site of the uracil residue, for example by a uracil N-glycosylase. Guanine in its oxidized form, 8-hydroxyguanine, may be cleaved by formamidopyrimidine DNA N-glycosylase. Examples of chemical cleavage processes include, without limitation, alkylation (e.g., alkylation of phosphorothioate-modified nucleic acid); acid cleavage of acid-labile P3′-N5′-phosphoroamidate-containing nucleic acid; and osmium tetroxide and piperidine treatment of nucleic acid.
- Methods described herein provide for synthesis of a precursor nucleic acid sequence, or a target fragment sequence thereof, having a length of about or at least about 50, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 10000, 20000, or 30000 bases. In some cases, a plurality of precursor nucleic acid fragments are prepared with sticky ends, and the sticky ends are annealed and ligated to generate the predetermined target nucleic acid sequence having a base length of about, or at least about, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 15000, 20000, 30000, 50000, or 100000 bases. In some cases, a precursor nucleic acid sequence is assembled with another precursor nucleic acid sequence via annealing and ligation of complementary sticky ends, followed by additional rounds of sticky end generation and assembly with other precursor fragment(s) to generate a long target nucleic acid sequence. In some cases, 2, 3, 4, 5, 6, 7, 8, 9, or 10 rounds of sticky end generation and assembly are performed to generate a long target nucleic acid of predetermined sequence. The precursor nucleic acid fragment or a plurality of precursor nucleic acid fragments may span a predetermined sequence of a target gene, or portion thereof. The precursor nucleic acid fragment or a plurality of precursor nucleic acid fragments may span a vector and a plasmid sequence, or portion thereof. For example, a precursor nucleic acid fragment comprises a sequence of a cloning vector from a plasmid. In some such cases, a cloning vector is generated using de novo synthesis and an assembly method described herein, and is subsequently assembled with a precursor nucleic acid fragment or fragments of a target gene to generate an expression plasmid harboring the target gene. 
A vector may be a nucleic acid, optionally derived from a virus, a plasmid, or a cell, which comprises features for expression in a host cell, including, for example, an origin of replication, selectable marker, reporter gene, promoter, and/or ribosomal binding site. A host cell includes, without limitation, a bacterial cell, a viral cell, a yeast cell, and a mammalian cell. Cloning vectors useful as precursor nucleic acid fragments include, without limitation, those derived from plasmids, bacteriophages, cosmids, bacterial artificial chromosomes, yeast artificial chromosomes, and human artificial chromosomes.
- Provided herein are methods for synthesis of target nucleic acid fragments having an error rate of less than 1/500, 1/1000, 1/10,000 or less compared to a predetermined sequence(s). In some cases, target fragment length is selected in light of the location of desired sticky ends, such that target fragment length varies among fragments in light of the occurrence of desired sticky ends among target fragments. In some cases, target nucleic acid fragments are synthesized to a size of at least 20 but less than 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 500, 1000, 5000, 10000, or 30000 bases. In some cases, target fragments are synthesized de novo, such as through nonenzymatic nucleic acid synthesis. In some cases, target nucleic acid fragments are synthesized from template nucleic acids, such as templates of nucleic acids that are to be assembled into a single target nucleic acid but which, in some cases, do not naturally occur adjacent to one another.
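- To see why fragment length is chosen with the error rate in mind, it can help to estimate how many molecules of a given length are expected to be error-free. The sketch below rests on assumptions that are not stated in the text: that the quoted error rate is a per-base rate and that errors at different positions are independent. The function name is hypothetical.

```python
def error_free_fraction(per_base_error_rate: float, length: int) -> float:
    """Expected fraction of molecules with no errors.

    Assumes (for illustration only) independent errors at each base, so the
    fraction is (1 - rate) raised to the number of bases.
    """
    if not 0.0 <= per_base_error_rate <= 1.0:
        raise ValueError("error rate must be between 0 and 1")
    if length < 0:
        raise ValueError("length must be non-negative")
    return (1.0 - per_base_error_rate) ** length
```

- Under these assumptions, a lower error rate or a shorter fragment both raise the expected error-free fraction, which is consistent with synthesizing fragments to a length that the synthesis method supports with high confidence.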
- Through the synthesis of target nucleic acid fragments having at least one indeterminate position, followed by the ligation at sticky ends to adjacent target nucleic acid fragments also having at least one indeterminate position, one can synthesize a target nucleic acid population that comprises a recombinant library of all possible combinations of the base identities at the varying positions. Alternately, at least one base position is partially indeterminate in some cases, such that two or three base alternatives are permitted. In some such cases, target nucleic acid fragments are selected such that only one base varies within a given target nucleic acid fragment, which in turn allows for each position to independently vary in the target nucleic acid library.
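- The size of a recombinant library built from fragments with indeterminate or partially indeterminate positions is the product of the number of allowed bases at each varying position. The following sketch illustrates this counting and enumeration under simplifying assumptions: indeterminate positions are written with standard IUPAC ambiguity codes, fragments are simply concatenated in order (sticky end motifs and adaptors are omitted), and the function names and example fragment sequences are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def library_size(fragments):
    """Number of distinct assembled sequences for an ordered list of fragments."""
    size = 1
    for fragment in fragments:
        for base in fragment.upper():
            size *= len(IUPAC[base])
    return size


def enumerate_library(fragments):
    """Yield every assembled sequence, varying each ambiguous position independently."""
    template = "".join(fragments).upper()
    for combo in product(*(IUPAC[base] for base in template)):
        yield "".join(combo)
```

- For four hypothetical fragments carrying one, one, two, and one fully indeterminate (N) positions, such as ["ACNGT", "TTNAA", "GNCNG", "CANTG"], library_size returns 4**5 = 1024. Partially indeterminate positions (two or three allowed bases) reduce the count accordingly.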
- An example workflow of nucleic acid synthesis is shown in
FIG. 6. Methods of synthesis using this workflow are, in some instances, performed to generate a plurality of target nucleic acid fragments, or oligonucleotides thereof, for assembly using sticky end methods described herein. In some cases, oligonucleotides are prepared and assembled into precursor fragments using the methods depicted in FIG. 6. The workflow is divided generally into the following processes: (1) de novo synthesis of a single stranded oligonucleic acid library, (2) joining oligonucleic acids to form larger fragments, (3) error correction, (4) quality control, and (5) shipment. Prior to de novo synthesis, an intended nucleic acid sequence or group of nucleic acid sequences is preselected. For example, a library of precursor nucleic acid fragments is preselected for generation. - In some instances, a structure comprising a surface layer 601 is provided. In the example, chemistry of the surface is functionalized in order to improve the oligonucleic acid synthesis process. Areas of low surface energy are generated to repel liquid while areas of high surface energy are generated to attract liquids. The surface itself may be in the form of a planar surface or contain variations in shape, such as protrusions or nanowells which increase surface area. In the workflow example, the high surface energy molecules selected support oligonucleic acid attachment and synthesis.
- In step 602 of the workflow example, a device, such as a material deposition device, is designed to release reagents in a stepwise fashion such that multiple oligonucleic acids extend from an actively functionalized surface region, in parallel, one residue at a time to generate oligomers with a predetermined nucleic acid sequence. In some cases, oligonucleic acids are cleaved from the surface at this stage. Cleavage includes gas cleavage, e.g., with ammonia or methylamine.
- The generated oligonucleic acid libraries are placed in a reaction chamber. In some instances, the reaction chamber (also referred to as “nanoreactor”) is a silicon coated well containing PCR reagents lowered onto the oligonucleic acid library 603. Prior to or after the sealing 604 of the oligonucleic acids, a reagent is added to release the oligonucleic acids from the surface. In the exemplary workflow, the oligonucleic acids are released subsequent to sealing of the nanoreactor 605. Once released, fragments of single-stranded oligonucleic acids hybridize in order to span an entire long range sequence of DNA. Partial hybridization 605 is possible because each synthesized oligonucleic acid is designed to have a small portion overlapping with at least one other oligonucleic acid in the pool.
- After hybridization, oligonucleic acids are assembled in a PCA reaction. During the polymerase cycles of the PCA reaction, the oligonucleic acids anneal to complementary fragments and gaps are filled in by a polymerase. Each cycle increases the length of various fragments randomly depending on which oligonucleic acids find each other. Complementarity amongst the fragments allows for forming a complete large span of double-stranded DNA 606, in some instances, a fragment of DNA to be assembled into a target nucleic acid.
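The overlap design that makes partial hybridization possible can be sketched as follows. This is an illustrative sketch only, not the design procedure of the workflow: the fixed oligo length, the fixed overlap, the alternating-strand layout, and the names `tile_oligos` and `reverse_complement` are all assumptions.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def tile_oligos(target, oligo_len, overlap):
    """Cut target into overlapping windows; every second window is emitted
    as its reverse complement so that neighbouring oligos can anneal."""
    if not 0 <= overlap < oligo_len:
        raise ValueError("overlap must be non-negative and shorter than oligo_len")
    step = oligo_len - overlap
    oligos, start, index = [], 0, 0
    while True:
        window = target[start:start + oligo_len]
        oligos.append(window if index % 2 == 0 else reverse_complement(window))
        if start + oligo_len >= len(target):
            break  # the last window reaches the end of the target
        start += step
        index += 1
    return oligos

target = "ATGGCTAGCTAGGATCCGATCGTTAGC"
oligos = tile_oligos(target, oligo_len=12, overlap=4)
```

Each oligo shares `overlap` bases with its neighbour on the opposite strand, so a polymerase can fill the remaining gaps in successive cycles.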
- After PCA is complete, the nanoreactor is separated from the surface 607 and positioned for interaction with a surface having primers for PCR 608. After sealing, the nanoreactor is subject to PCR 609 and the larger nucleic acids are amplified. After PCR 610, the nanochamber is opened 611, error correction reagents are added 612, the chamber is sealed 613 and an error correction reaction occurs to remove mismatched base pairs and/or strands with poor complementarity from the double-stranded PCR amplification products 614. The nanoreactor is opened and separated 615. Error corrected product is next subject to additional processing steps, such as PCR, nucleic acid sorting, and/or molecular bar coding, and then packaged 622 for shipment 623.
- In some cases, quality control measures are taken. After error correction, quality control steps include, for example, interaction with a wafer having sequencing primers for amplification of the error corrected product 616, sealing the wafer to a chamber containing error corrected amplification product 617, and performing an additional round of amplification 618. The nanoreactor is opened 619 and the products are pooled 620 and sequenced 621. In some cases, nucleic acid sorting is performed prior to sequencing. After an acceptable quality control determination is made, the packaged product 622 is approved for shipment 623. Alternatively, the product is a library of precursor nucleic acids to be assembled using scar-free assembly methods and compositions described herein.
- Provided herein is a library of nucleic acids each synthesized with an adaptor sequence comprising a shared primer binding sequence. In some cases, the primer binding sequence is a universal primer binding sequence shared among all primers in a reaction. In some cases, different sets of primers are used for generating different final nucleic acids. In some cases, multiple populations of primers each have their own “universal” primer binding sequence that is directed to hybridize with universal primer binding sites on multiple nucleic acids in a library. In such a configuration, different nucleic acids within a population share a universal primer binding site, but differ in other sequence elements. Thus, multiple populations of nucleic acids may be used as a template in primer extension reactions in parallel through the use of different universal primer binding sites. Universal primers may comprise a fusion site sequence that is partially or completely complementary to a sticky end motif of one of the nucleic acids. The combination of a primer binding sequence and the sticky end motif sequence is used to hybridize the primer to template nucleic acids. In some cases, primers and/or adaptor sequences further comprise a recognition sequence for a cleavage agent, such as a nicking enzyme. In some cases, primers and/or primer binding sequences in an adaptor sequence further comprise a recognition sequence for a cleavage agent, such as a nicking enzyme. In some cases, a nicking enzyme recognition sequence is introduced to extension products by a primer.
- Primer extension may be used to introduce a sequence element other than a typical DNA or RNA Watson-Crick base pair, including, without being limited to, a uracil, a mismatch, a loop, or a modified nucleoside; and thus creates a non-canonical base pair in a double-stranded target nucleic acid or fragment thereof. Primers are designed to contain such sequences in a way that still allows efficient hybridization that leads to primer extension. Such non-Watson-Crick sequence elements may be used to create a nick on one strand of the resulting double-stranded nucleic acid amplicon. In some cases, a primer extension reaction is used to produce extension products incorporating uracil into a precursor nucleic acid fragment sequence. Such primer extension reactions may be performed linearly or exponentially. In some cases, a polymerase in a primer extension reaction is a ‘Family A’ polymerase lacking 3′-5′ proofreading activity. In some cases, a polymerase in a primer extension reaction is a Family B high fidelity polymerase engineered to tolerate base pairs comprising uracil. In some cases, a polymerase in a primer extension reaction is a Kappa Uracil polymerase, a FusionU polymerase, or a Pfu turbo polymerase as commercially available.
- Nicking Enzyme Recognition Sequences and Cleavage Sites
- The generation of an overhang described herein in a double-stranded nucleic acid may comprise creating two independent single-stranded nicks at an end of the double-stranded nucleic acid. In some cases, the two independent single-stranded nicks are generated by two cleavage agents having cleavage activities independent from each other. In some cases, a nick is created by including a recognition site for a cleavage agent, for example in an adaptor region or fusion site. In some cases, a cleavage agent is a nicking endonuclease using a nicking endonuclease recognition sequence or any other agent that produces a site-specific single-stranded cut. For example, a mismatch repair agent that creates a gap at the site of a mismatched base-pair, or a base excision system that creates a gap at the site of a recognized nucleoside, such as a deoxy-uridine, is used to create a single-stranded cut. In some cases, a deoxy-uridine is a non-canonical base in a non-canonical base pair formed with a deoxy-adenine, a deoxy-guanine, or a deoxy-thymine. In some cases, for example, when using a uracil containing primer in a nucleic acid extension reaction, a nucleic acid comprises a deoxy-uridine/deoxy-adenine base pair. For example, a glycosylase, such as UDG, alone or in combination with an AP endonuclease, such as endonuclease VIII, is used to excise uracil and create a gap. In some cases, a second nick is created similarly using any suitable single-stranded site-specific cleavage agent, wherein the second nick is created at a site not directly across from the first nick in the double-stranded nucleic acid. Such pairs of staggered nicks, when in proximity to each other and under appropriate reaction conditions, cause a sticky end when parts of the original nucleic acid melt away from each other. In various embodiments, one or more of the cleavage sites are situated apart from the sequence of the fusion site.
- Two nicks in a double-stranded nucleic acid may be created such that the resulting overhang is co-extensive with the span of a sticky end site. For example, a first nick is created at the juncture between sticky end site and adaptor region at one end of a nucleic acid; and a second nick is created at the other end of the sticky end site. Thus, only one strand along a sticky end site is kept at the end of a nucleic acid along the entire sticky end sequence, while the other is cut off. A mixture of enzymatic uracil excision activity and nicking endonuclease activity may be provided in a mixture of engineered fragments. In some cases, a strand-adjacent nicking enzyme is provided, such that sticky ends that reanneal to their cleaved terminal ends and are re-ligated across a single-strand will be re-subjected to single-strand nicking due to the reconstitution of the strand-adjacent nicking site.
- Overhangs of various sizes are prepared by adjusting the distance between two nicks on opposite strands of the end of a double-stranded nucleic acid. In some cases, the distance or the length of an overhang is equal to or less than 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 bases. Overhangs may be 3′ or 5′ overhangs. In various embodiments, the cleavage site of a cleavage agent is a fixed distance away from its recognition site. In some cases, the fixed distance between a cleavage agent's cleavage site and recognition site is more than or equal to about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 bases or more. In some cases, the fixed distance between a cleavage agent's cleavage site and recognition site is 2-10 bases, 3-9 bases, or 4-8 bases. The cleavage site of a cleavage agent may be outside of its recognition site, for example, it is adjacent to its recognition site and the agent is a strand-adjacent nicking enzyme. In some cases, the recognition site of a cleavage agent is not cleaved.
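The relationship between nick spacing and overhang size can be sketched as below. The coordinate convention is an assumption made for illustration: both nick positions are given as indices into the top strand written 5′ to 3′, the short duplex to the left of the nicks is taken to melt away, and the function names are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def overhang_at_left_end(top_nick, bottom_nick):
    """Classify the overhang left after the piece upstream of both nicks melts off.

    top_nick: top strand is cut just before this top-strand index.
    bottom_nick: bottom strand is cut opposite the same kind of index.
    """
    length = abs(top_nick - bottom_nick)
    if length == 0:
        return ("blunt", 0)
    # If the top strand reaches further left, its exposed end is a 5' end.
    kind = "5' overhang" if top_nick < bottom_nick else "3' overhang"
    return (kind, length)

def overhang_sequence(top, top_nick, bottom_nick):
    """Single-stranded overhang written 5'->3' on whichever strand is exposed."""
    lo, hi = sorted((top_nick, bottom_nick))
    exposed = top[lo:hi]
    return exposed if top_nick < bottom_nick else reverse_complement(exposed)

top = "CCTCAGCATTGCAGGGTACC"
kind, length = overhang_at_left_end(7, 11)
sticky = overhang_sequence(top, 7, 11)
```

Moving either nick by one base changes the overhang length by one, which is the adjustment the paragraph above describes.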
- A double-stranded nucleic acid disclosed herein may be modified to comprise a non-canonical base. As a non-limiting example, a nucleic acid fragment having a sticky end motif such as A (Nx) T (SEQ ID NO.: 1) or G (Nx) C (SEQ ID NO.: 16) is prepared. In some cases, the fragment further comprises a recognition site for a single-strand cleavage agent, such as a nicking endonuclease, having a cleavage site immediately adjacent to the last base in the sticky end motif sequence. Alternatively, the recognition site is introduced by a primer in a nucleic acid extension reaction using a strand of the fragment comprising the sticky end motif as a template. For example, the recognition site is appended to the end of the fragment in an adaptor region. In a non-limiting example, a nucleic acid extension reaction using the strand of the fragment comprising the sticky end motif, such as A (Nx) T (SEQ ID NO.: 1) or G (Nx) C (SEQ ID NO.: 16), as a template is primed with a primer comprising a sticky end sequence comprising a non-canonical base substitution. For a sticky end motif of A (Nx) T (SEQ ID NO.: 1) in a template, one such primer comprises the sequence A (Nx)′ U (SEQ ID NO.: 18), wherein (Nx)′ is partially or completely reverse complementary to (Nx). For a sticky end motif of A (Nx) T (SEQ ID NO.: 1) in a template, one such primer comprises the sequence A (Nx) U (SEQ ID NO.: 19). In some cases, the A (Nx)′ U (SEQ ID NO.: 18) and/or A (Nx) U (SEQ ID NO.: 19) sequence on the primer is located at the very 3′ end of the primer. A plurality of such primers each having a sequence of A (Nx)′ U (SEQ ID NO.: 18) and/or A (Nx) U (SEQ ID NO.: 19) corresponding to a sequence of A (Nx) T in one strand of a fragment may be used to perform a nucleic acid extension reaction. The exemplary sequences described have a sticky end motif comprising a first A or G and a terminal T or C prior to non-canonical base incorporation. However, any sticky end motif sequence is useful with the methods described herein.
- Provided herein are fragment libraries comprising n double-stranded precursor nucleic acid fragments. In some cases, each double-stranded nucleic acid precursor fragment of the n double-stranded nucleic acid fragments comprises a first nicking endonuclease recognition site, a first fusion site, a variable insert of predetermined fragment sequence, a second fusion site, and a second nicking endonuclease recognition site, optionally in that order. In some cases, the first fusion site comprises or is a first sticky end motif and the second fusion site comprises or is a second sticky end motif. In some instances, the first fusion site has the sequence of 5′-A (Nx)i,1 U-3′ (SEQ ID NO.: 13) in the first strand, wherein Nx denotes x bases or nucleosides and the subscript “i,1” in (Nx)i,1 denotes the first strand of the ith fragment. In some cases, the second fusion site has the sequence of 5′-A (Nx)i,2 U-3′ (SEQ ID NO.: 14) in the second strand, wherein Nx denotes x bases or nucleosides and the subscript “i,2” in (Nx)i,2 denotes the second strand of the ith fragment. In some instances, (Nx)i,2 is completely or partially reverse complementary to (Nx)i+1,1 in the first strand of the i+1'th fragment. Each Nx found in the fusion site sequences is the same as or different from the Nx in any other fusion site sequence found within the fragment library. In some cases, the first nicking endonuclease recognition site is positioned such that there is a corresponding cleavage site immediately 3′ of the first fusion site in the second strand and the second nicking endonuclease recognition site is positioned such that there is a corresponding cleavage site immediately 3′ of the second fusion site in the first strand.
- A fragment library may comprise a starter DNA fragment comprising a variable insert, a second fusion site, and a second nicking endonuclease recognition site. In some cases, the second fusion site of the starter DNA fragment comprises a sequence of 5′-A (Nx)s,2 U-3′ (SEQ ID NO.: 20), wherein the subscript “s,2” in (Nx)s,2 denotes the second strand of the starter fragment and (Nx)s,2 is reverse complementary to (Nx)1,1 in one of the fusion sites of the first nucleic acid fragment in the library. Similarly, the fragment library may also comprise a finishing DNA fragment comprising a first nicking endonuclease recognition site, a first fusion site, and a variable insert. In some cases, the first fusion site comprises a sequence of 5′-A (Nx)f,1 U-3′ (SEQ ID NO.: 21), wherein the subscript “f,1” in (Nx)f,1 denotes the first strand of the finishing fragment and (Nx)f,1 is reverse complementary to (Nx)n,2 in one of the fusion sites of the nth nucleic acid fragment in the library. In some cases, the first and/or the second nicking endonuclease recognition sites are the same in all the fragments in the fragment library. In various embodiments, the fragment library comprises about or at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 75, 100, 125, 150, 200, 250, 500, or more nucleic acid fragments. In some instances, the fragment library comprises 2-75 fragments, 4-125 fragments, or 5-10 fragments.
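The pairing rule between adjacent fragments, in which (Nx)i,2 is reverse complementary to (Nx)i+1,1, can be checked with a short sketch. The `Fragment` record, its field names, and the example sequences are hypothetical, and only the completely complementary case is tested; partial complementarity is outside this sketch.

```python
from dataclasses import dataclass
from typing import Optional

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

@dataclass
class Fragment:
    name: str
    nx_first: Optional[str]   # (Nx)i,1, read 5'->3' on the first strand
    nx_second: Optional[str]  # (Nx)i,2, read 5'->3' on the second strand

def junction_ok(upstream, downstream):
    """True when the upstream second fusion site pairs with the downstream first."""
    if upstream.nx_second is None or downstream.nx_first is None:
        return False
    return upstream.nx_second == reverse_complement(downstream.nx_first)

def check_chain(fragments):
    """Report each adjacent junction in an ordered fragment list."""
    return [(a.name, b.name, junction_ok(a, b))
            for a, b in zip(fragments, fragments[1:])]

chain = [
    Fragment("starter", None, "GGTC"),    # starter has no first fusion site
    Fragment("fragment_1", "GACC", "TTCA"),
    Fragment("finisher", "TGAA", None),   # finisher has no second fusion site
]
report = check_chain(chain)
```

A junction reported as False indicates that the two fragments would not present complementary sticky ends in the intended order.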
- Further described herein is a primer library of n primers. Each primer within the library may comprise a recognition sequence, such as a nicking endonuclease recognition sequence, and a fusion sequence comprising a sticky end motif, for example, a sticky end motif having the sequence 5′-A (Nx)i U-3′ (SEQ ID NO.: 15). In some cases, the recognition sequence is positioned 5′ of the fusion site sequence. In some cases, the recognition sequence is positioned such that the recognition site in a primer is capable of generating a corresponding cleavage site in a reverse complementary DNA strand 3′ of a first fusion site in the reverse complementary DNA strand, if the primer were hybridized to the reverse complementary DNA strand such that the fusion sequence hybridizes to the first fusion site in the reverse complementary DNA strand. In various aspects, a primer library described herein comprises about or at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 60, 75, 100, 125, 150, 200, 250, 500, or more primers.
- Provided herein are methods where two or more independent cleaving agents are selected to generate single-stranded cleavage on opposite strands of a double-stranded nucleic acid. As used herein, “nick” generally refers to enzymatic cleavage of only one strand of a double-stranded nucleic acid at a particular region, while leaving the other strand intact, regardless of whether one or more bases are removed. In some cases, one or more bases are removed while in other cases no bases are removed and only phosphodiester bonds are broken. In some instances, such cleavage events leave behind intact double-stranded regions lacking nicks that are a short distance apart from each other on the double-stranded nucleic acid, for example a distance of about or at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 bases or more. In some cases, the distance between the intact double-stranded regions is equal to or less than 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 bases. In some instances, the distance between the intact double-stranded regions is 2 to 10 bases, 3 to 9 bases, or 4 to 8 bases.
- Cleavage agents used in methods described herein may be selected from nicking endonucleases, DNA glycosylases, or any single-stranded cleavage agents described in further detail elsewhere herein. Enzymes for cleavage of single-stranded DNA may be used for cleaving heteroduplexes in the vicinity of mismatched bases, D-loops, or heteroduplexes formed between two strands of DNA which differ by a single base, an insertion or a deletion. Mismatch recognition proteins that cleave one strand of the mismatched DNA in the vicinity of the mismatch site may be used as cleavage agents. Nonenzymatic cleaving may also be done through photodegradation of a linker introduced through a custom oligonucleotide used in a PCR reaction.
- Provided herein are fragments designed and synthesized such that the inherent cleavage sites are utilized in the preparation of fragments for assembly. For instance, these inherent cleavage sites are supplemented with a cleavage site that is introduced, e.g., by recognition sites in adaptor sequences, by a mismatch, by a uracil, and/or by an un-natural nucleoside. In various embodiments, described herein is a plurality of double stranded nucleic acids such as dsDNA, comprising an atypical DNA base pair comprising a non-canonical base in a fusion site and a recognition site for a single-strand cleaving agent. Compositions according to embodiments described herein, in many cases, comprise two or more cleaving agents. In some cases, a first cleaving agent has the atypical DNA base pair as its recognition site and the cleaving agent cleaves a single-strand at or a fixed distance away from the atypical DNA base pair. In some cases, a second cleaving agent has an independent single-strand cleaving and/or recognition activity from the first cleaving agent. In some cases, the nucleic acid molecules in the composition are such that the recognition site for the second single-strand cleaving agent is not naturally adjacent to the fusion site or the remainder of the nucleic acid in any of the plurality of double stranded nucleic acids in the composition. In some instances, the cleavage sites of two cleavage agents are located on opposite strands.
- Type II Enzymes
- Provided herein are methods and compositions that use a Type II restriction endonuclease as a cleavage agent. Type II enzymes cleave within or at short specific distances from a recognition site. There are a variety of different Type II enzymes known in the art, many of which differ in the sequence they recognize. Type II restriction endonucleases comprise many sub-types with varying activities. Exemplary Type II restriction endonucleases include, without limitation, Type IIP, Type IIF, Type IIB (e.g. BcgI and BplI), Type IIE (e.g. NaeI), and Type IIM (e.g. DpnI) restriction endonucleases. The most common Type II enzymes are those like HhaI, HindIII, and NotI that cleave DNA within their recognition sequences. Many recognize DNA sequences that are symmetric, because, without being bound by theory, they bind to DNA as homodimers, but a few (e.g., BbvCI: CCTCAGC (SEQ ID NO.: 22)) recognize asymmetric DNA sequences, because, without being bound by theory, they bind as heterodimers. Some enzymes recognize continuous sequences (e.g., EcoRI: GAATTC (SEQ ID NO.: 23)) in which the two half-sites of the recognition sequence are adjacent, while others recognize discontinuous sequences (e.g., BglI: GCCNNNNNGGC (SEQ ID NO.: 24)) in which the half-sites are separated. Using this type, a 3′-hydroxyl on one side of each cut and a 5′-phosphate on the other may be created upon cleavage.
- The next most common Type II enzymes, usually referred to as “Type IIS”, are those like FokI and AlwI that cleave outside of their recognition sequence to one side. Type IIS enzymes recognize sequences that are continuous and asymmetric. Type IIS restriction endonucleases (e.g. FokI) cleave DNA at a defined distance from their non-palindromic asymmetric recognition sites. These enzymes may function as dimers. Type IIS enzymes typically comprise two distinct domains, one for DNA binding, and the other for DNA cleavage. Type IIA restriction endonucleases recognize asymmetric sequences but can cleave symmetrically within the recognition sequences (e.g. BbvCI cleaves 2 bases downstream of the 5′-end of each strand of CCTCAGC (SEQ ID NO.: 25)). Similar to Type IIS restriction endonucleases, Type IIT restriction enzymes (e.g., Bpu10I and BslI) are composed of two different subunits. Type IIG restriction enzymes, the third major kind of Type II enzyme, are large, combination restriction-and-modification enzymes. Type IIG restriction endonucleases (e.g. Eco57I) have a single subunit, like classical Type II restriction enzymes. The two enzymatic activities typically reside in the same protein chain. These enzymes cleave outside of their recognition sequences and can be classified as those that recognize continuous sequences (e.g., AcuI: CTGAAG (SEQ ID NO.: 26)) and cleave on just one side; and those that recognize discontinuous sequences (e.g., BcgI: CGANNNNNNTGC (SEQ ID NO.: 27)) and cleave on both sides, releasing a small fragment containing the recognition sequence. When these enzymes bind to their substrates, they may switch into either restriction mode to cleave the DNA, or modification mode to methylate it.
- Type III enzymes are also large combination restriction-and-modification enzymes. They cleave outside of their recognition sequences and require two such sequences in opposite orientations within the same DNA molecule to accomplish cleavage. Type IV enzymes recognize modified DNA, e.g. methylated, hydroxymethylated and glucosyl-hydroxymethylated DNA and are exemplified by the McrBC and Mrr systems of E. coli.
- Some naturally occurring and recombinant endonucleases make single-strand breaks. These nicking endonucleases (NEases) typically recognize non-palindromes. They can be bona fide nicking enzymes, such as the frequent cutters Nt.CviPII and Nt.CviQII, or rare-cutting homing endonucleases (HEases) such as I-BasI and I-HmuI, both of which recognize a degenerate 24-bp sequence. As well, isolated large subunits of heterodimeric Type IIS REases such as BtsI, BsrDI and BstNBI/BspD6I display nicking activity.
- Thus, properties of restriction endonucleases that make double-strand cuts may be retained by engineering variants of these enzymes such that they make single-strand breaks. In various embodiments, recognition sequence-specific nicking endonucleases are used as cleavage agents that cleave only a single-strand of double-stranded DNA at a cleavage site. Nicking endonucleases useful in various embodiments of methods and compositions described herein include Nb.BbvCI, Nb.BsmI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII, used either alone or in various combinations. In various embodiments, nicking endonucleases that cleave outside of their recognition sequence, e.g. Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BsmAI, Nt.BspQI, Nt.BstNBI, and Nt.CviPII, are used. In some instances, nicking endonucleases that cut within their recognition sequences, e.g. Nb.BbvCI, Nb.BsmI, or Nt.BbvCI are used. Recognition sites for the various specific cleavage agents used herein, such as the nicking endonucleases, comprise a specific nucleic acid sequence. The nickase Nb.BbvCI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site (with “|” specifying the nicking (cleavage) site and “N” representing any nucleoside, e.g. one of C, A, G or T):
-
(SEQ ID NO.: 28) 5′ CCTCA GC 3′
(SEQ ID NO.: 29) 3′ GGAGT|CG 5′
- The nickase Nb.BsmI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
-
(SEQ ID NO.: 30) 5′ GAATGCN 3′
(SEQ ID NO.: 31) 3′ CTTAC|GN 5′
- The nickase Nb.BsrDI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
-
(SEQ ID NO.: 32) 5′ GCAATGNN 3′
(SEQ ID NO.: 33) 3′ CGTTAC|NN 5′
- The nickase Nb.BtsI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
-
(SEQ ID NO.: 34) 5′ GCAGTGNN 3′
(SEQ ID NO.: 35) 3′ CGTCAC|NN 5′
- The nickase Nt.AlwI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
-
(SEQ ID NO.: 36) 5′ GGATCNNNN| N 3′
(SEQ ID NO.: 37) 3′ CCTAGNNNNN 5′
- The nickase Nt.BbvCI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
-
(SEQ ID NO.: 38) 5′ CC| TCAGC 3′
(SEQ ID NO.: 39) 3′ GGAGTCG 5′
- The nickase Nt.BsmAI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
-
(SEQ ID NO.: 40) 5′ GTCTCN| N 3′
(SEQ ID NO.: 41) 3′ CAGAGNN 5′
- The nickase Nt.BspQI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
-
(SEQ ID NO.: 42) 5′ GCTCTTCN| 3′
(SEQ ID NO.: 43) 3′ CGAGAAGN 5′
- The nickase Nt.BstNBI (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site:
-
(SEQ ID NO.: 44) 5′ GAGTCNNNN| N 3′
(SEQ ID NO.: 45) 3′ CTCAGNNNNN 5′
- The nickase Nt.CviPII (New England Biolabs, Ipswich, Mass.) nicks at the following cleavage site with respect to its recognition site (wherein D denotes A or G or T and wherein H denotes A or C or T):
-
(SEQ ID NO.: 46) 5′ | CCD 3′
(SEQ ID NO.: 47) 3′ GGH 5′
- Non-Canonical Base Recognizing Enzymes
- A non-canonical base and/or a non-canonical base pair in a sticky end motif and/or adaptor sequence may be recognized by an enzyme for cleavage at its 5′ or 3′ end. In some instances, the non-canonical base and/or non-canonical base pair comprises a uracil base. In some cases, the enzyme is a DNA repair enzyme. In some cases, the base and/or non-canonical base pair is recognized by an enzyme that catalyzes a first step in base excision, for example, a DNA glycosylase. A DNA glycosylase is useful for removing a base from a nucleic acid while leaving the backbone of the nucleic acid intact, generating an apurinic or apyrimidinic site, or AP site. This removal is accomplished by flipping the base out of a double-stranded nucleic acid followed by cleavage of the N-glycosidic bond.
- The non-canonical base or non-canonical base pair may be recognized by a bifunctional glycosylase. In this case, the glycosylase removes a non-canonical base from a nucleic acid by N-glycosylase activity. The resulting apurinic/apyrimidinic (AP) site is then incised by the AP lyase activity of bifunctional glycosylase via β-elimination of the 3′ phosphodiester bond.
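The two activities described above, base removal followed by incision at the resulting AP site, can be mirrored in a toy string model. This is a simplification under stated assumptions: a strand is a 5′ to 3′ string, "U" marks the non-canonical base, "_" is a hypothetical marker for the AP site, and the sugar residue that β-elimination leaves on the upstream piece is ignored. The function names are illustrative.

```python
AP_SITE = "_"  # hypothetical marker for an abasic (AP) site

def remove_uracil(strand):
    """N-glycosylase step: remove each uracil base, leaving the backbone intact."""
    return strand.replace("U", AP_SITE)

def incise_ap_sites(strand):
    """AP lyase step: break the backbone at each AP site and drop the site."""
    return [piece for piece in strand.split(AP_SITE) if piece]

primer_strand = "GGATCAGCTUACGT"          # illustrative strand with a single dU
abasic = remove_uracil(primer_strand)     # backbone still continuous
pieces = incise_ap_sites(abasic)          # strand now nicked at the former dU
```

Keeping the two steps separate reflects that a monofunctional glycosylase would stop after the first function, while a bifunctional enzyme performs both.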
- The glycosylase and/or DNA repair enzyme may recognize a uracil or a non-canonical base pair comprising uracil, for example U:G and/or U:A. Nucleic acid base substrates recognized by a glycosylase include, without limitation, uracil, 3-meA (3-methyladenine), hypoxanthine, 8-oxoG, FapyG, FapyA, Tg (thymine glycol), hoU (hydroxyuracil), hmU (hydroxymethyluracil), fU (formyluracil), hoC (hydroxycytosine), fC (formylcytosine), oxidized base, alkylated base, deaminated base, methylated base, and any non-canonical nucleobase provided herein or known in the art. In some cases, the glycosylase and/or DNA repair enzyme recognizes oxidized bases such as 2,6-diamino-4-hydroxy-5-formamidopyrimidine (FapyG) and 8-oxoguanine (8-oxoG). Glycosylases and/or DNA repair enzymes which recognize oxidized bases include, without limitation, OGG1 (8-oxoG DNA glycosylase 1) or E. coli Fpg (recognizes 8-oxoG:C pair), MYH (MutY homolog DNA glycosylase) or E. coli MutY (recognizes 8-oxoG:A), NEIL1, NEIL2 and NEIL3. In some cases, the glycosylase and/or DNA repair enzyme recognizes methylated bases such as 3-methyladenine. Examples of glycosylases that recognize methylated bases include E. coli AlkA (3-methyladenine DNA glycosylase II), Mag1 and MPG (methylpurine glycosylase). Additional non-limiting examples of glycosylases include SMUG1 (single-strand specific monofunctional uracil DNA glycosylase 1), TDG (thymine DNA glycosylase), MBD4 (methyl-binding domain glycosylase 4), and NTHL1 (endonuclease III-like 1). Exemplary DNA glycosylases include, without limitation, uracil DNA glycosylases (UDGs), helix-hairpin-helix (HhH) glycosylases, 3-methyl-purine glycosylase (MPG) and endonuclease VIII-like (NEIL) glycosylases. Helix-hairpin-helix (HhH) glycosylases include, without limitation, Nth (homologs of the E. coli EndoIII protein), OggI (8-oxoG DNA glycosylase I), MutY/Mig (A/G-mismatch-specific adenine glycosylase), AlkA (alkyladenine-DNA glycosylase), MpgII (N-methylpurine-DNA glycosylase II), and OggII (8-oxoG DNA glycosylase II). Exemplary 3-methyl-purine glycosylase (MPG) substrates include, in non-limiting examples, alkylated bases including 3-meA, 7-meG, 3-meG and ethylated bases. Endonuclease VIII-like glycosylase substrates include, without limitation, oxidized pyrimidines (e.g., Tg, 5-hC, FaPyA, FaPyG), 5-hU and 8-oxoG.
- Exemplary uracil DNA glycosylases (UDGs) include, without limitation, thermophilic uracil DNA glycosylases, uracil-N glycosylases (UNGs), mismatch-specific uracil DNA glycosylases (MUGs) and single-strand specific monofunctional uracil DNA glycosylases (SMUGs). In non-limiting examples, UNGs include UNG1 isoforms and UNG2 isoforms. In non-limiting examples, MUGs include thymine DNA glycosylase (TDG). A UDG may be active against uracil in ssDNA and dsDNA.
- The non-canonical base pair included in a fragment disclosed herein is a mismatch base pair, for example a homopurine pair or a heteropurine pair. In some cases, a primer described herein comprises one or more bases which form a mismatch base pair with a base of a target nucleic acid or with a base of an adaptor sequence connected to a target nucleic acid. In some cases, an endonuclease, exonuclease, glycosylase, DNA repair enzyme, or any combination thereof recognizes the mismatch pair for subsequent removal and cleavage. For example, the TDG enzyme is capable of excising thymine from G:T mismatches. In some cases, the non-canonical base is released from a dsDNA molecule by a DNA glycosylase resulting in an abasic site. This abasic site (AP site) is further processed by an endonuclease which cleaves the phosphate backbone at the abasic site. Endonucleases included in methods herein may be AP endonucleases. For example, the endonuclease is a class I or class II AP endonuclease which incises DNA at the
phosphate groups 3′ and 5′ to the baseless site leaving 3′ OH and 5′ phosphate termini. The endonuclease may also be a class III or class IV AP endonuclease which cleaves DNA at thephosphate groups 3′ and 5′ to the baseless site to generate 3′ phosphate and 5′ OH. In some cases, an endonuclease cleaving a fragment disclosed herein is an AP endonuclease which is grouped in a family based on sequence similarity and structure, for example,AP endonuclease family 1 orAP endonuclease family 2. Examples ofAP endonuclease family 1 members include, without limitation, E. coli exonuclease III, S. pneumoniae and B. subtilis exonuclease A, mammalian AP endonuclease 1 (API), Drosophilarecombination repair protein 1, Arabidopsis thaliana apurinic endonuclease-redox protein, Dictyostelium DNA-(apurinic or apyrimidinic site) lyase, enzymes comprising one or more domains thereof, and enzymes having at least 75% sequence identity to one or more domains or regions thereof. Examples ofAP endonuclease family 2 members include, without limitation, bacterial endonuclease IV, fungal and Caenorhabditis elegans apurinic endonuclease APN1,Dictyostelium endonuclease 4 homolog, Archaealprobable endonuclease 4 homologs, mimivirusputative endonuclease 4, enzymes comprising one or more domains thereof, and enzymes having at least 75% sequence identity to one or more domains or regions thereof. Exemplary, endonucleases include endonucleases derived from both Prokaryotes (e.g., endonuclease IV, RecBCD endonuclease, T7 endonuclease, endonuclease II) and Eukaryotes (e.g., Neurospora endonuclease, S1 endonuclease, P1 endonuclease, Mung bean nuclease I, Ustilago nuclease). In some case, an endonuclease functions as both a glycosylase and an AP-lyase. The endonuclease may be endonuclease VIII. In some cases, the endonuclease is S1 endonuclease. In some instances, the endonuclease is endonuclease III. The endonuclease may be a endonuclease IV. 
In some cases, an endonuclease is a protein comprising an endonuclease domain having endonuclease activity, i.e., a domain that cleaves a phosphodiester bond.
- Provided herein are methods where a non-canonical base is removed with a DNA excision repair enzyme and endonuclease or lyase, wherein the endonuclease or lyase activity is optionally from an excision repair enzyme or a region of the excision repair enzyme. Excision repair enzymes include, without limitation, Methyl Purine DNA Glycosylase (recognizes methylated bases), 8-Oxo-Guanine Glycosylase 1 (recognizes 8-oxoG:C pairs and has lyase activity), Endonuclease Three Homolog 1 (recognizes T-glycol, C-glycol, and formamidopyrimidine and has lyase activity), inosine, hypoxanthine-DNA glycosylase; 5-Methylcytosine, 5-Methylcytosine DNA glycosylase; Formamidopyrimidine-DNA-glycosylase (excision of oxidized residue from DNA: hydrolysis of the N-glycosidic bond (DNA glycosylase), and beta-elimination (AP-lyase reaction)). In some cases, the DNA excision repair enzyme is uracil DNA glycosylase. 
DNA excision repair enzymes also include, without limitation, Aag (catalyzes excision of 3-methyladenine, 3-methylguanine, 7-methylguanine, hypoxanthine, 1,N6-ethenoadenine), endonuclease III (catalyzes excision of cis- and trans-thymine glycol, 5,6-dihydrothymine, 5,6-dihydroxydihydrothymine, 5-hydroxy-5-methylhydantoin, 6-hydroxy-5,6-dihydropyrimidines, 5-hydroxycytosine and 5-hydroxyuracil, 5-hydroxy-6-hydrothymine, 5,6-dihydrouracil, 5-hydroxy-6-hydrouracil, AP sites, uracil glycol, methyltartronylurea, alloxan), endonuclease V (cleaves AP sites on dsDNA and ssDNA), Fpg (catalyzes excision of 8-oxoguanine, 5-hydroxycytosine, 5-hydroxyuracil, aflatoxin-bound imidazole ring-opened guanine, imidazole ring-opened N-2-aminofluorene-C8-guanine, open ring forms of 7-methylguanine), and Mug (catalyzes the removal of uracil in U:G mismatches in double-stranded oligonucleic acids, and excision of 3,N4-ethenocytosine (eC) in eC:G mismatches in double- or single-stranded oligonucleic acids). Non-limiting DNA excision repair enzymes are listed in Curr Protoc Mol Biol. 2008 October; Chapter 3: Unit 3.9. DNA excision repair enzymes, such as endonucleases, may be selected to excise a specific non-canonical base. As an example, T. maritima endonuclease V is a 3′-endonuclease which initiates the removal of deaminated bases such as uracil, hypoxanthine, and xanthine. In some cases, a DNA excision repair enzyme having endonuclease activity functions to remove a modified or non-canonical base from a strand of a dsDNA molecule without the use of an enzyme having glycosylase activity.
- In some cases, a DNA excision repair enzyme (“DNA repair enzyme”) comprises glycosylase activity, lyase activity, endonuclease activity, or any combination thereof. In some cases, one or more DNA excision repair enzymes are used in the methods described herein, for example one or more glycosylases or a combination of one or more glycosylases and one or more endonucleases. As an example, Fpg (formamidopyrimidine [fapy]-DNA glycosylase), also known as 8-oxoguanine DNA glycosylase, acts both as an N-glycosylase and an AP-lyase. The N-glycosylase activity releases a non-canonical base (e.g., 8-oxoguanine, 8-oxoadenine, fapy-guanine, methyl-fapy-guanine, fapy-adenine, aflatoxin B1-fapy-guanine, 5-hydroxy-cytosine, 5-hydroxy-uracil) from dsDNA, generating an abasic site. The lyase activity then cleaves both 3′ and 5′ to the abasic site, thereby removing the abasic site and leaving a 1 base gap or nick. Additional enzymes which comprise more than one enzymatic activity include, without limitation, endonuclease III (Nth) protein from E. coli (N-glycosylase and AP-lyase) and Tma endonuclease III (N-glycosylase and AP-lyase). For a list of DNA repair enzymes having lyase activity, see the New England BioLabs® Inc. catalog.
- Provided herein are methods where mismatch endonucleases are used to nick DNA in the region of mismatches or damaged DNA, including but not limited to T7 Endonuclease I, E. coli Endonuclease V, T4 Endonuclease VII, mung bean nuclease, Cel-1 endonuclease, E. coli Endonuclease IV and UVDE. Cel-1 endonuclease from celery and similar enzymes, typically plant enzymes, exhibit properties that detect a variety of errors in double-stranded nucleic acids. For example, such enzymes can detect polynucleotide loops and insertions, detect mismatches in base pairing, recognize sequence differences in polynucleotide strands between about 100 bp and 3 kb in length and recognize such mutations in a target polynucleotide sequence without substantial adverse effects of flanking DNA sequences.
- Provided herein are methods where one or more non-canonical bases are excised from a dsDNA molecule which is subsequently treated with an enzyme comprising exonuclease activity. In some cases, the exonuclease comprises 3′ DNA polymerase activity. Exonucleases include those enzymes in the following groups: exonuclease I, exonuclease II, exonuclease III, exonuclease IV, exonuclease V, exonuclease VI, exonuclease VII, and exonuclease VIII. In some cases, an exonuclease has AP endonuclease activity. In some cases, the exonuclease is any enzyme comprising one or more domains or amino acid regions suitable for cleaving nucleotides from either 5′ or 3′ end or both ends, of a nucleic acid chain. Exonucleases include wild-type exonucleases and derivatives, chimeras, and/or mutants thereof. Mutant exonucleases include enzymes comprising one or more mutations, insertions, deletions or any combination thereof within the amino acid or nucleic acid sequence of an exonuclease.
- Provided herein are methods where a polymerase is provided to a reaction comprising an enzyme-treated dsDNA molecule, wherein one or more non-canonical bases of the dsDNA molecule have been excised, for example, by treatment with one or more DNA repair enzymes. In some cases, the DNA product has been treated with a glycosylase and endonuclease to remove a non-canonical base. In some instances, the DNA product has been treated with UDG and endonuclease VIII to remove at least one uracil. In some cases, one or more nucleotides (e.g., dNTPs) are provided to a reaction comprising the treated dsDNA molecule and the polymerase.
- DNA Repair Enzymes
- Provided herein are methods where site-specific base excision reagents comprising one or more enzymes are used as cleavage agents that cleave only a single strand of double-stranded DNA at a cleavage site. A number of repair enzymes are suitable alone or in combination with other agents to generate such nicks. An exemplary list of repair enzymes is provided in Table 1. Homologs or non-natural variants of the repair enzymes, including those in Table 1, are also used according to various embodiments. Any of the repair enzymes for use according to the methods and compositions described herein may be naturally occurring, recombinant or synthetic. In some instances, a DNA repair enzyme is a native or an in vitro-created chimeric protein with one or more activities. Cleavage agents, in various embodiments, comprise enzymatic activities, including enzyme mixtures, which include one or more of nicking endonucleases, AP endonucleases, glycosylases and lyases involved in base excision repair.
- Without being bound by theory, a damaged base is removed by a DNA enzyme with glycosylase activity, which hydrolyses an N-glycosidic bond between the deoxyribose sugar moiety and the base. For example, an E. coli glycosylase and a UDG endonuclease act upon deaminated cytosine while two 3-mAde glycosylases from E. coli (TagI and TagII) act upon alkylated bases. The product of removal of a damaged base by a glycosylase is an AP site (apurinic/apyrimidinic site), also known as an abasic site, which is a location in a nucleic acid that has neither a purine nor a pyrimidine base. DNA repair systems are often used to correctly replace the AP site. This is achieved in various instances by an AP endonuclease that nicks the sugar phosphate backbone adjacent to the AP site, after which the abasic sugar is removed. Some naturally occurring or synthetic repair systems include activities, such as the DNA polymerase/DNA ligase activity, to insert a new nucleotide.
- Repair enzymes are found in prokaryotic and eukaryotic cells. Some enzymes having applicability herein have glycosylase and AP endonuclease activity in one molecule. AP endonucleases are classified according to their sites of incision. Class I AP endonucleases and class II AP endonucleases incise DNA at the
phosphate groups 3′ and 5′ to the baseless site leaving 3′-OH and 5′-phosphate termini. Class III and class IV AP endonucleases also cleave DNA at the phosphate groups 3′ and 5′ to the baseless site, but they generate a 3′-phosphate and a 5′-OH.
- In some cases, AP endonucleases remove moieties attached to the 3′ OH that inhibit polynucleotide polymerization. For example, a 3′ phosphate is converted to a 3′ OH by E. coli endonuclease IV. In some cases, AP endonucleases work in conjunction with glycosylases to engineer nucleic acids at a site of mismatch, a non-canonical nucleoside or a base that is not one of the major nucleosides for a nucleic acid, such as a uracil in a DNA strand.
- Examples of glycosylase substrates include, without limitation, uracil, hypoxanthine, 3-methyladenine (3-mAde), formamidopyrimidine (FAPY), 7,8-dihydro-8-oxoguanine and hydroxymethyluracil. In some instances, glycosylase substrates are incorporated into DNA site-specifically by nucleic acid extension from a primer comprising the substrate. In some instances, glycosylase substrates are introduced by chemical modification of a nucleoside, for example by deamination of cytosine, e.g. by bisulfite, nitrous acids, or spontaneous deamination, producing uracil, or by deamination of adenine by nitrous acids or spontaneous deamination, producing hypoxanthine. Other examples of chemical modification of nucleic acids include generating 3-mAde as a product of alkylating agents, FAPY (7-mGua) as a product of methylating agents of DNA, 7,8-dihydro-8-oxoguanine as a mutagenic oxidation product of guanine, 4,6-diamino-5-FAPY produced by gamma radiation, and hydroxymethyluracil produced by ionizing radiation or oxidative damage to thymidine. Some enzymes comprise AP endonuclease and glycosylase activities that are coordinated either in a concerted manner or sequentially.
- Examples of polynucleotide cleavage enzymes used to generate single-stranded nicks include the following types of enzymes derived from but not limited to any particular organism or virus or non-naturally occurring variants thereof: E. coli endonuclease IV, Tth endonuclease IV, human AP endonuclease, glycosylases, such as UDG, E. coli 3-methyladenine DNA glycosylase (AlkA) and human Aag, glycosylase/lyases, such as E. coli endonuclease III, E. coli endonuclease VIII, E. coli Fpg, human OGG1, and T4 PDG, and lyases. Exemplary additional DNA repair enzymes are listed in Table 1.
-
TABLE 1. DNA repair enzymes.

| Gene | Activity | Accession Number |
|---|---|---|
| UNG | Uracil-DNA glycosylase | NM_080911 |
| SMUG1 | Uracil-DNA glycosylase | NM_014311 |
| MBD4 | Removes U or T opposite G at CpG sequences | NM_003925 |
| TDG | Removes U, T or ethenoC opposite G | NM_003211 |
| OGG1 | Removes 8-oxoG opposite C | NM_016821 |
| MUTYH (MYH) | Removes A opposite 8-oxoG | NM_012222 |
| NTHL1 (NTH1) | Removes ring-saturated or fragmented pyrimidines | NM_002528 |
| MPG | Removes 3-meA, ethenoA, hypoxanthine | NM_002434 |
| NEIL1 | Removes thymine glycol | NM_024608 |
| NEIL2 | Removes oxidative products of pyrimidines | NM_145043 |
| XPC | Binds damaged DNA as complex with RAD23B, CETN2 | NM_004628 |
| RAD23B (HR23B) | Binds damaged DNA as complex with XPC, CETN2 | NM_002874 |
| CETN2 | Binds damaged DNA as complex with XPC, RAD23B | NM_004344 |
| RAD23A (HR23A) | Substitutes for HR23B | NM_005053 |
| XPA | Binds damaged DNA in preincision complex | NM_000380 |
| RPA1 | Binds DNA in preincision complex | NM_002945 |
| RPA2 | Binds DNA in preincision complex | NM_002946 |
| RPA3 | Binds DNA in preincision complex | NM_002947 |
| ERCC5 (XPG) | 3′ incision | NM_000123 |
| ERCC1 | 5′ incision subunit | NM_001983 |
| ERCC4 (XPF) | 5′ incision subunit | NM_005236 |
| LIG1 | DNA joining | NM_000234 |
| CKN1 (CSA) | Cockayne syndrome; needed for transcription-coupled NER | NM_000082 |
| ERCC6 (CSB) | Cockayne syndrome; needed for transcription-coupled NER | NM_000124 |
| XAB2 (HCNP) | Cockayne syndrome; needed for transcription-coupled NER | NM_020196 |
| DDB1 | Complex defective in XP group E | NM_001923 |
| DDB2 | DDB1, DDB2 | NM_000107 |
| MMS19L (MMS19) | Transcription and NER | NM_022362 |
| FEN1 (DNase IV) | Flap endonuclease | NM_004111 |
| SPO11 | Endonuclease | NM_012444 |
| FLJ35220 (ENDOV) | Incision 3′ of hypoxanthine and uracil | NM_173627 |
| FANCA | Involved in tolerance or repair of DNA crosslinks | NM_000135 |
| FANCB | Involved in tolerance or repair of DNA crosslinks | NM_152633 |
| FANCC | Involved in tolerance or repair of DNA crosslinks | NM_000136 |
| FANCD2 | Involved in tolerance or repair of DNA crosslinks | NM_033084 |
| FANCE | Involved in tolerance or repair of DNA crosslinks | NM_021922 |
| FANCF | Involved in tolerance or repair of DNA crosslinks | NM_022725 |
| FANCG (XRCC9) | Involved in tolerance or repair of DNA crosslinks | NM_004629 |
| FANCL | Involved in tolerance or repair of DNA crosslinks | NM_018062 |
| DCLRE1A (SNM1) | DNA crosslink repair | NM_014881 |
| DCLRE1B (SNM1B) | Related to SNM1 | NM_022836 |
| NEIL3 | Resembles NEIL1 and NEIL2 | NM_018248 |
| ATRIP (TREX1) | ATR-interacting protein 5′ alternative ORF of the TREX1/ATRIP gene | NM_130384 |
| NTH | Removes damaged pyrimidines | NP_416150.1 |
| NEI | Removes damaged pyrimidines | NP_415242.1 |
| NFI | Deoxyinosine 3′ endonuclease | NP_418426.1 |
| MUTM | Formamidopyrimidine DNA glycosylase | NP_418092.1 |
| UNG | Uracil-DNA glycosylase | NP_417075.1 |
| UVRA | DNA excision repair enzyme complex | NP_418482.1 |
| UVRB | DNA excision repair enzyme complex | NP_415300.1 |
| UVRC | DNA excision repair enzyme complex | NP_416423.3 |
| DENV | Pyrimidine dimer glycosylase | NP_049733.1 |

- Provided herein are methods where one or more enzymatic activities, such as those of repair enzymes, are used in combination to generate a site-specific single-strand nick. For example, USER (Uracil-Specific Excision Reagent; New England BioLabs) generates a single nucleoside gap at the location of a uracil. USER is a mixture of Uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase Endonuclease VIII. UDG catalyzes the excision of a uracil base, forming an abasic (apyrimidinic) site while leaving the phosphodiester backbone intact. The lyase activity of Endonuclease VIII is used to break the phosphodiester backbone at the 3′ and 5′ sides of the abasic site so that the base-free deoxyribose is released, creating a one nucleotide gap at the site of the uracil nucleotide.
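As an illustration of the USER reaction just described, the following minimal sketch models a single strand as a 5′ to 3′ string and treats every uracil as a site where UDG removes the base and Endonuclease VIII breaks the backbone on both sides. The function name `user_treat` and the string model are assumptions made for illustration only; they are not part of the disclosure and ignore reaction kinetics and the complementary strand.

```python
def user_treat(strand: str) -> list[str]:
    """Return the pieces of a 5'->3' strand left after USER treatment.

    Hypothetical model: each 'U' is excised (UDG) and the backbone is broken
    on both sides of the abasic site (Endonuclease VIII lyase activity),
    leaving a one-nucleotide gap where the uracil used to be.
    """
    # Splitting on 'U' drops the uracil itself; empty pieces arise when a
    # uracil sits at a strand end or next to another uracil, so discard them.
    return [piece for piece in strand.upper().split("U") if piece]


if __name__ == "__main__":
    print(user_treat("ACGTUGGCAT"))  # ['ACGT', 'GGCAT']
```

In this toy model a strand without uracil is returned unchanged as a single piece, which mirrors the site specificity of the reagent.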
- Provided herein are methods where a nucleic acid fragment is treated prior to assembly into a target nucleic acid of predetermined sequence. In some instances, nucleic acid fragments are treated to create a sticky end, such as a sticky end with a 3′ overhang or a 5′ overhang. For example, uracil bases are incorporated into one or both strands of the target nucleic acids, which are chewed off upon treatment with Uracil DNA glycosylase (UDG) and Endonuclease VIII (EndoVIII). In some instances, uracil bases are incorporated near the 5′ ends (or 3′ ends), such as at least or at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 bases from the 5′ end (or 3′ end), of one or both strands. In some cases, uracil bases are incorporated near the 5′ ends such as at most or at most about 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 base from the 5′ end, of one or both strands. In some cases, uracil bases are incorporated near the 5′ end such as between 1-20, 2-19, 3-18, 4-17, 5-16, 6-15, 7-14, 8-13, 9-12, 10-13, 11-14 bases from the 5′ end, of one or both strands. Those of skill in the art will appreciate that the uracil bases may be incorporated near the 5′ end such that the distance between the uracil bases and the 5′ end of one or both strands may fall within a range bound by any of these values, for example from 7-19 bases.
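The relationship between the uracil position and the resulting sticky end can be sketched as follows. This is a simplified model under stated assumptions: the top strand carries a single uracil near its 5′ end, the short piece upstream of the uracil dissociates after cleavage, and the bottom strand is fully complementary with A opposite U. The function names are hypothetical.

```python
# U pairs with A, so it is complemented like T.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a 5'->3' sequence, returned 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sticky_end_after_user(top_strand: str) -> tuple[str, str]:
    """Return (remaining top strand, 3' overhang exposed on the bottom strand).

    Assumes exactly one 'U' near the 5' end of the top strand. A uracil that
    sits n bases from the 5' end exposes n + 1 bottom-strand bases: the n
    bases opposite the released piece plus the base opposite the gap.
    """
    u_index = top_strand.index("U")          # raises ValueError if no uracil
    remaining_top = top_strand[u_index + 1:]
    overhang = reverse_complement(top_strand[: u_index + 1])
    return remaining_top, overhang


if __name__ == "__main__":
    print(sticky_end_after_user("ACTGGUGATTACA"))  # ('GATTACA', 'ACCAGT')
```

With the uracil five bases from the 5′ end, the exposed overhang is six bases long, which matches the K=6 case of an ANNNNT-type sticky end.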
- Provided herein are methods where two or more of the cleavage, annealing and ligation reactions are performed concurrently within the same mixture and the mixture comprises a ligase. In some cases, one or more of the various reactions is sped up and one or more of the various reactions is slowed down by adjusting the reaction conditions such as temperature. In some cases, the reaction is thermocycled between a maximum and minimum temperature to repeatedly enhance cleavage, melting, annealing, and/or ligation. In some cases, the temperature ranges from a high of 80 degrees Celsius. In some cases, the temperature ranges to a low of 4 degrees Celsius. In some cases, the temperature ranges from 4 degrees Celsius to 80 degrees Celsius. In some cases, the temperature ranges among intermediates in this range. In some cases, the temperature ranges from a high of 60 degrees Celsius. In some cases, the temperature ranges to a low of 16 degrees Celsius. In some cases, the temperature ranges from a high of 60 degrees Celsius to a low of 16 degrees Celsius. In some cases, the mixture is temperature cycled to allow for the removal of cleaved sticky ended distal fragments from precursor fragments at elevated temperatures and to allow for the annealing of the fragments with complementary sticky ends at a lower temperature. In some cases, alternative combinations or alternative temperatures are used. In yet other cases, the reactions occur at a single temperature. In some cases, palindromic sequences are excluded from overhangs. The number of fragment populations to anneal in a reaction varies across target nucleic acids. In some cases, a ligation reaction comprises 2, 3, 4, 5, 6, 7, 8, or more than 8 types of target fragments to be assembled. For a given target nucleic acid, in some cases, portions of the entire nucleic acid are synthesized in separate reactions. 
In some cases, intermediate nucleic acids are used in a subsequent assembly round that uses the same or a different method to assemble larger intermediates or the final target nucleic acid. The same or different cleavage agents, recognition sites, and cleavage sites are used in subsequent rounds of assembly. In some instances, consecutive rounds of assembly, e.g. pooled or parallel assembly, are used to synthesize larger fragments in a hierarchical manner. In some cases, described herein are methods and compositions for the preparation of a target nucleic acid, wherein the target nucleic acid is a gene, using assembly of shorter fragments.
- Polymerase chain reaction (PCR)-based and non-polymerase-cycling-assembly (PCA)-based strategies may be used for gene synthesis. In addition, non-PCA-based gene synthesis using different strategies and methods, including enzymatic gene synthesis, annealing and ligation reaction, simultaneous synthesis of two genes via a hybrid gene, shotgun ligation and co-ligation, insertion gene synthesis, gene synthesis via one strand of DNA, template-directed ligation, ligase chain reaction, microarray-mediated gene synthesis, Golden Gate Gene Assembly, Blue Heron solid support technology, Sloning building block technology, RNA-mediated gene assembly, the PCR-based thermodynamically balanced inside-out (TBIO) (Gao et al., 2003), two-step total gene synthesis method that combines dual asymmetrical PCR (DA-PCR) (Sandhu et al., 1992), overlap extension PCR (Young and Dong, 2004), PCR-based two-step DNA synthesis (PTDS) (Xiong et al., 2004b), successive PCR method (Xiong et al., 2005, 2006a), or any other suitable method known in the art can be used in connection with the methods and compositions described herein, for the assembly of longer polynucleotides from shorter oligonucleotides.
- Amplification
- Amplification reactions described herein can be performed by any means known in the art. In some cases, the nucleic acids are amplified by polymerase chain reaction (PCR). Other methods of nucleic acid amplification include, for example, ligase chain reaction, oligonucleotide ligation assay, and hybridization assay. DNA polymerases described herein include enzymes that have DNA polymerase activity even though they may have other activities. A single DNA polymerase or a plurality of DNA polymerases may be used throughout the repair and copying reactions. The same DNA polymerase or set of DNA polymerases may be used at different stages of the present methods, or the DNA polymerases may be varied or additional polymerases added during various steps. Amplification may be achieved through any process by which the copy number of a target sequence is increased, e.g. PCR. Amplification can be performed at any point during a multi-reaction procedure, e.g. before or after pooling of sequencing libraries from independent reaction volumes, and may be used to amplify any suitable target molecule described herein.
- Oligonucleic Acid Synthesis
- Oligonucleic acids serving as target nucleic acids for assembly may be synthesized de novo in parallel. The oligonucleic acids may be assembled into precursor fragments which are then assembled into target nucleic acids. In some cases, greater than about 100, 1000, 16,000, 50,000 or 250,000 or even greater than about 1,000,000 different oligonucleic acids are synthesized together. In some cases, these oligonucleic acids are synthesized in a surface area of less than 20, 10, 5, 1, or 0.1 cm², or smaller. In some instances, oligonucleic acids are synthesized on a support, e.g. surfaces, such as microarrays, beads, miniwells, channels, or substantially planar devices. In some cases, oligonucleic acids are synthesized using phosphoramidite chemistry. In order to host phosphoramidite chemistry, the surface of the oligonucleotide synthesis loci of a substrate in some instances is chemically modified to provide a proper site for the linkage of the growing nucleotide chain to the surface. Various types of surface modification chemistry exist which allow a nucleotide to attach to the substrate surface.
- The DNA and RNA synthesized according to the methods described herein may be used to express proteins in vivo or in vitro. The nucleic acids may be used alone or in combination to express one or more proteins each having one or more protein activities. Such protein activities may be linked together to create a naturally occurring or non-naturally occurring metabolic/enzymatic pathway. Further, proteins with binding activity may be expressed using the nucleic acids synthesized according to the methods described herein. Such binding activity may be used to form scaffolds of varying sizes.
- The methods and systems described herein may comprise and/or be performed using a software program on a computer system. Accordingly, computerized control for the optimization of design algorithms described herein and the synthesis and assembly of nucleic acids are within the bounds of this disclosure. For example, supply of reagents and control of PCR reaction conditions are controlled with a computer. In some instances, a computer system is programmed to search for sticky end motifs in a user-specified predetermined nucleic acid sequence, interface these motifs with a list of suitable nicking enzymes, and/or determine one or more assembly algorithms to assemble fragments defined by the sticky end motifs. In some instances, a computer system described herein accepts as an input one or more orders for one or more nucleic acids of predetermined sequence, devises an algorithm(s) for the synthesis and/or assembly of the one or more nucleic acid fragments, provides an output in the form of instructions to a peripheral device(s) for the synthesis and/or assembly of the one or more nucleic acid fragments, and/or instructs for the production of the one or more nucleic acid fragments by the peripheral devices to form the desired nucleic acid of predetermined sequence. In some instances, a computer system operates without human intervention during one or more steps for the production of a target nucleic acid of predetermined sequence or nucleic acid fragment thereof.
- In some cases, a software system is used to identify sticky end motif sequences for use in a target sequence assembly reaction consistent with the disclosure herein. For example, in some cases, a software system is used to identify a sticky end motif using at least one, up to and including all, of the steps as follows. Given a final target sequence of length I, a desired target fragment length of J, a desired sticky end overhang length of K (for 5′ ANNNNT 3′ (SEQ ID NO.: 2), K=6), and a maximum desired similarity between sites of L, assembly parameters are in some instances calculated as follows. In some cases, J is about 200. In some cases, J is about 1000. In some cases, J is a number selected from about 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, or more than 1000. In some cases, J is a value in the range from 70-250. I/J is the number of fragments to be assembled (X). X−1 breakpoints are added along the target sequence, reflecting the number of junctions in the target sequence to be assembled. In some cases, junctions are selected at equal intervals or at approximately equal intervals throughout the target sequence.
- For at least one breakpoint, the nearest breakpoint site candidate is identified, for example having ANNNNT (SEQ ID NO.: 2), or GNNNNC (SEQ ID NO.: 17). Consistent with the disclosure herein, the breakpoint has a 6 base sequence in some cases, while in other cases the junction sequence is 1, 2, 3, 4, or 5 bases, and in other cases the junction is 7, 8, 9, 10, or more than 10 bases. In some cases, the breakpoint site candidate comprises a purine at a first position, a number of bases ranging from 0 to 8 or greater, preferably 1 or greater in some cases, and a pyrimidine at a final position such that the first position purine and the final position pyrimidine are a complementary base pair (either AT or GC).
- In some cases, breakpoint selection is continued for sites up to and, in some cases, including each breakpoint or near each breakpoint. Site candidates are evaluated so as to reduce the presence of at least one of palindromic sequences, homopolymers, extreme GC content, and extreme AT content. Sites are assessed in light of at least one of these criteria, optionally in combination with or alternatively viewing additional criteria for site candidate evaluation. If a site is determined or calculated to have undesirable qualities, then the next site in a vicinity is subjected to a comparable evaluation. Site candidates are further evaluated for cross-site similarity, for example excluding sites that share more than L bases in common at common positions or in common sequence. In some cases, L is 2, such that the central NNNN of some selected sticky ends must not share similar bases at similar positions. In some cases, L is 2, such that the central NNNN of some selected sticky ends must not share similar bases in similar patterns. In alternate cases, L is 3, 4, 5, 6, or greater than 6. Site candidates are evaluated individually or in combination, until a satisfactory sticky end system or group of distinct sticky ends is identified for a given assembly reaction. Alternate methods employ at least one of the steps recited above, alone or in combination with additional steps recited above or in combination with at least one step not recited above, or in combination with a plurality of steps recited above and at least one step not recited above.
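The selection steps above can be sketched in a few lines of Python. This is one possible reading of the procedure, not the software system of the disclosure: the function names, the homopolymer run length of 4, the GC bounds of 0.2 to 0.8, and the greedy nearest-first search are all assumptions chosen for illustration, and only 6 base ANNNNT/GNNNNC sites are considered.

```python
import re

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Purine first, four free bases, complementary pyrimidine last (ANNNNT or GNNNNC).
# The lookahead allows overlapping candidates to be reported.
SITE = re.compile(r"(?=(A[ACGT]{4}T|G[ACGT]{4}C))")


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def shared_positions(a, b):
    """Count positions where the central NNNN of two sites carry the same base."""
    return sum(x == y for x, y in zip(a[1:-1], b[1:-1]))


def acceptable(site, max_run=4, gc_bounds=(0.2, 0.8)):
    """Reject palindromes, homopolymer runs, and extreme GC or AT content."""
    if site == reverse_complement(site):
        return False
    if any(base * max_run in site for base in "ACGT"):
        return False
    gc = sum(base in "GC" for base in site) / len(site)
    return gc_bounds[0] <= gc <= gc_bounds[1]


def pick_sticky_ends(target, fragment_len, max_shared=2):
    """Return sorted (position, site) pairs, at most one per ideal breakpoint.

    target is the final sequence (length I), fragment_len is J, and
    max_shared is L. Fewer sites are returned if no candidate passes.
    """
    n_fragments = max(1, round(len(target) / fragment_len))  # X = I / J
    ideal = [i * len(target) // n_fragments for i in range(1, n_fragments)]
    candidates = [(m.start(), m.group(1)) for m in SITE.finditer(target)]
    chosen = []
    for ideal_pos in ideal:
        # Try the nearest candidate first, then move to the next one nearby.
        for pos, site in sorted(candidates, key=lambda c: abs(c[0] - ideal_pos)):
            if any(pos == p for p, _ in chosen) or not acceptable(site):
                continue
            if any(shared_positions(site, s) > max_shared for _, s in chosen):
                continue
            chosen.append((pos, site))
            break
    return sorted(chosen)


if __name__ == "__main__":
    target = "T" * 18 + "ACTCAT" + "T" * 16
    print(pick_sticky_ends(target, fragment_len=20))  # [(18, 'ACTCAT')]
```

In the small example, the 40 base target calls for two fragments and one breakpoint near position 20. The candidate ATTTTT at position 22 is rejected as a homopolymer run, so the site ACTCAT at position 18 is selected. A greedy search like this one can leave a breakpoint without a site; a fuller implementation might backtrack or relax the similarity limit.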
- A method described herein may be operably linked to a computer, either remotely or locally. In some cases, a method described herein is performed using a software program on a computer. In some cases, a system described herein comprises a software program for performing and/or analyzing a method or product of a method described herein. Accordingly, computerized control of a process step of any method described herein is envisioned.
- The computer system 700 illustrated in FIG. 7 depicts a logical apparatus that reads instructions from media 711 and/or a network port 705, which is optionally connected to server 709 having fixed media 712. In some cases, a computer system, such as shown in FIG. 7, includes a CPU 701, disk drive 703, optional input devices such as keyboard 715 and/or mouse 716, and optional monitor 707. Data communication can be achieved through the indicated communication medium to a server at a local or a remote location. Communication medium includes any means of transmitting and/or receiving data. As non-limiting examples, communication medium is a network connection, a wireless connection, and/or an internet connection. Such a connection can provide for communication over the World Wide Web. It is envisioned that data relating to the present disclosure is transmittable over such networks or connections for reception and/or review by a user 722, as illustrated in FIG. 7.
- A block diagram illustrating a first example architecture of a computer system 800 for use in connection with example embodiments of the disclosure is shown in FIG. 8. The example computer system of FIG. 8 includes a processor 802 for processing instructions. Non-limiting examples of processors include: Intel Xeon™ processor, AMD Opteron™ processor, Samsung 32-bit RISC ARM 1176JZ(F)-S v1.0™ processor, ARM Cortex-A8 Samsung S5PC100™ processor, ARM Cortex-A8 Apple A4™ processor, Marvell PXA 930™ processor, and a functionally-equivalent processor. Multiple threads of execution can be used for parallel processing. In some instances, multiple processors or processors with multiple cores are used, whether in a single computer system, in a cluster, or distributed across systems over a network comprising a plurality of computers, cell phones, and/or personal data assistant devices.
- In the computer system of FIG. 8, a high speed cache 804 is connected to, or incorporated in, the processor 802 to provide a high speed memory for instructions or data that have been recently, or are frequently, used by processor 802. The processor 802 is connected to a north bridge 806 by a processor bus 808. The north bridge 806 is connected to random access memory (RAM) 810 by a memory bus 812 and manages access to the RAM 810 by the processor 802. The north bridge 806 is also connected to a south bridge 814 by a chipset bus 816. The south bridge 814 is, in turn, connected to a peripheral bus 818. The peripheral bus is, for example, PCI, PCI-X, PCI Express, or another peripheral bus. The north bridge and south bridge are often referred to as a processor chipset and manage data transfer between the processor, RAM, and peripheral components on the peripheral bus 818. In some alternative architectures, the functionality of the north bridge is incorporated into the processor instead of using a separate north bridge chip. In some instances, system 800 includes an accelerator card 822 attached to the peripheral bus 818. The accelerator may include field programmable gate arrays (FPGAs) or other hardware for accelerating certain processing. For example, an accelerator is used for adaptive data restructuring or to evaluate algebraic expressions used in extended set processing.
- Software and data are stored in external storage 824, which can then be loaded into RAM 810 and/or cache 804 for use by the processor. System 800 includes an operating system for managing system resources. Non-limiting examples of operating systems include: Linux, Windows™, MACOS™, BlackBerry OS™, iOS™, and other functionally-equivalent operating systems, as well as application software running on top of the operating system for managing data storage and optimization in accordance with example embodiments of the present disclosure. System 800 includes network interface cards (NICs) 820 and 821 connected to the peripheral bus for providing network interfaces to external storage, such as Network Attached Storage (NAS) and other computer systems that can be used for distributed parallel processing.
-
FIG. 9 is a diagram showing a network 900 with a plurality of computer systems personal data assistants 902 c, and Network Attached Storage (NAS) 904 a, and 904 b. In some instances, systems NAS computer systems assistant system 902 c. Computer systems assistant system 902 c can provide parallel processing for adaptive data restructuring of the data stored in NAS. FIG. 9 illustrates an example only, and a wide variety of other computer architectures and systems can be used in conjunction with the various embodiments of the present disclosure. For example, a blade server can be used to provide parallel processing. Processor blades can be connected through a back plane to provide parallel processing. Storage can also be connected to the back plane or as NAS through a separate network interface. - In some instances, processors maintain separate memory spaces and transmit data through network interfaces, back plane or other connectors for parallel processing by other processors. In some instances, some or all of the processors use a shared virtual address memory space.
-
FIG. 10 is a block diagram of a multiprocessor computer system 1000 using a shared virtual address memory space in accordance with an example embodiment. The system includes a plurality of processors 1002 a-f that can access a shared memory subsystem 1004. The system incorporates a plurality of programmable hardware memory algorithm processors (MAPs) 1006 a-f in the memory subsystem 1004. Each MAP 1006 a-f can comprise a memory 1008 a-f and one or more field programmable gate arrays (FPGAs) 1010 a-f. The MAP provides a configurable functional unit and particular algorithms or portions of algorithms can be provided to the FPGAs 1010 a-f for processing in close coordination with a respective processor. For example, the MAPs are used to evaluate algebraic expressions regarding a data model and to perform adaptive data restructuring in example embodiments. In this example, each MAP is globally accessible by all of the processors for these purposes. In one configuration, each MAP uses Direct Memory Access (DMA) to access an associated memory 1008 a-f, allowing it to execute tasks independently of, and asynchronously from, the respective microprocessor 1002 a-f. In this configuration, a MAP can feed results directly to another MAP for pipelining and parallel execution of algorithms. - The above computer architectures and systems are examples only, and a wide variety of other computer, cell phone, and personal data assistant architectures and systems can be used in connection with example embodiments, including systems using any combination of general processors, co-processors, FPGAs and other programmable logic devices, system on chips (SOCs), application specific integrated circuits (ASICs), and other processing and logic elements. In some instances, all or part of the computer system can be implemented in software or hardware.
Any variety of data storage media can be used in connection with example embodiments, including random access memory, hard drives, flash memory, tape drives, disk arrays, Network Attached Storage (NAS) and other local or distributed data storage devices and systems.
- The following examples are set forth to illustrate more clearly the principle and practice of embodiments disclosed herein to those skilled in the art and are not to be construed as limiting the scope of any claimed embodiments.
- Amplification with Uracil-Containing PCR Primers
- A gene of about 1 kB (the “1 kB Gene Construct”) was selected to perform restriction enzyme-free ligation with a vector:
-
(SEQ ID NO.: 48) 5′ CAGCAGTTCCTCGCTCTTCTCACGACGAGTTCGACATCAAC AAGCTGCGCTACCACAAGATCGTGCTGATGGCCGACGCCGATGTT GACGGCCAGCACATCGCAACGCTGCTGCTCACCCTGCTTTTCCGC TTCATGCCAGACCTCGTCGCCGAAGGCCACGTCTACTTGGCACAG CCACCTTTGTACAAACTGAAGTGGCAGCGCGGAGAGCCAGGATTC GCATACTCCGATGAGGAGCGCGATGAGCAGCTCAACGAAGGCCTT GCCGCTGGACGCAAGATCAACAAGGACGACGGCATCCAGCGCTAC AAGGGTCTCGGCGAGATGAACGCCAGCGAGCTGTGGGAAACCACC ATGGACCCAACTGTTCGTATTCTGCGCCGCGTGGACATCACCGAT GCTCAGCGTGCTGATGAACTGTTCTCCATCTTGATGGGTGACGAC GTTGTGGCTCGCCGCAGCTTCATCACCCGAAATGCCAAGGATGTT CGTTTCCTCGATATCTAAAGCGCCTTACTTAACCCGCCCCTGGAA TTCTGGGGGCGGGTTTTGTGATTTTTAGGGTCAGCACTTTATAAA TGCAGGCTTCTATGGCTTCAAGTTGGCCAATACGTGGGGTTGATT TTTTAAAACCAGACTGGCGTGCCCAAGAGCTGAACTTTCGCTAGT CATGGGCATTCCTGGCCGGTTTCTTGGCCTTCAAACCGGACAGGA ATGCCCAAGTTAACGGAAAAACCGAAAGAGGGGCACGCCAGTCTG GTTCTCCCAAACTCAGGACAAATCCTGCCTCGGCGCCTGCGAAAA GTGCCCTCTCCTAAATCGTTTCTAAGGGCTCGTCAGACCCCAGTT GATACAAACATACATTCTGAAAATTCAGTCGCTTAAATGGGCGCA GCGGGAAATGCTGAAAACTACATTAATCACCGATACCCTAGGGCA CGTGACCTCTACTGAACCCACCACCACAGCCCATGTTCCACTACC TGATGGATCTTCCACTCCAGTCCAAATTTGGGCGTACACTGCGAG TCCACTACGAT 3′ - The 1 kB Gene Construct, which is an assembled gene fragment with heterogeneous sequence populations, was purchased as a single gBlock (Integrated DNA Technologies). The 1 kB Gene Construct was amplified in a PCR reaction with uracil-containing primers. The PCR reaction components were prepared according to Table 2.
-
TABLE 2 PCR reaction mixture comprising uracil-containing primers.
10 μL 5X HF buffer (ThermoFisher Scientific)
0.8 μL 10 mM dNTP (NEB)
1 ng template (1 kB Gene Construct)
2.5 μL forward primer (10 μM) 5′ CAGCAGT/ideoxyU/CCTCGCTCTTCT 3′ (SEQ ID NO.: 49; Integrated DNA Technologies)
2.5 μL reverse primer (10 μM) 5′ ATCGTAG/ideoxyU/GGACTCGCAGTGTA 3′ (SEQ ID NO.: 50; Integrated DNA Technologies)
0.5 μL Phusion-U hot start DNA polymerase (ThermoFisher Scientific, 2 U/μL)
Water up to 50 μL
- The 1 kB Gene Construct was amplified with the uracil-containing primers in a PCR reaction performed using the thermal cycling conditions described in Table 3.
-
TABLE 3 PCR reaction conditions for amplifying a gene with uracil-containing primers.
Step | Cycle
1 | 1 cycle: 98° C., 30 sec
2 | 20 cycles: 98° C., 10 sec; 68° C., 15 sec; 72° C., 60 sec
3 | 1 cycle: 72° C., 5 min
4 | Hold: 4° C.
- The uracil-containing PCR products were purified using Qiagen MinElute column, eluted in 10 μL EB buffer, analyzed by electrophoresis (BioAnalyzer), and quantified on a NanoDrop to be 93 ng/μL. The uracil-containing PCR products of the 1 kB Gene Construct were incubated with a mixture of Uracil DNA glycosylase (UDG) and Endonuclease VIII to generate sticky ends. The incubation occurred at 37° C. for 30 min in a reaction mixture as described in Table 4.
-
TABLE 4 Digestion reaction conditions for generating sticky ends in a uracil-containing gene.
Reaction component | Quantity
Uracil-containing PCR product | 15 nM (final concentration)
10x CutSmart buffer (NEB) | 10 μL
UDG/EndoVIII (NEB or Enzymatics) | 2 μL of 1 U/μL
Water | Up to 94.7 μL
- Two synthetic oligonucleotides having 3′ overhangs when annealed together (“Artificial Vector”) were hybridized and ligated to the digested uracil-containing 1 kB Gene Construct (“Sticky-end Construct”). The first oligo (“Upper Oligo”, SEQ ID NO.: 51) contains a 5′ phosphate for ligation. The second oligonucleotide (“Lower Oligo”, SEQ ID NO.: 52) lacks a base on the 5′ end such that it leaves a nucleotide gap after hybridizing to the Sticky-end Construct with the Upper Oligo. Further, the Lower Oligo lacks a 5′ phosphate to ensure that no ligation occurs at this juncture. The first six phosphate bonds on the Lower Oligo are phosphorothioated to prevent exonuclease digestion from the gap. Oligonucleic acid sequences of the Artificial Vector are shown in Table 5. An asterisk denotes a phosphorothioate bond.
-
TABLE 5 Sequence identities of an artificial vector for ligation to a sticky-end gene product.
Sequence ID | Sequence
SEQ ID NO.: 51 | 5′/5phos/TACGCTCTTCCTCAGCAGTGGTCATCGTAGT 3′
SEQ ID NO.: 52 | 5′ A*C*C*A*C*T*GCTGAGGAAGAGCGTACAGCAGTT 3′
Artificial Vector SEQ ID NO.: 79 | TACGCTCTTCCTCAGCAGTGGTCATCGTAGT / TTGACGACATGCGAGAAGGAGTCGT*C*A*C*C*A*
- The Sticky-end Construct was mixed with Upper Oligo and Lower Oligo (5 μM each) in 1× CutSmart buffer (NEB). The mixture was heated to 95° C. for 5 min, and then slowly cooled to anneal. The annealed product comprised a circularized gene construct comprising the 1 kB Gene Construct. This construct was generated without the remnants of any restriction enzyme cleavage sites and thus lacked any associated enzymatic “scars.”
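The complementarity between the Artificial Vector overhangs and the digested construct ends can be checked at the string level. The following is a minimal sketch, not part of the disclosed method: it assumes that the UDG/Endonuclease VIII treatment removes each primer's 5′ bases up to and including the uracil, so that the opposite strand is left with a 3′ overhang, and it omits the phosphate and phosphorothioate modifications. The helper names `revcomp` and `duplex_length` are hypothetical.

```python
# String-level check of the Artificial Vector oligos from Table 5.
# Modification marks (/5phos/, phosphorothioate asterisks) are omitted.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def duplex_length(upper, lower):
    """Largest k for which the first k bases of the two oligos pair with each other."""
    best = 0
    for k in range(1, min(len(upper), len(lower)) + 1):
        if upper[:k] == revcomp(lower[:k]):
            best = k
    return best


UPPER = "TACGCTCTTCCTCAGCAGTGGTCATCGTAGT"  # SEQ ID NO.: 51
LOWER = "ACCACTGCTGAGGAAGAGCGTACAGCAGTT"   # SEQ ID NO.: 52

# Primer bases up to and including the uracil (Table 2), with U written as T.
FWD_PREFIX = "CAGCAGTT"  # from SEQ ID NO.: 49
REV_PREFIX = "ATCGTAGT"  # from SEQ ID NO.: 50

k = duplex_length(UPPER, LOWER)
upper_overhang = UPPER[k:]  # 3' overhang of the Upper Oligo
lower_overhang = LOWER[k:]  # 3' overhang of the Lower Oligo

# Assumed 3' overhangs left on the construct after the primer 5' ends are removed.
construct_left_overhang = revcomp(FWD_PREFIX)
construct_right_overhang = revcomp(REV_PREFIX)

print(k, upper_overhang, lower_overhang)
print(revcomp(lower_overhang) == construct_left_overhang)
print(revcomp(upper_overhang[1:]) == construct_right_overhang)
```

Under these assumptions the Lower Oligo overhang pairs fully with one construct end, while the Upper Oligo overhang pairs with the other end and leaves its first base unpaired, which is consistent with the single nucleotide gap described above.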
- A LacZ gene was assembled into a 5 kb plasmid from three precursor LacZ fragments and 1 precursor plasmid fragment. Assembly was performed using 9 different reaction conditions.
- Preparation of Precursor Plasmid Fragments
- A 5 kb plasmid was amplified with two different sets of primers for introducing a sticky end motif comprising a non-canonical base (SEQ ID NO.: 53): set A (SEQ ID NOs.: 54 and 55) and set B (SEQ ID NOs.: 56 and 57), shown in Table 6, to produce plasmid precursor fragments A and B, respectively.
-
TABLE 6 Sequence identities of plasmid primers.
Sequence identity | Primer name | Sequence
SEQ ID NO.: 54 | plasmid-Fa | TGATCGGCAATGATATG/ideoxyU/CTGGAAAGAACATGTG
SEQ ID NO.: 55 | plasmid-Ra | TGATCGGCAATGATGGC/ideoxyU/TATAATGCGACAAACAACAG
SEQ ID NO.: 56 | plasmid-Fb | TGATCGGCAATGATATG/ideoxyU/CGCTGGAAAGAACATG
SEQ ID NO.: 57 | plasmid-Rb | TGATCGGCAATGATGGC/ideoxyU/CGTATAATGCGACAAACAAC
- Each primer set comprises, in 5′ to 3′ order: 6 adaptor bases (TGATCG, SEQ ID NO.: 58), a first nicking enzyme recognition site (GCAATG, SEQ ID NO.: 59), a sticky end motif comprising a non-canonical base (ANNNNU, SEQ ID NO.: 53), and plasmid sequence. The first two bases of the plasmid sequence in the forward and reverse primers of set B are CG. These two bases are absent from the forward and reverse primers of set A. Two plasmid fragments, plasmid A and plasmid B, were amplified using primer set A and primer set B, respectively. The composition of the amplification reaction is shown in Table 7. The amplification reaction conditions are shown in Table 8.
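The primer layout described above (adaptor bases, nicking enzyme recognition site, ANNNNU motif, plasmid sequence) can be illustrated by splitting each Table 6 primer into its parts. This is a minimal sketch under stated assumptions: /ideoxyU/ is written as the single letter U, the primer for SEQ ID NO.: 57 is labeled plasmid-Rb following Table 7, and the function name `split_primer` is hypothetical.

```python
ADAPTOR = "TGATCG"    # adaptor bases (SEQ ID NO.: 58)
NICK_SITE = "GCAATG"  # nicking enzyme recognition site (SEQ ID NO.: 59)
MOTIF_LEN = 6         # length of the ANNNNU sticky end motif (SEQ ID NO.: 53)

# Table 6 primers with /ideoxyU/ written as U.
PRIMERS = {
    "plasmid-Fa": "TGATCGGCAATGATATGUCTGGAAAGAACATGTG",
    "plasmid-Ra": "TGATCGGCAATGATGGCUTATAATGCGACAAACAACAG",
    "plasmid-Fb": "TGATCGGCAATGATATGUCGCTGGAAAGAACATG",
    "plasmid-Rb": "TGATCGGCAATGATGGCUCGTATAATGCGACAAACAAC",
}


def split_primer(primer):
    """Split a primer into adaptor, nicking site, sticky end motif and plasmid sequence."""
    prefix = ADAPTOR + NICK_SITE
    if not primer.startswith(prefix):
        raise ValueError("primer does not start with adaptor bases and nicking site")
    motif = primer[len(prefix):len(prefix) + MOTIF_LEN]
    if not (motif.startswith("A") and motif.endswith("U")):
        raise ValueError("sticky end motif is not of the form ANNNNU")
    return {
        "adaptor": ADAPTOR,
        "nick_site": NICK_SITE,
        "motif": motif,
        "plasmid": primer[len(prefix) + MOTIF_LEN:],
    }


parts = {name: split_primer(seq) for name, seq in PRIMERS.items()}
for name, fields in parts.items():
    print(name, fields["motif"], fields["plasmid"])
```

Splitting the primers this way makes the difference between the two sets visible: the plasmid portions of set B begin with CG, and the remainder matches the start of the corresponding set A plasmid portion.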
-
TABLE 7 PCR reaction mixture for amplification of a 5 kb plasmid.
PCR component | Quantity (μL) | Concentration in mixture
Phusion U (2 U/μL) | 1 | 1 U/50 μL
5x Phusion HF buffer | 20 | 1x
10 mM dNTP | 4 | 400 μM
Plasmid template (50 pg/μL) | 4 | 100 pg/50 μL
plasmid-Fa or plasmid-Fb (200 μM) | 0.25 | 0.5 μM
plasmid-Ra or plasmid-Rb (200 μM) | 0.25 | 0.5 μM
Water | 70.5 |
-
TABLE 8 PCR reaction conditions for amplification of a 5 kb plasmid.
Step | Cycle
1 | 1 cycle: 98° C., 30 sec
2 | 30 cycles: 98° C., 10 sec; 49° C., 15 sec; 72° C., 90 sec
3 | 1 cycle: 72° C., 5 min
4 | Hold: 4° C., 15-30 sec per kb
- The precursor plasmid fragment was treated with DpnI, denatured and purified.
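The concentrations listed in Table 7 follow from simple dilution arithmetic (stock concentration times added volume, divided by total volume). The sketch below recomputes them as a consistency check. It assumes a template stock of 50 pg/μL, which is the reading consistent with the stated 100 pg/50 μL, and the function name `final_concentration` is hypothetical.

```python
# Volumes in microliters, taken from Table 7.
VOLUMES_UL = {
    "Phusion U": 1.0,
    "5x Phusion HF buffer": 20.0,
    "10 mM dNTP": 4.0,
    "plasmid template": 4.0,
    "forward primer": 0.25,
    "reverse primer": 0.25,
    "water": 70.5,
}

# Stock concentrations; units are noted per entry.
STOCKS = {
    "Phusion U": 2.0,             # U/uL
    "5x Phusion HF buffer": 5.0,  # fold concentration (x)
    "10 mM dNTP": 10000.0,        # uM
    "plasmid template": 50.0,     # pg/uL (assumed reading of the table entry)
    "forward primer": 200.0,      # uM
    "reverse primer": 200.0,      # uM
}


def final_concentration(stock, volume_ul, total_ul):
    """Concentration after diluting volume_ul of stock into total_ul."""
    return stock * volume_ul / total_ul


total_ul = sum(VOLUMES_UL.values())
final = {
    name: final_concentration(stock, VOLUMES_UL[name], total_ul)
    for name, stock in STOCKS.items()
}

print("total volume (uL):", total_ul)
for name, value in final.items():
    print(name, value)
```

With a 100 μL total volume, the polymerase value of 0.02 U/μL corresponds to the tabulated 1 U/50 μL, and the template value of 2 pg/μL corresponds to 100 pg/50 μL.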
- Preparation of Precursor LacZ Fragments
- The LacZ sequence was analyzed to identify two sticky end motifs which partition the sequence into three fragments of roughly 1 kb each: LacZ fragments 1-3. Sequence identities of the two sticky end motifs and the LacZ fragments are shown in Table 9. SEQ ID NO.: 60 shows the complete LacZ gene, wherein motifs are italicized, fragment 1 is underlined with a single line, fragment 2 is underlined with a squiggly line, and fragment 3 is underlined with a double line. -
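The motif selection described above can be sketched as a search for ANNNNT sites followed by choosing, for each ideal cut position, the nearest site. This is a minimal illustration and not the procedure actually used: the nearest-site rule, the function names `find_motifs` and `pick_junctions`, and the toy sequence are assumptions, and the sketch does not check that the chosen motifs differ from one another.

```python
import re

# Lookahead so that overlapping ANNNNT sites are all reported.
MOTIF = re.compile(r"(?=(A[ACGT]{4}T))")


def find_motifs(seq):
    """Return (position, motif) for every ANNNNT site in seq."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq)]


def pick_junctions(seq, n_fragments):
    """Choose, for each ideal cut position, the nearest ANNNNT site."""
    motifs = find_motifs(seq)
    if not motifs:
        raise ValueError("no ANNNNT motif found")
    chosen = []
    for i in range(1, n_fragments):
        target = i * len(seq) / n_fragments
        chosen.append(min(motifs, key=lambda pm: abs(pm[0] - target)))
    return chosen


# Toy sequence (hypothetical) carrying the two motifs named in Table 9.
toy = "C" * 10 + "AATGGT" + "C" * 10 + "ACAGTT" + "C" * 10
print(find_motifs(toy))
print(pick_junctions(toy, 3))
```

For a real gene, the same call with `n_fragments=3` would propose two junctions near one third and two thirds of the sequence length.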
TABLE 9 Sequence identities of LacZ fragments and sticky end motifs. Sequence Sequence identity name Sequence SEQ ID fragment 1 ATGACCATGATTACGGATTCACTGGCCGTCGTTTTACAACG NO.: 61 TCGTGACTGGGAAAACCCTGGCGTTACCCAACTTAATCGCC TTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAATAGCGAA GAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCT GAATGGCGAATGGCGCTTTGCCTGGTTTCCGGCACCAGAAG CGGTGCCGGAAAGCTGGCTGGAGTGCGATCTTCCTGAGGCC GATACTGTCGTCGTCCCCTCAAACTGGCAGATGCACGGTTA CGATGCGCCCATCTACACCAACGTGACCTATCCCATTACGG TCAATCCGCCGTTTGTTCCCACGGAGAATCCGACGGGTTGT TACTCGCTCACATTTAATGTTGATGAAAGCTGGCTACAGGA AGGCCAGACGCGAATTATTTTTGATGGCGTTAACTCGGCGT TTCATCTGTGGTGCAACGGGCGCTGGGTCGGTTACGGCCAG GACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGCATTTTT ACGCGCCGGAGAAAACCGCCTCGCGGTGATGGTGCTGCGCT GGAGTGACGGCAGTTATCTGGAAGATCAGGATATGTGGCGG ATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCATAAACC GACTACACAAATCAGCGATTTCCATGTTGCCACTCGCTTTA ATGATGATTTCAGCCGCGCTGTACTGGAGGCTGAAGTTCAG ATGTGCGGCGAGTTGCGTGACTACCTACGGGTAACAGTTTC TTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCACCGCGC CTTTCGGCGGTGAAATTATCGATGAGCGTGGTGGTTATGCC GATCGCGTCACACTACGTCTGAACGTCGAAAACCCGAAACT GTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGGTGGTTG AACTGCACACCGCCGACGGCACGCTGATTGAAGCAGAAGCC TGCGATGTCGGTTTCCGCGAGGTGCGGATTGAA SEQ ID NO.: 62 fragment 2SEQ ID fragment 3 GATTGAACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGC NO.: 63 AACTCTGGCTCACAGTACGCGTAGTGCAACCGAACGCGACC GCATGGTCAGAAGCCGGGCACATCAGCGCCTGGCAGCAGTG GCGTCTGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGT CCCACGCCATCCCGCATCTGACCACCAGCGAAATGGATTTT TGCATCGAGCTGGGTAATAAGCGTTGGCAATTTAACCGCCA GTCAGGCTTTCTTTCACAGATGTGGATTGGCGATAAAAAAC AACTGCTGACGCCGCTGCGCGATCAGTTCACCCGTGCACCG CTGGATAACGACATTGGCGTAAGTGAAGCGACCCGCATTGA CCCTAACGCCTGGGTCGAACGCTGGAAGGCGGCGGGCCATT ACCAGGCCGAAGCAGCGTTGTTGCAGTGCACGGCAGATACA CTTGCTGATGCGGTGCTGATTACGACCGCTCACGCGTGGCA GCATCAGGGGAAAACCTTATTTATCAGCCGGAAAACCTACC GGATTGATGGTAGTGGTCAAATGGCGATTACCGTTGATGTT GAAGTGGCGAGCGATACACCGCATCCGGCGCGGATTGGCCT GAACTGCCAGCTGGCGCAGGTAGCAGAGCGGGTAAACTGGC TCGGATTAGGGCCGCAAGAAAACTATCCCGACCGCCTTACT GCCGCCTGTTTTGACCGCTGGGATCTGCCATTGTCAGACAT 
GTATACCCCGTACGTCTTCCCGAGCGAAAACGGTCTGCGCT GCGGGACGCGCGAATTGAATTATGGCCCACACCAGTGGCGC GGCGACTTCCAGTTCAACATCAGCCGCTACAGTCAACAGCA ACTGATGGAAACCAGCCATCGCCATCTGCTGCACGCGGAAG AAGGCACATGGCTGAATATCGACGGTTTCCATATGGGGATT GGTGGCGACGACTCCTGGAGCCCGTCAGTATCGGCGGAATT CCAGCTGAGCGCCGGTCGCTACCATTACCAGTTGGTCTGGT GTCAAAAATAA SEQ ID motif 1 AATGGT NO.: 64 SEQ ID motif 2 ACAGTT NO.: 65 SEQ ID NO.: 60 LacZ ATCGCCTTGCAGCACATCCCCCTTTCGCCAGCTGGCGTAAT AGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCG CAGCCTGAATGGCGAATGGCGCTTTGCCTGGTTTCCGGCAC CAGAAGCGGTGCCGGAAAGCTGGCTGGAGTGCGATCTTCCT GAGGCCGATACTGTCGTCGTCCCCTCAAACTGGCAGATGCA CGGTTACGATGCGCCCATCTACACCAACGTGACCTATCCCA TTACGGTCAATCCGCCGTTTGTTCCCACGGAGAATCCGACG GGTTGTTACTCGCTCACATTTAATGTTGATGAAAGCTGGCT ACAGGAAGGCCAGACGCGAATTATTTTTGATGGCGTTAACT CGGCGTTTCATCTGTGGTGCAACGGGCGCTGGGTCGGTTAC GGCCAGGACAGTCGTTTGCCGTCTGAATTTGACCTGAGCGC ATTTTTACGCGCCGGAGAAAACCGCCTCGCGGTGATGGTGC TGCGCTGGAGTGACGGCAGTTATCTGGAAGATCAGGATATG TGGCGGATGAGCGGCATTTTCCGTGACGTCTCGTTGCTGCA TAAACCGACTACACAAATCAGCGATTTCCATGTTGCCACTC GCTTTAATGATGATTTCAGCCGCGCTGTACTGGAGGCTGAA GTTCAGATGTGCGGCGAGTTGCGTGACTACCTACGGGTAAC AGTTTCTTTATGGCAGGGTGAAACGCAGGTCGCCAGCGGCA CCGCGCCTTTCGGCGGTGAAATTATCGATGAGCGTGGTGGT TATGCCGATCGCGTCACACTACGTCTGAACGTCGAAAACCC GAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTGCGG TGGTTGAACTGCACACCGCCGACGGCACGCTGATTGAAGCA AACTGCCTGAACTACCGCAGCCGGAGAGCGCCGGGCAACTC TGGCTCACAGTACGCGTAGTGCAACCGAACGCGACCGCATG GTCAGAAGCCGGGCACATCAGCGCCTGGCAGCAGTGGCGTC TGGCGGAAAACCTCAGTGTGACGCTCCCCGCCGCGTCCCAC GCCATCCCGCATCTGACCACCAGCGAAATGGATTTTTGCAT CGAGCTGGGTAATAAGCGTTGGCAATTTAACCGCCAGTCAG GCTTTCTTTCACAGATGTGGATTGGCGATAAAAAACAACTG CTGACGCCGCTGCGCGATCAGTTCACCCGTGCACCGCTGGA TAACGACATTGGCGTAAGTGAAGCGACCCGCATTGACCCTA ACGCCTGGGTCGAACGCTGGAAGGCGGCGGGCCATTACCAG GCCGAAGCAGCGTTGTTGCAGTGCACGGCAGATACACTTGC TGATGCGGTGCTGATTACGACCGCTCACGCGTGGCAGCATC AGGGGAAAACCTTATTTATCAGCCGGAAAACCTACCGGATT GATGGTAGTGGTCAAATGGCGATTACCGTTGATGTTGAAGT GGCGAGCGATACACCGCATCCGGCGCGGATTGGCCTGAACT GCCAGCTGGCGCAGGTAGCAGAGCGGGTAAACTGGCTCGGA 
TTAGGGCCGCAAGAAAACTATCCCGACCGCCTTACTGCCGC CTGTTTTGACCGCTGGGATCTGCCATTGTCAGACATGTATA CCCCGTACGTCTTCCCGAGCGAAAACGGTCTGCGCTGCGGG ACGCGCGAATTGAATTATGGCCCACACCAGTGGCGCGGCGA CTTCCAGTTCAACATCAGCCGCTACAGTCAACAGCAACTGA TGGAAACCAGCCATCGCCATCTGCTGCACGCGGAAGAAGGC ACATGGCTGAATATCGACGGTTTCCATATGGGGATTGGTGG CGACGACTCCTGGAGCCCGTCAGTATCGGCGGAATTCCAGC TGAGCGCCGGTCGCTACCATTACCAGTTGGTCTGGTGTCAA - LacZ fragments 1-3 were assembled from smaller, synthesized oligonucleic acids. During fragment preparation, the 5′ and/or 3′ of each fragment end was appended with a connecting adaptor to generated adaptor-modified fragments 1-3. To prepare LacZ for assembly with the precursor plasmid fragments, the 5′ end of
fragment 1 and the 3′ end of fragment 3 were appended with a first outer adaptor comprising outer adaptor motif 1 (AGCCAT, SEQ ID NO.: 66) and a second outer adaptor comprising outer adaptor motif 2 (TTATGT, SEQ ID NO.: 67), respectively. The sequences of modified fragments 1-3 are shown in Table 10. Each modified fragment comprises a first adaptor sequence (GTATGCTGACTGCT, SEQ ID NO.: 68) at the first end and second adaptor sequence (TTGCCCTACGGTCT, SEQ ID NO.: 69) at the second end, indicated by a dashed underline. Each modified fragment comprises a nicking enzyme recognition site (GCAATG, SEQ ID NO.: 59), indicated by a dotted underline. Each modified fragment comprises an ANNNNT motif (SEQ ID NO.: 2), indicated by italics. -
TABLE 10 Sequence identities of modified LacZ fragments. Sequence Sequence identity name Sequence SEQ ID NO.: 70 modified fragment 1AAAACCCTGGCGTTACCCAACTTAATCGCCTTGCAGCACATC CCCCTTTCGCCAGCTGGCGTAATAGCGAAGAGGCCCGCACCG ATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGC GCTTTGCCTGGTTTCCGGCACCAGAAGCGGTGCCGGAAAGCT GGCTGGAGTGCGATCTTCCTGAGGCCGATACTGTCGTCGTCC CCTCAAACTGGCAGATGCACGGTTACGATGCGCCCATCTACA CCAACGTGACCTATCCCATTACGGTCAATCCGCCGTTTGTTC CCACGGAGAATCCGACGGGTTGTTACTCGCTCACATTTAATG TTGATGAAAGCTGGCTACAGGAAGGCCAGACGCGAATTATTT TTGATGGCGTTAACTCGGCGTTTCATCTGTGGTGCAACGGGC GCTGGGTCGGTTACGGCCAGGACAGTCGTTTGCCGTCTGAAT TTGACCTGAGCGCATTTTTACGCGCCGGAGAAAACCGCCTCG CGGTGATGGTGCTGCGCTGGAGTGACGGCAGTTATCTGGAAG ATCAGGATATGTGGCGGATGAGCGGCATTTTCCGTGACGTCT CGTTGCTGCATAAACCGACTACACAAATCAGCGATTTCCATG TTGCCACTCGCTTTAATGATGATTTCAGCCGCGCTGTACTGG AGGCTGAAGTTCAGATGTGCGGCGAGTTGCGTGACTACCTAC GGGTAACAGTTTCTTTATGGCAGGGTGAAACGCAGGTCGCCA GCGGCACCGCGCCTTTCGGCGGTGAAATTATCGATGAGCGTG GTGGTTATGCCGATCGCGTCACACTACGTCTGAACGTCGAAA ACCCGAAACTGTGGAGCGCCGAAATCCCGAATCTCTATCGTG CGGTGGTTGAACTGCACACCGCCGACGGCACGCTGATTGAAG SEQ ID NO.: 71 modified fragment 2AGCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGA TGGTGCAGGATATCCTGCTGATGAAGCAGAACAACTTTAACG CCGTGCGCTGTTCGCATTATCCGAACCATCCGCTGTGGTACA CGCTGTGCGACCGCTACGGCCTGTATGTGGTGGATGAAGCCA ATATTGAAACCCACGGCATGGTGCCAATGAATCGTCTGACCG ATGATCCGCGCTGGCTACCGGCGATGAGCGAACGCGTAACGC GAATGGTGCAGCGCGATCGTAATCACCCGAGTGTGATCATCT GGTCGCTGGGGAATGAATCAGGCCACGGCGCTAATCACGACG CGCTGTATCGCTGGATCAAATCTGTCGATCCTTCCCGCCCGG TGCAGTATGAAGGCGGCGGAGCCGACACCACGGCCACCGATA TTATTTGCCCGATGTACGCGCGCGTGGATGAAGACCAGCCCT TCCCGGCTGTGCCGAAATGGTCCATCAAAAAATGGCTTTCGC TACCTGGAGAGACGCGCCCGCTGATCCTTTGCGAATACGCCC ACGCGATGGGTAACAGTCTTGGCGGTTTCGCTAAATACTGGC AGGCGTTTCGTCAGTATCCCCGTTTACAGGGCGGCTTCGTCT GGGACTGGGTGGATCAGTCGCTGATTAAATATGATGAAAACG GCAACCCGTGGTCGGCTTACGGCGGTGATTTTGGCGATACGC CGAACGATCGCCAGTTCTGTATGAACGGTCTGGTCTTTGCCG ACCGCACGCCGCATCCAGCGCTGACGGAAGCAAAACACCAGC AGCAGTTTTTCCAGTTCCGTTTATCCGGGCAAACCATCGAAG 
TGACCAGCGAATACCTGTTCCGTCATAGCGATAACGAGCTCC TGCACTGGATGGTGGCGCTGGATGGTAAGCCGCTGGCAAGCG SEQ ID NO.: 72 modified fragment 3ACAGTACGCGTAGTGCAACCGAACGCGACCGCATGGTCAGAA GCCGGGCACATCAGCGCCTGGCAGCAGTGGCGTCTGGCGGAA AACCTCAGTGTGACGCTCCCCGCCGCGTCCCACGCCATCCCG CATCTGACCACCAGCGAAATGGATTTTTGCATCGAGCTGGGT AATAAGCGTTGGCAATTTAACCGCCAGTCAGGCTTTCTTTCA CAGATGTGGATTGGCGATAAAAAACAACTGCTGACGCCGCTG CGCGATCAGTTCACCCGTGCACCGCTGGATAACGACATTGGC GTAAGTGAAGCGACCCGCATTGACCCTAACGCCTGGGTCGAA CGCTGGAAGGCGGCGGGCCATTACCAGGCCGAAGCAGCGTTG TTGCAGTGCACGGCAGATACACTTGCTGATGCGGTGCTGATT ACGACCGCTCACGCGTGGCAGCATCAGGGGAAAACCTTATTT ATCAGCCGGAAAACCTACCGGATTGATGGTAGTGGTCAAATG GCGATTACCGTTGATGTTGAAGTGGCGAGCGATACACCGCAT CCGGCGCGGATTGGCCTGAACTGCCAGCTGGCGCAGGTAGCA GAGCGGGTAAACTGGCTCGGATTAGGGCCGCAAGAAAACTAT CCCGACCGCCTTACTGCCGCCTGTTTTGACCGCTGGGATCTG CCATTGTCAGACATGTATACCCCGTACGTCTTCCCGAGCGAA AACGGTCTGCGCTGCGGGACGCGCGAATTGAATTATGGCCCA CACCAGTGGCGCGGCGACTTCCAGTTCAACATCAGCCGCTAC AGTCAACAGCAACTGATGGAAACCAGCCATCGCCATCTGCTG CACGCGGAAGAAGGCACATGGCTGAATATCGACGGTTTCCAT ATGGGGATTGGTGGCGACGACTCCTGGAGCCCGTCAGTATCG GCGGAATTCCAGCTGAGCGCCGGTCGCTACCATTACCAGTTG - To generate a second nicking enzyme recognition site, a non-canonical base uracil, each modified fragment was amplified using the universal primers shown in Table 11. An asterisk indicates a phosphorothioated bond.
-
TABLE 11 Uracil-containing universal primers for amplification of modified LacZ fragments.
Sequence identity | Sequence name | Sequence
SEQ ID NO.: 73 | modfrag1F | GTATGCTGACTGCTGCAATGAGCCA*/3deoxyU/
SEQ ID NO.: 74 | modfrag1R | TTGCCCTACGGTCTGCAATGACCAT*/3deoxyU/
SEQ ID NO.: 75 | modfrag2F | GTATGCTGACTGCTGCAATGAATGG*/3deoxyU/
SEQ ID NO.: 76 | modfrag2R | TTGCCCTACGGTCTGCAATGAACTG*/3deoxyU/
SEQ ID NO.: 77 | modfrag3F | GTATGCTGACTGCTGCAATGACAGT*/3deoxyU/
SEQ ID NO.: 78 | modfrag3R | TTGCCCTACGGTCTGCAATGACATA*/3deoxyU/
- Each primer set comprises, in 5′ to 3′ order: adaptor sequence, a first nicking enzyme recognition site (GCAATG, SEQ ID NO.: 59), and a sticky end motif comprising a non-canonical base (ANNNNU, SEQ ID NO.: 53). Modified fragments 1-3 were amplified using their corresponding primers modfrag1F/modfrag1R, modfrag2F/modfrag2R and modfrag3F/modfrag3R, respectively. The composition of the amplification reaction is shown in Table 12. The amplification reaction conditions are shown in Table 13.
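Because each junction is formed by two sticky ends, the motifs carried by the primers on either side of a junction are expected to be reverse complements of one another. The sketch below checks this for the primers of Tables 6 and 11, reading the terminal uracil as T and omitting the phosphorothioate mark. The list of which ends meet at each junction (`JUNCTIONS`) is inferred here from the sequences and is not stated in the text; the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sticky_motif(primer, motif_len=6):
    """Last six bases of a universal primer, with the terminal U read as T."""
    return primer[-motif_len:].replace("U", "T")


# Table 11 primers with */3deoxyU/ written as U.
UNIVERSAL = {
    "modfrag1F": "GTATGCTGACTGCTGCAATGAGCCAU",
    "modfrag1R": "TTGCCCTACGGTCTGCAATGACCATU",
    "modfrag2F": "GTATGCTGACTGCTGCAATGAATGGU",
    "modfrag2R": "TTGCCCTACGGTCTGCAATGAACTGU",
    "modfrag3F": "GTATGCTGACTGCTGCAATGACAGTU",
    "modfrag3R": "TTGCCCTACGGTCTGCAATGACATAU",
}

motifs = {name: sticky_motif(seq) for name, seq in UNIVERSAL.items()}
# Sticky end motifs of the plasmid primers (Table 6), with U read as T.
motifs.update({"plasmid-Fa": "ATATGT", "plasmid-Ra": "ATGGCT"})

# Assumed pairing of ends at each junction (inferred, not stated in the text).
JUNCTIONS = [
    ("plasmid-Ra", "modfrag1F"),
    ("modfrag1R", "modfrag2F"),
    ("modfrag2R", "modfrag3F"),
    ("modfrag3R", "plasmid-Fa"),
]


def junction_report(motifs, junctions):
    """Map each junction to True when its two motifs are reverse complements."""
    return {(a, b): revcomp(motifs[a]) == motifs[b] for a, b in junctions}


report = junction_report(motifs, JUNCTIONS)
for pair, ok in report.items():
    print(pair, motifs[pair[0]], motifs[pair[1]], ok)
```

The two internal junctions correspond to motif 1 (AATGGT) and motif 2 (ACAGTT) of Table 9.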
-
TABLE 12 PCR reaction mixture for amplification of modified LacZ fragments.
PCR component | Quantity (μL) | Concentration in mixture
Phusion U (2 U/μL) | 1 | 1 U/50 μL
5x Phusion HF buffer | 20 | 1x
10 mM dNTP | 2 | 200 μM
Plasmid template (50 pg/μL) | 2 | 100 pg/100 μL
Forward primer (200 μM) | 0.25 | 0.5 μM
Reverse primer (200 μM) | 0.25 | 0.5 μM
Water | 70.5 |
-
TABLE 13 PCR reaction conditions for amplification of modified LacZ fragments.
Step | Cycle
1 | 1 cycle: 98° C., 30 sec
2 | 20 cycles: 98° C., 10 sec; 72° C., 30 sec
3 | 10 cycles: 98° C., 10 sec; 72° C., 45 sec
4 | 1 cycle: 72° C., 5 min
5 | Hold: 4° C., 15-30 sec per kb
- Assembly of LacZ precursor fragments
- LacZ precursor fragments were annealed and ligated with the plasmid fragment according to
reactions reaction 1. USER (UDG and endonuclease VIII) was used to generate a nick at uracil in a second strand during reaction 2. Reaction 2 comprised three steps: cleavage of uracil, ligation, and enzymatic inactivation. Assembled fragments comprise LacZ inserted into the 5 kb plasmid. To determine efficiency of assembly into the plasmid, colonies resulting from the transformation of assembled plasmids into E. coli were amplified by colony PCR using plasmid-specific primers. Colonies from conditions A-I (up to 10 per condition) were analyzed. The number of amplicons with the correct size insert (about 3 kb), as identified by gel electrophoresis, is shown in Table 14. FIG. 11 shows an image of a gel electrophoresis of LacZ amplified inserts generated from assembly conditions A-I. -
TABLE 14 LacZ fragment efficiency assembly analysis.
Condition | Precursor fragments | Reaction 1 | Reaction 2 | Predicted insert size confirmed by electrophoresis
A | LacZ precursor fragments 1-3; plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 37° C. for 30 min, 16° C. for 60 min, and 80° C. for 20 min | 4/10
B | LacZ precursor fragments 1-3; plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 37° C. for 30 min, 16° C. for 60 min, and 80° C. for 20 min | 9/10
C | LacZ precursor fragments 1-3; plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 37° C. for 30 min, 16° C. for 60 min, and 80° C. for 20 min | 8/10
D | LacZ precursor fragments 1-3; plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 60° C. for 30 min, 20 cycles of 37° C. for 1 min and 16° C. for 3 min, 80° C. for 20 min, 4° C. hold | 10/10
E | LacZ precursor fragments 1-3; plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 60° C. for 30 min, 20 cycles of 37° C. for 1 min and 16° C. for 3 min, 80° C. for 20 min, 4° C. hold | 7/10
F | LacZ precursor fragments 1-3; plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 60° C. for 30 min, 20 cycles of 37° C. for 1 min and 16° C. for 3 min, 80° C. for 20 min, 4° C. hold | 9/10
G | LacZ precursor fragments 1-3; plasmid precursor fragment B | Incubate fragments with nicking enzyme Nb.BsrDI, ATP and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, T7 ligase, and buffer at 37° C. for 60 min, 16° C. for 60 min, and 80° C. for 20 min | 0/10
H | plasmid precursor fragment A | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 37° C. for 60 min, 16° C. for 60 min, and 80° C. for 20 min | 0/4
I | plasmid precursor fragment B | Incubate fragments with nicking enzyme Nb.BsrDI and buffer at 65° C. for 60 min | Incubate reaction 1 with USER, ATP, T7 ligase, and buffer at 37° C. for 60 min, 16° C. for 60 min, and 80° C. for 20 min | 0/4
- An enzyme of interest having an activity to be improved is selected. Specific amino acid residues relevant to enzyme activity and stability are identified. The nucleic acid sequence encoding the enzyme is obtained. Bases corresponding to the specific amino acid residues are identified, and the nucleic acid is partitioned into fragments such that each fragment spans a single base position corresponding to a specific amino acid residue.
- Target nucleic acid fragments are synthesized such that identified bases corresponding to the specific amino acid residues are indeterminate. Target nucleic acid fragments are amplified using a uridine primer and treated with a sequence adjacent nick enzyme and a uridine-specific nick enzyme. Cleaved end sequence is removed and target nucleic acid fragments are assembled to generate a target nucleic acid library. Aliquots of the library are sequenced to confirm success of the assembly, and aliquoted molecules of the library are individually cloned and transformed into a host cell for expression. Expressed enzymes are isolated and assayed for activity and stability.
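The size of such a library grows multiplicatively with the number of indeterminate positions. The following sketch, offered only as an illustration, enumerates the variants of a fragment in which indeterminate bases are written as N and counts the assembled combinations. The fragment sequences and the function names `expand` and `library_size` are hypothetical.

```python
from itertools import product


def expand(template, alphabet="ACGT"):
    """Yield every sequence obtained by replacing each N in template with a base."""
    slots = [alphabet if base == "N" else base for base in template]
    for combo in product(*slots):
        yield "".join(combo)


def library_size(fragments, alphabet="ACGT"):
    """Number of distinct assembled sequences for fragments containing N positions."""
    size = 1
    for fragment in fragments:
        size *= len(alphabet) ** fragment.count("N")
    return size


# Hypothetical fragments, each spanning at most one indeterminate position.
fragments = ["ACNGT", "TTNA", "GGC"]

print(list(expand("ANT")))
print(library_size(fragments))
```

Enumerating variants per fragment rather than per full-length gene keeps the synthesis count small: here two fragments with four variants each yield sixteen assembled sequences.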
- Enzymes having increased stability due to single point mutations are identified. Enzymes having increased activity due to single point mutations are identified. Also identified are enzymes having increased stability and/or activity due to combinations of point mutations, each of which individually is detrimental to enzyme activity or stability, and which would be unlikely to be pursued by more traditional, ‘one mutation at a time’ approaches.
- A 3 kb double-stranded target gene of predetermined sequence is prepared using a de novo synthesis and assembly method described herein. The predetermined gene sequence is first analyzed to identify fragments which will be synthesized and assembled into the final gene product.
- Determination of Gene Fragment Sequences
- The target nucleic acid sequence is analyzed to identify sticky end motifs having an ANNNNT sequence (SEQ ID NO.: 2). Two of the identified motifs are selected according to their position in the sequence, so that the first identified motif is located at roughly 1 kb and the second identified motif is located at roughly 2 kb. The two selected motifs thus partition the target sequence into three precursor fragments of approximately 1 kb each, denoted fragments 1, 2, and 3.
- De Novo Synthesis of Precursor Fragments
-
Fragments fragment 1 and the 3′ end of fragment 3, and connecting adaptor sequences are added to the 3′ end of fragment 1, the 5′ and 3′ ends of fragment 2, and the 5′ end of fragment 3. The connecting adaptor sequences located at the 3′ end of fragment 1 and the 5′ end of fragment 2 comprise the sequence of the first identified ANNNNT motif (SEQ ID NO.: 2). The connecting adaptor sequences located at the 3′ end of fragment 2 and the 5′ end of fragment 3 comprise the sequence of the second identified ANNNNT motif (SEQ ID NO.: 2). Each connecting adaptor comprises, in order: a sequence of 1-10 bases (adaptor bases), a first nicking enzyme recognition site comprising a first nicking enzyme cleavage site on one strand, and a sticky end motif. The adaptor bases and first nicking enzyme cleavage site comprise the same bases for each connecting adaptor. -
Fragment 1 prepared with adaptor sequence comprises, in 5′ to 3′ order: a first outer adaptor sequence; fragment 1 sequence; and a first connecting adaptor sequence comprising, in 5′ to 3′ order, the first ANNNNT motif (SEQ ID NO.: 2), the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on a first strand, and the sequence of adaptor bases. Fragment 2 prepared with adaptor sequence comprises, in 5′ to 3′ order: the first connecting adaptor sequence comprising, in 5′ to 3′ order, the sequence of adaptor bases, the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on a second strand, and the first ANNNNT motif (SEQ ID NO.: 2); fragment 2 sequence; and a second connecting adaptor sequence comprising, in 5′ to 3′ order, the second ANNNNT motif (SEQ ID NO.: 2), the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on a first strand, and the sequence of adaptor bases. Fragment 3 prepared with adaptor sequence comprises, in 5′ to 3′ order: the second connecting adaptor sequence comprising, in 5′ to 3′ order, the sequence of adaptor bases, the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on a second strand, and the second ANNNNT motif (SEQ ID NO.: 2); fragment 3 sequence; and a second outer adaptor sequence.
- Generation of Fragments with Two Nicking Enzyme Cleavage Sites
- Each of the prepared fragments is amplified to incorporate a second nicking enzyme cleavage site on a single strand of each fragment such that the second nicking enzyme cleavage site is located from 1 to 10 bases away from the first nicking enzyme cleavage site of each fragment and on a different strand from the first nicking enzyme cleavage site. The second nicking enzyme cleavage site comprises a non-canonical base. The non-canonical base is added to each fragment during PCR via a primer comprising the sequence of adaptor bases, the first nicking enzyme recognition site, a sticky end motif ANNNNT (SEQ ID NO.: 2), and the non-canonical base.
-
Fragment 1 comprises, in 5′ to 3′ order: the first outer adaptor sequence, fragment 1 sequence, the non-canonical base on the second strand, the first ANNNNT motif (SEQ ID NO.: 2), the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on the first strand, and the sequence of adaptor bases. Fragment 2 comprises, in 5′ to 3′ order: the sequence of adaptor bases, the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on the second strand, the first ANNNNT motif (SEQ ID NO.: 2), the non-canonical base on the first strand, fragment 2 sequence, the non-canonical base on the second strand, the second ANNNNT motif (SEQ ID NO.: 2), the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on the first strand, and the sequence of adaptor bases. Fragment 3 comprises, in 5′ to 3′ order: the sequence of adaptor bases, the first nicking enzyme recognition site comprising the first nicking enzyme cleavage site on the second strand, the second ANNNNT motif (SEQ ID NO.: 2), the non-canonical base on the first strand, fragment 3 sequence, and a second outer adaptor sequence.
- Cleavage of Fragments with Two Nicking Enzymes
- Each of the three fragments comprising two nicking enzyme cleavage sites is treated with a first nicking enzyme and a second nicking enzyme. The first nicking enzyme creates a nick at the first nicking enzyme cleavage site by cleaving a single strand of the fragment. The second nicking enzyme creates a nick by removing the non-canonical base from the fragment. The enzyme-treated fragments have an overhang comprising a sticky end motif ANNNNT (SEQ ID NO.: 2).
- Enzyme-treated
fragment 1 comprises, in 5′ to 3′ order: the first outer adaptor, fragment 1 sequence, and on the first strand, the first sticky end motif ANNNNT (SEQ ID NO.: 2). Enzyme-treated fragment 2 comprises, in 5′ to 3′ order: on the second strand, the first sticky end motif ANNNNT (SEQ ID NO.: 2); fragment 2 sequence; and on the first strand, the second sticky end motif ANNNNT (SEQ ID NO.: 2). Enzyme-treated fragment 3 comprises, in 5′ to 3′ order: on the second strand, the second sticky end motif ANNNNT (SEQ ID NO.: 2); fragment 3 sequence; and the second outer adaptor.
- Assembly of Cleaved Fragments
- The first sticky ends of fragments 1 and 2 are annealed and the second sticky ends of fragments 2 and 3 are annealed, generating an assembled product comprising, in order: the first outer adaptor, fragment 1 sequence, the first sticky end motif, fragment 2 sequence, the second sticky end motif, fragment 3 sequence, and the second outer adaptor. The assembled product comprises the predetermined sequence of the target gene without any scar sites. The assembled product is amplified using primers to the outer adaptors to generate desired quantities of the target gene.
- A double-stranded target gene of predetermined sequence is prepared using a de novo synthesis and assembly method described herein. The predetermined gene sequence is first analyzed to identify fragments which will be synthesized and assembled into the final gene product.
- Determination of Gene Fragment Sequences
- The target nucleic acid sequence is analyzed to identify sticky end motifs. Three of the identified motifs are selected according to their position in the sequence, so that the motifs partition the predetermined sequence into four fragments having roughly similar sequence lengths. The sticky end motifs are designated sticky end motif x, sticky end motif y, and sticky end motif z. The precursor fragments are designated fragment 1, fragment 2, fragment 3, and fragment 4. Accordingly, the predetermined sequence comprises, in order: fragment 1 sequence, sticky end motif x, fragment 2 sequence, sticky end motif y, fragment 3 sequence, sticky end motif z, and fragment 4 sequence.
- De Novo Synthesis of Precursor Fragments
- Fragments 1-4 are prepared by de novo synthesis and PCA assembly of oligonucleic acids. During this process connecting adaptor sequences are added to the 3′ end of fragment 1, the 5′ and 3′ ends of fragments 2 and 3, and the 5′ end of fragment 4. The connecting adaptor sequences located at the 3′ end of fragment 1 and the 5′ end of fragment 2 comprise sticky end motif x. The connecting adaptor sequences located at the 3′ end of fragment 2 and the 5′ end of fragment 3 comprise sticky end motif y. The connecting adaptor sequences located at the 3′ end of fragment 3 and the 5′ end of fragment 4 comprise sticky end motif z. Each connecting adaptor comprises, in order: a sequence of 1-10 bases (adaptor bases), a first nicking enzyme recognition site comprising a first nicking enzyme cleavage site on a first strand, and a sticky end motif comprising a second nicking enzyme cleavage site on the 3′ base of the second strand. The second nicking enzyme cleavage site comprises the non-canonical base uracil. The connecting adaptor sequences are positioned at the 5′ and/or 3′ end of a fragment such that the 3′ uracil of the connecting adaptor is positioned directly next to the 5′ and/or 3′ end of the fragment. The adaptor bases and first nicking enzyme cleavage site comprise the same bases for each connecting adaptor.
- Precursor fragment 1 comprises fragment 1 sequence and a first connecting adaptor comprising sticky end motif x. Precursor fragment 2 comprises the first connecting adaptor comprising sticky end motif x, fragment 2 sequence, and a second connecting adaptor comprising sticky end motif y. Precursor fragment 3 comprises the second connecting adaptor comprising sticky end motif y, fragment 3 sequence, and a third connecting adaptor comprising sticky end motif z. Precursor fragment 4 comprises the third connecting adaptor comprising sticky end motif z and fragment 4 sequence.
- Cleavage of Fragments with Two Nicking Enzymes
- Each of the four precursor fragments comprises one or two connecting adaptors, each connecting adaptor comprising: a first nicking enzyme recognition site comprising a first nicking enzyme cleavage site on a first strand, and a uracil base on the second strand. The precursor fragments are treated with a first nicking enzyme, which recognizes the first nicking enzyme recognition site and generates a nick at the first nicking enzyme cleavage site. The precursor fragments are also treated with a second nicking enzyme, USER, which excises the uracil from the second strand, generating a nick where the uracil used to reside. USER comprises Uracil DNA glycosylase (UDG) and DNA glycosylase-lyase Endonuclease VIII (EndoVIII). Each precursor fragment now comprises an overhang consisting of a sticky end motif.
- Precursor fragment 1 now comprises fragment 1 sequence and a 5′ overhang consisting of sequence motif x. Precursor fragment 2 now comprises a 3′ overhang consisting of sequence motif x, fragment 2 sequence, and a 5′ overhang consisting of sequence motif y. Precursor fragment 3 now comprises a 3′ overhang consisting of sequence motif y, fragment 3 sequence, and a 5′ overhang consisting of sequence motif z. Precursor fragment 4 now comprises a 3′ overhang consisting of sequence motif z and fragment 4 sequence.
- Assembly of Cleaved Fragments
- The sticky end motif x overhangs of precursor fragments 1 and 2, the sticky end motif y overhangs of precursor fragments 2 and 3, and the sticky end motif z overhangs of precursor fragments 3 and 4 are annealed, generating a product comprising, in order: fragment 1 sequence, sticky end motif x, fragment 2 sequence, sticky end motif y, fragment 3 sequence, sticky end motif z, and fragment 4 sequence.
- The product to be assembled comprises the predetermined sequence of the target gene without any scar sites. The assembled product is optionally amplified to generate desired quantities of the target gene. Alternatively, precursor fragments are generated in sufficient quantities such that amplification of the final gene is unnecessary. Such instances allow for the generation of large genes that cannot be amplified using traditional amplification methods.
- A population of precursor nucleic acid fragments is amplified using a set of universal primer pairs, wherein each universal primer introduces the non-canonical base uracil into a single strand of a precursor nucleic acid.
- Design of Universal Primers
- A predetermined sequence of a target gene is analyzed to select sticky end motifs that partition the gene into precursor fragments of desired size. The sticky end motifs have the sequence ANNNNT (SEQ ID NO.: 2), where each selected sticky end motif has a different NNNN sequence. The NNNN sequence for each selected sticky end motif is noted.
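The motif selection described above can be illustrated with a minimal Python sketch. It is not the method of the disclosure: the function names `find_motifs` and `select_motifs`, the greedy "closest to an evenly spaced cut point" rule, and the 6-base minimum spacing are all assumptions made for illustration. The sketch only enforces what the text states, namely that each selected ANNNNT motif has a different NNNN sequence.

```python
import re


def find_motifs(seq):
    """Return (position, NNNN) for every ANNNNT site, overlapping sites included."""
    # A zero-width lookahead lets the regex report overlapping occurrences.
    return [(m.start(), m.group(1))
            for m in re.finditer(r"(?=A([ACGT]{4})T)", seq)]


def select_motifs(seq, n_fragments):
    """Pick n_fragments - 1 motifs near evenly spaced cut points.

    Hypothetical greedy rule: for each ideal cut point, take the closest
    unused site whose NNNN core has not been chosen yet.
    """
    sites = find_motifs(seq)
    chosen, used_cores = [], set()
    for k in range(1, n_fragments):
        target = k * len(seq) // n_fragments
        candidates = [
            s for s in sites
            if s[1] not in used_cores
            and all(abs(s[0] - c[0]) >= 6 for c in chosen)  # no overlapping picks
        ]
        if not candidates:
            raise ValueError("not enough distinct ANNNNT motifs in the sequence")
        best = min(candidates, key=lambda s: abs(s[0] - target))
        chosen.append(best)
        used_cores.add(best[1])
    return sorted(chosen)


if __name__ == "__main__":
    toy = "CCCCC" + "AACGTT" + "CCCCC" + "AGGCCT" + "CCCCC"
    print(select_motifs(toy, 3))
```

In practice a design tool would likely apply further filters (for example on fragment length), which are outside the scope of this sketch.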
- Universal forward primers are synthesized to comprise, in 5′ to 3′ order: 1-20 forward adaptor bases, a nicking enzyme recognition site, and a sticky end motif comprising ANNNNU (SEQ ID NO.: 53). Subpopulations of forward primers are generated so that each subpopulation comprises the NNNN sequence of a different sticky end motif selected from the target gene.
- Universal reverse primers are synthesized to comprise, in 5′ to 3′ order: 1-20 reverse adaptor bases, a nicking enzyme recognition site, and a sticky end motif comprising ANNNNU (SEQ ID NO.: 53). Subpopulations of reverse primers are generated so that each subpopulation comprises the reverse complement of the NNNN sequence of a different sticky end motif selected from the target gene.
- The nicking enzyme recognition site sequence in the universal primers is designed such that when the universal primers are incorporated into precursor fragments during an amplification reaction, the reverse complement sequence of the nicking enzyme recognition site sequence in the universal primer comprises a nicking enzyme cleavage site. Accordingly, upon treating with a nicking enzyme specific for the nicking enzyme cleavage site, a nick is generated on a strand of the fragment not comprising the uracil base.
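The primer layout described above (adaptor bases, then a nicking enzyme recognition site, then ANNNNU) can be sketched in Python as simple string composition. This is a hedged illustration only: the function names are hypothetical, and the adaptor and recognition-site strings in the usage example are arbitrary placeholders chosen for illustration, since the text does not specify them here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def _check(adaptor, nnnn):
    if not 1 <= len(adaptor) <= 20:
        raise ValueError("adaptor must be 1-20 bases")
    if len(nnnn) != 4 or set(nnnn) - set("ACGT"):
        raise ValueError("NNNN must be four bases from A, C, G, T")


def forward_primer(adaptor, nick_site, nnnn):
    """5'-adaptor, recognition site, A + NNNN + U-3' (U replaces the motif's T)."""
    _check(adaptor, nnnn)
    return adaptor + nick_site + "A" + nnnn + "U"


def reverse_primer(adaptor, nick_site, nnnn):
    """Same layout, but carrying the reverse complement of NNNN."""
    _check(adaptor, nnnn)
    return adaptor + nick_site + "A" + reverse_complement(nnnn) + "U"


if __name__ == "__main__":
    # Placeholder adaptor and recognition-site sequences (assumptions).
    print(forward_primer("ACGTAC", "GGTTCC", "ACGG"))
    print(reverse_primer("TTGACA", "GGTTCC", "ACGG"))
```

Because the reverse complement of ANNNNT again begins with A and ends with T, the reverse primer keeps the same A...U frame and only the four-base core changes.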
- Amplification of Precursor Nucleic Acid Fragments with Universal Primers
- Precursor fragments partitioned by the selected sticky end motifs are assembled from smaller, synthesized nucleic acids. The precursor fragments are amplified using the set of universal primers comprising the sticky end motif ANNNNT (SEQ ID NO.: 2), wherein the T is replaced with the non-canonical base uracil. The amplified precursor fragments each comprise a nicking enzyme recognition site comprising a nicking enzyme cleavage site on one strand and a uracil base on the other strand.
- Enzymatic Digestion of Precursor Fragments Amplified with Universal Primers
- Precursor fragments amplified with universal primers are treated with a first nicking enzyme to create a nick at the nicking enzyme cleavage site and a second nicking enzyme comprising UDG and Endonuclease VIII activity to generate a nick at the uracil base site. The precursor fragments comprise overhangs with the sticky end motif ANNNNT (SEQ ID NO.: 2).
- Assembly of Cleaved Fragments
- Fragments comprising complementary overhangs are annealed to generate the target gene. The target gene comprises the predetermined sequence, with no extraneous scar sites.
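The ordering logic of this assembly step can be sketched in Python under a deliberately simplified model. Each digested fragment is represented as a tuple (left motif, body, right motif), with `None` marking an outer end, and two fragments are treated as annealing when they share the same motif. The `assemble` function and this data model are assumptions for illustration; they abstract away strand polarity and ligation.

```python
def assemble(fragments):
    """Join digested fragments by matching sticky end motifs.

    fragments: iterable of (left_motif, body, right_motif); None marks an
    outer end. Each shared motif appears exactly once in the product, so
    the result carries no extra (scar) bases.
    """
    remaining = list(fragments)
    starts = [f for f in remaining if f[0] is None]
    if len(starts) != 1:
        raise ValueError("expected exactly one fragment without a left overhang")
    current = starts[0]
    remaining.remove(current)
    product = current[1]
    while current[2] is not None:
        partners = [f for f in remaining if f[0] == current[2]]
        if len(partners) != 1:
            # A reused motif would make the assembly order ambiguous.
            raise ValueError("missing or ambiguous partner for " + current[2])
        nxt = partners[0]
        product += current[2] + nxt[1]
        remaining.remove(nxt)
        current = nxt
    if remaining:
        raise ValueError("unassembled fragments remain")
    return product


if __name__ == "__main__":
    target = "CCCCC" + "AACGTT" + "GGGGG" + "AGGCCT" + "TTTTT"
    pool = [
        ("AACGTT", "GGGGG", "AGGCCT"),
        (None, "CCCCC", "AACGTT"),
        ("AGGCCT", "TTTTT", None),
    ]
    assert assemble(pool) == target
```

The ambiguity check reflects why each selected sticky end motif is given a different NNNN sequence: with unique motifs, only one fragment order is possible.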
- A double-stranded target gene of predetermined sequence is prepared using a de novo synthesis and assembly method described herein. The predetermined gene sequence is first analyzed to identify fragments which will be synthesized and assembled into the final gene product.
- Determination of Gene Fragment Sequences
- The target nucleic acid sequence is analyzed to identify sticky end motifs having a Type II restriction endonuclease recognition sequence. Three of the identified motifs are selected according to their position in the sequence, so that the motifs partition the predetermined sequence into four fragments having roughly similar sequence lengths of about 200 kb. The sticky end motifs are designated sticky end motif x, sticky end motif y, and sticky end motif z. The precursor fragments are designated fragment 1, fragment 2, fragment 3, and fragment 4. Accordingly, the predetermined sequence comprises, in order: fragment 1 sequence, sticky end motif x, fragment 2 sequence, sticky end motif y, fragment 3 sequence, sticky end motif z, and fragment 4 sequence.
- De Novo Synthesis of Precursor Fragments
- Precursor fragments 1-4 are prepared by de novo synthesis and PCA assembly of oligonucleic acids. During this process connecting adaptor sequences are added to the 3′ end of fragment 1, the 5′ and 3′ ends of fragments 2 and 3, and the 5′ end of fragment 4. The connecting adaptor sequences located at the 3′ end of fragment 1 and the 5′ end of fragment 2 comprise sticky end motif x. The connecting adaptor sequences located at the 3′ end of fragment 2 and the 5′ end of fragment 3 comprise sticky end motif y. The connecting adaptor sequences located at the 3′ end of fragment 3 and the 5′ end of fragment 4 comprise sticky end motif z. Each connecting adaptor comprises a sequence of 1-10 adaptor bases and a sticky end motif comprising a Type II restriction endonuclease recognition sequence. Also during preparation of precursor fragments 1-4, outer adaptors comprising 1-10 adaptor bases are added to the 5′ and 3′ ends of fragments 1 and 4, respectively.
- Precursor fragment 1 comprises outer adaptor sequence 1, fragment 1 sequence, and a first connecting adaptor comprising sticky end motif x. Precursor fragment 2 comprises the first connecting adaptor comprising sticky end motif x, fragment 2 sequence, and a second connecting adaptor comprising sticky end motif y. Precursor fragment 3 comprises the second connecting adaptor comprising sticky end motif y, fragment 3 sequence, and a third connecting adaptor comprising sticky end motif z. Precursor fragment 4 comprises the third connecting adaptor comprising sticky end motif z, fragment 4 sequence, and outer adaptor sequence 2.
- Cleavage of Fragments with Type II Restriction Enzymes
- Each of the four precursor fragments comprises one or two connecting adaptors, each connecting adaptor having a sticky end motif comprising a Type II restriction endonuclease recognition sequence. The precursor fragments are treated with three Type II restriction enzymes, each enzyme specific for the Type II recognition sequence in one of sticky end motifs x, y, and z, to generate four precursor fragments with sticky ends.
- Assembly of Cleaved Fragments
- The sticky end motif x overhangs of precursor fragments 1 and 2, the sticky end motif y overhangs of precursor fragments 2 and 3, and the sticky end motif z overhangs of precursor fragments 3 and 4 are annealed, generating a product comprising, in order: fragment 1 sequence, sticky end motif x, fragment 2 sequence, sticky end motif y, fragment 3 sequence, sticky end motif z, and fragment 4 sequence. The product to be assembled comprises the predetermined sequence of the target gene without any scar sites.
- While preferred embodiments of the present disclosure have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the disclosure. It should be understood that various alternatives to the embodiments described herein may be employed.
Claims (17)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/320,127 US20220064628A1 (en) | 2015-02-04 | 2021-05-13 | Compositions and methods for synthetic gene assembly |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562112022P | 2015-02-04 | 2015-02-04 | |
PCT/US2016/016636 WO2016126987A1 (en) | 2015-02-04 | 2016-02-04 | Compositions and methods for synthetic gene assembly |
US15/154,879 US9677067B2 (en) | 2015-02-04 | 2016-05-13 | Compositions and methods for synthetic gene assembly |
US15/433,909 US20170159044A1 (en) | 2015-02-04 | 2017-02-15 | Compositions and methods for synthetic gene assembly |
US16/530,717 US20190352635A1 (en) | 2015-02-04 | 2019-08-02 | Compositions and methods for synthetic gene assembly |
US17/320,127 US20220064628A1 (en) | 2015-02-04 | 2021-05-13 | Compositions and methods for synthetic gene assembly |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/530,717 Continuation US20190352635A1 (en) | 2015-02-04 | 2019-08-02 | Compositions and methods for synthetic gene assembly |
Publications (1)
Publication Number | Publication Date |
---|---|
US20220064628A1 true US20220064628A1 (en) | 2022-03-03 |
Family
ID=56564711
Family Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/154,879 Active US9677067B2 (en) | 2015-02-04 | 2016-05-13 | Compositions and methods for synthetic gene assembly |
US15/433,909 Abandoned US20170159044A1 (en) | 2015-02-04 | 2017-02-15 | Compositions and methods for synthetic gene assembly |
US16/530,717 Abandoned US20190352635A1 (en) | 2015-02-04 | 2019-08-02 | Compositions and methods for synthetic gene assembly |
US17/320,127 Pending US20220064628A1 (en) | 2015-02-04 | 2021-05-13 | Compositions and methods for synthetic gene assembly |
Family Applications Before (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/154,879 Active US9677067B2 (en) | 2015-02-04 | 2016-05-13 | Compositions and methods for synthetic gene assembly |
US15/433,909 Abandoned US20170159044A1 (en) | 2015-02-04 | 2017-02-15 | Compositions and methods for synthetic gene assembly |
US16/530,717 Abandoned US20190352635A1 (en) | 2015-02-04 | 2019-08-02 | Compositions and methods for synthetic gene assembly |
Country Status (3)
Country | Link |
---|---|
US (4) | US9677067B2 (en) |
CA (1) | CA2975855A1 (en) |
WO (1) | WO2016126987A1 (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11452980B2 (en) | 2013-08-05 | 2022-09-27 | Twist Bioscience Corporation | De novo synthesized gene libraries |
US11492728B2 (en) | 2019-02-26 | 2022-11-08 | Twist Bioscience Corporation | Variant nucleic acid libraries for antibody optimization |
US11492665B2 (en) | 2018-05-18 | 2022-11-08 | Twist Bioscience Corporation | Polynucleotides, reagents, and methods for nucleic acid hybridization |
US11492727B2 (en) | 2019-02-26 | 2022-11-08 | Twist Bioscience Corporation | Variant nucleic acid libraries for GLP1 receptor |
US11512347B2 (en) | 2015-09-22 | 2022-11-29 | Twist Bioscience Corporation | Flexible substrates for nucleic acid synthesis |
US11550939B2 (en) | 2017-02-22 | 2023-01-10 | Twist Bioscience Corporation | Nucleic acid based data storage using enzymatic bioencryption |
US11562103B2 (en) | 2016-09-21 | 2023-01-24 | Twist Bioscience Corporation | Nucleic acid based data storage |
US11691118B2 (en) | 2015-04-21 | 2023-07-04 | Twist Bioscience Corporation | Devices and methods for oligonucleic acid library synthesis |
US11697668B2 (en) | 2015-02-04 | 2023-07-11 | Twist Bioscience Corporation | Methods and devices for de novo oligonucleic acid assembly |
US11745159B2 (en) | 2017-10-20 | 2023-09-05 | Twist Bioscience Corporation | Heated nanowells for polynucleotide synthesis |
US11807956B2 (en) | 2015-09-18 | 2023-11-07 | Twist Bioscience Corporation | Oligonucleic acid variant libraries and synthesis thereof |
US11970697B2 (en) | 2020-10-19 | 2024-04-30 | Twist Bioscience Corporation | Methods of synthesizing oligonucleotides using tethered nucleotides |
US12018065B2 (en) | 2020-04-27 | 2024-06-25 | Twist Bioscience Corporation | Variant nucleic acid libraries for coronavirus |
US12091777B2 (en) | 2019-09-23 | 2024-09-17 | Twist Bioscience Corporation | Variant nucleic acid libraries for CRTH2 |
Families Citing this family (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2016126987A1 (en) | 2015-02-04 | 2016-08-11 | Twist Bioscience Corporation | Compositions and methods for synthetic gene assembly |
EP3384077A4 (en) | 2015-12-01 | 2019-05-08 | Twist Bioscience Corporation | Functionalized surfaces and preparation thereof |
JP6854340B2 (en) | 2016-08-22 | 2021-04-07 | ツイスト バイオサイエンス コーポレーション | Denovo Synthesized Nucleic Acid Library |
CN107760742B (en) * | 2016-08-23 | 2022-10-11 | 南京金斯瑞生物科技有限公司 | Synthesis method of gene rich in AT or GC |
WO2018048827A1 (en) * | 2016-09-07 | 2018-03-15 | Massachusetts Institute Of Technology | Rna-guided endonuclease-based dna assembly |
US10907274B2 (en) | 2016-12-16 | 2021-02-02 | Twist Bioscience Corporation | Variant libraries of the immunological synapse and synthesis thereof |
CA3056388A1 (en) | 2017-03-15 | 2018-09-20 | Twist Bioscience Corporation | Variant libraries of the immunological synapse and synthesis thereof |
WO2018231872A1 (en) * | 2017-06-12 | 2018-12-20 | Twist Bioscience Corporation | Methods for seamless nucleic acid assembly |
WO2018231864A1 (en) * | 2017-06-12 | 2018-12-20 | Twist Bioscience Corporation | Methods for seamless nucleic acid assembly |
US11407837B2 (en) | 2017-09-11 | 2022-08-09 | Twist Bioscience Corporation | GPCR binding proteins and synthesis thereof |
EP3701023A4 (en) * | 2017-10-27 | 2021-07-28 | Twist Bioscience Corporation | Systems and methods for polynucleotide scoring |
CA3080179A1 (en) * | 2017-11-14 | 2019-05-23 | Oregon Health & Science University | Nuclear-targeted dna repair enzymes and methods of use |
WO2019136175A1 (en) | 2018-01-04 | 2019-07-11 | Twist Bioscience Corporation | Dna-based digital information storage |
JP7029138B2 (en) * | 2018-03-05 | 2022-03-03 | 国立大学法人埼玉大学 | A method for producing a linked nucleic acid fragment, a linked nucleic acid fragment, and a library composed of the linked nucleic acid fragment. |
EP3613855A1 (en) * | 2018-08-23 | 2020-02-26 | Clariant Produkte (Deutschland) GmbH | Method for the production of a nucleic acid library |
CN111378645B (en) * | 2018-12-27 | 2020-12-01 | 江苏金斯瑞生物科技有限公司 | Gene synthesis method |
CA3131514A1 (en) * | 2019-02-25 | 2020-09-03 | Twist Bioscience Corporation | Compositions and methods for next generation sequencing |
CN114026231A (en) * | 2019-04-10 | 2022-02-08 | 里本生物实验室有限责任公司 | Polynucleotide libraries |
GB201905651D0 (en) * | 2019-04-24 | 2019-06-05 | Lightbio Ltd | Nucleic acid constructs and methods for their manufacture |
AU2020298294A1 (en) * | 2019-06-21 | 2022-02-17 | Twist Bioscience Corporation | Barcode-based nucleic acid sequence assembly |
US20220290163A1 (en) * | 2019-07-25 | 2022-09-15 | Bgi Geneland Scientific Co., Ltd. | Method for manipulating terminals of double stranded dna |
WO2021055760A1 (en) | 2019-09-18 | 2021-03-25 | Intergalactic Therapeutics, Inc. | Synthetic dna vectors and methods of use |
CN115867665A (en) * | 2020-06-15 | 2023-03-28 | 博德研究所 | Chimeric amplification subarray sequencing |
EP4147712A1 (en) | 2021-09-13 | 2023-03-15 | OncoDNA | Method to generate a double-stranded dna pool encoding neoantigens of a tumor of a patient |
EP4401762A1 (en) | 2021-09-13 | 2024-07-24 | OncoDNA | Method to generate a double-stranded dna pool encoding neoantigens of a tumor of a patient |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1992006200A1 (en) * | 1990-09-28 | 1992-04-16 | F. Hoffmann-La-Roche Ag | 5' to 3' exonuclease mutations of thermostable dna polymerases |
US20130254934A1 (en) * | 2010-12-10 | 2013-09-26 | Takeshi Nakano | Disease-resistant plant and method for preparing the same |
US20160230175A1 (en) * | 2015-02-11 | 2016-08-11 | Agilent Technologies, Inc. | Methods and compositions for rapid seamless dna assembly |
US20180355351A1 (en) * | 2017-06-12 | 2018-12-13 | Twist Bioscience Corporation | Methods for seamless nucleic acid assembly |
Family Cites Families (609)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3549368A (en) | 1968-07-02 | 1970-12-22 | Ibm | Process for improving photoresist adhesion |
US3920714A (en) | 1972-11-16 | 1975-11-18 | Weber Heinrich | Process for the production of polymeric hydrocarbons with reactive silyl side groups |
GB1550867A (en) | 1975-08-04 | 1979-08-22 | Hughes Aircraft Co | Positioning method and apparatus for fabricating microcircuit devices |
US4415732A (en) | 1981-03-27 | 1983-11-15 | University Patents, Inc. | Phosphoramidite compounds and processes |
EP0090789A1 (en) | 1982-03-26 | 1983-10-05 | Monsanto Company | Chemical DNA synthesis |
JPS59224123A (en) | 1983-05-20 | 1984-12-17 | Oki Electric Ind Co Ltd | Alignment mark for wafer |
JPS61141761A (en) | 1984-12-12 | 1986-06-28 | Kanegafuchi Chem Ind Co Ltd | Curable composition |
US5242794A (en) | 1984-12-13 | 1993-09-07 | Applied Biosystems, Inc. | Detection of specific sequences in nucleic acids |
US4613398A (en) | 1985-06-06 | 1986-09-23 | International Business Machines Corporation | Formation of etch-resistant resists through preferential permeation |
US4981797A (en) | 1985-08-08 | 1991-01-01 | Life Technologies, Inc. | Process of producing highly transformable cells and cells produced thereby |
US4726877A (en) | 1986-01-22 | 1988-02-23 | E. I. Du Pont De Nemours And Company | Methods of using photosensitive compositions containing microgels |
US4808511A (en) | 1987-05-19 | 1989-02-28 | International Business Machines Corporation | Vapor phase photoresist silylation process |
JPH07113774B2 (en) | 1987-05-29 | 1995-12-06 | 株式会社日立製作所 | Pattern formation method |
US4988617A (en) | 1988-03-25 | 1991-01-29 | California Institute Of Technology | Method of detecting a nucleotide change in nucleic acids |
US5700637A (en) | 1988-05-03 | 1997-12-23 | Isis Innovation Limited | Apparatus and method for analyzing polynucleotide sequences and method of generating oligonucleotide arrays |
US5556750A (en) | 1989-05-12 | 1996-09-17 | Duke University | Methods and kits for fractionating a population of DNA molecules based on the presence or absence of a base-pair mismatch utilizing mismatch repair systems |
US5459039A (en) | 1989-05-12 | 1995-10-17 | Duke University | Methods for mapping genetic mutations |
US6008031A (en) | 1989-05-12 | 1999-12-28 | Duke University | Method of analysis and manipulation of DNA utilizing mismatch repair systems |
US5102797A (en) | 1989-05-26 | 1992-04-07 | Dna Plant Technology Corporation | Introduction of heterologous genes into bacteria using transposon flanked expression cassette and a binary vector system |
US6309822B1 (en) | 1989-06-07 | 2001-10-30 | Affymetrix, Inc. | Method for comparing copy number of nucleic acid sequences |
US5143854A (en) | 1989-06-07 | 1992-09-01 | Affymax Technologies N.V. | Large scale photolithographic solid phase synthesis of polypeptides and receptor binding screening thereof |
US5744101A (en) | 1989-06-07 | 1998-04-28 | Affymax Technologies N.V. | Photolabile nucleoside protecting groups |
US6040138A (en) | 1995-09-15 | 2000-03-21 | Affymetrix, Inc. | Expression monitoring by hybridization to high density oligonucleotide arrays |
US5242974A (en) | 1991-11-22 | 1993-09-07 | Affymax Technologies N.V. | Polymer reversal on solid surfaces |
US5527681A (en) | 1989-06-07 | 1996-06-18 | Affymax Technologies N.V. | Immobilized molecular synthesis of systematically substituted compounds |
CA2036946C (en) | 1990-04-06 | 2001-10-16 | Kenneth V. Deugau | Indexing linkers |
US5494810A (en) | 1990-05-03 | 1996-02-27 | Cornell Research Foundation, Inc. | Thermostable ligase-mediated DNA amplifications system for the detection of genetic disease |
DE69133389T2 (en) | 1990-09-27 | 2005-06-02 | Invitrogen Corp., Carlsbad | Direct cloning of PCR amplified nucleic acids |
GB9025236D0 (en) | 1990-11-20 | 1991-01-02 | Secr Defence | Silicon-on porous-silicon;method of production |
US6582908B2 (en) | 1990-12-06 | 2003-06-24 | Affymetrix, Inc. | Oligonucleotides |
JPH06504997A (en) | 1990-12-06 | 1994-06-09 | アフィメトリックス, インコーポレイテッド | Synthesis of immobilized polymers on a very large scale |
WO1992010588A1 (en) | 1990-12-06 | 1992-06-25 | Affymax Technologies N.V. | Sequencing by hybridization of a target nucleic acid to a matrix of defined oligonucleotides |
US5455166A (en) * | 1991-01-31 | 1995-10-03 | Becton, Dickinson And Company | Strand displacement amplification |
US5137814A (en) | 1991-06-14 | 1992-08-11 | Life Technologies, Inc. | Use of exo-sample nucleotides in gene cloning |
US5449754A (en) | 1991-08-07 | 1995-09-12 | H & N Instruments, Inc. | Generation of combinatorial libraries |
US5474796A (en) | 1991-09-04 | 1995-12-12 | Protogene Laboratories, Inc. | Method and apparatus for conducting an array of chemical reactions on a support surface |
ATE293011T1 (en) | 1991-11-22 | 2005-04-15 | Affymetrix Inc A Delaware Corp | COMBINATORY STRATEGIES FOR POLYMER SYNTHESIS |
US5384261A (en) | 1991-11-22 | 1995-01-24 | Affymax Technologies N.V. | Very large scale immobilized polymer synthesis using mechanically directed flow paths |
DE69322266T2 (en) | 1992-04-03 | 1999-06-02 | Perkin-Elmer Corp., Foster City, Calif. | SAMPLES COMPOSITION AND METHOD |
JP2553322Y2 (en) | 1992-05-11 | 1997-11-05 | サンデン株式会社 | Filter feed mechanism of beverage brewing device |
US5288514A (en) | 1992-09-14 | 1994-02-22 | The Regents Of The University Of California | Solid phase and combinatorial synthesis of benzodiazepine compounds on a solid support |
JP3176444B2 (en) | 1992-10-01 | 2001-06-18 | 株式会社リコー | Aqueous ink and recording method using the same |
DE4241045C1 (en) | 1992-12-05 | 1994-05-26 | Bosch Gmbh Robert | Process for anisotropic etching of silicon |
US5395753A (en) | 1993-02-19 | 1995-03-07 | Theratech, Inc. | Method for diagnosing rheumatoid arthritis |
ES2204913T3 (en) | 1993-04-12 | 2004-05-01 | Northwestern University | METHOD FOR TRAINING OF OLIGONUCLEOTIDES. |
US7135312B2 (en) | 1993-04-15 | 2006-11-14 | University Of Rochester | Circular DNA vectors for synthesis of RNA and DNA |
CN1039623C (en) | 1993-10-22 | 1998-09-02 | 中国人民解放军军事医学科学院毒物药物研究所 | Pharmaceutical composition for preventing and treating motion sickness syndrome and preparation method thereof |
US6893816B1 (en) | 1993-10-28 | 2005-05-17 | Houston Advanced Research Center | Microfabricated, flowthrough porous apparatus for discrete detection of binding reactions |
AU700315B2 (en) | 1993-10-28 | 1998-12-24 | Houston Advanced Research Center | Microfabricated, flowthrough porous apparatus for discrete detection of binding reactions |
US6027877A (en) | 1993-11-04 | 2000-02-22 | Gene Check, Inc. | Use of immobilized mismatch binding protein for detection of mutations and polymorphisms, purification of amplified DNA samples and allele identification |
US5834252A (en) | 1995-04-18 | 1998-11-10 | Glaxo Group Limited | End-complementary polymerase reaction |
US6015880A (en) | 1994-03-16 | 2000-01-18 | California Institute Of Technology | Method and substrate for performing multiple sequential reactions on a matrix |
US5514789A (en) | 1994-04-21 | 1996-05-07 | Barrskogen, Inc. | Recovery of oligonucleotides by gas phase cleavage |
SE512382C2 (en) | 1994-04-26 | 2000-03-06 | Ericsson Telefon Ab L M | Device and method for placing elongate elements against or adjacent to a surface |
EP0706649B1 (en) | 1994-04-29 | 2001-01-03 | Perkin-Elmer Corporation | Method and apparatus for real time detection of nucleic acid amplification products |
US6287850B1 (en) | 1995-06-07 | 2001-09-11 | Affymetrix, Inc. | Bioarray chip reaction apparatus and its manufacture |
JPH10507160A (en) | 1994-06-23 | 1998-07-14 | アフィマックス テクノロジーズ エヌ.ブイ. | Photoactive compounds and methods of using the same |
US5641658A (en) | 1994-08-03 | 1997-06-24 | Mosaic Technologies, Inc. | Method for performing amplification of nucleic acid with two primers bound to a single solid support |
US5530516A (en) | 1994-10-04 | 1996-06-25 | Tamarack Scientific Co., Inc. | Large-area projection exposure system |
US6635226B1 (en) | 1994-10-19 | 2003-10-21 | Agilent Technologies, Inc. | Microanalytical device and use thereof for conducting chemical processes |
US6613560B1 (en) | 1994-10-19 | 2003-09-02 | Agilent Technologies, Inc. | PCR microreactor for amplifying DNA using microquantities of sample fluid |
US5556752A (en) | 1994-10-24 | 1996-09-17 | Affymetrix, Inc. | Surface-bound, unimolecular, double-stranded DNA |
JPH11511900A (en) | 1994-11-22 | 1999-10-12 | コンプレツクス フルイツド システムズ,インコーポレーテツド | Non-amine photoresist adhesion promoters for microelectronics applications |
US5700642A (en) | 1995-05-22 | 1997-12-23 | Sri International | Oligonucleotide sizing using immobilized cleavable primers |
US5830655A (en) | 1995-05-22 | 1998-11-03 | Sri International | Oligonucleotide sizing using cleavable primers |
US5877280A (en) | 1995-06-06 | 1999-03-02 | The Mount Sinai School Of Medicine Of The City University Of New York | Thermostable muts proteins |
US6446682B1 (en) | 1995-06-06 | 2002-09-10 | James P. Viken | Auto-loading fluid exchanger and method of use |
US5707806A (en) | 1995-06-07 | 1998-01-13 | Genzyme Corporation | Direct sequence identification of mutations by cleavage- and ligation-associated mutation-specific sequencing |
US5780613A (en) | 1995-08-01 | 1998-07-14 | Northwestern University | Covalent lock for self-assembled oligonucleotide constructs |
JP2000501615A (en) | 1995-12-15 | 2000-02-15 | アマーシャム・ライフ・サイエンス・インコーポレーテッド | Method using a mismatch repair system for detection and removal of mutant sequences generated during enzyme amplification |
US6274369B1 (en) | 1996-02-02 | 2001-08-14 | Invitrogen Corporation | Method capable of increasing competency of bacterial cell transformation |
US6706875B1 (en) | 1996-04-17 | 2004-03-16 | Affyemtrix, Inc. | Substrate preparation process |
US5869245A (en) | 1996-06-05 | 1999-02-09 | Fox Chase Cancer Center | Mismatch endonuclease and its use in identifying mutations in targeted polynucleotide strands |
US5853993A (en) | 1996-10-21 | 1998-12-29 | Hewlett-Packard Company | Signal enhancement method and kit |
WO1998022541A2 (en) | 1996-11-08 | 1998-05-28 | Ikonos Corporation | Method for coating substrates |
US5750672A (en) | 1996-11-22 | 1998-05-12 | Barrskogen, Inc. | Anhydrous amine cleavage of oligonucleotides |
WO1998029736A1 (en) | 1996-12-31 | 1998-07-09 | Genometrix Incorporated | Multiplexed molecular analysis apparatus and method |
ATE294229T1 (en) | 1997-02-12 | 2005-05-15 | Invitrogen Corp | METHOD FOR DRYING COMPETENT CELLS |
US5882496A (en) | 1997-02-27 | 1999-03-16 | The Regents Of The University Of California | Porous silicon structures with high surface area/specific pore size |
US6770748B2 (en) | 1997-03-07 | 2004-08-03 | Takeshi Imanishi | Bicyclonucleoside and oligonucleotide analogue |
AU751956B2 (en) | 1997-03-20 | 2002-09-05 | University Of Washington | Solvent for biopolymer synthesis, solvent microdroplets and methods of use |
US6419883B1 (en) | 1998-01-16 | 2002-07-16 | University Of Washington | Chemical synthesis using solvent microdroplets |
US6028189A (en) | 1997-03-20 | 2000-02-22 | University Of Washington | Solvent for oligonucleotide synthesis and methods of use |
ATE378417T1 (en) | 1997-03-21 | 2007-11-15 | Stratagene California | EXTRACTS CONTAINING POLYMERASE ENHANCING FACTOR (PEF), PEF PROTEIN COMPLEXES, ISOLATED PEF PROTEIN AND METHOD FOR PURIFICATION AND IDENTIFICATION |
US5922593A (en) | 1997-05-23 | 1999-07-13 | Becton, Dickinson And Company | Microbiological test panel and method therefor |
US6969488B2 (en) | 1998-05-22 | 2005-11-29 | Solexa, Inc. | System and apparatus for sequential processing of analytes |
DE69824586T2 (en) | 1997-06-26 | 2005-06-23 | PerSeptive Biosystems, Inc., Framingham | HIGH DENSITY SAMPLE HOLDER FOR THE ANALYSIS OF BIOLOGICAL SAMPLES |
GB9714716D0 (en) | 1997-07-11 | 1997-09-17 | Brax Genomics Ltd | Characterising nucleic acids |
US6027898A (en) | 1997-08-18 | 2000-02-22 | Transgenomic, Inc. | Chromatographic method for mutation detection using mutation site specifically acting enzymes and chemicals |
US6794499B2 (en) | 1997-09-12 | 2004-09-21 | Exiqon A/S | Oligonucleotide analogues |
US6136568A (en) | 1997-09-15 | 2000-10-24 | Hiatt; Andrew C. | De novo polynucleotide synthesis using rolling templates |
US6670127B2 (en) | 1997-09-16 | 2003-12-30 | Egea Biosciences, Inc. | Method for assembly of a polynucleotide encoding a target polypeptide |
WO1999014318A1 (en) | 1997-09-16 | 1999-03-25 | Board Of Regents, The University Of Texas System | Method for the complete chemical synthesis and assembly of genes and genomes |
US6287776B1 (en) | 1998-02-02 | 2001-09-11 | Signature Bioscience, Inc. | Method for detecting and classifying nucleic acid hybridization |
US6251588B1 (en) | 1998-02-10 | 2001-06-26 | Agilent Technologies, Inc. | Method for evaluating oligonucleotide probe sequences |
EP1054726B1 (en) | 1998-02-11 | 2003-07-30 | University of Houston, Office of Technology Transfer | Apparatus for chemical and biochemical reactions using photo-generated reagents |
EP2180309B1 (en) | 1998-02-23 | 2017-11-01 | Wisconsin Alumni Research Foundation | Apparatus for synthesis of arrays of DNA probes |
AU3601599A (en) | 1998-03-25 | 1999-10-18 | Ulf Landegren | Rolling circle replication of padlock probes |
US6284497B1 (en) | 1998-04-09 | 2001-09-04 | Trustees Of Boston University | Nucleic acid arrays and methods of synthesis |
US6376285B1 (en) | 1998-05-28 | 2002-04-23 | Texas Instruments Incorporated | Annealed porous silicon with epitaxial layer for SOI |
US6251595B1 (en) | 1998-06-18 | 2001-06-26 | Agilent Technologies, Inc. | Methods and devices for carrying out chemical reactions |
ATE313548T1 (en) | 1998-06-22 | 2006-01-15 | Affymetrix Inc | REAGENT AND METHOD FOR SOLID PHASE SYNTHESIS |
US7399844B2 (en) | 1998-07-09 | 2008-07-15 | Agilent Technologies, Inc. | Method and reagents for analyzing the nucleotide sequence of nucleic acids |
US6218118B1 (en) | 1998-07-09 | 2001-04-17 | Agilent Technologies, Inc. | Method and mixture reagents for analyzing the nucleotide sequence of nucleic acids by mass spectrometry |
US20030022207A1 (en) | 1998-10-16 | 2003-01-30 | Solexa, Ltd. | Arrayed polynucleotides and their use in genome analysis |
US6787308B2 (en) | 1998-07-30 | 2004-09-07 | Solexa Ltd. | Arrayed biomolecules and their use in sequencing |
US6222030B1 (en) | 1998-08-03 | 2001-04-24 | Agilent Technologies, Inc. | Solid phase synthesis of oligonucleotides using carbonate protecting groups and alpha-effect nucleophile deprotection |
US6107038A (en) | 1998-08-14 | 2000-08-22 | Agilent Technologies Inc. | Method of binding a plurality of chemicals on a substrate by electrophoretic self-assembly |
EP1405666B1 (en) | 1998-08-28 | 2007-03-21 | febit biotech GmbH | Substrate for methods to determine analytes and methods to manufacture such substrates |
US6258454B1 (en) | 1998-09-01 | 2001-07-10 | Agilent Technologies Inc. | Functionalization of substrate surfaces with silane mixtures |
US6461812B2 (en) | 1998-09-09 | 2002-10-08 | Agilent Technologies, Inc. | Method and multiple reservoir apparatus for fabrication of biomolecular arrays |
US6458583B1 (en) | 1998-09-09 | 2002-10-01 | Agilent Technologies, Inc. | Method and apparatus for making nucleic acid arrays |
AR021833A1 (en) | 1998-09-30 | 2002-08-07 | Applied Research Systems | METHODS OF AMPLIFICATION AND SEQUENCING OF NUCLEIC ACID |
US6399516B1 (en) | 1998-10-30 | 2002-06-04 | Massachusetts Institute Of Technology | Plasma etch techniques for fabricating silicon structures from a substrate |
US6309828B1 (en) | 1998-11-18 | 2001-10-30 | Agilent Technologies, Inc. | Method and apparatus for fabricating replicate arrays of nucleic acid molecules |
GB9900298D0 (en) | 1999-01-07 | 1999-02-24 | Medical Res Council | Optical sorting method |
US6376246B1 (en) | 1999-02-05 | 2002-04-23 | Maxygen, Inc. | Oligonucleotide mediated nucleic acid recombination |
WO2000042559A1 (en) | 1999-01-18 | 2000-07-20 | Maxygen, Inc. | Methods of populating data structures for use in evolutionary simulations |
EP1062614A1 (en) | 1999-01-19 | 2000-12-27 | Maxygen, Inc. | Methods for making character strings, polynucleotides and polypeptides |
US20070065838A1 (en) | 1999-01-19 | 2007-03-22 | Maxygen, Inc. | Oligonucleotide mediated nucleic acid recombination |
US6251685B1 (en) | 1999-02-18 | 2001-06-26 | Agilent Technologies, Inc. | Readout method for molecular biological electronically addressable arrays |
EP1153127B1 (en) | 1999-02-19 | 2006-07-26 | febit biotech GmbH | Method for producing polymers |
EP2177627B1 (en) | 1999-02-23 | 2012-05-02 | Caliper Life Sciences, Inc. | Manipulation of microparticles in microfluidic systems |
JP2002538790A (en) | 1999-03-08 | 2002-11-19 | Protogene Laboratories, Inc. | Methods and compositions for economically synthesizing and assembling long DNA sequences |
US6824866B1 (en) | 1999-04-08 | 2004-11-30 | Affymetrix, Inc. | Porous silica substrates for polymer synthesis and assays |
US6284465B1 (en) | 1999-04-15 | 2001-09-04 | Agilent Technologies, Inc. | Apparatus, systems and method for locating nucleic acids bound to surfaces |
US6469156B1 (en) | 1999-04-20 | 2002-10-22 | The United States Of America As Represented By The Department Of Health And Human Services | Rapid and sensitive method for detecting Histoplasma capsulatum |
US6221653B1 (en) | 1999-04-27 | 2001-04-24 | Agilent Technologies, Inc. | Method of performing array-based hybridization assays using thermal inkjet deposition of sample fluids |
US6518056B2 (en) | 1999-04-27 | 2003-02-11 | Agilent Technologies Inc. | Apparatus, systems and method for assaying biological materials using an annular format |
US6773676B2 (en) | 1999-04-27 | 2004-08-10 | Agilent Technologies, Inc. | Devices for performing array hybridization assays and methods of using the same |
US6300137B1 (en) | 1999-04-28 | 2001-10-09 | Agilent Technologies Inc. | Method for synthesizing a specific, surface-bound polymer uniformly over an element of a molecular array |
US6242266B1 (en) | 1999-04-30 | 2001-06-05 | Agilent Technologies Inc. | Preparation of biopolymer arrays |
US7276336B1 (en) | 1999-07-22 | 2007-10-02 | Agilent Technologies, Inc. | Methods of fabricating an addressable array of biopolymer probes |
US6323043B1 (en) | 1999-04-30 | 2001-11-27 | Agilent Technologies, Inc. | Fabricating biopolymer arrays |
JP2003516169A (en) | 1999-05-01 | 2003-05-13 | pSiMedica Limited | Induced porous silicon |
US7056661B2 (en) | 1999-05-19 | 2006-06-06 | Cornell Research Foundation, Inc. | Method for sequencing nucleic acid molecules |
EP1185544B1 (en) | 1999-05-24 | 2008-11-26 | Invitrogen Corporation | Method for deblocking of labeled oligonucleotides |
US6132997A (en) | 1999-05-28 | 2000-10-17 | Agilent Technologies | Method for linear mRNA amplification |
US6815218B1 (en) | 1999-06-09 | 2004-11-09 | Massachusetts Institute Of Technology | Methods for manufacturing bioelectronic devices |
EP1190097A2 (en) | 1999-06-22 | 2002-03-27 | Invitrogen Corporation | Improved primers and methods for the detection and discrimination of nucleic acids |
DE19928410C2 (en) | 1999-06-22 | 2002-11-28 | Agilent Technologies Inc | Device housing with a device for operating a laboratory microchip |
US6709852B1 (en) | 1999-06-22 | 2004-03-23 | Invitrogen Corporation | Rapid growing microorganisms for biotechnology applications |
US6399394B1 (en) | 1999-06-30 | 2002-06-04 | Agilent Technologies, Inc. | Testing multiple fluid samples with multiple biopolymer arrays |
US6465183B2 (en) | 1999-07-01 | 2002-10-15 | Agilent Technologies, Inc. | Multidentate arrays |
US7504213B2 (en) | 1999-07-09 | 2009-03-17 | Agilent Technologies, Inc. | Methods and apparatus for preparing arrays comprising features having degenerate biopolymers |
US6461816B1 (en) | 1999-07-09 | 2002-10-08 | Agilent Technologies, Inc. | Methods for controlling cross-hybridization in analysis of nucleic acid sequences |
US6306599B1 (en) | 1999-07-16 | 2001-10-23 | Agilent Technologies Inc. | Biopolymer arrays and their fabrication |
US6346423B1 (en) | 1999-07-16 | 2002-02-12 | Agilent Technologies, Inc. | Methods and compositions for producing biopolymeric arrays |
US6180351B1 (en) | 1999-07-22 | 2001-01-30 | Agilent Technologies Inc. | Chemical array fabrication with identifier |
US6201112B1 (en) | 1999-07-22 | 2001-03-13 | Agilent Technologies Inc. | Method for 3′ end-labeling ribonucleic acids |
US6262490B1 (en) | 1999-11-05 | 2001-07-17 | Advanced Semiconductor Engineering, Inc. | Substrate strip for use in packaging semiconductor chips |
US6743585B2 (en) | 1999-09-16 | 2004-06-01 | Agilent Technologies, Inc. | Methods for preparing conjugates |
US7244559B2 (en) | 1999-09-16 | 2007-07-17 | 454 Life Sciences Corporation | Method of sequencing a nucleic acid |
US7211390B2 (en) | 1999-09-16 | 2007-05-01 | 454 Life Sciences Corporation | Method of sequencing a nucleic acid |
US6319674B1 (en) | 1999-09-16 | 2001-11-20 | Agilent Technologies, Inc. | Methods for attaching substances to surfaces |
US7122303B2 (en) | 1999-09-17 | 2006-10-17 | Agilent Technologies, Inc. | Arrays comprising background features that provide for a measure of a non-specific binding and methods for using the same |
US7078167B2 (en) | 1999-09-17 | 2006-07-18 | Agilent Technologies, Inc. | Arrays having background features and methods for using the same |
AU7537200A (en) | 1999-09-29 | 2001-04-30 | Solexa Ltd. | Polynucleotide sequencing |
DE19964337B4 (en) | 1999-10-01 | 2004-09-16 | Agilent Technologies, Inc. (n.d.Ges.d.Staates Delaware), Palo Alto | Microfluidic microchip with bendable suction tube |
EP1235932A2 (en) | 1999-10-08 | 2002-09-04 | Protogene Laboratories, Inc. | Method and apparatus for performing large numbers of reactions using array assembly |
US6232072B1 (en) | 1999-10-15 | 2001-05-15 | Agilent Technologies, Inc. | Biopolymer array inspection |
US6451998B1 (en) | 1999-10-18 | 2002-09-17 | Agilent Technologies, Inc. | Capping and de-capping during oligonucleotide synthesis |
US6171797B1 (en) | 1999-10-20 | 2001-01-09 | Agilent Technologies Inc. | Methods of making polymeric arrays |
US7115423B1 (en) | 1999-10-22 | 2006-10-03 | Agilent Technologies, Inc. | Fluidic structures within an array package |
US6387636B1 (en) | 1999-10-22 | 2002-05-14 | Agilent Technologies, Inc. | Method of shielding biosynthesis reactions from the ambient environment on an array |
US6077674A (en) | 1999-10-27 | 2000-06-20 | Agilent Technologies Inc. | Method of producing oligonucleotide arrays with features of high purity |
US6329210B1 (en) | 1999-10-29 | 2001-12-11 | Agilent Technologies, Inc. | Method and apparatus for high volume polymer synthesis |
US20010055761A1 (en) | 1999-10-29 | 2001-12-27 | Agilent Technologies | Small scale DNA synthesis using polymeric solid support with functionalized regions |
US8268605B2 (en) | 1999-10-29 | 2012-09-18 | Agilent Technologies, Inc. | Compositions and methods utilizing DNA polymerases |
US6689319B1 (en) | 1999-10-29 | 2004-02-10 | Agilent Technologies, Inc. | Apparatus for deposition and inspection of chemical and biological fluids |
US6406849B1 (en) | 1999-10-29 | 2002-06-18 | Agilent Technologies, Inc. | Interrogating multi-featured arrays |
US6428957B1 (en) | 1999-11-08 | 2002-08-06 | Agilent Technologies, Inc. | Systems tools and methods of assaying biological materials using spatially-addressable arrays |
US6440669B1 (en) | 1999-11-10 | 2002-08-27 | Agilent Technologies, Inc. | Methods for applying small volumes of reagents |
US6446642B1 (en) | 1999-11-22 | 2002-09-10 | Agilent Technologies, Inc. | Method and apparatus to clean an inkjet reagent deposition device |
US6582938B1 (en) | 2001-05-11 | 2003-06-24 | Affymetrix, Inc. | Amplification of nucleic acids |
US6800439B1 (en) | 2000-01-06 | 2004-10-05 | Affymetrix, Inc. | Methods for improved array preparation |
CA2396320A1 (en) | 2000-01-11 | 2001-07-19 | Maxygen, Inc. | Integrated systems and methods for diversity generation and screening |
US6587579B1 (en) | 2000-01-26 | 2003-07-01 | Agilent Technologies Inc. | Feature quality in array fabrication |
US6406851B1 (en) | 2000-01-28 | 2002-06-18 | Agilent Technologies, Inc. | Method for coating a substrate quickly and uniformly with a small volume of fluid |
US6458526B1 (en) | 2000-01-28 | 2002-10-01 | Agilent Technologies, Inc. | Method and apparatus to inhibit bubble formation in a fluid |
US7198939B2 (en) | 2000-01-28 | 2007-04-03 | Agilent Technologies, Inc. | Apparatus for interrogating an addressable array |
US6235483B1 (en) | 2000-01-31 | 2001-05-22 | Agilent Technologies, Inc. | Methods and kits for indirect labeling of nucleic acids |
GB0002389D0 (en) | 2000-02-02 | 2000-03-22 | Solexa Ltd | Molecular arrays |
US6403314B1 (en) | 2000-02-04 | 2002-06-11 | Agilent Technologies, Inc. | Computational method and system for predicting fragmented hybridization and for identifying potential cross-hybridization |
US6833450B1 (en) | 2000-03-17 | 2004-12-21 | Affymetrix, Inc. | Phosphite ester oxidation in nucleic acid array preparation |
US6365355B1 (en) | 2000-03-28 | 2002-04-02 | The Regents Of The University Of California | Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches |
US7776021B2 (en) | 2000-04-28 | 2010-08-17 | The Charles Stark Draper Laboratory | Micromachined bilayer unit for filtration of small molecules |
US6716634B1 (en) | 2000-05-31 | 2004-04-06 | Agilent Technologies, Inc. | Increasing ionization efficiency in mass spectrometry |
US7163660B2 (en) | 2000-05-31 | 2007-01-16 | Infineon Technologies Ag | Arrangement for taking up liquid analytes |
JP2004509609A (en) | 2000-06-02 | 2004-04-02 | Blue Heron Biotechnology, Inc. | Methods for improving sequence fidelity of synthetic double-stranded oligonucleotides |
WO2002010443A1 (en) | 2000-07-27 | 2002-02-07 | The Australian National University | Combinatorial probes and uses therefor |
EP1176151B1 (en) | 2000-07-28 | 2014-08-20 | Agilent Technologies, Inc. | Synthesis of polynucleotides using combined oxidation/deprotection chemistry |
US6890760B1 (en) | 2000-07-31 | 2005-05-10 | Agilent Technologies, Inc. | Array fabrication |
US6599693B1 (en) | 2000-07-31 | 2003-07-29 | Agilent Technologies Inc. | Array fabrication |
US7205400B2 (en) | 2000-07-31 | 2007-04-17 | Agilent Technologies, Inc. | Array fabrication |
DE60114525T2 (en) | 2000-07-31 | 2006-07-20 | Agilent Technologies Inc., A Delaware Corp., Palo Alto | Array-based methods for the synthesis of nucleic acid mixtures |
US6613893B1 (en) | 2000-07-31 | 2003-09-02 | Agilent Technologies Inc. | Array fabrication |
GB0018876D0 (en) | 2000-08-01 | 2000-09-20 | Applied Research Systems | Method of producing polypeptides |
AU2001284997A1 (en) | 2000-08-24 | 2002-03-04 | Maxygen, Inc. | Constructs and their use in metabolic pathway engineering |
EP1319013A2 (en) | 2000-09-08 | 2003-06-18 | University Technologies International Inc. | Linker phosphoramidites for oligonucleotide synthesis |
US6966945B1 (en) | 2000-09-20 | 2005-11-22 | Goodrich Corporation | Inorganic matrix compositions, composites and process of making the same |
WO2002027029A2 (en) | 2000-09-27 | 2002-04-04 | Lynx Therapeutics, Inc. | Method for determining relative abundance of nucleic acid sequences |
EP1330306A2 (en) | 2000-10-10 | 2003-07-30 | BioTrove, Inc. | Apparatus for assay, synthesis and storage, and methods of manufacture, use, and manipulation thereof |
DE10051396A1 (en) | 2000-10-17 | 2002-04-18 | Febit Ferrarius Biotech Gmbh | An integrated synthesis and identification of an analyte, comprises particles immobilized at a carrier to be coupled to receptors in a structured pattern to give receptor arrays for biochemical reactions |
DE60125312T2 (en) | 2000-10-26 | 2007-06-06 | Agilent Technologies, Inc. (n.d. Ges. d. Staates Delaware), Santa Clara | microarray |
US6905816B2 (en) | 2000-11-27 | 2005-06-14 | Intelligent Medical Devices, Inc. | Clinically intelligent diagnostic devices and methods |
DE10060433B4 (en) | 2000-12-05 | 2006-05-11 | Hahn-Schickard-Gesellschaft für angewandte Forschung e.V. | Method for producing a fluid component, fluid component and analysis device |
US6660475B2 (en) * | 2000-12-15 | 2003-12-09 | New England Biolabs, Inc. | Use of site-specific nicking endonucleases to create single-stranded regions and applications thereof |
DE60227361D1 (en) | 2001-01-19 | 2008-08-14 | Centocor Inc | COMPUTER MEDIATED ASSEMBLY OF POLYNUCLEOTIDES ENCODING A TARGETED POLYPEPTIDE |
US6958217B2 (en) * | 2001-01-24 | 2005-10-25 | Genomic Expression Aps | Single-stranded polynucleotide tags |
US6879915B2 (en) | 2001-01-31 | 2005-04-12 | Agilent Technologies, Inc. | Chemical array fabrication and use |
US7027930B2 (en) | 2001-01-31 | 2006-04-11 | Agilent Technologies, Inc. | Reading chemical arrays |
US7166258B2 (en) | 2001-01-31 | 2007-01-23 | Agilent Technologies, Inc. | Automation-optimized microarray package |
US6660338B1 (en) | 2001-03-08 | 2003-12-09 | Agilent Technologies, Inc. | Functionalization of substrate surfaces with silane mixtures |
US7211654B2 (en) | 2001-03-14 | 2007-05-01 | Regents Of The University Of Michigan | Linkers and co-coupling agents for optimization of oligonucleotide synthesis and purification on solid supports |
EP2801624B1 (en) | 2001-03-16 | 2019-03-06 | Singular Bio, Inc | Arrays and methods of use |
US6610978B2 (en) | 2001-03-27 | 2003-08-26 | Agilent Technologies, Inc. | Integrated sample preparation, separation and introduction microdevice for inductively coupled plasma mass spectrometry |
US7208322B2 (en) | 2001-04-02 | 2007-04-24 | Agilent Technologies, Inc. | Sensor surfaces for detecting analytes |
US6943036B2 (en) | 2001-04-30 | 2005-09-13 | Agilent Technologies, Inc. | Error detection in chemical array fabrication |
CA2446417A1 (en) | 2001-05-03 | 2002-11-14 | Sigma Genosys, L.P. | Methods for assembling protein microarrays |
EP1392868B2 (en) | 2001-05-18 | 2013-09-04 | Wisconsin Alumni Research Foundation | Method for the synthesis of DNA sequences using photo-labile linkers |
WO2002094846A2 (en) | 2001-05-22 | 2002-11-28 | Parallel Synthesis Technologies, Inc. | Method for in situ, on-chip chemical synthesis |
US6880576B2 (en) | 2001-06-07 | 2005-04-19 | Nanostream, Inc. | Microfluidic devices for methods development |
US6649348B2 (en) | 2001-06-29 | 2003-11-18 | Agilent Technologies Inc. | Methods for manufacturing arrays |
US6613523B2 (en) | 2001-06-29 | 2003-09-02 | Agilent Technologies, Inc. | Method of DNA sequencing using cleavable tags |
US6989267B2 (en) | 2001-07-02 | 2006-01-24 | Agilent Technologies, Inc. | Methods of making microarrays with substrate surfaces having covalently bound polyelectrolyte films |
US6753145B2 (en) | 2001-07-05 | 2004-06-22 | Agilent Technologies, Inc. | Buffer composition and method for hybridization of microarrays on adsorbed polymer siliceous surfaces |
US6702256B2 (en) | 2001-07-17 | 2004-03-09 | Agilent Technologies, Inc. | Flow-switching microdevice |
US7128876B2 (en) | 2001-07-17 | 2006-10-31 | Agilent Technologies, Inc. | Microdevice and method for component separation in a fluid |
US7314599B2 (en) | 2001-07-17 | 2008-01-01 | Agilent Technologies, Inc. | PAEK embossing and adhesion for microfluidic devices |
US20030108903A1 (en) | 2001-07-19 | 2003-06-12 | Liman Wang | Multiple word DNA computing on surfaces |
US8067556B2 (en) | 2001-07-26 | 2011-11-29 | Agilent Technologies, Inc. | Multi-site mutagenesis |
US7371580B2 (en) | 2001-08-24 | 2008-05-13 | Agilent Technologies, Inc. | Use of unstructured nucleic acids in assaying nucleic acid molecules |
US6682702B2 (en) | 2001-08-24 | 2004-01-27 | Agilent Technologies, Inc. | Apparatus and method for simultaneously conducting multiple chemical reactions |
JP2003101204A (en) | 2001-09-25 | 2003-04-04 | NEC Kansai Ltd | Wiring substrate, method of manufacturing the same, and electronic component |
US20050124022A1 (en) | 2001-10-30 | 2005-06-09 | Maithreyan Srinivasan | Novel sulfurylase-luciferase fusion proteins and thermostable sulfurylase |
US6902921B2 (en) | 2001-10-30 | 2005-06-07 | 454 Corporation | Sulfurylase-luciferase fusion proteins and thermostable sulfurylase |
US7524950B2 (en) | 2001-10-31 | 2009-04-28 | Agilent Technologies, Inc. | Uses of cationic salts for polynucleotide synthesis |
US6858720B2 (en) | 2001-10-31 | 2005-02-22 | Agilent Technologies, Inc. | Method of synthesizing polynucleotides using ionic liquids |
US6852850B2 (en) | 2001-10-31 | 2005-02-08 | Agilent Technologies, Inc. | Use of ionic liquids for fabrication of polynucleotide arrays |
US20030087298A1 (en) | 2001-11-02 | 2003-05-08 | Roland Green | Detection of hybridization on oligonucleotide microarray through covalently labeling microarray probe |
EP1314783B1 (en) | 2001-11-22 | 2008-11-19 | Sloning BioTechnology GmbH | Nucleic acid linkers and their use in gene synthesis |
US20030099952A1 (en) | 2001-11-26 | 2003-05-29 | Roland Green | Microarrays with visible pattern detection |
US20030143605A1 (en) | 2001-12-03 | 2003-07-31 | Si Lok | Methods for the selection and cloning of nucleic acid molecules free of unwanted nucleotide sequence alterations |
US6927029B2 (en) | 2001-12-03 | 2005-08-09 | Agilent Technologies, Inc. | Surface with tethered polymeric species for binding biomolecules |
US6838888B2 (en) | 2001-12-13 | 2005-01-04 | Agilent Technologies, Inc. | Flow cell humidity sensor system |
WO2003054232A2 (en) | 2001-12-13 | 2003-07-03 | Blue Heron Biotechnology, Inc. | Methods for removal of double-stranded oligonucleotides containing sequence errors using mismatch recognition proteins |
US7932070B2 (en) | 2001-12-21 | 2011-04-26 | Agilent Technologies, Inc. | High fidelity DNA polymerase compositions and uses therefor |
US6846454B2 (en) | 2001-12-24 | 2005-01-25 | Agilent Technologies, Inc. | Fluid exit in reaction chambers |
US6790620B2 (en) | 2001-12-24 | 2004-09-14 | Agilent Technologies, Inc. | Small volume chambers |
US7282183B2 (en) | 2001-12-24 | 2007-10-16 | Agilent Technologies, Inc. | Atmospheric control in reaction chambers |
US7025324B1 (en) | 2002-01-04 | 2006-04-11 | Massachusetts Institute Of Technology | Gating apparatus and method of manufacture |
US20030171325A1 (en) | 2002-01-04 | 2003-09-11 | Board Of Regents, The University Of Texas System | Proofreading, error deletion, and ligation method for synthesis of high-fidelity polynucleotide sequences |
US6673552B2 (en) | 2002-01-14 | 2004-01-06 | Diversa Corporation | Methods for purifying annealed double-stranded oligonucleotides lacking base pair mismatches or nucleotide gaps |
US7141368B2 (en) | 2002-01-30 | 2006-11-28 | Agilent Technologies, Inc. | Multi-directional deposition in array fabrication |
US7157229B2 (en) | 2002-01-31 | 2007-01-02 | Nimblegen Systems, Inc. | Prepatterned substrate for optical synthesis of DNA probes |
US20040126757A1 (en) | 2002-01-31 | 2004-07-01 | Francesco Cerrina | Method and apparatus for synthesis of arrays of DNA probes |
US7422851B2 (en) | 2002-01-31 | 2008-09-09 | Nimblegen Systems, Inc. | Correction for illumination non-uniformity during the synthesis of arrays of oligomers |
US7037659B2 (en) | 2002-01-31 | 2006-05-02 | Nimblegen Systems Inc. | Apparatus for constructing DNA probes having a prismatic and kaleidoscopic light homogenizer |
US7083975B2 (en) | 2002-02-01 | 2006-08-01 | Roland Green | Microarray synthesis instrument and method |
US20030148291A1 (en) | 2002-02-05 | 2003-08-07 | Karla Robotti | Method of immobilizing biologically active molecules for assay purposes in a microfluidic format |
US6958119B2 (en) | 2002-02-26 | 2005-10-25 | Agilent Technologies, Inc. | Mobile phase gradient generation microfluidic device |
US6770892B2 (en) | 2002-02-28 | 2004-08-03 | Agilent Technologies, Inc. | Method and system for automated focus-distance determination for molecular array scanners |
US6929951B2 (en) | 2002-02-28 | 2005-08-16 | Agilent Technologies, Inc. | Method and system for molecular array scanner calibration |
US6914229B2 (en) | 2002-02-28 | 2005-07-05 | Agilent Technologies, Inc. | Signal offset for prevention of data clipping in a molecular array scanner |
US6919181B2 (en) | 2002-03-25 | 2005-07-19 | Agilent Technologies, Inc. | Methods for generating ligand arrays |
WO2003085094A2 (en) | 2002-04-01 | 2003-10-16 | Blue Heron Biotechnology, Inc. | Solid phase methods for polynucleotide production |
US6773888B2 (en) | 2002-04-08 | 2004-08-10 | Affymetrix, Inc. | Photoactivatable silane compounds and methods for their synthesis and use |
CA2483338C (en) | 2002-04-22 | 2014-10-14 | Genencor International, Inc. | Method of creating a library of bacterial clones with varying levels of gene expression |
US6946285B2 (en) | 2002-04-29 | 2005-09-20 | Agilent Technologies, Inc. | Arrays with elongated features |
US7125523B2 (en) | 2002-04-29 | 2006-10-24 | Agilent Technologies, Inc. | Holders for arrays |
US6621076B1 (en) | 2002-04-30 | 2003-09-16 | Agilent Technologies, Inc. | Flexible assembly for transporting sample fluids into a mass spectrometer |
US7094537B2 (en) | 2002-04-30 | 2006-08-22 | Agilent Technologies, Inc. | Micro arrays with structured and unstructured probes |
US7221785B2 (en) | 2002-05-21 | 2007-05-22 | Agilent Technologies, Inc. | Method and system for measuring a molecular array background signal from a continuous background region of specified size |
US7273730B2 (en) | 2002-05-24 | 2007-09-25 | Invitrogen Corporation | Nested PCR employing degradable primers |
WO2003100012A2 (en) | 2002-05-24 | 2003-12-04 | Nimblegen Systems, Inc. | Microarrays and method for running hybridization reaction for multiple samples on a single microarray |
US7537936B2 (en) | 2002-05-31 | 2009-05-26 | Agilent Technologies, Inc. | Method of testing multiple fluid samples with multiple biopolymer arrays |
US6789965B2 (en) | 2002-05-31 | 2004-09-14 | Agilent Technologies, Inc. | Dot printer with off-axis loading |
US7078505B2 (en) | 2002-06-06 | 2006-07-18 | Agilent Technologies, Inc. | Manufacture of arrays with varying deposition parameters |
US7351379B2 (en) | 2002-06-14 | 2008-04-01 | Agilent Technologies, Inc. | Fluid containment structure |
US6939673B2 (en) | 2002-06-14 | 2005-09-06 | Agilent Technologies, Inc. | Manufacture of arrays with reduced error impact |
US7919308B2 (en) | 2002-06-14 | 2011-04-05 | Agilent Technologies, Inc. | Form in place gaskets for assays |
US7371348B2 (en) | 2002-06-14 | 2008-05-13 | Agilent Technologies | Multiple array format |
US7220573B2 (en) | 2002-06-21 | 2007-05-22 | Agilent Technologies, Inc. | Array assay devices and methods of using the same |
US6713262B2 (en) | 2002-06-25 | 2004-03-30 | Agilent Technologies, Inc. | Methods and compositions for high throughput identification of protein/nucleic acid binding pairs |
US7894998B2 (en) | 2002-06-26 | 2011-02-22 | Agilent Technologies, Inc. | Method for identifying suitable nucleic acid probe sequences for use in nucleic acid arrays |
US7202358B2 (en) | 2002-07-25 | 2007-04-10 | Agilent Technologies, Inc. | Methods for producing ligand arrays |
US7452712B2 (en) | 2002-07-30 | 2008-11-18 | Applied Biosystems Inc. | Sample block apparatus and method of maintaining a microcard on a sample block |
US7101508B2 (en) | 2002-07-31 | 2006-09-05 | Agilent Technologies, Inc. | Chemical array fabrication errors |
US6835938B2 (en) | 2002-07-31 | 2004-12-28 | Agilent Technologies, Inc. | Biopolymer array substrate thickness dependent automated focus-distance determination method for biopolymer array scanners |
US7153689B2 (en) | 2002-08-01 | 2006-12-26 | Agilent Technologies, Inc. | Apparatus and methods for cleaning and priming droplet dispensing devices |
US7205128B2 (en) | 2002-08-16 | 2007-04-17 | Agilent Technologies, Inc. | Method for synthesis of the second strand of cDNA |
US7563600B2 (en) | 2002-09-12 | 2009-07-21 | Combimatrix Corporation | Microarray synthesis and assembly of gene-length polynucleotides |
JP2006517090A (en) | 2002-09-26 | 2006-07-20 | Kosan Biosciences, Inc. | Synthetic gene |
US7498176B2 (en) | 2002-09-27 | 2009-03-03 | Roche Nimblegen, Inc. | Microarray with hydrophobic barriers |
JP4471927B2 (en) | 2002-09-30 | 2010-06-02 | Nimblegen Systems, Inc. | Array parallel loading method |
JP2006500954A (en) | 2002-10-01 | 2006-01-12 | Nimblegen Systems, Inc. | A microarray with multiple oligonucleotides in a single array feature |
US7129075B2 (en) | 2002-10-18 | 2006-10-31 | Transgenomic, Inc. | Isolated CEL II endonuclease |
US8283148B2 (en) | 2002-10-25 | 2012-10-09 | Agilent Technologies, Inc. | DNA polymerase compositions for quantitative PCR and methods thereof |
JP2006503586A (en) | 2002-10-28 | 2006-02-02 | Xeotron Corporation | Array oligomer synthesis and use |
US7629120B2 (en) | 2002-10-31 | 2009-12-08 | Rice University | Method for assembling PCR fragments of DNA |
US7402279B2 (en) | 2002-10-31 | 2008-07-22 | Agilent Technologies, Inc. | Device with integrated microfluidic and electronic components |
US7364896B2 (en) | 2002-10-31 | 2008-04-29 | Agilent Technologies, Inc. | Test strips including flexible array substrates and method of hybridization |
AU2003287449A1 (en) | 2002-10-31 | 2004-05-25 | Nanostream, Inc. | Parallel detection chromatography systems |
US7390457B2 (en) | 2002-10-31 | 2008-06-24 | Agilent Technologies, Inc. | Integrated microfluidic array device |
US7422911B2 (en) | 2002-10-31 | 2008-09-09 | Agilent Technologies, Inc. | Composite flexible array substrate having flexible support |
US20040086892A1 (en) | 2002-11-06 | 2004-05-06 | Crothers Donald M. | Universal tag assay |
US7029854B2 (en) | 2002-11-22 | 2006-04-18 | Agilent Technologies, Inc. | Methods designing multiple mRNA transcript nucleic acid probe sequences for use in nucleic acid arrays |
US7062385B2 (en) | 2002-11-25 | 2006-06-13 | Tufts University | Intelligent electro-optical nucleic acid-based sensor array and method for detecting volatile compounds in ambient air |
US20040110133A1 (en) | 2002-12-06 | 2004-06-10 | Affymetrix, Inc. | Functionated photoacid generator for biological microarray synthesis |
US7932025B2 (en) | 2002-12-10 | 2011-04-26 | Massachusetts Institute Of Technology | Methods for high fidelity production of long nucleic acid molecules with error control |
US7879580B2 (en) | 2002-12-10 | 2011-02-01 | Massachusetts Institute Of Technology | Methods for high fidelity production of long nucleic acid molecules |
US20060076482A1 (en) | 2002-12-13 | 2006-04-13 | Hobbs Steven E | High throughput systems and methods for parallel sample analysis |
US6987263B2 (en) | 2002-12-13 | 2006-01-17 | Nanostream, Inc. | High throughput systems and methods for parallel sample analysis |
US7247337B1 (en) | 2002-12-16 | 2007-07-24 | Agilent Technologies, Inc. | Method and apparatus for microarray fabrication |
US20040191810A1 (en) | 2002-12-17 | 2004-09-30 | Affymetrix, Inc. | Immersed microarrays in conical wells |
US7960157B2 (en) | 2002-12-20 | 2011-06-14 | Agilent Technologies, Inc. | DNA polymerase blends and uses thereof |
DE03808546T1 (en) | 2002-12-23 | 2006-01-26 | Agilent Technologies, Inc., Palo Alto | COMPARATIVE GENOMIC HYBRIDIZATION TESTS USING CHARACTERIZED IMMOBILIZED OLIGONUCLEOTIDES AND COMPOSITIONS FOR IMPLEMENTING THEM |
US7737089B2 (en) | 2002-12-23 | 2010-06-15 | Febit Holding Gmbh | Photoactivatable two-stage protective groups for the synthesis of biopolymers |
US7372982B2 (en) | 2003-01-14 | 2008-05-13 | Agilent Technologies, Inc. | User interface for molecular array feature analysis |
US6809277B2 (en) | 2003-01-22 | 2004-10-26 | Agilent Technologies, Inc. | Method for registering a deposited material with channel plate channels, and switch produced using same |
DE602004036672C5 (en) | 2003-01-29 | 2012-11-29 | 454 Life Sciences Corporation | Nucleic acid amplification based on bead emulsion |
US7202264B2 (en) | 2003-01-31 | 2007-04-10 | Isis Pharmaceuticals, Inc. | Supports for oligomer synthesis |
US8073626B2 (en) | 2003-01-31 | 2011-12-06 | Agilent Technologies, Inc. | Biopolymer array reading |
US6950756B2 (en) | 2003-02-05 | 2005-09-27 | Agilent Technologies, Inc. | Rearrangement of microarray scan images to form virtual arrays |
US7413709B2 (en) | 2003-02-12 | 2008-08-19 | Agilent Technologies, Inc. | PAEK-based microfluidic device with integrated electrospray emitter |
US7244513B2 (en) | 2003-02-21 | 2007-07-17 | Nano-Proprietary, Inc. | Stain-etched silicon powder |
US7252938B2 (en) | 2003-02-25 | 2007-08-07 | Agilent Technologies, Inc. | Methods and devices for producing a polymer at a location of a substrate |
US7070932B2 (en) | 2003-02-25 | 2006-07-04 | Agilent Technologies, Inc. | Methods and devices for detecting printhead misalignment of an in situ polymeric array synthesis device |
US6977223B2 (en) | 2003-03-07 | 2005-12-20 | Massachusetts Institute Of Technology | Three dimensional microfabrication |
US20050053968A1 (en) | 2003-03-31 | 2005-03-10 | Council Of Scientific And Industrial Research | Method for storing information in DNA |
EP1613776A1 (en) | 2003-04-02 | 2006-01-11 | Blue Heron Biotechnology, Inc. | Error reduction in automated gene synthesis |
US7534561B2 (en) | 2003-04-02 | 2009-05-19 | Agilent Technologies, Inc. | Nucleic acid array in situ fabrication methods and arrays produced using the same |
US7206439B2 (en) | 2003-04-30 | 2007-04-17 | Agilent Technologies, Inc. | Feature locations in array reading |
US7269518B2 (en) | 2003-04-30 | 2007-09-11 | Agilent Technologies, Inc. | Chemical array reading |
US6916113B2 (en) | 2003-05-16 | 2005-07-12 | Agilent Technologies, Inc. | Devices and methods for fluid mixing |
CA2526368A1 (en) | 2003-05-20 | 2004-12-02 | Fluidigm Corporation | Method and system for microfluidic device and imaging thereof |
US8133670B2 (en) | 2003-06-13 | 2012-03-13 | Cold Spring Harbor Laboratory | Method for making populations of defined nucleic acid molecules |
US6938476B2 (en) | 2003-06-25 | 2005-09-06 | Agilent Technologies, Inc. | Apparatus and methods for sensing fluid levels |
US7534563B2 (en) | 2003-06-30 | 2009-05-19 | Agilent Technologies, Inc. | Methods for producing ligand arrays |
US20050016851A1 (en) | 2003-07-24 | 2005-01-27 | Jensen Klavs F. | Microchemical method and apparatus for synthesis and coating of colloidal nanoparticles |
US6843281B1 (en) | 2003-07-30 | 2005-01-18 | Agilent Technologies, Inc. | Methods and apparatus for introducing liquids into microfluidic chambers |
US7353116B2 (en) | 2003-07-31 | 2008-04-01 | Agilent Technologies, Inc. | Chemical array with test dependent signal reading or processing |
US7028536B2 (en) | 2004-06-29 | 2006-04-18 | Nanostream, Inc. | Sealing interface for microfluidic device |
US7348144B2 (en) | 2003-08-13 | 2008-03-25 | Agilent Technologies, Inc. | Methods and system for multi-drug treatment discovery |
US7229497B2 (en) | 2003-08-26 | 2007-06-12 | Massachusetts Institute Of Technology | Method of preparing nanocrystals |
US7193077B2 (en) | 2003-08-30 | 2007-03-20 | Agilent Technologies, Inc. | Exocyclic amine triaryl methyl protecting groups in two-step polynucleotide synthesis |
US7417139B2 (en) | 2003-08-30 | 2008-08-26 | Agilent Technologies, Inc. | Method for polynucleotide synthesis |
US7585970B2 (en) | 2003-08-30 | 2009-09-08 | Agilent Technologies, Inc. | Method of polynucleotide synthesis using modified support |
US7385050B2 (en) | 2003-08-30 | 2008-06-10 | Agilent Technologies, Inc. | Cleavable linker for polynucleotide synthesis |
US7427679B2 (en) | 2003-08-30 | 2008-09-23 | Agilent Technologies, Inc. | Precursors for two-step polynucleotide synthesis |
US20050049796A1 (en) | 2003-09-03 | 2005-03-03 | Webb Peter G. | Methods for encoding non-biological information on microarrays |
EP2325339A3 (en) | 2003-09-09 | 2011-11-02 | Integrigen, Inc. | Methods and compositions for generation of germline human antibody genes |
US20050112636A1 (en) | 2003-09-23 | 2005-05-26 | Atom Sciences | Polymeric nucleic acid hybridization probes |
US7488607B2 (en) | 2003-09-30 | 2009-02-10 | Agilent Technologies, Inc. | Electronically readable microarray with electronic addressing function |
US7147362B2 (en) | 2003-10-15 | 2006-12-12 | Agilent Technologies, Inc. | Method of mixing by intermittent centrifugal force |
US7075161B2 (en) | 2003-10-23 | 2006-07-11 | Agilent Technologies, Inc. | Apparatus and method for making a low capacitance artificial nanopore |
WO2005043154A2 (en) | 2003-10-27 | 2005-05-12 | Massachusetts Institute Of Technology | High density reaction chambers and methods of use |
US7169560B2 (en) | 2003-11-12 | 2007-01-30 | Helicos Biosciences Corporation | Short cycle methods for sequencing polynucleotides |
US7276338B2 (en) | 2003-11-17 | 2007-10-02 | Jacobson Joseph M | Nucleotide sequencing via repetitive single molecule hybridization |
DE10353887A1 (en) | 2003-11-18 | 2005-06-16 | Febit Ag | Highly parallel matrix-based DNA synthesizer |
US7851192B2 (en) | 2004-11-22 | 2010-12-14 | New England Biolabs, Inc. | Modified DNA cleavage enzymes and methods for use |
US7282705B2 (en) | 2003-12-19 | 2007-10-16 | Agilent Technologies, Inc. | Microdevice having an annular lining for producing an electrospray emitter |
ES2432040T3 (en) | 2004-01-28 | 2013-11-29 | 454 Life Sciences Corporation | Nucleic acid amplification with continuous flow emulsion |
US7084180B2 (en) | 2004-01-28 | 2006-08-01 | Velocys, Inc. | Fischer-tropsch synthesis using microchannel technology and novel catalyst and microchannel reactor |
US7125488B2 (en) | 2004-02-12 | 2006-10-24 | Varian, Inc. | Polar-modified bonded phase materials for chromatographic separations |
US7875463B2 (en) | 2004-03-26 | 2011-01-25 | Agilent Technologies, Inc. | Generalized pulse jet ejection head control model |
US20050214779A1 (en) | 2004-03-29 | 2005-09-29 | Peck Bill J | Methods for in situ generation of nucleic acid arrays |
DK1773978T3 (en) | 2004-05-19 | 2014-05-26 | Univ Pittsburgh | Perfused, three-dimensional cell / tissue disease models |
US7302348B2 (en) | 2004-06-02 | 2007-11-27 | Agilent Technologies, Inc. | Method and system for quantifying and removing spatial-intensity trends in microarray data |
US20060024711A1 (en) | 2004-07-02 | 2006-02-02 | Helicos Biosciences Corporation | Methods for nucleic acid amplification and sequence determination |
US7811753B2 (en) | 2004-07-14 | 2010-10-12 | Ibis Biosciences, Inc. | Methods for repairing degraded DNA |
US20060012793A1 (en) | 2004-07-19 | 2006-01-19 | Helicos Biosciences Corporation | Apparatus and methods for analyzing samples |
US7276720B2 (en) | 2004-07-19 | 2007-10-02 | Helicos Biosciences Corporation | Apparatus and methods for analyzing samples |
US20060019084A1 (en) | 2004-07-23 | 2006-01-26 | Pearson Laurence T | Monolithic composition and method |
US20060024678A1 (en) | 2004-07-28 | 2006-02-02 | Helicos Biosciences Corporation | Use of single-stranded nucleic acid binding proteins in sequencing |
WO2006073504A2 (en) | 2004-08-04 | 2006-07-13 | President And Fellows Of Harvard College | Wobble sequencing |
WO2006018044A1 (en) | 2004-08-18 | 2006-02-23 | Agilent Technologies, Inc. | Microfluidic assembly with coupled microfluidic devices |
US7034290B2 (en) | 2004-09-24 | 2006-04-25 | Agilent Technologies, Inc. | Target support with pattern recognition sites |
US7943046B2 (en) | 2004-10-01 | 2011-05-17 | Agilent Technologies, Inc. | Methods and systems for on-column protein delipidation |
AU2005295351A1 (en) | 2004-10-18 | 2006-04-27 | Codon Devices, Inc. | Methods for assembly of high fidelity synthetic polynucleotides |
US7141807B2 (en) | 2004-10-22 | 2006-11-28 | Agilent Technologies, Inc. | Nanowire capillaries for mass spectrometry |
US8380441B2 (en) | 2004-11-30 | 2013-02-19 | Agilent Technologies, Inc. | Systems for producing chemical array layouts |
US7977119B2 (en) | 2004-12-08 | 2011-07-12 | Agilent Technologies, Inc. | Chemical arrays and methods of using the same |
US7439272B2 (en) | 2004-12-20 | 2008-10-21 | Varian, Inc. | Ultraporous sol gel monoliths |
JP2008526259A (en) | 2005-01-13 | 2008-07-24 | Codon Devices, Inc. | Compositions and methods for protein design |
US20060171855A1 (en) | 2005-02-03 | 2006-08-03 | Hongfeng Yin | Devices, systems and methods for multi-dimensional separation |
US20090088679A1 (en) | 2005-02-07 | 2009-04-02 | Massachusetts Institute Of Technology | Electronically-Degradable Layer-by-Layer Thin Films |
US20060203236A1 (en) | 2005-03-08 | 2006-09-14 | Zhenghua Ji | Sample cell |
EP1623763A1 (en) | 2005-03-11 | 2006-02-08 | Agilent Technologies, Inc. | Chip with cleaning cavity |
US7618777B2 (en) | 2005-03-16 | 2009-11-17 | Agilent Technologies, Inc. | Composition and method for array hybridization |
US20060219637A1 (en) | 2005-03-29 | 2006-10-05 | Killeen Kevin P | Devices, systems and methods for liquid chromatography |
EP1874792B1 (en) | 2005-04-27 | 2016-04-13 | Sigma-Aldrich Co. LLC | Activators for oligonucleotide and phosphoramidite synthesis |
US7572907B2 (en) | 2005-04-29 | 2009-08-11 | Agilent Technologies, Inc. | Methods and compounds for polynucleotide synthesis |
DE602006015633D1 (en) | 2005-04-29 | 2010-09-02 | Synthetic Genomics Inc | AMPLIFICATION AND CLONING OF INDIVIDUAL DNA MOLECULES BY ROLLING CIRCLE AMPLIFICATION |
US7396676B2 (en) | 2005-05-31 | 2008-07-08 | Agilent Technologies, Inc. | Evanescent wave sensor with attached ligand |
CA2611671C (en) | 2005-06-15 | 2013-10-08 | Callida Genomics, Inc. | Single molecule arrays for genetic and chemical analysis |
US7919239B2 (en) | 2005-07-01 | 2011-04-05 | Agilent Technologies, Inc. | Increasing hybridization efficiencies |
US8076064B2 (en) | 2005-07-09 | 2011-12-13 | Agilent Technologies, Inc. | Method of treatment of RNA sample |
US7718365B2 (en) | 2005-07-09 | 2010-05-18 | Agilent Technologies, Inc. | Microarray analysis of RNA |
WO2007018601A1 (en) | 2005-08-02 | 2007-02-15 | Rubicon Genomics, Inc. | Compositions and methods for processing and amplification of dna, including using multiple enzymes in a single reaction |
DE102005037351B3 (en) | 2005-08-08 | 2007-01-11 | Geneart Ag | In vitro method for directed evolution of proteins, useful e.g. in pharmaceutical development, uses expression system for performing translation, transcription and reverse transcription |
DK2239327T3 (en) | 2005-08-11 | 2015-05-18 | Synthetic Genomics Inc | A method for in vitro recombination |
US7749701B2 (en) | 2005-08-11 | 2010-07-06 | Agilent Technologies, Inc. | Controlling use of oligonucleotide sequences released from arrays |
WO2007025059A1 (en) | 2005-08-26 | 2007-03-01 | Surmodics, Inc. | Silane coating compositions, coating systems, and methods |
US20100233429A1 (en) | 2005-09-16 | 2010-09-16 | Yamatake Corporation | Substrate for Biochip, Biochip, Method for Manufacturing Substrate for Biochip and Method for Manufacturing Biochip |
US20080308884A1 (en) | 2005-10-13 | 2008-12-18 | Silex Microsystems Ab | Fabrication of Inlet and Outlet Connections for Microfluidic Chips |
US8552174B2 (en) | 2005-10-31 | 2013-10-08 | Agilent Technologies, Inc. | Solutions, methods, and processes for deprotection of polynucleotides |
US8202985B2 (en) | 2005-10-31 | 2012-06-19 | Agilent Technologies, Inc. | Monomer compositions for the synthesis of polynucleotides, methods of synthesis, and methods of deprotection |
US7368550B2 (en) | 2005-10-31 | 2008-05-06 | Agilent Technologies, Inc. | Phosphorus protecting groups |
US7759471B2 (en) | 2005-10-31 | 2010-07-20 | Agilent Technologies, Inc. | Monomer compositions for the synthesis of RNA, methods of synthesis, and methods of deprotection |
US7291471B2 (en) | 2005-11-21 | 2007-11-06 | Agilent Technologies, Inc. | Cleavable oligonucleotide arrays |
GB0524069D0 (en) | 2005-11-25 | 2006-01-04 | Solexa Ltd | Preparation of templates for solid phase amplification |
EP1989318B1 (en) | 2006-01-06 | 2014-07-30 | Agilent Technologies, Inc. | Reaction buffer composition for nucleic acid replication with packed dna polymerases |
WO2007087377A2 (en) | 2006-01-25 | 2007-08-02 | Massachusetts Institute Of Technology | Photoelectrochemical synthesis of high density combinatorial polymer arrays |
US9274108B2 (en) | 2006-02-06 | 2016-03-01 | Massachusetts Institute Of Technology | Self-assembly of macromolecules on multilayered polymer surfaces |
WO2007095171A2 (en) | 2006-02-14 | 2007-08-23 | Massachusetts Institute Of Technology | Absorbing film |
US7807356B2 (en) | 2006-03-09 | 2010-10-05 | Agilent Technologies, Inc. | Labeled nucleotide composition |
US7572908B2 (en) | 2006-03-23 | 2009-08-11 | Agilent Technologies, Inc. | Cleavable linkers for polynucleotides |
US7855281B2 (en) | 2006-03-23 | 2010-12-21 | Agilent Technologies, Inc. | Cleavable thiocarbonate linkers for polynucleotide synthesis |
US20070231800A1 (en) | 2006-03-28 | 2007-10-04 | Agilent Technologies, Inc. | Determination of methylated DNA |
US8058055B2 (en) | 2006-04-07 | 2011-11-15 | Agilent Technologies, Inc. | High resolution chromosomal mapping |
US20070238106A1 (en) | 2006-04-07 | 2007-10-11 | Agilent Technologies, Inc. | Systems and methods of determining alleles and/or copy numbers |
US20070238104A1 (en) | 2006-04-07 | 2007-10-11 | Agilent Technologies, Inc. | Competitive oligonucleotides |
US20070238108A1 (en) | 2006-04-07 | 2007-10-11 | Agilent Technologies, Inc. | Validation of comparative genomic hybridization |
EP2010678A2 (en) | 2006-04-11 | 2009-01-07 | New England Biolabs, Inc. | Repair of nucleic acids for improved amplification |
JP2009538123A (en) | 2006-04-19 | 2009-11-05 | Applied Biosystems, LLC | Reagents, methods and libraries for gel-free bead-based sequencing |
US20070259345A1 (en) | 2006-05-03 | 2007-11-08 | Agilent Technologies, Inc. | Target determination using compound probes |
US20070259346A1 (en) | 2006-05-03 | 2007-11-08 | Agilent Technologies, Inc. | Analysis of arrays |
US20070259344A1 (en) | 2006-05-03 | 2007-11-08 | Agilent Technologies, Inc. | Compound probes and methods of increasing the effective probe densities of arrays |
US20070259347A1 (en) | 2006-05-03 | 2007-11-08 | Agilent Technologies, Inc. | Methods of increasing the effective probe densities of arrays |
US20090087840A1 (en) | 2006-05-19 | 2009-04-02 | Codon Devices, Inc. | Combined extension and ligation for nucleic acid assembly |
WO2007137242A2 (en) | 2006-05-19 | 2007-11-29 | Massachusetts Institute Of Technology | Microfluidic-based gene synthesis |
WO2008054543A2 (en) | 2006-05-20 | 2008-05-08 | Codon Devices, Inc. | Oligonucleotides for multiplex nucleic acid assembly |
US20080193772A1 (en) | 2006-07-07 | 2008-08-14 | Bio-Rad Laboratories, Inc. | Mass spectrometry probes having hydrophobic coatings |
US7524942B2 (en) | 2006-07-31 | 2009-04-28 | Agilent Technologies, Inc. | Labeled nucleotide composition |
US7572585B2 (en) | 2006-07-31 | 2009-08-11 | Agilent Technologies, Inc. | Enzymatic labeling of RNA |
SI2056845T1 (en) * | 2006-08-08 | 2018-02-28 | Rheinische Friedrich-Wilhelms-Universitaet Bonn | Structure and use of 5' phosphate oligonucleotides |
DE102006039479A1 (en) | 2006-08-23 | 2008-03-06 | Febit Biotech Gmbh | Programmable oligonucleotide synthesis |
US8415138B2 (en) | 2006-08-31 | 2013-04-09 | Agilent Technologies, Inc. | Apparatuses and methods for oligonucleotide preparation |
US8053191B2 (en) | 2006-08-31 | 2011-11-08 | Westend Asset Clearinghouse Company, Llc | Iterative nucleic acid assembly using activation of vector-encoded traits |
US8097711B2 (en) | 2006-09-02 | 2012-01-17 | Agilent Technologies, Inc. | Thioether substituted aryl carbonate protecting groups |
EP2078077A2 (en) | 2006-10-04 | 2009-07-15 | Codon Devices, Inc. | Nucleic acid libraries and their design and assembly |
US20080085514A1 (en) | 2006-10-10 | 2008-04-10 | Peck Bill J | Methods and devices for array synthesis |
US7867782B2 (en) | 2006-10-19 | 2011-01-11 | Agilent Technologies, Inc. | Nanoscale moiety placement methods |
US7999087B2 (en) | 2006-11-15 | 2011-08-16 | Agilent Technologies, Inc. | 2′-silyl containing thiocarbonate protecting groups for RNA synthesis |
US8242258B2 (en) | 2006-12-03 | 2012-08-14 | Agilent Technologies, Inc. | Protecting groups for RNA synthesis |
US7989396B2 (en) | 2006-12-05 | 2011-08-02 | The Board Of Trustees Of The Leland Stanford Junior University | Biomolecule immobilization on biosensors |
US8314220B2 (en) | 2007-01-26 | 2012-11-20 | Agilent Technologies, Inc. | Methods, compositions, and kits for detection of microRNA |
US20080182296A1 (en) | 2007-01-31 | 2008-07-31 | Chanda Pranab K | Pcr-directed gene synthesis from large number of overlapping oligodeoxyribonucleotides |
US9029085B2 (en) | 2007-03-07 | 2015-05-12 | President And Fellows Of Harvard College | Assays and other reactions involving droplets |
US7651762B2 (en) | 2007-03-13 | 2010-01-26 | Varian, Inc. | Methods and devices using a shrinkable support for porous monolithic materials |
EP2476689B1 (en) | 2007-05-10 | 2015-10-21 | Agilent Technologies, Inc. | Thiocarbon-protecting groups for RNA synthesis |
US20090023190A1 (en) | 2007-06-20 | 2009-01-22 | Kai Qin Lao | Sequence amplification with loopable primers |
US20080318334A1 (en) | 2007-06-20 | 2008-12-25 | Robotti Karla M | Microfluidic devices comprising fluid flow paths having a monolithic chromatographic material |
US8194244B2 (en) | 2007-06-29 | 2012-06-05 | Intel Corporation | Solution sample plate with wells designed for improved Raman scattering signal detection efficiency |
US7659069B2 (en) | 2007-08-31 | 2010-02-09 | Agilent Technologies, Inc. | Binary signaling assay using a split-polymerase |
US7979215B2 (en) | 2007-07-30 | 2011-07-12 | Agilent Technologies, Inc. | Methods and systems for evaluating CGH candidate probe nucleic acid sequences |
US8685642B2 (en) | 2007-07-30 | 2014-04-01 | Agilent Technologies, Inc. | Allele-specific copy number measurement using single nucleotide polymorphism and DNA arrays |
US20090036664A1 (en) | 2007-07-31 | 2009-02-05 | Brian Jon Peter | Complex oligonucleotide primer mix |
EP2190988A4 (en) | 2007-08-07 | 2010-12-22 | Agency Science Tech & Res | Integrated microfluidic device for gene synthesis |
WO2009023257A1 (en) | 2007-08-15 | 2009-02-19 | Massachusetts Institute Of Technology | Microstructures for fluidic ballasting and flow control |
US20090053704A1 (en) | 2007-08-24 | 2009-02-26 | Natalia Novoradovskaya | Stabilization of nucleic acids on solid supports |
WO2009039208A1 (en) | 2007-09-17 | 2009-03-26 | Twof, Inc. | Supramolecular nanostamping printing device |
US7790387B2 (en) | 2007-09-24 | 2010-09-07 | Agilent Technologies, Inc. | Thiocarbonate linkers for polynucleotides |
AU2008307617B2 (en) | 2007-09-28 | 2013-05-23 | Pacific Biosciences Of California, Inc. | Error-free amplification of DNA for clonal sequencing |
WO2009070665A1 (en) | 2007-11-27 | 2009-06-04 | Massachusetts Institute Of Technology | Near field detector for integrated surface plasmon resonance biosensor applications |
US9286439B2 (en) | 2007-12-17 | 2016-03-15 | Yeda Research And Development Co Ltd | System and method for editing and manipulating DNA |
US9540637B2 (en) | 2008-01-09 | 2017-01-10 | Life Technologies Corporation | Nucleic acid adaptors and uses thereof |
WO2009089384A1 (en) | 2008-01-09 | 2009-07-16 | Life Technologies | Method of making a paired tag library for nucleic acid sequencing |
US7682809B2 (en) | 2008-01-11 | 2010-03-23 | Agilent Technologies, Inc. | Direct ATP release sequencing |
EP2238459B1 (en) | 2008-01-23 | 2019-05-08 | Roche Diagnostics GmbH | Integrated instrument performing synthesis and amplification |
US8304273B2 (en) | 2008-01-24 | 2012-11-06 | Massachusetts Institute Of Technology | Insulated nanogap devices and methods of use thereof |
WO2009097368A2 (en) | 2008-01-28 | 2009-08-06 | Complete Genomics, Inc. | Methods and compositions for efficient base calling in sequencing reactions |
US20090194483A1 (en) | 2008-01-31 | 2009-08-06 | Robotti Karla M | Microfluidic device having monolithic separation medium and method of use |
WO2009113709A1 (en) * | 2008-03-11 | 2009-09-17 | The University of Tokyo | Method of preparing dna fragment having sticky end |
US20090230044A1 (en) | 2008-03-13 | 2009-09-17 | Agilent Technologies, Inc. | Microfluid Chip Cleaning |
US20090238722A1 (en) | 2008-03-18 | 2009-09-24 | Agilent Technologies, Inc. | Pressure-Reinforced Fluidic Chip |
FR2930137B1 (en) * | 2008-04-18 | 2010-04-23 | Corevalve Inc | TREATMENT EQUIPMENT FOR A CARDIAC VALVE, IN PARTICULAR A MITRAL VALVE. |
US8911948B2 (en) | 2008-04-30 | 2014-12-16 | Integrated Dna Technologies, Inc. | RNase H-based assays utilizing modified RNA monomers |
JP4667490B2 (en) | 2008-07-09 | 2011-04-13 | Mitsubishi Electric Corporation | Cooker |
WO2010014903A1 (en) | 2008-07-31 | 2010-02-04 | Massachusetts Institute Of Technology | Multiplexed olfactory receptor-based microsurface plasmon polariton detector |
US20100069250A1 (en) | 2008-08-16 | 2010-03-18 | The Board Of Trustees Of The Leland Stanford Junior University | Digital PCR Calibration for High Throughput Sequencing |
CA2734235C (en) | 2008-08-22 | 2019-03-26 | Sangamo Biosciences, Inc. | Methods and compositions for targeted single-stranded cleavage and targeted integration |
US8808986B2 (en) | 2008-08-27 | 2014-08-19 | Gen9, Inc. | Methods and devices for high fidelity polynucleotide synthesis |
CN102639552B (en) | 2008-09-05 | 2016-05-25 | The Royal Institution for the Advancement of Learning/McGill University | RNA monomers containing O-acetal levulinyl ester groups and their use in RNA microarrays |
US20100076183A1 (en) | 2008-09-22 | 2010-03-25 | Dellinger Douglas J | Protected monomer and method of final deprotection for rna synthesis |
US8213015B2 (en) | 2008-09-25 | 2012-07-03 | Agilent Technologies, Inc. | Integrated flow cell with semiconductor oxide tubing |
US20100090341A1 (en) | 2008-10-14 | 2010-04-15 | Molecular Imprints, Inc. | Nano-patterned active layers formed by nano-imprint lithography |
US20100301398A1 (en) | 2009-05-29 | 2010-12-02 | Ion Torrent Systems Incorporated | Methods and apparatus for measuring analytes |
US9080211B2 (en) | 2008-10-24 | 2015-07-14 | Epicentre Technologies Corporation | Transposon end compositions and methods for modifying nucleic acids |
US8357489B2 (en) | 2008-11-13 | 2013-01-22 | The Board Of Trustees Of The Leland Stanford Junior University | Methods for detecting hepatocellular carcinoma |
US8963262B2 (en) | 2009-08-07 | 2015-02-24 | Massachusetts Institute Of Technology | Method and apparatus for forming MEMS device |
TW201104253A (en) | 2008-12-31 | 2011-02-01 | Nat Health Research Institutes | Microarray chip and method of fabricating for the same |
US8569046B2 (en) | 2009-02-20 | 2013-10-29 | Massachusetts Institute Of Technology | Microarray with microchannels |
DK2398915T3 (en) | 2009-02-20 | 2016-12-12 | Synthetic Genomics Inc | Synthesis of sequence-verified nucleic acids |
US7862716B2 (en) | 2009-04-13 | 2011-01-04 | Sielc Technologies Corporation | HPLC schematic with integrated sample cleaning system |
WO2010124734A1 (en) | 2009-04-29 | 2010-11-04 | Telecom Italia S.P.A. | Method and apparatus for depositing a biological fluid onto a substrate |
US9085798B2 (en) | 2009-04-30 | 2015-07-21 | Prognosys Biosciences, Inc. | Nucleic acid constructs and methods of use |
EP2248914A1 (en) | 2009-05-05 | 2010-11-10 | Max-Planck-Gesellschaft zur Förderung der Wissenschaften e.V. | The use of class IIB restriction endonucleases in 2nd generation sequencing applications |
US9309557B2 (en) | 2010-12-17 | 2016-04-12 | Life Technologies Corporation | Nucleic acid amplification |
US20100300882A1 (en) | 2009-05-26 | 2010-12-02 | General Electric Company | Devices and methods for in-line sample preparation of materials |
US8309710B2 (en) | 2009-06-29 | 2012-11-13 | Agilent Technologies, Inc. | Use of N-alkyl imidazole for sulfurization of oligonucleotides with an acetyl disulfide |
US8642755B2 (en) | 2009-06-30 | 2014-02-04 | Agilent Technologies, Inc. | Use of thioacetic acid derivatives in the sulfurization of oligonucleotides with phenylacetyl disulfide |
GB0912909D0 (en) | 2009-07-23 | 2009-08-26 | Olink Genomics Ab | Probes for specific analysis of nucleic acids |
US20120184724A1 (en) | 2009-09-22 | 2012-07-19 | Agilent Technologies, Inc. | Protected monomers and methods of deprotection for rna synthesis |
US8975019B2 (en) | 2009-10-19 | 2015-03-10 | University Of Massachusetts | Deducing exon connectivity by RNA-templated DNA ligation/sequencing |
WO2011053957A2 (en) | 2009-11-02 | 2011-05-05 | Gen9, Inc. | Compositions and methods for the regulation of multiple genes of interest in a cell |
US10207240B2 (en) | 2009-11-03 | 2019-02-19 | Gen9, Inc. | Methods and microfluidic devices for the manipulation of droplets in high fidelity polynucleotide assembly |
US20110114549A1 (en) | 2009-11-13 | 2011-05-19 | Agilent Technologies, Inc. | Microfluidic device comprising separation columns |
WO2011066185A1 (en) | 2009-11-25 | 2011-06-03 | Gen9, Inc. | Microfluidic devices and methods for gene synthesis |
EP3597771A1 (en) | 2009-11-25 | 2020-01-22 | Gen9, Inc. | Methods and apparatuses for chip-based dna error reduction |
US9217144B2 (en) | 2010-01-07 | 2015-12-22 | Gen9, Inc. | Assembly of high fidelity polynucleotides |
US9758817B2 (en) | 2010-01-13 | 2017-09-12 | Agilent Technologies, Inc. | Method for identifying a nucleic acid in a sample |
KR101230350B1 (en) | 2010-01-27 | 2013-02-06 | LG Chem, Ltd. | Battery Pack of Excellent Structural Stability |
US20120027786A1 (en) | 2010-02-23 | 2012-02-02 | Massachusetts Institute Of Technology | Genetically programmable pathogen sense and destroy |
GB201003036D0 (en) | 2010-02-23 | 2010-04-07 | Fermentas Uab | Restriction endonucleases and their applications |
US8716467B2 (en) | 2010-03-03 | 2014-05-06 | Gen9, Inc. | Methods and devices for nucleic acid synthesis |
WO2011143556A1 (en) | 2010-05-13 | 2011-11-17 | Gen9, Inc. | Methods for nucleotide sequencing and high fidelity polynucleotide synthesis |
US9187777B2 (en) | 2010-05-28 | 2015-11-17 | Gen9, Inc. | Methods and devices for in situ nucleic acid synthesis |
GB2481425A (en) | 2010-06-23 | 2011-12-28 | Iti Scotland Ltd | Method and device for assembling polynucleic acid sequences |
US8715933B2 (en) | 2010-09-27 | 2014-05-06 | Nabsys, Inc. | Assay methods using nicking endonucleases |
WO2012154201A1 (en) | 2010-10-22 | 2012-11-15 | President And Fellows Of Harvard College | Orthogonal amplification and assembly of nucleic acid sequences |
EP2635679B1 (en) * | 2010-11-05 | 2017-04-19 | Illumina, Inc. | Linking sequence reads using paired code tags |
AU2011338841B2 (en) | 2010-11-12 | 2017-02-16 | Gen9, Inc. | Methods and devices for nucleic acids synthesis |
WO2012064975A1 (en) | 2010-11-12 | 2012-05-18 | Gen9, Inc. | Protein arrays and methods of using and making the same |
EP3564392B1 (en) | 2010-12-17 | 2021-11-24 | Life Technologies Corporation | Methods for nucleic acid amplification |
US9487807B2 (en) | 2010-12-27 | 2016-11-08 | Ibis Biosciences, Inc. | Compositions and methods for producing single-stranded circular DNA |
KR101802908B1 (en) | 2011-03-30 | 2017-11-29 | Toray Industries, Inc. | Membrane-separation-type culture device, membrane-separation-type culture kit, stem cell separation method using same, and separation membrane |
US10131903B2 (en) | 2011-04-01 | 2018-11-20 | The Regents Of The University Of California | Microfluidic platform for synthetic biology applications |
WO2012149171A1 (en) | 2011-04-27 | 2012-11-01 | The Regents Of The University Of California | Designing padlock probes for targeted genomic sequencing |
US8722585B2 (en) | 2011-05-08 | 2014-05-13 | Yan Wang | Methods of making di-tagged DNA libraries from DNA or RNA using double-tagged oligonucleotides |
EP2710172B1 (en) | 2011-05-20 | 2017-03-29 | Fluidigm Corporation | Nucleic acid encoding reactions |
US9752176B2 (en) | 2011-06-15 | 2017-09-05 | Ginkgo Bioworks, Inc. | Methods for preparative in vitro cloning |
US20130045483A1 (en) | 2011-07-01 | 2013-02-21 | Whitehead Institute For Biomedical Research | Yeast cells expressing amyloid beta and uses therefor |
WO2013019361A1 (en) | 2011-07-07 | 2013-02-07 | Life Technologies Corporation | Sequencing methods |
US20130017978A1 (en) | 2011-07-11 | 2013-01-17 | Finnzymes Oy | Methods and transposon nucleic acids for generating a dna library |
US20150203839A1 (en) | 2011-08-26 | 2015-07-23 | Gen9, Inc. | Compositions and Methods for High Fidelity Assembly of Nucleic Acids |
EP2748318B1 (en) | 2011-08-26 | 2015-11-04 | Gen9, Inc. | Compositions and methods for high fidelity assembly of nucleic acids |
EP2753714B1 (en) | 2011-09-06 | 2017-04-12 | Gen-Probe Incorporated | Circularized templates for sequencing |
US8840981B2 (en) | 2011-09-09 | 2014-09-23 | Eastman Kodak Company | Microfluidic device with multilayer coating |
EP3964285A1 (en) | 2011-09-26 | 2022-03-09 | Thermo Fisher Scientific Geneart GmbH | High efficiency, small volume nucleic acid synthesis |
EP2766838A2 (en) | 2011-10-11 | 2014-08-20 | Life Technologies Corporation | Systems and methods for analysis and interpretation of nucleic acid sequence data |
CA2852949A1 (en) | 2011-10-19 | 2013-04-25 | Nugen Technologies, Inc. | Compositions and methods for directional nucleic acid amplification and sequencing |
US8987174B2 (en) | 2011-10-28 | 2015-03-24 | Prognosys Biosciences, Inc. | Methods for manufacturing molecular arrays |
US8815782B2 (en) | 2011-11-11 | 2014-08-26 | Agilent Technologies, Inc. | Use of DNAzymes for analysis of an RNA sample |
US8450107B1 (en) | 2011-11-30 | 2013-05-28 | The Broad Institute Inc. | Nucleotide-specific recognition sequences for designer TAL effectors |
US20130137173A1 (en) | 2011-11-30 | 2013-05-30 | Feng Zhang | Nucleotide-specific recognition sequences for designer tal effectors |
EP2599785A1 (en) | 2011-11-30 | 2013-06-05 | Agilent Technologies, Inc. | Novel methods for the synthesis and purification of oligomers |
CA2862364C (en) | 2011-12-30 | 2021-02-23 | Quest Diagnostics Investments Incorporated | Nucleic acid analysis using emulsion pcr |
ES2776673T3 (en) | 2012-02-27 | 2020-07-31 | Univ North Carolina Chapel Hill | Methods and uses for molecular tags |
US9150853B2 (en) | 2012-03-21 | 2015-10-06 | Gen9, Inc. | Methods for screening proteins using DNA encoded chemical libraries as templates for enzyme catalysis |
US20150353921A9 (en) | 2012-04-16 | 2015-12-10 | Jingdong Tian | Method of on-chip nucleic acid molecule synthesis |
LT2841601T (en) | 2012-04-24 | 2019-07-10 | Gen9, Inc. | Methods for sorting nucleic acids and multiplexed preparative in vitro cloning |
US20130281308A1 (en) | 2012-04-24 | 2013-10-24 | Gen9, Inc. | Methods for sorting nucleic acids and preparative in vitro cloning |
CN104736722B (en) | 2012-05-21 | 2018-08-07 | The Scripps Research Institute | Sample preparation methods |
US10308979B2 (en) | 2012-06-01 | 2019-06-04 | Agilent Technologies, Inc. | Target enrichment and labeling for multi-kilobase DNA |
SG11201407818PA (en) | 2012-06-01 | 2014-12-30 | European Molecular Biology Lab Embl | High-capacity storage of digital information in dna |
US9102936B2 (en) | 2012-06-11 | 2015-08-11 | Agilent Technologies, Inc. | Method of adaptor-dimer subtraction using a CRISPR CAS6 protein |
WO2014004393A1 (en) | 2012-06-25 | 2014-01-03 | Gen9, Inc. | Methods for nucleic acid assembly and high throughput sequencing |
US9255245B2 (en) | 2012-07-03 | 2016-02-09 | Agilent Technologies, Inc. | Sample probes and methods for sampling intracellular material |
US20140038240A1 (en) * | 2012-07-10 | 2014-02-06 | Pivot Bio, Inc. | Methods for multipart, modular and scarless assembly of dna molecules |
WO2014012071A1 (en) | 2012-07-12 | 2014-01-16 | Massachusetts Institute Of Technology | Methods and apparatus for assembly |
WO2014021938A1 (en) | 2012-08-02 | 2014-02-06 | The Board Of Trustees Of The Leland Stanford Junior University | Methods and apparatus for nucleic acid synthesis using oligo-templated polymerization |
EP2890836B1 (en) | 2012-08-31 | 2019-07-17 | The Scripps Research Institute | Methods related to modulators of eukaryotic cells |
US9328376B2 (en) | 2012-09-05 | 2016-05-03 | Bio-Rad Laboratories, Inc. | Systems and methods for stabilizing droplets |
EP3252174B1 (en) | 2012-10-15 | 2020-07-01 | Life Technologies Corporation | Compositions, methods, systems and kits for target nucleic acid enrichment |
KR20140048733A (en) | 2012-10-16 | 2014-04-24 | 삼성전자주식회사 | Multiwell plate and method for analyzing target material using the same |
WO2014088693A1 (en) * | 2012-12-06 | 2014-06-12 | Agilent Technologies, Inc. | Molecular fabrication |
WO2014092886A2 (en) | 2012-12-10 | 2014-06-19 | Agilent Technologies, Inc. | Pairing code directed assembly |
WO2014093694A1 (en) | 2012-12-12 | 2014-06-19 | The Broad Institute, Inc. | Crispr-cas nickase systems, methods and compositions for sequence manipulation in eukaryotes |
SG11201506750QA (en) | 2013-02-28 | 2015-09-29 | Univ Nanyang Tech | Method of manufacturing a device for supporting biological material growth and device therefrom |
EP2964778B1 (en) | 2013-03-05 | 2019-10-09 | Agilent Technologies, Inc. | Detection of genomic rearrangements by sequence capture |
EP2971034B1 (en) | 2013-03-13 | 2020-12-02 | Gen9, Inc. | Compositions, methods and apparatus for oligonucleotides synthesis |
WO2014160059A1 (en) | 2013-03-13 | 2014-10-02 | Gen9, Inc. | Compositions and methods for synthesis of high fidelity oligonucleotides |
US20140274741A1 (en) | 2013-03-15 | 2014-09-18 | The Translational Genomics Research Institute | Methods to capture and sequence large fragments of dna and diagnostic methods for neuromuscular disease |
IL292498A (en) | 2013-03-15 | 2022-06-01 | Gen9 Inc | Compositions and methods for multiplex nucleic acids synthesis |
US20140274729A1 (en) | 2013-03-15 | 2014-09-18 | Nugen Technologies, Inc. | Methods, compositions and kits for generation of stranded rna or dna libraries |
US9279149B2 (en) | 2013-04-02 | 2016-03-08 | Molecular Assemblies, Inc. | Methods and apparatus for synthesizing nucleic acids |
US9771613B2 (en) | 2013-04-02 | 2017-09-26 | Molecular Assemblies, Inc. | Methods and apparatus for synthesizing nucleic acid |
US10683536B2 (en) | 2013-04-02 | 2020-06-16 | Molecular Assemblies, Inc. | Reusable initiators for synthesizing nucleic acids |
US20150010953A1 (en) | 2013-07-03 | 2015-01-08 | Agilent Technologies, Inc. | Method for producing a population of oligonucleotides that has reduced synthesis errors |
KR20150005062A (en) | 2013-07-04 | 2015-01-14 | 삼성전자주식회사 | Processor using mini-cores |
US10421957B2 (en) | 2013-07-29 | 2019-09-24 | Agilent Technologies, Inc. | DNA assembly using an RNA-programmable nickase |
US20160168564A1 (en) | 2013-07-30 | 2016-06-16 | Gen9, Inc. | Methods for the Production of Long Length Clonal Sequence Verified Nucleic Acid Constructs |
DK3030682T3 (en) | 2013-08-05 | 2020-09-14 | Twist Bioscience Corp | DE NOVO SYNTHESIZED GENE LIBRARIES |
US9589445B2 (en) | 2013-08-07 | 2017-03-07 | Nike, Inc. | Activity recognition with activity reminders |
US20170175110A1 (en) | 2013-11-27 | 2017-06-22 | Gen9, Inc. | Libraries of Nucleic Acids and Methods for Making the Same |
US20150159152A1 (en) | 2013-12-09 | 2015-06-11 | Integrated Dna Technologies, Inc. | Long nucleic acid sequences containing variable regions |
GB2521387B (en) | 2013-12-18 | 2020-05-27 | Ge Healthcare Uk Ltd | Oligonucleotide data storage on solid supports |
US9587268B2 (en) | 2014-01-29 | 2017-03-07 | Agilent Technologies Inc. | Fast hybridization for next generation sequencing target enrichment |
US10287627B2 (en) | 2014-02-08 | 2019-05-14 | The Regents Of The University Of Colorado, A Body Corporate | Multiplexed linking PCR |
WO2015195178A2 (en) | 2014-03-27 | 2015-12-23 | Canon U.S. Life Sciences, Inc. | Integration of ex situ fabricated porous polymer monoliths into fluidic chips |
CN106232906A (en) | 2014-04-15 | 2016-12-14 | 沃尔沃建造设备有限公司 | Device and control method thereof for the electromotor and hydraulic pump that control engineering machinery |
CN106536734B (en) | 2014-05-16 | 2020-12-22 | Illumina公司 | Nucleic acid synthesis technology |
US20150361422A1 (en) | 2014-06-16 | 2015-12-17 | Agilent Technologies, Inc. | High throughput gene assembly in droplets |
US20150361423A1 (en) | 2014-06-16 | 2015-12-17 | Agilent Technologies, Inc. | High throughput gene assembly in droplets |
US10870845B2 (en) | 2014-07-01 | 2020-12-22 | Global Life Sciences Solutions Operations UK Ltd | Methods for capturing nucleic acids |
US10472620B2 (en) | 2014-07-01 | 2019-11-12 | General Electric Company | Method, substrate and device for separating nucleic acids |
US20170198268A1 (en) | 2014-07-09 | 2017-07-13 | Gen9, Inc. | Compositions and Methods for Site-Directed DNA Nicking and Cleaving |
EP3169781B1 (en) | 2014-07-15 | 2020-04-08 | Life Technologies Corporation | Compositions and methods for nucleic acid assembly |
WO2016022557A1 (en) | 2014-08-05 | 2016-02-11 | Twist Bioscience Corporation | Cell free cloning of nucleic acids |
CN107278234A (en) | 2014-10-03 | 2017-10-20 | 生命科技公司 | Gene order verification composition, method and kit |
CN113930455A (en) | 2014-10-09 | 2022-01-14 | 生命技术公司 | CRISPR oligonucleotides and gene clips |
CN107107058B (en) | 2014-10-22 | 2021-08-10 | 加利福尼亚大学董事会 | High-definition micro-droplet printer |
WO2016126987A1 (en) | 2015-02-04 | 2016-08-11 | Twist Bioscience Corporation | Compositions and methods for synthetic gene assembly |
WO2016126882A1 (en) | 2015-02-04 | 2016-08-11 | Twist Bioscience Corporation | Methods and devices for de novo oligonucleic acid assembly |
US9981239B2 (en) | 2015-04-21 | 2018-05-29 | Twist Bioscience Corporation | Devices and methods for oligonucleic acid library synthesis |
WO2016183100A1 (en) | 2015-05-11 | 2016-11-17 | Twist Bioscience Corporation | Compositions and methods for nucleic acid amplification |
- 2016
  - 2016-02-04: WO application PCT/US2016/016636 (WO2016126987A1), Application Filing
  - 2016-02-04: CA application CA2975855A (CA2975855A1), Pending
  - 2016-05-13: US application 15/154,879 (US9677067B2), Active
- 2017
  - 2017-02-15: US application 15/433,909 (US20170159044A1), Abandoned
- 2019
  - 2019-08-02: US application 16/530,717 (US20190352635A1), Abandoned
- 2021
  - 2021-05-13: US application 17/320,127 (US20220064628A1), Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1992006200A1 (en) * | 1990-09-28 | 1992-04-16 | F. Hoffmann-La-Roche Ag | 5' to 3' exonuclease mutations of thermostable dna polymerases |
US20130254934A1 (en) * | 2010-12-10 | 2013-09-26 | Takeshi Nakano | Disease-resistant plant and method for preparing the same |
US20160230175A1 (en) * | 2015-02-11 | 2016-08-11 | Agilent Technologies, Inc. | Methods and compositions for rapid seamless dna assembly |
US20180355351A1 (en) * | 2017-06-12 | 2018-12-13 | Twist Bioscience Corporation | Methods for seamless nucleic acid assembly |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11559778B2 (en) | 2013-08-05 | 2023-01-24 | Twist Bioscience Corporation | De novo synthesized gene libraries |
US11452980B2 (en) | 2013-08-05 | 2022-09-27 | Twist Bioscience Corporation | De novo synthesized gene libraries |
US11697668B2 (en) | 2015-02-04 | 2023-07-11 | Twist Bioscience Corporation | Methods and devices for de novo oligonucleic acid assembly |
US11691118B2 (en) | 2015-04-21 | 2023-07-04 | Twist Bioscience Corporation | Devices and methods for oligonucleic acid library synthesis |
US11807956B2 (en) | 2015-09-18 | 2023-11-07 | Twist Bioscience Corporation | Oligonucleic acid variant libraries and synthesis thereof |
US11512347B2 (en) | 2015-09-22 | 2022-11-29 | Twist Bioscience Corporation | Flexible substrates for nucleic acid synthesis |
US11562103B2 (en) | 2016-09-21 | 2023-01-24 | Twist Bioscience Corporation | Nucleic acid based data storage |
US12056264B2 (en) | 2016-09-21 | 2024-08-06 | Twist Bioscience Corporation | Nucleic acid based data storage |
US11550939B2 (en) | 2017-02-22 | 2023-01-10 | Twist Bioscience Corporation | Nucleic acid based data storage using enzymatic bioencryption |
US11745159B2 (en) | 2017-10-20 | 2023-09-05 | Twist Bioscience Corporation | Heated nanowells for polynucleotide synthesis |
US11492665B2 (en) | 2018-05-18 | 2022-11-08 | Twist Bioscience Corporation | Polynucleotides, reagents, and methods for nucleic acid hybridization |
US11732294B2 (en) | 2018-05-18 | 2023-08-22 | Twist Bioscience Corporation | Polynucleotides, reagents, and methods for nucleic acid hybridization |
US11492727B2 (en) | 2019-02-26 | 2022-11-08 | Twist Bioscience Corporation | Variant nucleic acid libraries for GLP1 receptor |
US11492728B2 (en) | 2019-02-26 | 2022-11-08 | Twist Bioscience Corporation | Variant nucleic acid libraries for antibody optimization |
US12091777B2 (en) | 2019-09-23 | 2024-09-17 | Twist Bioscience Corporation | Variant nucleic acid libraries for CRTH2 |
US12018065B2 (en) | 2020-04-27 | 2024-06-25 | Twist Bioscience Corporation | Variant nucleic acid libraries for coronavirus |
US11970697B2 (en) | 2020-10-19 | 2024-04-30 | Twist Bioscience Corporation | Methods of synthesizing oligonucleotides using tethered nucleotides |
Also Published As
Publication number | Publication date |
---|---|
WO2016126987A1 (en) | 2016-08-11 |
US20160264958A1 (en) | 2016-09-15 |
US9677067B2 (en) | 2017-06-13 |
US20190352635A1 (en) | 2019-11-21 |
CA2975855A1 (en) | 2016-08-11 |
US20170159044A1 (en) | 2017-06-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220064628A1 (en) | | Compositions and methods for synthetic gene assembly |
US20240132872A1 (en) | | Capture of nucleic acids using a nucleic acid-guided nuclease-based system |
US20210332078A1 (en) | | Compositions and methods for nucleic acid amplification |
US20160257985A1 (en) | | Degradable adaptors for background reduction |
CA3106822C (en) | | Method for editing dna in cell-free system |
KR102278495B1 (en) | | DNA production method and kit for linking DNA fragments |
JP2022002538A (en) | | Ligase-assisted nucleic acid circularization and amplification |
US20240271328A1 (en) | | Rapid library construction for high throughput sequencing |
WO2023038145A1 (en) | | Method for producing circular dna |
US20220380738A1 (en) | | Programmable Cleavage of Double-Stranded DNA |
Yang et al. | | Generation of Artificial Nucleic Acid Systems for Biotechnology and Therapeutics |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| | AS | Assignment | Owner name: TWIST BIOSCIENCE CORPORATION, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:TORO, ESTEBAN;TREUSCH, SEBASTIAN;CHEN, SIYUAN;AND OTHERS;SIGNING DATES FROM 20160718 TO 20170201;REEL/FRAME:060101/0593 |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: ADVISORY ACTION MAILED |
| | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |