US20210040475A1 - Compositions and methods for preparing nucleic acid libraries
- Publication number
- US20210040475A1 (application number US17/044,723)
- Authority
- US
- United States
- Prior art keywords
- primer
- adapter
- sequence
- pool
- kit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
- 238000000034 method Methods 0.000 title claims abstract description 200
- 239000000203 mixture Substances 0.000 title claims abstract description 42
- 150000007523 nucleic acids Chemical class 0.000 title abstract description 32
- 102000039446 nucleic acids Human genes 0.000 title abstract description 24
- 108020004707 nucleic acids Proteins 0.000 title abstract description 24
- 102000040430 polynucleotide Human genes 0.000 claims abstract description 268
- 108091033319 polynucleotide Proteins 0.000 claims abstract description 268
- 239000002157 polynucleotide Substances 0.000 claims abstract description 268
- 238000006243 chemical reaction Methods 0.000 claims abstract description 247
- 230000003321 amplification Effects 0.000 claims abstract description 124
- 238000003199 nucleic acid amplification method Methods 0.000 claims abstract description 124
- 238000012163 sequencing technique Methods 0.000 claims abstract description 67
- 239000011541 reaction mixture Substances 0.000 claims abstract description 16
- 125000003729 nucleotide group Chemical group 0.000 claims description 168
- 239000002773 nucleotide Substances 0.000 claims description 164
- 108020004414 DNA Proteins 0.000 claims description 150
- 230000000295 complement effect Effects 0.000 claims description 47
- 102100033215 DNA nucleotidylexotransferase Human genes 0.000 claims description 34
- 108010008286 DNA nucleotidylexotransferase Proteins 0.000 claims description 34
- 238000006116 polymerization reaction Methods 0.000 claims description 31
- 102000053602 DNA Human genes 0.000 claims description 29
- LSNNMFCWUKXFEE-UHFFFAOYSA-M Bisulfite Chemical compound OS([O-])=O LSNNMFCWUKXFEE-UHFFFAOYSA-M 0.000 claims description 16
- 230000011987 methylation Effects 0.000 claims description 13
- 238000007069 methylation reaction Methods 0.000 claims description 13
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 claims description 12
- 108020004682 Single-Stranded DNA Proteins 0.000 claims description 9
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 claims description 8
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 claims description 8
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 claims description 8
- 230000000379 polymerizing effect Effects 0.000 claims description 8
- 230000030609 dephosphorylation Effects 0.000 claims description 4
- 238000006209 dephosphorylation reaction Methods 0.000 claims description 4
- 238000002360 preparation method Methods 0.000 abstract description 15
- 239000000523 sample Substances 0.000 description 89
- 239000000047 product Substances 0.000 description 68
- 108091034117 Oligonucleotide Proteins 0.000 description 42
- 239000011324 bead Substances 0.000 description 39
- 239000000872 buffer Substances 0.000 description 36
- 239000012634 fragment Substances 0.000 description 36
- 238000003752 polymerase chain reaction Methods 0.000 description 36
- 230000008569 process Effects 0.000 description 32
- 238000009396 hybridization Methods 0.000 description 28
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 25
- 210000004027 cell Anatomy 0.000 description 25
- 230000002068 genetic effect Effects 0.000 description 24
- 101000960946 Homo sapiens Interleukin-19 Proteins 0.000 description 23
- 102100039879 Interleukin-19 Human genes 0.000 description 23
- 206010028980 Neoplasm Diseases 0.000 description 23
- 230000001364 causal effect Effects 0.000 description 21
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 20
- 239000012149 elution buffer Substances 0.000 description 20
- 230000035772 mutation Effects 0.000 description 20
- 239000000758 substrate Substances 0.000 description 20
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 18
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 18
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 18
- 238000011282 treatment Methods 0.000 description 17
- GUAHPAJOXVYFON-ZETCQYMHSA-N (8S)-8-amino-7-oxononanoic acid zwitterion Chemical compound C[C@H](N)C(=O)CCCCCC(O)=O GUAHPAJOXVYFON-ZETCQYMHSA-N 0.000 description 16
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 16
- 102000003960 Ligases Human genes 0.000 description 15
- 108090000364 Ligases Proteins 0.000 description 15
- 201000011510 cancer Diseases 0.000 description 15
- 239000003153 chemical reaction reagent Substances 0.000 description 15
- 238000000137 annealing Methods 0.000 description 14
- 201000010099 disease Diseases 0.000 description 14
- 238000012545 processing Methods 0.000 description 13
- 239000006228 supernatant Substances 0.000 description 13
- 102100034343 Integrase Human genes 0.000 description 12
- 102000054765 polymorphisms of proteins Human genes 0.000 description 12
- 238000005406 washing Methods 0.000 description 12
- 108010061982 DNA Ligases Proteins 0.000 description 11
- 102000012410 DNA Ligases Human genes 0.000 description 11
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 11
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 11
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 11
- 238000003149 assay kit Methods 0.000 description 11
- 102000052116 epidermal growth factor receptor activity proteins Human genes 0.000 description 11
- 108700015053 epidermal growth factor receptor activity proteins Proteins 0.000 description 11
- YOHYSYJDKVYCJI-UHFFFAOYSA-N n-[3-[[6-[3-(trifluoromethyl)anilino]pyrimidin-4-yl]amino]phenyl]cyclopropanecarboxamide Chemical compound FC(F)(F)C1=CC=CC(NC=2N=CN=C(NC=3C=C(NC(=O)C4CC4)C=CC=3)C=2)=C1 YOHYSYJDKVYCJI-UHFFFAOYSA-N 0.000 description 11
- 108090000623 proteins and genes Proteins 0.000 description 11
- 239000002096 quantum dot Substances 0.000 description 11
- 108091092878 Microsatellite Proteins 0.000 description 10
- 210000004369 blood Anatomy 0.000 description 10
- 239000008280 blood Substances 0.000 description 10
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 10
- 238000013467 fragmentation Methods 0.000 description 10
- 238000006062 fragmentation reaction Methods 0.000 description 10
- 230000035945 sensitivity Effects 0.000 description 10
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 9
- 102000004190 Enzymes Human genes 0.000 description 9
- 108090000790 Enzymes Proteins 0.000 description 9
- 102100030708 GTPase KRas Human genes 0.000 description 9
- 201000004283 Shwachman-Diamond syndrome Diseases 0.000 description 9
- 206010041662 Splinter Diseases 0.000 description 9
- 229940088598 enzyme Drugs 0.000 description 9
- 238000003780 insertion Methods 0.000 description 9
- 230000037431 insertion Effects 0.000 description 9
- 239000011780 sodium chloride Substances 0.000 description 9
- 229940035893 uracil Drugs 0.000 description 9
- 108091028043 Nucleic acid sequence Proteins 0.000 description 8
- 229910019142 PO4 Inorganic materials 0.000 description 8
- 230000027455 binding Effects 0.000 description 8
- 230000015572 biosynthetic process Effects 0.000 description 8
- 238000012217 deletion Methods 0.000 description 8
- 230000037430 deletion Effects 0.000 description 8
- 238000009826 distribution Methods 0.000 description 8
- 238000001962 electrophoresis Methods 0.000 description 8
- 238000005516 engineering process Methods 0.000 description 8
- 238000010438 heat treatment Methods 0.000 description 8
- 229910052739 hydrogen Inorganic materials 0.000 description 8
- 239000001257 hydrogen Substances 0.000 description 8
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 8
- 239000010452 phosphate Substances 0.000 description 8
- 230000002441 reversible effect Effects 0.000 description 8
- 238000003860 storage Methods 0.000 description 8
- 238000012360 testing method Methods 0.000 description 8
- 239000011534 wash buffer Substances 0.000 description 8
- 241000588724 Escherichia coli Species 0.000 description 7
- 238000004458 analytical method Methods 0.000 description 7
- 239000012148 binding buffer Substances 0.000 description 7
- 238000002474 experimental method Methods 0.000 description 7
- 238000000605 extraction Methods 0.000 description 7
- 239000012530 fluid Substances 0.000 description 7
- 238000011534 incubation Methods 0.000 description 7
- 230000000670 limiting effect Effects 0.000 description 7
- 238000000746 purification Methods 0.000 description 7
- RYVNIFSIEDRLSJ-UHFFFAOYSA-N 5-(hydroxymethyl)cytosine Chemical compound NC=1NC(=O)N=CC=1CO RYVNIFSIEDRLSJ-UHFFFAOYSA-N 0.000 description 6
- 101000605639 Homo sapiens Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform Proteins 0.000 description 6
- 102100038332 Phosphatidylinositol 4,5-bisphosphate 3-kinase catalytic subunit alpha isoform Human genes 0.000 description 6
- 238000001816 cooling Methods 0.000 description 6
- cord blood Substances 0.000 description 6
- 238000004925 denaturation Methods 0.000 description 6
- 230000036425 denaturation Effects 0.000 description 6
- 229920001519 homopolymer Polymers 0.000 description 6
- 229910052751 metal Inorganic materials 0.000 description 6
- 239000002184 metal Substances 0.000 description 6
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 6
- 108091008146 restriction endonucleases Proteins 0.000 description 6
- 238000000527 sonication Methods 0.000 description 6
- 238000006467 substitution reaction Methods 0.000 description 6
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 5
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 5
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 5
- 229910021580 Cobalt(II) chloride Inorganic materials 0.000 description 5
- 108010017826 DNA Polymerase I Proteins 0.000 description 5
- 102000004594 DNA Polymerase I Human genes 0.000 description 5
- 206010011878 Deafness Diseases 0.000 description 5
- 241000238557 Decapoda Species 0.000 description 5
- 208000002537 Neuronal Ceroid-Lipofuscinoses Diseases 0.000 description 5
- 230000001419 dependent effect Effects 0.000 description 5
- 208000016354 hearing loss disease Diseases 0.000 description 5
- 238000002156 mixing Methods 0.000 description 5
- 210000002381 plasma Anatomy 0.000 description 5
- 102000004169 proteins and genes Human genes 0.000 description 5
- 238000013442 quality metrics Methods 0.000 description 5
- 239000003161 ribonuclease inhibitor Substances 0.000 description 5
- 239000000243 solution Substances 0.000 description 5
- 241000894007 species Species 0.000 description 5
- 210000001519 tissue Anatomy 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- 229930024421 Adenine Natural products 0.000 description 4
- 108020004635 Complementary DNA Proteins 0.000 description 4
- BAWFJGJZGIEFAR-NNYOXOHSSA-N NAD zwitterion Chemical compound NC(=O)C1=CC=C[N+]([C@H]2[C@@H]([C@H](O)[C@@H](COP([O-])(=O)OP(O)(=O)OC[C@@H]3[C@H]([C@@H](O)[C@@H](O3)N3C4=NC=NC(N)=C4N=C3)O)O2)O)=C1 BAWFJGJZGIEFAR-NNYOXOHSSA-N 0.000 description 4
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 4
- 108010090804 Streptavidin Proteins 0.000 description 4
- 208000037065 Subacute sclerosing leukoencephalitis Diseases 0.000 description 4
- 206010042297 Subacute sclerosing panencephalitis Diseases 0.000 description 4
- 229960000643 adenine Drugs 0.000 description 4
- 238000010804 cDNA synthesis Methods 0.000 description 4
- 230000001413 cellular effect Effects 0.000 description 4
- 239000003795 chemical substances by application Substances 0.000 description 4
- 210000000349 chromosome Anatomy 0.000 description 4
- 239000002299 complementary DNA Substances 0.000 description 4
- 231100000895 deafness Toxicity 0.000 description 4
- 208000035475 disorder Diseases 0.000 description 4
- 230000007613 environmental effect Effects 0.000 description 4
- 238000002955 isolation Methods 0.000 description 4
- 201000008051 neuronal ceroid lipofuscinosis Diseases 0.000 description 4
- 230000009871 nonspecific binding Effects 0.000 description 4
- 201000006790 nonsyndromic deafness Diseases 0.000 description 4
- UEZVMMHDMIWARA-UHFFFAOYSA-M phosphonate Chemical compound [O-]P(=O)=O UEZVMMHDMIWARA-UHFFFAOYSA-M 0.000 description 4
- 238000010839 reverse transcription Methods 0.000 description 4
- 210000002966 serum Anatomy 0.000 description 4
- 239000004055 small Interfering RNA Substances 0.000 description 4
- 239000001509 sodium citrate Substances 0.000 description 4
- NLJMYIDDQXHKNR-UHFFFAOYSA-K sodium citrate Chemical compound O.O.[Na+].[Na+].[Na+].[O-]C(=O)CC(O)(CC([O-])=O)C([O-])=O NLJMYIDDQXHKNR-UHFFFAOYSA-K 0.000 description 4
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 4
- 102100033793 ALK tyrosine kinase receptor Human genes 0.000 description 3
- 230000030933 DNA methylation on cytosine Effects 0.000 description 3
- 102100039788 GTPase NRas Human genes 0.000 description 3
- 101000744505 Homo sapiens GTPase NRas Proteins 0.000 description 3
- 101000984753 Homo sapiens Serine/threonine-protein kinase B-raf Proteins 0.000 description 3
- 206010020608 Hypercoagulation Diseases 0.000 description 3
- 241000124008 Mammalia Species 0.000 description 3
- 102100027103 Serine/threonine-protein kinase B-raf Human genes 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 238000013459 approach Methods 0.000 description 3
- 239000012472 biological sample Substances 0.000 description 3
- 230000000903 blocking effect Effects 0.000 description 3
- 210000001124 body fluid Anatomy 0.000 description 3
- 210000000481 breast Anatomy 0.000 description 3
- 230000009850 completed effect Effects 0.000 description 3
- 239000002131 composite material Substances 0.000 description 3
- 230000001143 conditioned effect Effects 0.000 description 3
- 230000000875 corresponding effect Effects 0.000 description 3
- 229940104302 cytosine Drugs 0.000 description 3
- 238000007405 data analysis Methods 0.000 description 3
- 230000003247 decreasing effect Effects 0.000 description 3
- 230000007812 deficiency Effects 0.000 description 3
- 230000000593 degrading effect Effects 0.000 description 3
- 238000001514 detection method Methods 0.000 description 3
- 238000006073 displacement reaction Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 231100000888 hearing loss Toxicity 0.000 description 3
- 230000010370 hearing loss Effects 0.000 description 3
- 238000003505 heat denaturation Methods 0.000 description 3
- 238000012165 high-throughput sequencing Methods 0.000 description 3
- 208000015181 infectious disease Diseases 0.000 description 3
- 230000000977 initiatory effect Effects 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 229950006238 nadide Drugs 0.000 description 3
- 102200006532 rs112445441 Human genes 0.000 description 3
- 102200085789 rs121913279 Human genes 0.000 description 3
- 102200006537 rs121913529 Human genes 0.000 description 3
- 102200006541 rs121913530 Human genes 0.000 description 3
- 102200007373 rs17851045 Human genes 0.000 description 3
- 102220014422 rs397517094 Human genes 0.000 description 3
- 102200003102 rs863225281 Human genes 0.000 description 3
- 238000001228 spectrum Methods 0.000 description 3
- 239000000126 substance Substances 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 201000005665 thrombophilia Diseases 0.000 description 3
- 210000002700 urine Anatomy 0.000 description 3
- FHSISDGOVSHJRW-UHFFFAOYSA-N 5-formylcytosine Chemical compound NC1=NC(=O)NC=C1C=O FHSISDGOVSHJRW-UHFFFAOYSA-N 0.000 description 2
- 206010069754 Acquired gene mutation Diseases 0.000 description 2
- 108700028369 Alleles Proteins 0.000 description 2
- 241000713838 Avian myeloblastosis virus Species 0.000 description 2
- 108700020463 BRCA1 Proteins 0.000 description 2
- 102000036365 BRCA1 Human genes 0.000 description 2
- 101150072950 BRCA1 gene Proteins 0.000 description 2
- 102000052609 BRCA2 Human genes 0.000 description 2
- 108700020462 BRCA2 Proteins 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 101150008921 Brca2 gene Proteins 0.000 description 2
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 2
- 108091029430 CpG site Proteins 0.000 description 2
- 208000009283 Craniosynostoses Diseases 0.000 description 2
- 206010049889 Craniosynostosis Diseases 0.000 description 2
- 108010060248 DNA Ligase ATP Proteins 0.000 description 2
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 2
- 108010008532 Deoxyribonuclease I Proteins 0.000 description 2
- 201000010374 Down Syndrome Diseases 0.000 description 2
- 101000827763 Drosophila melanogaster Fibroblast growth factor receptor homolog 1 Proteins 0.000 description 2
- 108700024394 Exon Proteins 0.000 description 2
- NYHBQMYGNKIUIF-UUOKFMHZSA-N Guanosine Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O NYHBQMYGNKIUIF-UUOKFMHZSA-N 0.000 description 2
- 206010020365 Homocystinuria Diseases 0.000 description 2
- 208000008852 Hyperoxaluria Diseases 0.000 description 2
- 101710203526 Integrase Proteins 0.000 description 2
- 108060004795 Methyltransferase Proteins 0.000 description 2
- 241000713869 Moloney murine leukemia virus Species 0.000 description 2
- 208000021642 Muscular disease Diseases 0.000 description 2
- 201000009623 Myopathy Diseases 0.000 description 2
- 208000014060 Niemann-Pick disease Diseases 0.000 description 2
- 102100032028 Non-receptor tyrosine-protein kinase TYK2 Human genes 0.000 description 2
- 101710163270 Nuclease Proteins 0.000 description 2
- 206010033128 Ovarian cancer Diseases 0.000 description 2
- 206010061535 Ovarian neoplasm Diseases 0.000 description 2
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 2
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 2
- 102000015623 Polynucleotide Adenylyltransferase Human genes 0.000 description 2
- 108010024055 Polynucleotide adenylyltransferase Proteins 0.000 description 2
- 241000288906 Primates Species 0.000 description 2
- 206010036790 Productive cough Diseases 0.000 description 2
- 102000006382 Ribonucleases Human genes 0.000 description 2
- 108010083644 Ribonucleases Proteins 0.000 description 2
- 241000714474 Rous sarcoma virus Species 0.000 description 2
- 238000012300 Sequence Analysis Methods 0.000 description 2
- 208000000453 Skin Neoplasms Diseases 0.000 description 2
- 108091027967 Small hairpin RNA Proteins 0.000 description 2
- 108020004459 Small interfering RNA Proteins 0.000 description 2
- CDBYLPFSWZWCQE-UHFFFAOYSA-L Sodium Carbonate Chemical compound [Na+].[Na+].[O-]C([O-])=O CDBYLPFSWZWCQE-UHFFFAOYSA-L 0.000 description 2
- UIIMBOGNXHQVGW-UHFFFAOYSA-M Sodium bicarbonate Chemical compound [Na+].OC([O-])=O UIIMBOGNXHQVGW-UHFFFAOYSA-M 0.000 description 2
- QAOWNCQODCNURD-UHFFFAOYSA-L Sulfate Chemical compound [O-]S([O-])(=O)=O QAOWNCQODCNURD-UHFFFAOYSA-L 0.000 description 2
- 108010010057 TYK2 Kinase Proteins 0.000 description 2
- 108010006785 Taq Polymerase Proteins 0.000 description 2
- 108020004566 Transfer RNA Proteins 0.000 description 2
- 108700036262 Trifunctional Protein Deficiency With Myopathy And Neuropathy Proteins 0.000 description 2
- 206010044688 Trisomy 21 Diseases 0.000 description 2
- 102000006943 Uracil-DNA Glycosidase Human genes 0.000 description 2
- 108010072685 Uracil-DNA Glycosidase Proteins 0.000 description 2
- 208000008383 Wilms tumor Diseases 0.000 description 2
- 239000006227 byproduct Substances 0.000 description 2
- 230000015556 catabolic process Effects 0.000 description 2
- 210000001175 cerebrospinal fluid Anatomy 0.000 description 2
- 238000003776 cleavage reaction Methods 0.000 description 2
- 238000004590 computer program Methods 0.000 description 2
- 239000000356 contaminant Substances 0.000 description 2
- 230000002596 correlated effect Effects 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 238000003745 diagnosis Methods 0.000 description 2
- 235000013305 food Nutrition 0.000 description 2
- 230000006870 function Effects 0.000 description 2
- 201000005706 hypokalemic periodic paralysis Diseases 0.000 description 2
- PHTQWCKDNZKARW-UHFFFAOYSA-N isoamylol Chemical compound CC(C)CCO PHTQWCKDNZKARW-UHFFFAOYSA-N 0.000 description 2
- 238000011901 isothermal amplification Methods 0.000 description 2
- 238000007834 ligase chain reaction Methods 0.000 description 2
- 238000013507 mapping Methods 0.000 description 2
- 239000000463 material Substances 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 201000001441 melanoma Diseases 0.000 description 2
- 108020004999 messenger RNA Proteins 0.000 description 2
- 108091070501 miRNA Proteins 0.000 description 2
- 239000002679 microRNA Substances 0.000 description 2
- 201000008026 nephroblastoma Diseases 0.000 description 2
- 238000007481 next generation sequencing Methods 0.000 description 2
- 150000003230 pyrimidines Chemical class 0.000 description 2
- 238000003753 real-time PCR Methods 0.000 description 2
- 230000008439 repair process Effects 0.000 description 2
- 238000007894 restriction fragment length polymorphism technique Methods 0.000 description 2
- 108020004418 ribosomal RNA Proteins 0.000 description 2
- 210000003296 saliva Anatomy 0.000 description 2
- 230000007017 scission Effects 0.000 description 2
- 210000000582 semen Anatomy 0.000 description 2
- 239000000344 soap Substances 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 230000037439 somatic mutation Effects 0.000 description 2
- 210000003802 sputum Anatomy 0.000 description 2
- 208000024794 sputum Diseases 0.000 description 2
- 238000007725 thermal activation Methods 0.000 description 2
- 229940113082 thymine Drugs 0.000 description 2
- 230000005945 translocation Effects 0.000 description 2
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 1
- 206010000021 21-hydroxylase deficiency Diseases 0.000 description 1
- 108700020831 3-Hydroxyacyl-CoA Dehydrogenase Proteins 0.000 description 1
- 102100021834 3-hydroxyacyl-CoA dehydrogenase Human genes 0.000 description 1
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 1
- LRSASMSXMSNRBT-UHFFFAOYSA-N 5-methylcytosine Chemical compound CC1=CNC(=O)N=C1N LRSASMSXMSNRBT-UHFFFAOYSA-N 0.000 description 1
- 102100032123 AMP deaminase 1 Human genes 0.000 description 1
- 208000000363 Agenesis of Corpus Callosum Diseases 0.000 description 1
- 208000028060 Albright disease Diseases 0.000 description 1
- 102100035028 Alpha-L-iduronidase Human genes 0.000 description 1
- 208000033337 Alpha-sarcoglycan-related limb-girdle muscular dystrophy R3 Diseases 0.000 description 1
- 108010063905 Ampligase Proteins 0.000 description 1
- 102100032187 Androgen receptor Human genes 0.000 description 1
- 102000008873 Angiotensin II receptor Human genes 0.000 description 1
- 108050000824 Angiotensin II receptor Proteins 0.000 description 1
- 102100029470 Apolipoprotein E Human genes 0.000 description 1
- 101710095339 Apolipoprotein E Proteins 0.000 description 1
- 206010068220 Aspartylglucosaminuria Diseases 0.000 description 1
- 206010003594 Ataxia telangiectasia Diseases 0.000 description 1
- 208000001827 Ataxia with vitamin E deficiency Diseases 0.000 description 1
- 208000031212 Autoimmune polyendocrinopathy Diseases 0.000 description 1
- 208000034320 Autosomal recessive spastic ataxia of Charlevoix-Saguenay Diseases 0.000 description 1
- 201000001321 Bardet-Biedl syndrome Diseases 0.000 description 1
- 208000037663 Best vitelliform macular dystrophy Diseases 0.000 description 1
- 208000034067 Beta-sarcoglycan-related limb-girdle muscular dystrophy R4 Diseases 0.000 description 1
- 208000033258 Bifunctional enzyme deficiency Diseases 0.000 description 1
- 208000009766 Blau syndrome Diseases 0.000 description 1
- 208000005692 Bloom Syndrome Diseases 0.000 description 1
- 206010005949 Bone cancer Diseases 0.000 description 1
- 208000018084 Bone neoplasm Diseases 0.000 description 1
- BTBUEUYNUDRHOZ-UHFFFAOYSA-N Borate Chemical compound [O-]B([O-])[O-] BTBUEUYNUDRHOZ-UHFFFAOYSA-N 0.000 description 1
- 241000283690 Bos taurus Species 0.000 description 1
- 208000003174 Brain Neoplasms Diseases 0.000 description 1
- 101150029409 CFTR gene Proteins 0.000 description 1
- 208000022526 Canavan disease Diseases 0.000 description 1
- 241000282472 Canis lupus familiaris Species 0.000 description 1
- 208000031229 Cardiomyopathies Diseases 0.000 description 1
- 108700005857 Carnitine palmitoyl transferase 1A deficiency Proteins 0.000 description 1
- 208000005359 Carnitine palmitoyl transferase 1A deficiency Diseases 0.000 description 1
- 108700005858 Carnitine palmitoyl transferase 2 deficiency Proteins 0.000 description 1
- 201000002929 Carnitine palmitoyltransferase II deficiency Diseases 0.000 description 1
- 208000004918 Cartilage-hair hypoplasia Diseases 0.000 description 1
- 102000011727 Caspases Human genes 0.000 description 1
- 108010076667 Caspases Proteins 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 206010007747 Cataract congenital Diseases 0.000 description 1
- 101900144306 Cauliflower mosaic virus Reverse transcriptase Proteins 0.000 description 1
- 208000031464 Cavernous Central Nervous System Hemangioma Diseases 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 208000032929 Cerebral haemangioma Diseases 0.000 description 1
- 108091006146 Channels Proteins 0.000 description 1
- 201000003679 Charlevoix-Saguenay spastic ataxia Diseases 0.000 description 1
- 206010008723 Chondrodystrophy Diseases 0.000 description 1
- 208000033810 Choroidal dystrophy Diseases 0.000 description 1
- 208000013147 Classic homocystinuria Diseases 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 208000008020 Cohen syndrome Diseases 0.000 description 1
- 208000006992 Color Vision Defects Diseases 0.000 description 1
- 208000001333 Colorectal Neoplasms Diseases 0.000 description 1
- 206010010356 Congenital anomaly Diseases 0.000 description 1
- 206010053138 Congenital aplastic anaemia Diseases 0.000 description 1
- 208000021599 Congenital lactic acidosis, Saguenay-Lac-Saint-Jean type Diseases 0.000 description 1
- 208000029767 Congenital, Hereditary, and Neonatal Diseases and Abnormalities Diseases 0.000 description 1
- 108091035707 Consensus sequence Proteins 0.000 description 1
- 102000012437 Copper-Transporting ATPases Human genes 0.000 description 1
- 208000011231 Crohn disease Diseases 0.000 description 1
- MIKUYHXYGGJMLM-GIMIYPNGSA-N Crotonoside Natural products C1=NC2=C(N)NC(=O)N=C2N1[C@H]1O[C@@H](CO)[C@H](O)[C@@H]1O MIKUYHXYGGJMLM-GIMIYPNGSA-N 0.000 description 1
- 206010071093 Cystathionine beta-synthase deficiency Diseases 0.000 description 1
- 201000003883 Cystic fibrosis Diseases 0.000 description 1
- 206010011777 Cystinosis Diseases 0.000 description 1
- NYHBQMYGNKIUIF-UHFFFAOYSA-N D-guanosine Natural products C1=2NC(N)=NC(=O)C=2N=CN1C1OC(CO)C(O)C1O NYHBQMYGNKIUIF-UHFFFAOYSA-N 0.000 description 1
- 102100036279 DNA (cytosine-5)-methyltransferase 1 Human genes 0.000 description 1
- 102100029995 DNA ligase 1 Human genes 0.000 description 1
- 101710148291 DNA ligase 1 Proteins 0.000 description 1
- 102100033688 DNA ligase 3 Human genes 0.000 description 1
- 102100033195 DNA ligase 4 Human genes 0.000 description 1
- 230000007067 DNA methylation Effects 0.000 description 1
- 102000004099 Deoxyribonuclease (Pyrimidine Dimer) Human genes 0.000 description 1
- 108010082610 Deoxyribonuclease (Pyrimidine Dimer) Proteins 0.000 description 1
- 201000010385 Dihydropyrimidine Dehydrogenase Deficiency Diseases 0.000 description 1
- 206010066054 Dysmorphism Diseases 0.000 description 1
- 208000014094 Dystonic disease Diseases 0.000 description 1
- 101150039808 Egfr gene Proteins 0.000 description 1
- 102100031780 Endonuclease Human genes 0.000 description 1
- 108010042407 Endonucleases Proteins 0.000 description 1
- 108010067770 Endopeptidase K Proteins 0.000 description 1
- 206010014989 Epidermolysis bullosa Diseases 0.000 description 1
- 241000283086 Equidae Species 0.000 description 1
- 108091008794 FGF receptors Proteins 0.000 description 1
- 208000033534 FKRP-related limb-girdle muscular dystrophy R9 Diseases 0.000 description 1
- 108010014172 Factor V Proteins 0.000 description 1
- 201000007371 Factor XIII Deficiency Diseases 0.000 description 1
- 206010016207 Familial Mediterranean fever Diseases 0.000 description 1
- 201000006107 Familial adenomatous polyposis Diseases 0.000 description 1
- 208000001730 Familial dysautonomia Diseases 0.000 description 1
- 201000004939 Fanconi anemia Diseases 0.000 description 1
- 241000282326 Felis catus Species 0.000 description 1
- PXGOKWXKJXAPGV-UHFFFAOYSA-N Fluorine Chemical compound FF PXGOKWXKJXAPGV-UHFFFAOYSA-N 0.000 description 1
- 201000011240 Frontotemporal dementia Diseases 0.000 description 1
- 206010072104 Fructose intolerance Diseases 0.000 description 1
- 208000006517 Fumaric aciduria Diseases 0.000 description 1
- 108700036912 Fumaric aciduria Proteins 0.000 description 1
- 208000025499 G6PD deficiency Diseases 0.000 description 1
- 208000013381 GRACILE syndrome Diseases 0.000 description 1
- 208000027472 Galactosemias Diseases 0.000 description 1
- 201000003741 Gastrointestinal carcinoma Diseases 0.000 description 1
- 208000015872 Gaucher disease Diseases 0.000 description 1
- 208000010055 Globoid Cell Leukodystrophy Diseases 0.000 description 1
- 206010018444 Glucose-6-phosphate dehydrogenase deficiency Diseases 0.000 description 1
- 102100036263 Glutamyl-tRNA(Gln) amidotransferase subunit C, mitochondrial Human genes 0.000 description 1
- 108700006770 Glutaric Acidemia I Proteins 0.000 description 1
- 208000021097 Glutaryl-CoA dehydrogenase deficiency Diseases 0.000 description 1
- 102100029492 Glycogen phosphorylase, muscle form Human genes 0.000 description 1
- 208000032007 Glycogen storage disease due to acid maltase deficiency Diseases 0.000 description 1
- 208000011476 Glycogen storage disease due to glucose-6-phosphatase deficiency type Ib Diseases 0.000 description 1
- 208000032008 Glycogen storage disease due to glycogen debranching enzyme deficiency Diseases 0.000 description 1
- 208000032000 Glycogen storage disease due to muscle glycogen phosphorylase deficiency Diseases 0.000 description 1
- 206010053185 Glycogen storage disease type II Diseases 0.000 description 1
- 206010053250 Glycogen storage disease type III Diseases 0.000 description 1
- 206010018462 Glycogen storage disease type V Diseases 0.000 description 1
- 239000007995 HEPES buffer Substances 0.000 description 1
- 108010054147 Hemoglobins Proteins 0.000 description 1
- 102000001554 Hemoglobins Human genes 0.000 description 1
- 208000002972 Hepatolenticular Degeneration Diseases 0.000 description 1
- 208000032087 Hereditary Leber Optic Atrophy Diseases 0.000 description 1
- 208000028572 Hereditary chronic pancreatitis Diseases 0.000 description 1
- 206010019878 Hereditary fructose intolerance Diseases 0.000 description 1
- 208000033981 Hereditary haemochromatosis Diseases 0.000 description 1
- 206010056976 Hereditary pancreatitis Diseases 0.000 description 1
- 102000016871 Hexosaminidase A Human genes 0.000 description 1
- 108010053317 Hexosaminidase A Proteins 0.000 description 1
- 101000775844 Homo sapiens AMP deaminase 1 Proteins 0.000 description 1
- 101001019502 Homo sapiens Alpha-L-iduronidase Proteins 0.000 description 1
- 101001001786 Homo sapiens Glutamyl-tRNA(Gln) amidotransferase subunit C, mitochondrial Proteins 0.000 description 1
- 101000700475 Homo sapiens Glycogen phosphorylase, muscle form Proteins 0.000 description 1
- 101000840267 Homo sapiens Immunoglobulin lambda-like polypeptide 1 Proteins 0.000 description 1
- 101000653360 Homo sapiens Methylcytosine dioxygenase TET1 Proteins 0.000 description 1
- 101000653374 Homo sapiens Methylcytosine dioxygenase TET2 Proteins 0.000 description 1
- 101000653369 Homo sapiens Methylcytosine dioxygenase TET3 Proteins 0.000 description 1
- 101000587058 Homo sapiens Methylenetetrahydrofolate reductase Proteins 0.000 description 1
- 101000798015 Homo sapiens RAC-beta serine/threonine-protein kinase Proteins 0.000 description 1
- 101000798007 Homo sapiens RAC-gamma serine/threonine-protein kinase Proteins 0.000 description 1
- 101001012157 Homo sapiens Receptor tyrosine-protein kinase erbB-2 Proteins 0.000 description 1
- 101000641122 Homo sapiens Sacsin Proteins 0.000 description 1
- 101000613251 Homo sapiens Tumor susceptibility gene 101 protein Proteins 0.000 description 1
- 208000023105 Huntington disease Diseases 0.000 description 1
- 208000007599 Hyperkalemic periodic paralysis Diseases 0.000 description 1
- 208000000563 Hyperlipoproteinemia Type II Diseases 0.000 description 1
- 208000034600 Hyperornithinemia-hyperammonemia-homocitrullinuria syndrome Diseases 0.000 description 1
- 206010049933 Hypophosphatasia Diseases 0.000 description 1
- 102000038455 IGF Type 1 Receptor Human genes 0.000 description 1
- 108010031794 IGF Type 1 Receptor Proteins 0.000 description 1
- 102000038460 IGF Type 2 Receptor Human genes 0.000 description 1
- 108010031792 IGF Type 2 Receptor Proteins 0.000 description 1
- 101150088952 IGF1 gene Proteins 0.000 description 1
- 101150002416 Igf2 gene Proteins 0.000 description 1
- 102100029616 Immunoglobulin lambda-like polypeptide 1 Human genes 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- 208000000420 Isovaleric acidemia Diseases 0.000 description 1
- 101150068332 KIT gene Proteins 0.000 description 1
- 208000008839 Kidney Neoplasms Diseases 0.000 description 1
- 208000028226 Krabbe disease Diseases 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- 238000007397 LAMP assay Methods 0.000 description 1
- 206010056715 Laurence-Moon-Bardet-Biedl syndrome Diseases 0.000 description 1
- 201000000639 Leber hereditary optic neuropathy Diseases 0.000 description 1
- 102100024640 Low-density lipoprotein receptor Human genes 0.000 description 1
- 206010025323 Lymphomas Diseases 0.000 description 1
- 208000035177 MELAS Diseases 0.000 description 1
- 208000035172 MERRF Diseases 0.000 description 1
- 239000007993 MOPS buffer Substances 0.000 description 1
- 201000001853 McCune-Albright syndrome Diseases 0.000 description 1
- 108700000232 Medium chain acyl CoA dehydrogenase deficiency Proteins 0.000 description 1
- 206010072654 Medium-chain acyl-coenzyme A dehydrogenase deficiency Diseases 0.000 description 1
- 102100030550 Menin Human genes 0.000 description 1
- 201000011442 Metachromatic leukodystrophy Diseases 0.000 description 1
- 102100030819 Methylcytosine dioxygenase TET1 Human genes 0.000 description 1
- 102100030803 Methylcytosine dioxygenase TET2 Human genes 0.000 description 1
- 102100030812 Methylcytosine dioxygenase TET3 Human genes 0.000 description 1
- 102100029684 Methylenetetrahydrofolate reductase Human genes 0.000 description 1
- 208000000570 Methylenetetrahydrofolate reductase deficiency Diseases 0.000 description 1
- 108700019352 Methylenetetrahydrofolate reductase deficiency Proteins 0.000 description 1
- 208000035155 Mitochondrial DNA-associated Leigh syndrome Diseases 0.000 description 1
- 102100027891 Mitochondrial chaperone BCS1 Human genes 0.000 description 1
- 208000003445 Mouth Neoplasms Diseases 0.000 description 1
- 208000008955 Mucolipidoses Diseases 0.000 description 1
- 206010056886 Mucopolysaccharidosis I Diseases 0.000 description 1
- 206010056893 Mucopolysaccharidosis VII Diseases 0.000 description 1
- 208000028781 Mucopolysaccharidosis type 1 Diseases 0.000 description 1
- 208000007326 Muenke Syndrome Diseases 0.000 description 1
- 206010073149 Multiple endocrine neoplasia Type 2 Diseases 0.000 description 1
- 206010073148 Multiple endocrine neoplasia type 2A Diseases 0.000 description 1
- 208000012905 Myotonic disease Diseases 0.000 description 1
- 102100027661 N-sulphoglucosamine sulphohydrolase Human genes 0.000 description 1
- 206010028851 Necrosis Diseases 0.000 description 1
- 208000034965 Nemaline Myopathies Diseases 0.000 description 1
- 206010029164 Nephrotic syndrome Diseases 0.000 description 1
- 208000004485 Nijmegen breakage syndrome Diseases 0.000 description 1
- 208000010505 Nose Neoplasms Diseases 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 102000003832 Nucleotidyltransferases Human genes 0.000 description 1
- 108090000119 Nucleotidyltransferases Proteins 0.000 description 1
- 208000004286 Osteochondrodysplasias Diseases 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- 102000023984 PPAR alpha Human genes 0.000 description 1
- 108010028924 PPAR alpha Proteins 0.000 description 1
- 102000000536 PPAR gamma Human genes 0.000 description 1
- 108010016731 PPAR gamma Proteins 0.000 description 1
- 102000014160 PTEN Phosphohydrolase Human genes 0.000 description 1
- 108010011536 PTEN Phosphohydrolase Proteins 0.000 description 1
- 201000011392 Pallister-Hall syndrome Diseases 0.000 description 1
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 1
- 206010033892 Paraplegia Diseases 0.000 description 1
- 208000004843 Pendred Syndrome Diseases 0.000 description 1
- 208000012202 Pervasive developmental disease Diseases 0.000 description 1
- 201000011252 Phenylketonuria Diseases 0.000 description 1
- 108010010677 Phosphodiesterase I Proteins 0.000 description 1
- 102000004160 Phosphoric Monoester Hydrolases Human genes 0.000 description 1
- 108090000608 Phosphoric Monoester Hydrolases Proteins 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 101150063858 Pik3ca gene Proteins 0.000 description 1
- 108010077971 Plasminogen Inactivators Proteins 0.000 description 1
- 102000010752 Plasminogen Inactivators Human genes 0.000 description 1
- 229920002594 Polyethylene Glycol 8000 Polymers 0.000 description 1
- 108010021757 Polynucleotide 5'-Hydroxyl-Kinase Proteins 0.000 description 1
- 102000008422 Polynucleotide 5'-hydroxyl-kinase Human genes 0.000 description 1
- 108010094028 Prothrombin Proteins 0.000 description 1
- 102100027378 Prothrombin Human genes 0.000 description 1
- 241000205156 Pyrococcus furiosus Species 0.000 description 1
- 102100032315 RAC-beta serine/threonine-protein kinase Human genes 0.000 description 1
- 102100032314 RAC-gamma serine/threonine-protein kinase Human genes 0.000 description 1
- 101710086015 RNA ligase Proteins 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 101100240886 Rattus norvegicus Nptx2 gene Proteins 0.000 description 1
- 102100030086 Receptor tyrosine-protein kinase erbB-2 Human genes 0.000 description 1
- 102100029986 Receptor tyrosine-protein kinase erbB-3 Human genes 0.000 description 1
- 101710100969 Receptor tyrosine-protein kinase erbB-3 Proteins 0.000 description 1
- 102100029981 Receptor tyrosine-protein kinase erbB-4 Human genes 0.000 description 1
- 101710100963 Receptor tyrosine-protein kinase erbB-4 Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 208000015634 Rectal Neoplasms Diseases 0.000 description 1
- 208000007014 Retinitis pigmentosa Diseases 0.000 description 1
- 201000000582 Retinoblastoma Diseases 0.000 description 1
- 208000006289 Rett Syndrome Diseases 0.000 description 1
- 201000008539 Rhizomelic chondrodysplasia punctata type 1 Diseases 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 201000001638 Riley-Day syndrome Diseases 0.000 description 1
- 102100034272 Sacsin Human genes 0.000 description 1
- 208000025816 Sanfilippo syndrome type A Diseases 0.000 description 1
- 108700017825 Short chain Acyl CoA dehydrogenase deficiency Proteins 0.000 description 1
- 108010016797 Sickle Hemoglobin Proteins 0.000 description 1
- 208000018020 Sickle cell-beta-thalassemia disease syndrome Diseases 0.000 description 1
- 206010048676 Sjogren-Larsson Syndrome Diseases 0.000 description 1
- 201000007410 Smith-Lemli-Opitz syndrome Diseases 0.000 description 1
- DWAQJAXMDSEUJJ-UHFFFAOYSA-M Sodium bisulfite Chemical compound [Na+].OS([O-])=O DWAQJAXMDSEUJJ-UHFFFAOYSA-M 0.000 description 1
- 208000032930 Spastic paraplegia Diseases 0.000 description 1
- 208000005718 Stomach Neoplasms Diseases 0.000 description 1
- 241000282887 Suidae Species 0.000 description 1
- 108091084976 TET family Proteins 0.000 description 1
- 102000043123 TET family Human genes 0.000 description 1
- 241001495444 Thermococcus sp. Species 0.000 description 1
- 101000803944 Thermus filiformis DNA ligase Proteins 0.000 description 1
- 241000557726 Thermus oshimai Species 0.000 description 1
- 241001522143 Thermus scotoductus Species 0.000 description 1
- 101000803951 Thermus scotoductus DNA ligase Proteins 0.000 description 1
- 241000589499 Thermus thermophilus Species 0.000 description 1
- 101000803959 Thermus thermophilus (strain ATCC 27634 / DSM 579 / HB8) DNA ligase Proteins 0.000 description 1
- 241000868182 Thermus thermophilus HB8 Species 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical group OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- 206010043515 Throat cancer Diseases 0.000 description 1
- 208000024770 Thyroid neoplasm Diseases 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- 102100040879 Tumor susceptibility gene 101 protein Human genes 0.000 description 1
- 208000035896 Twin-reversed arterial perfusion sequence Diseases 0.000 description 1
- 208000007824 Type A Niemann-Pick Disease Diseases 0.000 description 1
- 206010045261 Type IIa hyperlipidaemia Diseases 0.000 description 1
- 208000032001 Tyrosinemia type 1 Diseases 0.000 description 1
- 208000007097 Urinary Bladder Neoplasms Diseases 0.000 description 1
- 201000006793 Walker-Warburg syndrome Diseases 0.000 description 1
- 208000018839 Wilson disease Diseases 0.000 description 1
- 201000001408 X-linked juvenile retinoschisis 1 Diseases 0.000 description 1
- 208000017441 X-linked retinoschisis Diseases 0.000 description 1
- 201000004525 Zellweger Syndrome Diseases 0.000 description 1
- 208000008919 achondroplasia Diseases 0.000 description 1
- 201000000761 achromatopsia Diseases 0.000 description 1
- 238000001994 activation Methods 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 239000011543 agarose gel Substances 0.000 description 1
- 206010001689 alkaptonuria Diseases 0.000 description 1
- 125000000217 alkyl group Chemical group 0.000 description 1
- 208000006682 alpha 1-Antitrypsin Deficiency Diseases 0.000 description 1
- 201000006288 alpha thalassemia Diseases 0.000 description 1
- 201000008333 alpha-mannosidosis Diseases 0.000 description 1
- 150000001408 amides Chemical class 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 210000004381 amniotic fluid Anatomy 0.000 description 1
- 239000003708 ampul Substances 0.000 description 1
- 108010080146 androgen receptors Proteins 0.000 description 1
- 210000004102 animal cell Anatomy 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 201000003554 argininosuccinic aciduria Diseases 0.000 description 1
- 239000012298 atmosphere Substances 0.000 description 1
- 208000029560 autism spectrum disease Diseases 0.000 description 1
- 201000009561 autosomal recessive limb-girdle muscular dystrophy type 2D Diseases 0.000 description 1
- 201000009553 autosomal recessive limb-girdle muscular dystrophy type 2E Diseases 0.000 description 1
- 201000009510 autosomal recessive limb-girdle muscular dystrophy type 2I Diseases 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 208000005980 beta thalassemia Diseases 0.000 description 1
- SQVRNKJHWKZAKO-UHFFFAOYSA-N beta-N-Acetyl-D-neuraminic acid Natural products CC(=O)NC1C(O)CC(O)(C(O)=O)OC1C(O)C(O)CO SQVRNKJHWKZAKO-UHFFFAOYSA-N 0.000 description 1
- 230000005540 biological transmission Effects 0.000 description 1
- 238000001574 biopsy Methods 0.000 description 1
- 206010071434 biotinidase deficiency Diseases 0.000 description 1
- 239000010839 body fluid Substances 0.000 description 1
- 210000001185 bone marrow Anatomy 0.000 description 1
- 101150048834 braF gene Proteins 0.000 description 1
- 210000004556 brain Anatomy 0.000 description 1
- 244000309466 calf Species 0.000 description 1
- 210000000234 capsid Anatomy 0.000 description 1
- 239000002775 capsule Substances 0.000 description 1
- 201000004010 carnitine palmitoyltransferase I deficiency Diseases 0.000 description 1
- 230000030833 cell death Effects 0.000 description 1
- 108091092259 cell-free RNA Proteins 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 201000000760 cerebral cavernous malformation Diseases 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 208000003571 choroideremia Diseases 0.000 description 1
- 230000001684 chronic effect Effects 0.000 description 1
- 208000029664 classic familial adenomatous polyposis Diseases 0.000 description 1
- 210000001072 colon Anatomy 0.000 description 1
- 201000007254 color blindness Diseases 0.000 description 1
- 208000030483 congenital disorder of glycosylation Ib Diseases 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 238000009223 counseling Methods 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 1
- 238000013480 data collection Methods 0.000 description 1
- 230000009615 deamination Effects 0.000 description 1
- 238000006481 deamination reaction Methods 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 206010012601 diabetes mellitus Diseases 0.000 description 1
- 239000005546 dideoxynucleotide Substances 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 230000001079 digestive effect Effects 0.000 description 1
- 238000007847 digital PCR Methods 0.000 description 1
- 238000006471 dimerization reaction Methods 0.000 description 1
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 1
- 235000011180 diphosphates Nutrition 0.000 description 1
- 230000008034 disappearance Effects 0.000 description 1
- 208000010118 dystonia Diseases 0.000 description 1
- 208000016570 early-onset generalized limb-onset dystonia Diseases 0.000 description 1
- 208000002169 ectodermal dysplasia Diseases 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 230000001973 epigenetic effect Effects 0.000 description 1
- 230000004049 epigenetic modification Effects 0.000 description 1
- 108700021358 erbB-1 Genes Proteins 0.000 description 1
- 210000003238 esophagus Anatomy 0.000 description 1
- 150000002148 esters Chemical class 0.000 description 1
- 230000029142 excretion Effects 0.000 description 1
- 210000003722 extracellular fluid Anatomy 0.000 description 1
- 208000014337 facial nerve disease Diseases 0.000 description 1
- 108010091897 factor V Leiden Proteins 0.000 description 1
- 201000007219 factor XI deficiency Diseases 0.000 description 1
- 201000001386 familial hypercholesterolemia Diseases 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 210000004700 fetal blood Anatomy 0.000 description 1
- 229910052731 fluorine Inorganic materials 0.000 description 1
- 239000011737 fluorine Substances 0.000 description 1
- 238000009472 formulation Methods 0.000 description 1
- 238000007672 fourth generation sequencing Methods 0.000 description 1
- 208000014346 fumarase deficiency Diseases 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 210000000232 gallbladder Anatomy 0.000 description 1
- 230000002496 gastric effect Effects 0.000 description 1
- 201000011243 gastrointestinal stromal tumor Diseases 0.000 description 1
- 238000003205 genotyping method Methods 0.000 description 1
- 230000000762 glandular Effects 0.000 description 1
- 208000008605 glucosephosphate dehydrogenase deficiency Diseases 0.000 description 1
- 201000004502 glycogen storage disease II Diseases 0.000 description 1
- 201000004543 glycogen storage disease III Diseases 0.000 description 1
- 208000005516 glycogen storage disease Ib Diseases 0.000 description 1
- 201000004534 glycogen storage disease V Diseases 0.000 description 1
- 208000011460 glycogen storage disease due to glucose-6-phosphatase deficiency type IA Diseases 0.000 description 1
- 229930182470 glycoside Natural products 0.000 description 1
- 150000002338 glycosides Chemical class 0.000 description 1
- 229940029575 guanosine Drugs 0.000 description 1
- 210000002216 heart Anatomy 0.000 description 1
- 201000005787 hematologic cancer Diseases 0.000 description 1
- 208000024200 hematopoietic and lymphoid system neoplasm Diseases 0.000 description 1
- 201000000391 hemochromatosis type 1 Diseases 0.000 description 1
- 208000002672 hepatitis B Diseases 0.000 description 1
- 208000013144 homocystinuria due to methylene tetrahydrofolate reductase deficiency Diseases 0.000 description 1
- 210000003917 human chromosome Anatomy 0.000 description 1
- 210000004251 human milk Anatomy 0.000 description 1
- 235000020256 human milk Nutrition 0.000 description 1
- 150000002431 hydrogen Chemical class 0.000 description 1
- 125000004435 hydrogen atom Chemical group [H]* 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 230000033444 hydroxylation Effects 0.000 description 1
- 238000005805 hydroxylation reaction Methods 0.000 description 1
- 201000008980 hyperinsulinism Diseases 0.000 description 1
- 201000010072 hypochondroplasia Diseases 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 230000002401 inhibitory effect Effects 0.000 description 1
- 201000002313 intestinal cancer Diseases 0.000 description 1
- 210000000936 intestine Anatomy 0.000 description 1
- 150000002500 ions Chemical class 0.000 description 1
- 210000003734 kidney Anatomy 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 208000006443 lactic acidosis Diseases 0.000 description 1
- 239000007788 liquid Substances 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 208000014018 liver neoplasm Diseases 0.000 description 1
- 208000026695 long chain 3-hydroxyacyl-CoA dehydrogenase deficiency Diseases 0.000 description 1
- 210000004072 lung Anatomy 0.000 description 1
- 208000020816 lung neoplasm Diseases 0.000 description 1
- 230000002934 lysing effect Effects 0.000 description 1
- LBSANEJBGMCTBH-UHFFFAOYSA-N manganate Chemical compound [O-][Mn]([O-])(=O)=O LBSANEJBGMCTBH-UHFFFAOYSA-N 0.000 description 1
- 208000012402 maple syrup urine disease type 1A Diseases 0.000 description 1
- 208000012406 maple syrup urine disease type 1B Diseases 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 208000005548 medium chain acyl-CoA dehydrogenase deficiency Diseases 0.000 description 1
- 208000002839 megalencephalic leukoencephalopathy with subcortical cysts Diseases 0.000 description 1
- 238000002493 microarray Methods 0.000 description 1
- 230000003278 mimic effect Effects 0.000 description 1
- 230000002438 mitochondrial effect Effects 0.000 description 1
- 239000003607 modifier Substances 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 208000005340 mucopolysaccharidosis III Diseases 0.000 description 1
- 208000011045 mucopolysaccharidosis type 3 Diseases 0.000 description 1
- 208000025919 mucopolysaccharidosis type 7 Diseases 0.000 description 1
- 208000012226 mucopolysaccharidosis type IIIA Diseases 0.000 description 1
- 210000003097 mucus Anatomy 0.000 description 1
- 238000002887 multiple sequence alignment Methods 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 208000011042 muscle-eye-brain disease Diseases 0.000 description 1
- 230000017074 necrotic cell death Effects 0.000 description 1
- 208000009928 nephrosis Diseases 0.000 description 1
- 231100001027 nephrosis Toxicity 0.000 description 1
- 230000000926 neurological effect Effects 0.000 description 1
- 201000007657 neuronal ceroid lipofuscinosis 5 Diseases 0.000 description 1
- 230000007827 neuronopathy Effects 0.000 description 1
- 230000007823 neuropathy Effects 0.000 description 1
- 208000002154 non-small cell lung carcinoma Diseases 0.000 description 1
- 239000002853 nucleic acid probe Substances 0.000 description 1
- 238000002515 oligonucleotide synthesis Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 210000000496 pancreas Anatomy 0.000 description 1
- 208000027838 paramyotonia congenita of Von Eulenburg Diseases 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 208000024335 physical disease Diseases 0.000 description 1
- 230000003169 placental effect Effects 0.000 description 1
- 239000013612 plasmid Substances 0.000 description 1
- 239000002797 plasminogen activator inhibitor Substances 0.000 description 1
- 238000005498 polishing Methods 0.000 description 1
- 208000030761 polycystic kidney disease Diseases 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 208000001061 polyostotic fibrous dysplasia Diseases 0.000 description 1
- 208000015768 polyposis Diseases 0.000 description 1
- 238000011176 pooling Methods 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 238000004393 prognosis Methods 0.000 description 1
- 210000002307 prostate Anatomy 0.000 description 1
- 229940039716 prothrombin Drugs 0.000 description 1
- 150000003212 purines Chemical class 0.000 description 1
- 201000010108 pycnodysostosis Diseases 0.000 description 1
- 238000012175 pyrosequencing Methods 0.000 description 1
- 208000022563 qualitative or quantitative defects of alpha-sarcoglycan Diseases 0.000 description 1
- 208000022561 qualitative or quantitative defects of beta-sarcoglycan Diseases 0.000 description 1
- 239000000376 reactant Substances 0.000 description 1
- 230000008707 rearrangement Effects 0.000 description 1
- 102000005962 receptors Human genes 0.000 description 1
- 108020003175 receptors Proteins 0.000 description 1
- 238000009877 rendering Methods 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 201000007714 retinoschisis Diseases 0.000 description 1
- 230000001177 retroviral effect Effects 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 108091092562 ribozyme Proteins 0.000 description 1
- 208000007442 rickets Diseases 0.000 description 1
- 238000005096 rolling process Methods 0.000 description 1
- 102200023384 rs587777213 Human genes 0.000 description 1
- 208000010532 sarcoglycanopathy Diseases 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 230000028327 secretion Effects 0.000 description 1
- 239000004065 semiconductor Substances 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 208000001392 short chain acyl-CoA dehydrogenase deficiency Diseases 0.000 description 1
- SQVRNKJHWKZAKO-OQPLDHBCSA-N sialic acid Chemical compound CC(=O)N[C@@H]1[C@@H](O)C[C@@](O)(C(O)=O)OC1[C@H](O)[C@H](O)CO SQVRNKJHWKZAKO-OQPLDHBCSA-N 0.000 description 1
- 208000007056 sickle cell anemia Diseases 0.000 description 1
- 210000003491 skin Anatomy 0.000 description 1
- 210000002460 smooth muscle Anatomy 0.000 description 1
- 229910000030 sodium bicarbonate Inorganic materials 0.000 description 1
- 235000017557 sodium bicarbonate Nutrition 0.000 description 1
- 229910000029 sodium carbonate Inorganic materials 0.000 description 1
- 229940079827 sodium hydrogen sulfite Drugs 0.000 description 1
- 235000010267 sodium hydrogen sulphite Nutrition 0.000 description 1
- 239000002689 soil Substances 0.000 description 1
- 230000002269 spontaneous effect Effects 0.000 description 1
- 239000007858 starting material Substances 0.000 description 1
- 238000007619 statistical method Methods 0.000 description 1
- 210000004243 sweat Anatomy 0.000 description 1
- 208000024891 symptom Diseases 0.000 description 1
- 201000003896 thanatophoric dysplasia Diseases 0.000 description 1
- 210000001541 thymus gland Anatomy 0.000 description 1
- 210000001685 thyroid gland Anatomy 0.000 description 1
- 238000013518 transcription Methods 0.000 description 1
- 230000035897 transcription Effects 0.000 description 1
- 201000007905 transthyretin amyloidosis Diseases 0.000 description 1
- 238000011269 treatment regimen Methods 0.000 description 1
- 125000002264 triphosphate group Chemical group [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- 208000029729 tumor suppressor gene on chromosome 11 Diseases 0.000 description 1
- 230000007306 turnover Effects 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 201000011296 tyrosinemia Diseases 0.000 description 1
- 201000007972 tyrosinemia type I Diseases 0.000 description 1
- 210000003932 urinary bladder Anatomy 0.000 description 1
- 210000005166 vasculature Anatomy 0.000 description 1
- 239000013598 vector Substances 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 201000007790 vitelliform macular dystrophy Diseases 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1093—General methods of preparing gene libraries, not provided for in other subgroups
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/06—Biochemical methods, e.g. using enzymes or whole viable microorganisms
Definitions
- Identifying and analyzing complex nucleic acid populations is an active field of development with multiple applications. Such analyses have been greatly facilitated by large-scale parallel nucleic acid sequencing (also referred to as “high-throughput sequencing” or “next generation sequencing” (NGS)). Due to challenges such as small sample input and errors at various stages in manipulation, it remains difficult to detect nucleic acid species that are present in relatively low abundance. Such challenges can arise in situations like testing for possible contaminants (e.g., in food or water), detecting the presence of a particular bacterium in a complex population (e.g., in environmental testing), and detecting the presence of nucleic acids associated with disease (e.g., infection or cancer), particularly at early stages.
- compositions and methods disclosed herein address this need, and provide additional advantages as well.
- the present disclosure provides methods for preparing a polynucleotide library.
- the methods comprise (a) in a first tailing reaction, adding a first tail to each of a plurality of target polynucleotides by template-independent polymerization, wherein the first tailing reaction comprises a first adapter comprising an overhang that hybridizes to the first tail; (b) in a first ligation reaction, ligating a strand of the first adapter to the first tail; (c) amplifying target polynucleotides comprising the strand of the first adapter by extending a first primer hybridized to the strand of the first adapter; (d) in a second tailing reaction, adding a second tail to each of a plurality of the amplified target polynucleotides by template-independent polymerization, wherein the second tailing reaction comprises a second adapter comprising an overhang that hybridizes to the second tail; and (e) in a second ligation reaction
- the method comprises one or more of: (a) fragmenting polynucleotides to produce the target polynucleotides; (b) dephosphorylation of one or both ends of the target polynucleotides; and (c) denaturing double-stranded polynucleotides to single-stranded polynucleotides to produce the target polynucleotides.
- the plurality of target polynucleotides comprises single-stranded DNA.
- the target polynucleotides comprise cell-free polynucleotides, or amplification products thereof.
- the target polynucleotides comprise single-stranded cell-free DNA (cfDNA).
- the amount of target polynucleotides in the first tailing reaction is about 0.1-500 ng, 1-100 ng, or 5-50 ng. In some embodiments, the target polynucleotides have an average length of about 50 to 600 nucleotides. In some embodiments, the target polynucleotides are treated prior to the first ligation reaction to differentially modify methylated cytosines or unmethylated cytosines, such as by treating the target polynucleotides with bisulfite. In some embodiments, the template-independent polymerization is catalyzed by a polymerase, such as a terminal deoxynucleotidyl transferase (TdT).
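- As a minimal illustration of differential modification: bisulfite treatment deaminates unmethylated cytosines to uracil (read as T after amplification) while leaving methylated cytosines unchanged. The sketch below models this in silico. The function name `bisulfite_convert` and the convention of passing methylated positions as 0-based indices are assumptions made for illustration and are not part of the disclosure.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence as read after bisulfite treatment and amplification.

    Unmethylated C is converted to U and read as T; methylated C stays C.
    `methylated_positions` holds 0-based indices of methylated cytosines
    (an illustrative convention, not taken from the disclosure).
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")  # unmethylated C -> U, sequenced as T
        else:
            converted.append(base)  # methylated C and all other bases unchanged
    return "".join(converted)


if __name__ == "__main__":
    # The C at index 1 is methylated and therefore protected from conversion.
    print(bisulfite_convert("ACGTCCGA", {1}))
```

- Comparing the converted read with the unconverted reference shows which cytosines were protected, which is how a methylation difference can surface as a sequence difference.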
- the first tail comprises a sequence that is different from the second tail. In some embodiments, the first tail and the second tail comprise the same sequence. In some embodiments, the first tail, the second tail, or both consist of one or two types of nucleotides. In some embodiments, the first tail, the second tail, or both are selected from the group consisting of poly-A, poly-C, and poly-C/T. In some embodiments, at least one of the tails consists of two types of nucleotides polymerized from a pool of the two types of nucleotides, wherein the two types of nucleotides in the pool are present in same or different amounts.
- the two types of nucleotides in the pool are in a ratio of about 9:1, 5:1, 3:1, or 1:1.
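- To illustrate how a two-nucleotide pool ratio could shape tail composition, the sketch below draws each tail position at random with probability proportional to the pool ratio. This is a simplified model: it assumes incorporation probability equals pool fraction, which a real template-independent polymerase need not satisfy. The function name and parameters are hypothetical.

```python
import random


def simulate_tail(length, pool, rng=None):
    """Draw a tail of `length` nucleotides from a weighted nucleotide pool.

    `pool` maps each nucleotide to its relative amount, e.g. {"C": 9, "T": 1}
    for a 9:1 ratio. Assumes incorporation is proportional to pool fraction.
    """
    rng = rng or random.Random()
    bases = list(pool)
    weights = [pool[b] for b in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    tail = simulate_tail(12, {"C": 9, "T": 1}, rng)
    print(tail, tail.count("C") / len(tail))
```

- A pool with a single nucleotide, such as `{"A": 1}`, yields a homopolymer tail (e.g. poly-A) under the same model.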
- the first adapter and the second adapter comprise double-stranded regions that are different in polynucleotide sequence.
- the amplifying comprises linear amplification.
- the overhang of the first and/or second adapter is a 3′-overhang.
- the overhang of the first and/or second adapter is 6 to 12 nucleotides in length.
- (i) the first tailing reaction and the first ligation reaction occur in the same reaction mixture, and/or (ii) the second tailing reaction and the second ligation reaction occur in the same reaction mixture.
- the method further comprises amplifying target polynucleotides comprising the strand of the second adapter by extending a second primer hybridized to the strand of the second adapter.
- the sequence of the first primer that hybridizes with the strand of the first adapter is different from the sequence of the second primer that hybridizes with the second adapter.
- amplification with the primer hybridized to the strand of the second adapter is an exponential amplification.
- the method further comprises an amplification reaction with a third primer and a fourth primer, wherein (i) the third primer hybridizes to a complement of at least a portion of the first primer, and (ii) the fourth primer hybridizes to a complement of at least a portion of the second primer.
- the hybridizable sequence of the third primer is different from the hybridizable sequence of the first primer, and/or the hybridizable sequence of the fourth primer is different from the hybridizable sequence of the second primer.
- the sequences of the third primer and the fourth primer are different.
- the third primer, the fourth primer, or both comprise an index sequence that identifies a sample source of the target polynucleotides.
- the method further comprises sequencing amplification products of the amplification comprising the second primer. In some embodiments, the method further comprises sequencing amplification products of the amplification comprising the third and fourth primer. In some embodiments, the method further comprises grouping sequencing reads according to the index sequence. In some embodiments, sequencing comprises detecting a sequence variant or a difference in nucleotide methylation, relative to a reference sequence.
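- Grouping sequencing reads according to the index sequence can be sketched as follows. The read representation (pairs of index sequence and read sequence), the sample names, and the `"undetermined"` bucket are assumptions made for illustration; the disclosure does not specify a data format.

```python
from collections import defaultdict


def group_reads_by_index(reads, index_to_sample):
    """Group read sequences by the sample that their index sequence identifies.

    `reads` is an iterable of (index_sequence, read_sequence) pairs and
    `index_to_sample` maps known index sequences to sample names. Reads with
    an unrecognized index are collected under the key "undetermined".
    """
    groups = defaultdict(list)
    for index_seq, read_seq in reads:
        sample = index_to_sample.get(index_seq, "undetermined")
        groups[sample].append(read_seq)
    return dict(groups)


if __name__ == "__main__":
    index_to_sample = {"ACGT": "sample_1", "GGCA": "sample_2"}  # hypothetical indexes
    reads = [("ACGT", "TTGACC"), ("GGCA", "CCATGA"), ("ACGT", "GGGTAC")]
    print(group_reads_by_index(reads, index_to_sample))
```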
- compositions for use in one or more methods described herein are provided.
- the present disclosure provides a polynucleotide produced according to any of the methods described herein.
- kits for preparing a polynucleotide library comprise: (a) a template-independent polymerase; (b) a first pool of nucleotides that can be polymerized by the template-independent polymerase; (c) a second pool of nucleotides that can be polymerized by the template-independent polymerase; (d) a first adapter comprising an overhang that is hybridizable to tails formed by polymerizing the first pool of nucleotides; and (e) a second adapter comprising an overhang that is hybridizable to tails formed by polymerizing the second pool of nucleotides, wherein the second adapter comprises a different sequence than the first adapter.
- the template-independent polymerase is a terminal deoxynucleotidyl transferase (TdT).
- at least one of the first pool and the second pool contains at least one type of nucleotide not present in the other pool.
- the first pool and the second pool comprise the same one or more types of nucleotides.
- the first pool, the second pool, or both consist of one or two types of nucleotides.
- the first pool, the second pool, or both are selected from the group consisting of (i) a pool of dATP, (ii) a pool of dCTP, and (iii) a pool of dCTP and dTTP.
- the first pool and the second pool consist of two types of nucleotides that are present in the same or different amounts.
- the two types of nucleotides in the pool are in a ratio of about 9:1, 5:1, 3:1, or 1:1.
- the first adapter and the second adapter comprise double-stranded regions that are different in polynucleotide sequence.
- the overhang of the first and/or second adapter is a 3′-overhang. In some embodiments, the overhang of the first and/or second adapter is 6 to 12 nucleotides in length.
- the kit further comprises a first primer that is hybridizable to a strand of the first adapter under conditions for a primer extension reaction. In some embodiments, the kit further comprises a second primer that is hybridizable to a strand of the second adapter under conditions for a primer extension reaction. In some embodiments, the sequence of the first primer that is hybridizable to the strand of the first adapter is different from the sequence of the second primer that is hybridizable to the second adapter.
- the kit further comprises a third primer and a fourth primer, wherein (i) the third primer is hybridizable to a complement of at least a portion of the first primer under conditions for a primer extension reaction, and (ii) the fourth primer is hybridizable to a complement of at least a portion of the second primer under conditions for a primer extension reaction.
- the hybridizable sequence of the third primer is different from the hybridizable sequence of the first primer, and/or the hybridizable sequence of the fourth primer is different from the hybridizable sequence of the second primer.
- the hybridizable sequence of the third primer hybridizes 5′ with respect to the hybridizable sequence of the first primer, and/or the hybridizable sequence of the fourth primer hybridizes 5′ with respect to the hybridizable sequence of the second primer.
- the sequences of the third primer and fourth primer are different.
- the third primer, the fourth primer, or both comprise an index sequence that identifies a sample source of the target polynucleotides.
- the methods comprise (a) in a first tailing reaction, adding a first tail to each of a plurality of target polynucleotides by template-independent polymerization, wherein the first tailing reaction comprises a first adapter comprising an overhang that hybridizes to the first tail; (b) in a first ligation reaction, ligating a strand of the first adapter to the first tail; (c) amplifying target polynucleotides comprising the strand of the first adapter by extending a first primer hybridized to the strand of the first adapter; and (d) in a second ligation reaction, ligating a strand of a second adapter to the amplified target polynucleotides.
- the second ligation reaction comprises, in a second tailing reaction, adding a second tail to each of a plurality of the amplified target polynucleotides by template-independent polymerization.
- the second tailing reaction comprises a second adapter comprising an overhang that hybridizes to the second tail.
- the second ligation reaction comprises ligating a strand of the second adapter to the second tail.
- the second ligation reaction comprises a second adapter comprising an overhang that hybridizes to the amplified target polynucleotides.
- the method comprises one or more of: (a) fragmenting polynucleotides to produce the target polynucleotides; (b) dephosphorylation of one or both ends of the target polynucleotides; and (c) denaturing double-stranded polynucleotides to single-stranded polynucleotides to produce the target polynucleotides.
- the plurality of target polynucleotides comprises single-stranded DNA.
- the target polynucleotides comprise cell-free polynucleotides, or amplification products thereof.
- the target polynucleotides comprise single-stranded cell-free DNA (cfDNA).
- the amount of target polynucleotides in the first tailing reaction is about 0.1-500 ng, 1-100 ng, or 5-50 ng. In some embodiments, the target polynucleotides have an average length of about 50 to 600 nucleotides. In some embodiments, the target polynucleotides are treated prior to step (b) to differentially modify methylated cytosines or unmethylated cytosines. In some embodiments, the differentially modifying comprises treating the target polynucleotides with bisulfite. In some embodiments, the template-independent polymerization is catalyzed by a polymerase.
- the polymerase is a terminal deoxynucleotidyl transferase (TdT).
- the first tail comprises a sequence that is different from the second tail.
- the first tail and the second tail comprise the same sequence.
- the first tail, the second tail, or both consist of one or two types of nucleotides.
- the first tail, the second tail, or both are selected from the group consisting of poly-A, poly-C, and poly-C/T.
- At least one of the tails consists of two types of nucleotides polymerized from a pool of the two types of nucleotides, wherein the two types of nucleotides in the pool are present in same or different amounts. In some embodiments, the two types of nucleotides in the pool are in a ratio of about 9:1, 7:1, 5:1, 3:1, or 1:1. In some embodiments, the second tailing reaction is omitted. In some embodiments, the first adapter and the second adapter comprise double-stranded regions that are different in polynucleotide sequence. In some embodiments, the amplifying comprises linear amplification. In some embodiments, the overhang of the first and/or second adapter is a 3′-overhang.
- the first and/or second adapter have both a 3′-overhang and a 5′-overhang.
- the 3′-overhang of the first and/or second adapter is 6 to 12 nucleotides in length.
- the 5′-overhang of the first and/or second adapter is 2 to 6 nucleotides in length.
- (i) the first tailing reaction and the first ligation reaction occur in the same reaction mixture, and/or (ii) the second tailing reaction and the second ligation reaction occur in the same reaction mixture.
- the method further comprises amplifying target polynucleotides comprising the strand of the second adapter by extending a second primer hybridized to the strand of the second adapter.
- the sequence of the first primer that hybridizes with the strand of the first adapter is different from the sequence of the second primer that hybridizes with the second adapter.
- amplification with the primer hybridized to the strand of the second adapter is an exponential amplification.
- the method further comprises an amplification reaction with a third primer and a fourth primer, wherein (i) the third primer hybridizes to a complement of at least a portion of the first primer, and (ii) the fourth primer hybridizes to a complement of at least a portion of the second primer.
- the hybridizable sequence of the third primer is different from the hybridizable sequence of the first primer, and/or the hybridizable sequence of the fourth primer is different from the hybridizable sequence of the second primer.
- the sequences of the third primer and the fourth primer are different.
- the third primer, the fourth primer, or both comprise an index sequence that identifies a sample source of the target polynucleotides.
- the method further comprises sequencing amplification products of the amplification comprising the second primer. In some embodiments, the method further comprises sequencing amplification products of the amplification comprising the third and fourth primer. In some embodiments, the method further comprises grouping sequencing reads according to the index sequence.
- compositions for use in one or more methods described herein are provided.
- the present disclosure provides a polynucleotide produced according to any of the methods described herein.
- kits for preparing a polynucleotide library comprise: (a) a template-independent polymerase; (b) a first pool of nucleotides that can be polymerized by the template-independent polymerase; (c) a second pool of nucleotides that can be polymerized by the template-independent polymerase; (d) a first adapter comprising an overhang that is hybridizable to tails formed by polymerizing the first pool of nucleotides; and (e) a second adapter comprising an overhang that is hybridizable to the amplified target polynucleotides.
- the template-independent polymerase is a terminal deoxynucleotidyl transferase (TdT).
- at least one of the first pool and the second pool contains at least one type of nucleotide not present in the other pool.
- the first pool and the second pool comprise the same one or more types of nucleotides.
- the first pool, the second pool, or both consist of one or two types of nucleotides.
- the first pool, the second pool, or both are selected from the group consisting of (i) a pool of dATP, (ii) a pool of dCTP, and (iii) a pool of dCTP and dTTP.
- the first pool and the second pool consist of two types of nucleotides that are present in the same or different amounts.
- the two types of nucleotides in the pool are in a ratio of about 9:1, 7:1, 5:1, 3:1, or 1:1.
- the first adapter and the second adapter comprise double-stranded regions that are different in polynucleotide sequence.
- the overhang of the first and/or second adapter is a 3′-overhang.
- the first and/or second adapter have both a 3′-overhang and a 5′-overhang.
- the 3′-overhang of the first and/or second adapter is 6 to 12 nucleotides in length. In some embodiments, the 5′-overhang of the first and/or second adapter is 2 to 6 nucleotides in length. In some embodiments, the kit further comprises a first primer that is hybridizable to a strand of the first adapter under conditions for a primer extension reaction. In some embodiments, the kit further comprises a second primer that is hybridizable to a strand of the second adapter under conditions for a primer extension reaction. In some embodiments, the sequence of the first primer that is hybridizable to the strand of the first adapter is different from the sequence of the second primer that is hybridizable to the second adapter.
- the kit further comprises a third primer and a fourth primer, wherein (i) the third primer is hybridizable to a complement of at least a portion of the first primer under conditions for a primer extension reaction, and (ii) the fourth primer is hybridizable to a complement of at least a portion of the second primer under conditions for a primer extension reaction.
- the hybridizable sequence of the third primer is different from the hybridizable sequence of the first primer, and/or the hybridizable sequence of the fourth primer is different from the hybridizable sequence of the second primer.
- the hybridizable sequence of the third primer hybridizes 5′ with respect to the hybridizable sequence of the first primer, and/or the hybridizable sequence of the fourth primer hybridizes 5′ with respect to the hybridizable sequence of the second primer.
- the sequences of the third primer and fourth primer are different.
- the third primer, the fourth primer, or both comprise an index sequence that identifies a sample source of the target polynucleotides.
- FIG. 1 illustrates an example library preparation method, in accordance with an embodiment.
- the illustration includes sequences CCCTCCTC (SEQ ID NO: 1), TTTTTTTTTTTT (SEQ ID NO: 2), and AAAAAAAAAAAA (SEQ ID NO: 3).
- FIG. 2 illustrates example adapters, in accordance with an embodiment.
- the illustration includes SEQ ID NOs: 4-7, in order from top to bottom.
- FIG. 3 illustrates a comparison between a polynucleotide prepared in accordance with an embodiment comprising a tailing reaction (bottom), and a polynucleotide prepared instead using “Y” adapters (top).
- the illustration includes SEQ ID NOs: 8-15, in order from left to right then top to bottom.
- FIG. 4 illustrates an example plot of a capillary electrophoretic analysis.
- FIGS. 5A-C illustrate example plots of capillary electrophoretic analyses.
- FIGS. 6A-B illustrate example plots of electrophoretic analyses.
- FIG. 7 illustrates the methylation level of 12,977 targeted CpG sites across different samples.
- FIGS. 8A-B illustrate example plots of capillary electrophoretic analyses.
- FIG. 9 illustrates an example library preparation method, in accordance with an embodiment of the invention.
- the illustration includes sequences TCTCTCTC and NNNNNNN, where N is any base.
- FIG. 10 illustrates example adapters, in accordance with an embodiment of the invention.
- the illustration includes SEQ ID NOs: 4, 22, 6 and 23, in order from top to bottom.
- FIG. 11 illustrates an example plot of a capillary electrophoretic analysis (lines on graph from top to bottom, 10 ng lambda, 5 ng lambda, 2 ng lambda, 1 ng lambda).
- the term “about” or “approximately” means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., the limitations of the measurement system. For example, “about” can mean within one or more than one standard deviation, per the practice in the art. Alternatively, “about” can mean a range of up to 20%, up to 10%, up to 5%, or up to 1% of a given value. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, preferably within 5-fold, and more preferably within 2-fold, of a value. Where particular values are described in the application and claims, unless otherwise stated the term “about” meaning within an acceptable error range for the particular value should be assumed.
- polynucleotide refers to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function, known or unknown.
- Non-limiting examples of polynucleotides include coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, primers, and adapters.
- a polynucleotide may comprise one or more modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component.
- the terms “cell-free,” “circulating,” and “extracellular,” as applied to polynucleotides, refer to polynucleotides present in a sample from a subject or portion thereof that can be isolated or otherwise manipulated without applying a lysis step to the sample as originally collected (e.g., as in extraction from cells or viruses).
- Cell-free polynucleotides are thus unencapsulated or “free” from the cells or viruses from which they originate, even before a sample of the subject is collected.
- Cell-free polynucleotides may be produced as a byproduct of cell death (e.g.
- cell-free polynucleotides may be isolated from a non-cellular fraction of blood (e.g. serum or plasma), from other bodily fluids (e.g. urine), or from non-cellular fractions of other types of samples.
- a “subject” can be a mammal such as a non-primate (e.g., cows, pigs, horses, cats, dogs, rats, etc.) or a primate (e.g., monkey or human).
- the subject is a human.
- the subject is a mammal (e.g., a human) having or potentially having a disease, disorder, or condition, examples of which are described herein.
- the subject is a mammal (e.g., a human) at risk of developing a disease, disorder, or condition, examples of which are described herein.
- the term “amplify” generally refers to any process by which one or more copies are made of a target polynucleotide or a portion thereof.
- a variety of methods of amplifying polynucleotides (e.g., DNA and/or RNA) are available, some examples of which are described herein.
- Amplification may be linear, exponential, or involve both linear and exponential phases in a multi-phase amplification process.
- Amplification methods may involve changes in temperature, such as a heat denaturation step, or may be isothermal processes that do not require heat denaturation.
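- The difference between linear and exponential amplification can be made concrete with idealized arithmetic. The sketch below assumes perfect efficiency in every cycle: in linear amplification only the original templates are copied (one new copy per template per cycle), whereas in exponential amplification every molecule present is copied each cycle. Both function names are hypothetical, and real reactions fall short of these ideals.

```python
def linear_copies(templates, cycles):
    """Molecules after idealized linear amplification.

    Only the original templates are copied, once per cycle, so the total is
    the originals plus `cycles` copies of each.
    """
    return templates * (1 + cycles)


def exponential_copies(templates, cycles):
    """Molecules after idealized exponential amplification.

    Every molecule present is copied in each cycle, doubling the total.
    """
    return templates * 2 ** cycles


if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, linear_copies(100, n), exponential_copies(100, n))
```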
- Hybridization refers to a reaction in which one or more polynucleotides react to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues.
- the hydrogen bonding may occur by Watson-Crick base pairing, Hoogsteen binding, or in any other sequence-specific manner according to base complementarity.
- the complex may comprise two strands forming a duplex structure, three or more strands forming a multi-stranded complex, a single self-hybridizing strand, or any combination of these.
- a hybridization reaction may constitute a step in a more extensive process, such as the initiation of PCR, or the enzymatic cleavage of a polynucleotide by an endonuclease.
- the term “hybridizable” as applied to a polynucleotide refers to the ability of the polynucleotide to form a complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues in a hybridization reaction.
- a hybridizable sequence of nucleotides is at least about 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, or 100% complementary to the sequence to which it hybridizes.
- a hybridizable sequence is one that hybridizes to one or more target sequences as part of, and under the conditions of, a step in a multi-step process (e.g., a ligation reaction, or an amplification reaction).
- “Complementarity” refers to the ability of a nucleic acid to form hydrogen bond(s) with another nucleic acid sequence by either traditional Watson-Crick base pairing or other non-traditional types.
- a percent complementarity indicates the percentage of residues in a first nucleic acid sequence which can form hydrogen bonds (e.g., Watson-Crick base pairing) with a second nucleic acid sequence (e.g., 5, 6, 7, 8, 9, or 10 out of 10 being 50%, 60%, 70%, 80%, 90%, and 100% complementary, respectively).
- Perfectly complementary means that all the contiguous residues of a first nucleic acid sequence will hydrogen bond with the same number of contiguous residues in a second nucleic acid sequence.
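- The percent complementarity arithmetic above (e.g., 8 out of 10 residues being 80% complementary) can be sketched as follows. This is a simplified illustration that counts only canonical DNA A-T and G-C pairs between two equal-length sequences written 5′ to 3′ and paired antiparallel; the function name is hypothetical, and non-traditional pairing is deliberately ignored.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(first, second):
    """Percent of residues in `first` that Watson-Crick pair with `second`.

    Both sequences are written 5'->3' and must be the same length; `second`
    is read in reverse so the strands pair antiparallel. Only canonical
    DNA A-T and G-C pairs are counted (a simplifying assumption).
    """
    if len(first) != len(second):
        raise ValueError("sequences must be the same length")
    if not first:
        raise ValueError("sequences must be non-empty")
    paired = sum(
        1
        for a, b in zip(first.upper(), reversed(second.upper()))
        if PAIRS.get(a) == b
    )
    return 100.0 * paired / len(first)


if __name__ == "__main__":
    # Perfect reverse complement, then the same partner with two mismatches.
    print(percent_complementarity("ACGTACGTAC", "GTACGTACGT"))
    print(percent_complementarity("ACGTACGTAC", "GTACGTACAA"))
```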
- Sequence identity, such as for the purpose of assessing percent complementarity, may be measured by any suitable alignment algorithm, including but not limited to the Needleman-Wunsch algorithm (see e.g. the EMBOSS Needle aligner available at www.ebi.ac.uk/Tools/psa/emboss_needle/nucleotide.html, optionally with default settings), the BLAST algorithm (see e.g. the BLAST alignment tool available at blast.ncbi.nlm.nih.gov/Blast.cgi, optionally with default settings), or the Smith-Waterman algorithm (see e.g.
- Optimal alignment may be assessed using any suitable parameters of a chosen algorithm, including default parameters.
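- As a minimal sketch of one of the named algorithms, the code below computes a Needleman-Wunsch global alignment score by dynamic programming with a linear gap penalty. The scoring values (match +1, mismatch -1, gap -1) are illustrative assumptions and are not the default parameters of EMBOSS Needle or any other tool mentioned here; the function name is hypothetical.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Optimal global alignment score of `a` and `b` (Needleman-Wunsch).

    Uses a linear gap penalty; the default scores are illustrative and are
    not the defaults of any particular alignment tool.
    """
    # prev[j] is the best score for aligning the processed prefix of `a`
    # against b[:j]; the first row aligns an empty prefix against gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned entirely against gaps
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATACA"))
```

- Only two rows of the scoring matrix are kept because the score alone is needed; recovering the alignment itself would require storing the full matrix or traceback pointers.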
- the term “sequence variant” refers to any variation in sequence relative to one or more reference sequences. Typically, the sequence variant occurs with a lower frequency than the reference sequence for a given population of individuals for which the reference sequence is known.
- the reference sequence is a single known reference sequence, such as the genomic sequence of a single individual.
- the reference sequence is a consensus sequence formed by aligning multiple known sequences, such as the genomic sequence of multiple individuals serving as a reference population, or multiple sequencing reads of polynucleotides from the same individual.
- the sequence variant occurs with a low frequency in the population (also referred to as a “rare” sequence variant).
- the sequence variant may occur with a frequency of about or less than about 5%, 4%, 3%, 2%, 1.5%, 1%, 0.75%, 0.5%, 0.25%, 0.1%, 0.075%, 0.05%, 0.04%, 0.03%, 0.02%, 0.01%, 0.005%, 0.001%, or lower. In some cases, the sequence variant occurs with a frequency of about or less than about 0.1%.
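- The frequency thresholds above can be illustrated with a short calculation. The sketch below tallies base calls observed at a single position (for example, one call per individual or per sequencing read) and reports the frequency of each non-reference base. The function names, the input format, and the choice of 0.1% as the default cutoff for `is_rare` are assumptions made for illustration.

```python
from collections import Counter


def variant_frequencies(observed_bases, reference_base):
    """Frequency of each non-reference base among observations at one position.

    `observed_bases` is an iterable of single-base calls (e.g. one per
    individual or per sequencing read covering the position).
    """
    counts = Counter(observed_bases)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        base: n / total
        for base, n in counts.items()
        if base != reference_base
    }


def is_rare(frequency, threshold=0.001):
    """True if the frequency is at or below the threshold (default 0.1%)."""
    return frequency <= threshold


if __name__ == "__main__":
    calls = "A" * 1999 + "G"  # one variant call among 2,000 observations
    freqs = variant_frequencies(calls, "A")
    print(freqs, {base: is_rare(f) for base, f in freqs.items()})
```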
- a sequence variant can be any variation with respect to a reference sequence.
- a sequence variation may consist of a change in, insertion of, or deletion of a single nucleotide, or of a plurality of nucleotides (e.g. 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides).
- where a sequence variant comprises two or more nucleotide differences, the nucleotides that are different may be contiguous with one another, or discontinuous.
- types of sequence variants include single nucleotide polymorphisms (SNP), deletion/insertion polymorphisms (DIP), copy number variants (CNV), short tandem repeats (STR), simple sequence repeats (SSR), variable number of tandem repeats (VNTR), amplified fragment length polymorphisms (AFLP), retrotransposon-based insertion polymorphisms, sequence specific amplified polymorphism, and differences in epigenetic marks that can be detected as sequence variants (e.g. methylation differences).
- a sequence variant can refer to a chromosome rearrangement, including but not limited to a translocation or fusion gene.
- the present disclosure provides methods for preparing a polynucleotide library.
- the methods comprise (a) in a first tailing reaction, adding a first tail to each of a plurality of target polynucleotides by template-independent polymerization, wherein the first tailing reaction comprises a first adapter comprising an overhang that hybridizes to the first tail; (b) in a first ligation reaction, ligating a strand of the first adapter to the first tail; (c) amplifying target polynucleotides comprising the strand of the first adapter by extending a first primer hybridized to the strand of the first adapter; (d) in a second tailing reaction, adding a second tail to each of a plurality of the amplified target polynucleotides by template-independent polymerization, wherein the second tailing reaction comprises a second adapter comprising an overhang that hybridizes to the second tail; and (e) in a second ligation reaction
- the present disclosure provides methods for preparing a polynucleotide library.
- the methods comprise (a) in a first tailing reaction, adding a first tail to each of a plurality of target polynucleotides by template-independent polymerization, wherein the first tailing reaction comprises a first adapter comprising an overhang that hybridizes to the first tail; (b) in a first ligation reaction, ligating a strand of the first adapter to the first tail; (c) amplifying target polynucleotides comprising the strand of the first adapter by extending a first primer hybridized to the strand of the first adapter; and (d) in a second ligation reaction, ligating a strand of a second adapter to the amplified target polynucleotides.
- the second adaptor ligation is used without a tailing reaction.
- the second ligation reaction can comprise, in a second tailing reaction, adding a second tail to each of a plurality of the amplified target polynucleotides by template-independent polymerization.
- the second tailing reaction can comprise a second adapter comprising an overhang that hybridizes to the second tail.
- the second ligation reaction comprises ligating a strand of the second adapter to the second tail.
- the second ligation reaction comprises a second adapter comprising an overhang that hybridizes to the amplified target polynucleotides.
- the second adaptor ligation can utilize a 3′ overhang of random bases in the adaptor to serve as a splinter to facilitate ligation.
- the second adapters can be added to the 3′ ends of the amplified target polynucleotides.
- the 3′ overhang of the adapter serves as a splinter to stabilize the substrate strand and facilitate the ligation between the 3′ end of the substrate strand and the 5′ end of the phosphorylated opposite adapter strand.
- Polynucleotides useful in methods of the present disclosure can be derived from any of a variety of sample sources.
- the sample is an environmental sample, such as a naturally occurring or artificial atmosphere, water sample, soil sample, surface swab, or any other sample of interest.
- polynucleotides are derived from a biological sample, such as a sample of a subject.
- biological samples include tissues (e.g. skin, heart, lung, kidney, bone marrow, breast, pancreas, liver, muscle, smooth muscle, bladder, gall bladder, colon, intestine, brain, prostate, esophagus, thyroid, and tumor), bodily fluids (e.g.
- the sample is blood, a blood fraction, plasma, serum, saliva, sputum, urine, semen, transvaginal fluid, cerebrospinal fluid, or stool.
- the sample is blood, such as whole blood or a blood fraction (e.g. serum or plasma).
- polynucleotides are extracted from a sample, such as when polynucleotides to be analyzed are contained within cells or viral capsids.
- the extraction method selected may depend, in part, on the type of sample to be processed.
- a variety of extraction methods are available.
- nucleic acids can be purified by organic extraction with phenol, phenol/chloroform/isoamyl alcohol, or similar formulations, including TRIzol and TriReagent.
- samples are treated to remove or degrade one or more components, such as protein (e.g., by proteinase K treatment) or RNA (e.g., by RNaseA treatment), and/or to preserve one or more components, such as RNA (e.g., by treatment with RNase inhibitor).
- further steps may be employed to purify one or both separately from the other.
- Sub-fractions of extracted nucleic acids can also be generated, for example, purification by size, sequence, or other physical or chemical characteristic.
- purification of nucleic acids can be performed after subsequent manipulation, such as to remove excess or unwanted reagents, reactants, or products.
- the methods described herein involve manipulation of cell-free polynucleotides obtained from a sample of a subject without cellular extraction (e.g. without a step for lysing cells, viruses, and/or other capsules comprising nucleic acids).
- polynucleotides are manipulated directly in a biological sample as collected.
- cell-free polynucleotides are separated from other components of a sample (e.g. cells and/or proteins) without treatment to release polynucleotides contained in cells that may be present in the sample.
- the sample can be treated to separate cells from the sample.
- a sample is subjected to centrifugation and the supernatant comprising the cell-free polynucleotides is separated for further processing (e.g. isolation of polynucleotides from other components, or other manipulation of the polynucleotides).
- cell-free polynucleotides are purified away from other components of an initial sample (e.g. cells and/or proteins).
- a variety of procedures for isolation of polynucleotides without cellular extraction are available, such as by precipitation or non-specific binding to a substrate followed by washing the substrate to release bound polynucleotides.
- the starting amount of polynucleotides isolated from a sample source can vary, and in some cases may be small.
- the amount of starting polynucleotides is about or less than about 1000 ng, 500 ng, 100 ng, 50 ng, 25 ng, 20 ng, 15 ng, 10 ng, 5 ng, 4 ng, 3 ng, 2 ng, 1 ng, 0.5 ng, 0.1 ng, or less.
- the amount of starting polynucleotides is in the range of about 0.1-500 ng, such as between 1-100 ng or 5-50 ng.
- polynucleotides to be analyzed comprise amplification products of polynucleotides from a sample.
- Amplification products can be specifically amplified (e.g., by using target-specific amplification primers), or non-specifically amplified (e.g., by using a pool of non-specific amplification primers).
- amplification templates comprise DNA and/or RNA.
- polynucleotides to be analyzed comprise RNA that is reverse-transcribed into DNA as part of a reverse transcription (RT) reaction.
- reverse transcription comprises extension of an oligonucleotide primer hybridized to a target RNA by an RNA-dependent DNA polymerase (also referred to as a “reverse transcriptase”), using the target RNA molecule as the template to produce a complementary DNA (cDNA).
- examples of reverse transcriptases include, but are not limited to, retroviral reverse transcriptase (e.g., Moloney Murine Leukemia Virus (M-MLV), Avian Myeloblastosis Virus (AMV) or Rous Sarcoma Virus (RSV) reverse transcriptases), SuperScript I™, SuperScript II™, SuperScript III™, retrotransposon reverse transcriptase, hepatitis B reverse transcriptase, cauliflower mosaic virus reverse transcriptase, bacterial reverse transcriptase, and mutants, variants or derivatives thereof.
- the reverse transcriptase is a hot-start reverse transcriptase enzyme.
- the polynucleotides are polynucleotides that have been subjected to fragmentation.
- the fragments have an average length, median length, or fractional distribution of lengths (e.g., accounting for at least 50%, 60%, 70%, 80%, 90%, or more) that is less than a predefined length or within a predefined range of lengths.
- the predefined length is about or less than about 1500, 1000, 800, 600, 500, 300, 200, 100, or 50 nucleotides in length.
- the predefined range of lengths is a range between 10-1000, 10-800, 10-700, 50-600, 100-600, or 150-400 nucleotides in length.
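The length criteria above (a median length, or a fraction of fragments falling within a predefined range) can be checked on a list of measured fragment lengths. The following is a minimal sketch; the function name and the example lengths are hypothetical and are not taken from the disclosure.

```python
from statistics import median

def fraction_within(lengths, low, high):
    """Fraction of fragments whose length lies in the inclusive range [low, high]."""
    if not lengths:
        return 0.0
    return sum(low <= n <= high for n in lengths) / len(lengths)

# Hypothetical fragment lengths in nucleotides (illustrative values only).
lengths = [120, 160, 180, 200, 350, 420, 900, 150]

frac = fraction_within(lengths, 150, 400)   # share of fragments in the 150-400 nt range
med = median(lengths)                       # median fragment length
meets_50_percent = frac >= 0.5              # "at least 50%" criterion from the text
```

The same helper can be reused for any of the listed ranges (e.g., 10-1000 or 100-600 nucleotides) by changing the bounds.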
- the fragmented polynucleotides have an average size within a pre-defined range (e.g. an average or median length from about 10 to about 1,000 nucleotides in length, such as between 10-800, 10-700, 50-600, 100-600, or 150-400 nucleotides; or an average or median length of less than 1500, 1000, 750, 500, 400, 300, 250, 100, 50, or fewer nucleotides in length).
- fragmenting the polynucleotides comprises mechanical fragmentation, chemical fragmentation, and/or heating. In some embodiments, the fragmentation is accomplished mechanically comprising subjecting sample polynucleotides to acoustic sonication. In some embodiments, the fragmentation comprises treating the sample polynucleotides with one or more enzymes under conditions suitable for the one or more enzymes to generate nucleic acid breaks (e.g., double-stranded breaks). Examples of enzymes useful in the generation of polynucleotide fragments include sequence specific and non-sequence specific nucleases. Non-limiting examples of nucleases include DNase I, Fragmentase, restriction endonucleases, variants thereof, and combinations thereof.
- fragmentation comprises treating the sample polynucleotides with one or more restriction endonucleases. Fragmentation can produce fragments having 5′ overhangs, 3′ overhangs, blunt ends, or a combination thereof. In some embodiments, such as when fragmentation comprises the use of one or more restriction endonucleases, cleavage of sample polynucleotides leaves overhangs having a predictable sequence. Fragmented polynucleotides may be subjected to a step of size selecting the fragments, such as column purification or isolation from an agarose gel.
- polynucleotides are treated to prepare the 5′ ends and/or the 3′ ends for subsequent steps, such as extension or ligation steps. Preparation of polynucleotide ends can be particularly helpful following fragmentation procedures. Preparation of polynucleotide ends is often referred to as end “polishing” or “repair.” In some embodiments, polynucleotide ends are repaired to generate blunt-end or single-stranded fragments with 5′ phosphorylated ends (e.g., using dNTP, T4 DNA polymerase, Klenow large fragment, T4 Polynucleotide Kinase, and ATP).
- end repair comprises adding an adenine to the 3′ ends to generate a 3′-A overhang (e.g., using dATP, Klenow fragment (3′-5′ exo-) or Taq polymerase).
- one or both polynucleotide ends are dephosphorylated, such as by treatment with a phosphatase.
- the methods comprise a first tailing reaction, in which a first tail is added to each of a plurality of target polynucleotides by template-independent polymerization.
- the target polynucleotides are single-stranded.
- the target polynucleotides may be naturally single-stranded, or treated to be single-stranded if not already so.
- target RNA can be reverse-transcribed to form DNA-RNA hybrid molecules, which can then be treated with RNaseH or heat-denatured in the presence of RNase A to degrade the RNA and yield single-stranded cDNA.
- double-stranded DNA can be heat-denatured (e.g., by incubation at about 95° C.), optionally followed by rapid cooling (e.g., incubation on ice).
- the target polynucleotides comprise single-stranded DNA.
- the target polynucleotides comprise single-stranded cfDNA.
- the “tail” produced by template-independent polymerization refers to the newly-synthesized string of nucleotides polymerized to the end of a target polynucleotide subjected to the polymerization reaction.
- the length and nucleotide sequence of the tail will depend, in part, on the type of nucleotides from which the tail is polymerized (e.g., 1, 2, 3, or 4 of A, T, G, and C), the duration of the reaction, the polymerase used, and the presence of other reagents (e.g. an adapter comprising an overhang that hybridizes to the first tail during the polymerization reaction).
- the tail is polymerized only to the 3′ end of one or more target polynucleotides.
- a tail is polymerized from a pool consisting of four types of DNA bases (A, T, G, and C), such that the resulting tail has a chance of comprising any or all four of the bases.
- a tail is polymerized from a pool consisting of any three of the bases A, T, G, and C, such that the resulting tail has a chance of comprising any or all of the three selected bases.
- a tail is polymerized from a pool consisting of any two types of the bases A, T, G, and C, such as C/T or A/G, such that the resulting tail has a chance of comprising either or both of the two selected bases.
- a tail is polymerized from a pool consisting of one type of base selected from A, T, G, and C, such that the resulting tail consists of bases of the selected type.
- the pool consists of thymine bases (yielding a poly-T tail) or cytosine bases (yielding a poly-C tail).
- the bases are in a triphosphate form (e.g. dATP, dTTP, dGTP, and/or dCTP).
- constitution of the tail can be modulated by adjusting the ratio of the types of bases in the pool.
- all types of bases in the pool are present in approximately equal amounts, such that the ratio of any one type to any other type is about 1:1.
- the ratio of one type of base to another in the pool is about or more than about 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 15:1, or higher.
- the ratio of one type of base to another in the pool is about or more than about 3:1, 5:1, or 9:1.
- the ratio is about or more than about 9:1.
- the sequence of the tail can be represented as a degenerate sequence of letters representing the members of the pool.
- RRR refers to a sequence of three purines and represents the sequences AAA, AAG, AGA, GAA, AGG, GAG, GGA, and GGG;
- YYY refers to a sequence of three pyrimidines and represents the sequences TTT, TTC, TCT, CTT, TCC, CCT, CTC, and CCC.
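The degenerate notation above can be expanded mechanically. The sketch below enumerates every concrete sequence represented by a degenerate sequence, using the standard IUPAC meanings of R (A or G) and Y (T or C); the function name is illustrative.

```python
from itertools import product

# IUPAC codes used in the text: R = purine (A/G), Y = pyrimidine (T/C).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "TC"}

def expand_degenerate(seq):
    """Return every concrete sequence represented by a degenerate sequence."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in seq))]

purine_tails = expand_degenerate("RRR")      # 2**3 = 8 sequences of A/G
pyrimidine_tails = expand_degenerate("YYY")  # 2**3 = 8 sequences of T/C
```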
- the tail on one molecule may or may not be the same as another.
- the set of possible sequences and their relative likelihoods within a resulting pool of tailed polynucleotides can be modulated based on the types of nucleotides in the pool and their relative amounts.
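How the pool composition shapes the set of tails can be illustrated with a simple model. The sketch below assumes, purely for illustration, that each position of the tail is filled independently and with probability proportional to the base's share of the pool; real template-independent polymerases can have base preferences, so this is a simplification and not a description of the disclosed method.

```python
from itertools import product

def tail_probabilities(pool_ratio, length):
    """Probability of each possible tail of a given length.

    pool_ratio maps each base to its relative amount in the pool,
    e.g. {"C": 9, "T": 1} for a 9:1 C:T pool.
    Assumption: independent incorporation proportional to pool share.
    """
    total = sum(pool_ratio.values())
    freqs = {base: amount / total for base, amount in pool_ratio.items()}
    probs = {}
    for bases in product(sorted(freqs), repeat=length):
        p = 1.0
        for base in bases:
            p *= freqs[base]
        probs["".join(bases)] = p
    return probs

probs = tail_probabilities({"C": 9, "T": 1}, 3)
most_likely = max(probs, key=probs.get)  # the all-C tail dominates at 9:1
```

Under this model a 9:1 C:T pool yields mostly poly-C tails of length three, with single-T variants making up most of the remainder.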
- the conditions of each reaction can be selected to produce tails that are the same or different, such as in terms of length, types of nucleotides included, and/or relative amounts of nucleotides if more than one is present in the pool.
- the method comprises two tailing reactions and the tails are the same. In some embodiments, the method comprises two tailing reactions and the tails are different.
- one or more steps comprise polynucleotide extension by a polymerase.
- Example polynucleotide extension reactions include reverse transcription, tailing, and amplification. A variety of polymerases are available and can be suitably selected for the appropriate type of polynucleotide extension reaction.
- the polynucleotide extension reaction is a tailing reaction, such as a template-independent tailing reaction.
- the template-independent tailing reaction involves polynucleotide extension by a template-independent polymerase.
- a template-independent polymerase is a polymerase that is capable of catalyzing a polynucleotide extension reaction in the absence of a template complementary to the sequence being polymerized. While template-independent polymerases do not require the presence of a template in order to catalyze the reaction, such that polymerization occurs independently of whether or not a template molecule is present, absence of a template is not necessarily required.
- template-independent polymerases include terminal deoxynucleotidyl transferases (TdT; also known as DNA nucleotidylexotransferase (DNTT) or terminal transferase), poly-A polymerases, RNA-specific nucleotidyl transferases, poly(U) polymerases, and mutated or modified versions thereof.
- the template-independent polymerase is a TdT.
- the template-independent polymerase can be from any suitable source.
- Specific non-limiting examples of template-independent polymerases include recombinantly produced calf thymus TdT and E.
- a tailing reaction comprises an adapter comprising an overhang that hybridizes to the tail.
- the overhang may hybridize to the tail during the polynucleotide extension reaction; however, in a template-independent polymerization reaction initiated by a template-independent polymerase, such hybridization does not negate the status of the reaction as template-independent.
- An adapter with an overhang comprises at least one single-stranded region (the overhang) and at least one double-stranded region (immediately adjacent to the overhang).
- An adapter can comprise an overhang on both ends, and involve the same or different strands.
- a double-stranded region can be formed by hybridizing a short oligonucleotide in the middle of a longer oligonucleotide.
- two oligonucleotides can be hybridized to one another such that an overhang at one end is formed by one of the oligonucleotides, and an overhang at the other end is formed by the other oligonucleotide.
- An adapter can also be formed by hybridizing more than two oligonucleotides, and may comprise internal single-stranded regions between double-stranded regions (e.g., as in two short oligonucleotides hybridized to the same long oligonucleotide at regions that are one or more nucleotides apart along the long oligonucleotide).
- the overhang is a 3′ overhang.
- the adaptor has both a 3′ overhang and a 5′ overhang.
- the 5′ overhang creates a recessed 3′ end that can prevent a leaky tailing reaction on the adapter itself.
- the 5′ overhang creates a recessed 3′ end on the other strand, which prevents a leaky tailing reaction on the adapter due to incomplete 3′ end chemical blocking during oligonucleotide synthesis.
- an overhang that hybridizes to a particular tail comprises a sequence designed to be complementary to the tail to be polymerized.
- the entire length of the overhang is designed to hybridize to the tail.
- the sequence designed to hybridize to the tail need not be perfectly complementary to the tail; rather, the overhang need only be designed to hybridize to the tail under a particular reaction condition, such as during the tailing reaction.
- the overhang is designed to be perfectly complementary. In cases where a tail is polymerized from a pool of a single type of nucleotide (e.g., poly-A), designing a perfectly complementary overhang (or portion thereof) is relatively straightforward (e.g., poly-T in the case of poly-A).
- where a tail is polymerized from a pool of two or more types of nucleotides, individual tail sequences can vary, such that an adapter overhang that is perfectly complementary to one individual tail will not be perfectly complementary to another.
- a single adapter overhang sequence is designed to maximize complementarity with a tail polymerized from two or more nucleotides.
- an overhang for a tail polymerized from C and T with a C:T ratio of 5:1 could be designed to be poly-G.
- in that case, a tail of 10 nucleotides would be expected to have an average of about 2 mismatches (10 × 1/6 ≈ 1.7) along the same length of a poly-G adapter overhang.
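The mismatch estimate for a poly-G overhang paired with a tail polymerized from a 5:1 C:T pool can be checked with a short calculation. The sketch assumes each tail position is drawn independently in proportion to the pool ratio, so every T in the tail is a mismatch against poly-G; the function name is illustrative.

```python
from fractions import Fraction

def expected_mismatches(tail_length, pool_ratio, matched_base):
    """Expected number of tail positions that do not pair with a homopolymer overhang.

    matched_base is the tail base that the overhang complements
    (C for a poly-G overhang). Assumes independent incorporation
    proportional to the pool ratio.
    """
    total = sum(pool_ratio.values())
    mismatch_fraction = Fraction(total - pool_ratio[matched_base], total)
    return tail_length * mismatch_fraction

value = expected_mismatches(10, {"C": 5, "T": 1}, "C")  # 10 * 1/6
approx = float(value)                                    # a little under 2
```

With a 5:1 ratio, one position in six is expected to be T, so a 10-nucleotide tail carries 10/6 expected mismatches, which rounds to 2.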
- an adapter sequence can be expressed as containing one or more (or all) degenerate positions, selected based on degenerate positions of the tail to which it is designed to hybridize. For example, for a tail represented by the sequence “YYY,” an overhang could be designed to have sequence “RRR.” Where an overhang comprises one or more degenerate base positions, “the adapter” represents a pool of adapter oligonucleotides with each of the different nucleotides at each degenerate position represented in the pool.
- the relative representation of a particular nucleotide in the overhang, or the relative amount of one or more sequences in the pool can be modulated (e.g., to correspond to the relative amounts of nucleotides in the pool of nucleotides from which the tail is polymerized).
- an oligonucleotide that forms the strand of the adapter forming the overhang can be polymerized from a pool of nucleotides complementary to the nucleotides of the tail, and in corresponding relative amounts (e.g., 9:1 G:A for a tail polymerized from a 9:1 C:T).
- an adapter designed to hybridize to a poly-C/T tail could be designed to be 10 nucleotides in length and comprising in equal amounts all possible overhangs having a single adenine, and optionally every sequence having two adenines.
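The overhang pool described for a poly-C/T tail can be enumerated directly. The sketch below assumes that the non-adenine positions of each overhang are guanine (the complement of C); that background base is an inference for illustration, and the function name is hypothetical.

```python
from itertools import combinations

def overhangs_with_adenines(length, n_adenines):
    """All overhangs of the given length with exactly n_adenines A's.

    Assumption: every other position is G, to pair with C in the tail.
    """
    seqs = []
    for positions in combinations(range(length), n_adenines):
        bases = ["G"] * length
        for p in positions:
            bases[p] = "A"
        seqs.append("".join(bases))
    return seqs

single_a = overhangs_with_adenines(10, 1)  # one adenine at each of 10 positions
double_a = overhangs_with_adenines(10, 2)  # C(10, 2) placements of two adenines
pool = single_a + double_a                 # optional combined pool
```

Mixing the sequences of `pool` in equal amounts corresponds to the "equal amounts" option in the text; other weightings are possible.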
- Other variations for designing an overhang that hybridizes to a tail polymerized from a given pool of nucleotides are possible.
- the length of the adapter's overhang is selected to control the length of the tail produced by the template-independent polymerase, particularly in cases where the polymerase lacks strand-displacement activity.
- the double-stranded region of the adapter inhibits elongation of the tail when the tail is hybridized to the overhang. Inhibiting tail elongation does not necessarily require that all tails produced in the elongation reaction be the same length as the overhang. Rather, tail elongation is considered to be inhibited by an adapter if the average tail length produced in the template-independent polymerization reaction is shorter than the average tail length produced in the absence of the adapter.
- an adapter overhang is about or less than about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, or more nucleotides in length. In some embodiments, the adapter overhang is between about 3-25, 5-20, or 10-15 nucleotides in length. In some embodiments, the overhang is about 6-12 nucleotides in length.
- the length and/or sequence of the adapters, or any portion thereof, can be the same or different.
- the method comprises two tailing reactions that each comprise an adapter, and the two adapters have overhangs of equal lengths and/or the same sequence.
- the method comprises two tailing reactions that each comprise an adapter, and the two adapters have overhangs of different lengths and/or different sequences.
- the adapter is present in a tailing reaction in a relative molar amount of about or less than about 0.25-fold, 0.5-fold, 0.75-fold, 1-fold, 2-fold, 3-fold, 4-fold, 5-fold, 10-fold, or more with respect to the amount of target polynucleotides in the reaction. In some embodiments, the adapter is present in the tailing reaction at an approximately 1:1 molar ratio with respect to the target polynucleotides.
- an adapter comprises one or more of a variety of sequence elements, in addition to the overhang that hybridizes with the tail.
- additional sequence elements include, but are not limited to, one or more amplification primer annealing sequences or complements thereof, one or more sequencing primer annealing sequences or complements thereof, one or more index sequences (e.g., one or more sequences associated with a particular sample source or reaction that can be used to identify the origin of a target polynucleotide with which the index is associated), one or more common sequences shared among multiple different adapters or subsets of different adapters, one or more restriction enzyme recognition sites, one or more probe binding sites (e.g.
- a sequencing platform such as a flow cell for massive parallel sequencing, such as flow cells as developed by Illumina, Inc.
- one or more random or near-random sequences e.g. one or more nucleotides selected at random from a set of two or more different nucleotides at one or more positions, with each of the different nucleotides selected at one or more positions represented in a pool of adapters comprising the random sequence
- an adapter is used to purify target polynucleotides to which they are attached, for example by using beads (particularly magnetic beads for ease of handling) that are coated with oligonucleotides comprising a complementary sequence to the adapter (or portion thereof) attached to a target polynucleotide.
- Two or more sequence elements can be non-adjacent to one another (e.g. separated by one or more nucleotides), adjacent to one another, partially overlapping, or completely overlapping.
- an amplification primer annealing sequence can also serve as a sequencing primer annealing sequence.
- Sequence elements can be located at or near the 3′ end, at or near the 5′ end, or in the interior of the adapter oligonucleotide.
- a sequence element may be of any suitable length, such as about or less than about 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length.
- Adapter oligonucleotides can have any suitable length, at least sufficient to accommodate the one or more sequence elements of which they are comprised.
- adapters comprise oligonucleotides that are each independently selected to have a length of about or less than about 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, or more nucleotides in length.
- an adapter oligonucleotide is in the range of about 10 to 75 nucleotides in length, such as about 15 to 50 nucleotides in length. In some embodiments, an adapter comprises a double-stranded portion that is about or less than about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 75, or more nucleotides in length.
- an adapter comprises one or more 3′ ends that are not a substrate for polynucleotide extension, such as during a template-independent polymerization reaction.
- the 3′ end is referred to as being “blocked.”
- a 3′ end that is blocked is the 3′ end of the overhang that hybridizes to the tail formed during template-independent polymerization, such that the 3′ end is not extended during the reaction.
- Various methods are available for forming a 3′ end that cannot be extended, including, without limitation, incorporating at the 3′ end a nucleotide that cannot be extended and modifying the 3′ end nucleotide to render it unextendable.
- the 3′ end lacks a 3′ hydroxyl group needed by a polymerase to covalently attach another nucleotide.
- a blocking group is added to the terminal 3′-OH or 2′-OH in the adapter.
- blocking groups include an alkyl group, non-nucleotide linkers, a phosphate group, a phosphorothioate group, alkane-diol moieties, and an amino group.
- the 3′-hydroxyl group is modified by substitution of hydrogen with fluorine or by formation of an ester, amide, sulfate or glycoside.
- the 3′-OH group is replaced with hydrogen (to form a dideoxynucleotide).
- the 3′ end comprises a phosphate group.
- a strand of the adapter is ligated to a tail sequence, such as in a ligation reaction.
- ligation occurs in the same reaction mixture as a tailing reaction.
- reagents for carrying out a ligation reaction are included in a tailing reaction.
- reagents for carrying out a ligation reaction are added to a reaction mixture after tailing is initiated or terminated.
- ligation is effected by a ligase enzyme.
- a variety of ligase enzymes are available, non-limiting examples of which include NAD-dependent ligases including Taq DNA ligase, Thermus filiformis DNA ligase, E.
- thermostable ligase Ampligase thermostable DNA ligase, VanC-type ligase, and 9° N DNA Ligase
- ATP-dependent ligases including T4 RNA ligase, T4 DNA ligase, T3 DNA ligase, T7 DNA ligase, Pfu DNA ligase, DNA ligase 1, DNA ligase III, and DNA ligase IV.
- target polynucleotides are treated to differentially modify methylated cytosines or unmethylated cytosines.
- treatment to distinguish cytosine methylation status is performed prior to an amplification reaction, such as after a first ligation reaction involving the target polynucleotides but before subsequent amplification, during the ligation reaction, or before the ligation reaction (e.g. before tailing target polynucleotides, or as part of sample preparation).
- treatment to distinguish cytosine methylation status is performed on a portion of target polynucleotides from a particular source, and another portion from the same source is untreated (e.g., as in different aliquots from a common solution), such that the treated and untreated samples can be subsequently compared.
- comparison facilitates identifying cytosine methylation status, such as in identifying sequence differences produced as a result of treatment.
- a variety of treatment processes for differentially modifying methylated or unmethylated cytosines are available.
- a reagent that selectively modifies methylated cytosines is the TET family of proteins (e.g., TET1, TET2, TET3, and CXXC4), which convert the cytosine nucleotide 5-methylcytosine into 5-hydroxymethylcytosine by hydroxylation.
- 5-hydroxymethylcytosine can be selectively modified, such as by treatment with metal (VI) oxo complexes (e.g., manganate (Mn(VI)O₄²⁻), ferrate (Fe(VI)O₄²⁻), osmate (Os(VI)O₄²⁻), ruthenate (Ru(VI)O₄²⁻), or molybdate (Mo(VI)O₄²⁻)).
- treatment to differentially modify methylated cytosines or unmethylated cytosines comprises treating the target polynucleotides with sodium hydrogen sulfite (bisulfite), which sulfonates unmethylated cytosine but does not efficiently sulfonate methylated cytosine.
- the sulfonated unmethylated cytosine is prone to spontaneous deamination, which yields sulfonated uracil.
- the sulfonated uracil can then be desulfonated to uracil at high pH.
- the base-pairing properties of the pyrimidines uracil and cytosine are fundamentally different: uracil in DNA is recognized as the equivalent of thymine and therefore is paired with adenine during hybridization or polymerization of DNA, whereas cytosine is paired with guanosine during hybridization or polymerization of DNA. Performance of genomic sequencing or PCR on bisulfite treated DNA can therefore be used to distinguish unmethylated cytosine in the genome, which has been converted to uracil, versus methylated cytosine, which has remained unconverted.
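The read-out logic of bisulfite treatment can be sketched in a few lines. The example models an idealized, complete conversion in which every unmethylated C is read as T after amplification and every methylated C is read as C; the sequence, positions, and function names are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Sequence as read after idealized bisulfite conversion and amplification.

    Unmethylated C -> U, which is read as T; methylated C stays C.
    methylated_positions holds 0-based indices of methylated cytosines.
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)

def infer_methylated(untreated, treated):
    """Positions that are C in both the untreated and the treated reads."""
    return {i for i, (u, t) in enumerate(zip(untreated, treated))
            if u == "C" and t == "C"}

untreated = "ACGTCCGA"                         # hypothetical target sequence
treated = bisulfite_convert(untreated, {1})    # only the C at index 1 is methylated
called = infer_methylated(untreated, treated)  # recovered methylated positions
```

Comparing treated and untreated aliquots of the same source, as described above, corresponds to the `infer_methylated` step.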
- target polynucleotides comprising a first tail ligated to a strand of a first adapter, resulting from being subjected to a first tailing reaction and a first ligation reaction, are amplified.
- amplification comprises extending a first primer hybridized to the strand of the first adapter ligated in an earlier ligation reaction.
- the primer comprises a sequence that is hybridizable to at least a portion of the ligated strand of the adapter.
- the hybridizable sequence is complementary to the sequence to which it hybridizes.
- the primer hybridizes to a common sequence present in all first adapter polynucleotides ligated during the ligation reaction.
- the hybridizable portion of the primer is about or more than about 10, 15, 20, 25, 30, 35, 45, 50, or more nucleotides in length.
- the hybridizable portion of a primer comprises the 3′ end of the primer.
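A primer whose hybridizable portion is complementary to the ligated adapter strand can be derived as a reverse complement. In the sketch below the adapter sequence is an arbitrary, hypothetical placeholder, and the choice to target the 5′-most bases of the ligated strand is an assumption made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def design_primer(ligated_strand, hybridizable_length):
    """Primer complementary to the first hybridizable_length bases of the strand.

    Assumption: the primer site is taken from the 5' end of the ligated
    adapter strand, so the primer's 3' end points toward the tail.
    """
    site = ligated_strand[:hybridizable_length]
    return reverse_complement(site)

ligated_strand = "ACGTTGCAAGCTTGACCGTA"     # hypothetical adapter strand, 5'->3'
primer = design_primer(ligated_strand, 15)  # 15-nt hybridizable portion
```

Additional sequence elements (for example an index) would be placed 5′ of this hybridizable portion so that the 3′ end remains annealed for extension.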
- the first primer comprises one or more additional sequence elements.
- additional sequence elements include, but are not limited to, one or more primer annealing sequences or complements thereof (e.g., a sequencing primer), one or more index sequences (e.g., one or more sequences associated with a particular sample source or reaction that can be used to identify the origin of a target polynucleotide with which the index is associated), one or more restriction enzyme recognition sites, one or more probe binding sites (e.g. for attachment to a sequencing platform, such as a flow cell for massive parallel sequencing, such as flow cells as developed by Illumina, Inc.), one or more random or near-random sequences (e.g.
- a sequence element may be of any suitable length, such as about or less than about 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length.
- a variety of amplification processes are available for amplifying target polynucleotides comprising a first tail ligated to a strand of a first adapter, and include both exponential and non-exponential (e.g., linear) processes.
- a primer extension product is used as the template for producing a further primer extension product that is complementary to the first.
- Linear amplification reactions are typically designed to minimize or eliminate formation of primer extension products templated off of other primer extension products formed during the reaction.
- amplification of target polynucleotides comprising a first tail ligated to a strand of a first adapter is a linear amplification.
- the first step of amplification comprises primer annealing, in which the first primer hybridizes to the strand of the adapter ligated to the tail.
- the primer hybridization site comprises a double-stranded portion of the adapter
- the hybridization site in the template strand will first be exposed. Exposure of the hybridization site can be achieved by denaturing and/or degrading the non-template strand of the adapter. Denaturation can comprise heat denaturation, such as heating to about or more than about 90° C. or 95° C. for a period of time (e.g., about or more than about 1, 2, 3, 4, 5, 10, or more minutes).
- where the non-template strand comprises RNA bases, a ribonuclease (e.g., RNase H or RNase A) can be used to degrade the non-template strand.
- degradation can be effected by addition of Uracil-Specific Excision Reagent (USER) enzyme, which is a mixture of Uracil DNA glycosylase (UDG) and the DNA glycosylase-lyase Endonuclease VIII.
- a variety of processes for linear amplification are available, and examples include isothermal and non-isothermal processes.
- a non-isothermal process includes denaturation and primer extension steps carried out at different temperatures. Denaturation releases a primer extension product formed on a template, freeing the primer hybridization site for hybridization with another copy of the primer. Extension of the further copy of the first primer produces another primer extension product from the same template, and the whole process can be repeated through several “cycles” of denaturation and extension.
- a non-isothermal process is used, and the number of cycles is about or at least about 2, 5, 10, 15, 20, 25, or more.
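The cycle arithmetic of non-isothermal linear amplification can be sketched as follows. This is a minimal illustration under an idealized assumption that every template is extended exactly once per cycle; the function name `linear_copies` is hypothetical and not part of the described methods.

```python
def linear_copies(templates: int, cycles: int) -> int:
    """Idealized yield of cycled single-primer (linear) amplification.

    Assumption: each cycle extends one primer on every original template,
    and the extension products are never used as templates themselves.
    """
    if templates < 0 or cycles < 0:
        raise ValueError("templates and cycles must be non-negative")
    # Each cycle adds one new copy per template, so growth is linear.
    return templates * cycles


if __name__ == "__main__":
    for n in (2, 5, 10, 15, 20, 25):
        print(n, "cycles ->", linear_copies(1, n), "copies per template")
```

Under this assumption, the copy number per template equals the number of cycles; real reactions fall short of this because primer annealing and extension are not fully efficient.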
- An example of an isothermal linear amplification process is single primer isothermal amplification (SPIA).
- SPIA comprises extension of a composite primer having a 3′ DNA portion and a 5′ RNA portion, degradation of the RNA portion by RNase H, annealing of another copy of the composite primer, and extension of the further copy of the composite primer by a polymerase with strand-displacement activity, all of which can take place at the same temperature. Further descriptions of these and other amplification reactions can be found, e.g., in US20170362636 A1, which is hereby incorporated by reference.
- amplification produces a plurality of single-stranded copies complementary to the template target polynucleotides, comprising sequences complementary to the first tail and at least a portion of the ligated strand of the first adapter.
- amplification conditions are selected to produce about or less than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 40, 50, 100, 200, 500, or more copies of a target polynucleotide.
- amplification products of the amplification reaction with the first primer are subjected to a tailing reaction, referred to as the second tailing reaction.
- the second tailing reaction adds a second tail to each of a plurality of the amplified target polynucleotides by template-independent polymerization.
- the length and nucleotide sequence of the tail will depend, in part, on the type of nucleotides from which the tail is polymerized (e.g., 1, 2, 3, or 4 of A, T, G, and C), the duration of the reaction, the polymerase used, and the presence of other reagents (e.g.
- the tail is polymerized only to the 3′ end of one or more amplified target polynucleotides.
- the second tailing reaction is designed to produce a tail having the same or substantially the same sequence as the first tail, or a sequence complementary thereto.
- the first and second tails can both be formed from a pool of only adenine bases, forming poly-A tails.
- the resulting second-tailed polynucleotide would comprise a poly-A tail at one end and a poly-T tail adjacent to at least a portion of the complement of the adapter strand to which the first tail was hybridized.
- the first tail could be a poly-A tail and the second tail could be a poly-T tail.
- the second tailing reaction is performed on amplification products complementary to the tailed target polynucleotide templates, the result in this example would be a polynucleotide having two poly-T stretches, one from the first tail and one from the second.
- the second tailing reaction is designed to produce a tail having a different sequence from the first tail, such as by using one or more nucleotides in the nucleotide pool for the second tailing reaction that were not used in the pool used in the first tailing reaction.
- Various combinations of different first and second tails are possible.
- Non-limiting examples of tail combinations include: (a) one tail consists of one type of nucleotide, and another tail consists of another type of nucleotide; (b) one tail consists of one type of nucleotide, and another tail comprises or consists of two or more types of nucleotides; (c) both tails comprise or consist of two or more types of nucleotides, but each comprises at least one type of nucleotide not contained in the other.
- the first tail, the second tail, or both are selected from the group consisting of poly-A, poly-C, and poly-C/T.
- the second tailing reaction comprises an adapter (referred to as the second adapter) comprising an overhang that hybridizes to the second tail.
- the overhang may hybridize to the tail during the polynucleotide extension reaction; however, in a template-independent polymerization reaction initiated by a template-independent polymerase, such hybridization does not negate the status of the reaction as template-independent.
- the second adapter comprises at least one single-stranded region (the overhang) and at least one double-stranded region (immediately adjacent to the overhang).
- the second adapter can comprise an overhang on both ends, and involve the same or different strands.
- a double-stranded region can be formed by hybridizing a short oligonucleotide in the middle of a longer oligonucleotide.
- two oligonucleotides can be hybridized to one another such that an overhang at one end is formed by one of the oligonucleotides, and an overhang at the other end is formed by the other oligonucleotide.
- An adapter can also be formed by hybridizing more than two oligonucleotides, and may comprise internal single-stranded regions between double-stranded regions (e.g., as in two short oligonucleotides hybridized to the same long oligonucleotide at regions that are one or more nucleotides apart along the long oligonucleotide).
- the overhang is a 3′ overhang.
- the adapter has both a 3′ overhang and a 5′ overhang. If first and second adapters are used, both adapters can have both a 5′ overhang and a 3′ overhang.
- the second adapter is the same as the first adapter. In some embodiments, at least a portion of the second adapter differs from the first adapter. In some embodiments, the first and second adapter comprise one or more portions in common, while differing in other portions.
- the first and second adapter may comprise a common primer binding sequence, designed such that after attachment of the second adapter to the amplified target polynucleotides, further exponential amplification can be achieved with a single primer that hybridizes to that common primer binding sequence or complement thereof.
- both the first and second adapters comprise a primer binding sequence that is designed for exponential amplification by different primers.
- a strand of the second adapter is ligated to the second tail sequence, such as in a ligation reaction (referred to as the second ligation reaction).
- ligation occurs in the same reaction mixture as the second tailing reaction.
- reagents for carrying out the second ligation reaction are included in the second tailing reaction.
- reagents for carrying out the second ligation reaction are added to a reaction mixture after the second tailing is initiated or terminated.
- ligation is effected by a ligase enzyme, examples of which are provided above.
- products of the second ligation reaction are a collection of polynucleotides, each comprising the following elements, from 5′ to 3′: (a) a sequence complementary to at least a portion of the ligated strand of the first adapter, (b) a sequence complementary to the first tail, (c) a sequence complementary to a target polynucleotide, (d) the second tail, and (e) the ligated strand of the second adapter.
- ligation products, as well as amplification products thereof, will be referred to as “dual-adapted” or “double-adapted” target polynucleotides, even though it is understood that element (a) might not comprise the entire ligated adapter strand of the first adapter, element (c) is a complementary copy of a target polynucleotide, and element (e) might not comprise the entire ligated adapter strand (e.g., in the case of an amplification product of the second ligation product).
- the collection may be referred to as a library.
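The 5′ to 3′ element order of a dual-adapted polynucleotide can be traced with a short string model. This is a sketch only: the sequences are made-up placeholders, the function names are hypothetical, and the model assumes the full ligated first-adapter strand is copied during the first-primer extension.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def dual_adapted(target: str, first_tail: str, first_adapter_strand: str,
                 second_tail: str, second_adapter_strand: str) -> str:
    """Model the product of the second ligation, written 5'->3'."""
    # First ligation product: target, then its 3' tail, then the ligated
    # strand of the first adapter.
    first_product = target + first_tail + first_adapter_strand
    # Extension of the first primer copies that strand as its reverse
    # complement: elements (a), (b), and (c).
    copy = reverse_complement(first_product)
    # The second tail (d) is added to the 3' end of the copy, and the
    # second adapter strand (e) is ligated to that tail.
    return copy + second_tail + second_adapter_strand


if __name__ == "__main__":
    # Poly-A first tail and poly-T second tail: the product carries two
    # poly-T stretches, one from each tailing reaction.
    print(dual_adapted("ACGGT", "AAAA", "CCAG", "TTTT", "GCAT"))
```

With a poly-A first tail, the complement of the first tail appears as poly-T in the copy, so adding a poly-T second tail yields two poly-T stretches flanking the complement of the target.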
- the double-adapted target polynucleotides are amplified in an amplification reaction.
- the amplification comprises extending a second primer hybridized to the ligated strand of the second adapter.
- the second primer comprises a sequence that is hybridizable to at least a portion of the ligated strand of the second adapter.
- the hybridizable sequence is complementary to the sequence to which it hybridizes.
- the primer hybridizes to a common sequence present in all second adapter polynucleotides ligated during the second ligation reaction.
- the hybridizable portion of the primer is about or more than about 10, 15, 20, 25, 30, 35, 45, 50, or more nucleotides in length.
- the hybridizable portion of a primer comprises the 3′ end of the primer.
- the second primer comprises one or more additional sequence elements.
- additional sequence elements include, but are not limited to, one or more primer annealing sequences or complements thereof (e.g., a sequencing primer), one or more index sequences (e.g., one or more sequences associated with a particular sample source or reaction that can be used to identify the origin of a target polynucleotide with which the index is associated), one or more restriction enzyme recognition sites, one or more probe binding sites (e.g.
- a sequence element may be of any suitable length, such as about or less than about 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length.
- Amplification with the second primer can be exponential or non-exponential (e.g., linear). Amplification can be isothermal or non-isothermal. In some embodiments, products of the second ligation reaction are substantially linear, and amplification consists of rendering the ligation products double-stranded by extension of the second primer.
- the second primer is the same as the first primer, or comprises the same hybridizable sequence as the first primer. In some embodiments, the second primer differs from the first primer, such as with regard to the hybridizable sequence. In some embodiments, the amplification reaction comprises the second primer and a reverse primer that differs from the second primer.
- the reverse primer is the first primer (described above with regard to amplifying products of the first ligation). In some embodiments, the reverse primer hybridizes to a sequence that is downstream with respect to where the first primer hybridizes (also referred to as “nested”), and may optionally include one or more additional sequence elements (e.g., any one or more primer sequence element described above). In some embodiments, the reverse primer comprises all or a portion of the hybridizable sequence of the first primer, and one or more sequence elements that differ from the first primer (e.g., any one or more primer sequence element described above).
- the first step of amplification comprises primer annealing, in which the second primer hybridizes to the strand of the second adapter ligated to the second tail.
- the hybridization site in the template strand will first be exposed. Exposure of the hybridization site can be achieved by denaturing and/or degrading the non-template strand of the adapter, example processes for which are described above. Non-limiting examples of linear amplification processes are described above. Non-limiting examples of exponential amplification processes are described above, and in more detail below.
- double-adapted target polynucleotides are amplified in an amplification reaction with a third primer and a fourth primer, wherein (i) the third primer hybridizes to a complement of at least a portion of the first primer, and (ii) the fourth primer hybridizes to a complement of at least a portion of the second primer.
- this amplification step replaces the step of amplification with the second primer, in which case the third and fourth primers are analogous to the second primer and reverse primer described above.
- amplification with the third and fourth primers is in addition to the amplification with the second primer (which may or may not have included amplification with the reverse primer).
- the hybridizable sequence of the third primer is different from the hybridizable sequence of the first primer, and/or the hybridizable sequence of the fourth primer is different from the hybridizable sequence of the second primer.
- the third primer is nested with regard to the first primer and/or the fourth primer is nested with regard to the second primer.
- the hybridizable portion of the third and/or fourth primer is independently selected from a length of about or more than about 10, 15, 20, 25, 30, 35, 45, 50, or more nucleotides.
- the hybridizing portion of a primer comprises the 3′ end of the primer.
- the third and/or fourth primer comprises one or more additional sequence elements (e.g., any one or more primer sequence element described above).
- a sequence element may be of any suitable length, such as about or less than about 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length.
- the third primer and fourth primer are different, such as with regard to one or more of total length, sequence, sequence of the hybridizable sequence, presence of one or more sequence elements, length of one or more sequence elements, and sequence of one or more sequence elements.
- the third primer, the fourth primer, or both comprise an index sequence (also referred to as a barcode, or simply “index”).
- index refers to a known nucleic acid sequence that allows some feature of a polynucleotide with which the index is associated to be identified.
- the feature of the polynucleotide to be identified is the source (e.g. sample, sample fraction, or reaction) from which the polynucleotide is derived.
- indexes are about or at least about 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or more nucleotides in length. In some embodiments, indexes are shorter than 10, 9, 8, 7, 6, 5, or 4 nucleotides in length.
- indexes associated with some polynucleotides are of different lengths than indexes associated with other polynucleotides.
- indexes are of sufficient length and comprise sequences that are sufficiently different to allow the identification of sources based on indexes with which they are associated, particularly from among different indexes associated with polynucleotides from different sources in a mixture.
- an index, and the source with which it is associated, can be identified accurately after the mutation, insertion, or deletion of one or more nucleotides in the index sequence, such as the mutation, insertion, or deletion of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides.
- each index in a plurality of indexes differs from every other index in the plurality at three or more nucleotide positions, such as at least 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotide positions.
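One way to check that a set of indexes meets a minimum pairwise difference, and to recover an index that has acquired a substitution, is sketched below. It assumes equal-length indexes and counts substitutions only (Hamming distance); tolerating insertions or deletions would need an edit-distance measure instead. The example indexes and function names are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes: list[str]) -> int:
    """Smallest Hamming distance between any two indexes in the set."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


def assign_index(observed: str, indexes: list[str], max_mismatches: int = 1):
    """Return the single known index within max_mismatches, else None."""
    hits = [i for i in indexes if hamming(observed, i) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    known = ["ACGTAC", "ACGATG", "TGCTAC"]   # placeholder indexes
    print(min_pairwise_distance(known))       # every pair differs at >= 3 positions
    print(assign_index("ACGTAA", known))      # one substitution from ACGTAC
```

When every pair of indexes differs at three or more positions, a read-out carrying a single substitution is still closer to its true index than to any other, which is why it can be assigned unambiguously.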
- a plurality of indexes may be represented in a pool of polynucleotides from different sources, each source comprising polynucleotides comprising one or more indexes that differ from the indexes contained in the polynucleotides derived from the other sources in the pool. It is emphasized here that indexes need only be unique within a given experiment. Thus, the same index may be used to tag a different sample being processed in a different experiment.
- a user may use the same index to tag a subset of different samples within the same experiment. For example, all samples derived from individuals having a specific phenotype may be tagged with the same index, e.g., all samples derived from control (or wild-type) subjects can be tagged with a first index while subjects having a disease condition can be tagged with a second index (different than the first index). As another example, it may be desirable to tag different samples derived from the same source with different indexes (e.g., samples derived over time, derived from different sites within a tissue, or different aliquots of the same sample subjected to different treatments (e.g., with or without bisulfite treatment)).
- a method comprises identifying the sample from which a target polynucleotide is derived based on an index sequence to which the target polynucleotide (or complement or derivative thereof) is joined. Examples of indexes and their use in identifying sample sources can be found in US20140121116, US20150087535, and US20120071331, which are hereby incorporated by reference.
- the method comprises an exponential amplification step.
- Exponential amplification includes, for example, reactions comprising a forward and reverse primer, such that the primer extension products of the forward primer serve as templates for primer extension of the reverse primer, and vice versa.
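Because each primer's extension products serve as templates for the other primer, the template pool can at most double per cycle. The sketch below models this with a per-cycle efficiency parameter; the function name and the efficiency model are illustrative assumptions, not part of the described methods.

```python
def exponential_copies(templates: float, cycles: int,
                       efficiency: float = 1.0) -> float:
    """Idealized yield of exponential amplification.

    Assumption: in each cycle a fraction `efficiency` (0..1) of all
    molecules present is copied, so the pool grows by (1 + efficiency).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        print(n, "cycles ->", exponential_copies(1, n), "molecules per template")
```

At full efficiency the pool doubles every cycle, which is why a few tens of cycles are enough to produce very large copy numbers from a small input.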
- Amplification may be isothermal or non-isothermal.
- methods for amplification of target polynucleotides are available, and include without limitation, methods based on polymerase chain reaction (PCR).
- Conditions favorable to the amplification of target sequences by PCR can be optimized at a variety of steps in the process, and depend on characteristics of elements in the reaction, such as target type, target concentration, sequence length to be amplified, sequence of the target and/or one or more primers, primer length, primer concentration, polymerase used, reaction volume, ratio of one or more elements to one or more other elements, and others, some or all of which can be suitably altered.
- PCR involves the steps of denaturation of the target to be amplified (if double stranded), hybridization of one or more primers to the target, and extension of the primers by a DNA polymerase, with the steps repeated (or “cycled”) in order to amplify the target sequence.
- Steps in this process can be optimized for various outcomes, such as to enhance yield, decrease the formation of spurious products, and/or increase or decrease specificity of primer annealing.
- Methods of optimization include adjustments to the type or amount of elements in the amplification reaction and/or to the conditions of a given step in the process, such as temperature at a particular step, duration of a particular step, and/or number of cycles.
- an amplification reaction comprises at least 5, 10, 15, 20, 25, 30, 35, 50, or more cycles.
- an amplification reaction comprises no more than 5, 10, 15, 20, 25, 35, 50, or more cycles. Cycles can contain any number of steps, such as 1, 2, 3, 4, 5, or more steps.
- Steps can comprise any temperature or gradient of temperatures, suitable for achieving the purpose of the given step, including but not limited to, 3′ end extension, primer annealing, primer extension, and strand denaturation. Steps can be of any duration, including but not limited to about or less than about 1, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 120, 180, 240, 300, 360, 420, 480, 540, 600, or more seconds, including indefinitely until manually interrupted.
- amplification is performed before or after pooling of target polynucleotides (e.g., double-adapter target polynucleotides) from independent samples or aliquots.
- Non-limiting examples of PCR amplification techniques include quantitative PCR (qPCR or real-time PCR), digital PCR, and target-specific PCR.
- Non-limiting examples of polymerase enzymes for use in PCR include thermostable DNA polymerases, such as Thermus thermophilus HB8 polymerase; Thermus oshimai polymerase; Thermus scotoductus polymerase; Thermus thermophilus polymerase; Thermus aquaticus polymerase (e.g., AmpliTaq® FS or Taq (G46D; F667Y)); Pyrococcus furiosus polymerase; Thermococcus sp. (strain 9° N-7) polymerase; Tsp polymerase; Phusion High-Fidelity DNA Polymerase (ThermoFisher); and mutants, variants, or derivatives thereof.
- polymerase enzymes useful for some PCR reactions include, but are not limited to, DNA polymerase I, mutant DNA polymerase I, Klenow fragment, Klenow fragment (3′ to 5′ exonuclease minus), T4 DNA polymerase, mutant T4 DNA polymerase, T7 DNA polymerase, mutant T7 DNA polymerase, phi29 DNA polymerase, and mutant phi29 DNA polymerase.
- a hot start polymerase is used.
- a hot start polymerase is a modified form of a DNA Polymerase that requires thermal activation. Typically, the hot start enzyme is provided in an inactive state. Upon thermal activation the modification or modifier is released, generating active enzyme.
- hot start polymerases are available from various commercial sources, such as Applied Biosystems; Bio-Rad; ThermoFisher; New England Biolabs; Promega; QIAGEN; Roche Applied Science; Sigma-Aldrich; and the like.
- primer extension and amplification reactions comprise isothermal reactions.
- isothermal amplification technologies are ligase chain reaction (LCR) (see e.g., U.S. Pat. Nos. 5,494,810 and 5,830,711); transcription mediated amplification (TMA) (see e.g., U.S. Pat. Nos. 5,399,491, 5,888,779, 5,705,365, 5,710,029); nucleic acid sequence-based amplification (NASBA) (see e.g., U.S. Pat. No.
- signal mediated amplification of RNA technology (SMART)
- strand displacement amplification (SDA)
- thermophilic SDA (see e.g., U.S. Pat. No. 5,648,211)
- rolling circle amplification (see e.g., U.S. Pat. No. 5,854,033)
- loop-mediated isothermal amplification of DNA (LAMP)
- helicase-dependent amplification (HDA)
- circular helicase-dependent amplification (cHDA)
- methods comprise sequencing double-adapted polynucleotides.
- the methods comprise sequencing products of the amplification with the second primer.
- the methods comprise sequencing products of amplification with the third and fourth primer.
- a variety of sequencing methodologies are available, particularly high-throughput sequencing methodologies. Examples include, without limitation, sequencing systems manufactured by Illumina (sequencing systems such as HiSeq® and MiSeq®), Life Technologies (Ion Torrent®, SOLiD®, etc.), Roche's 454 Life Sciences systems, Pacific Biosciences systems, nanopore sequencing platforms by Oxford Nanopore Technologies, etc.
- sequencing comprises producing reads of about or more than about 50, 75, 100, 125, 150, 175, 200, 250, 300, or more nucleotides in length.
- sequencing comprises a sequencing by synthesis process, where individual nucleotides are identified iteratively, as they are added to the growing primer extension product. Pyrosequencing is an example of a sequencing by synthesis process that identifies the incorporation of a nucleotide by assaying the resulting synthesis mixture for the presence of by-products of the sequencing reaction, namely pyrophosphate, an example description of which can be found in U.S. Pat. No. 6,210,891.
- the primer/template/polymerase complex is immobilized upon a substrate and the complex is contacted with labeled nucleotides. Further non-limiting examples of sequencing technologies are described in US20160304954, U.S. Pat. Nos. 7,033,764, 7,416,844, and WO2016077602.
- sequencing reactions of various types may comprise a variety of sample processing units.
- Sample processing units may include but are not limited to multiple lanes, multiple channels, multiple wells, and other means of processing multiple sample sets substantially simultaneously. Additionally, the sample processing unit may include multiple sample chambers to facilitate processing of multiple runs simultaneously.
- simultaneous sequencing reactions are performed using multiplex sequencing.
- polynucleotides are sequenced to produce about or more than about 5000, 10000, 50000, 100000, 1000000, 5000000, 10000000, or more sequencing reads in parallel, such as in a single reaction or reaction vessel. Subsequent data analysis can be performed on all or part of the sequencing reactions. Where polynucleotides are associated with an index sequence, data analysis can comprise grouping sequences based on index sequence for analysis together, and/or comparison to sequences associated with one or more different indexes.
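Grouping reads by index for joint analysis can be sketched as a simple lookup. The layout assumed here (the index occupying the first bases of each read) is purely illustrative, since some platforms report the index as a separate read; the function name, the `"undetermined"` label, and the sample names are likewise hypothetical.

```python
from collections import defaultdict


def group_reads_by_index(reads, index_to_source, index_length):
    """Group reads by the source associated with their leading index.

    Assumption: the first `index_length` bases of each read are the index
    and the remainder is the insert. Reads whose index is not in
    `index_to_source` are collected under "undetermined".
    """
    groups = defaultdict(list)
    for read in reads:
        index, insert = read[:index_length], read[index_length:]
        source = index_to_source.get(index, "undetermined")
        groups[source].append(insert)
    return dict(groups)


if __name__ == "__main__":
    reads = ["ACGTTTGA", "TGCAGGCC", "ACGTCCCC", "NNNNAAAA"]
    sources = {"ACGT": "sample_1", "TGCA": "sample_2"}
    for source, inserts in group_reads_by_index(reads, sources, 4).items():
        print(source, inserts)
```

This version matches indexes exactly; a mismatch-tolerant lookup could be substituted where index sets are designed with sufficient pairwise differences.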
- sequence analysis comprises comparison of one or more reads to a reference sequence (e.g., a control sequence, sequencing data for a reference population, sequencing data for a different tissue of the same subject, sequencing data for the same subject at another time point, or a reference genome), such as by performing an alignment.
- an alignment is sometimes called a pairwise alignment.
- Multiple sequence alignment generally refers to the alignment of two or more sequences, including, for example, by a series of pairwise alignments.
- scoring an alignment involves setting values for the probabilities of substitutions and indels. When individual bases are aligned, a match or mismatch contributes to the alignment score by a substitution probability. An indel deducts from an alignment score by a gap penalty. Gap penalties and substitution probabilities can be based on empirical knowledge or a priori assumptions about how sequences mutate. Their values affect the resulting alignment.
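How substitution values and gap penalties combine into an alignment score can be illustrated with a compact Smith-Waterman local alignment scorer. This is a sketch: it uses fixed integer scores (match, mismatch, linear gap penalty) chosen arbitrarily in place of probability-derived values, and it returns only the best score, not the alignment itself.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)   # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Aligning a[i-1] with b[j-1]: match or mismatch contribution.
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A gap in either sequence deducts the gap penalty.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Local alignment: scores never drop below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))
    print(smith_waterman_score("ACGT", "ACT"))
```

Changing the match, mismatch, or gap values changes which alignment scores highest, which is the sense in which these values affect the resulting alignment.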
- Examples of algorithms for performing alignments include, without limitation, the Smith-Waterman (SW) algorithm, the Needleman-Wunsch (NW) algorithm, algorithms based on the Burrows-Wheeler Transform (BWT), and hash function aligners such as Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, Calif.), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
- One exemplary alignment program which implements a BWT approach, is Burrows-Wheeler Aligner (BWA) available from the SourceForge web site maintained by Geeknet (Fairfax, Va.).
- An alignment program that implements a version of the Smith-Waterman algorithm is MUMmer, available from the SourceForge web site maintained by Geeknet (Fairfax, Va.).
- Other non-limiting examples of alignment programs include: BLAT from Kent Informatics (Santa Cruz, Calif.); SOAP2, from Beijing Genomics Institute (Beijing, China) or BGI Americas Corporation (Cambridge, Mass.); Bowtie; Efficient Large-Scale Alignment of Nucleotide Databases (ELAND) or the ELANDv2 component of the Consensus Assessment of Sequence and Variation (CASAVA) software (Illumina, San Diego, Calif.); RTG Investigator from Real Time Genomics, Inc.
- amplification products are sequenced to detect a sequence variant, e.g., insertions, deletions, substitutions, duplications, translocations, and/or rare somatic mutations, with respect to a reference sequence or in a background of no mutations.
- the sequence variant is correlated with a disease or trait.
- the sequence variant is not correlated with a disease or trait.
- sequence variants for which there is statistical, biological, and/or functional evidence of association with a disease or trait are referred to as “causal genetic variants.”
- a single causal genetic variant can be associated with more than one disease or trait.
- a causal genetic variant is associated with a Mendelian trait, a non-Mendelian trait, or both.
- Causal genetic variants can manifest as variations in a polynucleotide, such as 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 50, or more sequence differences (such as between a polynucleotide comprising the causal genetic variant and a polynucleotide lacking the causal genetic variant at the same relative genomic position).
- Non-limiting examples of types of causal genetic variants include single nucleotide polymorphisms (SNP), deletion/insertion polymorphisms (DIP), copy number variants (CNV), short tandem repeats (STR), restriction fragment length polymorphisms (RFLP), simple sequence repeats (SSR), variable number of tandem repeats (VNTR), randomly amplified polymorphic DNA (RAPD), amplified fragment length polymorphisms (AFLP), inter-retrotransposon amplified polymorphisms (IRAP), long and short interspersed elements (LINE/SINE), long tandem repeats (LTR), mobile elements, retrotransposon microsatellite amplified polymorphisms, retrotransposon-based insertion polymorphisms, sequence specific amplified polymorphisms, and heritable epigenetic modifications (for example, DNA methylation).
- a causal genetic variant can comprise a set of closely related genetic variants. Some causal genetic variants may exert influence as sequence variations in RNA. At this level, some causal genetic variants are also indicated by the presence or absence of a species of RNA. Some causal genetic variants result in sequence variations in protein. A number of causal genetic variants have been reported. An example of a causal genetic variant that is a SNP is the HbS variant of hemoglobin that causes sickle cell anemia. An example of a causal genetic variant that is a DIP is the delta-F508 mutation of the CFTR gene which causes cystic fibrosis. An example of a causal genetic variant that is a CNV is trisomy 21, which causes Down's syndrome. An example of a causal genetic variant that is an STR is the tandem repeat that causes Huntington's disease. Additional non-limiting examples of causal genetic variants are described in US2014121116.
- diseases and gene targets with which a causal genetic variant may be associated include, but are not limited to, 21-Hydroxylase Deficiency, ABCC8-Related Hyperinsulinism, ARSACS, Achondroplasia, Achromatopsia, Adenosine Monophosphate Deaminase 1, Agenesis of Corpus Callosum with Neuronopathy, Alkaptonuria, Alpha-1-Antitrypsin Deficiency, Alpha-Mannosidosis, Alpha-Sarcoglycanopathy, Alpha-Thalassemia, Alzheimers, Angiotensin II Receptor, Type I, Apolipoprotein E Genotyping, Argininosuccinicaciduria, Aspartylglycosaminuria, Ataxia with Vitamin E Deficiency, Ataxia-Telangiectasia, Autoimmune Polyendocrinopathy Syndrome Type 1, BRCA1 Hereditary Breast/Ovarian Cancer, BRCA2 Hereditary Breast/Ovarian Cancer, one or more other types of cancer
- sequence variants associated with cancers include, but are not limited to, sequence variants in the PIK3CA gene (found in, e.g., colorectal cancers; most commonly located within two “hotspot” areas within exon 9 (the helical domain) and exon 20 (the kinase domain); position 3140 may be specifically targeted); sequence variants in the BRAF gene (found in, e.g., malignant melanomas, including melanomas derived from skin without chronic sun-induced damage, especially missense mutation resulting in V600E); sequence variants in the EGFR gene (found in, e.g., Non-Small Cell Lung Cancer, particularly within EGFR exons 18-21, and including exon 19 deletions and exon 21 L858R point mutations); sequence variants in the KIT gene (found in, e.g., Gastrointestinal Stromal Tumor (GIST), especially in juxtamembrane domain (exon 11), extracellular dimerization motif (exon
- sequence variants in one or more genes associated with cancer are identified.
- genes associated with cancer include PTEN; ATM; ATR; EGFR; ERBB2; ERBB3; ERBB4; Notch1; Notch2; Notch3; Notch4; AKT; AKT2; AKT3; HIF; HIF1a; HIF3a; Met; HRG; Bcl2; PPAR alpha; PPAR gamma; WT1 (Wilms Tumor); FGF Receptor Family members (5 members: 1, 2, 3, 4, 5); CDKN2a; APC; RB (retinoblastoma); MEN1; VHL; BRCA1; BRCA2; AR (Androgen Receptor); TSG101; IGF; IGF Receptor; Igf1 (4 variants); Igf2 (3 variants); Igf 1 Receptor; Igf 2 Receptor; Bax; Bcl2; caspases family (9 members: 1, 2, 3, 4,
- methods of the invention have a high sensitivity for detecting nucleic acid species that are present in relatively low abundance.
- the low abundance species is a contaminant (e.g., in food or water), a particular bacterium in a complex population (e.g., in environmental testing), or a nucleic acid associated with disease (e.g., infection, or a causal genetic variant).
- the methods detect nucleic acid species (e.g., a mutant form of a reference polynucleotide) present at about or less than about 1 in 1000, 1 in 5000, 1 in 10000, 1 in 20000, or lower.
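- To give a sense of scale for the frequencies listed above, the following minimal sketch estimates how many independent reads would need to cover a position for at least one variant-bearing read to be observed with a chosen probability. It assumes simple binomial sampling with no sequencing error; the function name `min_depth` and the 95% detection probability are illustrative assumptions and are not part of the disclosed methods.

```python
import math


def min_depth(variant_fraction, detection_prob=0.95):
    """Smallest read depth n such that P(at least one variant read) >= detection_prob.

    Assumes each read independently carries the variant with probability
    variant_fraction (simple binomial sampling, no sequencing error).
    """
    if not 0 < variant_fraction < 1 or not 0 < detection_prob < 1:
        raise ValueError("both arguments must lie strictly between 0 and 1")
    # P(no variant read in n reads) = (1 - f)^n <= 1 - p
    # => n >= ln(1 - p) / ln(1 - f)
    return math.ceil(math.log(1 - detection_prob) / math.log(1 - variant_fraction))


if __name__ == "__main__":
    for denominator in (1000, 5000, 10000, 20000):
        depth = min_depth(1 / denominator)
        print(f"1 in {denominator}: depth of at least {depth} reads")
```

In practice a variant call would typically require more than one supporting read, so these depths are best read as lower bounds under the stated assumptions.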
- methods further comprise detecting presence or absence of disease, such as cancer or infection, in a subject.
- Cancer cells, as most cells, can be characterized by a rate of turnover, in which old cells die and are replaced by newer cells. Generally dead cells, in contact with vasculature in a given subject, may release DNA or fragments of DNA into the blood stream. This is also true of cancer cells during various stages of the disease. Cancer cells may also be characterized, dependent on the stage of the disease, by various causal genetic variants, such as copy number variation as well as rare mutations. This phenomenon may be used to detect the presence or absence of cancer in a subject using the methods and systems described herein. In some cases, cancer is detected before symptoms or other hallmarks of disease occur.
- the types and number of cancers that may be detected include, but are not limited to, blood cancers, brain cancers, lung cancers, skin cancers, nose cancers, throat cancers, liver cancers, bone cancers, lymphomas, pancreatic cancers, bowel cancers, rectal cancers, thyroid cancers, bladder cancers, kidney cancers, mouth cancers, stomach cancers, solid state tumors, heterogeneous tumors, homogenous tumors and the like.
- the systems and methods described herein are used to help characterize certain cancers. Genetic data produced from the system and methods of this disclosure may allow practitioners to help better characterize a specific form of cancer. Often times, cancers are heterogeneous in both composition and staging.
- Genetic profile data may allow characterization of specific sub-types of cancer that may be important in the diagnosis or treatment of that specific sub-type. This information may also provide a subject or practitioner clues regarding the prognosis of a specific type of cancer. Progression of cancer development and/or response to treatment regimen can be followed by detecting appearance, disappearance, or changes in relative amounts of certain causal genetic variants over time.
- compositions for use in or produced by methods described herein, including with respect to any of the various other aspects and embodiments of this disclosure.
- Compositions of the disclosure can comprise any one or more of the elements described herein.
- compositions include one or more of the following: one or more pools of nucleotides from which a tail can be polymerized, one or more adapters comprising a 3′ overhang that hybridizes to a tail, one or more reagents for differentially modifying methylated or unmethylated cytosines, one or more amplification primers, one or more sequencing primers, one or more enzymes (e.g. one or more of a polymerase, a reverse transcriptase, a ligase, a ribonuclease, and a glycosylase), one or more buffers (e.g. a sodium carbonate buffer, a sodium bicarbonate buffer, a borate buffer, a Tris buffer, a MOPS buffer, a HEPES buffer), reagents for utilizing any of these, reaction mixtures comprising any of these, and instructions for using any of these.
- the present disclosure provides reaction mixtures for use in or produced by methods described herein, including with respect to any of the various other aspects of this disclosure.
- the reaction mixture comprises one or more compositions described herein.
- kits for use in any of the methods described herein, including with respect to any of the various other aspects of this disclosure.
- the kit comprises one or more compositions described herein. Elements of the kit can further be provided, without limitation, in any amount and/or combination (such as in the same kit or same container).
- kits comprise additional agents for use according to the methods of the invention.
- Kit elements can be provided in any suitable container, including but not limited to test tubes, vials, flasks, bottles, ampules, syringes, or the like.
- the agents can be provided in a form that may be directly used in the methods of the invention, or in a form that requires preparation prior to use, such as in the reconstitution of lyophilized agents.
- a kit comprises: (a) a template-independent polymerase; (b) a first pool of nucleotides that can be polymerized by the template-independent polymerase; (c) a second pool of nucleotides that can be polymerized by the template-independent polymerase; (d) a first adapter comprising an overhang that is hybridizable to tails formed by polymerizing the first pool of nucleotides; and (e) a second adapter comprising an overhang that is hybridizable to tails formed by polymerizing the second pool of nucleotides, wherein the second adapter comprises a different sequence than the first adapter.
- the kit further comprises one or more primers. Examples of polymerases, nucleotide pools, adapters, and primers are disclosed herein, including with regard
- the present disclosure provides systems, such as computer systems, for implementing methods described herein, including with respect to any of the various other aspects of this disclosure. It should be understood that it is not practical, or even possible in most cases, for an unaided human being to perform computational operations involved in some embodiments of methods disclosed herein. For example, mapping a single 30 bp read from a sample to any one of the human chromosomes might require years of effort without the assistance of a computational apparatus. Of course, the challenge of unaided sequence analysis and alignment is compounded in cases where reliable calls of low allele frequency mutations require mapping thousands (e.g., at least about 10,000) or even millions of reads to one or more chromosomes.
- the disclosure provides tangible and/or non-transitory computer readable media or computer program products that include program instructions and/or data (including data structures) for performing various computer-implemented operations.
- Examples of computer-readable media include, but are not limited to, semiconductor memory devices, magnetic media such as disk drives, magnetic tape, optical media such as CDs, magneto-optical media, and hardware devices that are specially configured to store and perform program instructions, such as read-only memory devices (ROM) and random access memory (RAM).
- the computer readable media may be directly controlled by an end user or the media may be indirectly controlled by the end user. Examples of directly controlled media include the media located at a user facility and/or media that are not shared with other entities.
- Examples of indirectly controlled media include media that is indirectly accessible to the user via an external network and/or via a service providing shared resources such as the “cloud.”
- Examples of program instructions include both machine code, such as produced by a compiler, and files containing higher level code that may be executed by the computer using an interpreter.
- the data or information employed in methods and systems disclosed herein are provided in an electronic format.
- data or information include, but are not limited to, sequencing reads derived from a nucleic acid sample, reference sequences (including reference sequences providing solely or primarily polymorphisms), sequences of one or more oligonucleotides used in the preparation of the sequencing reads (including portions thereof, and/or complements thereof), calls such as cancer diagnosis calls, counseling recommendations, diagnoses, and the like.
- data or other information provided in electronic format is available for storage on a machine and transmission between machines. Conventionally, data in electronic format is provided digitally and may be stored as bits and/or bytes in various data structures, lists, databases, etc. The data may be embodied electronically, optically, etc.
- a computer program product for generating an output indicating the sequences of polynucleotides in a test sample.
- the computer product may contain instructions for performing any one or more of the above-described methods for preparing a library of polynucleotides, and optionally determining polynucleotide sequences.
- the computer product may include a non-transitory and/or tangible computer readable medium having a computer executable or compilable logic (e.g., instructions) recorded thereon for enabling a processor to determine a sequence of interest.
- the computer product includes a computer readable medium having a computer executable or compilable logic (e.g., instructions) recorded thereon for enabling a processor to diagnose a condition and/or determine a nucleic acid sequence of interest.
- methods described herein are performed using a computer processing system which is adapted or configured to perform a method as described herein.
- the system includes a sequencing device adapted or configured for sequencing polynucleotides to obtain the type of sequence information described elsewhere herein, such as with regard to any of the various aspects described herein.
- the apparatus includes components for processing the sample, such as liquid handlers and sequencing systems, comprising modules for implementing one or more steps of any of the various methods described herein (e.g. sample processing, polynucleotide purification, and various reactions, such as tailing reactions, ligation reactions, amplification reactions, and sequencing reactions).
- sequence or other data is input into a computer or stored on a computer readable medium either directly or indirectly.
- a computer system is directly coupled to a sequencing device that reads and/or analyzes sequences of nucleic acids from samples. Sequences or other information from such tools are provided via an interface in the computer system. Alternatively, the sequences processed by the system are provided from a sequence storage source such as a database or other repository.
- a memory device or mass storage device buffers or stores, at least temporarily, sequences of the nucleic acids.
- the memory device may store read counts for various chromosomes or genomes, etc.
- the memory may also store various routines and/or programs for analyzing the sequence or mapped data.
- the programs/routines include programs for performing statistical analyses.
- a user provides a polynucleotide sample into a sequencing apparatus.
- Data is collected and/or analyzed by the sequencing apparatus which is connected to a computer.
- Software on the computer allows for data collection and/or analysis.
- Data can be stored, displayed (via a monitor or other similar device), and/or sent to another location.
- the computer may be connected to the internet, which is used to transmit data to a handheld device utilized by a remote user (e.g., a physician, scientist or analyst). It is understood that the data can be stored and/or analyzed prior to transmittal.
- raw data is collected and sent to a remote user or apparatus that will analyze and/or store the data. Transmittal can occur via the internet, but can also occur via satellite or other connection.
- data can be stored on a computer-readable medium and the medium can be shipped to an end user (e.g., via mail).
- the remote user can be in the same or a different geographical location including, but not limited to a building, city, state, country or continent.
- the methods comprise collecting data regarding a plurality of polynucleotide sequences (e.g., reads, and/or reference chromosome sequences) and sending the data to a computer or other computational system.
- the computer can be connected to laboratory equipment, e.g., a sample collection apparatus, a nucleotide amplification apparatus, or a nucleotide sequencing apparatus.
- the computer can then collect applicable data gathered by the laboratory device.
- the data can be stored on a computer at any step, e.g., while collected in real time, prior to the sending, during or in conjunction with the sending, or following the sending.
- the data can be stored on a computer-readable medium that can be extracted from the computer.
- the data collected or stored can be transmitted from the computer to a remote location, e.g., via a local network or a wide area network such as the internet. At the remote location various operations can be performed on the transmitted data.
- these various types of data are obtained, stored, transmitted, analyzed, and/or manipulated at one or more locations using distinct apparatus.
- the processing options span a wide spectrum.
- the sample is obtained at one location, it is processed and optionally sequenced at a different location, reads are aligned and calls are made at one or more different locations, and diagnoses, recommendations, and/or plans are prepared at still another location (which may be a location where the sample was obtained).
- NA12878 genomic DNA was obtained from Coriell Institute (Coriell Institute, NA12878). The concentration was measured by Qubit dsDNA HS assay kit (Thermo Fisher Scientific, Q32851) and the amount of DNA used in library preparation was 10 ng.
- DNA substrates were diluted into 50 μl IDTE buffer (IDT, 11-05-01-09), and sheared into fragments of about 100-600 bp using a focused acoustic sonicator (Covaris, M220). The sonication parameters were set as follows: peak incident power 50 W, duty factor 20%, cycle per burst 200, duration 150 seconds, and temperature 6-8° C. The size of the sheared DNA fragments was confirmed by LabChip GXII touch 24 (Perkin Elmer).
- the bisulfite conversion step (BC) was carried out with a modified protocol from EZ-96 DNA Methylation-Lightning™ MagPrep (Zymo, D5047). 97.5 μl of Lightning Conversion Reagent and 15 μl of sheared genomic DNA or cfDNA were added to a 48-well plate (Thermo Fisher Scientific, AB0648). The samples were mixed by pipetting up and down and incubated in a thermal cycler with the following conditions: (i) 98° C. for 8 minutes; (ii) 54° C. for 60 minutes; (iii) 4° C. storage for up to 20 hours.
- the BC-treated DNA samples were transferred to a 96-well midi-plate (Thermo Scientific, AB0859) with preloaded 450 μl of M-Binding Buffer and 7.5 μl of MagBinding Beads for each well. Components were mixed thoroughly and the plate was allowed to stand at room temperature for 5 minutes. The plate was then transferred to a magnetic stand for an additional 5 minutes, and the supernatant was removed. The beads were washed with 300 μl of M-Wash Buffer and then incubated with 150 μl of L-Desulphonation Buffer at room temperature (20-30° C.) for 25 minutes.
- the plates were placed on the magnetic stand for 3 minutes and the supernatant was discarded, followed by washing the beads twice with 300 μl of M-Wash Buffer. After the washing step, the plate was transferred to a metal heater (Illumina, SC-60-504, BD-60-601) at 55° C. for 30 minutes to dry the beads, then 16 μl of M-Elution Buffer was added with an additional 4 min incubation at 55° C. The plate was then moved to the magnetic stand for 1 minute and the supernatant was recovered as template for subsequent library prep steps.
- the splinter adapter MDA1 was designed to have a plurality of eight G or A randomly synthesized at 9:1 molar ratio. During the first tailing and ligation step, it annealed to the 3′ end poly-C/T tail of the single stranded DNA substrate (as illustrated in FIG. 3, bottom). The sequences of the oligonucleotides forming MDA1 are illustrated in FIG. 2.
- the MDA1 adapter was prepared by annealing oligos ATN-R2-Top and ATN-R2-Bot together. In detail, 50 μl of each oligo (100 μM) was mixed and incubated at 95° C.
- the MDA2 adapter was prepared with the ATN-R1-Top and ATN-R1-Bot oligos following a similar strategy.
- the sequences of the oligonucleotides forming MDA2 are also illustrated in FIG. 2. Sequences for oligonucleotides forming MDA1, MDA2, and for an amplification primer designated “Anchor primer” are set forth in Table 1.
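- As an illustration of what an eight-position G/A overhang synthesized at a 9:1 molar ratio implies for the adapter population, the following sketch computes the expected fraction of overhangs containing exactly k adenines. It assumes that each position is filled independently with G at probability 0.9 and A at probability 0.1, which is a simplification of real oligonucleotide synthesis; the function name is hypothetical.

```python
from math import comb


def overhang_a_count_distribution(length=8, g_fraction=0.9):
    """Return P(exactly k positions are A) for k = 0..length.

    Assumes independent positions, each G with probability g_fraction
    and A otherwise (an idealization of mixed-base synthesis).
    """
    a_fraction = 1.0 - g_fraction
    return [
        comb(length, k) * a_fraction**k * g_fraction ** (length - k)
        for k in range(length + 1)
    ]


if __name__ == "__main__":
    for k, p in enumerate(overhang_a_count_distribution()):
        print(f"{k} A / {8 - k} G: {p:.4f}")
```

Under these assumptions fewer than half of the overhangs are pure poly-G, and the remainder carry one or more A positions that can pair with T residues in a mixed C/T tail.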
- Bisulfite converted DNA fragments were end-repaired by mixing 12.5 μl of DNA sample, 1.5 μl of 10× CutSmart buffer (NEB, B7204S), and 1 μl Shrimp alkaline phosphatase (NEB, M0371L), and incubating at 37° C. for 30 minutes. The products were further denatured by incubating at 95° C. for 5 min and fast cooling on ice.
- the first ligation reaction was performed in a 20 μl reaction volume containing pretreated DNA substrates, 1× CutSmart Buffer, 0.25 mM CoCl2 (NEB, B0252S), 0.025 mM β-Nicotinamide adenine dinucleotide (NEB, B9007S), 0.09 mM dCTP (Roche, 11934520001), 0.01 mM dTTP (Roche, 11934546001), 1 μM MDA1 adapter, 0.5 U/μl E. coli ligase (NEB, M0205L) and 0.5 U/μl terminal deoxynucleotidyl transferase (TdT; NEB, M0315S). The reaction was incubated at 37° C. for 30 minutes followed by heating at 95° C. for 2 minutes and held at 4° C.
- the ligated product was extended and linearly amplified in the presence of 1× KAPA HiFi HotStart Uracil+ ReadyMix (KAPA, KK2802), and 0.91 μM anchor primer.
- the linear amplification reaction was carried out with the following thermal profile: (i) 95° C. for 5 minutes; (ii) 15 cycles of 98° C. for 20 seconds, 62° C. for 30 seconds, and 72° C. for 1 minute; and (iii) 72° C. for 5 minutes.
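- For run planning, the programmed hold time of a thermal profile such as the one above can be tallied with a short sketch. It sums only the programmed hold steps and ignores ramp times, which depend on the instrument; the function name and argument layout are illustrative assumptions.

```python
def total_hold_seconds(initial, cycle_steps, cycles, final):
    """Sum programmed hold times in seconds; instrument ramp times are ignored."""
    return initial + cycles * sum(cycle_steps) + final


if __name__ == "__main__":
    # Linear amplification profile: 5 min; 15 x (20 s, 30 s, 60 s); 5 min.
    seconds = total_hold_seconds(
        initial=5 * 60, cycle_steps=(20, 30, 60), cycles=15, final=5 * 60
    )
    print(f"{seconds} s of programmed holds ({seconds / 60:.1f} min)")
```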
- buffer was exchanged by purification with 2.5× AMPure XP beads (Beckman Coulter, A63881) and eluted with 11.5 μl Elution Buffer (10 mM Tris-HCl, pH 8.0).
- the second ligation reaction was performed in a 20 μl reaction volume containing 10 μl of purified DNA products, 1× CutSmart buffer, 0.25 mM CoCl2 (NEB, B0252S), 0.025 mM β-Nicotinamide adenine dinucleotide (NEB, B9007S), 0.1 mM dATP (Roche, 11934511001), 1 μM MDA2, 0.5 U/μl E. coli ligase (NEB, M0205L) and 0.5 U/μl terminal deoxynucleotidyl transferase (NEB, M0315S).
- the reaction was incubated at 37° C. for 30 minutes followed by heating at 95° C. for 2 minutes and held at 4° C.
- An illustration of an example product of the second ligation is provided in FIG. 3 (bottom), compared to the product of a ligation reaction involving “Y” adapters (top).
- PCR enrichment of ligated product was performed in a 50 μl reaction containing 20 μl of the above-mentioned DNA product, 1× KAPA HiFi buffer, dNTP, 1 μM primer F and primer R, and 1 U/μl KAPA HiFi polymerase.
- the PCR program was as follows: (i) 95° C. for 5 minutes; (ii) 12 cycles of 98° C. for 20 seconds, 60° C. for 30 seconds, and 72° C. for 1 minute; and (iii) 72° C. for 10 minutes.
- the PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, A63881) and eluted in 18 μl of EB (10 mM Tris-HCl, pH 8.0).
- the sequence of primer F was ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 17).
- the sequence of primer R was GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC (SEQ ID NO: 18).
- FIG. 4 illustrates an example plot of a capillary electrophoretic analysis, showing an example size distribution of pre-capture library fragments after PCR enrichment.
- the expected peak size was 200-400 bp. All libraries were loaded on HT DNA High Sensitivity LabChip Kit (Perkin Elmer).
- the highest curve at 300 bp shows the ligated substrate when provided with 1× MDA1 adapters.
- the next curves, from top to bottom, represent 2×, 3×, and 4× adapters, respectively.
- the data indicate that 1× MDA1 is sufficient for attaching the adapter, and the ligation efficiency decreased with increasing MDA1 concentration, under these conditions.
- indexing primers: premixed i5 and i7, 20 μM each
- the PCR Program was as follows: (i) 98° C. for 45 seconds; (ii) 98° C. for 15 seconds, 60° C. for 30 seconds, 72° C. for 1 minute, 12 cycles and (iii) 72° C. for 5 minutes.
- Purified DNA libraries were eluted in 20 μl of EB and quantified by Qubit dsDNA HS assay kit.
- the sequence of index primer i5 was
- index primer i7 was
- a tailing step is performed using TdT with appropriate dNTP(s) to add a homopolymer or near-homopolymer tail to the 3′ end of ssDNA fragments.
- the homopolymer anneals to the 3′ overhang of an adapter containing a 5′ phosphate group in the top strand.
- the ligation reaction catalyzed by ligase seals the 3′ end of the ssDNA fragment to prevent excessive tailing.
- the bottom strand of the adapter is competed out by the anchor primer, exposing the initiating sites for a linear amplification process.
- the amplified ssDNA strands serve as templates for the second round of tailing and ligation, the products of which are then amplified.
- NA12878 genomic DNA was obtained from Coriell Institute (Coriell Institute, NA12878). The concentration was measured by Qubit dsDNA HS assay kit (Thermo Fisher Scientific, Q32851) and the amount of DNA used in library preparation ranged from 2-30 ng.
- DNA substrates were diluted into 50 μl IDTE buffer (IDT, 11-05-01-09), and sheared into fragments of about 100-600 bp using a focused acoustic sonicator (Covaris, M220).
- the sonication parameters were set as follows: peak incident power 50 W, duty factor 20%, cycle per burst 200, duration 150 seconds, and temperature 6-8° C. The size of the sheared DNA fragments was confirmed by LabChip GXII touch 24 (Perkin Elmer).
- Plasma samples were obtained from human blood draws.
- Cell free DNA (cfDNA) was extracted using the QiaAmp Circulating Nucleic Acid Kit (Qiagen, 55114).
- cfDNA was quantified by Qubit dsDNA HS assay kit, as for the NA12878 genomic DNA, but was not subjected to fragmentation.
- the bisulfite conversion step (BC) was carried out with a modified protocol from EZ-96 DNA Methylation-Lightning™ MagPrep (Zymo, D5047). 97.5 μl of Lightning Conversion Reagent and 15 μl of sheared genomic DNA or cfDNA were added to a 48-well plate (Thermo Fisher Scientific, AB0648). The samples were mixed by pipetting up and down and incubated in a thermal cycler with the following conditions: (i) 98° C. for 8 minutes; (ii) 54° C. for 60 minutes; (iii) 4° C. storage for up to 20 hours.
- the BC-treated DNA samples were transferred to a 96-well midi-plate (Thermo Scientific, AB0859) with preloaded 450 μl of M-Binding Buffer and 7.5 μl of MagBinding Beads for each well. Components were mixed thoroughly and the plate was allowed to stand at room temperature for 5 minutes. The plate was then transferred to a magnetic stand for an additional 5 minutes, and the supernatant was removed. The beads were washed with 300 μl of M-Wash Buffer and then incubated with 150 μl of L-Desulphonation Buffer at room temperature (20-30° C.) for 25 minutes.
- the plates were placed on the magnetic stand for 3 minutes and the supernatant was discarded, followed by washing the beads twice with 300 μl of M-Wash Buffer. After the washing step, the plate was transferred to a metal heater (Illumina, SC-60-504, BD-60-601) at 55° C. for 30 minutes to dry the beads, then 16 μl of M-Elution Buffer was added with an additional 4 min incubation at 55° C. The plate was then moved to the magnetic stand for 1 minute and the supernatant was recovered as template for subsequent library prep steps.
- the splinter adapter MDA1 was designed to have a plurality of eight G or A randomly synthesized at 9:1 molar ratio. During the first tailing and ligation step, it annealed to the 3′ end poly-C/T tail of the single stranded DNA substrate (as illustrated in FIG. 3, bottom). The sequences of the oligonucleotides forming MDA1 are illustrated in FIG. 2.
- the MDA1 and MDA2 adapters were prepared as in Example 1. Sequences for oligonucleotides forming MDA1, MDA2, and for an amplification primer designated “Anchor primer” are set forth in Table 1, above.
- Bisulfite converted DNA fragments were end-repaired by mixing 12.5 μl of DNA sample, 1.5 μl of 10× CutSmart buffer (NEB, B7204S), and 1 μl Shrimp alkaline phosphatase (NEB, M0371L), and incubating at 37° C. for 30 minutes. The products were further denatured by incubating at 95° C. for 5 min and fast cooling on ice.
- the first ligation reaction was performed in a 20 μl reaction volume containing pretreated DNA substrates, 1× CutSmart Buffer, 0.25 mM CoCl2 (NEB, B0252S), 0.025 mM β-Nicotinamide adenine dinucleotide (NEB, B9007S), 0.09 mM dCTP (Roche, 11934520001), 0.01 mM dTTP (Roche, 11934546001), 1 μM MDA1 adapter, 0.5 U/μl E. coli ligase (NEB, M0205L) and 0.5 U/μl terminal deoxynucleotidyl transferase (TdT; NEB, M0315S). The reaction was incubated at 37° C. for 30 minutes followed by heating at 95° C. for 2 minutes and held at 4° C.
- the ligated product was extended and linearly amplified in the presence of 1× KAPA HiFi HotStart Uracil+ ReadyMix (KAPA, KK2802), and 0.91 μM anchor primer.
- the linear amplification reaction was carried out with the following thermal profile: (i) 95° C. for 5 minutes; (ii) 15 cycles of 98° C. for 20 seconds, 62° C. for 30 seconds, and 72° C. for 1 minute; and (iii) 72° C. for 5 minutes.
- buffer was exchanged by purification with 2.5× AMPure XP beads (Beckman Coulter, A63881) and eluted with 11.5 μl Elution Buffer (10 mM Tris-HCl, pH 8.0).
- the second ligation reaction was performed in a 20 μl reaction volume containing 10 μl of purified DNA products, 1× CutSmart buffer, 0.25 mM CoCl2 (NEB, B0252S), 0.025 mM β-Nicotinamide adenine dinucleotide (NEB, B9007S), 0.1 mM dATP (Roche, 11934511001), 1 μM MDA2, 0.5 U/μl E. coli ligase (NEB, M0205L) and 0.5 U/μl terminal deoxynucleotidyl transferase (NEB, M0315S).
- the reaction was incubated at 37° C. for 30 minutes followed by heating at 95° C. for 2 minutes and held at 4° C.
- An illustration of an example product of the second ligation is provided in FIG. 3 (bottom), compared to the product of a ligation reaction involving “Y” adapters (top).
- PCR enrichment of ligated product was performed in a 50 μl reaction containing 20 μl of the above-mentioned DNA product, 1× KAPA HiFi buffer, dNTP, 1 μM primer F and primer R, and 1 U/μl KAPA HiFi polymerase.
- the PCR program was as follows: (i) 95° C. for 5 minutes; (ii) 12 cycles of 98° C. for 20 seconds, 60° C. for 30 seconds, and 72° C. for 1 minute; and (iii) 72° C. for 10 minutes.
- the PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, A63881) and eluted in 18 μl of EB (10 mM Tris-HCl, pH 8.0).
- FIGS. 5A-C illustrate example plots of capillary electrophoretic analyses, showing example size distributions of pre-capture library fragments after PCR enrichment.
- the expected peak size was 200-400 bp.
- the pre-captured library yield increased as input increased.
- the cfDNA had a higher yield than the sheared genomic DNA (gDNA). All libraries were loaded on HT DNA High Sensitivity LabChip Kit (Perkin Elmer).
- the beads were first washed once at room temperature with 500 μl of Wash Buffer1 (0.15 M Sodium Chloride, 0.015 M Sodium Citrate, 0.1% SDS), then three times with Wash Buffer2 (0.015 M Sodium Chloride, 0.0015 M Sodium Citrate, 0.1% SDS) at 65° C. The beads were then resuspended in 20 μl of elution buffer (10 mM Tris-HCl, pH 8.0) and used as template for the following indexing PCR step.
- SW48 genomic DNA, which has increased levels of methylation, was purchased from ATCC (ATCC, CCL231). The concentration was measured by Qubit dsDNA HS assay kit (Thermo Fisher Scientific, Q32851). 10 ng of SW48 genomic DNA was whole genome amplified (WGA) by REPLI-g Mini Kit (Qiagen, 150023) in 50 μl following the standard protocol (including 16 hour incubation at 30° C.). The amplified material was purified with 100 μl of AMPure XP beads (Beckman Coulter, A63881) and eluted into 50 μl IDTE buffer (IDT, 11-05-01-09).
- the final WGA DNA yield was about 3 μg with a methylation level of about 1/300 of original SW48.
- the WGA DNA was proportionally mixed with original SW48 genomic DNA at 0%, 20%, 50%, 80%, and 100% levels to mimic a genome-wide methylation level gradient.
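- The methylation level expected for each mixture can be estimated as a weighted average of the two components. The sketch below assumes that the stated percentages are the WGA fraction of the mix, that mixing by mass is equivalent to mixing by genome copies, and that the WGA DNA retains 1/300 of the native methylation level at every site; the 0.97 starting level and the function name are hypothetical values chosen for illustration.

```python
def expected_methylation(wga_fraction, native_level, wga_relative_level=1 / 300):
    """Weighted-average methylation of a WGA / native DNA mixture.

    wga_fraction: fraction of the mix that is WGA DNA (0.0 to 1.0).
    native_level: methylation level of the unamplified DNA at a site (0.0 to 1.0).
    wga_relative_level: methylation of WGA DNA relative to native DNA.
    """
    if not 0.0 <= wga_fraction <= 1.0:
        raise ValueError("wga_fraction must be between 0 and 1")
    native_part = (1.0 - wga_fraction) * native_level
    wga_part = wga_fraction * native_level * wga_relative_level
    return native_part + wga_part


if __name__ == "__main__":
    for fraction in (0.0, 0.2, 0.5, 0.8, 1.0):
        level = expected_methylation(fraction, native_level=0.97)
        print(f"{fraction:.0%} WGA: expected methylation {level:.3f}")
```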
- 50 ng of each DNA mix was sheared into fragments of about 100-600 bp using a focused acoustic sonicator (Covaris, M220).
- the sonication parameters were set as follows: peak incident power 50 W, duty factor 20%, cycle per burst 200, duration 150 seconds, and temperature 6-8° C.
- the size of the sheared DNA fragments was confirmed by LabChip GXII touch 24 (Perkin Elmer).
- the bisulfite conversion step (BC) was carried out with a modified protocol from EZ-96 DNA Methylation-Lightning™ MagPrep (Zymo, D5047). 97.5 μl of Lightning Conversion Reagent and 40 ng of sheared genomic DNA mix in 15 μl were added to a 48-well plate (Thermo Fisher Scientific, AB0648). The samples were mixed by pipetting up and down and incubated in a thermal cycler with the following conditions: (i) 98° C. for 8 minutes; (ii) 54° C. for 60 minutes; (iii) 4° C. storage for up to 20 hours.
- the BC-treated DNA samples were transferred to a 96-well midi-plate (Thermo Scientific, AB0859) with preloaded 450 μl of M-Binding Buffer and 7.5 μl of MagBinding Beads for each well. Components were mixed thoroughly and the plate was allowed to stand at room temperature for 5 minutes. The plate was then transferred to a magnetic stand for an additional 5 minutes, and the supernatant was removed. The beads were washed with 300 μl of M-Wash Buffer and then incubated with 150 μl of L-Desulphonation Buffer at room temperature (20-30° C.) for 25 minutes.
- the plates were placed on the magnetic stand for 3 minutes and the supernatant was discarded, followed by washing the beads twice with 300 μl of M-Wash Buffer. After the washing step, the plate was transferred to a metal heater (Illumina, SC-60-504, BD-60-601) at 55° C. for 30 minutes to dry the beads, then 16 μl of M-Elution Buffer was added with an additional 4 min incubation at 55° C. The plate was then moved to the magnetic stand for 1 minute and the supernatant was recovered as template for subsequent library prep steps.
- the MDA1 and MDA2 adapters were prepared as in Example 1. Sequences for oligonucleotides forming MDA1, MDA2, and for an amplification primer designated “Anchor primer” are set forth in Table 1, above.
- the first ligation, subsequent amplification, second ligation, and PCR enrichment were performed as in Example 1.
- 15 μl of purified DNA library (50-200 ng/μl) was mixed well with 4 μl blocker mix, and incubated in a thermal cycler with the following conditions: (i) 95° C. for 5 minutes; (ii) 65° C. hold.
- 10 μl of Hybridization Buffer (13× SSPE; 13.5 mM EDTA; 13× Denhardt's Solution; 0.45% SDS), 0.5 μl RNase inhibitor, and 0.5 μl Agilent SureSelect Custom Panel Probe Pool were pre-warmed at 65° C. for 2 minutes. Then the entire contents of the DNA-blocker mix were transferred to the probe mix, allowing the hybridization reaction to proceed at 65° C. for 16-24 hours.
- FIG. 6A illustrates an example plot of a capillary electrophoretic analysis, showing size distribution of pre-capture library fragments after PCR enrichment. Curves from top to bottom correspond to samples indicated in the legend from bottom to top. The expected peak size was 200-400 bp. All libraries were loaded on HT DNA High Sensitivity LabChip Kit (Perkin Elmer). All pre-captured libraries have very similar yield and insert size, indicating that the library prep method had no bias on methylated states.
- FIG. 6B illustrates an example plot of a capillary electrophoretic analysis, showing size distribution of post-capture library fragments after indexing PCR. All libraries were loaded on HT DNA High Sensitivity LabChip Kit (Perkin Elmer). Library yield gradually decreased as the original methylation level increased, indicating the general GC bias of the library preparation procedure under these conditions.
- FIG. 7 illustrates the methylation level of 12,977 targeted CpG sites. These sites have >97% methylation level in SW48-1 samples (100% SW48, 0% WGA). With different WGA sample spike-in, the methylation levels of these sites decreased proportionally and were within expectations. This indicated that the whole library preparation and capture process can precisely and accurately measure CpG methylation levels.
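The proportional decrease described for FIG. 7 can be illustrated with a simple two-component mixing model. This is a minimal sketch, not the analysis used in the example: it assumes the WGA material contributes essentially no methylation, that mixing is linear by DNA mass, and it uses 97% (the lower bound stated above) as the SW48 level. The function name and the spike-in fractions are hypothetical.

```python
def expected_methylation(frac_sw48, meth_sw48=0.97, meth_wga=0.0):
    """Expected methylation level at one CpG site in a SW48/WGA mixture.

    frac_sw48 -- mass fraction of SW48 DNA in the mixture (0..1)
    meth_sw48 -- methylation level in pure SW48 (assumed 0.97, the stated lower bound)
    meth_wga  -- methylation level in WGA DNA (assumed 0.0)
    """
    if not 0.0 <= frac_sw48 <= 1.0:
        raise ValueError("frac_sw48 must be between 0 and 1")
    return frac_sw48 * meth_sw48 + (1.0 - frac_sw48) * meth_wga


# Hypothetical spike-in series, for illustration only.
for frac in (1.0, 0.75, 0.5, 0.25, 0.0):
    print(f"{frac:.0%} SW48 -> expected methylation {expected_methylation(frac):.1%}")
```

Under these assumptions the expected level falls on a straight line between the two pure samples, which is the behavior the measured sites are compared against.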
- NA12878 genomic DNA and a customized 5% mutation genomic DNA reference were obtained from Coriell Institute (Coriell Institute, NA12878) and Horizon Discovery (HD-C669), respectively. The concentration was measured by Qubit dsDNA HS assay kit (Thermo Fisher Scientific, Q32851). The HD-C669 was mixed with NA12878 at a ratio of 1:9 to give an expected mutant allele frequency of 0.5% (the resulting mixture was named “PC1”). Mutations and their expected frequencies are listed in Table 6A.
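The 0.5% figure follows from diluting a 5% reference one part in ten. A minimal sketch of that arithmetic, assuming both DNA stocks are mixed by mass at equal concentration and that NA12878 carries none of the reference variants (the function name is hypothetical):

```python
def mixed_allele_frequency(af_reference, parts_reference, parts_wildtype):
    """Expected allele frequency after mixing a mutation reference with wild-type DNA."""
    total_parts = parts_reference + parts_wildtype
    if total_parts <= 0:
        raise ValueError("total parts must be positive")
    # Only the reference contributes mutant alleles (assumption).
    return af_reference * parts_reference / total_parts


# 5% reference (HD-C669) mixed 1:9 with NA12878.
af = mixed_allele_frequency(0.05, parts_reference=1, parts_wildtype=9)
print(f"expected allele frequency: {af:.2%}")
```

Variants present in the reference at other starting frequencies would scale by the same factor of one tenth.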
- 50 ng of pure NA12878 and 0.5% AF Mixed DNA substrates were diluted into 50 µl IDTE buffer (IDT, 11-05-01-09), and sheared into fragments of about 100-600 bp using a focused acoustic sonicator (Covaris, M220).
- the sonication parameters were set as follows: peak incident power 50 W, duty factor 20%, cycle per burst 200, duration 150 seconds, and temperature 6-8° C.
- the size of the sheared DNA fragments was confirmed by LabChip GXII touch 24 (Perkin Elmer).
- the sheared materials were quantified by Qubit dsDNA HS assay kit to get 10 ng as the library prep input.
- a library was prepared using a typical “Y” adapter procedure. 10 ng of sheared genomic DNA in 50 µl IDTE was added in a 48-well Plate (Thermo Fisher Scientific, AB0648). The samples were end repaired and ligated using standard KAPA Hyper Prep kit (KAPA Biosystem, KK8504). The “Y” adapters described in FIG. 3 (top) were used in the ligation system at a final concentration of 0.8 µM.
- for splinter adapter assisted library prep, 10 ng of sheared genomic DNA in 12.5 µl IDTE was added in a 48-well Plate (Thermo Fisher Scientific, AB0648) and end-repaired by mixing with 1.5 µl of 10× CutSmart buffer (NEB, B7204S) and 1 µl Shrimp alkaline phosphatase (NEB, M0371L). The mixture was incubated at 37° C. for 30 minutes and then heated to 95° C. for 5 minutes, followed by fast cooling on ice.
- the MDA1 and MDA2 adapters were prepared as in Example 1. Sequences for oligonucleotides forming MDA1, MDA2, and for an amplification primer designated “Anchor primer” are set forth in Table 1, above. The first ligation, subsequent amplification, second ligation, and PCR enrichment were performed as in Example 1.
- PCR enrichment of ligated products using both “Y” adapters and splinter adapters was performed in 50 µl reactions containing 20 µl of DNA product, 1× KAPA HiFi buffer, dNTP, 1 µM primer F and primer R, and 1 U/µl KAPA HiFi polymerase.
- the PCR program was as follows: (i) 95° C. for 5 minutes; (ii) 98° C. for 20 seconds, 60° C. for 30 seconds, 72° C. for 1 minute, 12 cycles and (iii) 72° C. for 10 minutes.
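The three-part program above can be written down as data, which makes it easy to check the programmed hold time. This is a sketch only: the `Step` class and `program_seconds` helper are hypothetical names, and ramp times between temperatures are ignored, so the real run is longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # hold time at that temperature


def program_seconds(initial, cycle, n_cycles, final):
    """Total programmed hold time: initial step + cycled steps + final step."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


# PCR enrichment program as stated in the text (12 cycles).
total = program_seconds(
    initial=Step(95, 5 * 60),
    cycle=(Step(98, 20), Step(60, 30), Step(72, 60)),
    n_cycles=12,
    final=Step(72, 10 * 60),
)
print(f"programmed hold time: {total} s ({total / 60:.0f} min)")
```

The same helper can describe the other programs in these examples by changing the annealing temperature and cycle count.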
- the PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, A63881) and eluted in 18 µl of EB (10 mM Tris-HCl, pH 8.0).
- FIG. 8A illustrates an example plot of a capillary electrophoretic analysis, showing an example size distribution of pre-capture library fragments after PCR enrichment (top and bottom plots are ELSA-12878-pre and HS-12878-pre, respectively).
- ELSA denotes splinter adapter libraries
- HS denotes “Y” adapter libraries.
- the expected peak size was 200-500 bp. All libraries were loaded on HT DNA High Sensitivity LabChip Kit (Perkin Elmer).
- the beads were first washed once at room temperature with 500 µl of Wash Buffer1 (0.15 M Sodium Chloride, 0.015 M Sodium Citrate, 0.1% SDS), then three times with Wash Buffer2 (0.015 M Sodium Chloride, 0.0015 M Sodium Citrate, 0.1% SDS) at 65° C. The beads were then resuspended in 20 µl of elution buffer (10 mM Tris-HCl, pH 8.0) and used as template for the following indexing PCR step.
- FIG. 8B illustrates an example plot of a capillary electrophoretic analysis, showing an example size distribution of captured library fragments after Indexing PCR (top and bottom plots are ELSA-12878-post and HS-12878-post, respectively).
- Lambda DNA was purchased from Promega (Madison, Wis., Catalog number: D1521). The concentration was measured by Qubit dsDNA HS assay kit (Thermo Fisher Scientific, Waltham, Mass., Q32851), and the amount of DNA used in library preparation ranged from 1-10 ng.
- DNA substrates were diluted into 50 µl IDTE buffer (Integrated DNA Technologies, Coralville, Iowa; 11-05-01-09), and sheared into fragments of about 100-600 bp using a focused acoustic sonicator (Covaris, Woburn, Mass., M220). The sonication parameters were set as follows: peak incident power 50 W, duty factor 20%, cycle per burst 200, duration 150 seconds, and temperature 6-8° C. The size of the sheared DNA fragments was confirmed by LabChip GXII touch 24 (Perkin Elmer, Waltham, Mass.).
- the bisulfite conversion step (BC) was carried out with a modified protocol from EZ-96 DNA methylation-Lightning™ MagPrep (Zymo, Irvine, Calif., D5047). 97.5 µl of Lightning Conversion Reagent and 15 µl of sheared genomic DNA were added in a 48-well Plate (Thermo Fisher Scientific, AB0648). The samples were mixed by pipetting up and down and incubated in a thermal cycler with the following conditions: (i) 98° C. for 8 minutes; (ii) 54° C. for 60 minutes; (iii) 4° C. storage for up to 20 hours.
- the BC-treated DNA samples were transferred to a 96-well midi-plate (Thermo Scientific, AB0859) with preloaded 450 µl of M-Binding Buffer and 7.5 µl of MagBinding Beads for each well. Components were mixed thoroughly and the plate was allowed to stand at room temperature for 5 minutes. The plate was then transferred to a magnetic stand for an additional 5 minutes, and the supernatant was removed. The beads were washed with 300 µl of M-Wash Buffer and then incubated with 150 µl of L-Desulphonation Buffer at room temperature (20-30° C.) for 25 minutes.
- the plates were placed on the magnetic stand for 3 minutes and the supernatant was discarded, followed by washing the beads with 300 µl of M-Wash Buffer twice. After the washing step, the plate was transferred to a metal heater (Illumina, San Diego, Calif., SC-60-504, BD-60-601) at 55° C. for 30 minutes to dry the beads, then 16 µl of M-Elution Buffer was added with an additional 4 minutes of incubation at 55° C. The plate was then moved to the magnetic stand for 1 minute, and the supernatant was recovered as template for subsequent library prep steps.
- the adapter MDA1 was designed to have an eight base 3′ overhang and a four base 5′ overhang on the bottom strand.
- the 3′ overhang has a plurality of eight G or A randomly synthesized at a 3:1 molar ratio.
- the four base 5′ overhang creates a recessed 3′ end on the top strand, which prevents leaky TdT activity that could otherwise result from incomplete blocking of the 3′ end of the top strand.
- the 3′ overhang annealed to the 3′ end poly-C/T tail of the single stranded DNA substrate (as illustrated in FIG. 9 ).
- the sequences of the oligonucleotides forming MDA1 are illustrated in FIG. 10 .
- the MDA1 adapter was prepared by annealing oligo ATN-R2-Top and ATN-R2-Bot together. In detail, 50 µl of each oligo (100 µM) was mixed and incubated at 95° C. for 10 minutes and allowed to slowly cool to room temperature in 10 mM Tris-HCl containing 0.1 mM EDTA and 50 mM NaCl. The 3′ ends of both oligos were blocked by a phosphate group to prevent self-ligation.
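The annealing step implies a duplex stock concentration that can be worked out from the stated volumes. The sketch below assumes the two oligos are already in annealing buffer so that the final volume is simply 100 µl, that annealing goes to completion, and that the stock is used undiluted; none of this is stated in the text, and the function names are hypothetical.

```python
def annealed_duplex_um(vol_top_ul, conc_top_um, vol_bot_ul, conc_bot_um):
    """Upper bound on duplex concentration (µM) after mixing two oligo stocks."""
    total_ul = vol_top_ul + vol_bot_ul
    top_um = vol_top_ul * conc_top_um / total_ul   # top strand after dilution
    bot_um = vol_bot_ul * conc_bot_um / total_ul   # bottom strand after dilution
    return min(top_um, bot_um)                     # limiting strand sets the duplex yield


def stock_volume_ul(target_um, reaction_ul, stock_um):
    """Stock volume (µl) needed to reach target_um in a reaction of reaction_ul."""
    return target_um * reaction_ul / stock_um


duplex = annealed_duplex_um(50, 100, 50, 100)          # 50 µl of each 100 µM oligo
needed = stock_volume_ul(1.0, 20.0, duplex)            # 1 µM adapter in a 20 µl ligation
print(f"duplex stock: {duplex:.0f} µM; volume per 20 µl reaction: {needed:.1f} µl")
```

In practice a sub-microliter volume like this is usually handled by preparing a working dilution first.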
- the MDA2 adapter was designed to have a plurality of seven N (A, T, G or C randomly synthesized at 1:1:1:1 molar ratio). It annealed to the 3′ end of the single stranded DNA substrate and prompted the ligation between MDA2 and DNA substrate during the second ligation step (as illustrated in FIG. 9 ).
- the MDA2 adapter was prepared by annealing oligo ATN-R1-Top and ATN-R1-Bot together.
- the sequences of the oligonucleotides forming MDA2 are illustrated in FIG. 10 . Sequences for oligonucleotides forming MDA1, MDA2, and for an amplification primer designated “Anchor primer” are set forth in Table 7.
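The degenerate overhangs described above define pools of adapter molecules rather than single sequences. The following sketch counts the distinct overhangs in each pool and the probability of one particular overhang, assuming each position is synthesized independently at the stated molar ratios (3:1 G:A for the eight-base MDA1 overhang, 1:1:1:1 for the seven N positions of MDA2). The helper name is hypothetical.

```python
from math import prod


def sequence_probability(seq, base_probs):
    """Probability of one specific sequence when positions are drawn independently."""
    return prod(base_probs[base] for base in seq)


MDA1_OVERHANG = {"G": 0.75, "A": 0.25}          # G:A at 3:1
MDA2_OVERHANG = {b: 0.25 for b in "ATGC"}       # N at 1:1:1:1

n_mda1 = len(MDA1_OVERHANG) ** 8                # distinct 8-base G/A overhangs
n_mda2 = len(MDA2_OVERHANG) ** 7                # distinct 7-base N overhangs
p_all_g = sequence_probability("G" * 8, MDA1_OVERHANG)

print(f"MDA1 overhang variants: {n_mda1}")
print(f"MDA2 overhang variants: {n_mda2}")
print(f"fraction of MDA1 molecules with an all-G overhang: {p_all_g:.3f}")
```

Because G is favored at every position, the MDA1 pool is weighted toward G-rich overhangs, whereas every MDA2 overhang is equally likely under this model.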
- Bisulfite converted DNA fragments were end-repaired by mixing 12.5 µl of DNA sample, 1.5 µl of 10× CutSmart buffer (NEB, B7204S), and 1 µl Shrimp alkaline phosphatase (New England Biolabs (NEB), Ipswich, Mass., M0371L), and incubating at 37° C. for 30 minutes. The products were further denatured by incubating at 95° C. for 5 minutes and fast cooling on ice.
- the first ligation reaction was performed in a 20 µl reaction volume containing pretreated DNA substrates, 1× CutSmart Buffer, 0.25 mM CoCl2 (NEB, B0252S), 0.025 mM β-Nicotinamide adenine dinucleotide (NEB, B9007S), 0.09 mM dCTP (Roche, 11934520001, sold by Sigma-Aldrich, St. Louis, Mo.), 0.01 mM dTTP (Roche, 11934546001), 1 µM MDA1 adapter, 0.5 U/µl E.
- the ligated product was extended and linearly amplified in the presence of 1× KAPA HiFi HotStart Uracil+ ReadyMix (KAPA Biosystems, Wilmington, Mass., KK2802), and 0.91 µM anchor primer.
- the linear amplification reaction was carried out with the following thermal profile: (i) 95° C. for 5 minutes; (ii) 98° C. for 20 seconds, 62° C. for 30 seconds, 72° C. for 1 minute, 15 cycles and (iii) 72° C. for 5 minutes.
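Because only the anchor primer is present in this step, newly made strands cannot themselves be primed, so product accumulates linearly with cycle number rather than doubling. A rough comparison under idealized assumptions (a fixed per-cycle efficiency and no reagent depletion; the function names are hypothetical):

```python
def linear_products(n_cycles, efficiency=1.0):
    """New strands per starting template with a single primer (linear amplification)."""
    return n_cycles * efficiency


def exponential_fold(n_cycles, efficiency=1.0):
    """Fold increase per starting template with a primer pair (exponential PCR)."""
    return (1.0 + efficiency) ** n_cycles


cycles = 15  # cycle count stated for the linear amplification
print(f"linear, {cycles} cycles: up to {linear_products(cycles):.0f} copies per template")
print(f"exponential, {cycles} cycles: up to {exponential_fold(cycles):.0f}-fold")
```

The modest yield of the linear step is consistent with the later PCR enrichment being used to bring the library up to working amounts.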
- the second ligation reaction was performed in a 20 µl reaction volume containing 10 µl of purified DNA products, 1× T4 DNA ligase buffer, 10% PEG8000, 1 µM MDA1 adapter and 20 U/µl T4 DNA ligase (NEB, M0202L). The reaction was incubated at 20° C. for 30 minutes followed by heating at 65° C. for 20 minutes and held at 4° C.
- PCR enrichment of ligated product was performed in a 50 µl reaction containing 20 µl of the above-mentioned DNA product, 1× KAPA HiFi buffer, dNTP, 1 µM primer F and primer R, and 1 U/µl KAPA HiFi polymerase.
- the PCR program was as follows: (i) 95° C. for 5 minutes; (ii) 98° C. for 20 seconds, 60° C. for 30 seconds, 72° C. for 1 minute, 8 cycles and (iii) 72° C. for 10 minutes.
- the PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, A63881) and eluted in 18 µl of EB (10 mM Tris-HCl, pH 8.0).
- FIG. 11 illustrates an example plot of a capillary electrophoretic analysis, showing size distribution of library fragments after indexing PCR. All libraries were loaded on HT DNA High Sensitivity LabChip Kit (Perkin Elmer).
- a tailing step is performed using TdT with appropriate dNTP(s) to create a homopolymer or near-homopolymer tail to the 3′ end of ssDNA fragments.
- the homopolymer anneals to the 3′ overhang of an adapter containing a 5′ phosphate group in the top strand.
- the ligation reaction catalyzed by ligase seals the 3′ end of the ssDNA fragment to prevent excessive tailing.
- the bottom strand of the adapter is competed out by the anchor primer, exposing the initiating sites for a linear amplification process.
- the amplified ssDNA strands serve as substrate for the second round of ligation, where splint oligonucleotides are used to create short stretches of dsDNA that allow subsequent ligation of adapters using standard dsDNA ligation with T4 DNA ligase.
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNPCT/CN2018/081748 | 2018-04-03 | ||
PCT/CN2018/081748 WO2019191900A1 (en) | 2018-04-03 | 2018-04-03 | Compositions and methods for preparing nucleic acid libraries |
PCT/CN2019/081059 WO2019192489A1 (en) | 2018-04-03 | 2019-04-02 | Compositions and methods for preparing nucleic acid libraries |
Related Parent Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2018/081748 Continuation WO2019191900A1 (en) | 2018-04-03 | 2018-04-03 | Compositions and methods for preparing nucleic acid libraries |
PCT/CN2019/081059 A-371-Of-International WO2019192489A1 (en) | 2018-04-03 | 2019-04-02 | Compositions and methods for preparing nucleic acid libraries |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/225,082 Continuation US20210254051A1 (en) | 2018-04-03 | 2021-04-07 | Compositions and methods for preparing nucleic acid libraries |
Publications (1)
Publication Number | Publication Date |
---|---|
US20210040475A1 (en) | 2021-02-11 |
Family
ID=68099745
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/044,723 Pending US20210040475A1 (en) | 2018-04-03 | 2019-04-02 | Compositions and methods for preparing nucleic acid libraries |
US17/225,082 Pending US20210254051A1 (en) | 2018-04-03 | 2021-04-07 | Compositions and methods for preparing nucleic acid libraries |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/225,082 Pending US20210254051A1 (en) | 2018-04-03 | 2021-04-07 | Compositions and methods for preparing nucleic acid libraries |
Country Status (9)
Country | Link |
---|---|
US (2) | US20210040475A1 (fr) |
EP (1) | EP3740604A4 (fr) |
JP (1) | JP2021517556A (fr) |
CN (2) | CN110892097A (fr) |
AU (1) | AU2019248276A1 (fr) |
BR (1) | BR112020020207A2 (fr) |
CA (1) | CA3095837A1 (fr) |
SG (1) | SG11202009774XA (fr) |
WO (2) | WO2019191900A1 (fr) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111455469B (en) * | 2020-04-07 | 2023-08-18 | 深圳易倍科华生物科技有限公司 | Single-strand rapid library construction method and library construction instrument |
WO2022103857A1 (en) * | 2020-11-10 | 2022-05-19 | The United States Of America, As Represented By The Secretary, Department Of Health And Human Services | Single-cell profiling of chromatin occupancy and RNA sequencing |
CN112538657B (en) * | 2020-12-25 | 2021-08-17 | 北京吉因加医学检验实验室有限公司 | Cerebrospinal fluid gene sequencing library construction and detection method and application thereof |
CN113564226A (en) * | 2021-07-26 | 2021-10-29 | 深圳泰莱生物科技有限公司 | Detection method for capturing cfDNA 5mC fragments |
WO2023193456A1 (en) * | 2022-04-07 | 2023-10-12 | 广州燃石医学检验所有限公司 | Microbial composition, preparation method therefor and use thereof |
CN114736951A (en) * | 2022-04-20 | 2022-07-12 | 深圳大学 | High-throughput sequencing library construction method for small RNA |
CN116287124A (en) * | 2023-05-24 | 2023-06-23 | 中国农业科学院农业基因组研究所 | Single-stranded adapter pre-ligation method, library construction method for high-throughput sequencing libraries, and kit |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150087027A1 (en) * | 2012-03-13 | 2015-03-26 | Swift Biosciences, Inc. | Methods and Compositions for Size-Controlled Homopolymer Tailing of Substrate Polynucleotides by a Nucleic Acid Polymerase |
US20160251700A1 (en) * | 2014-02-24 | 2016-09-01 | Tobias William Barr Ost | Nucleic acid sample preparation |
US20170145492A1 (en) * | 2012-01-31 | 2017-05-25 | Pacific Biosciences Of California, Inc. | Compositions and methods for selection of nucleic acids |
Family Cites Families (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3578697B1 (en) * | 2012-01-26 | 2024-03-06 | Tecan Genomics, Inc. | Compositions and methods for targeted nucleic acid sequence enrichment and high efficiency library generation |
CA2900481A1 (en) * | 2013-02-08 | 2014-08-14 | 10X Genomics, Inc. | Polynucleotide barcode generation |
WO2014143157A1 (en) * | 2013-03-13 | 2014-09-18 | The Board Institute, Inc. | Compositions and methods for long insert, paired-end libraries of nucleic acids in emulsion droplets |
EP3114231B1 (en) * | 2014-03-03 | 2019-01-02 | Swift Biosciences, Inc. | Enhanced adaptor ligation |
CN104264231B (en) * | 2014-09-30 | 2017-04-19 | 天津华大基因科技有限公司 | Method for constructing a sequencing library and application thereof |
CN106192021B (en) * | 2016-08-02 | 2017-04-26 | 中国海洋大学 | Method for constructing a tandem RAD tag sequencing library |
CN106497920A (en) * | 2016-11-21 | 2017-03-15 | 深圳华大基因研究院 | Library construction method and kit for detection of gene mutations in non-small cell lung cancer |
EP3545106B1 (en) * | 2017-08-01 | 2022-01-19 | Helitec Limited | Methods of enriching and determining target nucleotide sequences |
-
2018
- 2018-04-03 WO PCT/CN2018/081748 patent/WO2019191900A1/fr active Application Filing
-
2019
- 2019-04-02 SG SG11202009774XA patent/SG11202009774XA/en unknown
- 2019-04-02 CN CN201980002533.4A patent/CN110892097A/zh active Pending
- 2019-04-02 US US17/044,723 patent/US20210040475A1/en active Pending
- 2019-04-02 WO PCT/CN2019/081059 patent/WO2019192489A1/fr unknown
- 2019-04-02 CA CA3095837A patent/CA3095837A1/fr active Pending
- 2019-04-02 AU AU2019248276A patent/AU2019248276A1/en active Pending
- 2019-04-02 CN CN202110396910.6A patent/CN113106145A/zh active Pending
- 2019-04-02 JP JP2019566740A patent/JP2021517556A/ja active Pending
- 2019-04-02 BR BR112020020207-0A patent/BR112020020207A2/pt unknown
- 2019-04-02 EP EP19769980.4A patent/EP3740604A4/fr active Pending
-
2021
- 2021-04-07 US US17/225,082 patent/US20210254051A1/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170145492A1 (en) * | 2012-01-31 | 2017-05-25 | Pacific Biosciences Of California, Inc. | Compositions and methods for selection of nucleic acids |
US20150087027A1 (en) * | 2012-03-13 | 2015-03-26 | Swift Biosciences, Inc. | Methods and Compositions for Size-Controlled Homopolymer Tailing of Substrate Polynucleotides by a Nucleic Acid Polymerase |
US20180223321A1 (en) * | 2012-03-13 | 2018-08-09 | Swift Biosciences, Inc. | Methods and compositions for size-controlled homopolymer tailing of substrate polynucleotides by a nucleic acid polymerase |
US20160251700A1 (en) * | 2014-02-24 | 2016-09-01 | Tobias William Barr Ost | Nucleic acid sample preparation |
Also Published As
Publication number | Publication date |
---|---|
US20210254051A1 (en) | 2021-08-19 |
BR112020020207A2 (pt) | 2021-01-19 |
WO2019192489A1 (fr) | 2019-10-10 |
CA3095837A1 (fr) | 2019-10-10 |
AU2019248276A1 (en) | 2020-10-22 |
EP3740604A1 (fr) | 2020-11-25 |
EP3740604A4 (fr) | 2021-12-29 |
CN110892097A (zh) | 2020-03-17 |
JP2021517556A (ja) | 2021-07-26 |
SG11202009774XA (en) | 2020-10-29 |
CN113106145A (zh) | 2021-07-13 |
WO2019191900A1 (fr) | 2019-10-10 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US20210254051A1 (en) | Compositions and methods for preparing nucleic acid libraries | |
JP7008407B2 (en) | Methods for identifying and counting nucleic acid sequence, expression, copy, or DNA methylation changes using combined nuclease, ligase, polymerase, and sequencing reactions | |
JP6966052B2 (en) | Compositions and methods for detecting rare sequence variants | |
US20210254134A1 (en) | Methods and compositions for forming ligation products | |
JP6435334B2 (en) | Compositions and methods for detecting rare sequence variants | |
US20180363039A1 (en) | Methods and compositions for forming ligation products | |
JP7240337B2 (en) | Library preparation methods and compositions and uses therefor | |
US10160998B2 (en) | PCR primers containing cleavable nucleotides | |
US20130123117A1 (en) | Capture probe and assay for analysis of fragmented nucleic acids | |
US20230374574A1 (en) | Compositions and methods for highly sensitive detection of target sequences in multiplex reactions | |
CN114450420A (en) | Compositions and methods for oncology precision assays | |
JP2024060054A (en) | Methods for identifying and counting nucleic acid sequence, expression, copy, or DNA methylation changes using combined nuclease, ligase, polymerase, and sequencing reactions | |
US20210292750A1 (en) | Methods and composition for targeted genomic analysis | |
Legal Events
Code | Title | Description |
---|---|---|
STPP | Information on status: patent application and granting procedure in general | Free format text: APPLICATION UNDERGOING PREEXAM PROCESSING |
AS | Assignment | Owner name: GUANGZHOU BURNING ROCK DX CO., LTD, CHINA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHANG, ZHIHONG;ZHENG, TAO;LI, BINGSI;AND OTHERS;REEL/FRAME:054042/0525 Effective date: 20200930 |
AS | Assignment | Owner name: GUANGZHOU BURNING ROCK DX CO., LTD., CHINA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:ZHANG, ZHIHONG;ZHENG, TAO;LI, BINGSI;AND OTHERS;SIGNING DATES FROM 20201021 TO 20201023;REEL/FRAME:054186/0735 |
STPP | Information on status: patent application and granting procedure in general | Free format text: APPLICATION DISPATCHED FROM PREEXAM, NOT YET DOCKETED |
STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |