WO2020023744A1 - Multiple sequencing using a single flow cell - Google Patents
- Publication number
- WO2020023744A1 (application PCT/US2019/043434)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequencing
- nucleic acid
- libraries
- acid molecules
- generate
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2527/00—Reactions demanding special reaction conditions
- C12Q2527/146—Concentration of target or template
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2535/00—Reactions characterised by the assay type for determining the identity of a nucleotide base or a sequence of oligonucleotides
- C12Q2535/122—Massive parallel sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2537/00—Reactions characterised by the reaction format or use of a specific feature
- C12Q2537/10—Reactions characterised by the reaction format or use of a specific feature the purpose or use of
- C12Q2537/159—Reduction of complexity, e.g. amplification of subsets, removing duplicated genomic regions
Definitions
- nucleic acid sequencing may remain costly. Recognizing a need for efficient and/or high-throughput whole genome sequencing approaches, the present disclosure provides methods and systems for nucleic acid sequencing. Such systems and methods may use a single flow cell to perform unbiased and/or biased sequencing to generate libraries of nucleic acid molecules.
- a method of biasing specific regions of the genome may be employed in order to enhance the confidence in the sequencing output for areas with relatively greater importance toward assessment or management (e.g., diagnosis, prognosis, treatment selection, treatment monitoring, monitoring for recurrence) of certain diseases while reducing the cost of sequencing on a per sample basis.
- this increased bias may also reduce the complexity of the sequenced sample, which may lead to difficulties for the sequencer in calling individual bases of the genome.
- a smaller, well-characterized control genome of the Phi X 174 bacteriophage may be run as a small percentage of total reads available along with the biased samples of interest in order to increase overall complexity of the sequencing run.
- the present disclosure provides a method whereby a user may recover this lost sequencing capacity while maintaining the sequencing complexity required for optimal sequencing run quality.
- An unbiased sample(s) sequenced along with a biased sample(s) may allow the user to make use of the capacity typically lost to the processing of the control genome. This may provide users the ability to run multiple assays in parallel, thus improving sequencing efficiency and/or throughput, thereby saving on overall sequencing cost and time, while still maintaining the sample complexity required for a successful sequencing run.
- sequencers may place constraints on what assays may be economically run on those sequencing instruments. For example, a model designed for higher sequencing output may be too costly to run for biased sequencing applications without multiplexing a large number of specimens in a single run, yet can meaningfully decrease the cost per base for unbiased sequencing runs. The ability to combine both biased and unbiased specimens into a single sequencing run may make the use of higher output instruments more versatile, as they can then be used across a broader spectrum of applications with a reduced run cost per specimen.
- An aspect of the present disclosure provides a method for increasing complexity of a sample for sequencing, the method comprising: providing a first nucleic acid sample having a first degree of complexity that differs from a desired degree of complexity; providing a second nucleic acid sample having a second degree of complexity that differs from the first degree of complexity and that differs from the desired degree of complexity; pooling at least a portion of the first nucleic acid sample and at least a portion of the second nucleic acid sample, thereby generating a pooled nucleic acid sample having the desired degree of complexity; and sequencing at least a portion of the pooled nucleic acid sample.
- the sequencing comprises whole genome sequencing (WGS). In some embodiments, the sequencing comprises massively parallel sequencing. In some embodiments, the sequencing comprises sequencing on a sequencing platform that comprises an output of at least about 1 billion reads per flow cell. In some embodiments, the sequencing comprises sequencing on a sequencing platform that comprises an output of at least about 1.5 billion reads per flow cell. In some embodiments, the sequencing comprises sequencing on a sequencing platform that comprises an output of at least about 2 billion reads per flow cell.
- Another aspect of the present disclosure provides a method for sequencing nucleic acid molecules, comprising: processing a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing an unbiased sequencing; processing a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a biased sequencing; pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; and using a single flow cell of a sequencing platform, sequencing the pooled plurality of libraries to generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules and a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- the unbiased sequencing comprises whole genome sequencing (WGS). In some embodiments, the unbiased sequencing is performed at a depth of no more than about 0.1X, no more than about 0.5X, no more than about 1X, no more than about 2X, no more than about 3X, no more than about 4X, no more than about 5X, no more than about 6X, no more than about 7X, no more than about 8X, no more than about 9X, no more than about 10X, no more than about 12X, no more than about 14X, no more than about 16X, no more than about 18X, no more than about 20X, no more than about 22X, no more than about 24X, no more than about 26X, no more than about 28X, or no more than about 30X.
- the biased sequencing comprises targeted sequencing of a target capture panel comprising a plurality of genetic loci.
- the targeted sequencing comprises targeted methyl-seq.
- the unbiased sequencing comprises methylation sequencing.
- the methylation sequencing comprises bisulfite sequencing, whole genome bisulfite sequencing (WGBS), APOBEC-seq, methyl-CpG-binding domain (MBD) protein capture, methyl-DNA immunoprecipitation (MeDIP), methylation sensitive restriction enzyme sequencing (MSRE/MRE-Seq or Methyl-Seq), oxidative bisulfite sequencing (oxBS-Seq), reduced representation bisulfite sequencing (RRBS), or Tet-assisted bisulfite sequencing (TAB-Seq).
- generating the second plurality of sequencing reads comprises using at least a portion of the first plurality of libraries as control libraries.
- the method further comprises pooling a third plurality of libraries to generate the pooled plurality of libraries, wherein the third plurality of libraries comprises control libraries for generating the first plurality of sequencing reads or the second plurality of sequencing reads.
- the first plurality of nucleic acid molecules and the second plurality of nucleic acid molecules comprise DNA molecules.
- the first plurality of nucleic acid molecules and the second plurality of nucleic acid molecules comprise RNA molecules.
- the sequencing platform is an Illumina™ sequencer.
- Another aspect of the present disclosure provides a method for sequencing nucleic acid molecules, comprising: processing a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing a first biased sequencing; processing a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a second biased sequencing; pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; and using a single flow cell of a sequencing platform, sequencing the pooled plurality of libraries to generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules and a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- the first biased sequencing comprises targeted sequencing of a first target capture panel comprising a first plurality of genetic loci
- the second biased sequencing comprises targeted sequencing of a second target capture panel comprising a second plurality of genetic loci
- Another aspect of the present disclosure provides a method for sequencing nucleic acid molecules, comprising: processing a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing a first unbiased sequencing; processing a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a second unbiased sequencing; pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; and using a single flow cell of a sequencing platform, sequencing the pooled plurality of libraries to generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules and a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- the first unbiased sequencing comprises whole genome sequencing (WGS)
- the second unbiased sequencing comprises methylation sequencing.
- the methylation sequencing comprises bisulfite sequencing, whole genome bisulfite sequencing (WGBS), APOBEC-seq, methyl-CpG-binding domain (MBD) protein capture, methyl-DNA immunoprecipitation (MeDIP), methylation sensitive restriction enzyme sequencing (MSRE/MRE-Seq or Methyl-Seq), oxidative bisulfite sequencing (oxBS-Seq), reduced representation bisulfite sequencing (RRBS), or Tet-assisted bisulfite sequencing (TAB-Seq).
- the unbiased sequencing is performed at a depth of no more than about 0.1X, no more than about 0.5X, no more than about 1X, no more than about 2X, no more than about 3X, no more than about 4X, no more than about 5X, no more than about 6X, no more than about 7X, no more than about 8X, no more than about 9X, no more than about 10X, no more than about 12X, no more than about 14X, no more than about 16X, no more than about 18X, no more than about 20X, no more than about 22X, no more than about 24X, no more than about 26X, no more than about 28X, or no more than about 30X.
- the nucleic acid molecules are extracted from a sample.
- the sample is a biological sample.
- the first plurality of nucleic acid molecules and the second plurality of nucleic acid molecules are generated from a same initial biological sample.
- Another aspect of the present disclosure provides a system for sequencing nucleic acid molecules, comprising: a controller comprising one or more computer processors; and a support operatively coupled to the controller; wherein the one or more computer processors are individually or collectively programmed to: direct the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing an unbiased sequencing; direct the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a biased sequencing; direct the pooling of the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; generate, from the pooled plurality of libraries, a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules; and generate, from the pooled plurality of libraries, a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- the unbiased sequencing comprises whole genome sequencing (WGS) or methylation sequencing.
- the methylation sequencing comprises bisulfite sequencing, whole genome bisulfite sequencing (WGBS), APOBEC-seq, methyl-CpG-binding domain (MBD) protein capture, methyl-DNA immunoprecipitation (MeDIP), methylation sensitive restriction enzyme sequencing (MSRE/MRE-Seq or Methyl-Seq), oxidative bisulfite sequencing (oxBS-Seq), reduced representation bisulfite sequencing (RRBS), or Tet-assisted bisulfite sequencing (TAB-Seq).
- the biased sequencing comprises targeted sequencing of a target capture panel comprising a plurality of genetic loci.
- the targeted sequencing comprises targeted methyl-seq.
- Another aspect of the present disclosure provides a system for sequencing nucleic acid molecules, comprising: a controller comprising one or more computer processors; and a support operatively coupled to the controller; wherein the one or more computer processors are individually or collectively programmed to: direct the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing a first biased sequencing; direct the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a second biased sequencing; direct the pooling of the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; generate, from the pooled plurality of libraries, a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules; and generate, from the pooled plurality of libraries, a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- the first biased sequencing comprises targeted sequencing of a first target capture panel comprising a first plurality of genetic loci.
- Another aspect of the present disclosure provides a system for sequencing nucleic acid molecules, comprising: a controller comprising one or more computer processors; and a support operatively coupled to the controller; wherein the one or more computer processors are individually or collectively programmed to: direct the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing a first unbiased sequencing; direct the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a second unbiased sequencing; direct the pooling of the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; generate, from the pooled plurality of libraries, a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules; and generate, from the pooled plurality of libraries, a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- the first unbiased sequencing or the second unbiased sequencing comprises whole genome sequencing (WGS) or methylation sequencing.
- the methylation sequencing comprises bisulfite sequencing, whole genome bisulfite sequencing (WGBS), APOBEC-seq, methyl-CpG-binding domain (MBD) protein capture, methyl-DNA immunoprecipitation (MeDIP), methylation sensitive restriction enzyme sequencing (MSRE/MRE-Seq or Methyl-Seq), oxidative bisulfite sequencing (oxBS-Seq), reduced representation bisulfite sequencing (RRBS), or Tet-assisted bisulfite sequencing (TAB-Seq).
- Another aspect of the present disclosure provides a non-transitory computer-readable medium comprising machine-executable code that, upon execution by a computer processor, implements a method for sequencing nucleic acid molecules, the method comprising: directing the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing an unbiased sequencing; directing the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a biased sequencing; directing the pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; generating, from the pooled plurality of libraries, a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules; and generating, from the pooled plurality of libraries, a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- Another aspect of the present disclosure provides a non-transitory computer-readable medium comprising machine-executable code that, upon execution by a computer processor, implements a method for sequencing nucleic acid molecules, the method comprising: directing the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing a first biased sequencing; directing the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a second biased sequencing; directing the pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; generating, from the pooled plurality of libraries, a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules; and generating, from the pooled plurality of libraries, a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- Another aspect of the present disclosure provides a non-transitory computer-readable medium comprising machine-executable code that, upon execution by a computer processor, implements a method for sequencing nucleic acid molecules, the method comprising: directing the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing a first unbiased sequencing; directing the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a second unbiased sequencing; directing the pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries; generating, from the pooled plurality of libraries, a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules; and generating, from the pooled plurality of libraries, a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- FIG. 1 shows a computer system that is programmed or otherwise configured to implement methods or systems provided herein;
- FIG. 2 shows an example of a method of sequencing nucleic acid molecules using unbiased and biased sequencing, in accordance with disclosed embodiments;
- FIG. 3 shows an example of a method of sequencing nucleic acid molecules using biased sequencing, in accordance with disclosed embodiments;
- FIG. 4 shows an example of a method of sequencing nucleic acid molecules using unbiased sequencing, in accordance with disclosed embodiments;
- FIG. 5 shows an example of a method of sequencing nucleic acid molecules using biased and unbiased sequencing with a control library, in accordance with disclosed embodiments; and
- FIG. 6 shows an example of how sequencing reads obtained from nucleic acid molecules prepared for biased and/or unbiased sequencing may be correlated with the original nucleic acid molecules, in accordance with disclosed embodiments.
- nucleic acid generally refers to a molecule comprising nucleic acid subunits, or nucleotides.
- a nucleic acid may include nucleotides selected from adenosine (A), cytosine (C), guanine (G), thymine (T), and uracil (U), or variants thereof.
- a nucleotide generally includes a nucleoside and at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more phosphate (PO3) groups.
- a nucleotide may include a nucleobase, a five-carbon sugar (either ribose or deoxyribose), and phosphate groups.
- Ribonucleotides are nucleotides in which the sugar is ribose.
- Deoxyribonucleotides are nucleotides in which the sugar is deoxyribose.
- a nucleotide may be a nucleoside
- a nucleotide may be a deoxyribonucleoside polyphosphate, such as, e.g., a deoxyribonucleoside triphosphate (dNTP), which may be selected from deoxyadenosine triphosphate (dATP), deoxycytidine triphosphate (dCTP), deoxyguanosine triphosphate (dGTP), deoxyuridine triphosphate (dUTP), and deoxythymidine triphosphate (dTTP), including dNTPs that include detectable tags, such as luminescent tags or markers (e.g., fluorophores).
- a nucleotide may include any subunit that may be incorporated into a growing nucleic acid strand. Such subunit may be an A, C, G, T, or U, or any other subunit that is specific to complementary A, C, G, T, or U, or complementary to a purine (i.e., A or G, or variant thereof) or a pyrimidine (i.e., C, T or U, or variant thereof).
- a nucleic acid is deoxyribonucleic acid (DNA), ribonucleic acid (RNA), or derivatives or variants thereof.
- a nucleic acid may be single-stranded or double stranded.
- a nucleic acid molecule is circular.
- nucleic acid molecule generally refers to a polynucleotide that may have various lengths, such as either deoxyribonucleotides (DNA) or ribonucleotides (RNA), or analogs thereof.
- a nucleic acid molecule may have a length of at least about 10 bases, 20 bases, 30 bases, 40 bases, 50 bases, 100 bases, 200 bases, 300 bases, 400 bases, 500 bases, 1 kilobase (kb), 2 kb, 3 kb, 4 kb, 5 kb, 10 kb, 50 kb, or more.
- An oligonucleotide is typically composed of a specific sequence of four nucleotide bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine (T) when the polynucleotide is RNA).
- oligonucleotide sequence is the alphabetical representation of a polynucleotide molecule; alternatively, the term may be applied to the polynucleotide molecule itself.
- This alphabetical representation may be input into databases in a computer having a central processing unit and used for bioinformatics applications such as functional genomics and homology searching.
- Oligonucleotides may include nonstandard nucleotide(s), nucleotide analog(s), and/or modified nucleotides.
- sample generally refers to a biological sample.
- biological samples include nucleic acid molecules, amino acids, polypeptides, proteins, carbohydrates, fats, or viruses.
- the sample contains a target nucleic acid molecule.
- a biological sample is a nucleic acid sample including nucleic acid molecule(s).
- the biological sample is a nucleic acid sample including target nucleic acid molecule(s).
- the target nucleic acid molecules may be cell-free nucleic acid molecules, such as cell-free DNA or cell-free RNA.
- the target nucleic acid molecules may be derived from a variety of sources including human, mammal, non-human mammal, ape, monkey, chimpanzee, reptilian, amphibian, or avian sources. Further, samples may be extracted from a variety of animal fluids containing cell-free sequences, including but not limited to blood, serum, plasma, vitreous, sputum, urine, tears, perspiration, saliva, semen, mucosal excretions, mucus, spinal fluid, amniotic fluid, lymph fluid, and the like. Cell-free polynucleotides may be fetal in origin (via fluid taken from a pregnant subject), or may be derived from tissue of the subject itself.
- the term “subject,” as used herein, generally refers to an individual having a biological sample that is undergoing processing or analysis.
- a subject can be an animal or plant.
- the subject can be a mammal, such as a human, dog, cat, horse, pig or rodent.
- the subject can be a patient, e.g., have or be suspected of having a disease, such as one or more cancers, one or more infectious diseases, one or more genetic disorders, or one or more tumors, or any combination thereof.
- the tumors may be of one or more types.
- amplification generally refers to generating copies of a nucleic acid.
- amplification of DNA generally refers to generating copies of a DNA molecule.
- amplification of a nucleic acid may be linear, exponential, or a combination thereof.
- Amplification may be emulsion based or may be non-emulsion based.
- nucleic acid amplification methods include reverse transcription, primer extension, polymerase chain reaction (PCR), ligase chain reaction (LCR), helicase-dependent amplification, asymmetric amplification, rolling circle amplification, and multiple displacement amplification (MDA).
- any form of PCR may be used, with non-limiting examples that include real-time PCR, allele-specific PCR, assembly PCR, asymmetric PCR, digital PCR, emulsion PCR, dial-out PCR, helicase-dependent PCR, nested PCR, hot start PCR, inverse PCR, methylation-specific PCR, mini-primer PCR, multiplex PCR, overlap-extension PCR, thermal asymmetric interlaced PCR, and touchdown PCR.
- amplification may be conducted in a reaction mixture comprising various components (e.g., a primer(s), template, nucleotides, a polymerase, buffer components, co-factors, etc.) that participate in or facilitate amplification.
- the reaction mixture comprises a buffer.
- buffers include magnesium-ion buffers, manganese-ion buffers and iso-citrate buffers. Additional examples of such buffers are also described in Tabor, S. et al. C.C. PNAS, 1989, 86, 4076-4080 and U.S. Patent Nos. 5,409,811 and 5,674,716, each of which is herein incorporated by reference in its entirety.
- Sequencing generally refers to generating or identifying the sequence of nucleic acid molecules. Sequencing may be single-molecule sequencing or sequencing by synthesis. Sequencing may be massively parallel array sequencing (e.g., Illumina™ sequencing), which may be performed using template nucleic acid molecules immobilized on a support, such as a flow cell. For example, sequencing may comprise a first-generation sequencing method, such as Maxam-Gilbert or Sanger sequencing, or a high-throughput sequencing (e.g., next-generation sequencing or NGS) method. A high-throughput sequencing method may sequence simultaneously (or substantially simultaneously) at least about 10,000, 100,000, 1 million, 10 million, 100 million, 1 billion, or more polynucleotide molecules.
- Sequencing methods may include, but are not limited to: pyrosequencing, sequencing-by-synthesis, single-molecule sequencing, nanopore sequencing, semiconductor sequencing, sequencing-by-ligation, sequencing-by-hybridization, Digital Gene Expression (Helicos), massively parallel sequencing, e.g., Helicos, Clonal Single Molecule Array (Solexa/Illumina), sequencing using PacBio, SOLiD, Ion Torrent, or Nanopore platforms.
- the term “support,” as used herein, generally refers to a solid support such as a slide, a bead, a resin, a chip, an array, a matrix, a membrane, a nanopore, or a gel.
- the solid support may, for example, be a bead on a flat substrate (such as glass, plastic, silicon, etc.) or a bead within a well of a substrate.
- the substrate may have surface properties, such as textures, patterns, microstructure coatings, surfactants, or any combination thereof to retain the bead at a desired location (such as in a position to be in operative communication with a detector).
- the detector of bead-based supports may be configured to maintain substantially the same read rate independent of the size of the bead.
- the support may be a flow cell or an open substrate.
- the support may comprise a biological support, a non-biological support, an organic support, an inorganic support, or any combination thereof.
- the support may be in optical communication with the detector, may be physically in contact with the detector, may be separated from the detector by a distance, or any combination thereof.
- the support may have a plurality of independently addressable locations.
- the nucleic acid molecules may be immobilized to the support at a given independently addressable location of the plurality of independently addressable locations. Immobilization of each of the plurality of nucleic acid molecules to the support may be aided by the use of an adaptor.
- the support may be optically coupled to the detector. Immobilization on the support may be aided by an adaptor.
- “flow cell” generally refers to a support which contains small fluidic channels through which substances may be pumped. Such substances may be polymerases, nucleic acid molecules and buffers. In some examples, the support may be functionalized. “Flow cell” may also generally refer to a vessel having a chamber where a reaction can be carried out, an inlet for delivering reagents to the chamber, and an outlet for removing reagents from the chamber.
- the chamber is configured for detection of the reaction that occurs in the chamber (e.g., on a surface that is in fluid contact with the chamber).
- the chamber can include one or more transparent surfaces allowing optical detection of arrays, optically labeled molecules, or the like, in the chamber.
- flow cells include, but are not limited to, those used in a nucleic acid sequencing apparatus, such as flow cells for the Genome Analyzer®, MiSeq®, NextSeq®, HiSeq®, or NovaSeq™ platforms commercialized by Illumina, Inc. (San Diego, CA); or for the SOLiD™ or Ion Torrent™ sequencing platform commercialized by Life Technologies (Carlsbad, CA).
- detector generally refers to a device, generally including optical and/or electronic components that can detect signals.
- Sequencing coverage generally describes the average number of reads that align to known reference bases. Sequencing coverage requirements may vary by application. In some examples, the depth of coverage may be about 0.1X, 0.5X, 1X, 2X, 3X, 4X, 5X, 6X, 7X, 8X,
- the depth of coverage may be about 10X, 15X, 20X, 25X, 30X, 35X, 40X, 45X, 50X, 60X, 70X, 80X, 90X, 100X, or more than about 100X.
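- As a rough illustration of the depth figures above, average coverage can be estimated as total sequenced bases divided by the size of the reference. The following Python sketch assumes that every read aligns to the reference and that reads have a uniform length; the function names and the example values (150-base reads, a reference of about 3.1 billion bases) are illustrative assumptions, not values taken from the present disclosure.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Estimate average depth as (reads x read length) / reference size.

    Assumes all reads align and all reads have the same length.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return num_reads * read_length / genome_size


def reads_for_coverage(target_depth: float, read_length: int, genome_size: int) -> int:
    """Number of reads needed to reach a target average depth (rounded up)."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(target_depth * genome_size / read_length)


if __name__ == "__main__":
    # Hypothetical values chosen only for illustration.
    genome_size = 3_100_000_000
    read_length = 150
    for depth in (1, 10, 30):
        needed = reads_for_coverage(depth, read_length, genome_size)
        print(f"{depth}X needs about {needed:,} reads")
```

- In practice the achieved depth may be lower than this estimate, because some reads fail to align or are duplicates.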
- targeted sequencing generally refers to the process of sequencing a subset of genes or regions of a genome.
- a plurality of nucleic acid molecules corresponding to a subset of genes or genomic regions may be isolated, enriched, and/or amplified prior to the sequencing.
- exomes, specific genes of interest, targets within genes, or mitochondrial DNA are sequenced.
- a plurality of nucleic acid molecules corresponding to the specific genes of interest, targets within genes, or mitochondrial DNA may be isolated, enriched, and/or amplified prior to the sequencing.
- target capture panel generally refers to panels which contain a select set of genes or genomic regions (e.g., genetic loci) known or suspected to have associations with certain diseases or phenotypes.
- the term “genetic loci,” as used herein, generally refers to locations on a chromosome or any regions of genomic nucleic acid molecules that are considered to be discrete genetic units for the purpose of formal linkage analysis or molecular genetic studies.
- the term “bisulfite sequencing,” as used herein, generally refers to a sequencing method that comprises the treatment of nucleic acid molecules with bisulfite (e.g., to selectively convert unmethylated cytosine residues of DNA molecules to uracil, while leaving methylated cytosine (5-methylcytosine) residues intact). Bisulfite sequencing may be used to detect methylation patterns in nucleic acid molecules (e.g., at a single-nucleotide resolution).
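- The conversion described above can be mimicked in silico. The following Python sketch is a simplified model under stated assumptions: it treats a single strand, represents methylation as a set of 0-based positions, and assumes complete conversion of unmethylated cytosines. After amplification, uracil is read as thymine, so the second helper compares a read against the reference to infer methylation. All function names are hypothetical.

```python
from typing import Dict, Iterable


def bisulfite_convert(sequence: str, methylated_positions: Iterable[int]) -> str:
    """Convert unmethylated C to U; methylated C (by 0-based index) stays C."""
    protected = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in protected:
            converted.append("U")
        else:
            converted.append(base)
    return "".join(converted)


def infer_methylation(reference: str, read: str) -> Dict[int, bool]:
    """For each reference C, report True if the read still shows C (methylated)."""
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    calls = {}
    for index, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base == "C":
            calls[index] = read_base == "C"
    return calls


if __name__ == "__main__":
    reference = "ACGTCC"
    treated = bisulfite_convert(reference, methylated_positions={1})
    as_read = treated.replace("U", "T")  # uracil is read as thymine after amplification
    print(treated, as_read, infer_methylation(reference, as_read))
```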
- control libraries generally refers to a library of nucleic acid molecules used to process a sample of nucleic acid molecules to generate a plurality of sequencing reads.
- the control libraries are generated using unbiased sequencing.
- the control libraries are generated using biased sequencing.
- polymerase generally refers to any enzyme capable of catalyzing a polymerization reaction.
- examples of polymerases include, without limitation, a nucleic acid polymerase.
- the polymerase can be naturally occurring or synthesized. In some cases, a polymerase has relatively high processivity.
- An example polymerase is a Φ29 polymerase or a derivative thereof.
- a polymerase can be a polymerization enzyme. In some cases, a transcriptase or a ligase is used (i.e., enzymes which catalyze the formation of a bond).
- examples of polymerases include a DNA polymerase, an RNA polymerase, a thermostable polymerase, a wild-type polymerase, a modified polymerase, E. coli DNA polymerase I, T7 DNA polymerase, bacteriophage T4 DNA polymerase, Φ29 (phi29) DNA polymerase, Taq polymerase, Tth polymerase, Tli polymerase, Pfu polymerase, Pwo polymerase, VENT polymerase, DEEPVENT polymerase, EX-Taq polymerase, LA-Taq polymerase, Sso polymerase, Poc polymerase, Pab polymerase, Mth polymerase, ES4 polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tca polymerase, Tih polymerase, Tfi polymerase, Platinum Taq polymerases, Tbr polymerase, Tfl polymerase, Pfuturbo polymerase
- the polymerase is a single subunit polymerase.
- the polymerase can have high processivity, namely the capability of the polymerase to consecutively incorporate nucleotides into a nucleic acid template without releasing the nucleic acid template.
- a polymerase is a polymerase modified to accept dideoxynucleotide triphosphates, such as, for example, Taq polymerase having a 667Y mutation.
- a polymerase is a polymerase having a modified nucleotide binding, which may be useful for nucleic acid sequencing, with non-limiting examples that include Thermo Sequenase polymerase (GE Life Sciences), AmpliTaq FS (ThermoFisher) polymerase and Sequencing Pol polymerase (Jena Bioscience).
- the polymerase is genetically engineered to have discrimination against dideoxynucleotides, such as, for example, Sequenase DNA polymerase (ThermoFisher).
- biased samples are primarily run on a sequencer, and if the bases in the initial cycles (e.g., the first five cycles) are too similar, such that the majority of the flow cell shows the same base, this can create conflict in identifying individual bases in that flow cell.
- a standard control such as phiX reference genome may be run along with a biased sample.
- the addition of the standard control may be used to break up the monotony on the flow cell. In this way, the added complexity may prevent the same base from occurring over a great amount of the flow cell and causing problems in determining the sequencing reads.
- a different base such as from the phiX genome may be added which breaks up the monotony in the imaging process during sequencing of a sample of interest. This, in turn, may allow the sequencing algorithm to continue working so as to generate the deep sequencing information around the targeted genomic region of interest.
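- The monotony described above can be made concrete by tabulating, for each of the initial cycles, the fraction of reads that show the most common base. The following Python sketch is illustrative only: the function name and the toy reads are assumptions, and no particular threshold for an acceptable fraction is implied by the present disclosure.

```python
from collections import Counter
from typing import List, Sequence


def dominant_base_fraction(reads: Sequence[str], n_cycles: int = 5) -> List[float]:
    """For each of the first n_cycles, return the share of the most common base."""
    fractions = []
    for cycle in range(n_cycles):
        # Count the base observed at this cycle across all reads long enough.
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        total = sum(counts.values())
        fractions.append(max(counts.values()) / total if total else 0.0)
    return fractions


if __name__ == "__main__":
    # Toy data: a biased library whose reads all start with the same bases,
    # and a more diverse library standing in for an unbiased sample.
    biased = ["ACGTA"] * 8
    diverse = ["CATGC", "GTCAT", "TGACG", "ACGTT", "CTAGA", "GACTC", "TCGAG", "AGTCA"]
    print("biased only:", dominant_base_fraction(biased))
    print("pooled     :", dominant_base_fraction(biased + diverse))
```

- In this toy pool, the dominant-base fraction of each early cycle drops once the diverse reads are added, which is the effect a standard control is used to achieve.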
- a possible disadvantage of the use of a phiX control is the loss of sequencing data that could otherwise have been generated from the flow cell space dedicated to the phiX control. While the use of a phiX control may work to increase complexity so as to ensure deep sequencing of particular regions of interest, the loss of real estate on the flow cell can decrease the efficiency of, and thereby increase the cost of, sequencing a particular sample and/or represent a diminished capacity of sequencing unbiased samples of interest.
- biased and unbiased libraries may be combined so as to generate a degree of complexity, while also providing the desired run depths of the samples.
- sequencer real estate devoted to the unbiased samples that are used to increase complexity may result in desirable sequencing results.
- desired complexity may be achieved so as to allow sequencing of biased samples to a desired depth, while also generating desirable sequencing results of unbiased samples.
- complexity may relate to a number of unique molecules within a sequencing library. In some examples, complexity may relate to a diversity of molecules within a sequencing library.
- regions that are conserved and more specifically the initial bases that are read, such as about 75 bases being read along that molecule, and the first 5 to 20 bases, may be highly conserved, such that a high number of clusters may be lost if they similarly light up to an imaging camera. For example, when too many molecules are lit up, then a camera that is imaging the sample may not be able to distinguish particular molecules within the sample, which may all appear the same to the camera.
- the amount of capacity needed to introduce diversity may be variable depending on the assay being run, and may also be dependent on the sample and how much conservation is present within the molecules being analyzed.
- more than one biased sample and/or more than one unbiased sample may be incorporated into the combined pool of samples.
- enough complexity may be generated within the flow cell so as to allow for a sequencer to complete its run successfully, while also obtaining a desired depth around both the biased and unbiased samples. In this way, not only is desired complexity accomplished, but data is able to be obtained from two or more types of sequencing libraries without the loss of real estate on the flow cell to negligible sequencing (e.g., sequencing of a control bacteriophage).
- the present disclosure provides methods for sequencing nucleic acid molecules by using pooled libraries of nucleic acid molecules.
- Library complexity may refer to the number of unique molecules in the library that are sampled by finite sequencing.
- particular methods that may be used prior to and during preparation of a sequencing library may reduce sample complexity. For example, sample complexity may be reduced by increasing duplicates.
- PCR and other biasing methods can reduce sample complexity.
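- The relationship between duplicates and complexity can be sketched with a simple count. The following Python example uses exact sequence identity as a stand-in for "the same molecule", which is a simplifying assumption; practical pipelines often identify duplicates by alignment position and/or molecular barcodes instead. The function name is hypothetical.

```python
from typing import Sequence, Tuple


def library_complexity(reads: Sequence[str]) -> Tuple[int, float]:
    """Return (number of unique reads, duplicate rate) for a set of sampled reads."""
    total = len(reads)
    unique = len(set(reads))
    duplicate_rate = 1 - unique / total if total else 0.0
    return unique, duplicate_rate


if __name__ == "__main__":
    # Toy data: the second library has the same read count but more duplicates.
    before_amplification = ["AAA", "CCC", "GGG", "TTT"]
    after_amplification = ["AAA", "AAA", "AAA", "CCC"]
    print(library_complexity(before_amplification))
    print(library_complexity(after_amplification))
```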
- each library of nucleic acid molecules may be processed for performing either unbiased or biased sequencing.
- an unbiased sequencing library may be generated using a whole genome approach. In some cases, an unbiased sequencing library may be generated using a shotgun sequencing approach. In some cases, an unbiased sequencing library may be generated by taking a human sample and preparing the DNA for sequencing independent of a particular targeted region of the genome.
- a biased sequencing library may be generated by specifically targeting particular regions in the genome. For example, in certain embodiments where additional sequencing depth is beneficial so as to increase confidence in assessing particular mutations (e.g., single nucleotide polymorphisms (SNPs), copy number variations (CNVs), insertions or deletions (indels), or fusions), a biased library may be generated.
- a biased library may be generated by first generating an unbiased library and then biasing the generated unbiased library using a targeted pull down.
- target-specific primers may pull down the region of interest and untargeted regions may be discarded, thereby generating a biased library.
- a biased library may be generated using an amplicon-based polymerase chain reaction (PCR) approach.
- a sample of interest may be taken and a PCR-based approach may be used for regions that are of interest, thereby generating a biased library.
- the libraries may be pooled together.
- When considering mass, it may be important to consider whether there are enough reads to cover both biased and unbiased samples.
- mass may be considered by normalizing samples to the same concentration, e.g., the same number of molecules. For example, given a number of biased samples having a same or similar concentration, a pool of the biased samples may be generated where the pool has a same or similar concentration as the individual biased samples. In addition to pooling samples with the same, or similar,
- samples may also be pooled so as to ensure sufficient reads of the samples.
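Normalizing libraries to the same number of molecules can be sketched as follows. The conversion assumes double-stranded DNA with an average mass of about 660 g/mol per base pair, which is the usual approximation; the function names and the example numbers are hypothetical and not taken from the disclosure.

```python
def molarity_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration (ng/uL) to nM, assuming ~660 g/mol per base pair."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_fragment_bp)


def dilution_volumes(stock_nM, target_nM, final_volume_ul):
    """Return (stock volume, diluent volume) in uL needed to reach target_nM in final_volume_ul."""
    if target_nM > stock_nM:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = final_volume_ul * target_nM / stock_nM
    return stock_ul, final_volume_ul - stock_ul


# Illustrative library: 2.64 ng/uL with a 400 bp mean fragment size is about 10 nM.
stock = molarity_nM(2.64, 400)
stock_ul, diluent_ul = dilution_volumes(10.0, 4.0, 50.0)
```

Once every library is at the same molarity, equal volumes contribute roughly equal numbers of molecules to the pool.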
- the percentage contributed from each library may be designed so as to ensure sufficient sequencing reads for each of the biased samples as well as each of the unbiased samples.
- the percentage of unbiased samples versus biased samples may be flexible depending on the application. In some biased targeted panel sets, a larger panel of biased samples may be provided such that more reads may need to be allocated to the biased samples.
- unbiased shotgun samples may be run at a lower depth, such that fewer reads are allocated to the unbiased samples.
- a small targeted biased panel may be provided so the percentage of reads allocated to the total sequencing run may only comprise as much as 10%, thereby leaving 90% available to use for a deeper unbiased approach.
- a percentage of contribution attributable to components of the pooled libraries may be adjustable.
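The adjustable split between biased and unbiased libraries can be sketched as a simple read budget. The 10%/90% split mirrors the example above; the total of 2,000 million reads is used only as an illustrative flow cell output, and the function and library names are hypothetical.

```python
def allocate_reads(total_reads, shares):
    """Split a flow cell's expected read output across libraries in proportion to their shares."""
    total_share = sum(shares.values())
    if total_share <= 0:
        raise ValueError("shares must sum to a positive value")
    return {name: round(total_reads * share / total_share) for name, share in shares.items()}


# A small targeted panel takes 10% of the run, leaving 90% for a deeper unbiased library.
budget = allocate_reads(2_000_000_000, {"targeted_panel": 10, "wgs": 90})
```

If the libraries have first been normalized to the same molarity, pooling them by volume in the same proportions is a reasonable first approximation of these read shares.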
- two or more fixed biased pools may be provided with two different panel sets, respectively.
- an unbiased sample may be run along the two fixed biased pools.
- the two fixed biased pools may be run together without the need of an unbiased pool.
- two fixed unbiased pools may be provided with two different panel sets, and with no additional contribution from a biased pool. In these ways, different applications can use pools combined at variable percentages based on the samples and the application in order to achieve the
- each library of nucleic acid molecules may be processed for performing the same type of sequencing as other libraries of nucleic acid molecules.
- each library of nucleic acid molecules may be processed for performing a different type of sequencing to at least one other library of nucleic acid molecules. This may address issues associated with the efficiency and cost of whole genome sequencing.
- Methods of the disclosure can comprise pooling two or more nucleic acid libraries. In some cases, at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, or more than 50 libraries can be pooled in order to achieve sufficient complexity on the flow cell, to maximize use of sequencing capacity, or a combination thereof.
- Non-limiting examples of libraries that can be pooled with the methods of the disclosure include WGS library, targeted library, methylation-Seq library, RNA-seq library, biased RNA library, and any combination thereof.
- a WGS library is pooled with a targeted library.
- a WGS library is pooled with a methylation-seq library.
- an RNA-seq library is pooled with a biased RNA library.
- a WGS library is pooled with an RNA-seq library.
- an RNA-seq library is pooled with a methyl-seq library.
Sequencing of Pooled Biased and Unbiased Libraries
- a method for sequencing nucleic acid molecules may comprise processing a first plurality of nucleic acid molecules. This may generate a first plurality of libraries for performing an unbiased sequencing.
- the method may comprise processing a second plurality of nucleic acid molecules. This may generate a second plurality of libraries for performing a biased sequencing.
- the method may comprise pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries.
- the method may use a single flow cell of a sequencing platform to sequence the pooled plurality of libraries. This may generate a first plurality of sequencing reads
- pooling the first and second pluralities of libraries may increase complexity of the pooled plurality of libraries relative to at least one of the first and second plurality of libraries. In some embodiments, pooling the first and second plurality of libraries may increase complexity of the pooled plurality of libraries relative to at least one of the first and second plurality of libraries by about 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%,
- the first and second pluralities of nucleic acid molecules may be sourced from a same sample. In some embodiments, the first and second pluralities of nucleic acid molecules may be sourced from samples from a same patient. In some embodiments, the first and second pluralities of nucleic acid molecules may be sourced from samples from patients from a same family. In some embodiments, the first and second pluralities of nucleic acid molecules may be sourced from samples from patients from a same race or ethnicity. In some embodiments, the first and second pluralities of nucleic acid molecules may be sourced from samples from patients from a same sex or gender.
- a portion of the sample may be processed into a first plurality of nucleic acid molecules within a biased library, and a second portion of the sample may be processed into a second plurality of nucleic acid molecules within an unbiased library.
- portions of the first and second pluralities of nucleic acid molecules may be combined on a sequencer and may be sequenced.
- a single unbiased library may be used as a control for the sequencing of each biased library.
- a plurality of biased libraries may be sequenced together along with a control unbiased library.
- a general sequencing control may be provided by generating a control from a known sample that has undergone the same or similar steps as the biased sample.
- a well-characterized control such as phiX may not be as beneficial in comparison, since the information gained from the known sample may also be well-characterized.
- pooled mixtures of unbiased and biased samples may be sequenced with controls for each sample such that an unbiased sample may be a control for a biased sample and/or a biased sample may be a control for an unbiased sample.
- the processing of the first plurality of nucleic acid molecules optionally involves the fragmentation of the nucleic acid molecules. In some cases, processing may not involve fragmentation, for example, for cell-free nucleic acids obtained from a subject. Fragmentation of the first plurality of nucleic acid molecules may be done by physical methods, enzymatic methods or chemical methods. Some examples of physical methods of fragmentation include, but are not limited to, acoustic shearing or sonication. Some examples of enzymatic methods include, but are not limited to, non-specific endonuclease cocktails or transposase tagmentation reactions.
- the processing of the first plurality of nucleic acid molecules involves the sizing of the fragments of the first plurality of nucleic acid molecules.
- Preferred sizes of fragments of the first plurality of nucleic acid molecules may be less than about 50 bases, less than about 100 bases, less than about 200 bases, less than about 400 bases, less than about 600 bases, less than about 800 bases, less than about 1000 bases, about 50 bases or more, about 100 bases or more, about 200 bases or more, about 400 bases or more, about 600 bases or more, about 800 bases or more, from about 10 bases to about 1000 bases, from about 20 bases to about 800 bases, from about 30 bases to about 600 bases, from about 40 bases to about 400 bases, from about 50 bases to about 200 bases, or from about 40 bases to about 100 bases.
- preferred sizes of fragments of the first plurality of nucleic acid molecules may also have base lengths that are on an order of 1,000 bases; 10,000 bases; 100,000 bases; 1,000,000 bases; or more than 1,000,000 bases.
- the first plurality of nucleic acid molecules is DNA.
- the processing of the first plurality of nucleic acid molecules may involve the blunting and phosphorylation of the 5’ end. Blunting and phosphorylation of the 5’ end may be accomplished using at least one enzyme. These enzymes may be T4 polynucleotide kinase, T4 DNA polymerase, or Klenow Large Fragment.
- the processing of the first plurality of nucleic acid molecules may involve the A-tailing of the 3’ end.
- the A-tailing of the 3’ end may use enzymes. These enzymes may be Taq polymerase or Klenow Fragment.
- the processing of the first plurality of nucleic acid molecules may involve multiplexing.
- the processing of the first plurality of nucleic acid molecules may involve tagmentation. Tagmentation may involve the use of a transposase enzyme to simultaneously fragment and tag nucleic acid molecules.
- the first plurality of nucleic acid molecules is RNA.
- the processing of the first plurality of nucleic acid molecules may involve ligation with a DNA adaptor.
- the DNA adaptor may be an adenylated DNA adaptor with a blocked 3’ end.
- the ligation may be done using truncated T4 RNA ligase 2.
- the processing of the first plurality of nucleic acid molecules may involve the addition of an adaptor. This adaptor may be a 5’ RNA adaptor.
- the processing of the first plurality of nucleic acid molecules may involve hybridization of a primer. This primer may be a reverse transcription primer.
- the processing of the first plurality of nucleic acid molecules may be based on complementary DNA (cDNA) synthesis.
- This synthesis may involve, but is not limited to, using random primers or oligo-dT primers or attaching adaptors.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, using primers to initiate the cDNA synthesis. This may then involve template switching where an adaptor sequence is added to the cDNA molecules.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, reduced amplification.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, reducing duplicate reads.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, using multiple combinations of indexed adaptors.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, mitigating batch effects.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, reducing variability in day-to-day sample processing. This may involve reducing day-to-day variability in reaction conditions, reagent batches, pipetting accuracy, and human error.
- the processing of the second plurality of nucleic acid molecules involves the fragmentation of the nucleic acid molecules. Fragmentation of the second plurality of nucleic acid molecules may be done by physical methods, enzymatic methods or chemical methods. Some examples of physical methods of fragmentation include, but are not limited to, acoustic shearing or sonication. Some examples of enzymatic methods include, but are not limited to, non-specific endonuclease cocktails or transposase tagmentation reactions. In some examples, the processing of the second plurality of nucleic acid molecules involves the sizing of the fragments of the second plurality of nucleic acid molecules.
- Preferred sizes of fragments of the second plurality of nucleic acid molecules may be less than about 50 bases, less than about 100 bases, less than about 200 bases, less than about 400 bases, less than about 600 bases, less than about 800 bases, less than about 1000 bases, about 50 bases or more, about 100 bases or more, about 200 bases or more, about 400 bases or more, about 600 bases or more, about 800 bases or more, from about 10 bases to about 1000 bases, from about 20 bases to about 800 bases, from about 30 bases to about 600 bases, from about 40 bases to about 400 bases, from about 50 bases to about 200 bases, or from about 40 bases to about 100 bases.
- the second plurality of nucleic acid molecules is DNA.
- the processing of the second plurality of nucleic acid molecules may involve the blunting and phosphorylation of the 5’ end. Blunting and phosphorylation of the 5’ end may be accomplished using at least one enzyme. These enzymes may be T4 polynucleotide kinase, T4 DNA polymerase, or Klenow Large Fragment.
- the processing of the second plurality of nucleic acid molecules may involve the A-tailing of the 3’ end.
- the A-tailing of the 3’ end may use enzymes. These enzymes may be Taq polymerase or Klenow Fragment.
- the processing of the second plurality of nucleic acid molecules may involve multiplexing.
- the processing of the second plurality of nucleic acid molecules may involve tagmentation. Tagmentation may involve the use of a transposase enzyme to simultaneously fragment and tag nucleic acid molecules.
- the second plurality of nucleic acid molecules is RNA.
- the processing of the second plurality of nucleic acid molecules may involve ligation with a DNA adaptor.
- the DNA adaptor may be an adenylated DNA adaptor with a blocked 3’ end.
- the ligation may be done using truncated T4 RNA ligase 2.
- the processing of the second plurality of nucleic acid molecules may involve the addition of an adaptor. This adaptor may be a 5’ RNA adaptor.
- the processing of the second plurality of nucleic acid molecules may involve hybridization of a primer. This primer may be a reverse transcription primer.
- the processing of the second plurality of nucleic acid molecules may be based on cDNA synthesis. This synthesis may involve, but is not limited to, using random primers or oligo-dT primers or attaching adaptors.
- the processing of the second plurality of nucleic acid molecules may involve, but is not limited to, using primers to initiate the cDNA synthesis. This may then involve template switching where an adaptor sequence is added to the cDNA molecules.
- the processing of the second plurality of nucleic acid molecules may involve, but is not limited to, increasing amplification.
- the processing of the second plurality of nucleic acid molecules may involve, but is not limited to, increasing duplicate reads.
- the processing of the second plurality of nucleic acid molecules may involve, but is not limited to, using minimal combinations of indexed adaptors.
- the processing of the second plurality of nucleic acid molecules may involve, but is not limited to, exaggerating batch effects.
- the processing of the second plurality of nucleic acid molecules may involve, but is not limited to, increasing variability in day-to-day sample processing. This may involve increasing day-to-day variability in reaction conditions, reagent batches, pipetting accuracy, and human error.
- the first plurality of libraries and the second plurality of libraries are pooled.
- a pooled plurality of libraries may be generated. Pooling may involve, but is not limited to, mixing.
- sequencing of the pooled plurality of libraries involves, but is not limited to, whole genome sequencing (WGS), de novo sequencing, mate pair sequencing, chromatin immunoprecipitation sequencing (ChIP-seq), RNA immunoprecipitation sequencing (RIP-seq), or crosslinking and immunoprecipitation sequencing (CLIP-seq).
- Sequencing may involve, but is not limited to, flow cell sequencing. Sequencing may involve, but is not limited to, patterned flow cell sequencing.
- Unbiased sequencing may comprise whole genome sequencing (WGS), de novo sequencing, mate pair sequencing, chromatin immunoprecipitation sequencing (ChIP-seq), RNA immunoprecipitation sequencing (RIP-seq), crosslinking and immunoprecipitation sequencing (CLIP-seq), and RNA sequencing (RNA-Seq).
- Unbiased sequencing may involve, but is not limited to, flow cell sequencing.
- Unbiased sequencing may involve, but is not limited to, patterned flow cell sequencing.
- Biased sequencing may comprise whole genome sequencing (WGS), de novo sequencing, mate pair sequencing, chromatin immunoprecipitation sequencing (ChIP-seq), RNA immunoprecipitation sequencing (RIP-seq), or crosslinking and immunoprecipitation sequencing (CLIP-seq).
- Biased sequencing may involve, but is not limited to, flow cell sequencing.
- Biased sequencing may involve, but is not limited to, patterned flow cell sequencing.
- the sequencing may be performed at a depth of no more than about 0.1X, no more than about 0.5X, no more than about 1X, no more than about 2X, no more than about 3X, no more than about 4X, no more than about 5X, no more than about 6X, no more than about 7X, no more than about 8X, no more than about 9X, no more than about 10X, no more than about 15X, no more than about 20X, no more than about 30X, no more than about 40X, no more than about 50X, no more than about 60X, no more than about 70X, no more than about 80X, no more than about 90X, no more than about 100X, no more than about 200X, no more than about 300X, no more than about 400X, no more than about 500X, no more than about 600X, no more than about 700X, no more than about 800X, no more than about 900X, no more than about 1000X, at least about 0.1X, at least about 0.5X, at least about 1X, at least about 2X, at least about 3X, at least about 4X, at least about 5X, at least about 6X, at least about 7X, at least about 8X, at least about 9X, at least about 10X, at least about 15X, at least about 20X, at least about 30X, at least about 40X, at least about 50X, at least about 60X, at least about 70X, at least about 80X, at least about 90X, at least about 100X, at least about 200X, at least about 300X, at least about 400X, at least about 500X, at least about 600X, at least about 700X, at least about 800X, at least about 900X, at least about 1000X, at least about 2000X, at least about 3000X, at least about 4000X, at least about 5000X, at least about 6000X, at least about 7000X, at least about 8000X, at least about 9000X, or at least about 10,000X.
- biased sequencing may be performed at a first depth
- unbiased sequencing may be performed at a second depth.
- the first depth may be the same or substantially similar to the second depth.
- the first depth may be greater than the second depth.
- the second depth may be greater than the first depth.
- sequencing of a first library may be performed at a first depth
- sequencing of a second library may be performed at a second depth.
- the first depth may be the same or substantially similar to the second depth.
- the first depth may be greater than the second depth.
- the second depth may be greater than the first depth.
- multiple libraries may be sequenced where one or more of the multiple libraries are sequenced at different depths.
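The relationship between allocated reads and sequencing depth can be sketched with the standard mean-coverage estimate, depth = reads x read length / target size. The function names and the example numbers are illustrative assumptions, and the estimate ignores duplicates, off-target reads, and uneven coverage.

```python
import math


def mean_depth(n_reads, read_length, target_size_bp):
    """Estimate mean depth (X) from read count, read length, and target size in bases."""
    return n_reads * read_length / target_size_bp


def reads_for_depth(depth, read_length, target_size_bp):
    """Estimate the number of reads needed to reach a desired mean depth over a target."""
    return math.ceil(depth * target_size_bp / read_length)


# A small panel reaches high depth with few reads; a whole genome needs many more for 30X.
panel_depth = mean_depth(1_000_000, 100, 1_000_000)
wgs_reads = reads_for_depth(30, 150, 3_000_000_000)
```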
- the biased sequencing may comprise targeted sequencing of a target capture panel comprising a plurality of genetic loci.
- the biased sequencing may comprise targeted methyl-seq.
- Target sequencing may comprise at least one of (i) hybridization capture approaches, (ii) microdroplet PCR droplet libraries, (iii) custom-designed droplet libraries, and (iv) amplicon sequencing.
- the unbiased sequencing may comprise bisulfite sequencing, whole genome bisulfite sequencing (WGBS), APOBEC-seq, methyl-CpG-binding domain (MBD) protein capture, methyl-DNA immunoprecipitation (MeDIP), methylation sensitive restriction enzyme sequencing (MSRE/MRE-Seq or Methyl-Seq), oxidative bisulfite sequencing (oxBS-Seq), reduced representation bisulfite sequencing (RRBS), Tet-assisted bisulfite sequencing (TAB-Seq), or similar.
- Treatment of nucleic acid molecules with sodium bisulfite may result in the chemical conversion of unmethylated cytosine to uracil while methylated cytosines may be protected.
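The conversion described above can be mimicked in silico with a minimal sketch. The function name and the representation of methylation as a set of 0-based positions are assumptions made for illustration; the sketch assumes complete conversion of unmethylated cytosines.

```python
def bisulfite_convert(sequence, methylated_positions=frozenset()):
    """Convert unmethylated C to U; cytosines at methylated (0-based) positions are protected."""
    converted = []
    for i, base in enumerate(sequence.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("U")  # unmethylated cytosine is deaminated to uracil
        else:
            converted.append(base)  # methylated cytosine and all other bases are unchanged
    return "".join(converted)


# The cytosine at position 1 is methylated and protected; the one at position 3 is converted.
converted = bisulfite_convert("ACGCT", {1})
```

After amplification, uracil is read as thymine, so comparing converted reads to the reference distinguishes methylated from unmethylated cytosines.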
- the method may further comprise generating the second plurality of sequencing reads.
- generating the second plurality of sequencing reads may comprise using at least a portion of the first plurality of libraries as control libraries.
- the method may further comprise pooling a third plurality of libraries to generate the pooled plurality of libraries.
- the third plurality of libraries may comprise control libraries for generating the first plurality of sequencing reads or the second plurality of sequencing reads.
- the first plurality of nucleic acid molecules and the second plurality of nucleic acid molecules comprise DNA molecules. In some examples, the first plurality of nucleic acid molecules and the second plurality of nucleic acid molecules comprise RNA molecules. In some examples, the first plurality of nucleic acid molecules and the second plurality of nucleic acid molecules comprise a combination of DNA and RNA molecules.
- Sequencing the nucleic acid can be performed using any suitable method, such as next-generation sequencing.
- sequencing the nucleic acid can be performed using chain termination sequencing, hybridization sequencing, Illumina sequencing, ion torrent semiconductor sequencing, mass spectrophotometry sequencing, massively parallel signature sequencing (MPSS), Maxam-Gilbert sequencing, nanopore sequencing, polony sequencing, pyrosequencing, shotgun sequencing, single molecule real time (SMRT) sequencing, SOLiD sequencing, universal sequencing, or any combination thereof.
- the sequencing can comprise digital PCR.
- the sequencing platform is an amplification sequencing, or any combination thereof.
- the sequencing platform comprises an output range of greater than, for example, about 2,000 million reads per flow cell. In some embodiments, the sequencing platform is a NovaSeq™.
- a method for sequencing nucleic acid molecules may comprise processing a first plurality of nucleic acid molecules. This may generate a first plurality of libraries for performing a first biased sequencing.
- the method may comprise processing a second plurality of nucleic acid molecules. This may generate a second plurality of libraries for performing a second biased sequencing.
- the method may comprise pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries.
- the method may use a single flow cell of a sequencing platform to sequence the pooled plurality of libraries. This may generate a first plurality of sequencing reads
- the processing of the first and second pluralities of nucleic acid molecules involves the fragmentation of the nucleic acid molecules. Fragmentation of the first and second plurality of nucleic acid molecules may be done by physical methods, enzymatic methods, or chemical methods. Some examples of physical methods of fragmentation include, but are not limited to, acoustic shearing or sonication. Some examples of enzymatic methods include, but are not limited to, non-specific endonuclease cocktails or transposase tagmentation reactions.
- the processing of the first and second pluralities of nucleic acid molecules involves the sizing of the fragments of the first plurality of nucleic acid molecules.
- Preferred sizes of fragments of the first plurality of nucleic acid molecules may be less than about 50 bases, less than about 100 bases, less than about 200 bases, less than about 400 bases, less than about 600 bases, less than about 800 bases, less than about 1000 bases, about 50 bases or more, about 100 bases or more, about 200 bases or more, about 400 bases or more, about 600 bases or more, about 800 bases or more, from about 10 bases to about 1000 bases, from about 20 bases to about 800 bases, from about 30 bases to about 600 bases, from about 40 bases to about 400 bases, from about 50 bases to about 200 bases, or from about 40 bases to about 100 bases.
- the first and second pluralities of nucleic acid molecules are DNA.
- the processing of the first and second pluralities of nucleic acid molecules may involve the blunting and phosphorylation of the 5’ end. Blunting and phosphorylation of the 5’ end may be accomplished using at least one enzyme. These enzymes may be T4 polynucleotide kinase, T4 DNA polymerase, or Klenow Large Fragment.
- the processing of the first and second pluralities of nucleic acid molecules may involve the A-tailing of the 3’ end.
- the A-tailing of the 3’ end may use enzymes. These enzymes may be Taq polymerase or Klenow Fragment.
- the processing of the first and second pluralities of nucleic acid molecules may involve multiplexing.
- the processing of the first and second pluralities of nucleic acid molecules may involve tagmentation. Tagmentation may involve the use of a transposase enzyme to simultaneously fragment and tag nucleic acid molecules.
- the first and second pluralities of nucleic acid molecules are RNA.
- the processing of the first and second pluralities of nucleic acid molecules may involve ligation with a DNA adaptor.
- the DNA adaptor may be an adenylated DNA adaptor with a blocked 3’ end.
- the ligation may be done using truncated T4 RNA ligase 2.
- the processing of the first and second pluralities of nucleic acid molecules may involve the addition of an adaptor.
- This adaptor may be a 5’ RNA adaptor.
- the processing of the first and second pluralities of nucleic acid molecules may involve hybridization of a primer. This primer may be a reverse transcription primer.
- the processing of the first and second pluralities of nucleic acid molecules may be based on cDNA synthesis. This synthesis may involve, but is not limited to, using random primers or oligo-dT primers or attaching adaptors.
- the processing of the first and second pluralities of nucleic acid molecules may involve, but is not limited to, using primers to initiate the cDNA synthesis. This may then involve template switching where an adaptor sequence is added to the cDNA molecules.
- the processing of the first and second pluralities of nucleic acid molecules may involve, but is not limited to, increasing amplification.
- the processing of the first and second pluralities of nucleic acid molecules may involve, but is not limited to, increasing duplicate reads.
- the processing of the first and second pluralities of nucleic acid molecules may involve, but is not limited to, using minimal combinations of indexed adaptors.
- the processing of the first and second pluralities of nucleic acid molecules may involve, but is not limited to, exaggerating batch effects.
- the processing of the first and second pluralities of nucleic acid molecules may involve, but is not limited to, increasing variability in day-to-day sample processing. This may involve increasing day-to-day variability in reaction conditions, reagent batches, pipetting accuracy, and human error.
- the first plurality of libraries and the second plurality of libraries are pooled.
- a pooled plurality of libraries may be generated. Pooling may involve, but is not limited to, mixing.
- sequencing of the pooled plurality of libraries involves, but is not limited to, whole genome sequencing (WGS), de novo sequencing, mate pair sequencing, chromatin immunoprecipitation sequencing (ChIP-seq), RNA immunoprecipitation sequencing (RIP-seq), or crosslinking and immunoprecipitation sequencing (CLIP-seq).
- Sequencing may involve, but is not limited to, flow cell sequencing. Sequencing may involve, but is not limited to, patterned flow cell sequencing.
- the first biased sequencing may comprise targeted sequencing of a first target capture panel comprising a first plurality of genetic loci.
- the first biased sequencing may comprise targeted methyl-seq.
- Target sequencing may comprise at least one of (i) hybridization capture approaches, (ii) microdroplet PCR droplet libraries, (iii) custom-designed droplet libraries, and (iv) amplicon sequencing.
- the second biased sequencing may comprise targeted sequencing of a second target capture panel comprising a second plurality of genetic loci.
- the second biased sequencing may comprise targeted methyl-seq.
- Target sequencing may comprise at least one of (i) hybridization capture approaches, (ii) microdroplet PCR droplet libraries, (iii) custom-designed droplet libraries, and (iv) amplicon sequencing.
Sequencing of Pooled Distinct Unbiased Libraries
- a method for sequencing nucleic acid molecules may comprise processing a first plurality of nucleic acid molecules. This may generate a first plurality of libraries for performing a first unbiased sequencing.
- the method may comprise processing a second plurality of nucleic acid molecules. This may generate a second plurality of libraries for performing a second unbiased sequencing.
- the method may comprise pooling the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries.
- the method may use a single flow cell of a sequencing platform to sequence the pooled plurality of libraries. This may generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules and a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
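One way the reads from a single flow cell could be assigned back to the first and second pluralities is by index sequence, sketched below. This assumes each library was prepared with distinct indexed adaptors; the index sequences, library names, and function name are hypothetical, and the exact-match lookup is a simplification of real demultiplexing software.

```python
from collections import defaultdict


def demultiplex(reads, index_to_library):
    """Group (index, sequence) reads by the library that each index was assigned to."""
    by_library = defaultdict(list)
    for index, sequence in reads:
        # Reads whose index matches no known library are kept apart rather than guessed.
        library = index_to_library.get(index, "undetermined")
        by_library[library].append(sequence)
    return dict(by_library)


# Hypothetical index assignments for two pooled unbiased libraries.
index_to_library = {"ATCACG": "first_unbiased", "CGATGT": "second_unbiased"}
reads = [("ATCACG", "ACGTACGT"), ("CGATGT", "TTGACCAA"), ("ATCACG", "GGCATCGA"), ("NNNNNN", "ACACACAC")]
assigned = demultiplex(reads, index_to_library)
```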
- the processing of the first and second plurality of nucleic acid molecules optionally involves the fragmentation of the nucleic acid molecules.
- processing may not involve fragmentation.
- Fragmentation of the first and second plurality of nucleic acid molecules may be done by physical methods, enzymatic methods or chemical methods. Some examples of physical methods of fragmentation include, but are not limited to, acoustic shearing or sonication. Some examples of enzymatic methods include, but are not limited to, non-specific endonuclease cocktails or transposase tagmentation reactions.
- the processing of the first and second plurality of nucleic acid molecules involves the sizing of the fragments of the first plurality of nucleic acid molecules.
- Preferred sizes of fragments of the first plurality of nucleic acid molecules may be less than about 50 bases, less than about 100 bases, less than about 200 bases, less than about 400 bases, less than about 600 bases, less than about 800 bases, less than about 1000 bases, about 50 bases or more, about 100 bases or more, about 200 bases or more, about 400 bases or more, about 600 bases or more, about 800 bases or more, from about 10 bases to about 1000 bases, from about 20 bases to about 800 bases, from about 30 bases to about 600 bases, from about 40 bases to about 400 bases, from about 50 bases to about 200 bases, or from about 40 bases to about 100 bases.
- preferred sizes of fragments of the first plurality of nucleic acid molecules may also have base lengths that are on an order of 1,000 bases; 10,000 bases; 100,000 bases; 1,000,000 bases; or more than 1,000,000 bases.
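As a computational analogue of the sizing step, the sketch below keeps only fragments whose length falls within a chosen window. It is illustrative only: sizing in the laboratory is a physical selection, and the function name `select_by_size` and the fragment lengths are hypothetical. The window of 50 to 200 bases is one of the ranges listed above.

```python
def select_by_size(fragments, min_len, max_len):
    """Return the fragments whose length is within [min_len, max_len]."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Hypothetical fragments of 30, 50, 120, 200 and 450 bases.
fragments = ["A" * n for n in (30, 50, 120, 200, 450)]

sized = select_by_size(fragments, min_len=50, max_len=200)
sized_lengths = [len(f) for f in sized]
```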
- the first and second plurality of nucleic acid molecules is DNA.
- the processing of the first and second pluralities of nucleic acid molecules may involve the blunting and phosphorylation of the 5’ end. Blunting and phosphorylation of the 5’ end may be accomplished using at least one enzyme. These enzymes may be T4 polynucleotide kinase,
- the processing of the first and second pluralities of nucleic acid molecules may involve the A-tailing of the 3’ end.
- the A-tailing of the 3’ end may use enzymes. These enzymes may be Taq polymerase or Klenow Fragment.
- the processing of the first and second pluralities of nucleic acid molecules may involve multiplexing.
- the processing of the first and second pluralities of nucleic acid molecules may involve tagmentation. Tagmentation may involve the use of a transposase enzyme to simultaneously fragment and tag nucleic acid molecules.
- the first and second pluralities of nucleic acid molecules are RNA.
- the processing of the first and second pluralities of nucleic acid molecules may involve ligation with a DNA adaptor.
- the DNA adaptor may be an adenylated DNA adaptor with a blocked 3’ end.
- the ligation may be done using truncated T4 RNA ligase 2.
- the processing of the first and second pluralities of nucleic acid molecules may involve the addition of an adaptor.
- This adaptor may be a 5’ RNA adaptor.
- the processing of the first and second pluralities of nucleic acid molecules may involve hybridization of a primer.
- This primer may be a reverse transcription primer.
- the processing of the first and second pluralities of nucleic acid molecules may be based on cDNA synthesis. This synthesis may involve, but is not limited to, using random primers or oligo-dT primers or attaching adaptors.
- the processing of the first and second pluralities of nucleic acid molecules may involve, but is not limited to, using primers to initiate the cDNA synthesis. This may then involve template switching where an adaptor sequence is added to the cDNA molecules.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, reduced amplification.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, reducing duplicate reads (e.g., generating consensus sequences) or detecting/correcting base errors in reads.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, using multiple combinations of indexed adaptors.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, mitigating batch effects.
- the processing of the first plurality of nucleic acid molecules may involve, but is not limited to, reducing variability in day-to-day sample processing. This may involve reducing day-to-day variability in reaction conditions, reagent batches, pipetting accuracy, and human error.
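Reducing duplicate reads by generating consensus sequences, as mentioned above, can be sketched as follows. This is a hedged illustration under stated assumptions: reads are assumed to carry a molecular tag identifying the original molecule (the text does not specify how duplicates are identified), reads in a family are assumed to be the same length, and the names `consensus` and `collapse_duplicates` and all sequences are hypothetical.

```python
from collections import Counter, defaultdict


def consensus(seqs):
    """Majority base at each position; assumes equal-length sequences.

    Ties resolve to the base seen first at that position.
    """
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def collapse_duplicates(reads):
    """Collapse (tag, sequence) reads into one consensus sequence per tag."""
    families = defaultdict(list)
    for tag, seq in reads:
        families[tag].append(seq)
    return {tag: consensus(seqs) for tag, seqs in families.items()}


# Hypothetical reads: three duplicates of one molecule (one with a base
# error at the third position) and a single read of another molecule.
reads = [
    ("AAT", "ACGTAC"),
    ("AAT", "ACGTAC"),
    ("AAT", "ACCTAC"),
    ("GGC", "TTGACA"),
]

collapsed = collapse_duplicates(reads)
```

The majority vote both removes the duplicates and corrects the isolated base error in the third read of the first family.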
- the first unbiased sequencing comprises whole genome sequencing. In some examples, the first unbiased sequencing comprises RNA sequencing. In some examples, the first unbiased sequencing comprises whole genome sequencing and RNA sequencing. In some examples, the first unbiased sequencing comprises bisulfite sequencing, whole genome bisulfite sequencing (WGBS), APOBEC-seq, methyl-CpG-binding domain (MBD) protein capture, methyl-DNA immunoprecipitation (MeDIP), methylation sensitive restriction enzyme sequencing (MSRE/MRE-Seq or Methyl-Seq), oxidative bisulfite sequencing (oxBS-Seq), reduced representation bisulfite sequencing (RRBS), Tet-assisted bisulfite sequencing (TAB-Seq), or similar.
- the second unbiased sequencing comprises RNA sequencing. In some examples, the second unbiased sequencing comprises whole genome sequencing and RNA sequencing. In some examples, the second unbiased sequencing comprises bisulfite sequencing, whole genome bisulfite sequencing (WGBS), APOBEC-seq, methyl-CpG-binding domain (MBD) protein capture, methyl-DNA immunoprecipitation (MeDIP), methylation sensitive restriction enzyme sequencing (MSRE/MRE-Seq or Methyl-Seq), oxidative bisulfite sequencing (oxBS-Seq), reduced representation bisulfite sequencing (RRBS), Tet-assisted bisulfite sequencing (TAB-Seq), or similar.
- the unbiased sequencing may be performed at a depth of no more than about 0.1X, no more than about 0.5X, no more than about 1X, no more than about 2X, no more than about 3X, no more than about 4X, no more than about 5X, no more than about 6X, no more than about 7X, no more than about 8X, no more than about 9X, no more than about 10X, no more than about 15X, no more than about 20X, no more than about 30X, no more than about 40X, no more than about 50X, no more than about 60X, no more than about 70X, no more than about 80X, no more than about 90X, no more than about 100X, no more than about 200X, no more than about 300X, no more than about 400X, no more than about 500X, no more than about 600X, no more than about 700X, no more than about 800X, no more than about 900X, no more than about 1000X, at least about 0.1X, at least about 0.5X, at least about 1X
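Sequencing depth values such as those listed above are commonly related to read count by the approximation depth = (number of reads × read length) / genome size. The sketch below applies that relation. It is an illustration only: the genome size of 3.1 billion bases and the 150-base read length are assumed example values, not figures from this disclosure, and the function names are hypothetical.

```python
import math


def mean_depth(num_reads, read_length, genome_size):
    """Approximate mean depth (X) from read count, read length and genome size."""
    return num_reads * read_length / genome_size


def reads_for_depth(target_depth, read_length, genome_size):
    """Approximate number of reads needed to reach a target mean depth."""
    return math.ceil(target_depth * genome_size / read_length)


# Assumed example values (not taken from the disclosure).
GENOME_SIZE = 3_100_000_000  # bases
READ_LENGTH = 150            # bases per read

reads_needed = reads_for_depth(30, READ_LENGTH, GENOME_SIZE)
depth_check = mean_depth(reads_needed, READ_LENGTH, GENOME_SIZE)
```

The estimate ignores duplicate reads, unmapped reads, and uneven coverage, so a real run would typically need more reads than this lower bound.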
- the nucleic acid molecules used in the methods described herein are extracted from a sample.
- the sample may be a biological sample.
- Systems
- a system for sequencing nucleic acid molecules may comprise a controller.
- the system may also comprise a support operatively coupled to the controller.
- the controller may comprise one or more computer processors.
- the one or more computer processors may be individually or collectively programmed to direct the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries. This may generate a first plurality of libraries for performing an unbiased sequencing.
- the computer processors may be individually or collectively programmed to direct the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries. This may generate a second plurality of libraries for performing a biased sequencing.
- the computer processors may be individually or collectively programmed to direct the pooling of the first plurality of libraries and the second plurality of libraries. This may generate a pooled plurality of libraries. This pooled plurality of libraries may be used to generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules. This pooled plurality of libraries may also be used to generate a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- the system may comprise a controller.
- the system may also comprise a support operatively coupled to the controller.
- the controller may comprise one or more computer processors.
- the one or more computer processors may be individually or collectively programmed to direct the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries. This may generate a first plurality of libraries for performing a first biased sequencing.
- the computer processors may be individually or collectively programmed to direct the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries. This may generate a second plurality of libraries for performing a second biased sequencing.
- the computer processors may be individually or collectively programmed to direct the pooling of the first plurality of libraries and the second plurality of libraries. This may generate a pooled plurality of libraries. This pooled plurality of libraries may be used to generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules. This pooled plurality of libraries may also be used to generate a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- a system for sequencing nucleic acid molecules may comprise a controller.
- the system may also comprise a support operatively coupled to the controller.
- the controller may comprise one or more computer processors.
- the one or more computer processors may be individually or collectively programmed to direct the processing of a first plurality of nucleic acid molecules to generate a first plurality of libraries. This may generate a first plurality of libraries for performing a first unbiased sequencing.
- the computer processors may be individually or collectively programmed to direct the processing of a second plurality of nucleic acid molecules to generate a second plurality of libraries. This may generate a second plurality of libraries for performing a second unbiased sequencing.
- the computer processors may be individually or collectively programmed to direct the pooling of the first plurality of libraries and the second plurality of libraries. This may generate a pooled plurality of libraries. This pooled plurality of libraries may be used to generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules. This pooled plurality of libraries may also be used to generate a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- a non-transitory computer-readable medium may comprise machine-executable code.
- the machine-executable code may implement a method for sequencing nucleic acid molecules.
- the method being implemented may comprise processing a first plurality of nucleic acid molecules to generate a first plurality of libraries for performing an unbiased sequencing.
- the method being implemented may comprise processing a second plurality of nucleic acid molecules to generate a second plurality of libraries for performing a biased sequencing.
- the method being implemented may pool the first plurality of libraries and the second plurality of libraries.
- the method being implemented may generate a pooled plurality of libraries.
- the method being implemented may use a single flow cell of a sequencing platform to sequence the pooled plurality of libraries.
- the method being implemented may generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules and a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- a non-transitory computer-readable medium that may comprise machine-executable code.
- the machine-executable code may implement a method for sequencing nucleic acid molecules.
- the method being implemented may process a first plurality of nucleic acid molecules.
- the method being implemented may generate a first plurality of libraries for performing a first biased sequencing.
- the method being implemented may process a second plurality of nucleic acid molecules.
- the method being implemented may generate a second plurality of libraries for performing a second biased sequencing.
- the method being implemented may pool the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries.
- the method being implemented may use a single flow cell of a sequencing platform to sequence the pooled plurality of libraries.
- the method being implemented may generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules and a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- a non-transitory computer-readable medium that may comprise machine-executable code.
- the machine-executable code may implement a method for sequencing nucleic acid molecules.
- the method being implemented may process a first plurality of nucleic acid molecules.
- the method being implemented may generate a first plurality of libraries for performing a first unbiased sequencing.
- the method being implemented may process a second plurality of nucleic acid molecules.
- the method being implemented may generate a second plurality of libraries for performing a second unbiased sequencing.
- the method being implemented may pool the first plurality of libraries and the second plurality of libraries to generate a pooled plurality of libraries.
- the method being implemented may use a single flow cell of a sequencing platform to sequence the pooled plurality of libraries.
- the method being implemented may generate a first plurality of sequencing reads corresponding to the first plurality of nucleic acid molecules and a second plurality of sequencing reads corresponding to the second plurality of nucleic acid molecules.
- FIG. 1 shows a computer system 101 that is programmed or otherwise configured to implement methods and systems of the present disclosure, such as performing nucleic acid sequence and sequence analysis.
- the computer system 101 includes a central processing unit (CPU, also “processor” and “computer processor” herein) 105, which can be a single core or multi core processor, or a plurality of processors for parallel processing.
- the computer system 101 also includes memory or memory location 110 (e.g., random-access memory, read-only memory, flash memory), electronic storage unit 115 (e.g., hard disk), communication interface 120 (e.g., network adapter) for communicating with other systems, and peripheral devices 125, such as cache, other memory, data storage and/or electronic display adapters.
- the memory 110, storage unit 115, interface 120 and peripheral devices 125 are in communication with the CPU 105 through a communication bus (solid lines), such as a motherboard.
- the storage unit 115 can be a data storage unit (or data repository) for storing data.
- the computer system 101 can be operatively coupled to a computer network (“network”) 130 with the aid of the communication interface 120.
- the network 130 can be the Internet, an internet and/or extranet, or an intranet and/or extranet that is in communication with the Internet.
- the network 130 in some cases is a telecommunication and/or data network.
- the network 130 can include computer server(s), which can enable distributed computing, such as cloud computing.
- the network 130 in some cases with the aid of the computer system 101, can implement a peer-to-peer network, which may enable devices coupled to the computer system 101 to behave as a client or a server.
- the CPU 105 can execute a sequence of machine-readable instructions, which can be embodied in a program or software.
- the instructions may be stored in a memory location, such as the memory 110.
- the instructions can be directed to the CPU 105, which can subsequently program or otherwise configure the CPU 105 to implement methods of the present disclosure. Examples of operations performed by the CPU 105 can include fetch, decode, execute, and writeback.
- the CPU 105 can be part of a circuit, such as an integrated circuit. Other component(s) of the system 101 can be included in the circuit. In some cases, the circuit is an application specific integrated circuit (ASIC).
- the storage unit 115 can store files, such as drivers, libraries and saved programs.
- the storage unit 115 can store user data, e.g., user preferences and user programs.
- the computer system 101 in some cases can include additional data storage unit(s) that is external to the computer system 101, such as located on a remote server that is in communication with the computer system 101 through an intranet or the Internet.
- the computer system 101 can communicate with remote computer systems through the network 130.
- the computer system 101 can communicate with a remote computer system of a user.
- remote computer systems include personal computers (e.g., portable PC), slate or tablet PCs (e.g., Apple® iPad, Samsung® Galaxy Tab), telephones, smartphones (e.g., Apple® iPhone, Android-enabled device, Blackberry®), or personal digital assistants.
- the user can access the computer system 101 via the network 130.
- Methods as described herein can be implemented by way of machine (e.g., computer processor) executable code stored on an electronic storage location of the computer system 101, such as, for example, on the memory 110 or electronic storage unit 115.
- the machine executable or machine readable code can be provided in the form of software.
- the code can be executed by the processor 105.
- the code can be retrieved from the storage unit 115 and stored on the memory 110 for ready access by the processor 105.
- the electronic storage unit 115 can be precluded, and machine-executable instructions are stored on memory 110.
- the code can be pre-compiled and configured for use with a machine having a processor adapted to execute the code, or can be compiled during runtime.
- the code can be supplied in a programming language that can be selected to enable the code to execute in a pre-compiled or as-compiled fashion.
- aspects of the systems and methods provided herein can be embodied in programming.
- Various aspects of the technology may be thought of as “products” or “articles of manufacture” typically in the form of machine (or processor) executable code and/or associated data that is carried on or embodied in a type of machine readable medium.
- Machine-executable code can be stored on an electronic storage unit, such as memory (e.g., read-only memory, random-access memory, flash memory) or a hard disk.
- “Storage” type media can include any or all of the tangible memory of the computers, processors or the like, or associated modules thereof, such as various semiconductor memories, tape drives, disk drives and the like, which may provide non-transitory storage at any time for the software programming. All or portions of the software may at times be communicated through the Internet or various other telecommunication networks. Such communications, for example, may enable loading of the software from one computer or processor into another, for example, from a management server or host computer into the computer platform of an application server.
- another type of media that may bear the software elements includes optical, electrical and electromagnetic waves, such as used across physical interfaces between local devices, through wired and optical landline networks and over various air-links.
- a machine readable medium, such as computer-executable code, may take many forms, including but not limited to a tangible storage medium, a carrier-wave medium, or a physical transmission medium.
- Non-volatile storage media include, for example, optical or magnetic disks, such as any of the storage devices in any computer(s) or the like, such as may be used to implement the databases, etc. shown in the drawings.
- Volatile storage media include dynamic memory, such as main memory of such a computer platform.
- Tangible transmission media include coaxial cables; copper wire and fiber optics, including the wires that comprise a bus within a computer system.
- Carrier-wave transmission media may take the form of electric or electromagnetic signals, or acoustic or light waves such as those generated during radio frequency (RF) and infrared (IR) data communications.
- Common forms of computer-readable media therefore include, for example: a floppy disk, a flexible disk, hard disk, magnetic tape, any other magnetic medium, a CD-ROM, DVD or DVD-ROM, any other optical medium, punch cards, paper tape, any other physical storage medium with patterns of holes, a RAM, a ROM, a PROM and EPROM, a FLASH-EPROM, any other memory chip or cartridge, a carrier wave transporting data or instructions, cables or links transporting such a carrier wave, or any other medium from which a computer may read programming code and/or data.
- Many of these forms of computer readable media may be involved in carrying sequences of instructions to a processor for execution.
- the computer system 101 can include or be in communication with an electronic display 135 that comprises a user interface (UI) 140 for providing, for example, results of nucleic acid sequencing (e.g., sequence reads, consensus sequences, etc.).
- UIs include, without limitation, a graphical user interface (GUI) and web-based user interface.
- Methods and systems of the present disclosure can be implemented by way of algorithms.
- An algorithm can be implemented by way of software upon execution by the central processing unit 105.
- the algorithm can, for example, implement methods and systems of the present disclosure.
- Example 1 Method of sequencing DNA using unbiased/biased sequencing
- the present disclosure provides a method of sequencing DNA using libraries prepared for performing unbiased and biased sequencing (FIG. 2).
- DNA is extracted from tissue or cells.
- the extracted DNA is divided into two samples, a first sample 202 and a second sample 203.
- the DNA in the first sample is then processed in operation 204.
- the DNA in the first sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the first sample are sized.
- the sized DNA fragments of the first sample are converted into the first library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to mitigate bias in the fragmentation, sizing, and ligation of the DNA in the first sample.
- the DNA in the second sample is then processed in operation 205.
- the DNA in the second sample is subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the second sample are sized.
- the sized DNA fragments of the second sample are converted into the second library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to exaggerate bias in the fragmentation, sizing, and ligation of the DNA in the second sample.
- the first library and the second library are pooled to produce the pooled library (FIG. 6). Specifically, the processed DNA 603 of the first library 602 is pooled with the processed DNA 605 of the second library 604. The pooling of the first library 602 and the second library 604 occurs before entering the flow cell 607. The adaptors of the DNA of the first library and the DNA of the second library interact with the surface of the channels in the flow cell 608. The pooled library is subjected to clonal amplification using cluster generation. The pooled library is then subjected to sequencing, for example, paired end or single read sequencing, to produce sequencing reads. Sequencing reads are then correlated to the DNA of the first sample 610 and the DNA of the second sample 609.
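When an unbiased and a biased library are pooled onto one flow cell, the pooling ratio largely determines how the reads divide between them. The sketch below shows one way to reason about this. It is a rough illustration under stated assumptions: read share is taken to be proportional to molar share in the pool (in practice, clustering efficiency also depends on fragment length), the conversion uses the common approximation of 660 g/mol per double-stranded base pair, and all concentrations, read totals, and function names are hypothetical.

```python
def molarity_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a dsDNA library concentration (ng/uL) to nanomolar.

    Uses the usual approximation of 660 g/mol per base pair.
    """
    return conc_ng_per_ul * 1e6 / (660 * mean_fragment_bp)


def expected_read_split(molar_amounts, total_reads):
    """Split a read budget in proportion to each library's molar amount."""
    total = sum(molar_amounts.values())
    return {
        name: round(total_reads * amount / total)
        for name, amount in molar_amounts.items()
    }


# Hypothetical library: 6.6 ng/uL with a mean fragment length of 500 bp.
first_library_nM = molarity_nM(6.6, 500)

# Hypothetical pool: three parts first library to one part second library,
# sequenced on a flow cell yielding 400 million reads.
split = expected_read_split({"first": 3.0, "second": 1.0}, 400_000_000)
```

Choosing an unequal ratio like this is one way to give the library that needs broader coverage a larger share of the run.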
- Example 2 Method of sequencing DNA using biased/biased sequencing
- the present disclosure provides a method of sequencing DNA using libraries prepared for performing a first and a second biased sequencing (FIG. 3).
- DNA is extracted from tissue or cells.
- the extracted DNA is divided into two samples, a first sample 302 and a second sample 303.
- the DNA in the first sample is then processed in operation 304.
- the DNA in the first sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the first sample are sized.
- the sized DNA fragments of the first sample are converted into the first library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform.
- the DNA in the second sample is then processed in operation 305.
- the DNA in the second sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the second sample are sized.
- the sized DNA fragments of the second sample are converted into the second library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to exaggerate bias in the fragmentation, sizing, and ligation of the DNA in the second sample.
- the first library and the second library are pooled to produce the pooled library (FIG. 6). Specifically, the processed DNA 603 of the first library 602 is pooled with the processed DNA 605 of the second library 604. The pooling of the first library 602 and the second library 604 occurs before entering the flow cell 607. The adaptors of the DNA of the first library and the DNA of the second library interact with the surface of the channels in the flow cell 608. The pooled library is subjected to clonal amplification using cluster generation. The pooled library is then subjected to sequencing, for example, paired end or single read sequencing, to produce sequencing reads. Sequencing reads are then correlated to the DNA of the first sample 610 and the DNA of the second sample 609.
- Example 3 Method of sequencing DNA using unbiased/unbiased sequencing
- the present disclosure provides a method of sequencing DNA using libraries prepared for performing a first and a second unbiased sequencing (FIG. 4).
- DNA is extracted from tissue or cells.
- the extracted DNA is divided into two samples, a first sample 402 and a second sample 403.
- the DNA in the first sample is then processed in operation 404.
- the DNA in the first sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the first sample are sized.
- the sized DNA fragments of the first sample are converted into the first library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to mitigate bias in the fragmentation, sizing, and ligation of the DNA in the first sample.
- the DNA in the second sample is then processed in operation 405.
- the DNA in the second sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the second sample are sized.
- the sized DNA fragments of the second sample are converted into the second library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform.
- the first library and the second library are pooled to produce the pooled library (FIG. 6). Specifically, the processed DNA 603 of the first library 602 is pooled with the processed DNA 605 of the second library 604. The pooling of the first library 602 and the second library 604 occurs before entering the flow cell 607. The adaptors of the DNA of the first library and the DNA of the second library interact with the surface of the channels in the flow cell 608. The pooled library is subjected to clonal amplification using cluster generation. The pooled library is then subjected to sequencing, for example, paired end or single read sequencing, to produce sequencing reads. Sequencing reads are then correlated to the DNA of the first sample 610 and the DNA of the second sample 609.
- Example 4 Method of sequencing RNA using unbiased/biased sequencing
- the present disclosure provides a method of sequencing RNA using libraries prepared for performing unbiased and biased sequencing (FIG. 2).
- RNA is extracted from tissue or cells. The extracted RNA is divided into two samples, a first sample 202 and a second sample 203.
- the RNA in the first sample is then processed in operation 204.
- the RNA in the first sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions) or chemical methods.
- the resulting RNA fragments of the first sample are sized.
- the sized RNA fragments of the first sample are converted to cDNA using reverse transcription to produce the first library. Up until this point, steps have been taken to mitigate bias in the fragmentation and cDNA synthesis of the RNA in the first sample.
- the RNA in the second sample is then processed in operation 205.
- the RNA in the second sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting RNA fragments of the second sample are sized.
- the sized RNA fragments of the second sample are converted to cDNA using reverse transcription to produce the second library. Up until this point, steps have been taken to exaggerate bias in the fragmentation and cDNA synthesis of the RNA in the second sample.
- the first library and the second library are pooled to produce the pooled library (FIG. 6). Specifically, the processed RNA 603 of the first library 602 is pooled with the processed RNA 605 of the second library 604. The pooling of the first library 602 and the second library 604 occurs before entering the flow cell 607. The cDNA of the first library and the cDNA of the second library interact with the surface of the channels in the flow cell 608. The pooled library is subjected to clonal amplification using cluster generation. The pooled library is then subjected to sequencing, for example, paired end or single read sequencing, to produce sequencing reads. Sequencing reads are then correlated to the RNA of the first sample 610 and the RNA of the second sample 609.
- Example 5 Method of sequencing RNA using biased/biased sequencing
- the present disclosure provides a method of sequencing RNA using libraries prepared for performing a first and a second biased sequencing (FIG. 3).
- RNA is extracted from tissue or cells.
- the extracted RNA is divided into two samples, a first sample 302 and a second sample 303.
- the RNA in the first sample is then processed in operation 304.
- the RNA in the first sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting RNA fragments of the first sample are sized.
- the sized RNA fragments of the first sample are converted to cDNA using reverse transcription to produce the first library.
- the RNA in the second sample is then processed in operation 305.
- the RNA in the second sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting RNA fragments of the second sample are sized.
- the sized RNA fragments of the second sample are converted to cDNA using reverse transcription to produce the second library. Up until this point, steps have been taken to exaggerate bias in the fragmentation and cDNA synthesis of the RNA in the second sample.
- the first library and the second library are pooled to produce the pooled library (FIG. 6). Specifically, the processed RNA 603 of the first library 602 is pooled with the processed RNA 605 of the second library 604. The pooling of the first library 602 and the second library 604 occurs before entering the flow cell 607. The cDNA of the first library and the cDNA of the second library interact with the surface of the channels in the flow cell 608. The pooled library is subjected to clonal amplification using cluster generation. The pooled library is then subjected to sequencing, for example, single read or paired end sequencing, to produce sequencing reads. Sequencing reads are then correlated to the RNA of the first sample 610 and the RNA of the second sample 609.
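The final correlation step above, in which reads from the pooled run are attributed back to the first or second sample, can be sketched as follows. This is a minimal illustration that assumes each library carries a distinct index sequence that is read alongside the insert. The index sequences, the `LIBRARY_INDEXES` table, and the `assign_reads` function are hypothetical and are not taken from the disclosure, which does not state in this example how the correlation is performed.

```python
from collections import defaultdict

# Hypothetical index sequences, one per library (assumption, not from the source).
LIBRARY_INDEXES = {"ACGT": "first_library", "TGCA": "second_library"}


def hamming(a, b):
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_reads(reads, indexes=LIBRARY_INDEXES, max_mismatches=0):
    """Group (index_sequence, insert_sequence) pairs by originating library.

    A read is assigned only when exactly one known index matches within
    max_mismatches; otherwise it is placed under "unassigned".
    """
    assigned = defaultdict(list)
    for index_seq, insert_seq in reads:
        matches = [
            library
            for known, library in indexes.items()
            if len(known) == len(index_seq)
            and hamming(known, index_seq) <= max_mismatches
        ]
        library = matches[0] if len(matches) == 1 else "unassigned"
        assigned[library].append(insert_seq)
    return dict(assigned)


pooled_reads = [("ACGT", "AAA"), ("TGCA", "CCC"), ("GGGG", "TTT")]
by_library = assign_reads(pooled_reads)
```

Requiring a single unambiguous match is a conservative design choice: a read whose index is equally close to two libraries is set aside rather than guessed.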
- Example 6 Method of sequencing RNA using unbiased/unbiased sequencing
- the present disclosure provides a method of sequencing RNA using libraries prepared for performing unbiased and biased sequencing (FIG. 4).
- RNA is extracted from tissue or cells.
- the extracted RNA is divided into two samples, a first sample 402 and a second sample 403.
- the RNA in the first sample is then processed in operation 404.
- the RNA in the first sample is subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting RNA fragments of the first sample are sized.
- the sized RNA fragments of the first sample are converted to cDNA using reverse transcription to produce the first library. Up until this point, steps have been taken to mitigate bias in the fragmentation and cDNA synthesis of the RNA in the first sample.
- the RNA in the second sample is then processed in operation 405.
- the RNA in the second sample is subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting RNA fragments of the second sample are sized.
- the sized RNA fragments of the second sample are converted to cDNA using reverse transcription to produce the second library.
- the first library and the second library are pooled to produce the pooled library (FIG. 6). Specifically, the processed RNA 603 of the first library 602 is pooled with the processed RNA 605 of the second library 604. The pooling of the first library 602 and the second library 604 occurs before entering the flow cell 607. The cDNA of the first library and the cDNA of the second library interact with the surface of the channels in the flow cell 608. The pooled library is subjected to clonal amplification using cluster generation. The pooled library is then subjected to sequencing, for example, paired end or single read sequencing, to produce sequencing reads. Sequencing reads are then correlated to the RNA of the first sample 610 and the RNA of the second sample 609.
- Example 7 Method of sequencing DNA using unbiased/biased/unbiased sequencing
- the present disclosure provides a method of sequencing DNA using libraries prepared for performing unbiased and biased sequencing (FIG. 5).
- DNA is extracted from tissue or cells.
- the extracted DNA is divided into three samples, a first sample 502, a second sample 503, and a third sample 504.
- the DNA in the first sample is then processed in operation 505.
- the DNA in the first sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the first sample are sized.
- the sized DNA fragments of the first sample are converted into the first library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to mitigate bias in the fragmentation, sizing, and ligation of the DNA in the first sample.
- the DNA in the second sample is then processed in operation 506.
- the DNA in the second sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the second sample are sized.
- the sized DNA fragments of the second sample are converted into the second library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to exaggerate bias in the fragmentation, sizing, and ligation of the DNA in the second sample.
- the DNA in the third sample is then processed in operation 507.
- the DNA in the third sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting DNA fragments of the third sample are sized.
- the sized DNA fragments of the third sample are converted into the third library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to exaggerate or mitigate bias in the fragmentation, sizing, and ligation of the DNA in the third sample.
- the first library, the second library, and the third library are then pooled in operation 508 to generate a pooled library.
- the pooled library is subjected to clonal amplification using cluster generation.
- sequencing reads are then correlated to the DNA of the first sample, the DNA of the second sample, and the DNA of the third sample.
- Example 8 Method of sequencing RNA using unbiased/biased/unbiased sequencing
- RNA is extracted from a biological sample.
- the extracted RNA is divided into three samples, a first sample 502, a second sample 503, and a third sample 504.
- the RNA in the first sample is then processed in operation 505.
- the RNA in the first sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting RNA fragments of the first sample are sized.
- the sized RNA fragments of the first sample are converted into the first library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to mitigate bias in the fragmentation, sizing, and ligation of the RNA in the first sample.
- the RNA in the second sample is then processed in operation 506.
- the RNA in the second sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting RNA fragments of the second sample are sized.
- the sized RNA fragments of the second sample are converted into the second library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to exaggerate bias in the fragmentation, sizing, and ligation of the RNA in the second sample.
- the RNA in the third sample is then processed in operation 507.
- the RNA in the third sample is optionally subjected to fragmentation.
- Some fragmentation methods are physical methods (acoustic shearing or sonication), enzymatic methods (endonuclease cocktails or transposase tagmentation reactions), or chemical methods.
- the resulting RNA fragments of the third sample are sized.
- the sized RNA fragments of the third sample are converted into the third library by ligation to sequencing adaptors containing specific sequences designed to interact with the surface of the flow cell of a next-generation sequencing platform. Up until this point, steps have been taken to exaggerate or mitigate bias in the fragmentation, sizing, and ligation of the RNA in the third sample.
- the first library, the second library, and the third library are then pooled in operation 508 to generate a pooled library.
- the pooled library is subjected to clonal amplification using cluster generation.
- the pooled library is then subjected to sequencing, for example, paired end or single read sequencing, to produce sequencing reads 509. Sequencing reads are then correlated to the RNA of the first sample, the RNA of the second sample, and the RNA of the third sample.
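The pooling of three libraries in operation 508 can be illustrated with a small calculation of how much of each library to combine. This is a sketch under stated assumptions: library concentrations are known in nM (1 nM equals 1 fmol/µL), the desired molar share of each library in the pool is chosen by the user, and molar share is treated as a rough proxy for read share. The function name `pooling_volumes`, the concentrations, and the target fractions are hypothetical and do not come from the disclosure.

```python
import math


def pooling_volumes(conc_nM, target_fraction, total_fmol):
    """Return the volume in microliters of each library to add to the pool.

    conc_nM         : library name -> concentration in nM (fmol per microliter)
    target_fraction : library name -> desired molar share of the pool (sums to 1)
    total_fmol      : total amount of library in the final pool, in fmol
    """
    if set(conc_nM) != set(target_fraction):
        raise ValueError("library names must match in both inputs")
    if not math.isclose(sum(target_fraction.values()), 1.0):
        raise ValueError("target fractions must sum to 1")
    volumes = {}
    for name, conc in conc_nM.items():
        if conc <= 0:
            raise ValueError(f"concentration for {name} must be positive")
        # fmol needed from this library divided by fmol per microliter
        volumes[name] = target_fraction[name] * total_fmol / conc
    return volumes


# Hypothetical values for the first, second, and third libraries.
volumes = pooling_volumes(
    conc_nM={"first": 10.0, "second": 5.0, "third": 20.0},
    target_fraction={"first": 0.5, "second": 0.25, "third": 0.25},
    total_fmol=100.0,
)
```

With these illustrative inputs the first and second libraries each contribute 5 µL and the third contributes 1.25 µL.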
- Example 9 Method of sequencing DNA and RNA using RNA biased and DNA unbiased sequencing
- RNA is extracted from a biological sample.
- DNA is extracted from a biological sample.
- the biological sample can comprise cell-free nucleic acids, tissue, cells, or any combination thereof.
- the extracted RNA is processed to generate a biased RNA library, such as a targeted RNA library.
- the extracted DNA is processed to generate an unbiased DNA library, such as a WGS library. Both libraries are prepared for running on a NGS sequencing platform, for example, by appending sequences designed to hybridize with sequences on a flow cell.
- the biased RNA library and the unbiased DNA library are pooled to generate a pooled library.
- the pooled library is subjected to clonal amplification using cluster generation.
- the pooled library is then subjected to sequencing, for example, paired end or single read sequencing, to produce sequencing reads. Sequencing reads are then correlated to the RNA of the biased library and the DNA of the unbiased library.
- Example 10 Method of sequencing DNA and RNA using DNA biased and RNA unbiased sequencing
- RNA is extracted from a biological sample.
- DNA is extracted from a biological sample.
- the biological sample can comprise cell-free nucleic acids, tissue, cells, or any combination thereof.
- the extracted RNA is processed to generate an unbiased RNA library, such as an RNA-seq library.
- the extracted DNA is processed to generate a biased DNA library, such as a targeted library.
- Both libraries are prepared for running on a NGS sequencing platform, for example, by appending sequences designed to hybridize with sequences on a flow cell.
- the unbiased RNA library and the biased DNA library are pooled to generate a pooled library.
- the pooled library is subjected to clonal amplification using cluster generation.
- the pooled library is then subjected to sequencing, for example, paired end or single read sequencing, to produce sequencing reads. Sequencing reads are then correlated to the RNA of the unbiased library and the DNA of the biased library.
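Once reads have been correlated to the unbiased RNA library and the biased DNA library, the share of the run taken by each library can be tallied and compared with whatever share was intended at pooling. The sketch below is illustrative only: the function names, the read counts, and the intended shares are hypothetical, and the disclosure does not prescribe this check.

```python
def read_shares(read_counts):
    """Convert per-library read counts into fractions of the total."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads were assigned to any library")
    return {library: count / total for library, count in read_counts.items()}


def share_deviation(read_counts, intended_shares):
    """Return observed share minus intended share for each library."""
    observed = read_shares(read_counts)
    return {
        library: observed.get(library, 0.0) - intended
        for library, intended in intended_shares.items()
    }


# Hypothetical counts after correlating reads to their libraries.
counts = {"unbiased_rna": 750, "biased_dna": 250}
shares = read_shares(counts)
deviation = share_deviation(counts, {"unbiased_rna": 0.5, "biased_dna": 0.5})
```

A large deviation would suggest that the pooling ratio or the library quantification should be revisited before the next run.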
Priority Applications (11)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/263,108 US20210164038A1 (en) | 2018-07-26 | 2019-07-25 | Multiple sequencing using a single flow cell |
CN201980062782.2A CN112752848A (en) | 2018-07-26 | 2019-07-25 | Multiplex sequencing using a single flow cell |
CA3106820A CA3106820A1 (en) | 2018-07-26 | 2019-07-25 | Multiple sequencing using a single flow cell |
SG11202100570YA SG11202100570YA (en) | 2018-07-26 | 2019-07-25 | Multiple sequencing using a single flow cell |
IL310622A IL310622A (en) | 2018-07-26 | 2019-07-25 | Multiple sequencing using a single flow cell |
JP2021504285A JP7418402B2 (en) | 2018-07-26 | 2019-07-25 | Multiplexed sequencing using a single flow cell |
BR112021001247-8A BR112021001247A2 (en) | 2018-07-26 | 2019-07-25 | multiple sequencing using a single flow cell |
EP19840946.8A EP3827091A4 (en) | 2018-07-26 | 2019-07-25 | Multiple sequencing using a single flow cell |
AU2019309870A AU2019309870A1 (en) | 2018-07-26 | 2019-07-25 | Multiple sequencing using a single flow cell |
IL280359A IL280359A (en) | 2018-07-26 | 2021-01-24 | Multiple sequencing using a single flow cell |
JP2023171965A JP2023171901A (en) | 2018-07-26 | 2023-10-03 | Multiple sequencing using single flow cell |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201862703763P | 2018-07-26 | 2018-07-26 | |
US62/703,763 | 2018-07-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020023744A1 (en) | 2020-01-30 |
Family
ID=69181146
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2019/043434 WO2020023744A1 (en) | 2018-07-26 | 2019-07-25 | Multiple sequencing using a single flow cell |
Country Status (10)
Country | Link |
---|---|
US (1) | US20210164038A1 (en) |
EP (1) | EP3827091A4 (en) |
JP (2) | JP7418402B2 (en) |
CN (1) | CN112752848A (en) |
AU (1) | AU2019309870A1 (en) |
BR (1) | BR112021001247A2 (en) |
CA (1) | CA3106820A1 (en) |
IL (2) | IL310622A (en) |
SG (1) | SG11202100570YA (en) |
WO (1) | WO2020023744A1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021236328A1 (en) * | 2020-05-22 | 2021-11-25 | Novartis Ag | Cdna library generation |
DE112021004478T5 (en) | 2020-08-25 | 2023-09-14 | Seer, Inc. | COMPOSITIONS AND METHODS FOR DETERMINING PROTEINS AND NUCLEIC ACIDS |
WO2023196324A1 (en) * | 2022-04-08 | 2023-10-12 | University Of Florida Research Foundation, Incorporated | Instrument and methods involving high-throughput screening and directed evolution of molecular functions |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20160046987A1 (en) * | 2014-08-14 | 2016-02-18 | Abbott Molecular Inc. | Library generation for next-generation sequencing |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8785353B2 (en) * | 2005-06-23 | 2014-07-22 | Keygene N.V. | Strategies for high throughput identification and detection of polymorphisms |
WO2012149171A1 (en) * | 2011-04-27 | 2012-11-01 | The Regents Of The University Of California | Designing padlock probes for targeted genomic sequencing |
EP3039161B1 (en) * | 2013-08-30 | 2021-10-06 | Personalis, Inc. | Methods and systems for genomic analysis |
US9745614B2 (en) * | 2014-02-28 | 2017-08-29 | Nugen Technologies, Inc. | Reduced representation bisulfite sequencing with diversity adaptors |
US11021502B2 (en) | 2014-08-04 | 2021-06-01 | The Trustees Of The University Of Pennsylvania | Transcriptome in vivo analysis (TIVA) and transcriptome in situ analysis (TISA) |
WO2016127944A1 (en) | 2015-02-10 | 2016-08-18 | The Chinese University Of Hong Kong | Detecting mutations for cancer screening and fetal analysis |
US20160281166A1 (en) | 2015-03-23 | 2016-09-29 | Parabase Genomics, Inc. | Methods and systems for screening diseases in subjects |
2019
- 2019-07-25 EP EP19840946.8A patent/EP3827091A4/en active Pending
- 2019-07-25 IL IL310622A patent/IL310622A/en unknown
- 2019-07-25 US US17/263,108 patent/US20210164038A1/en active Pending
- 2019-07-25 AU AU2019309870A patent/AU2019309870A1/en active Pending
- 2019-07-25 SG SG11202100570YA patent/SG11202100570YA/en unknown
- 2019-07-25 JP JP2021504285A patent/JP7418402B2/en active Active
- 2019-07-25 CN CN201980062782.2A patent/CN112752848A/en active Pending
- 2019-07-25 BR BR112021001247-8A patent/BR112021001247A2/en unknown
- 2019-07-25 CA CA3106820A patent/CA3106820A1/en active Pending
- 2019-07-25 WO PCT/US2019/043434 patent/WO2020023744A1/en active Application Filing
2021
- 2021-01-24 IL IL280359A patent/IL280359A/en unknown
2023
- 2023-10-03 JP JP2023171965A patent/JP2023171901A/en active Pending
Non-Patent Citations (2)
Title |
---|
AIRD ET AL.: "Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries", GENOME BIOLOGY, vol. 12, no. 2, 2011, pages 1 - 14, XP021091793, DOI: 10.1186/gb-2011-12-2-r18 * |
See also references of EP3827091A4 * |
Also Published As
Publication number | Publication date |
---|---|
EP3827091A1 (en) | 2021-06-02 |
US20210164038A1 (en) | 2021-06-03 |
CN112752848A (en) | 2021-05-04 |
IL280359A (en) | 2021-03-25 |
JP7418402B2 (en) | 2024-01-19 |
AU2019309870A1 (en) | 2021-02-18 |
SG11202100570YA (en) | 2021-02-25 |
JP2021531794A (en) | 2021-11-25 |
CA3106820A1 (en) | 2020-01-30 |
BR112021001247A2 (en) | 2021-04-27 |
JP2023171901A (en) | 2023-12-05 |
EP3827091A4 (en) | 2022-04-27 |
IL310622A (en) | 2024-04-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 19840946; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 3106820; Country of ref document: CA |
 | ENP | Entry into the national phase | Ref document number: 2021504285; Country of ref document: JP; Kind code of ref document: A |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | REG | Reference to national code | Ref country code: BR; Ref legal event code: B01A; Ref document number: 112021001247; Country of ref document: BR |
 | ENP | Entry into the national phase | Ref document number: 2019309870; Country of ref document: AU; Date of ref document: 20190725; Kind code of ref document: A |
 | WWE | Wipo information: entry into national phase | Ref document number: 2019840946; Country of ref document: EP |
 | ENP | Entry into the national phase | Ref document number: 112021001247; Country of ref document: BR; Kind code of ref document: A2; Effective date: 20210122 |