CN117512063A - DNA library construction method, device and application thereof - Google Patents
- Publication number
- CN117512063A
- Authority
- CN
- China
- Prior art keywords
- dna molecule
- stranded dna
- modification
- double
- primer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/06—Biochemical methods, e.g. using enzymes or whole viable microorganisms
Abstract
The present disclosure provides a DNA library construction method, a kit and applications thereof. Using the methods of the present disclosure, probes or primers can be quality controlled. Quality control is completed simply by comparing the off-machine data (the sequencer output) of the sequencing library with the probe sequences, without evaluating the actual performance of the primers or probes, which saves time and the high cost of practical testing. The disclosure solves the long-standing technical problem that a small number of synthesis errors arising during primer or probe synthesis cannot be judged directly from the performance of the primers or probes, and fills a gap in quality control technology based on constructing a library from the probes themselves.
Description
Technical Field
The invention relates to the field of gene sequencing, and in particular to a DNA library construction method, a device, and applications thereof.
Background
To date, quality control methods for oligonucleotide primers or probes fall broadly into two categories: quality control of the purity of the synthesized fragments, and quality control of actual performance in use.
Quality control after synthesis and after mixing mainly monitors the purity of the primer or probe, and the approach differs with oligonucleotide length. Generally, oligonucleotides shorter than 70 nt are checked by high performance liquid chromatography, and oligonucleotides longer than 70 nt by polyacrylamide gel electrophoresis. In addition, oligonucleotides of 20-100 nt can be checked by mass spectrometry: electrospray ionization generates a distribution of ions in different charge states for the nucleic acid, and the molecular weight of the nucleic acid is calculated by deconvolution.
None of these techniques can distinguish nucleic acids that have the same nucleotide composition but a different base order (isomers). Because a probe set or primer set usually contains thousands or even tens of thousands of oligonucleotide chains, it cannot be determined after mixing whether one or more probes of the same length were omitted from the mix or added more than once. Errors in the synthesized nucleotide sequence likewise cannot be detected.
Quality control after purification is generally judged indirectly from the actual enrichment of the target sequences. Either a library is constructed after multiplex PCR, or a library is constructed first and then subjected to a probe hybridization capture experiment. The enriched library is sequenced, and bioinformatics methods are used to calculate, from the sequencing results, the depth of each region targeted by the primer or probe under quality control and the average depth. The depth coefficient (target region depth / average depth) is then computed, and the performance of the probe or primer is judged from its magnitude: if the depth coefficient of the target region of a given probe or primer is too low, or is 0, that probe or primer is considered to be of poor quality or missing.
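The depth coefficient described above can be illustrated with a minimal Python sketch. The probe names, depth values, and the 0.2 cutoff are hypothetical and chosen only for illustration; the source does not specify a threshold.

```python
from statistics import mean

def depth_coefficients(region_depth):
    """Return target-region depth divided by the average depth over all regions."""
    avg = mean(region_depth.values())
    if avg == 0:
        raise ValueError("average depth is zero; nothing was enriched")
    return {name: depth / avg for name, depth in region_depth.items()}

def flag_low(coeffs, threshold=0.2):
    """List regions whose depth coefficient is below an assumed cutoff (0 included)."""
    return sorted(name for name, c in coeffs.items() if c < threshold)

# Hypothetical per-region sequencing depths for four probes.
depths = {"probe_A": 120.0, "probe_B": 80.0, "probe_C": 0.0, "probe_D": 200.0}
coeffs = depth_coefficients(depths)
suspect = flag_low(coeffs)
print(coeffs)
print(suspect)
```

Here the average depth is 100, so probe_C (coefficient 0) is the only probe flagged as possibly missing or of poor quality.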
However, this conventional indirect quality control method makes it difficult to monitor and determine the presence of small amounts of non-targeted sequences in a probe set.
Moreover, practical evaluation experiments often require significant additional time and labor. Because the probe manufacturer and the probe user are often not the same department, the match between the designed and the actually delivered primers or probes is often not ideal. The binding efficiency of a primer or probe to its target is also partly random. This is especially true in current pathogen targeted enrichment applications (targeted next-generation sequencing, tNGS), where the number of pathogens covered by the primers or probes can reach hundreds of thousands: a practical evaluation cannot use standards for every pathogen to fully assess all the pathogens covered by primers or probes designed from a database. A method for sequence-level quality control of probes or primers is therefore highly desirable.
On the other hand, primers or probes used in molecular biology commonly carry a 5' end modification (for example, thio, amino, or thiol modification at the 5' end of a primer, or biotin modification at the 5' end of a probe in a biotin-streptavidin hybridization capture system). Conventional library construction schemes can build a sequencing library directly only from molecules with a phosphorylated 5' end, so sequencing-based quality control cannot be performed on primers or probes that carry a non-phosphorylated 5' end modification.
Existing single-stranded DNA library construction methods show obvious bias, so the accurate proportion of each primer or probe in the set under quality control cannot be obtained, and semi-quantification of the primers or probes cannot be achieved.
Disclosure of Invention
To solve at least one of the above problems, the present disclosure provides a DNA library construction method, a device, and applications thereof. Using the methods of the present disclosure, probes or primers can be quality controlled and analyzed semi-quantitatively.
According to a first aspect of the present disclosure, there is provided a method of constructing a DNA library, the method comprising the steps of:
1) Adding a poly(X)n tail to the 3' end of a single-stranded DNA molecule to obtain a first single-stranded DNA molecule bearing the poly(X)n tail;
2) Obtaining a double-stranded DNA molecule from the first single-stranded DNA molecule using an extension primer, wherein the double-stranded DNA molecule comprises the first single-stranded DNA molecule and a second single-stranded DNA molecule complementary to the first single-stranded DNA molecule, and the extension primer is capable of annealing to the poly(X)n tail at the 3' end of the first single-stranded DNA molecule; and
3) Ligating a double-stranded adaptor to the end of the double-stranded DNA molecule distal to the poly(X)n tail, and amplifying the second single-stranded DNA molecule to obtain an amplification product, wherein the amplification product constitutes a DNA sequencing library.
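The three steps can be traced at the sequence level with a small Python sketch. It is only a string-level illustration under stated assumptions: the toy probe, the 5' portion of the extension primer, and the adaptor sequence are hypothetical, and the primer is assumed to anneal flush with the 3' end of the tail.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def add_tail(ssdna: str, x: str = "G", n: int = 8) -> str:
    """Step 1: append a poly(X)n tail to the 3' end of the single strand."""
    return ssdna + x * n

def extend(first_strand: str, primer_5p: str, x: str, m: int) -> str:
    """Step 2: build the second strand (5'->3') from the extension primer.

    The primer is primer_5p + (Y)m. It is assumed to anneal flush with the
    3' end of the tail, so the last m tail bases pair with the (Y)m unit.
    """
    y = x.translate(COMPLEMENT)
    if not first_strand.endswith(x * m):
        raise ValueError("(Y)m unit cannot anneal: tail is shorter than m")
    return primer_5p + y * m + reverse_complement(first_strand[:-m])

def ligate_adaptor(second_strand: str, adaptor_second_strand: str) -> str:
    """Step 3: join the 5' end of the second adaptor strand to the 3' end
    of the second strand, i.e. the end distal to the poly(X)n tail."""
    return second_strand + adaptor_second_strand

probe = "ACGTTGCAAT"    # hypothetical single-stranded probe
adaptor = "ACTAGATCGG"  # hypothetical second adaptor strand
first = add_tail(probe, x="G", n=8)
second = extend(first, primer_5p="TTAGC", x="G", m=6)
template = ligate_adaptor(second, adaptor)
print(first)
print(second)
print(template)
```

The amplifiable template reads, from 5' to 3': the non-complementary 5' part of the extension primer, the complement of the full tail, the reverse complement of the probe, and the adaptor strand.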
In some embodiments, in step 1), the single-stranded DNA molecule is 10 to 150 nt in length.
In some specific embodiments, in step 1), the length of the single-stranded DNA molecule is selected from 10 nt, 20 nt, 30 nt, 40 nt, 50 nt, 60 nt, 70 nt, 80 nt, 90 nt, 100 nt, 110 nt, 120 nt, 130 nt, 140 nt, and 150 nt.
In some specific embodiments, in step 1), the single-stranded DNA molecule is 20 to 120 nt in length.
In some embodiments, the single stranded DNA molecule has a modification at the 5' end.
In some specific embodiments, the single stranded DNA molecule 5' end modification comprises a non-phosphorylated modification.
In some specific embodiments, the single stranded DNA molecule 5' terminal modification includes, but is not limited to, amino modification, diphenylcyclooctyne modification, biotin modification, desthiobiotin, sulfhydryl modification, dithiol modification, ferrocene modification, tetrahydrofuran modification, thio modification, phosphorothioate modification, digoxin modification, cholesterol modification, azobenzene modification, methylene blue modification, binaphthyl modification, ruthenium modification, and the like.
In some specific embodiments, the single stranded DNA molecule 5' terminal modification comprises a thio modification, an amino modification, a thiol modification, or a biotin modification.
In some embodiments, in the step 1), the n represents the number of bases X.
In some specific embodiments, n is selected from integers from 6 to 12.
In some specific embodiments, n is selected from 6, 7, 8, 9, 10, 11, or 12.
In some specific embodiments, the X is selected from any one of the four A, T, C, G bases.
In some specific embodiments, the X is selected from the group consisting of base C or G.
In some specific embodiments, the X is selected from a G base.
In some embodiments, in step 1), the poly(X)n tail is added to the 3' end of the single-stranded DNA molecule using a terminal transferase.
In some embodiments, in step 2), the 3' end of the extension primer comprises a (Y)m base unit, wherein base Y is complementary to base X (for example, when base X is G, base Y is C) and m represents the number of bases Y.
In some specific embodiments, m is selected from integers from 4 to 12.
In some specific embodiments, m is selected from 4, 5, 6, 7, 8, 9, 10, 11, or 12.
In some embodiments, the extension primer further comprises, on the 5' side of the (Y)m base unit, one or more bases that are not complementary to the poly(X)n tail.
In some specific embodiments, the extension primer is selected from 20 to 40nt in length.
In some specific embodiments, the length of the extension primer is selected from 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40nt.
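The constraints on the extension primer stated above (Y complementary to X, m from 4 to 12, total length from 20 to 40 nt) can be checked with a short Python sketch. The function name and the 18 nt 5' sequence are hypothetical and serve only as an example.

```python
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def build_extension_primer(five_prime: str, x: str = "G", m: int = 8) -> str:
    """Assemble 5'-[non-complementary bases]-(Y)m-3' and check the stated ranges."""
    if x not in PAIR:
        raise ValueError("X must be one of A, T, C, G")
    if not 4 <= m <= 12:
        raise ValueError("m must be an integer from 4 to 12")
    primer = five_prime + PAIR[x] * m
    if not 20 <= len(primer) <= 40:
        raise ValueError("extension primer must be 20 to 40 nt long")
    return primer

# Hypothetical 18 nt 5' portion; with X = G and m = 8 the primer is 26 nt.
primer = build_extension_primer("GTGACTGGAGTTCAGACG", x="G", m=8)
print(primer, len(primer))
```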
In some embodiments, in said step 3), amplification is performed using amplification primers.
In some embodiments, the amplification primers comprise a first amplification primer and/or a second amplification primer.
In some specific embodiments, the length of the first amplification primer and/or the second amplification primer sequence is each independently selected from 20 to 40nt.
In some specific embodiments, the length of the first amplification primer and/or the second amplification primer sequence is each independently selected from 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40nt.
In some embodiments, in said step 3), said double-stranded adaptor is attached to a single end of said double-stranded DNA molecule.
In some specific embodiments, the double-stranded adaptor comprises: a first adaptor single strand to be ligated to the 5' end of the first single strand DNA molecule; and a second adaptor single strand to be ligated to the 3' -end of the second single strand DNA molecule.
In some specific embodiments, in said step 3), the 3 'end of the first adaptor single strand and the 5' end of the second adaptor single strand in said double-stranded adaptor are blunt ends, and said double-stranded adaptor is linked to said double-stranded DNA molecule by blunt ends.
In some embodiments, the blunt end of the double-stranded adaptor has one or more random complementary base pairs.
In some embodiments, the blunt end of the double-stranded adaptor has 1 to 10 random complementary base pairs.
In some specific embodiments, the blunt end of the double-stranded adaptor has 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 random complementary base pairs.
In some embodiments, the random complementary base pairs of the double-stranded adaptor identify the single-stranded molecule to which the adaptor is attached, functioning as a molecular tag (UMI). The random complementary base pairs can also reduce the ligation bias of the double-stranded adaptor toward different double-stranded molecules.
In some specific embodiments, the molecular tags (UMI) can be used to correct amplification or sequencing errors.
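One common way to use molecular tags for error correction is a per-tag majority vote, sketched below in Python. This is an assumed illustration, not the procedure of the disclosure: the reads are hypothetical, and the tag is assumed to occupy the first four bases of each read.

```python
from collections import Counter, defaultdict

def consensus_by_umi(reads, umi_len=4):
    """Group reads by their leading UMI and build a majority-vote consensus.

    Assumes each read starts with the UMI followed by the insert sequence.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[read[:umi_len]].append(read[umi_len:])
    result = {}
    for umi, seqs in groups.items():
        length = min(len(s) for s in seqs)
        # At each position keep the base seen in most reads sharing this UMI.
        result[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return result

# Three reads share UMI ACGT; one of them carries a single-base error (C -> G).
reads = ["ACGTTTGCA", "ACGTTTGCA", "ACGTTTGGA", "GGCATTGCA"]
consensus = consensus_by_umi(reads)
print(consensus)
```

Reads that share a tag derive from the same original molecule, so a base seen in only a minority of them is treated as an amplification or sequencing error.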
In some embodiments, at the blunt end of the double-stranded adaptor, the 5' end of the second adaptor single strand is ligated to the 3' end of the second single-stranded DNA molecule, whereas the 3' end of the first adaptor single strand is not ligated to the modified 5' end of the first single-stranded DNA molecule. Because the 5' end of the first single-stranded DNA molecule is modified, the 3' end of the first adaptor single strand cannot be joined to it, and a gap remains between the blunt end of the double-stranded adaptor and the 5' end of the first single-stranded DNA molecule. As a result, only the second single-stranded DNA molecule can be amplified in the subsequent amplification.
In some embodiments, the double-stranded adaptor comprises one or more complementary base pairs, e.g., comprising 5 to 20 complementary base pairs, in addition to the random complementary base pairs described above.
In some specific embodiments, the double-stranded adaptor comprises 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 complementary base pairs in addition to the random complementary base pairs described above.
In some embodiments, the double-stranded adaptor further comprises a non-complementary unit at the end remote from the random complementary base.
In some embodiments, the two strands of the non-complementary unit of the double-stranded adaptor each independently comprise 10 to 30 bases.
In some specific embodiments, the two strands of the non-complementary unit of the double-stranded adaptor each independently comprise 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 bases.
In some embodiments, the second amplification primer is reverse complementary to the 3' end of the second adaptor single strand.
In some embodiments, the 5' end of the first adaptor single strand is not complementary to any amplification primer. Thus, even if a primer dimer forms, it cannot be amplified in both directions, which effectively reduces the concentration of primer dimers.
In some embodiments, the method further comprises sequencing the sequencing library of step 3). In some embodiments, the method further comprises performing a bioinformatic analysis on the sequencing data.
In some embodiments, the bioinformatics analysis comprises merging the paired-end reads of the base-quality-controlled sequencing data, excising the poly(X)n tail, and then performing subsequent analysis.
In some embodiments, the subsequent analysis includes length quality control and sequence alignment. The length quality control includes length statistics of the reads, and the sequence alignment includes aligning the sequences remaining after poly(X)n tail excision against the sequences of the probe set. Quality control of the single-stranded DNA primers and probes is completed by analyzing the alignment results and the depth coefficient derived from the sequence counts.
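As an illustration only, the read-processing steps described above (tail excision, length statistics, and comparison with the probe set) might be sketched in Python as follows. The function names, the exact-match comparison, and the assumption that each merged read is oriented like the original probe with a poly(G) tail at its 3' end are hypothetical simplifications and are not taken from the disclosure. A probe whose own 3' end is G would be over-trimmed by this simple approach, so a practical pipeline would need a more careful rule.

```python
from collections import Counter


def trim_poly_tail(read, base="G", min_run=6):
    """Remove a 3'-terminal run of `base` if it is at least `min_run` long.

    min_run=6 mirrors the lower bound of n (6 to 12) given in the text;
    using it as a trimming threshold is an assumption of this sketch.
    """
    stripped = read.rstrip(base)
    run_length = len(read) - len(stripped)
    return stripped if run_length >= min_run else read


def qc_reads(reads, probes, base="G", min_run=6):
    """Return (length histogram, per-probe exact-match counts, off-target count).

    `probes` maps a hypothetical probe name to its expected sequence.
    """
    lengths = Counter()
    hits = Counter({name: 0 for name in probes})
    name_by_seq = {seq: name for name, seq in probes.items()}
    off_target = 0
    for read in reads:
        insert = trim_poly_tail(read, base, min_run)
        lengths[len(insert)] += 1          # length quality control
        name = name_by_seq.get(insert)     # exact comparison with the probe set
        if name is None:
            off_target += 1
        else:
            hits[name] += 1
    return lengths, hits, off_target
```

Exact matching is used here only to keep the sketch short; an aligner that tolerates mismatches would be needed to report synthesis errors rather than merely count them as off-target.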
According to a second aspect of the present disclosure, there is provided an apparatus for constructing a library based on single stranded DNA, the apparatus comprising:
a tail linking unit for forming a poly(X)n tail at the 3' end of a single-stranded DNA molecule to obtain a first single-stranded DNA molecule bearing the poly(X)n tail;
an extension unit for obtaining a double-stranded DNA molecule from the first single-stranded DNA molecule using an extension primer, wherein the double-stranded DNA molecule comprises the first single-stranded DNA molecule and a second single-stranded DNA molecule complementary to the first single-stranded DNA molecule, the extension primer being capable of annealing to the poly(X)n tail at the 3' end of the first single-stranded DNA molecule;
a linker linking unit for ligating a double-stranded linker to the end of the double-stranded DNA molecule distal to the poly(X)n tail; and
an amplification unit for amplifying the second single stranded DNA molecule to obtain amplification products, the amplification products constituting a DNA sequencing library.
In some embodiments, the n in the tail linking unit represents the number of bases X.
In some embodiments, n is selected from integers of 6 to 12.
In some embodiments, n is selected from 6, 7, 8, 9, 10, 11, or 12.
In some embodiments, the X in the tail linking unit is selected from any of the four A, T, C, G bases.
In some specific embodiments, the X is selected from the group consisting of base C or G.
In some specific embodiments, the X is selected from a G base.
In some embodiments, the tail linking unit uses a terminal transferase to add the poly(X)n tail to the 3' end of the single-stranded DNA molecule.
In some embodiments, in the amplification unit, amplification is performed using amplification primers.
In some embodiments, the amplification primers comprise a first amplification primer and/or a second amplification primer.
In some embodiments, the first amplification primer and/or the second amplification primer sequence are each independently selected from 20 to 40nt in length.
In some specific embodiments, the first amplification primer and/or the second amplification primer sequence lengths are each independently selected from 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40nt.
In some embodiments, the 3' end of the extension primer comprises (Y) m base units, wherein base Y is complementary to base X, e.g., when base X is G, base Y is C and m represents the number of bases Y.
In some embodiments, m is selected from integers from 4 to 12.
In some embodiments, m is selected from 4, 5, 6, 7, 8, 9, 10, 11, or 12.
In some embodiments, the extension primer further comprises, 5' of the (Y)m base unit, one or more bases that are not complementary to the poly(X)n tail.
In some embodiments, the extension primer is selected from 20 to 40nt in length.
In some specific embodiments, the length of the extension primer is selected from 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, or 40nt.
In some embodiments, the double-stranded adaptor comprises: a first adaptor single strand to be ligated to the 5' end of the first single strand DNA molecule; and a second adaptor single strand to be ligated to the 3' -end of the second single strand DNA molecule.
In some embodiments, the 3 'end of the first adaptor single strand and the 5' end of the second adaptor single strand are blunt ends, and the adaptor junction unit is linked to the double stranded DNA molecule by the blunt ends.
In some embodiments, the blunt end of the linker connecting unit has one or more random complementary base pairs.
In some embodiments, the blunt end of the linker connecting unit has 1 to 10 random complementary base pairs.
In some embodiments, the blunt end of the linker linking unit has 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 random complementary base pairs.
In some embodiments, these random complementary base pairs of the linker linking unit can identify the ligated single-stranded molecule, functioning as a molecular tag. The random complementary base pairs can also reduce the ligation bias of the double-stranded adaptor toward particular double-stranded molecules.
In some embodiments, the 5' end of the second adaptor single strand is ligated to the 3' end of the second single-stranded DNA molecule. As described above, because the 5' end of the first single-stranded DNA molecule is modified, the 3' end of the first adaptor single strand of the linker linking unit cannot be ligated to the 5' end of the first single-stranded DNA molecule, so a nick remains between the blunt end of the linker linking unit and the 5' end of the first single-stranded DNA molecule. As a result, only the second single-stranded DNA molecule can be amplified in the subsequent amplification.
In some embodiments, the linker linking unit comprises one or more complementary base pairs, e.g., comprising 10 to 30 complementary base pairs, in addition to the random complementary base pairs described above.
In some specific embodiments, the linker linking unit comprises 10, 20, or 30 complementary base pairs in addition to the random complementary base pairs described above.
In some embodiments, the linker linking unit further comprises a non-complementary unit at the end remote from the random complementary base.
In some embodiments, the two strands of the non-complementary unit of the linker linking unit each independently comprise 20 to 60 bases.
In some specific embodiments, the two strands of the non-complementary unit of the linker linking unit each independently comprise 20, 30, 40, 50, or 60 bases.
In some embodiments, the second amplification primer is reverse complementary to the 3' end of the second adaptor single strand.
In some embodiments, the 5' end of the first adaptor single strand is not complementary to any amplification primer. Thus, even if a primer dimer forms, it cannot be amplified bidirectionally, so the concentration of primer dimers is effectively reduced.
In some embodiments, the method further comprises sequencing the sequencing library obtained from the amplification unit and performing a bioinformatic analysis on the sequencing data.
In some embodiments, the bioinformatics analysis comprises merging the paired-end reads of the base-quality-controlled sequencing data, excising the poly(X)n tail, and then performing subsequent analysis.
In some embodiments, the subsequent analysis includes length quality control and sequence alignment. The length quality control includes length statistics of the reads, and the sequence alignment includes aligning the sequences remaining after poly(X)n tail excision against the sequences of the probe set. Quality control of the single-stranded DNA primers and probes is completed by analyzing the alignment results and the depth coefficient derived from the sequence counts.
According to a third aspect of the present disclosure, there is provided the use of the method of the first aspect and/or the device of the second aspect in the quality control of probes or primers.
In some embodiments, the method and/or the device is used to sequence and/or semi-quantitatively or quantitatively analyze one or more probes or primers.
According to the method, quality control of primers and probes is performed by constructing an oligonucleotide sequencing library, without having to evaluate the primers or probes in actual use. Quality control can be completed simply by comparing the off-machine data of the sequencing library with the probe sequences. This yields specific detection information, including the sequence accuracy of the primer or probe set, the length of each sequence, the accuracy of the proportions of different sequences, and the degree of contamination by non-target sequences, thereby saving time and the high cost of actual testing.
The method can effectively control for base synthesis errors in primers and probes. It overcomes the long-standing technical problem that a small number of synthesis errors arising during primer or probe synthesis cannot be detected directly from how the primers or probes perform in use, and it fills a gap in quality control technology based on constructing libraries from the probes themselves.
Drawings
FIG. 1 shows a library construction procedure of primers and probes.
FIG. 2 shows a schematic representation of blunt end nick ligation.
FIG. 3 shows the bioinformatic alignment of primer and probe sequencing sequences.
FIG. 4 shows the sequence detection results for quality control of a probe set purified twice in succession using the same SPE cartridge.
FIG. 5 shows the quality control results of the probe set compared with the depth coefficient obtained in actual capture by the probes. In the figure, the abscissa represents the probe number, and the ordinate represents the number of sequences detected for each probe and the depth coefficient of actual capture by the probe.
FIG. 6 shows a comparison of the effect of the tail of different bases on the results of the library of the present invention.
Detailed Description
In library construction, adaptor ligation is typically performed using T4 DNA ligase. However, T4 DNA ligase has a terminal-base bias: ligation efficiency differs depending on which of the four bases (N) is at the end. For probe set quality control, this library construction bias can make the library construction efficiency inconsistent among the probes of a probe set, introducing errors into the semi-quantification of the probes. In the single-stranded library construction process for probes described here, a double-stranded adaptor with random complementary bases at its blunt end is used for single-ended nick ligation to the DNA molecules, which overcomes the bias of T4 DNA ligase and enables semi-quantitative detection in probe set quality control.
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. The specific embodiments described herein are for purposes of illustration only and are not to be construed as limiting the invention in any way. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure. Such structures and techniques are also described in a number of publications.
Definitions
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly used in the art to which this invention belongs. For the purposes of explaining the present specification, the following definitions will apply, and terms used in the singular will also include the plural and vice versa, as appropriate.
The terms "a" and "an" as used herein include plural referents unless the context clearly dictates otherwise.
In the present disclosure, the term "terminal transferase" refers to an enzyme that is capable of adding deoxynucleotides to one or more 3' ends of a DNA molecule in a template-independent manner. "terminal transferase activity" refers to the terminal transferase activity of any enzyme having that ability.
In the present disclosure, the term "base quality control" includes filtering out the low-quality reads obtained from sequencing to obtain high-quality reads.
In some embodiments, the base quality control further comprises splitting the sequencing data into sequencing data of different sample sources according to the barcode sequence.
In the present disclosure, the term "linker" refers to a nucleic acid that can be attached to an RNA or DNA molecule. The linker may be RNA, single-stranded or double-stranded DNA, or a mixture thereof, and may be attached to the RNA or DNA molecule in single-stranded or double-stranded form. The linker may be a fully complementary double-stranded linker, a hairpin linker (i.e., a molecule that base pairs with itself to form a structure with a double-stranded stem and a loop, wherein the 3' and 5' ends of the molecule are ligated to the 5' and 3' ends of the double-stranded DNA molecule), or a Y-shaped linker. The linker length may be in the range of 15 to 100 bases, e.g., 50 to 70 bases; linkers with lengths outside this range are also possible.
In the present disclosure, the term "A-T ligation" refers to filling in the 3' end of a DNA molecule with a polymerase, adding an A tail, and then ligating a linker comprising a T overhang to the A-tailed DNA molecule. It is a well-known, common form of cohesive-end ligation.
In the present disclosure, the term "single-ended nick ligation" refers to a reaction distinct from conventional double-ended adaptor ligation: the adaptor is ligated to only one end of a DNA molecule; one strand of the double-stranded adaptor cannot be ligated to the double-stranded DNA molecule, leaving a nick, while the other strand is ligated to one strand of the DNA molecule.
In the present disclosure, the term "SPE column purification" refers to one of the purification methods commonly used after DNA synthesis: the synthesized product is bound to the SPE column by solid-phase extraction, impurities are removed with a wash solution, and the successfully synthesized DNA is then eluted with an elution solvent, completing the purification. Compared with conventional PAGE gel purification, SPE column purification is more efficient and takes less time. Because SPE cartridges are expensive, a single cartridge is typically reused to purify multiple batches of probes. During reuse, a small amount of the previous batch of DNA may remain on the cartridge because the cartridge was not washed completely. If such residue is present when the next batch of DNA is purified, it is eluted into the next batch, so that the probe set or primer set contains non-target sequences. The oligonucleotides of these non-target sequences may impair the practical use of the probe set or primer set, for example by causing non-uniform data or an excessive number of non-target sequences.
In the method of the present disclosure, a poly(X)n tail is added to the 3' end of single-stranded DNA, double-stranded DNA is obtained by extension from a reverse-complementary sequence, and a double-stranded adaptor is ligated to the other end of the double-stranded DNA, yielding a double-stranded DNA library with known sequences at both ends. Compared with poly(C)n, poly(A)n, and poly(T)n tails, a poly(G)n tail makes it easier to control the amount and rate of addition, so that sequencing can use more uniform read lengths. See FIG. 1.
Furthermore, the double-stranded adaptor is ligated to the extension product by single-ended nick ligation, i.e., to the 3' end of only one strand of the double-stranded DNA. In the resulting double-stranded DNA product, one strand carries known sequences at both ends, while the other strand contains only the original single-stranded DNA molecule and the poly(G) tail. After ligation, amplification uses the strand with known sequences at both ends as the template, while the other strand is neither amplified nor detected by sequencing. See FIG. 2.
By providing random complementary base pairs at the adaptor terminus, the various base pairs at the terminus are evenly represented, which reduces the terminal-base bias of the ligation reaction compared with conventional A-T ligation.
The double-stranded adaptor includes a Y-shaped adaptor: the sequences at the ligating end are complementary, and the non-ligating ends of the forked portion are non-complementary. One strand of the double-stranded adaptor contains a sequence complementary to an amplification primer and is ligated to the 3' end of one strand of the double-stranded DNA; the other strand is not complementary to any amplification primer sequence and is not ligated to the 5' end of the other strand of the double-stranded DNA. That is, the double-stranded adaptor is joined to the double-stranded DNA through only one strand, leaving a nick. Even if double-stranded adaptors ligate to each other to form adaptor dimers, these cannot be amplified in the subsequent amplification reaction, so the proportion of non-specific products is effectively reduced.
Conventionally, only the fragment length range of primers and probes can be checked after synthesis, by polyacrylamide gel electrophoresis. In contrast, the method of the present disclosure can determine the exact nucleic acid length and nucleic acid sequence of individual oligonucleotides.
Compared with other library construction methods, the method of the present disclosure ligates the double-stranded adaptor to the extension product by blunt-end single-ended nick ligation at the 5' end, which is more efficient than single-stranded ligation and has lower bias, giving a more uniform probe library and enabling semi-quantitative detection in probe set quality control.
The method solves the problem that a single-stranded adaptor cannot be ligated directly to the end of a probe bearing a non-phosphate modification, which would otherwise prevent library construction.
The workflow of the invention is simpler and more convenient. No targeted-capture quality control test of the probes is needed, which saves considerable experiment time and cost and also eliminates unstable results caused by an experimenter's unfamiliarity with the complicated hybridization capture workflow.
Examples and figures are provided below to aid in the understanding of the invention. It is to be understood that these examples and drawings are for illustrative purposes only and are not to be construed as limiting the invention in any way. The actual scope of the invention is set forth in the following claims. It will be understood that any modifications and variations may be made without departing from the spirit of the invention.
Examples
Example 1: Quality control of the probe purification method, i.e., monitoring probe molecules remaining on the SPE column after probe purification
In this example, the same SPE cartridge was used for two successive purifications, each purifying 48 different 5'-biotin-modified probes. The probes from each of the two purifications were quality controlled by the method of the invention, as follows:
1. Poly-dG tail addition:
1.1 100 ng of the probe set to be quality controlled was placed in a 0.2 mL PCR tube and brought to 27.6 μL with double-distilled water.
1.2 After thawing the reagents, the reaction solution was prepared with reference to Table 1.
TABLE 1 Poly-dG tail addition reaction system
Component | Single reaction volume (μL) |
SPE-column-purified probe to be tested (100 ng) + water | 27.6 |
10x terminal transferase reaction buffer | 4 |
2.5 mM CoCl2 | 4 |
1 mM dGTP | 4 |
20 U/μL terminal transferase | 0.4 |
Total volume | 40 |
The Terminal Transferase used in this example was purchased from New England Biolabs; dGTP was purchased from Takara.
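The working concentrations implied by Table 1 follow from simple dilution arithmetic (stock concentration × volume added / total volume). The short Python sketch below is offered only as a convenience check of the table; the dictionary layout and variable names are illustrative, and the stock concentrations are read directly from the table rows.

```python
# (stock concentration, unit, volume in uL), values copied from Table 1
components = {
    "probe + water": (None, None, 27.6),
    "terminal transferase buffer": (10.0, "x", 4.0),
    "CoCl2": (2.5, "mM", 4.0),
    "dGTP": (1.0, "mM", 4.0),
    "terminal transferase": (20.0, "U/uL", 0.4),
}

# Total reaction volume should reproduce the 40 uL stated in the table
total_ul = sum(volume for _, _, volume in components.values())


def final_concentration(stock, volume_ul, total_volume_ul):
    """Concentration after dilution into the full reaction volume."""
    return stock * volume_ul / total_volume_ul


final = {
    name: (final_concentration(stock, volume, total_ul), unit)
    for name, (stock, unit, volume) in components.items()
    if stock is not None
}

# Total enzyme units in the reaction (20 U/uL x 0.4 uL)
enzyme_units = components["terminal transferase"][0] * components["terminal transferase"][2]

for name, (value, unit) in final.items():
    print(f"{name}: {value:g} {unit}")
print(f"total volume: {total_ul:g} uL, enzyme: {enzyme_units:g} U")
```

Under these figures the reaction contains 1x buffer, 0.25 mM CoCl2, 0.1 mM dGTP, and 8 U of terminal transferase in 40 μL.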
1.3 The reaction solution was mixed on a vortex mixer, briefly centrifuged in a mini centrifuge, and placed in a PCR instrument for reaction under the conditions shown in Table 2.
TABLE 2 Poly-dG tail addition reaction conditions
Temperature | Time |
Heated lid (42 ℃) | On |
37 ℃ | 20 minutes |
4 ℃ | Hold |
After the reaction was completed, the product was purified using 100 μL of Axygen magnetic beads, and the purified product was redissolved in 20 μL of double-distilled water.
2. Primer extension:
2.1 After thawing the reagents, the reaction solution was prepared with reference to Table 3. The extension primer sequence was 5'-3': GAACGACATGGCTACGATCCGACTTCCCCCCCC (SEQ ID NO: 1).
TABLE 3 Primer binding reaction system
Component | Single reaction volume (μL) |
Poly-dG-tailed DNA (purified product of step 1.3) | 20 |
20 μM extension primer | 1.5 |
Total volume | 21.5 |
2.2 The reaction solution was mixed on a vortex mixer, briefly centrifuged in a mini centrifuge, and placed in a PCR instrument for reaction under the conditions shown in Table 4.
TABLE 4 Primer binding reaction conditions
Temperature | Time |
Heated lid (80 ℃) | On |
75 ℃ | 5 minutes |
25 ℃ | 10 minutes |
25 ℃ | Hold |
2.3 After the reaction was completed, the reagents shown in Table 5 were added to the reaction solution.
TABLE 5 Primer extension reaction system
Component | Single reaction volume (μL) |
DNA (product of step 2.2) | 21.5 |
10 mM dNTP | 0.75 |
10x T4 PNK buffer | 5 |
3 U/μL T4 DNA polymerase | 1 |
ddH2O | 21.75 |
Total volume | 50 |
The T4 DNA Polymerase used in this example was purchased from Enzymatics; the 10X T4 PNK Buffer was purchased from Enzymatics.
2.4 The reaction solution was mixed on a vortex mixer, briefly centrifuged in a mini centrifuge, and placed in a PCR instrument for reaction under the conditions shown in Table 6.
TABLE 6 Primer extension reaction conditions
Temperature | Time |
Heated lid (70 ℃) | On |
30 ℃ | 15 minutes |
65 ℃ | 15 minutes |
25 ℃ | Hold |
3. Linker ligation:
3.1 After thawing the reagents, the reagents shown in Table 7 were added sequentially to the reaction product. The linker sequences used were: top 5'-3': CAAGGTTCGAATCGGCCTCCGACTTNN (SEQ ID NO: 2); and bottom 5'-3': NNAAGTCGGAGGCCAAGCGGTCTTAGGAAGAC (SEQ ID NO: 3).
TABLE 7 Ligation reaction system
Component | Single reaction volume (μL) |
DNA (product of step 2.4) | 50 |
15 μM linker | 1 |
2x rapid ligation buffer | 44 |
600 U/μL T4 DNA ligase | 5 |
Total volume | 100 |
The T4 DNA Ligase used in this example was purchased from Enzymatics; the 2X Rapid Ligation Buffer was purchased from Enzymatics.
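To make the structure of the linker in step 3.1 easier to see, the following Python sketch checks how much of the top strand (SEQ ID NO: 2) pairs with the bottom strand (SEQ ID NO: 3) at the blunt end and how long the two non-complementary arms are. It is a reading aid under stated assumptions, not part of the protocol: the helper names are hypothetical, and each N is assumed to be synthesized opposite a complementary N, in line with the "random complementary base pairs" described in the text.

```python
TOP = "CAAGGTTCGAATCGGCCTCCGACTTNN"          # SEQ ID NO: 2, written 5'->3'
BOTTOM = "NNAAGTCGGAGGCCAAGCGGTCTTAGGAAGAC"  # SEQ ID NO: 3, written 5'->3'

# N is treated as pairing with N (assumption: random bases form matched pairs)
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def blunt_end_duplex_length(top, bottom):
    """Largest k for which the last k bases of `top` pair with the first k of `bottom`."""
    best = 0
    for k in range(1, min(len(top), len(bottom)) + 1):
        if reverse_complement(top[-k:]) == bottom[:k]:
            best = k
    return best


duplex_bp = blunt_end_duplex_length(TOP, BOTTOM)
random_bp = len(TOP) - len(TOP.rstrip("N"))   # random pairs at the blunt end
tag_variants = 4 ** random_bp                 # distinct random blunt ends
top_arm_nt = len(TOP) - duplex_bp             # unpaired 5' arm of the top strand
bottom_arm_nt = len(BOTTOM) - duplex_bp       # unpaired 3' arm of the bottom strand

print(duplex_bp, random_bp, tag_variants, top_arm_nt, bottom_arm_nt)
```

With the sequences as printed, the paired region is 14 bp (12 fixed pairs plus 2 random pairs, giving 16 possible blunt ends), and the unpaired arms are 13 nt and 18 nt, consistent with a Y-shaped linker.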
3.2 The reaction solution was mixed on a vortex mixer, briefly centrifuged in a mini centrifuge, and placed in a PCR instrument for reaction under the conditions shown in Table 8.
TABLE 8 Ligation reaction conditions
Temperature | Time |
Heated lid | Off |
23 ℃ | 15 minutes |
4 ℃ | Hold |
The product was purified using 60 μL of Axygen magnetic beads, and the purified product was redissolved in 20 μL of double-distilled water.
4. Library amplification:
4.1 after thawing the reagents, a PCR reaction system was prepared according to the amounts of reagents shown in Table 9.
TABLE 9 library amplification reaction System
The Pro Amplification Mix used in this example was purchased from Yeasen.
4.2 The reaction solution was mixed on a vortex mixer, briefly centrifuged in a mini centrifuge, and placed in a PCR instrument for reaction under the conditions shown in Table 10.
TABLE 10 Library amplification reaction conditions
The product was purified using 45 μL of Axygen magnetic beads, and the purified product was redissolved in 30 μL of TE buffer.
5. Library sequencing and data analysis:
After DNB preparation from the library using the MGI DNB preparation kit, sequencing was performed on a DNBSEQ-T7 sequencer. The off-machine sequencing data were aligned against the probe set sequences using the analysis method of the invention.
Experimental results: quality control of 5'-end-modified probes by the method of the invention
In this example, the method of the invention was used to quality control the probe sequences directly. Two different probe sets with 5'-end modifications, purified successively on the same SPE column, were each quality controlled, and the sequences found in each actual probe library were compared with the target probe sequences. The results are shown in FIG. 4 and Table 11: the first batch of probes was purified on the SPE column, the column was washed, and the second batch of probes was then purified; about 0.2% of first-batch probes remained in the resulting second batch.
TABLE 11 Detection results for different batches of purified probes
Item | First batch of probes | Second batch of probes | First batch/second batch |
Average depth (first batch probe purification) | 359484.6 | 0 | NA |
Average depth (second batch probe purification) | 1304.2 | 589717.4 | 0.22% |
Quality control of the probe sets from the two successive SPE column purifications clearly shows that first-batch probes remain in the purified second batch. The method of the invention can therefore effectively monitor whether, and to what extent, probes remain on an SPE column after probe purification, thereby providing quality control of the SPE column purification procedure.
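The carry-over figure in the last column of Table 11 is the ratio of the two average depths measured in the second purification. A minimal Python check of that arithmetic is given below; the function name is illustrative, and the inputs are the two depths reported in the table.

```python
def carryover_percent(residual_depth, target_depth):
    """Residual (previous-batch) depth as a percentage of the target-batch depth."""
    return 100.0 * residual_depth / target_depth


# Average depths from the second purification in Table 11
pct = carryover_percent(1304.2, 589717.4)
print(f"{pct:.2f}%")  # prints 0.22%
```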
Example 2: Comparison of probe quality control with the actual capture efficiency of the probes
This example presents a comparative quality control of a 5'-biotin-modified 17544-probe set for tumor monitoring:
1. The quality control procedure of the invention was carried out as in Example 1;
2. For the actual hybridization capture quality control of the probes, a prepared human genome library was hybridization-captured using the primary hybridization reagent (product number TC 0023) and the probes, as follows:
2.1 Library evaporation and concentration:
1000 ng of the human genome library was taken, and the evaporation-concentration system was prepared in a 1.5 mL centrifuge tube according to Table 12.
TABLE 12 Library evaporation-concentration system
Component | Volume (μL) |
Human genome library | 10 |
Cot-1 DNA (1 μg/μL) | 5 |
MGI blocker (6 μg/μL) | 2 |
Total volume | 15 |
The prepared mixture was evaporated to dryness in a vacuum concentrator at 60 ℃.
2.2 Hybridization reaction
The hybridization buffer was prepared according to Table 13 and mixed, then added to the dried nucleic acid to be hybridized from step 2.1. After incubation at room temperature for 10 minutes, the solution was transferred to a 0.2 mL centrifuge tube, and 4 μL of the probe to be quality controlled was added. After mixing, the hybridization reaction was performed with the following program: 95 ℃ for 10 minutes, then 65 ℃ for 16 hours.
TABLE 13 Hybridization reaction system
Component | Volume (μL) |
Boke 2X hybridization buffer | 8.5 |
Boke hybridization enhancer | 2.7 |
Nuclease-free water | 1.8 |
Total volume | 13 |
2.3 Magnetic bead capture
2.3.1. 10 μL of M270 streptavidin beads were washed with 1X Beads Wash Buffer, and the supernatant was discarded.
2.3.2. 17 μL of the hybridization reaction solution was transferred to the magnetic beads, mixed, and incubated at 65 ℃ for 45 minutes.
2.3.3. The mixture was vortexed for 3 sec every 15 minutes to keep the beads in suspension.
2.4 Post-hybridization capture washes
2.4.1 Hot wash (65 ℃)
100 μL of 1X Wash Buffer 1 preheated to 65 ℃ was added to the product of step 2.3.3. After mixing, all of the liquid was transferred to a 1.5 mL centrifuge tube and placed on a magnetic rack, and the supernatant was removed.
200 μL of 1X Wash Buffer S preheated to 65 ℃ was added to the centrifuge tube, which was vortexed and incubated at 65 ℃ for 5 minutes with shaking at 1200 rpm, and the supernatant was removed. The beads were washed once more with 200 μL of preheated 1X Wash Buffer S, and the supernatant was removed.
2.4.2 Room-temperature wash
180 μL of 1X Wash Buffer I was added to the hot-washed product, the beads were resuspended by vortexing, and the supernatant was removed after brief centrifugation. Next, 180 μL of 1X Wash Buffer II was added, the beads were resuspended by vortexing, and the supernatant was removed after brief centrifugation. Then 180 μL of 1X Wash Buffer III was added, the beads were resuspended by vortexing, and the supernatant was removed after brief centrifugation. Finally, the washed magnetic bead product was resuspended in 20 μL of nuclease-free water.
2.5. Post-hybridization amplification:
To the system from step 2.4.2 after washing, 25 μL of 2X Kapa Hotstart Ready Mix and 2.5 μL of F/R Index primer (20 μM) were added and mixed thoroughly, followed by cycling amplification: 98 ℃ for 1 minute; 15 cycles of 98 ℃ for 10 s, 60 ℃ for 30 s, and 72 ℃ for 30 s; 72 ℃ for 5 minutes; hold at 4 ℃. After amplification, 50 μL of magnetic beads was added to the amplification product for bead purification, and the purified product was dissolved in 20 μL of TE Buffer.
2.6. Sequencing and data analysis
After DNB preparation from the library using the MGI DNB preparation kit, the library was sequenced on a DNBSEQ-T7 sequencer. The sequencing results of the library constructed by the method of the invention were aligned and analyzed using the bioinformatics method (see FIG. 3) to obtain the number of successfully aligned sequences for each probe. Separately, the sequencing data of the target-capture library obtained in actual use of the probes were aligned to the genome, the depth of each probe capture region and the average depth were calculated, and a depth coefficient (depth of a given probe capture region / average depth × 100) was obtained.
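The depth coefficient defined above (depth of a given probe capture region / average depth × 100) can be computed as in the following Python sketch. The text does not spell out how the average depth is taken; the sketch assumes it is the arithmetic mean over all probe capture regions. The probe names and depth values are invented purely for illustration.

```python
def depth_coefficients(region_depths):
    """Map each probe region to depth / mean depth x 100.

    Assumption: the average depth is the arithmetic mean over all regions.
    """
    mean_depth = sum(region_depths.values()) / len(region_depths)
    return {
        probe: depth / mean_depth * 100
        for probe, depth in region_depths.items()
    }


# Hypothetical depths for three probe capture regions
example_depths = {"probe_1": 50.0, "probe_2": 100.0, "probe_3": 150.0}
coefficients = depth_coefficients(example_depths)
print(coefficients)
```

A coefficient of 100 corresponds to a region captured at exactly the average depth; values well below 100 flag probes with poor capture.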
Experimental results: the number of successfully aligned sequences per probe obtained with the method of the invention was compared with the depth coefficient obtained in actual capture use of the probes; the resulting trend comparison is shown in FIG. 5.
The results show that probes with lower sequence counts also have lower depth coefficients in actual hybridization capture (the depth coefficient of a probe region reflects the probe's capture performance, and a low depth coefficient indicates poor capture). The results of the two quality control methods are essentially consistent, indicating that the method of the invention can effectively quality control the probes in a probe set.
Example 3: effect of different poly tails on the construction of the library of the present invention
Example 3 followed essentially the same experimental procedure as Example 1, except that dGTP was replaced with dATP, dTTP, or dCTP, and the poly(C)8 tail of the extension primer was correspondingly changed to poly(T)8, poly(A)8, or poly(G)8.
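The pairing between the tailing nucleotide and the 3' run of the extension primer can be summarized in a few lines of Python. In this sketch the 5' portion of the primer is taken from SEQ ID NO: 1 of Example 1 with its eight terminal C bases removed; reusing that same 5' portion for every tail type is an assumption based on the statement that only the tail of the primer was changed, and the function name is illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# 5' portion of SEQ ID NO: 1 (Example 1) without its 3' C8 run
PRIMER_5PRIME = "GAACGACATGGCTACGATCCGACTT"


def extension_primer(tail_base, m=8):
    """Primer whose 3' (Y)m run is complementary to a poly(tail_base) tail."""
    return PRIMER_5PRIME + COMPLEMENT[tail_base] * m


# Tail added by terminal transferase -> matching extension primer
for tail in "GATC":
    print(f"poly({tail}) tail: {extension_primer(tail)}")
```

For a G tail this reproduces the primer of Example 1; the A, T, and C tails give primers ending in T8, A8, and G8, respectively.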
Experimental results: the library fragments were quality controlled using a Qsep capillary electrophoresis instrument (Qsep 100, Beckmann); the comparison of the resulting library fragments is shown in FIG. 6. The peaks at 20 and 1000 in FIG. 6 are the 20 bp and 1000 bp nucleic acid size standards (markers), respectively.
The results show that tails of all four bases can be used in the library construction method of the invention and for quality control of primers and probes. The library fragment size ranges for the A and T tails are broad, mainly because the tailing length is difficult to control. Library fragments for the G and C tails are more concentrated, which is more favorable for controlling the tail length in this scheme; the G tail performs relatively better and is best suited to the scheme.
The technical scheme of the invention is not limited to the specific embodiment, and all technical modifications made according to the technical scheme of the invention fall within the protection scope of the invention.
Claims (10)
1. A method of constructing a DNA library, comprising the steps of:
1) Adding a poly(X)n tail at the 3' end of a single-stranded DNA molecule to obtain a first single-stranded DNA molecule with a poly(X)n tail;
2) Obtaining a double-stranded DNA molecule based on said first single-stranded DNA molecule using an extension primer, wherein said double-stranded DNA molecule comprises said first single-stranded DNA molecule and a second single-stranded DNA molecule complementary to said first single-stranded DNA molecule, said extension primer being capable of annealing to the poly(X)n tail at the 3' end of said first single-stranded DNA molecule; and
3) Ligating a double-stranded adaptor to the end of the double-stranded DNA molecule distal to the poly(X)n tail, and amplifying the second single-stranded DNA molecule to obtain an amplification product, wherein the amplification product constitutes a DNA sequencing library.
2. The method according to claim 1, wherein in step 1) the length of the single-stranded DNA molecule is selected from 10 to 150 nt, preferably from 20 to 120 nt; and/or,
the 5' end of the single stranded DNA molecule comprises a non-phosphorylated modification, preferably selected from the group consisting of amino modification, diphenylcyclooctyne modification, biotin modification, desthiobiotin, thiol modification, dithiol modification, ferrocene modification, tetrahydrofuran modification, thio modification, phosphorothioate modification, digoxin modification, cholesterol modification, azobenzene modification, methylene blue modification, binaphthyl modification or ruthenium modification, more preferably selected from the group consisting of thio modification, amino modification, thiol modification or biotin modification.
3. The method according to claim 1, wherein in said step 1), the poly(X)n tail is added to the 3' end of the single-stranded DNA molecule using a terminal transferase;
preferably, said n represents the number of bases X, said n being selected from integers from 6 to 12, preferably from 6, 7, 8, 9, 10, 11 or 12;
preferably, X is selected from any one of bases A, T, C or G, preferably from base C or G, more preferably from base G.
4. The method according to claim 1, wherein in said step 2), the 3' end of said extension primer comprises a (Y)m base unit, wherein base Y is complementary to base X, and wherein m is selected from integers from 4 to 12;
preferably, the 5' end of the (Y)m base unit of the extension primer further comprises one or more bases that are not complementary to the poly(X)n tail;
preferably, the length of the extension primer is selected from 20 to 40nt.
5. The method according to claim 1, wherein in the step 3), amplification is performed using an amplification primer,
preferably, the amplification primers comprise a first amplification primer and a second amplification primer;
preferably, the length of the first and second amplification primer sequences is each independently selected from 20 to 40nt.
6. The method according to claim 1, wherein in the step 3), the double-stranded adaptor comprises: a first adaptor single strand to be ligated to the 5' end of the first single strand DNA molecule; and a second adaptor single strand to be ligated to the 3' end of the second single strand DNA molecule;
preferably, the 3 'end of the first adaptor single strand and the 5' end of the second adaptor single strand are blunt ends, and the double-stranded adaptor is connected to the double-stranded DNA molecule through the blunt ends.
7. The method according to claim 6, characterized in that in step 3) the blunt end of the double-stranded adaptor has one or more random complementary base pairs, preferably 1-10 random complementary base pairs, more preferably 2-8 random complementary base pairs, even more preferably 1-4 random complementary base pairs, and still more preferably 1 random complementary base pair; and,
the 5 'end of the second adaptor single strand is linked to the 3' end of the second single strand DNA molecule, and the 3 'end of the first adaptor single strand is not linked to the modified 5' end of the first single strand DNA molecule.
8. The method according to claim 1, further comprising the step of sequencing and/or bioinformatic analysis of the sequencing library obtained in step 3).
9. An apparatus for constructing a DNA library, the apparatus comprising:
a tailing unit for adding a poly(X)n tail at the 3' end of a single-stranded DNA molecule to obtain a first single-stranded DNA molecule with a poly(X)n tail;
an extension unit for obtaining a double-stranded DNA molecule based on the first single-stranded DNA molecule using an extension primer, wherein said double-stranded DNA molecule comprises said first single-stranded DNA molecule and a second single-stranded DNA molecule complementary to said first single-stranded DNA molecule, said extension primer being capable of annealing to the poly(X)n tail at the 3' end of said first single-stranded DNA molecule;
an adaptor ligation unit for ligating a double-stranded adaptor to the end of the double-stranded DNA molecule distal to the poly(X)n tail; and
an amplification unit for amplifying the second single stranded DNA molecule to obtain amplification products, the amplification products constituting a DNA sequencing library.
10. Use of the method of any one of claims 1-8 and/or the device of claim 9 for quality control of probes or primers; preferably, the method and/or the device is used for sequencing one or more probes or primers and/or for semi-quantifying or quantifying one or more probes or primers.
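The tailing, extension and adaptor ligation steps recited in the claims can be followed at the sequence level with a small in-silico sketch. This is an idealized illustration under stated assumptions, not the patented procedure itself: every function name is hypothetical, all sequences are arbitrary placeholders rather than sequences from the patent, the tail is assumed to have exactly n bases (real terminal transferase tailing gives variable lengths), the (Y)m unit is assumed to anneal to the 3'-most m bases of the tail with m = n, and the amplification step is not modeled.

```python
# Idealized sequence-level sketch: tailing -> primer extension -> adaptor ligation.
# All sequences used here are arbitrary placeholders (assumptions).

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def add_tail(ssdna: str, base: str = "G", n: int = 8) -> str:
    """Step 1: append an idealized poly(X)n tail to the 3' end."""
    if len(base) != 1 or base not in "ACGT":
        raise ValueError("base must be one of A, C, G, T")
    if not 6 <= n <= 12:
        raise ValueError("n is outside the preferred range of 6 to 12 (claim 3)")
    return ssdna + base * n


def extend(first_strand: str, primer: str, m: int) -> str:
    """Step 2: anneal the 3'-most m bases of the primer to the 3' end of the
    first strand and extend; returns the second strand written 5'->3'."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    if primer[-m:] != reverse_complement(first_strand[-m:]):
        raise ValueError("primer 3' unit is not complementary to the tail")
    return primer + reverse_complement(first_strand[:-m])


def ligate_adaptor(second_strand: str, adaptor_strand: str) -> str:
    """Step 3 (ligation only): join the 5' end of the adaptor strand to the
    3' end of the second strand, i.e. at the end distal to the tail."""
    return second_strand + adaptor_strand


def library_template(probe: str, primer_5prime: str, adaptor_strand: str,
                     base: str = "G", n: int = 8) -> str:
    """Run the three steps and return the adaptor-ligated second strand."""
    first = add_tail(probe, base, n)
    primer = primer_5prime + reverse_complement(base) * n  # assumes m == n
    second = extend(first, primer, m=n)
    return ligate_adaptor(second, adaptor_strand)


if __name__ == "__main__":
    print(library_template("ACGTTGCAAT", "CATGACTGGATCCAGT", "AGTCGGATCGTAGCTA"))
```

In this model the resulting strand carries a known sequence on each flank (the 5' part of the extension primer and the adaptor strand), which is where a pair of amplification primers would be expected to bind.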
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311225560.2A CN117512063A (en) | 2023-09-21 | 2023-09-21 | DNA library construction method, device and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311225560.2A CN117512063A (en) | 2023-09-21 | 2023-09-21 | DNA library construction method, device and application thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117512063A (en) | 2024-02-06 |
Family
ID=89759421
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311225560.2A Pending CN117512063A (en) | 2023-09-21 | 2023-09-21 | DNA library construction method, device and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117512063A (en) |
- 2023
  - 2023-09-21 CN CN202311225560.2A patent/CN117512063A/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6348313B1 (en) | Sequencing of nucleic acids | |
US20140080126A1 (en) | Quantification of nucleic acids and proteins using oligonucleotide mass tags | |
US20020137057A1 (en) | Rapid, quantitative method for the mass spectrometric analysis of nucleic acids for gene expression and genotyping | |
US7867714B2 (en) | Target-specific compomers and methods of use | |
US9181554B2 (en) | Methods for detecting a target nucleotide sequence in a sample utilising a nuclease-aptamer complex | |
KR20170133270A (en) | Method for preparing libraries for massively parallel sequencing using molecular barcoding and the use thereof | |
CN112410331A (en) | Linker with molecular label and sample label and single-chain library building method thereof | |
WO2013079649A1 (en) | Method and kit for characterizing rna in a composition | |
CN111471746A (en) | NGS library preparation joint for detecting low mutation abundance sample and preparation method thereof | |
US20170362641A1 (en) | Dual polarity analysis of nucleic acids | |
CN114657232A (en) | Universal blocking reagent for improving target capture efficiency and application thereof | |
KR20220130592A (en) | Highly sensitive methods for accurate parallel quantification of nucleic acids | |
US6312904B1 (en) | Characterizing nucleic acid | |
CN116536308A (en) | Sequencing sealant and application thereof | |
CN117512063A (en) | DNA library construction method, device and application thereof | |
CN112858693A (en) | Biomolecule detection method | |
WO2022069039A1 (en) | METHOD OF PREPARATION OF cDNA LIBRARY USEFUL FOR EFFICIENT mRNA SEQUENCING AND USES THEREOF | |
EP3309252B1 (en) | On-array ligation assembly | |
CN116790724B (en) | Method for detecting single base difference and gene chip | |
CN108841919A (en) | A kind of inserted type SDA method prepares probe | |
CN118086457A (en) | Construction method and application of DNA library | |
CN112646809A (en) | Nucleic acid sequence, method and kit for detecting enzyme end repair capacity | |
CN1341749A (en) | Nucleic acid target molecule fragment selection and amplification method | |
ZA200401157B (en) | Amplification of nucleic acid fragments using nicking agents. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||