WO2023116376A1 - Method for labeling and analyzing single-cell nucleic acids - Google Patents
Method for labeling and analyzing single-cell nucleic acids
- Publication number
- WO2023116376A1 (application PCT/CN2022/135478)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- primer
- nucleic acid
- region
- oligonucleotide
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/06—Biochemical methods, e.g. using enzymes or whole viable microorganisms
Definitions
- This application relates to the technical field of single-cell transcriptome sequencing and biomolecular spatial information detection. Specifically, the present application relates to a method for positioning and labeling nucleic acid molecules in a single-cell sample, and a method for constructing a single-cell transcriptome sequencing library. Furthermore, the present application also relates to kits for carrying out said methods.
- Single-cell transcriptome sequencing technology is an important tool for identifying cellular heterogeneity.
- The importance of single-cell transcriptome sequencing technology has driven its rapid development in terms of throughput and ease of operation.
- The development of single-cell transcriptome sequencing technology has prompted the launch, with large-scale international funding, of the Human Cell Atlas project, a reference system of human cell maps.
- The launch of the Human Cell Atlas project has placed higher demands on the throughput of single-cell transcriptome sequencing technology.
- Single-cell transcriptome sequencing technology is also used by medical researchers to find the small number of "tumor stem cells" in cancers, in order to identify drugs and therapies against malignant tumors. Because malignant tumor cells are relatively rare, the technology must have a high cell utilization rate or capture rate to avoid losing the transcriptome information of this small number of cells.
- Existing single-cell transcriptome sequencing technologies fall into two main categories. One is low-throughput sequencing based on multi-well plates, in which a single cell is allocated to a single well of a multi-well plate, such as Smart-seq and CEL-seq. The other is magnetic bead-based sequencing, in which a cell is co-encapsulated with labeled magnetic beads in micro-droplets or micro-wells by means of microfluidics, such as 10x Chromium, Drop-seq, Seq-well and other technologies.
- Among existing single-cell transcriptome sequencing technologies, 10x Chromium has the highest throughput: a single run handles 5,000-7,000 cells, up to 10,000 cells, and, depending on the cell type, the cell capture rate is 30% to 60%.
- Gel beads carrying label molecules or barcode molecules (barcodes) enter the microfluidic system at a uniform speed;
- the gel beads are combined and form GEMs (Gel Bead in Emulsion) in the oil phase.
- Each cell is combined with a gel bead to form a GEM; this method can therefore achieve single-cell transcriptome sequencing.
- The formation of GEMs follows a Poisson distribution; that is, a single GEM may contain no cell or more than one cell.
- Because the sequencing data generated by such a GEM does not correspond to the state of a single cell, it cannot be used downstream and must be filtered out algorithmically. Limited by the number of micro-droplets formed in the oil phase, the throughput of this technology can hardly exceed the 10,000-cell level; at the same time, owing to the Poisson distribution, its cell capture rate can theoretically reach at most 60%. Therefore, when single-cell transcriptome sequencing must be performed on 100,000 or more cells, or when rare cells must be captured and sequenced, this technology still has major shortcomings and cannot meet actual needs. There is thus a need in the art for new single-cell transcriptome sequencing methods with higher cell capture rates.
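The Poisson behaviour described above can be illustrated with a short calculation. The following is only a minimal sketch: the mean number of cells per droplet (`lam`) is an assumed parameter that the text does not specify, and the function names are hypothetical. It splits droplets into empty, single-cell, and multi-cell fractions and is not intended to reproduce the 30% to 60% capture rates quoted above.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a droplet holds exactly k cells when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def occupancy(lam):
    """Fractions of droplets that are empty, hold one cell, or hold several."""
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
    }


if __name__ == "__main__":
    # Assumed loading levels (mean cells per droplet), chosen for illustration.
    for lam in (0.05, 0.1, 0.5, 1.0):
        frac = occupancy(lam)
        print(
            f"lam={lam}: empty={frac['empty']:.3f} "
            f"single={frac['single']:.3f} multiple={frac['multiple']:.3f}"
        )
```

Under this model, lowering `lam` reduces multi-cell droplets but leaves most droplets empty, which is the trade-off between usable data and throughput that the passage describes.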
- the present application provides a method for positionally marking nucleic acid molecules of a cell sample, and a method for constructing a single-cell transcriptome sequencing library based on the method. Furthermore, the present application also relates to kits for carrying out said methods.
- the application provides a method of generating a population of labeled nucleic acid molecules, comprising the steps of:
- the sample is a single cell suspension; the cells (for example, on their surface) contain the first binding molecule;
- the nucleic acid array includes a solid support, the solid support (for example, on its surface) contains a first label molecule, and the first binding molecule can form an interaction pair with the first label molecule;
- the solid support also includes a plurality of micro-dots, the size of the micro-dots (such as equivalent diameter) is less than 5 μm, and the center distance between adjacent micro-dots is less than 10 μm; each micro-dot is coupled to an oligonucleotide probe, and each oligonucleotide probe comprises at least one copy; the oligonucleotide probe comprises or consists of, in the 5' to 3' direction: consensus sequence X1, tag sequence Y and consensus sequence X2, wherein,
- Oligonucleotide probes coupled with different microdots have different label sequences Y;
- each cell occupies at least one micro-dot in the nucleic acid array (i.e., each cell is separately in contact with at least one micro-dot in the nucleic acid array), and the first binding molecule of the cell and the first labeling molecule of the solid support form an interaction pair;
- the center-to-center distance between said adjacent microdots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and,
- the microdots have a size (e.g., equivalent diameter) of less than 5 μm, less than 1 μm, less than 0.3 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm.
- the center-to-center distance between the adjacent micro-dots is 0.5 μm to 1 μm, such as 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm.
- the microdots have a size (e.g., equivalent diameter) of 0.001 μm to 0.5 μm (e.g., 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, 0.2 μm to 0.3 μm).
- the first binding molecule can form a specific interaction pair or a non-specific interaction pair with the first label molecule.
- the interaction pair is selected from positive and negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), molecular pairs capable of click chemistry reactions (e.g., alkynyl-containing groups and azido-containing compounds), N-hydroxysulfosuccinyl (NHS) esters and amino-containing compounds, or any combination thereof.
- the first labeling molecule is polylysine, and the first binding molecule is a protein capable of binding to polylysine; the first labeling molecule is an antibody, and the first binding molecule is an antigen capable of binding to the antibody; the first labeling molecule is an amino-containing compound, and the first binding molecule is N-hydroxysulfosuccinate (NHS); or, the first labeling molecule is biotin, and the first binding molecule is streptavidin.
- said first binding molecule is naturally contained by said cell.
- said first binding molecule is not naturally contained by said cell.
- the method further comprises the step of binding the first binding molecule to the one or more cells, or causing the one or more cells to express the first binding molecule, to provide the cell sample of step (i).
- the method further comprises the step of binding the first marker molecule to the solid support to provide the nucleic acid array of step (i).
- in step (2), the pretreatment includes the following steps:
- primer I-A contains a consensus sequence A and a capture sequence A, and the capture sequence A can anneal to the RNA to be captured (for example, mRNA) and initiate an extension reaction; the consensus sequence A is located upstream of the capture sequence A (for example, at the 5' end of the primer I-A);
- the cDNA strand comprises a cDNA sequence complementary to the RNA (for example, mRNA), formed using the primer I-A as a reverse transcription primer, and a 3' end overhang; wherein, the primer I-A contains a consensus sequence A and a capture sequence A, and the capture sequence A can anneal to the RNA to be captured (for example, mRNA) and initiate an extension reaction; the consensus sequence A is located upstream of the capture sequence A (e.g., at the 5' end of the primer I-A); and, (b) annealing primer I-B to the cDNA strand generated in (a) and performing an extension reaction to generate a first extension product, which is the first nucleic acid molecule to be labeled, thereby generating a first nucleic acid molecule population; wherein, the primer I-B includes a consensus sequence
- reverse-transcribing the RNA (e.g., mRNA) of the one or more cells with primer I-A' to generate a cDNA strand comprising a cDNA sequence complementary to the RNA (for example, mRNA), formed using the primer I-A' as a reverse transcription primer, and a 3' end overhang; wherein, the primer I-A' comprises a capture sequence A capable of annealing to the RNA to be captured (for example, mRNA) and initiating an extension reaction; (b) annealing primer I-B to the cDNA strand generated in (a), and performing an extension reaction to generate a first extension product; wherein, the primer I-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; the 3' end overhang complementary sequence is located at the 3' end of the primer I-B; the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer I-B); and
- in step (3), the first nucleic acid molecule population derived from each cell obtained in the previous step is associated with the oligonucleotide probes coupled to the micro-dots occupied by the cell from which it originated, through the following steps, thereby generating a second population of nucleic acid molecules labeled with the tag sequence Y:
- contacting the bridging oligonucleotide I with the first nucleic acid molecule derived from each cell obtained in step (2) and the oligonucleotide probe coupled to the micro-dot occupied by the cell, and annealing (for example, in-situ annealing) the bridging oligonucleotide I to the first nucleic acid molecule and to the oligonucleotide probe, so that the first nucleic acid molecule population is ligated to the oligonucleotide probes on the array, the obtained ligation product being a second nucleic acid molecule with a position marker, thereby generating a second nucleic acid molecule population;
- the bridging oligonucleotide I includes: a first region and a second region, and optionally a third region between the first region and the second region, the first region being located upstream of the second region (for example, at its 5' end); wherein,
- the first region can anneal to all or part of the consensus sequence A of primer I-A described in step (2)(i) or step (2)(ii), or to all or part of the consensus sequence B of primer I-B described in step (2)(iii);
- the second region is capable of annealing in whole or in part to the consensus sequence X2.
- in step (3), when the first region and the second region of the bridging oligonucleotide I are directly adjacent to each other, ligating the first nucleic acid molecule population to the oligonucleotide probe includes: using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide I, the obtained ligation product being the second nucleic acid molecule with a position marker; or,
- said ligating the first population of nucleic acid molecules to the oligonucleotide probe comprises: using a nucleic acid polymerase to carry out a polymerization reaction with the third region as a template, and using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region, the third region and the second region of the same bridging oligonucleotide I, the obtained ligation product being the second nucleic acid molecule with a position marker; preferably, the nucleic acid polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
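The bridged ligation described above can be sketched in Python as a string operation. This is a minimal illustration under stated assumptions: all sequences (`X1`, `Y`, `X2`, `CA`, `CDNA`, the third region) are short hypothetical toy sequences, the function names are invented for this sketch, and annealing is modeled as exact reverse-complement matching.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def make_bridge(consensus_a: str, x2: str, third_region: str = "") -> str:
    """Bridging oligo 5'->3': first region (anneals to consensus A),
    optional third region, second region (anneals to consensus X2)."""
    return revcomp(consensus_a) + third_region + revcomp(x2)

def ligate(probe: str, first_molecule: str, bridge: str,
           len_p1: int, len_p2: int) -> str:
    """Join the probe 3' end to the first molecule 5' end on the bridge.
    If a third region is present, the gap is filled with its complement."""
    split = len(bridge) - len_p2
    p1, third, p2 = bridge[:len_p1], bridge[len_p1:split], bridge[split:]
    if not probe.endswith(revcomp(p2)):
        raise ValueError("second region does not anneal to the probe 3' end")
    if not first_molecule.startswith(revcomp(p1)):
        raise ValueError("first region does not anneal to the molecule 5' end")
    return probe + revcomp(third) + first_molecule

# Hypothetical toy sequences, chosen only for illustration.
X1, Y, X2 = "ACGTAC", "TTGCA", "GGATCA"
CA, CDNA = "CTGACT", "AAGGTTCC"

if __name__ == "__main__":
    probe = X1 + Y + X2
    bridge = make_bridge(CA, X2, third_region="AG")
    print(ligate(probe, CA + CDNA, bridge, len(CA), len(X2)))
```

With an empty third region the two annealed molecules are joined directly, which corresponds to the ligase-only case; with a third region the product carries its complementary sequence between consensus sequence X2 and the first nucleic acid molecule.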
- each oligonucleotide probe comprises one copy.
- each oligonucleotide probe comprises multiple copies.
- when each oligonucleotide probe comprises one copy, each micro-dot is coupled with one probe, and the oligonucleotide probes of different micro-dots have different tag sequences Y; when each oligonucleotide probe comprises multiple copies, each micro-dot is coupled with multiple probes, the oligonucleotide probes in the same micro-dot have the same tag sequence Y, and the oligonucleotide probes in different micro-dots have different tag sequences Y.
- the solid support comprises a plurality of microdots, each microdot is coupled to an oligonucleotide probe, and each oligonucleotide probe may comprise one or more copies.
- the solid support comprises a plurality (e.g., at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, or more) of microdots; in certain embodiments, the solid support comprises at least 10^4 (e.g., at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, at least 10^9, at least 10^10, at least 10^11, or at least 10^12) microdots/square millimeter.
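To relate the center-to-center distances above to the stated microdot densities, the following Python sketch converts a pitch into a density and estimates how many microdots lie under one cell. It assumes a square grid of microdots and a circular cell footprint, neither of which is stated in this application; the 10 μm cell diameter is likewise a hypothetical value.

```python
import math

def dots_per_mm2(pitch_um: float) -> float:
    """Microdot density for a square grid with the given center-to-center pitch."""
    return (1000.0 / pitch_um) ** 2

def dots_under_cell(cell_diameter_um: float, pitch_um: float) -> float:
    """Approximate number of microdots covered by a circular cell footprint."""
    footprint_um2 = math.pi * (cell_diameter_um / 2.0) ** 2
    return footprint_um2 / pitch_um ** 2

if __name__ == "__main__":
    pitch = 0.5  # um, within the 0.5 um to 1 um range given above
    print(dots_per_mm2(pitch))           # density in microdots per mm^2
    print(dots_under_cell(10.0, pitch))  # hypothetical 10 um cell
```

Under these assumptions a sub-micrometer pitch places many microdots under every cell, which is consistent with each cell occupying at least one microdot.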
- Embodiment comprising step (1), step (2)(i) and step (3)
- the method comprises step (1), step (2)(i) and step (3); wherein, the ligation product obtained in step (3) is the second nucleic acid molecule with a position marker, comprising from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide I, and the sequence of the first nucleic acid molecule to be labeled.
- the capture sequence A is a random oligonucleotide sequence.
- the ligation product derived from each copy of the oligonucleotide probe coupled to the same micro-dot has a different capture sequence A, and the capture sequence A serves as the molecular tag (UMI) of the second nucleic acid molecule.
- the extension product (the first nucleic acid molecule to be labeled) in step (2)(i) comprises from the 5' end to the 3' end: the consensus sequence A, and a cDNA sequence complementary to the RNA, formed using the primer I-A as a reverse transcription primer.
- the capture sequence A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer I-A further comprises a tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer I-A, and the consensus sequence A is located upstream of the tag sequence A (for example, at the 5' end of the primer I-A).
- in step (3), the ligation products derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences A as UMIs.
- the extension product described in step (2)(i) sequentially comprises from the 5' end to the 3' end: the consensus sequence A, the tag sequence A, and the cDNA sequence complementary to the RNA, formed using the primer I-A as the reverse transcription primer.
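The layout of the position-marked molecule in this embodiment (consensus sequence X1, tag sequence Y, consensus sequence X2, consensus sequence A, tag sequence A as UMI, cDNA) suggests how sequenced molecules could be assigned to microdots and counted. The Python sketch below is only an illustration: the consensus sequences, the fixed tag lengths, and the function names are hypothetical, no third region is assumed, and sequencing errors are ignored.

```python
from collections import defaultdict

# Hypothetical toy consensus sequences and tag lengths.
X1, X2, CA = "ACGTAC", "GGATCA", "CTGACT"
Y_LEN, UMI_LEN = 5, 4

def parse(molecule: str):
    """Split a molecule laid out as X1-Y-X2-CA-UMI-cDNA into (Y, UMI, cDNA).
    Returns None when the fixed consensus sequences are not found in place."""
    y_start = len(X1)
    x2_start = y_start + Y_LEN
    ca_start = x2_start + len(X2)
    umi_start = ca_start + len(CA)
    if (molecule[:y_start] != X1
            or molecule[x2_start:ca_start] != X2
            or molecule[ca_start:umi_start] != CA):
        return None
    y = molecule[y_start:x2_start]
    umi = molecule[umi_start:umi_start + UMI_LEN]
    cdna = molecule[umi_start + UMI_LEN:]
    return y, umi, cdna

def count_molecules(molecules) -> dict:
    """Count distinct (UMI, cDNA) pairs per tag sequence Y."""
    seen = defaultdict(set)
    for molecule in molecules:
        parsed = parse(molecule)
        if parsed is not None:
            y, umi, cdna = parsed
            seen[y].add((umi, cdna))
    return {y: len(pairs) for y, pairs in seen.items()}
```

Molecules that share the same tag sequence Y, UMI and cDNA are counted once, so amplification duplicates do not inflate the count for a microdot.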
- Embodiment comprising step (1), step (2)(ii) and step (3)
- the method comprises step (1), step (2)(ii) and step (3); wherein, the ligation product obtained in step (3) is the second nucleic acid molecule with a position marker, comprising from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide I, and the sequence of the first nucleic acid molecule to be labeled.
- the capture sequence A is a random oligonucleotide sequence.
- the ligation product derived from each copy of the oligonucleotide probe coupled to the same micro-dot has a different capture sequence A, and the capture sequence A serves as the molecular tag (UMI) of the second nucleic acid molecule.
- the first extension product (the first nucleic acid molecule to be labeled) in step (2)(ii) comprises from the 5' end to the 3' end: the consensus sequence A, the cDNA sequence complementary to the RNA, formed using the primer I-A as the reverse transcription primer, the 3' end overhang sequence, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
- the capture sequence A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer I-A further comprises a tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer I-A, and the consensus sequence A is located upstream of the tag sequence A (for example, at the 5' end of the primer I-A).
- in step (3), the ligation products derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences A as UMIs.
- the first extension product (the first nucleic acid molecule to be labeled) described in step (2)(ii) sequentially comprises from the 5' end to the 3' end: the consensus sequence A, the tag sequence A, the cDNA sequence complementary to the RNA, formed using the primer I-A as a reverse transcription primer, the 3' end overhang sequence, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
- the 5' end of the primer I-A comprises a phosphorylation modification.
- An exemplary embodiment of the present application comprising step (1), step (2)(ii) and step (3) is described in detail as follows:
- An exemplary scheme for preparing a cDNA chain using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in Figure 2):
- RNA molecules (for example, mRNA molecules) in the sample are reverse-transcribed using a reverse transcriptase (for example, a reverse transcriptase with terminal transfer activity) and primer I-A to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
- Various reverse transcriptases having terminal transfer activity can be used for the reverse transcription reaction.
- the reverse transcriptase used does not have RNaseH activity.
- the primer I-A comprises a poly(T) sequence and a consensus sequence A (labeled CA in the figure).
- the primer I-A further comprises a unique molecular tag sequence (UMI).
- a poly(T) sequence is located at the 3' end of the primer I-A to initiate reverse transcription.
- the UMI sequence is located upstream (for example, the 5' end) of the poly(T) sequence
- the consensus sequence A is located upstream (for example, the 5' end) of the UMI sequence.
- primer I-B, which contains consensus sequence B (marked as CB in the figure), is annealed or hybridized to the cDNA strand; subsequently, under the action of a nucleic acid polymerase, the nucleic acid fragment hybridized or annealed to primer I-B is extended using consensus sequence B as a template, so that the complementary sequence of consensus sequence B is added to the 3' end of the cDNA strand, thereby generating a nucleic acid molecule that carries consensus sequence A and tag sequence A at the 5' end and the complementary sequence of consensus sequence B at the 3' end.
- the primer I-B may comprise a sequence complementary to the 3' end overhang of the cDNA strand.
- primer I-B may contain GGG at its 3' end.
- the nucleotides of primer I-B can also be modified (e.g., using a locked nucleic acid) to enhance complementary pairing between primer I-B and the 3' end overhang of the cDNA strand.
- Various nucleic acid polymerases (for example, DNA polymerase or reverse transcriptase) can be used to carry out the extension reaction, as long as they can use the sequence of primer I-B or a partial sequence thereof as a template to extend the captured nucleic acid fragment (reverse transcription product).
- the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (reverse transcription product).
- step (2) is performed simultaneously with step (1).
- the method optionally further includes step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
- the method does not include step (3).
- Exemplary structures of cDNA strands prepared by the above-described exemplary embodiments include: consensus sequence A, UMI sequence, a sequence complementary to that of RNA (eg, mRNA), and a complementary sequence to consensus sequence B.
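The exemplary cDNA structure above (consensus sequence A, UMI, sequence complementary to the RNA, complementary sequence of consensus sequence B) can be reproduced with a small Python sketch of reverse transcription followed by extension on primer I-B. The sequences and function names are hypothetical, the mRNA is written in the DNA alphabet for simplicity, and the overhang is modeled as exactly three cytosines pairing with a GGG at the 3' end of primer I-B.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def reverse_transcribe(mrna: str, consensus_a: str, umi: str,
                       poly_t_len: int, overhang: str = "CCC") -> str:
    """cDNA 5'->3': primer I-A (consensus A + UMI + poly(T)), the copy of the
    mRNA body, and the 3' overhang added by terminal transferase activity."""
    if not mrna.endswith("A" * poly_t_len):
        raise ValueError("poly(T) cannot anneal: no matching poly(A) tail")
    body = mrna[:len(mrna) - poly_t_len]
    return consensus_a + umi + "T" * poly_t_len + revcomp(body) + overhang

def extend_on_primer_ib(cdna: str, primer_ib: str, overhang: str = "CCC") -> str:
    """Anneal the cDNA overhang to the 3' end of primer I-B and extend the
    cDNA using the rest of primer I-B (consensus B) as the template."""
    anchor = revcomp(overhang)
    if not cdna.endswith(overhang) or not primer_ib.endswith(anchor):
        raise ValueError("overhang and primer I-B 3' end are not complementary")
    return cdna + revcomp(primer_ib[:-len(anchor)])

if __name__ == "__main__":
    # Hypothetical toy inputs.
    cdna = reverse_transcribe("ATGGCC" + "AAAAAA", "CTGACT", "ACGT", 6)
    print(extend_on_primer_ib(cdna, "GATTACA" + "GGG"))
```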
- An exemplary protocol for generating a new nucleic acid molecule comprises the following steps (as shown in Figure 3):
- a bridging oligonucleotide I is provided whose 5' end contains a sequence (first region, P1) that is at least partially complementary to the 5' end of the cDNA sequence (e.g., consensus sequence A (CA)) and whose 3' end contains a sequence (second region, P2) that is at least partially complementary to the 3' end of the chip sequence (e.g., consensus sequence X2).
- the P1 and P2 sequences in the bridging oligonucleotide I are contiguously connected without an intervening nucleotide therebetween.
- the P1 sequence, the P2 sequence, the consensus sequence A and the consensus sequence X2 each independently have a length of 20-100 nt (such as 20-70 nt).
- the bridging oligonucleotide I is annealed or hybridized to the oligonucleotide probe and the cDNA strand, after which the 5' end of the cDNA strand is ligated to the 3' end of the oligonucleotide probe by DNA ligase and/or DNA polymerase to form a new nucleic acid molecule (i.e., a nucleic acid molecule labeled with an oligonucleotide probe) that contains the sequence information of the oligonucleotide probe.
- the DNA polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
- Embodiment comprising step (1), step (2)(iii) and step (3)
- the method comprises step (1), step (2)(iii) and step (3); wherein, the ligation product obtained in step (3) is the second nucleic acid molecule with a position marker, comprising from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide I, and the sequence of the first nucleic acid molecule to be labeled.
- the extension primer is the primer I-B or primer B", and the primer B" can be complementary to the consensus sequence B
- in step (2)(iii)(c), the extension primer is the primer B.
- the capture sequence A of the primer I-A' is a random oligonucleotide sequence.
- the primer I-B comprises a consensus sequence B, a complementary sequence overhanging at the 3' end, and a tag sequence B.
- the first extension product comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA sequence, formed using the primer I-A' as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B; wherein, the complementary sequence of the tag sequence B serves as the molecular tag (UMI) of the second nucleic acid molecule.
- the second extension product (the first nucleic acid molecule sequence to be labeled) comprises from the 5' end to the 3' end: the consensus sequence B or its 3' end partial sequence, the tag sequence B, the complementary sequence of the 3' end overhang sequence, and the complementary sequence of the cDNA sequence in the first extension product; wherein, the tag sequence B serves as the molecular tag (UMI) of the second nucleic acid molecule.
- the capture sequence A of the primer I-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer I-A' also contains a tag sequence A, such as a random oligonucleotide sequence, and a consensus sequence A.
- the capture sequence A is located at the 3' end of the primer I-A'.
- the consensus sequence A is located upstream of the capture sequence A (e.g., at the 5' end of the primer I-A').
- the primer I-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and a tag sequence B.
- the first extension product comprises from the 5' end to the 3' end: the consensus sequence A, optionally the tag sequence A, the cDNA sequence complementary to the RNA, formed using the primer I-A' as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
- the second extension product (the first nucleic acid molecule sequence to be labeled) comprises from the 5' end to the 3' end: the consensus sequence B or its 3' end partial sequence, the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence in the first extension product, optionally the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
- in step (3), the ligation products derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences B as UMIs.
- the 5' end of the extension primer comprises a phosphorylation modification.
- before step (2)(iii)(c), the method further comprises treating the product of step (2)(iii)(a) or step (2)(iii)(b) (e.g., by heat treatment) to remove RNA.
- in step (2)(iii)(b) of the method, the cDNA strand anneals to the primer I-B via its 3' end overhang, and, under the action of a nucleic acid polymerase (e.g., DNA polymerase or reverse transcriptase), the cDNA strand is extended using the primer I-B as a template to generate the first extension product.
- An exemplary embodiment of the present application comprising step (1), step (2)(iii) and step (3) is described in detail as follows:
- An exemplary scheme for preparing the complementary strand of a cDNA strand using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in FIG. 4):
- RNA molecules (for example, mRNA molecules) in the permeabilized sample are reverse-transcribed using reverse transcriptase (for example, reverse transcriptase with terminal transfer activity) and primer I-A' to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
- the reverse transcriptase used does not have RNaseH activity.
- the reverse transcription primer I-A' comprises a poly(T) sequence and a consensus sequence A (CA). Typically, a poly(T) sequence is located at the 3' end of the primer I-A' to initiate reverse transcription.
- primer I-B is annealed or hybridized to the cDNA strand, said primer I-B comprising consensus sequence B (CB) and the complementary sequence of the 3' end overhang of said cDNA.
- the primer I-B further comprises a unique molecular tag sequence (UMI).
- the nucleic acid fragment hybridized or annealed to primer I-B can be extended using the consensus sequence B and the UMI sequence as a template, so that the complementary sequence of the consensus sequence B and the complementary sequence of the UMI sequence are added at the 3' end of the cDNA strand, thereby generating a nucleic acid molecule that carries the consensus sequence A at the 5' end and the complementary sequence of the consensus sequence B and the complementary sequence of the UMI sequence at the 3' end.
- Primer I-B may contain GGG at its 3' end when the 3' end of the cDNA strand contains an overhang of 3 cytosine nucleotides.
- the nucleotides of primer I-B can also be modified (e.g., using a locked nucleic acid) to enhance complementary pairing between primer I-B and the 3' end overhang of the cDNA strand.
- Various nucleic acid polymerases (for example, DNA polymerase or reverse transcriptase) can be used to carry out the extension reaction, as long as they can use the sequence of primer I-B or a partial sequence thereof as a template to extend the captured nucleic acid fragment (reverse transcription product).
- the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (reverse transcription product).
- step (2) is performed simultaneously with step (1).
- the method optionally further includes step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
- the method does not include step (3).
- an extension primer is used to carry out an extension reaction with the cDNA single strand obtained in (3) as a template to obtain an extension product; the extension primer can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof, and can initiate an extension reaction.
- the extension primer is the same as the primer I-B.
- an exemplary structure of the complementary strand of the cDNA strand prepared by the above exemplary embodiment includes: consensus sequence B, the UMI sequence, a sequence complementary to the 3' end overhang sequence of the cDNA, the complementary sequence of the cDNA sequence, and the complementary sequence of consensus sequence A.
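The structure of the complementary strand listed above follows from copying the first extension product with an extension primer. The Python sketch below illustrates this under stated assumptions: the sequences are hypothetical toy values, the extension primer is taken to be identical to primer I-B (consensus sequence B + UMI + GGG), and annealing is modeled as an exact reverse-complement match.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def first_extension_product(cdna: str, primer_ib: str, overhang: str = "CCC") -> str:
    """cDNA + 3' overhang, extended with the complement of the part of
    primer I-B lying upstream of its overhang-complementary 3' end."""
    anchor = revcomp(overhang)
    if not primer_ib.endswith(anchor):
        raise ValueError("primer I-B lacks the overhang-complementary 3' end")
    return cdna + overhang + revcomp(primer_ib[:-len(anchor)])

def second_extension_product(first_product: str, extension_primer: str) -> str:
    """Anneal the extension primer to the 3' region of the first extension
    product and copy the template up to its 5' end."""
    site = revcomp(extension_primer)
    idx = first_product.rfind(site)
    if idx < 0:
        raise ValueError("extension primer does not anneal to the template")
    return extension_primer + revcomp(first_product[:idx])

# Hypothetical toy sequences.
CA, POLY_T, BODY = "CTGACT", "TTTT", "GGCCAT"
CB, UMI = "GATTACA", "ACGG"

if __name__ == "__main__":
    primer_ib = CB + UMI + "GGG"
    first = first_extension_product(CA + POLY_T + BODY, primer_ib)
    print(second_extension_product(first, primer_ib))
```

The printed strand reads, from 5' to 3': consensus sequence B, UMI, the GGG complementary to the overhang, the complement of the cDNA, and the complement of consensus sequence A.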
- An exemplary protocol for generating a new nucleic acid molecule containing chip sequence information comprises the following steps (as shown in Figure 5):
- a bridging oligonucleotide I is provided whose 5' end contains a sequence at least partially complementary to consensus sequence B (CB) (first region, P1) and whose 3' end contains a sequence at least partially complementary to consensus sequence X2 (second region, P2).
- the P1 and P2 sequences in the bridging oligonucleotide I are contiguously connected without an intervening nucleotide therebetween.
- each of the P1 sequence and the P2 sequence independently has a length of 20-100 nt (eg, 20-70 nt).
- the bridging oligonucleotide I is annealed or hybridized to the oligonucleotide probe and to the complementary strand of the cDNA strand, and then the 5' end of the complementary strand of the cDNA strand is ligated to the 3' end of the chip sequence (the oligonucleotide probe) by DNA ligase and/or DNA polymerase to form a new nucleic acid molecule (i.e., a nucleic acid molecule labeled with an oligonucleotide probe) that contains the sequence information of the oligonucleotide probe.
- the DNA polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
- in step (2), the pretreatment includes the following steps:
- primer II-A contains a capture sequence A capable of annealing to the RNA to be captured (for example, mRNA) and initiating an extension reaction
- annealing primer II-B to the cDNA strand generated in (a), and performing an extension reaction to generate a first extension product, the first extension product being the first nucleic acid molecule to be labeled, thereby generating the first nucleic acid molecule population
- the primer II-B includes a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; the 3' end overhang complementary sequence is located at the 3' end of the primer II-
- primer II-A' contains a consensus sequence A and a capture sequence A
- the capture sequence A can anneal to the RNA to be captured (e.g., mRNA) and initiate an extension reaction; the consensus sequence A is located upstream of the capture sequence A (e.g., at the 5' end of the primer II-A'); (b) annealing the primer II-B' to the cDNA strand generated in (a), and performing an extension reaction to generate a first extension product; wherein, the primer II-B' includes a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; the 3' end overhang complementary sequence is located at the 3' end of the primer
- in step (3), the first nucleic acid molecule population derived from each cell obtained in the previous step is associated with the oligonucleotide probes coupled to the micro-dots occupied by the cell from which it originated, through the following steps, thereby generating a second population of nucleic acid molecules labeled with the tag sequence Y:
- (i) applying annealing conditions to the product of step (2), so that the first nucleic acid molecule derived from each cell obtained in step (2) anneals (for example, anneals in situ) to the oligonucleotide probe coupled to the micro-dot occupied by the cell, and performing an extension reaction to generate an extension product, the extension product being a second nucleic acid molecule with a position marker, thereby generating a second population of nucleic acid molecules; wherein, the consensus sequence X2 of the oligonucleotide probe, or a partial sequence thereof, (a) can anneal to the complementary sequence of the consensus sequence B, or a partial sequence thereof, of the first extension product obtained in step (2)(i), or, (b) can anneal to the complementary sequence of the consensus sequence A, or a partial sequence thereof, of the second extension product obtained in step (2)(ii); or,
- the bridging oligonucleotide pair is brought into contact with the first nucleic acid molecule derived from each cell obtained in step (2) and with the oligonucleotide probe coupled to the microspot occupied by the cell, so that the bridging oligonucleotide pair anneals to the first nucleic acid molecule derived from each cell obtained in step (2) and to the oligonucleotide probe coupled to the microspot occupied by the cell (for example, in situ annealing),
- the bridging oligonucleotide pair is composed of bridging oligonucleotide II-I and bridging oligonucleotide II-II, and the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently comprise: a first region and a second region, and optionally a third region between the first region and the second region, the first region being located upstream (e.g., at the 5' end); wherein,
- the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
- the second region of the bridging oligonucleotide II-II (a) can anneal to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i), or a partial sequence thereof, or (b) can anneal to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii), or a partial sequence thereof;
- the bridging oligonucleotide II-I and the bridging oligonucleotide II-II of the bridging oligonucleotide pair are each present in single-stranded form; alternatively, the bridging oligonucleotide II-I and the bridging oligonucleotide II-II of the bridging oligonucleotide pair are annealed to each other and exist in partially double-stranded form;
- a ligation reaction is performed to ligate the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I, and/or to ligate the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II; and an extension reaction is performed; wherein the ligation reaction and the extension reaction are performed in any order; the obtained reaction product is a second nucleic acid molecule with a position marker, thereby generating the second population of nucleic acid molecules.
- the step of ligating the nucleic acid molecules comprises: using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I; or,
- the bridging oligonucleotide II-I includes a first region, a second region, and a third region between the two, and the step of ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I comprises: using a nucleic acid polymerase (for example, one without 5' to 3' exonuclease activity or strand displacement activity) to carry out a polymerization reaction using the third region as a template, and using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region, the third region and the second region of the same bridging oligonucleotide II-I;
- the step of ligating the nucleic acid molecules comprises: using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II; or,
- the bridging oligonucleotide II-II includes a first region, a second region, and a third region between the two, and the step of ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II comprises: using a nucleic acid polymerase (for example, one without 5' to 3' exonuclease activity or strand displacement activity) to carry out a polymerization reaction using the third region as a template, and using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region, the third region and the second region of the same bridging oligonucleotide II-II.
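The gap-filling and ligation over a single bridging oligonucleotide can be sketched as string operations. This is a minimal illustration, not the method of the application: the function name `gap_fill_and_ligate` and the toy sequences are assumptions, each region is assumed to pair over its full length, and the orientation of the two joined strands simply follows from antiparallel base pairing with a bridge that reads 5'-[first][third][second]-3'.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gap_fill_and_ligate(upstream: str, downstream: str,
                        first: str, second: str, third: str = "") -> str:
    """Join two strands held side by side by one bridging oligonucleotide.

    Because pairing is antiparallel, the strand hybridized to the second
    region (3' side of the bridge) ends up upstream in the joined product,
    and the strand hybridized to the first region ends up downstream.
    """
    if not upstream.endswith(revcomp(second)):
        raise ValueError("upstream strand does not anneal to the second region")
    if not downstream.startswith(revcomp(first)):
        raise ValueError("downstream strand does not anneal to the first region")
    gap = revcomp(third)  # polymerase copies the third region (empty if absent)
    return upstream + gap + downstream  # ligase seals the remaining nick


# Hypothetical toy sequences, chosen only to satisfy the pairing checks.
joined = gap_fill_and_ligate(
    upstream="GATTGAA", downstream="CGTTAGC",
    first="AACG", second="TTCA", third="GG",
)
print(joined)
```

When the bridge has no third region, `third` stays empty and the two strands are ligated directly, which corresponds to the ligase-only alternative.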
- the method comprises step (1), step (2)(i) and step (3); wherein, in step (2)(i)(b), the primer II-B contains a consensus sequence B, a 3' end overhang complementary sequence, and a tag sequence B.
- the first extension product described in step (2)(i)(b) comprises, in order from the 5' end to the 3' end: a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; and the complementary sequence of the consensus sequence B.
- each copy of the second nucleic acid molecule derived from the oligonucleotide probe coupled to the same microdot has a different tag sequence B as UMI.
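Because each copy carries its own tag sequence B as a UMI, reads sharing both a tag sequence Y and a UMI can be collapsed into one original molecule during analysis. The following is a minimal sketch of that counting idea only; the tuple layout `(tag_y, umi, gene)`, the `gene` field and the function name `count_molecules` are assumptions introduced for illustration and do not come from the application.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (tag sequence Y, gene) pair.

    reads: iterable of (tag_y, umi, gene) tuples (hypothetical layout).
    Returns a dict mapping (tag_y, gene) to the number of unique UMIs.
    """
    umis = defaultdict(set)
    for tag_y, umi, gene in reads:
        umis[(tag_y, gene)].add(umi)  # duplicate reads collapse in the set
    return {key: len(value) for key, value in umis.items()}


# Hypothetical reads: two PCR duplicates, one distinct UMI, one other microdot.
example_reads = [
    ("Y1", "AAC", "geneA"),
    ("Y1", "AAC", "geneA"),
    ("Y1", "GGT", "geneA"),
    ("Y2", "AAC", "geneA"),
]
print(count_molecules(example_reads))
```

This exact-match collapsing ignores sequencing errors in the UMI; a real pipeline would usually tolerate small differences between UMIs.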
- Embodiments comprising step (1), step (2)(i) and step (3)(i)
- the method comprises step (1), step (2)(i) and step (3)(i); wherein the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof;
- the extension product obtained in step (3)(i) is a labeled nucleic acid molecule, which comprises: the first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or the second strand containing the oligonucleotide probe sequence.
- the expression "partial sequence of XX (sequence)" means the nucleotide sequence of at least one segment of "XX (sequence)".
- the entire nucleotide sequence of the consensus sequence X2, or the nucleotide sequence of a partial segment thereof, can anneal to the complementary sequence of the consensus sequence B or to the nucleotide sequence of a partial segment of that complementary sequence.
- annealing means that, in two nucleotide sequences annealed to each other, every base in one nucleotide sequence pairs with a base in the other without mismatches or gaps; or, alternatively, that most of the bases in one nucleotide sequence pair with bases in the other, with mismatches or gaps allowed (for example, a mismatch or gap of one or several nucleotides). That is, two nucleotide sequences that can anneal may be either completely or partially complementary. Unless otherwise indicated herein or clearly contradicted by the context, this description of "annealing" applies to the entire text.
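The distinction between complete and partial complementarity can be illustrated with a simple check. This is a sketch under stated assumptions: the function name `can_anneal` and the `max_mismatches` parameter are hypothetical, only mismatches are modeled (gaps are not), and both sequences are assumed to have equal length.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def can_anneal(seq_a: str, seq_b: str, max_mismatches: int = 0) -> bool:
    """Return True if seq_a pairs with seq_b in antiparallel orientation.

    With max_mismatches=0 the sequences must be completely complementary;
    a larger value allows partial complementarity. Gaps are not modeled.
    """
    if len(seq_a) != len(seq_b):
        return False
    target = revcomp(seq_b)  # what seq_a must look like for perfect pairing
    mismatches = sum(1 for a, b in zip(seq_a, target) if a != b)
    return mismatches <= max_mismatches


print(can_anneal("GATTACA", "TGTAATC"))                     # fully complementary
print(can_anneal("GATTACA", "TGTAATG"))                     # one mismatch, strict
print(can_anneal("GATTACA", "TGTAATG", max_mismatches=1))   # one mismatch, tolerated
```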
- the first strand comprises, from the 5' end to the 3' end: a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; the complementary sequence of the consensus sequence B; the complementary sequence of the tag sequence Y; and the complementary sequence of the consensus sequence X1.
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1; the tag sequence Y; the consensus sequence X2; the tag sequence B; the complementary sequence of the 3' end overhang sequence; and the complementary sequence of the cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer.
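The two strand layouts can be checked against each other by assembling them from their parts. The sketch below assumes, for simplicity, that the consensus sequence X2 anneals to the complementary sequence of the consensus sequence B over its full length, so that X2 and B are identical; under that assumption the second strand is exactly the reverse complement of the first. All function names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_first_strand(cdna, overhang, tag_b, cb, tag_y, x1):
    """5'->3': cDNA, overhang, rc(tag B), rc(CB), rc(tag Y), rc(X1)."""
    return (cdna + overhang + revcomp(tag_b) + revcomp(cb)
            + revcomp(tag_y) + revcomp(x1))


def build_second_strand(cdna, overhang, tag_b, x2, tag_y, x1):
    """5'->3': X1, tag Y, X2, tag B, rc(overhang), rc(cDNA)."""
    return x1 + tag_y + x2 + tag_b + revcomp(overhang) + revcomp(cdna)


# Hypothetical toy parts; X2 is set equal to CB (full-length pairing assumed).
parts = dict(cdna="TTGACCA", overhang="CCC", tag_b="ACGTAC",
             tag_y="CTCTAG", x1="AAGG")
first = build_first_strand(cb="GTCA", **parts)
second = build_second_strand(x2="GTCA", **parts)
print(revcomp(first) == second)
```

If X2 pairs with only a partial segment of the complementary sequence of B, the two strands would differ in that region and the equality would no longer hold.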
- Embodiment comprising step (1), step (2)(i) and step (3)(i): first strand
- the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof (for example, a 3' end partial sequence), and the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i) has a free 3' end.
- the extension product obtained in step (3)(i) is a labeled nucleic acid molecule comprising the first strand.
- the first strand comprises, from the 5' end to the 3' end: a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; the complementary sequence of the consensus sequence B; the complementary sequence of the tag sequence Y; and the complementary sequence of the consensus sequence X1.
- in step (3)(i), the oligonucleotide probe cannot initiate an extension reaction (e.g., its 3' end is blocked).
- the capture sequence A of the primer II-A is a random oligonucleotide sequence.
- the first extension product described in step (2)(i)(b) comprises, in order from the 5' end to the 3' end: a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; and the complementary sequence of the consensus sequence B.
- the first strand comprises, from the 5' end to the 3' end: a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; the complementary sequence of the consensus sequence B; the complementary sequence of the tag sequence Y; and the complementary sequence of the consensus sequence X1.
- the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer II-A also contains a consensus sequence A, and an optional tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer II-A.
- the consensus sequence A is located upstream of the capture sequence A (e.g., at the 5' end of the primer II-A).
- the first extension product in step (2)(i)(b) comprises, in order from the 5' end to the 3' end: the consensus sequence A; optionally the tag sequence A; a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; and the complementary sequence of the consensus sequence B.
- the first strand comprises, from the 5' end to the 3' end: the consensus sequence A; optionally the tag sequence A; a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; the complementary sequence of the consensus sequence B; the complementary sequence of the tag sequence Y; and the complementary sequence of the consensus sequence X1.
- Embodiment comprising step (1), step (2)(i) and step (3)(i): second strand
- the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof, and the consensus sequence X2 of the oligonucleotide probe has a free 3' end.
- the extension product obtained in step (3)(i) is a labeled nucleic acid molecule comprising the second strand.
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1; the tag sequence Y; the consensus sequence X2; the tag sequence B; the complementary sequence of the 3' end overhang sequence; and the complementary sequence of the cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer.
- the first extension product obtained in step (2)(i) cannot initiate an extension reaction (eg, the 3' end is blocked).
- the capture sequence A of the primer II-A is a random oligonucleotide sequence.
- the first extension product described in step (2)(i)(b) comprises, in order from the 5' end to the 3' end: a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; and the complementary sequence of the consensus sequence B.
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1; the tag sequence Y; the consensus sequence X2; the tag sequence B; the complementary sequence of the 3' end overhang sequence; and the complementary sequence of the cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer.
- the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer II-A also contains a consensus sequence A, and an optional tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer II-A.
- the consensus sequence A is located upstream of the capture sequence A (eg, at the 5' end of the primer II-A).
- the first extension product in step (2)(i)(b) comprises, in order from the 5' end to the 3' end: the consensus sequence A; optionally the tag sequence A; a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; and the complementary sequence of the consensus sequence B.
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1; the tag sequence Y; the consensus sequence X2; the tag sequence B; the complementary sequence of the 3' end overhang sequence; the complementary sequence of the cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; optionally the complementary sequence of the tag sequence A; and the complementary sequence of the consensus sequence A.
- An exemplary embodiment of the present application comprising step (1), step (2)(i) and step (3)(i) is described in detail as follows:
- An exemplary scheme for preparing a cDNA strand containing a complementary sequence of UMI at the 3' end using the RNA (such as mRNA) in the sample as a template comprises the following steps (as shown in Figure 6):
- RNA molecules in the permeabilized sample are reverse-transcribed using a reverse transcriptase (e.g., a reverse transcriptase with terminal transfer activity) and primer II-A to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
- Various reverse transcriptases having terminal transfer activity can be used for the reverse transcription reaction.
- the reverse transcriptase used does not have RNaseH activity.
- the primer II-A comprises a poly(T) sequence and a consensus sequence A(CA).
- a poly(T) sequence is located at the 3' end of the primer II-A to initiate reverse transcription.
- the primer II-A comprises a random oligonucleotide sequence that can be used to capture RNA without a poly(A) tail.
- the random oligonucleotide sequence is located at the 3' end of the primer II-A to initiate reverse transcription.
- Primer II-B is annealed or hybridized to the cDNA strand, said primer II-B comprising a consensus sequence B (CB), a unique molecular tag sequence (UMI), and the complementary sequence of the 3' end overhang of the cDNA.
- the consensus sequence B is located upstream of the UMI sequence (for example, the 5' end), and the sequence complementary to the 3' end overhang of the cDNA strand is located at the 3' end of the primer II-B.
- the primer II-B may include GGG at its 3' end.
- the nucleotides of the primer II-B can also be modified (for example, using a locked nucleic acid) to enhance the complementary pairing between the primer II-B and the 3' end overhang of the cDNA strand.
- Various nucleic acid polymerases (for example, DNA polymerase or reverse transcriptase) can be used to carry out the extension reaction, as long as they can extend the captured nucleic acid fragment (reverse transcription product) using the sequence of the primer II-B or a partial sequence thereof as a template.
- the same reverse transcriptase as in the preceding reverse transcription step can be used to extend the captured nucleic acid fragment (reverse transcription product).
- this step is performed simultaneously with step (1) (eg, in the same reaction system).
- the method optionally further comprises step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
- said method does not comprise said step (3).
- the exemplary structure of the cDNA strand prepared by the above exemplary embodiment comprises: consensus sequence A, cDNA sequence, 3' end overhang sequence, complementary sequence of UMI sequence, and complementary sequence of consensus sequence B.
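The extension of the cDNA on primer II-B can be sketched with toy sequences. This is an illustration only: the function name `extend_on_primer_b`, the constants and the sequences are assumptions, the overhang is taken as the three cytosines and the primer 3' end as the GGG mentioned above, and pairing is required to be exact.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_on_primer_b(cdna: str, primer_b: str, overhang_len: int = 3) -> str:
    """Extend the cDNA 3' end using primer II-B (5'-CB-UMI-GGG-3') as template.

    The primer's 3' end must pair with the cDNA 3' overhang; the polymerase
    then copies the UMI and CB, appending their complementary sequences.
    """
    anchor = primer_b[-overhang_len:]
    if not cdna.endswith(revcomp(anchor)):
        raise ValueError("primer II-B 3' end does not pair with the cDNA overhang")
    # rc(primer) = overhang + rc(UMI) + rc(CB), so replace the overhang with it.
    return cdna[:-overhang_len] + revcomp(primer_b)


# Hypothetical toy sequences.
CA, CDNA_BODY, OVERHANG = "AAGC", "TTGACC", "CCC"
CB, UMI = "GTCA", "ACGTAC"

cdna = CA + CDNA_BODY + OVERHANG
primer_b = CB + UMI + "GGG"
extended = extend_on_primer_b(cdna, primer_b)
print(extended)
```

The resulting string reads consensus sequence A, cDNA sequence, 3' end overhang, complementary sequence of the UMI, and complementary sequence of consensus sequence B, matching the exemplary structure described above.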
- the exemplary scheme includes the following steps (as shown in Figure 8):
- the consensus sequence X2 of the chip sequence, or a partial sequence thereof, can anneal to the complementary sequence of the consensus sequence B of the cDNA strand obtained in step 1 above, or a partial sequence thereof.
- the cDNA strand is annealed or hybridized to the chip sequence, and under the action of a polymerase a new nucleic acid molecule containing chip sequence information (i.e., a nucleic acid molecule labeled with the chip sequence) is formed.
- the exemplary structure of the new nucleic acid molecule containing chip sequence information formed by the above exemplary embodiment comprises: a nucleic acid strand comprising, from the 5' end to the 3' end, the consensus sequence A, the cDNA sequence, the 3' end overhang sequence, the complementary sequence of the UMI sequence, the complementary sequence of the consensus sequence B, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1; and/or its complementary nucleic acid strand.
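Given this layout, the tag sequence Y and the UMI can be read off the complementary strand by position. The sketch below is a hedged illustration of such parsing, not a procedure stated in the application: it assumes the read starts at the consensus sequence X1, that the region pairing with the complementary sequence of B equals the full consensus sequence B, and that the tag sequence Y and the UMI have fixed, known lengths. The function name and parameters are hypothetical.

```python
def parse_labeled_strand(read, x1, cb, tag_y_len, umi_len):
    """Extract (tag_y, umi) from a read laid out as X1, tag Y, CB, UMI, ...

    Returns None if the read does not match the expected layout.
    """
    if not read.startswith(x1):
        return None
    y_start = len(x1)
    y_end = y_start + tag_y_len
    if read[y_end:y_end + len(cb)] != cb:  # consensus B must follow tag Y
        return None
    umi_start = y_end + len(cb)
    umi = read[umi_start:umi_start + umi_len]
    if len(umi) < umi_len:  # read too short to contain the full UMI
        return None
    return read[y_start:y_end], umi


# Hypothetical read: X1 + tag Y + CB + UMI + GGG + remainder of the cDNA complement.
example_read = "AAGG" + "CTCTAG" + "GTCA" + "ACGTAC" + "GGG" + "TGGTCAA"
print(parse_labeled_strand(example_read, x1="AAGG", cb="GTCA",
                           tag_y_len=6, umi_len=6))
```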
- Embodiment comprising step (1), step (2)(i) and step (3)(ii)
- the method comprises step (1), step (2)(i) and step (3)(ii); wherein the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i), or a partial sequence thereof;
- the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule, which comprises: the first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or the second strand containing the sequence of the oligonucleotide probe.
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i), or to the nucleotide sequence of a partial segment of that complementary sequence.
- the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, and the 3' end is overhanging
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1; the tag sequence Y; the consensus sequence X2; optionally the complementary sequence of the third region of the bridging oligonucleotide II-I; the bridging oligonucleotide II-II sequence; the tag sequence B; the complementary sequence of the 3' end overhang sequence; and the complementary sequence of the cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer.
- Embodiment comprising step (1), step (2)(i) and step (3)(ii): first strand
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i), or a partial sequence thereof (for example, a 3' end partial sequence), and the second region of the bridging oligonucleotide II-I has a free 3' end.
- the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule comprising the first strand.
- the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, and the 3' end is overhanging
- the second region of the bridging oligonucleotide II-I is located at the 3' end of the bridging oligonucleotide II-I.
- the first region of the bridging oligonucleotide II-I is located at the 5' end of the bridging oligonucleotide II-I.
- said bridging oligonucleotide II-I does not contain said third region, and/or said bridging oligonucleotide II-II does not contain said third region.
- the 5' end of the bridging oligonucleotide II-I contains a phosphorylation modification.
- the 3' end of the bridging oligonucleotide II-I contains a free -OH.
- in step (3)(ii), the bridging oligonucleotide II-II cannot initiate an extension reaction (for example, its 3' end is blocked), and/or the oligonucleotide probe cannot initiate an extension reaction (for example, its 3' end is blocked).
- the capture sequence A of the primer II-A is a random oligonucleotide sequence.
- the first extension product described in step (2)(i)(b) of the method sequentially comprises from the 5' end to the 3' end: using the primer II-A as a reverse transcription primer
- the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, and the 3' end is overhanging
- the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer II-A also contains a consensus sequence A, and an optional tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer II-A.
- the first extension product in step (2)(i)(b) comprises, in order from the 5' end to the 3' end: the consensus sequence A; optionally the tag sequence A; a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; and the complementary sequence of the consensus sequence B.
- the first strand comprises from the 5' end to the 3' end: the consensus sequence A, optionally the tag sequence A, formed by using the primer II-A as a reverse transcription primer
- the cDNA sequence complementary to the RNA, the 3' end overhang sequence, the complementary sequence of the tag sequence B, the complementary sequence of the consensus sequence B, optionally the bridging oligonucleotide II-II
- in step (3)(ii), the ligation reaction, which ligates the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II, and the extension reaction described in step (3)(ii) can be carried out in any order, as long as the second nucleic acid molecule with the position marker can be obtained.
- the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II can be ligated, and the extension reaction can be initiated with the bridging oligonucleotide II-I, to obtain the first strand.
- the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
- the first strand can be obtained in the following exemplary ways:
- the polymerase used in the extension reaction preferably has strand displacement activity or 5' to 3' excision activity.
- the ligation reaction and the extension reaction are performed in different systems, with the extension reaction performed first and the ligation reaction performed afterwards.
- the first strand can be obtained by initiating an extension reaction with the bridging oligonucleotide II-I and then ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II.
- the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
- Embodiment comprising step (1), step (2)(i) and step (3)(ii): second strand
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i), or a partial sequence thereof, and the second region of the bridging oligonucleotide II-II has a free 3' end.
- the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule comprising said second strand.
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1; the tag sequence Y; the consensus sequence X2; optionally the complementary sequence of the third region of the bridging oligonucleotide II-I; the bridging oligonucleotide II-II sequence; the tag sequence B; the complementary sequence of the 3' end overhang sequence; and the complementary sequence of the cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer.
- the second region of the bridging oligonucleotide II-II is located at the 3' end of the bridging oligonucleotide II-II.
- the first region of the bridging oligonucleotide II-II is located at the 5' end of the bridging oligonucleotide II-II.
- said bridging oligonucleotide II-I does not contain said third region, and/or said bridging oligonucleotide II-II does not contain said third region.
- the 5' end of the bridging oligonucleotide II-II contains a phosphorylation modification.
- the 3' end of the bridging oligonucleotide II-II contains a free -OH.
- in step (3)(ii), the bridging oligonucleotide II-I cannot initiate an extension reaction (for example, its 3' end is blocked), and/or the first extension product obtained in step (2)(i) cannot initiate an extension reaction (for example, its 3' end is blocked).
- the capture sequence A of the primer II-A is a random oligonucleotide sequence.
- the first extension product described in step (2)(i)(b) comprises, in order from the 5' end to the 3' end: a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; and the complementary sequence of the consensus sequence B.
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1; the tag sequence Y; the consensus sequence X2; optionally the complementary sequence of the third region of the bridging oligonucleotide II-I; the bridging oligonucleotide II-II sequence; the tag sequence B; the complementary sequence of the 3' end overhang sequence; and the complementary sequence of the cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer.
- the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer II-A also contains a consensus sequence A, and an optional tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer II-A.
- the first extension product in step (2)(i)(b) comprises, in order from the 5' end to the 3' end: the consensus sequence A; optionally the tag sequence A; a cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; the 3' end overhang sequence; the complementary sequence of the tag sequence B; and the complementary sequence of the consensus sequence B.
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1; the tag sequence Y; the consensus sequence X2; optionally the complementary sequence of the third region of the bridging oligonucleotide II-I; the bridging oligonucleotide II-II sequence; the tag sequence B; the complementary sequence of the 3' end overhang sequence; the complementary sequence of the cDNA sequence complementary to the RNA, formed using the primer II-A as a reverse transcription primer; optionally the complementary sequence of the tag sequence A; and the complementary sequence of the consensus sequence A.
- in step (3)(ii), the ligation reaction, which ligates the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II, and the extension reaction described in step (3)(ii) can be carried out in any order, as long as the second nucleic acid molecule with the position marker can be obtained.
- the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I can be ligated, and the extension reaction can be initiated with the bridging oligonucleotide II-II, to obtain the second strand.
- the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
- the second strand can be obtained in the following exemplary ways:
- the polymerase used in the extension reaction preferably has strand displacement activity or 5' to 3' excision activity.
- the ligation reaction and the extension reaction are performed in different systems, with the extension reaction performed first and the ligation reaction performed afterwards.
- the second strand can be obtained by initiating an extension reaction with the bridging oligonucleotide II-II and then ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I.
- the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
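The second-strand route through a bridging oligonucleotide pair can be sketched by deriving the pair from the sequences it must anneal to and then assembling the product. This is an illustration under stated assumptions: neither bridging oligonucleotide has a third region, every region pairs over its full length, and the names `design_bridging_pair`, `second_strand_via_bridge` and `linker` (the first region of II-I) are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_bridging_pair(x2: str, cb: str, linker: str):
    """Return (II-I, II-II), each read 5'-[first region][second region]-3'.

    II-I:  first region = linker; second region anneals to X2 of the probe.
    II-II: first region anneals to the linker; second region anneals to the
           complementary sequence of consensus sequence B, so it equals CB.
    """
    oligo_i = linker + revcomp(x2)
    oligo_ii = revcomp(linker) + cb
    return oligo_i, oligo_ii


def second_strand_via_bridge(probe, oligo_ii, tag_b, overhang, cdna):
    """Ligate the probe 3' end to II-II, then extend II-II on the cDNA strand."""
    ligated = probe + oligo_ii  # II-I holds the probe 3' end next to the II-II 5' end
    extension = tag_b + revcomp(overhang) + revcomp(cdna)  # copied from the template
    return ligated + extension


# Hypothetical toy sequences.
X1, TAG_Y, X2, CB = "AAGG", "CTCTAG", "ACCT", "GTCA"
oligo_i, oligo_ii = design_bridging_pair(X2, CB, linker="TTGGCA")
strand = second_strand_via_bridge(X1 + TAG_Y + X2, oligo_ii,
                                  tag_b="ACGTAC", overhang="CCC", cdna="TTGACCA")
print(oligo_i, oligo_ii, strand)
```

The assembled string follows the order listed for the second strand: consensus sequence X1, tag sequence Y, consensus sequence X2, the bridging oligonucleotide II-II sequence, tag sequence B, and the complementary sequences of the overhang and of the cDNA.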
- An exemplary embodiment of the present application comprising step (1), step (2)(i) and step (3)(ii) is described in detail as follows:
- An exemplary scheme for preparing a cDNA chain using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in FIG. 6 ):
- RNA molecules in the permeabilized sample are reverse-transcribed using a reverse transcriptase (e.g., a reverse transcriptase with terminal transfer activity) and primer II-A to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
- Various reverse transcriptases having terminal transfer activity can be used for the reverse transcription reaction.
- the reverse transcriptase used does not have RNaseH activity.
- the primer II-A comprises a poly(T) sequence and a consensus sequence A(CA).
- a poly(T) sequence is located at the 3' end of the primer II-A to initiate reverse transcription.
- the primer II-A comprises a random oligonucleotide sequence that can be used to capture RNA without a poly(A) tail.
- the random oligonucleotide sequence is located at the 3' end of the primer II-A to initiate reverse transcription.
- the cDNA strand is annealed or hybridized with primer II-B, said primer II-B comprising a consensus sequence B (CB), a unique molecular tag sequence (UMI), and a sequence complementary to the 3' end overhang of the cDNA.
- the consensus sequence B is located upstream of the UMI sequence (for example, the 5' end), and the sequence complementary to the 3' end overhang of the cDNA strand is located at the 3' end of the primer II-B.
- the primer II-B may include GGG at its 3' end.
- the nucleotides of the primer II-B can also be modified (for example, using a locked nucleic acid) to enhance the complementary pairing between the primer II-B and the 3' end overhang of the cDNA strand.
- various nucleic acid polymerases (for example, a DNA polymerase or a reverse transcriptase) can be used to carry out the extension reaction, as long as they can extend the captured nucleic acid fragment (the reverse transcription product) using the sequence of the primer II-B, or a partial sequence thereof, as a template.
- the same reverse transcriptase as in the preceding reverse transcription step can be used to extend the captured nucleic acid fragment (the reverse transcription product).
- this step is performed simultaneously with step (1) (eg, in the same reaction system).
- the method optionally further comprises step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
- said method does not comprise said step (3).
- the exemplary structure of the cDNA strand prepared by the above exemplary embodiment comprises: consensus sequence A, cDNA sequence, 3' end overhang sequence, complementary sequence of UMI sequence, and complementary sequence of consensus sequence B.
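The strand layout listed above can be traced with a small string model. The following is a minimal sketch under stated assumptions, not the applicants' implementation: all sequences are written 5' to 3', the consensus sequences, UMI and toy mRNA are made-up placeholders, and the helper names (`revcomp`, `reverse_transcribe`, `template_switch`) are hypothetical. It assumes that primer II-A anneals at the very 3' end of the poly(A) tail and that the overhang is CCC, paired with a GGG 3' end on primer II-B.

```python
# Toy model of reverse transcription with primer II-A followed by
# template switching onto primer II-B. All sequences are 5'->3'.

_COMP = str.maketrans("ACGTU", "TGCAA")  # RNA "U" pairs with DNA "A"


def revcomp(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA string."""
    return seq.upper().translate(_COMP)[::-1]


def reverse_transcribe(mrna: str, consensus_a: str, poly_t_len: int,
                       overhang: str = "CCC") -> str:
    """cDNA primed by primer II-A (consensus A + poly(T)), plus 3' overhang.

    Assumption: the poly(T) stretch anneals at the very 3' end of the
    poly(A) tail, so the primer's T's are the first bases of revcomp(mrna).
    """
    if not mrna.upper().endswith("A" * poly_t_len):
        raise ValueError("mRNA lacks a poly(A) tail long enough for the primer")
    return consensus_a + revcomp(mrna) + overhang


def template_switch(cdna: str, consensus_b: str, umi: str,
                    anchor: str = "GGG") -> str:
    """Extend the cDNA 3' end using primer II-B (CB + UMI + anchor) as template."""
    if not cdna.endswith(revcomp(anchor)):
        raise ValueError("primer II-B 3' end does not pair with the cDNA overhang")
    # Copying CB + UMI appends c(UMI) followed by c(CB) to the cDNA strand.
    return cdna + revcomp(consensus_b + umi)


if __name__ == "__main__":
    CA, CB, UMI = "GACTGACT", "AAGCAGTG", "ACGGT"  # placeholder sequences
    mrna = "AUGGCCUUA" + "A" * 6                    # toy transcript
    strand = template_switch(reverse_transcribe(mrna, CA, 6), CB, UMI)
    print(strand)
```

Read left to right, the printed strand follows the order given in the text: consensus sequence A, cDNA sequence, 3' end overhang, complementary sequence of the UMI, complementary sequence of consensus sequence B.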
- the exemplary scheme includes the following steps (as shown in Figure 7):
- a bridging oligonucleotide pair consisting of a bridging oligonucleotide II-I and a bridging oligonucleotide II-II is provided, wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently include a first region (P1) and a second region (P2), the first region being located upstream of the second region (e.g., toward the 5' end); wherein,
- the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B in the cDNA strand obtained in the above step 1 or a partial sequence thereof.
- the bridging oligonucleotide II-I contains spacer nucleotides between the first region and the second region, such as 1-5 nt or 5-10 nt of spacer nucleotides; that is, the bridging oligonucleotide II-I sequence contains a third region located between the first region and the second region.
- the first region and the second region in the bridging oligonucleotide II-I are directly adjacent, with no additional nucleotides between them; that is, the bridging oligonucleotide II-I sequence does not contain a third region located between the first region and the second region.
- the bridging oligonucleotide II-II contains spacer nucleotides between the first region and the second region, such as 1-5 nt or 5-10 nt of spacer nucleotides; that is, the bridging oligonucleotide II-II sequence contains a third region located between the first region and the second region.
- the first region and the second region in the bridging oligonucleotide II-II are directly adjacent, with no additional nucleotides between them; that is, the bridging oligonucleotide II-II sequence does not contain a third region located between the first region and the second region.
- the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the chip sequence are annealed or hybridized to the cDNA strand obtained in step 1 above; the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I are then ligated, and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II are ligated, forming a new nucleic acid molecule containing the chip sequence information (i.e., a nucleic acid molecule labeled with the chip sequence).
- the ligation process and the polymerization process can be performed in any order.
- the exemplary structure of the new nucleic acid molecule containing chip sequence information formed by the above exemplary embodiment comprises: a nucleic acid strand containing, from the 5' end to the 3' end, the consensus sequence A, the cDNA sequence, the 3' end overhang sequence, the complementary sequence of the UMI sequence, the complementary sequence of the consensus sequence B, the bridging oligonucleotide II-I sequence, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1; and/or its complementary nucleic acid strand.
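The bridging scheme above can be illustrated with a toy string model. This is a sketch under stated assumptions rather than the method as claimed: sequences are 5' to 3' placeholders, neither bridging oligonucleotide has a third (spacer) region, annealing means exact complementarity, and one order (ligation first, extension second) is picked even though the text allows any order. The function names are hypothetical.

```python
# Toy model of splint ligation through the bridging oligonucleotide pair,
# followed by extension on the chip probe (X1 + Y + X2). Sequences are 5'->3'.
from typing import Tuple

_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMP)[::-1]


def make_bridge_pair(p1: str, consensus_b: str, x2: str) -> Tuple[str, str]:
    """Return (II-I, II-II), each written as first region + second region.

    II-I : P1, then a region that anneals to X2 of the probe.
    II-II: a region that anneals to P1 of II-I, then a region that anneals
           to c(CB) at the 3' end of the cDNA strand (i.e. CB itself).
    """
    return p1 + revcomp(x2), revcomp(p1) + consensus_b


def splint_ligate(upstream: str, downstream: str, splint: str,
                  up_len: int, down_len: int) -> str:
    """Join upstream 3' end to downstream 5' end if the splint spans the nick."""
    junction = upstream[-up_len:] + downstream[:down_len]
    if revcomp(junction) != splint:
        raise ValueError("splint does not hold the two ends together")
    return upstream + downstream


def extend_on_probe(strand: str, probe: str, x2_len: int) -> str:
    """Extend the strand's 3' end using the probe X1 + Y + X2 as template."""
    if strand[-x2_len:] != revcomp(probe[-x2_len:]):
        raise ValueError("3' end of the strand is not annealed to X2")
    # Copying the rest of the probe appends c(Y) followed by c(X1).
    return strand + revcomp(probe[:-x2_len])


if __name__ == "__main__":
    CB, X1, Y, X2, P1 = "AAGCAGTG", "CCATGA", "GATTAC", "TCGGAT", "ACGTTGCA"
    cdna_strand = "GACTGACT" + "TTTTAAGGC" + "CCC" + "ACCGT" + revcomp(CB)
    bridge_i, bridge_ii = make_bridge_pair(P1, CB, X2)
    ligated = splint_ligate(cdna_strand, bridge_i, bridge_ii, len(CB), len(P1))
    print(extend_on_probe(ligated, X1 + Y + X2, len(X2)))
```

In this model the bridging oligonucleotide II-II acts only as a splint, while the bridging oligonucleotide II-I becomes part of the labeled strand, which matches the strand order listed above.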
- the method comprises step (1), step (2)(ii) and step (3).
- the first extension product comprises, from the 5' end to the 3' end: the consensus sequence A, the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the 3' end overhang sequence, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
- the extension primer is the primer II-B' or a primer B", wherein the primer B" can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof and can initiate an extension reaction.
- the second extension product comprises, from the 5' end to the 3' end: the complementary sequence of the cDNA sequence, formed by extension of the extension primer, and the complementary sequence of the consensus sequence A.
- Embodiments comprising step (1), step (2)(ii) and step (3)(i)
- the method comprises step (1), step (2)(ii) and step (3)(i); wherein the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence A or a partial sequence thereof;
- the extension product obtained in step (3)(i) is a labeled nucleic acid molecule, which comprises: the first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or the second strand containing the oligonucleotide probe sequence.
- either the entire consensus sequence X2 or a partial segment of it can anneal to the complementary sequence of the consensus sequence A or to a partial segment of that complementary sequence.
- the first strand comprises from the 5' end to the 3' end: the sequence of the first nucleic acid molecule to be labeled, the complementary sequence of the tag sequence Y, the complementary sequence of the consensus sequence X1 .
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, and the cDNA sequence complementary to the sequence of the first nucleic acid molecule to be labeled.
- Embodiment comprising step (1), step (2)(ii) and step (3)(i): first strand
- the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence A or a partial sequence thereof (for example, a partial sequence at the 3' end); the extension product obtained in step (3)(i) is a labeled nucleic acid molecule, which includes a first strand containing the sequence of the first nucleic acid molecule to be labeled.
- in step (3)(i), the oligonucleotide probe cannot initiate an extension reaction (e.g., its 3' end is blocked).
- the capture sequence A of the primer II-A' is a random oligonucleotide sequence.
- the extension primer is the primer II-B'.
- the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, and the complementary sequence of the consensus sequence A.
- the first strand comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the complementary sequence of the consensus sequence A, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
- the first strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different complementary sequences of the capture sequence A as the UMI.
- the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer II-A' also contains a tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer II-A.
- the extension primer is the primer II-B'.
- the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
- the first strand comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the complementary sequence of the tag sequence A, the complementary sequence of the consensus sequence A, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
- the first strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different complementary sequences of the tag sequence A as the UMI.
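Because strands derived from the same microdot share the tag sequence Y but carry different UMIs, the pair (tag Y, UMI) can separate original molecules from amplification duplicates. The sketch below illustrates that counting idea only; the source does not prescribe this procedure. It assumes reads have already been decoded into hypothetical `(tag_y, umi, gene)` records, and it ignores sequencing errors in the tags.

```python
# Count distinct molecules per microdot and gene by collapsing duplicate UMIs.
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Read = Tuple[str, str, str]  # (tag sequence Y, UMI, gene): hypothetical record


def count_molecules(reads: Iterable[Read]) -> Dict[Tuple[str, str], int]:
    """Return the number of distinct UMIs seen for each (tag Y, gene) pair."""
    umis = defaultdict(set)
    for tag_y, umi, gene in reads:
        umis[(tag_y, gene)].add(umi)  # repeated UMIs collapse into one entry
    return {key: len(seen) for key, seen in umis.items()}


if __name__ == "__main__":
    reads = [
        ("GATTAC", "ACGGT", "geneA"),
        ("GATTAC", "ACGGT", "geneA"),  # duplicate of the first molecule
        ("GATTAC", "TTGCA", "geneA"),
        ("CCATGA", "ACGGT", "geneA"),  # same UMI, different microdot
    ]
    print(count_molecules(reads))
```

The same UMI observed under two different tag sequences Y is counted twice, since the two reads originate from different microdots.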
- Embodiment comprising step (1), step (2)(ii) and step (3)(i): second strand
- the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence A or a partial sequence thereof; the extension product obtained in step (3)(i) is a labeled nucleic acid molecule comprising a second strand that contains the oligonucleotide probe sequence.
- the second extension product obtained in step (2)(ii) cannot initiate an extension reaction (eg, the 3' end is blocked).
- the capture sequence A of the primer II-A' is a random oligonucleotide sequence.
- the extension primer is the primer II-B'.
- the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, and the complementary sequence of the consensus sequence A.
- the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the cDNA sequence complementary to the sequence of the first nucleic acid molecule to be labeled, the 3' end overhang sequence, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
- in step (3), the second strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different capture sequences A as UMIs.
- the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer II-A' also contains a tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer II-A.
- the extension primer is the primer II-B'.
- the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
- the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the tag sequence A, and the to-be
- in step (3), the second strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences A as UMIs.
- An exemplary embodiment of the present application comprising step (1), step (2)(ii) and step (3)(i) is described in detail as follows:
- the exemplary scheme comprises the following steps (as shown in Figure 9):
- RNA molecules (for example, mRNA molecules) in the permeabilized sample are reverse-transcribed using a reverse transcriptase (for example, a reverse transcriptase with terminal transferase activity) and primer II-A' to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
- the reverse transcriptase used does not have RNaseH activity.
- the primer II-A' comprises a poly(T) sequence, a UMI sequence, and a consensus sequence A (CA).
- a poly(T) sequence is located at the 3' end of the primer II-A' to initiate reverse transcription, and the consensus sequence A is located upstream (eg, 5' end) of the UMI sequence.
- the primer II-A' comprises a random oligonucleotide sequence and a consensus sequence A, and can be used to capture RNA without a poly(A) tail.
- the random oligonucleotide sequence is located at the 3' end of the primer II-A' to initiate reverse transcription.
- the cDNA strand is annealed or hybridized with primer II-B', the primer II-B' comprising a consensus sequence B (CB) and a sequence complementary to the 3' end overhang of the cDNA.
- the nucleic acid fragment that hybridizes or anneals to the primer II-B' can be extended using the consensus sequence B as a template under the action of a nucleic acid polymerase, so that the complementary sequence of the consensus sequence B (c(CB)) is added at the 3' end of the cDNA strand, thereby generating a nucleic acid molecule carrying the complementary sequence of the consensus sequence B at its 3' end.
- the sequence complementary to the 3' end overhang of the cDNA strand is located at the 3' end of the primer II-B'.
- the primer II-B' may include GGG at its 3' end.
- the nucleotides of the primer II-B' can also be modified (for example, using a locked nucleic acid) to enhance the complementary pairing between the primer II-B' and the 3' end overhang of the cDNA strand.
- various nucleic acid polymerases (for example, a DNA polymerase or a reverse transcriptase) can be used to carry out the extension reaction, as long as they can extend the captured nucleic acid fragment (the reverse transcription product) using the sequence of the primer II-B', or a partial sequence thereof, as a template.
- the same reverse transcriptase as in the preceding reverse transcription step can be used to extend the captured nucleic acid fragment (the reverse transcription product).
- this step is performed simultaneously with step (1) (eg, in the same reaction system).
- the method optionally further comprises step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
- said method does not comprise said step (3).
- using an extension primer, the cDNA strand obtained in the previous step is used as a template for an extension reaction to obtain an extension product;
- the extension primer is the primer II-B' or a primer B", wherein the primer B" can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof and can initiate an extension reaction.
- the exemplary structure of the complementary strand of the cDNA strand prepared by the above exemplary embodiment comprises: the consensus sequence B, the complementary sequence of the 3' end overhang, the complementary sequence of the cDNA sequence, the complementary sequence of the UMI sequence, and the complementary sequence of the consensus sequence A.
- the consensus sequence X2 of the chip sequence, or a partial sequence thereof, can anneal to the complementary sequence of the consensus sequence A, or a partial sequence thereof, in the complementary strand of the cDNA strand obtained in step 1 above.
- the complementary strand of the cDNA strand is annealed or hybridized with the chip sequence, and under the action of the polymerase, a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence) is formed.
- the exemplary structure of the new nucleic acid molecule containing chip sequence information formed by the above exemplary embodiment comprises: a nucleic acid strand containing, from the 5' end to the 3' end, the consensus sequence B, the complementary sequence of the 3' end overhang, the complementary sequence of the cDNA sequence, the complementary sequence of the UMI sequence, the complementary sequence of the consensus sequence A, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1; and/or its complementary nucleic acid strand.
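Given the strand order listed above, the tag sequence Y and the UMI can in principle be read back from a labeled strand by locating the flanking consensus sequences. The following is a minimal sketch of that idea, not a procedure stated in the source: it assumes exact matches (no sequencing errors), a known UMI length, and placeholder sequences, and the function name `decode_labeled_strand` is hypothetical.

```python
# Recover (tag Y, UMI) from a strand laid out 5'->3' as
#   CB, c(overhang), c(cDNA), c(UMI), c(CA), c(Y), c(X1)
from typing import Tuple

_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMP)[::-1]


def decode_labeled_strand(strand: str, consensus_a: str, consensus_x1: str,
                          umi_len: int) -> Tuple[str, str]:
    """Return (tag sequence Y, UMI) using c(CA) and c(X1) as landmarks."""
    c_ca, c_x1 = revcomp(consensus_a), revcomp(consensus_x1)
    if not strand.endswith(c_x1):
        raise ValueError("strand does not end with c(X1)")
    y_end = len(strand) - len(c_x1)
    # Last exact occurrence of c(CA) upstream of c(X1); -1 if absent.
    ca_start = strand.rfind(c_ca, 0, y_end)
    if ca_start < umi_len:
        raise ValueError("c(CA) not found downstream of a full-length UMI")
    tag_y = revcomp(strand[ca_start + len(c_ca):y_end])
    umi = revcomp(strand[ca_start - umi_len:ca_start])
    return tag_y, umi


if __name__ == "__main__":
    CA, CB, X1, Y, UMI = "GACTGACT", "AAGCAGTG", "CCATGA", "GATTAC", "ACGGT"
    strand = (CB + "GGG" + "ATGGCCTTAAAAAAA"
              + revcomp(UMI) + revcomp(CA) + revcomp(Y) + revcomp(X1))
    print(decode_labeled_strand(strand, CA, X1, len(UMI)))
```

A real pipeline would need tolerance for mismatches and for a tag Y that happens to contain the landmark sequence; both are outside this sketch.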
- Embodiments comprising step (1), step (2)(ii) and step (3)(ii)
- the method comprises step (1), step (2)(ii) and step (3)(ii); wherein the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii), or to a partial sequence thereof; the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule, which comprises: the first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or the second strand containing the oligonucleotide probe sequence.
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii), or to the nucleotide sequence of a partial segment of that complementary sequence.
- the first strand comprises from the 5' end to the 3' end: the first nucleic acid molecule sequence to be labeled, optionally the third region of the bridging oligonucleotide II-II
- the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the bridging oligonucleotide
- Embodiment comprising step (1), step (2)(ii) and step (3)(ii): first strand
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii), or to a partial sequence thereof (for example, a partial sequence at its 3' end), and the second region of the bridging oligonucleotide II-I has a free 3' end.
- the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule comprising the first strand.
- the second region of the bridging oligonucleotide II-I is located at the 3' end of the bridging oligonucleotide II-I.
- the first region of the bridging oligonucleotide II-I is located at the 5' end of the bridging oligonucleotide II-I.
- said bridging oligonucleotide II-I does not contain said third region, and/or said bridging oligonucleotide II-II does not contain said third region.
- the 5' end of the bridging oligonucleotide II-I contains a phosphorylation modification.
- the 3' end of the bridging oligonucleotide II-I contains a free -OH.
- in step (3)(ii), the bridging oligonucleotide II-II cannot initiate an extension reaction (for example, its 3' end is blocked), and/or the oligonucleotide probe cannot initiate an extension reaction (e.g., its 3' end is blocked).
- the capture sequence A of the primer II-A' is a random oligonucleotide sequence.
- the extension primer is the primer II-B'.
- the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, and the complementary sequence of the consensus sequence A.
- the first strand comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the complementary sequence of the consensus sequence A, optionally the complementary sequence of the third region of the bridging oligonucleotide II-II, the bridging oligonucleotide II-I sequence, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
- the first strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different complementary sequences of the capture sequence A as the UMI.
- the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer II-A' also contains a tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer II-A.
- the extension primer is the primer II-B'.
- the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
- the first strand comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the complementary sequence of the tag sequence A, the complementary sequence of the consensus sequence A, and optionally the bridging oligonucleotide
- the first strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different complementary sequences of the tag sequence A as the UMI.
- in step (3)(ii), for the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the oligonucleotide probe, the ligation reaction (which ligates the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I, and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II) and the extension reaction described in step (3)(ii) can be carried out in any order, as long as the second nucleic acid molecule carrying the position marker can be obtained.
- the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II can be ligated, and the bridging oligonucleotide II-I can be used to initiate the extension reaction, thereby obtaining the first strand.
- the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' exonuclease activity.
- the first strand can be obtained in the following exemplary ways:
- the polymerase used in the extension reaction preferably has strand displacement activity or 5' to 3' exonuclease activity.
- the ligation reaction and the extension reaction are performed in different systems, with the extension reaction performed first and the ligation reaction performed afterwards.
- the first strand can be obtained by initiating an extension reaction with the bridging oligonucleotide II-I and then ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II.
- the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' exonuclease activity.
- Embodiment comprising step (1), step (2)(ii) and step (3)(ii): second strand
- the second region of the bridging oligonucleotide II-II is capable of annealing to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii), or to a partial sequence thereof, and the second region of the bridging oligonucleotide II-II has a free 3' end.
- the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule comprising said second strand.
- the second region of the bridging oligonucleotide II-II is located at the 3' end of the bridging oligonucleotide II-II.
- the first region of the bridging oligonucleotide II-II is located at the 5' end of the bridging oligonucleotide II-II.
- said bridging oligonucleotide II-I does not contain said third region, and/or said bridging oligonucleotide II-II does not contain said third region.
- the 5' end of the bridging oligonucleotide II-II contains a phosphorylation modification.
- the 3' end of the bridging oligonucleotide II-II contains a free -OH.
- in step (3)(ii), the bridging oligonucleotide II-I cannot initiate an extension reaction (for example, its 3' end is blocked), and/or the second extension product obtained in step (2)(ii) cannot initiate an extension reaction (e.g., its 3' end is blocked).
- the capture sequence A of the primer II-A' is a random oligonucleotide sequence.
- the extension primer is the primer II-B'.
- the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, and the complementary sequence of the consensus sequence A.
- the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the bridging oligonucleotide
- in step (3), the second strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different capture sequences A as UMIs.
- the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
- the primer II-A' also contains a tag sequence A, such as a random oligonucleotide sequence.
- the capture sequence A is located at the 3' end of the primer II-A.
- the extension primer is the primer II-B'.
- the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence that is complementary to the RNA and formed using the primer II-A' as the reverse transcription primer, the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
- the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the bridging oligonucleotide
- in step (3), the second strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences A as UMIs.
- in step (3)(ii), for the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the oligonucleotide probe, the ligation reaction (which ligates the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I, and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II) and the extension reaction described in step (3)(ii) can be carried out in any order, as long as the second nucleic acid molecule carrying the position marker can be obtained.
- the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I can be ligated, and the bridging oligonucleotide II-II can be used to initiate the extension reaction, thereby obtaining the second strand.
- the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' exonuclease activity.
- the second strand can be obtained in the following exemplary ways:
- the polymerase used in the extension reaction preferably has strand displacement activity or 5' to 3' exonuclease activity.
- the ligation reaction and the extension reaction are performed in different systems, with the extension reaction performed first and the ligation reaction performed afterwards.
- the second strand can be obtained by initiating an extension reaction with the bridging oligonucleotide II-II and then ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I.
- the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' exonuclease activity.
- An exemplary embodiment of the present application comprising step (1), step (2)(ii) and step (3)(ii) is described in detail as follows:
- An exemplary scheme for preparing the complementary strand of a cDNA strand using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in FIG. 9):
- RNA molecules (for example, mRNA molecules) in the permeabilized sample are reverse-transcribed using a reverse transcriptase (for example, a reverse transcriptase with terminal transferase activity) and primer II-A' to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
- the reverse transcriptase used does not have RNaseH activity.
- the primer II-A' comprises a poly(T) sequence, a UMI sequence, and a consensus sequence A (CA).
- a poly(T) sequence is located at the 3' end of the primer II-A' to initiate reverse transcription, and the consensus sequence A is located upstream (eg, 5' end) of the UMI sequence.
- the primer II-A' comprises a random oligonucleotide sequence and a consensus sequence A, which can be used to capture RNA without a poly(A) tail.
- the random oligonucleotide sequence is located at the 3' end of the primer II-A' to initiate reverse transcription.
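The two layouts of primer II-A' described above can be illustrated with a minimal Python sketch. The function names, the default lengths, and the choice of drawing the UMI at random are assumptions made for illustration only; the text specifies only the order of the elements (consensus sequence A upstream of the UMI, with the poly(T) or random sequence at the 3' end).

```python
import random
from typing import Optional


def _random_bases(length: int, rng: random.Random) -> str:
    """Draw `length` bases uniformly from A, C, G, T."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def build_poly_t_primer(consensus_a: str, umi_length: int = 10,
                        poly_t_length: int = 20,
                        rng: Optional[random.Random] = None) -> str:
    """Hypothetical primer II-A' written 5'->3': CA + UMI + poly(T)."""
    rng = rng or random.Random()
    umi = _random_bases(umi_length, rng)
    return consensus_a + umi + "T" * poly_t_length


def build_random_primer(consensus_a: str, random_length: int = 6,
                        rng: Optional[random.Random] = None) -> str:
    """Hypothetical primer II-A' for RNA lacking a poly(A) tail:
    CA followed by a random oligonucleotide at the 3' end."""
    rng = rng or random.Random()
    return consensus_a + _random_bases(random_length, rng)
```

In both variants the 3' end of the primer is the part that anneals to the RNA and initiates reverse transcription, which is why the capture element is placed last.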
- primer II-B' comprises a consensus sequence B (CB) and a sequence complementary to the overhang at the 3' end of said cDNA.
- the nucleic acid fragment that hybridizes or anneals to the primer II-B' can be extended using the consensus sequence B as a template under the action of a nucleic acid polymerase, so that the complementary sequence of the consensus sequence B (c(CB)) is added at the 3' end of the cDNA strand, thereby generating a nucleic acid molecule carrying the complementary sequence of said consensus sequence B at its 3' end.
- the sequence complementary to the 3' end overhang of the cDNA strand is located at the 3' end of the primer II-B'.
- the primer II-B' may include GGG at its 3' end.
- the nucleotides of the primer II-B' can also be modified (for example, using a locked nucleic acid) to enhance the complementary pairing between the primer II-B' and the 3' end overhang of the cDNA strand.
- various nucleic acid polymerases (for example, DNA polymerase or reverse transcriptase) can be used to carry out the extension reaction, as long as they can extend the captured nucleic acid fragment (the reverse transcription product) using the sequence of the primer II-B' or a partial sequence thereof as a template.
- the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (the reverse transcription product).
- this step is performed simultaneously with step (1) (eg, in the same reaction system).
- the method optionally further comprises step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
- said method does not comprise said step (3).
- using an extension primer, an extension reaction is performed with the cDNA strand obtained in the previous step as a template to obtain an extension product;
- the extension primer is the primer II-B' or a primer B"; the primer B" can anneal to the complementary sequence of the above-mentioned consensus sequence B or a partial sequence thereof, and can initiate an extension reaction.
- the exemplary structure of the complementary strand of the cDNA strand prepared by the above exemplary embodiment comprises: the consensus sequence B, the complementary sequence of the 3' end overhang, the complementary sequence of the cDNA sequence, the complementary sequence of the UMI sequence, and the complementary sequence of the consensus sequence A.
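The order of elements in the complementary strand follows directly from reverse transcription, overhang addition, and extension on primer II-B'. The following Python sketch traces that order on plain strings. It is a simplified model under stated assumptions: sequences contain only A, C, G, T, the poly(T) of the primer is taken to pair exactly with the poly(A) tail (so the transcript body is passed without its tail), the overhang is CCC, and all function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def reverse_transcribe(transcript_body: str, primer_ii_a: str,
                       overhang: str = "CCC") -> str:
    """First-strand cDNA (5'->3'): the primer, the copy of the transcript
    body, then the non-templated 3' overhang."""
    return primer_ii_a + revcomp(transcript_body) + overhang


def extend_on_primer_ii_b(cdna: str, primer_ii_b: str,
                          overhang: str = "CCC") -> str:
    """Anneal the cDNA overhang to the 3' end of primer II-B' and extend the
    cDNA using consensus sequence B as template, which appends c(CB)."""
    anchor = revcomp(overhang)  # e.g. GGG at the 3' end of primer II-B'
    if not cdna.endswith(overhang) or not primer_ii_b.endswith(anchor):
        raise ValueError("overhang and primer II-B' 3' end do not pair")
    consensus_b = primer_ii_b[: -len(anchor)]
    return cdna + revcomp(consensus_b)


def complementary_strand(extended_cdna: str) -> str:
    """Strand synthesized with the extended cDNA as template."""
    return revcomp(extended_cdna)
```

Read 5' to 3', the result of `complementary_strand` is CB, the complement of the overhang, the transcript sequence, the complement of the UMI, and the complement of consensus sequence A, matching the exemplary structure listed above.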
- a bridging oligonucleotide pair consisting of a bridging oligonucleotide II-I and a bridging oligonucleotide II-II is provided, wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently include: a first region (P1) and a second region (P2), the first region being located upstream of the second region (e.g., at the 5' end); wherein,
- the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A, or a partial sequence thereof, in the complementary strand of the cDNA strand obtained in step 1 above.
- the bridging oligonucleotide II-I contains spacer nucleotides between the first region and the second region, such as 1-5nt or 5-10nt spacer nucleotides; that is, the bridging oligonucleotide II-I sequence contains a third region located between the first region and the second region.
- the first region and the second region in the bridging oligonucleotide II-I are directly adjacent, with no additional nucleotides between them; that is, the bridging oligonucleotide II-I sequence does not contain a third region located between the first and second regions.
- the bridging oligonucleotide II-II contains spacer nucleotides between the first region and the second region, such as 1-5nt or 5-10nt spacer nucleotides; that is, the bridging oligonucleotide II-II sequence contains a third region located between the first region and the second region.
- the first region and the second region in the bridging oligonucleotide II-II are directly adjacent, with no additional nucleotides between them; that is, the bridging oligonucleotide II-II sequence does not contain a third region located between the first and second regions.
- the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the chip sequence are annealed or hybridized to the complementary strand of the cDNA strand obtained in step 1 above, and DNA ligase is then used to link the nucleic acid molecules hybridized to the first and second regions of the same bridging oligonucleotide II-I, and/or to link the nucleic acid molecules hybridized to the first and second regions of the same bridging oligonucleotide II-II.
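The ligation of two molecules held together by one bridging oligonucleotide can be pictured as a splint ligation. The Python sketch below is a generic, string-level model and not the procedure of the application: it assumes perfect complementarity, treats a non-empty third (spacer) region as leaving a gap that a ligase alone cannot seal, and uses hypothetical names throughout.

```python
from dataclasses import dataclass
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


@dataclass
class BridgingOligo:
    first_region: str        # P1, at the 5' side
    second_region: str       # P2, at the 3' side
    third_region: str = ""   # optional spacer between P1 and P2

    @property
    def sequence(self) -> str:
        return self.first_region + self.third_region + self.second_region


def splint_ligate(upstream: str, downstream: str,
                  bridge: BridgingOligo) -> Optional[str]:
    """Join `upstream` (3' end paired with P2) to `downstream` (5' end paired
    with P1) if both anneal and no spacer separates them; else return None."""
    if not upstream.endswith(revcomp(bridge.second_region)):
        return None
    if not downstream.startswith(revcomp(bridge.first_region)):
        return None
    if bridge.third_region:
        return None  # gap between the two ends; would need filling first
    return upstream + downstream
```

Because the two strands pair antiparallel to the bridge, the molecule annealed to the second region contributes its 3' end and the molecule annealed to the first region contributes its 5' end to the junction.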
- the exemplary structure of the new nucleic acid molecule containing chip sequence information formed by the above exemplary embodiment comprises a nucleic acid strand containing, from the 5' end to the 3' end, the consensus sequence B, the complementary sequence of the 3' end overhang, the complementary sequence of the cDNA sequence, the complementary sequence of said UMI sequence, the complementary sequence of said consensus sequence A, said bridging oligonucleotide II-I sequence, the complementary sequence of said tag sequence Y, and the complementary sequence of said consensus sequence X1, and/or its complementary nucleic acid strand.
- in step (2)(i)(b), the cDNA strand anneals to the primer II-B via its 3' end overhang, and, under the action of a nucleic acid polymerase (e.g., DNA polymerase or reverse transcriptase), the cDNA strand is extended using the primer II-B as a template to generate the first extension product.
- in step (2)(ii)(b), the cDNA strand anneals to the primer II-B' via its 3' end overhang, and, under the action of a nucleic acid polymerase (e.g., DNA polymerase or reverse transcriptase), the cDNA strand is extended using the primer II-B' as a template to generate the first extension product.
- the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10 or more nucleotides. In certain embodiments, the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
- in step (2), the pretreatment is performed inside the cells.
- before or after said one or more cells are contacted with the solid phase support of said nucleic acid array, said one or more RNAs (for example, mRNA) are pretreated to generate a first population of nucleic acid molecules.
- the cells are permeabilized prior to said pretreatment.
- in step (2), the pretreatment is performed extracellularly.
- prior to said pretreatment, said method further comprises releasing intracellular RNA (e.g., mRNA), preferably by cell permeabilization or cell lysis treatment.
- performing reverse transcription in step (2) comprises using a reverse transcriptase.
- the reverse transcriptase has terminal transfer activity.
- the reverse transcriptase can use RNA (for example, mRNA) as a template to synthesize a cDNA chain, and add an overhang at the 3' end of the cDNA chain.
- the reverse transcriptase is capable of adding an overhang of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10 or more nucleotides.
- the reverse transcriptase is capable of adding an overhang of 2-5 cytosine nucleotides (eg, a CCC overhang) at the 3' end of the cDNA strand.
- the reverse transcriptase is selected from M-MLV reverse transcriptase, HIV-1 reverse transcriptase, AMV reverse transcriptase, telomerase reverse transcriptase, and variants, modified products and derivatives having the reverse transcription activity of the above-mentioned reverse transcriptases.
- steps (2) and (3) have one or more features selected from the following:
- primer I-A, primer II-A, primer I-A', primer II-A', primer I-B, primer II-B, primer II-B', bridging oligonucleotide I, bridging oligonucleotide II-I, and bridging oligonucleotide II-II each independently comprise or consist of naturally occurring nucleotides (e.g., deoxyribonucleotides or ribonucleotides), modified nucleotides, non-natural nucleotides, or any combination thereof; in certain embodiments, the primer I-A, primer II-A, primer I-A', and primer II-A' are capable of initiating an extension reaction;
- the primer I-B, primer II-B, and primer II-B' each independently comprise a modified nucleotide (such as a locked nucleic acid); in some embodiments, the 3' ends of the primer I-B, primer II-B, and primer II-B' each independently comprise one or more modified nucleotides (e.g., locked nucleic acids);
- the tag sequence A and the tag sequence B each independently have a length of 5-200nt (e.g., 5-30nt, 6-15nt);
- the consensus sequence A and the consensus sequence B each independently have a length of 10-200nt (such as 10-100nt, 20-100nt, 25-100nt, 5-10nt, 10-15nt, 15-20nt, 20-50nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt);
- said primer I-A, primer II-A, primer I-A', primer II-A', primer I-B, primer II-B, and primer II-B' each independently have a length of 4-200nt (such as 5-200nt, 15-230nt, 26-115nt, 10-130nt, 10-20nt, 20-50nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt);
- the bridging oligonucleotide I, the bridging oligonucleotide II-I, the first region and the second region of the bridging oligonucleotide II-II each independently have 3-100nt (such as 20-100nt, 3-10nt, 10-15nt, 15-20nt, 20-70nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt) length;
- the bridging oligonucleotide I, the bridging oligonucleotide II-I, and the third region of the bridging oligonucleotide II-II each independently have a length of 0-50nt (such as 0nt, 0-10nt, 10-15nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt);
- the bridging oligonucleotide I, the bridging oligonucleotide II-I, and the bridging oligonucleotide II-II each independently have 6-200nt (such as 20-100nt, 20-70nt, 6-15nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt) length;
- the poly(T) sequence includes at least 5, or at least 20 (eg, 6-100, 10-50) deoxythymidine residues;
- the random oligonucleotide sequence has a length of 5-200nt (e.g., 5nt, 5-30nt, 6-15nt).
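The outer length ranges listed above lend themselves to a simple design check. The sketch below is illustrative only: the dictionary keys and function names are hypothetical, the ranges are the broadest ones stated in the list, and the unit for the tag and random sequences is assumed to be nucleotides.

```python
# Broadest length ranges from the list above, in nucleotides (inclusive).
LENGTH_RANGES_NT = {
    "tag_sequence": (5, 200),
    "consensus_sequence": (10, 200),
    "primer": (4, 200),
    "bridging_region": (3, 100),          # first or second region
    "bridging_third_region": (0, 50),     # optional spacer
    "bridging_oligonucleotide": (6, 200),
    "random_oligonucleotide": (5, 200),
}


def length_ok(kind: str, sequence: str) -> bool:
    """True if the sequence length falls inside the range for `kind`."""
    low, high = LENGTH_RANGES_NT[kind]
    return low <= len(sequence) <= high


def poly_t_ok(sequence: str, minimum: int = 5) -> bool:
    """True if the sequence is all T and has at least `minimum` residues."""
    return len(sequence) >= minimum and set(sequence) == {"T"}
```

For example, `length_ok("bridging_third_region", "")` accepts a bridging oligonucleotide whose first and second regions are directly adjacent, since the third region may be 0 nt long.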
- the method further comprises: (4) recovering and purifying the second population of nucleic acid molecules.
- the obtained second population of nucleic acid molecules and/or complements thereof are used for constructing a transcriptome library or for transcriptome sequencing.
- the oligonucleotide probes described in step (1) have one or more characteristics selected from the following:
- the consensus sequence X1, tag sequence Y and consensus sequence X2 each independently comprise or consist of naturally occurring nucleotides (such as deoxyribonucleotides or ribonucleotides), modified nucleotides, non-natural nucleotides (such as peptide nucleic acid (PNA) or locked nucleic acid), or any combination thereof;
- the consensus sequence X1, the tag sequence Y and the consensus sequence X2 each independently have a length of 2-200nt (such as 10-200nt, 25-100nt, 10-30nt, 10-100nt, 5-10nt, 10-15nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt).
- the nucleic acid array in step (1) is provided by steps comprising:
- each kind of vector sequence is present in at least one copy (for example, multiple copies), and the vector sequence comprises, in the 5' to 3' direction: the complementary sequence of the consensus sequence X2, the complementary sequence of the tag sequence Y, and the fixed sequence; wherein the complementary sequences of the tag sequence Y of the respective vector sequences are different from each other;
- the extension product comprises or consists of: a consensus sequence X1, a tag sequence Y and a consensus sequence X2;
- steps (3) and (4) are performed in any order;
- the fixed sequence of the carrier sequence also includes a cleavage site, and the cleavage can be selected from nicking enzyme cleavage, USER enzyme cleavage, light cleavage, chemical cleavage or CRISPR cleavage;
- the cleavage site contained in the fixed sequence of the carrier sequence is cut to digest the carrier sequence, so that the extension product in step (3) is separated from the template (i.e., the carrier sequence) from which the extension product was formed, leaving the oligonucleotide probes attached to the surface of a solid support such as a chip.
- the method further includes separating the extension product in step (3) from the template (ie, the vector sequence) forming the extension product by high temperature denaturation.
- each vector sequence is a DNB formed from a concatemer of multiple copies of the vector sequence.
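The relationship between a vector sequence and the probe copied from it can be sketched in a few lines of Python. This is a string-level illustration with hypothetical names. One point is an assumption not stated in this passage: the fixed sequence is modeled as the complement of consensus sequence X1, so that copying the vector yields X1, Y, X2 in that order.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def vector_sequence(x1: str, tag_y: str, x2: str) -> str:
    """Vector sequence 5'->3': c(X2), c(Y), then the fixed sequence
    (assumed here to be the complement of X1)."""
    return revcomp(x2) + revcomp(tag_y) + revcomp(x1)


def concatemer(vector: str, copies: int) -> str:
    """Toy model of a DNB: tandem copies of one vector sequence."""
    return vector * copies


def extension_product(vector: str) -> str:
    """Strand synthesized along one vector copy, read 5'->3'."""
    return revcomp(vector)
```

Copying a concatemer gives tandem repeats of the same probe, which is consistent with every probe at one position carrying the same tag sequence Y.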
- the various vector sequences are provided in step (1) by the following steps:
- using each vector template sequence as a template, a nucleic acid amplification reaction is performed to obtain an amplification product of each vector template sequence, the amplification product comprising at least one copy of the vector sequence; in certain embodiments, rolling circle replication is performed to obtain DNBs formed from concatemers of the vector sequences.
- the consensus sequence X2 comprises a capture sequence capable of hybridizing to all or part of the nucleic acid to be captured, the capture sequence including a poly(T) sequence, a specific sequence for a specific target nucleic acid, or a random oligonucleotide sequence; and the capture sequence has a 3' free end so that the consensus sequence X2 can serve as an extension primer.
- said step (2) comprises: contacting said one or more cells with the solid support of said nucleic acid array, whereby each cell occupies at least one microdot in said nucleic acid array (that is, each cell is in contact with at least one microdot in the nucleic acid array), and the first binding molecule of the cell forms an interaction pair with the first labeling molecule of the solid support; wherein annealing conditions are applied such that the nucleic acid of the one or more cells anneals to the capture sequence, so that the position of the nucleic acid corresponds to the position of the oligonucleotide probe on the nucleic acid array;
- the step (3) includes: under conditions that allow primer extension, performing a primer extension reaction using the oligonucleotide probe as a primer and the captured nucleic acid molecule as a template to generate a labeled (for example, labeled by the tag sequence Y) nucleic acid molecule; and/or performing a primer extension reaction using the captured nucleic acid molecule as a primer and the oligonucleotide probe as a template to generate an extended captured nucleic acid molecule, forming a labeled (e.g., labeled by the complementary sequence of said tag sequence Y) nucleic acid molecule.
- the oligonucleotide probe described in step (1) further comprises a unique molecular identifier (UMI) sequence.
- said UMI sequence is located upstream of said capture sequence.
- the UMI sequences contained in the oligonucleotide probes coupled to the same microdot are different from each other.
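Because probes on one microdot share a tag sequence Y but carry different UMIs, reads that agree in tag, UMI and target can be collapsed into a single original molecule. The following Python sketch shows that counting step. The read representation as `(tag_y, umi, gene)` tuples and the function name are assumptions for illustration; no error correction of UMIs is attempted.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Read = Tuple[str, str, str]  # (tag sequence Y, UMI, gene name)


def count_molecules(reads: Iterable[Read]) -> Dict[Tuple[str, str], int]:
    """Count distinct UMIs per (tag Y, gene); duplicate reads that share a
    tag, UMI and gene are treated as copies of one captured molecule."""
    umis_seen = defaultdict(set)
    for tag_y, umi, gene in reads:
        umis_seen[(tag_y, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}
```

A read amplified many times therefore contributes one count, while two reads with the same tag and gene but different UMIs contribute two.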
- the nucleic acid array in step (1) is provided by steps comprising:
- each kind of carrier sequence is present in multiple copies, and the carrier sequence comprises, from 5' to 3': a positioning sequence and a first fixed sequence,
- the positioning sequence is the complementary sequence of the tag sequence Y;
- the first fixed sequence allows annealing of its complementary nucleotide sequence and initiation of an extension reaction;
- the first primer comprises a first fixed sequence complementary region at its 3' end; the first fixed sequence complementary region comprises the complementary sequence of the first fixed sequence or a fragment thereof, and has a 3' free end;
- said second nucleic acid molecule comprising a consensus sequence X2 (i.e. a capture sequence), which has a 3' free end so that said second nucleic acid molecule can be used as an extension primer,
- the second nucleic acid molecule is linked to the first nucleic acid molecule (for example, using a ligase);
- the ligation product is an oligonucleotide probe comprising, in the 5' to 3' direction, the consensus sequence X1, the tag sequence Y, and the consensus sequence X2.
- the carrier sequence is optionally digested so that the ligation product in step (5) is separated from the carrier sequence, thereby attaching the oligonucleotide probe to the surface of the solid support.
- the first nucleic acid molecule or the second nucleic acid molecule further comprises a UMI sequence.
- the second nucleic acid molecule comprises a UMI sequence located 5' to the capture sequence.
- multiple vector sequences are provided by the following steps:
- using each carrier template sequence as a template, a nucleic acid amplification reaction is performed to obtain an amplification product of each carrier template sequence, the amplification product comprising multiple copies of the carrier sequence;
- the amplification is selected from rolling circle replication (RCA), bridge PCR amplification, multiple strand displacement amplification (MDA) or emulsion PCR amplification; preferably, rolling circle replication is carried out to obtain DNBs formed from concatemers of the carrier sequences; alternatively, bridge PCR amplification, emulsion PCR amplification, or multiple strand displacement amplification is performed to obtain DNA clusters formed from clonal populations of the carrier sequences.
- said oligonucleotide probe is coupled to said solid support via a linker.
- the linker is a linking group capable of reacting with an activating group, and the surface of the solid support is linked with an activating group.
- the linker comprises -SH, -DBCO or -NHS.
- the linker is -DBCO, and the surface of the solid support is bound with ( ester).
- the nucleic acid array in step (1) has one or more characteristics selected from the following:
- (1) the oligonucleotide probes coupled to the same solid support have the same consensus sequence X1 and/or the same consensus sequence X2; (2) the consensus sequence X1 of the oligonucleotide probe comprises a cleavage site, which can be cut or broken by enzymatic cleavage, photo-cleavage, chemical cleavage, or CRISPR cleavage.
- the solid phase support described in step (1) has one or more characteristics selected from the following:
- the solid support is selected from latex beads, dextran beads, polystyrene surfaces, polypropylene surfaces, polyacrylamide gels, gold surfaces, glass surfaces, chips, sensors, electrodes and silicon wafers; In some embodiments, the solid support is a chip;
- the solid support is planar, spherical or porous
- the solid phase support can be used as a sequencing platform, such as a sequencing chip; in some embodiments, the solid phase support is a sequencing chip for Illumina, MGI or Thermo Fisher sequencing platforms; and
- the solid support is capable of releasing the oligonucleotide probes spontaneously or upon exposure to one or more stimuli (e.g., temperature change, pH change, exposure to a specific chemical substance or phase, exposure to light, reducing agent, etc.).
- the present application also provides a method for constructing a library of nucleic acid molecules, which includes,
- step (c) optionally, amplifying and/or enriching the product of step (b);
- a library of nucleic acid molecules is thereby obtained.
- the library of nucleic acid molecules comprises nucleic acid molecules from multiple single cells, and the nucleic acid molecules of different single cells have different tag sequences Y.
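Since molecules from different single cells carry different tag sequences Y, sequenced reads can be sorted back to their cells by tag. The sketch below assumes a lookup table from tag sequence to cell identifier is already available; how that table is obtained is outside this passage, and all names are hypothetical. A cell may own several tags because it can cover more than one microdot.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_reads_by_cell(
    reads: Iterable[Tuple[str, str]],
    tag_to_cell: Dict[str, str],
) -> Tuple[Dict[str, List[str]], List[Tuple[str, str]]]:
    """Split (tag Y, insert) reads into per-cell lists.

    Reads whose tag is not in `tag_to_cell` are returned separately
    instead of being dropped silently."""
    per_cell: Dict[str, List[str]] = defaultdict(list)
    unassigned: List[Tuple[str, str]] = []
    for tag_y, insert in reads:
        cell = tag_to_cell.get(tag_y)
        if cell is None:
            unassigned.append((tag_y, insert))
        else:
            per_cell[cell].append(insert)
    return dict(per_cell), unassigned
```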
- the library of nucleic acid molecules is used for sequencing, e.g., transcriptome sequencing, e.g., single cell transcriptome sequencing (e.g., 5' or 3' transcriptome sequencing).
- before performing step (b), the method further comprises a step (pre-b): amplifying and/or enriching the population of labeled nucleic acid molecules.
- in step (pre-b), the population of labeled nucleic acid molecules is subjected to a nucleic acid amplification reaction to generate an enriched product.
- the amplification reaction is performed using at least primer C and/or primer D, wherein the primer C is capable of hybridizing or annealing to the complementary sequence of the consensus sequence X1 or a partial sequence thereof and initiating an extension reaction; the primer D is capable of hybridizing or annealing to the nucleic acid strand containing the tag sequence Y in the labeled nucleic acid molecule population and initiating an extension reaction.
- the nucleic acid amplification reaction in step (pre-b) is performed using a nucleic acid polymerase (eg, DNA polymerase, eg, DNA polymerase with strand displacement activity and/or high fidelity).
- in step (b) of the method, the nucleic acid molecules are randomly fragmented with a transposase and adapters are added.
- the nucleic acid molecules obtained in the previous step are randomly fragmented with a transposase, and a first adapter and a second adapter are respectively added to the two ends of each fragment.
- the transposase is selected from Tn5 transposase, MuA transposase, Sleeping Beauty transposase, Mariner transposase, Tn7 transposase, Tn10 transposase, Ty1 transposase, Tn552 transposase, and variants, modified products and derivatives having the transposition activity of the above-mentioned transposases.
- the transposase is a Tn5 transposase.
- in step (c), the product of step (b) is amplified using at least a primer C' and/or a primer D', wherein said primer C' is capable of hybridizing or annealing to said first adapter and initiating an extension reaction, and said primer D' is capable of hybridizing or annealing to said second adapter and initiating an extension reaction.
- in step (c), the product of step (b) is amplified using at least the primer C and/or a primer D'; wherein the primer D' is capable of hybridizing or annealing to the first adapter or the second adapter and initiating an extension reaction.
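A toy model can help picture how fragmentation and adapter addition interact with the choice of amplification primers. The Python sketch below is deliberately simplified and is not a model of real transposase chemistry: it works on a single strand, takes the cut positions as input rather than drawing them at random, and attaches the first adapter to the 5' side and the second adapter to the 3' side of each cut end. All names are hypothetical.

```python
from typing import Iterable, List


def tagment(molecule: str, cut_positions: Iterable[int],
            first_adapter: str, second_adapter: str) -> List[str]:
    """Cut `molecule` at the given positions and add adapters at cut ends.

    The original 5' and 3' ends of the molecule are not cut ends, so the
    terminal fragments keep those ends without an adapter."""
    cuts = sorted({p for p in cut_positions if 0 < p < len(molecule)})
    bounds = [0] + cuts + [len(molecule)]
    fragments = []
    for start, end in zip(bounds, bounds[1:]):
        left = first_adapter if start != 0 else ""
        right = second_adapter if end != len(molecule) else ""
        fragments.append(left + molecule[start:end] + right)
    return fragments


def keeps_original_5prime_end(fragment: str, end_sequence: str,
                              second_adapter: str) -> bool:
    """True for a fragment that still starts with the original 5' end
    sequence and carries the second adapter at its 3' side."""
    return fragment.startswith(end_sequence) and fragment.endswith(second_adapter)
```

In this picture, amplifying with a primer specific to the original end together with an adapter primer selects the terminal fragment, whereas a pair of adapter primers selects internal fragments.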
- the present application also provides a method for performing transcriptome sequencing on cells in a sample, comprising:
- the application also provides a method for single-cell transcriptome analysis, comprising:
- a kit comprising:
- a nucleic acid array for labeling nucleic acids and, optionally, a first binding molecule; the nucleic acid array comprises a solid support, the solid support (for example, on its surface) contains a first labeling molecule, and the first binding molecule is capable of forming an interaction pair with the first labeling molecule;
- the solid support also includes a plurality of microdots, the size of the microdots (such as equivalent diameter) is less than 5 μm, and the center-to-center distance between adjacent microdots is less than 10 μm; each microdot is coupled with one kind of oligonucleotide probe; each kind of oligonucleotide probe is present in at least one copy; and the oligonucleotide probe comprises or consists of: a consensus sequence X1, a tag sequence Y and a consensus sequence X2, wherein,
- oligonucleotide probes coupled to different microdots have different tag sequences Y.
- the center-to-center distance between adjacent microdots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and the size of the microdots (e.g., equivalent diameter) is less than 5 μm, less than 1 μm, less than 0.3 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm.
- the center-to-center distance between adjacent microdots is 0.5 μm to 1 μm, such as 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm.
- the size of the microdots (such as equivalent diameter) is 0.001 μm to 0.5 μm (such as 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, 0.2 μm to 0.3 μm).
- the solid support comprises a plurality (e.g., at least 10, at least 10², at least 10³, at least 10⁴, at least 10⁵, at least 10⁶, at least 10⁷, at least 10⁸, or more) of microdots; in certain embodiments, the solid support comprises at least 10⁴ (e.g., at least 10⁴, at least 10⁵, at least 10⁶, at least 10⁷, at least 10⁸, at least 10⁹, at least 10¹⁰, at least 10¹¹, or at least 10¹²) microdots per square millimeter.
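The center-to-center distance and the areal density of microdots are linked by simple geometry. The sketch below assumes the microdots lie on a regular square grid, which this passage does not state; under that assumption a pitch of p μm gives (1000 / p)² microdots per square millimeter. The function names are hypothetical.

```python
def microdots_per_mm2(pitch_um: float) -> float:
    """Areal density for a square grid with the given center-to-center
    distance in micrometers (1 mm = 1000 um)."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    dots_per_mm = 1000.0 / pitch_um
    return dots_per_mm ** 2


def max_pitch_um_for_density(density_per_mm2: float) -> float:
    """Largest square-grid pitch that still reaches the given density."""
    if density_per_mm2 <= 0:
        raise ValueError("density must be positive")
    return 1000.0 / density_per_mm2 ** 0.5
```

Under this assumption, a pitch of 10 μm corresponds to 10⁴ microdots per square millimeter, and a pitch of 0.5 μm corresponds to 4 × 10⁶ microdots per square millimeter.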
- the first binding molecule can form a specific interaction pair or a non-specific interaction pair with the first label molecule.
- the interaction pair is selected from positive and negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), molecular pairs capable of click chemistry reactions (e.g., an alkynyl-containing group and an azido-containing compound), an N-hydroxysulfosuccinimide (NHS) ester and an amino-containing compound, or any combination thereof.
- the first labeling molecule is poly-lysine, and the first binding molecule is a protein capable of binding to poly-lysine; the first labeling molecule is an antibody, and the first binding molecule is an antigen capable of binding to the antibody; the first labeling molecule is an amino-containing compound, and the first binding molecule is an N-hydroxysulfosuccinimide (NHS) ester; or, the first labeling molecule is biotin, and the first binding molecule is streptavidin.
- the first binding molecule is streptavidin.
- the kit further comprises:
- (i) primer I-A; or a primer set comprising primer I-A' and primer I-B; or a primer set comprising primer I-A and primer I-B, wherein:
- the primer I-A contains a consensus sequence A and a capture sequence A, the capture sequence A being capable of annealing to the RNA to be captured (for example, mRNA) and initiating an extension reaction; preferably, the consensus sequence A is located upstream of the capture sequence A (for example, at the 5' end of the primer I-A);
- the primer I-A' comprises a capture sequence A capable of annealing to the RNA to be captured (for example, mRNA) and initiating an extension reaction;
- the primer I-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; wherein the 3' end overhang complementary sequence is located at the 3' end of the primer I-B, and the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer I-B); wherein the 3' end overhang refers to one or more non-templated nucleotides contained at the 3' end of the cDNA strand generated by reverse transcription using the RNA captured by the capture sequence A of the primer I-A' as a template;
- (ii) bridging oligonucleotide I comprising: a first region and a second region, and optionally a third region between the first region and the second region, the first region being located upstream of the second region (e.g., at the 5' end); wherein,
- the first region can (a) anneal to all or part of the consensus sequence A of the primer I-A or (b) anneal to all or part of the consensus sequence B of the primer I-B;
- the second region is capable of annealing to all or part of the consensus sequence X2.
- the kit comprises: primer I-A as described in (i), and bridging oligonucleotide I as described in (ii); wherein, the bridging oligonucleotide The first region of I can anneal to all or part of the consensus sequence A of the primer I-A, and the second region of the bridging oligonucleotide can anneal to all or part of the consensus sequence X2;
- the capture sequence A of the primer I-A is a random oligonucleotide sequence; or, the capture sequence A of the primer I-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer I-A further comprises a tag sequence A, which is, for example, a random oligonucleotide sequence.
- the 5' end of primer I-A comprises a phosphorylation modification.
- the kit comprises: a primer set comprising primer I-A' and primer I-B as described in (i), and bridging oligonucleotide I as described in (ii); wherein the first region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence B of the primer I-B, and the second region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence X2;
- the capture sequence A of the primer I-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer I-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer I-A' further comprises a tag sequence A and a consensus sequence A;
- the primer I-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and a tag sequence B.
- the kit further comprises a primer B", capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof, and capable of initiating an extension reaction.
- primer I-B or primer B" comprises a phosphorylation modification.
- the primer I-B comprises modified nucleotides (eg locked nucleic acid); preferably, the 3' end of the primer I-B comprises one or more modified nucleotides (eg locked nucleic acid).
- the kit comprises: a primer set comprising primer I-A and primer I-B as described in (i), and bridging oligonucleotide I as described in (ii); wherein the first region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence A of the primer I-A, and the second region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence X2;
- the capture sequence A of the primer I-A is a random oligonucleotide sequence; or, the capture sequence A of the primer I-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer I-A further comprises a tag sequence A;
- the tag sequence A is, for example, a random oligonucleotide sequence.
- the 5' end of primer I-A comprises a phosphorylation modification.
- the primer I-B comprises modified nucleotides (eg locked nucleic acid); preferably, the 3' end of the primer I-B comprises one or more modified nucleotides (eg locked nucleic acid).
- the kit further comprises:
- (i) a primer set comprising primer II-A and primer II-B, or comprising primer II-A' and primer II-B', wherein:
- the primer II-A contains a capture sequence A capable of annealing to the RNA to be captured (eg, mRNA) and initiating an extension reaction;
- the primer II-B comprises a consensus sequence B, a 3' end overhanging complementary sequence, and an optional tag sequence B; wherein the 3' end overhanging complementary sequence is located at the 3' end of the primer II-B,
- the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer II-B); wherein the 3' end overhang refers to one or more non-templated nucleotides contained at the 3' end of the cDNA strand generated by reverse transcription using, as a template, the RNA captured by the capture sequence A of the primer II-A;
- the primer II-A' contains a consensus sequence A and a capture sequence A; wherein the capture sequence A is located at the 3' end of the primer II-A', and the consensus sequence A is located upstream of the capture sequence A (for example, at the 5' end of the primer II-A');
- the primer II-B' comprises a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; wherein the 3' end overhang complementary sequence is located at the 3' end of the primer II-B', and the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer II-B'); wherein the 3' end overhang refers to one or more non-templated nucleotides contained at the 3' end of the cDNA strand generated by reverse transcription using, as a template, the RNA captured by the capture sequence A of the primer II-A'.
- the kit comprises: a primer set of primer II-A and primer II-B as described in (i), and (ii) bridging oligonucleotide II-I and bridging oligonucleotide II-II; wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently comprise: a first region and a second region, and optionally a third region located between the first region and the second region, the first region being located upstream (e.g., at the 5' end) of the second region; wherein,
- the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B of the primer II-B or a partial sequence thereof;
- the capture sequence A of the primer II-A is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid, the Primer II-A preferably further comprises a consensus sequence A and an optional tag sequence A, such as a random oligonucleotide sequence;
- the primer II-B contains the consensus sequence B, the complementary sequence overhanging at the 3' end, and the tag sequence B.
- the primer II-B comprises modified nucleotides (such as locked nucleic acid); preferably, the 3' end of the primer II-B comprises one or more modified nucleotides (such as locked nucleic acid).
- the kit comprises: a primer set of primer II-A and primer II-B as described in (i);
- the capture sequence A of the primer II-A is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid, the Primer II-A preferably further comprises a consensus sequence A and an optional tag sequence A, such as a random oligonucleotide sequence;
- the primer II-B contains the consensus sequence B, the complementary sequence overhanging at the 3' end, and the tag sequence B.
- the primer II-B comprises modified nucleotides (such as locked nucleic acid); preferably, the 3' end of the primer II-B comprises one or more modified nucleotides (such as locked nucleic acid).
- the kit comprises: a primer set of primer II-A' and primer II-B' as described in (i), and (ii) bridging oligonucleotide II-I and bridging oligonucleotide II-II; wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently comprise: a first region and a second region, and optionally a third region located between the first region and the second region, the first region being located upstream (e.g., at the 5' end) of the second region; wherein,
- the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
- the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A of the primer II-A' or a partial sequence thereof;
- the capture sequence A of the primer II-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid,
- the primer II-A' further comprises a tag sequence A, such as a random oligonucleotide sequence.
- the primer II-B' comprises modified nucleotides (eg, locked nucleic acid); preferably, the 3' end of the primer II-B' comprises one or more modified nucleotides (e.g. locked nucleic acids).
- the kit further comprises a primer B", capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof, and capable of initiating an extension reaction.
- it comprises a primer set of primer II-A' and primer II-B' as described in (i);
- the capture sequence A of the primer II-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid,
- the primer II-A' further comprises a tag sequence A, such as a random oligonucleotide sequence;
- the primer II-B' contains the consensus sequence B, the complementary sequence overhanging at the 3' end, and the tag sequence B.
- the primer II-B' comprises modified nucleotides (eg, locked nucleic acid); preferably, the 3' end of the primer II-B' comprises one or more modified nucleotides (e.g. locked nucleic acids).
- the kit further comprises a primer B", capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof, and capable of initiating an extension reaction.
- the kit has one or more features selected from:
- the oligonucleotide probe, primer I-A, primer II-A, primer I-A', primer II-A', primer I-B, primer II-B, primer II-B', primer B", bridging oligonucleotide I, bridging oligonucleotide II-I, and bridging oligonucleotide II-II each independently comprise or consist of naturally occurring nucleotides (such as deoxyribonucleotides or ribonucleotides), modified nucleotides, non-natural nucleotides, or any combination thereof;
- the oligonucleotide probes each independently have a length of 15-300nt (such as 15-200nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt);
- primer I-A, primer II-A, primer I-A', primer II-A', primer I-B, primer II-B, primer II-B', and primer B" each independently have a length of 4-200nt (for example 5-200nt, 15-230nt, 26-115nt, 10-130nt, 10-20nt, 20-50nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt);
- the bridging oligonucleotide I, the bridging oligonucleotide II-I, and the bridging oligonucleotide II-II each independently have 6-200nt (such as 20-100nt, 20-70nt, 6-15nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt) length;
- the oligonucleotide probes coupled to the same solid support have the same consensus sequence X1 and/or the same consensus sequence X2;
- the consensus sequence X1 of the oligonucleotide probe comprises a cleavage site; the cleavage site can be cut or broken by photocleavage, chemical cleavage, or CRISPR-mediated cleavage.
- the kit further comprises reverse transcriptase, nucleic acid ligase, nucleic acid polymerase and/or transposase.
- the reverse transcriptase has terminal transfer activity.
- the reverse transcriptase is capable of synthesizing a cDNA strand using RNA (eg, mRNA) as a template, and adding the 3' end overhang at the 3' end of the cDNA strand.
- the reverse transcriptase is capable of adding, at the 3' end of the cDNA strand, an overhang of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10 or more nucleotides in length.
- the reverse transcriptase is capable of adding an overhang of 2-5 cytosine nucleotides (eg, a CCC overhang) at the 3' end of the cDNA strand.
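The pairing between the non-templated 3' overhang on the cDNA strand and the 3' end overhang complementary sequence of primer II-B can be checked with a few lines of code. The sketch below is only an illustration: the primer sequence is hypothetical, and the helper `pairs_with_overhang` is not part of the application.

```python
# Illustrative only: the primer sequence below is hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def pairs_with_overhang(primer: str, overhang: str) -> bool:
    """True if the primer's 3' end is complementary to the cDNA 3' overhang."""
    return primer.endswith(reverse_complement(overhang))


cdna_overhang = "CCC"           # e.g., a CCC overhang added by the reverse transcriptase
primer_ii_b = "ACGTTGCAACTGGG"  # assumed: consensus B and tag B followed by a GGG 3' end
print(pairs_with_overhang(primer_ii_b, cdna_overhang))
```

For a CCC overhang the complementary 3' end is GGG, so a primer ending in any other trinucleotide would not pass this check.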
- the reverse transcriptase is selected from the group consisting of M-MLV reverse transcriptase, HIV-1 reverse transcriptase, AMV reverse transcriptase, telomerase reverse transcriptase, and variants, modified products and derivatives having the reverse transcription activity of the above reverse transcriptases.
- the nucleic acid polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
- the nucleic acid polymerase has 5' to 3' exonuclease activity or strand displacement activity.
- the transposase is selected from Tn5 transposase, MuA transposase, Sleeping Beauty transposase, Mariner transposase, Tn7 transposase, Tn10 transposase, Ty1 transposase, Tn552 transposase, and variants, modified products and derivatives having the transposition activity of the above-mentioned transposases.
- the kit further comprises: the primer C, the primer D, the primer C' and/or the primer D'.
- the kit further comprises the primer C, the primer D and the primer D'.
- said kit further comprises said primer C, said primer D, said primer C' and said primer D'.
- the kit further comprises: reagents for nucleic acid hybridization, reagents for nucleic acid extension, reagents for nucleic acid amplification, reagents for recovering or purifying nucleic acids, reagents for constructing a transcriptome sequencing library, reagents for sequencing (such as next-generation sequencing or third-generation sequencing), or any combination thereof.
- cells useful in the methods of the invention can be any cells of interest, e.g., cancer cells, stem cells, neural cells, fetal cells, and immune cells involved in the immune response.
- the cell may be one cell or multiple cells.
- the cells may be a mixture of cells of the same type, or a completely heterogeneous mixture of cells of different types.
- Different cell types may include different tissue cells of an individual or the same tissue cells of different individuals or cells derived from microorganisms of different genus, species, strain, variant, or any combination of any or all of the foregoing.
- different cell types may include normal cells and cancer cells from an individual; various cell types obtained from a human subject, such as various immune cells; various different bacterial species, strains and/or variants from environmental, forensic, microbiome or other samples; or any other mixture of various cell types.
- the term "UMI" refers to a "Unique Molecular Identifier", i.e., a unique molecular label, which can be used for qualitative and/or quantitative analysis of nucleic acid molecules. Unless otherwise indicated herein or clearly contradicted by the context, the present application does not limit the position and quantity of the UMI or its complementary sequence in the nucleic acid molecule. For example, when the cDNA strand contains the UMI or its complementary sequence, the UMI or its complementary sequence can be located at the 3' end of the cDNA sequence in the cDNA strand, or at the 5' end of the cDNA sequence, or at both the 3' end and the 5' end.
- the UMI or its complementary sequence can be located at the 3' end of the complementary sequence of the cDNA sequence in the complementary strand of the cDNA strand, or at the 5' end of that complementary sequence, or at both the 3' end and the 5' end.
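How a UMI supports quantification can be illustrated with a small counting sketch. It assumes that each sequenced read has already been reduced to a (gene, UMI) pair and that reads sharing both values derive from the same original molecule; the gene names, UMI sequences, and the function `count_molecules` are hypothetical and are not taken from the application.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per gene; duplicate (gene, UMI) reads count once."""
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


# Hypothetical reads: two of the GAPDH reads carry the same UMI.
reads = [
    ("GAPDH", "ACGTAC"),
    ("GAPDH", "ACGTAC"),
    ("GAPDH", "TTGCAA"),
    ("ACTB", "GGATCC"),
]
counts = count_molecules(reads)
print(counts)
```

The sketch ignores sequencing errors within the UMI, which practical pipelines usually have to correct before counting.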
- DNB: DNA nanoball
- RCA: rolling circle amplification
- the RCA product is a multi-copy single-stranded DNA sequence, which can form a roughly "spherical" structure due to the interaction forces between the bases of the internal DNA sequence.
- the library molecules are circularized to form single-stranded circular DNA, and then the single-stranded circular DNA can be amplified by multiple orders of magnitude using rolling circle amplification technology, and the resulting amplification product is called DNB.
- a "population of nucleic acid molecules" refers to a group or collection of nucleic acid molecules, for example nucleic acid molecules derived directly or indirectly from target nucleic acid molecules (e.g., DNA double-stranded molecules, RNA/cDNA hybrid double-stranded molecules, DNA single-stranded molecules, or RNA single-stranded molecules).
- the population of nucleic acid molecules comprises a library of nucleic acid molecules comprising sequences qualitatively and/or quantitatively representative of target nucleic acid molecule sequences.
- the population of nucleic acid molecules comprises a subset of a library of nucleic acid molecules.
- a "library of nucleic acid molecules" refers to a collection or population of labeled nucleic acid molecules (e.g., labeled DNA double-stranded molecules, labeled RNA/cDNA hybrid double-stranded molecules, labeled DNA single-stranded molecules, or labeled RNA single-stranded molecules), or fragments thereof, generated directly or indirectly from target nucleic acid molecules, wherein the combination of labeled nucleic acid molecules or fragments thereof in the collection or population exhibits sequences that qualitatively and/or quantitatively represent the sequences of the target nucleic acid molecules from which the labeled nucleic acid molecules were generated.
- the library of nucleic acid molecules is a sequencing library.
- the library of nucleic acid molecules can be used to construct a sequencing library.
- "cDNA" or "cDNA strand" refers to the "complementary DNA" synthesized by extension of a primer that anneals to an RNA molecule of interest, catalyzed by an RNA-dependent DNA polymerase or reverse transcriptase, using at least a portion of the RNA molecule of interest as a template (this process is also called "reverse transcription").
- the synthesized cDNA molecule is "homologous” or “complementary” or “base paired” or “complexed” with at least a portion of the template.
- upstream is used to describe the relative positional relationship of two nucleic acid sequences (or two nucleic acid molecules), and has the meaning generally understood by those skilled in the art.
- the expression "one nucleic acid sequence is located upstream of another nucleic acid sequence" means that, when arranged in the 5' to 3' direction, the former is located in a more forward position (i.e., a position closer to the 5' end) than the latter.
- downstream has the opposite meaning of "upstream”.
- As used herein, "tag sequence Y", "tag sequence A", "tag sequence B", "consensus sequence X1", "consensus sequence X2", "consensus sequence A", "consensus sequence B", etc. refer to oligonucleotides, being non-target nucleic acid components, that provide the nucleic acid molecule to which they are joined, or a derivative product of that molecule, with a means of identification, recognition, and/or molecular or biochemical manipulation (e.g., by providing a site for annealing an oligonucleotide, such as a primer for DNA polymerase extension or an oligonucleotide for a capture reaction or ligation reaction).
- the oligonucleotides may consist of at least two consecutive nucleotides (preferably about 6 to 100, but there is no firm limit to the length of the oligonucleotide; the exact size depends on many factors, which in turn depend on the final function or use of the oligonucleotide), or may be composed of multiple oligonucleotides in continuous or discontinuous arrangement.
- the oligonucleotide sequence may be unique for each nucleic acid molecule it ligates, or it may be unique for a certain class of nucleic acid molecules it ligates.
- the oligonucleotide sequence can be reversibly or irreversibly joined to the polynucleotide sequence to be "labeled” by any means including ligation, hybridization or other methods.
- the process of joining the oligonucleotide sequence to a nucleic acid molecule is sometimes referred to herein as "labeling", and a nucleic acid molecule undergoing labeling or containing a labeling sequence is referred to as a "labeled nucleic acid molecule".
- Nucleic acids or polynucleotides of the present invention may comprise one or more modified nucleobases, sugar moieties or internucleoside linkages.
- reasons for using nucleic acids or polynucleotides comprising modified nucleobases, sugar moieties or internucleoside linkages include, but are not limited to: (1) changing the Tm; (2) changing the susceptibility of the polynucleotide to one or more nucleases; (3) providing a moiety for attaching a label; (4) providing a label or a label quencher; or (5) providing a moiety, such as biotin, for attaching another molecule in solution or bound to a surface.
- oligonucleotides, such as primers, may be synthesized such that the random portion comprises one or more conformationally constrained nucleic acid analogs, such as, but not limited to, one or more ribonucleic acid analogs in which the ribose ring is "locked" by a methylene bridge linking the 2'-O atom to the 4'-C atom; these modified nucleotides increase the Tm (melting temperature) by about 2 degrees Celsius to about 8 degrees Celsius per nucleotide monomer.
- one indicator of the use of modified nucleotides in the method may be that the oligonucleotide comprising the modified nucleotides may be digested by single-strand-specific RNases.
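The stated Tm effect of locked nucleotides (about 2 to about 8 degrees Celsius per monomer) lends itself to a short worked calculation. The sketch below assumes, purely for illustration, that the per-monomer contributions add linearly; the application does not state this, and the function name `lna_tm_shift_range` is hypothetical.

```python
def lna_tm_shift_range(n_locked: int,
                       per_monomer_low: float = 2.0,
                       per_monomer_high: float = 8.0) -> tuple:
    """Return the (low, high) Tm increase in degrees Celsius.

    Assumes each locked monomer contributes independently, which is a
    simplification made only for this illustration.
    """
    if n_locked < 0:
        raise ValueError("n_locked must be non-negative")
    return (n_locked * per_monomer_low, n_locked * per_monomer_high)


# Three locked nucleotides at the 3' end of a primer (hypothetical design).
low, high = lna_tm_shift_range(3)
print(f"Estimated Tm increase: {low:.0f} to {high:.0f} degrees Celsius")
```

Under this assumption, three locked nucleotides would raise the Tm by roughly 6 to 24 degrees Celsius.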
- the "first binding molecule" is capable of specific interaction or non-specific interaction with the "first labeling molecule".
- the first binding molecule interacts with the first labeling molecule in a manner selected from positive-negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), click chemistry reactions (e.g., alkynyl group-azido compound), or any combination thereof.
- the first labeling molecule is poly-lysine, and the first binding molecule is a protein capable of binding to poly-lysine; the first labeling molecule is an antibody, and the first binding molecule is an antigen capable of binding to the antibody; the first labeling molecule is biotin, and the first binding molecule is streptavidin; the first binding molecule is a compound containing an alkyne group, and the first labeling molecule is an azido compound; or, the first binding molecule is an N-hydroxysulfosuccinate (NHS) ester, and the first labeling molecule is an amino-containing compound.
- the first binding molecule is an amino-containing compound.
- the first labeling molecule is an antigen, and the first binding molecule is an antibody capable of binding to the antigen; the first labeling molecule is streptavidin, and the first binding molecule is biotin; the first binding molecule is an azide compound, and the first labeling molecule is a compound containing an alkynyl group; or, the first binding molecule is an amino-containing compound, and the first labeling molecule is an N-hydroxysulfosuccinate (NHS) ester.
- a nucleic acid base in a single nucleotide at one or more positions in a polynucleotide or oligonucleotide may include guanine, adenine, uracil, thymine, or cytosine.
- one or more of the nucleic acid bases may comprise modified bases such as, but not limited to, xanthine, allylamino-uracil, allylamino-thymidine, hypoxanthine, 2-aminoadenine, 5-propynyluracil, 5-propynylcytosine, 4-thiouracil, 6-thioguanine, and aza- and deaza-uracil, thymidine, cytosine, adenine or guanine.
- nucleic acid bases may comprise nucleic acid bases derivatized with a biotin moiety, a digoxigenin moiety, a fluorescent or chemiluminescent moiety, a quencher moiety, or some other moiety.
- the invention is not limited to the listed nucleic acid bases; the list given shows examples of a wide range of bases that can be used in the methods of the invention.
- one or more of the sugar moieties may include 2'-deoxyribose, or alternatively, one or more of the sugar moieties may include some other sugar moiety, such as, but not limited to: ribose, 2'-fluoro-2'-deoxyribose or 2'-O-methyl-ribose, which provide resistance to some nucleases, or 2'-amino-2'-deoxyribose or 2'-azido-2'-deoxyribose, which can be labeled with visible, fluorescent, or infrared fluorescent labels.
- internucleoside linkages of nucleic acids or polynucleotides of the invention may be phosphodiester linkages, or alternatively, one or more of the internucleoside linkages may include modified linkages such as, but not limited to: phosphorothioate, phosphorodithioate, phosphoroselenate, or phosphorodiselenate linkages, which are resistant to some nucleases.
- "terminal transfer activity" refers to the activity of catalyzing the template-independent addition (or "tailing") of one or more deoxyribonucleoside triphosphates (dNTPs) or a single dideoxyribonucleoside triphosphate to the 3' end of the cDNA.
- Examples of reverse transcriptases having terminal transfer activity include, but are not limited to, M-MLV reverse transcriptase, HIV-1 reverse transcriptase, AMV reverse transcriptase, telomerase reverse transcriptase, and variants, modified products and derivatives of these reverse transcriptases having reverse transcription activity and terminal transfer activity. The reverse transcriptase may or may not have RNase activity (in particular, RNase H activity).
- the reverse transcriptase used to reverse transcribe RNA to generate cDNA does not have RNase activity.
- the reverse transcriptase used to reverse transcribe RNA to generate cDNA has terminal transfer activity and does not have RNase activity.
- a nucleic acid polymerase with "strand displacement activity" is one that, in the process of extending a new nucleic acid strand, can continue the extension reaction when it encounters a downstream nucleic acid strand complementary to the template strand, and displace that downstream strand.
- a nucleic acid polymerase having "5' to 3' exonuclease activity" refers to a nucleic acid polymerase capable of catalyzing the hydrolysis of 3',5'-phosphodiester bonds, thereby degrading nucleotides.
- a nucleic acid polymerase (or DNA polymerase) with "high fidelity" is one whose probability of introducing a wrong nucleotide during nucleic acid amplification (i.e., the error rate) is lower than that of the wild-type Taq enzyme (for example, the Taq enzyme whose sequence is shown in UniProt Accession: P19821.1).
- As used herein, the terms "anneal", "annealing", "hybridize", "hybridizing" and the like refer to the formation of complexes between nucleotide sequences that have sufficient complementarity to form a complex via Watson-Crick base pairing.
- nucleic acid sequences that are "complementary to" or "hybridize" or "anneal" to each other should be capable of forming, or form, a "hybrid" or "complex" that is sufficiently stable to serve the intended purpose.
- it is not required that every nucleic acid base within the sequence represented by one nucleic acid molecule be capable of base pairing or complexing with every nucleic acid base within the sequence represented by a second nucleic acid molecule in order for the two nucleic acid molecules, or the corresponding sequences shown therein, to be "complementary" or to "anneal" or "hybridize" to each other.
- the terms “complementary” or “complementarity” are used when referring to a sequence of nucleotides related by the base pairing rules. For example, the sequence 5'-A-G-T-3' is complementary to the sequence 3'-T-C-A-5'.
- Complementarity can be "partial,” wherein only some of the nucleic acid bases match according to the base pairing rules. Alternatively, there may be “perfect” or “total” complementarity between nucleic acids. The degree of complementarity between nucleic acid strands has a significant effect on the efficiency and strength of hybridization between nucleic acid strands. This is particularly important in amplification reactions and detection methods that rely on hybridization of nucleic acids.
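The distinction between partial and perfect complementarity can be made concrete with a short calculation. The following sketch assumes two strands of equal length, both written 5' to 3', paired in antiparallel orientation by Watson-Crick rules only; the function name `complementarity` is hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base-pair when the strands align antiparallel.

    Both strands are written 5'->3'. Only equal-length strands are handled,
    which is a simplification for this sketch.
    """
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("this sketch only compares non-empty strands of equal length")
    antiparallel_b = strand_b[::-1]  # read strand_b 3'->5' against strand_a
    matches = sum(PAIR.get(a) == b for a, b in zip(strand_a, antiparallel_b))
    return matches / len(strand_a)


# 5'-A-G-T-3' against 3'-T-C-A-5' (written 5'->3' as ACT): perfect complementarity.
print(complementarity("AGT", "ACT"))
# One mismatch out of three positions: partial complementarity.
print(complementarity("AGT", "ACC"))
```

A value of 1.0 corresponds to "perfect" or "total" complementarity, and any value between 0 and 1 to "partial" complementarity.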
- the term “homology” refers to the degree of complementarity of one nucleic acid sequence to another nucleic acid sequence. There may be partial or complete homology (ie, complementarity).
- a partially complementary sequence is one that at least partially inhibits the hybridization of a fully complementary sequence to a target nucleic acid and is referred to using the functional term "substantially homologous". Inhibition of hybridization of a perfectly complementary sequence to a target sequence can be examined under low stringency conditions using a hybridization assay (eg, Southern or Northern blot, solution hybridization, etc.). Substantially homologous sequences or probes will compete or inhibit binding (ie, hybridization) of a fully homologous sequence to a target under conditions of low stringency. This is not to say that low stringency conditions are conditions that allow non-specific binding; low stringency conditions require that the binding of two sequences to each other is a specific (ie selective) interaction.
- the absence of non-specific binding can be tested by using a second target that lacks complementarity or has only a low degree of complementarity (eg, less than about 30% complementarity). In cases of little or no specific binding, the probe will not hybridize to the nucleic acid target.
- "substantially homologous", when used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, means any oligonucleotide or probe that can hybridize to one or both strands of the double-stranded nucleic acid sequence under low stringency conditions as described herein.
- the terms “anneal” or “hybridize” are used when referring to the pairing of complementary nucleic acid strands.
- Hybridization and the strength of hybridization are affected by a number of factors well known in the art, including the degree of complementarity between the nucleic acids, the stringency of the conditions (which is affected by factors such as salt concentration), the Tm (melting temperature) of the hybrid formed, the presence of other components (e.g., the presence or absence of polyethylene glycol or betaine), the molarity of the hybridizing strands, and the G:C content of the nucleic acid strands.
- the solid support can release the oligonucleotide probe spontaneously or upon exposure to one or more stimuli (e.g., temperature change, pH change, exposure to a particular chemical species or phase, exposure to light, reducing agent, etc.). It will be appreciated that the oligonucleotide probe may be released by cleavage of the bond between the oligonucleotide probe and the solid support, by degradation of the solid support itself, or both, so that the oligonucleotide probe becomes accessible to, or can be accessed by, other reagents.
- Addition of various types of labile bonds to the solid support can result in a solid support capable of responding to different stimuli.
- Each type of labile bond can be sensitive to relevant stimuli (eg, chemical stimuli, light, temperature, etc.), so that the release of substances attached to the solid support through each labile bond can be controlled by applying appropriate stimuli.
- labile bonds that can be coupled to solid supports include ester bonds (e.g., cleavable by acids, bases, or hydroxylamine), vicinal diol bonds (e.g., cleavable by sodium periodate), Diels-Alder bonds (e.g., cleavable by heat), sulfone bonds (e.g., cleavable by bases), silyl ether bonds (e.g., cleavable by acids), glycosidic bonds (e.g., cleavable by amylases), peptide bonds (e.g., cleavable by proteases), or phosphodiester bonds (e.g., cleavable by nucleases such as DNase).
- the solid support can be degradable, destructible or soluble spontaneously or upon exposure to one or more stimuli (e.g., a change in temperature, a change in pH, exposure to a particular chemical species or phase, exposure to light, reducing agents, etc.).
- a solid support can be soluble such that the material components of the solid support dissolve upon exposure to a particular chemical or environmental change (eg, a change in temperature or a change in pH).
- the solid support degrades or dissolves under elevated temperature and/or alkaline conditions.
- the solid support can be thermally degradable such that when the solid support is exposed to an appropriate temperature change (eg, heating), the solid support degrades. Degradation or dissolution of a solid support bound to a substance (eg, an oligonucleotide probe) can result in the release of the substance from the solid support.
- "transposase", "reverse transcriptase" and "nucleic acid polymerase" refer to protein molecules or aggregates of protein molecules responsible for catalyzing specific chemical and biological reactions.
- the methods, compositions or kits of the invention are not limited to the use of a particular transposase, reverse transcriptase or nucleic acid polymerase from a particular source.
- the methods, compositions or kits of the invention include any transposase, reverse transcriptase or nucleic acid polymerase from any source that has enzymatic activity equivalent to that of a particular enzyme disclosed herein for a particular method, composition or kit.
- the method of the present invention also includes the following embodiment: any specific enzyme provided and used in a step of the method is replaced by a combination of two or more enzymes, and the two or more enzymes, when used in combination (whether separately in a stepwise fashion or together simultaneously), produce the same result in the reaction mixture as that obtained with the one particular enzyme.
- the methods, buffers and reaction conditions provided herein, including those in the Examples, are presently preferred for embodiments of the methods, compositions and kits of the invention.
- other enzyme storage buffers, reaction buffers and reaction conditions using some of the enzymes of the invention are known in the art and may also be suitable for use in the invention and are included herein.
- the application provides high-resolution nucleic acid arrays (such as chips) and methods capable of positioning and labeling nucleic acid molecules, as well as high-throughput sequencing methods (especially high-throughput single-cell transcriptome sequencing methods) that use the nucleic acid arrays or methods.
- the method of the present application has one or more beneficial technical effects selected from the following:
- the nucleic acid array (such as a chip) has high resolution: a single-cell area (such as 80-100 μm²) can contain at least 50 (such as at least 50, at least 100, at least 200, at least 300, at least 400, or at least 500) microdots; each microdot is coupled with a labeling oligonucleotide probe containing position information (for example, an oligonucleotide probe containing a tag sequence Y), and each oligonucleotide probe comprises at least one copy.
- the nucleic acid array can label different cells in a sample (such as a cell suspension) with specific localization sequences (such as the tag sequence Y); by detecting the specific localization sequence, the spatial position of a nucleic acid molecule on the nucleic acid array can be determined, nucleic acid molecules from the same single cell can then be identified, and single-cell analysis of the sample is thereby achieved.
- multiple cells can be directly adsorbed on the nucleic acid array (e.g., a chip). Because of the high resolution of the nucleic acid array, the size and spacing of the microdots are far smaller than the size of a single cell.
- the nucleic acid array (such as a chip) can theoretically capture and label every cell in the sample, which effectively avoids the loss of rare cell information.
- the method of the present invention can capture millions of cells on a single chip for single-cell sequencing, and the cell capture efficiency can theoretically reach, or come close to, 100%. Both figures are far beyond the prior art: on the 10x Chromium cell sorting platform, for example, throughput can hardly exceed the ten-thousand-cell level because the number of micro-droplets formed in the oil phase is limited, and, owing to the Poisson distribution, the cell capture rate can theoretically reach only up to 60%.
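The relationship between microdot spacing and the number of microdots lying under one cell can be estimated with a short Python sketch. It is only an illustration under stated assumptions: the microdots are taken to sit on a square lattice with one microdot per pitch × pitch area, the single-cell area of 80-100 μm² is the example value given above, the 0.5-1 μm pitch is the preferred center-to-center distance recited in the claims, and the function name is hypothetical.

```python
def microdots_per_cell(cell_area_um2: float, pitch_um: float) -> float:
    """Estimate how many microdots one cell covers.

    Assumes a square lattice (one microdot per pitch x pitch area); the
    lattice geometry is an assumption, not a statement from the application.
    """
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    return cell_area_um2 / (pitch_um ** 2)


# Example values: 80-100 um^2 single-cell area, 0.5-1 um center-to-center distance.
for pitch in (0.5, 1.0):
    for area in (80.0, 100.0):
        n = microdots_per_cell(area, pitch)
        print(f"pitch={pitch} um, area={area} um^2 -> about {n:.0f} microdots")
```

Under these assumptions every combination gives at least 50 microdots per cell, which is consistent with the resolution stated for the array.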
- Fig. 1A shows an exemplary structure of a chip for capturing and labeling nucleic acid molecules in this application, which includes: a chip and oligonucleotide probes (also called chip sequences) coupled to the chip.
- Each oligonucleotide probe contains a tag sequence Y corresponding to its position on the chip, and the area in which each oligonucleotide probe is coupled to the chip can be called a microdot.
- Each oligonucleotide probe can be present in a single copy or in multiple copies.
- Figure 1B shows that cells in a sample are labeled by one or more microdots on the chip after contacting the chip.
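How a tag sequence Y acts as a position label can be illustrated with a minimal Python sketch. It assumes a toy array in which every microdot carries exactly one unique tag; the coordinates, the 6-nt tag sequences and the function names are hypothetical and are not taken from the application.

```python
from typing import Dict, Optional, Tuple

Coordinate = Tuple[int, int]


def build_position_index(spots: Dict[Coordinate, str]) -> Dict[str, Coordinate]:
    """Invert a {coordinate: tag sequence Y} map into {tag sequence Y: coordinate}."""
    index: Dict[str, Coordinate] = {}
    for coord, tag in spots.items():
        if tag in index:
            # Different microdots must carry different tag sequences Y.
            raise ValueError(f"tag sequence {tag!r} is not unique")
        index[tag] = coord
    return index


def locate(tag_y: str, index: Dict[str, Coordinate]) -> Optional[Coordinate]:
    """Return the microdot coordinate for a decoded tag, or None if unknown."""
    return index.get(tag_y)


# Toy array with made-up tags (illustrative only).
spots = {(0, 0): "ACGTAC", (0, 1): "TTGCAA", (1, 0): "GGATCC"}
index = build_position_index(spots)
print(locate("TTGCAA", index))
print(locate("AAAAAA", index))
```

A dictionary keeps the lookup constant-time per read; a real decoder would also need to tolerate sequencing errors in the tag, which this sketch does not attempt.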
- FIG. 2 shows exemplary scheme 1 for preparing a cDNA strand using RNA (such as mRNA) in a sample as a template, and an exemplary structure of the cDNA strand.
- CA consensus sequence A
- CB consensus sequence B.
- Figure 3 shows an exemplary scheme in which the 5' end of the cDNA strand is labeled with the chip sequence (i.e., the 5' end of the cDNA strand is ligated to the 3' end of the chip sequence) to form a new nucleic acid molecule containing the chip sequence information (i.e., a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
- CA consensus sequence A
- CB consensus sequence B
- X1 consensus sequence X1
- Y tag sequence Y
- X2 consensus sequence X2.
- Fig. 4 shows exemplary scheme 1 for preparing the complementary strand of a cDNA strand using RNA (such as mRNA) in a sample as a template, and an exemplary structure of the complementary strand of the cDNA strand.
- CA consensus sequence A
- CB consensus sequence B
- EP extension primer.
- Figure 5 shows an exemplary scheme in which the 5' end of the complementary strand of the cDNA strand is labeled with the chip sequence (that is, the 5' end of the complementary strand of the cDNA strand is ligated to the 3' end of the chip sequence) to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
- CA consensus sequence A
- CB consensus sequence B
- X1 consensus sequence X1
- Y tag sequence Y
- X2 consensus sequence X2.
- FIG. 6 shows exemplary scheme 2 for preparing a cDNA strand using RNA (such as mRNA) in a sample as a template, and an exemplary structure of the cDNA strand.
- CA consensus sequence A
- CB consensus sequence B.
- Figure 7 shows exemplary scheme 1 for labeling the 3' end of a cDNA strand with the complementary sequence of the chip sequence to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
- CA consensus sequence A
- CB consensus sequence B
- X1 consensus sequence X1
- Y tag sequence Y
- X2 consensus sequence X2
- P1 first region
- P2 second region.
- Figure 8 shows exemplary scheme 2 for labeling the 3' end of a cDNA strand with the complementary sequence of the chip sequence to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
- CA consensus sequence A
- CB consensus sequence B
- X1 consensus sequence X1
- Y tag sequence Y
- X2 consensus sequence X2.
- FIG. 9 shows exemplary scheme 2 for preparing the complementary strand of a cDNA strand using RNA (such as mRNA) in a sample as a template, and an exemplary structure of the complementary strand of the cDNA strand.
- Figure 10 shows exemplary scheme 1 for labeling the 3' end of the complementary strand of a cDNA strand with the complementary sequence of the chip sequence to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
- CA consensus sequence A
- CB consensus sequence B
- X1 consensus sequence X1
- Y tag sequence Y
- X2 consensus sequence X2
- P1 first region
- P2 second region.
- Fig. 11 shows exemplary scheme 2 for labeling the 3' end of the complementary strand of a cDNA strand with the complementary sequence of the chip sequence to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
- CA consensus sequence A
- CB consensus sequence B
- X1 consensus sequence X1
- Y tag sequence Y
- X2 consensus sequence X2.
- FIG. 12 shows the gene expression profiles of some HEK293 cells obtained by the method of Example 1.
- FIG. 13 shows a partially enlarged view of the gene expression profiles of some HEK293 cells obtained by the method of Example 1.
- Fig. 14 shows the length distribution of cDNA amplification products in Example 2.
- FIG. 15 shows the gene expression profile of HEK293 cells obtained by the method of Example 2.
- Figure 16 shows the average number of genes and UMIs captured per single cell by the method of Example 2.
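Per-cell gene and UMI averages of the kind summarized in Figure 16 can be computed along the following lines. This is a hedged sketch rather than the analysis pipeline used in Example 2: it assumes reads have already been assigned to a cell and a gene, it collapses reads that share the same (gene, UMI) pair within a cell, and all record values and names are invented for illustration.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def per_cell_counts(records: Iterable[Tuple[str, str, str]]) -> Dict[str, Tuple[int, int]]:
    """Return {cell: (number of genes, number of unique UMIs)}.

    Each record is (cell_id, gene, umi). Reads with the same gene and UMI
    in the same cell are treated as amplification duplicates.
    """
    genes = defaultdict(set)
    molecules = defaultdict(set)
    for cell, gene, umi in records:
        genes[cell].add(gene)
        molecules[cell].add((gene, umi))
    return {cell: (len(genes[cell]), len(molecules[cell])) for cell in genes}


# Invented example records (cell, gene, UMI).
records = [
    ("cell1", "GAPDH", "AAAC"),
    ("cell1", "GAPDH", "AAAC"),  # duplicate read of the same molecule
    ("cell1", "ACTB", "GGTT"),
    ("cell2", "GAPDH", "CCAT"),
]
counts = per_cell_counts(records)
mean_genes = mean(g for g, _ in counts.values())
mean_umis = mean(u for _, u in counts.values())
print(counts, mean_genes, mean_umis)
```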
- A DNBSEQ sequencing kit (purchased from MGI, Cat. No. 1000019840) was used to prepare DNA nanoballs (DNB). The specific procedure is briefly described below.
- the reaction system shown in Table 1-2 was prepared.
- the reaction system was placed in a PCR instrument, and the reaction was carried out according to the following reaction conditions: 95°C for 3 minutes, 40°C for 3 minutes.
- After the reaction, the reaction product was placed on ice, and 40 μL of mixed enzyme I and 2 μL of mixed enzyme II (both from the DNBSEQ sequencing kit), 1 μL of ATP (100 mM stock solution, obtained from Thermo Fisher) and 0.1 μL of T4 ligase (obtained from NEB, Cat. No. M0202S) were added.
- the above reaction system was placed in a PCR instrument and reacted at 30° C. for 20 minutes to generate DNB.
- the DNB was loaded onto the BGISEQ-500 sequencing chip according to the method described for the BGISEQ-500 high-throughput sequencing reagent set (SE50) (purchased from MGI, catalog number: 1000012551).
- the MDA reagent from the BGISEQ-500 PE50 sequencing kit (purchased from MGI, product number: 1000012554) was added to the sequencing chip; after incubation at 37° C. for 30 min, the chip was washed with 5× SSC.
- The chip surface was modified with N3-PEG3500-NHS (the modification reagent was purchased from Sigma, product number: JKA5086). After incubation for 30 minutes, the DBCO-modified chip sequence synthesis primer (sequence shown in SEQ ID NO: 3) was pumped in and incubated overnight at room temperature.
- the DNB was sequenced according to the instructions of the BGISEQ-500 high-throughput sequencing reagent kit, with the single-end (SE) read length set to 25 bp.
- the above-mentioned DBCO-modified sequence was extended to obtain the strand grown during sequencing, and this strand was decoded to obtain the position sequence information corresponding to the DNB.
- the strand grown during sequencing was extended further: following step 3 above, a further 15 bases of cPAS reaction were carried out to obtain the chip sequence (SEQ ID NO: 8, which contains the consensus sequence X1 (SEQ ID NO: 4), the tag sequence Y and the consensus sequence X2 (SEQ ID NO: 5)).
- Chip dicing: the prepared chip was cut into several small pieces, with the piece size adjusted to the needs of the experiment; the chip pieces were soaked in 50 mM Tris buffer (pH 8.0) and kept at 4°C until use.
- Using mRNA as a template, the reverse transcriptase synthesizes cDNA from polyT-containing primers (sequence shown in SEQ ID NO: 6, which contains consensus sequence A (CA), a UMI sequence (NNNNNNNNN) and a polyT sequence), and adds a CCC overhang to the 3' end of the cDNA strand.
- polyT-containing primers sequence shown in SEQ ID NO: 6, which contains consensus sequence A (CA), UMI sequence (NNNNNNNNNNN) and polyT sequence
- the synthetic cDNA strand comprises the following sequence structure: sequence of reverse transcription primer (SEQ ID NO:6)-cDNA sequence-c(TSO) sequence (complementary sequence of SEQ ID NO:7).
- a nucleic acid molecule comprising the following sequence structure: chip sequence (SEQ ID NO:8)-reverse transcription primer sequence (SEQ ID NO:6)-cDNA sequence-c(TSO) sequence (complementary sequence of SEQ ID NO:7).
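The sequence structure above can be sketched as a simple concatenation. This is a minimal illustration, assuming that c(TSO) is the reverse complement of the TSO when the labeled strand is written 5' to 3'; the short placeholder sequences stand in for SEQ ID NO: 6-8, which are not reproduced here, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (N is kept as N)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def labeled_molecule(chip_seq: str, rt_primer: str, cdna: str, tso: str) -> str:
    """Concatenate: chip sequence - RT primer - cDNA - c(TSO), written 5' to 3'."""
    return chip_seq + rt_primer + cdna + reverse_complement(tso)


# Placeholder sequences (NOT the real SEQ ID NOs).
chip_seq = "ACGT"     # stands in for X1 + tag sequence Y + X2
rt_primer = "GGNNTT"  # stands in for consensus sequence A + UMI + polyT
cdna = "CATG"         # stands in for the reverse-transcribed sequence
tso = "AAGGG"         # stands in for a TSO ending in GGG
molecule = labeled_molecule(chip_seq, rt_primer, cdna, tso)
print(molecule)
```

In this toy case the CCC that follows the cDNA segment corresponds to the 3' overhang to which the TSO anneals.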
- the chip was washed with 5X SSC. According to the manufacturer's instructions, 200 μL of Bst polymerization reaction solution (NEB, M0275S) was prepared, pumped into the chip, and reacted at 65°C for 60 minutes to obtain single-stranded nucleic acid molecules containing position information.
- the reaction system was placed in a PCR instrument and the following reaction program was run: 95°C for 3 min; 11 cycles of (98°C for 20 s, 58°C for 20 s, 72°C for 3 min); 72°C for 5 min; hold at 4°C (∞).
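The amplification program can be written down as data, which also gives a quick estimate of its run time. The sketch below is illustrative only: the `Step` class and `program_seconds` function are hypothetical names, and ramping time and the final 4°C hold are deliberately ignored.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Step:
    temp_c: float  # block temperature in degrees Celsius
    seconds: int   # hold time in seconds


def program_seconds(initial: Sequence[Step], cycle: Sequence[Step],
                    n_cycles: int, final: Sequence[Step]) -> int:
    """Total hold time of a thermal program, excluding ramping and the 4 C hold."""
    per_cycle = sum(step.seconds for step in cycle)
    return (sum(step.seconds for step in initial)
            + n_cycles * per_cycle
            + sum(step.seconds for step in final))


initial = [Step(95, 3 * 60)]
cycle = [Step(98, 20), Step(58, 20), Step(72, 3 * 60)]
final = [Step(72, 5 * 60)]
total = program_seconds(initial, cycle, n_cycles=11, final=final)
print(f"{total} s (about {total / 60:.1f} min) of programmed hold time")
```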
- XP beads purchased from AMPure
- the dsDNA concentration was quantified using a Qubit instrument, and the length distribution of cDNA amplification products was detected using a 2100 Bioanalyzer (available from Agilent).
- according to the cDNA concentration, 20 ng of cDNA (obtained in step 3) was taken, and 0.5 μM Tn5 transposase and the corresponding buffer were added (purchased from BGI, catalog number 10000028493; the Tn5 transposase was loaded according to the Stereomics library preparation kit-S1); the components were mixed well to form a 20 μL reaction system and reacted at 55°C for 10 minutes, after which 5 μL of 0.1% SDS was added and mixed at room temperature for 5 minutes to end the Tn5 fragmentation step.
- the reaction conditions were as follows: 95°C for 3 minutes, then 40°C for 3 minutes. After the reaction, the mixture was placed on ice, and 40 μL of mixed enzyme I and 2 μL of mixed enzyme II (both required for DNB preparation, from the DNBSEQ sequencing kit), 1 μL of ATP and 0.1 μL of T4 ligase were added; after mixing, the reaction system was placed in a PCR instrument and reacted at 30°C for 20 minutes to form DNB.
- DNA nanoball (DNB) preparation: 40 μL of the reaction system shown in Table 2-2 was prepared, to which 80 fmol of the DNA library containing position sequence information was added.
- the reaction conditions were as follows: 95°C for 3 minutes, then 40°C for 3 minutes. After the reaction, the mixture was placed on ice, and 40 μL of mixed enzyme I and 2 μL of mixed enzyme II (both required for DNB preparation, from the DNBSEQ sequencing kit), 1 μL of ATP (100 mM stock solution, Thermo Fisher) and 0.1 μL of T4 ligase (purchased from NEB, product number: M0202S) were added; after mixing evenly, the reaction system was placed in a PCR instrument and reacted at 30°C for 20 minutes to form DNB.
- the DNB was loaded onto the BGISEQ-500 sequencing chip according to the method described for the BGISEQ-500 high-throughput sequencing reagent set (SE50).
- the chip surface was modified with N3-PEG3500-NHS (purchased from Sigma, product number: JKA5086), and the DBCO-modified sequence was then applied.
- the DNB was sequenced according to the instructions of the BGISEQ-500 high-throughput sequencing reagent kit, with the single-end (SE) read length set to 25 bp.
- the above-mentioned DBCO-modified sequence was extended to obtain the strand grown during sequencing, and this strand was decoded to obtain the position sequence information corresponding to the DNB.
- A probe capture sequence carrying a UMI (SEQ ID NO: 15, with a 5'-terminal phosphorylation modification) was synthesized by Liuhe Huada, and the capture sequence was ligated to the strand grown during sequencing using T4 ligase in the following reaction system.
- the ligation reaction system is shown in Table 2-3.
- NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN represents the location information.
- Chip cutting: the prepared chip was cut into several small pieces, with the piece size adjusted to the needs of the experiment; the chip pieces were soaked in 50 mM Tris buffer (pH 8.0) and kept at 4°C until use.
- cDNA synthesis: the chip was washed twice with 5X SSC at room temperature, and 200 μL of the reverse transcription reaction system shown in Table 2-4 was prepared; the reaction solution was added to the chip carrying the cells so that the chip was fully covered, and reacted at 42°C for 90-180 min. The polyT-containing probes on the chip served as primers for cDNA synthesis, and the 3' end of the cDNA was tagged with the TSO for synthesis of the cDNA complementary strand.
- the cDNA strands are as follows:
- CTGCTGACGTACTGAGAGGCATGGCGACCTTATTCAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTTGTCTTCCTAAGACNNNNTTTTTTTTTTTTTTTTTTTTTTTV(cDNA)CCCGCCTCTCAGTACGTCAGCAG was treated with RNaseH for 30min, and the RNA was digested.
- the reaction system was placed in a PCR instrument and the following reaction program was run: 95°C for 3 min; 11 cycles of (98°C for 20 s, 58°C for 20 s, 72°C for 3 min); 72°C for 5 min; hold at 4°C (∞).
- XP beads were used for magnetic bead purification and recovery.
- the dsDNA concentration was quantified using the Qubit kit, and the distribution of cDNA fragments was detected using a 2100 Bioanalyzer (purchased from Agilent). The results are shown in Figure 14; the cDNA length distribution was normal.
- Tn5 fragmentation: according to the cDNA concentration, 20 ng of cDNA was taken, and 0.5 μM Tn5 transposase (loaded with the first strand shown in SEQ ID NO:19 and the second strand shown in SEQ ID NO:20) and the corresponding buffer were added (purchased from BGI, Cat. No. 10000028493; the Tn5 transposase was loaded according to the Stereomics library preparation kit); the components were mixed to form a 20 μL reaction system and reacted at 55°C for 10 minutes, after which 5 μL of 0.1% SDS was added and mixed at room temperature for 5 minutes to end the Tn5 fragmentation step.
- the reaction conditions were as follows: 95°C for 3 minutes, then 40°C for 3 minutes. After the reaction, the mixture was placed on ice, and 40 μL of mixed enzyme I and 2 μL of mixed enzyme II (both required for DNB preparation, from the DNBSEQ sequencing kit), 1 μL of ATP (100 mM stock solution, Thermo Fisher) and 0.1 μL of T4 ligase were added; after mixing, the reaction system was placed in a PCR instrument and reacted at 30°C for 20 minutes to form DNB.
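Component | Volume (μL) | Final concentration |
Amplification product of step 2 above | 80 fmol (x) | - |
10X phi29 buffer (obtained from Thermo Fisher, Cat. No. B62) | 4 | 1X |
DNB primer sequence, 10 μM (SEQ ID NO: 22, synthesized by Liuhe Huada) | 4 | 1 μM |
Water | 32-x | - |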
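The volumes in this table follow from simple arithmetic: the library volume x is whatever delivers 80 fmol, and water fills the rest of the 40 μL reaction. The Python sketch below illustrates this, assuming the library concentration is known in fmol/μL; the function name and the example concentration of 20 fmol/μL are hypothetical.

```python
def dnb_mix(library_fmol_per_ul: float,
            library_fmol: float = 80.0,
            total_ul: float = 40.0,
            buffer_ul: float = 4.0,
            primer_ul: float = 4.0) -> dict:
    """Return component volumes (in microlitres) for the DNB reaction table."""
    if library_fmol_per_ul <= 0:
        raise ValueError("library concentration must be positive")
    library_ul = library_fmol / library_fmol_per_ul   # the 'x' in the table
    water_ul = total_ul - buffer_ul - primer_ul - library_ul  # '32 - x'
    if water_ul < 0:
        raise ValueError("library is too dilute to fit in the reaction volume")
    return {"library": library_ul, "buffer": buffer_ul,
            "primer": primer_ul, "water": water_ul}


# Hypothetical library at 20 fmol/uL.
mix = dnb_mix(library_fmol_per_ul=20.0)
print(mix)
# 4 uL of 10 uM primer in 40 uL gives the 1 uM final concentration in the table.
print(10.0 * mix["primer"] / 40.0)
```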
Claims (61)
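- A method for generating a population of labeled nucleic acid molecules, comprising the following steps: (1) providing: a sample containing one or more cells, and a nucleic acid array; wherein the sample is a single-cell suspension; the cells contain (for example, on their surface) a first binding molecule; the nucleic acid array comprises a solid support, the solid support contains (for example, on its surface) a first labeling molecule, and the first binding molecule is capable of forming an interaction pair with the first labeling molecule; and the solid support further comprises a plurality of microdots, the size (for example, equivalent diameter) of the microdots is less than 5 μm, and the center-to-center distance between adjacent microdots is less than 10 μm; each microdot is coupled with one kind of oligonucleotide probe, and each kind of oligonucleotide probe comprises at least one copy; the oligonucleotide probe comprises or consists of, in the 5' to 3' direction: a consensus sequence X1, a tag sequence Y and a consensus sequence X2, wherein the oligonucleotide probes coupled to different microdots have different tag sequences Y; (2) contacting the one or more cells with the solid support of the nucleic acid array, whereby each cell occupies at least one microdot in the nucleic acid array (i.e., each cell is in contact with at least one microdot in the nucleic acid array), and allowing the first binding molecule of the cells to form an interaction pair with the first labeling molecule of the solid support; wherein, before or after the one or more cells are contacted with the nucleic acid array, the RNA (for example, mRNA) of the one or more cells is subjected to a pretreatment comprising reverse transcription to generate a first population of nucleic acid molecules; and (3) associating the first population of nucleic acid molecules derived from each cell, obtained in the preceding step, with the oligonucleotide probes coupled to the microdots occupied by the cell from which it is derived, thereby generating a second population of nucleic acid molecules labeled with the tag sequence Y.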
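- The method of claim 1, wherein the center-to-center distance between adjacent microdots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and the size (for example, equivalent diameter) of the microdots is less than 5 μm, less than 1 μm, less than 0.3 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm; preferably, the center-to-center distance between adjacent microdots is 0.5 μm to 1 μm, for example 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm; preferably, the size (for example, equivalent diameter) of the microdots is 0.001 μm to 0.5 μm (for example 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, or 0.2 μm to 0.3 μm).
- The method of claim 1 or 2, wherein the first binding molecule is capable of forming a specific interaction pair or a non-specific interaction pair with the first labeling molecule; preferably, the interaction pair is selected from positive-negative charge interactions, affinity interactions (for example, biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), molecule pairs capable of undergoing a click chemistry reaction (for example, an alkynyl-containing group and an azide compound), an N-hydroxysulfosuccinimide (NHS) ester and an amino-containing compound, or any combination thereof; for example, the first labeling molecule is polylysine and the first binding molecule is a protein capable of binding polylysine; the first labeling molecule is an antibody and the first binding molecule is an antigen capable of binding the antibody; the first labeling molecule is an amino-containing compound and the first binding molecule is an N-hydroxysulfosuccinimide (NHS) ester; or the first labeling molecule is biotin and the first binding molecule is streptavidin.
- The method of any one of claims 1-3, wherein the first binding molecule is naturally contained in the cells.
- The method of any one of claims 1-3, wherein the first binding molecule is not naturally contained in the cells; preferably, the method further comprises a step of binding the first binding molecule to the one or more cells, or of causing the one or more cells to express the first binding molecule, so as to provide the cell sample of step (i).
- The method of any one of claims 1-5, wherein the method further comprises a step of binding the first labeling molecule to the solid support, so as to provide the nucleic acid array of step (i).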
- 权利要求1-6任一项的方法,其中,步骤(2)中,所述预处理包括以下步骤:(i)用引物I-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成延伸产物,所述延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);或,(ii)(a)用引物I-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物I-A为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);和,(b)将引物I-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物,所述第一延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物I-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);或,(iii)(a)用引物I-A’对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物I-A’为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物I-A’包含捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;(b)将引物I-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物;其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物I-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);和,(c)提供延伸引物,以第一延伸产物为模板进行延伸反应,生成第二延伸产物,所述第二延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;并且,步骤(3)中,通过以下步骤将前一步骤获得的源自各个细胞的第一核酸分子群与其所源自的细胞占据的微点偶联的寡核苷酸探针相关联,从而生成经所述标签序列Y标记的第二核酸分子群:在允许退火的条件下,将桥接寡核苷酸I与步骤(2)获得的源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针接触,使得所述桥接寡核苷酸I与步骤(2)获得的 源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针退火(例如原位退火),从而使得所述第一核酸分子群与所述阵列上的寡核苷酸探针连接,获得的连接产物即为具有位置标记的第二核酸分子,从而生成第二核酸分子群;其中,所述桥接寡核苷酸I包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,所述第一区域能与步骤(2)(i)或步骤(2)(ii)中所述引物I-A的共有序列A全部或部分退火或者与步骤(2)(iii)中所述引物I-B的共有序列B全部或部分退火;所述第二区域能与所述共有序列X2全部或部分退火。
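- The method of claim 7, wherein, in step (3), when the first region and the second region of the bridging oligonucleotide I are adjacent, ligating the first population of nucleic acid molecules to the oligonucleotide probes comprises: using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide I, the resulting ligation product being the second nucleic acid molecule carrying a position label; or, when the bridging oligonucleotide I comprises a first region, a second region and a third region located between the two, ligating the first population of nucleic acid molecules to the oligonucleotide probes comprises: using a nucleic acid polymerase to carry out a polymerization reaction with the third region as template, and using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region, the third region and the second region of the same bridging oligonucleotide I, the resulting ligation product being the second nucleic acid molecule carrying a position label; preferably, the nucleic acid polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
- The method of claim 7 or 8, comprising step (1), step (2)(i) and step (3); wherein the ligation product obtained in step (3) is the second nucleic acid molecule carrying a position label.
- The method of claim 9, wherein, in step (2)(i), the capture sequence A is a random oligonucleotide sequence; preferably, in step (3), the ligation products derived from each copy of the oligonucleotide probe coupled to the same microdot have different capture sequences A, and the capture sequence A serves as the unique molecular identifier (UMI) of the second nucleic acid molecule.
- The method of claim 9, wherein, in step (2)(i), the capture sequence A is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid; wherein the primer I-A further comprises a tag sequence A, for example a random oligonucleotide sequence; preferably, in step (3), the ligation products derived from each copy of the oligonucleotide probe coupled to the same microdot have different tag sequences A as UMIs.
- The method of claim 7 or 8, comprising step (1), step (2)(ii) and step (3); wherein the ligation product obtained in step (3) is the second nucleic acid molecule carrying a position label.
- The method of claim 12, wherein, in step (2)(ii)(a), the capture sequence A is a random oligonucleotide sequence; preferably, in step (3), the ligation products derived from each copy of the oligonucleotide probe coupled to the same microdot have different capture sequences A, and the capture sequence A serves as the unique molecular identifier (UMI) of the second nucleic acid molecule.
- The method of claim 12, wherein, in step (2)(ii)(a), the capture sequence A is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid; wherein the primer I-A further comprises a tag sequence A, for example a random oligonucleotide sequence; preferably, in step (3), the ligation products derived from each copy of the oligonucleotide probe coupled to the same microdot have different tag sequences A as UMIs.
- The method of any one of claims 9-14, wherein the 5' end of the primer I-A comprises a phosphorylation modification.
- The method of claim 7 or 8, comprising step (1), step (2)(iii) and step (3); wherein the ligation product obtained in step (3) is the second nucleic acid molecule carrying a position label.
- The method of claim 16, wherein, in step (2)(iii)(c), the extension primer is the primer I-B or a primer B″, and the primer B″ is capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof and of initiating an extension reaction.
- The method of claim 16 or 17, wherein, in step (2)(iii)(a), the capture sequence A of the primer I-A' is a random oligonucleotide sequence; wherein, in step (2)(iii)(b), the primer I-B comprises the consensus sequence B, the 3'-end overhang complementary sequence, and a tag sequence B.
- The method of claim 16 or 17, wherein, in step (2)(iii)(a), the capture sequence A of the primer I-A' is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid; wherein the primer I-B comprises the consensus sequence B, the 3'-end overhang complementary sequence, and a tag sequence B; preferably, in step (3), the ligation products derived from each copy of the oligonucleotide probe coupled to the same microdot have different tag sequences B as UMIs.
- The method of any one of claims 16-19, wherein the 5' end of the extension primer comprises a phosphorylation modification.
- The method of any one of claims 16-20, wherein, in step (2)(iii)(b), the cDNA strand anneals to the primer I-B via its 3'-end overhang, and, under the action of a nucleic acid polymerase (for example, a DNA polymerase or a reverse transcriptase), the cDNA strand is extended using the primer I-B as template to generate the first extension product.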
- 权利要求1-6任一项的方法,其中,步骤(2)中,所述预处理包括以下步骤:(i)(a)用引物II-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物II-A为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物II-A含有捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;和,(b)将引物II-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物,所述第一延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物II-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物II-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B的5’端);或,(ii)(a)用引物II-A’对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链;所述cDNA链包含以所述引物II-A’为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物II-A’含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述共有序列A位于所述捕获序列A的上游(例如位于所述引物II-A’的5’端);(b)将引物II-B’与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物;其中,所述引物II-B’包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物II-B’的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B’的5’端);和,(c)提供延伸引物,以第一延伸产物为模板进行延伸反应,生成第二延伸产物,所述第二延 伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;并且,步骤(3)中,通过以下步骤将前一步骤获得的源自各个细胞的第一核酸分子群与其所源自的细胞占据的微点偶联的寡核苷酸探针相关联,从而生成经所述标签序列Y标记的第二核酸分子群:(i)向步骤(2)的产物实施退火条件,使得步骤(2)获得的源自各个细胞的第一核酸分子与所述细胞占据的微点所偶联的寡核苷酸探针退火(例如原位退火),并进行延伸反应,生成延伸产物,所述延伸产物即为具有位置标记的第二核酸分子,从而生成第二核酸分子群;其中,所述寡核苷酸探针的共有序列X2或其部分序列(a)能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列退火,或者,(b)能与步骤(2)(ii)获得的第二延伸产物的所述共有序列A的互补序列或其部分序列退火;或,(ii)在允许退火的条件下,将桥接寡核苷酸对与步骤(2)获得的源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针接触,使得所述桥接寡核苷酸对与步骤(2)获得的源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针退火(例如原位退火),其中,所述桥接寡核苷酸对由桥接寡核苷酸II-I和桥接寡核苷酸II-II组成,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;所述桥接寡核苷酸II-II的第二区域(a)能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列退火,或者,(b)能与步骤(2)(ii)获得的第二延伸产物的所述共有序列A的互补序列或其部分序列退火;其中,将所述桥接寡核苷酸对与所述第一核酸分子群、所述寡核苷酸探针接触时,所述桥接寡核苷酸对的桥接寡核苷酸II-I和桥接寡核苷酸II-II各自以单链的形式存在,或者,所述桥接寡核苷酸对的桥接寡核苷酸II-I和桥接寡核苷酸II-II以彼此退火形成部分双链的形式存在;进行连接反应:将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,和/或,将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接;并进行延伸反应;其中,所述连接反应与延伸反应以任意顺序进行;所获得的反应产物即为具有位置标记的第二核酸分子,从而生成所述第二核酸分子群。
- 权利要求22的方法,其中,步骤(3)(ii)中:(1)当所述桥接寡核苷酸II-I的第一区域和第二区域相邻时,所述将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸连接酶将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接;或者,当所述桥接寡核苷酸II-I包括第一区域、第二区域以及位于两者之间的第三区域时,所述将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸聚合酶(例如,无5’至3’端外切酶活性或链置换活性)以所述第三区域为模板进行聚合反应,使用核 酸连接酶将杂交于同一桥接寡核苷酸II-I的第一区域、第三区域和第二区域的核酸分子连接;和/或(2)当所述桥接寡核苷酸II-II的第一区域和第二区域相邻时,所述将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸连接酶将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接;或者,当所述桥接寡核苷酸II-II包括第一区域、第二区域以及位于两者之间的第三区域时,所述将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸聚合酶(例如,无5’至3’端外切酶活性或链置换活性)以所述第三区域为模板进行聚合反应,使用核酸连接酶将杂交于同一桥接寡核苷酸II-II的第一区域、第三区域和第二区域的核酸分子连接。
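- The method of claim 22 or 23, comprising step (1), step (2)(i) and step (3); wherein, in step (2)(i)(b), the primer II-B contains the consensus sequence B, the 3'-end overhang complementary sequence, and a tag sequence B; preferably, in step (3), the second nucleic acid molecules derived from each copy of the oligonucleotide probe coupled to the same microdot have different tag sequences B as UMIs.
- The method of claim 24, comprising step (1), step (2)(i) and step (3)(i); wherein the consensus sequence X2 or a partial sequence thereof is capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof; the extension product obtained in step (3)(i) is the labeled nucleic acid molecule, which comprises: a first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or a second strand containing the sequence of the oligonucleotide probe.
- The method of claim 24, comprising step (1), step (2)(i) and step (3)(ii); wherein the second region of the bridging oligonucleotide II-II is capable of annealing to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i), or a partial sequence thereof; the reaction product obtained in step (3)(ii) is the labeled nucleic acid molecule, which comprises: a first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or a second strand containing the sequence of the oligonucleotide probe.
- The method of any one of claims 24-26, wherein, in step (2)(i)(a), the capture sequence A of the primer II-A is a random oligonucleotide sequence.
- The method of any one of claims 24-26, wherein, in step (2)(i)(a), the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid; preferably, the primer II-A further contains a consensus sequence A, and optionally a tag sequence A, for example a random oligonucleotide sequence.
- The method of claim 22 or 23, comprising step (1), step (2)(ii) and step (3); wherein, in step (2)(ii)(b), the first extension product comprises, from the 5' end to the 3' end: the consensus sequence A, the cDNA sequence complementary to the RNA formed using the primer II-A' as the reverse transcription primer, the 3'-end overhang sequence, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B; preferably, in step (2)(ii)(c), the extension primer is the primer II-B' or a primer B″, wherein the primer B″ is capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof and of initiating an extension reaction.
- The method of claim 29, comprising step (1), step (2)(ii) and step (3)(i); wherein the consensus sequence X2 or a partial sequence thereof is capable of annealing to the complementary sequence of the consensus sequence A or a partial sequence thereof; the extension product obtained in step (3)(i) is the labeled nucleic acid molecule, which comprises: a first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or a second strand containing the sequence of the oligonucleotide probe.
- The method of claim 29, comprising step (1), step (2)(ii) and step (3)(ii); wherein the second region of the bridging oligonucleotide II-II is capable of annealing to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii), or a partial sequence thereof; the reaction product obtained in step (3)(ii) is the labeled nucleic acid molecule, which comprises: a first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or a second strand containing the sequence of the oligonucleotide probe.
- The method of any one of claims 29-31, wherein, in step (2)(ii)(a), the capture sequence A of the primer II-A' is a random oligonucleotide sequence; preferably, in step (3), the second nucleic acid molecules derived from each copy of the oligonucleotide probe coupled to the same microdot have different capture sequences A as UMIs.
- The method of any one of claims 29-31, wherein, in step (2)(ii)(a), the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid; wherein the primer II-A' further contains a tag sequence A, for example a random oligonucleotide sequence; preferably, in step (3), the second nucleic acid molecules derived from each copy of the oligonucleotide probe coupled to the same microdot have different tag sequences A as UMIs.
- The method of any one of claims 22-28, wherein, in step (2)(i)(b), the cDNA strand anneals to the primer II-B via its 3'-end overhang, and, under the action of a nucleic acid polymerase (for example, a DNA polymerase or a reverse transcriptase), the cDNA strand is extended using the primer II-B as template to generate the first extension product.
- The method of any one of claims 22-23 and 29-33, wherein, in step (2)(ii)(b), the cDNA strand anneals to the primer II-B' via its 3'-end overhang, and, under the action of a nucleic acid polymerase (for example, a DNA polymerase or a reverse transcriptase), the cDNA strand is extended using the primer II-B' as template to generate the first extension product.
- The method of any one of claims 1-35, wherein, in step (2), the pretreatment is carried out inside the cells; preferably, before or after the one or more cells are contacted with the solid support of the nucleic acid array, the RNA (for example, mRNA) of the one or more cells is pretreated to generate the first population of nucleic acid molecules; preferably, the cells are permeabilized before the pretreatment is carried out.
- The method of any one of claims 1-35, wherein, in step (2), the pretreatment is carried out outside the cells; preferably, after the one or more cells are contacted with the solid support of the nucleic acid array, the RNA (for example, mRNA) of the one or more cells is pretreated to generate the first population of nucleic acid molecules; preferably, before the pretreatment is carried out, the method further comprises releasing the intracellular RNA (for example, mRNA); preferably, the intracellular RNA (for example, mRNA) is released by cell permeabilization or cell lysis.
- The method of any one of claims 1-37, wherein the reverse transcription in step (2) comprises using a reverse transcriptase; preferably, the reverse transcriptase has terminal transferase activity; preferably, the reverse transcriptase is capable of synthesizing a cDNA strand using RNA (for example, mRNA) as template and of adding an overhang at the 3' end of the cDNA strand.
- The method of any one of claims 1-38, wherein the method further comprises: (4) recovering and purifying the second population of nucleic acid molecules.
- The method of any one of claims 1-39, wherein the obtained second population of nucleic acid molecules and/or its complement is used for constructing a transcriptome library or for transcriptome sequencing.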
- 权利要求1-40任一项的方法,其中,步骤(1)所述核酸阵列由包含以下的步骤来提供:(1)提供多种载体序列,每种载体序列包含至少一个拷贝的载体序列,所述载体序列从5’到3’的方向上包含:共有序列X2的互补序列,标签序列Y的互补序列以及固定序列;其中,每种载体序列的标签序列Y的互补序列互不相同;(2)将所述多种载体序列连接于固相支持物(例如芯片)表面;(3)提供固定引物,并以所述载体序列为模板,进行引物延伸反应,生成延伸产物,所述延伸产物即为寡核苷酸探针;其中,所述固定引物包含共有序列X1的序列,并且,所述固定引物能与所述固定序列退火并起始延伸反应;优选地,所述延伸产物从5’到3’的方向上包含或者由:共有序列X1,标签序列Y和共有序列X2组成;(4)将所述固定引物与所述固相支持物表面连接;其中,步骤(3)与(4)以任意顺序进行;(5)任选地,所述载体序列的固定序列还包含切割位点,所述切割可以选自切刻酶(nicking enzyme)酶切、USER酶切、光切除、化学切除或CRISPR切除;对所述载体序列的固定序列所包含的切割位点进行切割,以消化所述载体序列,使得步骤(3)中的延伸产物与形成延伸产物的模板(即载体序列)分离,从而将所述寡核苷酸探针连接于固相支持物(例如芯片)表面;优选地,所述方法还包括通过高温变性使得步骤(3)中的延伸产物与形成延伸产物的模板(即载体序列)分离;优选地,每种载体序列是由多个拷贝的载体序列的多联体所形成的DNB;优选地,步骤(1)中通过以下步骤提供所述多种载体序列:(i)提供多种载体模板序列,所述载体模板序列包含所述载体序列的互补序列;(ii)以每种载体模板序列为模板,进行核酸扩增反应,以获得每种载体模板序列的扩增产物,所述扩增产物包含至少一个拷贝的载体序列;优选地,进行滚环复制,以获得由所述载体序列的多联体所形成的DNB。
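- A method for constructing a nucleic acid molecule library, comprising: (a) generating a population of labeled nucleic acid molecules according to the method of any one of claims 1-41; (b) randomly fragmenting the nucleic acid molecules in the population of labeled nucleic acid molecules and adding adapters; and (c) optionally, amplifying and/or enriching the product of step (b); thereby obtaining a nucleic acid molecule library; preferably, the nucleic acid molecule library comprises nucleic acid molecules from a plurality of single cells, and the nucleic acid molecules of different single cells have different tag sequences Y; preferably, the nucleic acid molecule library is used for sequencing, for example transcriptome sequencing, for example single-cell transcriptome sequencing (for example 5'-end or 3'-end transcriptome sequencing).
- The method of claim 42, wherein, before step (b) is carried out, the method further comprises a step (pre-b): amplifying and/or enriching the population of labeled nucleic acid molecules; preferably, the amplification reaction is carried out using at least a primer C and/or a primer D, wherein the primer C is capable of hybridizing or annealing to the complementary sequence of the consensus sequence X1 or a partial sequence thereof and of initiating an extension reaction; and the primer D is capable of hybridizing or annealing to the nucleic acid strand containing the tag sequence Y in the population of labeled nucleic acid molecules and of initiating an extension reaction.
- The method of claim 43, wherein, in step (b), a transposase is used to randomly fragment the nucleic acid molecules obtained in the preceding step and to add adapters at both ends of the fragments; preferably, in step (c), the product of step (b) is amplified using at least a primer C' and/or a primer D', wherein the adapters at the two ends of the fragments are a first adapter and a second adapter respectively, the primer C' is capable of hybridizing or annealing to the first adapter and of initiating an extension reaction, and the primer D' is capable of hybridizing or annealing to the second adapter and of initiating an extension reaction.
- A method for transcriptome sequencing of cells in a sample, comprising: (1) constructing a nucleic acid molecule library according to the method of any one of claims 42-44; and (2) sequencing the nucleic acid molecule library.
- A method for single-cell transcriptome analysis, comprising: (1) performing transcriptome sequencing of single cells in a sample according to the method of claim 45; and (2) analyzing the sequencing data, which comprises: matching the sequencing results of the obtained sequencing library against the tag sequences Y, or their complementary sequences, of the oligonucleotide probes coupled to the individual microdots on the nucleic acid array; if a match is found, identifying the microdot as a positive microdot; and identifying the sequencing data derived from positive microdots that are regionally contiguous on the nucleic acid array as the transcription data of one and the same cell, thereby performing single-cell transcriptome analysis.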
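The grouping of regionally contiguous positive microdots into single cells can be sketched as connected-component labeling on the array grid. The example below is a minimal illustration under assumptions that the claim does not specify: microdots are addressed by integer (x, y) grid coordinates, two positive microdots count as contiguous when they are horizontal or vertical neighbors (4-connectivity), and the function name is hypothetical.

```python
from collections import deque
from typing import Iterable, List, Set, Tuple

Spot = Tuple[int, int]


def group_positive_spots(positive: Iterable[Spot]) -> List[List[Spot]]:
    """Group positive microdots into contiguous regions (candidate cells).

    Uses breadth-first search with 4-connectivity; the choice of
    neighborhood is an assumption made for this sketch.
    """
    remaining: Set[Spot] = set(positive)
    seen: Set[Spot] = set()
    regions: List[List[Spot]] = []
    for start in sorted(remaining):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        region: List[Spot] = []
        while queue:
            x, y = queue.popleft()
            region.append((x, y))
            for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nb in remaining and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        regions.append(sorted(region))
    return regions


# Invented positive microdots forming two separate patches.
positive = {(0, 0), (0, 1), (1, 1), (5, 5), (5, 6)}
cells = group_positive_spots(positive)
print(cells)
```

Reads whose tag sequence Y maps to any microdot of one region would then be pooled as the transcription data of that cell; touching cells would merge into one region under this simple rule, so a practical analysis would need additional criteria.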
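- A kit, comprising: a nucleic acid array for labeling nucleic acids, and optionally a first binding molecule; wherein the nucleic acid array comprises a solid support, the solid support contains (for example, on its surface) a first labeling molecule, and the first binding molecule is capable of forming an interaction pair with the first labeling molecule; the solid support further comprises a plurality of microdots, the size (for example, equivalent diameter) of the microdots is less than 5 μm, and the center-to-center distance between adjacent microdots is less than 10 μm; each microdot is coupled with one kind of oligonucleotide probe; each kind of oligonucleotide probe comprises at least one copy; and the oligonucleotide probe comprises or consists of, in the 5' to 3' direction: a consensus sequence X1, a tag sequence Y and a consensus sequence X2, wherein the oligonucleotide probes coupled to different microdots have different tag sequences Y.
- The kit of claim 47, wherein the center-to-center distance between adjacent microdots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and the size (for example, equivalent diameter) of the microdots is less than 5 μm, less than 1 μm, less than 0.3 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm; preferably, the center-to-center distance between adjacent microdots is 0.5 μm to 1 μm, for example 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm; preferably, the size (for example, equivalent diameter) of the microdots is 0.001 μm to 0.5 μm (for example 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, or 0.2 μm to 0.3 μm).
- The kit of claim 47 or 48, wherein the first binding molecule is capable of forming a specific interaction pair or a non-specific interaction pair with the first labeling molecule; preferably, the interaction pair is selected from positive-negative charge interactions, affinity interactions (for example, biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), molecule pairs capable of undergoing a click chemistry reaction (for example, an alkynyl-containing group and an azide compound), an N-hydroxysulfosuccinimide (NHS) ester and an amino-containing compound, or any combination thereof; for example, the first labeling molecule is polylysine and the first binding molecule is a protein capable of binding polylysine; the first labeling molecule is an antibody and the first binding molecule is an antigen capable of binding the antibody; the first labeling molecule is an amino-containing compound and the first binding molecule is an N-hydroxysulfosuccinimide (NHS) ester; or the first labeling molecule is biotin and the first binding molecule is streptavidin.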
- 权利要求47-49任一项的试剂盒,其进一步包含:(i)引物I-A,包含引物I-A’和引物I-B的引物组,或者,包含引物I-A和引物I-B的引物组,其中:所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;优选地,所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);所述引物I-A’包含捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;其中,所述3’末端悬突互补序列位于所述引物I-B的3’末端,所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);其中,所述3’末端悬突是指以所述引物I-A’的捕获序列A所捕获的RNA为模板逆转录生成的cDNA链的3’末端所包含的一个或多个非模板核苷酸;以及,(ii)桥接寡核苷酸I,其包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,所述第一区域能(a)与所述引物I-A的共有序列A全部或部分退火或者(b)与所述引物I-B的共有序列B全部或部分退火;所述第二区域能与所述共有序列X2全部或其部分退火。
- 权利要求50的试剂盒,其包含:如(i)中所述的引物I-A,以及,如(ii)中所述的桥接寡核苷酸I;其中,所述桥接寡核苷酸I的第一区域能与所述引物I-A的共有序列A全部或部分退火,所述桥接寡核苷酸的第二区域能与所述共有序列X2全部或部分退火;其中,所述引物I-A的捕获序列A是随机寡核苷酸序列;或者,所述引物I-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列;优选地,所述引物I-A的的5’末端包含磷酸化修饰。
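- The kit of claim 50, comprising: the primer set comprising the primer I-A' and the primer I-B as described in (i), and the bridging oligonucleotide I as described in (ii); wherein the first region of the bridging oligonucleotide I is capable of annealing to all or part of the consensus sequence B of the primer I-B, and the second region of the bridging oligonucleotide is capable of annealing to all or part of the consensus sequence X2; wherein the capture sequence A of the primer I-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer I-A' is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid, and the primer I-A' further comprises a tag sequence A and a consensus sequence A; wherein the primer I-B comprises the consensus sequence B, the 3' end overhang complementary sequence, and a tag sequence B; preferably, the kit further comprises a primer B", the primer B" being capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof and capable of initiating an extension reaction; preferably, the 5' end of the primer I-B or of the primer B" comprises a phosphorylation modification; preferably, the primer I-B comprises modified nucleotides (for example, locked nucleic acids); preferably, the 3' end of the primer I-B comprises one or more modified nucleotides (for example, locked nucleic acids).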
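- The kit of claim 50, comprising: the primer set comprising the primer I-A and the primer I-B as described in (i), and the bridging oligonucleotide I as described in (ii); wherein the first region of the bridging oligonucleotide I is capable of annealing to all or part of the consensus sequence A of the primer I-A, and the second region of the bridging oligonucleotide is capable of annealing to all or part of the consensus sequence X2; wherein the capture sequence A of the primer I-A is a random oligonucleotide sequence; or, the capture sequence A of the primer I-A is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid, and the primer I-A further comprises a tag sequence A, for example a random oligonucleotide sequence; preferably, the 5' end of the primer I-A comprises a phosphorylation modification; preferably, the primer I-B comprises modified nucleotides (for example, locked nucleic acids); preferably, the 3' end of the primer I-B comprises one or more modified nucleotides (for example, locked nucleic acids).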
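- The kit of any one of claims 47-49, further comprising: (i) a primer set comprising a primer II-A and a primer II-B, or a primer set comprising a primer II-A' and a primer II-B', wherein: the primer II-A contains a capture sequence A, the capture sequence A being capable of annealing to an RNA to be captured (for example, mRNA) and initiating an extension reaction; the primer II-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and optionally a tag sequence B; wherein the 3' end overhang complementary sequence is located at the 3' end of the primer II-B, and the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer II-B); wherein the 3' end overhang refers to one or more non-templated nucleotides contained at the 3' end of the cDNA strand generated by reverse transcription using, as a template, the RNA captured by the capture sequence A of the primer II-A; the primer II-A' contains a consensus sequence A and a capture sequence A; wherein the capture sequence A is located at the 3' end of the primer II-A', and the consensus sequence A is located upstream of the capture sequence A (for example, at the 5' end of the primer II-A'); the primer II-B' comprises a consensus sequence B, a 3' end overhang complementary sequence, and optionally a tag sequence B; wherein the 3' end overhang complementary sequence is located at the 3' end of the primer II-B', and the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer II-B'); wherein the 3' end overhang refers to one or more non-templated nucleotides contained at the 3' end of the cDNA strand generated by reverse transcription using, as a template, the RNA captured by the capture sequence A of the primer II-A'.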
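- The kit of claim 54, comprising: the primer set of the primer II-A and the primer II-B as described in (i), and (ii) a bridging oligonucleotide II-I and a bridging oligonucleotide II-II; wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently comprise: a first region and a second region, and optionally a third region located between the first region and the second region, the first region being located upstream (for example, on the 5' side) of the second region; wherein the first region of the bridging oligonucleotide II-I is capable of annealing to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I is capable of annealing to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof; the second region of the bridging oligonucleotide II-II is capable of annealing to the complementary sequence of the consensus sequence B of the primer II-B or a partial sequence thereof; wherein the capture sequence A of the primer II-A is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid, and the primer II-A preferably further comprises a consensus sequence A and optionally a tag sequence A, for example a random oligonucleotide sequence; wherein the primer II-B contains the consensus sequence B, the 3' end overhang complementary sequence, and a tag sequence B; preferably, the primer II-B comprises modified nucleotides (for example, locked nucleic acids); preferably, the 3' end of the primer II-B comprises one or more modified nucleotides (for example, locked nucleic acids).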
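- The kit of claim 54, comprising: the primer set of the primer II-A and the primer II-B as described in (i); wherein the capture sequence A of the primer II-A is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid, and the primer II-A preferably further comprises a consensus sequence A and optionally a tag sequence A, for example a random oligonucleotide sequence; wherein the primer II-B contains the consensus sequence B, the 3' end overhang complementary sequence, and a tag sequence B; preferably, the primer II-B comprises modified nucleotides (for example, locked nucleic acids); preferably, the 3' end of the primer II-B comprises one or more modified nucleotides (for example, locked nucleic acids).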
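- The kit of claim 54, comprising: the primer set of the primer II-A' and the primer II-B' as described in (i), and (ii) a bridging oligonucleotide II-I and a bridging oligonucleotide II-II; wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently comprise: a first region and a second region, and optionally a third region located between the first region and the second region, the first region being located upstream (for example, on the 5' side) of the second region; wherein the first region of the bridging oligonucleotide II-I is capable of annealing to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I is capable of annealing to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof; the second region of the bridging oligonucleotide II-II is capable of annealing to the complementary sequence of the consensus sequence A of the primer II-A' or a partial sequence thereof; wherein the capture sequence A of the primer II-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid, and the primer II-A' further comprises a tag sequence A, for example a random oligonucleotide sequence; preferably, the primer II-B' comprises modified nucleotides (for example, locked nucleic acids); preferably, the 3' end of the primer II-B' comprises one or more modified nucleotides (for example, locked nucleic acids); preferably, the kit further comprises a primer B", the primer B" being capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof and capable of initiating an extension reaction.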
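- The kit of claim 54, comprising the primer set of the primer II-A' and the primer II-B' as described in (i); wherein the capture sequence A of the primer II-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid, and the primer II-A' further comprises a tag sequence A, for example a random oligonucleotide sequence; wherein the primer II-B' contains the consensus sequence B, the 3' end overhang complementary sequence, and a tag sequence B; preferably, the primer II-B' comprises modified nucleotides (for example, locked nucleic acids); preferably, the 3' end of the primer II-B' comprises one or more modified nucleotides (for example, locked nucleic acids); preferably, the kit further comprises a primer B", the primer B" being capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof and capable of initiating an extension reaction.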
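- The kit of any one of claims 47-58, further comprising a reverse transcriptase, a nucleic acid ligase, a nucleic acid polymerase, and/or a transposase; preferably, the reverse transcriptase has terminal transferase activity; preferably, the reverse transcriptase is capable of synthesizing a cDNA strand using RNA (for example, mRNA) as a template and adding the 3' end overhang to the 3' end of the cDNA strand.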
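- The kit of any one of claims 47-59, further comprising: reagents for nucleic acid hybridization, reagents for nucleic acid extension, reagents for nucleic acid amplification, reagents for recovering or purifying nucleic acids, reagents for constructing a transcriptome sequencing library, reagents for sequencing (for example, second-generation or third-generation sequencing), or any combination thereof.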
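- Use of the method of any one of claims 1-41 or the kit of any one of claims 47-60 for constructing a nucleic acid molecule library or for performing transcriptome sequencing; preferably, the method or kit is used for constructing a single-cell nucleic acid molecule library or for performing single-cell transcriptome sequencing.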
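
The data-analysis step in the single-cell transcriptome analysis claim above (match reads to the tag sequence Y or its complement, mark matched microdots as positive, treat regionally contiguous positive microdots as one cell) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the claims: microdots are addressed by hypothetical `(row, col)` grid coordinates, tags are matched exactly, "regionally contiguous" is read as 4-neighbour adjacency, tags are assumed unique and never the reverse complement of another tag, and the function names `reverse_complement`, `positive_microdots`, and `group_into_cells` are invented.

```python
from collections import deque


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A/C/G/T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def positive_microdots(read_tags, dot_tags):
    """Map each matched microdot position to the indices of its reads.

    dot_tags:  dict {(row, col): tag sequence Y} (hypothetical layout).
    read_tags: tag sequences extracted from the sequencing reads.
    A read matches if it equals a tag Y or its reverse complement.
    """
    lookup = {}
    for pos, tag in dot_tags.items():
        lookup[tag] = pos
        lookup[reverse_complement(tag)] = pos
    hits = {}
    for i, tag in enumerate(read_tags):
        pos = lookup.get(tag)
        if pos is not None:
            hits.setdefault(pos, []).append(i)
    return hits


def group_into_cells(hits):
    """Group positive microdots into 4-connected components (assumed cells)."""
    unvisited = set(hits)
    cells = []
    while unvisited:
        start = unvisited.pop()
        queue = deque([start])
        component = [start]
        while queue:  # breadth-first search over adjacent positive dots
            r, c = queue.popleft()
            for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if nb in unvisited:
                    unvisited.remove(nb)
                    queue.append(nb)
                    component.append(nb)
        cells.append(sorted(component))
    return sorted(cells)


# Toy data: five microdots, six reads (one unmatched, one duplicate tag).
dot_tags = {(0, 0): "AACG", (0, 1): "CCTA", (0, 3): "GTCA",
            (1, 3): "TGGA", (2, 0): "ACAC"}
read_tags = ["AACG", "TAGG", "GTCA", "TCCA", "NNNN", "AACG"]

hits = positive_microdots(read_tags, dot_tags)
cells = group_into_cells(hits)
# Read indices pooled per putative cell.
reads_per_cell = [sorted(i for dot in cell for i in hits[dot]) for cell in cells]
# cells          -> [[(0, 0), (0, 1)], [(0, 3), (1, 3)]]
# reads_per_cell -> [[0, 1, 5], [2, 3]]
```

In this toy run the microdot at `(2, 0)` receives no reads and stays negative, so the four positive microdots fall into two separate groups. A real pipeline would likely need mismatch-tolerant tag matching and a contiguity rule suited to the actual array geometry; neither is specified here.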
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2022417425A AU2022417425A1 (en) | 2021-12-24 | 2022-11-30 | Labeling and analysis method for single-cell nucleic acid |
CN202280085229.2A CN118451199A (zh) | 2021-12-24 | 2022-11-30 | Labeling and analysis method for single-cell nucleic acid |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111600833 | 2021-12-24 | | |
CN202111600833.8 | 2021-12-24 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023116376A1 (zh) | 2023-06-29 |
Family
ID=86901195
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/135478 WO2023116376A1 (zh) | Labeling and analysis method for single-cell nucleic acid | 2021-12-24 | 2022-11-30 |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN118451199A (zh) |
AU (1) | AU2022417425A1 (zh) |
WO (1) | WO2023116376A1 (zh) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180320224A1 (en) * | 2017-05-03 | 2018-11-08 | The Broad Institute, Inc. | Single-cell proteomic assay using aptamers |
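CN110199196A (zh) * | 2016-11-11 | 2019-09-03 | IsoPlexis Corporation | Compositions and methods for simultaneous genomic, transcriptomic and proteomic analysis of single cells |
CN110684829A (zh) * | 2018-07-05 | 2020-01-14 | MGI Tech Co., Ltd. | High-throughput single-cell transcriptome sequencing method and kit |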
US20200157528A1 (en) * | 2018-11-16 | 2020-05-21 | International Business Machines Corporation | Determining position and transcriptomes of biological cells |
WO2020176788A1 (en) * | 2019-02-28 | 2020-09-03 | 10X Genomics, Inc. | Profiling of biological analytes with spatially barcoded oligonucleotide arrays |
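WO2020228788A1 (zh) * | 2019-05-15 | 2020-11-19 | BGI Shenzhen | Array and method for detecting spatial information of nucleic acids |
CN113604545A (zh) * | 2021-08-09 | 2021-11-05 | Zhejiang University | Ultra-high-throughput single-cell chromatin transposase accessibility sequencing method |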
2022
- 2022-11-30: AU, application AU2022417425A, publication AU2022417425A1 (en), active, pending
- 2022-11-30: CN, application CN202280085229.2A, publication CN118451199A (zh), active, pending
- 2022-11-30: WO, application PCT/CN2022/135478, publication WO2023116376A1 (zh), status unknown
Non-Patent Citations (2)
Title |
---|
"UniProt", Database accession no. P19821.1 |
LEBRIGAND, K. ET AL.: "High throughput error corrected Nanopore single cell transcriptome sequencing.", NATURE COMMUNICATIONS, vol. 11, 12 August 2020 (2020-08-12), XP093047897, DOI: 10.1038/s41467-020-17800-6 * |
Also Published As
Publication number | Publication date |
---|---|
AU2022417425A1 (en) | 2024-07-25 |
CN118451199A (zh) | 2024-08-06 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22909691; Country of ref document: EP; Kind code of ref document: A1 |
 | ENP | Entry into the national phase | Ref document number: 2022417425; Country of ref document: AU; Date of ref document: 20221130; Kind code of ref document: A |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | ENP | Entry into the national phase | Ref document number: 2022909691; Country of ref document: EP; Effective date: 20240724 |