WO2023116376A1 - Single-cell nucleic acid labeling and analysis method - Google Patents


Info

Publication number
WO2023116376A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
primer
nucleic acid
region
oligonucleotide
Prior art date
Application number
PCT/CN2022/135478
Other languages
English (en)
French (fr)
Inventor
廖莎
陈奥
章文蔚
徐讯
Original Assignee
深圳华大生命科学研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大生命科学研究院
Priority to AU2022417425A (patent AU2022417425A1, en)
Priority to CN202280085229.2A (patent CN118451199A, zh)
Publication of WO2023116376A1 (patent WO2023116376A1, zh)

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • This application relates to the technical field of single-cell transcriptome sequencing and biomolecular spatial information detection. Specifically, the present application relates to a method for positioning and labeling nucleic acid molecules in a single-cell sample, and a method for constructing a single-cell transcriptome sequencing library. Furthermore, the present application also relates to kits for carrying out said methods.
  • Single-cell transcriptome sequencing technology is an important tool for identifying cellular heterogeneity.
  • The importance of single-cell transcriptome sequencing has driven rapid development of the technology in terms of throughput and ease of operation.
  • Its development has prompted the launch, with substantial international funding, of the Human Cell Atlas Project, a reference system of human cell atlases.
  • The launch of the Human Cell Atlas Project has in turn placed higher demands on the throughput of single-cell transcriptome sequencing technology.
  • Single-cell transcriptome sequencing is also used by medical researchers to discover the small number of "tumor stem cells" in cancer, so as to find drugs and therapies against malignant tumors. Because malignant tumor cells are relatively rare, the technology needs a high cell utilization rate or capture rate to avoid losing the transcriptome information of these few cells.
  • Existing single-cell transcriptome sequencing technologies fall into two main categories. One is low-throughput sequencing based on multi-well plates, in which a single cell is allocated to a single well of a multi-well plate, such as Smart-seq and CEL-seq. The other is magnetic bead-based sequencing, in which a cell is co-encapsulated with labeled magnetic beads in micro-droplets or micro-wells by means of microfluidics, such as 10x Chromium, Drop-seq, Seq-well and other technologies.
  • Among existing single-cell transcriptome sequencing technologies, 10x Chromium has the highest throughput: a single run processes 5,000 to 7,000 cells, up to 10,000 cells, and, depending on the cell type, the cell capture rate is 30% to 60%.
  • the gel beads (Gel Beads) carrying label molecules or barcode molecules (Barcode) enter the microfluidic system at a uniform speed;
  • the gel beads are combined and form GEMs (Gel Bead in Emulsion) in the oil phase.
  • each cell is combined with a Gel Bead to form a GEM; this method can thus achieve single-cell transcriptome sequencing.
  • The formation of GEMs follows a Poisson distribution. That is, a single GEM may contain zero cells or multiple cells.
  • Sequencing data generated by such a GEM do not correspond to the state of a single cell, cannot be used downstream, and must be filtered out algorithmically. Limited by the number of micro-droplets formed in the oil phase, the throughput of this technology is difficult to raise beyond the 10,000-cell level; at the same time, because of the Poisson distribution, its cell capture rate can in theory reach at most 60%. Therefore, when single-cell transcriptome sequencing must be performed on 100,000 or more cells, or when rare cells must be captured and sequenced, this technology has major shortcomings and cannot meet actual needs. There is thus a need in the art for new single-cell transcriptome sequencing methods with higher cell capture rates.
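The Poisson argument above can be made concrete with a short calculation. The following is a minimal sketch, assuming that the number of cells per droplet is Poisson-distributed with mean `lam` (a hypothetical loading parameter not given in the source); it models cell loading only and does not attempt to reproduce the 30% to 60% capture rates quoted above.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a droplet holds exactly k cells at mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def droplet_fractions(lam: float) -> dict:
    """Fractions of droplets that are empty, hold one cell, or hold several cells."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}

def cells_in_single_droplets(lam: float) -> float:
    """Fraction of all loaded cells that end up alone in a droplet (equals exp(-lam))."""
    return poisson_pmf(1, lam) / lam

if __name__ == "__main__":
    # lam values are illustrative assumptions, not figures from the source.
    for lam in (0.05, 0.1, 0.5, 1.0):
        f = droplet_fractions(lam)
        print(f"lam={lam}: empty={f['empty']:.3f} single={f['single']:.3f} "
              f"multiple={f['multiple']:.3f} cells alone={cells_in_single_droplets(lam):.3f}")
```

Lowering `lam` reduces multi-cell droplets but leaves most droplets empty, which is the trade-off the paragraph above describes.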
  • the present application provides a method for positionally marking nucleic acid molecules of a cell sample, and a method for constructing a single-cell transcriptome sequencing library based on the method. Furthermore, the present application also relates to kits for carrying out said methods.
  • the application provides a method of generating a population of labeled nucleic acid molecules, comprising the steps of:
  • the sample is a single-cell suspension; the cells contain (for example, on their surface) a first binding molecule;
  • the nucleic acid array includes a solid support; the solid support contains (for example, on its surface) a first labeling molecule, and the first binding molecule can form an interaction pair with the first labeling molecule;
  • the solid support also includes a plurality of microdots; the size of the microdots (such as the equivalent diameter) is less than 5 μm, and the center-to-center distance between adjacent microdots is less than 10 μm; each microdot is coupled with an oligonucleotide probe, and each oligonucleotide probe comprises at least one copy; the oligonucleotide probe comprises or consists of, in the 5' to 3' direction: consensus sequence X1, tag sequence Y, and consensus sequence X2, wherein,
  • oligonucleotide probes coupled to different microdots have different tag sequences Y;
  • each cell occupies at least one microdot in the nucleic acid array (i.e., each cell is separately associated with at least one microdot in the nucleic acid array), and the first binding molecule of the cell forms an interaction pair with the first labeling molecule of the solid support;
  • the center-to-center distance between said adjacent microdots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and,
  • the microdots have a size (e.g., equivalent diameter) of less than 5 μm, less than 1 μm, less than 0.3 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm.
  • the center-to-center distance between adjacent microdots is 0.5 μm to 1 μm, such as 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm.
  • the microdots have a size (e.g., equivalent diameter) of 0.001 μm to 0.5 μm (e.g., 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, 0.2 μm to 0.3 μm).
  • the first binding molecule can form a specific interaction pair or a non-specific interaction pair with the first label molecule.
  • the interaction pair is selected from positive and negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), molecular pairs capable of click chemistry reactions (e.g., alkynyl-containing groups and azido-containing compounds), N-hydroxysulfosuccinimide (NHS) esters and amino-containing compounds, or any combination thereof.
  • the first labeling molecule is polylysine, and the first binding molecule is a protein capable of binding to polylysine; the first labeling molecule is an antibody, and the first binding molecule is an antigen capable of binding to the antibody; the first labeling molecule is an amino-containing compound, and the first binding molecule is an N-hydroxysulfosuccinimide (NHS) ester; or, the first labeling molecule is biotin, and the first binding molecule is streptavidin.
  • said first binding molecule is naturally contained by said cell.
  • said first binding molecule is not naturally contained by said cell.
  • the method further comprises the step of binding the first binding molecule to the one or more cells, or causing the one or more cells to express the first binding molecule, to provide the cell sample of step (i).
  • the method further comprises the step of binding the first labeling molecule to the solid support to provide the nucleic acid array of step (i).
  • in step (2), the pretreatment includes the following steps:
  • primer I-A contains a consensus sequence A and a capture sequence A; the capture sequence A can anneal to the RNA to be captured (for example, mRNA) and initiate an extension reaction; the consensus sequence A is located upstream of the capture sequence A (for example, at the 5' end of the primer I-A);
  • the cDNA strand includes the cDNA sequence complementary to the RNA (for example, mRNA), formed using the primer I-A as a reverse transcription primer, and a 3' end overhang; wherein the primer I-A contains a consensus sequence A and a capture sequence A; the capture sequence A can anneal to the RNA to be captured (for example, mRNA) and initiate an extension reaction; the consensus sequence A is located upstream of the capture sequence A (e.g., at the 5' end of the primer I-A); and, (b) anneal primer I-B to the cDNA strand generated in (a) and perform an extension reaction to generate a first extension product, which is the first nucleic acid molecule to be labeled, thereby generating a first nucleic acid molecule population; wherein the primer I-B includes a consensus sequence
  • (a) reverse-transcribe the RNA (e.g., mRNA) of the one or more cells with primer I-A' to generate a cDNA strand comprising the cDNA sequence complementary to the RNA (for example, mRNA), formed using the primer I-A' as a reverse transcription primer, and a 3' end overhang; wherein the primer I-A' comprises a capture sequence A capable of annealing to the RNA to be captured (for example, mRNA) and initiating an extension reaction; (b) anneal primer I-B to the cDNA strand generated in (a), and perform an extension reaction to generate a first extension product; wherein the primer I-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; the 3' end overhang complementary sequence is located at the 3' end of the primer I-B; the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer I-B); and
  • in step (3), the first nucleic acid molecule population derived from each cell obtained in the previous step is associated, through the following steps, with the oligonucleotide probes coupled to the microdots occupied by the cell from which it originated, thereby generating a second nucleic acid molecule population marked by the tag sequence Y:
  • contact the bridging oligonucleotide I with the first nucleic acid molecules derived from each cell obtained in step (2) and with the oligonucleotide probes coupled to the microdots occupied by the cell; anneal (for example, anneal in situ) the bridging oligonucleotide I to those first nucleic acid molecules and oligonucleotide probes, so that the first nucleic acid molecule population is ligated to the oligonucleotide probes on the array; the resulting ligation product is a second nucleic acid molecule with a position marker, thereby generating a second nucleic acid molecule population;
  • the bridging oligonucleotide I includes a first region and a second region, and optionally a third region between the first region and the second region; the first region is located upstream of the second region (for example, at its 5' end); wherein,
  • the first region can anneal to all or part of the consensus sequence A of the primer I-A described in step (2)(i) or step (2)(ii), or to all or part of the consensus sequence B of the primer I-B described in step (2)(iii);
  • the second region can anneal to all or part of the consensus sequence X2.
  • in step (3), when the first region and the second region of the bridging oligonucleotide I are directly adjacent, ligating the first nucleic acid molecule population to the oligonucleotide probes includes: using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide I; the resulting ligation product is the second nucleic acid molecule with a position marker; or,
  • said ligating the first nucleic acid molecule population to the oligonucleotide probes comprises: using a nucleic acid polymerase to carry out a polymerization reaction with the third region as a template, and using a nucleic acid ligase to ligate the nucleic acid molecules hybridized to the first region, the third region and the second region of the same bridging oligonucleotide I; the resulting ligation product is the second nucleic acid molecule with a position marker; preferably, the nucleic acid polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
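The bridging (splint) geometry described above can be sketched in a few lines. This is an illustrative model only, assuming perfect complementarity and DNA written 5' to 3'; all sequences and the function names `bridging_oligo`, `ligation_product` and `bridges` are hypothetical and do not come from the source.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def bridging_oligo(consensus_5p: str, x2: str, third_region: str = "") -> str:
    """5'-[first region]-[optional third region]-[second region]-3'.

    The first region pairs with the consensus sequence at the 5' end of the
    first nucleic acid molecule; the second region pairs with X2 at the 3' end
    of the array probe.
    """
    return revcomp(consensus_5p) + third_region + revcomp(x2)

def ligation_product(probe: str, first_molecule: str, third_region: str = "") -> str:
    """Probe, optional gap fill templated by the third region, then the first molecule."""
    return probe + revcomp(third_region) + first_molecule

def bridges(oligo: str, probe: str, first_molecule: str, n_first: int, n_second: int) -> bool:
    """True if the oligo's two end regions pair with the molecule's 5' end and the probe's 3' end."""
    first_region = oligo[:n_first]
    second_region = oligo[len(oligo) - n_second:]
    return (first_molecule.startswith(revcomp(first_region))
            and probe.endswith(revcomp(second_region)))

# Hypothetical example sequences, much shorter than the 20-100 nt used in practice.
X1, Y, X2 = "ACGTACGT", "TTGCAAGC", "GGATCCAA"
CA = "CTGACTGA"
probe = X1 + Y + X2
first_molecule = CA + "ACGGTTCA"
oligo = bridging_oligo(CA, X2)
print(bridges(oligo, probe, first_molecule, len(CA), len(X2)))
print(ligation_product(probe, first_molecule))
```

With a non-empty `third_region`, the product carries the complement of that region between X2 and the first nucleic acid molecule, matching the product order X1, Y, X2, optional third-region complement, first molecule.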
  • each oligonucleotide probe comprises one copy.
  • each oligonucleotide probe comprises multiple copies.
  • when each oligonucleotide probe comprises one copy, each microdot is coupled with one probe, and the oligonucleotide probes of different microdots have different tag sequences Y; when each oligonucleotide probe comprises multiple copies, each microdot is coupled with multiple probes, the oligonucleotide probes in the same microdot have the same tag sequence Y, and the oligonucleotide probes in different microdots have different tag sequences Y.
  • the solid support comprises a plurality of microdots, each microdot is coupled to an oligonucleotide probe, and each oligonucleotide probe may comprise one or more copies.
  • the solid support comprises a plurality (e.g., at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, or more) of microdots; in certain embodiments, the solid support comprises at least 10^4 (e.g., at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, at least 10^9, at least 10^10, at least 10^11, or at least 10^12) microdots per square millimeter.
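The relationship between microdot spacing and microdot density can be checked with simple arithmetic. The sketch below assumes a square grid of microdots and a circular cell footprint; the 10 μm cell diameter is a hypothetical value chosen for illustration, not a figure from the source.

```python
import math

def dots_per_mm2(pitch_um: float) -> float:
    """Microdots per square millimetre for a square grid with the given center-to-center pitch."""
    return (1000.0 / pitch_um) ** 2

def dots_under_cell(cell_diameter_um: float, pitch_um: float) -> float:
    """Approximate number of microdots covered by a circular cell footprint."""
    footprint_area = math.pi * (cell_diameter_um / 2.0) ** 2
    return footprint_area / pitch_um ** 2

if __name__ == "__main__":
    for pitch in (10.0, 1.0, 0.5):
        print(f"pitch {pitch} um: {dots_per_mm2(pitch):.0f} dots/mm^2, "
              f"about {dots_under_cell(10.0, pitch):.0f} dots under a 10 um cell")
```

Under these assumptions, a pitch of 10 μm corresponds to 10^4 microdots per square millimetre, so the pitch bound and the density bound stated in the text describe the same limit, and smaller pitches place many microdots under a single cell.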
  • Embodiment comprising step (1), step (2)(i) and step (3)
  • the method comprises step (1), step (2)(i) and step (3); wherein the ligation product obtained in step (3) is the second nucleic acid molecule with a position marker, comprising from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide I, and the sequence of the first nucleic acid molecule to be labeled.
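Because the ligation product begins with X1, Y and X2 in that order, the tag sequence Y can be read back from a sequenced product and mapped to a microdot. The following is a minimal sketch, assuming exact matching, a fixed Y length, and a hypothetical lookup table `tag_to_position`; real pipelines would also need to tolerate sequencing errors.

```python
from typing import Dict, Optional, Tuple

def extract_tag_y(read: str, x1: str, x2: str, y_len: int) -> Optional[str]:
    """Return tag sequence Y if the read starts with X1 + Y + X2, otherwise None."""
    if not read.startswith(x1):
        return None
    y_start = len(x1)
    y_end = y_start + y_len
    if read[y_end:y_end + len(x2)] != x2:
        return None
    return read[y_start:y_end]

# Hypothetical sequences and array coordinates for illustration only.
X1, X2 = "ACGTACGT", "GGATCCAA"
tag_to_position: Dict[str, Tuple[int, int]] = {
    "TTGCAAGC": (0, 0),
    "CAGTTGAC": (0, 1),
}

read = X1 + "CAGTTGAC" + X2 + "CTGACTGAACGGTTCA"
tag = extract_tag_y(read, X1, X2, y_len=8)
print(tag, tag_to_position.get(tag))
```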
  • the capture sequence A is a random oligonucleotide sequence.
  • the ligation product derived from each copy of the oligonucleotide probe coupled to the same micro-dot has a different capture sequence A, and the capture sequence A serves as the molecular tag (UMI) of the second nucleic acid molecule.
  • the extension product (the first nucleic acid molecule to be labeled) in step (2)(i) comprises from the 5' end to the 3' end: the consensus sequence A, and the cDNA sequence complementary to the RNA, formed using the primer I-A as a reverse transcription primer.
  • the capture sequence A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer I-A further comprises a tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer I-A, and the consensus sequence A is located upstream of the tag sequence A (for example, at the 5' end of the primer I-A).
  • in step (3), the ligation products derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences A as UMIs.
  • the extension product described in step (2)(i) sequentially comprises from the 5' end to the 3' end: the consensus sequence A, the tag sequence A, and the cDNA sequence complementary to the RNA, formed using the primer I-A as a reverse transcription primer.
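Since tag sequence A acts as a UMI and tag sequence Y marks the microdot, distinct captured molecules can be counted by collapsing reads that share the same Y, gene and UMI. The sketch below is an assumption about downstream analysis rather than a procedure stated in the source; the record layout `(tag_y, umi, gene)` is hypothetical, and gene assignment and UMI error correction are assumed to happen elsewhere.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

def count_molecules(records: Iterable[Tuple[str, str, str]]) -> Dict[Tuple[str, str], int]:
    """Count distinct UMIs for each (tag sequence Y, gene) pair."""
    seen = defaultdict(set)
    for tag_y, umi, gene in records:
        seen[(tag_y, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}

# Hypothetical reads: the first two are duplicates of one molecule.
records = [
    ("TTGCAAGC", "ACGTAC", "GENE1"),
    ("TTGCAAGC", "ACGTAC", "GENE1"),
    ("TTGCAAGC", "GGTACA", "GENE1"),
    ("CAGTTGAC", "ACGTAC", "GENE2"),
]
print(count_molecules(records))
```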
  • Embodiment comprising step (1), step (2)(ii) and step (3)
  • the method comprises step (1), step (2)(ii) and step (3); wherein the ligation product obtained in step (3) is the second nucleic acid molecule with a position marker, comprising from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide I, and the sequence of the first nucleic acid molecule to be labeled.
  • the capture sequence A is a random oligonucleotide sequence.
  • the ligation product derived from each copy of the oligonucleotide probe coupled to the same micro-dot has a different capture sequence A, and the capture sequence A serves as the molecular tag (UMI) of the second nucleic acid molecule.
  • the first extension product (the first nucleic acid molecule to be labeled) in step (2)(ii) comprises from the 5' end to the 3' end: the consensus sequence A, the cDNA sequence complementary to the RNA, formed using the primer I-A as a reverse transcription primer, the 3' end overhang sequence, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the capture sequence A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer I-A further comprises a tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer I-A, and the consensus sequence A is located upstream of the tag sequence A (for example, at the 5' end of the primer I-A).
  • in step (3), the ligation products derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences A as UMIs.
  • the first extension product (the first nucleic acid molecule to be labeled) described in step (2)(ii) sequentially comprises from the 5' end to the 3' end: the consensus sequence A, the tag sequence A, the cDNA sequence complementary to the RNA, formed using the primer I-A as a reverse transcription primer, the 3' end overhang sequence, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the 5' end of the primer I-A comprises a phosphorylation modification.
  • An exemplary embodiment of the present application comprising step (1), step (2)(ii) and step (3) is described in detail as follows:
  • An exemplary scheme for preparing a cDNA chain using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in Figure 2):
  • RNA molecules (for example, mRNA molecules) in the sample are reverse-transcribed using reverse transcriptase (for example, reverse transcriptase with terminal transfer activity) and primer I-A to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
  • Various reverse transcriptases having terminal transfer activity can be used for the reverse transcription reaction.
  • the reverse transcriptase used does not have RNaseH activity.
  • the primer I-A comprises a poly(T) sequence and a consensus sequence A (labeled CA in the figure).
  • the primer I-A further comprises a unique molecular tag sequence (UMI).
  • a poly(T) sequence is located at the 3' end of the primer I-A to initiate reverse transcription.
  • the UMI sequence is located upstream (for example, the 5' end) of the poly(T) sequence
  • the consensus sequence A is located upstream (for example, the 5' end) of the UMI sequence.
  • primer I-B, which contains consensus sequence B (marked as CB in the figure), is annealed or hybridized to the cDNA strand; subsequently, under the action of a nucleic acid polymerase, the nucleic acid fragment hybridized or annealed to primer I-B is extended using consensus sequence B as a template, so that the complementary sequence of consensus sequence B is added to the 3' end of the cDNA strand, thereby generating a nucleic acid molecule that carries consensus sequence A and tag sequence A at the 5' end and the complementary sequence of consensus sequence B at the 3' end.
  • the primer I-B may comprise a sequence complementary to the 3' end overhang of the cDNA strand.
  • primer I-B may contain GGG at its 3' end.
  • the nucleotides of primer I-B can also be modified (e.g., using a locked nucleic acid) to enhance complementary pairing between primer I-B and the 3' end overhang of the cDNA strand.
  • various nucleic acid polymerases (for example, DNA polymerase or reverse transcriptase) can be used to carry out the extension reaction, as long as they can extend the captured nucleic acid fragment (the reverse transcription product) using the sequence or a partial sequence of primer I-B as a template.
  • the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (the reverse transcription product).
  • step (2) is performed simultaneously with step (1).
  • the method optionally further includes step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
  • the method does not include step (3).
  • Exemplary structures of cDNA strands prepared by the above-described exemplary embodiments include: consensus sequence A, UMI sequence, a sequence complementary to that of RNA (eg, mRNA), and a complementary sequence to consensus sequence B.
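The exemplary cDNA structure above can be assembled step by step in code. This is a minimal sketch, assuming idealized reactions, an mRNA body written in DNA letters without its poly(A) tail, and a three-cytosine overhang; all sequences and the function name `build_first_molecule` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def build_first_molecule(mrna_body: str, consensus_a: str, umi: str,
                         consensus_b: str, poly_t_len: int = 20,
                         overhang: str = "CCC") -> str:
    """Assemble the strand produced by reverse transcription and template switching."""
    # Primer I-A: consensus sequence A, then the UMI, then poly(T) at the 3' end.
    primer_ia = consensus_a + umi + "T" * poly_t_len
    # Reverse transcription extends the primer along the mRNA, then the
    # terminal transfer activity adds the overhang to the 3' end.
    cdna = primer_ia + revcomp(mrna_body) + overhang
    # Primer I-B pairs with the overhang; extension copies consensus sequence B.
    return cdna + revcomp(consensus_b)

CA, UMI, CB = "CTGACTGA", "ACGTAC", "GATTACAG"
molecule = build_first_molecule("ATGGCCAAG", CA, UMI, CB)
print(molecule)
```

The resulting strand reads, 5' to 3': consensus sequence A, UMI, poly(T), the sequence complementary to the RNA, the overhang, and the complement of consensus sequence B.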
  • An exemplary protocol for generating a new nucleic acid molecule comprises the following steps (as shown in Figure 3):
  • a bridging oligonucleotide I is provided whose 5' end contains a sequence (first region, P1) that is at least partially complementary to the 5' end of the cDNA sequence (e.g., consensus sequence A (CA)) and whose 3' end contains a sequence (second region, P2) that is at least partially complementary to the 3' end of the chip sequence (e.g., consensus sequence X2).
  • the P1 and P2 sequences in the bridging oligonucleotide I are contiguously connected without an intervening nucleotide therebetween.
  • the P1 sequence, the P2 sequence, the consensus sequence A and the consensus sequence X2 each independently have a length of 20-100 nt (such as 20-70 nt).
  • the bridging oligonucleotide I is annealed or hybridized to the oligonucleotide probe and the cDNA strand, after which the 5' end of the cDNA strand is ligated to the 3' end of the oligonucleotide probe by DNA ligase and/or DNA polymerase to form a new nucleic acid molecule (i.e., a nucleic acid molecule labeled with an oligonucleotide probe) that contains the sequence information of the oligonucleotide probe.
  • the DNA polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
  • Embodiment comprising step (1), step (2)(iii) and step (3)
  • the method comprises step (1), step (2)(iii) and step (3); wherein the ligation product obtained in step (3) is the second nucleic acid molecule with a position marker, comprising from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide I, and the sequence of the first nucleic acid molecule to be labeled.
  • the extension primer is the primer I-B or primer B", and the primer B" can be complementary to the consensus sequence B
  • in step (2)(iii)(c), the extension primer is the primer B.
  • the capture sequence A of the primer I-A' is a random oligonucleotide sequence.
  • the primer I-B comprises a consensus sequence B, a complementary sequence overhanging at the 3' end, and a tag sequence B.
  • the first extension product comprises from the 5' end to the 3' end: the cDNA sequence complementary to the RNA, formed using the primer I-A' as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B; wherein the complementary sequence of the tag sequence B serves as the molecular tag (UMI) of the second nucleic acid molecule.
  • the second extension product (the first nucleic acid molecule to be labeled) comprises from the 5' end to the 3' end: the consensus sequence B or a partial sequence at its 3' end, the tag sequence B, the complementary sequence of the 3' end overhang sequence, and the complementary sequence of the cDNA sequence in the first extension product; wherein the tag sequence B serves as the molecular tag (UMI) of the second nucleic acid molecule.
  • the capture sequence A of the primer I-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer I-A' also contains a tag sequence A, such as a random oligonucleotide sequence, and a consensus sequence A.
  • the capture sequence A is located at the 3' end of the primer I-A'.
  • the consensus sequence A is located upstream of the capture sequence A (e.g., at the 5' end of the primer I-A').
  • the primer I-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and a tag sequence B.
  • the first extension product comprises from the 5' end to the 3' end: the consensus sequence A, optionally the tag sequence A, the cDNA sequence complementary to the RNA, formed using the primer I-A' as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the second extension product (the first nucleic acid molecule to be labeled) comprises from the 5' end to the 3' end: the consensus sequence B or a partial sequence at its 3' end, the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence in the first extension product, optionally the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
  • in step (3), the ligation products derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences B as UMIs.
  • the 5' end of the extension primer comprises a phosphorylation modification.
  • before step (2)(iii)(c), the method further comprises treating the product of step (2)(iii)(a) or step (2)(iii)(b) (e.g., by heat treatment) to remove RNA.
  • in step (2)(iii)(b) of the method, the cDNA strand anneals to the primer I-B via its 3' end overhang, and, under the action of a nucleic acid polymerase (e.g., DNA polymerase or reverse transcriptase), the cDNA strand is extended using the primer I-B as a template to generate the first extension product.
  • step (1) An exemplary embodiment of the present application comprising step (1), step (2)(iii) and step (3) is described in detail as follows:
  • An exemplary scheme for preparing a cDNA strand complementary chain using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in FIG. 4 ):
  • RNA molecules (for example, mRNA molecules) in the permeabilized sample are reverse-transcribed using reverse transcriptase (for example, reverse transcriptase with terminal transfer activity) and primer I-A' to generate cDNA, and An overhang (eg, an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
  • the reverse transcriptase used does not have RNaseH activity.
  • the reverse transcription primer I-A' comprises a poly(T) sequence and a consensus sequence A (CA). Typically, the poly(T) sequence is located at the 3' end of the primer I-A' to initiate reverse transcription.
  • Primer I-B is annealed or hybridized to the cDNA strand, said primer I-B comprising consensus sequence B (CB) and the complementary sequence of the 3' end overhang of said cDNA.
  • the primer I-B further comprises a unique molecular tag sequence (UMI).
  • the nucleic acid fragment hybridized or annealed to primer I-B can be extended using the consensus sequence B and the UMI sequence as a template, so that the complementary sequence of the UMI sequence and the complementary sequence of the consensus sequence B are added at the 3' end of the cDNA strand, thereby generating a nucleic acid molecule that carries the consensus sequence A at the 5' end and the complementary sequences of the UMI sequence and the consensus sequence B at the 3' end.
  • Primer I-B may contain GGG at its 3' end when the 3' end of the cDNA strand contains an overhang of 3 cytosine nucleotides.
  • the nucleotides of primer I-B can also be modified (e.g., using a locked nucleic acid) to enhance complementary pairing between primer I-B and the 3' end overhang of the cDNA strand.
  • Nucleic acid polymerases (for example, DNA polymerase or reverse transcriptase) can be used to carry out the extension reaction, as long as they can use the sequence of primer I-B or a partial sequence thereof as a template to extend the captured nucleic acid fragment (reverse transcription product).
  • the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (reverse transcription product).
  • step (2) is performed simultaneously with step (1).
  • the method optionally further includes step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
  • the method does not include step (3).
  • An extension reaction is carried out with an extension primer, using the cDNA single strand obtained in (3) as a template, to obtain an extension product; the extension primer can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof and can initiate an extension reaction.
  • the extension primer is the same as the primer I-B.
  • the exemplary structure containing the complementary strand of the cDNA strand prepared by the above exemplary embodiment includes: consensus sequence B, UMI sequence, the sequence complementary to the cDNA 3' end overhang sequence, the complementary sequence of the cDNA sequence, and the complementary sequence of consensus sequence A.
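  • The strand layouts described above can be traced with a small string model. This is only an illustrative sketch: the sequences `CA`, `CB`, `UMI` and `mrna` are made-up placeholders, RNA is written with DNA letters, and the enzymatic steps are reduced to string operations.

```python
# Illustrative sketch only; every sequence below is a hypothetical placeholder.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

CA = "ACGTACGTAC"      # consensus sequence A (placeholder)
CB = "TTGACCGATG"      # consensus sequence B (placeholder)
UMI = "AGCTTCGA"       # tag sequence B used as UMI (placeholder)
POLY_T = "T" * 12      # capture sequence at the 3' end of primer I-A'
OVERHANG = "CCC"       # 3' overhang added by terminal transfer activity

# mRNA written with DNA letters for simplicity, ending in a poly(A) tail.
mrna = "GGCATTCAGGATCC" + "A" * 12

primer_ia = CA + POLY_T                    # 5'-CA-poly(T)-3'
primer_ib = CB + UMI + revcomp(OVERHANG)   # 5'-CB-UMI-GGG-3'

# Reverse transcription: primer I-A' is extended along the mRNA. The first
# len(POLY_T) bases of revcomp(mrna) are the poly(T) already in the primer.
cdna = primer_ia + revcomp(mrna)[len(POLY_T):] + OVERHANG

# Extension on primer I-B: the complements of UMI and CB are appended.
first_extension = cdna + revcomp(CB + UMI)

# Complementary strand: CB, UMI, GGG, complement of cDNA, complement of CA.
complementary_strand = revcomp(first_extension)
print(complementary_strand)
```

  • In this model the complementary strand begins with the full primer I-B sequence, which is why an extension primer that anneals to the complementary sequence of consensus sequence B can prime its synthesis.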
  • An exemplary protocol for forming a new nucleic acid molecule containing chip sequence information comprises the following steps (as shown in Figure 5):
  • a bridging oligonucleotide I is provided whose 5' end contains a sequence at least partially complementary to consensus sequence B (CB) (first region, P1) and whose 3' end contains a sequence at least partially complementary to consensus sequence X2 (second region, P2).
  • the P1 and P2 sequences in the bridging oligonucleotide I are contiguously connected without an intervening nucleotide therebetween.
  • each of the P1 sequence and the P2 sequence independently has a length of 20-100 nt (e.g., 20-70 nt).
  • the bridging oligonucleotide I is annealed or hybridized with the oligonucleotide probe and the complementary strand of the cDNA strand, and then the 5' end of the complementary strand of the cDNA strand and the 3' end of the chip sequence (the oligonucleotide probe) are ligated by DNA ligase and/or DNA polymerase to form a new nucleic acid molecule (i.e., a nucleic acid molecule labeled with an oligonucleotide probe) that contains the sequence information of the oligonucleotide probe.
  • the DNA polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
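  • The geometry of the bridging oligonucleotide I can be sketched as follows, for the case in which P1 and P2 are contiguous. The function names, the placeholder sequences and the requirement of perfect complementarity are assumptions made for illustration; the text itself only requires at least partial complementarity.

```python
# Illustrative sketch only; sequences and function names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def design_bridging_oligo(x2: str, cb: str, min_len: int = 20, max_len: int = 100) -> str:
    """Build 5'-P1-P2-3', where P1 pairs with CB and P2 pairs with X2."""
    for name, region in (("CB", cb), ("X2", x2)):
        if not min_len <= len(region) <= max_len:
            raise ValueError(f"{name} region must be {min_len}-{max_len} nt")
    return revcomp(cb) + revcomp(x2)

def ligate_on_splint(probe: str, strand: str, splint: str, p1_len: int) -> str:
    """Join the probe 3' end to the strand 5' end if the splint holds both."""
    p1, p2 = splint[:p1_len], splint[p1_len:]
    if not strand.startswith(revcomp(p1)):
        raise ValueError("P1 does not pair with the 5' end of the strand")
    if not probe.endswith(revcomp(p2)):
        raise ValueError("P2 does not pair with the 3' end of the probe")
    return probe + strand

X1, TAG_Y = "GGATCCGGATCC", "ACGTACGT"   # placeholders
X2 = "ACGGT" * 4                         # 20 nt placeholder
CB = "TTCAG" * 4                         # 20 nt placeholder

probe = X1 + TAG_Y + X2                  # probe on the chip, 5'->3'
strand = CB + "AGCTTCGA" + "GGG" + "GGCATTCAGGATCC"
splint = design_bridging_oligo(X2, CB)
labeled = ligate_on_splint(probe, strand, splint, p1_len=len(CB))
print(labeled)
```

  • Because the two ends meet without a gap in this case, a ligase alone suffices in the model; an intervening region would additionally need a polymerase fill-in step.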
  • in step (2), the pretreatment includes the following steps:
  • primer II-A contains a capture sequence A that can anneal to the RNA to be captured (for example, mRNA) and initiate an extension reaction
  • primer II-B is annealed to the cDNA strand generated in (a), and an extension reaction is performed to generate a first extension product; the first extension product is the first nucleic acid molecule to be labeled, thereby generating the first nucleic acid molecule population
  • the primer II-B includes a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; the 3' end overhang complementary sequence is located at the 3' end of the primer II-
  • primer II-A' contains a consensus sequence A and a capture sequence A
  • the capture sequence A can anneal to the RNA to be captured (e.g., mRNA) and initiate an extension reaction; the consensus sequence A is located upstream of the capture sequence A (e.g., at the 5' end of the primer II-A'); (b) the primer II-B' is annealed to the cDNA strand generated in (a), and an extension reaction is performed to generate a first extension product; wherein the primer II-B' includes a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; the 3' end overhang complementary sequence is located at the 3' end of the primer
  • in step (3), the first nucleic acid molecule population derived from each cell obtained in the previous step is associated with the oligonucleotide probes coupled to the microdot occupied by the cell from which it originated, through the following steps, thereby generating a second population of nucleic acid molecules labeled with the tag sequence Y:
  • (i) applying annealing conditions to the product of step (2), so that the first nucleic acid molecule derived from each cell obtained in step (2) anneals to the oligonucleotide probe coupled to the microspot occupied by the cell (for example, in situ annealing), and performing an extension reaction to generate an extension product, the extension product being a second nucleic acid molecule with a position marker, thereby generating a second population of nucleic acid molecules; wherein the consensus sequence X2 of the oligonucleotide probes, or a partial sequence thereof, (a) can anneal to the complementary sequence of the consensus sequence B, or a partial sequence thereof, of the first extension product obtained in step (2)(i), or (b) can anneal to the complementary sequence of the consensus sequence A, or a partial sequence thereof, of the second extension product obtained in step (2)(ii); or,
  • the bridging oligonucleotide pair is brought into contact with the first nucleic acid molecule derived from each cell obtained in step (2) and the oligonucleotide probe coupled to the microspot occupied by the cell, so that the bridging oligonucleotide pair anneals to the first nucleic acid molecule derived from each cell obtained in step (2) and to the oligonucleotide probe coupled to the microspot occupied by the cell (for example, in situ annealing),
  • the bridging oligonucleotide pair is composed of bridging oligonucleotide II-I and bridging oligonucleotide II-II, each of which independently comprises a first region and a second region, and optionally a third region between the first region and the second region, the first region being located upstream (e.g., at the 5' end); wherein,
  • the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
  • the second region of the bridging oligonucleotide II-II (a) can anneal to the complementary sequence of the consensus sequence B, or a partial sequence thereof, of the first extension product obtained in step (2)(i), or (b) can anneal to the complementary sequence of the consensus sequence A, or a partial sequence thereof, of the second extension product obtained in step (2)(ii);
  • the bridging oligonucleotide II-I and the bridging oligonucleotide II-II of the bridging oligonucleotide pair are each present in single-stranded form; alternatively, they are annealed to each other and exist in partially double-stranded form
  • a ligation reaction is carried out to connect the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I, and/or to connect the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II; and an extension reaction is performed; wherein the ligation reaction and the extension reaction are performed in any order; the obtained reaction product is a second nucleic acid molecule with a position marker, thereby generating the second population of nucleic acid molecules.
  • the step of molecular connection comprises: using nucleic acid ligase to connect the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I; or,
  • the bridging oligonucleotide II-I includes a first region, a second region, and a third region between the two, and the step of ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I comprises: using a nucleic acid polymerase (for example, one without 5' to 3' exonuclease activity or strand displacement activity) to carry out a polymerization reaction using the third region as a template, and using a nucleic acid ligase to connect the nucleic acid molecules hybridized to the first region, the third region and the second region of the same bridging oligonucleotide II-I;
  • the step of molecular connection comprises: using nucleic acid ligase to connect the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II; or,
  • the bridging oligonucleotide II-II includes a first region, a second region and a third region between the two, and the step of ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II comprises: using a nucleic acid polymerase (for example, one without 5' to 3' exonuclease activity or strand displacement activity) to carry out a polymerization reaction using the third region as a template, and using a nucleic acid ligase to connect the nucleic acid molecules hybridized to the first region, the third region and the second region of the same bridging oligonucleotide II-II.
  • the method comprises step (1), step (2)(i) and step (3); wherein, in step (2)(i)(b), the primer II-B contains a consensus sequence B, a 3' end overhang complementary sequence, and a tag sequence B.
  • the first extension product described in step (2)(i)(b) sequentially comprises from the 5' end to the 3' end: the cDNA sequence complementary to the RNA formed by using the primer II-A as the reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • each copy of the second nucleic acid molecule derived from the oligonucleotide probe coupled to the same microdot has a different tag sequence B as UMI.
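  • Since copies that share a microdot carry different tag sequences B, counting distinct UMIs per position tag estimates the number of original molecules. The sketch below is a minimal illustration; the record layout `(tag_y, umi, gene)` and the gene assignment are assumptions, and UMI sequencing errors are ignored.

```python
from collections import defaultdict

def count_molecules(records):
    """Count distinct UMIs for each (tag sequence Y, gene) pair.

    `records` is an iterable of (tag_y, umi, gene) tuples, one per read.
    This layout is a hypothetical convenience, not part of the method.
    """
    umis_seen = defaultdict(set)
    for tag_y, umi, gene in records:
        umis_seen[(tag_y, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}

reads = [
    ("YYYY0001", "AGCTTCGA", "geneA"),
    ("YYYY0001", "AGCTTCGA", "geneA"),  # amplification duplicate, same UMI
    ("YYYY0001", "TTGACCAA", "geneA"),  # second molecule at the same microdot
    ("YYYY0002", "AGCTTCGA", "geneA"),  # same UMI at a different microdot
]
counts = count_molecules(reads)
print(counts)
```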
  • Embodiments comprising step (1), step (2)(i) and step (3)(i)
  • the method comprises step (1), step (2)(i) and step (3)(i); wherein the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof;
  • the extension product obtained in step (3)(i) is a labeled nucleic acid molecule, which comprises: the first strand containing the first nucleic acid molecule sequence to be labeled, and/or the second strand containing the oligonucleotide probe sequence.
  • "partial sequence of XX (sequence)" means the nucleotide sequence of at least one segment of "XX (sequence)".
  • the entire nucleotide sequence of the consensus sequence X2 can anneal to the complementary sequence of the consensus sequence B or to the nucleotide sequence of a partial segment of that complementary sequence; alternatively, the nucleotide sequence of a partial segment of the consensus sequence X2 can anneal to the complementary sequence of the consensus sequence B or to the nucleotide sequence of a partial segment of that complementary sequence.
  • "annealing" means that, in two nucleotide sequences annealed to each other, each base in one nucleotide sequence can pair with a base in the other nucleotide sequence without mismatches or gaps; or that most of the bases in one nucleotide sequence can pair with bases in the other nucleotide sequence, with mismatches or gaps allowed (for example, a mismatch or gap of one or several nucleotides). That is, two nucleotide sequences that can anneal may be either completely complementary or partially complementary. Unless otherwise indicated herein or clearly contradicted by the context, this description of "annealing" applies to the entire text.
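  • The distinction between complete and partial complementarity can be illustrated with a simple check. This is a deliberately simplified sketch: the function name and the `max_mismatches` threshold are assumptions, only equal-length sequences are compared, gaps are not modeled, and real annealing also depends on temperature and buffer conditions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def can_anneal(seq_a: str, seq_b: str, max_mismatches: int = 0) -> bool:
    """Check antiparallel base pairing, tolerating a few mismatches.

    max_mismatches=0 models complete complementarity; a larger value
    models partial complementarity (hypothetical threshold).
    """
    if len(seq_a) != len(seq_b):
        return False
    target = revcomp(seq_b)  # what seq_a must equal to pair perfectly
    mismatches = sum(1 for a, b in zip(seq_a, target) if a != b)
    return mismatches <= max_mismatches

perfect = can_anneal("ACGTAC", "GTACGT")
one_off_strict = can_anneal("ACGTAC", "GTACGA")
one_off_relaxed = can_anneal("ACGTAC", "GTACGA", max_mismatches=1)
print(perfect, one_off_strict, one_off_relaxed)
```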
  • the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, the complementary sequence of the consensus sequence B, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the tag sequence B, the complementary sequence of the 3' end overhang sequence, and the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer.
  • Embodiment comprising step (1), step (2)(i) and step (3)(i): first strand
  • the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof (for example, a 3' end partial sequence), and the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i) has a free 3' end.
  • the extension product obtained in step (3)(i) is a labeled nucleic acid molecule comprising the first strand.
  • the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, the complementary sequence of the consensus sequence B, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
  • in step (3)(i), the oligonucleotide probe cannot initiate an extension reaction (e.g., its 3' end is blocked).
  • the capture sequence A of the primer II-A is a random oligonucleotide sequence.
  • the first extension product described in step (2)(i)(b) sequentially comprises from the 5' end to the 3' end: the cDNA sequence complementary to the RNA formed by using the primer II-A as the reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, the complementary sequence of the consensus sequence B, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
  • the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer II-A also contains a consensus sequence A, and an optional tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer II-A.
  • the consensus sequence A is located upstream of the capture sequence A (e.g., at the 5' end of the primer II-A).
  • the first extension product in step (2)(i)(b) sequentially comprises from the 5' end to the 3' end: the consensus sequence A, the optional tag sequence A, the cDNA sequence complementary to the RNA formed by using the primer II-A as the reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the first strand comprises from the 5' end to the 3' end: the consensus sequence A, optionally the tag sequence A, the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, the complementary sequence of the consensus sequence B, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
  • Embodiment comprising step (1), step (2)(i) and step (3)(i): second strand
  • the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof, and the consensus sequence X2 of the oligonucleotide probe has a free 3' end.
  • the extension product obtained in step (3)(i) is a labeled nucleic acid molecule comprising the second strand.
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the tag sequence B, the complementary sequence of the 3' end overhang sequence, and the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer.
  • the first extension product obtained in step (2)(i) cannot initiate an extension reaction (e.g., its 3' end is blocked).
  • the capture sequence A of the primer II-A is a random oligonucleotide sequence.
  • the first extension product described in step (2)(i)(b) sequentially comprises from the 5' end to the 3' end: the cDNA sequence complementary to the RNA formed by using the primer II-A as the reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the tag sequence B, the complementary sequence of the 3' end overhang sequence, and the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer.
  • the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer II-A also contains a consensus sequence A, and an optional tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer II-A.
  • the consensus sequence A is located upstream of the capture sequence A (eg, at the 5' end of the primer II-A).
  • the first extension product in step (2)(i)(b) sequentially comprises from the 5' end to the 3' end: the consensus sequence A, optionally the tag sequence A, the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, optionally the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
  • An exemplary embodiment of the present application comprising step (1), step (2)(i) and step (3)(i) is described in detail as follows:
  • An exemplary scheme for preparing a cDNA strand containing a complementary sequence of UMI at the 3' end using the RNA (such as mRNA) in the sample as a template comprises the following steps (as shown in Figure 6):
  • RNA molecules in the permeabilized sample are reverse-transcribed using reverse transcriptase (e.g., reverse transcriptase with terminal transfer activity) and primer II-A to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
  • Various reverse transcriptases having terminal transfer activity can be used for the reverse transcription reaction.
  • the reverse transcriptase used does not have RNaseH activity.
  • the primer II-A comprises a poly(T) sequence and a consensus sequence A (CA).
  • a poly(T) sequence is located at the 3' end of the primer II-A to initiate reverse transcription.
  • the primer II-A comprises a random oligonucleotide sequence that can be used to capture RNA without a poly(A) tail.
  • the random oligonucleotide sequence is located at the 3' end of the primer II-A to initiate reverse transcription.
  • Primer II-B is annealed or hybridized to the cDNA strand, said primer II-B comprising a consensus sequence B (CB), a unique molecular tag sequence (UMI), and the complementary sequence of the 3' end overhang of the cDNA.
  • the consensus sequence B is located upstream of the UMI sequence (for example, the 5' end), and the sequence complementary to the 3' end overhang of the cDNA strand is located at the 3' end of the primer II-B.
  • the primer II-B may include GGG at its 3' end.
  • the nucleotides of the primer II-B can also be modified (for example, using a locked nucleic acid) to enhance the complementary pairing between the primer II-B and the 3' end overhang of the cDNA strand.
  • Nucleic acid polymerases (for example, DNA polymerase or reverse transcriptase) can be used to carry out the extension reaction, as long as they can use the sequence of the primer II-B or a partial sequence thereof as a template to extend the captured nucleic acid fragment (reverse transcription product).
  • the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (reverse transcription product).
  • this step is performed simultaneously with step (1) (eg, in the same reaction system).
  • the method optionally further comprises step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
  • said method does not comprise said step (3).
  • the exemplary structure of the cDNA strand prepared by the above exemplary embodiment comprises: consensus sequence A, cDNA sequence, 3' end overhang sequence, complementary sequence of UMI sequence, and complementary sequence of consensus sequence B.
  • the performance scheme includes the following steps (as shown in Figure 8):
  • the consensus sequence X2 of the chip sequence or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence B of the cDNA strand obtained in the above step 1 or a partial sequence thereof.
  • the cDNA strand is annealed or hybridized with the chip sequence, and under the action of polymerase, a new nucleic acid molecule containing chip sequence information (i.e., a nucleic acid molecule marked with the chip sequence) is formed.
  • the exemplary structure of the new nucleic acid molecule containing chip sequence information formed by the above exemplary embodiment comprises: a nucleic acid strand comprising, from the 5' end to the 3' end, a consensus sequence A, a cDNA sequence, a 3' end overhang sequence, the complementary sequence of the UMI sequence, the complementary sequence of the consensus sequence B, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1; and/or its complementary nucleic acid strand.
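  • Reading the complementary strand of such a labeled molecule from its 5' end gives, in order, the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the UMI, the complement of the 3' end overhang, and the insert. A minimal parser for that layout might look as follows; the function name, the fixed tag lengths, the exact-match rule and all sequences are assumptions for illustration only.

```python
def parse_labeled_strand(read, x1, x2, y_len, umi_len, overhang_comp="GGG"):
    """Split a read laid out as X1 | tag Y | X2 | UMI | GGG | insert.

    Returns a dict with the position tag, the UMI and the insert, or None
    when a fixed element is missing. Exact matching is a simplification.
    """
    if not read.startswith(x1):
        return None
    pos = len(x1)
    tag_y = read[pos:pos + y_len]
    pos += y_len
    if read[pos:pos + len(x2)] != x2:
        return None
    pos += len(x2)
    umi = read[pos:pos + umi_len]
    pos += umi_len
    if read[pos:pos + len(overhang_comp)] != overhang_comp:
        return None
    pos += len(overhang_comp)
    return {"tag_y": tag_y, "umi": umi, "insert": read[pos:]}

X1 = "GGATCCGGATCC"      # placeholder consensus sequence X1
X2 = "ACGGTACGGTAC"      # placeholder consensus sequence X2
read = X1 + "ACGTACGT" + X2 + "TTAGGCAA" + "GGG" + "ATGCATGC"
parsed = parse_labeled_strand(read, X1, X2, y_len=8, umi_len=8)
print(parsed)
```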
  • Embodiment comprising step (1), step (2)(i) and step (3)(ii)
  • the method comprises step (1), step (2)(i) and step (3)(ii); wherein the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B, or a partial sequence thereof, of the first extension product obtained in step (2)(i);
  • the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule, which comprises: the first strand containing the sequence of the first nucleic acid molecule to be labeled, and/or the second strand containing the sequence of the oligonucleotide probe.
  • the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i), or to the nucleotide sequence of a partial segment of that complementary sequence.
  • the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, and the 3' end is overhanging
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide II-I, the sequence of the bridging oligonucleotide II-II, the tag sequence B, the complementary sequence of the 3' end overhang sequence, and the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer.
  • Embodiment comprising step (1), step (2)(i) and step (3)(ii): first strand
  • the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i) or a partial sequence thereof (for example, a 3' end partial sequence), and the second region of the bridging oligonucleotide II-I has a free 3' end.
  • the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule comprising the first strand.
  • the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, and the 3' end is overhanging
  • the second region of the bridging oligonucleotide II-I is located at the 3' end of the bridging oligonucleotide II-I.
  • the first region of the bridging oligonucleotide II-I is located at the 5' end of the bridging oligonucleotide II-I.
  • said bridging oligonucleotide II-I does not contain said third region, and/or said bridging oligonucleotide II-II does not contain said third region.
  • the 5' end of the bridging oligonucleotide II-I contains a phosphorylation modification.
  • the 3' end of the bridging oligonucleotide II-I contains a free -OH.
  • in step (3)(ii), the bridging oligonucleotide II-II cannot initiate an extension reaction (for example, its 3' end is blocked), and/or the oligonucleotide probe cannot initiate an extension reaction (e.g., its 3' end is blocked).
  • the capture sequence A of the primer II-A is a random oligonucleotide sequence.
  • the first extension product described in step (2)(i)(b) of the method sequentially comprises from the 5' end to the 3' end: using the primer II-A as a reverse transcription primer
  • the first strand comprises from the 5' end to the 3' end: a cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, and the 3' end is overhanging
  • the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer II-A also contains a consensus sequence A, and an optional tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer II-A.
  • the first extension product in step (2)(i)(b) sequentially comprises from the 5' end to the 3' end: the consensus sequence A, optionally the tag sequence A, The cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, the overhang sequence at the 3' end, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the first strand comprises from the 5' end to the 3' end: the consensus sequence A, optionally the tag sequence A, the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, the 3' end overhang sequence, the complementary sequence of the tag sequence B, the complementary sequence of the consensus sequence B, optionally the bridging oligonucleotide II-II
  • step (3)(ii) between the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the oligonucleotide probe and the oligonucleotide probe
  • the ligation reaction that connects the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I, and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II, and the extension reaction described in step (3)(ii) can be carried out in any order, as long as the second nucleic acid molecule with the position marker can be obtained.
  • the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II can be connected, and the bridging oligonucleotide II-I can be used to initiate the extension reaction, obtaining the first strand.
  • the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
  • the first strand can be obtained in the following exemplary ways:
  • the polymerase used in the extension reaction preferably has strand displacement activity or 5' to 3' excision activity.
  • the extension reaction and the ligation reaction are performed in different systems, with the extension reaction performed first and the ligation reaction performed afterwards.
  • the first strand can be obtained by initiating an extension reaction with said bridging oligonucleotide II-I and then ligating the nucleic acid molecules that hybridize to the first and second regions of the same bridging oligonucleotide II-II.
  • the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
  • Embodiment comprising step (1), step (2)(i) and step (3)(ii): second strand
  • the second region of the bridging oligonucleotide II-II is capable of annealing to the complementary sequence of the consensus sequence B of the first extension product obtained in step (2)(i) or a partial sequence thereof, and the second region of the bridging oligonucleotide II-II has a 3' free end.
  • the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule comprising said second strand.
  • the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide II-I, the bridging oligonucleotide II-II sequence, the tag sequence B, the complementary sequence of the 3' end overhang sequence, and the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer.
  • the second region of the bridging oligonucleotide II-II is located at the 3' end of the bridging oligonucleotide II-II.
  • the first region of the bridging oligonucleotide II-II is located at the 5' end of the bridging oligonucleotide II-II.
  • said bridging oligonucleotide II-I does not contain said third region, and/or said bridging oligonucleotide II-II does not contain said third region.
  • the 5' end of the bridging oligonucleotide II-II contains a phosphorylation modification.
  • the 3' end of the bridging oligonucleotide II-II contains a free -OH.
  • in step (3)(ii), the bridging oligonucleotide II-I cannot initiate an extension reaction (for example, its 3' end is blocked), and/or the first extension product obtained in step (2)(i) cannot initiate an extension reaction (for example, its 3' end is blocked).
  • the capture sequence A of the primer II-A is a random oligonucleotide sequence.
  • the first extension product described in step (2)(i)(b) sequentially comprises, from the 5' end to the 3' end: the cDNA sequence complementary to the RNA formed by using the primer II-A as the reverse transcription primer, the overhang sequence at the 3' end, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide II-I, the bridging oligonucleotide II-II sequence, the tag sequence B, the complementary sequence of the 3' end overhang sequence, and the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer.
  • the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer II-A also contains a consensus sequence A, and an optional tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer II-A.
  • the first extension product in step (2)(i)(b) sequentially comprises, from the 5' end to the 3' end: the consensus sequence A, optionally the tag sequence A, the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, the overhang sequence at the 3' end, the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complementary sequence of the third region of the bridging oligonucleotide II-I, the bridging oligonucleotide II-II sequence, the tag sequence B, the complementary sequence of the 3' end overhang sequence, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A as a reverse transcription primer, optionally the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
  • in step (3)(ii), for the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the oligonucleotide probe, the ligation reaction (in which the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I are ligated, and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II are ligated) and the extension reaction described in step (3)(ii) can be carried out in any order, as long as the second nucleic acid molecule bearing the position marker is obtained.
  • the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I can be ligated, and the bridging oligonucleotide II-II can be used to initiate the extension reaction, thereby obtaining the second strand.
  • the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
  • the second strand can be obtained in the following exemplary ways:
  • the polymerase used in the extension reaction preferably has strand displacement activity or 5' to 3' excision activity.
  • the extension reaction and the ligation reaction are performed in different systems, with the extension reaction performed first and the ligation reaction performed afterwards.
  • the second strand can be obtained by initiating an extension reaction with said bridging oligonucleotide II-II and then ligating the nucleic acid molecules that hybridize to the first and second regions of the same bridging oligonucleotide II-I.
  • the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
  • An exemplary embodiment of the present application comprising step (1), step (2)(i) and step (3)(ii) is described in detail as follows:
  • An exemplary scheme for preparing a cDNA chain using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in FIG. 6 ):
  • RNA molecules in the permeabilized sample are reverse-transcribed using a reverse transcriptase (e.g., a reverse transcriptase with terminal transferase activity) and primer II-A to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
  • Various reverse transcriptases having terminal transferase activity can be used for the reverse transcription reaction.
  • the reverse transcriptase used does not have RNaseH activity.
  • the primer II-A comprises a poly(T) sequence and a consensus sequence A (CA).
  • a poly(T) sequence is located at the 3' end of the primer II-A to initiate reverse transcription.
  • the primer II-A comprises a random oligonucleotide sequence that can be used to capture RNA without a poly(A) tail.
  • the random oligonucleotide sequence is located at the 3' end of the primer II-A to initiate reverse transcription.
  • Primer II-B is annealed or hybridized to the cDNA strand, said primer II-B comprising a consensus sequence B (CB), a unique molecular identifier (UMI) sequence, and the complementary sequence of the 3' end overhang of the cDNA.
  • the consensus sequence B is located upstream of the UMI sequence (for example, the 5' end), and the sequence complementary to the 3' end overhang of the cDNA strand is located at the 3' end of the primer II-B.
  • the primer II-B may include GGG at its 3' end.
  • the nucleotides of the primer II-B can also be modified (for example, using a locked nucleic acid) to enhance the complementary pairing between the primer II-B and the 3' end overhang of the cDNA strand.
  • various nucleic acid polymerases (for example, a DNA polymerase or a reverse transcriptase) can be used to carry out the extension reaction, as long as they can extend the captured nucleic acid fragment (the reverse transcription product) using the sequence of the primer II-B or a partial sequence thereof as a template.
  • the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (reverse transcription product).
  • this step is performed simultaneously with step (1) (eg, in the same reaction system).
  • the method optionally further comprises step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
  • said method does not comprise said step (3).
  • the exemplary structure of the cDNA strand prepared by the above exemplary embodiment comprises: consensus sequence A, cDNA sequence, 3' end overhang sequence, complementary sequence of UMI sequence, and complementary sequence of consensus sequence B.
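  • As a minimal sketch of the cDNA strand structure described above, the following Python models the extension of the cDNA on primer II-B as plain string operations. All sequences, the 4-nt UMI length, and the function names (`make_primer_ii_b`, `template_switch_extend`) are illustrative assumptions and are not taken from the application; only the `CCC`/`GGG` pairing follows the examples given in the text.

```python
import random

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_primer_ii_b(consensus_b: str, umi_length: int, rng: random.Random):
    """Hypothetical primer II-B layout, 5'->3': CB, random UMI, GGG."""
    umi = "".join(rng.choice("ACGT") for _ in range(umi_length))
    return consensus_b + umi + "GGG", umi


def template_switch_extend(cdna: str, primer_ii_b: str, overhang: str = "CCC") -> str:
    """Extend the cDNA 3' end using primer II-B as the template."""
    anchor = reverse_complement(overhang)
    if not cdna.endswith(overhang) or not primer_ii_b.endswith(anchor):
        raise ValueError("overhang and primer 3' end do not pair")
    # The polymerase copies the primer upstream of the anchor, so the added
    # bases read c(UMI) followed by c(CB) in the 5'->3' direction.
    return cdna + reverse_complement(primer_ii_b[: -len(anchor)])


# Placeholder sequences (assumptions for illustration only).
CONSENSUS_A = "CAGTCAGG"
CDNA_BODY = "TTTTGGCA"  # poly(T)-primed cDNA, shortened
CONSENSUS_B = "GATTAC"

primer, umi = make_primer_ii_b(CONSENSUS_B, umi_length=4, rng=random.Random(0))
cdna_strand = template_switch_extend(CONSENSUS_A + CDNA_BODY + "CCC", primer)
print(cdna_strand)
```

  • The resulting string reads consensus sequence A, cDNA sequence, overhang, complement of the UMI, and complement of consensus sequence B, which is the order listed in the exemplary structure.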
  • the exemplary scheme includes the following steps (as shown in Figure 7):
  • a bridging oligonucleotide pair consisting of a bridging oligonucleotide II-I and a bridging oligonucleotide II-II is provided, wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently include a first region (P1) and a second region (P2), the first region being located upstream of the second region (e.g., toward the 5' end); wherein,
  • the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
  • the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B in the cDNA strand obtained in the above step 1 or a partial sequence thereof.
  • the bridging oligonucleotide II-I contains spacer nucleotides between the first region and the second region, such as 1-5 nt or 5-10 nt of spacer nucleotides; that is, the bridging oligonucleotide II-I sequence contains a third region located between the first region and the second region.
  • the first region and the second region in the bridging oligonucleotide II-I are directly adjacent, with no additional nucleotides between them; that is, the bridging oligonucleotide II-I sequence does not contain a third region located between the first and second regions.
  • the bridging oligonucleotide II-II contains spacer nucleotides between the first region and the second region, such as 1-5 nt or 5-10 nt of spacer nucleotides; that is, the bridging oligonucleotide II-II sequence contains a third region located between the first region and the second region.
  • the first region and the second region in the bridging oligonucleotide II-II are directly adjacent, with no additional nucleotides between them; that is, the bridging oligonucleotide II-II sequence does not contain a third region located between the first and second regions.
  • The bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the chip sequence are annealed or hybridized to the cDNA strand obtained in step 1 above; the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I are then ligated, and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II are ligated.
  • new nucleic acid molecules containing the chip sequence information (i.e., nucleic acid molecules labeled with the chip sequence) are formed.
  • the ligation process and the polymerization process may be performed in any order.
  • the exemplary structure of the new nucleic acid molecule containing chip sequence information formed by the above exemplary embodiment comprises: a nucleic acid strand containing, from the 5' end to the 3' end, the consensus sequence A, the cDNA sequence, the overhang sequence at the 3' end, the complementary sequence of the UMI sequence, the complementary sequence of the consensus sequence B, the bridging oligonucleotide II-I sequence, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1; and/or its complementary nucleic acid strand.
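  • The bridged ligation and extension described in this embodiment can be sketched in Python as follows, under simplifying assumptions: neither bridging oligonucleotide has a third region, each region anneals over its full length with perfect complementarity, and every sequence and name here (including `splint_ligate`) is a hypothetical placeholder rather than anything disclosed in the application.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def splint_ligate(upstream: str, downstream: str, splint: str) -> str:
    """Join the 3' end of `upstream` to the 5' end of `downstream`
    if the splint holds the two ends directly adjacent."""
    footprint = reverse_complement(splint)  # what the joined strand reads across the splint
    for j in range(1, len(footprint)):
        if upstream.endswith(footprint[:j]) and downstream.startswith(footprint[j:]):
            return upstream + downstream
    raise ValueError("the splint does not hold the two ends adjacent")


# Placeholder sequences (assumptions for illustration only).
X1, Y, X2 = "TTGACC", "ACGGTA", "CTTGAGCA"  # chip probe: X1, Y, X2 (5'->3')
CONSENSUS_B = "GATTAC"
# cDNA strand from step 1: CA, cDNA, overhang, c(UMI), c(CB).
cdna_strand = "CAGTCAGG" + "TTTTGGCA" + "CCC" + "CGTT" + reverse_complement(CONSENSUS_B)

P1 = "TCGGAT"
bridge_ii_i = P1 + reverse_complement(X2)            # P2 of II-I anneals to X2
bridge_ii_ii = reverse_complement(P1) + CONSENSUS_B  # P1 pairs with II-I P1, P2 with c(CB)

# Ligation: II-II acts as the splint joining the cDNA 3' end to the II-I 5' end.
ligated = splint_ligate(cdna_strand, bridge_ii_i, bridge_ii_ii)

# Extension: the 3' end of II-I (annealed to X2) copies Y and X1 of the probe.
labeled = ligated + reverse_complement(Y) + reverse_complement(X1)
print(labeled)
```

  • In this sketch the labeled strand ends with the bridging oligonucleotide II-I sequence followed by the complements of tag sequence Y and consensus sequence X1, which matches the segment order given in the exemplary structure.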
  • the method comprises step (1), step (2)(ii) and step (3).
  • the first extension product comprises, from the 5' end to the 3' end: the consensus sequence A, the cDNA sequence complementary to the RNA formed by using the primer II-A' as the reverse transcription primer, the overhang sequence at the 3' end, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • the extension primer is the primer II-B' or a primer B", wherein the primer B" can anneal to the complementary sequence of the consensus sequence B or a partial sequence thereof and can initiate the extension reaction.
  • the second extension product comprises, from the 5' end to the 3' end: the sequence complementary to the cDNA sequence, formed by extension of the extension primer, and the complementary sequence of the consensus sequence A.
  • Embodiments comprising step (1), step (2)(ii) and step (3)(i)
  • the method comprises step (1), step (2)(ii) and step (3)(i); wherein the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence A or a partial sequence thereof;
  • the extension product obtained in step (3)(i) is a labeled nucleic acid molecule, which comprises: the first strand containing the first nucleic acid molecule sequence to be labeled, and/or the second strand containing the oligonucleotide probe sequence.
  • the entire consensus sequence X2 can anneal to the complementary sequence of the consensus sequence A or to a partial segment of that complementary sequence; alternatively, a partial segment of the consensus sequence X2 can anneal to the complementary sequence of the consensus sequence A or to a partial segment of that complementary sequence.
  • the first strand comprises, from the 5' end to the 3' end: the sequence of the first nucleic acid molecule to be labeled, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
  • the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, and the cDNA sequence complementary to the sequence of the first nucleic acid molecule to be labeled.
  • Embodiment comprising step (1), step (2)(ii) and step (3)(i): first strand
  • the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence A or a partial sequence thereof (for example, a partial sequence at the 3' end); the extension product obtained in step (3)(i) is a labeled nucleic acid molecule, which includes a first strand containing the sequence of the first nucleic acid molecule to be labeled.
  • in step (3)(i), the oligonucleotide probe cannot initiate an extension reaction (e.g., its 3' end is blocked).
  • the capture sequence A of the primer II-A' is a random oligonucleotide sequence.
  • the extension primer is the primer II-B'.
  • the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, and the complementary sequence of the consensus sequence A.
  • the first strand comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, the complementary sequence of the consensus sequence A, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
  • the first strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different complementary sequences of the capture sequence A as the UMI.
  • the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer II-A' also contains a tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer II-A'.
  • the extension primer is the primer II-B'.
  • the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
  • the first strand comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, the complementary sequence of the tag sequence A, the complementary sequence of the consensus sequence A, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
  • the first strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different complementary sequences of the tag sequence A as the UMI.
  • Embodiment comprising step (1), step (2)(ii) and step (3)(i): second strand
  • the consensus sequence X2 or a partial sequence thereof can anneal to the complementary sequence of the consensus sequence A or a partial sequence thereof; the extension product obtained in step (3)(i) is a labeled nucleic acid molecule comprising a second strand containing the oligonucleotide probe sequence.
  • the second extension product obtained in step (2)(ii) cannot initiate an extension reaction (eg, the 3' end is blocked).
  • the capture sequence A of the primer II-A' is a random oligonucleotide sequence.
  • the extension primer is the primer II-B'.
  • the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, and the complementary sequence of the consensus sequence A.
  • the second strand comprises, from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the cDNA sequence complementary to the sequence of the first nucleic acid molecule to be labeled, the 3' end overhang sequence, optionally the complementary sequence of the tag sequence B, and the complementary sequence of the consensus sequence B.
  • in step (3), the second strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different capture sequences A as UMIs.
  • the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer II-A' also contains a tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer II-A'.
  • the extension primer is the primer II-B'.
  • the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, the tag sequence A, and the to-be
  • in step (3), the second strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences A as UMIs.
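  • To illustrate how different tag sequences A can act as UMIs for second strands sharing one position tag, the following Python sketch counts distinct UMIs per tag sequence Y. It assumes, purely for illustration, that each read starts with the layout X1, Y, X2, tag sequence A, that all segments have fixed known lengths, and that matching is exact; the function name `count_molecules` and all sequences are hypothetical.

```python
from collections import defaultdict


def count_molecules(reads, x1, x2, y_len, umi_len):
    """Count distinct UMIs per position tag Y for reads laid out
    5'->3' as X1 | Y | X2 | UMI | remainder (assumed layout)."""
    y_start = len(x1)
    x2_start = y_start + y_len
    umi_start = x2_start + len(x2)
    umis_by_tag = defaultdict(set)
    for read in reads:
        # Skip reads whose fixed segments do not match the assumed layout.
        if not read.startswith(x1) or read[x2_start:umi_start] != x2:
            continue
        if len(read) < umi_start + umi_len:
            continue
        tag = read[y_start:x2_start]
        umi = read[umi_start:umi_start + umi_len]
        umis_by_tag[tag].add(umi)
    return {tag: len(umis) for tag, umis in umis_by_tag.items()}


# Placeholder reads (assumptions for illustration only).
X1, X2 = "TTGACC", "CAGTCAGG"
reads = [
    X1 + "ACGGTA" + X2 + "AACG" + "TTTTGGCA",
    X1 + "ACGGTA" + X2 + "AACG" + "TTTTGGCA",  # same UMI: counted once
    X1 + "ACGGTA" + X2 + "GGTA" + "TTTTGGCA",
    X1 + "CCATGA" + X2 + "TTAC" + "TTTTCCGA",
    "GGGGGG" + "ACGGTA" + X2 + "AACG" + "TTTT",  # wrong X1: skipped
]
counts = count_molecules(reads, X1, X2, y_len=6, umi_len=4)
print(counts)
```

  • Reads that share both the tag sequence Y and the UMI are collapsed into a single molecule, so the first tag is counted twice and the second once in this example.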
  • An exemplary embodiment of the present application comprising step (1), step (2)(ii) and step (3)(i) is described in detail as follows:
  • An exemplary scheme using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in Figure 9):
  • RNA molecules (for example, mRNA molecules) in the permeabilized sample are reverse-transcribed using a reverse transcriptase (for example, a reverse transcriptase with terminal transferase activity) and primer II-A' to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
  • the reverse transcriptase used does not have RNaseH activity.
  • the primer II-A' comprises a poly(T) sequence, a UMI sequence, and a consensus sequence A (CA).
  • a poly(T) sequence is located at the 3' end of the primer II-A' to initiate reverse transcription, and the consensus sequence A is located upstream (eg, 5' end) of the UMI sequence.
  • the primer II-A' comprises a random oligonucleotide sequence and a consensus sequence A, and can be used to capture RNA without a poly(A) tail.
  • the random oligonucleotide sequence is located at the 3' end of the primer II-A' to initiate reverse transcription.
  • The cDNA strand is annealed or hybridized with primer II-B', said primer II-B' comprising a consensus sequence B (CB) and the complementary sequence of the 3' end overhang of said cDNA.
  • the nucleic acid fragment that hybridizes or anneals to the primer II-B' can be extended using the consensus sequence B as a template under the action of a nucleic acid polymerase, so that the complementary sequence of the consensus sequence B (c(CB)) is added at the 3' end of the cDNA strand, thereby generating a nucleic acid molecule carrying the complementary sequence of said consensus sequence B at its 3' end.
  • the sequence complementary to the 3' end overhang of the cDNA strand is located at the 3' end of the primer II-B'.
  • the primer II-B' may include GGG at its 3' end.
  • the nucleotides of the primer II-B' can also be modified (for example, using a locked nucleic acid) to enhance the complementary pairing between the primer II-B' and the 3' end overhang of the cDNA strand.
  • various nucleic acid polymerases (for example, a DNA polymerase or a reverse transcriptase) can be used to carry out the extension reaction, as long as they can extend the captured nucleic acid fragment (the reverse transcription product) using the sequence of the primer II-B' or a partial sequence thereof as a template.
  • the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (reverse transcription product).
  • this step is performed simultaneously with step (1) (eg, in the same reaction system).
  • the method optionally further comprises step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
  • said method does not comprise said step (3).
  • using an extension primer, an extension reaction is performed with the cDNA strand obtained in the previous step as a template to obtain an extension product;
  • the extension primer is the primer II-B' or a primer B", wherein the primer B" can anneal to the above-mentioned consensus sequence B or a partial sequence thereof and can initiate an extension reaction.
  • the exemplary structure of the complementary strand of the cDNA strand prepared by the above exemplary embodiment comprises: the consensus sequence B, the complementary sequence of the 3' end overhang, the complementary sequence of the cDNA sequence, the complementary sequence of the UMI sequence, and the complementary sequence of the consensus sequence A.
  • the consensus sequence X2 of the chip sequence, or a partial sequence thereof, can anneal to the complementary sequence of the consensus sequence A, or a partial sequence thereof, in the complementary strand of the cDNA strand obtained in step 1 above.
  • the complementary strand of the cDNA strand is annealed or hybridized with the chip sequence and, under the action of a polymerase, a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence) is formed.
  • the exemplary structure of the new nucleic acid molecule containing chip sequence information formed by the above exemplary embodiment comprises: a nucleic acid strand containing, from the 5' end to the 3' end, the consensus sequence B, the complementary sequence of the 3' end overhang, the complementary sequence of the cDNA sequence, the complementary sequence of the UMI sequence, the complementary sequence of the consensus sequence A, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1; and/or its complementary nucleic acid strand.
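  • The annealing and polymerase extension between the complementary strand of the cDNA and the chip sequence can be sketched in Python as below. This is a toy model under stated assumptions: annealing means exact reverse complementarity of at least `min_overlap` bases at the 3' end, the consensus sequence X2 is taken to be identical to consensus sequence A, and all sequences and the helper name `extend_on_template` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def extend_on_template(strand: str, template: str, min_overlap: int = 6) -> str:
    """Extend the 3' end of `strand` by copying `template` (both 5'->3')."""
    for k in range(len(strand), min_overlap - 1, -1):
        site = reverse_complement(strand[-k:])  # where the 3' end would anneal
        idx = template.find(site)
        if idx != -1:
            # Template bases 5' of the annealing site are copied onto the strand.
            return strand + reverse_complement(template[:idx])
    raise ValueError("3' end of the strand does not anneal to the template")


# Placeholder sequences (assumptions for illustration only).
X1, Y = "TTGACC", "ACGGTA"
CONSENSUS_A = "CAGTCAGG"
probe = X1 + Y + CONSENSUS_A  # chip probe with X2 assumed equal to CA

# cDNA strand: CA, UMI, cDNA, overhang, c(CB); its complementary strand is
# the extension product that carries c(CA) at its 3' end.
cdna_strand = CONSENSUS_A + "AACG" + "TTTTGGCA" + "CCC" + reverse_complement("GATTAC")
complement_strand = reverse_complement(cdna_strand)

first_strand = extend_on_template(complement_strand, probe)
second_strand = extend_on_template(probe, complement_strand)
print(first_strand)
print(second_strand)
```

  • In this model the first strand gains the complements of tag sequence Y and consensus sequence X1 at its 3' end, and the second strand, extended from the probe, is its exact reverse complement.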
  • Embodiments comprising step (1), step (2)(ii) and step (3)(ii)
  • the method comprises step (1), step (2)(ii) and step (3)(ii); wherein the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii) or a partial sequence thereof; the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule, which comprises: the first strand containing the first nucleic acid molecule sequence to be labeled, and/or the second strand containing the oligonucleotide probe sequence.
  • the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii), or to a partial segment of that complementary sequence.
  • the first strand comprises from the 5' end to the 3' end: the first nucleic acid molecule sequence to be labeled, optionally the third region of the bridging oligonucleotide II-II
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the bridging oligonucleotide
  • Embodiment comprising step (1), step (2)(ii) and step (3)(ii): first strand
  • the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii), or to a partial sequence thereof at the 3' end, and the second region of the bridging oligonucleotide II-I has a 3' free end.
  • the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule comprising the first strand.
  • the second region of the bridging oligonucleotide II-I is located at the 3' end of the bridging oligonucleotide II-I.
  • the first region of the bridging oligonucleotide II-I is located at the 5' end of the bridging oligonucleotide II-I.
  • said bridging oligonucleotide II-I does not contain said third region, and/or said bridging oligonucleotide II-II does not contain said third region.
  • the 5' end of the bridging oligonucleotide II-I contains a phosphorylation modification.
  • the 3' end of the bridging oligonucleotide II-I contains a free -OH.
  • in step (3)(ii), the bridging oligonucleotide II-II cannot initiate an extension reaction (for example, its 3' end is blocked), and/or the oligonucleotide probe cannot initiate an extension reaction (e.g., its 3' end is blocked).
  • the capture sequence A of the primer II-A' is a random oligonucleotide sequence.
  • the extension primer is the primer II-B'.
  • the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, and the complementary sequence of the consensus sequence A.
  • the first strand comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, the complementary sequence of the consensus sequence A, optionally the complementary sequence of the third region of the bridging oligonucleotide II-II, the bridging oligonucleotide II-I sequence, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1.
  • the first strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different complementary sequences of the capture sequence A as the UMI.
  • the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer II-A' also contains a tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer II-A'.
  • the extension primer is the primer II-B'.
  • the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
  • the first strand comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the overhang sequence at the 3' end, the complementary sequence of the cDNA sequence complementary to the RNA formed by using the primer II-A' as a reverse transcription primer, the complementary sequence of the tag sequence A, the complementary sequence of the consensus sequence A, and optionally the bridging oligonucleotide
  • the first strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different complementary sequences of the tag sequence A as the UMI.
  • in step (3)(ii), with respect to the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the oligonucleotide probe, the ligation reaction (in which the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I are connected, and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II are connected) and the extension reaction described in step (3)(ii) can be carried out in any order, as long as the second nucleic acid molecule carrying the position marker can be obtained.
  • the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II can be ligated, and the bridging oligonucleotide II-I can be used to initiate the extension reaction, thereby obtaining the first strand.
  • the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
  • the first strand can be obtained in the following exemplary ways:
  • the polymerase used in the extension reaction preferably has strand displacement activity or 5' to 3' excision activity.
  • the extension reaction and the ligation reaction are performed in different systems; the extension reaction is performed first, and then the ligation reaction is performed.
  • the first strand can be obtained by initiating an extension reaction with the bridging oligonucleotide II-I and then ligating the nucleic acid molecules that hybridize to the first and second regions of the same bridging oligonucleotide II-II.
  • the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
  • Embodiment comprising step (1), step (2)(ii) and step (3)(ii): second strand
  • the second region of the bridging oligonucleotide II-II is capable of annealing to the complementary sequence of the consensus sequence A of the second extension product obtained in step (2)(ii) or a partial sequence thereof, and the second region of the bridging oligonucleotide II-II has a 3' free end.
  • the reaction product obtained in step (3)(ii) is a labeled nucleic acid molecule comprising said second strand.
  • the second region of the bridging oligonucleotide II-II is located at the 3' end of the bridging oligonucleotide II-II.
  • the first region of the bridging oligonucleotide II-II is located at the 5' end of the bridging oligonucleotide II-II.
  • said bridging oligonucleotide II-I does not contain said third region, and/or said bridging oligonucleotide II-II does not contain said third region.
  • the 5' end of the bridging oligonucleotide II-II contains a phosphorylation modification.
  • the 3' end of the bridging oligonucleotide II-II contains a free -OH.
  • in step (3)(ii), the bridging oligonucleotide II-I cannot initiate an extension reaction (for example, its 3' end is blocked), and/or the second extension product obtained in step (2)(ii) cannot initiate an extension reaction (e.g., its 3' end is blocked).
  • the capture sequence A of the primer II-A' is a random oligonucleotide sequence.
  • the extension primer is the primer II-B'.
  • the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3'-end overhang sequence, the complementary sequence of the cDNA sequence (the cDNA being complementary to the RNA and formed using the primer II-A' as the reverse transcription primer), and the complementary sequence of the consensus sequence A.
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the bridging oligonucleotide
  • step (3) the second strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different capture sequences A as UMIs.
  • the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid.
  • the primer II-A' also contains a tag sequence A, such as a random oligonucleotide sequence.
  • the capture sequence A is located at the 3' end of the primer II-A'.
  • the extension primer is the primer II-B'.
  • the second extension product comprises, from the 5' end to the 3' end: the consensus sequence B, optionally the tag sequence B, the complementary sequence of the 3'-end overhang sequence, the complementary sequence of the cDNA sequence (the cDNA being complementary to the RNA and formed using the primer II-A' as the reverse transcription primer), the complementary sequence of the tag sequence A, and the complementary sequence of the consensus sequence A.
  • the second strand comprises from the 5' end to the 3' end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the bridging oligonucleotide
  • step (3) the second strands derived from each copy of the oligonucleotide probes coupled to the same microdot have different tag sequences A as UMIs.
  • in step (3)(ii), with respect to the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the oligonucleotide probe, the ligation reaction (in which the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I are connected, and/or the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II are connected) and the extension reaction described in step (3)(ii) can be carried out in any order, as long as the second nucleic acid molecule carrying the position marker can be obtained.
  • the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I can be ligated, and the bridging oligonucleotide II-II can be used to initiate the extension reaction, thereby obtaining the second strand.
  • the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
  • the second strand can be obtained in the following exemplary ways:
  • the polymerase used in the extension reaction preferably has strand displacement activity or 5' to 3' excision activity.
  • the extension reaction and the ligation reaction are performed in different systems; the extension reaction is performed first, and then the ligation reaction is performed.
  • the second strand can be obtained by initiating an extension reaction with the bridging oligonucleotide II-II and then ligating the nucleic acid molecules that hybridize to the first and second regions of the same bridging oligonucleotide II-I.
  • the polymerase used in the extension reaction preferably does not have strand displacement activity or 5' to 3' excision activity.
  • An exemplary embodiment of the present application comprising step (1), step (2)(ii) and step (3)(ii) is described in detail as follows:
  • An exemplary scheme for preparing the complementary strand of a cDNA strand using RNA (such as mRNA) in a sample as a template comprises the following steps (as shown in FIG. 9):
  • RNA molecules (for example, mRNA molecules) in the permeabilized sample are reverse-transcribed using a reverse transcriptase (for example, a reverse transcriptase with terminal transfer activity) and the primer II-A' to generate cDNA, and an overhang (e.g., an overhang comprising 3 cytosine nucleotides) is added to the 3' end of the cDNA.
  • the reverse transcriptase used does not have RNaseH activity.
  • the primer II-A' comprises a poly(T) sequence, a UMI sequence, and a consensus sequence A (CA).
  • a poly(T) sequence is located at the 3' end of the primer II-A' to initiate reverse transcription, and the consensus sequence A is located upstream (eg, 5' end) of the UMI sequence.
  • the primer II-A' comprises a random oligonucleotide sequence and a consensus sequence A, which can be used to capture RNA without a poly(A) tail.
  • the random oligonucleotide sequence is located at the 3' end of the primer II-A' to initiate reverse transcription.
  • the primer II-B' comprises a consensus sequence B (CB) and a sequence complementary to the 3'-end overhang of the cDNA.
  • the nucleic acid fragment that hybridizes or anneals to the primer II-B' can be extended using the consensus sequence B as a template under the action of a nucleic acid polymerase, so that a complementary sequence of the consensus sequence B (c(CB)) is added to the 3' end of the cDNA strand, thereby generating a nucleic acid molecule carrying the complementary sequence of the consensus sequence B at its 3' end.
  • the sequence complementary to the 3'-end overhang of the cDNA strand is located at the 3' end of the primer II-B'.
  • the primer II-B' may include GGG at its 3' end.
  • the nucleotides of the primer II-B' can also be modified (for example, using a locked nucleic acid) to enhance the complementary pairing between the primer II-B' and the 3' end overhang of the cDNA strand.
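The annealing and extension just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequences, the names `CB`, `anneals_at_3prime` and `template_switch_extend`, and the 3-nt CCC/GGG pairing are hypothetical examples rather than sequences from the application, and locked nucleic acid modifications are not modeled.

```python
# Illustrative sketch only; all sequences and function names are hypothetical.
_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMP)[::-1]


def anneals_at_3prime(cdna, primer, n):
    """True if the last n bases of the primer pair with the last n bases of the cDNA."""
    if n <= 0 or len(cdna) < n or len(primer) < n:
        return False
    return primer[-n:] == revcomp(cdna[-n:])


def template_switch_extend(cdna, primer, n):
    """Extend the cDNA 3' end, using the unpaired 5' part of the primer as template."""
    if not anneals_at_3prime(cdna, primer, n):
        raise ValueError("primer 3' end is not complementary to the cDNA overhang")
    # The newly synthesized stretch is the complement of the primer's 5' part, c(CB).
    return cdna + revcomp(primer[:-n])


CB = "CCTAGGTTCA"              # hypothetical consensus sequence B
primer_ii_b = CB + "GGG"       # consensus sequence B followed by a 3' GGG
cdna = "TTTTTTTTGCATCCC"       # hypothetical cDNA ending in a CCC overhang
extended = template_switch_extend(cdna, primer_ii_b, 3)
```

In this toy model the extended cDNA ends with the reverse complement of `CB`, which corresponds to the c(CB) segment named in the text.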
  • various nucleic acid polymerases (for example, DNA polymerase or reverse transcriptase) can be used to carry out the extension reaction, as long as they can extend the captured nucleic acid fragment (the reverse transcription product) using the sequence of the primer II-B' or a partial sequence thereof as a template.
  • the same reverse transcriptase as in the previous reverse transcription step can be used to extend the captured nucleic acid fragment (the reverse transcription product).
  • this step is performed simultaneously with step (1) (eg, in the same reaction system).
  • the method optionally further comprises step (3): adding RNaseH to digest the RNA strand in the RNA/cDNA hybrid duplex to form a cDNA single strand.
  • said method does not comprise said step (3).
  • using an extension primer, an extension reaction is performed with the cDNA strand obtained in the previous step as a template to obtain an extension product;
  • the extension primer is the primer II-B' or a primer B", wherein the primer B" can anneal to the above-mentioned consensus sequence B or a partial sequence thereof and can initiate an extension reaction.
  • an exemplary structure of the complementary strand of the cDNA strand prepared by the above exemplary embodiment comprises: the consensus sequence B, the complementary sequence of the 3'-end overhang, the complementary sequence of the cDNA sequence, the complementary sequence of the UMI sequence, and the complementary sequence of the consensus sequence A.
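As a rough check on the segment order listed above, the following Python sketch assembles a toy cDNA strand and takes its reverse complement. All sequences, lengths and function names here are hypothetical placeholders chosen for illustration; the transcript is written in DNA letters and the overhang is assumed to be CCC.

```python
# Illustrative sketch only; sequences and names are hypothetical.
_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMP)[::-1]


def cdna_strand(ca, umi, poly_t_len, transcript, cb, overhang="CCC"):
    """cDNA 5'->3': primer II-A' (CA + UMI + poly(T)), copied transcript, overhang, c(CB)."""
    primer_ii_a = ca + umi + "T" * poly_t_len
    return primer_ii_a + revcomp(transcript) + overhang + revcomp(cb)


def complementary_strand(cdna):
    """Complementary strand of the cDNA strand, written 5'->3'."""
    return revcomp(cdna)


CA = "GATTACAGGC"            # hypothetical consensus sequence A
UMI = "ACGTAC"               # hypothetical UMI
CB = "CCTAGGTTCA"            # hypothetical consensus sequence B
TRANSCRIPT = "ATGGCCTTAGC"   # hypothetical mRNA body upstream of the poly(A) tail

strand = complementary_strand(cdna_strand(CA, UMI, 8, TRANSCRIPT, CB))
expected = CB + "GGG" + TRANSCRIPT + "A" * 8 + revcomp(UMI) + revcomp(CA)
```

Under these assumptions `strand` equals `expected`, that is, it reads consensus sequence B, the complement of the overhang, the complement of the cDNA sequence, the complement of the UMI and the complement of consensus sequence A, in that order.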
  • a bridging oligonucleotide pair consisting of a bridging oligonucleotide II-I and a bridging oligonucleotide II-II is provided, wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently include: a first region (P1) and a second region (P2), the first region being located upstream of the second region (e.g., at the 5' end); wherein,
  • the first region of the bridging oligonucleotide II-I can anneal with the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal with the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
  • the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A or its partial sequence in the complementary strand of the cDNA strand obtained in step 1 above.
  • the bridging oligonucleotide II-I contains spacer nucleotides between the first region and the second region, such as 1-5nt or 5-10nt spacer nucleotides, that is, the bridging oligonucleotide II-I sequence contains a third region located between the first region and the second region.
  • the first region and the second region in the bridging oligonucleotide II-I are adjacently connected without redundant nucleotides, that is, the bridging oligonucleotide II-I sequence does not contain a third region located between the first and second regions.
  • the bridging oligonucleotide II-II contains spacer nucleotides between the first region and the second region, such as 1-5nt or 5-10nt spacer nucleotides, that is, the bridging oligonucleotide II-II sequence contains a third region located between the first region and the second region.
  • the first region and the second region in the bridging oligonucleotide II-II are adjacently connected without redundant nucleotides, that is, the bridging oligonucleotide II-II sequence does not contain a third region located between the first and second regions.
  • the bridging oligonucleotide II-I, the bridging oligonucleotide II-II and the chip sequence are annealed or hybridized to the complementary strand of the cDNA strand obtained in step 1 above; DNA ligase is then used to ligate the nucleic acid molecules hybridized to the first and second regions of the same bridging oligonucleotide II-I, and/or to ligate the nucleic acid molecules hybridized to the first and second regions of the same bridging oligonucleotide II-II.
  • an exemplary structure of the new nucleic acid molecule containing chip sequence information formed by the above exemplary embodiment comprises a nucleic acid strand containing, from the 5' end to the 3' end, the consensus sequence B, the complementary sequence of the 3'-end overhang, the complementary sequence of the cDNA sequence, the complementary sequence of the UMI sequence, the complementary sequence of the consensus sequence A, the bridging oligonucleotide II-I sequence, the complementary sequence of the tag sequence Y, and the complementary sequence of the consensus sequence X1, and/or its complementary nucleic acid strand.
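Because the labeled strand places the UMI complement and the tag sequence Y complement on either side of known constant segments, both can in principle be read back by position. The Python sketch below illustrates that idea under simplifying assumptions that are not taken from the application: exact sequence matches, fixed UMI and tag lengths, and a bridging oligonucleotide II-I without a third region. Every sequence and the function `parse_labeled_strand` are hypothetical.

```python
# Illustrative sketch only; sequences, lengths and names are hypothetical.
_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMP)[::-1]


def parse_labeled_strand(strand, ca, bridge_ii_i, umi_len, tag_y_len):
    """Return (UMI, tag Y) recovered from a labeled strand, or None if it cannot be parsed."""
    anchor = revcomp(ca) + bridge_ii_i          # c(CA) directly followed by bridging oligo II-I
    start = strand.find(anchor)
    if start < umi_len:                          # anchor missing (-1) or too little upstream sequence
        return None
    umi = revcomp(strand[start - umi_len:start])
    tag_start = start + len(anchor)
    tag_rc = strand[tag_start:tag_start + tag_y_len]
    if len(tag_rc) < tag_y_len:
        return None
    return umi, revcomp(tag_rc)


CA, UMI, TAG_Y, X1 = "GATTACAGGC", "ACGTAC", "TTGACCGA", "AGCTTGCA"
BRIDGE = "CATGCATGGTCA"                          # hypothetical bridging oligonucleotide II-I
labeled = ("CCTAGGTTCA" + "GGG" + "ATGGCCTTAGC" + "A" * 8 + revcomp(UMI)
           + revcomp(CA) + BRIDGE + revcomp(TAG_Y) + revcomp(X1))
result = parse_labeled_strand(labeled, CA, BRIDGE, len(UMI), len(TAG_Y))
```

A real pipeline would need to tolerate sequencing errors and optional segments; this sketch only shows how the segment order makes positional extraction possible.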
  • in step (2)(i)(b), the cDNA strand anneals to the primer II-B via its 3'-end overhang, and, under the action of a nucleic acid polymerase (e.g., DNA polymerase or reverse transcriptase), the cDNA strand is extended using the primer II-B as a template to generate the first extension product.
  • in step (2)(ii)(b), the cDNA strand anneals to the primer II-B' via its 3'-end overhang, and, under the action of a nucleic acid polymerase (e.g., DNA polymerase or reverse transcriptase), the cDNA strand is extended using the primer II-B' as a template to generate the first extension product.
  • the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10 or more nucleotides. In certain embodiments, the 3' terminal overhang is a 3' terminal overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
  • in step (2), the pretreatment is performed within cells.
  • before or after the one or more cells are contacted with the solid support of the nucleic acid array, the one or more RNAs (for example, mRNA) are pretreated to generate a first population of nucleic acid molecules.
  • the cells are permeabilized prior to said pretreatment.
  • in step (2), the pretreatment is performed extracellularly.
  • the one or more RNAs (for example, mRNA) are pretreated to generate a first population of nucleic acid molecules.
  • prior to the pretreatment, the method further comprises releasing intracellular RNA (e.g., mRNA); preferably, the RNA (e.g., mRNA) within a cell is released by cell permeabilization or cell lysis treatment.
  • performing reverse transcription in step (2) comprises using a reverse transcriptase.
  • the reverse transcriptase has terminal transfer activity.
  • the reverse transcriptase can use RNA (for example, mRNA) as a template to synthesize a cDNA chain, and add an overhang at the 3' end of the cDNA chain.
  • the reverse transcriptase is capable of adding an overhang of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10 or more nucleotides.
  • the reverse transcriptase is capable of adding an overhang of 2-5 cytosine nucleotides (eg, a CCC overhang) at the 3' end of the cDNA strand.
  • the reverse transcriptase is selected from M-MLV reverse transcriptase, HIV-1 reverse transcriptase, AMV reverse transcriptase, telomerase reverse transcriptase, and variants, modified products and derivatives having the reverse transcription activity of the above-mentioned reverse transcriptases.
  • steps (2) and (3) have one or more features selected from the following:
  • the primer I-A, primer II-A, primer I-A', primer II-A', primer I-B, primer II-B, primer II-B', bridging oligonucleotide I, bridging oligonucleotide II-I and bridging oligonucleotide II-II each independently comprise or consist of naturally occurring nucleotides (e.g., deoxyribonucleotides or ribonucleotides), modified nucleotides, non-natural nucleotides, or any combination thereof; in certain embodiments, the primer I-A, primer II-A, primer I-A' and primer II-A' are capable of initiating an extension reaction;
  • the primer I-B, primer II-B and primer II-B' each independently comprise a modified nucleotide (such as a locked nucleic acid); in some embodiments, the 3' ends of the primer I-B, primer II-B and primer II-B' each independently comprise one or more modified nucleotides (e.g., locked nucleic acids);
  • the tag sequence A and the tag sequence B each independently have a length of 5-200nt (e.g., 5-30nt, 6-15nt);
  • the consensus sequence A and the consensus sequence B each independently have a length of 10-200nt (such as 10-100nt, 20-100nt, 25-100nt, 5-10nt, 10-15nt, 15-20nt, 20-50nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt);
  • the primer I-A, primer II-A, primer I-A', primer II-A', primer I-B, primer II-B and primer II-B' each independently have a length of 4-200nt (such as 5-200nt, 15-230nt, 26-115nt, 10-130nt, 10-20nt, 20-50nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt);
  • the bridging oligonucleotide I, the bridging oligonucleotide II-I, the first region and the second region of the bridging oligonucleotide II-II each independently have 3-100nt (such as 20-100nt, 3-10nt, 10-15nt, 15-20nt, 20-70nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt) length;
  • the third regions of the bridging oligonucleotide I, the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently have a length of 0-50nt (such as 0nt, 0-10nt, 10-15nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt);
  • the bridging oligonucleotide I, the bridging oligonucleotide II-I, and the bridging oligonucleotide II-II each independently have 6-200nt (such as 20-100nt, 20-70nt, 6-15nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt) length;
  • the poly(T) sequence includes at least 5, or at least 20 (eg, 6-100, 10-50) deoxythymidine residues;
  • the random oligonucleotide sequence has a length of 5-200nt (e.g., 5nt, 5-30nt, 6-15nt).
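The broadest length ranges listed above lend themselves to a simple design check. The sketch below is one hedged way to encode them in Python; the dictionary keys, the function `length_violations` and the example design are hypothetical, and only the outermost ranges stated in the text (not the narrower preferred sub-ranges) are used.

```python
# Illustrative sketch only; key names and the example design are hypothetical.
LENGTH_RANGES_NT = {
    "tag_sequence": (5, 200),              # tag sequence A or B
    "consensus_sequence": (10, 200),       # consensus sequence A or B
    "primer": (4, 200),                    # primers I-A ... II-B'
    "bridging_region_1_or_2": (3, 100),    # first or second region of a bridging oligo
    "bridging_third_region": (0, 50),      # optional spacer region
    "bridging_oligonucleotide": (6, 200),  # whole bridging oligo
    "random_oligonucleotide": (5, 200),
}
POLY_T_MIN = 5                             # "at least 5" deoxythymidine residues


def length_violations(design):
    """Return a list of messages for components whose length falls outside the stated range."""
    problems = []
    for name, length in design.items():
        if name == "poly_t":
            if length < POLY_T_MIN:
                problems.append(f"poly_t: {length} residues is below the minimum of {POLY_T_MIN}")
            continue
        low, high = LENGTH_RANGES_NT[name]
        if not low <= length <= high:
            problems.append(f"{name}: {length} nt is outside {low}-{high} nt")
    return problems


example = {"tag_sequence": 10, "consensus_sequence": 8, "primer": 45,
           "bridging_third_region": 0, "poly_t": 20}
issues = length_violations(example)
```

For the example design, only the 8-nt consensus sequence is flagged, since it falls below the 10-nt lower bound.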
  • the method further comprises: (4) recovering and purifying the second population of nucleic acid molecules.
  • the obtained second population of nucleic acid molecules and/or complements thereof are used for constructing a transcriptome library or for transcriptome sequencing.
  • the oligonucleotide probes described in step (1) have one or more characteristics selected from the following:
  • the consensus sequence X1, tag sequence Y and consensus sequence X2 each independently comprise or consist of naturally occurring nucleotides (such as deoxyribonucleotides or ribonucleotides), modified nucleotides, non-natural nucleotides (such as peptide nucleic acid (PNA) or locked nucleic acid), or any combination thereof;
  • the consensus sequence X1, the tag sequence Y and the consensus sequence X2 each independently have a length of 2-200nt (such as 10-200nt, 25-100nt, 10-30nt, 10-100nt, 5-10nt, 10-15nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt).
  • the nucleic acid array in step (1) is provided by steps comprising:
  • each kind of vector sequence comprises at least one copy (for example, multiple copies) of the vector sequence, the vector sequence comprising, in the 5' to 3' direction: the complementary sequence of the consensus sequence X2, the complementary sequence of the tag sequence Y, and the fixed sequence; wherein the complementary sequences of the tag sequence Y of the different kinds of vector sequences differ from each other;
  • the extension product comprises or consists of: the consensus sequence X1, the tag sequence Y and the consensus sequence X2;
  • steps (3) and (4) are performed in any order;
  • the fixed sequence of the carrier sequence also includes a cleavage site, and the cleavage can be selected from nicking enzyme cleavage, USER enzyme cleavage, light cleavage, chemical cleavage or CRISPR cleavage;
  • the cleavage site contained in the fixed sequence of the carrier sequence is cut to digest the carrier sequence, so that the extension product in step (3) is separated from the template (i.e., the carrier sequence) used to form the extension product, whereby the oligonucleotide probes are attached to the surface of a solid support such as a chip.
  • the method further includes separating the extension product in step (3) from the template (ie, the vector sequence) forming the extension product by high temperature denaturation.
  • each vector sequence is a DNB formed from a concatemer of multiple copies of the vector sequence.
  • in step (1), the various vector sequences are provided by the following steps:
  • using each vector template sequence as a template, a nucleic acid amplification reaction is performed to obtain an amplification product of each vector template sequence, the amplification product comprising at least one copy of the vector sequence; in certain embodiments, rolling circle replication is performed to obtain DNBs formed from concatemers of the vector sequences.
  • the consensus sequence X2 comprises a capture sequence capable of hybridizing to all or part of the nucleic acid to be captured, the capture sequence including a poly(T) sequence, a specific sequence for a specific target nucleic acid, or a random oligonucleotide sequence; and the capture sequence has a 3' free end to enable the consensus sequence X2 to serve as an extension primer.
  • said step (2) comprises: contacting the one or more cells with the solid support of the nucleic acid array, whereby each cell occupies at least one microdot of the nucleic acid array (that is, each cell is in contact with at least one microdot in the nucleic acid array), and the first binding molecule of the cell forms an interaction pair with the first labeling molecule of the solid support; wherein annealing conditions are applied such that the nucleic acid of the one or more cells anneals to the capture sequence, so that the position of the nucleic acid corresponds to the position of the oligonucleotide probe on the nucleic acid array;
  • the step (3) includes: under conditions that allow primer extension, performing a primer extension reaction using the oligonucleotide probe as a primer and the captured nucleic acid molecule as a template, to generate a labeled (for example, labeled by the tag sequence Y) nucleic acid molecule; and/or performing a primer extension reaction using the captured nucleic acid molecule as a primer and the oligonucleotide probe as a template, to generate an extended captured nucleic acid molecule, thereby forming a labeled (e.g., labeled by the complementary sequence of the tag sequence Y) nucleic acid molecule.
  • the oligonucleotide probe described in step (1) further comprises a unique molecular identifier (UMI) sequence.
  • said UMI sequence is located upstream of said capture sequence.
  • the UMI sequences contained in the oligonucleotide probes coupled to the same microdot are different from each other.
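Since probes on the same microdot carry different UMIs, one common way to use such identifiers downstream is to count distinct UMIs rather than raw reads. The Python sketch below illustrates this counting under the assumption that each read has already been reduced to a (tag sequence Y, UMI, gene) triple; the read format, the gene names and the function `count_molecules` are hypothetical and not part of the application.

```python
# Illustrative sketch only; the read format and names are hypothetical.
from collections import defaultdict


def count_molecules(reads):
    """reads: iterable of (tag_y, umi, gene). Returns {(tag_y, gene): number of distinct UMIs}."""
    umis = defaultdict(set)
    for tag_y, umi, gene in reads:
        umis[(tag_y, gene)].add(umi)      # duplicate reads of one molecule collapse here
    return {key: len(values) for key, values in umis.items()}


reads = [
    ("Y1", "AAC", "GeneA"),
    ("Y1", "AAC", "GeneA"),   # same microdot tag and UMI: counted once
    ("Y1", "GTT", "GeneA"),
    ("Y2", "AAC", "GeneA"),   # same UMI on a different microdot: separate molecule
]
counts = count_molecules(reads)
```

This sketch uses exact UMI matching; error-tolerant collapsing would require additional logic.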
  • the nucleic acid array in step (1) is provided by steps comprising:
  • each carrier sequence comprising multiple copies of the carrier sequence, the carrier sequence comprising: a positioning sequence and a first fixed sequence from 5' to 3',
  • the positioning sequence is the complementary sequence of the tag sequence Y;
  • the first fixed sequence allows annealing of its complementary nucleotide sequence and initiation of an extension reaction;
  • the first primer comprises a first fixed sequence complementary region at its 3' end; the first fixed sequence complementary region comprises the complementary sequence of the first fixed sequence or a fragment thereof, and has a 3' free end;
  • said second nucleic acid molecule comprising a consensus sequence X2 (i.e. a capture sequence), which has a 3' free end so that said second nucleic acid molecule can be used as an extension primer,
  • linking the second nucleic acid molecule to the first nucleic acid molecule (for example, using a ligase); the ligation product is an oligonucleotide probe comprising, in the 5' to 3' direction, the consensus sequence X1, the tag sequence Y, and the consensus sequence X2.
  • the carrier sequence is optionally digested so that the ligation product in step (5) is separated from the carrier sequence, thereby attaching the oligonucleotide probe to the surface of the solid support.
  • the first nucleic acid molecule or the second nucleic acid molecule further comprises a UMI sequence.
  • the second nucleic acid molecule comprises a UMI sequence located 5' to the capture sequence.
  • multiple vector sequences are provided by the following steps:
  • each carrier template sequence as a template to perform a nucleic acid amplification reaction to obtain an amplification product of each carrier template sequence, the amplification product comprising multiple copies of the carrier sequence;
  • the amplification is selected from rolling circle replication (RCA), bridge PCR amplification, multiple strand displacement amplification (MDA) or emulsion PCR amplification; preferably, rolling circle replication is carried out to obtain DNBs formed from concatemers of the vector sequences; alternatively, bridge PCR amplification, emulsion PCR amplification, or multiple strand displacement amplification is performed to obtain DNA clusters in the form of clonal populations of the vector sequences.
  • said oligonucleotide probe is coupled to said solid support via a linker.
  • the linker is a linking group capable of reacting with an activating group, and the surface of the solid support is linked with an activating group.
  • the linker comprises -SH, -DBCO or -NHS.
  • the linker is -DBCO, and the surface of the solid support is bound with ( ester).
  • the nucleic acid array in step (1) has one or more characteristics selected from the following:
  • (1) the oligonucleotide probes coupled to the same solid support have the same consensus sequence X1 and/or the same consensus sequence X2; (2) the consensus sequence X1 of the oligonucleotide probe comprises a cleavage site that can be cut or broken by cutting, photo-excision, chemical excision, or CRISPR excision.
  • the solid phase support described in step (1) has one or more characteristics selected from the following:
  • the solid support is selected from latex beads, dextran beads, polystyrene surfaces, polypropylene surfaces, polyacrylamide gels, gold surfaces, glass surfaces, chips, sensors, electrodes and silicon wafers; In some embodiments, the solid support is a chip;
  • the solid support is planar, spherical or porous
  • the solid phase support can be used as a sequencing platform, such as a sequencing chip; in some embodiments, the solid phase support is a sequencing chip for Illumina, MGI or Thermo Fisher sequencing platforms; and
  • the solid support is capable of releasing the oligonucleotide probes spontaneously or upon exposure to one or more stimuli (e.g., temperature change, pH change, exposure to a specific chemical substance or phase, exposure to light, a reducing agent, etc.).
  • the present application also provides a method for constructing a library of nucleic acid molecules, which includes,
  • step (c): optionally, amplifying and/or enriching the product of step (b);
  • a library of nucleic acid molecules is thereby obtained.
  • the library of nucleic acid molecules comprises nucleic acid molecules from multiple single cells, and the nucleic acid molecules of different single cells have different tag sequences Y.
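Because different single cells carry different tag sequences Y, reads can be grouped by cell once a tag-to-cell assignment is available. The Python sketch below assumes such an assignment already exists as a plain dictionary (how it would be obtained is outside this sketch), and allows one cell to own several tags because a cell may occupy more than one microdot. The names `group_by_cell` and `tag_to_cell`, and all identifiers in the example, are hypothetical.

```python
# Illustrative sketch only; the mapping and identifiers are hypothetical.
from collections import defaultdict


def group_by_cell(reads, tag_to_cell):
    """reads: iterable of (tag_y, read_id). Returns ({cell: [read_id, ...]}, [unassigned read_ids])."""
    cells = defaultdict(list)
    unassigned = []
    for tag_y, read_id in reads:
        cell = tag_to_cell.get(tag_y)
        if cell is None:
            unassigned.append(read_id)    # tag Y not linked to any cell
        else:
            cells[cell].append(read_id)
    return dict(cells), unassigned


tag_to_cell = {"Y1": "cell_1", "Y2": "cell_1", "Y3": "cell_2"}
reads = [("Y1", "r1"), ("Y2", "r2"), ("Y3", "r3"), ("Y9", "r4")]
by_cell, leftover = group_by_cell(reads, tag_to_cell)
```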
  • the library of nucleic acid molecules is used for sequencing, e.g., transcriptome sequencing, e.g., single cell transcriptome sequencing (e.g., 5' or 3' transcriptome sequencing).
  • before performing step (b), the method further comprises a step (pre-b): amplifying and/or enriching the population of labeled nucleic acid molecules.
  • in step (pre-b), the population of labeled nucleic acid molecules is subjected to a nucleic acid amplification reaction to generate an enriched product.
  • the amplification reaction is performed using at least primer C and/or primer D, wherein the primer C is capable of hybridizing or annealing to the complementary sequence of the consensus sequence X1 or a partial sequence thereof and initiating an extension reaction; and the primer D is capable of hybridizing or annealing to the nucleic acid strand containing the tag sequence Y in the labeled nucleic acid molecule population and initiating an extension reaction.
  • the nucleic acid amplification reaction in step (pre-b) is performed using a nucleic acid polymerase (eg, DNA polymerase, eg, DNA polymerase with strand displacement activity and/or high fidelity).
  • in step (b) of the method, the nucleic acid molecule is randomly fragmented with a transposase and adapters are added.
  • the nucleic acid molecule obtained in the previous step is randomly fragmented with a transposase, and a first adapter and a second adapter are respectively added to the two ends of each fragment.
  • the transposase is selected from Tn5 transposase, MuA transposase, Sleeping Beauty transposase, Mariner transposase, Tn7 transposase, Tn10 transposase, Ty1 transposase, Tn552 transposase, and variants, modified products and derivatives having the transposition activity of the above-mentioned transposases.
  • the transposase is a Tn5 transposase.
  • in step (c), the product of step (b) is amplified using at least primer C' and/or primer D', wherein the primer C' is capable of hybridizing or annealing to the first adapter and initiating an extension reaction, and the primer D' is capable of hybridizing or annealing to the second adapter and initiating an extension reaction.
  • in step (c), the product of step (b) is amplified using at least the primer C and/or primer D', wherein the primer D' is capable of hybridizing or annealing to the first adapter or the second adapter and initiating an extension reaction.
  • the present application also provides a method for performing transcriptome sequencing on cells in a sample, comprising:
  • the application also provides a method for single-cell transcriptome analysis, comprising:
  • a kit comprising:
  • a nucleic acid array for labeling nucleic acids and, optionally, a first binding molecule, the nucleic acid array comprising a solid support, the solid support (for example, on its surface) containing a first labeling molecule, and the first binding molecule being capable of forming an interaction pair with the first labeling molecule;
  • the solid support further comprises a plurality of microdots, the size of the microdots (e.g., equivalent diameter) being less than 5 μm and the center-to-center distance between adjacent microdots being less than 10 μm; each microdot is coupled with an oligonucleotide probe; each oligonucleotide probe is present in at least one copy; and the oligonucleotide probe comprises or consists of a consensus sequence X1, a tag sequence Y and a consensus sequence X2, wherein,
  • oligonucleotide probes coupled to different microdots have different tag sequences Y.
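Because each microdot carries its own tag sequence Y, a sequenced tag can be mapped back to a position on the array. The following is a minimal sketch of that lookup, not the applicant's implementation: the grid layout, the 4-nt tag length, and the function name `assign_tags` are all illustrative assumptions.

```python
from itertools import product

def assign_tags(rows, cols, tag_len=4):
    """Give every microdot on a rows x cols grid a distinct tag sequence Y.

    Hypothetical helper: real arrays use much longer tags, and the tags are
    typically determined by sequencing the array rather than assigned in order.
    """
    if rows * cols > 4 ** tag_len:
        raise ValueError("tag length too short for a unique tag per microdot")
    tags = ("".join(p) for p in product("ACGT", repeat=tag_len))
    coords = ((r, c) for r in range(rows) for c in range(cols))
    # zip stops when the coordinates run out, so only rows * cols tags are used
    return dict(zip(tags, coords))

tag_to_xy = assign_tags(3, 3)
print(tag_to_xy["AAAA"])  # position of the first microdot
```

A read carrying a given tag sequence Y can then be assigned to the microdot returned by `tag_to_xy[tag]`.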
  • the center-to-center distance between adjacent microdots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and the size of the microdots (e.g., equivalent diameter) is less than 5 μm, less than 1 μm, less than 0.3 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm.
  • the center-to-center distance between adjacent microdots is 0.5 μm to 1 μm, such as 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm.
  • the size of the microdots (e.g., equivalent diameter) is 0.001 μm to 0.5 μm (such as 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, or 0.2 μm to 0.3 μm).
  • the solid support comprises a plurality of (e.g., at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, or more) microdots; in certain embodiments, the solid support comprises at least 10^4 (e.g., at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, at least 10^9, at least 10^10, at least 10^11, or at least 10^12) microdots per square millimeter.
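The center-to-center distances and the microdot densities given above are linked by simple geometry. As a rough check, and assuming the microdots sit on a regular square grid (the text does not specify the layout), the density can be estimated from the pitch as follows; `dots_per_mm2` is a hypothetical helper name.

```python
def dots_per_mm2(pitch_um):
    """Microdots per square millimeter for a square grid with the given pitch in μm."""
    dots_per_mm = 1000.0 / pitch_um  # 1 mm = 1000 μm
    return dots_per_mm ** 2

for pitch in (10, 1, 0.5):
    print(f"pitch {pitch} μm -> {dots_per_mm2(pitch):.0f} microdots per mm^2")
```

Under this assumption a 10 μm pitch corresponds to 10^4 microdots per square millimeter, which matches the lower bound on density stated above, and a 0.5 μm pitch corresponds to 4 × 10^6 microdots per square millimeter.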
  • the first binding molecule can form a specific interaction pair or a non-specific interaction pair with the first labeling molecule.
  • the interaction pair is selected from positive and negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), molecular pairs capable of click chemistry reactions (e.g., alkynyl-containing group and azido-containing compound), N-hydroxysulfosuccinyl (NHS) ester and amino-containing compound, or any combination thereof.
  • the first labeling molecule is poly-lysine, and the first binding molecule is a protein capable of binding to poly-lysine; the first labeling molecule is an antibody, and the first binding molecule is an antigen capable of binding to the antibody; the first labeling molecule is an amino-containing compound, and the first binding molecule is an N-hydroxysulfosuccinyl (NHS) ester; or, the first labeling molecule is biotin, and the first binding molecule is streptavidin.
  • the kit further comprises:
  • (i) primer I-A, or a primer set comprising primer I-A' and primer I-B, or a primer set comprising primer I-A and primer I-B, wherein:
  • the primer I-A contains a consensus sequence A and a capture sequence A, and the capture sequence A is capable of annealing to the RNA to be captured (for example, mRNA) and initiating an extension reaction; preferably, the consensus sequence A is located upstream of the capture sequence A (for example, at the 5' end of the primer I-A);
  • the primer I-A' comprises a capture sequence A capable of annealing to the RNA to be captured (for example, mRNA) and initiating an extension reaction;
  • the primer I-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; wherein the 3' end overhang complementary sequence is located at the 3' end of the primer I-B, and the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer I-B); wherein the 3' end overhang refers to one or more non-templated nucleotides at the 3' end of the cDNA strand generated by reverse transcription using, as a template, the RNA captured by the capture sequence A of the primer I-A';
  • (ii) bridging oligonucleotide I, comprising: a first region and a second region, and optionally a third region between the first region and the second region, the first region being located upstream of the second region (e.g., at its 5' end); wherein,
  • the first region can (a) anneal to all or part of the consensus sequence A of the primer I-A or (b) anneal to all or part of the consensus sequence B of the primer I-B;
  • the second region is capable of annealing to all or part of the consensus sequence X2.
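The two-region design of bridging oligonucleotide I can be pictured as a splint that holds consensus sequence X2 of the probe next to consensus sequence A of primer I-A. The sketch below builds such a splint for the case in which the first region anneals to consensus sequence A; the sequences are made-up placeholders, perfect complementarity is assumed for simplicity, and the assumption that X2 sits at the 3' end of the probe is for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def anneals(region, target):
    """True if region is the exact reverse complement of target (idealized duplex)."""
    return region == reverse_complement(target)

# Hypothetical placeholder sequences, not taken from the application.
X2 = "GATTACAGGC"
CONSENSUS_A = "CCTGAGTCAA"

first_region = reverse_complement(CONSENSUS_A)   # anneals to consensus sequence A
second_region = reverse_complement(X2)           # anneals to consensus sequence X2
bridging_oligo = first_region + second_region    # first region is upstream (5')

print(bridging_oligo)
```

With the first region upstream of the second, the bridging oligonucleotide is the reverse complement of X2 followed directly by consensus sequence A, so the 3' end of X2 and the 5' end of consensus sequence A are brought side by side on the splint.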
  • the kit comprises: primer I-A as described in (i), and bridging oligonucleotide I as described in (ii); wherein the first region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence A of the primer I-A, and the second region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence X2;
  • the capture sequence A of the primer I-A is a random oligonucleotide sequence; or, the capture sequence A of the primer I-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer I-A further comprises a tag sequence A, for example a random oligonucleotide sequence.
  • the 5' end of primer I-A comprises a phosphorylation modification.
  • the kit comprises: a primer set comprising primer I-A' and primer I-B as described in (i), and bridging oligonucleotide I as described in (ii); wherein the first region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence B of the primer I-B, and the second region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence X2;
  • the capture sequence A of the primer I-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer I-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer I-A' further comprises a tag sequence A and a consensus sequence A;
  • the primer I-B comprises a consensus sequence B, a complementary sequence overhanging at the 3' end, and a tag sequence B.
  • the kit further comprises a primer B", capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof, and capable of initiating an extension reaction.
  • primer I-B or primer B" comprises a phosphorylation modification.
  • the primer I-B comprises modified nucleotides (e.g., locked nucleic acid); preferably, the 3' end of the primer I-B comprises one or more modified nucleotides (e.g., locked nucleic acid).
  • the kit comprises: a primer set comprising primer I-A and primer I-B as described in (i), and bridging oligonucleotide I as described in (ii); wherein the first region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence A of the primer I-A, and the second region of the bridging oligonucleotide I can anneal to all or part of the consensus sequence X2;
  • the capture sequence A of the primer I-A is a random oligonucleotide sequence; or, the capture sequence A of the primer I-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer I-A further comprises a tag sequence A, for example a random oligonucleotide sequence.
  • the 5' end of primer I-A comprises a phosphorylation modification.
  • the primer I-B comprises modified nucleotides (e.g., locked nucleic acid); preferably, the 3' end of the primer I-B comprises one or more modified nucleotides (e.g., locked nucleic acid).
  • the kit further comprises:
  • (i) a primer set comprising primer II-A and primer II-B, or comprising primer II-A' and primer II-B', wherein:
  • the primer II-A contains a capture sequence A capable of annealing to the RNA to be captured (eg, mRNA) and initiating an extension reaction;
  • the primer II-B comprises a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; wherein the 3' end overhang complementary sequence is located at the 3' end of the primer II-B, and the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer II-B); wherein the 3' end overhang refers to one or more non-templated nucleotides at the 3' end of the cDNA strand generated by reverse transcription using, as a template, the RNA captured by the capture sequence A of the primer II-A;
  • the primer II-A' contains a consensus sequence A and a capture sequence A; wherein the capture sequence A is located at the 3' end of the primer II-A', and the consensus sequence A is located upstream of the capture sequence A (for example, at the 5' end of the primer II-A');
  • the primer II-B' comprises a consensus sequence B, a 3' end overhang complementary sequence, and an optional tag sequence B; wherein the 3' end overhang complementary sequence is located at the 3' end of the primer II-B', and the consensus sequence B is located upstream of the 3' end overhang complementary sequence (for example, at the 5' end of the primer II-B'); wherein the 3' end overhang refers to one or more non-templated nucleotides at the 3' end of the cDNA strand generated by reverse transcription using, as a template, the RNA captured by the capture sequence A of the primer II-A'.
  • the kit comprises: a primer set of primer II-A and primer II-B as described in (i), and (ii) bridging oligonucleotide II-I and bridging oligonucleotide II-II; wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently comprise: a first region and a second region, and optionally a third region located between the first region and the second region, the first region being located upstream of the second region (e.g., at its 5' end); wherein,
  • the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
  • the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence B of the primer II-B or a partial sequence thereof;
  • the capture sequence A of the primer II-A is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer II-A preferably further comprises a consensus sequence A and an optional tag sequence A, such as a random oligonucleotide sequence;
  • the primer II-B contains the consensus sequence B, the complementary sequence overhanging at the 3' end, and the tag sequence B.
  • the primer II-B comprises modified nucleotides (such as locked nucleic acid); preferably, the 3' end of the primer II-B comprises one or more modified nucleotides (such as locked nucleic acid).
  • the kit comprises: a primer set of primer II-A and primer II-B as described in (i);
  • the capture sequence A of the primer II-A is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer II-A preferably further comprises a consensus sequence A and an optional tag sequence A, such as a random oligonucleotide sequence;
  • the primer II-B contains the consensus sequence B, the complementary sequence overhanging at the 3' end, and the tag sequence B.
  • the primer II-B comprises modified nucleotides (such as locked nucleic acid); preferably, the 3' end of the primer II-B comprises one or more modified nucleotides (such as locked nucleic acid).
  • the kit comprises: a primer set of primer II-A' and primer II-B' as described in (i), and (ii) bridging oligonucleotide II-I and bridging oligonucleotide II-II; wherein the bridging oligonucleotide II-I and the bridging oligonucleotide II-II each independently comprise: a first region and a second region, and optionally a third region located between the first region and the second region, the first region being located upstream of the second region (e.g., at its 5' end); wherein,
  • the first region of the bridging oligonucleotide II-I can anneal to the first region of the bridging oligonucleotide II-II; the second region of the bridging oligonucleotide II-I can anneal to the consensus sequence X2 of the oligonucleotide probe or a partial sequence thereof;
  • the second region of the bridging oligonucleotide II-II can anneal to the complementary sequence of the consensus sequence A of the primer II-A' or a partial sequence thereof;
  • the capture sequence A of the primer II-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer II-A' further comprises a tag sequence A, such as a random oligonucleotide sequence.
  • the primer II-B' comprises modified nucleotides (e.g., locked nucleic acid); preferably, the 3' end of the primer II-B' comprises one or more modified nucleotides (e.g., locked nucleic acid).
  • the kit further comprises a primer B", capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof, and capable of initiating an extension reaction.
  • the kit comprises a primer set of primer II-A' and primer II-B' as described in (i);
  • the capture sequence A of the primer II-A' is a random oligonucleotide sequence; or, the capture sequence A of the primer II-A' is a poly(T) sequence or a specific sequence for a specific target nucleic acid, and the primer II-A' further comprises a tag sequence A, such as a random oligonucleotide sequence;
  • the primer II-B' contains the consensus sequence B, the complementary sequence overhanging at the 3' end, and the tag sequence B.
  • the primer II-B' comprises modified nucleotides (e.g., locked nucleic acid); preferably, the 3' end of the primer II-B' comprises one or more modified nucleotides (e.g., locked nucleic acid).
  • the kit further comprises a primer B", capable of annealing to the complementary sequence of the consensus sequence B or a partial sequence thereof, and capable of initiating an extension reaction.
  • the kit has one or more features selected from:
  • the oligonucleotide probe, primer I-A, primer II-A, primer I-A', primer II-A', primer I-B, primer II-B, primer II-B', primer B", bridging oligonucleotide I, bridging oligonucleotide II-I, and bridging oligonucleotide II-II each independently comprise or consist of naturally occurring nucleotides (such as deoxyribonucleotides or ribonucleotides), modified nucleotides, non-natural nucleotides, or any combination thereof;
  • the oligonucleotide probes each independently have 15-300nt (such as 15-200nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150- 200nt) in length;
  • primer I-A, primer II-A, primer I-A', primer II-A', primer I-B, primer II-B, primer II-B', and primer B" each independently have a length of 4-200nt (for example, 5-200nt, 15-230nt, 26-115nt, 10-130nt, 10-20nt, 20-50nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt);
  • the bridging oligonucleotide I, the bridging oligonucleotide II-I, and the bridging oligonucleotide II-II each independently have a length of 6-200nt (such as 20-100nt, 20-70nt, 6-15nt, 15-20nt, 20-30nt, 30-40nt, 40-50nt, 50-100nt, 100-150nt, 150-200nt);
  • the oligonucleotide probes coupled to the same solid support have the same consensus sequence X1 and/or the same consensus sequence X2;
  • the consensus sequence X1 of the oligonucleotide probe comprises a cleavage site, which can be cut or broken by photocleavage, chemical cleavage, or CRISPR-mediated cleavage.
  • the kit further comprises reverse transcriptase, nucleic acid ligase, nucleic acid polymerase and/or transposase.
  • the reverse transcriptase has terminal transfer activity.
  • the reverse transcriptase is capable of synthesizing a cDNA strand using RNA (eg, mRNA) as a template, and adding the 3' end overhang at the 3' end of the cDNA strand.
  • the reverse transcriptase is capable of adding, at the 3' end of the cDNA strand, an overhang of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10 or more nucleotides in length.
  • the reverse transcriptase is capable of adding an overhang of 2-5 cytosine nucleotides (eg, a CCC overhang) at the 3' end of the cDNA strand.
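The relationship between the non-templated 3' end overhang and the 3' end overhang complementary sequence of primer I-B (or II-B, II-B') can be illustrated with a short sketch. The consensus sequence, the tag sequence, and the placement of tag sequence B between consensus sequence B and the overhang complement are assumptions made for illustration; the text only fixes consensus sequence B upstream and the overhang complement at the 3' end.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def build_primer_b(consensus_b, overhang, tag_b=""):
    """Assemble 5'-consensus B, optional tag B, overhang complement-3' (assumed order)."""
    return consensus_b + tag_b + reverse_complement(overhang)

def pairs_with_overhang(primer, overhang):
    """True if the 3' end of the primer can base-pair with the cDNA overhang."""
    return primer.endswith(reverse_complement(overhang))

# Hypothetical consensus sequence B; CCC is the overhang example given in the text.
primer_b = build_primer_b("ACGTAC", "CCC")
print(primer_b, pairs_with_overhang(primer_b, "CCC"))
```

For a CCC overhang the primer ends in GGG, which lets it anneal to the tail of the cDNA strand so that the reverse transcriptase can continue extension along the primer.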
  • the reverse transcriptase is selected from the group consisting of M-MLV reverse transcriptase, HIV-1 reverse transcriptase, AMV reverse transcriptase, telomerase reverse transcriptase, and variants, modified products and derivatives having the reverse transcription activity of the above reverse transcriptases.
  • the nucleic acid polymerase has no 5' to 3' exonuclease activity or strand displacement activity.
  • the nucleic acid polymerase has 5' to 3' exonuclease activity or strand displacement activity.
  • the transposase is selected from Tn5 transposase, MuA transposase, Sleeping Beauty transposase, Mariner transposase, Tn7 transposase, Tn10 transposase, Ty1 transposase, Tn552 transposase, and variants, modified products and derivatives having the transposition activity of the above-mentioned transposases.
  • the kit further comprises: the primer C, the primer D, the primer C' and/or the primer D'.
  • the kit further comprises the primer C, the primer D and the primer D'.
  • said kit further comprises said primer C, said primer D, said primer C' and said primer D'.
  • the kit further comprises: reagents for nucleic acid hybridization, reagents for nucleic acid extension, reagents for nucleic acid amplification, reagents for recovering or purifying nucleic acids, reagents for constructing a transcriptome sequencing library, reagents for sequencing (such as next-generation sequencing or third-generation sequencing), or any combination thereof.
  • cells useful in the methods of the invention can be any cells of interest, e.g., cancer cells, stem cells, neural cells, fetal cells, and immune cells involved in the immune response.
  • the cell may be one cell or multiple cells.
  • the cells may be a mixture of cells of the same type, or a completely heterogeneous mixture of cells of different types.
  • Different cell types may include different tissue cells of an individual or the same tissue cells of different individuals or cells derived from microorganisms of different genus, species, strain, variant, or any combination of any or all of the foregoing.
  • different cell types may include normal cells and cancer cells from an individual; various cell types obtained from a human subject, such as various immune cells; various bacterial species, strains and/or variants from environmental, forensic, microbiome or other samples; or any other mixture of cell types.
  • the term "UMI" refers to a "Unique Molecular Identifier", a unique molecular label that can be used for qualitative and/or quantitative analysis of nucleic acid molecules. Unless otherwise indicated herein or clearly contradicted by the context, the present application does not limit the position and number of the UMI or its complementary sequence in a nucleic acid molecule. For example, when the cDNA strand contains the UMI or its complementary sequence, the UMI or its complementary sequence can be located at the 3' end of the cDNA sequence in the cDNA strand, or at the 5' end of the cDNA sequence, or at both the 3' end and the 5' end.
  • the UMI or its complementary sequence can be located at the 3' end of the complementary sequence of the cDNA sequence in the complementary strand of the cDNA strand, or at the 5' end of the complementary sequence of the cDNA sequence, or at both the 3' end and the 5' end.
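One common way a UMI supports quantitative analysis is by collapsing reads that share the same UMI into a single original molecule. The following is a minimal sketch of that counting step, assuming reads have already been parsed into (tag sequence Y, UMI, gene) triples; the function name and the example reads are hypothetical, and sequencing-error correction of UMIs is deliberately left out.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct UMIs per (tag sequence Y, gene) pair.

    reads: iterable of (tag_y, umi, gene) tuples, one per sequenced read.
    Reads with the same tag, gene and UMI are treated as PCR duplicates.
    """
    umis = defaultdict(set)
    for tag_y, umi, gene in reads:
        umis[(tag_y, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}

# Hypothetical example reads.
reads = [
    ("AAAC", "GTCA", "GeneX"),
    ("AAAC", "GTCA", "GeneX"),  # duplicate of the read above
    ("AAAC", "TTGA", "GeneX"),
    ("AAAG", "GTCA", "GeneY"),
]
counts = count_molecules(reads)
print(counts)
```

Here the two identical reads count once, so the first tag is credited with two GeneX molecules rather than three.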
  • "DNB" stands for DNA nanoball.
  • "RCA" stands for rolling circle amplification.
  • the RCA product is a multi-copy single-stranded DNA sequence, which can form a roughly "spherical" structure owing to the interactions between bases within the DNA sequence.
  • the library molecules are circularized to form single-stranded circular DNA, and then the single-stranded circular DNA can be amplified by multiple orders of magnitude using rolling circle amplification technology, and the resulting amplification product is called DNB.
  • a "population of nucleic acid molecules" refers to a population or collection of nucleic acid molecules, for example nucleic acid molecules derived directly or indirectly from target nucleic acid molecules (e.g., DNA double-stranded molecules, RNA/cDNA hybrid double-stranded molecules, DNA single-stranded molecules, or RNA single-stranded molecules).
  • the population of nucleic acid molecules comprises a library of nucleic acid molecules comprising sequences qualitatively and/or quantitatively representative of target nucleic acid molecule sequences.
  • the population of nucleic acid molecules comprises a subset of a library of nucleic acid molecules.
  • a "library of nucleic acid molecules" refers to a collection or population of labeled nucleic acid molecules (e.g., labeled DNA double-stranded molecules, labeled RNA/cDNA hybrid double-stranded molecules, labeled DNA single-stranded molecules, or labeled RNA single-stranded molecules) or fragments thereof generated directly or indirectly from target nucleic acid molecules, wherein the combination of labeled nucleic acid molecules or fragments thereof in the collection or population exhibits sequences that qualitatively and/or quantitatively represent the sequences of the target nucleic acid molecules from which the labeled nucleic acid molecules were generated.
  • the library of nucleic acid molecules is a sequencing library.
  • the library of nucleic acid molecules can be used to construct a sequencing library.
  • "cDNA" or "cDNA strand" refers to the "complementary DNA" synthesized by extension of a primer that anneals to an RNA molecule of interest, catalyzed by an RNA-dependent DNA polymerase or reverse transcriptase and using at least a portion of the RNA molecule of interest as a template (this process is also called "reverse transcription").
  • the synthesized cDNA molecule is "homologous” or “complementary” or “base paired” or “complexed” with at least a portion of the template.
  • upstream is used to describe the relative positional relationship of two nucleic acid sequences (or two nucleic acid molecules), and has the meaning generally understood by those skilled in the art.
  • the expression "one nucleic acid sequence is located upstream of another nucleic acid sequence" means that, when arranged in the 5' to 3' direction, the former is located in a more forward position (i.e., a position closer to the 5' end) than the latter.
  • downstream has the opposite meaning of "upstream”.
  • As used herein, "tag sequence Y", "tag sequence A", "tag sequence B", "consensus sequence X1", "consensus sequence X2", "consensus sequence A", "consensus sequence B", etc. refer to oligonucleotides having non-target nucleic acid components that provide the nucleic acid molecule to which they are joined, or a derivative product of that nucleic acid molecule, with a means for identification, recognition, and/or molecular or biochemical manipulation (e.g., by providing a site for annealing an oligonucleotide, such as a primer for DNA polymerase extension or an oligonucleotide for a capture reaction or ligation reaction).
  • the oligonucleotide may consist of at least two consecutive nucleotides (preferably about 6 to 100, although there is no firm limit to the length of the oligonucleotide; the exact size depends on many factors, which in turn depend on the final function or use of the oligonucleotide), or may be composed of multiple oligonucleotide segments in continuous or discontinuous arrangement.
  • the oligonucleotide sequence may be unique for each nucleic acid molecule it ligates, or it may be unique for a certain class of nucleic acid molecules it ligates.
  • the oligonucleotide sequence can be reversibly or irreversibly joined to the polynucleotide sequence to be "labeled” by any means including ligation, hybridization or other methods.
  • the process of joining the oligonucleotide sequence to a nucleic acid molecule is sometimes referred to herein as "labeling", and a nucleic acid molecule that has undergone labeling or contains a tag sequence is referred to as a "labeled nucleic acid molecule".
  • Nucleic acids or polynucleotides of the present invention may comprise one or more modified nucleobases, sugar moieties or internucleoside linkages.
  • reasons for using nucleic acids or polynucleotides containing modified bases, sugar moieties or internucleoside linkages include, but are not limited to: (1) changing the Tm; (2) changing the susceptibility of the polynucleotide to one or more nucleases; (3) providing a moiety for attaching a label; (4) providing a label or a quencher for a label; or (5) providing a moiety, such as biotin, for attaching another molecule that is in solution or bound to a surface.
  • oligonucleotides, such as primers, may be synthesized such that the random portion comprises one or more conformationally constrained nucleic acid analogs, such as, but not limited to, one or more ribonucleic acid analogs in which the ribose ring is "locked" by a methylene bridge linking the 2'-O atom to the 4'-C atom; these modified nucleotides increase the Tm (melting temperature) by about 2 degrees Celsius to about 8 degrees Celsius per nucleotide monomer.
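To give a feel for the size of this effect, the sketch below combines the per-monomer increase of about 2 to 8 degrees Celsius stated above with the Wallace rule, a common rule of thumb for short oligonucleotides (Tm ≈ 2 °C per A/T plus 4 °C per G/C). Using the Wallace rule here is an assumption for illustration only; it is not part of the application, and real Tm values depend on salt, concentration and sequence context.

```python
def wallace_tm(seq):
    """Rule-of-thumb Tm in degrees Celsius for a short DNA oligonucleotide."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def tm_range_with_locked(seq, n_locked, low=2, high=8):
    """Estimated Tm range when n_locked monomers are locked nucleic acids.

    low and high are the per-monomer increases (degrees Celsius) quoted in the text.
    """
    base = wallace_tm(seq)
    return base + n_locked * low, base + n_locked * high

# Hypothetical 10-nt oligonucleotide with two locked monomers.
print(wallace_tm("ACGTACGTAC"), tm_range_with_locked("ACGTACGTAC", 2))
```

The range, rather than a single value, reflects that the text gives only approximate bounds for the increase per modified monomer.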
  • one indicator of the use of modified nucleotides in the method may be that the oligonucleotide comprising the modified nucleotides can be digested by single-strand-specific RNases.
  • the "first binding molecule" is capable of specific interaction or non-specific interaction with the "first labeling molecule".
  • the first binding molecule interacts with the first labeling molecule in a manner selected from positive and negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), click chemistry reactions (e.g., alkynyl-containing group and azido compound), or any combination thereof.
  • the first labeling molecule is poly-lysine, and the first binding molecule is a protein capable of binding to poly-lysine; the first labeling molecule is an antibody, and the first binding molecule is an antigen capable of binding to the antibody; the first labeling molecule is biotin, and the first binding molecule is streptavidin; the first binding molecule is an alkynyl-containing compound, and the first labeling molecule is an azido compound; or, the first binding molecule is an N-hydroxysulfosuccinyl (NHS) ester, and the first labeling molecule is an amino-containing compound.
  • the first labeling molecule is an antigen, and the first binding molecule is an antibody capable of binding to the antigen; the first labeling molecule is streptavidin, and the first binding molecule is biotin; the first binding molecule is an azido compound, and the first labeling molecule is an alkynyl-containing compound; or, the first binding molecule is an amino-containing compound, and the first labeling molecule is an N-hydroxysulfosuccinyl (NHS) ester.
  • a nucleic acid base in a single nucleotide at one or more positions in a polynucleotide or oligonucleotide may include guanine, adenine, uracil, thymine, or cytosine.
  • one or more of the nucleic acid bases may comprise modified bases such as, but not limited to, xanthine, allylamino-uracil, allylamino-thymidine, hypoxanthine, 2-aminoadenine, 5-propynyluracil, 5-propynylcytosine, 4-thiouracil, 6-thioguanine, and aza- and deaza-uracil, thymidine, cytosine, adenine or guanine.
  • nucleic acid bases may comprise nucleic acid bases derivatized with a biotin moiety, a digoxigenin moiety, a fluorescent or chemiluminescent moiety, a quencher moiety, or some other moiety.
  • the invention is not limited to the listed nucleic acid bases; the list given shows examples of a wide range of bases that can be used in the methods of the invention.
  • one or more of the sugar moieties may include 2'-deoxyribose, or alternatively, one or more of the sugar moieties may include some other sugar moiety, such as, but not limited to: ribose, or 2'-fluoro-2'-deoxyribose or 2'-O-methyl-ribose, which provide resistance to some nucleases, or 2'-amino-2'-deoxyribose or 2'-azido-2'-deoxyribose, which can be labeled with visible, fluorescent, or infrared fluorescent dyes.
  • internucleoside linkages of nucleic acids or polynucleotides of the invention may be phosphodiester linkages, or alternatively, one or more of the internucleoside linkages may include modified linkages such as, but not limited to: phosphorothioate, phosphorodithioate, phosphoroselenate, or phosphorodiselenate linkages, which are resistant to some nucleases.
  • "terminal transfer activity" refers to the ability to catalyze the template-independent addition (or "tailing") of one or more deoxyribonucleoside triphosphates (dNTPs) or a single dideoxyribonucleoside triphosphate to the 3' end of the cDNA.
  • Examples of reverse transcriptases having terminal transfer activity include, but are not limited to, M-MLV reverse transcriptase, HIV-1 reverse transcriptase, AMV reverse transcriptase, telomerase reverse transcriptase, and reverse transcriptases having said reverse transcriptase Variants, modified products and derivatives with recording activity and terminal transfer activity. Described reverse transcriptase does not have or has RNase activity (particularly RNase H activity).
  • the reverse transcriptase used to reverse transcribe RNA to generate cDNA does not have RNase activity.
  • the reverse transcriptase used to reverse transcribe RNA to generate cDNA has terminal transfer activity and does not have RNase activity.
  • nucleic acid polymerase with "strand displacement activity" means that, in the process of elongating a new nucleic acid strand, if the polymerase encounters a downstream nucleic acid strand complementary to the template strand, it can continue the extension reaction and displace that downstream strand from the template.
  • nucleic acid polymerase having "5' to 3' exonuclease activity" refers to a nucleic acid polymerase capable of catalyzing the hydrolysis of 3',5'-phosphodiester bonds in the 5' to 3' direction, thereby degrading nucleotides from the polynucleotide.
  • a nucleic acid polymerase (or DNA polymerase) with "high fidelity" means that, during the amplification of nucleic acid, the probability of introducing a wrong nucleotide (i.e., the error rate) is lower than that of the wild-type Taq enzyme (for example, the Taq enzyme whose sequence is shown in UniProt Accession: P19821.1).
  • As used herein, the terms "anneal", "annealing", "annealed", "hybridize", "hybridizing", and the like refer to the formation of a complex between nucleotide sequences that have sufficient complementarity to form a complex via Watson-Crick base pairing.
  • nucleic acid sequences that are "complementary to" or "complementary" or "hybridize" or "anneal" to each other should be able to form, or do form, a "hybrid" or "complex" that is sufficiently stable to serve the intended purpose.
  • it is not required that every nucleic acid base within the sequence represented by one nucleic acid molecule be capable of base pairing, or be paired or complexed, with every nucleic acid base within the sequence represented by a second nucleic acid molecule in order for the two nucleic acid molecules, or the corresponding sequences shown therein, to be "complementary" or "annealed" or "hybridized" to each other.
  • the terms “complementary” or “complementarity” are used when referring to a sequence of nucleotides related by the base pairing rules. For example, the sequence 5'-A-G-T-3' is complementary to the sequence 3'-T-C-A-5'.
  • Complementarity can be "partial,” wherein only some of the nucleic acid bases match according to the base pairing rules. Alternatively, there may be “perfect” or “total” complementarity between nucleic acids. The degree of complementarity between nucleic acid strands has a significant effect on the efficiency and strength of hybridization between nucleic acid strands. This is particularly important in amplification reactions and detection methods that rely on hybridization of nucleic acids.
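To make the base-pairing rules concrete, the following minimal Python sketch computes the reverse complement of a DNA sequence and the fraction of positions that pair between two antiparallel strands. The function names are illustrative and are not part of the application; only Watson-Crick A-T and G-C pairs are counted, and both strands are assumed to have the same length.

```python
# Watson-Crick pairing for DNA; any other pairing is deliberately ignored.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that pairs perfectly with seq, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def complementarity(strand_a: str, strand_b: str) -> float:
    """Fraction of paired positions when two 5'->3' strands align antiparallel."""
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch assumes strands of equal length")
    b_reversed = strand_b.upper()[::-1]  # read strand_b 3'->5' against strand_a
    paired = sum(PAIRS.get(x) == y for x, y in zip(strand_a.upper(), b_reversed))
    return paired / len(strand_a)


# 5'-AGT-3' pairs with 3'-TCA-5', which reads 5'-ACT-3'.
print(reverse_complement("AGT"))
print(complementarity("AGT", "ACT"))  # every position pairs
print(complementarity("AGT", "ACA"))  # only some positions pair
```

A value of 1.0 corresponds to "perfect" or "total" complementarity in the sense used above, and anything lower to "partial" complementarity.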
  • the term “homology” refers to the degree of complementarity of one nucleic acid sequence to another nucleic acid sequence. There may be partial or complete homology (ie, complementarity).
  • a partially complementary sequence is one that at least partially inhibits the hybridization of a fully complementary sequence to a target nucleic acid and is referred to using the functional term "substantially homologous". Inhibition of hybridization of a perfectly complementary sequence to a target sequence can be examined under low stringency conditions using a hybridization assay (eg, Southern or Northern blot, solution hybridization, etc.). Substantially homologous sequences or probes will compete or inhibit binding (ie, hybridization) of a fully homologous sequence to a target under conditions of low stringency. This is not to say that low stringency conditions are conditions that allow non-specific binding; low stringency conditions require that the binding of two sequences to each other is a specific (ie selective) interaction.
  • the absence of non-specific binding can be tested by using a second target that lacks complementarity or has only a low degree of complementarity (eg, less than about 30% complementarity). In cases of little or no specific binding, the probe will not hybridize to the nucleic acid target.
  • "substantially homologous", when used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, refers to any oligonucleotide or probe that can hybridize to one or both strands of the double-stranded nucleic acid sequence under low stringency conditions as described herein.
  • the terms “anneal” or “hybridize” are used when referring to the pairing of complementary nucleic acid strands.
  • Hybridization and the strength of hybridization are affected by a number of factors well known in the art, including the degree of complementarity between the nucleic acids, the stringency of the conditions (affected by factors such as salt concentration), the Tm (melting temperature) of the hybrid formed, the presence of other components (e.g., the presence or absence of polyethylene glycol or betaine), the molarity of the hybridized strands, and the G:C content of the nucleic acid strands.
  • the oligonucleotide probe can be released from the solid support spontaneously or upon exposure to one or more stimuli (e.g., temperature change, pH change, exposure to a particular chemical species or phase, exposure to light, reducing agent, etc.). It will be appreciated that the oligonucleotide probe may be released by cleavage of the bond between the oligonucleotide probe and the solid support, by degradation of the solid support itself, or both, so that the oligonucleotide probe becomes accessible to other reagents.
  • Addition of various types of labile bonds to the solid support can result in a solid support capable of responding to different stimuli.
  • Each type of labile bond can be sensitive to relevant stimuli (eg, chemical stimuli, light, temperature, etc.), so that the release of substances attached to the solid support through each labile bond can be controlled by applying appropriate stimuli.
  • labile bonds that can be coupled to solid supports include ester bonds (e.g., cleavable with acids, bases, or hydroxylamine), vicinal diol bonds (e.g., cleavable by sodium periodate), Diels-Alder bonds (e.g., cleavable by heat), sulfone bonds (e.g., cleavable by bases), silyl ether bonds (e.g., cleavable by acids), glycosidic bonds (e.g., cleavable by amylases), peptide bonds (e.g., cleavable by proteases), or phosphodiester bonds (e.g., cleavable by nucleases such as DNase).
  • the solid support can be degradable, destructible, or soluble, either spontaneously or upon exposure to one or more stimuli (e.g., temperature change, pH change, exposure to a particular chemical species or phase, exposure to light, reducing agents, etc.).
  • a solid support can be soluble such that the material components of the solid support dissolve upon exposure to a particular chemical or environmental change (eg, a change in temperature or a change in pH).
  • the solid support degrades or dissolves under elevated temperature and/or alkaline conditions.
  • the solid support can be thermally degradable such that when the solid support is exposed to an appropriate temperature change (eg, heating), the solid support degrades. Degradation or dissolution of a solid support bound to a substance (eg, an oligonucleotide probe) can result in the release of the substance from the solid support.
  • transposase and reverse transcriptase and “nucleic acid polymerase” refer to protein molecules or aggregates of protein molecules responsible for catalyzing specific chemical and biological reactions.
  • the methods, compositions or kits of the invention are not limited to the use of a particular transposase, reverse transcriptase or nucleic acid polymerase from a particular source.
  • the methods, compositions or kits of the invention include any transposase, reverse transcriptase or nucleic acid polymerase from any source having equivalent enzymatic activity to a particular enzyme disclosed herein according to a particular method, composition or kit .
  • the method of the present invention also includes the following embodiment: wherein any specific enzyme provided and used in the steps of the method is replaced by a combination of two or more enzymes, the two or more enzymes When used in combination, whether used separately in a stepwise fashion or together simultaneously, the reaction mixture produces the same results as those obtained with that one particular enzyme.
  • the methods, buffers and reaction conditions provided herein, including those in the Examples, are presently preferred for embodiments of the methods, compositions and kits of the invention.
  • other enzyme storage buffers, reaction buffers and reaction conditions using some of the enzymes of the invention are known in the art and may also be suitable for use in the invention and are included herein.
  • the application provides high-resolution nucleic acid arrays (such as chips) and methods capable of positioning and labeling nucleic acid molecules, and high-throughput sequencing methods (especially high-throughput single-cell transcriptome sequencing methods) using the nucleic acid arrays or methods.
  • the method of the present application has one or more beneficial technical effects selected from the following:
  • the nucleic acid array (such as a chip) has high resolution: within the area of a single cell (such as 80-100 μm²) it can contain at least 50 (such as at least 50, at least 100, at least 200, at least 300, at least 400, or at least 500) micro-dots; each micro-dot is coupled with a labeling oligonucleotide probe containing position information (for example, an oligonucleotide probe containing a tag sequence Y), and each oligonucleotide probe comprises at least one copy.
  • the nucleic acid array can mark different cells in a sample (such as a cell suspension) with a specific localization sequence (such as a tag sequence Y); by detecting the specific localization sequence (for example, the tag sequence Y), the spatial position of a nucleic acid molecule on the nucleic acid array can be determined, which identifies the nucleic acid molecules that come from the same single cell and thereby enables analysis of single-cell samples.
  • multiple cells can be directly adsorbed on the nucleic acid array (e.g., a chip). Owing to the high resolution of the nucleic acid array (such as a chip), the size and spacing of the micro-dots are far smaller than the size of a single cell.
  • the nucleic acid array (such as a chip) can theoretically capture and label every cell in the sample, which effectively avoids the loss of rare cell information.
  • the method of the present invention can capture millions of cells on a single chip for single-cell sequencing, and the cell capture efficiency can theoretically reach 100%. That is, the cell capture throughput of the method can reach the level of millions and the cell capture efficiency can approach 100%, which is far beyond the prior art (in the prior art, such as the 10x chromium cell sorting platform, throughput is limited by the number of microdroplets formed in the oil phase and can hardly exceed the ten-thousand level, and, owing to the characteristics of the Poisson distribution, the cell capture rate can theoretically reach at most 60%).
  • Fig. 1A shows an exemplary structure of a chip for capturing and labeling nucleic acid molecules in this application, which includes: a chip and oligonucleotide probes (also called chip sequences) coupled to the chip.
  • Each oligonucleotide probe contains a label sequence Y corresponding to its position on the chip, and the coupling area between each oligonucleotide probe and the chip can be called a micro spot.
  • Each oligonucleotide probe can be single or multiple copies.
  • Figure 1B shows that cells in a sample are labeled by one or more microdots on the chip after contacting the chip.
  • FIG. 2 shows an exemplary scheme 1 for preparing a cDNA chain using RNA (such as mRNA) in a sample as a template, and an exemplary structure of the cDNA chain.
  • CA consensus sequence A
  • CB consensus sequence B.
  • Figure 3 shows an exemplary scheme in which the 5' end of the cDNA strand is labeled with the chip sequence (i.e., the 5' end of the cDNA strand is ligated to the 3' end of the chip sequence), forming a new nucleic acid molecule containing the chip sequence information (i.e., a nucleic acid molecule labeled with the chip sequence).
  • CA consensus sequence A
  • CB consensus sequence B
  • X1 consensus sequence X1
  • Y tag sequence Y
  • X2 consensus sequence X2.
  • Fig. 4 shows an exemplary scheme 1 for preparing a complementary cDNA chain using RNA (such as mRNA) in a sample as a template, and an exemplary structure of the complementary cDNA chain.
  • CA consensus sequence A
  • CB consensus sequence B
  • EP extension primer.
  • Figure 5 shows an exemplary scheme in which the 5' end of the complementary strand of the cDNA strand is labeled with the chip sequence (that is, the 5' end of the complementary strand of the cDNA strand is ligated to the 3' end of the chip sequence), forming a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
  • CA consensus sequence A
  • CB consensus sequence B
  • X1 consensus sequence X1
  • Y tag sequence Y
  • X2 consensus sequence X2.
  • FIG. 6 shows an exemplary scheme 2 for preparing a cDNA chain using RNA (such as mRNA) in a sample as a template, and an exemplary structure of the cDNA chain.
  • CA consensus sequence A
  • CB consensus sequence B.
  • Figure 7 shows an exemplary scheme 1 for labeling the 3' end of a cDNA strand with the complementary sequence of the chip sequence to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
  • CA consensus sequence A
  • CB consensus sequence B
  • X1 consensus sequence X1
  • Y tag sequence Y
  • X2 consensus sequence X2
  • P1 first region
  • P2 second region.
  • Figure 8 shows an exemplary scheme 2 for labeling the 3' end of a cDNA strand with the complementary sequence of the chip sequence to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
  • CA consensus sequence A
  • CB consensus sequence B
  • X1 consensus sequence X1
  • Y tag sequence Y
  • X2 consensus sequence X2.
  • FIG. 9 shows an exemplary scheme 2 for preparing a complementary strand of a cDNA chain using RNA (such as mRNA) in a sample as a template, and an exemplary structure of the complementary strand of the cDNA strand.
  • Figure 10 shows an exemplary scheme 1 for labeling the 3' end of the complementary strand of a cDNA strand with the complementary sequence of the chip sequence to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
  • CA consensus sequence A
  • CB consensus sequence B
  • X1 consensus sequence X1
  • Y tag sequence Y
  • X2 consensus sequence X2
  • P1 first region
  • P2 second region.
  • Fig. 11 shows an exemplary scheme 2 for labeling the 3' end of the complementary strand of a cDNA strand with the complementary sequence of the chip sequence to form a new nucleic acid molecule containing the chip sequence information (that is, a nucleic acid molecule labeled with the chip sequence), and an exemplary structure of the new nucleic acid molecule containing the chip sequence information.
  • CA consensus sequence A
  • CB consensus sequence B
  • X1 consensus sequence X1
  • Y tag sequence Y
  • X2 consensus sequence X2.
  • FIG. 12 shows the gene expression profiles of some Hek293 cells obtained by the method of Example 1.
  • FIG. 13 shows a partially enlarged view of the gene expression profile of some Hek293 cells obtained by the method of Example 1.
  • Fig. 14 shows the length distribution of cDNA amplification products in Example 2.
  • FIG. 15 shows the gene expression profile of Hek293 cells obtained by the method of Example 2.
  • Figure 16 shows the average number of genes and UMIs captured by a single cell obtained by the method of Example 2.
  • DNBSEQ sequencing kit purchased from MGI, Cat. No. 1000019840 was used to prepare DNA nanoballs (DNB). Specific embodiments are briefly described below.
  • reaction system shown in Table 1-2 was configured.
  • the reaction system was placed in a PCR instrument, and the reaction was carried out according to the following reaction conditions: 95°C for 3 minutes, 40°C for 3 minutes.
  • After the reaction, put the reaction product on ice, and add 40 μL of mixed enzyme I and 2 μL of mixed enzyme II (from the DNBSEQ sequencing kit), 1 μL of ATP (100 mM stock solution, obtained from Thermo Fisher), and 0.1 μL of T4 ligase (obtained from NEB, Cat. No. M0202S).
  • the above reaction system was placed in a PCR instrument and reacted at 30° C. for 20 minutes to generate DNB.
  • the DNB was loaded onto the BGISEQ-500 sequencing chip according to the method described in the BGISEQ 500 high-throughput sequencing reagent set (SE50) (purchased from MGI, catalog number: 1000012551).
  • the MDA reagent from the BGISEQ500PE50 sequencing kit (purchased from MGI, product number: 1000012554) was added to the sequencing chip, and after incubation at 37°C for 30 min, the chip was washed with 5X SSC.
  • The chip surface was modified with N3-PEG3500-NHS (the modification reagent was purchased from Sigma, product number: JKA5086). After incubation for 30 minutes, the DBCO-modified chip sequence synthesis primer (sequence shown in SEQ ID NO: 3) was pumped in and incubated overnight at room temperature.
  • the DNB was sequenced according to the instructions of the BGISEQ-500 high-throughput sequencing reagent kit, and the read length of SE was set to 25bp.
  • the above-mentioned DBCO-modified sequence is extended to obtain the chain grown after sequencing, and the chain is decoded to obtain the position sequence information corresponding to the DNB.
  • the chain grown after sequencing continues to extend: on the basis of the above step 3, continue to carry out the cPAS reaction of 15 bases to obtain the chip sequence (SEQ ID NO: 8, which contains the consensus sequence X1 (SEQ ID NO: 4), Tag sequence Y, consensus sequence X2 (SEQ ID NO:5)).
  • Chip dicing cut the prepared chip into several small pieces, adjust the size of the slice according to the needs of the experiment, soak the chip in 50mM Tris buffer with pH 8.0, and keep it at 4°C for use.
  • Reverse transcriptase uses mRNA as a template to synthesize cDNA from a polyT-containing primer (sequence shown in SEQ ID NO: 6, which contains consensus sequence A (CA), a UMI sequence (NNNNNNNNN), and a polyT sequence), and adds a CCC overhang to the 3' end of the cDNA strand.
  • the synthetic cDNA strand comprises the following sequence structure: sequence of reverse transcription primer (SEQ ID NO:6)-cDNA sequence-c(TSO) sequence (complementary sequence of SEQ ID NO:7).
  • a nucleic acid molecule comprising the following sequence structure: chip sequence (SEQ ID NO:8)-reverse transcription primer sequence (SEQ ID NO:6)-cDNA sequence-c(TSO) sequence (complementary sequence of SEQ ID NO:7).
  • the chip was washed with 5X SSC. According to the instructions, 200 μL of Bst polymerization reaction solution (NEB, M0275S) was prepared, pumped into the chip, and reacted at 65°C for 60 minutes to obtain single-stranded nucleic acid molecules containing position information.
  • place the reaction system in the PCR instrument and set the following reaction program: 95°C for 3 min; 11 cycles of (98°C for 20 s, 58°C for 20 s, 72°C for 3 min); 72°C for 5 min; 4°C hold.
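The amplification program above can be written down as plain data, which makes it easy to check the programmed hold time before a run. The following is a minimal sketch; the `Step` class and the `hold_seconds` helper are illustrative names, ramping between temperatures is not modeled, and the final 4°C hold is left out because it has no fixed duration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


# Program from the text: 95°C 3 min; 11 x (98°C 20 s, 58°C 20 s, 72°C 3 min); 72°C 5 min.
INITIAL = [Step(95, 180)]
CYCLE = [Step(98, 20), Step(58, 20), Step(72, 180)]
FINAL = [Step(72, 300)]
N_CYCLES = 11


def hold_seconds(initial, cycle, n_cycles, final) -> int:
    """Sum the programmed hold times; ramp times are ignored in this sketch."""
    per_cycle = sum(step.seconds for step in cycle)
    return (sum(step.seconds for step in initial)
            + n_cycles * per_cycle
            + sum(step.seconds for step in final))


total = hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s, about {total / 60:.1f} min of programmed holds")
```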
  • XP beads purchased from AMPure
  • the dsDNA concentration was quantified using a Qubit instrument, and the length distribution of cDNA amplification products was detected using a 2100 Bioanalyzer (available from Agilent).
  • According to the cDNA concentration, take 20 ng of cDNA (obtained in step 3), add 0.5 μM Tn5 transposase and the corresponding buffer (purchased from BGI, catalog number 10000028493; the Tn5 transposase was loaded according to the Stereomics library preparation kit-S1), mix well to form a 20 μL reaction system, react at 55°C for 10 minutes, then add 5 μL of 0.1% SDS and mix at room temperature for 5 minutes to end the Tn5 fragmentation step.
  • the reaction conditions are as follows: 95°C for 3 minutes, 40°C for 3 minutes; after the reaction is completed, put it on ice, add 40 μL of the mixed enzyme I required for DNB preparation in the DNBSEQ sequencing kit, 2 μL of the mixed enzyme II, 1 μL of ATP, and 0.1 μL of T4 ligase; after mixing, place the above reaction system in a PCR instrument and react at 30°C for 20 minutes to form DNB.
  • DNA nanoball (DNB) preparation: prepare 40 μL of the reaction system shown in Table 2-2, adding 80 fmol of the DNA library containing position sequence information.
  • the reaction conditions are as follows: 95°C for 3 minutes, 40°C for 3 minutes; after the reaction is completed, put it on ice, add 40 μL of the mixed enzyme I required for DNB preparation in the DNBSEQ sequencing kit, 2 μL of the mixed enzyme II, 1 μL of ATP (100 mM stock solution, Thermo Fisher), and 0.1 μL of T4 ligase (purchased from NEB, product number: M0202S); after mixing evenly, place the above reaction system in a PCR instrument and react at 30°C for 20 minutes to form DNB.
  • the DNB was loaded onto the BGISEQ-500 sequencing chip according to the method described in the BGISEQ-500 high-throughput sequencing reagent set (SE50).
  • N3-PEG3500-NHS purchased from Sigma, product number: JKA5086
  • DBCO sequence modified by DBCO
  • the DNB was sequenced according to the instructions of the BGISEQ-500 high-throughput sequencing reagent kit, and the read length of SE was set to 25bp.
  • the above-mentioned DBCO-modified sequence is extended to obtain the chain grown after sequencing, and the chain is decoded to obtain the position sequence information corresponding to the DNB.
  • The UMI-containing probe capture sequence (SEQ ID NO: 15, with a phosphorylated 5' end) was synthesized by Liuhe Huada, and the capture sequence was ligated by T4 ligase to the strand extended during sequencing according to the following reaction system.
  • the ligation reaction system is shown in Table 2-3.
  • NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN represents the location information.
  • Chip cutting cut the prepared chip into several small pieces, adjust the size of the slice according to the needs of the experiment, soak the chip in 50mM tris buffer with pH8.0, and prepare it at 4°C for use.
  • cDNA synthesis: wash the chip twice with 5X SSC at room temperature, prepare 200 μL of the reverse transcriptase reaction system shown in Table 2-4, add the reaction solution to the chip containing cells so that it is fully covered, and react at 42°C for 90-180 min. The polyT probe on the chip serves as the primer for cDNA synthesis, and the 3' end of the cDNA is tagged with TSO for synthesis of the cDNA complementary strand.
  • the cDNA strands are as follows:
  • CTGCTGACGTACTGAGAGGCATGGCGACCTTATTCAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNTTGTCTTCCTAAGACNNNNTTTTTTTTTTTTTTTTTTTTTTTV(cDNA)CCCGCCTCTCAGTACGTCAGCAG
  • RNase H treatment was then carried out for 30 min to digest the RNA.
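The layout written out above (fixed bases, a run of N carrying the position information, more fixed bases, a second run of N, then polyT and the cDNA) can be pulled apart with a regular expression. The following is a minimal sketch, not the applicants' pipeline. The two fixed segments are copied from the strand shown above; `parse_strand`, the `umi_len` argument, the `min_poly_t` threshold, and the example read are assumptions made for illustration, and the second N run is assumed here to be the UMI of the capture sequence. Its length is supplied by the caller rather than inferred from the text.

```python
import re

LEFT = "CTGCTGACGTACTGAGAGGCATGGCGACCTTATTCAG"  # fixed bases before the first N run
MID = "TTGTCTTCCTAAGAC"                         # fixed bases between the two N runs


def parse_strand(read: str, umi_len: int, min_poly_t: int = 10):
    """Return (position_tag, umi, remainder), or None if the layout is absent.

    umi_len and min_poly_t are illustrative parameters, not values from the text.
    """
    pattern = re.compile(
        rf"{LEFT}([ACGT]+?){MID}([ACGT]{{{umi_len}}})(T{{{min_poly_t},}})(.*)"
    )
    match = pattern.search(read.upper())
    if match is None:
        return None
    tag, umi, _poly_t, remainder = match.groups()
    return tag, umi, remainder


# Synthetic read built only to exercise the parser; none of these bases are real data.
example_tag = "ACGTTGCAACGTTGCAACGTTGCAA"
example_umi = "GATCCAGTCA"
example_read = (LEFT + example_tag + MID + example_umi + "T" * 22
                + "GACCGGATTCAGG" + "CCCGCCTCTCAGTACGTCAGCAG")

print(parse_strand(example_read, umi_len=len(example_umi)))
```

Taking the tag as whatever lies between the two fixed segments keeps the sketch independent of the exact tag length.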
  • place the reaction system in the PCR instrument and set the following reaction program: 95°C for 3 min; 11 cycles of (98°C for 20 s, 58°C for 20 s, 72°C for 3 min); 72°C for 5 min; 4°C hold.
  • XP beads for magnetic bead purification and recovery.
  • concentration of dsDNA was quantified using the Qubit kit, and the distribution of cDNA fragments was detected using a 2100 bioanalyzer (purchased from Agilent). The test results are shown in Figure 14, and the cDNA length is normal.
  • Tn5 fragmentation: according to the cDNA concentration, take 20 ng of cDNA, add 0.5 μM Tn5 transposase (loaded with the first strand shown in SEQ ID NO:19 and the second strand shown in SEQ ID NO:20) and the corresponding buffer (purchased from BGI, Cat. No. 10000028493; the Tn5 transposase was loaded according to the Stereomics library preparation kit), mix to form a 20 μL reaction system, react at 55°C for 10 minutes, then add 5 μL of 0.1% SDS and mix at room temperature for 5 minutes to end the Tn5 fragmentation step.
  • the reaction conditions are as follows: 95°C for 3 minutes, 40°C for 3 minutes; after the reaction is completed, put it on ice, add 40 μL of the mixed enzyme I required for DNB preparation in the DNBSEQ sequencing kit, 2 μL of the mixed enzyme II, 1 μL of ATP (100 mM stock solution, Thermo Fisher), and 0.1 μL of T4 ligase; after mixing, place the above reaction system in a PCR instrument and react at 30°C for 20 minutes to form DNB.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Zoology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Pathology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Hospice & Palliative Care (AREA)
  • Oncology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

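Provided are a method for positionally labeling nucleic acid molecules and a method for constructing a nucleic acid molecule library for single-cell transcriptome sequencing, relating to the technical fields of single-cell transcriptome sequencing and the detection of spatial information of biomolecules. Also provided are a nucleic acid molecule library constructed using the methods, and a kit for carrying out the methods.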

Description

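Single-cell nucleic acid labeling and analysis method
Technical Field
This application relates to the technical fields of single-cell transcriptome sequencing and the detection of spatial information of biomolecules. Specifically, the application relates to a method for positionally labeling the nucleic acid molecules of a single-cell sample and to a method for constructing a single-cell transcriptome sequencing library. In addition, the application relates to a kit for carrying out the methods.
Background
Single-cell transcriptome sequencing is an important tool for identifying cellular heterogeneity. Its importance has driven rapid development of the technology in terms of throughput and ease of operation. That development has in turn prompted the launch, at great international expense, of the Human Cell Atlas project, which aims to build a reference map of human cells. The launch of the Human Cell Atlas project places higher demands and challenges on the throughput of single-cell transcriptome sequencing. Beyond research needs, medical workers have also tried to use single-cell transcriptome sequencing to find the few "cancer stem cells" in cancers, so as to search in a targeted manner for drugs and therapies that overcome malignant tumors. Because malignant tumor cells are relatively rare, single-cell transcriptome sequencing needs a very high cell utilization or capture rate to avoid losing the transcriptome information of these few cells.
Existing single-cell transcriptome sequencing technologies fall mainly into two classes. One class is low-throughput sequencing based on multi-well plates, in which single cells are distributed into individual wells of a multi-well plate, for example smart-seq and CEL-seq. The other class is sequencing based on magnetic beads, in which a cell and a tagged magnetic bead are co-encapsulated in a microdroplet or microwell by microfluidics, for example 10x chromium, Drop-seq, and Seq-well. Among existing technologies, 10x chromium has the highest throughput: 5,000 to 7,000 cells per run, up to a maximum of 10,000 cells, with a cell capture rate of 30% to 60% depending on the cell type.
Taking 10x chromium, the most widely used technology on the market, as an example, its technical feature is the use of a microfluidic system for cell sorting. Briefly, gel beads (Gel Beads) carrying tag molecules or barcode molecules (Barcode) enter the microfluidic system at a constant rate; the cells to be sorted and the enzymes enter at fixed time intervals, combine with the gel beads, and form GEMs (Gel Bead in Emulsion) in the oil phase. Ideally, each cell combines with one Gel Bead to form one GEM, so that the method achieves single-cell transcriptome sequencing. However, the formation of GEMs follows a Poisson distribution; that is, a single GEM may contain zero cells or multiple cells. Because the sequencing data produced by such GEMs do not correspond to the state of a single cell, they cannot be used downstream and must be filtered out algorithmically. Limited by the number of microdroplets formed in the oil phase, the throughput of this technology can hardly exceed the ten-thousand level; at the same time, owing to the characteristics of the Poisson distribution, its cell capture rate can theoretically reach at most 60%. Therefore, when single-cell transcriptome sequencing is needed for 100,000 cells or more, or when rare cells must be captured and sequenced, this technology still has major shortcomings and can hardly meet practical needs. There is thus a need in the art for new single-cell transcriptome sequencing methods with a higher cell capture rate.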
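The effect of Poisson loading can be illustrated with a short calculation. The following minimal sketch assumes that the number of cells per droplet is Poisson distributed with mean `lam` (a hypothetical loading parameter, not a value from this application) and reports the fraction of droplets that are empty, hold exactly one cell, or hold several cells. It does not attempt to reproduce the 30% to 60% capture rates cited above, which depend on platform details not given here.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k cells in a droplet when the mean is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def droplet_occupancy(lam: float) -> dict:
    """Split droplets into empty, single-cell, and multi-cell fractions."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


for lam in (0.1, 0.5, 1.0):
    occ = droplet_occupancy(lam)
    print(f"lam={lam}: empty={occ['empty']:.3f}, "
          f"single={occ['single']:.3f}, multiple={occ['multiple']:.3f}")
```

Under this model, lowering `lam` reduces multi-cell droplets but leaves most droplets empty, while raising it does the opposite, which is the trade-off the paragraph above describes.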
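Summary of the Invention
This application provides a method for positionally labeling the nucleic acid molecules of a cell sample, and a method for constructing a single-cell transcriptome sequencing library based on that method. In addition, the application relates to a kit for carrying out the methods.
Method for generating a population of labeled nucleic acid molecules
In one aspect, the application provides a method for generating a population of labeled nucleic acid molecules, comprising the following steps:
(1) providing: a sample containing one or more cells, and a nucleic acid array;
wherein the sample is a single-cell suspension; the cells contain (for example, on their surface) a first binding molecule;
the nucleic acid array comprises a solid support, the solid support contains (for example, on its surface) a first labeling molecule, and the first binding molecule is capable of forming an interaction pair with the first labeling molecule;
and the solid support further comprises a plurality of micro-dots, the size (e.g., equivalent diameter) of the micro-dots being less than 5 μm and the center-to-center distance between adjacent micro-dots being less than 10 μm; each micro-dot is coupled with one kind of oligonucleotide probe, and each kind of oligonucleotide probe comprises at least one copy; the oligonucleotide probe comprises, or consists of, in the 5' to 3' direction: a consensus sequence X1, a tag sequence Y, and a consensus sequence X2, wherein
the oligonucleotide probes coupled to different micro-dots have different tag sequences Y;
(2) contacting the one or more cells with the solid support of the nucleic acid array, whereby each cell occupies at least one micro-dot in the nucleic acid array (i.e., each cell is in contact with at least one micro-dot in the nucleic acid array), and the first binding molecule of the cell forms an interaction pair with the first labeling molecule of the solid support; wherein
before or after contacting the one or more cells with the nucleic acid array, the RNA (e.g., mRNA) of the one or more cells is subjected to a pretreatment comprising reverse transcription to generate a first population of nucleic acid molecules;
and
(3) associating the first population of nucleic acid molecules derived from each cell, obtained in the previous step, with the oligonucleotide probes coupled to the micro-dots occupied by the cell from which they are derived, thereby generating a second population of nucleic acid molecules labeled with the tag sequence Y.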
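Steps (1) to (3) imply a simple lookup on the analysis side: each tag sequence Y identifies one micro-dot, and each micro-dot belongs to the footprint of at most one cell. The following minimal sketch groups labeled molecules by cell under the assumption that both mappings are already known; how the micro-dot to cell mapping is obtained is outside this sketch, and all names, tags, and coordinates are made up for illustration.

```python
from collections import defaultdict


def assign_molecules_to_cells(labeled, tag_to_spot, spot_to_cell):
    """Group molecule ids by cell.

    labeled:      iterable of (tag_y, molecule_id) pairs read from the library
    tag_to_spot:  dict mapping each tag sequence Y to a micro-dot coordinate
    spot_to_cell: dict mapping a micro-dot coordinate to the cell occupying it
    """
    by_cell = defaultdict(list)
    for tag_y, molecule_id in labeled:
        spot = tag_to_spot.get(tag_y)      # None if the tag is not on the array
        cell = spot_to_cell.get(spot)      # None if no cell sits on that dot
        if cell is not None:
            by_cell[cell].append(molecule_id)
    return dict(by_cell)


# Toy data: two dots under cell_1, one dot under cell_2, one unknown tag.
tag_to_spot = {"AAC": (0, 0), "AGT": (0, 1), "CCA": (5, 5)}
spot_to_cell = {(0, 0): "cell_1", (0, 1): "cell_1", (5, 5): "cell_2"}
labeled = [("AAC", "m1"), ("CCA", "m2"), ("AGT", "m3"), ("TTT", "m4")]

print(assign_molecules_to_cells(labeled, tag_to_spot, spot_to_cell))
```

Because one cell covers several micro-dots, several distinct tags can resolve to the same cell, which is why the lookup goes through the micro-dot rather than mapping tags to cells one to one.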
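In certain embodiments, the center-to-center distance between adjacent micro-dots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and the size (e.g., equivalent diameter) of the micro-dots is less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.3 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm.
In certain embodiments, the center-to-center distance between adjacent micro-dots is 0.5 μm to 1 μm, for example 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm.
In certain embodiments, the size (e.g., equivalent diameter) of the micro-dots is 0.001 μm to 0.5 μm (e.g., 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, or 0.2 μm to 0.3 μm).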
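The relation between micro-dot spacing and the number of micro-dots under one cell can be estimated with simple geometry. The following minimal sketch assumes that the micro-dots sit on a square lattice, which the application does not state, so the figures are order-of-magnitude estimates only. The 80 to 100 μm² footprint is the single-cell area quoted elsewhere in this application; the function names are illustrative.

```python
def spots_per_area(area_um2: float, pitch_um: float) -> float:
    """Approximate number of micro-dots inside an area on a square lattice."""
    return area_um2 / pitch_um ** 2


def spots_per_mm2(pitch_um: float) -> float:
    """Micro-dot density per square millimetre on a square lattice."""
    return (1000.0 / pitch_um) ** 2


for pitch in (0.5, 0.8, 1.0):
    low = spots_per_area(80, pitch)
    high = spots_per_area(100, pitch)
    print(f"pitch {pitch} um: {low:.0f} to {high:.0f} dots per cell, "
          f"{spots_per_mm2(pitch):.2e} dots per mm2")
```

Under this assumption, a center-to-center distance of 0.5 μm to 1 μm puts on the order of a hundred or more micro-dots under each cell.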
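In certain embodiments, the first binding molecule is capable of forming a specific interaction pair or a non-specific interaction pair with the first labeling molecule.
In certain embodiments, the interaction pair is selected from positive-negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), pairs of molecules capable of undergoing a click chemistry reaction (e.g., an alkynyl-containing group and an azide compound), an N-hydroxysulfosuccinimide (NHS) ester and an amino-containing compound, or any combination thereof.
In certain embodiments, the first labeling molecule is polylysine and the first binding molecule is a protein capable of binding to polylysine; the first labeling molecule is an antibody and the first binding molecule is an antigen capable of binding to the antibody; the first labeling molecule is an amino-containing compound and the first binding molecule is an N-hydroxysulfosuccinimide (NHS) ester; or the first labeling molecule is biotin and the first binding molecule is streptavidin.
In certain embodiments, the first binding molecule is naturally contained in the cells.
In certain embodiments, the first binding molecule is not naturally contained in the cells.
In certain embodiments, the method further comprises a step of binding the first binding molecule to the one or more cells, or causing the one or more cells to express the first binding molecule, to provide the cell sample of step (1).
In certain embodiments, the method further comprises a step of binding the first labeling molecule to the solid support, to provide the nucleic acid array of step (1).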
方案I
在某些实施方案中,步骤(2)中,所述预处理包括以下步骤:
(i)用引物I-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成延伸产物,所述延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);
或,
(ii)(a)用引物I-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物I-A为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);和,(b)将引物I-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物,所述第一延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物I-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);
或,
(iii)(a)用引物I-A’对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物I-A’为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物I-A’包含捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;(b)将引物I-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物;其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物I-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);和,(c)提供延伸引物,以第一延伸产物为模板进行延伸反应,生成第二延伸产物,所述第二延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;
并且,
步骤(3)中,通过以下步骤将前一步骤获得的源自各个细胞的第一核酸分子群与其所源自的细胞占据的微点偶联的寡核苷酸探针相关联,从而生成经所述标签序列Y标记的第二核酸分子群:
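Under conditions that permit annealing, the bridging oligonucleotide I is contacted with the first nucleic acid molecules derived from each cell obtained in step (2) and with the oligonucleotide probe coupled to the microdot occupied by that cell, such that the bridging oligonucleotide I anneals (for example, anneals in situ) to both the first nucleic acid molecules derived from each cell obtained in step (2) and the oligonucleotide probe coupled to the microdot occupied by that cell, so that the first nucleic acid molecule population is ligated to the oligonucleotide probes on the array; each ligation product obtained is a second nucleic acid molecule carrying a positional label, thereby generating the second nucleic acid molecule population;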
其中,所述桥接寡核苷酸I包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
所述第一区域能与步骤(2)(i)或步骤(2)(ii)中所述引物I-A的共有序列A全部或部分退火或者与步骤(2)(iii)中所述引物I-B的共有序列B全部或部分退火;
所述第二区域能与所述共有序列X2全部或部分退火。
在某些实施方案中,步骤(3)中,当所述桥接寡核苷酸I的第一区域和第二区域相邻时,所述使得所述第一核酸分子群与所述寡核苷酸探针连接包括:使用核酸连接酶将杂交于同一桥接寡核苷酸I的第一区域和第二区域的核酸分子连接,获得的连接产物即为具有位置标记的第二核酸分子;或者,
当所述桥接寡核苷酸I包括第一区域、第二区域以及位于两者之间的第三区域时,所述使得所述第一核酸分子群与所述寡核苷酸探针连接包括:使用核酸聚合酶以所述第三区域为模板进行聚合反应,使用核酸连接酶将杂交于同一桥接寡核苷酸I的第一区域、第三区域和第二区域的核酸分子连接,获得的连接产物即为具有位置标记的第二核酸分子;优选地,所述核酸聚合酶无5’至3’端外切酶活性或链置换活性。
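The geometry of the bridging oligonucleotide I described above can be checked with a small string model. The following is only a sketch under stated assumptions: sequences are plain A/C/G/T strings written 5’ to 3’, annealing is modeled as exact reverse complementarity, and the function names `design_bridge_oligo` and `splint_join` are hypothetical names chosen for illustration, not terms from this application.

```python
# Minimal sketch: bridging oligonucleotide I as a splint between probe and cDNA.
# All names and the exact-match model of annealing are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a 5'->3' DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def design_bridge_oligo(consensus_a: str, consensus_x2: str, third_region: str = "") -> str:
    """Build the bridge 5'->3': first region (anneals to consensus A),
    optional third region, second region (anneals to consensus X2)."""
    return revcomp(consensus_a) + third_region + revcomp(consensus_x2)


def splint_join(probe: str, molecule: str, bridge: str, n_first: int, n_second: int) -> str:
    """Join the probe 3' end to the molecule 5' end across the bridge.

    If a third region is present, the gap is filled with its complement
    (the polymerase step) before the ends are joined (the ligase step).
    """
    first_region = bridge[:n_first]
    second_region = bridge[len(bridge) - n_second:]
    third_region = bridge[n_first:len(bridge) - n_second]
    if not probe.endswith(revcomp(second_region)):
        raise ValueError("second region does not anneal to the probe 3' end")
    if not molecule.startswith(revcomp(first_region)):
        raise ValueError("first region does not anneal to the molecule 5' end")
    return probe + revcomp(third_region) + molecule
```

In this model the product reads, from 5’ to 3’, consensus sequence X1, tag sequence Y, consensus sequence X2, the optional complement of the third region, and then the first nucleic acid molecule, which matches the product order stated for the ligation product.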
在某些实施方案中,每种寡核苷酸探针包含一个拷贝。
在某些实施方案中,每种寡核苷酸探针包含多个拷贝。
容易理解,当每种寡核苷酸探针为一个拷贝时,每个微点偶联一个探针,并且不同微点的寡核苷酸探针具有不同的标签序列Y;当每种寡核苷酸探针包含多个拷贝时,每个微点偶联多个探针,同一微点内的寡核苷酸探针具有相同的标签序列Y,不同微点的寡核苷酸探针具有不同的标签序列Y。
在某些实施方案中,所述固相支持物包含多个微点,每个微点偶联一种寡核苷酸探针,每种寡核苷酸探针可包含一个或多个拷贝。
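In certain embodiments, the solid support comprises a plurality of (for example, at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, or more) microdots; in certain embodiments, the solid support comprises at least 10^4 (for example, at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, at least 10^9, at least 10^10, at least 10^11, or at least 10^12) microdots per square millimeter.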
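The relationship between center-to-center distance and microdot density can be illustrated with a short calculation. This is a sketch that assumes the microdots sit on a regular square grid, which the text does not state; the function name `spots_per_mm2` is hypothetical.

```python
# Minimal sketch: microdot density implied by a center-to-center pitch,
# assuming a regular square grid (an assumption, not stated in the source).

def spots_per_mm2(pitch_um: float) -> float:
    """Number of microdots per square millimeter for a given pitch in micrometers."""
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    spots_per_mm = 1000.0 / pitch_um  # 1 mm = 1000 micrometers
    return spots_per_mm ** 2
```

Under this assumption, a pitch of 10 μm corresponds to 10^4 microdots per square millimeter, and a pitch of 0.5 μm corresponds to 4 x 10^6 microdots per square millimeter.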
包括步骤(1)、步骤(2)(i)和步骤(3)的实施方案
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(i)和步骤(3);其中,步骤(3)中获得的连接产物即为具有位置标记的第二核酸分子,其从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸I的第三区域的互补序列,以及所述待标记的第一核酸分子序列。
在某些实施方案中,所述方法步骤(2)(i)中,所述捕获序列A是随机寡核苷酸序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述捕获序列A,所述捕获序列A作为所述第二核酸分子的分子标签(UMI)。
在某些实施方案中,步骤(2)(i)中所述的延伸产物(待标记的第一核酸分子)从5’端至3’端包含:所述共有序列A,以所述引物I-A为逆转录引物形成的与所述RNA互补的cDNA序列。
在某些实施方案中,所述方法步骤(2)(i)中,所述捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述捕获序列A位于所述引物I-A的3’端,所述共有序列A位于所述标签序列A的上游(例如位于所述引物I-A的5’端)。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述标签序列A作为UMI。
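Because ligation products from the same microdot share the tag sequence Y but carry different UMIs, distinct molecules can be counted by collapsing duplicate UMIs per microdot. The sketch below assumes that each sequencing read has already been reduced to a (tag Y, UMI, gene) triple; that preprocessing and the function name `count_molecules` are assumptions made for illustration.

```python
# Minimal sketch: count distinct molecules per (tag Y, gene) by collapsing UMIs.
# The (tag_y, umi, gene) record layout is an illustrative assumption.
from collections import defaultdict


def count_molecules(records):
    """Return {(tag_y, gene): number of distinct UMIs} for an iterable of records."""
    seen = defaultdict(set)
    for tag_y, umi, gene in records:
        seen[(tag_y, gene)].add(umi)  # repeated UMIs are amplification duplicates
    return {key: len(umis) for key, umis in seen.items()}
```

A real pipeline would usually also tolerate sequencing errors in the UMI; this sketch treats UMIs as exact strings.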
在某些实施方案中,步骤(2)(i)中所述的延伸产物从5’端至3’端依次包含:所述共有序列A,所述标签序列A,以所述引物I-A为逆转录引物形成的与所述RNA互补的cDNA序列。
包括步骤(1)、步骤(2)(ii)和步骤(3)的实施方案
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(ii)和步骤(3);其中,步骤(3)中获得的连接产物即为具有位置标记的第二核酸分子,其从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸I的第三区域的互补序列,以及所述待标记的第一核酸分子序列。
在某些实施方案中,所述方法步骤(2)(ii)(a)中,所述捕获序列A是随机寡核苷酸序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述捕获序列A,所述捕获序列A作为所述第二核酸分子的分子标签(UMI)。
在某些实施方案中,步骤(2)(ii)中所述的第一延伸产物(待标记的第一核酸分子)从5’端至3’端包含:所述共有序列A,以所述引物I-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,任选的所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述步骤(2)(ii)(a)中,所述捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述捕获序列A位于所述引物I-A的3’端,所述共有序列A位于所述标签序列A的上游(例如位于所述引物I-A的5’端)。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述标签序列A作为UMI。
在某些实施方案中,步骤(2)(ii)中所述的第一延伸产物(待标记的第一核酸分子)从5’端至3’端依次包含:所述共有序列A,所述标签序列A,以所述引物I-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,任选的所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述方法中,所述引物I-A的5’末端包含磷酸化修饰。
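An exemplary embodiment of the present application comprising step (1), step (2)(ii) and step (3) is described in detail below:
I. An exemplary scheme for preparing a cDNA strand using RNA (for example, mRNA) in a sample as a template comprises the following steps (as shown in FIG. 2):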
(1)用逆转录酶(例如,具有末端转移活性的逆转录酶)和引物I-A对透化的细胞样本中的RNA分子(例如,mRNA分子)进行逆转录,以生成cDNA,并在cDNA的3’端添加悬突(例如,包含3个胞嘧啶核苷酸的悬突)。可使用各种具有末端转移活性的逆转录酶来进行逆转录反应。在某些优选的实施方案中,所使用的逆转录酶不具有RNaseH活性。
所述引物I-A包含poly(T)序列和共有序列A(在图中标记为CA)。在某些实施方案中(例如当所述方法用于3’转录组建库时),所述引物I-A还包含独特分子标签序列(UMI)。通常情况下,poly(T)序列位于所述引物I-A的3’末端,以便起始逆转录。在优选的实施方案中,所述UMI序列位于所述poly(T)序列的上游(例如5’端),且,所述共有序列A位于所述UMI序列的上游(例如5’端)。
(2)使用引物I-B,其包含共有序列B(在图中标记为CB)与cDNA链进行退火或杂交,随后,与引物I-B杂交或退火的核酸片段在核酸聚合酶的作用下,可以以共有序列B为模板进行延伸,在cDNA链3’末端添加共有序列B的互补序列,从而生成5’端携带共有序列A和标签序列A、且3’端携带共有序列B的互补序列的核酸分子。
所述引物I-B可包含与cDNA链的3’末端悬突互补的序列。例如,当cDNA链的3’末端包含3个胞嘧啶核苷酸的悬突时,引物I-B可在其3’端包含GGG。此外,还可以对引物I-B的核苷酸进行修饰(例如,使用锁核酸),以增强引物I-B与cDNA链的3’末端悬突之间的互补配对。
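The template-switching step just described (a CCC overhang on the cDNA pairing with GGG at the 3’ end of primer I-B, followed by extension of the cDNA along the primer) can be modeled as a string operation. This is a sketch only: sequences are 5’ to 3’ strings, pairing is exact, and the function name `template_switch_extend` is hypothetical.

```python
# Minimal sketch: extend a cDNA 3' end using primer I-B as the template.
# Exact-match pairing and all names are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a 5'->3' DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def template_switch_extend(cdna: str, primer_ib: str, overhang: str = "CCC") -> str:
    """Add the overhang to the cDNA, pair it with the primer 3' end,
    and append the complement of the rest of the primer."""
    anchor = revcomp(overhang)  # "GGG" when the overhang is "CCC"
    if not primer_ib.endswith(anchor):
        raise ValueError("primer 3' end is not complementary to the overhang")
    template = primer_ib[: len(primer_ib) - len(anchor)]  # consensus B (and tag B, if any)
    return cdna + overhang + revcomp(template)
```

The returned strand ends with the complement of consensus sequence B, which is the handle used by the later steps.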
不受理论限制,可以使用各种合适的核酸聚合酶(例如,DNA聚合酶或逆转录酶)来进行延伸反应,只要其能够以引物I-B的部分序列为模板延伸被捕获的核酸片段(逆转录产物)即可。在某些示例性实施方案中,可使用与前述逆转录步骤相同的逆转录酶来延伸被捕获的核酸片段(逆转录产物)。
在某些优选的实施方案中,步骤(2)与步骤(1)同时进行。
在某些实施方案中,所述方法任选地还包括步骤(3):加入RNaseH,消化RNA/cDNA杂合双链中的RNA链,形成cDNA单链。
在某些优选的实施方案中,所述方法不包括步骤(3)。
通过上述示例性实施方案所制备的cDNA链的示例性结构包含:共有序列A,UMI序列,与RNA(例如,mRNA)的序列互补的序列,以及共有序列B的互补序列。
二、用寡核苷酸探针(也称,芯片序列)标记cDNA链的5’端(即,将cDNA链的5’端与芯片序列的3’端进行连接)以形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案包含以下步骤(如图3所示):
提供桥接寡核苷酸I,其5’端含有与cDNA序列5’端(例如共有序列A(CA))至少部分互补的序列(第一区域,P1),且其3’端含有与芯片序列3’端(例如共有序列X2)至少部分互补的序列(第二区域,P2)。
在某些优选的实施方案中,所述桥接寡核苷酸I中P1和P2序列是相邻连接的,二者之间没有间隔核苷酸。
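In certain preferred embodiments, the P1 sequence, the P2 sequence, the consensus sequence A and the consensus sequence X2 each independently have a length of 20-100 nt (for example, 20-70 nt). The bridging oligonucleotide I is annealed or hybridized to the oligonucleotide probe and the cDNA strand, after which the 5’ end of the cDNA strand is ligated to the 3’ end of the oligonucleotide probe by a DNA ligase and/or a DNA polymerase, forming a new nucleic acid molecule containing the sequence information of the oligonucleotide probe (that is, a nucleic acid molecule labeled by the oligonucleotide probe). In certain preferred embodiments, the DNA polymerase has no 5’-to-3’ exonuclease activity or strand displacement activity.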
通过上述示例性实施方案所形成的含有芯片序列信息的新核酸分子的示例性结构,其包含:共有序列X1,标签序列Y,共有序列X2,共有序列A,UMI序列,与RNA(例如,mRNA)的序列互补的序列和共有序列B的互补序列。
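Given the exemplary structure above (consensus X1, tag Y, consensus X2, consensus A, UMI, then the cDNA-derived sequence), a labeled molecule can be split back into its fields. The sketch assumes fixed, known lengths for tag Y and the UMI and exact matches for the consensus sequences; none of these lengths is specified in the text, and `parse_labeled_read` is a hypothetical name.

```python
# Minimal sketch: split a labeled molecule into tag Y, UMI and insert.
# Fixed field lengths and exact consensus matching are illustrative assumptions.

def parse_labeled_read(read, x1, x2, ca, y_len, umi_len):
    """Return a dict with 'tag_y', 'umi' and 'insert', or None if the layout does not match."""
    if not read.startswith(x1):
        return None
    pos = len(x1)
    tag_y = read[pos:pos + y_len]
    pos += y_len
    if read[pos:pos + len(x2)] != x2:
        return None
    pos += len(x2)
    if read[pos:pos + len(ca)] != ca:
        return None
    pos += len(ca)
    umi = read[pos:pos + umi_len]
    if len(umi) < umi_len:
        return None
    pos += umi_len
    return {"tag_y": tag_y, "umi": umi, "insert": read[pos:]}
```

The tag Y recovered this way is what would be looked up against the array layout to assign the molecule to a microdot.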
包括步骤(1)、步骤(2)(iii)和步骤(3)的实施方案
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(iii)和步骤(3);其中,步骤(3)中获得的连接产物即为具有位置标记的第二核酸分子,其从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸I的第三区域的互补序列,以及所述待标记的第一核酸分子序列。
在某些实施方案中,所述方法步骤(2)(iii)(c)中,所述延伸引物为所述引物I-B或者引物B”,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,且能起始延伸反应。
在某些实施方案中,步骤(2)(iii)(c)中,所述延伸引物为所述引物B”。
在某些实施方案中,所述方法步骤(2)(iii)(a)中,所述引物I-A’的捕获序列A为随机寡核苷酸序列。
在某些实施方案中,步骤(2)(iii)(b)中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及标签序列B。
在某些实施方案中,所述第一延伸产物从5’端至3’端包含:以所述引物I-A’为逆转录引物形成的与所述RNA序列互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列;其中,所述标签序列B的互补序列作为所述第二核酸分子的分子标签(UMI)。
在某些实施方案中,步骤(2)(iii)(c)中,所述第二延伸产物(待标记的第一核酸分子序列)从5’端至3’端包含:所述共有序列B或其3’端部分序列,所述标签序列B,所述3’末端悬突序列的互补序列,所述第一延伸产物中的cDNA序列的互补序列;其中,所述标签序列B作为所述第二核酸分子的分子标签(UMI)。
在某些实施方案中,所述方法步骤(2)(iii)(a)中,所述引物I-A’的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物I-A’还含有标签序列A,例如为随机寡核苷酸序列,以及共有序列A。
在某些实施方案中,所述捕获序列A位于所述引物I-A’的3’端。
在某些实施方案中,所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A’的5’端)。
在某些实施方案中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及标签序列B。
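In certain embodiments, in step (2)(iii)(b), the first extension product comprises, from the 5’ end to the 3’ end: the consensus sequence A, optionally the tag sequence A, the cDNA sequence complementary to the RNA formed using the primer I-A’ as the reverse transcription primer, the 3’ overhang sequence, the complement of the tag sequence B, and the complement of the consensus sequence B.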
在某些实施方案中,步骤(2)(iii)(c)中,所述第二延伸产物(待标记的第一核酸分子序列)从5’端至3’端包含:所述共有序列B或其3’端部分序列,所述标签序列B,所述3’末端悬突序列的互补序列,所述第一延伸产物中的cDNA序列的互补序列,任选的所述标签序列A的互补序列,所述共有序列A的互补序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述标签序列B作为UMI。
在某些实施方案中,所述延伸引物的5’末端包含磷酸化修饰。
在某些实施方案中,步骤(2)(iii)(c)之前,所述方法还包括对步骤(2)(iii)(a)或步骤(2)(iii)(b)的产物进行处理(例如加热处理),以去除RNA。
在某些实施方案中,所述方法在步骤(2)(iii)(b)中,所述cDNA链通过其3’末端悬突与所述引物I-B退火,并且,在核酸聚合酶(例如,DNA聚合酶或逆转录酶)的作用下,所述cDNA链以所述引物I-B为模板被延伸,生成所述第一延伸产物。
An exemplary embodiment of the present application comprising step (1), step (2)(iii) and step (3) is described in detail below:
一、以样本中的RNA(例如mRNA)为模板制备cDNA链互补链的示例性方案包含以下步骤(如图4所示):
(1)用逆转录酶(例如,具有末端转移活性的逆转录酶)和引物I-A’对透化的样本中的RNA分子(例如,mRNA分子)进行逆转录,以生成cDNA,并在cDNA的3’端添加悬突(例如,包含3个胞嘧啶核苷酸的悬突)。可使用各种具有末端转移活性的逆转录酶来进行逆转录反应。在某些优选的实施方案中,所使用的逆转录酶不具有RNaseH活性。
所述逆转录引物I-A’包含poly(T)序列和共有序列A(CA)。通常情况下,poly(T)序列位于所述引物I-A’的3’末端,以便起始逆转录。
(2)使用引物I-B与cDNA链进行退火或杂交,所述引物I-B包含共有序列B(CB)和所述cDNA的3’端悬突的互补序列。在某些实施方案中(例如当所述方法用于5’转录组建库时),所述引物I-B还包含独特分子标签序列(UMI)。随后,与引物I-B杂交或退火的核酸片段在核酸聚合酶的作用下,可以以共有序列B和UMI序列为模板进行延伸,在cDNA链3’末端添加共有序列B的互补序列和UMI序列的互补序列,从而生成5’端携带共有序列A、且3’端携带共有序列B的互补序列以及UMI分子的互补序列的核酸分子。
当cDNA链的3’末端包含3个胞嘧啶核苷酸的悬突时,引物I-B可在其3’端包含GGG。此外,还可以对引物I-B的核苷酸进行修饰(例如,使用锁核酸),以增强引物I-B与cDNA链的3’末端悬突之间的互补配对。
不受理论限制,可以使用各种合适的核酸聚合酶(例如,DNA聚合酶或逆转录酶)来进行延伸反应,只要其能够以引物I-B的序列或其部分序列为模板延伸被捕获的核酸片段(逆转录产物)即可。在某些示例性实施方案中,可使用与前述逆转录步骤相同的逆转录酶来延伸被捕获的核酸片段(逆转录产物)。
在某些优选的实施方案中,步骤(2)与步骤(1)同时进行。
在某些实施方案中,所述方法任选地还包括步骤(3):加入RNaseH,消化RNA/cDNA杂合双链中的RNA链,形成cDNA单链。
在某些优选的实施方案中,所述方法不包括步骤(3)。
(4)使用延伸引物,以(3)获得的cDNA单链为模板进行延伸反应,获得延伸产物;所述延伸引物能与所述共有序列B的互补序列或其部分序列退火,且能起始延伸反应。
在某些实施方案中,所述延伸引物与所述引物I-B相同。
通过上述示例性实施方案所制备的含cDNA链互补链的示例性结构包含:共有序列B,UMI序列,与cDNA 3’末端悬突序列互补的序列,cDNA序列的互补序列,共有序列A的互补序列。
二、用寡核苷酸探针(也称,芯片序列)标记cDNA链互补链的5’端(即,将cDNA链的互补链的5’端与芯片序列的3’端进行连接)以形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案包含以下步骤(如图5所示):
提供桥接寡核苷酸I,其5’端含有与共有序列B(CB)至少部分互补的序列(第一区域,P1),且其3’端含有与共有序列X2至少部分互补的序列(第二区域,P2)。
在某些优选的实施方案中,所述桥接寡核苷酸I中P1和P2序列是相邻连接的,二者之间没有间隔核苷酸。
在某些优选的实施方案中,所述P1序列、P2序列各自独立地具有20-100nt(例如20-70nt)的长度。
将该桥接寡核苷酸I与寡核苷酸探针和cDNA链互补链退火或杂交,之后通过DNA连接酶和/或DNA聚合酶将cDNA链互补链的5’端与芯片序列的3’端进行连接,形成含有寡核苷酸探针序列信息的新核酸分子(即,经寡核苷酸探针标记的核酸分子)。在某些优选的实施方案中,所述DNA聚合酶无5’端至3’端外切酶活性或链置换活性。
通过上述示例性实施方案所形成的含有芯片序列信息的新核酸分子的示例性结构,其包含:共有序列X1,标签序列Y,共有序列X2,共有序列B,UMI序列,cDNA序列的互补序列和共有序列A的互补序列。
方案II
在某些实施方案中,步骤(2)中,所述预处理包括以下步骤:
(i)(a)用引物II-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物II-A为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物II-A含有捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;和,(b)将引物II-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物,所述第一延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物II-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物II-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B的5’端);或,
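(ii) (a) reverse transcribing the RNA (for example, mRNA) of the one or more cells with a primer II-A’ to generate a cDNA strand; the cDNA strand comprises a cDNA sequence complementary to the RNA (for example, mRNA) formed using the primer II-A’ as the reverse transcription primer, and a 3’ overhang; wherein the primer II-A’ contains a consensus sequence A and a capture sequence A, the capture sequence A being capable of annealing to the RNA (for example, mRNA) to be captured and initiating an extension reaction; the consensus sequence A is located upstream of the capture sequence A (for example, at the 5’ end of the primer II-A’); (b) annealing a primer II-B’ to the cDNA strand generated in (a) and performing an extension reaction to generate a first extension product; wherein the primer II-B’ comprises a consensus sequence B, a 3’ overhang complementary sequence, and optionally a tag sequence B; the 3’ overhang complementary sequence is located at the 3’ end of the primer II-B’; the consensus sequence B is located upstream of the 3’ overhang complementary sequence (for example, at the 5’ end of the primer II-B’); and (c) providing an extension primer and performing an extension reaction using the first extension product as a template to generate a second extension product, the second extension product being the first nucleic acid molecule to be labeled, thereby generating the first nucleic acid molecule population;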
并且,步骤(3)中,通过以下步骤将前一步骤获得的源自各个细胞的第一核酸分子群与其所源自的细胞占据的微点偶联的寡核苷酸探针相关联,从而生成经所述标签序列Y标记的第二核酸分子群:
(i)向步骤(2)的产物实施退火条件,使得步骤(2)获得的源自各个细胞的第一核酸分子与所述细胞占据的微点所偶联的寡核苷酸探针退火(例如原位退火),并进行延伸反应,生成延伸产物,所述延伸产物即为具有位置标记的第二核酸分子,从而生成第二核酸分子群;其中,所述寡核苷酸探针的共有序列X2或其部分序列(a)能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列退火,或者,(b)能与步骤(2)(ii)获得的第二延伸产物的所述共有序列A的互补序列或其部分序列退火;或,
(ii)在允许退火的条件下,将桥接寡核苷酸对与步骤(2)获得的源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针接触,使得所述桥接寡核苷酸对与步骤(2)获得的源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针退火(例如原位退火),
其中,所述桥接寡核苷酸对由桥接寡核苷酸II-I和桥接寡核苷酸II-II组成,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;
所述桥接寡核苷酸II-II的第二区域(a)能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列退火,或者,(b)能与步骤(2)(ii)获得的第二延伸产物的所述共有序列A的互补序列或其部分序列退火;
其中,将所述桥接寡核苷酸对与所述第一核酸分子群、所述寡核苷酸探针接触时,所述桥接寡核苷酸对的桥接寡核苷酸II-I和桥接寡核苷酸II-II各自以单链的形式存在,或者,所述桥接寡核苷酸对的桥接寡核苷酸II-I和桥接寡核苷酸II-II以彼此退火形成部分双链的形式存在;
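performing a ligation reaction: ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I, and/or ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II; and performing an extension reaction; wherein the ligation reaction and the extension reaction are performed in any order; the reaction product obtained is the second nucleic acid molecule carrying a positional label, thereby generating the second nucleic acid molecule population.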
在某些实施方案中,步骤(3)(ii)中:
(1)当所述桥接寡核苷酸II-I的第一区域和第二区域相邻时,所述将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸连接酶将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接;或者,
当所述桥接寡核苷酸II-I包括第一区域、第二区域以及位于两者之间的第三区域时,所述将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸聚合酶(例如,无5’至3’端外切酶活性或链置换活性)以所述第三区域为模板进行聚合反应,使用核酸连接酶将杂交于同一桥接寡核苷酸II-I的第一区域、第三区域和第二区域的核酸分子连接;
和/或
(2)当所述桥接寡核苷酸II-II的第一区域和第二区域相邻时,所述将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸连接酶将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接;或者,
当所述桥接寡核苷酸II-II包括第一区域、第二区域以及位于两者之间的第三区域时,所述将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸聚合酶(例如,无5’至3’端外切酶活性或链置换活性)以所述第三区域为模板进行聚合反应,使用核酸连接酶将杂交于同一桥接寡核苷酸II-II的第一区域、第三区域和第二区域的核酸分子连接。
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(i)和步骤(3);其中,步骤(2)(i)(b)中,所述引物II-B含有共有序列B,3’末端悬突互补序列,以及标签序列B。
在某些实施方案中,步骤(2)(i)(b)中所述的第一延伸产物从5’端至3’端依次包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第二核酸分子具有不同的所述标签序列B作为UMI。
包括步骤(1)、步骤(2)(i)和步骤(3)(i)的实施方案
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(i)和步骤(3)(i);其中,所述共有序列X2或其部分序列能与所述共有序列B的互补序列或其部分序列退火;步骤(3)(i)中获得的延伸产物即为标记的核酸分子,其包含:含有所述待标记的第一核酸分子序列的第一链,和/或,含有所述寡核苷酸探针序列的第二链。
易于理解,所述“XX(序列)的部分序列”或“XX(序列)部分序列”意指“XX(序列)”的至少一个区段的核苷酸序列。
例如,所述共有序列X2可以以其整体的核苷酸序列与所述共有序列B的互补序列或所述共有序列B的互补序列的部分区段的核苷酸序列退火,所述共有序列X2也可以以其部分区段的核苷酸序列与所述共有序列B的互补序列或所述共有序列B的互补序列的部分区段的核苷酸序列退火。
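The term “annealing” means that, of two nucleotide sequences annealed to each other, every base in one nucleotide sequence can pair with a base in the other nucleotide sequence, with no mismatches or gaps; or that most of the bases in one nucleotide sequence can pair with bases in the other nucleotide sequence, with mismatches or gaps permitted (for example, mismatches or gaps of one or several nucleotides). That is, two nucleotide sequences capable of annealing may be either fully complementary or partially complementary. Unless otherwise indicated herein or clearly contradicted by context, this description of “annealing” applies throughout this document.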
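The definition of annealing given above (full complementarity, or partial complementarity with a small number of mismatches) can be expressed as a simple predicate. This is a deliberately simplified sketch: it compares two equal-length 5’ to 3’ strings and tolerates mismatches only, not gaps, and the threshold `max_mismatches` is a hypothetical parameter, not a value from this application.

```python
# Minimal sketch: test whether two sequences can anneal, allowing mismatches.
# Gaps are not modeled; the mismatch threshold is an illustrative assumption.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a 5'->3' DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def can_anneal(seq_a: str, seq_b: str, max_mismatches: int = 0) -> bool:
    """True if seq_a pairs with seq_b (antiparallel) with at most max_mismatches mismatches."""
    target = revcomp(seq_b)
    if len(seq_a) != len(target):
        return False
    mismatches = sum(1 for a, t in zip(seq_a, target) if a != t)
    return mismatches <= max_mismatches
```

With `max_mismatches=0` the predicate corresponds to the fully complementary case; a positive value corresponds to the partially complementary case.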
在某些实施方案中,所述第一链从5’端至3’端包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列。
包括步骤(1)、步骤(2)(i)和步骤(3)(i)的实施方案:一链
在某些实施方案中,所述共有序列X2或其部分序列能与所述共有序列B的互补序列或其部分序列(例如,3’端部分序列)退火,并且步骤(2)(i)中的第一延伸产物的所述共有序列B的互补序列具有3’自由端。
在某些实施方案中,步骤(3)(i)中获得的延伸产物即为标记的核酸分子,其包含所述第一链。
在某些实施方案中,所述第一链从5’端至3’端包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,步骤(3)(i)中,所述寡核苷酸探针不能起始延伸反应(例如3’端是封闭的)。
在某些实施方案中,所述方法步骤(2)(i)(a)中,所述引物II-A的捕获序列A为随机寡核苷酸序列。
在某些实施方案中,步骤(2)(i)(b)中所述的第一延伸产物从5’端至3’端依次包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述第一链从5’端至3’端包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,所述方法步骤(2)(i)(a)中,所述引物II-A的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物II-A还含有共有序列A,以及任选的标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述捕获序列A位于所述引物II-A的3’端。
In certain embodiments, the consensus sequence A is located upstream of the capture sequence A (for example, at the 5’ end of the primer II-A).
在某些实施方案中,步骤(2)(i)(b)中所述第一延伸产物从5’端至3’端依次包含:所述共有序列A,任选的标签序列A,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述第一链从5’端至3’端包含:所述共有序列A,任选的所述标签序列A,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
包括步骤(1)、步骤(2)(i)和步骤(3)(i)的实施方案:二链
在某些实施方案中,所述共有序列X2或其部分序列(例如,3’端部分序列)能与所述共有序列B的互补序列或其部分序列退火,并且所述寡核苷酸探针的所述共有序列X2具有3’自由端。
在某些实施方案中,步骤(3)(i)中获得的延伸产物即为标记的核酸分子,其包含所述第二链。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列。
在某些实施方案中,步骤(2)(i)获得的第一延伸产物不能起始延伸反应(例如3’端是封闭的)。
在某些实施方案中,所述方法步骤(2)(i)(a)中,所述引物II-A的捕获序列A为随机寡核苷酸序列。
在某些实施方案中,步骤(2)(i)(b)中所述的第一延伸产物从5’端至3’端依次包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列。
在某些实施方案中,所述方法步骤(2)(i)(a)中,所述引物II-A的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物II-A还含有共有序列A,以及任选的标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述捕获序列A位于所述引物II-A的3’端。
在某些实施方案中,所述共有序列A位于所述捕获序列A的上游(例如位于所述引物II-A的5’端)。
在某些实施方案中,步骤(2)(i)(b)中所述第一延伸产物从5’端至3’端依次包含:所述共有序列A,任选的所述标签序列A,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,任选的所述标签序列A的互补序列,所述共有序列A的互补序列。
本申请的包含步骤(1)、步骤(2)(i)和步骤(3)(i)的一个示例性实施方案详细描述如下:
一、以样本中的RNA(例如mRNA)为模板制备3’端含有UMI的互补序列的cDNA链的示例性方案包含以下步骤(如图6所示):
(1)用逆转录酶(例如,具有末端转移活性的逆转录酶)和引物II-A对透化的样本中的RNA分子(例如,mRNA分子)进行逆转录,以生成cDNA,并在cDNA的3’端添加悬突(例如,包含3个胞嘧啶核苷酸的悬突)。可使用各种具有末端转移活性的逆转录酶来进行逆转录反应。在某些优选的实施方案中,所使用的逆转录酶不具有RNaseH活性。
在某些实施方案中,所述引物II-A包含poly(T)序列以及共有序列A(CA)。通常情况下,poly(T)序列位于所述引物II-A的3’末端,以便起始逆转录。
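In certain embodiments, the primer II-A comprises a random oligonucleotide sequence, which can be used to capture RNA lacking a poly(A) tail. Typically, the random oligonucleotide sequence is located at the 3’ end of the primer II-A so as to initiate reverse transcription.
(2) A primer II-B is annealed or hybridized to the cDNA strand, the primer II-B comprising a consensus sequence B (CB), a unique molecular identifier (UMI) sequence and a sequence complementary to the 3’ overhang of the cDNA. Subsequently, the nucleic acid fragment hybridized or annealed to the primer II-B can be extended by a nucleic acid polymerase using the UMI sequence and the consensus sequence B as a template, thereby generating a nucleic acid molecule carrying at its 3’ end the complement of the UMI sequence and the complement of the consensus sequence B.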
通常情况下,所述共有序列B位于所述UMI序列的上游(例如5’端),所述与cDNA链的3’末端悬突互补的序列位于所述引物II-B的3’末端。
例如,当cDNA链的3’末端包含3个胞嘧啶核苷酸的悬突时,所述引物II-B可在其3’端包含GGG。此外,还可以对所述引物II-B的核苷酸进行修饰(例如,使用锁核酸),以增强所述引物II-B与cDNA链的3’末端悬突之间的互补配对。
不受理论限制,可以使用各种合适的核酸聚合酶(例如,DNA聚合酶或逆转录酶)来进行延伸反应,只要其能够以所述引物II-B的序列或其部分序列为模板延伸被捕获的核酸片段(逆转录产物)即可。在某些示例性实施方案中,可使用与前述逆转录步骤相同的逆转录酶来延伸被捕获的核酸片段(逆转录产物)。
在某些实施方案中,该步骤与步骤(1)同时进行(例如,在同一反应体系中进行)。
在某些实施方案中,所述方法任选地还包含步骤(3):加入RNaseH,消化RNA/cDNA杂合双链中的RNA链,形成cDNA单链。
在某些实施方案中,所述方法不包括所述步骤(3)。
通过上述示例性实施方案所制备的cDNA链的示例性结构包含:共有序列A,cDNA序列,3’末端悬突序列,UMI序列的互补序列,以及,共有序列B的互补序列。
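II. An exemplary scheme for labeling the 3’ end of the cDNA strand with the complement of the oligonucleotide probe (also called the chip sequence), so as to form a new nucleic acid molecule containing chip sequence information (that is, a nucleic acid molecule labeled by the chip sequence), comprises the following steps (as shown in FIG. 8):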
在某些实施方案中,所述芯片序列的共有序列X2或其部分序列能与上述步骤一中获得的cDNA链的所述共有序列B的互补序列或其部分序列退火。将该cDNA链与芯片序列退火或杂交,在聚合酶的作用下,形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)。
通过上述示例性实施方案所形成的含有芯片序列信息的新核酸分子的示例性结构包含:从5’端至3’端含有共有序列A,cDNA序列,3’末端悬突序列,UMI序列的互补序列,共有序列B的互补序列,标签序列Y的互补序列,以及,共有序列X1的互补序列的核酸链和/或其互补核酸链。
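The extension-based labeling described above, in which the 3’ end of the cDNA anneals to consensus X2 of the probe and is extended along the probe, can be sketched as follows. The model assumes that the 3’ end of the cDNA is exactly the complement of the whole of X2, which is only one of the cases the text allows; the name `extend_on_probe` is hypothetical.

```python
# Minimal sketch: label a cDNA strand by extension along the chip probe.
# Full-length, exact pairing with X2 and all names are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a 5'->3' DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extend_on_probe(cdna: str, probe: str, x2_len: int):
    """Return (first_strand, second_strand) after annealing the cDNA 3' end to X2.

    first_strand: cDNA extended with the complements of tag Y and consensus X1.
    second_strand: probe extended along the cDNA (the complementary strand).
    """
    x2 = probe[len(probe) - x2_len:]
    upstream = probe[: len(probe) - x2_len]  # consensus X1 followed by tag Y
    if not cdna.endswith(revcomp(x2)):
        raise ValueError("cDNA 3' end does not anneal to consensus X2")
    first_strand = cdna + revcomp(upstream)
    second_strand = revcomp(first_strand)
    return first_strand, second_strand
```

The first strand ends with the complement of tag Y followed by the complement of consensus X1, and the second strand begins with the intact probe sequence, in line with the two strand structures listed in the text.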
包括步骤(1)、步骤(2)(i)和步骤(3)(ii)的实施方案
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(i)和步骤(3)(ii);其中,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列退火;步骤(3)(ii)中获得的反应产物即为标记的核酸分子,其包含:含有所述待标记的第一核酸分子序列的第一链,和/或,含有所述寡核苷酸探针序列的第二链。
易于理解,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或所述共有序列B的互补序列的部分区段的核苷酸序列退火。
在某些实施方案中,所述第一链从5’端至3’端包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列,任选的所述桥接寡核苷酸II-II的第三区域的互补序列,所述桥接寡核苷酸II-I序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸II-I的第三区域的互补序列,所述桥接寡核苷酸II-II序列,所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列。
包括步骤(1)、步骤(2)(i)和步骤(3)(ii)的实施方案:一链
在某些实施方案中,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列(例如,3’端部分序列)退火,并且所述桥接寡核苷酸II-I的第二区域具有3’自由端。
在某些实施方案中,步骤(3)(ii)中获得的反应产物即为标记的核酸分子,其包含所述第一链。
在某些实施方案中,所述第一链从5’端至3’端包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列,任选的所述桥接寡核苷酸II-II的第三区域的互补序列,所述桥接寡核苷酸II-I序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,所述桥接寡核苷酸II-I的第二区域位于所述桥接寡核苷酸II-I的3’末端。
在某些实施方案中,所述桥接寡核苷酸II-I的第一区域位于所述桥接寡核苷酸II-I的5’末端。
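In certain embodiments, the bridging oligonucleotide II-I does not contain the third region, and/or the bridging oligonucleotide II-II does not contain the third region.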
在某些实施方案中,所述桥接寡核苷酸II-I的5’末端含有磷酸化修饰。
在某些实施方案中,所述桥接寡核苷酸II-I的3’末端含有自由-OH。
在某些实施方案中,步骤(3)(ii)中,所述桥接寡核苷酸II-II不能起始延伸反应(例如3’端是封闭的),和/或,所述寡核苷酸探针不能起始延伸反应(例如3’端是封闭的)。
在某些实施方案中,所述方法步骤(2)(i)(a)中,所述引物II-A的捕获序列A为随机寡核苷酸序列。
在某些实施方案中,所述方法步骤(2)(i)(b)中所述的第一延伸产物从5’端至3’端依次包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述第一链从5’端至3’端包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列,任选的所述桥接寡核苷酸II-II的第三区域的互补序列,所述桥接寡核苷酸II-I序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,步骤(2)(i)(a)中,所述引物II-A的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物II-A还含有共有序列A,以及任选的标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述捕获序列A位于所述引物II-A的3’端。
在某些实施方案中,步骤(2)(i)(b)中所述第一延伸产物从5’端至3’端依次包含:所述共有序列A,任选的所述标签序列A,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述第一链从5’端至3’端包含:所述共有序列A,任选的所述标签序列A,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列,任选的所述桥接寡核苷酸II-II的第三区域的互补序列,所述桥接寡核苷酸II-I序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
易于理解,步骤(3)(ii)中,在所述桥接寡核苷酸II-I、桥接寡核苷酸II-II与所述寡核苷酸探针以及所述寡核苷酸探针对应位置的待标记的第一核酸分子退火之后,将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,和/或,将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的连接反应过程与步骤(3)(ii)中所述的延伸反应可以任意顺序进行,只要能获得带有位置标记的第二核酸分子即可。
例如,当所述连接反应与所述延伸反应在相同体系中进行,可通过将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接,并以所述桥接寡核苷酸II-I起始延伸反应,获得所述第一链。在该种情况下,所述用于延伸反应的聚合酶优选不具有链置换活性或5'至3'外切活性。
例如,当所述连接反应与所述延伸反应在不同体系中进行,并且,先进行所述连接反应,后进行所述延伸反应。在该种情况下,所述第一链可以通过以下示例性方式获得:
(A)将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接,并以所述桥接寡核苷酸II-I起始延伸反应,获得所述第一链;其中,所述用于延伸反应的聚合酶优选具有或者不具有链置换活性或5'至3'外切活性;
或,
(B)将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,并以所述待标记的第一核酸分子起始延伸反应,获得所述第一链;其中,所述用于延伸反应的聚合酶优选具有链置换活性或5'至3'外切活性。
例如,当所述连接反应与所述延伸反应在不同体系中进行,并且,先进行所述延伸反应,后进行所述连接反应。在该种情况下,可通过以所述桥接寡核苷酸II-I起始延伸反应,再将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接获得所述第一链。在该种情况下,所述用于延伸反应的聚合酶优选不具有链置换活性或5'至3'外切活性。
包括步骤(1)、步骤(2)(i)和步骤(3)(ii)的实施方案:二链
在某些实施方案中,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(i)获得的第一延伸产物的所述共有序列B互补序列或其部分序列退火,并且所述桥接寡核苷酸II-II的第二区域具有3’自由端。
在某些实施方案中,步骤(3)(ii)中获得的反应产物即为标记的核酸分子,其包含所述第二链。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸II-I的第三区域的互补序列,所述桥接寡核苷酸II-II序列,所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列。
在某些实施方案中,所述桥接寡核苷酸II-II的第二区域位于所述桥接寡核苷酸II-II的3’末端。
在某些实施方案中,所述桥接寡核苷酸II-II的第一区域位于所述桥接寡核苷酸II-II的5’末端。
在某些实施方案中,所述桥接寡核苷酸II-I不含有所述第三区域,和/或,所述桥接寡核苷酸II-II不含有所述第三区域。
在某些实施方案中,所述桥接寡核苷酸II-II的5’末端含有磷酸化修饰。
在某些实施方案中,所述桥接寡核苷酸II-II的3’末端含有自由-OH。
在某些实施方案中,步骤(3)(ii)中,所述桥接寡核苷酸II-I不能起始延伸反应(例如3’端是封闭的),和/或,步骤(2)(i)获得的第一延伸产物不能起始延伸反应(例如3’端是封闭的)。
在某些实施方案中,步骤(2)(i)(a)中,所述引物II-A的捕获序列A为随机寡核苷酸序列。
在某些实施方案中,步骤(2)(i)(b)中所述的第一延伸产物从5’端至3’端依次包含:以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸II-I的第三区域的互补序列,所述桥接寡核苷酸II-II序列,所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列。
在某些实施方案中,步骤(2)(i)(a)中,所述引物II-A的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物II-A还含有共有序列A,以及任选的标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述捕获序列A位于所述引物II-A的3’端。
在某些实施方案中,步骤(2)(i)(b)中所述第一延伸产物从5’端至3’端依次包含:所述共有序列A,任选的所述标签序列A,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸II-I的第三区域的互补序列,所述桥接寡核苷酸II-II序列,所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,任选的所述标签序列A的互补序列,所述共有序列A的互补序列。
易于理解,步骤(3)(ii)中,在所述桥接寡核苷酸II-I、桥接寡核苷酸II-II与所述寡核苷酸探针以及所述寡核苷酸探针对应位置的待标记的第一核酸分子退火之后,将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,和/或,将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的连接反应过程与步骤(3)(ii)中所述的延伸反应可以任意顺序进行,只要能获得带有位置标记的第二核酸分子即可。
例如,当所述连接反应与所述延伸反应在相同体系中进行,可通过将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,并以所述桥接寡核苷酸II-II起始延伸反应,获得所述第二链。在该种情况下,所述用于延伸反应的聚合酶优选不具有链置换活性或5'至3'外切活性。
例如,当所述连接反应与所述延伸反应在不同体系中进行,并且,先进行所述连接反应,后进行所述延伸反应。在该种情况下,所述第二链可以通过以下示例性方式获得:
(A)将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,并以所述桥接寡核苷酸II-II起始延伸反应,获得所述第二链;其中,所述用于延伸反应的聚合酶优选具有或者不具有链置换活性或5'至3'外切活性;
或,
(B)将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接,并以所述寡核苷酸探针起始延伸反应,获得所述第二链;其中,所述用于延伸反应的聚合酶优选具有链置换活性或5'至3'外切活性。
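For example, the ligation reaction and the extension reaction are performed in different systems, with the extension reaction performed first and the ligation reaction afterwards. In this case, the second strand can be obtained by initiating the extension reaction from the bridging oligonucleotide II-II and then ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-I. In this case, the polymerase used for the extension reaction preferably has no strand displacement activity or 5’-to-3’ exonuclease activity.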
本申请的包含步骤(1)、步骤(2)(i)和步骤(3)(ii)的一个示例性实施方案详细描述如下:
一、以样本中的RNA(例如mRNA)为模板制备cDNA链的示例性方案包含以下步骤(如图6所示):
(1)用逆转录酶(例如,具有末端转移活性的逆转录酶)和引物II-A对透化的样本中的RNA分子(例如,mRNA分子)进行逆转录,以生成cDNA,并在cDNA的3’端添加悬突(例如,包含3个胞嘧啶核苷酸的悬突)。可使用各种具有末端转移活性的逆转录酶来进行逆转录反应。在某些优选的实施方案中,所使用的逆转录酶不具有RNaseH活性。
在某些实施方案中,所述引物II-A包含poly(T)序列以及共有序列A(CA)。通常情况下,poly(T)序列位于所述引物II-A的3’末端,以便起始逆转录。
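In certain embodiments, the primer II-A comprises a random oligonucleotide sequence, which can be used to capture RNA lacking a poly(A) tail. Typically, the random oligonucleotide sequence is located at the 3’ end of the primer II-A so as to initiate reverse transcription.
(2) A primer II-B is annealed or hybridized to the cDNA strand, the primer II-B comprising a consensus sequence B (CB), a unique molecular identifier (UMI) sequence and a sequence complementary to the 3’ overhang of the cDNA. Subsequently, the nucleic acid fragment hybridized or annealed to the primer II-B can be extended by a nucleic acid polymerase using the UMI sequence and the consensus sequence B as a template, thereby generating a nucleic acid molecule carrying at its 3’ end the complement of the UMI sequence and the complement of the consensus sequence B.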
通常情况下,所述共有序列B位于所述UMI序列的上游(例如5’端),所述与cDNA链的3’末端悬突互补的序列位于所述引物II-B的3’末端。
例如,当cDNA链的3’末端包含3个胞嘧啶核苷酸的悬突时,所述引物II-B可在其3’端包含GGG。此外,还可以对所述引物II-B的核苷酸进行修饰(例如,使用锁核酸),以增强所述引物II-B与cDNA链的3’末端悬突之间的互补配对。
不受理论限制,可以使用各种合适的核酸聚合酶(例如,DNA聚合酶或逆转录酶)来进行延伸反应,只要其能够以所述引物II-B的序列或其部分序列为模板延伸被捕获的核酸片段(逆转录产物)即可。在某些示例性实施方案中,可使用与前述逆转录步骤相同的逆转录酶来延伸被捕获的核酸片段(逆转录产物)。
在某些实施方案中,该步骤与步骤(1)同时进行(例如,在同一反应体系中进行)。
在某些实施方案中,所述方法任选地还包含步骤(3):加入RNaseH,消化RNA/cDNA杂合双链中的RNA链,形成cDNA单链。
在某些实施方案中,所述方法不包括所述步骤(3)。
通过上述示例性实施方案所制备的cDNA链的示例性结构包含:共有序列A,cDNA序列,3’末端悬突序列,UMI序列的互补序列,以及,共有序列B的互补序列。
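II. An exemplary scheme for labeling the 3’ end of the cDNA strand with the complement of the oligonucleotide probe (also called the chip sequence), so as to form a new nucleic acid molecule containing chip sequence information (that is, a nucleic acid molecule labeled by the chip sequence), comprises the following steps (as shown in FIG. 7):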
提供由桥接寡核苷酸II-I和桥接寡核苷酸II-II组成的桥接寡核苷酸对,其中,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域(P1)和第二区域(P2),所述第一区域位于所述第二区域的上游(例如5’端);其中,
所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;
所述桥接寡核苷酸II-II的第二区域能与上述步骤一中获得的cDNA链中的所述共有序列B的互补序列或其部分序列退火。
在某些实施方案中,所述桥接寡核苷酸II-I中第一区域和第二区域之间包含间隔核苷酸,例如1-5nt或5-10nt的间隔核苷酸,即所述桥接寡核苷酸II-I序列含有位于第一区域与第二区域之间的第三区域。在某些优选的实施方案中,所述桥接寡核苷酸II-I中第一区域和第二区域是相邻连接的,二者之间没有多余核苷酸,即所述桥接寡核苷酸II-I序列不含有位于第一区域与第二区域之间的第三区域。
在某些实施方案中,所述桥接寡核苷酸II-II中第一区域和第二区域之间包含间隔核苷酸,例如1-5nt或5-10nt的间隔核苷酸,即所述桥接寡核苷酸II-II序列含有位于第一区域与第二区域之间的第三区域。在某些优选的实施方案中,所述桥接寡核苷酸II-II中第一区域和第二区域是相邻连接的,二者之间没有多余核苷酸,即所述桥接寡核苷酸II-II序列不含有位于第一区域与第二区域之间的第三区域。
将该桥接寡核苷酸II-I、桥接寡核苷酸II-II和芯片序列和上述步骤一获得的cDNA链退火或杂交,之后通过DNA连接酶将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,和/或,将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接。并且,在DNA聚合酶的作用下,形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)。所述连接过程和聚合过程以任意顺序进行。
通过上述示例性实施方案所形成的含有芯片序列信息的新核酸分子的示例性结构包含:从5’端至3’端含有共有序列A,cDNA序列,3’末端悬突序列,UMI序列的互补序列,共有序列B的互补序列,桥接寡核苷酸II-I序列,标签序列Y的互补序列,以及共有序列X1的互补序列的核酸链和/或其互补核酸链。
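The bridging oligonucleotide pair can be modeled in the same string-based way to check the strand order stated above. The sketch covers only the preferred case without third regions, treats annealing as exact reverse complementarity, and assumes both bridges use first regions of the same length; `label_with_bridge_pair` is a hypothetical name.

```python
# Minimal sketch: label a cDNA via the bridging pair II-I / II-II (no third regions).
# Exact pairing, equal first-region lengths and all names are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a 5'->3' DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def label_with_bridge_pair(cdna, probe, bridge_i, bridge_ii, p1_len, x2_len):
    """Ligate the cDNA 3' end to bridge II-I (templated by bridge II-II),
    then extend bridge II-I along the probe through tag Y and consensus X1."""
    p1_i, p2_i = bridge_i[:p1_len], bridge_i[p1_len:]
    p1_ii, p2_ii = bridge_ii[:p1_len], bridge_ii[p1_len:]
    x2 = probe[len(probe) - x2_len:]
    if p1_i != revcomp(p1_ii):
        raise ValueError("first regions of II-I and II-II do not anneal")
    if p2_i != revcomp(x2):
        raise ValueError("second region of II-I does not anneal to consensus X2")
    if not cdna.endswith(revcomp(p2_ii)):
        raise ValueError("second region of II-II does not anneal to the cDNA 3' end")
    ligated = cdna + bridge_i  # ligation step
    return ligated + revcomp(probe[: len(probe) - x2_len])  # extension step
```

The result reads, from 5’ to 3’, the cDNA strand ending in the complement of consensus B, the bridging oligonucleotide II-I sequence, the complement of tag Y, and the complement of consensus X1, which is the order given for the labeled strand.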
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(ii)和步骤(3)。在某些实施方案中,步骤(2)(ii)(b)中,所述第一延伸产物从5’端至3’端包含:所述共有序列A,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,任选的所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,步骤(2)(ii)(c)中,所述延伸引物为所述引物II-B’或引物B”,其中,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,并且能起始延伸反应。
在某些实施方案中,步骤(2)(ii)(c)中,所述第二延伸产物从5’端至3’端包含:以所述延伸引物延伸形成的与所述cDNA序列互补的序列,所述共有序列A的互补序列。
包括步骤(1)、步骤(2)(ii)和步骤(3)(i)的实施方案
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(ii)和步骤(3)(i);其中,所述共有序列X2或其部分序列能与所述共有序列A的互补序列或其部分序列退火;步骤(3)(i)中获得的延伸产物即为标记的核酸分子,其包含:含有所述待标记的第一核酸分子序列的第一链,和/或,含有所述寡核苷酸探针序列的第二链。
易于理解,所述共有序列X2可以以其整体的核苷酸序列与所述共有序列A的互补序列或所述共有序列A的互补序列的部分区段的核苷酸序列退火,所述共有序列X2也可以以其部分区段的核苷酸序列与所述共有序列A的互补序列或所述共有序列A的互补序列的部分区段的核苷酸序列退火。
在某些实施方案中,所述第一链从5’端至3’端包含:所述待标记的第一核酸分子序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,与所述待标记的第一核酸分子序列互补的cDNA序列。
包括步骤(1)、步骤(2)(ii)和步骤(3)(i)的实施方案:一链
在某些实施方案中,所述共有序列X2或其部分序列能与所述共有序列A的互补序列或其部分序列(例如,3’端部分序列)退火;步骤(3)(i)中获得的延伸产物即为标记的核酸分子,其包含含有所述待标记的第一核酸分子序列的第一链。
在某些实施方案中,步骤(3)(i)中,所述寡核苷酸探针不能起始延伸反应(例如3’端是封闭的)。
在某些实施方案中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为随机寡核苷酸序列。
在某些实施方案中,步骤(2)(ii)(c)中,所述延伸引物为所述引物II-B’。在某些实施方案中,步骤(2)(ii)(c)中,所述第二延伸产物从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述共有序列A的互补序列。在某些实施方案中,所述第一链从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述共有序列A的互补序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第一链具有不同的捕获序列A的互补序列作为UMI。
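In certain embodiments, in step (2)(ii)(a), the capture sequence A of the primer II-A’ is a poly(T) sequence or a specific sequence directed against a particular target nucleic acid.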
在某些实施方案中,所述引物II-A’还含有标签序列A,例如为随机寡核苷酸序列。
In certain embodiments, the capture sequence A is located at the 3’ end of the primer II-A’.
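In certain embodiments, in step (2)(ii)(c), the extension primer is the primer II-B’. In certain embodiments, in step (2)(ii)(c), the second extension product comprises, from the 5’ end to the 3’ end: the consensus sequence B, optionally the tag sequence B, the complement of the 3’ overhang sequence, the complement of the cDNA sequence complementary to the RNA formed using the primer II-A’ as the reverse transcription primer, the complement of the tag sequence A, and the complement of the consensus sequence A. In certain embodiments, the first strand comprises, from the 5’ end to the 3’ end: the consensus sequence B, optionally the tag sequence B, the complement of the 3’ overhang sequence, the complement of the cDNA sequence complementary to the RNA formed using the primer II-A’ as the reverse transcription primer, the complement of the tag sequence A, the complement of the consensus sequence A, the complement of the tag sequence Y, and the complement of the consensus sequence X1.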
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第一链具有不同的标签序列A的互补序列作为UMI。
包括步骤(1)、步骤(2)(ii)和步骤(3)(i)的实施方案:二链
在某些实施方案中,所述共有序列X2或其部分序列(例如,3’端部分序列)能与所述共有序列A的互补序列或其部分序列退火;步骤(3)(i)中获得的延伸产物即为标记的核酸分子,其包含含有所述寡核苷酸探针序列的第二链。
在某些实施方案中,步骤(2)(ii)获得的第二延伸产物不能起始延伸反应(例如3’端是封闭的)。
在某些实施方案中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为随机寡核苷酸序列。
在某些实施方案中,步骤(2)(ii)(c)中,所述延伸引物为所述引物II-B’。在某些实施方案中,步骤(2)(ii)(c)中,所述第二延伸产物从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述共有序列A的互补序列。在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,与所述待标记的第一核酸分子序列互补的cDNA序列,所述3’末端悬突序列,任选的所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第二链具有不同的捕获序列A作为UMI。
在某些实施方案中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物II-A’还含有标签序列A,例如为随机寡核苷酸序列。
In certain embodiments, the capture sequence A is located at the 3’ end of the primer II-A’.
在某些实施方案中,步骤(2)(ii)(c)中,所述延伸引物为所述引物II-B’。在某些实施方案中,步骤(2)(ii)(c)中,所述第二延伸产物从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述标签序列A的互补序列,所述共有序列A的互补序列。在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,所述标签序列A,与所述待标记的第一核酸分子序列互补的cDNA序列,所述3’末端悬突序列,任选的所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第二链具有不同的标签序列A作为UMI。
本申请的包含步骤(1)、步骤(2)(ii)和步骤(3)(i)的一个示例性实施方案详细描述如下:
一、以样本中的RNA(例如mRNA)为模板制备3’端含有UMI的互补序列的cDNA链互补链的示例性方案包含以下步骤(如图9所示):
(1)用逆转录酶(例如,具有末端转移活性的逆转录酶)和引物II-A’对透化的样本中的RNA分子(例如,mRNA分子)进行逆转录,以生成cDNA,并在cDNA的3’端添加悬突(例如,包含3个胞嘧啶核苷酸的悬突)。可使用各种具有末端转移活性的逆转录酶来进行逆转录反应。在某些优选的实施方案中,所使用的逆转录酶不具有RNaseH活性。
在某些实施方案中,所述引物II-A’包含poly(T)序列,UMI序列,以及共有序列A(CA)。通常情况下,poly(T)序列位于所述引物II-A’的3’末端以便起始逆转录,所述共有序列A位于所述UMI序列的上游(例如5’端)。
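In certain embodiments, the primer II-A’ comprises a random oligonucleotide sequence and the consensus sequence A, and can be used to capture RNA lacking a poly(A) tail. Typically, the random oligonucleotide sequence is located at the 3’ end of the primer II-A’ so as to initiate reverse transcription.
(2) A primer II-B’ is annealed or hybridized to the cDNA strand, the primer II-B’ comprising a consensus sequence B (CB) and a sequence complementary to the 3’ overhang of the cDNA. Subsequently, the nucleic acid fragment hybridized or annealed to the primer II-B’ can be extended by a nucleic acid polymerase using the consensus sequence B as a template, adding the complement of the consensus sequence B (c(CB)) to the 3’ end of the cDNA strand, thereby generating a nucleic acid molecule carrying the complement of the consensus sequence B at its 3’ end.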
通常情况下,所述与cDNA链的3’末端悬突互补的序列位于所述引物II-B’的3’末端。
例如,当cDNA链的3’末端包含3个胞嘧啶核苷酸的悬突时,所述引物II-B’可在其3’端包含GGG。此外,还可以对所述引物II-B’的核苷酸进行修饰(例如,使用锁核酸),以增强所述引物II-B’与cDNA链的3’末端悬突之间的互补配对。
不受理论限制,可以使用各种合适的核酸聚合酶(例如,DNA聚合酶或逆转录酶)来进行延伸反应,只要其能够以所述引物II-B’的序列或其部分序列为模板延伸被捕获的核酸片段(逆转录产物)即可。在某些示例性实施方案中,可使用与前述逆转录步骤相同的逆转录酶来延伸被捕获的核酸片段(逆转录产物)。
在某些实施方案中,该步骤与步骤(1)同时进行(例如,在同一反应体系中进行)。
在某些实施方案中,所述方法任选地还包含步骤(3):加入RNaseH,消化RNA/cDNA杂合双链中的RNA链,形成cDNA单链。
在某些实施方案中,所述方法不包括所述步骤(3)。
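(4) An extension primer is used to perform an extension reaction with the cDNA strand obtained in the preceding step as a template, yielding an extension product; the extension primer is the primer II-B’, or a primer B”, the primer B” being capable of annealing to the complement of the consensus sequence B or a partial sequence thereof and of initiating an extension reaction.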
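The exemplary structure of the complementary strand of the cDNA strand prepared by the above exemplary embodiment comprises: the consensus sequence B, the complement of the 3’ overhang, the complement of the cDNA sequence, the complement of the UMI sequence, and the complement of the consensus sequence A.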
二、用寡核苷酸探针(也称,芯片序列)的互补序列标记cDNA链互补链的3’端,以形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案包含以下步骤(如图11所示):
在某些实施方案中,所述芯片序列的共有序列X2或其部分序列能与上述步骤一中获得的cDNA链互补链的所述共有序列A的互补序列或其部分序列退火。将该cDNA链互补链与芯片序列退火或杂交,在聚合酶的作用下,形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)。
通过上述示例性实施方案所形成的含有芯片序列信息的新核酸分子的示例性结构包含:从5’端至3’端含有所述共有序列B,3’末端悬突的互补序列,cDNA序列的互补序列,所述UMI序列的互补序列,所述共有序列A的互补序列,所述标签序列Y的互补序列,以及所述共有序列X1的互补序列的核酸链和/或其互补核酸链。
包括步骤(1)、步骤(2)(ii)和步骤(3)(ii)的实施方案
在某些实施方案中,所述方法包括步骤(1)、步骤(2)(ii)和步骤(3)(ii);其中,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(ii)获得的第二延伸产物的共有序列A的互补序列或其部分序列退火;步骤(3)(ii)中获得的反应产物即为标记的核酸分子,其包含:含有所述待标记的第一核酸分子序列的第一链,和/或,含有所述寡核苷酸探针序列的第二链。
易于理解,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(ii)获得的第二延伸产物的共有序列A的互补序列或所述共有序列A的互补序列的部分区段的核苷酸序列退火。
在某些实施方案中,所述第一链从5’端至3’端包含:所述待标记的第一核酸分子序列,任选的所述桥接寡核苷酸II-II的第三区域的互补序列,所述桥接寡核苷酸II-I序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸II-I的第三区域的互补序列,所述桥接寡核苷酸II-II序列,与所述待标记的第一核酸分子序列互补的cDNA序列。
包括步骤(1)、步骤(2)(ii)和步骤(3)(ii)的实施方案:一链
在某些实施方案中,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(ii)获得的第二延伸产物的所述共有序列A的互补序列或其3’端部分序列退火,并且所述桥接寡核苷酸II-I的第二区域具有3’自由端。
在某些实施方案中,步骤(3)(ii)中获得的反应产物即为标记的核酸分子,其包含所述第一链。
在某些实施方案中,所述桥接寡核苷酸II-I的第二区域位于所述桥接寡核苷酸II-I的3’末端。
在某些实施方案中,所述桥接寡核苷酸II-I的第一区域位于所述桥接寡核苷酸II-I的5’末端。在某些实施方案中,所述桥接寡核苷酸II-I不含有所述第三区域,和/或,所述桥接寡核苷酸II-II不含有所述第三区域。
在某些实施方案中,所述桥接寡核苷酸II-I的5’末端含有磷酸化修饰。
在某些实施方案中,所述桥接寡核苷酸II-I的3’末端含有自由-OH。
在某些实施方案中,步骤(3)(ii)中,所述桥接寡核苷酸II-II不能起始延伸反应(例如3’端是封闭的),和/或,所述寡核苷酸探针不能起始延伸反应(例如3’端是封闭的)。
在某些实施方案中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为随机寡核苷酸序列。
在某些实施方案中,步骤(2)(ii)(c)中,所述延伸引物为所述引物II-B’。在某些实施方案中,步骤(2)(ii)(c)中,所述第二延伸产物从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述共有序列A的互补序列。在某些实施方案中,所述第一链从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述共有序列A的互补序列,任选的所述桥接寡核苷酸II-II的第三区域的互补序列,所述桥接寡核苷酸II-I序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第一链具有不同的捕获序列A的互补序列作为UMI。
在某些实施方案中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物II-A’还含有标签序列A,例如为随机寡核苷酸序列。
In certain embodiments, the capture sequence A is located at the 3’ end of the primer II-A’.
在某些实施方案中,步骤(2)(ii)(c)中,所述延伸引物为所述引物II-B’。在某些实施方案中,步骤(2)(ii)(c)中,所述第二延伸产物从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述标签序列A的互补序列,所述共有序列A的互补序列。在某些实施方案中,所述第一链从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述标签序列A的互补序列,所述共有序列A的互补序列,任选的所述桥接寡核苷酸II-II的第三区域的互补序列,所述桥接寡核苷酸II-I序列,所述标签序列Y的互补序列,所述共有序列X1的互补序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第一链具有不同的标签序列A的互补序列作为UMI。
易于理解,步骤(3)(ii)中,在所述桥接寡核苷酸II-I、桥接寡核苷酸II-II与所述寡核苷酸探针以及所述寡核苷酸探针对应位置的待标记的第一核酸分子退火之后,将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,和/或,将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的连接反应过程与步骤(3)(ii)中所述的延伸反应可以任意顺序进行,只要能获得带有位置标记的第二核酸分子即可。
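For example, when the ligation reaction and the extension reaction are performed in the same system, the first strand can be obtained by ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II and initiating the extension reaction from the bridging oligonucleotide II-I. In this case, the polymerase used for the extension reaction preferably has no strand displacement activity or 5’-to-3’ exonuclease activity.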
例如,当所述连接反应与所述延伸反应在不同体系中进行,并且,先进行所述连接反应,后进行所述延伸反应。在该种情况下,所述第一链可以通过以下示例性方式获得:
(A)将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接,并以所述桥接寡核苷酸II-I起始延伸反应,获得所述第一链;其中,所述用于延伸反应的聚合酶优选具有或者不具有链置换活性或5'至3'外切活性;
或,
(B)将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,并以所述待标记的第一核酸分子起始延伸反应,获得所述第一链;其中,所述用于延伸反应的聚合酶优选具有链置换活性或5'至3'外切活性。
例如,当所述连接反应与所述延伸反应在不同体系中进行,并且,先进行所述延伸反应,后进行所述连接反应。在该种情况下,可通过以所述桥接寡核苷酸II-I起始延伸反应,再将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接获得所述第一链。在该种情况下,所述用于延伸反应的聚合酶优选不具有链置换活性或5'至3'外切活性。
包括步骤(1)、步骤(2)(ii)和步骤(3)(ii)的实施方案:二链
在某些实施方案中,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(ii)获得的第二延伸产物的所述共有序列A的互补序列或其部分序列退火,并且所述桥接寡核苷酸II-II的第二区域具有3’自由端。
在某些实施方案中,步骤(3)(ii)中获得的反应产物即为标记的核酸分子,其包含所述第二链。
在某些实施方案中,所述桥接寡核苷酸II-II的第二区域位于所述桥接寡核苷酸II-II的3’末端。
在某些实施方案中,所述桥接寡核苷酸II-II的第一区域位于所述桥接寡核苷酸II-II的5’末端。
在某些实施方案中,所述桥接寡核苷酸II-I不含有所述第三区域,和/或,所述桥接寡核苷酸II-II不含有所述第三区域。
在某些实施方案中,所述桥接寡核苷酸II-II的5’末端含有磷酸化修饰。
在某些实施方案中,所述桥接寡核苷酸II-II的3’末端含有自由-OH。
在某些实施方案中,步骤(3)(ii)中,所述桥接寡核苷酸II-I不能起始延伸反应(例如3’端是封闭的),和/或,步骤(2)(ii)获得的第二延伸产物不能起始延伸反应(例如3’端是封闭的)。
在某些实施方案中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为随机寡核苷酸序列。
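In certain embodiments, in step (2)(ii)(c), the extension primer is the primer II-B’. In certain embodiments, in step (2)(ii)(c), the second extension product comprises, from the 5’ end to the 3’ end: the consensus sequence B, optionally the tag sequence B, the complement of the 3’ overhang sequence, the complement of the cDNA sequence complementary to the RNA formed using the primer II-A’ as the reverse transcription primer, and the complement of the consensus sequence A. In certain embodiments, the second strand comprises, from the 5’ end to the 3’ end: the consensus sequence X1, the tag sequence Y, the consensus sequence X2, optionally the complement of the third region of the bridging oligonucleotide II-I, the bridging oligonucleotide II-II sequence, the cDNA sequence complementary to the sequence of the first nucleic acid molecule to be labeled, the 3’ overhang sequence, optionally the complement of the tag sequence B, and the complement of the consensus sequence B.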
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第二链具有不同的捕获序列A作为UMI。
在某些实施方案中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列。
在某些实施方案中,所述引物II-A’还含有标签序列A,例如为随机寡核苷酸序列。
In certain embodiments, the capture sequence A is located at the 3’ end of the primer II-A’.
在某些实施方案中,步骤(2)(ii)(c)中,所述延伸引物为所述引物II-B’。在某些实施方案中,步骤(2)(ii)(c)中,所述第二延伸产物从5’端至3’端包含:所述共有序列B,任选的所述标签序列B,所述3’末端悬突序列的互补序列,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列的互补序列,所述标签序列A的互补序列,所述共有序列A的互补序列。在某些实施方案中,所述第二链从5’端至3’端包含:所述共有序列X1,所述标签序列Y,所述共有序列X2,任选的所述桥接寡核苷酸II-I的第三区域的互补序列,所述桥接寡核苷酸II-II序列,所述标签序列A,与所述待标记的第一核酸分子序列互补的cDNA序列,所述3’末端悬突序列,任选的所述标签序列B的互补序列,所述共有序列B的互补序列。
在某些实施方案中,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第二链具有不同的标签序列A作为UMI。
易于理解,步骤(3)(ii)中,在所述桥接寡核苷酸II-I、桥接寡核苷酸II-II与所述寡核苷酸探针以及所述寡核苷酸探针对应位置的待标记的第一核酸分子退火之后,将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,和/或,将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的连接反应过程与步骤(3)(ii)中所述的延伸反应可以任意顺序进行,只要能获得带有位置标记的第二核酸分子即可。
例如,当所述连接反应与所述延伸反应在相同体系中进行,可通过将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,并以所述桥接寡核苷酸II-II起始延伸反应,获得所述第二链。在该种情况下,所述用于延伸反应的聚合酶优选不具有链置换活性或5'至3'外切活性。
例如,当所述连接反应与所述延伸反应在不同体系中进行,并且,先进行所述连接反应,后进行所述延伸反应。在该种情况下,所述第二链可以通过以下示例性方式获得:
(A)将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,并以所述桥接寡核苷酸II-II起始延伸反应,获得所述第二链;其中,所述用于延伸反应的聚合酶优选具有或者不具有链置换活性或5'至3'外切活性;
或,
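(B) ligating the nucleic acid molecules hybridized to the first region and the second region of the same bridging oligonucleotide II-II, and initiating the extension reaction from the oligonucleotide probe, to obtain the second strand; wherein the polymerase used for the extension reaction preferably has strand displacement activity or 5’-to-3’ exonuclease activity.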
例如,当所述连接反应与所述延伸反应在不同体系中进行,并且,先进行所述延伸反应,后进行所述连接反应。在该种情况下,可通过以所述桥接寡核苷酸II-II起始延伸反应,再将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接获得所述第二链。在该种情况下,所述用于延伸反应的聚合酶优选不具有链置换活性或5'至3'外切活性。
本申请的包含步骤(1)、步骤(2)(ii)和步骤(3)(ii)的一个示例性实施方案详细描述如下:
一、以样本中的RNA(例如mRNA)为模板制备cDNA链互补链的示例性方案包含以下步骤(如图9所示):
(1)用逆转录酶(例如,具有末端转移活性的逆转录酶)和引物II-A’对透化的样本中的RNA分子(例如,mRNA分子)进行逆转录,以生成cDNA,并在cDNA的3’端添加悬突(例如,包含3个胞嘧啶核苷酸的悬突)。可使用各种具有末端转移活性的逆转录酶来进行逆转录反应。在某些优选的实施方案中,所使用的逆转录酶不具有RNaseH活性。
在某些实施方案中,所述引物II-A’包含poly(T)序列,UMI序列,以及共有序列A(CA)。通常情况下,poly(T)序列位于所述引物II-A’的3’末端以便起始逆转录,所述共有序列A位于所述UMI序列的上游(例如5’端)。
在某些实施方案中,所述引物II-A’包含随机寡核苷酸序列以及共有序列A,可用于捕获无poly(A)尾的RNA。通常情况下,所述随机寡核苷酸序列位于所述引物II-A’的3’末端,以便起始逆转录。
(2)使用引物II-B’与cDNA链进行退火或杂交,所述引物II-B’包含共有序列B(CB)、以及所述cDNA的3’端悬突的互补序列。随后,与所述引物II-B’杂交或退火的核酸片段在核酸聚合酶的作用下,可以以所述共有序列B为模板进行延伸,在cDNA链3’末端添加所述共有序列B的互补序列(c(CB)),从而生成3’端携带所述共有序列B的互补序列的核酸分子。
通常情况下,所述与cDNA链的3’末端悬突互补的序列位于所述引物II-B’的3’末端。
例如,当cDNA链的3’末端包含3个胞嘧啶核苷酸的悬突时,所述引物II-B’可在其3’端包含GGG。此外,还可以对所述引物II-B’的核苷酸进行修饰(例如,使用锁核酸),以增强所述引物II-B’与cDNA链的3’末端悬突之间的互补配对。
不受理论限制,可以使用各种合适的核酸聚合酶(例如,DNA聚合酶或逆转录酶)来进行延伸反应,只要其能够以所述引物II-B’的序列或其部分序列为模板延伸被捕获的核酸片段(逆转录产物)即可。在某些示例性实施方案中,可使用与前述逆转录步骤相同的逆转录酶来延伸被捕获的核酸片段(逆转录产物)。
在某些实施方案中,该步骤与步骤(1)同时进行(例如,在同一反应体系中进行)。
在某些实施方案中,所述方法任选地还包含步骤(3):加入RNaseH,消化RNA/cDNA杂合双链中的RNA链,形成cDNA单链。
在某些实施方案中,所述方法不包括所述步骤(3)。
(4)使用延伸引物,以前一步骤获得的cDNA链为模板进行延伸反应,获得延伸产物;所述延伸引物为所述引物II-B’,或者引物B”,所述引物B”能与所述共有序列B或其部分序列退火,且能起始延伸反应。
通过上述示例性实施方案所制备的cDNA链互补链的示例性结构包含:共有序列B,3’末端悬突的互补序列,cDNA序列的互补序列,UMI序列的互补序列,以及共有序列A的互补序列。
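上述各步骤中序列区段的排列顺序可以用下面的简短草图来核对。这只是一个示意,并非本申请的实际实现:其中的 CA、UMI、CB 和转录本序列均为虚构的假设值,RNA 以 DNA 字母表的正义链形式书写,并假设悬突恰为 CCC。

```python
# 假设性示例:所有序列均为虚构,仅用于演示各区段的排列顺序
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """返回序列的反向互补序列(均按5'到3'方向书写)。"""
    return seq.translate(COMPLEMENT)[::-1]


CA = "ACGTTGCAGC"           # 共有序列A(假设值)
UMI = "AGCTTGCAAT"          # UMI序列(假设值,实际为随机序列)
POLY_T = "T" * 20           # poly(T)捕获序列(长度为假设值)
CB = "GGATCCGATT"           # 共有序列B(假设值)
MRNA_BODY = "ATGGCCAAGTTC"  # 转录本主体,以DNA字母表的正义链表示(假设值)


def build_strands():
    # 步骤(1):引物II-A’(CA-UMI-polyT)起始逆转录,并在cDNA的3’端添加CCC悬突
    primer_a = CA + UMI + POLY_T
    cdna = primer_a + revcomp(MRNA_BODY) + "CCC"
    # 步骤(2):以引物II-B’(CB-GGG)的CB为模板延伸,cDNA的3’端带上c(CB)
    first_ext = cdna + revcomp(CB)
    # 步骤(4):以延伸后的cDNA链为模板合成其互补链
    second_ext = revcomp(first_ext)
    return cdna, first_ext, second_ext


if __name__ == "__main__":
    _, _, second = build_strands()
    print(second)
```

按此草图得到的互补链从5’端到3’端依次为:CB、悬突的互补序列(GGG)、cDNA序列的互补序列(含poly(A))、UMI的互补序列、CA的互补序列,与上文列出的示例性结构一致。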
二、用寡核苷酸探针(也称,芯片序列)的互补序列标记cDNA链互补链的3’端,以形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案包含以下步骤(如图10所示):
提供由桥接寡核苷酸II-I和桥接寡核苷酸II-II组成的桥接寡核苷酸对,其中,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域(P1)和第二区域(P2),所述第一区域位于所述第二区域的上游(例如5’端);其中,
所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;
所述桥接寡核苷酸II-II的第二区域能与上述步骤一中获得的cDNA链互补链中的所述共有序列A的互补序列或其部分序列退火。
在某些实施方案中,所述桥接寡核苷酸II-I中第一区域和第二区域之间包含间隔核苷酸,例如1-5nt或5-10nt的间隔核苷酸,即所述桥接寡核苷酸II-I序列含有位于第一区域与第二区域之间的第三区域。在某些优选的实施方案中,所述桥接寡核苷酸II-I中第一区域和第二区域是相邻连接的,二者之间没有多余核苷酸,即所述桥接寡核苷酸II-I序列不含有位于第一区域与第二区域之间的第三区域。
在某些实施方案中,所述桥接寡核苷酸II-II中第一区域和第二区域之间包含间隔核苷酸,例如1-5nt或5-10nt的间隔核苷酸,即所述桥接寡核苷酸II-II序列含有位于第一区域与第二区域之间的第三区域。在某些优选的实施方案中,所述桥接寡核苷酸II-II中第一区域和第二区域是相邻连接的,二者之间没有多余核苷酸,即所述桥接寡核苷酸II-II序列不含有位于第一区域与第二区域之间的第三区域。
将该桥接寡核苷酸II-I、桥接寡核苷酸II-II和芯片序列和上述步骤一获得的cDNA链互补链退火或杂交,之后通过DNA连接酶将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,和/或,将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接。随后,在DNA聚合酶的作用下,形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)。所述连接过程和聚合过程以任意顺序进行。
通过上述示例性实施方案所形成的含有芯片序列信息的新核酸分子的示例性结构包含:从5’端至3’端含有所述共有序列B,3’末端悬突的互补序列,cDNA序列的互补序列,所述UMI序列的互补序列,所述共有序列A的互补序列,所述桥接寡核苷酸II-I序列,所述标签序列Y的互补序列,以及所述共有序列X1的互补序列的核酸链和/或其互补核酸链。
在某些实施方案中,在步骤(2)(i)(b)中,所述cDNA链通过其3’末端悬突与所述引物II-B退火,并且,在核酸聚合酶(例如,DNA聚合酶或逆转录酶)的作用下,所述cDNA链以所述引物II-B为模板被延伸,生成所述第一延伸产物。
在某些实施方案中,在步骤(2)(ii)(b)中,所述cDNA链通过其3’末端悬突与所述引物II-B’退火,并且,在核酸聚合酶(例如,DNA聚合酶或逆转录酶)的作用下,所述cDNA链以所述引物II-B’为模板被延伸,生成所述第一延伸产物。
在某些实施方案中,所述3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个或更多个核苷酸的长度。在某些实施方案中,所述3’末端悬突为2-5个胞嘧啶核苷酸的3’末端悬突(例如CCC悬突)。
在方案I或方案II的某些实施方案中,步骤(2)中,所述预处理在细胞内进行。
在方案I或方案II的某些实施方案中,在将所述一个或多个细胞与所述核酸阵列的固相支持物接触之前或之后,对所述一个或多个细胞的RNA(例如,mRNA)进行预处理以生成第一核酸分子群。
在方案I或方案II的某些实施方案中,在进行所述预处理之前,对细胞进行透化处理。
在方案I或方案II的某些实施方案中,步骤(2)中,所述预处理在细胞外进行。
在方案I或方案II的某些实施方案中,在将所述一个或多个细胞与所述核酸阵列的固相支持物接触之后,对所述一个或多个细胞的RNA(例如,mRNA)进行预处理以生成第一核酸分子群。
在方案I或方案II的某些实施方案中,在进行所述预处理之前,所述方法还包括释放细胞内的RNA(例如,mRNA);优选地,通过细胞透化或细胞裂解处理以释放细胞内的RNA(例如,mRNA)。
在方案I或方案II的某些实施方案中,步骤(2)中所述进行逆转录包括使用逆转录酶。
在方案I或方案II的某些实施方案中,所述逆转录酶具有末端转移活性。
在方案I或方案II的某些实施方案中,所述逆转录酶能够以RNA(例如,mRNA)为模板,合成cDNA链,且在所述cDNA链的3’端添加悬突。
在方案I或方案II的某些实施方案中,所述逆转录酶能够在cDNA链的3’末端添加长度为至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个或更多个核苷酸的悬突。
在方案I或方案II的某些实施方案中,所述逆转录酶能够在cDNA链的3’末端添加2-5个胞嘧啶核苷酸的悬突(例如CCC悬突)。
在方案I或方案II的某些实施方案中,所述逆转录酶选自M-MLV逆转录酶、HIV-1逆转录酶、AMV逆转录酶、端粒酶逆转录酶,以及具有上述逆转录酶的逆转录活性的变体、修饰产物和衍生物。
在方案I或方案II的某些实施方案中,步骤(2)和(3)具有选自以下的一项或多项特征:
(1)所述引物I-A,引物II-A,引物I-A’,引物II-A’,引物I-B,引物II-B,引物II-B’,桥接寡核苷酸I,桥接寡核苷酸II-I,桥接寡核苷酸II-II各自独立地包含或者由天然存在的核苷酸(例如脱氧核糖核苷酸或核糖核苷酸),经修饰的核苷酸,非天然的核苷酸,或其任何组合组成;在某些实施方案中,所述引物I-A,引物II-A,引物I-A’,引物II-A’能够起始延伸反应;
(2)所述引物I-B,引物II-B,引物II-B’各自独立地包含修饰的核苷酸(例如锁核酸);在某些实施方案中,所述引物I-B,引物II-B,引物II-B’的3’末端各自独立地包含一个或多个修饰的核苷酸(例如锁核酸);
(3)所述标签序列A,标签序列B各自独立地具有5-200nt(例如5-30nt,6-15nt)的长度;
(4)所述共有序列A,共有序列B各自独立地具有10-200nt(例如10-100nt,20-100nt,25-100nt,5-10nt,10-15nt,15-20nt,20-50nt,20-30nt,30-40nt,40-50nt,50-100nt)的长度;
(5)所述引物I-A,引物II-A,引物I-A’,引物II-A’,引物I-B,引物II-B,引物II-B’各自独立地具有4-200nt(例如5-200nt,15-230nt,26-115nt,10-130nt,10-20nt,20-50nt,20-30nt,30-40nt,40-50nt,50-100nt,100-150nt,150-200nt)的长度;
(6)所述桥接寡核苷酸I,桥接寡核苷酸II-I,桥接寡核苷酸II-II的第一区域、第二区域各自独立地具有3-100nt(例如20-100nt,3-10nt,10-15nt,15-20nt,20-70nt,20-30nt,30-40nt,40-50nt,50-100nt)的长度;
(7)所述桥接寡核苷酸I,桥接寡核苷酸II-I,桥接寡核苷酸II-II的第三区域各自独立地具有0-50nt(例如0nt,0-10nt,10-15nt,15-20nt,20-30nt,30-40nt,40-50nt)的长度;
(8)所述桥接寡核苷酸I,桥接寡核苷酸II-I,桥接寡核苷酸II-II各自独立地具有6-200nt(例如20-100nt,20-70nt,6-15nt,15-20nt,20-30nt,30-40nt,40-50nt,50-100nt,100-150nt,150-200nt)的长度;
(9)所述poly(T)序列包括至少5个,或至少20个(例如6-100个,10-50个)脱氧胸腺嘧啶核苷残基;
(10)所述随机寡核苷酸序列具有5-200nt(例如5nt,5-30nt,6-15nt)的长度。
在方案I或方案II的某些实施方案中,所述方法还包括:(4)回收和纯化所述第二核酸分子群。
在方案I或方案II的某些实施方案中,所获得的第二核酸分子群和/或其互补物用于构建转录组文库或用于转录组测序。
在方案I或方案II的某些实施方案中,步骤(1)中所述寡核苷酸探针具有选自下列的一个或多个特征:
(1)所述共有序列X1,标签序列Y和共有序列X2各自独立地包含或者由天然存在的核苷酸(例如脱氧核糖核苷酸或核糖核苷酸),经修饰的核苷酸,非天然的核苷酸(例如肽核酸(PNA)或锁核酸),或其任何组合组成;
(2)所述共有序列X1,标签序列Y和共有序列X2各自独立地具有2-200nt(例如10-200nt,25-100nt,10-30nt,10-100nt,5-10nt,10-15nt,15-20nt,20-30nt,30-40nt,40-50nt,50-100nt)的长度。
在方案I或方案II的某些实施方案中,步骤(1)所述核酸阵列由包含以下的步骤来提供:
(1)提供多种载体序列,每种载体序列包含至少一个拷贝(例如,多个拷贝)的载体序列,所述载体序列从5’到3’的方向上包含:共有序列X2的互补序列,标签序列Y的互补序列以及固定序列;其中,每种载体序列的标签序列Y的互补序列互不相同;
(2)将所述多种载体序列连接于固相支持物(例如芯片)表面;
(3)提供固定引物,并以所述载体序列为模板,进行引物延伸反应,生成延伸产物,所述延伸产物即为寡核苷酸探针;其中,所述固定引物包含共有序列X1的序列,并且,所述固定引物能与所述固定序列退火并起始延伸反应;在某些实施方案中,所述延伸产物从5’到3’的方向上包含或者由:共有序列X1,标签序列Y和共有序列X2组成;
(4)将所述固定引物与所述固相支持物表面连接;其中,步骤(3)与(4)以任意顺序进行;
(5)任选地,所述载体序列的固定序列还包含切割位点,所述切割可以选自切刻酶(nicking enzyme)酶切、USER酶切、光切除、化学切除或CRISPR切除;对所述载体序列的固定序列所包含的切割位点进行切割,以消化所述载体序列,使得步骤(3)中的延伸产物与形成延伸产物的模板(即载体序列)分离,从而将所述寡核苷酸探针连接于固相支持物(例如芯片)表面。在某些实施方案中,所述方法还包括通过高温变性使得步骤(3)中的延伸产物与形成延伸产物的模板(即载体序列)分离。
在方案I或方案II的某些实施方案中,每种载体序列是由多个拷贝的载体序列的多联体所形成的DNB。
在方案I或方案II的某些实施方案中,步骤(1)中通过以下步骤提供所述多种载体序列:
(i)提供多种载体模板序列,所述载体模板序列包含所述载体序列的互补序列;
(ii)以每种载体模板序列为模板,进行核酸扩增反应,以获得每种载体模板序列的扩增产物,所述扩增产物包含至少一个拷贝的载体序列;在某些实施方案中,进行滚环复制,以获得由所述载体序列的多联体所形成的DNB。
方案III
在某些实施方案中,在所述方法的步骤(1)中,所述共有序列X2包含捕获序列,所述捕获序列能够与待捕获核酸的全部或部分杂交,其包括poly(T)序列、针对特定靶核酸的特异性序列或随机寡核苷酸序列;并且,所述捕获序列具有3’自由端以使所述共有序列X2能作为延伸引物。
在此类实施方案中,所述步骤(2)包括:将所述一个或多个细胞与所述核酸阵列的固相支持物接触,由此,每个细胞各自占据所述核酸阵列中的至少一个微点(即,每个细胞各自与所述核酸阵列中的至少一个微点接触),并使得所述细胞的第一结合分子与所述固相支持物的第一标记分子形成相互作用对;其中,实施退火条件以使得所述一个或多个细胞的核酸与所述捕获序列发生退火,从而使得该核酸的位置被对应至核酸阵列上的寡核苷酸探针的位置;
并且,所述步骤(3)包括:在允许引物延伸的条件下,以寡核苷酸探针作为引物,以被捕获的核酸分子为模板,进行引物延伸反应,以产生经标记的(例如由所述标签序列Y标记的)核酸分子;和/或,以被捕获的核酸分子作为引物,以所述寡核苷酸探针为模板,进行引物延伸反应,以产生经延长的被捕获核酸分子,形成经标记的(例如由所述标签序列Y互补序列标记的)核酸分子。
在某些实施方案中,步骤(1)中所述的寡核苷酸探针还包含唯一分子标识符(UMI)序列。优选地,所述UMI序列位于所述捕获序列的上游方向。优选地,同一微点上偶联的寡核苷酸探针所包含的UMI序列彼此不同。
在某些实施方案中,步骤(1)所述核酸阵列由包含以下的步骤来提供:
(1)提供多种载体序列,每种载体序列包含多个拷贝的载体序列,所述载体序列从5’到3’的方向上包含:定位序列和第一固定序列,
所述定位序列是标签序列Y的互补序列;
所述第一固定序列允许与其互补的核苷酸序列发生退火并起始延伸反应;
(2)将所述多种载体序列连接于固相支持物(例如芯片)表面;
(3)提供第一引物,并以所述载体序列为模板,进行引物延伸反应,使得所述载体序列的第一固定序列和定位序列区域形成双链,其中与所述载体序列杂交的链即为第一核酸分子,所述第一核酸分子从5’至3’方向上包含第一固定序列和定位序列的互补序列;其中,所述第一引物在其3’端包含第一固定序列互补区,所述第一固定序列互补区包含所述第一固定序列的互补序列或其片段,且具有3’自由端;
(4)提供第二核酸分子,所述第二核酸分子包含共有序列X2(即捕获序列),其具有3’自由端以使所述第二核酸分子能作为延伸引物,
(5)将所述第二核酸分子与所述第一核酸分子连接(例如,使用连接酶将所述第二核酸分子与第一核酸分子连接),连接产物即为从5’到3’的方向上包含共有序列X1、标签序列Y和共有序列X2的寡核苷酸探针。
在某些实施方案中,任选地消化所述载体序列,使得步骤(5)中的连接产物与载体序列分离,从而将所述寡核苷酸探针连接于固相支持物表面。
在某些实施方案中,所述第一核酸分子或第二核酸分子还包含UMI序列。在某些实施方案中,所述第二核酸分子包含UMI序列,所述UMI序列位于捕获序列的5’端。
在某些实施方案中,通过以下步骤提供多种载体序列:
(i)提供多种载体模板序列,所述载体模板序列包含所述载体序列的互补序列;
(ii)以每种载体模板序列为模板,进行核酸扩增反应,以获得每种载体模板序列的扩增产物,所述扩增产物包含多个拷贝的载体序列;
优选地,所述扩增选自滚环复制(RCA)、桥式PCR扩增、多重链置换扩增(MDA)或乳液PCR扩增;优选地,进行滚环复制,以获得由所述载体序列的多联体所形成的DNB,或者,进行桥式PCR扩增、乳液PCR扩增或多重链置换扩增,以获得由所述载体序列的克隆群形成的DNA簇。
在方案I、方案II或方案III的某些实施方案中,所述寡核苷酸探针通过连接子与所述固相支持物偶联。
在方案I、方案II或方案III的某些实施方案中,所述连接子是能够与活化基团反应的连接基团,且所述固相支持物表面连接有活化基团。
在方案I、方案II或方案III的某些实施方案中,所述连接子包括-SH、-DBCO或-NHS。
在方案I、方案II或方案III的某些实施方案中,所述连接子是-DBCO,且所述固相支持物表面连接有[结构式,见Figure PCTCN2022135478-appb-000001]([见Figure PCTCN2022135478-appb-000002] ester)。
在方案I、方案II或方案III的某些实施方案中,步骤(1)所述核酸阵列具有选自下列的一个或多个特征:
在方案I、方案II或方案III的某些实施方案中,(1)偶联在同一固相支持物上的所述寡核苷酸探针具有相同的共有序列X1和/或相同的共有序列X2;(2)所述寡核苷酸探针的共有序列X1包含切割位点;在某些实施方案中,所述切割位点可以通过选自切刻酶(nicking enzyme)酶切、USER酶切、光切除、化学切除或CRISPR切除的方式而被切割或断裂。
在方案I、方案II或方案III的某些实施方案中,步骤(1)所述固相支持物具有选自下列的一个或多个特征:
(1)所述固相支持物选自乳胶珠、葡聚糖珠、聚苯乙烯表面、聚丙烯表面、聚丙烯酰胺凝胶、金表面、玻璃表面、芯片、传感器、电极和硅晶片;在某些实施方案中,所述固相支持物是芯片;
(2)所述固相支持物为平面的、球形的或多孔的;
(3)所述固相支持物能够用作测序平台,例如测序芯片;在某些实施方案中,所述固相支持物是用于Illumina、MGI或Thermo Fisher测序平台的测序芯片;和
(4)所述固相支持物能够自发地或在暴露于一种或多种刺激(例如,温度变化、pH变化、暴露于特定化学物质或相、暴露于光、还原剂等)时释放所述寡核苷酸探针。
构建核酸分子文库的方法
在另一方面,本申请还提供了一种构建核酸分子文库的方法,其包括,
(a)根据如上所述的生成标记的核酸分子群的方法生成标记的核酸分子群;
(b)将所述标记的核酸分子群中的核酸分子随机打断并添加接头;和
(c)任选地,对步骤(b)的产物进行扩增和/或富集;
从而获得核酸分子文库。
在某些实施方案中,所述核酸分子文库包含来自多个单细胞的核酸分子,不同单细胞的核酸分子具有不同的标签序列Y。
在某些实施方案中,所述核酸分子文库用于测序,例如转录组测序,例如单细胞转录组测序(例如5’端或3’端转录组测序)。
在某些实施方案中,在进行步骤(b)之前,所述方法还包括步骤(pre-b):扩增和/或富集所述标记的核酸分子群。
在某些实施方案中,在步骤(pre-b)中,对所述标记的核酸分子群进行核酸扩增反应,以产生富集产物。
在某些实施方案中,所述扩增反应使用至少引物C和/或引物D来进行,其中,所述引物C能够与所述共有序列X1的互补序列或其部分序列杂交或退火,并起始延伸反应;所述引物D能够与所述标记的核酸分子群中含有所述标签序列Y的核酸分子链杂交或退火,并起始延伸反应。
在某些实施方案中,步骤(pre-b)中的所述核酸扩增反应使用核酸聚合酶(例如DNA聚合酶。例如具有链置换活性和/或高保真性的DNA聚合酶)来进行。
在某些实施方案中,所述方法在步骤(b)中,用转座酶将所述核酸分子随机打断并添加接头。
在某些实施方案中,所述方法在步骤(b)中,用转座酶将前一步骤获得的核酸分子随机打断并在片段两端分别添加第一接头和第二接头。
在某些实施方案中,所述转座酶选自Tn5转座酶、MuA转座酶、睡美人转座酶、Mariner转座酶、Tn7转座酶、Tn10转座酶、Ty1转座酶、Tn552转座酶,以及具有上述转座酶的转座活性的变体、修饰产物和衍生物。
在某些实施方案中,所述转座酶为Tn5转座酶。
在某些实施方案中,在步骤(c)中,至少使用引物C’和/或引物D’对步骤(b)的产物进行扩增,其中,所述引物C’能够与所述第一接头杂交或退火,并起始延伸反应,所述引物D’能够与所述第二接头杂交或退火,并起始延伸反应。
在某些实施方案中,在步骤(c)中,至少使用所述引物C和/或引物D’对步骤(b)的产物进行扩增;其中,所述引物D’能够与所述第一接头或第二接头杂交或退火,并起始延伸反应。
进行转录组测序的方法
在另一方面,本申请还提供了一种对样品中的细胞进行转录组测序的方法,其包括:
(1)根据如上所述的构建核酸分子文库的方法构建核酸分子文库;和
(2)对所述核酸分子文库进行测序。
进行单细胞转录组分析的方法
在另一方面,本申请还提供了进行单细胞转录组分析的方法,其包括:
(1)根据如上所述的方法对样品中的单细胞进行转录组测序;和
(2)对测序数据进行分析,其包括,将获得的测序文库的测序结果与所述核酸阵列上各个微点所偶联的寡核苷酸探针中的标签序列Y或其互补序列进行匹配,若匹配成功,则将所述微点认定为阳性微点,并且,将源自于所述核酸阵列中呈区域连续性的阳性微点的测序数据认定为同一个细胞的转录数据,从而进行单细胞转录组分析。
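上文“将呈区域连续性的阳性微点认定为同一个细胞”这一步,可以理解为在微点网格上求连通区域。下面是一个最小草图,仅作说明:本申请并未规定“连续”的具体判据,这里假设微点位于整数网格上并采用8邻域相邻;函数名 `group_spots_into_cells` 也是为演示而虚构的。

```python
from collections import deque


def group_spots_into_cells(positive_spots):
    """把阳性微点按8邻域连通性分组,每组视为一个候选细胞。

    positive_spots: 由 (x, y) 整数网格坐标组成的可迭代对象。
    返回值: 坐标集合的列表,每个集合对应一个连通区域。
    """
    remaining = set(positive_spots)
    cells = []
    while remaining:
        seed = remaining.pop()
        cell = {seed}
        queue = deque([seed])
        # 广度优先搜索:不断把相邻的阳性微点并入当前区域
        while queue:
            x, y = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbor = (x + dx, y + dy)
                    if neighbor in remaining:
                        remaining.remove(neighbor)
                        cell.add(neighbor)
                        queue.append(neighbor)
        cells.append(cell)
    return cells


if __name__ == "__main__":
    spots = [(0, 0), (0, 1), (1, 1), (5, 5), (5, 6)]
    for cell in group_spots_into_cells(spots):
        print(sorted(cell))
```

实际分析中可能还需要设置最小区域面积,或处理相互贴近的细胞;这些均超出了本草图的范围。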
试剂盒
在另一方面,本申请还提供了试剂盒,其包含:
用于标记核酸的核酸阵列以及任选的第一结合分子,所述核酸阵列包括固相支持物,所述固相支持物(例如在其表面)含有第一标记分子,所述第一结合分子能与所述第一标记分子构成相互作用对;
所述固相支持物还包含多个微点,所述微点的尺寸(例如等效直径)小于5μm,相邻的所述微点之间的中心距离小于10μm;每个微点偶联有一种寡核苷酸探针;每种寡核苷酸探针包含至少一个拷贝;并且,所述寡核苷酸探针从5’到3’的方向上包含或者由:共有序列X1,标签序列Y和共有序列X2组成,其中,
不同微点偶联的寡核苷酸探针具有不同的标签序列Y。
在某些实施方案中,相邻的所述微点之间的中心距离小于10μm,小于5μm,小于1μm,小于0.5μm,小于0.1μm,小于0.05μm,或小于0.01μm;并且,所述微点的尺寸(例如等效直径)小于5μm,小于1μm,小于0.3μm,小于0.5μm,小于0.1μm,小于0.05μm,小于0.01μm,或小于0.001μm。
优选地,相邻的所述微点之间的中心距离为0.5μm~1μm,例如0.5μm~0.9μm,0.5μm~0.8μm。
优选地,所述微点的尺寸(例如等效直径)为0.001μm~0.5μm(例如0.01μm~0.1μm,0.01μm~0.2μm,0.2μm~0.5μm,0.2μm~0.4μm,0.2μm~0.3μm)。
在某些实施方案中,所述固相支持物包含多个(例如,至少10个,至少10²个,至少10³个,至少10⁴个,至少10⁵个,至少10⁶个,至少10⁷个,至少10⁸个,或更多个)微点;在某些实施方案中,所述固相支持物包含至少10⁴个(例如至少10⁴个,至少10⁵个,至少10⁶个,至少10⁷个,至少10⁸个,至少10⁹个,至少10¹⁰个,至少10¹¹个,或至少10¹²个)微点/平方毫米。
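微点中心距与微点密度之间的换算可以用下面的算术草图来说明。这里假设微点按正方形网格排列(本申请并未限定排列方式),并以约100μm²的单细胞占据面积作为示例输入;两个函数名均为演示用的虚构名称。

```python
def spots_per_mm2(pitch_um):
    """正方形网格下每平方毫米的微点数;pitch_um 为相邻微点中心距(单位μm)。"""
    return (1000.0 / pitch_um) ** 2


def spots_per_cell(cell_area_um2, pitch_um):
    """面积为 cell_area_um2(单位μm²)的细胞大约覆盖的微点数。"""
    return cell_area_um2 / pitch_um ** 2


if __name__ == "__main__":
    for pitch in (0.5, 1.0):
        print(pitch, spots_per_mm2(pitch), spots_per_cell(100.0, pitch))
```

在这些假设下,中心距为0.5μm~1μm时,每平方毫米约有10⁶~4×10⁶个微点,一个约100μm²的细胞约覆盖100~400个微点。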
在某些实施方案中,所述第一结合分子能与所述第一标记分子构成特异性相互作用对或者非特异性相互作用对。
在某些实施方案中,所述相互作用对选自正负电荷相互作用,亲和相互作用(例如生物素-亲和素,生物素-链霉亲和素,抗原-抗体,受体-配体,酶-辅因子),能够发生点击化学反应的分子对(例如含炔基基团-叠氮基化合物),N-羟基磺基琥珀(NHS)酯-含氨基化合物,或其任意组合。
例如,所述第一标记分子为多聚赖氨酸,所述第一结合分子为能与多聚赖氨酸结合的蛋白质;所述第一标记分子为抗体,所述第一结合分子为能与所述抗体结合的抗原;所述第一标记分子为含氨基化合物,所述第一结合分子为N-羟基磺基琥珀(NHS)酯;或者,所述第一标记分子为生物素,所述第一结合分子为链霉亲和素。
在某些实施方案中,所述试剂盒进一步包含:
(i)引物I-A,包含引物I-A’和引物I-B的引物组,或者,包含引物I-A和引物I-B的引物组,其中:
所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;优选地,所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);
所述引物I-A’包含捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;
所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;其中,所述3’末端悬突互补序列位于所述引物I-B的3’末端,所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);其中,所述3’末端悬突是指以所述引物I-A’的捕获序列A所捕获的RNA为模板逆转录生成的cDNA链的3’末端所包含的一个或多个非模板核苷酸;
以及,
(ii)桥接寡核苷酸I,其包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
所述第一区域能(a)与所述引物I-A的共有序列A全部或部分退火或者(b)与所述引物I-B的共有序列B全部或部分退火;
所述第二区域能与所述共有序列X2全部或其部分退火。
在某些实施方案中,所述试剂盒包含:如(i)中所述的引物I-A,以及,如(ii)中所述的桥接寡核苷酸I;其中,所述桥接寡核苷酸I的第一区域能与所述引物I-A的共有序列A全部或部分退火,所述桥接寡核苷酸的第二区域能与所述共有序列X2全部或部分退火;
其中,所述引物I-A的捕获序列A是随机寡核苷酸序列;或者,所述引物I-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述引物I-A的5’末端包含磷酸化修饰。
在某些实施方案中,所述试剂盒包含:如(i)中所述的包含引物I-A’和引物I-B的引物组,以及,如(ii)中所述的桥接寡核苷酸I;其中,所述桥接寡核苷酸I的第一区域能与所述引物I-B的共有序列B全部或部分退火,所述桥接寡核苷酸的第二区域能与所述共有序列X2全部或部分退火;
其中,所述引物I-A’的捕获序列A为随机寡核苷酸序列;或者,所述引物I-A’的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列,所述引物I-A’进一步包含标签序列A,以及共有序列A;
其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及标签序列B。
在某些实施方案中,所述试剂盒进一步包含引物B”,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,且能够起始延伸反应。
在某些实施方案中,所述引物I-B或引物B”的5’末端包含磷酸化修饰。
在某些实施方案中,所述引物I-B包含修饰的核苷酸(例如锁核酸);优选地,所述引物I-B的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
在某些实施方案中,所述试剂盒包含:如(i)中所述的包含引物I-A和引物I-B的引物组,以及,如(ii)中所述的桥接寡核苷酸I;其中,所述桥接寡核苷酸I的第一区域能与所述引物I-A的共有序列A全部或部分退火,所述桥接寡核苷酸的第二区域能与所述共有序列X2全部或部分退火;
其中,所述引物I-A的捕获序列A为随机寡核苷酸序列;或者,所述引物I-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述引物I-A的5’末端包含磷酸化修饰。
在某些实施方案中,所述引物I-B包含修饰的核苷酸(例如锁核酸);优选地,所述引物I-B的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
在某些实施方案中,所述试剂盒进一步包含:
(i)包含引物II-A和引物II-B或者包含引物II-A’和引物II-B’的引物组,其中:
所述引物II-A含有捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;
所述引物II-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;其中,所述3’末端悬突互补序列位于所述引物II-B的3’末端,所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B的5’端);其中,所述3’末端悬突是指以所述引物II-A的捕获序列A所捕获的RNA为模板逆转录生成的cDNA链的3’末端所包含的一个或多个非模板核苷酸;
所述引物II-A’含有共有序列A和捕获序列A;其中,所述捕获序列A位于所述引物II-A’的3’端,所述共有序列A位于所述捕获序列A的上游(例如位于所述引物II-A’的5’端);
所述引物II-B’包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;其中,所述3’末端悬突互补序列位于所述引物II-B’的3’末端,所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B’的5’端);其中,所述3’末端悬突是指以所述引物II-A’的捕获序列A所捕获的RNA为模板逆转录生成的cDNA链的3’末端所包含的一个或多个非模板核苷酸。
在某些实施方案中,所述试剂盒包含:如(i)中所述的引物II-A和引物II-B的引物组,以及,(ii)桥接寡核苷酸II-I和桥接寡核苷酸II-II;其中,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;
所述桥接寡核苷酸II-II的第二区域能与所述引物II-B的共有序列B的互补序列或其部分序列退火;
其中,所述引物II-A的捕获序列A是随机寡核苷酸序列;或者,所述引物II-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物II-A优选地进一步包含共有序列A和任选的标签序列A,例如为随机寡核苷酸序列;
其中,所述引物II-B含有所述共有序列B,3’末端悬突互补序列,以及标签序列B。
在某些实施方案中,所述引物II-B包含修饰的核苷酸(例如锁核酸);优选地,所述引物II-B的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
在某些实施方案中,所述试剂盒包含:如(i)中所述的引物II-A和引物II-B的引物组;
其中,所述引物II-A的捕获序列A是随机寡核苷酸序列;或者,所述引物II-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物II-A优选地进一步包含共有序列A和任选的标签序列A,例如为随机寡核苷酸序列;
其中,所述引物II-B含有所述共有序列B,3’末端悬突互补序列,以及标签序列B。
在某些实施方案中,所述引物II-B包含修饰的核苷酸(例如锁核酸);优选地,所述引物II-B的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
在某些实施方案中,所述试剂盒包含:如(i)中所述的引物II-A’和引物II-B’的引物组,以及,(ii)桥接寡核苷酸II-I和桥接寡核苷酸II-II;其中,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;
所述桥接寡核苷酸II-II的第二区域能与所述引物II-A’的共有序列A互补序列或其部分序列退火;
其中,所述引物II-A’的捕获序列A是随机寡核苷酸序列;或者,所述引物II-A’的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物II-A’进一步包含标签序列A,例如为随机寡核苷酸序列。
在某些实施方案中,所述引物II-B’包含修饰的核苷酸(例如锁核酸);优选地,所述引物II-B’的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
在某些实施方案中,所述试剂盒进一步包含引物B”,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,并且能起始延伸反应。
在某些实施方案中,其包含如(i)中所述的引物II-A’和引物II-B’的引物组;
其中,所述引物II-A’的捕获序列A是随机寡核苷酸序列;或者,所述引物II-A’的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物II-A’进一步包含标签序列A,例如为随机寡核苷酸序列;
其中,所述引物II-B’含有所述共有序列B,3’末端悬突互补序列,以及标签序列B。
在某些实施方案中,所述引物II-B’包含修饰的核苷酸(例如锁核酸);优选地,所述引物II-B’的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
在某些实施方案中,所述试剂盒进一步包含引物B”,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,并且能起始延伸反应。
在某些实施方案中,所述试剂盒具有选自以下的一项或多项特征:
(1)所述寡核苷酸探针,引物I-A,引物II-A,引物I-A’,引物II-A’,引物I-B,引物II-B,引物II-B’,引物B”,桥接寡核苷酸I,桥接寡核苷酸II-I,桥接寡核苷酸II-II各自独立地包含或者由天然存在的核苷酸(例如脱氧核糖核苷酸或核糖核苷酸),经修饰的核苷酸,非天然的核苷酸,或其任何组合组成;
(2)所述寡核苷酸探针各自独立地具有15-300nt(例如15-200nt,15-20nt,20-30nt,30-40nt,40-50nt,50-100nt,100-150nt,150-200nt)的长度;
(3)所述引物I-A,引物II-A,引物I-A’,引物II-A’,引物I-B,引物II-B,引物II-B’,引物B”各自独立地具有4-200nt(例如5-200nt,15-230nt,26-115nt,10-130nt,10-20nt,20-50nt,20-30nt,30-40nt,40-50nt,50-100nt,100-150nt,150-200nt)的长度;
(4)所述桥接寡核苷酸I,桥接寡核苷酸II-I,桥接寡核苷酸II-II各自独立地具有6-200nt(例如20-100nt,20-70nt,6-15nt,15-20nt,20-30nt,30-40nt,40-50nt,50-100nt,100-150nt,150-200nt)的长度;
(5)偶联在同一固相支持物上的所述寡核苷酸探针具有相同的共有序列X1和/或相同的共有序列X2;
(6)所述寡核苷酸探针的共有序列X1包含切割位点;在某些实施方案中,所述切割位点可以通过选自切刻酶(nicking enzyme)酶切、USER酶切、光切除、化学切除或CRISPR切除的方式而被切割或断裂。
在某些实施方案中,所述试剂盒进一步包含逆转录酶,核酸连接酶,核酸聚合酶和/或转座酶。
在某些实施方案中,所述逆转录酶具有末端转移活性。在某些实施方案中,所述逆转录酶能够以RNA(例如,mRNA)为模板,合成cDNA链,且在所述cDNA链的3’端添加所述3’末端悬突。在某些实施方案中,所述逆转录酶能够在cDNA链的3’末端添加长度为至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个或更多个核苷酸的悬突。在某些实施方案中,所述逆转录酶能够在cDNA链的3’末端添加2-5个胞嘧啶核苷酸的悬突(例如CCC悬突)。在某些实施方案中,所述逆转录酶选自M-MLV逆转录酶、HIV-1逆转录酶、AMV逆转录酶、端粒酶逆转录酶,以及具有上述逆转录酶的逆转录活性的变体、修饰产物和衍生物。
在某些实施方案中,所述核酸聚合酶无5’至3’端外切活性或链置换活性。
在某些实施方案中,所述核酸聚合酶具有5’至3’端外切活性或链置换活性。
在某些实施方案中,所述转座酶选自Tn5转座酶、MuA转座酶、睡美人转座酶、Mariner转座酶、Tn7转座酶、Tn10转座酶、Ty1转座酶、Tn552转座酶,以及具有上述转座酶的转座活性的变体、修饰产物和衍生物。
在某些实施方案中,所述试剂盒进一步包含:所述引物C,所述引物D,所述引物C’和/或所述引物D’。例如,所述试剂盒进一步包含所述引物C,所述引物D和所述引物D’。例如,所述试剂盒进一步包含所述引物C,所述引物D,所述引物C’和所述引物D’。
在某些实施方案中,所述试剂盒进一步包含:用于进行核酸杂交的试剂、用于进行核酸延伸的试剂、用于进行核酸扩增的试剂、用于回收或纯化核酸的试剂、用于构建转录组测序文库的试剂、用于测序(例如二代测序或三代测序)的试剂、或其任何组合。
术语定义
在本发明中,除非另有说明,否则本文中使用的科学和技术名词具有本领域技术人员所通常理解的含义。并且,本文中所用的分子生物学、生物化学、核酸化学、细胞培养等操作步骤均为相应领域内广泛使用的常规步骤。同时,为了更好地理解本发明,下面提供相关术语的定义和解释。
当本文使用术语“例如”、“如”、“诸如”、“包括”、“包含”或其变体时,这些术语将不被认为是限制性术语,而将被解释为表示“但不限于”或“不限于”。
除非本文另外指明或根据上下文明显矛盾,否则术语“一个”和“一种”以及“该”和类似指称物在描述本发明的上下文中(尤其在以下权利要求的上下文中)应被解释成覆盖单数和复数。
如本文所用,可用于本发明方法的细胞(例如,可使用本发明方法进行处理以生成标记的核酸分子群的细胞)可以是任何感兴趣的细胞,例如,癌细胞、干细胞、神经细胞、胎儿细胞和参与免疫应答的免疫细胞。所述细胞可以是一个细胞,也可以是多个细胞。所述细胞可以是相同类型的细胞混合,也可以是完全异质的不同类型细胞混合。不同的细胞类型可包括个体的不同组织细胞或不同个体的相同组织细胞或者来源于不同属、种、菌株、变体或任何或所有前述的任何组合的微生物的细胞。例如,不同的细胞类型可包括个体的正常细胞和癌细胞;获自人类受试者的各种细胞类型,例如多种免疫细胞;来自环境、法医、微生物组或其他样品的多种不同的细菌物种、菌株和/或变体;或细胞类型的任何其他各种混合物。
如本文所使用的,术语“UMI”指“Unique Molecular Identifier,独特分子标签”,其可用于进行核酸分子的定性和/或定量。除非本文另外指明或根据上下文明显矛盾,本申请对所述UMI或其互补序列在核酸分子中的位置以及数量不做限定。例如,当cDNA链含有所述UMI或其互补序列,所述UMI或其互补序列可位于所述cDNA链中的cDNA序列的3’端,也可位于所述cDNA序列的5’端,也可以在3’端和5’端均包含所述UMI或其互补序列。当cDNA链互补链含有所述UMI或其互补序列,所述UMI或其互补序列可位于所述cDNA链互补链中的cDNA序列互补序列的3’端,也可位于所述cDNA序列互补序列的5’端,也可以在3’端和5’端均包含所述UMI或其互补序列。
如本文所用,“DNB”(DNA nano ball,DNA纳米球)是一种典型的RCA(rolling circle amplification,滚环扩增)产物,其具有RCA产物的特征。其中,所述RCA产物是一种多拷贝的单链DNA序列,因内部DNA序列的碱基间的相互作用力,可以形成类似“球形”的结构。典型地,文库分子环化形成单链环状DNA,随后使用滚环扩增技术可将单链环状DNA扩增多个数量级,所产生的扩增产物称为DNB。
如本文所用,“核酸分子群”是指例如直接或间接来源于靶核酸分子(例如DNA双链分子、RNA/cDNA杂合双链分子、DNA单链分子、或RNA单链分子)的核酸分子的群体或集合。在一些实施方案中,核酸分子群包括核酸分子文库,所述核酸分子文库包含性质上和/或数量上代表靶核酸分子序列的序列。在另一些实施方案中,核酸分子群包含核酸分子文库的子集。
如本文所用,“核酸分子文库”表示直接或间接从靶核酸分子产生的标记的核酸分子(例如经标记的DNA双链分子、经标记的RNA/cDNA杂合双链分子、经标记的DNA单链分子、或经标记的RNA单链分子)或其片段的集合或群体,其中,在该集合或群体中经标记的核酸分子或其片段的组合显示在性质上和/或数量上代表从中产生经标记的核酸分子的靶核酸分子序列的序列。在某些实施方案中,所述核酸分子文库是测序文库。在某些实施方案中,所述核酸分子文库可用于构建测序文库。
如本文所用,“cDNA”或“cDNA链”是指使用感兴趣的RNA分子的至少一部分作为模板,通过RNA依赖性DNA聚合酶或反转录酶催化的与该感兴趣的RNA分子退火的引物的延伸而合成的“互补的DNA”(该过程也称为“反转录”)。所合成的cDNA分子与该模板的至少一部分“同源”或“互补”或“碱基配对”或“形成复合物”。
如本文中所使用的,术语“上游”用于描述两条核酸序列(或两个核酸分子)的相对位置关系,并且具有本领域技术人员通常理解的含义。例如,表述“一条核酸序列位于另一条核酸序列的上游”意指,当以5'至3'方向排列时,与后者相比,前者位于更靠前的位置(即,更接近5'端的位置)。如本文中所使用的,术语“下游”具有与“上游”相反的含义。
如本文所用,“标签序列Y”、“标签序列A”、“标签序列B”、“共有序列X1”、“共有序列X2”、“共有序列A”、“共有序列B”等,是指向它所接合的核酸分子或其接合的核酸分子的衍生产物(例如,核酸分子的互补片段、核酸分子的断裂短片段等)提供鉴定、识别和/或分子操作或生物化学操作手段(例如,通过提供用于使寡核苷酸退火的位点,所述寡核苷酸诸如用于DNA聚合酶延伸的引物或者用于捕获反应或连接反应的寡核苷酸)的非靶核酸组分的寡核苷酸。所述寡核苷酸可以由连续的至少两个(优选大约6到100个,但是对寡核苷酸的长度没有确定的限制,确切大小取决于许多因素,而这些因素又取决于寡核苷酸的最终功能或用途)核苷酸组成,也可以由多段寡核苷酸连续或非连续排列组合而成。所述寡核苷酸序列可以对于其接合的每个核酸分子是唯一的,也可以对于其接合的某一类核酸分子是唯一的。所述寡核苷酸序列可以通过任何方法包括连接、杂交或其他方法与待“标记”的多核苷酸序列可逆或不可逆地接合。将所述寡核苷酸序列与核酸分子接合的过程有时在本文称为“添加标记”并且经历添加标记或含标记序列的核酸分子称为“经标记的核酸分子”或“标记的核酸分子”。
出于多种原因,本发明的核酸或多核苷酸(例如“标签序列Y”、“标签序列A”、“标签序列B”、“共有序列X1”、“共有序列X2”、“共有序列A”、“共有序列B”、“引物I-A”、“引物I-A’”、“引物I-B”、“引物II-A”、“引物II-A’”、“引物II-B”、“引物II-B’”“引物B””、“引物C”、“引物D”、“引物C’”、“引物D’”、“随机引物”、“桥接寡核苷酸I”、“桥接寡核苷酸序列II-I”“桥接寡核苷酸序列II-II”等)可包括一种或多种修饰的核酸碱基、糖部分或核苷间连接。例如,使用包含修饰的碱基、糖部分或核苷间连接的核酸或多核苷酸的一些原因包括但不限于:(1)Tm的改变;(2)改变多核苷酸对一种或多种核酸酶的易感性;(3)提供用于连接标记的部分;(4)提供标记或标记猝灭剂;或(5)提供用于连接溶液中或结合于表面的另一种分子的部分,诸如生物素。例如,在一些实施方案中,可将寡核苷酸诸如引物合成为使得随机部分包含一种或多种构象受限制的核酸类似物,诸如但不限于其中的核糖环被连接2’-O原子与4’-C原子的亚甲基桥“锁定”的一种或多种核糖核酸类似物;这些修饰的核苷酸导致每个核苷酸单体的Tm或解链温度提高大约2摄氏度到大约8摄氏度。例如,在其中使用包含核糖核苷酸的寡核苷酸引物的一些实施方案中,在该方法中使用修饰的核苷酸的一个指标可以是包含该修饰的核苷酸的寡核苷酸可以被单链特异性RNA酶消化。
如本文所用,所述“第一结合分子”能与所述“第一标记分子”发生特异性相互作用或者非特异性相互作用。在某些实施方案中,所述第一结合分子与所述第一标记分子通过选自下述的方式发生相互作用:正负电荷相互作用,亲和相互作用(例如生物素-亲和素,生物素-链霉亲和素,抗原-抗体,受体-配体,酶-辅因子),点击化学反应(例如含炔基基团-叠氮基化合物),或其任意组合。
例如,所述第一标记分子为多聚赖氨酸,所述第一结合分子为能与多聚赖氨酸结合的蛋白质;所述第一标记分子为抗体,所述第一结合分子为能与所述抗体结合的抗原;所述第一标记分子为生物素,所述第一结合分子为链霉亲和素;所述第一结合分子为含炔基基团的化合物,所述第一标记分子为叠氮基化合物;或者,所述第一结合分子为N-羟基磺基琥珀(NHS)酯,所述第一标记分子为含氨基化合物。
例如,所述第一标记分子为抗原,所述第一结合分子为能与所述抗原结合的抗体;所述第一标记分子为链霉亲和素,所述第一结合分子为生物素;所述第一结合分子为叠氮基化合物,所述第一标记分子为含炔基基团的化合物;或者,所述第一结合分子为含氨基化合物,所述第一标记分子为N-羟基磺基琥珀(NHS)酯。
在本发明的方法中,例如,在多核苷酸或寡核苷酸中的一个或多个位置的单核苷酸中的核酸碱基可包括鸟嘌呤、腺嘌呤、尿嘧啶、胸腺嘧啶或胞嘧啶,或者可选地,所述核酸碱基中的一种或多种可包含修饰的碱基,诸如但不限于黄嘌呤、烯丙氨基(allylamino)-尿嘧啶、烯丙氨基-胸腺嘧啶核苷、次黄嘌呤、2-氨基腺嘌呤、5-丙炔基尿嘧啶、5-丙炔基胞嘧啶、4-硫尿嘧啶、6-硫鸟嘌呤、氮尿嘧啶和脱氮尿嘧啶、胸腺嘧啶核苷、胞嘧啶、腺嘌呤或鸟嘌呤。此外,它们可包含用如下部分衍生的核酸碱基:生物素部分、洋地黄毒苷部分、荧光部分或化学发光部分、猝灭部分或某种其他部分。本发明不限于列出的核酸碱基;给出的这份名单示出了可用于本发明方法中的范围广泛的碱基的实例。
就本发明的核酸或多核苷酸来说,糖部分中的一个或多个可包括2′-脱氧核糖,或者可选地,糖部分中的一个或多个可包括某种其他糖部分,诸如但不限于:提供对一些核酸酶的抵抗力的核糖或2’-氟代-2’-脱氧核糖或2’-O-甲基-核糖,或可通过与可见的、荧光的、红外荧光的或其他可检测的染料或具有亲电子的、光反应性的、炔基或其他反应性化学部分的化学物质进行反应而标记的2’-氨基2’-脱氧核糖或2’-叠氮基-2’-脱氧核糖。
本发明的核酸或多核苷酸的核苷间连接可以是磷酸二酯键连接,或者可选地,核苷间连接中的一种或多种可包括修饰的连接,诸如但不限于:硫代磷酸酯、二硫代磷酸酯、硒代磷酸酯(phosphoroselenate)、或二硒代磷酸酯(phosphorodiselenate)连接,它们对一些核酸酶具有抵抗力。
如本文所用,术语“末端转移活性”是指,能催化一个或多个脱氧核糖核苷三磷酸(dNTP)或单个双脱氧核糖核苷三磷酸不依赖模板地添加(或“加尾”)至cDNA的3’末端的活性。具有末端转移活性的逆转录酶的实例包括但不限于,M-MLV逆转录酶、HIV-1逆转录酶、AMV逆转录酶、端粒酶逆转录酶,以及具有所述逆转录酶的逆转录活性和末端转移活性的变体、修饰产物和衍生物。所述逆转录酶不具有或者具有RNase活性(特别是RNase H活性)。在优选的实施方案中,用于逆转录RNA以生成cDNA的逆转录酶不具有RNase活性。因此,在优选的实施方案中,用于逆转录RNA以生成cDNA的逆转录酶具有末端转移活性,且不具有RNase活性。
如本文所用,具有“链置换活性”的核酸聚合酶是指,在延伸新核酸链的过程中,如果遇到下游与模板链互补的核酸链,可以继续延伸反应并将所述与模板链互补的核酸链剥离(而非降解)的核酸聚合酶。
如本文所用,具有“5’至3’端外切酶活性”的核酸聚合酶是指,能从多核苷酸链的5’端开始按5’端至3’端的次序催化水解3’,5’-磷酸二酯键,降解核苷酸的核酸聚合酶。
如本文所用,具有“高保真性”的核酸聚合酶(或DNA聚合酶)是指,在扩增核酸的过程中,引入错误核苷酸的概率(即,错误率)低于野生型Taq酶(例如其序列如UniProt Accession:P19821.1所示的Taq酶)的核酸聚合酶(或DNA聚合酶)。
如本文所用,术语“发生退火”、“进行退火”、“退火”、“使杂交”或“杂交”等是指,具有经由沃森-克里克碱基配对形成复合物的充分互补性的核苷酸序列之间形成复合物。就本发明来说,彼此之间“对其互补”或“与之互补”或与其“杂交”或“退火”的核酸序列应该能形成或形成服务于预定目的的足够稳定的“杂交体”或“复合物”。不要求由一个核酸分子显示的序列内的每个核酸碱基能够与由第二核酸分子显示的序列内的每个核酸碱基进行碱基配对或配对或复合,以便这两个核酸分子或其中显示的相应序列与彼此“互补”或“退火”或“杂交”。如本文所述,在提及按碱基配对法则联系的核苷酸的序列时使用术语“互补的”或“互补性”。例如,序列5’-A-G-T-3’与序列3’-T-C-A-5’互补。互补性可以是“部分的”,其中核酸碱基中只有一些根据碱基配对法则匹配。或者,在核酸之间可具有“完全的”或“全部的”互补性。核酸链之间的互补性的程度对核酸链之间的杂交的效率和强度具有显著影响。这在扩增反应以及依赖于核酸的杂交的检测方法中是特别重要的。术语“同源性”是指一条核酸序列与另一条核酸序列的互补性程度。可具有部分同源性或完全同源性(即,互补性)。部分互补的序列是至少部分地抑制完全互补的序列与靶核 酸的杂交的序列并且使用功能术语“基本上同源的”称呼。完全互补的序列与靶序列的杂交的抑制可使用杂交测定(例如,DNA印迹或RNA印迹,溶液杂交等)在低严格度条件下来检查。基本上同源的序列或探针将竞争或抑制完全同源的序列与靶在低严格度条件下的结合(即杂交)。这并不是说低严格度条件是允许非特异性结合的条件;低严格度条件要求两条序列与彼此的结合是特异性(即选择性)相互作用。非特异性结合的不存在可以通过使用缺乏互补性或只具有低互补性程度(例如,小于约30%的互补性)的第二靶来测试。在特异性结合很低或不存在的情况下,探针将不与核酸靶杂交。当用于提及双链核酸序列诸如cDNA或基因组克隆时,术语“基本上同源的”是指可在本文所述的低严格度条件下与双链核酸序列的一条链或两条链杂交的任何寡核苷酸或探针。如本文所用,在提及互补的核酸链的配对时使用术语“退火”或“杂交”。杂交和杂交强度(即,核酸链之间的缔合强度)受本领域中公知的许多因素影响,包括核酸之间的互补性程度,包括受诸如盐浓度影响的条件的严格度,形成的杂交体的Tm(解链温度),其他组分的存在(如,存在或不存在聚乙二醇或甜菜碱),杂交链的摩尔浓度以及核酸链的G:C含量。
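上文关于“完全互补”与“部分互补”的区分,可以用一个很小的示例来体会。下面的草图仅为说明:它只按沃森-克里克配对逐位比较两条等长、反向平行的DNA序列,不考虑错配位置、Tm或严格度条件;函数名 `fraction_complementary` 为演示而虚构。

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def fraction_complementary(strand_a, strand_b):
    """两条序列均按5'到3'方向书写;返回反向平行排列时能配对的碱基比例。"""
    if len(strand_a) != len(strand_b):
        raise ValueError("本草图只处理等长序列")
    if not strand_a:
        return 0.0
    # strand_b 反转后即为其3'到5'方向,与 strand_a 逐位对齐
    paired = sum(
        1 for a, b in zip(strand_a, reversed(strand_b)) if PAIR.get(a) == b
    )
    return paired / len(strand_a)


if __name__ == "__main__":
    # 5'-AGT-3' 与 3'-TCA-5'(即 5'-ACT-3')完全互补
    print(fraction_complementary("AGT", "ACT"))
    # 5'-AGT-3' 与 5'-ACC-3' 只有部分位置能配对
    print(fraction_complementary("AGT", "ACC"))
```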
如本文所述,所述固相支持物能够自发地或在暴露于一种或多种刺激(例如,温度变化、pH变化、暴露于特定化学物质或相、暴露于光、还原剂等)时释放所述寡核苷酸探针。应当理解的是,可以通过寡核苷酸探针与固相支持物之间的键的裂解来释放所述寡核苷酸探针,或通过固相支持物本身的降解来释放所述寡核苷酸探针,或两者兼而有之,所述寡核苷酸探针允许被其他试剂接近或可被其他试剂接近。
向所述固相支持物中添加多种类型的不稳定键可导致生成能够对不同刺激有反应的固相支持物。每种类型的不稳定键可以对相关的刺激(例如,化学刺激、光、温度等)敏感,使得通过施加适当的刺激可以控制通过每个不稳定键连接到固相支持物的物质的释放。除了可热裂解的键、二硫键和UV敏感键之外,可以与固相支持物偶合的不稳定键的其他非限制性实例包括酯键(例如,可用酸、碱或羟胺裂解)、邻位二醇键(例如,可通过高碘酸钠裂解)、狄尔斯-阿尔德(Diels-Alder)键(例如,可通过热裂解)、砜键(例如,可通过碱裂解)、甲硅烷基醚键(例如,可通过酸裂解)、糖苷键(例如,可通过淀粉酶裂解)、肽键(例如,可通过蛋白酶裂解)或磷酸二酯键(例如,可通过核酸酶(例如,DNA酶)裂解))。
除了上文所述的固相支持物与寡核苷酸之间的可裂解键之外或作为其替代,固相支持物可以在自发地或在暴露于一种或多种刺激(例如,温度变化、pH变化、暴露于特定化学物质或相、暴露于光、还原剂等)时为可降解、可破坏或可溶解的。在一些情况下,固相支持物可以是可溶解的,使得固相支持物的材料组分在暴露于特定化学物质或环境变化(例如变化温度或pH变化)时溶解。在一些情况下,固相支持物在升高的温度和/或碱性条件下降解或溶解。在一些情况下,固相支持物可以是可热降解的,使得当固相支持物暴露于适当的温度变化(例如,加热)时,固相支持物降解。与物质(例如,寡核苷酸探针)结合的固相支持物的降解或溶解可导致物质从固相支持物中释放。
如本文所用,术语“转座酶”和“逆转录酶”以及“核酸聚合酶”是指负责催化特异性化学反应和生物学反应的蛋白质分子或蛋白质分子聚集体。一般来说,本发明的方法、组合物或试剂盒 不限于使用来自特定来源的特定的转座酶、逆转录酶或核酸聚合酶。反之,本发明的方法、组合物或试剂盒包括与根据特定方法、组合物或试剂盒的本文公开的特定酶具有等同酶活性的来自任何来源的任何转座酶、逆转录酶或核酸聚合酶。更进一步,本发明的方法还包括如下实施方案:其中在所述方法的步骤中提供和使用的任何一种特定的酶被两种或多种酶的组合取代,所述两种或多种酶在组合使用时,不论是以分步方式分别使用还是同时一起使用,反应混合物产生的结果与使用该一种特定的酶获得的结果相同。本文提供的方法、缓冲液和反应条件,包括在实施例中的方法、缓冲液和反应条件目前对于本发明的方法、组合物和试剂盒的实施方案是优选的。然而,使用本发明的一些酶的其他的酶储存缓冲液、反应缓冲液和反应条件是本领域已知的,其也可能适于在本发明中使用并且被包括在本文中。
发明的有益效果
本申请提供了能够对核酸分子进行定位标记的高分辨率核酸阵列(例如芯片)和方法,以及利用所述核酸阵列或方法进行高通量测序(特别是,高通量的单细胞转录组测序)的方法。本申请的方法具有一个或多个选自下列的有益技术效果:
(1)所述核酸阵列(例如芯片)分辨率高,其在单个细胞面积(例如80-100μm²)下可以含有至少50个(例如至少50个,至少100个,至少200个,至少300个,至少400个,或至少500个)微点,每个微点偶联有一种含有位置信息的标记用寡核苷酸探针(例如含有标签序列Y的寡核苷酸探针),每种寡核苷酸探针包含至少一个拷贝。因此,所述核酸阵列可实现将样品(例如细胞悬液)中的不同细胞分别标记上特异性定位序列(例如标签序列Y),由此,通过检测标记的核酸分子中的特异性定位序列(例如标签序列Y),从而可以确定所述核酸分子在核酸阵列上的空间位置信息,进而确定来自于同一个单细胞的核酸分子,从而实现单细胞样本的分析。
(2)当所述核酸阵列(例如芯片)用于构建高通量测序文库时,不需要对样品(例如细胞悬液)中的细胞进行单细胞分选,而是可以直接将多个细胞直接吸附在所述核酸阵列(例如芯片)上。由于所述核酸阵列(例如芯片)的分辨率高,所述微点的尺寸和间距远远小于单个细胞的大小,因此,可以保证铺在核酸阵列(例如芯片)上的每个细胞(或者,来自所述细胞的核酸分子)都被所述核酸阵列(例如芯片)上一个或多个带有位置信息的标记用寡核苷酸探针(例如含有标签序列Y的寡核苷酸探针)捕获并标记。换言之,所述核酸阵列(例如芯片)在理论上可捕获和标记样品中的每一个细胞,这有效避免了稀有细胞信息的缺失。
(3)基于所述核酸阵列(例如芯片)的独特设计,本发明方法在单张芯片上即可捕获上百万个细胞,用于单细胞测序,且细胞捕获效率理论上可达到100%。即,本发明方法的细胞捕获通量可达到百万级别,细胞捕获效率可达近100%,远超出现有技术(现有技术,例如10x chromium细胞分选平台,受油相中形成的微液滴数量的限制,其通量难以突破万级别,且由于泊松分布的特点,细胞捕获率理论上最多能达到60%)。
下面将结合附图和实施例对本发明的优选实施方案进行详细描述,但是本领域技术人员将理解,下列附图和实施例仅用于说明本发明,而不是对本发明的范围的限定。根据附图和优选实施方案的下列详细描述,本发明的各种目的和有利方面对于本领域技术人员来说将变得显而易见。
附图说明
图1A显示了本申请中用于捕获和标记核酸分子的芯片的示例性结构,其包含:芯片和偶联在芯片上的寡核苷酸探针(也称芯片序列)。每种寡核苷酸探针包含与其在芯片上的位置相对应的标签序列Y,每种寡核苷酸探针与芯片的偶联区域可称为微点。每种寡核苷酸探针可以是单拷贝的或多拷贝的。
图1B显示了样品中的细胞与芯片接触后,被芯片上的一个或多个微点所标记。
图2显示了以样本中的RNA(例如mRNA)为模板制备cDNA链的示例性方案1,以及,所述cDNA链的示例性结构。CA:共有序列A;CB:共有序列B。
图3显示了用芯片序列标记cDNA链的5’端(即,将cDNA链的5’端与芯片序列的3’端进行连接),形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案,以及,所述含有芯片序列信息的新核酸分子的示例性结构。CA:共有序列A;CB:共有序列B;X1:共有序列X1,Y:标签序列Y;X2:共有序列X2。
图4显示了以样本中的RNA(例如mRNA)为模板制备cDNA链互补链的示例性方案1,以及,所述cDNA链互补链的示例性结构。CA:共有序列A;CB:共有序列B;EP:延伸引物。
图5显示了用芯片序列标记cDNA链互补链的5’端(即,将cDNA链的互补链的5’端与芯片序列的3’端进行连接),形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案,以及,所述含有芯片序列信息的新核酸分子的示例性结构。CA:共有序列A;CB:共有序列B;X1:共有序列X1,Y:标签序列Y;X2:共有序列X2。
图6显示了以样本中的RNA(例如mRNA)为模板制备cDNA链的示例性方案2,以及,所述cDNA链的示例性结构。CA:共有序列A;CB:共有序列B。
图7显示了用芯片序列的互补序列标记cDNA链的3’端,形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案1,以及,所述含有芯片序列信息的新核酸分子示例性结构。CA:共有序列A;CB:共有序列B;X1:共有序列X1;Y:标签序列Y;X2:共有序列X2;P1:第一区域;P2:第二区域。
图8显示了用芯片序列的互补序列标记cDNA链的3’端,形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案2,以及,所述含有芯片序列信息的新核酸分子的示例性结构。CA:共有序列A;CB:共有序列B;X1:共有序列X1;Y:标签序列Y;X2:共有序列X2。
图9显示了以样本中的RNA(例如mRNA)为模板制备cDNA链的互补链的示例性方案2,以及,所述cDNA链互补链的示例性结构。CA:共有序列A;CB:共有序列B;EP:延伸引物。
图10显示了用芯片序列的互补序列标记cDNA链互补链的3’端,形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案1,以及,所述含有芯片序列信息的新核酸分子示例性结构。CA:共有序列A;CB:共有序列B;X1:共有序列X1;Y:标签序列Y;X2:共有序列X2;P1:第一区域;P2:第二区域。
图11显示了用芯片序列的互补序列标记cDNA链互补链的3’端,形成含有芯片序列信息的新核酸分子(即,经芯片序列标记的核酸分子)的示例性方案2,以及,所述含有芯片序列信息的新核酸分子的示例性结构。CA:共有序列A;CB:共有序列B;X1:共有序列X1;Y:标签序列Y;X2:共有序列X2。
图12显示了通过实施例1的方法得到的部分Hek293细胞的基因表达图谱。
图13显示了通过实施例1的方法得到的部分Hek293细胞的基因表达图谱的局部放大图。
图14显示了实施例2中cDNA扩增产物的长度分布。
图15显示了通过实施例2的方法得到的Hek293细胞的基因表达图谱。
图16显示了通过实施例2的方法得到的单个细胞捕获到的平均基因数及UMI数。
具体实施方式
现参照下列意在举例说明本发明(而非限定本发明)的实施例来描述本发明。除非特别指明,否则基本上按照本领域内熟知的以及在各种参考文献中描述的常规方法进行实施例中描述的实验和方法。另外,实施例中未注明具体条件者,按照常规条件或制造商建议的条件进行。所用试剂或仪器未注明生产厂商者,均为可以通过市购获得的常规产品。本领域技术人员知晓,实施例以举例方式描述本发明,且不意欲限制本发明所要求保护的范围。本文中提及的全部公开案和其他参考资料以其全文通过引用合并入本文。
实施例1
本实施例涉及的序列信息如表1-1所示:
表1-1序列信息
Figure PCTCN2022135478-appb-000003
Figure PCTCN2022135478-appb-000004
注:“r”表示其3’相邻位置的核苷酸为核糖核苷酸;“+”表示其3’相邻位置的核苷酸存在LNA(锁核苷酸)修饰;“*”表示硫代磷酸修饰;“p”表示磷酸化修饰;N=A,T,C or G;V=A,C or G。
一、捕获芯片的制备
1.设计含有用于定位芯片位置信息的DNA文库分子的序列,其从5’端到3’端包含:共有序列X1(X1),标签序列(Y)和共有序列X2(X2)的编码序列。DNA文库分子的典型核苷酸序列如SEQ ID NO:1所示。委托北京六合华大基因科技有限公司(Beijing Liuhe BGI Co.,Ltd)合成DNA文库分子。
2.文库分子的扩增和装载
(1)使用DNBSEQ测序试剂盒(购自MGI,货号1000019840)来制备DNA纳米球(DNB)。具体的实施方案简要描述如下。
简言之,配置如表1-2所示的反应体系40μL。将该反应体系放置于PCR仪,并按照如下反应条件进行反应:95℃ 3min,40℃ 3min。反应结束后,将反应产物放于冰上,加入40μL混合酶I和2μL混合酶II(来自于DNBSEQ测序试剂盒),1μL ATP(母液100mM,获自Thermo Fisher),和0.1μL T4 ligase(获自NEB,货号:M0202S)。混匀后,将上述反应体系置于PCR仪并在30℃反应20min,生成DNB。
表1-2制备DNB的反应体系
Figure PCTCN2022135478-appb-000005
(2)随后,按照BGISEQ-500高通量测序试剂套装(SE50)(购自MGI,货号:1000012551)所述的方法将DNB装载至BGISEQ-500测序芯片上。
在测序芯片内,加入BGISEQ500PE50测序试剂盒(购自MGI,货号:1000012554)中的MDA试剂,37℃孵育30min后,5XSSC清洗芯片。
(3)芯片表面修饰N3-PEG3500-NHS(修饰试剂购自Sigma,货号:JKA5086),孵育30min后,泵入DBCO修饰的芯片序列合成引物(序列如SEQ ID NO:3所示),常温过夜孵育。
3.位置序列信息的测序解码。按照BGISEQ-500高通量测序试剂套装的说明书对DNB进行测序,SE设置读长25bp。在上述DBCO修饰的序列进行延伸获得测序后生长出来的链,对该链进行解码,获得对应DNB的位置序列信息。
4.测序后生长出来的链继续延伸:在上述步骤3基础上继续进行15个碱基的cPAS反应,得到芯片序列(SEQ ID NO:8,其含有共有序列X1(SEQ ID NO:4),标签序列Y,共有序列X2(SEQ ID NO:5))。
5.使用限制性内切酶HaeIII切除DNB,并高温变性去除DNB上的残留片段,使芯片仅残留步骤4的芯片序列。
6.芯片切块:将制备好的芯片切成若干小片,切片大小根据实验需要进行调整,将芯片浸泡在50mM pH8.0的Tris buffer中,4℃待用。
二、单细胞固定
1.取大约3000个Hek293细胞按常规方法制备成细胞悬浮液(PBS溶液)。
2.取制备好的芯片,用多聚赖氨酸处理30min,进行半小时的干燥。
3.将细胞悬浮液滴在处理好的芯片上,室温孵育30min后,将芯片置于-20℃冰冻甲醇中固定30min。
4.采用0.1% Triton X-100处理15min,对细胞进行透化。
三、cDNA原位合成
1.cDNA合成
配置如表1-3所示的逆转录酶反应体系200μL,将反应液加到芯片上,充分覆盖,42℃反应90min-180min。逆转录酶将以mRNA为模板,以含有polyT的引物(序列如SEQ ID NO:6所示,其含有共有序列A(CA),UMI序列(NNNNNNNNNN)和polyT序列)进行cDNA合成,并在cDNA链的3’末端添加CCC悬突。在TSO序列(SEQ ID NO:7,其含有共有序列B(CB)以及GGG悬突)与cDNA链杂交退火后(通过TSO序列末端的GGG与cDNA链的CCC悬突的互补配对),逆转录酶将以共有序列B为模板,继续延伸cDNA链,使cDNA的3’端带上c(CB)标签(共有序列B的互补序列)。
表1-3 cDNA合成体系
Figure PCTCN2022135478-appb-000006
合成的cDNA链包含如下的序列结构:逆转录引物的序列(SEQ ID NO:6)-cDNA序列- c(TSO)的序列(SEQ ID NO:7的互补序列)。
2.测序芯片的芯片序列与cDNA的连接
cDNA合成后,用5X SSC清洗芯片两次,配置如表1-4所示的反应体系1ml,向芯片中泵入合适的体积,保证芯片中充满下列连接反应液,室温反应30min。
上述反应可实现将cDNA序列的5’端与单细胞测序芯片序列的3’端连接起来(即,用芯片序列标记cDNA序列的5’端),得到新的含有位置信息(即标签序列Y)的核酸分子,其包含如下的序列结构:芯片序列(SEQ ID NO:8)-逆转录引物的序列(SEQ ID NO:6)-cDNA序列-c(TSO)的序列(SEQ ID NO:7的互补序列)。
反应结束后,使用5X SSC清洗芯片。按照说明书配制Bst聚合反应液(NEB,M0275S)200μL,泵入芯片,65℃反应60min,得到含有位置信息的单链核酸分子。
表1-4连接体系
Figure PCTCN2022135478-appb-000007
3.cDNA释放
使用75μL 80mM KOH室温孵育芯片5min,收集液体后加入10μL 1M pH8.0Tris-HCl中和cDNA回收液。
4.cDNA扩增
配制如表1-5所示的反应体系200μL,分别用于3’端转录组测序建库,分成2管PCR:
表1-5 cDNA扩增体系
Figure PCTCN2022135478-appb-000008
将上述反应体系置于PCR仪,设置如下反应程序:95℃ 3min,11循环(98℃ 20s,58℃ 20s,72℃ 3min),72℃ 5min,4℃∞。反应结束后,用XP beads(购自AMPure)进行磁珠纯化回收。使用Qubit仪器对dsDNA浓度进行定量,并且,使用2100生物分析仪(购自Agilent)检测cDNA扩增产物的长度分布。
四、cDNA建库测序
1.Tn5打断
根据cDNA浓度,取20ng cDNA(步骤三中获得的),加入0.5μM Tn5转座酶及相应buffer (购自BGI,货号10000028493,Tn5打断酶包被方法按照Stereomics文库制备试剂盒-S1操作),混匀配成20μL的反应体系,在55℃反应10min后,加入5μL 0.1%SDS混匀室温5min结束Tn5打断步骤。
2.PCR扩增
配置如下反应体系100μL:
表1-6建库扩增反应体系
Figure PCTCN2022135478-appb-000009
混匀后置于PCR仪,设置如下程序95℃ 3min,11循环(98℃ 20s,58℃ 20s,72℃3min),72℃ 5min,4℃∞。反应结束后,用XP beads进行磁珠纯化回收。使用Qubit仪器定量dsDNA浓度。
3.测序
取上述打断后的扩增产物80fmol,进行DNB制备。配置如下40μL反应体系:
表1-7测序用DNB制备体系
Figure PCTCN2022135478-appb-000010
将上述反应体系放置于PCR仪反应,反应条件如下:95℃ 3min,40℃ 3min;反应结束后,放于冰上,加入40μL DNBSEQ测序试剂盒中DNB制备所需的混合酶I,2μL混合酶II,及1μL ATP,0.1μL T4 Ligase,混匀后,将上述反应体系置于PCR仪,30℃反应20min,形成DNB。
按照MGISEQ 2000配套的PE50试剂盒所述的方法,将DNB装载至MGISEQ 2000的测序芯片上,并按照相关说明书要求进行测序,选择PE50测序模式,其中一链测序分成两段测序,先测25bp后进行15个循环暗反应,再测10bp UMI序列,二链测序设置测50bp。
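上文制备DNB时按摩尔量(80fmol)取用扩增产物,而Qubit给出的是质量浓度,因此需要在两者之间换算。下面是一个算术草图,仅供参考:它假设产物为双链DNA,并采用每个碱基对平均约660g/mol这一常用近似值;示例中的片段平均长度500bp是虚构的输入,实际应以生物分析仪测得的长度分布为准。

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # 双链DNA每个碱基对的平均摩尔质量(常用近似值)


def fmol_to_ng(amount_fmol, length_bp):
    """把双链DNA的摩尔量(fmol)换算为质量(ng)。"""
    # fmol x 1e-15 mol/fmol x (bp x 660 g/mol) x 1e9 ng/g
    return amount_fmol * length_bp * AVG_BP_MASS_G_PER_MOL * 1e-6


def volume_for_amount(amount_fmol, length_bp, conc_ng_per_ul):
    """给定质量浓度(ng/μL)时,取得指定摩尔量所需的体积(μL)。"""
    return fmol_to_ng(amount_fmol, length_bp) / conc_ng_per_ul


if __name__ == "__main__":
    print(fmol_to_ng(80, 500))               # 约26.4 ng
    print(volume_for_amount(80, 500, 10.0))  # 浓度为10 ng/μL时约2.64 μL
```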
五、数据分析:
1.登录网站http://stereomap.cngb.org/Stereo-Draftsman/report/index,按照网站操作指南进行数据分析。PE50测序中获得的read1序列(来自于一链测序),其前25bp序列与芯片制备过程中的25bp位置信息进行比对,把能够比对到芯片上的位置信息的reads保留下来,并将它们对应到相应的芯片位置上。找出对应到芯片位置上的reads所对应的read2(来自于二链测序),将read2与人的基因组进行比对,根据UMI信息去掉重复的reads,得到每个细胞中捕获到的基因情况及每个基因的reads数量。
2.截取芯片上一部分,其分析结果如图12和图13,其中图13为图12中部分放大的细胞,从结果图中可以看到单个细胞散布在芯片上。圈取其中一个细胞,可看到该细胞捕获到的平均基因数及UMI数。
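上述数据分析中“位置序列匹配”与“按UMI去重”两步的逻辑,可以用下面的草图来表示。这不是所述网站或任何现有软件的实现,仅为示意:假设read1在输出中为35个碱基(前25bp为位置序列,后10bp为UMI,暗反应循环不产生碱基),位置序列采用精确匹配,read2的基因注释已由外部比对工具给出;函数名和数据结构均为虚构。

```python
from collections import defaultdict

BARCODE_LEN = 25  # read1 中位置序列的长度
UMI_LEN = 10      # read1 中 UMI 的长度


def count_umis(records, barcode_to_xy):
    """统计每个芯片位置、每个基因的去重后分子数。

    records: 由 (read1, gene) 组成的可迭代对象;gene 为 read2 比对得到的
             基因名,未比对上时为 None(比对步骤不在本草图范围内)。
    barcode_to_xy: 位置序列到芯片坐标 (x, y) 的字典,来自芯片制备时的解码结果。
    """
    umis = defaultdict(set)
    for read1, gene in records:
        barcode = read1[:BARCODE_LEN]
        umi = read1[BARCODE_LEN:BARCODE_LEN + UMI_LEN]
        xy = barcode_to_xy.get(barcode)
        if xy is None or gene is None:
            continue  # 位置序列无法匹配,或 read2 未比对到基因
        umis[(xy, gene)].add(umi)  # 同一位置、同一基因的相同 UMI 只计一次
    return {key: len(value) for key, value in umis.items()}


if __name__ == "__main__":
    bc = "ACGT" * 6 + "A"
    reads = [
        (bc + "AAAAAAAAAA", "GAPDH"),
        (bc + "AAAAAAAAAA", "GAPDH"),  # 重复分子
        (bc + "CCCCCCCCCC", "GAPDH"),
    ]
    print(count_umis(reads, {bc: (3, 7)}))
```

实际流程通常还会允许位置序列和UMI存在少量错配;本草图为简明起见未作处理。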
实施例2
本实施例涉及的序列信息如表2-1和表1-1所示:
表2-1序列信息
Figure PCTCN2022135478-appb-000011
注:“r”表示其3’相邻位置的核苷酸为核糖核苷酸;“+”表示其3’相邻位置的核苷酸存在LNA(锁核苷酸)修饰;“*”表示硫代磷酸修饰;“p”表示磷酸化修饰;N=A,T,C or G;V=A,C or G。
一、捕获芯片的制备
1、设计并合成含有位置信息的DNA文库序列,其核苷酸序列如SEQ ID NO:1所示。委托北京六合华大基因科技有限公司(Beijing Liuhe BGI Co.,Ltd)进行序列合成。
2、文库原位扩增
(1)DNA纳米球(DNB)制备:配置如表2-2所示的反应体系40μL,投入80fmol含有位置序列信息的DNA文库
表2-2制备DNB的反应体系
Figure PCTCN2022135478-appb-000012
将上述反应体系放置于PCR仪反应,反应条件如下:95℃ 3min,40℃ 3min;反应结束后,放于冰上,加入40μL DNBSEQ测序试剂盒中DNB制备所需的混合酶I,2μL混合酶II,及1μL ATP(母液100mM,Thermo Fisher),0.1μL T4 ligase(购自NEB,货号:M0202S),混匀后,将上述反应体系置于PCR仪,30℃反应20min,形成DNB。按照BGISEQ-500高通量测序试剂套装(SE50)所述的方法将所述DNB装载至BGISEQ-500测序芯片上。
(2)在测序芯片内,加入PE50测序试剂盒(购自MGI,货号:1000012554)中的MDA试剂,37℃孵育30min后,5XSSC清洗芯片。
(3)芯片表面修饰N3-PEG3500-NHS(购自Sigma,货号:JKA5086),孵育30min后,泵入DBCO修饰的序列(SEQ ID NO:3),常温过夜孵育。
3、位置序列信息的测序解码。按照BGISEQ-500高通量测序试剂套装的说明书对DNB进行测序,SE设置读长25bp。在上述DBCO修饰的序列进行延伸获得测序后生长出来的链,对该链进行解码,获得对应DNB的位置序列信息。
4、六合华大合成带UMI的探针捕获序列(SEQ ID NO:15,其5’末端磷酸化修饰),通过T4连接酶按下列反应体系将捕获序列连接到测序后生长出来的链上。连接反应体系如表2-3。
表2-3连接反应体系
Figure PCTCN2022135478-appb-000013
5、使用限制性内切酶HaeIII及MboI切除DNB,并高温变性去除DNB上的残留片段,使芯片仅残留含有位置信息和捕获序列的探针(序列如SEQ ID NO:16):
CTGCTGACGTACTGAGAGGCATGGCGACCTTATCAGNNNNNNNNNNNNNNNNNNNNNNNNNTTGTCTTCCTAAGACNNNNNNNNNNTTTTTTTTTTTTTTTTTTTTTTV。其中,NNNNNNNNNNNNNNNNNNNNNNNNN代表位置信息。
6、芯片切块:将制备好的芯片切成若干小片,切片大小根据实验需要进行调整,将芯片浸泡在50mM pH8.0的Tris buffer中,4℃待用。
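上文步骤5给出的探针序列由若干区段依次组成,可以用正则表达式把各区段拆开来查看。下面的草图仅为说明:5’端固定区段和中间连接区段的序列取自上文,25个N按位置信息处理、10个N按UMI处理;poly(T)长度在此不作限定(只要求至少一个T),末端V取A、C或G;变量名和函数名均为虚构,“固定区段”“连接区段”也只是本草图中的叫法。

```python
import re

FIXED_5P = "CTGCTGACGTACTGAGAGGCATGGCGACCTTATCAG"  # 探针5’端固定区段(取自上文序列)
LINKER = "TTGTCTTCCTAAGAC"                          # 位置信息与UMI之间的区段(取自上文序列)

PROBE_RE = re.compile(
    rf"^{FIXED_5P}"
    r"(?P<position>[ACGT]{25})"  # 25nt 位置信息
    rf"{LINKER}"
    r"(?P<umi>[ACGT]{10})"       # 10nt UMI
    r"(?P<poly_t>T+)"            # poly(T) 捕获序列
    r"(?P<anchor>[ACG])$"        # 3’末端的 V(A、C 或 G)
)


def parse_probe(seq):
    """把具体的探针序列拆成各区段;不符合上述结构时返回 None。"""
    match = PROBE_RE.match(seq)
    return match.groupdict() if match else None


if __name__ == "__main__":
    example = FIXED_5P + "ACGT" * 6 + "A" + LINKER + "GATTACAGAT" + "T" * 22 + "C"
    print(parse_probe(example))
```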
二、单细胞固定
1、取大约3000个Hek293细胞按常规方法制备成细胞悬浮液(PBS溶液)。
2、取制备好的芯片,用多聚赖氨酸处理30min,进行半小时的干燥。
3、将细胞悬浮液滴在处理好的芯片上,室温孵育30min后,将芯片置于-20℃冰冻甲醇中固定30min。
4、采用0.1% Triton X-100处理15min,对细胞进行透化。
三、cDNA原位合成
1、cDNA合成。使用5XSSC室温清洗芯片两次,配置如表2-4的逆转录酶反应体系200μL,将反应液加到含有细胞的芯片上,充分覆盖,42℃反应90min-180min,以芯片上的探针polyT为引物进行cDNA合成,cDNA的3’端带上TSO标签,用于cDNA互补链的合成,cDNA链如下:
CTGCTGACGTACTGAGAGGCATGGCGACCTTATCAGNNNNNNNNNNNNNNNNNNNNNNNNNTTGTCTTCCTAAGACNNNNNNNNNNTTTTTTTTTTTTTTTTTTTTTTV(cDNA)CCCGCCTCTCAGTACGTCAGCAG。随后,用RNaseH处理30min,消化RNA。
表2-4逆转录酶反应体系
Figure PCTCN2022135478-appb-000014
3、cDNA释放。使用75μL 200mM KOH室温孵育芯片5min,收集液体后加入15μL 1M pH8.0Tris-HCl中和cDNA回收液。
4、cDNA扩增。配制如表2-5的反应体系200μL分别用于3’端和5’端转录组建库,各分成2管PCR:
表2-5 cDNA扩增体系
Figure PCTCN2022135478-appb-000015
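The above reaction system was placed in a PCR instrument with the following program: 95℃ for 3 min; 11 cycles of (98℃ for 20 s, 58℃ for 20 s, 72℃ for 3 min); 72℃ for 5 min; hold at 4℃. After the reaction, the product was purified and recovered with XP beads. The dsDNA concentration was quantified with a Qubit kit, and the cDNA fragment distribution was checked with a 2100 Bioanalyzer (Agilent). The result is shown in Figure 14; the cDNA length was normal.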
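The cDNA strand given in step 1 of this section starts with the probe's fixed 5' sequence and ends with the TSO-derived tail CCCGCCTCTCAGTACGTCAGCAG. The sketch below checks how much of that tail is the reverse complement of the probe's 5' end, which indicates how long the inverted terminal repeat on the released cDNA is. The helper names are illustrative assumptions; the two sequences are copied from the text, and the primers actually used are those of Table 2-5, which is not reproduced here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PROBE_5P = "CTGCTGACGTACTGAGAGGCATGGCGACCTTATCAG"  # fixed 5' end of the on-chip probe
CDNA_3P = "CCCGCCTCTCAGTACGTCAGCAG"                # 3' tail of the cDNA strand


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shared_terminal_length(five_prime, three_prime):
    """Longest 5' prefix whose reverse complement forms the end of the 3' tail."""
    best = 0
    for n in range(1, min(len(five_prime), len(three_prime)) + 1):
        if three_prime.endswith(reverse_complement(five_prime[:n])):
            best = n
    return best


if __name__ == "__main__":
    print(shared_terminal_length(PROBE_5P, CDNA_3P))
```

With the sequences as printed, the last 20 nt of the tail are the reverse complement of the first 20 nt of the probe, and the remaining CCC corresponds to the 3' overhang.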
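IV. cDNA library construction and sequencing
1. Tn5 fragmentation. Based on the cDNA concentration, 20 ng of cDNA was taken, and 0.5 μM Tn5 transposase (loaded with the first strand shown in SEQ ID NO:19 and the second strand shown in SEQ ID NO:20) and the corresponding buffer (BGI, Cat. No. 10000028493; Tn5 loading followed the Stereomics library preparation kit protocol) were added and mixed to give a 20 μL reaction system. After reaction at 55℃ for 10 min, 5 μL of 0.1% SDS was added, mixed and left at room temperature for 5 min to terminate the Tn5 fragmentation.
2. PCR amplification. 100 μL of the reaction system shown in Table 2-6 was prepared:
Table 2-6 PCR amplification reaction system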
Figure PCTCN2022135478-appb-000016
Figure PCTCN2022135478-appb-000017
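After mixing, the reaction was placed in a PCR instrument with the following program: 95℃ for 3 min; 11 cycles of (98℃ for 20 s, 58℃ for 20 s, 72℃ for 3 min); 72℃ for 5 min; hold at 4℃. After the reaction, the product was purified and recovered with XP beads, and the dsDNA concentration was quantified with a Qubit kit.
3. Sequencing. 80 fmol of the amplified product obtained after fragmentation was used for DNB preparation. A 40 μL reaction system was prepared as shown in Table 2-7:
Table 2-7 DNB preparation system for sequencing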
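Component | Volume (μL) | Final concentration
Amplification product from step 2 above | X (80 fmol) | -
10× phi29 buffer (Thermo Fisher, Cat. No. B62) | 4 | 1×
DNB primer sequence, 10 μM (SEQ ID NO:22, synthesized by Liuhe BGI) | 4 | 1 μM
 | 32 - X | -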
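The above reaction system was placed in a PCR instrument and reacted under the following conditions: 95℃ for 3 min, then 40℃ for 3 min. After the reaction, the tube was placed on ice, and 40 μL of Mixed Enzyme I for DNB preparation from the DNBSEQ sequencing kit, 2 μL of Mixed Enzyme II, 1 μL of ATP (100 mM stock, Thermo Fisher) and 0.1 μL of T4 ligase were added. After mixing, the reaction system was placed in a PCR instrument at 30℃ for 20 min to form DNBs.
The DNBs were loaded onto the MGISEQ 2000 sequencing chip according to the method described in the PE50 kit for the MGISEQ 2000 and sequenced according to the relevant instructions. The PE50 sequencing mode was selected. Read 1 was sequenced in two segments: 25 bp were read first, followed by 15 dark-reaction cycles, and then the 10 bp UMI sequence was read. Read 2 was set to 50 bp.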
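Table 2-7 specifies the library input as 80 fmol in a volume X, with the remaining 32 - X μL made up by the last component. A minimal sketch of that arithmetic follows. It assumes the common approximation of 660 g/mol per base pair for double-stranded DNA. The function name, the example concentration and the mean fragment length are hypothetical; in practice they would come from the Qubit and fragment-analysis readings.

```python
def dnb_input_volumes(conc_ng_per_ul, mean_length_bp,
                      target_fmol=80.0, reserved_ul=32.0):
    """Return (X, 32 - X) in microlitres for the DNB preparation mix.

    Mass needed (ng) = fmol * length (bp) * 660 g/mol/bp * 1e-6.
    """
    ng_needed = target_fmol * mean_length_bp * 660.0 * 1e-6
    x_ul = ng_needed / conc_ng_per_ul
    if x_ul > reserved_ul:
        raise ValueError("library too dilute: X exceeds the reserved volume")
    return x_ul, reserved_ul - x_ul


if __name__ == "__main__":
    # Hypothetical library: 10 ng/uL with a mean fragment length of 500 bp.
    x, rest = dnb_input_volumes(10.0, 500)
    print(f"X = {x:.2f} uL, remainder = {rest:.2f} uL")
```

The function raises an error when the library is too dilute for 80 fmol to fit into the reserved volume, since the table has no provision for exceeding 40 μL.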
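V. Data analysis
1. Log in to the website http://stereomap.cngb.org/Stereo-Draftsman/report/index and follow its operating guide for data analysis. The first 25 bp of read 1 from the PE50 data are compared with the 25 bp position information obtained during chip preparation; reads whose read 1 matches a position sequence on the chip are retained and assigned to the corresponding chip position. Read 2 of the position-assigned reads is aligned to the human genome, and duplicate reads are removed according to the UMI information, giving the genes captured in each cell and the number of reads for each gene.
2. The analysis results are shown in Figures 15 and 16: a portion of the chip is shown, in which a large number of single cells can be seen distributed on the chip. Circling one of these cells shows the average number of genes and the number of UMIs captured for that cell.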
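The analysis logic described in step 1 (match the first 25 bp of read 1 to a chip position, then collapse duplicates by UMI) can be sketched as follows. This is not the code behind the website referenced above. It assumes that the dark cycles emit no bases, so read 1 is a 35-nt string with the position sequence followed directly by the UMI, and that a separate aligner has already assigned a gene to read 2. All names are hypothetical.

```python
from collections import defaultdict


def split_read1(read1, cid_len=25, umi_len=10):
    """Split read 1 into (position sequence, UMI)."""
    return read1[:cid_len], read1[cid_len:cid_len + umi_len]


def count_genes(records, cid_to_xy):
    """Count unique UMIs per gene at each chip position.

    records   -- iterable of (read1, gene); gene is None if read 2 was unmapped
    cid_to_xy -- dict mapping a decoded 25-nt position sequence to (x, y)
    """
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for read1, gene in records:
        cid, umi = split_read1(read1)
        xy = cid_to_xy.get(cid)
        if xy is None or gene is None:
            continue  # no chip position, or no gene assignment
        key = (xy, gene, umi)
        if key in seen:
            continue  # duplicate read of the same molecule
        seen.add(key)
        counts[xy][gene] += 1
    return {xy: dict(genes) for xy, genes in counts.items()}
```

Exact matching of the position sequence is used here for simplicity; tolerance for sequencing errors would be a separate design decision.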
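Although specific embodiments of the present invention have been described in detail, those skilled in the art will understand that, in light of all the teachings that have been disclosed, various modifications and changes can be made to the details, and that all such changes fall within the scope of protection of the present invention. The full scope of the invention is given by the appended claims and any equivalents thereof.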

Claims (61)

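  1. A method for generating a population of labeled nucleic acid molecules, comprising the following steps:
    (1) providing: a sample containing one or more cells, and a nucleic acid array;
    wherein the sample is a single-cell suspension; the cells contain (e.g., on their surface) a first binding molecule;
    the nucleic acid array comprises a solid support, the solid support contains (e.g., on its surface) a first labeling molecule, and the first binding molecule is capable of forming an interaction pair with the first labeling molecule;
    and the solid support further comprises a plurality of microdots, the size (e.g., equivalent diameter) of the microdots being less than 5 μm and the center-to-center distance between adjacent microdots being less than 10 μm; each microdot is coupled with one kind of oligonucleotide probe, and each kind of oligonucleotide probe comprises at least one copy; the oligonucleotide probe comprises, or consists of, in the 5' to 3' direction: a consensus sequence X1, a tag sequence Y and a consensus sequence X2, wherein
    oligonucleotide probes coupled to different microdots have different tag sequences Y;
    (2) contacting the one or more cells with the solid support of the nucleic acid array, whereby each cell individually occupies at least one microdot of the nucleic acid array (i.e., each cell individually contacts at least one microdot of the nucleic acid array), and the first binding molecule of the cell forms an interaction pair with the first labeling molecule of the solid support; wherein
    before or after contacting the one or more cells with the nucleic acid array, the RNA (e.g., mRNA) of the one or more cells is subjected to a pretreatment comprising reverse transcription to generate a first population of nucleic acid molecules;
    and
    (3) associating the first population of nucleic acid molecules derived from each cell, obtained in the preceding step, with the oligonucleotide probe coupled to the microdot occupied by the cell from which it is derived, thereby generating a second population of nucleic acid molecules labeled with the tag sequence Y.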
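  2. The method of claim 1, wherein the center-to-center distance between adjacent microdots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and the size (e.g., equivalent diameter) of the microdots is less than 5 μm, less than 1 μm, less than 0.3 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm;
    preferably, the center-to-center distance between adjacent microdots is 0.5 μm to 1 μm, e.g., 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm;
    preferably, the size (e.g., equivalent diameter) of the microdots is 0.001 μm to 0.5 μm (e.g., 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, or 0.2 μm to 0.3 μm).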
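To give a sense of scale for the preferred center-to-center distances, the sketch below estimates how many microdots lie under one attached cell. It assumes a square lattice of microdots and a circular cell footprint; the 10 μm cell diameter is an illustrative value and is not taken from the claims.

```python
import math


def microdots_under_cell(cell_diameter_um, pitch_um):
    """Approximate microdot count under a circular footprint on a square lattice."""
    footprint_area = math.pi * (cell_diameter_um / 2.0) ** 2
    area_per_microdot = pitch_um ** 2
    return footprint_area / area_per_microdot


if __name__ == "__main__":
    for pitch in (1.0, 0.8, 0.5):
        n = microdots_under_cell(10.0, pitch)
        print(f"pitch {pitch} um: about {n:.0f} microdots per cell")
```

Under these assumptions a single cell covers on the order of tens to hundreds of microdots, so each cell is associated with many distinct tag sequences Y rather than one.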
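  3. The method of claim 1 or 2, wherein the first binding molecule is capable of forming a specific or a non-specific interaction pair with the first labeling molecule;
    preferably, the interaction pair is selected from positive-negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), molecular pairs capable of undergoing a click chemistry reaction (e.g., an alkynyl-containing group and an azide compound), an N-hydroxysulfosuccinimide (NHS) ester and an amino-containing compound, or any combination thereof;
    for example, the first labeling molecule is polylysine and the first binding molecule is a protein capable of binding polylysine; the first labeling molecule is an antibody and the first binding molecule is an antigen capable of binding the antibody; the first labeling molecule is an amino-containing compound and the first binding molecule is an N-hydroxysulfosuccinimide (NHS) ester; or the first labeling molecule is biotin and the first binding molecule is streptavidin.
  4. The method of any one of claims 1-3, wherein the first binding molecule is naturally contained in the cells.
  5. The method of any one of claims 1-3, wherein the first binding molecule is not naturally contained in the cells;
    preferably, the method further comprises a step of binding the first binding molecule to the one or more cells, or of causing the one or more cells to express the first binding molecule, so as to provide the cell sample of step (1).
  6. The method of any one of claims 1-5, wherein the method further comprises a step of binding the first labeling molecule to the solid support, so as to provide the nucleic acid array of step (1).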
  7. 权利要求1-6任一项的方法,其中,步骤(2)中,所述预处理包括以下步骤:
    (i)用引物I-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成延伸产物,所述延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);
    或,
    (ii)(a)用引物I-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物I-A为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);和,(b)将引物I-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物,所述第一延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物I-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);
    或,
    (iii)(a)用引物I-A’对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物I-A’为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物I-A’包含捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;(b)将引物I-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物;其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物I-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);和,(c)提供延伸引物,以第一延伸产物为模板进行延伸反应,生成第二延伸产物,所述第二延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;
    并且,
    步骤(3)中,通过以下步骤将前一步骤获得的源自各个细胞的第一核酸分子群与其所源自的细胞占据的微点偶联的寡核苷酸探针相关联,从而生成经所述标签序列Y标记的第二核酸分子群:
    在允许退火的条件下,将桥接寡核苷酸I与步骤(2)获得的源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针接触,使得所述桥接寡核苷酸I与步骤(2)获得的 源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针退火(例如原位退火),从而使得所述第一核酸分子群与所述阵列上的寡核苷酸探针连接,获得的连接产物即为具有位置标记的第二核酸分子,从而生成第二核酸分子群;
    其中,所述桥接寡核苷酸I包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
    所述第一区域能与步骤(2)(i)或步骤(2)(ii)中所述引物I-A的共有序列A全部或部分退火或者与步骤(2)(iii)中所述引物I-B的共有序列B全部或部分退火;
    所述第二区域能与所述共有序列X2全部或部分退火。
  8. 权利要求7的方法,其中,步骤(3)中,当所述桥接寡核苷酸I的第一区域和第二区域相邻时,所述使得所述第一核酸分子群与所述寡核苷酸探针连接包括:使用核酸连接酶将杂交于同一桥接寡核苷酸I的第一区域和第二区域的核酸分子连接,获得的连接产物即为具有位置标记的第二核酸分子;或者,
    当所述桥接寡核苷酸I包括第一区域、第二区域以及位于两者之间的第三区域时,所述使得所述第一核酸分子群与所述寡核苷酸探针连接包括:使用核酸聚合酶以所述第三区域为模板进行聚合反应,使用核酸连接酶将杂交于同一桥接寡核苷酸I的第一区域、第三区域和第二区域的核酸分子连接,获得的连接产物即为具有位置标记的第二核酸分子;优选地,所述核酸聚合酶无5’至3’端外切酶活性或链置换活性。
  9. 权利要求7或8的方法,其包括步骤(1)、步骤(2)(i)和步骤(3);其中,步骤(3)中获得的连接产物即为具有位置标记的第二核酸分子。
  10. 权利要求9的方法,其中,步骤(2)(i)中,所述捕获序列A是随机寡核苷酸序列;
    优选地,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述捕获序列A,所述捕获序列A作为所述第二核酸分子的分子标签(UMI)。
  11. 权利要求9的方法,其中,步骤(2)(i)中,所述捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列;其中,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列;
    优选地,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述标签序列A作为UMI。
  12. 权利要求7或8的方法,其包括步骤(1)、步骤(2)(ii)和步骤(3);其中,步骤(3)中获得的连接产物即为具有位置标记的第二核酸分子。
  13. 权利要求12的方法,其中,步骤(2)(ii)(a)中,所述捕获序列A是随机寡核苷酸序列;
    优选地,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述捕获序列A,所述捕获序列A作为所述第二核酸分子的分子标签(UMI)。
  14. 权利要求12的方法,其中,步骤(2)(ii)(a)中,所述捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列;其中,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列;
    优选地,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有 不同的所述标签序列A作为UMI。
  15. 权利要求9-14任一项的方法,其中,所述引物I-A的5’末端包含磷酸化修饰。
  16. 权利要求7或8的方法,其包括步骤(1)、步骤(2)(iii)和步骤(3);其中,步骤(3)中获得的连接产物即为具有位置标记的第二核酸分子。
  17. 权利要求16的方法,其中,步骤(2)(iii)(c)中,所述延伸引物为所述引物I-B或者引物B”,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,并且能起始延伸反应。
  18. 权利要求16或17的方法,其中,步骤(2)(iii)(a)中,所述引物I-A’的捕获序列A为随机寡核苷酸序列;其中,步骤(2)(iii)(b)中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及标签序列B。
  19. 权利要求16或17的方法,其中,步骤(2)(iii)(a)中,所述引物I-A’的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列;其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及标签序列B;
    优选地,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述连接产物具有不同的所述标签序列B作为UMI。
  20. 权利要求16-19任一项的方法,其中,所述延伸引物的5’末端包含磷酸化修饰。
  21. 权利要求16-20任一项的方法,其中,在步骤(2)(iii)(b)中,所述cDNA链通过其3’末端悬突与所述引物I-B退火,并且,在核酸聚合酶(例如,DNA聚合酶或逆转录酶)的作用下,所述cDNA链以所述引物I-B为模板被延伸,生成所述第一延伸产物。
  22. 权利要求1-6任一项的方法,其中,步骤(2)中,所述预处理包括以下步骤:
    (i)(a)用引物II-A对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链,所述cDNA链包含以所述引物II-A为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物II-A含有捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;和,(b)将引物II-B与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物,所述第一延伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;其中,所述引物II-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物II-B的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B的5’端);或,
    (ii)(a)用引物II-A’对所述一个或多个细胞的RNA(例如,mRNA)进行逆转录,生成cDNA链;所述cDNA链包含以所述引物II-A’为逆转录引物形成的与所述RNA(例如,mRNA)互补的cDNA序列,以及3’末端悬突;其中,所述引物II-A’含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;所述共有序列A位于所述捕获序列A的上游(例如位于所述引物II-A’的5’端);(b)将引物II-B’与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成第一延伸产物;其中,所述引物II-B’包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;所述3’末端悬突互补序列位于所述引物II-B’的3’末端;所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B’的5’端);和,(c)提供延伸引物,以第一延伸产物为模板进行延伸反应,生成第二延伸产物,所述第二延 伸产物即为待标记的第一核酸分子,从而生成第一核酸分子群;
    并且,步骤(3)中,通过以下步骤将前一步骤获得的源自各个细胞的第一核酸分子群与其所源自的细胞占据的微点偶联的寡核苷酸探针相关联,从而生成经所述标签序列Y标记的第二核酸分子群:
    (i)向步骤(2)的产物实施退火条件,使得步骤(2)获得的源自各个细胞的第一核酸分子与所述细胞占据的微点所偶联的寡核苷酸探针退火(例如原位退火),并进行延伸反应,生成延伸产物,所述延伸产物即为具有位置标记的第二核酸分子,从而生成第二核酸分子群;其中,所述寡核苷酸探针的共有序列X2或其部分序列(a)能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列退火,或者,(b)能与步骤(2)(ii)获得的第二延伸产物的所述共有序列A的互补序列或其部分序列退火;或,
    (ii)在允许退火的条件下,将桥接寡核苷酸对与步骤(2)获得的源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针接触,使得所述桥接寡核苷酸对与步骤(2)获得的源自各个细胞的第一核酸分子以及所述细胞占据的微点所偶联的寡核苷酸探针退火(例如原位退火),
    其中,所述桥接寡核苷酸对由桥接寡核苷酸II-I和桥接寡核苷酸II-II组成,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
    所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;
    所述桥接寡核苷酸II-II的第二区域(a)能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列退火,或者,(b)能与步骤(2)(ii)获得的第二延伸产物的所述共有序列A的互补序列或其部分序列退火;
    其中,将所述桥接寡核苷酸对与所述第一核酸分子群、所述寡核苷酸探针接触时,所述桥接寡核苷酸对的桥接寡核苷酸II-I和桥接寡核苷酸II-II各自以单链的形式存在,或者,所述桥接寡核苷酸对的桥接寡核苷酸II-I和桥接寡核苷酸II-II以彼此退火形成部分双链的形式存在;
    进行连接反应:将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接,和/或,将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接;并进行延伸反应;其中,所述连接反应与延伸反应以任意顺序进行;所获得的反应产物即为具有位置标记的第二核酸分子,从而生成所述第二核酸分子群。
  23. 权利要求22的方法,其中,步骤(3)(ii)中:
    (1)当所述桥接寡核苷酸II-I的第一区域和第二区域相邻时,所述将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸连接酶将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接;或者,
    当所述桥接寡核苷酸II-I包括第一区域、第二区域以及位于两者之间的第三区域时,所述将杂交于同一桥接寡核苷酸II-I的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸聚合酶(例如,无5’至3’端外切酶活性或链置换活性)以所述第三区域为模板进行聚合反应,使用核 酸连接酶将杂交于同一桥接寡核苷酸II-I的第一区域、第三区域和第二区域的核酸分子连接;
    和/或
    (2)当所述桥接寡核苷酸II-II的第一区域和第二区域相邻时,所述将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸连接酶将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接;或者,
    当所述桥接寡核苷酸II-II包括第一区域、第二区域以及位于两者之间的第三区域时,所述将杂交于同一桥接寡核苷酸II-II的第一区域和第二区域的核酸分子连接的步骤包括:使用核酸聚合酶(例如,无5’至3’端外切酶活性或链置换活性)以所述第三区域为模板进行聚合反应,使用核酸连接酶将杂交于同一桥接寡核苷酸II-II的第一区域、第三区域和第二区域的核酸分子连接。
  24. 权利要求22或23的方法,其包括步骤(1)、步骤(2)(i)和步骤(3);其中,步骤(2)(i)(b)中,所述引物II-B含有共有序列B,3’末端悬突互补序列,以及标签序列B;
    优选地,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第二核酸分子具有不同的所述标签序列B作为UMI。
  25. 权利要求24的方法,其包括步骤(1)、步骤(2)(i)和步骤(3)(i);其中,所述共有序列X2或其部分序列能与所述共有序列B的互补序列或其部分序列退火;步骤(3)(i)中获得的延伸产物即为标记的核酸分子,其包含:含有所述待标记的第一核酸分子序列的第一链,和/或,含有所述寡核苷酸探针序列的第二链。
  26. 权利要求24的方法,其包括步骤(1)、步骤(2)(i)和步骤(3)(ii);其中,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(i)获得的第一延伸产物的所述共有序列B的互补序列或其部分序列退火;步骤(3)(ii)中获得的反应产物即为标记的核酸分子,其包含:含有所述待标记的第一核酸分子序列的第一链,和/或,含有所述寡核苷酸探针序列的第二链。
  27. 权利要求24-26任一项的方法,其中,步骤(2)(i)(a)中,所述引物II-A的捕获序列A为随机寡核苷酸序列。
  28. 权利要求24-26任一项的方法,其中,步骤(2)(i)(a)中,所述引物II-A的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列;
    优选地,所述引物II-A还含有共有序列A,以及任选的标签序列A,例如为随机寡核苷酸序列。
  29. 权利要求22或23的方法,其包括步骤(1)、步骤(2)(ii)和步骤(3);其中,步骤(2)(ii)(b)中,所述第一延伸产物从5’端至3’端包含:所述共有序列A,以所述引物II-A’为逆转录引物形成的与所述RNA互补的cDNA序列,所述3’末端悬突序列,任选的所述标签序列B的互补序列,所述共有序列B的互补序列;
    优选地,步骤(2)(ii)(c)中,所述延伸引物为所述引物II-B’或引物B”,其中,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,并且能起始延伸反应。
  30. 权利要求29的方法,其包括步骤(1)、步骤(2)(ii)和步骤(3)(i);其中,所述共有序列X2或其部分序列能与所述共有序列A的互补序列或其部分序列退火;步骤(3)(i)中获得的延伸产物即为标记的核酸分子,其包含:含有所述待标记的第一核酸分子序列的第一链,和/或, 含有所述寡核苷酸探针序列的第二链。
  31. 权利要求29的方法,其包括步骤(1)、步骤(2)(ii)和步骤(3)(ii);其中,所述桥接寡核苷酸II-II的第二区域能与步骤(2)(ii)获得的第二延伸产物的共有序列A的互补序列或其部分序列退火;步骤(3)(ii)中获得的反应产物即为标记的核酸分子,其包含:含有所述待标记的第一核酸分子序列的第一链,和/或,含有所述寡核苷酸探针序列的第二链。
  32. 权利要求29-31任一项的方法,其中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为随机寡核苷酸序列;
    优选地,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第二核酸分子具有不同的捕获序列A作为UMI。
  33. 权利要求29-31任一项的方法,其中,步骤(2)(ii)(a)中,所述引物II-A’的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列;其中,所述引物II-A’还含有标签序列A,例如为随机寡核苷酸序列;
    优选地,步骤(3)中,源自同一微点所偶联的寡核苷酸探针的每个拷贝的所述第二核酸分子具有不同的标签序列A作为UMI。
  34. 权利要求22-28任一项的方法,其中,在步骤(2)(i)(b)中,所述cDNA链通过其3’末端悬突与所述引物II-B退火,并且,在核酸聚合酶(例如,DNA聚合酶或逆转录酶)的作用下,所述cDNA链以所述引物II-B为模板被延伸,生成所述第一延伸产物。
  35. 权利要求22-23、29-33任一项的方法,其中,在步骤(2)(ii)(b)中,所述cDNA链通过其3’末端悬突与所述引物II-B’退火,并且,在核酸聚合酶(例如,DNA聚合酶或逆转录酶)的作用下,所述cDNA链以所述引物II-B’为模板被延伸,生成所述第一延伸产物。
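  36. The method of any one of claims 1-35, wherein in step (2) the pretreatment is performed inside the cells;
    preferably, the RNA (e.g., mRNA) of the one or more cells is pretreated to generate the first population of nucleic acid molecules before or after the one or more cells are contacted with the solid support of the nucleic acid array;
    preferably, the cells are permeabilized before the pretreatment.
  37. The method of any one of claims 1-35, wherein in step (2) the pretreatment is performed outside the cells;
    preferably, the RNA (e.g., mRNA) of the one or more cells is pretreated to generate the first population of nucleic acid molecules after the one or more cells are contacted with the solid support of the nucleic acid array;
    preferably, before the pretreatment, the method further comprises releasing the intracellular RNA (e.g., mRNA); preferably, the intracellular RNA (e.g., mRNA) is released by cell permeabilization or cell lysis.
  38. The method of any one of claims 1-37, wherein the reverse transcription in step (2) comprises using a reverse transcriptase;
    preferably, the reverse transcriptase has terminal transferase activity;
    preferably, the reverse transcriptase is capable of synthesizing a cDNA strand using RNA (e.g., mRNA) as template and adding an overhang at the 3' end of the cDNA strand.
  39. The method of any one of claims 1-38, wherein the method further comprises: (4) recovering and purifying the second population of nucleic acid molecules.
  40. The method of any one of claims 1-39, wherein the obtained second population of nucleic acid molecules and/or complements thereof are used for constructing a transcriptome library or for transcriptome sequencing.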
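  41. The method of any one of claims 1-40, wherein the nucleic acid array of step (1) is provided by steps comprising:
    (1) providing multiple kinds of carrier sequences, each kind comprising at least one copy of the carrier sequence, the carrier sequence comprising, in the 5' to 3' direction: the complement of consensus sequence X2, the complement of tag sequence Y, and an immobilization sequence; wherein the complements of tag sequence Y of the different kinds of carrier sequences differ from one another;
    (2) attaching the multiple kinds of carrier sequences to the surface of a solid support (e.g., a chip);
    (3) providing an immobilization primer and performing a primer extension reaction using the carrier sequence as template to generate an extension product, the extension product being the oligonucleotide probe; wherein the immobilization primer comprises the sequence of consensus sequence X1, and the immobilization primer is capable of annealing to the immobilization sequence and initiating an extension reaction; preferably, the extension product comprises, or consists of, in the 5' to 3' direction: consensus sequence X1, tag sequence Y and consensus sequence X2;
    (4) attaching the immobilization primer to the surface of the solid support; wherein steps (3) and (4) are performed in any order;
    (5) optionally, the immobilization sequence of the carrier sequence further comprises a cleavage site, the cleavage being selectable from nicking enzyme cleavage, USER enzyme cleavage, photocleavage, chemical cleavage or CRISPR cleavage; the cleavage site contained in the immobilization sequence of the carrier sequence is cleaved to digest the carrier sequence, so that the extension product of step (3) is separated from the template from which it was formed (i.e., the carrier sequence), whereby the oligonucleotide probe is attached to the surface of the solid support (e.g., a chip); preferably, the method further comprises separating the extension product of step (3) from the template from which it was formed (i.e., the carrier sequence) by high-temperature denaturation;
    preferably, each kind of carrier sequence is a DNB formed by a concatemer of multiple copies of the carrier sequence;
    preferably, the multiple kinds of carrier sequences are provided in step (1) by the following steps:
    (i) providing multiple kinds of carrier template sequences, the carrier template sequence comprising the complement of the carrier sequence;
    (ii) performing a nucleic acid amplification reaction using each kind of carrier template sequence as template to obtain an amplification product of each kind of carrier template sequence, the amplification product comprising at least one copy of the carrier sequence; preferably, rolling circle replication is performed to obtain a DNB formed by a concatemer of the carrier sequence.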
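  42. A method for constructing a library of nucleic acid molecules, comprising:
    (a) generating a population of labeled nucleic acid molecules by the method of any one of claims 1-41;
    (b) randomly fragmenting the nucleic acid molecules in the population of labeled nucleic acid molecules and adding adapters; and
    (c) optionally, amplifying and/or enriching the product of step (b);
    thereby obtaining the library of nucleic acid molecules;
    preferably, the library of nucleic acid molecules comprises nucleic acid molecules from a plurality of single cells, the nucleic acid molecules of different single cells having different tag sequences Y;
    preferably, the library of nucleic acid molecules is used for sequencing, e.g., transcriptome sequencing, e.g., single-cell transcriptome sequencing (e.g., 5' or 3' transcriptome sequencing).
  43. The method of claim 42, wherein before step (b) the method further comprises a step (pre-b): amplifying and/or enriching the population of labeled nucleic acid molecules;
    preferably, the amplification reaction is performed using at least primer C and/or primer D, wherein primer C is capable of hybridizing or annealing to the complement of consensus sequence X1 or a partial sequence thereof and initiating an extension reaction; and primer D is capable of hybridizing or annealing to the nucleic acid strand containing the tag sequence Y in the population of labeled nucleic acid molecules and initiating an extension reaction.
  44. The method of claim 43, wherein in step (b) a transposase is used to randomly fragment the nucleic acid molecules obtained in the preceding step and to add adapters at both ends of the fragments;
    preferably, in step (c), the product of step (b) is amplified using at least primer C' and/or primer D',
    wherein the adapters at the two ends of a fragment are a first adapter and a second adapter, respectively; primer C' is capable of hybridizing or annealing to the first adapter and initiating an extension reaction, and primer D' is capable of hybridizing or annealing to the second adapter and initiating an extension reaction.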
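  45. A method for transcriptome sequencing of cells in a sample, comprising:
    (1) constructing a library of nucleic acid molecules by the method of any one of claims 42-44; and
    (2) sequencing the library of nucleic acid molecules.
  46. A method for single-cell transcriptome analysis, comprising:
    (1) performing transcriptome sequencing of single cells in a sample by the method of claim 45; and
    (2) analyzing the sequencing data, which comprises matching the sequencing results of the obtained sequencing library against the tag sequence Y, or its complement, of the oligonucleotide probe coupled to each microdot of the nucleic acid array; if the match succeeds, the microdot is identified as a positive microdot; and sequencing data derived from positive microdots that are regionally contiguous on the nucleic acid array are identified as the transcription data of one and the same cell, thereby performing single-cell transcriptome analysis.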
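One way to read the grouping rule of claim 46, step (2), is as connected-component labeling on the microdot grid: positive microdots that touch one another are given the same cell label. The sketch below is an illustration under that reading only. It assumes microdots are addressed by integer (x, y) grid coordinates and that "regionally contiguous" means adjacency in the 8-neighbourhood (4-neighbourhood optionally); neither the function name nor the neighbourhood choice comes from the claims.

```python
from collections import deque


def group_positive_microdots(positive, connectivity=8):
    """Label contiguous positive microdots; returns {(x, y): cell_label}."""
    positive = set(positive)
    if connectivity == 4:
        steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    else:
        steps = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                 if (dx, dy) != (0, 0)]
    labels = {}
    next_label = 0
    for start in sorted(positive):
        if start in labels:
            continue
        # Breadth-first search over neighbouring positive microdots.
        labels[start] = next_label
        queue = deque([start])
        while queue:
            x, y = queue.popleft()
            for dx, dy in steps:
                neighbour = (x + dx, y + dy)
                if neighbour in positive and neighbour not in labels:
                    labels[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1
    return labels
```

Sequencing data from all microdots sharing a label would then be pooled as the transcription data of one cell. Two cells lying directly side by side would merge under this simple rule, so a real analysis would need an additional criterion to separate them.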
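  47. A kit comprising:
    a nucleic acid array for labeling nucleic acids and, optionally, a first binding molecule, the nucleic acid array comprising a solid support, the solid support containing (e.g., on its surface) a first labeling molecule, and the first binding molecule being capable of forming an interaction pair with the first labeling molecule;
    the solid support further comprises a plurality of microdots, the size (e.g., equivalent diameter) of the microdots being less than 5 μm and the center-to-center distance between adjacent microdots being less than 10 μm; each microdot is coupled with one kind of oligonucleotide probe; each kind of oligonucleotide probe comprises at least one copy; and the oligonucleotide probe comprises, or consists of, in the 5' to 3' direction: a consensus sequence X1, a tag sequence Y and a consensus sequence X2, wherein
    oligonucleotide probes coupled to different microdots have different tag sequences Y.
  48. The kit of claim 47, wherein the center-to-center distance between adjacent microdots is less than 10 μm, less than 5 μm, less than 1 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, or less than 0.01 μm; and the size (e.g., equivalent diameter) of the microdots is less than 5 μm, less than 1 μm, less than 0.3 μm, less than 0.5 μm, less than 0.1 μm, less than 0.05 μm, less than 0.01 μm, or less than 0.001 μm;
    preferably, the center-to-center distance between adjacent microdots is 0.5 μm to 1 μm, e.g., 0.5 μm to 0.9 μm or 0.5 μm to 0.8 μm;
    preferably, the size (e.g., equivalent diameter) of the microdots is 0.001 μm to 0.5 μm (e.g., 0.01 μm to 0.1 μm, 0.01 μm to 0.2 μm, 0.2 μm to 0.5 μm, 0.2 μm to 0.4 μm, or 0.2 μm to 0.3 μm).
  49. The kit of claim 47 or 48, wherein the first binding molecule is capable of forming a specific or a non-specific interaction pair with the first labeling molecule;
    preferably, the interaction pair is selected from positive-negative charge interactions, affinity interactions (e.g., biotin-avidin, biotin-streptavidin, antigen-antibody, receptor-ligand, enzyme-cofactor), molecular pairs capable of undergoing a click chemistry reaction (e.g., an alkynyl-containing group and an azide compound), an N-hydroxysulfosuccinimide (NHS) ester and an amino-containing compound, or any combination thereof;
    for example, the first labeling molecule is polylysine and the first binding molecule is a protein capable of binding polylysine; the first labeling molecule is an antibody and the first binding molecule is an antigen capable of binding the antibody; the first labeling molecule is an amino-containing compound and the first binding molecule is an N-hydroxysulfosuccinimide (NHS) ester; or the first labeling molecule is biotin and the first binding molecule is streptavidin.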
  50. 权利要求47-49任一项的试剂盒,其进一步包含:
    (i)引物I-A,包含引物I-A’和引物I-B的引物组,或者,包含引物I-A和引物I-B的引物组,其中:
    所述引物I-A含有共有序列A和捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;优选地,所述共有序列A位于所述捕获序列A的上游(例如位于所述引物I-A的5’端);
    所述引物I-A’包含捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;
    所述引物I-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;其中,所述3’末端悬突互补序列位于所述引物I-B的3’末端,所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物I-B的5’端);其中,所述3’末端悬突是指以所述引物I-A’的捕获序列A所捕获的RNA为模板逆转录生成的cDNA链的3’末端所包含的一个或多个非模板核苷酸;
    以及,
    (ii)桥接寡核苷酸I,其包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
    所述第一区域能(a)与所述引物I-A的共有序列A全部或部分退火或者(b)与所述引物I-B的共有序列B全部或部分退火;
    所述第二区域能与所述共有序列X2全部或其部分退火。
  51. 权利要求50的试剂盒,其包含:如(i)中所述的引物I-A,以及,如(ii)中所述的桥接寡核苷酸I;其中,所述桥接寡核苷酸I的第一区域能与所述引物I-A的共有序列A全部或部分退火,所述桥接寡核苷酸的第二区域能与所述共有序列X2全部或部分退火;
    其中,所述引物I-A的捕获序列A是随机寡核苷酸序列;或者,所述引物I-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列;
    优选地,所述引物I-A的的5’末端包含磷酸化修饰。
  52. 权利要求50的试剂盒,其包含:如(i)中所述的包含引物I-A’和引物I-B的引物组,以及,如(ii)中所述的桥接寡核苷酸I;其中,所述桥接寡核苷酸I的第一区域能与所述引物I-B的共有序列B全部或部分退火,所述桥接寡核苷酸的第二区域能与所述共有序列X2全部或部分退火;
    其中,所述引物I-A’的捕获序列A为随机寡核苷酸序列;或者,所述引物I-A’的捕获序列A为poly(T)序列或针对特定靶核酸的特异性序列,所述引物I-A’进一步包含标签序列A,以及共有序列A;
    其中,所述引物I-B包含共有序列B,3’末端悬突互补序列,以及标签序列B;
    优选地,所述试剂盒进一步包含引物B”,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,且能够起始延伸反应;
    优选地,所述引物I-B或引物B”的5’末端包含磷酸化修饰;
    优选地,所述引物I-B包含修饰的核苷酸(例如锁核酸);优选地,所述引物I-B的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
  53. 权利要求50的试剂盒,其包含:如(i)中所述的包含引物I-A和引物I-B的引物组,以及,如(ii)中所述的桥接寡核苷酸I;其中,所述桥接寡核苷酸I的第一区域能与所述引物I-A的共有序列A全部或部分退火,所述桥接寡核苷酸的第二区域能与所述共有序列X2全部或部分退火;
    其中,所述引物I-A的捕获序列A为随机寡核苷酸序列;或者,所述引物I-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物I-A进一步包含标签序列A,例如为随机寡核苷酸序列;
    优选地,所述引物I-A的的5’末端包含磷酸化修饰;
    优选地,所述引物I-B包含修饰的核苷酸(例如锁核酸);优选地,所述引物I-B的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
  54. 权利要求47-49任一项的试剂盒,其进一步包含:
    (i)包含引物II-A和引物II-B或者包含引物II-A’和引物II-B’的引物组,其中:
    所述引物II-A含有捕获序列A,所述捕获序列A能与待捕获的RNA(例如,mRNA)退火并起始延伸反应;
    所述引物II-B包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;其中,所述3’末端悬突互补序列位于所述引物II-B的3’末端,所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B的5’端);其中,所述3’末端悬突是指以所述引物II-A的捕获序列A所捕获的RNA为模板逆转录生成的cDNA链的3’末端所包含的一个或多个非模板核苷酸;
    所述引物II-A’含有共有序列A和捕获序列A;其中,所述捕获序列A位于所述引物II-A’的3’端,所述共有序列A位于所述捕获序列A的上游(例如位于所述引物II-A’的5’端);
    所述引物II-B’包含共有序列B,3’末端悬突互补序列,以及任选的标签序列B;其中,所述3’末端悬突互补序列位于所述引物II-B’的3’末端,所述共有序列B位于所述3’末端悬突互补序列的上游(例如位于所述引物II-B’的5’端);其中,所述3’末端悬突是指以所述引物II-A’的捕获序列A所捕获的RNA为模板逆转录生成的cDNA链的3’末端所包含的一个或多个非模板核苷酸。
  55. 权利要求54的试剂盒,其包含:如(i)中所述的引物II-A和引物II-B的引物组,以及,(ii)桥接寡核苷酸II-I和桥接寡核苷酸II-II;其中,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
    所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;
    所述桥接寡核苷酸II-II的第二区域能与所述引物II-B的共有序列B的互补序列或其部分序列退火;
    其中,所述引物II-A的捕获序列A是随机寡核苷酸序列;或者,所述引物II-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物II-A优选地进一步包含共有序列A和任选的标签序列A,例如为随机寡核苷酸序列;
    其中,所述引物II-B含有所述共有序列B,3’末端悬突互补序列,以及标签序列B;
    优选地,所述引物II-B包含修饰的核苷酸(例如锁核酸);优选地,所述引物II-B的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
  56. 权利要求54的试剂盒,其包含:如(i)中所述的引物II-A和引物II-B的引物组;
    其中,所述引物II-A的捕获序列A是随机寡核苷酸序列;或者,所述引物II-A的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物II-A优选地进一步包含共有序列A和任选的标签序列A,例如为随机寡核苷酸序列;
    其中,所述引物II-B含有所述共有序列B,3’末端悬突互补序列,以及标签序列B;
    优选地,所述引物II-B包含修饰的核苷酸(例如锁核酸);优选地,所述引物II-B的3’末端包含一个或多个修饰的核苷酸(例如锁核酸)。
  57. 权利要求54的试剂盒,其包含:如(i)中所述的引物II-A’和引物II-B’的引物组,以及,(ii)桥接寡核苷酸II-I和桥接寡核苷酸II-II;其中,所述桥接寡核苷酸II-I和所述桥接寡核苷酸II-II各自独立地包括:第一区域和第二区域,以及任选的位于第一区域和第二区域之间的第三区域,所述第一区域位于所述第二区域的上游(例如5’端);其中,
    所述桥接寡核苷酸II-I的第一区域能与所述桥接寡核苷酸II-II的第一区域退火;所述桥接寡核苷酸II-I的第二区域能与所述寡核苷酸探针的共有序列X2或其部分序列退火;
    所述桥接寡核苷酸II-II的第二区域能与所述引物II-A’的共有序列A互补序列或其部分序列退火;
    其中,所述引物II-A’的捕获序列A是随机寡核苷酸序列;或者,所述引物II-A’的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物II-A’进一步包含标签序列A,例如为随机寡核苷酸序列;
    优选地,所述引物II-B’包含修饰的核苷酸(例如锁核酸);优选地,所述引物II-B’的3’末端包含一个或多个修饰的核苷酸(例如锁核酸);
    优选地,所述试剂盒进一步包含引物B”,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,并且能起始延伸反应。
  58. 权利要求54的试剂盒,其包含如(i)中所述的引物II-A’和引物II-B’的引物组;
    其中,所述引物II-A’的捕获序列A是随机寡核苷酸序列;或者,所述引物II-A’的捕获序列A是poly(T)序列或针对特定靶核酸的特异性序列,所述引物II-A’进一步包含标签序列A,例如为随机寡核苷酸序列;
    其中,所述引物II-B’含有所述共有序列B,3’末端悬突互补序列,以及标签序列B;
    优选地,所述引物II-B’包含修饰的核苷酸(例如锁核酸);优选地,所述引物II-B’的3’末端 包含一个或多个修饰的核苷酸(例如锁核酸);
    优选地,所述试剂盒进一步包含引物B”,所述引物B”能与所述共有序列B的互补序列或其部分序列退火,并且能起始延伸反应。
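  59. The kit of any one of claims 47-58, further comprising a reverse transcriptase, a nucleic acid ligase, a nucleic acid polymerase and/or a transposase;
    preferably, the reverse transcriptase has terminal transferase activity; preferably, the reverse transcriptase is capable of synthesizing a cDNA strand using RNA (e.g., mRNA) as template and adding the 3' overhang at the 3' end of the cDNA strand.
  60. The kit of any one of claims 47-59, further comprising: reagents for nucleic acid hybridization, reagents for nucleic acid extension, reagents for nucleic acid amplification, reagents for recovering or purifying nucleic acids, reagents for constructing a transcriptome sequencing library, reagents for sequencing (e.g., second-generation or third-generation sequencing), or any combination thereof.
  61. Use of the method of any one of claims 1-41 or the kit of any one of claims 47-60 for constructing a library of nucleic acid molecules or for transcriptome sequencing;
    preferably, the method or kit is used for constructing a single-cell nucleic acid molecule library or for single-cell transcriptome sequencing.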
PCT/CN2022/135478 2021-12-24 2022-11-30 Labeling and analysis method for single-cell nucleic acid WO2023116376A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2022417425A AU2022417425A1 (en) 2021-12-24 2022-11-30 Labeling and analysis method for single-cell nucleic acid
CN202280085229.2A CN118451199A (zh) 2021-12-24 2022-11-30 Labeling and analysis method for single-cell nucleic acid

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111600833 2021-12-24
CN202111600833.8 2021-12-24

Publications (1)

Publication Number Publication Date
WO2023116376A1 true WO2023116376A1 (zh) 2023-06-29

Family

ID=86901195

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/135478 WO2023116376A1 (zh) 2021-12-24 2022-11-30 Labeling and analysis method for single-cell nucleic acid

Country Status (3)

Country Link
CN (1) CN118451199A (zh)
AU (1) AU2022417425A1 (zh)
WO (1) WO2023116376A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180320224A1 (en) * 2017-05-03 2018-11-08 The Broad Institute, Inc. Single-cell proteomic assay using aptamers
CN110199196A (zh) * 2016-11-11 2019-09-03 伊索普莱克西斯公司 用于单细胞的同时基因组,转录本和蛋白质组分析的组合物和方法
CN110684829A (zh) * 2018-07-05 2020-01-14 深圳华大智造科技有限公司 一种高通量的单细胞转录组测序方法和试剂盒
US20200157528A1 (en) * 2018-11-16 2020-05-21 International Business Machines Corporation Determining position and transcriptomes of biological cells
WO2020176788A1 (en) * 2019-02-28 2020-09-03 10X Genomics, Inc. Profiling of biological analytes with spatially barcoded oligonucleotide arrays
WO2020228788A1 (zh) * 2019-05-15 2020-11-19 深圳华大生命科学研究院 用于检测核酸空间信息的阵列及检测方法
CN113604545A (zh) * 2021-08-09 2021-11-05 浙江大学 一种超高通量单细胞染色质转座酶可及性测序方法

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"UniProt", Database accession no. P19821.1
LEBRIGAND, K. ET AL.: "High throughput error corrected Nanopore single cell transcriptome sequencing.", NATURE COMMUNICATIONS, vol. 11, 12 August 2020 (2020-08-12), XP093047897, DOI: 10.1038/s41467-020-17800-6 *

Also Published As

Publication number Publication date
AU2022417425A1 (en) 2024-07-25
CN118451199A (zh) 2024-08-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 22909691; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2022417425; Country of ref document: AU; Date of ref document: 20221130; Kind code of ref document: A
NENP Non-entry into the national phase. Ref country code: DE
ENP Entry into the national phase. Ref document number: 2022909691; Country of ref document: EP; Effective date: 20240724