WO2022222101A1 - Method for constructing an RNA sequencing library, sequencing method and kit - Google Patents

Method for constructing an RNA sequencing library, sequencing method and kit

Info

Publication number
WO2022222101A1
Authority
WO
WIPO (PCT)
Prior art keywords
cdna
sequence
primer
tag
mrna
Prior art date
Application number
PCT/CN2021/088984
Other languages
English (en)
French (fr)
Inventor
刘龙奇
林秀妹
石泉
史旭洋
刘传宇
黄亚灵
刘亚
Original Assignee
深圳华大生命科学研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大生命科学研究院
Priority to EP21937340.4A (EP4328362A1, en)
Priority to CN202180097147.5A (CN117178083A, zh)
Priority to AU2021442183A (AU2021442183A1, en)
Priority to PCT/CN2021/088984 (WO2022222101A1, zh)
Priority to CA3217523A (CA3217523A1, en)
Publication of WO2022222101A1 (zh)

Classifications

    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/04 Libraries containing only organic compounds
    • C40B40/06 Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C40B40/08 Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1065 Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the present invention relates to the field of sequencing, in particular, to a method for constructing an RNA sequencing library, a sequencing method and a kit.
  • Single-cell technology has been continuously updated and iterated, from single-tube amplification to droplet-based high-throughput methods, bringing great changes in labor, time, cost, and cell capture.
  • Full-length cDNA usually has to be fragmented for library construction before it can be sequenced.
  • Smart-seq2 technology adopts a single-tube amplification strategy, using flow cytometry, microdissection and similar techniques to sort single cells into individual wells for lysis, reverse transcription and amplification, fragmentation, and library construction.
  • The current droplet-based high-throughput sequencing strategy specifically captures the sequences at the two ends of the cDNA by synthesizing sequences carrying specific molecular markers on microbeads.
  • The tagged full-length cDNA cannot be sequenced directly and must be fragmented further. After fragmentation, the sequences from the middle of the cDNA carry no tag and are screened out, so the full length cannot be recovered and important information such as alternative splicing is lost.
  • the specific steps of the droplet-based high-throughput sequencing strategy are as follows:
  • water-in-oil droplets are used to encapsulate cells and microbeads to obtain droplets containing one cell and one magnetic bead.
  • barcode specific molecular markers
  • the cells inside the droplets are lysed by the lysis buffer, releasing a large amount of mRNA.
  • the mRNA is captured by free poly(dT), the RT reaction is completed in the droplet, and the TSO sequence is added to the end of the mRNA.
  • mRNA is captured on microbeads, and the 5' sequence is amplified with specific molecular markers by PCR, thereby realizing 5' RNA single-cell high-throughput sequencing, such as 10x genomics.
  • the biggest problem with 10x genomics at present is its high cost.
  • the main purpose of the present invention is to provide a method for constructing an RNA sequencing library, a sequencing method and a kit to solve the problem that it is difficult to achieve high-throughput sequencing of the 5' end or full length of RNA in the prior art.
  • a method for constructing an RNA sequencing library, comprising: obtaining a single-stranded cDNA of a reverse transcription product of mRNA, wherein the 3' end of the single-stranded cDNA contains a cDNA tag sequence; circularizing the single-stranded cDNA to obtain single-stranded circularized cDNA; amplifying the single-stranded circularized cDNA using a primer combination formed by a cDNA tag primer and either random primers or gene-specific primers to obtain amplified fragments, wherein the cDNA tag primer is at least a part of the cDNA tag sequence; and performing fragmentation library construction on the amplified fragments to obtain an RNA sequencing library.
  • obtaining the single-stranded cDNA of the reverse transcription product of the mRNA, the 3' end of which contains a cDNA tag sequence, includes: performing reverse transcription of the mRNA to obtain a first-strand cDNA; amplifying the first-strand cDNA to obtain a double-stranded cDNA, wherein the 3' end of the second-strand cDNA complementary to the first-strand cDNA contains a cDNA tag sequence, and the cDNA tag sequence contains poly(A); and melting the double-stranded cDNA to obtain the single-stranded cDNA.
  • the cDNA tag sequence contains a second PCR linker, a second cell tag, a second unique molecular marker and poly(A) in sequence from 3' to 5'.
  • the mRNA is derived from a single-cell sample, and the mRNA is single-cell mRNA.
  • the single-cell mRNA is prepared by a droplet method, so that the single-cell mRNA is connected to a solid support, preferably the solid support is microbeads.
  • the single-cell mRNA is prepared by the droplet method so that the single-cell mRNA is connected to the microbeads, including: providing a single-cell suspension and microbeads respectively, the microbeads carrying a microbead tag sequence whose end contains poly(dT); encapsulating the single-cell suspension and the microbeads in droplets, each droplet containing a single cell and a single microbead; the microbeads bind, through poly(dT), to the poly(A) of the mRNA in the single-cell suspension, thereby linking the mRNA to the microbeads to obtain single-cell mRNA.
  • the microbead tag sequence contains, in sequence from 5' to 3', the first PCR linker, the first cell tag, the first unique molecular marker and poly(dT); correspondingly, the cDNA tag sequence contains, in sequence from 3' to 5', the second PCR linker, the second cell tag, the second unique molecular marker and poly(A), wherein the second PCR linker is complementary to the first PCR linker, the second cell tag is complementary to the first cell tag, and the second unique molecular marker is complementary to the first unique molecular marker.
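The relationship between the microbead tag sequence and the cDNA tag sequence can be illustrated with a minimal Python sketch. All sequences and lengths below are hypothetical placeholders chosen for illustration; they are not the sequences disclosed in the application. The sketch only shows that taking the reverse complement of a bead tag written 5' to 3' yields a strand that, read 3' to 5', runs second PCR linker, second cell tag, second unique molecular marker, poly(A).

```python
# All sequences below are hypothetical placeholders, not the patent's sequences.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_bead_tag(pcr_linker: str, cell_tag: str, umi: str, dt_len: int = 12) -> str:
    """Bead tag written 5'->3': PCR linker, cell tag, UMI, poly(dT)."""
    return pcr_linker + cell_tag + umi + "T" * dt_len


first_linker, first_cell_tag, first_umi = "ACGTACGTAC", "GATTACAGAT", "CCGTAAGC"
bead_tag = build_bead_tag(first_linker, first_cell_tag, first_umi)

# The cDNA tag is the reverse complement of the bead tag. Written 5'->3' it
# starts with poly(A) and ends with the second PCR linker, so read 3'->5' it
# runs: second PCR linker, second cell tag, second UMI, poly(A).
cdna_tag = reverse_complement(bead_tag)
second_linker = reverse_complement(first_linker)
second_cell_tag = reverse_complement(first_cell_tag)
second_umi = reverse_complement(first_umi)
```

Because the tag elements are simply complementary, a cell tag read from the cDNA tag can always be mapped back to the bead it came from.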
  • the 5' end of the single-stranded cDNA contains the sequence of the TSO primer.
  • a first-strand cDNA is obtained, wherein the reverse transcriptase has terminal transferase activity, and the 3' end of the first-strand cDNA contains the complementary sequence of the TSO linker;
  • the first-strand cDNA is amplified to obtain a second-strand cDNA, and the 5' end of the second-strand cDNA contains the sequence of the TSO primer.
  • sequence of the TSO linker is SEQ ID NO: 1.
  • the reverse transcriptase is selected from MGI's Alpha reverse transcriptase, Invitrogen's SuperScript II reverse transcriptase, Thermo's SuperScript IV or Maxima H Minus.
  • random amplification and/or full-length amplification are performed on the first-strand cDNA to obtain double-stranded cDNA.
  • the first-strand cDNA is amplified by the adapter amplification primer and the TSO primer to obtain double-stranded cDNA; or the first-strand cDNA is amplified by the adapter amplification primer, the TSO-random primer and the TSO primer to obtain double-stranded cDNA.
  • sequence of the adapter amplification primer is SEQ ID NO:2
  • sequence of the TSO primer is SEQ ID NO:3
  • sequence of the TSO-random primer is SEQ ID NO:4.
  • circularizing the single-stranded cDNA to obtain the single-stranded circularized cDNA includes: ligating the single-stranded cDNA into a circle under the action of a circularization auxiliary sequence and a ligase to obtain a ligated product; and enzymatically digesting the single-stranded cDNA in the ligated product that was not ligated into a circle, to obtain the single-stranded circularized cDNA; wherein the circularization auxiliary sequence is complementary to the sequences at both ends of the single-stranded cDNA.
  • the circularization helper sequence is selected from SEQ ID NO:5.
  • the gene-specific primers are TCR primers for TCR gene amplification and/or BCR primers for BCR gene amplification.
  • the cDNA tag primer is a poly(A) primer, more preferably SEQ ID NO: 6.
  • performing fragmentation library construction on the amplified fragments to obtain the RNA sequencing library includes: adding a library adapter to the amplified fragments to obtain the RNA sequencing library.
  • amplified fragments are digested and fragmented to obtain the digested fragments; the digested fragments are sequentially subjected to end repair, A addition and library adapter ligation to obtain the RNA sequencing library.
  • the method further includes performing PCR amplification on the ligation product of the library adapter to obtain an RNA sequencing library.
  • the library adapter is the adapter of the MGI sequencing platform or the adapter of the Illumina sequencing platform.
  • an RNA library construction kit includes: a circularization auxiliary sequence, a DNA ligase, a cDNA tag primer and at least one of the following primers: (a) random primers; (b) TCR primers; (c) BCR primers.
  • kit further includes RNA reverse transcription reagents.
  • RNA reverse transcription reagent includes reverse transcriptase, and the reverse transcriptase is a reverse transcriptase with terminal transferase activity.
  • the reverse transcriptase is selected from MGI's Alpha reverse transcriptase, Invitrogen's SuperScript II reverse transcriptase, Thermo's SuperScript IV or Maxima H Minus.
  • RNA reverse transcription reagent also includes a TSO linker.
  • sequence of the TSO linker is SEQ ID NO: 1.
  • kit also includes TSO primers and adapter amplification primers.
  • sequence of the adapter amplification primer is SEQ ID NO: 2
  • sequence of the TSO primer is SEQ ID NO: 3.
  • kit further includes TSO-random primers.
  • sequence of the TSO-random primer is SEQ ID NO: 4.
  • the circularization helper sequence is SEQ ID NO:5.
  • the cDNA tag primer is a poly(A) primer.
  • sequence of the cDNA tag primer is SEQ ID NO: 6.
  • kit further includes at least one of exonuclease and library adapter.
  • exonuclease is selected from exonuclease I or exonuclease III.
  • the library adapter is the adapter of the MGI sequencing platform or the adapter of the Illumina sequencing platform.
  • the adapters of the MGI sequencing platform are selected from vesicle adapters; the adapters of the Illumina sequencing platform are selected from P5 and P7 adapters.
  • DNA ligase is selected from T4 DNA ligase.
  • the kit also includes a solid support provided with a support tag sequence, wherein the cDNA tag primer is complementary to at least a part of the support tag sequence; preferably, the solid support is a microbead and the support tag sequence is a bead tag sequence.
  • the support tag sequence sequentially includes: a first PCR linker, a first cell tag, a first unique molecular tag and poly(dT) according to the 5' to 3' direction.
  • a method for sequencing an RNA library comprising: constructing an RNA sequencing library by using any of the aforementioned RNA sequencing library construction methods, and sequencing the RNA sequencing library.
  • the single-stranded cDNA with the cDNA tag sequence is circularized, so that the two ends of the single-stranded cDNA are connected, that is, the sequences corresponding to the 5' end and the 3' end of the mRNA are connected, and the fragment at the 5' end of the mRNA is thereby labeled by the 3' end cDNA tag sequence.
  • the single-stranded circular cDNA is amplified by the cDNA tag primer, which is identical to at least a part of the cDNA tag sequence, together with a random primer or gene-specific primer, so as to obtain amplified fragments starting from the position of a specific gene at the 5' end, or amplified fragments starting from any position at the 5' end. Finally, these amplified fragments are fragmented and screened, and a library is built for sequencing, achieving high-throughput sequencing. Therefore, depending on whether the single-stranded cDNA used for circularization is full-length or of random length, and on the research purpose, a 5' RNA sequencing library or a full-length RNA sequencing library can be obtained.
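How circularization places the 3' tag next to the 5' end can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the circle is represented as a string whose last base is joined to its first, and the sequences `tso`, `transcript` and `cdna_tag` are hypothetical placeholders, not sequences from the application.

```python
def circularize(tso: str, transcript: str, cdna_tag: str) -> str:
    """Linear strand 5'-TSO, transcript, cDNA tag-3', ligated end to end.

    The circle is stored as a plain string whose last base is joined to its first.
    """
    return tso + transcript + cdna_tag


def circular_window(circle: str, start: int, length: int) -> str:
    """Read `length` bases from `start`, wrapping around the ligation junction."""
    n = len(circle)
    return "".join(circle[(start + i) % n] for i in range(length))


tso = "ACGTTGCA"                      # hypothetical TSO primer sequence
transcript = "GGATCCATGGCCTTAGGCTA"   # hypothetical transcript, 5' end first
cdna_tag = "AAAAAAAAGCTTACGG"         # hypothetical tag: poly(A), UMI, cell tag

circle = circularize(tso, transcript, cdna_tag)
tag_start = len(tso) + len(transcript)

# A fragment that starts at the tag and runs across the junction carries the
# tag followed directly by the TSO and the 5' end of the transcript.
junction_fragment = circular_window(circle, tag_start, len(cdna_tag) + len(tso) + 6)
```

In the linear molecule the tag and the transcript's 5' end are separated by the whole transcript; in the circle they are adjacent, which is why a short fragment can carry both.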
  • FIG. 1 shows a schematic diagram of a chip structure for preparing droplets in Example 1 according to the present invention.
  • Figure 2 shows a schematic diagram of the construction principle of the RNA full-length library of the present application and the flow chart of the library construction and sequencing.
  • Fig. 3 shows the 5' RNA end library construction principle of the present application and the schematic flow chart of the library construction and sequencing
  • Figure 4A and Figure 4B show the Agilent 2100 bioanalyzer detection results of the full-length cDNA amplification products of the cell line samples and solid tissue samples in Example 1 of the present application.
  • Figure 5A and Figure 5B show the detection results of the Agilent 2100 bioanalyzer for the random primer amplified cDNA products of the cell line samples and solid tissue samples in Example 2 of the present application.
  • Figure 6A and Figure 6B show the Agilent 2100 bioanalyzer detection results of the random primer amplification products of the cell line samples and solid tissue samples in Example 1 of the present application after the single-stranded cDNA circularization.
  • Figure 7A and Figure 7B show the Agilent 2100 bioanalyzer detection results of TCR/BCR primer amplification products after single-stranded cDNA circularization of cell line samples and solid tissue samples in Example 2 of the present application.
  • Figure 8 shows the analysis results of the coverage of the transcripts by the sequencing fragments at the 5' and 3' ends in the offline data after the library constructed in a preferred embodiment of the present application is subjected to sequencing analysis.
  • TSO: template switch oligo.
  • TSO linker: when a reverse transcriptase with terminal transferase activity reverse transcribes mRNA, it adds CCC to the 3' end of the first-strand cDNA (only for full-length transcripts).
  • The TSO linker carries rGrG+G, which pairs with the CCC (rG represents a riboguanine nucleotide, +G represents an LNA-modified deoxyriboguanine nucleotide), so that during reverse transcription the 3' end of the first-strand cDNA, after the terminal CCC, acquires the complementary sequence of the remainder of the TSO linker.
  • TSO primer can be combined with at least part of the 3' end of the first-strand cDNA for second-strand cDNA synthesis and cDNA amplification.
  • the TSO primer is a sequence obtained by removing rGrG+G from the TSO linker.
  • Random primers in this application refer to sequences consisting only of base N, e.g., random sequences of 6-12 nt consisting of Ns.
  • the TSO-random primer refers to a sequence containing a TSO primer upstream of the random primer consisting of base N.
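The primer definitions above can be made concrete with a small Python sketch. It assumes the TSO linker is written as a DNA string followed by the literal notation `rGrG+G`, and it uses a hypothetical placeholder sequence rather than SEQ ID NO: 1. A synthesized random primer is a pool of all N-mers; the function here simply draws one member of that pool.

```python
import random

RIBO_LNA_SUFFIX = "rGrG+G"  # notation used in the text for the 3' end of the TSO linker


def tso_primer_from_linker(tso_linker: str) -> str:
    """TSO primer = TSO linker with the trailing rGrG+G removed."""
    if not tso_linker.endswith(RIBO_LNA_SUFFIX):
        raise ValueError("expected the linker to end with rGrG+G")
    return tso_linker[: -len(RIBO_LNA_SUFFIX)]


def random_primer(length: int, rng: random.Random) -> str:
    """Draw one concrete member of an N-mer pool (6-12 nt, as in the text)."""
    if not 6 <= length <= 12:
        raise ValueError("random primer length should be 6-12 nt")
    return "".join(rng.choice("ACGT") for _ in range(length))


def tso_random_primer(tso_primer: str, n_length: int, rng: random.Random) -> str:
    """TSO-random primer: TSO primer sequence upstream of the random N stretch."""
    return tso_primer + random_primer(n_length, rng)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
tso_linker = "ACGTTGCAACGTTGCA" + RIBO_LNA_SUFFIX  # hypothetical, not SEQ ID NO: 1
tso_primer = tso_primer_from_linker(tso_linker)
example = tso_random_primer(tso_primer, 8, rng)
```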
  • Support tag sequence refers to a tag sequence attached to a solid support for capturing mRNA, which at least includes oligo(dT) for capturing mRNA.
  • the support tag sequence comprises a cell tag and oligo(dT) in the 5' to 3' order, wherein the oligo(dT) is complementary to the poly(A) of mRNA and is used to capture mRNA, and the cell tag is used to label mRNAs derived from the same cell.
  • a unique molecular label ie, UMI
  • UMI unique molecular label
  • in order to further facilitate subsequent library construction, a PCR linker can be placed in the 5' direction of the cell tag for subsequent PCR amplification.
  • the solid support is a microbead
  • the tag sequence arranged on the microbead is denoted as a microbead tag sequence, in the order from near to the farthest from the microbead (also 5' to 3' sequence) including in turn: PCR linker, cell tag, unique molecular marker (ie UMI) and oligo(dT).
  • cDNA tag sequence: the tag sequence with poly(A) on the circularized single-stranded cDNA, wherein the poly(A) is complementary to the oligo(dT) on the solid support, so that the mRNA captured by the solid support can be amplified.
  • the cDNA tag sequence contains a cell tag in addition to poly(A) for tagging the cell source.
  • a unique molecular marker ie UMI
  • UMI is also provided between the cell tag and the oligo(dT) to label different mRNA molecules in the same cell.
  • a PCR linker is also provided in the 3' direction of the cell tag, which is used as a primer for PCR amplification in the subsequent library construction step.
  • the cDNA tag sequence includes, in order from 3' to 5', a PCR linker, a cell tag, a unique molecular marker and poly(A).
  • the PCR adapter, cell tag, and unique molecular marker (i.e., UMI) on the above-mentioned support tag sequence are recorded as the first PCR adapter, the first cell tag, and the first unique molecular marker, respectively; correspondingly, the PCR adapter, cell tag, and unique molecular marker (i.e., UMI) on the cDNA tag sequence are recorded as the second PCR adapter, the second cell tag, and the second unique molecular marker, respectively.
  • the adapter amplification primer in this application is the PCR adapter, which is described from the perspective of cDNA amplification. During cDNA amplification, the adapter amplification primer and the TSO primer are used as primer combinations for amplification.
  • cDNA tag primer: a primer used for amplifying circularized single-stranded cDNA that is at least a part of the tag sequence linked to the poly(A) of the circularized single-stranded cDNA; for example, it can be poly(A) or the cDNA tag sequence. After the circularized single-stranded cDNA has been amplified by random primers or gene-specific primers, it is used to amplify the resulting sequences carrying poly(T), so as to obtain the amplified fragments required for constructing a 5' RNA library or a full-length RNA library.
  • TCR/BCR primer: TCR/BCR stands for T cell receptor/B cell receptor; the term refers to a primer for a gene encoding a T cell receptor or B cell receptor. It is one kind of gene-specific primer and is used for single-cell T/B cell receptor sequencing and, in turn, for studying immune mechanisms.
  • Auxiliary circularization sequence: an auxiliary sequence used to bring the 5' end and the 3' end of the single-stranded cDNA close to each other when the single-stranded cDNA is circularized. It contains sequences complementary to the 5' end and the 3' end of the single-stranded cDNA, respectively, and thereby pulls the two ends together; a gap may exist between the adjacent 5' and 3' ends, which is closed under the action of DNA ligase, thereby circularizing the single-stranded cDNA.
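The requirement that the auxiliary circularization sequence be complementary to both ends of the single-stranded cDNA can be checked with a short Python sketch. The arm length and the sequences are hypothetical assumptions for illustration and do not correspond to SEQ ID NO: 5; the sketch also assumes the two ends abut with no gap.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_splint(cdna: str, arm: int = 8) -> str:
    """Auxiliary sequence complementary to the last `arm` bases (3' end)
    and the first `arm` bases (5' end) of the single-stranded cDNA."""
    if len(cdna) < 2 * arm:
        raise ValueError("cDNA is shorter than the two splint arms")
    junction = cdna[-arm:] + cdna[:arm]  # sequence across the future ligation site, 5'->3'
    return reverse_complement(junction)


def splint_bridges_ends(cdna: str, splint: str, arm: int = 8) -> bool:
    """True if the splint pairs with both ends so that they sit side by side."""
    return reverse_complement(splint) == cdna[-arm:] + cdna[:arm]


# Hypothetical strand: TSO sequence, transcript, cDNA tag (5'->3')
cdna = "ACGTTGCA" + "GGATCCATGG" + "AAAAGCTT"
splint = design_splint(cdna, arm=6)
```

A splint that matches only one end, or neither, fails the check, which mirrors why unligated linear molecules remain available for enzymatic digestion afterwards.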
  • poly(A): those skilled in the art know that mRNA has a poly(A) tail (polyadenylic acid); correspondingly, the poly(A) of DNA described in this application refers to polydeoxyadenylic acid, which is complementary to poly(dT).
  • the library method has been improved, and it was found that there has been no report of using circular ligation, after high-throughput capture of cDNA based on 3' RNA single-cell methods, to transfer the tag sequence from the 3' end of the mRNA to the 5' end of the mRNA, or to let the 5' end share a tag with the 3' end, thereby enabling single-cell high-throughput sequencing of RNA at the 3' and/or 5' ends simultaneously.
  • the inventors conducted detailed studies on the construction of a full-length cDNA single-cell high-throughput sequencing library and the construction of a 5'-end RNA single-cell high-throughput sequencing library, and further refined the experimental design to confirm the method.
  • the specific principles and steps are as follows:
  • the 3'-end RNA_seq droplet-based strategy captures mRNA and reverse-transcribes it to generate full-length cDNA.
  • the cDNA is then amplified using adapter amplification primers, TSO primers, and TSO-random primers.
  • after amplification, the 3' end retains the cDNA tag sequence converted from the bead tag sequence on the solid support (such as microbeads), while the 5' end forms DNA fragments of variable length because the TSO-random primers bind at different positions.
  • fragments of different lengths are ligated and circularized to form a circle in which the cDNA tag sequence at the 3' end and the TSO primer at the 5' end are connected end to end. Random primers complementary to the cDNA circle and poly(A) primers are then used for further amplification with the circle as a template. The amplified products are fragmented and screened, and a library is built and sequenced, thereby amplifying the full length of the cDNA and achieving high-throughput sequencing (see Figure 2).
  • PCR linker sequence for capturing the poly(A) at the 3' end of mRNA.
  • the PCR linker sequence can be attached to a solid support (such as microbeads), and the PCR linker can be used as a linker during cDNA amplification Amplification primers; design a TSO primer complementary to the 3' end of the first-strand cDNA, and a TSO-random primer, after synthesizing the first-strand cDNA, use the above-mentioned primers for amplification to obtain cDNA fragments of different sizes;
  • the single-stranded 3'-end cDNA tag sequence with poly(A) in the cDNA fragment is connected end-to-end with the 5'-end sequence to obtain a circular DNA molecule of uneven size;
  • the upstream primer is the same as the poly(A) sequence at the 3' end of the cDNA; random primers are used to amplify the circular DNA molecule to obtain amplified sequences with poly(T), and the poly(A) primer pairs with the poly(T), thereby amplifying DNA fragments of different lengths; library construction and sequencing are performed on these fragments to obtain full-length RNA sequences (see the examples for details).
  • the overall technical route of the above-mentioned full-length RNA sequencing is as follows: preparation of droplets (the microbead phase contains lysate) → mRNA capture → demulsification → reverse transcription reaction → random primer amplification → circularization and ligation of amplification products → PCR amplification of circularized products → fragmentation library construction → sequencing. For details, see the examples.
  • the strategy adopted for 5'-end RNA sequencing is to obtain mRNA through the droplet technology of 3'-end RNA_seq, and then perform reverse transcription amplification of mRNA to obtain full-length cDNA.
  • The single-stranded cDNA with poly(A) is circularized so that the sequences corresponding to the 3' and 5' ends of the mRNA are connected end to end. Gene-specific primers (such as TCR/BCR primers) or random primers complementary to the circularized cDNA are then used together with the poly(A) primer for further amplification with the circle as a template, and the amplified products are fragmented and screened to build a library for sequencing, so as to capture the 5'-end TCR/BCR sequence or the 5' sequence of other target genes (as shown in Figure 3).
  • the PCR linker sequence can be attached to a solid support (such as magnetic beads), and can be used as a linker amplification primer during cDNA amplification
  • the single strand with poly(A) in the above cDNA full-length fragment is circularly ligated to obtain a circular DNA molecule in which the cDNA tag sequence at the 3' end and the TSO primer at the 5' end are connected end-to-end .
  • the fragments obtained by the amplification of the TCR/BCR primers or the amplified fragments randomly initiated at the 5' end are sequenced for library building, and then the TCR/BCR sequence at the 5' end of the mRNA or the sequence randomly initiated at the 5' end is obtained.
  • the overall technical route of 5'-end RNA sequencing is as follows: preparation of droplets (the liquid phase of the beads contains cell lysate) → demulsification → mRNA capture → reverse transcription reaction → full-length cDNA amplification → circularization and ligation of amplification products → PCR amplification of the circularized products using poly(A) primer + TCR/BCR primer or random primer → fragmentation library construction → sequencing.
  • the present application captures mRNA by applying a 3' droplet-based strategy: a PCR linker, a cell tag and a UMI (i.e., unique molecular marker) sequence are provided on a solid support (e.g., microbeads), with a poly(dT) added at the end. The poly(dT) is complementary to the poly(A) tail at the 3' end of mature mRNA, thereby capturing the mRNA, which is then reverse transcribed to synthesize the full-length first-strand cDNA. cDNA fragments of different sizes (obtained by random primer amplification) or full-length cDNA are then ligated into circles, so as to achieve single-cell RNA 5'-end sequencing and full-length RNA sequencing.
  • the application provides a method for constructing an RNA sequencing library, the construction method comprising:
  • the single-stranded cDNA is circularized to obtain a single-stranded circularized cDNA
  • Fragmentation library construction is performed on the amplified fragments to obtain an RNA sequencing library.
  • in the above construction method, the single-stranded cDNA of the reverse transcription product of mRNA is obtained with a cDNA tag sequence at its 3' end, and the single-stranded cDNA with the cDNA tag sequence is circularized, so that the two ends of the single-stranded cDNA are connected, that is, the sequences corresponding to the 5' end and the 3' end of the mRNA are connected, and the fragment at the 5' end of the mRNA is thereby labeled by the 3' end cDNA tag sequence.
  • the single-stranded circular cDNA is amplified by the cDNA tag primer, which is identical to at least a part of the cDNA tag sequence, together with a random primer or gene-specific primer, so as to obtain amplified fragments starting from the position of a specific gene at the 5' end, or amplified fragments starting from any position at the 5' end. Finally, these amplified fragments are fragmented and screened, and a library is built for sequencing, achieving high-throughput sequencing. Therefore, a full-length RNA sequencing library or a 5' RNA sequencing library can be obtained depending on whether the single-stranded cDNA used for circularization is full-length or of random length.
  • the above construction method is applicable to RNA library construction of any sample, as long as the reverse-transcribed single-stranded cDNA of the mRNA of the sample can be obtained.
  • the mRNA is derived from a single-cell sample, and the mRNA is single-cell mRNA.
  • single-cell mRNA is prepared by a droplet method, so that single-cell mRNA is linked to a solid support, preferably microbeads.
  • the single-cell mRNA is prepared by the droplet method, so that the single-cell mRNA is connected to the microbeads, including: providing the single-cell suspension and the microbeads respectively, and the microbeads are provided with a microbead label sequence, the end of the microbead tag sequence contains poly(dT); the single cell suspension and microbeads are wrapped in droplets, and each droplet contains a single cell and a microbead, and the microbeads pass through poly(dT) It binds to poly(A) of mRNA in the single-cell suspension, thereby linking the mRNA in the single-cell suspension to the microbeads to obtain single-cell mRNA.
  • the single cell suspension described above can also be regarded as a cell nucleus suspension because it contains a cell lysate.
  • the droplet method realizes the capture of mRNA by combining poly(dT) on microbeads with poly(A) of mRNA, and the capture efficiency is high. Since a single oil droplet corresponds to a single bead and a single cell, the bead tag sequence on the microbead can also specifically label a single cell.
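Since each droplet pairs one bead with one cell, the bead's cell tag identifies the cell and the unique molecular marker distinguishes captured molecules. A minimal Python sketch of how tagged reads could be grouped is given below. The read layout (cell tag first, then UMI), the tag lengths and the reads themselves are assumptions made purely for illustration; a real analysis would also group by gene and correct sequencing errors.

```python
from collections import defaultdict


def parse_bead_tag(read: str, cell_len: int, umi_len: int) -> tuple:
    """Split a read with the assumed layout [cell tag][UMI][remaining sequence]."""
    if len(read) < cell_len + umi_len:
        raise ValueError("read is shorter than cell tag + UMI")
    return read[:cell_len], read[cell_len:cell_len + umi_len]


def count_molecules_per_cell(reads, cell_len: int = 4, umi_len: int = 3) -> dict:
    """Count distinct UMIs per cell tag; duplicate reads of one molecule collapse."""
    umis_by_cell = defaultdict(set)
    for read in reads:
        cell, umi = parse_bead_tag(read, cell_len, umi_len)
        umis_by_cell[cell].add(umi)
    return {cell: len(umis) for cell, umis in umis_by_cell.items()}


# Hypothetical reads: two cells, one PCR duplicate in the first cell
reads = ["AAAACCCTTTT", "AAAACCCTTTT", "AAAAGGGTTTT", "CCCCAAATTTT"]
counts = count_molecules_per_cell(reads)
```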
  • the specific method for obtaining the single-stranded cDNA of the reverse transcription product of mRNA is not limited, as long as it can carry the above-mentioned cDNA tag sequence corresponding to the 3' end of the mRNA.
  • the length of the single-stranded cDNA obtained is not particularly limited, and it may be a full-length cDNA, a random-length cDNA, or both.
  • the step of obtaining the single-stranded cDNA of the reverse transcription product of mRNA, the single-stranded cDNA having a cDNA tag sequence at the 3' end, includes: reverse transcribing the mRNA to obtain the first-strand cDNA; amplifying the first-strand cDNA to obtain a double-stranded cDNA, wherein the 3' end of the second-strand cDNA complementary to the first-strand cDNA contains the aforementioned cDNA tag sequence, and the cDNA tag sequence contains poly(A); and melting the double-stranded cDNA to obtain a single-stranded cDNA containing the cDNA tag sequence.
  • the purpose of melting the double-stranded cDNA is for subsequent circularization.
  • the subsequent circularization method is not limited here, as long as the cDNA can be linked end to end.
  • the microbead tag sequence on the microbeads used to capture single cells preferably contains the first PCR linker, the first cell tag, the first unique molecular tag and poly(dT) sequentially from 5' to 3'.
  • the cDNA tag sequence contains a second PCR linker, a second cell tag, a second unique molecular tag and poly(A) in sequence from 3' to 5', wherein the second PCR linker is complementary to the first PCR linker, the second cell tag is complementary to the first cell tag, and the second unique molecular tag is complementary to the first unique molecular tag. That is, the 5' end of the first-strand cDNA carries the microbead tag sequence, and the 3' end of the amplified second-strand cDNA carries a cDNA tag sequence complementary to that microbead tag sequence.
  • to facilitate amplification of the single-stranded cDNA, its 5' end preferably contains the sequence of the TSO primer.
  • a reverse transcriptase with terminal transferase activity and a TSO linker are used to reverse transcribe the mRNA to obtain a first-strand cDNA whose 3' end contains the complementary sequence of the TSO linker; the first-strand cDNA is then amplified to obtain the second-strand cDNA, and the 5' end of this single-stranded second-strand cDNA contains the sequence of the TSO primer.
  • the full-length second-strand cDNA can be obtained if TSO primers are used, and the second-strand cDNA of any length can be obtained if TSO-random primers are used for amplification.
  • full-length or arbitrary-length double-stranded cDNA can be obtained, and further full-length or arbitrary-length single-stranded cDNA with the above-mentioned cDNA tag sequence can be obtained after melting.
  • sequence of the above-mentioned TSO linker can adopt the existing known sequence, and can also be designed according to the needs.
  • sequence of the TSO linker is SEQ ID NO:1.
  • reverse transcriptases include, but are not limited to, MGI's Alpha reverse transcriptase, Invitrogen's SuperScript™ II reverse transcriptase, and Thermo's SuperScript IV or Maxima H Minus.
  • the method of capturing single-cell mRNA based on the droplet method is currently the method that truly realizes low-cost high-throughput transcriptome sequencing.
  • the core of this method is to use droplets as microreactors, each droplet containing one cell and one carrier/support, preferably a microbead, that bears a tag sequence (usually already including the first cell tag sequence, i.e. the cell barcode, and a unique molecular tag, i.e. the Unique Molecular Identifier, UMI).
  • the cells in the droplet are lysed to release the mRNA, which binds to the capture sequence on the microbeads, thereby realizing the capture of the mRNA.
  • the steps of capturing the mRNA of single cells by the droplet method are the same as the above method.
  • the process of the first-strand cDNA obtained by reverse transcription of mRNA is also the same as that of the existing method, and the first-strand full-length cDNA can be obtained by reverse transcription using a reverse transcriptase having terminal transferase activity.
  • owing to the terminal transferase activity of the reverse transcriptase, three Cs can be added to the end of the first-strand full-length cDNA; the terminal rGrG+G of the free TSO linker in the droplet (see SEQ ID NO: 1 in Example 1 for an example) can then pair with the three Cs, and a sequence complementary to the TSO linker is synthesized downstream of the terminal CCC of the first-strand full-length cDNA (see the underlined part of SEQ ID NO: 1 in Example 1 for an example).
  • Second strand cDNA synthesis is then accomplished with TSO primers or TSO-random primers.
  • the double-stranded cDNA thus obtained includes full-length cDNA and cDNA of arbitrary length, and carries a cDNA tag sequence at one end corresponding to the 3' end of the mRNA.
  • the transfer of the cDNA tag sequence can be achieved by circularizing one strand of such a double-stranded cDNA (i.e., the single-stranded cDNA with poly(A)), thereby obtaining non-full-length, or even full-length, amplified fragments in which the 5' end of the mRNA is tagged; constructing a library from these amplified fragments enables sequencing of the 5' end and/or the full length of the mRNA.
  • random amplification or full-length amplification of the above-mentioned first-strand cDNA can be performed.
  • in a preferred embodiment, random amplification and/or full-length amplification is performed on the first-strand cDNA to obtain double-stranded cDNA.
  • in another preferred embodiment, the first-strand cDNA is amplified with the adapter amplification primer (see SEQ ID NO: 2 in Example 1 for an example) together with at least one of the TSO primer (see SEQ ID NO: 3 in Example 1 for an example) and the TSO-random primer (see SEQ ID NO: 4 in Example 1 for an example) to obtain double-stranded cDNA.
  • Double-stranded cDNA fragments starting from any position at the 5' end can be amplified by TSO-random primers, so as to cover all sequences of mRNA from the 5' end to the 3' end as comprehensively as possible.
  • the TSO primer ensures that full-length double-stranded cDNA fragments are obtained. Circularizing these fragments of different lengths yields single-stranded circular DNA molecules of non-uniform size; amplifying all the circular fragments yields amplified fragments that tag different positions at the 5' end of the mRNA, and constructing a library from these amplified fragments yields a sequencing library covering fragments at different positions of the 5' end and/or the full length of the mRNA.
  • circularizing the single-stranded cDNA to obtain the single-stranded circularized cDNA includes: ligating the single-stranded cDNA into a circle under the action of a circularization auxiliary sequence and a ligase to obtain a ligation product; and digesting the ligation product to remove single-stranded DNA that has not been ligated into a circle (if the double-stranded DNA is circularized directly after melting without separation, uncircularized double-stranded DNA may also be present), to obtain the single-stranded circularized cDNA; wherein the circularization auxiliary sequence (see SEQ ID NO: 5 of Example 1 for an example) is complementary to the sequences at both ends of the single-stranded cDNA to be circularized.
  • the double-stranded cDNA is first heat-denatured to melt it into two single strands; in the single-stranded state, the circularization auxiliary sequence (designed according to the sequences at both ends of the single strand to be circularized) is incubated with the single strand, and by pairing with the sequences at both ends it brings the head and tail together, so that the single strand is circularized under the action of the ligase.
  • poly(A) primers can be used together with random primers or specific types of gene-specific primers for amplification, to obtain different amplified fragments in which the 5' end is tagged.
  • through amplification after circularization, the method converts the existing approach of tagging the 3' end of mRNA into tagging of target fragments at the 5' end of the mRNA, thereby enabling 5'-end sequencing and/or full-length sequencing of the mRNA.
  • the method is simple and convenient, and is compatible with the library construction steps of various existing sequencing platforms, which is helpful to achieve high-throughput sequencing of the 5'-end and/or full-length mRNA of single cells.
  • a combination of at least one of the following primers and a cDNA tag primer is used to amplify the single-stranded circular cDNA to obtain amplified fragments that meet different requirements: (a) random primers; (b) TCR Gene primer; (c) BCR gene primer; preferably, the cDNA tag primer is a poly(A) primer, more preferably SEQ ID NO: 6.
  • primers for TCR and/or BCR genes are used to obtain the expression of immune-related genes.
  • the above step of constructing a fragmented library of amplified fragments to obtain a single-cell RNA sequencing library may adopt a conventional fragmented library construction process.
  • the step includes: adding a library adapter to the amplified fragment to obtain an RNA sequencing library.
  • the specific way of adding adapters can be selected according to different sequencing platforms, and appropriate library adapters and operation methods can be selected for addition.
  • this step includes: fragmenting the amplified fragments enzymatically to obtain digested fragments; and subjecting the digested fragments in turn to end repair with A-tailing and adapter ligation to obtain the single-cell RNA sequencing library. More preferably, after end repair with A-tailing and adapter ligation, the step also includes amplifying the resulting ligated fragments, so as to obtain an RNA sequencing library that meets the requirements for loading on the sequencer.
  • adapters suitable for specific sequencing platforms can be reasonably selected.
  • it can be the adapter of the MGI sequencing platform or the adapter of the Illumina sequencing platform.
  • the amplification primers used to amplify the ligated fragments connected to the adapters are also matched with the corresponding platform adapter sequences.
  • the primers used to amplify the ligated fragments are also the amplification primers of the MGI sequencing platform.
  • a single-cell RNA library construction kit includes: a circularization auxiliary sequence, a DNA ligase, a cDNA tag primer and at least one of the following primers: (a) random primers; (b) TCR gene primers; (c) BCR gene primers.
  • the kit is mainly designed based on the reagents used in the circularization step and the step of amplifying a specific fragment at the 5' end of the single-stranded circularized DNA or a random fragment starting from any position at the 5' end in the above-mentioned library construction method.
  • the above reagents are convenient and quick to complete the library construction.
  • the circularization auxiliary sequence is complementary to the TSO linker corresponding to the 5' end of the mRNA and to the tag sequence corresponding to the 3' end of the mRNA. Its specific sequence composition therefore varies with the specific sequences of the tag sequence and the TSO linker.
  • the DNA ligase in the above kit mainly connects a base with phosphorylation modification and a base with hydroxyl in the DNA chain, so any DNA ligase that can realize DNA ligation is suitable for this application.
  • it can be a heat-labile DNA ligase, such as T4 DNA ligase, or a heat-stable DNA ligase, such as Thermo stable DNA ligase.
  • the above-mentioned kit further comprises a solid-phase support bearing a support tag sequence, wherein the cDNA tag primer is complementary to at least a part of the support tag sequence; preferably the solid-phase support is a microbead, and the support tag sequence is a microbead tag sequence.
  • the solid-phase support with the support tag sequence can easily capture single-cell mRNA, and at the same time make the mRNA carry a tag sequence complementary to the support tag sequence.
  • microbeads with tag sequences can be purchased commercially or prepared in-house.
  • the tag sequence on each microbead includes the following DNA sequences: (1) PCR linker, used for PCR amplification. (2) Cell barcode, one bead corresponds to one cell label. (3) Unique Molecular Identifier (UMI), which is used to label different template molecules in the same cell to quantify the abundance of transcripts. (4) A capture sequence, usually poly(dT), captures mRNA by binding to the poly(A) tail of the mRNA.
  • this kit also includes an RNA extraction reagent and/or an RNA reverse transcription reagent; the RNA reverse transcription reagent includes a reverse transcriptase with terminal transferase activity (for example, MGI's Alpha reverse transcriptase, Invitrogen's SuperScript™ II reverse transcriptase, or Thermo's Maxima H Minus, SuperScript IV, etc.).
  • Reagents related to mRNA capture and reverse transcription based on the droplet method can be used together.
  • poly(dT) binds to the poly(A) tail of mRNA to capture the mRNA
  • the first-strand cDNA can then be obtained with reverse transcriptase.
  • the usual reverse transcription reagents include, in addition to reverse transcriptase, a TSO linker (such as shown in SEQ ID NO: 1).
  • a reverse transcriptase with terminal transferase activity can be used to add CCC to the end of the first cDNA strand; the rGrG+G on the TSO linker then pairs with the CCC, and, using the TSO linker sequence as a template, the complementary sequence of the TSO linker is synthesized (i.e., equivalent to ligating the complement of the TSO to the end of the first cDNA strand).
  • the above-mentioned kit further includes TSO primers, TSO-random primers and adapter amplification primers.
  • the sequence of the adapter amplification primer is SEQ ID NO: 2
  • the sequence of the TSO primer is SEQ ID NO: 3
  • the sequence of the TSO-random primer is SEQ ID NO: 4.
  • the adapter amplification primer can bind to the cDNA tag sequence at the 3' end of the second cDNA strand (corresponding to the 3' end of the mRNA), while the TSO-random primer can bind to any position at the 3' end of the first strand cDNA (corresponding to the 5' end of the mRNA) , so that cDNA fragments of different lengths can be obtained.
  • the circularization auxiliary sequence is SEQ ID NO: 5, preferably, the cDNA tag primer is a poly(A) primer, more preferably SEQ ID NO: 6;
  • the kit further includes at least one of exonuclease and library linker.
  • the exonuclease is used to degrade uncircularized single-stranded or double-stranded cDNA after the double-stranded cDNA has been melted into single strands and circularized; for example, it can be exonuclease I or exonuclease III.
  • the adapter for library construction can be an adapter of the MGI sequencing platform (for example, a double-stranded adapter in which one strand is a linear strand and the other a bubble strand: the linear strand is A + a 31 bp sequence + a 10 bp index sequence + 17 bp, and the bubble strand comprises a 17 bp bubble sequence, 13 bp before the bubble sequence and 7 bp + T after it, for a total adapter length of 97 bp). It can also be an adapter of another sequencing platform, such as the Illumina sequencing platform (for example Y-shaped P5 and P7 adapters, one or both of which carry a library tag sequence as needed, which makes it convenient to demultiplex the output data when mixed samples are sequenced later).
  • an RNA sequencing method comprising: constructing an RNA sequencing library by using any of the above RNA sequencing library construction methods, and sequencing the RNA sequencing library.
  • the RNA sequencing library constructed by the aforementioned construction method can contain fragments covering more of the mRNA 5' end, or fragments covering the full-length mRNA, so it can meet current market demand for 5'-end sequencing, for example the 5'-end sequencing needed to build immune repertoires. Sequencing of a full-length RNA sequencing library can support studies of structural variation such as alternative splicing of particular transcripts.
  • the following examples include cell suspension preparation, microbead preparation, droplet generation, demulsification, reverse transcription (RT) reaction, cDNA amplification, circularization ligation, amplification of the circularized product, library construction with a fragmentation enzyme, high-throughput sequencing, etc.
  • TSO linker sequence SEQ ID NO: 1: 5'- AAGCAGTGGTATCAACGCAGAGTACAT rGrG+G-3', +G represents locked nucleotides, the reason for using rGrG+G is that the hybridization between RNA and DNA has better thermal stability.
  • the Tn primer (i.e., the adapter amplification primer) is used to amplify from the end connected to the magnetic bead, and its specific sequence is:
  • SEQ ID NO: 2 5'-CGTAGCCATGTCGTTCTG-3';
  • TSO primers are used to amplify from one end of the TSO adapter, and their specific sequences are:
  • SEQ ID NO: 3 5' Phos-AAGCAGTGGTATCAACGCAGAGTACAT-3';
  • TSO-random primers are used to amplify from any position at the 5' end of the cDNA to the 3' end.
  • the specific sequence is:
  • SEQ ID NO: 4 5'phos-AAGCAGTGGTATCAACGCAGAGTACATNNNNNN-3'.
  • PCR reaction Carry out PCR reaction according to the following conditions: 95°C, 3min; 10-15 cycles (98°C, 20s; 58°C, 20s; 72°C, 3min); 72°C, 5min; 4°C, maintenance.
  • Use VAHTS™ DNA Clean Beads (Vazyme, N411-03; equilibrated at room temperature for 30 min in advance) to purify and recover the PCR product.
  • SEQ ID NO: 5 5'-TACCACTGCTTCGTAGCCATGT-3'.
  • the above PCR tube was placed in an ice bath, and the reaction system was prepared according to the following table.
  • Pipette 4 μL of the prepared digestion reaction solution (used to digest uncircularized single strands and any unmelted double strands) into the single-stranded circularization product, vortex briefly to mix, and centrifuge briefly. Place the PCR tube in the PCR machine and incubate at 37 °C for 30 min with the heated lid at 75 °C.
  • the specific sequence of the poly(A) primer is:
  • SEQ ID NO: 6 5'phos-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA-3'.
  • according to the cDNA concentration obtained in step 8.3, take 100-200 ng (about 0.1-0.2 pmol) of the cDNA to be fragmented into a new 0.2 mL PCR tube; the volume should be ≤ 16 μL, and if it is less than 16 μL, make it up to 16 μL with H2O.
  • SEQ ID NO: 7 5'-Phos-AGTCGGAGGCCAAGCGGTCTTAGGAAGACAA-3';
  • SEQ ID NO: 8 3'-TTCAGCTCGGT-5'.
  • the specific sequence of the primer for amplifying the ligation product of the adapter is:
  • NNNNNNNNNN is the tag sequence, and N represents any of A/T/C/G, which is used to distinguish different libraries.
  • the above PCR tube was placed in an ice bath, and the reaction system was prepared according to the following table.
  • the steps are the same as the previous full-length RNA sequencing steps.
  • Use VAHTS™ DNA Clean Beads (equilibrated at room temperature for 30 min in advance) to purify and recover the PCR products.
  • the circularization procedure was the same as the full-length RNA sequencing procedure described above.
  • the library constructed in Example 2 was sequenced on a sequencer of the MGI sequencing platform, and the coverage of transcripts by sequencing fragments from the 5' end and the 3' end in the output data was analyzed.
  • the specific results are shown in Figure 8 .
  • light gray corresponds to the coverage of the 5' end
  • dark gray corresponds to the coverage of the 3' end. It can be seen from Figure 8 that the method of the present invention can effectively capture the information at the 5' and 3' ends of the transcript.
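The 5'/3' coverage analysis summarized for Figure 8 can be approximated along the following lines. This is only a minimal sketch and not the analysis used for the figure: the function name `relative_coverage`, the bin count and the toy read intervals are hypothetical, and reads are assumed to be already mapped to transcript coordinates oriented 5' to 3'.

```python
from typing import List, Tuple


def relative_coverage(reads: List[Tuple[int, int, int]], n_bins: int = 10) -> List[int]:
    """Count covered bases per relative-position bin along transcripts.

    Each read is (start, end, transcript_length), 0-based and end-exclusive,
    measured from the transcript 5' end. Bin 0 is the 5'-most part of the
    transcript and bin n_bins - 1 the 3'-most part.
    """
    bins = [0] * n_bins
    for start, end, length in reads:
        for pos in range(start, end):
            bins[pos * n_bins // length] += 1
    return bins


# Hypothetical toy reads on 1000-nt transcripts: two near the 5' end, one near the 3' end.
toy_reads = [(0, 100, 1000), (50, 150, 1000), (900, 1000, 1000)]
profile = relative_coverage(toy_reads)
```

A profile with signal in both the first and the last bins is what coverage of both transcript ends would look like in this simplified representation.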

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Microbiology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biophysics (AREA)
  • Medicinal Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • General Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Plant Pathology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Immunology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Saccharide Compounds (AREA)

Abstract

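The present invention provides a method for constructing an RNA sequencing library, a sequencing method and a kit. The construction method comprises: obtaining a single-stranded cDNA that is the reverse transcription product of an mRNA, the 3' end of the single-stranded cDNA containing a cDNA tag sequence; circularizing the single-stranded cDNA to obtain a single-stranded circularized cDNA; amplifying the single-stranded circularized cDNA with a primer combination formed by a random primer or a gene-specific primer and a cDNA tag primer to obtain amplified fragments, the cDNA tag primer being at least a part of the cDNA tag sequence; and constructing a fragmented library from the amplified fragments to obtain the RNA sequencing library.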

Description

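Method for constructing an RNA sequencing library, sequencing method and kit
Technical Field
The present invention relates to the field of sequencing, and in particular to a method for constructing an RNA sequencing library, a sequencing method and a kit.
Background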
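With the rapid development of single-cell technology, the era of single-cell sequencing has arrived. Single-cell technology has been iterated continuously, from single-tube amplification to today's droplet-based high-throughput approaches, with great changes in labor, time, cost and the number of cells captured. However, because of the limit on sequencing read length, full-length cDNA usually has to be fragmented and built into a library before it can be sequenced. Smart-seq2 adopts a single-tube amplification strategy: techniques such as flow cytometry and microdissection are used to sort single cells into individual wells, where lysis, reverse transcription, amplification, fragmentation and library construction are performed. Because single-tube library construction has no specific barcode to distinguish the cell of origin, all sequences from one cell can only be tagged during library construction, and the libraries from different cells are finally pooled for sequencing. Although this strategy yields full-length sequence, cell throughput is limited and sequencing cost rises sharply.
Current droplet-based high-throughput sequencing strategies, by contrast, synthesize a sequence carrying a specific molecular tag on microbeads to specifically capture the sequences at the two ends of the cDNA. Because of the read-length limit, the tagged full-length cDNA still cannot be sequenced directly and must be further fragmented. After fragmentation, sequences from the middle of the cDNA carry no tag and are filtered out, so the full length cannot be obtained and much important information, such as alternative splicing, is lost. The specific steps of the droplet-based high-throughput sequencing strategy are as follows:
Droplet microfluidics uses water-in-oil droplets to encapsulate cells and microbeads, producing droplets that each contain one cell and one microbead; the droplets also contain lysis buffer, and the microbeads carry a large number of DNA sequences bearing a specific molecular tag (barcode) as well as a TSO sequence. As droplets are generated, the cell in each droplet is lysed by the lysis buffer and releases a large amount of mRNA. At the same time, the mRNA is captured by free poly(dT), the RT reaction is completed inside the droplet, and a TSO sequence is added at the end of the mRNA; the complement of the TSO at the extended end of the first-strand cDNA binds to the TSO on the microbead, so that the mRNA is captured onto the microbead. PCR amplification then attaches the specific molecular tag to the 5' sequence, enabling high-throughput single-cell 5' RNA sequencing, as in 10x Genomics. The main problem with 10x Genomics at present is its relatively high cost.
Companies offering single-cell sequencing instruments and services mainly include 10x Genomics, BD, and Dolomite with its Nadia instrument. Apart from 10x Genomics, high-throughput 5' RNA sequencing methods are still lacking, yet 5' RNA-seq is indispensable for obtaining immune repertoires: TCR/BCR are important components of immune cells, and obtaining 5' TCR/BCR sequences is essential for an in-depth understanding of immune mechanisms. In addition, no droplet-based high-throughput full-length single-cell mRNA sequencing technology has been reported so far, although full-length information plays an important role in understanding gene diversity and regulatory mechanisms.
As can be seen from the above, current single-tube full-length RNA sequencing can hardly achieve high-throughput single-cell sequencing. The shortcoming of droplet microfluidics is that the molecular tag can only be placed at the 3' end or the 5' end of the cDNA; after fragmentation and size selection during library construction, the middle sequences carry no specific molecular tag, cannot be traced back to their fragment of origin and are discarded, so the full length cannot be obtained. Moreover, although 10x Genomics has achieved high-throughput single-cell 5' RNA sequencing, mRNA capture based on TSO complementarity is less efficient than capture based on poly(dT), and the cost is also higher.
Therefore, existing methods still need to be improved to enable high-throughput sequencing of the 5' end or the full length of RNA from single cells.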
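Summary of the Invention
The main purpose of the present invention is to provide a method for constructing an RNA sequencing library, a sequencing method and a kit, so as to solve the problem in the prior art that high-throughput sequencing of the 5' end or the full length of RNA is difficult to achieve.
To achieve the above purpose, according to one aspect of the present invention, a method for constructing an RNA sequencing library is provided, comprising: obtaining a single-stranded cDNA that is the reverse transcription product of an mRNA, wherein the 3' end of the single-stranded cDNA contains a cDNA tag sequence; circularizing the single-stranded cDNA to obtain a single-stranded circularized cDNA; amplifying the single-stranded circularized cDNA with a primer combination formed by a random primer or a gene-specific primer and a cDNA tag primer to obtain amplified fragments, wherein the cDNA tag primer is at least a part of the cDNA tag sequence; and constructing a fragmented library from the amplified fragments to obtain the RNA sequencing library.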
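Further, obtaining the single-stranded cDNA that is the reverse transcription product of the mRNA, with the 3' end of the single-stranded cDNA containing the cDNA tag sequence, comprises: reverse transcribing the mRNA to obtain a first-strand cDNA; amplifying the first-strand cDNA to obtain a double-stranded cDNA, wherein the 3' end of the second-strand cDNA complementary to the first-strand cDNA contains the cDNA tag sequence, and the cDNA tag sequence contains poly(A); and melting the double-stranded cDNA to obtain the single-stranded cDNA.
Further, the cDNA tag sequence contains, in order from 3' to 5', a second PCR linker, a second cell tag, a second unique molecular tag and poly(A).
Further, the mRNA is derived from a single-cell sample, and the mRNA is single-cell mRNA.
Further, the single-cell mRNA is prepared by a droplet method so that the single-cell mRNA is attached to a solid-phase support, the solid-phase support preferably being a microbead.
Further, preparing the single-cell mRNA by the droplet method so that it is attached to microbeads comprises: providing a single-cell suspension and microbeads separately, the microbeads carrying a microbead tag sequence whose end contains poly(dT); and encapsulating the single-cell suspension and the microbeads in droplets, each droplet containing one single cell and one microbead, wherein the microbead binds the poly(A) of the mRNA in the single-cell suspension through its poly(dT), thereby attaching the mRNA to the microbead to obtain the single-cell mRNA.
Further, the microbead tag sequence contains, in order from 5' to 3', a first PCR linker, a first cell tag, a first unique molecular tag and poly(dT); correspondingly, the cDNA tag sequence contains, in order from 3' to 5', a second PCR linker, a second cell tag, a second unique molecular tag and poly(A), wherein the second PCR linker is complementary to the first PCR linker, the second cell tag is complementary to the first cell tag, and the second unique molecular tag is complementary to the first unique molecular tag.
Further, the 5' end of the single-stranded cDNA contains the sequence of the TSO primer.
Further, the mRNA is reverse transcribed with a reverse transcriptase and a TSO linker to obtain the first-strand cDNA, wherein the reverse transcriptase has terminal transferase activity and the 3' end of the first-strand cDNA contains the complementary sequence of the TSO linker; and the first-strand cDNA is amplified to obtain the second-strand cDNA, the 5' end of which contains the sequence of the TSO primer.
Further, the sequence of the TSO linker is SEQ ID NO: 1.
Further, the reverse transcriptase is selected from MGI's Alpha reverse transcriptase, Invitrogen's SuperScript™ II reverse transcriptase, and Thermo's SuperScript IV or Maxima H Minus.
Further, random amplification and/or full-length amplification is performed on the first-strand cDNA to obtain the double-stranded cDNA.
Further, the first-strand cDNA is amplified with the adapter amplification primer and the TSO primer to obtain the double-stranded cDNA; or the first-strand cDNA is amplified with the adapter amplification primer, the TSO-random primer and the TSO primer to obtain the double-stranded cDNA.
Further, the sequence of the adapter amplification primer is SEQ ID NO: 2, the sequence of the TSO primer is SEQ ID NO: 3, and the sequence of the TSO-random primer is SEQ ID NO: 4.
Further, circularizing the single-stranded cDNA to obtain the single-stranded circularized cDNA comprises: ligating the single-stranded cDNA into a circle under the action of a circularization auxiliary sequence and a ligase to obtain a ligation product; and digesting the ligation product enzymatically to remove single-stranded cDNA that has not been ligated into a circle, thereby obtaining the single-stranded circularized cDNA; wherein the circularization auxiliary sequence is complementary to the sequences at both ends of the single-stranded cDNA.
Further, the circularization auxiliary sequence is SEQ ID NO: 5.
Further, the gene-specific primer is a TCR primer for amplifying TCR genes and/or a BCR primer for amplifying BCR genes.
Further, the cDNA tag primer is a poly(A) primer, more preferably SEQ ID NO: 6.
Further, constructing a fragmented library from the amplified fragments to obtain the RNA sequencing library comprises: adding library adapters to the amplified fragments to obtain the RNA sequencing library.
Further, the amplified fragments are fragmented enzymatically to obtain digested fragments; and the digested fragments are subjected in turn to end repair, A-tailing and library adapter ligation to obtain the RNA sequencing library.
Further, after library adapter ligation, the method further comprises PCR amplifying the adapter ligation product to obtain the RNA sequencing library.
Further, the library adapter is an adapter for the MGI sequencing platform or an adapter for the Illumina sequencing platform.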
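According to a second aspect of the present application, a kit for constructing an RNA library is provided, comprising: a circularization auxiliary sequence, a DNA ligase, a cDNA tag primer, and at least one of the following primers: (a) a random primer; (b) a TCR primer; (c) a BCR primer.
Further, the kit further comprises an RNA reverse transcription reagent.
Further, the RNA reverse transcription reagent comprises a reverse transcriptase, the reverse transcriptase being one with terminal transferase activity.
Further, the reverse transcriptase is selected from MGI's Alpha reverse transcriptase, Invitrogen's SuperScript™ II reverse transcriptase, and Thermo's SuperScript IV or Maxima H Minus.
Further, the RNA reverse transcription reagent also comprises a TSO linker.
Further, the sequence of the TSO linker is SEQ ID NO: 1.
Further, the kit also comprises a TSO primer and an adapter amplification primer.
Further, the sequence of the adapter amplification primer is SEQ ID NO: 2 and the sequence of the TSO primer is SEQ ID NO: 3.
Further, the kit further comprises a TSO-random primer.
Further, the sequence of the TSO-random primer is SEQ ID NO: 4.
Further, the circularization auxiliary sequence is SEQ ID NO: 5.
Further, the cDNA tag primer is a poly(A) primer.
Further, the sequence of the cDNA tag primer is SEQ ID NO: 6.
Further, the kit also comprises at least one of an exonuclease and a library adapter.
Further, the exonuclease is selected from exonuclease I or exonuclease III.
Further, the library adapter is an adapter for the MGI sequencing platform or an adapter for the Illumina sequencing platform.
Further, the adapter for the MGI sequencing platform is a bubble adapter, and the adapter for the Illumina sequencing platform is selected from P5 and P7 adapters.
Further, the DNA ligase is T4 DNA ligase.
Further, the kit also comprises a solid-phase support bearing a support tag sequence, wherein the cDNA tag primer is complementary to at least a part of the support tag sequence; preferably, the solid-phase support is a microbead and the support tag sequence is a microbead tag sequence.
Further, the support tag sequence comprises, in the 5' to 3' direction: a first PCR linker, a first cell tag, a first unique molecular tag and poly(dT).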
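According to a third aspect of the present application, a method for sequencing an RNA library is provided, comprising: constructing an RNA sequencing library by any of the foregoing RNA sequencing library construction methods, and sequencing the RNA sequencing library.
With the technical solution of the present invention, a single-stranded cDNA that is the reverse transcription product of an mRNA and carries a cDNA tag sequence at its 3' end is obtained and circularized, so that the two ends of the single-stranded cDNA, corresponding to the 5' end and the 3' end of the mRNA, are joined; the cDNA tag sequence from the 3' end thereby tags fragments from the 5' end of the mRNA. The single-stranded circularized cDNA is then amplified with a cDNA tag primer, which is identical to at least a part of the cDNA tag sequence, together with a random primer or a gene-specific primer, yielding amplified fragments that start at the position of a specific gene at the 5' end or at any position at the 5' end. Finally, these amplified fragments are fragmented, size-selected, built into a library and sequenced, achieving high-throughput sequencing. Thus, depending on whether the single-stranded cDNA used for circularization is full-length or of random length, and on the research purpose, either a 5' RNA sequencing library or a full-length RNA sequencing library can be obtained.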
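Brief Description of the Drawings
The accompanying drawings, which form a part of the present application, are provided for further understanding of the present invention; the illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Figure 1 shows a schematic diagram of the structure of the chip used to prepare droplets in Example 1 of the present invention.
Figure 2 shows a schematic diagram of the principle of full-length RNA library construction of the present application and its library construction and sequencing workflow.
Figure 3 shows a schematic diagram of the principle of 5' RNA end library construction of the present application and its library construction and sequencing workflow.
Figures 4A and 4B show Agilent 2100 Bioanalyzer results for the full-length cDNA amplification products of the cell line sample and the solid tissue sample in Example 1 of the present application.
Figures 5A and 5B show Agilent 2100 Bioanalyzer results for the cDNA products amplified with random primers from the cell line sample and the solid tissue sample in Example 2 of the present application.
Figures 6A and 6B show Agilent 2100 Bioanalyzer results for the products amplified with random primers after single-stranded cDNA circularization, for the cell line sample and the solid tissue sample in Example 1 of the present application.
Figures 7A and 7B show Agilent 2100 Bioanalyzer results for the products amplified with TCR/BCR primers after single-stranded cDNA circularization, for the cell line sample and the solid tissue sample in Example 2 of the present application.
Figure 8 shows, for a library constructed in a preferred embodiment of the present application, the analysis of transcript coverage by 5'-end and 3'-end sequencing fragments in the sequencing output data.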
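Detailed Description
It should be noted that, where no conflict arises, the embodiments in the present application and the features in the embodiments may be combined with one another. The present invention is described in detail below with reference to the embodiments.
Explanation of terms: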
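TSO: template switch oligo (TSO), sometimes also called the "TSO linker" in the present application, is used together with a reverse transcriptase that has terminal transferase activity. When such a reverse transcriptase reverse transcribes mRNA, it adds CCC to the end of the first-strand cDNA (only on full-length transcripts). The TSO linker carries rGrG+G, which pairs with the CCC (rG denotes a riboguanosine nucleotide and +G denotes an LNA-modified deoxyriboguanosine nucleotide), so that during reverse transcription the complementary sequence of the rest of the TSO linker (everything except rGrG+G) is added after the CCC at the 3' end of the first-strand cDNA.
TSO primer: can bind at least partially to the 3' end of the first-strand cDNA and is used for second-strand cDNA synthesis and cDNA amplification. In one specific embodiment, the TSO primer is the TSO linker with the rGrG+G removed.
Random primer and TSO-random primer: a random primer in the present application is a sequence consisting only of the base N, for example a random sequence of 6 to 12 nt of N. A TSO-random primer is one that contains the sequence of the TSO primer upstream of such a random primer.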
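Support tag sequence: in the present application, the tag sequence attached to the solid-phase support used to capture mRNA; it includes at least an oligo(dT) for capturing mRNA. In some embodiments, the support tag sequence contains, in 5' to 3' order, a cell tag and oligo(dT), where the oligo(dT) is complementary to the poly(A) of the mRNA and thereby captures it, and the cell tag labels mRNAs originating from the same cell. In certain embodiments, to further label different mRNA molecules within the same cell, a unique molecular tag (i.e., UMI) can be placed between the cell tag and the oligo(dT). In certain preferred embodiments, to facilitate subsequent library construction, a PCR linker for subsequent PCR amplification can also be placed 5' of the cell tag. In some preferred embodiments, the solid-phase support is a microbead, and the tag sequence on the microbead, referred to as the microbead tag sequence, comprises, in order from nearest to farthest from the microbead (which is also the 5' to 3' order): a PCR linker, a cell tag, a unique molecular tag (i.e., UMI) and oligo(dT).
cDNA tag sequence: the tag sequence on the circularized single-stranded cDNA that carries poly(A), where the poly(A) is complementary to the oligo(dT) on the solid-phase support, enabling amplification of the mRNA captured by the solid-phase support. In some preferred embodiments, in addition to poly(A), the cDNA tag sequence contains a cell tag to label the cell of origin. In certain preferred embodiments, a unique molecular tag (i.e., UMI) is also placed between the cell tag and the poly(A) to label different mRNA molecules within the same cell. In other preferred embodiments, a PCR linker is also placed 3' of the cell tag, to serve as a primer for PCR amplification in subsequent library construction steps. In a more preferred embodiment, the cDNA tag sequence comprises, in 3' to 5' order: a PCR linker, a cell tag, a unique molecular tag and poly(A). In the present application, to distinguish the sequences more precisely, the PCR linker, cell tag and unique molecular tag (i.e., UMI) on the support tag sequence (or microbead tag sequence) are referred to as the first PCR linker, first cell tag and first unique molecular tag, and correspondingly those on the cDNA tag sequence are referred to as the second PCR linker, second cell tag and second unique molecular tag.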
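Adapter amplification primer: the adapter amplification primer in the present application is the PCR linker, described from the perspective of cDNA amplification; during cDNA amplification, the adapter amplification primer and the TSO primer are used as a primer pair.
cDNA tag primer: a primer used when amplifying the circularized single-stranded cDNA; it is at least a part of the tag sequence connected to the poly(A) of the circularized single-stranded cDNA, for example the poly(A) or the cDNA tag sequence. It is used to amplify the poly(T)-bearing sequence produced when a random primer or gene-specific primer is extended on the circularized single-stranded cDNA, thereby obtaining the amplified fragments needed to construct a 5' RNA library or a full-length RNA library.
TCR/BCR primers: TCR/BCR stands for T Cell Receptor/B Cell Receptor; these are primers for the genes encoding the T cell receptor or the B cell receptor, a kind of gene-specific primer, used for single-cell T/B cell receptor sequencing and thus for studying immune mechanisms.
Circularization auxiliary sequence: in the present application, an auxiliary sequence used during single-stranded cDNA circularization to bring the 5' end and the 3' end of the single-stranded cDNA close to each other. It does so through a stretch complementary to the 5' end and a stretch complementary to the 3' end of the single-stranded cDNA; a nick may remain between the juxtaposed 5' and 3' ends, which is sealed by DNA ligase, thereby circularizing the single-stranded cDNA.
poly(A): those skilled in the art know that mRNA has a poly(A) tail (polyadenylate); correspondingly, the poly(A) tail of DNA referred to in the present application is polydeoxyadenylate, which base-pairs with poly(dT).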
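As mentioned in the Background, among existing RNA sequencing methods, those able to sequence full-length mRNA can hardly achieve high throughput, while most high-throughput methods target the 3' end of mRNA; single-cell 5'-end sequencing is currently dominated by 10x Genomics, and single-cell full-length mRNA sequencing has not yet been reported. To improve this situation, the inventors, considering the shortcomings of existing single-cell sequencing technologies, improved existing single-cell RNA library construction methods. They found that there has been no report of capturing cDNA by a 3' RNA single-cell high-throughput approach and then, by circularization ligation, transferring the tag sequence at the 3' end of the mRNA to the 5' end of the mRNA, or sharing one tag with the 3' end, so as to achieve high-throughput single-cell RNA sequencing of the 3' end and/or the 5' end. Specifically, the inventors studied in detail the construction of a full-length cDNA single-cell high-throughput sequencing library and of a 5'-end RNA single-cell high-throughput sequencing library, and refined the experimental design to confirm the feasibility of the method. The principles and steps are as follows: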
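(I) Full-length RNA single-cell sequencing strategy
mRNA is captured with the droplet-based 3' RNA-seq strategy and reverse transcribed to generate full-length cDNA. The cDNA is then amplified with the adapter amplification primer, the TSO primer and the TSO-random primer; at the 3' end, the microbead tag sequence originating from the solid-phase support (e.g., microbeads, or "beads") is amplified and thereby retained as the cDNA tag sequence, while at the 5' end, DNA fragments of varying length are formed because the TSO-random primer binds at different positions. The fragments of different lengths are then circularized by ligation, forming circles in which the cDNA tag sequence at the 3' end is joined head to tail with the TSO primer at the 5' end. Random primers complementary to the cDNA circle and the poly(A) primer are then used for further amplification with the circle as template; these amplification products are fragmented, size-selected, built into a library and sequenced, so that the full length of the cDNA is amplified while high-throughput sequencing is also achieved (see Figure 2).
The specific steps are as follows:
(1) Design a PCR linker sequence for capturing the poly(A) at the 3' end of the mRNA; this PCR linker sequence can be attached to a solid-phase support (e.g., microbeads), and it can serve as the adapter amplification primer during cDNA amplification. Design a TSO primer complementary to the 3' end of the first-strand cDNA, as well as a TSO-random primer; after the first-strand cDNA is synthesized, amplify with these primers to obtain cDNA fragments of different sizes;
(2) Using circularization ligation, join head to tail the 3'-end cDNA tag sequence and the 5'-end sequence of the poly(A)-bearing single strand of the cDNA fragments, to obtain circular DNA molecules of non-uniform size;
(3) Design a sequence to serve as the upstream primer, identical to the poly(A) sequence at the 3' end of the cDNA; amplify the circular DNA molecules with random primers to obtain amplified sequences bearing poly(T), and, through binding of the poly(A) primer to the poly(T), amplify DNA fragments of different lengths; fragment these, construct a library and sequence, thereby obtaining the full-length RNA sequence. See the Examples for details.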
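The reason TSO-random priming helps cover the full transcript can be illustrated with a toy model. The sketch below is an assumption-laden simplification, not part of the described protocol: it supposes that, after circularization, only a fixed-size window immediately downstream of each fragment start stays linked to the tag once the product is fragmented; `tagged_positions`, the transcript length, the start offsets and the window size are all hypothetical.

```python
from typing import Iterable, Set


def tagged_positions(transcript_len: int, start_offsets: Iterable[int], window: int) -> Set[int]:
    """Transcript positions (0 = mRNA 5' end) that end up adjacent to the tag.

    Each start offset is where a second-strand cDNA begins; after
    circularization the tag abuts that position, and the window models the
    stretch that remains attached to the tag after fragmentation.
    """
    covered: Set[int] = set()
    for offset in start_offsets:
        covered.update(range(offset, min(offset + window, transcript_len)))
    return covered


# Full-length fragments only (TSO primer): every circle joins the tag to position 0.
full_length_only = tagged_positions(1000, [0], window=250)

# Adding TSO-random fragments that start at several internal positions.
with_random_starts = tagged_positions(1000, [0, 200, 400, 600, 800], window=250)
```

In this toy case the full-length fragments alone tag the first 250 positions, while the mixture of start positions tags all 1000.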
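In one specific embodiment, the overall technical route of the above full-length RNA sequencing is as follows: droplet preparation (the microbead phase contains lysis buffer) → mRNA capture → demulsification → reverse transcription → random primer amplification → circularization ligation of the amplification products → PCR amplification of the circularized products → fragmentation and library construction → sequencing. See the Examples for details.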
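(II) 5'-end RNA single-cell sequencing strategy
The strategy for 5'-end RNA sequencing is as follows: mRNA is obtained with the 3' RNA-seq droplet technique, then reverse transcribed and amplified to obtain full-length cDNA; the poly(A)-bearing single strand of the amplified double-stranded full-length cDNA is circularized by ligation, joining the sequences corresponding to the 3' end and the 5' end of the mRNA head to tail; gene-specific primers complementary to the circularized cDNA (such as TCR/BCR primers) or random primers, together with the poly(A) primer, are then used for further amplification with the circle as template; these amplification products are fragmented, size-selected, built into a library and sequenced, thereby capturing 5'-end TCR/BCR sequences or the 5' sequences of other target genes (as shown in Figure 3).
The specific steps are as follows:
(1) Design a PCR linker sequence for capturing the poly(A) at the 3' end of the mRNA. This PCR linker sequence can be attached to a solid-phase support (e.g., magnetic beads) and can serve as the adapter amplification primer during cDNA amplification. Design a TSO primer complementary to the 3' end of the first-strand cDNA. After the first-strand cDNA is synthesized, amplify with the primer pair formed by the PCR linker sequence and the TSO primer to obtain full-length cDNA fragments.
(2) Using circularization ligation, circularize the poly(A)-bearing single strand of the above full-length cDNA fragments to obtain circular DNA molecules in which the cDNA tag sequence at the 3' end is joined head to tail with the TSO primer at the 5' end.
(3) Design a poly(A) primer as the upstream primer, identical to the poly(A) sequence at the 3' end of the cDNA, and design TCR/BCR primers or random primers complementary to the cDNA circle; amplify the circular DNA molecules to obtain 5'-end TCR/BCR fragments, or amplified fragments starting at random positions at the 5' end.
(4) Construct a library from, and sequence, the fragments amplified with the TCR/BCR primers or the amplified fragments starting at random positions at the 5' end, thereby obtaining the TCR/BCR sequences at the 5' end of the mRNA or sequences starting at random positions at the 5' end.
In one specific embodiment, the overall technical route of 5'-end RNA sequencing is as follows: droplet preparation (the liquid phase of the microbeads contains cell lysis buffer) → demulsification → mRNA capture → reverse transcription → full-length cDNA amplification → circularization ligation of the amplification products → PCR amplification of the circularized products with the poly(A) primer plus TCR/BCR primers or random primers → fragmentation and library construction → sequencing.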
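In summary, the present application captures mRNA with a 3' droplet-based strategy: the solid-phase support (e.g., a microbead) carries a PCR linker, a cell tag and a UMI (i.e., unique molecular tag) sequence, with a stretch of poly(dT) added at the end; the poly(dT) is complementary to the poly(A) tail at the 3' end of mature mRNA and thereby captures the mRNA, which is then reverse transcribed to synthesize full-length first-strand cDNA. Fragments of different sizes, or the full length, of the cDNA are then obtained by random-primer amplification and ligation into circles, enabling sequencing of the 5' end and the full length of single-cell RNA.
Based on the above ideas for improvement and the research results, the applicant proposes the technical solution of the present application. In a typical embodiment, the present application provides a method for constructing an RNA sequencing library, comprising:
obtaining a single-stranded cDNA that is the reverse transcription product of single-cell mRNA, the 3' end of the single-stranded cDNA containing a cDNA tag sequence;
circularizing the single-stranded cDNA to obtain a single-stranded circularized cDNA;
amplifying with a primer combination formed by a random primer or a gene-specific primer and a cDNA tag primer to obtain amplified fragments, wherein the cDNA tag primer is at least a part of the cDNA tag sequence;
constructing a fragmented library from the amplified fragments to obtain the RNA sequencing library.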
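In the above construction method, a single-stranded cDNA that is the reverse transcription product of an mRNA and carries a cDNA tag sequence at its 3' end is obtained and circularized, so that the two ends of the single-stranded cDNA, corresponding to the 5' end and the 3' end of the mRNA, are joined; the cDNA tag sequence from the 3' end thereby tags fragments from the 5' end of the mRNA. The single-stranded circularized cDNA is then amplified with a cDNA tag primer, which is identical to at least a part of the cDNA tag sequence, together with a random primer or a gene-specific primer, yielding amplified fragments that start at the position of a specific gene at the 5' end or at any position at the 5' end. Finally, these amplified fragments are fragmented, size-selected, built into a library and sequenced, achieving high-throughput sequencing. Thus, depending on whether the single-stranded cDNA used for circularization is full-length or of random length, a full-length RNA sequencing library or a 5' RNA sequencing library can be obtained.
It should be noted that the above construction method is suitable for RNA library construction from any sample, as long as the single-stranded cDNA reverse transcribed from the sample's mRNA can be obtained. It is particularly suitable for single-cell RNA library construction. In a preferred embodiment, the mRNA is derived from a single-cell sample and is single-cell mRNA.
Single-cell mRNA can be obtained by methods known in the prior art. In a preferred embodiment of the present application, the single-cell mRNA is prepared by a droplet method so that it is attached to a solid-phase support, preferably a microbead.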
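In a preferred embodiment of the present application, preparing the single-cell mRNA by the droplet method so that it is attached to microbeads comprises: providing a single-cell suspension and microbeads separately, the microbeads carrying a microbead tag sequence whose end contains poly(dT); and encapsulating the single-cell suspension and the microbeads in droplets, each droplet containing one single cell and one microbead, wherein the microbead binds the poly(A) of the mRNA in the single-cell suspension through its poly(dT), thereby attaching the mRNA to the microbead to obtain the single-cell mRNA. It should be noted that, because the single-cell suspension contains cell lysis buffer, it can also be regarded as a cell nucleus suspension. The droplet method captures mRNA through binding of the poly(dT) on the microbead to the poly(A) of the mRNA, with high capture efficiency. Because a single oil droplet corresponds to a single microbead and a single cell, the microbead tag sequence on the microbead can also specifically label a single cell.
In the above construction method, the specific way of obtaining the single-stranded cDNA that is the reverse transcription product of the mRNA is not limited, as long as it carries the above cDNA tag sequence at the end corresponding to the 3' end of the mRNA. As stated above, the length of the single-stranded cDNA obtained is not particularly limited either: it may be full-length cDNA, random-length cDNA, or both. To fully mine and use transcriptome information, in a preferred embodiment of the present application, the step of obtaining the single-stranded cDNA with a cDNA tag sequence at its 3' end comprises: reverse transcribing the mRNA to obtain a first-strand cDNA; amplifying the first-strand cDNA to obtain a double-stranded cDNA, wherein the 3' end of the second-strand cDNA complementary to the first-strand cDNA contains the aforementioned cDNA tag sequence, and the cDNA tag sequence contains poly(A); and melting the double-stranded cDNA to obtain a single-stranded cDNA containing the cDNA tag sequence. The double-stranded cDNA is melted to allow the subsequent circularization; in the double-stranded state, the circularization auxiliary sequence can hardly bind effectively to the strand to be circularized. The subsequent circularization method is not limited here, as long as the cDNA can be joined head to tail.
In the above preferred embodiment, the microbead tag sequence on the microbeads used to capture single cells preferably contains, in order from 5' to 3', a first PCR linker, a first cell tag, a first unique molecular tag and poly(dT); correspondingly, the cDNA tag sequence contains, in order from 3' to 5', a second PCR linker, a second cell tag, a second unique molecular tag and poly(A), wherein the second PCR linker is complementary to the first PCR linker, the second cell tag is complementary to the first cell tag, and the second unique molecular tag is complementary to the first unique molecular tag. That is, the 5' end of the first-strand cDNA carries the microbead tag sequence, and the 3' end of the amplified second-strand cDNA carries a cDNA tag sequence complementary to that microbead tag sequence.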
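The complementarity between the microbead tag sequence and the cDNA tag sequence can be made concrete with a short sketch. It is illustrative only: the first PCR linker is assumed to equal the adapter amplification primer (SEQ ID NO: 2), while the cell tag, the unique molecular tag, the poly(dT) length and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_bead_tag(pcr_linker: str, cell_tag: str, umi: str, dt_len: int = 30) -> str:
    """Microbead tag, 5'->3': PCR linker, cell tag, unique molecular tag, poly(dT)."""
    return pcr_linker + cell_tag + umi + "T" * dt_len


first_pcr_linker = "CGTAGCCATGTCGTTCTG"  # assumed equal to SEQ ID NO: 2
first_cell_tag = "ACGTTGCAAC"            # hypothetical 10-nt cell tag
first_umi = "GATTACAGCA"                 # hypothetical 10-nt unique molecular tag

bead_tag = build_bead_tag(first_pcr_linker, first_cell_tag, first_umi)

# The cDNA tag on the second strand is the reverse complement of the bead tag.
# Written 5'->3' it starts with poly(A); read 3'->5' it runs second PCR linker,
# second cell tag, second unique molecular tag, poly(A).
cdna_tag = reverse_complement(bead_tag)
```

Reading `cdna_tag` from its 3' end backwards reproduces the order given in the text, which is why the same cell tag and unique molecular tag information is preserved on the poly(A)-bearing strand.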
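To facilitate amplification of the single-stranded cDNA, its 5' end preferably contains the sequence of the TSO primer. The specific way of adding the TSO primer sequence at the 5' end is not limited. In a preferred embodiment of the present application, the mRNA is reverse transcribed with a reverse transcriptase having terminal transferase activity and a TSO linker to obtain a first-strand cDNA whose 3' end contains the complementary sequence of the TSO linker; the first-strand cDNA is amplified to obtain a second-strand cDNA, and the 5' end of this single-stranded second-strand cDNA contains the sequence of the TSO primer.
Specifically, when the first-strand cDNA is amplified, using the TSO primer yields full-length second-strand cDNA, while using the TSO-random primer yields second-strand cDNA of arbitrary length. Depending on the type of amplification primer, full-length or arbitrary-length double-stranded cDNA can therefore be obtained, and, after melting, full-length or arbitrary-length single-stranded cDNA carrying the above cDNA tag sequence.
The sequence of the above TSO linker may be an existing known sequence or may be designed as needed. In some preferred embodiments of the present application, the sequence of the TSO linker is SEQ ID NO: 1. The above reverse transcriptases include, but are not limited to, MGI's Alpha reverse transcriptase, Invitrogen's SuperScript™ II reverse transcriptase, and Thermo's SuperScript IV or Maxima H Minus.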
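The relationship between the TSO linker, the TSO primer and the TSO-random primer can be sketched as follows. The input string is SEQ ID NO: 1 as quoted in Example 1; the parsing helper, its name and the choice of six random bases (the lower end of the 6 to 12 nt range mentioned in the definitions) are illustrative assumptions.

```python
import re
from typing import List, Tuple

TSO_LINKER = "AAGCAGTGGTATCAACGCAGAGTACAT rGrG+G"  # SEQ ID NO: 1 as quoted in Example 1


def split_tso_linker(linker: str) -> Tuple[str, List[str]]:
    """Split the linker into its DNA part and its modified 3' tail.

    'rG' marks a ribonucleotide and '+G' an LNA-modified nucleotide.
    """
    dna_part, tail = linker.split()
    modified = re.findall(r"r[ACGU]|\+[ACGT]", tail)
    return dna_part, modified


def tso_random_primer(tso_primer: str, n_random: int = 6) -> str:
    """Append a run of degenerate bases (N) downstream of the TSO primer."""
    return tso_primer + "N" * n_random


# TSO primer = TSO linker without the rGrG+G tail.
tso_primer, modified_tail = split_tso_linker(TSO_LINKER)
random_primer = tso_random_primer(tso_primer)

# Number of distinct oligos represented by a 6-nt fully degenerate tail.
n_species = 4 ** 6
```

With these inputs the derived strings match the base sequences listed for SEQ ID NO: 3 and SEQ ID NO: 4 in Example 1 (ignoring the 5' phosphate).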
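It should be noted that capturing single-cell mRNA by the droplet method is currently the approach that truly achieves low-cost, high-throughput transcriptome sequencing. The core of the method is to use droplets as microreactors, each droplet containing one cell and one carrier/support bearing a tag sequence (usually already including the first cell tag sequence, i.e. the cell barcode, and a unique molecular tag, i.e. the Unique Molecular Identifier, UMI); the carrier/support is preferably a microbead. After the droplets form, the cell in each droplet is lysed and releases mRNA, which binds to the capture sequence on the microbead, so that the mRNA is captured. Usually, after mRNA enrichment is completed in the droplets, the mRNA enriched on all microbeads (by then the mRNA carries a tag complementary to the microbead tag sequence; several thousand cells can be processed in one batch) is pooled for subsequent library construction. The mRNA library construction of 10x Genomics uses exactly this principle; its drawback is that it mainly builds libraries from and sequences the 3' end of the mRNA, and library construction and sequencing of full-length mRNA are difficult.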
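How the cell tag and the unique molecular tag are used downstream can be shown with a small counting sketch. It is not taken from the source: the tag lengths, the record format (a tag read paired with an already assigned gene name) and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

CELL_TAG_LEN = 10  # hypothetical cell tag length
UMI_LEN = 10       # hypothetical unique molecular tag length


def split_tag_read(tag_read: str) -> Tuple[str, str]:
    """Split a tag read into (cell tag, unique molecular tag)."""
    return tag_read[:CELL_TAG_LEN], tag_read[CELL_TAG_LEN:CELL_TAG_LEN + UMI_LEN]


def count_molecules(records: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], int]:
    """Count distinct unique molecular tags per (cell tag, gene).

    Reads sharing cell tag, gene and unique molecular tag are treated as
    copies of one original mRNA molecule and are counted once.
    """
    umis: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for tag_read, gene in records:
        cell, umi = split_tag_read(tag_read)
        umis[(cell, gene)].add(umi)
    return {key: len(values) for key, values in umis.items()}


records = [
    ("AAAAACCCCC" + "GGGGGTTTTT", "GENE_A"),
    ("AAAAACCCCC" + "GGGGGTTTTT", "GENE_A"),  # amplification duplicate, same tags
    ("AAAAACCCCC" + "ACGTACGTAC", "GENE_A"),  # second molecule in the same cell
    ("TTTTTGGGGG" + "ACGTACGTAC", "GENE_A"),  # a different cell
]
counts = count_molecules(records)
```

The duplicate read does not inflate the count, which is the reason the unique molecular tag allows transcript abundance to be quantified per cell.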
本申请的上述优选实施例中,通过液滴法捕获单细胞的mRNA的步骤与上述方法相同。对mRNA进行逆转录得到的第一链cDNA的过程也与现有方法相同,可以利用具有末端转移酶活性的逆转录酶进行逆转录得到第一链全长cDNA。由于该逆转录酶的末端转移酶活性,能够在第一链全长cDNA的末端添加3个C,这时,液滴中游离的TSO接头(示例参见实施例1中的SEQ ID NO:1)的末端三位rGrG+G能够与3个C结合,接着在第一链全长cDNA的末端CCC的下游合成与TSO接头互补的序列(示例参见实施例1中的SEQ ID NO:1的划线部分的序列)。然后以TSO引物或TSO-随机引物完成第二链cDNA的合成。这样获得的双链cDNA有全长cDNA,也有任意长度片段的cDNA,且在对应于mRNA的3’端的一端带有cDNA标签序列。如上述,本申请通过对这样的双链cDNA中的一条链(即带poly(A)的单链cDNA)进行环化即可实现cDNA标签序列的转移,从而获得对mRNA的5’端进行标记的非全长的扩增片段,甚至全长的扩增片段,对这些扩增片段构建成文库即可实现mRNA的5’端和/或全长测序。
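Accordingly, the first-strand cDNA may be subjected to random amplification or full-length amplification as required. To further improve how completely the amplified fragments cover the 5'-end fragments of the mRNA, and even its full length, in a preferred embodiment of the present application the first-strand cDNA is subjected to random amplification and/or full-length amplification to obtain double-stranded cDNA. In another preferred embodiment, the first-strand cDNA is amplified using an adapter amplification primer (for example, SEQ ID NO: 2 in Example 1) together with at least one of a TSO primer (for example, SEQ ID NO: 3 in Example 1) and a TSO-random primer (for example, SEQ ID NO: 4 in Example 1) to obtain double-stranded cDNA.
The TSO-random primer amplifies double-stranded cDNA fragments that start at any position from the 5' end, so that the fragments cover as completely as possible all sequences of the mRNA from the 5' end to the 3' end, while the TSO primer ensures that full-length double-stranded cDNA fragments are obtained. Circularizing these fragments of different lengths yields single-stranded circular DNA molecules of heterogeneous size; amplifying all of the circular molecules yields amplified fragments that label different positions from the 5' end of the mRNA, and constructing a library from these amplified fragments yields a sequencing library covering fragments at different positions of the 5' end and/or the full length of the mRNA.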
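The step of circularizing the single-stranded cDNA may use an existing circularization method, for example the circularization procedure used in MGI's circularization-based sequencing technology. In a preferred embodiment of the present application, circularizing the single-stranded cDNA to obtain single-stranded circular cDNA comprises: ligating the single-stranded cDNA into a circle under the action of a circularization helper sequence and a ligase to obtain a ligation product; and enzymatically digesting the ligation product to remove single-stranded DNA that has not been circularized (if the denatured double-stranded DNA is circularized directly without separation, uncircularized double-stranded DNA may also be present), to obtain the single-stranded circular cDNA; wherein the circularization helper sequence (for example, SEQ ID NO: 5 in Example 1) is complementary to the sequences at the two ends of the single-stranded cDNA to be circularized (for example, complementary to the second PCR adapter in the cDNA tag sequence and to the TSO primer, respectively).
In a more preferred embodiment, the double-stranded cDNA is first heat-denatured into two single strands; the circularization helper sequence (designed according to the end sequences of the strand to be circularized) is then incubated with the single strands, brings the two ends together by pairing with both end sequences, and the strand is circularized by the ligase.
Once the single-stranded circular cDNA has been obtained, it can be amplified, depending on the research purpose, using a poly(A) primer together with a random primer or a gene-specific primer of a particular type (such as TCR/BCR primers), to obtain different amplified fragments that label the 5' end. By amplifying after circularization, the method converts the existing 3'-end labeling of mRNA into labeling of target fragments at the 5' end of the mRNA, enabling 5'-end and/or full-length sequencing of the mRNA. The method is simple and convenient, is compatible with the library construction steps of many existing sequencing platforms, and facilitates high-throughput 5'-end and/or full-length sequencing of single-cell mRNA.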
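How circularization brings the cDNA tag next to the 5' end can be sketched with plain strings. This is a simplified model under stated assumptions, not the authors' analysis code: the transcript and the barcode portion are hypothetical placeholders, the strand is the poly(A)-containing second strand, and amplification is modeled as reading the circle from the poly(A) primer site across the ligation junction up to a random priming site.

```python
# Simplified string model of tag relocation by circularization.
POLY_A = "A" * 30                              # poly(A) primer site (SEQ ID NO: 6)
TSO_PRIMER = "AAGCAGTGGTATCAACGCAGAGTACAT"     # SEQ ID NO: 3
TRANSCRIPT = "ATGGCCTTGACCGTAAGCTTGGCATCCGATCGTAGGCTAACG"  # hypothetical mRNA body
BARCODE_PART = "GTCAGTCAGTACGATCGATCCAGAACGACATGGCTACG"    # hypothetical UMI', barcode', adapter'


def linear_strand() -> str:
    """Second strand 5'->3': TSO primer, GGG, transcript, poly(A), barcode part."""
    return TSO_PRIMER + "GGG" + TRANSCRIPT + POLY_A + BARCODE_PART


def amplicon_after_circularization(strand: str, random_site: int) -> str:
    """Read the circle from the poly(A) site, across the junction, and stop
    random_site bases into the transcript (where a random primer annealed)."""
    tag_start = strand.index(POLY_A)
    rotated = strand[tag_start:] + strand[:tag_start]   # circle opened at the tag
    tag_len = len(strand) - tag_start
    five_prime_len = len(TSO_PRIMER) + 3 + random_site
    return rotated[: tag_len + five_prime_len]


product = amplicon_after_circularization(linear_strand(), random_site=15)
print(product)
```

In the linear strand the tag sits after the transcript's 3' end; in `product` the same tag is directly followed by the TSO sequence and the first 15 bases of the transcript, which is the sense in which 3'-end labeling is converted into 5'-end labeling.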
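In a preferred embodiment, the single-stranded circular cDNA is amplified with a combination of the cDNA tag primer and at least one of the following primers, to obtain amplified fragments that meet different needs: (a) random primers; (b) TCR gene primers; (c) BCR gene primers. Preferably, the cDNA tag primer is a poly(A) primer, more preferably SEQ ID NO: 6. As noted above, for immune repertoire studies, primers for the TCR and/or BCR genes are used to obtain the expression profile of immune-related genes.
The step of constructing a fragmented library from the amplified fragments to obtain the single-cell RNA sequencing library may follow a conventional fragmentation library workflow. In a preferred embodiment of the present application, this step comprises adding library adapters to the amplified fragments to obtain the RNA sequencing library. The library adapters and the procedure for adding them are chosen according to the sequencing platform. In a preferred embodiment of the present application, this step comprises: enzymatically fragmenting the amplified fragments to obtain enzymatically cleaved fragments; and subjecting these fragments sequentially to end repair with A-tailing and to adapter ligation to obtain the single-cell RNA sequencing library. More preferably, after end repair with A-tailing and adapter ligation, the ligated fragments are further amplified to obtain an RNA sequencing library that meets the requirements for loading onto the sequencer.
In the adapter ligation step, adapters suited to the particular sequencing platform may be selected; for example, they may be adapters for the MGI sequencing platform or adapters for the Illumina sequencing platform. Correspondingly, the primers used to amplify the adapter-ligated fragments match the adapter sequences of that platform; for example, if MGI adapters are used, the ligated fragments are amplified with MGI amplification primers.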
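In a second typical embodiment of the present application, a single-cell RNA library construction kit is also provided, comprising: a circularization helper sequence, a DNA ligase, a cDNA tag primer, and at least one of the following primer sequences: (a) random primers; (b) TCR gene primers; (c) BCR gene primers.
The kit is designed mainly around the reagents used in the circularization step of the above library construction method and in the step of amplifying, from the single-stranded circular DNA, specific fragments at the 5' end of a target gene or random fragments starting at any position from the 5' end; including these reagents allows the library to be constructed quickly and conveniently. The circularization helper sequence pairs with the TSO adapter added at the end corresponding to the 5' end of the mRNA and with the tag sequence at the end corresponding to the 3' end of the mRNA; its specific sequence therefore depends on the specific sequences of the tag and of the TSO adapter.
The DNA ligase in the kit joins a nucleotide carrying a 5' phosphate to a nucleotide carrying a 3' hydroxyl group in a DNA strand, so any DNA ligase capable of ligating DNA is suitable for the present application. Specifically, it may be a thermolabile DNA ligase, such as T4 DNA ligase, or a thermostable DNA ligase.
To facilitate library construction, in a preferred embodiment the kit further comprises a solid support carrying a support tag sequence, wherein the cDNA tag primer is complementary to at least part of the support tag sequence; preferably the solid support is a bead and the support tag sequence is a bead tag sequence. A solid support carrying a support tag sequence conveniently captures single-cell mRNA while giving the mRNA a tag sequence complementary to the support tag sequence.
In the kit, beads carrying the tag sequence (such as gel beads) may be purchased from existing products or prepared in-house. The tag sequence on each bead comprises the following DNA segments: (1) a PCR adapter, used for PCR amplification; (2) a cell barcode, with one bead corresponding to one cell barcode; (3) a unique molecular identifier (UMI), which labels different template molecules within the same cell and is used to quantify transcript abundance; (4) a capture sequence, usually poly(dT), which captures mRNA by binding to its poly(A) tail.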
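The roles of the cell barcode and the UMI in quantification can be illustrated with a short Python sketch. It is a generic illustration, not the pipeline used in the application: the read tuples are hypothetical, and reads are assumed to have been assigned to a gene already.

```python
# Generic UMI counting sketch: reads sharing cell barcode, gene and UMI are
# treated as PCR copies of one original mRNA molecule and counted once.
from collections import defaultdict


def count_molecules(reads):
    """reads: iterable of (cell_barcode, umi, gene) tuples.
    Returns {(cell_barcode, gene): number of distinct UMIs}."""
    umis = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        umis[(cell_barcode, gene)].add(umi)
    return {key: len(values) for key, values in umis.items()}


# Hypothetical reads: two PCR duplicates of one molecule, plus two others.
example_reads = [
    ("ACGTACGTAC", "GATTACAGAT", "GENE_A"),
    ("ACGTACGTAC", "GATTACAGAT", "GENE_A"),  # duplicate: same barcode + UMI
    ("ACGTACGTAC", "TTGCAAGGCC", "GENE_A"),
    ("TTTTGGGGCC", "GATTACAGAT", "GENE_A"),  # same UMI, different cell
]

counts = count_molecules(example_reads)
print(counts)
```

Counting distinct UMIs rather than reads keeps amplification bias from inflating the abundance estimate for a transcript.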
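To further improve the convenience of library construction, the kit preferably also comprises an RNA extraction reagent and/or an RNA reverse transcription reagent; the RNA reverse transcription reagent comprises a reverse transcriptase having terminal transferase activity (for example, Alpha reverse transcriptase from MGI, SuperScript™ II reverse transcriptase from Invitrogen, or Maxima H Minus or SuperScript IV from Thermo).
Reagents for droplet-based mRNA capture and reverse transcription may be used together. After poly(dT) captures mRNA by binding to its poly(A) tail, first-strand cDNA is synthesized by the reverse transcriptase. To facilitate second-strand synthesis, the reverse transcription reagent usually includes, in addition to the reverse transcriptase, a TSO adapter (for example, SEQ ID NO: 1). For example, a reverse transcriptase having terminal transferase activity adds CCC to the end of the first cDNA strand; the rGrG+G of the TSO adapter then pairs with the CCC and, using the TSO adapter sequence as template, the complement of the TSO adapter is added after the CCC (which is equivalent to attaching the TSO complement to the end of the first cDNA strand).
To obtain sequence fragments of the mRNA running from any position from the 5' end to the 3' end, so that multiple fragments of different lengths together cover the full length of the mRNA and allow the 5' end or the full length to be sequenced, in a preferred embodiment of the present application the kit further comprises a TSO primer, a TSO-random primer and an adapter amplification primer. Preferably, the sequence of the adapter amplification primer is SEQ ID NO: 2, the sequence of the TSO primer is SEQ ID NO: 3 and the sequence of the TSO-random primer is SEQ ID NO: 4. The adapter amplification primer binds to the cDNA tag sequence at the 3' end of the second cDNA strand (corresponding to the 3' end of the mRNA), whereas the TSO-random primer binds at any position toward the 3' end of the first-strand cDNA (corresponding to the 5' end of the mRNA), so that cDNA fragments of different lengths are obtained.
Preferably, the circularization helper sequence is SEQ ID NO: 5; preferably, the cDNA tag primer is a poly(A) primer, more preferably SEQ ID NO: 6.
In some preferred embodiments, the kit further comprises at least one of an exonuclease and a library adapter. The exonuclease degrades single-stranded or double-stranded cDNA that failed to circularize after the double-stranded cDNA was denatured into single strands and circularized; it may be, for example, exonuclease I or exonuclease III. The library adapter may be an adapter for the MGI sequencing platform (for example, a double-stranded adapter consisting of one linear strand and one bubble strand, in which the linear strand is A + a 31 bp sequence + a 10 bp index sequence + 17 bp, and the bubble strand comprises a 17 bp bubble sequence, the 13 bp preceding the bubble sequence and the 7 bp + T following it, for a total adapter length of 97 bp). It may also be an adapter for another sequencing platform, such as the Illumina platform (for example, Y-shaped P5 and P7 adapters, one or both of which carry a library index sequence as needed, so that the output data of pooled libraries can later be demultiplexed).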
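The stated total length of the example MGI adapter can be checked by adding up the segment lengths given above. The sketch below only restates that arithmetic; the segment names are descriptive labels chosen here, not official part names.

```python
# Arithmetic check of the adapter lengths quoted in the text (in nucleotides).
LINEAR_STRAND = {
    "A": 1,
    "31 bp sequence": 31,
    "10 bp index": 10,
    "17 bp segment": 17,
}
BUBBLE_STRAND = {
    "13 bp before bubble": 13,
    "17 bp bubble": 17,
    "7 bp after bubble": 7,
    "T": 1,
}

linear_length = sum(LINEAR_STRAND.values())   # 1 + 31 + 10 + 17
bubble_length = sum(BUBBLE_STRAND.values())   # 13 + 17 + 7 + 1
total_length = linear_length + bubble_length

print(linear_length, bubble_length, total_length)
```

The two strands come to 59 and 38 nucleotides, which together give the 97 quoted for the whole adapter.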
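In a third typical embodiment of the present application, an RNA sequencing method is also provided, comprising: constructing an RNA sequencing library by any of the above RNA sequencing library construction methods, and sequencing the RNA sequencing library. Depending on the research purpose, an RNA sequencing library constructed by the above method may cover more fragments from the 5' end of the mRNA or fragments spanning the full length of the mRNA, and can therefore meet the current market demand for 5'-end sequencing, such as the 5'-end sequencing needed when constructing immune peptide libraries. Sequencing a full-length RNA sequencing library supports the study of structural variation arising from alternative splicing of certain transcripts.
It should be noted that 5'-end RNA sequencing and full-length sequencing may be performed together or separately, depending on the application.
The technical effects of the present application are further illustrated below with specific examples.
The following examples include cell suspension preparation, bead preparation, droplet generation, demulsification, reverse transcription (RT), cDNA amplification, circularization ligation, amplification of the circularized product, enzymatic fragmentation library construction, and high-throughput sequencing.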
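Example 1: Full-length RNA sequencing
This example was carried out according to the workflow shown in Figure 2, as follows:
1. Single-cell suspension preparation
1.1 For cell lines and solid tissues, prepare a single-cell/nuclei suspension by an appropriate digestion or grinding method, wash 1-2 times with PBS (containing 0.04% BSA), and filter through a 40 μm cell strainer.
1.2 Measure the cell/nuclei concentration using a counting chamber or a cell counter.
1.3 Based on the cell concentration, take 100,000 cells/nuclei, centrifuge at 300-500 g and 4 °C for 5 min to collect the pellet, and resuspend the cells/nuclei in 100 μL Cell Resuspension Buffer (0.04% BSA in PBS).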
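The volume of suspension to pipette in step 1.3 follows directly from the measured concentration. A minimal helper, with a hypothetical function name and an example concentration chosen only for illustration:

```python
# Volume of suspension that contains the target number of cells or nuclei.
def volume_for_cells(concentration_per_ul: float, target_cells: int = 100_000) -> float:
    """Return the volume in microliters holding target_cells at the measured
    concentration (cells per microliter)."""
    if concentration_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_cells / concentration_per_ul


# Example: a count of 2,000 cells/uL (hypothetical measurement).
volume_ul = volume_for_cells(2_000)
print(f"pipette {volume_ul:.1f} uL, pellet, resuspend in 100 uL buffer")

# After resuspension in 100 uL, the loading concentration is:
loading_concentration = 100_000 / 100  # cells per microliter
```

With 100,000 cells resuspended in 100 μL, the suspension loaded onto the chip is at 1,000 cells/μL.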
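2. Bead preparation
2.1 Transfer 200 μL (220,000) magnetic beads to a 0.2 mL PCR tube, place on a magnetic rack for 2 min, and remove the supernatant.
2.2 Remove the PCR tube from the magnetic rack, add 200 μL 1x Buffer D (1 mM EDTA, 9 mg/mL 85% KOH) to resuspend the beads, and incubate at room temperature for 5 min.
2.3 Place on the magnetic rack for 2 min and remove the supernatant.
2.4 Keeping the PCR tube on the magnetic rack, add 200 μL 1x Buffer D, let stand for 30 s, and remove the supernatant.
2.5 Add 200 μL LSWB (50 mM Tris-HCl, 150 mM NaCl, 0.05% Tween-20), let stand for 30 s, then remove the supernatant. Repeat this step once.
2.6 Add 200 μL Lysis Buffer (6% Ficoll PM-400, 0.2% Sarkosyl (given as 沙丁胺醇 in the source text), 20 mM EDTA, 200 mM Tris pH 7.5, H2O), let stand for 30 s, then remove the supernatant. Remove the PCR tube from the magnetic rack and add 100 μL Lysis Buffer and 5 μL 1 M DTT.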
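3. Droplet generation
3.1 Peel the protective film off the chip (see Figure 1) and place the chip in the chip slot of the droplet generation device (10x Genomics).
3.2 Insert end A of the connecting tube on the collection cap (the tube that reaches the bottom of the collection tube) into the outlet hole of the chip.
3.3 Place a 50 mL syringe in the holder and set the plunger to its initial position. Use a blunt needle to connect the syringe to end B of the connecting tube on the collection cap (the tube that does not reach the bottom of the collection tube).
3.4 Add 200 μL droplet generation oil to the collection tube, tighten the collection cap, and place the collection tube upright in the holder.
3.5 Gently pipette to mix the cells, then add 100 μL of the cell suspension prepared in step 1.3 to the cells well of the chip, making sure the tip touches the bottom of the well.
3.6 Gently pipette to mix the beads, then add 100 μL beads to the beads well of the chip, making sure the tip touches the bottom of the well.
3.7 Immediately add 350 μL droplet generation oil to the oil well of the chip.
3.8 Quickly pull the syringe plunger to the notch position and lock it in the notch.
3.9 Start a timer for 20 min and collect the droplets.
3.10 After 20 min, immediately loosen the collection cap on the collection tube, pull the connecting tube out of the chip outlet, hold the tube vertically so that the droplets in it drain into the collection tube, then fit a regular collection tube cap.
3.11 Let stand at room temperature for 20 min so that the mRNA molecules bind fully to the beads.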
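4. Demulsification
4.1 Prepare the demulsification reagent: in a 15 mL centrifuge tube, add 10 mL 6x SSC (20x SSC, Invitrogen, diluted to 6x with nuclease-free water) and 200 μL PFO (perfluorooctanol, Sigma, 370533-25G).
4.2 Connect the filtration device to the vacuum pump, set the pressure to 0.01 MPa or 100 mbar, and start the pump.
4.3 Add 20 mL 6x SSC to pre-treat the device.
4.4 Once no liquid remains on the filter membrane, pour all the liquid from the collection tube evenly onto the membrane surface, rinse the collection tube twice with 2 mL 6x SSC, and pour the rinses into the filtration device as well.
4.5 Mix 10 mL demulsification reagent by vigorous inversion and pour it slowly, in portions, into the filtration device.
4.6 Once no liquid remains on the membrane, add 30 mL 6x SSC continuously, in portions, to wash the beads.
4.7 Once no liquid remains on the membrane, switch off the vacuum pump and disconnect it from the filtration device.
4.8 Seal the filtration port of the device with a syringe or a rubber stopper.
4.9 Add 1.0 mL collection buffer with a pipette and gently pipette across the whole membrane surface about 20 times to resuspend the beads.
4.10 Transfer the bead-containing collection buffer to a 1.5 mL low-binding centrifuge tube.
4.11 With another 1.0 mL collection buffer, gently pipette across the whole membrane surface about 10 times to resuspend the remaining beads.
4.12 Transfer the bead-containing collection buffer to a 1.5 mL low-binding centrifuge tube, place on a magnetic rack, let stand for 2 min, and slowly remove the supernatant.
4.13 Remove the centrifuge tubes from the magnetic rack, use 100 μL collection buffer to resuspend, in turn, the beads adsorbed to the side of the two tubes, and transfer the liquid to a 0.2 mL low-binding PCR tube.
4.14 With another 100 μL collection buffer, again resuspend, in turn, the beads adsorbed to the side of the two tubes, and transfer the liquid to the same 0.2 mL low-binding PCR tube.
4.15 Place the PCR tube containing the beads on the magnetic rack, let stand for 2 min, and remove the supernatant.
4.16 Keeping the beads adsorbed, add 200 μL 6x SSC, let stand for 30 s, and remove the supernatant.
4.17 Add 200 μL 5x FS Buffer (MGI, 01E022MS), let stand for 30 s, and slowly remove the supernatant without aspirating any beads.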
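5. Reverse transcription
5.1 Prepare the reverse transcription reaction mix on ice: 5 μL H2O, 20 μL 5x FS Buffer (First-Strand Buffer), 20 μL 5 M betaine, 10 μL 10 mM dNTPs, 7.5 μL 100 mM MgCl2, 5 μL 50 μM template switch oligo (TSO adapter), 5 μL 100 mM DTT, 5 μL 200 U/μL SuperScript™ II reverse transcriptase (Invitrogen, 18064014), 2.5 μL 40 U/μL RNase inhibitor.
TSO adapter sequence: SEQ ID NO: 1: 5'-AAGCAGTGGTATCAACGCAGAGTACATrGrG+G-3', where +G denotes a locked nucleic acid; rGrG+G is used because RNA-DNA hybrids have better thermal stability.
5.2 Add 100 μL of the reverse transcription reaction mix to the PCR tube containing the beads from step 4.17 and mix by pipetting.
5.3 Run the reverse transcription under the following conditions: 42 °C, 90 min; 10 cycles of (50 °C, 2 min; 42 °C, 2 min); heated lid set to 75 °C. Because the beads settle, flick the tube to resuspend the beads every 20 min, spin briefly, and continue the reaction.
5.4 After the reaction, spin briefly, place on the magnetic rack, let stand for 2 min, and remove the reaction liquid.
5.5 Remove the PCR tube from the magnetic rack, add 200 μL TE-SDS (TE Buffer + 0.5% SDS), and vortex to mix and stop the reaction.
5.6 Spin briefly, place on the magnetic rack, let stand for 2 min, and remove the liquid.
5.7 Keeping the beads adsorbed, add 200 μL TE-TW (TE Buffer + 0.01% Tween-20), let stand for 30 s, and remove the supernatant.
5.8 Repeat the previous step.
5.9 Keeping the beads adsorbed, add 200 μL 10 mM NF-H2O, let stand for 30 s, and remove the supernatant.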
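The final concentrations in the reverse transcription mix of step 5.1 can be worked out from the stock concentrations and volumes. The sketch below is a convenience calculation, not part of the protocol. The volumes listed in step 5.1 add up to 80 μL while step 5.2 pipettes 100 μL, so the final volume is passed in as a parameter; treating 100 μL as the final volume is an assumption.

```python
# Final concentrations for the RT mix in step 5.1: C_final = C_stock * V / V_total.
# Tuples are (component, volume in uL, stock concentration, unit).
RT_MIX = [
    ("H2O", 5.0, None, None),
    ("FS Buffer", 20.0, 5.0, "x"),
    ("betaine", 20.0, 5.0, "M"),
    ("dNTPs", 10.0, 10.0, "mM"),
    ("MgCl2", 7.5, 100.0, "mM"),
    ("TSO adapter", 5.0, 50.0, "uM"),
    ("DTT", 5.0, 100.0, "mM"),
    ("SuperScript II", 5.0, 200.0, "U/uL"),
    ("RNase inhibitor", 2.5, 40.0, "U/uL"),
]


def listed_volume(mix) -> float:
    """Sum of the component volumes as listed."""
    return sum(volume for _, volume, _, _ in mix)


def final_concentrations(mix, final_volume_ul: float) -> dict:
    """Map component -> (final concentration, unit) for a given final volume."""
    return {
        name: (stock * volume / final_volume_ul, unit)
        for name, volume, stock, unit in mix
        if stock is not None
    }


print(listed_volume(RT_MIX))
for name, (value, unit) in final_concentrations(RT_MIX, 100.0).items():
    print(f"{name}: {value:g} {unit}")
```

At an assumed final volume of 100 μL the buffer comes out at 1x, betaine at 1 M and the TSO adapter at 2.5 μM.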
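6. Random-primed amplification of first-strand cDNA
6.1 Prepare the PCR mix: 42 μL H2O, 4 μL 10 μM Tn primer (the adapter amplification primer), 2 μL 20 μM TSO-random primer, 2 μL 20 μM TSO primer, and 50 μL 2x KAPA HiFi HotStart ReadyMix (KAPA, KK2602).
The Tn primer (the adapter amplification primer) primes amplification from the end attached to the bead; its sequence is:
SEQ ID NO: 2: 5'-CGTAGCCATGTCGTTCTG-3';
The TSO primer primes amplification from the TSO adapter end; its sequence is:
SEQ ID NO: 3: 5'-Phos-AAGCAGTGGTATCAACGCAGAGTACAT-3';
The TSO-random primer primes amplification from any position from the 5' end of the cDNA toward the 3' end; its sequence is:
SEQ ID NO: 4: 5'-Phos-AAGCAGTGGTATCAACGCAGAGTACATNNNNNN-3'.
6.2 Run the PCR under the following conditions: 95 °C, 3 min; 10-15 cycles of (98 °C, 20 s; 58 °C, 20 s; 72 °C, 3 min); 72 °C, 5 min; hold at 4 °C.
6.3 After the PCR, purify and recover the PCR product with 120 μL (1.2x) VAHTS™ DNA Clean Beads (Vazyme, N411-03), equilibrated at room temperature for 30 min beforehand.
6.4 Quantify the purified PCR product with a Qubit fluorometer and check the fragment distribution on an Agilent 2100 Bioanalyzer (see Figures 4A and 4B for the cell line sample and the solid tissue sample, respectively).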
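7. DNA circularization
7.1 Take 100-200 ng of the above cDNA product, bring the volume to 45 μL with NF-H2O, add 5 μL splint oligo 1 (20 μM), vortex briefly to mix, spin for 5 s, incubate at 95 °C for 3 min (heated lid 105 °C) to denature the dsDNA into single strands for single-strand circularization, then place immediately on ice for 5-10 min.
The sequence of splint oligo 1 (the circularization helper sequence) is:
SEQ ID NO: 5: 5'-TACCACTGCTT CGTAGCCATGT-3'.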
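The design of splint oligo 1 can be reproduced from the primer sequences listed above. The sketch assumes that the strand being circularized starts at its 5' end with the TSO primer sequence (SEQ ID NO: 3) and ends at its 3' end with the reverse complement of the Tn primer (SEQ ID NO: 2), and that the splint pairs with 11 nt on each side of the ligation junction; `design_splint` is an illustrative helper, not a tool from the application.

```python
# Derive a splint (circularization helper) from the two ends of a strand.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_splint(five_prime_end: str, three_prime_end: str, arm: int = 11) -> str:
    """Splint complementary to the junction formed when the strand's 3' end
    is ligated to its own 5' end (arm nucleotides on each side)."""
    junction = three_prime_end[-arm:] + five_prime_end[:arm]
    return reverse_complement(junction)


TN_PRIMER = "CGTAGCCATGTCGTTCTG"            # SEQ ID NO: 2
TSO_PRIMER = "AAGCAGTGGTATCAACGCAGAGTACAT"   # SEQ ID NO: 3
SPLINT_OLIGO_1 = "TACCACTGCTTCGTAGCCATGT"    # SEQ ID NO: 5 without the spacer

strand_5_prime = TSO_PRIMER                      # strand begins with the TSO primer
strand_3_prime = reverse_complement(TN_PRIMER)   # strand ends with Tn primer complement

splint = design_splint(strand_5_prime, strand_3_prime)
print(splint)
```

Under these assumptions the derived splint equals SEQ ID NO: 5: its first 11 nt pair with the start of the TSO primer sequence and its last 11 nt match the first 11 nt of the Tn primer.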
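7.2 Ligation:
Place the above PCR tube in an ice bath and prepare the reaction mix according to the table below.
Table 1
Figure PCTCN2021088984-appb-000001
Add the prepared ligation reaction mix to the denatured product, vortex briefly to mix, spin briefly, place the PCR tube on a thermal cycler and incubate at 37 °C for 45 min with the heated lid at 75 °C.
7.3 Enzymatic digestion:
Shortly before the single-strand circularization reaction finishes, prepare the digestion reaction mix on ice according to the table below.
Table 2
Figure PCTCN2021088984-appb-000002
Pipette 4 μL of the prepared digestion reaction mix (which digests uncircularized single strands and any double strands that were not denatured) into the single-strand circularization product, vortex briefly to mix, spin, place the PCR tube on a thermal cycler and incubate at 37 °C for 30 min with the heated lid at 75 °C.
7.4 Stopping the digestion: after the digestion, add 3 μL stop solution (0.1 M EDTA) to the PCR tube, mix, and spin briefly to collect the liquid at the bottom of the tube.
7.5 Purification of the circularized cDNA library: purify the circularized product from step 7.4 with PEG32 magnetic beads and quantify the purified circularized cDNA with a Qubit fluorometer for use in the subsequent amplification.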
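8. Random-primed PCR amplification of cDNA
8.1 Take 10-50 ng of circularized DNA product as input, bring the volume to 42 μL with NF-H2O, and add 4 μL 20 μM poly(A) primer, 4 μL random primer (SEQ ID NO: 12: 5'-NNNNNN-3') and 50 μL 2x KAPA HiFi HotStart ReadyMix. These primers amplify fragments of different lengths from circularized products of different sizes, covering the full-length cDNA sequence as far as possible.
The sequence of the poly(A) primer is:
SEQ ID NO: 6: 5'-Phos-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA-3'.
8.2 Run the PCR under the following conditions: 95 °C, 3 min; 10-15 cycles of (98 °C, 20 s; 60 °C, 20 s; 72 °C, 30 s); 72 °C, 5 min; hold at 4 °C.
8.3 Purify the PCR product with 0.5x + 0.7x VAHTS™ DNA Clean Beads; quantify the size-selected fragments (150-800 bp) with Qubit and check the fragment distribution on an Agilent 2100 Bioanalyzer (see Figures 6A and 6B for the cell line sample and the solid tissue sample, respectively).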
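Bead volumes for a two-step size selection such as the 0.5x + 0.7x cleanup in step 8.3 can be computed with a small helper. The interpretation used here is an assumption: both ratios are taken relative to the starting sample volume, the first addition removes large fragments (the supernatant is kept), and the second addition is the second ratio times the starting volume. Other bead protocols define the second ratio differently, so check the manufacturer's manual before use.

```python
# Bead volumes for a double-sided size selection written as "first x + second x".
def double_sided_bead_volumes(sample_ul: float, first_ratio: float, second_ratio: float):
    """Return (first_addition_ul, second_addition_ul), both computed from the
    starting sample volume (assumed interpretation of the ratio notation)."""
    if sample_ul <= 0 or first_ratio <= 0 or second_ratio <= 0:
        raise ValueError("volume and ratios must be positive")
    first = round(sample_ul * first_ratio, 1)
    second = round(sample_ul * second_ratio, 1)
    return first, second


# Step 8.3: 100 uL PCR reaction, 0.5x + 0.7x.
first_ul, second_ul = double_sided_bead_volumes(100.0, 0.5, 0.7)
print(first_ul, second_ul)
```

For the 100 μL reaction of step 8.1 this gives 50 μL for the first addition and 70 μL for the second.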
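9. cDNA library construction
9.1 DNA fragmentation:
Based on the cDNA concentration obtained in step 8.3, transfer 100-200 ng (about 0.1-0.2 pmol) of the cDNA to be fragmented to a new 0.2 mL PCR tube; the volume should be ≤16 μL, made up to 16 μL with H2O if it is less. Prepare the fragmentation reaction mix on ice according to the table below.
Table 3
Figure PCTCN2021088984-appb-000003
Place the PCR tube on a thermal cycler with the heated lid set to 75 °C and incubate at 37 °C for 10 min; after the reaction, add 30 μL 0.1 M EDTA to the PCR tube and vortex to mix and stop the reaction.
9.2 Purification of the fragmented product: purify and size-select the fragmented DNA with 0.6x + 0.2x VAHTS™ DNA Clean Beads (retaining fragments of 300-500 bp) and quantify the concentration with a Qubit fluorometer.
9.3 Prepare the end-repair reaction mix on ice according to the table below:
Table 4
Figure PCTCN2021088984-appb-000004
Pipette 10 μL of the prepared end-repair reaction mix into the purified fragmented product from step 9.2, vortex briefly to mix, spin, and place the PCR tube on a thermal cycler: 37 °C, 30 min; 65 °C, 15 min; hold at 4 °C.
9.4 Library adapter ligation
Prepare the adapter ligation reaction mix on ice according to the table below:
Table 5
Figure PCTCN2021088984-appb-000005
The sequences of the library adapter are: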
SEQ ID NO:7:5’-Phos-AGTCGGAGGCCAAGCGGTCTTAGGAAGACAA-3’;
SEQ ID NO:8:3’-TTCAGCCTCCGGT-5’。
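Slowly pipette 30 μL of the prepared adapter ligation reaction mix into the end-repair product, vortex to mix, spin to collect the liquid at the bottom of the tube, and place the PCR tube on a thermal cycler: 23 °C, 30 min; hold at 4 °C.
9.5 Purification of the ligation product: purify with 1.0x VAHTS™ DNA Clean Beads and measure the concentration of the ligation product with a Qubit fluorometer.
9.6 PCR amplification of the adapter-ligated product
The sequences of the primers used to amplify the adapter-ligated product are: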
FP:5’-phos-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA-3’(SEQ ID NO:9);
RP:5’- TGTGAGCCAAGGAGTTGNNNNNNNNNNTTGTCTTCCTAAGACCGCT-3’(SEQ ID NO:10);
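NNNNNNNNNN is the index sequence, where N is any one of A/T/C/G; it is used to distinguish different libraries. Prepare the PCR reaction mix in a centrifuge tube according to the table below.
Table 6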
Figure PCTCN2021088984-appb-000006
Figure PCTCN2021088984-appb-000007
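Pipette 54 μL of the prepared PCR reaction mix into the purified ligation product, vortex to mix, spin to collect the liquid at the bottom of the tube, and amplify by PCR: 95 °C, 3 min; 10-15 cycles of (98 °C, 20 s; 60 °C, 20 s; 72 °C, 30 s); 72 °C, 5 min; hold at 4 °C.
9.7 Size selection of the PCR product: purify with (0.6x + 0.6x) VAHTS™ DNA Clean Beads and quantify the product with Qubit.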
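The 10-nt index in the RP primer (SEQ ID NO: 10) is what distinguishes pooled libraries. The sketch below shows one way such an index could be recovered from a sequence that begins with the RP primer. It is an illustration only: real reads may present the index elsewhere or in reverse complement depending on the platform and read layout, and `extract_index` and the example index are hypothetical.

```python
# Recover the 10-nt library index embedded in the RP primer (SEQ ID NO: 10).
import re
from typing import Optional

RP_PREFIX = "TGTGAGCCAAGGAGTTG"       # RP sequence before the index
RP_SUFFIX = "TTGTCTTCCTAAGACCGCT"     # RP sequence after the index
INDEX_LENGTH = 10

RP_PATTERN = re.compile(
    "^" + RP_PREFIX + "([ACGT]{" + str(INDEX_LENGTH) + "})" + RP_SUFFIX
)


def extract_index(sequence: str) -> Optional[str]:
    """Return the index if the sequence starts with the RP primer, else None."""
    match = RP_PATTERN.match(sequence)
    return match.group(1) if match else None


# Number of distinct indexes an N10 design can encode in principle.
possible_indexes = 4 ** INDEX_LENGTH

example = RP_PREFIX + "ACGTACGTAC" + RP_SUFFIX + "GGGATTC"  # hypothetical sequence
print(extract_index(example), possible_indexes)
```

In practice only a subset of the possible indexes is used, chosen so that sequencing errors do not turn one index into another.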
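10. High-throughput sequencing
10.1 Denaturation: take 200-400 ng of the above cDNA product, bring the volume to 47 μL with NF-H2O, add 3 μL splint oligo 2 (20 μM), vortex briefly to mix, spin for 5 s, incubate at 95 °C for 3 min (heated lid 105 °C), then place immediately on ice for 5-10 min.
(splint oligo 2: 5'-TTTTTTTTTTT TGTGAGCCAAG-3') (SEQ ID NO: 11; the underlined portion is identical to the first 11 nucleotides of the RP sequence of SEQ ID NO: 10).
10.2 Ligation:
Place the above PCR tube in an ice bath and prepare the reaction mix according to the table below.
Table 7
Figure PCTCN2021088984-appb-000008
Add the prepared ligation reaction mix to the denatured product, vortex briefly to mix, spin briefly, place the PCR tube on a thermal cycler and incubate at 37 °C for 30 min with the heated lid at 75 °C.
10.3 Enzymatic digestion:
Shortly before the single-strand circularization reaction finishes, prepare the digestion reaction mix on ice according to the table below.
Table 8
Figure PCTCN2021088984-appb-000009
Pipette 4 μL of the prepared digestion reaction mix into the single-strand circularization product, vortex briefly to mix, spin, place the PCR tube on a thermal cycler and incubate at 37 °C for 30 min with the heated lid at 75 °C.
10.4 Stopping the digestion: after the digestion, add 3 μL stop solution (0.1 M EDTA) to the PCR tube, vortex to mix, and spin briefly to collect the liquid at the bottom of the tube.
10.5 Purification of the circularized library: purify the circularized product with PEG32 magnetic beads and quantify the purified product with a Qubit fluorometer; the library concentration must exceed 0.5 ng/μL. Libraries that pass this check are sequenced on an MGISEQ-2000 high-throughput sequencer.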
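Example 2: 5'-end single-cell RNA sequencing
This example was carried out according to the workflow shown in Figure 2, as follows:
(1) Single-cell mRNA capture and first-strand cDNA synthesis
The steps are the same as those for full-length RNA sequencing above.
(2) cDNA amplification, as follows:
1. Prepare the PCR mix: 42 μL H2O, 4 μL 10 μM Tn primer (the aforementioned adapter amplification primer, SEQ ID NO: 2), 4 μL TSO primer (the aforementioned SEQ ID NO: 4), and 50 μL 2x KAPA HiFi HotStart ReadyMix.
2. Run the PCR under the following conditions: 95 °C, 3 min; 13-20 cycles of (98 °C, 20 s; 58 °C, 20 s; 72 °C, 3 min); 72 °C, 5 min; hold at 4 °C.
3. After the PCR, purify and recover the PCR product with 60 μL (0.6x) VAHTS™ DNA Clean Beads (equilibrated at room temperature for 30 min beforehand).
4. Quantify the purified PCR product with a Qubit fluorometer and check the fragment distribution on an Agilent 2100 Bioanalyzer (see Figures 5A and 5B for the cell line sample and the solid tissue sample, respectively).
(3) cDNA circularization
Circularization is the same as in the full-length RNA sequencing procedure above.
(4) 5' RNA amplification
1. Take 10-50 ng of circularized DNA product as input, bring the volume to 42 μL with NF-H2O, and add 4 μL 20 μM poly(A) primer (SEQ ID NO: 6: 5'-Phos-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA-3'), 4 μL 20 μM TCR/BCR primers (10x Genomics V(D)J sequences: PN-1000005, PN-1000016) or random primer (5'-NNNNNN-3'), and 50 μL 2x KAPA HiFi HotStart ReadyMix.
2. Run the PCR under the following conditions: 95 °C, 3 min; 13-20 cycles of (98 °C, 20 s; 60 °C, 20 s; 72 °C, 30 s); 72 °C, 5 min; hold at 4 °C.
3. Purify the PCR product with 0.5x + 1.0x VAHTS™ DNA Clean Beads; quantify the size-selected fragments with Qubit and check the fragment distribution on an Agilent 2100 Bioanalyzer (see Figures 7A and 7B for the cell line sample and the solid tissue sample, respectively; Figure 7A shows the 2100 result for amplification with TCR-specific primers and Figure 7B shows the 2100 result for amplification with BCR-specific primers).
(5) cDNA 5' library construction (the subsequent steps are the same as in Example 1 and are not repeated here).
Detection and verification:
The library constructed in Example 2 was sequenced on a sequencer of the MGI sequencing platform, and the coverage of transcripts by sequenced fragments from the 5' end and the 3' end was analyzed in the output data; the results are shown in Figure 8. The lighter gray corresponds to 5'-end coverage and the darker gray to 3'-end coverage. Figure 8 shows that the method of the invention effectively captures information from both the 5' end and the 3' end of transcripts.
The above are merely preferred embodiments of the invention and are not intended to limit it; those skilled in the art may make various modifications and changes to the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.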

Claims (43)

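  1. A method for constructing an RNA sequencing library, characterized in that the construction method comprises:
    obtaining single-stranded cDNA that is a reverse transcription product of mRNA, wherein the 3' end of the single-stranded cDNA contains a cDNA tag sequence;
    circularizing the single-stranded cDNA to obtain single-stranded circular cDNA;
    amplifying the single-stranded circular cDNA with a primer combination formed by a random primer or a gene-specific primer and a cDNA tag primer to obtain amplified fragments, wherein the cDNA tag primer is at least part of the cDNA tag sequence;
    performing fragmentation library construction on the amplified fragments to obtain the RNA sequencing library.
  2. The construction method according to claim 1, characterized in that obtaining the single-stranded cDNA that is a reverse transcription product of mRNA, the 3' end of the single-stranded cDNA containing a cDNA tag sequence, comprises:
    reverse transcribing the mRNA to obtain first-strand cDNA;
    amplifying the first-strand cDNA to obtain double-stranded cDNA, wherein the 3' end of the second-strand cDNA complementary to the first-strand cDNA contains the cDNA tag sequence, and the cDNA tag sequence contains poly(A);
    denaturing the double-stranded cDNA to obtain the single-stranded cDNA.
  3. The construction method according to claim 2, characterized in that the cDNA tag sequence contains, in the 3' to 5' direction, a second PCR adapter, a second cell barcode, a second unique molecular identifier and the poly(A).
  4. The construction method according to claim 1, characterized in that the mRNA is derived from a single-cell sample and the mRNA is single-cell mRNA.
  5. The construction method according to claim 4, characterized in that the single-cell mRNA is prepared by a droplet method so that the single-cell mRNA is attached to a solid support, the solid support preferably being a bead.
  6. The construction method according to claim 5, characterized in that preparing the single-cell mRNA by the droplet method so that the single-cell mRNA is attached to the bead comprises:
    providing a single-cell suspension and the beads separately, the beads carrying a bead tag sequence whose end contains poly(dT);
    encapsulating the single-cell suspension and the beads in droplets, each droplet containing one single cell and one bead, the bead binding through the poly(dT) to the poly(A) of the mRNA in the single-cell suspension, so that the mRNA in the single-cell suspension is attached to the bead to obtain the single-cell mRNA.
  7. The construction method according to claim 6, characterized in that the bead tag sequence contains, in the 5' to 3' direction, a first PCR adapter, a first cell barcode, a first unique molecular identifier and the poly(dT); correspondingly, the cDNA tag sequence contains, in the 3' to 5' direction, a second PCR adapter, a second cell barcode, a second unique molecular identifier and poly(A), wherein the second PCR adapter is complementary to the first PCR adapter, the second cell barcode is complementary to the first cell barcode, and the second unique molecular identifier is complementary to the first unique molecular identifier.
  8. The construction method according to claim 2, characterized in that the 5' end of the single-stranded cDNA contains the sequence of a TSO primer.
  9. The construction method according to claim 8, characterized in that the mRNA is reverse transcribed using a reverse transcriptase and a TSO adapter to obtain the first-strand cDNA, wherein the reverse transcriptase has terminal transferase activity and the 3' end of the first-strand cDNA contains the complement of the TSO adapter;
    the first-strand cDNA is amplified to obtain the second-strand cDNA, the 5' end of the second-strand cDNA containing the sequence of the TSO primer.
  10. The construction method according to claim 9, characterized in that the sequence of the TSO adapter is SEQ ID NO: 1.
  11. The construction method according to claim 9, characterized in that the reverse transcriptase is selected from Alpha reverse transcriptase from MGI, SuperScript™ II reverse transcriptase from Invitrogen, and SuperScript IV or Maxima H Minus from Thermo.
  12. The construction method according to claim 2, characterized in that the first-strand cDNA is subjected to random amplification and/or full-length amplification to obtain the double-stranded cDNA.
  13. The construction method according to claim 12, characterized in that the first-strand cDNA is amplified using an adapter amplification primer and a TSO primer to obtain the double-stranded cDNA; or
    the first-strand cDNA is amplified using an adapter amplification primer, a TSO-random primer and the TSO primer to obtain the double-stranded cDNA.
  14. The construction method according to claim 13, characterized in that the sequence of the adapter amplification primer is SEQ ID NO: 2, the sequence of the TSO primer is SEQ ID NO: 3, and the sequence of the TSO-random primer is SEQ ID NO: 4.
  15. The construction method according to any one of claims 1 to 14, characterized in that circularizing the single-stranded cDNA to obtain single-stranded circular cDNA comprises:
    ligating the single-stranded cDNA into a circle under the action of a circularization helper sequence and a ligase to obtain a ligation product;
    enzymatically digesting the ligation product to remove single-stranded cDNA that has not been ligated into a circle, to obtain the single-stranded circular cDNA;
    wherein the circularization helper sequence is complementary to the sequences at the two ends of the single-stranded cDNA.
  16. The construction method according to claim 15, characterized in that the circularization helper sequence is selected from SEQ ID NO: 5.
  17. The construction method according to claim 1, characterized in that the gene-specific primer is a TCR primer for amplifying TCR genes and/or a BCR primer for amplifying BCR genes.
  18. The construction method according to claim 1, characterized in that the cDNA tag primer is a poly(A) primer, preferably SEQ ID NO: 6.
  19. The construction method according to claim 1, characterized in that performing fragmentation library construction on the amplified fragments to obtain the RNA sequencing library comprises:
    adding library adapters to the amplified fragments to obtain the RNA sequencing library.
  20. The construction method according to claim 19, characterized in that the amplified fragments are enzymatically fragmented to obtain enzymatically cleaved fragments;
    the enzymatically cleaved fragments are sequentially subjected to end repair, A-tailing and library adapter ligation to obtain the RNA sequencing library.
  21. The construction method according to claim 19 or 20, characterized in that, after the library adapter ligation, the method further comprises PCR amplification of the library adapter ligation product to obtain the RNA sequencing library.
  22. The construction method according to claim 19, characterized in that the library adapter is an adapter for the MGI sequencing platform or an adapter for the Illumina sequencing platform.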
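  23. An RNA library construction kit, characterized in that the kit comprises: a circularization helper sequence, a DNA ligase, a cDNA tag primer, and at least one of the following primers: (a) random primers; (b) TCR primers; (c) BCR primers.
  24. The kit according to claim 23, characterized in that the kit further comprises an RNA reverse transcription reagent.
  25. The kit according to claim 24, characterized in that the RNA reverse transcription reagent comprises a reverse transcriptase, the reverse transcriptase being a reverse transcriptase having terminal transferase activity.
  26. The kit according to claim 25, characterized in that the reverse transcriptase is selected from Alpha reverse transcriptase from MGI, SuperScript™ II reverse transcriptase from Invitrogen, and SuperScript IV or Maxima H Minus from Thermo.
  27. The kit according to claim 25, characterized in that the RNA reverse transcription reagent further comprises a TSO adapter.
  28. The kit according to claim 27, characterized in that the sequence of the TSO adapter is SEQ ID NO: 1.
  29. The kit according to claim 23, characterized in that the kit further comprises a TSO primer and an adapter amplification primer.
  30. The kit according to claim 29, characterized in that the sequence of the adapter amplification primer is SEQ ID NO: 2 and the sequence of the TSO primer is SEQ ID NO: 3.
  31. The kit according to claim 29, characterized in that the kit further comprises a TSO-random primer.
  32. The kit according to claim 31, characterized in that the sequence of the TSO-random primer is SEQ ID NO: 4.
  33. The kit according to claim 23, characterized in that the circularization helper sequence is SEQ ID NO: 5.
  34. The kit according to claim 23, characterized in that the cDNA tag primer is a poly(A) primer.
  35. The kit according to claim 34, characterized in that the sequence of the cDNA tag primer is SEQ ID NO: 6.
  36. The kit according to claim 23, characterized in that the kit further comprises at least one of an exonuclease and a library adapter.
  37. The kit according to claim 36, characterized in that the exonuclease is selected from exonuclease I or exonuclease III.
  38. The kit according to claim 36, characterized in that the library adapter is an adapter for the MGI sequencing platform or an adapter for the Illumina sequencing platform.
  39. The kit according to claim 38, characterized in that the adapter for the MGI sequencing platform is selected from bubble adapters, and the adapter for the Illumina sequencing platform is selected from P5 and P7 adapters.
  40. The kit according to claim 23, characterized in that the DNA ligase is selected from T4 DNA ligase.
  41. The kit according to claim 23, characterized in that the kit further comprises a solid support on which a support tag sequence is provided, wherein the cDNA tag primer is complementary to at least part of the support tag sequence; preferably, the solid support is a bead and the support tag sequence is a bead tag sequence.
  42. The kit according to claim 41, characterized in that the support tag sequence comprises, in the 5' to 3' direction: a first PCR adapter, a first cell barcode, a first unique molecular identifier and poly(dT).
  43. A method for sequencing an RNA library, characterized in that the sequencing method comprises: constructing an RNA sequencing library by the RNA sequencing library construction method according to any one of claims 1 to 22, and sequencing the RNA sequencing library.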