WO2022021279A1 - Multi-nucleic acid co-labeling support, preparation method therefor, and application thereof - Google Patents

Info

Publication number
WO2022021279A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
cell
sequence
support
labeled
Prior art date
Application number
PCT/CN2020/106089
Other languages
French (fr)
Chinese (zh)
Inventor
焦少灼
韩金桓
李研
刘书杰
马兴勇
罗云超
桑国芹
谢莹莹
徐猛
李宗文
Original Assignee
北京寻因生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京寻因生物科技有限公司
Priority to PCT/CN2020/106089 (WO2022021279A1)
Priority to CN202080005408.1A (CN114096678A)
Publication of WO2022021279A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the present invention relates to a multi-nucleic acid co-labeled support and its preparation method and application.
  • nucleic acid molecule reactions including nucleic acid hybridization, extension, amplification and other reactions are carried out in liquid phase.
  • the liquid phase provides a uniform and stable environment for the nucleic acid and enzyme reactions involved in the reaction to maximize the output.
  • attaching nucleic acids or enzymes involved in the reaction to the surface of a solid phase can give the nucleic acid spatial position information, which facilitates purification, separation, detection and analysis. Therefore, more and more solid-phase nucleic acid reactions have been developed and used for nucleic acid sequence analysis and nucleic acid quantification, such as nucleic acid chip technology in which oligonucleotides are immobilized on a substrate in an orderly manner, bridge amplification and microsphere emulsion amplification technology for next-generation gene sequencing, and nucleic acid-encoded microbeads for high-throughput single-cell sequencing.
  • SNPs: single nucleotide polymorphisms
  • GWAS: genome-wide association studies
  • paternity testing
  • population identification
  • Traditional SNP identification technologies, including TaqMan fluorescence analysis, KASPar identification technology, and direct PCR sequencing technology, can analyze hundreds of loci simultaneously.
  • the advantage is that the experimental operation is flexible when the number of SNP loci to be detected is small, but the disadvantage is that a single sample is used.
  • the cost of building a library and sequencing a single sample is high; in particular, when the number of SNPs detected in a single sample is less than 1000, the average detection cost of a single SNP increases rapidly. When the number of SNPs to be detected in a single sample is 20-1000, multiplex PCR library construction combined with next-generation sequencing can effectively reduce the average detection cost of a single SNP.
  • Multiplex PCR library construction generally refers to adding multiple pairs of PCR primers to the PCR amplification system to amplify multiple target fragments at the same time, followed by a second step of universal primer amplification to form a nucleic acid library with the adapters and sample tags required by the sequencer. Because multiple pairs of primers are added at a relatively high concentration, primer-dimers form easily in multiplex PCR, and primer-dimers directly affect the quality of the constructed library. Removing primer-dimers is therefore a key step in multiplex PCR library construction. According to public reports, Zuiyi Yang et al. can reduce the primer-dimers of multiplex PCR by optimizing the sequences of the primer pairs.
  • Base modification can also effectively reduce primer dimers during the reaction.
  • Although primer-dimers can be removed by the methods disclosed above and by post-PCR purification, multiplex PCR still has problems such as PCR bias caused by different efficiencies of amplifying the target regions and non-specific amplification caused by cross combination of primer pairs. More importantly, since the PCR amplification target regions of different primer pairs cannot overlap under liquid-phase conditions, it is difficult for ordinary multiplex PCR to achieve continuous sequence analysis of long DNA fragments in a single reaction tube in a liquid-phase system.
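The primer-dimer problem described above can be illustrated with a small screening sketch. It is a deliberately simplified check, not the optimization method cited in the text: it only flags two primers whose 3' tails are exact reverse complements of each other. The function names and the `min_overlap` threshold are assumptions made for this illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def three_prime_dimer_risk(primer_a: str, primer_b: str, min_overlap: int = 4) -> bool:
    """Flag a primer pair whose last `min_overlap` bases can base-pair.

    Assumption: only an exact, full-length match of the two 3' tails is
    tested; real primer design tools also score partial and internal matches.
    """
    tail_a = primer_a.upper()[-min_overlap:]
    tail_b = primer_b.upper()[-min_overlap:]
    return tail_a == reverse_complement(tail_b)


# Hypothetical primers: the 3' tails AAGC and GCTT are reverse complements.
print(three_prime_dimer_risk("ACGTACAAGC", "TTGACAGCTT"))
print(three_prime_dimer_risk("ACGTACAAGC", "TTGACAAAAA"))
```

Pairs flagged in this way are candidates for redesign or, in the approach of the present document, for placement on different supports.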
  • single-cell sequencing mainly includes single-cell genome sequencing, single-cell RNA sequencing, single-cell epigenome sequencing and spatial single-cell sequencing; from the perspective of detection throughput, it is mainly divided into low-throughput single-cell sequencing (1-500 cells per run) and high-throughput single-cell sequencing (1000-10,000 cells per run).
  • High-throughput single-cell RNA detection mainly includes three implementations: water-in-oil-based droplet separation technology, microplate-based beads labeling technology, and microfluidics.
  • Water-in-oil-based droplet separation technology is represented by 10X Genomics, the Drop-Seq platform and the inDrop platform. This technology uses microfluidics to encapsulate barcode-labeled microbeads and single cells in water-in-oil droplets and lyses the cells to release RNA containing polyA tails; each gel microbead is coupled with oligo dT containing cell tags and molecular tags.
  • mRNA binds to the oligo dT nucleic acid molecules carrying the cell tag and molecular tag, and reverse transcription then labels the cDNA from different cells with different cell tags for subsequent pooled library construction and sequencing analysis.
  • Microplate-based beads labeling technologies are represented by BD CytoSeq, SeqWell and microwell-seq.
  • This technology lets cells settle naturally into a microwell array whose number of wells is more than ten times the number of cells, to ensure a high single-cell occupancy rate, and then adds cell tag-labeled microbeads to the microwells to capture the mRNA after cell lysis; after the mRNA binds to the oligo dT carrying cell tags and molecular tags, reverse transcription labels the cDNAs derived from different cells with different cell tags for subsequent pooled library construction and sequencing analysis.
  • the droplet method completely isolates each cell and label bead from other cells and beads through water-in-oil encapsulation, effectively reducing the possibility of cross-contamination; in addition to 3' RNA expression profiling libraries, the droplet protocol can also couple cell tags and molecular tags with template switching sequences to achieve 5' single-cell RNA expression profiling. However, due to the instability and suspension characteristics of droplets, droplet-based single-cell library construction schemes cannot exchange the medium before and after RNA labeling, which limits the possibility of further complex reactions, especially given the lack of positional information for the label beads.
  • the microplate method avoids the problem in 10X of probabilistic collision affecting capture efficiency, and has better cell capture efficiency; the label beads have a fixed position after falling into the microwells, so more liquid exchange operations can be performed. However, because the microwell is a semi-closed structure with an open top, cellular RNA can diffuse out of the well, so all current microwell methods can only construct 3' single-cell RNA expression profiling libraries.
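The role of the "more than ten times" well-to-cell ratio mentioned above can be made concrete with a short estimate. The sketch below assumes that cells settle independently and uniformly, so that the number of cells per well follows a Poisson distribution; this statistical model and the function name are assumptions of this illustration and are not stated in the source.

```python
import math


def multiplet_fraction(n_cells: int, n_wells: int) -> float:
    """Fraction of occupied wells that hold more than one cell.

    Assumes Poisson loading with mean lambda = n_cells / n_wells.
    """
    lam = n_cells / n_wells
    p_empty = math.exp(-lam)          # P(0 cells in a well)
    p_single = lam * math.exp(-lam)   # P(exactly 1 cell in a well)
    return 1.0 - p_single / (1.0 - p_empty)


# Ten wells per cell versus two wells per cell.
print(round(multiplet_fraction(1_000, 10_000), 3))
print(round(multiplet_fraction(5_000, 10_000), 3))
```

Under this model, a tenfold excess of wells keeps roughly 95% of occupied wells at a single cell, and the multiplet fraction rises quickly as the excess shrinks.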
  • An object of the present invention is to provide a multi-nucleic acid co-labeled support.
  • Another object of the present invention is to provide a method for preparing a support for co-labeling multiple nucleic acids.
  • Another object of the present invention is to provide the application of multiple nucleic acid co-labeled supports.
  • the present invention modifies two or more nucleic acid molecules on the support, wherein one nucleic acid molecule is used to capture the target compound from the reaction pool and to participate in a specific biochemical process together with the other types of molecules co-labeled on the surface of the same solid-phase support. Application directions include, but are not limited to, multiplex PCR library construction, single-cell RNA expression profiling, single-cell transcriptome sequencing library construction, and single-cell multi-omics sequencing library construction.
  • the present invention provides a multi-nucleic acid co-labeled support, which includes a support body and a variety of nucleic acid labels located on the surface and/or inside of the support body. The nucleic acids labeled on a single support at least include: one or more first nucleic acid markers, whose role at least includes capturing specific compounds in the reaction system to the surface of the support; and one or more second nucleic acid markers, whose role at least includes participating in the specified biochemical reaction process of the specific compound captured on the surface of the support.
  • the support body is solid beads and/or semi-solid hydrogel beads.
  • a plurality of nucleic acid co-labeled supports of the present invention are compositions comprising a plurality of supports.
  • the number of the first nucleic acid label and the second nucleic acid label on the same support can be ≥ 1 and/or ≤ 10^13, respectively.
  • the sequences of multiple first nucleic acid markers on the same support are the same or different; the sequences of the first nucleic acid markers on different supports are the same or different; the sequences of multiple second nucleic acid markers on the same support are the same or different; or the sequences of the second nucleic acid markers on different supports are the same or different.
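The support structure described above (a support body carrying at least one capturing first label and at least one reacting second label) can be sketched as a small data model. The class names, field names and role strings below are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NucleicAcidLabel:
    role: str       # assumed role names: "capture" (first label) or "reaction" (second label)
    sequence: str   # nucleotide sequence of the label


@dataclass
class Support:
    body: str  # e.g. "solid bead" or "hydrogel bead"
    labels: List[NucleicAcidLabel] = field(default_factory=list)

    def is_co_labeled(self) -> bool:
        """True if the support carries at least one capture and one reaction label."""
        roles = {label.role for label in self.labels}
        return {"capture", "reaction"} <= roles


bead = Support(
    body="solid bead",
    labels=[
        NucleicAcidLabel("capture", "TTTTTTTTTTTTTTTTTTTT"),  # illustrative oligo dT
        NucleicAcidLabel("reaction", "ACGTACGTTTGCAAGGG"),    # illustrative tag-bearing oligo
    ],
)
print(bead.is_co_labeled())
```

Because sequences may be the same or different within and across supports, the model stores each label independently instead of enforcing uniqueness.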
  • the present invention also provides a method for making the described multiple nucleic acid co-labeled supports, comprising:
  • nucleic acids are labeled on the support body by "graft to" and/or "graft from" methods to obtain a support with multiple nucleic acids co-labeled.
  • the preparation method of the multiple nucleic acid co-labeled supports of the present invention includes:
  • the support body and the nucleic acid are respectively modified with functional units that can interact, so that the two react to label the nucleic acid on the support body;
  • the nucleic acid is directly synthesized on the support body according to the preset nucleotide sequence; and/or
  • Nucleic acid labeling is performed on the body of the support using a biochemical reaction for nucleic acid extension or ligation protocols.
  • the present invention also provides applications of the multiple nucleic acid co-labeled supports in 5' single-cell RNA expression profiling, construction of a 5' single-cell VDJ library on a microwell array platform, construction of a 3' single-cell RNA library, construction of a single-cell transcriptome library, single-cell multi-omics research, multiplex PCR and/or construction of multiplex PCR sequencing libraries.
  • the multiple nucleic acid co-labeled supports of the invention are used for 5' single cell RNA expression profiling.
  • template switching sequences containing cell tags and molecular tags and RNA capture sequences are fixed on the support.
  • at least two nucleic acid sequences are marked on the support: a first nucleic acid sequence and a second nucleic acid sequence;
  • the first nucleic acid sequence contains at least a capture sequence, which is used to capture the target nucleic acid molecule and serve as a primer for extension or reverse transcription;
  • the second nucleic acid sequence includes a cell tag sequence, which is used to tag molecules derived from all mRNAs in the same cell; different kinds of supports have different cell tags.
  • the support captures the RNA released after single-cell lysis in the microwells of the chip; during reverse transcription, the RNA derived from the same cell is labeled with the same cell tag by template switching; the cDNA is then amplified and finally constructed into a 5' single-cell RNA expression profiling library.
  • the multiple nucleic acid co-labeled supports of the invention are used to construct 5' single-cell VDJ libraries on a microwell array platform.
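The tagging step described above can be sketched as string handling on the resulting sequences. The layout used here (cell tag, then molecular tag, then template switching sequence, followed by the cDNA insert) and all tag lengths are assumptions made for illustration; the source does not specify them, and strand orientation is ignored.

```python
from typing import NamedTuple


class ParsedRead(NamedTuple):
    cell_tag: str
    molecular_tag: str
    insert: str


def build_second_label(cell_tag: str, molecular_tag: str, template_switch_seq: str) -> str:
    """Concatenate the parts of a hypothetical second nucleic acid label."""
    return cell_tag + molecular_tag + template_switch_seq


def parse_read(read: str, cell_len: int, umi_len: int, ts_len: int) -> ParsedRead:
    """Split a read into cell tag, molecular tag and cDNA insert by fixed offsets."""
    prefix = cell_len + umi_len + ts_len
    if len(read) < prefix:
        raise ValueError("read is shorter than the tag region")
    return ParsedRead(
        cell_tag=read[:cell_len],
        molecular_tag=read[cell_len:cell_len + umi_len],
        insert=read[prefix:],
    )


# Hypothetical 8-nt cell tag, 6-nt molecular tag and 3-nt template switching sequence.
label = build_second_label("ACGTACGT", "TTGCAA", "GGG")
parsed = parse_read(label + "ATGCCC", cell_len=8, umi_len=6, ts_len=3)
print(parsed)
```

Reads that share a `cell_tag` would be assigned to the same cell in downstream analysis.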
  • template switching sequences containing cell tags and molecular tags and RNA capture sequences are fixed on the support.
  • at least two nucleic acid sequences are marked on the support: a first nucleic acid sequence and a second nucleic acid sequence; the first nucleic acid sequence contains at least a capture sequence, which is used to capture the target nucleic acid molecule and serve as a primer for extension or reverse transcription;
  • the second nucleic acid sequence includes a cell tag sequence, a molecular tag sequence and a template switching sequence.
  • Cell tag sequences are used to label molecules derived from all mRNAs in the same cell; molecular tag sequences are used to label each reverse-transcribed cDNA molecule, and cDNA molecules reverse transcribed from different RNAs on the same support are marked with different molecular tags; the template switching sequence can be used as a template to extend the 3' end of the reverse-transcribed cDNA so that it carries the molecular tag sequence and cell tag sequence; different kinds of supports have different cell tags.
  • the support captures the RNA released after single-cell lysis in the microwells of the chip; during reverse transcription, the RNA derived from the same cell is labeled with the same cell tag through template switching; the TCR and BCR/Ig nucleic acid sequences are then enriched using constant region primers of the TCR and BCR/Ig genes, and finally fragmented and constructed into a high-throughput single-cell VDJ sequencing library.
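Because different RNA molecules on the same support receive different molecular tags, counting distinct molecular tags per cell and gene gives a molecule count that is not inflated by amplification. The sketch below shows that counting step; the record format `(cell_tag, gene, molecular_tag)` and the example values are assumptions of this illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def count_molecules(records: Iterable[Tuple[str, str, str]]) -> Dict[Tuple[str, str], int]:
    """Count distinct molecular tags for every (cell_tag, gene) pair."""
    seen = defaultdict(set)
    for cell_tag, gene, molecular_tag in records:
        seen[(cell_tag, gene)].add(molecular_tag)
    return {key: len(tags) for key, tags in seen.items()}


# Hypothetical reads: the first two are amplification copies of one molecule.
reads = [
    ("CELL_A", "TRBC1", "AAAC"),
    ("CELL_A", "TRBC1", "AAAC"),
    ("CELL_A", "TRBC1", "GGTA"),
    ("CELL_B", "IGHM", "CCTT"),
]
print(count_molecules(reads))
```

This is exact-match counting only; correcting sequencing errors within molecular tags would need an additional step that is not shown.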
  • the multiple nucleic acid co-labeled supports of the invention are used to construct a 3' single cell RNA library.
  • a conditionally blockable random primer containing a cell tag and an RNA capture sequence are fixed on the support. Specifically, at least two nucleic acid sequences are marked on the support: a first nucleic acid sequence and a second nucleic acid sequence; the first nucleic acid sequence contains at least a capture sequence, which is used to capture the target nucleic acid molecule and serve as a primer for extension or reverse transcription; the second nucleic acid sequence includes a conditionally blockable random primer containing a cell tag, and the cell tag sequence is used to tag molecules derived from all mRNAs in the same cell; different kinds of supports have different cell tags.
  • the support captures the RNA released after single-cell lysis in the microwells of the chip and reverse transcribes it into cDNA; second-strand synthesis with the random primers containing cell tags then labels the cDNA derived from the same cell with the same cell tag, followed by amplification of the cDNA to construct a 3' single-cell RNA library.
  • the multiple nucleic acid co-labeled supports of the invention are used to construct single cell transcriptome libraries.
  • random primer sequences containing cell tags and RNA capture sequences are fixed on the support, different types of supports have different cell tags, and can detect any sequence of RNA molecules without being limited to the 3' end or 5' end;
  • the support includes two types of supports, each type of single support has at least two nucleic acid sequences, a combination of a first nucleic acid sequence and a second nucleic acid sequence, or a third nucleic acid sequence and a third nucleic acid sequence.
  • the first nucleic acid sequence contains at least a capture sequence for capturing target nucleic acid molecules
  • the second nucleic acid sequence includes a random primer sequence containing a cell tag, and the cell tag sequence is used to tag molecules derived from all mRNAs in the same cell
  • the third nucleic acid sequence includes a cell tag sequence and a capture sequence.
  • the support captures the RNA released after single-cell lysis in the microwells of the chip; during reverse transcription, the RNA derived from the same cell is labeled with the same cell tag; the cDNA is then amplified, and finally a single-cell RNA transcriptome library is constructed.
  • the multiple nucleic acid co-labeled supports of the present invention are used in single-cell multi-omics studies.
  • a library for constructing RNA expression levels and/or for detecting protein expression levels by nucleic acid tags of proteins is included.
  • RNA capture sequences containing cell tags and capture sequences for protein-labeled nucleic acid tags are fixed on the support, and different types of supports have different cell tags.
  • the first nucleic acid sequence includes at least a capture sequence for capturing the target nucleic acid molecule and extending as a primer;
  • the second nucleic acid sequence includes a cell tag sequence, a molecular tag sequence and a template switching sequence;
  • the cell tag sequence is used to label all mRNA molecules derived from the same cell;
  • the molecular tag sequence is used to label each reverse transcribed cDNA molecule, and the cDNA molecules reverse transcribed from different RNAs on the same support are marked with different molecular tags;
  • the template switching sequence can be used as a template to continue extending the 3' end of the reverse-transcribed cDNA so that it carries the molecular tag sequence and cell tag sequence;
  • the third nucleic acid sequence includes the cell tag sequence, molecular tag sequence and protein nucleic acid tag capture sequence, and the protein nucleic acid tag capture sequence is used to capture and extend the protein nucleic acid tag located in the same spatial structure as the single cell to be tested.
  • the support captures RNA and protein nucleic acid tags released after single-cell lysis in the microwells of the chip; during reverse transcription, the RNA and protein nucleic acid tags derived from the same cell are labeled with the same cell tag, and a single-cell RNA transcriptome library and a protein-labeled nucleic acid library are finally constructed through amplification.
  • the multiple nucleic acid co-labeled supports of the invention are used to construct multiplex PCR sequencing libraries.
  • the primers that can interfere with each other are respectively fixed on different supports.
  • the supports comprise at least two types of supports: one or more primer-labeled supports of a first kind, and one or more primer-labeled supports of a second kind, each of which is labeled with at least one pair of nucleic acid primers: a first nucleic acid primer pair is labeled on the first type of primer-labeled support, and a second nucleic acid primer pair different from the first nucleic acid primer pair is labeled on the second type of primer-labeled support.
  • Each kind of support can also optionally include additional nucleic acid primer pairs, and the target fragments amplified by multiple primer pairs on the same support do not overlap on the template; different primer pairs can amplify different regions of interest, and these regions of interest may or may not overlap.
  • all the supports are mixed in proportions and then mixed with the nucleic acid template and the PCR enzyme reaction system, so as to perform a single-tube unbiased multiplex PCR.
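The rule that amplicons on the same support must not overlap, while amplicons on different supports may, can be turned into a simple assignment procedure. The sketch below uses greedy first-fit interval partitioning; this is one possible strategy chosen for illustration, not a method stated in the source, and the amplicon names and coordinates are hypothetical.

```python
from typing import List, Tuple

Amplicon = Tuple[str, int, int]  # (name, start, end), half-open interval [start, end)


def assign_to_supports(amplicons: List[Amplicon]) -> List[List[Amplicon]]:
    """Group amplicons so that no two amplicons in one group overlap.

    Amplicons are processed by increasing start position and placed on the
    first support type whose most recent amplicon ends at or before the
    new start; otherwise a new support type is opened.
    """
    supports: List[List[Amplicon]] = []
    for amplicon in sorted(amplicons, key=lambda a: a[1]):
        _, start, _ = amplicon
        for group in supports:
            if group[-1][2] <= start:
                group.append(amplicon)
                break
        else:
            supports.append([amplicon])
    return supports


# Hypothetical tiled amplicons covering one long template with 20-bp overlaps.
tiles = [("A", 0, 100), ("B", 80, 180), ("C", 160, 260), ("D", 240, 340)]
for index, group in enumerate(assign_to_supports(tiles), start=1):
    print(index, [name for name, _, _ in group])
```

For this tiling, two support types are enough: one carries the primer pairs for A and C, the other those for B and D, which is the kind of arrangement that allows overlapping regions to be amplified in a single tube.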
  • the present invention also provides a kit, which includes the multiple nucleic acid co-labeled supports of the present invention.
  • the kit can be applied to 5' single-cell RNA expression profiling, construction of a 5' single-cell VDJ library on a microwell array platform, construction of a 3' single-cell RNA library, construction of a single-cell transcriptome library, single-cell multi-omics studies, multiplex PCR and/or construction of multiplex PCR sequencing libraries.
  • the kit also includes one or more of the following compositions:
  • Composition 1: a mixture of supports carrying a template switching sequence containing a cell tag and a molecular tag together with an RNA capture sequence, a microwell chip, a cell lysis solution, a reverse transcription reagent, a nucleic acid amplification reagent, and a nucleic acid fragmentation library construction module; a kit comprising composition 1 can be used for 5' single-cell RNA expression profiling;
  • Composition 2: a mixture of supports carrying a template switching sequence containing a cell tag and a molecular tag together with an RNA capture sequence, a microwell chip, a cell lysis solution, a reverse transcription reagent, constant region primers, a nucleic acid amplification reagent, and a nucleic acid fragmentation library construction module; a kit comprising composition 2 can be used to construct a 5' single-cell VDJ library on a microwell array platform;
  • Composition 3: a mixture of supports carrying random primers containing cell tags together with RNA capture sequences, a microwell chip, a cell lysis solution, a reverse transcription reagent, a second-strand synthesis module, and a nucleic acid amplification and extension reagent; a kit comprising composition 3 can be used to construct a 3' single-cell RNA library;
  • Composition 4: a mixture of supports carrying a random primer sequence containing a cell tag together with an RNA capture sequence, a microwell chip, a cell lysis solution, a reverse transcription reagent, a second-strand synthesis module, and a nucleic acid amplification and extension reagent; a kit comprising composition 4 can be used to construct a single-cell transcriptome library;
  • Composition 5: a mixture of supports carrying cell-tagged capture sequences for protein-labeled nucleic acid tags, a microwell chip, a cell lysis solution, a reverse transcription reagent, and a nucleic acid fragmentation library construction module; a kit comprising composition 5 can be used for single-cell multi-omics research;
  • Composition 6: a premixed mixture of primer-coupled supports, multiplex PCR enzyme and buffer, and optionally index primers adapted to a high-throughput sequencer; a kit comprising composition 6 can be used for multiplex PCR and/or construction of multiplex PCR sequencing libraries (the premixed primer-coupled supports, multiplex PCR enzyme and buffer enable single-tube unbiased multiplex PCR; the index primers adapted to high-throughput sequencers allow multiplex PCR libraries suitable for sequencing analysis to be constructed by index PCR).
  • the present invention provides a support for co-labeling of multiple nucleic acids and a method for making and application thereof.
  • the technology of the present invention can capture nucleic acid molecules on the solid surface and perform specific biochemical reactions together with other kinds of nucleic acids modified on the solid surface by carrying out a variety of nucleic acid modification schemes on solid-phase (including semi-solid) supports.
  • Specific types of polynucleic acid modified solid supports can be used in the fields of multiplex PCR library construction, single molecule long fragment nucleic acid sequencing library construction, single cell transcriptome sequencing library construction and single cell multi-omics sequencing library construction.
  • FIGS. 1A-1C are schematic diagrams of the structures of multiple nucleic acid co-labeled supports of the present invention.
  • FIG. 2A and FIG. 2B are schematic diagrams of the application of multiple nucleic acid co-labeled supports of the present invention to multiplex PCR reactions.
  • FIG. 2C is a schematic diagram of the application of various nucleic acid-labeled supports of the present invention to the construction of multiplex PCR sequencing libraries.
  • FIG. 2D and FIG. 2E are schematic diagrams of the design method and structure of multiple nucleic acid co-labeling supports for multiplex PCR of the present invention.
  • Figure 3A is a schematic structural diagram of a multi-nucleic acid co-labeled support for 5' single-cell RNA expression profiling and the construction of a 5' single-cell VDJ library of a microwell array platform of the present invention.
  • Figure 3B is an experimental flow chart of the multi-nucleic acid labeling support shown in Figure 3A applied to 5' single-cell RNA expression profiling and the construction of a 5' single-cell VDJ library on a microwell array platform.
  • Figure 3C is a schematic diagram of the preparation of the nucleic acid co-labeling support for 5' single-cell RNA expression profiling according to the present invention.
  • Figure 4A is a schematic structural diagram of a multi-nucleic acid co-labeled support applied to a 3' single-cell RNA library of the present invention.
  • Figure 4B is a schematic flowchart of the application of the multiple nucleic acid co-labeled supports of the present invention to a 3' single-cell RNA library.
  • Figure 5A is a schematic structural diagram of a plurality of nucleic acid co-labeling supports for constructing a single-cell transcriptome library of the present invention.
  • FIG. 5B is a schematic flowchart of the use of the multiple nucleic acid co-labeled supports of the present invention to construct a single-cell transcriptome library.
  • FIG. 6A is a schematic structural diagram of a plurality of nucleic acid co-labeling supports for constructing a multi-omics single-cell library of the present invention.
  • FIG. 6B is a schematic flow chart of the multi-omics single-cell library construction using the multiple nucleic acid co-labeled supports of the present invention.
  • FIG. 7A shows the agarose electrophoresis results of the membrane protein nucleic acid tag sequencing library constructed by the procedure in Example 1.
  • Figure 7B shows the analysis results of the 5' single-cell expression profile library fragments constructed by the process in Example 1.
  • FIG. 7C shows the analysis results of the T cell VDJ library fragments constructed by the procedure in Example 1.
  • Figure 7D shows the analysis results of the B cell VDJ library fragments constructed by the procedure in Example 1.
  • Figure 8 shows the distribution of reads at the gene level obtained by sequencing analysis of the 3' single-cell RNA library constructed by the process in Example 2 and the single-cell transcriptome library constructed by the process in Example 3.
  • the BD Rhapsody 3' single-cell expression profile library is the analysis result of a library constructed entirely with BD Rhapsody.
  • Figure 9 shows the results of fragment size analysis of the multiplex amplification PCR sequencing library constructed using the procedure in Example 4.
  • the present invention first provides structures for multiple nucleic acid co-labeled supports.
  • the support body can be a solid plane (Figure 1A), a solid bead (Figure 1B) or a semi-solid hydrogel (Figure 1C); nucleic acid labels can be located on a solid surface (Figure 1A and Figure 1B) or in the loose interior of the hydrogel (Figure 1C).
  • the nucleic acid labeled on a single support at least includes: one or more first nucleic acid labels 101, the function of which at least includes capturing a specific compound in the reaction system to the surface of the support (so the first nucleic acid label 101 is also called a capture nucleic acid label);
  • One or more second nucleic acid labels 102 the functions of which include at least being able to participate in a specified biochemical reaction process of a specific compound captured on the surface of the support (hence the second nucleic acid label 102 is also referred to as a reactive nucleic acid label).
  • other types of nucleic acid labels IN may also be included on the support.
  • a plurality of nucleic acid co-labeled supports of the present invention are compositions comprising a plurality of the above-mentioned supports (supports with the structures shown in FIG. 1A, FIG. 1B and/or FIG. 1C).
  • sequences of multiple first nucleic acid markers 101 on the same support may be the same. In a specific application, the sequences of multiple first nucleic acid markers 101 on the same support may be different.
  • sequences of the first nucleic acid markers 101 on different supports may be the same. In certain applications, the sequences of the first nucleic acid markers 101 on different supports may be different.
  • sequences of multiple second nucleic acid labels 102 on the same support may be identical. In certain applications, the sequences of the plurality of second nucleic acid markers 102 on the same support may be different.
  • sequences of the second nucleic acid labels 102 on different supports may be identical. In certain applications, the sequences of the second nucleic acid labels 102 on different supports may be different.
  • sequences of multiple other types of nucleic acid markers IN on the same support may be the same or different for a specific application.
  • sequences of the other kinds of nucleic acid labels on different supports may be the same or different under certain applications.
  • the number of the first nucleic acid label, the second nucleic acid label, and other kinds of nucleic acid labels on the same support can be ⁇ 1 and/or ⁇ 10 13 , respectively.
  • the functions of the first nucleic acid label and the second nucleic acid label can be switched, that is, the same nucleic acid label can have the function of "capturing a specific compound in the reaction system to the surface of the support” described in the present invention, or It can have the function of "participating in the specified biochemical reaction process of the specific compound captured on the surface of the support” described in the present invention.
  • a first nucleic acid label and a second nucleic acid label may be two primers in a primer pair.
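The support structure described above can be pictured as a small data model. The following is only an illustrative sketch: the class names `Role`, `NucleicAcidLabel` and `Support`, the `copies` field and the example sequences are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    CAPTURE = "capture"    # first nucleic acid label 101
    REACTIVE = "reactive"  # second nucleic acid label 102
    OTHER = "other"        # other types of nucleic acid labels


@dataclass(frozen=True)
class NucleicAcidLabel:
    sequence: str
    role: Role
    copies: int = 1  # number of copies of this label on the support


@dataclass
class Support:
    kind: str  # "plane", "bead" or "hydrogel" (hypothetical naming)
    labels: list[NucleicAcidLabel] = field(default_factory=list)

    def is_co_labeled(self) -> bool:
        """True if the support carries both a capture and a reactive label."""
        roles = {label.role for label in self.labels}
        return Role.CAPTURE in roles and Role.REACTIVE in roles


# Example: a bead carrying a capture label and a reactive label.
bead = Support(
    kind="bead",
    labels=[
        NucleicAcidLabel("T" * 20, Role.CAPTURE, copies=10**6),
        NucleicAcidLabel("ACGTACGT", Role.REACTIVE, copies=10**6),
    ],
)
```

Because the text allows the two functions to be switched, a real model might attach more than one role to a single label; the sketch keeps one role per label for brevity.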
  • the present invention also provides a method for preparing multiple nucleic acid co-labeled supports for different purposes.
  • Nucleic acid labeling of the support can be carried out in two ways: "graft to" and "graft from". In certain applications, nucleic acid labeling of the support can be performed using the "graft to" protocol alone. In certain applications, nucleic acid labeling of the support can be performed using the "graft from" protocol alone. In certain applications, nucleic acid labeling of the support can combine the "graft to" and "graft from" protocols.
  • the support and nucleic acid are modified with functional units that can interact with each other, including but not limited to one or more of hydroxyl, aldehyde, epoxy, amino, carboxyl and their activated forms, phosphoric acid, alkynyl, azide, mercapto, alkene, biotin, avidin, isothiocyanate, isocyanate, acyl azide, sulfonyl chloride, tosyl ester, etc.
  • Different types of nucleic acid labels can be modified with the same functional unit or with different functional units.
  • the modified support and the nucleic acid are brought into contact under specific conditions sufficient to allow the functional units capable of interacting to react and connect to each other.
  • when the "graft from" scheme is adopted, the nucleic acid can be synthesized directly on the support according to the preset nucleotide sequence, or the support can be labeled with nucleic acid by nucleic acid extension or ligation in a biochemical reaction. Nucleic acid modifications well known in the art can be added to the labeled nucleic acid, including but not limited to one or more of amino, phosphate, alkynyl, azide, sulfhydryl, disulfide, alkene, biotin, azobenzene, methyl, spacer, photocleavable groups, dI, dU, LNA, XNA, ribonucleic acid bases, dideoxyribonucleic acid bases, and the like.
  • the present invention also provides the use of multiple nucleic acid co-labeled supports for multiplex PCR.
  • in multiplex PCR applications, all templates and primers are mixed in the same reaction system.
  • the type and total concentration of primer pairs will also increase accordingly, which readily leads to the formation of primer dimers and thus reduces the amplification efficiency of the target region.
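The growth in possible primer-primer interactions can be made concrete with a simple count. The sketch below assumes, as an idealization that the text does not state, that every primer in a common solution can pair with every other primer (and with itself), and that primers fixed on different supports do not interact at all. The function names are hypothetical.

```python
def pairwise_primer_interactions(num_primers: int) -> int:
    """Unordered primer-primer combinations, self-pairings included."""
    return num_primers * (num_primers + 1) // 2


def interactions_single_tube(num_pairs: int) -> int:
    """All primers of all pairs are free in one reaction system."""
    return pairwise_primer_interactions(2 * num_pairs)


def interactions_partitioned(pairs_per_support: list[int]) -> int:
    """Primers only meet the primers on their own support (idealized)."""
    return sum(pairwise_primer_interactions(2 * n) for n in pairs_per_support)


# 100 primer pairs in one tube versus 20 support types with 5 pairs each.
single_tube = interactions_single_tube(100)      # 20100
partitioned = interactions_partitioned([5] * 20)  # 1100
```

Under these assumptions the number of candidate dimer-forming combinations grows quadratically with the number of free primers, which is why partitioning primers onto separate supports reduces it sharply.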
  • the use of the multiple nucleic acid co-labeled supports provided by the present invention can well reduce the generation of primer dimers in multiplex PCR, and because mutual interference between PCR primers can be avoided, the continuous sequence of long fragments can be analyzed in a single tube. As shown in FIG.
  • the multiple nucleic acid co-labeled supports provided by the present invention comprise at least two types of supports: one or more first type of primer-labeled supports 1, one or more The second kind of primer-labeled support 2, each support is labeled with at least one pair of nucleic acid primers: as shown in the figure, the first kind of primer-labeled support 1 is labeled with the first nucleic acid primer pair (No.
  • the reverse primer 202R, the two types of supports independently can also selectively include more nucleic acid primer pairs such as other nucleic acid primer pairs (other forward primer 2NF and other reverse primer 2NR).
  • PCR primer software such as Primer Primer optimizes primer sequences to reduce mutual interference between multiple primer pairs on the same magnetic bead.
  • the target fragments amplified by multiple primer pairs on the same support do not overlap on the template.
  • Primer pairs labeled on different supports can be different and thus can amplify different target regions, and these target regions can partially overlap or not overlap.
  • Each primer labeled on the support at least includes the H region that can bind to the target region and be extended: the first forward primer H region 201FH, the first reverse primer H region 201RH, the second forward primer H region 202FH, the second reverse primer H region 202RH, the other forward primer H region 2NFH, the other reverse primer H region 2NRH.
  • Each primer labeled on the support also includes at least the universal nucleic acid sequence U region: the first forward primer U region 201FU, the first reverse primer U region 201RU, the second forward primer U region 202FU, the second reverse primer U region 202RU, the other forward primer U region 2NFU, the other reverse primer U region 2NRU.
  • the forward primer U region FU or reverse primer U region RU sequence of primers on all supports may be inconsistent.
  • the supports labeled with different kinds of nucleic acid primers are mixed with the multiplex PCR reaction system in a preset ratio.
  • the preset ratio is determined according to the nucleic acid amplification efficiency on the different kinds of supports; it can be as low as 0.01 times the average ratio of the different kinds of supports and as high as 100 times the average ratio of the different kinds of supports.
  • for example, when primers P1/P2/P3 are labeled on magnetic beads to form the first type of magnetic beads and primers P4/P5/P6 are labeled on magnetic beads to form the second type of magnetic beads, the ratio here refers to the quantity ratio of the first type of magnetic beads to the second type of magnetic beads when the magnetic beads are mixed.
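One way to turn the preset ratio into bead fractions is sketched below. The text only says that the ratio depends on amplification efficiency and lies between 0.01 and 100 times the average ratio; the inverse-efficiency weighting used here, the function name `mixing_fractions` and the example efficiencies are assumptions made purely for illustration.

```python
def mixing_fractions(efficiencies, low=0.01, high=100.0):
    """Return the fraction of each support type in the bead mixture.

    Assumed model: a support type with lower relative amplification
    efficiency is added in proportionally larger amounts, and each
    weight is clamped to [low, high] times the average weight.
    """
    if any(e <= 0 for e in efficiencies):
        raise ValueError("efficiencies must be positive")
    weights = [1.0 / e for e in efficiencies]
    mean_weight = sum(weights) / len(weights)
    clamped = [min(max(w, low * mean_weight), high * mean_weight)
               for w in weights]
    total = sum(clamped)
    return [w / total for w in clamped]


# Two bead types; the second amplifies half as efficiently as the first.
fractions = mixing_fractions([1.0, 0.5])  # about [0.333, 0.667]
```

In practice the efficiencies would have to be measured for each support type, and the weighting rule may well differ from the one assumed here.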
  • the multiplex PCR reaction system at least includes a DNA template, DNA polymerase, dNTPs, a buffer of appropriate concentration, etc.
  • one of the primers marked on the support for example, the first forward primer 201F or the second forward primer 202F
  • each support can be coupled with fewer than 5 primer pairs, or with up to 10 primer pairs, or with up to 100 primer pairs.
  • the present invention also provides the use of multiple nucleic acid co-labeled supports for multiplex PCR sequencing library construction.
  • the supports with nucleic acid sequences 207 and 208 complementary to the complementary strands of the DNA template generated in the reaction shown in FIG. 2B are used as templates, and primer sequences compatible with the sequencer (the third forward primer 209F and the third reverse primer 209R) are used to perform the second-step PCR reaction, thereby obtaining a nucleic acid sequencing library that can be used for sequencing.
  • the primer sequences compatible with the sequencer include at least the universal binding sequence in the primers shown in Figure 2A (universal nucleic acid sequence U region, namely other forward primer U region 2NFU/other reverse primer U region 2NRU), sample label 2NFi/2NRi
  • Sequencers used for sequencing include but are not limited to the MGIseq sequencing platform, Illumina sequencing platform, Ion sequencing platform, PacBio sequencing platform, Nanopore sequencing platform, etc.
  • the present invention also provides a method for making multiple nucleic acid co-labeling supports for multiplex PCR.
  • as shown in FIG. 2D, when the continuous base sequence of the long fragment 210 needs to be sequenced and analyzed, amplification with a single primer pair can no longer meet the need. In this case, multiple primer pairs need to be designed for amplification, followed by library construction for sequencing.
  • a primer pair represented by the H region of the primer in the scheme shown in FIG.
  • the first forward primer H region 201FH and the first reverse primer H region 201RH amplify the first target fragment 201, the second primer pair (the second forward primer H region 202FH and the second reverse primer H region 202RH) amplifies the second target fragment 202, further primer pairs (other forward primer H region 2NFH and other reverse primer H region 2NRH) amplify further target fragments 2N, and finally the sequencing results are spliced into the sequence of the long fragment 210.
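The final splicing step can be sketched as a suffix-prefix merge of neighboring amplicon sequences. This is a deliberately minimal illustration that assumes error-free sequences supplied in left-to-right order with exact overlaps; the function names and the example sequence are hypothetical, and real read assembly would need to tolerate sequencing errors.

```python
def merge_overlapping(left: str, right: str, min_overlap: int = 3) -> str:
    """Join two sequences on their longest exact suffix/prefix overlap."""
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no overlap of at least min_overlap bases")


def splice(fragments: list[str], min_overlap: int = 3) -> str:
    """Splice ordered, overlapping fragments into one long sequence."""
    result = fragments[0]
    for fragment in fragments[1:]:
        result = merge_overlapping(result, fragment, min_overlap)
    return result


# Hypothetical long fragment covered by three overlapping amplicons.
long_fragment = "ACGTTGCAGGCTAACCGTAGGATCC"
amplicons = [long_fragment[0:10], long_fragment[6:18], long_fragment[14:25]]
reconstructed = splice(amplicons)
```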
  • the second forward primer and the first reverse primer together will produce small non-target amplification products, so the multiplex PCR reaction needs to be divided into at least two tubes for parallel amplification, with primer pairs whose target fragments do not overlap amplified together: the first forward primer, the first reverse primer, the other forward primers and the other reverse primers in one tube, and the second forward primer and the second reverse primer in another tube.
  • the multiplex PCR library building method using the multi-nucleic acid labeling support in the present invention can be obtained by labeling the first primer pair that amplifies the target fragment without overlapping and other primer pairs on the same magnetic bead, such as the first magnetic bead.
  • the first type of multiple nucleic acid co-labeled supports, and the second primer pair that overlaps with the amplification target fragment of the primer pair on the first magnetic bead is labeled on another magnetic bead such as the second magnetic bead to obtain the first.
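The rule that primer pairs with overlapping target fragments go on different beads, while non-overlapping pairs may share a bead, resembles interval partitioning. The sketch below uses a simple greedy assignment; the coordinates, the names and the half-open interval convention are hypothetical and chosen only to mirror fragments 201, 202 and 2N.

```python
def assign_to_supports(amplicons):
    """Group amplicons so that no two in the same group overlap.

    `amplicons` is a list of (name, start, end) with half-open
    coordinates on the template. Returns a list of groups of names,
    one group per support type.
    """
    pools = []  # each pool tracks the end of its last amplicon
    for name, start, end in sorted(amplicons, key=lambda a: a[1]):
        for pool in pools:
            if start >= pool["end"]:  # fits after the last amplicon
                pool["names"].append(name)
                pool["end"] = end
                break
        else:
            pools.append({"end": end, "names": [name]})
    return [pool["names"] for pool in pools]


# Hypothetical tiling: 202 overlaps both neighbors, 201 and 2N do not overlap.
tiling = [("201", 0, 300), ("202", 250, 550), ("2N", 500, 800)]
supports = assign_to_supports(tiling)  # [["201", "2N"], ["202"]]
```

With these example coordinates the first and other primer pairs end up together on one support type and the second primer pair on another, matching the arrangement described in the text.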
  • the present invention pre-synthesizes a first primer pair 201F, 201R with a 5' specific modification 211, a second primer pair 202F, 202R with a 5' specific modification 211, and further additional primer pairs 2NF, 2NR with a 5' specific modification 211 (Fig. 2E).
  • 5' specific modifications include, but are not limited to, hydroxyl, aldehyde, epoxy, amino, carboxyl and their activated forms, phosphoric acid, alkynyl, azide, sulfhydryl, alkene, biotin, avidin, isothiocyanate, isocyanate, acyl azide, sulfonyl chloride, tosyl ester, etc.
  • the corresponding selected supports include but are not limited to epoxy, amino, carboxyl, alkynyl, azide, alkene, heavy metal, azide, affinity and other functional groups 212.
  • the supports with functional groups 212 are contacted and coupled with nucleic acid primers with 5' specific modifications 211 under appropriate conditions; in particular, primers capable of producing non-specific products are coupled separately on different supports. For example, the first primer pair 201F, 201R and the other primer pairs 2NF, 2NR, each with a 5' specific modification 211, are coupled to the first microbeads to form a first product 213, and the second primer pair 202F, 202R with a 5' specific modification 211 is coupled to the second microbeads to form a second product 214 (Fig. 2E).
  • finally, the first product 213 and the second product 214 are mixed together in proportion to form the final multiple nucleic acid co-labeled supports, which are used for multiplex PCR library construction.
  • the present invention also provides the use of multiple nucleic acid co-labeled supports for 5' single-cell RNA expression profiling. It is well known that complex living organisms are composed of many cells with specific properties, and the types and quantities of RNAs transcribed and expressed by each cell in a specific state are different, so it is of great significance to detect RNA transcription at the single-cell level. Current technologies for detecting single-cell transcriptomes can be divided into low- and medium-throughput and high-throughput single-cell transcriptome sequencing technologies according to throughput.
  • medium- and low-throughput single-cell transcriptome sequencing is represented by smart-seq, in which the single-cell transcriptome library is constructed by reverse transcription and amplification of the RNA obtained by direct lysis of a single cell; library preparation for high-throughput single-cell transcriptome sequencing is represented by water-in-oil microfluidics and microwell array platforms, in which mRNA molecules derived from different cells are reverse transcribed, through oligo dT primers or template switch oligos (TSO) containing cell tags and molecular tags, into correspondingly and uniquely tagged cDNA molecules; further sequencing can then simultaneously analyze the mRNA expression of thousands of single cells.
  • the water-in-oil technology platform encapsulates a single cell and a single microbead containing a cell tag in a single droplet, where lysis and reverse transcription take place in one step. Depending on whether the sequencing reads lie near the 3' end or the 5' end of the RNA, the libraries can be divided into 3' single-cell RNA expression profiling libraries and 5' single-cell RNA expression profiling libraries.
  • the microwell array platform is usually an array chip consisting of microwells with a diameter of 20-60 μm.
  • RNA reverse transcripts derived from the same cell are labeled with the same and unique cellular label by reverse transcription extension.
  • the efficiency of the current microwell array-based single-cell sequencing library preparation platform largely depends on the RNA capture efficiency in the microwells containing oligo dT microbeads, and unlike the water-in-oil platform, the microwell array platform can only construct 3' single-cell RNA expression profiling libraries.
  • by labeling the microbeads with multiple nucleic acids during preparation of the single-cell sequencing library, the invention can improve the RNA capture efficiency in the microwells and can realize the preparation of a 5' single-cell RNA expression profiling library.
  • the present invention provides the use of a variety of nucleic acid co-labeled supports for 5' single-cell RNA expression profile analysis, and the construction of a 5'-single-cell VDJ library of a microwell array platform.
  • the support here is a microbead (a solid microbead or a semi-solid hydrogel microbead), and at least two nucleic acid sequences are labeled on the support: a first nucleic acid sequence 301 and a second nucleic acid sequence 304.
  • the first nucleic acid sequence 301 contains at least a capture sequence 303, which is used to capture the target nucleic acid molecule and serve as a primer for extension or reverse transcription, such as an oligo dT sequence 15-40 bases in length; the efficiency of RNA capture can be controlled by adjusting the number and density of the first nucleic acid sequence 301 on the support.
  • the first nucleic acid sequence 301 also includes a first universal sequence nucleic acid 302 and a conditionally cleavable site X under certain uses.
  • Conditionally cleavable sites include, but are not limited to, one or more of disulfide modifications, dU modifications, RNA base modifications, dI modifications, DSpacer modifications, AP site modifications, photocleavable PC linkers, and restriction endonuclease recognition sequences.
  • the second nucleic acid sequence 304 is composed of one or more of the second universal nucleic acid sequence 305, the cell tag sequence 306, the molecular tag sequence 307 and the template switching sequence 308.
  • the second universal nucleic acid sequence 305 may include an adapter nucleic acid sequence that matches the sequencer, such as Read1 Sequencing Primer or Read2 Sequencing Primer in the illumina sequencer.
  • the cell tag sequence 306 is used to tag molecules derived from all mRNAs in the same cell, with the same cell tag on each support and different cell tags on different kinds of supports.
  • the cell tag sequence 306 can be a random or semi-random nucleic acid sequence, such as a 12bp degenerate base NNNNNNNNNN, or a combination of multiple fixed nucleic acid sequences, such as a random combination of 96 8-base sequences, 96 8-base sequences and 96 8-base sequences, which may or may not include connecting nucleic acid regions between the 8-base sequences.
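The size of the cell tag space under the two designs mentioned above can be worked out directly. The sketch below compares a fully random 12-base tag with a combination of three sets of 96 fixed 8-base segments; the helper names and the randomly generated segment sets are hypothetical stand-ins, since the actual segment sequences are not given in the text.

```python
import math
import random

BASES = "ACGT"


def make_segment_set(n=96, length=8, seed=0):
    """Generate n distinct placeholder segments (not the real sequences)."""
    rng = random.Random(seed)
    segments = set()
    while len(segments) < n:
        segments.add("".join(rng.choice(BASES) for _ in range(length)))
    return sorted(segments)


def tag_space(set_sizes):
    """Number of distinct cell tags from one pick per segment set."""
    return math.prod(set_sizes)


def random_cell_tag(segment_sets, rng):
    """Concatenate one randomly chosen segment from each set."""
    return "".join(rng.choice(segments) for segments in segment_sets)


segment_sets = [make_segment_set(seed=s) for s in (1, 2, 3)]
combinatorial_space = tag_space([len(s) for s in segment_sets])  # 96**3
random_space = 4 ** 12                                           # 12 random bases
example_tag = random_cell_tag(segment_sets, random.Random(42))
```

Three sets of 96 segments give 884,736 possible tags, fewer than the 16,777,216 sequences of a random 12-mer, but every valid tag is known in advance, which makes it easier to recognize and correct sequencing errors.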
  • the molecular tag sequence 307 is used to label each reverse transcribed cDNA molecule, and the cDNA molecules reverse transcribed from different RNAs on the same support are marked with different molecular tags.
  • the molecular tag 307 can be a random or semi-random nucleic acid sequence of 8-20 bases in length, such as 9 random degenerate bases NNNNNNN or NNNNNNNV.
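Degenerate patterns such as those quoted above follow the IUPAC nucleotide codes, in which N stands for any base and V for A, C or G. The sketch below counts the sequences a pattern can represent and draws a random instance; the function names are hypothetical, only the codes needed here are included, and the pattern `NNNNNNNV` is used exactly as written in the text.

```python
import random

# Subset of the IUPAC nucleotide codes needed for the patterns in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "V": "ACG"}


def degenerate_space(pattern: str) -> int:
    """Number of concrete sequences a degenerate pattern can represent."""
    count = 1
    for symbol in pattern:
        count *= len(IUPAC[symbol])
    return count


def random_instance(pattern: str, rng: random.Random) -> str:
    """Draw one concrete sequence matching the pattern."""
    return "".join(rng.choice(IUPAC[symbol]) for symbol in pattern)


def matches(seq: str, pattern: str) -> bool:
    """Check that a sequence is a valid instance of the pattern."""
    return len(seq) == len(pattern) and all(
        base in IUPAC[symbol] for base, symbol in zip(seq, pattern)
    )


umi = random_instance("NNNNNNNV", random.Random(1))
space = degenerate_space("NNNNNNNV")  # 4**7 * 3
```

A terminal V excludes T at the last position, so the pattern represents 49,152 distinct molecular tags rather than the 65,536 of eight fully random bases.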
  • the template switching sequence 308 can be used as a template to extend the 3' end of the cDNA reverse transcribed from the first nucleic acid sequence 301 to label the molecular tag sequence 307, the cell tag sequence 306 and the second universal nucleic acid sequence 305.
  • Template switching sequence 308 includes two or more RNA bases rG or other modified base G analogs, such as LNA or XNA, at least at the 3' end.
  • FIG 3B shows the experimental flow chart of the multi-nucleic acid-labeled support shown in Figure 3A in the construction of the 5' single-cell library of the microwell array platform.
  • each support is coupled with two kinds of nucleic acid labels: the first nucleic acid sequence 301 and the second nucleic acid sequence 304.
  • the RNA 309 containing the complementary sequence of the second nucleic acid sequence 304 is captured by the first nucleic acid sequence 301 on the support and reverse transcribed by the reverse transcription reaction system to form a cDNA molecule 310. When the cDNA has been extended to the 5' end of the RNA 309, the reverse transcriptase with terminal transferase activity adds consecutive C bases to the cDNA strand.
  • the cDNA strand is then complementary to the adjacent second nucleic acid sequence 304 on the same support surface, which contains two or more rG bases or base analogs thereof, and is extended to the second universal nucleic acid sequence 305 to form a complete cDNA molecule 310 with cell tags and molecular tags.
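The template switching step can be followed at the sequence level with a small string model. This is a simplified sketch: all sequences are hypothetical placeholders, rG is written as plain G, exactly three C bases are assumed to be added, and the function names are not from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def template_switch(cdna: str, second_label: str, n_g: int = 3) -> str:
    """Model C-tailing of the cDNA and extension along the second label.

    `cdna` is the first-strand cDNA (5'->3') before tailing and
    `second_label` is the bead-bound sequence 304 (5'->3'), ending in
    n_g G bases that pair with the non-templated C tail.
    """
    if not second_label.endswith("G" * n_g):
        raise ValueError("second label must end in the G stretch")
    tailed = cdna + "C" * n_g  # non-templated C bases added at the 3' end
    # Extension copies the rest of the label: molecular tag, cell tag, universal.
    return tailed + revcomp(second_label[:-n_g])


universal = "ACACGACGCTCTTCCGATCT"  # placeholder for sequence 305
cell_tag = "AACCGGTT"               # placeholder for sequence 306
molecular_tag = "ACGTACGA"          # placeholder for sequence 307
label_304 = universal + cell_tag + molecular_tag + "GGG"  # 308 as GGG only

first_strand = "T" * 20 + "GATTACA"  # oligo dT primer plus cDNA body
full_cdna = template_switch(first_strand, label_304)
```

In this model the 3' end of the finished cDNA carries the reverse complement of the whole second label, so the molecular tag, the cell tag and the universal sequence are all present for the downstream amplification.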
  • the cDNA molecule 310 can be detached from the support through the cleavable site X and used as the template for the next amplification step; or a strand complementary to the cDNA molecule 310, obtained by single-primer extension from the second universal nucleic acid sequence 305, can be used as the template for the next amplification step; or the support carrying the cDNA molecule 310 can be used as the template for the next amplification step after the first nucleic acid sequences 301 and second nucleic acid sequences 304 that did not take part in the reverse transcription reaction have been removed from the support by enzymatic treatment.
  • the fragmented or unfragmented cDNA molecule 310 as a template is PCR amplified with a primer pair containing the first universal sequence 302 and the second universal sequence 305 to form a double-stranded nucleic acid 311 product.
  • the double-stranded nucleic acid 311 can be used to analyze the type and abundance of single-cell RNA expression through two library construction methods, one of which aims at unbiased analysis of the expression of all RNA molecules with polyA tails.
  • in this library construction scheme, the double-stranded nucleic acid 311 is randomly fragmented, end-repaired, and given a base A at the 3' end to form a molecular structure 312, which is then ligated to an adapter 313 containing a protruding T and amplified with the first primer 315 containing the first sample index 317 and the second primer 319 containing the second sample index 321 to form the first final library 323.
  • the first primer 315 comprises a first nucleic acid sequence 316 compatible with the sequencer, a first sample index 317 and a sequence 318 complementary to a partial sequence of the long strand in the adapter 313.
  • the second primer 319 includes a second nucleic acid sequence 320 compatible with the sequencer, a second sample index 321 and a sequence 322 that is identical to a partial sequence of the second universal nucleic acid sequence 305.
  • This library construction method can also be replaced by other random library construction schemes that can achieve the same purpose, including but not limited to the library construction scheme of transposase interrupted library construction or random primer extension.
  • the purpose of the other library construction scheme is targeted analysis of the expression of target genes, which can be realized by two-step multiplex PCR: primer pairs formed by the first gene-specific primer 324 and the second gene-specific primer 326, each paired with the universal primer 305, yield the first-step multiplex PCR product 325 and the second-step multiplex PCR product 327, respectively.
  • finally, the second-step multiplex PCR product 327 is used as a template and amplified with the first primer 315 containing the first sample index 317 and the second primer 319 containing the second sample index 321 to form a second final library 328.
  • the library constructed by targeted multiplex PCR can be used for the analysis of the immune repertoire.
  • the second-step multiplex PCR product 327 can also be used to construct a full-length VDJ immune repertoire library according to the first, random fragmentation library construction scheme, for analysis of T cell receptor and antibody VDJ sequences. Both the first final library 323 and the second final library 328 are further used for sequencing and information analysis.
  • the present invention also provides a method for making a nucleic acid co-labeling support for 5' single-cell RNA expression profiling analysis.
  • the cell tag sequence 306 consists of six sequentially linked regions: a first cell tag 329, a first connecting region 330, a second cell tag 331, a second connecting region 332, a third connecting region 333, and a third cell tag 334.
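The six-region layout of the cell tag can be illustrated by building a tag from its parts and splitting it again. In this sketch the connecting-region sequences, the 8-base tag length and the function names are hypothetical; only the order of the regions follows the text.

```python
def build_cell_tag(tags, linkers):
    """Assemble tag1 + link1 + tag2 + link2 + link3 + tag3."""
    tag1, tag2, tag3 = tags
    link1, link2, link3 = linkers
    return tag1 + link1 + tag2 + link2 + link3 + tag3


def parse_cell_tag(seq, linkers, tag_len=8):
    """Split a cell tag into its three tags, checking the fixed linkers."""
    link1, link2, link3 = linkers
    layout = [("tag", tag_len), ("link", link1), ("tag", tag_len),
              ("link", link2), ("link", link3), ("tag", tag_len)]
    tags, pos = [], 0
    for kind, spec in layout:
        if kind == "tag":
            tags.append(seq[pos:pos + spec])
            pos += spec
        else:
            if seq[pos:pos + len(spec)] != spec:
                raise ValueError(f"connecting region mismatch at position {pos}")
            pos += len(spec)
    if pos != len(seq):
        raise ValueError("unexpected cell tag length")
    return tuple(tags)


# Placeholder connecting regions 330, 332 and 333 and three 8-base tags.
LINKERS = ("ACTG", "GTCA", "TGAC")
TAGS = ("AAAACCCC", "GGGGTTTT", "ACGTACGT")
cell_tag = build_cell_tag(TAGS, LINKERS)
recovered = parse_cell_tag(cell_tag, LINKERS)
```

The fixed connecting regions act as anchors, so a read whose connecting regions do not match can be rejected before the three tags are looked up.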
  • the present invention pre-synthesizes the first nucleic acid sequence 301 and the third nucleic acid sequence 335 with a 5' specific modification 340.
  • the 5' specific modification 340 includes but is not limited to hydroxyl, aldehyde, epoxy, amino, carboxyl and their activated forms, phosphoric acid, alkynyl, azide, sulfhydryl, alkene, biotin, avidin, isothiocyanate, isocyanate, acyl azide, sulfonyl chloride, tosyl ester, etc.
  • the correspondingly selected supports carry functional groups 339 including but not limited to epoxy, amino, carboxyl, alkynyl, azide, alkene, heavy metal, avidin, etc.
  • the support with functional group 339 is contacted and coupled (step 341) with the first nucleic acid sequence 301 and the third nucleic acid sequence 335 carrying the 5' modification under appropriate conditions to form a first product 342; the ratio of the first nucleic acid sequence 301 to the third nucleic acid sequence 335 on the first product 342 can be adjusted by adding them at different concentrations.
  • the first product 342 and the fourth nucleic acid sequence 336 then undergo hybridization and extension in an environment containing an appropriate salt ion concentration, dNTPs and a polymerase buffer to obtain the second product 344. From its 5' end, the fourth nucleic acid sequence 336 contains in order the complementary sequence 332' of the second connecting region 332, the complementary sequence 331' of the second cell tag 331 and the complementary sequence 330' of the first connecting region 330.
  • the third nucleic acid sequence 335 on the support hybridizes through its own connecting region 330 with the complementary sequence 330' on the fourth nucleic acid sequence 336 and is extended to acquire the sequences of the second cell tag 331 and the second connecting region 332.
  • after the complementary strand has been removed under denaturing conditions, the second product 344 is joined to the fifth nucleic acid sequence 346 under the action of DNA ligase to generate the third product 347.
  • the fifth nucleic acid sequence 346 is double-stranded DNA obtained in advance by annealing and hybridizing the first nucleic acid molecule 337 and the second nucleic acid molecule 338. The first nucleic acid molecule 337 contains the third connecting region 333, the third cell tag 334, the molecular tag sequence 307 and the template switching sequence 308 connected in sequence. The second nucleic acid molecule 338 contains at least the complementary sequences of partial sequences of the second connecting region 332 and the third connecting region 333, so that the third connecting region 333 and the second connecting region 332 on the second product 344 are hybridized together and joined to each other by a ligase; in particular, the 5' end of the first nucleic acid molecule 337 sometimes also carries a phosphate modification.
  • finally, under denaturing conditions, the third product 347 undergoes an elution step 348 that washes off the complementary nucleic acid sequence, that is, the second nucleic acid molecule 338, to form a support 349 labeled with the first nucleic acid sequence 301 and the second nucleic acid sequence 304.
  • the support 349 labeled with these two nucleic acid sequences can be used directly in the library construction procedure for 5' single-cell RNA expression profiling.
  • the present invention also provides the use of a plurality of nucleic acid co-labeled supports applied to a 3' single-cell RNA library, wherein the cell label and the reverse transcription primer oligo dT are respectively located at both ends of the cDNA molecule.
  • Figure 4A shows supports labeled with at least two nucleic acids, where the supports are beads (solid beads or semi-solid hydrogel beads), and at least two nucleic acid sequences are labeled on the supports: the first nucleic acid sequence 401 and the second nucleic acid sequence 404.
  • the first nucleic acid sequence 401 contains at least a capture sequence 403, which is used to capture the target nucleic acid molecule and serve as a primer for extension or reverse transcription, such as an oligo dT sequence 15-40 bases in length; the efficiency of RNA capture can be controlled by adjusting the number and density of the first nucleic acid sequence 401 on the support. The first nucleic acid sequence 401 also includes the first universal sequence nucleic acid 402 under certain uses.
  • the second nucleic acid sequence 404 is composed of one or more of the second universal nucleic acid sequence 405, the cell tag sequence 406, the primer sequence 407 and the reversible blocking site 408.
  • the second universal nucleic acid sequence 405 may include an adapter nucleic acid sequence that matches the sequencer, such as Read1 Sequencing Primer or Read2 Sequencing Primer in the Illumina sequencer, and the second universal nucleic acid sequence 405 optionally contains a conditionally cleavable site X.
  • Conditionally cleavable sites include, but are not limited to, disulfide modifications, dU modifications, RNA base modifications, dI modifications, DSpacer modifications, AP site modifications, photocleavable PC linkers, and restriction endonuclease recognition sequences.
  • the cell tag sequence 406 is used to tag molecules derived from all mRNAs in the same cell, with the same cell tag on each support and different cell tags on different kinds of supports.
  • the cell tag sequence 406 can be a random or semi-random nucleic acid sequence, such as a 12bp degenerate base NNNNNNNNNN, or a combination of multiple fixed nucleic acid sequences, such as a random combination of 96 8-base sequences, 96 8-base sequences and 96 8-base sequences, which may or may not include connecting nucleic acid regions between the 8-base sequences.
  • the primer sequence 407 can be used as a primer to extend along the cDNA molecule that is complementary to it, and can bind to and be extended on the cDNA product obtained by reverse transcription from the capture sequence 403; the primer sequence 407 can be a random or semi-random nucleic acid sequence with a length of 5-15 bases, for example 6 random degenerate bases NNNNNN, and gene-specific sequences can also be used to enrich targeted regions.
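Where a given primer sequence 407 could anneal on a cDNA can be found by searching for the reverse complement of the primer. The sketch below assumes perfect complementarity over the whole primer, which real random priming does not require; the sequences and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def priming_sites(cdna: str, primer: str) -> list[int]:
    """Start positions on the cDNA (5'->3') where the primer can anneal.

    The primer pairs with the stretch of cDNA equal to its reverse
    complement; overlapping sites are reported as well.
    """
    target = revcomp(primer)
    sites = []
    start = cdna.find(target)
    while start != -1:
        sites.append(start)
        start = cdna.find(target, start + 1)
    return sites


cdna_410 = "TTTTGATTACAGGGATTACA"  # placeholder cDNA sequence
hexamer = "GTAATC"                 # one concrete instance of NNNNNN
sites = priming_sites(cdna_410, hexamer)  # [4, 13]
```

A fully degenerate hexamer pool contains 4**6 = 4096 different sequences, so on a real cDNA many positions would be available for priming.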
  • the function of the reversible blocking site 408 is to prevent the non-specific extension of the primer sequence 407 as a primer when the first nucleic acid sequence 401 captures and extends the target nucleic acid, and in a specific situation, the blocking effect is released to allow the primer sequence 407 to act as a primer. extend.
  • the reversible blocking site 408 can be simple 3' phosphate modification, ddNTP modification or C3 spacer modification, or a combination of cleavable modification and extension blocking modification, and the cleavable modification can be DSpacer modification/RNA base Modification/dU modification, etc., extension blocking modification including but not limited to LNA/XNA/3' phosphate/inverted dT/ddNTP/C3 spacer/C6 spacer/various fluorescent dyes and quenching modifications, such as reversible blocking sites 408 can be (rN)NNNN-C3 or (rN)N-C3-C3-ddN, rN represents any ribonucleotide degenerate base, N represents any deoxyribonucleotide degenerate base, C3 is an extension blocking modified C3 spacer, and ddN is a dideoxyribonucleoside; after this sequence forms a double strand with the target DNA, it can be recognized and excised by R
  • Figure 4B shows the flow chart of the experiment of constructing a 3' single-cell RNA library with the double nucleic acid labeling support shown in Figure 4A.
  • a single support labeled with the first nucleic acid sequence 401 and the second nucleic acid sequence 404 is contacted with RNA derived from a single cell; the RNA 409 containing the complementary sequence of the second nucleic acid sequence 404 is captured by the first nucleic acid sequence 401 on the support and reverse transcribed by the reverse transcription reaction system to form a cDNA molecule 410.
  • the cDNA molecule 410 is then separated from the RNA by high-temperature denaturation and becomes complementary to the region of the primer sequence 407 on a second nucleic acid sequence 404 near the surface of the same support;
  • the cDNA molecule 410 can be complementary to more than one second nucleic acid sequence 404 on the surface of the support; further, the primer sequence 407 that is complementary to the cDNA molecule 410 can be recognized by a related
  • the two nucleic acid molecules 413 are used as templates for further amplification to form a double-stranded nucleic acid product 414, wherein the forward and reverse amplification primers respectively contain all or part of the nucleic acid sequences of the first universal sequence nucleic acid 402 and the second universal sequence nucleic acid 405. Further, the double-stranded nucleic acid product 414 can be used to analyze the type and abundance of single-cell RNA expression through two library construction methods, one of which aims to unbiased analysis of the expression of all RNA molecules with polyA tails.
  • This library construction scheme amplifies the double-stranded nucleic acid product 414 with the first primer 415 containing the first sample index 417 and the second primer 419 containing the second sample index 421 to form the first final library 423, wherein the first primer 415 includes a sequencer-compatible first nucleic acid sequence 416, the first sample index 417 and a first primer hybridization region 418, and the second primer 419 includes a sequencer-compatible second nucleic acid sequence 420, the second sample index 421 and a second primer hybridization region 422.
  • This library construction method can also be replaced by other random library construction schemes that achieve the same purpose, including but not limited to transposase fragmentation or random primer extension.
  • The other library construction scheme analyzes the expression of target genes in a targeted manner. This can be achieved by two-step multiplex PCR, for example using primer pairs formed by the first gene-specific primer 424 and the second gene-specific primer 426, respectively, with the universal primer 305, which yield the first-step multiplex PCR product 425 and the second-step multiplex PCR product 427. Finally, the second-step multiplex PCR product 427 is used as a template and amplified with the first primer 415 containing the first sample index 417 and the second primer 419 containing the second sample index 421 to form a second final library 428.
  • Libraries constructed by targeted multiplex PCR can be used for immune repertoire analysis, especially for full-length T cell receptor and antibody VDJ sequences. Both the first final library 423 and the second final library 428 are further used for sequencing and information analysis.
  • the present invention further provides the use of a multi-nucleic acid co-labeled support for constructing a single-cell transcriptome library.
  • the cellular label can label any position of the RNA chain to form a cDNA molecule with cellular and molecular labels.
  • The supports here are microbeads or hydrogel beads, and there are at least two nucleic acid sequences on a single support, such as the first nucleic acid sequence 501 and the second nucleic acid sequence 505.
  • The first nucleic acid sequence 501 contains at least a capture sequence 503 for capturing target nucleic acid molecules, such as an oligo dT sequence 15-40 bases long; the RNA capture efficiency can be controlled by adjusting the number and density of the first nucleic acid sequence 501 on the support.
  • The first nucleic acid sequence 501 also includes a polymerase extension blocking site 504, which prevents the capture sequence 503 from being extended as a primer on the captured nucleic acid molecule; the blocking site includes but is not limited to LNA, XNA, 3' phosphate, inverted dT, ddNTP, C3 spacer, C6 spacer, and various fluorescent dye and quencher modifications.
  • In specific applications, the first nucleic acid sequence 501 also includes the first universal nucleic acid sequence 502, and the capture efficiency of the capture sequence 503 can be adjusted by adjusting the sequence and length of the first universal nucleic acid sequence 502.
  • The second nucleic acid sequence 505 is composed of one or more of a second universal nucleic acid sequence 506, a cell tag sequence 507, and a primer sequence 508. The second universal nucleic acid sequence 506 may include an adapter nucleic acid sequence matching the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of an Illumina sequencer; the cell tag sequence 507 is used to label molecules derived from all mRNAs in the same cell, each support carries one cell tag, and different supports carry different cell tags.
  • The cell tag sequence 507 can be a random or semi-random nucleic acid sequence, such as a 12-base degenerate sequence NNNNNNNNNNNN, or a combination of multiple fixed nucleic acid sequences, such as a random combination of three sets of 96 8-base sequences, with or without connecting nucleic acid regions between the 8-base sequences.
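  • To make the size of the cell tag space concrete, the following is a minimal sketch that compares the two designs described above. The segment sets are randomly generated placeholders and the names `make_segment_set` and `cell_tag` are hypothetical; the source does not disclose the actual 8-base sequences or any linker.

```python
import random

BASES = "ACGT"

def make_segment_set(n, length, seed):
    """Return n distinct random DNA strings of the given length (placeholder set)."""
    rng = random.Random(seed)
    segments = set()
    while len(segments) < n:
        segments.add("".join(rng.choice(BASES) for _ in range(length)))
    return sorted(segments)

# Three hypothetical sets of 96 fixed 8-base segments. A real design would
# typically also enforce a minimum distance between segments (assumption).
SEGMENT_SETS = [make_segment_set(96, 8, seed) for seed in (1, 2, 3)]

def cell_tag(i, j, k, linker=""):
    """Join one segment from each set, optionally separated by a linker region."""
    parts = (SEGMENT_SETS[0][i], SEGMENT_SETS[1][j], SEGMENT_SETS[2][k])
    return linker.join(parts)

combinatorial_diversity = 96 ** 3   # three sets of 96 segments
random_12mer_diversity = 4 ** 12    # twelve fully degenerate positions

print("combinatorial cell tags:", combinatorial_diversity)
print("random 12-base cell tags:", random_12mer_diversity)
print("example tag:", cell_tag(0, 5, 95))
```

    The combinatorial design gives fewer tags than a fully random 12-base sequence, but every valid tag is known in advance, which is what makes a whitelist-based assignment of reads to supports possible.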
  • The primer sequence 508 can anneal to an RNA template and be extended as a primer into a cDNA molecule; in particular, it can anneal to the RNA captured by the capture sequence 503 and be extended.
  • The primer sequence 508 can be a random or semi-random nucleic acid sequence with a length of 5-15 bases, for example 6 random degenerate bases NNNNNN; gene-specific sequences can also be used to enrich targeted regions.
  • The third nucleic acid sequence 509 is composed of one or more of the second universal nucleic acid sequence 506, the cell tag sequence 507, the molecular tag sequence 510 and the capture sequence 503. The second universal nucleic acid sequence 506 may include an adapter nucleic acid sequence matching the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of an Illumina sequencer; the cell tag sequence 507 is used to label molecules derived from all mRNAs in the same cell, each support carries one cell tag, and different supports carry different cell tags.
  • The cell tag sequence 507 can be a random or semi-random nucleic acid sequence, such as a 12-base degenerate sequence NNNNNNNNNNNN, or a combination of multiple fixed nucleic acid sequences, such as a random combination of three sets of 96 8-base sequences, with or without connecting nucleic acid regions between the 8-base sequences. The molecular tag sequence 510 is used to label each reverse transcribed cDNA molecule: cDNA molecules reverse transcribed from different RNAs on the same support are marked with different molecular tags.
  • The molecular tag 510 can be a random or semi-random nucleic acid sequence with a length of 5-20 bases, such as 9 random degenerate bases NNNNNNNNN or NNNNNNNV; the capture sequence 503 is used to capture target nucleic acid molecules, such as an oligo dT sequence 15-40 bases long.
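  • The degenerate notation used for the tags and primers above follows the IUPAC nucleotide codes (N = any base, V = A, C or G). As a small illustration, the sketch below counts and enumerates the sequences a degenerate pattern can represent; the helper names `diversity` and `expand` are hypothetical and only the codes appearing in this description are included.

```python
import itertools

# Subset of the IUPAC nucleotide codes used in this description.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "V": "ACG",   # any base except T
}

def diversity(pattern):
    """Number of distinct sequences a degenerate pattern can represent."""
    total = 1
    for symbol in pattern:
        total *= len(IUPAC[symbol])
    return total

def expand(pattern):
    """Enumerate every concrete sequence of a (short) degenerate pattern."""
    choices = (IUPAC[symbol] for symbol in pattern)
    return ["".join(p) for p in itertools.product(*choices)]

print(diversity("NNNNNNNNN"))  # nine fully degenerate positions
print(diversity("NNNNNNNV"))   # seven N positions followed by one V
print(expand("NV"))
```

    A plausible reason for ending a tag with V when it sits directly upstream of an oligo dT stretch is that the tag then never ends in T, so the tag/oligo dT boundary stays unambiguous; this rationale is an assumption and is not stated in the source.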
  • Figure 5B shows the experimental flow chart of the construction of single-cell transcriptome library using two types of dual nucleic acid labeling supports.
  • A single nucleic acid-labeled support contacts RNA derived from a single cell; the RNA 512 containing a sequence complementary to the capture sequence 503 is captured by the first nucleic acid sequence 501 or the third nucleic acid sequence 509 on the support; under suitable conditions, the RNA 512 captured on the surface of the support anneals to the primer sequence 508 of the second nucleic acid sequence 505 and a cDNA molecule 514 is formed through a reverse transcription reaction system
  • 509 is the cDNA formed by the primer
  • After reverse transcription, the support can be further digested to remove the molecules of the first nucleic acid sequence 501, the third nucleic acid sequence 509 and the second nucleic acid sequence 505 that did not participate in the reaction.
  • the supports containing the cDNA molecules 514 can then be analyzed for the type and abundance of single-cell RNA expression by two methods of library construction.
  • One of the library construction methods aims at unbiased analysis of the expression of all RNA molecules with polyA tails; this scheme uses extension and amplification with a random primer 517.
  • The random primer 517 is composed of a universal primer sequence 515 and a random base sequence 516. The universal primer sequence 515 can include an adapter nucleic acid sequence matching the sequencer, such as the Read2 Sequencing Primer or Read1 Sequencing Primer of an Illumina sequencer; the random base sequence 516 can be a random or semi-random nucleic acid sequence with a length of 5-15 bases, such as 9 consecutive degenerate bases NNNNNNNNN.
  • The random primer 517 hybridizes to the cDNA molecule 514 under appropriate conditions and produces a complementary strand 518 of the cDNA molecule 514 under the action of a DNA polymerase with strand displacement activity; amplification with a primer pair of the second universal nucleic acid sequence 506 and the universal primer sequence 515 then generates a double-stranded product 519; the double-stranded product 519 is amplified with the first primer 520 containing the first sample index 522 and the second primer 524 containing the second sample index 526 to form the first final library 528, wherein the first primer 520 comprises a sequencer-compatible first nucleic acid sequence 521, the first sample index 522 and a first primer hybridization region 523, and the second primer 524 comprises a sequencer-compatible second nucleic acid sequence 525, the second sample index 526 and a second primer hybridization region 527;
  • This library construction method can also be replaced with other random library construction schemes that achieve the same purpose, including but not limited to ultrasonic fragmentation, enzymatic fragmentation or transposase fragmentation.
  • The other library construction scheme analyzes the expression of target genes in a targeted manner. This can be achieved by two-step multiplex PCR, for example using primer pairs formed by the first gene-specific primer 529 and the second gene-specific primer 531, respectively, with the universal primer 506, which yield the first-step multiplex PCR product 530 and the second-step multiplex PCR product 532. Finally, the second-step multiplex PCR product 532 is used as a template and amplified with the first primer 520 and the second primer 524 to form a second final library 533.
  • Libraries constructed by targeted multiplex PCR can be used for immune repertoire analysis, especially for full-length T cell receptor and antibody VDJ sequences. Both the first final library 528 and the second final library 533 are further used for sequencing and information analysis.
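  • As an illustration of how the cell tag and molecular tag could be used in the information analysis step, the sketch below counts distinct molecules per cell and gene by collapsing reads that share the same cell tag, molecular tag and gene. It is a simplified assumption of a typical analysis, not the method of the source: the read tuples, tag sequences and gene names are made-up examples, and it applies only to reads that carry a molecular tag (for instance those primed by the third nucleic acid sequence 509).

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct molecular tags per (cell tag, gene).

    reads: iterable of (cell_tag, molecular_tag, gene) tuples, assumed to be
    already extracted from the sequencing reads and assigned to genes.
    """
    tags_seen = defaultdict(set)
    for cell_tag, molecular_tag, gene in reads:
        tags_seen[(cell_tag, gene)].add(molecular_tag)
    return {key: len(tags) for key, tags in tags_seen.items()}

# Toy input: the first two reads are amplification duplicates of one molecule.
reads = [
    ("AAACCCGG", "GATTACAGA", "GENE_A"),
    ("AAACCCGG", "GATTACAGA", "GENE_A"),
    ("AAACCCGG", "TTGCATGCA", "GENE_A"),
    ("GGTTAACC", "GATTACAGA", "GENE_A"),
]

counts = count_molecules(reads)
for (cell, gene), n in sorted(counts.items()):
    print(cell, gene, n)
```

    Real pipelines usually also correct sequencing errors in the tags before counting; that step is omitted here.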
  • the present invention also provides the use of multiple nucleic acid co-labeled supports for single-cell multi-omics research.
  • DNA carrying genetic information transmits the information to RNA through transcription and is translated into protein, which finally performs the main biological function.
  • However, due to the complexity of the physiological system, the expression levels of RNA and protein are not consistent, and RNA cannot directly reflect the post-translational modification and interaction of proteins. Therefore, it is very important to study the expression of RNA and protein in the same cell at the same time.
  • the present invention discloses a nucleic acid labeling support structure capable of simultaneously analyzing RNA expression level and sequence and protein expression and interaction.
  • the support here is microbeads (including solid microbeads or semi-solid hydrogel microbeads), and at least three nucleic acid sequences are labeled on the support: a first nucleic acid sequence 601, a second nucleic acid sequence 604 and the third nucleic acid sequence 609.
  • The first nucleic acid sequence 601 contains at least a capture sequence 603 for capturing the target nucleic acid molecule and being extended as a primer, such as an oligo dT sequence 15-40 bases long; capture can be adjusted by adjusting the number and amount of the first nucleic acid sequence 601 on the support.
  • The first nucleic acid sequence 601 also includes the first universal nucleic acid sequence 602 and a conditionally cleavable site X; the conditionally cleavable site includes but is not limited to disulfide modification, dU modification, RNA base modification, dI modification, DSpacer modification, AP site modification, photocleavable PC linker and restriction endonuclease recognition sequence.
  • The second nucleic acid sequence 604 is composed of one or more of the second universal nucleic acid sequence 605, the cell tag sequence 606, the molecular tag sequence 607 and the template switching sequence 608. The second universal nucleic acid sequence 605 may include an adapter nucleic acid sequence matching the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of an Illumina sequencer. The cell tag sequence 606 is used to label molecules derived from all mRNAs in the same cell; each support carries one cell tag, and different supports carry different cell tags. The cell tag sequence 606 can be a random or semi-random nucleic acid sequence, such as a 12-base degenerate sequence NNNNNNNNNNNN, or a combination of multiple fixed nucleic acid sequences, such as a random combination of three sets of 96 8-base sequences; the connecting nucleic acid region may or may not be included between the 8-
  • The molecular tag 607 can be a random or semi-random nucleic acid sequence with a length of 8-20 bases, such as 9 random degenerate bases NNNNNNNNN or NNNNNNNV. The template switching sequence 608 serves as a template for extending the 3' end of the cDNA reverse transcribed from the first nucleic acid sequence 601, so that the cDNA is labeled with the molecular tag sequence 607, the cell tag sequence 606 and the second universal nucleic acid sequence 605. The template switching sequence 608 includes, at least at its 3' end, two or more RNA bases rG or other modified G analogs, such as LNA or XNA.
  • The third nucleic acid sequence 609 is composed of one or more of the third universal nucleic acid sequence 610, the cell tag sequence 606, the molecular tag sequence 607 and the protein nucleic acid tag capture sequence 611, wherein the structures of the cell tag sequence 606 and the molecular tag sequence 607 are consistent with those in the second nucleic acid sequence 604;
  • the third universal nucleic acid sequence 610 differs from the second universal nucleic acid sequence 605 and contains an adapter nucleic acid sequence matching the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of an Illumina sequencer;
  • the protein nucleic acid tag capture sequence 611 is used to capture and extend the protein nucleic acid tag located in the same spatial structure as the single cell to be tested. The same spatial structure means that the protein nucleic acid tag can be located inside the cell, on the surface of the cell, or in the chamber or droplet where the cell is located.
  • FIG. 6B shows the experimental flow chart of the construction of a multi-omics single-cell library with the three nucleic acid labeling supports shown in FIG. 6A .
  • The cells to be tested are first contacted and bound with a nucleic acid-labeled antibody molecule 612 that recognizes a specific protein, and non-specifically bound nucleic acid-conjugated antibody is washed away.
  • The structure of the antibody molecule 612 includes a sequence 613 that binds complementarily to the protein nucleic acid tag capture sequence 611, a protein-specific sequence 614, a fourth universal primer sequence 615, and a molecule 616. The fourth universal primer sequence 615 differs from the second universal nucleic acid sequence 605 and the third universal nucleic acid sequence 610 and contains an adapter nucleic acid sequence matching the sequencer, such as the Read2 Sequencing Primer or Read1 Sequencing Primer of an Illumina sequencer. The molecule 616 refers to a specific antibody in this process, and can also be a small-molecule compound, carbohydrate, peptide or other substance that binds to the target protein.
  • RNA 617 and the nucleic acid-conjugated antibody molecules 612 are captured by the first nucleic acid sequence 601 and the third nucleic acid sequence 609 on the support, respectively, and form a cDNA molecule 618 or a DNA molecule 619 through a reverse transcription reaction system. When the cDNA has been extended to the 5' end of RNA 617, a reverse transcriptase with terminal transferase activity adds consecutive C bases to the cDNA strand; the cDNA strand then anneals to a template switching sequence 608, which contains two or more rG bases or base analogs, near the surface of the same support and continues to extend through the second universal nucleic acid sequence 605 region, thereby acquiring a complete cell tag and molecular tag.
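  • The template switching step described above can be sketched at the sequence level as follows. This is a toy string model under stated assumptions: all sequences are hypothetical placeholders, rG is written as plain G, exactly three C bases are added, and the function name `template_switch` is illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def template_switch(cdna, bead_oligo, n_c=3):
    """Model the cDNA after C-tailing and extension along a bead oligo.

    cdna:       first-strand cDNA, written 5'->3'
    bead_oligo: second nucleic acid sequence, 5'->3', ending in n_c G bases
    """
    if not bead_oligo.endswith("G" * n_c):
        raise ValueError("bead oligo must end with the G stretch")
    tailed = cdna + "C" * n_c  # non-templated C bases added by the transcriptase
    # The C tail pairs with the terminal G stretch; extension then copies the
    # rest of the oligo, appending its reverse complement to the cDNA 3' end.
    return tailed + revcomp(bead_oligo[:-n_c])

# Hypothetical placeholder sequences (not from the source).
universal = "ACACGACGCT"
cell_tag = "AACGTGAT"
molecular_tag = "ACGTACGTA"
tso = "TTTCTTATATGGG"
bead_oligo = universal + cell_tag + molecular_tag + tso

labeled_cdna = template_switch("TTTTTTTTGCATCGATCG", bead_oligo)
print(labeled_cdna)
```

    In this model the labeled cDNA ends with the reverse complements of the molecular tag, the cell tag and the universal sequence, in that order, which is why a primer matching the second universal nucleic acid sequence 605 can be used in the later amplification.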
  • The cDNA molecule 618 and the DNA molecule 619 can be detached from the support through the cleavable site X and used as templates for the next amplification step; alternatively, the extended strands complementary to the cDNA molecule 618 or the DNA molecule 619, formed by single-primer extension from the second universal nucleic acid sequence 605 and the fourth universal nucleic acid sequence 615, can be used as templates for the next amplification step; or the nucleic acid sequences on the support that did not participate in the reverse transcription reaction, such as the first nucleic acid sequence 601, are removed by enzymatic treatment and the support containing the cDNA molecule 618 and the DNA molecule 619 is used as the template for the next amplification step. In the subsequent step, the mixture of fragmented or unfragmented cDNA molecules 618 and DNA molecules 619 is used as the template, and a mixture of a first double-stranded nucleic acid product 621 and a second double-stranded nucleic acid product 620 is formed by PCR amplification with the two primer pairs first universal nucleic acid sequence 602/second universal nucleic acid sequence 605 and third universal nucleic acid sequence 610/fourth universal nucleic acid sequence 615.
  • the formed double-stranded nucleic acid product can be used for single-cell multi-omics analysis through three library construction methods.
  • The first library construction scheme constructs a nucleic acid library for analyzing the abundance of the proteins to be detected. The first library 630 can be obtained directly by PCR amplification with the first index primer 622 and the second index primer 626, wherein the first index primer 622 consists of, linked in sequence, the sequencer-compatible first nucleic acid sequence 623, the first sample index 624 and the nucleic acid sequence 625 complementary to the fourth universal nucleic acid sequence 615, and the second index primer 626 consists of the sequencer-compatible second nucleic acid sequence 627, the second sample index 628 and the nucleic acid sequence 629 complementary to the third universal nucleic acid sequence 610.
  • the purpose of the second library construction scheme is to analyze the expression of all RNA molecules with polyA tails without bias.
  • This library construction scheme randomly fragments the mixture of the first double-stranded nucleic acid product and the second double-stranded nucleic acid product, repairs the ends and adds an A base at the 3' end to form a molecular structure 631, which is then ligated with an adapter 632 containing a T overhang and amplified with the first primer 634 containing the first sample index 624 and the second primer 636 containing the second sample index 628 to form a second final library 638, wherein the first primer 634 comprises, connected in sequence, a sequencer-compatible first nucleic acid sequence 623, the first sample index 624 and a nucleic acid sequence 635 complementary to the first universal nucleic acid sequence 602, and the second primer 636 includes a sequencer-compatible second nucleic acid sequence 627, the second sample index 628 and a nucleic acid sequence 637 complementary to the second universal nucle
  • Other random library construction schemes include, but are not limited to, transposase fragmentation or random primer extension library construction schemes.
  • The third library construction scheme analyzes the expression of target genes in a targeted manner. This can be achieved by two-step multiplex PCR, for example using primer pairs formed by the first gene-specific primer 639 and the second gene-specific primer 641, respectively, with the second universal primer 605, which yield the first-step multiplex PCR product 640 and the second-step multiplex PCR product 642. Finally, the second-step multiplex PCR product 642 is used as a template and amplified with the first primer 634 containing the first sample index 624 and the second primer 636 containing the second sample index 628 to form a third final library 643.
  • The library constructed by targeted multiplex PCR can be used for immune repertoire analysis. Furthermore, the second-step multiplex PCR product 642 can also be used to construct a full-length VDJ immune repertoire library according to the random fragmentation library construction scheme described above, for analysis of T cell receptor and antibody VDJ sequences.
  • The first library 630, the second final library 638, and the third final library 643 are all further used for sequencing and information analysis.
  • Example 1 Multiple nucleic acid co-labeled supports applied to 5' single-cell RNA expression profile library construction, VDJ library construction and multi-omics library construction
  • nucleic acid co-labeled supports were prepared according to the following operation steps and applied to construct a 5' single-cell RNA expression profile library, a VDJ library and a multi-omics library.
  • Reverse transcription reaction system (200 μL):
      Reagent                                     Volume (μL)
      SuperScript II first-strand buffer (5×)     40
      DTT (100 mM)                                10
      Betaine (5 M)                               40
      MgCl2 (1 M)                                 1.2
      dNTP (10 mM)                                20
      RNase inhibitor                             5
      SuperScript II reverse transcriptase        10
      RNase-free water                            73.8
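  • The 200 μL reverse transcription system listed above can be checked and rescaled with a few lines of arithmetic. The sketch below is only a convenience calculation: the component volumes are taken from the table, while the helper names, the optional `overage` parameter and the choice of a 50 μL example are assumptions.

```python
# Component volumes (in microliters) of the 200 uL system from the table.
RT_MIX_200UL = {
    "SuperScript II first-strand buffer (5x)": 40.0,
    "DTT (100 mM)": 10.0,
    "Betaine (5 M)": 40.0,
    "MgCl2 (1 M)": 1.2,
    "dNTP (10 mM)": 20.0,
    "RNase inhibitor": 5.0,
    "SuperScript II reverse transcriptase": 10.0,
    "RNase-free water": 73.8,
}

def scale_mix(mix, target_volume_ul, overage=0.0):
    """Scale every component to a new total volume, with optional pipetting overage."""
    factor = target_volume_ul * (1.0 + overage) / sum(mix.values())
    return {name: round(volume * factor, 2) for name, volume in mix.items()}

def final_concentration(stock, volume_ul, total_ul):
    """Final concentration of a component, in the same unit as the stock."""
    return stock * volume_ul / total_ul

total = sum(RT_MIX_200UL.values())
print("total volume (uL):", round(total, 2))
print("final DTT (mM):", final_concentration(100, 10.0, 200.0))
print("final betaine (M):", final_concentration(5, 40.0, 200.0))
for name, volume in scale_mix(RT_MIX_200UL, 50).items():
    print(f"{name}: {volume} uL")
```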
  • Example 2 Multiple nucleic acid co-labeled supports applied to 3' single-cell RNA library construction
  • nucleic acid co-labeled supports were prepared according to the following operation steps and applied to construct a 3' single-cell RNA library.
  • Random primer NrNx double-strand extension: prepare the following hybridization reaction system and resuspend the magnetic beads obtained in step 3.2.5.
  • Example 3 Multiple nucleic acid co-labeled supports for single-cell transcriptome library construction
  • nucleic acid co-labeled supports were prepared according to the following operation steps and applied to construct a single-cell transcriptome library.
  • nnnnnnn is an 8bp Cell barcode sequence, with a total of 384 types.
  • Step   Temperature          Time     Rotation speed
    1      room temperature     30 min   normal rotation
    2      37°C                 30 min   1200 rpm
  • The nucleic acid sequences measured by this method are more evenly distributed over the full length of the RNA than those of BD Rhapsody and of the library in Example 2.
  • the distribution of reads at the gene level was obtained through sequencing analysis.
  • The BD Rhapsody 3' single-cell expression profile library is the analysis result for a library constructed entirely with BD Rhapsody.
  • The sequences contained in the 3' single-cell RNA library constructed in Example 2 are mainly distributed at the 3' end of the gene, while the sequences contained in the single-cell transcriptome library constructed in Example 3 are significantly more biased toward the middle of the gene than those of the 3' single-cell RNA library and the BD Rhapsody 3' single-cell expression profile library.
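  • The comparison of read distributions along the gene can be illustrated with a simple gene-body profile. The sketch below is a simplified assumption of such an analysis, not the pipeline used in the source: it treats each transcript as one contiguous interval (introns are ignored), and the read coordinates are made-up toy values.

```python
def relative_position(read_start, read_end, tx_start, tx_end, strand):
    """Position of a read midpoint along its transcript: 0.0 = 5' end, 1.0 = 3' end."""
    midpoint = (read_start + read_end) / 2
    fraction = (midpoint - tx_start) / (tx_end - tx_start)
    fraction = min(max(fraction, 0.0), 1.0)
    # On the minus strand the 5' end lies at the higher genomic coordinate.
    return fraction if strand == "+" else 1.0 - fraction

def coverage_profile(positions, n_bins=10):
    """Fraction of reads falling into each of n_bins equal bins from 5' to 3'."""
    counts = [0] * n_bins
    for position in positions:
        counts[min(int(position * n_bins), n_bins - 1)] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else counts

# Toy reads as (start, end) on a plus-strand transcript spanning 0-1000.
three_prime_like = [(900, 1000), (880, 960), (930, 1000), (450, 550)]
positions = [relative_position(s, e, 0, 1000, "+") for s, e in three_prime_like]
profile = coverage_profile(positions)
print([round(p, 2) for p in profile])
```

    A library with strong 3' bias concentrates its reads in the last bins, whereas a library with more even coverage spreads them across the bins.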
  • Example 4 Multiple nucleic acid co-labeled supports for multiplex PCR sequencing library construction
  • the purpose of this example is to realize single-tube multiplex PCR detection of the full-length gene sequences of Brca1 and Brca2, and the design of multiplex PCR primers is as follows:

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Microbiology (AREA)
  • Analytical Chemistry (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Immunology (AREA)
  • Biomedical Technology (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Provided are a variety of nucleic acid co-labeling supports, a preparation method therefor, and an application thereof. The supports comprise a support body, and a variety of nucleic acid labels located on the surface and/or inside of the support body. The nucleic acid labeled on a single support at least comprises: one or more first nucleic acid labels having a function that at least comprises trapping a specific compound in a reaction system onto the surface of the support; and one or more second nucleic acid labels having a function that at least comprises participating in a specified biochemical reaction process of the specific compound trapped onto the surface of the support. The foregoing variety of nucleic acid co-labeled supports can be used for 5'-terminus single-cell RNA expression profile analysis, construction of a 5'-terminus single-cell VDJ library for a microwell array platform, construction of a 3'-terminus single-cell RNA library, construction of a single-cell transcriptome library, single-cell multi-omics research, multiplex PCR and/or construction of a multiplex PCR sequencing library, etc.

Description

Multi-nucleic acid co-labeling support, preparation method therefor, and application thereof

Technical field
The present invention relates to a multi-nucleic acid co-labeling support and its preparation method and application.
Background
Traditional nucleic acid reactions, including hybridization, extension and amplification, are carried out in the liquid phase, which provides a uniform and stable environment for the nucleic acids and enzymes involved and thereby maximizes yield. As biological research has advanced, scientists have found that attaching the nucleic acids or enzymes involved in a reaction to a solid-phase surface gives the nucleic acids spatial position information, which makes purification, separation, detection and analysis more convenient. More and more solid-phase nucleic acid reactions have therefore been developed for nucleic acid sequence analysis and quantification, such as nucleic acid chip technology in which oligonucleotides are immobilized on a substrate in an ordered manner, bridge amplification and bead emulsion amplification for next-generation sequencing, and nucleic acid-encoded microbeads for high-throughput single-cell sequencing.
Single nucleotide polymorphisms (SNPs) are widely used as genetic markers in population genomics, genome-wide association studies (GWAS), paternity testing and population identification. Traditional SNP genotyping technologies, including TaqMan fluorescence analysis, KASPar genotyping and direct PCR sequencing, can analyze up to a few hundred loci simultaneously. Their advantage is flexible experimental operation when the number of SNP loci to be detected is small; their disadvantage is that cost increases linearly when the number of SNPs to be detected per sample is large (especially ≥50). Newer detection technologies, including biochips and high-throughput sequencing such as RAD sequencing (restriction site associated DNA sequencing), exome sequencing and whole-genome sequencing, can identify 10^3-10^6 SNPs; their disadvantage is the high cost of library construction and sequencing per sample, and the average cost per SNP rises rapidly when fewer than 1000 SNPs are detected per sample. When the number of SNPs to be detected per sample is between 20 and 1000, multiplex PCR library construction combined with next-generation sequencing can effectively reduce the average cost per SNP.
Multiplex PCR library construction generally means adding multiple pairs of PCR primers to a PCR amplification system to amplify multiple target fragments simultaneously, followed by a second amplification step with universal primers to form a nucleic acid library carrying the adapters and sample tags required by the sequencer. Because multiple primer pairs are added at relatively high concentration, multiplex PCR readily forms large amounts of primer dimers, which directly affect the quality of the constructed library, so removing primer dimers is a key step in multiplex PCR library construction. According to published reports, Zuiyi Yang et al. reduced primer dimers in multiplex PCR by optimizing the primer pair sequences, and Integrated DNA Technologies, Adaptive Biotechnology and the laboratory of Kenneth J. Livak effectively reduced primer dimers during the reaction by introducing RNA base modifications into multiplex PCR primers based on the principle of RNase H-dependent PCR. Although primer dimers can be removed by the methods above and by post-PCR purification, multiplex PCR still suffers from PCR bias caused by differing amplification efficiencies across target regions and from non-specific amplification caused by cross-combination of primer pairs. More importantly, because the target regions amplified by different primer pairs cannot overlap in the liquid phase, it is difficult for ordinary multiplex PCR to achieve continuous sequence analysis of long DNA fragments in a single reaction tube in a liquid-phase system.
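As a rough illustration of why primer dimers become more likely as primers are added to a pool, the following sketch screens primer pairs for complementary 3' ends. It is a deliberately simple heuristic under stated assumptions, not a method from the source: it only detects the case where the last k bases of two primers are exact reverse complements of each other, the primer sequences are invented, and the function names and the threshold of 4 bases are hypothetical.

```python
import itertools

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def three_prime_overlap(primer_a, primer_b, max_k=8):
    """Largest k (up to max_k) for which the last k bases of the two primers can pair."""
    best = 0
    for k in range(1, min(max_k, len(primer_a), len(primer_b)) + 1):
        if primer_a[-k:] == revcomp(primer_b[-k:]):
            best = k
    return best

def risky_pairs(primers, min_overlap=4):
    """List primer pairs (including self-pairs) whose 3' ends can pair over min_overlap bases."""
    flagged = []
    for a, b in itertools.combinations_with_replacement(sorted(primers), 2):
        k = three_prime_overlap(primers[a], primers[b])
        if k >= min_overlap:
            flagged.append((a, b, k))
    return flagged

# Invented example primers; P1 and P2 end in mutually complementary bases.
primers = {"P1": "AAAAGTTC", "P2": "TTTTGGAAC", "P3": "AAAAAAAA"}
print(risky_pairs(primers))
```

The number of pairs to screen grows roughly with the square of the number of primers, which is one way to see why dimer formation is a central concern in highly multiplexed pools.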
It is well known that complex living organisms are composed of many cells with distinct properties, and that the types and amounts of nucleic acids and proteins expressed by each cell differ in a given state; detecting nucleic acid and protein indicators at the single-cell level is therefore of great significance for biomedical research. In terms of what is measured, single-cell sequencing mainly comprises single-cell genome sequencing, single-cell RNA sequencing, single-cell epigenome sequencing, and spatial single-cell sequencing. In terms of throughput, it is mainly divided into low-throughput single-cell sequencing (1-500 cells per assay) and high-throughput single-cell sequencing (1000-10000 cells per assay).
High-throughput single-cell RNA detection is mainly implemented in three ways: water-in-oil droplet compartmentalization, microwell-plate-based bead labeling, and microfluidics. Water-in-oil droplet compartmentalization is represented by the 10X Genomics, Drop-Seq, and inDrop platforms. This type of technology uses microfluidics to encapsulate a barcode-labeled microbead and a single cell in one oil droplet, where the cell is lysed to release RNA carrying polyA tails. Each gel microbead is coupled with oligo dT nucleic acid sequences containing a cell tag and a molecular tag; after mRNA binds to these oligo dT nucleic acid molecules, reverse transcription labels the cDNA from different cells with different cell tags, and the cDNA is then pooled for library construction and sequencing analysis. Microwell-plate-based bead labeling is represented by BD CytoSeq, SeqWell, and microwell-seq. In this technology, cells settle naturally into a microwell array containing more than ten times as many wells as cells, which ensures the single-cell occupancy rate; microbeads labeled with cell tags are then added to the microwells to capture the mRNA released by cell lysis. After the mRNA binds to the oligo dT carrying the cell tag and molecular tag, reverse transcription labels the cDNA from different cells with different cell tags, and the cDNA is then pooled for library construction and sequencing analysis.
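The role of the cell tag and the molecular tag in downstream analysis can be sketched as follows. This is a minimal illustration, assuming that each sequencing read has already been reduced to a (cell tag, molecular tag, gene) triple; the function name and the example tags are hypothetical and do not describe the software of any platform named above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def count_molecules(
    reads: Iterable[Tuple[str, str, str]]
) -> Dict[Tuple[str, str], int]:
    """Count distinct molecular tags per (cell tag, gene).

    Reads sharing a cell tag are attributed to the same cell; reads that
    also share a molecular tag and gene are treated as amplification
    copies of one original cDNA molecule and counted once.
    """
    seen = defaultdict(set)
    for cell_tag, molecular_tag, gene in reads:
        seen[(cell_tag, gene)].add(molecular_tag)
    return {key: len(tags) for key, tags in seen.items()}


# Hypothetical reads: the first two are duplicates of one molecule.
reads = [
    ("AAAC", "TTG", "GeneX"),
    ("AAAC", "TTG", "GeneX"),
    ("AAAC", "CCA", "GeneX"),
    ("GGGT", "TTG", "GeneX"),
]
print(count_molecules(reads))
```

In this sketch the cell tag separates cells after pooling, while the molecular tag removes amplification duplicates within each cell.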
The droplet method completely isolates each cell and tagged microbead from other cells and microbeads in a water-in-oil droplet, which effectively reduces the possibility of cross-contamination. In addition to 3’ RNA expression profiling libraries, the droplet approach can also couple the cell tag and molecular tag to a template switching sequence to achieve 5’ single-cell RNA expression profile sequencing. However, because droplets are inherently unstable and suspended, droplet-based single-cell library construction does not allow buffer exchange before or after the RNA is labeled with the cell tag, which reduces the possibility of carrying out further complex reactions; in particular, positional information for the tagged microbeads is lacking. The microwell plate method avoids the problem in 10X that probabilistic collision limits capture efficiency, and it achieves better cell capture efficiency; once a tagged microbead falls into a microwell it has a fixed position, so more buffer exchange operations can be performed. However, because a microwell is a semi-closed structure with an open top, cellular RNA can diffuse out of the well, and all current microwell methods can therefore only construct 3’ single-cell RNA expression profiling libraries.
Summary of the Invention
One object of the present invention is to provide a multi-nucleic acid co-labeled support.
Another object of the present invention is to provide a method for preparing the multi-nucleic acid co-labeled support.
Another object of the present invention is to provide applications of the multi-nucleic acid co-labeled support.
The present invention modifies a support with two or more kinds of nucleic acid molecules. One kind of nucleic acid molecule is used to capture a target compound from the reaction pool and to take part, together with the other kinds of molecules co-labeled on the surface of the same solid phase, in a specific biochemical process. Application areas include, but are not limited to, multiplex PCR library construction, single-cell RNA expression profiling, single-cell transcriptome sequencing library construction, and single-cell multi-omics sequencing library construction.
Specifically, in one aspect, the present invention provides a multi-nucleic acid co-labeled support comprising a support body and multiple nucleic acid labels located on the surface and/or in the interior of the support body. The nucleic acids labeled on a single support include at least: one or more first nucleic acid labels, whose function at least includes capturing a specific compound in the reaction system onto the support surface; and one or more second nucleic acid labels, whose function at least includes participating in a specified biochemical reaction of the specific compound captured on the support surface.
According to specific embodiments of the present invention, in the multi-nucleic acid co-labeled support of the present invention, the support body is a solid bead and/or a semi-solid hydrogel bead.
According to specific embodiments of the present invention, the multi-nucleic acid co-labeled support of the present invention is a composition comprising a plurality of supports.
According to specific embodiments of the present invention, in the multi-nucleic acid co-labeled support of the present invention, the numbers of first nucleic acid labels and second nucleic acid labels on the same support may each be ≥1 and/or ≤10^13.
According to specific embodiments of the present invention, in the multi-nucleic acid co-labeled support of the present invention, the sequences of the multiple first nucleic acid labels on the same support are the same or different; the sequences of the first nucleic acid labels on different supports are the same or different; the sequences of the multiple second nucleic acid labels on the same support are the same or different; or the sequences of the second nucleic acid labels on different supports are the same or different.
In another aspect, the present invention further provides a method for preparing the multi-nucleic acid co-labeled support, comprising:
labeling multiple nucleic acids onto the support body by a "grafting-to" and/or "grafting-on" approach to obtain the multi-nucleic acid co-labeled support.
According to specific embodiments of the present invention, the method for preparing the multi-nucleic acid co-labeled support of the present invention comprises:
modifying the support body and the nucleic acid, respectively, with functional units that can interact with each other, and reacting the two so that the nucleic acid is labeled onto the support body;
synthesizing the nucleic acid directly on the support body according to a predetermined nucleotide sequence; and/or
labeling the support body with nucleic acid by nucleic acid extension or ligation through biochemical reactions.
In another aspect, the present invention further provides use of the multi-nucleic acid co-labeled support in 5’ single-cell RNA expression profiling, construction of 5’ single-cell VDJ libraries on a microwell array platform, construction of 3’ single-cell RNA libraries, construction of single-cell transcriptome libraries, single-cell multi-omics studies, multiplex PCR, and/or construction of multiplex PCR sequencing libraries.
According to some specific embodiments of the present invention, the multi-nucleic acid co-labeled support of the present invention is used for 5’ single-cell RNA expression profiling, wherein a template switching sequence containing a cell tag and a molecular tag, and an RNA capture sequence, are immobilized on the support. Specifically, the support is labeled with at least two nucleic acid sequences: a first nucleic acid sequence and a second nucleic acid sequence. The first nucleic acid sequence contains at least a capture sequence, which captures the target nucleic acid molecule and serves as a primer for extension or reverse transcription. The second nucleic acid sequence includes a cell tag sequence, which labels all mRNA molecules derived from the same cell; different kinds of supports carry different cell tags. Preferably, the support captures, in a microwell of the chip, the RNA released by lysis of a single cell; template switching during reverse transcription labels RNA derived from the same cell with the same cell tag; and the cDNA is then amplified and finally constructed into a 5’ single-cell RNA expression profiling library.
According to some specific embodiments of the present invention, the multi-nucleic acid co-labeled support of the present invention is used to construct a 5’ single-cell VDJ library on a microwell array platform, wherein a template switching sequence containing a cell tag and a molecular tag, and an RNA capture sequence, are immobilized on the support. Specifically, the support is labeled with at least two nucleic acid sequences: a first nucleic acid sequence and a second nucleic acid sequence. The first nucleic acid sequence contains at least a capture sequence, which captures the target nucleic acid molecule and serves as a primer for extension or reverse transcription. The second nucleic acid sequence includes a cell tag sequence, a molecular tag sequence, and a template switching sequence. The cell tag sequence labels all mRNA molecules derived from the same cell. The molecular tag sequence labels each reverse-transcribed cDNA molecule, so that cDNA molecules reverse transcribed from different RNAs on the same support all carry different molecular tags. The template switching sequence can serve as a template on which the 3’ end of the reverse-transcribed cDNA continues to extend, so that the cDNA is labeled with the molecular tag sequence and the cell tag sequence. Different kinds of supports carry different cell tags. Preferably, the support captures, in a microwell of the chip, the RNA released by lysis of a single cell; template switching during reverse transcription labels RNA derived from the same cell with the same cell tag; TCR and BCR/Ig nucleic acid sequences are then enriched with constant-region primers for the TCR and BCR/Ig genes; and the products are finally fragmented and constructed into a high-throughput single-cell VDJ sequencing library.
According to some specific embodiments of the present invention, the multi-nucleic acid co-labeled support of the present invention is used to construct a 3’ single-cell RNA library, wherein a conditionally blockable random primer containing a cell tag, and an RNA capture sequence, are immobilized on the support. Specifically, the support is labeled with at least two nucleic acid sequences: a first nucleic acid sequence and a second nucleic acid sequence. The first nucleic acid sequence contains at least a capture sequence, which captures the target nucleic acid molecule and serves as a primer for extension or reverse transcription. The second nucleic acid sequence includes the conditionally blockable random primer containing the cell tag; the cell tag sequence labels all mRNA molecules derived from the same cell, and different kinds of supports carry different cell tags. Preferably, the support captures, in a microwell of the chip, the RNA released by lysis of a single cell, and the RNA is reverse transcribed into cDNA; the random primers containing the cell tag then label cDNA derived from the same cell with the same cell tag through second-strand synthesis; and the cDNA is amplified and constructed into a 3’ single-cell RNA library.
According to some specific embodiments of the present invention, the multi-nucleic acid co-labeled support of the present invention is used to construct a single-cell transcriptome library, wherein a random primer sequence containing a cell tag, and an RNA capture sequence, are immobilized on the support; different kinds of supports carry different cell tags, and any segment of an RNA molecule can be detected, without being limited to the 3’ end or the 5’ end. Preferably, the supports include two types, and each single support of either type carries at least two nucleic acid sequences: either the combination of a first nucleic acid sequence and a second nucleic acid sequence, or the combination of a third nucleic acid sequence and a second nucleic acid sequence. The first nucleic acid sequence contains at least a capture sequence for capturing the target nucleic acid molecule. The second nucleic acid sequence includes the random primer sequence containing the cell tag; the cell tag sequence labels all mRNA molecules derived from the same cell. The third nucleic acid sequence includes a cell tag sequence and a capture sequence. Preferably, the support captures, in a microwell of the chip, the RNA released by lysis of a single cell; RNA derived from the same cell is labeled with the same cell tag during reverse transcription; and the cDNA is then amplified and finally constructed into a single-cell RNA transcriptome library.
According to some specific embodiments of the present invention, the multi-nucleic acid co-labeled support of the present invention is used for single-cell multi-omics studies. Preferably, this includes constructing a library for measuring RNA expression levels and/or detecting protein expression levels by means of nucleic acid tags attached to proteins. An RNA capture sequence containing a cell tag and a capture sequence for the nucleic acid tags used to label proteins are immobilized on the support, and different kinds of supports carry different cell tags. Preferably, the first nucleic acid sequence contains at least a capture sequence, which captures the target nucleic acid molecule and serves as a primer for extension. The second nucleic acid sequence includes a cell tag sequence, a molecular tag sequence, and a template switching sequence. The cell tag sequence labels all mRNA molecules derived from the same cell. The molecular tag sequence labels each reverse-transcribed cDNA molecule, so that cDNA molecules reverse transcribed from different RNAs on the same support all carry different molecular tags. The template switching sequence can serve as a template on which the 3’ end of the reverse-transcribed cDNA continues to extend, so that the cDNA is labeled with the molecular tag sequence and the cell tag sequence. The third nucleic acid sequence includes a cell tag sequence, a molecular tag sequence, and a protein nucleic acid tag capture sequence; the protein nucleic acid tag capture sequence captures and extends the protein nucleic acid tags located in the same spatial compartment as the single cell under test. Preferably, the support captures, in a microwell of the chip, the RNA released by lysis of a single cell together with the nucleic acid tags of the proteins; RNA and protein nucleic acid tags derived from the same cell are labeled with the same cell tag during reverse transcription; and amplification finally yields a single-cell RNA transcriptome library and a protein-tag nucleic acid library.
According to some specific embodiments of the present invention, the multi-nucleic acid co-labeled support of the present invention is used to construct a multiplex PCR sequencing library, wherein primers that can interfere with one another are immobilized on different supports. Specifically, the supports comprise at least two kinds: one or more supports labeled with a first kind of primers and one or more supports labeled with a second kind of primers, each support being labeled with at least one pair of nucleic acid primers. A support of the first kind is labeled with a first nucleic acid primer pair, and a support of the second kind is labeled with a second nucleic acid primer pair that differs from the first. Each kind of support may independently and optionally carry further nucleic acid primer pairs. The target fragments amplified by the multiple primer pairs on the same support do not overlap on the template. Different supports carry different primer pairs and can therefore amplify different target regions, which may or may not partially overlap one another. Preferably, all supports are mixed in proportion and then combined with the nucleic acid template and the PCR enzyme reaction system, so that unbiased multiplex PCR is carried out in a single tube.
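The constraint that primer pairs on the same support amplify non-overlapping targets, while different supports may cover overlapping regions, can be illustrated with a simple grouping routine. The following Python sketch is an assumption-laden illustration rather than the design method of the invention: the `Amplicon` type, the half-open coordinate convention, and the greedy strategy are all chosen here for clarity.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Amplicon:
    """Target region of one primer pair on the template (hypothetical type)."""
    name: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def assign_to_supports(amplicons: Iterable[Amplicon]) -> List[List[Amplicon]]:
    """Greedily group amplicons so that no two in a group overlap.

    Each returned group stands for the primer pairs placed on one kind of
    support; amplicons in different groups are allowed to overlap.
    """
    supports: List[List[Amplicon]] = []
    for amp in sorted(amplicons, key=lambda a: (a.start, a.end)):
        for group in supports:
            # Groups are built in start order, so the last member has the
            # largest end coordinate in the group.
            if group[-1].end <= amp.start:
                group.append(amp)
                break
        else:
            supports.append([amp])
    return supports


# Four tiled amplicons covering a contiguous 650 bp region (made-up numbers).
tiles = [
    Amplicon("A", 0, 200),
    Amplicon("B", 150, 350),
    Amplicon("C", 300, 500),
    Amplicon("D", 450, 650),
]
for i, group in enumerate(assign_to_supports(tiles), start=1):
    print(f"support kind {i}:", [a.name for a in group])
```

With these example coordinates the tiled amplicons fall into two groups, which corresponds to the idea of covering a long contiguous fragment in one tube using two kinds of supports. Primer pairs that interfere for reasons other than overlap would need an additional separation criterion that this sketch does not model.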
In another aspect, the present invention further provides a kit comprising the multi-nucleic acid co-labeled support of the present invention. Preferably, the kit is applicable to 5’ single-cell RNA expression profiling, construction of 5’ single-cell VDJ libraries on a microwell array platform, construction of 3’ single-cell RNA libraries, construction of single-cell transcriptome libraries, single-cell multi-omics studies, multiplex PCR, and/or construction of multiplex PCR sequencing libraries.
More preferably, the kit further comprises one or more of the following compositions:
Composition 1: a mixture of supports carrying a template switching sequence containing a cell tag and a molecular tag together with an RNA capture sequence, a microwell chip, cell lysis buffer, reverse transcription reagents, nucleic acid amplification reagents, and a nucleic acid fragmentation and library construction module; a kit comprising Composition 1 can be used for 5’ single-cell RNA expression profiling;
Composition 2: a mixture of supports carrying a template switching sequence containing a cell tag and a molecular tag together with an RNA capture sequence, a microwell chip, cell lysis buffer, reverse transcription reagents, constant-region primers, nucleic acid amplification reagents, and a nucleic acid fragmentation and library construction module; a kit comprising Composition 2 can be used to construct 5’ single-cell VDJ libraries on a microwell array platform;
Composition 3: a mixture of supports carrying a random primer containing a cell tag together with an RNA capture sequence, a microwell chip, cell lysis buffer, reverse transcription reagents, a second-strand synthesis module, and nucleic acid amplification and extension reagents; a kit comprising Composition 3 can be used to construct 3’ single-cell RNA libraries;
Composition 4: a mixture of supports carrying a random primer sequence containing a cell tag together with an RNA capture sequence, a microwell chip, cell lysis buffer, reverse transcription reagents, a second-strand synthesis module, and nucleic acid amplification and extension reagents; a kit comprising Composition 4 can be used to construct single-cell transcriptome libraries;
Composition 5: a mixture of supports carrying a capture sequence, containing a cell tag, for protein-tag nucleic acids, a microwell chip, cell lysis buffer, reverse transcription reagents, and a nucleic acid fragmentation and library construction module; a kit comprising Composition 5 can be used for single-cell multi-omics studies;
Composition 6: a premixed mixture of primer-coupled supports, a multiplex PCR enzyme, and buffer, optionally further including index primers compatible with a high-throughput sequencer; a kit comprising Composition 6 can be used for multiplex PCR and/or construction of multiplex PCR sequencing libraries (the premixed primer-coupled supports, multiplex PCR enzyme, and buffer enable unbiased single-tube multiplex PCR; with the index primers compatible with a high-throughput sequencer, index PCR yields a multiplex PCR library suitable for sequencing analysis).
In summary, the present invention provides a multi-nucleic acid co-labeled support, a method for preparing it, and its applications. By modifying a solid-phase (including semi-solid) support with multiple kinds of nucleic acids, the technology of the present invention can capture nucleic acid molecules onto the solid-phase surface and carry out specific biochemical reactions together with the other kinds of nucleic acids modified on that surface. Particular kinds of multi-nucleic acid modified solid-phase supports can be used in fields such as multiplex PCR library construction, single-molecule long-fragment nucleic acid sequencing library construction, single-cell transcriptome sequencing library construction, and single-cell multi-omics sequencing library construction.
Description of the Drawings
Figures 1A-1C are schematic diagrams of the structures of the multi-nucleic acid co-labeled supports of the present invention.
Figures 2A and 2B are schematic diagrams of the application of the multi-nucleic acid co-labeled supports of the present invention to multiplex PCR reactions.
Figure 2C is a schematic diagram of the application of the multi-nucleic acid labeled supports of the present invention to the construction of a multiplex PCR sequencing library.
Figures 2D and 2E are schematic diagrams of the design method and structure of the multi-nucleic acid co-labeled supports of the present invention for multiplex PCR.
Figure 3A is a schematic structural diagram of the multi-nucleic acid co-labeled support of the present invention for 5’ single-cell RNA expression profiling and for constructing 5’ single-cell VDJ libraries on a microwell array platform.
Figure 3B is an experimental flow chart for applying the multi-nucleic acid labeled support shown in Figure 3A to 5’ single-cell RNA expression profiling and to constructing 5’ single-cell VDJ libraries on a microwell array platform.
Figure 3C is a schematic diagram of the preparation of the nucleic acid co-labeled support of the present invention for 5’ single-cell RNA expression profiling.
Figure 4A is a schematic structural diagram of the multi-nucleic acid co-labeled support of the present invention for 3’ single-cell RNA libraries.
Figure 4B is a schematic flow chart of the application of the multi-nucleic acid co-labeled support of the present invention to 3’ single-cell RNA libraries.
Figure 5A is a schematic structural diagram of the multi-nucleic acid co-labeled support of the present invention for constructing single-cell transcriptome libraries.
Figure 5B is a schematic flow chart of the use of the multi-nucleic acid co-labeled support of the present invention to construct single-cell transcriptome libraries.
Figure 6A is a schematic structural diagram of the multi-nucleic acid co-labeled support of the present invention for constructing multi-omics single-cell libraries.
Figure 6B is a schematic flow chart of the use of the multi-nucleic acid co-labeled support of the present invention to construct multi-omics single-cell libraries.
Figure 7A shows the agarose gel electrophoresis result of the membrane protein nucleic acid tag sequencing library constructed by the procedure of Example 1.
Figure 7B shows the fragment analysis result of the 5’ single-cell expression profiling library constructed by the procedure of Example 1.
Figure 7C shows the fragment analysis result of the T cell VDJ library constructed by the procedure of Example 1.
Figure 7D shows the fragment analysis result of the B cell VDJ library constructed by the procedure of Example 1.
Figure 8 shows the gene-level distribution of reads obtained by sequencing analysis of the 3’ single-cell RNA library constructed by the procedure of Example 2 and the single-cell transcriptome library constructed by the procedure of Example 3. The BD Rhapsody 3’ single-cell expression profiling library is the analysis result for a library constructed entirely with BD Rhapsody.
Figure 9 shows the fragment size analysis result of the multiplex PCR sequencing library constructed by the procedure of Example 4.
Detailed Description
The implementation of the present invention and its beneficial effects are described in detail below through specific embodiments and examples, which are intended to help the reader better understand the essence and features of the present invention and do not limit its implementable scope. Methods not specified in detail in the following embodiments and examples are carried out according to routine practice in the art or under the operating conditions recommended by the instrument manufacturer.
The terms "a", "an", "one", "the", and "said" used in the description of the present invention are intended to include the plural forms as well, unless the context clearly indicates otherwise or the context makes clear that the singular is meant.
本发明首先提供了多种核酸共标记的支持物的结构。如图1A至图1C所示,支持物本体可以是固体平面(图1A),也可以是固体珠子(图1B)或者半固态水凝胶(图1C);核酸标记可以位于固体表面(图1A和图1B)也可以位于水凝胶疏松的内部(图1C)。单个支持物上标记的核酸至少包括:一或多个第一核酸标记101,其作用至少包括捕获反应体系中的特定化合物到支持物表面(因此第一核酸标记101亦称为捕获核酸标记);一或多个第二核酸标记102,其作用至少包括可以参与到捕获到支持物表面的特定化合物的指定生物化学反应过程(因此第二核酸标记102亦称为反应核酸标记)。支持物上还可选择性包括其他种类的核酸标记1N。The present invention first provides structures for multiple nucleic acid co-labeled supports. As shown in Figures 1A to 1C, the support body can be a solid plane (Figure 1A), a solid bead (Figure 1B) or a semi-solid hydrogel (Figure 1C); nucleic acid labels can be located on a solid surface (Figure 1A) and Figure 1B) can also be located in the loose interior of the hydrogel (Figure 1C). The nucleic acid labeled on a single support at least includes: one or more first nucleic acid labels 101, the function of which at least includes capturing a specific compound in the reaction system to the surface of the support (so the first nucleic acid label 101 is also called a capture nucleic acid label); One or more second nucleic acid labels 102, the functions of which include at least being able to participate in a specified biochemical reaction process of a specific compound captured on the surface of the support (hence the second nucleic acid label 102 is also referred to as a reactive nucleic acid label). Optionally, other types of nucleic acid labels IN may also be included on the support.
According to some specific embodiments of the present invention, the multi-nucleic acid co-labeled support of the present invention is a composition comprising a plurality of the above supports (supports having the structures illustrated in FIG. 1A, FIG. 1B, and/or FIG. 1C).
In certain applications, the sequences of the multiple first nucleic acid labels 101 on the same support may be identical; in other applications, they may differ.
In certain applications, the sequences of the first nucleic acid labels 101 on different supports may be identical; in other applications, they may differ.
In certain applications, the sequences of the multiple second nucleic acid labels 102 on the same support may be identical; in other applications, they may differ.
In certain applications, the sequences of the second nucleic acid labels 102 on different supports may be identical; in other applications, they may differ.
Likewise, depending on the application, the sequences of the multiple other kinds of nucleic acid labels 1N on the same support may be identical or different, and the sequences of the other kinds of nucleic acid labels 1N on different supports may be identical or different.
Depending on the application, the numbers of first nucleic acid labels, second nucleic acid labels, and other kinds of nucleic acid labels on the same support may each be ≥ 1 and/or ≤ 10^13.
In certain applications, the functions of the first and second nucleic acid labels are interchangeable. That is, the same nucleic acid label may perform both the function of "capturing a specific compound in the reaction system onto the support surface" and the function of "participating in the specified biochemical reaction of the specific compound captured on the support surface" described herein. For example, such first and second nucleic acid labels may be the two primers of a primer pair.
The present invention also provides methods for making multi-nucleic acid co-labeled supports for different uses. Nucleic acid labeling of the support may follow either a "graft to" or a "graft from" scheme. Depending on the application, the support may be labeled using the "graft to" scheme alone, the "graft from" scheme alone, or a combination of the two.
In the "graft to" scheme, the support and the nucleic acid are each modified with functional units capable of interacting with each other. The functional units include but are not limited to one or more of hydroxyl, aldehyde, epoxy, amino, carboxyl and its activated forms, phosphate, alkynyl, azide, thiol, alkene, biotin, avidin, isothiocyanate, isocyanate, acyl azide, sulfonyl chloride, tosyl ester, and the like. Different kinds of nucleic acid labels (first nucleic acid labels, second nucleic acid labels, other kinds of nucleic acid labels) may be modified with the same functional unit or with different functional units. The modified support and nucleic acid are brought into thorough contact under specific conditions so that the interacting functional units react and link the two together.
In the "graft from" scheme, the nucleic acid may be synthesized directly on the support according to a preset nucleotide sequence, or the support may be labeled by nucleic acid extension or ligation through biochemical reactions. Nucleic acid modifications well known in the art may be incorporated into the labeled nucleic acid, including but not limited to one or more of amino, phosphate, alkynyl, azide, thiol, disulfide, alkene, biotin, azobenzene, methyl, spacer, photocleavable group, dI, dU, LNA, XNA, ribonucleotide bases, dideoxyribonucleotide bases, and the like.
The present invention also provides the use of the multi-nucleic acid co-labeled supports in multiplex PCR. In conventional multiplex PCR, all templates and primers are mixed in the same reaction system; as the number of target regions increases, the variety and total concentration of primer pairs increase accordingly, which readily leads to primer dimers and reduces the amplification efficiency of the target regions. Using the multi-nucleic acid co-labeled supports provided herein can substantially reduce primer-dimer formation in multiplex PCR, and, because mutual interference between PCR primers is avoided, the contiguous sequence of a long fragment can be analyzed in a single tube. As shown in FIG. 2A, in this use the multi-nucleic acid co-labeled supports comprise at least two kinds of support: one or more primer-labeled supports 1 of a first kind and one or more primer-labeled supports 2 of a second kind, each support being labeled with at least one nucleic acid primer pair. As illustrated, the support 1 of the first kind is labeled with a first nucleic acid primer pair (first forward primer 201F and first reverse primer 201R), and the support 2 of the second kind is labeled with a second nucleic acid primer pair different from the first (second forward primer 202F and second reverse primer 202R). Each of the two kinds of support may independently and optionally carry further nucleic acid primer pairs, such as other primer pairs (other forward primer 2NF and other reverse primer 2NR). Primer sequences may be optimized with PCR primer design software such as Primer Premier to reduce interference among the multiple primer pairs on the same magnetic bead. The target fragments amplified by the multiple primer pairs on the same support do not overlap on the template. Primer pairs labeled on different supports may differ and thus amplify different target regions, which may partially overlap or not overlap. Each primer labeled on a support includes at least an H region that can bind to the target region and be extended: first forward primer H region 201FH, first reverse primer H region 201RH, second forward primer H region 202FH, second reverse primer H region 202RH, other forward primer H region 2NFH, and other reverse primer H region 2NRH. In certain applications, each primer labeled on a support also includes at least a universal nucleic acid sequence U region: first forward primer U region 201FU, first reverse primer U region 201RU, second forward primer U region 202FU, second reverse primer U region 202RU, other forward primer U region 2NFU, and other reverse primer U region 2NRU. In certain applications, the forward primer U region (FU) or reverse primer U region (RU) sequences of the primers on all supports may be identical: first forward primer U region 201FU = second forward primer U region 202FU = other forward primer U region 2NFU, and first reverse primer U region 201RU = second reverse primer U region 202RU = other reverse primer U region 2NRU. In other applications, the FU or RU sequences of the primers on all supports may differ. When multiplex PCR is performed, supports labeled with different kinds of nucleic acid primers are mixed with the multiplex PCR reaction system in a preset ratio, which is determined according to the nucleic acid amplification efficiency on each kind of support. The ratio may be as low as 0.01 times, or as high as 100 times, the average ratio of the different kinds of support (for example, if primers P1/P2/P3 are labeled onto magnetic beads to form a first class of beads and primers P4/P5/P6 are labeled onto magnetic beads to form a second class, the ratio here refers to the number ratio of first-class to second-class beads when the beads are mixed). The multiplex PCR reaction system includes at least DNA template, DNA polymerase, dNTPs, buffer at a suitable concentration, and the like. As shown in FIG. 2B, at the start of the multiplex PCR reaction, one of the primers labeled on a support (for example, the first forward primer 201F or the second forward primer 202F) binds a single-stranded DNA template 203, 204 in the reaction system, confining it to the support surface, and the polymerase generates the complementary strand 205, 206 of the DNA template. The complementary strand 205, 206 then dissociates from the template, is bound by the other primer on the same support (for example, the first reverse primer 201R or the second reverse primer 202R), and is extended to give a nucleic acid strand 207, 208 complementary to the complementary strand 205, 206. Repeating this cycle yields supports carrying large numbers of nucleic acid sequences. Because the variety of primer pairs labeled on any one support is limited and those primer pairs are physically distant from the primer pairs on other supports, the likelihood of different primer pairs binding each other to form dimers is effectively reduced. At the same time, the number of target regions amplified in the multiplex PCR can be increased by labeling more supports with different kinds of nucleic acids and by increasing the number of primer pairs labeled on each support; for example, each kind of support may be coupled with up to 5 primer pairs, up to 10 primer pairs, or up to 100 primer pairs.
Furthermore, the present invention provides the use of the multi-nucleic acid co-labeled supports in constructing multiplex PCR sequencing libraries. As shown in FIG. 2C, the supports generated in the reaction of FIG. 2B, which carry the nucleic acid strands 207 and 208 complementary to the complementary strands of the DNA template, serve as templates for a second PCR step with sequencer-compatible primers, namely a third forward primer 209F and a third reverse primer 209R, yielding a nucleic acid library that can be sequenced. The sequencer-compatible primers include at least the universal binding sequence of the primers shown in FIG. 2A (the universal nucleic acid sequence U region, i.e., other forward primer U region 2NFU / other reverse primer U region 2NRU), a sample tag 2NFi/2NRi, and a sequencer-compatible nucleic acid sequence 2NFA/2NRA. Here the forward universal U regions on all supports are identical (first forward primer U region 201FU = second forward primer U region 202FU = other forward primer U region 2NFU), and the reverse universal U regions on all supports are likewise identical (first reverse primer U region 201RU = second reverse primer U region 202RU = other reverse primer U region 2NRU). Sequencers that may be used include but are not limited to the MGIseq, Illumina, Ion, PacBio, and Nanopore sequencing platforms.
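The composition of the second-step primers can be illustrated with a minimal Python sketch. It assumes a 5'-to-3' order of sequencer-compatible sequence, sample tag, then universal U region (the text lists the three parts but does not state their order), and every sequence below is a made-up placeholder rather than a sequence from this disclosure or from any sequencing vendor.

```python
VALID_BASES = set("ACGT")


def build_library_primer(adapter: str, sample_index: str, universal_region: str) -> str:
    """Join the three primer parts 5'->3'.

    Assumed order (not stated in the source): sequencer-compatible
    sequence (2NFA/2NRA), sample tag (2NFi/2NRi), universal U region
    (2NFU/2NRU). The U region is placed at the 3' end so that it can
    anneal to the first-step product.
    """
    parts = (("adapter", adapter),
             ("sample_index", sample_index),
             ("universal_region", universal_region))
    for label, seq in parts:
        if not seq or set(seq.upper()) - VALID_BASES:
            raise ValueError(f"{label} must be a non-empty A/C/G/T string")
    return (adapter + sample_index + universal_region).upper()


# Hypothetical placeholder sequences, for illustration only.
ADAPTER_2NFA = "AATGATACGG"
UNIVERSAL_2NFU = "ACACGACGCT"
SAMPLE_INDEXES = {"sample_1": "ATCACG", "sample_2": "CGATGT"}

# One forward primer per sample; only the sample tag differs.
forward_primers = {
    name: build_library_primer(ADAPTER_2NFA, index, UNIVERSAL_2NFU)
    for name, index in SAMPLE_INDEXES.items()
}
```

Because the U regions are shared across all supports, a single primer pair per sample is enough to amplify every first-step product in this sketch; only the sample tag changes between samples.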
The present invention also provides a method for making multi-nucleic acid co-labeled supports for use in multiplex PCR. As shown in FIG. 2D, when the contiguous base sequence of a long fragment 210 is to be determined by sequencing, amplification with a single primer pair is no longer sufficient. Multiple primer pairs must be designed, each amplifying its own fragment, and the products built into a library for sequencing. For example, a first primer pair (represented in FIG. 2D by the primer H regions, i.e., first forward primer H region 201FH and first reverse primer H region 201RH) amplifies a first target fragment 201, a second primer pair (second forward primer H region 202FH and second reverse primer H region 202RH) amplifies a second target fragment 202, and further primer pairs (other forward primer H region 2NFH and other reverse primer H region 2NRH) amplify further target fragments 2N; the sequencing results are finally assembled into the sequence of the long fragment 210. In a conventional liquid-phase multiplex PCR reaction, because the first target fragment 201 and the second target fragment 202 partially overlap, the second forward primer and the first reverse primer together generate a small non-target amplification product. The multiplex PCR reaction therefore has to be split into at least two tubes amplified in parallel, each containing primer pairs whose target fragments do not overlap: the first forward primer, first reverse primer, other forward primers, and other reverse primers in one tube, and the second forward primer and second reverse primer in another. With the multiplex PCR library construction method of the present invention, which uses multi-nucleic acid labeled supports, the first primer pair and the other primer pairs, whose target fragments do not overlap, can be labeled on the same magnetic bead (a first magnetic bead) to give a first kind of multi-nucleic acid co-labeled support, while the second primer pair, whose target fragment overlaps those of the primer pairs on the first magnetic bead, is labeled on a different magnetic bead (a second magnetic bead) to give a second kind of multi-nucleic acid co-labeled support. The primer pairs on the two kinds of support then do not interfere with each other even when amplified in the same tube. Following the primer design principle shown in FIG. 2D, the present invention pre-synthesizes a first primer pair 201F, 201R bearing a specific 5' modification 211, a second primer pair 202F, 202R bearing the specific 5' modification 211, and further primer pairs 2NF and 2NR bearing the specific 5' modification 211 (FIG. 2E). The specific 5' modification includes but is not limited to hydroxyl, aldehyde, epoxy, amino, carboxyl and its activated forms, phosphate, alkynyl, azide, thiol, alkene, biotin, avidin, isothiocyanate, isocyanate, acyl azide, sulfonyl chloride, tosyl ester, and the like; correspondingly, the selected support carries functional groups 212 including but not limited to epoxy, amino, carboxyl, alkynyl, azide, alkene, heavy metal, avidin, and the like. Under appropriate conditions, the support bearing functional groups 212 is brought into contact with and coupled to the nucleic acid primers bearing the specific 5' modification 211; in particular, primers capable of generating non-specific products together are coupled to different supports. For example, the first primer pair 201F, 201R and the other primer pairs 2NF, 2NR bearing the specific 5' modification 211 are coupled to first microbeads to form a first product 213, and the second primer pair 202F, 202R bearing the specific 5' modification 211 is coupled to second microbeads to form a second product 214 (FIG. 2E). Finally, the first product 213 and the second product 214 are mixed in proportion to form the final multi-nucleic acid labeled supports used for multiplex PCR library construction.
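The rule that primer pairs with overlapping target fragments go onto different beads can be sketched as a greedy interval-partitioning step. This is only an illustrative assumption about how one might automate the grouping; the source describes the grouping principle, not an algorithm. The function name, the coordinate convention (half-open `[start, end)` template positions), and the example coordinates are all hypothetical.

```python
from typing import Dict, List, Tuple


def assign_amplicons_to_beads(amplicons: Dict[str, Tuple[int, int]]) -> List[List[str]]:
    """Group amplicons into bead types so that no two amplicons on the
    same bead type overlap on the template.

    `amplicons` maps a primer-pair name to its (start, end) template
    coordinates, treated as half-open intervals. Amplicons are visited
    in order of start position and placed on the first bead type whose
    most recently added amplicon ends at or before the new start.
    """
    beads: List[List[str]] = []   # primer-pair names per bead type
    bead_ends: List[int] = []     # end coordinate of the last amplicon per bead type
    for name, (start, end) in sorted(amplicons.items(), key=lambda kv: kv[1][0]):
        for i, last_end in enumerate(bead_ends):
            if start >= last_end:          # no overlap with this bead type
                beads[i].append(name)
                bead_ends[i] = end
                break
        else:                              # overlaps every existing bead type
            beads.append([name])
            bead_ends.append(end)
    return beads


# Hypothetical tiling of a long fragment: 201 and 202 overlap, 202 and 2N overlap.
tiling = {"201": (0, 300), "202": (250, 550), "2N": (500, 800)}
bead_types = assign_amplicons_to_beads(tiling)
# bead_types == [["201", "2N"], ["202"]]
```

With this tiling the sketch reproduces the grouping described above: the first and other primer pairs share the first bead type, and the second primer pair sits alone on the second.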
The present invention also provides the use of the multi-nucleic acid co-labeled supports in 5' single-cell RNA expression profiling. It is well known that complex living organisms are composed of many cells with distinct properties, and that the types and amounts of RNA transcribed by each cell differ in a given state; detecting RNA transcription at the single-cell level is therefore of great importance. Current single-cell transcriptome technologies can be divided by throughput into low- to medium-throughput and high-throughput single-cell transcriptome sequencing. Low- to medium-throughput sequencing is represented by Smart-seq, in which the RNA obtained by directly lysing a single cell is reverse transcribed and amplified to construct a single-cell transcriptome. Library preparation for high-throughput sequencing is represented by water-in-oil microfluidic platforms and microwell array platforms, in which oligo dT primers or template switch oligos (TSO) carrying cell tags and molecular tags reverse transcribe mRNA molecules from different cells into cDNA molecules carrying the corresponding unique tags; subsequent sequencing then allows the mRNA expression of thousands of single cells to be analyzed simultaneously. Water-in-oil platforms encapsulate a single cell and a single microbead carrying a cell tag in one droplet, where lysis and reverse transcription are performed in a single step; depending on whether the sequencing read lies near the 3' end or the 5' end of the RNA, the resulting libraries are classified as 3' or 5' single-cell RNA expression profiling libraries. A microwell array platform is typically an array chip composed of microwells 20-60 μm in diameter. After a single cell is lysed in a microwell, mRNA with a polyA tail is captured by a microbead in the same well that carries cell-tagged oligo dT primers, and reverse transcription extension then labels the reverse transcription products of RNA from the same cell with the same, unique cell tag. The efficiency of current microwell-array single-cell library preparation platforms depends largely on the RNA capture efficiency of the oligo dT microbeads in the microwells, and, unlike water-in-oil platforms, microwell array platforms can currently construct only 3' single-cell RNA expression profiling libraries. By labeling the microbeads used in single-cell sequencing library preparation with multiple nucleic acids, the present invention can improve their RNA capture efficiency in the microwells and enables the preparation of 5' single-cell RNA expression profiling libraries.
The present invention provides the use of a multi-nucleic acid co-labeled support in 5' single-cell RNA expression profiling and in constructing 5' single-cell VDJ libraries on a microwell array platform.
As shown in FIG. 3A, the support here is a microbead (a solid microbead or a semi-solid hydrogel microbead) labeled with at least two nucleic acid sequences: a first nucleic acid sequence 301 and a second nucleic acid sequence 304.
The first nucleic acid sequence 301 includes at least a capture sequence 303, which captures the target nucleic acid molecule and serves as a primer for extension or reverse transcription; an example is an oligo dT sequence 15-40 bases long. The efficiency of RNA capture can be controlled by adjusting the number and density of the first nucleic acid sequence 301 on the support. In certain applications, the first nucleic acid sequence 301 also includes a first universal sequence 302 and a conditionally cleavable site X. Conditionally cleavable sites include but are not limited to one or more of disulfide modification, dU modification, RNA base modification, dI modification, dSpacer modification, AP site modification, photocleavable PC linker, and restriction endonuclease recognition sequences.
The second nucleic acid sequence 304 is composed of one or more of a second universal nucleic acid sequence 305, a cell tag sequence 306, a molecular tag sequence 307, and a template switching sequence 308. The second universal nucleic acid sequence 305 may include an adapter sequence matched to the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of an Illumina sequencer. The cell tag sequence 306 labels all mRNA molecules originating from the same cell: the labels on any one support carry the same cell tag, while different supports carry different cell tags. The cell tag sequence 306 may be a random or semi-random nucleic acid sequence, such as 12 degenerate bases NNNNNNNNNNNN, or a combination of several fixed nucleic acid sequences, such as a random combination of 96 8-base sequences with 96 8-base sequences and a further 96 8-base sequences, with or without linker nucleic acid regions between the 8-base sequences. The molecular tag sequence 307 labels each reverse-transcribed cDNA molecule; cDNA molecules reverse transcribed from different RNAs on the same support are each labeled with a different molecular tag. The molecular tag 307 may be a random or semi-random nucleic acid sequence 8-20 bases long, such as 9 random degenerate bases NNNNNNNNN, or NNNNNNNNNV. The template switching sequence 308 serves as a template that allows the 3' end of the cDNA reverse transcribed from the first nucleic acid sequence 301 to continue extending, so that the cDNA is labeled with the molecular tag sequence 307, the cell tag sequence 306, and the second universal nucleic acid sequence 305. The template switching sequence 308 includes, at least at its 3' end, two or more rG RNA bases or other modified G base analogs, such as LNA or XNA.
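The tag designs mentioned above differ widely in how many distinct tags they can encode. The following Python sketch computes the theoretical maximum for each example in the text; the helper names are illustrative, and the counts are upper bounds that ignore synthesis bias and any filtering of unsuitable sequences.

```python
# Number of bases each IUPAC nucleotide code can stand for.
IUPAC_DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degenerate_diversity(pattern: str) -> int:
    """Count the distinct sequences a degenerate pattern can encode."""
    total = 1
    for code in pattern.upper():
        total *= IUPAC_DEGENERACY[code]
    return total


def combinatorial_diversity(*segment_pool_sizes: int) -> int:
    """Count the tags obtainable by combining one sequence from each pool."""
    total = 1
    for size in segment_pool_sizes:
        total *= size
    return total


random_cell_tags = degenerate_diversity("N" * 12)           # 4**12 = 16,777,216
combined_cell_tags = combinatorial_diversity(96, 96, 96)    # 96**3 = 884,736
molecular_tags_9n = degenerate_diversity("NNNNNNNNN")       # 4**9 = 262,144
molecular_tags_9nv = degenerate_diversity("NNNNNNNNNV")     # 4**9 * 3 = 786,432
```

The combinatorial design yields fewer tags than 12 fully random bases, but each tag is drawn from a known list, which is one reason a fixed-segment design may be easier to error-correct during analysis.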
FIG. 3B shows the experimental workflow for using the multi-nucleic acid labeled support of FIG. 3A to construct a 5' single-cell library on a microwell array platform. In this application, each support is coupled with two nucleic acid labels: the first nucleic acid sequence 301 and the second nucleic acid sequence 304. When a single support labeled with the first nucleic acid sequence 301 and the second nucleic acid sequence 304 contacts RNA from a single cell, RNA 309 containing a sequence complementary to the second nucleic acid sequence 304 is captured by the first nucleic acid sequence 301 on the support and converted into a cDNA molecule 310 by the reverse transcription reaction system. When the cDNA extends to the 5' end of the RNA 309, a reverse transcriptase with terminal transferase activity adds consecutive C bases to the cDNA strand; the cDNA strand then base-pairs with a nearby second nucleic acid sequence 304 on the same support surface, which contains two or more rG bases or base analogs, and continues to extend through the second universal nucleic acid sequence 305, forming a complete cDNA molecule 310 carrying the cell tag and the molecular tag. Optionally, the cDNA molecule 310 may be released from the support via the cleavable site X to serve as the template for the next amplification step; alternatively, the extension strand complementary to the cDNA molecule 310, formed by single-primer extension from the second universal nucleic acid sequence 305, may serve as the template; or the support bearing the cDNA molecule 310 may itself serve as the template after enzymatic treatment removes the first nucleic acid sequences 301 and second nucleic acid sequences 304 that did not take part in reverse transcription. In the subsequent step, the cleaved or uncleaved cDNA molecule 310 is amplified by PCR with a primer pair containing the first universal sequence 302 and the second universal sequence 305 to form a double-stranded nucleic acid 311. The double-stranded nucleic acid 311 can then be used to analyze the types and abundance of RNA expressed in single cells through either of two library construction schemes. The first scheme aims at an unbiased analysis of the expression of all RNA molecules with polyA tails. The double-stranded nucleic acid 311 is randomly fragmented, end-repaired, and given an A base at the 3' end to form a molecular structure 312, which is then ligated to an adapter 313 bearing a T overhang and amplified with a first primer 315 containing a first sample index 317 and a second primer 319 containing a second sample index 321 to form a first final library 323. The first primer 315 consists of a sequencer-compatible first nucleic acid sequence 316, the first sample index 317, and a sequence 318 complementary to part of the long strand of the adapter 313. The second primer 319 consists of a sequencer-compatible second nucleic acid sequence 320, the second sample index 321, and a sequence 322 identical to part of the second universal nucleic acid sequence 305. This scheme may be replaced by other random library construction schemes that achieve the same purpose, including but not limited to transposase-based fragmentation or random primer extension. The second scheme aims at targeted analysis of the expression of genes of interest and can be implemented as a two-step multiplex PCR: a first gene-specific primer 324 and a second gene-specific primer 326 are each paired with the universal primer 305 to generate a first-step multiplex PCR product 325 and a second-step multiplex PCR product 327, respectively; finally, with the second-step multiplex PCR product 327 as template, amplification with the first primer 315 containing the first sample index 317 and the second primer 319 containing the second sample index 321 forms a second final library 328. Libraries constructed by targeted multiplex PCR can be used for immune repertoire analysis. Furthermore, the second-step multiplex PCR product 327 can also be processed according to the first, random fragmentation scheme to construct a full-length VDJ immune repertoire library for analyzing the VDJ sequences of T cell receptors and antibodies. Both the first final library 323 and the second final library 328 are then used for sequencing and data analysis.
The present invention also provides a method for making the nucleic acid co-labeled support used for 5' single-cell RNA expression profiling. As shown in FIG. 3C, cell tag sequence 306 consists of six sequentially linked regions: first cell tag 329, first linker region 330, second cell tag 331, second linker region 332, third linker region 333, and third cell tag 334. To co-label the support with first nucleic acid sequence 301 and second nucleic acid sequence 304, first nucleic acid sequence 301 and third nucleic acid sequence 335 are synthesized in advance with a specific 5' modification 340. The 5' modification 340 includes, but is not limited to, hydroxyl, aldehyde, epoxy, amino, carboxyl and its activated forms, phosphate, alkynyl, azide, sulfhydryl, alkene, biotin, avidin, isothiocyanate, isocyanate, acyl azide, sulfonyl chloride, and tosyl ester; correspondingly, the chosen support carries functional groups 339 including, but not limited to, epoxy, amino, carboxyl, alkynyl, azide, alkene, heavy metal, and avidin. Under suitable conditions, the support bearing functional groups 339 is contacted and coupled (step 341) with the 5'-modified first nucleic acid sequence 301 and third nucleic acid sequence 335 to form first product 342; the ratio of first nucleic acid sequence 301 to third nucleic acid sequence 335 on first product 342 can be adjusted by varying the concentrations added. First product 342 is then hybridized with fourth nucleic acid sequence 336 and extended (hybridization step 343) in a buffer containing a suitable salt concentration, dNTPs, and polymerase, giving second product 344. Reading from its 5' end, fourth nucleic acid sequence 336 contains the complement 332' of second linker region 332, the complement 331' of second cell tag 331, and the complement 330' of first linker region 330; third nucleic acid sequence 335 on the support hybridizes through its own linker region 330 to complement 330' on fourth nucleic acid sequence 336 and is extended through second cell tag 331 and second linker region 332. After the complementary strand has been removed under denaturing conditions, second product 344 is joined to fifth nucleic acid sequence 346 by DNA ligase (ligation step 345) to give third product 347. Fifth nucleic acid sequence 346 is a double-stranded DNA obtained by annealing first nucleic acid molecule 337 and second nucleic acid molecule 338 in advance. First nucleic acid molecule 337 contains, in order, third linker region 333, third cell tag 334, molecular tag sequence 307, and template switching sequence 308. Second nucleic acid molecule 338 contains at least the complement of part of second linker region 332 and of third linker region 333, and under certain conditions also the complement of part of third cell tag 334, so it can hybridize both to third linker region 333 on first nucleic acid molecule 337 and to second linker region 332 on second product 344, holding the two together for joining by the ligase. In particular, the 5' end of first nucleic acid molecule 337 sometimes carries a phosphate modification. Finally, third product 347 goes through elution step 348 under denaturing conditions to wash away the complementary strand, that is, second nucleic acid molecule 338, leaving support 349 labeled with first nucleic acid sequence 301 and second nucleic acid sequence 304. Support 349 can be used directly in the library construction workflow for 5' single-cell RNA expression profiling.
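The extension and ligation steps of FIG. 3C can be traced on paper with short strings. The following is a minimal sketch under stated assumptions: every sequence is an invented placeholder, third nucleic acid sequence 335 is assumed to end in first cell tag 329 followed by first linker region 330 (any upstream universal sequence is omitted), and the function names are hypothetical. It only illustrates why fourth nucleic acid sequence 336 is written as 332', 331', 330' from its 5' end.

```python
# Placeholder sequences only; none of them comes from the source document.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Hypothetical regions of cell tag sequence 306, written 5'->3'.
CELL_TAG_1 = "ACGTACGT"   # first cell tag 329
LINKER_1 = "GGATCA"       # first linker region 330
CELL_TAG_2 = "TTGCAAGC"   # second cell tag 331
LINKER_2 = "CTCGTA"       # second linker region 332
LINKER_3 = "AAGCTA"       # third linker region 333
CELL_TAG_3 = "CATGGTCA"   # third cell tag 334
UMI = "NNNNNNNNN"         # molecular tag sequence 307, left degenerate
TSO = "GGG"               # stand-in for template switching sequence 308


def splint_336() -> str:
    """Fourth nucleic acid sequence 336: 332', 331', 330' read from its 5' end."""
    return reverse_complement(LINKER_1 + CELL_TAG_2 + LINKER_2)


def extend_on_splint(primer: str, splint: str, anchor: str) -> str:
    """Extend a support-bound primer whose 3' end (anchor) pairs with the splint."""
    new_strand = reverse_complement(splint)  # what the polymerase synthesizes
    if not (primer.endswith(anchor) and new_strand.startswith(anchor)):
        raise ValueError("primer 3' end does not pair with the splint")
    return primer + new_strand[len(anchor):]


def ligate(upstream: str, downstream: str) -> str:
    """Join two strands end to end, as the ligase does on bridging strand 338."""
    return upstream + downstream


third_seq_335 = CELL_TAG_1 + LINKER_1  # assumed 3' portion of sequence 335
second_product_344 = extend_on_splint(third_seq_335, splint_336(), LINKER_1)
molecule_337 = LINKER_3 + CELL_TAG_3 + UMI + TSO
final_oligo = ligate(second_product_344, molecule_337)
print(final_oligo)
```

Because the reverse complement of 330-331-332 reads 332'-331'-330' from its 5' end, the support-bound strand is extended in exactly the order the text describes.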
The present invention also provides the use of a support co-labeled with multiple nucleic acids for 3' single-cell RNA libraries, in which the cell tag and the reverse transcription primer oligo dT are located at opposite ends of the cDNA molecule. FIG. 4A shows a support labeled with at least two nucleic acids. Here the support is a microbead (a solid bead or a semi-solid hydrogel bead) labeled with at least two nucleic acid sequences: first nucleic acid sequence 401 and second nucleic acid sequence 404.
First nucleic acid sequence 401 contains at least capture sequence 403, which captures the target nucleic acid molecule and serves as a primer for extension or reverse transcription, for example an oligo dT sequence 15-40 bases long. The efficiency of RNA capture can be controlled by adjusting the number and density of first nucleic acid sequences 401 on the support. For certain uses, first nucleic acid sequence 401 also includes first universal sequence nucleic acid 402.
Second nucleic acid sequence 404 consists of one or more of second universal nucleic acid sequence 405, cell tag sequence 406, primer sequence 407, and reversible blocking site 408. Second universal nucleic acid sequence 405 can include an adapter sequence matched to the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of Illumina sequencers, and optionally contains conditionally cleavable site X. Conditionally cleavable sites include, but are not limited to, disulfide modifications, dU modifications, RNA base modifications, dI modifications, DSpacer modifications, AP-site modifications, photocleavable PC linkers, and restriction endonuclease recognition sequences. Cell tag sequence 406 labels all mRNA molecules originating from the same cell; each support carries one cell tag throughout, while different kinds of supports carry different cell tags. Cell tag sequence 406 can be a random or semi-random nucleic acid sequence, such as 12 degenerate bases NNNNNNNNNNNN, or a combination of several fixed nucleic acid sequences, such as a random combination drawn from three sets of 96 different 8-base sequences, with or without linker regions between the 8-base sequences. Primer sequence 407 can act as a primer and be extended along a cDNA molecule to which it is complementary; it can bind to, and be extended on, the cDNA product reverse transcribed from capture sequence 403. Primer sequence 407 can be a random or semi-random nucleic acid sequence 5-15 bases long, such as six random degenerate bases NNNNNN, or a gene-specific sequence used to enrich targeted regions. Reversible blocking site 408 prevents primer sequence 407 from undergoing non-specific extension while first nucleic acid sequence 401 captures and extends on the target nucleic acid, and under specific conditions the block is removed so that primer sequence 407 can be extended as a primer. Reversible blocking site 408 can be a simple 3' phosphate modification, ddNTP modification, or C3 spacer modification, or a combination of a cleavable modification and an extension-blocking modification. The cleavable modification can be a DSpacer modification, an RNA base modification, a dU modification, and so on; extension-blocking modifications include, but are not limited to, LNA, XNA, 3' phosphate, inverted dT, ddNTP, C3 spacer, C6 spacer, and various fluorescent dye and quencher modifications. For example, reversible blocking site 408 can be (rN)NNNN-C3 or (rN)N-C3-C3-ddN, where rN is any degenerate ribonucleotide, N is any degenerate deoxyribonucleotide, C3 is the extension-blocking C3 spacer, and ddN is a dideoxyribonucleotide. Once this sequence forms a duplex with the target DNA, it can be recognized and cleaved by RNase H, exposing the 3' hydroxyl of primer sequence 407 and thereby activating its ability to be extended as a primer.
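The two cell tag designs mentioned above differ in how many distinct tags they can produce. As a rough illustration, the sketch below computes both capacities and draws one combinatorial tag. The pool contents are randomly generated placeholders, and the function names and the fixed random seed are assumptions made for this example only.

```python
import random


def degenerate_diversity(length: int) -> int:
    """Number of distinct tags for `length` fully degenerate (N) positions."""
    return 4 ** length


def combinatorial_diversity(pool_sizes) -> int:
    """Number of distinct tags when one segment is drawn from each pool."""
    total = 1
    for size in pool_sizes:
        total *= size
    return total


def make_pool(size: int, length: int, rng: random.Random) -> list:
    """Build a placeholder pool of `size` distinct sequences of `length` bases."""
    pool = set()
    while len(pool) < size:
        pool.add("".join(rng.choice("ACGT") for _ in range(length)))
    return sorted(pool)


def draw_cell_tag(pools, rng: random.Random, linker: str = "") -> str:
    """Pick one segment per pool and join them, optionally with a linker region."""
    return linker.join(rng.choice(pool) for pool in pools)


rng = random.Random(0)  # fixed seed so the example is repeatable
pools = [make_pool(96, 8, rng) for _ in range(3)]

print(degenerate_diversity(12))               # 16777216
print(combinatorial_diversity([96, 96, 96]))  # 884736
print(draw_cell_tag(pools, rng))
```

Twelve degenerate bases give more nominal tags, whereas the three-pool design draws every tag from a known list of 884,736 combinations.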
FIG. 4B shows the experimental workflow for constructing a 3' single-cell RNA library with the dual nucleic acid-labeled support shown in FIG. 4A. When a single support labeled with first nucleic acid sequence 401 and second nucleic acid sequence 404 comes into contact with RNA derived from a single cell, RNA 409 containing a sequence complementary to second nucleic acid sequence 404 is captured by first nucleic acid sequence 401 on the support and reverse transcribed in a reverse transcription reaction system to form cDNA molecule 410. High-temperature denaturation then separates cDNA molecule 410 from the RNA, after which the cDNA pairs with the primer sequence 407 region of a nearby second nucleic acid sequence 404 on the surface of the same support. Preferably, cDNA molecule 410 can pair with more than one second nucleic acid sequence 404 on the support surface. Further, for a primer sequence 407 paired with cDNA molecule 410, an appropriate enzyme recognizes and removes reversible blocking site 408 to expose the 3' hydroxyl of primer sequence 407. This can be done by RNase H cleaving at the ribonucleotide, by alkaline phosphatase alone acting on a 3' phosphate, by an endonuclease that cuts at an AP site to leave a 3' hydroxyl, or by USER enzyme or an AP-site excision enzyme combined with alkaline phosphatase. Extension by a DNA polymerase with strand displacement activity then forms first nucleic acid molecule 411, which carries the complement of first nucleic acid sequence 401 at one end and cell tag sequence 406 at the other. The support carrying first nucleic acid molecule 411 can be used to form complementary strand 412 by single-primer amplification, which is eluted from the support and used as the amplification template; alternatively, cleavage at conditionally cleavable site X releases second nucleic acid molecule 413 directly from the support as the template for the next amplification step, giving double-stranded nucleic acid product 414. The forward and reverse amplification primers contain all or part of the sequences of first universal sequence nucleic acid 402 and second universal sequence nucleic acid 405, respectively.
Further, double-stranded nucleic acid product 414 can be used to analyze the type and abundance of single-cell RNA expression through two library construction approaches. The first aims at unbiased analysis of the expression of all RNA molecules with polyA tails: double-stranded nucleic acid product 414 is amplified with first primer 415 containing first sample index 417 and second primer 419 containing second sample index 421 to form first final library 423. First primer 415 comprises sequencer-compatible first nucleic acid sequence 416, first sample index 417, and first primer hybridization region 418; second primer 419 comprises sequencer-compatible second nucleic acid sequence 420, second sample index 421, and second primer hybridization region 422. This approach can be replaced by other random library construction schemes that achieve the same purpose, including but not limited to transposase-based fragmentation or random primer extension. The second approach aims at targeted analysis of the expression of genes of interest and can be implemented by two-step multiplex PCR: primer pairs formed by first gene-specific primer 424 or second gene-specific primer 426 together with universal primer 305 generate first-step multiplex PCR product 425 and second-step multiplex PCR product 427, respectively, and second-step product 427 is finally used as the template for amplification with first primer 415 containing first sample index 417 and second primer 419 containing second sample index 421 to form second final library 428. Libraries built by targeted multiplex PCR can be used for immune repertoire analysis, in particular for analyzing full-length T cell receptor and antibody VDJ sequences. Both first final library 423 and second final library 428 are further used for sequencing and information analysis.
The present invention further provides the use of a support co-labeled with multiple nucleic acids for constructing single-cell transcriptome libraries, in which the cell tag can label any position along the RNA strand to form cDNA molecules carrying cell and molecular tags. FIG. 5A shows two types of nucleic acid labeling structures on supports. Here the supports are microbeads or hydrogel beads, and a single support carries at least two nucleic acid sequences, for example the combination of first nucleic acid sequence 501 and second nucleic acid sequence 505, or the combination of third nucleic acid sequence 509 and second nucleic acid sequence 505.
First nucleic acid sequence 501 contains at least capture sequence 503 for capturing the target nucleic acid molecule, for example an oligo dT sequence 15-40 bases long. The efficiency of RNA capture can be controlled by adjusting the number and density of first nucleic acid sequences 501 on the support. First nucleic acid sequence 501 also includes polymerase extension blocking site 504, which prevents capture sequence 503 from being extended as a primer along the captured nucleic acid molecule; such sites include, but are not limited to, LNA, XNA, 3' phosphate, inverted dT, ddNTP, C3 spacer, C6 spacer, and various fluorescent dye and quencher modifications. For certain uses, first nucleic acid sequence 501 also includes first universal nucleic acid sequence 502, and the capture efficiency of capture sequence 503 can be tuned by adjusting the sequence and length of first universal nucleic acid sequence 502.
Second nucleic acid sequence 505 consists of one or more of second universal nucleic acid sequence 506, cell tag sequence 507, and primer sequence 508. Second universal nucleic acid sequence 506 can include an adapter sequence matched to the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of Illumina sequencers. Cell tag sequence 507 labels all mRNA molecules originating from the same cell; each support carries one cell tag throughout, while different kinds of supports carry different cell tags. Cell tag sequence 507 can be a random or semi-random nucleic acid sequence, such as 12 degenerate bases NNNNNNNNNNNN, or a combination of several fixed nucleic acid sequences, such as a random combination drawn from three sets of 96 different 8-base sequences, with or without linker regions between the 8-base sequences. Primer sequence 508 can bind an RNA template as a primer and be extended into a cDNA molecule; in particular, it can bind to and be extended on RNA captured by capture sequence 503. Primer sequence 508 can be a random or semi-random nucleic acid sequence 5-15 bases long, such as six random degenerate bases NNNNNN, or a gene-specific sequence used to enrich targeted regions.
Third nucleic acid sequence 509 consists of one or more of second universal nucleic acid sequence 506, cell tag sequence 507, molecular tag sequence 510, and capture sequence 503. Second universal nucleic acid sequence 506 can include an adapter sequence matched to the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of Illumina sequencers. Cell tag sequence 507 labels all mRNA molecules originating from the same cell; each support carries one cell tag throughout, while different kinds of supports carry different cell tags. Cell tag sequence 507 can be a random or semi-random nucleic acid sequence, such as 12 degenerate bases NNNNNNNNNNNN, or a combination of several fixed nucleic acid sequences, such as a random combination drawn from three sets of 96 different 8-base sequences, with or without linker regions between the 8-base sequences. Molecular tag sequence 510 labels each reverse-transcribed cDNA molecule, so that cDNA molecules reverse transcribed from different RNAs on the same support all receive different molecular tags. Molecular tag 510 can be a random or semi-random nucleic acid sequence 5-20 bases long, such as nine random degenerate bases NNNNNNNNN, or NNNNNNNNNV. Capture sequence 503 captures the target nucleic acid molecule and is, for example, an oligo dT sequence 15-40 bases long.
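The molecular tag patterns NNNNNNNNN and NNNNNNNNNV use standard IUPAC degenerate codes, where N stands for any base and V for A, C, or G. The sketch below is a minimal illustration with hypothetical function names. It counts how many concrete sequences a pattern allows and checks whether an observed tag fits it. That a trailing V keeps the tag from ending in T next to an oligo dT stretch is a commonly given rationale, not something the source states.

```python
# Subset of the IUPAC nucleotide codes needed for the patterns in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",  # any base
    "V": "ACG",   # any base except T
}


def pattern_size(pattern: str) -> int:
    """Number of concrete sequences a degenerate pattern can represent."""
    size = 1
    for code in pattern:
        size *= len(IUPAC[code])
    return size


def matches(seq: str, pattern: str) -> bool:
    """True if `seq` is one of the sequences allowed by `pattern`."""
    return len(seq) == len(pattern) and all(
        base in IUPAC[code] for base, code in zip(seq, pattern)
    )


print(pattern_size("NNNNNNNNN"))               # 262144
print(pattern_size("NNNNNNNNNV"))              # 786432
print(matches("ACGTACGTAC", "NNNNNNNNNV"))     # True
print(matches("ACGTACGTAT", "NNNNNNNNNV"))     # False
```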
FIG. 5B shows the experimental workflow for constructing single-cell transcriptome libraries with the two types of dual nucleic acid-labeled supports. When a single nucleic acid-labeled support comes into contact with RNA derived from a single cell, RNA 512 containing a sequence complementary to capture sequence 503 is captured by first nucleic acid sequence 501 or third nucleic acid sequence 509 on the support. Under suitable conditions, RNA 512 captured on the support surface binds primer sequence 508 of second nucleic acid sequence 505 and is reverse transcribed in a reverse transcription reaction system to form cDNA molecule 514. In particular, when a support carrying third nucleic acid sequence 509 is used, cDNA molecule 514 also includes cDNA primed from third nucleic acid sequence 509. After reverse transcription, the support can be further treated with enzymes to remove the first nucleic acid sequence 501, third nucleic acid sequence 509, and second nucleic acid sequence 505 molecules that did not take part in the reaction. The support carrying cDNA molecule 514 can then be used to analyze the type and abundance of single-cell RNA expression through two library construction approaches. The first aims at unbiased analysis of the expression of all RNA molecules with polyA tails and uses extension and amplification with random primer 517. Random primer 517 consists of universal primer sequence 515 and random base sequence 516: universal primer sequence 515 can include an adapter sequence matched to the sequencer, such as the Read2 Sequencing Primer or Read1 Sequencing Primer of Illumina sequencers, and random base sequence 516 can be a random or semi-random nucleic acid sequence 5-15 bases long, such as nine consecutive degenerate bases NNNNNNNNN. Under suitable conditions, random primer 517 hybridizes to cDNA molecule 514, and a DNA polymerase with strand displacement activity produces complementary strand 518 of cDNA molecule 514. Complementary strand 518 can be eluted from the support and amplified with a primer pair containing second universal nucleic acid sequence 506 and universal primer sequence 515 to generate double-stranded product 519. Double-stranded product 519 is amplified with first primer 520 containing first sample index 522 and second primer 524 containing second sample index 526 to form first final library 528. First primer 520 comprises sequencer-compatible first nucleic acid sequence 521, first sample index 522, and first primer hybridization region 523; second primer 524 comprises sequencer-compatible second nucleic acid sequence 525, second sample index 526, and part of second primer hybridization region 527. This approach can be replaced by other random library construction schemes that achieve the same purpose, including but not limited to ultrasonic fragmentation, enzymatic fragmentation, or transposase-based fragmentation. The second approach aims at targeted analysis of the expression of genes of interest and can be implemented by two-step multiplex PCR: primer pairs formed by first gene-specific primer 529 or second gene-specific primer 531 together with universal primer 506 generate first-step multiplex PCR product 530 and second-step multiplex PCR product 532, respectively, and second-step product 532 is finally used as the template for amplification with first primer 520 and second primer 524 to form second final library 533. Libraries built by targeted multiplex PCR can be used for immune repertoire analysis, in particular for analyzing full-length T cell receptor and antibody VDJ sequences. Both first final library 528 and second final library 533 are further used for sequencing and information analysis.
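The source does not describe the information analysis step. For oligos that carry both a cell tag and a molecular tag, such as third nucleic acid sequence 509, a common approach is to count each distinct molecular tag once per cell and gene. The sketch below illustrates that idea only; the record layout, gene names, and tag values are invented for the example, and real pipelines also correct sequencing errors in tags, which is omitted here.

```python
from collections import defaultdict


def count_molecules(records):
    """Count distinct molecular tags for every (cell tag, gene) pair.

    `records` is an iterable of (cell_tag, molecular_tag, gene) tuples, one per
    sequencing read. Reads sharing all three values are treated as PCR copies
    of the same original cDNA molecule and counted once.
    """
    tags_seen = defaultdict(set)
    for cell_tag, molecular_tag, gene in records:
        tags_seen[(cell_tag, gene)].add(molecular_tag)
    return {key: len(tags) for key, tags in tags_seen.items()}


# Invented example reads: two cells, one gene, one PCR duplicate.
reads = [
    ("AAACCCGGGTTT", "ACGTACGTA", "CD3E"),
    ("AAACCCGGGTTT", "ACGTACGTA", "CD3E"),  # duplicate of the read above
    ("AAACCCGGGTTT", "TTGACCATG", "CD3E"),
    ("GGGTTTAAACCC", "ACGTACGTA", "CD3E"),
]

for (cell_tag, gene), n in sorted(count_molecules(reads).items()):
    print(cell_tag, gene, n)
```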
The present invention also provides the use of supports co-labeled with multiple nucleic acids in single-cell multi-omics research. According to the central dogma of biology, DNA carrying genetic information passes that information to RNA through transcription, RNA is translated into protein, and proteins ultimately carry out the main biological functions. However, because of the complexity of physiological systems, RNA and protein expression levels do not always agree, and RNA cannot directly reflect post-translational modifications or interactions of proteins. It is therefore important to study RNA expression and protein expression in the same cell at the same time. The present invention discloses a nucleic acid-labeled support structure that allows RNA expression levels and sequences to be analyzed together with protein expression and interactions.
As shown in FIG. 6A, the support here is a microbead (a solid bead or a semi-solid hydrogel bead) labeled with at least three nucleic acid sequences: first nucleic acid sequence 601, second nucleic acid sequence 604, and third nucleic acid sequence 609.
First nucleic acid sequence 601 contains at least capture sequence 603, which captures the target nucleic acid molecule and is extended as a primer, for example an oligo dT sequence 15-40 bases long. The efficiency of RNA capture can be controlled by adjusting the number and density of first nucleic acid sequences 601 on the support. For certain uses, first nucleic acid sequence 601 also includes first universal nucleic acid sequence 602 and conditionally cleavable site X. Conditionally cleavable sites include, but are not limited to, disulfide modifications, dU modifications, RNA base modifications, dI modifications, DSpacer modifications, AP-site modifications, photocleavable PC linkers, and restriction endonuclease recognition sequences.
The second nucleic acid sequence 604 consists of one or more of a second universal nucleic acid sequence 605, a cell tag sequence 606, a molecular tag sequence 607, and a template switching sequence 608. The second universal nucleic acid sequence 605 may include an adapter sequence matched to the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of an Illumina sequencer. The cell tag sequence 606 labels all mRNA molecules originating from the same cell: each support carries a single cell tag, while different kinds of supports carry different cell tags. The cell tag sequence 606 may be a random or semi-random nucleic acid sequence, such as the 12 bp degenerate sequence NNNNNNNNNNNN, or a combination of several fixed nucleic acid sequences, such as a random combination of three segments, each drawn from 96 kinds of 8-base sequences; the 8-base segments may or may not be separated by linker regions. The molecular tag sequence 607 labels each reverse-transcribed cDNA molecule, so that cDNA molecules reverse transcribed from different RNAs on the same support carry different molecular tags. The molecular tag 607 may be a random or semi-random nucleic acid sequence 8-20 bases long, such as the 9 random degenerate bases NNNNNNNNN, or NNNNNNNNNV. The template switching sequence 608 serves as a template so that the 3' end of the cDNA reverse transcribed from the first nucleic acid sequence 601 continues to extend and acquires the molecular tag sequence 607, the cell tag sequence 606, and the second universal nucleic acid sequence 605. The template switching sequence 608 includes, at least at its 3' end, two or more RNA bases rG or other modified G analogs, such as LNA or XNA.
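As a rough illustration of how many distinct tags the designs described above can encode, the following sketch counts the sequences matched by a degenerate pattern and the combinations produced by joining fixed segments. The function names are hypothetical, and only the IUPAC codes that appear in the text (N = any base, V = A, C, or G) are handled.

```python
from math import prod

# Number of bases each symbol can stand for (IUPAC: N = any, V = A/C/G).
IUPAC_CHOICES = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 4, "V": 3}


def count_sequences(pattern: str) -> int:
    """Count the distinct sequences matched by a degenerate pattern."""
    return prod(IUPAC_CHOICES[base] for base in pattern.upper())


def combinatorial_tags(segment_pool_sizes) -> int:
    """Count the tags formed by joining one segment from each pool."""
    return prod(segment_pool_sizes)


print(count_sequences("N" * 12))         # 12-base degenerate cell tag
print(combinatorial_tags([96, 96, 96]))  # three 8-base segments, 96 kinds each
print(count_sequences("NNNNNNNNNV"))     # molecular tag ending in V
```

These counts are upper bounds on diversity; synthesis bias and sequencing errors reduce the number of tags that can be distinguished in practice.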
The third nucleic acid sequence 609 consists of one or more of a third universal nucleic acid sequence 610, the cell tag sequence 606, the molecular tag sequence 607, and a protein nucleic acid tag capture sequence 611, where the cell tag sequence 606 and the molecular tag sequence 607 have the same structure as on the second nucleic acid sequence 604. The third universal nucleic acid sequence 610 differs from the second universal nucleic acid sequence 605 and contains an adapter sequence matched to the sequencer, such as the Read1 Sequencing Primer or Read2 Sequencing Primer of an Illumina sequencer. The protein nucleic acid tag capture sequence 611 captures, and extends along, a protein nucleic acid tag located in the same spatial structure as the single cell under test. "Same spatial structure" means that the protein nucleic acid tag may be inside the cell, on the cell surface, or in the chamber or droplet that contains the cell.
FIG. 6B is a flow chart of the experimental procedure for constructing multi-omics single-cell libraries with the support of FIG. 6A, which carries three kinds of nucleic acid labels. First, the cells to be tested are contacted with antibody molecules 612 that recognize a specific protein and have been pre-conjugated to a nucleic acid label, and nonspecifically bound nucleic acid-conjugated antibody is washed away. The antibody molecule 612 comprises a sequence 613 complementary to the protein nucleic acid tag capture sequence 611, a protein-specific sequence 614, a fourth universal primer sequence 615, and a molecule 616. The fourth universal primer sequence 615 differs from both the second universal nucleic acid sequence 605 and the third universal nucleic acid sequence 610 and contains an adapter sequence matched to the sequencer, such as the Read2 Sequencing Primer or Read1 Sequencing Primer of an Illumina sequencer. In this workflow the molecule 616 is a specific antibody, but it may also be a small-molecule compound, saccharide, peptide, or other substance that binds the target protein.
When a single support contacts a single cell bound with nucleic acid-conjugated antibody molecules 612, cell lysis releases RNA 617 and the nucleic acid-conjugated antibody molecules 612. These are captured by the first nucleic acid sequence 601 and the third nucleic acid sequence 609 on the support, respectively, and converted into cDNA molecule 618 or DNA molecule 619 in the reverse transcription reaction. When the cDNA extends to the 5' end of RNA 617, a reverse transcriptase with terminal nucleotidyl transferase activity adds consecutive C bases to the cDNA strand. The cDNA strand then hybridizes to a nearby template switching sequence 608 on the surface of the same support, which contains two or more rG bases or their analogs, and continues to extend through the second universal nucleic acid sequence 605 to form a complete cDNA molecule 618 carrying a cell tag and a molecular tag.
Optionally, the cDNA molecule 618 and the DNA molecule 619 can be released from the support at the cleavable site X and used as templates for the next amplification step. Alternatively, the strands complementary to cDNA molecule 618 or DNA molecule 619, formed by single-primer extension from the second universal nucleic acid sequence 605 and the fourth universal nucleic acid sequence 615, can be used as templates. As a further alternative, the first nucleic acid sequence 601, second nucleic acid sequence 604, and third nucleic acid sequence 609 that did not take part in reverse transcription can be removed from the support by enzymatic treatment, and the support bearing cDNA molecule 618 and DNA molecule 619 can be used as the template. In the subsequent step, the mixture of cleaved or uncleaved cDNA molecule 618 and DNA molecule 619 is amplified by PCR with two primer pairs, first universal nucleic acid sequence 602/second universal nucleic acid sequence 605 and third universal nucleic acid sequence 610/fourth universal nucleic acid sequence 615, to form a mixture of the first double-stranded nucleic acid product 621 and the second double-stranded nucleic acid product 620.
The resulting double-stranded nucleic acid products can then be used for single-cell multi-omics analysis through three library construction schemes.
The first scheme constructs a nucleic acid library for analyzing the abundance of the proteins to be detected. The first library 630 is obtained directly by PCR amplification with a first index primer 622 and a second index primer 626. The first index primer 622 comprises, linked in order, a sequencer-compatible first nucleic acid sequence 623, a first sample index 624, and a nucleic acid sequence 625 complementary to the fourth universal nucleic acid sequence 615. The second index primer 626 comprises, linked in order, a sequencer-compatible second nucleic acid sequence 627, a second sample index 628, and a nucleic acid sequence 629 complementary to the third universal nucleic acid sequence 610.
The second scheme aims at unbiased analysis of the expression of all RNA molecules with a polyA tail. The mixture of the first and second double-stranded nucleic acid products is randomly fragmented, end-repaired, and A-tailed at the 3' end to form molecular structure 631, which is then ligated to an adapter 632 with a T overhang and amplified with a first primer 634 containing the first sample index 624 and a second primer 636 containing the second sample index 628 to form the second final library 638. The first primer 634 comprises, linked in order, the sequencer-compatible first nucleic acid sequence 623, the first sample index 624, and a nucleic acid sequence 635 complementary to the first universal nucleic acid sequence 602. The second primer 636 comprises, linked in order, the sequencer-compatible second nucleic acid sequence 627, the second sample index 628, and a nucleic acid sequence 637 complementary to the second universal nucleic acid sequence 605. This method can be replaced by other random library construction schemes that achieve the same purpose, including but not limited to transposase-based fragmentation or random primer extension.
The third scheme aims at targeted analysis of the expression of genes of interest and can be implemented by two-step multiplex PCR. A first gene-specific primer 639 and a second gene-specific primer 641 are each paired with the second universal primer 605 to produce the first-step multiplex PCR product 640 and the second-step multiplex PCR product 642, respectively. The second-step multiplex PCR product 642 is then used as the template and amplified with the first primer 634 containing the first sample index 624 and the second primer 636 containing the second sample index 628 to form the third final library 643. Libraries built by targeted multiplex PCR can be used for immune repertoire analysis. Furthermore, the second-step multiplex PCR product 642 can also be processed with the random fragmentation library construction scheme to build a full-length VDJ immune repertoire library for analyzing T cell receptor and antibody VDJ sequences.
The first library 630, the second final library 638, and the third final library 643 are all subsequently used for sequencing and data analysis.
Example 1: Application of the multi-nucleic-acid co-labeled support to 5' single-cell RNA expression profile library, VDJ library, and multi-omics library construction
In this example, supports co-labeled with multiple nucleic acids are prepared by the following steps and used to construct a 5' single-cell RNA expression profile library, VDJ libraries, and a multi-omics library.
1 Preparation of magnetic beads labeled with multiple nucleic acids
1.1 Synthesize single-stranded nucleic acids with the following sequences.
Figure PCTCN2020106089-appb-000001
1.2 In 0.25 M EDC, separately mix 300 pmol of each of 384 kinds of amino-modified CB1 single-stranded nucleic acid (SEQ ID No. 2) with 300 pmol of amino-modified dT single-stranded nucleic acid (SEQ ID No. 1) and 60,000 carboxyl magnetic beads (30 μm) by rotation at room temperature for 3 hours; wash twice to obtain 384 kinds of nucleic acid-labeled magnetic beads.
1.3 Pool the 384 kinds of nucleic acid-labeled magnetic beads obtained in step 1.2, mix thoroughly, and distribute them evenly into a 384-well plate.
1.4 Anneal the single-stranded nucleic acids CB2-TSO (SEQ ID No. 3; the three "(rG)" at its 3' end denote RNA G bases) and CB2-T7 (SEQ ID No. 5), which carry the same cell barcode, with rCB2 (SEQ ID No. 4) to form double-stranded structures with sticky ends. The n'n'n'n'n'n'n'n'CTGTAG sequence in rCB2 is the reverse complement of CTACAGnnnnnnnn in CB2-TSO and CB2-T7; nnnnnnnn is the 8 bp cell barcode sequence, of which there are 384 types in this example.
1.5 Add each of the 384 kinds of annealed CB2-rCB2 double-stranded nucleic acid to the 384-well plate containing the magnetic beads in the following proportions and react at 22 °C for 30 minutes.
Reagent                                  Volume (50 μL reaction)
2X Rapid Ligation Buffer                 25 μL
CB2-rCB2 double-stranded nucleic acid    3 μL
T4 DNA Ligase                            3 μL
RNase-free water                         to 50 μL
1.6 After the reaction, pool and mix the 384 kinds of magnetic beads, treat at 95 °C to remove the complementary strand, and store at 4 °C until use.
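The reverse-complement relationship used for the splint in step 1.4, and the number of barcode combinations produced by the two rounds of labeling, can be checked with a short sketch. The example barcode is hypothetical and chosen only for illustration; the count assumes that the CB1 round and the CB2 round each contribute 384 independent barcode types.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def splint_region(fixed_prefix: str, barcode: str) -> str:
    """Region of an rCB2-style splint that pairs with prefix + barcode."""
    return reverse_complement(fixed_prefix + barcode)


# Hypothetical 8-base cell barcode, for illustration only.
example_barcode = "AAGGCTTC"

print(reverse_complement("CTACAG"))              # fixed part of the splint
print(splint_region("CTACAG", example_barcode))  # barcode complement + fixed part
print(384 * 384)                                 # combinations from two rounds
```

The splint region ends in CTGTAG, matching the n'n'n'n'n'n'n'n'CTGTAG layout given for rCB2.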
2 Cell-tag labeling of single-cell cDNA and protein using the multi-nucleic-acid labeled magnetic beads
2.1 Incubate an anti-human CD4 antibody conjugated to a T7 nucleic acid sequence with fresh PBMCs so that the antibody binds fully to the CD4 protein on the cell membrane surface, then wash with fresh PBS.
2.2 Prepare the microwell chip according to the instructions supplied with the BD Rhapsody single-cell sequencing kit and load 10,000 of the incubated PBMCs.
2.3 Add 300,000 of the magnetic beads prepared in step 1 to the microwell plate, draw them into the wells magnetically, and wash away the excess beads.
2.4 Add the lysis buffer supplied with the kit; after 2 min, retrieve the beads magnetically and wash them.
2.5 Prepare 200 μL of reverse transcription mix as follows and resuspend the beads in it.
Reagent                                    Volume (μL, 200 μL reaction)
SuperScript II first-strand buffer (5×)    40
DTT (100 mM)                               10
Betaine (5 M)                              40
MgCl2 (1 M)                                1.2
dNTP (10 mM)                               20
RNase inhibitor                            5
SuperScript II reverse transcriptase       10
RNase-free water                           73.8
2.6 Perform reverse transcription under the following conditions.
Figure PCTCN2020106089-appb-000002
2.7 Remove the reverse transcription supernatant directly, add 200 μL of the following exonuclease reaction mix, and react at 37 °C for 30 minutes to remove excess primers from the beads.
Reagent                    Volume (200 μL reaction)
10X exonuclease buffer     20 μL
Exonuclease                10 μL
RNase-free water           170 μL
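The reverse transcription mix in step 2.5 can be sanity-checked by summing the volumes and computing the final concentration of each component whose stock concentration is stated. This is a minimal sketch: the data layout and function names are illustrative, and the MgCl2 figure counts only the added MgCl2, not any magnesium already present in the first-strand buffer.

```python
import math

# (component, stock concentration or None, unit, volume in microliters),
# transcribed from the step 2.5 table.
RT_MIX = [
    ("SuperScript II first-strand buffer", 5.0, "x", 40.0),
    ("DTT", 100.0, "mM", 10.0),
    ("Betaine", 5.0, "M", 40.0),
    ("MgCl2", 1.0, "M", 1.2),
    ("dNTP", 10.0, "mM", 20.0),
    ("RNase inhibitor", None, "", 5.0),
    ("SuperScript II reverse transcriptase", None, "", 10.0),
    ("RNase-free water", None, "", 73.8),
]


def total_volume(mix):
    """Sum the component volumes in microliters."""
    return sum(volume for _, _, _, volume in mix)


def final_concentrations(mix):
    """Dilute each stated stock concentration into the total volume."""
    total = total_volume(mix)
    return {
        name: (stock * volume / total, unit)
        for name, stock, unit, volume in mix
        if stock is not None
    }


print(round(total_volume(RT_MIX), 3))
for name, (conc, unit) in final_concentrations(RT_MIX).items():
    print(f"{name}: {conc:g} {unit}")
```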
3 Construction of a high-throughput sequencing library for single-cell membrane protein expression
3.1 Wash the beads with the elution buffer supplied with the BD Rhapsody kit and resuspend them. Treat at 95 °C for 5 min, then immediately remove the supernatant and set it aside; resuspend the remaining beads in elution buffer and set them aside.
3.2 Prepare the following PCR mix.
Component                                   Volume for 1 library (μL)
PCR MasterMix (Cat. No. 91-1118)            100
Universal Oligo (Cat. No. 650000074)        10
Bead RT/PCR Enhancer (Cat. No. 91-1082)     12
Primer T7 (SEQ ID No. 7)                    10
Eluted supernatant (step 3.1)               68
Total                                       200
3.3 Amplify under the following conditions, purify with 1.4× SPRI beads, and elute in 30 μL.
Figure PCTCN2020106089-appb-000003
3.4 Prepare the following index PCR mix.
Component                                   Volume (μL)
PCR MasterMix (Cat. No. 91-1118)            25
Index P5 primer (SEQ ID No. 9)              2
Index P7 primer (SEQ ID No. 10)             2
Nuclease-free water                         18
Eluate from step 3.3                        3
Total                                       50
3.5 Amplify under the following conditions, purify with 0.8× SPRI beads, and elute in 30 μL of sterile water to obtain the single-cell expression library for the membrane protein CD4. The library fragment size is about 280 bp, which meets the library quality-control criteria, as shown in FIG. 7A.
Figure PCTCN2020106089-appb-000004
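The SPRI clean-ups in steps 3.3 and 3.5 are specified as bead ratios. The sketch below converts a ratio into a bead volume, assuming the common convention that "1.4×" means a bead volume equal to 1.4 times the sample volume, and assuming the whole PCR reaction (200 μL and 50 μL, respectively) is purified. The function name is hypothetical.

```python
def spri_bead_volume(sample_ul: float, ratio: float) -> float:
    """Bead volume in microliters for a given sample volume and bead ratio."""
    if sample_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    # Round to avoid floating-point noise in the reported volume.
    return round(sample_ul * ratio, 2)


print(spri_bead_volume(200, 1.4))  # step 3.3: 200 uL PCR, 1.4x beads
print(spri_bead_volume(50, 0.8))   # step 3.5: 50 uL index PCR, 0.8x beads
```

Lower ratios retain only longer fragments, which is consistent with the tighter 0.8× clean-up being applied to the final indexed library.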
4 Construction of the single-cell 5' expression profile library and immune receptor VDJ libraries
4.1 Full-length cDNA amplification: prepare the following PCR reaction mix and use it to resuspend the beads prepared in step 3.1.
Kit component                               Volume for 1 library (μL)
PCR MasterMix (Cat. No. 91-1118)            60
Universal Oligo (Cat. No. 650000074)        10
Primer Full (SEQ ID No. 8)                  10
RNase-free water                            40
Total                                       120
4.2 Amplify under the following conditions, purify with 0.6× SPRI beads, and elute in 30 μL for later use.
Figure PCTCN2020106089-appb-000005
Figure PCTCN2020106089-appb-000006
4.3 Fragment part of the eluate from step 4.2 and build a library using the 5' Library Construction Kit (PN-1000020) from the Chromium Single Cell V(D)J Reagent Kits (10x Genomics) to obtain the 5' single-cell expression profile library, shown in FIG. 7B. The main peak of the library is near 484, which meets the library quality-control criteria.
4.4 Amplify and fragment part of the eluate from step 4.2 and build libraries using the Enrichment Kits (Human T Cell, PN-1000005, and Human B Cell, PN-1000016) from the Chromium Single Cell V(D)J Reagent Kits (10x Genomics) to obtain single-cell VDJ libraries for T cells and B cells, shown in FIG. 7C and FIG. 7D, respectively. FIG. 7C shows that the library size is between 200 and 1000 bp, as expected. FIG. 7D shows that the main peak of the library is near 545, which meets the library quality-control criteria.
Example 2: Application of the multi-nucleic-acid co-labeled support to 3' single-cell RNA library construction
In this example, supports co-labeled with multiple nucleic acids are prepared by the following steps and used to construct a 3' single-cell RNA library.
1 Preparation of magnetic beads labeled with multiple nucleic acids
1.1 Synthesize single-stranded nucleic acids with the following sequences.
Figure PCTCN2020106089-appb-000007
1.2 In 0.25 M EDC, separately mix 300 pmol of each of 384 kinds of amino-modified CB1 single-stranded nucleic acid (SEQ ID No. 2) with 300 pmol of amino-modified SP2-dT30VN single-stranded nucleic acid (SEQ ID No. 11) and 60,000 carboxyl magnetic beads (30 μm) by rotation at room temperature for 3 hours; wash twice to obtain 384 kinds of nucleic acid-labeled magnetic beads.
1.3 Pool the 384 kinds of nucleic acid-labeled magnetic beads obtained in step 1.2, mix thoroughly, and distribute them evenly into a 384-well plate.
1.4 Anneal the single-stranded nucleic acid CB2-NrNx (SEQ ID No. 12; "(rN)" denotes an RNA base, and the sequence carries a C3 modification at its 3' end) with rCB2 to form a double-stranded structure with sticky ends. The n'n'n'n'n'n'n'n'CTGTAG sequence in rCB2 is the reverse complement of CTACAGnnnnnnnn in CB2-NrNx; nnnnnnnn is the 8 bp cell barcode sequence, of which there are 384 types.
1.5 Add each of the 384 kinds of annealed CB2-NrNx/rCB2 double-stranded nucleic acid to the 384-well plate containing the magnetic beads in the following proportions and react at 22 °C for 30 minutes.
Reagent                                       Volume (50 μL reaction)
2X Rapid Ligation Buffer                      25 μL
CB2-NrNx/rCB2 double-stranded nucleic acid    3 μL
T4 DNA Ligase                                 3 μL
RNase-free water                              to 50 μL
1.6 After the reaction, pool and mix the 384 kinds of magnetic beads, treat at 95 °C to remove the complementary strand, and store at 4 °C until use.
2 3' single-cell RNA library construction using the multi-nucleic-acid labeled magnetic beads
2.1 Draw peripheral blood, isolate fresh PBMCs, and resuspend them in PBS.
2.2 Prepare the microwell chip according to the instructions supplied with the BD Rhapsody single-cell library construction kit and load 10,000 of the incubated PBMCs.
2.3 Add 300,000 of the magnetic beads prepared in step 1 to the microwell plate, draw them into the wells magnetically, and wash away the excess beads.
2.4 Add the lysis buffer supplied with the kit; after 2 min, retrieve the beads magnetically and wash them.
2.5 Perform reverse transcription as described in the BD Rhapsody single-cell library construction kit, and digest unused primers on the beads with ExoI.
2.6 Second-strand extension with the random primer NrNx: prepare the following hybridization mix and resuspend the beads obtained in step 2.5 in it.
Kit component                                  Volume for 1 library (μL)
WTA Extension Buffer (Cat. No. 91-1114)        20
Nuclease-free water (Cat. No. 650000076)       150
RNase HII                                      4
Total                                          174
2.7 Hybridize under the following temperature conditions.
Figure PCTCN2020106089-appb-000008
2.8 Add the following extension reagents.
Kit component                                  Volume for 1 library (μL)
10 mM dNTP (Cat. No. 650000077)                8
Bead RT/PCR Enhancer (Cat. No. 91-1082)        12
WTA Extension Enzyme (Cat. No. 91-1117)        6
Total                                          26
2.9 Extend under the conditions specified by the BD kit, as follows.
Figure PCTCN2020106089-appb-000009
Figure PCTCN2020106089-appb-000010
2.10 After extension, wash the beads, prepare the following reaction mix, and resuspend the beads in it.
Kit component                                  Volume for 1 library (μL)
PCR MasterMix (Cat. No. 91-1118)               60
Universal Oligo (Cat. No. 650000074)           10
UPP-2 (SEQ ID No. 13)                          10
RNase-free water                               40
2.11 Perform PCR under the following conditions, then purify with 0.9× SPRI beads and elute in 30 μL.
Figure PCTCN2020106089-appb-000011
2.12 Prepare the following index PCR mix.
Component                                      Volume (μL)
PCR MasterMix (Cat. No. 91-1118)               25
Index P5 primer                                2
Index P7 primer                                2
Nuclease-free water                            11
Eluate from step 2.11                          10
Total                                          50
2.13 Amplify under the following conditions, size-select with SPRI beads at 0.5×/0.25×, and elute in 30 μL of sterile water to obtain the 3' single-cell RNA library.
Figure PCTCN2020106089-appb-000012
2.14 Sequence on an Illumina NovaSeq 6000 and analyze the positions of the sequenced reads along the RNA. As shown in FIG. 8, the reads lie mainly near the 3' end of the RNA.
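Because the cell barcode on each bead is built from two rounds of 384 barcode types, the chance that two cells receive the same barcode can be estimated for the 10,000 cells loaded in step 2.2. The sketch below is a simplified model and not part of the protocol: it assumes every captured cell pairs with exactly one bead and that barcode combinations are uniformly distributed and independent.

```python
import math


def barcode_space(round_sizes) -> int:
    """Number of distinct barcodes from combinatorial labeling rounds."""
    return math.prod(round_sizes)


def fraction_sharing_barcode(n_cells: int, n_barcodes: int) -> float:
    """Probability that a given cell shares its barcode with another cell."""
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


space = barcode_space([384, 384])
print(space)
print(f"{fraction_sharing_barcode(10_000, space):.3%}")
```

Under these assumptions a few percent of cells would share a barcode with another cell; adding a further barcode round or loading fewer cells lowers that fraction.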
Example 3: Application of the multi-nucleic-acid co-labeled support to single-cell transcriptome library construction
In this example, supports co-labeled with multiple nucleic acids are prepared by the following steps and used to construct a single-cell transcriptome library.
1 Preparation of magnetic beads labeled with multiple nucleic acids
1.1 Synthesize a single-stranded nucleic acid with the following sequence.
SEQ ID No.    Name       Sequence
14            CB2-dN6    CTACAGnnnnnnnnNNNNNN
1.2 In 0.25 M EDC, separately mix 300 pmol of each of 384 kinds of amino-modified CB1 single-stranded nucleic acid (SEQ ID No. 2) with 300 pmol of amino-modified dT single-stranded nucleic acid (SEQ ID No. 1) and 60,000 carboxyl magnetic beads (30 μm) by rotation at room temperature for 3 hours; wash twice to obtain 384 kinds of nucleic acid-labeled magnetic beads.
1.3 Pool the 384 kinds of nucleic acid-labeled magnetic beads obtained in step 1.2, mix thoroughly, and distribute them evenly into a 384-well plate.
1.4 Anneal the single-stranded nucleic acid CB2-dN6 (SEQ ID No. 14) with rCB2 to form a double-stranded structure with sticky ends. The n'n'n'n'n'n'n'n'CTGTAG sequence in rCB2 is the reverse complement of CTACAGnnnnnnnn in CB2-dN6; nnnnnnnn is the 8 bp cell barcode sequence, of which there are 384 types.
1.5 Add each of the 384 kinds of annealed CB2-dN6/rCB2 double-stranded nucleic acid to the 384-well plate containing the magnetic beads in the following proportions and react at 22 °C for 30 minutes.
Reagent                                      Volume (50 μL reaction)
2X Rapid Ligation Buffer                     25 μL
CB2-dN6/rCB2 double-stranded nucleic acid    3 μL
T4 DNA Ligase                                3 μL
RNase-free water                             to 50 μL
1.6 After the reaction, pool and mix the 384 kinds of magnetic beads, treat at 95 °C to remove the complementary strand, and store at 4 °C until use.
2 Single-cell transcriptome library construction using the multi-nucleic-acid labeled magnetic beads
2.1 Draw peripheral blood, isolate fresh PBMCs, and resuspend them in PBS.
2.2 Prepare the microwell chip according to the instructions supplied with the BD Rhapsody single-cell library construction kit and load 10,000 of the incubated PBMCs.
2.3 Add 300,000 of the magnetic beads prepared in step 1 to the microwell plate, draw them into the wells magnetically, and wash away the excess beads.
2.4 Add the lysis buffer supplied with the kit; after 2 min, retrieve the beads magnetically and wash them.
2.5 Perform reverse transcription as described in the BD Rhapsody single-cell library construction kit, and digest unused primers on the beads with ExoI. The reverse transcription conditions are modified as follows:
Step    Temperature         Time      Rotation speed
1       Room temperature    30 min    Normal rotation
2       37 °C               30 min    1200 rpm
2.6 Carry out the subsequent hybridization, extension, and amplification steps strictly as described in the BD Rhapsody mRNA Whole Transcriptome Analysis (WTA) Library Preparation Protocol to form the final library.
2.7 Sequence on an Illumina NovaSeq 6000 and analyze the positions of the sequenced reads along the RNA.
The reads obtained with this method are distributed more evenly along the full length of the RNA than those from BD Rhapsody and from the library of Example 2. FIG. 8 shows the gene-level distribution of reads obtained by sequencing the 3' single-cell RNA library built with the Example 2 workflow and the single-cell transcriptome library built with the Example 3 workflow. The BD Rhapsody 3' single-cell expression profile library is the analysis result for a library built entirely with BD Rhapsody. As the figure shows, sequences in the 3' single-cell RNA library of Example 2 are mainly distributed at the 3' end of genes, whereas sequences in the single-cell transcriptome library of Example 3 are clearly shifted toward the middle of genes compared with both the 3' single-cell RNA library and the BD Rhapsody 3' single-cell expression profile library.
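A read distribution of the kind compared in FIG. 8 can be summarized by binning each read's relative position along its transcript. The following is only a sketch of that idea: the input format, function name, and toy data are assumptions, and it is not the analysis pipeline used to produce the figure.

```python
def gene_body_profile(read_positions, transcript_lengths, n_bins=10):
    """Fraction of reads in each relative-position bin (5' end to 3' end).

    read_positions: iterable of (transcript_id, 0-based offset from 5' end).
    transcript_lengths: dict mapping transcript_id to length in bases.
    """
    counts = [0] * n_bins
    for transcript_id, offset in read_positions:
        relative = offset / transcript_lengths[transcript_id]
        # Clamp so an offset at the very last base falls in the last bin.
        index = min(int(relative * n_bins), n_bins - 1)
        counts[index] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [count / total for count in counts]


# Toy data, for illustration only.
lengths = {"geneA": 1000, "geneB": 2000}
reads = [("geneA", 950), ("geneA", 990), ("geneB", 1900), ("geneB", 1000)]
print(gene_body_profile(reads, lengths))
```

A 3'-biased library concentrates its mass in the last bins, while a library primed with random sequences spreads reads across the middle bins.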
Example 4: Use of the multi-nucleic acid co-labeled support for multiplex PCR sequencing library construction
The purpose of this example is single-tube multiplex PCR detection of the full-length Brca1 and Brca2 gene sequences. The multiplex PCR primers are designed as follows:
Pool 1: SEQ ID No.15 to SEQ ID No.34;
Pool 2: SEQ ID No.35 to SEQ ID No.54;
Pool 3: SEQ ID No.55 to SEQ ID No.74;
Pool 4: SEQ ID No.75 to SEQ ID No.94;
Pool 5: SEQ ID No.95 to SEQ ID No.114;
Pool 6: SEQ ID No.115 to SEQ ID No.134;
Pool 7: SEQ ID No.135 to SEQ ID No.154;
Pool 8: SEQ ID No.155 to SEQ ID No.174;
Pool 9: SEQ ID No.175 to SEQ ID No.194;
Pool 10: SEQ ID No.195 to SEQ ID No.214;
Pool 11: SEQ ID No.215 to SEQ ID No.234;
Pool 12: SEQ ID No.235 to SEQ ID No.254;
Pool 13: SEQ ID No.255 to SEQ ID No.274;
Pool 14: SEQ ID No.275 to SEQ ID No.294;
Pool 15: SEQ ID No.295 to SEQ ID No.314;
Pool 16: SEQ ID No.315 to SEQ ID No.334;
Pool 17: SEQ ID No.335 to SEQ ID No.348.
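The pool list above follows a regular pattern: consecutive blocks of 20 SEQ ID numbers starting at 15, with a shorter final block ending at 348. A minimal sketch that regenerates this mapping is given below; the function name and its parameters are illustrative and not part of the patent.

```python
def primer_pools(first_id=15, last_id=348, pool_size=20):
    """Map pool number -> range of SEQ ID numbers, in consecutive blocks."""
    pools = {}
    pool_no = 1
    start = first_id
    while start <= last_id:
        # The last pool may be shorter than pool_size.
        end = min(start + pool_size - 1, last_id)
        pools[pool_no] = range(start, end + 1)
        pool_no += 1
        start = end + 1
    return pools
```

With the default arguments this yields 17 pools, the last of which holds 14 sequences (SEQ ID No.335 to No.348).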
1 Extraction of genomic DNA
Extract genomic DNA from 200 μL of human peripheral blood according to the instructions of the Tiangen blood genomic DNA extraction kit, and measure the concentration with Qubit.
2 Preparation of multi-nucleic acid labeled magnetic beads
2.1 Synthesize 5' amino-modified primers according to the sequences in Table 1, and mix the primers that share a pool number in equal amounts.
2.2 At an EDC concentration of 0.25 M, mix 10 nmol of each primer mixture (Pool 1 through Pool 17) separately with 10 mg of magnetic beads (Dynabeads MyOne Carboxylic Acid) by rotation at room temperature for 3 hours. After two washes, 17 types of nucleic acid-labeled magnetic beads are obtained.
2.3 Mix the 17 types of nucleic acid-labeled magnetic beads at a 1:1 ratio and set aside for use.
3 Multiplex PCR amplification
3.1 Prepare the PCR mix according to the table below.
Reagent                                  50 μL system
2x QIAGEN Multiplex PCR Master Mix       25 μL
Nucleic acid-labeled magnetic beads      5 μL
Genomic DNA                              10 ng
RNase-free water                         to 50 μL
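Because the genomic DNA is specified by mass (10 ng) rather than by volume, the water volume depends on the Qubit concentration measured in step 1. The following minimal sketch works out the pipetting volumes for one reaction; the function name and the dictionary keys are illustrative assumptions, while the default volumes come from the table above.

```python
def multiplex_pcr_mix(dna_conc_ng_per_ul, total_ul=50.0,
                      master_mix_ul=25.0, beads_ul=5.0, dna_ng=10.0):
    """Return per-reaction volumes (in microliters) for the multiplex PCR mix."""
    if dna_conc_ng_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")
    # Volume of genomic DNA needed to deliver the requested mass.
    dna_ul = dna_ng / dna_conc_ng_per_ul
    # Water fills the remainder of the reaction volume.
    water_ul = total_ul - master_mix_ul - beads_ul - dna_ul
    if water_ul < 0:
        raise ValueError("DNA too dilute: the mix would exceed the total volume")
    return {
        "master_mix_ul": master_mix_ul,
        "beads_ul": beads_ul,
        "dna_ul": dna_ul,
        "water_ul": water_ul,
    }
```

For example, a DNA stock at 5 ng/μL requires 2 μL of DNA and 18 μL of water.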
3.2 Run the following program on the PCR instrument.
Figure PCTCN2020106089-appb-000013
4 Index PCR to form the library
4.1 Prepare the index PCR mix according to the table below.
Reagent                                  50 μL system
2x KAPA HiFi                             25 μL
I5 Primer                                5 μL
I7 Primer                                5 μL
RNase-free water                         to 50 μL
4.2 Resuspend the washed magnetic beads from 1.3.2 in the prepared 50 μL index PCR mix and run the following PCR program:
Figure PCTCN2020106089-appb-000014
4.3 Use SPRI beads to purify DNA in the 300-500 bp size range, measure the concentration, and analyze the fragment length with a Caliper instrument. As shown in Figure 9, the resulting library has a main peak at about 379 bp, which meets the library quality-control standard.
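For loading a sequencer, a mass concentration is commonly converted to molarity using the average fragment length. The sketch below applies the widely used approximation of 660 g/mol per base pair of double-stranded DNA. The patent does not describe this conversion; the function name and the example concentration are assumptions made for illustration only.

```python
def library_molarity_nm(conc_ng_per_ul, mean_length_bp, g_per_mol_per_bp=660.0):
    """Approximate molarity (nM) of a double-stranded DNA library."""
    if conc_ng_per_ul < 0 or mean_length_bp <= 0:
        raise ValueError("concentration must be >= 0 and length must be > 0")
    # ng/uL equals g/L * 1e-3; dividing by g/mol gives mol/L,
    # and the factor 1e6 converts the result to nmol/L.
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_length_bp)
```

With a main peak of about 379 bp, each ng/μL of library corresponds to roughly 4 nM under this approximation.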

Claims (15)

  1. A multi-nucleic acid co-labeled support, comprising a support body and multiple nucleic acid labels located on the surface and/or in the interior of the support body, wherein the nucleic acids labeled on a single support comprise at least: one or more first nucleic acid labels, whose function at least includes capturing a specific compound in the reaction system onto the surface of the support; and one or more second nucleic acid labels, whose function at least includes participating in a specified biochemical reaction of the specific compound captured on the surface of the support.
  2. The multi-nucleic acid co-labeled support according to claim 1, wherein the support body is a solid bead and/or a semi-solid hydrogel bead.
  3. The multi-nucleic acid co-labeled support according to claim 1, which is a composition comprising a plurality of supports.
  4. The multi-nucleic acid co-labeled support according to claim 3, wherein the number of first nucleic acid labels and the number of second nucleic acid labels on the same support may each be ≥ 1 and/or ≤ 10^13.
  5. The multi-nucleic acid co-labeled support according to claim 3, wherein:
    the sequences of the multiple first nucleic acid labels on the same support are the same or different;
    the sequences of the first nucleic acid labels on different supports are the same or different;
    the sequences of the multiple second nucleic acid labels on the same support are the same or different; or
    the sequences of the second nucleic acid labels on different supports are the same or different.
  6. A method for preparing the multi-nucleic acid co-labeled support according to any one of claims 1 to 5, the method comprising:
    labeling multiple nucleic acids onto the support body by grafting-to and/or grafting-from, to obtain the multi-nucleic acid co-labeled support.
  7. The preparation method according to claim 6, comprising:
    modifying the support body and the nucleic acid respectively with functional units that can interact with each other, and reacting the two so that the nucleic acid is labeled onto the support body;
    synthesizing the nucleic acid directly on the support body according to a preset nucleotide sequence; and/or
    labeling nucleic acid on the support body by a scheme in which a biochemical reaction is used for nucleic acid extension or ligation.
  8. Use of the multi-nucleic acid co-labeled support according to any one of claims 1 to 5 in 5' single-cell RNA expression profiling, construction of a 5' single-cell VDJ library on a microwell array platform, construction of a 3' single-cell RNA library, construction of a single-cell transcriptome library, single-cell multi-omics research, multiplex PCR, and/or construction of a multiplex PCR sequencing library.
  9. The use according to claim 8, wherein the multi-nucleic acid co-labeled support is used for 5' single-cell RNA expression profiling, wherein:
    a template switching sequence containing a cell tag and a molecular tag, and an RNA capture sequence, are immobilized on the support; specifically, the support is labeled with at least two nucleic acid sequences, a first nucleic acid sequence and a second nucleic acid sequence; the first nucleic acid sequence comprises at least a capture sequence, which captures the target nucleic acid molecule and serves as a primer for extension or reverse transcription; the second nucleic acid sequence comprises a cell tag sequence, which labels all mRNA molecules originating from the same cell; different kinds of supports carry different cell tags;
    preferably, the support captures, in a microwell of a chip, the RNA released after lysis of a single cell; RNAs originating from the same cell are labeled with the same cell tag by template switching during reverse transcription; and the cDNA is then amplified and finally constructed into a 5' single-cell RNA expression profile library.
  10. The use according to claim 8, wherein the multi-nucleic acid co-labeled support is used for constructing a 5' single-cell VDJ library on a microwell array platform, wherein:
    a template switching sequence containing a cell tag and a molecular tag, and an RNA capture sequence, are immobilized on the support; specifically, the support is labeled with at least two nucleic acid sequences, a first nucleic acid sequence and a second nucleic acid sequence; the first nucleic acid sequence comprises at least a capture sequence, which captures the target nucleic acid molecule and serves as a primer for extension or reverse transcription; the second nucleic acid sequence comprises a cell tag sequence, a molecular tag sequence, and a template switching sequence; the cell tag sequence labels all mRNA molecules originating from the same cell; the molecular tag sequence labels each reverse-transcribed cDNA molecule, so that cDNA molecules reverse transcribed from different RNAs on the same support each carry a different molecular tag; the template switching sequence can serve as a template on which the 3' end of the reverse-transcribed cDNA continues to extend, so that the cDNA is labeled with the molecular tag sequence and the cell tag sequence; different kinds of supports carry different cell tags;
    preferably, the support captures, in a microwell of a chip, the RNA released after lysis of a single cell; RNAs originating from the same cell are labeled with the same cell tag by template switching during reverse transcription; TCR and BCR/Ig nucleic acid sequences are then enriched with primers for the constant regions of the TCR and BCR/Ig genes, and are finally fragmented and constructed into a high-throughput single-cell VDJ sequencing library.
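The roles of the cell tag and the molecular tag in claims 9 and 10 can be illustrated with a minimal counting sketch: reads that share a cell tag are attributed to the same cell, and reads that additionally share a molecular tag are treated as copies of one original cDNA molecule. The input format (tuples of cell tag, molecular tag, and gene name) and the function name are hypothetical and are not defined in the patent.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct molecular tags per (cell tag, gene).

    reads: iterable of (cell_tag, molecular_tag, gene) tuples
    (hypothetical, already-parsed sequencing reads).
    """
    tags_seen = defaultdict(set)
    for cell_tag, molecular_tag, gene in reads:
        # Amplification duplicates carry the same molecular tag,
        # so a set collapses them to one original molecule.
        tags_seen[(cell_tag, gene)].add(molecular_tag)
    return {key: len(tags) for key, tags in tags_seen.items()}
```

In this sketch, three reads with the same cell tag, gene, and molecular tag count as a single molecule, whereas the same gene seen under a different cell tag is counted separately.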
  11. The use according to claim 8, wherein the multi-nucleic acid co-labeled support is used for constructing a 3' single-cell RNA library, wherein:
    a conditionally blockable random primer containing a cell tag, and an RNA capture sequence, are immobilized on the support; specifically, the support is labeled with at least two nucleic acid sequences, a first nucleic acid sequence and a second nucleic acid sequence; the first nucleic acid sequence comprises at least a capture sequence, which captures the target nucleic acid molecule and serves as a primer for extension or reverse transcription; the second nucleic acid sequence comprises a conditionally blockable random primer containing a cell tag, the cell tag sequence labeling all mRNA molecules originating from the same cell; different kinds of supports carry different cell tags;
    preferably, the support captures, in a microwell of a chip, the RNA released after lysis of a single cell and reverse transcribes it into cDNA; the cell-tagged random primers then label cDNAs originating from the same cell with the same cell tag through second-strand synthesis; and the cDNA is then amplified and constructed into a 3' single-cell RNA library.
  12. The use according to claim 8, wherein the multi-nucleic acid co-labeled support is used for constructing a single-cell transcriptome library, wherein:
    a random primer sequence containing a cell tag, and an RNA capture sequence, are immobilized on the support; different kinds of supports carry different cell tags; any segment of an RNA molecule can be detected, without restriction to the 3' end or the 5' end; preferably, the supports comprise two types of supports, and each single support of either type carries at least two nucleic acid sequences, being either the combination of a first nucleic acid sequence and a second nucleic acid sequence or the combination of a third nucleic acid sequence and a second nucleic acid sequence; the first nucleic acid sequence comprises at least a capture sequence for capturing the target nucleic acid molecule; the second nucleic acid sequence comprises a random primer sequence containing a cell tag, the cell tag sequence labeling all mRNA molecules originating from the same cell; the third nucleic acid sequence comprises a cell tag sequence and a capture sequence;
    preferably, the support captures, in a microwell of a chip, the RNA released after lysis of a single cell; RNAs originating from the same cell are labeled with the same cell tag during reverse transcription; and the cDNA is then amplified and finally constructed into a single-cell RNA transcriptome library.
  13. The use according to claim 8, wherein the multi-nucleic acid co-labeled support is used for single-cell multi-omics research, preferably including constructing a library of RNA expression levels and/or detecting protein expression levels via nucleic acid tags of proteins, wherein:
    an RNA capture sequence containing a cell tag, and a capture sequence for the nucleic acid tags used to label proteins, are immobilized on the support, and different kinds of supports carry different cell tags; preferably, the first nucleic acid sequence comprises at least a capture sequence, which captures the target nucleic acid molecule and serves as a primer for extension; the second nucleic acid sequence comprises a cell tag sequence, a molecular tag sequence, and a template switching sequence; the cell tag sequence labels all mRNA molecules originating from the same cell; the molecular tag sequence labels each reverse-transcribed cDNA molecule, so that cDNA molecules reverse transcribed from different RNAs on the same support each carry a different molecular tag; the template switching sequence can serve as a template on which the 3' end of the reverse-transcribed cDNA continues to extend, so that the cDNA is labeled with the molecular tag sequence and the cell tag sequence; the third nucleic acid sequence comprises a cell tag sequence, a molecular tag sequence, and a protein nucleic acid tag capture sequence, which captures and extends the protein nucleic acid tags located in the same spatial structure as the single cell under test;
    preferably, the support captures, in a microwell of a chip, the RNA and the protein nucleic acid tags released after lysis of a single cell; the RNA and the protein nucleic acid tags originating from the same cell are labeled with the same cell tag during reverse transcription; and amplification then yields a single-cell RNA transcriptome library and a protein-tag nucleic acid library.
  14. The use according to claim 8, wherein the multi-nucleic acid co-labeled support is used for constructing a multiplex PCR sequencing library, wherein:
    primers that could interfere with one another are immobilized on different supports; specifically, the supports comprise at least two kinds of supports: one or more supports labeled with primers of a first kind and one or more supports labeled with primers of a second kind; each support is labeled with at least one pair of nucleic acid primers: the supports labeled with primers of the first kind carry a first nucleic acid primer pair, and the supports labeled with primers of the second kind carry a second nucleic acid primer pair different from the first; each of the two kinds of supports may independently and optionally carry further nucleic acid primer pairs; the target fragments amplified by the multiple primer pairs on the same support do not overlap on the template; the primer pairs on different supports differ and can thus amplify different target regions, which may or may not partially overlap;
    preferably, all supports are mixed in proportion and then combined with the nucleic acid template and the PCR enzyme reaction system, so that unbiased multiplex PCR is performed in a single tube.
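The constraint in claim 14 that amplicons from primer pairs on the same support must not overlap on the template can be viewed as an interval partitioning problem. The sketch below uses a simple greedy assignment as an illustration; the patent does not prescribe any particular assignment algorithm, and the function name and the (name, start, end) input format with half-open coordinates are assumptions.

```python
def assign_to_supports(amplicons):
    """Group amplicons so that no two in the same group overlap.

    amplicons: list of (name, start, end) with half-open template
    coordinates [start, end). Returns a list of groups (lists of names),
    one group per kind of support.
    """
    groups = []  # each group tracks the end of its last amplicon
    for name, start, end in sorted(amplicons, key=lambda a: a[1]):
        for group in groups:
            # Amplicons are visited in order of start position, so it is
            # enough to compare against the last end stored in the group.
            if group["end"] <= start:
                group["names"].append(name)
                group["end"] = end
                break
        else:
            # Overlaps every existing group: open a new kind of support.
            groups.append({"end": end, "names": [name]})
    return [group["names"] for group in groups]
```

For a tiled design in which each amplicon overlaps only its neighbours, this produces two groups with alternating amplicons, so overlapping targets end up on different supports.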
  15. A kit comprising the multi-nucleic acid co-labeled support according to any one of claims 1 to 5;
    preferably, the kit is applicable to 5' single-cell RNA expression profiling, construction of a 5' single-cell VDJ library on a microwell array platform, construction of a 3' single-cell RNA library, construction of a single-cell transcriptome library, single-cell multi-omics research, multiplex PCR, and/or construction of a multiplex PCR sequencing library;
    more preferably, the kit further comprises one or more of the following compositions:
    Composition 1: a mixture of supports carrying a template switching sequence containing a cell tag and a molecular tag together with an RNA capture sequence, a microwell chip, cell lysis solution, reverse transcription reagents, nucleic acid amplification reagents, and a nucleic acid fragmentation and library construction module;
    Composition 2: a mixture of supports carrying a template switching sequence containing a cell tag and a molecular tag together with an RNA capture sequence, a microwell chip, cell lysis solution, reverse transcription reagents, constant region primers, nucleic acid amplification reagents, and a nucleic acid fragmentation and library construction module;
    Composition 3: a mixture of supports carrying a random primer containing a cell tag together with an RNA capture sequence, a microwell chip, cell lysis solution, reverse transcription reagents, a second-strand synthesis module, and nucleic acid amplification and extension reagents;
    Composition 4: a mixture of supports carrying a random primer sequence containing a cell tag together with an RNA capture sequence, a microwell chip, cell lysis solution, reverse transcription reagents, a second-strand synthesis module, and nucleic acid amplification and extension reagents;
    Composition 5: a mixture of supports carrying a capture sequence, containing a cell tag, for protein-tag nucleic acids, a microwell chip, cell lysis solution, reverse transcription reagents, and a nucleic acid fragmentation and library construction module;
    Composition 6: a premixed mixture of primer-coupled supports, a multiplex PCR enzyme, and buffer; further optionally including index primers compatible with a high-throughput sequencer.
PCT/CN2020/106089 2020-07-31 2020-07-31 Multi-nucleic acid co-labeling support, preparation method therefor, and application thereof WO2022021279A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/106089 WO2022021279A1 (en) 2020-07-31 2020-07-31 Multi-nucleic acid co-labeling support, preparation method therefor, and application thereof
CN202080005408.1A CN114096678A (en) 2020-07-31 2020-07-31 Multiple nucleic acid co-labeling support, and preparation method and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/106089 WO2022021279A1 (en) 2020-07-31 2020-07-31 Multi-nucleic acid co-labeling support, preparation method therefor, and application thereof

Publications (1)

Publication Number Publication Date
WO2022021279A1 (en) 2022-02-03

Family

ID=80037416

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/106089 WO2022021279A1 (en) 2020-07-31 2020-07-31 Multi-nucleic acid co-labeling support, preparation method therefor, and application thereof

Country Status (2)

Country Link
CN (1) CN114096678A (en)
WO (1) WO2022021279A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114574569A (en) * 2022-03-28 2022-06-03 浙江大学 Terminal transferase-based genome sequencing kit and sequencing method
CN115386622A (en) * 2022-10-26 2022-11-25 北京寻因生物科技有限公司 Transcriptome library building method and application thereof

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115198000B (en) * 2022-07-21 2023-07-21 北京寻因生物科技有限公司 Method for constructing single-cell complete sequence transcriptome library
CN115198001B (en) * 2022-07-21 2023-07-04 北京寻因生物科技有限公司 Construction method and application of single-cell complete sequence transcriptome library
CN115747301B (en) * 2022-08-01 2023-12-22 深圳赛陆医疗科技有限公司 Method for constructing sequencing library, kit for constructing sequencing library and gene sequencing method
CN117089599B (en) * 2023-10-20 2024-02-13 青岛百创智能制造技术有限公司 Long coding sequence microbead and preparation method thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104364392A (en) * 2012-02-27 2015-02-18 赛卢拉研究公司 Compositions and kits for molecular counting
CN106459967A (en) * 2014-04-29 2017-02-22 Illumina公司 Multiplexed single cell gene expression analysis using template switch and tagmentation
CN106498040A (en) * 2016-10-12 2017-03-15 浙江大学 A kind of molecular labeling microballon and the unicellular sequence measurement of the high flux based on the molecular labeling microballon
US20190367966A1 (en) * 2018-02-12 2019-12-05 10X Genomics, Inc. Methods and systems for analysis of major histocompatability complex
CN110684829A (en) * 2018-07-05 2020-01-14 深圳华大智造科技有限公司 High-throughput single-cell transcriptome sequencing method and kit
WO2020123316A2 (en) * 2018-12-10 2020-06-18 10X Genomics, Inc. Methods for determining a location of a biological analyte in a biological sample

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MCEVOY CHRISTOPHER R, SEMPLE TIMOTHY, YELLAPU BHARGAVI, CHOONG DAVID Y, XU HUILING, MIR ARNAU GISELA, FELLOWES ANDREW P, FOX STEPH: "Improved next-generation sequencing pre-capture library yields and sequencing parameters using on-bead PCR", BIOTECHNIQUES, vol. 68, no. 1, 1 January 2020 (2020-01-01), US , pages 48 - 51, XP055891162, ISSN: 0736-6205, DOI: 10.2144/btn-2019-0059 *
YUAN ZHOU: "Single Nucleic Acid Molecule Manipulation and Single Cell Sequencing", DEPARTMENT OF CHEMICAL BIOLOGY, XIAMEN UNIVERSITY, no. 1, August 2017 (2017-08-01), pages 1 - 152, XP055891195 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114574569A (en) * 2022-03-28 2022-06-03 浙江大学 Terminal transferase-based genome sequencing kit and sequencing method
CN115386622A (en) * 2022-10-26 2022-11-25 北京寻因生物科技有限公司 Transcriptome library building method and application thereof
CN115386622B (en) * 2022-10-26 2023-10-27 北京寻因生物科技有限公司 Library construction method of transcriptome library and application thereof

Also Published As

Publication number Publication date
CN114096678A (en) 2022-02-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20947545

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20947545

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26/07/2023)
