WO2024077439A1 - Method for constructing a single-cell transcriptome and chromatin accessibility dual-omics sequencing library, and sequencing method

Publication number
WO2024077439A1
Authority
WO
WIPO (PCT)
Prior art keywords
library
cell
transcriptome
sequencing
chromatin
Prior art date
Application number
PCT/CN2022/124336
Other languages
English (en)
French (fr)
Inventor
Qu Kun (瞿昆)
Gao Xuyuan (高绪远)
Liu Ke (刘柯)
Original Assignee
University of Science and Technology of China (中国科学技术大学)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China (中国科学技术大学)
Priority to PCT/CN2022/124336
Publication of WO2024077439A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the present invention belongs to the field of biomedicine technology, and specifically, relates to a method for constructing a single-cell sequencing library, and more specifically, to a method for constructing a single-cell transcriptome and chromatin accessibility bi-omics sequencing library, the prepared sequencing library, and a method for sequencing using the library.
  • sci-CAR-Seq [1]: First, cells are dispersed into different wells of a well plate and mRNA is reverse transcribed into cDNA; the cDNA from the cells in the same well receives the first round of encoding. A transposition reaction is then performed with a transposon so that the chromatin open sites from the cells in the same well also carry the first round of encoding. All cells are then mixed evenly and dispersed into a new well plate for lysis. One part of the lysate is used to amplify the cDNA library, which receives the second round of encoding; another part is used to amplify the chromatin open site library, which also receives the second round of encoding. Finally, the transcriptome library and chromatin open site library from the same cell are identified by the combination of the two rounds of encoding (see Figure 2 for a schematic diagram of the sci-CAR-Seq technical process).
  • scCAT-Seq [2]: Use a laser to sort cells into the wells of a well plate, with a single cell in each well. Lyse the cells in the air, separate the cytoplasm from the nucleus, and perform reverse transcription on the cytoplasm to construct a transcriptome library, while performing transposition on the nucleus to construct a chromatin open site library (see Figure 3 for a schematic diagram of the scCAT-Seq technology process).
  • SNARE-Seq [3]: Prepare a single-nucleus suspension and perform a Tn5 transposition reaction on the nuclei to capture chromatin open site information. Encapsulate the capture magnetic beads, single nuclei and splint primers in droplets through a microfluidic process. In the droplet reaction, the chromatin open site sequence information is transferred to the magnetic beads, and the primers on the magnetic beads capture the mRNA information. Amplify the library to obtain a transcriptome library and a chromatin open site library for sequencing (see Figure 4 for a schematic diagram of the SNARE-Seq process).
  • Paired-Seq [4]: Disperse cells evenly into different wells and use transposition and reverse transcription reactions so that the chromatin open sites and transcriptomes of the cells in the same well carry the same code. Connect further codes using three rounds of combined indexing, and finally determine which chromatin open site and transcriptome sequences come from the same cell based on the combination of codes (see Figure 5 for a schematic diagram of the Paired-Seq process).
  • SHARE-seq [5]: First, the cells are transposed and then reverse transcribed using a biotin-labeled reverse transcription primer. Using a combined indexing method, the cells are evenly dispersed into different wells for code ligation, and this is repeated twice. Finally, the chromatin open site information and transcriptome information from the same cell can be identified based on the combination of codes (see Figure 6 for a schematic diagram of the SHARE-seq technology process).
  • 10X Chromium Single Cell Multi-omics (ATAC + Gene Expression) Kit: Prepare a single-nucleus suspension and perform a Tn5 transposition reaction on the nuclei to capture chromatin open site information. Encapsulate the capture magnetic beads, single nuclei and reaction system in droplets through a microfluidic process. The reaction is carried out in the droplet, and the cell's chromatin open sites and transcriptome library are encoded at the same time.
  • ISSAC-seq [6]: Prepare a single-nucleus suspension and perform a Tn5 transposition reaction on the nuclei to capture chromatin open site information, then perform a reverse transcription reaction to capture transcriptome information. Encoding can then follow either of two routes: use a microfluidic process to encapsulate the capture magnetic beads, single nuclei and reaction system in droplets, where the cell's chromatin open sites and transcriptome library are encoded at the same time; or use flow sorting to sort the nuclei into a well plate for code ligation.
  • TCR and BCR information can be obtained, and immune cells can be traced back, and the evolutionary trajectory of immune cells can be further corrected to obtain more reliable results, which is very important for the study of immune plasticity.
  • the technical principles displayed in this disclosure can solve these problems and other related needs.
  • the inventors of this application have designed a high-throughput single-cell transcriptome and chromatin accessibility bi-omics sequencing library construction technology that can capture the 5' information of messenger RNA. Specifically, the technical problems existing in this field are solved through the technical solutions set out in the following items.
  • a method for constructing a single-cell transcriptome and chromatin accessibility dual-omics single-cell sequencing library comprising the following steps:
  • a chromatin open site which comprises performing a transposition reaction on the chromatin of the cell using a Tn5 transposon assembled by a primer dimer and a Tn5 transposase, cutting the chromatin open site of the cell, and connecting the primer dimer to the chromatin open site;
  • e) subsequent encoding to obtain a transcriptome library and a chromatin open site library with cell codes, wherein the subsequent encoding includes single-cell separation or coding integration of cells using different platforms or processes as needed, and wherein the transcriptome library and chromatin open site library from the same cell carry the same cell code or cell code combination;
  • initial library amplification which includes amplifying the previously constructed library to increase the library fragments available for use
  • preparing a chromatin open site library and a transcriptome library for sequencing which comprises separating the chromatin open site library and the transcriptome library from the library amplified in the previous step,
  • the template switching oligomer (as shown in FIG. 1 ) comprises a sequence 5 that is completely or partially complementary to a sequence 8 generated by the terminal transferase activity of the reverse transcriptase, so that the reverse transcriptase can continue to synthesize the sequence using the template switching oligomer as a template, and the template switching oligomer further comprises a partial structural region 4 located at the 5′ end of the sequence 5 and a handle sequence 3 located at the 5′ end of the partial structural region 4 for subsequent coding integration, and optionally, the template switching oligomer further comprises a sequence 6 that is complementary to the partial structural region 4 to form a double-stranded structure with the template switching oligomer.
  • transcriptome includes but is not limited to mRNA encoding T cell receptor (TCR), B cell receptor (BCR) and/or guide RNA in the CRISPR system
  • transcriptome library includes a TCR library, a BCR library and/or a gRNA library.
  • a single-cell transcriptome and chromatin accessibility bi-omics sequencing library prepared by the methods described in items 1 to 13.
  • USTC-V-Seq: The technology of the present invention is named USTC-V-Seq herein. It adopts a new coding (barcode) connection method that can connect the code to the chromatin open site and/or the transcriptome while retaining the 5' information of the transcriptome, so that the chromatin open sequence fragments and the transcriptome sequences from the same cell carry the same coding combination.
  • the gRNA library and the BCR and/or TCR library sequences can be further enriched from the obtained transcriptome library to complete the corresponding gRNA sequencing, BCR sequencing and TCR sequencing.
  • USTC-V-Seq can simultaneously obtain chromatin accessibility information and transcriptome information from the same cell and is compatible with the acquisition of gRNA information (if included), BCR sequence information (if included) and TCR sequence information (if included).
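The idea that both libraries from one cell carry the same coding combination can be illustrated with a minimal Python sketch. It is not the patent's analysis pipeline: the read representation (a `(cell_code, sequence)` tuple), the function names, and all example sequences are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def group_by_cell(rna_reads, atac_reads):
    """Group reads from both libraries under their shared cell code.

    Each read is a hypothetical (cell_code, sequence) tuple.
    """
    cells = defaultdict(lambda: {"rna": [], "atac": []})
    for code, seq in rna_reads:
        cells[code]["rna"].append(seq)
    for code, seq in atac_reads:
        cells[code]["atac"].append(seq)
    return dict(cells)


def paired_cells(cells):
    """Keep only the cell codes observed in both libraries."""
    return {code: libs for code, libs in cells.items()
            if libs["rna"] and libs["atac"]}


# Made-up example reads: the code "ACGT" appears in both libraries.
rna_reads = [("ACGT", "GGGAAC"), ("TTAG", "GGGTCA"), ("ACGT", "GGGCTA")]
atac_reads = [("ACGT", "CTGTCT"), ("CCAA", "AGATGT")]

cells = group_by_cell(rna_reads, atac_reads)
paired = paired_cells(cells)
```

Here only the code `ACGT` is retained as a dual-omics cell, because it is the only code seen in both the transcriptome reads and the chromatin open site reads.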
  • the coding integration handle site is connected to the coding, thereby realizing the coding of the cell.
  • the addition of double strands is also expected to further block the template switching reaction.
  • the coding can be integrated into the coding integration handle site of the single-stranded template switching oligomer or the double-stranded template switching oligomer.
  • FIG. 1 Schematic diagram of transcriptome acquisition.
  • FIG. 4 Schematic diagram of the SNARE-Seq process.
  • FIG. 5 Schematic diagram of the Paired-Seq process.
  • FIG. 6 Schematic diagram of the SHARE-seq technology process.
  • FIG. 7 Schematic diagram of the integration of subsequent encoding.
  • Barcode The barcode (barcode, barcoding or index) or the combination of barcodes described herein refers to different base sequences composed of nucleic acids, for example, ATCG and TACG are two different barcodes.
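As a small illustration of what "different barcodes" means in practice, the sketch below counts the positions at which two equal-length barcodes differ (the Hamming distance) and uses that count to assign an observed sequence to a known barcode. The whitelist-matching step and the one-mismatch tolerance are assumptions added for illustration; the source only states that barcodes are distinct base sequences.

```python
def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_barcode(observed, whitelist, max_mismatches=1):
    """Return the single whitelist barcode within max_mismatches, else None.

    Ambiguous matches (more than one candidate) are rejected.
    """
    hits = [bc for bc in whitelist if hamming(observed, bc) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None


whitelist = ["ATCG", "TACG"]          # the two barcodes named in the text
distance = hamming("ATCG", "TACG")    # they differ at the first two positions
unique_hit = closest_barcode("ATCC", whitelist)
ambiguous = closest_barcode("AACG", whitelist)
```

`ATCC` is one mismatch away from `ATCG` only, so it is assigned; `AACG` is one mismatch away from both barcodes, so it is left unassigned.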
  • Cell A basic component of mammals (such as humans and mice) for life activities.
  • the cell in the present disclosure is not limited to the whole cell, but also refers to other cell components, such as cell nuclei, mitochondria, etc.
  • a single cell suspension can also be a single cell nucleus suspension.
  • Chromatin A linear complex structure in the cell nucleus composed of DNA, histones, non-histone proteins, and a small amount of RNA. Its basic element is the nucleosome formed by DNA wrapped around histones.
  • Chromatin accessibility: an evaluation of whether a given section of DNA is wrapped around histones.
  • There are two states: 1) DNA is tightly wrapped around nucleosomes, which is called closed DNA; 2) DNA is not wrapped around nucleosomes and is exposed, which is called open DNA.
  • Chromatin accessibility library A sequencing library consisting of sequences of chromatin open sites.
  • ATAC-seq A sequencing technology developed by Stanford University in 2012 to detect chromatin accessibility in biological samples (>500 cells).
  • Genome the complete DNA sequence of an organism, consisting of four bases, ATCG, arranged in an orderly manner. The genomes of major mammals such as humans and mice have been fully sequenced.
  • a gene is the entire DNA sequence required to produce a polypeptide chain or functional RNA.
  • a gene is generally one or more segments of DNA in the genome.
  • Transcriptome: in a broad sense, refers to the collection of all RNAs that cells can transcribe, including messenger RNA (mRNA), ribosomal RNA (rRNA), transfer RNA (tRNA) and non-coding RNA; in a narrow sense, it refers to all messenger RNA (mRNA) that cells can transcribe.
  • mRNA messenger RNA
  • rRNA ribosomal RNA
  • tRNA transfer RNA
  • Antigen refers to a substance that can cause the production of antibodies, and is any substance that can induce an immune response.
  • Antibody refers to a protein with protective effect produced by the body in response to antigen stimulation.
  • B cell receptor is a molecule located on the surface of B cells that is responsible for the specific recognition and binding of antigens. It is essentially an immunoglobulin on the membrane surface.
  • T cell receptor A specific receptor located on the surface of T cells that is responsible for recognizing antigens presented by the major histocompatibility complex (MHC); but unlike the B cell receptor, it cannot recognize free antigens.
  • MHC major histocompatibility complex
  • B cell receptor/T cell receptor sequencing (BCR-Seq/TCR-Seq): Sequencing technology targeted at B cell receptor/T cell receptor sequences.
  • Transcription factor A protein that can recognize specific DNA sequence patterns (Motif) and bind to DNA, thereby activating or regulating gene expression.
  • Single-cell chromatin accessibility sequencing (scATAC-seq): A sequencing method used to detect chromatin accessibility in single cells.
  • Single-cell transcriptome sequencing (scRNA-Seq): A sequencing method used to detect the transcriptome of a single cell.
  • Single-cell multi-omics sequencing: a sequencing library construction method used to detect multiple dimensions of information in a single cell, for example a method that simultaneously obtains two or more omics layers, such as transcriptome, chromatin accessibility and/or proteome information, from the same cell (such as sci-CAR, Paired-Seq, SHARE-seq, 10X Chromium, CITE-Seq, REAP-Seq, etc.).
  • the "Template Switching Oligomer (TSO)" described herein may be named by other names or no name, and is essentially a single-stranded or double-stranded nucleic acid sequence, which may be attached to a medium including but not limited to beads, magnetized beads, microwells, chips, plates, etc., or may be in a free state.
  • the Template Switching Oligomer (TSO) may include functional regions including but not limited to coding, unique molecular identification signals (Unique Molecular Identifier, UMI) and subsequent coding integration handle sites.
  • template switching reaction generally refers to a reaction in which, under the action of reverse transcriptase, sequence synthesis is continued using a template switching oligomer as a template to generate a sequence that is complementary or partially complementary to the template switching oligomer.
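The functional layout of a template switching oligomer can be made concrete with a minimal Python sketch. It assumes a simplified 5'-to-3' layout of handle site, then UMI, then the bases that pair with the reverse transcriptase tail; the region lengths and the example sequence are hypothetical, and a real TSO may carry additional or different regions.

```python
from typing import NamedTuple


class TsoRegions(NamedTuple):
    handle: str  # handle site for subsequent coding integration (5' end)
    umi: str     # unique molecular identifier
    tail: str    # 3' bases that pair with the reverse transcriptase tail


def split_tso(tso, handle_len, umi_len):
    """Split a TSO sequence (5'->3') into its assumed functional regions."""
    if handle_len < 0 or umi_len < 0:
        raise ValueError("region lengths must be non-negative")
    if len(tso) <= handle_len + umi_len:
        raise ValueError("TSO is too short for the requested regions")
    cut = handle_len + umi_len
    return TsoRegions(tso[:handle_len], tso[handle_len:cut], tso[cut:])


# Hypothetical sequence: 10-base handle, 8-base UMI, 3-base pairing tail.
example_tso = "CTACACGACG" + "ACGTACGT" + "GGG"
regions = split_tso(example_tso, handle_len=10, umi_len=8)
```

Keeping the handle at the 5' end of the oligomer mirrors the text: because reverse transcription copies the oligomer after reaching the 5' end of the RNA, anything attached through that handle ends up associated with the 5' end of the transcript.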
  • Tissues in a non-free state or/and tissues in a free state can be prepared into single cell suspensions by various methods including but not limited to cutting, grinding and enzymatic hydrolysis.
  • the tissue can be a tissue in a healthy state or a tissue in a diseased state, and the existence state of the tissue includes but is not limited to fresh tissue, frozen tissue, sliced tissue, etc.
  • it may be necessary to extract cell nuclei and the extraction process includes but is not limited to the use of grinding, permeabilization agent treatment, flow cytometry sorting, etc.
  • the single cell suspension can be processed without fixation.
  • the cell concentration in the single cell suspension needs to be determined and a suitable fixation system (containing a fixative) is used to achieve the desired fixation effect.
  • the fixatives that can be used include, but are not limited to, aldehydes such as formaldehyde, glutaraldehyde, and paraformaldehyde, alcohols such as methanol and ethanol, and acetone; the fixative can be used at any suitable level or concentration.
  • the fixation may be terminated after fixation, and the fixation may be terminated using, but not limited to, any suitable level or concentration of glycine and bovine serum albumin or other reagents or methods.
  • the subsequent reaction may be performed directly without terminating the fixation.
  • the designed primer dimer can be incubated with Tn5 transposase to assemble a Tn5 transposon that can be used for transposition.
  • the corresponding Tn5 transposition system is prepared, and the Tn5 transposon is used to perform a transposition reaction, cut the open chromatin site of the cell, and connect the primer dimer to the open chromatin site.
  • the primer dimer usually contains a handle sequence that encodes the subsequent integration.
  • primer dimers can be designed or adjusted as needed. This article only provides an exemplified scheme that can be implemented, and does not limit the sequences used in the present invention.
  • the Tn5 transposase can be synthesized or purified by the user, or purchased from a supplier, including but not limited to VAZYME S601-01.
  • the incubation system of the primer and the transposase can be prepared by the user or commercially available.
  • the reaction conditions of the incubation reaction and the transposition reaction can be adjusted.
  • any component of the primer dimer can be modified, including but not limited to phosphorylation modification and biotin modification.
  • Reverse transcription primers can be used to perform reverse transcription reaction on cells. After the reverse transcription reaction is completed, the cells can be incubated at 42°C for a period of time, and a template switching oligomer (TSO) can be used to perform a template switching reaction.
  • TSO template switching oligomer
  • the template switching oligomer can provide a handle sequence for subsequent coding integration for the reverse transcription product.
  • the result of the template switching reaction is the integration of the template switching oligomer, the complementary product of the template switching oligomer, or the complementary product of a portion of the template switching oligomer into the library sequence.
  • the "template switching reaction" can be replaced by, but not limited to, reverse transcription reaction, ligation reaction, and DNA polymerization reaction.
  • the reverse transcriptase can be, but is not limited to, a reverse transcriptase based on Moloney murine leukemia virus (MMLV) or a reverse transcriptase derived therefrom.
  • the reverse transcription primer can be modified, and this modification includes, but is not limited to, biotin modification.
  • the incubation time at 42°C can be adjusted as needed. Incubation at 42°C may also be omitted. Generally, incubation at 42°C can improve the yield to a certain extent.
  • Template switching oligomers include but are not limited to modified or unmodified double-stranded template switching oligomers or modified or unmodified single-stranded template switching oligomers.
  • the double-stranded template switching oligomer can be modified by including but not limited to 3' phosphorylation modification and locked nucleic acid modification.
  • a double-stranded template switching oligomer can be formed by annealing two nucleic acid sequences.
  • the modified double-stranded template switching oligomer carries modifications including but not limited to 5' phosphorylation, 3' phosphorylation and locked nucleic acid, alone or in combination.
  • the TSO modification needs to be removed; after the template switching step is completed, a related enzyme that can remove one or more special modifications is used to remove one or more special modifications, so that the double-stranded TSO exposes chemical groups that can be used for subsequent reactions.
  • the modification may not affect the subsequent reaction steps of this article, so it is not necessary to remove the special modification.
  • the special modification described herein may be 3' phosphorylation, in which case it may or may not be necessary to use a reaction including but not limited to T4 polynucleotide kinase (NEB M0201S or NEB M0201L) treatment to remove the modification.
  • T4 polynucleotide kinase NEB M0201S or NEB M0201L
  • a handle site is provided on the template switching oligomer for the subsequent integration of the code, thereby ensuring that the code is integrated at the 5' end of the mRNA sequence.
  • Other single-cell dual-omics sequencing technologies do not use this method of code integration.
  • the introduction of a 42°C incubation step increases the product yield.
  • 1 in FIG. 1 shows one strand of a single-stranded or double-stranded template switching oligomer (TSO), comprising three parts: 3, 4 and 5.
  • 2 in FIG. 1 shows a transcriptome, including but not limited to mRNA and gRNA in the CRISPR system.
  • 3 in FIG. 1 shows the handle sequence for subsequent coding integration; its length is a range that can be adjusted according to the experiment, and in some embodiments this fragment can be removed.
  • 4 in FIG. 1 shows a partial structural region of the TSO, which may include one or more functional regions, including but not limited to unique molecular identification signals (UMI), partial coding (functions include but are not limited to cell recognition, tissue recognition, etc.), double-stranded complementary sequences, etc.
  • 5 in FIG. 1 shows the bases complementary to 8 in FIG. 1; the number of M is a range that can be adjusted according to the experiment, M can be the same base or different bases, and the specific bases are determined according to N in 8 of FIG. 1; M can include but is not limited to deoxyribonucleic acid, ribonucleic acid and locked nucleic acid.
  • UMI unique molecular identification signals
  • 6 in FIG. 1 is a sequence complementary to 4 in FIG. 1, which can form a double-stranded structure with the template switching oligomer to increase the possibility that region 3 is a single strand; in addition, the sequence shown in 6 in FIG. 1 can be modified, and such modification includes but is not limited to 3' phosphorylation, 5' phosphorylation, etc.
  • 7 in FIG. 1 is a newly generated sequence using the partial sequence of 4 in FIG. 1 as a template.
  • 8 in FIG. 1 is a sequence generated by the terminal transferase activity of reverse transcriptase, and its length is a range. The specific sequence also depends on the reverse transcriptase used.
  • 9 in FIG. 1 is a sequence synthesized from the reverse transcription primer.
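The relationship between regions 5, 7 and 8 of FIG. 1 can be sketched as a simple string model in Python. This is an illustration under stated assumptions, not the patent's protocol: sequences are written 5' to 3', only DNA bases are modeled, pairing must be exact, and all example sequences and the function name `template_switch` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def template_switch(cdna, tail_len, tso):
    """Extend a first-strand cDNA using a TSO as template.

    The last tail_len bases of cdna stand for the non-templated tail
    (region 8). If the 3' end of the TSO (region 5) pairs with that tail,
    the cDNA is extended with the complement of the rest of the TSO
    (region 7). Returns None when the ends do not pair.
    """
    if tail_len <= 0 or tail_len > len(cdna):
        raise ValueError("tail_len must be between 1 and len(cdna)")
    tail = cdna[-tail_len:]
    if not tso.endswith(reverse_complement(tail)):
        return None
    upstream = tso[:-tail_len]  # handle and structural region of the TSO
    return cdna + reverse_complement(upstream)


# Hypothetical sequences: the cDNA ends in a CCC tail, the TSO in GGG.
cdna = "TTTTGACCTACCC"
tso = "CTACACGACGACGTGGG"
extended = template_switch(cdna, 3, tso)
no_pairing = template_switch("TTTTGACCTAAAA", 3, tso)
```

In this model the second strand of the extended product begins with the full TSO sequence, which is why a handle placed at the 5' end of the TSO marks the 5' end of the original RNA.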
  • Figure 7 shows the integration link of subsequent coding. Through this link, the subsequent coding can be integrated into the 5' end of the corresponding RNA in the transcriptome library, and at the same time, the transcriptome library and the chromatin open site library from the same cell have the same coding.
  • Figure 7A shows a schematic diagram of the library structure before the subsequent coding step. 1 shows the structure of the chromatin open site library after transposition, where X is the open genomic sequence. 2 shows the structure of the transcriptome library after template switching, where M and N are as described for 5 and 8 in Figure 1. 3 shows a simplified version of 1 and 2; because the coding integration principles of the transcriptome library and the chromatin open site library are similar, the schematic 3 is used below to represent both libraries. 4 shows the coding integration handle site. 5 shows the protruding region at the other end, which can be designed as needed, can be the same as or different from 4, and can be used for coding integration or other purposes.
  • FIG7B shows a schematic diagram of a medium-based capture library (applicable to, but not limited to, microfluidics, micropore technology, etc.), wherein 6 is a structural sequence complementary to 7, the complex formed by 6 and 7 can be integrated with the coding integration handle site, at least one of 6 and 7 contains a sequence including but not limited to cell coding information, at least one of 6 and 7 is combined with the medium interface 8; 8 is the medium interface, which may include but not limited to a plane, a curved surface, a sphere, etc.
  • FIG7C shows the integration of the code into the library 3 based on the amplification method (applicable to, but not limited to, flow sorting technology, etc.), 4 is the coding integration handle site, which may not be used in the amplification method, and a sequence including but not limited to cell coding information is added to the primer 9 to integrate the code into the library 3, 10 is the amplification primer at the other end, and a sequence including but not limited to coding information may be added or not added as needed;
  • FIG7D shows the coding integration method based on non-medium and amplification method mediation (applicable to, but not limited to, combined indexing technology, flow sorting technology, etc.). 4 is the coding integration handle site. The first-round coding sequence ([…] and […]) can be integrated with site 4. The second-round coding sequence ([…] and […]) can be […]. In addition, a third round or more of encoding can be designed as needed. In some embodiments, one or more rounds of encoding can be used for integration.
  • integration or binding includes but is not limited to integration or binding by complete sequence complementarity, integration or binding by partial sequence complementarity, or connection integration or binding, etc.
  • integration or binding result mentioned herein includes but is not limited to the formation of a double-stranded structure or a single-stranded structure, etc.
  • the cells and beads are prepared into a reaction system by microfluidics technology, and any suitable cells or cell components (such as cell nuclei, etc.) can be used.
  • the “subsequent reaction” includes but is not limited to ligation reaction and polymerase chain amplification (PCR) reaction.
  • cells can be sorted into reaction units by sorting techniques.
  • the coding integration handle sites of the chromatin open site sequence and the coding integration handle sites on the transcriptome of the cells can bind or react with the substances in the reaction unit to perform coding integration, thereby integrating the new coding into the chromatin open site sequence and the transcriptome sequence.
  • the sorting technology includes but is not limited to flow sorting technology.
  • the reaction unit includes but is not limited to a well of a well plate or a well of a microplate.
  • binding or reaction includes but is not limited to binding by sequence complementarity, binding by partial sequence complementarity or binding by connection, etc.
  • binding or reaction includes but is not limited to ligation reaction and polymerase chain amplification (PCR) reaction.
  • binding or reaction results includes but is not limited to the formation of a double-stranded structure or a single-stranded structure.
  • encoding includes but is not limited to being derived from a single strand or a double strand.
  • no handle sites may be included.
  • cells can be encoded multiple times in a combined index-based manner. After the cells are evenly mixed, they are randomly dispersed into a collection containing several reaction units.
  • the handle sites of the chromatin open site sequence and the handle sites on the TSO can be combined or reacted with the substances in the coding reaction units of the combined index for coding integration; after each round of integration, it is necessary to block the unintegrated coding by using a designed blocker to avoid interference with subsequent processes; if there are multiple rounds of combined indexes, after each round of integration, the chromatin open site sequence and the transcriptome sequence will be integrated into the handle sites available for subsequent coding integration. Through one or more rounds of coding integration, new coding can be integrated into the chromatin open site sequence and the transcriptome sequence.
  • the reaction unit includes but is not limited to a well of a well plate or a well of a microplate.
  • the "assembly comprising a plurality of reaction units" includes but is not limited to a well plate or a connecting tube.
  • the “subsequent reaction” includes but is not limited to ligation reaction and polymerase chain amplification (PCR) reaction.
  • cell lysis is required.
  • the cells need to be distributed to different wells (≥1) for lysis after being evenly mixed, and the number of cells in each well is ≥0.
  • the chromatin open site sequence and transcriptome sequence are released for subsequent processes.
  • the lysis reaction may need to be terminated under certain circumstances to avoid interference with subsequent processes.
  • the nucleic acid sequence attached to the medium can be used to simultaneously integrate into the chromatin open site and the transcriptome for encoding, and the coding integration step can be skipped to enter the subsequent process.
  • the present invention can use the handle site on the template switching oligomer to integrate the subsequent code into the template switching oligomer, so that the code is integrated into the 5' end of the mRNA sequence, or the code can be designed into the amplification primer to add the code to the 5' end of the mRNA.
  • the code integration of other single-cell bi-omics technologies known in the art is all at the 3' end of the mRNA sequence.
  • the constructed library needs to be amplified to increase the library fragments available for use.
  • amplification primers containing the coding and other systems for polymerase chain reaction can be added to the terminated cell lysate to perform a PCR reaction.
  • the primers may include a coding structure, which may be used as a part of a cell coding combination, or may be adjusted as needed to include but not limited to coding for sample source, experimental batch, cell type, etc.
  • beads integrated with library information can be placed in a library amplification system for library amplification.
  • the coding of the technology of the present invention can not only encode cells, but also encode and mark sample sources, test batches, cell types, etc. as needed, so that they can be distinguished when integrating multiple samples, multiple batches and multiple cell types data in the future.
  • streptavidin magnetic beads can be used to bind and capture whichever library carries the biotin label (the chromatin open site library or the transcriptome library), and the mixture of magnetic beads and library is placed on a magnetic stand for separation; the library without the biotin label remains in the supernatant, while the biotin-labeled library is bound to the magnetic beads and held by the magnetic stand.
  • the separated chromatin open site library and transcriptome library can be further amplified and purified as needed, respectively.
  • the purification includes but is not limited to removal of impurities and fragment length screening.
  • the separated chromatin open site library may not have a sequencing adapter and cannot be sequenced, so it is necessary to add a sequencing universal adapter to the library.
  • the reaction of adding a sequencing universal adapter includes but is not limited to using a PCR reaction, a ligation reaction, etc.
  • amplification and purification are required after adding the universal sequencing adapter.
  • the purification includes but is not limited to removing impurities and screening the fragment length.
  • the separated full-length transcriptome library may not have a sequencing adapter and cannot be sequenced.
  • the purified transcriptome library is processed to add a sequencing adapter.
  • the "processing" includes but is not limited to a single-step process or a multi-step process using Tn5 transposition, nuclease cleavage, PCR reaction, and ligation reaction.
  • the obtained transcriptome library may contain special sequence libraries such as T cell receptor sequences, B cell receptor sequences, gRNA sequences, etc.
  • relevant primers can be used to enrich these special sequences if necessary (including but not limited to PCR amplification and biotin-streptavidin magnetic bead capture, etc.).
  • if the enriched T cell receptor sequencing library, B cell receptor sequencing library and gRNA sequencing library do not carry sequencing adapters, they need to be further processed so that the special sequence sequencing library obtains a universal sequencing adapter, including but not limited to using Tn5 transposition reaction, PCR reaction, ligation reaction, etc.
  • the cell coding combination can be retained while enriching special sequence libraries such as T cell receptor sequences, B cell receptor sequences and gRNA sequences, so the preparation and sequencing of compatible related special sequence libraries can be achieved.
  • the experimental reagents, equipment and specific experimental conditions in the following embodiments have been verified to enable the present invention to be implemented.
  • the experimental reagents, equipment and specific experimental conditions are only for the convenience of understanding the present invention, but do not limit the present invention.
  • the use of alternative reagents, alternative equipment or alternative conditions is within the scope of protection of the present invention.
  • a mixed cell line library was prepared based on the K562 cell line and the NIH3T3 cell line, and only the chromatin open site library and the transcriptome library obtained by mRNA reverse transcription were prepared (special libraries such as TCR library, BCR library, CRISPR gRNA library, etc. were not included).
  • K562 is a suspension cell line and does not need to be treated with trypsin.
  • NIH3T3 is an adherent cell line and needs to be treated with a certain concentration of trypsin to make it suspended.
  • Tn5 naked enzyme was first assembled according to the system in Table 3 and incubated at room temperature for 1 hour for later use.
  • the 10X assembly buffer in the Tn5 transposon assembly system shown in Table 3 is the assembly buffer provided by the supplier, and the assembly reaction is carried out according to the standard operating procedures provided by the supplier.
  • the transposition primer dimer is a double-stranded DNA structure, and the annealing system and conditions are carried out according to the standard operating procedures provided by the supplier.
  • the assembled Tn5 transposon was transposed for 30 minutes at 37°C 500rpm in a constant temperature shaker under the reaction system shown in Table 4. After the transposition is completed, the chromatin open site sequence will carry the subsequent coding integration handle site.
  • After the reverse transcription reaction is completed, add an appropriate volume (such as 50 μL) of NIB-BSA-RI (H) wash solution, centrifuge at 500xg, 4°C for 5 min, and remove the supernatant; wash twice in total. After washing, resuspend the cell pellet with the template switching system (Table 9) and perform the template switching reaction (Table 10). If the template switching oligomer adopts a double-stranded structure, it needs to be annealed first. After the template switching reaction is completed, add an appropriate volume (such as 50 μL) of NIB-BSA-RI wash solution, centrifuge at 500xg, 4°C for 5 min, and remove the supernatant; wash twice in total.
  • Resuspend the cells with an appropriate amount (such as 1152 μL) of NIB-BSA-RI (L) (Table 13). At this point, the transcriptome in the cell has provided a subsequent coding integration handle site at the 5' end of the RNA.
  • Template switching oligomer annealing system (reagent, volume in μL): 100 μM template switching oligomer single strand 1, 10; 100 μM template switching oligomer single strand 2, 10; H2O (added after annealing), 30.
  • the subsequent coding integration can be performed in different ways.
  • This embodiment adopts a combined index method, so the following steps take the combined index as an example.
  • This link may introduce another round of coding, or may not introduce new coding.
  • This embodiment introduces a round of coding in this link, and the following is the process of introducing a new round of coding.
  • samples from all wells were collected and mixed, and then purified according to the operating procedures of the commercial kit MinElute PCR Purification Kit (QIAGEN Cat. No./ID: 28006), and finally eluted with 50 μL DEPC H2O. After elution, 100 μL (2X) SPRISelect Beads (Beckman Coulter) were added to purify the fragments according to the operating procedures.
  • the transcriptome library has been labeled with biotin, while the chromatin open site library does not have biotin.
  • the chromatin open site sequence and transcriptome sequence from the same cell will contain the same cell coding combination.
  • the purified product was bound to streptavidin magnetic beads (Dynabeads™ MyOne™ Streptavidin C1), incubated at room temperature for 1 hour, and placed on a magnetic stand to separate the chromatin open site library and the transcriptome library: the magnetic beads bind the transcriptome library, and the supernatant contains the chromatin open site library.
  • the obtained supernatant was purified using the QIAGEN MinElute PCR Purification Kit and eluted in 23 μL DEPC H2O for the construction of the final (sequencing-ready) chromatin open site library.
  • after the transcriptome library release reaction, the magnetic beads are separated from the product system using a magnetic stand, and the product system (supernatant) is collected and amplified by PCR. After amplification, the library is purified using 0.6X SPRISelect beads and finally eluted with an appropriate volume (such as 20 μL) of DEPC H2O for later use. It can be used for the construction of the final transcriptome library and for the enrichment and construction of the special sequence library.
  • if the transcriptome library contains special sequence libraries, the initial library amplification method can be followed, using corresponding primers to enrich and separate the mRNA reverse transcription library and the special sequence library (the special sequence library can be handled similarly to the chromatin open site library), and the final library can be constructed for sequencing.
  • the final chromatin open site library was constructed from the chromatin open site library obtained in Section 6 using the following system and reaction. After construction, the final chromatin open site library was purified using 0.5X-0.9X SPRISelect beads in double-sided size selection mode and eluted in 20 μL DEPC H2O. At this point, the chromatin open site library was prepared and can be sequenced.
  • the 10 μM final library primer 1 in Table 31 can be designed to include coding as needed.
  • the transcriptome library obtained in Section 6 was used to prepare the transcriptome cutting (Tn5 tagmentation) system, which was left to stand at a constant temperature of 55°C for 5 minutes.
  • the 10 μM final library primer 1 in Table 34 can be designed to include coding as needed.
  • the product is purified with 0.6X SPRISelect beads and eluted with 20 μL DEPC H2O. At this point, the final transcriptome library preparation is complete; the library contains the 5' information of the transcriptome and can be sequenced.
  • FIG8 is a diagram of the cell encoding effect obtained according to an embodiment of the present invention.
  • 50,000 cells were input to obtain transcriptome data of 7,138 cells, and chromatin open site data of 6,665 cells.
  • across the two omics, transcriptome and chromatin open site data were obtained simultaneously for a total of 6,378 cells, with an acceptable doublet rate (~5%).
  • the distribution of the obtained transcriptome reads along genes was examined, and the results showed that the transcriptome sequences were mainly distributed at the 5' end of the mRNA, demonstrating that the transcriptome data of the present invention retain mRNA 5' information (taking K562 as an example) (see Figure 9).

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Biochemistry (AREA)
  • Biotechnology (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Immunology (AREA)
  • Plant Pathology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

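This application discloses a method for constructing a single-cell sequencing library for dual-omics profiling of the single-cell transcriptome and chromatin accessibility, comprising: a) preparing a single-cell suspension; b) capturing chromatin open sites; c) performing reverse transcription on the transcriptome of the cells using a reverse transcriptase and a reverse transcription primer to capture single-cell transcriptome information; d) performing a template switching reaction using a template switching oligomer (TSO); e) subsequent barcoding; f) initial library amplification; and g) preparing a chromatin open site library and a transcriptome library for sequencing. This application also discloses a method for sequencing single-cell transcriptome and chromatin accessibility dual-omics sequencing libraries, comprising separately sequencing the chromatin open site library and the transcriptome library prepared by said method.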

Description

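A method for constructing single-cell transcriptome and chromatin accessibility dual-omics sequencing libraries, and a sequencing method
Technical Field
The present invention belongs to the field of biomedical technology. Specifically, it relates to a method for constructing single-cell sequencing libraries, and more specifically to a method for constructing single-cell transcriptome and chromatin accessibility dual-omics sequencing libraries, the sequencing libraries so prepared, and a method of sequencing using said libraries.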
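Background
Single-cell transcriptome and chromatin accessibility dual-omics sequencing (hereinafter "single-cell dual-omics sequencing") was first published in Science in 2018 (Figure 2, sci-CAR-seq [1]). sci-CAR-seq barcodes individual cells by combinatorial indexing; however, because it uses only two rounds of combinatorial indexing, its theoretical throughput is only on the order of tens of thousands of cells. Researchers subsequently developed single-cell dual-omics sequencing technologies based on flow cytometric sorting (Figure 3, scCAT-seq [2], 2019, throughput on the order of hundreds of cells) and on microfluidics (Figure 4, SNARE-seq [3], 2019, throughput on the order of tens of thousands of cells). Improving throughput and data quality has consistently been the direction of development. Because combinatorial indexing can deliver very high throughput, researchers built on it to develop single-cell chromatin accessibility and transcriptome dual-omics sequencing technologies with throughput on the order of millions of cells (Figure 5, Paired-seq [4], 2019; Figure 6, SHARE-seq [5], 2020). 10x Genomics has released the first commercially available microfluidics-based single-cell chromatin accessibility and transcriptome dual-omics sequencing kit (Chromium, 2020). The most recent technology developed before the filing date of this application can barcode cells by either flow sorting or microfluidics (ISSAAC-seq [6], 2022).
The existing single-cell dual-omics sequencing technologies are briefly described below:
sci-CAR-seq [1]: Cells are first distributed into different wells of a well plate. mRNA is reverse transcribed into cDNA so that the cDNA of cells from the same well receives the first-round barcode, and a transposome is then used for transposition so that the chromatin open sites of cells from the same well receive the first-round barcode. All cells are then pooled, mixed, and redistributed into a new well plate for lysis. One portion of the lysate is used to amplify the cDNA library, adding the second-round barcode to the cDNA; the other portion is used to amplify the chromatin open site library, adding the second-round barcode to it. Finally, the transcriptome library and chromatin open site library originating from the same cell are identified by the combination of the two rounds of barcodes (see Figure 2 for the sci-CAR-seq workflow).
scCAT-seq [2]: Cells are sorted by laser into the wells of a well plate, one cell per well. The cells in the wells are lysed and the cytoplasm is separated from the nucleus. The cytoplasm undergoes reverse transcription to construct the transcriptome library, while the nucleus undergoes transposition to construct the chromatin open site library (see Figure 3 for the scCAT-seq workflow).
SNARE-seq [3]: A suspension of single nuclei is prepared, and the nuclei undergo Tn5 transposition to capture chromatin open site information. In a microfluidic workflow, a capture bead, a single nucleus, and a splint primer are encapsulated in one droplet, where the reaction takes place: the chromatin open site sequence information is transferred to the bead, and the mRNA information is captured by the primers on the bead. The libraries are amplified to obtain a transcriptome library and a chromatin open site library ready for sequencing (see Figure 4 for the SNARE-seq workflow).
Paired-seq [4]: Cells are evenly distributed into different wells. Transposition and reverse transcription give the chromatin open sites and the transcriptome of cells in the same well the same barcode. Barcodes are ligated by three rounds of combinatorial indexing, and the chromatin open site and transcriptome sequences from the same cell are finally identified by their barcode combination (see Figure 5 for the Paired-seq workflow).
SHARE-seq [5]: Cells first undergo transposition and are then reverse transcribed with a biotin-labeled reverse transcription primer. Using combinatorial indexing, the cells are evenly distributed into different wells for barcode ligation, and this is repeated twice more. The chromatin open site information and transcriptome information from the same cell can finally be identified by the barcode combination (see Figure 6 for the SHARE-seq workflow).
10x Chromium Single Cell Multiome (ATAC + Gene Expression) kit: A suspension of single nuclei is prepared, and the nuclei undergo Tn5 transposition to capture chromatin open site information. In a microfluidic workflow, a capture bead, a single nucleus, and the reaction mix are encapsulated in one droplet, where the reaction takes place and the chromatin open site and transcriptome libraries of the cell are barcoded at the same time.
ISSAAC-seq [6]: A suspension of single nuclei is prepared, and the nuclei undergo Tn5 transposition to capture chromatin open site information, followed by reverse transcription to capture transcriptome information. Subsequently, one option is a microfluidic workflow in which a capture bead, a single nucleus, and the reaction mix are encapsulated in one droplet, where the chromatin open site and transcriptome libraries of the cell are barcoded at the same time; alternatively, flow sorting can be used to sort nuclei into a well plate for barcode ligation.
However, all existing single-cell dual-omics sequencing technologies mainly capture the 3' end of messenger RNA and cannot retain much of its 5' information, which limits their use. For example, current dual-omics technologies cannot conveniently sequence the commercial sgRNA libraries used for gene-editing screens (CRISPR screens); obtaining the sgRNA sequence together with the changes in the transcriptome and chromatin open sites after gene editing or gene regulation would facilitate the study of transcriptional regulatory networks, cell signaling networks, and the like. In addition, 3' sequencing has limited compatibility with TCR and BCR sequencing, which makes lineage tracing through TCR and BCR information difficult in single-cell studies of the immune system. With 5' sequencing, TCR and BCR information can be obtained and immune cells can be traced, so that inferred immune cell trajectories can be further corrected to give more reliable results, which is very important for the study of immune plasticity. The technical principles presented in this disclosure address these problems and other related needs.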
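Summary of the Invention
The inventors of this application have designed a high-throughput library construction technology for single-cell transcriptome and chromatin accessibility dual-omics sequencing that captures the 5' information of messenger RNA. Specifically, the technical problems in the art are solved by the technical solutions set out in the following items.
1. A method for constructing a single-cell sequencing library for dual-omics profiling of the single-cell transcriptome and chromatin accessibility, comprising the following steps:
a) preparing a single-cell suspension;
b) capturing chromatin open sites, which comprises performing a transposition reaction on the chromatin of the cells with a Tn5 transposome assembled from a primer dimer and Tn5 transposase, cutting the chromatin open sites of the cells, and attaching the primer dimer to the chromatin open sites;
c) performing reverse transcription on the transcriptome of the cells using a reverse transcriptase and a reverse transcription primer to capture single-cell transcriptome information;
d) performing a template switching reaction using a template switching oligomer (TSO);
e) subsequent barcoding to obtain a transcriptome library and a chromatin open site library carrying cell barcodes, the subsequent barcoding comprising using different platforms or workflows as needed for single-cell isolation or barcode integration, wherein the transcriptome library and the chromatin open site library from the same cell carry the same cell barcode or cell barcode combination;
f) initial library amplification, which comprises amplifying the previously constructed library to increase the library fragments available for use;
g) preparing a chromatin open site library and a transcriptome library for sequencing, which comprises separating the chromatin open site library and the transcriptome library from the library amplified in the previous step,
wherein the template switching oligomer (shown in Figure 1) comprises a sequence ⑤ that is fully or partially complementary to the sequence ⑧ generated by the terminal transferase activity of the reverse transcriptase, so that the reverse transcriptase can continue sequence synthesis using the template switching oligomer as a template; the template switching oligomer further comprises a partial structural region ④ located 5' of sequence ⑤ and a handle sequence ③ for subsequent barcode integration located 5' of the partial structural region ④; optionally, the template switching oligomer further comprises a sequence ⑥ that is complementary to the partial structural region ④ and forms a double-stranded structure with the template switching oligomer.
2. The method of item 1, wherein the single-cell suspension in step a) is used in subsequent steps without fixation or after fixation with a fixative, for example formaldehyde.
3. The method of item 2, wherein after said fixation, the fixation is either quenched or not quenched before use in subsequent steps.
4. The method of any one of items 1 to 3, wherein the primer dimer, the template switching oligomer, and the reverse transcription primer are modified, for example by phosphorylation and biotinylation.
5. The method of any one of items 1 to 4, wherein sequence ⑥ of Figure 1 is 3'-phosphorylated, and the 3' phosphorylation can be removed in a subsequent reaction, for example by a polynucleotide kinase reaction.
6. The method of any one of items 1 to 5, wherein the template switching oligomer is attached to a medium such as a bead, a magnetic bead, a microwell, a chip, or a plate, or is in a free state.
7. The method of any one of items 1 to 6, wherein the partial structural region ④ of Figure 1 comprises one or more functional regions, which may be selected from, without limitation, a unique molecular identifier (UMI), a partial barcode, a double-strand complementary sequence, and the like.
8. The method of any one of items 1 to 7, wherein sequence ⑤ of Figure 1 consists of DNA, RNA, locked nucleic acid, or a combination thereof.
9. The method of any one of items 1 to 4, wherein sequence ⑥ of Figure 1 is 5'-phosphorylated.
10. The method of any one of items 1 to 9, wherein the initial library amplification is performed by PCR, and the primers used comprise a barcode structure that serves as part of the cell barcode combination and/or comprise a barcode for sample source, experimental batch, or cell type.
11. The method of any one of items 1 to 10, further comprising further amplifying and purifying the separated chromatin open site library and transcriptome library.
12. The method of any one of items 1 to 11, further comprising adding a universal sequencing adapter to the separated chromatin open site library and/or adding a universal sequencing adapter to the separated transcriptome library.
13. The method of any one of items 1 to 12, wherein the transcriptome includes but is not limited to mRNA encoding a T cell receptor (TCR) or B cell receptor (BCR) and/or guide RNA of a CRISPR system, and the transcriptome library includes a TCR library, a BCR library, and/or a gRNA library.
14. A single-cell transcriptome and chromatin accessibility dual-omics sequencing library prepared by the method of any one of items 1 to 13.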
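The technology of the present invention is named USTC-V-Seq herein. It uses a new way of attaching barcodes: barcodes can be attached to the chromatin open sites and/or the transcriptome while retaining the 5' information of the transcriptome, and the chromatin open sequence fragments and transcriptome sequences from the same cell carry the same barcode combination. In addition, because special sequences are present, gRNA library and BCR and/or TCR library sequences can be further enriched from the obtained transcriptome library for the corresponding gRNA, BCR, and TCR sequencing. On this basis, USTC-V-Seq can simultaneously obtain chromatin accessibility information and transcriptome information from the same cell, and is compatible with obtaining gRNA information (if present), BCR sequence information (if present), and TCR sequence information (if present).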
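The pairing principle stated above (fragments from the same cell carry the same barcode combination in both libraries) can be sketched in a few lines. This is only an illustration: the record layout, the library labels "ATAC" and "RNA", and the barcode sequences are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def paired_cells(reads):
    """Return the barcode combinations observed in both libraries.

    reads: iterable of (barcode_combination, library) pairs, where library
    is "ATAC" (chromatin open sites) or "RNA" (transcriptome). The labels
    and the record layout are illustrative assumptions.
    """
    libraries_by_cell = defaultdict(set)
    for combination, library in reads:
        libraries_by_cell[tuple(combination)].add(library)
    # A cell counts as paired when its combination appears in both libraries.
    return {
        combination
        for combination, libraries in libraries_by_cell.items()
        if {"ATAC", "RNA"} <= libraries
    }


# Hypothetical three-round barcode combinations.
example_reads = [
    (("ATCG", "TACG", "GGCA"), "ATAC"),
    (("ATCG", "TACG", "GGCA"), "RNA"),
    (("ATCG", "CCGT", "GGCA"), "RNA"),  # transcriptome only, so not paired
]
paired = paired_cells(example_reads)
print(paired)
```

A real pipeline would first extract and error-correct the barcodes from the sequencing reads; that step is outside the scope of this sketch.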
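The template switching design of the present invention is original. It is commonly assumed that, during the template switching reaction, the newly synthesized complementary strand fills in the template switching strand; the workflow diagrams of other technologies also assume this fill-in by default, so in theory a barcode could not be ligated to the template switching oligomer. This consensus is not entirely correct. In our experiments we found that, even with a single-stranded template switching oligomer, the barcode can be ligated to the template switching strand after the transposition reaction has been carried out, and this was confirmed in repeated experiments. The underlying mechanism has not yet been verified, but we speculate that an incomplete reaction leaves the template switching oligomer not filled in, so that the barcode integration handle site remains available for ligation to the barcode, thereby barcoding the cell. The double strand is added in the hope of further blocking this fill-in during template switching; in practice, barcodes can be integrated at the barcode integration handle site of either a single-stranded or a double-stranded template switching oligomer.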
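Brief Description of the Drawings
Figure 1. Schematic of transcriptome capture.
Figure 2. Schematic of the sci-CAR-seq workflow.
Figure 3. Schematic of the scCAT-seq workflow.
Figure 4. Schematic of the SNARE-seq workflow.
Figure 5. Schematic of the Paired-seq workflow.
Figure 6. Schematic of the SHARE-seq workflow.
Figure 7. Schematic of the integration of subsequent barcodes.
Figure 8. Cell barcoding results.
Figure 9. Alignment of transcriptome data along genes.
Figure 10. Correlation between chromatin open site data and transcriptome data.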
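Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings. Unless otherwise specified herein, terms in this application have the meanings commonly understood by those skilled in the art.
Definitions
Barcode: A barcode (also "barcoding" or "index") or barcode combination as used herein refers to distinct base sequences made of nucleic acid; for example, ATCG and TACG are two different barcodes.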
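As a small illustration of the definition above, the difference between two equal-length barcodes can be counted position by position. The Hamming distance used here is a common convention for comparing barcodes and is an assumption of this sketch, not something the patent prescribes.

```python
def hamming_distance(barcode_a: str, barcode_b: str) -> int:
    """Count the positions at which two equal-length barcodes differ."""
    if len(barcode_a) != len(barcode_b):
        raise ValueError("barcodes must have the same length")
    return sum(base_a != base_b for base_a, base_b in zip(barcode_a, barcode_b))


# The two example barcodes from the definition differ at the first two positions.
distance = hamming_distance("ATCG", "TACG")
print(distance)  # 2
```

Barcode sets are often designed so that every pair differs at several positions, which lets single sequencing errors be detected; whether the barcodes used here follow such a design is not stated in the source.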
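Cell: The basic unit through which mammals (such as humans and mice) carry out life activities. In this disclosure, "cell" is not limited to whole cells but also refers to other cellular components such as nuclei and mitochondria. For example, a single-cell suspension may also be a single-nucleus suspension.
Chromatin: A linear complex in the cell nucleus composed of DNA, histones, non-histone proteins, and a small amount of RNA. Its basic unit is the nucleosome, formed by DNA wrapped around histones.
Chromatin accessibility: A measure of whether a given stretch of DNA is wrapped around histones. In general, chromatin is in one of two states: 1) the DNA is tightly wrapped around nucleosomes, referred to as closed DNA; 2) the DNA is not wrapped around nucleosomes and is exposed, referred to as open DNA.
Chromatin accessibility library: A sequencing library composed of the sequences of chromatin open sites.
Chromatin accessibility sequencing (ATAC-seq): A sequencing technology developed at Stanford University in 2012 for measuring chromatin accessibility in biological samples (>500 cells).
Genome: The complete DNA sequence of an organism, composed of an ordered arrangement of the four bases A, T, C, and G. The genomes of major mammals such as human and mouse have been fully sequenced.
Gene: A gene (genetic factor) is the entire DNA sequence required to produce one polypeptide chain or functional RNA. A gene is generally one or more stretches of DNA in the genome.
Transcriptome: In the broad sense, the totality of all RNA that a cell can transcribe, including messenger RNA (mRNA), ribosomal RNA (rRNA), transfer RNA (tRNA), and non-coding RNA; in the narrow sense, all messenger RNA (mRNA) that a cell can transcribe. This disclosure uses its own specific definition, namely all RNA that meets the requirements of library construction, including but not limited to mRNA and the gRNA of CRISPR systems.
Antigen: A substance that can elicit antibody production; any substance that can induce an immune response.
Antibody: A protective protein produced by the body in response to antigen stimulation.
B cell receptor (BCR): A molecule on the surface of B cells responsible for specifically recognizing and binding antigen; in essence, a membrane-bound immunoglobulin.
T cell receptor (TCR): A specific receptor on the surface of T cells responsible for recognizing antigens presented by the major histocompatibility complex (MHC); unlike the B cell receptor, it cannot recognize free antigen.
B cell receptor/T cell receptor sequencing (BCR-seq/TCR-seq): Sequencing technologies targeting B cell receptor or T cell receptor sequences.
Transcription factor: A protein that recognizes a specific DNA sequence pattern (motif), binds to DNA, and can initiate or regulate gene expression.
Single-cell chromatin accessibility sequencing (scATAC-seq): A sequencing method for measuring chromatin accessibility in individual cells.
Single-cell transcriptome sequencing (scRNA-seq): A sequencing method for measuring the transcriptome of individual cells.
Single-cell multi-omics sequencing: Sequencing library construction methods for measuring multiple dimensions of information in individual cells, for example methods that simultaneously obtain dual-omics or higher-order omics information, such as the transcriptome, chromatin accessibility, and/or proteome, from the same cell (e.g. sci-CAR, Paired-seq, SHARE-seq, 10x Chromium, CITE-seq, REAP-seq).
The "template switching oligomer (TSO)" described herein may go by other names or be unnamed. In essence it is a single-stranded or double-stranded nucleic acid sequence, which may be attached to a medium including but not limited to beads, magnetic beads, microwells, chips, and plates, or may be in a free state. The template switching oligomer (TSO) may contain functional regions including but not limited to a barcode, a unique molecular identifier (UMI), and a handle site for subsequent barcode integration.
The "template switching reaction" described herein generally refers to a reaction in which, under the action of a reverse transcriptase, sequence synthesis continues using the template switching oligomer as a template, generating a sequence that is complementary or partially complementary to the template switching oligomer.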
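Preparation of the single-cell suspension
Non-dissociated tissue and/or dissociated tissue (such as blood) can be prepared as a single-cell suspension by various means including but not limited to cutting, grinding, and enzymatic digestion. The tissue may be healthy or diseased, and may be in a state including but not limited to fresh tissue, frozen tissue, or sectioned tissue. In some embodiments, nuclei may need to be extracted; extraction includes but is not limited to grinding, treatment with a permeabilizing agent, and sorting by flow cytometry.
Fixation of the single-cell suspension
In some embodiments, the single-cell suspension can proceed to subsequent steps without fixation. In some embodiments, the cell concentration of the single-cell suspension needs to be measured and a suitable fixation system (containing a fixative) used to achieve the desired fixation. Fixatives that can be used include but are not limited to aldehydes such as formaldehyde, glutaraldehyde, and paraformaldehyde, alcohols such as methanol and ethanol, and acetone; the fixative can be used at any suitable level or concentration.
In some embodiments, fixation can be quenched after fixing, using reagents or means including but not limited to glycine and bovine serum albumin at any suitable level or concentration. In some embodiments, subsequent reactions can proceed directly without quenching the fixation.
Capture of chromatin open sites
A designed primer dimer can be incubated with Tn5 transposase to assemble a Tn5 transposome usable for transposition. A corresponding Tn5 transposition system is prepared, and the Tn5 transposome is used in a transposition reaction to cut the chromatin open sites of the cells and attach the primer dimer to them. The primer dimer usually carries a handle sequence for subsequent barcode integration.
Methods for designing primer dimers are well known in the art. In some embodiments, the primer dimer can be designed or adjusted by the user as needed; this document provides only one workable example and does not limit the sequences used in the invention.
The Tn5 transposase may be synthesized or purified in-house, or purchased from a supplier, including but not limited to, for example, VAZYME S601-01. The incubation system for primer and transposase may be prepared in-house or be commercial. The conditions of the incubation reaction and the transposition reaction can be adjusted.
In some embodiments, any component of the primer dimer can be modified; such modifications include but are not limited to phosphorylation and biotinylation.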
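Capture of transcriptome information
A reverse transcription primer can be used to perform reverse transcription on the cells. After the reverse transcription reaction is complete, the cells can be incubated at 42°C for a period of time, and a template switching oligomer (TSO) is then used for the template switching reaction. The template switching oligomer can provide the reverse transcription product with a handle sequence for subsequent barcode integration.
The result of the template switching reaction is that the template switching oligomer, the complement of the template switching oligomer, or the complement of part of the template switching oligomer is integrated into the library sequence.
In some embodiments, the "template switching reaction" can be replaced by reactions including but not limited to reverse transcription, ligation, and DNA polymerization. In some embodiments, the reverse transcriptase may be, without limitation, a Moloney murine leukemia virus (MMLV)-based reverse transcriptase or a derivative thereof. In some embodiments, the reverse transcription primer can be modified; such modifications include but are not limited to biotinylation.
The duration of the 42°C incubation can be adjusted as needed, and the incubation can also be omitted. In general, incubation at 42°C improves yield to some extent.
Template switching oligomers include but are not limited to modified or unmodified double-stranded template switching oligomers and modified or unmodified single-stranded template switching oligomers. For example, to prevent the double-stranded TSO from being extended by the reverse transcriptase, which would block the barcode ligation site, while preserving the ligation efficiency of the double-stranded TSO, the double-stranded template switching oligomer can be modified, including but not limited to by 3' phosphorylation and locked nucleic acid modification.
In some embodiments, the double-stranded template switching oligomer (TSO) can be formed by annealing two nucleic acid sequences.
In some embodiments, the modifications of a modified double-stranded template switching oligomer (TSO) include but are not limited to 5' phosphorylation, 3' phosphorylation, and locked nucleic acids, alone or in combination.
In some embodiments, the TSO modification needs to be removed: after the template switching step, an enzyme capable of removing one or more particular modifications is used to remove them, exposing chemical groups on the double-stranded TSO that can take part in subsequent reactions. In some embodiments, the modification may not affect the subsequent reaction steps described herein, so removal of the special modification is not necessarily required.
In some embodiments, the special modification described herein may be 3' phosphorylation; in this case, a reaction using, without limitation, T4 polynucleotide kinase (NEB M0201S or NEB M0201L) may or may not be needed to remove the modification.
In the transcriptome capture procedure, a handle site for subsequent barcode integration is provided on the template switching oligomer, which ensures that the barcode is integrated at the 5' end of the mRNA sequence; no other single-cell dual-omics sequencing technology uses this method of barcode integration. In addition, the 42°C incubation step is introduced to increase product yield.
Figure 1 shows a schematic example of transcriptome capture. Through this step, the transcriptome obtains a partial 5' barcode and the handle for subsequent barcode integration. ① in Figure 1 is the single-stranded template switching oligomer (TSO), or one strand of the double-stranded TSO, comprising the three parts ③, ④, and ⑤. ② is the transcriptome, including but not limited to mRNA and the gRNA of CRISPR systems. ③ is the handle sequence for subsequent barcode integration; its length spans a range that can be adjusted experimentally, and in some embodiments this segment can be omitted. ④ is the partial structural region of the TSO, which may include one or more functional regions, including but not limited to a unique molecular identifier (UMI), a partial barcode (with functions including but not limited to cell identification and tissue identification), and a double-strand complementary sequence. ⑤ consists of the bases complementary to ⑧: the number of M bases spans a range that can be adjusted experimentally; the M bases may be identical or different, and the specific bases are determined by the N bases of ⑧; M may be, without limitation, deoxyribonucleic acid, ribonucleic acid, or locked nucleic acid. ⑥ is a sequence complementary to ④ that can form a double-stranded structure with the template switching oligomer, increasing the likelihood that region ③ remains single-stranded; in addition, sequence ⑥ may be modified, including but not limited to by 3' phosphorylation or 5' phosphorylation. ⑦ is the sequence newly synthesized using part of ④ as a template. ⑧ is the sequence generated by the terminal transferase activity of the reverse transcriptase; its length spans a range, and its specific sequence depends on the reverse transcriptase used. ⑨ is the sequence synthesized from the reverse transcription primer.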
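The layout of the template switching oligomer in Figure 1 (handle ③, then structural region ④, then the M bases ⑤ that pair with the tail ⑧ added by the reverse transcriptase) can be sketched as simple string assembly. All sequences below are hypothetical placeholders, and the choice of a C-rich tail is an assumption about MMLV-type reverse transcriptases, not a sequence given in the patent.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.translate(COMPLEMENT)[::-1]


def build_tso(handle: str, structural_region: str, rt_added_tail: str) -> str:
    """Assemble 5'-handle (3)-structural region (4)-M bases (5)-3'.

    The M bases are derived as the reverse complement of the tail (8) that
    the reverse transcriptase adds to the 3' end of the cDNA.
    """
    m_bases = reverse_complement(rt_added_tail)
    return handle + structural_region + m_bases


# Hypothetical placeholder sequences, for illustration only.
tso = build_tso(handle="ACGTACGT", structural_region="GATTACAG", rt_added_tail="CCC")
print(tso)  # ACGTACGTGATTACAGGGG
```

In practice the M bases may be RNA or locked nucleic acid rather than DNA, as the text notes; the string model above ignores base chemistry and modifications.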
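Subsequent barcoding
Depending on need, different platforms or workflows can be used for single-cell isolation or barcode integration, including but not limited to cell barcoding based on microfluidics, flow sorting, and combinatorial indexing. Figure 7 shows the integration of subsequent barcodes. Through this step, subsequent barcodes can be integrated at the 5' end of the corresponding RNA in the transcriptome library, while the transcriptome library and chromatin open site library from the same cell receive the same barcode.
Figure 7A shows the library structures before subsequent barcoding: ① is the structure of the chromatin open site library after transposition, where X is the open genomic sequence; ② is the structure of the transcriptome library after template switching, where M and N are as described for ⑤ and ⑧ of Figure 1; ③ is a simplified version of ① and ②, and because barcode integration works similarly for the transcriptome library and the chromatin open site library, the schematic ③ is used below to stand for both; ④ is the barcode integration handle site; ⑤ is the overhang at the other end, which can be designed as needed, may be the same as or different from ④, and can be used for barcode integration or other purposes.
Figure 7B shows medium-based library capture (applicable to, without limitation, microfluidic and microwell technologies): ⑥ is a structural sequence complementary to ⑦; the complex formed by ⑥ and ⑦ can integrate with the barcode integration handle site; at least one of ⑥ and ⑦ contains a sequence carrying, without limitation, cell barcode information; at least one of ⑥ and ⑦ is bound to the medium interface ⑧; ⑧ is the medium interface, which may be, without limitation, a flat, curved, or spherical surface.
Figure 7C shows amplification-based integration of barcodes into library ③ (applicable to, without limitation, flow sorting): ④ is the barcode integration handle site, which need not be used in the amplification approach; instead, a sequence carrying, without limitation, cell barcode information is included in primer ⑨ to integrate the barcode into library ③; ⑩ is the amplification primer for the other end, which may or may not include a sequence carrying, without limitation, barcode information.
Figure 7D shows barcode integration that relies on neither a medium nor amplification (applicable to, without limitation, combinatorial indexing and flow sorting): ④ is the barcode integration handle site; the elements shown in the original as inline images (Figure PCTCN2022124336-appb-000001 and Figure PCTCN2022124336-appb-000002) form the first-round barcode sequence, which can integrate with site ④; the elements shown as Figure PCTCN2022124336-appb-000003 and Figure PCTCN2022124336-appb-000004 form the second-round barcode sequence, which can integrate with the free end of the element shown as Figure PCTCN2022124336-appb-000005. A third and further rounds of barcodes can be designed as needed; in some embodiments, one or more rounds of barcodes can be used for integration.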
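The "integration" or "binding" referred to includes but is not limited to integration or binding by full sequence complementarity, by partial sequence complementarity, or by ligation. The result of the integration or binding includes but is not limited to the formation of a double-stranded or single-stranded structure.
Where cells and beads are prepared into a reaction system by microfluidics, any suitable cell or cellular component (such as nuclei) may be used.
The "subsequent reaction" referred to includes but is not limited to ligation and polymerase chain reaction (PCR) amplification.
In some embodiments, cells can be sorted into reaction units by a sorting technology. The barcode integration handle sites on the chromatin open site sequences and on the transcriptome of the cells can bind or react with the substances in the reaction unit for barcode integration, thereby integrating new barcodes into the chromatin open site sequences and transcriptome sequences.
The sorting technology includes but is not limited to flow sorting.
The reaction unit includes but is not limited to a well of a well plate or of a microwell plate.
The "binding or reaction" includes but is not limited to binding by sequence complementarity, binding by partial sequence complementarity, or binding by ligation.
The "binding or reaction" includes but is not limited to ligation and polymerase chain reaction (PCR) amplification.
The result of the "binding or reaction" includes but is not limited to the formation of a double-stranded or single-stranded structure.
The "barcode" includes but is not limited to one derived from a single strand or a double strand.
In some embodiments, a handle site need not be included.
In some embodiments, cells can be barcoded over multiple rounds by combinatorial indexing. After thorough mixing, the cells are randomly distributed into a set of several reaction units, where the handle sites on the chromatin open site sequences and on the TSO can bind or react with the substances in the barcoding reaction units for barcode integration. After each round of integration, unintegrated barcodes must be blocked with a designed blocker to avoid interference with subsequent steps. If there are multiple rounds of combinatorial indexing, each round of integration gives the chromatin open site sequences and the transcriptome sequences a handle site available for the next round of barcode integration. Through one or more rounds of barcode integration, new barcodes can be integrated into the chromatin open site sequences and transcriptome sequences.
The reaction unit includes but is not limited to a well of a well plate or of a microwell plate.
The "set of several reaction units" includes but is not limited to a well plate or strip tubes.
The "subsequent reaction" includes but is not limited to ligation and polymerase chain reaction (PCR) amplification.
In some embodiments, cell lysis is required. After thorough mixing, the cells are distributed into different wells (≥1) for lysis, with the number of cells in each well being ≥0. The chromatin open site sequences and transcriptome sequences are thereby released for subsequent steps.
Under certain circumstances, the lysis reaction may need to be terminated to avoid interference with subsequent steps.
In some embodiments, nucleic acid sequences attached to a medium can be integrated simultaneously into the chromatin open sites and the transcriptome for barcoding, and the barcode integration step can be skipped before moving on to subsequent steps.
The present invention can use the handle site on the template switching oligomer to integrate subsequent barcodes onto the template switching oligomer, so that the barcode is integrated at the 5' end of the mRNA sequence; alternatively, the barcode can be designed into the amplification primer so that it is added at the end corresponding to the 5' end of the mRNA. By contrast, in the other single-cell dual-omics technologies known in the art, barcode integration is always at the 3' end of the mRNA sequence.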
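Initial library amplification
The constructed library needs to be amplified to increase the library fragments available for use.
In some embodiments, barcode-containing amplification primers and the other components of a polymerase chain reaction (PCR) system can be added to the cell lysate after lysis has been terminated, and a PCR reaction is performed.
The primers may contain a barcode structure, which can serve as part of the cell barcode combination or can be adapted as needed to serve as, without limitation, a barcode for sample source, experimental batch, or cell type.
In some embodiments, beads carrying the integrated library information can be placed in the library amplification system for library amplification.
Adding barcodes to amplification primers in order to barcode cells has been reported in the art. However, the barcodes of the present technology can not only barcode cells but can also, as needed, label sample source, experimental batch, cell type, and the like, so that these can be distinguished when data from multiple samples, batches, and cell types are later integrated.
Preparation of the chromatin open site library and transcriptome library for sequencing
In some embodiments, streptavidin magnetic beads can be used to bind and capture whichever library carries the biotin label (the chromatin open site library or the transcriptome library), and the mixture of beads and library is placed on a magnetic stand for separation: the library without the biotin label remains in the supernatant, while the biotin-labeled library is bound to the beads and held by the magnetic stand.
In some embodiments, the separated chromatin open site library and transcriptome library can each be further amplified and purified as needed. The purification includes but is not limited to removal of impurities and fragment size selection.
In some embodiments, the separated chromatin open site library may lack sequencing adapters and cannot yet be sequenced, so universal sequencing adapters need to be added to the library. The reaction for adding universal sequencing adapters includes but is not limited to PCR and ligation.
In some embodiments, amplification and purification are also required after the universal sequencing adapters are added. The purification includes but is not limited to removal of impurities and fragment size selection.
In some embodiments, the separated full-length transcriptome library may lack sequencing adapters and cannot yet be sequenced. The purified transcriptome library is processed to add sequencing adapters. The "processing" includes but is not limited to single-step or multi-step processing using Tn5 transposition, nuclease cleavage, PCR, and ligation.
Preparation of special sequence libraries
If the obtained transcriptome library contains special sequence libraries such as T cell receptor sequences, B cell receptor sequences, or gRNA sequences, these can be enriched with the relevant primers if needed (including but not limited to PCR amplification and biotin-streptavidin magnetic bead capture). If the enriched T cell receptor, B cell receptor, and gRNA sequencing libraries lack sequencing adapters, they must be processed further so that the special sequence libraries obtain universal sequencing adapters, by means including but not limited to Tn5 transposition, PCR, and ligation.
Because the barcode is integrated at the 5' end of the mRNA sequence, the cell barcode combination is retained while special sequence libraries such as T cell receptor, B cell receptor, and gRNA sequences are enriched, so the preparation and sequencing of these special sequence libraries is compatible with the method.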
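Examples
The following examples are provided for a better understanding of the invention and do not limit it.
Unless otherwise specified, the experimental methods in the following examples are conventional methods, and all reagents used are commercially available.
The experimental reagents, equipment, and specific experimental conditions in the following examples have been verified to enable the invention to be carried out. They are given only to aid understanding of the invention and do not limit it. The use of alternative reagents, equipment, or conditions falls within the scope of protection of the invention.
Example 1
In this example, a mixed cell line library was prepared from the K562 and NIH3T3 cell lines. Only the chromatin open site library and the transcriptome library obtained by mRNA reverse transcription were prepared (special libraries such as TCR, BCR, and CRISPR gRNA libraries were not included).
1. Preparation of the single-cell suspension
K562 is a suspension cell line and does not require trypsin treatment. NIH3T3 is an adherent cell line and must be treated with trypsin at a suitable concentration to bring it into suspension.
2. Cell fixation
Cells may optionally be fixed before library construction; this step can also be skipped. For fixation, 66.8 μL of 1.6% formaldehyde solution can be added to 1 ml of single-cell suspension (in phosphate buffered saline, PBS) at a cell density of 10^6 cells/ml. After fixing for 5 minutes, fixation is quenched for 5 minutes with fixation stop solution (56 μL 2.5 M glycine, 20 μL 1 M Tris-HCl pH 8.0, 13.4 μL 7.5% BSA), followed by centrifugation and removal of the supernatant. The cells are washed twice with PBS-BSA-RI wash solution and resuspended in 1X TD buffer (diluted from 4X TD buffer).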
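The fixation step specifies volumes rather than a final concentration. As a quick check, the final formaldehyde concentration implied by adding 66.8 μL of 1.6% stock to 1 ml of suspension can be computed from a simple dilution; the function name is illustrative, and the calculation assumes the volumes are additive.

```python
def final_percent(stock_percent: float, stock_ul: float, sample_ul: float) -> float:
    """Final concentration (%) after adding stock_ul of stock to sample_ul of sample."""
    return stock_percent * stock_ul / (stock_ul + sample_ul)


# Values from the fixation step: 66.8 uL of 1.6% formaldehyde into 1000 uL of cells.
formaldehyde_percent = final_percent(1.6, 66.8, 1000.0)
print(round(formaldehyde_percent, 3))  # 0.1
```

The result is very close to 0.1% formaldehyde, which suggests the unusual 66.8 μL volume was chosen to reach that target.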
Table 1. PBS-BSA-RI wash solution
Reagent    Volume (μL)
1X PBS    987.5
10% BSA    1
0.1 M DTT    10
SUPERaseIn    1
RNaseOUT    0.5
Total    1000
Table 2. 4X TD buffer
Reagent    Volume (μL)
1 M Tris-acetate, pH 7.8    132
5 M potassium acetate    52.8
1 M magnesium acetate    40
Dimethylformamide (DMF)    640
DEPC H2O    135.2
Total    1000
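The 4X TD recipe in Table 2 can be converted into working (1X) concentrations with the usual C1·V1 = C2·V2 dilution relation. The sketch below transcribes the table and assumes that the DMF stock is neat (100% v/v), which the table does not state explicitly.

```python
# (component, stock concentration, unit, volume in uL), transcribed from Table 2.
TD_4X_RECIPE = [
    ("Tris-acetate pH 7.8", 1000.0, "mM", 132.0),
    ("potassium acetate", 5000.0, "mM", 52.8),
    ("magnesium acetate", 1000.0, "mM", 40.0),
    ("DMF", 100.0, "% v/v", 640.0),  # assumes neat DMF
]
WATER_UL = 135.2
TOTAL_UL = 1000.0


def working_concentrations(recipe, total_ul, dilution_factor=4.0):
    """Concentration of each component after diluting the stock buffer to 1X."""
    return {
        name: (stock * volume_ul / total_ul / dilution_factor, unit)
        for name, stock, unit, volume_ul in recipe
    }


one_x = working_concentrations(TD_4X_RECIPE, TOTAL_UL)
recipe_volume_ul = sum(volume_ul for *_, volume_ul in TD_4X_RECIPE) + WATER_UL
for name, (value, unit) in one_x.items():
    print(f"{name}: {value:.1f} {unit}")
```

Under that assumption the 1X buffer works out to about 33 mM Tris-acetate, 66 mM potassium acetate, 10 mM magnesium acetate, and 16% DMF, and the listed volumes sum to the stated 1000 μL.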
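3. Capture of chromatin open sites
To capture chromatin open sites, unloaded Tn5 enzyme is first assembled according to the system in Table 3 and incubated at room temperature for 1 hour before use.
Table 3. Tn5 transposome assembly system
Figure PCTCN2022124336-appb-000006
The 10X assembly buffer in the Tn5 transposome assembly system of Table 3 is the assembly buffer provided by the supplier, and the assembly reaction follows the supplier's standard operating procedure. The transposition primer dimer is a double-stranded DNA structure; the annealing system and conditions follow the supplier's standard operating procedure. The assembled Tn5 transposome is used for transposition in the reaction system of Table 4 for 30 minutes at 37°C, 500 rpm in a thermostatic shaker. After transposition, the chromatin open site sequences carry the handle site for subsequent barcode integration.
Table 4. Transposition reaction system
Figure PCTCN2022124336-appb-000007
4. Capture of transcriptome information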
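After transposition, add an equal volume of NIB-BSA-RI (H) wash solution (Table 6), centrifuge at 500xg, 4°C for 5 min, and remove the supernatant; wash three times in total. After washing, resuspend the cell pellet in the reverse transcription system (Table 7) and perform the reverse transcription reaction (Table 8).
Table 5. NIB
Reagent    Volume (μL)
DEPC H2O    9840
1 M Tris-HCl pH 7.5    100
5 M NaCl    20
1 M MgCl2    30
5% digitonin    10
Total    10000
Table 6. NIB-BSA-RI (H)
Reagent    Volume (μL)
NIB    100
10% BSA    0.5
0.1 M DTT    1
SUPERaseIn    1
RNaseOUT    0.5
Table 7. Reverse transcription system
Figure PCTCN2022124336-appb-000008
Table 8. Reverse transcription thermal cycling
Figure PCTCN2022124336-appb-000009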
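After the reverse transcription reaction, add an appropriate volume (e.g. 50 μL) of NIB-BSA-RI (H) wash solution, centrifuge at 500xg, 4°C for 5 min, and remove the supernatant; wash twice in total. After washing, resuspend the cell pellet in the template switching system (Table 9) and perform the template switching reaction (Table 10). If the template switching oligomer is double-stranded, it must be annealed first. After the template switching reaction, add an appropriate volume (e.g. 50 μL) of NIB-BSA-RI wash solution, centrifuge at 500xg, 4°C for 5 min, and remove the supernatant; wash twice in total. Resuspend the cells in an appropriate amount (e.g. 1152 μL) of NIB-BSA-RI (L) (Table 13). At this point, the transcriptome in the cells provides a handle site for subsequent barcode integration at the 5' end of the RNA.
Table 9. Template switching system
Figure PCTCN2022124336-appb-000010
Figure PCTCN2022124336-appb-000011
Table 10. Template switching reaction conditions
Figure PCTCN2022124336-appb-000012
Table 11. Template switching oligomer annealing system
Reagent    Volume (μL)
100 μM template switching oligomer strand 1    10
100 μM template switching oligomer strand 2    10
H2O (added after annealing)    30
Table 12. Template switching oligomer annealing conditions
Figure PCTCN2022124336-appb-000013
Table 13. NIB-BSA-RI (L)
Reagent    Volume (μL)
NIB    100
10% BSA    0.5
0.1 M DTT    1
SUPERaseIn    0.25
RNaseOUT    0.125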
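5. Subsequent barcode integration
Subsequent barcode integration can be carried out in different ways. This example uses combinatorial indexing, so the following steps take combinatorial indexing as the example.
Prepare the subsequent barcode integration reaction system (Table 14), mix the cells, distribute them into a 96-well plate containing the first-round barcodes, and incubate at 25°C, 300 rpm in a thermostatic shaker for 30 min. After the reaction, add the first-round blocking primer and incubate again at 25°C, 300 rpm for 30 min to block. After blocking, pool and mix all reactions from the 96-well plate, add 192 μL T4 DNA ligase, mix, and distribute into a 96-well plate containing 10 μL of second-round barcode per well. Incubate at 25°C, 300 rpm for 30 min; after the reaction, add the second-round blocking primer and incubate again at 25°C, 300 rpm for 30 min to block. After blocking, pool all reactions from the 96-well plate, centrifuge at 500xg, 4°C for 5 min, remove the supernatant, and wash twice with NIB-BSA-RI (L).
Table 14. Subsequent barcode integration system (combinatorial indexing system)
Figure PCTCN2022124336-appb-000014
Table 15. Hybridization solution
Figure PCTCN2022124336-appb-000015
Figure PCTCN2022124336-appb-000016
Table 16. First-round barcode annealing system (for each barcode)
Figure PCTCN2022124336-appb-000017
Table 17. Primer annealing buffer
Primer annealing buffer    Volume (μL)    Final concentration
1 M Tris pH 8.0    10    10 mM
5 M NaCl    10    50 mM
0.5 M EDTA    2    1 mM
H2O (or DEPC H2O)    978
Total    1000
Table 18. Barcode annealing conditions
Figure PCTCN2022124336-appb-000018
Figure PCTCN2022124336-appb-000019
Table 19. First-round barcode blocking system
Figure PCTCN2022124336-appb-000020
Table 20. Second-round barcode annealing system (for each barcode)
Figure PCTCN2022124336-appb-000021
Table 21. Second-round barcode blocking system
Figure PCTCN2022124336-appb-000022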
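6. Initial library amplification
This step may introduce a further round of barcodes or may add no new barcode. This example introduces one round of barcodes at this step; the workflow for introducing the new round is as follows.
After washing, resuspend the cells in 1478 μL of 10 mM Tris-HCl pH 7.5 and distribute them evenly into a 96-well plate (14 μL per well; some volume will be left over). Add 2 μL of cell lysis buffer and 0.2 μL of 20 mg/mL proteinase K to each well, mix, and incubate at 55°C, 500 rpm in a thermostatic shaker for 15 min. After incubation, add 4 μL of 10% Tween-20 and 0.4 μL of 100 mM PMSF to each well to terminate the reaction.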
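The volumes in the lysis step can be checked with a little arithmetic: how much of the 1478 μL suspension is dispensed across the plate, how much is left over, and what each well holds after the lysis and stop reagents are added. The helper names below are illustrative and not part of the protocol.

```python
def plate_distribution(resuspension_ul=1478.0, wells=96, per_well_ul=14.0):
    """Return (dispensed volume, leftover volume) when filling every well."""
    dispensed_ul = wells * per_well_ul
    leftover_ul = resuspension_ul - dispensed_ul
    if leftover_ul < 0:
        raise ValueError("not enough suspension to fill every well")
    return dispensed_ul, leftover_ul


def well_volume_after_stop(cells_ul=14.0, lysis_buffer_ul=2.0,
                           proteinase_k_ul=0.2, tween_ul=4.0, pmsf_ul=0.4):
    """Total volume per well once lysis and stop reagents have been added."""
    return cells_ul + lysis_buffer_ul + proteinase_k_ul + tween_ul + pmsf_ul


dispensed_ul, leftover_ul = plate_distribution()
final_well_ul = well_volume_after_stop()
print(dispensed_ul, leftover_ul, round(final_well_ul, 1))  # 1344.0 134.0 20.6
```

The 134 μL surplus matches the remark that some volume will be left over; it presumably serves as dead volume for pipetting.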
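Prepare the initial library amplification system and perform 10 cycles of linear amplification, then add the exponential amplification primers and perform 5 cycles of exponential amplification (all cycle numbers can be adjusted as appropriate).
Table 22. Linear amplification system
Figure PCTCN2022124336-appb-000023
Table 23. Linear amplification thermal cycling
Figure PCTCN2022124336-appb-000024
Figure PCTCN2022124336-appb-000025
Table 24. Exponential amplification system
Figure PCTCN2022124336-appb-000026
Table 25. Exponential amplification thermal cycling
Figure PCTCN2022124336-appb-000027
After exponential amplification, collect and pool the samples from all wells, purify them according to the protocol of the commercial MinElute PCR Purification Kit (QIAGEN Cat. No./ID: 28006), and elute in 50 μL DEPC H2O. After elution, add 100 μL (2X) SPRISelect Beads (Beckman Coulter) and purify the fragments according to the manufacturer's protocol. In the purified product, the transcriptome library is biotin-labeled, whereas the chromatin open site library carries no biotin. In the product, the chromatin open site sequences and transcriptome sequences from the same cell contain the same cell barcode combination.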
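The capacity of the barcoding scheme can be estimated from the number of wells in each round. The sketch below assumes three rounds of 96 distinct barcodes each (two ligation rounds plus the amplification round) and that every cell receives a combination uniformly at random; the figure of 50,000 cells is the input number reported in the results and is used here only as an upper bound on the cells being barcoded.

```python
import math


def barcode_space(wells_per_round):
    """Number of distinct barcode combinations across all rounds."""
    return math.prod(wells_per_round)


def expected_collision_fraction(n_cells, n_combinations):
    """Probability that a given cell shares its combination with another cell.

    Assumes each cell draws its combination independently and uniformly.
    """
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


combinations = barcode_space([96, 96, 96])
collision_fraction = expected_collision_fraction(50_000, combinations)
print(combinations)  # 884736
print(f"{collision_fraction:.3f}")
```

Measured doublet rates also depend on cell loss, clumping, and uneven well loading, so this estimate should not be read as a prediction of the experimental value.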
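The purified product is bound to streptavidin magnetic beads (Dynabeads™ MyOne™ Streptavidin C1), incubated at room temperature for 1 hour, and placed on a magnetic stand to separate the chromatin open site library from the transcriptome library: the beads bind the transcriptome library, and the supernatant contains the chromatin open site library. The supernatant is purified with the QIAGEN MinElute PCR Purification Kit and eluted in 23 μL DEPC H2O for construction of the final (sequencing-ready) chromatin open site library. The beads are washed successively with 50 μL 1X BW-T (Table 27), 50 μL 1X BW, and 50 μL DEPC H2O, then resuspended in 21 μL DEPC H2O; a PCR reaction system is prepared and the transcriptome library is released by PCR.
Table 26. 1X BW
Reagent    Volume (μL)
1 M Tris-HCl pH 7.5    50
5 M NaCl    2000
0.5 M EDTA    10
DEPC H2O    7940
Total    10000
Table 27. 1X BW-T
Reagent    Volume (μL)
1X BW    1000
10% Tween-20    5
Table 28. Transcriptome library release system
Figure PCTCN2022124336-appb-000028
Figure PCTCN2022124336-appb-000029
Table 29. Transcriptome library release thermal cycling
Figure PCTCN2022124336-appb-000030
After the transcriptome library release reaction, use a magnetic stand to separate the beads from the product, collect the product (supernatant), and amplify it by PCR. After amplification, purify the library with 0.6X SPRISelect beads and elute in an appropriate volume (e.g. 20 μL) of DEPC H2O for later use. The product can then be used to construct the final transcriptome library and to enrich and construct special sequence libraries.
If special sequence libraries are present in the transcriptome library, the initial library amplification approach can be followed, using the corresponding primers to enrich and separate the mRNA reverse transcription library and the special sequence libraries (the special sequence libraries can be handled similarly to the chromatin open site library), and final libraries are constructed for sequencing.
Table 30. Transcriptome library re-amplification thermal cycling
Figure PCTCN2022124336-appb-000031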
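7-1 Construction of the final chromatin open site library
The final chromatin open site library is constructed from the chromatin open site library obtained in Section 6 using the system and reaction below. After construction, the final chromatin open site library is purified with SPRISelect beads in double-sided size selection mode (0.5X-0.9X) and eluted in 20 μL DEPC H2O. At this point, the chromatin open site library is complete and can be sequenced.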
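Double-sided size selection with SPRI beads is commonly performed as two bead additions: a first addition at the lower ratio, after which the supernatant is kept, and a second addition that brings the cumulative ratio up to the higher value, after which the beads are kept. The sketch below follows that common procedure as an assumption; the patent states only the 0.5X-0.9X ratios, and the 50 μL sample volume is hypothetical.

```python
def double_sided_spri_volumes(sample_ul, lower_ratio=0.5, upper_ratio=0.9):
    """Bead volumes for the two additions of a double-sided size selection.

    Both ratios are expressed relative to the original sample volume.
    """
    if not 0 < lower_ratio < upper_ratio:
        raise ValueError("ratios must satisfy 0 < lower_ratio < upper_ratio")
    first_addition_ul = lower_ratio * sample_ul
    second_addition_ul = (upper_ratio - lower_ratio) * sample_ul
    return first_addition_ul, second_addition_ul


# Hypothetical 50 uL reaction volume.
first_ul, second_ul = double_sided_spri_volumes(50.0)
print(round(first_ul, 1), round(second_ul, 1))  # 25.0 20.0
```

The first addition removes the largest fragments on the beads, and the second retains the intermediate size range while leaving the smallest fragments in the supernatant.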
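Table 31. Final chromatin open site library construction system
Figure PCTCN2022124336-appb-000032
The 10 μM final library primer 1 in Table 31 can be designed to include a barcode as needed.
Table 32. Final chromatin open site library construction thermal cycling
Figure PCTCN2022124336-appb-000033
7-2 Construction of the final transcriptome library (if special sequences are present, the final special sequence library can be constructed by this method after the special sequence library has been separated)
The transcriptome library obtained in Section 6 is used to prepare the transcriptome cutting system (Table 33), which is left to stand at a constant temperature of 55°C for 5 min.
Table 33. Transcriptome cutting system
Reagent    Volume (μL)
Transcriptome library    volume (V) corresponding to 20 ng
4X TD buffer    12.5
DEPC H2O    36.5-V
Tn5 (same transposome complex sequence)    1
Total    50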
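Table 33 expresses the library and water volumes in terms of V, the volume containing 20 ng of library. Given a measured library concentration, the mix can be worked out as below; the 2 ng/μL concentration is a hypothetical example, and the function name is illustrative.

```python
def tagmentation_mix(library_ng_per_ul, input_ng=20.0, td_4x_ul=12.5,
                     tn5_ul=1.0, total_ul=50.0):
    """Volumes (uL) for the Table 33 cutting system at a given library concentration."""
    library_ul = input_ng / library_ng_per_ul
    water_ul = total_ul - td_4x_ul - tn5_ul - library_ul  # equals 36.5 - V
    if water_ul < 0:
        raise ValueError("library too dilute: 20 ng does not fit in the reaction")
    return {
        "library": library_ul,
        "4X TD buffer": td_4x_ul,
        "DEPC H2O": water_ul,
        "Tn5": tn5_ul,
    }


# Hypothetical library at 2 ng/uL.
mix = tagmentation_mix(2.0)
print(mix)
```

The check on the water volume reflects the constraint built into the table: the library must be at least about 0.55 ng/μL for 20 ng to fit within the 36.5 μL available.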
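After cutting, add 1 μL of 10% SDS to the cutting system to terminate the reaction and incubate at room temperature for 5 min. After incubation, purify the product with 0.6X SPRISelect beads and elute in 23 μL DEPC H2O. The eluted product is used to prepare the final transcriptome library reaction system, and the final transcriptome library preparation reaction is performed.
Table 34. Final transcriptome library construction system
Figure PCTCN2022124336-appb-000034
The 10 μM final library primer 1 in Table 34 can be designed to include a barcode as needed.
Table 35. Final transcriptome library construction thermal cycling
Figure PCTCN2022124336-appb-000035
After the final transcriptome library reaction, purify the product with 0.6X SPRISelect beads and elute in 20 μL DEPC H2O. At this point the final transcriptome library is complete; the library contains the 5' information of the transcriptome and can be sequenced.
Experimental results
Figure 8 shows the cell barcoding results obtained according to an embodiment of the invention. In one run, an input of 50,000 cells yielded transcriptome data for 7,138 cells and chromatin open site data for 6,665 cells. Across the two omics, transcriptome and chromatin open site data were obtained simultaneously for a total of 6,378 cells, with an acceptable doublet rate (~5%).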
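The cell counts reported above can be turned into recovery and overlap fractions with simple ratios. The metric names are illustrative; only the four counts come from the text.

```python
def overlap_summary(input_cells, rna_cells, atac_cells, both_cells):
    """Recovery and cross-modality overlap fractions from reported cell counts."""
    return {
        "paired_recovery": both_cells / input_cells,
        "rna_cells_with_atac": both_cells / rna_cells,
        "atac_cells_with_rna": both_cells / atac_cells,
    }


summary = overlap_summary(input_cells=50_000, rna_cells=7138,
                          atac_cells=6665, both_cells=6378)
for metric, value in summary.items():
    print(f"{metric}: {value:.1%}")
```

By these ratios, roughly 13% of input cells yield paired data, and about 89% of cells with transcriptome data and about 96% of cells with chromatin open site data are also detected in the other modality.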
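The distribution of the obtained transcriptome reads along genes was examined. The results show that the transcriptome sequences are mainly distributed at the 5' end of the mRNA, demonstrating that the transcriptome data of the invention retain mRNA 5' information (taking K562 as an example; see Figure 9).
Correlation analysis showed that the chromatin open site information and transcriptome information of the characteristic genes of the cell lines are well correlated (taking K562 as an example, MALAT1 is a constitutively expressed housekeeping gene and GATA1 is a K562 marker gene; see Figure 10).
The specific embodiments described above further explain the objectives, technical solutions, and beneficial effects of the invention in detail. It should be understood that the above are merely specific embodiments of the invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.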
References
1 Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380-1385, doi:10.1126/science.aau0730 (2018).
2 Liu, L. et al. Deconvolution of single-cell multi-omics layers reveals regulatory heterogeneity. Nature Communications 10, 470, doi:10.1038/s41467-018-08205-7 (2019).
3 Chen, S., Lake, B. B. & Zhang, K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nature Biotechnology 37, 1452-1457, doi:10.1038/s41587-019-0290-0 (2019).
4 Zhu, C. et al. An ultra high-throughput method for single-cell joint analysis of open chromatin and transcriptome. Nature Structural & Molecular Biology 26, 1063-1070, doi:10.1038/s41594-019-0323-x (2019).
5 Ma, S. et al. Chromatin potential identified by shared single cell profiling of RNA and chromatin. bioRxiv, 2020.06.17.156943, doi:10.1101/2020.06.17.156943 (2020).
6 Xu, W. et al. ISSAAC-seq enables sensitive and flexible multimodal profiling of chromatin accessibility and gene expression in single cells. bioRxiv, 2022.01.16.476488, doi:10.1101/2022.01.16.476488 (2022).

Claims (15)

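  1. A method for constructing a single-cell sequencing library for dual-omics profiling of the single-cell transcriptome and chromatin accessibility, comprising the following steps:
    a) preparing a single-cell suspension;
    b) capturing chromatin open sites, which comprises performing a transposition reaction on the chromatin of the cells with a Tn5 transposome assembled from a primer dimer and Tn5 transposase, cutting the chromatin open sites of the cells, and attaching the primer dimer to the chromatin open sites;
    c) performing reverse transcription on the transcriptome of the cells using a reverse transcriptase and a reverse transcription primer to capture single-cell transcriptome information;
    d) performing a template switching reaction using a template switching oligomer (TSO);
    e) subsequent barcoding to obtain a transcriptome library and a chromatin open site library carrying cell barcodes, the subsequent barcoding comprising using different platforms or workflows as needed for single-cell isolation or barcode integration, wherein the transcriptome library and the chromatin open site library from the same cell carry the same cell barcode or cell barcode combination;
    f) initial library amplification, which comprises amplifying the previously constructed library to increase the library fragments available for use;
    g) preparing a chromatin open site library and a transcriptome library for sequencing, which comprises separating the chromatin open site library and the transcriptome library from the library amplified in the previous step,
    wherein the template switching oligomer (shown in Figure 1) comprises a sequence ⑤ that is fully or partially complementary to the sequence ⑧ generated by the terminal transferase activity of the reverse transcriptase, so that the reverse transcriptase can continue sequence synthesis using the template switching oligomer as a template; the template switching oligomer further comprises a partial structural region ④ located 5' of sequence ⑤ and a handle sequence ③ for subsequent barcode integration located 5' of the partial structural region ④; optionally, the template switching oligomer further comprises a sequence ⑥ that is complementary to the partial structural region ④ and forms a double-stranded structure with the template switching oligomer.
  2. The method of claim 1, wherein the single-cell suspension in step a) is used in subsequent steps without fixation or after fixation with a fixative, for example formaldehyde.
  3. The method of claim 2, wherein after said fixation, the fixation is either quenched or not quenched before use in subsequent steps.
  4. The method of any one of claims 1 to 3, wherein the primer dimer, the template switching oligomer, and the reverse transcription primer are modified, for example by phosphorylation and biotinylation.
  5. The method of any one of claims 1 to 4, wherein sequence ⑥ of Figure 1 is 3'-phosphorylated, and the 3' phosphorylation can be removed in a subsequent reaction, for example by a polynucleotide kinase reaction.
  6. The method of claims 1 to 5, wherein the template switching oligomer is attached to a medium such as a bead, a magnetic bead, a microwell, a chip, or a plate, or is in a free state.
  7. The method of claims 1 to 6, wherein the partial structural region ④ of Figure 1 comprises one or more functional regions selected from a unique molecular identifier (UMI), a partial barcode, and a double-strand complementary sequence.
  8. The method of claims 1 to 7, wherein sequence ⑤ of Figure 1 consists of DNA, RNA, locked nucleic acid, or a combination thereof.
  9. The method of claims 1 to 4, wherein sequence ⑥ of Figure 1 is 5'-phosphorylated.
  10. The method of claims 1 to 9, wherein the initial library amplification is performed by PCR, and the primers used comprise a barcode structure that serves as part of the cell barcode combination and/or comprise a barcode for sample source, experimental batch, or cell type.
  11. The method of claims 1 to 10, further comprising further amplifying and purifying the separated chromatin open site library and transcriptome library.
  12. The method of claims 1 to 11, further comprising adding a universal sequencing adapter to the separated chromatin open site library and/or adding a universal sequencing adapter to the separated transcriptome library.
  13. The method of claims 1 to 12, wherein the transcriptome includes but is not limited to mRNA encoding a T cell receptor (TCR) or B cell receptor (BCR) and/or guide RNA of a CRISPR system, and the transcriptome library includes a TCR library, a BCR library, and/or a gRNA library.
  14. A single-cell transcriptome and chromatin accessibility dual-omics sequencing library prepared by the method of claims 1 to 13.
  15. A method for sequencing single-cell transcriptome and chromatin accessibility dual-omics sequencing libraries, comprising separately sequencing the chromatin open site library and the transcriptome library prepared by the method of claims 1 to 13.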
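PCT/CN2022/124336 2022-10-10 2022-10-10 Method for constructing a single-cell transcriptome and chromatin accessibility dual-omics sequencing library, and sequencing method WO2024077439A1 (zh)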

Priority Applications (1)

Application Number Priority Date Filing Date Title
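PCT/CN2022/124336 WO2024077439A1 (zh) 2022-10-10 2022-10-10 Method for constructing a single-cell transcriptome and chromatin accessibility dual-omics sequencing library, and sequencing method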

Publications (1)

Publication Number Publication Date
WO2024077439A1 (zh) 2024-04-18

Family

ID=90668491

Country Status (1)

Country Link
WO (1) WO2024077439A1 (zh)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
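CN108103055A (zh) * 2018-01-09 2018-06-01 上海亿康医学检验所有限公司 Method for single-cell RNA reverse transcription and library construction
CN109996892A (zh) * 2016-12-07 2019-07-09 深圳华大智造科技有限公司 Method for constructing a single-cell sequencing library and use thereof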
WO2020009665A1 (en) * 2018-07-06 2020-01-09 Agency For Science, Technology And Research Method for single-cell transcriptome and accessible regions sequencing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MA SAI; ZHANG BING; LAFAVE LINDSAY M.; EARL ANDREW S.; CHIANG ZACHARY; HU YAN; DING JIARUI; BRACK ALISON; KARTHA VINAY K.; TAY TRI: "Chromatin Potential Identified by Shared Single-Cell Profiling of RNA and Chromatin", CELL, ELSEVIER, AMSTERDAM NL, vol. 183, no. 4, 23 October 2020 (2020-10-23), Amsterdam NL , pages 1103, XP086341410, ISSN: 0092-8674, DOI: 10.1016/j.cell.2020.09.056 *
NONGLUK PLONGTHONGKUM ET AL.: "Scalable Dual-Omics Profiling With Single-Nucleus Chromatin Accessibility and mRNA Expression Sequencing 2 (SNARE-seq2)", NATURE PROTOCOLS, vol. 16, no. 11, 14 October 2021 (2021-10-14), XP037607625, ISSN: 1754-2189, DOI: 10.1038/s41596-021-00507-3 *
XIONG, HAIQING ET AL.: "Single-Cell Joint Detection of Chromatin Occupancy and Transcriptome Enables Higher-Dimensional Epigenomic Reconstructions", NATURE METHODS, vol. 18, no. 6, 6 May 2021 (2021-05-06), pages 652 - 660, XP037473893, ISSN: 1548-7091, DOI: 10.1038/s41592-021-01129-z *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22961635

Country of ref document: EP

Kind code of ref document: A1