WO2017035821A1 - Library construction method for RNA 5mC bisulfite sequencing and application thereof - Google Patents

Library construction method for RNA 5mC bisulfite sequencing and application thereof

Info

Publication number
WO2017035821A1
Authority
WO
WIPO (PCT)
Prior art keywords
rna
sequencing
bisulfite
reverse transcription
methylation
Prior art date
Application number
PCT/CN2015/088908
Other languages
English (en)
French (fr)
Inventor
杨运桂
杨莹
孙宝发
杨鑫
孙慧颖
孙敏
张兵
黄春敏
Original Assignee
中国科学院北京基因组研究所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国科学院北京基因组研究所
Priority to PCT/CN2015/088908
Publication of WO2017035821A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/04 Libraries containing only organic compounds
    • C40B40/06 Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • The invention belongs to the field of genome sequencing technology, and particularly relates to a library construction method in which bisulfite-treated RNA fragments are reverse transcribed using random hexamer primers containing only the three bases A, C and T, and to its application in high-throughput quantitative sequencing of 5mC.
  • RNA functions both as a regulatory molecule and as an information molecule, and plays a central role in many cellular mechanisms.
  • Post-transcriptional modification of RNA lays the chemical foundation for the diversification of RNA function.
  • RNA modifications in nature are widely present in the four types of nucleotides A, U, C, and G.
  • RNAMDB has recorded a total of 109 RNA modifications, of which methylation modification accounts for about 80% of the total RNA modification.
  • This type of methylation mainly occurs on nitrogen atoms of the base, on carbon atoms of purines and pyrimidines, or at special positions such as the oxygen atom of the 2'-OH group.
  • Methylation of RNA nucleosides depends mainly on methyltransferases and methyl group donors in the organism.
  • 6-methyladenosine (m6A), a methylation occurring at the N6 nitrogen atom of adenine, is the most common post-transcriptional modification of RNA in eukaryotes; because of its abundance and high degree of conservation, it has received extensive attention and research in recent years.
  • 5-methylcytosine is another methylated modified form that is widely present in RNA.
  • Although 5mC modification has been widely reported in DNA research, research on 5mC in RNA is still in its infancy. As early as the 1970s, 5mC was found in the mRNA of hamster cells, but the distribution characteristics and biological functions of the modified sites remained unclear; only recently has research begun to explore single-site detection of RNA 5mC at the genome-wide level and its biological functions.
  • RNA 5mC modification should also be dynamically reversible.
  • The methyltransferase uses S-adenosylmethionine (SAM) as the methyl donor and transfers the methyl group to cytosine (C) to form 5-methylcytosine (5mC).
  • RsmB is the first discovered RNA 5mC methyltransferase that primarily catalyzes methylation on bacterial rRNA. Subsequently, 5mC methyltransferases were found in more than 30 kinds of RNAs.
  • These methyltransferases can be divided mainly into the NOP2/NOL1, YebU/Trm4, RsmB/Yn1022c and PH1991/NSUN groups, and these enzymes show a high degree of conservation in eukaryotes.
  • the NSUN protein family has been extensively studied.
  • the human NSUN protein family has a total of nine proteins, and multiple members of this family have potential 5mC methyltransferase functional domains, of which the catalytic activity of NSUN2 has been confirmed.
  • DNMT2 is also thought to be a mammalian RNA 5mC methyltransferase, and its catalytic sites partly overlap with those of NSUN2.
  • NSUN1 and NSUN3-7 are predicted to catalyze the methylation of some conserved methylation sites.
  • NSUN1, NSUN2, and NSUN5 have been shown to bind RNA, but the specific binding substrates for these enzymes are not well understood.
  • The 5mC demethylation reaction in DNA is mediated by the TET family of proteins. A recent study showed that TET proteins can also mediate RNA 5mC demethylation to form 5hmC, and that, compared with TET1, TET3 is highly active and appears to be more selective for RNA. RNA 5mC demethylases remain to be explored further.
  • 5mC modification may regulate the alternative splicing of mRNA, and the level of 5mC modification may affect the retention level of exons and the assembled form of transcript.
  • 5mC may be closely related to protein translation and mRNA stability.
  • 5mC modifications are widely present in RNA and their functions may involve intracellular signal transduction, tissue development and differentiation, and cancer.
  • the detection and distribution of 5mC sites in mRNA, the discovery of 5mC modified enzymes and binding proteins will help to explain the regulation mechanism of 5mC modification on mRNA processing and metabolism.
  • RNA bisulfite sequencing is the most ideal method so far.
  • Bisulfite treatment deaminates most of the unmethylated cytosines (C) in a nucleotide sequence to uracil (U), while methylated cytosines remain unchanged. After PCR amplification, every U is read as thymine (T), so converted sites can be distinguished from the original methylated C bases. This method can be applied to RNA.
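The conversion logic described above can be illustrated with a minimal Python sketch. The function names and the toy sequence are assumptions introduced for illustration only; the sketch idealizes the chemistry as 100% conversion of unmethylated C.

```python
def bisulfite_convert(rna, methylated_positions):
    """Idealized bisulfite treatment: unmethylated C -> U, 5mC stays C.

    `methylated_positions` holds 0-based indices of 5mC (hypothetical input).
    """
    return "".join(
        "U" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(rna)
    )


def read_after_pcr(converted_rna):
    """After reverse transcription and PCR, every U is read as T."""
    return converted_rna.replace("U", "T")


rna = "ACGCUCCA"                      # toy sequence, not from the source
converted = bisulfite_convert(rna, methylated_positions={3})
read = read_after_pcr(converted)
print(converted)  # AUGCUUUA
print(read)       # ATGCTTTA
```

Only the C at index 3 survives as C in the read, which is how a methylated site is told apart from the converted ones.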
  • the unmethylated cytosine is converted to uracil.
  • The PCR product is sequenced and compared with the untreated sequence to determine whether each cytosine site is methylated. The presence of 5mC modifications in both tRNA and rRNA has been confirmed. In early studies, only a few 5mC modifications were found in mRNA because of the limitations of detection techniques. Recently, bisulfite sequencing revealed extensive 5mC modification in the mRNA and non-coding RNA of HeLa cells, and indicated that 5mC in mRNA is mainly enriched in the untranslated regions (UTRs).
  • Existing RNA 5mC bisulfite sequencing uses random hexamer primers containing all four bases (A, T, C and G) in the reverse transcription stage of library construction, which has certain limitations.
  • the proportion of methylation C in RNA in C base is very low.
  • The level of 5mC in human cell mRNA detected by ultra performance liquid chromatography-tandem mass spectrometry is less than one thousandth of all C bases (as shown in Figure 1, the 5mC level of mRNA was 0.03918% in HeLa cells and 0.072401% in 293T cells).
  • Existing RNA 5mC bisulfite sequencing uses the SOLID sequencing platform, which is based on the "two-base encoding" principle. Unlike the Illumina platform, which reads the base sequence directly, SOLID sequencing encodes reads in color space: each base and its adjacent base are together represented by one color. Because each fluorescent signal is determined by two bases, a single error in the decoding stage easily propagates into a chain of decoding errors, so the platform is not suitable for bisulfite sequencing, which relies on detecting 5mC at the level of single-base conversion. The existing sequencing results therefore cannot reflect the true distribution of RNA 5mC modification.
  • A three-base random hexamer primer containing only A, C and T can be used for reverse transcription of bisulfite-treated RNA: the A in the three-base random hexamer primer efficiently pairs with the U produced from originally unmethylated C by bisulfite treatment.
  • The binding efficiency of bisulfite-converted RNA to these reverse transcription primers is significantly higher than to traditional ACTG four-base random hexamer primers, which facilitates the detection of 5mC sites and further analysis of the distribution and functional mechanism of 5mC. The principle is shown in Figure 2.
  • mRNA or precursor mRNA (pre-mRNA) is isolated and purified from total RNA. After fragmentation and bisulfite conversion, most of the C is converted to U. Because they contain G, ACTG four-base random hexamer primers can match only a small fraction of the bisulfite-converted fragments, so reverse transcription efficiency is low, only some fragments are reverse transcribed, and the entire RNA (or corresponding cDNA) sequence cannot be fully covered. ACT three-base random hexamer primers match converted C (i.e., unmethylated C) much more efficiently than the common ACTG four-base random hexamer primers. After second-strand synthesis and PCR amplification, most of the RNA fragments containing unmethylated C are efficiently amplified, so the cDNA library constructed in this way can cover the sequences of the sequenced sample completely and yields better and more realistic data.
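A rough way to see why G-containing primers are mostly wasted is to count them. The sketch below is a back-of-the-envelope illustration, not part of the patent. It assumes an equimolar pool, perfect Watson-Crick pairing only, and a fully converted template that contains no C, so that any primer carrying a G has no perfect binding site.

```python
from itertools import product

# All 4^6 hexamers of a conventional four-base random primer pool.
four_base_pool = ["".join(p) for p in product("ACGT", repeat=6)]

# Primers without G are the only ones that can pair perfectly with a
# template in which every C has been converted to U (idealized assumption).
g_free = [p for p in four_base_pool if "G" not in p]

fraction_g_free = len(g_free) / len(four_base_pool)
print(len(four_base_pool), len(g_free), round(fraction_g_free, 3))
```

Under these assumptions fewer than one in five primers of the four-base pool is usable on converted RNA, whereas every primer of a pool built from A, C and T alone is G-free by construction.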
  • the present invention relates to a bisulfite sequencing library construction, high-throughput sequencing and methylation detection method for 5mC methylated RNA, comprising the following steps:
  • The RNA sample is fragmented by physical or chemical means into fragments of a length suitable for sequencing.
  • The RNA sample is fragmented into fragments of about 100 nt. Fragmentation of RNA samples can be achieved using commercially available kits, such as the Ambion RNA Fragmentation Kit.
  • RNA sample is an isolated and purified RNA to be detected, which may be derived from a human body, an animal body, a plant body or an organ, tissue or cell thereof.
  • the RNA can be mRNA, tRNA, rRNA, total RNA or other RNA comprising or possibly comprising a 5 mC methylation site.
  • total RNA can be extracted from tissues or cells using the Trizol method.
  • RNA can also be extracted (isolated and purified) using commercially available kits, for example, using Ambion's mRNA extraction kit to extract mRNA.
  • RNA sample may be an unmethylated RNA standard.
  • The unmethylated RNA standard sequence does not contain any 5mC methylation site, so in theory all of its cytosines are converted to uracil by bisulfite treatment.
  • Analyzing the conversion of cytosine to uracil (e.g., as a percentage) in the unmethylated RNA standard sequence in the RNA sample after bisulfite treatment therefore reflects the bisulfite conversion efficiency.
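As a minimal sketch of this calculation, the conversion efficiency can be taken as the share of cytosine observations on the unmethylated standard that are read as converted. The function name and the counts in the example are hypothetical and chosen only to illustrate the arithmetic.

```python
def conversion_efficiency(converted_c, total_c):
    """Fraction of cytosine observations on the unmethylated standard
    that were converted (read as T after sequencing)."""
    if total_c <= 0:
        raise ValueError("total_c must be positive")
    if not 0 <= converted_c <= total_c:
        raise ValueError("converted_c must lie between 0 and total_c")
    return converted_c / total_c


# Illustrative counts only (not measured data): 997 of 1000 observations converted.
efficiency = conversion_efficiency(997, 1000)
print(f"{efficiency:.1%}")
```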
  • the RNA sequence of the mouse DHFR gene can be used as a non-methylated RNA standard in the present invention.
  • The RNA sequence of DHFR can be transcribed from the plasmid pcDNA3-HA-mDHFR carrying the full length of the mouse DHFR gene using the T7 high-efficiency RNA synthesis kit (NEB, E2040S).
  • the fragmented RNA sample is treated with a bisulfite such as sodium bisulfite to convert the unmethylated cytosine to uracil.
  • Methods of treating RNA with bisulfite are well known in the art.
  • the bisulfite treatment can include a bisulfite incubation, desalting, and desulfonation step.
  • An exemplary method of bisulfite treatment comprises dissolving the mRNA precipitate in freshly prepared bisulfite solution (40% sodium bisulfite, 600 µM hydroquinone, pH 5.1) and incubating it in a PCR machine at 75 °C for 4 hours;
  • the bisulfite-treated sample is then desalted, for example using Micro Bio-spin 6 chromatography columns (Bio-Rad), followed by desulfonation by adding 1 M Tris (pH 9.0) and incubating in a PCR machine at 75 °C for 1 hour.
  • Finally, glycogen, 3 M sodium acetate (pH 5.2) and pre-cooled pure ethanol are added, and the mRNA is precipitated overnight at -80 °C.
  • cDNA was synthesized using an ACT three-base random hexameric primer and reverse transcriptase using a bisulfite-treated RNA sample as a template.
  • cDNA was synthesized by reverse transcription using Superscript II Reverse Transcriptase Kit (Invitrogen), and the AGCT four-base random hexameric primer in the kit was replaced with a self-synthesized ACT three-base random hexameric primer.
  • ACT three-base random hexameric primer refers to a mixture of hexameric primers randomly synthesized from three bases A, C, and T, and any of the primers in the mixture consists of only A, C, and T.
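The definition above can be made concrete with a short sketch that builds the three-base pool and checks which 6-nt windows of a template have a perfectly complementary primer in it. Everything here is illustrative: the helper names and the toy RNA are assumptions, only perfect pairing is modeled, and the converted template assumes no 5mC is present.

```python
from itertools import product

# Every hexamer that uses only A, C and T: 3^6 sequences.
ACT_POOL = {"".join(p) for p in product("ACT", repeat=6)}

# DNA primer base that pairs with each RNA template base.
PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def primer_for(window):
    """Perfectly complementary DNA primer (5'->3') for an RNA window (5'->3')."""
    return "".join(PAIR[base] for base in reversed(window))


def primable_fraction(rna):
    """Share of 6-nt windows whose perfect-match primer is in the ACT pool."""
    windows = [rna[i:i + 6] for i in range(len(rna) - 5)]
    if not windows:
        raise ValueError("sequence shorter than 6 nt")
    return sum(primer_for(w) in ACT_POOL for w in windows) / len(windows)


rna = "AUGGCUACCGAUUGC"              # toy sequence, not from the source
converted = rna.replace("C", "U")    # idealized full conversion, no 5mC
print(primable_fraction(rna), primable_fraction(converted))
```

A window is primable by the ACT pool exactly when it contains no C, which is why conversion of unmethylated C to U raises the fraction.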
  • a sequencing library was constructed using cDNA synthesized by reverse transcription.
  • The cDNA synthesized by reverse transcription is subjected to second-strand synthesis. After purification of the product, an "A" base is added to the ends of the cDNA fragments (A-tailing) and adapters are ligated. The ligated DNA fragments are purified and recovered and used as the template for PCR amplification, and the PCR product is purified to obtain the RNA 5mC bisulfite sequencing library.
  • The sequencing library can be constructed from the cDNA synthesized by reverse transcription using the KAPA Stranded mRNA-Seq Kit (Cat. No. KK8420). Specifically, second-strand synthesis and the subsequent library construction steps can be performed according to the kit instructions.
  • High throughput sequencing was performed using the sequencing library constructed above.
  • Sequencing is performed on a second-generation sequencing platform, such as Illumina's Hiseq sequencing platform, using sequencing by synthesis.
  • the Illumina's Hiseq sequencing platform includes, for example, the Hiseq 2000/2500/3000/4000 sequencing platform.
  • Bisulfite treatment converts unmethylated cytosine (C) to U, while methylated cytosine remains unchanged, and subsequent PCR amplification turns U into T. Therefore, when the sequences obtained by sequencing are aligned to the reference genome, an alignment result of C-C (the reference genome has C at a given position and the sequenced read also has C at that position) indicates a methylated cytosine, and an alignment result of C-T indicates an unmethylated cytosine.
  • the specific analysis process includes the following three steps:
  • Quality control: base quality filtering, primer trimming and similar steps are performed on the sequences obtained by sequencing.
  • quality control of sequencing data can be performed by software such as FASTX-Toolkit, Cutadapt, and Trimmomatic [1-3].
  • Sequence alignment: the quality-controlled reads are aligned to the original sequence of the human reference genome. Because transcriptome sequencing data contain some reads that span introns, reads that fail to align to the original sequence of the human reference genome are then aligned to the transcriptome data of the human reference genome and to junction sequences composed of exon-exon junctions. The hg19 version of the human reference genome can be used for alignment, and the alignment can be performed with Bismark software [4].
  • Methylation level analysis: for each cytosine position in the original sequence of the human reference genome, determine the number of reads that appear methylated at that position (i.e., the alignment result is C-C; hereinafter, the number of methylated sequences) and the number of reads that appear unmethylated (i.e., the alignment result is C-T; hereinafter, the number of unmethylated sequences).
  • For reads aligned to the transcriptome and exon-exon junction sequences, the number of methylated sequences and the number of unmethylated sequences at each cytosine position are determined in the same manner, the corresponding positions are converted to positions in the original sequence of the human reference genome, and the counts are combined to calculate the methylation level.
  • the methylation level of each cytosine position was calculated using the formula M/(U+M), where U and M are the number of unmethylated sequences and methylated sequences at the selected cytosine position, respectively.
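The formula M/(U+M) can be sketched in a few lines of Python. The function name and the input format (a string of the bases that aligned reads show at one reference C) are assumptions made for illustration; bases other than C and T are simply ignored here.

```python
from collections import Counter


def methylation_level(read_bases):
    """Methylation level M/(U+M) at one reference cytosine.

    `read_bases` holds the base each aligned read shows at that position:
    C counts as methylated (M), T as unmethylated (U).
    """
    counts = Counter(read_bases.upper())
    m, u = counts["C"], counts["T"]
    if m + u == 0:
        raise ValueError("no informative reads at this position")
    return m / (u + m)


# One read shows C and nine show T at this position (toy pileup).
level = methylation_level("CTTTTTTTTT")
print(level)  # 0.1
```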
  • the result is C.
  • For RNA methylation, such high proportions of methylated cytosine sites are likely to be caused by inadequate treatment, so these sequencing fragments are filtered out before methylation levels are calculated.
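A read-level filter of this kind might look like the sketch below. The cutoff is not stated in the text above, so `max_unconverted_fraction` is a hypothetical parameter and the value used in the example is illustrative only. The sketch also assumes an ungapped alignment in which the read and the reference segment have equal length.

```python
def unconverted_fraction(ref, read):
    """Fraction of reference C positions that the read still shows as C."""
    if len(ref) != len(read):
        raise ValueError("ref and read must have equal length (ungapped)")
    c_positions = [i for i, base in enumerate(ref) if base == "C"]
    if not c_positions:
        return 0.0
    return sum(read[i] == "C" for i in c_positions) / len(c_positions)


def keep_read(ref, read, max_unconverted_fraction):
    """Keep a read unless too many of its cytosines look unconverted.

    `max_unconverted_fraction` is a hypothetical cutoff, not a value
    taken from the source.
    """
    return unconverted_fraction(ref, read) <= max_unconverted_fraction


ref = "ACCTCGCA"
well_converted = "ATTTTGTA"    # no reference C remains C
poorly_converted = "ACCTCGTA"  # three of four reference C remain C
print(keep_read(ref, well_converted, 0.5), keep_read(ref, poorly_converted, 0.5))
```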
  • Random hexamer primers containing only the three bases A, C and T are used for reverse transcription of bisulfite-treated RNA fragments.
  • This allows the primers to match the template efficiently, which greatly improves the efficiency of reverse transcription and PCR amplification and favors the detection of potential 5mC sites in the regions paired with the random primers during reverse transcription.
  • RNA of the DHFR gene not containing a 5 mC methylation site is used as a standard, and bisulfite conversion efficiency can be identified.
  • the bisulfite treatment method of the present invention can achieve a conversion efficiency of 99.7%.
  • The present invention uses the Hiseq2000 sequencing platform for high-throughput sequencing, which can be used efficiently and accurately for 5mC bisulfite sequencing and avoids the high error rate of the SOLID sequencing platform.
  • In the sequencing data obtained on the Hiseq2000 platform from the cDNA library constructed by the method of the present invention, 14691 5mC modification sites were detected. Most of them tend to be distributed in the coding region of mRNA, especially near the translation initiation site. Further results indicated that 5mC is also distributed in exons near the cleavage site and that the 5mC distribution shows a certain sequence preference: compared with CpG and CHG, the 5mC methylation sites are mainly distributed in the CHH context.
  • Figure 1 shows the ratio of 5 mC to the total number of all cytosines in the mRNA of two human cell lines HeLa and 293T cells detected by ultra performance liquid chromatography (UHPLC).
  • Figure 2 shows a schematic diagram of the principle of RNA 5mC bisulfite sequencing of the present invention.
  • Figure 3 shows the results of the identification of the reverse transcription efficiency of the DHFR RNA standard in Example 1.
  • Panel A shows the results of agarose gel electrophoresis: mRNA samples before and after bisulfite treatment were reverse transcribed using different reverse transcription primers, and the cDNA was amplified by PCR using DHFR-specific primers.
  • Panel B shows a quantitative analysis of the results of agarose gel electrophoresis.
  • Figure 4 shows the type distribution of the 5mC modification.
  • Panel A is a pie chart showing all gene types containing the 5mC modified gene, including mRNA encoding the protein and non-coding RNA.
  • Panel B is a histogram showing the number of each type of non-coding gene carrying 5mC modification.
  • Figure 5 shows the distribution characteristics of 5mC in each region in the HeLa cell line.
  • Panel A shows the proportional distribution of 5 mC in HeLa cells in different intervals.
  • Panel B shows the distribution of actual and expected methylation sites in each interval.
  • Figure 6 is a distribution plot of 5 mC in the 5' UTR, CDS and 3' UTR sections of the mRNA.
  • Figure 7 shows the sequence characteristics near the 5 mC modification.
  • Panel A shows the ratio of the 5mC site and the downstream bases CG, CHG, CHH, where H represents A, C, U.
  • Panel B shows the sequence composition of 5 bases near 5 mC.
  • Figure 8 shows the validation, by bisulfite treatment combined with Sanger sequencing, of 5mC sites in three genes (PLOD3, COL4A5 and FAM129B) identified in the sequencing data obtained with the ACT random hexamer primers in Example 1.
  • The left panel shows the Sanger sequencing peaks and sequence alignment for samples without and with bisulfite treatment; the right panel shows the methylation level of the corresponding gene obtained by high-throughput sequencing, in which open circles indicate unmethylated, i.e., converted, cytosines and solid circles indicate methylated, i.e., unconverted, cytosines.
  • the invention provides a method for increasing the reverse transcription efficiency of a bisulfite-treated RNA sample, characterized in that:
  • RNA sample was reverse transcribed using ACT three-base random hexamer primer to synthesize cDNA.
  • Another aspect of the present invention provides a method for constructing an RNA 5mC bisulfite sequencing library, which comprises the steps of:
  • a sequencing library was constructed using cDNA synthesized by reverse transcription.
  • Another aspect of the present invention provides a method for sequencing 5 mC methylated RNA, which comprises the steps of:
  • a further aspect of the invention provides a method for detecting RNA 5mC methylation, comprising the steps of:
  • 5mC refers to 5-methylcytosine.
  • bisulfite treatment refers to the treatment of RNA with bisulfite to convert unmethylated cytosine contained in the RNA to uracil.
  • the RNA sample described above is the RNA to be tested or comprises the RNA to be tested.
  • the RNA to be tested may contain 5 mC.
  • the RNA to be tested is isolated and purified.
  • The RNA to be detected contained in the RNA sample can be derived from a human body, an animal, a plant, or an organ, tissue, or cell thereof.
  • the RNA can be mRNA, tRNA, rRNA, total RNA or other RNA comprising or possibly comprising a 5 mC methylation site.
  • the RNA sample is an isolated and purified RNA to be tested.
  • the RNA sample is a mixture of unmethylated RNA standards and isolated and purified RNA to be tested in a ratio of 1:100 (by weight).
  • the non-methylated RNA standard is the RNA sequence of the mouse DHFR gene.
  • the "fragmentation" is to break an RNA sample, preferably to an RNA fragment of about 100 nt size.
  • the bisulfite treatment comprises a bisulfite incubation, desalting and desulfonation step.
  • the bisulfite treatment substantially completely converts unmethylated cytosine in the RNA sample to uracil.
  • substantially complete means that the conversion efficiency of unmethylated cytosine to uracil in the RNA sample after treatment with bisulfite is greater than or equal to 99%, more preferably greater than or equal to 99.6%, and more preferably greater than or equal to 99.7%. Most preferably, it is 99.9% or more.
  • conversion efficiency refers to the ratio of unmethylated cytosine converted to uracil by bisulfite treatment in an RNA sample.
  • the ACT three-base random hexameric primer is a mixture of hexameric primers randomly synthesized from three bases A, C, and T.
  • Constructing the sequencing library from the cDNA synthesized by reverse transcription comprises the steps of second-strand synthesis, product purification, addition of an "A" base to the ends (A-tailing), adapter ligation, purification and recovery, PCR amplification, and purification and recovery of the PCR product.
  • the high throughput sequencing is performed using a constructed sequencing library on a sequencing platform, preferably a second generation sequencing platform, more preferably a Hiseq 2000 sequencing platform.
  • The data analysis comprises, after quality control of the sequencing data, aligning the reads to the human whole-genome sequence (hg19) and to the transcriptome and exon-exon junction sequences, and calculating the methylation level of each detected cytosine site based on the alignment results.
  • The methylation level of a given cytosine is calculated using the formula M/(U+M), where U and M are the number of unmethylated sequences and the number of methylated sequences at this cytosine site in the sequencing data, respectively.
  • the result is C, and the sequences are filtered before the methylation level is calculated.
  • a further aspect of the invention provides a sequencing library constructed according to the RNA 5mC bisulfite sequencing library construction method of the invention.
  • a further aspect of the invention provides the use of a sequencing library constructed according to the RNA 5mC bisulfite sequencing library construction method of the invention for RNA 5mC methylation detection.
  • A further aspect of the invention provides the use of ACT three-base random hexamer primers for RNA 5mC methylation detection, the detection comprising the steps of RNA fragmentation, bisulfite treatment, reverse transcription to synthesize cDNA, high-throughput sequencing, and data analysis, characterized in that random hexamers of the three bases A, C and T are used as primers for reverse transcription cDNA synthesis from the bisulfite-treated RNA.
  • the ACT three-base random hexameric primer used in the following examples was synthesized by Invitrogen Corporation unless otherwise specified.
  • PCR instrument used in the following examples was a Veriti 96-well thermal cycler available from Applied Biosystems.
  • the reagents used in the following examples are analytical grade reagents and are commercially available from conventional sources.
  • RNA sequence of DHFR was transcribed from the plasmid pcDNA3-HA-mDHFR carrying the full length of the mouse DHFR gene using the T7 in vitro transcription kit (NEB, E2040S), and used as a standard for identifying bisulfite conversion efficiency.
  • Total RNA was extracted from human HeLa cells using the Trizol method. mRNA was isolated and purified from the total RNA according to the instructions of the mRNA Purification Kit (Ambion, 61006). 2 µg of mRNA was mixed with 20 ng of in vitro transcribed DHFR RNA standard to form the mRNA sample, which was fragmented into fragments of about 100 nt according to the method of the Ambion RNA Fragmentation Kit.
  • Fragmentation conditions were 2 µg mRNA/9 µl per reaction; 1 µl of fragmentation reagent was added, the reaction was carried out at 90 °C for 1 minute, and 1 µl of stop buffer was then added. Glycogen, 3 M sodium acetate (pH 5.2) and pre-cooled pure ethanol were added to the reacted mRNA product, and the mRNA was precipitated overnight at -80 °C.
  • The mRNA precipitate was recovered by centrifugation and dissolved in 100 µl of freshly prepared bisulfite solution (40% sodium bisulfite, 600 µM hydroquinone, pH 5.1), divided into 2 tubes of 50 µl each, and incubated in a PCR machine at 75 °C (4 hours in the exemplary method described above).
  • The bisulfite-treated sample was then desalted using Micro Bio-spin 6 chromatography columns (Bio-Rad). Desulfonation was then carried out by adding 1 M Tris (pH 9.0) and incubating in a PCR machine at 75 °C for 1 hour. Finally, glycogen, 3 M sodium acetate (pH 5.2) and pre-cooled pure ethanol were added, and the mRNA was precipitated overnight at -80 °C.
  • The mRNA precipitate was recovered by centrifugation and dissolved, and 300 ng of the bisulfite-treated sample was taken for reverse transcription cDNA synthesis using the Superscript II Reverse Transcriptase kit (Invitrogen), with self-synthesized ACT three-base random hexamer primers replacing the kit's original random hexamer primers. Second-strand synthesis, product purification, A-tailing, adapter ligation, purification, PCR amplification, PCR product purification and library quality checking were then carried out according to the KAPA Stranded mRNA-Seq Kit (Cat. No. KK8420), a strand-specific mRNA library kit.
  • The bisulfite-treated and non-bisulfite-treated mRNAs were reverse transcribed using different random hexamer primers, and the cDNA synthesized by reverse transcription was amplified by PCR using DHFR-specific primers.
  • The PCR amplification products were analyzed by agarose gel electrophoresis (Fig. 3A), and the electrophoresis bands were quantified (Fig. 3B) to determine the reverse transcription efficiency of the different primers on the DHFR RNA standard. The primers used were ACT three-base random hexamer primers and conventional AGCT four-base random hexamer primers, respectively.
  • Sequencing was performed on the Hiseq2000 sequencing platform using the constructed library to obtain sequencing data.
  • Sequence alignment: the reads obtained after quality control were aligned to the original sequence of the human reference genome (hg19 version) with Bismark software. Reads that failed to align to the hg19 original sequence were then aligned, again with Bismark software, to the hg19 transcriptome data and to junction sequences composed of exon-exon junctions.
  • Methylation level analysis: for each cytosine position in the original sequence of the human reference genome, the number of reads that appear methylated at that position (i.e., the alignment result is C-C; hereinafter, the number of methylated sequences) and the number of reads that appear unmethylated (i.e., the alignment result is C-T; hereinafter, the number of unmethylated sequences) were determined.
  • For reads aligned to the transcriptome and exon-exon junction sequences, the number of methylated sequences and the number of unmethylated sequences at each cytosine position were determined in the same manner, the corresponding positions were converted to positions in the original sequence of the human reference genome, and the counts were combined to calculate the methylation level.
  • the methylation level of each cytosine position was calculated using the formula M/(U+M), where U and M are the number of unmethylated sequences and methylated sequences at the selected cytosine position, respectively.
  • the DHFR standard has a bisulfite conversion efficiency of 0.997 (ie, 99.7%), that is, the conversion efficiency of the DHFR RNA standard is 99.7%, which represents the measured mRNA sample. Bisulfite conversion efficiency.
  • Transcripts were first divided into four regions according to Ensembl annotation: the 5' untranslated region (5'UTR), the protein-coding region (CDS), introns, and the 3' untranslated region (3'UTR).
  • 5'UTR: 5' untranslated region
  • CDS: protein-coding region
  • 3'UTR: 3' untranslated region
  • The positions of the 5mC sites on mRNA from Example 1 were extracted and analyzed with Bedtools against the annotation downloaded from Ensembl to count the methylation sites in each of the four mRNA regions (5'UTR, CDS, intron and 3'UTR). Of these sites, 35% lie in the CDS, 28% in introns, 24% in the 3'UTR and 13% in the 5'UTR (Figure 5A).
  • The distribution across the four regions (5'UTR, CDS, intron and 3'UTR) of the C sites detected in the sequencing results (coverage greater than or equal to 1) was used as the expected distribution of methylated C sites.
  • The 5'UTR, CDS and 3'UTR of each mRNA were each normalized by their length to a length of 100. Sites were assigned to these intervals with Bedtools, the number of methylation sites in each of intervals 1-100 was counted, and the count in each interval was divided by the total number of methylation sites to give the percentage distribution curve of methylation sites along the mRNA (Fig. 6). 5mC sites are significantly enriched near the CDS start site relative to other regions.
  • 5mC sites can be classified as CpG, CHG or CHH according to the neighboring bases, where H is A, C or U.
  • The left panel of Figure 8 shows Sanger sequencing traces and sequence alignments for bisulfite-untreated and treated samples.
  • The right panel of Figure 8 schematically shows, using 10 representative reads from the high-throughput sequencing results, the number of times a methylated cytosine was detected, representing the methylation level of the site.
  • Circles indicate the methylation status of cytosines flanking the methylation site in different reads: open circles denote unmethylated cytosines, which are read as T after bisulfite conversion, and filled circles denote methylated, unconverted cytosines.
  • Each of the three genes contains one methylated cytosine that was not converted by bisulfite treatment, in agreement with the high-throughput sequencing results, which further confirms the reliability and accuracy of the method of the present invention.

Abstract

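The present invention provides a library construction method, a sequencing method and a methylation detection method for bisulfite sequencing of 5-methylcytosine (5mC) modifications on RNA molecules. Using random hexamer primers that contain only the three bases A, C and T markedly improves the reverse transcription and PCR amplification efficiency of bisulfite-treated RNA fragments, as well as the detection efficiency of 5mC sites.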

Description

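Library construction method for RNA 5mC bisulfite sequencing and application thereof
Technical Field
The present invention belongs to the technical field of genome sequencing. Specifically, it relates to a library construction method in which bisulfite-treated RNA fragments are reverse transcribed using random hexamer primers containing only the three bases A, C and T, and to its application in high-throughput 5mC sequencing.
Background Art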
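RNA functions both as a regulator and as an information-carrying molecule, and plays a central role in many cellular mechanisms. Post-transcriptional modification of RNA provides the chemical basis for the diversification of RNA function. In nature, RNA modifications occur widely on all four nucleotides A, U, C and G. To date, the RNA modification database RNAMDB has catalogued 109 RNA modifications, of which methylation accounts for about 80%. These methylations occur mainly on nitrogen atoms of the base, on carbon atoms of purines and pyrimidines, or at special positions such as the 2'-OH oxygen. Methylation of RNA nucleosides depends mainly on methyltransferases and methyl donors in the organism. N6-methyladenosine (m6A), a methylation on the nitrogen at position 6 of adenine, is the most common post-transcriptional RNA modification in eukaryotes; because of its abundance and high conservation it has attracted wide attention and study in recent years.
Besides m6A, 5-methylcytosine (5mC) is another methylation that is widespread in RNA. Although 5mC has been widely reported in DNA, its study in RNA is still at an early stage. 5mC was found in the mRNA of hamster cells as early as the 1970s, but the distribution of its modification sites and its biological functions remained unclear; only recently has there been initial exploration of genome-wide, single-site detection of RNA 5mC and of its biological functions.
Like m6A, RNA 5mC modification is also expected to be dynamic and reversible. Methyltransferases use S-adenosylmethionine (SAM) as the methyl donor and transfer the methyl group to cytosine to form 5-methylcytosine (5mC). RsmB was the first RNA 5mC methyltransferase to be discovered; it mainly catalyzes methylation of bacterial rRNA. More than 30 RNA 5mC methyltransferases were subsequently identified. They fall mainly into four classes, NOP2/NOL1, YebU/Trm4, RsmB/Ynl022c and PH1991/NSUN, and are highly conserved in eukaryotes. The NSUN protein family has been studied extensively in recent years. The human NSUN family has 9 proteins, several of which contain a putative 5mC methyltransferase domain; the catalytic activity of NSUN2 has been confirmed. In addition to NSUN2, DNMT2 is also considered a possible mammalian RNA 5mC methyltransferase, and its target sites overlap partly with those of NSUN2. Other enzymes, such as NSUN1 and NSUN3-7, are predicted to catalyze methylation at some conserved sites. NSUN1, NSUN2 and NSUN5 have been shown to bind RNA, but their specific substrates are not well understood.
Sequencing that combined 5mC antibody immunoprecipitation with bisulfite treatment revealed multiple 5mC modifications in archaeal mRNA. Their consensus sequence, AU(m5C)GANGU, matches the consensus on archaeal rRNA, suggesting that 5mC on mRNA and rRNA may be formed by the same methyltransferase.
Demethylation of 5mC in DNA is mediated by TET family proteins. A recent study indicates that TET proteins can also mediate conversion of RNA 5mC to 5hmC, and that, compared with the high activity of TET1 on DNA, TET3 appears more selective for RNA. RNA 5mC demethylases await further study.
Compared with m6A, the function of 5mC in mRNA has been studied far less. Existing work suggests that 5mC may regulate alternative splicing of mRNA, and that the level of 5mC may affect exon retention and the assembly of transcripts. 5mC may also be closely related to protein translation and mRNA stability. In short, 5mC is widespread in RNA, and its functions may involve intracellular signal transduction, tissue development and differentiation, cancer and many other processes. Detecting 5mC sites in mRNA, exploring their distribution, and identifying 5mC-modifying enzymes and binding proteins will help elucidate how 5mC regulates mRNA processing and metabolism.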
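Global 5mC levels in RNA can be measured by mass spectrometry. For single-site identification of 5mC, four methods have been reported to date: 5mC bisulfite sequencing (BS-seq), 5mC-RIP, Aza-IP and miCLIP. Of these, RNA bisulfite sequencing is currently the most suitable. Bisulfite treatment deaminates unmethylated cytosines (the vast majority) to uracil, while methylated cytosines remain unchanged. After PCR amplification every uracil is read as thymine, so converted positions can be distinguished from C bases that carried a methyl group. The PCR products are then sequenced and compared with the untreated sequence to determine whether each cytosine site was methylated. Previous studies have confirmed 5mC in both tRNA and rRNA. Early studies, limited by detection technology, found only a few 5mC modifications in mRNA, whereas recent bisulfite sequencing found widespread 5mC in both mRNA and non-coding RNA of HeLa cells and reported that 5mC in mRNA is mainly enriched in untranslated regions (UTRs).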
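The conversion logic described above can be illustrated with a small in-silico sketch. It is only a toy model under stated assumptions: conversion is treated as 100% efficient, the read is taken on the same strand as the RNA, and the function names `bisulfite_convert` and `call_sites` are hypothetical helpers introduced here, not part of any tool named in this document.

```python
def bisulfite_convert(rna, methylated):
    """Return the sequence read after bisulfite treatment and RT-PCR.

    Unmethylated C is deaminated to U and read as T; methylated C
    (positions listed in `methylated`, 0-based) stays C; U is read as T.
    Assumes ideal, complete conversion.
    """
    out = []
    for i, base in enumerate(rna.upper()):
        if base == "C" and i not in methylated:
            out.append("T")
        elif base == "U":
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def call_sites(reference, read):
    """Compare a converted read with the untreated reference sequence.

    For every C in the reference, C in the read means methylated (C-C)
    and T in the read means unmethylated (C-T).
    """
    ref = reference.upper().replace("U", "T")
    calls = {}
    for i, (r, q) in enumerate(zip(ref, read)):
        if r == "C":
            if q == "C":
                calls[i] = "methylated"
            elif q == "T":
                calls[i] = "unmethylated"
            else:
                calls[i] = "other"
    return calls


rna = "AUCGCCAUC"                               # toy fragment, invented for illustration
read = bisulfite_convert(rna, methylated={4})   # pretend position 4 carries 5mC
calls = call_sites(rna, read)
print(read, calls)
```

In this toy fragment only the cytosine at position 4 survives as C, so it is the only site called methylated.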
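Summary of the Invention
The inventors found that reported RNA 5mC bisulfite sequencing methods use random hexamer primers containing all four bases (A, T, C, G) in the reverse transcription step of library construction, which has limitations. Methylated C makes up a very small fraction of the C bases in RNA: by ultra-high-performance liquid chromatography-tandem mass spectrometry, 5mC in human cell mRNA accounts for less than one per thousand of all C bases (as shown in Figure 1, 0.03918% in HeLa cells and 0.072401% in 293T cells). After bisulfite treatment, therefore, the vast majority (for example more than 99.7%, or even more than 99.9%) of unmethylated C is converted to U, so the RNA sequence is enriched in A, U and G and strongly depleted of C. Random arrangement of the four bases A, T, C, G gives 4^6 = 4096 hexamers, of which 4^6 - 3^6 = 3367 contain G. That is, more than 80% of the four-base hexamer pool contains G and cannot pair effectively with regions that were originally unmethylated C, which lowers the reverse transcription efficiency of random hexamer primers and introduces amplification bias. Second, existing RNA 5mC bisulfite sequencing used the SOLiD platform, which is based on two-base encoding. Unlike the direct base calls of the Illumina platform, SOLiD encodes reads in color space, representing each base together with its neighbor by one color. Because each fluorescent signal is determined by two bases, a single error in decoding readily propagates into a chain of decoding errors, so the platform is poorly suited to bisulfite sequencing, which depends on single-base conversion. Existing sequencing results therefore cannot reflect the true distribution of RNA 5mC modifications.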
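The hexamer counts quoted above (4^6 = 4096 in total, 4^6 - 3^6 = 3367 containing G) can be checked by brute-force enumeration. The sketch below uses only the Python standard library; the variable names are illustrative.

```python
from itertools import product

# Every possible hexamer over the four bases, and the subset containing G.
four_base = ["".join(p) for p in product("ACGT", repeat=6)]
with_g = [h for h in four_base if "G" in h]

# The ACT three-base pool is exactly the set of hexamers without G.
three_base = ["".join(p) for p in product("ACT", repeat=6)]

total = len(four_base)
n_with_g = len(with_g)
n_three = len(three_base)
fraction_with_g = n_with_g / total

print(total, n_with_g, n_three, round(fraction_with_g, 3))
```

The G-containing fraction is about 0.822, consistent with the statement that more than 80% of the four-base pool contains G, and the remaining 729 hexamers are the ACT three-base pool.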
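The inventors found that when bisulfite-treated RNA is reverse transcribed and amplified with three-base random hexamer primers containing only A, C and T, the A in the primers efficiently pairs with regions of originally unmethylated C (U after bisulfite treatment). This overcomes the inability of G-containing four-base random hexamers to pair with U regions derived from unmethylated C and improves binding between bisulfite-converted RNA and the reverse transcription primers. The reverse transcription efficiency is markedly higher than that of conventional ACTG four-base random hexamer primers, which favors detection of 5mC sites and further analysis of the distribution and functional mechanisms of 5mC. The principle is shown in Figure 2. mRNA or precursor mRNA (pre-mRNA) is isolated and purified from total RNA, fragmented and bisulfite converted, so that most C becomes U. Because they contain G, ACTG four-base random hexamers match only a small fraction of the converted fragments; reverse transcription efficiency is low, only some fragments are reverse transcribed and amplified, and the full sequence of the RNA (or the corresponding cDNA) is not covered. ACT three-base random hexamers match converted (that is, unmethylated) C far more efficiently than ordinary ACTG four-base random hexamers. After second-strand synthesis and PCR amplification, the great majority of RNA fragments containing unmethylated C are effectively amplified, so the resulting cDNA library covers essentially all sequences in the sample and yields better, more faithful data.
On this basis, the present invention relates to methods for bisulfite sequencing library construction, high-throughput sequencing and methylation detection of 5mC-methylated RNA, comprising the following steps:
(1) Fragmenting the RNA sample;
The RNA sample is broken by physical or chemical means into fragments of a length suitable for sequencing, preferably about 100 nt. A commercial kit, for example the Ambion RNA fragmentation kit, can be used to fragment the RNA sample.
The "RNA sample" is isolated and purified RNA to be tested, which may come from a human, animal or plant body or from an organ, tissue or cell thereof. The RNA may be mRNA, tRNA, rRNA, total RNA or other RNA that contains or may contain 5mC methylation sites. For example, total RNA can be extracted from tissues or cells by the Trizol method. RNA can also be extracted (isolated and purified) with a commercial kit, for example mRNA with Ambion's mRNA extraction kit.
To verify the reliability of the bisulfite treatment, the RNA sample may contain a proportion of an unmethylated RNA standard used to assess bisulfite conversion efficiency; for example, the RNA sample may be a mixture of the unmethylated RNA standard and the RNA to be tested at a ratio of 1:100 (by weight). The standard contains no 5mC methylation sites, so in theory all of its cytosines are converted to uracil by bisulfite treatment. Analyzing how much of the cytosine in the standard is converted to uracil after treatment (for example as a percentage) therefore reflects the bisulfite conversion efficiency. In the present invention the RNA sequence of the mouse DHFR gene can be used as the unmethylated RNA standard. The DHFR RNA can be transcribed from plasmid pcDNA3-HA-mDHFR, which carries the full-length mouse DHFR gene, using a T7 high-yield RNA synthesis kit (NEB, E2040S).
(2) Bisulfite treating the fragmented RNA sample;
The fragmented RNA sample is treated with a bisulfite, for example sodium bisulfite, to convert unmethylated cytosine to uracil. Methods for bisulfite treatment of RNA are well known in the art. The treatment may include bisulfite incubation, desalting and desulfonation steps. An exemplary method comprises dissolving the mRNA pellet in freshly prepared bisulfite solution (40% sodium bisulfite, 600 μM hydroquinone, pH 5.1) and incubating at 75°C in a PCR machine for 4 hours; desalting the treated sample, for example with Micro Bio-spin 6 chromatography columns (Bio-Rad); then adding 1 M Tris (pH 9.0) and incubating at 75°C in a PCR machine for 1 hour for desulfonation; and finally adding glycogen, 3 M sodium acetate (pH 5.2) and pre-chilled absolute ethanol and leaving the sample at -80°C overnight to precipitate the mRNA.
(3) Reverse transcribing the bisulfite-treated RNA sample with ACT three-base random hexamer primers to synthesize cDNA;
To improve the reverse transcription efficiency of bisulfite-treated RNA, cDNA is synthesized from the bisulfite-treated RNA sample as template using ACT three-base random hexamer primers and a reverse transcriptase. For example, the Superscript II Reverse Transcriptase Kit (Invitrogen) is used for reverse transcription, with custom-synthesized ACT three-base random hexamer primers replacing the ACTG four-base random hexamer primers supplied in the kit.
"ACT three-base random hexamer primers" means a mixture of hexamer primers randomly synthesized from the three bases A, C and T, in which every primer consists of A, C and T only.
(4) Constructing a sequencing library from the cDNA synthesized by reverse transcription.
Second-strand synthesis is performed on the reverse-transcribed cDNA. After the product is purified, an "A" base is added to the ends of the cDNA fragments (A-tailing), adapters are ligated, and the product is purified and recovered. The recovered DNA fragments are used as template for PCR amplification, and the PCR product is purified and recovered to obtain the RNA 5mC bisulfite sequencing library.
Starting from the reverse-transcribed cDNA, the sequencing library can be constructed with the KAPA Stranded mRNA-Seq Kit (Cat. No. KK8420); specifically, second-strand synthesis and the subsequent library construction steps can be carried out according to the kit instructions.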
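(5) Performing high-throughput sequencing.
High-throughput sequencing is performed with the library constructed above, for example by sequencing-by-synthesis on a second-generation sequencing platform such as the Illumina HiSeq platform, which includes, for example, the HiSeq 2000/2500/3000/4000 platforms.
(6) Analyzing the sequencing results to obtain the 5mC methylation information of the RNA.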
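Bisulfite treatment converts unmethylated cytosine to U while methylated cytosine remains unchanged, and PCR amplification then turns U into T. When the sequenced reads are aligned to the reference genome, a C-C result (C in the reference genome at a position and C in the read at that position) therefore indicates a methylated cytosine, and a C-T result indicates an unmethylated cytosine. The analysis comprises the following three steps:
Quality control: the sequenced reads are subjected to base quality control, primer trimming and similar processing, for example with FASTX-Toolkit, Cutadapt or Trimmomatic [1-3].
Sequence alignment: the quality-controlled reads are aligned to the original sequence of the human reference genome. Because transcriptome sequencing data contain reads that span introns, in some cases reads that fail to align to the original genome sequence are further aligned to the transcriptome of the human reference genome and to junction sequences formed by exon-exon joining. The hg19 version of the human reference genome can be used, and alignment can be performed with Bismark [4].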
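Methylation level analysis: for each cytosine position in the original sequence of the human reference genome, the number of reads that appear methylated at that position (alignment result C-C; hereinafter the number of methylated sequences) and the number that appear unmethylated (alignment result C-T; hereinafter the number of unmethylated sequences) are determined. For alignments to the transcriptome and exon-exon junction sequences, the numbers of methylated and unmethylated sequences at each cytosine position are determined in the same way, the corresponding positions are converted to positions in the original human reference genome sequence, and the results are combined to calculate the methylation level. The methylation level of each cytosine position is calculated as M/(U+M), where U and M are the numbers of unmethylated and methylated sequences at the selected cytosine position, respectively. In some cases, for aligned reads, if more than 30% of the cytosine positions contained in a sequenced fragment (that is, a fragment produced by fragmentation during library construction) are read as C, then, given the low level of RNA methylation, such a high proportion of apparently methylated cytosines most likely results from incomplete conversion, and these fragments are filtered out before methylation levels are calculated.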
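The per-site calculation M/(U+M) and the 30% filter can be sketched as follows. This is a simplified illustration, not the Bismark workflow itself: it assumes ungapped reads already aligned to the same strand and the same coordinates as a short reference fragment, and the helper names `passes_conversion_filter` and `methylation_levels` are hypothetical.

```python
def passes_conversion_filter(ref_fragment, read, max_c_fraction=0.30):
    """Return False if more than 30% of the reference C positions read as C.

    Such reads are treated as incompletely converted and discarded.
    """
    c_positions = [i for i, b in enumerate(ref_fragment) if b == "C"]
    if not c_positions:
        return True
    unconverted = sum(1 for i in c_positions if read[i] == "C")
    return unconverted / len(c_positions) <= max_c_fraction


def methylation_levels(ref_fragment, reads):
    """Return {position: M / (U + M)} for every covered reference C."""
    counts = {}  # position -> (M, U)
    for read in reads:
        if not passes_conversion_filter(ref_fragment, read):
            continue
        for i, base in enumerate(ref_fragment):
            if base != "C":
                continue
            m, u = counts.get(i, (0, 0))
            if read[i] == "C":      # C-C: methylated
                m += 1
            elif read[i] == "T":    # C-T: unmethylated
                u += 1
            counts[i] = (m, u)
    return {i: m / (m + u) for i, (m, u) in counts.items() if m + u > 0}


reference = "ACGCTCAC"   # toy fragment with C at positions 1, 3, 5, 7
reads = [
    "ATGCTTAT",  # 1 of 4 C positions unconverted: kept
    "ATGTTTAT",  # fully converted: kept
    "ATGCTTAT",  # kept
    "ACGCTCAT",  # 3 of 4 unconverted (75%): filtered out
]
levels = methylation_levels(reference, reads)
print(levels)
```

With these toy reads, position 3 is supported by two methylated and one unmethylated read, giving a level of 2/3, while the other cytosines have level 0.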
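In the present invention, reverse transcription of bisulfite-treated RNA fragments with random hexamer primers containing only the three bases A, C and T efficiently matches 5mC-modified regions, substantially improves the efficiency of reverse transcription and PCR amplification, and favors detection of potential 5mC sites in regions paired with the random primers during reverse transcription.
Further, the present invention uses RNA of the DHFR gene, which contains no 5mC methylation sites, as a standard to assess bisulfite conversion efficiency.
Further, the bisulfite treatment of the present invention can achieve a conversion efficiency of 99.7%.
Further, the present invention uses the HiSeq 2000 platform for high-throughput sequencing, which can be applied efficiently and accurately to 5mC bisulfite sequencing and avoids the high error rate of the SOLiD platform.
Detecting RNA 5mC methylation sites with the method of the present invention improves reverse transcription efficiency and makes detection more comprehensive and complete. In sequencing data obtained on the HiSeq 2000 platform from a cDNA library constructed by the method, 14,691 5mC sites were detected. Most tend to lie in the coding region of mRNA, especially near the translation start site. Further analysis showed that 5mC is also found in exons near splice sites, and that 5mC shows a sequence preference: relative to CpG and CHG, 5mC sites lie mainly in CHH contexts.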
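Brief Description of the Drawings
Figure 1 shows the proportion of 5mC among all cytosines in mRNA of two human cell lines, HeLa and 293T, measured by ultra-high-performance liquid chromatography (UHPLC).
Figure 2 is a schematic of the principle of RNA 5mC bisulfite sequencing according to the present invention.
Figure 3 shows the reverse transcription efficiency measured on the DHFR RNA standard in Example 1. Panel A shows agarose gel electrophoresis of PCR products obtained by reverse transcribing mRNA samples, before and after bisulfite treatment, with different reverse transcription primers and amplifying with DHFR-specific primers. Panel B shows quantification of the gel.
Figure 4 shows the distribution of 5mC by gene type. Panel A is a pie chart of the types of all genes containing 5mC, including protein-coding mRNA and non-coding RNA. Panel B is a bar chart of the numbers of the various non-coding genes containing 5mC.
Figure 5 shows the distribution of 5mC across mRNA regions in the HeLa cell line. Panel A shows the proportion of 5mC in each region in HeLa cells. Panel B shows the observed and expected proportions of methylation sites in each region.
Figure 6 is the distribution curve of 5mC across the 5'UTR, CDS and 3'UTR of mRNA.
Figure 7 shows sequence features near 5mC modifications. Panel A shows the proportions of 5mC sites in the CG, CHG and CHH contexts defined by the downstream bases, where H is A, C or U. Panel B shows the base composition of the 5 bases around 5mC.
Figure 8 shows validation, by bisulfite treatment combined with Sanger sequencing, of 5mC sites in three genes, PLOD3, COL4A5 and FAM129B, from the data obtained in Example 1 with ACT random hexamer primers. Left: Sanger sequencing traces and sequence alignments for bisulfite-untreated and treated samples. Right: methylation levels of the corresponding genes from high-throughput sequencing, where open circles denote unmethylated (converted) cytosines and filled circles denote methylated (unconverted) cytosines.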
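Detailed Description
It will be understood that the particular embodiments described herein are shown by way of illustration and not as limitations of the invention. The principal features of the invention can be employed in various embodiments without departing from its scope. Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, numerous equivalents to the specific procedures described herein. Such equivalents are considered to be within the scope of the invention and are covered by the claims.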
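In one aspect, the present invention provides a method for improving the reverse transcription efficiency of a bisulfite-treated RNA sample, characterized in that:
the bisulfite-treated RNA sample is reverse transcribed with ACT three-base random hexamer primers to synthesize cDNA.
In another aspect, the present invention provides a method for constructing an RNA 5mC bisulfite sequencing library, characterized by comprising the following steps:
(1) fragmenting an RNA sample;
(2) bisulfite treating the fragmented RNA sample;
(3) reverse transcribing the bisulfite-treated RNA sample with ACT three-base random hexamer primers to synthesize cDNA;
(4) constructing a sequencing library from the cDNA synthesized by reverse transcription.
In another aspect, the present invention provides a method for sequencing 5mC-methylated RNA, characterized by comprising the following steps:
(1) fragmenting an RNA sample;
(2) bisulfite treating the fragmented RNA sample;
(3) reverse transcribing the bisulfite-treated RNA sample with ACT three-base random hexamer primers to synthesize cDNA;
(4) constructing a sequencing library from the cDNA obtained by reverse transcription;
(5) performing high-throughput sequencing.
In yet another aspect, the present invention provides a method for detecting RNA 5mC methylation, characterized by comprising the following steps:
(1) fragmenting an RNA sample;
(2) bisulfite treating the fragmented RNA sample;
(3) reverse transcribing the bisulfite-treated RNA sample with ACT three-base random hexamer primers to synthesize cDNA;
(4) constructing a sequencing library from the cDNA synthesized by reverse transcription;
(5) performing high-throughput sequencing;
(6) analyzing the sequencing results to obtain the 5mC methylation information of the RNA.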
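As used herein, the term "5mC" refers to 5-methylcytosine.
As used herein, the term "bisulfite treatment" refers to treating RNA with bisulfite so that the unmethylated cytosines contained in the RNA are converted to uracil.
In a specific embodiment, the RNA sample is, or comprises, the RNA to be tested. The RNA to be tested may contain 5mC. In a preferred embodiment, the RNA to be tested is isolated and purified. In a specific embodiment, the RNA to be tested contained in the RNA sample may come from a human, animal or plant body or from an organ, tissue or cell thereof. In a specific embodiment, the RNA may be mRNA, tRNA, rRNA, total RNA or other RNA that contains or may contain 5mC methylation sites. In a specific embodiment, the RNA sample is isolated and purified RNA to be tested. In a specific embodiment, the RNA sample is a mixture of an unmethylated RNA standard and isolated and purified RNA to be tested at a ratio of 1:100 (by weight). In a preferred embodiment, the unmethylated RNA standard is the RNA sequence of the mouse DHFR gene.
In a specific embodiment, "fragmenting" means breaking the RNA sample, preferably into RNA fragments of about 100 nt.
In a specific embodiment, the bisulfite treatment comprises bisulfite incubation, desalting and desulfonation steps.
In a preferred embodiment, the bisulfite treatment converts the unmethylated cytosines in the RNA sample to uracil substantially completely. "Substantially completely" means that, after bisulfite treatment, the conversion efficiency of unmethylated cytosine to uracil in the RNA sample is at least 99%, more preferably at least 99.6%, more preferably at least 99.7%, and most preferably at least 99.9%. "Conversion efficiency" means the proportion of unmethylated cytosines in the RNA sample that are converted to uracil by bisulfite treatment.
In a specific embodiment, the ACT three-base random hexamer primers are a mixture of hexamer primers randomly synthesized from the three bases A, C and T.
In a specific embodiment, constructing a sequencing library from the reverse-transcribed cDNA comprises second-strand synthesis, product purification, addition of an "A" base to the ends (A-tailing), adapter ligation, purification and recovery, PCR amplification, and purification and recovery of the PCR product.
In a specific embodiment, the high-throughput sequencing is performed with the constructed library on a sequencing platform, preferably a second-generation sequencing platform, more preferably the HiSeq 2000 platform.
In a specific embodiment, the data analysis comprises quality control of the sequencing data, alignment to the human whole-genome sequence (hg19) and to transcriptome and exon-exon junction sequences, and calculation of the methylation level of each detected cytosine site from the alignment results. In a specific embodiment, the methylation level of a cytosine is calculated as M/(U+M), where U and M are the numbers of unmethylated and methylated sequences at that cytosine site in the sequencing data, respectively. In a specific embodiment, for aligned fragments, if more than 30% of the cytosine positions in a fragment are read as C, the fragment is filtered out before methylation levels are calculated.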
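In yet another aspect, the present invention provides a sequencing library constructed by the RNA 5mC bisulfite sequencing library construction method of the invention.
In yet another aspect, the present invention provides use of a sequencing library constructed by the RNA 5mC bisulfite sequencing library construction method of the invention in the detection of RNA 5mC methylation.
In yet another aspect, the present invention provides use of ACT three-base random hexamer primers in the detection of RNA 5mC methylation, the detection comprising the steps of RNA fragmentation, bisulfite treatment, reverse transcription to synthesize cDNA, high-throughput sequencing and data analysis, characterized in that ACT three-base random hexamers are used as primers when the bisulfite-treated RNA is reverse transcribed to synthesize cDNA.
Embodiments of the invention are described in detail below with reference to examples, but those skilled in the art will understand that the following examples serve only to illustrate the invention and should not be regarded as limiting its scope.
Unless otherwise specified, the ACT three-base random hexamer primers used in the following examples were synthesized by Invitrogen.
Unless otherwise specified, the PCR machine used in the following examples was a Veriti 96-well thermal cycler purchased from Applied Biosystems.
Unless otherwise specified, the reagents used in the following examples were of analytical grade and commercially available through regular channels.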
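Example 1
1) RNA extraction and fragmentation
The DHFR RNA was transcribed from plasmid pcDNA3-HA-mDHFR, which carries the full-length mouse DHFR gene, using a T7 in vitro transcription kit (NEB, E2040S), and was used as the standard for assessing bisulfite conversion efficiency. Total RNA was extracted from human HeLa cells by the Trizol method. mRNA was isolated and purified from total RNA according to the instructions of the [Figure PCTCN2015088908-appb-000001] mRNA Purification Kit (Ambion, 61006). 2 μg of mRNA and 20 ng of in vitro transcribed DHFR RNA standard were mixed to form the mRNA sample, which was fragmented into fragments of about 100 nt according to the instructions of the Ambion RNA fragmentation kit. Each reaction contained 2 μg mRNA in 9 μl; 1 μl of fragmentation reagent was added, the reaction was incubated at 90°C for 1 minute, and 1 μl of stop buffer was then added. Glycogen, 3 M sodium acetate (pH 5.2) and pre-chilled absolute ethanol were added to the fragmented mRNA, which was left at -80°C overnight to precipitate the mRNA.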
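2) Bisulfite treatment, desalting and desulfonation
The mRNA pellet was recovered by centrifugation and dissolved in 100 μl of freshly prepared bisulfite solution (40% sodium bisulfite, 600 μM hydroquinone, pH 5.1), divided into 2 tubes of 50 μl each, and incubated at 75°C in a PCR machine for 4 hours. The bisulfite-treated sample was then desalted with Micro Bio-spin 6 chromatography columns (Bio-Rad). 1 M Tris (pH 9.0) was then added and the sample was incubated at 75°C in a PCR machine for 1 hour for desulfonation. Finally, glycogen, 3 M sodium acetate (pH 5.2) and pre-chilled absolute ethanol were added and the sample was left at -80°C overnight to precipitate the mRNA.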
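3) cDNA synthesis and library construction
The mRNA pellet was recovered by centrifugation and dissolved, and 300 ng of the bisulfite-treated sample was reverse transcribed into cDNA with the Superscript II Reverse Transcriptase Kit (Invitrogen), with custom-synthesized ACT three-base random hexamer primers replacing the random hexamer primers supplied in the kit. Second-strand synthesis, product purification, A-tailing, adapter ligation, purification, PCR amplification, PCR product purification and library quality checking were then carried out according to the instructions of the KAPA Stranded mRNA-Seq Kit ([Figure PCTCN2015088908-appb-000002] platform) (Cat. No. KK8420), a strand-specific mRNA library preparation kit from KAPA.
To assess the reverse transcription efficiency of the ACT three-base random hexamer primers, the above mRNA, with and without bisulfite treatment, was reverse transcribed with the different random hexamer primers. The resulting cDNA was amplified by PCR with DHFR-specific primers, the PCR products were analyzed by agarose gel electrophoresis (Figure 3A), and the bands were quantified (Figure 3B) to determine the reverse transcription efficiency of the different primers on the DHFR RNA standard. The primers used were the ACT three-base random hexamer primers and conventional ACTG four-base random hexamer primers. For mRNA not treated with bisulfite, the two primers gave comparable amounts of DNA after reverse transcription and amplification, indicating comparable reverse transcription and amplification efficiency. For bisulfite-treated mRNA, reverse transcription with the ACT three-base random hexamer primers followed by PCR gave more than 3 times as much DNA as the ACTG four-base random hexamer primers, indicating a reverse transcription and amplification efficiency more than 3-fold higher than that of ordinary ACTG four-base random hexamer primers.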
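4) High-throughput sequencing
The constructed library was sequenced on the HiSeq 2000 platform to obtain sequencing data.
5) Identification of methylation sites and calculation of methylation levels
Quality control: base quality control, primer trimming and similar processing were applied to the sequencing data with FASTX-Toolkit.
Sequence alignment: the reads obtained after quality control were aligned to the original sequence of the human reference genome (hg19) with Bismark. Reads that did not align to the hg19 genome were then aligned, again with Bismark, to the hg19 transcriptome and to junction sequences formed by exon-exon joining.
Methylation level analysis: for each cytosine position in the original sequence of the human reference genome, the number of reads that appear methylated at that position (alignment result C-C; hereinafter the number of methylated sequences) and the number that appear unmethylated (alignment result C-T; hereinafter the number of unmethylated sequences) were determined. For alignments to the transcriptome and exon-exon junction sequences, the numbers of methylated and unmethylated sequences at each cytosine position were determined in the same way, the corresponding positions were converted to positions in the original human reference genome sequence, and the results were combined to calculate the methylation level. The methylation level of each cytosine position was calculated as M/(U+M), where U and M are the numbers of unmethylated and methylated sequences at the selected cytosine position, respectively. For aligned reads, if more than 30% of the cytosine positions contained in a sequenced fragment (that is, a fragment produced by fragmentation during library construction) were read as C, then, given the low level of RNA methylation, such a high proportion of apparently methylated cytosines most likely results from incomplete conversion, and these fragments were filtered out before methylation levels were calculated.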
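To analyze the conversion efficiency of the bisulfite treatment, the bisulfite conversion efficiency of the DHFR standard was calculated. After quality control, the sequencing data were aligned to the sequence of the DHFR standard, and the conversion efficiency of the standard was calculated from the alignment results as U'/(U'+M'), where U' and M' are, respectively, the sum of unmethylated sequence counts and the sum of methylated sequence counts over all cytosine sites of DHFR in the sequencing data. Likewise, for aligned DHFR fragments, if more than 30% of the cytosine positions in a fragment were read as C, the fragment was filtered out before the calculation. According to the sequencing results, U' = 3,097,496 and M' = 8,337 (the formula yields the reported value only with the larger count as U'), so the bisulfite conversion efficiency of the DHFR RNA standard is 0.997 (99.7%), which is taken to represent the bisulfite conversion efficiency of the measured mRNA sample.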
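As a quick arithmetic check of the formula U'/(U'+M'), the sketch below plugs in the two counts reported in this example. It assumes that the larger count (3,097,496) is the converted, unmethylated count U' and the smaller count (8,337) is the unconverted count M', since only that assignment reproduces the reported 99.7%. The function name `conversion_efficiency` is illustrative.

```python
def conversion_efficiency(unmethylated_count, methylated_count):
    """Bisulfite conversion efficiency of an unmethylated standard: U' / (U' + M')."""
    total = unmethylated_count + methylated_count
    if total == 0:
        raise ValueError("no cytosine observations for the standard")
    return unmethylated_count / total


# Counts reported for the DHFR standard; assignment to U' and M' is an assumption (see above).
u_prime = 3_097_496   # C positions read as T (converted)
m_prime = 8_337       # C positions read as C (not converted)

efficiency = conversion_efficiency(u_prime, m_prime)
print(f"{efficiency:.4f}")
```

The result rounds to 0.997, matching the conversion efficiency stated for the DHFR RNA standard.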
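The distribution of the 5mC methylation sites obtained for human HeLa cell mRNA was analyzed further, as shown in Examples 2-5 below.
Example 2
The human genome annotation file (release 72) was downloaded from the Ensembl database. The methylation sites obtained from the sequencing analysis were annotated with Bedtools against the downloaded annotation file. Genes containing 5mC in the annotation results were classified according to the Ensembl gene categories. Of the genes containing 5mC methylation, 87% are protein-coding genes (Figure 4A); the rest are non-coding genes, including pseudogene, lincRNA, antisense and processed transcript. Among them, 202 pseudogenes, 62 lincRNAs, 58 antisense genes and 39 processed transcripts carry 5mC (Figure 4B).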
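Example 3
To examine the distribution of 5mC across transcript regions, transcripts were first divided into four regions according to Ensembl annotation: the 5' untranslated region (5'UTR), the protein-coding region (CDS), introns, and the 3' untranslated region (3'UTR). The positions of the 5mC sites on mRNA from Example 1 were extracted and analyzed with Bedtools against the annotation downloaded from Ensembl to count the methylation sites in each of the four mRNA regions (5'UTR, CDS, intron and 3'UTR). Of these sites, 35% lie in the CDS, 28% in introns, 24% in the 3'UTR and 13% in the 5'UTR (Figure 5A). The distribution across the four regions of the C sites on mRNA detected in the sequencing results (coverage greater than or equal to 1) was taken as the expected distribution of methylated C sites, and the expected and observed distributions were compared to test whether 5mC sites favor particular regions. Compared with expectation, 5mC sites are significantly enriched in the CDS, the 5'UTR and the 3'UTR (Figure 5B).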
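The observed-versus-expected comparison can be sketched as a ratio of fractions per region. The observed counts below simply restate the reported percentages (35/28/24/13) per 100 sites. The background counts of covered C sites are NOT given in this document; the values used here are invented placeholders chosen only to make the example run, and the function name `enrichment` is hypothetical.

```python
def enrichment(observed_counts, background_counts):
    """Return {region: observed fraction / expected fraction}.

    The expected fraction is the share of all covered C sites in the region.
    A ratio above 1 suggests enrichment, below 1 depletion.
    """
    obs_total = sum(observed_counts.values())
    bg_total = sum(background_counts.values())
    ratios = {}
    for region, obs in observed_counts.items():
        expected_fraction = background_counts[region] / bg_total
        ratios[region] = (obs / obs_total) / expected_fraction
    return ratios


# Observed 5mC sites per 100, restating the percentages reported above.
observed = {"CDS": 35, "Intron": 28, "3'UTR": 24, "5'UTR": 13}

# PLACEHOLDER background of covered C sites per region (hypothetical numbers).
background = {"CDS": 200, "Intron": 600, "3'UTR": 150, "5'UTR": 50}

ratios = enrichment(observed, background)
for region, ratio in ratios.items():
    print(region, round(ratio, 2))
```

With real data, the background would come from counting every C site with coverage of at least 1 in each region, and a statistical test would be needed before calling an enrichment significant.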
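Example 4
To examine the distribution of 5mC along transcripts, the 5'UTR, CDS and 3'UTR of each mRNA were each normalized by their length to a length of 100. Sites were assigned to these intervals with Bedtools, the number of methylation sites in each of intervals 1-100 was counted, and the count in each interval was divided by the total number of methylation sites to give the percentage distribution curve of methylation sites along the mRNA (Figure 6). 5mC sites are significantly enriched near the CDS start site relative to other regions.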
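The length normalization can be sketched as mapping each site to one of 100 bins within its region and then dividing each bin count by the total. This is an illustration of the idea rather than the Bedtools commands actually used; it assumes 0-based, half-open region coordinates on the transcript strand, and `relative_bin` and `bin_profile` are hypothetical helper names.

```python
def relative_bin(site, region_start, region_end, n_bins=100):
    """Map a site inside [region_start, region_end) to a bin numbered 1..n_bins."""
    if not (region_start <= site < region_end):
        raise ValueError("site lies outside the region")
    length = region_end - region_start
    return (site - region_start) * n_bins // length + 1


def bin_profile(bins, n_bins=100):
    """Return the fraction of all sites that fall in each bin."""
    counts = [0] * n_bins
    for b in bins:
        counts[b - 1] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins


# Toy CDS spanning transcript positions 1000-1499 with four invented 5mC sites.
sites = [1000, 1003, 1250, 1499]
bins = [relative_bin(s, 1000, 1500) for s in sites]
profile = bin_profile(bins)
print(bins, profile[0], profile[50], profile[99])
```

Applying the same mapping separately to the 5'UTR, CDS and 3'UTR and concatenating the three 100-bin profiles gives a curve of the kind described for Figure 6.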
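Example 5
5mC methylation sites can be classified as CpG, CHG or CHH according to the neighboring bases, where H is A, C or U. For the methylation sites that passed filtering, the adjacent 2-3 nt were extracted and the numbers and proportions of CpG, CHG and CHH were analyzed. Relative to CpG and CHG, 5mC sites lie mainly in CHH contexts (Figure 7A). In addition, each methylation site was extended by 2 nt upstream and downstream to give a 5-nt sequence centered on the site; the sequences were extracted with the fastaFromBed program, and the 5-nt sequence around the methylation sites was displayed with Weblogo (Figure 7B).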
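The CpG/CHG/CHH classification depends only on the one or two bases downstream of the cytosine. A minimal sketch, assuming the sequence is given 5' to 3' on the RNA strand and that T in a DNA-style sequence stands for U; the function name `cytosine_context` and the toy sequences are illustrative.

```python
from collections import Counter

H_BASES = set("ACU")  # H = A, C or U


def cytosine_context(seq, pos):
    """Classify the cytosine at 0-based `pos` as 'CpG', 'CHG' or 'CHH'.

    Returns None when the downstream bases are missing or ambiguous.
    """
    s = seq.upper().replace("T", "U")
    if s[pos] != "C":
        raise ValueError("position is not a cytosine")
    first = s[pos + 1:pos + 2]    # base immediately downstream ('' at the 3' end)
    second = s[pos + 2:pos + 3]   # second base downstream
    if first == "G":
        return "CpG"
    if first in H_BASES and second == "G":
        return "CHG"
    if first in H_BASES and second in H_BASES:
        return "CHH"
    return None


# Toy sites: (sequence, position of the methylated C), invented for illustration.
toy_sites = [("ACGU", 1), ("ACAGU", 1), ("ACUAG", 1), ("GCCAU", 1)]
context_counts = Counter(cytosine_context(seq, pos) for seq, pos in toy_sites)
print(context_counts)
```

Dividing each count by the total number of classified sites gives the proportions of the kind plotted in Figure 7A.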
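Example 6
To further demonstrate the reliability of the sequencing results, three genes containing 5mC, PLOD3, COL4A5 and FAM129B, were selected from the sequencing data of Example 1, and their 5mC methylation sites were validated by bisulfite treatment combined with Sanger sequencing. PCR primers were designed flanking the methylation site in each gene; reverse transcription PCR was performed with bisulfite-untreated and treated RNA as template; the PCR products were checked by agarose gel electrophoresis; and the band at the expected position was excised, the DNA fragment recovered, and Sanger sequenced. Figure 8 shows the validation results. The left panel of Figure 8 shows Sanger sequencing traces and sequence alignments for bisulfite-untreated and treated samples. The right panel of Figure 8 schematically shows, using 10 representative reads from the high-throughput sequencing results, the number of times a methylated cytosine was detected, representing the methylation level of the site. Circles indicate the methylation status of cytosines flanking the methylation site in different reads: open circles denote unmethylated cytosines, which are read as T after bisulfite conversion, and filled circles denote methylated, unconverted cytosines. Each of the three genes contains one methylated cytosine that was not converted by bisulfite treatment, in agreement with the high-throughput sequencing results, which further confirms the reliability and accuracy of the method of the present invention.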
References
[1] Gordon A. FASTX-Toolkit. http://hannonlab.cshl.edu/fastx_toolkit/.
[2] Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 2011;17(1):10-12.
[3] Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114-20.
[4] Krueger F, Andrews SR. Bismark: a flexible aligner and methylation caller for Bisulfite-Seq applications. Bioinformatics. 2011;27(11):1571-2.

Claims (17)

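  1. A method for improving the reverse transcription efficiency of a bisulfite-treated RNA sample, characterized in that:
    the bisulfite-treated RNA sample is reverse transcribed with ACT three-base random hexamer primers to synthesize cDNA.
  2. A method for constructing an RNA 5mC bisulfite sequencing library, characterized by comprising the following steps:
    (1) fragmenting an RNA sample;
    (2) bisulfite treating the fragmented RNA sample;
    (3) reverse transcribing the bisulfite-treated RNA sample with ACT three-base random hexamer primers to synthesize cDNA;
    (4) constructing a sequencing library from the cDNA synthesized by reverse transcription.
  3. The sequencing library construction method according to claim 1, wherein the sequencing library is suitable for sequencing on an Illumina HiSeq sequencing platform.
  4. A method for sequencing 5mC-methylated RNA, characterized by comprising the following steps:
    (1) fragmenting an RNA sample;
    (2) bisulfite treating the fragmented RNA sample;
    (3) reverse transcribing the bisulfite-treated RNA sample with ACT three-base random hexamer primers to synthesize cDNA;
    (4) constructing a sequencing library from the cDNA obtained by reverse transcription;
    (5) performing high-throughput sequencing.
  5. A method for detecting RNA 5mC methylation, characterized by comprising the following steps:
    (1) fragmenting an RNA sample;
    (2) bisulfite treating the fragmented RNA sample;
    (3) reverse transcribing the bisulfite-treated RNA sample with ACT three-base random hexamer primers to synthesize cDNA;
    (4) constructing a sequencing library from the cDNA synthesized by reverse transcription;
    (5) performing high-throughput sequencing;
    (6) analyzing the sequencing results to obtain the 5mC methylation information of the RNA.
  6. The method according to claim 4 or 5, wherein the high-throughput sequencing is performed on an Illumina HiSeq 2000 sequencing platform.
  7. The method according to any one of claims 1-6, wherein the RNA is mRNA, tRNA, rRNA, total RNA or other RNA that contains or may contain 5mC methylation sites.
  8. The method according to any one of claims 1-6, wherein the RNA sample is a mixture of an unmethylated RNA standard and isolated and purified RNA to be tested at a ratio of 1:100 (by weight).
  9. The method according to claim 8, wherein the unmethylated RNA standard is the RNA sequence of the mouse DHFR gene.
  10. The method according to any one of claims 1-6, wherein the bisulfite treatment converts the unmethylated cytosines in the RNA sample to uracil substantially completely.
  11. A sequencing library constructed by the sequencing library construction method according to any one of claims 2, 3 and 6-10.
  12. Use of the sequencing library of claim 11 in the detection of RNA methylation.
  13. Use of ACT three-base random hexamer primers in the detection of RNA 5mC methylation, the detection comprising the steps of fragmenting an RNA sample, bisulfite treatment, reverse transcription to synthesize cDNA, high-throughput sequencing and data analysis, characterized in that ACT three-base random hexamers are used as primers when the bisulfite-treated RNA is reverse transcribed.
  14. The use according to claim 13, wherein the RNA sample is a mixture of an unmethylated RNA standard and isolated and purified RNA to be tested at a ratio of 1:100 (by weight).
  15. The method according to claim 14, wherein the unmethylated RNA standard is the RNA sequence of the mouse DHFR gene.
  16. The use according to claim 13, wherein the bisulfite treatment converts the unmethylated cytosines in the RNA sample to uracil substantially completely.
  17. The use according to claim 13, wherein the high-throughput sequencing is performed on an Illumina HiSeq sequencing platform.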
PCT/CN2015/088908 2015-09-02 2015-09-02 Library construction method for RNA 5mC bisulfite sequencing and application thereof WO2017035821A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/088908 WO2017035821A1 (zh) 2015-09-02 2015-09-02 Library construction method for RNA 5mC bisulfite sequencing and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/088908 WO2017035821A1 (zh) 2015-09-02 2015-09-02 Library construction method for RNA 5mC bisulfite sequencing and application thereof

Publications (1)

Publication Number Publication Date
WO2017035821A1 true WO2017035821A1 (zh) 2017-03-09

Family

ID=58186491

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/088908 WO2017035821A1 (zh) 2015-09-02 2015-09-02 Library construction method for RNA 5mC bisulfite sequencing and application thereof

Country Status (1)

Country Link
WO (1) WO2017035821A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
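CN101263227A (zh) * 2005-09-16 2008-09-10 454生命科学公司 cDNA library preparation
CN103114086A (zh) * 2013-02-18 2013-05-22 重庆市畜牧科学院 Method for rapidly constructing a differentially expressed gene library
CN103827321A (zh) * 2011-07-29 2014-05-28 剑桥表现遗传学有限公司 Method for detecting nucleotide modifications
CN105132409A (zh) * 2015-09-02 2015-12-09 中国科学院北京基因组研究所 Library construction method for RNA 5mC bisulfite sequencing and application thereof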

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
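CN114072525A (zh) * 2019-12-23 2022-02-18 艾跃生物科技公司 Methods and kits for enrichment and detection of DNA and RNA modifications and functional motifs
EP3959342A4 (en) * 2019-12-23 2023-05-24 Active Motif, Inc. METHODS AND KITS FOR ENRICHMENT AND DETECTION OF DNA AND RNA MODIFICATIONS AND FUNCTIONAL MOTORS
WO2023142625A1 (zh) * 2022-01-27 2023-08-03 安康优乐复生科技有限责任公司 Methylation sequencing data filtering method and application
CN115651973A (zh) * 2022-09-08 2023-01-31 苏州京脉生物科技有限公司 Method for isolating and analyzing high-fidelity methylation sites of passageable cells
CN115651973B (zh) * 2022-09-08 2023-09-29 苏州京脉生物科技有限公司 Method for isolating and analyzing high-fidelity methylation sites of passageable cells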

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 15902628; country of ref document: EP; kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 15902628; country of ref document: EP; kind code of ref document: A1