CN113774121A - Low-sample-size m6A high-throughput sequencing method based on RNA ligation tags - Google Patents

Low-sample-size m6A high-throughput sequencing method based on RNA ligation tags

Info

Publication number
CN113774121A
Authority
CN
China
Prior art keywords
rna
sample
samples
linker
barcode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111066944.5A
Other languages
Chinese (zh)
Other versions
CN113774121B (en
Inventor
周翔
翁小成
秦珊珊
韩少卿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN202111066944.5A priority Critical patent/CN113774121B/en
Publication of CN113774121A publication Critical patent/CN113774121A/en
Application granted granted Critical
Publication of CN113774121B publication Critical patent/CN113774121B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing

Abstract

The invention discloses a low-sample-size m6A high-throughput sequencing method based on RNA ligation tags, belonging to the field of high-throughput RNA sequencing. The method comprises the following steps: phosphorylating and adenylating 3' linkers containing different barcode tag sequences; ligating the adenylated 3' linkers separately to fragmented RNA samples from different sources, pooling the samples, reserving an input control sample for RNA-seq, and performing m6A antibody immunoprecipitation on the remaining pooled sample to obtain an IP sample for m6A-seq; preparing next-generation sequencing libraries and sequencing; and splitting the sequencing data according to the barcode sequences and analyzing them to obtain the RNA-seq and m6A-seq information of each original sample. The invention enables m6A antibody immunoprecipitation, library construction and sequencing of multiple clinical low-sample-size samples at the same time.

Description

Low-sample-size m6A high-throughput sequencing method based on RNA ligation tags
Technical Field
The invention relates to the field of high-throughput RNA sequencing, and in particular to a low-sample-size m6A high-throughput sequencing method based on RNA ligation tags.
Background
Post-synthetic modifications of biological macromolecules play important roles in many life processes. Among them, N6-methyladenosine (m6A) is the most abundant post-transcriptional RNA modification in eukaryotic messenger ribonucleic acid (mRNA). The level of m6A modification in mammalian cells is dynamically reversible and is regulated by a variety of m6A-related proteins. Previous studies have shown that the dynamic regulation of m6A is closely related to vital physiological processes, and that dysregulation of m6A leads to some disease-related pathological changes. The overall m6A content can be obtained by enzymatically digesting an RNA sample and analyzing it by LC-MS. However, because m6A plays an important role in almost all steps of mRNA metabolism (e.g. formation, processing, transport, translation and degradation), locating m6A modifications and studying changes in the modification level at specific sites are of great significance. In the study of nucleic acids, qualitative and quantitative analysis of sequences at specific sites is usually achieved by sequencing. Compared with first-generation sequencing, second-generation (next-generation) sequencing can rapidly sequence hundreds of thousands to millions of DNA molecules in a single run, at low cost and with more than 99% accuracy. This reduces sequencing cost, increases sequencing throughput and makes the technology better suited to the analysis of multiple samples. At the same time, the discovery of antibodies that specifically recognize m6A has greatly advanced research on m6A modification sites at the transcriptome level.
MeRIP-seq (methylated RNA immunoprecipitation sequencing), a method developed by combining immunoprecipitation with an m6A-specific antibody and high-throughput sequencing, achieved transcriptome-wide localization of m6A and advanced the study of the molecular and functional mechanisms of m6A. However, high-throughput sequencing methods such as MeRIP-seq that rely on m6A-specific antibodies are limited by the efficiency of antibody immunoprecipitation and by the low abundance of the m6A modification itself, so it is difficult to perform m6A-seq on a single clinical peripheral blood sample (2-4 mL of whole blood).
Barcode labeling technology, whose advantage is that it can label multiple samples, has greatly facilitated the development of single-cell sequencing. After the RNA or DNA of single cells is barcoded and pooled, sequencing of the transcriptome, the genome or post-synthetic genomic modifications can be carried out. During bioinformatic analysis, the sequencing data can be split by the different barcode tag sequences and traced back to each cell of the original sample.
Disclosure of Invention
The invention aims to provide an m6A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags, so that m6A antibody immunoprecipitation, library construction and sequencing can be performed on a pool of multiple clinical low-sample-size samples at the same time.
The purpose of the invention is achieved by the following technical scheme:
An m6A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags, comprising the following steps:
(1) The library-construction 3' linker oligonucleotides (3' linkers) containing different barcode tag sequences are phosphorylated and adenylated. The composition of the 3' linker is 5'-barcode sequence-random sequence-PCR primer linker sequence-3'.
(2) The adenylated 3' linkers containing different barcode tag sequences from step (1) are ligated separately to fragmented RNA samples from different sources; the samples are pooled and purified; an input control sample is reserved for RNA-seq; m6A antibody immunoprecipitation is performed on the remaining pooled sample to obtain an IP sample for m6A-seq; and finally next-generation sequencing libraries of the input and IP samples are prepared and sequenced.
(3) The sequencing data of the pooled sample are split according to the barcode sequences and analyzed by bioinformatic methods to obtain the RNA-seq and m6A-seq information of each original sample.
Preferably, in step (1), the 3' linker is purified after phosphorylation and after adenylation to reduce interference between the different reactions. The 3' linker is phosphorylated with T4 PNK enzyme in a reaction system containing ATP, the phosphorylated 3' linker is adenylated with a 5' adenylation reagent, and purification is carried out with an oligonucleotide purification and concentration kit.
Preferably, in step (1), the random sequence is a 6-base random sequence, which ensures that the barcode sequence is read accurately during sequencing. The barcode sequence preferably consists of 6 bases. In its design, duplication of the Index sequences of the PCR primers used for constructing commercial next-generation sequencing libraries is avoided, the four bases A, T, G and C are distributed as evenly as possible, and runs of a single repeated base (such as GGGG) are avoided.
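The barcode design rules in the preceding paragraph can be expressed as a simple check. The following is a minimal sketch, not the inventors' design tool: the function names, the maximum run length of 2 and the 50% single-base limit are illustrative assumptions, since the text only states that the barcode has 6 bases, must not repeat a library Index sequence, should have an even base distribution and should avoid runs such as GGGG.

```python
from itertools import groupby


def longest_run(seq: str) -> int:
    """Length of the longest run of one repeated base, e.g. 4 for 'AGGGGT'."""
    return max(len(list(group)) for _, group in groupby(seq))


def check_barcode(barcode, index_seqs=(), length=6, max_run=2,
                  max_base_fraction=0.5):
    """Return a list of rule violations; an empty list means the barcode passes.

    max_run and max_base_fraction are assumed thresholds, not values from the patent.
    """
    barcode = barcode.upper()
    if len(barcode) != length or set(barcode) - set("ACGT"):
        return ["must be %d bases of A/C/G/T" % length]
    problems = []
    if barcode in {s.upper() for s in index_seqs}:
        problems.append("duplicates a library index sequence")
    if longest_run(barcode) > max_run:
        problems.append("single-base run longer than %d" % max_run)
    if max(barcode.count(b) for b in "ACGT") / length > max_base_fraction:
        problems.append("base composition too skewed")
    return problems


if __name__ == "__main__":
    # ACTCGA is the barcode of SEQ ID 1 in the sequence listing; GGGGAT is a counter-example.
    for candidate in ("ACTCGA", "GGGGAT"):
        print(candidate, check_barcode(candidate) or "passes")
```

Any index sequences passed as `index_seqs` would have to come from the documentation of the library primer kit actually used; none are given in the text.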
Further, step (2) comprises the following steps:
1) RNA is extracted from a sample (cells, blood sample, tissue sample, etc.).
2) The RNA sample is fragmented into 200-400 bp fragments by chemical ion fragmentation, the fragmentation reagent is removed by purification, the phosphate group at the 3' end of the fragmented RNA is removed enzymatically, and a phosphate group is added at the 5' end.
3) The adenylated 3' linkers containing different barcode tags are ligated to different fragmented RNA samples, and excess 3' linker is removed after the reaction is complete.
4) The different samples are pooled and purified, an input sample is set aside, and m6A antibody immunoprecipitation is performed on the remaining sample to obtain an IP sample; the input and IP samples are reverse transcribed, the reverse transcription primer and template are removed, and cDNA is obtained by purification.
5) A 5' adapter is ligated to the cDNA, the reaction system is purified, and the number of cycles required for the library-construction PCR is determined by RT-qPCR.
6) Library-construction PCR is performed with the 5' adapter-ligated cDNA as substrate, the product is purified by gel extraction to obtain a next-generation sequencing library, and the purified library is sent to a sequencing company for sequencing.
Preferably, in step 1), the RNA in the sample is extracted with Trizol reagent.
Preferably, in step 2), the RNA is fragmented with a magnesium ion chemical fragmentation reagent, the fragmented RNA is purified with an RNA purification and concentration kit, and the ends of the purified RNA fragments are repaired with T4 PNK enzyme.
Preferably, in step 3), the adenylated 3' linker is ligated to the fragmented RNA sample in an overnight reaction with T4 RNA ligase 2 (truncated KQ). After the reaction is complete, excess 3' linker is removed with 5' deadenylase and RecJf enzyme.
Preferably, in step 4), the reaction mixtures from step 3) are pooled directly and the pooled reaction is then purified with an RNA purification and concentration kit. 1/50 of the sample is set aside as the input control, and the remainder is immunoprecipitated according to the m6A antibody instructions to obtain the IP sample. The samples are reverse transcribed with Superscript III enzyme to obtain cDNA.
Preferably, in step 5), the 5' adapter is ligated to the cDNA in an overnight reaction with T4 RNA ligase 1 (high concentration). The cycle number at which the fluorescence reaches a plateau in RT-qPCR is selected as the number of cycles for PCR amplification of the library.
Further, step (3) comprises the following steps:
1) The alignment rate of the sequencing data is analyzed to check data quality.
2) The data are split according to the barcode tag sequences and assigned to the corresponding original samples.
3) The split data are analyzed to obtain the sequencing information.
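The splitting in step 2) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the analysis pipeline used by the inventors: it assumes that each read has already been oriented so that it begins with the 6 random bases followed directly by the barcode, whereas in real data the position and strand of the barcode depend on which read covers the 3' linker. The names `demultiplex` and `hamming` and the `max_mismatch` parameter are hypothetical.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, barcodes, random_len=6, max_mismatch=0):
    """Assign reads to samples by the barcode that follows the random bases.

    reads: iterable of read sequences.
    barcodes: dict mapping sample name -> barcode as it appears in the read.
    Returns a dict mapping sample name -> list of inserts (read with the
    random bases and barcode removed); ambiguous or unmatched reads are
    kept whole under the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        read = read.upper()
        hits = []
        for sample, barcode in barcodes.items():
            tag = read[random_len:random_len + len(barcode)]
            if len(tag) == len(barcode) and hamming(tag, barcode.upper()) <= max_mismatch:
                hits.append((sample, len(barcode)))
        if len(hits) == 1:
            sample, barcode_len = hits[0]
            bins[sample].append(read[random_len + barcode_len:])
        else:
            bins["unassigned"].append(read)
    return dict(bins)


if __name__ == "__main__":
    samples = {"sample1": "ACTCGA", "sample2": "AGCTGA"}  # barcodes of SEQ ID 1 and 2
    reads = ["TTGACAACTCGAGGGCCCAAAT", "CCCCCCAGCTGATTTT", "AAAAAAAAAAAAAAAA"]
    for name, inserts in demultiplex(reads, samples).items():
        print(name, inserts)
```

Requiring exactly one matching barcode keeps ambiguous reads out of every sample, at the cost of discarding some data when mismatches are allowed.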
The strategy of the present invention is shown in FIG. 1. Because existing methods have difficulty achieving m6A-seq of a single clinical low-sample-size sample, the present invention combines barcode labeling with the m6A-seq library construction method: RNA samples from different sources are labeled and then pooled for m6A immunoprecipitation, and in the data analysis stage the reads are traced back to the original samples according to their barcode sequences, so that m6A-seq information is obtained for each individual low-sample-size sample. Compared with the prior art, the invention has the following advantages and beneficial effects:
(1) Different barcode tag sequences are designed at the 5' end of the library-construction 3' linker and ligated to fragmented RNA samples from different sources to distinguish them, and the samples are then pooled for m6A immunoprecipitation. This lowers the amount of RNA required from a single sample for m6A-seq and enables m6A-seq of low-sample-size clinical samples. Less than 20 ng of fragmented RNA from a single sample is enough to construct a library and achieve m6A-seq.
(2) The library construction method is general: the number of barcode tags can be chosen according to the number of samples, and for samples of smaller quantity, library construction and sequencing can be achieved by increasing the number of pooled samples.
(3) The library construction method simplifies the experimental procedure and reduces cost: m6A-IP needs to be performed only once for multiple samples, which reduces the consumption of m6A antibody and the corresponding experimental manipulations.
(4) The library construction method gives good sequencing results: when the amounts of the individual samples before pooling are similar, the sequencing data of the constructed library, once split according to the barcode sequences, are relatively evenly distributed among the samples, and data analysis can be carried out with common methods.
Drawings
FIG. 1 is a schematic of the strategy of the present invention.
FIG. 2 is a polyacrylamide gel electrophoresis image of the 3' linker after adenylation in the present invention, in which the uppermost part is the 3' linker after adenylation and the lowermost part is the non-adenylated 3' linker used as a control.
FIG. 3 shows the first-generation sequencing results of the library constructed according to the present invention after TA cloning.
FIG. 4 shows the splitting of the sequencing data before and after optimization of the 3' linker sequence design in the present invention. The upper panel shows the split sequencing data for the 12 3' linkers before optimization (without a random sequence), and the lower panel shows the split sequencing data for 6 newly designed 3' linkers after optimization (with a 6-base random sequence).
FIG. 5 shows the design and results of the data-independence experiment in the present invention, in which specific oligonucleotide sequences were labeled with different barcode tags and pooled for library construction against a background of HeLa cell mRNA. The upper panel is the experimental design and the lower panel shows the results.
FIG. 6 shows the analysis of the m6A-seq data obtained in the present invention by labeling 100 ng portions of HeLa cell mRNA with 6 barcode tags, followed by library construction and sequencing. The m6A-seq data were split according to the 6 barcode sequences; the figure shows the percentage distribution of the resulting m6A peaks over 5 non-overlapping transcript segments.
FIG. 7 shows the data obtained in the present invention by labeling 20 ng portions of HeLa cell mRNA with 16 barcode tags, pooling, sequencing and splitting the sequencing data. A and B are the results of splitting the data of the input and IP samples, and C is the motif sequence at the m6A peaks obtained from analysis of the m6A-seq data.
Detailed Description
The invention will be further explained with reference to the following examples and the accompanying drawings for better understanding. The present invention is not limited to the following embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which are made without departing from the spirit and principle of the present invention are also intended to be equivalent substitutions within the scope of the present invention.
The sequences of the 3' linker referred to in the examples below are shown in tables 1 and 2 below.
TABLE 1 3' linker sequences (5'-3')
General sequence of the 3' linker before optimization    NNNNNNAGATCGGAAGAGCGTCGTG-SpC3
General sequence of the 3' linker after optimization     NNNNNNNNNNNNAGATCGGAAGAGCGTCGTG-SpC3
Note: the first six N of each sequence (bold in the original) are the barcode sequence, whose specific sequences are shown in Table 2 below; the following six N of the optimized linker (italic in the original) are a 6-base random sequence. The SpC3 modification at the 3' end is a spacer arm of 3 carbon atoms introduced at the 3' end to prevent the 3' end of the 3' linker from being ligated to other nucleic acid strands.
TABLE 2 sequences of different 3' linkers
Figure RE-GDA0003350588820000051
Note: italicized N is a random sequence of 6 bases.
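The layout in Table 1 (6-base barcode, then 6 random bases, then the constant PCR primer linker part) can be checked against the sequence listing with a few lines of code. This is only an illustrative sketch; the helper names `build_linker` and `split_linker` are hypothetical, and the SpC3 modification is a chemical feature that is not represented in the string.

```python
PRIMER_PART = "AGATCGGAAGAGCGTCGTG"  # constant part of the 3' linker, from Table 1
RANDOM_PART = "N" * 6                # 6-base random sequence of the optimized design


def build_linker(barcode: str, optimized: bool = True) -> str:
    """Compose a 3' linker (5'->3') from a 6-base barcode."""
    barcode = barcode.upper()
    if len(barcode) != 6:
        raise ValueError("barcode must be 6 bases long")
    return barcode + (RANDOM_PART if optimized else "") + PRIMER_PART


def split_linker(linker: str):
    """Return (barcode, random part) of a linker that ends with PRIMER_PART."""
    linker = linker.upper()
    if not linker.endswith(PRIMER_PART):
        raise ValueError("linker does not end with the expected primer part")
    head = linker[:-len(PRIMER_PART)]
    return head[:6], head[6:]


if __name__ == "__main__":
    seq_id_1 = "actcgannnnnnagatcggaagagcgtcgtg"  # SEQ ID 1 with the spaces removed
    print(build_linker("ACTCGA") == seq_id_1.upper(), len(seq_id_1))
    print(split_linker(seq_id_1))
```

The composed sequence is 31 bases long, matching the <211> length given for each entry of the sequence listing.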
Example 1
Adenylation of the 3' linker:
(1) The 3' linker (sequences shown in Tables 1 and 2) was mixed with ATP, T4 PNK Buffer and T4 PNK enzyme and reacted as follows: the ordered 3' linker was dissolved in enzyme-free water to a final concentration of 100 μM, and 5 μL was added to the following reaction system (Table 3), reacted at 37 ℃ for 1 h and then inactivated at 65 ℃ for 20 min.
TABLE 3 Phosphorylation system (50 μL)
3' linker (100 μM)       5 μL
10 mM ATP                3 μL
10× T4 PNK Buffer        5 μL
10 U/μL T4 PNK enzyme    2 μL
Enzyme-free water        to 50 μL
The reaction system was purified with the Oligo Clean & Concentrator (OCC, Zymo) purification kit and eluted with 10 μL of enzyme-free water to give the phosphorylated 3' linker.
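As a quick arithmetic check of Table 3, the final concentrations in the 50 μL phosphorylation reaction follow from the dilution rule C_final = C_stock × V_added / V_total. The sketch below only restates the table values; the variable and function names are illustrative. It gives 10 μM linker, 0.6 mM ATP, 1× buffer, 0.4 U/μL T4 PNK and 35 μL of enzyme-free water.

```python
TOTAL_UL = 50.0

# name -> (stock concentration, unit, volume added in uL), as listed in Table 3
COMPONENTS = {
    "3' linker": (100.0, "uM", 5.0),
    "ATP": (10.0, "mM", 3.0),
    "T4 PNK Buffer": (10.0, "x", 5.0),
    "T4 PNK enzyme": (10.0, "U/uL", 2.0),
}


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilution rule: C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_ul


def water_volume(components=COMPONENTS, total_ul=TOTAL_UL):
    """Volume of enzyme-free water needed to reach the total reaction volume."""
    return total_ul - sum(volume for _, _, volume in components.values())


if __name__ == "__main__":
    for name, (stock, unit, volume) in COMPONENTS.items():
        print(f"{name}: {final_concentration(stock, volume):g} {unit}")
    print(f"Enzyme-free water: {water_volume():g} uL")
```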
(2) The phosphorylated 3' linker obtained in step (1) was reacted with Mth RNA Ligase, 5' DNA adenylation reaction buffer and ATP in the following reaction system (Table 4) at 65 ℃ for 1 h, followed by inactivation at 85 ℃ for 5 min.
TABLE 4 Adenylation system (50 μL)
Phosphorylated 3' linker                   3 μL
1 mM ATP                                   5 μL
10× 5' DNA adenylation reaction buffer     5 μL
50 μM Mth RNA ligase                       3 μL
Enzyme-free water                          to 50 μL
The reaction mixture was purified again with the OCC kit to obtain adenylated 3' linkers containing different barcode tag sequences.
Example 2
Verification of the adenylation reaction:
Equal amounts of the adenylated sample and of 3' linker that had not been adenylated were mixed with loading buffer and analyzed by 20% neutral polyacrylamide gel electrophoresis.
Analysis of the experimental results:
FIG. 2 is the polyacrylamide gel electrophoresis image of the 3' linker after adenylation in the present invention. It shows that the 3' linker was successfully adenylated, although a small amount of substrate remained non-adenylated. Non-adenylated 3' linker has no ligation activity, and the 3' linker is used in large excess in the reaction, so the subsequent experiments are not affected.
Example 3
1. Library construction:
RNA was extracted from the samples (cells, tissues, blood samples, etc.) with Trizol reagent. The different RNA samples were then fragmented according to the instructions of the magnesium ion chemical fragmentation reagent (NEB, E6150S), and the fragmented RNA was purified with the RNA Clean & Concentrator (RCC) purification kit (Zymo) and eluted with 7 μL of enzyme-free water. The eluted RNA was added to the following system (Table 5) and treated with PNK at 37 ℃ for 1 h so that it could be ligated to the adenylated 3' linker.
TABLE 5 PNK treatment system (10 μL)
Fragmented and purified RNA             7 μL
RiboLock RNase Inhibitor (40 U/μL)      1 μL
10× T4 PNK Buffer                       1 μL
10 U/μL T4 PNK enzyme                   1 μL
As shown in Table 6 below, the 3' linker, T4 RNA Ligase 2 (truncated KQ) and the other reagents required for ligation were added directly to the PNK reaction system, mixed by pipetting up and down, and reacted at 25 ℃ for 2 h and then at 16 ℃ overnight (12 h).
TABLE 6 3' linker ligation reaction system (20 μL)
PNK-treated reaction                    10 μL
Adenylated and purified 3' linker       2 μL
10× T4 PNK Buffer                       1 μL
50% PEG8000                             6 μL
0.1 M DTT                               1 μL
T4 RNA Ligase 2 (truncated KQ)          1 μL
The next day, 1 μL of 5' deadenylase was added directly to each reaction and reacted at 30 ℃ for 1 h, followed by 1 μL of RecJf at 37 ℃ for 1 h. After the reactions were complete, the different samples were pooled, purified with the RCC purification kit and eluted with 50.7 μL of enzyme-free water. 1.3 μL of RiboLock RNase Inhibitor (40 U/μL) was added to the eluted pooled sample and mixed well to prevent RNA degradation, and 1 μL of the sample was taken out and added to 9 μL of enzyme-free water to serve as the Input control. The remainder of the sample was subjected to m6A antibody immunoprecipitation according to the instructions of the N6-Methyladenosine Enrichment Kit (NEB, E1610S) and finally eluted with 12 μL of enzyme-free water to obtain the IP sample. The Input and IP samples were each added to the following system (Table 7), mixed by pipetting, and reverse transcribed at 25 ℃ for 3 min, 42 ℃ for 10 min and 52 ℃ for 40 min.
TABLE 7 Reverse transcription reaction system (20 μL)
Figure RE-GDA0003350588820000071
After completion of the reverse transcription reaction, 1 μL of Exo I enzyme was added to the reaction system and reacted at 37 ℃ for 30 min to remove excess reverse transcription primer; 15 μL of 0.5 M EDTA (pH 8.0) and 15 μL of 1 M NaOH solution were then added and the mixture was treated at 65 ℃ for 15 min to remove the RNA template. The reaction was purified with the Oligo Clean & Concentrator (OCC) purification kit (Zymo) and eluted with 7 μL of enzyme-free water to obtain the cDNA sample. The cDNA sample was then ligated to the 5' adaptor (5'-Phos-NNNNNNNNNNAGATCGGAAGAGCACACGTCTG-SpC-3', where N stands for a random base) overnight (12 h) at 25 ℃ in the following reaction system (Table 8).
TABLE 8 5' adaptor ligation system (20 μL)
Figure RE-GDA0003350588820000081
The reaction was purified with the OCC purification kit and eluted with 12 μL of enzyme-free water.
1 μL each of the IP and Input samples was used for RT-qPCR in a 20 μL system with the following primers:
RT-qPCR primer sequences (5'-3')
qPCR forward primer    TACCTTGGCACCCCAGAC
qPCR reverse primer    TTCAGAGTTCTACAGTCCGA
The fluorescence curve was observed, and the lowest cycle number at which the fluorescence reached a plateau was selected as the number of cycles for the library-construction PCR.
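The cycle-number selection described above can be sketched numerically. The text does not define "plateau" quantitatively, so the sketch below assumes, purely for illustration, that the plateau is reached at the first cycle whose fluorescence is at least 95% of the maximum observed value; the function name, the threshold and the example readings are all hypothetical.

```python
def plateau_cycle(fluorescence, fraction=0.95):
    """Return the 1-based cycle at which the signal first reaches the plateau.

    fluorescence: per-cycle fluorescence readings, cycle 1 first.
    fraction: assumed share of the maximum signal that counts as the plateau.
    """
    if not fluorescence:
        raise ValueError("no fluorescence readings given")
    threshold = fraction * max(fluorescence)
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle


if __name__ == "__main__":
    # Made-up readings for illustration only.
    curve = [0.0, 0.1, 0.2, 0.5, 1.5, 4.0, 8.0, 9.6, 9.9, 10.0, 10.0]
    print(plateau_cycle(curve))
```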
Library-construction PCR was performed in the following reaction system (Table 9). The PCR primers were NEB next-generation sequencing primers, and the PCR program was set according to the instructions of the Ultra II Master Mix.
TABLE 9 Library-construction PCR reaction system (50 μL)
Figure RE-GDA0003350588820000082
The PCR product was purified with a gel extraction kit (following the instructions of the kit used) to obtain library samples that could be sent to a sequencing company for next-generation sequencing.
2. Library composition verification:
The library constructed by the method of the invention was inserted into a plasmid by TA cloning, and 5 single clones were picked for first-generation sequencing to verify that the constructed library was as expected.
The results are shown in FIG. 3. In the first-generation sequencing results, the gray-shaded parts at the two ends of the DNA sequence correspond to the forward and reverse primers of the library-construction PCR primer kit. The part marked with a wavy line is the random sequence of ten N in the 5' adaptor, and it differs among the five clones. The part marked with a dotted underline corresponds to the barcode tag sequence (6 bases); the six-base sequences of the five clones differ from one another and all correspond to designed barcode sequences. The part marked with a solid line is the DNA sequence corresponding to the inserted RNA fragment, which differs among the clones because cellular mRNA was used in the experiment.
Example 4
Optimization of the 3' linker sequence design:
3' linkers containing different barcode tags, before or after optimization (Tables 1 and 2, FIG. 3), were ligated to equal amounts of fragmented HeLa cell mRNA, the samples were pooled and m6A-seq was performed. The resulting sequencing data were split to check whether the data were evenly distributed, i.e. whether the barcode tag affects sequencing.
The results are shown in FIG. 4. Before optimization of the sequence design, the split sequencing data were clearly uneven; after optimization, the split data were more uniform. Taken together with the sequencing results, this is presumably because errors occur readily in the first few bases of a sequencing read, so that reads cannot be assigned to a barcode and data are wasted.
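One way to put a number on the evenness check in this example is sketched below. It is an assumption-laden illustration rather than the analysis used for FIG. 4: the function name is hypothetical, the read counts are made up, and the coefficient of variation is just one possible evenness measure. It reports the fraction of reads that could be assigned to any barcode, each barcode's share of the assigned reads, and the coefficient of variation of the per-barcode counts (0 for a perfectly even split).

```python
from statistics import mean, pstdev


def split_summary(counts, total_reads):
    """Summarize how evenly reads are distributed over barcodes.

    counts: dict mapping barcode name -> number of reads assigned to it.
    total_reads: number of reads in the library before splitting.
    """
    values = list(counts.values())
    assigned = sum(values)
    return {
        "assigned_fraction": assigned / total_reads,
        "share": {name: n / assigned for name, n in counts.items()},
        "cv": pstdev(values) / mean(values),
    }


if __name__ == "__main__":
    # Made-up read counts for illustration only.
    summary = split_summary({"bc1": 100, "bc2": 100, "bc3": 200}, total_reads=500)
    print(summary)
```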
Example 5
Verification of the independence of the data sets split from one library according to barcode:
(1) Four oligo RNA strands of known sequence were designed (Table 10), and each was mixed with fragmented HeLa cell mRNA at a mass ratio of 1:100;
TABLE 10 Oligo RNA sequences (5'-3')
Oligo RNA1 AUACUGCCACAUGCUGCACAGUGC
Oligo RNA2 GGACUGAGAACUGGACUGUCUGGGGUGCCAAGGUA
Oligo RNA3 GGACUGAACUGGACUGUCUGGGGUGCCAAGGUA
Oligo RNA4 GUACGUCAUCGAGAUCAGCUU
(2) 100 ng of each oligo RNA-spiked mRNA was taken and ligated at its 3' end to a 3' linker with a different tag; the 4 samples were pooled into one sample, followed by library construction, high-throughput sequencing and data analysis;
(3) During data analysis, the numbers of reads of the four oligo RNAs in the data split by each barcode were counted to determine whether there was cross-contamination between data sets.
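The counting in step (3) can be sketched as below. This is a hedged illustration rather than the inventors' pipeline: it assumes the reads have already been split by barcode and are given as DNA in the same orientation as the oligo RNA, and it counts a read only when it contains the complete oligo sequence (Oligo RNA2 and Oligo RNA3 share most of their sequence, so partial matching would confuse them). The function name and the example bins are hypothetical; the oligo sequences are those of Table 10.

```python
OLIGOS = {
    "Oligo RNA1": "AUACUGCCACAUGCUGCACAGUGC",
    "Oligo RNA2": "GGACUGAGAACUGGACUGUCUGGGGUGCCAAGGUA",
    "Oligo RNA3": "GGACUGAACUGGACUGUCUGGGGUGCCAAGGUA",
    "Oligo RNA4": "GUACGUCAUCGAGAUCAGCUU",
}


def count_oligo_reads(bins):
    """Count, for every barcode bin, the reads containing each full oligo.

    bins: dict mapping barcode name -> list of read sequences (DNA, assumed
    to be in the sense orientation of the oligo RNA).
    Returns a nested dict: barcode name -> oligo name -> read count.
    """
    targets = {name: seq.replace("U", "T") for name, seq in OLIGOS.items()}
    table = {}
    for barcode, reads in bins.items():
        table[barcode] = {
            name: sum(target in read.upper() for read in reads)
            for name, target in targets.items()
        }
    return table


if __name__ == "__main__":
    # Made-up bins for illustration only.
    example = {
        "bc1": ["ATACTGCCACATGCTGCACAGTGC", "ATACTGCCACATGCTGCACAGTGC"],
        "bc2": ["GTACGTCATCGAGATCAGCTT"],
    }
    for barcode, row in count_oligo_reads(example).items():
        print(barcode, row)
```

A clean result is a table whose counts are concentrated in one oligo per barcode, with the other entries at or near zero.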
The results are shown in FIG. 5, which lists the read counts of the four oligo RNAs in the data split by the four barcodes. Each oligo RNA has a large number of reads in the data split by its corresponding barcode and essentially none in the data split by the other three barcodes. This shows that the experimental scheme of the present invention, m6A high-throughput sequencing of multiple low-sample-size samples using barcode tags, is feasible, and that the split data sets are independent of one another without cross-contamination.
Example 6
Using the library construction method of the invention, 6 portions of 100 ng of fragmented HeLa cell mRNA were ligated to 6 different barcode tags (3' linkers 1-6 in Table 2), followed by m6A-seq library construction and sequencing. The sequencing data were analyzed to check whether the percentage distribution of m6A peaks in the m6A sequencing information obtained by the invention is consistent with the general distribution.
The results are shown in FIG. 6. After the m6A sequencing data were split by barcode sequence and analyzed, the percentage distribution of m6A peaks over 5 non-overlapping transcript segments was found to agree with the general distribution of m6A in the transcriptome, i.e. enrichment in the coding region, the 3' UTR and near the stop codon.
Example 7
Experiment on low-input samples:
(1) 16 portions of 20 ng of fragmented HeLa cell mRNA were ligated to 16 different barcode tags (3' linkers 1-16 in Table 2), followed by m6A-seq library construction and sequencing;
(2) After the sequencing data were obtained, they were split according to the barcode sequences, the split data were analyzed, and the data split by different barcodes were compared.
The results show that RNA-seq and m6A-seq libraries were successfully constructed from the pool of 16 barcode-labeled 20 ng portions of fragmented HeLa cell mRNA. Splitting and analysis of the library sequencing data showed no significant difference among samples in the number of reads assigned to each barcode or in the number of m6A peaks found (FIG. 7A, B); the distribution was fairly uniform. Analysis of the m6A-seq data showed that a library of m6A antibody-enriched sample had been successfully constructed and that the sequence at the m6A modification sites conforms to the general m6A motif (FIG. 7C), i.e. RRACH (R = G or A; H = A, C or U).
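The RRACH consensus quoted above (R = G or A; H = A, C or U) maps directly onto a regular expression. The short sketch below is illustrative only and is not the motif-discovery analysis behind FIG. 7C; the function name is hypothetical. It scans a sequence for all, possibly overlapping, RRACH occurrences and accepts DNA input by converting T to U.

```python
import re

# R = G or A, H = A, C or U; the lookahead allows overlapping matches.
RRACH = re.compile(r"(?=([GA][GA]AC[ACU]))")


def find_rrach(sequence: str):
    """Return (0-based start, matched 5-mer) for every RRACH site in the sequence."""
    rna = sequence.upper().replace("T", "U")
    return [(match.start(1), match.group(1)) for match in RRACH.finditer(rna)]


if __name__ == "__main__":
    print(find_rrach("CCGGACUAA"))
```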
Sequence listing
<110> Wuhan university
<120> Low-sample-size m6A high-throughput sequencing method based on RNA ligation tags
<160> 16
<170> SIPOSequenceListing 1.0
<210> 1
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
actcgannnn nnagatcgga agagcgtcgt g 31
<210> 2
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
agctgannnn nnagatcgga agagcgtcgt g 31
<210> 3
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
agcagannnn nnagatcgga agagcgtcgt g 31
<210> 4
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
agctcgnnnn nnagatcgga agagcgtcgt g 31
<210> 5
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
atcgcannnn nnagatcgga agagcgtcgt g 31
<210> 6
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
agctcannnn nnagatcgga agagcgtcgt g 31
<210> 7
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
ttcggannnn nnagatcgga agagcgtcgt g 31
<210> 8
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
catcgannnn nnagatcgga agagcgtcgt g 31
<210> 9
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
ctagcannnn nnagatcgga agagcgtcgt g 31
<210> 10
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
acggtannnn nnagatcgga agagcgtcgt g 31
<210> 11
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
ccattgnnnn nnagatcgga agagcgtcgt g 31
<210> 12
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
gattcgnnnn nnagatcgga agagcgtcgt g 31
<210> 13
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
cgttagnnnn nnagatcgga agagcgtcgt g 31
<210> 14
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
gactgtnnnn nnagatcgga agagcgtcgt g 31
<210> 15
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
acagcannnn nnagatcgga agagcgtcgt g 31
<210> 16
<211> 31
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
agtcgtnnnn nnagatcgga agagcgtcgt g 31

Claims (10)

1. A high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags, characterized in that the method comprises the following steps:
(1) phosphorylating and adenylating a library of 3' linkers containing different barcode tag sequences, wherein each 3' linker comprises, from 5' to 3', a barcode sequence, a random sequence, and a PCR primer linker sequence;
(2) ligating the adenylated 3' linkers from step (1), each containing a different barcode tag sequence, to fragmented RNA samples from different sources, respectively; pooling and purifying the samples; reserving an input control sample for RNA-seq; performing m6A antibody immunoprecipitation on the remaining pooled sample to obtain an IP sample for m6A-seq; and finally preparing next-generation sequencing libraries from the input and IP samples and sequencing them;
(3) splitting the sequencing data according to the barcode sequences and analyzing the split data to obtain the RNA-seq and m6A-seq information of each original sample.
2. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 1, characterized in that: in step (1), the 3' linkers are purified after phosphorylation and adenylation.
3. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 1, characterized in that: in step (1), the 3' linkers are phosphorylated with T4 PNK enzyme, and the phosphorylated 3' linkers are adenylated with a 5' adenylation reagent.
4. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 1, characterized in that: in step (1), the random sequence is a random sequence of 6 bases.
5. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 1, characterized in that step (2) comprises the following steps:
1) extracting RNA from the samples;
2) fragmenting the RNA samples into 200-400 bp fragments by a chemical ion fragmentation method, removing the fragmentation reagent by purification, and using an enzyme to remove the phosphate group at the 3' end of the fragmented RNA and add a phosphate group at the 5' end;
3) ligating the adenylated 3' linkers containing different barcode tags to the different fragmented RNA samples, and removing excess 3' linker after the reaction;
4) pooling and purifying the different samples, setting aside an input sample, performing m6A antibody immunoprecipitation on the remaining sample to obtain an IP sample, and reverse-transcribing the input sample and the IP sample to obtain cDNA;
5) ligating a 5' linker to the cDNA, purifying the reaction, and using RT-qPCR to determine the number of cycles required for the library-construction PCR;
6) performing PCR library construction with the 5'-linker-ligated cDNA as the substrate, purifying the product by gel extraction to obtain a next-generation sequencing library, and sequencing the library.
6. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 5, characterized in that: in step 2), the RNA is fragmented with a magnesium-ion chemical fragmentation reagent, and the enzyme is T4 PNK enzyme.
7. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 5, characterized in that: in step 3), the adenylated 3' linkers are ligated to the fragmented RNA samples with T4 RNA ligase 2, and excess 3' linker is removed with 5' deadenylase and RecJf enzyme.
8. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 5, characterized in that: in step 4), reverse transcription is performed with Superscript III enzyme.
9. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 5, characterized in that: in step 5), the 5' linker is ligated to the cDNA with T4 RNA ligase 1.
10. The high-throughput m6A sequencing method for multiple low-sample-size samples based on RNA ligation of barcode tags according to claim 1, characterized in that step (3) comprises the following steps:
1) analyzing the alignment rate of the sequencing data to check data quality;
2) splitting the data according to the barcode tag sequences and assigning the split data to the original samples;
3) analyzing the split data to obtain sequencing information.
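
The barcode-based splitting described in the claims can be sketched as follows. This is an illustration under stated assumptions, not the patented pipeline: the read layout (6 nt from the random sequence, then the reverse complement of the 6-nt barcode, then the RNA-derived insert) is assumed and depends on the actual library design; the sample names and function names are hypothetical; the three barcodes are the first 6 nt of SEQ ID NO: 2-4.

```python
from collections import defaultdict

# Barcodes taken from the first 6 nt of SEQ ID NO: 2-4; sample names are hypothetical.
BARCODES = {
    "AGCTGA": "sample_2",
    "AGCAGA": "sample_3",
    "AGCTCG": "sample_4",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_read(read, rand_len=6, bc_len=6):
    """Split a read into (barcode, random tag, insert).

    ASSUMED layout: random tag + revcomp(barcode) + insert.
    """
    random_tag = read[:rand_len]
    barcode = revcomp(read[rand_len:rand_len + bc_len])
    insert = read[rand_len + bc_len:]
    return barcode, random_tag, insert


def demultiplex(reads):
    """Group reads by sample using exact barcode matches only."""
    bins = defaultdict(list)
    for read in reads:
        barcode, random_tag, insert = split_read(read)
        sample = BARCODES.get(barcode, "unassigned")
        bins[sample].append((random_tag, insert))
    return dict(bins)


if __name__ == "__main__":
    reads = ["ACGTACTCAGCTGGGTTTAAA", "ACGTACAAAAAAGGG"]
    for sample, items in demultiplex(reads).items():
        print(sample, items)
```

Exact matching is used because several of the listed barcodes differ by a single base; the random tag is kept alongside each insert so that it could later serve to collapse PCR duplicates, which is a common use of such tags and an assumption here.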
CN202111066944.5A 2021-09-13 2021-09-13 Low-sample-size m6A high-throughput sequencing method based on RNA ligation tags Active CN113774121B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111066944.5A CN113774121B (en) 2021-09-13 2021-09-13 Low-sample-size m6A high-throughput sequencing method based on RNA ligation tags

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111066944.5A CN113774121B (en) 2021-09-13 2021-09-13 Low-sample-size m6A high-throughput sequencing method based on RNA ligation tags

Publications (2)

Publication Number Publication Date
CN113774121A true CN113774121A (en) 2021-12-10
CN113774121B CN113774121B (en) 2024-02-20

Family

ID=78842844

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111066944.5A Active CN113774121B (en) 2021-09-13 2021-09-13 Low-sample-size m6A high-throughput sequencing method based on RNA ligation tags

Country Status (1)

Country Link
CN (1) CN113774121B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105200530A (en) * 2015-10-13 2015-12-30 北京百迈客生物科技有限公司 Method for constructing a multi-sample pooled library suitable for high-throughput whole-genome sequencing
CN108504651A (en) * 2017-02-27 2018-09-07 深圳市乐土精准医疗科技有限公司 Library construction method and reagents for pooled library construction from large numbers of PCR product samples, based on high-throughput sequencing
CN110904192A (en) * 2018-12-28 2020-03-24 广州表观生物科技有限公司 Ultra-micro RNA methylation m6A detection method and application thereof
CN113308514A (en) * 2021-05-19 2021-08-27 武汉大学 Construction method and kit for a trace m6A detection library, and high-throughput detection method

Also Published As

Publication number Publication date
CN113774121B (en) 2024-02-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant