CN113774121A

CN113774121A - Low-sample-size m⁶A high-throughput sequencing method based on RNA ligation tags

Info

Publication number: CN113774121A
Application number: CN202111066944.5A
Authority: CN
Inventors: 周翔; 翁小成; 秦珊珊; 韩少卿
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2021-09-13
Filing date: 2021-09-13
Publication date: 2021-12-10
Anticipated expiration: 2041-09-13
Also published as: CN113774121B

Abstract

The invention discloses a low-sample-size m⁶A high-throughput sequencing method based on RNA ligation tags, belonging to the field of high-throughput RNA sequencing. The method comprises the following steps: phosphorylating and adenylating 3' linkers containing different barcode tag sequences; ligating the adenylated 3' linkers to fragmented RNA samples from different sources, pooling the samples, reserving an input control sample for RNA-seq, performing m⁶A antibody immunoprecipitation on the remaining pooled sample to obtain an IP sample for m⁶A-seq, and finally obtaining next-generation sequencing libraries and sequencing them; and splitting the sequencing data according to the barcode sequences and analyzing them to obtain the RNA-seq and m⁶A-seq information of each original individual sample. The invention enables m⁶A antibody immunoprecipitation and library construction and sequencing to be performed simultaneously on multiple clinical low-sample-size samples.

Description

Low-sample-size m⁶A high-throughput sequencing method based on RNA ligation tags

Technical Field

The invention relates to the field of high-throughput RNA sequencing, and in particular to a low-sample-size m⁶A high-throughput sequencing method based on RNA ligation tags.

Background

Post-synthetic modifications of biological macromolecules play an important role in many life processes. Among them, N⁶-methyladenosine (m⁶A) is the most abundant post-transcriptional RNA modification in eukaryotic messenger ribonucleic acid (mRNA). The m⁶A modification level is dynamically reversible in mammalian cells and is regulated by a variety of m⁶A-related proteins. Previous studies have shown that the dynamic regulation of m⁶A is closely related to important physiological processes, and dysregulation of m⁶A has also been shown to cause disease-related pathological changes. The overall m⁶A content can be obtained by enzymatic digestion of an RNA sample followed by LC-MS. However, since m⁶A plays an important role in almost all metabolic processes of mRNA (e.g., formation, processing, transport, translation, and degradation), locating m⁶A modifications and studying changes in the modification level at specific sites are of great significance. In the study of nucleic acid biomacromolecules, qualitative and quantitative analysis of gene sequences at specific sites is usually achieved by sequencing. Compared with first-generation sequencing technology, second-generation sequencing technology can rapidly sequence hundreds of thousands to millions of DNA molecules in a single run, at low cost and with an accuracy above 99%, which reduces sequencing cost, increases sequencing throughput, and makes it better suited to analyzing many samples. At the same time, the discovery of m⁶A-specific antibodies has greatly advanced research on m⁶A modification sites at the transcriptome level. MeRIP-seq (methylated RNA immunoprecipitation followed by sequencing), a sequencing method developed by combining immunoprecipitation with an m⁶A-specific antibody and high-throughput sequencing, achieved transcriptome-wide localization of m⁶A and advanced research on the molecular mechanism and mode of action of m⁶A. However, because high-throughput sequencing methods such as MeRIP-seq that rely on m⁶A-specific antibodies are limited by the efficiency of antibody immunoprecipitation and by the low abundance of the m⁶A modification itself, m⁶A-seq is difficult to achieve for a single clinical peripheral blood sample (2-4 mL of whole blood).

The development of single-cell sequencing has been greatly facilitated by barcode labeling technology, whose advantage is that it can label multiple samples. After the RNA or DNA of single cells is barcode-labeled and pooled, transcriptome, genome, or genome post-synthetic modification sequencing can be carried out; during bioinformatic analysis, the sequencing data can be split by the different barcode tag sequences and traced back to each cell of the original sample.

Disclosure of Invention

The invention aims to provide an m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags, so that m⁶A antibody immunoprecipitation and library construction and sequencing can be performed simultaneously on a pool of multiple clinical low-sample-size samples.

The purpose of the invention is realized by the following technical scheme:

An m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags, comprising the following steps:

(1) Phosphorylating and adenylating library-construction 3' linker oligonucleotides (3' linkers) containing different barcode tag sequences. The 3' linker is composed of 5'-barcode sequence-random sequence-PCR primer adapter sequence-3'.

(2) Ligating the adenylated 3' linkers containing different barcode tag sequences from step (1) to fragmented RNA samples from different sources, pooling the samples, purifying, reserving an input control sample for RNA-seq, performing m⁶A antibody immunoprecipitation on the remaining pooled sample to obtain an IP sample for m⁶A-seq, and finally obtaining next-generation sequencing libraries of the input and IP samples and sequencing them.

(3) Splitting the sequencing data of the pooled sample according to the barcode sequences and analyzing the data by bioinformatic means to obtain the RNA-seq and m⁶A-seq information of each original individual sample.

Preferably, in step (1), the 3' linker is purified after phosphorylation and after adenylation to reduce interference between the different reactions. The phosphorylation is carried out on the 3' linker using T4 PNK in a reaction system containing ATP, the adenylation is carried out on the phosphorylated 3' linker using a 5' adenylation reagent, and the purification is carried out using an oligonucleotide clean-up and concentration kit.

Preferably, in step (1), the random sequence is a 6-base random sequence, which ensures that the barcode sequence is read accurately during sequencing. The barcode sequence preferably consists of 6 bases; its design should avoid duplicating the Index sequences of the PCR primers used in commercial next-generation sequencing library construction, keep the distribution of the four bases A, T, G, and C as even as possible, and avoid runs of a single base (such as GGGG).
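The barcode design rules above can be expressed as a small check. The following Python sketch is illustrative only: the function name `check_barcode`, the `reserved_indexes` argument, and the thresholds (runs of more than 3 identical bases rejected, no single base making up more than half of the barcode) are assumptions chosen for the example, not values given in this document.

```python
from collections import Counter


def check_barcode(barcode, reserved_indexes=(), max_run=3, length=6):
    """Return a list of rule violations for one candidate barcode.

    An empty list means the candidate passes every check.
    max_run and the "no base above half" balance rule are assumed thresholds.
    """
    bc = barcode.upper()
    if len(bc) != length or set(bc) - set("ACGT"):
        return ["must be exactly %d bases of A/C/G/T" % length]

    problems = []

    # Rule 1: do not duplicate an index already used by the library PCR primers.
    if bc in {idx.upper() for idx in reserved_indexes}:
        problems.append("duplicates a reserved index sequence")

    # Rule 2: avoid long runs of a single base (e.g. GGGG).
    longest = run = 1
    for prev, cur in zip(bc, bc[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    if longest > max_run:
        problems.append("contains a run of %d identical bases" % longest)

    # Rule 3: keep the four bases roughly balanced.
    if max(Counter(bc).values()) > length // 2:
        problems.append("base composition is skewed")

    return problems
```

For instance, `check_barcode("ACTCGA")` returns an empty list, while a candidate such as `"GGGGAT"` is reported for both its homopolymer run and its skewed composition.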

Further, the step (2) comprises the following steps:

1) Extracting RNA from a sample (cells, blood sample, tissue sample, etc.).

2) Fragmenting the RNA sample into 200-400 bp fragments by chemical ion fragmentation, removing the fragmentation reagent by purification, and enzymatically removing the phosphate group at the 3' end of the fragmented RNA and adding a phosphate group at the 5' end.

3) Ligating the adenylated 3' linkers containing different barcode tags to the different fragmented RNA samples, and removing excess 3' linker after the reaction is complete.

4) Pooling and purifying the different samples, reserving an input sample, performing m⁶A antibody immunoprecipitation on the remaining sample to obtain an IP sample, reverse transcribing the input and IP samples, removing the reverse transcription primer and the template, and purifying to obtain cDNA.

5) Ligating a 5' adaptor to the cDNA, purifying the reaction system, and determining the number of cycles required for the library-construction PCR by RT-qPCR.

6) Performing library-construction PCR with the 5' adaptor-ligated cDNA as the substrate, purifying the product by gel extraction to obtain a next-generation sequencing library, and sending the purified library to a sequencing company for sequencing.

Preferably, in step 1), the RNA in the sample is extracted using Trizol reagent.

Preferably, in step 2), the RNA is fragmented with a magnesium-ion chemical fragmentation reagent, the fragmented RNA is purified with an RNA clean-up and concentration kit, and the purified RNA fragments are end-repaired with T4 PNK.

Preferably, in step 3), the adenylated 3' linker is ligated to the fragmented RNA sample in an overnight reaction using T4 RNA ligase 2 (truncated KQ). After the reaction is complete, excess 3' linker is removed using 5' deadenylase and RecJf.

Preferably, in step 4), the reaction mixtures from step 3) are pooled directly and the pooled reaction is then purified with an RNA clean-up and concentration kit. 1/50 of the sample is reserved as the input control, and the remainder is immunoprecipitated according to the instructions for the m⁶A antibody to obtain the IP sample. The samples are reverse transcribed with SuperScript III to obtain cDNA.

Preferably, in step 5), the 5' adaptor is ligated to the cDNA in an overnight reaction using T4 RNA ligase 1 (high concentration). The cycle number at which the fluorescence reaches a plateau in RT-qPCR is selected as the number of cycles for library PCR amplification.

Further, the step (3) comprises the following steps:

1) Analyzing the alignment rate of the sequencing data and checking the data quality.

2) Splitting the data according to the barcode tag sequences and assigning them to the original samples.

3) Analyzing the split data to obtain the sequencing information.
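The barcode splitting step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the pipeline actually used: it assumes each read begins with the 6-base random sequence followed by the 6-base barcode exactly as it appears in the read (depending on library orientation this may be the reverse complement of the linker barcode, which the caller would have to supply), and the function name `split_by_barcode` and the `"unassigned"` bin are hypothetical.

```python
from collections import defaultdict


def split_by_barcode(reads, barcodes, random_len=6, barcode_len=6):
    """Group reads by barcode and trim the random bases and barcode.

    reads:    iterable of (read_name, sequence) pairs.
    barcodes: dict mapping barcode sequence (as seen in the read) to sample name.
    Assumed read layout: [random bases][barcode][insert ...].
    """
    bins = defaultdict(list)
    start, end = random_len, random_len + barcode_len
    for name, seq in reads:
        tag = seq[start:end].upper()
        # Reads whose tag matches no known barcode are kept apart, not dropped.
        sample = barcodes.get(tag, "unassigned")
        # Only the insert (everything after the barcode) is passed on.
        bins[sample].append((name, seq[end:]))
    return dict(bins)
```

Exact matching is used here for simplicity; tolerating one mismatch in the barcode would recover more reads at the cost of a small risk of misassignment, and is left out of this sketch.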

The strategy of the invention is shown in FIG. 1. Because existing methods have difficulty achieving single-sample m⁶A-seq for clinical low-sample-size samples, the invention builds on the m⁶A-seq library construction method and combines it with barcode labeling: RNA samples from different sources are labeled and then pooled for m⁶A immunoprecipitation, and in the data analysis stage the reads are traced back to the original samples according to the barcode sequences, so that m⁶A-seq information is obtained for each individual low-sample-size sample. Compared with the prior art, the invention has the following advantages and beneficial effects:

(1) Different barcode tag sequences are designed at the 5' end of the library-construction 3' linker and ligated to fragmented RNA samples from different sources to distinguish them, after which the samples are pooled for m⁶A immunoprecipitation. This lowers the amount of RNA that a single sample needs for m⁶A-seq and makes m⁶A-seq of low-sample-size clinical samples achievable. Less than 20 ng of fragmented RNA from a single sample is enough to construct a library successfully and achieve m⁶A-seq.

(2) The library construction method is general: the number of barcode tags can be chosen according to the number of samples, and for samples of small quantity, library construction and sequencing can be achieved by increasing the number of pooled samples.

(3) The library construction method simplifies the experimental procedure and reduces the experimental cost: m⁶A-IP needs to be performed only once for multiple samples, which reduces the consumption of m⁶A antibody and the corresponding experimental operations.

(4) The library construction method gives good sequencing results: when the per-sample input amounts before pooling are similar, splitting the next-generation sequencing data of the constructed library by barcode sequence yields relatively even amounts of data per sample, and the data can be analyzed by common methods.

Drawings

FIG. 1 is a schematic of the strategy of the present invention.

FIG. 2 is a polyacrylamide gel electrophoresis image of the 3' linker after adenylation in the present invention, in which the top is the adenylated 3' linker and the bottom is the non-adenylated 3' linker used as a control.

FIG. 3 shows the results of first-generation sequencing of library sequences constructed according to the present invention, obtained by TA cloning.

FIG. 4 shows how the sequencing data split before and after optimization of the 3' linker sequence design in the present invention. The upper panel shows the split of sequencing data for the 12 3' linkers before optimization (without a random sequence), and the lower panel shows the split for the 6 newly designed 3' linkers after optimization (with a 6-base random sequence).

FIG. 5 shows the design and results of the data-independence experiment of the present invention, in which specific oligonucleotide sequences were labeled with different barcode tags and pooled for library construction against a background of HeLa cell mRNA. The upper panel is the experimental design and the lower panel is the experimental results.

FIG. 6 shows the analysis of m⁶A-seq data from libraries constructed and sequenced according to the present invention from 100 ng portions of HeLa cell mRNA labeled with 6 barcode tags: the percentage distribution of m⁶A peaks across 5 non-overlapping transcript segments, obtained after splitting the m⁶A-seq data by the 6 barcode sequences and analyzing them.

FIG. 7 shows the results of splitting the sequencing data obtained according to the present invention after 20 ng portions of HeLa cell mRNA were labeled with 16 barcode tags, pooled, and sequenced. A and B show the data split for the input and IP samples; C shows the motif sequence at the m⁶A peaks obtained by analyzing the m⁶A-seq data.

Detailed Description

The invention will be further explained with reference to the following examples and the accompanying drawings for better understanding. The present invention is not limited to the following embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which are made without departing from the spirit and principle of the present invention are also intended to be equivalent substitutions within the scope of the present invention.

The sequences of the 3' linker referred to in the examples below are shown in tables 1 and 2 below.

Table 1. 3' linker sequences (5'-3')

General sequence of the 3' linker before optimization	NNNNNNAGATCGGAAGAGCGTCGTG-SpC3
General sequence of the 3' linker after optimization	NNNNNNNNNNNNAGATCGGAAGAGCGTCGTG-SpC3

Note: the bold N (the first six N of each sequence) are the barcode sequence, whose specific sequences are shown in Table 2 below; the italic N (the following six N of the optimized linker) are a 6-base random sequence; and the SpC3 modification at the 3' end is a 3-carbon spacer introduced at the 3' end to prevent the 3' end of the 3' linker from ligating to other nucleic acid strands.
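The linker layout in Table 1 (barcode, then random bases, then the PCR primer adapter sequence) can be written out as a short Python helper. This is only an illustrative sketch: the function name `build_3p_linker` is hypothetical, and the 3' SpC3 spacer is a chemical modification that is not represented in the returned string.

```python
# PCR primer adapter portion of the 3' linker, taken from Table 1.
ADAPTER = "AGATCGGAAGAGCGTCGTG"


def build_3p_linker(barcode, random_len=6):
    """Return the 5'->3' linker sequence: barcode + N * random_len + adapter.

    random_len=6 gives the optimized layout; random_len=0 gives the
    pre-optimization layout (barcode directly followed by the adapter).
    """
    barcode = barcode.upper()
    if not barcode or set(barcode) - set("ACGT"):
        raise ValueError("barcode must contain only A, C, G, T")
    return barcode + "N" * random_len + ADAPTER
```

With a 6-base barcode and 6 random bases the result is 31 nt long, which agrees with the length given for each entry in the sequence listing.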

Table 2. Sequences of the different 3' linkers

Note: italicized N is a random sequence of 6 bases.

Example 1

Adenylation of the 3' linker:

(1) The 3' linker (sequences shown in the tables above) was mixed with ATP, T4 PNK Buffer, and T4 PNK and reacted as follows: the ordered 3' linker was dissolved in enzyme-free water to a final concentration of 100 μM, and 5 μL was added to the following reaction system (Table 3), reacted at 37 °C for 1 h, and then inactivated at 65 °C for 20 min.

Table 3. Phosphorylation system (50 μL)

3' linker (100 μM)	5 μL
10 mM ATP	3 μL
10× T4 PNK Buffer	5 μL
T4 PNK (10 U/μL)	2 μL
Enzyme-free water	To 50 μL
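The water volume and final concentrations implied by Table 3 follow from simple dilution arithmetic (final concentration = stock concentration × added volume / total volume). The Python sketch below works this through; the helper name `reaction_summary` and the dictionary layout are illustrative assumptions, while the stock concentrations and volumes are those listed in Table 3.

```python
def reaction_summary(components, total_ul):
    """Compute the water volume and final concentrations for a reaction mix.

    components: dict mapping name -> (stock concentration, volume in uL).
    Returns (water volume in uL, dict of final concentrations in stock units).
    """
    used = sum(volume for _, volume in components.values())
    if used > total_ul:
        raise ValueError("component volumes exceed the total reaction volume")
    water = total_ul - used
    finals = {
        name: stock * volume / total_ul
        for name, (stock, volume) in components.items()
    }
    return water, finals


# Values from Table 3 (50 uL phosphorylation system).
table3 = {
    "3' linker (uM)": (100.0, 5.0),
    "ATP (mM)": (10.0, 3.0),
    "T4 PNK Buffer (x)": (10.0, 5.0),
    "T4 PNK (U/uL)": (10.0, 2.0),
}
water, finals = reaction_summary(table3, 50.0)
```

Here the listed components total 15 μL, so 35 μL of enzyme-free water brings the reaction to 50 μL, and the linker is diluted tenfold to 10 μM.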

The reaction was purified using the Oligo Clean & Concentrator (OCC, Zymo) kit and eluted with 10 μL of enzyme-free water to give the phosphorylated 3' linker.

(2) The phosphorylated 3' linker obtained in step (1) was reacted with Mth RNA Ligase, 5' DNA adenylation reaction buffer, and ATP at 65 °C for 1 h in the following reaction system (Table 4), followed by inactivation at 85 °C for 5 min.

Table 4. Adenylation system (50 μL)

Phosphorylated 3' linker	3 μL
1 mM ATP	5 μL
10× 5' DNA adenylation reaction buffer	5 μL
50 μM Mth RNA ligase	3 μL
Enzyme-free water	To 50 μL

The reaction mixture was purified again using the OCC kit to obtain adenylated 3' linkers containing different barcode tag sequences.

Example 2

Verification of the adenylation reaction:

Equal amounts of the adenylated sample and of 3' linker that had not been adenylated were mixed with loading buffer and run on a 20% neutral polyacrylamide gel.

Analysis of experimental results:

FIG. 2 is the polyacrylamide gel electrophoresis image of the 3' linker after adenylation in the present invention. It shows that the 3' linker was successfully adenylated, although a small amount of substrate was not. The non-adenylated 3' linker has no ligation activity, and the 3' linker is used in large excess in the reaction, so subsequent experiments are not affected.

Example 3

1. Library construction:

RNA was extracted from samples (cells, tissues, blood samples, etc.) using Trizol reagent. The different RNA samples were then fragmented according to the instructions for the magnesium-ion chemical fragmentation reagent (NEB, E6150S), and the fragmented RNA was purified using the RNA Clean & Concentrator (RCC) kit (Zymo) and eluted with 7 μL of enzyme-free water. The eluted RNA was added to the following system (Table 5) and treated with PNK at 37 °C for 1 h so that it could be ligated to the adenylated 3' linker.

Table 5. PNK treatment system (10 μL)

Fragmented and purified RNA	7 μL
RiboLock RNase Inhibitor (40 U/μL)	1 μL
10× T4 PNK Buffer	1 μL
T4 PNK (10 U/μL)	1 μL

As shown in Table 6 below, the 3' linker, T4 RNA Ligase 2 (truncated KQ), and the other reagents required for ligation were added directly to the PNK reaction, mixed by pipetting up and down, and reacted at 25 °C for 2 h and then at 16 °C overnight (12 h).

Table 6. 3' linker ligation reaction system (20 μL)

PNK-treated reaction	10 μL
Adenylated and purified 3' linker	2 μL
10× T4 PNK Buffer	1 μL
50% PEG8000	6 μL
0.1 M DTT	1 μL
T4 RNA Ligase 2 (truncated KQ)	1 μL

The next day, 1 μL of 5' deadenylase was added directly to each reaction and reacted at 30 °C for 1 h, followed by 1 μL of RecJf at 37 °C for 1 h. After the reaction was complete, the different samples were pooled, purified with the RCC kit, and eluted with 50.7 μL of enzyme-free water. To the eluted pooled sample, 1.3 μL of RiboLock RNase Inhibitor (40 U/μL) was added and mixed well to prevent RNA degradation, and 1 μL of the sample was taken out and added to 9 μL of enzyme-free water to be kept as the Input control. The rest of the sample was subjected to m⁶A antibody immunoprecipitation according to the instructions for the N6-Methyladenosine Enrichment Kit (NEB, E1610S) and finally eluted with 12 μL of enzyme-free water to obtain the IP sample. The Input sample and the IP sample were each added to the following system (Table 7), mixed by pipetting, and reverse transcribed at 25 °C for 3 min, 42 °C for 10 min, and 52 °C for 40 min.

Table 7. Reverse transcription reaction system (20 μL)

After completion of the reverse transcription reaction, 1 μL of Exo I was added to the reaction and incubated at 37 °C for 30 min to remove excess reverse transcription primer; 15 μL of 0.5 M EDTA (pH 8.0) and 15 μL of 1 M NaOH were then added and the reaction was treated at 65 °C for 15 min to remove the RNA template. The reaction was purified using the Oligo Clean & Concentrator (OCC) kit (Zymo) and eluted with 7 μL of enzyme-free water to obtain the cDNA sample. The cDNA samples were then ligated to the 5' adaptor (5'-Phos-NNNNNNNNNNAGATCGGAAGAGCACACGTCTG-SpC-3', where N stands for a random base) overnight (12 h) at 25 °C in the following reaction system (Table 8).

Table 8. 5' adaptor ligation system (20 μL)

After purification with the OCC kit, the reaction was eluted with 12 μL of enzyme-free water.

1 μL each of the IP and Input samples was used for RT-qPCR in a 20 μL system, with the following primer sequences:

RT-qPCR primer sequences (5'-3')

qPCR forward primer	TACCTTGGCACCCCAGAC
qPCR reverse primer	TTCAGAGTTCTACAGTCCGA

The fluorescence curve was observed, and the minimum cycle number (Ct value) at which the fluorescence reached the plateau was selected as the number of cycles for the library-construction PCR.
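Choosing the cycle number from the amplification curve can be illustrated with a short Python sketch. The document does not define numerically when the plateau is reached, so the criterion used here (the first cycle whose baseline-corrected fluorescence reaches 95% of the curve's maximum) and the function name `plateau_cycle` are assumptions made for the example.

```python
def plateau_cycle(fluorescence, fraction=0.95):
    """Return the 1-based cycle at which fluorescence first reaches the plateau.

    fluorescence: list of fluorescence readings, one per PCR cycle.
    fraction:     assumed plateau criterion, as a fraction of the
                  baseline-to-maximum range of the curve.
    """
    if not fluorescence:
        raise ValueError("fluorescence curve is empty")
    baseline = min(fluorescence)
    top = max(fluorescence)
    threshold = baseline + fraction * (top - baseline)
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle
```

In practice the curve would be inspected by eye as described above; a rule of this kind only makes the choice reproducible across samples.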

Library-construction PCR was performed in the following reaction system (Table 9), where the PCR primers were NEB next-generation sequencing primers and the PCR program was set according to the instructions for the Ultra™ II Master Mix.

Table 9. Library-construction PCR reaction system (50 μL)

The PCR product was purified using a gel extraction kit (following the kit instructions) to obtain a library sample that could be sent to a sequencing company for next-generation sequencing.

2. Library composition verification:

The library constructed by the method of the invention was inserted into a plasmid by TA cloning, and 5 single clones were picked for first-generation sequencing to verify that the constructed library was as expected.

The results are shown in FIG. 3. In the first-generation sequencing results, the gray-shaded parts at the two ends of the DNA sequence correspond to the forward and reverse primers of the library-construction PCR primer kit; the wavy-underlined part is the random sequence of ten N in the 5' adaptor, and this part differs among the five clones; the dotted-underlined part corresponds to the barcode tag sequence (6 random bases), and the six-base sequences obtained from the five clones differ from one another and all match designed barcode sequences; the solid-underlined part is the DNA sequence corresponding to the inserted RNA fragment, which differs between clones because cellular mRNA was used in the experiment.

Example 4

Optimization of the 3' linker sequence design:

3' linkers containing different barcode tags before/after optimization (Tables 1-2, FIG. 3) were ligated to equal amounts of fragmented HeLa cell mRNA and pooled for m⁶A-seq. The resulting sequencing data were split to check whether the data were evenly distributed, i.e., whether the barcode tags affected sequencing.

The results are shown in FIG. 4. Before the sequence design was optimized, the split sequencing data were clearly uneven; after optimization, the split data were more even. Taken together with the sequencing results, this is presumably because errors arise easily in the first few bases of a sequencing read, so that the sequencing data cannot be split successfully and data are wasted.
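How even a split is can be summarized numerically, for example by the fraction of reads assigned to each barcode and the coefficient of variation (CV) of the read counts. The Python sketch below shows one way to compute these; the function name `split_evenness` and the choice of CV as the evenness measure are assumptions for illustration, since the document judges evenness from FIG. 4 without naming a statistic.

```python
from statistics import mean, pstdev


def split_evenness(read_counts):
    """Summarize how evenly reads are distributed across barcodes.

    read_counts: dict mapping barcode (or sample name) -> number of reads.
    Returns (fractions, cv): the share of reads per barcode, and the
    coefficient of variation of the counts (0 means perfectly even).
    """
    counts = list(read_counts.values())
    total = sum(counts)
    if total == 0:
        raise ValueError("no reads were assigned to any barcode")
    fractions = {bc: n / total for bc, n in read_counts.items()}
    cv = pstdev(counts) / mean(counts)
    return fractions, cv
```

A lower CV after adding the random sequence would correspond to the more even split described above.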

Example 5

Independence verification of the multiple data sets split from the same library by barcode:

(1) Four oligo RNA strands of known sequence were designed (Table 10), and each was mixed with fragmented HeLa cell mRNA at a mass ratio of 1:100;

Table 10. Oligo RNA sequences (5'-3')

Oligo RNA1	AUACUGCCACAUGCUGCACAGUGC
Oligo RNA2	GGACUGAGAACUGGACUGUCUGGGGUGCCAAGGUA
Oligo RNA3	GGACUGAACUGGACUGUCUGGGGUGCCAAGGUA
Oligo RNA4	GUACGUCAUCGAGAUCAGCUU

(2) 100 ng of each oligo RNA-containing mRNA mixture was taken, 3' linkers with different tags were ligated to the 3' ends, and the 4 samples were pooled into one sample for library construction, high-throughput sequencing, and data analysis;

(3) During data analysis, the numbers of reads of the four oligo RNAs in the data split by each barcode were counted to determine whether there was any data cross-contamination.

The results are shown in FIG. 5, which lists the read counts of the four oligo RNAs in the data split by the four barcodes. Each oligo RNA has a large number of reads in the data split by its corresponding barcode and essentially none in the data split by the other three barcodes. This result establishes that the experimental scheme of the invention, which uses barcode tags for m⁶A high-throughput sequencing of multiple low-sample-size samples, is feasible, and that the split data sets are independent of one another with no cross-contamination.
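The cross-contamination check amounts to building a table of oligo read counts per barcode bin. The following Python sketch illustrates the idea under simplifying assumptions: reads are given as DNA strings in the same orientation as the oligo RNAs, a read is counted only if it contains the full oligo sequence exactly, and the names `rna_to_dna` and `count_oligo_reads` are hypothetical. The oligo sequences are those of Table 10.

```python
# Spike-in oligo RNA sequences from Table 10 (5'->3').
OLIGO_RNAS = {
    "Oligo RNA1": "AUACUGCCACAUGCUGCACAGUGC",
    "Oligo RNA2": "GGACUGAGAACUGGACUGUCUGGGGUGCCAAGGUA",
    "Oligo RNA3": "GGACUGAACUGGACUGUCUGGGGUGCCAAGGUA",
    "Oligo RNA4": "GUACGUCAUCGAGAUCAGCUU",
}


def rna_to_dna(seq):
    """Convert an RNA sequence to its DNA-alphabet equivalent (U -> T)."""
    return seq.upper().replace("U", "T")


def count_oligo_reads(bins):
    """Count reads containing each oligo, per barcode bin.

    bins: dict mapping barcode/sample name -> list of read sequences (DNA).
    Returns a nested dict: counts[sample][oligo_name] = number of reads.
    """
    oligos = {name: rna_to_dna(seq) for name, seq in OLIGO_RNAS.items()}
    counts = {}
    for sample, reads in bins.items():
        counts[sample] = {
            name: sum(1 for read in reads if dna in read.upper())
            for name, dna in oligos.items()
        }
    return counts
```

If the split data sets are independent, each oligo should show counts almost exclusively in the bin of the barcode it was ligated to, giving a table with large values in one cell per oligo and values near zero elsewhere.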

Example 6

Using the library construction method of the invention, 6 portions of 100 ng of fragmented HeLa cell mRNA were each ligated to one of 6 different barcode tags (3' linkers 1-6 in Table 2), and m⁶A-seq library construction and sequencing were performed. The resulting sequencing data were analyzed to check whether the percentage distribution of m⁶A peaks in the m⁶A sequencing information obtained by the invention is consistent with the general distribution.

The results are shown in FIG. 6. After the m⁶A sequencing data were split by barcode sequence and analyzed, the percentage distribution of m⁶A peaks across the 5 non-overlapping transcript segments was found to match the general distribution of m⁶A in the transcriptome, i.e., enriched in the coding region, the 3' UTR, and around the stop codon.

Example 7

Experiment with low-input samples:

(1) 16 portions of 20 ng of fragmented HeLa cell mRNA were each ligated to one of 16 different barcode tags (3' linkers 1-16 in Table 2), and m⁶A-seq library construction and sequencing were performed;

(2) After the sequencing data were obtained, they were split by barcode sequence, the split data were analyzed, and the data split by the different barcodes were compared.

The results show that after 16 portions of 20 ng of fragmented HeLa cell mRNA were barcode-labeled and pooled for library construction, RNA-seq and m⁶A-seq libraries were constructed successfully. Splitting and analyzing the library sequencing data showed no significant difference among the barcodes in the number of reads or in the number of m⁶A peaks found in each sample (FIG. 7A, B), i.e., the distribution was fairly even. Analysis of the m⁶A-seq data showed that a sample library enriched by the m⁶A antibody had been constructed successfully, and that the sequences at the m⁶A modification sites match the general m⁶A modification motif (FIG. 7C), i.e., RRACH (R = G or A; H = A, C, or U).
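The RRACH consensus translates directly into a regular expression. The Python sketch below scans a sequence for motif occurrences; the function name `find_rrach` is hypothetical, and accepting T in place of U (so that DNA-alphabet reads can be scanned) is an assumption added for convenience.

```python
import re

# R = G or A; H = A, C or U. T is also accepted so DNA-alphabet input works.
# The lookahead lets overlapping occurrences be reported.
RRACH = re.compile(r"(?=([GA][GA]AC[ACUT]))")


def find_rrach(sequence):
    """Return (0-based start, matched 5-mer) for every RRACH occurrence."""
    return [(m.start(1), m.group(1)) for m in RRACH.finditer(sequence.upper())]
```

The methylated adenosine is the central A of each match, i.e., at offset `start + 2`.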

Sequence listing

<110> Wuhan University

<120> Low-sample-size m⁶A high-throughput sequencing method based on RNA ligation tags

<160> 16

<170> SIPOSequenceListing 1.0

<210> 1

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

actcgannnn nnagatcgga agagcgtcgt g 31

<210> 2

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 2

agctgannnn nnagatcgga agagcgtcgt g 31

<210> 3

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 3

agcagannnn nnagatcgga agagcgtcgt g 31

<210> 4

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 4

agctcgnnnn nnagatcgga agagcgtcgt g 31

<210> 5

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 5

atcgcannnn nnagatcgga agagcgtcgt g 31

<210> 6

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 6

agctcannnn nnagatcgga agagcgtcgt g 31

<210> 7

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 7

ttcggannnn nnagatcgga agagcgtcgt g 31

<210> 8

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 8

catcgannnn nnagatcgga agagcgtcgt g 31

<210> 9

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 9

ctagcannnn nnagatcgga agagcgtcgt g 31

<210> 10

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 10

acggtannnn nnagatcgga agagcgtcgt g 31

<210> 11

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 11

ccattgnnnn nnagatcgga agagcgtcgt g 31

<210> 12

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 12

gattcgnnnn nnagatcgga agagcgtcgt g 31

<210> 13

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 13

cgttagnnnn nnagatcgga agagcgtcgt g 31

<210> 14

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 14

gactgtnnnn nnagatcgga agagcgtcgt g 31

<210> 15

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 15

acagcannnn nnagatcgga agagcgtcgt g 31

<210> 16

<211> 31

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 16

agtcgtnnnn nnagatcgga agagcgtcgt g 31

Claims

1. An m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags, characterized in that the method comprises the following steps:

(1) phosphorylating and adenylating library-construction 3' linkers containing different barcode tag sequences; the 3' linker being composed of 5'-barcode sequence-random sequence-PCR primer adapter sequence-3';

(2) ligating the adenylated 3' linkers containing different barcode tag sequences from step (1) to fragmented RNA samples from different sources, pooling the samples, purifying, reserving an input control sample for RNA-seq, performing m⁶A antibody immunoprecipitation on the remaining pooled sample to obtain an IP sample for m⁶A-seq, and finally obtaining next-generation sequencing libraries of the input and IP samples and sequencing them;

(3) splitting the sequencing data according to the barcode sequences and analyzing them to obtain the RNA-seq and m⁶A-seq information of each original individual sample.

2. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 1, characterized in that: in step (1), the 3' linker is purified after phosphorylation and adenylation.

3. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 1, characterized in that: in step (1), the 3' linker is phosphorylated using T4 PNK, and the phosphorylated 3' linker is adenylated using a 5' adenylation reagent.

4. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 1, characterized in that: in step (1), the random sequence is a random sequence of 6 bases.

5. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 1, characterized in that step (2) comprises the following steps:

1) extracting RNA from the sample;

2) fragmenting the RNA sample into 200-400 bp fragments by chemical ion fragmentation, removing the fragmentation reagent by purification, and enzymatically removing the phosphate group at the 3' end of the fragmented RNA and adding a phosphate group at the 5' end;

3) ligating the adenylated 3' linkers containing different barcode tags to the different fragmented RNA samples, and removing excess 3' linker after the reaction;

4) pooling and purifying the different samples, reserving an input sample, performing m⁶A antibody immunoprecipitation on the remaining sample to obtain an IP sample, and reverse transcribing the input sample and the IP sample to obtain cDNA;

5) ligating a 5' adaptor to the cDNA, purifying the reaction system, and determining the number of cycles required for the library-construction PCR by RT-qPCR;

6) performing library-construction PCR with the 5' adaptor-ligated cDNA as the substrate, purifying the product by gel extraction to obtain a next-generation sequencing library, and sequencing the library.

6. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 5, characterized in that: in step 2), the RNA is fragmented with a magnesium-ion chemical fragmentation reagent; the enzyme is T4 PNK.

7. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 5, characterized in that: in step 3), the adenylated 3' linker is ligated to the fragmented RNA sample using T4 RNA ligase 2; excess 3' linker is removed using 5' deadenylase and RecJf.

8. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 5, characterized in that: in step 4), reverse transcription is performed using SuperScript III.

9. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 5, characterized in that: in step 5), the 5' adaptor is ligated to the cDNA using T4 RNA ligase 1.

10. The m⁶A high-throughput sequencing method for multiple low-sample-size samples based on RNA-ligated barcode tags according to claim 1, characterized in that step (3) comprises the following steps:

1) analyzing the alignment rate of the sequencing data and checking the data quality;

2) splitting the data according to the barcode tag sequences and assigning them to the original samples;

3) analyzing the split data to obtain the sequencing information.