Construction method and kit of RNA library
Technical Field
The application belongs to the technical field of gene sequencing, and particularly relates to a method and a kit for constructing an RNA library.
Background
Transcriptome sequencing (RNA-seq) is an important tool for studying gene expression and is widely used to analyze gene expression profiles in different species. Current RNA-seq library construction kits mainly target strand-specific libraries. The library construction process requires extracting total RNA, enriching mRNA, synthesizing first-strand cDNA by reverse transcription, synthesizing second-strand cDNA in the presence of dUTP, end-repairing the double-stranded cDNA, adding an adenine (A) base, ligating linkers, digesting the cDNA strand containing uracil (U) bases, and finally completing library construction by PCR amplification.
The specific library construction process is as follows:
(1) the RNA is fragmented, double-stranded cDNA is synthesized with dUTP added, and the strands are distinguished by the uracil (U) bases;
(2) the cDNA ends are repaired (filled in) and an adenine (A) base is added to the 3' ends, so that the double-stranded cDNA carries a 3' A overhang that pairs with the thymine (T) base at the 3' end of the linker, allowing TA ligation;
(3) two single-stranded linkers must be designed and annealed to form a duplex, and they carry a phosphorothioate bond modification that keeps the T base of the linker from being removed, so that the linker can be ligated efficiently;
(4) finally, the second cDNA strand, which contains uracil (U) bases, is digested before PCR to achieve strand-specific library construction.
Because the existing process must first fragment the RNA, then generate double-stranded cDNA, and then carry out end repair, A-tailing, linker ligation, and amplification, the workflow is long: it generally takes 7-9 hours to complete, and library construction efficiency is low.
Disclosure of Invention
To improve library construction efficiency, the application provides a method for constructing an RNA library. The method simplifies the library construction workflow: two rounds of synthesis with primers carrying linker sequences accomplish reverse transcription of the template and introduction of the linker sequences at both ends at the same time, after which the template is enriched, achieving rapid and efficient library construction.
The application is realized by the following scheme:
The application provides a method for constructing an RNA library, which comprises the following steps:
reverse transcribing total RNA with a first primer to obtain first-strand cDNA, wherein the first primer comprises a first linker sequence and a random sequence;
purifying the first-strand cDNA;
amplifying the first-strand cDNA with a second primer to generate double-stranded DNA, wherein the second primer comprises a second linker sequence and a random sequence;
and amplifying the double-stranded DNA with a third primer to obtain the RNA library.
By reverse transcribing the complete, unfragmented template, the present application obtains more cDNA molecules than there are original template molecules, because the random sequence in the first primer binds the template at multiple sites; each product has a different sequence because the primer binding sites are random.
In the present application, the first-strand cDNA is purified to remove residual primer carrying the linker sequence, which reduces subsequent non-specific amplification and dimer formation.
In one embodiment of the present application, a DNA polymerase having a strand displacement function is used for amplifying the first strand cDNA with the second primer.
In the present application, because the DNA polymerase has efficient strand displacement activity during double-stranded DNA synthesis, a single cDNA strand can give rise to multiple double-stranded DNAs of different lengths. The products carry linker sequences at both ends, so no linker ligation is required.
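As a non-limiting illustration of the preceding paragraph, the following Python sketch models how second primers annealing at different sites on one first-strand cDNA yield products of different lengths that all carry both linker sequences. It is a simplified model rather than part of the described protocol: the 60-nt cDNA body and the three priming sites are hypothetical, the random six bases are assumed to pair perfectly with the template, and the names revcomp and second_strand_products are introduced here only for illustration.

```python
# Linker sequences as given in this application.
FIRST_LINKER = "ACGCTCTTCCGATCT"
SECOND_LINKER = "CGTATGCCGTCTTCTGCTTG"
RANDOM_LENGTH = 6  # length of the random part of each primer

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def second_strand_products(first_strand, priming_sites):
    """Model the second strands primed at several sites on one first strand.

    Each site is the 0-based start, on the first strand, of the six bases
    to which the random part of the second primer anneals. Extension runs
    to the 5' end of the first strand, so every product ends with the
    reverse complement of the first linker.
    """
    products = []
    for site in priming_sites:
        copied = revcomp(first_strand[: site + RANDOM_LENGTH])
        products.append(SECOND_LINKER + copied)
    return products


# Hypothetical 60-nt cDNA body, used only for illustration.
body = "ATGGCCAAGTTGACCAGTGC" * 3
first_strand = FIRST_LINKER + body

# Hypothetical priming sites inside the cDNA body.
sites = [25, 40, 60]
products = second_strand_products(first_strand, sites)

for product in products:
    print(len(product), product[:20], product[-15:])
```

In this model every product begins with the second linker sequence and ends with the reverse complement of the first linker sequence, while its length depends only on where the second primer annealed.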
In a specific embodiment of the present application, the DNA polymerase is the phi29 enzyme.
In the present application, since the binding efficiency of the random sequence is positively correlated with template length, unfragmented RNA can generate more target fragments.
In the present application, amplification with the third primer enriches the template. Because short DNA fragments amplify far more efficiently than long ones, the amplification products are mainly DNA with small inserts, and the resulting library fragment lengths are well suited to next-generation sequencing (NGS).
In a specific embodiment of the present application, the sequence of the first primer is ACGCTCTTCCGATCT + NNNNNN, wherein ACGCTCTTCCGATCT is the first linker sequence and N is a degenerate base representing any of the four bases A, T, C, and G. In the present application, the first linker sequence is not limited and can be designed according to the sequencing platform.
In a specific embodiment of the present application, the sequence of the second primer is CGTATGCCGTCTTCTGCTTG + NNNNNN, wherein CGTATGCCGTCTTCTGCTTG is the second linker sequence and N is a degenerate base representing any of the four bases A, T, C, and G. In the present application, the second linker sequence is not limited and can be designed according to the sequencing platform.
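By way of illustration only, the following Python sketch writes out the first and second primers in degenerate notation and enumerates the concrete sequences that the six N positions represent. The linker sequences are those given above; the function names degenerate_primer and expand_primer are hypothetical helpers introduced for this example.

```python
from itertools import product

# Linker sequences as given in this application.
FIRST_LINKER = "ACGCTCTTCCGATCT"
SECOND_LINKER = "CGTATGCCGTCTTCTGCTTG"
RANDOM_LENGTH = 6  # six N positions, as in NNNNNN


def degenerate_primer(linker, n_random=RANDOM_LENGTH):
    """Return the primer in degenerate notation: linker followed by N positions."""
    return linker + "N" * n_random


def expand_primer(linker, n_random=RANDOM_LENGTH):
    """Yield every concrete primer represented by the degenerate notation."""
    for tail in product("ACGT", repeat=n_random):
        yield linker + "".join(tail)


first_primer = degenerate_primer(FIRST_LINKER)
second_primer = degenerate_primer(SECOND_LINKER)
first_pool = list(expand_primer(FIRST_LINKER))

print(first_primer, len(first_primer))
print(second_primer, len(second_primer))
print(len(first_pool))
```

Each primer is therefore a pool of 4^6 = 4096 distinct oligonucleotides sharing the same linker sequence at the 5' end.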
In one embodiment of the present application, the third primer comprises an upstream primer and a downstream primer. From the 5' end to the 3' end, the upstream primer comprises an upstream sequencing platform sequence, a first library tag sequence, and an upstream sequencing primer sequence; from the 5' end to the 3' end, the downstream primer comprises a downstream sequencing platform sequence, a second library tag sequence, and a downstream sequencing primer sequence. The downstream sequencing primer sequence is identical or complementary to at least a portion of the sequence at the 5' end of the first primer, and the upstream sequencing primer sequence is identical or complementary to at least a portion of the sequence at the 5' end of the second primer.
In one embodiment of the present application, the third primer sequence is:
an upstream primer:
GATCGGAAGAGCACACGTCTGAACTCCAGTCACXXXXXXXXATCTCGTATGCCGTCTTCTGCTTG;
a downstream primer:
AATGATACGGCGACCACCGAGATCTACACXXXXXXXXACACTCTTTCCCTACACGACGCTCTTCCGATC,
wherein XXXXXXXX is an index sequence.
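As a non-limiting illustration, the following Python sketch fills the eight X positions of the upstream and downstream primers with index sequences and checks how much of each primer's 3' end is shared with the 5' end of the corresponding linker-bearing primer. The primer and linker sequences are taken from this application, and the two index sequences are those of the primers used in Example 1; the helper names insert_index and shared_length are hypothetical.

```python
# Linker sequences and third-primer formulas as given in this application.
FIRST_LINKER = "ACGCTCTTCCGATCT"
SECOND_LINKER = "CGTATGCCGTCTTCTGCTTG"
UPSTREAM_TEMPLATE = (
    "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC" "XXXXXXXX" "ATCTCGTATGCCGTCTTCTGCTTG"
)
DOWNSTREAM_TEMPLATE = (
    "AATGATACGGCGACCACCGAGATCTACAC" "XXXXXXXX" "ACACTCTTTCCCTACACGACGCTCTTCCGATC"
)
INDEX_SLOT = "X" * 8


def insert_index(template, index):
    """Replace the eight X positions of a primer formula with an index."""
    if len(index) != len(INDEX_SLOT) or set(index) - set("ACGT"):
        raise ValueError("index must be 8 bases drawn from A, C, G, T")
    if template.count(INDEX_SLOT) != 1:
        raise ValueError("template must contain exactly one index slot")
    return template.replace(INDEX_SLOT, index)


def shared_length(primer, linker):
    """Length of the longest 3' end of the primer that is also a 5' end of the linker."""
    for n in range(min(len(primer), len(linker)), 0, -1):
        if primer.endswith(linker[:n]):
            return n
    return 0


# Index sequences of the primers used in Example 1.
upstream = insert_index(UPSTREAM_TEMPLATE, "CAACACAG")
downstream = insert_index(DOWNSTREAM_TEMPLATE, "ATGTGAAG")

print(len(upstream), len(downstream))
print(shared_length(upstream, SECOND_LINKER))
print(shared_length(downstream, FIRST_LINKER))
```

Under these assumptions the 3' end of the upstream primer is identical to the whole second linker sequence, and the 3' end of the downstream primer is identical to the first 14 bases of the first linker sequence.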
In a specific embodiment of the present application, the purification is magnetic bead purification.
In a specific embodiment of the present application, the RNA is total RNA, mRNA, lncRNA, or ribosomal RNA-depleted RNA.
In another aspect, the present application provides a kit comprising the first primer, the second primer and the third primer described above.
In one embodiment of the present application, the kit further comprises a reverse transcriptase and a DNA polymerase.
In one embodiment of the present application, the DNA polymerase is a DNA polymerase having a strand displacement function. Preferably, the DNA polymerase is a phi29 enzyme.
The construction method of the RNA library provided by the application has at least one of the following beneficial effects:
The method provided by the application uses the intact RNA strand as the template for reverse transcription, so the RNA does not need to be fragmented. The linker sequences are carried directly on the primers, so no separate ligation of linkers to both ends is needed afterwards. Moreover, the method only requires a single strand carrying linkers at both ends, so both amplification efficiency and library construction speed are greatly improved.
Drawings
FIG. 1 is a schematic diagram of the procedure for constructing an RNA library provided in the examples of the present application.
FIG. 2 is a gel electrophoresis image of the RNA library provided in the examples of the present application.
FIG. 3 is a graph comparing the results for the first-strand cDNA products and the enriched double-stranded products provided in the examples of the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
The technical solutions of the present application are described clearly and completely below in conjunction with the embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present application. Where specific conditions are not stated in the examples, conventional conditions or the conditions recommended by the manufacturer were used. Reagents and instruments for which no manufacturer is indicated are conventional, commercially available products.
The method for constructing an RNA library in the present application can be applied to total RNA, mRNA, lncRNA, or ribosomal RNA-depleted RNA. The following examples use total RNA.
EXAMPLE 1 Construction of RNA library
A schematic of the RNA library construction is shown in FIG. 1. First-strand cDNA is synthesized by reverse transcription using a random primer carrying the first linker sequence; a random primer carrying the second linker sequence is then added to synthesize the second strand; and the RNA library is obtained by PCR amplification and enrichment. The specific process is as follows:
1. Reverse transcription: the total RNA sample was reverse transcribed using a reverse transcription kit (SuperScript IV reverse transcriptase, Invitrogen, cat. no. 18090200).
1.1 The reverse transcription mix was prepared according to Table 1 below.
TABLE 1 Reverse transcription mix
The sequence of the first primer is ACGCTCTTCCGATCT + NNNNNN, wherein ACGCTCTTCCGATCT is the first linker sequence and N is a degenerate base representing any of the four bases A, T, C, and G.
1.2 The reverse transcription mix was placed on a PCR instrument and reverse transcription was performed according to the program in Table 2 below to obtain the reverse transcription product, i.e., first-strand cDNA.
TABLE 2 Reverse transcription reaction program
1.3 The reverse transcription product was purified using the VAHTS DNA Clean Beads kit (Vazyme, cat. no. N411).
30 µl of nuclease-free (NF) water was added to the reverse transcription product, followed by 50 µl (1 volume) of VAHTS DNA Clean Beads. The mixture was pipetted to mix, left to stand for 5 min, and placed on a magnetic rack until clear, and the supernatant was discarded.
The beads were washed twice with 200 µl of freshly prepared 80% ethanol and eluted with 22 µl of NF water to obtain the purified reverse transcription product. 20 µl of this product was then mixed with 20 µl of magnetic beads by pipetting, left to stand for 5 min, and placed on a magnetic rack until clear, and the supernatant was discarded. The beads were again washed twice with 200 µl of freshly prepared 80% ethanol and eluted with 12 µl of NF water, and 10 µl of the purified reverse transcription product was taken for the subsequent step.
2. Second-strand synthesis
The second-strand synthesis mix was prepared according to Table 3.
TABLE 3 Second-strand synthesis mix
The sequence of the second primer is CGTATGCCGTCTTCTGCTTG + NNNNNN, wherein CGTATGCCGTCTTCTGCTTG is the second linker sequence and N is a degenerate base representing any of the four bases A, T, C, and G.
The second-strand synthesis mix was placed on a PCR instrument, incubated at 42 ℃ for 20 minutes, and then inactivated at 65 ℃ for 10 minutes to obtain double-stranded DNA.
20 µl of magnetic beads was added to the double-stranded DNA, and the mixture was pipetted to mix, left to stand for 5 min, and placed on a magnetic rack until clear, and the supernatant was discarded. The beads were then washed twice with 200 µl of freshly prepared 80% ethanol and eluted with 12 µl of NF water to obtain the purified double-stranded product, and 10 µl of the double-stranded product was taken for the subsequent step.
3. PCR amplification
To the double-stranded product were added 1 µl of UDI primer (Swift Unit Dual Indexing UDI Kit, cat. no. X9096), 4 µl of NF water, and 25 µl of high-fidelity PCR master mix containing DNA polymerase (KAPA HiFi PCR Kit, KK2101). The mixture was pipetted to mix, and PCR amplification was performed according to the program in Table 4 to obtain the amplification product.
TABLE 4 PCR amplification program
The UDI primer sequences are:
an upstream primer:
gatcggaagagcacacgtctgaactccagtcaccaacacagatctcgtatgccgtcttctgcttg
a downstream primer:
aatgatacggcgaccaccgagatctacacatgtgaagacactctttccctacacgacgctcttccgatc
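wherein the index sequences (underlined in the original) are caacacag in the upstream primer and atgtgaag in the downstream primer, i.e., the eight bases at the XXXXXXXX positions of the third primer sequences given above.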
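For illustration only, the following Python sketch recovers the index sequence of each UDI primer by removing the constant flanking sequences that appear on either side of the XXXXXXXX positions in the third primer sequences given earlier in this application. The function name extract_index is a hypothetical helper introduced for this example.

```python
# UDI primer sequences of this example.
UPSTREAM = "gatcggaagagcacacgtctgaactccagtcaccaacacagatctcgtatgccgtcttctgcttg"
DOWNSTREAM = "aatgatacggcgaccaccgagatctacacatgtgaagacactctttccctacacgacgctcttccgatc"

# Constant parts on either side of the index positions (5' flank, 3' flank).
UPSTREAM_FLANKS = ("gatcggaagagcacacgtctgaactccagtcac", "atctcgtatgccgtcttctgcttg")
DOWNSTREAM_FLANKS = ("aatgatacggcgaccaccgagatctacac", "acactctttccctacacgacgctcttccgatc")


def extract_index(primer, flanks):
    """Return the bases lying between the constant 5' and 3' flanks."""
    left, right = flanks
    if not (primer.startswith(left) and primer.endswith(right)):
        raise ValueError("primer does not match the expected flanking sequences")
    return primer[len(left): len(primer) - len(right)]


upstream_index = extract_index(UPSTREAM, UPSTREAM_FLANKS)
downstream_index = extract_index(DOWNSTREAM, DOWNSTREAM_FLANKS)

print(upstream_index, downstream_index)
```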
40 µl of magnetic beads was added to the amplification product, and the mixture was pipetted to mix, left to stand for 5 min, and placed on a magnetic rack until clear, and the supernatant was discarded.
The beads were washed twice with 200 µl of freshly prepared 80% ethanol and then eluted with 22 µl of NF water to obtain the purified RNA library.
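The bead volumes used in the purification steps of this example can be summarized with simple arithmetic. The following Python sketch is an illustration only. It assumes that the reverse transcription product has a volume of 20 ul (inferred from the statement that 50 ul of beads is 1 volume after 30 ul of NF water is added) and that the PCR reaction volume is the sum of the listed components. The second-strand purification is left out because the volume of that reaction is not stated in the text.

```python
# (step, sample volume in ul, bead volume in ul)
CLEANUPS = [
    # Assumed: 20 ul reverse transcription product + 30 ul NF water.
    ("reverse transcription product, round 1", 20 + 30, 50),
    # 20 ul of the eluted product is carried into the second round.
    ("reverse transcription product, round 2", 20, 20),
    # 10 ul double-stranded product + 1 ul primer + 4 ul water + 25 ul master mix.
    ("PCR amplification product", 10 + 1 + 4 + 25, 40),
]


def bead_ratio(sample_ul, bead_ul):
    """Return the bead-to-sample volume ratio."""
    return bead_ul / sample_ul


ratios = {step: bead_ratio(sample, beads) for step, sample, beads in CLEANUPS}

for step, ratio in ratios.items():
    print(f"{step}: {ratio:.1f}x")
```

Under these assumptions each listed purification uses beads at one sample volume.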
The purified product was analyzed by electrophoresis, and the results are shown in FIG. 2. As FIG. 2 shows, magnetic bead purification effectively removes small fragments, the library is mainly concentrated in the 200-400 bp range, and large fragments contribute less to the library because they amplify less efficiently.
Therefore, the RNA library construction method of this example requires only 3 amplification reactions in the whole process and is simple to operate. The whole experiment takes 4-4.5 hours, roughly halving the time required for library construction, which favors later popularization and application of the method.
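The time saving can be checked with simple arithmetic. The following Python sketch, given purely as an illustration, compares the 7-9 hour range stated in the Background for the existing process with the 4-4.5 hour range of this example; the variable names are illustrative.

```python
# Time ranges in hours, as stated in this application.
EXISTING_HOURS = (7.0, 9.0)
NEW_HOURS = (4.0, 4.5)


def reduction(new_hours, old_hours):
    """Return the fractional reduction in time relative to the old process."""
    return 1 - new_hours / old_hours


# Shortest new time against longest existing time, and the reverse.
best_case = reduction(NEW_HOURS[0], EXISTING_HOURS[1])
worst_case = reduction(NEW_HOURS[1], EXISTING_HOURS[0])
# Midpoint of each range.
midpoint = reduction(sum(NEW_HOURS) / 2, sum(EXISTING_HOURS) / 2)

print(f"reduction: {worst_case:.0%} to {best_case:.0%}, midpoint {midpoint:.0%}")
```

The midpoints of the two ranges give a reduction of about 47%, consistent with the statement that the time is shortened by roughly half.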
EXAMPLE 2 Application
For two human RNA samples (labeled 1 and 2), the first-strand cDNA products and the enriched double-stranded products were each amplified using 18S detection, and the results are shown in Table 5 below and FIG. 3.
TABLE 5 comparison of amplification results
As can be seen from Table 5 and FIG. 3, for the same template the amplification increased several-fold after second-strand synthesis.
In conclusion, amplification with random primers carrying linkers replaces the traditional scheme of first synthesizing double-stranded cDNA and then ligating linkers to both ends, and thereby avoids the drawback that the yield of double-stranded cDNA product drops by orders of magnitude at each step, which impairs library construction efficiency. Furthermore, performing second-strand synthesis with random primers carrying linkers, in combination with the strand displacement activity of the phi29 enzyme, allows multiple double-stranded products to be generated from one cDNA template at a time.
This embodiment merely explains the present application and does not limit it. After reading this specification, those skilled in the art may modify the embodiment as needed without inventive contribution, and all such modifications are protected by patent law within the scope of the claims of the present application.
Sequence listing
<110> Zhejiang medical science and technology Ltd
<120> construction method and kit of RNA library
<160> 4
<170> SIPOSequenceListing 1.0
<210> 1
<211> 15
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<220>
<223> first linker sequence
<400> 1
acgctcttcc gatct 15
<210> 2
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<220>
<223> second linker sequence
<400> 2
cgtatgccgt cttctgcttg 20
<210> 3
<211> 65
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<220>
<223> upstream primer
<400> 3
gatcggaaga gcacacgtct gaactccagt caccaacaca gatctcgtat gccgtcttct 60
gcttg 65
<210> 4
<211> 69
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<220>
<223> downstream primer
<400> 4
aatgatacgg cgaccaccga gatctacaca tgtgaagaca ctctttccct acacgacgct 60
cttccgatc 69
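As an illustrative consistency check that is not part of the sequence listing itself, the following Python sketch parses the <210>, <211>, and <400> fields of the four entries above and compares each declared length with the number of residues listed. The parser is a minimal sketch written for this listing only, and the function name parse_listing is hypothetical.

```python
import re

# The length and sequence fields of the four entries in the listing above.
LISTING = """\
<210> 1
<211> 15
<400> 1
acgctcttcc gatct 15
<210> 2
<211> 20
<400> 2
cgtatgccgt cttctgcttg 20
<210> 3
<211> 65
<400> 3
gatcggaaga gcacacgtct gaactccagt caccaacaca gatctcgtat gccgtcttct 60
gcttg 65
<210> 4
<211> 69
<400> 4
aatgatacgg cgaccaccga gatctacaca tgtgaagaca ctctttccct acacgacgct 60
cttccgatc 69
"""


def parse_listing(text):
    """Return {sequence id: (declared length, sequence)}."""
    entries = {}
    seq_id = None
    declared = None
    reading = False
    for raw in text.splitlines():
        line = raw.strip()
        tag = re.match(r"<(\d+)>\s*(.*)", line)
        if tag:
            code, value = tag.group(1), tag.group(2)
            reading = False
            if code == "210":
                seq_id = int(value)
            elif code == "211":
                declared = int(value)
            elif code == "400":
                entries[seq_id] = (declared, "")
                reading = True
        elif reading and line:
            # Keep residues only; drop spaces and the running position count.
            residues = re.sub(r"[^a-z]", "", line)
            length, seq = entries[seq_id]
            entries[seq_id] = (length, seq + residues)
    return entries


entries = parse_listing(LISTING)
for seq_id, (declared, seq) in sorted(entries.items()):
    print(seq_id, declared, len(seq), declared == len(seq))
```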