CN112824534A - Method for amplifying target region of nucleic acid, library construction and sequencing method and kit - Google Patents


Info

Publication number
CN112824534A
Authority
CN
China
Prior art keywords
sequence
nucleic acid
sequencing
target region
umi
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911141638.6A
Other languages
Chinese (zh)
Inventor
黄标
周荣芳
白寅琪
黄金
陈智超
唐冲
田志坚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Bgi Medical Laboratory Co ltd
Original Assignee
Wuhan Bgi Medical Laboratory Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Bgi Medical Laboratory Co ltd
Priority to CN201911141638.6A
Publication of CN112824534A
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Abstract

The invention relates to the field of gene sequencing, in particular to a method for amplifying a target region of a nucleic acid, a library construction method and a sequencing method. The provided method for amplifying a target region of a nucleic acid includes: (1) inserting a UMI sequence into the nucleic acid using a transposase so as to obtain a fusion product containing the UMI sequence and the target region sequence; (2) amplifying the fusion product using a primer so as to obtain an amplification product containing the target region sequence and the UMI sequence. A library construction method for a target region of a nucleic acid and a sequencing method are also provided. The methods provided by the invention can shorten library construction and sequencing time, reduce production cost, and effectively improve the data utilization rate.

Description

Method for amplifying target region of nucleic acid, library construction and sequencing method and kit
Technical Field
The invention relates to the field of gene sequencing, in particular to a method for amplifying a target region of a nucleic acid, a library construction method, a sequencing method and a kit.
Background
PacBio third-generation sequencing is based on the principle of sequencing by synthesis and uses an SMRT (single-molecule real-time) chip as the carrier for the sequencing reaction. During sequencing, genomic DNA is broken into many small fragments, which are made into droplets and dispersed into different ZMW (zero-mode waveguide) nanowells. When the polymerization reaction occurs at the bottom of a ZMW, nucleotides labeled with different fluorophores are held by the polymerase in the fluorescence detection region of the well, and the base composition of the template DNA can be determined from the type and duration of the fluorescence.
On the PacBio platform, one SMRT chip has 1 million ZMW sequencing wells, each of which can generate one sequence read (roughly 20-30 kb in length); on average, each chip can generate 5-15 Gb of data.
Metagenomics, also called environmental microbial genomics, directly extracts the DNA of all microorganisms from an environmental sample to construct a metagenomic library, and uses genomics research strategies to study the genetic composition and community functions of all microorganisms in the sample.
The traditional method is to sequence the 16S rDNA gene of the microbial genome. This gene is usually about 1500 bases long, is widely distributed in prokaryotes, provides sufficient information, and evolves relatively slowly. Its sequence contains both conserved and highly variable regions, so microbial species can be distinguished through the conserved regions and the species-specific regions. Based on these characteristics, scientists can conveniently study the species diversity of an environment by selecting this gene region, but cannot comprehensively analyze gene function in the environment. With the wide application of new-generation high-throughput, low-cost sequencing technology, scientists can now sequence all genomes in an environment and, after acquiring massive data, comprehensively analyze the microbial community structure, gene function composition and so on.
Although the existing 16S amplicon library construction and sequencing technology offers high throughput and low cost and can objectively reflect community structure and relative abundance, the relative abundance of a given bacterium is obtained as the ratio of the number of sequences in a given OTU (operational taxonomic unit) to the total number of sequences. This relative abundance cannot reflect the true absolute abundance of the species in the sample. It is therefore wrong to conclude from such results that bacterium X has increased in group A relative to group B; one can only conclude that the relative abundance of bacterium X is higher in group A than in group B.
Further improvements are needed to achieve quantification of target region sequences, such as 16S rDNA.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art. To this end, it is an object of the present invention to provide a method for amplifying a target region of a nucleic acid, a library construction method, a sequencing method, and a kit.
When sequencing information is obtained by amplifying a target region of genomic nucleic acid, constructing a library and sequencing, the target region is generally first labeled with a fusion primer by PCR; the sequence containing the target region and the fusion primer is then amplified with specific primers, a sequencing library is constructed, and the library is sequenced on the instrument to obtain the target region information. Taking the 16S rDNA gene of a microbial genome as an example, a fusion primer (barcode + UMI + primer) is generally used for 2 cycles of PCR so as to label the full-length 16S rDNA with a UMI. The fusion primer is then removed by magnetic bead purification, and a second round of PCR with specific primers enriches products containing the 16S rDNA, the UMI sequence, the barcode sequence and so on. The products can then be used for library construction on the PacBio platform: for example, they undergo damage repair, end repair, adapter ligation and finally fragment sorting to obtain a sequencing library of 1-2 kb, which is then sequenced on the instrument.
However, the above approach has several disadvantages: (1) When the UMI is attached to the 16S rDNA in the first PCR, the UMI sequence is usually short, for example a 4 bp random sequence, so amplification specificity is poor; when the base composition of the UMI is highly homologous to other (non-16S) regions of the genome, the UMI can bind randomly to regions outside 16S, causing non-specific amplification. (2) A single-end UMI is generally only 4 bases long, giving only 4^4 (that is, 256) UMI types, whereas the number of microbial species in an environment often exceeds 1000, so single-end labeling alone cannot meet the requirement. In addition, to improve amplification specificity, UMI sequences are usually attached to both ends of the target region, for example a 4 bp UMI at each end, and only reads with a UMI at both ends are selected for downstream analysis. In practice, the probability that both ends carry a UMI is low, so the data utilization rate is low. (3) After the first PCR, magnetic bead purification is required before the second PCR to prevent fusion primers with different UMI sequences from labeling the same 16S rDNA template in subsequent cycles and thereby distorting the UMI counts; this step costs both time and sample.
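The UMI diversity argument in point (2) can be checked with a short calculation. The sketch below is illustrative only: the function name `umi_diversity` and the `ends` parameter are assumptions introduced here, and it simply counts the possible random sequences over the four bases.

```python
def umi_diversity(length: int, ends: int = 1) -> int:
    """Number of distinct random UMI combinations.

    length: UMI length in bases; ends: how many ends carry an
    independent UMI of that length (hypothetical parameter).
    """
    return 4 ** (length * ends)


# A 4 bp single-end UMI gives 4^4 = 256 types, fewer than the
# more than 1000 species that an environmental sample may contain.
for n in (4, 10):
    print(f"{n} bp UMI: {umi_diversity(n)} possible sequences")
```

Under these assumptions, lengthening the UMI raises the number of available tags far more than labeling both ends with 4 bp UMIs does.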
In the course of research, the inventors found that a UMI sequence can be attached to the target region using a transposase, such as Tn5 transposase, instead of by PCR. On the one hand, this reduces non-specific amplification; on the other hand, the primers of the first amplification no longer need to be removed before the second amplification, which simplifies the process, so that an amplification product of the genomic target region can be obtained quickly and accurately. For example, Tn5 transposase can be used to insert a barcode molecular tag sequence into the original genomic DNA molecules for molecular labeling; PCR is then used to enrich and amplify the 16S region together with the barcode, followed by PacBio library construction and sequencing. The original genomic DNA templates are thereby labeled and screened, and after sequencing the sequences are quantified and subjected to other advanced analyses according to the unique barcode tags, which can markedly improve gene quantification and assembly for the various products of third-generation sequencing platforms. Tn5 transposase can also be used to insert the barcode molecular tag sequence directly into genomic DNA molecules, and the resulting product can be used directly for library construction and sequencing without PCR amplification; for example, PacBio library construction and sequencing can be used to quantify and analyze the genomic DNA.
Specifically, the invention provides the following technical scheme:
In a first aspect of the invention, there is provided a method of amplifying a target region of a nucleic acid, comprising: inserting a UMI sequence into the nucleic acid using a transposase so as to obtain a fusion product containing the UMI sequence and the target region sequence; and amplifying the fusion product using an amplification primer to obtain an amplification product containing the UMI sequence and the target region sequence. This method allows rapid amplification of a target region of a nucleic acid. When the amplification product is used for library construction and sequencing, sequencing quality is not affected and the nucleic acid of the target region can be quantified.
According to an embodiment of the present invention, the method for amplifying a target region of a nucleic acid described above may further include the following technical features:
In some embodiments of the invention, the transposase is Tn5 transposase.
In some embodiments of the present invention, the UMI sequence is a random sequence with a length of 5-25 bp, such as a random sequence with a length of 5-20 bp, preferably a random sequence with a length of 8-15 bp, and more preferably a random sequence with a length of 10 bp.
In some embodiments of the invention, the target region is a 16S rDNA, 18S rDNA, or ITS region.
In some embodiments of the present invention, the target region is 1-10 kb in length. The method can amplify longer target region sequences, such as target regions of 1-10 kb, 1-8 kb, 1-6 kb, 1-4 kb, or 1-2 kb in length.
In some embodiments of the invention, the nucleic acid is genomic DNA of a microorganism. Because the provided amplification method introduces the UMI sequence by means of a transposase, its specificity is high and it is suitable for amplifying a target region on microbial genomic DNA. For example, the method is suitable for amplifying the 16S rDNA gene on a microbial genome; library construction and sequencing of the amplification products then allow the diversity of microbial species and the functions of the corresponding genes to be analyzed.
In some embodiments of the invention, the UMI sequence is located between the target region sequences. When the UMI sequence is inserted into the nucleic acid by using a transposase, the position of insertion of the UMI sequence is not particularly limited, and the UMI sequence may be inserted between the sequences of the target region, or may be inserted upstream or downstream of the target region. When the UMI sequence is inserted between the target regions, the target region and the UMI sequence can be only amplified in a targeted manner during subsequent amplification, and some irrelevant regions do not need to be amplified, so that the waste of sequencing data is avoided, and the cost is reduced.
In some embodiments of the invention, the target region contains 1 UMI sequence every 1-2kb in length. Therefore, the waste of sequencing data caused by introducing excessive UMI sequences can be avoided, and the cost is reduced.
In some embodiments of the present invention, step (1) further comprises: (1-1) obtaining a nucleic acid fragment containing a recognition sequence and said UMI sequence; (1-2) mixing the nucleic acid fragment, Tn5 transposase and the nucleic acid, and reacting at 32-40 ℃ for 25-50 minutes, preferably at 37 ℃ for 30 minutes. When Tn5 transposase is used to insert the UMI sequence into the nucleic acid, if the reaction conditions and the reaction system are not controlled, a sequence carried by the Tn5 enzyme may be inserted every 100-200 bp. Applied directly to 16S rDNA amplification, this would insert about 10 UMI tag sequences into a 16S rDNA of 1.5 kb, which is a great waste of data. In actual analysis, a single UMI is sufficient for quantification, so the reaction is controlled at 32-40 ℃ for 25-50 minutes, preferably at 37 ℃ for 30 minutes, so that a target region of 1-2 kb contains 1 UMI sequence and sequencing data are not wasted.
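The "about 10 UMI tags per 1.5 kb" figure follows from dividing the target length by the insertion spacing. A minimal sketch of that arithmetic, assuming evenly spaced insertions (a simplification; the function name is hypothetical):

```python
def expected_insertions(target_bp: int, spacing_bp: float) -> float:
    """Approximate number of transposase insertions in a target,
    assuming one insertion per `spacing_bp` bases on average."""
    return target_bp / spacing_bp


# Uncontrolled reaction: one insertion every 100-200 bp (midpoint 150 bp).
print(expected_insertions(1500, 150))
# Controlled reaction: one insertion per 1-2 kb (1.5 kb used here).
print(expected_insertions(1500, 1500))
```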
In a second aspect of the invention, there is provided a library construction method for a target region of a nucleic acid, comprising: amplifying the target region of said nucleic acid using a method according to any embodiment of the first aspect of the invention to obtain an amplification product; and constructing a library based on the amplification product so as to obtain a sequencing library. This method can shorten library construction and sequencing time, reduce production cost, and effectively improve the data utilization rate.
According to an embodiment of the present invention, the library construction method described above may further comprise the following technical features:
In some embodiments of the invention, the library construction is performed using the PacBio platform.
In some embodiments of the invention, the library construction comprises: subjecting the amplification product to damage repair and end repair to obtain a repair product; ligating the repair product to an adapter to obtain a ligation product; and digesting the ligation product with enzymes and sorting the resulting digested fragments by size to obtain fragments containing the target region and the UMI sequence.
In some embodiments of the invention, magnetic bead purification is performed before the repair product is ligated to the adapter and before the digested fragments are sorted.
In a third aspect of the invention, the invention provides a method of constructing a sequencing library, comprising: inserting a UMI sequence into genomic DNA using a transposase so as to obtain a fusion product; and constructing a library based on the fusion product so as to obtain a sequencing library. Inserting the UMI into genomic DNA with a transposase, constructing a library, and sequencing the resulting library allows quantitative analysis and detection of the genomic DNA.
In some embodiments of the invention, the library construction is performed using the PacBio platform.
In some embodiments of the invention, the library construction comprises: subjecting the fusion product to damage repair and end repair to obtain a repair product; ligating the repair product to an adapter to obtain a ligation product; and digesting the ligation product with enzymes and sorting the resulting digested fragments by size so as to obtain the sequencing library.
In a fourth aspect of the invention, the invention provides a sequencing method comprising: obtaining a sequencing library using a method according to the second aspect of the invention or according to any embodiment of the third aspect of the invention; and performing on-machine sequencing of the sequencing library so as to obtain a sequencing result.
In some embodiments of the invention, the on-machine sequencing is performed using the PacBio platform.
In a fifth aspect of the invention, the invention provides a kit comprising: Tn5 transposase, and the sequences shown in SEQ ID NO: 1 and SEQ ID NO: 2.
According to an embodiment of the present invention, the kit is suitable for library construction and sequencing on the PacBio platform.
In a sixth aspect of the present invention, there is provided a method for modifying a nucleic acid, characterized by inserting a UMI sequence into the nucleic acid using a transposase to obtain a fusion product. Because the transposase labels the nucleic acid with the UMI sequence, each library fragment can be traced back through its UMI, the original state of the nucleic acid sample can be accurately restored, and absolute, digital quantification is achieved.
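The tracing idea can be illustrated by counting distinct UMIs per taxon, so that PCR duplicates of the same original molecule are counted once. This is a minimal sketch and not the analysis pipeline of the invention: the read records, taxon labels and function name are hypothetical, and real data would also need sequencing-error correction of the UMIs.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count original template molecules per taxon.

    reads: iterable of (taxon, umi) pairs (hypothetical input format).
    Reads sharing a taxon and a UMI are treated as copies of one molecule.
    """
    umis_by_taxon = defaultdict(set)
    for taxon, umi in reads:
        umis_by_taxon[taxon].add(umi)
    return {taxon: len(umis) for taxon, umis in umis_by_taxon.items()}


# Made-up example: five reads derived from three original molecules.
example_reads = [
    ("E. coli", "ACGTACGTAC"),
    ("E. coli", "ACGTACGTAC"),  # PCR duplicate of the read above
    ("E. coli", "TTGCAAGGCT"),
    ("B. cereus", "GGATCCTTAA"),
    ("B. cereus", "GGATCCTTAA"),  # PCR duplicate
]
counts = count_molecules(example_reads)
print(counts)
```

Counting reads instead of UMIs would give 3 and 2 for the two taxa, which reflects amplification depth rather than the number of starting molecules.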
According to an embodiment of the present invention, the method for modifying a nucleic acid described above may further include the following technical features:
In some embodiments of the invention, the transposase is Tn5 transposase.
In some embodiments of the present invention, the UMI sequence is a random sequence with a length of 5-25 bp, preferably a random sequence with a length of 8-15 bp, and more preferably a random sequence with a length of 10 bp.
In some embodiments of the invention, every 1-2 kb of nucleic acid sequence contains 1 UMI sequence. Having 1 UMI sequence per 1-2 kb of nucleic acid avoids introducing excessive UMI sequences, which would waste sequencing data when the nucleic acid is subsequently sequenced.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic diagram of a method for amplifying a target region (16S rDNA) of a nucleic acid using Tn5 transposase according to an embodiment of the present invention;
FIG. 2 is a graph showing the results of strain classification and quantification provided in example 1 of the present invention;
FIG. 3 is a graph showing the results of classification and quantification of bacterial species provided in comparative example 1 according to the present invention.
Detailed Description
The embodiments of the present invention will be described in detail with reference to the following examples, which are intended to be illustrative and not to be construed as limiting the invention.
As used herein, unless otherwise specified, "N" represents any base A, T, C or G when referring to a nucleic acid sequence.
In one aspect of the invention, the invention provides a library construction method for a target region of a nucleic acid, comprising: (1) inserting a UMI sequence into the nucleic acid using a transposase so as to obtain a fusion product containing the UMI sequence and the target region sequence; (2) amplifying the fusion product using amplification primers to obtain an amplification product comprising the UMI sequence and the target region sequence; (3) constructing a library based on the amplification product so as to obtain a sequencing library.
Generally, a UMI sequence is introduced into a nucleic acid molecule, for example a DNA molecule, by attaching it to both ends of the molecule by PCR or with a ligase. In the present invention, the UMI sequence is inserted into the nucleic acid using a transposase, preferably Tn5 transposase, which avoids the non-specific amplification caused by labeling the UMI sequence by PCR. When the UMI sequence is introduced by the transposase, the insertion position is random: it may be at either end of the target region or within the target region. The number of inserted UMI sequences should be appropriate to the length of the target region. Generally, one UMI sequence is inserted per 1-2 kb of target region. If too few UMI sequences are inserted, some target regions may not be labeled; if too many are inserted, sequencing data are wasted. Taking Tn5 transposase as an example, the recognition sequence plus UMI sequence is about 50 bp long, so if 5 such fragments are inserted into a target region of 1-2 kb, about 250 bp of data are wasted. When Tn5 transposase inserts a sequence into a nucleic acid, the nucleic acid is nicked at the insertion site, but the gap is only a few bases long and can be repaired and filled in during subsequent library construction.
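The data-waste estimate above can be written out as a small calculation. The sketch assumes the approximate 50 bp tag length given in the text and a 1.5 kb target; the function name and the fraction it reports are illustrative additions.

```python
def tag_overhead(n_tags: int, tag_bp: int = 50, target_bp: int = 1500):
    """Return (bases spent on inserted tags, fraction of the read they occupy).

    tag_bp: recognition sequence plus UMI, about 50 bp per the text.
    target_bp: assumed target length (1.5 kb chosen for illustration).
    """
    wasted = n_tags * tag_bp
    return wasted, wasted / (target_bp + wasted)


for n in (1, 5):
    wasted, fraction = tag_overhead(n)
    print(f"{n} tag(s): {wasted} bp of tag sequence, {fraction:.1%} of the read")
```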
Inserting a UMI sequence into a target region with the length of every 1-2kb, for example, every 1-1.5 kb, can be realized by the temperature and time of a reaction system. In at least some embodiments of the invention, step (1) further comprises: (1-1) obtaining a nucleic acid fragment containing a recognition sequence and said UMI sequence; (1-2) mixing the nucleic acid fragment, Tn5 transposase and the nucleic acid for reaction, wherein the reaction condition is that the reaction is carried out for 25-50 minutes at 32-40 ℃, and preferably for 30 minutes at 37 ℃; the reaction system contains magnesium ions and a buffer solution. Therefore, one UMI sequence can be inserted into each target region with the length of 1-2 kb.
In at least some embodiments of the present invention, the PacBio platform can be used for library construction. For example, a library can be constructed using a commercially available kit suitable for the PacBio platform.
The scheme of the invention will be explained with reference to the examples. It will be appreciated by those skilled in the art that the following examples are illustrative of the invention only and should not be taken as limiting its scope. Where specific techniques or conditions are not indicated in the examples, they follow the techniques or conditions described in the literature in the art or the product specifications. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.
Example 1 method for library construction and sequencing Using Tn5 transposase
Example 1 provides a method for library construction and sequencing of the 16S rDNA of microbial strains using Tn5 transposase, comprising the following steps:
(1) Sample preparation: DNA was extracted from each of the following four strains (hereinafter abbreviated as A, B, C and D) and then mixed so that, based on the 16S rDNA copy number of each strain, the solution contained 50000 copies of each strain's DNA per microliter.
Strain abbreviation | Full strain name | GC content (%) | Copies per µL
A | Rhodobacter sphaeroides | 69.01 | 50000
B | Escherichia coli | 50.79 | 50000
C | Acinetobacter baumannii | 38.94 | 50000
D | Bacillus cereus | 35.58 | 50000
(2) Tn5 labeling:
A. Nucleic acid fragments containing a recognition sequence and a UMI sequence (the sequences are shown in SEQ ID NO: 1 and SEQ ID NO: 2 below) were synthesized, diluted to 100 µM, mixed in proportion (Tn5:Me = 1:2) and annealed under the following conditions; the product was named Adapter MIX:
Heated lid: on, 105 ℃
Temperature | Time
75 ℃ | 10 min
60 ℃ | 10 min
50 ℃ | 10 min
40 ℃ | 10 min
25 ℃ | 10 min
The nucleic acid fragments used were:
[Figure BDA0002281104980000071: sequences of SEQ ID NO: 1 and SEQ ID NO: 2]
In SEQ ID NO: 1, the underlined part is the Me sequence, the NNNNNNNN bases that follow are the UMI sequence, and the UMI sequence is followed by another Me sequence. Tn5 transposase functions as a dimer: the Me sequences on either side of the UMI sequence each bind one subunit of Tn5 transposase, so that the Tn5 dimer can carry out transposition.
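Given the Me-UMI-Me layout described above, a UMI can in principle be recovered from a read by locating the two flanking sequences. The sketch below is illustrative only: the actual Me sequences are those of SEQ ID NO: 1 and are not reproduced here, so the flanks are passed in as parameters and the example uses made-up placeholder flanks; exact matching without error tolerance is a simplification.

```python
import re


def extract_umi(read, left_flank, right_flank, umi_len=8):
    """Return the UMI found between two flanking sequences, or None.

    left_flank / right_flank: the sequences on either side of the UMI
    (the Me sequences in the text; supplied by the caller).
    umi_len: 8 matches the NNNNNNNN stretch shown for SEQ ID NO: 1.
    """
    pattern = (
        f"{re.escape(left_flank)}([ACGT]{{{umi_len}}}){re.escape(right_flank)}"
    )
    match = re.search(pattern, read)
    return match.group(1) if match else None


# Placeholder flanks for illustration; NOT the real Me sequences.
LEFT, RIGHT = "AAGGCCTT", "TTGGAACC"
read = "GATTACA" + LEFT + "ACGTTGCA" + RIGHT + "CCGGTTAA"
print(extract_umi(read, LEFT, RIGHT))
```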
B. Assembly of the Tn5 enzyme with the annealed sequence:
A reaction system was prepared and reacted at 30 ℃ for 1 hour so that the Tn5/Me sequence annealed in step A was loaded onto the Tn5 enzyme to form the Tn5-UMI complex.
[Figures BDA0002281104980000072 and BDA0002281104980000081: reaction system for Tn5 assembly]
C. A Tn5 transposase reaction system was prepared and reacted at 37 ℃ for 30 min:
Reagent | Volume (µL)
5X tagment buffer L | 4
DNA | 1
Tn5-UMI complex prepared in step B | 1
Mg ion (2.0 mmol/L) | 1
Water | 13
Total volume | 20
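The volumes in the reaction system above can be checked, and the final concentrations worked out, with the dilution relation C1·V1 = C2·V2. This sketch only restates the table; it accounts solely for the separately added Mg stock and assumes nothing about what the tagment buffer itself contains.

```python
# Volumes in µL, copied from the Tn5 reaction system above.
reaction = {
    "5X tagment buffer L": 4,
    "DNA": 1,
    "Tn5-UMI complex": 1,
    "Mg ion (2.0 mmol/L)": 1,
    "Water": 13,
}


def final_concentration(stock, volume, total_volume):
    """C2 = C1 * V1 / V2 (units follow the stock concentration)."""
    return stock * volume / total_volume


total = sum(reaction.values())
buffer_final_x = final_concentration(5, reaction["5X tagment buffer L"], total)
mg_final_mmol = final_concentration(2.0, reaction["Mg ion (2.0 mmol/L)"], total)

print(f"total volume: {total} µL")
print(f"buffer: {buffer_final_x}X")
print(f"Mg from added stock: {mg_final_mmol:.2f} mmol/L")
```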
D. Terminating the reaction: a stop buffer was prepared according to the following system, and 4 µL of the stop buffer was added to the reaction from the previous step; the mixture was mixed, centrifuged and placed on ice for 10 min.
Reagent | Volume (µL)
Tris-HCl (pH 8), 10 mM | 1
EDTA (pH 8), 20 mM | 1
Water | 48
E. Nick repair: 10 µL per sample of High-Fidelity 2X PCR Master Mix (NEB M0541L) was added, followed by incubation at 72 ℃ for 10 min.
(3) 16S full-length amplification: the Tn5-treated DNA was amplified using 16S primers synthesized by Shanghai Biotechnology Ltd (the primers used are shown in the following table). Four replicate reactions were set up with the same primers and named PCR-1, PCR-2, PCR-3 and PCR-4 in order.
Primer name | Primer sequence (5' to 3') | SEQ ID NO
16S-F1 | TCAGACGATGCGTCAT AGAGTTTGATCCTGGCTCAG | 3
16S-F2 | CTATACATGACTCTGC AGAGTTTGATCCTGGCTCAG | 4
16S-F3 | TACTAGAGTAGCACTC AGAGTTTGATCCTGGCTCAG | 5
16S-F4 | TGTGTATCAGTACATG AGAGTTTGATCCTGGCTCAG | 6
16S-R1 | ATGACGCATCGTCTGA GGYTACCTTGTTACGACTT | 7
16S-R2 | GCAGAGTCATGTATAG GGYTACCTTGTTACGACTT | 8
16S-R3 | GAGTGCTACTCTAGTA GGYTACCTTGTTACGACTT | 9
16S-R4 | CATGTACTGATACACA GGYTACCTTGTTACGACTT | 10
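In the primer table, each sequence consists of a 16-base prefix followed by a shared 16S priming sequence. The sketch below checks one property of those prefixes: each reverse-primer prefix is the reverse complement of the forward-primer prefix with the same number. Treating the prefix as a sample barcode is an assumption made here, since the table itself does not label it.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# 16-base prefixes copied from the primer table (SEQ ID NO: 3-10).
forward_prefixes = {
    1: "TCAGACGATGCGTCAT",
    2: "CTATACATGACTCTGC",
    3: "TACTAGAGTAGCACTC",
    4: "TGTGTATCAGTACATG",
}
reverse_prefixes = {
    1: "ATGACGCATCGTCTGA",
    2: "GCAGAGTCATGTATAG",
    3: "GAGTGCTACTCTAGTA",
    4: "CATGTACTGATACACA",
}

pairs_match = all(
    reverse_complement(forward_prefixes[i]) == reverse_prefixes[i]
    for i in forward_prefixes
)
print("every R prefix is the reverse complement of its F prefix:", pairs_match)
```

With this design, a full-length amplicon read in either orientation starts with the same prefix, which would simplify assigning reads to replicates.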
(4) The 4 PCR amplification products (PCR-1, PCR-2, PCR-3 and PCR-4) were mixed in equal amounts.
(5) PacBio library construction was performed using the PacBio library construction kit SMRTbell Template Prep Kit 1.0 (cat # 100-), as follows:
A. Damage repair reaction: the following reaction system was prepared and reacted at 37 ℃ for 60 min:
Reagent | Volume (µL)
DNA Damage Repair Buffer | 5
NAD+ | 0.5
ATP high | 5
dNTP | 0.5
DNA Damage Repair Mix | 1
DNA | 37
B. End repair reaction: the following reaction system was prepared and reacted at 25 ℃ for 10 min:
Reagent | Volume (µL)
DNA | 50
DNA Damage Repair Mix | 2
C. Magnetic bead purification: 50 µL of magnetic beads were added and allowed to bind the sample for 10 min; the beads were washed twice with 75% ethanol, and the DNA was then eluted from the beads with 24 µL of water.
The magnetic beads used were Vazyme VAHTS™ DNA Clean Beads (cat. no. N411).
D. Adapter ligation: the following reaction system was prepared and reacted overnight (12-16 hours) at 25 ℃:
Reagent | Volume (µL)
DNA | 20
Annealed Blunt Adapter (20 µM) | 10
Template Prep Buffer | 4
ATP low | 2
Ligase | 1
Water | 3
E. Double enzyme digestion: the following reaction system was prepared and reacted at 37 ℃ for 60 min:
Reagent | Volume (µL)
DNA | 40
ExoIII | 1
ExoVII | 1
F. Magnetic bead purification: 50 µL of magnetic beads were added and allowed to bind the sample for 10 min; the beads were washed twice with 75% ethanol, and the DNA was then eluted from the beads with 30 µL of water.
G. Fragment sorting: fragments of 1-2 kb were selected.
(6) Preparation for sequencing: binding of the sequencing primer and the sequencing polymerase.
The sorted fragments were prepared and sequenced as described in the PacBio library construction kit SMRTbell Template Prep Kit 1.0 (cat # 100-.
A. Primer annealing: the following reaction system was prepared and incubated at 20 ℃ for 60 min:
Reagent Volume (μL)
Water 6.1
10× Primer Buffer 1
Sample 1.9
Diluted Sequencing Primer 1
B. Polymerase binding: the following reaction system was prepared and incubated at 30 ℃ for 60 min:
Reagent Volume (μL)
dNTP 1.6
DTT 1.6
Binding Buffer V2 1.6
Sample 9.2
Diluted Polymerase 1
C. Magnetic bead purification: the product was purified using a 0.6× volume of magnetic beads.
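As a quick consistency check on the reaction systems of steps (5) and (6), the component volumes can be summed per reaction. The following is only an illustrative sketch: the dictionary keys and the helper functions `total_volume` and `bead_volume` are names introduced here, and the bead calculation assumes that the 0.6 ratio means bead volume relative to sample volume.

```python
from decimal import Decimal

# Volumes (uL) copied from the reaction tables of steps (5) and (6).
# The keys are illustrative labels, not names used by the kit.
REACTIONS = {
    "step5A_damage_repair": ["5", "0.5", "5", "0.5", "1", "37"],
    "step5D_ligation": ["20", "10", "4", "2", "1", "3"],
    "step6A_primer": ["6.1", "1", "1.9", "1"],
    "step6B_polymerase": ["1.6", "1.6", "1.6", "9.2", "1"],
}


def total_volume(volumes):
    """Sum reagent volumes exactly; Decimal avoids binary float rounding."""
    return sum((Decimal(v) for v in volumes), Decimal("0"))


def bead_volume(sample_volume, ratio="0.6"):
    """Bead volume, assuming the ratio means bead volume : sample volume."""
    return Decimal(sample_volume) * Decimal(ratio)


totals = {name: total_volume(v) for name, v in REACTIONS.items()}
for name, volume in totals.items():
    print(f"{name}: {volume} uL")

beads = bead_volume(totals["step6B_polymerase"])
print(f"beads for the step (6) B product: {beads} uL")
```

With the listed volumes, the step (5) D reaction sums to 40 μL, which matches the 40 μL of DNA listed as the input of step (5) E.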
(7) Data analysis.
The results are as follows:
Agilent 2100 fragment size results before and after optimization:
Table 1. Amplification and library construction data of the present invention
Figure BDA0002281104980000101
In Table 1, "non-Tn5 treatment" refers to the amplified fragment length obtained with the above procedure but without Tn5 transposase. Replicates 1 to 4 are four repeated experiments. Their results differ somewhat, most likely for two reasons: the Agilent 2100 is a sensitive instrument, so the measurement itself introduces some variation in the detected fragment sizes; and the genomic DNA is a mixture of several bacteria, whose amplification efficiencies may differ.
Fragment size was used to verify whether the tag sequence carried by Tn5 had been inserted into the 16S rDNA region. As Table 1 shows, the amplification products obtained after Tn5 treatment were longer than those obtained without Tn5 treatment. Comparing this increase in length with the length of the sequence loaded onto Tn5 confirms that the method inserted the UMI, with the Me sequences at both ends, into the 16S region.
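The length comparison described above can be sketched as follows. This is a minimal illustration, assuming that one copy of the cassette of SEQ ID NO: 1 is inserted per amplicon; the function names, the tolerance and the example sizes are hypothetical and are not values from Table 1.

```python
# SEQ ID NO: 1 (sequence loaded onto Tn5) and SEQ ID NO: 2 (Me), from the sequence listing.
CASSETTE = "agatgtgtataagagacagnnnnnnnnagatgtgtataagagacag"
ME = "ctgtctcttatacacatct"

_COMPLEMENT = str.maketrans("acgtn", "tgcan")


def revcomp(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cassette_layout(cassette):
    """Split the cassette into (left flank, UMI, right flank) at the run of n."""
    umi_start = cassette.index("n")
    umi_end = cassette.rindex("n") + 1
    return cassette[:umi_start], cassette[umi_start:umi_end], cassette[umi_end:]


left, umi, right = cassette_layout(CASSETTE)
expected_shift = len(CASSETTE)  # bp added if one cassette is inserted (assumption)


def shift_matches(len_before, len_after, tolerance_bp):
    """True if the observed length increase is within tolerance of the cassette length."""
    return abs((len_after - len_before) - expected_shift) <= tolerance_bp


print(len(left), len(umi), len(right), expected_shift)
print(left == right == revcomp(ME))
print(shift_matches(1500, 1550, tolerance_bp=20))  # hypothetical sizes
```

In SEQ ID NO: 1 both flanks are the reverse complement of the Me sequence of SEQ ID NO: 2, and the random region is 8 nt long, so the cassette as listed adds 46 bp.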
In addition, the full length of 16S rDNA is about 1500 bp, so allowing for individual differences and measurement variation, the specific product essentially falls between 1 and 2 kb. The 1-2 kb proportion reported in Table 1 therefore indicates the share of specific amplification; fragments outside 1-2 kb are attributed to non-specific amplification.
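The 1-2 kb proportion can be computed from a sized fragment profile as sketched below. The input format (pairs of fragment size and signal amount), the function name and the example values are assumptions for illustration only; they are not Agilent 2100 output and not data from Table 1.

```python
def fraction_in_window(peaks, low_bp=1000, high_bp=2000):
    """Share of total signal whose fragment size lies within [low_bp, high_bp].

    peaks: iterable of (size_bp, amount) pairs.
    """
    peaks = list(peaks)
    total = sum(amount for _, amount in peaks)
    if total == 0:
        raise ValueError("no signal in profile")
    inside = sum(amount for size, amount in peaks if low_bp <= size <= high_bp)
    return inside / total


# Hypothetical profile: one specific peak near 1.5 kb and two off-target peaks.
example_peaks = [(350, 10.0), (1550, 80.0), (2600, 10.0)]
specific_share = fraction_in_window(example_peaks)
print(f"{specific_share:.0%} of the signal lies within 1-2 kb")
```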
Table 2. Test results of the present invention
Figure BDA0002281104980000111
In Table 2, the tag number is the number of reads aligned to the sample, i.e., the number of sequences; length (bp) is the length of the pure insert after removal of the UMI, amplification primers and adapter sequences, i.e., the full length of 16S rDNA in the present invention; and RQ is the quality value, where a value closer to 1 indicates better quality.
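Removing the UMI cassette from a read to recover the pure insert can be sketched as below. This is a simplified illustration and not the data processing actually used: it assumes the cassette appears in the read exactly as written in SEQ ID NO: 1, in forward orientation and without sequencing errors, and the function name `split_umi` and the demo read are hypothetical.

```python
import re

# Flanking sequence on both sides of the random region in SEQ ID NO: 1.
ME_FLANK = "AGATGTGTATAAGAGACAG"
# Flank, 8 random bases (captured as the UMI), flank.
CASSETTE_RE = re.compile(f"{ME_FLANK}([ACGT]{{8}}){ME_FLANK}")


def split_umi(read):
    """Return (umi, read_without_cassette); (None, read) if no exact match is found."""
    match = CASSETTE_RE.search(read)
    if match is None:
        return None, read
    return match.group(1), read[:match.start()] + read[match.end():]


# Hypothetical read: 8 bp of target, the cassette, then 8 bp of target.
demo_read = "ACGTACGT" + ME_FLANK + "ACGTTGCA" + ME_FLANK + "TTGGCCAA"
umi, insert = split_umi(demo_read)
print(umi, insert, len(insert))
```

A practical pipeline would also have to handle reverse-orientation reads, mismatches in the flanks, and trimming of the barcode and primer sequences at the read ends.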
The results of Example 1 show that the 16S lengths obtained by alignment to the reference sequences are similar and the RQ values are similar, indicating that the experimental procedure did not cause misidentification of the 16S sequence and that the base quality before and after optimization is comparable.
The classification and quantification results for the strains are shown in Fig. 2, where the black horizontal line marks the actual input amount and the bars show the strain proportions calculated from the sequencing data. With the method of Example 1, the proportions obtained from the sequencing data tend to agree with the actual input, indicating that the quantitative results of the method of Example 1 are more accurate and reliable.
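One way to turn UMI-tagged reads into strain proportions is to count each distinct UMI once per strain, so that PCR duplicates of the same tagged molecule are not counted repeatedly. The text does not spell out the data processing, so the sketch below is an assumed approach; the function name and the example reads are hypothetical.

```python
from collections import defaultdict


def strain_proportions(assignments):
    """Proportion of distinct UMIs per strain.

    assignments: iterable of (strain, umi) pairs, one pair per read.
    Reads sharing the same strain and UMI are counted once.
    """
    unique_umis = defaultdict(set)
    for strain, umi in assignments:
        unique_umis[strain].add(umi)
    counts = {strain: len(umis) for strain, umis in unique_umis.items()}
    total = sum(counts.values())
    if total == 0:
        return {}
    return {strain: count / total for strain, count in counts.items()}


# Hypothetical reads: strain A has three reads but only two distinct UMIs.
reads = [
    ("A", "AAAAAAAA"), ("A", "AAAAAAAA"), ("A", "CCCCCCCC"),
    ("B", "GGGGGGGG"), ("B", "TTTTTTTT"),
]
proportions = strain_proportions(reads)
print(proportions)
```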
Example 2 Investigation of Tn5 reaction conditions
Following the method of Example 1, Example 2 investigated the reaction conditions of step C of the Tn5 treatment in step (2): the reaction was carried out for 30 minutes at 55 ℃, 45 ℃, 37 ℃ and 20 ℃, respectively. The results are shown in Table 3 below:
Table 3. Amplification results under different reaction conditions
Figure BDA0002281104980000112
Figure BDA0002281104980000121
In Table 3, the length before Tn5-UMI treatment is the fragment length amplified without Tn5 transposase. These pre-treatment lengths differ between reaction conditions, most likely for two reasons: the Agilent 2100 is a sensitive instrument, so the measurement itself introduces some variation in the detected fragment sizes; and the genomic DNA is a mixture of several bacteria, whose amplification efficiencies may differ.
The results in Table 3 show that, among the conditions tested, reaction at 37 ℃ for 30 min was the more suitable condition for inserting the UMI sequence.
Comparative example 1 16S UMI library construction and sequencing method described in the existing PacBio platform literature
Comparative example 1 provides a 16S rDNA library construction and sequencing method based on the existing PacBio platform, as described in the literature, comprising the following steps:
(1) Sample preparation: DNA was extracted from each of the following four strains (hereinafter abbreviated as A, B, C and D) and then mixed, based on the 16S rDNA copy number of each strain, so that the solution contained 50000 copies of each strain's 16S rDNA per microliter.
Figure BDA0002281104980000122
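Mixing by 16S rDNA copy number requires converting a target copy concentration into a DNA mass concentration. The sketch below uses the common approximation of about 650 g/mol per base pair of double-stranded DNA. The genome size and the number of 16S copies per genome in the example are hypothetical placeholders, not values for strains A to D, and the function names are introduced here for illustration.

```python
AVOGADRO = 6.022e23          # molecules per mole
BP_MOLAR_MASS = 650.0        # approximate g/mol per base pair of dsDNA


def ng_per_ul_for_target(target_16s_copies_per_ul, genome_bp, copies_16s_per_genome):
    """DNA concentration (ng/uL) giving the target number of 16S copies per uL."""
    genomes_per_ul = target_16s_copies_per_ul / copies_16s_per_genome
    grams_per_ul = genomes_per_ul * genome_bp * BP_MOLAR_MASS / AVOGADRO
    return grams_per_ul * 1e9


def copies_16s_per_ul(ng_per_ul, genome_bp, copies_16s_per_genome):
    """Inverse conversion: 16S copies per uL for a given DNA concentration."""
    genomes_per_ul = ng_per_ul * 1e-9 * AVOGADRO / (genome_bp * BP_MOLAR_MASS)
    return genomes_per_ul * copies_16s_per_genome


# Hypothetical strain: 5 Mb genome carrying 7 copies of the 16S rRNA gene.
needed = ng_per_ul_for_target(50000, genome_bp=5_000_000, copies_16s_per_genome=7)
print(f"{needed:.4f} ng/uL gives about 50000 16S copies per uL")
```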
(2) First, the 16S fusion primers for the first round of amplification (SEQ ID NO: 11 to SEQ ID NO: 18) were synthesized by Shanghai Biotechnology Limited and used to amplify the full-length 16S; four replicates were prepared and named PCR-1, PCR-2, PCR-3 and PCR-4.
Figure BDA0002281104980000123
Figure BDA0002281104980000131
Figure BDA0002281104980000132
The amplification primers used are shown in the following table:
PacBio 16S full-length first-round amplification primer sequences
Primer name Primer sequence (5' to 3')
16S-UMI-F1 TCAGACGATGCGTCATNNNNNNNNNNAGAGTTTGATCCTGGCTCAG(SEQ ID NO:11)
16S-UMI-F2 CTATACATGACTCTGCNNNNNNNNNNAGAGTTTGATCCTGGCTCAG(SEQ ID NO:12)
16S-UMI-F3 TACTAGAGTAGCACTCNNNNNNNNNNAGAGTTTGATCCTGGCTCAG(SEQ ID NO:13)
16S-UMI-F4 TGTGTATCAGTACATGNNNNNNNNNNAGAGTTTGATCCTGGCTCAG(SEQ ID NO:14)
16S-UMI-R1 ATGACGCATCGTCTGANNNNNNNNNNGGYTACCTTGTTACGACTT(SEQ ID NO:15)
16S-UMI-R2 GCAGAGTCATGTATAGNNNNNNNNNNGGYTACCTTGTTACGACTT(SEQ ID NO:16)
16S-UMI-R3 GAGTGCTACTCTAGTANNNNNNNNNNGGYTACCTTGTTACGACTT(SEQ ID NO:17)
16S-UMI-R4 CATGTACTGATACACANNNNNNNNNNGGYTACCTTGTTACGACTT(SEQ ID NO:18)
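The primers in this table contain a run of N (any base) as the UMI and, in the reverse primers, the degenerate base Y (C or T). The sketch below counts the number of possible UMIs for one primer and expands the degenerate 16S-binding part of the reverse primers; the helper names are introduced here for illustration, and only the IUPAC codes that occur in the table are included.

```python
from itertools import product

# IUPAC codes needed for the primers in the table above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "N": "ACGT"}


def umi_space(primer):
    """Number of distinct UMIs encoded by the N positions of a primer."""
    return 4 ** primer.count("N")


def expand_degenerate(seq):
    """List every concrete sequence represented by a degenerate sequence."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in seq))]


F1 = "TCAGACGATGCGTCATNNNNNNNNNNAGAGTTTGATCCTGGCTCAG"  # 16S-UMI-F1, SEQ ID NO: 11
REVERSE_16S_PART = "GGYTACCTTGTTACGACTT"                  # 16S-binding part of 16S-UMI-R1 to R4

print(len(F1), umi_space(F1))
print(expand_degenerate(REVERSE_16S_PART))
```

Only the short 16S-binding part is expanded here; expanding a full primer with ten N positions would produce over a million sequences.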
(3) PCR-1, PCR-2, PCR-3 and PCR-4 were purified with magnetic beads (50 μL of magnetic beads was added and bound for 10 min, washed twice with 75% ethanol, and eluted in 10 μL of nuclease-free water).
(4) The 16S fusion primers for the second round of amplification (SEQ ID NO: 19 to SEQ ID NO: 26) were synthesized by Shanghai Biotechnology Limited, and PCR-1, PCR-2, PCR-3 and PCR-4 were each subjected to a second round of amplification. The second-round reaction system and conditions were as follows:
Reagent Volume (μL)
Enzyme/polymerase Total amount of
2X KAPA HIFI Mix 25
Primer Mix 4
Template DNA(50ng) X
NF-H2O 21-X
Figure BDA0002281104980000141
The primers used were as follows:
PacBio 16S full-length second-round amplification primer sequences
Primer name Primer sequence (5' to 3')
16S-F1 TCAGACGATGCGTCAT(SEQ ID NO:19)
16S-F2 CTATACATGACTCTGC(SEQ ID NO:20)
16S-F3 TACTAGAGTAGCACTC(SEQ ID NO:21)
16S-F4 TGTGTATCAGTACATG(SEQ ID NO:22)
16S-R1 ATGACGCATCGTCTGA(SEQ ID NO:23)
16S-R2 GCAGAGTCATGTATAG(SEQ ID NO:24)
16S-R3 GAGTGCTACTCTAGTA(SEQ ID NO:25)
16S-R4 CATGTACTGATACACA(SEQ ID NO:26)
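One property of the primers in this table can be checked directly: each reverse sequence is the reverse complement of the forward sequence with the same number. The following sketch performs that check; the dictionary and function names are introduced here for illustration.

```python
# Second-round primer sequences, SEQ ID NO: 19 to 26, copied from the table above.
FORWARD = {
    "16S-F1": "TCAGACGATGCGTCAT",
    "16S-F2": "CTATACATGACTCTGC",
    "16S-F3": "TACTAGAGTAGCACTC",
    "16S-F4": "TGTGTATCAGTACATG",
}
REVERSE = {
    "16S-R1": "ATGACGCATCGTCTGA",
    "16S-R2": "GCAGAGTCATGTATAG",
    "16S-R3": "GAGTGCTACTCTAGTA",
    "16S-R4": "CATGTACTGATACACA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Map each forward primer to the reverse primer that is its reverse complement.
pairs = {
    f_name: r_name
    for f_name, f_seq in FORWARD.items()
    for r_name, r_seq in REVERSE.items()
    if revcomp(f_seq) == r_seq
}
print(pairs)
```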
(5) The four PCR amplification products (PCR-1, PCR-2, PCR-3 and PCR-4) were mixed in equal amounts.
(6) The PacBio library construction and sequencing procedure was as follows:
A. Damage repair reaction: the following reaction system was prepared and incubated at 37 ℃ for 60 min:
Reagent Volume (μL)
DNA Damage Repair Buffer 5
NAD+ 0.5
ATP high 5
dNTP 0.5
DNA Damage Repair Mix 1
DNA 37
B. End repair reaction: the following reaction system was prepared and incubated at 25 ℃ for 10 min:
Reagent Volume (μL)
DNA 50
DNA Damage Repair Mix 2
C. Magnetic bead purification: 50 μL of magnetic beads was added and allowed to bind the sample for 10 min; the beads were washed twice with 75% ethanol, and the DNA was then eluted from the beads with 24 μL of water.
D. Adapter ligation: the following reaction system was prepared and incubated overnight (12-16 hours) at 25 ℃:
Reagent Volume (μL)
DNA 20
Annealed Blunt Adapter (20 μM) 10
Template Prep Buffer 4
ATP low 2
Ligase 1
Water 3
E. Double exonuclease digestion: the following reaction system was prepared and incubated at 37 ℃ for 60 min:
Reagent Volume (μL)
DNA 40
ExoIII 1
ExoVII 1
F. Magnetic bead purification: 50 μL of magnetic beads was added and allowed to bind the sample for 10 min; the beads were washed twice with 75% ethanol, and the DNA was then eluted from the beads with 30 μL of water.
G. Size selection: fragments of 1-2 kb were selected.
(7) Preparation for sequencing: the sequencing primer and the sequencing polymerase were added.
A. Primer annealing: the following reaction system was prepared and incubated at 20 ℃ for 60 min:
Reagent Volume (μL)
Water 6.1
10× Primer Buffer 1
Sample 1.9
Diluted Sequencing Primer 1
B. Polymerase binding: the following reaction system was prepared and incubated at 30 ℃ for 60 min:
Reagent Volume (μL)
dNTP 1.6
DTT 1.6
Binding Buffer V2 1.6
Sample 9.2
Diluted Polymerase 1
C. Magnetic bead purification: the product was purified using a 0.6× volume of magnetic beads.
(8) Data analysis.
The results are as follows:
1. Agilent 2100 fragment size results:
Table 4. Amplification results of the prior art method (before optimization)
Figure BDA0002281104980000161
The full length of bacterial 16S rDNA is about 1.5 kb. When experiments are carried out by the official PacBio method (non-UMI amplification), about 80% of sample fragments generally fall within 1-2 kb. After amplification by the currently published UMI method, the Agilent 2100 quality check shows that specific amplification accounts for only about 20-30% of the amplification product, i.e., the method causes a large amount of non-specific amplification.
Table 5. Sequencing results of the prior art method (before optimization)
Figure BDA0002281104980000162
Comparing Example 1 with Comparative example 1 shows that the data quality of the method provided by the present invention is essentially similar to that of the conventional method, indicating that the method provided by the present invention does not adversely affect sequencing.
The method of Comparative example 1 was used to quantify the strains; the results are shown in Fig. 3. The strains quantified by PacBio sequencing are A, B, C and D in the legend, and the black horizontal line in Fig. 3 marks the actual input amount of each strain. The results show that, with this method, the measured proportions deviate substantially from the theoretical ones.
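The deviation between theoretical and measured proportions can be expressed as a single number, for example the mean absolute deviation across strains. The sketch below assumes an equal theoretical share for the four strains, since each was mixed at the same 16S rDNA copy number; the measured values are hypothetical and are not read from Fig. 3, and the function name is introduced here for illustration.

```python
def mean_abs_deviation(observed, expected):
    """Mean absolute difference between observed and expected proportions."""
    if set(observed) != set(expected):
        raise ValueError("observed and expected must cover the same strains")
    return sum(abs(observed[s] - expected[s]) for s in expected) / len(expected)


# Equal input of four strains gives a theoretical share of 0.25 each.
expected = {strain: 0.25 for strain in "ABCD"}
# Hypothetical measured proportions for illustration only.
observed = {"A": 0.40, "B": 0.10, "C": 0.30, "D": 0.20}

deviation = mean_abs_deviation(observed, expected)
print(f"mean absolute deviation: {deviation:.2f}")
```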
These results show that amplifying a target region, constructing a library and sequencing using a transposase does not affect the sequencing quality and allows the nucleic acid in the target region to be quantified.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the description herein, reference to the terms "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Moreover, various embodiments or examples and features of different embodiments or examples described in this specification can be combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.
SEQUENCE LISTING
<110> Wuhan BGI Medical Laboratory Co., Ltd.
<120> method for amplifying target region of nucleic acid, method for creating library and sequencing, and kit
<130> PIDC3194980
<160> 26
<170> PatentIn version 3.5
<210> 1
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> Tn5
<220>
<221> misc_feature
<222> (20)..(27)
<223> n is a, c, g, or t
<400> 1
agatgtgtat aagagacagn nnnnnnnaga tgtgtataag agacag 46
<210> 2
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Me
<400> 2
ctgtctctta tacacatct 19
<210> 3
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-F1
<400> 3
tcagacgatg cgtcatagag tttgatcctg gctcag 36
<210> 4
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-F2
<400> 4
ctatacatga ctctgcagag tttgatcctg gctcag 36
<210> 5
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-F3
<400> 5
tactagagta gcactcagag tttgatcctg gctcag 36
<210> 6
<211> 36
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-F4
<400> 6
tgtgtatcag tacatgagag tttgatcctg gctcag 36
<210> 7
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-R1
<400> 7
atgacgcatc gtctgaggyt accttgttac gactt 35
<210> 8
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-R2
<400> 8
gcagagtcat gtatagggyt accttgttac gactt 35
<210> 9
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-R3
<400> 9
gagtgctact ctagtaggyt accttgttac gactt 35
<210> 10
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-R4
<400> 10
catgtactga tacacaggyt accttgttac gactt 35
<210> 11
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-UMI-F1
<220>
<221> misc_feature
<222> (17)..(26)
<223> n is a, c, g, or t
<400> 11
tcagacgatg cgtcatnnnn nnnnnnagag tttgatcctg gctcag 46
<210> 12
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-UMI-F2
<220>
<221> misc_feature
<222> (17)..(26)
<223> n is a, c, g, or t
<400> 12
ctatacatga ctctgcnnnn nnnnnnagag tttgatcctg gctcag 46
<210> 13
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-UMI-F3
<220>
<221> misc_feature
<222> (17)..(26)
<223> n is a, c, g, or t
<400> 13
tactagagta gcactcnnnn nnnnnnagag tttgatcctg gctcag 46
<210> 14
<211> 46
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-UMI-F4
<220>
<221> misc_feature
<222> (17)..(26)
<223> n is a, c, g, or t
<400> 14
tgtgtatcag tacatgnnnn nnnnnnagag tttgatcctg gctcag 46
<210> 15
<211> 45
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-UMI-R1
<220>
<221> misc_feature
<222> (17)..(26)
<223> n is a, c, g, or t
<400> 15
atgacgcatc gtctgannnn nnnnnnggyt accttgttac gactt 45
<210> 16
<211> 45
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-UMI-R2
<220>
<221> misc_feature
<222> (17)..(26)
<223> n is a, c, g, or t
<400> 16
gcagagtcat gtatagnnnn nnnnnnggyt accttgttac gactt 45
<210> 17
<211> 45
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-UMI-R3
<220>
<221> misc_feature
<222> (17)..(26)
<223> n is a, c, g, or t
<400> 17
gagtgctact ctagtannnn nnnnnnggyt accttgttac gactt 45
<210> 18
<211> 45
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-UMI-R4
<220>
<221> misc_feature
<222> (17)..(26)
<223> n is a, c, g, or t
<400> 18
catgtactga tacacannnn nnnnnnggyt accttgttac gactt 45
<210> 19
<211> 16
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-F1
<400> 19
tcagacgatg cgtcat 16
<210> 20
<211> 16
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-F2
<400> 20
ctatacatga ctctgc 16
<210> 21
<211> 16
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-F3
<400> 21
tactagagta gcactc 16
<210> 22
<211> 16
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-F4
<400> 22
tgtgtatcag tacatg 16
<210> 23
<211> 16
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-R1
<400> 23
atgacgcatc gtctga 16
<210> 24
<211> 16
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-R2
<400> 24
gcagagtcat gtatag 16
<210> 25
<211> 16
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-R3
<400> 25
gagtgctact ctagta 16
<210> 26
<211> 16
<212> DNA
<213> Artificial Sequence
<220>
<223> 16S-R4
<400> 26
catgtactga tacaca 16

Claims (10)

1. A method of amplifying a target region of a nucleic acid, comprising:
(1) inserting a UMI sequence onto the nucleic acid using a transposase so as to obtain a fusion product containing the UMI sequence and the target region sequence;
(2) specifically amplifying the fusion product using primers to obtain an amplification product comprising the UMI sequence and the target region sequence.
2. The method of claim 1, wherein the transposase is Tn5 transposase;
optionally, the UMI sequence is a random sequence with the length of 5-25 bp, preferably a random sequence with the length of 8-15 bp, and more preferably a random sequence with the length of 10 bp.
3. The method of claim 1, wherein the target region is a 16S rDNA, 18S rDNA, or ITS region;
optionally, the length of the target region is 1-10 kb, preferably 1-2 kb;
optionally, the nucleic acid is genomic DNA of a microorganism.
4. The method of claim 1, wherein the UMI sequence is located between the target region sequences;
optionally, every 1-2kb length of the target region contains 1 UMI sequence.
5. The method of claim 1, wherein step (1) further comprises:
(1-1) obtaining a nucleic acid fragment containing a recognition sequence and said UMI sequence;
(1-2) mixing and reacting the nucleic acid fragment, Tn5 transposase and the nucleic acid, wherein the reaction is carried out at 32-40 ℃ for 25-50 minutes, preferably at 37 ℃ for 30 minutes;
optionally, in the step (1-2), magnesium ions and a buffer are contained in the reaction system.
6. A method of constructing a library for a target region of a nucleic acid, comprising:
amplifying a target region of the nucleic acid using the method of any one of claims 1 to 5 to obtain an amplification product;
constructing a library based on the amplification product so as to obtain a sequencing library;
optionally, the library construction is performed using a PacBio platform;
optionally, constructing the library comprises:
subjecting the amplification product to damage repair and end repair to obtain a repair product;
ligating the repair product to an adapter to obtain a ligation product;
digesting the ligation product with enzymes, and size-selecting the resulting digestion products to obtain fragments containing the target region and the UMI sequence;
optionally, before the repair product is ligated to the adapter, or before the digestion products are size-selected, purification is performed using magnetic beads.
7. A method of constructing a sequencing library, comprising:
inserting the UMI sequence onto genomic DNA using a transposase so as to obtain a fusion product;
building a library based on the fusion products so as to obtain a sequencing library;
optionally, the library construction is performed using a PacBio platform;
optionally, constructing the library comprises:
subjecting the fusion product to damage repair and end repair to obtain a repair product;
ligating the repair product to an adapter to obtain a ligation product;
digesting the ligation product with enzymes, and size-selecting the resulting digestion products so as to obtain the sequencing library.
8. A sequencing method, comprising:
obtaining a sequencing library using the method of claim 6 or 7;
performing on-machine sequencing based on the sequencing library so as to obtain a sequencing result;
optionally, the on-machine sequencing is performed using a PacBio platform.
9. A kit, comprising:
Tn5 transposase, and
the sequences shown in SEQ ID NO: 1 and SEQ ID NO: 2, respectively;
optionally, further comprising:
the kit is suitable for PacBio platform library construction and sequencing.
10. A method of modifying a nucleic acid, wherein a UMI sequence is inserted into the nucleic acid using a transposase to obtain a fusion product;
optionally, the transposase is Tn5 transposase;
optionally, the UMI sequence is a random sequence with the length of 5-25 bp, preferably a random sequence with the length of 8-15 bp, and more preferably a random sequence with the length of 10 bp;
optionally, every 1-2kb of nucleic acid sequence contains 1 UMI sequence.
CN201911141638.6A 2019-11-20 2019-11-20 Method for amplifying target region of nucleic acid, library construction and sequencing method and kit Pending CN112824534A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911141638.6A CN112824534A (en) 2019-11-20 2019-11-20 Method for amplifying target region of nucleic acid, library construction and sequencing method and kit

Publications (1)

Publication Number Publication Date
CN112824534A true CN112824534A (en) 2021-05-21

Family

ID=75906819

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911141638.6A Pending CN112824534A (en) 2019-11-20 2019-11-20 Method for amplifying target region of nucleic acid, library construction and sequencing method and kit

Country Status (1)

Country Link
CN (1) CN112824534A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112251422A (en) * 2020-10-21 2021-01-22 华中农业大学 Transposase complex containing unique molecular tag sequence and application thereof

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105463066A (en) * 2014-09-04 2016-04-06 中国科学院北京基因组研究所 DNA amplification method
CN105525357A (en) * 2014-09-30 2016-04-27 深圳华大基因股份有限公司 Sequencing library construction method, and kit and application thereof
CN106801044A (en) * 2017-01-09 2017-06-06 北京全式金生物技术有限公司 A kind of ring-type transposons compound and its application
CN107586835A (en) * 2017-10-19 2018-01-16 东南大学 A kind of construction method of sequencing library of future generation based on single-stranded joint and its application
CN109056077A (en) * 2018-09-13 2018-12-21 武汉菲沙基因信息有限公司 A kind of amplicon sample mixing sequencing library construction method suitable for PacBio microarray dataset
CN109072206A (en) * 2016-03-10 2018-12-21 斯坦福大学托管董事会 The imaging to accessible genome that transposase mediates
CN110004210A (en) * 2019-04-02 2019-07-12 杭州进一生物科技有限公司 A method of for constructing bacterial 16 S rDNA overall length high-throughput sequencing library
CN110396534A (en) * 2019-08-12 2019-11-01 华大生物科技(武汉)有限公司 The construction method of gene library, determined nucleic acid sample gene mutation detection method and kit

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
佚名: "Robust Tn5 转座酶", 《ROBUSTNIQUE》 *
许昊等: "基于工程化转座酶的少量细胞RNA-Seq文库构建的研究", 《第三军医大学学报》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112251422A (en) * 2020-10-21 2021-01-22 华中农业大学 Transposase complex containing unique molecular tag sequence and application thereof
CN112251422B (en) * 2020-10-21 2024-04-19 华中农业大学 Transposase complex containing unique molecular tag sequence and application thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40053465)
RJ01 Rejection of invention patent application after publication (Application publication date: 20210521)