WO2020073748A1 - Method for constructing a sequencing library - Google Patents

Method for constructing a sequencing library

Info

Publication number
WO2020073748A1
Authority
WO
WIPO (PCT)
Prior art keywords
polynucleotide
tail
substrate
dna
sequencing
Prior art date
Application number
PCT/CN2019/102651
Other languages
English (en)
French (fr)
Inventor
张翼
陈琼
Original Assignee
北京优乐复生科技有限责任公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京优乐复生科技有限责任公司
Priority to US17/284,734 (US20220002713A1)
Priority to CN201980013343.2A (CN111989406B)
Priority to EP19871204.4A (EP3865584A4)
Publication of WO2020073748A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay

Definitions

  • The present invention relates to a method and kit for constructing a second-generation high-throughput sequencing library, and more specifically to a method and kit for constructing a high-throughput sequencing library using a sequencing adapter with random bases overhanging at its 3' end.
  • Second-generation sequencing technology offers faster sequencing and higher throughput, which matches the current demand for sequencing.
  • The platforms of second-generation sequencing technologies mainly include Illumina's HiSeq, MiSeq, NextSeq and NovaSeq, and Life Technologies' SOLiD system, PGM, Proton, etc.
  • The technical principle of second-generation sequencing is sequencing by synthesis, that is, determining the DNA sequence from the signal changes caused by the different bases newly incorporated during synthesis.
  • The Illumina sequencing platform detects optical signals, while the Life Technologies sequencing platform detects the current changes caused by acid-base (pH) changes.
  • Second-generation sequencing technology is by far the most mature and widely used high-throughput DNA sequencing method. It plays an important role in large-scale genome sequencing and gene diagnosis and treatment, and its clinical application will become more and more extensive.
  • Circulating DNA, also known as free DNA (cell-free DNA), is DNA that exists outside cells in the blood.
  • The main source of free DNA is apoptotic cells or bone marrow cells. After the DNA released by these cells is cleaved by nucleases in vivo, small DNA fragments about 166 bp in length are generated (Y.M. Dennis Lo et al. Science Translational Medicine. 2010.10: 61ra91).
  • Free DNA is in dynamic equilibrium in the body, so it can serve as an important parameter for health assessment. Events such as tumorigenesis and organ transplantation change the properties of free DNA in peripheral blood, including its length, base information and epigenetic modifications; free DNA can therefore serve as an important marker for noninvasive detection in the early diagnosis, monitoring and prognostic evaluation of disease.
  • The traditional workflow for constructing a second-generation high-throughput methylation sequencing library first performs pre-library construction, including end filling, 5'-end phosphorylation, 3'-end A-tailing and adapter ligation; bisulfite treatment is carried out after pre-library construction is complete. Bisulfite treatment causes extensive DNA damage, and the templates that can finally be sequenced account for less than 10% of the original templates (Masahiko Shiraishi et al. 2004.10: 409-415).
  • This workflow has three drawbacks: 1) purification is required at each step, which is cumbersome; 2) the end-filling step artificially introduces nucleotides, which alters the true methylation status; 3) a large number of DNA templates are destroyed during bisulfite treatment and are lost after PCR amplification.
  • The Swift methylation sequencing library construction method can build libraries more efficiently than traditional methods (CN104395480, see Figure 5).
  • Its workflow first performs bisulfite treatment and then library construction, including 3'-end tailing and adapter ligation, followed by an extension reaction and then ligation of the sequencing adapter at the other end. Because the DNA template in the extension reaction contains a large number of uracil (dU) residues, the template has been damaged by bisulfite treatment, and only one round of extension occurs, the efficiency of obtaining complete double-stranded deoxypolynucleotides is low.
  • The present invention provides a method for ligating a linker to a single-stranded deoxypolynucleotide, and further provides a method for constructing a second-generation high-throughput sequencing library.
  • the method of the present invention is applicable not only to normal DNA, but also to samples with severe damage such as FFPE samples, ancient DNA, and bisulfite-treated DNA samples.
  • the invention relates to a method for building a library of deoxypolynucleotide substrates.
  • the method includes the following steps:
  • The 3' end of the single-stranded deoxypolynucleotide substrate undergoes a tailing reaction with the deoxynucleotide in solution, and the 3' end of the substrate, now carrying a polynucleotide homopolymer tail, is ligated to the linker of the tail-controlling component to obtain the tailed substrate;
  • Step (3): DNA polymerase, deoxynucleotides including dGTP, dCTP, dATP and dTTP, and linear amplification primers are added to form a second mixture;
  • Step (4): The second mixture is incubated, and the first linear extension reaction is performed using the tailed substrate obtained in step (2) as a template to synthesize the complementary strand of the substrate; the strands are then melted, the linear amplification primer anneals to the substrate, and the subsequent linear extension reaction is performed; the number of linear extension reactions is not less than 3;
  • Step (6): A 5' sequencing adapter and DNA ligase are added to the solution of step (5) to form a third mixture;
  • The 5' sequencing adapter is ligated to the complementary strand of the substrate to prepare a DNA library.
  • In the first linear extension reaction of step (4), the primer used is the tail-controlling molecule (that is, the single strand composed of the tail-controlling region and the X region).
  • The primer used is the linear amplification primer added in step (3).
  • The polynucleotide homopolymer and the X region fragment in the tail-controlling component are degraded, and the linear amplification primer added in step (3) performs a linear extension reaction on the substrate.
  • The linear amplification primer added in step (3) binds competitively to the substrate, so that the added linear amplification primer performs a linear extension reaction on the substrate.
  • the invention further relates to a kit, which can build a library of deoxypolynucleotide substrates, including
  • Component 1 contains a deoxynucleotide selected from dGTP, dCTP, dATP and dTTP, terminal deoxynucleotidyl transferase, DNA ligase and a tail-controlling component, wherein the tail-controlling component is a partially double-stranded nucleotide molecule composed of a single strand consisting of a polynucleotide homopolymer 5 to 20 nucleotides in length and an X region, together with a linker polynucleotide complementary to the X region; the polynucleotide homopolymer is composed of the deoxynucleotide complementary to the one selected from dGTP, dCTP, dATP and dTTP;
  • Component 2 contains DNA polymerase, deoxynucleotides including dGTP, dCTP, dATP and dTTP, and linear amplification primers;
  • Component 3 contains a 5' sequencing adapter and DNA ligase.
  • the present invention can effectively increase the number of complementary strands of the original single-stranded polynucleotide substrate by designing linear amplification.
  • By using a 5' sequencing adapter with several random bases overhanging at the 3' end of one strand, the 5' sequencing adapter can be added to the 3' end of the complementary strand of the polynucleotide substrate very efficiently.
  • The polynucleotide substrate is denatured into single strands; the substrate is tailed and ligated to a linker; linear amplification of the polynucleotide substrate is then performed to obtain the complementary strand; a 5' sequencing adapter is ligated to the 3' end of the complementary strand; and PCR enrichment yields a library for next-generation sequencing.
  • When the substrate is processed for methylation analysis (for example, by bisulfite treatment), a deoxypolynucleotide substrate template containing U bases is obtained. Because U bases may cause linear amplification to stall, the complementary strands obtained vary in length; however, by ligating a 5' sequencing adapter with several random bases overhanging at the 3' end of one strand to the single-stranded complementary strand, the present invention greatly improves utilization of the complementary strands. With this library construction process, a whole-genome methylation sequencing library can be constructed from as little as 2 ng of genomic DNA derived from cultured human cells, with efficient sequencing results.
  • Figure 1: Flow chart of DNA library construction by the method of the present invention
  • Figure 3: Structure of the 6N 5' sequencing adapter
  • Figure 4: Flow chart of the traditional method for DNA methylation library construction
  • The various written forms of "3' end" have the same meaning, as do those of "5' end"; they refer to the 3' or 5' end of a nucleotide sequence, respectively.
  • A polynucleotide substrate is a polynucleotide fragment that requires a tailing reaction and library construction.
  • the polynucleotide substrate is single-stranded or double-stranded DNA.
  • the polynucleotide substrate is a chemically treated nucleotide sequence, including but not limited to a bisulfite-treated polynucleotide.
  • the polynucleotide substrate may be of natural origin or synthetic. Natural sources are polynucleotide sequences from prokaryotes or eukaryotes, such as humans, mice, viruses, plants, or bacteria.
  • the polynucleotide substrate of the present invention may also be a sample with severe damage such as FFPE samples, ancient DNA, and bisulfite-treated DNA.
  • Polynucleotide substrates are tailed and can be used in assays involving microarrays and generate libraries for next-generation nucleic acid sequencing. Tailed polynucleotide substrates can also be used for efficient cloning of polynucleotide sequences.
  • The polynucleotide substrate is single-stranded or double-stranded and contains a free 3' hydroxyl group. In some aspects, the polynucleotide substrate is double-stranded and contains blunt ends. In other aspects, the double-stranded polynucleotide substrate comprises 3' recessed ends.
  • the length of the protruding or recessed ends of the polynucleotide substrate can vary. In various aspects, the length of the protruding or recessed ends of the polynucleotide substrate is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more nucleotides.
  • The length of the polynucleotide substrate is between about 10 and about 5000 nucleotides, or between about 40 and about 2000 nucleotides, or between about 50 and about 1000 nucleotides, or between about 100 and about 500 nucleotides. In further aspects, the length of the polynucleotide substrate is at least 3 to at most about 50, 100, or 1000 nucleotides.
  • DNA dephosphorylation refers to the removal of the phosphate group from the terminal nucleotide residue at the 5' and 3' ends of DNA; generally, the DNA is treated with alkaline phosphatase to dephosphorylate the 5' and 3' residues.
  • the present invention dephosphorylates the deoxypolynucleotide substrate before tailing the deoxypolynucleotide substrate.
  • Step (2) of the method of the present invention tails the deoxypolynucleotide substrate.
  • the method is used to add the required number of nucleotides to the 3 'end of the polynucleotide substrate in a controlled manner.
  • By way of example and without limitation, by adding a tail-controlling component, the tail of the polynucleotide substrate is controlled within a certain length (see Figure 1).
  • The tail-controlling component contains a polynucleotide homopolymer 5 to 20 nucleotides in length; it forms a double-stranded structure with the homopolymer tail newly added to the substrate, thereby reducing the rate of the polymerization process and keeping the tail of the polynucleotide substrate within a certain length (see Figure 1).
  • The nucleotide added in the tailing reaction is a deoxynucleotide selected from dGTP, dCTP, dATP and dTTP; for example, it may be a dGTP solution, a dCTP solution, a dATP solution or a dTTP solution.
  • A tail-controlling component containing a poly(dT) nucleotide homopolymer sequence is used to control the addition, by TdT enzyme (terminal deoxynucleotidyl transferase), of a poly(dA) tail (also called a (dA) homopolymer tail) to the 3' end of the polynucleotide substrate.
  • The homopolymer tail of the polynucleotide substrate is ligated to the linker of the tail-controlling component, forming a tailed region at the 3' end of the substrate.
  • The homopolymer sequence of the tail-controlling component comprises 5-20, preferably 5-13, further preferably 7-10, more preferably 7-9 identical nucleotides.
  • The molar concentration ratio of polynucleotide substrate to tail-controlling component ranges from 1:1 to 1:100, preferably 1:5 to 1:50.
  • The pH of the tailing reaction in step (2) of the method of the present invention ranges from about 5.0 to about 9.0; the molar concentration ratio of polynucleotide substrate to single nucleotide ranges from 1:10 to 1:20000, preferably 1:100 to 1:2000; the incubation time is 1 minute to 120 minutes, preferably 0.5-60 minutes, 0.5-30 minutes, 1-20 minutes, 1-15 minutes or 1-10 minutes; the incubation temperature is 20 °C to 50 °C, preferably 25 °C to 45 °C, more preferably 25 °C to 37 °C.
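The molar ratios above can be turned into reagent amounts with a little arithmetic. The following is a minimal sketch, not part of the patent: the function names are hypothetical, and the figure of roughly 330 g/mol per nucleotide of single-stranded DNA is a common rule-of-thumb approximation rather than a value given in the text. The 2 ng input and 300-nucleotide fragment length are taken from elsewhere in this document purely as an illustration.

```python
# Rule-of-thumb average mass of one ssDNA nucleotide (assumption, not from the patent).
AVG_SSDNA_NT_MASS = 330.0  # g/mol per nucleotide


def ssdna_pmol(mass_ng, length_nt):
    """Convert a mass of single-stranded DNA (ng) into picomoles of molecules."""
    # ng -> g is 1e-9, mol -> pmol is 1e12, so the net factor is 1e3.
    return mass_ng * 1000.0 / (length_nt * AVG_SSDNA_NT_MASS)


def partner_pmol_range(substrate_pmol, low_fold, high_fold):
    """Amount of a second reagent for ratios 1:low_fold to 1:high_fold."""
    return substrate_pmol * low_fold, substrate_pmol * high_fold


if __name__ == "__main__":
    substrate = ssdna_pmol(mass_ng=2.0, length_nt=300)
    # Preferred substrate : tail-controlling component ratio of 1:5 to 1:50.
    tail_low, tail_high = partner_pmol_range(substrate, 5, 50)
    # Preferred substrate : single nucleotide ratio of 1:100 to 1:2000.
    nt_low, nt_high = partner_pmol_range(substrate, 100, 2000)
    print(f"substrate: {substrate:.4f} pmol")
    print(f"tail-controlling component: {tail_low:.3f} to {tail_high:.3f} pmol")
    print(f"single nucleotide: {nt_low:.2f} to {nt_high:.2f} pmol")
```

Because the ratios are molar, shorter fragments at the same mass correspond to more molecules and therefore require proportionally more tail-controlling component.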
  • the present invention controls the tailing length and efficiency of the polynucleotide substrate by adding tailing control components.
  • the tail-controlling component is composed of a tail-controlling region and an X region, and a linker sequence complementary to the X region (see FIG. 2).
  • the tail-controlling component is also called a "tail-controlling linker”.
  • the single chain composed of tail-controlling region and X region is called "tail-controlling molecule".
  • the tail-controlling region of the present invention is a polynucleotide homopolymer of 5-20 nucleotides in length.
  • Polynucleotide homopolymer also known as "poly region" is a chain of polynucleotides connected by the same nucleotide.
  • the tail control region of the present invention is a nucleotide homopolymer sequence composed of a deoxynucleotide in dGTP, dCTP, dATP, and dTTP; preferably, the tail control region of poly (dT) composed of dTTP.
  • the length of the polynucleotide homopolymer of the tail-controlling region of the present invention is 5-20 nucleotides, preferably 7-20, 9-20 nucleotides, further preferably 5-10 nucleotides, 7-10 nucleotides, more preferably 7-9 nucleotides.
  • a certain length of dGTP or dCTP polynucleotide homopolymer can effectively control the tailing of the polynucleotide substrate to about 20 nucleotides.
  • The "X region sequence" provides a priming sequence for amplification or sequencing of nucleic acid fragments, and may also include a labeling sequence for distinguishing different substrate molecules.
  • The labeling sequence may contain 4-16 bases and in some aspects is used for next-generation sequencing applications.
  • the X region sequence may be, but is not limited to, a next generation sequencing (NGS) linker sequence compatible with Illumina, Ion Torrent, Roche 454, or SOLiD sequencing platforms.
  • the X region sequence may be a DNA sequence, an RNA sequence, or a heteropolymeric sequence containing DNA and RNA.
  • Among the linkers in the tail-controlling component, a linker that contains only the sequence complementary to the X region is called a "short linker"; a linker that, in addition to the sequence complementary to the X region, also includes an extension primer binding region is called a "long linker", as shown in Table 2.
  • The present invention uses a method of tailing a deoxypolynucleotide substrate in which a desired number of nucleotides is added to the 3' end of the polynucleotide substrate in a controlled manner.
  • The tail-controlling component comprises a polynucleotide homopolymer 5-20 nucleotides in length; the tail-controlling component and the homopolymer tail newly added to the substrate are complementary and form a double-stranded structure, thereby reducing the rate of the polymerization process and keeping the tail of the polynucleotide substrate within a certain length (see Figure 1).
  • A tail-controlling component containing a poly(dT) nucleotide homopolymer sequence is used to control the addition of a poly(dA) tail (also called a (dA) homopolymer tail) to the 3' end of the polynucleotide substrate. Further, the poly(dA) tail of the polynucleotide substrate is ligated to the linker of the tail-controlling component, forming a tailed region at the 3' end of the substrate.
  • the tail-controlling component comprises a blocking group.
  • a blocking group is a part that prevents extension by an enzyme. If there is no blocking group, the enzyme can synthesize the polynucleotide by adding nucleotides.
  • Blocking groups include, but are not limited to: phosphate groups, C3 spacers (three-carbon arms), dideoxynucleotides, ribonucleotides, amino groups, and inverted deoxythymidine.
  • The linker of the tail-controlling component has a phosphorylation modification at the 5' end and a blocking group at the 3' end, and the tail-controlling region has a blocking group at its 3' end.
  • A "linear amplification reaction" is also called a "linear extension reaction" or "linear extension".
  • a linear amplification reaction is further performed using the deoxypolynucleotide substrate as a template.
  • The present invention provides a method for linearly amplifying deoxypolynucleotide substrates. The method is used to increase the number of copies derived from the deoxypolynucleotide substrate by linear amplification (see Figure 1).
  • A linear amplification primer is added, and an extension reaction is performed using the tailed substrate as a template.
  • The extension reaction may first be an extension reaction primed by the tail-controlling molecule.
  • The linear amplification primer displaces the tail-controlling molecule from the polynucleotide substrate by competition, and a linear amplification reaction is then performed using the deoxypolynucleotide substrate as a template.
  • The polynucleotide homopolymer and the X region fragment in the tail-controlling component are degraded, and the linear amplification primer added in step (3) performs a linear extension reaction on the substrate.
  • The method includes the following: after the tailing reaction, the polynucleotide substrate is denatured into a single-stranded state, and a linear amplification primer complementary to the 3' linker sequence of the substrate, DNA polymerase and deoxynucleotides are added to react with the polynucleotide substrate; a nucleotide extension reaction occurs at the 3' end of the linear amplification primer to synthesize the complementary strand of the substrate, giving a double-stranded deoxypolynucleotide; denaturation then separates the complementary strand from the substrate, and the substrate again undergoes an extension reaction with the linear amplification primer, DNA polymerase and deoxynucleotides. The number of extension reactions that occur is called the linear amplification cycle number.
  • the DNA polymerase can efficiently amplify a U base-containing deoxypolynucle
  • The number of linear amplification cycles is not less than 3, preferably not less than 4, preferably 4-50, more preferably 4-20, more preferably 4-12 or 8-12.
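Because only the original substrate serves as a template in each cycle, the number of complementary strands grows linearly with the cycle number rather than doubling as in PCR. The sketch below illustrates that contrast under idealized assumptions; the function names and the per-cycle `efficiency` parameter are hypothetical and do not come from the patent.

```python
def linear_copies(templates, cycles, efficiency=1.0):
    """Complementary strands made when only the original template is copied each cycle."""
    return templates * cycles * efficiency


def exponential_molecules(templates, cycles, efficiency=1.0):
    """Total molecules under idealized PCR, where every product is also a template."""
    return templates * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Cycle numbers 4, 6, 8 and 12 are the values tested in the examples of this document.
    for n in (4, 6, 8, 12):
        print(n, linear_copies(100, n), exponential_molecules(100, n))
```

With damaged or U-containing templates the real per-cycle efficiency is likely below 1, which is one reason several cycles may be needed to obtain enough complementary strands.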
  • The 5' sequencing adapter of the present invention is a deoxypolynucleotide with a partially double-stranded structure; "partially double-stranded" means that the 5' sequencing adapter with overhanging random bases includes a single-stranded portion and a double-stranded portion.
  • Sequencing adapters provide priming sequences for amplification or sequencing of nucleic acid fragments, and in some aspects are used in next-generation sequencing applications.
  • A 5' sequencing adapter with overhanging random bases is ligated to the complementary strand of the substrate obtained after the linear amplification reaction.
  • The 3' end of the 5' sequencing adapter with overhanging random bases carries a single-stranded polynucleotide of random bases (see Figure 3), so this adapter is a multi-molecule structure of partially double-stranded polynucleotides.
  • The 5' sequencing adapter with overhanging random bases is also called an "overhang-N 5' sequencing adapter".
  • A 5' sequencing adapter with 6 overhanging random bases is also called an "overhang-6N 5' sequencing adapter".
  • N represents a deoxynucleotide base; that is, each of the 6 N positions is one deoxynucleotide base selected at random from dGTP, dCTP, dATP and dTTP.
  • The "overhang-6N 5' sequencing adapter" is a mixture of molecules carrying different random bases.
  • The number of overhanging random bases of the 5' sequencing adapter of the present invention is 0-50, preferably 2-30, further preferably 2-17, 4-15 or 4-10, more preferably 7-10 or 7-9.
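Since each N position is drawn from four bases, an adapter with n overhanging random bases is a mixture of up to 4^n distinct molecules. The short sketch below, with hypothetical function names, counts and enumerates these overhang sequences.

```python
from itertools import product

BASES = "ACGT"


def overhang_count(n):
    """Number of distinct overhang sequences with n random positions."""
    return len(BASES) ** n


def all_overhangs(n):
    """Lazily yield every possible n-base overhang, written 5'->3'."""
    return ("".join(p) for p in product(BASES, repeat=n))


if __name__ == "__main__":
    # 6N and the preferred range of 7 to 9 random bases.
    for n in (6, 7, 8, 9):
        print(n, overhang_count(n))
```

The count grows fourfold with every added position, so at a fixed total adapter concentration each individual overhang sequence becomes rarer as n increases.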
  • The 5' sequencing adapter is formed by annealing two polynucleotide strands: the strand that does not contain random bases has a phosphorylation modification at its 5' end and a blocking group at its 3' end, and the strand containing random bases has a blocking group at its 3' end (see Figure 3).
  • A method for ligating a single-stranded deoxypolynucleotide: the random-base single-stranded portion of the 5' sequencing adapter with overhanging random bases is complementary to the 3' end of the complementary strand of the substrate obtained after the linear amplification reaction, producing a partially double-stranded structure in addition to the double-stranded portion of the 5' sequencing adapter. The 5' end of the adapter strand that does not contain random bases is ligated to the 3' end of the complementary strand of the polynucleotide substrate (see Figure 1). After the linear amplification step and the purification step, the number of substrate complementary strands available for ligation to the 5' sequencing adapter is increased, and the ligation efficiency is improved.
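The base pairing described above can be illustrated with a small sketch. For a given complementary strand, the random overhang that anneals to its 3' end is the reverse complement of its last n bases. The function names and the example sequence are hypothetical; the geometry (antiparallel pairing between the overhang and the 3' end of the complementary strand) is the only thing taken from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def matching_overhang(complementary_strand, n):
    """Overhang (5'->3') that base-pairs with the last n bases of the strand's 3' end."""
    if not 0 < n <= len(complementary_strand):
        raise ValueError("n must be between 1 and the strand length")
    return reverse_complement(complementary_strand[-n:])


if __name__ == "__main__":
    strand = "ACGTTTGACCGA"  # made-up complementary strand, 5'->3'
    print(matching_overhang(strand, 6))
```

Because stalled extension produces complementary strands with many different 3' ends, a pool of random overhangs can supply a matching sequence for each of them, which is consistent with the improved utilization the text attributes to these adapters.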
  • The molar concentration ratio of polynucleotide substrate to 5' sequencing adapter ranges from 1:100 to 1:4000, preferably 1:500 to 1:1000.
  • The ligases that can be used in the method of the present invention may be DNA ligases or RNA ligases, including but not limited to T4 DNA ligase, E. coli DNA ligase, T7 DNA ligase and T4 RNA ligase.
  • The ligase of the present invention ligates the linker in the tail-controlling component to the tailed polynucleotide substrate.
  • The ligase of the invention ligates the 5' sequencing adapter to the single-stranded deoxypolynucleotide synthesized as the complement of the substrate.
  • the polynucleotide product after step (7) of the present invention is purified.
  • the purification of the polynucleotide product is performed by any method known and understood by those skilled in the art.
  • the purification of the polynucleotide substrate of the present invention can be performed by adding magnetic beads whose surface is modified with carboxyl groups.
  • the purification of the polynucleotide substrate is performed by column purification and precipitation.
  • Phos: phosphate group
  • *: thio-modified site
  • C3 Spacer: three-carbon spacer arm
  • 5mC: 5-methylcytosine deoxynucleotide
  • N: dA, dT, dC or dG nucleotide
  • 10x green buffer (Enzymatics, catalog number B0120; 20 mM Tris-acetate, 50 mM potassium acetate, 10 mM magnesium acetate, pH 7.9)
  • TdT enzyme (Enzymatics, catalog number P7070L, 20 U/µL)
  • E. coli DNA ligase (Takara, catalog number 2161, 60 U/µL)
  • dNTP (Takara, catalog number 4030, 2.5 mM each)
  • Phusion U hot-start DNA polymerase (ThermoFisher, catalog number F555L, 2 U/µL)
  • SB buffer (20% PEG8000, 2.5 M NaCl, 10 mM Tris-HCl, 1 mM EDTA)
  • Linker preparation: Mix the polynucleotide pairs (001/002, 007/015, 008/015, 009/015, 010/015, 011/015, 012/015, 013/015, 014/015) in equimolar amounts, incubate in 1x annealing buffer at 95 °C for 2 minutes, and then cool slowly to room temperature to obtain the tail-controlling linker (as shown in Table 2) and 5' sequencing adapters with different numbers of random bases overhanging at the 3' end (as shown in Table 3).
  • Step (3): Use a focused ultrasound system (Covaris, catalog number S220) to fragment the DNA product of step (2) to 300 bp; set aside for use.
  • Tailing of the polynucleotide substrate: Prepare the tailing and ligation reaction mixture as shown in Table 1-2. Incubate the reaction mixture at 37 °C for 30 minutes, heat it at 95 °C for 5 minutes, and then hold at 4 °C.
  • Linear amplification: Prepare the linear amplification reaction mixture as shown in Table 1-3 and run 4 cycles of linear amplification according to the PCR program shown in Table 1-4. Purify and recover the linear amplification product using 166 µl of 1:6 diluted Beckman Ampure XP magnetic beads (1 volume of Beckman Ampure XP magnetic beads plus 5 volumes of SB buffer) and 280 µl of 1.8:1 diluted SB buffer (1.8 volumes of SB buffer plus 1 volume of enzyme-free water), then elute with 100 µl of EB buffer. Aliquot the eluate at 5 µl per portion into 200 µl PCR tubes, and take 18 portions for the next reaction.
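The two diluted reagents in the purification step are defined by volume parts, so the component volumes follow from simple proportions. This is a minimal sketch with a hypothetical helper name; only the totals (166 µl and 280 µl) and the part ratios come from the text.

```python
def split_by_parts(total_ul, parts_a, parts_b):
    """Split a total volume into two components mixed at parts_a : parts_b."""
    whole = parts_a + parts_b
    return total_ul * parts_a / whole, total_ul * parts_b / whole


if __name__ == "__main__":
    # 166 ul of diluted beads: 1 volume of beads plus 5 volumes of SB buffer.
    beads_ul, sb_ul = split_by_parts(166.0, 1.0, 5.0)
    # 280 ul of diluted SB buffer: 1.8 volumes of SB buffer plus 1 volume of water.
    sb2_ul, water_ul = split_by_parts(280.0, 1.8, 1.0)
    print(f"beads {beads_ul:.1f} ul + SB buffer {sb_ul:.1f} ul")
    print(f"SB buffer {sb2_ul:.1f} ul + water {water_ul:.1f} ul")
```

In practice one would usually prepare a slight excess of each mixture to allow for pipetting loss.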
  • Step (6): The 18 portions of DNA from step (6) are heated at 95 °C for 5 minutes and immediately placed on ice for 2 minutes, so that the melted strands remain single-stranded; set aside for use.
  • "N" 5' sequencing adapters can ligate effectively to the complementary strand of the nucleotide substrate obtained after linear amplification, allowing construction of methylation libraries from bisulfite-treated DNA.
  • Linker preparation: Prepare the tail-controlling linker (001/002, as shown in Table 2) and the 5' sequencing adapter with 6 N overhanging at the 3' end (011/015, as shown in Table 3) according to the linker preparation method in Example 1.
  • Linear amplification products were prepared according to the method described in Example 1. When the linear amplification products were purified and recovered, 31.2 µl of EB buffer was added for elution; the eluate was divided into 2.6 µl portions in 200 µl PCR tubes, and 10 portions were taken for the next reaction.
  • Step (3): The DNA of step (3) was heated at 95 °C for 5 minutes and immediately placed on ice for 2 minutes; set aside for use.
  • A ratio of DNA substrate to 5' sequencing adapter of 1:100 to 1:4000 allows effective construction of a methylation library from bisulfite-treated DNA.
  • Library concentration at different ratios of DNA substrate to 5' sequencing adapter:
      Ratio (substrate : adapter)    Library concentration (nM)
      1:100                          0.0210
      1:500                          0.0499
      1:1000                         0.0641
      1:2000                         0.0693
      1:4000                         0.0836
  • Linker preparation: Prepare a tail-controlling linker (001/002, as shown in Table 2) and a 5' sequencing adapter with 7 N overhanging at the 3' end (012/015, as shown in Table 3) according to the linker preparation method in Example 1.
  • Step (3): Use a focused ultrasound system (Covaris, catalog number S220) to fragment the DNA product of step (2) to 300 bp; set aside for use.
  • Step (6): The DNA prepared in step (6) was heated at 95 °C for 5 minutes and immediately placed on ice for 2 minutes.
  • Linker preparation: A tail-controlling linker carrying a random molecular tag (006/029, as shown in Table 4) and a 5' sequencing adapter with 7 N overhanging at the 3' end (012/015, as shown in Table 3) were prepared according to the method described in Example 1.
  • (2) Preparation of the methylation sequencing library: Construct the methylation sequencing library according to the method described in Example 3, up to the 5' sequencing adapter ligation reaction. Recover the ligated DNA with 17 µl of Beckman Ampure XP magnetic beads and elute with 20 µl of enzyme-free water. Dilute 2 µl of the eluate in 18 µl of enzyme-free water to obtain a 10-fold dilution; dilute 2 µl of the 10-fold dilution in 18 µl of enzyme-free water to obtain a 100-fold dilution; continue in the same way to obtain a 10,000-fold dilution. Take 5.34 µl of the 10,000-fold dilution and set aside for use.
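The serial dilution above repeats the same 10-fold step (2 µl of sample into 18 µl of water) four times. The sketch below, using hypothetical function names, reproduces the cumulative dilution factors and the diluent volume per step.

```python
def cumulative_dilutions(step_fold, steps):
    """Cumulative dilution factor after each of `steps` identical dilution steps."""
    return [step_fold ** i for i in range(1, steps + 1)]


def diluent_volume(sample_ul, step_fold):
    """Diluent needed so that sample_ul ends up diluted step_fold times."""
    return sample_ul * (step_fold - 1)


if __name__ == "__main__":
    print(cumulative_dilutions(10, 4))
    print(diluent_volume(2, 10))
```

Four identical steps are needed to reach the 10,000-fold dilution, and each step consumes only 2 µl of the previous tube.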
  • When each λ-DNA genomic locus yields 100 unique sequencing fragments, this is referred to as a library construction efficiency of 10%.
  • Count the number of unique sequencing fragments obtained at each λ-DNA genomic locus, calculate the mean and standard deviation, and divide the standard deviation by the mean to obtain the coefficient of variation.
  • The coefficient of variation represents the uniformity of the library construction method; the lower the coefficient of variation, the better the uniformity.
  • Library construction efficiency reflects the effectiveness of the library construction method; the higher, the better.
  • Linear amplification cycles of 4, 6, 8, and 12 can effectively complete linear amplification, and construct a methylation sequencing library for bisulfite-treated DNA; when the linear amplification cycle is 12, The database building efficiency is the highest, reaching 92.63%.
  • 10x End Repair Buffer (New England Biolabs, catalog number B6052S, 50 mM Tris-hydrochloric acid, 10 mM magnesium chloride, 10 mM dithiothreitol, 1 mM adenosine triphosphate, 0.4 mM dATP, 0.4 mM dCTP, 0.4 mM dGTP, 0.4 mM dTTP, pH 7.5 )
  • T4DNA polymerase Enzymatics, catalog number P7080L, 3U / ⁇ L
  • T4 polynucleotide kinase Enzymatics, catalog number Y9040L, 10U / ⁇ L
  • 10x dA tailing buffer (New England Biolabs, catalog number B6059S, 10 mM Tris-hydrochloric acid, 10 mM magnesium chloride, 50 mM sodium chloride, 1 mM dithiothreitol, 0.2 mM dATP, pH 7.9)
  • Connector preparation Prepare a tail-controlling connector (001/002, as shown in Table 2) and a "5 'sequencing connector" (012/015) with 7 Ns hanging at the 3' end according to the connector preparation method in Example 1 ,as shown in Table 3).
  • step (1-3) Use a focused ultrasound system (Covaris, catalog number S220) to fragment the DNA product of step (1-2) to 300 bp, both of which are divided into two (that is, two parallel), to be used.
  • step (1-6) The DNA eluate of step (1-6) was reacted at 95 ° C for 5 minutes, and immediately placed on ice for 2 minutes for use.
  • step (1-11) Use the Illumina-Nova sequencer to perform 150PE mode sequencing on the library obtained in step (1-9), and the analysis method is the same as step (5) in Example 4.
  • step (2-9) Use the Illumina-Nova sequencer to sequence the library obtained in step (2-7) in 150PE mode, and the analysis method is the same as step (5) in Example 4.
  • step (3-1) Fragment the product of step (3-1) to 300 bp using a focused ultrasound system, and divide them into two parts (that is, two parallel) for use.
  • step (3-4) Perform PCR amplification on the DNA from step (3-3) according to the Index PCR step in the Swift DNA Methylation Library Construction Kit (Swift Biosciences, catalog number 30024), using Index as the Index kit (Swift Biosciences, catalog number 36024) I16 or I19, the difference is that the number of PCR amplification cycles is 28.
  • step (3-4) Purify and recover the PCR product from step (3-4) according to the PCR product purification method shown in step (2-7) to obtain the final sequencing library.
  • step (2-7) Use the Illumina-Nova sequencer to perform 150PE mode sequencing on the library obtained in step (2-7).
  • the analysis method is the same as step (5) in Example 4.
  • the concentration of the constructed methylation sequencing library is 18.684nM, 1.641nM and 0.146nM, respectively.
  • the efficiency of library construction is 56.1%, 22% and 3.92%, and the coefficients of variation are 0.463, 8.49, and 3.73, respectively.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Immunology (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

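A method and kit for constructing second-generation high-throughput sequencing libraries, which improve the utilization efficiency of nucleic acid templates, simplify the library construction workflow, and make sequencing results more accurate with more uniform coverage.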

Description

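Method for constructing a sequencing library
Technical Field
The present invention relates to methods and kits for constructing second-generation high-throughput sequencing libraries. More specifically, it relates to methods and kits in which a sequencing adapter carrying random bases as a 3' overhang is used for high-throughput sequencing library construction.
Background
Compared with first-generation sequencing, second-generation sequencing is faster and has higher throughput, meeting current demands on sequencing. Current second-generation platforms mainly include Illumina's HiSeq, MiSeq, NextSeq and NovaSeq, and Life Technologies' SOLiD system, PGM and Proton. The underlying idea of second-generation sequencing is sequencing by synthesis: the DNA sequence is determined from the signal change produced by each newly incorporated base. For example, Illumina platforms detect optical signals, whereas Life platforms detect current changes caused by pH changes. Second-generation sequencing is so far the most mature and most widely used high-throughput DNA sequencing approach; it has contributed greatly to large-scale genome sequencing and to gene-based diagnosis and therapy, and its clinical use will continue to expand.
Circulating DNA, also called cell-free DNA, is DNA present outside cells in the blood. Cell-free DNA mainly originates from apoptotic cells or bone marrow cells; the DNA released by these cells is cleaved by nucleases in the body into small fragments about 166 bp long (Y. M. Dennis Lo et al. Science Translational Medicine. 2010. 10: 61ra91). Cell-free DNA is in dynamic equilibrium in the body and can therefore serve as an important parameter for health assessment. Changes such as tumorigenesis or organ transplantation alter the properties of cell-free DNA in peripheral blood, including its length, base information and epigenetic modifications; cell-free DNA can therefore serve as an important non-invasive marker for early diagnosis, monitoring and prognosis of disease.
At present, non-invasive prenatal diagnosis using cell-free DNA as a molecular marker is widely accepted clinically, and many countries have rolled it out. Besides base information, the length of cell-free DNA is also a very important molecular marker. Studies have found that nucleosomes, transcription factors or DNA-binding proteins in different tissues or cell states bind to different regions of DNA, which ultimately changes cell-free DNA length and sequencing coverage; these differences can be used to trace the origin of the cell-free DNA, opening new prospects for early cancer diagnosis, organ transplantation and monitoring (Matthew W. Snyder et al. Cell 2016. 1: 57-68). In addition, high-throughput sequencing studies of tumor methylation have found that differential DNA methylation signals between tumor and normal tissue, identified by methylation sequencing, enable early cancer diagnosis; combined with tissue-specific methylation signals, the tumor can also be localized, which is of great significance for diagnosis and treatment after early cancer screening (Kun Sun et al. 2015. 5: 5503-12; Shicheng Guo et al. 2017. 3: 635–642).
Before methylation sequencing of cell-free DNA with second-generation high-throughput sequencing, a methylation sequencing library must be constructed. The conventional workflow (see Figure 4) first builds a pre-library, including end filling, 5' phosphorylation, 3' A-tailing and adapter ligation; bisulfite treatment is performed only after the pre-library is complete. Bisulfite treatment damages a large amount of DNA, so that less than 10% of the original templates remain available for sequencing (Masahiko Shiraishi et al. 2004. 10: 409-415). This workflow has the following drawbacks: 1) every step requires purification, making the procedure tedious; 2) the end-filling step artificially introduces nucleotides and alters the true methylation state; 3) a large number of DNA templates are destroyed during bisulfite treatment and lost after PCR amplification.
The Swift methylation library construction method builds libraries more efficiently than the conventional method (CN104395480, see Figure 5). Its workflow performs bisulfite treatment first and then library construction: 3' tailing and adapter ligation, followed by an extension reaction to obtain double-stranded deoxypolynucleotide, and then ligation of the sequencing adapter at the other end. However, the DNA template in the extension reaction contains a large amount of dU and has been damaged by bisulfite treatment, and only one round of extension takes place. Complete double-stranded deoxypolynucleotide is therefore obtained with low efficiency, few templates are available for ligating the second sequencing adapter, and ultimately few templates can be sequenced. The field of genetic diagnosis therefore continues to need better and more efficient library construction methods that improve template utilization.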
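Summary of the Invention
In view of the problems encountered in constructing bisulfite-based DNA methylation sequencing libraries, the present invention provides a method for ligating adapters to single-stranded deoxypolynucleotides, and further provides a method for constructing second-generation high-throughput sequencing libraries. The method is applicable not only to normal DNA but also to severely damaged samples such as FFPE samples, ancient DNA and bisulfite-treated DNA.
The invention relates to a method for constructing a library from a deoxypolynucleotide substrate, comprising the following steps:
(1) mixing the single-stranded deoxypolynucleotide substrate with the following to form a first mixture: a) one deoxynucleotide selected from dGTP, dCTP, dATP and dTTP; b) terminal deoxynucleotidyl transferase and DNA ligase; c) a tail-controlling component, which is a partially double-stranded nucleotide molecule consisting of a polynucleotide homopolymer 5 to 20 nucleotides long and an X region, together with a linker polynucleotide complementary to the X region; wherein the polynucleotide homopolymer is complementary to the deoxynucleotide in a);
(2) incubating the first mixture, whereby the 3' end of the single-stranded substrate undergoes a tailing reaction with the deoxynucleotide in solution, and the 3' end of the substrate carrying the added homopolymer tail is ligated to the linker of the tail-controlling component, giving the tailed substrate;
(3) adding to the reaction system of step (2) a DNA polymerase, deoxynucleotides comprising dGTP, dCTP, dATP and dTTP, and a linear amplification primer, to form a second mixture;
(4) incubating the second mixture: a first linear extension reaction is performed using the tailed substrate from step (2) as template to synthesize the strand complementary to the substrate; the strands are then separated, the linear amplification primer anneals to the substrate, and further linear extension reactions are performed, the number of linear extension reactions being not less than 3;
(5) melting the product of step (4);
(6) adding a 5' sequencing adapter and DNA ligase to the solution of step (5) to form a third mixture;
(7) incubating the third mixture, whereby the 5' sequencing adapter is joined to the complementary strand of the substrate, yielding the DNA library.
In a specific embodiment, the primer used in the first linear extension reaction of step (4) is the tail-controlling molecule (i.e., the single strand made up of the tail-controlling region and the X region), and the primer used in subsequent extension reactions is the linear amplification primer added in step (3).
In a specific embodiment, before the extension reaction of step (4) starts, the polynucleotide homopolymer and X-region fragment of the tail-controlling component are degraded, and linear extension on the substrate is carried out with the linear amplification primer added in step (3).
In a specific embodiment, the linear amplification primer added in step (3) competitively binds the substrate, so that this added primer carries out linear extension on the substrate.
The invention further relates to a kit capable of constructing a library from a deoxypolynucleotide substrate, comprising:
Component 1: comprising one deoxynucleotide selected from dGTP, dCTP, dATP and dTTP, terminal deoxynucleotidyl transferase, DNA ligase and a tail-controlling component, wherein the tail-controlling component is a partially double-stranded nucleotide molecule consisting of a polynucleotide homopolymer 5 to 20 nucleotides long and an X region, together with a linker polynucleotide complementary to the X region, and the polynucleotide homopolymer is complementary to said one deoxynucleotide selected from dGTP, dCTP, dATP and dTTP;
Component 2: comprising a DNA polymerase, deoxynucleotides comprising dGTP, dCTP, dATP and dTTP, and a linear amplification primer;
Component 3: comprising a 5' sequencing adapter and DNA ligase.
By introducing linear amplification, the invention effectively increases the number of strands complementary to the original single-stranded polynucleotide substrate. By designing a 5' sequencing adapter in which one strand carries several random bases as a 3' overhang, the 5' sequencing adapter can be added very efficiently to the 3' end of the complementary strand of the polynucleotide substrate. In the overall workflow, the polynucleotide substrate is denatured to single strands, tailed and ligated to the linker, and linearly amplified to give complementary strands; the 3' ends of the complementary strands are then ligated to the 5' sequencing adapter, and PCR enrichment yields a library ready for next-generation sequencing.
In addition, bisulfite treatment of the substrate for methylation analysis yields a deoxypolynucleotide template containing U bases. Because U bases may terminate linear amplification, the resulting complementary strands vary in length; however, by ligating the single-stranded complementary strands to a 5' sequencing adapter in which one strand carries several random bases as a 3' overhang, the invention greatly improves utilization of the complementary strands. According to the invention, this workflow can construct whole-genome methylation sequencing libraries from as little as 2 ng of genomic DNA from cultured human cells and gives efficient sequencing results.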
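Brief Description of the Drawings
Figure 1: Flow chart of DNA library construction by the method of the invention
Figure 2: Structure of the tail-controlling component
Figure 3: Structure of the 6N-overhang "5' sequencing adapter"
Figure 4: Flow chart of DNA methylation library construction by the conventional method
Figure 5: Flow chart of DNA methylation library construction by the Swift method
Detailed Description of the Invention
In this specification, 3', 3' end and 3' terminus have the same meaning, as do 5', 5' end and 5' terminus; they refer to the 3' end or the 5' end of a nucleotide sequence, respectively.
Polynucleotide substrate
The polynucleotide substrate is the polynucleotide fragment that is to undergo the tailing reaction and library construction. In various embodiments, the polynucleotide substrate is single-stranded or double-stranded DNA. In further embodiments, the polynucleotide substrate is a chemically treated nucleotide sequence, including but not limited to bisulfite-treated polynucleotide.
The polynucleotide substrate may be of natural origin or synthetic. Natural sources are polynucleotide sequences from prokaryotes or eukaryotes, such as humans, mice, viruses, plants or bacteria. The polynucleotide substrate of the invention may also be a severely damaged sample such as an FFPE sample, ancient DNA or bisulfite-treated DNA. Tailed polynucleotide substrates can be used in microarray-based assays and to generate libraries for next-generation nucleic acid sequencing. They can also be used for efficient cloning of polynucleotide sequences.
In some embodiments, the polynucleotide substrate is single-stranded or double-stranded and has a free 3' hydroxyl group. In some aspects, the polynucleotide substrate is double-stranded with blunt ends. In other aspects, the double-stranded polynucleotide substrate has a recessed 3' end. The length of the protruding or recessed end may vary; in various aspects it is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more nucleotides.
In some aspects, the polynucleotide substrate is between about 10 and about 5000 nucleotides long, or between about 40 and about 2000 nucleotides, or between about 50 and about 1000 nucleotides, or between about 100 and about 500 nucleotides. In further aspects, the polynucleotide substrate is at least 3 and at most about 50, 100 or 1000 nucleotides long.
DNA dephosphorylation
DNA dephosphorylation is the removal of the phosphate group from the terminal nucleotide residue at the 5' and 3' ends of DNA; it is generally achieved by treating the DNA with alkaline phosphatase.
In some embodiments, the deoxypolynucleotide substrate is dephosphorylated before it is tailed.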
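Tailing reaction
As used herein, the term "tailing" is interchangeable with "controlled tailing". Step (2) of the method tails the deoxypolynucleotide substrate, adding the desired number of nucleotides to the 3' end of the substrate in a controlled manner. By way of non-limiting example, the tail length is controlled by adding a tail-controlling component that contains a polynucleotide homopolymer 5 to 20 nucleotides long. The tail-controlling component forms a duplex with the homopolymer tail newly added to the substrate, which slows the polymerization and keeps the tail of the polynucleotide substrate within a defined length range (see Figure 1).
The nucleotide added in the tailing reaction is one deoxynucleotide selected from dGTP, dCTP, dATP and dTTP; for example, it may be a dGTP solution, a dCTP solution, a dATP solution or a dTTP solution. In a specific embodiment, a tail-controlling component containing a poly(dT) homopolymer sequence is used to control the addition of a poly(dA) tail (also called a (dA) homopolymer tail) to the 3' end of the polynucleotide substrate by TdT (terminal deoxynucleotidyl transferase). The homopolymer tail of the substrate is then ligated to the linker of the tail-controlling component, forming the 3' tailed region of the substrate.
In a specific embodiment, the tail-controlling component contains a homopolymer sequence of 5-20, preferably 5-13, further preferably 7-10, more preferably 7-9 identical nucleotides.
In a specific embodiment, the molar concentration ratio of polynucleotide substrate to tail-controlling component is in the range 1:1-1:100, preferably 1:5-1:50.
In a specific embodiment, the tailing reaction of step (2) is carried out at a pH of about 5.0 to about 9.0; the molar concentration ratio of polynucleotide substrate to mononucleotide is in the range 1:10-1:20000, preferably 1:100-1:2000; the incubation time is 1 to 120 minutes, preferably 0.5-60 minutes, 0.5-30 minutes, 1-20 minutes, 1-15 minutes or 1-10 minutes; the incubation temperature is 20 °C-50 °C, preferably 25 °C-45 °C, more preferably 25 °C-37 °C.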
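Tail-controlling component
The invention controls the length and efficiency of tailing of the polynucleotide substrate by adding a tail-controlling component. The tail-controlling component consists of a tail-controlling region and an X region, together with a linker sequence complementary to the X region (see Figure 2); it is also called the "tail-controlling adapter". The single strand made up of the tail-controlling region and the X region is called the "tail-controlling molecule".
The tail-controlling region is a polynucleotide homopolymer 5-20 nucleotides long. A polynucleotide homopolymer, also called a "poly region", is a polynucleotide chain of identical nucleotides. The tail-controlling region is a homopolymer of one deoxynucleotide chosen from dGTP, dCTP, dATP and dTTP, preferably a poly(dT) region. The homopolymer is 5-20 nucleotides long, preferably 7-20 or 9-20 nucleotides, further preferably 5-10 or 7-10 nucleotides, and more preferably 7-9 nucleotides. A dG or dC homopolymer of suitable length can effectively limit the tail added to the polynucleotide substrate to about 20 nucleotides.
The "X region sequence" provides a priming sequence for amplification or sequencing of nucleic acid fragments. It may also contain a tag sequence of 4-16 bases for distinguishing different substrate molecules, and in some aspects it is used in next-generation sequencing applications. In some embodiments, the X region sequence may be, but is not limited to, a next-generation sequencing (NGS) adapter sequence compatible with the Illumina, Ion Torrent, Roche 454 or SOLiD sequencing platforms. The X region sequence may be a DNA sequence, an RNA sequence, or a mixed sequence containing both DNA and RNA.
Among the linkers of the tail-controlling component, one that contains only the sequence complementary to the X region is called a "short linker"; one that additionally contains an extension-primer binding region is called a "long linker", as shown in Table 2.
The invention uses a method of tailing a deoxypolynucleotide substrate that adds the desired number of nucleotides to the 3' end of the substrate in a controlled manner. By way of non-limiting example, a tail-controlling component containing a polynucleotide homopolymer 5-20 nucleotides long is added; it base-pairs with the homopolymer tail newly added to the substrate to form a duplex, which slows the polymerization and keeps the tail of the polynucleotide substrate within a defined length range (see Figure 1).
In some embodiments, a tail-controlling component containing a poly(dT) homopolymer sequence is used to control the addition of a poly(dA) tail (also called a (dA) homopolymer tail) to the 3' end of the polynucleotide substrate by TdT. The poly(dA) tail of the substrate is then ligated to the linker of the tail-controlling component, forming the 3' tailed region of the substrate.
In some embodiments, the tail-controlling component contains a blocking group. A blocking group, as used herein, is a moiety that prevents enzymatic extension; without it, an enzyme could synthesize polynucleotide by adding nucleotides. Blocking groups include but are not limited to: a phosphate group, a three-carbon (C3) spacer, a dideoxynucleotide, a ribonucleotide, an amino group and an inverted deoxythymidine.
In some embodiments, the linker of the tail-controlling component has a 5' phosphate modification and a 3' blocking group, and the tail-controlling region has a 3' blocking group.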
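Linear amplification reaction
The linear amplification reaction is also called the "linear extension reaction" or "linear extension".
After tailing the deoxypolynucleotide substrate, the invention uses it as a template for a linear amplification reaction. The invention thus provides a method of linear amplification on a deoxypolynucleotide substrate, which increases the number of strands complementary to the substrate in a linear fashion (see Figure 1).
In some embodiments, after the tailing reaction a linear amplification primer is added and extension is carried out with the tailed substrate as template; the first extension may be primed by the tail-controlling molecule. In some embodiments, the linear amplification primer displaces the tail-controlling molecule from the polynucleotide substrate by competition and then primes linear amplification on the substrate. In some specific embodiments, before the extension reaction of step (4) starts, the polynucleotide homopolymer and X-region fragment of the tail-controlling component are degraded, and linear extension on the substrate is carried out with the linear amplification primer added in step (3).
In a specific embodiment, the method comprises: after the tailing reaction, denaturing the polynucleotide substrate to single strands and adding a linear amplification primer complementary to the 3' linker sequence of the substrate, a DNA polymerase and deoxynucleotides. Nucleotide extension occurs at the 3' end of the linear amplification primer, synthesizing the strand complementary to the substrate and giving double-stranded deoxypolynucleotide. Denaturation then separates the complementary strand from the substrate, and the substrate again undergoes extension with the linear amplification primer, DNA polymerase and deoxynucleotides. The number of extension reactions is called the number of linear amplification cycles. In some aspects, the DNA polymerase can efficiently amplify deoxypolynucleotide templates containing U bases.
In a specific embodiment, the number of linear amplification cycles is not less than 3, preferably not less than 4, preferably 4-50, further preferably 4-20, more preferably 4-12 or 8-12.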
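To illustrate why the number of cycles matters, the following minimal Python sketch contrasts linear extension, in which only the original tailed substrate serves as template, with conventional two-primer PCR. The function names and the per-cycle `efficiency` parameter are illustrative assumptions and are not part of the disclosure; real yields depend on template damage and polymerase behaviour.

```python
def complementary_strands_linear(templates: int, cycles: int,
                                 efficiency: float = 1.0) -> float:
    """Complementary strands after `cycles` rounds of linear extension.

    Only the original tailed substrate is primed, so each cycle adds at most
    one new complementary strand per template (idealized model).
    """
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return templates * cycles * efficiency


def copies_exponential(templates: int, cycles: int,
                       efficiency: float = 1.0) -> float:
    """Copies after two-primer PCR, shown only for contrast."""
    if cycles < 0 or not 0.0 <= efficiency <= 1.0:
        raise ValueError("cycles must be >= 0 and efficiency within [0, 1]")
    return templates * (1.0 + efficiency) ** cycles


# Compare the two regimes for the cycle numbers named in the text.
for n in (3, 4, 8, 12):
    print(n, complementary_strands_linear(1000, n), copies_exponential(1000, n))
```

In this idealized model every complementary strand is copied directly from an original substrate molecule, which is the property the text relies on when it amplifies damaged, U-containing templates.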
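Ligation of the 5' sequencing adapter
The 5' sequencing adapter of the invention is a deoxypolynucleotide with a partially double-stranded structure; "partially double-stranded" means that the 5' sequencing adapter with a random-base overhang contains a single-stranded part and a double-stranded part.
The 5' sequencing adapter provides a priming sequence for amplification or sequencing of nucleic acid fragments, and in some aspects is used in next-generation sequencing applications.
The invention adds a 5' sequencing adapter with a random-base overhang, which is ligated to the complementary strand of the substrate obtained from the linear amplification reaction. The 3' end of this adapter carries a stretch of single-stranded random bases (see Figure 3), so the adapter is a partially double-stranded polynucleotide that exists as a mixture of many molecular species; it is also called an "N-overhang 5' sequencing adapter". For example, a 5' sequencing adapter with 6 overhanging random bases is called a "6N-overhang 5' sequencing adapter", where N denotes one deoxynucleotide base: each of the 6 N is independently and randomly one of the bases of dGTP, dCTP, dATP and dTTP. The "6N-overhang 5' sequencing adapter" is thus a mixture of molecules with different random-base sequences.
The number of overhanging random bases on the 5' sequencing adapter is 0-50, preferably 2-30, further preferably 2-17, 4-15 or 4-10, and more preferably 7-10 or 7-9.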
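The size of the adapter mixture implied by an N-overhang can be made concrete with a short Python sketch. It assumes each overhang position is drawn independently and uniformly from the four bases; the helper names are hypothetical and chosen only for illustration.

```python
from itertools import product

BASES = "ACGT"


def overhang_pool_size(n: int) -> int:
    """Number of distinct overhang sequences for an n-base random overhang."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return len(BASES) ** n


def enumerate_overhangs(n: int):
    """Yield every possible overhang sequence of length n."""
    for combo in product(BASES, repeat=n):
        yield "".join(combo)


def fraction_matching(n: int) -> float:
    """Fraction of adapter molecules whose overhang is fully complementary
    to one given 3' end, under the uniform-mixture assumption."""
    return 1.0 / overhang_pool_size(n)


# Pool sizes for the overhang lengths tested in Example 1 (2N to 9N).
for n in range(2, 10):
    print(n, overhang_pool_size(n), fraction_matching(n))
```

Longer overhangs pair with more bases of the target 3' end, but each specific sequence becomes a smaller share of the mixture, which is one plausible reason for using a large molar excess of adapter over substrate.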
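In specific embodiments, the 5' sequencing adapter is formed by annealing two polynucleotide strands: the strand without random bases has a 5' phosphate modification and a 3' blocking group, and the strand with random bases has a 3' blocking group (see Figure 3).
In the method for ligating an adapter to single-stranded deoxypolynucleotide, the single-stranded random-base portion of the adapter base-pairs with the 3' end of the complementary strand of the substrate obtained from linear amplification, forming a partially double-stranded structure in addition to the adapter's own double-stranded part. DNA ligase then joins the 5' end of the adapter strand that lacks random bases to the 3' end of the complementary strand of the polynucleotide substrate (see Figure 1). After the linear amplification and purification steps, more complementary strands are available for ligation to the 5' sequencing adapter, and ligation efficiency increases.
In a specific embodiment, the molar concentration ratio of polynucleotide substrate to 5' sequencing adapter is in the range 1:100-1:4000, preferably 1:500-1:1000.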
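A substrate-to-adapter molar ratio can be converted into adapter amounts as in the following Python sketch. The average mass of 330 g/mol per nucleotide for single-stranded DNA is a common approximation assumed here and is not taken from the disclosure; the function names are likewise illustrative.

```python
SSDNA_G_PER_MOL_PER_NT = 330.0  # assumed average mass of one ssDNA nucleotide


def ssdna_pmol(mass_ng: float, length_nt: int) -> float:
    """Approximate picomoles of single-stranded fragments of a given length."""
    if mass_ng < 0 or length_nt <= 0:
        raise ValueError("mass must be >= 0 and length must be positive")
    return mass_ng * 1000.0 / (length_nt * SSDNA_G_PER_MOL_PER_NT)


def adapter_pmol(conc_um: float, volume_ul: float) -> float:
    """Picomoles of adapter in `volume_ul` of a `conc_um` micromolar stock."""
    return conc_um * volume_ul


def adapter_needed_pmol(substrate_pmol: float, ratio: int) -> float:
    """Adapter amount for a substrate:adapter molar ratio of 1:`ratio`."""
    if not 100 <= ratio <= 4000:
        raise ValueError("ratio outside the 1:100 to 1:4000 range in the text")
    return substrate_pmol * ratio


substrate = ssdna_pmol(10.0, 300)  # e.g. 10 ng of 300-nt fragments
print(substrate, adapter_needed_pmol(substrate, 1000))
print(adapter_pmol(25.0, 2.4))     # 2.4 ul of a 25 uM adapter stock
```

The last line uses the adapter volume and stock concentration that appear in the ligation mixes of the examples; the substrate amount in the first line is a made-up input for illustration.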
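Ligase
Ligases usable in the method of the invention may be DNA ligases or RNA ligases, including but not limited to T4 DNA ligase, E. coli DNA ligase, T7 DNA ligase and T4 RNA ligase.
The ligase joins the linker of the tail-controlling component to the tailed polynucleotide substrate. In other embodiments, the ligase joins the 5' sequencing adapter to the synthesized single-stranded deoxypolynucleotide complementary to the substrate.
Separation step
In some embodiments, the polynucleotide product obtained after step (7) is purified. Purification may be carried out by any method known and understood by those skilled in the art. In the present invention, the polynucleotide may be purified by adding magnetic beads with a carboxyl-modified surface. In other specific embodiments, purification is carried out by column purification and precipitation.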
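Specific Embodiments
The sequences used in the examples are listed in Tables 1 to 6 below.
Table 1. DNA polynucleotides used in the examples
Figure PCTCN2019102651-appb-000001
Figure PCTCN2019102651-appb-000002
Phos: phosphate; *: phosphorothioate site; C3 Spacer: three-carbon spacer; 5mC: 5-methyl-deoxycytidine nucleotide; N: dA, dT, dC or dG nucleotide
Table 2. Tail-controlling component with a poly(dT) overhang for methylation sequencing library construction
Figure PCTCN2019102651-appb-000003
Phos: phosphate; C3 Spacer: three-carbon spacer; *: phosphorothioate site
Table 3. 5' sequencing adapters with different numbers of overhanging N used in Example 1
Figure PCTCN2019102651-appb-000004
Phos: phosphate; C3 Spacer: three-carbon spacer; *: phosphorothioate site; N: dA, dT, dC or dG nucleotide
Table 4. Tail-controlling component with a poly(dT) overhang and a molecular tag for methylation sequencing library construction
Figure PCTCN2019102651-appb-000005
Phos: phosphate; C3 Spacer: three-carbon spacer; *: phosphorothioate site; N: dA, dT, dC or dG nucleotide
Table 5. "Conventional methylation sequencing adapter" for methylation sequencing library construction
Figure PCTCN2019102651-appb-000006
Phos: phosphate; 5mC: 5-methyl-deoxycytidine nucleotide
Examples
Example 1. Effect of 5' sequencing adapters with different numbers of overhanging random bases on 5' sequencing adapter ligation
Materials:
5x annealing buffer (Beyotime, catalog number D0251)
Nuclease-free water (Solarbio, catalog number R1600-100)
λ-DNA (Takara, catalog number 3019)
Bisulfite treatment kit (Zymo Research, catalog number D5005)
FastAP Thermosensitive Alkaline Phosphatase (ThermoFisher, EF0651, 1 U/μL)
10x CutSmart buffer (New England Biolabs, catalog number B7204S)
10x Green Buffer (Enzymatics, catalog number B0120; 20 mM Tris-acetate, 50 mM potassium acetate, 10 mM magnesium acetate, pH 7.9)
β-Nicotinamide adenine dinucleotide (New England Biolabs, catalog number B9007S, 50 mM)
dATP (Takara, catalog number 4026, 100 mM)
TdT (Enzymatics, catalog number P7070L, 20 U/μL)
E. coli DNA ligase (Takara, catalog number 2161, 60 U/μL)
EB buffer (Qiagen, catalog number 19086)
dNTP (Takara, catalog number 4030, 2.5 mM each)
Phusion U Hot Start DNA Polymerase (ThermoFisher, catalog number F555L, 2 U/μL)
5x Phusion HF buffer (ThermoFisher, catalog number F555L)
Beckman Ampure XP magnetic beads (Beckman, catalog number A63882)
SB buffer: 20% PEG8000, 2.5 M NaCl, 10 mM Tris-HCl, 1 mM EDTA
2x T4 DNA Rapid Ligation Buffer (Enzymatics, catalog number B1010)
T4 DNA Rapid Ligase (Enzymatics, catalog number L6030-HC-L, 600 U/μl)
Methods: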
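(1) Adapter preparation: mix the polynucleotide pairs (001/002, 007/015, 008/015, 009/015, 010/015, 011/015, 012/015, 013/015, 014/015) in equimolar amounts, incubate in 1x annealing buffer at 95 °C for 2 minutes, then cool slowly to room temperature to obtain the tail-controlling adapter (as shown in Table 2) and 5' sequencing adapters with different numbers of random bases overhanging at the 3' end (as shown in Table 3).
(2) Bisulfite-treat 40 ng λ-DNA (Takara, catalog number 3019) with the bisulfite treatment kit.
(3) Use a focused ultrasonicator (Covaris, catalog number S220) to fragment the DNA product of step (2) to 300 bp; set aside.
(4) Prepare the DNA dephosphorylation reaction mixture as shown in Table 1-1, incubate at 37 °C for 30 minutes, heat at 95 °C for 5 minutes, then immediately place on ice for 2 minutes until use, keeping the substrate single-stranded.
Table 1-1
Figure PCTCN2019102651-appb-000007
(5) Tailing of the polynucleotide substrate: prepare the tailing and ligation reaction mixture as shown in Table 1-2, incubate at 37 °C for 30 minutes, heat at 95 °C for 5 minutes, then hold at 4 °C.
Table 1-2
Figure PCTCN2019102651-appb-000008
(6) Linear amplification: prepare the linear amplification reaction mixture shown in Table 1-3 and run 4 linear amplification cycles using the PCR program shown in Table 1-4; purify the linear amplification product with 166 μl of 1:6 diluted Beckman Ampure XP beads (1 volume of Beckman Ampure XP beads plus 5 volumes of SB buffer) and 280 μl of 1.8:1 diluted SB buffer (1.8 volumes of SB buffer plus 1 volume of nuclease-free water), elute with 100 μl EB buffer, and divide the 100 μl eluate into 5 μl aliquots in 200 μl PCR tubes, taking 18 aliquots for the next step.
Table 1-3
Figure PCTCN2019102651-appb-000009
Table 1-4
Figure PCTCN2019102651-appb-000010
(7) Heat the 18 DNA aliquots from step (6) at 95 °C for 5 minutes and immediately place on ice for 2 minutes, so that duplexes melt and remain single-stranded until use.
(8) Prepare ligation reaction mixtures of the DNA from step (7) with "5' sequencing adapters" carrying 2N to 9N overhangs as shown in Table 1-5, in duplicate for each adapter; incubate at 25 °C for 15 minutes, recover the ligated DNA with 17 μl Beckman Ampure XP beads, and elute with 26 μl nuclease-free water to obtain the pre-PCR library.
Table 1-5
Figure PCTCN2019102651-appb-000011
(9) Measure the molar concentration of the pre-PCR library using the library quantification kit (KAPA Biosystems, catalog number KK4824) together with the DNA quantification standards and primer premix kit (KAPA Biosystems, catalog number KK4808), plus the qPCR forward/reverse primers (005/004, see Table 1; used at the same concentration as in the primer premix kit); a pre-PCR library fragment size of 320 bp is used in the calculation.
Results: as shown in Table 1-6, the pre-PCR library built with the 2N-overhang "5' sequencing adapter" had the lowest concentration, 0.000780 nM. As the number of overhanging N rose from 2 to 4, the pre-PCR library concentration rose from 0.000780 nM to 0.0254 nM, an exponential increase; from 4 to 7 the increase slowed, and the 7N-overhang adapter gave the highest concentration, 0.0653 nM. The 7N, 8N and 9N "5' sequencing adapters" (see Table 3) gave essentially the same pre-PCR library concentrations: 0.0653 nM, 0.0627 nM and 0.0646 nM, respectively.
Conclusion: "5' sequencing adapters" with 2-9 overhanging N can all be ligated effectively to the complementary strand of the nucleotide substrate obtained after linear amplification, allowing methylation libraries to be constructed from bisulfite-treated DNA.
Table 1-6
Figure PCTCN2019102651-appb-000012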
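Example 2. Effect of adapter concentration on 5' sequencing adapter ligation
Methods:
(1) Adapter preparation: prepare the tail-controlling adapter (001/002, as shown in Table 2) and the "5' sequencing adapter" with a 6-N overhang at the 3' end (011/015, as shown in Table 3) according to the adapter preparation method in Example 1.
(2) Bisulfite-treat 24 ng λ-DNA with the bisulfite treatment kit.
(3) Preparation of linear amplification product: prepare the linear amplification product as described in Example 1; when purifying the product, elute with 31.2 μl EB buffer and divide the 31.2 μl eluate into 2.6 μl aliquots in 200 μl PCR tubes, taking 10 aliquots for the next step.
(4) Heat the DNA from step (3) at 95 °C for 5 minutes and immediately place on ice for 2 minutes until use.
(5) Reaction: prepare the "5' sequencing adapter" ligation mixtures as shown in Table 2-1 below, incubate at 25 °C for 15 minutes, recover the ligated DNA with 17 μl Beckman Ampure XP beads, and elute with 26 μl nuclease-free water to obtain the pre-PCR library.
Table 2-1
Figure PCTCN2019102651-appb-000013
(6) Measure the molar concentration of the pre-PCR library as described in Example 1.
Results: as shown in Table 2-2, the pre-PCR library built at a DNA substrate to 5' sequencing adapter ratio of 1:100 had the lowest concentration, 0.0210 nM, and the library built at 1:4000 had the highest, 0.0836 nM.
Conclusion: DNA substrate to 5' sequencing adapter ratios from 1:100 to 1:4000 all allow effective construction of methylation libraries from bisulfite-treated DNA.
Table 2-2
DNA substrate : 5' sequencing adapter ratio 1:100 1:500 1:1000 1:2000 1:4000
Library concentration (nM) 0.0210 0.0499 0.0641 0.0693 0.0836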
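Example 3. Effect of the number of linear amplification cycles on construction of pre-PCR methylation libraries
(1) Adapter preparation: prepare the tail-controlling adapter (001/002, as shown in Table 2) and the "5' sequencing adapter" with a 7-N overhang at the 3' end (012/015, as shown in Table 3) according to the adapter preparation method in Example 1.
(2) Bisulfite-treat 20 ng λ-DNA with the bisulfite treatment kit.
(3) Use a focused ultrasonicator (Covaris, catalog number S220) to fragment the DNA product of step (2) to 300 bp; set aside.
(4) Prepare the DNA dephosphorylation reaction mixture as shown in Table 3-1, incubate at 37 °C for 30 minutes, heat at 95 °C for 5 minutes, then immediately place on ice for 2 minutes until use.
Table 3-1
Figure PCTCN2019102651-appb-000014
(5) Prepare the tailing and ligation reaction mixture as shown in Table 3-2, incubate at 37 °C for 30 minutes, heat at 95 °C for 5 minutes, then hold at 4 °C; when the reaction is complete, divide the mixture equally into 4 portions of 10 μl each; set aside.
Table 3-2
Figure PCTCN2019102651-appb-000015
Figure PCTCN2019102651-appb-000016
(6) Prepare the linear amplification reaction mixture shown in Table 3-3 and run the PCR program shown in Table 3-4, using 4, 6, 8 and 12 cycles, respectively, of 95 °C for 30 seconds, 60 °C for 30 seconds and 68 °C for 1 minute. In each case, purify the linear amplification product with 166 μl of 1:6 diluted Beckman Ampure XP beads (1 volume of Beckman Ampure XP beads plus 5 volumes of SB buffer) and 280 μl of 1.8:1 diluted SB buffer (1.8 volumes of SB buffer plus 1 volume of nuclease-free water), and elute with 12.5 μl EB buffer. For each cycle number, divide the 12.5 μl eluate into 5 μl aliquots in 200 μl PCR tubes and take 2 aliquots for the next step.
Table 3-3
Figure PCTCN2019102651-appb-000017
Table 3-4
Figure PCTCN2019102651-appb-000018
(7) Heat the DNA prepared in step (6) at 95 °C for 5 minutes and immediately place on ice for 2 minutes until use.
(8) Prepare the four "5' sequencing adapter" ligation mixtures as shown in Table 3-5, incubate each at 25 °C for 15 minutes, recover the ligated DNA from each with 17 μl Beckman Ampure XP beads, and elute with 26 μl nuclease-free water to obtain the pre-PCR libraries.
Table 3-5
Figure PCTCN2019102651-appb-000019
(9) Measure the molar concentration of the pre-PCR libraries as described in Example 1.
Results: as shown in Table 3-6, with 4, 6, 8 and 12 linear amplification cycles the pre-PCR library concentrations were 0.0553 nM, 0.0947 nM, 0.131 nM and 0.199 nM, respectively; the library built with 12 cycles had the highest concentration.
Table 3-6
Linear amplification cycles 4 6 8 12
Library concentration (nM) 0.0553 0.0947 0.131 0.199
Conclusion: linear amplification of the tailed deoxypolynucleotide substrate for 4-12 cycles, followed by ligation of the 7N-overhang "5' sequencing adapter", yields an effective methylation library in every case.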
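One way to read Table 3-6 is to normalize each concentration by its cycle number. The Python sketch below does this with the values reported in the table; the helper names are illustrative, and the normalization itself is an interpretation rather than an analysis performed in the source.

```python
# Pre-PCR library concentrations (nM) from Table 3-6, keyed by cycle number.
TABLE_3_6 = {4: 0.0553, 6: 0.0947, 8: 0.131, 12: 0.199}


def yield_per_cycle(table: dict) -> dict:
    """Concentration divided by cycle number; roughly constant values are
    what a linear (non-exponential) amplification would produce."""
    return {cycles: conc / cycles for cycles, conc in table.items()}


def fold_over_baseline(table: dict, baseline_cycles: int = 4) -> dict:
    """Concentration relative to the baseline cycle number."""
    base = table[baseline_cycles]
    return {cycles: conc / base for cycles, conc in table.items()}


per_cycle = yield_per_cycle(TABLE_3_6)
folds = fold_over_baseline(TABLE_3_6)
for cycles in sorted(TABLE_3_6):
    print(cycles, round(per_cycle[cycles], 4), round(folds[cycles], 2))
```

With these inputs, tripling the cycle number from 4 to 12 raises the concentration about 3.6-fold, which is far closer to linear growth than to the 256-fold increase that eight additional exponential cycles would imply.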
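Example 4. NGS assessment of the effect of the number of linear amplification cycles on methylation sequencing library construction
Materials:
2x high-fidelity hot-start PCR mix (KAPA Biosystems, catalog number KK2602)
Methods:
(1) Adapter preparation: prepare the tail-controlling adapter carrying a random molecular tag (006/029, as shown in Table 4) and the "5' sequencing adapter" with a 7-N overhang at the 3' end (012/015, as shown in Table 3) according to the method described in Example 1.
(2) Preparation of the methylation sequencing library: construct the library following the method described in Example 3 up to and including the "5' sequencing adapter" ligation reaction; recover the ligated DNA with 17 μl Beckman Ampure XP beads and elute with 20 μl nuclease-free water; dilute 2 μl of eluate in 18 μl nuclease-free water to obtain a 10-fold dilution, then dilute 2 μl of the 10-fold dilution in 18 μl nuclease-free water to obtain a 100-fold dilution, and continue in the same way to a 10,000-fold dilution; take 5.34 μl of the 10,000-fold dilution for use.
(3) Prepare the PCR amplification reaction mixture as shown in Table 4-1 and run the PCR program shown in Table 4-2; recover the PCR product with 80 μl Beckman Ampure XP beads and elute with 50 μl nuclease-free water, then recover the 50 μl eluate with 40 μl Beckman Ampure XP beads and finally elute with 25 μl EB buffer to obtain the final sequencing library.
Table 4-1
Figure PCTCN2019102651-appb-000020
Table 4-2
Figure PCTCN2019102651-appb-000021
(4) Check the library fragment size distribution with an Agilent 2100 Bioanalyzer (Agilent Technologies, catalog number G2939BA) and the Agilent High Sensitivity DNA Kit (Agilent Technologies, catalog number 5067-4626); measure the library molar concentration with the library quantification kit (KAPA Biosystems, catalog number KK4808) and the DNA quantification standards and primer premix kit (KAPA Biosystems, catalog number KK4808).
(5) Sequence the library obtained in step (3) on an Illumina-NS500 sequencer in 75PE mode; remove adapter sequences with Cutadapt (v1.12); align the methylation sequencing reads to the genome with bwa-meth (v0.2.0); mark duplicate reads with Sambamba (v0.5.4); finally, compute sequencing depth statistics with bedtools (v2.25.0).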
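The serial dilution in step (2) can be checked with a few lines of Python. This is a minimal sketch with hypothetical helper names; it assumes ideal mixing and no pipetting loss.

```python
def serial_dilution_factor(transfer_ul: float, diluent_ul: float,
                           rounds: int) -> float:
    """Overall dilution factor after repeating the same transfer `rounds` times."""
    if transfer_ul <= 0 or diluent_ul < 0 or rounds < 0:
        raise ValueError("volumes must be positive and rounds non-negative")
    per_round = (transfer_ul + diluent_ul) / transfer_ul
    return per_round ** rounds


def fraction_of_eluate_used(eluate_ul: float, dilution_factor: float,
                            input_ul: float) -> float:
    """Share of the original eluate that ends up in the PCR reaction."""
    return input_ul / dilution_factor / eluate_ul


# 2 ul into 18 ul, repeated four times, as described in step (2).
factor = serial_dilution_factor(2.0, 18.0, 4)
print(factor)
# 5.34 ul of the final dilution taken from a 20 ul eluate.
print(fraction_of_eluate_used(20.0, factor, 5.34))
```

Only a very small share of the ligated DNA therefore enters the PCR, which keeps the number of input template molecules low enough for efficiency to be estimated from unique fragments per locus.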
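Results are shown in Table 4-3 below: with 4, 6, 8 and 12 linear amplification cycles, the constructed methylation sequencing libraries had concentrations of 32.22 nM, 54.90 nM, 134.24 nM and 139.95 nM, respectively. With 4 Mb of sequencing data, the library construction efficiencies were 30.35%, 48.36%, 70.36% and 92.63%, and the coefficients of variation were 0.455, 0.275, 0.999 and 0.637, respectively.
Here, if library construction starts from 1,000 template λ-DNA genomes and sequencing finally yields an average of 100 unique sequencing fragments at each λ-DNA genomic locus, the library construction efficiency is defined as 10%. The number of unique sequencing fragments obtained at each λ-DNA genomic locus is counted, the mean and standard deviation are calculated, and the standard deviation divided by the mean gives the "coefficient of variation". The coefficient of variation represents the uniformity of the library construction method; the lower the value, the better the uniformity. Library construction efficiency reflects the effectiveness of the library construction method; the higher the better.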
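The two metrics defined above can be written out as a short Python sketch. The text does not say whether the population or the sample standard deviation is used; the population form (`statistics.pstdev`) is assumed here, and the function names and input counts are illustrative only.

```python
from statistics import mean, pstdev


def library_efficiency(unique_counts, input_genomes: int) -> float:
    """Mean unique fragments per locus divided by the number of input genomes."""
    if input_genomes <= 0 or not unique_counts:
        raise ValueError("need at least one locus and a positive genome count")
    return mean(unique_counts) / input_genomes


def coefficient_of_variation(unique_counts) -> float:
    """Standard deviation of per-locus unique counts divided by their mean."""
    mu = mean(unique_counts)
    if mu == 0:
        raise ValueError("mean coverage is zero")
    return pstdev(unique_counts) / mu


# Made-up per-locus counts, for illustration only.
counts = [90, 110, 100, 95, 105]
print(library_efficiency(counts, 1000))
print(coefficient_of_variation(counts))
```

In practice the per-locus counts would come from the depth statistics computed after duplicate marking in step (5).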
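Conclusion: linear amplification with 4, 6, 8 or 12 cycles effectively completes linear amplification and constructs a methylation sequencing library from bisulfite-treated DNA; with 12 cycles the library construction efficiency is highest, reaching 92.63%.
Table 4-3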
Figure PCTCN2019102651-appb-000022
Figure PCTCN2019102651-appb-000023
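Example 5
Comparison of the efficiency of the method of the invention, the conventional method and the Swift method in constructing methylation sequencing libraries
Materials:
10x End Repair Buffer (New England Biolabs, catalog number B6052S; 50 mM Tris-HCl, 10 mM magnesium chloride, 10 mM dithiothreitol, 1 mM adenosine triphosphate, 0.4 mM dATP, 0.4 mM dCTP, 0.4 mM dGTP, 0.4 mM dTTP, pH 7.5)
T4 DNA polymerase (Enzymatics, catalog number P7080L, 3 U/μL)
T4 polynucleotide kinase (Enzymatics, catalog number Y9040L, 10 U/μL)
10x dA-tailing buffer (New England Biolabs, catalog number B6059S; 10 mM Tris-HCl, 10 mM magnesium chloride, 50 mM sodium chloride, 1 mM dithiothreitol, 0.2 mM dATP, pH 7.9)
Klenow Fragment (exo-) (Enzymatics, catalog number P7010-LC-L, 10 U/μL)
(1) Construction of a λ-DNA methylation sequencing library using the method of the invention
(1-1) Adapter preparation: prepare the tail-controlling adapter (001/002, as shown in Table 2) and the "5' sequencing adapter" with a 7-N overhang at the 3' end (012/015, as shown in Table 3) according to the adapter preparation method in Example 1.
(1-2) Bisulfite-treat 20 ng λ-DNA with the bisulfite treatment kit.
(1-3) Use a focused ultrasonicator (Covaris, catalog number S220) to fragment the DNA product of step (1-2) to 300 bp and split it into two equal portions (i.e., two replicates); set aside.
(1-4) Prepare the DNA dephosphorylation reaction mixture as shown in Table 5-1, incubate at 37 °C for 30 minutes, heat at 95 °C for 5 minutes, then immediately place on ice for 2 minutes until use.
Table 5-1
Figure PCTCN2019102651-appb-000024
Figure PCTCN2019102651-appb-000025
(1-5) Prepare the tailing and ligation reaction mixture as shown in Table 5-2, incubate at 37 °C for 30 minutes, heat at 95 °C for 5 minutes, then hold at 4 °C until use.
Table 5-2
Figure PCTCN2019102651-appb-000026
(1-6) Prepare the linear amplification reaction mixture shown in Table 5-3 and run the PCR program shown in Table 5-4; after the reaction, purify the linear amplification product with 166 μl of 1:6 diluted Beckman Ampure XP beads (1 volume of Beckman Ampure XP beads plus 5 volumes of SB buffer) and 280 μl of 1.8:1 diluted SB buffer (1.8 volumes of SB buffer plus 1 volume of nuclease-free water), and elute with 6.6 μl EB buffer.
Table 5-3
Figure PCTCN2019102651-appb-000027
Figure PCTCN2019102651-appb-000028
Table 5-4
Figure PCTCN2019102651-appb-000029
(1-7) Heat the DNA eluate of step (1-6) at 95 °C for 5 minutes and immediately place on ice for 2 minutes until use.
(1-8) Prepare the "5' sequencing adapter" ligation mixture as shown in Table 5-5, incubate at 25 °C for 15 minutes, recover the ligated DNA with 17 μl Beckman Ampure XP beads, and elute with 100 μl nuclease-free water; dilute 2 μl of eluate in 18 μl nuclease-free water to obtain a 10-fold dilution, then dilute 2 μl of the 10-fold dilution in 18 μl nuclease-free water to obtain a 100-fold dilution, and continue in the same way to a 10,000-fold dilution; take 5.34 μl of the 10,000-fold dilution for use.
Table 5-5
DNA (from step (1-7)) 6.6 μl
2x T4 DNA Rapid Ligation Buffer 10 μl
25 μM 7N-overhang "5' sequencing adapter" (012/015) 2.4 μl
T4 DNA Rapid Ligase 1 μl
Total volume 20 μl
(1-9) Prepare the PCR amplification reaction mixture as shown in Table 5-6 and run the PCR program shown in Table 5-7; recover the PCR product with 80 μl Beckman Ampure XP beads and elute with 50 μl nuclease-free water, then recover the 50 μl eluate with 40 μl Beckman Ampure XP beads and finally elute with 25 μl EB buffer to obtain the final sequencing library.
Table 5-6
Figure PCTCN2019102651-appb-000030
Figure PCTCN2019102651-appb-000031
Table 5-7
Figure PCTCN2019102651-appb-000032
(1-10) Measure the library concentration as in Example 4.
(1-11) Sequence the library obtained in step (1-9) on an Illumina-Nova sequencer in 150PE mode; the analysis method is the same as in step (5) of Example 4.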
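(2) Construction of a λ-DNA methylation sequencing library using the conventional method (the workflow is shown schematically in Figure 4)
(2-1) Adapter preparation: prepare the "conventional methylation sequencing adapter" (016/017, as shown in Table 5) according to the adapter preparation method in Example 1.
(2-2) Take 20 ng λ-DNA, fragment it to 300 bp with a focused ultrasonicator, and split it into two equal portions (i.e., two replicates); set aside.
(2-3) Prepare the end-repair reaction mixture as shown in Table 5-8, react at 20 °C for 30 minutes, then add 45 μl Beckman Ampure XP beads to recover the repaired DNA and elute with 26 μl nuclease-free water.
Table 5-8
Figure PCTCN2019102651-appb-000033
Figure PCTCN2019102651-appb-000034
(2-4) Prepare the dA-tailing reaction mixture as shown in Table 5-9, react at 37 °C for 30 minutes, then add 45 μl Beckman Ampure XP beads to recover the dA-tailed DNA and elute with 12 μl nuclease-free water.
Table 5-9
Figure PCTCN2019102651-appb-000035
(2-5) Prepare the adapter ligation reaction mixture as shown in Table 5-10, react at 25 °C for 15 minutes, then add 21 μl Beckman Ampure XP beads to recover the adapter-ligated DNA and elute with 20 μl nuclease-free water.
Table 5-10
Figure PCTCN2019102651-appb-000036
(2-6) Bisulfite-treat the DNA from step (2-5) with the bisulfite treatment kit and elute with 100 μl nuclease-free water; dilute 2 μl of eluate as described in step (1-8) to obtain a 10,000-fold dilution, and take 5.34 μl of the 10,000-fold dilution for use.
(2-7) Prepare the PCR amplification reaction mixture as shown in Table 5-11 and run the PCR program shown in Table 5-12; recover the PCR product with 40 μl Beckman Ampure XP beads and elute with 50 μl nuclease-free water, then recover the 50 μl eluate with 40 μl Beckman Ampure XP beads and finally elute with 25 μl EB buffer to obtain the final sequencing library.
Table 5-11
Figure PCTCN2019102651-appb-000037
Table 5-12
Figure PCTCN2019102651-appb-000038
(2-8) Measure the library concentration as in Example 4.
(2-9) Sequence the library obtained in step (2-7) on an Illumina-Nova sequencer in 150PE mode; the analysis method is the same as in step (5) of Example 4.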
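(3) Construction of a λ-DNA methylation sequencing library using the Swift method (the workflow is shown schematically in Figure 5)
(3-1) Bisulfite-treat 20 ng λ-DNA with the bisulfite treatment kit.
(3-2) Fragment the product of step (3-1) to 300 bp with a focused ultrasonicator and split it into two equal portions (i.e., two replicates); set aside.
(3-3) Construct the methylation library from the DNA of step (3-2) following the protocol of the Swift DNA methylation library construction kit (Swift Biosciences, catalog number 30024), except that after 5' sequencing adapter ligation and purification (before Index PCR) the DNA is eluted with 100 μl nuclease-free water; dilute 2 μl of eluate as described in step (1-8) to obtain a 10,000-fold dilution, take 5.34 μl of the 10,000-fold dilution and add 14.66 μl nuclease-free water to a final volume of 20 μl; set aside.
(3-4) Perform PCR amplification on the DNA from step (3-3) according to the Index PCR step of the Swift DNA methylation library construction kit (Swift Biosciences, catalog number 30024), using index I16 or I19 from the Index kit (Swift Biosciences, catalog number 36024), except that the number of PCR cycles is 28.
(3-5) Purify and recover the PCR product from step (3-4) according to the PCR product purification method of step (2-7) to obtain the final sequencing library.
(3-6) Measure the library molar concentration as described in Example 4.
(3-7) Sequence the library obtained in step (3-5) on an Illumina-Nova sequencer in 150PE mode; the analysis method is the same as in step (5) of Example 4.
Results are shown in Table 5-13: the methylation sequencing libraries constructed with the method of the invention, the Swift method and the conventional method had concentrations of 18.684 nM, 1.641 nM and 0.146 nM, respectively. With 4 Mb of sequencing data, the library construction efficiencies were 56.1%, 22% and 3.92%, and the coefficients of variation were 0.463, 8.49 and 3.73, respectively.
Table 5-13
Figure PCTCN2019102651-appb-000039
Conclusion: the method of the invention constructs methylation sequencing libraries efficiently from small amounts of genomic DNA, and both its library construction efficiency and its uniformity are far better than those of the Swift method and the conventional method.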

Claims (29)

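  1. A method for constructing a library from a deoxypolynucleotide substrate, the method comprising the following steps:
    (1) mixing the single-stranded deoxypolynucleotide substrate with the following to form a first mixture: a) one deoxynucleotide selected from dGTP, dCTP, dATP and dTTP; b) terminal deoxynucleotidyl transferase and DNA ligase; c) a tail-controlling component, which is a partially double-stranded nucleotide molecule consisting of a polynucleotide homopolymer 5 to 20 nucleotides long and an X region, together with a linker polynucleotide complementary to the X region; wherein the polynucleotide homopolymer is complementary to the deoxynucleotide in a);
    (2) incubating the first mixture, whereby the 3' end of the single-stranded substrate undergoes a tailing reaction with the deoxynucleotide in solution, and the 3' end of the substrate carrying the added homopolymer tail is ligated to the linker of the tail-controlling component, giving the tailed substrate;
    (3) adding to the reaction system of step (2) a DNA polymerase, deoxynucleotides comprising dGTP, dCTP, dATP and dTTP, and a linear amplification primer, to form a second mixture;
    (4) incubating the second mixture: a first linear extension reaction is performed using the tailed substrate from step (2) as template to synthesize the strand complementary to the substrate; the strands are then separated, the linear amplification primer anneals to the substrate, and further linear extension reactions are performed, the number of linear extension reactions being not less than 3;
    (5) melting the product of step (4);
    (6) adding a 5' sequencing adapter and DNA ligase to the solution of step (5) to form a third mixture;
    (7) incubating the third mixture, whereby the 5' sequencing adapter is joined to the complementary strand of the substrate, yielding the DNA library.
  2. The method of claim 1, wherein the deoxynucleotide in a) is dATP.
  3. The method of any preceding claim, wherein the tail-controlling component in step (1) contains a polynucleotide homopolymer 5 to 13 nucleotides long.
  4. The method of any preceding claim, wherein the tail-controlling component in step (1) contains a polynucleotide homopolymer 7 to 10 nucleotides long.
  5. The method of any preceding claim, wherein step (2) is carried out at 20 °C-50 °C, preferably 25 °C-45 °C, more preferably 25 °C-37 °C.
  6. The method of any preceding claim, wherein the linear amplification primer of step (3) is complementary to the 3' end of the tailed substrate.
  7. The method of claim 6, wherein the linear amplification primer of step (3) is complementary to the linker of the tail-controlling component.
  8. The method of any preceding claim, wherein the primer used in the first linear extension reaction of step (4) is the tail-controlling molecule.
  9. The method of any one of claims 1-7, wherein the primer used in the first linear extension reaction of step (4) is the linear amplification primer added in step (3).
  10. The method of claim 9, wherein in the first linear extension reaction of step (4), the sequence of the tail-controlling component that is complementary to the substrate is degraded or is outcompeted by the linear amplification primer added in step (3).
  11. The method of any preceding claim, wherein the number of linear extension reactions in step (4) is not less than 4, preferably 4-50, further preferably 4-20, more preferably 4-12.
  12. The method of any preceding claim, wherein one strand of the 5' sequencing adapter of steps (6) and (7) has a single-stranded overhang of 0-50 random bases at its 3' end.
  13. The method of claim 12, wherein one strand of the 5' sequencing adapter of steps (6) and (7) has a single-stranded overhang of 0-30 random bases at its 3' end.
  14. The method of claim 13, wherein one strand of the 5' sequencing adapter has a single-stranded overhang of 2-30 random bases at its 3' end, preferably 2-17, more preferably 4-15, more preferably 7-10.
  15. The method of any preceding claim, characterized in that the polynucleotide homopolymer and the linker of the tail-controlling component contain a 3' blocking group.
  16. The method of any preceding claim, characterized in that the 5' sequencing adapter contains a 3' blocking group.
  17. The method of any preceding claim, wherein the blocking group is selected from one or more of: a ribonucleotide, a three-carbon spacer, a phosphate, a dideoxynucleotide, an amino group and an inverted deoxythymidine.
  18. The method of any preceding claim, wherein the linker polynucleotide of the tail-controlling component has a 5' phosphate and a 3' blocking group.
  19. The method of any preceding claim, wherein, in the 5' sequencing adapter, the strand complementary to the random-base-containing strand carries a 5' phosphate.
  20. The method of any preceding claim, wherein the deoxypolynucleotide substrate is a dephosphorylated polynucleotide sequence.
  21. The method of any preceding claim, wherein after step (4), double-stranded and single-stranded deoxypolynucleotides are isolated from the second mixture, and the single-stranded deoxypolynucleotide obtained after melting is used in the subsequent steps.
  22. The method of any preceding claim, wherein after step (7), the prepared DNA is amplified by PCR.
  23. The method of claim 22, wherein after step (7), the prepared DNA is purified and then amplified by PCR.
  24. A kit comprising:
    Component 1: comprising one deoxynucleotide selected from dGTP, dCTP, dATP and dTTP, terminal deoxynucleotidyl transferase, DNA ligase and a tail-controlling component, wherein the tail-controlling component is a partially double-stranded nucleotide molecule consisting of a polynucleotide homopolymer 5 to 20 nucleotides long and an X region, together with a linker polynucleotide complementary to the X region, and the polynucleotide homopolymer is complementary to said one deoxynucleotide selected from dGTP, dCTP, dATP and dTTP;
    Component 2: comprising a DNA polymerase, deoxynucleotides comprising dGTP, dCTP, dATP and dTTP, and a linear amplification primer;
    Component 3: comprising a 5' sequencing adapter and DNA ligase.
  25. The kit of claim 24, wherein the one deoxynucleotide selected from dGTP, dCTP, dATP and dTTP in Component 1 is dATP.
  26. The kit of claim 24 or 25, wherein the tail-controlling component contains a polynucleotide homopolymer 5 to 13, 7 to 10, more preferably 7 to 9 nucleotides long.
  27. The kit of any one of claims 24-26, wherein the linear amplification primer is complementary to the 3' end of the tail-controlling component, preferably to the linker of the tail-controlling component.
  28. The kit of any one of claims 24-27, wherein one strand of the 5' sequencing adapter has a single-stranded overhang of 0-50 random bases at its 3' end, preferably 2-30, preferably 2-17, more preferably 4-15, more preferably 7-10 random bases.
  29. Use of the kit of any one of claims 24-28 for constructing a library from a deoxypolynucleotide substrate.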
PCT/CN2019/102651 2018-10-11 2019-08-26 一种测序文库的构建方法 WO2020073748A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US17/284,734 US20220002713A1 (en) 2018-10-11 2019-08-26 Method for constructing sequencing library
CN201980013343.2A CN111989406B (zh) 2018-10-11 2019-08-26 一种测序文库的构建方法
EP19871204.4A EP3865584A4 (en) 2018-10-11 2019-08-26 PROCEDURE FOR CREATING A SEQUENCING LIBRARY

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811185409 2018-10-11
CN201811185409.X 2018-10-11

Publications (1)

Publication Number Publication Date
WO2020073748A1 true WO2020073748A1 (zh) 2020-04-16

Family

ID=70165046

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/102651 WO2020073748A1 (zh) 2018-10-11 2019-08-26 一种测序文库的构建方法

Country Status (4)

Country Link
US (1) US20220002713A1 (zh)
EP (1) EP3865584A4 (zh)
CN (1) CN111989406B (zh)
WO (1) WO2020073748A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113322523A (zh) * 2021-06-17 2021-08-31 翌圣生物科技(上海)股份有限公司 Rna快速建库方法及其应用

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113897414A (zh) * 2021-10-11 2022-01-07 湖南大地同年生物科技有限公司 一种痕量核酸文库构建方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100221785A1 (en) * 2005-05-26 2010-09-02 Human Genetic Signatures Pty Ltd Isothermal Strand Displacement Amplification Using Primers Containing a Non-Regular Base
CN104395480A (zh) 2012-03-13 2015-03-04 斯威夫特生物科学公司 用于通过核酸聚合酶对衬底多核苷酸进行大小受控的同聚物加尾的方法和组合物
CN105525357A (zh) * 2014-09-30 2016-04-27 深圳华大基因股份有限公司 一种测序文库的构建方法及试剂盒和应用
CN108004301A (zh) * 2017-12-15 2018-05-08 格诺思博生物科技南通有限公司 基因目标区域富集方法及建库试剂盒
CN108456713A (zh) * 2017-11-27 2018-08-28 天津诺禾致源生物信息科技有限公司 接头封闭序列、文库构建试剂盒及测序文库的构建方法

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018090373A1 (zh) * 2016-11-21 2018-05-24 深圳华大智造科技有限公司 一种dna末端修复与加a的方法
WO2019055819A1 (en) * 2017-09-14 2019-03-21 Grail, Inc. METHODS FOR PREPARING A SEQUENCING LIBRARY FROM SINGLE STRANDED DNA

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100221785A1 (en) * 2005-05-26 2010-09-02 Human Genetic Signatures Pty Ltd Isothermal Strand Displacement Amplification Using Primers Containing a Non-Regular Base
CN104395480A (zh) 2012-03-13 2015-03-04 斯威夫特生物科学公司 用于通过核酸聚合酶对衬底多核苷酸进行大小受控的同聚物加尾的方法和组合物
CN105525357A (zh) * 2014-09-30 2016-04-27 深圳华大基因股份有限公司 一种测序文库的构建方法及试剂盒和应用
CN108456713A (zh) * 2017-11-27 2018-08-28 天津诺禾致源生物信息科技有限公司 接头封闭序列、文库构建试剂盒及测序文库的构建方法
CN108004301A (zh) * 2017-12-15 2018-05-08 格诺思博生物科技南通有限公司 基因目标区域富集方法及建库试剂盒

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MATTHEW W. SNYDER ET AL., CELL, vol. 1, 2016, pages 57 - 68
See also references of EP3865584A4
Y. M. DENNIS LO ET AL., SCIENCE TRANSLATIONAL MEDICINE, vol. 10, 2010, pages 61 - 91

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113322523A (zh) * 2021-06-17 2021-08-31 翌圣生物科技(上海)股份有限公司 Rna快速建库方法及其应用
CN113322523B (zh) * 2021-06-17 2024-03-19 翌圣生物科技(上海)股份有限公司 Rna快速建库方法及其应用

Also Published As

Publication number Publication date
EP3865584A1 (en) 2021-08-18
CN111989406A (zh) 2020-11-24
EP3865584A4 (en) 2021-12-08
US20220002713A1 (en) 2022-01-06
CN111989406B (zh) 2023-02-21

Similar Documents

Publication Publication Date Title
US11697843B2 (en) Methods for creating directional bisulfite-converted nucleic acid libraries for next generation sequencing
US9745614B2 (en) Reduced representation bisulfite sequencing with diversity adaptors
US10301660B2 (en) Methods and compositions for repair of DNA ends by multiple enzymatic activities
JP7008407B2 (ja) ヌクレアーゼ、リガーゼ、ポリメラーゼ、及び配列決定反応の組み合わせを用いた、核酸配列、発現、コピー、またはdnaのメチル化変化の識別及び計数方法
CN111183145B (zh) 高灵敏度dna甲基化分析方法
JP7202556B2 (ja) 遺伝子標的エリアの富化方法及びキット
EP2619329B1 (en) Direct capture, amplification and sequencing of target dna using immobilized primers
EP2585593B1 (en) Methods for polynucleotide library production, immortalization and region of interest extraction
JP6542771B2 (ja) 核酸プローブ及びゲノム断片検出方法
JP7240337B2 (ja) ライブラリー調製方法ならびにそのための組成物および使用
JP2022527725A (ja) 部位特異的核酸を用いた核酸濃縮と続いての捕捉方法
CN111979583B (zh) 一种单链核酸分子高通量测序文库的构建方法及其应用
JP2023513606A (ja) 核酸を評価するための方法および材料
WO2020073748A1 (zh) 一种测序文库的构建方法
WO2021051665A1 (zh) 基因目标区域的富集方法及体系
US20230374574A1 (en) Compositions and methods for highly sensitive detection of target sequences in multiplex reactions
WO2019090621A1 (zh) 钩状探针、核酸连接方法以及测序文库的构建方法
WO2022144003A1 (zh) 一种用于高通量靶向测序的多重pcr文库构建方法
WO2019090482A1 (zh) 一种第二代高通量测序文库构建方法
JP2022546485A (ja) 腫瘍高精度アッセイのための組成物および方法
KR20200005658A (ko) 서열-기반의 유전 검사용 대조군을 제조하기 위한 조성물 및 방법
WO2018009677A1 (en) Fast target enrichment by multiplexed relay pcr with modified bubble primers
CN113073133A (zh) 放大微量dna并用于多种核酸检测的方法及核酸检测装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19871204

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019871204

Country of ref document: EP

Effective date: 20210511