WO2019090621A1 - Hook probe, nucleic acid ligation method, and method for constructing a sequencing library

Info

Publication number
WO2019090621A1
Authority
WO
WIPO (PCT)
Prior art keywords
hook
nucleic acid
probe
hook probe
acid fragment
Prior art date
Application number
PCT/CN2017/110252
Other languages
English (en)
French (fr)
Inventor
江媛
席阳
章文蔚
刘鹏娟
赵霞
李巧玲
沈寒婕
张永卫
德马纳克拉多杰
Original Assignee
深圳华大智造科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大智造科技有限公司
Priority to EP17931336.6A (patent EP3730613A4, en)
Priority to PCT/CN2017/110252 (patent WO2019090621A1, zh)
Priority to US16/762,898 (patent US11680285B2, en)
Priority to CN201780096314.8A (patent CN111278974B, zh)
Publication of WO2019090621A1 (patent WO2019090621A1, zh)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6834: Enzymatic or biochemical coupling of nucleic acids to a solid phase
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/6853: Nucleic acid amplification reactions using modified primers or templates
    • C12Q1/6855: Ligating adaptors
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups

Definitions

  • The invention relates to the technical field of molecular biology, in particular to a hook probe, a nucleic acid ligation method, and a method for constructing a sequencing library.
  • High-throughput sequencing is a revolutionary change from traditional sequencing: it sequences hundreds of thousands to millions of DNA molecules at a time, enabling detailed analysis of the transcriptome and genome of a species. The phenotype-related genes or gene variant sites identified in this way provide a theoretical basis for research and application.
  • A high-throughput sequencing library is a collection of DNA or cDNA fragments, ready for sequencing, that is formed by adding known synthetic sequences to fragmented DNA or RNA through a series of biochemical reactions.
  • The known synthetic sequences, often referred to as adapters (linkers), serve primarily to amplify the library and to provide complementary sequences to which sequencing primers hybridize.
  • The basic workflow of library construction includes nucleic acid extraction, nucleic acid fragmentation (not required for cell-free DNA), end repair and modification of the fragments, adapter ligation, and optional fragment amplification.
  • Depending on the sample type and application, library construction differs slightly from this basic workflow. For example, a cell-free DNA library does not require DNA fragmentation; an RNA library requires reverse transcription into cDNA before further processing; and a target-region library requires hybridization or amplification to enrich the target region sequences from DNA or RNA.
  • The first category of target-region enrichment is based on probe hybridization, which is divided into solid-phase hybridization and liquid-phase hybridization.
  • Liquid-phase hybridization techniques are currently widely used, such as Agilent's RNA probe liquid-phase hybridization, Roche NimbleGen's DNA probe liquid-phase hybridization, and Illumina's DNA probe hybridization following transposase fragmentation.
  • The second category is based on amplification, which is divided into single-pair amplification with pooling and multiplex amplification.
  • Multiplex amplification techniques, such as Life Technologies' Ion Torrent AmpliSeq multiplex PCR, are widely used.
  • The probe liquid-phase hybridization capture technique is well suited to high-throughput sequencing libraries that require highly parallel sample preparation.
  • This technique uses blocking nucleic acids such as Cot-1 DNA and/or sequence-specific blocking oligonucleotides to reduce non-specific hybridization and enhance the specificity of the hybridization reaction between the probe and the sample nucleic acid.
  • Commonly used hybrid capture methods require very long hybridization times to reach equilibrium and/or achieve efficient capture and enrichment of the target nucleic acid. Even so, at least about 40% of the captured material still comes from off-target regions.
  • The target sequence is also at risk of random loss.
  • The reagents used in the method (for example, probes, blocking reagents, and streptavidin magnetic beads) add to its cost.
  • Multiplex PCR can enrich target region sequences more efficiently and specifically, and at lower cost, than probe liquid-phase hybridization capture, but it is poorly suited to enriching target regions from small, naturally fragmented cfDNA.
  • The present invention provides a hook probe, a nucleic acid ligation method, and a method for constructing a sequencing library. They can be used to add a known tool sequence to a single-stranded end of a nucleic acid fragment to be ligated; subsequent operations are then performed by means of the known tool sequence, which enables different applications.
  • In a first aspect, the present invention provides a hook probe comprising a target-specific region and a hook region connected to it. The target-specific region includes a sequence complementary to at least a portion of a single strand of the nucleic acid fragment to be ligated; the hook region comprises a sequence that does not pair with the nucleic acid fragment; and the end of the hook region is a ligatable end that can be ligated to a single-stranded end of the nucleic acid fragment.
  • The present invention provides a kit comprising the hook probe of the first aspect and, optionally, at least one ligase for ligating the ligatable end of the hook probe to the end of the nucleic acid fragment to be ligated.
  • The invention provides the use of a hook probe of the first aspect in the construction of a nucleic acid sequencing library.
  • The present invention provides a nucleic acid ligation method, the method comprising: annealing a hook probe of the first aspect to a denatured nucleic acid fragment to be ligated; and, in the presence of a ligase, ligating the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment.
  • The present invention provides a method for constructing a nucleic acid sequencing library, the method comprising: a step of annealing a hook probe of the first aspect to a denatured nucleic acid fragment to be ligated; and a step of ligating, in the presence of a ligase, the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment.
  • Through its design and its cooperation with a ligase, the hook probe provided by the invention rapidly hybridizes to and captures a nucleic acid fragment to be ligated (such as a target nucleic acid fragment) and attaches a known tool sequence to it. Subsequent operations that use this known tool sequence enable different applications.
  • The present invention is applicable to a wide range of sample types, detection types, and applications (not only high-throughput library construction, but also molecular cloning and synthetic biology).
  • Implementing the invention will greatly simplify the workflow, reduce time and cost, and remove restrictions on the applicable sample types, which will benefit a variety of research applications and kit products.
  • Figure 1 shows a schematic representation of the composition of a target-specific 5' hook probe and a 3' hook probe.
  • Figure 2 shows a flow diagram of a PCR protocol after bilateral hook probe hybridization.
  • Figure 3 shows a flow diagram of a PCR-free protocol after hybridization of a bilateral hook probe.
  • Figure 4 is a flow diagram showing one protocol for PCR/no PCR after single-sided 3' hook probe hybridization.
  • Figure 5 shows a flow diagram of another protocol for PCR/no PCR after single-sided 3' hook probe hybridization.
  • Figure 6 shows a flow diagram of a PCR/no PCR protocol after single-sided 5' hook probe hybridization.
  • Figure 7 shows a flow diagram of a post-PCR protocol for single-sided 3' hook probe hybridization.
  • Figure 8 shows a 10% denaturing polyacrylamide gel (U-PAGE) image of a target-region single-stranded nucleic acid fragment (YJ-439) and a target-specific 5' hook probe (YJ-765) incubated with CircLigase I at different reaction temperatures.
  • Figure 9 shows a target region nucleic acid single-stranded fragment (YJ-439) with a non-target-specific 5' hook probe (YJ-890) and a non-target specific 3' hook probe (YJ-891)
  • Fig. 10 is a view showing a sequence connection in a test example for verifying the basic principle.
  • Figure 11 shows the results of polyacrylamide gel electrophoresis of a part of the products of Example 1 of the present invention.
  • Figure 12 shows the results of partial product polyacrylamide gel electrophoresis in Example 2 of the present invention.
  • Figure 13 shows the distribution of sequencing reads on chromosome 10 in Example 2 of the present invention, where A and B are the WGS control library and the negative control library, respectively, and C is the bilateral hook probe hybridization library; the regions indicated by the two arrows in Figure 13C are the two ROI enrichment regions.
  • Figure 14 shows how the reads and the target reads, respectively, cover the ROI regions in Example 2 of the present invention.
  • Figure 15 shows the results of partial product polyacrylamide gel electrophoresis in Example 3 of the present invention.
  • The present invention provides a hook probe (or "hook nucleic acid probe"). Using the hook probe and the basic principle of the intermolecular single-strand ligation reaction, nucleic acid fragments can be rapidly captured and ligated.
  • Based on the hook probe and this basic principle, the present invention proposes a series of technical solutions, including but not limited to the oligonucleotide sequence composition, enzyme and reagent components, reaction conditions, and method steps required to implement them.
  • The basic principle of the present invention is as follows: under appropriate reaction conditions, a nucleic acid fragment, such as a single-stranded nucleic acid fragment of a target region (having a phosphate group at the 5' end or a hydroxyl group at the 3' end), and a target-specific hook probe whose partial sequence is complementary to that single-stranded fragment (having a phosphate group at the 5' end or a hydroxyl group at the 3' end) form a hybrid complex, and ligation in the non-complementary region of the hybrid complex is catalyzed by certain ligases.
  • The proportion of the intermolecular single-strand ligation product formed between the 5' end of the target-region single strand and the 3' end of the hook probe is higher than the proportion of the intramolecular circularization product of the target-region single-stranded fragment.
  • the present invention relates to the following basic concepts: hook probes, nucleic acid fragments (e.g., target region nucleic acid fragments), ligases.
  • The invention has at least one of the following advantages: the applicable template is not limited by sample type, hybridization is fast, no additional adapter ligation step is required, the workflow is simple, the reaction time is short, the cost is low, and the field of application is wide.
  • A hook probe comprises a target-specific region and a hook region connected to it. The target-specific region comprises a sequence complementary to at least a portion of a single strand of the nucleic acid fragment to be ligated; the hook region includes a sequence that does not pair with the nucleic acid fragment; and the end of the hook region is a ligatable end that can be ligated to the single-stranded end of the nucleic acid fragment.
  • the hook probe can be a 5' hook probe or a 3' hook probe, and Figure 1 is a schematic view of a hook probe.
  • the hook probe provided by the embodiment of the present invention includes a Target Specific Region (TSR) and a Hook Region (HR).
  • The sequence of the target-specific region (i.e., the gene-specific binding site in Figure 1) is complementary to a sequence in or near the nucleic acid fragment to be ligated (e.g., the target sequence) in the sample, and performs the hybridization capture function.
  • the hook region of the 5' hook probe can include a universal primer binding site, a unique molecular tag sequence, a sample tag sequence, a cell tag sequence, and other useful elements, or any combination thereof.
  • the hook region of the 3' hook probe can include a universal primer binding site, a unique molecular tag sequence, a sample tag sequence, a cell tag sequence, and other useful elements, or any combination thereof.
  • The hook region can also be designed as a random sequence for library construction schemes that do not involve sequence capture (e.g., whole genome library construction).
  • The universal primer binding site of the hook probe can take part in the steps that follow hybridization and intermolecular ligation, namely PCR amplification of the hook ligation product or PCR-free library construction.
  • A sample tag sequence and/or a unique molecular tag sequence on a hook probe is used to identify different samples and/or target sequence fragments.
  • The 5' hook probe has the structure: 5'-(target-specific region)-(hook region)-3'.
  • In more detail, the 5' hook probe has the structure: 5'-(target-specific region)-(unique molecular tag and/or sample tag)-(universal primer binding site)-3'.
  • The 3' hook probe has the structure: 5'-(hook region)-(target-specific region)-3'.
  • In more detail, the 3' hook probe has the structure: 5'-(universal primer binding site)-(unique molecular tag and/or sample tag)-(target-specific region)-3'.
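  • The two probe layouts above can be illustrated with a minimal Python sketch. It is not taken from the patent: the function names, the example sequences, and the choice to derive the target-specific region as the reverse complement of a stretch of the target strand are all assumptions made for illustration.

```python
# Illustrative sketch; every sequence and function name here is hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_5p_hook_probe(target_site: str, tag: str, primer_site: str) -> str:
    """5'-(target-specific region)-(tag)-(universal primer binding site)-3'."""
    return reverse_complement(target_site) + tag + primer_site


def build_3p_hook_probe(target_site: str, tag: str, primer_site: str) -> str:
    """5'-(universal primer binding site)-(tag)-(target-specific region)-3'."""
    return primer_site + tag + reverse_complement(target_site)


if __name__ == "__main__":
    site = "AAGGC"  # hypothetical stretch of the target strand, 5'->3'
    print(build_5p_hook_probe(site, "ACAC", "GGTT"))
    print(build_3p_hook_probe(site, "ACAC", "GGTT"))
```

  • In both layouts the hook region (tag plus primer binding site) ends up at the probe terminus that is meant to be ligated, while the target-specific region sits at the opposite terminus.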
  • the biochemical component of the hook probe may be deoxyribonucleic acid, ribonucleic acid, or a mixture of deoxyribonucleic acid and ribonucleic acid.
  • the target specific region in the hook probe can have any suitable length and sequence for target specific hybridization with the target nucleic acid in the reaction mixture containing the target nucleic acid and the non-target nucleic acid.
  • The target-specific region is typically less than 200 nucleotides in length.
  • The target-specific region is not limited to hybridizing with a target nucleic acid; that is only one specific application of it.
  • The target-specific region functions to hybridize to a sequence in the nucleic acid fragment to be ligated so that the probe can be ligated to the end of that fragment. That is to say, the target-specific region of the present invention is also applicable to schemes without sequence capture (such as whole genome library construction), in which it is not required to capture a specific sequence region.
  • one or more hook probes can be designed that hybridize to sequences near the target site and/or target site.
  • a combination of hook probes for capturing the same target sequence is referred to as a hook probe set.
  • the hook probe set can include a 5' hook probe and/or a 3' hook probe, and can also include a plurality of 5' hook probes and/or multiple 3' hook probes.
  • The hook probe can be designed to be located at the target site and/or at a position flanking the target site, or at a site linked to the target site and/or at a position flanking such a linked site.
  • In the present invention, a target site may be a specific site within a target nucleic acid (especially within a target region of a target nucleic acid); for example, it may be a variant site and/or gene locus associated with a certain biological function (e.g., SNPs, insertion/deletion sites, gene fusion sites, methylation sites, etc.).
  • The hook region of the hook probe is all or part of the nucleic acid sequence necessary for template-dependent primer extension or primer-mediated PCR amplification, and is also all or part of the adapter sequence necessary for the sequencing reaction; it may or may not be used for amplification.
  • The hook region sequence has no homology to the template DNA (or RNA): neither the target nucleic acid nor the non-target nucleic acid contains a sequence that is fully or partially complementary to it.
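  • One way to screen a candidate hook region against this requirement is a shared k-mer check, sketched below. The approach, the function names, and the default `k=8` threshold are assumptions made for illustration, not part of the patent; a real design would more likely use an alignment tool.

```python
# Hypothetical screening sketch: flag a hook region that shares any k-mer,
# on either strand, with the template sequences.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """Return the set of all substrings of length k (empty if seq is shorter)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hook_region_is_clean(hook: str, templates: list, k: int = 8) -> bool:
    """True if no k-mer of the hook region or its reverse complement occurs
    in any template, i.e. no identical or complementary stretch of length k."""
    hook = hook.upper()
    words = kmers(hook, k) | kmers(reverse_complement(hook), k)
    return not any(word in t.upper() for t in templates for word in words)
```

  • Lowering `k` makes the screen stricter about partial complementarity, at the price of rejecting more candidate sequences.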
  • the universal primer binding sequences of the hook regions can be matched according to sequencing primers and/or linker sequences of different sequencing platforms.
  • The hook region can have any suitable length and sequence, and is typically less than 200 nucleotides long.
  • The hook region can include a specific primer binding site or universal primer binding site, Unique Identifiers/Unique Molecular Identifiers (UMI), a Sample Barcode (SB) such as a cell barcode, sample barcode, or other barcode, as well as other useful components, or any combination thereof.
  • Each 5' hook probe and/or 3' hook probe may comprise one or more unique molecular tags (UMIs). The position of the UMI in the hook region, its number of bases, and its base composition are designed according to the amount of starting template, the application purpose, and/or the sequencing strategy.
  • Each 5' hook probe and/or 3' hook probe may comprise a sample tag (SB). The position of the SB in the hook region, its number of bases, and its base composition are designed according to the number of samples, the application purpose, and/or the sequencing strategy.
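  • The link between tag length and the amount of starting template (or the number of samples) can be made concrete with a small calculation: a tag of n bases drawn from A, C, G, T has at most 4^n distinct values. The sketch below is an illustration under that assumption only; the function name and the `oversampling` safety factor are hypothetical and do not come from the patent.

```python
def min_tag_length(n_items: int, oversampling: int = 1) -> int:
    """Smallest tag length n (in bases) such that 4**n >= n_items * oversampling.

    n_items is the number of template molecules (for a UMI) or samples
    (for a sample tag); oversampling is a hypothetical safety factor.
    """
    if n_items < 1 or oversampling < 1:
        raise ValueError("n_items and oversampling must be positive integers")
    needed = n_items * oversampling
    length = 1
    while 4 ** length < needed:
        length += 1
    return length


if __name__ == "__main__":
    print(min_tag_length(96))        # sample tag for a 96-sample run
    print(min_tag_length(1000, 10))  # UMI for ~1000 molecules, 10x headroom
```

  • This only bounds the length from below; base composition constraints (for example avoiding homopolymers) would reduce the usable tag space further.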
  • The hook region can comprise a cleavage site, namely a restriction enzyme recognition/binding site and/or one or more cleavable modified nucleotides, so that the hook ligation product can be used for PCR-free library construction and/or the sequence of the TSR region can be removed (Figure 3).
  • Modified nucleotide/enzyme combinations include, but are not limited to: (i) deoxyuridine and E. coli uracil DNA glycosylase (UDG) or A. fulgidus UDG (Afu UDG), with one or more enzymes that remove AP sites, such as human AP (apurinic/apyrimidinic) endonuclease (APE1), endonuclease III (Endo III), endonuclease IV (Endo IV), endonuclease VIII (Endo VIII), formamidopyrimidine [fapy]-DNA glycosylase (Fpg), human 8-oxoguanine DNA glycosylase (hOGG1), or human endonuclease VIII-like 1 (hNEIL1); (ii) deoxyinosine and endonuclease V or human 3-alkyladenine DNA glycosylase (hAAG) to generate an AP site, and one or more enzymes that remove AP sites, such as APE1, Endo III, Endo IV, Endo VIII, Fpg, hOGG1, or hNEIL1; (iii)
  • A hook probe comprises a ligatable end that can be ligated to a single-stranded end of a nucleic acid fragment (e.g., a target nucleic acid).
  • The 5' hook probe has a functional 3' hydroxyl group capable of ligation to the 5' end of a nucleic acid fragment (e.g., a target nucleic acid), and its 5' end carries a 5' blocking group (including but not limited to a 5' hydroxyl group, a dideoxy mononucleotide, etc.) that prevents ligation to other single strands or to itself.
  • The 3' hook probe includes a functional 5' phosphate group capable of ligation to the 3' end of a nucleic acid fragment (e.g., a target nucleic acid), and its 3' end carries a 3' blocking group that prevents ligation to other single strands or to itself (including but not limited to a 3' phosphate, a 3' ring-opened sugar such as 3'-phospho-α,β-unsaturated aldehyde (PA), a 3' amino modification, a 3' dideoxynucleotide, a 3' phosphorothioate (PS) bond, etc.).
  • the nucleic acid fragment of the present invention to be ligated includes a nucleic acid fragment of a target region.
  • The target region refers to one or more consecutive nucleotide base sequences and/or one or more nucleotide bases. It may be a variant site and/or gene locus associated with a certain biological function (such as an SNP, InDel, SV, CNV, gene fusion, methylation site, etc.); a known DNA sequence, RNA sequence, and/or synthetic nucleotide sequence; one or more genes, a specific set of genes associated with a certain function, and/or a set of genes of interest; or even a specific genome, transcriptome, and/or a certain type of RNA (e.g., 16S ribosomal RNA, ribozyme, antisense RNA, guide RNA, etc.).
  • a nucleic acid fragment containing a target region is simply referred to as a target nucleic acid.
  • the target nucleic acid can be double-stranded and/or single-stranded (eg, dsDNA, cfDNA, ctDNA, ssDNA, DNA/RNA hybrid, RNA, mRNA, cDNA first strand, cDNA second strand, cDNA, etc.).
  • samples containing the target nucleic acid can be obtained from any suitable source.
  • a sample can be obtained or provided from any organism of interest. These organisms include plants, animals (eg, mammals, including humans and non-human primates), pathogens (such as bacteria and viruses).
  • the sample is obtained or extracted directly from cells, tissues, secretions, and the like of the biological population of interest.
  • The sample can be a microbial community or microbiota.
  • the sample is an environmental sample, such as a sample of water, air or soil.
  • Samples from organisms of interest or such biological populations of interest may include, but are not limited to, body fluid samples (including but not limited to blood, urine, serum, lymph, saliva, anal and vaginal secretions, sweat and semen, etc.), Cells, tissues, biopsy samples, experimental samples (eg, products of nucleic acid amplification reactions, such as PCR amplification reactions, etc.), purified samples (eg, purified genomic DNA, RNA, etc.) and original samples (eg, bacteria, viruses, Genomic DNA, etc.).
  • In most embodiments, the target polynucleotides (e.g., genomic DNA, total RNA, etc.) from the sample need to be fragmented to produce one or more fragments of a particular size, or a population of fragments with a narrow length distribution.
  • Any fragmentation method can be used: physical methods (for example, ultrasonic shearing, acoustic shearing, needle shearing, nebulization, or sonication), chemical methods (for example, heat and divalent metal cations), or enzymatic methods (for example, using endonucleases, nickases, or transposases).
  • Fragmentation methods are known in the art; see, for example, US 2012/0004126.
  • a target nucleic acid or nucleic acid fragment (eg, fragmented genomic DNA or RNA) is subjected to size selection processing to obtain a nucleic acid fragment having a particular fragment size or a specific fragment range.
  • Any method of fragment selection can be used, for example, in some embodiments, the fragmented target nucleic acid can be separated by gel electrophoresis, and a piece of gel corresponding to a particular fragment size or a particular fragment range can be extracted and purified from the gel.
  • a purification column can be used to select a fragment having a particular minimum size.
  • paramagnetic beads can be used to selectively bind DNA fragments having a desired fragment range.
  • a solid phase reversible immobilization (SPRI) method can be used to enrich a nucleic acid fragment having a particular fragment size or a particular fragment range.
  • A combination of the fragment selection methods described above can be used.
  • The fragmented nucleic acid is selected to have a size ranging from about 50 to about 3000 bases, and may be a fragment of a certain size, fragments of a certain average size, or fragments within a specific range.
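  • As an in-silico analogue of this size selection, the sketch below keeps only fragments whose length falls inside a window. The 50 and 3000 base defaults follow the approximate range given above; treating the bounds as inclusive, and the function name itself, are assumptions made for illustration.

```python
def select_by_size(fragments, min_len: int = 50, max_len: int = 3000):
    """Return the fragments whose length lies in [min_len, max_len] (inclusive).

    The inclusive bounds are an illustrative assumption; the text only
    says "about 50 to about 3000 bases".
    """
    if min_len > max_len:
        raise ValueError("min_len must not exceed max_len")
    return [frag for frag in fragments if min_len <= len(frag) <= max_len]


if __name__ == "__main__":
    pool = ["A" * 10, "A" * 50, "A" * 166, "A" * 3000, "A" * 3001]
    print([len(f) for f in select_by_size(pool)])
```

  • Narrowing the window, for example `select_by_size(pool, 100, 250)`, mimics selecting a specific fragment range rather than applying the broad default.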
  • the formation of a hooked probe ligation product requires the use of a hook probe in combination with one or more ligases.
  • the ligase used in the present invention is capable of linking molecules of a polynucleotide having a single-stranded end under appropriate conditions and at a suitable substrate concentration.
  • The ligase is a "single-stranded DNA/RNA ligase."
  • CircLigase can catalyze the formation of covalent phosphodiester bonds between two different nucleic acid strands under appropriate reaction conditions.
  • a ligase catalyzes the synthesis of a phosphodiester bond between the 3'-hydroxyl group of one polynucleotide and the 5'-phosphoryl group of the second polynucleotide.
  • hybridization of a hook probe to a target nucleic acid can result in a substrate for ligation.
  • Hybridization of a 5' hook probe to a target nucleic acid can provide a 3' hydroxyl group suitable for ligation to the 5' end of the target nucleic acid.
  • The 5' hook probe includes a blocked 5' end that is unsuitable for ligation.
  • Hybridization of a 3' hook probe to a target nucleic acid can provide a free 5' phosphate that can be ligated to the 3' end of the target nucleic acid.
  • The 3' hook probe includes a blocked 3' end that is not suitable for ligation.
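  • The end-chemistry rules above (a 3' hydroxyl joins a 5' phosphate; blocked ends do not react) can be captured in a small data model. This is a minimal sketch for illustration only: the `Strand` class, the constant names, and the simplification of each end to one of three states are assumptions, not part of the patent.

```python
from dataclasses import dataclass

# Simplified end states; "blocked" stands for any blocking group.
PHOSPHATE, HYDROXYL, BLOCKED = "phosphate", "hydroxyl", "blocked"


@dataclass(frozen=True)
class Strand:
    name: str
    five_prime: str   # chemical state of the 5' end
    three_prime: str  # chemical state of the 3' end


def can_ligate(upstream: Strand, downstream: Strand) -> bool:
    """True if upstream's 3' end can be joined to downstream's 5' end,
    i.e. a 3' hydroxyl meets a 5' phosphate."""
    return upstream.three_prime == HYDROXYL and downstream.five_prime == PHOSPHATE


TARGET = Strand("target fragment", PHOSPHATE, HYDROXYL)
HOOK_5P = Strand("5' hook probe", BLOCKED, HYDROXYL)
HOOK_3P = Strand("3' hook probe", PHOSPHATE, BLOCKED)

if __name__ == "__main__":
    print(can_ligate(HOOK_5P, TARGET))   # probe 3'-OH to target 5'-phosphate
    print(can_ligate(TARGET, HOOK_3P))   # target 3'-OH to probe 5'-phosphate
    print(can_ligate(HOOK_3P, HOOK_5P))  # blocked ends: no probe-probe product
```

  • In this model `can_ligate(TARGET, TARGET)` is also true, which corresponds to the competing intramolecular circularization of the target fragment; the blocked probe ends rule out probe-to-probe and probe self-ligation.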
  • The ligase is a thermostable RNA ligase, including but not limited to TS2126 RNA ligase or an adenylated form of TS2126 RNA ligase, CIRCLIGASE™ ssDNA ligase or CIRCLIGASE II™ ssDNA ligase (see Epicentre Biotechnologies, Madison, Wisconsin; Lucks et al., 2011, Proc. Natl. Acad. Sci. USA 108: 11063-11068; Li et al., 2006, Anal. Biochem. 349: 242-246; Blondal et al., 2005, Nucleic Acids Res.
  • Methanobacterium thermoautotrophicum RNA ligase 1 or "MthRnl ligase" (see U.S. Patent No. 7,303,901, U.S. Patent No. 9,217,167 and International Publication No. WO 2010/094040)
  • T4 RNA ligase (for example, T4 RNA ligase I; Zhang et al., 1996, Nucleic Acids Res. 24: 990-991; Tessier et al., 1986, Anal. Biochem. 158: 171-178)
  • a thermostable 5' ApA/DNA ligase for example, T4 RNA ligase
  • The present invention also provides a kit comprising the above hook probe of the present invention and, optionally, at least one ligase for ligating the ligatable end of the hook probe to the end of the nucleic acid fragment to be ligated. The ligase may be any of the ligases described above.
  • the present invention also provides the use of the above hook probe of the present invention in constructing a nucleic acid sequencing library, particularly in the construction of a high throughput sequencing library. It is to be understood that the hook probe of the present invention is very versatile and is not limited to the construction of nucleic acid sequencing libraries, and only one major use is listed herein.
  • the present invention also provides a nucleic acid ligation method, the method comprising:
  • annealing and hybridizing the above hook probe to a denatured nucleic acid fragment to be ligated; and ligating the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment in the presence of a ligase.
  • the nucleic acid ligation method of the present invention is a basic method whose application is not particularly limited; it can be used in any scenario in which the above hook probe of the present invention is ligated to a nucleic acid fragment.
  • a typical but non-limiting application is a method for constructing a nucleic acid sequencing library, as follows:
  • the method for constructing the above nucleic acid sequencing library may further comprise:
  • a step of removing linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes; preferably, these are digested using a single-strand exonuclease.
  • the above method further includes:
  • the hook region can include a restriction enzyme binding site that can be cleaved by a restriction enzyme and/or one or more modified nucleotides that can be cleaved, and the method further includes:
  • a step of digesting the restriction enzyme binding site with a restriction enzyme and/or cleaving the one or more modified nucleotides with a cleavage enzyme.
  • the present invention proposes a series of methods for constructing nucleic acid sequencing libraries that can be applied to different sequencing platforms; each specific library construction scheme is described in detail below.
  • other similar aspects and modifications thereof are also included in the scope of the present invention in addition to the embodiments described below.
  • the bilateral hook probe hybridization scheme is shown in Figures 2 and 3.
  • the nucleic acid fragment (for example, the target nucleic acid) is captured by hybridization with a 5' hook probe and a 3' hook probe that match its 5' end and 3' end portions respectively, and intermolecular ligation of the hybridization complex forms a specific double-hook ligation product.
  • the ligation product is then either amplified by PCR through the hook regions (see Figure 2) or used for PCR-free library construction (Figure 3).
  • this scheme can be used for library construction by target sequence capture, for rapid detection of unknown flanking sequences at both ends of a known sequence (for example, rapid amplification of cDNA ends, RACE), or for artificial splicing of sequences in synthetic biology.
  • this scheme can be used for PCR-based or PCR-free library construction from genomic DNA or RNA.
  • the hook probe comprises a 5' hook probe and a 3' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group capable of ligation to the 5' end of the nucleic acid fragment; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group capable of ligation to the 3' end of the nucleic acid fragment; and the hook regions of the 5' hook probe and the 3' hook probe each comprise a universal primer binding site. As shown in FIG.
  • the method for constructing a nucleic acid sequencing library in this embodiment specifically includes the steps of: (1) annealing and hybridizing the hook probes (for example, one or more pairs) with the denatured nucleic acid fragment to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) in the presence of a ligase, ligating the functional 3' hydroxyl group of the 5' hook probe to the 5' end of the nucleic acid fragment and the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) amplifying the ligation product of the hook probes and the nucleic acid fragment by PCR with universal primers matched to the hook region sequences, wherein the universal primers are complementary to the universal primer binding sites of the 5' hook probe and the 3' hook probe, respectively.
  • the hook probe comprises a 5' hook probe and a 3' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group capable of ligation to the 5' end of the nucleic acid fragment; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group capable of ligation to the 3' end of the nucleic acid fragment; and the hook regions of the 5' hook probe and the 3' hook probe each include a restriction enzyme binding site that can be cleaved by a restriction enzyme and/or one or more modified nucleotides that can be cleaved. As shown in FIG.
  • the method for constructing a nucleic acid sequencing library in this embodiment specifically includes the steps of: (1) annealing and hybridizing the hook probes (for example, one or more pairs) with the denatured nucleic acid fragment to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) in the presence of a ligase, ligating the functional 3' hydroxyl group of the 5' hook probe to the 5' end of the nucleic acid fragment and the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) proceeding in either or both of the following two ways:
  • 4.1. the ligation product is denatured and hybridized with the restriction endonuclease recognition sequence to form a restriction endonuclease recognition site, which is cleaved with the corresponding restriction endonuclease to excise the non-essential sequences on the hook probe (such as the GSP region and/or hook region sequence unrelated to sequencing);
  • 4.2. when the hook region sequence contains a U base, USER enzyme is added to cleave and excise the non-essential sequences on the hook probe (such as the GSP region and/or hook region sequence unrelated to sequencing). The fragments remaining after excision can be used for subsequent library construction and sequencing.
  • the unilateral hook probe hybridization scheme is shown in Figures 4-7.
  • the nucleic acid fragment (for example, a target nucleic acid) is captured by hybridization with a 5' hook probe or a 3' hook probe that matches its 5' end or 3' end portion, and intermolecular ligation of the hybridization complex forms a specific single-hook ligation product.
  • PCR-based or PCR-free enrichment and detection of the nucleic acid fragment is then achieved by combining primer extension and branch ligation techniques.
  • the unilateral hook probe hybridization scheme is likewise applicable to library construction by target sequence capture (especially for gene fusion and SV detection), whole-genome library construction, RNA-seq library construction, RACE, artificial sequence splicing, and the like.
  • the hook probe comprises a 3' hook probe, the 5' end of the hook region of the 3' hook probe having a functional 5' phosphate group capable of ligation to the 3' end of the nucleic acid fragment.
  • the method for constructing a nucleic acid sequencing library in this embodiment specifically includes the following steps: (1) annealing and hybridizing the hook probes (one or more) with the denatured nucleic acid fragment to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment in the presence of a ligase; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) performing a primer extension reaction with a primer sequence complementary to all or part of the hook region sequence of the 3' hook probe, for PCR library construction; (5) ligation of 5
  • the hook probe comprises a 3' hook probe; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group capable of ligation to the 3' end of the nucleic acid fragment, and the hook region of the 3' hook probe includes a restriction enzyme binding site that can be cleaved by a restriction enzyme and/or one or more modified nucleotides that can be cleaved. As shown in FIG.
  • the method for constructing a nucleic acid sequencing library in this embodiment specifically includes the following steps: (1) annealing and hybridizing the hook probes (one or more) with the denatured nucleic acid fragment to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment in the presence of a ligase; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) implementing a PCR-free scheme in either or both of the following two ways:
  • 4.1. the ligation product is denatured and hybridized with the restriction endonuclease recognition sequence to form a restriction endonuclease recognition site, which is cleaved with the corresponding restriction endonuclease to excise the non-essential sequences on the hook probe (such as the GSP region and/or hook region sequence unrelated to sequencing);
  • 4.2. when the hook region contains a U base, USER enzyme is added to cleave and excise the non-essential sequences on the hook probe (such as the GSP region and/or hook region sequence unrelated to sequencing). The fragments remaining after excision can be used for subsequent library construction and sequencing.
  • the hook probe comprises a 3' hook probe, and the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group capable of ligation to the 3' end of the nucleic acid fragment.
  • the nucleic acid fragment includes a sequence of the target region. As shown in FIG.
  • the method for constructing the nucleic acid sequencing library in this embodiment specifically includes the following steps: (1) annealing and hybridizing the hook probes (one or more) with the denatured nucleic acid fragment to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment in the presence of a ligase; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) performing a primer extension reaction with a primer sequence complementary to all or part of the target region sequence or its adjacent regions; (5) attaching a 5' linker (for example, a 5' blunt-end linker or a TA linker) to the 5' end of the extension reaction product; (6) a primer that is fully or partially complementary to the hook region sequence of the 5' linker and the 3' hook probe is used. The sequence was subjected to
  • the hook probe comprises a 5' hook probe, and the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group capable of attaching to the 5' end of the nucleic acid fragment.
  • the hook region of the 5' hook probe includes a universal primer binding site.
  • the method for constructing a nucleic acid sequencing library in this embodiment specifically includes the following steps: (1) end-repairing and dephosphorylation of the fragmented nucleic acid fragment to be ligated (for example, fragmented DNA or plasma free DNA).
  • the hook probe comprises a 5' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group capable of ligation to the 5' end of the nucleic acid fragment, and the hook region of the 5' hook probe includes a restriction enzyme binding site that can be cleaved by a restriction enzyme and/or one or more modified nucleotides that can be cleaved.
  • the method for constructing a nucleic acid sequencing library in this embodiment specifically includes the following steps: (1) end-repairing and dephosphorylation of the fragmented nucleic acid fragment to be ligated (for example, fragmented DNA or plasma free DNA).
  • the hook probe comprises a 3' hook probe, and the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group capable of ligation to the 3' end of the nucleic acid fragment.
  • the method for constructing a nucleic acid sequencing library in this embodiment specifically includes the steps of: (1) annealing and hybridizing the hook probes (for example, one or more) with the denatured nucleic acid fragment to be ligated (for example, fragmented DNA or plasma cell-free DNA); (2) ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment in the presence of a ligase; (3) in the presence of a polymerase, performing an extension reaction using the 3' end of the 3' hook probe as the starting point of the polymerase reaction; (4) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (5) attaching a 5' linker (a 5' blunt-end linker or a TA linker) to the 5' end of the extension reaction product; (6) performing PCR amplification with primer sequences complementary to all or part of the 5' linker and of the hook region sequence of the 3' hook probe, the amplified product being used for subsequent library construction and sequencing.
  • the method for constructing a nucleic acid sequencing library of the invention addresses the complicated workflow, long processing time, and high cost of existing probe liquid-phase hybridization capture technology.
  • the technology of the invention is applicable to various types of samples, including but not limited to whole-genome DNA, cfDNA, ctDNA, FFPE DNA, RNA, and mRNA, and to various types of genetic variation, such as SNPs (single nucleotide polymorphisms), InDels (insertions and deletions), CNVs (copy number variations), SVs (structural variations), and gene fusions.
  • variants of the invention can also be used for rapid and direct library construction from whole-genome DNA, cfDNA, ctDNA, FFPE DNA, and the like.
  • through the design of the hook probe and its cooperation with the ligase, the invention rapidly captures nucleic acid fragments (such as target sequences) by hybridization while adding a known tool sequence; designing and using this known tool sequence for subsequent operations enables different applications.
  • the present invention is applicable to a wide range of sample types, and is applicable to a wide variety of detection types and applications (not limited to high-throughput library construction, but also to fields of molecular cloning and synthetic biology). It is foreseeable that the implementation of the present invention will greatly simplify the process, shorten the time and cost, and break through the applicable sample type restrictions, and will benefit from a variety of scientific applications and kit packaging, and the market potential and prospects are very broad.
  • Test Example 1 and Test Example 2 demonstrate the basic principle of the present invention: under appropriate reaction conditions, a nucleic acid fragment, such as a single-stranded nucleic acid fragment of the target region (with a phosphate group at the 5' end or a hydroxyl group at the 3' end), and a target-specific hook probe whose partial sequence is complementary to that fragment (with a phosphate group at the 5' end or a hydroxyl group at the 3' end) form a hybridization complex; under catalysis by certain ligases, the proportion of product from intermolecular single-strand ligation at the non-complementary region of the complex (the 5' end of the target region single strand and the 3' end of the hook probe) is higher than the proportion of intramolecular single-strand circularization product of the target region fragment.
  • Figure 8 shows a 10% denaturing polyacrylamide gel (U-PAGE) image of the products. The results show:
  • Lane 1 Target region nucleic acid single-stranded fragment YJ-439 (synthesized by IDT, 90 nt), the sequence is as follows:
  • P-CTCATGCCCTTCGGCTGCCTCCTGGACTATGTCCGGGAACACAAAGACaatattggctcccagtacctgctcaactggtgtgtgcagatc SEQ ID NO: 1.
  • Lane 2 Target-specific 5' hook probe YJ-765 (synthesized by IDT, 59 nt, containing 20 nt complementary to YJ-439), the sequence is as follows:
  • CAGGAGGCAGCCGAAGGGCAGAACGACATGGCTACGATCCGACTTNNNNNNCATTTCAT SEQ ID NO: 2.
  • Lane 3 Products treated with exonuclease I and III after YJ-439 and Cirligase I were reacted at an optimum temperature of 55 °C.
  • Lane 4 Product of YJ-439 and Cirligase I after reaction at an optimum temperature of 55 °C.
  • Lanes 5-9 Products after YJ-439/YJ-765 and Cirligase I were reacted together at different temperatures (25 ° C, 37 ° C, 45 ° C, 55 ° C and 60 ° C).
  • Lane 3 (with exonuclease I and III treatment) and lane 4 (without exonuclease I and III treatment) indicate that YJ-439 forms a single-stranded circle (running at about 150 nt, marked by a triangle) with Cirligase I at the optimum temperature of 55 °C. When incubated with the hook probe (YJ-765) at different temperatures (25 °C, 37 °C, 45 °C, 55 °C, and 60 °C), as shown in lanes 5-9, most YJ-439 forms the product ligated to the hook probe (149 nt, marked by an arrow) rather than a single-stranded circle, since these products can be degraded by exonuclease I and III.
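The 20 nt complementary stretch and the 149 nt ligation product can be checked directly from SEQ ID NO: 1 and SEQ ID NO: 2. The sketch below is illustrative only: `reverse_complement` is a helper written here, and it assumes that the target-specific region is the 5'-most 20 nt of YJ-765 and that the 3' end of the probe is joined to the 5' phosphate of YJ-439.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence; N is kept as N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


# SEQ ID NO: 1 (YJ-439, 5' phosphate) and SEQ ID NO: 2 (YJ-765), as listed above.
YJ_439 = ("CTCATGCCCTTCGGCTGCCTCCTGGACTATGTCCGGGAACACAAAGAC"
          "aatattggctcccagtacctgctcaactggtgtgtgcagatc").upper()
YJ_765 = "CAGGAGGCAGCCGAAGGGCAGAACGACATGGCTACGATCCGACTTNNNNNNCATTTCAT"

# Assumption: the target-specific region is the first 20 nt of the probe.
target_specific = YJ_765[:20]
binding_site = reverse_complement(target_specific)

# 0-based position in YJ-439 where the probe hybridizes (-1 if absent).
start = YJ_439.find(binding_site)

# Linear intermolecular product: 5'-probe (3'-OH) joined to (5'-P) target-3'.
product = YJ_765 + YJ_439

print(len(YJ_439), len(YJ_765))   # lengths of the two oligonucleotides
print(start)                      # hybridization position near the 5' end of YJ-439
print(len(product))               # 149
```

Because the probe hybridizes close to the 5' end of YJ-439, the 3' end of its hook region is held near the 5' phosphate of the target, which is consistent with the intermolecular product being favoured over the circle in lanes 5-9.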
  • Figure 9 shows a 10% U-PAGE gel of the product. The results show:
  • Lane 1 Target region nucleic acid single-stranded fragment YJ-439 (synthesized by IDT, 90 nt).
  • Lane 2 Product treated with exonuclease I and III after YJ-439 and Cirligase I were reacted at an optimum temperature of 55 °C.
  • Lane 3 Non-target-specific 5' hook probe YJ-890 (synthesized by IDT, 46 nt), whose sequence replaces the 20 nt of YJ-765 complementary to YJ-439 with a random base sequence:
  • Lane 4 Non-target specific 3' hook probe YJ-891 (synthesized by IDT, 40 nt), the sequence of which is:
  • Lane 5 Product of YJ-890/YJ-891 and Cirligase I after reaction at an optimum temperature of 55 °C.
  • Lanes 6-10 Products after YJ-439/YJ-890/YJ-891 and Cirligase I were reacted together at different temperatures (25 ° C, 37 ° C, 45 ° C, 55 ° C and 60 ° C).
  • Lane 2 (with exonuclease I and III treatment) indicates that YJ-439 by itself forms a single-stranded circle (running at about 150 nt, marked by a triangle) with Cirligase I at the optimum temperature of 55 °C.
  • Lane 5 indicates that the non-target-specific hook probes YJ-890 and YJ-891 form an intermolecular product (86 nt, indicated by the long arrow in lanes 5-10) with Cirligase I at the optimum temperature of 55 °C.
  • As shown in lanes 6-10, when YJ-439 is incubated with the non-target-specific hook probes (YJ-890/YJ-891) at different temperatures (25 °C, 37 °C, 45 °C, 55 °C, and 60 °C), most ligation products are either the single-stranded circle, which resists exonuclease treatment (data not shown; marked by a triangle at about 150 nt), or the intermolecular ligation product formed between the 5' hook probe and the 3' hook probe (86 nt, marked by the long arrow in lanes 5-10), which can be degraded by exonuclease I and III (data not shown). There is almost no random intermolecular ligation product (136 nt and/or 130 nt) between the template (YJ-439) and the non-target-specific hook probes (YJ-890 and YJ-891).
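The band sizes quoted for Figure 9 follow from adding the oligonucleotide lengths given in the lane descriptions. As a minimal check (the function name `pairwise_product_lengths` is introduced here for illustration, and only simple two-molecule linear products are considered):

```python
from itertools import combinations

# Lengths in nt, as stated in the lane descriptions for Figure 9.
LENGTHS_NT = {"YJ-439": 90, "YJ-890": 46, "YJ-891": 40}


def pairwise_product_lengths(lengths: dict) -> dict:
    """Length of the linear product formed by ligating each pair of oligos."""
    return {
        f"{a} + {b}": lengths[a] + lengths[b]
        for a, b in combinations(lengths, 2)
    }


for pair, size in pairwise_product_lengths(LENGTHS_NT).items():
    print(pair, size)
# YJ-439 + YJ-890 136
# YJ-439 + YJ-891 130
# YJ-890 + YJ-891 86
```

The 86 nt band is therefore the probe-to-probe product, while 136 nt and 130 nt are the sizes expected for random ligation of the template to either non-target-specific probe.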
  • Human NA12878 (GM12878, CORIELL INSTITUTE) genomic DNA was fragmented with fragmentase, and DNA fragments of 200-400 bp were then selected by double-sided size selection.
  • the above mixture was added to the hybridization reaction system, mixed by pipetting, and incubated at 42 °C for 1 hour. Note that the mixture is at room temperature when it is added, while the hybridization reaction remains in the PCR machine.
  • exonuclease I (Exo I, NEB) was added and mixed by pipetting, and the reaction was incubated at 37 °C for 30 minutes and then at 80 °C for 20 minutes.
  • Hybridization at the 5' end: 1 µL of 5' hook probe at the corresponding concentration (0.1 µM, 0.01 µM, 0.005 µM, or 0.002 µM) was added to each reaction and mixed by pipetting, and the following program was run: 95 °C for 5 minutes, cooling to 42 °C at a rate of 0.1 seconds, reaction at 42 °C for 30 minutes, and hold at 42 °C.
  • the 5' hook probe is designed as follows:
  • the above mixture was added to the hybridization reaction system, mixed by pipetting, and incubated at 42 °C for 1 hour and then at 80 °C for 20 minutes. Note that the mixture is at room temperature when it is added, while the hybridization reaction remains in the PCR machine.
  • Lane 1 contains only the 3' hook probe and DNA, so there is no corresponding product after PCR amplification.
  • Lane 2 contains only the 5' hook probe and DNA, so there is no corresponding product after PCR amplification; lanes 1 and 2 show that the specificity of the hook probes and PCR primers is good.
  • Lane 3 contains the 3' hook probe, the 5' hook probe, and DNA, so there is a corresponding product after PCR amplification.
  • in theory, the corresponding target product can be obtained only when the number of PCR cycles is relatively large.
  • the number of PCR cycles in this reaction is lower than the theoretical value, so the bright band in the gel image indicates a relatively high off-target rate.
  • an unknown band of 400 bp appears in the PCR product.
  • Lanes 4-6 The concentration of the 3' hook probe was kept constant and the concentration of the 5' hook probe was decreased. As the 5' hook probe concentration decreased, the total amount of PCR product decreased, which indirectly shows the actual target product gradually emerging. Moreover, below a certain concentration the 400 bp band disappears, so the hook probe concentration will be optimized further in subsequent experiments.
  • Lanes 7-8 The concentration of the 5' hook probe was kept constant and the concentration of the 3' hook probe was decreased. The 400 bp band also disappeared under this condition, and the total amount of non-specific PCR product also decreased, so the hook probe concentration will be adjusted in subsequent experiments to improve capture efficiency.
  • Human NA12878 (GM12878, CORIELL INSTITUTE) genomic DNA was fragmented with fragmentase, and DNA fragments of 200-400 bp were then selected by double-sided size selection.
  • Preparation of the hybridization reaction system: 10 ng of human NA12878 (GM12878, CORIELL INSTITUTE) genomic DNA was mixed with 1 µL of 0.1 µM 5' and 3' hook probe and 1 µL of reaction buffer, and water was added to 10 µL. The mixture was vortexed and placed in a PCR machine. The reaction program was as follows: 95 °C for 5 minutes, cooling to 42 °C at a rate of 0.1 seconds, reaction at 42 °C for 1 hour, and hold at 42 °C.
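As a worked example of the dilution in this step, the final probe concentration follows from C1·V1 = C2·V2. The sketch below assumes that 1 µL of a 0.1 µM probe stock ends up in the 10 µL hybridization reaction; the function names are introduced here for illustration only.

```python
def final_concentration_um(stock_um: float, stock_ul: float, final_ul: float) -> float:
    """Final concentration in µM after dilution (C1 * V1 = C2 * V2)."""
    return stock_um * stock_ul / final_ul


def amount_pmol(stock_um: float, stock_ul: float) -> float:
    """Amount of oligonucleotide in pmol (µM x µL = pmol)."""
    return stock_um * stock_ul


# Assumed inputs: 1 µL of 0.1 µM hook probe in a 10 µL reaction.
probe_final_um = final_concentration_um(0.1, 1.0, 10.0)
probe_pmol = amount_pmol(0.1, 1.0)

print(f"{probe_final_um * 1000:.1f} nM")  # 10.0 nM
print(f"{probe_pmol:.2f} pmol")           # 0.10 pmol
```

The same two functions can be applied to any other stock concentration, provided the reaction volume is known.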
  • the 5' hook probe is designed as follows:
  • the 3' hook probe is designed as follows:
  • the above mixture was added to the hybridization reaction system, mixed by pipetting, and incubated at 42 °C for 1 hour. Note that the mixture is at room temperature when it is added, while the hybridization reaction remains in the PCR machine.
  • exonuclease I (Exo I, NEB) was added and mixed by pipetting, and the reaction was incubated at 37 °C for 30 minutes and then at 80 °C for 20 minutes.
  • Figure 12 shows polyacrylamide gel electrophoresis results for some of the products in this example, where lane 14 is the library product obtained under the library construction conditions of this example.
  • Figure 13 shows the distribution of sequencing reads on chromosome 10, where A and B are the WGS control library and the negative control library, respectively, with no enrichment of the target region, and C is the bilateral hook probe hybridization library, in which two ROIs are clearly enriched.
  • Figure 14 shows how the captured reads and the on-target reads cover the ROI regions.
  • the captured reads clearly cover the ROI regions, and the target regions on the chromosome are significantly enriched.
  • the red line (curve 1 in Fig. 14) shows the captured reads
  • the blue line (curve 2 in Fig. 14) shows the on-target reads
  • the dark gray area (indicated by A in Fig. 14) is ROI ± 20 bp
  • the light gray area (indicated by B in Fig. 14) is ROI ± 100 bp
  • the yellow line (indicated by C in Fig. 14) shows the length spanned by the reads at the two ends of a paired-end (PE) fragment.
  • Human NA12878 (GM12878, CORIELL INSTITUTE) genomic DNA was fragmented with fragmentase, and DNA fragments of 200-400 bp were then selected by double-sided size selection.
  • Preparation of the hybridization reaction system: the above dephosphorylation reaction solution was mixed with 1 µL of 0.2 µM 3' hook probe to a total volume of 11 µL. The mixture was vortexed and placed in a PCR machine. The reaction program was as follows: 95 °C for 5 minutes, cooling to 42 °C at a rate of 0.1 seconds, reaction at 42 °C for 1 hour, and hold at 42 °C.
  • the 3' hook probe is designed as follows:
  • the above mixture was added to the hybridization reaction system, mixed by pipetting, and incubated at 42 °C for 1 hour. Note that the mixture is at room temperature when it is added, while the hybridization reaction remains in the PCR machine.
  • exonuclease I (Exo I, NEB) was added and mixed by pipetting, and the reaction was incubated at 37 °C for 30 minutes and then at 80 °C for 20 minutes.
  • the above mixture was added to the 21 µL exonuclease digestion reaction system and mixed by pipetting, then incubated at 98 °C for 3 minutes and at 60 °C for 30 minutes, and held at 4 °C.
  • Linker short strand: GCTACGATCCGACT/ddT/ (SEQ ID NO: 17);
  • Linker long strand: /Phos/AAGTCGGATCGTAGCCATGTCGTT/ddC/ (SEQ ID NO: 18).
  • the above mixture was added to the 50 µL primer extension reaction system and mixed by pipetting, then incubated at 37 °C for 30 minutes and held at 4 °C.
  • the reaction mixture of Table 15 below was prepared, and 20 µL of the above purified product was subjected to PCR:
  • Component | Amount
    Ligation product | 20 µL
    2× KAPA HiFi HotStart ReadyMix | 25 µL
    20 µM primer 3 | 2 µL
    20 µM primer 4 | 2 µL
    Water | 1 µL
    Total volume | 50 µL
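The volumes in Table 15 can be checked for consistency, and the final primer and master-mix concentrations derived from them. This is a simple arithmetic sketch; the dictionary keys are labels chosen here, and the only inputs are the volumes and stock concentrations listed in the table.

```python
# Volumes in µL, as listed in Table 15.
PCR_MIX_UL = {
    "ligation product": 20.0,
    "2x KAPA HiFi HotStart ReadyMix": 25.0,
    "20 uM primer 3": 2.0,
    "20 uM primer 4": 2.0,
    "water": 1.0,
}

# The components should add up to the stated total volume.
total_ul = sum(PCR_MIX_UL.values())

# Final concentrations by C1 * V1 = C2 * V2.
primer_final_um = 20.0 * PCR_MIX_UL["20 uM primer 3"] / total_ul
readymix_final_x = 2.0 * PCR_MIX_UL["2x KAPA HiFi HotStart ReadyMix"] / total_ul

print(f"total volume: {total_ul:.0f} uL")              # total volume: 50 uL
print(f"each primer: {primer_final_um:.1f} uM")        # each primer: 0.8 uM
print(f"ReadyMix: {readymix_final_x:.0f}x")            # ReadyMix: 1x
```

The components sum to the stated 50 µL, each primer is present at 0.8 µM, and the 2× master mix is diluted to its 1× working strength.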
  • Primer 3 /5Phos/GAACGACATGGCTACGA (SEQ ID NO: 19);
  • Primer 4 TTGGAGCCAGGAGGTTG (SEQ ID NO: 20).
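The relationship between the linker strands (SEQ ID NO: 17 and 18) and primer 3 (SEQ ID NO: 19) can be verified by sequence comparison. The sketch below is illustrative only; it assumes the terminal dideoxy nucleotides (/ddT/, /ddC/) pair like ordinary T and C and ignores the phosphate modifications. `reverse_complement` is a helper written here.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


# Linker strands with the terminal dideoxy base written as a normal base
# (an assumption made only for the purpose of checking base pairing).
LINKER_SHORT = "GCTACGATCCGACT" + "T"             # SEQ ID NO: 17 with /ddT/
LINKER_LONG = "AAGTCGGATCGTAGCCATGTCGTT" + "C"    # SEQ ID NO: 18 with /ddC/
PRIMER_3 = "GAACGACATGGCTACGA"                    # SEQ ID NO: 19

# The short strand pairs with the 5' portion of the long strand.
short_pairs_with_long = LINKER_LONG.startswith(reverse_complement(LINKER_SHORT))

# Primer 3 matches the reverse complement of the long strand,
# so it can anneal to the long strand of the linker.
primer_anneals_to_long = reverse_complement(LINKER_LONG).startswith(PRIMER_3)

print(short_pairs_with_long)    # True
print(primer_anneals_to_long)   # True
```

The short strand covers the first 15 nt of the long strand, leaving the remaining 10 nt of the long strand single-stranded, and primer 3 corresponds to 17 nt of the long strand's reverse complement.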
  • Fig. 15 shows polyacrylamide gel electrophoresis results for some of the products in this example, where lane 12 is the library product obtained under the library construction conditions of this example.
  • the library of this example was sequenced by PE50 and analyzed; the PE50 sequencing data performed well in both on-target rate and capture rate.
  • derivations, variants, or substitutions include the following: hook probes can be designed not only as DNA probes but also as RNA probes.
  • the intermolecular hybridization complex may be one or more of DNA/DNA, DNA/RNA, and RNA/RNA hybridization complexes.
  • potential intermolecular non-specific linear ligation products can be eliminated not only by single-strand exonuclease treatment but also by other possible means such as gel extraction or magnetic bead selection.
  • the method of adding a linker to the non-hybridized end in the unilateral hook probe scheme may have alternatives other than those described herein.
  • it can be improved by changing the reaction conditions such as the hybridization system, the hybridization component, the hybridization reagent, and the hybridization temperature, or can be carried out by performing two or more nested enrichment methods.
  • the present invention is not limited to the development of high-throughput library construction technology and kits. It can also be applied to molecular cloning experiments such as RACE; to synthetic biology experiments such as splicing of synthetic sequences, and the development of corresponding kits; to the development of biochemical assays and kits that rely on chemiluminescence to indicate detection results, such as genotyping and real-time PCR; and to any application that requires detecting or extracting information on a known sequence or its flanking sequences by means of a known nucleic acid sequence, or that uses a known sequence to mediate the addition of other sequences to a fragment of known sequence or to a fragment flanking a known sequence.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Genetics & Genomics (AREA)
  • Analytical Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Immunology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

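The present invention provides a hook probe, a nucleic acid ligation method, and a method for constructing a sequencing library. The hook probe comprises a target-specific region and a hook region connected to it. The target-specific region comprises a sequence complementary to at least part of a single strand of the nucleic acid fragment to be ligated; the hook region comprises a sequence that does not pair with the nucleic acid fragment; and the end of the hook region is a ligatable end capable of being ligated to a single-stranded end of the nucleic acid fragment.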

Description

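Hook probe, nucleic acid ligation method, and method for constructing a sequencing library
Technical Field
The present invention relates to the technical field of molecular biology, and in particular to a hook probe, a nucleic acid ligation method, and a method for constructing a sequencing library.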
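Background
High-throughput sequencing is a revolutionary change from traditional sequencing: it determines the sequences of hundreds of thousands to millions of DNA molecules in a single run, enabling a detailed and comprehensive analysis of the transcriptome and genome of a species in order to find genes or variant sites associated with a phenotype, and providing a theoretical basis for research and applications. A high-throughput sequencing library is a collection of amplifiable DNA or cDNA fragments formed by adding known synthetic sequences to fragmented DNA or RNA through a series of biochemical reactions. The added known synthetic sequences are usually called linkers (adapters); their main roles are to allow amplification of the library and to provide complementary sequences to which sequencing primers hybridize. The basic library construction workflow comprises nucleic acid extraction, nucleic acid fragmentation (not required for cell-free DNA), end repair and modification of the fragmented nucleic acid, linker ligation, and fragment amplification or no amplification. Depending on the research purpose and the type of starting sample, the workflow differs slightly from this basic workflow: for example, library construction from cell-free DNA does not require further DNA fragmentation; library construction from RNA requires the additional step of reverse transcription into cDNA; and library construction for target regions requires enrichment of target region sequences from DNA or RNA by hybridization or amplification.
Over the past ten years, the cost of high-throughput sequencing has fallen sharply, faster than Moore's law, extending high-throughput sequencing from research into clinical testing and other fields, and in particular providing novel and reliable detection methods for precision medicine. Target region sequencing of DNA, RNA, cfDNA, and ctDNA requires deep sequencing of only the genes of interest, and so offers many detection sites, high sample throughput, low cost, and higher detection sensitivity; it is widely used in disease diagnosis and targeted drug therapy. Providing efficient, fast, accurate, reliable, and affordable library construction and sequencing technology is a key breakthrough point for applying high-throughput sequencing to precision medicine.
There are two mainstream classes of library construction technology for target region enrichment. The first is based on probe hybridization and is divided into solid-phase and liquid-phase hybridization; liquid-phase hybridization is currently in wide use, for example Agilent's RNA probe liquid-phase hybridization, Roche NimbleGen's DNA probe liquid-phase hybridization, and Illumina's transposase fragmentation combined with DNA probe liquid-phase hybridization. The second is based on amplification and is divided into single-pair amplification followed by pooling, and multiplex amplification; multiplex amplification is currently the more widely used, for example the Life Technologies Ion Torrent AmpliSeq multiplex PCR technology.
Probe liquid-phase hybridization capture is well suited to high-throughput sequencing library workflows that require highly parallel sample preparation. The technique uses blocking nucleic acids such as Cot-1 DNA and/or sequence-specific blocking oligonucleotides to reduce non-specific hybridization and increase the specificity of hybridization between probes and sample nucleic acids. However, common hybridization capture methods require very long hybridization times to reach equilibrium and/or to achieve effective capture and enrichment of target nucleic acids. Even so, at least about 40% contamination from non-target regions remains. In addition, target sequences may be lost at random during hybridization, washing, or elution, or in reactions upstream (for example, linker ligation) or downstream (for example, binding of biotin-labeled hybridization complexes to streptavidin magnetic beads) of the hybridization step. Furthermore, the reagents used (for example, probes, blocking reagents, and streptavidin magnetic beads) are expensive, with little room for price reduction.
Multiplex PCR can enrich target region sequences more efficiently and specifically, and at lower cost than probe liquid-phase hybridization capture, but it is poorly suited to enriching target regions from small cfDNA fragments that are already naturally fragmented.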
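Summary of the Invention
The present invention provides a hook probe, a nucleic acid ligation method, and a method for constructing a sequencing library. With the hook probe, a known tool sequence can be added to the single-stranded end of a nucleic acid fragment to be ligated, and subsequent reactions performed through that known tool sequence enable different applications.
According to a first aspect, the present invention provides a hook probe comprising a target-specific region and a hook region connected to it. The target-specific region comprises a sequence complementary to at least part of a single strand of the nucleic acid fragment to be ligated; the hook region comprises a sequence that does not pair with the nucleic acid fragment; and the end of the hook region is a ligatable end capable of being ligated to a single-stranded end of the nucleic acid fragment.
According to a second aspect, the present invention provides a kit comprising the hook probe of the first aspect and, optionally, at least one ligase for ligating the ligatable end of the hook probe to the end of the nucleic acid fragment to be ligated.
According to a third aspect, the present invention provides a use of the hook probe of the first aspect in constructing a nucleic acid sequencing library.
According to a fourth aspect, the present invention provides a nucleic acid ligation method comprising: annealing and hybridizing the hook probe of the first aspect to a denatured nucleic acid fragment to be ligated; and, in the presence of a ligase, ligating the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment.
According to a fifth aspect, the present invention provides a method for constructing a nucleic acid sequencing library, comprising: a step of annealing and hybridizing the hook probe of the first aspect to a denatured nucleic acid fragment to be ligated; and a step of ligating, in the presence of a ligase, the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment.
Through its design and its cooperation with a ligase, the hook probe provided by the present invention rapidly captures by hybridization the nucleic acid fragment to be ligated (for example, a target nucleic acid fragment) while adding a known tool sequence; designing and using this known tool sequence for subsequent reactions enables different applications.
In particular, the present invention is applicable to a wide range of sample types, detection types, and application fields (it is not limited to high-throughput library construction and can also be applied to molecular cloning, synthetic biology, and other fields). Implementing the present invention will greatly simplify the workflow, shorten the time, and reduce cost, and will overcome limits on applicable sample types; it will benefit a variety of research applications and kit packaging, and its market potential and prospects are very broad.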
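Brief Description of the Drawings
FIG. 1 is a schematic diagram of the composition of target-specific 5' and 3' hook probes.
FIG. 2 is a flowchart of the scheme with PCR after hybridization and ligation of two-sided hook probes.
FIG. 3 is a flowchart of the PCR-free scheme after hybridization and ligation of two-sided hook probes.
FIG. 4 is a flowchart of one PCR/PCR-free scheme after hybridization and ligation of a one-sided 3' hook probe.
FIG. 5 is a flowchart of another PCR/PCR-free scheme after hybridization and ligation of a one-sided 3' hook probe.
FIG. 6 is a flowchart of the PCR/PCR-free scheme after hybridization and ligation of a one-sided 5' hook probe.
FIG. 7 is a flowchart of a PCR scheme after hybridization and ligation of a one-sided 3' hook probe.
FIG. 8 shows a 10% denaturing polyacrylamide gel (U-PAGE) of the products obtained by incubating the target-region single-stranded nucleic acid fragment (YJ-439) and the target-specific 5' hook probe (YJ-765) with CircLigase I at different reaction temperatures.
FIG. 9 shows a 10% U-PAGE gel of the products obtained by incubating the target-region single-stranded nucleic acid fragment (YJ-439), the non-target-specific 5' hook probe (YJ-890), and the non-target-specific 3' hook probe (YJ-891) with CircLigase I at different reaction temperatures.
FIG. 10 is a schematic diagram of sequence ligation in the test examples that verify the basic principle.
FIG. 11 shows the polyacrylamide gel electrophoresis results for some of the products in Example 1.
FIG. 12 shows the polyacrylamide gel electrophoresis results for some of the products in Example 2.
FIG. 13 shows the distribution of sequencing reads on chromosome 10 in Example 2, where A and B are the WGS control library and the negative control library, respectively, and C is the two-sided hook probe hybridization library; the two arrows in FIG. 13C indicate the two enriched ROIs.
FIG. 14 shows how the capture reads and the on-target reads cover the ROI regions in Example 2.
FIG. 15 shows the polyacrylamide gel electrophoresis results for some of the products in Example 3.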
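Detailed Description
The invention is described in further detail below through specific embodiments with reference to the drawings. In the following embodiments, many details are described so that the invention can be better understood. However, those skilled in the art will readily recognize that some of these features may be omitted in different situations or replaced by other elements, materials, or methods. In some cases, certain operations related to the invention are not shown or described in the specification, so that the core of the invention is not obscured by excessive description. For those skilled in the art, a detailed description of these operations is unnecessary, because they can be fully understood from the specification and general technical knowledge in the field.
The invention proposes a hook probe (also called a "hook nucleic acid probe"). Using the hook probe together with the basic principle of intermolecular single-strand ligation, an adapter can be added while a nucleic acid fragment is rapidly captured. The invention proposes a series of technical solutions based on the hook probe and this basic principle, including but not limited to the oligonucleotide sequence composition, enzymes and reagent components, reaction conditions, and method steps needed to implement them.
The basic principle of the invention is as follows. Under suitable reaction conditions, a nucleic acid fragment, for example a target-region single-stranded nucleic acid fragment (with a phosphate group at its 5' end or a hydroxyl group at its 3' end), and a target-specific hook probe whose sequence is partly complementary to that fragment (with a phosphate group at its 5' end or a hydroxyl group at its 3' end) form a hybridization complex. When catalyzed by certain ligases, the proportion of product from intermolecular single-strand ligation at the non-complementary regions of the complex (the 5' end of the target-region single strand and the 3' end of the hook probe) is higher than the proportion of product from intramolecular single-strand circularization of the target-region fragment. Moreover, single-stranded fragments that share no complementary region form intermolecular single-strand products only at a very low rate.
The invention involves the following basic concepts: hook probe, nucleic acid fragment (for example, a target-region nucleic acid fragment), and ligase. The invention has at least one of the following advantages: applicable templates are not limited by sample type, hybridization is fast, no separate adapter-addition step is needed, the workflow is simple, the reaction time is short, the cost is low, and the range of applications is wide.
One embodiment of the invention provides a hook probe comprising a target specific region and a hook region connected to it, wherein the target specific region comprises a sequence complementary to at least part of a single strand of a nucleic acid fragment to be ligated, the hook region comprises a sequence that does not pair with the nucleic acid fragment, and the end of the hook region is a ligatable end that can be ligated to a single-stranded end of the nucleic acid fragment.
Some basic concepts and the specific definitions of their components are described in detail below.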
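1. Hook probe
A hook probe may be a 5' hook probe or a 3' hook probe; FIG. 1 is a schematic diagram of the hook probes. The hook probe provided in the embodiments of the invention comprises a target specific region (TSR) and a hook region (HR). The sequence of the target specific region (the gene-specific binding site in FIG. 1) is complementary to a sequence near the nucleic acid fragment to be ligated (for example, a target sequence) in the sample and performs the hybridization capture function. The hook region of a 5' hook probe may comprise a universal primer binding site, a unique molecular identifier sequence, a sample barcode sequence, a cell barcode sequence, other useful elements, or any combination thereof. Similarly, the hook region of a 3' hook probe may comprise a universal primer binding site, a unique molecular identifier sequence, a sample barcode sequence, a cell barcode sequence, other useful elements, or any combination thereof. In particular, the hook region can also be designed as a random sequence for library construction schemes without sequence capture (such as whole-genome library construction). The universal primer binding site of the hook probe can take part in the subsequent intermolecular ligation of the hybridization complex and in PCR amplification or PCR-free library construction of the hook ligation product. The sample barcode sequence and/or unique molecular identifier sequence on the hook probe identifies different samples and/or target sequence fragments.
In some embodiments, the 5' hook probe has the structure 5'-(target specific region)-(hook region)-3'. Preferably, the 5' hook probe has the structure 5'-(target specific region)-(unique molecular identifier and/or sample barcode)-(universal primer binding site)-3'. In some embodiments, the 3' hook probe has the structure 5'-(hook region)-(target specific region)-3'. Preferably, the 3' hook probe has the structure 5'-(universal primer binding site)-(unique molecular identifier and/or sample barcode)-(target specific region)-3'.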
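The preferred segment orders above can be made concrete with a short sketch. This is only an illustration: the class name, field names, and the three placeholder sequences are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HookProbe:
    """A hook probe split into the three segments named in the text."""
    kind: str         # "5prime" or "3prime"
    tsr: str          # target specific region (TSR)
    tag: str          # UMI and/or sample barcode
    primer_site: str  # universal primer binding site

    def __post_init__(self):
        if self.kind not in ("5prime", "3prime"):
            raise ValueError(f"unknown probe kind: {self.kind!r}")

    def sequence(self) -> str:
        """Return the probe sequence written 5' to 3'."""
        if self.kind == "5prime":
            # 5'-(TSR)-(UMI/barcode)-(universal primer site)-3'
            return self.tsr + self.tag + self.primer_site
        # 5'-(universal primer site)-(UMI/barcode)-(TSR)-3'
        return self.primer_site + self.tag + self.tsr

    def ligatable_end(self) -> str:
        """Return the hook-region end that joins the fragment."""
        return "3'-OH" if self.kind == "5prime" else "5'-phosphate"


# Placeholder sequences, chosen only for illustration.
five = HookProbe("5prime", tsr="ACGTACGTAC", tag="NNNNNN", primer_site="GGCTACGATC")
three = HookProbe("3prime", tsr="TTGACCGTCA", tag="NNNNNN", primer_site="ATGCTGACGG")
print(five.sequence(), five.ligatable_end())
print(three.sequence(), three.ligatable_end())
```

In both orientations the hook region ends up at the probe end that meets the fragment, while the TSR sits at the opposite end.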
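The hook probe may be composed of deoxyribonucleic acid, ribonucleic acid, or a mixture of the two.
1.1 Target specific region (TSR)
The target specific region of a hook probe may have any suitable length and sequence for target-specific hybridization to a target nucleic acid in a reaction mixture containing target and non-target nucleic acids. The target specific region is usually shorter than 200 nucleotides.
Note that the target specific region is not dedicated to hybridizing with a target nucleic acid; that is only one specific application. In other applications, the role of the target specific region is to hybridize with a sequence in the nucleic acid fragment to be ligated so that the probe can be ligated to the end of that fragment. In other words, the target specific region is also suitable for schemes without sequence capture (such as whole-genome library construction), in which it is not required to capture a specific sequence region.
As shown in FIG. 2, one or more hook probes can be designed to hybridize to the target site and/or to sequences near it. A combination of hook probes used to capture the same target sequence is called a hook probe set. A hook probe set may comprise one 5' hook probe and/or one 3' hook probe, or multiple 5' hook probes and/or multiple 3' hook probes. Hook probes can be designed at positions flanking the target site and/or within the target site, or at positions flanking a site linked to the target site and/or within such a linked site. In this invention, a "target site" may be a specific site within the target nucleic acid (especially within its target region), for example a variant site and/or gene locus associated with a biological function (such as an SNP, an insertion/deletion site, a gene fusion site, or a methylation site).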
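1.2 Hook region (HR)
The hook region of a hook probe is all or part of the nucleic acid sequence required for template-dependent primer extension or primer-mediated PCR amplification, and also all or part of the adapter sequence required for the sequencing reaction; it may or may not be used for amplification. The hook region sequence has no homology to the template DNA (or RNA) and has no fully or partly complementary sequence in either target or non-target nucleic acids. For NGS library construction, the universal primer binding sequence of the hook region can be matched to the sequencing primers and/or adapter sequences of the sequencing platform in use. The hook region may have any suitable length and sequence and is usually shorter than 200 nucleotides.
In various embodiments, the hook region may comprise a specific primer binding site or a universal primer binding site; a unique molecular identifier (Unique Identifier/Unique Molecular Identifier, UMI); a sample barcode (SB), for example a cell barcode, sample barcode, or other barcode; other useful elements; or any combination thereof. Each 5' hook probe and/or 3' hook probe may comprise one or more UMIs; the position, number of bases, and base composition of the UMI within the hook region are designed according to the amount of starting template, the application, the sequencing strategy, and similar considerations. Each 5' hook probe and/or 3' hook probe may comprise one sample barcode (SB); the position, number of bases, and base composition of the SB within the hook region are designed according to the number of samples, the application, the sequencing strategy, and similar considerations.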
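One way to reason about the number of UMI bases is to compare the number of distinct random sequences, 4^n for n fully degenerate positions, with the number of template molecules to be distinguished. The sketch below assumes fully random (N) positions and ignores collisions and sequencing errors, so it gives a lower bound only; the function names are hypothetical.

```python
def umi_diversity(length: int) -> int:
    """Number of distinct sequences for a UMI of `length` fully random bases."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return 4 ** length


def min_umi_length(n_molecules: int) -> int:
    """Smallest UMI length whose diversity is at least `n_molecules`."""
    if n_molecules < 1:
        raise ValueError("n_molecules must be at least 1")
    length = 0
    while umi_diversity(length) < n_molecules:
        length += 1
    return length


print(umi_diversity(6))       # diversity of a 6-base random UMI
print(min_umi_length(10000))  # shortest UMI covering 10,000 molecules
```

In practice a UMI is usually chosen several bases longer than this bound so that two different molecules rarely receive the same tag.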
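In some embodiments, the hook region may contain a cleavage site, which is a restriction enzyme recognition and binding site and/or one or more modified nucleotides that can be cleaved, so that the hook ligation product can be used for PCR-free library construction and/or the TSR sequence can be removed (see FIG. 3). Examples of modified nucleotide/enzyme combinations include, but are not limited to: (i) deoxyuridine and E. coli uracil DNA glycosylase (UDG) or A. fulgidus UDG (Afu UDG), together with one or more enzymes that can remove an AP site, such as human AP (apurinic/apyrimidinic) endonuclease (APE1), endonuclease III (Endo III), endonuclease IV (Endo IV), endonuclease VIII (Endo VIII), formamidopyrimidine [fapy]-DNA glycosylase (Fpg), human 8-oxoguanine glycosylase (hOGG1), or human endonuclease VIII-like 1 (hNEIL1); (ii) deoxyinosine and endonuclease V, or human 3-alkyladenine DNA glycosylase (hAAG) to generate an AP site together with one or more enzymes that can remove the AP site, such as APE1, Endo III, Endo IV, Endo VIII, Fpg, hOGG1, or hNEIL1; (iii) an oxidized pyrimidine nucleotide (for example 5,6-dihydroxythymine, thymine glycol, 5-hydroxy-5-methylhydantoin, uracil glycol, 6-hydroxy-5,6-dihydrothymine, or methyltartronylurea) and Endo VIII, Endo III, hNEIL1, or a combination thereof; (iv) an oxidized purine nucleotide (for example 8-oxoguanine, 8-hydroxyguanine, 8-oxoadenine, fapy-guanine, methyl-guanine, or fapy-adenine) and Fpg, hOGG1, hNEIL1, or a combination thereof; (v) an alkylated purine (for example 3-methyladenine, 7-methylguanine, 1,N6-ethenoadenine, or hypoxanthine) and hAAG to generate an AP site together with one or more enzymes that can remove the AP site, such as APE1, Endo III, Endo IV, Endo VIII, Fpg, hOGG1, or hNEIL1; and (vi) 5-hydroxyuracil, 5-hydroxymethyluracil, or 5-formyluracil and human single-strand-selective monofunctional uracil DNA glycosylase (hSMUG1) to generate an AP site together with one or more enzymes that can remove the AP site, such as APE1, Endo III, Endo IV, Endo VIII, Fpg, hOGG1, or hNEIL1.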
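1.3 End modifications of the hook probe
A hook probe contains a ligatable end that can be ligated to the single-stranded end of a nucleic acid fragment (for example, a target nucleic acid).
A 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment (for example, a target nucleic acid), and its 5' end carries a 5' blocking group (including but not limited to a 5' hydroxyl or a dideoxy mononucleotide) that prevents it from ligating to other single strands or to itself. A 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment (for example, a target nucleic acid), and its 3' end carries a 3' blocking group (including but not limited to a 3' phosphate, a 3' ring-opened sugar such as 3'-phospho-α,β-unsaturated aldehyde (PA), a 3' amino modification, a 3' dideoxynucleotide, a 3' phosphorothioate (PS) bond, or a 3' phosphate ester) that prevents it from ligating to other single strands or to itself.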
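2. Nucleic acid fragment
2.1 Target region and target nucleic acid
The nucleic acid fragments to be ligated in the invention include, in particular, target-region nucleic acid fragments. A target region is one or more contiguous nucleotide sequences and/or one or more nucleotide bases. It may be a variant site and/or gene locus associated with a biological function (such as an SNP, InDel, SV, CNV, gene fusion, or methylation site); a known DNA sequence and/or RNA sequence and/or synthetic nucleotide sequence; one or more genes, a gene set associated with a particular function, and/or a gene set of interest; or even a particular genome and/or transcriptome and/or class of RNA (such as 16S ribosomal RNA, ribozymes, antisense RNA, or guide RNA).
A nucleic acid fragment that contains a target region is called a target nucleic acid. A target nucleic acid may be double-stranded and/or single-stranded (such as dsDNA, cfDNA, ctDNA, ssDNA, DNA/RNA hybrids, RNA, mRNA, first-strand cDNA, second-strand cDNA, or cDNA).
2.2 Sample
A mixture of target nucleic acids and non-target-region fragments is called a sample. A sample containing target nucleic acid can be obtained from any suitable source. For example, the sample can be obtained or provided from any organism of interest. Such organisms include plants, animals (for example, mammals, including humans and non-human primates), and pathogens (such as bacteria and viruses). In some cases, the sample is obtained directly, or by extraction, from cells, tissues, secretions, and the like of the organisms of interest. As another example, the sample may be a microbiota or microbiome. Optionally, the sample is an environmental sample, such as a sample of water, air, or soil.
Samples from an organism or population of organisms of interest may include, but are not limited to, body fluid samples (including but not limited to blood, urine, serum, lymph, saliva, anal and vaginal secretions, sweat, and semen), cells, tissues, biopsy samples, experimental samples (for example, products of nucleic acid amplification reactions such as PCR), purified samples (such as purified genomic DNA or RNA), and raw samples (for example, bacteria, viruses, or genomic DNA). Methods for obtaining target polynucleotides (such as genomic DNA or total RNA) from organisms are well known in the art.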
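2.3 Fragmentation
Except for naturally fragmented samples (such as cfDNA, ctDNA, or a synthetic nucleic acid), the samples in most embodiments (such as genomic DNA) need to be fragmented to produce one or more fragments of a specific size or a population of fragments with a narrow length distribution. Any fragmentation method can be used: physical means (for example, ultrasonic cleavage, acoustic shearing, needle shearing, nebulization, or sonication), chemical methods (for example, heat and divalent metal cations), or enzymatic methods (for example, using an endonuclease, nickase, or transposase). Fragmentation methods are known in the art; see, for example, US 2012/0004126.
2.4 Size selection of nucleic acid fragments
Depending on the embodiment, the fragmented sample requires size selection of the target nucleic acid or nucleic acid fragments (for example, fragmented genomic DNA or RNA) to obtain fragments of a specific size or size range. Any size selection method can be used. For example, in some embodiments, the fragmented target nucleic acid can be separated by gel electrophoresis, and a gel slice corresponding to the specific fragment size or size range is extracted and purified from the gel. In some embodiments, purification columns can be used to select fragments above a specific minimum size. In some embodiments, paramagnetic beads can be used to selectively bind DNA fragments in the desired size range. In some embodiments, solid-phase reversible immobilization (SPRI) can be used to enrich fragments of a specific size or size range. In some embodiments, a combination of these size selection methods can be used.
The size range obtained by selection of fragmented nucleic acid is about 50 to about 3000 bases; within this range, the fragments may have a specific size, a specific average size, or a specific size range.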
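3. Ligase
Formation of the hook probe ligation product requires that the hook probe be used together with one or more ligases. The ligase used in the invention can, under suitable conditions and at suitable substrate concentrations, ligate polynucleotides with single-stranded ends intermolecularly.
In some embodiments, the ligase is a "single-stranded DNA/RNA ligase". As used herein, CircLigase can, under suitable reaction conditions, catalyze the formation of a covalent phosphodiester bond between two different nucleic acid strands. For example, the ligase catalyzes the synthesis of a phosphodiester bond between the 3'-hydroxyl of one polynucleotide and the 5'-phosphoryl of a second polynucleotide. In some cases, hybridization of the hook probe to the target nucleic acid produces a substrate for ligation. For example, hybridization of a 5' hook probe to the target nucleic acid can produce a 3' hydroxyl suitable for ligation to the 5' end of the target nucleic acid. Optionally, the 5' hook probe has a blocked 5' end that is unsuitable for ligation. Similarly, hybridization of a 3' hook probe to the target nucleic acid can produce a free 5' phosphate that can be ligated to the 3' end of the target nucleic acid. Optionally, the 3' hook probe has a blocked 3' end that is unsuitable for ligation.
In some embodiments, the ligase is a thermostable RNA ligase, including but not limited to TS2126 RNA ligase or an adenylated form of TS2126 RNA ligase; CIRCLIGASE™ ssDNA ligase or CIRCLIGASE II™ ssDNA ligase (see Epicentre Biotechnologies, Madison, Wisconsin; Lucks et al., 2011, Proc. Natl. Acad. Sci. USA 108:11063-11068; Li et al., 2006, Anal. Biochem. 349:242-246; Blondal et al., 2005, Nucleic Acids Res. 33:135-142); Methanobacterium thermoautotrophicum RNA ligase 1, or "MthRnl ligase" (see U.S. Patent No. 7,303,901, U.S. Patent No. 9,217,167, and International Publication No. WO2010/094040); T4 RNA ligase (for example, T4 RNA ligase I; Zhang et al., 1996, Nucleic Acids Res. 24:990-991; Tessier et al., 1986, Anal. Biochem. 158:171-178); and thermostable 5' AppDNA/RNA ligase.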
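Based on the hook probe of the invention, the invention also provides a kit comprising the hook probe described above and, optionally, at least one ligase for ligating the ligatable end of the hook probe to the end of the nucleic acid fragment to be ligated. The ligase may be any of the ligases described above.
Based on the hook probe of the invention, the invention also provides use of the hook probe described above in constructing a nucleic acid sequencing library, in particular a high-throughput sequencing library. It should be understood that the hook probe has many uses and is not limited to constructing nucleic acid sequencing libraries; this is only one of its main uses.
Based on the hook probe of the invention, the invention also provides a nucleic acid ligation method comprising:
annealing and hybridizing the hook probe described above to a denatured nucleic acid fragment to be ligated; and
in the presence of a ligase, ligating the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment.
It should be understood that the nucleic acid ligation method of the invention is a basic method whose application is not specifically limited; it can be used in any scenario that requires ligating the hook probe described above to a nucleic acid fragment. A typical but non-limiting application is in a method for constructing a nucleic acid sequencing library, as follows.
Based on the hook probe of the invention, the invention also provides a method for constructing a nucleic acid sequencing library, comprising:
a step of annealing and hybridizing the hook probe described above to a denatured nucleic acid fragment to be ligated; and
a step of ligating, in the presence of a ligase, the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment.
To eliminate the effect of unwanted nucleic acids on subsequent reactions, the method for constructing a nucleic acid sequencing library may further comprise:
a step of removing linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes; preferably, the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes are digested with a single-strand exonuclease.
For the product of ligation and/or the product after removal of unwanted nucleic acids, there are two alternative routes in the subsequent library construction: the PCR amplification route and the PCR-free route.
For the PCR amplification route, the method further comprises:
a step of PCR-amplifying the ligation product of the hook probe and the nucleic acid fragment with universal primers.
For the PCR-free route, the hook region may comprise a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved, and the method further comprises:
a step of restriction digestion at the restriction enzyme binding site with a restriction enzyme and/or cleavage of the one or more modified nucleotides with a cleaving enzyme.
Based on the basic principle and element designs above, the invention proposes a series of methods for constructing nucleic acid sequencing libraries that are suitable for different sequencing platforms. Each specific library construction scheme is described in detail below. In particular, in addition to the embodiments described below, other similar schemes and their variants also fall within the scope of the invention.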
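The two-sided hook probe hybridization scheme is shown in FIG. 2 and FIG. 3. The nucleic acid fragment (for example, a target nucleic acid) is captured by hybridization with a 5' hook probe and a 3' hook probe that partly match its 5' end and 3' end, and intermolecular ligation within the hybridization complex then forms a specific double-hook ligation product. The hook regions allow PCR amplification of the ligation product (FIG. 2) or PCR-free library construction (FIG. 3). When the GSP region is a specific sequence that hybridizes to the target nucleic acid, this scheme can be used for target sequence capture library construction, for rapid detection of unknown flanking sequences at both ends of a known sequence (for example rapid amplification of cDNA ends by PCR, RACE), or for artificial sequence assembly in synthetic biology. When the GSP region is a random sequence, this scheme can be used for PCR or PCR-free library construction from genomic DNA or RNA.
In one embodiment, the hook probes comprise a 5' hook probe and a 3' hook probe. The 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; and the hook regions of the 5' hook probe and the 3' hook probe each comprise a universal primer binding site. As shown in FIG. 2, the method for constructing a nucleic acid sequencing library in this embodiment comprises the following steps: (1) annealing and hybridizing the hook probes (for example, one or more pairs) to the denatured nucleic acid fragments to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) in the presence of a ligase, ligating the functional 3' hydroxyl group of the 5' hook probe to the 5' end of the nucleic acid fragment and the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) PCR-amplifying the ligation product of the hook probes and the nucleic acid fragment with universal primers that match the hook region sequences, where the universal primers are complementary to the universal primer binding sites of the 5' hook probe and the 3' hook probe, respectively, and may carry a sample barcode. The amplified product is the target nucleic acid molecule with sequencing adapter sequences at both ends and can be used for subsequent library construction and sequencing.
In another embodiment, the hook probes comprise a 5' hook probe and a 3' hook probe. The 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; and the hook regions of the 5' hook probe and the 3' hook probe each comprise a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved. As shown in FIG. 3, the method for constructing a nucleic acid sequencing library in this embodiment comprises the following steps: (1) annealing and hybridizing the hook probes (for example, one or more pairs) to the denatured nucleic acid fragments to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) in the presence of a ligase, ligating the functional 3' hydroxyl group of the 5' hook probe to the 5' end of the nucleic acid fragment and the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) either or both of the following: 4.1. when the hook region sequence carries a restriction endonuclease recognition sequence, denaturing the ligation product, hybridizing it to a sequence complementary to the recognition sequence to form the restriction endonuclease recognition site, and cleaving with the corresponding restriction endonuclease to remove the non-essential sequences of the hook probe (such as the GSP region and/or hook region sequences not needed for sequencing); 4.2. when the hook region sequence carries U bases, adding USER enzyme to cleave and remove the non-essential sequences of the hook probe (such as the GSP region and/or hook region sequences not needed for sequencing). The fragments remaining after cleavage can be used for subsequent library construction and sequencing.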
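The one-sided hook probe hybridization scheme is shown in FIG. 4 to FIG. 7. The nucleic acid fragment (for example, a target nucleic acid) is captured by hybridization with a 5' hook probe or a 3' hook probe that partly matches its 5' end or 3' end, and intermolecular ligation within the hybridization complex then forms a specific single-hook ligation product. Combined, before or after, with techniques such as primer extension and branch ligation, this allows PCR or PCR-free enrichment and detection of the nucleic acid fragment. Likewise, the one-sided hook probe hybridization scheme is suitable for target sequence capture library construction (especially for gene fusion and SV detection), whole-genome library construction, RNA-seq library construction, RACE, artificial sequence assembly, and so on.
In one embodiment, the hook probe comprises a 3' hook probe, and the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment. As shown in FIG. 4, the method for constructing a nucleic acid sequencing library in this embodiment comprises the following steps: (1) annealing and hybridizing the hook probes (one or more) to the denatured nucleic acid fragments to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) performing a primer extension reaction with a primer sequence fully or partly complementary to the hook region sequence of the 3' hook probe, for the PCR library construction scheme; (5) ligating a 5' adapter (for example, a 5' blunt-end adapter or a T-A ligation adapter) to the 5' end of the extension product; (6) performing PCR amplification with primer sequences fully or partly complementary to the 5' adapter and to the hook region sequence of the 3' hook probe; the amplified product is used for subsequent library construction and sequencing.
In another embodiment, the hook probe comprises a 3' hook probe; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; and the hook region of the 3' hook probe comprises a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved. As shown in FIG. 4, the method for constructing a nucleic acid sequencing library in this embodiment comprises the following steps: (1) annealing and hybridizing the hook probes (one or more) to the denatured nucleic acid fragments to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) implementing the PCR-free scheme by either or both of the following: 4.1. when the hook region sequence carries a restriction endonuclease recognition sequence, denaturing the ligation product, hybridizing it to a sequence complementary to the recognition sequence to form the restriction endonuclease recognition site, and cleaving with the corresponding restriction endonuclease to remove the non-essential sequences of the hook probe (such as the GSP region and/or hook region sequences not needed for sequencing); 4.2. when the hook region sequence carries U bases, adding USER enzyme to cleave and remove the non-essential sequences of the hook probe (such as the GSP region and/or hook region sequences not needed for sequencing). The fragments remaining after cleavage can be used for subsequent library construction and sequencing.
In another embodiment, the hook probe comprises a 3' hook probe; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; and the nucleic acid fragment comprises a target region sequence. As shown in FIG. 5, the method for constructing a nucleic acid sequencing library in this embodiment comprises the following steps: (1) annealing and hybridizing the hook probes (one or more) to the denatured nucleic acid fragments to be ligated (for example, fragmented DNA, plasma cell-free DNA, or reverse-transcribed cDNA); (2) in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment; (3) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (4) performing a primer extension reaction with a primer sequence fully or partly complementary to the target region sequence or its neighboring region; (5) ligating a 5' adapter (for example, a 5' blunt-end adapter or a T-A ligation adapter) to the 5' end of the extension product; (6) performing PCR amplification with primer sequences fully or partly complementary to the 5' adapter and to the hook region sequence of the 3' hook probe; the amplified product is used for subsequent library construction and sequencing. PCR-free library construction can also be carried out by a method similar to that of FIG. 4.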
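In another embodiment, the hook probe comprises a 5' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment; and the hook region of the 5' hook probe comprises a universal primer binding site. As shown in FIG. 6, the method for constructing a nucleic acid sequencing library in this embodiment comprises the following steps: (1) end-repairing and dephosphorylating the fragmented nucleic acid fragments to be ligated (for example, fragmented DNA or plasma cell-free DNA); (2) ligating a 3' adapter (for example, a blunt-end adapter) with a universal primer binding site to the 3' end of the end-repaired and dephosphorylated nucleic acid fragments; (3) annealing and hybridizing the hook probes (for example, one or more) to the denatured nucleic acid fragments carrying the 3' adapter; (4) after phosphorylation, in the presence of a ligase, ligating the functional 3' hydroxyl group of the 5' hook probe to the 5' end of the nucleic acid fragment; (5) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (6) PCR-amplifying the ligation product of the hook probe and the nucleic acid fragment with universal primers, where the universal primers are complementary to the universal primer binding sites of the 5' hook probe and the 3' adapter, respectively.
In another embodiment, the hook probe comprises a 5' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment; and the hook region of the 5' hook probe comprises a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved. As shown in FIG. 6, the method for constructing a nucleic acid sequencing library in this embodiment comprises the following steps: (1) end-repairing and dephosphorylating the fragmented nucleic acid fragments to be ligated (for example, fragmented DNA or plasma cell-free DNA); (2) ligating a 3' adapter (for example, a blunt-end adapter) with a universal primer binding site to the 3' end of the end-repaired and dephosphorylated nucleic acid fragments; (3) annealing and hybridizing the hook probes (for example, one or more) to the denatured nucleic acid fragments carrying the 3' adapter; (4) after phosphorylation, in the presence of a ligase, ligating the functional 3' hydroxyl group of the 5' hook probe to the 5' end of the nucleic acid fragment; (5) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (6) performing restriction digestion at the restriction enzyme binding site with a restriction enzyme and/or cleaving the one or more modified nucleotides with a cleaving enzyme, giving a PCR-free library construction scheme.
In another embodiment, the hook probe comprises a 3' hook probe, and the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment. As shown in FIG. 7, the method for constructing a nucleic acid sequencing library in this embodiment comprises the following steps: (1) annealing and hybridizing the hook probes (for example, one or more) to the denatured nucleic acid fragments to be ligated (for example, fragmented DNA or plasma cell-free DNA); (2) in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment; (3) in the presence of a polymerase, performing an extension reaction starting from the 3' end of the 3' hook probe; (4) digesting linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; (5) ligating a 5' adapter (a 5' blunt-end adapter or a T-A ligation adapter) to the 5' end of the extension product; (6) performing PCR amplification with primer sequences fully or partly complementary to the 5' adapter and to the hook region sequence of the 3' hook probe; the amplified product is used for subsequent library construction and sequencing.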
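As an improved, rapid target-region enrichment method, the method of the invention for constructing a nucleic acid sequencing library addresses the cumbersome workflow, long duration, and high cost of existing solution-phase probe hybridization capture. The technique is suitable for many sample types, including but not limited to whole-genome DNA, cfDNA, ctDNA, FFPE DNA, RNA, and mRNA, and can detect many types of genetic variation, such as SNPs (single nucleotide polymorphisms), InDels (insertions and deletions), CNVs (copy number variations), SVs (structural variations), and gene fusions. Variants of the invention can also be used for rapid, direct library construction from whole-genome DNA, cfDNA, ctDNA, FFPE DNA, and the like.
Through the design of the hook probe and its cooperation with a ligase, the invention rapidly captures the nucleic acid fragment (for example, a target sequence) by hybridization while adding a known tool sequence; by designing and using this known tool sequence in subsequent reactions, different applications can be achieved. In particular, the invention applies to a wide range of sample types, detection types, and application fields (not only high-throughput library construction but also molecular cloning, synthetic biology, and other fields). The invention can be expected to greatly simplify the workflow, shorten the time, reduce cost, and overcome limits on applicable sample types; it will benefit many research applications and kit products and has broad market potential.
The technical solutions and effects of the invention are described in detail below through test examples and examples. It should be understood that the test examples and examples are illustrative only and are not to be construed as limiting the scope of protection of the invention.
Test Examples 1 and 2 below demonstrate the basic principle of the invention. That is, under suitable reaction conditions, a nucleic acid fragment, for example a target-region single-stranded nucleic acid fragment (with a phosphate group at its 5' end or a hydroxyl group at its 3' end), and a target-specific hook probe whose sequence is partly complementary to that fragment (with a phosphate group at its 5' end or a hydroxyl group at its 3' end) form a hybridization complex. When catalyzed by certain ligases, the proportion of product from intermolecular single-strand ligation at the non-complementary regions of the complex (the 5' end of the target-region single strand and the 3' end of the hook probe) is higher than the proportion of product from intramolecular single-strand circularization of the target-region fragment. Moreover, single-stranded fragments that share no complementary region form intermolecular single-strand products only at a very low rate.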
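Test Example 1
This test example examined what happens when the target-region single-stranded nucleic acid fragment (YJ-439) and the target-specific 5' hook probe (YJ-765) (FIG. 10) are incubated with CircLigase I (Epicentre) at different reaction temperatures.
FIG. 8 shows a 10% denaturing polyacrylamide gel (U-PAGE) of the products. The lanes are as follows:
Lane 1: target-region single-stranded nucleic acid fragment YJ-439 (synthesized by IDT, 90 nt), with the sequence:
P-CTCATGCCCTTCGGCTGCCTCCTGGACTATGTCCGGGAACACAAAGACaatattggctcccagtacctgctcaactggtgtgtgcagatc (SEQ ID NO: 1).
Lane 2: target-specific 5' hook probe YJ-765 (synthesized by IDT, 59 nt, including 20 nt complementary to YJ-439), with the sequence:
CAGGAGGCAGCCGAAGGGCAGAACGACATGGCTACGATCCGACTTNNNNNNCATTTCAT (SEQ ID NO: 2).
Lane 3: product of YJ-439 reacted with CircLigase I at its optimal temperature of 55°C and then treated with exonucleases I and III.
Lane 4: product of YJ-439 reacted with CircLigase I at its optimal temperature of 55°C.
Lanes 5-9: products of YJ-439/YJ-765 reacted together with CircLigase I at different temperatures (25°C, 37°C, 45°C, 55°C, and 60°C).
Lane 3 (with exonuclease I and III treatment) and lane 4 (without exonuclease I and III treatment) show that YJ-439 circularizes on itself into a single-stranded circle with CircLigase I at the optimal temperature of 55°C (at about 150 nt, marked with a triangle). When incubated together with the hook probe (YJ-765) at the different temperatures shown in lanes 5-9 (25°C, 37°C, 45°C, 55°C, and 60°C), most of the YJ-439 forms a product ligated to the hook probe (149 nt, marked with an arrow) rather than a single-stranded circle, because these products can be degraded by exonucleases I and III.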
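The sequence relationships stated for Test Example 1 can be checked directly from SEQ ID NO: 1 and SEQ ID NO: 2. The sketch below is only a verification aid; the helper and variable names are hypothetical. It confirms that the first 20 nt of YJ-765 are the reverse complement of a stretch of YJ-439 and that the ligated product is 59 + 90 = 149 nt.

```python
# Sequences as given in SEQ ID NO: 1 and SEQ ID NO: 2 (modifications omitted).
YJ_439 = ("CTCATGCCCTTCGGCTGCCTCCTGGACTATGTCCGGGAACACAAAGAC"
          "aatattggctcccagtacctgctcaactggtgtgtgcagatc").upper()
YJ_765 = "CAGGAGGCAGCCGAAGGGCAGAACGACATGGCTACGATCCGACTTNNNNNNCATTTCAT"


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence; N stays N."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


tsr = YJ_765[:20]                        # target specific region of the probe
binding_site = reverse_complement(tsr)   # what the TSR pairs with on YJ-439
offset = YJ_439.find(binding_site)       # 0-based start of the paired stretch

# The probe's 3' end joins the fragment's 5' phosphate: product = probe + fragment.
product_length = len(YJ_765) + len(YJ_439)

print(binding_site, offset, product_length)
```

The paired stretch starts 4 nt in from the 5' end of YJ-439, so a short unpaired 5' tail of the fragment sits next to the unpaired hook region of the probe.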
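Test Example 2
This test example examined what happens when the target-region single-stranded nucleic acid fragment (YJ-439), the non-target-specific 5' hook probe (YJ-890), and the non-target-specific 3' hook probe (YJ-891) (FIG. 10) are incubated with CircLigase I (Epicentre) at different reaction temperatures.
FIG. 9 shows a 10% U-PAGE gel of the products. The lanes are as follows:
Lane 1: target-region single-stranded nucleic acid fragment YJ-439 (synthesized by IDT, 90 nt).
Lane 2: product of YJ-439 reacted with CircLigase I at its optimal temperature of 55°C and then treated with exonucleases I and III.
Lane 3: non-target-specific 5' hook probe YJ-890 (synthesized by IDT, 46 nt), whose sequence is that of YJ-765 with the 20-nt region complementary to YJ-439 replaced by a random base sequence:
NNNNNNNNNNNNNNNGAACGACATGGCTACGATCCGACTTNNNNNN (SEQ ID NO: 3).
Lane 4: non-target-specific 3' hook probe YJ-891 (synthesized by IDT, 40 nt), with the sequence:
P-NNNNNNNNNNNNNNNGAACGACATGGCTACGATCCGACTTNNNNNN-P (SEQ ID NO: 4).
Lane 5: product of YJ-890/YJ-891 reacted with CircLigase I at its optimal temperature of 55°C.
Lanes 6-10: products of YJ-439/YJ-890/YJ-891 reacted together with CircLigase I at different temperatures (25°C, 37°C, 45°C, 55°C, and 60°C).
Lane 2 (with exonuclease I and III treatment) shows that YJ-439 circularizes on itself into a single-stranded circle with CircLigase I at the optimal temperature of 55°C (at about 150 nt, marked with a triangle). Lane 5 shows that the non-target-specific hook probes YJ-890 and YJ-891 form an intermolecular ligation product with CircLigase I at the optimal temperature of 55°C (86 nt, marked with the long arrow in lanes 5-10). When YJ-439 is incubated with the non-target-specific hook probes (YJ-890/YJ-891) at the different temperatures shown in lanes 6-10 (25°C, 37°C, 45°C, 55°C, and 60°C), most of the ligation products are single-stranded circles that resist exonuclease treatment (data not shown; at about 150 nt, marked with a triangle) and intermolecular ligation products formed between the 5' hook probe and the 3' hook probe that can be degraded by exonucleases I and III (data not shown; 86 nt, marked with the long arrow in lanes 5-10), whereas random intermolecular ligation products between the template (YJ-439) and the non-target-specific hook probes (YJ-890 and YJ-891) (136 nt and/or 130 nt) are almost undetectable. Comparison with the results in FIG. 8 therefore suggests that the probability of intermolecular ligation between template and probe rises markedly only after the template (YJ-439) and the target-specific hook probe (YJ-765) hybridize over part of their sequences, bringing the single-stranded 5' end of the template and the single-stranded 3' end of the target-specific probe close together.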
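Example 1: Two-sided hook probe hybridization test
1. Sample collection and processing
Human NA12878 (GM12878, Coriell Institute) genomic DNA was fragmented with Fragmentase, and DNA fragments of 200-400 bp were then selected by double-sided size selection.
2. Denaturation and hybridization
Preparation of the hybridization reaction: for each reaction, 20 ng of enzymatically fragmented human NA12878 (GM12878, Coriell Institute) genomic DNA was mixed with 2 μL of 3' hook probe at one of several concentrations (0.1 μM, 0.01 μM, 0.005 μM); 1 μL of reaction buffer was added, and water was added to 10 μL. After vortex mixing, the reaction was placed in a thermal cycler with the following program: 95°C for 5 minutes, cooling to 42°C at 0.1°C per second, 42°C for 1 hour, hold at 42°C. The 3' hook probe was designed as follows:
/5phos/ATGCTGACGGTCAAGTGGTCTTAGGNNNNNNNNNNNNNNNNN/3AmMO/ (SEQ ID NO: 5; the underlined part is the reverse complement of part of the adapter, and the part not underlined is the hybridization-complementary sequence designed for the different ROI regions).
Some of the probe sequences are shown in Table 1.
Table 1 (provided as an image in the original publication: Figure PCTCN2017110252-appb-000001)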
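3. 3' end ligation reaction
Prepare the mix according to Table 2:
Table 2
Component                                      Amount
10× reaction buffer (Epicentre)                1 μL
0.025 mM ATP (Epicentre)                       0.5 μL
50 mM MnCl2 (Epicentre)                        1 μL
100% DMSO (Sigma)                              2 μL
CircLigase ligase (100 U/μL) (Epicentre)       0.2 μL
Nuclease-free water                            5.3 μL
Total volume                                   10 μL
Add the mix to the hybridization reaction, mix with a pipette, and incubate at 42°C for 1 hour. Note: the mix is at room temperature when it is added, and the hybridization mixture remains in the thermal cycler.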
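When several reactions are set up in parallel, the per-reaction volumes of Table 2 are usually scaled into one master mix. The following sketch shows that arithmetic; the 10% overage and the function name are assumptions for illustration and are not part of the protocol.

```python
import math

# Per-reaction volumes from Table 2, in microliters.
TABLE_2_MIX_UL = {
    "10x reaction buffer": 1.0,
    "0.025 mM ATP": 0.5,
    "50 mM MnCl2": 1.0,
    "100% DMSO": 2.0,
    "CircLigase (100 U/uL)": 0.2,
    "nuclease-free water": 5.3,
}


def scale_mix(per_reaction_ul, reactions, overage=0.10):
    """Scale per-reaction volumes to a master mix with a pipetting overage."""
    if reactions < 1:
        raise ValueError("reactions must be at least 1")
    factor = reactions * (1.0 + overage)
    return {name: round(vol * factor, 3) for name, vol in per_reaction_ul.items()}


per_reaction_total = sum(TABLE_2_MIX_UL.values())
master_mix = scale_mix(TABLE_2_MIX_UL, reactions=8)

print(f"per-reaction total: {per_reaction_total:.1f} uL")
for name, vol in master_mix.items():
    print(f"{name}: {vol} uL")
```

The per-reaction components sum to the 10 μL total stated in Table 2, which is a quick consistency check on the table itself.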
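4. Exonuclease digestion
Add 1 μL of exonuclease I (Exo I, NEB) to each reaction, mix with a pipette, and incubate at 37°C for 30 minutes and then at 80°C for 20 minutes.
5. 5' end ligation reaction
5' end hybridization: add to each reaction 1 μL of the corresponding 5' hook probe at one of several concentrations (0.1 μM, 0.01 μM, 0.005 μM, 0.002 μM), mix with a pipette, and run the following program: 95°C for 5 minutes, cooling to 42°C at 0.1°C per second, 42°C for 30 minutes, hold at 42°C. The 5' hook probe was designed as follows:
NNNNNNNNNNNNNNNNNGAACGACATGGCTACGATCCGACTTNNNNNNT (SEQ ID NO: 10; the underlined part is part of the adapter sequence, and the part not underlined is the hybridization-complementary sequence designed for the different ROI regions).
Some of the probe sequences are shown in Table 3.
Table 3 (provided as an image in the original publication: Figure PCTCN2017110252-appb-000002)
5' end ligation reaction:
Prepare the mix according to Table 4:
Table 4
Component                                      Amount
10× reaction buffer (Epicentre)                0.1 μL
0.025 mM ATP (Epicentre)                       0.5 μL
CircLigase ligase (100 U/μL) (Epicentre)       0.2 μL
Nuclease-free water                            0.2 μL
Total volume                                   1 μL
Add the mix to the hybridization reaction, mix with a pipette, and incubate at 42°C for 1 hour and then at 80°C for 20 minutes. Note: the mix is at room temperature when it is added, and the hybridization mixture remains in the thermal cycler.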
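6. Magnetic bead purification
a) Add 39.6 μL of Ampure XP beads to the reaction sample above (22 μL) and mix the beads and sample by pipetting up and down 7-10 times. Bind at room temperature for 5 minutes, pipette-mix another 7-10 times, bind at room temperature for another 5 minutes, then place on a magnetic rack for 2 minutes (until the liquid is clear) and carefully remove and discard the supernatant.
b) On the magnetic rack, add 180 μL of 70% ethanol to the 8-tube strip, cap tightly, invert 5 times to mix, and discard the supernatant. Repeat the wash once with 320 μL of 70% ethanol, remove as much residual ethanol as possible with a small-volume pipette, and air-dry at room temperature.
c) Resuspend the beads in 20 μL of TE, pipette-mix 7-10 times, incubate at room temperature for 5 minutes, pipette-mix another 7-10 times, and incubate at room temperature for another 5 minutes. Place on the magnetic rack for 2 minutes (until the liquid is clear), then carefully transfer 20 μL of supernatant to a new 0.2 mL PCR tube for the next reaction or for storage at -20°C.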
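The bead volume in step a) corresponds to a fixed bead-to-sample ratio: 39.6 μL of beads for 22 μL of sample is 1.8×. A small helper makes it easy to recompute the bead volume if the sample volume changes. The function name is hypothetical, and the ratio is inferred from the two volumes in step a), not stated in the text.

```python
import math


def bead_volume_ul(sample_ul: float, ratio: float = 1.8) -> float:
    """Bead volume for a given sample volume at a bead-to-sample ratio."""
    if sample_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_ul * ratio


inferred_ratio = 39.6 / 22.0  # volumes from step a)
print(f"inferred ratio: {inferred_ratio:.2f}x")
print(f"beads for 22 uL sample: {bead_volume_ul(22.0):.1f} uL")
```

Lower ratios retain only longer fragments and higher ratios retain shorter ones as well, so the ratio should be kept constant when volumes are adjusted.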
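7. PCR amplification
Prepare the reaction mixture shown in Table 5, using 10 μL of the product above for the PCR reaction:
Table 5 (provided as an image in the original publication: Figure PCTCN2017110252-appb-000003)
Primer 1 sequence:
/5Phos/CACAGAACGACATGGCTACGATCCGACT (SEQ ID NO: 15);
Primer 2 sequence:
TGTGAGCCAAGGAGTTGACTTTACTTGTCTTCCTAAGACCACTTGACCGTCAGCAT (SEQ ID NO: 16).
The reaction program is shown in Table 6:
Table 6 (provided as an image in the original publication: Figure PCTCN2017110252-appb-000004)
8. Polyacrylamide gel electrophoresis
6 μL of PCR product was run on a polyacrylamide gel at 240 V for 20 minutes. The results are shown in FIG. 11:
Lane 1: only the 3' hook probe and DNA, so there is no corresponding product after PCR amplification. Lane 2: only the 5' hook probe and DNA, so there is no corresponding product after PCR amplification. Lanes 1 and 2 show that the hook probes and the PCR primers have good specificity.
Lane 3: contains the 3' hook probe, the 5' hook probe, and DNA, so the corresponding product appears after PCR amplification. However, because the initial input is relatively small, the target product should be obtained only with a relatively large number of PCR cycles. The number of PCR cycles in this reaction is below the theoretical value, so a brighter band on the gel indicates a relatively high off-target rate. In addition, an unknown band of 400 bp appears in the PCR product.
Lanes 4-6: the 3' hook probe concentration is held constant and the 5' hook probe concentration is reduced. As the 5' hook probe concentration decreases, the total amount of PCR product decreases, which indirectly indicates that the true target product gradually emerges. Below a certain concentration the 400 bp band disappears, so further adjustment of the hook probe concentration was tried subsequently.
Lanes 7-8: the 5' hook probe concentration is held constant and the 3' hook probe concentration is reduced. Under these conditions the 400 bp band also disappears, but the total amount of non-specific PCR product also tends to decrease. Subsequent work therefore tried adjusting the hook probe concentration to improve capture efficiency.
Lane 9: DNA template only.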
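Example 2: Enrichment of target regions by two-sided hook probe hybridization
1. Sample collection and processing
Human NA12878 (GM12878, Coriell Institute) genomic DNA was fragmented with Fragmentase, and DNA fragments of 200-400 bp were then selected by double-sided size selection.
2. Denaturation and hybridization
Preparation of the hybridization reaction: 10 ng of enzymatically fragmented human NA12878 (GM12878, Coriell Institute) genomic DNA was mixed with 1 μL of a 0.1 μM mixture of 5' and 3' hook probes; 1 μL of reaction buffer was added, and water was added to 10 μL. After vortex mixing, the reaction was placed in a thermal cycler with the following program: 95°C for 5 minutes, cooling to 42°C at 0.1°C per second, 42°C for 1 hour, hold at 42°C.
The 5' hook probe was designed as follows:
NNNNNNNNNNNNNNNNNGAACGACATGGCTACGATCCGACTTNNNNNNT (SEQ ID NO: 10; the underlined part is part of the adapter sequence, and the part not underlined is the hybridization-complementary sequence designed for the different ROI regions).
The 3' hook probe was designed as follows:
/5phos/ATGCTGACGGTCAAGTGGTCTTAGGNNNNNNNNNNNNNNNNN/3AmMO/ (SEQ ID NO: 5; the underlined part is the reverse complement of part of the adapter, and the part not underlined is the hybridization-complementary sequence designed for the different ROI regions).
Some of the probe sequences are shown in Table 7:
Table 7 (provided as images in the original publication: Figure PCTCN2017110252-appb-000005 and Figure PCTCN2017110252-appb-000006)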
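3. Two-sided hook probe ligation reaction
Prepare the mix according to Table 8:
Table 8
Component                                      Amount
10× reaction buffer (Epicentre)                1 μL
0.025 mM ATP (Epicentre)                       0.5 μL
50 mM MnCl2 (Epicentre)                        1 μL
100% DMSO (Sigma)                              2 μL
CircLigase ligase (100 U/μL) (Epicentre)       0.2 μL
Nuclease-free water                            5.3 μL
Total volume                                   10 μL
Add the mix to the hybridization reaction, mix with a pipette, and incubate at 42°C for 1 hour. Note: the mix is at room temperature when it is added, and the hybridization mixture remains in the thermal cycler.
4. Exonuclease digestion
Add 1 μL of exonuclease I (Exo I, NEB) to each reaction, mix with a pipette, and incubate at 37°C for 30 minutes and then at 80°C for 20 minutes.
5. Magnetic bead purification
a) Add 37.8 μL of Ampure XP beads to the reaction sample above (21 μL) and mix the beads and sample by pipetting up and down 7-10 times. Bind at room temperature for 5 minutes, pipette-mix another 7-10 times, bind at room temperature for another 5 minutes, then place on a magnetic rack for 2 minutes (until the liquid is clear) and carefully remove and discard the supernatant.
b) On the magnetic rack, add 180 μL of 70% ethanol to the 8-tube strip, cap tightly, invert 5 times to mix, and discard the supernatant. Repeat the wash once with 320 μL of 70% ethanol, remove as much residual ethanol as possible with a small-volume pipette, and air-dry at room temperature.
c) Resuspend the beads in 20 μL of TE, pipette-mix 7-10 times, incubate at room temperature for 5 minutes, pipette-mix another 7-10 times, and incubate at room temperature for another 5 minutes. Place on the magnetic rack for 2 minutes (until the liquid is clear), then carefully transfer 20 μL of supernatant to a new 0.2 mL PCR tube for the next reaction or for storage at -20°C.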
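6. PCR amplification
Prepare the reaction mixture shown in Table 9, using 10 μL of the product above for the PCR reaction:
Table 9
Component                                      Amount
Ligation product                               10 μL
2× KAPA HiFi HotStart ReadyMix                 25 μL
20 μM primer 1                                 1 μL
20 μM primer 2                                 1 μL
Nuclease-free water                            13 μL
Total volume                                   50 μL
Primer 1 sequence:
/5Phos/CACAGAACGACATGGCTACGATCCGACT (SEQ ID NO: 15);
Primer 2 sequence:
TGTGAGCCAAGGAGTTGACTTTACTTGTCTTCCTAAGACCACTTGACCGTCAGCAT (SEQ ID NO: 16).
The reaction program is shown in Table 10:
Table 10 (provided as an image in the original publication: Figure PCTCN2017110252-appb-000007)
7. Polyacrylamide gel electrophoresis
6 μL of PCR product was run on a polyacrylamide gel at 240 V for 20 minutes. FIG. 12 shows the polyacrylamide gel electrophoresis results for some of the products in this example; lane 14 is the library product obtained under the library construction conditions of this example.
8. PE50 sequencing results and analysis
FIG. 13 shows the distribution of sequencing reads on chromosome 10. A and B are the WGS control library and the negative control library, respectively, in which no target-region enrichment is seen; C is the two-sided hook probe hybridization library, in which two ROIs are clearly enriched.
FIG. 14 shows how the capture reads and the on-target reads cover the ROI region. The capture reads clearly cover the ROI region, and the target region on this chromosome is clearly enriched. The red line (curve 1 in FIG. 14) is the capture reads, the blue line (curve 2 in FIG. 14) is the on-target reads, the dark gray area (indicated by A in FIG. 14) is ROI ± 20 bp, the light gray area (indicated by B in FIG. 14) is ROI ± 100 bp, and the yellow line (indicated by C in FIG. 14) shows the length after the two paired-end reads are joined.
These results show that, under the conditions of this example, the on-target rate and capture rate of the PE50 sequencing data are good.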
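The specification does not define how the on-target rate and capture rate are computed. As a sketch under stated assumptions, the code below treats a read as on-target if it overlaps an ROI extended by 20 bp on each side, and as captured if it overlaps an ROI extended by 100 bp, mirroring the two windows drawn in FIG. 14. This mapping, the function names, and the example coordinates are all hypothetical.

```python
def overlaps(read, roi, pad):
    """True if a read (start, end) overlaps an ROI (start, end) padded by `pad` bp.

    Coordinates are closed intervals on the same chromosome.
    """
    read_start, read_end = read
    roi_start, roi_end = roi
    return read_start <= roi_end + pad and read_end >= roi_start - pad


def fraction_overlapping(reads, rois, pad):
    """Fraction of reads overlapping at least one padded ROI."""
    if not reads:
        return 0.0
    hits = sum(any(overlaps(read, roi, pad) for roi in rois) for read in reads)
    return hits / len(reads)


# Made-up coordinates for illustration only.
rois = [(1000, 1100)]
reads = [(990, 1039), (1150, 1199), (5000, 5049), (1110, 1159)]

on_target_rate = fraction_overlapping(reads, rois, pad=20)   # ROI +/- 20 bp
capture_rate = fraction_overlapping(reads, rois, pad=100)    # ROI +/- 100 bp
print(on_target_rate, capture_rate)
```

With real data the read intervals would come from an alignment file and the ROIs from the probe design, one chromosome at a time.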
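Example 3: 3' hook probe hybridization test
1. Sample collection and processing
Human NA12878 (GM12878, Coriell Institute) genomic DNA was fragmented with Fragmentase, and DNA fragments of 200-400 bp were then selected by double-sided size selection.
2. Dephosphorylation
10 ng of enzymatically fragmented human NA12878 (GM12878, Coriell Institute) genomic DNA was combined with an appropriate amount of rSAP (NEB) and 1 μL of 10× reaction buffer (NEB), and water was added to 10 μL. After vortex mixing, the reaction was placed in a thermal cycler with the following program: 37°C for 30 minutes, 65°C for 15 minutes, hold at 4°C.
3. Denaturation and hybridization
Preparation of the hybridization reaction: the dephosphorylation reaction above was mixed with 1 μL of 0.2 μM 3' hook probe, giving a total volume of 11 μL. After vortex mixing, the reaction was placed in a thermal cycler with the following program: 95°C for 5 minutes, cooling to 42°C at 0.1°C per second, 42°C for 1 hour, hold at 42°C. The 3' hook probe was designed as follows:
/5phos/ATGCTGACGGTCAAGTGGTCTTAGGNNNNNNNNNNNNNNNNN/3AmMO/ (SEQ ID NO: 5; the underlined part is the reverse complement of part of the adapter, and the part not underlined is the hybridization-complementary sequence designed for the different ROI regions).
Some of the probe sequences are shown in Table 11:
Table 11 (provided as an image in the original publication: Figure PCTCN2017110252-appb-000008)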
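4. 3' end ligation reaction
Prepare the mix according to Table 12:
Table 12
Component                                      Amount
10× reaction buffer (Epicentre)                2 μL
0.025 mM ATP (Epicentre)                       0.5 μL
50 mM MnCl2 (Epicentre)                        1 μL
100% DMSO (Sigma)                              1 μL
CircLigase ligase (100 U/μL) (Epicentre)       0.2 μL
Nuclease-free water                            4.3 μL
Total volume                                   9 μL
Add the mix to the hybridization reaction, mix with a pipette, and incubate at 42°C for 1 hour. Note: the mix is at room temperature when it is added, and the hybridization mixture remains in the thermal cycler.
5. Exonuclease digestion
Add 1 μL of exonuclease I (Exo I, NEB) to each reaction and mix with a pipette. Incubate at 37°C for 30 minutes and then at 80°C for 20 minutes.
6. Primer extension reaction
Add an appropriate amount of PE primers carrying different barcode sequences, and prepare the mix according to Table 13:
Table 13
Component                                      Amount
2× reaction buffer (Agilent)                   25 μL
PfuTurbo Cx Hotstart DNA polymerase (Agilent)  0.5 μL
PE primer (25 μM)                              2 μL
Nuclease-free water                            1.5 μL
Total volume                                   29 μL
Add the mix to the 21 μL exonuclease digestion reaction and mix with a pipette. Incubate at 98°C for 3 minutes and 60°C for 30 minutes, then hold at 4°C.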
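7. Adapter ligation reaction
Prepare the mix according to Table 14:
Table 14
Component                                      Amount
ATP (NEB)                                      0.8 μL
Ligase (NEB)                                   2 μL
10× reaction buffer (NEB)                      8 μL
50% PEG 8000                                   12 μL
Adapter (10 μM)                                1 μL
Nuclease-free water                            6.2 μL
Total volume                                   30 μL
The adapter short strand is: GCTACGATCCGACT/ddT/ (SEQ ID NO: 17);
the adapter long strand is: /Phos/AAGTCGGATCGTAGCCATGTCGTT/ddC/ (SEQ ID NO: 18).
Add the mix to the 50 μL primer extension reaction and mix with a pipette. Incubate at 37°C for 30 minutes, then hold at 4°C.
8. Magnetic bead purification
a) Add 40 μL of Ampure XP beads to the reaction sample above (80 μL) and mix the beads and sample by pipetting up and down 7-10 times. Bind at room temperature for 5 minutes, pipette-mix another 7-10 times, bind at room temperature for another 5 minutes, then place on a magnetic rack for 2 minutes (until the liquid is clear) and carefully remove and discard the supernatant.
b) On the magnetic rack, add 180 μL of 70% ethanol to the 8-tube strip, cap tightly, invert 5 times to mix, and discard the supernatant. Repeat the wash once with 320 μL of 70% ethanol, remove as much residual ethanol as possible with a small-volume pipette, and air-dry at room temperature.
c) Resuspend the beads in 20 μL of TE, pipette-mix 7-10 times, incubate at room temperature for 5 minutes, pipette-mix another 7-10 times, and incubate at room temperature for another 5 minutes. Place on the magnetic rack for 2 minutes (until the liquid is clear), then carefully transfer 20 μL of supernatant to a new 0.2 mL PCR tube for the next reaction or for storage at -20°C.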
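9. PCR amplification
Prepare the reaction mixture shown in Table 15, using 20 μL of the purified product above for the PCR reaction:
Table 15
Component                                      Amount
Ligation product                               20 μL
2× KAPA HiFi HotStart ReadyMix                 25 μL
20 μM primer 3                                 2 μL
20 μM primer 4                                 2 μL
Nuclease-free water                            1 μL
Total volume                                   50 μL
Primer 3: /5Phos/GAACGACATGGCTACGA (SEQ ID NO: 19);
Primer 4: TGTGAGCCAAGGAGTTG (SEQ ID NO: 20).
The reaction program is shown in Table 16:
Table 16 (provided as an image in the original publication: Figure PCTCN2017110252-appb-000009)
10. Polyacrylamide gel electrophoresis
6 μL of PCR product was run on a polyacrylamide gel at 240 V for 20 minutes. FIG. 15 shows the polyacrylamide gel electrophoresis results for some of the products in this example; lane 12 is the library product obtained under the library construction conditions of this example.
In addition, PE50 sequencing and analysis of the library of this example showed that the on-target rate and capture rate of the PE50 sequencing data are good.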
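The invention has been described above with specific examples, which serve only to aid understanding and do not limit the invention. Those skilled in the art to which the invention belongs can make a number of simple deductions, variations, or substitutions based on the idea of the invention. Typical but non-limiting examples include the following. The hook probe can be designed as an RNA probe as well as a DNA probe. The hybridization complex for intermolecular ligation can be one or more of DNA/DNA, DNA/RNA, and RNA/RNA hybridization complexes. Potential intermolecular non-specific linear ligation products can be removed not only by single-strand exonuclease treatment but also by other possible means such as gel excision and recovery or magnetic bead selection. In the one-sided hook probe scheme, the adapter ligation method for the end not ligated to the hook probe may have alternatives other than those described here. The enrichment of target nucleic acid can be improved by changing reaction conditions such as the hybridization system, hybridization components, hybridization reagents, and hybridization temperature, or by two or more rounds of nested enrichment. As stated in the text, the invention is not limited to high-throughput library construction technology and kit development. It can also be applied to molecular cloning experiments such as RACE, to synthetic biology experiments such as assembly of synthetic sequences, and to the corresponding kit development; to biochemical experiments that rely on chemiluminescent readouts, such as genotyping and fluorescent quantitative PCR, and the corresponding kit development; and to any case in which a known nucleic acid sequence is used to detect or extract information about a known sequence or its flanking sequences, or to mediate the insertion of other sequences into a known sequence fragment or its flanking fragments.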

Claims (26)

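  1. A hook probe, characterized in that the hook probe comprises a target specific region and a hook region connected to it, the target specific region comprises a sequence complementary to at least part of a single strand of a nucleic acid fragment to be ligated, the hook region comprises a sequence that does not pair with the nucleic acid fragment, and the end of the hook region is a ligatable end that can be ligated to a single-stranded end of the nucleic acid fragment.
  2. The hook probe according to claim 1, characterized in that the nucleic acid fragment to be ligated is a target nucleic acid comprising a target region, and the target region is a nucleic acid region to be enriched.
  3. The hook probe according to claim 2, characterized in that the target specific region comprises a sequence complementary to a sequence at a position flanking the target site of the target nucleic acid and/or within the target site, or to a sequence at a position flanking a site linked to the target site and/or within a site linked to the target site.
  4. The hook probe according to claim 1, characterized in that the hook region comprises a universal primer binding site;
    optionally, the hook region further comprises a unique molecular identifier sequence and/or a sample barcode sequence.
  5. The hook probe according to claim 1, characterized in that the hook probe is a 5' hook probe, the 5' hook probe has the structure 5'-(target specific region)-(hook region)-3', and the 3' end of the hook region has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment.
  6. The hook probe according to claim 5, characterized in that the 5' end of the 5' hook probe has a 5' blocking group that prevents it from ligating to other single strands or to itself.
  7. The hook probe according to claim 5 or 6, characterized in that the hook region of the 5' hook probe comprises a unique molecular identifier and/or sample barcode and a universal primer binding site, and the 5' hook probe has the structure 5'-(target specific region)-(unique molecular identifier and/or sample barcode)-(universal primer binding site)-3'.
  8. The hook probe according to claim 1, characterized in that the hook probe is a 3' hook probe, the 3' hook probe has the structure 5'-(hook region)-(target specific region)-3', and the 5' end of the hook region has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment.
  9. The hook probe according to claim 8, characterized in that the 3' end of the 3' hook probe has a 3' blocking group that prevents it from ligating to other single strands or to itself.
  10. The hook probe according to claim 8 or 9, characterized in that the hook region of the 3' hook probe comprises a universal primer binding site and a unique molecular identifier and/or sample barcode, and the 3' hook probe has the structure 5'-(universal primer binding site)-(unique molecular identifier and/or sample barcode)-(target specific region)-3'.
  11. The hook probe according to claim 1, characterized in that the hook region comprises a cleavage site;
    preferably, the cleavage site is a restriction enzyme recognition and binding site and/or one or more modified nucleotides that can be cleaved.
  12. A kit, characterized in that the kit comprises the hook probe according to any one of claims 1-11 and, optionally, at least one ligase for ligating the ligatable end of the hook probe to the end of the nucleic acid fragment to be ligated.
  13. Use of the hook probe according to any one of claims 1-11 in constructing a nucleic acid sequencing library;
    preferably, the nucleic acid sequencing library is a high-throughput sequencing library.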
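  14. A nucleic acid ligation method, characterized in that the method comprises:
    annealing and hybridizing the hook probe according to any one of claims 1-11 to a denatured nucleic acid fragment to be ligated; and
    in the presence of a ligase, ligating the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment.
  15. A method for constructing a nucleic acid sequencing library, characterized in that the method comprises:
    a step of annealing and hybridizing the hook probe according to any one of claims 1-11 to a denatured nucleic acid fragment to be ligated; and
    a step of ligating, in the presence of a ligase, the ligatable end of the hook probe to the single-stranded end of the nucleic acid fragment.
  16. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the method further comprises:
    a step of removing linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes;
    preferably, the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes are digested with a single-strand exonuclease.
  17. The method for constructing a nucleic acid sequencing library according to claim 15 or 16, characterized in that the method further comprises:
    a step of PCR-amplifying the ligation product of the hook probe and the nucleic acid fragment with universal primers.
  18. The method for constructing a nucleic acid sequencing library according to claim 15 or 16, characterized in that the hook region comprises a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved, and the method further comprises:
    a step of restriction digestion at the restriction enzyme binding site with a restriction enzyme and/or cleavage of the one or more modified nucleotides with a cleaving enzyme.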
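  19. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the hook probe comprises a 5' hook probe and a 3' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; the hook region of the 5' hook probe and the hook region of the 3' hook probe each comprise a universal primer binding site; and the method comprises the following steps:
    annealing and hybridizing the hook probes to the denatured nucleic acid fragment to be ligated;
    in the presence of a ligase, ligating the functional 3' hydroxyl group of the 5' hook probe to the 5' end of the nucleic acid fragment and the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment;
    digesting the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; and
    PCR-amplifying the ligation product of the hook probes and the nucleic acid fragment with universal primers, wherein the universal primers are complementary to the universal primer binding sites of the 5' hook probe and the 3' hook probe, respectively.
  20. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the hook probe comprises a 5' hook probe and a 3' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; the hook region of the 5' hook probe and the hook region of the 3' hook probe each comprise a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved; and the method comprises the following steps:
    annealing and hybridizing the hook probes to the denatured nucleic acid fragment to be ligated;
    in the presence of a ligase, ligating the functional 3' hydroxyl group of the 5' hook probe to the 5' end of the nucleic acid fragment and the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment;
    digesting the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; and
    performing restriction digestion at the restriction enzyme binding site with a restriction enzyme and/or cleaving the one or more modified nucleotides with a cleaving enzyme.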
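  21. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the hook probe comprises a 3' hook probe, and the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; and the method comprises the following steps:
    annealing and hybridizing the hook probe to the denatured nucleic acid fragment to be ligated;
    in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment;
    digesting the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease;
    performing a primer extension reaction with a primer sequence fully or partly complementary to the hook region sequence of the 3' hook probe;
    ligating a 5' adapter to the 5' end of the product of the extension reaction; and
    performing PCR amplification with primer sequences fully or partly complementary to the 5' adapter and to the hook region sequence of the 3' hook probe.
  22. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the hook probe comprises a 3' hook probe; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; the hook region of the 3' hook probe comprises a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved; and the method comprises the following steps:
    annealing and hybridizing the hook probe to the denatured nucleic acid fragment to be ligated;
    in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment;
    digesting the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; and
    performing restriction digestion at the restriction enzyme binding site with a restriction enzyme and/or cleaving the one or more modified nucleotides with a cleaving enzyme.
  23. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the hook probe comprises a 3' hook probe; the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; the nucleic acid fragment comprises a target region sequence; and the method comprises the following steps:
    annealing and hybridizing the hook probe to the denatured nucleic acid fragment to be ligated;
    in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment;
    digesting the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease;
    performing a primer extension reaction with a primer sequence fully or partly complementary to the target region sequence or its neighboring region;
    ligating a 5' adapter to the 5' end of the product of the extension reaction; and
    performing PCR amplification with primer sequences fully or partly complementary to the 5' adapter and to the hook region sequence of the 3' hook probe.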
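  24. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the hook probe comprises a 5' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment; the hook region of the 5' hook probe comprises a universal primer binding site; and the method comprises the following steps:
    end-repairing and dephosphorylating the fragmented nucleic acid fragments to be ligated;
    ligating a 3' adapter to the 3' end of the end-repaired and dephosphorylated nucleic acid fragments, the 3' adapter having a universal primer binding site;
    annealing and hybridizing the hook probe to the denatured nucleic acid fragments carrying the 3' adapter;
    after phosphorylation, in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment;
    digesting the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; and
    PCR-amplifying the ligation product of the hook probe and the nucleic acid fragment with universal primers, wherein the universal primers are complementary to the universal primer binding sites of the 5' hook probe and the 3' adapter, respectively.
  25. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the hook probe comprises a 5' hook probe; the 3' end of the hook region of the 5' hook probe has a functional 3' hydroxyl group that can be ligated to the 5' end of the nucleic acid fragment; the hook region of the 5' hook probe comprises a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved; and the method comprises the following steps:
    end-repairing and dephosphorylating the fragmented nucleic acid fragments to be ligated;
    ligating a 3' adapter to the 3' end of the end-repaired and dephosphorylated nucleic acid fragments, the 3' adapter comprising a restriction enzyme binding site cleavable by a restriction enzyme and/or one or more modified nucleotides that can be cleaved;
    annealing and hybridizing the hook probe to the denatured nucleic acid fragments carrying the 3' adapter;
    after phosphorylation, in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment;
    digesting the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease; and
    performing restriction digestion at the restriction enzyme binding site with a restriction enzyme and/or cleaving the one or more modified nucleotides with a cleaving enzyme.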
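  26. The method for constructing a nucleic acid sequencing library according to claim 15, characterized in that the hook probe comprises a 3' hook probe, and the 5' end of the hook region of the 3' hook probe has a functional 5' phosphate group that can be ligated to the 3' end of the nucleic acid fragment; and the method comprises the following steps:
    annealing and hybridizing the hook probe to the denatured nucleic acid fragment to be ligated;
    in the presence of a ligase, ligating the functional 5' phosphate group of the 3' hook probe to the 3' end of the nucleic acid fragment;
    in the presence of a polymerase, performing an extension reaction starting from the 3' end of the 3' hook probe;
    digesting the linear non-specific ligation products, excess nucleic acid fragments, and excess hook probes with a single-strand exonuclease;
    ligating a 5' adapter to the 5' end of the product of the extension reaction; and
    performing PCR amplification with primer sequences fully or partly complementary to the 5' adapter and to the hook region sequence of the 3' hook probe.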
PCT/CN2017/110252 2017-11-09 2017-11-09 Hook probe, nucleic acid ligation method, and method for constructing a sequencing library WO2019090621A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP17931336.6A EP3730613A4 (en) 2017-11-09 2017-11-09 HOOK PROBE, PROCESS FOR LIGATURE OF NUCLEIC ACID AND PROCESS FOR BUILDING A SEQUENCING BANK
PCT/CN2017/110252 WO2019090621A1 (zh) 2017-11-09 2017-11-09 Hook probe, nucleic acid ligation method, and method for constructing a sequencing library
US16/762,898 US11680285B2 (en) 2017-11-09 2017-11-09 Hooked probe, method for ligating nucleic acid and method for constructing sequencing library
CN201780096314.8A CN111278974B (zh) 2017-11-09 2017-11-09 Hook probe, nucleic acid ligation method, and method for constructing a sequencing library

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/110252 WO2019090621A1 (zh) 2017-11-09 2017-11-09 Hook probe, nucleic acid ligation method, and method for constructing a sequencing library

Publications (1)

Publication Number Publication Date
WO2019090621A1 true WO2019090621A1 (zh) 2019-05-16

Family

ID=66437530

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/110252 WO2019090621A1 (zh) 2017-11-09 2017-11-09 Hook probe, nucleic acid ligation method, and method for constructing a sequencing library

Country Status (4)

Country Link
US (1) US11680285B2 (zh)
EP (1) EP3730613A4 (zh)
CN (1) CN111278974B (zh)
WO (1) WO2019090621A1 (zh)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7303901B2 (en) 2002-09-20 2007-12-04 Prokaria Ehf. Thermostable RNA ligase from thermus phage
US20120004126A1 (en) 2006-10-27 2012-01-05 Complete Genomics, Inc. Efficient Arrays of Amplified Polynucleotides
WO2010094040A1 (en) 2009-02-16 2010-08-19 Epicentre Technologies Corporation Template-independent ligation of single-stranded dna
US9217167B2 (en) 2013-07-26 2015-12-22 General Electric Company Ligase-assisted nucleic acid circularization and amplification
CN106232833A (zh) * 2014-01-30 2016-12-14 The Regents of the University of California Methylation haplotyping for non-invasive diagnosis (MONOD)
WO2017075265A1 (en) * 2015-10-28 2017-05-04 The Broad Institute, Inc. Multiplex analysis of single cell constituents
CN107236729A (zh) * 2017-07-04 2017-10-10 上海阅尔基因技术有限公司 Method and kit for rapid construction of a target nucleic acid sequencing library based on probe capture enrichment

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
BLONDAL ET AL., NUCLEIC ACIDS RES., vol. 33, 2005, pages 135 - 142
LI ET AL., ANAL. BIOCHEM., vol. 349, 2006, pages 242 - 246
LUCKS ET AL., PROC. NATL. ACAD. SCI. USA, vol. 108, 2011, pages 11063 - 11068
See also references of EP3730613A4
TESSIER ET AL., ANAL. BIOCHEM., vol. 158, 1986, pages 171 - 178
ZHANG ET AL., NUCLEIC ACIDS RES., vol. 24, 1996, pages 990 - 991

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4032986A4 (en) * 2019-09-20 2024-01-24 Shanghai Zenisight Ltd ENRICHMENT METHOD AND SYSTEM FOR GENE TARGET REGION
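CN117230170A (zh) * 2023-11-13 2023-12-15 元码基因科技(北京)股份有限公司 Telomere-specific adapter based on site-directed circularization ligation, pre-library, and construction method thereof
CN117230170B (zh) * 2023-11-13 2024-04-12 元码基因科技(北京)股份有限公司 Telomere-specific adapter based on site-directed circularization ligation, pre-library, and construction method thereof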

Also Published As

Publication number Publication date
CN111278974B (zh) 2024-06-11
CN111278974A (zh) 2020-06-12
US20210172009A1 (en) 2021-06-10
EP3730613A1 (en) 2020-10-28
EP3730613A4 (en) 2021-11-03
US11680285B2 (en) 2023-06-20

Similar Documents

Publication Publication Date Title
US11697843B2 (en) Methods for creating directional bisulfite-converted nucleic acid libraries for next generation sequencing
US11964997B2 (en) Methods of library construction for polynucleotide sequencing
US9745614B2 (en) Reduced representation bisulfite sequencing with diversity adaptors
US20210071171A1 (en) Compositions and methods for targeted nucleic acid sequence enrichment and high efficiency library generation
US10570448B2 (en) Compositions and methods for identification of a duplicate sequencing read
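CN110191961B (zh) Method for preparing an asymmetrically tagged sequencing library
JP6803327B2 (ja) Digital measurements from targeted sequencing
JP6335918B2 (ja) Target enrichment without restriction enzymes
KR102390285B1 (ko) Nucleic acid probe and method for detecting genomic fragments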
US11091791B2 (en) Methods for hybridization based hook ligation
US20230056763A1 (en) Methods of targeted sequencing
WO2013192292A1 (en) Massively-parallel multiplex locus-specific nucleic acid sequence analysis
US9365896B2 (en) Addition of an adaptor by invasive cleavage
US20210115510A1 (en) Generation of single-stranded circular dna templates for single molecule sequencing
WO2019090621A1 (zh) Hook probe, nucleic acid ligation method, and sequencing library construction method
US20160258002A1 (en) Synthesis of Pools of Probes by Primer Extension
US20180100180A1 (en) Methods of single dna/rna molecule counting
WO2018081666A1 (en) Methods of single dna/rna molecule counting
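CN115175985A (zh) Method for extracting and sequencing single-stranded DNA and RNA from untreated biological samples
JP2024035109A (ja) Method for accurate parallel detection and quantification of nucleic acids
CN113454235A (zh) Improved nucleic acid target enrichment and related methods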

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 17931336

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017931336

Country of ref document: EP

Effective date: 20200609