WO2023108362A1 - Endonuclease and use thereof in dna fragmentation - Google Patents

Endonuclease and use thereof in DNA fragmentation

Info

Publication number
WO2023108362A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequencing
present
endonuclease
dna
fragmentation
Application number
PCT/CN2021/137530
Other languages
French (fr)
Chinese (zh)
Inventor
张晓红
高重亮
徐讯
谢庆庆
郑越
董宇亮
章文蔚
Original Assignee
深圳华大生命科学研究院
Application filed by 深圳华大生命科学研究院
Priority to PCT/CN2021/137530
Publication of WO2023108362A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P: FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00: Preparation of compounds containing saccharide radicals
    • C12P19/26: Preparation of nitrogen-containing carbohydrates
    • C12P19/28: N-glycosides
    • C12P19/30: Nucleotides
    • C12P19/34: Polynucleotides, e.g. nucleic acids, oligoribonucleotides

Definitions

  • the present invention relates to the field of biology.
  • the present invention relates to endonucleases and their use in DNA fragmentation.
  • NGS: next-generation sequencing technology
  • DNA fragmentation methods: there are three main types of DNA fragmentation methods, namely physical fragmentation, enzymatic fragmentation, and chemical fragmentation, each with its own advantages and disadvantages. Among them, enzymatic fragmentation and physical ultrasonic fragmentation are currently the more commonly used.
  • the physical ultrasonic fragmentation method is convenient to operate and has low bias, but it causes obvious damage to DNA, and the cost of ultrasonic instruments and consumables remains high, which has made enzymatic fragmentation increasingly popular.
  • Enzyme-based fragmentation typically breaks DNA by cutting both strands simultaneously or by creating nicks in each strand of dsDNA. It is highly flexible and can generate fragments ranging in length from a few hundred bases to many thousands of bases.
  • DNase I is an endonuclease that digests ssDNA, dsDNA, and RNA:DNA hybrids. DNase I digestion of DNA is dependent on Ca²⁺ and can be activated by Mg²⁺ or Mn²⁺. Although DNase I can be used to fragment DNA, studies have found that DNase I has a clear preference for sites next to pyrimidine nucleotides, which greatly reduces the diversity of sequencing libraries.
  • Fragmentase is a time-dependent enzyme that can fragment dsDNA into 50-1000 bp fragments depending on the reaction time. Fragmentase is composed of two enzymes, the non-specific endonuclease VVN and the specific endonuclease MBP-T7. VVN randomly generates nicks on dsDNA, while MBP-T7 recognizes the nicks and cuts the opposite strand, resulting in dsDNA fragments. Although Fragmentase is simple and efficient, studies have shown that it produces more artifactual indels.
  • Illumina's Nextera technology fragments DNA with the transposase Tn5. Partial P5 and P7 adapter sequences are joined to the transposon end sequence to form loaded adapters, which assemble with Tn5 into a transposition complex. The complex cuts the target DNA, yielding DNA with Adapter1 (carrying the P5 portion) at one end and Adapter2 (carrying the P7 portion) at the other; the index sequence and the rest of the adapter are then added by PCR to form a complete library.
  • the fragmentation method based on the transposase Tn5 fragments the DNA and adds adapters simultaneously, without the need for end repair and A-tailing, which greatly shortens library construction time and improves work efficiency.
  • although the transposase Tn5 has obvious advantages in the construction of sequencing libraries, its fragmentation specificity has also been criticized by researchers.
  • the present invention aims to solve at least one of the technical problems existing in the prior art at least to a certain extent.
  • the present invention proposes isolated nucleic acids, isolated polypeptides, constructs, recombinant cells, endonucleases, kits, DNA fragmentation methods, methods for constructing sequencing libraries, sequencing libraries and sequencing methods.
  • the isolated polypeptide can be used as an endonuclease in DNA fragmentation, with the advantages of no obvious fragmentation bias, good GC coverage, good sequencing data quality, high accuracy, and low cost of library construction and sequencing, making it suitable for wide application.
  • the invention provides an isolated polypeptide.
  • the isolated polypeptide has the amino acid sequence shown in SEQ ID NO: 1 or an amino acid sequence having at least 80% homology therewith.
  • the inventors of the present invention performed gene sequencing on samples obtained from the natural environment and found an unreported polypeptide in the sequencing database, which was determined through comparative analysis to be the CRISPR-associated protein Cas2. Its performance was then tested, and it was found that it can be used for DNA fragmentation, with the advantages of no obvious fragmentation preference and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing; the application prospect is good.
  • the present invention proposes the use of the aforementioned isolated polypeptide as an endonuclease.
  • the present invention provides an endonuclease.
  • the endonuclease has the amino acid sequence shown in SEQ ID NO: 1 or an amino acid sequence having at least 80% homology therewith. Therefore, the endonuclease according to the embodiment of the present invention can be used for DNA fragmentation, with the advantages of no obvious fragmentation preference and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing; the application prospect is good.
  • the invention provides an isolated nucleic acid.
  • the nucleic acid encodes the aforementioned isolated polypeptide or the endonuclease.
  • the polypeptide encoded by the isolated nucleic acid according to the embodiment of the present invention can be used as an endonuclease in DNA fragmentation, with the advantages of no obvious fragmentation preference and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing; the application prospect is good.
  • the invention proposes a construct.
  • the construct comprises: the aforementioned isolated nucleic acid.
  • the above-mentioned polypeptide can be expressed by transforming the construct according to the embodiment of the present invention. The polypeptide can be used as an endonuclease in DNA fragmentation, with the advantages of no obvious fragmentation preference and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing; the application prospect is good.
  • the present invention provides a recombinant cell.
  • the recombinant cells are obtained by transforming recipient cells with the aforementioned constructs.
  • the recombinant cells according to the embodiment of the present invention can express the above-mentioned polypeptide, which can be used as an endonuclease for DNA fragmentation, with the advantages of no obvious fragmentation preference and good GC coverage. The quality and accuracy of sequencing data are thus improved, the cost of library construction and sequencing is reduced, and the application prospect is good.
  • the present invention provides a kit.
  • the kit includes: the aforementioned isolated polypeptide, the endonuclease, the isolated nucleic acid, the construct and/or the recombinant cell. Therefore, this kit can be used to achieve DNA fragmentation, with the advantages of no obvious fragmentation preference and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing; the application prospect is good.
  • the present invention proposes the use of the aforementioned isolated polypeptide, the endonuclease, the isolated nucleic acid, the construct, the recombinant cell or the kit.
  • the aforementioned polypeptides, endonucleases, isolated nucleic acids, constructs, recombinant cells or kits can be used, directly or indirectly, for DNA fragmentation, with the advantages of no obvious fragmentation preference and good GC coverage. The quality and accuracy of sequencing data are thereby improved, the cost of library construction and sequencing is reduced, and the application prospect is good.
  • the present invention provides a method for DNA fragmentation.
  • the method includes: reacting the DNA sample with the aforementioned isolated polypeptide or the aforementioned endonuclease to obtain DNA fragments.
  • the isolated polypeptide or endonuclease according to the embodiment of the present invention can recognize and cleave bases or base sequences on the DNA to achieve DNA fragmentation; the fragmentation shows no obvious preference and the GC coverage is good.
  • the present invention provides a method for constructing a sequencing library.
  • the method according to the embodiment of the present invention includes: obtaining DNA fragments by using the aforementioned DNA fragmentation method. Therefore, the quality of the sequencing library obtained by using the method according to the embodiment of the present invention is good, and the cost of building the library is low.
  • the present invention provides a sequencing library.
  • the sequencing library is obtained by the aforementioned method for constructing a sequencing library. Therefore, the quality of the sequencing library according to the embodiment of the present invention is good.
  • the present invention provides a sequencing method.
  • the method includes: performing sequencing on the aforementioned sequencing library. Therefore, the accuracy of the method according to the embodiment of the present invention is good.
  • Fig. 1 shows an electrophoresis image of the fragmentation of NA12878 DNA by Cas2 protein according to one embodiment of the present invention
  • Fig. 2 shows an electrophoresis image of DNA fragmentation by Cas2 protein and other enzymes according to one embodiment of the present invention
  • Fig. 3 shows a comparison of the effect of double size selection with magnetic beads on DNA fragmentation products from different enzymatic methods according to an embodiment of the present invention
  • Fig. 4 shows a comparison of GC preference in gene sequencing after fragmentation by different enzymatic methods according to an embodiment of the present invention
  • Fig. 5 shows a comparison of gene sequencing results after fragmentation by different enzymatic methods according to an embodiment of the present invention.
  • Embodiments of the present invention are described in detail below.
  • the embodiments described below are exemplary only, serve to explain the present invention, and should not be construed as limiting it. Where no specific technique or condition is indicated in the examples, the procedure follows the techniques or conditions described in the literature in this field or the product specification. Reagents or instruments for which no manufacturer is indicated are all commercially available conventional products.
  • the inventors of the present invention carried out gene sequencing on samples obtained from the natural environment and found an unreported polypeptide in the sequencing database, which was determined through comparative analysis to be the CRISPR-associated protein Cas2 (having the amino acid sequence shown in SEQ ID NO: 1), an endonuclease.
  • the DNA sequence encoding the protein was synthesized by gene synthesis and cloned into the expression vector pET-28b. The constructed plasmid containing the sequence encoding the Cas2 protein was then transferred into E. coli, and the Cas2 protein was expressed heterologously in E. coli. Finally, the Cas2 protein was purified by affinity chromatography and ion exchange chromatography to obtain Cas2 protein with a purity of more than 95%.
  • the inventors further studied the performance of the Cas2 protein, specifically using it to fragment the NA12878 standard DNA and analyzing the fragmentation effect and enzyme performance by nucleic acid electrophoresis, Bioanalyzer 2100 and sequencing. The analysis confirmed that the discovered endonuclease Cas2 can be used for DNA fragmentation.
  • the nucleic acid has a nucleotide sequence shown in SEQ ID NO: 2 or a nucleotide sequence having at least 80% homology therewith.
  • the kit includes: the aforementioned isolated polypeptide, the endonuclease, the isolated nucleic acid, the construct, and the recombinant cell. Therefore, this kit can be used to achieve DNA fragmentation, with the advantages of no obvious fragmentation preference and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing; the application prospect is good.
  • the kit further includes a buffer, and the buffer contains: Tris, KCl, MgCl₂ and water.
  • the concentration of Tris is 1-50 mM, the concentration of KCl is 100-500 mM, the concentration of MgCl₂ is 1-50 mM, and the DNA is at 1-50 ng/µL.
  • in a narrower range, the concentration of Tris is 20-30 mM, the concentration of KCl is 100-300 mM, the concentration of MgCl₂ is 1-10 mM, and the DNA is at 20-30 ng/µL.
  • the present invention provides a method for DNA fragmentation.
  • the method includes: reacting the DNA sample with the aforementioned isolated polypeptide or the aforementioned endonuclease to obtain DNA fragments.
  • the isolated polypeptide or endonuclease according to the embodiment of the present invention can recognize and cleave bases or base sequences on the DNA to achieve DNA fragmentation; the fragmentation shows no obvious preference and the GC coverage is good.
  • the reaction is carried out in the buffer of the aforementioned kit, so that the DNA fragmentation reaction can be performed efficiently.
  • the reaction is carried out at 40-70°C for 0.5-2 hours. Under these conditions, the enzyme activity is high, and DNA fragmentation proceeds efficiently.
  • the reaction is carried out at 50-60°C for 50-90 minutes. Under these conditions, the enzyme activity is high, and DNA fragmentation proceeds efficiently.
  • the reaction is terminated by adding a stop reagent, EDTA.
  • the length of the DNA fragment is 100-1000 bp.
  • the gene sequence of the unreported Cas protein found in the database is the nucleotide sequence shown in SEQ ID NO: 2, and the encoded amino acid sequence is the amino acid sequence shown in SEQ ID NO: 1.
  • the expression plasmid pET-28b-Cas2 was purchased from Beijing Liuhe Huada Gene Technology Co., Ltd.
  • a 6×His tag is fused to the N-terminus of the amino acid sequence to facilitate protein purification.
  • the expression plasmid pET-28b-Cas2 was transformed into BL21 competent cells (purchased from Tiangen Biochemical Technology (Beijing) Co., Ltd.), and single clones were then picked from the plate, placed in 5 ml LB medium containing kanamycin (50 µg/ml), and cultured overnight at 37°C and 200 rpm.
  • the sonication conditions are: ultrasonic horn diameter 10 mm, ultrasonic intensity 40%, 2 s on, 3 s off, for 30 min.
  • centrifuge at 13000 rpm and 4°C for 30 min to collect the supernatant.
  • the purified protein was stored at -20°C after dialysis with dialysate (10 mM Tris, 100 mM NaCl, 2 mM DTT, 50% glycerol, pH 7.5). Protein concentration and purity distribution were determined by A280 method and SDS-PAGE method.
  • NA12878 DNA was used as the fragmented sample, and Cas2 protein was used as the fragmentation enzyme to fragment the DNA.
  • 1x DNA loading buffer (10 mM Tris, 10 mM EDTA, 10% glycerol, 0.05% orange G, pH 7.6) was added, mixed well, and the sample was analyzed by electrophoresis on a 1.3% agarose gel. The reaction time required to obtain fragments of 100-1000 bp was finally determined to be 1 h. The specific results are shown in Figure 1.
  • NA12878 DNA was used as the fragmentation sample, and Cas2 protein, DNase I (NEB), the fragmentation kit of company B (BGI), or the fragmentation kit of company Y (Yeasen) were used as fragmentation enzymes to fragment the DNA.
  • the reaction using Cas2 protein as the fragmentation enzyme contained 25 ng/µL NA12878 DNA, 25 ng/µL Cas2, 25 mM Tris (pH 8.5), 200 mM KCl and 5 mM MgCl₂, and was incubated at 55°C for 1 h.
  • the other fragmentation enzymes were used according to the instructions of each product, with NA12878 DNA at 25 ng/µL in the reaction system. After the reaction was completed, 50 mM EDTA was added to terminate the reaction.
  • Example 4: Performance of Cas2 for DNA fragmentation in gene sequencing
  • Cas2 protein, DNase I (NEB), the fragmentation kit of company Q (Qiagen), or the fragmentation kit of company Y was used as the fragmentation enzyme for DNA fragmentation, followed by double size selection with magnetic beads.
  • the sequencing library was prepared according to the relevant instructions of the MGIEasy Enzyme Digestion PCR-Free DNA Library Preparation Kit.
  • the specific steps include end repair and A-tailing, adapter ligation, purification of the adapter ligation product, denaturation, single-strand circularization, Exo digestion, purification of the Exo digestion product, and quality control.
  • the MGISEQ-2000RS high-throughput sequencing kit (PE150) was used for sequencing, and the sequencing depth was 3x.
  • the newly discovered Cas2 protein can be used as a fragmentation enzyme for DNA fragmentation in gene sequencing.
  • the indicators of its sequencing meet the requirements, there is no obvious preference, and the GC preference is smaller than that of existing products.
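
The Cas2 reaction composition given above (25 ng/µL NA12878 DNA, 25 ng/µL Cas2, 25 mM Tris pH 8.5, 200 mM KCl, 5 mM MgCl₂) can be converted into pipetting volumes with the dilution relation C1·V1 = C2·V2. The following is a minimal Python sketch of that arithmetic only. The 20 µL reaction volume and every stock concentration are hypothetical values chosen for illustration; the text does not state them.

```python
def component_volumes(final_volume_ul, targets, stocks):
    """Return the volume (in µL) of each stock, plus water, for one reaction.

    Uses C1 * V1 = C2 * V2, so V1 = final_volume * final_conc / stock_conc.
    `targets` and `stocks` must share keys and use the same unit per key.
    """
    volumes = {}
    for name, final_conc in targets.items():
        stock_conc = stocks[name]
        if stock_conc < final_conc:
            raise ValueError(f"stock of {name} is too dilute")
        volumes[name] = final_volume_ul * final_conc / stock_conc
    used = sum(volumes.values())
    if used > final_volume_ul:
        raise ValueError("components exceed the reaction volume")
    volumes["water"] = final_volume_ul - used
    return volumes


# Final concentrations taken from the Cas2 reaction in the text
# (ng/µL for DNA and Cas2, mM for the salts and Tris).
TARGETS = {"DNA": 25, "Cas2": 25, "Tris": 25, "KCl": 200, "MgCl2": 5}

# Hypothetical stock concentrations, NOT from the text.
STOCKS = {"DNA": 250, "Cas2": 500, "Tris": 1000, "KCl": 2000, "MgCl2": 100}

if __name__ == "__main__":
    for name, vol in component_volumes(20.0, TARGETS, STOCKS).items():
        print(f"{name}: {vol:.2f} µL")
```

In practice the buffer components would usually be combined into a single concentrated buffer stock; the per-component form is used here only to keep the calculation explicit.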

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Wood Science & Technology (AREA)
  • Molecular Biology (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Genetics & Genomics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • General Chemical & Material Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Medicinal Chemistry (AREA)
  • Biomedical Technology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Provided in the present invention are an isolated nucleic acid, isolated polypeptide, construct, recombinant cell, endonuclease, kit, method for DNA fragmentation and method for constructing a sequencing library, the sequencing library and a sequencing method. The isolated polypeptide has an amino acid sequence as shown in SEQ ID NO: 1 or an amino acid sequence having at least 80% homology thereto.

Description

Endonuclease and its application in DNA fragmentation
Technical field
The present invention relates to the field of biology. In particular, the present invention relates to endonucleases and their use in DNA fragmentation.
Background
High-throughput sequencing, also known as next-generation sequencing (NGS), can sequence millions of short reads simultaneously. It has revolutionized sequencing technology and provides opportunities to explore more ecological and evolutionary questions. Although there are different NGS platforms, preparing a high-quality sequencing library is an indispensable step for most of them. Because platform read lengths are short, the DNA must be randomly broken before library construction, that is, fragmented. The result of DNA fragmentation is critical for sequencing: the fragments should generally be of suitable size, and the breaks should be random and unbiased.
There are three main types of DNA fragmentation methods, namely physical fragmentation, enzymatic fragmentation, and chemical fragmentation, each with its own advantages and disadvantages. Among them, enzymatic fragmentation and physical ultrasonic fragmentation are currently the more commonly used. The physical ultrasonic fragmentation method is convenient to operate and has low bias, but it causes obvious damage to DNA, and the cost of ultrasonic instruments and consumables remains high, which has made enzymatic fragmentation increasingly popular. Enzyme-based fragmentation typically breaks DNA by cutting both strands simultaneously or by creating nicks in each strand of dsDNA. It is highly flexible and can generate fragments ranging in length from a few hundred bases to many thousands of bases. In addition, enzymatic fragmentation is inexpensive, simple and time-saving: the fragmentation enzyme is added to the sample and the reaction is complete after a period of incubation. Products already on the market combine the fragmentation enzyme and the end-repair enzyme in a one-tube reaction for DNA fragmentation and end repair, which reduces the number of steps and greatly shortens the reaction time.
Many fragmentation enzymes for DNA fragmentation have been reported on the market, mainly DNase I, NEB's Fragmentase and Illumina's Nextera technology.
DNase I is an endonuclease that digests ssDNA, dsDNA, and RNA:DNA hybrids. DNase I digestion of DNA is dependent on Ca²⁺ and can be activated by Mg²⁺ or Mn²⁺. Although DNase I can be used to fragment DNA, studies have found that DNase I has a clear preference for sites next to pyrimidine nucleotides, which greatly reduces the diversity of sequencing libraries.
Fragmentase is a time-dependent enzyme that can fragment dsDNA into 50-1000 bp fragments depending on the reaction time. Fragmentase is composed of two enzymes, the non-specific endonuclease VVN and the specific endonuclease MBP-T7. VVN randomly generates nicks on dsDNA, while MBP-T7 recognizes the nicks and cuts the opposite strand, resulting in dsDNA fragments. Although Fragmentase is simple and efficient, studies have shown that it produces more artifactual indels.
Illumina's Nextera technology fragments DNA with the transposase Tn5. Partial P5 and P7 adapter sequences are joined to the transposon end sequence to form loaded adapters, which assemble with Tn5 into a transposition complex. The complex cuts the target DNA, yielding DNA with Adapter1 (carrying the P5 portion) at one end and Adapter2 (carrying the P7 portion) at the other; the index sequence and the rest of the adapter are then added by PCR to form a complete library. The Tn5-based method fragments the DNA and adds adapters simultaneously, without the need for end repair and A-tailing, which greatly shortens library construction time and improves work efficiency. Although the transposase Tn5 has obvious advantages in the construction of sequencing libraries, its fragmentation specificity has also been criticized by researchers.
Therefore, fragmentation enzymes for DNA fragmentation remain to be investigated.
Summary of the invention
The present invention aims to solve, at least to a certain extent, at least one of the technical problems existing in the prior art. The present invention proposes isolated nucleic acids, isolated polypeptides, constructs, recombinant cells, endonucleases, kits, DNA fragmentation methods, methods for constructing sequencing libraries, sequencing libraries and sequencing methods. The isolated polypeptide can be used as an endonuclease in DNA fragmentation, with the advantages of no obvious fragmentation bias, good GC coverage, good sequencing data quality, high accuracy, and low cost of library construction and sequencing, making it suitable for wide application.
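
One common way to examine a claim of good GC coverage is to group reference windows by GC content and compare the mean sequencing depth of each group with the overall mean; a flat profile near 1.0 suggests little GC bias. The sketch below illustrates that idea in Python. It is not the analysis used in the patent, and the window sequences and depths in the usage example are made up for illustration.

```python
from collections import defaultdict


def normalized_coverage_by_gc(windows, n_bins=10):
    """Mean depth per GC bin divided by the overall mean depth.

    `windows` is an iterable of (sequence, mean_depth) pairs, one per
    reference window. Returns {lower edge of GC bin in percent: ratio}.
    """
    depths_by_bin = defaultdict(list)
    for seq, depth in windows:
        seq = seq.upper()
        gc = seq.count("G") + seq.count("C")
        acgt = gc + seq.count("A") + seq.count("T")
        if acgt == 0:
            continue  # skip windows with no unambiguous bases (e.g. all N)
        # Integer arithmetic avoids floating-point edge effects at bin borders;
        # 100% GC is folded into the top bin.
        bin_index = min(gc * n_bins // acgt, n_bins - 1)
        depths_by_bin[bin_index].append(depth)

    all_depths = [d for ds in depths_by_bin.values() for d in ds]
    if not all_depths:
        raise ValueError("no usable windows")
    global_mean = sum(all_depths) / len(all_depths)
    if global_mean == 0:
        raise ValueError("overall mean depth is zero")

    return {
        i * 100 // n_bins: (sum(ds) / len(ds)) / global_mean
        for i, ds in sorted(depths_by_bin.items())
    }


if __name__ == "__main__":
    # Toy windows (hypothetical): depth rises with GC content, i.e. GC-biased.
    toy = [("ATATATATAT", 2.0), ("ATGCATGCAT", 4.0), ("GCGCGCGCGC", 6.0)]
    print(normalized_coverage_by_gc(toy))
```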
In one aspect, the present invention provides an isolated polypeptide. According to an embodiment of the present invention, the isolated polypeptide has the amino acid sequence shown in SEQ ID NO: 1 or an amino acid sequence having at least 80% homology therewith.
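
A threshold such as "at least 80% homology" is often evaluated in practice as percent identity over a pairwise alignment. The Python sketch below shows that calculation under stated assumptions: the two sequences are already aligned with `-` as the gap character, and identity is counted over all alignment columns that contain at least one residue. The text does not specify how homology is to be computed, and the toy sequences are hypothetical, not SEQ ID NO: 1.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a residue aligned
    to a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def meets_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy peptides differing at one of ten positions.
    print(percent_identity("MKVLAAGIVG", "MKVLSAGIVG"))
    print(meets_threshold("MKVLAAGIVG", "MKVLSAGIVG"))
```

Other conventions exist (for example, dividing by the length of the shorter sequence, or scoring similar residues as partial matches), and they can give different percentages for the same pair.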
本发明的发明人对自然环境获得的样本进行基因测序,从测序数据库中发现一种未报道过的多肽,经比对分析确定其为CRISPR相关蛋白Cas2。进一步地,对其性能进行检测,发现其可用于DNA的片段化,具有打断无明显偏好性、GC覆盖度较好的优点,从而提高了测序数据质量和准确性,降低了建库测序成本,应用前景好。The inventors of the present invention performed gene sequencing on samples obtained from the natural environment, and found an unreported polypeptide from the sequencing database, which was determined to be the CRISPR-related protein Cas2 through comparative analysis. Further, its performance was tested, and it was found that it can be used for DNA fragmentation, and has the advantages of no obvious preference for interruption and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing , the application prospect is good.
In another aspect, the present invention provides use of the aforementioned isolated polypeptide as an endonuclease. The inventors found that the isolated polypeptide can be used as an endonuclease for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
In yet another aspect, the present invention provides an endonuclease. According to an embodiment of the present invention, the endonuclease has the amino acid sequence shown in SEQ ID NO: 1 or an amino acid sequence having at least 80% homology therewith. The endonuclease according to the embodiment of the present invention can thus be used for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
In yet another aspect, the present invention provides an isolated nucleic acid. According to an embodiment of the present invention, the nucleic acid encodes the aforementioned isolated polypeptide or the endonuclease. The polypeptide encoded by the isolated nucleic acid according to the embodiment of the present invention can thus be used as an endonuclease for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
In yet another aspect, the present invention provides a construct. According to an embodiment of the present invention, the construct comprises the aforementioned isolated nucleic acid. Transformation with the construct according to the embodiment of the present invention thus allows the aforementioned polypeptide to be expressed. The polypeptide can be used as an endonuclease for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
In yet another aspect, the present invention provides a recombinant cell. According to an embodiment of the present invention, the recombinant cell is obtained by transforming a recipient cell with the aforementioned construct. The recombinant cell according to the embodiment of the present invention can thus express the aforementioned polypeptide, which can be used as an endonuclease for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
In yet another aspect, the present invention provides a kit. According to an embodiment of the present invention, the kit includes the aforementioned isolated polypeptide, the endonuclease, the isolated nucleic acid, the construct, and/or the recombinant cell. DNA fragmentation can thus be achieved with the kit, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and the kit has good application prospects.
In yet another aspect, the present invention provides use of the aforementioned isolated polypeptide, the endonuclease, the isolated nucleic acid, the construct, the recombinant cell, or the kit in DNA fragmentation. The aforementioned polypeptide, endonuclease, isolated nucleic acid, construct, recombinant cell, or kit can be used, directly or indirectly, for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
In yet another aspect, the present invention provides a DNA fragmentation method. According to an embodiment of the present invention, the method includes reacting a DNA sample with the aforementioned isolated polypeptide or the endonuclease to obtain DNA fragments. The isolated polypeptide or DNA fragmentation enzyme according to the embodiment of the present invention can recognize and cleave bases or base sequences on DNA to achieve DNA fragmentation, with no obvious fragmentation bias and good GC coverage.
In yet another aspect, the present invention provides a method for constructing a sequencing library. According to an embodiment of the present invention, the method includes obtaining DNA fragments by the aforementioned DNA fragmentation method. The sequencing library obtained by the method according to the embodiment of the present invention is thus of good quality, and the library construction cost is low.
In yet another aspect, the present invention provides a sequencing library. According to an embodiment of the present invention, the sequencing library is obtained by the aforementioned method for constructing a sequencing library. The sequencing library according to the embodiment of the present invention is thus of good quality.
In yet another aspect, the present invention provides a sequencing method. According to an embodiment of the present invention, the method includes sequencing the aforementioned sequencing library. The method according to the embodiment of the present invention thus has good accuracy.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows, and in part will become apparent from that description or be learned through practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the drawings, in which:
Fig. 1 shows an electrophoresis image of NA12878 DNA fragmented by the Cas2 protein according to an embodiment of the present invention;
Fig. 2 shows an electrophoresis image of DNA fragmented by the Cas2 protein and by other enzymatic methods according to an embodiment of the present invention;
Fig. 3 shows a comparison of the magnetic-bead double size selection results for DNA fragmentation products obtained by different enzymatic methods according to an embodiment of the present invention;
Fig. 4 shows a comparison of the GC bias in sequencing of DNA fragmented by different enzymatic methods according to an embodiment of the present invention;
Fig. 5 shows a comparison of the sequencing results for DNA fragmented by different enzymatic methods according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below. The embodiments described below are exemplary, are intended only to explain the present invention, and should not be construed as limiting it. Where no specific technique or condition is indicated in the examples, the techniques or conditions described in the literature in the field, or the product instructions, were followed. Reagents and instruments for which no manufacturer is indicated are conventional, commercially available products.
Isolated polypeptide or endonuclease
The inventors performed gene sequencing on samples obtained from the natural environment and found a previously unreported polypeptide in the sequencing database, which alignment analysis identified as the CRISPR-associated protein Cas2 (having the amino acid sequence shown in SEQ ID NO: 1), that is, an endonuclease. A DNA sequence encoding the protein was produced by gene synthesis and cloned into the expression vector pET-28b. The resulting plasmid containing the Cas2 coding sequence was then transformed into E. coli, in which the Cas2 protein was heterologously expressed. Finally, the Cas2 protein was purified by affinity chromatography and ion-exchange chromatography to a purity of more than 95%.
In addition, the inventors further studied the performance of the Cas2 protein. Specifically, the Cas2 protein was used to fragment NA12878 standard DNA, and the fragmentation result and enzyme performance were analyzed by nucleic acid electrophoresis, the Bioanalyzer 2100, and sequencing. The analysis confirmed that the discovered endonuclease Cas2 can be used for DNA fragmentation.
Figure PCTCN2021137530-appb-000001
Use of the isolated polypeptide as an endonuclease
In another aspect, the present invention provides use of the aforementioned isolated polypeptide as an endonuclease. The inventors found that the isolated polypeptide can be used as an endonuclease for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
It should be noted that the features and advantages described above for the isolated polypeptide also apply to this use and are not repeated here.
Endonuclease
In yet another aspect, the present invention provides an endonuclease. According to an embodiment of the present invention, the endonuclease has the amino acid sequence shown in SEQ ID NO: 1 or an amino acid sequence having at least 80% homology therewith. The endonuclease according to the embodiment of the present invention can thus be used for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
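The "at least 80% homology" criterion can be illustrated by computing percent identity over a global pairwise alignment. The text does not state which alignment method or identity definition is intended, so the following is only a minimal sketch under assumptions: a plain Needleman-Wunsch alignment with match = 1, mismatch = -1, gap = -1, and identity defined as identical columns divided by alignment length. The two short sequences are made-up placeholders, not SEQ ID NO: 1.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1]); out_b.append(b[j - 1])
                i -= 1; j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned columns as a percentage of the alignment length."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    al_a, al_b = global_align(a, b)
    same = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * same / len(al_a)


def meets_threshold(a: str, b: str, threshold: float = 80.0) -> bool:
    """True if the pair reaches the assumed identity threshold."""
    return percent_identity(a, b) >= threshold


# Placeholder sequences differing by a single substitution.
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQR"
print(percent_identity(reference, candidate), meets_threshold(reference, candidate))
```

In practice a substitution-matrix-based aligner would normally be used for proteins; the simple scoring here is chosen only to keep the sketch self-contained.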
Nucleic acid
In yet another aspect, the present invention provides an isolated nucleic acid. According to an embodiment of the present invention, the nucleic acid encodes the aforementioned isolated polypeptide or the endonuclease. The polypeptide encoded by the isolated nucleic acid according to the embodiment of the present invention can thus be used as an endonuclease for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
According to an embodiment of the present invention, the nucleic acid has the nucleotide sequence shown in SEQ ID NO: 2 or a nucleotide sequence having at least 80% homology therewith.
Figure PCTCN2021137530-appb-000002
Figure PCTCN2021137530-appb-000003
It should be noted that the features and advantages described above for the isolated polypeptide and the endonuclease also apply to the nucleic acid and are not repeated here.
Construct
In yet another aspect, the present invention provides a construct. According to an embodiment of the present invention, the construct comprises the aforementioned isolated nucleic acid. Transformation with the construct according to the embodiment of the present invention thus allows the aforementioned polypeptide to be expressed. The polypeptide can be used as an endonuclease for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
It should be noted that the features and advantages described above for the nucleic acid also apply to the construct and are not repeated here.
Recombinant cell
In yet another aspect, the present invention provides a recombinant cell. According to an embodiment of the present invention, the recombinant cell is obtained by transforming a recipient cell with the aforementioned construct. The recombinant cell according to the embodiment of the present invention can thus express the aforementioned polypeptide, which can be used as an endonuclease for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
It should be noted that the features and advantages described above for the construct also apply to the recombinant cell and are not repeated here.
Kit
In yet another aspect, the present invention provides a kit. According to an embodiment of the present invention, the kit includes the aforementioned isolated polypeptide, the endonuclease, the isolated nucleic acid, the construct, and the recombinant cell. DNA fragmentation can thus be achieved with the kit, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and the kit has good application prospects.
According to an embodiment of the present invention, the kit further includes a buffer containing Tris, KCl, MgCl₂, and water. This provides a suitable reaction environment for the subsequent DNA fragmentation, and the Cas2 protein maintains good enzymatic activity in this buffer.
According to an embodiment of the present invention, based on the total volume of the buffer, the concentration of Tris is 1 to 50 mM, the concentration of KCl is 100 to 500 mM, the concentration of MgCl₂ is 1 to 50 mM, and the concentration of the DNA fragmentation enzyme is 1 to 50 ng/μL. This further improves the stability and enzymatic activity of the Cas2 protein in the buffer and promotes the DNA fragmentation reaction.
According to an embodiment of the present invention, based on the total volume of the buffer, the concentration of Tris is 20 to 30 mM, the concentration of KCl is 100 to 300 mM, the concentration of MgCl₂ is 1 to 10 mM, and the concentration of the DNA fragmentation enzyme is 20 to 30 ng/μL. This further improves the stability and enzymatic activity of the Cas2 protein in the buffer and promotes the DNA fragmentation reaction.
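The two nested concentration windows above can be written down as a small lookup, which makes it easy to check whether a given buffer composition falls in the narrower window, only in the broader one, or outside both. This is an illustrative sketch only; the dictionary keys and the function name `classify_buffer` are hypothetical, and the ranges are copied from the two preceding paragraphs.

```python
# Concentration windows taken from the text (inclusive bounds).
BROAD = {
    "tris_mM": (1, 50),
    "kcl_mM": (100, 500),
    "mgcl2_mM": (1, 50),
    "enzyme_ng_per_uL": (1, 50),
}
NARROW = {
    "tris_mM": (20, 30),
    "kcl_mM": (100, 300),
    "mgcl2_mM": (1, 10),
    "enzyme_ng_per_uL": (20, 30),
}


def classify_buffer(buffer: dict) -> str:
    """Return 'narrow', 'broad', or 'outside' for a buffer composition."""
    def within(ranges: dict) -> bool:
        return all(lo <= buffer[key] <= hi for key, (lo, hi) in ranges.items())

    if within(NARROW):
        return "narrow"
    if within(BROAD):
        return "broad"
    return "outside"


# Composition used in Examples 2 and 3 of this document.
example_buffer = {"tris_mM": 25, "kcl_mM": 200, "mgcl2_mM": 5, "enzyme_ng_per_uL": 25}
print(classify_buffer(example_buffer))
```

The composition used in Examples 2 and 3 (25 mM Tris, 200 mM KCl, 5 mM MgCl₂, 25 ng/μL Cas2) sits inside the narrower window.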
It should be noted that the features and advantages described above for the isolated polypeptide, endonuclease, isolated nucleic acid, construct, and recombinant cell also apply to the kit and are not repeated here.
Use
In yet another aspect, the present invention provides use of the aforementioned isolated polypeptide, the endonuclease, the isolated nucleic acid, the construct, the recombinant cell, or the kit in DNA fragmentation. The aforementioned polypeptide, endonuclease, isolated nucleic acid, construct, recombinant cell, or kit can be used, directly or indirectly, for DNA fragmentation, with no obvious fragmentation bias and good GC coverage, thereby improving the quality and accuracy of sequencing data and reducing the cost of library construction and sequencing, and it has good application prospects.
It should be noted that the features and advantages described above for the isolated polypeptide, endonuclease, isolated nucleic acid, construct, recombinant cell, and kit also apply to this use and are not repeated here.
DNA fragmentation method
In yet another aspect, the present invention provides a DNA fragmentation method. According to an embodiment of the present invention, the method includes reacting a DNA sample with the aforementioned isolated polypeptide or the endonuclease to obtain DNA fragments. The isolated polypeptide or DNA fragmentation enzyme according to the embodiment of the present invention can recognize and cleave bases or base sequences on DNA to achieve DNA fragmentation, with no obvious fragmentation bias and good GC coverage.
According to an embodiment of the present invention, the reaction is carried out in the buffer of the aforementioned kit. The DNA fragmentation reaction can thus proceed efficiently.
According to an embodiment of the present invention, the reaction is carried out at 40 to 70°C for 0.5 to 2 hours. Under these conditions, enzymatic activity is high and DNA fragmentation proceeds efficiently.
According to an embodiment of the present invention, the reaction is carried out at 50 to 60°C for 50 to 90 minutes. Under these conditions, enzymatic activity is high and DNA fragmentation proceeds efficiently.
According to an embodiment of the present invention, the reaction is terminated by adding a stop reagent, the stop reagent being EDTA.
According to an embodiment of the present invention, the DNA fragments are 100 to 1000 bp in length.
It should be noted that the features and advantages described above for the isolated polypeptide and the endonuclease also apply to the DNA fragmentation method and are not repeated here.
Method for constructing a sequencing library
In yet another aspect, the present invention provides a method for constructing a sequencing library. According to an embodiment of the present invention, the method includes obtaining DNA fragments by the aforementioned DNA fragmentation method. The sequencing library obtained by the method according to the embodiment of the present invention is thus of good quality, and the library construction cost is low.
It should be noted that the features and advantages described above for the DNA fragmentation method also apply to this method and are not repeated here.
Sequencing library
In yet another aspect, the present invention provides a sequencing library. According to an embodiment of the present invention, the sequencing library is obtained by the aforementioned method for constructing a sequencing library. The sequencing library according to the embodiment of the present invention is thus of good quality.
It should be noted that the features and advantages described above for the method for constructing a sequencing library also apply to the sequencing library and are not repeated here.
Sequencing method
In yet another aspect, the present invention provides a sequencing method. According to an embodiment of the present invention, the method includes sequencing the aforementioned sequencing library. The method according to the embodiment of the present invention thus has good accuracy.
It should be noted that the features and advantages described above for the sequencing library also apply to the sequencing method and are not repeated here.
The solutions of the present invention are explained below with reference to examples. Those skilled in the art will understand that the following examples are intended only to illustrate the present invention and should not be regarded as limiting its scope. Where no specific technique or condition is indicated in the examples, the techniques or conditions described in the literature in the field, or the product instructions, were followed. Reagents and instruments for which no manufacturer is indicated are conventional, commercially available products.
Example 1 Expression and purification of the Cas2 protein
The gene sequence of the previously unreported Cas protein found in the database is the nucleotide sequence shown in SEQ ID NO: 2, which encodes the amino acid sequence shown in SEQ ID NO: 1. The expression plasmid pET-28b-Cas2, containing the DNA sequence of the Cas2 protein, was produced by gene synthesis (purchased from Beijing Liuhe Huada Gene Technology Co., Ltd.). A 6×His tag is fused to the N-terminus of the amino acid sequence to facilitate protein purification.
The expression plasmid pET-28b-Cas2 was transformed into BL21 competent cells (purchased from Tiangen Biochemical Technology (Beijing) Co., Ltd.). A single colony was picked from the plate into 5 mL of LB medium containing kanamycin (50 μg/mL) and cultured overnight at 37°C and 200 rpm. The next day, the culture was diluted 1:100 into 1.5 L of fresh LB medium containing kanamycin (50 μg/mL) and grown at 37°C with shaking at 200 rpm to OD600 ≈ 0.6. The inducer IPTG was added to a final concentration of 0.5 mM, and expression was induced at 37°C and 200 rpm for 4 h. The induced cells were then pelleted by centrifugation at 8000 rpm for 10 min.
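The quantities implied by this culture step follow from simple dilution arithmetic. The sketch below assumes a 1 M IPTG stock, which is not stated in the text, and ignores the small volume added by each stock; the function name `stock_volume_mL` is illustrative.

```python
def stock_volume_mL(final_conc: float, stock_conc: float, final_volume_mL: float) -> float:
    """Volume of stock needed (C1*V1 = C2*V2), ignoring the added volume."""
    return final_conc * final_volume_mL / stock_conc


culture_mL = 1500.0                      # 1.5 L expression culture (from the text)
inoculum_mL = culture_mL / 100.0         # 1:100 dilution of the overnight culture

# IPTG: 0.5 mM final (from the text) from an ASSUMED 1 M (1000 mM) stock.
iptg_mL = stock_volume_mL(0.5, 1000.0, culture_mL)

# Kanamycin: 50 ug/mL final (from the text), expressed as total mass in mg.
kanamycin_mg = 50.0 * culture_mL / 1000.0

print(f"inoculum: {inoculum_mL} mL, IPTG stock: {iptg_mL} mL, kanamycin: {kanamycin_mg} mg")
```

A strict 1:100 inoculum for 1.5 L works out to about 15 mL, which is more than the 5 mL overnight culture described, so the stated volumes are probably approximate or the overnight culture was scaled up.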
The cell pellet was resuspended in Ni-column affinity solution A (50 mM Tris, 300 mM NaCl, 5% glycerol, 20 mM imidazole, pH 8.0) and lysed by sonication in an ice-water bath under the following conditions: horn diameter 10 mm, ultrasonic intensity 40%, 2 s on and 3 s off, sonication for 30 min. The lysate was then centrifuged at 13000 rpm and 4°C for 30 min, and the supernatant was collected.
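The pulse settings imply a 40% duty cycle. The text does not say whether the 30 min refers to total elapsed time or to cumulative pulse-on time, so this sketch simply computes both readings; `pulse_summary` is an illustrative name.

```python
def pulse_summary(on_s: float, off_s: float, stated_min: float, stated_is_elapsed: bool):
    """Return (duty_cycle, pulse_on_minutes, elapsed_minutes) for pulsed sonication."""
    duty = on_s / (on_s + off_s)
    if stated_is_elapsed:
        # The stated time covers both pulses and pauses.
        return duty, stated_min * duty, stated_min
    # The stated time counts only the pulses.
    return duty, stated_min, stated_min / duty


# Reading 1: 30 min is the total elapsed time.
duty, on_min, elapsed_min = pulse_summary(2, 3, 30, stated_is_elapsed=True)
print(f"duty {duty:.0%}: {on_min:.0f} min of pulses within {elapsed_min:.0f} min")

# Reading 2: 30 min is the cumulative pulse-on time.
duty, on_min, elapsed_min = pulse_summary(2, 3, 30, stated_is_elapsed=False)
print(f"duty {duty:.0%}: {on_min:.0f} min of pulses within {elapsed_min:.0f} min")
```

Which reading applies depends on how the particular sonicator's timer is defined.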
The sample prepared in the previous step was subjected to affinity purification as follows:
Following the AKTA operating procedure, the working pump and system were flushed with filtered, degassed Milli-Q water. A prepacked HisTrap FF 5 mL column was attached at a flow rate of 0.5 mL/min, flushed with 5 CV of H₂O, and equilibrated with 5 CV of Ni-column affinity solution A. The prepared sample was then loaded onto the column at 5 mL/min. After loading, the column was washed with a further 10 CV of Ni-column affinity solution A and then eluted with a linear gradient from 0 to 100% Ni-column affinity solution B (50 mM Tris, 300 mM NaCl, 5% glycerol, 500 mM imidazole, pH 8.0) over 10 CV, and the target protein was collected.
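The nominal imidazole concentration at any point of the linear gradient follows from the compositions of solutions A and B. The sketch below is a simple interpolation that ignores system delay volume and mixing effects; `gradient_conc` is an illustrative name.

```python
def gradient_conc(cv_elapsed: float, gradient_cv: float, conc_a: float, conc_b: float) -> float:
    """Nominal concentration during a linear 0-100% B gradient (clamped at both ends)."""
    frac_b = min(max(cv_elapsed / gradient_cv, 0.0), 1.0)
    return conc_a + frac_b * (conc_b - conc_a)


COLUMN_VOLUME_ML = 5.0   # HisTrap FF 5 mL column (from the text)
GRADIENT_CV = 10.0       # gradient length in column volumes (from the text)

# Imidazole: 20 mM in solution A, 500 mM in solution B (from the text).
for cv in (0, 2.5, 5, 7.5, 10):
    imidazole_mM = gradient_conc(cv, GRADIENT_CV, 20.0, 500.0)
    print(f"{cv * COLUMN_VOLUME_ML:5.1f} mL into gradient: {imidazole_mM:.0f} mM imidazole")
```

Because NaCl, Tris, and glycerol are identical in solutions A and B, only the imidazole concentration changes along this gradient.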
The sample obtained by affinity purification was diluted 6-fold with Taq diluent (50 mM Tris, 5% glycerol, pH 8.5) and then subjected to anion-exchange chromatography as follows:
Following the AKTA operating procedure, the working pump and system were flushed with filtered, degassed Milli-Q water. A prepacked HiTrap Q FF 5 mL column was attached at a flow rate of 0.5 mL/min, flushed with 5 CV of H₂O, and equilibrated with 5 CV of ion-exchange solution A (50 mM Tris, 50 mM NaCl, 5% glycerol, pH 8.5). The diluted sample was then loaded onto the column at 5 mL/min. After loading, the column was washed with a further 10 CV of ion-exchange solution A and then eluted with a linear gradient from 0 to 100% ion-exchange solution B (50 mM Tris, 1 M NaCl, 5% glycerol, pH 8.5) over 15 CV, and the target protein was collected.
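The 6-fold dilution before ion exchange can be checked with a short mixing calculation. Both affinity solutions contain 300 mM NaCl and the diluent contains none, so the diluted sample ends up at about 50 mM NaCl, the same as ion-exchange solution A. This is a sketch of that arithmetic only; `dilute` is an illustrative name, and the imidazole figure is an upper bound because the eluted fractions sit somewhere along the gradient.

```python
def dilute(sample_conc: float, diluent_conc: float, fold: float) -> float:
    """Concentration after mixing 1 part sample with (fold - 1) parts diluent."""
    return (sample_conc + (fold - 1) * diluent_conc) / fold


FOLD = 6.0
nacl_mM = dilute(300.0, 0.0, FOLD)            # affinity solutions A/B: 300 mM NaCl; diluent: none
glycerol_pct = dilute(5.0, 5.0, FOLD)         # 5% glycerol in both sample and diluent
tris_mM = dilute(50.0, 50.0, FOLD)            # 50 mM Tris in both sample and diluent
imidazole_max_mM = dilute(500.0, 0.0, FOLD)   # upper bound: sample eluted at 100% B

print(f"NaCl {nacl_mM:.0f} mM, glycerol {glycerol_pct:.0f}%, Tris {tris_mM:.0f} mM, "
      f"imidazole at most {imidazole_max_mM:.1f} mM")
```

Lowering the salt in this way is the usual reason for diluting a sample before loading it onto an anion exchanger.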
The purified protein was dialyzed against dialysis buffer (10 mM Tris, 100 mM NaCl, 2 mM DTT, 50% glycerol, pH 7.5) and stored at -20°C. Protein concentration and purity were determined by the A280 method and SDS-PAGE.
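The A280 method converts absorbance to concentration through the Beer-Lambert law, c = A / (ε · l). The text gives neither the extinction coefficient nor the molecular weight of the tagged protein, so the values below are placeholders chosen only to show the calculation; they are not properties of the Cas2 protein described here.

```python
def protein_conc_mg_per_mL(a280: float, ext_coeff_per_M_cm: float, mol_weight_Da: float,
                           path_cm: float = 1.0, dilution_factor: float = 1.0) -> float:
    """Beer-Lambert: molar concentration = A / (epsilon * l); mol/L * g/mol = g/L = mg/mL."""
    molar = a280 / (ext_coeff_per_M_cm * path_cm)
    return molar * mol_weight_Da * dilution_factor


# PLACEHOLDER inputs (hypothetical, not from the text).
A280_READING = 0.50
EXT_COEFF = 10_000.0    # M^-1 cm^-1, hypothetical
MOL_WEIGHT = 12_000.0   # Da, hypothetical

conc = protein_conc_mg_per_mL(A280_READING, EXT_COEFF, MOL_WEIGHT)
print(f"{conc:.2f} mg/mL")
```

For a real measurement, the extinction coefficient would be derived from the actual amino acid sequence (including the His tag), and the reading would be blanked against the dialysis buffer.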
Example 2 Determination of the reaction time for DNA fragmentation by the Cas2 protein
NA12878 DNA was used as the sample to be fragmented, and the Cas2 protein was used as the fragmentation enzyme. Reactions containing 25 ng/μL NA12878 DNA, 25 ng/μL Cas2, 25 mM Tris (pH 8.5), 200 mM KCl, and 5 mM MgCl₂ were incubated at 55°C for different times and then stopped by adding 50 mM EDTA. After the reaction, 1× DNA loading buffer (10 mM Tris, 10 mM EDTA, 10% glycerol, 0.05% Orange G, pH 7.6) was added, and the samples were mixed thoroughly and analyzed by electrophoresis on a 1.3% agarose gel. The reaction time required to fragment the DNA to 100 to 1000 bp was determined to be 1 h. The results are shown in Fig. 1.
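Setting up a reaction with the stated final composition is a C1·V1 = C2·V2 exercise. The text gives only final concentrations, so the stock concentrations and the 60 μL total volume below are assumptions made for illustration; `reaction_volumes` is an illustrative name.

```python
def reaction_volumes(total_uL: float, finals: dict, stocks: dict) -> dict:
    """Volume of each stock for the target final concentrations, plus water to volume."""
    volumes = {name: finals[name] * total_uL / stocks[name] for name in finals}
    water = total_uL - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for this total volume")
    volumes["water"] = water
    return volumes


# Final concentrations from the text (mM for salts and buffer, ng/uL for DNA and enzyme).
FINALS = {"Tris": 25.0, "KCl": 200.0, "MgCl2": 5.0, "NA12878 DNA": 25.0, "Cas2": 25.0}
# ASSUMED stock concentrations in the same units (not from the text).
STOCKS = {"Tris": 1000.0, "KCl": 2000.0, "MgCl2": 1000.0, "NA12878 DNA": 100.0, "Cas2": 500.0}

mix = reaction_volumes(60.0, FINALS, STOCKS)   # 60 uL total is an assumption
for component, volume in mix.items():
    print(f"{component:12s} {volume:6.2f} uL")
```

With more dilute stocks the function raises an error instead of returning a negative water volume, which signals that the requested composition cannot be reached at that reaction volume.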
Example 3 Comparison of fragmentation by different enzymatic methods
NA12878 DNA was used as the sample to be fragmented, and the Cas2 protein, DNase I (NEB), the fragmentation kit of company B (BGI), or the fragmentation kit of company Y (Yeasen) was used as the fragmentation enzyme. The reaction with the Cas2 protein contained 25 ng/μL NA12878 DNA, 25 ng/μL Cas2, 25 mM Tris (pH 8.5), 200 mM KCl, and 5 mM MgCl₂ and was incubated at 55°C for 1 h. The other fragmentation enzymes were used under the conditions given in the respective product instructions, with 25 ng/μL NA12878 DNA in every reaction. After the reaction, 50 mM EDTA was added to stop it. A 20 μL aliquot of the reaction was transferred to a 0.2 mL PCR tube, 1× DNA loading buffer (10 mM Tris, 10 mM EDTA, 10% glycerol, 0.05% Orange G, pH 7.6) was added, and the sample was mixed thoroughly and analyzed by electrophoresis on a 1.3% agarose gel. The results are shown in Fig. 2.
A further 40 μL of the reaction solution was transferred to a 0.2 mL PCR tube for magnetic bead purification, as follows. 24 μL of [Figure PCTCN2021137530-appb-000004] DNA Clean Beads was added, mixed thoroughly by gently pipetting 10 times, and incubated at room temperature for 10 min to allow DNA to bind to the beads. After brief centrifugation, the sample was placed on a magnetic stand; once the liquid had cleared, the supernatant was carefully transferred to a new 0.2 mL PCR tube. 8 μL of [Figure PCTCN2021137530-appb-000005] DNA Clean Beads was then added, mixed thoroughly by gently pipetting 10 times, and incubated at room temperature for 10 min to allow DNA to bind to the beads. After brief centrifugation, the sample was placed on the magnetic stand; once the liquid had cleared, the supernatant was carefully removed. With the sample kept on the magnetic stand throughout, the beads were rinsed with 200 μL of freshly prepared 80% ethanol and the supernatant was carefully removed; this rinse was repeated once with another 200 μL of freshly prepared 80% ethanol. Still on the magnetic stand, the tube was left open at room temperature until the bead pellet dried and cracked. The sample was then removed from the magnetic stand, 40 μL of EB buffer (10 mM Tris, pH 8.0) was added and mixed thoroughly by pipetting, and after brief centrifugation the tube was left on the magnetic stand for 5 min. Once the solution had cleared, the supernatant was carefully transferred to a new nuclease-free centrifuge tube. 2 μL of the eluted sample was analyzed with the Agilent 2100 High Sensitivity DNA kit to confirm the size of the DNA fragments after double-sided bead selection. The results showed that, after fragmentation and double-sided bead selection, the main band of the DNA fragments produced by each of the enzymatic methods was around 350 bp, which meets the requirements for library construction. The results are shown in Figure 3.
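The two bead additions can be read as a double-sided size selection: the first addition binds the largest fragments, which are discarded with the beads, and the second addition captures the desired fragments from the supernatant. The sketch below computes the bead-to-sample ratios implied by the stated volumes. Expressing the second step as a cumulative ratio relative to the original sample volume is a common convention assumed here; the source gives only the volumes.

```python
def double_selection_ratios(sample_ul, first_beads_ul, second_beads_ul):
    """Return (first_ratio, cumulative_ratio) for a two-step bead selection.

    first_ratio:      beads added in step 1 divided by the sample volume.
    cumulative_ratio: total beads added over both steps divided by the
                      original sample volume (assumed convention).
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    first_ratio = first_beads_ul / sample_ul
    cumulative_ratio = (first_beads_ul + second_beads_ul) / sample_ul
    return first_ratio, cumulative_ratio


# Volumes taken from the procedure above: 40 uL sample, 24 uL then 8 uL beads.
first, cumulative = double_selection_ratios(40, 24, 8)
print(f"first selection: {first:.1f}x, second selection: {cumulative:.1f}x")
```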
Example 4 Performance of Cas2 for DNA Fragmentation in Gene Sequencing
As in Example 3, NA12878 DNA was used as the sample to be fragmented, and fragmentation followed by double-sided bead selection was carried out with Cas2 protein, DNase I (NEB), the fragmentation kit of company Q (Qiagen), or the fragmentation kit of company Y. Sequencing libraries were then prepared according to the instructions of the MGIEasy enzymatic-digestion PCR-Free DNA library preparation kit. The steps were end repair and A-tailing, adapter ligation, purification of the ligation product, denaturation, single-strand circularization, Exo digestion, purification of the Exo digestion product, and quality control. Libraries that passed quality control were sequenced with the MGISEQ-2000RS high-throughput sequencing kit (PE150) at a sequencing depth of 3x. Analysis of the sequencing data showed that Q30, mapping rate, and mismatch rate for the Cas2-fragmented library did not differ from those obtained with DNase I, company Q, and company Y, whereas its GC bias was markedly smaller. The GC bias analysis is shown in Figure 4. The Cas2-fragmented library was well covered across GC contents of 18% to 87%, indicating low GC bias; the libraries from company Q and company Y came next (30% to 85%), while DNase I showed a higher GC bias.
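One common way to examine GC bias is to group genomic windows by GC content and compare the mean depth of each group with the overall mean depth. The sketch below shows that idea on toy data. It is not the analysis pipeline used for Figure 4, which the source does not describe; the window sequences, depths, and 10% bin width are hypothetical.

```python
from collections import defaultdict


def gc_percent(seq):
    """Return the GC content of a sequence as a percentage of its A/C/G/T bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence has no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def normalized_depth_by_gc(windows, bin_width=10):
    """Group (sequence, depth) windows into GC bins and return, per bin,
    the mean depth divided by the overall mean depth.

    Values near 1.0 in every bin suggest little GC bias.
    Assumes the overall mean depth is greater than zero.
    """
    overall = sum(depth for _, depth in windows) / len(windows)
    bins = defaultdict(list)
    for seq, depth in windows:
        start = int(gc_percent(seq) // bin_width) * bin_width
        start = min(start, 100 - bin_width)  # put 100% GC into the last bin
        bins[start].append(depth)
    return {start: (sum(d) / len(d)) / overall for start, d in sorted(bins.items())}


# Toy windows with made-up depths, for illustration only.
toy_windows = [("AAAAATTTTT", 8), ("ACGTACGTAC", 10), ("GGGGCCCCGC", 12)]
profile = normalized_depth_by_gc(toy_windows)
for start, ratio in profile.items():
    print(f"GC {start}-{start + 10}%: normalized depth {ratio:.2f}")
```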
Next, the sequencing libraries prepared with Cas2 and with company Q's fragmentation kit as the fragmentation enzyme were sequenced with the MGISEQ-2000RS high-throughput sequencing kit (PE150) at a sequencing depth of 30x. As shown in Figure 5, Q30 for the Cas2-fragmented library was clearly better than that for company Q, while mapping rate, coverage, mismatch rate, SNP precision, SNP sensitivity, Indel precision, and Indel sensitivity did not differ from those of company Q.
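For reference, variant-calling precision and sensitivity are conventionally computed from counts of true positives (TP), false positives (FP), and false negatives (FN) against a truth set. The sketch below uses these standard definitions; the source does not state how its values were calculated, and the counts are hypothetical, not results from Figure 5.

```python
def precision(true_positives, false_positives):
    """Fraction of called variants that are correct: TP / (TP + FP)."""
    called = true_positives + false_positives
    if called == 0:
        raise ValueError("no variants were called")
    return true_positives / called


def sensitivity(true_positives, false_negatives):
    """Fraction of true variants that were called: TP / (TP + FN)."""
    truth = true_positives + false_negatives
    if truth == 0:
        raise ValueError("truth set is empty")
    return true_positives / truth


# Hypothetical counts for illustration only.
tp, fp, fn = 990, 10, 10
print(f"precision: {precision(tp, fp):.3f}")
print(f"sensitivity: {sensitivity(tp, fn):.3f}")
```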
In summary, the newly discovered Cas2 protein can be used as a fragmentation enzyme for DNA fragmentation in gene sequencing. Its sequencing metrics meet requirements, it shows no obvious bias, and its GC bias is smaller than that of existing products.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such references do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that these embodiments are exemplary and should not be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (21)

  1. An isolated polypeptide, characterized in that it has the amino acid sequence shown in SEQ ID NO: 1 or an amino acid sequence having at least 80% homology therewith.
  2. Use of the isolated polypeptide of claim 1 as an endonuclease.
  3. An endonuclease, characterized in that it has the amino acid sequence shown in SEQ ID NO: 1 or an amino acid sequence having at least 80% homology therewith.
  4. An isolated nucleic acid, characterized in that the nucleic acid encodes the isolated polypeptide of claim 1 or the endonuclease of claim 3.
  5. The isolated nucleic acid according to claim 4, characterized in that the nucleic acid has the nucleotide sequence shown in SEQ ID NO: 2 or a nucleotide sequence having at least 80% homology therewith.
  6. A construct, characterized in that it comprises the isolated nucleic acid of claim 4 or 5.
  7. A recombinant cell, characterized in that the recombinant cell is obtained by transforming a recipient cell with the construct of claim 6.
  8. A kit, characterized in that it comprises: the isolated polypeptide of claim 1, the endonuclease of claim 3, the isolated nucleic acid of claim 4 or 5, the construct of claim 6, and/or the recombinant cell of claim 7.
  9. The kit according to claim 8, characterized in that it further comprises a buffer, the buffer containing Tris, KCl, MgCl₂, and water.
  10. The kit according to claim 9, characterized in that, based on the total volume of the buffer, the concentration of the Tris is 1-50 mM, the concentration of the KCl is 100-500 mM, the concentration of the MgCl₂ is 1-50 mM, and the concentration of the DNA fragmentation enzyme is 1-50 ng/μL.
  11. The kit according to claim 9 or 10, characterized in that, based on the total volume of the buffer, the concentration of the Tris is 20-30 mM, the concentration of the KCl is 100-300 mM, the concentration of the MgCl₂ is 1-10 mM, and the concentration of the DNA fragmentation enzyme is 20-30 ng/μL.
  12. Use of the isolated polypeptide of claim 1, the endonuclease of claim 3, the isolated nucleic acid of claim 4 or 5, the construct of claim 6, the recombinant cell of claim 7, or the kit of any one of claims 8-11 in DNA fragmentation.
  13. A method for DNA fragmentation, characterized in that it comprises: reacting a DNA sample with the isolated polypeptide of claim 1 or the endonuclease of claim 3 to obtain DNA fragments.
  14. The method according to claim 13, characterized in that the reaction is carried out in the buffer of the kit of any one of claims 9-11.
  15. The method according to claim 13 or 14, characterized in that the reaction is carried out at 40-70°C for 0.5-2 hours.
  16. The method according to any one of claims 13-15, characterized in that the reaction is carried out at 50-60°C for 50-90 minutes.
  17. The method according to any one of claims 13-16, characterized in that the reaction is terminated by adding a reaction-terminating reagent,
    the reaction-terminating reagent being selected from EDTA.
  18. The method according to any one of claims 13-17, characterized in that the length of the DNA fragments is 100-1000 bp.
  19. A method for constructing a sequencing library, characterized in that it comprises: obtaining DNA fragments by the DNA fragmentation method of any one of claims 13-18.
  20. A sequencing library, characterized in that the sequencing library is obtained by the method for constructing a sequencing library of claim 19.
  21. A sequencing method, characterized in that it comprises: sequencing the sequencing library of claim 20.
PCT/CN2021/137530 2021-12-13 2021-12-13 Endonuclease and use thereof in dna fragmentation WO2023108362A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/137530 WO2023108362A1 (en) 2021-12-13 2021-12-13 Endonuclease and use thereof in dna fragmentation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/137530 WO2023108362A1 (en) 2021-12-13 2021-12-13 Endonuclease and use thereof in dna fragmentation

Publications (1)

Publication Number Publication Date
WO2023108362A1 true WO2023108362A1 (en) 2023-06-22

Family

ID=86775269

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/137530 WO2023108362A1 (en) 2021-12-13 2021-12-13 Endonuclease and use thereof in dna fragmentation

Country Status (1)

Country Link
WO (1) WO2023108362A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111763664A (en) * 2020-06-28 2020-10-13 江苏康科斯医疗科技有限公司 Enzyme reaction liquid for constructing sequencing library and application thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
DATABASE Nucleotide 2 September 2014 (2014-09-02), ANONYMOUS : "Paenibacillus macerans strain ATCC 8244 DJ90.Contig468, whole genome s", XP093073210, retrieved from NCBI Database accession no. JMQA01000052.1 *
DATABASE Protein 2 September 2014 (2014-09-02), ANONYMOUS : "CRISPR-associated endoribonuclease Cas2 [Paenibacillus macerans]", XP093073212, retrieved from NCBI Database accession no. KFM93476.1 *
DATABASE Protein 22 May 2020 (2020-05-22), ANONYMOUS : "CRISPR-associated endonuclease Cas2 [Paenibacillus macerans] ", XP093073213, retrieved from NCBI Database accession no. WP_036626985.1 *
KI HYUN NAM ET AL.: "Double-stranded endonuclease activity in Bacillus halodurans clustered regularly interspaced short palindromic repeats (CRISPR)-associated Cas2 protein", JOURNAL OF BIOLOGICAL CHEMISTRY, vol. 287, no. 43, 19 October 2012 (2012-10-19), XP055352251, DOI: 10.1074/jbc.M112.382598 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21967504

Country of ref document: EP

Kind code of ref document: A1