WO2021027236A1 - Method for constructing a DNA library and its application - Google Patents

Method for constructing a DNA library and its application

Info

Publication number
WO2021027236A1
Authority
WO
WIPO (PCT)
Prior art keywords
dna
library
constructing
present
dna library
Prior art date
Application number
PCT/CN2019/130250
Inventor
陈晓丹
徐护朝
潘伟业
李志民
李大为
玄兆伶
王海良
王娟
Original Assignee
安诺优达基因科技(北京)有限公司
浙江安诺优达生物科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 安诺优达基因科技(北京)有限公司, 浙江安诺优达生物科技有限公司
Publication of WO2021027236A1 (zh)

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the present invention relates to the field of biotechnology, in particular to a method for constructing a DNA library and its application, and more particularly to a method for constructing a DNA library, a method for obtaining chromatin interaction information in individual cells, a method for obtaining individual biological information, a three-dimensional genome research method, a prenatal diagnosis or cancer screening method, a kit, and the use of the kit in three-dimensional genome library construction or prenatal diagnosis or cancer screening.
  • existing second-generation sequencing library construction involves many steps; in particular, the multi-step workflow from end repair through to PCR is likely to cause loss of effective fragments. The loss is more significant in three-dimensional genome Hi-C library construction.
  • the amount of biotin-labeled chimeric DNA available as a library construction template is relatively small, so any loss of effective fragments after pull-down directly affects the quality of the final library.
  • the fragment size selection step is included so that the library suits Illumina's sequencing-by-synthesis principle, since fragments that are too long result in poor sequencing data quality.
  • a considerable portion of the usable library fragments is removed during size selection because of fragment length; especially with nanogram-level templates, size selection reduces the amount of effective library, which directly affects the effective data ratio.
  • an objective of the present invention is to provide a method for constructing a DNA library that introduces a transposase into the library construction process. This simplifies the DNA fragmentation and adapter ligation steps and requires neither end repair nor addition of base A at the 3' end. The library construction time is short, the fragment length of the library product is suitable, the library can be loaded directly onto the sequencer without fragment size selection, and the effective data ratio of sequencing is high.
  • the inventors introduced a transposase into Hi-C library construction. The transposase carries two short nucleic acids that serve as adapters suitable for Illumina sequencing. When the transposase randomly fragments DNA, it simultaneously attaches adapters to the resulting small DNA fragments; amplification with specific primers then yields a library that can be sequenced, as shown in Figure 1. This significantly simplifies the library construction process, shortens the library construction time, and significantly increases the effective data ratio of the Hi-C library.
  • the present invention provides a method for constructing a DNA library.
  • the method includes: providing DNA of a chimeric marker, wherein the DNA of the chimeric marker has three-dimensional structure information; transposing the DNA of the chimeric marker to obtain a transposition product; capturing the transposition product to obtain captured DNA; and amplifying the captured DNA to obtain the DNA library.
  • transposition treatment simplifies the DNA fragmentation and adapter ligation steps, with no need for end repair or addition of base A at the 3' end. The library construction time is significantly shortened, and the fragment length of the library product is suitable, so the library can be sequenced directly without fragment size selection. The method is especially suitable for constructing Hi-C libraries from trace DNA samples; the effective data ratio of sequencing is high, and the single-end dangling noise value is low.
  • the present invention provides a method for obtaining chromatin interaction information in individual cells.
  • the method includes: using the aforementioned method to obtain a DNA library of the individual; and sequencing and analyzing the DNA library to obtain chromatin interaction information in the individual cells.
  • the method of obtaining chromatin interaction information in individual cells has simplified steps and a shortened operation time. It is especially suitable for constructing Hi-C libraries from trace DNA samples; the effective data ratio of sequencing is high, and the single-end dangling noise value is low.
  • the obtained intracellular chromatin interaction information is beneficial to the research in the field of three-dimensional genome.
  • the method for constructing a DNA library has all the technical features and effects of the aforementioned method for constructing a DNA library, and will not be repeated here.
  • the present invention provides a method for obtaining individual biological information.
  • the method includes: using the aforementioned method of constructing a DNA library to obtain the individual's DNA library; and sequencing and analyzing the DNA library to obtain the individual's biological information.
  • the method of obtaining individual biological information is simplified and the operation time is shortened. It is especially suitable for constructing Hi-C libraries from trace DNA samples; the effective data ratio of sequencing is high, the single-end dangling noise value is low, and the biological information obtained is useful for research and clinical diagnosis in the field of three-dimensional genomes.
  • the method for constructing a DNA library has all the technical features and effects of the aforementioned method for constructing a DNA library, and will not be repeated here.
  • the present invention provides a three-dimensional genome research method.
  • the method is performed by the aforementioned method of constructing a DNA library, the aforementioned method of obtaining chromatin interaction information in individual cells, or the aforementioned method of obtaining individual biological information. The method for constructing a DNA library and the method for obtaining individual biological information are therefore simplified, and the operation time is shortened. The method is especially suitable for library construction from trace DNA samples; the effective data ratio of sequencing is high, the dangling noise value is low, and the biological information obtained is suitable for three-dimensional genome research. It should be noted that the method for constructing a DNA library has all the technical features and effects of the aforementioned method for constructing a DNA library, which will not be repeated here.
  • the present invention provides a method for prenatal diagnosis or cancer screening.
  • the method is performed by the aforementioned method of constructing a DNA library, the aforementioned method of obtaining individual biological information, or the aforementioned three-dimensional genome research method. The method for constructing a DNA library and the method for obtaining individual biological information are therefore simplified, and the operation time is shortened. The method is especially suitable for library construction from trace DNA samples; the effective data ratio of sequencing is high, the dangling noise value is low, and the biological information obtained is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening. It should be noted that the method for constructing a DNA library has all the technical features and effects of the aforementioned method for constructing a DNA library, which will not be repeated here.
  • the present invention provides a kit.
  • the kit includes the reagents, primers, and mediating fragments used in the aforementioned method for constructing a DNA library, or a combination of at least one of them. The method for constructing a DNA library and the method for obtaining chromatin interaction information and biological information in individual cells are therefore simplified, and the operation time is shortened. The kit is especially suitable for library construction from trace DNA samples; the effective data ratio of sequencing is high, the dangling noise value is low, and the biological information obtained is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening.
  • the kit has all the technical features and effects of the aforementioned method for constructing a DNA library, and will not be repeated here.
  • the present invention provides the use of the aforementioned kit in three-dimensional genome library construction or prenatal diagnosis or cancer screening. The method for constructing a DNA library and the method for obtaining chromatin interaction information and biological information in individual cells are therefore simplified, and the operation time is shortened. The kit is especially suitable for library construction from trace DNA samples; the effective data ratio of sequencing is high, the dangling noise value is low, the kit is suitable for three-dimensional genome library construction, and the biological information obtained is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening.
  • Figure 1 shows a schematic diagram of the flow comparison of a method for constructing a DNA library according to an embodiment of the present invention
  • Fig. 2 shows a schematic diagram of the principle by which Tn5 transposase removes single-end dangling noise data according to an embodiment of the present invention
  • Fig. 3 shows a schematic diagram of agarose gel electrophoresis of library restriction digestion according to an embodiment of the present invention
  • Figure 4 shows a schematic diagram of the Agilent HS2100 peak of the library according to an embodiment of the present invention
  • Figure 5 shows a schematic diagram of the Agilent HS2100 peak of the library according to a comparative example of the present invention.
  • first and second are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance or implicitly indicating the number of indicated technical features. Thus, the features defined with “first” and “second” may explicitly or implicitly include one or more of these features. Further, in the description of the present invention, unless otherwise specified, “plurality” means two or more.
  • the present invention provides a method for constructing a DNA library.
  • transposition treatment simplifies the DNA fragmentation and adapter ligation steps, with no need for end repair or the step of adding base A at the 3' end, which significantly shortens the library construction time.
  • the library can be quickly constructed in 3 hours.
  • the fragment length of the library product is suitable, and the library can be sequenced directly without fragment size selection. The method is especially suitable for constructing Hi-C libraries from trace DNA samples, and the effective data ratio of sequencing is high. In some embodiments, the effective data ratio is more than 35%, which is nearly 10% higher than the prior art, and the single-end dangling noise value is low. In some embodiments, the single-end dangling value of invalid noise data is only 0.6%.
  • transposition treatment simplifies the DNA fragmentation and adapter ligation steps. Because transposition does not act on the ends of small DNA fragments, small fragments carrying biotin at a single-stranded end receive no adapters and cannot undergo PCR, which significantly reduces or even eliminates the single-end dangling noise of the Hi-C library.
  • the transposition process causes the DNA to be cut into small fragments of 200-500 bp.
  • the main peak of the library fragment length after PCR likewise falls within about 300-600 bp, so the library can be loaded directly onto the sequencer without fragment size selection, further simplifying the experimental steps.
  • because the experimental steps are simplified, sample loss during the experiment is reduced, and the sample input can be lowered to 10^3 cells.
  • the method is explained according to the embodiment of the present invention, and the method includes:
  • a DNA of a chimeric marker is provided, wherein the DNA of the chimeric marker has three-dimensional structure information.
  • the library construction method of the embodiment of the present invention builds a Hi-C high-throughput sequencing library from DNA of a chimeric marker that carries three-dimensional structure information, and uses high-throughput sequencing combined with bioinformatics methods to study the spatial relationships of chromatin DNA; by capturing DNA interaction patterns, high-resolution three-dimensional structure information of chromatin can be obtained.
  • the marker is biotin. Therefore, the DNA is labeled with biotin, which is convenient for subsequent fishing and purification of the DNA.
  • the DNA of the chimeric marker contains parts of spatially adjacent DNA segments. That is to say, the DNA of the chimeric marker is not a continuous and complete DNA segment of the chromatin in the nucleus of the original cell, but is obtained by joining at least two spatially adjacent DNA segments into a chimera. Furthermore, analysis of long-range chromatin interactions based on DNA interaction analysis, and of protein-specific DNA binding based on proximity ligation, helps define the target genes of cis-regulatory elements and annotate the function of non-coding sequence variants associated with various physiological and pathological conditions. It can thus be used for pathological research on clinical diseases, especially for exploring the mechanisms of cancer.
  • the method for obtaining the DNA of the chimeric marker includes: fixing and cross-linking the chromatin in the cell to form DNA-protein cross-links; performing restriction enzyme digestion to generate DNA-protein complexes with sticky ends; filling in the sticky ends with one or more biotin-labeled nucleotides together with ordinary nucleotides without biotin to produce blunt ends; and ligating the blunt ends to form proximity-ligated DNA. If all chromatin in the cell is fixed, the proximity-ligated DNA is genomic DNA; the genomic DNA is fragmented to obtain the DNA of the chimeric marker.
  • the DNA of the chimeric marker is transposed to obtain a transposition product. Only one transposition step is therefore required to fragment the DNA of the chimeric marker and add adapters, replacing the prior-art steps of DNA fragmentation, end repair, 3' "A" addition, and adapter ligation. This significantly simplifies the experimental process and shortens the library construction time.
  • transposase is used for transposition treatment.
  • the following takes Tn5 transposase as an example to explain the transposition process during library construction.
  • the Tn5 transposase used is the Tn5 transposase reagent developed by Epicentre. The transposase carries two short nucleic acids that, for library construction purposes, serve as adapters for Illumina sequencing. When the transposase randomly fragments DNA, it simultaneously attaches adapters to the resulting DNA fragments; amplification with specific primers then yields a library that can be sequenced.
  • using a transposase, especially Tn5 transposase, for transposition treatment during Hi-C library construction has at least one of the following advantages:
  • with a transposase, especially Tn5 transposase, one-step transposition treatment directly fragments the DNA and adds adapters to the fragments.
  • the DNA fragments are of suitable length after adapter addition, and the library can be obtained by direct PCR amplification.
  • the comparison between the method for constructing a DNA library in the embodiment of the present invention and the method for constructing a DNA library in the prior art is shown in FIG. 2; the experimental procedure of the present application is significantly simplified, and the library construction time is significantly shortened.
  • after DNA is extracted, the method of the embodiment of the present invention needs only 3 hours to complete rapid library construction.
  • Tn5 does not act on the ends of DNA fragments that are too short, for example DNA fragments less than 200 bp in length, so small DNA fragments carrying biotin at a single-stranded end receive no adapters and cannot undergo PCR.
  • small fragments that carry adapters but have no internal biotin label can be amplified by PCR normally, but they cannot be captured by the streptavidin magnetic beads, as shown in Figure 3.
  • exploiting these characteristics of Tn5 transposase can significantly reduce or even eliminate the single-end dangling noise of Hi-C libraries.
  • the effective data ratio is increased by about 10%.
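  • the filtering logic described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the patented method: the `Fragment` class and its two boolean fields are hypothetical names, and real fragments are of course not cleanly binary. The point is that a fragment reaches the final library only if it is both captured (biotin) and amplifiable (adapters at both ends).

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical model of one DNA fragment after transposition."""
    has_biotin: bool          # carries a biotin label (bound by streptavidin beads)
    adapters_both_ends: bool  # Tn5 attached an adapter to each end


def in_final_library(frag: Fragment) -> bool:
    """A fragment survives only if it is captured AND can be PCR-amplified."""
    captured = frag.has_biotin             # streptavidin pull-down needs biotin
    amplifiable = frag.adapters_both_ends  # PCR primers need both adapters
    return captured and amplifiable


# Short dangling-end fragment: biotin at the end, but no adapters from Tn5.
dangling_end = Fragment(has_biotin=True, adapters_both_ends=False)
# Fragment with adapters but no biotin: amplifiable, yet never captured.
unlabeled = Fragment(has_biotin=False, adapters_both_ends=True)
# Chimeric fragment with internal biotin and adapters at both ends.
valid_chimera = Fragment(has_biotin=True, adapters_both_ends=True)

for name, frag in [("dangling_end", dangling_end),
                   ("unlabeled", unlabeled),
                   ("valid_chimera", valid_chimera)]:
    print(name, in_final_library(frag))
```

    Under this simplified model, both noise classes drop out for different reasons, which mirrors the two cases described in the text.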
  • the ratio of the DNA of the chimeric marker to the transposase is 10ng:50-100 nM. This facilitates repeated transposition of the DNA of the chimeric marker.
  • the inventors found through testing that when the transposase input is too high, for example 200 nM, the library fragments become too small, with the main peak at about 290 bp. Because a Hi-C library fragment is a chimera of two DNA fragments, genome alignment of the sequencing data is performed on one segment taken from each end of the library fragment. If the library fragments are too small, the unique genome alignment rate of the effective portion of the sequencing data becomes too low, and the multiple-position alignment rate of the invalid portion becomes too high. After testing, the inventors therefore found that when the ratio of DNA to transposase is 10 ng : 50-100 nM, the library fragment length is more suitable (main peak 300-600 bp).
  • the reaction system for the transposition treatment includes: 8-12 μL transposition buffer; 0.2-1 μL 10% Tween 20; 7-10 μL water; 0.5-3 μL of the transposase, wherein the transposition buffer includes 10 mM Tris-HCl pH 7.6 and 5 mM MgCl2. Therefore, in this reaction system, the size of the DNA fragment treated by transposition is appropriate.
  • the temperature of the transposition treatment is 50-60° C., and the time is 5-15 minutes. Therefore, under this temperature regulation, it is advantageous to fragment DNA to an appropriate length range.
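  • as a bookkeeping aid, the stated reaction ranges can be encoded and checked programmatically. The sketch below is a hypothetical helper, not part of the patent: the dictionary keys and the function name `out_of_range` are invented for illustration, and the ranges are copied from the two paragraphs above. The example setup values are arbitrary points inside those ranges.

```python
# Ranges taken from the text: volumes in microliters, temperature in
# degrees Celsius, time in minutes. Key names are hypothetical.
RANGES = {
    "transposition_buffer_uL": (8.0, 12.0),
    "tween20_10pct_uL": (0.2, 1.0),
    "water_uL": (7.0, 10.0),
    "transposase_uL": (0.5, 3.0),
    "temperature_C": (50.0, 60.0),
    "time_min": (5.0, 15.0),
}


def out_of_range(setup: dict) -> list:
    """Return the names of parameters that fall outside the stated ranges."""
    return [name for name, (low, high) in RANGES.items()
            if not low <= setup[name] <= high]


# An arbitrary setup inside every range (illustrative values only).
example = {
    "transposition_buffer_uL": 10.0,
    "tween20_10pct_uL": 0.5,
    "water_uL": 8.5,
    "transposase_uL": 1.0,
    "temperature_C": 55.0,
    "time_min": 10.0,
}

# Same setup but with 5 uL of transposase, above the 0.5-3 uL range.
too_much_enzyme = dict(example, transposase_uL=5.0)

print(out_of_range(example))
print(out_of_range(too_much_enzyme))
```

    A check like this only confirms that a setup sits inside the claimed ranges; it says nothing about the resulting fragment size distribution, which still has to be measured.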
  • the transposition product is captured to obtain captured DNA.
  • the label-labeled adapter-added DNA is captured for subsequent amplification, thereby reducing the interference of impurity DNA on amplification.
  • the capture process is a pull-down process.
  • the pull-down is performed using streptavidin magnetic beads. Specifically, the streptavidin magnetic beads bind the biotin labeled on the DNA, so that biotin-labeled chimeric DNA fragments with adapters at both ends are pulled down from the transposition product.
  • the amount of streptavidin magnetic beads added is 5-10 μL. This is sufficient to fully capture the biotin-labeled DNA with adapters at both ends from the product, while avoiding the waste of reagents caused by excess beads.
  • the captured DNA is amplified to obtain the DNA library.
  • the captured DNA can be amplified by PCR to obtain sufficient material.
  • the library amplified by PCR can be further purified.
  • the present invention provides a method for obtaining chromatin interactions in individual cells.
  • the method includes: using the aforementioned method to obtain a DNA library of the individual; sequencing and analyzing the DNA library to obtain biological information such as chromatin interactions in individual cells.
  • the method for obtaining biological information such as chromatin interactions in individual cells has simplified steps and a shortened operation time. It is especially suitable for library construction from trace DNA samples; the effective data ratio of sequencing is high, and the single-end dangling noise value is low.
  • the obtained biological information is useful for pathological research of clinical diseases and scientific research of three-dimensional genome.
  • the method for constructing a DNA library has all the technical features and effects of the aforementioned method for constructing a DNA library, and will not be repeated here.
  • sequencing can be accomplished by the following methods: classic Sanger sequencing, massively parallel sequencing, next-generation sequencing, polony sequencing, 454 pyrosequencing, Illumina sequencing, SOLEXA sequencing, SOLiD sequencing, ion semiconductor sequencing, DNA nanoball sequencing, Heliscope single molecule sequencing, single molecule real-time sequencing, nanopore DNA sequencing, tunneling current DNA sequencing, sequencing by hybridization, mass spectrometry sequencing, microfluidic Sanger sequencing, microscope-based sequencing, RNA polymerase sequencing, in vitro virus high-throughput sequencing, Maxam-Gilbert sequencing, single-end sequencing, paired-end sequencing, deep sequencing, and ultra-deep sequencing. Illumina sequencing is especially suitable.
  • a bioinformatics pipeline can be used to process the reads of sequencing to map long-range and/or genome-wide chromatin interactions, thereby obtaining biological information such as chromatin interactions in individual cells.
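  • the source does not specify the pipeline, so the following is only a minimal sketch of one common downstream step, assuming read pairs have already been aligned and reduced to `(chrom1, pos1, chrom2, pos2)` tuples. The function names `bin_contacts` and `cis_fraction`, the bin size, and the toy coordinates are all hypothetical. It counts contacts between genomic bins and reports the share of same-chromosome (cis) pairs.

```python
from collections import Counter


def bin_contacts(pairs, bin_size):
    """Count contacts between genomic bins.

    pairs: iterable of (chrom1, pos1, chrom2, pos2) for aligned read pairs.
    Returns a Counter keyed by an ordered pair of (chrom, bin_index) tuples,
    so that A-B and B-A contacts are counted together.
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        a = (chrom1, pos1 // bin_size)
        b = (chrom2, pos2 // bin_size)
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts


def cis_fraction(pairs):
    """Fraction of read pairs whose two ends map to the same chromosome."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    cis = sum(1 for chrom1, _, chrom2, _ in pairs if chrom1 == chrom2)
    return cis / len(pairs)


# Toy data with made-up coordinates.
toy_pairs = [
    ("chr1", 100, "chr1", 5200),
    ("chr1", 5300, "chr1", 150),
    ("chr1", 100, "chr2", 100),
]
contacts = bin_contacts(toy_pairs, bin_size=1000)
for key, n in sorted(contacts.items()):
    print(key, n)
print(cis_fraction(toy_pairs))
```

    Real Hi-C pipelines add filtering (duplicates, dangling ends, mapping quality) and normalization before interpreting such a matrix; none of that is modeled here.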
  • the present invention provides a method for obtaining individual biological information.
  • the method includes: using the aforementioned method to obtain the individual's DNA library; and sequencing and analyzing the DNA library to obtain the individual's biological information.
  • the method of obtaining individual biological information is simplified and the operation time is shortened. It is especially suitable for constructing Hi-C libraries from trace DNA samples; the effective data ratio of sequencing is high, the single-end dangling noise value is low, and the biological information obtained is useful for research and clinical diagnosis in the field of three-dimensional genomes.
  • the method for constructing a DNA library has all the technical features and effects of the aforementioned method for constructing a DNA library, and will not be repeated here.
  • the present invention provides a method for prenatal diagnosis or cancer screening.
  • the method is performed by the aforementioned method of constructing a DNA library, the aforementioned method of obtaining individual biological information, or the aforementioned three-dimensional genome research method. The method for constructing a DNA library and the method for obtaining individual biological information are therefore simplified, and the operation time is shortened. The method is especially suitable for library construction from trace DNA samples; the effective data ratio of sequencing is high, the dangling noise value is low, and the biological information obtained is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening. It should be noted that the method for constructing a DNA library has all the technical features and effects of the aforementioned method for constructing a DNA library, which will not be repeated here.
  • the present invention provides a kit.
  • the kit includes the reagents, primers, and mediating fragments used in the aforementioned method for constructing a DNA library, or a combination of at least one of them. The method for constructing a DNA library and the method for obtaining chromatin interaction information and biological information in individual cells are therefore simplified, and the operation time is shortened. The kit is especially suitable for library construction from trace DNA samples; the effective data ratio of sequencing is high, the dangling noise value is low, and the biological information obtained is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening.
  • the kit has all the technical features and effects of the aforementioned method for constructing a DNA library, and will not be repeated here.
  • the present invention provides the use of the aforementioned kit in three-dimensional genome library construction or prenatal diagnosis or cancer screening. The method for constructing a DNA library and the method for obtaining chromatin interaction information and biological information in individual cells are therefore simplified, and the operation time is shortened. The kit is especially suitable for library construction from trace DNA samples; the effective data ratio of sequencing is high, the dangling noise value is low, the kit is suitable for three-dimensional genome library construction, and the biological information obtained is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening.
  • the kit may include, for example, multiple association molecules, affinity tags, fixatives, restriction endonucleases, ligases, and/or combinations thereof.
  • the associated molecule may be a protein, including, for example, a DNA binding protein (e.g., histone or transcription factor).
  • the fixative may be formaldehyde or any other DNA cross-linking agent.
  • the kit may also contain multiple beads.
  • the beads may be paramagnetic and/or may be coated with a capture agent.
  • the beads may be coated with streptavidin and/or antibodies.
  • the kit may include adaptor oligonucleotides and/or sequencing primers.
  • the kit may include a device capable of amplifying the read pair using adaptor oligonucleotides and/or sequencing primers.
  • the kit may also contain other reagents, including but not limited to lysis buffer, ligation reagents (for example, dNTP, polymerase, polynucleotide kinase and/or ligase buffer, etc.) and PCR reagents (for example, dNTP, polymerase, and/or PCR buffer, etc.).
  • the kit may also include instructions for using the kit components and/or generating read pairs.
  • the mouse 3T3-NIH cell line was used as the experimental material.
  • the cryopreserved 3T3-NIH cells were quickly thawed in a 37°C water bath, transferred to 9ml cell culture medium in a biological safety cabinet and mixed.
  • the cell culture medium included 15% fetal bovine serum, 84% DMEM medium and 1% penicillin-streptomycin antibiotics (all V/V). Centrifuge at 1000 rpm at 23°C for 10 minutes, discard the supernatant and add 5 ml cell culture medium, resuspend the cells and transfer them to a cell culture flask, then place the flask in a cell incubator at 37°C with 5% CO2 for static culture.
  • add cell lysis buffer to the cross-linked cells, containing 10nM Tris-HCl pH7.4, 10mM NaCl, 0.1mM EDTA, 0.5% NP-40, and 5 μL of protease inhibitor; mix well by pipetting and leave on ice for 1 h to lyse. After lysis is complete, centrifuge at 2500g at 4°C for 5 minutes and remove the supernatant. Add 20 μL of cell lysis buffer and 10 μL of 0.5% SDS, and incubate in a constant-temperature mixer at 62°C for 10 min. Then add 5 μL of 10% Triton X-100 and incubate in a constant-temperature mixer at 37°C for 30 minutes.
  • Terminal biotin labeling: 10mM dATP, 10mM dGTP, 10mM dTTP
  • add ligation buffer to the biotin-labeled product, containing 26.5 μL ddH2O, 7 μL 10% Triton X-100, 24 μL 5X T4 ligase buffer, 1.2 μL 10mg/ml BSA and 400U T4 DNA ligase. Incubate in a constant-temperature mixer at 16°C for more than 6 hours, shaking at 1400rpm for 15 s every 2 min.
  • use the TruePrep™ DNA Library Prep Kit V2 from Vazyme to prepare the PCR reaction mix, containing 10 μL 5X TAD, 5 μL PPM, 5 μL N5 index, 5 μL N7 index, 4 μL ddH2O and 1 μL TAE.
  • the product was purified using 0.9X magnetic beads to obtain the final library.
  • add CutSmart buffer; after mixing, divide into two 14 μL portions marked "-" and "+". Add 1 μL ddH2O to the "-" portion as a negative control and 1 μL BspDI to the "+" portion, mix well, and react for 2h in a constant-temperature mixer at 37°C. The products were separated by 2% agarose gel electrophoresis, and whether BspDI could cut the library was used as the criterion for judging the efficiency of the library.
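  • one way to see why BspDI digestion can serve as a library check is to model the ligation junction in silico. The sketch below rests on an assumption not stated in the text: that the initial restriction enzyme leaves a 5'-GATC overhang (MboI is a common choice in Hi-C protocols). After fill-in and blunt-end ligation, the junction would then read GATCGATC, which contains the BspDI recognition sequence ATCGAT. The function name and flanking sequences are hypothetical.

```python
BSPDI_SITE = "ATCGAT"  # BspDI recognition sequence


def fill_in_and_ligate(left_flank: str, right_flank: str,
                       overhang: str = "GATC") -> str:
    """Model a proximity-ligation junction (top strand only).

    Assumes (hypothetically) a restriction enzyme that leaves a 5' `overhang`.
    After fill-in, each blunt end carries a full copy of the overhang
    sequence, so ligation places two copies back to back.
    """
    return left_flank + overhang + overhang + right_flank


# Made-up flanking sequences from two spatially adjacent loci.
left, right = "ACGTTGCA", "TTGACCGT"
junction = fill_in_and_ligate(left, right)

print(junction)
print(BSPDI_SITE in junction)               # site present at the junction
print(BSPDI_SITE in left + "GATC" + right)  # unligated-style sequence
```

    Under this assumption, only fragments that passed through fill-in and ligation carry the new site, so a size shift after BspDI digestion on the gel indicates that the library contains genuine ligation junctions.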
  • Sequencing is performed on Illumina's HiSeq XTen platform, and the specific operations are performed in accordance with official standards.
  • the peak map of the Agilent HS2100 library is shown in Figure 4.
  • the peak map shows that the library fragment length is between 200-1000 bp and the main peak is located at 400 bp, which is in line with the characteristics of a normal Hi-C library.
  • the library data analysis results of this example are shown in Table 2.
  • the valid data ratio (Valid) is 36.33%, which is about 10.33% higher than the prior art (where Valid is mostly about 29%); the single-end dangling noise value is 0.6%, which is significantly lower than the noise data value of existing Hi-C libraries; Cis and Dup are comparable to the prior art.
  • Example 2 Following the method of Example 1, a DNA library was constructed using mouse cells as the sample, then sequenced and quality-controlled. The difference is that 5 μL of transposase TTE Mix V50 was added. The results are as follows:
  • Example 1 adjusted the library length by adjusting the amount of transposase added so that the main peak was about 409bp, which significantly reduced the multiple alignment rate. Compared with Example 1, the unique alignment rate and effective data rate of this comparative example are significantly reduced, while the duplicate fragment rate is significantly increased.
  • with no base A addition step, the library construction time is significantly shortened, and the fragment length of the library product is suitable. The library can be sequenced directly without fragment size selection. The method is especially suitable for library construction from trace DNA samples (library construction from 10^3 cells); the effective data ratio of sequencing is high, the single-end dangling noise value is low, and the library construction efficiency is high.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Immunology (AREA)
  • Plant Pathology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

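The present invention provides a method for constructing a DNA library and applications thereof. The method comprises: providing labeled chimeric DNA; subjecting the labeled chimeric DNA to transposition to obtain a transposition product; capturing the transposition product to obtain captured DNA; and amplifying the captured DNA to obtain the DNA library.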

Description

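Method for constructing a DNA library and application thereof
Technical Field
The present invention relates to the field of biotechnology, specifically to a method for constructing a DNA library and applications thereof, and more specifically to a method for constructing a DNA library, a method for obtaining chromatin interaction information within the cells of an individual, a method for obtaining biological information of an individual, a three-dimensional genome research method, a method for prenatal diagnosis or cancer screening, a kit, and use of the kit in three-dimensional genome library construction, prenatal diagnosis, or cancer screening.
Background
Existing next-generation sequencing library construction involves many steps, and the multiple operations between end repair and PCR in particular are likely to lose valid fragments. The problem is more pronounced in three-dimensional genome Hi-C library construction: the biotin-labeled chimeric DNA that serves as the library template is present in relatively small amounts, so any loss of valid fragments after pull-down directly affects the quality of the final library. In addition, there is the fragment size selection step, which exists to make the library compatible with Illumina's sequencing-by-synthesis chemistry, because overly long fragments yield poor sequencing data. Size selection also discards a considerable share of usable library fragments purely because of their length; with nanogram-level template in particular, it reduces the number of valid library molecules and directly lowers the proportion of valid data.
Existing library construction methods therefore need improvement.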
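Summary of the Invention
The present invention aims to solve at least one of the technical problems in the prior art. To this end, one object of the invention is to provide a method for constructing a DNA library that introduces a transposase into library construction. This simplifies DNA fragmentation and adaptor addition and removes the need for end repair and 3' A-tailing. Library construction is fast, the library fragments have a suitable length and can be sequenced directly without size selection, and the proportion of valid sequencing data is high.
It should be noted that the present invention is based on the following work by the inventors:
The inventors introduced a transposase into Hi-C library construction. The transposase carries two short nucleic acids that serve as adaptors compatible with Illumina sequencing. When the transposase randomly fragments DNA, it simultaneously attaches adaptors to both ends of the resulting fragments, and amplification with specific primers then yields a sequenceable library, as shown in Figure 1. This markedly simplifies the workflow, shortens library construction time, and significantly increases the proportion of valid data in Hi-C libraries.
Accordingly, in a first aspect, the present invention provides a method for constructing a DNA library. According to embodiments of the invention, the method comprises: providing labeled chimeric DNA, wherein the labeled chimeric DNA carries three-dimensional structure information; subjecting the labeled chimeric DNA to transposition to obtain a transposition product; capturing the transposition product to obtain captured DNA; and amplifying the captured DNA to obtain the DNA library.
In the method for constructing a DNA library according to embodiments of the invention, during library construction and Hi-C library construction in particular, transposition simplifies DNA fragmentation and adaptor addition and eliminates the end-repair and 3' A-tailing steps. Library construction time is markedly shortened, and the library fragments have a suitable length and can be sequenced directly without size selection. The method is especially suitable for constructing Hi-C libraries from trace DNA samples, and it gives a high proportion of valid sequencing data and low dangling-end noise.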
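Further, based on the above method for constructing a DNA library, in a second aspect the invention provides a method for obtaining chromatin interaction information within the cells of an individual. According to embodiments of the invention, the method comprises: obtaining a DNA library of the individual using the aforementioned method; and sequencing and analyzing the DNA library to obtain the chromatin interaction information within the cells of the individual. As a result, the method has fewer steps and a shorter operating time, is especially suitable for Hi-C library construction from trace DNA samples, gives a high proportion of valid sequencing data and low dangling-end noise, and yields chromatin interaction information that benefits research in three-dimensional genomics. It should be noted that the method for constructing a DNA library used here has all the technical features and effects of the method described above, which are not repeated here.
Further, based on the above method for constructing a DNA library, in a third aspect the invention provides a method for obtaining biological information of an individual. According to embodiments of the invention, the method comprises: obtaining a DNA library of the individual using the aforementioned method for constructing a DNA library; and sequencing and analyzing the DNA library to obtain the biological information of the individual. As a result, the method has fewer steps and a shorter operating time, is especially suitable for Hi-C library construction from trace DNA samples, gives a high proportion of valid sequencing data and low dangling-end noise, and yields biological information that is useful for three-dimensional genome research and clinical diagnosis. It should be noted that the method for constructing a DNA library used here has all the technical features and effects of the method described above, which are not repeated here.
Further, in a fourth aspect, the invention provides a three-dimensional genome research method. According to embodiments of the invention, the method is carried out by means of the aforementioned method for constructing a DNA library, the aforementioned method for obtaining chromatin interaction information within the cells of an individual, or the aforementioned method for obtaining biological information of an individual. As a result, the methods for constructing a DNA library and for obtaining biological information of an individual have fewer steps and a shorter operating time, are especially suitable for library construction from trace DNA samples, give a high proportion of valid sequencing data and low dangling-end noise, and yield biological information suitable for three-dimensional genome research. It should be noted that the method for constructing a DNA library used here has all the technical features and effects of the method described above, which are not repeated here.
Further, in a fifth aspect, the invention provides a method for prenatal diagnosis or cancer screening. According to embodiments of the invention, the method is carried out by means of the aforementioned method for constructing a DNA library, the aforementioned method for obtaining biological information of an individual, or the aforementioned three-dimensional genome research method. As a result, the methods for constructing a DNA library and for obtaining biological information of an individual have fewer steps and a shorter operating time, are especially suitable for library construction from trace DNA samples, give a high proportion of valid sequencing data and low dangling-end noise, and yield biological information that is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening. It should be noted that the method for constructing a DNA library used here has all the technical features and effects of the method described above, which are not repeated here.
Further, in a sixth aspect, the invention provides a kit. According to embodiments of the invention, the kit comprises the reagents, primers, or mediating fragments used in the aforementioned method for constructing a DNA library, or a combination of at least one of them. With this kit, the methods for constructing a DNA library and for obtaining chromatin interaction information and biological information within the cells of an individual have fewer steps and a shorter operating time, are especially suitable for library construction from trace DNA samples, give a high proportion of valid sequencing data and low dangling-end noise, and yield biological information that is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening. It should be noted that the kit has all the technical features and effects of the method for constructing a DNA library described above, which are not repeated here.
Further, in a seventh aspect, the invention provides use of the aforementioned kit in three-dimensional genome library construction, prenatal diagnosis, or cancer screening. With this kit, the methods for constructing a DNA library and for obtaining chromatin interaction information and biological information within the cells of an individual have fewer steps and a shorter operating time, are especially suitable for library construction from trace DNA samples, and give a high proportion of valid sequencing data and low dangling-end noise. The kit is suitable for three-dimensional genome library construction, and the biological information obtained is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening.
Additional aspects and advantages of the invention will be given in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.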
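Brief Description of the Drawings
The above and/or additional aspects and advantages of the invention will become apparent and readily understood from the following description of embodiments in conjunction with the drawings, in which:
Figure 1 is a schematic comparison of workflows for the method for constructing a DNA library according to one embodiment of the invention;
Figure 2 is a schematic diagram of the principle by which Tn5 transposase removes dangling-end noise data, according to one embodiment of the invention;
Figure 3 is a schematic agarose gel electrophoresis image of the restriction-digest quality control of a library according to one embodiment of the invention;
Figure 4 is a schematic Agilent HS2100 trace of a library according to one embodiment of the invention;
Figure 5 is a schematic Agilent HS2100 trace of a library according to a comparative example of the invention.
Detailed Description
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain the invention and are not to be construed as limiting it.
It should be noted that the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly specifying the number of the technical features indicated. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature. Further, in the description of the invention, "a plurality of" means two or more, unless otherwise stated.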
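Method for constructing a DNA library
According to the first aspect, the invention provides a method for constructing a DNA library. In this method, during library construction and Hi-C library construction in particular, transposition simplifies DNA fragmentation and adaptor addition and eliminates the end-repair and 3' A-tailing steps, markedly shortening library construction time; according to embodiments of the invention, a library can be built in as little as 3 hours after DNA extraction. The library fragments have a suitable length and can be sequenced directly without size selection, which makes the method especially suitable for Hi-C libraries from trace DNA samples. The proportion of valid sequencing data is high: in some embodiments it exceeds 35%, nearly 10% above the prior art. Dangling-end noise is low: in some embodiments the invalid dangling-end noise data amount to only 0.6%.
In the method according to embodiments of the invention, transposition simplifies DNA fragmentation and adaptor addition. Moreover, transposition does not act on the ends of small DNA fragments, so small fragments with biotin at a single-stranded end receive no adaptor and therefore cannot be amplified by PCR. This markedly reduces or even eliminates dangling-end noise in the Hi-C library.
In the method according to embodiments of the invention, transposition cuts the DNA into small fragments of 200-500 bp, so that the main peak of the library fragment length after PCR lies within about 300-600 bp. The library can be sequenced directly without size selection, which further simplifies the procedure.
The method according to embodiments of the invention simplifies the experimental steps and reduces sample loss during the procedure, so the sample input can be lowered to the order of 10³ cells.
To facilitate understanding of the method for constructing a DNA library according to embodiments of the invention, the method is explained below. The method comprises: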
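S100: Providing DNA
According to embodiments of the invention, labeled chimeric DNA is provided, wherein the labeled chimeric DNA carries three-dimensional structure information. Specifically, the library construction method of the embodiments uses labeled chimeric DNA with three-dimensional structure information to construct a Hi-C high-throughput sequencing library, and uses high-throughput sequencing together with bioinformatics methods to study the spatial relationships of chromatin DNA. By capturing DNA interaction patterns, high-resolution information on the three-dimensional structure of chromatin is obtained.
According to embodiments of the invention, the label is biotin. Labeling the DNA with biotin facilitates subsequent pull-down and purification of the DNA.
According to embodiments of the invention, the labeled chimeric DNA contains portions of spatially adjacent DNA segments. In other words, the labeled chimeric DNA is not a single continuous DNA fragment on the chromatin in the original nucleus, but a chimera formed from at least two spatially proximate DNA segments. Analyzing long-range chromatin interactions by proximity-ligation-based DNA interaction analysis and protein-specific DNA binding then helps to define the target genes of cis-regulatory elements and to annotate the function of non-coding sequence variants associated with various physiological and pathological conditions, for use in pathological research on clinical diseases and especially in exploring cancer mechanisms.
Specifically, according to embodiments of the invention, the method for obtaining the labeled chimeric DNA comprises: fixing and cross-linking the chromatin in cells to form DNA-protein cross-links; digesting the DNA-protein cross-links with a restriction enzyme to generate DNA-protein complexes with sticky ends; filling in the sticky ends with one or more biotin-labeled nucleotides together with ordinary unlabeled nucleotides to produce blunt ends, then ligating the blunt ends to form proximity-ligated DNA (if all the chromatin in the cell is fixed, the proximity-ligated DNA is genomic DNA); and fragmenting the genomic DNA to obtain the labeled chimeric DNA.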
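S200: Transposition
According to embodiments of the invention, the labeled chimeric DNA is subjected to transposition to obtain a transposition product. A single transposition step thus fragments the labeled chimeric DNA and adds adaptors, replacing the DNA fragmentation, end repair, 3' A-tailing, and adaptor ligation steps of the prior art. This markedly simplifies the workflow and shortens library construction time.
According to embodiments of the invention, transposition is carried out with a transposase. Tn5 transposase is used here as an example to explain the transposition step. In some embodiments of the invention, the Tn5 transposase used is the Tn5 transposase reagent developed by Epicentre. The transposase carries two short nucleic acids which, for library construction, can be Illumina sequencing adaptors. When the transposase randomly fragments DNA, it simultaneously attaches adaptors to both ends of the resulting fragments, and amplification with specific primers then yields a sequenceable library.
The inventors found that introducing a transposase, Tn5 transposase in particular, into Hi-C library construction has at least one of the following advantages:
First, simpler library construction. Building the library with a transposase, Tn5 transposase in particular, replaces the DNA fragmentation, end repair, 3' A-tailing, and adaptor ligation steps of the prior art. A single transposition step fragments the DNA and adds adaptors to the fragments; the adaptor-tagged fragments have a suitable length, and direct PCR amplification yields the library. A comparison of the method of embodiments of the invention with prior-art library construction is shown in Figure 1: the workflow of the present application is clearly simpler and library construction time is markedly shorter. According to embodiments of the invention, the method completes library construction in only 3 hours after DNA extraction.
Second, a higher proportion of valid library molecules. In existing Hi-C workflows, a major cause of a low proportion of usable data is excessive dangling-end noise. It arises because blunt-end ligation after biotin labeling is inefficient, so DNA fragments that failed to ligate but carry biotin at a single-stranded end are also pulled down by the streptavidin beads. A high share of such data indicates inefficient library construction and can even cause the library to fail. Tn5 transposase reduces dangling ends because it does not act on the ends of overly short DNA fragments, for example fragments shorter than 200 bp. Small DNA fragments with biotin at a single-stranded end therefore receive no adaptor and cannot be amplified by PCR. Small fragments that do receive adaptors but carry no internal biotin label could be amplified by PCR but are not pulled down by the streptavidin beads, as shown in Figure 2. The properties of Tn5 transposase thus markedly reduce or even eliminate dangling-end noise in the Hi-C library. According to embodiments of the invention, the proportion of valid data increases by about 10%.
Third, no size selection of the library. Tn5 transposase cuts the DNA into small fragments of 200-500 bp, so the main peak of library fragment length after PCR lies within about 300-600 bp. The library can be sequenced directly without size selection and without further fragmentation, which simplifies the workflow.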
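The dangling-end filtering described in the second advantage can be pictured as a rule with two conditions. The following Python sketch is only an illustrative model and not part of the disclosed method: the `Piece` type and its fields are hypothetical names, and the sketch assumes that a DNA piece reaches the final library only if it carries adaptors at both ends (so that PCR can amplify it) and a biotin label (so that the streptavidin beads capture it).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Piece:
    """A DNA piece after transposition (hypothetical model for illustration)."""
    adaptor_ends: int  # number of ends carrying a Tn5 adaptor: 0, 1 or 2
    has_biotin: bool   # True if the piece carries a biotin label


def is_amplifiable(piece: Piece) -> bool:
    # PCR with adaptor-specific primers needs an adaptor at both ends.
    return piece.adaptor_ends == 2


def is_captured(piece: Piece) -> bool:
    # Streptavidin beads pull down biotin-labeled pieces only.
    return piece.has_biotin


def enters_library(piece: Piece) -> bool:
    # A piece must be both captured and amplifiable to be sequenced.
    return is_captured(piece) and is_amplifiable(piece)


ligation_junction = Piece(adaptor_ends=2, has_biotin=True)  # chimeric Hi-C piece
dangling_end = Piece(adaptor_ends=0, has_biotin=True)       # short unligated end
unlabeled = Piece(adaptor_ends=2, has_biotin=False)         # no junction inside

for name, piece in [("junction", ligation_junction),
                    ("dangling end", dangling_end),
                    ("unlabeled", unlabeled)]:
    print(name, enters_library(piece))
```

Under these assumptions only the ligation junction passes both filters, which mirrors the reasoning given for the low dangling-end value.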
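According to embodiments of the invention, the ratio of the labeled chimeric DNA to the transposase is 10 ng : 50-100 nM. This ratio favors repeated transposition of the labeled chimeric DNA. The inventors found in testing that an excessive transposase input, for example 200 nM, makes the library fragments too small, with a main peak at about 290 bp. Because a Hi-C library molecule is a chimera of two DNA fragments, sequencing reads are aligned to the genome by taking one segment from each end of the library molecule. If the library fragments are too small, the unique-alignment rate of the valid portion of the sequencing data becomes too low and the multiple-alignment rate of the invalid portion becomes too high. The inventors' testing therefore showed that with a DNA-to-transposase ratio of 10 ng : 50-100 nM the library fragment length is more suitable (main peak at 300-600 bp).
According to embodiments of the invention, based on 10 ng of the labeled chimeric DNA, the transposition reaction system comprises: 8-12 μL transposition buffer; 0.2-1 μL 10% Tween 20; 7-10 μL water; and 0.5-3 μL of the transposase, wherein the transposition buffer comprises 10 mM Tris-HCl pH 7.6 and 5 mM MgCl2. In this reaction system, transposition yields DNA fragments of suitable size.
According to embodiments of the invention, transposition is carried out at 50-60°C for 5-15 minutes. These temperature conditions favor fragmentation of the DNA to a suitable length range.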
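The reaction-system ranges above lend themselves to a quick consistency check. The Python sketch below is a convenience illustration only: the dictionary keys and the `out_of_range` helper are hypothetical names, the ranges are those stated for 10 ng of DNA, and the two setups use the volumes reported later in Example 1 and Comparative Example 1 (the DNA input of those runs is assumed to be comparable).

```python
# Ranges stated in the description for a reaction based on 10 ng of DNA.
STATED_RANGES = {
    "transposition_buffer_uL": (8.0, 12.0),
    "tween20_10pct_uL": (0.2, 1.0),
    "water_uL": (7.0, 10.0),
    "transposase_uL": (0.5, 3.0),
    "temperature_C": (50.0, 60.0),
    "time_min": (5.0, 15.0),
}


def out_of_range(setup):
    """Return the names of all parameters that fall outside the stated ranges."""
    return [name for name, (low, high) in STATED_RANGES.items()
            if not low <= setup[name] <= high]


# Volumes and conditions reported for Example 1.
example_1 = {
    "transposition_buffer_uL": 10.0,
    "tween20_10pct_uL": 0.5,
    "water_uL": 8.5,
    "transposase_uL": 1.0,
    "temperature_C": 55.0,
    "time_min": 10.0,
}
# Comparative Example 1 differs only in the transposase volume.
comparative_1 = dict(example_1, transposase_uL=5.0)

print(out_of_range(example_1))
print(out_of_range(comparative_1))
```

Example 1 sits inside every stated range, while the comparative example exceeds the transposase range, in line with the smaller fragments reported for it.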
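S300: Capture
According to embodiments of the invention, the transposition product is captured to obtain captured DNA. The labeled, adaptor-tagged DNA is thereby captured from the transposition reaction for subsequent amplification, which reduces interference from contaminating DNA during amplification.
According to embodiments of the invention, the capture is a pull-down. According to preferred embodiments of the invention, the pull-down is carried out with streptavidin magnetic beads. Specifically, the streptavidin beads bind the biotin label on the DNA and pull down, from the transposition product, the biotin-labeled chimeric DNA fragments with adaptors at both ends.
According to embodiments of the invention, based on 1 ng of the extracted DNA, 5-10 μL of streptavidin magnetic beads is added. This captures the biotin-labeled DNA with adaptors at both ends thoroughly while avoiding waste from excess reagent.
S400: Amplification
According to embodiments of the invention, the captured DNA is amplified to obtain the DNA library. Specifically, the captured DNA can be amplified by PCR to obtain sufficient material. According to embodiments of the invention, the PCR-amplified library can be further purified.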
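Applications of the library construction method
Further, based on the above method for constructing a DNA library, in a second aspect the invention provides a method for obtaining chromatin interactions within the cells of an individual. According to embodiments of the invention, the method comprises: obtaining a DNA library of the individual using the aforementioned method; and sequencing and analyzing the DNA library to obtain biological information such as chromatin interactions within the cells of the individual. As a result, the method has fewer steps and a shorter operating time, is especially suitable for library construction from trace DNA samples, gives a high proportion of valid sequencing data and low dangling-end noise, and yields biological information that is useful for pathological research on clinical diseases and for scientific research on the three-dimensional genome. It should be noted that the method for constructing a DNA library used here has all the technical features and effects of the method described above, which are not repeated here.
According to embodiments of the invention, sequencing can be performed by the following methods: classical Sanger sequencing, massively parallel sequencing, next-generation sequencing, polony sequencing, 454 pyrosequencing, Illumina sequencing, SOLEXA sequencing, SOLiD sequencing, ion semiconductor sequencing, DNA nanoball sequencing, Heliscope single-molecule sequencing, single-molecule real-time sequencing, nanopore DNA sequencing, tunneling-current DNA sequencing, sequencing by hybridization, mass spectrometry sequencing, microfluidic Sanger sequencing, microscopy-based sequencing, RNA polymerase sequencing, in vitro virus high-throughput sequencing, Maxam-Gilbert sequencing, single-end sequencing, paired-end sequencing, deep sequencing, and ultra-deep sequencing. Illumina sequencing is especially suitable.
Then, according to embodiments of the invention, the sequencing reads can be processed with a bioinformatics pipeline to map long-range and/or genome-wide chromatin interactions, thereby obtaining biological information such as chromatin interactions within the cells of the individual.
Further, based on the above method for constructing a DNA library, in a third aspect the invention provides a method for obtaining biological information of an individual. According to embodiments of the invention, the method comprises: obtaining a DNA library of the individual using the aforementioned method; and sequencing and analyzing the DNA library to obtain the biological information of the individual. As a result, the method has fewer steps and a shorter operating time, is especially suitable for Hi-C library construction from trace DNA samples, gives a high proportion of valid sequencing data and low dangling-end noise, and yields biological information that is useful for three-dimensional genome research and clinical diagnosis. It should be noted that the method for constructing a DNA library used here has all the technical features and effects of the method described above, which are not repeated here.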
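Further, in a fifth aspect, the invention provides a method for prenatal diagnosis or cancer screening. According to embodiments of the invention, the method is carried out by means of the aforementioned method for constructing a DNA library, the aforementioned method for obtaining biological information of an individual, or the aforementioned three-dimensional genome research method. As a result, the methods for constructing a DNA library and for obtaining biological information of an individual have fewer steps and a shorter operating time, are especially suitable for library construction from trace DNA samples, give a high proportion of valid sequencing data and low dangling-end noise, and yield biological information that is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening. It should be noted that the method for constructing a DNA library used here has all the technical features and effects of the method described above, which are not repeated here.
Further, in a sixth aspect, the invention provides a kit. According to embodiments of the invention, the kit comprises the reagents, primers, or mediating fragments used in the aforementioned method for constructing a DNA library, or a combination of at least one of them. With this kit, the methods for constructing a DNA library and for obtaining chromatin interaction information and biological information within the cells of an individual have fewer steps and a shorter operating time, are especially suitable for library construction from trace DNA samples, give a high proportion of valid sequencing data and low dangling-end noise, and yield biological information that is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening. It should be noted that the kit has all the technical features and effects of the method for constructing a DNA library described above, which are not repeated here.
Further, in a seventh aspect, the invention provides use of the aforementioned kit in three-dimensional genome library construction, prenatal diagnosis, or cancer screening. With this kit, the methods for constructing a DNA library and for obtaining chromatin interaction information and biological information within the cells of an individual have fewer steps and a shorter operating time, are especially suitable for library construction from trace DNA samples, and give a high proportion of valid sequencing data and low dangling-end noise. The kit is suitable for three-dimensional genome library construction, and the biological information obtained is useful for clinical diagnosis, especially prenatal diagnosis and cancer screening.
It should be noted here that the kit can be used for any application apparent to those skilled in the art. The kit may contain, for example, a variety of association molecules, affinity tags, fixatives, restriction endonucleases, ligases, and/or combinations thereof. In some cases, the association molecules may be proteins, including, for example, DNA-binding proteins (such as histones or transcription factors). In some cases, the fixative may be formaldehyde or any other DNA cross-linking agent. In some cases, the kit may also contain a variety of beads. The beads may be paramagnetic and/or coated with a capture agent; for example, they may be coated with streptavidin and/or antibodies. In some cases, the kit may contain adaptor oligonucleotides and/or sequencing primers. In addition, the kit may contain a device capable of amplifying read pairs using the adaptor oligonucleotides and/or sequencing primers. In some cases, the kit may also contain other reagents, including but not limited to lysis buffer, ligation reagents (for example, dNTPs, polymerase, polynucleotide kinase, and/or ligase buffer) and PCR reagents (for example, dNTPs, polymerase, and/or PCR buffer). The kit may also include instructions for using the kit components and/or generating read pairs.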
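The invention is described below with reference to specific examples. It should be noted that these examples are merely illustrative and are not to be construed as limiting the invention.
The solutions of the invention are explained below in conjunction with examples. Those skilled in the art will understand that the following examples serve only to illustrate the invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in the examples, the techniques or conditions described in the literature of the field (for example, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition, translated by Huang Peitang et al., Science Press) or the product instructions were followed. Reagents and instruments whose manufacturers are not indicated are conventional, commercially available products, which can be purchased, for example, from Illumina.
Example 1
Using the method of embodiments of the invention, a DNA library was constructed from mouse cells, sequenced, and quality-controlled, as follows:
I. Experimental methods
1. Preparation of experimental material
The mouse 3T3-NIH cell line was used as the experimental material. Frozen 3T3-NIH cells were thawed rapidly in a 37°C water bath and, in a biosafety cabinet, transferred into 9 ml of cell culture medium and mixed. The medium consisted of 15% fetal bovine serum, 84% DMEM, and 1% penicillin-streptomycin antibiotic solution (all V/V). The cells were centrifuged at 1000 rpm and 23°C for 10 min, the supernatant was discarded, and 5 ml of medium was added. The resuspended cells were transferred to a culture flask and cultured statically in a cell incubator at 37°C and 5% CO2.
When the cells reached about 80% confluence (after about 1-2 days), the flask was taken out, the medium was poured off in a biosafety cabinet, and the cells were washed once by gentle swirling with 5 ml PBS. The cells were digested with 1 ml trypsin for 1 min, then 5 ml of medium was added and the cells were pipetted thoroughly to detach them from the flask. The cell suspension was transferred to a 1.5 ml centrifuge tube and centrifuged at 1000 rpm and 23°C for 10 min, and the supernatant was discarded. The cells were resuspended and washed once in 1 ml PBS, centrifuged again under the same conditions, and the supernatant was removed. The cells were resuspended in 500 μL PBS and counted with a hemocytometer. After the concentration was calculated, 1000 cells were transferred to a 1.5 ml centrifuge tube labeled "3T3-1K-4".
2. Cell cross-linking
The cell suspension was brought to 100 μL with PBS and mixed by gentle pipetting. 2.78 μL of 37% formaldehyde was added, mixed by pipetting, and incubated at room temperature for 10 min with occasional shaking. 11 μL of 2.5 M glycine was then added, mixed by pipetting, and incubated at room temperature for 10 min and then on ice for 15 min to stop cross-linking completely. The sample was centrifuged at 1000×g and 4°C for 10 min, with the tube orientation marked. The supernatant was discarded and the pellet retained, and the cells were resuspended in 100 μL PBS by pipetting slowly 10 times. After another centrifugation at 1000×g and 4°C for 10 min, the supernatant was removed. The pellet is the cross-linked cell material.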
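The volumes in the cross-linking step imply the working concentrations of formaldehyde and glycine. The short Python sketch below works them out from the stated volumes; the helper name `final_concentration` is a hypothetical choice, and the calculation assumes simple additive volumes.

```python
def final_concentration(stock_conc, stock_volume, start_volume):
    """Concentration after adding stock_volume of stock to start_volume of sample."""
    return stock_conc * stock_volume / (start_volume + stock_volume)


cell_suspension_uL = 100.0

# 2.78 uL of 37% formaldehyde added to 100 uL of cell suspension.
formaldehyde_pct = final_concentration(37.0, 2.78, cell_suspension_uL)

# 11 uL of 2.5 M glycine added to the 102.78 uL cross-linking reaction.
glycine_M = final_concentration(2.5, 11.0, cell_suspension_uL + 2.78)

print(f"formaldehyde: {formaldehyde_pct:.2f} %")
print(f"glycine: {glycine_M:.3f} M")
```

The result is about 1% formaldehyde for fixation and about 0.24 M glycine for quenching.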
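3. Cell lysis
45 μL of cell lysis buffer (10 mM Tris-HCl pH 7.4, 10 mM NaCl, 0.1 mM EDTA, 0.5% NP-40) and 5 μL of protease inhibitor were added to the cross-linked cells, mixed by pipetting, and left on ice for 1 h to lyse. After lysis, the sample was centrifuged at 2500 g and 4°C for 5 min and the supernatant was removed. 20 μL of cell lysis buffer and 10 μL of 0.5% SDS were added, and the sample was incubated in a thermomixer at 62°C for 10 min. 5 μL of 10% Triton X-100 was then added, and the sample was incubated in a thermomixer at 37°C for 30 min.
4. Chromatin digestion
5 μL of 10X NEBuffer 2 and 50 U of MboI were added to the reaction mixture from the previous step, the volume was brought to 50 μL with ddH2O, and the sample was incubated in a thermomixer at 37°C for 4 h, with shaking at 1400 rpm for 15 s every 2 min.
5. Terminal biotin labeling
1.5 μL each of 1 mM dATP, 1 mM dGTP, and 1 mM dTTP, 3.75 μL of 0.4 mM biotin-14-dCTP, and 10 U of Klenow Fragment were added to the digestion product, and the sample was incubated in a thermomixer at 37°C for 90 min, with shaking at 1400 rpm for 15 s every 2 min.
6. Blunt-end ligation
60 μL of ligation buffer was added to the biotin-labeled product, comprising 26.5 μL ddH2O, 7 μL 10% Triton X-100, 24 μL 5X T4 ligase buffer, 1.2 μL 10 mg/ml BSA, and 400 U T4 DNA ligase. The sample was incubated in a thermomixer at 16°C for at least 6 h, with shaking at 1400 rpm for 15 s every 2 min.
7. Reversal of cross-linking
5 μL of 20 mg/ml proteinase K and 12 μL of 10% SDS were added to the ligation product, and the sample was incubated in a thermomixer at 55°C for 30 min. 13 μL of 5 M NaCl was then added, and the sample was incubated in a thermomixer at 65°C for 4 h, with shaking at 1400 rpm for 15 s every 2 min.
8. DNA extraction
After cross-link reversal, the sample was cooled on ice, 2 μL of 5 mg/ml glycogen was added, and the sample was mixed thoroughly by pipetting. The total volume at this point was about 150 μL. Two volumes of absolute ethanol were added, and the sample was mixed, spun briefly, and held at -80°C for 30 min to precipitate the DNA. It was then centrifuged at 18000 g and 4°C for 15 min and the supernatant was discarded. The pellet was washed twice with 80% ethanol, dissolved in 30 μL of 10 mM Tris-HCl pH 7.4, and quantified.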
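9. Tn5 transposition and adaptor addition
The transposition reaction used the TruePrep™ DNA Library Prep Kit V2 from Vazyme (诺唯赞). 10 μL TTBL, 0.5 μL 10% Tween 20, 8.5 μL ddH2O, and 1 μL of transposase TTE Mix V50 were added to the DNA solution, mixed, and incubated at 55°C for 10 min. The product was purified with 1.8X magnetic beads and eluted in 20 μL of 10 mM Tris-HCl pH 7.4 to give the transposition product.
10. Biotin pull-down
10 μL of streptavidin magnetic beads was washed and mixed with the transposition product, and binding was allowed to proceed at room temperature for 40 min. The tube was placed on a magnetic rack and the supernatant was removed. The beads were washed once with 200 μL of bead washing buffer, twice with 50 μL of 0.1 M NaOH, and twice with 100 μL of 10 mM Tris-HCl pH 7.4, then resuspended in 20 μL of 10 mM Tris-HCl pH 7.4.
11. PCR amplification
The PCR reaction mix was prepared with the TruePrep™ DNA Library Prep Kit V2 from Vazyme (诺唯赞) and comprised 10 μL 5X TAD, 5 μL PPM, 5 μL N5 index, 5 μL N7 index, 4 μL ddH2O, and 1 μL TAE. The pull-down product from the previous step was added to the PCR mix, mixed, and run in a thermal cycler with the following program: 72°C, 5 min; 98°C, 30 s; 15 cycles of (98°C, 15 s; 60°C, 30 s; 72°C, 30 s); 72°C, 5 min. The product was purified with 0.9X magnetic beads to give the final library.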
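The PCR recipe and cycling program above can be cross-checked with a little arithmetic. The Python sketch below is only a bookkeeping aid with hypothetical variable names. It assumes that the template is the full 20 μL bead suspension from the pull-down step, and it counts programmed hold times only, ignoring thermal ramping.

```python
# PCR mix components in microlitres, as listed in the protocol.
pcr_mix_uL = {"5X TAD": 10, "PPM": 5, "N5 index": 5,
              "N7 index": 5, "ddH2O": 4, "TAE": 1}
template_uL = 20  # assumption: the whole bead suspension from the pull-down

mix_volume_uL = sum(pcr_mix_uL.values())
reaction_volume_uL = mix_volume_uL + template_uL
# Final strength of the 5X buffer in the complete reaction.
buffer_strength_x = 5 * pcr_mix_uL["5X TAD"] / reaction_volume_uL

# Each entry: ([(temperature in C, seconds), ...], number of repeats).
program = [
    ([(72, 300)], 1),
    ([(98, 30)], 1),
    ([(98, 15), (60, 30), (72, 30)], 15),
    ([(72, 300)], 1),
]
hold_seconds = sum(repeats * sum(seconds for _, seconds in block)
                   for block, repeats in program)

print(mix_volume_uL, reaction_volume_uL, buffer_strength_x)
print(hold_seconds / 60, "min of programmed hold time")
```

With these assumptions the 5X buffer ends up at 1X in a 50 μL reaction, and the program holds for about 29 minutes in total.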
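12. Library quality control
The library was amplified with universal primers, and 200 ng was brought to 25 μL with ddH2O before 3 μL of 10X CutSmart buffer was added. After mixing, the sample was divided into two 14 μL portions labeled "-" and "+". 1 μL ddH2O was added to the "-" tube as a negative control and 1 μL BspDI to the "+" tube. Both were mixed and incubated in a thermomixer at 37°C for 2 h. The products were separated by 2% agarose gel electrophoresis, and whether BspDI could cut the library was used as the criterion for judging library efficiency.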
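The BspDI digest works as a quality control because of the sequence created at each ligation junction. MboI cuts before GATC, the fill-in completes GATC on both ends, and blunt-end ligation joins them into GATCGATC, which contains the BspDI recognition sequence ATCGAT. The Python sketch below illustrates this with string operations; the flanking sequences are arbitrary made-up examples and the function name is hypothetical.

```python
MBOI_SITE = "GATC"     # MboI recognition sequence, cut before the G
BSPDI_SITE = "ATCGAT"  # BspDI recognition sequence


def filled_in_junction(left_flank, right_flank):
    """Sequence formed when two filled-in MboI ends are blunt-end ligated."""
    # After fill-in, each blunt end carries a complete GATC.
    return left_flank + MBOI_SITE + MBOI_SITE + right_flank


# Arbitrary flanks chosen only for illustration.
uncut_site = "AAAC" + MBOI_SITE + "TTTG"
junction = filled_in_junction("AAAC", "TTTG")

print(junction, BSPDI_SITE in junction)
print(uncut_site, BSPDI_SITE in uncut_site)
```

Only molecules that went through fill-in and ligation gain the new site, so a downward band shift after BspDI digestion indicates that a large share of library molecules contain a ligation junction.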
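3 μL of the library was used to determine the concentration by qPCR, and another 3 μL to check the insert fragment length on an Agilent HS2100.
13. Sequencing
Sequencing was performed on Illumina's HiSeq X Ten platform, with all operations following the official standards.
14. Data quality control
HiC-Pro software was used for read alignment and for identifying and classifying the molecule types in the library.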
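The classification of library molecules can be pictured with a simplified rule based on restriction fragments and read orientation. The Python sketch below is a hypothetical illustration and not HiC-Pro's actual code or interface: it assumes each mate has already been assigned a restriction-fragment ID, a position, and a strand, and it omits categories such as re-ligation and duplicates that a real pipeline also reports.

```python
from collections import Counter


def classify_pair(frag_a, pos_a, strand_a, frag_b, pos_b, strand_b):
    """Classify one read pair from fragment IDs, positions and strands."""
    if frag_a != frag_b:
        return "valid"         # mates lie on different restriction fragments
    # Order the mates by position so the orientation test is unambiguous.
    (_, first), (_, second) = sorted([(pos_a, strand_a), (pos_b, strand_b)])
    if first == "+" and second == "-":
        return "dangling_end"  # inward-facing mates on one unligated fragment
    if first == "-" and second == "+":
        return "self_circle"   # outward-facing mates on one circularized fragment
    return "other"


# Made-up read pairs: (fragment, position, strand) for each mate.
pairs = [
    (1, 100, "+", 7, 9000, "-"),
    (2, 500, "+", 2, 650, "-"),
    (3, 900, "-", 3, 1200, "+"),
    (4, 300, "+", 9, 7000, "+"),
]
counts = Counter(classify_pair(*pair) for pair in pairs)
valid_fraction = counts["valid"] / len(pairs)
print(dict(counts), valid_fraction)
```

In this toy input, half of the pairs are valid and one pair is a dangling end; the Valid and dangling-end percentages reported for a library are fractions of this kind computed over all sequenced pairs.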
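II. Experimental results
1. Concentrations measured during the procedure
Table 1. Concentrations measured during library construction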
Figure PCTCN2019130250-appb-000001
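The concentration at each step is shown in Table 1; all values were normal.
2. Restriction-digest quality control of the library
The agarose gel electrophoresis image of the restriction-digest quality control is shown in Figure 3. The band in the "+" lane is clearly shifted downward relative to the "-" lane, showing that BspDI can cut the library fragments and indicating high library efficiency.
3. Agilent HS2100 trace of the library
The Agilent HS2100 trace of the library is shown in Figure 4. The trace shows library fragment lengths between 200 and 1000 bp with the main peak at 400 bp, consistent with a normal Hi-C library.
4. Analysis of the library sequencing data
The data analysis results for the library of this example are shown in Table 2. The valid data (Valid) amount to 36.33%, about 10.33% higher than the prior art (where Valid is mostly around 29%). The invalid dangling-end noise data amount to only 0.6%, significantly lower than the noise value of existing Hi-C libraries. Cis and Dup are on par with the prior art.
Table 2. Final data analysis results for the library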
Figure PCTCN2019130250-appb-000002
Figure PCTCN2019130250-appb-000003
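Comparative Example 1
A DNA library was constructed from mouse cells, sequenced, and quality-controlled following the method of Example 1, except that 5 μL of transposase TTE Mix V50 was added. The results are as follows:
Table 3. Final data analysis results for the library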
Figure PCTCN2019130250-appb-000004
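The library trace for this comparative example is shown in Figure 5, with the main peak at about 275 bp. In Example 1, adjusting the amount of transposase added tuned the library length so that the main peak was at about 409 bp, which markedly reduced multiple alignments. Compared with Example 1, the unique-alignment rate and the valid-data rate of this comparative example are markedly lower, while the duplicate-fragment rate is markedly higher.
In summary, in the method for constructing a DNA library according to embodiments of the invention, transposition simplifies DNA fragmentation and adaptor addition during Hi-C library construction and eliminates the end-repair and 3' A-tailing steps. Library construction time is markedly shortened, and the library fragments have a suitable length and can be sequenced directly without size selection. The method is especially suitable for library construction from trace DNA samples (libraries from on the order of 10³ cells), gives a high proportion of valid sequencing data and low dangling-end noise, and builds libraries efficiently.
In this specification, descriptions referring to the terms "one embodiment", "some embodiments", "an example", "a specific example", or "some examples" mean that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the invention. Such expressions in this specification do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the invention have been shown and described, those of ordinary skill in the art will understand that various changes, modifications, substitutions, and variations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.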

Claims (17)

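  1. A method for constructing a DNA library, characterized by comprising:
    providing labeled chimeric DNA, wherein the labeled chimeric DNA carries three-dimensional structure information;
    subjecting the labeled chimeric DNA to transposition to obtain a transposition product;
    capturing the transposition product to obtain captured DNA; and
    amplifying the captured DNA to obtain the DNA library.
  2. The method according to claim 1, characterized in that the label is biotin.
  3. The method according to claim 1, characterized in that the labeled chimeric DNA contains portions of spatially proximate DNA segments.
  4. The method according to claim 1, characterized in that the method for obtaining the labeled chimeric DNA comprises:
    fixing and cross-linking the chromatin in cells to form DNA-protein cross-links;
    digesting the DNA-protein cross-links to generate DNA-protein complexes with sticky ends; and
    filling in the sticky ends with nucleotides comprising one or more of the label to produce blunt ends, then ligating the blunt ends to form proximity-ligated genomic DNA.
  5. The method according to claim 1, characterized in that the transposition is carried out with a transposase.
  6. The method according to claim 4, characterized in that the transposase is Tn5 transposase.
  7. The method according to claim 4, characterized in that the ratio of the labeled chimeric DNA to the transposase is 10 ng : 50-100 nM.
  8. The method according to claim 4, characterized in that, based on 10 ng of the labeled chimeric DNA, the transposition reaction system comprises:
    8-12 μL transposition buffer;
    0.2-1 μL 10% Tween 20;
    7-10 μL water; and
    0.5-3 μL of the transposase.
  9. The method according to claim 4, characterized in that the transposition is carried out at 50-60°C for 5-15 minutes.
  10. The method according to claim 2, characterized in that the extraction is a pull-down, preferably with streptavidin magnetic beads.
  11. The method according to claim 10, characterized in that, based on 1 ng of the extracted DNA, 5-10 μL of the streptavidin magnetic beads is added.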
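  12. A method for obtaining chromatin interaction information within the cells of an individual, characterized by comprising:
    obtaining a DNA library of the individual using the method for constructing a DNA library according to any one of claims 1-11; and
    sequencing and analyzing the DNA library to obtain the chromatin interaction information within the cells of the individual.
  13. A method for obtaining biological information of an individual, characterized by comprising:
    obtaining a DNA library of the individual using the method for constructing a DNA library according to any one of claims 1-11; and
    sequencing and analyzing the DNA library to obtain the biological information of the individual.
  14. A three-dimensional genome research method, characterized in that the method is carried out by means of the method for constructing a DNA library according to claims 1-11, the method for obtaining chromatin interaction information within the cells of an individual according to claim 12, or the method for obtaining biological information of an individual according to claim 13.
  15. A method for prenatal diagnosis or cancer screening, characterized in that the method is carried out by means of the method for constructing a DNA library according to claims 1-11, the method for obtaining chromatin interaction information within the cells of an individual according to claim 12, the method for obtaining biological information of an individual according to claim 13, or the three-dimensional genome research method according to claim 14.
  16. A kit, characterized by comprising: the reagents, primers, or mediating fragments used in the method for constructing a DNA library according to claims 1-11, or a combination of at least one of them.
  17. Use of the kit according to claim 16 in three-dimensional genome library construction, prenatal diagnosis, or cancer screening.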
PCT/CN2019/130250 2019-08-12 2019-12-31 Method for constructing DNA library and application thereof WO2021027236A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910740285.5A CN110607352A (zh) 2019-08-12 2019-08-12 Method for constructing DNA library and application thereof
CN201910740285.5 2019-08-12

Publications (1)

Publication Number Publication Date
WO2021027236A1 true WO2021027236A1 (zh) 2021-02-18

Family

ID=68889999

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/130250 WO2021027236A1 (zh) 2019-08-12 2019-12-31 Method for constructing DNA library and application thereof

Country Status (2)

Country Link
CN (1) CN110607352A (zh)
WO (1) WO2021027236A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
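CN110607352A (zh) * 2019-08-12 2019-12-24 安诺优达生命科学研究院 Method for constructing DNA library and application thereof
CN111778563A (zh) * 2020-07-24 2020-10-16 天津诺禾致源生物信息科技有限公司 Method for constructing a cell Hi-C sequencing library
CN112795563A (zh) * 2021-03-23 2021-05-14 上海欣百诺生物科技有限公司 Use of biotinylated transposome in recovering CUT&Tag or ATAC-seq products, and method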

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
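WO2017066908A1 (zh) * 2015-10-19 2017-04-27 安诺优达基因科技(北京)有限公司 Method for constructing a high-resolution, information-rich single-cell Hi-C library
CN106637422A (zh) * 2016-12-16 2017-05-10 中国人民解放军军事医学科学院生物工程研究所 Method for constructing a Hi-C high-throughput sequencing library
CN108085379A (zh) * 2017-12-28 2018-05-29 上海嘉因生物科技有限公司 ATAC-seq method for locating open chromatin binding regions in tissue samples
CN110607352A (zh) * 2019-08-12 2019-12-24 安诺优达生命科学研究院 Method for constructing DNA library and application thereof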

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
GUO XIAOQIANG: "Hi-C: A Technology of High Throughput Analysis for Chromosome Conformation", SCIENCE, vol. 71, no. 1, 31 January 2019 (2019-01-31), pages 15 - 18, XP055780631 *

Also Published As

Publication number Publication date
CN110607352A (zh) 2019-12-24

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19941375

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19941375

Country of ref document: EP

Kind code of ref document: A1