WO2019006975A1 - In situ whole genome chromatin conformation capture method for infinitesimal cells

Publication number
WO2019006975A1
Authority
WO
WIPO (PCT)
Prior art keywords
treatment
dna
genome
sequencing
product
Prior art date
Application number
PCT/CN2017/114475
Other languages
French (fr)
Chinese (zh)
Inventor
颉伟
杜振海
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学
Publication of WO2019006975A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/10: Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034: Isolating an individual clone by screening libraries
    • C12N15/1093: General methods of preparing gene libraries, not provided for in other subgroups
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the invention relates to the field of biotechnology.
  • the invention relates to methods of constructing DNA sequencing libraries of a genome to be tested and uses thereof. More specifically, the present invention relates to a method of constructing a DNA sequencing library of a genome to be tested, a method of determining DNA sequence information of a genome to be tested, and a method of determining a three-dimensional spatial structure of a genome to be tested.
  • genome-wide chromatin conformation capture technology is severely limited by cell number. If a sufficient amount of DNA for library construction, containing enough valid ligation events, cannot be obtained, it is impossible to obtain valid and sufficiently high-resolution three-dimensional structural information for the genome.
  • the traditional Hi-C technology has a large reaction system, and there are many steps of centrifugation, washing, and tube exchange.
  • the selection of fragment size before DNA amplification causes a large amount of DNA loss.
  • the size-selection range for DNA fragments is narrow, much narrower than the size distribution of fragments produced by sonication, which further reduces the amount of DNA available for second-generation sequencing.
  • Hi-C technology is therefore typically used to study common, readily available cell types at cell numbers on the order of 10^7 or more, and it cannot be applied to cell types, such as those in early embryonic development, for which large numbers of cells are extremely difficult to obtain. Based on their discovery of the above problems with the traditional technique, the inventors reduced the reaction volume, changed how cells are transferred, and changed the range and timing of DNA fragment size selection.
  • sisHi-C: small-scale in situ Hi-C
  • the invention proposes a method of constructing a DNA sequencing library of a genome to be tested.
  • the method comprises: (1) digesting the genome to be tested with a restriction endonuclease to obtain a digestion product; (2) subjecting the digestion product to biotin labeling to obtain a biotin-labeled product; (3) ligating the biotin-labeled product with DNA ligase to obtain a ligation product; (4) decrosslinking the ligation product; (5) purifying the decrosslinked product; (6) subjecting the purified product to sonication and precipitation, the precipitation being carried out by contacting the sonicated product with streptavidin magnetic beads to obtain target DNA fragments bound to streptavidin magnetic beads; and (7) constructing a library based on the target DNA fragments bound to streptavidin magnetic beads.
  • a method of constructing a DNA sequencing library of a genome to be tested according to an embodiment of the present invention can be used to construct a DNA sequencing library from as few as 10 cells or as little as 50 pg of genomic DNA, thereby enabling capture of the genomic chromatin conformation of cell types that are not readily obtainable and are available only in small numbers.
  • the invention proposes a sequencing library.
  • the sequencing library is obtained by the method of constructing a DNA sequencing library of the genome to be tested as described above.
  • by sequencing a library according to embodiments of the present invention, it is possible to obtain genomic sequence information and sufficiently high-resolution genomic three-dimensional structure information from as few as 10 cells or as little as 50 pg of genomic DNA.
  • the invention proposes a method of determining DNA sequence information of a genome to be tested.
  • the method comprises: constructing a DNA sequencing library of a genome to be tested according to the method described above; sequencing the DNA sequencing library to obtain a sequencing result; and determining the DNA sequence information of the genome to be tested based on the sequencing result.
  • with the method of determining the DNA sequence information of the genome to be tested according to an embodiment of the present invention, it is possible to obtain genomic sequence information from as few as 10 cells or as little as 50 pg of genomic DNA.
  • the invention proposes a method for determining a three-dimensional spatial structure of a genome to be tested.
  • the method comprises: constructing a DNA sequencing library of a genome to be tested according to the method described above; sequencing the DNA sequencing library to obtain a sequencing result; and determining the three-dimensional spatial structure information of the genome to be tested based on the sequencing result.
  • the genome to be tested proposed by the present invention refers to a whole genome or a partial genome of a cell or a tissue, and the genome is composed of chromatin or chromosome.
  • the source of the genome is not particularly limited, and it can be obtained by any possible route: purchased commercially, obtained from other laboratories, or extracted directly from a cell or tissue sample.
  • FIG. 1 is a flow chart of a method of constructing a DNA sequencing library of a genome to be tested according to an embodiment of the present invention
  • FIG. 2 is a flow chart of a method of constructing a DNA sequencing library of a genome to be tested according to still another embodiment of the present invention
  • FIG. 3 is a flow chart of a method of constructing a DNA sequencing library of a genome to be tested according to still another embodiment of the present invention
  • FIG. 5 is a flow chart of TruSeq library construction according to still another embodiment of the present invention.
  • FIG. 6 is a flow chart of a TruSeq library according to still another embodiment of the present invention.
  • FIG. 7 is a flow chart of a TruSeq library according to still another embodiment of the present invention.
  • FIG. 8 is a diagram showing results verifying that the library construction method according to an embodiment of the present invention has significant advantages.
  • the invention proposes a method of constructing a DNA sequencing library of a genome to be tested.
  • the method includes: S100: digesting a genome to be tested with a restriction endonuclease to obtain a digestion product; S200: subjecting the digestion product to biotin labeling to obtain a biotin-labeled product; S300: ligating the biotin-labeled product with DNA ligase to obtain a ligation product; S400: decrosslinking the ligation product; S500: purifying the decrosslinked product; S600: subjecting the purified product to sonication and precipitation, the precipitation being carried out by contacting the sonicated product with streptavidin magnetic beads to obtain target DNA fragments bound to streptavidin magnetic beads; and S700: constructing a library based on the target DNA fragments bound to streptavidin magnetic beads.
  • a method of constructing a DNA sequencing library of a genome to be tested according to an embodiment of the present invention can be used to construct a DNA sequencing library from as few as 10 cells or as little as 50 pg of genomic DNA, thereby enabling capture of the genomic chromatin conformation of cell types that are not readily obtainable and are available only in small numbers.
  • the genome to be tested is obtained by lysing cells or tissues, optionally the cells are cell lines or primary cells.
  • the cells or tissues are lysed to release the genome in the cells or tissues.
  • the cells or tissue are previously subjected to a formaldehyde cross-linking treatment.
  • formaldehyde temporarily immobilizes DNA-protein and protein-protein complexes that are spatially close in the natural state of cells or tissues.
  • the cell or genome is subjected to a transfer treatment by a mouth pipette.
  • the inventors found that precise transfer of cells or tissue with a mouth pipette effectively avoids the cell loss caused by centrifugation when changing solutions.
  • the restriction enzyme is MboI.
  • the inventors found that the recognition site of MboI is GATC, whose base composition is balanced, so there is no obvious bias when cutting the genome. In addition, because the MboI recognition site is four bases long, MboI cuts more frequently than the commonly used six-base restriction endonucleases, and the theoretical resolution of the resulting data is higher. Finally, MboI is widely available commercially at low cost, which helps control the overall cost of constructing the DNA sequencing library.
  • the biotin labeling treatment is carried out by contacting the digestion product with dATP, dGTP, dTTP, biotin-labeled dCTP and the large fragment of DNA polymerase, the contact being carried out at 37 °C for 1.5 hours. Biotin can be efficiently incorporated at the ends of the MboI fragments by this method.
  • the ligation is carried out with T4 DNA ligase, and the decrosslinking treatment is carried out by contacting the ligation product with proteinase K, SDS and sodium chloride.
  • by ligation with T4 DNA ligase, different cut ends of the DNA can be joined into circular chimeric molecules, and the decrosslinking treatment described above separates the DNA from its bound proteins so that the DNA is efficiently released.
  • the purification treatment is carried out by contacting the decrosslinked product with pre-cooled anhydrous ethanol, the contact being carried out at -80 °C for 15 minutes.
  • DNA is thereby efficiently precipitated at the bottom of the tube, further purifying the DNA.
  • the decrosslinked product is contacted with glycogen and sodium acetate during the purification treatment.
  • the inventors have found that, because the number of cells is small, the amount of DNA in the purification step is extremely small; co-precipitating glycogen with the DNA marks the position of the DNA pellet in the EP tube, thereby effectively preventing DNA loss during washing.
  • the sonication is performed for 134 s with a Peak Power of 50, a Duty Factor of 20, and Cycles/Burst of 200.
  • Peak Power is the highest incident power, that is, the instantaneous ultrasonic power acting on the sample.
  • Duty Factor is the working coefficient, that is, the time during which the ultrasonic wave acts on the sample as a percentage of the total time.
  • Cycles/Burst is the number of energy transfers while the ultrasonic wave acts on the sample.
  • the inventors have found that under the above sonication conditions, DNA loss during DNA fragment size selection can be effectively reduced.
  • the library construction in S700 is TruSeq library construction, and S700 includes subjecting the target DNA fragments bound to streptavidin magnetic beads to S710: end repair, S720: addition of adenine deoxyribonucleotide triphosphate (dATP) to the fragment ends, and S730: ligation of sequencing adapter sequences.
  • the S720 end dATP addition further comprises washing the magnetic beads with Tween wash buffer.
  • the S730 sequencing adapter ligation further comprises washing the magnetic beads with Tween wash buffer.
  • after each reaction step the magnetic beads are washed directly, twice, with Tween wash buffer, so the solution can be changed simply and quickly, avoiding the loss of DNA that would occur during purification.
  • the adapter-ligated product is subjected to S740: first DNA elution and S750: PCR amplification.
  • the target DNA fragment can be efficiently eluted from the streptavidin magnetic beads, and the target DNA can be efficiently enriched by PCR amplification.
  • a sequencing library obtained by the above method according to an embodiment of the present invention can be used as a sisHi-C library for second-generation sequencing.
  • the above method according to an embodiment of the present invention uses a small reaction volume, which increases the efficiency of the enzymatic digestion and ligation reactions while greatly reducing cost; cells are transferred precisely with a mouth pipette before lysis, which avoids the cell loss caused by centrifugation when changing solutions; DNA is bound to streptavidin magnetic beads for TruSeq library construction, milder bead-washing conditions are used, and library construction is carried out in a single EP tube, which bypasses the purification and recovery steps and effectively reduces DNA loss during library construction; the sonication conditions for fragmenting DNA are optimized, the DNA is eluted in two steps, and all DNA in the range of 200 bp to 1000 bp is retained after PCR, further increasing the amount of DNA finally obtained.
  • the whole method not only significantly reduces the number of starting cells required, but also costs about one-tenth as much as traditional methods, thus efficiently achieving genomic chromatin conformation capture.
  • the invention proposes a sequencing library.
  • the sequencing library is obtained by the method of constructing a DNA sequencing library of the genome to be tested as described above.
  • sequencing libraries according to embodiments of the present invention for sequencing it is possible to obtain genomic sequence information with a cell number as low as 10 or a genome amount as low as 50 pg and genomic three-dimensional structure information of sufficiently high resolution.
  • the invention proposes a method of determining DNA sequence information of a genome to be tested.
  • according to an embodiment of the present invention, the method comprises: constructing a DNA sequencing library of a genome to be tested according to the method described above; sequencing the DNA sequencing library to obtain a sequencing result; and determining the DNA sequence information of the genome to be tested based on the sequencing result.
  • the method for constructing a DNA sequencing library of the genome to be tested has been described in detail above and will not be described herein.
  • the method and apparatus for sequencing a sequencing library are not particularly limited. In view of the maturity of the technology, according to an embodiment of the present invention, second-generation sequencing techniques such as SOLEXA, SOLiD and 454 sequencing may be employed. Single-molecule sequencing technologies can also be used, such as Helicos' True Single Molecule DNA sequencing technology, Pacific Biosciences' single molecule, real-time (SMRT) technology, and nanopore sequencing technology from Oxford Nanopore Technologies (Rusk, Nicole (2009-04-01). Cheap Third-Generation Sequencing. Nature Methods 6(4): 244-245).
  • the inventors have surprisingly found that the method for determining DNA sequence information of a genome to be tested according to an embodiment of the present invention makes it possible to determine, sensitively, accurately and efficiently, genomic sequence information from trace numbers of cells (as few as 10 starting cells) or from trace amounts of genome (as little as 50 pg of starting DNA).
  • the invention proposes a method for determining a three-dimensional spatial structure of a genome to be tested.
  • the method comprises: constructing a DNA sequencing library of a genome to be tested according to the method described above; sequencing the DNA sequencing library to obtain a sequencing result; and determining the three-dimensional spatial structure information of the genome to be tested based on the sequencing result.
  • the method comprises constructing a DNA sequencing library of the genome to be tested according to the method described above; performing paired-end sequencing on the resulting library to obtain the sequence at both ends of each DNA fragment in the library; and aligning the sequences at both ends to the genome, so that spatial proximity information can be obtained for every pair of segments at different linear distances in the genome, and the three-dimensional structure of the genome can be inferred by mathematical methods (Lieberman-Aiden et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009).
  • the inventors have surprisingly found that the method according to an embodiment of the present invention enables sensitive, accurate and efficient determination of sufficiently high-resolution genomic three-dimensional structural information from trace numbers of cells (as few as 10 starting cells) or from trace amounts of genome (as little as 50 pg of starting DNA).
  • the collected samples were transferred with a mouth pipette, under a stereo microscope, into freshly prepared PBS containing 1% formaldehyde and fixed at room temperature for 10 minutes; 2.5 M glycine solution was then added to a final concentration of 0.2 M, and the samples were left at room temperature for 10 minutes.
  • the sample was transferred to a PBS solution by a mouth pipette and washed once, and then transferred to a PCR tube.
  • the solution containing the sample was transferred from the PCR tube to a low adsorption 1.5 ml EP tube.
  • the sample was placed on a Thermo mixer at 500 rpm and treated at 24 °C for 5.5 hours, during which the different cut ends were ligated into circular chimeric molecules by DNA ligase.
  • the sample was taken out of the oven.
  • 1 µl of glycogen and 15 µl of 3 M sodium acetate were added to each sample and mixed on a vortex mixer; 240 µl of pre-cooled anhydrous ethanol was then added, mixed by inversion, and left at -80 °C for 15 minutes.
  • the precipitate was collected by high-speed centrifugation and washed twice with 75% ethanol.
  • the purified DNA was dissolved in 50 µl of water and incubated in a metal bath at 37 °C for 15 minutes.
  • in the process of purifying DNA, because the number of cells is small, the amount of DNA is extremely small; glycogen is co-precipitated with the DNA to mark the position of the DNA pellet in the EP tube, preventing DNA loss during washing.
  • the sample DNA was sheared into fragments of 300-500 base pairs (bp) using a Covaris M220 ultrasonicator under the following conditions: Peak Power 50, Duty Factor 20, Cycles/Burst 200, time 134 s.
  • the Covaris sonication tube was rinsed with 20 microliters of water to reduce DNA loss. The inventors have found that DNA loss during DNA fragment size selection can be reduced under the above sonication conditions.
  • the target DNA fragments bearing the biotin label are bound to the streptavidin magnetic beads. Because the binding of biotin-labeled DNA to streptavidin magnetic beads is very stable, the beads carrying the target DNA can be used directly for TruSeq library construction. End repair, addition of adenine deoxyribonucleotide triphosphate (dATP) to the ends of the DNA fragments, and ligation of sequencing adapters were performed in sequence. After each reaction step the magnetic beads are washed directly, twice, with Tween wash buffer, so the solution can be changed simply and quickly, avoiding the loss of DNA that would occur during purification.
  • on the fourth day, after the sequencing adapters had been ligated, the sample was placed on a magnetic stand. Once all the magnetic beads had been drawn to the magnet and the solution had become clear, the supernatant was discarded and the beads were washed twice with Tween wash buffer, each time on a rotary mixer for 2 minutes at room temperature. 20 µl of water was added to the sample and mixed by pipetting, and the sample was placed on a constant-temperature mixer at 66 °C and 1400 rpm for 20 minutes; the DNA was eluted twice. PCR amplification was then performed.
  • AMPure XP magnetic beads were added to 150 µl of the sample, mixed by pipetting, and incubated on a rotary mixer for 5 minutes at room temperature. The supernatant was transferred to a new low-adsorption EP tube, 78 µl (1:1) of AMPure XP magnetic beads was added, and the mixture was incubated on a rotary mixer for 5 minutes at room temperature, after which the supernatant was discarded.
  • the magnetic beads were washed twice with 75% ethanol and air dried, and 50 µl of water was added to the EP tube for elution, yielding DNA fragments ranging in size from 200 base pairs to 1000 base pairs.
  • the library obtained by the above procedure (in the present application, the inventors refer to the library obtained by the above method as a sisHi-C library) can be used for second-generation sequencing.
  • the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to.
  • features defined as "first" and "second" may explicitly or implicitly include one or more of those features.
  • "a plurality" means two or more unless specifically defined otherwise.
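The point made above about MboI (a four-base GATC recognition site cuts more often than a six-base site) can be illustrated with a short Python sketch. It assumes equal, independent base frequencies, so the expected spacing between sites is 4^n bp for an n-base site. It models only the top strand and places the cut immediately 5' of GATC. The function names and the toy sequence are illustrative assumptions and do not come from the patent.

```python
import re
from typing import List


def expected_site_spacing(site_length: int) -> int:
    """Expected distance (bp) between sites, assuming uniform random bases."""
    return 4 ** site_length


def find_sites(sequence: str, site: str = "GATC") -> List[int]:
    """Return 0-based start positions of every occurrence of the site."""
    # A lookahead is used so that overlapping occurrences would also be found.
    pattern = re.compile(f"(?={re.escape(site)})")
    return [m.start() for m in pattern.finditer(sequence.upper())]


def digest(sequence: str, site: str = "GATC") -> List[str]:
    """Split the top strand immediately 5' of each site (assumed cut position)."""
    bounds = [0] + find_sites(sequence, site) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


# Hypothetical toy sequence containing two GATC sites.
toy = "AAGATCTTGATCGG"
fragments = digest(toy)
four_cutter = expected_site_spacing(4)  # 256 bp under the uniform-base assumption
six_cutter = expected_site_spacing(6)   # 4096 bp under the same assumption
```

Under these assumptions a four-base cutter produces fragments roughly sixteen times shorter on average than a six-base cutter, which is the basis of the higher theoretical resolution mentioned above.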

Abstract

Provided is a method for constructing a DNA sequencing library of a genome to be tested, the method comprising: (1) digesting a genome to be tested by means of a restriction enzyme; (2) performing biotin labelling on digested products; (3) linking the biotin-labelled products using a DNA ligase; (4) decrosslinking the linked products; (5) purifying the decrosslinked products; (6) performing ultrasonic and precipitation treatments on the purified products, wherein the precipitation treatment is to bring the sonicated products into contact with streptavidin magnetic beads to obtain target DNA fragments bound to streptavidin magnetic beads; and (7) constructing a library based on the target DNA fragments bound to streptavidin magnetic beads.

Description

In situ whole-genome chromatin conformation capture technique for very small numbers of cells
Priority information
This application claims priority to and the benefit of patent application No. 2017105525550, filed with the State Intellectual Property Office of China on July 7, 2017, which is incorporated herein by reference in its entirety.
Technical field
The invention relates to the field of biotechnology. In particular, the invention relates to methods of constructing DNA sequencing libraries of a genome to be tested and uses thereof. More specifically, the present invention relates to a method of constructing a DNA sequencing library of a genome to be tested, a method of determining DNA sequence information of a genome to be tested, and a method of determining a three-dimensional spatial structure of a genome to be tested.
Background
In recent years, with the development of genomics, the Human Genome Project has successfully mapped the DNA sequence of the human genome. Studies related to the encyclopedia of the human genome project have identified tens of thousands of genes and hundreds of thousands of distinct gene regulatory elements. Recent studies have shown that the three-dimensional spatial structure of the genome has an important influence on genome functions such as expression and regulation. How to obtain three-dimensional structural information for the genomes of different cell types, at different stages of the cell cycle, at different stages of development and differentiation, and during the transition from normal cells to diseased cells has therefore become an urgent problem. With the development and spread of chromatin conformation capture and second-generation sequencing, Hi-C (whole genome chromosome conformation capture), a genome-wide chromatin conformation capture technique based on second-generation sequencing, has become an important means of studying the three-dimensional structure of chromosomes. However, genome-wide chromatin conformation capture is severely limited by cell number. If a sufficient amount of DNA for library construction, containing enough valid ligation events, cannot be obtained, it is impossible to obtain valid and sufficiently high-resolution three-dimensional structural information for the genome.
Thus, how to construct DNA sequencing libraries from cell types that are not readily obtainable and are available only in small numbers, and to use them effectively for DNA sequencing and for obtaining sufficiently high-resolution three-dimensional genome structure information, remains a problem to be solved.
Summary of the invention
This application is based on the inventors' discovery of the following problems:
The traditional Hi-C technique uses a large reaction volume and involves many centrifugation, washing and tube-transfer steps, and selecting fragments by size before DNA amplification causes a large loss of DNA. Moreover, the size-selection range for DNA fragments is narrow, much narrower than the size distribution of fragments produced by sonication, which further reduces the amount of DNA available for second-generation sequencing. For these reasons, Hi-C is typically used to study common, readily available cell types at cell numbers on the order of 10^7 or more, and it cannot be applied to cell types, such as those in early embryonic development, for which large numbers of cells are extremely difficult to obtain. Based on their discovery of the above problems with the traditional technique, the inventors reduced the reaction volume, changed how cells are transferred, and changed the range and timing of DNA fragment size selection, among other measures, and thereby devised an improved method for constructing whole-genome DNA sequencing libraries (referred to in this application as sisHi-C, small-scale in situ Hi-C). The method can be used to study precious cell types with as few as 10 cells, offering new prospects for studying the establishment of three-dimensional genome structure and the regulation of gene expression in early embryonic development.
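To make the size-selection argument concrete, the following Python sketch compares how much DNA survives a narrow window versus a wide one. It is only an illustration: the fragment lengths and the 300-500 bp "narrow" window are hypothetical, DNA mass is approximated by summed fragment length, and the 200-1000 bp window is the retention range this document reports for the sisHi-C library after PCR.

```python
from typing import Sequence


def retained_fraction(fragment_lengths: Sequence[int], low_bp: int, high_bp: int) -> float:
    """Fraction of total DNA (approximated by summed length) inside [low_bp, high_bp]."""
    total = sum(fragment_lengths)
    if total == 0:
        return 0.0
    kept = sum(n for n in fragment_lengths if low_bp <= n <= high_bp)
    return kept / total


# Hypothetical fragment lengths (bp) after sonication.
lengths = [150, 250, 350, 450, 600, 800, 1200]

narrow = retained_fraction(lengths, 300, 500)   # hypothetical narrow window
wide = retained_fraction(lengths, 200, 1000)    # window stated for sisHi-C
```

With this toy distribution the wide window keeps about three times as much DNA as the narrow one, which is the kind of saving that matters when the starting material is only a few cells.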
In a first aspect, the invention provides a method of constructing a DNA sequencing library of a genome to be tested. According to an embodiment of the invention, the method comprises: (1) digesting the genome to be tested with a restriction endonuclease to obtain a digestion product; (2) biotin-labeling the digestion product to obtain a biotin-labeled product; (3) ligating the biotin-labeled product with DNA ligase to obtain a ligation product; (4) reversing the crosslinks in the ligation product; (5) purifying the decrosslinked product; (6) subjecting the purified product to sonication and pull-down (precipitation) treatment, the pull-down being performed by contacting the sonicated product with streptavidin magnetic beads, to obtain target DNA fragments bound to the streptavidin magnetic beads; and (7) constructing the library from the target DNA fragments bound to the streptavidin magnetic beads. The method according to embodiments of the invention can be used to construct a DNA sequencing library from as few as 10 cells or as little as 50 pg of genome, thereby enabling chromatin conformation capture for cell types that are difficult to obtain and available only in small numbers.
In a second aspect, the invention provides a sequencing library. According to an embodiment of the invention, the sequencing library is obtained by the method of constructing a DNA sequencing library of a genome to be tested described above. Sequencing this library yields genome sequence information, and three-dimensional genome structure information at sufficiently high resolution, from as few as 10 cells or as little as 50 pg of genome.
In a third aspect, the invention provides a method of determining DNA sequence information of a genome to be tested. According to an embodiment of the invention, the method comprises: constructing a DNA sequencing library of the genome to be tested according to the method described above; sequencing the DNA sequencing library to obtain a sequencing result; and determining the DNA sequence information of the genome to be tested based on the sequencing result. This method yields genome sequence information from as few as 10 cells or as little as 50 pg of genome.
In a fourth aspect, the invention provides a method for determining the three-dimensional spatial structure of a genome to be tested. According to an embodiment of the invention, the method comprises: constructing a DNA sequencing library of the genome to be tested according to the method described above; sequencing the DNA sequencing library to obtain a sequencing result; and determining the three-dimensional spatial structure information of the genome to be tested based on the sequencing result. This method yields three-dimensional genome structure information at sufficiently high resolution from as few as 10 cells or as little as 50 pg of genome.
It should be noted that the genome to be tested, as used in the invention, refers to the whole genome or part of the genome of a cell or tissue, and that the genome consists of chromatin or chromosomes. Those skilled in the art will understand that the source of the genome is not particularly limited: it may be obtained by any available route, for example purchased commercially, obtained directly from another laboratory, or extracted directly from a cell or tissue sample.
Additional aspects and advantages of the invention are set forth in part in the description that follows; in part they will become apparent from that description or will be learned through practice of the invention.
Brief Description of the Drawings
FIG. 1 is a flow chart of a method of constructing a DNA sequencing library of a genome to be tested according to an embodiment of the invention;
FIG. 2 is a flow chart of a method of constructing a DNA sequencing library of a genome to be tested according to another embodiment of the invention;
FIG. 3 is a flow chart of a method of constructing a DNA sequencing library of a genome to be tested according to another embodiment of the invention;
FIG. 4 is a flow chart of TruSeq library construction according to an embodiment of the invention;
FIG. 5 is a flow chart of TruSeq library construction according to a further embodiment of the invention;
FIG. 6 is a flow chart of TruSeq library construction according to a further embodiment of the invention;
FIG. 7 is a flow chart of TruSeq library construction according to a further embodiment of the invention; and
FIG. 8 shows results, according to an embodiment of the invention, verifying that the library construction method proposed in embodiments of the invention has significant advantages.
Detailed Description
Embodiments of the invention are described in detail below, and examples of the embodiments are illustrated in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the invention, and are not to be construed as limiting it.
Method of constructing a DNA sequencing library of a genome to be tested
In a first aspect, the invention provides a method of constructing a DNA sequencing library of a genome to be tested. According to an embodiment of the invention, referring to FIG. 1, the method comprises: S100: digesting the genome to be tested with a restriction endonuclease to obtain a digestion product; S200: biotin-labeling the digestion product to obtain a biotin-labeled product; S300: ligating the biotin-labeled product with DNA ligase to obtain a ligation product; S400: reversing the crosslinks in the ligation product; S500: purifying the decrosslinked product; S600: subjecting the purified product to sonication and pull-down (precipitation) treatment, the pull-down being performed by contacting the sonicated product with streptavidin magnetic beads, to obtain target DNA fragments bound to the streptavidin magnetic beads; and S700: constructing the library from the target DNA fragments bound to the streptavidin magnetic beads. The method according to embodiments of the invention can be used to construct a DNA sequencing library from as few as 10 cells or as little as 50 pg of genome, thereby enabling chromatin conformation capture for cell types that are difficult to obtain and available only in small numbers.
According to an embodiment of the invention, referring to FIG. 2, the genome to be tested is obtained by lysing cells or tissue; optionally, the cells are a cell line or primary cells. Lysing the cells or tissue releases the genome they contain.
According to an embodiment of the invention, referring to FIG. 3, the cells or tissue are crosslinked with formaldehyde beforehand. Formaldehyde transiently fixes the DNA-protein and protein-protein complexes that are spatially close to one another in their native state within the cells or tissue.
According to an embodiment of the invention, the cells or genome are transferred with a mouth pipette. The inventors found that transferring cells or tissue precisely with a mouth pipette effectively avoids the cell loss caused by exchanging solutions by centrifugation.
According to an embodiment of the invention, the restriction endonuclease is MboI. The inventors found that the MboI recognition site, GATC, has an even base composition, so cutting across the genome shows no obvious bias. In addition, the MboI recognition site is four bases long, so it cuts more frequently than the commonly used six-base restriction endonucleases, and the resulting data in theory have higher resolution. Finally, MboI is widely commercialized, inexpensive, and easy to obtain, which keeps the overall cost of constructing the DNA sequencing library under control.
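The cutting-frequency argument can be illustrated with a short calculation. The sketch below assumes an idealized random sequence with equal base frequencies, which the source does not state and which real genomes only approximate; the function names are illustrative and not part of the described method.

```python
import random


def expected_site_spacing(site_length: int) -> int:
    """Expected spacing (bp) between occurrences of one fixed recognition
    site in a random sequence with equal base frequencies (assumption)."""
    return 4 ** site_length


def count_sites(sequence: str, site: str = "GATC") -> int:
    """Count occurrences of a recognition site in a sequence."""
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


four_cutter = expected_site_spacing(4)  # 4-base site such as GATC: 256 bp
six_cutter = expected_site_spacing(6)   # 6-base site: 4096 bp

# Simulated random sequence, used only to illustrate the expectation.
rng = random.Random(0)
simulated = "".join(rng.choice("ACGT") for _ in range(200_000))
observed = count_sites(simulated)

print(f"expected spacing, 4-base site: {four_cutter} bp")
print(f"expected spacing, 6-base site: {six_cutter} bp")
print(f"GATC sites in 200 kb of simulated sequence: {observed}")
```

Under this idealization a four-base cutter produces fragments about 16 times shorter on average than a six-base cutter, which is the basis for the higher theoretical resolution mentioned above.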
According to an embodiment of the invention, the biotin labeling is performed by contacting the digestion product with dATP, dGTP, dTTP, biotin-labeled dCTP, and the large fragment of DNA polymerase, the contact being carried out at 37 °C for 1.5 hours. In this way biotin is efficiently incorporated at the ends of the MboI-digested fragments.
According to an embodiment of the invention, the ligation is a T4 ligation, and the crosslink reversal is performed by contacting the ligation product with proteinase K, SDS, and sodium chloride. Ligation with T4 DNA ligase joins the different cut ends of the DNA into circular chimeric molecules; the crosslink reversal then separates the DNA from its bound proteins and effectively releases the DNA.
According to an embodiment of the invention, the purification is performed by contacting the decrosslinked product with pre-chilled absolute ethanol, the contact being carried out at -80 °C for 15 minutes. With this purification, the DNA precipitates effectively at the bottom of the tube and is thereby purified.
According to an embodiment of the invention, the decrosslinked product is contacted with glycogen and sodium acetate during the purification. The inventors found that, because the number of cells is small and the amount of DNA is therefore extremely low, adding glycogen to co-precipitate with the DNA marks where the DNA pellet sits in the EP tube and effectively prevents DNA loss during washing.
According to an embodiment of the invention, the sonication is performed for 134 s with a Peak Power of 50, a Duty Factor of 20, and Cycles/Burst of 200. Peak Power is the highest incident power, that is, the instantaneous ultrasonic power applied to the sample; Duty Factor is the percentage of the total time during which ultrasound acts on the sample; and Cycles/Burst is the number of acoustic energy cycles delivered in each burst while ultrasound acts on the sample. The inventors found that these sonication conditions effectively reduce DNA loss during fragment size selection.
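The sonication parameters can be captured in a small record, which also makes the meaning of the Duty Factor concrete. This is a minimal sketch: the class and method names are hypothetical, and treating average power as peak power times duty factor is a standard simplification that the source does not state. The source gives no unit for Peak Power, so none is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SonicationSettings:
    """Sonication settings as listed in the text (names are illustrative)."""
    peak_power: float           # instantaneous power, instrument units
    duty_factor_percent: float  # percent of time ultrasound is applied
    cycles_per_burst: int       # acoustic cycles delivered per burst
    duration_s: float           # total treatment time in seconds

    def on_time_s(self) -> float:
        """Time during which ultrasound actually acts on the sample."""
        return self.duration_s * self.duty_factor_percent / 100.0

    def average_power(self) -> float:
        """Time-averaged power (assumption: peak power x duty factor)."""
        return self.peak_power * self.duty_factor_percent / 100.0


settings = SonicationSettings(
    peak_power=50, duty_factor_percent=20, cycles_per_burst=200, duration_s=134
)
print(f"ultrasound on-time: {settings.on_time_s():.1f} s of {settings.duration_s} s")
print(f"time-averaged power: {settings.average_power():.1f} (instrument units)")
```

With a Duty Factor of 20, ultrasound acts on the sample for roughly a fifth of the 134 s treatment.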
According to an embodiment of the invention, referring to FIG. 4, the library construction in S700 is TruSeq library construction, and S700 comprises subjecting the target DNA fragments bound to the streptavidin magnetic beads to S710: end repair, S720: addition of dATP to the fragment ends, and S730: ligation of sequencing adapters.
According to an embodiment of the invention, the method further comprises washing the magnetic beads with Tween wash buffer after the S710 end repair and before the S720 addition of dATP; preferably, it further comprises washing the magnetic beads with Tween wash buffer after the S720 addition of dATP and before the S730 adapter ligation; preferably, it further comprises washing the magnetic beads with Tween wash buffer after the S730 adapter ligation. After each reaction, washing the beads twice directly with Tween wash buffer exchanges the solution simply and quickly and avoids the DNA loss that a purification step would cause.
According to an embodiment of the invention, referring to FIG. 5, the method further comprises subjecting the adapter-ligated product to S740: a first DNA elution and S750: PCR amplification. The first elution efficiently releases the target DNA fragments from the streptavidin magnetic beads, and PCR amplification efficiently enriches the target DNA.
According to an embodiment of the invention, referring to FIG. 6, the method further comprises S760: binding the PCR amplification product to AMPure XP magnetic beads; optionally, referring to FIG. 7, it further comprises S770: subjecting the PCR amplification product bound to the AMPure XP magnetic beads to a second DNA elution. DNA fragments ranging from 200 base pairs to 1000 base pairs in size are thereby obtained.
The sequencing library obtained by the above method according to embodiments of the invention can be used as a sisHi-C library for next-generation sequencing.
The above method according to embodiments of the invention uses a very small reaction volume, which greatly reduces cost while increasing the efficiency of the digestion and ligation reactions. Before cell lysis, cells are transferred precisely with a mouth pipette, avoiding the cell loss caused by exchanging solutions by centrifugation. The DNA is bound to streptavidin magnetic beads for TruSeq library construction, milder bead-washing conditions are used, and the library construction is carried out in a single EP tube; together these measures bypass the purification and recovery steps and effectively reduce DNA loss during library construction. Optimizing the sonication conditions, eluting the DNA in two steps, and retaining all DNA fragments between 200 bp and 1000 bp after PCR further increase the amount of DNA finally available. The method as a whole not only significantly reduces the number of starting cells required but also costs only about one tenth as much as the conventional method, and thus achieves chromatin conformation capture efficiently.
Those skilled in the art will understand that equipment, devices, units, and modules for constructing a DNA sequencing library of a genome to be tested that are derived from the method described in this application also fall within the scope of protection of this application; their advantages are similar to those of the method described above and are not detailed again here. Those skilled in the art will also understand that any device known in the art that is suitable for performing the above operations may be used as a component of each of the above units. The term "connected" as used herein should be understood broadly: the connection may be direct or may be indirect through an intermediary, and a person of ordinary skill in the art can understand the specific meaning of the term according to the specific circumstances.
Sequencing library
In a second aspect, the invention provides a sequencing library. According to an embodiment of the invention, the sequencing library is obtained by the method of constructing a DNA sequencing library of a genome to be tested described above. Sequencing this library yields genome sequence information, and three-dimensional genome structure information at sufficiently high resolution, from as few as 10 cells or as little as 50 pg of genome.
Method of determining DNA sequence information of a genome to be tested
In a third aspect, the invention provides a method of determining DNA sequence information of a genome to be tested. According to an embodiment of the invention, the method comprises: constructing a DNA sequencing library of the genome to be tested according to the method described above; sequencing the DNA sequencing library to obtain a sequencing result; and determining the DNA sequence information of the genome to be tested based on the sequencing result. The method of constructing the DNA sequencing library has been described in detail above and is not repeated here. According to embodiments of the invention, the method and apparatus used to sequence the library are not particularly limited. In view of technical maturity, second-generation sequencing technologies such as SOLEXA, SOLID, and 454 sequencing may be used. Newer sequencing technologies, whether under development or yet to be developed, may of course also be used, for example single-molecule sequencing technologies such as Helicos' True Single Molecule DNA sequencing, Pacific Biosciences' single molecule, real-time (SMRT(TM)) technology, and Oxford Nanopore Technologies' nanopore sequencing (Rusk, Nicole (2009-04-01). Cheap Third-Generation Sequencing. Nature Methods 6(4): 244-245).
The inventors surprisingly found that the method of determining DNA sequence information of a genome to be tested according to embodiments of the invention can determine, sensitively, accurately, and efficiently, the genome sequence information of the genome of a very small number of cells (as few as 10 starting cells) or of a trace amount of genome (as little as 50 pg of starting DNA).
Method of determining the three-dimensional spatial structure of a genome to be tested
In a fourth aspect, the invention provides a method for determining the three-dimensional spatial structure of a genome to be tested. According to an embodiment of the invention, the method comprises: constructing a DNA sequencing library of the genome to be tested according to the method described above; sequencing the DNA sequencing library to obtain a sequencing result; and determining the three-dimensional spatial structure information of the genome to be tested based on the sequencing result.
According to a specific embodiment of the invention, the method comprises constructing a DNA sequencing library of the genome to be tested according to the method described above; performing paired-end sequencing on the resulting library to obtain the sequence at both ends of each DNA fragment in the library; and aligning the two end sequences to the genome separately. This yields, for every pair of genomic segments at any linear distance from each other, information on how close the two segments are in space, from which the three-dimensional spatial structure of the genome is inferred using mathematical methods (Aiden et al. Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009).
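The step from aligned read pairs to spatial proximity information can be sketched as counting read pairs per pair of genomic bins. The example below is a minimal illustration, not the analysis pipeline used by the inventors: the function name, the fixed bin size, and the example coordinates are all hypothetical, and real pipelines additionally filter, deduplicate, and normalize the read pairs.

```python
from collections import Counter


def bin_contacts(read_pairs, bin_size):
    """Count read pairs per unordered pair of genomic bins.

    read_pairs: iterable of ((chrom_a, pos_a), (chrom_b, pos_b)), where each
    element is the aligned position of one end of a sequenced fragment.
    Returns a Counter keyed by a sorted pair of (chrom, bin_index) tuples.
    """
    contacts = Counter()
    for (chrom_a, pos_a), (chrom_b, pos_b) in read_pairs:
        bin_a = (chrom_a, pos_a // bin_size)
        bin_b = (chrom_b, pos_b // bin_size)
        # Sort so that (A, B) and (B, A) count as the same contact.
        key = tuple(sorted((bin_a, bin_b)))
        contacts[key] += 1
    return contacts


# Hypothetical aligned read pairs, for illustration only.
example_pairs = [
    (("chr1", 1_500), ("chr1", 52_000)),
    (("chr1", 51_000), ("chr1", 1_200)),
    (("chr1", 100), ("chr2", 300)),
]

for (bin_a, bin_b), count in sorted(bin_contacts(example_pairs, 10_000).items()):
    print(bin_a, bin_b, count)
```

A higher count for a pair of bins is read as more frequent spatial proximity between the two regions; the resulting matrix is the input to the structural inference mentioned above.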
The inventors surprisingly found that the method according to embodiments of the invention for determining sequence information of target chromatin regions of the genome to be tested can determine, sensitively, accurately, and efficiently, three-dimensional genome structure information at sufficiently high resolution for the genome of a very small number of cells (as few as 10 starting cells) or for a trace amount of genome (as little as 50 pg of starting DNA).
The invention is described below with reference to specific examples. These examples are illustrative only and are not to be construed as limiting the invention.
The solutions of the invention are explained below in conjunction with examples. Those skilled in the art will understand that the following examples serve only to illustrate the invention and should not be regarded as limiting its scope. Where specific techniques or conditions are not indicated in the examples, the techniques or conditions described in the literature of the field (for example, J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition, translated by Huang Peitang et al., Science Press) or the product instructions are followed. Reagents and instruments for which no manufacturer is indicated are conventional, commercially available products; for example, they can be purchased from Illumina.
Example 1: Construction of a DNA sequencing library from a cellular genome
1.1 Reagent preparation
Lysis buffer
· 10 mM Tris-HCl, pH 7.4
· 10 mM NaCl
· 0.5% NP-40
· 0.1 mM EDTA
· 1X Proteinase Inhibitor
2X biotin-streptavidin magnetic bead binding buffer (2X Binding buffer)
· 10 mM Tris-HCl, pH 8.0
· 2 M NaCl
· 1 mM EDTA
Tween wash buffer
· 5 mM Tris-HCl, pH 8.0
· 1 M NaCl
· 0.05% Tween
· 0.5 mM EDTA
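The buffer recipes above give final concentrations only. Stock volumes follow from the dilution relation C1 x V1 = C2 x V2, as in the sketch below. The stock concentrations (1 M Tris-HCl, 5 M NaCl, 0.5 M EDTA) and the 10 mL batch size are assumptions chosen for illustration; they are not given in the source, and the function names are hypothetical.

```python
def stock_volume_ml(final_mm: float, stock_mm: float, final_volume_ml: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2 (concentrations in mM)."""
    if final_mm > stock_mm:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_mm * final_volume_ml / stock_mm


def recipe(components: dict, final_volume_ml: float) -> dict:
    """Stock volumes (mL) for each component plus water to the final volume.

    components maps a name to (final_mm, stock_mm).
    """
    volumes = {
        name: stock_volume_ml(final_mm, stock_mm, final_volume_ml)
        for name, (final_mm, stock_mm) in components.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


# 2X Binding buffer; stock concentrations are assumed, not from the source.
binding_buffer_2x = {
    "Tris-HCl pH 8.0": (10, 1000),  # 10 mM final from an assumed 1 M stock
    "NaCl": (2000, 5000),           # 2 M final from an assumed 5 M stock
    "EDTA": (1, 500),               # 1 mM final from an assumed 0.5 M stock
}

for name, volume in recipe(binding_buffer_2x, final_volume_ml=10).items():
    print(f"{name}: {volume:.2f} mL")
```

The same helper applies to the lysis buffer and the Tween wash buffer once stock concentrations for their components are chosen.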
1.2 Formaldehyde crosslinking
Under a stereomicroscope, transfer the collected sample with a mouth pipette into freshly prepared PBS containing 1% formaldehyde and fix at room temperature for 10 minutes. Add 2.5 M glycine solution to a final concentration of 0.2 M and let stand at room temperature for 10 minutes. Transfer the sample with a mouth pipette into PBS, wash once, and then transfer it into a PCR tube.
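The text specifies the glycine stock (2.5 M) and the final concentration (0.2 M) but not the volume to add, which depends on the fixation volume. The sketch below derives it from a mass balance, V_add = V_sample x C_final / (C_stock - C_final). The 100 µL fixation volume is a hypothetical value used only for illustration, as is the function name.

```python
def quench_volume_ul(sample_volume_ul: float, stock_molar: float, final_molar: float) -> float:
    """Volume of stock to add so the mixture reaches final_molar.

    Derived from stock_molar * v = final_molar * (sample_volume_ul + v),
    assuming the sample initially contains none of the solute.
    """
    if not 0 < final_molar < stock_molar:
        raise ValueError("final concentration must be between 0 and the stock concentration")
    return sample_volume_ul * final_molar / (stock_molar - final_molar)


fixation_volume_ul = 100.0  # hypothetical; the source does not state the volume
glycine_ul = quench_volume_ul(fixation_volume_ul, stock_molar=2.5, final_molar=0.2)
print(f"add {glycine_ul:.1f} µL of 2.5 M glycine to {fixation_volume_ul:.0f} µL of sample")
```

For these concentrations the glycine volume is about 8.7% of the fixation volume, whatever that volume is.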
1.3 Sample lysis and restriction endonuclease digestion
Add 50 µL of lysis buffer to the PCR tube containing the sample, mix by pipetting up and down repeatedly with a low-retention filter tip, and let stand on ice for 50 minutes. Centrifuge and discard the supernatant. Add 10 µL of 0.5% SDS solution and incubate in a dry bath at 62 °C for 10 minutes.
Add 25 µL of water and 10 µL of 10% Triton X-100 solution to the sample. Mix by pipetting up and down repeatedly, then incubate the sample in a dry bath at 37 °C for 15 minutes. Mix the liquid thoroughly so that the SDS does not subsequently interfere with restriction endonuclease activity.
Then add 5 µL of 10X NEB buffer 2 and 50 U of the restriction endonuclease MboI to the sample, mix, place on a rotating mixer, and digest at 37 °C for 15 hours so that the chromatin is thoroughly digested and separated.
1.4 End labeling and ligation
On the second day, remove the sample from the rotating mixer and let it stand in a dry bath at 62 °C for 20 minutes to inactivate MboI. On ice, add to each sample 0.5 µL each of 1 mM dATP, 1 mM dGTP, and 1 mM dTTP, together with 3.75 µL of 0.4 mM biotin-labeled dCTP (biotin-14-dCTP). After mixing, add 10 U of the large fragment of DNA polymerase (Klenow) and incubate the sample on a rotating mixer at 37 °C for 1.5 hours to incorporate biotin at the ends of the MboI-digested fragments.
Transfer the solution containing the sample from the PCR tube into a low-retention 1.5 mL EP tube. Add 12 µL of 10X NEB T4 DNA ligase reaction buffer, 7 µL of 10% Triton X-100, 1.2 µL of bovine serum albumin at 10 mg per microliter, 1 µL of 400 U/µL T4 DNA ligase, and 39 µL of water. After mixing, place the sample on a thermomixer at 500 rpm and 24 °C for 5.5 hours, so that the DNA ligase joins the different cut ends into circular chimeric molecules.
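As a consistency check on the volumes listed above, the ligation additions can be tallied against the volume implied by the 10X ligase buffer. This sketch assumes the buffer is meant to be 1X in the final reaction, which the source does not state explicitly; the enzyme volumes for MboI and Klenow are not given in the text and are therefore left out of the carried-over total. The function name is illustrative.

```python
# Volumes (µL) added at the ligation step, as listed in the text.
ligation_additions_ul = {
    "10X T4 DNA ligase buffer": 12.0,
    "10% Triton X-100": 7.0,
    "BSA": 1.2,
    "T4 DNA ligase": 1.0,
    "water": 39.0,
}

# Volumes (µL) stated for the preceding steps; MboI and Klenow volumes are
# not given in the text and are omitted here.
stated_carryover_ul = {
    "0.5% SDS": 10.0,
    "water": 25.0,
    "10% Triton X-100": 10.0,
    "10X NEB buffer 2": 5.0,
    "dATP + dGTP + dTTP": 1.5,
    "biotin-14-dCTP": 3.75,
}


def summarize(additions: dict, carryover: dict, buffer_fold: float = 10.0) -> dict:
    """Compare the stated volumes with the total implied by a 1X buffer."""
    added = sum(additions.values())
    total_for_1x = additions["10X T4 DNA ligase buffer"] * buffer_fold
    implied_sample = total_for_1x - added
    stated = sum(carryover.values())
    return {
        "added_ul": round(added, 2),
        "total_for_1x_ul": round(total_for_1x, 2),
        "implied_sample_ul": round(implied_sample, 2),
        "stated_carryover_ul": round(stated, 2),
        "unaccounted_ul": round(implied_sample - stated, 2),
    }


for key, value in summarize(ligation_additions_ul, stated_carryover_ul).items():
    print(f"{key}: {value}")
```

Under that assumption the few microliters left unaccounted for are plausibly the two enzyme additions and liquid remaining after the lysis supernatant is removed, so the listed volumes are mutually consistent.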
1.5 Crosslink reversal and DNA purification
After the ligation reaction, add 100 U of proteinase K and 12 µL of 10% SDS solution to each sample and incubate in a dry bath at 55 °C for 30 minutes. Then add 13 µL of 5 M sodium chloride, mix on a vortex mixer, and place in a 65 °C oven overnight to reverse the crosslinks.
On the third day, remove the sample from the oven. Once it has cooled to room temperature, add 1 µL of glycogen and 15 µL of 3 M sodium acetate to each sample and mix on a vortex mixer; then add 240 µL of pre-chilled absolute ethanol, mix by inverting, and let stand at -80 °C for 15 minutes. Centrifuge at high speed and wash the pellet twice with 75% ethanol. Once the pellet has dried, dissolve the purified DNA in 50 µL of water and incubate in a dry bath at 37 °C for 15 minutes. Because the number of cells is small and the amount of DNA is therefore extremely low, glycogen is added during purification to co-precipitate with the DNA, marking where the DNA pellet sits in the EP tube and preventing DNA loss during washing.
1.6超声打断DNA和生物素标记沉淀1.6 Ultrasound interrupts DNA and biotin labeling
用Covaris M220超声波DNA破碎仪将样品DNA剪切打碎成300-500碱基对(bp)大小的片段,超声条件为:Peak Power 50,Duty Factor 20,Cycles/Burst 200,time 134s。用20微升水洗涤Covaris超声管以减少DNA损失。发明人发现,在上述超声条件下可以减少DNA片段大小选择过程中的DNA损失。 The sample DNA was shredded into fragments of 300-500 base pairs (bp) using a Covaris M220 ultrasonic DNA disruptor, and the ultrasonic conditions were: Peak Power 50, Duty Factor 20, Cycles/Burst 200, time 134s. The Covaris sonic tube was washed with 20 microliters of water to reduce DNA loss. The inventors have found that DNA loss during DNA fragment size selection can be reduced under the above-described ultrasonic conditions.
Streptavidin magnetic beads (Dynabeads MyOne Streptavidin C1) were prepared, 100 μg of beads per sample. The beads were washed twice with Tween wash solution, resuspended in 70 μl of 2X biotin-streptavidin bead binding solution, added to each sample, and mixed by pipetting. The samples were placed on a rotating mixer at room temperature for 50 minutes to pull down the biotin-labeled target DNA fragments.
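The bead amount above is given as a mass, while beads are pipetted as a suspension. The following is a minimal sketch of the conversion, assuming a stock concentration of 10 mg/ml for the bead suspension; that value is not stated in this document and should be checked against the product sheet. The function name and the `excess` parameter are hypothetical.

```python
# Assumption (not from this document): bead stock is supplied at 10 mg/ml.
BEAD_STOCK_MG_PER_ML = 10.0


def bead_volume_ul(n_samples, ug_per_sample=100.0,
                   stock_mg_per_ml=BEAD_STOCK_MG_PER_ML, excess=0.0):
    """Return the bead suspension volume (in microliters) for n_samples.

    1 mg/ml equals 1 ug/ul, so volume = mass / concentration.
    `excess` is an optional fractional pipetting overage (0.1 = 10%).
    """
    if n_samples < 0 or stock_mg_per_ml <= 0:
        raise ValueError("n_samples must be >= 0 and stock concentration > 0")
    total_ug = n_samples * ug_per_sample * (1.0 + excess)
    return total_ug / stock_mg_per_ml


if __name__ == "__main__":
    # Four samples at 100 ug of beads each, under the assumed stock concentration.
    print(bead_volume_ul(4))
```

Under that assumption, one sample corresponds to 10 μl of bead suspension; a different stock concentration changes the volume proportionally.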
1.7 Library construction and DNA fragment size selection
After the biotin pull-down, the biotin-labeled target DNA fragments are bound to the streptavidin magnetic beads. Because the binding between biotin-labeled DNA and streptavidin magnetic beads is very stable, the DNA-bound beads can be used directly for TruSeq library construction. End repair, addition of deoxyadenosine triphosphate (dATP) to the ends of the DNA fragments, and ligation of sequencing adapters were carried out in sequence. After each reaction, the beads were simply washed twice with Tween wash solution, which allows the solution to be exchanged quickly and easily and avoids the DNA loss that a purification step would cause.
On the fourth day, after adapter ligation, the samples were placed on a magnetic stand. Once all beads had been captured and the solution had cleared, the supernatant was discarded and the beads were washed twice with Tween wash solution, rotating on a rotating mixer at room temperature for 2 minutes each time. 20 μl of water was added to each sample and mixed by pipetting, and the sample was placed on a thermomixer at 66 °C and 1400 rpm for 20 minutes; the DNA was eluted twice in this way. PCR amplification was then performed. After PCR, 72 μl of AMPure XP beads (1:0.48) was added to the 150 μl sample, mixed by pipetting, and bound on a rotating mixer at room temperature for 5 minutes. The supernatant was transferred to a new low-binding EP tube, a further 78 μl of AMPure XP beads (1:1) was added, and after mixing the sample was bound on a rotating mixer at room temperature for 5 minutes; the supernatant was then discarded. The beads were washed twice with 75% ethanol and air-dried, and 50 μl of water was added to the EP tube for elution, yielding DNA fragments of 200 to 1000 base pairs. The resulting library (referred to in this application as the sisHi-C library) can be used for next-generation sequencing.
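The two AMPure XP additions can be checked with simple arithmetic. The sketch below reads the stated ratios as bead volume relative to the original 150 μl sample, with the second ratio taken as cumulative (72 μl + 78 μl = 150 μl, i.e. 1:1); this reading is an interpretation of the numbers given, and the function name is hypothetical.

```python
def double_sided_bead_volumes(sample_ul, first_ratio, final_ratio):
    """Return (first_ul, second_ul) bead volumes for a two-step bead size selection.

    first_ratio: beads added in step 1, relative to the sample volume.
    final_ratio: cumulative beads after step 2, relative to the sample volume.
    """
    if final_ratio <= first_ratio:
        raise ValueError("final_ratio must be larger than first_ratio")
    first_ul = sample_ul * first_ratio
    second_ul = sample_ul * final_ratio - first_ul
    return first_ul, second_ul


if __name__ == "__main__":
    # Values from the text: 150 ul sample, 1:0.48 first, 1:1 cumulative.
    first_ul, second_ul = double_sided_bead_volumes(150, 0.48, 1.0)
    print(round(first_ul, 1), round(second_ul, 1))
```

In this kind of two-step selection, the first addition binds the largest fragments, which stay behind when the supernatant is transferred; the second addition binds the remaining fragments above the lower size cutoff, and the smallest fragments are discarded with the supernatant.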
Example 2
To compare the performance of the sisHi-C method of the present application with that of a previously published method (Rao et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 2014) on small numbers of cells, the inventors used a mouth pipette to count out exactly four groups of mouse embryonic stem cells, 500 cells per group. Libraries were then constructed from two groups with sisHi-C and from two groups with the previous method (labeled replicate 1 and replicate 2 in each case). The DNA concentration was measured before PCR, and the proportion of valid data was analyzed after the libraries were sequenced. Figure 8A shows that, before PCR, the DNA concentration of both sisHi-C groups was markedly higher than that of the two groups processed with the previous method. Figure 8B shows that sisHi-C markedly reduced DNA loss: identical sequences generated by PCR made up only a small proportion of the sequencing data, and the proportion of valid data was markedly higher than in the two groups processed with the previous method. In summary, compared with the previous method, the sisHi-C method of the present application markedly reduces DNA loss during the experiment and is well suited to studying the three-dimensional genome structure of very small numbers of cells.
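The proportion of identical, PCR-generated sequences mentioned in this example can be illustrated with a small calculation. The sketch below is not the inventors' analysis pipeline: it assumes each read pair has already been reduced to a hashable key (for example, the mapped positions of both ends), treats pairs with the same key as PCR duplicates, and uses made-up toy data.

```python
def duplicate_fraction(read_pairs):
    """Return the fraction of read pairs that are repeats of an earlier pair.

    read_pairs: iterable of hashable keys, one per sequenced read pair.
    """
    pairs = list(read_pairs)
    if not pairs:
        return 0.0
    n_unique = len(set(pairs))
    return (len(pairs) - n_unique) / len(pairs)


if __name__ == "__main__":
    # Toy data (invented for illustration): (chrom1, pos1, chrom2, pos2).
    toy_pairs = [
        ("chr1", 1000, "chr1", 52000),
        ("chr1", 1000, "chr1", 52000),  # duplicate of the first pair
        ("chr2", 300, "chr5", 9100),
        ("chr3", 780, "chr3", 66000),
        ("chr1", 1000, "chr1", 52000),  # another duplicate
    ]
    print(duplicate_fraction(toy_pairs))
```

A library built from little starting DNA needs more PCR cycles, so the same molecules are sequenced repeatedly and this fraction rises; a lower fraction leaves a larger share of the reads as usable data.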
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more, unless expressly and specifically defined otherwise.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "example", "specific example", "some examples", and the like means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification, as well as the features of those different embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that these embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (12)

  1. A method for constructing a DNA sequencing library of a genome to be tested, characterized in that it comprises:
    (1) digesting the genome to be tested with a restriction endonuclease to obtain a digestion product;
    (2) subjecting the digestion product to biotin labeling to obtain a biotin-labeled product;
    (3) ligating the biotin-labeled product with a DNA ligase to obtain a ligation product;
    (4) subjecting the ligation product to reversal of cross-linking;
    (5) purifying the product of the cross-link reversal;
    (6) subjecting the purified product to sonication and pull-down, the pull-down being carried out by contacting the sonicated product with streptavidin magnetic beads to obtain target DNA fragments bound to the streptavidin magnetic beads; and
    (7) constructing a library from the target DNA fragments bound to the streptavidin magnetic beads.
  2. The method according to claim 1, characterized in that the genome to be tested is at least a part of a genome obtained by lysing cells or tissue,
    optionally, the cells are a cell line or primary cells.
  3. The method according to claim 2, characterized in that the cells or tissue have previously been cross-linked with formaldehyde,
    preferably, the cells or tissue are transferred with a mouth pipette.
  4. The method according to claim 1, characterized in that the restriction endonuclease is MboI.
  5. The method according to claim 1, characterized in that the biotin labeling is carried out as follows:
    contacting the digestion product with deoxyadenosine triphosphate (dATP), deoxyguanosine triphosphate (dGTP), deoxythymidine triphosphate (dTTP), biotin-labeled deoxycytidine triphosphate (dCTP), and the large fragment of DNA polymerase, the contacting being carried out at 37 °C for 1.5 hours.
  6. The method according to claim 1, characterized in that the ligation is T4 ligation, and the reversal of cross-linking is carried out by contacting the ligation product with proteinase K, SDS, and sodium chloride,
    optionally, the purification is carried out by contacting the product of the cross-link reversal with pre-chilled absolute ethanol, the contacting being carried out at -80 °C for 15 minutes;
    preferably, during the purification the product of the cross-link reversal is contacted with glycogen and sodium acetate.
  7. The method according to claim 1, characterized in that the sonication is carried out for 134 s at a Peak Power of 50, a Duty Factor of 20, and a Cycles/Burst of 200.
  8. The method according to claim 1, characterized in that the library construction is TruSeq library construction and comprises subjecting the target DNA fragments bound to the streptavidin magnetic beads to end repair, addition of deoxyadenosine triphosphate to the ends, and ligation of sequencing adapters,
    preferably, the method further comprises washing the magnetic beads with Tween after the end repair and before the addition of deoxyadenosine triphosphate to the ends;
    preferably, the method further comprises washing the magnetic beads with Tween after the addition of deoxyadenosine triphosphate to the ends and before the ligation of sequencing adapters;
    preferably, the method further comprises washing the magnetic beads with Tween after the ligation of sequencing adapters.
  9. The method according to claim 8, characterized in that it further comprises subjecting the product of the sequencing adapter ligation to a first DNA elution and to PCR amplification,
    optionally, further comprising binding the PCR amplification product to AMPure XP magnetic beads,
    optionally, further comprising subjecting the PCR amplification product bound to the AMPure XP magnetic beads to a second DNA elution.
  10. A sequencing library, characterized in that the sequencing library is obtained by the method for constructing a DNA sequencing library of a genome to be tested according to any one of claims 1 to 9.
  11. A method for determining DNA sequence information of a genome to be tested, characterized in that it comprises:
    constructing a DNA sequencing library of the genome to be tested by the method according to any one of claims 1 to 9;
    sequencing the DNA sequencing library to obtain sequencing results; and
    determining the DNA sequence information of the genome to be tested on the basis of the sequencing results.
  12. A method for determining the three-dimensional spatial structure of a genome to be tested, characterized in that it comprises:
    constructing a DNA sequencing library of the genome to be tested by the method according to any one of claims 1 to 9;
    sequencing the DNA sequencing library to obtain sequencing results; and
    determining the three-dimensional spatial structure information of the genome to be tested on the basis of the sequencing results.
PCT/CN2017/114475 2017-07-07 2017-12-04 In situ whole genome chromatin conformation capture method for infinitesimal cells WO2019006975A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710552555.0A CN107217309A (en) 2017-07-07 2017-07-07 Build the method and its application in the DNA sequencing library of testing gene group
CN201710552555.0 2017-07-07

Publications (1)

Publication Number Publication Date
WO2019006975A1 true WO2019006975A1 (en) 2019-01-10

Family

ID=59952545

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/114475 WO2019006975A1 (en) 2017-07-07 2017-12-04 In situ whole genome chromatin conformation capture method for infinitesimal cells

Country Status (2)

Country Link
CN (1) CN107217309A (en)
WO (1) WO2019006975A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107217309A (en) * 2017-07-07 2017-09-29 清华大学 Build the method and its application in the DNA sequencing library of testing gene group
CN108265104B (en) * 2018-01-02 2021-07-30 北京诺禾致源科技股份有限公司 Chromosome configuration capture library and construction method thereof
CN109797436B (en) * 2018-12-29 2021-10-08 阅尔基因技术(苏州)有限公司 Sequencing library construction method
CN109735900A (en) * 2019-03-20 2019-05-10 嘉兴菲沙基因信息有限公司 A kind of small fragment DNA library construction method suitable for Hi-C
CN110396534A (en) * 2019-08-12 2019-11-01 华大生物科技(武汉)有限公司 The construction method of gene library, determined nucleic acid sample gene mutation detection method and kit
CN111364105B (en) * 2020-04-30 2021-09-07 华中农业大学 Simple and effective construction method of plant long fragment in situ DLO Hi-C sequencing library
CN112522251A (en) * 2020-12-29 2021-03-19 上海派森诺生物科技股份有限公司 Method for extracting Hi-C animal tissue

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105754995A (en) * 2016-04-19 2016-07-13 清华大学 Method for constructing DNA sequencing library of genome under detection and application of method
CN106566828A (en) * 2016-11-11 2017-04-19 中国农业科学院农业基因组研究所 Efficient whole-genome chromosome conformation capture technology (eHi-C)
CN106637422A (en) * 2016-12-16 2017-05-10 中国人民解放军军事医学科学院生物工程研究所 Method for constructing Hi-C high-throughput sequencing library
CN107217309A (en) * 2017-07-07 2017-09-29 清华大学 Build the method and its application in the DNA sequencing library of testing gene group

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030196214A1 (en) * 2002-03-27 2003-10-16 Priti Sharma Novel genes from drought stress tolerant tea plant and a method of introducing water-stress tolerance
US20070026397A1 (en) * 2003-02-21 2007-02-01 Nuevolution A/S Method for producing second-generation library
CN106591285B (en) * 2015-10-19 2019-11-29 浙江安诺优达生物科技有限公司 A method of constructing the library Hi-C of high availability data rate
CN106591955B (en) * 2015-10-19 2019-10-29 浙江安诺优达生物科技有限公司 Construct high-resolution, the method in the unicellular library Hi-C of large information capacity

Also Published As

Publication number Publication date
CN107217309A (en) 2017-09-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17916577

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17916577

Country of ref document: EP

Kind code of ref document: A1