WO2014205981A1 - High-throughput sequencing detection method for methylated CpG islands - Google Patents

High-throughput sequencing detection method for methylated CpG islands

Info

Publication number
WO2014205981A1
WO2014205981A1 PCT/CN2013/087231 CN2013087231W WO2014205981A1 WO 2014205981 A1 WO2014205981 A1 WO 2014205981A1 CN 2013087231 W CN2013087231 W CN 2013087231W WO 2014205981 A1 WO2014205981 A1 WO 2014205981A1
Authority
WO
WIPO (PCT)
Prior art keywords
primer
dna
linker
cpg
end portion
Prior art date
Application number
PCT/CN2013/087231
Other languages
English (en)
French (fr)
Inventor
文路
李静宜
黄岩谊
汤富酬
Original Assignee
北京大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京大学
Priority to EP13888369.9A (patent EP3015552B1)
Priority to US14/392,322 (patent US10100351B2)
Publication of WO2014205981A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813: Hybridisation assays
    • C12Q1/6827: Hybridisation assays for detection of mutation or polymorphism
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/6853: Nucleic acid amplification reactions using modified primers or templates
    • C12Q1/6855: Ligating adaptors
    • C12Q1/6858: Allele-specific amplification
    • C12Q1/6869: Methods for sequencing
    • C12Q1/6874: Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation

Definitions

  • Technical field: The present invention relates to the use of high-throughput sequencing to detect methylated CpG islands in the genome.
  • Background: DNA methylation is the covalent addition of a methyl group to the carbon-5 atom of cytosine (C) in DNA, producing 5-methylcytosine (5mC). It has a variety of important biological functions, including transcriptional regulation, transposon silencing, genomic imprinting and X-chromosome inactivation, and has long been a research focus in molecular biology. In vertebrates, including humans, the vast majority of DNA methylation occurs at CpG sites (CpG denotes a dinucleotide in which guanine (G) immediately follows C on the DNA strand).
  • The average CpG content of vertebrate genomes is lower than the expected probability, but in some segments of the genome CpG occurs at or above the expected frequency; these segments are called CpG islands.
  • CpG islands are located mainly in gene promoters. The human genome contains about 30,000 CpG islands, more than 50% of which lie in promoters, and more than 60% of gene promoters contain a CpG island. Methylation of a promoter CpG island silences expression of the corresponding gene. Studies have shown that this mechanism is involved in regulating a variety of physiological processes, including X-chromosome inactivation, genomic imprinting, embryonic stem cell differentiation, germ cell development, and the initiation and progression of tumors. Intragenic and intergenic CpG islands may be promoters that have not yet been identified. A comprehensive understanding of the biological functions of CpG island methylation requires systematic and efficient detection techniques.
  • The principle of bisulfite sequencing is that bisulfite treatment converts C in DNA to uracil (U) while leaving 5mC unaffected, so that subsequent high-throughput sequencing reveals the methylation state of the DNA.
  • The method has single-base resolution and is the gold standard for DNA methylation analysis.
  • A single-base-resolution whole-genome methylation map of the human genome obtained by bisulfite sequencing was first reported in 2009. However, because this technique sequences the whole genome, it is very expensive, which hinders its application to large numbers of samples. Researchers later proposed reduced representation bisulfite sequencing (Gu H, et al. Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling. Nat Protoc. 2011;6(4):468-81).
  • This method enriches promoter and CpG island regions through MspI digestion, gel electrophoresis and purification, and then builds the sequencing library through end repair, A-tailing, adaptor ligation, fragment selection and purification, and PCR amplification. Although it is more economical and efficient than whole-genome bisulfite sequencing, library construction is cumbersome and takes about 5 to 6 days, and the enrichment step cannot distinguish methylated from unmethylated CpG islands, which increases sequencing cost.
  • The patent "Construction method of high-throughput sequencing library and application thereof" (CN103103624A) captures target fragments, including CpG islands, with specific probes and then performs bisulfite sequencing, but sequence capture and library construction are likewise quite time-consuming.
  • The method of the invention uses high-CpG-frequency primers to enrichment-amplify methylated CpG islands in modifier-converted genomic DNA, and at the same time attaches the linker (adaptor) sequences to the target fragments through a three-step PCR reaction. It thereby enriches methylated CpG islands efficiently and constructs a high-throughput sequencing library rapidly, providing a novel, efficient and economical method for high-throughput sequencing detection of methylated CpG islands.
  • The invention provides a high-throughput sequencing detection method for methylated CpG islands, comprising the following steps. Step 1: A DNA sample is treated with a modifying agent to convert cytosine in the sample to uracil while leaving 5-methylcytosine unchanged, giving converted DNA fragments.
  • The DNA can be any polymer comprising deoxynucleotides.
  • The DNA sample is genomic DNA derived from at least one of an animal, a plant and a microorganism; preferably, the animal is at least one of human and mouse.
  • The DNA sample may be derived from human cells, tissue, blood, body fluids, urine, feces, or a combination thereof. In a preferred form, the DNA sample is cell-free DNA from human plasma or serum; optionally, the DNA sample is genomic DNA from human whole blood; optionally, the DNA sample is from a human tumor cell line.
  • The modifying agent used to treat the DNA sample modifies C, but not 5mC, under conditions in which the DNA is single-stranded.
  • Bisulfite, acetate or citrate may be used.
  • Preferably, the modifying agent is bisulfite.
  • The DNA sample can be bisulfite-treated with a commercial kit; optionally, the MethylCode Bisulfite Conversion Kit (Invitrogen), EZ DNA Methylation-Gold Kit (ZYMO) or EpiTect Bisulfite Kit (Qiagen) can be used.
  • Step 2: The converted DNA fragments are linearly amplified at least once with primer A and a DNA polymerase, to obtain target fragments capable of anchoring linker primer C at one end.
  • Primer A consists of a 3' end portion and a 5' end portion. The 3' end portion binds and amplifies the converted DNA fragments; it is at least 4 nucleotides long and is able to bind the converted DNA fragments.
  • Preferably, apart from CpG, the 3' end portion of primer A contains only C, A and T (as those skilled in the art will understand, CpG denotes a dinucleotide in which G immediately follows C on the DNA strand).
  • More preferably, the second nucleotide from the 3' end of primer A is C.
  • Preferably, the 3' end portion of primer A can also be used for preliminary enrichment of methylated CpG islands, so that step 2 yields target fragments that are preliminarily enriched in CpG islands and can anchor linker primer C at one end.
  • Preliminary enrichment in step 2 improves the overall enrichment efficiency, but further enrichment of methylated CpG islands in step 3 is still needed to achieve full enrichment.
  • Preliminary enrichment requires the 3' end portion of primer A to have a certain CpG frequency or C frequency.
  • Preferably, the 3' end portion has a medium CpG frequency, meaning that the first 7 nucleotides from the 3' end of the primer contain 1 CpG. Preferably, it has a high C frequency, meaning that the first 5 nucleotides from the 3' end contain at least 3 C, preferably with at least 3 consecutive C starting from the 3' end. Optionally, it has a high CpG frequency, meaning that the first 7 nucleotides from the 3' end contain 2 or 3 CpGs; in that case, further enrichment of methylated CpG islands in step 3 may be omitted (see below).
  • The 5' end portion of primer A anchors linker primer C: its reverse complement can be bound by linker primer C and amplified by PCR.
  • Anchoring means that primer A, through its 5' end portion, allows linker primer C to be attached to the target fragment by a PCR reaction.
  • Preferably, the 5' end portion of primer A is identical to the first 15 to 40 nucleotides from the 3' end of linker primer C.
  • The DNA polymerase can be any suitable polymerase, such as Taq polymerase, ExTaq, LATaq DNA polymerase, AmpliTaq, AmpliTaq Gold, Titanium Taq polymerase, KlenTaq DNA polymerase, Platinum Taq polymerase, AccuPrime Taq polymerase, Pyrobest DNA polymerase, Pfu polymerase, Pfu polymerase turbo, Phusion polymerase, Pwo polymerase, Vent polymerase, Vent Exo- polymerase, Sequenase™ polymerase, 9°Nm DNA polymerase, Therminator DNA polymerase, Expand DNA polymerase, rTth DNA polymerase, DyNAzyme™ EXT polymerase, DNA polymerase I, T7 DNA polymerase, T4 DNA polymerase, Bst DNA polymerase, phi-29 DNA polymerase and Klenow fragment.
  • Preferably, the DNA polymerase has strand displacement activity.
  • It can be any suitable polymerase with strand displacement activity, including but not limited to DNA polymerase I (Klenow) large fragment (New England Biolabs (NEB) catalog number M0210S), Klenow fragment (exo-) (NEB catalog number M0212S), Bst DNA polymerase large fragment (NEB catalog number M0275S), Vent (exo-) (NEB catalog number M0257S), Deep Vent (exo-) (NEB catalog number M0259S), M-MuLV reverse transcriptase (NEB catalog number M0253S), 9°Nm DNA polymerase (NEB catalog number M0260S) and phi-29 DNA polymerase (NEB catalog number M0269S).
  • In a preferred form, the DNA polymerase is Klenow fragment (exo-).
  • Preferably, the DNA polymerase is exonuclease-deficient.
  • Linear amplification means that the amount of amplification product increases linearly, rather than exponentially, with the number of amplification rounds. At least one round of linear amplification is required; preferably, 2 to 30 rounds are performed.
  • Step 3: The target fragments capable of anchoring linker primer C at one end are amplified with primer B and a DNA polymerase, to obtain target fragments that are enriched in methylated CpG islands and whose two ends can anchor linker primer C and linker primer D, respectively.
  • Primer B consists of a 3' end portion and a 5' end portion. The 3' end portion binds and enrichment-amplifies methylated CpG islands and is characterized by: 1) a length of at least 7 nucleotides; 2) a high CpG frequency, meaning that the first 7 nucleotides from the 3' end contain 2 or 3 CpGs.
  • Preferably, apart from CpG, the primer contains only G, A and T.
  • Enrichment amplification means that the primer tends to bind and amplify methylated CpG island regions of the genome and tends not to bind and amplify unmethylated CpG island regions.
  • The reason for enrichment amplification is that the sequence of the 3' end portion of primer B has a high CpG frequency.
  • Our bioinformatics analysis shows that primer sequences with this high CpG frequency are not randomly distributed in the genome but are concentrated, to varying degrees, in CpG island regions, so primers with a high CpG frequency can serve to enrich methylated CpG islands.
  • Preferably, the 3' end portion of primer B has a high GC content, meaning that C and G together account for at least 7 of the first 10 nucleotides from the 3' end. Besides concentrating the primer sequences further in CpG island regions, a high GC content also raises the annealing temperature of the primer: as those skilled in the art will understand, primers with a high GC content have a higher annealing temperature. With this feature the annealing temperature of the primer can reach about 40 to about 60 °C, which helps the primer bind the template and improves amplification efficiency.
  • Preferably, the 3' end portion of primer B has a very high CpG frequency, meaning that the 7 nucleotides at the 3' end contain 3 CpGs.
  • The 5' end portion of primer B anchors linker primer D: its reverse complement can be bound by linker primer D and amplified by PCR.
  • Anchoring means that primer B, through its 5' end portion, allows linker primer D to be attached to the target fragment by a PCR reaction.
  • Preferably, the 5' end portion of primer B is identical to the first 15 to 40 nucleotides from the 3' end of linker primer D.
  • The DNA polymerase can be any suitable polymerase described above.
  • Preferably, the DNA polymerase is a hot-start polymerase.
  • Preferably, the annealing temperature of the amplification reaction is about 40 to about 60 °C.
  • When a primer A whose 3' end portion has the high CpG frequency is used for the amplification in step 2, step 2 has already enriched methylated CpG islands (with one end able to anchor linker primer C). The 3' end portion of primer B in step 3 then only needs to bind and amplify these target fragments to yield target fragments that are enriched in methylated CpG islands and whose two ends can anchor linker primer C and linker primer D, respectively; no further enrichment of methylated CpG islands is needed in step 3.
  • Step 4: The target fragments that are enriched in methylated CpG islands and whose two ends can anchor linker primer C and linker primer D, respectively, are exponentially amplified by PCR with linker primer C, linker primer D and a DNA polymerase, to obtain the amplification product.
  • "Linker primer" here means a primer that plays the same role as the adaptor in conventional high-throughput sequencing library construction methods (such as the TruSeq DNA Sample Prep Kit from Illumina and the SOLiD™ Fragment Library Construction Kit from Applied Biosystems (ABI)): it binds the DNA fragment to be tested to the sequencing chip and allows it to be enriched by PCR amplification and sequenced.
  • Unlike conventional library construction, which adds the adaptor sequences by a ligation reaction, the method of the invention uses the anchor sequences in the 5' portions of primer A and primer B to attach the linker sequences to both ends of the DNA fragment to be tested by PCR.
  • Linker primer C and linker primer D correspond to the adaptor sequences of a specific high-throughput sequencing platform, including but not limited to Illumina's Genome Analyzer IIx, HiSeq and MiSeq, ABI's SOLiD, 5500 W Series Genetic Analyzer and Ion Torrent PGM, and Roche 454's GS Junior and GS FLX+.
  • Step 5: The amplification product is separated and purified to form the high-throughput sequencing library, which is then sequenced and the data analyzed.
  • The amplification product may be separated and purified by any suitable method, including but not limited to magnetic bead purification, column purification and agarose gel electrophoresis purification.
  • The purification method is able to select target fragments by length.
  • The target fragments are 160-400 bp long.
  • Target fragments of 160-400 bp are selected by electrophoresis on a 2% agarose gel and purified by gel recovery.
  • High-throughput sequencing platforms for analyzing the sequencing library include, but are not limited to, Illumina's Genome Analyzer IIx, HiSeq and MiSeq, ABI's SOLiD, 5500 W Series Genetic Analyzer and Ion Torrent PGM, and Roche 454's GS Junior and GS FLX+.
  • The data analysis method is not limited; any suitable data analysis and sequence alignment software can be used, including but not limited to Bismark, BSMAP, Bowtie and SOAP.
  • The invention provides a kit for high-throughput sequencing detection of methylated CpG islands, comprising primer A, primer B, linker primer C, linker primer D, a DNA polymerase and instructions for using the kit.
  • Drawings: Figure 1 is a schematic diagram of the high-throughput sequencing detection method for methylated CpG islands of the invention.
  • Figure 2 shows the agarose gel analysis of genomic DNA from HeLa cells and human peripheral blood mononuclear cells (hPBMC) amplified using the method of the invention.
  • Figure 3 shows the Bioanalyzer 2100 results for the methylated CpG island high-throughput sequencing libraries of HeLa cell and hPBMC genomic DNA.
  • Figure 4 shows a comparison of the high-throughput sequencing results for the MAEL, ILDR2 and CDKN2A gene regions obtained with the method of the invention and with whole-genome bisulfite sequencing.
  • Figure 5 shows the distribution of short nucleotide sequences containing 1 to 3 CpGs in CpG island and non-CpG-island regions of the human genome.
  • Figure 6 shows the degree of CpG island enrichment obtained with different combinations of the 3' end portion sequences of primer A and primer B.
  • Step 1: Bisulfite treatment of the DNA sample.
  • The MethylCode™ Bisulfite Conversion Kit (Invitrogen) was used according to the manufacturer's instructions, as follows:
  • The PCR tube is run through the following program on a thermal cycler: 98 °C for 10 minutes, 64 °C for 2.5 hours, then storage at 4 °C (no more than 20 hours).
  • Step 2: Linear amplification with primer A and DNA polymerase.
  • The DNA obtained in step 1 is used to set up the following amplification reaction in a PCR tube:
  • The following amplification reaction was set up in a PCR tube:
  • Step 4: Exponential amplification with linker primer C, linker primer D and DNA polymerase.
  • TCT, wherein the underlined portion is identical to the 5' end portion of the primer;
  • Step 5: Fragment selection, purification, high-throughput sequencing and data analysis.
  • The library was sequenced on a high-throughput platform in paired-end mode with a read length of 100 bp to obtain the raw sequencing data.
  • Figure 2 shows the 2% agarose gel electrophoresis results after amplification of HeLa cell and hPBMC genomic DNA using the above method of the invention.
  • Lanes 1 and 4 are the DNA marker.
  • Lane 2 is HeLa cell genomic DNA; the starting amount of DNA for sodium bisulfite treatment was 15 ng.
  • Lane 3 is hPBMC genomic DNA (from an adult male); the starting amount of DNA for sodium bisulfite treatment was 30 ng.
  • Lane 5 is the control amplification reaction without a DNA sample.
  • The results show that both the HeLa cell and hPBMC genomic DNA samples were successfully amplified. This indicates that as little as 15 ng of starting genomic DNA is sufficient for high-throughput sequencing detection of methylated CpG islands by the method of the invention, demonstrating its high efficiency and sensitivity.
  • Figure 3 shows the Bioanalyzer 2100 results for the high-throughput sequencing libraries of HeLa cell and hPBMC genomic DNA obtained by the above method of the invention.
  • The DNA fragments separated on the 2% agarose gel described above were excised, recovered and purified; the resulting DNA constitutes the high-throughput sequencing library.
  • Figure 3A shows the result for the HeLa cell genomic DNA sequencing library.
  • Figure 3B shows the result for the hPBMC genomic DNA sequencing library. The fragment lengths are between 160 and 280 bp.
  • The two sequencing libraries were subjected to high-throughput sequencing; the raw data amounted to 1.3 G for HeLa cell genomic DNA and 1.5 G for hPBMC genomic DNA.
  • The statistical analysis of the data is shown in Table 1.
  • CpG island regions account for approximately 0.7% of the human genome, whereas 39% and 20% of the high-throughput sequencing data obtained with the method of the invention from HeLa cells and hPBMC, respectively, map to CpG islands, corresponding to enrichment of about 56-fold and 29-fold.
  • With only 1 to 2 Gb of raw sequencing data, the method of the invention achieves an average sequencing depth of 20-30x over genomic regions containing methylated CpG islands, whereas whole-genome bisulfite sequencing requires 150-200 Gb of raw sequencing data to reach the same depth.
  • This indicates that the method of the invention greatly improves the efficiency of high-throughput sequencing detection of methylated CpG islands.
  • Figure 4 shows a comparison of the high-throughput sequencing results for the MAEL, ILDR2 and CDKN2A gene regions obtained with the method of the invention and with whole-genome bisulfite sequencing.
  • Figure 4A shows the results for the MAEL and ILDR2 gene regions.
  • MAEL (maelstrom spermatogenic transposon silencer) is a testis-specific gene whose promoter CpG island is demethylated in germ cells but highly methylated in somatic cells.
  • ILDR2 (immunoglobulin-like domain containing receptor 2)
  • The results of the method of the invention show that, in the HeLa and hPBMC genomes, the promoter CpG island of MAEL was amplified and found by sequencing to be highly methylated, whereas the promoter CpG island of ILDR2 was not amplified. This indicates that the method selectively amplifies methylated CpG islands and does not amplify unmethylated CpG islands.
  • Figure 4B shows the results for the CDKN2A (cyclin-dependent kinase inhibitor 2A) gene region.
  • The CDKN2A gene region contains multiple CpG islands, which are demethylated in somatic cells and partially methylated in tumor cells.
  • The results of the method of the invention show that two of the four CpG islands in this gene region were amplified from the HeLa cell genome and found by sequencing to be highly methylated, whereas none of the four CpG islands was amplified from the hPBMC genome. This further indicates that the method of the invention can enrich and efficiently amplify methylated CpG islands for high-throughput sequencing detection.
  • Figure 5 shows the bioinformatics analysis of the distribution of short nucleotide sequences containing 1 to 3 CpGs in CpG island and non-CpG-island regions of the human genome. The degree of enrichment of a short nucleotide sequence in CpG islands is positively correlated with the number of CpGs it contains.
  • Figure 6 shows the analysis of CpG island enrichment in HeLa cell genomic high-throughput sequencing libraries obtained by the method of the invention using different combinations of the 3' end portion sequences of primer A and primer B. The degree of CpG island enrichment is positively correlated with the CpG frequency of the primer sequence.
  • H = C/A/T
  • D = G/A/T
  • R = G/A
  • N = A/T/C/G
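
The degenerate base codes listed at the end of the definitions (H, D, R, N) can be expanded mechanically into the concrete sequences they stand for. The following is a minimal sketch for illustration only: the function name `expand_degenerate` and the example patterns are hypothetical and are not primer sequences taken from the patent.

```python
from itertools import product

# Degenerate codes as listed in the source; any other letter stands for itself.
DEGENERATE = {
    "H": "CAT",   # C/A/T
    "D": "GAT",   # G/A/T
    "R": "GA",    # G/A
    "N": "ATCG",  # A/T/C/G
}


def expand_degenerate(pattern):
    """Return every concrete sequence encoded by a degenerate pattern, sorted."""
    choices = [DEGENERATE.get(base, base) for base in pattern.upper()]
    return sorted("".join(combo) for combo in product(*choices))


# Hypothetical pattern: one degenerate position followed by a CpG.
print(expand_degenerate("RCG"))
```

A pattern with k positions of H or D expands to 3^k sequences, so long degenerate stretches are better handled as patterns than as explicit lists.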

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

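A high-throughput sequencing detection method for methylated CpG islands, comprising: treating a DNA sample with a modifying agent to convert cytosine in the sample to uracil while 5-methylcytosine remains unchanged; amplifying the resulting fragments with primer A and a DNA polymerase to obtain fragments capable of anchoring linker primer C at one end; amplifying the resulting fragments with primer B and a DNA polymerase to obtain fragments that are enriched in methylated CpG islands and whose two ends can anchor linker primers C and D, respectively; exponentially amplifying the resulting fragments by PCR with linker primers C and D and a DNA polymerase to obtain an amplification product; and separating and purifying the amplification product to form a high-throughput sequencing library, followed by sequencing and data analysis. High-CpG-frequency primers enrichment-amplify the methylated CpG islands in the modifier-converted genomic DNA while a three-step PCR reaction attaches the linker sequences to the target fragments, thereby enriching methylated CpG islands and constructing the high-throughput sequencing library.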

Description

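High-throughput sequencing detection method for methylated CpG islands
Technical Field
The present invention relates to the use of high-throughput sequencing to detect methylated CpG islands in the genome.
Background Art
DNA methylation is the covalent addition of a methyl group to the carbon-5 atom of cytosine (C) in DNA, producing 5-methylcytosine (5mC). It has a variety of important biological functions, including transcriptional regulation, transposon silencing, genomic imprinting and X-chromosome inactivation, and has long been a research focus in molecular biology. In vertebrates, including humans, the vast majority of DNA methylation occurs at CpG sites (CpG denotes a dinucleotide in which guanine (G) immediately follows C on the DNA strand). The average CpG content of vertebrate genomes is lower than the expected probability, but in some segments of the genome CpG occurs at or above the expected frequency; these segments are called CpG islands. CpG islands are located mainly in gene promoters. The human genome contains about 30,000 CpG islands, more than 50% of which lie in promoters, and more than 60% of gene promoters contain a CpG island. Methylation of a promoter CpG island silences expression of the corresponding gene. Studies have shown that this mechanism is involved in regulating a variety of physiological processes, including X-chromosome inactivation, genomic imprinting, embryonic stem cell differentiation, germ cell development, and the initiation and progression of tumors. Intragenic and intergenic CpG islands may be promoters that have not yet been identified. A comprehensive understanding of the biological functions of CpG island methylation requires systematic and efficient detection techniques.
Traditional methods for detecting DNA methylation, including restriction digestion, restriction digestion-PCR and methylation-specific PCR, can examine only one or a few sites. In recent years, the development of high-throughput sequencing has made it possible to study DNA methylation patterns systematically at the whole-genome level. Current DNA methylation detection methods based on high-throughput sequencing include: 1) methylated DNA immunoprecipitation; 2) methylated CpG immunoprecipitation; and 3) bisulfite sequencing. The first two capture methylated DNA with an antibody or a recombinant methyl-CpG-binding protein and then sequence it; they give only a semi-quantitative measure of methylation status, with a resolution of about 100 bp. The principle of bisulfite sequencing is that bisulfite treatment converts C in DNA to uracil (U) while leaving 5mC unaffected, so that subsequent high-throughput sequencing reveals the methylation state of the DNA. The method has single-base resolution and is the gold standard for DNA methylation analysis. A single-base-resolution whole-genome methylation map of the human genome obtained by bisulfite sequencing was first reported in 2009. However, because this technique sequences the whole genome, it is very expensive, which hinders its application to large numbers of samples. Researchers later proposed reduced representation bisulfite sequencing (Gu H, et al. Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling. Nat Protoc. 2011;6(4):468-81). This method enriches promoter and CpG island regions through MspI digestion, gel electrophoresis and purification, and then builds the sequencing library through end repair, A-tailing, adaptor ligation, fragment selection and purification, and PCR amplification. Although it is more economical and efficient than whole-genome bisulfite sequencing, library construction is cumbersome and takes about 5 to 6 days, and the enrichment step cannot distinguish methylated from unmethylated CpG islands, which increases sequencing cost. The patent "Construction method of high-throughput sequencing library and application thereof" (CN103103624A) captures target fragments, including CpG islands, with specific probes and then performs bisulfite sequencing, but sequence capture and library construction are likewise quite time-consuming.
An efficient high-throughput sequencing detection method for methylated CpG islands is therefore still lacking.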
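
The bisulfite principle described in the background (unmethylated C is converted to U, 5mC is left unchanged) can be illustrated in silico. The following is a minimal sketch and not part of the patent: the function name `bisulfite_convert` and the example fragment are hypothetical, and it assumes complete conversion and that U is read as T after amplification.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate how one DNA strand reads after bisulfite treatment and PCR.

    seq: DNA string written 5'->3'.
    methylated_positions: 0-based indices of cytosines that carry 5mC.
    Assumes 100% conversion efficiency, which is an idealization.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            # Unmethylated C is deaminated to U, which is read as T.
            converted.append("T")
        else:
            # 5mC, and all A, G and T, are left unchanged.
            converted.append(base)
    return "".join(converted)


# Hypothetical fragment in which only the C at index 1 is methylated.
print(bisulfite_convert("ACGTCCG", methylated_positions={1}))
```

After conversion, the only C remaining in a strand marks a methylated position, which is why primers that require CpG at their 3' end can discriminate methylated from unmethylated templates.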
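Summary of the Invention
The method of the invention uses high-CpG-frequency primers to enrichment-amplify methylated CpG islands in modifier-converted genomic DNA, and at the same time attaches the linker (adaptor) sequences to the target fragments through a three-step PCR reaction. It thereby enriches methylated CpG islands efficiently and constructs a high-throughput sequencing library rapidly, providing a novel, efficient and economical method for high-throughput sequencing detection of methylated CpG islands.
In a first aspect, the invention provides a high-throughput sequencing detection method for methylated CpG islands, comprising the following steps: Step 1: A DNA sample is treated with a modifying agent to convert cytosine in the sample to uracil while leaving 5-methylcytosine unchanged, giving converted DNA fragments.
The DNA can be any polymer comprising deoxynucleotides. Preferably, the DNA sample is genomic DNA derived from at least one of an animal, a plant and a microorganism; preferably, the animal is at least one of human and mouse.
The DNA sample may be derived from human cells, tissue, blood, body fluids, urine, feces, or a combination thereof. In a preferred form, the DNA sample is cell-free DNA from human plasma or serum; optionally, the DNA sample is genomic DNA from human whole blood; optionally, the DNA sample is from a human tumor cell line.
The modifying agent used to treat the DNA sample modifies C, but not 5mC, under conditions in which the DNA is single-stranded. Bisulfite, acetate or citrate may be used; preferably, the modifying agent is bisulfite. The DNA sample can be bisulfite-treated with a commercial kit; optionally, the MethylCode Bisulfite Conversion Kit (Invitrogen), EZ DNA Methylation-Gold Kit (ZYMO) or EpiTect Bisulfite Kit (Qiagen) can be used.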
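Step 2: The converted DNA fragments are linearly amplified at least once with primer A and a DNA polymerase, to obtain target fragments capable of anchoring linker primer C at one end.
Primer A consists of a 3' end portion and a 5' end portion. The 3' end portion binds and amplifies the converted DNA fragments; it is at least 4 nucleotides long and is able to bind the converted DNA fragments. Preferably, apart from CpG, the 3' end portion of primer A contains only C, A and T (as those skilled in the art will understand, CpG denotes a dinucleotide in which G immediately follows C on the DNA strand). More preferably, the second nucleotide from the 3' end of primer A is C. Preferably, the 3' end portion of primer A can also be used for preliminary enrichment of methylated CpG islands, so that step 2 yields target fragments that are preliminarily enriched in CpG islands and can anchor linker primer C at one end. Preliminary enrichment in step 2 improves the overall enrichment efficiency, but further enrichment of methylated CpG islands in step 3 is still needed to achieve full enrichment. Preliminary enrichment requires the 3' end portion of primer A to have a certain CpG frequency or C frequency. Preferably, the 3' end portion has a medium CpG frequency, meaning that the first 7 nucleotides from the 3' end of the primer contain 1 CpG. Preferably, it has a high C frequency, meaning that the first 5 nucleotides from the 3' end contain at least 3 C, preferably with at least 3 consecutive C starting from the 3' end. Optionally, it has a high CpG frequency, meaning that the first 7 nucleotides from the 3' end contain 2 or 3 CpGs; in that case, further enrichment of methylated CpG islands in step 3 may be omitted (see below).
The 5' end portion of primer A anchors linker primer C: its reverse complement can be bound by linker primer C and amplified by PCR. Anchoring means that primer A, through its 5' end portion, allows linker primer C to be attached to the target fragment by a PCR reaction. Preferably, the 5' end portion of primer A is identical to the first 15 to 40 nucleotides from the 3' end of linker primer C.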
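
The frequency definitions given for the 3' end portion of primer A (medium CpG, high C, high CpG) are simple counting rules and can be checked programmatically. The sketch below is illustrative only: the function name `primer_a_profile` and the example sequences are hypothetical, and sequences are assumed to be written 5'->3', so "the first N nucleotides from the 3' end" are the last N characters.

```python
def primer_a_profile(three_prime_portion):
    """Classify the 3' end portion of a candidate primer A (written 5'->3')."""
    s = three_prime_portion.upper()
    # CpG = C immediately followed by G; occurrences of "CG" cannot overlap.
    cpg_last7 = s[-7:].count("CG")
    return {
        "long_enough": len(s) >= 4,               # at least 4 nucleotides
        "medium_cpg": cpg_last7 == 1,             # 1 CpG in the last 7 nt
        "high_cpg": cpg_last7 in (2, 3),          # 2 or 3 CpGs in the last 7 nt
        "high_c": s[-5:].count("C") >= 3,         # at least 3 C in the last 5 nt
        "three_c_at_3prime": s.endswith("CCC"),   # 3 consecutive C at the 3' end
    }


# Hypothetical 3' end portions, not sequences from the patent.
print(primer_a_profile("AATCCCGC"))
print(primer_a_profile("TACGCGA"))
```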
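The DNA polymerase may be any suitable polymerase, such as Taq polymerase, ExTaq, LATaq DNA polymerase, AmpliTaq, Amplitaq Gold, Titanium Taq polymerase, KlenTaq DNA polymerase, Platinum Taq polymerase, Accuprime Taq polymerase, Pyrobest DNA polymerase, Pfu polymerase, Pfu polymerase turbo, Phusion polymerase, Pwo polymerase, Vent polymerase, Vent Exo- polymerase, Sequenase™ polymerase, 9°Nm DNA polymerase, Therminator DNA polymerase, Expand DNA polymerase, rTth DNA polymerase, DyNazyme™ EXT polymerase, DNA polymerase I, T7 DNA polymerase, T4 DNA polymerase, Bst DNA polymerase, phi-29 DNA polymerase, and Klenow fragment.
Preferably, the DNA polymerase has strand-displacement activity. It may be any suitable polymerase with strand-displacement activity, including but not limited to DNA polymerase I (Klenow) large fragment (New England Biolabs (NEB) catalog no. M0210S), Klenow fragment (exo-) (NEB catalog no. M0212S), Bst DNA polymerase large fragment (NEB catalog no. M0275S), Vent (exo-) (NEB catalog no. M0257S), Deep Vent (exo-) (NEB catalog no. M0259S), M-MuLV reverse transcriptase (NEB catalog no. M0253S), 9°Nm DNA polymerase (NEB catalog no. M0260S), and Phi-29 DNA polymerase (NEB catalog no. M0269S). In a preferred form, the DNA polymerase is Klenow fragment (exo-).
Preferably, the DNA polymerase is exonuclease-deficient.
Linear amplification means that the amount of amplification product increases linearly, rather than exponentially, with the number of amplification rounds. At least one round of linear amplification is required; preferably, 2 to 30 rounds are performed.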
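The difference between linear and exponential growth can be made concrete with a small Python sketch. It is an idealised model, assuming that every template is copied exactly once per round and, for the linear case, that only the original templates are copied; neither assumption is stated in the source, and real yields are lower.

```python
def linear_product(templates, rounds):
    """New strands after single-primer rounds: each round copies only the originals."""
    return templates * rounds


def exponential_product(templates, cycles):
    """Molecules after two-primer PCR cycles with perfect doubling."""
    return templates * 2 ** cycles


for n in (1, 4, 20):
    print(n, linear_product(100, n), exponential_product(100, n))
```

With 100 starting templates, 4 rounds give 400 new strands in the linear model and 1,600 molecules in the exponential one, and the gap widens rapidly with further rounds.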
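Step 3: amplifying the target fragments, one end of which can anchor adapter primer C, using primer B and a DNA polymerase, so as to obtain target fragments that are enriched for methylated CpG islands and whose two ends can anchor adapter primer C and adapter primer D, respectively;
wherein primer B consists of a 3' end portion and a 5' end portion. The 3' end portion is used to bind methylated CpG islands and enrich them by amplification, and is characterized by: 1) a length of at least 7 nucleotides; and 2) a high CpG frequency, meaning that the first 7 nucleotides from the 3' end contain 2 or 3 CpG. Preferably, apart from CpG, the primer contains only G, A, and T.
Enrichment amplification means that the primer tends to bind and amplify methylated CpG island regions of the genome rather than unmethylated CpG island regions. The enrichment arises because the sequence of the 3' end portion of primer B has a high CpG frequency: after conversion, a CpG dinucleotide survives in the template only where the cytosine was methylated, so a CpG-rich primer finds matching sites only on methylated templates. Our bioinformatic analysis shows that primer sequences with this high CpG frequency are not randomly distributed in the genome but are clustered, to varying degrees, in CpG island regions, so primers with a high CpG frequency can enrich methylated CpG islands. Preferably, the 3' end portion of primer B has a high GC content, meaning that C and G together account for at least 7 of the first 10 nucleotides from the 3' end. Besides concentrating the primer sequences further in CpG island regions, a high GC content helps raise the annealing temperature of the primer; as those skilled in the art will understand, primers with a higher GC content have higher annealing temperatures. With this high GC content the annealing temperature of the primer can reach about 40 to about 60 °C, which facilitates primer-template binding and improves amplification efficiency. Preferably, the 3' end portion of primer B has an extremely high CpG frequency, meaning that the 7 nucleotides at the 3' end contain 3 CpG.
The 5' end portion of primer B is used to anchor adapter primer D and is characterized in that its reverse complement can be bound by adapter primer D and amplified by PCR. Anchoring means that primer B, through its 5' end portion, allows adapter primer D to be attached to the target fragment by a PCR reaction. Preferably, the 5' end portion of primer B is identical to the first 15 to 40 nucleotides from the 3' end of adapter primer D.
The DNA polymerase may be any of the suitable polymerases listed above. Preferably, the DNA polymerase is a hot-start polymerase. Preferably, the annealing temperature of the amplification reaction is about 40 to about 60 °C.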
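The three sequence criteria for the 3' end portion of primer B (high CpG frequency, high GC content, extremely high CpG frequency) can be written as a short Python check. The function names are illustrative. The example input DDDDCGCGCGG is assumed to be the 3' end portion of the primer B used in the embodiment, since the underlining that marks it is lost in this text. Degenerate letters are not counted, so the GC count is a lower bound.

```python
def count_cpg(seq):
    """Count CpG dinucleotides written literally as 'CG' (5'->3')."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def primer_b_features(three_prime_portion):
    """Evaluate the length, CpG-frequency and GC-content criteria for primer B."""
    portion = three_prime_portion.upper()
    last7 = portion[-7:]
    last10 = portion[-10:]
    n_cpg = count_cpg(last7)
    n_gc = last10.count("C") + last10.count("G")  # degenerate bases ignored
    return {
        "long_enough": len(portion) >= 7,
        "cpg_in_last_7": n_cpg,
        "high_cpg": n_cpg in (2, 3),
        "very_high_cpg": n_cpg == 3,
        "gc_in_last_10": n_gc,
        "high_gc": n_gc >= 7,
    }


print(primer_b_features("DDDDCGCGCGG"))  # assumed 3' portion of the example primer B
```

For this sequence all three criteria are met: the last 7 nucleotides contain 3 CpG, and at least 7 of the last 10 nucleotides are C or G.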
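When a primer A whose 3' end portion has the high CpG frequency described above is used for the Step 2 amplification, methylated CpG islands are already enriched in Step 2 (and one end of the fragment can anchor adapter primer C). In that case, the 3' end portion of primer B in Step 3 only needs to bind and amplify these target fragments in order to yield target fragments that are enriched for methylated CpG islands and whose two ends can anchor adapter primer C and adapter primer D, respectively; no further enrichment of methylated CpG islands is needed in Step 3.
Step 4: subjecting the target fragments, which are enriched for methylated CpG islands and whose two ends can anchor adapter primer C and adapter primer D respectively, to exponential PCR amplification using adapter primer C, adapter primer D, and a DNA polymerase, so as to obtain an amplification product;
An adapter primer here means a primer that plays the same role as the adaptor in conventional high-throughput sequencing library construction methods (such as Illumina's TruSeq DNA Sample Prep Kit and Applied Biosystems (ABI)'s The SOLiD™ Fragment Library Construction Kit), namely to bind the DNA fragment to be sequenced to the sequencing chip and to allow it to be amplified by PCR and sequenced. Unlike conventional library construction, which adds adapter sequences by a ligation reaction, the method of the present invention uses the anchoring sequences in the 5' portions of primer A and primer B to attach the adapter sequences to both ends of the DNA fragment by PCR. As those skilled in the art will understand, one of adapter primer C and adapter primer D may contain an index sequence so that multiple libraries can be analyzed simultaneously on one sequencing chip. Adapter primer C and adapter primer D correspond to the adapter sequences of a specific high-throughput sequencing platform; such platforms include, but are not limited to, Illumina's Genome Analyzer IIx, HiSeq, and MiSeq; ABI's SOLiD, 5500 W Series Genetic Analyzer, and Ion Torrent PGM; and Roche 454's GS Junior and GS FLX+.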
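The preferred anchoring relationship, in which the 5' end portion of primer A (or B) is identical to the 15 to 40 nucleotides at the 3' end of adapter primer C (or D), can be expressed as a simple Python check. The sketch below is illustrative only: the split of primer A into a 29-nucleotide 5' portion is an assumption, and the adapter primer is built from a hypothetical placeholder prefix because its full sequence is not reproduced in this text.

```python
def can_anchor(primer_5p_portion, adapter_primer, min_len=15, max_len=40):
    """True if the 5' portion equals the 3'-terminal stretch of the adapter primer
    and its length lies within the preferred 15-40 nucleotide range."""
    portion = primer_5p_portion.upper()
    adapter = adapter_primer.upper()
    return min_len <= len(portion) <= max_len and adapter.endswith(portion)


# Assumed 5' end portion of the example primer A (29 nt).
primer_a_5p = "TCTTTCCCTACACGACGCTCTTCCGATCT"
# Hypothetical adapter primer: placeholder prefix + the anchoring stretch.
adapter_c = "GATTACAGATTACA" + primer_a_5p

print(len(primer_a_5p), can_anchor(primer_a_5p, adapter_c))
```

Because the adapter primer ends with the same sequence as the 5' portion of primer A, it can prime on the reverse complement of that portion in Step 4 and extend the fragment with the rest of the adapter.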
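Step 5: separating and purifying the amplification product, the amplification product constituting the high-throughput sequencing library; and sequencing the high-throughput sequencing library and analyzing the data.
The amplification product may be separated and purified by any suitable method, including but not limited to magnetic bead purification, column purification, and agarose gel electrophoresis purification. Preferably, the purification method allows selection of target fragments by length. Preferably, the target fragment length is 160-400 bp. In a preferred form, target fragments of 160-400 bp are selected by 2% agarose gel electrophoresis and recovered by gel extraction.
High-throughput sequencing platforms that can be used to analyze the sequencing library include, but are not limited to, Illumina's Genome Analyzer IIx, HiSeq, and MiSeq sequencing platforms; ABI's SOLiD, 5500 W Series Genetic Analyzer, and Ion Torrent PGM sequencing platforms; and Roche 454's GS Junior and GS FLX+ sequencing platforms.
The data analysis method is not limited; any suitable data analysis and sequence alignment software may be used, including but not limited to Bismark, BSMAP, Bowtie, and SOAP.
In a second aspect, the present invention provides a kit for high-throughput sequencing detection of methylated CpG islands, the kit comprising: primer A, primer B, adapter primer C, adapter primer D, a DNA polymerase, and instructions for using the kit.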
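Brief Description of the Drawings
Figure 1 shows a schematic of the high-throughput sequencing detection method for methylated CpG islands of the present invention.
Figure 2 shows the agarose gel analysis of genomic DNA from HeLa cells and human peripheral blood mononuclear cells (hPBMC) amplified by the method of the present invention.
Figure 3 shows the Bioanalyzer 2100 analysis of the methylated CpG island high-throughput sequencing libraries from HeLa cell and hPBMC genomic DNA.
Figure 4 compares the high-throughput sequencing results for the MAEL, ILDR2, and CDKN2A gene regions obtained by the method of the present invention and by whole-genome bisulfite sequencing.
Figure 5 shows the distribution of short nucleotide sequences containing 1 to 3 CpG in CpG island and non-CpG-island regions of the human genome.
Figure 6 shows the degree of CpG island enrichment achieved with different combinations of 3' end portion sequences of primer A and primer B.
Detailed Description of the Embodiments
The present invention is further described below with reference to specific embodiments and the drawings. Those skilled in the art may make various changes and modifications to the invention as shown in the specific embodiments without departing from the spirit and scope of the invention as broadly described.
According to the present invention, enrichment amplification of methylated CpG islands and simultaneous construction of the high-throughput sequencing library proceed as follows (see Figure 1).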
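1. Step 1: bisulfite treatment of the DNA sample.
The MethylCode™ Bisulfite Conversion Kit (Invitrogen) is used according to the manufacturer's instructions, with the following specific steps:
1.1 Prepare the CT Conversion Reagent solution: take the CT Conversion Reagent from the kit, add 900 µl water, 50 µl resuspension buffer, and 300 µl dilution buffer, mix by brief vortexing at room temperature for 10 minutes until dissolved, and store at room temperature protected from light;
1.2 Add 20 µl of DNA sample containing 500 pg to 500 ng of DNA to a PCR tube;
1.3 Add 130 µl CT Conversion Reagent solution to the DNA sample and mix by flicking the tube or pipetting up and down;
1.4 Run the following program on a thermal cycler: 98 °C for 10 minutes, 64 °C for 2.5 hours, then hold at 4 °C (for no more than 20 hours) until use;
1.5 Place a DNA purification column in a collection tube and add 600 µl binding buffer;
1.6 Add the DNA sample from step 1.4 to the binding buffer, close the cap, and mix by inverting several times;
1.7 Centrifuge at maximum speed (>10,000 g) for 30 seconds and discard the flow-through;
1.8 Add 100 µl wash buffer (with ethanol added), centrifuge at maximum speed for 30 seconds, and discard the flow-through;
1.9 Add 200 µl desulphonation buffer and let the column stand at room temperature for 15 to 20 minutes;
1.10 Centrifuge at maximum speed for 30 seconds and discard the flow-through;
1.11 Add 100 µl wash buffer (with ethanol added), centrifuge at maximum speed for 30 seconds, and discard the flow-through;
1.12 Repeat step 1.11 once, then transfer the column to a new 1.5 ml centrifuge tube;
1.13 Add 10 µl elution buffer and centrifuge at maximum speed for 30 seconds to elute the DNA.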
2. Step 2: linear amplification with primer A and DNA polymerase.
2.1 Set up the following amplification reaction in a PCR tube using the DNA obtained in Step 1:
Figure imgf000008_0001
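*: Primer A: TCTTTCCCTACACGACGCTCTTCCGATCTHHHHHCGCH (H = A/T/C), where the wavy-underlined part is the 3' end portion of the primer and the straight-underlined part is the 5' end portion.
**: Added in step 2.3
2.2 Place the PCR tube in a PCR thermal cycler and run the following program: 95 °C for 2 minutes, then hold at 4 °C;
2.3 Add 0.3 µl Klenow fragment (exo-) (NEB catalog no. M0212S), mix, and spin down briefly;
2.4 Run the following program on the PCR thermal cycler: 4 °C for 50 seconds, 10 °C for 50 seconds, 20 °C for 50 seconds, 30 °C for 50 seconds, 37 °C for 5 minutes, 95 °C for 10 seconds, then pause at 4 °C;
2.5 Repeat steps 2.3 and 2.4 four times in total, omitting the 4 °C pause after the last repeat;
2.6 Run the following program on the PCR thermal cycler to inactivate the Klenow fragment: 75 °C for 20 minutes, then pause at 50 °C;
3. Step 3: amplification with primer B and DNA polymerase.
3.1 Set up the following amplification reaction in a PCR tube: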
Figure imgf000009_0001
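*: Primer B: TGGAGTTCAGACGTGTGCTCTTCCGATCTDDDDCGCGCGG (D = A/T/G), where the wavy-underlined part is the 3' end portion of the primer and the straight-underlined part is the 5' end portion.
3.2 Preheat the PCR tube to 50 °C on the thermal cycler;
3.3 Add the preheated mixture to the first-round amplification product from step 2.6 and mix by pipetting 5 to 6 times;
3.4 Run the following program on the PCR thermal cycler: 95 °C for 3 minutes, 50 °C for 1 minute, 72 °C for 1 minute, then pause at 50 °C;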
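Primers A and B above are degenerate, so each is in practice a pool of related sequences. A minimal Python sketch of how such a pool can be enumerated is given below; the code table covers only the letters used in this document (H, D, R, N), the function names are illustrative, and the 3' end portions passed in are assumed from the sequences above because the underlining is lost in this text.

```python
from itertools import product

# Degenerate-base codes as defined in this document.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "H": "ACT", "D": "AGT", "R": "AG", "N": "ACGT",
}


def expand(degenerate):
    """Yield every concrete sequence encoded by a degenerate sequence."""
    pools = [CODES[base] for base in degenerate.upper()]
    for combination in product(*pools):
        yield "".join(combination)


def pool_size(degenerate):
    """Number of concrete sequences in the pool."""
    size = 1
    for base in degenerate.upper():
        size *= len(CODES[base])
    return size


print(pool_size("HHHHHCGCH"))    # assumed 3' portion of primer A
print(pool_size("DDDDCGCGCGG"))  # assumed 3' portion of primer B
print(next(expand("HHHHHCGCH")))
```

Because H excludes G and D excludes C, no member of either pool gains an extra CpG from the degenerate positions, so every variant keeps the same CpG count in its last 7 nucleotides.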
4. Step 4: exponential amplification with adapter primer C, adapter primer D, and DNA polymerase.
4.1 Set up the following amplification reaction in a PCR tube:
Figure imgf000009_0002
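*: Adapter primer C:
TCT, where the straight-underlined part is identical to the 5' end portion of primer A;
**: Adapter primer D: CTTCCGATCT, where the straight-underlined part is identical to the 5' end portion of primer B and the double-underlined part is the index sequence (Illumina index 9);
4.2 Preheat the PCR tube to 50 °C on the thermal cycler;
4.3 Add the preheated mixture to the second-round amplification product from step 3.4 and mix by pipetting 5 to 6 times;
4.4 Run the following program on the PCR thermal cycler: 95 °C for 3 minutes;
4.5 Run the following program on the PCR thermal cycler: 20 cycles of 95 °C for 30 seconds, 67 °C for 30 seconds, and 72 °C for 1 minute, then hold at 4 °C.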
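5. Step 5: fragment size selection, purification, high-throughput sequencing, and data analysis.
5.1 Prepare a 2% agarose gel containing 1x SYBR Safe (Invitrogen);
5.2 Separate the amplification product from step 4.5 by electrophoresis on the 2% agarose gel;
5.3 Image the DNA in the gel;
5.4 Excise the gel region containing the 160-400 bp target fragments;
5.5 Purify the target fragments by gel extraction (Qiagen, QIAquick Gel Extraction Kit) to obtain the high-throughput sequencing library;
5.6 Determine the insert size of the high-throughput sequencing library with the Bioanalyzer 2100 system (Agilent) and determine the absolute library concentration by qPCR;
5.7 Sequence the library on an Illumina HiSeq2000 sequencer with 100 bp paired-end reads to obtain the raw sequencing data;
5.8 Data analysis: first remove any adapter sequences and low-quality sequences, then align the data to the human reference genome (hg19) using Bismark, and carry out the downstream bioinformatic analysis on that basis.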
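The principle behind reading methylation from aligned bisulfite data in step 5.8 can be shown with a deliberately simplified Python sketch. It is not Bismark's algorithm: it assumes a read from the converted top strand that is already aligned to the reference without gaps and over its full length, and the function name and toy sequences are hypothetical.

```python
def call_cpg_methylation(reference, read):
    """Call methylation at each reference CpG from an ungapped aligned read.

    A reference C that is still read as C was protected from conversion
    (methylated); one read as T was converted (unmethylated).
    Returns {0-based position of the CpG cytosine: call}.
    """
    if len(reference) != len(read):
        raise ValueError("sketch assumes an ungapped, full-length alignment")
    ref, rd = reference.upper(), read.upper()
    calls = {}
    for i in range(len(ref) - 1):
        if ref[i:i + 2] == "CG":
            if rd[i] == "C":
                calls[i] = "methylated"
            elif rd[i] == "T":
                calls[i] = "unmethylated"
            else:
                calls[i] = "no call"  # mismatch or sequencing error
    return calls


# Hypothetical toy data: the first CpG was methylated, the second was not.
print(call_cpg_methylation("ACGTCCGA", "ACGTTTGA"))
```

Real pipelines also handle reads from the complementary strand, mismatches, and incomplete conversion, none of which is modelled here.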
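Figure 2 shows the 2% agarose gel electrophoresis results after amplifying genomic DNA from HeLa cells and hPBMC by the method described above. Lanes 1 and 4 are the DNA marker; lane 2 is HeLa cell genomic DNA, with 15 ng of DNA used as input for sodium bisulfite treatment; lane 3 is hPBMC genomic DNA (from an adult male), with 30 ng of DNA used as input for sodium bisulfite treatment; lane 5 is the no-DNA amplification control. Both the HeLa and the hPBMC genomic DNA samples were positively amplified. This shows that as little as 15 ng of input genomic DNA suffices for high-throughput sequencing detection of methylated CpG islands by the method of the present invention, indicating that the method is efficient and sensitive.
Figure 3 shows the Bioanalyzer 2100 results for the high-throughput sequencing libraries of HeLa cell and hPBMC genomic DNA obtained by the method described above. The DNA obtained by excising the target DNA fragments separated on the 2% agarose gel and recovering them by gel extraction constitutes the high-throughput sequencing library. Figure 3A shows the result for the HeLa cell genomic DNA library, and Figure 3B shows the result for the hPBMC genomic DNA library. The fragment lengths range from 160 to 280 bp.
The two sequencing libraries were subjected to high-throughput sequencing; the raw data amounted to 1.3 Gb for HeLa cell genomic DNA and 1.5 Gb for hPBMC genomic DNA, and the statistics are given in Table 1. CpG island regions make up about 0.7% of the human genome, whereas 39% and 20% of the sequencing data from the HeLa and hPBMC genomes, respectively, fall within CpG islands, corresponding to 56-fold and 29-fold enrichment of CpG islands. With only 1 to 2 Gb of raw sequencing data, the method of the present invention achieves an average sequencing depth of 20 to 30x over genomic regions including methylated CpG islands, whereas whole-genome bisulfite sequencing would need 150 to 200 Gb of raw data to reach the same depth. This shows that the method of the present invention greatly improves the efficiency of high-throughput sequencing detection of methylated CpG islands.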
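The enrichment figures quoted above follow from dividing the fraction of sequencing data that falls in CpG islands by the fraction of the genome that CpG islands occupy. A short Python sketch of that arithmetic, with illustrative names, is:

```python
def fold_enrichment(fraction_of_data_in_cgi, fraction_of_genome_in_cgi=0.007):
    """Ratio of observed CpG island coverage to the genome-wide expectation."""
    return fraction_of_data_in_cgi / fraction_of_genome_in_cgi


print(round(fold_enrichment(0.39)))  # HeLa library
print(round(fold_enrichment(0.20)))  # hPBMC library
```

Rounded to whole numbers, the two ratios reproduce the 56-fold and 29-fold values stated in the text.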
Table 1. Analysis of the high-throughput sequencing data for HeLa cell and hPBMC genomic DNA obtained by the method of the present invention
Figure imgf000011_0001
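Figure 4 compares the high-throughput sequencing results for the MAEL, ILDR2, and CDKN2A gene regions obtained by the method of the present invention and by whole-genome bisulfite sequencing. Figure 4A shows the results for the MAEL and ILDR2 gene regions. MAEL (maelstrom spermatogenic transposon silencer) is a testis-specific gene whose promoter CpG island is unmethylated in germ cells but highly methylated in somatic cells; by contrast, ILDR2 (immunoglobulin-like domain containing receptor 2) is unmethylated in somatic cells. The results of the present method show that, in both the HeLa and the hPBMC genomes, the MAEL promoter CpG island was amplified and the sequencing results show it to be highly methylated, whereas the ILDR2 promoter CpG island was not amplified in either. This shows that the present method selectively amplifies methylated CpG islands and does not amplify unmethylated CpG islands. The whole-genome bisulfite sequencing results (from human brain tissue) confirm, on the one hand, that in normal somatic cells the MAEL promoter CpG island is highly methylated while the ILDR2 promoter CpG island is unmethylated; on the other hand, they show that most dispersed CpG sites are highly methylated, and that these are all selectively discarded during methylated CpG island enrichment and library construction in the present method. Figure 4B shows the results for the CDKN2A (cyclin-dependent kinase inhibitor 2A) gene region. The CDKN2A region contains several CpG islands, which are unmethylated in somatic cells and partially methylated in tumor cells. The results of the present method show that 2 of the 4 CpG islands in this region were amplified in the HeLa cell genome and the sequencing results show them to be highly methylated, whereas none of the 4 CpG islands was amplified in the hPBMC genome. This further shows that the present method can accurately and efficiently enrich methylated CpG islands by amplification and detect them by high-throughput sequencing.
Figure 5 shows the bioinformatic analysis of the distribution of short nucleotide sequences containing 1 to 3 CpG in CpG island and non-CpG-island regions of the human genome. The results show that the degree to which a short nucleotide sequence is enriched in CpG islands correlates positively with the number of CpG it contains.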
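The kind of comparison summarised in Figure 5 can be sketched in Python as a count of motif occurrences per base inside and outside annotated CpG islands. This is only an illustration of the idea, not the analysis actually performed: the function name, the toy sequence, and the island coordinates are all hypothetical, and a real analysis would run over the genome with a CpG island annotation.

```python
def motif_density(sequence, motif, islands):
    """Return (density inside islands, density outside) as motif starts per base.

    islands is a list of half-open, 0-based (start, end) intervals.
    A motif occurrence is assigned to the region containing its first base.
    """
    seq, motif = sequence.upper(), motif.upper()
    in_island = [False] * len(seq)
    for start, end in islands:
        for i in range(start, min(end, len(seq))):
            in_island[i] = True
    hits_in = hits_out = 0
    for i in range(len(seq) - len(motif) + 1):
        if seq[i:i + len(motif)] == motif:
            if in_island[i]:
                hits_in += 1
            else:
                hits_out += 1
    n_in = sum(in_island)
    n_out = len(seq) - n_in
    return (hits_in / n_in if n_in else 0.0,
            hits_out / n_out if n_out else 0.0)


# Hypothetical toy genome with one "island" at positions 8-15.
toy = "ATATATAT" + "CGCGCGCG" + "ATATCGAT"
print(motif_density(toy, "CG", [(8, 16)]))
print(motif_density(toy, "CGCG", [(8, 16)]))
```

In the toy data the motif with two CpG occurs only inside the island, while the single CpG also occurs outside it, mirroring the trend that more CpG in a short sequence means stronger concentration in CpG islands.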
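Figure 6 shows the analysis of CpG island enrichment in HeLa cell genome high-throughput sequencing libraries obtained by the present method using different combinations of 3' end portion sequences of primer A and primer B. The results show that the degree of CpG island enrichment correlates positively with the CpG frequency of the primer sequences. Here H = C/A/T, D = G/A/T, R = G/A, and N = A/T/C/G.
The above is only one specific embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any change or substitution that a person familiar with the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. The scope of protection of the present invention shall therefore be determined by the scope of the claims.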

Claims

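1. A high-throughput sequencing detection method for methylated CpG islands, characterized in that it comprises the following steps:
Step 1: treating a DNA sample with a modifier so as to convert cytosine in the DNA sample to uracil while leaving 5-methylcytosine unchanged, thereby obtaining converted DNA fragments;
Step 2: subjecting the converted DNA fragments to at least one round of linear amplification using primer A and a DNA polymerase, so as to obtain target fragments one end of which can anchor adapter primer C;
wherein primer A consists of a 3' end portion and a 5' end portion, the 3' end portion being used to bind and amplify the converted DNA fragments and being characterized by a length of at least 4 nucleotides and the ability to bind the converted DNA fragments;
wherein the 5' end portion of primer A is used to anchor adapter primer C and is characterized in that its reverse complement can be bound by adapter primer C and amplified by polymerase chain reaction (PCR);
Step 3: amplifying the target fragments, one end of which can anchor adapter primer C, using primer B and a DNA polymerase, so as to obtain target fragments that are enriched for methylated CpG islands and whose two ends can anchor adapter primer C and adapter primer D, respectively;
wherein primer B consists of a 3' end portion and a 5' end portion, the 3' end portion being used to bind the target fragments, one end of which can anchor adapter primer C, and to enrich methylated CpG islands by amplification, and being characterized by: 1) a length of at least 7 nucleotides; and 2) a high CpG frequency, meaning that the first 7 nucleotides from the 3' end contain 2 or 3 CpG;
wherein the 5' end portion of primer B is used to anchor adapter primer D and is characterized in that its reverse complement can be bound by adapter primer D and amplified by PCR;
Step 4: subjecting the target fragments, which are enriched for methylated CpG islands and whose two ends can anchor adapter primer C and adapter primer D respectively, to exponential PCR amplification using adapter primer C, adapter primer D, and a DNA polymerase, so as to obtain an amplification product;
Step 5: separating and purifying the amplification product, the amplification product constituting the high-throughput sequencing library; and
sequencing the high-throughput sequencing library and analyzing the data.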
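2. The method according to claim 1, wherein the modifier is bisulfite.
3. The method according to claim 1, wherein the 3' end portion of primer A is characterized in that, apart from CpG, the primer contains only C, A, and T; and the 3' end portion of primer B is characterized in that, apart from CpG, the primer contains only G, A, and T.
4. The method according to any one of claims 1 to 3, wherein the second nucleotide from the 3' end of primer A is C.
5. The method according to any one of claims 1 to 4, wherein the 3' end portion of primer A is used to bind and amplify the converted DNA fragments and to preliminarily enrich methylated CpG islands, and is characterized by a medium CpG frequency, meaning that the first 7 nucleotides from the 3' end of the primer contain 1 CpG.
6. The method according to any one of claims 1 to 5, wherein the 3' end portion of primer A is used to bind and amplify the converted DNA fragments and to preliminarily enrich methylated CpG islands, and is characterized by a high C frequency, meaning that the first 5 nucleotides from the 3' end contain at least 3 C, preferably with at least 3 consecutive C starting from the 3' end.
7. The method according to any one of claims 1 to 6, wherein the 3' end portion of primer A is used to bind and amplify the converted DNA fragments and to preliminarily enrich methylated CpG islands, and is characterized by a high CpG frequency, meaning that the first 7 nucleotides from the 3' end contain 2 or 3 CpG.
8. The method according to any one of claims 1 to 7, wherein the 3' end portion of primer B is characterized by a high GC content, meaning that C and G together account for at least 7 of the first 10 nucleotides from the 3' end.
9. The method according to any one of claims 1 to 8, wherein the 3' end portion of primer B is characterized by an extremely high CpG frequency, meaning that the 7 nucleotides at the 3' end contain 3 CpG.
10. The method according to any one of claims 1 to 9, wherein the DNA polymerase in Step 2 has strand-displacement activity.
11. The method according to any one of claims 1 to 10, wherein the DNA polymerase in Step 2 is exonuclease-deficient.
12. The method according to any one of claims 1 to 11, wherein 2 to 30 rounds of linear amplification are performed in Step 2.
13. The method according to any one of claims 1 to 12, wherein the DNA polymerase in Step 3 is a hot-start polymerase.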
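14. A high-throughput sequencing detection method for methylated CpG islands, characterized in that it comprises the following steps:
Step 1: treating a DNA sample with a modifier so as to convert cytosine in the DNA sample to uracil while leaving 5-methylcytosine unchanged, thereby obtaining converted DNA fragments;
Step 2: subjecting the converted DNA fragments to at least one round of linear amplification using primer A and a DNA polymerase, so as to obtain target fragments that are enriched for methylated CpG islands and one end of which can anchor adapter primer C;
wherein primer A consists of a 3' end portion and a 5' end portion, the 3' end portion being used to bind the converted DNA fragments and to enrich methylated CpG islands by amplification, and being characterized by: 1) a length of at least 7 nucleotides; and 2) a high CpG frequency, meaning that the first 7 nucleotides from the 3' end contain 2 or 3 CpG;
wherein the 5' end portion of primer A is used to anchor adapter primer C and is characterized in that its reverse complement can be bound by adapter primer C and amplified by PCR;
Step 3: amplifying, at least once, the target fragments that are enriched for methylated CpG islands and one end of which can anchor adapter primer C, using primer B and a DNA polymerase, so as to obtain target fragments that are enriched for methylated CpG islands and whose two ends can anchor adapter primer C and adapter primer D, respectively;
wherein primer B consists of a 3' end portion and a 5' end portion, the 3' end portion of primer B being used to bind and amplify the target fragments that are enriched for methylated CpG islands and one end of which can anchor adapter primer C, and being characterized by a length of at least 4 nucleotides and the ability to bind the target fragments that are enriched for methylated CpG islands and whose two ends can anchor adapter primer C and adapter primer D, respectively;
wherein the 5' end portion of primer B is used to anchor adapter primer D and is characterized in that its reverse complement can be bound by adapter primer D and amplified by PCR;
Step 4: subjecting the target fragments, which are enriched for methylated CpG islands and whose two ends can anchor adapter primer C and adapter primer D respectively, to exponential PCR amplification using adapter primer C, adapter primer D, and a DNA polymerase, so as to obtain an amplification product;
Step 5: separating and purifying the amplification product, the amplification product constituting the high-throughput sequencing library; and
sequencing the high-throughput sequencing library and analyzing the data.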
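15. The method according to any one of claims 1 to 14, wherein the 5' end portion of primer A is identical to the first 15 to 40 nucleotides from the 3' end of adapter primer C, and the 5' end portion of primer B is identical to the first 15 to 40 nucleotides from the 3' end of adapter primer D.
16. The method according to any one of claims 1 to 15, wherein the DNA sample is from human cells, tissue, blood, body fluid, urine, excrement, or a combination thereof, preferably cell-free DNA from human plasma or serum.
17. A kit for high-throughput sequencing detection of methylated CpG islands, the kit comprising: primer A, primer B, adapter primer C, and adapter primer D as defined in any one of claims 1 to 16, a DNA polymerase, and instructions for using the kit.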
PCT/CN2013/087231 2013-06-27 2013-11-15 High-throughput sequencing detection method for methylated CpG islands WO2014205981A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP13888369.9A EP3015552B1 (en) 2013-06-27 2013-11-15 High-throughput sequencing detection method for methylated cpg islands
US14/392,322 US10100351B2 (en) 2013-06-27 2013-11-15 High-throughput sequencing detection method for methylated CpG islands

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310259709.9 2013-06-27
CN201310259709.9A CN104250663B (zh) 2013-06-27 2013-06-27 High-throughput sequencing detection method for methylated CpG islands

Publications (1)

Publication Number Publication Date
WO2014205981A1 true WO2014205981A1 (zh) 2014-12-31

Family

ID=52140945

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/087231 WO2014205981A1 (zh) 2013-06-27 2013-11-15 High-throughput sequencing detection method for methylated CpG islands

Country Status (4)

Country Link
US (1) US10100351B2 (zh)
EP (1) EP3015552B1 (zh)
CN (1) CN104250663B (zh)
WO (1) WO2014205981A1 (zh)

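Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101613749A (zh) * 2009-08-11 2009-12-30 Second Military Medical University of the Chinese People's Liberation Army (中国人民解放军第二军医大学) Quantitative detection method for PAPOLB gene methylation
CN102399861A (zh) * 2010-09-16 2012-04-04 Shanghai Jiamei Biotechnology Co., Ltd. (上海迦美生物科技有限公司) Methylated DNA detection method based on endonuclease digestion
CN103103624A (zh) 2011-11-15 2013-05-15 BGI Shenzhen Co., Ltd. (深圳华大基因科技有限公司) Construction method of high-throughput sequencing library and application thereof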

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
GU H ET AL.: "Preparation of reduced representation bisulfite sequencing libraries for genome-scale DNA methylation profiling", NAT PROTOC., vol. 6, no. 4, 2011, pages 468 - 81, XP055151116, DOI: doi:10.1038/nprot.2010.190
See also references of EP3015552A4

Also Published As

Publication number Publication date
CN104250663B (zh) 2017-09-15
CN104250663A (zh) 2014-12-31
EP3015552A4 (en) 2016-11-02
EP3015552A1 (en) 2016-05-04
US10100351B2 (en) 2018-10-16
US20160298183A1 (en) 2016-10-13
EP3015552B1 (en) 2018-07-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13888369

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 14392322

Country of ref document: US

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2013888369

Country of ref document: EP