WO2023141829A1 - Method for simultaneously performing whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing - Google Patents

Info

Publication number
WO2023141829A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequencing
dna
strand
methylation
nascent
Prior art date
Application number
PCT/CN2022/074093
Other languages
English (en)
French (fr)
Inventor
杨林
夏军
陈恬
张艳艳
陈芳
聂自豪
张韶红
杨贵芳
王业钦
吕硕
Original Assignee
深圳华大智造科技股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大智造科技股份有限公司
Priority to PCT/CN2022/074093 (WO2023141829A1)
Priority to CN202280052323.8A (CN118076734A)
Publication of WO2023141829A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • The present invention relates to the field of biotechnology. Specifically, it relates to a method for simultaneously performing whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing.
  • DNA methylation is an epigenetic regulatory modification that helps regulate how much protein is synthesized without changing the base sequence.
  • For humans, DNA methylation is a remarkable chemical modification: the care of loved ones, aging of the body, smoking, heavy drinking and even obesity are all faithfully recorded on the genome by methylation. The genome is like a diary, with methylation as the writing that records what the body has experienced.
  • DNA methylation is an important epigenetic mark. In mammals, the most common methylation modification occurs on cytosine, mainly as 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC). Obtaining genome-wide methylation-level data for all cytosines is of great significance for studying the spatiotemporal specificity of epigenetics.
  • Mapping DNA methylation levels across the genome and analyzing high-precision methylation patterns in specific species will be a milestone in epigenomics research, and will lay the foundation for research on basic mechanisms such as cell differentiation and tissue development, as well as for animal and plant breeding and research on human health and disease.
  • WGBS: Whole Genome Bisulfite Sequencing
  • C: cytosine
  • T: thymine
  • 1. Methylation sequencing presupposes that the whole-genome DNA sequence of the species has already been obtained. After bisulfite treatment, methylated C remains unchanged and unmethylated C is converted to U; the methylation sequencing results are then compared with the genome sequence to determine the modification state of the cytosine at each position.
  • 2. Because unmethylated C bases are converted to U bases by bisulfite treatment, the GC content of the whole genome changes drastically, causing strong amplification and sequencing bias in subsequent amplification.
  • 3. Because most cytosines (C) in the genome become thymines (T) after bisulfite treatment, genome complexity is reduced. Sequencing reads therefore map to the reference genome less efficiently and produce too many multiple alignments, leading to abnormal alignment. At some positions no effective DNA methylation information can be obtained even if sequencing throughput is increased, so genome-wide methylation information is lost.
  • The present invention aims to solve, at least to some extent, at least one of the technical problems in the prior art. To this end, it proposes an adapter element, an adapter element composition, a kit and their uses; a method for constructing a sequencing library; a sequencing library and its use in sequencing; and a method for simultaneously performing whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing. Sequencing with this library allows whole-genome DNA sequencing and DNA methylation and/or hydroxymethylation sequencing to be performed simultaneously on one molecule, so that methylation information can be obtained accurately without reference genome information and methylated positions can be located precisely, greatly improving the accuracy of methylation and/or hydroxymethylation sequencing information.
  • The invention proposes an adapter element.
  • The adapter element is a bubble-shaped single-stranded nucleic acid.
  • The single-stranded nucleic acid has a non-complementary region and a complementary region formed by its 5'-end and 3'-end sequences, and has a cohesive end at the 5' end or the 3' end.
  • With this adapter element, the positive and negative strands can be joined efficiently to form a circular DNA molecule for subsequent DNB (DNA nanoball) preparation.
  • The above adapter element may also have the following additional technical features:
  • The cohesive end or the complementary region has an endonuclease recognition site.
  • A nick is formed at the recognition site, and strand extension at the nick yields a nascent strand.
  • The base at the cohesive end is a U base or a T base.
  • When the cohesive end is a U base, it can serve as the endonuclease recognition site for digestion with USER endonuclease.
  • The endonuclease is selected from USER endonuclease, DNase endonuclease and RNase endonuclease.
  • The endonuclease recognition site is selected from U bases, deoxyribonucleotides and ribonucleotides.
  • The adapter element contains one or more sequencing primer sequences, molecular tag sequences and/or sample tag sequences.
  • The adapter element is 20-200 nt long.
  • At this length, the positive and negative strands can be joined efficiently to form a circular DNA molecule for subsequent DNB preparation.
  • The adapter element consists of deoxyribonucleotides and/or ribonucleotides.
  • The adapter element has the nucleotide sequence shown in SEQ ID NO: 1 or 2, or a nucleotide sequence with at least 80% homology to it.
  • The present invention provides an adapter element composition.
  • The adapter element composition includes two of the aforementioned adapter elements, at least one of which has an endonuclease recognition site on its cohesive end or complementary region.
  • With the adapter element composition according to the embodiment of the present invention, the positive and negative strands can be joined efficiently to form a circular DNA molecule for subsequent DNB (DNA nanoball) preparation.
  • The adapter element composition includes: adapter element 1, which has the nucleotide sequence shown in SEQ ID NO: 1 or a nucleotide sequence with at least 80% homology to it; and adapter element 2, which has the nucleotide sequence shown in SEQ ID NO: 2 or a nucleotide sequence with at least 80% homology to it.
  • The present invention provides a kit.
  • The kit includes the aforementioned adapter element and the adapter element composition.
  • The present invention proposes the use of the aforementioned adapter elements, adapter element compositions and kits in the construction of sequencing libraries.
  • The sequencing library is used for whole-genome DNA sequencing together with at least one of whole-genome DNA methylation sequencing and hydroxymethylation sequencing.
  • Methylation and/or hydroxymethylation information can be obtained accurately using the aforementioned adapter elements.
  • The present invention proposes a method for constructing a sequencing library. According to an embodiment of the present invention, the method includes:
  • adapter element 1 is selected from the aforementioned adapter elements, and its cohesive end or complementary region has an endonuclease recognition site;
  • the cytosines in the nascent strand are either all methylated cytosines or all unmethylated cytosines;
  • in the dumbbell-shaped double-stranded DNA, the sequence of the nascent strand remains unchanged, while unmethylated cytosines on the template strand are converted to uracil, or methylated and/or hydroxymethylated cytosines are converted to dihydrouracil, yielding the sequencing library.
  • The positive and negative strands of a DNA molecule are joined through adapter element 1, and a nick is formed at its endonuclease recognition site so that strand extension from the nick generates a nascent strand.
  • A closed DNA loop can be formed to obtain a dumbbell-shaped double-stranded DNA, which facilitates the subsequent preparation of DNA nanoballs. The sequencing library is obtained by conversion treatment of the dumbbell-shaped double-stranded DNA, in which unmethylated cytosines are converted to uracil or methylated and/or hydroxymethylated cytosines are converted to dihydrouracil.
  • The whole-genome sequence can be obtained from the sequence of the nascent strand, and methylation/hydroxymethylation information can be obtained accurately by comparing that sequence with the sequence of the template strand.
  • DNA sequencing and DNA methylation and/or hydroxymethylation sequencing are completed simultaneously on one molecule, so methylation information can be obtained accurately without reference genome information and methylated positions can be located precisely, greatly improving the accuracy of methylation information.
  • The above method for constructing a sequencing library may also have the following additional technical features:
  • Fragmentation randomly breaks or cuts the double-stranded DNA by physical or chemical methods.
  • Fragmentation is performed by physical sonication or by an enzymatic reaction.
  • Blunt-end repair is performed using T4 DNA polymerase or mung bean nuclease.
  • Phosphorylation is performed by a nucleotide kinase.
  • Phosphorylation is performed using T4 polynucleotide kinase (T4 DNA phosphokinase).
  • Base A is added at the 3' end using rTaq enzyme or Klenow polymerase lacking 3'→5' exonuclease activity.
  • The base at the cohesive end is selected from a U base or a T base;
  • the endonuclease is selected from USER endonuclease, DNase endonuclease or RNase endonuclease;
  • the endonuclease recognition site is selected from a U base, a deoxyribonucleotide or a ribonucleotide, and the number of nicks is one or more.
  • Extension uses a DNA polymerase with 5'→3' exonuclease activity or 5'→3' strand-displacement activity.
  • The DNA polymerase is selected from T4 DNA polymerase, phi29 DNA polymerase and Bst DNA polymerase.
  • The cytosines in the dNTPs used for extension are either all methylated or all unmethylated. A nascent strand built from methylated cytosines keeps its sequence after bisulfite conversion, and a nascent strand built from unmethylated cytosines keeps its sequence after conversion treatment (for example, conversion assisted by TET enzyme, by potassium perruthenate, or by β-glycosyltransferase plus TET enzyme), so sequencing the nascent strand yields the genomic DNA information.
  • The cytosines in the nascent strand are all methylated cytosines,
  • and step 6) includes: subjecting the dumbbell-shaped double-stranded DNA to bisulfite treatment to obtain the sequencing library.
  • All cytosines on the nascent strand are methylated, so its sequence remains unchanged after the bisulfite treatment in step 6), and sequencing it yields the genomic DNA information.
  • On the template strand, unmethylated cytosines are converted to uracil. The template strand is sequenced and the result is compared with the genomic DNA information obtained above to reveal the methylation information.
  • The cytosines in the nascent strand are all methylated cytosines,
  • and step 6) includes: converting the dumbbell-shaped double-stranded DNA to obtain the sequencing library. The reagents used in the conversion treatment comprise an auxiliary reagent and pyridine borane, or bisulfite. The auxiliary reagent is selected from one of the following three: TET enzyme; potassium perruthenate; β-glycosyltransferase plus TET enzyme. The conversion treatment includes treating the dumbbell-shaped double-stranded DNA sequentially with the auxiliary reagent and pyridine borane, or treating it with bisulfite.
  • TET enzyme can recognize 5mC and 5hmC.
  • β-glycosyltransferase combined with TET enzyme can recognize 5mC.
  • Potassium perruthenate can recognize 5hmC.
  • When all cytosines on the nascent strand are unmethylated, its sequence remains unchanged after the TET enzyme-assisted or potassium perruthenate-assisted conversion treatment in step 6), and sequencing it yields the genomic DNA information.
  • The template strand is treated with the auxiliary reagent, which converts methylated cytosine to carboxylated cytosine; pyridine borane then converts the carboxylated cytosine to dihydrouracil (which carries two additional H atoms). Dihydrouracil is read as thymine in the sequencing results, and the methylation and/or hydroxymethylation information is obtained by comparing the sequencing results with the genomic DNA information obtained above.
  • The method further includes: preparing the sequencing library into DNA nanoballs (DNBs).
  • Sequencing can then be performed on a DNB sequencer.
  • The method for preparing the DNA nanoballs includes: performing rolling circle amplification (RCA) on the sequencing library using a primer sequence.
  • The primer sequence has the nucleotide sequence shown in SEQ ID NO: 3 or a nucleotide sequence with at least 80% (such as 85%, 90%, 95% or 99%) homology to it.
  • The invention proposes a sequencing library.
  • The sequencing library is obtained by the aforementioned method for constructing a sequencing library. Sequencing with this library therefore allows whole-genome DNA sequencing and whole-genome DNA methylation/hydroxymethylation sequencing to be performed simultaneously. Because DNA sequencing and DNA methylation sequencing are completed on one molecule, methylation information can be obtained accurately without reference genome information, greatly improving the accuracy of methylation information.
  • The present invention proposes the use of the aforementioned sequencing library in sequencing. Sequencing with this library allows whole-genome DNA sequencing and whole-genome DNA methylation/hydroxymethylation sequencing to be performed simultaneously on one molecule, so methylation information can be obtained accurately without reference genome information, greatly improving the accuracy of methylation information.
  • The sequencing includes whole-genome DNA sequencing together with at least one of whole-genome DNA methylation sequencing and hydroxymethylation sequencing.
  • The present invention proposes a method for simultaneously performing whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing.
  • The method includes: sequencing the aforementioned sequencing library to obtain sequencing information, which includes nascent strand information and template strand information, the nascent strand information being the whole-genome DNA information;
  • the template strand information is compared with the nascent strand information to obtain the genome-wide DNA methylation and/or hydroxymethylation information of the template strand. The method according to the embodiment of the present invention can therefore obtain methylation information without referring to genome information and can locate methylated positions precisely, improving the accuracy of methylation sequencing data alignment.
  • The comparison analysis includes:
  • When all the cytosines in the nascent strand are methylated cytosines and the dumbbell-shaped double-stranded DNA is treated with bisulfite: in the sequencing results, a thymine at the position on the complementary template strand that corresponds to a guanine in the nascent strand indicates that no methylation occurred at that position; a cytosine at that position indicates that methylation occurred.
  • When all the cytosines in the nascent strand are unmethylated cytosines and the dumbbell-shaped double-stranded DNA is converted with TET enzyme and pyridine borane: in the sequencing results, a thymine at the position on the complementary template strand that corresponds to a guanine in the nascent strand indicates methylation at that position; a cytosine at that position indicates that no methylation occurred.
  • When all the cytosines in the nascent strand are unmethylated cytosines and the dumbbell-shaped double-stranded DNA is converted with potassium perruthenate and pyridine borane: in the sequencing results, a thymine at the position on the complementary template strand that corresponds to a guanine in the nascent strand indicates hydroxymethylation at that position; a cytosine at that position indicates that no hydroxymethylation occurred.
  • When all the cytosines in the nascent strand are unmethylated cytosines and the dumbbell-shaped double-stranded DNA is converted with β-glycosyltransferase, TET enzyme and pyridine borane: in the sequencing results, a thymine at the position on the complementary template strand that corresponds to a guanine in the nascent strand indicates methylation at that position; a cytosine at that position indicates that no methylation occurred.
  • The present invention obtains genome information and genome methylation information at the same time, and can obtain methylation and/or hydroxymethylation information for unknown species without referring to genome information;
  • the present invention uses genomic position information to locate methylated and/or hydroxymethylated positions precisely, improving the accuracy of methylation and/or hydroxymethylation data alignment;
  • the present invention does not require PCR and can obtain whole-genome methylation and/or hydroxymethylation information effectively and uniformly;
  • the present invention enables accurate detection of methylation and/or hydroxymethylation at C/T polymorphic positions.
  • Figure 1 shows a schematic flow diagram of the preparation of a mixed whole-genome DNA and whole-genome DNA methylation library based on bisulfite conversion, according to an embodiment of the present invention.
  • Figure 2 shows a schematic diagram of the preparation of a mixed whole-genome DNA and whole-genome DNA methylation library based on TET-assisted or potassium perruthenate-assisted conversion, according to an embodiment of the present invention.
  • Figure 3 shows a schematic structural view of adapter element 1 and adapter element 2 according to an embodiment of the present invention.
  • Figure 4 shows a schematic diagram of the information analysis according to an embodiment of the present invention.
  • Figure 5 shows a flowchart of sequencing with a strand-displacement enzyme according to an embodiment of the present invention.
  • The present invention proposes a method for simultaneously performing whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing, including:
  • Genomic DNA is randomly fragmented to produce 200-500 bp fragments; DNA that is already fragmented, such as cfDNA, can also be used.
  • The cohesive ends of the fragmented DNA molecules are excised by mung bean nuclease to form blunt ends;
  • The blunt-ended double-stranded DNA is phosphorylated at the 5' end and base A is added at the 3' end, forming a cohesive-ended double-stranded DNA molecule with a phosphate at the 5' end and base A at the 3' end.
  • Adapter element 1 is added to the above molecule.
  • The main function of the adapter is to enable subsequent strand extension.
  • The adapter sequence can contain one or more sequencing primer sequences and/or molecular tags (UMI, Unique Molecular Identifiers) and/or sample tag sequences (Index Barcode).
  • The adapter is a special bubble adapter (schematic 3a) with a non-complementary sequence in the middle and a phosphorylated 5' end.
  • The 5' and 3' ends of adapter element 1 are complementary sequences, and one of them has a cohesive terminal U base.
  • The U base can be recognized and excised by USER enzyme in a subsequent step to generate a nick for polymerase excision or displacement and polymerization extension; alternatively, the complementary 5'-end and 3'-end sequences contain multiple U bases and the cohesive end is a T base.
  • These U bases can be recognized and excised by USER enzyme in a subsequent step, producing one or more nicks that are used for excision or displacement by the polymerase and for polymerization extension (Figure 1).
  • The above ligation product forms one or more nicks under the action of USER enzyme.
  • The nascent strand is extended from the nick by an enzyme with 5'→3' exonuclease activity (such as T4 DNA polymerase) or 5'→3' strand-displacement activity (such as phi29 or Bst).
  • The cytosines in the dNTPs used for extension are all methylated or all unmethylated, so all cytosines of the original DNA strand that is replaced are substituted with methylated or unmethylated cytosines in the nascent strand, forming a mixed DNA duplex of the original template strand and the nascent strand.
  • The mixed duplex formed above is ligated to adapter element 2 to obtain a dumbbell-shaped double-stranded DNA library.
  • The adapter sequence includes one or more sequencing primer sequences and/or molecular tags (UMI, Unique Molecular Identifiers) and/or sample tag sequences (Index Barcode).
  • The adapter is a special bubble adapter (schematic 2b) with a non-complementary sequence in the middle, complementary sequences at the 5' and 3' ends, a cohesive T/U base at the 3' end, and a phosphorylated 5' end.
  • The dumbbell-shaped double-stranded DNA undergoes bisulfite conversion, or conversion assisted by TET enzyme, by potassium perruthenate (KRuO4), or by β-glycosyltransferase plus TET enzyme. Unmethylated cytosines of the original template strand are converted to uracil, or methylated cytosines of the original template strand are converted to dihydrouracil (DHU), while the nascent strand, with all its cytosines methylated, keeps its sequence unchanged.
  • The converted dumbbell-shaped double-stranded DNA library is made into DNA nanoballs using universal primers.
  • The universal primer binds to the adapter sequence of the dumbbell-shaped double-stranded DNA and is extended linearly by an enzyme with strand-displacement activity to generate DNA nanoballs.
  • The DNA nanoballs are loaded onto the DNB sequencing chip for sequencing.
  • The sequencing reaction is then carried out. Using the Read1 and Read2 sequencing primers and a sequencing enzyme with strand-displacement activity (see Figure 3), the original template strand (the bisulfite-converted, enzyme-assisted or potassium perruthenate (KRuO4)-assisted converted strand) and the nascent strand are sequenced separately. The nascent strand provides the reference genomic DNA information, and the original template strand provides the cytosine conversion information.
  • Each DNB generates two reads, Read1 and Read2. One of them (Read1 or Read2) comes from the nascent strand and is aligned to the genome with any alignment software to obtain its exact genomic position; the other (Read2 or Read1) comes from the original template strand (the bisulfite-converted, enzyme-assisted or potassium perruthenate (KRuO4)-assisted converted strand). Read1 and Read2 are then compared: under bisulfite conversion, a position where a cytosine of the original template strand has been converted to thymine is called an unmethylated cytosine, and a cytosine that has not been converted is called methylated.
  • The main band is about 300 bp;
  • The end repair reaction system and conditions are as follows.
  • Methylated adapters are sometimes also called "methylated tag adapters".
  • Linker 1 5'-/Phos/GCTCGCAGTCGA GGTCAAGCGGTCTTAGGCTC BBBBBBBB TCTGA AGGACATGGCTA CGATCGACTGCGAGCT-3' (SEQ ID NO: 1)
  • The underlined cytosines are methylated cytosines (m5C-dCTP), and B denotes the sample tag sequence.
  • Linker 2 5'-/5Phos/CGGACTCGACCT GACAATGCATGGCATCTC AGGTCGAGTCCGT-3' (SEQ ID NO: 2) The underlined cytosines in linker 2 are protected by methylation modification
  • Prepare the CT Conversion Reagent solution: take the CT Conversion Reagent (solid mixture) out of the kit, add 900 μL of water, 50 μL of M-Dissolving Buffer and 300 μL of M-Dilution Buffer, and dissolve at room temperature by vortexing for 10 minutes or shaking on a shaker for 10 minutes.
  • DNBs were quantified using the HS Qubit ssDNA kit.
  • The resulting library was sequenced on the MGISEQ-2000 platform in PE100 mode. After alignment, basic statistics were collected, including raw (off-machine) data, usable data and aligned data.
  • The conventional method aligns reads with BS-MAP software, whereas the method of the present invention uses BWA to align the nascent strand to the genome to obtain the exact position of each read, and then retrieves the information of the original template strand (the bisulfite-converted or enzyme-converted strand) at that position, yielding accurate methylation alignment information.
  • The method of the present invention greatly improves the methylation alignment rate, provides CpG site coverage, improves data utilization, and improves the accuracy of methylation detection.
  • The sequencing type is PE100, and the sequencing depth is 30×.
  • Data analysis covers performance metrics such as data utilization rate, alignment rate and bias.
  • The main band is about 300 bp;
  • The end repair reaction system and conditions are as follows.
  • Methylated adapters are sometimes also called "methylated tag adapters".
  • Linker 1 5'-/5Phos/GCTCGCAGTCGAGGTCAAGCGGTCTTAGGCTCBBBBBBBBBTCTGAAGGACATGGCTACGATCGACTGCGAGCT-3' (SEQ ID NO: 1), B is the sample tag sequence
  • Linker 2 5'-/5Phos/CGGACTCGACCTGACAATGCATGGCATCTCAGGTCGAGTCCGT-3' (SEQ ID NO: 2)
  • The TET enzyme is from the NEBNext Enzymatic Methyl-seq Kit (NEB, E7120S).
  • DNA was purified using PB buffer and a Zymo-Spin™ IC Column (Zymo Research) and finally dissolved in 20 μL TE.
  • DNBs were quantified using the HS Qubit ssDNA kit.
  • The resulting library was sequenced on the MGISEQ-2000 platform in PE100 mode. After alignment, basic statistics were collected, including raw (off-machine) data, usable data and aligned data.
  • The conventional method aligns reads with BS-MAP software, whereas the method of the present invention uses BWA to align the nascent strand to the genome to obtain the exact position of each read, and then retrieves the information of the original template strand (the bisulfite-converted or enzyme-converted strand) at that position, yielding accurate methylation alignment information.
  • The method of the present invention greatly improves the methylation alignment rate, provides CpG site coverage, improves data utilization, and improves the accuracy of methylation detection.
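The comparison analysis described in the snippets above (a guanine in the nascent strand is checked against the base read on the complementary template strand) can be sketched in a few lines of Python. This is a minimal illustration, not the analysis software used in the application. It assumes that both reads cover the same fragment end to end, are written 5' to 3', and contain no sequencing errors. The names `CALL_TABLE` and `call_modifications` and the mode labels are hypothetical and do not appear in the source.

```python
# Hypothetical lookup: for each conversion chemistry, the call made when the
# template base opposite a nascent-strand G reads as T, and when it reads as C.
CALL_TABLE = {
    "bisulfite": ("unmethylated", "methylated"),
    "tet_pyridine_borane": ("methylated", "unmethylated"),
    "kruo4_pyridine_borane": ("hydroxymethylated", "not hydroxymethylated"),
}


def call_modifications(nascent_read, template_read, mode):
    """Return (nascent_index, call) for every G in the nascent read.

    Both reads are 5'->3' and antiparallel, so nascent index i pairs with
    template index len - 1 - i.
    """
    if len(nascent_read) != len(template_read):
        raise ValueError("reads must cover the same fragment end to end")
    call_if_t, call_if_c = CALL_TABLE[mode]
    last = len(template_read) - 1
    calls = []
    for i, base in enumerate(nascent_read):
        if base != "G":
            continue  # only positions paired with a template cytosine matter
        paired = template_read[last - i]
        if paired == "T":
            calls.append((i, call_if_t))
        elif paired == "C":
            calls.append((i, call_if_c))
        else:
            calls.append((i, "ambiguous"))
    return calls


# Toy fragment: original template 5'-TCGACG-3', nascent strand 5'-CGTCGA-3'.
# Template C at index 1 is methylated, template C at index 4 is not.
nascent = "CGTCGA"
bisulfite_template = "TCGATG"  # unmethylated C (index 4) reads as T
taps_template = "TTGACG"       # methylated C (index 1) reads as T

print(call_modifications(nascent, bisulfite_template, "bisulfite"))
print(call_modifications(nascent, taps_template, "tet_pyridine_borane"))
```

Both calls print `[(1, 'unmethylated'), (4, 'methylated')]`: the two chemistries mark opposite bases, but the lookup table inverts the interpretation, so the same methylation state is recovered from either read pair.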


Abstract

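Provided is a method for constructing a sequencing library, comprising: 1) fragmenting double-stranded DNA and subjecting the resulting DNA fragments to blunt-end repair, 5'-end phosphorylation and 3'-end A-tailing; 2) adding adapter element 1 to each end of the DNA fragments obtained in step 1) by a ligation reaction to obtain a ligation product; 3) forming a nick at the endonuclease recognition site using an endonuclease; 4) at the nick, extending a new strand using as template the DNA fragment linked to the end of adapter element 1 that has no sticky end, to form a hybrid DNA duplex containing a template strand and a nascent strand; 5) adding adapter element 2 by a ligation reaction to the end of the hybrid DNA duplex that is not linked to adapter element 1, to obtain dumbbell-shaped double-stranded DNA; 6) subjecting the dumbbell-shaped double-stranded DNA to bisulfite treatment or conversion treatment to obtain the sequencing library.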

Description

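Method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing
Technical Field
The present invention relates to the field of biotechnology. Specifically, the present invention relates to a method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing.
Background
DNA methylation is an epigenetic regulatory modification that takes part in controlling how much protein is synthesized without changing the base sequence. In humans, DNA methylation is a remarkable chemical modification: the care of relatives, ageing of the body, smoking, heavy drinking and even obesity are all faithfully recorded on the genome by methylation. The genome is like a diary in which methylation is the writing that records what the body has experienced. DNA methylation is an important epigenetic mark. In mammals the most common methylation modifications occur on cytosine, mainly 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC). Obtaining methylation-level data for every cytosine across the genome is of great significance for studying the spatial and temporal specificity of epigenetics. Mapping whole-genome DNA methylation levels on next-generation high-throughput sequencing platforms, and analysing high-accuracy methylation patterns of specific species, will be a milestone in epigenomics research and will lay a foundation for basic research on mechanisms such as cell differentiation and tissue development, as well as for animal and plant breeding and research on human health and disease.
Whole-genome bisulfite sequencing (WGBS) is the most commonly used approach for studying methylation in organisms. It can cover all methylation sites and yields a more complete methylation map. However, it faces several challenges in high-throughput sequencing: 1. Methylation sequencing presupposes that the whole-genome DNA sequence of the species is already available. After bisulfite treatment, methylated C remains unchanged and unmethylated C is converted to U; the methylation sequencing result is then compared with the genome sequence to determine the modification state of the cytosine at each position. 2. Because unmethylated C bases become U bases after bisulfite treatment, the GC content of the whole genome changes drastically, which causes strong amplification and sequencing bias in later steps. 3. Data analysis is problematic: after sodium bisulfite treatment most cytosines (C) in the genome are read as thymine (T), so genome complexity drops, reads map to the reference genome with low efficiency, and too many reads map to multiple locations. Alignment therefore fails in some regions, where no valid DNA methylation information can be obtained even with higher sequencing throughput, and whole-genome methylation information is lost.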
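The loss of GC content and sequence complexity described above can be illustrated with a toy calculation. The following is only a minimal sketch: the sequence is invented, full conversion of every C is assumed, and the function names are hypothetical.

```python
def bisulfite_convert(seq: str) -> str:
    """Model a fully unmethylated strand after bisulfite: every C is read as T."""
    return seq.upper().replace("C", "T")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def distinct_kmers(seq: str, k: int) -> int:
    """Number of different k-mers in the sequence, a crude complexity measure."""
    return len({seq[i:i + k] for i in range(len(seq) - k + 1)})


# Toy sequence for illustration only, not real genomic data.
genome = "ACGTCCGATTGACTGACCGTACGTTAGCCATG"
converted = bisulfite_convert(genome)

print("GC before/after:", gc_fraction(genome), gc_fraction(converted))
print("distinct 4-mers before/after:",
      distinct_kmers(genome, 4), distinct_kmers(converted, 4))
```

Because conversion maps both C and T to T, different k-mers can collapse onto the same converted k-mer, which is one way to see why converted reads map to multiple locations more often.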
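In recent years the group of Professor Chunxiao Song (Liu, Y., Siejka-Zielińska, P., Velikova, G., Bi, Y., Yuan, F., Tomkova, M., ... & Song, C. X. (2019). Bisulfite-free direct detection of 5-methylcytosine and 5-hydroxymethylcytosine at base resolution. Nature Biotechnology, 37(4), 424-429.) developed TET-assisted pyridine borane sequencing (TAPS), which converts methylated cytosine to dihydrouracil. Dihydrouracil is then converted to thymine during PCR, and whether a cytosine is methylated is inferred by detecting thymine and comparing with the genome. This method converts the methylated cytosines. Because methylated cytosines make up a far smaller share of the genome than unmethylated cytosines, the genome is altered much less. In some highly methylated CpG island regions, however, the method still changes the sequence so much that complexity is low and accurate alignment becomes a problem.
Whether based on bisulfite or on TET enzyme, conversion-based sequencing alters the genome and therefore gives low alignment rates, so methylation information for some regions cannot be obtained accurately. Developing a method that improves the alignment rate is therefore of great significance.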
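Summary of the Invention
The present invention aims to solve, at least to some extent, at least one of the technical problems in the prior art. To this end, the present invention provides an adapter element, an adapter element composition, a kit and uses thereof, a method for constructing a sequencing library, a sequencing library and its use in sequencing, and a method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing. Sequencing with this library allows whole-genome DNA and whole-genome DNA methylation and/or hydroxymethylation to be sequenced at the same time. The DNA sequence and the methylation and/or hydroxymethylation are read from the same molecule, so methylation information can be obtained accurately without reference genome information, methylated positions can be located precisely, and the accuracy of methylation and/or hydroxymethylation sequencing information is greatly improved.
In one aspect, the present invention provides an adapter element. According to an embodiment of the present invention, the adapter element is a bubble-shaped single-stranded nucleic acid having a non-complementary region and a complementary region formed by the 5'-end sequence and the 3'-end sequence, and the 5' end or the 3' end has a sticky end. The sense and antisense strands can thus be joined effectively to form a circular DNA molecule for subsequent DNB (DNA nanoball) preparation.
According to embodiments of the present invention, the adapter element may further have the following additional technical features:
According to an embodiment of the present invention, the sticky end or the complementary region carries an endonuclease recognition site. The adapter element can thus be cut to form a nick, from which strand extension produces the nascent strand.
According to an embodiment of the present invention, the base of the sticky end is a U base or a T base. When the sticky end is a U base, it can serve as the endonuclease recognition site and be cleaved with USER enzyme.
According to an embodiment of the present invention, the endonuclease is selected from USER enzyme, DNase and RNase.
According to an embodiment of the present invention, the endonuclease recognition site is selected from a U base, a deoxynucleotide or a ribonucleotide.
According to an embodiment of the present invention, the adapter element contains one or more sequencing primer sequences, molecular tag sequences and/or sample tag sequences.
According to an embodiment of the present invention, the adapter element is 20 to 200 nt long. The sense and antisense strands can thus be joined effectively to form a circular DNA molecule for subsequent DNB preparation.
According to an embodiment of the present invention, the adapter element consists of deoxyribonucleotides and/or ribonucleotides.
According to an embodiment of the present invention, the adapter element has the nucleotide sequence shown in SEQ ID NO: 1 or 2, or a nucleotide sequence with at least 80% homology thereto.
5'-/Phos/GCTCGCAGTCGAGGTCAAGCGGTCTTAGGCTCBBBBBBBBBBTCTGAAGGACATGGCTACGATCGACTGCGAGCU-3' (SEQ ID NO: 1), where /Phos/ denotes a phosphate modification, the underlined cytosines are either methylated or unmethylated cytosines, B is any base, and the run of B bases forms the sample tag sequence. Adapter element 1 joins the sense and antisense strands of one DNA molecule, and the U at the 3' end of adapter element 1 can serve as the endonuclease recognition site; cutting there forms a nick from which synthesis of the nascent strand can start.
5'-/Phos/CGGACTCGACCTGACAATGCATGGCATCTCAGGTCGAGTCCGT-3' (SEQ ID NO: 2), where /Phos/ denotes a phosphate modification and the underlined cytosines are either methylated (m5C-dCTP) or unmethylated cytosines. Once adapter element 2 is ligated to the template strand and the newly synthesized nascent strand, a closed DNA circle is formed for subsequent DNA nanoball preparation.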
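The bubble (stem-loop) structure described for the two adapter elements can be checked directly from the sequences: the 5' end should pair with the bases just inside a single-base 3' overhang. The sketch below is an illustration only; the helper name `stem_length` is hypothetical, the B placeholders are treated as non-pairing, and U is treated like T for pairing purposes.

```python
# Watson-Crick partners; the placeholder base "B" is deliberately absent.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Sequences copied from SEQ ID NO: 1 and SEQ ID NO: 2 as printed in the text.
ADAPTER_1 = (
    "GCTCGCAGTCGA" "GGTCAAGCGGTCTTAGGCTC" "BBBBBBBBBB"
    "TCTGAAGGA" "CATGGCTACGATCGACTGCGAGCU"
)
ADAPTER_2 = "CGGACTCGACCT" "GACAATGCATGGCATCTCAG" "GTCGAGTCCGT"


def stem_length(adapter: str) -> int:
    """Count consecutive base pairs between the 5' end and the 3' end,
    skipping the final 3' base, which is assumed to be the overhang."""
    seq = adapter.upper().replace("U", "T")
    i, j = 0, len(seq) - 2  # j starts just inside the 1-nt overhang
    paired = 0
    while i < j and PAIRS.get(seq[i]) == seq[j]:
        paired += 1
        i += 1
        j -= 1
    return paired


for name, seq in (("adapter 1", ADAPTER_1), ("adapter 2", ADAPTER_2)):
    print(name, "length:", len(seq), "stem:", stem_length(seq),
          "3' overhang:", seq[-1])
```

A nonzero stem with an unpaired middle region is what gives each adapter its hairpin shape, and the single overhanging U or T is what pairs with the A-tailed insert during ligation.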
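In another aspect, the present invention provides an adapter element composition. According to an embodiment of the present invention, the adapter element composition comprises two of the adapter elements described above, and at least one of them carries an endonuclease recognition site on its sticky end or complementary region. The adapter element composition according to embodiments of the present invention can thus effectively join the sense and antisense strands to form a circular DNA molecule for subsequent DNB (DNA nanoball) preparation.
According to an embodiment of the present invention, the adapter element composition comprises: adapter element 1, which has the nucleotide sequence shown in SEQ ID NO: 1 or a nucleotide sequence with at least 80% homology thereto; and adapter element 2, which has the nucleotide sequence shown in SEQ ID NO: 2 or a nucleotide sequence with at least 80% homology thereto.
In yet another aspect, the present invention provides a kit. According to an embodiment of the present invention, the kit comprises the adapter element described above or the adapter element composition described above.
In yet another aspect, the present invention provides the use of the adapter element, adapter element composition or kit described above in constructing a sequencing library.
According to an embodiment of the present invention, the sequencing library is used for whole-genome DNA sequencing together with at least one of whole-genome DNA methylation sequencing and hydroxymethylation sequencing. Methylation and/or hydroxymethylation information can thus be obtained accurately using the adapter elements described above.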
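In yet another aspect, the present invention provides a method for constructing a sequencing library. According to an embodiment of the present invention, the method comprises:
1) fragmenting double-stranded DNA, and subjecting the resulting DNA fragments to blunt-end repair, 5'-end phosphorylation and 3'-end A-tailing;
2) adding adapter element 1 to each end of the DNA fragments obtained in step 1) by a ligation reaction, to obtain a ligation product;
wherein adapter element 1 is selected from the adapter elements described above, and the sticky end or the complementary region carries an endonuclease recognition site;
3) forming a nick at the endonuclease recognition site using an endonuclease;
4) at the nick, carrying out extension using as template the DNA fragment linked to the end of adapter element 1 that has no sticky end, to form a hybrid DNA duplex containing a template strand and a nascent strand; wherein the cytosines in the nascent strand are either all methylated or all unmethylated;
5) adding adapter element 2 by a ligation reaction to the end of the hybrid DNA duplex that is not linked to adapter element 1, to obtain dumbbell-shaped double-stranded DNA; wherein adapter element 2 is selected from the adapter elements described above;
6) subjecting the dumbbell-shaped double-stranded DNA to conversion treatment, in which the sequence of the nascent strand is unchanged while unmethylated cytosines on the template strand are converted to uracil, or methylated and/or hydroxymethylated cytosines on the template strand are converted to dihydrouracil, to obtain the sequencing library.
Adapter element 1 joins the sense and antisense strands of one DNA molecule, and a nick is formed at its endonuclease recognition site so that strand extension from the nick produces the nascent strand. Ligating adapter element 2 to the template strand and the newly synthesized nascent strand forms a closed DNA circle, giving dumbbell-shaped double-stranded DNA that is suitable for subsequent DNA nanoball preparation. Conversion treatment of the dumbbell-shaped double-stranded DNA then changes only the template strand (unmethylated cytosines to uracil, or methylated and/or hydroxymethylated cytosines to dihydrouracil, depending on the chemistry), giving the sequencing library. When this library is sequenced, the whole-genome sequence is obtained from the nascent strand, and comparing that sequence with the sequence of the template strand reveals the methylation/hydroxymethylation information accurately. The DNA sequence and the methylation and/or hydroxymethylation are read from the same molecule, so methylation information can be obtained accurately without reference genome information, methylated positions can be located precisely, and the accuracy of the methylation information is greatly improved.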
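To make the effect of step 6) concrete, the following sketch simulates the two conversion chemistries on a toy template strand and on its nascent strand. It is an idealized model under stated assumptions: conversion is taken to be complete, 5mC and 5hmC are not distinguished, and the names `convert`, `revcomp` and the chemistry labels are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def convert(strand: str, methylated: set, chemistry: str) -> str:
    """Return the strand as it would be read after conversion.

    methylated: 0-based positions of modified cytosines on this strand.
    chemistry:  "bisulfite" converts unmethylated C (read as T);
                "taps" converts methylated C (read as T).
    """
    if chemistry not in ("bisulfite", "taps"):
        raise ValueError(f"unknown chemistry: {chemistry}")
    read = []
    for pos, base in enumerate(strand):
        if base == "C":
            is_methylated = pos in methylated
            converts = (not is_methylated) if chemistry == "bisulfite" else is_methylated
            base = "T" if converts else "C"
        read.append(base)
    return "".join(read)


template = "ACGTCCGA"          # toy template strand, C at positions 1, 4, 5
template_marks = {1}           # assume only position 1 is methylated
nascent = revcomp(template)    # strand synthesized from the nick

# Bisulfite route: nascent strand built with methylated dCTP only.
all_c = {i for i, b in enumerate(nascent) if b == "C"}
print(convert(template, template_marks, "bisulfite"),
      convert(nascent, all_c, "bisulfite"))

# TET/pyridine borane route: nascent strand built with unmethylated dCTP only.
print(convert(template, template_marks, "taps"),
      convert(nascent, set(), "taps"))
```

In both routes the nascent strand comes out unchanged, which is why it can stand in for the genome sequence, while the template strand carries the conversion signal.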
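According to embodiments of the present invention, the above method for constructing a sequencing library may further have the following additional technical features:
According to an embodiment of the present invention, the fragmentation randomly breaks or cuts the double-stranded DNA by a physical or chemical method.
According to an embodiment of the present invention, the fragmentation is carried out by physical sonication or by an enzymatic reaction.
According to an embodiment of the present invention, the blunt-end repair is carried out with T4 DNA polymerase or mung bean nuclease. This facilitates the subsequent ligation reaction.
According to an embodiment of the present invention, the phosphorylation is carried out with a nucleotide kinase.
According to an embodiment of the present invention, the phosphorylation is carried out with T4 polynucleotide kinase (T4 DNA kinase).
According to an embodiment of the present invention, the 3'-end A-tailing is carried out with rTaq enzyme or with Klenow polymerase lacking 3'-5' exonuclease activity. Adapters can thus be added conveniently to both ends of the double-stranded DNA fragments in later steps, which improves the efficiency of library construction.
According to an embodiment of the present invention, the base of the sticky end is selected from a U base or a T base; the endonuclease is selected from USER enzyme, DNase or RNase; the endonuclease recognition site is selected from a U base, deoxyribonucleic acid or ribonucleic acid; and the number of nicks is one or more.
According to an embodiment of the present invention, the extension uses a DNA polymerase with 5'-3' exonuclease activity or 5'-3' strand-displacement activity.
According to an embodiment of the present invention, the DNA polymerase is selected from T4 DNA polymerase, phi29 DNA polymerase or Bst DNA polymerase. This allows efficient extension to obtain the nascent strand.
According to an embodiment of the present invention, the cytosines in the dNTPs used for the extension are either all methylated or all unmethylated. Methylated cytosines keep their sequence after bisulfite conversion, and unmethylated cytosines keep their sequence after conversion treatment with, for example, TET enzyme, potassium perruthenate, or beta-glucosyltransferase plus TET enzyme. Sequencing the nascent strand therefore gives the genomic DNA information.
According to an embodiment of the present invention, the cytosines in the nascent strand are all methylated, and step 6) comprises subjecting the dumbbell-shaped double-stranded DNA to bisulfite treatment to obtain the sequencing library. Referring to Fig. 1, in a sequencing library constructed by the method according to an embodiment of the present invention, all cytosines on the nascent strand are methylated, so its sequence is unchanged by the bisulfite treatment of step 6) and sequencing it gives the genomic DNA information. On the template strand, bisulfite conversion turns unmethylated cytosines into uracil; sequencing the template strand and comparing the result with the genomic DNA information obtained above gives the methylation information.
According to an embodiment of the present invention, the cytosines in the nascent strand are all methylated, and step 6) comprises subjecting the dumbbell-shaped double-stranded DNA to conversion treatment to obtain the sequencing library. The reagents used for the conversion treatment comprise an auxiliary reagent and pyridine borane, or bisulfite. The auxiliary reagent is one of the following three: TET enzyme; potassium perruthenate; beta-glucosyltransferase and TET enzyme. The conversion treatment comprises treating the dumbbell-shaped double-stranded DNA with the auxiliary reagent and then with pyridine borane, or treating it with bisulfite. TET enzyme allows both 5mC and 5hmC to be identified, the combination of beta-glucosyltransferase and TET enzyme allows 5mC to be identified, and potassium perruthenate allows 5hmC to be identified.
Referring to Fig. 2, in a sequencing library constructed by the method according to an embodiment of the present invention, all cytosines on the nascent strand are unmethylated, so its sequence is unchanged by the TET-assisted or potassium perruthenate-assisted conversion of step 6) and sequencing it gives the genomic DNA information. On the template strand, the auxiliary reagent converts methylated cytosine to carboxylcytosine, which pyridine borane then converts to dihydrouracil (that is, a base carrying two additional hydrogen atoms). Dihydrouracil is read as thymine in sequencing, so comparing the sequencing result with the genomic DNA information obtained above gives the methylation and/or hydroxymethylation information.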
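According to an embodiment of the present invention, the method further comprises preparing the sequencing library into DNA nanoballs, so that it can be sequenced on a DNB sequencer.
According to an embodiment of the present invention, the DNA nanoballs are prepared by rolling circle amplification of the sequencing library with a primer sequence.
According to an embodiment of the present invention, the primer sequence has the nucleotide sequence shown in SEQ ID NO: 3 or a nucleotide sequence with at least 80% (for example 85%, 90%, 95% or 99%) homology thereto.
GAGCCTAAGACCGCTTGACCTCAACTACAAAC (SEQ ID NO: 3)
In another aspect, the present invention provides a sequencing library. According to an embodiment of the present invention, the sequencing library is obtained by the construction method described above. Sequencing with the library according to embodiments of the present invention allows whole-genome DNA and whole-genome DNA methylation/hydroxymethylation to be sequenced at the same time. The DNA sequence and the methylation are read from the same molecule, so methylation information can be obtained accurately without reference genome information, and the accuracy of the methylation information is greatly improved.
In yet another aspect, the present invention provides the use of the sequencing library described above in sequencing. Sequencing with this library allows whole-genome DNA and whole-genome DNA methylation/hydroxymethylation to be sequenced at the same time. The DNA sequence and the methylation are read from the same molecule, so methylation information can be obtained accurately without reference genome information, and the accuracy of the methylation information is greatly improved.
According to an embodiment of the present invention, the sequencing comprises whole-genome DNA sequencing together with at least one of whole-genome DNA methylation sequencing and hydroxymethylation sequencing.
In yet another aspect, the present invention provides a method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing. According to an embodiment of the present invention, the method comprises: sequencing the library described above to obtain sequencing information comprising nascent-strand information and template-strand information, the nascent-strand information being the whole-genome DNA information; and comparing the template-strand information with the nascent-strand information to obtain the whole-genome DNA methylation and/or hydroxymethylation information of the template strand. The method according to embodiments of the present invention can thus obtain methylation information without reference genome information and can locate methylated sequence positions precisely, improving the accuracy of methylation sequencing data alignment.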
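According to an embodiment of the present invention, the comparative analysis comprises:
a) when the cytosines in the nascent strand are all methylated and the dumbbell-shaped double-stranded DNA has been treated with bisulfite: in the sequencing result, if the base at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand is thymine, this indicates that the position is not methylated; if that base is cytosine, this indicates that the position is methylated;
b) when the cytosines in the nascent strand are all unmethylated and the dumbbell-shaped double-stranded DNA has been converted with TET enzyme and pyridine borane: in the sequencing result, if the base at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand is thymine, this indicates that the position is methylated; if that base is cytosine, this indicates that the position is not methylated;
c) when the cytosines in the nascent strand are all unmethylated and the dumbbell-shaped double-stranded DNA has been converted with potassium perruthenate and pyridine borane: in the sequencing result, if the base at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand is thymine, this indicates that the position is hydroxymethylated; if that base is cytosine, this indicates that the position is not hydroxymethylated;
d) when the cytosines in the nascent strand are all unmethylated and the dumbbell-shaped double-stranded DNA has been converted with beta-glucosyltransferase, TET enzyme and pyridine borane: in the sequencing result, if the base at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand is thymine, this indicates that the position is methylated; if that base is cytosine, this indicates that the position is not methylated.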
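Rules a) to d) can be restated as a small lookup table. The sketch below is one possible reading, not the applicant's software: it reorients the nascent read by reverse complement so that a nascent G becomes a C at the same index as the template read, assumes both reads are already trimmed to the same insert with no indels, and uses hypothetical mode names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# For each treatment: (meaning of a T in the template read, meaning of a C).
RULES = {
    "bisulfite": ("unmethylated", "methylated"),              # rule a)
    "tet_pb": ("methylated", "unmethylated"),                 # rule b)
    "kruo4_pb": ("hydroxymethylated", "not hydroxymethylated"),  # rule c)
    "bgt_tet_pb": ("methylated", "unmethylated"),             # rule d)
}


def revcomp(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def call_sites(nascent_read: str, template_read: str, mode: str) -> dict:
    """Return {position: call} for every cytosine of the original template.

    The nascent read is reverse-complemented to recover the unconverted
    template sequence; positions where that sequence has C are compared
    with the converted template read.
    """
    if_t, if_c = RULES[mode]
    original = revcomp(nascent_read)
    if len(original) != len(template_read):
        raise ValueError("reads must cover the same insert")
    calls = {}
    for pos, (orig_base, read_base) in enumerate(zip(original, template_read)):
        if orig_base != "C":
            continue
        if read_base == "T":
            calls[pos] = if_t
        elif read_base == "C":
            calls[pos] = if_c
        else:
            calls[pos] = "ambiguous"  # sequencing error or mismatch
    return calls


print(call_sites("TCGGACGT", "ACGTTTGA", "bisulfite"))
print(call_sites("TCGGACGT", "ATGTCCGA", "tet_pb"))
```

Because the comparison is made within one molecule, no external reference sequence enters the call itself; the reference is only needed later to place the call on the genome.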
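Beneficial Effects
1. The present invention obtains genome information and genome methylation information at the same time, so the methylation and/or hydroxymethylation information of an unknown species can be obtained without reference genome information;
2. The present invention uses genomic position information to locate methylated and/or hydroxymethylated sequence positions precisely, improving the accuracy of alignment of methylation and/or hydroxymethylation data;
3. The present invention does not require PCR, so methylation and/or hydroxymethylation information can be obtained efficiently and uniformly across the whole genome;
4. The present invention enables accurate detection of methylation and/or hydroxymethylation at C/T polymorphic positions.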
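The fourth benefit, correct handling of C/T polymorphisms, can be illustrated with a toy bisulfite example. This is a sketch under assumptions: the sequences are invented, conversion is complete, and `call_against` is a hypothetical helper that calls every position where the comparison sequence has a C.

```python
def call_against(comparison: str, converted_read: str) -> dict:
    """Bisulfite-style calls at every position where `comparison` has a C."""
    calls = {}
    for pos, (ref_base, read_base) in enumerate(zip(comparison, converted_read)):
        if ref_base == "C" and read_base in "CT":
            calls[pos] = "methylated" if read_base == "C" else "unmethylated"
    return calls


reference = "ACGTCCGA"       # external reference genome (toy)
sample = "ACGTTCGA"          # the sample carries a C>T variant at position 4
# Converted template read from the sample: C at 1 is methylated (stays C),
# T at 4 is a genuine T, C at 5 is unmethylated (read as T).
template_read = "ACGTTTGA"

reference_calls = call_against(reference, template_read)
same_molecule_calls = call_against(sample, template_read)  # sample sequence from the nascent strand

print("against reference:     ", reference_calls)
print("against nascent strand:", same_molecule_calls)
```

Calling against the external reference reports position 4 as an unmethylated cytosine even though the sample has no cytosine there, whereas calling against the sequence recovered from the nascent strand makes no call at that position.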
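Additional aspects and advantages of the present invention will be given in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and easy to understand from the following description of the embodiments in conjunction with the drawings, in which:
Fig. 1 is a schematic workflow for preparing a combined whole-genome DNA and whole-genome DNA methylation library based on bisulfite conversion, according to an embodiment of the present invention;
Fig. 2 is a schematic workflow for preparing a combined whole-genome DNA and whole-genome DNA methylation library based on TET-assisted or potassium perruthenate-assisted conversion, according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the structures of adapter element 1 and adapter element 2 according to an embodiment of the present invention;
Fig. 4 is a schematic diagram of the information analysis principle according to an embodiment of the present invention;
Fig. 5 is a flow chart of sequencing with a strand-displacing enzyme according to an embodiment of the present invention.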
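Detailed Description
Embodiments of the present invention are described in detail below. The embodiments described below are exemplary, serve only to explain the present invention, and are not to be construed as limiting it.
The present invention provides a method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing, comprising:
Library construction (see Fig. 1)
1. Genomic DNA is randomly fragmented to produce 200-500 bp fragments; alternatively, DNA that is already fragmented, such as cfDNA, is used.
2. The fragmented DNA molecules are treated with mung bean nuclease, which trims the sticky ends to form blunt ends.
3. The blunt-ended double-stranded DNA is phosphorylated at the 5' ends and A-tailed at the 3' ends, giving sticky-ended double-stranded DNA molecules with a 5' phosphate and a 3' A base.
4. Adapter element 1 is added to these molecules. The main purpose of this adapter is to allow the later strand extension. The adapter sequence may contain one or more sequencing primer sequences and/or molecular tags (UMI, Unique Molecular Identifiers) and/or sample tag sequences (Index Barcode). The adapter is a special bubble-shaped adapter (Fig. 3a): its middle is a non-complementary sequence and its 5' end is phosphorylated.
The 5' and 3' ends of adapter element 1 are complementary sequences, and one of them carries a sticky-end U base. This U base can be recognized and excised by USER enzyme in a later step, producing a nick that is used for polymerase exonuclease digestion or displacement and for polymerization and extension. Alternatively, the complementary 5' and 3' end sequences contain several U bases and carry a sticky-end T base. The U bases can be recognized and excised by USER enzyme in a later step, producing one or more nicks that are used for polymerase exonuclease digestion or displacement and for polymerization and extension (Fig. 1).
5. The ligation product is nicked at one or more sites by USER enzyme.
6. The nascent strand is extended from the nick using an enzyme with 5'-3' exonuclease activity (such as T4 DNA polymerase) or with 5'-3' strand-displacement activity (such as phi29 or Bst). The cytosines in the dNTPs used for extension are all methylated or all unmethylated, so the strand originally paired with the template is replaced by a nascent strand whose cytosines are all methylated or all unmethylated, forming a hybrid DNA duplex of the original template strand and the nascent strand.
7. The hybrid duplex is then ligated to adapter element 2 to give the dumbbell-shaped double-stranded DNA library. This adapter sequence contains one or more sequencing primer sequences and/or molecular tags (UMI, Unique Molecular Identifiers) and/or sample tag sequences (Index Barcode). The adapter is a special bubble-shaped adapter (Fig. 3b): its middle is a non-complementary sequence, its 5' and 3' ends are complementary sequences, its 3' end has a sticky-end T/U base, and its 5' end is phosphorylated.
8. The dumbbell-shaped double-stranded DNA is converted with bisulfite, or by TET-assisted, potassium perruthenate (KRuO4)-assisted, or beta-glucosyltransferase plus TET-assisted conversion. Unmethylated cytosines on the original template strand are converted to uracil, or methylated cytosines on the original template strand are converted to dihydrouracil (DHU), while the sequence of the nascent strand, whose cytosines are all methylated or all unmethylated as appropriate for the chemistry, remains unchanged.
9. The converted dumbbell-shaped double-stranded DNA library is made into DNA nanoballs using a universal primer. The universal primer binds the adapter sequence of the dumbbell-shaped double-stranded DNA and is extended linearly by an enzyme with strand-displacement activity, generating DNA nanoballs.
10. The DNA nanoballs are loaded onto a DNB sequencing chip for sequencing.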
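Sequencing
11. After the DNBs are loaded onto the chip, the sequencing reaction is run. Using the Read1 and Read2 sequencing primers and a sequencing enzyme with strand-displacement activity (see Fig. 5), the original template strand (the bisulfite-converted strand, or the enzyme-assisted or potassium perruthenate (KRuO4)-assisted converted strand) and the nascent strand are sequenced separately. The nascent strand provides the reference genomic DNA information, and the original template strand provides the cytosine conversion information.
Information analysis scheme
12. Each DNB yields two reads, Read1 and Read2. One of them (Read1 or Read2) comes from the nascent strand; this read is aligned to the genome with any alignment software to obtain its exact genomic position. The other read (Read2 or Read1) comes from the original template strand (the bisulfite-converted strand, or the enzyme-assisted or potassium perruthenate (KRuO4)-assisted converted strand). Read1 and Read2 are compared. Under bisulfite conversion, positions on the original template strand where cytosine is read as thymine are unmethylated cytosines, and cytosines that are not converted to thymine are methylated. Under enzyme-assisted or potassium perruthenate (KRuO4)-assisted conversion, positions on the original template strand where cytosine is read as thymine are methylated cytosines, and cytosines that are not converted to thymine are unmethylated.
The solutions of the present invention are explained below with reference to examples. Those skilled in the art will understand that the following examples serve only to illustrate the present invention and should not be regarded as limiting its scope. Where no specific technique or condition is given in an example, the techniques or conditions described in the literature in this field, or the product instructions, were followed. Reagents or instruments for which no manufacturer is given are conventional commercially available products.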
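Example 1
1 μg of gDNA from the Yanhuang cell line was used to prepare whole-genome methylation libraries by the method of the present invention and by a conventional method. The libraries were sequenced on an MGISEQ-2000 sequencer with PE100 reads to a depth of 30×, and the data were then analysed for performance measures including data utilization, alignment rate and bias. For conventional WGBS, libraries were prepared with the Hieff [inline image: Figure PCTCN2022074093-appb-000001] Methyl-seq DNA Library Prep Kit (Yeasen Biotechnology (Shanghai) Co., Ltd., Cat. No. 12211ES08), strictly following the manufacturer's instructions.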
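As a rough guide to what "PE100 at 30×" means in data volume, the following sketch estimates the number of read pairs needed. The genome size of 3.1 Gb is an assumption (the text does not state one), and whether both reads of a pair or only one is counted toward depth is also an assumption exposed as a parameter.

```python
def read_pairs_needed(depth: int, genome_size: int, read_len: int,
                      reads_counted_per_pair: int = 2) -> int:
    """Read pairs required so that counted bases reach depth x genome_size.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    bases_needed = depth * genome_size
    bases_per_pair = read_len * reads_counted_per_pair
    return -(-bases_needed // bases_per_pair)


GENOME_SIZE = 3_100_000_000  # assumed human genome size in bases

# Both reads counted toward depth (usual convention for paired-end data).
print(read_pairs_needed(30, GENOME_SIZE, 100))

# Only one read per pair counted, for example if depth is meant per strand type.
print(read_pairs_needed(30, GENOME_SIZE, 100, reads_counted_per_pair=1))
```

In this library design one read of each pair comes from the nascent strand and the other from the converted template strand, so it is worth stating explicitly which convention a quoted depth refers to.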
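1. DNA fragmentation
gDNA was fragmented on a Covaris instrument, with the main band at about 300 bp.
2. End repair
The end-repair reaction mix and conditions were as follows:
Fragmented DNA 40 μL
10× T4 DNA kinase buffer 5 μL
T4 DNA kinase 2 μL
Mung bean nuclease 1 μL
rTaq 1 μL
dATP (10 mM) 1 μL
Total volume 50 μL
The reaction was placed in a thermal cycler at 37 °C for 10 min and then 65 °C for 10 min. After the reaction, the product was purified with 1.0× AMPure beads and dissolved in 20 μL of elution buffer. At 37 °C the kinase carries out the phosphorylation, and at 65 °C rTaq adds an A base to the 3' ends of the double-stranded DNA.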
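3. Ligation of adapter element 1:
1) The ligation reaction for the methylated adapter (sometimes also called the "methylated tag adapter") was set up with the DNA from the previous step as follows:
DNA 18 μL
2× Rapid T4 DNA ligation buffer (Enzymatic) 25 μL
Methylated tag adapter (10 μM)* 4 μL
T4 DNA ligase (Rapid, L603-HC-L Enzymatic) 3 μL
Total volume 50 μL
*The methylated adapter sequence is:
Adapter 1: 5'-/Phos/GCTCGCAGTCGAGGTCAAGCGGTCTTAGGCTCBBBBBBBBBBTCTGAAGGACATGGCTACGATCGACTGCGAGCT-3' (SEQ ID NO: 1). The underlined cytosines are methylated cytosines (m5C-dCTP), and B is the sample tag sequence.
2) The reaction was incubated at 20 °C on a Thermomixer (Eppendorf) for 15 min to obtain the ligation product. After the reaction, the product was purified with 1.0× AMPure beads and dissolved in 40 μL of elution buffer.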
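Reaction recipes like the ones above are easy to sanity-check programmatically. The sketch below verifies that the listed volumes add up to the stated total and computes final concentrations by simple dilution; the dictionary keys and function names are illustrative, not part of the protocol.

```python
import math


def volumes_match(components_ul: dict, total_ul: float) -> bool:
    """True if the component volumes (in microlitres) sum to the stated total."""
    return math.isclose(sum(components_ul.values()), total_ul)


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Final concentration after dilution, in the same unit as the stock."""
    return stock * volume_ul / total_ul


# Volumes as listed in the end-repair and adapter 1 ligation steps of Example 1.
end_repair = {
    "fragmented DNA": 40, "10x kinase buffer": 5, "T4 DNA kinase": 2,
    "mung bean nuclease": 1, "rTaq": 1, "dATP (10 mM)": 1,
}
ligation = {
    "DNA": 18, "2x ligation buffer": 25,
    "adapter (10 uM)": 4, "T4 DNA ligase": 3,
}

print(volumes_match(end_repair, 50), volumes_match(ligation, 50))
print("dATP final (mM):", final_concentration(10, end_repair["dATP (10 mM)"], 50))
print("adapter final (uM):", final_concentration(10, ligation["adapter (10 uM)"], 50))
```

The same check applied to later tables is a quick way to catch unit or volume typos before running a reaction.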
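4. Nascent strand synthesis
1) The extension reaction was set up with the DNA from the previous step as follows:
DNA 40 μL
Bst reaction buffer 5 μL
USER 1 μL
dATP/dGTP/dTTP/m5C-dCTP 2 μL
Bst 2 μL
Total volume 50 μL
2) 37 °C for 5 min, then 65 °C for 10 min. After the reaction, the product was purified with 1.0× AMPure beads and dissolved in 22 μL of elution buffer.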
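5. Ligation of adapter element 2:
1) The ligation reaction for the methylated adapter (sometimes also called the "methylated tag adapter") was set up with the DNA from the previous step as follows:
DNA 18 μL
2× Rapid T4 DNA ligation buffer (Enzymatic) 25 μL
Methylated tag adapter (10 μM)* 4 μL
T4 DNA ligase (Rapid, L603-HC-L Enzymatic) 3 μL
Total volume 50 μL
*The methylated adapter sequence is:
Adapter 2: 5'-/5Phos/CGGACTCGACCTGACAATGCATGGCATCTCAGGTCGAGTCCGT-3' (SEQ ID NO: 2). The underlined cytosines in adapter 2 are all protected by methylation.
2) The reaction was incubated at 20 °C on a Thermomixer (Eppendorf) for 15 min to obtain the ligation product. After the reaction, the product was purified with 1.0× AMPure beads and dissolved in 40 μL of elution buffer.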
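6. Bisulfite treatment
The ligated DNA was bisulfite-treated with the EZ DNA Methylation-Gold Kit (ZYMO) as follows:
1) Prepare the CT Conversion Reagent solution: take the CT Conversion Reagent (solid mixture) from the kit and add 900 μL of water, 50 μL of M-Dissolving Buffer and 300 μL of M-Dilution Buffer. Dissolve at room temperature with vortexing for 10 min or shaking on a shaker for 10 min.
2) Prepare the M-Wash Buffer: add 24 mL of 100% ethanol to the M-Wash Buffer and set aside.
3) Add 130 μL of CT Conversion Reagent solution and the ligated DNA to a PCR tube, and mix by flicking or pipetting.
4) Place the sample tube in a thermal cycler and run the following steps:
98 °C for 5 min
64 °C for 2.5 h
After this, proceed immediately to the next step or store at 4 °C (for at most 20 h).
5) Place a Zymo-Spin IC Column in a Collection Tube and add 600 μL of M-Binding Buffer.
6) Add the bisulfite-treated sample to the Zymo-Spin IC Column containing M-Binding Buffer, close the cap and mix by inversion.
7) Centrifuge at full speed (>10,000 × g) for 30 s and discard the flow-through.
8) Add 100 μL of M-Wash Buffer to the column, centrifuge at full speed (>10,000 × g) for 30 s and discard the flow-through.
9) Add 200 μL of M-Desulphonation Buffer to the column, leave at room temperature for 15 min, centrifuge at full speed (>10,000 × g) for 30 s and discard the flow-through.
10) Add 200 μL of M-Wash Buffer to the column, centrifuge at full speed (>10,000 × g) for 30 s and discard the flow-through; repeat this step once.
11) Place the Zymo-Spin IC Column in a new 1.5 mL EP tube, add 20 μL of M-Elution Buffer to the column matrix, leave at room temperature for 2 min, and centrifuge at full speed (>10,000 × g) to elute the target DNA fragments.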
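7. DNB preparation
The DNB preparation reaction was set up with the target DNA fragments from the previous step as follows:
DNA from the previous step 20 μL
Phi29 reaction buffer 25 μL
Universal primer 1 (10 μM) 5 μL
Total volume 50 μL
25 °C for 30 min.
Universal primer 1: GAGCCTAAGACCGCTTGACCTCAACTACAAAC (SEQ ID NO: 3)
8. Library quality control:
The DNBs were quantified with the HS Qubit ssDNA kit.
9. Sequencing
The library was sequenced by high-throughput sequencing on the MGISEQ-2000 platform with PE100 reads. After sequencing, the data were aligned and basic statistics were collected, including raw output data, usable data and aligned data.
10. Information analysis
The conventional method was aligned with BS-MAP software. For the method of the present invention, BWA software was used to align the nascent strand (the strand synthesized with replaced cytosines) to the genome to obtain the exact position of each read; the information from the original template strand (the bisulfite-converted or enzyme-converted strand) was then retrieved according to that genomic position, giving accurate methylation alignment information.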
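Once each nascent read has a genomic position and the paired template read has been compared with it, per-site methylation levels follow from simple counting. The sketch below shows only that aggregation step; the input tuples are invented for illustration and the function name is hypothetical, and it does not reproduce the BWA or BS-MAP steps.

```python
from collections import defaultdict


def methylation_levels(calls):
    """Aggregate per-read calls into per-site methylation levels.

    calls: iterable of (chromosome, position, status) tuples, where status is
           "methylated" or "unmethylated". Other statuses are ignored.
    Returns {(chromosome, position): methylated / (methylated + unmethylated)}.
    """
    counts = defaultdict(lambda: [0, 0])  # [methylated, unmethylated]
    for chrom, pos, status in calls:
        if status == "methylated":
            counts[(chrom, pos)][0] += 1
        elif status == "unmethylated":
            counts[(chrom, pos)][1] += 1
    return {site: m / (m + u) for site, (m, u) in counts.items()}


# Toy calls: three molecules cover chr1:100, one covers chr1:105.
example_calls = [
    ("chr1", 100, "methylated"),
    ("chr1", 100, "unmethylated"),
    ("chr1", 100, "methylated"),
    ("chr1", 105, "unmethylated"),
    ("chr1", 105, "ambiguous"),
]
levels = methylation_levels(example_calls)
for site, level in sorted(levels.items()):
    print(site, round(level, 3))
```

Sites that only ever receive ambiguous calls never enter the table, so they drop out rather than being reported with a misleading level.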
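11. Results:
Table 1
[Table 1 is provided as an image: Figure PCTCN2022074093-appb-000002]
The method of the present invention can greatly increase the methylation alignment rate, can provide CpG site coverage, can increase data utilization, and improves the accuracy of methylation detection.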
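Example 2
1 μg of gDNA from the Yanhuang cell line was used to prepare whole-genome methylation libraries by the method of the present invention and by a conventional method. The libraries were sequenced on an MGISEQ-2000 sequencer with PE100 reads to a depth of 30×, and the data were then analysed for performance measures including data utilization, alignment rate and bias.
1. DNA fragmentation
gDNA was fragmented on a Covaris instrument, with the main band at about 300 bp.
2. End repair
The end-repair reaction mix and conditions were as follows:
Fragmented DNA 40 μL
10× T4 DNA kinase buffer 5 μL
T4 DNA kinase 2 μL
Mung bean nuclease 1 μL
rTaq 1 μL
dATP (10 mM) 1 μL
Total volume 50 μL
The reaction was placed in a thermal cycler at 37 °C for 10 min and then 65 °C for 10 min. After the reaction, the product was purified with 1.0× AMPure beads and dissolved in 20 μL of elution buffer.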
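3. Ligation of adapter 1:
1) The ligation reaction for the methylated adapter (sometimes also called the "methylated tag adapter") was set up with the DNA from the previous step as follows:
DNA 18 μL
2× Rapid T4 DNA ligation buffer (Enzymatic) 25 μL
Methylated tag adapter (10 μM)* 4 μL
T4 DNA ligase (Rapid, L603-HC-L Enzymatic) 3 μL
Total volume 50 μL
*The adapter sequence is:
Adapter 1: 5'-/5Phos/GCTCGCAGTCGAGGTCAAGCGGTCTTAGGCTCBBBBBBBBBBTCTGAAGGACATGGCTACGATCGACTGCGAGCT-3' (SEQ ID NO: 1), where B is the sample tag sequence
2) The reaction was incubated at 20 °C for 15 min to obtain the ligation product. After the reaction, the product was purified with 1.0× AMPure beads and dissolved in 40 μL of elution buffer.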
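4. Nascent strand synthesis
1) The extension reaction was set up with the DNA from the previous step as follows:
DNA 40 μL
Bst reaction buffer 5 μL
USER 1 μL
dNTP 2 μL
Bst 2 μL
Total volume 50 μL
2) 37 °C for 5 min, then 65 °C for 10 min. After the reaction, the product was purified with 1.0× AMPure beads and dissolved in 22 μL of elution buffer.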
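5. Ligation of adapter 2:
1) The adapter ligation reaction was set up with the DNA from the previous step as follows:
DNA 18 μL
2× Rapid T4 DNA ligation buffer (Enzymatic) 25 μL
Tag adapter 2 (10 μM)* 4 μL
T4 DNA ligase (Rapid, L603-HC-L Enzymatic) 3 μL
Total volume 50 μL
*The adapter sequence is:
Adapter 2: 5'-/5Phos/CGGACTCGACCTGACAATGCATGGCATCTCAGGTCGAGTCCGT-3' (SEQ ID NO: 2)
2) The reaction was incubated at 20 °C on a Thermomixer (Eppendorf) for 15 min to obtain the ligation product. After the reaction, the product was purified with 1.0× AMPure beads and dissolved in 40 μL of elution buffer.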
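6. TET-assisted pyridine borane conversion
The TET enzyme was taken from the NEBNext Enzymatic Methyl-seq Kit (NEB, E7120S).
1) The reaction was set up with the DNA from the previous step as follows:
TET buffer 10 μL
Oxidation supplement 1 μL
DTT 1 μL
Oxidation enhancer 1 μL
TET enzyme 4 μL
2) Incubate the PCR tube in a thermal cycler at 37 °C for 1 h, then add 1 μL of stop buffer and incubate at 37 °C for 30 min.
3) After the reaction, purify with 80 μL of AMPure beads and dissolve the purified product in 35 μL of elution buffer.
4) Add 10 μL of 3 M sodium acetate solution (pH 4.3) and 5 μL of 10 M pyridine borane to the 35 μL sample. Shake the PCR tube in a Thermomixer (Eppendorf) at 37 °C and 850 rpm for 16 h.
5) Purify the DNA with PB buffer and a Zymo-Spin IC Column (Zymo Research), and finally dissolve it in 20 μL of TE.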
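7. DNB preparation
1) The DNB preparation reaction was set up with the target DNA fragments from the previous step as follows:
DNA from the previous step 20 μL
Phi29 reaction buffer 25 μL
Universal primer 1 (10 μM) 5 μL
Total volume 50 μL
2) 25 °C for 30 min.
Universal primer 1: GAGCCTAAGACCGCTTGACCTCAACTACAAAC (SEQ ID NO: 3)
8. Library quality control:
The DNBs were quantified with the HS Qubit ssDNA kit.
9. Sequencing
The library was sequenced by high-throughput sequencing on the MGISEQ-2000 platform with PE100 reads. After sequencing, the data were aligned and basic statistics were collected, including raw output data, usable data and aligned data.
10. Information analysis
The conventional method was aligned with BS-MAP software. For the method of the present invention, BWA software was used to align the nascent strand (the strand synthesized with replaced cytosines) to the genome to obtain the exact position of each read; the information from the original template strand (the bisulfite-converted or enzyme-converted strand) was then retrieved according to that genomic position, giving accurate methylation alignment information.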
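11. Results:
Table 1
[Table provided as an image: Figure PCTCN2022074093-appb-000003]
Note: the conventional method (TAPS) followed Liu, Y., Siejka-Zielińska, P., Velikova, G., Bi, Y., Yuan, F., Tomkova, M., ... & Song, C. X. (2019). Bisulfite-free direct detection of 5-methylcytosine and 5-hydroxymethylcytosine at base resolution. Nature Biotechnology, 37(4), 424-429. The experiment was carried out strictly according to the experimental steps in that publication.
The method of the present invention can greatly increase the methylation alignment rate, can provide CpG site coverage, can increase data utilization, and improves the accuracy of methylation detection.
In this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Such terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict each other, those skilled in the art may combine different embodiments or examples described in this specification and the features of different embodiments or examples.
Although embodiments of the present invention have been shown and described above, it is to be understood that these embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may change, modify, substitute and vary the above embodiments within the scope of the present invention.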

Claims (37)

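  1. An adapter element, characterized in that the adapter element is a bubble-shaped single-stranded nucleic acid, the single-stranded nucleic acid has a non-complementary region and a complementary region formed by the 5'-end sequence and the 3'-end sequence, and the 5' end or the 3' end has a sticky end.
  2. The adapter element according to claim 1, characterized in that the sticky end or the complementary region carries an endonuclease recognition site.
  3. The adapter element according to claim 1, characterized in that the base of the sticky end is a U base or a T base.
  4. The adapter element according to claim 2, characterized in that the endonuclease is selected from USER enzyme, DNase and RNase.
  5. The adapter element according to claim 2 or 4, characterized in that the endonuclease recognition site is selected from a U base, a deoxynucleotide or a ribonucleotide.
  6. The adapter element according to any one of claims 1 to 5, characterized in that the adapter element contains one or more sequencing primer sequences, molecular tag sequences and/or sample tag sequences.
  7. The adapter element according to any one of claims 1 to 6, characterized in that the adapter element is 20 to 200 nt long.
  8. The adapter element according to any one of claims 1 to 7, characterized in that the adapter element consists of deoxyribonucleotides and/or ribonucleotides.
  9. The adapter element according to any one of claims 1 to 8, characterized in that the adapter element has the nucleotide sequence shown in SEQ ID NO: 1 or 2, or a nucleotide sequence with at least 80% homology thereto.
  10. An adapter element composition, characterized by comprising two adapter elements according to any one of claims 1 to 9, wherein at least one of the adapter elements carries an endonuclease recognition site on its sticky end or complementary region.
  11. The adapter element composition according to claim 10, characterized by comprising:
    adapter element 1, which has the nucleotide sequence shown in SEQ ID NO: 1 or a nucleotide sequence with at least 80% homology thereto;
    adapter element 2, which has the nucleotide sequence shown in SEQ ID NO: 2 or a nucleotide sequence with at least 80% homology thereto.
  12. A kit, characterized by comprising: the adapter element according to any one of claims 1 to 9, or the adapter element composition according to claim 10 or 11.
  13. Use of the adapter element according to any one of claims 1 to 9, the adapter element composition according to claim 10 or 11, or the kit according to claim 12 in constructing a sequencing library.
  14. The use according to claim 13, characterized in that the sequencing library is used for whole-genome DNA sequencing together with at least one of whole-genome DNA methylation sequencing and hydroxymethylation sequencing.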
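  15. A method for constructing a sequencing library, characterized by comprising:
    1) fragmenting double-stranded DNA, and subjecting the resulting DNA fragments to blunt-end repair, 5'-end phosphorylation and 3'-end A-tailing;
    2) adding adapter element 1 to each end of the DNA fragments obtained in step 1) by a ligation reaction, to obtain a ligation product;
    wherein adapter element 1 is selected from the adapter elements according to any one of claims 1 to 8, and the sticky end or the complementary region carries an endonuclease recognition site;
    3) forming a nick at the endonuclease recognition site using an endonuclease;
    4) at the nick, carrying out extension using as template the DNA fragment linked to the end of adapter element 1 that has no sticky end, to form a hybrid DNA duplex containing a template strand and a nascent strand;
    wherein the cytosines in the nascent strand are either all methylated or all unmethylated;
    5) adding adapter element 2 by a ligation reaction to the end of the hybrid DNA duplex that is not linked to adapter element 1, to obtain dumbbell-shaped double-stranded DNA;
    wherein adapter element 2 is selected from the adapter elements according to any one of claims 1 to 8;
    6) subjecting the dumbbell-shaped double-stranded DNA to bisulfite treatment or conversion treatment, in which the sequence of the nascent strand is unchanged while unmethylated cytosines on the template strand are converted to uracil, or methylated and/or hydroxymethylated cytosines on the template strand are converted to dihydrouracil, to obtain the sequencing library.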
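  16. The construction method according to claim 15, characterized in that the fragmentation randomly breaks or cuts the double-stranded DNA by a physical or chemical method.
  17. The construction method according to claim 15 or 16, characterized in that the fragmentation is carried out by physical sonication or by an enzymatic reaction.
  18. The construction method according to any one of claims 15 to 17, characterized in that the blunt-end repair is carried out with T4 DNA polymerase or mung bean nuclease.
  19. The construction method according to any one of claims 15 to 18, characterized in that the phosphorylation is carried out with a nucleotide kinase.
  20. The construction method according to any one of claims 15 to 19, characterized in that the phosphorylation is carried out with T4 polynucleotide kinase.
  21. The construction method according to any one of claims 15 to 20, characterized in that the 3'-end A-tailing is carried out with rTaq enzyme or with Klenow polymerase lacking 3'-5' exonuclease activity.
  22. The construction method according to any one of claims 15 to 21, characterized in that the base of the sticky end is selected from a U base or a T base;
    the endonuclease is selected from USER enzyme, DNase or RNase;
    the endonuclease recognition site is selected from a U base, deoxyribonucleic acid or a ribonucleotide;
    the number of nicks is one or more.
  23. The construction method according to any one of claims 15 to 22, characterized in that adapter element 1 has the nucleotide sequence shown in SEQ ID NO: 1;
    adapter element 2 has the nucleotide sequence shown in SEQ ID NO: 2.
  24. The construction method according to any one of claims 15 to 23, characterized in that the extension uses a DNA polymerase with 5'-3' exonuclease activity or 5'-3' strand-displacement activity.
  25. The construction method according to claim 24, characterized in that the DNA polymerase is selected from T4 DNA polymerase, phi29 DNA polymerase or Bst DNA polymerase.
  26. The construction method according to any one of claims 15 to 25, characterized in that the cytosines in the dNTPs used for the extension are either all methylated or all unmethylated.
  27. The construction method according to claim 26, characterized in that the cytosines in the nascent strand are all methylated, and step 6) comprises: subjecting the dumbbell-shaped double-stranded DNA to bisulfite treatment to obtain the sequencing library.
  28. The construction method according to claim 26, characterized in that the cytosines in the nascent strand are all methylated, and step 6) comprises: subjecting the dumbbell-shaped double-stranded DNA to conversion treatment to obtain the sequencing library;
    the reagents used for the conversion treatment comprise: an auxiliary reagent and pyridine borane, or bisulfite.
  29. The construction method according to claim 28, characterized in that the auxiliary reagent is one of the following three: TET enzyme; potassium perruthenate; beta-glucosyltransferase and TET enzyme;
    the conversion treatment comprises: treating the dumbbell-shaped double-stranded DNA with the auxiliary reagent and then with pyridine borane, or treating the dumbbell-shaped double-stranded DNA with bisulfite.
  30. The construction method according to any one of claims 15 to 29, characterized by further comprising: preparing the sequencing library into DNA nanoballs.
  31. The construction method according to claim 30, characterized in that the DNA nanoballs are prepared by rolling circle amplification of the sequencing library with a primer sequence.
  32. The construction method according to claim 31, characterized in that the primer sequence has the nucleotide sequence shown in SEQ ID NO: 3 or a nucleotide sequence with at least 80% homology thereto.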
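  33. A sequencing library, characterized in that the sequencing library is obtained by the method for constructing a sequencing library according to any one of claims 15 to 32.
  34. Use of the sequencing library according to claim 33 in sequencing.
  35. The use according to claim 34, characterized in that the sequencing comprises whole-genome DNA sequencing together with at least one of whole-genome DNA methylation sequencing and hydroxymethylation sequencing.
  36. A method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing, characterized by comprising:
    sequencing the sequencing library according to claim 33 to obtain sequencing information, the sequencing information comprising nascent-strand information and template-strand information, the nascent-strand information being the whole-genome DNA information;
    comparing the template-strand information with the nascent-strand information to obtain the whole-genome DNA methylation and/or hydroxymethylation information of the template strand.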
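  37. The method according to claim 36, characterized in that the comparative analysis comprises:
    a) when the cytosines in the nascent strand are all methylated and the dumbbell-shaped double-stranded DNA has been treated with bisulfite, in the sequencing result,
    a thymine at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand indicates that the position is not methylated;
    a cytosine at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand indicates that the position is methylated;
    b) when the cytosines in the nascent strand are all unmethylated and the dumbbell-shaped double-stranded DNA has been converted with TET enzyme and pyridine borane, in the sequencing result,
    a thymine at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand indicates that the position is methylated;
    a cytosine at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand indicates that the position is not methylated;
    c) when the cytosines in the nascent strand are all unmethylated and the dumbbell-shaped double-stranded DNA has been converted with potassium perruthenate and pyridine borane, in the sequencing result,
    a thymine at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand indicates that the position is hydroxymethylated;
    a cytosine at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand indicates that the position is not hydroxymethylated;
    d) when the cytosines in the nascent strand are all unmethylated and the dumbbell-shaped double-stranded DNA has been converted with beta-glucosyltransferase, TET enzyme and pyridine borane, in the sequencing result,
    a thymine at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand indicates that the position is methylated;
    a cytosine at the position on the strand complementary to the template strand that corresponds to a guanine in the nascent strand indicates that the position is not methylated.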

Priority Applications (2)

Application Number Priority Date Filing Date Title
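PCT/CN2022/074093 WO2023141829A1 (zh) 2022-01-26 2022-01-26 Method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing
CN202280052323.8A CN118076734A (zh) 2022-01-26 2022-01-26 Method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing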

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/074093 WO2023141829A1 (zh) 2022-01-26 2022-01-26 Method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing

Publications (1)

Publication Number Publication Date
WO2023141829A1 true WO2023141829A1 (zh) 2023-08-03

Family

ID=87470160

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/074093 WO2023141829A1 (zh) 2022-01-26 2022-01-26 Method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing

Country Status (2)

Country Link
CN (1) CN118076734A (zh)
WO (1) WO2023141829A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
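WO2012019320A1 (zh) * 2010-08-11 2012-02-16 中国科学院心理研究所 High-throughput sequencing method for methylated DNA and use thereof
WO2016058134A1 (zh) * 2014-10-14 2016-04-21 深圳华大基因科技有限公司 Adapter element and method for constructing a sequencing library using same
WO2016082130A1 (zh) * 2014-11-26 2016-06-02 深圳华大基因研究院 Method and reagents for constructing a double-adapter single-stranded circular nucleic acid library
CN107586835A (zh) * 2017-10-19 2018-01-16 东南大学 Method for constructing a next-generation sequencing library based on single-stranded adapters, and use thereof
CN113337501A (zh) * 2021-08-06 2021-09-03 北京橡鑫生物科技有限公司 Hairpin adapter and use thereof in dual-index library construction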


Also Published As

Publication number Publication date
CN118076734A (zh) 2024-05-24

Similar Documents

Publication Publication Date Title
US11697843B2 (en) Methods for creating directional bisulfite-converted nucleic acid libraries for next generation sequencing
JP6571895B1 (ja) Nucleic acid probe and method for detecting genomic fragments
US9745614B2 (en) Reduced representation bisulfite sequencing with diversity adaptors
JP3421664B2 (ja) Method for identifying nucleotide bases
JP4663118B2 (ja) Method for screening nucleic acids for nucleotide variations
WO2020056381A9 (en) PROGRAMMABLE RNA-TEMPLATED SEQUENCING BY LIGATION (rSBL)
WO2016037361A1 (zh) Kit and use thereof in nucleic acid sequencing
US20090047680A1 (en) Methods and compositions for high-throughput bisulphite dna-sequencing and utilities
CA3213538A1 (en) Method for identification and enumeration of nucleic acid sequence, expression, copy, or dna methylation changes, using combined nuclease, ligase, polymerase, and sequencing reactions
EP3098324A1 (en) Compositions and methods for preparing sequencing libraries
EP2451973A1 (en) Method for differentiation of polynucleotide strands
EP2844766B1 (en) Targeted dna enrichment and sequencing
CN114438184B (zh) Method for constructing a cell-free DNA methylation sequencing library, and use thereof
WO2007083766A1 (ja) Method for detecting nucleic acid sequences using intramolecular probes
JP2007125014A (ja) Control material for gene methylation testing
WO2023141829A1 (zh) Method for simultaneous whole-genome DNA sequencing and whole-genome DNA methylation and/or hydroxymethylation sequencing
EP4060052A1 (en) Methods for accurate parallel quantification of nucleic acids in dilute or non-purified samples
CN113544282B (zh) Method for constructing a sequencing library from a DNA sample, and use thereof
CN115874291A (zh) Method for labeling and simultaneously detecting DNA and RNA molecules in a sample
JP2023519979A (ja) Method for detecting structural rearrangements in a genome
EP4332238A1 (en) Methods for accurate parallel detection and quantification of nucleic acids
TW202411431A (zh) Highly sensitive method for accurate parallel quantification of variant nucleic acids
CN117625739A (zh) Sequencing adapter composition, library construction method and sequencing method for simultaneous genome and methylome sequencing
So Universal Sequence Tag Array (U-STAR) platform: strategies towards the development of a universal platform for the absolute quantification of gene expression on a global scale

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22922679

Country of ref document: EP

Kind code of ref document: A1