WO2020232635A1 - Method, system and application for constructing a sequencing library based on a target region of methylated DNA
- Publication number
- WO2020232635A1 (application PCT/CN2019/087824)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sequence
- universal
- primer
- sequencing
- dna sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1093—General methods of preparing gene libraries, not provided for in other subgroups
Definitions
- the invention relates to the field of gene sequencing, in particular to a method, system and application for constructing a sequencing library based on a target region of methylated DNA.
- DNA methylation is an epigenetic regulatory modification that participates in the regulation of protein synthesis without changing the base sequence.
- DNA methylation is a remarkable chemical modification. Care from relatives, aging, smoking, alcoholism and even obesity are faithfully recorded on the genome by methylation. The genome is like a diary, and methylation is the text that records the experience of the human body.
- DNA methylation is an important epigenetic mark. Obtaining the methylation level of all C sites in the whole genome is of great significance for studying the spatio-temporal specificity of epigenetics.
- mapping the DNA methylation level of the whole genome and analyzing high-precision methylation patterns of specific species will be a milestone in epigenomics research.
- Whole-genome methylation sequencing: Whole Genome Bisulfite Sequencing (WGBS)
- Bisulfite treatment renders DNA single-stranded and causes severe damage to it.
- After bisulfite treatment, unmethylated C bases are converted to U bases, so the GC content of the entire genome changes drastically, which introduces strong bias in subsequent amplification.
- library construction requires microgram-level starting DNA, and effective library construction methods for trace DNA are lacking.
- whole-genome methylation sequencing is complicated and expensive. Targeted methylation sequencing technology can effectively solve these problems.
- Targeted methylation sequencing technologies can be divided into probe capture and multiplex PCR-based sequencing.
- Probe capture requires a high starting amount, so trace samples such as plasma cell-free DNA are difficult to capture. Moreover, probe design and the capture workflow are complicated, the detection cycle is long, and the cost is high. Multiplex PCR based on bisulfite-treated DNA has a low starting-amount requirement, simple operation and high sensitivity, but the technology needs further improvement.
- an object of the present invention is to provide a method, system and application for constructing a sequencing library based on a target region of methylated DNA.
- the method provided by the present invention builds a library for the target region of the methylated DNA sample, and during the library building process, only one strand of the methylated DNA sample is amplified and the library is built.
- specific primers and universal primers are used for amplification to obtain the target product, which effectively solves the problem of primer dimers.
- multiple specific primers are used to amplify the target region of the same methylated DNA template to ensure the specificity of amplification.
- the inventors of the present invention noticed during their research that multiplex PCR based on bisulfite-treated DNA is simple to operate and highly sensitive, but technically demanding. It has previously been reported that single-molecule BS-PCR using droplet technology can detect about nine thousand targets at the same time, but the starting amount is relatively high, requiring 2 μg of DNA. In 2015, Lu Wen and colleagues used the characteristic sequences of CpG islands as primer binding sites to develop MCTA-seq, a PCR-based technology that can simultaneously detect the methylation signals of a large number of CpG island (CGI) regions. This technology is extremely sensitive and can detect 7.5 pg of gDNA, but MCTA-seq is more like a fixed CGI panel and, as a targeted sequencing platform, is less flexible. Therefore, a targeted methylation technology with a low starting-amount requirement and high flexibility is the future direction of targeted methylation sequencing.
- The difficulty of bisulfite multiplex PCR is mainly due to the formation of severe primer dimers during the PCR process.
- after bisulfite treatment, unmethylated cytosine in the DNA is converted to uracil, and most of the cytosine in the genome is unmethylated, so most of the sequence changes from the original four bases A/T/C/G to the three bases A/T/G.
- one primer is designed for the positive strand and the other for the complementary strand. Therefore, one of the primers used for PCR is an ATG-rich sequence, and the other is an ATC-rich sequence.
- These "naturally complementary" primer sequences can easily form primer dimers. As the number of primer pairs increases, primer dimer formation also increases sharply. In multiplex PCR, too many primers are consumed by primer dimer formation, causing the multiplex PCR to fail. Therefore, to make bisulfite multiplex PCR work, the problem that primers easily form primer dimers must be solved.
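The base-composition argument above can be checked with a small in-silico sketch. The helper names and the toy template are illustrative assumptions, not taken from the patent. Conversion is modeled as C to T (uracil is read as thymine after amplification) and is assumed to be complete.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def bisulfite_convert(seq, methylated=frozenset()):
    """Model bisulfite conversion of one strand: unmethylated C becomes T.

    `methylated` holds 0-based positions of methylated cytosines, which are kept.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Hypothetical toy template, treated as fully unmethylated.
template = "ACGTTCGACCTGA"
converted = bisulfite_convert(template)

# Conventional primer pair on the converted strand: the forward primer has the
# same sequence as the strand, the reverse primer anneals to the strand.
forward_primer = converted[:6]
reverse_primer = reverse_complement(converted[-6:])

print(converted)
print("forward uses only A/T/G:", set(forward_primer) <= set("ATG"))
print("reverse uses only A/T/C:", set(reverse_primer) <= set("ATC"))
```

In this toy case the forward primer contains no C and the reverse primer contains no G, which is the ATG-rich versus ATC-rich situation described in the text.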
- the present invention provides the following technical solutions:
- the present invention provides a method for constructing a sequencing library based on a target region of methylated DNA, including: (1) based on the methylated DNA sample, obtaining a DNA sample that has a universal sequence connected to at least one end and has been treated with bisulfite, so as to obtain a transformed DNA sample with the universal sequence; (2) using a first specific primer and a first universal primer to perform a first amplification on the transformed DNA sample with the universal sequence to obtain a first amplification product; wherein the first specific primer is located upstream of the target region, and the first universal primer matches or overlaps at least part of the universal sequence; the universal sequence is located downstream of the target region; (3) using a second specific primer, a second universal primer and a tag primer to perform a second amplification on the first amplification product to obtain a second amplification product, so as to obtain a sequencing library; wherein the second specific primer is located downstream of the first specific primer and upstream of the target region, and the second universal primer and at least a partial sequence of the second
- the method for constructing a sequencing library based on a methylated DNA target region designs specific primers for one strand of a methylated DNA template, so as to enrich the target region and build a library.
- the first specific primer can match one strand of the DNA sample, and the first universal primer can match the universal sequence, thereby achieving specific amplification.
- each first specific primer designed is a sequence rich in bases A, T and G, or in bases A, T and C, so dimers do not form between the specific primers.
- the first universal primer contains four bases A, T, C, and G, and will not form a primer dimer with the first specific primer, so the formation of primer dimer can be completely avoided.
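A rough way to see why same-alphabet primers pair poorly is to look for the longest stretch over which two primers could anneal antiparallel. The sketch below is a simplification under stated assumptions: it only counts contiguous Watson-Crick pairs (longest common substring between one primer and the reverse complement of the other), ignores thermodynamics, and uses made-up primer sequences.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def longest_duplex_run(primer_a, primer_b):
    """Longest stretch of primer_a that can pair, antiparallel, with a
    contiguous stretch of primer_b.

    Computed as the longest common substring of primer_a and the reverse
    complement of primer_b (dynamic programming, O(len_a * len_b)).
    """
    target = reverse_complement(primer_b)
    best_len, best_end = 0, 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(primer_a) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if primer_a[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return primer_a[best_end - best_len:best_end]


# Hypothetical primers: two A/T/G-only primers and one A/T/C-only primer.
atg_primer_1 = "ATGGTTGAGG"
atg_primer_2 = "GTTTAGGATG"
atc_primer = "CCTCAACCAT"

same_alphabet = longest_duplex_run(atg_primer_1, atg_primer_2)
mixed_alphabet = longest_duplex_run(atg_primer_1, atc_primer)

# Two A/T/G-only primers can never form a G:C pair, because neither contains C.
print(len(same_alphabet), set(same_alphabet) <= set("AT"))
print(len(mixed_alphabet))
```

Here the two A/T/G-only primers share only a short A:T run, while the A/T/G primer and the A/T/C primer are complementary along their full length.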
- a second specific primer is designed either downstream of the first specific primer and upstream of the target region, or downstream of the target region; the second specific primer, the second universal primer and the tag primer are used to perform the second amplification on the first amplification product to obtain the second amplification product, and thereby the required sequencing library.
- the above-mentioned method for constructing a sequencing library based on a target region of methylated DNA may further include the following technical features:
- the 5' end of the second specific primer overlaps at least part of the sequence of the 3' end of the second universal primer, and the 3' end of the tag primer overlaps a partial sequence of the 5' end of the first universal primer.
- the 5' end sequence of the second specific primer can overlap with at least part of the 3' end sequence of the second universal primer, and its 3' end sequence can match the template region on the DNA template downstream of the first specific primer and upstream of the target region. Therefore, specific amplification of the target region can be achieved based on the first amplification product.
- the 5' end of the second specific primer overlaps with at least part of the sequence of the 3' end of the tag primer, and the 3' end of the second universal primer overlaps a partial sequence of the 5' end of the first specific primer.
- the 5' end sequence of the second specific primer overlaps at least part of the 3' end sequence of the tag primer, and its 3' end sequence can match the template region located downstream of the target region on the DNA template, thereby achieving specific amplification of the target region.
- the tag primers contain tag sequences. These tag sequences may be those commonly used by sequencing platforms to distinguish different samples, so as to facilitate simultaneous sequencing of multiple pooled samples. According to embodiments, the length of the tag sequence can be 8-12 bp, for example 10 bp or 8 bp.
- step (1) further comprises: (1-a) treating the methylated DNA sample with bisulfite to obtain a transformed DNA sample; (1-b) using DNA polymerase and a random primer carrying a first sequencing sequence to replicate the transformed DNA sample, so as to obtain the transformed DNA sample with a universal sequence; the 3' end of the random primer is a random base sequence, and the 5' end of the random primer is a universal sequence.
- the random base sequence is 6-12 bases long, and each random base is A, T, C or G.
- the random base sequence is 6-12 bases long, and each random base is A, T or C.
- the universal sequence is a sequencing linker sequence or a fixed sequence.
- the cytosine in the sequencing linker sequence or the fixed sequence is a methylated modified cytosine.
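The random-primer design described above (a universal sequence at the 5' end, followed by 6-12 random bases at the 3' end) can be sketched as follows. This is a minimal illustration: the function name, the universal sequence and the default tail length are hypothetical, and the IUPAC codes N (A/T/C/G) and H (A/T/C) stand for the two random-base alphabets mentioned in the text.

```python
import random

# IUPAC degenerate codes for the two alphabets mentioned in the text.
DEGENERATE = {"N": "ATCG", "H": "ATC"}


def make_random_primer(universal_5prime, tail_code="H", tail_length=8, rng=None):
    """Build one random primer: universal sequence (5') + random tail (3').

    tail_code   -- "N" for A/T/C/G or "H" for A/T/C
    tail_length -- number of random bases, restricted to 6-12 as in the text
    rng         -- optional random.Random instance for reproducibility
    """
    if tail_code not in DEGENERATE:
        raise ValueError("tail_code must be 'N' or 'H'")
    if not 6 <= tail_length <= 12:
        raise ValueError("tail_length must be between 6 and 12")
    rng = rng or random.Random()
    tail = "".join(rng.choice(DEGENERATE[tail_code]) for _ in range(tail_length))
    return universal_5prime + tail


# Hypothetical universal sequence; a real design would use a platform adapter.
UNIVERSAL = "GACTGACTGACT"
primer = make_random_primer(UNIVERSAL, tail_code="H", tail_length=8,
                            rng=random.Random(0))
print(primer)
```

In practice such primers are synthesized as a degenerate pool rather than generated one at a time; the sketch only shows the layout of a single member of that pool.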
- step (1) further includes: (1-1) performing end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample; (1-2) connecting at least one end of the repaired DNA sample with a universal sequence to obtain a DNA sample with a universal sequence; (1-3) using bisulfite to process the DNA sample with the universal sequence to obtain the transformed DNA sample with the universal sequence.
- the universal sequence is selected from at least one of the following: sequencing adapter sequence or modified sequencing adapter sequence.
- the modified sequencing linker sequence is one in which the cytosines of one strand are methylated, the cytosines of the other strand are not methylated, and the 3' end base of one strand carries a non-hydroxyl modification.
- the random sequence is a molecular tag sequence.
- the number of original DNA templates can be counted through a large number of different molecular tag sequences; through subsequent statistics of the molecular tag sequences, the original templates can be traced and errors generated during sequencing or PCR can be corrected, enabling precise detection and quantitative study of DNA templates.
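Counting templates and correcting errors with molecular tags can be illustrated with a small sketch. It assumes reads have already been reduced to (molecular tag, sequence) pairs of equal length within each tag group, and it uses a simple per-position majority vote as the error-correction rule; the function name and the toy reads are hypothetical, and the patent does not specify this particular procedure.

```python
from collections import Counter, defaultdict


def count_original_templates(reads):
    """Group reads by molecular tag and build a consensus per tag.

    reads -- iterable of (molecular_tag, sequence) pairs
    Returns (number_of_distinct_tags, {tag: consensus_sequence}).
    """
    groups = defaultdict(list)
    for tag, sequence in reads:
        groups[tag].append(sequence)

    consensus = {}
    for tag, sequences in groups.items():
        # Majority vote per column; assumes equal-length reads within a group.
        consensus[tag] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*sequences)
        )
    return len(groups), consensus


# Hypothetical reads: three copies of one template (one with a PCR or
# sequencing error at position 3) and one copy of a second template.
toy_reads = [
    ("AACGT", "ATGTTG"),
    ("AACGT", "ATGTTG"),
    ("AACGT", "ATGCTG"),
    ("GGTCA", "ATGTTA"),
]
n_templates, consensus_by_tag = count_original_templates(toy_reads)
print(n_templates, consensus_by_tag)
```

Four reads collapse to two original templates, and the single discordant base in the first group is outvoted by the other two reads.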
- step (1) further includes: (i) using a transposase to fragment and transpose the DNA sample, so as to obtain a DNA sample with a universal sequence, the universal sequence being embedded in the transposase; (ii) using bisulfite to process the DNA sample with the universal sequence to obtain the transformed DNA sample with the universal sequence.
- the universal sequence is a transposase effector sequence or a transposase effector sequence with a sequencing linker, preferably a transposase effector sequence, and the transposase can be Tn5, MuA or another transposase with similar function, preferably Tn5 transposase.
- the cytosine in the transposase effector sequence is methylated modified cytosine.
- the conversion of unmethylated cytosine to uracil is not 100% efficient: a given cytosine may or may not be converted, which adds uncertainty to the subsequent amplification with universal primers.
- methylated cytosine is not converted to uracil during the subsequent bisulfite treatment, so the sequence information remains unchanged. Therefore, for more accurate sequencing, the cytosines in the transposase effector sequence can be methylated. Of course, the cytosines may also be left without methylation modification.
- the methylated DNA sample is genomic DNA, fragmented genomic DNA, or free DNA.
- the present invention provides a system for constructing a sequencing library based on a target region of methylated DNA, comprising: a universal transformation module, which, based on the methylated DNA sample, produces a DNA sample that has a universal sequence connected to at least one end and has been treated with bisulfite, so as to obtain a transformed DNA sample with a universal sequence; a first amplification module, which is connected to the universal transformation module and uses a first specific primer and a first universal primer to perform a first amplification on the transformed DNA sample with the universal sequence, so as to obtain a first amplification product; wherein the first specific primer is located upstream of the target region, and the first universal primer matches or overlaps at least partially with the universal sequence; and a second amplification module, which is connected to the first amplification module and uses a second specific primer, a second universal primer and a
- the second specific primer is located downstream of the first specific primer and upstream of the target region, and the second universal primer overlaps at least a partial sequence of the second specific primer
- the tag primer contains a tag sequence
- the tag primer overlaps with the partial sequence of the first universal primer
- the second specific primer is located downstream of the target region, and at least a partial sequence of the second universal primer overlaps with the first specific primer
- the tag primer contains a tag sequence
- the tag primer overlaps with a partial sequence of the second specific primer.
- the aforementioned system for constructing a sequencing library based on a target region of methylated DNA may further include the following technical features:
- the 5' end of the second specific primer in the second amplification module overlaps at least part of the sequence of the 3' end of the second universal primer, and the 3' end of the tag primer overlaps a partial sequence of the 5' end of the first universal primer.
- the 5' end of the second specific primer in the second amplification module overlaps at least part of the sequence of the 3' end of the tag primer, and the 3' end of the second universal primer overlaps a partial sequence of the 5' end of the first specific primer.
- the length of the tag sequence is 8-12 bp.
- the universal transformation module further includes: a transformation unit that uses bisulfite to process the methylated DNA sample to obtain a transformed DNA sample;
- an amplification unit, which is connected to the transformation unit and uses DNA polymerase and the first sequencing primer to replicate the transformed DNA sample, so as to obtain the transformed DNA sample with a universal sequence.
- the 3'end of the first sequencing primer is a random base
- the 5'end of the first sequencing primer is a universal sequence.
- the random base sequence is 6-12 bases long, and each random base is A, T, C or G.
- the random base sequence in the above system is 6-12 bases long, and each random base is A, T or C.
- the universal sequence is a sequencing linker sequence or a fixed sequence.
- the cytosine in the sequencing linker sequence or the fixed sequence is a methylated modified cytosine.
- the universal transformation module further includes: a repair unit for performing end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample; a connection unit, which is connected to the repair unit and is used to connect at least one end of the repaired DNA sample with a universal sequence, so as to obtain a DNA sample with a universal sequence; and a transformation unit, which is connected to the connection unit and uses bisulfite to process the DNA sample with the universal sequence, so as to obtain the transformed DNA sample with the universal sequence.
- the universal sequence is selected from at least one of the following: a sequencing adapter sequence or a modified sequencing adapter sequence.
- the modified sequencing linker sequence is a sequencing adapter sequence in which the cytosines of one strand are methylated, the cytosines of the other strand are not methylated, and the 3' end base of one strand carries a non-hydroxyl modification; a sequencing adapter sequence with a fixed sequence and a random sequence; or a sequencing adapter sequence with a fixed sequence and a random sequence in which the 3' end base of one strand carries a non-hydroxyl modification.
- the random sequence is a molecular tag sequence.
- the number of original DNA templates can be counted through a large number of different molecular tag sequences; through subsequent statistics of the molecular tag sequences, the original templates can be traced and errors generated during sequencing or PCR can be corrected, enabling precise detection and quantitative study of DNA templates.
- the universal transformation module further includes: a transposition unit, which uses a transposase to transpose the DNA sample so as to obtain a DNA sample with a universal sequence, the universal sequence being embedded in the transposase; and a transformation unit, which is connected to the transposition unit and uses bisulfite to process the DNA sample with the universal sequence, so as to obtain the transformed DNA sample with the universal sequence.
- the universal sequence is a transposase effector sequence or a transposase effector sequence with a sequencing linker, preferably a transposase effector sequence.
- the cytosine in the transposase effector sequence is a methylated modified cytosine.
- the methylated DNA sample is genomic DNA, fragmented genomic DNA or free DNA.
- the present invention provides a method for sequencing a methylated DNA sample, including:
- a sequencing library is constructed according to the method described in any embodiment of the first aspect of the present invention or using the system described in any embodiment of the second aspect of the present invention; high-throughput sequencing is performed on the sequencing library to obtain sequencing results.
- a sequencing platform is used to perform high-throughput sequencing on the sequencing library, and the sequencing platform is selected from at least one of MGISEQ, Illumina, and Proton.
- the present invention provides a method for determining the methylation status of a methylated DNA sample, including:
- a sequencing library is constructed according to the method described in any embodiment of the first aspect of the present invention or using the system described in any embodiment of the second aspect of the present invention; high-throughput sequencing is performed on the sequencing library to obtain a sequencing result; the sequencing result is compared with a reference genome to determine the methylation status of the methylated DNA sample.
- the reference genome is the human genome hg19 or Yanhuang genome.
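Comparing reads with the reference to call methylation can be sketched as follows. This is a toy model under explicit assumptions: reads are already aligned to the reference without gaps, have the same length as the reference segment, and come from the same strand; at each reference cytosine a read C is counted as methylated and a read T as unmethylated. The function name and sequences are hypothetical, and real pipelines use dedicated bisulfite aligners rather than this direct comparison.

```python
def methylation_levels(reference, aligned_reads):
    """Per-cytosine methylation level from gapless, same-strand alignments.

    reference     -- unconverted reference segment
    aligned_reads -- bisulfite reads, each the same length as `reference`
    Returns {position: fraction of informative reads showing C}.
    """
    levels = {}
    for position, ref_base in enumerate(reference):
        if ref_base != "C":
            continue
        # Only C (methylated, protected) and T (unmethylated, converted)
        # are informative; other bases are treated as errors and skipped.
        calls = [read[position] for read in aligned_reads if read[position] in "CT"]
        if calls:
            levels[position] = calls.count("C") / len(calls)
    return levels


# Hypothetical reference segment with cytosines at positions 1 and 4.
toy_reference = "ACGTCGA"
toy_reads = ["ACGTTGA", "ATGTTGA", "ACGTTGA", "ACGTCGA"]
levels = methylation_levels(toy_reference, toy_reads)
print(levels)
```

In the toy data, three of four reads retain C at position 1 and one of four retains C at position 4, giving methylation levels of 0.75 and 0.25.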
- the present invention provides a kit comprising: a universal sequence, a tag primer, a first universal primer, a second universal primer and a conventional methylation detection reagent; wherein the tag primer contains a tag sequence, the first universal primer matches or overlaps at least part of the universal sequence, the first universal primer is SEQ ID NO: 1, and the second universal primer is SEQ ID NO: 22.
- the conventional methylation detection reagent can be, for example, a bisulfite detection reagent or a corresponding kit.
- the kit described above further includes the following additional technical features:
- the tag primer is shown in SEQ ID NO: 23.
- the kit further includes: a first specific primer and a second specific primer, the first specific primer includes the sequences shown in SEQ ID NO: 1 to SEQ ID NO: 10, and the second specific primer includes the sequences shown in SEQ ID NO: 11 to SEQ ID NO: 20.
- the kit utilizes the method described in the first aspect of the present invention to construct a sequencing library based on methylated DNA target regions.
- Fig. 1 is a flowchart of random primer library construction according to an embodiment of the present invention.
- Fig. 2 is a flow chart of adapter ligation library construction according to an embodiment of the present invention.
- Fig. 3 is a flow chart of transposon library construction according to an embodiment of the present invention.
- Figure 4 is a schematic diagram of different linker sequences provided according to an embodiment of the present invention.
- Figure 5 is a quality inspection diagram of a sequencing library provided according to an embodiment of the present invention.
- Fig. 6 is a result diagram of the sequencing depth of each amplicon provided according to an embodiment of the present invention.
- Fig. 7 is a quality inspection diagram of a sequencing library provided according to an embodiment of the present invention.
- Fig. 8 is a result diagram of the sequencing depth of each amplicon provided according to an embodiment of the present invention.
- Fig. 9 is a schematic structural diagram of a system for constructing a sequencing library based on a target region of methylated DNA according to an embodiment of the present invention.
- Fig. 10 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.
- Fig. 11 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.
- Fig. 12 is a schematic structural diagram of a universal conversion module according to an embodiment of the present invention.
- upstream and downstream are defined with respect to the 5'-3' direction of the nucleotide sequence. When two or more nucleic acid sequences are compared, the recognition or matching region of the nucleic acid sequence located upstream is closer to the 5' end of the template sequence than that of the nucleic acid sequence located downstream.
- because the lengths of different nucleic acid sequences may differ, the lengths of the regions they recognize or match may also differ.
- when nucleic acid sequence A is located downstream of nucleic acid sequence B, it is only required that the 3' end recognition or binding site of A is closer to the 3' end of the template sequence than the recognition or binding site of B.
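The positional definitions above can be turned into a small coordinate check. The sketch assumes binding sites are given as 0-based, half-open intervals counted 5' to 3' along the template; the class, function names and coordinates are hypothetical. Following the definition, only the 3' boundary of a binding site decides whether it is downstream of another, so nested primers may partly overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Binding site on the template, 0-based half-open, counted 5'->3'."""
    start: int
    end: int


def is_downstream(site_a, site_b):
    """True if site_a's 3' boundary lies closer to the template's 3' end."""
    return site_a.end > site_b.end


def is_nested_layout(first_specific, second_specific, target):
    """Check the nested design: second primer downstream of the first primer
    and still entirely upstream of the target region."""
    return (
        is_downstream(second_specific, first_specific)
        and second_specific.end <= target.start
    )


# Hypothetical coordinates on a converted template.
first = Site(10, 30)
second = Site(20, 42)      # overlaps the first primer but ends further 3'
target = Site(45, 120)
print(is_nested_layout(first, second, target))
```

A second primer that ran into the target region, for example `Site(20, 50)` here, would fail the check.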
- when two nucleic acid sequences are said to "match", complementary pairing occurs between the bases of the two sequences. When two nucleic acid sequences are said to at least partially overlap, the two sequences share at least a portion of identical sequence.
- bisulfite refers to a reagent or process that deaminates cytosine in DNA into uracil. Therefore, treatment described as bisulfite treatment, sulfite treatment, or hydrogen sulfite treatment is included in the protection scope of the present invention.
- the present invention provides a unidirectional primer amplification method: primers are designed for only one strand of the DNA template, and the designed specific primers contain only A, T, G or only A, T, C, so primer dimers are difficult to form between them.
- specific primers are designed on the products of the first round of amplification for amplification to further ensure the specificity of amplification.
- the sequencing library thus prepared meets the requirements of sequencing.
- genomic DNA is transposed by a Tn5 transposon, or fragmented gDNA or cell-free DNA (cfDNA) molecules are ligated to a linker, or the DNA is copied with random primers.
- a universal sequence is introduced on the original DNA.
- the DNA is subjected to bisulfite treatment (BS treatment) to obtain the bisulfite-converted DNA sequence, in which the unmethylated cytosines (C) of the original DNA are converted to uracil (U).
- universal primers are designed based on the introduced universal sequence, and specific primers are designed upstream of the target region of the transformed DNA sequence. Specific primers are designed for only one strand of the DNA template, and PCR amplification is performed with the universal primers and specific primers to obtain the PCR product.
- nested primers are designed downstream of the above specific primers, or specific primers are designed downstream of the target region; in either case they are designed for only one strand of the DNA template.
- the PCR product of the first step is amplified in a second step with the nested primers or downstream specific primers together with universal primers, finally yielding a PCR amplification product (BS-PCR) derived from the bisulfite-treated template.
- the present invention provides a method for constructing a sequencing library based on a target region of methylated DNA, comprising: (1) based on the methylated DNA sample, obtaining a DNA sample that has a universal sequence connected to at least one end and has been treated with bisulfite, so as to obtain a transformed DNA sample with a universal sequence; (2) using the first specific primer and the first universal primer to perform a first amplification on the transformed DNA sample with the universal sequence to obtain a first amplification product; wherein the first specific primer is located upstream of the target region, and the first universal primer overlaps or matches at least partially with the universal sequence; the universal sequence is located downstream of the target region; (3) using the second specific primer, the second universal primer and the tag primer to perform a second amplification on the first amplification product to obtain a second amplification product, so as to obtain a sequencing library; wherein the second specific primer is located downstream of the first specific primer and upstream of the target region, and the second universal primer and the second specific primer are at least
- the universal sequence is introduced by the following method: 1. For gDNA, fragmented gDNA or cfDNA, the DNA molecules are first treated with bisulfite. The template is then replicated using DNA polymerase and the first sequencing primer, i.e. a primer whose 3' end carries 6-12 random N bases (degenerate bases composed of A/T/C/G) or 6-12 random H bases (degenerate bases composed of A/T/C) and whose 5' end carries a partial or complete sequencing linker sequence or a fixed sequence (in which the cytosines are preferably methylated cytosines). This yields a bisulfite-treated DNA template with a universal sequence at the 5' end (shown in Figure 1).
- the available sequencing adapter sequences include, but are not limited to, the sequencing adapters of the MGI platform and the sequencing adapter sequences of the Illumina and Proton platforms.
- the available DNA polymerase can be conventional rTaq, Fusion, or Bst or phi29.
- the universal sequence is introduced by the following method:
- the fragmented gDNA or cfDNA is end-repaired and A-tailed, and then a specific linker sequence is ligated.
- the sequence can be a partial or complete sequencing linker sequence or a modified sequencing linker sequence.
- These modified sequencing linker sequences can be a sequencing adapter sequence in which the 3' end base of one strand is modified with a non-hydroxyl group, a sequencing adapter sequence with a fixed sequence, or a sequencing adapter sequence with a fixed sequence in which the 3' end base of one strand is modified with a non-hydroxyl group, such as number 1, number 2, number 3 and number 4 shown in FIG.
- the product carrying the universal sequence is then treated with bisulfite to obtain the transformed DNA template (Figure 2).
- the universal sequence is introduced by the following method:
- Tn5 transposase embeds a linker sequence.
- the linker can be the 19 bp specific recognition sequence of Tn5 transposase itself, or a combination of this sequence with other sequences (such as a sequencing linker sequence), preferably the 19 bp specific sequence; the cytosines in the 19 bp specific sequence are preferably methylated cytosines, and the gDNA is fragmented by Tn5 transposition while the specific linker is added.
- the product carrying the specific linker is treated with bisulfite to obtain the transformed DNA template (as shown in Figure 3).
- PCR amplification is then performed with unidirectional specific primers to obtain a sequencing library; the amplification can follow any of the following methods:
- the sequencing library is obtained by PCR amplification by the following method:
- The first specific primer and the first universal primer are used for the first-step PCR amplification of the bisulfite-treated DNA.
- The 3' end of the first universal primer is partially or completely complementary to, or overlaps with, the introduced universal sequence.
- The 5' end of the first universal primer is part or all of the sequencing linker sequence (preferably part).
- The binding site of the first specific primer lies upstream of the target region to be amplified, and the primer is designed against the DNA template sequence as it reads after bisulfite treatment. The product is purified and then subjected to the second PCR amplification with the second specific primer (also called the nested primer in the following examples), the second universal primer, and the tag primer.
- The second specific primer and the tag primer first undergo PCR, and the subsequent cycles use the second specific primer, the second universal primer, and the tag primer together for multiple rounds of PCR.
- The 5' end of the second specific primer overlaps with part or all of the 3' end of the second universal primer.
- The 3' end of the second specific primer is a specific sequence designed to lie between the first specific primer and the target region. The second universal primer can be part or all of the universal sequencing adapter sequence, and its 3' end is partly or completely identical to the 5' end of the second specific primer. The 3' end of the tag primer is partly or completely identical to the 5' end of the first universal primer, and the tag primer carries a fixed 8-12 bp tag sequence in the middle (the sequence each platform uses to distinguish pooled samples) for subsequent multi-sample pooled sequencing (Figure 1A, Figure 2A, Figure 3A).
- the sequencing library is obtained by performing PCR amplification by the following method.
- The first specific primer (also referred to as the upstream specific primer in the following examples) and the first universal primer are used for the first-step PCR amplification of the bisulfite-treated DNA.
- The 3' end of the first universal primer is partially or fully complementary to, or overlaps with, the introduced universal sequence (here the universal sequence is preferably a fixed sequence other than the sequencing adapter sequence). The specific sequence at the 3' end of the first specific primer is designed upstream of the target region to be amplified, against the DNA template sequence as it reads after bisulfite treatment, and its 5' end is part or all of the sequencing adapter sequence (preferably part).
- the second specific primer (correspondingly, can also be referred to as the downstream specific primer in the following embodiments), the second universal primer, and the tag primer are used for the second step of PCR amplification.
- the second specific primer and the second universal primer are first subjected to PCR amplification, and the second specific primer, the second universal primer and the tag primer are combined to perform multiple rounds of PCR in the subsequent cycles;
- the 5'end of the downstream specific primer overlaps part or all of the sequence at the 3'end of the tag primer.
- The 3' end of the second specific primer is a specific sequence designed downstream of the target region.
- The second universal primer can be part or all of the sequencing adapter sequence, and its 3' end overlaps part or all of the 5' end of the first specific primer. The 3' end of the tag primer is partly or completely identical to the 5' end of the second specific primer, and the tag primer carries a fixed 8-12 bp tag sequence in the middle (the sequence each platform uses to distinguish pooled samples) for subsequent multi-sample pooled sequencing (Figure 1B, Figure 2B, Figure 3B).
- The present invention provides a system for constructing a sequencing library based on a target region of methylated DNA. As shown in FIG. 9, the system includes a universal transformation module, a first amplification module, and a second amplification module, connected in sequence.
- Starting from the methylated DNA sample, the universal transformation module constructs a bisulfite-treated DNA sample that has a universal sequence connected to at least one end, so as to obtain a transformed DNA sample with a universal sequence.
- The first amplification module uses the first specific primer and the first universal primer to perform a first amplification on the transformed DNA sample with the universal sequence, so as to obtain a first amplification product, wherein the first specific primer is located upstream of the target region, and the first universal primer at least partially matches or overlaps the universal sequence.
- The second amplification module uses a second specific primer, a second universal primer, and a tag primer to perform a second amplification on the first amplification product, so as to obtain a second amplification product, which is the sequencing library. The second specific primer, the second universal primer, and the tag primer are as set out in (i) or (ii): (i) the second specific primer is located downstream of the first specific primer and upstream of the target region, the second universal primer overlaps at least part of the sequence of the second specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the first universal primer; (ii) the second specific primer is located downstream of the target region, the second universal primer overlaps at least part of the sequence of the first specific primer, the tag primer contains a tag sequence, and the tag primer overlaps part of the sequence of the second specific primer.
- the universal transformation module includes a transformation unit and an amplification unit connected to the transformation unit.
- The transformation unit treats the methylated DNA sample with bisulfite so as to obtain a transformed DNA sample.
- The amplification unit uses a DNA polymerase and a first sequencing primer to replicate the transformed DNA sample, so as to obtain the transformed DNA sample with a universal sequence. The 3' end of the first sequencing primer consists of random bases, and the 5' end of the first sequencing primer is the universal sequence.
- the universal transformation module includes a repair unit, a connection unit and a transformation unit, and each unit is connected in sequence.
- The repair unit performs end repair and A-tailing on the methylated DNA sample to obtain a repaired DNA sample.
- The connection unit connects a universal sequence to at least one end of the repaired DNA sample, so as to obtain a DNA sample with a universal sequence.
- The transformation unit treats the DNA sample with the universal sequence with bisulfite, so as to obtain the transformed DNA sample with the universal sequence.
- the universal transformation module includes a transposition unit and a transformation unit connected to the transposition unit.
- The transposition unit uses a transposase, in which a universal sequence is embedded, to fragment and transpose the DNA sample, so as to obtain a DNA sample with a universal sequence.
- The transformation unit treats the DNA sample with the universal sequence with bisulfite, so as to obtain the transformed DNA sample with the universal sequence.
- Experimental design: 100 ng of Yanhuang genomic DNA was treated with bisulfite, and a targeted DNA methylation library was then prepared according to the steps of the invention. The library was sequenced on an MGISEQ-2000 sequencer (sequencing type PE100), followed by data analysis covering data utilization, mapping rate, amplicon specificity, uniformity, and other properties.
- CT Conversion Reagent solution: take the CT Conversion Reagent (solid mixture) out of the kit, add 900 μL of water, 50 μL of M-Dissolving Buffer, and 300 μL of M-Dilution Buffer, and dissolve at room temperature by vortexing for 10 minutes or shaking on a shaker for 10 minutes.
- The random primer sequence (that is, the first sequencing primer described herein) is CGCTTGGCCTCCGACTTNNNNNNNN (SEQ ID NO: 24), where each N is a random base drawn from A/T/C/G.
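The structure of this primer (a fixed 5' universal part followed by a random 3' tail) can be checked with a short Python sketch. The helper names are hypothetical; the only inputs taken from the text are the primer string, the N and H base definitions, and the stated 6-12 base range for the random tail.

```python
# Degenerate symbols named in the text: N = A/T/C/G, H = A/T/C.
DEGENERATE = {"N": "ACGT", "H": "ACT"}


def split_random_tail(primer, symbol="N"):
    """Split a primer into its fixed 5' part and the length of its 3' random tail."""
    fixed = primer.rstrip(symbol)
    return fixed, len(primer) - len(fixed)


def tail_variants(tail_length, symbol="N"):
    """Number of distinct sequences a random tail of this length can take."""
    return len(DEGENERATE[symbol]) ** tail_length


PRIMER = "CGCTTGGCCTCCGACTTNNNNNNNN"  # SEQ ID NO: 24 as given in the text
fixed_part, tail_length = split_random_tail(PRIMER)
print(fixed_part, tail_length, tail_variants(tail_length))
```

Swapping `symbol="H"` into `tail_variants` gives the smaller pool obtained when the tail excludes G, which is the alternative the description allows for the first sequencing primer.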
- Samples 1 to 3 are three replicates of the same sample.
- The mapping rate is the proportion of reads that align to the reference genome.
- The specificity is the ratio of reads in the target region to the total number of reads.
- Uniformity is the proportion of target regions whose depth is greater than 0.1 times the average depth of the target regions, relative to the total number of target regions.
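The three metrics defined above can be written out as a minimal Python sketch. The function names and the example numbers are hypothetical, and the sketch assumes that "total reads" is the denominator for both mapping rate and specificity, which the text does not spell out.

```python
def mapping_rate(mapped_reads, total_reads):
    """Fraction of reads that align to the reference genome."""
    return mapped_reads / total_reads


def specificity(on_target_reads, total_reads):
    """Fraction of reads that fall in the target regions (assumed denominator: all reads)."""
    return on_target_reads / total_reads


def uniformity(region_depths, fraction_of_mean=0.1):
    """Fraction of target regions whose depth exceeds 0.1x the mean target depth."""
    mean_depth = sum(region_depths) / len(region_depths)
    passing = sum(1 for depth in region_depths if depth > fraction_of_mean * mean_depth)
    return passing / len(region_depths)


# Hypothetical per-amplicon depths; mean is 80, so the cutoff is a depth of 8.
example_depths = [100, 120, 80, 5, 95]
print(uniformity(example_depths))
```

In this made-up example one of the five amplicons falls below the cutoff, so four of five regions count as uniformly covered.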
- Experimental design: Yanhuang genomic DNA was fragmented to 200-300 bp, and a targeted DNA methylation library was then prepared according to the method provided by the present invention. The library was sequenced on an MGISEQ-2000 sequencer (sequencing type PE100), followed by data analysis covering data utilization, mapping rate, amplicon specificity, uniformity, and other properties.
- Linker 1: 5'-/5Phos/AGTCGGAGGCCAAGCGGT (SEQ ID NO: 25)
- Linker 2: 5'-ACATGGCTACGATCCGACTddT (SEQ ID NO: 26)
- The cytosines in the linker 1 sequence are protected by methylation.
- The sequence of linker 2 may or may not be protected by methylation.
- The last base at the 3' end of linker 2 carries a blocking modification, namely a dideoxy modification, to prevent it from being ligated to the template.
- The EZ DNA Methylation-Gold Kit™ (ZYMO) was used for bisulfite treatment of the ligated DNA described above.
- CT Conversion Reagent solution: take the CT Conversion Reagent (solid mixture) out of the kit, add 900 μL of water, 50 μL of M-Dissolving Buffer, and 300 μL of M-Dilution Buffer, and dissolve at room temperature by vortexing for 10 minutes or shaking on a shaker for 10 minutes.
- a Bioanalyzer analysis system (Agilent, Santa Clara, USA) was used to detect the size and content of the insert in the library, and the results are shown in Figure 7.
- Sequencing was performed on the MGISEQ-2000 platform with sequencing type PE100. After sequencing, the reads were aligned and basic statistics were computed, including raw (off-machine) data, usable data, mapping rate, and specificity. The results are shown in Table 2. The sequencing depth of each amplicon is shown in Figure 8.
- Samples 1 to 3 are three replicates of the same sample.
- The mapping rate is the proportion of reads that align to the reference genome.
- The specificity is the ratio of reads in the target region to the total number of reads.
- Uniformity is the proportion of target regions whose depth is greater than 0.1 times the average depth of the target regions, relative to the total number of target regions.
- The adapter filtering ratio is around 1%, indicating few primer dimers, and the mapping rate is between 84% and 86%.
- The performance value is between 89% and 90%, which is good, and the coverage depth is uniform across amplicons.
- The first specific primer pool is an equimolar mixture of the above primers; Y denotes a degenerate base (C or T).
- The second specific primer pool is an equimolar mixture of the above primers; Y denotes a degenerate base (C or T).
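A primer written with Y stands for a mixture of concrete oligonucleotides. The Python sketch below enumerates that mixture using the standard IUPAC meaning of Y (C or T), together with the H and N codes used elsewhere in this description. The example primer is hypothetical and is not one of the primers in the pools above.

```python
from itertools import product

# Subset of IUPAC nucleotide codes: plain bases plus the degenerate
# symbols that appear in this description (Y, H, N).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "CT", "H": "ACT", "N": "ACGT",
}


def expand_degenerate(primer):
    """List every concrete sequence represented by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


example_primer = "GGYTTAY"  # hypothetical primer with two Y positions
for sequence in expand_degenerate(example_primer):
    print(sequence)
```

A likely reason for using Y in primers designed against bisulfite-treated templates is that a cytosine in the binding site reads as C when methylated and as T when unmethylated, so a C/T degenerate position lets the primer bind either form.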
- The N bases represent the barcode sequence used on the MGI sequencing platform.
- The terms "first", "second", etc. are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Therefore, a feature defined with "first" or "second" may explicitly or implicitly include at least one such feature.
- "A plurality of" means at least two, such as two, three, etc., unless otherwise specifically defined.
- Terms such as "connected" and "fixed" should be understood in a broad sense. For example, a connection may be fixed, detachable, or integral; it may be mechanical or electrical, or a communicative connection; it may be direct, or indirect through an intermediate medium; and it may be an internal communication between two components or an interaction between two components, unless otherwise clearly limited.
- The specific meaning of the above terms in the present invention can be understood according to the specific circumstances.
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Organic Chemistry (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Microbiology (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Biochemistry (AREA)
- Physics & Mathematics (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Analytical Chemistry (AREA)
- Bioinformatics & Computational Biology (AREA)
- Crystallography & Structural Chemistry (AREA)
- Immunology (AREA)
- Plant Pathology (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022502317A JP7203276B2 (ja) | 2019-05-21 | 2019-05-21 | メチル化されたdnaの標的領域に基づいてシーケンシングライブラリーを構築する方法及びキット |
| PCT/CN2019/087824 WO2020232635A1 (zh) | 2019-05-21 | 2019-05-21 | 基于甲基化dna目标区域构建测序文库及系统和应用 |
| EP19929647.6A EP3950956A4 (en) | 2019-05-21 | 2019-05-21 | Method and system for constructing sequencing library on the basis of methylated dna target region, and use thereof |
| CN201980092935.8A CN113811618B (zh) | 2019-05-21 | 2019-05-21 | 基于甲基化dna目标区域构建测序文库及系统和应用 |
| US17/493,991 US20220056519A1 (en) | 2019-05-21 | 2021-10-05 | Method and system for constructing sequencing library on the basis of methylated dna target region, and use thereof |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2019/087824 WO2020232635A1 (zh) | 2019-05-21 | 2019-05-21 | 基于甲基化dna目标区域构建测序文库及系统和应用 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US17/493,991 Continuation US20220056519A1 (en) | 2019-05-21 | 2021-10-05 | Method and system for constructing sequencing library on the basis of methylated dna target region, and use thereof |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2020232635A1 (zh) | 2020-11-26 |
Family
ID=73459291
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2019/087824 Ceased WO2020232635A1 (zh) | 2019-05-21 | 2019-05-21 | 基于甲基化dna目标区域构建测序文库及系统和应用 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20220056519A1 |
| EP (1) | EP3950956A4 |
| JP (1) | JP7203276B2 |
| CN (1) | CN113811618B |
| WO (1) | WO2020232635A1 |
Families Citing this family (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN114908151B (zh) * | 2022-04-29 | 2026-02-03 | 上海伯杰医疗科技股份有限公司 | 用于基因cds区域一代测序的嵌套引物对、测序引物对、测序试剂、测序试剂盒及应用 |
| CN115386966B (zh) * | 2022-10-26 | 2023-03-21 | 北京寻因生物科技有限公司 | Dna表观修饰的建库方法、测序方法及其建库试剂盒 |
| WO2024119481A1 (zh) * | 2022-12-09 | 2024-06-13 | 深圳华大智造科技股份有限公司 | 一种快速制备多重pcr测序文库的方法及其应用 |
| WO2024124400A1 (zh) * | 2022-12-13 | 2024-06-20 | 深圳华大智造科技股份有限公司 | 一种基于多重pcr的靶向甲基化建库体系、方法及其应用 |
| WO2024259564A1 (zh) * | 2023-06-19 | 2024-12-26 | 深圳华大智造科技股份有限公司 | 一种一步法构建靶向文库的方法及其应用 |
| CN117316289B (zh) * | 2023-09-06 | 2024-04-26 | 复旦大学附属华山医院 | 一种中枢神经系统肿瘤的甲基化测序分型方法及系统 |
| CN117778578B (zh) * | 2023-12-29 | 2024-10-15 | 深圳海普洛斯医学检验实验室 | 一种基于高通量测序检测目标基因甲基化程度的引物组合物及其应用 |
| CN118390167A (zh) * | 2024-03-15 | 2024-07-26 | 臻赫医药(杭州)有限公司 | 高通量测序文库的构建方法、试剂盒及应用 |
| CN118389494B (zh) * | 2024-06-21 | 2025-06-20 | 北京寻因生物科技有限公司 | 细胞标签微珠及其制备方法和应用 |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107541791A (zh) * | 2017-10-26 | 2018-01-05 | 中国科学院北京基因组研究所 | 血浆游离dna甲基化检测文库的构建方法、试剂盒及应用 |
| CN109666720A (zh) * | 2018-12-28 | 2019-04-23 | 北京中科遗传与生殖医学研究院有限责任公司 | 一种对胚胎培养液进行DedscRRBS-PGS分析的方法 |
Family Cites Families (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN102796808B (zh) * | 2011-05-23 | 2014-06-18 | 深圳华大基因科技服务有限公司 | 甲基化高通量检测方法 |
| US20150011396A1 (en) | 2012-07-09 | 2015-01-08 | Benjamin G. Schroeder | Methods for creating directional bisulfite-converted nucleic acid libraries for next generation sequencing |
| CN106011230A (zh) * | 2016-05-10 | 2016-10-12 | 人和未来生物科技(长沙)有限公司 | 用于检测碎片化dna目标区域的引物组合物及其应用 |
| CN105861724B (zh) * | 2016-06-03 | 2019-07-16 | 人和未来生物科技(长沙)有限公司 | 一种kras基因超低频突变检测试剂盒 |
| ES2922281T3 (es) * | 2016-12-07 | 2022-09-12 | Mgi Tech Co Ltd | Método para construir una biblioteca de secuenciación de una célula individual y uso del mismo |
| AU2018243360B2 (en) * | 2017-03-29 | 2023-10-26 | Cornell University | Devices, processes, and systems for determination of nucleic acid sequence, expression, copy number, or methylation changes using combined nuclease, ligase, polymerase, and sequencing reactions |
| WO2019006392A1 (en) * | 2017-06-30 | 2019-01-03 | Life Technologies Corporation | LIBRARY PREPARATION METHODS AND COMPOSITIONS AND USES THEREOF |
| CN107937985A (zh) * | 2017-10-25 | 2018-04-20 | 人和未来生物科技(长沙)有限公司 | 一种微量碎片化dna甲基化检测文库的构建方法和检测方法 |
-
2019
- 2019-05-21 CN CN201980092935.8A patent/CN113811618B/zh active Active
- 2019-05-21 WO PCT/CN2019/087824 patent/WO2020232635A1/zh not_active Ceased
- 2019-05-21 JP JP2022502317A patent/JP7203276B2/ja active Active
- 2019-05-21 EP EP19929647.6A patent/EP3950956A4/en active Pending
-
2021
- 2021-10-05 US US17/493,991 patent/US20220056519A1/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| EP3950956A1 (en) | 2022-02-09 |
| JP2022525373A (ja) | 2022-05-12 |
| JP7203276B2 (ja) | 2023-01-12 |
| CN113811618B (zh) | 2024-02-09 |
| EP3950956A4 (en) | 2022-05-04 |
| CN113811618A (zh) | 2021-12-17 |
| US20220056519A1 (en) | 2022-02-24 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| WO2020232635A1 (zh) | 基于甲基化dna目标区域构建测序文库及系统和应用 | |
| CN102329876B (zh) | 一种测定待检测样本中疾病相关核酸分子的核苷酸序列的方法 | |
| CN103060924B (zh) | 微量核酸样本的文库制备方法及其应用 | |
| ES2393318T3 (es) | Estrategias para la identificación y detección de alto rendimiento de polimorfismos | |
| CN103131754B (zh) | 一种检测核酸羟甲基化修饰的方法及其应用 | |
| CN103898199B (zh) | 一种高通量核酸分析方法及其应用 | |
| CN106591441B (zh) | 基于全基因捕获测序的α和/或β-地中海贫血突变的检测探针、方法、芯片及应用 | |
| CN105886608B (zh) | ApoE基因引物组、检测试剂盒和检测方法 | |
| CN102409047B (zh) | 一种构建杂交测序文库的方法 | |
| CN110628891B (zh) | 一种对胚胎进行基因异常筛查的方法 | |
| CN111471754B (zh) | 一种通用型高通量测序接头及其应用 | |
| WO2019144582A1 (zh) | 用于检测基因突变和已知、未知基因融合类型的高通量测序靶向捕获目标区域的探针和方法 | |
| CN107828883B (zh) | 一种苯丙酮尿症的检测引物组、试剂盒及基因突变检测方法 | |
| JP2020536525A (ja) | プローブ及びこれをハイスループットシーケンシングに適用するターゲット領域の濃縮方法 | |
| CN106939344B (zh) | 用于二代测序的接头 | |
| CN110004225B (zh) | 一种肿瘤化疗药个体化基因检测试剂盒、引物及方法 | |
| US20220090059A1 (en) | Method and use for construction of sequencing library based on dna samples | |
| CN108060227A (zh) | 一种检测pah基因突变的扩增引物、试剂盒及其检测方法 | |
| CN111500748B (zh) | 用于检测617种SNP和InDel的引物组合及其在法医鉴定和亲缘关系鉴定中的应用 | |
| CN113969307A (zh) | Dna甲基化测序文库及制备方法和dna甲基化检测方法 | |
| CN109825552A (zh) | 一种用于对目标区域进行富集的引物及方法 | |
| CN108504651A (zh) | 基于高通量测序的pcr产物大样本量混合建库的文库构建方法和试剂 | |
| CN118308459A (zh) | 一种单细胞全基因组甲基化文库建库方法及其应用 | |
| CN109136217B (zh) | 一种测序文库构建的方法、建库试剂及其应用 | |
| CN112048543A (zh) | 高通量低成本高碱基准确性的质粒或dna片段的新型测序方法 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 19929647; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2022502317; Country of ref document: JP; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2019929647; Country of ref document: EP; Effective date: 20211029 |
| | NENP | Non-entry into the national phase | Ref country code: DE |