CN110651050A - Targeted enrichment method and kit for detecting low-frequency mutation - Google Patents


Info

Publication number
CN110651050A
Authority
CN
China
Prior art keywords
primer
sequencing
pcr amplification
kit
bases
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201780091041.8A
Other languages
Chinese (zh)
Inventor
杨林
高雅
蒲丹丹
张海萍
程云阳
陈芳
蒋慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Shenzhen Co Ltd
Shenzhen BGI Life Science Research Institute
Original Assignee
Shenzhen BGI Life Science Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen BGI Life Science Research Institute
Publication of CN110651050A

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a targeted enrichment method and a kit for detecting low-frequency mutations. The method comprises the following steps: using terminal transferase, adding a homopolymer tail (a run of one repeated base) to the 3' end of each strand of single-stranded and/or double-stranded DNA; performing a first PCR amplification on the region containing the mutation target site using an upstream specific primer and a first universal primer, wherein the first universal primer comprises sequencing primer sequence 1 at its 5' end and, at its 3' end, a run of a single base complementary to the homopolymer tail; and performing a second PCR amplification on the product of the first PCR amplification using a downstream specific primer and a second universal primer, wherein the second universal primer comprises sequencing primer sequence 1 of the first universal primer.

Description

Targeted enrichment method and kit for detecting low-frequency mutation
Technical Field
The invention relates to the technical field of molecular biology, in particular to a targeted enrichment method and a kit for detecting low-frequency mutation.
Background
The development of second-generation sequencing (NGS) has opened a new era for modern genomics research. However, although second-generation sequencing offers higher throughput at lower cost, the cost of whole-genome sequencing and the complexity of its analysis still put it out of reach for most genetics laboratories. This is especially true for studies of complex diseases, which require at least hundreds of samples to achieve sufficient statistical power. Whole-genome sequencing of that many samples is difficult in terms of both cost and data analysis.
The advent of target region enrichment sequencing (target region sequencing) alleviates these problems. The technique enriches the target genes of interest and then sequences them by second-generation sequencing to obtain the base information of the target region, for purposes such as disease detection.
Two main techniques for enriching target regions are currently on the market: probe-based capture enrichment and multiplex PCR-based enrichment. Probe-based capture relies on complementary hybridization of nucleic acid bases: reverse-complementary oligonucleotide probes are designed against the target region, genomic DNA is fragmented and ligated to sequencing adapters, the fragments are hybridized to the probes, unhybridized DNA is washed away, and the target DNA fragments are recovered for library preparation and sequencing. This technique requires a large amount of input DNA (generally on the order of micrograms) and long, complicated experimental procedures, has low data utilization and high cost, and is poorly suited to automated library construction. Multiplex PCR-based enrichment designs primers against the target region, enriches the region by multiplex PCR, and then prepares a library from the PCR product for sequencing. Although the experimental time is short, this technique requires complicated primer design up front and a great deal of tedious primer optimization afterwards. In addition, both techniques place strict requirements on template quantity and integrity and are ineffective for samples such as cell-free DNA, highly degraded DNA, and formalin-fixed paraffin-embedded (FFPE) samples. It is therefore important to develop a simple targeted enrichment method that can effectively enrich short DNA fragments.
Disclosure of Invention
The invention provides a targeted enrichment method and a kit for detecting low-frequency mutations, which can effectively enrich short-fragment, single-stranded, double-stranded, and damaged DNA and, combined with second-generation sequencing, can detect low-frequency mutations in that DNA.
According to a first aspect, in one embodiment there is provided a targeted enrichment method for detecting low frequency mutations, comprising:
using terminal transferase, adding a homopolymer tail (a run of one repeated base) to the 3' end of each strand of single-stranded and/or double-stranded DNA;
performing a first PCR amplification on the region containing the mutation target site, using an upstream specific primer that anchors on its complementary target sequence and a first universal primer that anchors on the homopolymer tail, wherein the first universal primer comprises sequencing primer sequence 1 at its 5' end and, at its 3' end, a run of a single base complementary to the homopolymer tail; and
performing a second PCR amplification on the product of the first PCR amplification using a downstream specific primer and a second universal primer, wherein the downstream specific primer is located downstream of the upstream specific primer and carries sequencing primer sequence 2 at its 5' end, and the second universal primer comprises sequencing primer sequence 1 from the 5' end of the first universal primer.
According to a second aspect, there is provided in one embodiment a targeted enrichment kit for detecting low frequency mutations, comprising:
a terminal transferase, for adding a homopolymer tail (a run of one repeated base) to the 3' end of each strand of single-stranded and/or double-stranded DNA;
an upstream specific primer and a first universal primer, for performing a first PCR amplification on the region containing the mutation target site, wherein the upstream specific primer anchors on its complementary target sequence, the first universal primer anchors on the homopolymer tail, and the first universal primer comprises sequencing primer sequence 1 at its 5' end and, at its 3' end, a run of a single base complementary to the homopolymer tail; and
a downstream specific primer and a second universal primer, for performing a second PCR amplification on the product of the first PCR amplification, wherein the downstream specific primer is located downstream of the upstream specific primer and carries sequencing primer sequence 2 at its 5' end, and the second universal primer comprises sequencing primer sequence 1 from the 5' end of the first universal primer.
The method can effectively enrich short-fragment single-stranded DNA, double-stranded DNA and DNA with gap damage and has extremely high template utilization rate; the kit has high detection sensitivity, and can detect low-frequency mutation as low as 0.1%; can effectively enrich multiple target areas at one time, and can ensure good specificity, uniformity and stability.
Drawings
FIG. 1 is a schematic diagram of the principle of a targeted enrichment method for detecting low frequency mutations according to one embodiment of the present invention;
FIG. 2 is a schematic diagram of the concept of a targeted enrichment method for detecting low frequency mutations according to another embodiment of the present invention;
FIG. 3 is a schematic illustration of molecular signature calibration in one embodiment of the present invention;
FIG. 4 is a graph showing the results of quality inspection of Agilent 2100 on the target enrichment library in example 1 of the present invention;
FIG. 5 is a graph showing the results of uniformity of each amplicon of HBB in example 1 of the present invention;
FIG. 6 is a graph showing the results of quality inspection of Agilent 2100 on the target enrichment library in example 2 of the present invention;
FIG. 7 is a depth profile of sequencing data for 10 amplicon regions in example 2 of the present invention;
FIG. 8 is a depth profile of 10 amplicon region sequencing data after molecular signature correction in example 2 of the present invention;
FIG. 9 is a graph showing the results of the consistency of mutation detection in example 2 of the present invention.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments and the accompanying drawings. Numerous specific details are set forth to provide a thorough understanding of the invention. However, those skilled in the art will readily recognize that some of these features may be omitted in certain cases or replaced by other materials or methods. In some instances, certain operations related to the invention are not shown or described in the specification, to avoid obscuring its core with excessive description; a detailed account of these operations is unnecessary for those skilled in the art, who can fully understand them from the specification and general knowledge in the art.
Ordinal terms applied to components herein, such as "first" and "second", are used only to distinguish the objects described and do not imply any order or technical meaning.
FIG. 1 shows the principle of a targeted enrichment method for detecting low frequency mutations according to one embodiment of the present invention, which comprises:
Step I: Using terminal transferase, a homopolymer tail (a run of one repeated base) is added to the 3' end of each strand of the single-stranded and/or double-stranded DNA.
Specifically, in the example shown in FIG. 1, double-stranded DNA (dsDNA) is denatured to obtain single-stranded DNA (ssDNA), and terminal transferase adds 15-30 residues of a single base, C (or A, T, or G), to the 3' end of the single-stranded DNA to form the homopolymer tail. As a result, all single-stranded, double-stranded, and damaged DNA templates carry at the 3' end a homopolymer tail of 15-30 bases composed of one base. Note that double-stranded DNA can also be tailed directly, without denaturation, with terminal transferase adding the homopolymer tail to the 3' end of each strand.
Step II: A first PCR amplification is performed on the region containing the mutation target site using an upstream specific primer and a first universal primer, wherein the upstream specific primer anchors on its complementary target sequence, the first universal primer anchors on the homopolymer tail, and the first universal primer comprises sequencing primer sequence 1 at its 5' end and, at its 3' end, a run of a single base complementary to the homopolymer tail.
Specifically, in the example shown in FIG. 1, the upstream specific primer (USP) is designed 25-150 bp upstream of the mutation target site. Amplification is carried out for 10-20 cycles, anchored at one end on the homopolymer tail and at the other end on the target sequence complementary to the USP. The primer that anchors on the homopolymer tail is called the "first universal primer" (universal primer 1 in FIG. 1). It comprises sequencing primer sequence 1 at the 5' end (for example, a sequencing primer sequence of a platform such as BGISEQ-500, Illumina, or Proton), followed by a run of consecutive G (or T, A, or C) bases. When the run consists of C or G, it contains 11-15 bases, giving a Tm of 54-70 °C; when it consists of A or T, it contains 25-35 bases, giving a Tm of 54-62 °C. In either case, a longer run hinders PCR amplification. In the preferred embodiment shown in FIG. 1, universal primer 1 has a run of 13 G bases and, preferably, a degenerate base H (V, B, or D in other cases) at its 3' terminus to fix the length of the product's 3' end: templates whose C tails differ in length are all amplified into products with a C run (A, T, or G in other cases) of fixed length at the 3' end. After magnetic bead purification, the PCR product consists, from 5' to 3', of the upstream specific primer sequence, the target region sequence, the run of C (A, T, or G in other cases), and finally the sequencing primer sequence. The methods of the embodiments of the invention are applicable to targeted enrichment of various mutation types, including single nucleotide polymorphisms (SNPs), insertions and deletions (INDELs), and copy number variations (CNVs).
Step III: A second PCR amplification is performed on the product of the first PCR amplification using a downstream specific primer and a second universal primer, wherein the downstream specific primer is located downstream of the upstream specific primer and carries sequencing primer sequence 2 at its 5' end, and the second universal primer comprises sequencing primer sequence 1 from the 5' end of the first universal primer.
Specifically, in the example shown in FIG. 1, a downstream specific primer (DSP) is designed downstream of the upstream specific primer (USP), for example in the region 0-10 bp downstream of the USP, and sequencing primer sequence 2 (for example, a sequencing primer sequence of a platform such as BGISEQ-500, Illumina, or Proton) is added to its 5' end. Amplification is then carried out for 15-30 cycles using the DSP carrying sequencing primer sequence 2 and universal primer 2 (the "second universal primer"). The second universal primer comprises sequencing primer sequence 1 from the 5' end of the first universal primer.
As a preferred technical scheme, a sequencing tag primer (BC) can be introduced, whose 3' end comprises sequencing primer sequence 2 from the 5' end of the downstream specific primer. In the first PCR cycle, the downstream specific primer (DSP) and universal primer 2 drive amplification; because the 5' end of the DSP also carries sequencing primer sequence 2, the resulting product has sequencing primer sequences at both ends. From the second cycle onward, with the sequencing primer sequences serving as anchor sites, the two universal primers (universal primer 2 and the sequencing tag primer (BC)) amplify together with the downstream specific primer to yield the targeted sequencing library. Introducing the sequencing tag primer has two benefits: (1) it reduces the number of specific amplification cycles and increases the number of universal amplification cycles, which improves uniformity across amplicon regions; (2) it reduces the amount of adapter sequence carried by the specific primers, since more of the adapter sequence is introduced by the sequencing tag primer.
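The role of the degenerate 3' base on universal primer 1 (Step II) can be illustrated with a minimal Python sketch. It is an illustration only, not the authors' implementation: `SEQ_PRIMER_1` is a made-up placeholder rather than a real platform sequence, and the function names are hypothetical. The sketch assumes the primer anneals so that its degenerate base pairs with the last non-tail base of the template, which is why tails of different lengths yield the same fixed-length run in the product.

```python
# IUPAC degenerate codes named in the text; each one excludes exactly one base.
IUPAC = {"H": "ACT", "V": "ACG", "B": "CGT", "D": "AGT"}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

# Hypothetical placeholder, NOT a real BGISEQ-500/Illumina/Proton sequence.
SEQ_PRIMER_1 = "ACGTACGTAC"


def universal_primer_1(tail_base="C", run_length=13):
    """Sequencing primer 1 + run complementary to the tail + degenerate 3' base."""
    run_base = COMPLEMENT[tail_base]
    # Pick the degenerate code that excludes the run base (H for a G run),
    # so the 3' base cannot pair inside the homopolymer tail itself.
    degenerate = next(code for code, bases in IUPAC.items() if run_base not in bases)
    return SEQ_PRIMER_1 + run_base * run_length + degenerate


def first_round_tail(tailed_template, tail_base="C", run_length=13):
    """Template-sense sequence copied in round 1 from a 3'-tailed template (5'->3')."""
    body = tailed_template.rstrip(tail_base)       # everything before the tail
    tail_len = len(tailed_template) - len(body)
    if tail_len < run_length:
        raise ValueError("tail is shorter than the primer run")
    # The degenerate base anchors at the body/tail junction, so only the
    # run_length tail bases next to the junction end up in the product.
    return body + tail_base * run_length


if __name__ == "__main__":
    print(universal_primer_1())
    for n in (15, 22, 30):
        print(n, first_round_tail("ATGGCTA" + "C" * n))
```

With a C tail the sketch selects H, and with A, T, or G tails it selects V, B, or D respectively, matching the alternatives listed in the text.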
The method of the embodiment has the following beneficial effects:
(1) Single-stranded and double-stranded DNA can be effectively enriched with very high template utilization. Specifically, terminal transferase can add bases to free single-stranded, double-stranded, and damaged DNA with an efficiency of 99% or more; that is, at least 99% of templates receive a homopolymer tail at the 3' end. Because the target region is enriched with a specific primer and the homopolymer tail, a template needs only one specific primer binding site to be enriched. For cell-free DNA of about 165 bp, with the primer occupying about 25 bp, template utilization can reach (165 - 25)/165 ≈ 85%, so the method enriches highly fragmented single-stranded, double-stranded, and damaged DNA well.
(2) Multiple target regions can be enriched. Specifically, multiplex PCR amplification uses specific primers together with an anchor primer: the specific primers can target different regions, while the anchor primer has a fixed sequence. Multiple regions are enriched with the pooled specific primers and the fixed anchor primer, and two rounds of nested PCR improve enrichment specificity, yielding a targeted sequencing library covering multiple regions.
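The template utilization figure in point (1) is simple arithmetic, sketched below in Python. The function name is hypothetical, and the model is only the one implied by the text: a fragment is usable whenever the single specific primer site fits on it.

```python
def template_utilization(fragment_len, primer_len):
    """Fraction of a fragment left for target sequence after one primer site."""
    if not 0 < primer_len <= fragment_len:
        raise ValueError("primer must fit on the fragment")
    return (fragment_len - primer_len) / fragment_len


if __name__ == "__main__":
    # Values from the text: ~165 bp cell-free DNA, ~25 bp primer.
    print(f"{template_utilization(165, 25):.1%}")
```

The same function can be called with shorter fragment lengths to see how utilization falls as DNA becomes more degraded.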
FIG. 2 shows the principle of a targeted enrichment method for detecting low frequency mutations according to another embodiment of the present invention, which comprises:
The double-stranded DNA molecule is denatured to obtain single-stranded DNA, and terminal transferase adds 5-10 random bases to the 3' end of the single-stranded DNA template as a molecular tag that marks the original template. The number of possible molecular tags exceeds one million ($N = 4^5 + 4^6 + 4^7 + \dots + 4^{10} = 1397760$), far more than the number of templates, so each template is labeled, with a probability above 99.9%, with a unique 5-10 bp molecular tag composed of the four bases A, T, C, and G. Residual dNTPs are removed by magnetic bead purification.
Terminal transferase then adds 15-30 residues of a single base, C (or A, T, or G), after the molecular tag to form a homopolymer tail, and residual dCTP (or dATP, dTTP, or dGTP) is removed by magnetic bead purification. All free single-stranded, double-stranded, and damaged DNA templates then carry at the 3' end a 5-10 bp molecular tag composed of the four bases, followed by a 15-30 bp homopolymer tail composed of one base. Note that double-stranded DNA can also be processed directly, without denaturation, with terminal transferase adding the molecular tag and the homopolymer tail to the 3' end of each strand.
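The tag count quoted above can be checked with a short Python sketch. The uniqueness estimate is an added assumption rather than something stated in the source: it treats every tag as equally likely, whereas real terminal transferase tailing has base and length biases, and `n_templates` stands for the number of templates sharing one target site. Function names are hypothetical.

```python
def tag_diversity(min_len=5, max_len=10, alphabet=4):
    """Number of distinct tags of min_len..max_len bases over a 4-letter alphabet."""
    return sum(alphabet ** k for k in range(min_len, max_len + 1))


def prob_unique(n_templates, n_tags):
    """Chance that one template's tag is shared by none of the others.

    Assumes tags are drawn uniformly and independently (an idealization).
    """
    return (1 - 1 / n_tags) ** (n_templates - 1)


if __name__ == "__main__":
    n_tags = tag_diversity()
    print(n_tags)  # 4^5 + 4^6 + ... + 4^10
    print(f"{prob_unique(1000, n_tags):.5f}")
```

Under this idealized model, the per-template uniqueness probability stays above 99.9% for up to roughly 1,400 templates at the same site.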
The upstream specific primer (USP) is designed 25-150 bp upstream of the mutation target site. Amplification is carried out for 10-20 cycles, anchored at one end on the homopolymer tail and at the other end on the target sequence complementary to the USP. The primer that anchors on the homopolymer tail is called the "first universal primer" (universal primer 1 in FIG. 2). It comprises sequencing primer sequence 1 at the 5' end (for example, a sequencing primer sequence of a platform such as BGISEQ-500, Illumina, or Proton), followed by a run of consecutive G (or T, A, or C) bases. When the run consists of C or G, it contains 11-15 bases, giving a Tm of 54-70 °C; when it consists of A or T, it contains 25-35 bases, giving a Tm of 54-62 °C. In either case, a longer run hinders PCR amplification. In the preferred embodiment shown in FIG. 2, universal primer 1 has a run of 13 G bases and, preferably, a degenerate base H (V, B, or D in other cases) at its 3' terminus to fix the length of the product's 3' end: templates whose C tails differ in length are all amplified into products with a C run (A, T, or G in other cases) of fixed length at the 3' end. After magnetic bead purification, the PCR product consists, from 5' to 3', of the upstream specific primer sequence, the target region sequence, the molecular tag of 5-10 random bases, the run of C (A, T, or G in other cases), and finally the sequencing primer sequence. The methods of the embodiments of the invention are applicable to targeted enrichment of various mutation types, including single nucleotide polymorphisms (SNPs), insertions and deletions (INDELs), and copy number variations (CNVs).
A downstream specific primer (DSP) is designed downstream of the upstream specific primer (USP), and sequencing primer sequence 2 (for example, a sequencing primer sequence of a platform such as BGISEQ-500, Illumina, or Proton) is added to its 5' end. Amplification is then carried out for 15-30 cycles using the DSP carrying sequencing primer sequence 2 and universal primer 2 (the "second universal primer"). The second universal primer comprises sequencing primer sequence 1 from the 5' end of the first universal primer.
As a preferred technical scheme, a sequencing tag primer (BC) can be introduced, whose 3' end comprises sequencing primer sequence 2 from the 5' end of the downstream specific primer. In the first PCR cycle, the downstream specific primer (DSP) and universal primer 2 drive amplification; because the 5' end of the DSP also carries sequencing primer sequence 2, the resulting product has sequencing primer sequences at both ends. From the second cycle onward, with the sequencing primer sequences serving as anchor sites, the two universal primers (universal primer 2 and the sequencing tag primer (BC)) amplify together with the downstream specific primer to yield the targeted sequencing library. Introducing the sequencing tag primer has two benefits: (1) it reduces the number of specific amplification cycles and increases the number of universal amplification cycles, which improves uniformity across amplicon regions; (2) it reduces the amount of adapter sequence carried by the specific primers, since more of the adapter sequence is introduced by the sequencing tag primer.
In addition to the beneficial effects of the method shown in FIG. 1, the method shown in FIG. 2 has high detection sensitivity and can detect low-frequency mutations down to 0.1%. Specifically, terminal transferase adds a random sequence of 5-10 bp, composed of the four bases, to the 3' end of each free DNA template. More than a million such sequences are possible, so each initial template can be uniquely marked, and combining the molecular tags with a bioinformatic analysis method allows low-frequency mutations down to 0.1% to be detected.
As shown in FIG. 3, the targeted sequencing library obtained in this embodiment is subjected to paired-end sequencing: the read from the specific primer end detects the target enrichment region, and the read from the other end captures the molecular tag that labels the template. Combined with a dedicated data analysis algorithm, the molecular tags are used to remove PCR errors and sequencing errors, enabling detection of very low-frequency mutations.
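The source does not disclose its analysis algorithm. The Python sketch below shows one common way molecular tags are used to suppress PCR and sequencing errors: reads are grouped by tag into families, each family is collapsed to a consensus base, and the mutation frequency is computed over families rather than reads. The thresholds `min_family_size` and `min_agreement`, the function names, and the example reads are all illustrative assumptions.

```python
from collections import Counter, defaultdict


def consensus_by_tag(reads, min_family_size=2, min_agreement=0.8):
    """Collapse (tag, base) observations at one position into one base per tag.

    Families that are too small or too discordant are discarded, since a
    true variant should be present in nearly all copies of its template.
    """
    families = defaultdict(list)
    for tag, base in reads:
        families[tag].append(base)

    consensus = {}
    for tag, bases in families.items():
        if len(bases) < min_family_size:
            continue
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[tag] = base
    return consensus


def allele_frequency(consensus, alt_base):
    """Fraction of tag families whose consensus base equals alt_base."""
    if not consensus:
        return 0.0
    return sum(1 for b in consensus.values() if b == alt_base) / len(consensus)


if __name__ == "__main__":
    reads = (
        [("AACGT", "G")] * 5 + [("AACGT", "A")]   # one stray error in a G family
        + [("CCGTA", "A")] * 4                    # a consistent A family
        + [("TTGCA", "G")]                        # singleton, dropped
    )
    calls = consensus_by_tag(reads)
    print(calls, allele_frequency(calls, "A"))
```

Counting families instead of reads means a single erroneous read inside a family does not register as a variant, which is the property that makes sub-percent allele frequencies distinguishable from background error.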
The technical solutions and effects of the present invention are described in detail by the following embodiments, and it should be understood that the embodiments are only exemplary and should not be construed as limiting the scope of the present invention.
Example 1: Detection of paternally inherited thalassemia mutations
Nineteen primer pairs were designed for the HBB gene, which is associated with beta thalassemia, to detect common beta thalassemia mutation sites. Given the different mutation types carried by the two parents, the plasma cell-free DNA of the pregnant woman was tested for whether the fetus carries the paternal mutation, for the purpose of exclusion diagnosis.
The sample was plasma cell-free DNA from a woman at 12 weeks of pregnancy. The mother carried the CD41/42 βE (del CTTT) mutation and the father carried the CD71/72 (Ins A) mutation. A library was built by targeted capture of the plasma cell-free DNA and sequenced to determine whether the fetus carried the paternal CD71/72 mutation.
The experimental steps are as follows:
1. Addition of oligonucleotide tails
cfDNA was first heat-denatured at 95 °C for 5 minutes, immediately placed on ice, and then subjected to the enzymatic reaction. An oligonucleotide tail was added to the 3' end of the DNA with terminal transferase (Terminal Transferase, Cat. No. M0315S, NEB, USA).
The reaction system is shown in table 1:
TABLE 1
Figure PCTCN2017093914-APPB-000001
The reaction was incubated at 37 °C for 30 minutes and then stopped by adding 10 μl of 0.5 M EDTA. A 1.8-fold volume of Agencourt AMPure XP magnetic beads (Beckman Coulter, USA) was added, purification was performed according to the manufacturer's instructions, and the DNA was then dissolved in 20 μl of distilled water.
2. First round PCR amplification
The PCR enzyme was NEB Q5 Hot Start High-Fidelity 2X Master Mix (Cat. No. M0494L).
The PCR reaction system is shown in Table 2:
TABLE 2
Figure PCTCN2017093914-APPB-000003
The upstream specific primer pool is shown in Table 3, and universal primer 1 is shown in Table 4.
TABLE 3
Primer name Primer sequence (5'-3') Sequence number
HBB-USP1 TGAGAGATGCAGGATAAGCAA SEQ ID NO:1
HBB-USP2 GTTGCCAATGTGCATTAGCT SEQ ID NO:2
HBB-USP3 TCCCAAGGTTTGAACTAGCTC SEQ ID NO:3
HBB-USP4 TTAGGGAACAAAGGAACCTTTAAT SEQ ID NO:4
HBB-USP5 GTGGGAGGAAGATAAGAGGTATGA SEQ ID NO:5
HBB-USP6 GCTGCTATTAGCAATATGAAACCTC SEQ ID NO:6
HBB-USP7 TGATACATTGTATCATTATTGCCCTG SEQ ID NO:7
HBB-USP8 TAGTAATGTACTAGGCAGACTGTGT SEQ ID NO:8
HBB-USP9 TCATTCGTCTGTTTCCCATTC SEQ ID NO:9
HBB-USP10 CCTTCCTATGACATGAACTTAACC SEQ ID NO:10
HBB-USP11 GCGTCCCATAGACTCACCC SEQ ID NO:11
HBB-USP12 CACCGAGCACTTTCTTGCC SEQ ID NO:12
HBB-USP13 GAAAATAGACCAATAGGCAGAGAGA SEQ ID NO:13
HBB-USP14 CCTTAAACCTGTCTTGTAACCTTGAT SEQ ID NO:14
HBB-USP15 CAGTAACGGCAGACTTCTCCTC SEQ ID NO:15
HBB-USP16 GTTGTGTCAGAAGCAAATGTAAGC SEQ ID NO:16
HBB-USP17 CTGACTTTTATGCCCAGCC SEQ ID NO:17
HBB-USP18 CTAGGGTGTGGCTCCACAG SEQ ID NO:18
HBB-USP19 CAGCCGTACCTGTCCTTGG SEQ ID NO:19
The pool of upstream specific primers consisted of an equimolar mixture of primers as shown in Table 3.
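As a quick sanity check on a primer pool such as Table 3, length, GC content, and a rough melting temperature can be tabulated. The Python sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), a coarse rule of thumb for short oligonucleotides. It is not stated in the source and is not presented as how these primers were designed. The three sequences are copied from Table 3; the function names are hypothetical.

```python
# Three upstream specific primers copied from Table 3.
PRIMERS = {
    "HBB-USP1": "TGAGAGATGCAGGATAAGCAA",
    "HBB-USP2": "GTTGCCAATGTGCATTAGCT",
    "HBB-USP11": "GCGTCCCATAGACTCACCC",
}


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough Tm estimate in °C: 2 per A/T plus 4 per G/C (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}\t{len(seq)} nt\tGC {gc_fraction(seq):.0%}\tTm ~{wallace_tm(seq)} °C")
```

Primers whose estimates sit far from the rest of the pool are candidates for closer inspection with a nearest-neighbor Tm model before multiplexing.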
TABLE 4
Figure PCTCN2017093914-APPB-000004
The amplification system is shown in table 5 below:
TABLE 5
Figure PCTCN2017093914-APPB-000005
A 1.8-fold volume of Agencourt AMPure XP magnetic beads (Beckman Coulter, USA) was added, purification was performed according to the manufacturer's instructions, and the DNA was then dissolved in 20 μl of distilled water.
3. Second round of PCR amplification
The PCR reaction system is shown in Table 6 below:
TABLE 6
Figure PCTCN2017093914-APPB-000006
The downstream specific primer pool is shown in Table 7, and universal primer 2 and the sequencing tag primers are shown in Table 4.
TABLE 7
Figure PCTCN2017093914-APPB-000007
Figure PCTCN2017093914-APPB-000008
The pool of downstream specific primers consisted of an equimolar mixture of primers as shown in Table 7.
The amplification system is shown in table 8 below:
TABLE 8
Figure PCTCN2017093914-APPB-000009
50 μl (a 1-fold volume) of Agencourt AMPure XP magnetic beads (Beckman Coulter, USA) was added, purification was performed according to the manufacturer's instructions, and the DNA was then dissolved in 30 μl of distilled water.
4. Library quality inspection
The targeted enrichment library was checked on an Agilent 2100; the quality inspection result is shown in FIG. 4.
5. Sequencing
After passing quality inspection, the library was sequenced on the BGISEQ-500 platform with 100 bp single-end reads. After data conversion and quality filtering, the raw sequencing data were analyzed as follows.
6. Information analysis
First, adapters were removed from the data to obtain the single-end reads, which were aligned to the reference genome (hg19). Mutations at the target sites were then counted to obtain target site information. The results are shown in Tables 9 and 10. The depth of each target region was computed by position to assess uniformity; as shown in FIG. 5, uniformity was good.
Table 9: Sequencing data statistics
Sample Raw sequencing data Mapping rate Capture rate 0.1X average depth
1 6789141 93.2% 93.5% 100%
Table 10: beta thalassemia detection result
Figure PCTCN2017093914-APPB-000010
Conclusion: The detection result is the same as the detection result of the embodiment. This method enables non-invasive detection of whether the fetus carries the paternal beta thalassemia mutation.
Example 2: plasma free DNA low frequency mutation detection
Primers were designed for 10 hotspot mutations associated with lung cancer, a targeted sequencing library was constructed from plasma cell-free DNA, and the lung-cancer-related hotspot regions were detected by combining high-throughput sequencing with specific information analysis.
The plasma cell-free DNA used was a Horizon cfDNA standard: 0.1% Multiplex I cfDNA Reference Standard (cat. no. HD779). Its mutation information is given in Table 11. The input amount was 10 ng, and the experiment was carried out as follows.
Table 11: Mutation information for the Horizon cfDNA standard
Gene | Mutation | Mutation type | Mutation frequency
BRAF | V600E | c.1799T>A (exon 15) | 0.00%
cKIT | D816V | c.2447A>T | 0.00%
EGFR | G719S | c.2155G>A | 0.00%
EGFR | T790M | c.2369C>T | 0.10%
EGFR | L858R | c.2573T>G | 0.10%
EGFR | ΔE746-A750 | c.2235_2249del15 (deletion) | 0.10%
KRAS | G12D | c.35G>A | 0.13%
KRAS | G13D | c.38G>A | 0.00%
NRAS | Q61K | c.35G>A | 0.13%
PIK3CA | E545K | c.35G>A | 0.13%
PIK3CA | H1047R | c.35G>A | 0.00%
The experimental steps are as follows:
1. Addition of molecular tags
The cfDNA was first heat-denatured at 95 °C for 5 minutes, then quickly placed on ice, and then subjected to the enzymatic reaction. 5-10 random bases were added to the 3' end of the DNA by terminal transferase (cat. no. M0315S, NEB, USA).
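To give a sense of scale for the random-base tag, the following sketch compares the number of distinct tags available with 5 and 10 random bases against a rough estimate of the number of genome copies in the 10 ng input stated for this example. The figure of about 3.3 pg per haploid human genome is a commonly used approximation and is an assumption here, not a value from the patent; the function name is hypothetical.

```python
def tag_space(n_random_bases):
    """Number of distinct tags with n random bases (4 choices per position)."""
    return 4 ** n_random_bases


# Assumption: one haploid human genome weighs roughly 3.3 pg.
HAPLOID_GENOME_PG = 3.3
# 10 ng input, as stated for the cfDNA standard in this example.
INPUT_PG = 10_000

genome_copies = round(INPUT_PG / HAPLOID_GENOME_PG)

for n in (5, 10):
    print(f"{n} random bases: {tag_space(n)} possible tags "
          f"vs about {genome_copies} genome copies per locus")
```

Under these assumptions a 10-base tag offers far more combinations than there are template copies at any one locus, whereas a 5-base tag offers fewer, so shorter tags would need to be combined with other information (for example the fragment end position) to tell molecules apart.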
The reaction system is shown in table 12 below:
TABLE 12
Figure PCTCN2017093914-APPB-000011
Figure PCTCN2017093914-APPB-000012
The reaction was incubated at 37 °C for 30 minutes and then stopped by adding 10 μl of 0.5 M EDTA. A 1.8× volume of Agencourt AMPure XP magnetic beads (Beckman Coulter, USA) was added, the mixture was purified according to the manufacturer's instructions, and the DNA was then dissolved in 34 μl of distilled water.
2. Addition of oligonucleotide tails
The reaction system is shown in table 13 below:
Table 13
Figure PCTCN2017093914-APPB-000013
The reaction was incubated at 37 °C for 30 minutes and then stopped by adding 10 μl of 0.5 M EDTA. A 1.8× volume of Agencourt AMPure XP magnetic beads (Beckman Coulter, USA) was added, the mixture was purified according to the manufacturer's instructions, and the DNA was then dissolved in 20 μl of distilled water.
3. First round PCR amplification
The PCR enzyme was NEB Q5 Hot Start High-Fidelity 2X Master Mix (cat. no. M0494L).
The reaction system is shown in table 14 below:
TABLE 14
Figure PCTCN2017093914-APPB-000015
The upstream specific primer pool is shown in Table 15, and universal primer 1 is shown in Table 4.
Table 15
Figure PCTCN2017093914-APPB-000016
The pool of upstream specific primers consisted of an equimolar mixture of primers as described in Table 15.
The amplification system is shown in table 16 below:
TABLE 16
Figure PCTCN2017093914-APPB-000017
A 1.8× volume of Agencourt AMPure XP magnetic beads (Beckman Coulter, USA) was added, purification was performed according to the manufacturer's instructions, and the DNA was then dissolved in 20 μl of distilled water.
4. Second round of PCR amplification:
The PCR reaction system is shown in Table 17 below:
TABLE 17
Figure PCTCN2017093914-APPB-000018
The downstream specific primer pool is shown in Table 18, and universal primer 2 and the sequencing tag primer are shown in Table 4.
Table 18
Figure PCTCN2017093914-APPB-000019
The pool of downstream specific primers consisted of an equimolar mixture of primers as shown in Table 18.
The amplification system is shown in Table 19 below:
Table 19
Figure PCTCN2017093914-APPB-000020
Figure PCTCN2017093914-APPB-000021
50 μl (1× volume) of Agencourt AMPure XP magnetic beads (Beckman Coulter, USA) was added, the mixture was purified according to the manufacturer's instructions, and the DNA was then dissolved in 30 μl of distilled water.
5. Library quality inspection
The results of the detection of the target enrichment library using Agilent 2100 are shown in FIG. 6.
6. Sequencing on machine
After the library passed quality inspection, it was sequenced on a BGISEQ-500 platform with 50 bp paired-end reads. After data conversion and quality filtering, the resulting sequencing data were analyzed as described below.
7. Information analysis
First, adapters were removed from the data to obtain paired-end reads. The read from one end was aligned to the genome (reference genome hg19). For the read from the other end, the run of consecutive G bases was removed, and the 10 bases immediately following the removed run were extracted as a molecular tag that labels the sequence at the front end. Basic statistics, such as the proportion of data aligned to the genome, were computed (Table 20), and the depth of each target region was counted according to its position to obtain target-region uniformity information (Figures 7-8); Figure 7 shows the sequencing depth of the raw data, and Figure 8 shows the sequencing depth after read deduplication. The results show good uniformity. The depth and the proportions of the four bases at each target site were counted; duplicates, sequencing errors, and PCR errors were removed using the molecular tags; and the mutation information of the target sites was obtained by a specific information analysis algorithm, with the results shown in Table 21. Table 22 shows the consistency of the detection results. Figure 9 shows the mutation detection consistency information, indicating that the mutation information detected at the target sites agrees with expectation.
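The tag-extraction step described above (remove the consecutive G bases, then take the next 10 bases as the molecular tag) can be sketched as follows. This is a minimal illustration, not the pipeline used in the patent; the function name, the return convention, and the example read are hypothetical.

```python
def extract_molecular_tag(read, tag_length=10, tail_base="G"):
    """Strip the leading run of `tail_base` from a read, then split off the
    next `tag_length` bases as the molecular tag.

    Returns (tag, remainder), or None when the read does not start with the
    tail base or when fewer than `tag_length` bases remain after stripping.
    """
    read = read.upper()
    stripped = read.lstrip(tail_base)
    if len(stripped) == len(read):
        return None  # no leading tail found
    if len(stripped) < tag_length:
        return None  # read too short to carry a full tag
    return stripped[:tag_length], stripped[tag_length:]


# Hypothetical read: 12 G bases, a 10-base tag, then template sequence.
example_read = "G" * 12 + "ACGTTCAGCA" + "TTGACC"
print(extract_molecular_tag(example_read))
```

One limitation of this simple approach is worth noting: if the random tag itself begins with G, those bases are stripped together with the tail and the extracted tag is shifted by that many positions.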
Table 20: Sequencing data statistics
Figure PCTCN2017093914-APPB-000022
Table 21: Detection results
Figure PCTCN2017093914-APPB-000023
Figure PCTCN2017093914-APPB-000024
Figure PCTCN2017093914-APPB-000025
Table 22: consistency of detection results
Mutation site Original mutation Corrected mutation Mutation of standard
V600E 0.60% 0.00% 0.00%
D816V 0.14% 0.00% 0.00%
G719S 0.20% 0.00% 0.00%
T790M 0.11% 0.08% 0.10%
L858R 0.47% 0.10% 0.10%
ΔE746-A750 0.30% 0.15% 0.10%
G12D 0.07% 0.12% 0.13%
G13D 0.30% 0.00% 0.00%
Q61K 0.14% 0.11% 0.13%
E545K 0.20% 0.15% 0.13%
H1047R 0.11% 0.00% 0.00%
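The agreement between the corrected and standard frequencies in Table 22 can be checked with a few lines of arithmetic. The sketch below copies the table values, stores them in hundredths of a percent so the subtraction is exact, and reports the absolute deviation per site. The variable and function names are hypothetical.

```python
# Frequencies from Table 22 in hundredths of a percent (0.08% -> 8),
# stored as integers so the arithmetic is exact.
TABLE_22 = {
    # site: (raw, corrected, standard)
    "V600E": (60, 0, 0),
    "D816V": (14, 0, 0),
    "G719S": (20, 0, 0),
    "T790M": (11, 8, 10),
    "L858R": (47, 10, 10),
    "ΔE746-A750": (30, 15, 10),
    "G12D": (7, 12, 13),
    "G13D": (30, 0, 0),
    "Q61K": (14, 11, 13),
    "E545K": (20, 15, 13),
    "H1047R": (11, 0, 0),
}


def deviations(table):
    """Absolute difference between corrected and standard frequency per site."""
    return {
        site: abs(corrected - standard)
        for site, (_raw, corrected, standard) in table.items()
    }


for site, dev in deviations(TABLE_22).items():
    print(f"{site}: {dev / 100:.2f}%")
```

With the values in the table, the largest deviation is 0.05% (ΔE746-A750: 0.15% corrected versus 0.10% in the standard); every other site deviates by 0.02% or less.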
Conclusion: In the standard, V600E, D816V, G719S, G13D, and H1047R are all negative (mutation frequency 0.00%), and the results obtained in this example are likewise negative (0.00%). In the standard, T790M, L858R, ΔE746-A750, G12D, Q61K, and E545K are positive, with mutation frequencies of 0.10%, 0.10%, 0.10%, 0.13%, 0.13%, and 0.13%, respectively; the frequencies detected by this method are 0.08%, 0.10%, 0.15%, 0.12%, 0.11%, and 0.15%, respectively. The detected values are within ±0.02% of the actual values for every site except ΔE746-A750 (0.15% detected versus 0.10% expected), which indicates that the method can accurately detect mutations at frequencies as low as 0.10%. Correction with the molecular tag (UID) removes duplicates and sequencing errors, reducing the sequencing background from 0.60% to 0.00% (V600E has the largest error value before correction, so the error rate of the method can be reduced from 0.60% to 0.00%).
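The idea behind molecular-tag (UID) correction can be sketched as follows: reads that share a tag are treated as copies of one original molecule, a consensus base is called per tag family, and allele counts are taken over families rather than over raw reads. The patent refers only to "a specific information analysis algorithm" without describing it, so this is a generic illustration and not the patent's method; the function name, the thresholds, and the example observations are hypothetical, and a real pipeline would typically also group by alignment position.

```python
from collections import Counter, defaultdict


def consensus_base_counts(observations, min_family_size=2, min_agreement=0.8):
    """Collapse reads into tag families and count one consensus base per family.

    `observations` is an iterable of (molecular_tag, base) pairs, all observed
    at the same target site. Families with fewer than `min_family_size` reads,
    or whose most common base falls below `min_agreement`, are discarded.
    """
    families = defaultdict(list)
    for tag, base in observations:
        families[tag].append(base)

    counts = Counter()
    for bases in families.values():
        if len(bases) < min_family_size:
            continue
        base, n = Counter(bases).most_common(1)[0]
        if n / len(bases) >= min_agreement:
            counts[base] += 1
    return counts


# Hypothetical observations at one site (reference base T, variant base A).
example = (
    [("AAAACCCCGG", "T")] * 5                              # clean wild-type family
    + [("TTGACCATGA", "T")] * 4 + [("TTGACCATGA", "A")]    # one sequencing error
    + [("CCGTAGGTCA", "A")] * 3                            # true variant family
    + [("GATTACAGAT", "A")]                                # singleton, discarded
)

print(Counter(base for _tag, base in example))  # raw read-level counts
print(consensus_base_counts(example))           # family-level counts
```

In this made-up example the isolated A inside the second family is outvoted by its four T reads, and the singleton is dropped, so only the family in which every read carries A contributes a variant count.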
As can be seen from Figure 9, the mutation information detected at the target sites is consistent with the expected values: no mutation was detected at the negative sites, the mutations at the positive sites were detected, and the detected frequencies differ little from the expected values. The method can therefore detect mutations at frequencies as low as 0.10% in plasma cell-free DNA.
The present invention has been described in terms of specific examples, which are provided to aid understanding of the invention and are not intended to be limiting. For a person skilled in the art to which the invention pertains, several simple deductions, modifications or substitutions may be made according to the idea of the invention.

Claims (24)

  1. A targeted enrichment method for detecting low frequency mutations, comprising:
    under the action of terminal transferase, adding a single-nucleotide tail with single base at the 3' end of each strand of the single-stranded DNA and/or the double-stranded DNA;
    performing a first PCR amplification on a region where a mutation target site is located by using an upstream specific primer and a first universal primer, wherein the upstream specific primer takes a target sequence complementary to the upstream specific primer as an anchoring site, the first universal primer takes the single nucleotide tail as an anchoring site, and the first universal primer comprises a sequencing primer sequence 1 at the 5 'end and a continuous single base complementary to the single nucleotide tail at the 3' end;
    and carrying out second PCR amplification on the product of the first PCR amplification by using a downstream specific primer and a second universal primer, wherein the downstream specific primer is positioned at the downstream of the upstream specific primer and carries a sequencing primer sequence 2 at the 5 'end, and the second universal primer comprises a sequencing primer sequence 1 at the 5' end of the first universal primer.
  2. The method of claim 1, wherein a sequencing tag primer is further added to the second PCR amplification, and the 3 'end of the sequencing tag primer comprises a sequencing primer sequence 2 at the 5' end of the downstream specific primer.
  3. The method of claim 1, further comprising: before adding the single-nucleotide tail, adding random bases to the 3' end of each strand of the single-stranded DNA and/or the double-stranded DNA under the action of terminal transferase.
  4. The method of claim 3, wherein the random bases are 5-10 bases in length.
  5. The method of claim 1, wherein the single-nucleotide tail is 15-30 bases in length.
  6. The method of claim 1, wherein the upstream specific primer is 25-150bp away from the mutation target site.
  7. The method of claim 1, wherein the first PCR amplification is performed for 10-20 cycles.
  8. The method of claim 1, wherein the continuous single base is 11-15 consecutive C or G bases, or 25-35 consecutive A or T bases.
  9. The method of claim 1, wherein the first universal primer further comprises a degenerate base after the continuous single base.
  10. The method of claim 9, wherein the degenerate base is H, V, B or D.
  11. The method of claim 1, wherein the second PCR amplification is performed for 15-30 cycles.
  12. The method of claim 1, wherein the mutation target sites comprise single nucleotide polymorphisms, insertions and deletions, and copy number variations.
  13. A targeted enrichment kit for detecting low frequency mutations, comprising:
    a terminal transferase and a nucleotide of a single base type, for adding a single-nucleotide tail of a single base to the 3' end of each strand of the single-stranded DNA and/or the double-stranded DNA under the action of the terminal transferase;
    an upstream specific primer and a first universal primer, which are used for carrying out first PCR amplification on a region where a mutation target site is located, wherein the upstream specific primer takes a target sequence complementary to the upstream specific primer as an anchoring site, the first universal primer takes the single nucleotide tail as an anchoring site, and the first universal primer comprises a sequencing primer sequence 1 at the 5 'end and a continuous single base complementary to the single nucleotide tail at the 3' end;
    and the downstream specific primer and the second universal primer are used for carrying out second PCR amplification on the product of the first PCR amplification, wherein the downstream specific primer is positioned at the downstream of the upstream specific primer, the 5 'end of the downstream specific primer is provided with a sequencing primer sequence 2, and the second universal primer comprises a sequencing primer sequence 1 at the 5' end of the first universal primer.
  14. The kit of claim 13, further comprising a sequencing tag primer for performing a second PCR amplification on the first PCR amplified product, wherein the 3 'end of the sequencing tag primer comprises a sequencing primer sequence 2 at the 5' end of the downstream specific primer.
  15. The kit of claim 13, further comprising: mixed nucleotides for adding random bases to the 3' end of each strand of the single-stranded DNA and/or the double-stranded DNA under the action of terminal transferase before the single-nucleotide tail is added.
  16. The kit of claim 15, wherein the random bases are 5-10 bases in length.
  17. The kit of claim 13, wherein the single-nucleotide tail is 15-30 bases in length.
  18. The kit of claim 13, wherein the upstream specific primer is 25-150bp away from the mutation target site.
  19. The kit of claim 13, wherein the first PCR amplification is performed for 10-20 cycles.
  20. The kit of claim 13, wherein the continuous single base is 11-15 consecutive C or G bases, or 25-35 consecutive A or T bases.
  21. The kit of claim 13, wherein the first universal primer further comprises a degenerate base after the continuous single base.
  22. The kit of claim 21, wherein the degenerate base is H, V, B or D.
  23. The kit of claim 13, wherein the second PCR amplification is performed for 15-30 cycles.
  24. The kit of claim 13, wherein the mutation target sites comprise single nucleotide polymorphisms, insertions and deletions, and copy number variations.
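The structure of the first universal primer recited in claims 8-10 and 20-22 (sequencing primer sequence 1 at the 5' end, a run of one base, then a degenerate base) can be illustrated with a short sketch. The length limits come from the claims and the IUPAC meanings of H, V, B, and D are standard. Everything else is an assumption made for illustration: the claims do not say which degenerate base goes with which run, so the sketch pairs each run with the code that excludes the run's own base, and the sequencing primer sequence shown is a made-up placeholder, not a sequence from the patent.

```python
# Standard IUPAC meanings: each code stands for "any base except one".
IUPAC = {"H": "ACT", "V": "ACG", "B": "CGT", "D": "AGT"}

# Assumption: choose the degenerate code that excludes the run base,
# so the primer's 3' end cannot simply extend the run.
CODE_EXCLUDING = {"G": "H", "T": "V", "A": "B", "C": "D"}

# Run-length limits taken from claims 8 and 20.
RUN_LIMITS = {"C": (11, 15), "G": (11, 15), "A": (25, 35), "T": (25, 35)}


def build_first_universal_primer(seq_primer_1, run_base, run_length):
    """Return sequencing primer 1 + homopolymer run + one degenerate base."""
    low, high = RUN_LIMITS[run_base]
    if not low <= run_length <= high:
        raise ValueError(f"run of {run_base} must be {low}-{high} bases long")
    return seq_primer_1 + run_base * run_length + CODE_EXCLUDING[run_base]


def expand_degenerate_end(primer):
    """List the concrete sequences encoded by a trailing degenerate base."""
    code = primer[-1]
    return [primer[:-1] + base for base in IUPAC.get(code, code)]


# Placeholder sequence, purely hypothetical.
PLACEHOLDER_SEQ_PRIMER_1 = "ACGTACGTACGTACGTACGT"

primer = build_first_universal_primer(PLACEHOLDER_SEQ_PRIMER_1, "G", 12)
print(primer)
for concrete in expand_degenerate_end(primer):
    print(concrete)
```

A primer ordered with a degenerate 3' base is in practice a mixture of the concrete sequences listed by `expand_degenerate_end`.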
CN201780091041.8A 2017-07-21 2017-07-21 Targeted enrichment method and kit for detecting low-frequency mutation Pending CN110651050A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/093914 WO2019014936A1 (en) 2017-07-21 2017-07-21 Targeted enrichment method and kit for detecting low-frequency mutation

Publications (1)

Publication Number Publication Date
CN110651050A true CN110651050A (en) 2020-01-03

Family

ID=65014966

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780091041.8A Pending CN110651050A (en) 2017-07-21 2017-07-21 Targeted enrichment method and kit for detecting low-frequency mutation

Country Status (2)

Country Link
CN (1) CN110651050A (en)
WO (1) WO2019014936A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160115532A1 (en) * 2012-08-10 2016-04-28 Sequenta, Inc. High sensitivity mutation detection using sequence tags
CN105861724A (en) * 2016-06-03 2016-08-17 人和未来生物科技(长沙)有限公司 KRAS gene ultralow frequency mutation detection kit
CN106192018A (en) * 2015-05-07 2016-12-07 深圳华大基因研究院 A kind of method of grappling Nest multiplex PCR enrichment DNA target area and test kit
CN106676182A (en) * 2017-02-07 2017-05-17 北京诺禾致源科技股份有限公司 Low-frequency gene fusion detection method and device

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ORIYA VARDI: "Biases in the SMART-DNA library preparation method associated with genomic poly dA/dTsequences", 《PLOS ONE》, vol. 12, no. 2, pages 1 - 14 *
VARDI O.: "Biases in the SMART-DNA library preparation method associated with genomic poly dA/dT sequences", 《PLOS ONE》 *
VARDI O.: "Biases in the SMART-DNA library preparation method associated with genomic poly dA/dT sequences", 《PLOS ONE》, vol. 12, no. 2, 28 February 2017 (2017-02-28), pages 1 - 14, XP055562515, DOI: 10.1371/journal.pone.0172769 *
ZHENG ZL: "Anchored multiplex PCR for targeted next-generation sequencing", 《NATURE MEDICINE》, vol. 20, no. 12, 31 December 2014 (2014-12-31), pages 1479 - 1486 *
瞿礼嘉: "《医学生物化学与分子生物学实验》", 医学常用实验技术精编, pages: 158 - 162 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112359101A (en) * 2020-11-13 2021-02-12 苏州金唯智生物科技有限公司 Method for cross contamination of quality testing oligonucleotide
CN112359101B (en) * 2020-11-13 2023-10-03 苏州金唯智生物科技有限公司 Method for cross contamination of quality inspection oligonucleotides
CN114317696A (en) * 2021-12-24 2022-04-12 深圳裕康医学检验实验室 Kit, library construction method thereof and pollution detection method
CN117343929A (en) * 2023-12-06 2024-01-05 广州迈景基因医学科技有限公司 PCR random primer and method for enhancing targeted enrichment by using same
CN117343929B (en) * 2023-12-06 2024-04-05 广州迈景基因医学科技有限公司 PCR random primer and method for enhancing targeted enrichment by using same

Also Published As

Publication number Publication date
WO2019014936A1 (en) 2019-01-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination