WO2019114146A1 - Gene target region enrichment method and library construction kit - Google Patents

Gene target region enrichment method and library construction kit

Info

Publication number
WO2019114146A1
WO2019114146A1 PCT/CN2018/080229 CN2018080229W WO2019114146A1 WO 2019114146 A1 WO2019114146 A1 WO 2019114146A1 CN 2018080229 W CN2018080229 W CN 2018080229W WO 2019114146 A1 WO2019114146 A1 WO 2019114146A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequencing
dna
sequence
stranded
double
Application number
PCT/CN2018/080229
Inventor
杨国华
李英辉
林健
汤泽源
Original Assignee
格诺思博生物科技南通有限公司
上海格诺生物科技有限公司
Application filed by 格诺思博生物科技南通有限公司 and 上海格诺生物科技有限公司
Publication of WO2019114146A1

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 - Nucleic acid amplification reactions
    • C12Q1/686 - Polymerase chain reaction [PCR]

Definitions

  • the invention belongs to the technical field of molecular biology, and in particular, the invention relates to a gene target region enrichment method and a library building kit mainly used for second generation sequencing.
  • Target region sequencing is also known as targeted sequencing.
  • the core of sequencing of the target region is to enrich and sequence the genes of interest.
  • the two existing enrichment methods include multiplex PCR and hybrid capture.
  • the multiplex PCR method is represented by Life Technologies' Ampliseq method, and a pair of PCR primers are designed for each site, and multiplex PCR is used to enrich the gene or site of interest.
  • Hybrid capture methods usually design probes of about 120 bases for the target genes and enrich the genes of interest by hybridization on a solid-phase carrier or in the liquid phase. This approach can detect, at the DNA level, common point mutations, insertion mutations, microdeletion mutations, and known and unknown fusion mutations. For efficient enrichment, hybrid capture requires a large number of long probes and a long hybridization period (usually more than 24 hours), and because hybridization efficiency is low, it requires a large amount of DNA template.
  • the object of the present invention is to provide a gene target region enrichment method and a library building kit.
  • a method of performing target region enrichment comprising:
  • the specific probe includes sequencing adaptor 2 and a sequence complementary to the DNA fragment containing the target region to be enriched;
  • the method of enrichment of the target region is applied to high throughput sequencing.
  • the target region comprises: a site in which the sequence is mutated, for example, a SNP mutation site region, a base deletion site region, a base insertion site region, and a fusion mutation site region.
  • the DNA polymerase is a high fidelity DNA polymerase.
  • the method is a non-diagnostic and non-therapeutic method.
  • the method is applied to high-throughput detection, wherein in one test, the DNA fragment containing the target region to be enriched is one or more (2 to 1000, such as 5, 10, 15, 20, 30, 50, 80, 100, 200, 500, 800), so that there are also one or more corresponding sequencing probes.
  • the DNA capable of forming a partially double-stranded structure may form a "stem-loop" structure, a "double-loop" structure or a "Y-type" structure, or may be an ordinary double-stranded DNA containing a palindromic sequence.
  • the adaptor ligation in step (1) is a single-strand ligation: a double-stranded ligase ligates the adaptor to the 5' end of the phosphorylated DNA fragment containing the target region to be enriched; preferably, the double-stranded ligase includes T4 DNA ligase.
  • the DNA capable of forming a partially double-stranded structure comprises: a sequence recognizable by a sequencing system, a sample tag sequence; preferably, further comprising: a molecular tag sequence.
  • the sequence recognizable by the sequencing system, the sample tag sequence and/or the molecular tag sequence are located in the portion of the DNA capable of forming a partially double-stranded structure that cannot form a double strand (e.g., in the "loop" of the "stem-loop" structure).
  • steps (1), (2) and (3) each further include a step of purifying the ligation product, the extension product and the amplification product, respectively.
  • in step (2), one end of sequencing adaptor 2 is joined to the sequence complementary to the DNA fragment containing the target region to be enriched, and the other end carries the purification tag; alternatively, the sequence extension is performed with dNTPs labeled with a purification tag; preferably, the purification tag is a tag capable of solid-phase purification, and solid-phase purification is performed using a solid-phase purification system;
  • in step (1), (2) or (3), the sample labeled with the sample tag is purified and separated.
  • sequencing adaptor 2 is located at the 5' end of the specific probe, and the sequence complementary to the DNA fragment containing the target region to be enriched is located at the 3' end of the specific probe.
  • a purification tag is attached to the 5' end of sequencing adaptor 2.
  • the purification tag is biotin
  • the solid phase purification system for solid phase purification is avidin-coated magnetic beads or streptavidin-coated magnetic beads.
  • the extension comprises a single round or multiple cycles of denaturation, annealing and extension; preferably, multiple cycles are performed (for example, linear amplification for 2 to 45 cycles, such as 5, 10, 15, 20, 25, 30 or 35 cycles).
  • the PCR amplification is exponential PCR amplification (for example, performing PCR amplification of 10 to 20 cycles).
  • in step (3), PCR amplification is performed using a forward primer and a reverse primer; the reverse primer is identical or complementary to the sequence recognized by the sequencing system, and the forward primer is identical or complementary to the sequence of sequencing adaptor 2.
  • sequencing is performed using an Ion Torrent sequencing system; the sequence of sequencing adaptor 2 is the Ion Torrent P1 sequence, and the sequence recognizable by the sequencing system is the Ion Torrent A sequence.
  • after step (3), the method further comprises: sequencing by recognizing sequencing adaptor 1 (in particular the sequence therein recognizable by the sequencing system, such as the A sequence) and sequencing adaptor 2 in the amplification product obtained in step (3), to obtain the sequencing result of the target region.
  • some or all bases of sequencing adaptor 1 and sequencing adaptor 2 are modified, or are unmodified; preferably, the modifications include, but are not limited to, deoxyuracil (dU) modification and thio modification.
  • the DNA fragment containing the target region to be enriched is double-stranded DNA, including but not limited to genomic DNA, complementary DNA (cDNA).
  • kits for performing enrichment of a target region comprising:
  • sequencing adaptor 1, which is a DNA capable of forming a partially double-stranded structure
  • a specific probe comprising sequencing adaptor 2 and a sequence complementary to a DNA fragment comprising a target region to be enriched
  • the kit further comprises one or more reagents selected from the group consisting of DNA polymerase, double-stranded ligase, dNTPs.
  • the target region is an EGFR exon 18 G719X (A/S/C) mutation, EGFR 19Del, EGFR exon 20 T790M, EGFR exon 21 L858R, BRAF V600E, KRAS G12D, KRAS Q61H and/or ALK intron 19 fusion mutation
  • the nucleotide sequence of sequencing adaptor 1 is as shown in SEQ ID NO: 1, 6, 7, 8, 9, 10, 11, 22 or 23, and the nucleotide sequence of the specific probe is as shown in one or more of SEQ ID NOs: 2, 3, 12-21 and/or 24-39,
  • the nucleotide sequences of the forward primer and the reverse primer are shown in SEQ ID NO: 4 and SEQ ID NO: 5.
  • Figure 1 Flow chart of library construction.
  • Figure 7 EML4-ALK fusion in the H2228 cell line in Example 3, verified by first-generation sequencing.
  • Figure 8 EML4-ALK fusion in the lung cancer patient in Example 3, verified by first-generation sequencing.
  • the inventors have intensively studied and proposed a novel gene target region enrichment method, which is suitable for high-throughput sequencing.
  • the method of the present invention utilizes the fidelity of a high-fidelity DNA polymerase to achieve enrichment of a target region of a gene by linear amplification of a single primer.
  • the method of the invention can efficiently and accurately enrich the gene of interest and significantly reduce sequencing errors generated during library construction.
  • a "DNA fragment comprising a target region to be enriched" refers to a stretch of double-stranded DNA sequence, including but not limited to genomic DNA, complementary DNA (cDNA), etc., which contains a "target region" of interest.
  • the "target region" includes: a site in which the sequence is mutated, for example, a SNP mutation site region, a base deletion site region, a base insertion site region, and a fusion mutation site region.
  • the "target region" may be a region closely related to the occurrence of a disease, or a region of interest to those skilled in the art for other research purposes.
  • for example: the L858R mutation in exon 21 of the human EGFR gene, the E746_A750DEL-1 deletion mutation in exon 19 of the human EGFR gene, the A763_Y764insFQEA insertion mutation in exon 20 of the human EGFR gene, and the human EML4-ALK gene fusion mutation.
  • a "complementary" sequence generally refers to the sequence obtained by first rewriting a sequence given in the 5'-3' direction in its 3'-5' direction (e.g., 5'ATCG3' → GCTA) and then taking the complement of the result (e.g., GCTA → 5'CGAT3').
  • a "stem-loop" structure is also referred to as a "hairpin" structure.
  • it refers to a single-stranded nucleotide molecule that can form a secondary structure comprising a double-stranded region (stem); the double-stranded region is formed by two regions of the nucleotide molecule (located on the same molecule), and the two regions lie on either side of the double-stranded portion; the structure also includes at least one "loop", comprising non-complementary nucleotides, i.e., a single-stranded region.
  • a "double-loop" structure refers to a double-stranded nucleotide molecule whose two ends are complementary and form double-stranded regions, and whose middle contains non-complementary nucleotides that form a single-stranded "loop".
  • a "Y-type" structure refers to a double-stranded nucleotide molecule in which one end is complementary and forms a double-stranded region, and the other end is non-complementary and forms a single-stranded "forked" region.
  • the present invention provides a method for enrichment of a target region, the method comprising: (1) ligating sequencing adaptor 1, which is a DNA capable of forming a partially double-stranded structure, to a DNA fragment comprising the target region to be enriched, to obtain a ligation product; (2) adding a specific probe and a high-fidelity DNA polymerase to the ligation product of (1), and performing sequence extension on the basis of the complementarity between the probe and the ligation product, to obtain an extension product;
  • wherein the specific probe includes sequencing adaptor 2 and a sequence complementary to the DNA fragment containing the target region to be enriched; (3) amplifying the extension product of (2) by PCR with a high-fidelity DNA polymerase, to obtain an amplification product containing DNA enriched for the target region.
  • the DNA capable of forming a partially double-stranded structure may form a "stem-loop" structure, a "double-loop" structure or a "Y-type" structure, or may be an ordinary double-stranded DNA containing a palindromic sequence.
  • the adaptor ligation in step (1) is a single-strand ligation: a double-stranded ligase ligates the adaptor to the 5' end of the phosphorylated DNA fragment containing the target region to be enriched.
  • using such a DNA capable of forming a partially double-stranded structure as the adaptor yields ligation products with a single orientation and avoids the interference caused by ligation in different orientations.
  • the DNA capable of forming a partially double-stranded structure further comprises: a sequence recognizable by a sequencing system (e.g., the A sequence for the Ion Torrent sequencing system), a sample tag sequence, and optionally also a molecular tag sequence.
  • the sample tag sequence allows sequences from different samples to be distinguished in the subsequent bioinformatic analysis, so that multiple samples can be sequenced simultaneously in a single reaction.
  • the sequence recognizable by the sequencing system facilitates subsequent capture and sequencing by high-throughput sequencing systems.
  • steps (1), (2) and (3) further include steps of purifying the ligation product, the extension product and the amplification product, respectively, and the samples labeled with the sample tag are purified and separated.
  • one end of sequencing adaptor 2 is joined to the sequence complementary to the DNA fragment containing the target region to be enriched, and the other end carries the purification tag; alternatively, the extension is performed with dNTPs labeled with a purification tag; preferably, the purification tag is a tag capable of solid-phase purification, and solid-phase purification is performed using a solid-phase purification system.
  • the purification tag is biotin
  • the solid phase purification system for solid phase purification is avidin-coated magnetic beads or streptavidin-coated magnetic beads.
  • the extension comprises one or more cycles of denaturation, annealing and extension.
  • preferably, multiple cycles of denaturation, annealing and extension are performed, for example linear amplification for 2 to 45 cycles.
  • the PCR amplification is exponential PCR amplification. For example, PCR amplification of 10 to 20 cycles is performed. PCR amplification was performed using a forward primer and a reverse primer.
  • the method further comprises: sequencing by recognizing sequencing adaptor 1 (in particular the sequence therein recognizable by the sequencing system, such as the A sequence) and sequencing adaptor 2 in the amplification product obtained in step (3), to obtain the sequencing result of the target region.
  • the sequencing adaptor used in the present invention may be unmodified, or may be a modified adaptor obtained by means such as nucleic acid backbone modification; the modification does not substantially change the binding properties of the oligonucleotide molecule, and modifications that increase the stability of the oligonucleotide molecule are preferred.
  • the modification is a dU modification, a thio modification, or an alkyl modification at the 2' position of the ribose. It will be understood that any modification capable of maintaining the binding properties of the oligonucleotide molecule is encompassed by the present invention.
  • thio modification replaces an oxygen atom on a phosphate bond of the DNA backbone with a sulfur atom; the thio modification may be applied to all of the phosphate bonds or to only some of them.
  • the thio modification can greatly enhance the stability of the oligonucleotide molecule, thereby facilitating accurate detection results.
  • Deoxyuracil can be inserted into the oligonucleotide to increase the melting temperature of the double strand to increase the stability of the duplex.
  • the DNA fragment is first ligated to the sequencing adaptor by a ligase; the sequencing adaptor may comprise a sequence recognizable by the sequencing system, a sample tag sequence, a molecular tag sequence, and the like.
  • the ligation product is then mixed with the probe, and a thermostable high-fidelity DNA polymerase is added.
  • through the steps of high-temperature denaturation, annealing and extension, the probe anneals to the target DNA molecule to form a double strand, and the DNA polymerase specifically incorporates dNTPs and extends the probe; multiple cycles of denaturation, annealing and extension enrich the target gene efficiently, and a streptavidin-biotin purification system can then be used to enrich and purify the target region efficiently.
  • the present invention can enrich any DNA fragment to which a sequencing adaptor has been ligated; the adaptor may be an adaptor for the Ion Torrent sequencing platform or an adaptor for the Illumina sequencing platform.
  • the methods of the present invention are also applicable to other sequencing platforms in accordance with the principles of the present invention.
  • the method of the invention achieves effective enrichment with only one oligonucleotide probe designed for each site; it overcomes the difficulty that PCR methods have in designing primers for short nucleic acid fragments, and can effectively enrich short nucleic acid fragments (for example, plasma free DNA, FFPE tissue sample DNA, etc.).
  • another advantage of the present invention is that the target region is enriched by extension of a specific probe, which shortens the procedure (for example, in the example in which the EGFR mutation site regions are enriched, this step takes only about one hour) and saves a great deal of time compared with capture by probe hybridization.
  • the library structurally comprises the following sequence portions: a 5'-end sequencing adaptor, a gene-specific probe, the enriched target region, a sample tag, a molecular tag (optional) and a 3'-end sequencing adaptor (Figure 1).
  • the enriched target region carries the mutation information of the gene. The 5' end of this portion has a fixed position on the genome (determined by the gene-specific probe), whereas its 3' end is not fixed (Figure 3, Figure 4) and is determined by how the DNA was fragmented at the start of library construction. During data analysis, the genomic position of the 3' end of a sequence can therefore serve as a molecular tag and effectively reduce background noise. Combining it with the molecular tags reduces background noise further and improves the sensitivity and accuracy of detection.
  • the present invention also provides a kit for performing enrichment of a target region, the kit comprising: sequencing adaptor 1, which is a DNA capable of forming a partially double-stranded structure; a specific probe comprising sequencing adaptor 2 and a sequence complementary to a DNA fragment comprising the target region to be enriched; and a forward primer and a reverse primer, which have sequences complementary to sequences in sequencing adaptor 1 (in particular the sequence therein recognizable by the sequencing system, such as the A sequence) and in sequencing adaptor 2.
  • the kit may further comprise: a high-fidelity DNA polymerase, a double-stranded ligase, dNTPs, and the like.
  • the kit may also include instructions for use, wherein the method of performing seamless DNA assembly of the present invention is described to facilitate application by those skilled in the art.
  • single-strand ligation of the adaptor avoids ligation of the adaptor in the reverse orientation and improves ligation efficiency
  • the 3' end of the target region and the molecular tag reduce background noise during analysis and improve detection accuracy.
  • oligonucleotide sequences used in this example are shown in Table 1.
  • adaptor 1 contains sequencing adaptor 1, here the A sequence of the Ion Torrent sequencing system; the underlined portion is the sample tag sequence; the "N" portion is the molecular tag sequence (NNNNNN is a random sequence used to distinguish different fragments in the same sample; GATCGC is a Barcode adapter sequence used for quality control);
  • probe 1 and probe 2 contain sequencing adaptor 2 (the italicized bases, here the P1 sequence of the Ion Torrent sequencing system, biotin-modified at the 5' end); the underlined portion is the target region-specific sequence. Probe 1 targets the L858R mutation in exon 21 of the EGFR gene, and probe 2 targets the deletion mutation in exon 19 of the EGFR gene.
  • sample preparation: genomic DNA was extracted with commercial kits from healthy human oral exfoliated cells, the NCI-H1975 cell line (which carries the L858R mutation in exon 21 of the EGFR gene) and the NCI-H1650 cell line (which carries the deletion mutation in exon 19 of the EGFR gene). The DNA samples were quantified by spectrophotometer, NCI-H1975 and NCI-H1650 cell line DNA was spiked into the DNA of healthy human oral exfoliated cells at a ratio of 1%, and the DNA was sheared to approximately 200 bp with a Covaris ultrasonic DNA shearer for later use.
  • the end-repair reaction system was prepared according to Table 2.
  • Component                               Volume
    DNA sample                              60 ul
    10× T4 polynucleotide kinase buffer     10 ul
    dNTP (10 mM each)                       2 ul
    T4 DNA polymerase                       2 ul
    T4 polynucleotide kinase                2 ul
    Klenow fragment                         2 ul
    ddH2O                                   22 ul
    Total volume                            100 ul
  • the ligation reaction system was prepared according to Table 3.
  • Component                               Volume
    Ligation buffer                         10 ul
    T4 DNA ligase                           2 ul
    Adaptor (10 µM)                         2 ul
    ddH2O                                   36 ul
    Total volume                            100 ul
  • the ligation product was purified by 150 ul of Agencourt AMPure magnetic beads, and the purified product was eluted in 50 ul of elution buffer, thereby obtaining a DNA library labeled with a sample tag.
  • the probe extension reaction system was prepared according to Table 4.
  • reaction product was purified by streptavidin-coated magnetic beads and finally dissolved in 40 ul of elution buffer.
  • Step 5 Library amplification
  • the library amplification reaction system was prepared according to Table 6.
  • the PCR product was purified by using 80 ul of Agencourt AMPure magnetic beads, and the purified product was dissolved in 30 ul of elution buffer, thereby obtaining a library to be sequenced.
  • the prepared library was sequenced on the Ion Proton, including emulsion PCR, library enrichment, chip loading and the sequencing run. For details, see the instructions for the Ion PI™ Hi-Q™ OT2 200 Kit and the Ion PI™ Hi-Q™ Sequencing 200 Kit.
  • the analysis covers the total number of reads, the number of reads aligned to hg19, the alignment rate, the number of reads in the target region, the on-target rate and the mutation information. See Table 8 and Table 9.
  • the samples above were also tested by ARMS quantitative PCR: the quantitative detection kit for human EGFR gene mutation (real-time fluorescent PCR method) produced by Genosbo Biotechnology Nantong Co., Ltd. was used to quantify the EGFR gene mutations in the samples.
  • oligonucleotide sequences used in this example are shown in Table 10.
  • adaptor 1 and adaptor 2 are unmodified adaptors; some of the "T" bases in the sequences of adaptor 3 and adaptor 4 are replaced by "U" bases; adaptor 5 and adaptor 6 are thio-modified.
  • sample preparation: as in Example 1, a commercial kit was used to extract genomic DNA from healthy human oral exfoliated cells, the DNA samples were quantified by spectrophotometer, and the DNA was sheared to about 200 bp with a Covaris ultrasonic DNA shearer.
  • the library construction and sequencing process is the same as in Example 1.
  • adaptor 1 and adaptor 2 were each combined with probes 1 to 12, primer F and primer R for library construction and sequencing; adaptor 3 and adaptor 4 were each combined with probes 1 to 12, primer F and primer R for library construction and sequencing; adaptor 5 and adaptor 6 were each combined with probes 1 to 12, primer F and primer R for library construction and sequencing.
  • oligonucleotide sequences used in this example are shown in Table 12.
  • sample preparation: genomic DNA was extracted from the samples using a commercial kit.
  • sample 1 was the H2228 cell line (known to carry the EML4-ALK type 3 fusion mutation)
  • sample 2 was plasma from a lung cancer patient
  • DNA samples were quantified by spectrophotometer; the DNA extracted from sample 1 was sheared to about 200 bp with a Covaris ultrasonic DNA shearer, whereas the plasma free DNA extracted from sample 2 did not need to be sheared.
  • steps 1 to 7 of library construction are the same as in Example 1.
  • the NGS statistical results are shown in FIG. 5.
  • FIG. 6A and FIG. 6B show IGV views of the H2228 cell line fusion breakpoint detected by the present invention; for the Sanger sequencing verification results, see Figure 7 and Figure 8.
  • the fusion of the EML4 and ALK genes was detected in the H2228 cell line: intron 6 of the EML4 gene is fused to intron 19 of the ALK gene, which is a type 3 fusion mutation, consistent with the prior report (Choi YL, Takeuchi K, Soda M et al. Identification of novel isoforms of the EML4-ALK transforming gene in non-small cell lung cancer. Cancer Res 2008; 68: 4971-4976).
  • the fusion found by the present invention is that intron 20 of the EML4 gene is fused to intron 17 of the ALK gene, an unreported fusion type (E20:A18).
  • the commercial kit reported the pathological tissue sample of this patient as E20:A20. In-depth analysis showed that exon 18 of the ALK gene is 153 bp long and exon 19 of the ALK gene is 105 bp long.
  • the commercial kit includes detection primers only for E13:A20, E20:A20 and E6:A20.
  • the detection primers for E20:A20 can also detect the E20:A18 fusion type of this example, so this fusion type was misjudged as E20:A20. This also demonstrates the accuracy of detection by the present invention.
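
The exon lengths quoted above lend themselves to a quick arithmetic check. The sketch below is illustrative only. It assumes, and the text does not state this, that the commercial kit amplifies transcript-derived template with one primer in EML4 exon 20 and one in ALK exon 20, so that an E20:A18 template differs from an E20:A20 template only by the extra ALK exons 18 and 19. The function names are hypothetical.

```python
# Exon lengths quoted in the text (bp).
ALK_EXON_18_BP = 153
ALK_EXON_19_BP = 105


def extra_length_bp(exon_lengths):
    """Total length added to the amplicon by the extra exons."""
    return sum(exon_lengths)


def keeps_reading_frame(extra_bp):
    """An insertion preserves the reading frame if its length is a multiple of 3."""
    return extra_bp % 3 == 0


extra = extra_length_bp([ALK_EXON_18_BP, ALK_EXON_19_BP])
print(extra, keeps_reading_frame(extra))
```

Under that assumption, primers designed for E20:A20 would still flank the junction of an E20:A18 template and would yield a longer product, which is one way to read the misjudgment described above.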

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

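The present invention discloses a gene target region enrichment method and a library construction kit. The method enriches gene target regions by single-primer linear amplification, relying on the fidelity of a high-fidelity DNA polymerase.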

Description

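Gene target region enrichment method and library construction kit
Technical Field
The present invention belongs to the technical field of molecular biology; specifically, it relates to a gene target region enrichment method and a library construction kit, mainly for use in second-generation sequencing.
Background Art
Cancer data for 2015 released by the National Cancer Center of China and published in 2016 in CA: A Cancer Journal for Clinicians (CA Cancer J Clin) show that approximately 4.29 million new cancer cases occurred in China in 2015, and cancer has become the leading cause of death in China. Before the 20th century, there were no effective clinical treatments for cancer patients other than surgery, chemotherapy and radiotherapy. With the completion of the Human Genome Project, researchers took the study of tumors to the level of gene sequences; they first identified a therapeutic target in lung cancer, namely mutations in EGFR (epidermal growth factor receptor), and developed the targeted drug Iressa, opening up a new model of targeted therapy. Targeted therapy presupposes screening the patient's gene status. Earlier detection methods included real-time fluorescent quantitative PCR and first-generation sequencing, which met clinical needs to some extent. As science has progressed, however, it has become clear that tumor tissue is itself a small ecosystem with internal heterogeneity, and that EGFR-targeted drugs work poorly in patients carrying KRAS gene mutations; single-gene testing therefore cannot meet clinical needs. In the early 21st century, second-generation sequencing technology gradually matured and the cost of sequencing one genome fell to 1,000 US dollars; given its additional advantage in detection throughput, second-generation sequencing is increasingly used in clinical testing.
Although the cost of whole-genome sequencing keeps falling, for some detected mutation sites it is still impossible to judge clinically whether the mutation is tumor-related or whether it can guide medication. Target region sequencing, also called targeted sequencing, is the strategy currently in common clinical use. It typically focuses sequencing on genes already known to be associated with tumor onset and progression or with medication guidance, which lowers sequencing cost and makes clinical application more convenient. The core of target region sequencing is to enrich and sequence the genes of interest; the two existing enrichment methods are multiplex PCR and hybrid capture. Multiplex PCR is represented by the Ampliseq method from Life Technologies: a pair of PCR primers is designed for each site, and multiplex PCR is used to enrich the genes or sites of interest. This method needs little starting DNA and is relatively simple to perform, but for short DNA (for example plasma free DNA or FFPE tissue sample DNA) primer design is very difficult, and the method is prone to bias when different genes are amplified by PCR. Hybrid capture usually designs probes of about 120 bases for the target genes and enriches the genes of interest by hybridization on a solid-phase carrier or in the liquid phase; it can detect, at the DNA level, common point mutations, insertion mutations, microdeletion mutations, and known and unknown fusion mutations. For effective enrichment, hybrid capture requires a large number of long probes and a long hybridization period (usually more than 24 hours), and because hybridization efficiency is low, a large amount of DNA template is required.
Therefore, the field still needs improved methods that enrich gene target regions more efficiently and more conveniently.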
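Summary of the Invention
The object of the present invention is to provide a gene target region enrichment method and a library construction kit.
In a first aspect of the present invention, a method for enriching a target region is provided, the method comprising:
(1) ligating sequencing adaptor 1, which is a DNA capable of forming a partially double-stranded structure, to a DNA fragment containing the target region to be enriched, to obtain a ligation product;
(2) adding a specific probe and a DNA polymerase to the ligation product of (1), and performing sequence extension on the basis of the complementarity between the probe and the ligation product, to obtain an extension product; wherein the specific probe comprises sequencing adaptor 2 and a sequence complementary to the DNA fragment containing the target region to be enriched;
(3) amplifying the extension product of (2) by PCR with a DNA polymerase, to obtain an amplification product containing DNA enriched for the target region.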
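In a preferred embodiment, the target region enrichment method is applied in high-throughput sequencing.
In another preferred embodiment, the target region comprises a site of sequence variation, for example an SNP mutation site region, a base deletion site region, a base insertion site region or a fusion mutation site region.
In another preferred embodiment, the DNA polymerase is a high-fidelity DNA polymerase.
In another preferred embodiment, the method is a non-diagnostic and non-therapeutic method.
In another preferred embodiment, the method is applied to high-throughput detection; in a single assay there are one or more DNA fragments containing a target region to be enriched (2 to 1000, such as 5, 10, 15, 20, 30, 50, 80, 100, 200, 500 or 800), and correspondingly one or more sequencing probes.
In another preferred embodiment, in step (1), the DNA capable of forming a partially double-stranded structure may form a "stem-loop" structure, a "double-loop" structure or a "Y-type" structure, or may be an ordinary double-stranded DNA containing a palindromic sequence; the adaptor ligation in step (1) is a single-strand ligation, in which a double-stranded ligase ligates the adaptor to the 5' end of the phosphorylated DNA fragment containing the target region to be enriched; preferably, the double-stranded ligase includes T4 DNA ligase.
In another preferred embodiment, the DNA capable of forming a partially double-stranded structure comprises a sequence recognizable by the sequencing system and a sample tag sequence; preferably, it further comprises a molecular tag sequence.
In another preferred embodiment, the sequence recognizable by the sequencing system, the sample tag sequence and/or the molecular tag sequence are located in the portion of the DNA capable of forming a partially double-stranded structure that cannot form a double strand (for example, in the "loop" of the "stem-loop" structure).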
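
How a sample tag and a molecular tag might be used downstream can be sketched as follows. The read layout assumed here (a sample tag immediately followed by a 6-base molecular tag at the start of each read) is a hypothetical simplification, and the function name and tag values are made up for illustration; they are not taken from the patent's sequence listing.

```python
from collections import defaultdict


def count_molecules(reads, sample_tags, umi_len=6):
    """Assign reads to samples by tag prefix and count distinct molecular tags.

    Hypothetical layout: read = sample tag + molecular tag + insert.
    Reads whose prefix matches no known sample tag are ignored.
    """
    tag_len = len(next(iter(sample_tags)))
    molecules = defaultdict(set)
    for read in reads:
        tag = read[:tag_len]
        umi = read[tag_len:tag_len + umi_len]
        sample = sample_tags.get(tag)
        if sample is not None and len(umi) == umi_len:
            molecules[sample].add(umi)
    return {sample: len(umis) for sample, umis in molecules.items()}


sample_tags = {"AAAA": "sample_1", "CCCC": "sample_2"}
reads = [
    "AAAAACGTACTTT",  # sample_1, molecular tag ACGTAC
    "AAAAACGTACGGG",  # sample_1, same molecular tag again
    "AAAATTTTTTGGG",  # sample_1, molecular tag TTTTTT
    "CCCCACGTACTTT",  # sample_2, molecular tag ACGTAC
    "GGGGACGTACTTT",  # unknown sample tag, ignored
]
counts = count_molecules(reads, sample_tags)
print(counts)
```

Counting distinct molecular tags per sample, rather than raw reads, is one common way to keep PCR duplicates from inflating the evidence for a variant.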
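In another preferred embodiment, steps (1), (2) and (3) each further include a step of purifying the ligation product, the extension product and the amplification product, respectively.
In another preferred embodiment, in step (2), one end of sequencing adaptor 2 is joined to the sequence complementary to the DNA fragment containing the target region to be enriched, and the other end carries a purification tag; alternatively, the sequence extension is performed with dNTPs labeled with a purification tag; preferably, the purification tag is a tag capable of solid-phase purification, and solid-phase purification is performed using a solid-phase purification system;
in step (1), (2) or (3), the sample labeled with the sample tag is purified and separated.
In another preferred embodiment, sequencing adaptor 2 is located at the 5' end of the specific probe, and the sequence complementary to the DNA fragment containing the target region to be enriched is located at the 3' end of the specific probe.
In another preferred embodiment, a purification tag is attached to the 5' end of sequencing adaptor 2.
In another preferred embodiment, the purification tag is biotin, and the solid-phase purification system is avidin-coated magnetic beads or streptavidin-coated magnetic beads.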
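
The probe layout described above (purification tag, then sequencing adaptor 2, then the target-complementary sequence, read 5' to 3') can be written down as a small data structure. This is an illustrative sketch: the class name, the field names and the short placeholder sequences are hypothetical and are not the SEQ ID NO sequences of the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecificProbe:
    adaptor2: str                   # sequencing adaptor 2, at the 5' end of the probe
    target_binding: str             # sequence complementary to the target fragment, at the 3' end
    five_prime_tag: str = "biotin"  # purification tag attached to the 5' end of adaptor 2

    def sequence(self) -> str:
        """Oligonucleotide sequence written 5' to 3'."""
        return self.adaptor2 + self.target_binding

    def label(self) -> str:
        """Human-readable layout including the 5' purification tag."""
        return f"5'-[{self.five_prime_tag}]-{self.sequence()}-3'"


# Placeholder sequences, for illustration only.
probe = SpecificProbe(adaptor2="ACGTACGT", target_binding="TTGGCCAA")
print(probe.label())
```

Keeping the target-binding part at the 3' end matters because the polymerase extends from the 3' end of the annealed probe, while the tag at the 5' end stays available for capture on coated beads.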
在另一优选例中,步骤(2)中,所述的延伸包括:单轮或多轮循环的变性、退火、延伸步骤;较佳地,进行多轮循环的变性、退火、延伸步骤(例如,进行2~45个,如5,10,15,20,25,30,35个循环数的线性扩增)。
在另一优选例中,步骤(3)中,所述的PCR扩增为指数PCR扩增(例如,进行10~20个循环数的PCR扩增)。
在另一优选例中,步骤(3)中,采用正向引物与反向引物进行PCR扩增,所述的反向引物与所述被测序系统识别的序列相同或互补;所述的正向引物与所述测序接头2序列相同或互补。
在另一优选例中,采用Ion Torrent测序系统进行测序,所述的测序接头2的序列为Ion Torrent P1序列;所述的能被测序系统识别的序列为Ion Torrent A序列。
在另一优选例中,步骤(3)之后,还包括:识别步骤(3)获得的扩增产物中的测序接头1(特别是其中能被测序系统识别的序列(如A序列))以及测序接头2,进行测序,获得目标区域的测序结果。
在另一优选例中,所述的测序接头1和测序接头2的部分或全部碱基经修饰,或未经修饰;较佳地,所述的修饰包括(但不限于):脱氧脲嘧啶(dU)修饰、硫代修饰。
在另一优选例中,步骤(1)中,所述的包含待富集目标区域的DNA片段为双链DNA,包括但不限于基因组DNA、互补DNA(complementary DNA,cDNA)。
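In another aspect, the invention provides a kit for target region enrichment, the kit comprising:
sequencing adapter 1, which is a DNA capable of forming a partially double-stranded structure;
a specific probe, which comprises sequencing adapter 2 and a sequence complementary to the DNA fragment containing the target region to be enriched;
a forward primer and a reverse primer, which have sequences complementary to sequences in sequencing adapter 1 (in particular the sequence in it that is recognizable by the sequencing system, such as the A sequence) and in sequencing adapter 2.
In a preferred embodiment, the kit further comprises one or more reagents selected from the following: DNA polymerase, double-strand ligase, dNTPs.
In another preferred embodiment, the target region is the EGFR exon 18 G719X (A/S/C) mutation, EGFR 19Del, EGFR exon 20 T790M, EGFR exon 21 L858R, BRAF V600E, KRAS G12D, KRAS Q61H and/or the ALK intron 19 fusion; the nucleotide sequence of sequencing adapter 1 is as shown in SEQ ID NO: 1, 6, 7, 8, 9, 10, 11, 22 or 23; the nucleotide sequence of the specific probe is as shown in one or more of SEQ ID NO: 2, 3, 12~21 and/or 24~39;
the nucleotide sequences of the forward primer and the reverse primer are as shown in SEQ ID NO: 4 and SEQ ID NO: 5.
Other aspects of the invention will be apparent to those skilled in the art from the disclosure herein.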
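Brief Description of the Drawings
Figure 1. Library construction workflow.
Figure 2. Read length distribution for Example 1.
Figure 3. Per-base sequencing depth across EGFR exon 19 in Example 1.
Figure 4. Per-base sequencing depth across EGFR exon 21 in Example 1.
Figure 5. Fusion mutation data detected according to the invention in the H2228 cell line and in one lung cancer patient in Example 3.
Figures 6A-B. IGV views of the EML4-ALK fusion breakpoint in the H2228 cell line in Example 3.
Figure 7. First-generation sequencing verification of the EML4-ALK fusion in the H2228 cell line in Example 3.
Figure 8. First-generation sequencing verification of the EML4-ALK fusion in the lung cancer patient in Example 3.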
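Detailed Description
After intensive research, the inventors propose a new gene target region enrichment method suited to high-throughput sequencing. The method enriches gene target regions by single-primer linear amplification, exploiting the fidelity of a high-fidelity DNA polymerase. It enriches the genes of interest efficiently and accurately and markedly reduces the sequencing errors introduced during library construction.
Terminology
As used herein, a "DNA fragment containing the target region to be enriched" is a stretch of double-stranded DNA, including but not limited to genomic DNA and complementary DNA (cDNA), that contains a "target region" of interest. The "target region" comprises a site of sequence variation, for example an SNP site region, a base deletion site region, a base insertion site region or a fusion site region. It may be a region closely associated with the occurrence of disease, or a region of interest to a person skilled in the art for other research purposes. Examples are the L858R mutation in exon 21 of the human EGFR gene, the E746_A750DEL-1 deletion in exon 19 of the human EGFR gene, the A763_Y764insFQEA insertion in exon 20 of the human EGFR gene, and the human EML4-ALK gene fusion.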
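As used herein, a "complementary" sequence generally means the sequence obtained by first reversing a 5'-3' sequence into its 3'-5' orientation (e.g. 5' ATCG 3' → GCTA) and then taking the complement of the result (e.g. GCTA → 5' CGAT 3').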
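The definition above corresponds to what is usually called the reverse complement. A minimal Python sketch, assuming plain uppercase A/C/G/T input (the function name is illustrative and does not come from the source):

```python
# Complement table for the four DNA bases (uppercase only; an assumption of this sketch).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a 5'->3' DNA sequence, itself read 5'->3'."""
    # Step 1: reverse the sequence. Step 2: complement every base.
    return seq[::-1].translate(_COMPLEMENT)


print(reverse_complement("ATCG"))  # CGAT, the example given in the text
```

Applying the function twice returns the original sequence, which is a quick sanity check when preparing probe sequences.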
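As used herein, a "stem-loop" structure, also called a "hairpin" structure
Figure PCTCN2018080229-appb-000001
is a single-stranded nucleotide molecule that can form a secondary structure containing a double-stranded region (the stem). The double-stranded region is formed by two regions of the same molecule, which lie on the two sides of the duplex; the structure also contains at least one "loop" of non-complementary nucleotides, i.e. a single-stranded region.
As used herein, a "double-loop" structure
Figure PCTCN2018080229-appb-000002
is a double-stranded nucleotide molecule whose two ends are complementary and form the double-stranded regions of the structure, while the middle contains a stretch of non-complementary nucleotides that forms a single-stranded "loop".
As used herein, a "Y-shaped" structure
Figure PCTCN2018080229-appb-000003
is a double-stranded nucleotide molecule in which one end is complementary and forms the double-stranded region of the structure, while the other end is non-complementary and forms a single-stranded "fork".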
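Target Region Enrichment Method
The invention provides a method for target region enrichment, the method comprising: (1) ligating sequencing adapter 1, which is a DNA capable of forming a partially double-stranded structure, to DNA fragments containing the target region to be enriched, to obtain a ligation product; (2) adding a specific probe and a high-fidelity DNA polymerase to the ligation product of (1) and carrying out sequence extension on the basis of complementarity between the probe and the ligation product, to obtain an extension product, wherein the specific probe comprises sequencing adapter 2 and a sequence complementary to the DNA fragment containing the target region to be enriched; (3) amplifying the extension product of (2) by PCR with a high-fidelity DNA polymerase, to obtain an amplification product containing DNA enriched for the target region.
In step (1), the DNA capable of forming a partially double-stranded structure can form a "stem-loop", "double-loop" or "Y-shaped" structure, or is an ordinary double-stranded DNA containing a palindromic sequence. In a preferred mode, the adapter ligation of step (1) is a single-strand ligation, in which a double-strand ligase joins the adapter to the 5' end of the phosphorylated DNA fragment containing the target region to be enriched. Using such a partially double-stranded DNA as the adapter helps yield ligation products with a single ligation orientation and avoids the interference caused by ligation in different orientations.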
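In a preferred mode, besides the portion that forms the duplex, the DNA capable of forming a partially double-stranded structure further comprises a sequence recognizable by the sequencing system (for the Ion Torrent sequencing system, the A sequence) and a sample tag sequence, and optionally a molecular tag sequence. These sequences are located in the "loop" of the "stem-loop" structure (i.e. the portion that cannot form a duplex). The sample tag allows sequences from different samples to be told apart in the subsequent bioinformatic analysis, so that multiple samples can be sequenced in a single reaction. The sequence recognizable by the sequencing system allows the library to be captured and sequenced by the high-throughput sequencing system.
Steps (1), (2) and (3) each further include a step of purifying the ligation product, the extension product and the amplification product, respectively, in which the sample labeled with the sample tag is purified and isolated. In a preferred mode, in step (2), one end of sequencing adapter 2 is joined to the sequence complementary to the DNA fragment containing the target region to be enriched and the other end carries a purification tag; or the extension is carried out with "purification-tag-labeled dNTPs". Preferably, the purification tag is a tag that permits solid-phase purification, and the purification is carried out with a solid-phase purification system. In a more preferred mode, the purification tag is biotin and the solid-phase purification system is avidin-coated or streptavidin-coated magnetic beads.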
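In step (2), the extension comprises one or more cycles of denaturation, annealing and extension. In a preferred mode of the invention, multiple cycles are performed, for example 2 to 45 cycles of linear amplification.
In step (3), the PCR amplification is exponential PCR amplification, for example 10 to 20 cycles, performed with a forward primer and a reverse primer.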
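The difference between the linear amplification of step (2) and the exponential amplification of step (3) can be illustrated with an idealized yield model. In step (2) only the probe is present, so each cycle copies the original ligated templates once; in step (3) both primers are present and every product becomes a template. The sketch below is an assumption-laden simplification (the `efficiency` parameter and both function names are hypothetical, and reagent depletion is ignored), not a model taken from the source:

```python
def linear_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies made by single-probe extension: each cycle copies only the original templates."""
    return templates * cycles * efficiency


def exponential_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies present after two-primer PCR: the pool grows by (1 + efficiency) per cycle."""
    return templates * (1.0 + efficiency) ** cycles


# 100 target molecules, 30 linear cycles, then 15 exponential cycles (illustrative numbers).
after_extension = linear_copies(100, 30)
after_pcr = exponential_copies(after_extension, 15)
print(after_extension, after_pcr)
```

Because errors made in a linear step are not themselves re-copied within that step, this arrangement is consistent with the document's claim that single-primer extension limits error propagation during enrichment.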
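After step (3), the method further comprises: recognizing sequencing adapter 1 (in particular the sequence in it that is recognizable by the sequencing system, such as the A sequence) and sequencing adapter 2 in the amplification product of step (3), and sequencing, to obtain the sequencing result for the target region.
The sequencing adapters used in the invention may be unmodified, or may be modified adapters obtained by means such as nucleic acid backbone modification. The modification essentially does not alter the binding properties of the oligonucleotide; modifications that increase oligonucleotide stability are preferred. For example, the modification is dU modification, phosphorothioate modification, or alkyl modification at the 2' position of the ribose. It should be understood that any modification that preserves the binding properties of the oligonucleotide is included in the invention.
There are several methods for modifying the oligonucleotide backbone, including phosphorothioation, in which an oxygen atom of the phosphate linkage in the DNA backbone is replaced by a sulfur atom; all or only some of the phosphate linkages may be phosphorothioated. Phosphorothioate modification greatly increases the stability of the oligonucleotide and thereby helps obtain accurate detection results. Deoxyuracil can be inserted into the oligonucleotide to raise the melting temperature of the duplex and thereby increase duplex stability.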
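In the above method, DNA fragments are first joined to a sequencing adapter by adapter ligation; the adapter may contain a sequence recognizable by the sequencing system, a sample tag sequence, a molecular tag sequence and so on. The ligation product is then mixed with the probe and a thermostable high-fidelity DNA polymerase is added. Through high-temperature denaturation, annealing and extension, the probe anneals to the target DNA molecule to form a duplex, and the DNA polymerase specifically incorporates dNTPs and extends it. Repeated cycles of denaturation, annealing and extension enrich the genes of interest efficiently, and in combination with a streptavidin-biotin purification system they achieve efficient enrichment and purification of the target region.
The invention can enrich any DNA fragments that already carry a sequencing adapter; the adapter sequence may be an adapter for the Ion Torrent sequencing platform or for the Illumina sequencing platform. From the principle of the invention, a person skilled in the art will understand that the method is also applicable to other sequencing platforms.
The method needs only one oligonucleotide probe per site for effective enrichment. It thereby overcomes the difficulty of designing PCR primers for short nucleic acid fragments and can effectively enrich such fragments (for example plasma cell-free DNA or DNA from FFPE tissue samples). A further advantage is that enrichment by specific probe extension is fast: in the example enriching EGFR mutation site regions, the process takes only about 1 hour, far less than probe hybridization capture.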
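In addition, a library constructed with the method or kit of the invention contains the following parts in order: the 5' sequencing adapter, the gene-specific probe, the enriched target region, the sample tag, the molecular tag (optional), and the 3' sequencing adapter (Figure 1). The enriched target region carries the mutation information of the gene. Its 5' end lies at a fixed genomic position (determined by the gene-specific probe), whereas its 3' end is not fixed (Figures 3 and 4) and is determined by how the starting DNA was fragmented. In data analysis, the genomic position of this 3' end can therefore act as a molecular tag and effectively reduce background noise. Combining it with the molecular tag reduces background noise much further and improves the sensitivity and accuracy of detection.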
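One way the 3' end position and the molecular tag could be used together in analysis is to group reads into families that share a probe, a molecular tag and a 3' end position, and to call one consensus base per family. The following Python sketch is only an illustration of that idea: the read tuple layout, the probe name, the coordinates and the 0.8 majority threshold are all hypothetical and are not specified in the source.

```python
from collections import Counter, defaultdict


def group_families(reads):
    """Group reads sharing probe, molecular tag and 3' end position into families."""
    families = defaultdict(list)
    for probe, molecular_tag, end_pos, base in reads:
        families[(probe, molecular_tag, end_pos)].append(base)
    return families


def consensus(families, min_fraction=0.8):
    """Call one base per family; families without a clear majority are dropped."""
    calls = {}
    for key, bases in families.items():
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_fraction:
            calls[key] = base
    return calls


# Hypothetical reads: (probe, molecular tag, 3' end position, base observed at the site).
reads = [
    ("probe1", "ACGTAC", 1200, "G"),
    ("probe1", "ACGTAC", 1200, "G"),
    ("probe1", "ACGTAC", 1200, "G"),
    ("probe1", "ACGTAC", 1200, "G"),
    ("probe1", "ACGTAC", 1200, "T"),  # isolated error inside a family
    ("probe1", "ACGTAC", 1250, "T"),  # same tag, different 3' end: a separate molecule
    ("probe1", "ACGTAC", 1250, "T"),
    ("probe1", "TTGACA", 1200, "G"),  # ambiguous family, no clear majority
    ("probe1", "TTGACA", 1200, "T"),
]

families = group_families(reads)
calls = consensus(families)
print(len(families), calls)
```

In this toy input the isolated "T" in the first family is outvoted, which is the noise-reduction effect the paragraph above describes.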
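Kit
Based on the method described above, the invention also provides a kit for target region enrichment, the kit comprising: sequencing adapter 1, which is a DNA capable of forming a partially double-stranded structure; a specific probe, which comprises sequencing adapter 2 and a sequence complementary to the DNA fragment containing the target region to be enriched; and a forward primer and a reverse primer, which have sequences complementary to sequences in sequencing adapter 1 (in particular the sequence in it that is recognizable by the sequencing system, such as the A sequence) and in sequencing adapter 2.
The kit may further comprise a high-fidelity DNA polymerase, a double-strand ligase, dNTPs and the like. The kit may also include instructions for use that describe the target region enrichment method of the invention, for the convenience of those skilled in the art.
The main advantages of the invention are:
1. It enriches the genes of interest efficiently and accurately and reduces the sequencing errors introduced during library construction;
2. Only one primer (probe) needs to be designed per site, which overcomes the difficulty of designing PCR primers for short fragments such as ctDNA;
3. Single-strand adapter ligation avoids reverse ligation of the adapter and improves ligation efficiency;
4. Multiplexed linear amplification completes enrichment in about 1 hour, faster than hybrid capture;
5. The 3' end of the target region and the molecular tag can be used in analysis to reduce background noise and improve detection accuracy.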
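The invention is further described below with reference to specific examples. It should be understood that these examples only illustrate the invention and do not limit its scope. Experimental methods for which no specific conditions are given in the following examples generally follow conventional conditions, such as those described in J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd edition, Science Press, 2002, or the conditions recommended by the manufacturer.
Example 1. Detection of point mutations and insertions/deletions
The oligonucleotide sequences used in this example are shown in Table 1.
Table 1. Oligonucleotide sequences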
Figure PCTCN2018080229-appb-000004
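Notes: ① Adapter 1 contains sequencing adapter 1, here the A sequence of the Ion Torrent sequencing system; the underlined sequence is the sample tag sequence; the "N" portion is the molecular tag sequence (NNNNNN is a random sequence that distinguishes different fragments within the same sample; GATCGC is the barcode adapter sequence, used for quality control);
② Probe 1 and probe 2 contain sequencing adapter 2 (the bases shown in italics, here the P1 sequence of the Ion Torrent sequencing system, with a biotin modification at the 5' end); the underlined portion is the target-region-specific sequence. Probe 1 targets the L858R mutation in EGFR exon 21, and probe 2 targets the deletion mutation in EGFR exon 19.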
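Step 1: Sample preparation
Genomic DNA was extracted from the samples with a commercial kit. The samples comprised buccal cells from a healthy person, the NCI-H1975 cell line (which carries the EGFR exon 21 L858R mutation) and the NCI-H1650 cell line (which carries an EGFR exon 19 deletion). The DNA was quantified with a spectrophotometer, NCI-H1975 DNA and NCI-H1650 DNA were spiked into the healthy buccal cell DNA at a proportion of 1%, and the DNA was sheared to about 200 bp with a Covaris ultrasonic DNA shearing instrument for later use.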
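The spike-in arithmetic can be sketched as follows. The source does not state whether 1% applies to each cell line or to both combined, nor whether it is by mass; the sketch assumes 1% by mass for each cell line, and the function name and the 1000 ng total are hypothetical.

```python
def spike_in_plan(total_ng: float, spike_fractions: dict) -> dict:
    """Return the mass (ng) of each DNA needed for a mixture of total_ng.

    spike_fractions maps a spiked DNA name to its mass fraction of the final mix;
    the remainder is made up with background (healthy donor) DNA.
    """
    plan = {name: total_ng * fraction for name, fraction in spike_fractions.items()}
    background = total_ng - sum(plan.values())
    if background < 0:
        raise ValueError("spike fractions add up to more than 100%")
    plan["background"] = background
    return plan


# Assumption: each cell line contributes 1% of the DNA mass in a 1000 ng mixture.
plan = spike_in_plan(1000, {"NCI-H1975": 0.01, "NCI-H1650": 0.01})
print(plan)
```

Note that the mass fraction of cell-line DNA is not the same as the expected mutant allele fraction, which also depends on zygosity and copy number in each cell line.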
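Step 2: End repair
Prepare the end repair reaction according to Table 2.
Table 2
Component    Volume
DNA sample    60 µL
10× T4 polynucleotide kinase buffer    10 µL
dNTP (10 mM each)    2 µL
T4 DNA polymerase    2 µL
T4 polynucleotide kinase    2 µL
Klenow fragment    2 µL
ddH2O    22 µL
Total volume    100 µL
After thorough mixing, incubate at 20 °C for 30 minutes and then at 65 °C for 30 minutes. Purify the product on a silica column and elute with 60 µL elution buffer.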
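Step 3: Adapter ligation
Prepare the ligation reaction according to Table 3.
Table 3
Component    Volume
End-repaired DNA solution    50 µL
10× ligation buffer    10 µL
T4 DNA ligase    2 µL
Adapter (10 μM)    2 µL
ddH2O    36 µL
Total volume    100 µL
After thorough mixing, incubate at 20 °C for 30 minutes. Purify the ligation product with 150 µL Agencourt AMPure beads and dissolve the purified product in 50 µL elution buffer. This yields a DNA library labeled with the sample tag.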
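Step 4: Probe extension
Prepare the probe extension reaction according to Table 4.
Table 4
Component    Volume
Purified ligation product    40 µL
5× Q5 Buffer    20 µL
dNTP (10 mM each)    2 µL
Q5 high-fidelity DNA polymerase (2 U/μL)    1 µL
Probe 1 (10 μM)    1 µL
Probe 2 (10 μM)    1 µL
ddH2O    35 µL
Total volume    100 µL
After thorough mixing and centrifugation, place the reaction tube in a PCR instrument and run the program set according to Table 5:
Table 5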
Figure PCTCN2018080229-appb-000005
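After the reaction, purify the product with streptavidin-coated magnetic beads and finally dissolve it in 40 µL elution buffer.
Step 5: Library amplification
Prepare the library amplification reaction according to Table 6.
Table 6
Component    Volume
Purified product from the previous step    30 µL
5× Q5 Buffer    10 µL
dNTP (10 mM each)    1 µL
Q5 high-fidelity DNA polymerase (2 U/μL)    1 µL
Primer F (10 μM)    1.5 µL
Primer R (10 μM)    1.5 µL
ddH2O    5 µL
Total volume    50 µL
After thorough mixing and centrifugation, place the reaction tube in a PCR instrument and run the program set according to Table 7:
Table 7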
Figure PCTCN2018080229-appb-000006
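After the reaction, purify the PCR product with 80 µL Agencourt AMPure beads and dissolve the purified product in 30 µL elution buffer. This yields the library ready for sequencing.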
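The reaction set-ups in Tables 2, 3, 4 and 6 can be sanity-checked by confirming that the component volumes add up to the stated total. The sketch below transcribes the volumes (in microliters) from those tables; the dictionary layout and the helper name `volumes_match` are illustrative choices, not part of the source protocol.

```python
import math

# Component volumes in microliters, transcribed from Tables 2, 3, 4 and 6.
REACTIONS = {
    "Table 2 end repair": ({
        "DNA sample": 60, "10x T4 PNK buffer": 10, "dNTP": 2,
        "T4 DNA polymerase": 2, "T4 polynucleotide kinase": 2,
        "Klenow fragment": 2, "ddH2O": 22,
    }, 100),
    "Table 3 ligation": ({
        "repaired DNA": 50, "10x ligation buffer": 10,
        "T4 DNA ligase": 2, "adapter": 2, "ddH2O": 36,
    }, 100),
    "Table 4 probe extension": ({
        "ligation product": 40, "5x Q5 Buffer": 20, "dNTP": 2,
        "Q5 polymerase": 1, "probe 1": 1, "probe 2": 1, "ddH2O": 35,
    }, 100),
    "Table 6 library amplification": ({
        "purified product": 30, "5x Q5 Buffer": 10, "dNTP": 1,
        "Q5 polymerase": 1, "primer F": 1.5, "primer R": 1.5, "ddH2O": 5,
    }, 50),
}


def volumes_match(components: dict, stated_total: float) -> bool:
    """True if the component volumes sum to the stated total volume."""
    return math.isclose(sum(components.values()), stated_total)


for name, (components, total) in REACTIONS.items():
    print(name, sum(components.values()), volumes_match(components, total))
```

All four tables are internally consistent: the buffers are also at 1× final concentration (10 µL of 10× in 100 µL, 20 µL of 5× in 100 µL, 10 µL of 5× in 50 µL).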
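Step 6: Sequencing
The prepared library was sequenced on the Ion Proton, which involves emulsion PCR, library enrichment, chip loading and the sequencing run. For the detailed procedure see the instructions of the Ion PI™ Hi-Q™ OT2 200 Kit and the Ion PI™ Hi-Q™ Sequencing 200 Kit.
Step 7: Data analysis
The analysis covers the total number of reads, the number of reads aligned to hg19, the alignment rate, the number of target-region reads, the target-region fraction, mutation information and so on; see Tables 8 and 9.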
Table 8
Metric    Value
Total reads    315060
Reads aligned to hg19    296944
Alignment rate    94.25%
Target-region reads    160349
Target-region read fraction    54%
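The percentages in Table 8 can be reproduced from the read counts. The short Python sketch below uses only the three counts from the table; the variable names are illustrative. It also shows that the reported 54% corresponds to target-region reads divided by aligned reads, whereas dividing by total reads would give roughly 51%.

```python
total_reads = 315060      # total reads (Table 8)
aligned_reads = 296944    # reads aligned to hg19 (Table 8)
on_target_reads = 160349  # target-region reads (Table 8)

alignment_rate = aligned_reads / total_reads
on_target_of_aligned = on_target_reads / aligned_reads
on_target_of_total = on_target_reads / total_reads

print(f"alignment rate:        {alignment_rate:.2%}")
print(f"on-target / aligned:   {on_target_of_aligned:.0%}")
print(f"on-target / all reads: {on_target_of_total:.0%}")
```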
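Table 9
Figure PCTCN2018080229-appb-000007
The ARMS quantitative PCR assay referred to above used the Human EGFR Gene Mutation Quantitative Detection Kit (real-time fluorescent PCR method) produced by 格诺思博生物科技南通有限公司 to quantify the EGFR mutations in the samples.
As Tables 8 and 9 show, the mutation fraction obtained by sequencing with the present method is markedly higher than the mutation fraction measured by ARMS quantitative PCR, so mutation results can be obtained more precisely.
Example 2. Adapter modification
The oligonucleotide sequences used in this example are shown in Table 10.
Table 10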
Figure PCTCN2018080229-appb-000008
Figure PCTCN2018080229-appb-000009
Figure PCTCN2018080229-appb-000010
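In Table 10, adapter 1 and adapter 2 are unmodified adapters; in adapter 3 and adapter 4 some "T" bases are replaced by "U" bases; adapter 5 and adapter 6 carry phosphorothioate modifications.
Sample preparation: as in Example 1, genomic DNA was extracted from buccal cells of a healthy person with a commercial kit, quantified with a spectrophotometer, and sheared to about 200 bp with a Covaris ultrasonic DNA shearing instrument for later use.
Library construction and sequencing followed Example 1. Adapters 1 and 2 were each combined with probes 1~12 and primers F and R for one round of library construction and sequencing; adapters 3 and 4 were each combined with probes 1~12 and primers F and R for one round; and adapters 5 and 6 were each combined with probes 1~12 and primers F and R for one round.
The sequencing results are shown in Table 11.
Table 11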
Figure PCTCN2018080229-appb-000011
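Table 11 shows no difference among the unmodified, dU-modified and phosphorothioate-modified adapters in Align Rate, Uniformity or On Target Rate; all performed well.
Example 3. EML4-ALK fusion detection
The oligonucleotide sequences used in this example are shown in Table 12.
Table 12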
Figure PCTCN2018080229-appb-000012
Figure PCTCN2018080229-appb-000013
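Step 1: Sample preparation
DNA was extracted from the samples with a commercial kit. Sample 1 was the H2228 cell line (known to carry the EML4-ALK variant 3 fusion) and sample 2 was plasma from a lung cancer patient. The DNA was quantified with a spectrophotometer. The DNA extracted from sample 1 was sheared to about 200 bp with a Covaris ultrasonic DNA shearing instrument for later use; the plasma cell-free DNA extracted from sample 2 did not need shearing.
Library construction steps 1 to 7 were the same as in Example 1. The NGS statistics are shown in Figure 5; Figures 6A and 6B show IGV views of the fusion breakpoint detected by the invention in the H2228 cell line; the Sanger sequencing verification results are shown in Figures 7 and 8.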
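According to the invention, the H2228 cell line was found to carry a fusion of the EML4 and ALK genes in which ALK intron 19 is joined to EML4 intron 6. This is a variant 3 fusion, consistent with the prior art (Choi YL, Takeuchi K, Soda M et al. Identification of novel isoforms of the EML4-ALK transforming gene in non-small cell lung cancer. Cancer Res 2008;68:4971-4976). In the plasma sample, the invention found a fusion joining EML4 intron 20 to ALK intron 17, a fusion that has not been reported before (E20:A18). A commercial kit (RT-PCR method) applied to the pathological tissue sample of the same patient reported E20:A20 instead. Closer analysis explains the discrepancy: ALK exon 18 is 153 bp long and ALK exon 19 is 105 bp long, and the commercial kit contains detection primers only for E13:A20, E20:A20 and E6:A20. The detection primers for E20:A20 can also detect the E20:A18 fusion in this case, so the kit misclassified it as E20:A20. This also demonstrates the accuracy of detection by the invention.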
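The exon lengths quoted above allow a small worked calculation. Assuming (this is not stated in the source) that the commercial E20:A20 assay uses a forward primer in EML4 exon 20 and a reverse primer in ALK exon 20, an E20:A18 transcript would still be amplified, giving a product longer by the combined length of ALK exons 18 and 19. The function name below is illustrative.

```python
# ALK exon lengths in bp, as stated in the text.
ALK_EXON_LENGTH_BP = {18: 153, 19: 105}


def extra_amplicon_length(first_alk_exon: int, primer_exon: int = 20) -> int:
    """Bases added to the amplicon by ALK exons retained upstream of the primer exon."""
    return sum(ALK_EXON_LENGTH_BP[exon] for exon in range(first_alk_exon, primer_exon))


extra = extra_amplicon_length(18)  # E20:A18 retains ALK exons 18 and 19
print(extra)           # 258
print(extra % 3 == 0)  # True: a whole number of codons (86)
```

Under the stated assumption, the additional 258 bp is a multiple of three, so retaining exons 18 and 19 does not shift the reading frame relative to E20:A20. An assay that only reports presence or absence of amplification would not distinguish the two fusions, whereas sequencing the breakpoint does.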
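All documents mentioned in the invention are incorporated by reference in this application, as if each document were individually incorporated by reference. It should further be understood that, after reading the above teaching, those skilled in the art may make various changes or modifications to the invention, and such equivalents likewise fall within the scope defined by the claims appended to this application.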

Claims (15)

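  1. A method for target region enrichment, characterized in that the method comprises:
    (1) ligating sequencing adapter 1, which is a DNA capable of forming a partially double-stranded structure, to DNA fragments containing the target region to be enriched, to obtain a ligation product;
    (2) adding a specific probe and a DNA polymerase to the ligation product of (1) and carrying out sequence extension on the basis of complementarity between the probe and the ligation product, to obtain an extension product; wherein the specific probe comprises sequencing adapter 2 and a sequence complementary to the DNA fragment containing the target region to be enriched;
    (3) amplifying the extension product of (2) by PCR with a DNA polymerase, to obtain an amplification product containing DNA enriched for the target region.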
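  2. The method of claim 1, characterized in that, in step (1), the DNA capable of forming a partially double-stranded structure can form a "stem-loop", "double-loop" or "Y-shaped" structure, or is an ordinary double-stranded DNA containing a palindromic sequence; the adapter ligation of step (1) is a single-strand ligation, in which a double-strand ligase joins the adapter to the 5' end of the phosphorylated DNA fragment containing the target region to be enriched; preferably, the double-strand ligase includes T4 DNA ligase.
  3. The method of claim 1, characterized in that the DNA capable of forming a partially double-stranded structure comprises a sequence recognizable by the sequencing system and a sample tag sequence; preferably, it further comprises a molecular tag sequence.
  4. The method of claim 3, characterized in that the sequence recognizable by the sequencing system, the sample tag sequence and/or the molecular tag sequence are located in the portion of the partially double-stranded structure that cannot form a duplex.
  5. The method of claim 1, characterized in that step (1), (2) or (3) further includes a step of purifying the ligation product, the extension product or the amplification product, respectively.
  6. The method of claim 5, characterized in that, in step (2), one end of sequencing adapter 2 is joined to the sequence complementary to the DNA fragment containing the target region to be enriched and the other end carries a purification tag; or the extension is carried out with "purification-tag-labeled dNTPs"; preferably, the purification tag is a tag that permits solid-phase purification, and the purification is carried out with a solid-phase purification system;
    in step (1), (2) or (3), the sample labeled with the sample tag is purified and isolated.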
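  7. The method of claim 5, characterized in that the purification tag is biotin, and the solid-phase purification system is avidin-coated or streptavidin-coated magnetic beads.
  8. The method of claim 1, characterized in that, in step (2), the extension comprises one or more cycles of denaturation, annealing and extension; preferably, multiple cycles of denaturation, annealing and extension are performed.
  9. The method of claim 1, characterized in that, in step (3), the PCR amplification is exponential PCR amplification.
  10. The method of claim 1, characterized in that, after step (3), the method further comprises: recognizing sequencing adapter 1 and sequencing adapter 2 in the amplification product of step (3), and sequencing, to obtain the sequencing result for the target region.
  11. The method of claim 1, characterized in that some or all bases of sequencing adapter 1 and sequencing adapter 2 are modified, or unmodified; preferably, the modification includes deoxyuracil modification and phosphorothioate modification.
  12. The method of claim 1, characterized in that, in step (1), the DNA fragment containing the target region to be enriched is double-stranded DNA, including but not limited to genomic DNA and complementary DNA.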
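  13. A kit for target region enrichment, characterized in that the kit comprises:
    sequencing adapter 1, which is a DNA capable of forming a partially double-stranded structure;
    a specific probe, which comprises sequencing adapter 2 and a sequence complementary to the DNA fragment containing the target region to be enriched;
    a forward primer and a reverse primer, which have sequences complementary to sequences in sequencing adapter 1 and sequencing adapter 2.
  14. The kit of claim 13, characterized in that the kit further comprises one or more reagents selected from the following: DNA polymerase, double-strand ligase, dNTPs.
  15. The kit of claim 13, characterized in that the target region is the EGFR exon 18 G719X (A/S/C) mutation, EGFR 19Del, EGFR exon 20 T790M, EGFR exon 21 L858R, BRAF V600E, KRAS G12D, KRAS Q61H and/or the ALK intron 19 fusion; the nucleotide sequence of sequencing adapter 1 is as shown in SEQ ID NO: 1, 6, 7, 8, 9, 10, 11, 22 or 23; the nucleotide sequence of the specific probe is as shown in one or more of SEQ ID NO: 2, 3, 12~21 and/or 24~39;
    the nucleotide sequences of the forward primer and the reverse primer are as shown in SEQ ID NO: 4 and SEQ ID NO: 5.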
PCT/CN2018/080229 2017-12-15 2018-03-23 Gene target region enrichment method and library construction kit WO2019114146A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711348235.X 2017-12-15
CN201711348235.XA CN108004301B (zh) 2017-12-15 2017-12-15 Gene target region enrichment method and library construction kit

Publications (1)

Publication Number Publication Date
WO2019114146A1 true WO2019114146A1 (zh) 2019-06-20

Family

ID=62059444

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/080229 WO2019114146A1 (zh) 2017-12-15 2018-03-23 Gene target region enrichment method and library construction kit

Country Status (2)

Country Link
CN (1) CN108004301B (zh)
WO (1) WO2019114146A1 (zh)

Also Published As

Publication number Publication date
CN108004301A (zh) 2018-05-08
CN108004301B (zh) 2022-02-22


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18888765

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18888765

Country of ref document: EP

Kind code of ref document: A1