WO2021018127A1 - Library construction method and application - Google Patents

Library construction method and application

Info

Publication number
WO2021018127A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
sequencing
primer
sample
library
Prior art date
Application number
PCT/CN2020/105117
Other languages
English (en)
French (fr)
Inventor
郑乔松
师晓
焦宇辰
陈敏
张凯华
王思振
阎海
Original Assignee
北京泛生子基因科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京泛生子基因科技有限公司
Priority to EP20845892.7A (publication EP3995588A4)
Priority to KR1020227006561A (publication KR20220077907A)
Priority to US17/631,214 (publication US20220267760A1)
Publication of WO2021018127A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1065 Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
    • C12N15/1093 General methods of preparing gene libraries, not provided for in other subgroups
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6827 Hybridisation assays for detection of mutation or polymorphism
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6858 Allele-specific amplification
    • C12Q1/6869 Methods for sequencing
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • C12Q2531/00 Reactions of nucleic acids characterised by
    • C12Q2531/10 Reactions of nucleic acids characterised by the purpose being amplify/increase the copy number of target nucleic acid
    • C12Q2531/113 PCR
    • C12Q2537/00 Reactions characterised by the reaction format or use of a specific feature
    • C12Q2537/10 Reactions characterised by the reaction format or use of a specific feature the purpose or use of
    • C12Q2537/143 Multiplexing, i.e. use of multiple primers or probes in a single reaction, usually for simultaneously analyse of multiple analysis
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • C12Q2600/16 Primer sets for multiplex assays

Definitions

  • The invention relates to the technical field of molecular biology, and in particular to a library construction method and its application.
  • Library construction methods fall into two categories: capture-based library construction and amplification-based library construction.
  • A capture library enriches relatively large regions of the genome, such as all exon regions of tens or hundreds of genes, whereas a multiplex amplification library targets specific hotspot regions, or all exon regions of individual genes, for targeted enrichment and sequencing analysis.
  • In amplification-based library construction, specific primers are designed for the target region and then used for multiplex amplification of the target sequences. These specific primers either carry sequencing adapters directly or carry bridge sequences, in which case the sequencing adapters are added in a second PCR amplification. This is the general amplification-based library construction process. Existing amplification library construction methods have several problems in practice.
  • The library construction process is relatively cumbersome: it requires at least two rounds of PCR amplification and two corresponding library purifications, takes a lot of hands-on time, and places high demands on the operator, which hinders adoption. Primer design and system optimization are complicated, the cost of library construction is high, and the whole process takes a long time.
  • The present invention provides the following technical solutions.
  • An object of the present invention is to provide a primer combination for constructing an amplicon library for detecting mutations of a target gene.
  • The primer combination provided by the present invention includes:
  • the upstream outer primer F1, composed in order of sequencing adapter sequence 1, a barcode sequence for distinguishing different samples, and a universal sequence;
  • the upstream inner primer F2, composed in order of the universal sequence and an upstream specific primer sequence of the target amplicon (tissue samples may be tested without molecular tags);
  • the downstream outer primer R, composed in order of sequencing adapter sequence 2 and the downstream specific primer sequence of the target amplicon.
  • The upstream inner primer F2 may also consist, in order, of the universal sequence, a molecular tag sequence, and the upstream specific primer sequence of the target amplicon.
  • The molecular tag sequence is composed of 6-30 bases, including random bases and 0 to N groups of specific bases (N is an integer ≥ 0). The specific bases are interspersed among the random bases in, for example, one, two, three or four groups; each group of specific bases consists of 1-5 bases, for example 1, 2, 3, 4 or 5 bases.
  • The base sequence of each group is selected at random, and the molecular tag sequence is used to distinguish different starting DNA template molecules.
  • The base type (A, T, G or C) of each random base is freely selected.
  • The specific bases are set as one or two groups with the sequence ACT and/or TGA; for example, in this embodiment the molecular tag sequence is NNNNNACTNNNNTGA, where ACT and TGA are specific bases and N is a random base (A, T, C or G).
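The tag layout described above can be illustrated with a short Python sketch. It only fills the random positions of the pattern NNNNNACTNNNNTGA quoted in the text; the function names and the use of a seeded random generator are assumptions made for illustration and do not come from the patent.

```python
import random

# Pattern taken from the text: N = random base, ACT and TGA = specific bases.
TAG_PATTERN = "NNNNNACTNNNNTGA"


def make_molecular_tag(pattern=TAG_PATTERN, rng=None):
    """Fill every N in the pattern with a random base; keep specific bases as-is."""
    rng = rng or random.Random()
    return "".join(rng.choice("ATGC") if ch == "N" else ch for ch in pattern)


def matches_pattern(tag, pattern=TAG_PATTERN):
    """Return True if the tag has the pattern's length and its specific bases in place."""
    if len(tag) != len(pattern):
        return False
    return all(
        (t in "ATGC") if p == "N" else (t == p)
        for p, t in zip(pattern, tag)
    )


if __name__ == "__main__":
    tag = make_molecular_tag(rng=random.Random(0))  # seeded only for repeatability
    print(tag, matches_pattern(tag))
```

With 9 random positions, this pattern allows 4^9 = 262,144 distinct tags, while the fixed ACT and TGA groups give a simple sanity check on each sequenced tag.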
  • Sequencing adapter 1 and sequencing adapter 2 are the corresponding sequencing adapters selected according to the sequencing platform.
  • When the sequencing platform is an Illumina platform, sequencing adapter 1 is I5 and sequencing adapter 2 is I7;
  • when the sequencing platform is an Ion Torrent platform, sequencing adapter 1 is A and sequencing adapter 2 is P;
  • or the sequencing platform is a BGI/MGI platform.
  • The nucleotide sequence of the universal sequence is Sequence 1.
  • Another object of the present invention is to provide a kit for constructing an amplicon library for detecting mutations of a target gene.
  • The kit provided by the present invention includes the above primer combination.
  • The kit also includes a PCR amplification buffer and a DNA polymerase system.
  • Another object of the present invention is to provide any of the following applications of the above primer combination or kit:
  • Another object of the present invention is to provide a method for constructing an amplicon library for detecting mutations of a target gene.
  • The method provided by the present invention includes the following steps:
  • Using the above primer combination or the above kit, with the DNA or cDNA of the sample to be tested as the template, perform one-step PCR amplification; the amplified product is the amplicon library of the target gene.
  • The molar ratio of the upstream outer primer F1, the upstream inner primer F2 and the downstream primer R in the one-step PCR amplification system is (5-20):(1-20):(5-20).
  • The sample to be tested is a tissue sample, a frozen sample, a puncture sample, an FFPE sample, blood, urine, cerebrospinal fluid, pleural fluid or another body fluid.
  • The amplicon library prepared by the above method also falls within the protection scope of the present invention.
  • Another object of the present invention is to provide a method for detecting mutations of a target gene in a sample to be tested.
  • The method provided by the present invention includes the following steps:
  • Another object of the present invention is to provide a method for detecting the variation frequency of a target region in a sample to be tested.
  • The method provided by the present invention includes the following steps:
  • Mutation frequency = number of mutation clusters / total number of effective clusters × 100%.
  • The sample to be tested is an isolated tissue sample, a frozen sample, a puncture sample, an FFPE sample, blood, urine, cerebrospinal fluid, or pleural fluid.
  • the nucleotide sequence of the universal sequence is sequence 1;
  • the nucleotide sequence of the sequencing adapter 1 is sequence 2;
  • the nucleotide sequence of the sequencing adapter 2 is sequence 17.
  • the corresponding upstream specific primer sequence and downstream specific primer sequence are sequence 14 and sequence 18, or, sequence 15 and sequence 19, or, sequence 21 and sequence 24, or, sequence 22 and sequence 25;
  • the corresponding upstream specific primer sequence and downstream specific primer sequence are sequence 16 and sequence 20, or sequence 23 and sequence 26, respectively;
  • the corresponding upstream specific primer sequence and downstream specific primer sequence are sequence 27 and sequence 31, or sequence 28 and sequence 31, respectively;
  • the corresponding upstream specific primer sequence and downstream specific primer sequence are sequence 29 and sequence 32 respectively;
  • the corresponding upstream specific primer sequence and downstream specific primer sequence are sequence 30 and sequence 33, respectively.
  • The barcode sequences are all nucleotide sequences 6-12 nt long, with no more than 3 consecutive bases and a GC content of 40-60%;
  • universal sequence 1 and universal sequence 2 are generally 16-25 nt long, with no consecutive bases, a GC content of 35-65%, and no obvious secondary structure;
  • the molecular tag sequence is a sequence containing 6-15 random bases, including but not limited to the sequence above; in the embodiments of the present invention, for example, the barcode sequences used to distinguish different samples are Sequence 3 to Sequence 12;
  • the variation can be a point mutation, a deletion, an insertion, or a fragment fusion.
  • Figure 1 shows the primer composition of one-step rapid amplification library construction technology.
  • Figure 2 shows the product situation of the template amplification by rapid amplification library building technology.
  • Figure 3 shows the Agilent 2200 results of the library constructed with the BRCA1/2 one-step primer pool.
  • Figure 4 shows the homogeneity of sequencing amplicons of the library constructed by the BRCA1/2 one-step primer pool.
  • Figure 5 is a schematic diagram of the functional structure of each component of the four-function primer and the three-function primer.
  • Figure 6 shows the homogeneity results of the library constructed with three-functional component primer and four-functional component primer pool.
  • Figure 7 shows the number of clusters (number of molecular tags) of one of the amplicons obtained after library construction from 30 ng cfDNA using the one-step primer pool.
  • Figure 8 shows the background noise at the 0.1‰-1‰ level after sequencing the libraries built by the two methods.
  • Figure 9 is the Agilent 2200 TapeStation result of the library built in Example 2.
  • Figure 10 shows the Agilent 2200 TapeStation results of the library built in Example 3.
  • Figure 11 is the Agilent 2200 TapeStation result of the library built in Example 4.
  • The present invention is an amplification-based library construction method for building second-generation sequencing libraries; the specific primer structures involved in the method are as follows (see Figure 1):
  • Upstream outer primer F1: 5'-sequencing adapter sequence 1 + barcode sequence + universal sequence-3';
  • Upstream inner primer F2: 5'-universal sequence + molecular tag sequence + gene upstream specific primer sequence-3';
  • or upstream inner primer F2: 5'-universal sequence + gene upstream specific primer sequence-3' (detection of low-frequency mutations requires molecular tags; for tissue samples the molecular tags may be omitted);
  • Downstream primer R: 5'-sequencing adapter sequence 2 + gene downstream specific primer sequence-3'.
  • The structure of the upstream inner primer F2 is: 5'-universal sequence + molecular tag sequence + gene upstream specific primer sequence-3'.
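As a minimal sketch of the primer layout listed above, the following Python snippet concatenates the components in 5' to 3' order. Only the universal sequence (Sequence 1) is taken from the text; the adapter, barcode and gene-specific sequences used in the example are hypothetical placeholders, and `build_primers` is an illustrative name rather than anything defined in the patent.

```python
# Universal sequence (Sequence 1) as given in the text.
UNIVERSAL = "GGCACCCGAGAATTCCA"


def build_primers(adapter1, barcode, adapter2,
                  upstream_specific, downstream_specific, molecular_tag=""):
    """Assemble F1, F2 and R (written 5'->3') from their components.

    molecular_tag may be left empty, which gives the F2 variant without
    molecular tags that the text allows for tissue samples.
    """
    return {
        "F1": adapter1 + barcode + UNIVERSAL,
        "F2": UNIVERSAL + molecular_tag + upstream_specific,
        "R": adapter2 + downstream_specific,
    }


if __name__ == "__main__":
    # All sequences below are made-up placeholders, not sequences from the patent.
    primers = build_primers(
        adapter1="AAAACCCC",
        barcode="ACGTAC",
        adapter2="TTTTGGGG",
        upstream_specific="CAGTCAGT",
        downstream_specific="GTCAGTCA",
        molecular_tag="NNNNNACTNNNNTGA",
    )
    for name, seq in primers.items():
        print(name, seq)
    # F1 ends with the universal sequence that F2 starts with.
    print(primers["F1"].endswith(UNIVERSAL), primers["F2"].startswith(UNIVERSAL))
```

The shared universal sequence at the 3' end of F1 and the 5' end of F2 is what lets a single sample-level F1 primer extend every F2-derived product.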
  • The barcode sequence is a nucleic acid sequence used to distinguish different samples; each sample to be tested corresponds to one barcode sequence. The barcode sequence is 6-12 nt long, with no more than 3 consecutive bases and a GC content of 40-60%.
  • Primers carrying the barcode sequence should have no obvious secondary structure, etc.
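The barcode constraints above lend themselves to a simple automated check. The sketch below is an illustration under stated assumptions: it reads "no more than 3 consecutive bases" as "no run of the same base longer than 3", which is an interpretation of the translated text, and it does not attempt the secondary-structure check. The function names are hypothetical.

```python
import re


def gc_percent(seq):
    """GC content of a sequence as a percentage."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def is_valid_barcode(seq, max_run=3):
    """Check length (6-12 nt), homopolymer runs (assumed reading) and GC (40-60%)."""
    seq = seq.upper()
    if not 6 <= len(seq) <= 12:
        return False
    if re.search(r"[^ACGT]", seq):
        return False
    # Longest stretch of one repeated base, e.g. "GGG" has length 3.
    longest_run = max(len(m.group(0)) for m in re.finditer(r"(.)\1*", seq))
    return longest_run <= max_run and 40.0 <= gc_percent(seq) <= 60.0


if __name__ == "__main__":
    # Example barcodes are made up for illustration.
    for candidate in ("ACGTACGT", "AAAATCGC", "ATATATAT", "ACGGGTAA"):
        print(candidate, is_valid_barcode(candidate))
```

A filter like this could be applied to candidate barcodes before they are combined with the adapter and universal sequence, leaving structure prediction to a dedicated tool.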
  • The upstream outer primer F1 is used to distinguish different samples. For a given sample, the upstream outer primer F1 is the same regardless of the detection site.
  • The molecular tag sequence is used to mark different starting DNA template molecules (templates for different amplicons); each starting DNA template molecule corresponds to one molecular tag sequence.
  • The molecular tag sequence includes random bases and at least one set of specific bases.
  • The specific bases are interspersed among the random bases in, for example, 1 or 2 sets; each set of specific bases consists of 1-5 bases, for example 3 or 4.
  • The base types (A, T, G, C) of the random bases are randomly selected.
  • Grouping the sequencing results by starting template according to the molecular tag sequences makes it possible to eliminate amplification errors and sequencing errors.
  • Two types of specific bases are used, ACT and TGA, either alone or in combination.
  • The gene upstream specific primer sequence and the gene downstream specific primer sequence are the primer sequences used to amplify specific target regions (including each upstream primer and corresponding downstream primer required to amplify the different target regions);
  • Universal sequence 1 is a specific nucleic acid sequence that can be changed according to actual needs.
  • Its length is 16-25 nt, and there are no consecutive bases.
  • The GC content is 35-65% and there is no obvious secondary structure.
  • This example uses the universal sequence GGCACCCGAGAATTCCA (Sequence 1), which is 17 nt long;
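As a quick worked check, the length and GC content of Sequence 1 can be computed against the stated 16-25 nt and 35-65% ranges. This is only a sketch; `summarize` is an illustrative helper name, not part of the patent.

```python
# Universal sequence (Sequence 1) as given in the text.
UNIVERSAL = "GGCACCCGAGAATTCCA"


def summarize(seq):
    """Return (length in nt, GC content in percent) for a sequence."""
    gc = seq.count("G") + seq.count("C")
    return len(seq), 100.0 * gc / len(seq)


if __name__ == "__main__":
    length, gc = summarize(UNIVERSAL)
    print(f"length = {length} nt, GC = {gc:.1f}%")
    print("within stated ranges:", 16 <= length <= 25 and 35.0 <= gc <= 65.0)
```

Sequence 1 has 10 G/C bases out of 17, so it is 17 nt long with about 58.8% GC, inside both stated ranges.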
  • Sequencing adapter sequence 1 and sequencing adapter sequence 2 are the specific sequences that must be introduced on the primers for sequencing, and can correspond to the Ion Torrent, Illumina, or BGISEQ/MGISEQ sequencing platforms.
  • For the Illumina platform, sequencing adapter sequences 1 and 2 are I5 and I7, respectively.
  • The adapter sequences are complementary to the primer sequences on the chip, and the adapters are added to attach the nucleic acid fragments to the carrier.
  • For the Ion Torrent platform, sequencing adapter sequences 1 and 2 are A and P, respectively.
  • The A adapter is used for sequencing and is complementary to the sequencing primer.
  • The P adapter is complementary to the sequence on the carrier and attaches the template to the carrier.
  • For the BGISEQ/MGISEQ platform, the sequencing adapter is the specific sequence required for single-strand circularization, subsequent DNB preparation, and on-machine sequencing.
  • The primer design of the one-step rapid amplification library construction technology is as described above.
  • The upstream outer primer F1 and the upstream inner primer F2 share a common sequence, so the upstream outer primer F1 can use the upstream inner primer F2 as a template to add the sequencing adapter and the sample barcode sequence to the target sequence.
  • With the upstream inner primer MIX1 (made by mixing the upstream inner primers F2 of multiple amplicons in a specific ratio) and the downstream primer MIX2 (made by mixing the corresponding downstream primers R of multiple amplicons in a specific ratio), the template undergoes the first cycle to produce amplified products carrying F2 and R. In the second cycle, in addition to the above two PCR products, products with the F2 and R sequences at both ends are obtained. The third cycle begins to produce the target product with the complete sequence of the sequencing library, but at this point the product has only one strand. The fourth cycle then produces a double-stranded product with the complete adapter sequences at both ends.
  • Because the upstream outer primer F1 has a much higher Tm than the upstream inner primer F2 and is present at a much higher concentration, the complete products (the two products marked by the red dashed box among the fourth-cycle PCR products) are subsequently amplified exponentially, and library construction is completed after a dozen to several dozen cycles.
  • The upstream outer primer F1 was dissolved in water to a concentration of 100 μM. Each upstream inner primer F2 was dissolved in water to 100 μM, and the F2 primers were mixed in an equimolar ratio to form the upstream inner primer MIX1. Each downstream primer R was dissolved in water to 100 μM, and the R primers were mixed in an equimolar ratio to form the downstream primer MIX2.
  • Table 1 shows the amplification system for one sample (component: amount):
  • KAPA HiFi PCR Kits (DNA polymerase, including but not limited to this one): 10
  • Genomic DNA of the sample (usually 5-20 ng gDNA): 1-10
  • Upstream inner primer MIX1 (100 μM): 0.01-5
  • Upstream outer primer F1 (100 μM): 0.01-5
  • Downstream primer MIX2 (100 μM): 0.01-5
  • DNase-free H2O: make up to 20
  • Table 2 shows the PCR amplification program
  • The PCR product obtained is the amplicon library.
  • The concentration of the purified amplicon library was determined with Qubit 2.0, and the library was checked on the Agilent 2200 TapeStation system.
  • The mutation frequency of a library with molecular tags is calculated as follows:
  • The calculation method of the mutation frequency is as follows:
  • DNA molecules with the same molecular tag are defined as a cluster. DNA molecules with the same molecular tag are the amplification products of one initial DNA template, that is, a series of amplification products derived from the same original template DNA molecule;
  • Mutation frequency = number of mutation clusters / total number of effective clusters × 100%.
  • A cluster is statistically meaningful only if it contains ≥ 2 DNA molecules (sequenced reads) in the sequencing result.
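The cluster-based calculation above can be sketched in Python as follows. Reads are grouped by molecular tag, clusters with fewer than 2 reads are discarded, and the frequency is the share of effective clusters called mutant. The rule used here to call a cluster mutant (every read in the cluster carries the mutation) is an assumption for illustration, since the text does not spell it out; the input format and function name are likewise hypothetical.

```python
from collections import defaultdict


def mutation_frequency(reads, min_cluster_size=2):
    """Mutation frequency (%) for one amplicon.

    reads: iterable of (molecular_tag, is_mutant) pairs, one pair per sequenced read.
    A cluster is 'effective' if it has at least min_cluster_size reads.
    Assumption: a cluster counts as a mutation cluster only if all its reads are mutant.
    """
    clusters = defaultdict(list)
    for tag, is_mutant in reads:
        clusters[tag].append(is_mutant)

    effective = [calls for calls in clusters.values() if len(calls) >= min_cluster_size]
    if not effective:
        return 0.0
    mutant_clusters = sum(1 for calls in effective if all(calls))
    return 100.0 * mutant_clusters / len(effective)


if __name__ == "__main__":
    # Toy reads with made-up tags: one mutant cluster, one wild-type cluster,
    # one singleton (ignored) and one cluster with conflicting reads.
    toy_reads = [
        ("AAAAAACTAAAATGA", True), ("AAAAAACTAAAATGA", True),
        ("CCCCCACTCCCCTGA", False), ("CCCCCACTCCCCTGA", False),
        ("GGGGGACTGGGGTGA", True),
        ("TTTTTACTTTTTTGA", False), ("TTTTTACTTTTTTGA", True),
    ]
    print(f"{mutation_frequency(toy_reads):.2f}%")
```

Requiring agreement within a cluster is one way to suppress isolated amplification or sequencing errors; a majority-vote consensus would be an equally plausible choice.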
  • The detection region of this experiment contains 3 amplicons (EGFR L858R, 19del and ERBB2 insertion mutations);
  • The test samples include 2 frozen lung cancer tissue samples (Sample 1, Sample 2), 4 lung cancer FFPE (formalin-fixed, paraffin-embedded tissue) samples (Sample 3, Sample 4, Sample 5, Sample 6) and 2 white blood cell samples from healthy people (Sample 7, Sample 8). The variation of these 8 samples is known.
  • Table 3 lists the primer sequences for the EGFR L858R, 19del and ERBB2 insertion mutations
  • The sequencing adapter is suitable for the Ion GeneStudio™ S5 Plus System sequencing platform.
  • Nucleic acid extraction and purification kits: DNA extraction from FFPE samples, GeneRead DNA FFPE kit (Qiagen, 180134); DNA extraction from frozen tissue samples, QIAamp DNA Mini Kit 250 (QIAGEN, 51306).
  • The amplification system is shown in Table 4.
  • Table 4 shows the amplification system
  • Table 5 shows the amplification program
  • PCR products were recovered by magnetic bead purification (Agencourt AMPure XP, Beckman Coulter, A63880); the DNA library concentration was determined with Qubit 2.0 and the library was checked on the Agilent 2200 TapeStation system.
  • The PCR products of all samples were mixed at equal concentrations and diluted to 100 pM to obtain a DNA library for sequencing.
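Diluting a library to a target of 100 pM requires converting the mass concentration reported by a fluorometer such as Qubit into molarity. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the concentration and mean library length in the example are hypothetical values, and the function names are illustrative rather than taken from the patent.

```python
def library_molarity_pM(conc_ng_per_ul, mean_length_bp, bp_mass=660.0):
    """Convert a dsDNA concentration in ng/uL to pM.

    nM = (ng/uL) * 1e6 / (660 g/mol per bp * length in bp); 1 nM = 1000 pM.
    """
    nanomolar = conc_ng_per_ul * 1e6 / (bp_mass * mean_length_bp)
    return nanomolar * 1000.0


def dilution_factor(conc_pM, target_pM=100.0):
    """Fold dilution needed to bring a library from conc_pM down to target_pM."""
    return conc_pM / target_pM


if __name__ == "__main__":
    # Hypothetical library: 2 ng/uL with a mean fragment length of 250 bp.
    conc = library_molarity_pM(2.0, 250)
    print(f"{conc:.0f} pM, dilute {dilution_factor(conc):.1f}-fold to reach 100 pM")
```

In practice the mean fragment length would come from the TapeStation trace, and each sample would be normalized before pooling at equal concentrations.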
  • EGFR p.E746_A750delELREA represents the deletion of amino acids 746-750 (ELREA; E: Glu, glutamic acid; L: Leu, leucine; R: Arg, arginine; E: Glu, glutamic acid; A: Ala, alanine) of the EGFR gene; it is one form of EGFR 19del;
  • EGFR p.K745_E749delKELRE represents the deletion of amino acids 745-749 (KELRE; K: Lys, lysine; E: Glu, glutamic acid; L: Leu, leucine; R: Arg, arginine; E: Glu, glutamic acid) of the EGFR gene; it is one form of EGFR 19del;
  • ERBB2 p.A775_G776insYVMA represents the insertion of YVMA (Y: Tyr, tyrosine; V: Val, valine; M: Met, methionine; A: Ala, alanine) between alanine 775 and glycine 776, corresponding to ERBB2 in Table 3.
  • The 63-gene detection product is a tumor liquid biopsy product of Beijing Genetron Gene Technology Co., Ltd. It applies high-throughput, high-precision second-generation sequencing to patients with any solid tumor, comprehensively analyzing mutations at 63 gene loci closely related to tumor targeted therapy and tumor development.
  • The target region is covered at a sequencing depth of 20,000x, and the detection sensitivity can reach 0.1%, providing comprehensive, high-value reference information for precision medication, molecular typing, and monitoring of treatment efficacy and recurrence.
  • The samples in this experiment are plasma samples from 4 different lung cancer patients and 2 healthy people (the sample variation is known). After cfDNA was extracted with the MagMAX™ Cell-Free DNA Isolation Kit (Applied Biosystems™, A29319), the library was constructed using a primer pool with molecular tags covering the EGFR L858R, 19del and ERBB2 insertion mutations.
  • The primers shown in Table 7 were designed (the upstream outer primers are the same and the others differ; this example uses 6 barcode sequences):
  • Table 7 lists the primer sequences for the EGFR L858R, 19del and ERBB2 insertion mutations
  • Nucleic acid extraction and purification kits: DNA extraction from FFPE samples, GeneRead DNA FFPE kit (Qiagen, 180134); DNA extraction from frozen tissue samples, QIAamp DNA Mini Kit 250 (QIAGEN, 51306).
  • Table 8 shows the amplification system
  • Table 9 is the amplification program
  • PCR products were recovered by magnetic bead purification (Agencourt AMPure XP, Beckman Coulter, A63880); the DNA library concentration was determined with Qubit 2.0 and the library was checked on the Agilent 2200 TapeStation system.
  • The PCR products of all samples were mixed at equal concentrations and diluted to 100 μM to obtain a DNA library for amplicon sequencing.
  • Table 10 shows the test results of 4 tissue samples and 2 healthy human samples
  • EGFR p.E746_A750delELREA represents the deletion of amino acids 746-750 (ELREA; E: Glu, glutamic acid; L: Leu, leucine; R: Arg, arginine; E: Glu, glutamic acid; A: Ala, alanine) of the EGFR gene; it is one form of EGFR 19del;
  • ERBB2 p.A775_G776insYVMA represents the insertion of YVMA (Y: Tyr, tyrosine; V: Val, valine; M: Met, methionine; A: Ala, alanine) between alanine 775 (abbreviated A) and glycine 776 (abbreviated G), corresponding to ERBB2 in Table 7.
  • After library construction, the library was sequenced on the instrument. The point mutations, deletions and insertions detected in the plasma cfDNA samples were consistent with the known variation information obtained with the 63-gene test.
  • The samples in this experiment are FNA puncture samples from 3 thyroid cancer patients with gene fusions (the gene fusion information is known) and FNA puncture samples from 2 patients with benign thyroid nodules.
  • RNA was extracted with the MagMAX™ FFPE DNA/RNA Ultra Kit (Applied Biosystems™, A31881) according to the manufacturer's instructions, and reverse transcription was then performed with SuperScript™ VILO™ Master Mix (Invitrogen™, 11755050) according to the kit instructions.
  • The detection primers for gene fusions are designed on either side of the breakpoint, without fixed pairing of upstream and downstream primers. In the following example, ALK_20 is combined separately with EML4_6 and EML4_13 to detect two ALK-EML4 fusion forms.
  • Table 11 is the primers for gene fusion
  • Table 12 shows the amplification system
  • Table 13 is the amplification program
  • PCR products were recovered by magnetic bead purification (Agencourt AMPure XP, Beckman Coulter, A63880); library concentration was measured with Qubit 2.0 and the library was checked on an Agilent 2200 TapeStation system.
  • The PCR products of all samples were pooled at equal concentrations and diluted to 100 μM to obtain a DNA library for amplicon sequencing.
  • Table 14 shows the comparison of gene fusion test results
  • EML4-ALK-V3a corresponds to EML4_6_ and ALK_20 in Table 11;
  • EML4-ALK-V1 (E13 A20) corresponds to EML4_13_ and ALK_20 in Table 11.
  • The 63-gene test uses custom Agilent probes for capture-based library construction.
  • It has been used to test thousands of clinical plasma samples and its performance is stable.
  • After library construction by the method of the present invention and sequencing, the fusion variants detected in the samples are consistent with the variant information obtained for these samples with the known 63-gene test.
  • The upstream outer primers are the same as in Table 3; the others differ; the number of barcodes depends on the number of samples in the library.
  • Upstream inner primer F2: universal sequence + upstream specific primer sequence
  • Downstream primer R: sequencing adapter 2 + downstream specific primer sequence
  • Table 15 is the BRCA1/2 primer set
  • PCR products were recovered by magnetic bead purification (Agencourt AMPure XP, Beckman Coulter, A63880); library concentration was measured with Qubit 2.0 and the library was checked on an Agilent 2200 TapeStation system.
  • The constructed library has high specificity, with no non-specific amplification products or primer dimers; library quality is high and the library can be sequenced.
  • The PCR products of all samples were pooled at equal concentrations and diluted to 100 μM to obtain a DNA library for amplicon sequencing.
  • the sequencing results are shown in Figure 4.
  • Sequencing analysis shows that the 121 amplicons of the BRCA1/2 detection library have good uniformity, demonstrating the advantage of the one-step library construction technology of the present invention in amplicon uniformity and ensuring effective data output.
  • The upstream outer primers are the same as in Table 3, with 67 barcode sequences; the universal sequence is the same as in Table 15, and the upstream specific primer sequences are P1_B2_F1 to P1_B2_F67 in Table 15;
  • Sequencing adapter 2 is the same as in Table 15, and the downstream specific primer sequences are P1_B2_R1 to P1_B2_R67 in Table 15;
  • Barcode primer F1: sequencing adapter 1 + barcode sequence + universal sequence 1;
  • Upstream inner primer F2: universal sequence 1 + molecular tag + specific base sequence + upstream specific primer sequence;
  • Downstream outer primer R1: sequencing adapter 2 + universal sequence 2;
  • Downstream inner primer R2: universal sequence 2 + downstream specific primer sequence;
  • Table 16 is the control primer sequence
  • The method is the same as in part II of Example 2.
  • An important indicator of the quality of the library is the uniformity of the amplicons in the library.
  • Good library uniformity means higher coverage of the library's target region and better detection accuracy across the region covered by the panel.
  • For this purpose, while keeping the overall functional architecture of the primers complete, the amplicon primer design was improved and the primer structure optimized, from the original F1+F2+R1+R2 (4 functional primer components) to F1+F2+R (3 functional primer components). This design increases the stability of the reaction system and ensures uniformity among library amplicons.
  • the primer set of the present invention and the control primer set are respectively amplified.
  • the primer set of the present invention and the control primer set are respectively amplified.
  • the results are shown in Figure 7.
  • The three-functional-component primers capture original templates more efficiently than the four-functional-component primers, which makes detection of ultra-low frequencies more sensitive and stable.
  • The figure shows one randomly selected amplicon from the three-functional-component primer scheme, with the data obtained after tagging the original templates during library construction. The higher template capture efficiency gives the three-functional-component primer scheme a lower limit of detection for variant frequency.
  • the primer set of the present invention and the control primer set are respectively amplified.
  • Ultimately, the three-function primer components achieve effective detection of ultra-low-frequency mutations at the 3-in-10,000 level.
  • the primer structure of this library construction method has been fully optimized, and the performance of this library construction method is much better than the traditional low-frequency mutation detection method.
  • Table 17 is a comparison of one-step rapid library construction methods, common amplification library construction and capture library construction methods
  • The present invention provides a one-step rapid amplification library construction method. Compared with conventional capture methods, it has the following advantages (Figure 1). Demands on the operator are low, and library construction requires only an ordinary PCR setup and the corresponding reaction time. Because the quality and purity of the resulting library are very high, only a simple round of magnetic bead purification and Qubit quantification is needed before normal sequencing.
  • This one-step library construction technology can be applied to all next-generation sequencing platforms, including Ion Torrent, Illumina and BGI/MGI. Based on this method, the present invention has developed detection products for SNPs, Ins/Del, CNV and methylation in DNA, and for gene fusion and expression in RNA samples.
  • A composition of 4 functional primers increases the uncertainty of the reaction system and reaction conditions, and is more sensitive to sample quality, the reaction system and external environmental influences.
  • The composition of 3 functional primers improves on this: the simpler system is more stable, and the repeatability and accuracy of sample detection are higher;
  • The entire conventional capture-based library construction workflow takes nearly 48 hours and places high demands on operators.
  • Ordinary amplification-based library construction requires at least two rounds of PCR and two rounds of purification, plus subsequent qPCR quantification.
  • That entire library construction workflow takes at least one working day.
  • The present invention involves only a one-step PCR reaction and the corresponding product purification.
  • The entire library construction workflow can be completed within 1.5 h, which simplifies the procedure and saves time (library construction takes no more than 1.5 hours, and the whole process from library construction through the end of sequencing and completion of bioinformatic analysis can be kept within 22 hours);
  • The starting sample can be a tissue sample such as fresh tissue, a frozen sample, a needle biopsy sample or an FFPE sample; cfDNA or CTCs isolated from blood, urine, cerebrospinal fluid or pleural effusion can also be tested. Once DNA or RNA has been extracted from the sample in the usual way, the one-step rapid amplification library construction method can be applied;
  • The cost of library preparation is greatly reduced.
  • The capture probes used in conventional capture-based library construction are expensive, and the reagents and consumables used in its lengthy workflow add further cost.
  • The one-step library construction process uses far fewer reagents and consumables, so its cost is much lower than that of conventional capture-based library construction.
  • The at least one additional round of PCR and purification in ordinary amplification-based library construction, together with qPCR quantification of the library, also greatly increases cost.
  • The 3-functional-primer composition uses less primer in total and less of each primer component than the 4-primer composition, so it also has a cost advantage;

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Genetics & Genomics (AREA)
  • Biochemistry (AREA)
  • Biotechnology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Immunology (AREA)
  • Biomedical Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Plant Pathology (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

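A library construction method and its application are disclosed. Provided is a method for constructing an amplicon library for detecting variation in a region of interest of a target gene in a test sample, comprising the following steps: 1) designing and synthesizing an upstream outer primer F1, an upstream inner primer F2 and a downstream primer R according to the target region; 2) performing one-step PCR amplification on the test sample with the upstream outer primer F1, the upstream inner primer F2 and the downstream primer R to obtain an amplification product, which is the amplicon library of the target region. This one-step library construction technology can be applied to all next-generation sequencing platforms, including Ion Torrent, Illumina and BGI/MGI. Based on this method, detection products have been developed for SNPs, Ins/Del, CNV and methylation in DNA, and for gene fusion and expression in RNA samples.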

Description

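Library construction method and application
Technical Field
The present invention relates to the technical field of molecular biology, and in particular to a library construction method and its application.
Background Art
In general, sequencing analysis of target-region sequences currently requires library construction first, and library construction methods fall broadly into two types: capture-based and amplification-based.
Capture-based library construction is generally used to enrich larger genomic regions, such as the full exon regions of tens to hundreds of genes, whereas multiplex amplification-based library construction targets specific hotspot regions, or the full exon regions of individual genes, for targeted capture and sequencing analysis.
In amplification-based library construction, specific primers are designed for the target region and then used for multiplex amplification of the target sequences. These specific primers carry either the sequencing adapters directly or a bridging sequence, in which case the sequencing adapters are added by a second round of PCR amplification. This is the general workflow of amplification-based library construction. Existing amplification-based methods have several problems in practice: the workflow is cumbersome, requiring at least two rounds of PCR amplification and two corresponding library purifications; hands-on time is long and the demands on the operator are high, which hinders wider adoption; primer design and system optimization are complex; library construction is costly; and the overall workflow takes a long time.
Disclosure of the Invention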
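To address the problems of amplification-based library construction, the present invention provides the following technical solutions.
One object of the present invention is to provide a primer combination for constructing an amplicon library for detecting variation in a target gene.
The primer combination provided by the present invention comprises:
an upstream outer primer F1, an upstream inner primer F2 and a downstream primer R designed according to the target amplicon;
the upstream outer primer F1 consists, in order, of sequencing adapter sequence 1, a barcode sequence for distinguishing different samples, and a universal sequence;
the upstream inner primer F2 consists, in order, of the universal sequence and the upstream specific primer sequence of the target amplicon (a molecular tag may be omitted when testing tissue samples);
the downstream primer R consists, in order, of sequencing adapter 2 and the downstream specific primer sequence of the target amplicon.
In the above primer combination, optionally, a molecular tag is needed when detecting low-frequency mutations; in that case the upstream inner primer F2 consists, in order, of the universal sequence, a molecular tag sequence and the upstream specific primer sequence of the target amplicon.
The molecular tag sequence consists of 6-30 bases, comprising random bases and 0 to N groups of specific bases (N being an integer ≥ 0); the specific bases are placed among the random bases, for example as 1, 2, 3 or 4 groups; each group of specific bases consists of 1-5 bases, for example 1, 2, 3, 4 or 5.
The base sequence of each group may be chosen freely. The molecular tag sequence is used to distinguish different starting DNA template molecules. Within one library construction, the positions and composition of the specific bases in the molecular tag sequence are fixed, while the identity of each random base (A, T, G or C) is arbitrary.
For example, in one embodiment of the present invention, the specific bases are set as one or two groups with the sequences ACT and/or TGA; for example, in this embodiment the molecular tag sequence is NNNNNACTNNNNTGA, where ACT and TGA are the specific bases and N is a random base (A, T, C or G).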
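The tag layout described above can be illustrated with a short sketch. It is only an illustration under stated assumptions: the template string NNNNNACTNNNNTGA comes from the embodiment, while the function names (`make_tag`, `matches_template`, `tag_diversity`) are hypothetical helpers introduced here and are not part of the patent.

```python
import random
import re

# Layout from the embodiment: N = random base, ACT and TGA = fixed specific bases.
TAG_TEMPLATE = "NNNNNACTNNNNTGA"


def make_tag(template=TAG_TEMPLATE, rng=random):
    """Fill every N in the template with a random base; keep fixed bases as-is."""
    return "".join(rng.choice("ACGT") if c == "N" else c for c in template)


def matches_template(seq, template=TAG_TEMPLATE):
    """True if seq has the fixed bases at the positions the template requires."""
    pattern = template.replace("N", "[ACGT]")
    return re.fullmatch(pattern, seq) is not None


def tag_diversity(template=TAG_TEMPLATE):
    """Number of distinct tags the template can encode (4 choices per N)."""
    return 4 ** template.count("N")
```

With nine random positions this template can encode 4^9 = 262,144 distinct tags; a read whose tag fails `matches_template` could, for instance, be set aside as a likely synthesis or sequencing artifact, although the patent does not state that such a filter is used.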
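In the above primer combination, sequencing adapter 1 and sequencing adapter 2 are the sequencing adapters corresponding to the chosen sequencing platform.
In the above primer combination,
the sequencing platform is the Illumina platform, sequencing adapter 1 is I5 and sequencing adapter 2 is I7;
or the sequencing platform is the Ion Torrent platform, sequencing adapter 1 is A and sequencing adapter 2 is P;
or the sequencing platform is the BGI/MGI platform;
or, the nucleotide sequence of the universal sequence is Sequence 1.
Another object of the present invention is to provide a kit for constructing an amplicon library for detecting variation in a target gene.
The kit provided by the present invention comprises the above primer combination.
The above kit further comprises a PCR amplification buffer and a DNA polymerase system.
Another object of the present invention is to provide any of the following uses of the above primer combination or kit:
(1) use in constructing an amplicon library for detecting variation in a target gene;
(2) use in detecting mutation sites or variation in a target region of a test sample;
(3) use in detecting the variant frequency of a target region of a test sample.
A further object of the present invention is to provide a method for constructing an amplicon library for detecting variation in a target gene.
The method provided by the present invention comprises the following step:
using the above primer combination or the above kit, with DNA or cDNA of the test sample as template, performing one-step PCR amplification to obtain an amplification product, which is the amplicon library of the target gene.
In the above method, the molar ratio of upstream outer primer F1 : upstream inner primer F2 : downstream primer R in the one-step PCR amplification system is (5-20):(1-20):(5-20).
In the above method, the test sample is a tissue sample, frozen sample, needle biopsy sample, FFPE sample, blood, urine, cerebrospinal fluid, pleural effusion or another body fluid.
Use of the above method in detecting mutation sites or variation of a target gene in a test sample.
Use of the above method in detecting the variant frequency of a target gene in a test sample.
The amplicon library prepared by the above method also falls within the scope of protection of the present invention.
A further object of the present invention is to provide a method for detecting variation of a target gene in a test sample.
The method provided by the present invention comprises the following steps:
1) preparing an amplicon library of the target gene by the above method;
2) pooling the target-gene amplicon libraries of all samples and diluting to obtain a sequencing DNA library;
3) sequencing the sequencing DNA library to obtain sequencing results, and analysing variation of the target gene in the test sample from the sequencing results.
A further object of the present invention is to provide a method for detecting the variant frequency of a target region of a test sample.
The method provided by the present invention comprises the following steps:
1) preparing an amplicon library of the target gene by the above method;
2) pooling the target-gene amplicon libraries of all samples and diluting to obtain a sequencing DNA library;
3) sequencing the sequencing DNA library to obtain sequencing results, and calculating the variant frequency of the target gene in the test sample from the sequencing results;
Variant frequency = number of mutant clusters / total number of valid clusters × 100%.
In the above method, the test sample is an ex vivo tissue sample, frozen sample, needle biopsy sample, FFPE sample, blood, urine, cerebrospinal fluid or pleural effusion.
In the above method, optionally,
the nucleotide sequence of the universal sequence is Sequence 1;
the nucleotide sequence of sequencing adapter 1 is Sequence 2;
the nucleotide sequence of sequencing adapter 2 is Sequence 17.
For example, when the target gene is EGFR, optionally, the corresponding upstream and downstream specific primer sequences are Sequence 14 and Sequence 18, or Sequence 15 and Sequence 19, or Sequence 21 and Sequence 24, or Sequence 22 and Sequence 25, respectively;
when the target gene is ERBB2, optionally, the corresponding upstream and downstream specific primer sequences are Sequence 16 and Sequence 20, or Sequence 23 and Sequence 26, respectively;
when the target gene is EML4, optionally, the corresponding upstream and downstream specific primer sequences are Sequence 27 and Sequence 31, or Sequence 28 and Sequence 31, respectively;
when the target gene is LMNA, optionally, the corresponding upstream and downstream specific primer sequences are Sequence 29 and Sequence 32, respectively;
when the target gene is MYC, optionally, the corresponding upstream and downstream specific primer sequences are Sequence 30 and Sequence 33, respectively.
For example, each barcode sequence is a nucleotide sequence 6-12 nt long, with no run of 3 or more consecutive bases, and a GC content of 40-60%;
universal sequence 1 and universal sequence 2 are generally 16-25 nt long, with no consecutive bases, a GC content of 35-65% and no obvious secondary structure;
for example, the molecular tag sequence is a sequence containing 6-15 random bases, including but not limited to the sequences above; in the embodiments of the present invention, by way of example, the barcode sequences used to distinguish different samples are Sequence 3 to Sequence 12;
The variation may be a point mutation, a deletion or insertion, or a fragment fusion.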
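Brief Description of the Drawings
Figure 1 shows the primer composition of the one-step rapid amplification library construction technology.
Figure 2 shows the products formed when a template is amplified with the rapid amplification library construction technology.
Figure 3 shows the Agilent 2200 results for the library constructed with the BRCA1/2 one-step primer pool.
Figure 4 shows the amplicon uniformity, after sequencing, of the library constructed with the BRCA1/2 one-step primer pool.
Figure 5 illustrates the functional structure of each component of the four-function and three-function primers.
Figure 6 shows the uniformity of libraries built with the three-functional-component and four-functional-component primer pools.
Figure 7 shows the number of clusters (number of distinct molecular tags) for one amplicon, obtained after data analysis of a library constructed from 30 ng cfDNA with the one-step primer pool.
Figure 8 shows the background noise at the 0.1‰-1‰ level after sequencing libraries built by the two methods.
Figure 9 shows the Agilent 2200 TapeStation results for the library built in Example 2.
Figure 10 shows the Agilent 2200 TapeStation results for the library built in Example 3.
Figure 11 shows the Agilent 2200 TapeStation results for the library built in Example 4.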
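Best Mode for Carrying Out the Invention
Unless otherwise specified, the experimental methods used in the following examples are conventional methods.
Unless otherwise specified, the materials, reagents and the like used in the following examples are commercially available.
Example 1. Primer design and synthesis for one-step amplicon sequencing library construction
I. Primer design for one-step amplicon sequencing library construction
The present invention is an amplification-based method for constructing next-generation sequencing libraries. The primer structures involved are as follows (see Figure 1):
Upstream outer primer F1: 5'-sequencing adapter sequence 1 + barcode sequence + universal sequence-3';
Upstream inner primer F2: 5'-universal sequence + molecular tag sequence + gene upstream specific primer sequence-3';
or upstream inner primer F2: 5'-universal sequence + gene upstream specific primer sequence-3' (a molecular tag is needed for low-frequency mutation detection and may be omitted for tissue samples);
Downstream primer R: 5'-sequencing adapter sequence 2 + gene downstream specific primer sequence-3'.
When detecting low-frequency mutations, the structure of the upstream inner primer F2 is: 5'-universal sequence + molecular tag sequence + gene upstream specific primer sequence-3'.
The barcode sequence is a nucleic acid sequence used to distinguish different samples; each test sample corresponds to one barcode sequence. The barcode is 6-12 nt long, must contain no run of 3 or more consecutive bases, has a GC content of 40-60%, and the primer carrying the barcode must have no obvious secondary structure.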
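The barcode constraints just stated (length 6-12 nt, GC content 40-60%, no long run of one base) lend themselves to a simple screening function. The sketch below is an assumption-laden illustration: it interprets the run rule as "no 3 or more identical consecutive bases", it does not attempt the secondary-structure check, and the names `gc_fraction` and `barcode_ok` are hypothetical.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def barcode_ok(seq, min_len=6, max_len=12, max_run=2, gc_min=0.40, gc_max=0.60):
    """Screen a candidate barcode against the length, run and GC rules.

    max_run=2 encodes the assumed reading of the run rule: a stretch of
    3 or more identical bases is rejected.
    """
    seq = seq.upper()
    if not re.fullmatch("[ACGT]+", seq):
        return False
    if not (min_len <= len(seq) <= max_len):
        return False
    # (.)\1{2,} matches any base followed by two or more copies of itself.
    if re.search(r"(.)\1{%d,}" % max_run, seq):
        return False
    return gc_min <= gc_fraction(seq) <= gc_max
```

For example, `barcode_ok("ACGTACGT")` passes all three rules, while `barcode_ok("AAACGTCG")` is rejected for its AAA run. Secondary structure of the full barcode-carrying primer would still need to be checked with a dedicated tool.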
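The upstream outer primer F1 is used to distinguish different samples: for a given sample, the upstream outer primer F1 is the same regardless of the detection site.
The molecular tag sequence is used to label different starting DNA template molecules (the templates of the different amplicons); each starting DNA template molecule corresponds to one molecular tag sequence.
The molecular tag sequence comprises random bases and at least one group of specific bases, the specific bases being placed among the random bases, for example as 1 or 2 groups; each group of specific bases consists of 1-5 bases, for example 3 or 4. Within one library construction, the positions and composition of the specific bases in the molecular tag sequence are fixed, while the identity of each random base (A, T, G or C) is arbitrary.
Classifying the sequencing reads by starting template via the molecular tag sequence makes it possible to exclude amplification errors and sequencing errors. This example uses two kinds of specific bases, ACT and TGA, which may be used alone or in combination.
The gene upstream specific primer sequences and gene downstream specific primer sequences are the primer sequences that amplify specific target regions (comprising, respectively, each upstream primer and the corresponding downstream primer needed to amplify the different target regions);
Universal sequence 1 is a specific nucleic acid sequence that can be varied as needed; it is 16-25 nt long, with no consecutive bases, a GC content of 35-65% and no obvious secondary structure.
This example uses the universal sequence GGCACCCGAGAATTCCA (Sequence 1), 17 nt in length;
Sequencing adapter sequences 1 and 2 are the specific sequences that must be introduced on the primers for sequencing, and may correspond to the Ion Torrent, Illumina or BGISEQ/MGISEQ sequencing platforms.
If the sequencing platform is Illumina, sequencing adapter sequences 1 and 2 are I5 and I7 respectively; the adapter sequences are complementary to the primer sequences on the chip, and the adapters serve to attach the nucleic acid fragments to the support.
If the sequencing platform is Ion Torrent, sequencing adapter sequences 1 and 2 are A and P respectively; the A adapter is used for sequencing and is complementary to the sequencing primer, while the P adapter is complementary to the sequence on the support and attaches the template to it.
If the sequencing platform is BGISEQ/MGISEQ, the sequencing adapters are the specific sequences required for sequencing, that is, for single-strand circularization, subsequent DNB preparation and on-instrument sequencing.
Multiple samples are sequenced together in a next-generation sequencing run, so a set of upstream outer primers F1 is designed: M upstream outer primers F1 correspond to M samples, and each carries a different barcode sequence. According to the number P of amplicons needed for the targeted region in each sample, P upstream inner primers F2 and the corresponding Q downstream primers R are designed (usually P = Q, though they may differ, for example in RNA fusion gene detection); the molecular tag structure is the same in all P upstream inner primers F2.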
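The three primer structures and the "M samples, P amplicons" bookkeeping described above amount to string concatenation, as the following sketch shows. Only the universal sequence GGCACCCGAGAATTCCA is taken from the text; the function names and any adapter, barcode and gene-specific sequences passed in are hypothetical placeholders, since the real sequences are in the tables and sequence listing.

```python
UNIVERSAL = "GGCACCCGAGAATTCCA"  # Sequence 1 (17 nt) as given in this example


def build_f1(adapter1, barcode, universal=UNIVERSAL):
    """F1: 5'-sequencing adapter 1 + barcode + universal sequence-3'."""
    return adapter1 + barcode + universal


def build_f2(upstream_specific, tag="", universal=UNIVERSAL):
    """F2: 5'-universal sequence + optional molecular tag + upstream specific-3'."""
    return universal + tag + upstream_specific


def build_r(adapter2, downstream_specific):
    """R: 5'-sequencing adapter 2 + downstream specific-3'."""
    return adapter2 + downstream_specific


def primer_set(adapter1, adapter2, barcodes, amplicons, tag=""):
    """Build one F1 per sample and one F2/R pair per amplicon.

    barcodes:  {sample_name: barcode_sequence}
    amplicons: {amplicon_name: (upstream_specific, downstream_specific)}
    """
    f1 = {s: build_f1(adapter1, b) for s, b in barcodes.items()}
    f2 = {n: build_f2(up, tag) for n, (up, _) in amplicons.items()}
    r = {n: build_r(adapter2, down) for n, (_, down) in amplicons.items()}
    return f1, f2, r
```

Because every F1 ends with the universal sequence and every F2 starts with it, a single F1 per sample can extend products from any amplicon in the pool, which is what lets the barcode be independent of the detection site. This sketch assumes P = Q; a fusion panel where several upstream primers share one downstream primer would need a different input structure.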
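II. Amplification principle of the one-step amplicon sequencing library
The primers for the one-step rapid amplification library construction technology are designed as described above. Amplification of template DNA/RNA proceeds as shown in Figure 2. The upstream outer primer F1 and the upstream inner primer F2 share the universal sequence, so the upstream outer primer F1 can use the upstream inner primer F2 sequence as a template, thereby adding the sequencing adapter and sample barcode sequence to the target sequence. During amplification, the upstream inner primer MIX1 (the upstream primers F2 of multiple amplicons mixed in a specific ratio) and the downstream primer MIX2 (the corresponding downstream primers R of multiple amplicons mixed in a specific ratio) act on the template in the first cycle, producing amplification products carrying F2 and R. In the second cycle, in addition to these two PCR products, products carrying the F2 and R sequences at their two ends are obtained. In the third cycle, target products with the complete sequencing-library structure begin to appear, but only as single strands. In the fourth cycle, double-stranded products with complete adapter sequences at both ends are produced. Because the upstream outer primer F1 has a much higher Tm than the upstream inner primer F2, and is also present at a much higher concentration, the complete products (the two products marked by the red dashed box among the fourth-cycle PCR products) are subsequently amplified exponentially. After ten-odd to several tens of cycles, library construction is complete.
III. Establishing the detection method
1. One-step amplification
The primers synthesized in part I above are prepared as follows:
The upstream outer primer F1 is dissolved in water to a primer concentration of 100 μM. Each upstream inner primer F2 is dissolved in water to 100 μM, and the primers are then mixed in equimolar proportions to give the upstream inner primer MIX1. Each downstream primer R is dissolved in water to 100 μM, and the primers are then mixed in equimolar proportions to give the downstream primer MIX2.
Genomic DNA is extracted from each of the test samples.
The reagents listed in Table 1 are added in turn to 0.2 ml 8-tube strips or a 96-well plate (nucleic acid of each sample type is extracted according to the instructions of the manufacturer's kit specified in the examples):
Table 1. Amplification system for one sample
Reagent    Volume (μl)
KAPA HiFi PCR Kits (including but not limited to this DNA polymerase)    10
Genomic DNA of one sample (typically 5-20 ng gDNA)    1-10
Upstream inner primer MIX1 (100 μM)    0.01-5
Upstream outer primer F1 (100 μM)    0.01-5
Downstream primer MIX2 (100 μM)    0.01-5
DNase-free H2O    to 20
The PCR amplification program is shown in Table 2.
Table 2. PCR amplification program
Figure PCTCN2020105117-appb-000001
When the PCR reaction is complete, the resulting PCR product is the amplicon library.
2. Magnetic bead purification and Qubit quantification
After the PCR reaction, the product is purified with the Agencourt AMPure XP Kit from Beckman Coulter (Cat. No. A63880/A63881/A63882). The procedure is as follows:
1) Take out the Agencourt AMPure XP Kit 30 minutes in advance, vortex thoroughly, and let stand at room temperature.
2) After the PCR reaction, vortex the beads thoroughly again, add 24 μl of beads to the reaction, pipette up and down at least 5 times or vortex thoroughly, and let stand at room temperature for 5 minutes.
3) Transfer the EP tube to a magnetic rack and let stand for 5 minutes until the solution is clear; carefully remove the supernatant with a pipette, taking care not to touch the beads.
4) Add 100 μl of freshly prepared 80% ethanol to each tube, slowly rotate the EP tube 2 turns on the magnetic rack, let stand for 5 minutes, and discard the supernatant.
5) Repeat step 4 once.
6) Open the EP tube and let stand at room temperature until the liquid has evaporated completely, judged by the bead surface losing its gloss; take care not to over-dry the beads.
7) Remove the EP tube from the magnetic rack, add 30 μl of PCR-grade purified water, vortex to mix, and let stand at room temperature for 10 minutes.
8) Place the EP tube from the previous step on the magnetic rack for 2 minutes or until the solution is clear; carefully draw off the supernatant with a pipette from the side away from the magnet, taking care not to touch the beads.
The purified amplicon library is obtained.
The concentration of the purified amplicon library is measured with Qubit 2.0 and the library is checked on an Agilent 2200 TapeStation system.
3. Sequencing and result analysis
The purified amplicon libraries of multiple samples are pooled at equal concentrations and then diluted to 100 pM to obtain the DNA library for amplicon sequencing. After sequencing (sequencer: Ion GeneStudio™ S5 Plus System, Thermo Fisher, A38195) and data processing and analysis (S5 Torrent Server), the variation and variant frequency of the test samples are obtained.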
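Diluting a pooled library to a picomolar target requires converting the Qubit mass concentration into a molar concentration first. The sketch below uses the standard approximation of about 660 g/mol per base pair of double-stranded DNA; the mean fragment length and the example numbers are assumptions for illustration (in practice the length would come from the TapeStation trace), and the function names are hypothetical.

```python
def library_molarity_nM(conc_ng_per_ul, mean_len_bp, da_per_bp=660.0):
    """Convert a dsDNA mass concentration (ng/ul) to molarity (nM).

    nM = ng/ul * 1e6 / (660 g/mol per bp * fragment length in bp)
    """
    return conc_ng_per_ul * 1e6 / (da_per_bp * mean_len_bp)


def dilution_factor(conc_nM, target_pM=100.0):
    """Fold dilution needed to bring a library from conc_nM to target_pM."""
    return conc_nM * 1000.0 / target_pM


# Illustrative values only: a pool at 2 ng/ul with a 250 bp mean fragment length.
pool_nM = library_molarity_nM(2.0, 250)
fold = dilution_factor(pool_nM)
```

For these illustrative inputs the pool is roughly 12.1 nM and would need about a 121-fold dilution to reach 100 pM.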
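The variant frequency of a library carrying molecular tags is calculated as follows:
Because the original templates are molecularly tagged during library amplification, the variant frequency is calculated as follows:
In the sequencing results, DNA molecules carrying the same molecular tag are defined as one cluster; DNA molecules carrying the same molecular tag are the amplification products of one initial DNA template, that is, a series of DNA molecules amplified from the same original template;
Determine whether each cluster is mutated: if a particular base type accounts for ≥ 80% of the molecules at a given position in the cluster, the cluster is counted as a valid cluster; if, in a valid cluster, the mutated DNA molecules carrying the molecular tag account for ≥ 80%, the cluster is counted as a mutant cluster;
Variant frequency = number of mutant clusters / total number of valid clusters × 100%.
Note: a cluster is statistically meaningful only if it contains ≥ 2 DNA molecules (each being one sequenced read).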
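The cluster rules above (group reads by tag, require at least 2 reads, require ≥ 80% agreement, then count mutant clusters) can be written as a minimal sketch for a single site. The input format, a list of (tag, base at the site) pairs, and the function name `variant_frequency` are assumptions made for illustration; a real pipeline would first align reads and extract tags.

```python
from collections import Counter, defaultdict


def variant_frequency(reads, alt_base, min_reads=2, min_fraction=0.80):
    """Cluster reads by molecular tag and compute the variant frequency.

    reads: iterable of (tag, base_at_site) pairs for one genomic position.
    Returns (mutant_clusters, valid_clusters, frequency).
    """
    clusters = defaultdict(list)
    for tag, base in reads:
        clusters[tag].append(base)

    valid = mutant = 0
    for bases in clusters.values():
        if len(bases) < min_reads:
            continue  # single-read clusters carry no statistical weight
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) < min_fraction:
            continue  # no base reaches 80%: not a valid cluster
        valid += 1
        if base == alt_base:
            mutant += 1  # the >=80% consensus base is the mutant base
    frequency = mutant / valid if valid else 0.0
    return mutant, valid, frequency


example_reads = (
    [("tagA", "T")] * 5                      # valid, wild type
    + [("tagB", "G")] * 4 + [("tagB", "T")]  # valid, mutant (4/5 = 80%)
    + [("tagC", "G")]                        # skipped: only one read
    + [("tagD", "T")] * 2 + [("tagD", "G")] * 2  # skipped: 50/50 split
)
result = variant_frequency(example_reads, alt_base="G")
```

In this toy input two clusters are valid and one of them is mutant, so the frequency is 1/2. Requiring consensus within a cluster is what suppresses isolated PCR and sequencing errors, since an error introduced late affects only a minority of reads sharing a tag.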
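Example 2. Construction and sequencing of a one-step amplicon sequencing library
I. Primer design for one-step amplicon sequencing library construction
The detection region in this experiment contains 3 amplicons (EGFR L858R, 19del, and the ERBB2 insertion mutation);
The test samples comprise 2 frozen lung cancer tissue samples (Samples 1 and 2), 4 lung cancer FFPE (formalin-fixed, paraffin-embedded tissue) samples (Samples 3, 4, 5 and 6) and 2 leukocyte samples from healthy individuals (Samples 7 and 8); the variant status of all 8 samples is known.
The primers shown in Table 3 were designed for the 3 amplicons (EGFR L858R, 19del, and the ERBB2 insertion mutation) (8 barcode sequences are used in this example):
Table 3. Primer sequences for EGFR L858R, 19del and the ERBB2 insertion mutation
Figure PCTCN2020105117-appb-000002
The sequencing adapters are suitable for the Ion GeneStudio™ S5 Plus System sequencing platform.
II. One-step amplicon sequencing library
Nucleic acid extraction and purification kits (DNA extraction from FFPE samples: GeneRead DNA FFPE kit, Qiagen, 180134; DNA extraction from frozen tissue samples: QIAamp DNA Mini Kit 250, QIAGEN, 51306).
1. One-step amplification
Performed as in step 1 of part III of Example 1 to obtain the PCR products.
The amplification system is shown in Table 4.
Table 4. Amplification system
Reagent    Volume (μl)
KAPA HiFi PCR Kits (including but not limited to this DNA polymerase)    10
Genomic DNA of one sample    5-20
Upstream inner primer MIX1 (100 μM)    1
Upstream outer primer F1 (100 μM)    0.5
Downstream primer MIX2 (100 μM)    0.5
DNase-free H2O    to 20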
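The primer volumes in Table 4 can be turned into final concentrations and compared with the claimed F1 : F2 : R molar ratio of (5-20):(1-20):(5-20). The sketch below assumes that the stated 100 μM of each mix is the total primer concentration and that the ratio refers to those totals; both are interpretations, and the function names are hypothetical.

```python
def final_conc_uM(volume_ul, stock_uM=100.0, total_ul=20.0):
    """Final concentration after adding volume_ul of stock to a total_ul reaction."""
    return volume_ul * stock_uM / total_ul


def ratio_in_claimed_range(f1, f2, r, ranges=((5, 20), (1, 20), (5, 20))):
    """Check whether f1:f2:r can be scaled to fall inside the claimed ranges.

    A ratio is scale-free, so look for any common factor s with
    lo <= s * value <= hi for all three components.
    """
    values = (f1, f2, r)
    scale_min = max(lo / v for v, (lo, _) in zip(values, ranges))
    scale_max = min(hi / v for v, (_, hi) in zip(values, ranges))
    return scale_min <= scale_max


# Volumes from Table 4 (ul of 100 uM stock in a 20 ul reaction).
table4_volumes_ul = {"F1": 0.5, "F2_mix": 1.0, "R_mix": 0.5}
final = {name: final_conc_uM(v) for name, v in table4_volumes_ul.items()}
within = ratio_in_claimed_range(final["F1"], final["F2_mix"], final["R_mix"])
```

Under these assumptions the reaction contains 2.5 μM F1, 5 μM F2 mix and 2.5 μM R mix, a 1:2:1 ratio that can be written as 5:10:5 and therefore sits inside the claimed window.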
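Table 5. Amplification program
Figure PCTCN2020105117-appb-000003
2. Magnetic bead purification and Qubit quantification
Performed as in step 2 of part III of Example 1.
PCR products were recovered by magnetic bead purification (Agencourt AMPure XP, Beckman Coulter, A63880); library concentration was measured with Qubit 2.0 and the library was checked on an Agilent 2200 TapeStation system.
The Agilent 2200 TapeStation Systems results are shown in Figure 9.
3. Sequencing and result analysis
The PCR products of all samples were pooled at equal concentrations and diluted to 100 pM to obtain the DNA library for sequencing.
Sequencing results are shown in Table 6:
Table 6. Sequencing results
Figure PCTCN2020105117-appb-000004
EGFR: p.E746_A750delELREA denotes deletion of amino acids 746-750, ELREA (E: Glu, glutamic acid; L: Leu, leucine; R: Arg, arginine; E: Glu, glutamic acid; A: Ala, alanine), of EGFR; it is one form of EGFR 19del;
EGFR: p.K745_E749delKELRE denotes deletion of amino acids 745-749, KELRE (K: Lys, lysine; E: Glu, glutamic acid; L: Leu, leucine; R: Arg, arginine; E: Glu, glutamic acid), of EGFR; it is one form of EGFR 19del;
ERBB2: p.A775_G776insYVMA denotes insertion of YVMA (Y: Tyr, tyrosine; V: Val, valine; M: Met, methionine; A: Ala, alanine) between alanine 775 (A) and glycine 776 (G) of ERBB2, corresponding to ERBB2 in Table 3.
The 63-gene test is a tumor liquid biopsy product of 北京泛生子基因科技有限公司 for all patients with solid tumors. It uses high-throughput, high-accuracy next-generation sequencing to comprehensively detect variants at 63 gene loci closely related to targeted tumor therapy and to tumor occurrence and development (including mutation analysis of 58 genes, rearrangement analysis of 10 genes and CNV detection of 7 genes), with a sequencing depth of 20,000x over the target region and a detection sensitivity of up to 0.1%. It provides comprehensive, high-value reference information for precision medication, molecular typing, and monitoring of efficacy and recurrence.
These results show that, after library construction by the method of the present invention and sequencing, the point mutations, deletion mutations and insertion mutations detected in the tissue samples are consistent with the variant information obtained for these samples with the known 63-gene test.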
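Example 3. Construction and sequencing of a one-step amplicon sequencing library
The samples in this experiment are plasma samples from lung cancer patients, comprising plasma from 4 different patients and 2 healthy individuals (variant status known). cfDNA was extracted with a kit (MagMAX™ Cell-Free DNA Isolation Kit, Applied Biosystems™, A29319), and libraries were then constructed with a molecular-tagged primer pool covering EGFR L858R, 19del and the ERBB2 insertion mutation.
I. Primer design for one-step amplicon sequencing library construction
The primers shown in Table 7 were designed for the 3 amplicons (EGFR L858R, 19del, and the ERBB2 insertion mutation) (the upstream outer primers are the same; the others differ; 6 barcode sequences are used in this example):
Table 7. Primer sequences for EGFR L858R, 19del and the ERBB2 insertion mutation
Figure PCTCN2020105117-appb-000005
Figure PCTCN2020105117-appb-000006
II. One-step amplicon sequencing library
Nucleic acid extraction and purification kits (DNA extraction from FFPE samples: GeneRead DNA FFPE kit, Qiagen, 180134; DNA extraction from frozen tissue samples: QIAamp DNA Mini Kit 250, QIAGEN, 51306).
1. One-step amplification
Performed as in step 1 of part III of Example 1 to obtain the PCR products.
Table 8. Amplification system
Reagent    Volume (μl)
KAPA HiFi PCR Kits (including but not limited to this DNA polymerase)    10
Genomic DNA of one sample    5-20
Upstream inner primer MIX1 (100 μM)    0.5
Upstream outer primer F1 (100 μM)    1
Downstream primer MIX2 (100 μM)    1
DNase-free H2O    to 20
Table 9. Amplification program
Figure PCTCN2020105117-appb-000007
Figure PCTCN2020105117-appb-000008
2. Magnetic bead purification and Qubit quantification
Performed as in step 2 of part III of Example 1.
PCR products were recovered by magnetic bead purification (Agencourt AMPure XP, Beckman Coulter, A63880); library concentration was measured with Qubit 2.0 and the library was checked on an Agilent 2200 TapeStation system.
The Agilent 2200 TapeStation Systems results are shown in Figure 10.
3. Sequencing and result analysis
The PCR products of all samples were pooled at equal concentrations and diluted to 100 μM to obtain the DNA library for amplicon sequencing.
Sequencing results are shown in Table 10:
Table 10. Test results for 4 tissue samples and 2 healthy-individual samples
Figure PCTCN2020105117-appb-000009
EGFR: p.E746_A750delELREA denotes deletion of amino acids 746-750, ELREA (E: Glu, glutamic acid; L: Leu, leucine; R: Arg, arginine; E: Glu, glutamic acid; A: Ala, alanine), of EGFR; it is one form of EGFR 19del;
ERBB2: p.A775_G776insYVMA denotes insertion of YVMA (Y: Tyr, tyrosine; V: Val, valine; M: Met, methionine; A: Ala, alanine) between alanine 775 (A) and glycine 776 (G) of ERBB2, corresponding to ERBB2 in Table 7.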
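After library construction by the method of the present invention and sequencing, the point mutations, deletion mutations and insertion mutations detected in the plasma cfDNA samples are consistent with the variant information obtained for these samples with the known 63-gene test. A relatively large amount of ctDNA was extracted from the Patient 1 sample; after this sample was diluted 5-fold, L858R was still detected at a frequency of about 4.6 per 10,000, that is, about 0.46‰ (after read deduplication: mutant clusters = 2; total clusters at this site = 4380).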
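The cluster counts reported for the diluted Patient 1 sample can be converted to a frequency directly with the formula used throughout (mutant clusters divided by total valid clusters). The short sketch below only restates that arithmetic in several units; the helper name `frequency_units` is hypothetical.

```python
def frequency_units(mutant_clusters, total_clusters):
    """Express mutant/total as a fraction, percent, per mille and per 10,000."""
    fraction = mutant_clusters / total_clusters
    return {
        "fraction": fraction,
        "percent": fraction * 100,
        "per_mille": fraction * 1000,
        "per_10000": fraction * 10000,
    }


# Counts stated in the text: 2 mutant clusters out of 4380 clusters at the site.
l858r = frequency_units(2, 4380)
```

For 2 out of 4380 this gives roughly 0.046%, or about 4.6 per 10,000, which is of the same order as the 3-in-10,000 detection level discussed later in the description.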
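Example 4. Construction and sequencing of a one-step amplicon sequencing library
The samples in this experiment are FNA samples from 3 thyroid cancer patients with gene fusions (fusion information known) and FNA samples from 2 patients with benign thyroid nodules. RNA was extracted with the MagMAX™ FFPE DNA/RNA Ultra Kit (Applied Biosystems™, A31881) according to the manufacturer's instructions, and then reverse-transcribed with SuperScript™ VILO™ MasterMix (Invitrogen™, 11755050) according to the manufacturer's kit instructions.
I. Primer design for one-step amplicon sequencing library construction
The primers shown in Table 11 were designed for the gene fusions (the upstream outer primers are the same as in Table 3; the others differ; 5 barcode sequences are used in this example). Fusion detection primers are designed on either side of the breakpoint, with no fixed pairing of upstream and downstream primers; for example, among the primers designed around the fusion breakpoints, ALK_20 is combined with EML4_6 and with EML4_13 respectively to detect two ALK-EML4 fusion forms.
Table 11. Primers for gene fusion
Figure PCTCN2020105117-appb-000010
Figure PCTCN2020105117-appb-000011
II. One-step amplicon sequencing library
1. One-step amplification
Performed as in step 1 of part III of Example 1 to obtain the PCR products.
Table 12. Amplification system
Reagent    Volume (μl)
Platinum multiplex PCR Master Mix    15
Thy RNA Fusion Panel    2
Barcode (50 μM)    1
cDNA    ≤12
ddH2O    to 30
Table 13. Amplification program
Figure PCTCN2020105117-appb-000012
2. Magnetic bead purification and Qubit quantification
Performed as in step 2 of part III of Example 1.
PCR products were recovered by magnetic bead purification (Agencourt AMPure XP, Beckman Coulter, A63880); library concentration was measured with Qubit 2.0 and the library was checked on an Agilent 2200 TapeStation system.
The Agilent 2200 TapeStation Systems results are shown in Figure 11.
3. Sequencing and result analysis
The PCR products of all samples were pooled at equal concentrations and diluted to 100 μM to obtain the DNA library for amplicon sequencing.
Sequencing results are shown in Table 14:
Table 14. Comparison of gene fusion detection results
Sample    Method of the present invention    63-gene test
Patient 1    EML4-ALK-V3a (E6a A20)    EML4-ALK-V3a (E6a A20)
Patient 2    EML4-ALK-V3b (E6b A20)    EML4-ALK-V3b (E6b A20)
Patient 3    EML4-ALK-V1 (E13 A20)    EML4-ALK-V1 (E13 A20)
Healthy individual 1
Healthy individual 2
EML4-ALK-V3a (E6a A20) corresponds to EML4_6_ and ALK_20 in Table 11;
EML4-ALK-V1 (E13 A20) corresponds to EML4_13_ and ALK_20 in Table 11.
The 63-gene test uses custom Agilent probes for capture-based library construction; it has been used to test thousands of clinical plasma samples and its performance is stable.
After library construction by the method of the present invention and sequencing, the fusion variants detected in the samples are consistent with the variant information obtained for these samples with the known 63-gene test.
The above examples serve only to illustrate the present invention; the structure of each component, the connection mode, the manufacturing process and the like may all be varied, and any equivalent changes and improvements made on the basis of the technical solution of the present invention shall not be excluded from its scope of protection.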
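Example 5. BRCA1/2 one-step primer pool
I. Primer design for one-step amplicon sequencing library construction
The upstream outer primers are the same as in Table 3; the others differ; the number of barcodes depends on the number of samples in the library.
Upstream inner primer F2: universal sequence + upstream specific primer sequence
Downstream primer R: sequencing adapter 2 + downstream specific primer sequence
Table 15. BRCA1/2 primer set
Figure PCTCN2020105117-appb-000013
Figure PCTCN2020105117-appb-000014
Figure PCTCN2020105117-appb-000015
Figure PCTCN2020105117-appb-000016
Figure PCTCN2020105117-appb-000017
Figure PCTCN2020105117-appb-000018
Figure PCTCN2020105117-appb-000019
Figure PCTCN2020105117-appb-000020
II. One-step amplicon sequencing library
1. One-step amplification
Using 0.5 pg of leukocyte gDNA from a healthy individual as the starting sample, amplification was performed as in step 1 of part III of Example 1 to obtain the PCR products.
2. Magnetic bead purification and Qubit quantification
Performed as in step 2 of part III of Example 1.
PCR products were recovered by magnetic bead purification (Agencourt AMPure XP, Beckman Coulter, A63880); library concentration was measured with Qubit 2.0 and the library was checked on an Agilent 2200 TapeStation system.
The Agilent 2200 results are shown in Figure 3: the constructed library has high specificity, with no non-specific amplification products or primer dimers; library quality is high and the library can be sequenced.
3. Sequencing and result analysis
The PCR products of all samples were pooled at equal concentrations and diluted to 100 μM to obtain the DNA library for amplicon sequencing.
Sequencing results are shown in Figure 4. Sequencing analysis shows that the 121 amplicons of the BRCA1/2 detection library have good uniformity, demonstrating the advantage of the one-step library construction technology of the present invention in amplicon uniformity and ensuring effective data output.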
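Comparative Example. Comparison of the 3 primers of the method of the present invention with the 4 primers of the prior art
I. Primer design for one-step amplicon sequencing library construction
The design structures of the 3-primer and 4-primer schemes are shown in Figure 5.
The present invention:
The 3 primers of the present invention were designed according to the design principles of Example 1:
The upstream outer primers are the same as in Table 3, with 67 barcode sequences; the universal sequence is the same as in Table 15, and the upstream specific primer sequences are P1_B2_F1 to P1_B2_F67 in Table 15;
Sequencing adapter 2 is the same as in Table 15, and the downstream specific primer sequences are P1_B2_R1 to P1_B2_R67 in Table 15;
Control:
4 primers were designed according to the following prior-art principles:
Design principles:
Barcode primer F1: sequencing adapter 1 + barcode sequence + universal sequence 1;
Upstream inner primer F2: universal sequence 1 + molecular tag + specific base sequence + upstream specific primer sequence;
Downstream outer primer R1: sequencing adapter 2 + universal sequence 2;
Downstream inner primer R2: universal sequence 2 + downstream specific primer sequence;
The above sequencing adapter 1 + barcode sequences are shown in Table 3.
The remainder are shown in Table 16 below:
Table 16. Control primer sequences
Figure PCTCN2020105117-appb-000021
Figure PCTCN2020105117-appb-000022
Figure PCTCN2020105117-appb-000023
Figure PCTCN2020105117-appb-000024
Figure PCTCN2020105117-appb-000025
II. One-step amplicon sequencing library
The method is the same as in part II of Example 2.
Sequencing; the results are analysed as follows:
1. Uniformity of libraries built with the three-functional-component and four-functional-component primer pools
One very important indicator of library quality is the uniformity of the amplicons in the library. Good library uniformity means higher coverage of the library's target region and better detection accuracy across the region covered by the panel. For this purpose, while keeping the overall functional architecture of the primers complete, the amplicon primer design was improved and the primer structure optimized, from the original F1+F2+R1+R2 (4 functional primer components) to F1+F2+R (3 functional primer components). This design increases the stability of the reaction system and ensures uniformity among library amplicons.
Using the same leukocyte DNA sample as template, amplification was performed with the primer set of the present invention and with the control primer set.
The results are shown in Figure 6. Without adjusting the ratios of the specific primers, amplicon uniformity was compared between libraries of 67 amplicons (the 67 BRCA2 amplicons selected from the 121 primer pairs in Example 5) built with the 3-function and the 4-function primer components; the 3-functional-component primers show a marked advantage in library uniformity.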
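The text does not say how amplicon uniformity was quantified in Figure 6. One widely used convention, adopted here purely as an assumption, is the fraction of amplicons whose read depth is at least 0.2 times the mean depth. The sketch below implements that convention on made-up depths; neither the metric choice nor the numbers come from the patent.

```python
def uniformity(depths, threshold=0.2):
    """Fraction of amplicons with depth >= threshold * mean depth.

    This is an assumed metric for illustration, not the patent's definition.
    """
    mean_depth = sum(depths) / len(depths)
    cutoff = threshold * mean_depth
    return sum(d >= cutoff for d in depths) / len(depths)


# Hypothetical per-amplicon read depths for a five-amplicon panel.
example_depths = [1000, 900, 1100, 50, 950]
example_uniformity = uniformity(example_depths)
```

Here the mean depth is 800, the cutoff is 160, and four of the five amplicons clear it, giving a uniformity of 0.8. Comparing this value between the 3-component and 4-component libraries on the same template would be one way to put a number on the difference the figure describes.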
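2. Number of molecular tag types (clusters) for one amplicon, obtained after data analysis of a library constructed from 30 ng cfDNA with the one-step primer pool
Using the same cfDNA sample as template, amplification was performed with the primer set of the present invention and with the control primer set.
The results are shown in Figure 7. Compared with the four-functional-component primers, the three-functional-component primers capture original templates more efficiently, which makes detection of ultra-low frequencies more sensitive and stable. The figure shows one randomly selected amplicon from the three-functional-component primer scheme, with the data obtained after tagging the original templates during library construction. The higher template capture efficiency gives the three-functional-component primer scheme a lower limit of detection for variant frequency.
3. Background noise at the 0.1‰-1‰ level after sequencing libraries built by the two methods (as in 2)
Using the same cfDNA sample as template, amplification was performed with the primer set of the present invention and with the control primer set.
The results are shown in Figure 8. Compared with amplicon library construction using four-functional-component primers, the three-functional-component primers effectively improve template capture efficiency, reduce non-specific amplification in the library and reduce the number of amplification cycles. Comparison of the two methods also shows that, with a high-fidelity DNA polymerase used in both, the three-functional-component primer scheme has lower background noise in the sequencing data at the 5-in-10,000 level. The lower background noise makes the three-function primer scheme more accurate when detecting lower-frequency mutations.
Good amplification uniformity, efficient capture of original template molecules, a high-fidelity DNA polymerase and ultra-low background noise, combined with the introduction of molecular tags, ultimately enable the three-function primer components to detect ultra-low-frequency mutations effectively at the 3-in-10,000 level. The primer structure of this library construction method has been fully optimized, and its performance is far better than that of conventional low-frequency mutation detection by library construction.
The comparison of the one-step rapid library construction method of the present invention with ordinary amplification-based and capture-based library construction is shown in Tables 17 and 18:
Table 17. Comparison of the one-step rapid library construction method, ordinary amplification-based library construction and capture-based library construction
Figure PCTCN2020105117-appb-000026
Table 18. Comparison of the proportion of target fragments in the library between this method and the comparison method
Sample ID    RIN    Main-peak fraction, one-step library    Main-peak fraction, comparison library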
LAAAFST1 3.2 50.66% 30.28%
PC949TQ2 2.9 100% 57.48%
PA970TQ1 2.8 100% 56.71%
LAAAF0T1 2.8 100% 73.34%
LAAAFPT1 2.7 84.86% 59.94%
PD010TQ1 2.3 80.17% 69.85%
PC916TQ1 2.2 100% 82.54%
PC980TQ1 2 100% 52.30%
PC977TP1 1.7 100% 51.83%
LAAAEVT1 1.7 100% 38.01%
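After amplicon library construction, the reaction may contain amplification products of the target fragments, primer dimers or multimers, and non-specifically amplified fragments; a high proportion of target-fragment amplification product is therefore an extremely important indicator of amplicon library quality. Table 18 demonstrates the great advantage of this method over the comparison method in the proportion of target fragments in the library.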
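The main-peak fractions in Table 18 can be summarised numerically. The sketch below copies the ten rows of the table as given (one-step library versus comparison library, in percent) and computes the mean of each column and the number of samples in which the one-step library has the higher main-peak fraction; the variable and function names are illustrative only.

```python
# Main-peak fractions (%) from Table 18, in row order:
# LAAAFST1, PC949TQ2, PA970TQ1, LAAAF0T1, LAAAFPT1,
# PD010TQ1, PC916TQ1, PC980TQ1, PC977TP1, LAAAEVT1
one_step = [50.66, 100, 100, 100, 84.86, 80.17, 100, 100, 100, 100]
comparison = [30.28, 57.48, 56.71, 73.34, 59.94, 69.85, 82.54, 52.30, 51.83, 38.01]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


mean_one_step = mean(one_step)
mean_comparison = mean(comparison)
# Count samples where the one-step library has the larger main-peak fraction.
samples_higher = sum(a > b for a, b in zip(one_step, comparison))
```

On these ten samples the one-step libraries average about 91.6% main peak against about 57.2% for the comparison libraries, and the one-step value is higher in all ten pairs.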
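Industrial Applicability
To overcome current difficulties in library construction, the present invention provides a one-step rapid amplification library construction method. Compared with conventional capture methods, this amplification-based method has the following advantages (Figure 1): it is simple and fast and places low demands on the operator; library construction requires only an ordinary PCR setup and the corresponding reaction time; and because the quality and purity of the resulting library are very high, only a simple round of magnetic bead purification and Qubit quantification is needed before normal sequencing. This one-step library construction technology can be applied to all next-generation sequencing platforms, including Ion Torrent, Illumina and BGI/MGI. Based on this method, the present invention has developed detection products for SNPs, Ins/Del, CNV and methylation in DNA, and for gene fusion and expression in RNA samples.
By adopting the above technical solutions, the present invention has the following advantages:
1. Low sample input and high utilization. Original template molecules in the sample are captured efficiently, so relatively little starting template is needed. Germline mutation detection needs only picogram-level starting template. In low-frequency variant detection on cfDNA, the limited starting template is captured more efficiently, so trace ctDNA molecules are captured effectively, giving a lower limit of detection and higher sensitivity;
2. Ultra-low limit of detection. The distinctive primer design, the matching PCR reaction system and conditions, and the downstream bioinformatic noise-reduction system together achieve a minimum mutation detection limit of 3 in 10,000, making accurate detection of mutations in very early-stage, trace ctDNA samples possible;
3. Good library uniformity. The innovative primer structure and matching reaction system bring amplicon uniformity in the library to an optimal level. In multiplex amplification, differences in sequence features between amplicons and in amplification efficiency between primers lead to widely varying amplicon abundance in the library, and how well these differences are balanced is an important indicator of library quality. The 3-component functional primer composition used in this method has a clear advantage over the previous 4-component version: the primer composition and reaction system together ensure that differential amplification between amplicons is confined to a reduced number of cycles, after which amplification proceeds in a manner similar to universal-primer amplification. Because there is no competition between R1 and R2 (as shown in the figure), stable, low-bias amplification is achieved in the subsequent cycles;
4. High reproducibility. A composition of 4 functional primers increases the uncertainty of the reaction system and reaction conditions and is more sensitive to sample quality, the reaction system and external environmental influences. The composition of 3 functional primers improves on exactly this point: the simpler system is more stable, and the repeatability and accuracy of sample detection are higher;
5. Simple operation and time saving. Conventional capture-based library construction is cumbersome and lengthy: the whole workflow takes nearly 48 h and places high demands on the operator. Ordinary amplification-based library construction requires at least two rounds of PCR and two rounds of purification, plus subsequent qPCR quantification, and the whole workflow takes at least one working day. The present invention involves only a one-step PCR reaction and the corresponding product purification; the whole library construction workflow can be completed within 1.5 h, which simplifies the procedure and saves time (library construction takes no more than 1.5 hours, and the whole process from library construction through the end of sequencing and completion of bioinformatic analysis can be kept within 22 hours);
6. Multiple types of gene variation can be detected. Starting from DNA samples, SNPs, SNVs, Ins/Del, methylation, gene- or exon-level copy number variation and chromosome-arm-level copy number variation can be detected; in addition, with molecular tags added to the primers, mutations down to the 1‰ level can be detected. Starting from RNA samples, expression of specific genes and fusions between specific genes can be detected;
7. Multiple sample types. The starting sample can be a tissue sample such as fresh tissue, a frozen sample, a needle biopsy sample or an FFPE sample; cfDNA or CTCs isolated from blood, urine, cerebrospinal fluid or pleural effusion can also be tested. Once DNA or RNA has been extracted from the sample in the usual way, the one-step rapid amplification library construction method can be applied;
8. Effective prevention of cross-contamination between samples. The barcode sequence that distinguishes samples is added at the very start of PCR, and the simplified procedure effectively prevents cross-contamination that could arise during library construction. This matters especially for low-frequency variant detection, where cross-contamination between samples is very likely to be called as a false-positive mutation;
9. Lower library construction cost. Compared with conventional capture technology, the cost of library preparation is greatly reduced. The capture probes used in conventional capture-based library construction are expensive, and the reagents and consumables used in its lengthy workflow add substantially to the cost; by comparison, the one-step process uses far fewer reagents and consumables, so its cost is much lower than that of conventional capture-based library construction. In addition, compared with the one-step rapid amplification method, the at least one additional round of PCR and purification in ordinary amplification-based library construction, together with qPCR quantification of the library, also greatly increases cost. The 3-functional-primer composition uses less primer in total and less of each primer component than the earlier 4-primer composition, so it also has a cost advantage;
10. Space saving. Because the method requires only one round of PCR, the laboratory needs only 3 separate rooms (sample extraction; PCR amplification; library purification and sequencing), compared with the 4 rooms needed for conventional library preparation (sample extraction; PCR1; PCR2 and library purification; sequencing).
A flexible and simple library construction method, the ability to detect multiple variant types, and very high detection sensitivity are the main features of the present invention.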

Claims (14)

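  1. A primer combination for constructing an amplicon library for detecting variation in a target gene, comprising:
    an upstream outer primer F1, an upstream inner primer F2 and a downstream primer R designed according to the target amplicon;
    the upstream outer primer F1 consists, in order, of sequencing adapter sequence 1, a barcode sequence for distinguishing different samples, and a universal sequence;
    the upstream inner primer F2 consists, in order, of the universal sequence and the upstream specific primer sequence of the target amplicon;
    the downstream primer R consists, in order, of sequencing adapter 2 and the downstream specific primer sequence of the target amplicon.
  2. The primer combination according to claim 1, characterized in that: the upstream inner primer F2 consists, in order, of the universal sequence, a molecular tag sequence and the upstream specific primer sequence of the target amplicon.
  3. The primer combination according to claim 2, characterized in that: the molecular tag sequence consists of 6-30 bases, comprising random bases and at least one group of specific bases; the specific bases are placed among the random bases; each group of specific bases consists of 1-5 bases.
  4. The primer combination according to any one of claims 1-3, characterized in that: the barcode sequence is a nucleotide sequence 6-12 nt long, with no run of 3 or more consecutive bases, and a GC content of 40-60%;
    universal sequence 1 and universal sequence 2 are each 16-25 nt long, with no consecutive bases, a GC content of 35-65% and no obvious secondary structure.
  5. The primer combination according to any one of claims 1-3, characterized in that: sequencing adapter 1 and sequencing adapter 2 are the sequencing adapters corresponding to the chosen sequencing platform.
  6. The primer combination according to claim 5, characterized in that:
    the sequencing platform is the Illumina platform, sequencing adapter 1 is I5 and sequencing adapter 2 is I7;
    or the sequencing platform is the Ion Torrent platform, sequencing adapter 1 is A and sequencing adapter 2 is P;
    or the sequencing platform is the BGI/MGI platform;
    or, the nucleotide sequence of the universal sequence is Sequence 1.
  7. A kit for constructing an amplicon library for detecting variation in a target gene, characterized in that: it comprises the primer combination according to any one of claims 1-6.
  8. Any of the following uses of the primer combination according to any one of claims 1-6 or the kit according to claim 7:
    (1) use in constructing an amplicon library for detecting variation in a target gene;
    (2) use in detecting mutation sites or variation in a target region of a test sample;
    (3) use in detecting the variant frequency of a target region of a test sample.
  9. A method for constructing an amplicon library for detecting variation in a target gene, comprising the following step:
    using the primer combination according to any one of claims 1-6 or the kit according to claim 7, with DNA or cDNA of the test sample as template, performing one-step PCR amplification to obtain an amplification product, which is the amplicon library of the target gene.
  10. The method according to claim 9, characterized in that: the test sample is an ex vivo tissue sample, frozen sample, needle biopsy sample, FFPE sample, blood, urine, cerebrospinal fluid or pleural effusion.
  11. A method for detecting variation of a target gene in a test sample, comprising the following steps:
    1) preparing an amplicon library of the target gene by the method according to claim 9;
    2) pooling the target-gene amplicon libraries of all samples and diluting to obtain a sequencing DNA library;
    3) sequencing the sequencing DNA library to obtain sequencing results, and analysing variation of the target gene in the test sample from the sequencing results.
  12. The method according to claim 11, characterized in that: the test sample is an ex vivo tissue sample, frozen sample, needle biopsy sample, FFPE sample, blood, urine, cerebrospinal fluid or pleural effusion.
  13. A method for detecting the variant frequency of a target region of a test sample, comprising the following steps:
    1) preparing an amplicon library of the target gene by the method according to claim 9;
    2) pooling the target-gene amplicon libraries of all samples and diluting to obtain a sequencing DNA library;
    3) sequencing the sequencing DNA library to obtain sequencing results, and calculating the variant frequency of the target gene in the test sample from the sequencing results;
    the variant frequency = number of mutant clusters / total number of valid clusters × 100%.
  14. The method according to claim 13, characterized in that: the test sample is an ex vivo tissue sample, frozen sample, needle biopsy sample, FFPE sample, blood, urine, cerebrospinal fluid or pleural effusion.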
PCT/CN2020/105117 2019-07-30 2020-07-28 一种建库方法及应用 WO2021018127A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP20845892.7A EP3995588A4 (en) 2019-07-30 2020-07-28 LIBRARY CREATION PROCESS AND APPLICATION
KR1020227006561A KR20220077907A (ko) 2019-07-30 2020-07-28 라이브러리 구축 방법 및 응용
US17/631,214 US20220267760A1 (en) 2019-07-30 2020-07-28 Library preparation method and application

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910694844.3A CN112301430B (zh) 2019-07-30 2019-07-30 一种建库方法及应用
CN201910694844.3 2019-07-30

Publications (1)

Publication Number Publication Date
WO2021018127A1 true WO2021018127A1 (zh) 2021-02-04

Family

ID=74229473

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105117 WO2021018127A1 (zh) 2019-07-30 2020-07-28 一种建库方法及应用

Country Status (5)

Country Link
US (1) US20220267760A1 (zh)
EP (1) EP3995588A4 (zh)
KR (1) KR20220077907A (zh)
CN (1) CN112301430B (zh)
WO (1) WO2021018127A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114317696A (zh) * 2021-12-24 2022-04-12 深圳裕康医学检验实验室 一种试剂盒及其文库构建方法与污染检测方法
CN114507728B (zh) * 2022-03-03 2024-03-22 苏州贝康医疗器械有限公司 一种捕获引物及其应用

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104372093A (zh) * 2014-11-10 2015-02-25 博奥生物集团有限公司 一种基于高通量测序的snp检测方法
CN105524983A (zh) * 2014-09-30 2016-04-27 大连晶泰生物技术有限公司 基于高通量测序的标记和捕获多个样本的一个或多个特定基因的方法和试剂盒
CN106555226A (zh) * 2016-04-14 2017-04-05 北京京诺玛特科技有限公司 一种构建高通量测序文库的方法和试剂盒

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106835292B (zh) * 2017-04-05 2019-04-09 北京泛生子基因科技有限公司 一步法快速构建扩增子文库的方法
CN107012139A (zh) * 2017-04-05 2017-08-04 北京泛生子医学检验实验室有限公司 一种快速构建扩增子文库的方法
CN107604067A (zh) * 2017-10-19 2018-01-19 北京泛生子基因科技有限公司 一种用于检测目的基因低频突变的引物及试剂盒
CN107604045A (zh) * 2017-10-19 2018-01-19 北京泛生子基因科技有限公司 一种用于检测目的基因低频突变的扩增子文库的构建方法
CN109797437A (zh) * 2019-01-18 2019-05-24 北京爱普益生物科技有限公司 一种检测多个样品时测序文库的构建方法及其应用

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105524983A (zh) * 2014-09-30 2016-04-27 大连晶泰生物技术有限公司 基于高通量测序的标记和捕获多个样本的一个或多个特定基因的方法和试剂盒
CN104372093A (zh) * 2014-11-10 2015-02-25 博奥生物集团有限公司 一种基于高通量测序的snp检测方法
CN106555226A (zh) * 2016-04-14 2017-04-05 北京京诺玛特科技有限公司 一种构建高通量测序文库的方法和试剂盒

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BALECH, B. ET AL.: "Tackling critical parameters in metazoan meta-barcoding experiments: a preliminary study based on coxI DNA barcode", PEERJ, vol. 6, 13 June 2018 (2018-06-13), pages e4845, XP055777555 *
BHAGWAT, R.M. ET AL.: "Two New Potential Barcodes to Discriminate Dalbergia Species", PLOS ONE, vol. 10, no. 11,, 16 November 2015 (2015-11-16), pages e0142965, XP055777557 *
See also references of EP3995588A4 *
ZHAO, HUANYING ET AL.: "Application of Polymerase Chain Reaction-high-resolution Melt Technology for Bacterial Identification in Samples Collected from Lower Respiratory Tract", JOURNAL OF MICROBES AND INFECTIONS, vol. 10, no. 5, 25 October 2015 (2015-10-25), pages 308 - 314, XP055777562 *

Also Published As

Publication number Publication date
CN112301430B (zh) 2022-05-17
EP3995588A1 (en) 2022-05-11
KR20220077907A (ko) 2022-06-09
EP3995588A4 (en) 2022-10-05
CN112301430A (zh) 2021-02-02
US20220267760A1 (en) 2022-08-25

Similar Documents

Publication Publication Date Title
JP6161607B2 (ja) サンプルにおける異なる異数性の有無を決定する方法
CN108300716A (zh) 接头元件、其应用和基于不对称多重pcr进行靶向测序文库构建的方法
CN106834275A (zh) ctDNA超低频突变检测文库的构建方法、试剂盒及文库检测数据的分析方法
CN107541791A (zh) 血浆游离dna甲基化检测文库的构建方法、试剂盒及应用
WO2021073490A1 (zh) 一种检测ctDNA中肿瘤特异基因的变异和甲基化的方法
WO2020233094A1 (zh) 一种ngs建库分子接头及其制备方法和用途
WO2020007089A1 (zh) 一种同时检测多种肝癌常见突变的ctDNA文库构建和测序数据分析方法
WO2019144582A1 (zh) 用于检测基因突变和已知、未知基因融合类型的高通量测序靶向捕获目标区域的探针和方法
WO2016049878A1 (zh) 一种基于snp分型的亲子鉴定方法及应用
WO2018184495A1 (zh) 一步法构建扩增子文库的方法
CN109971827A (zh) 血浆dna的建库方法和建库试剂盒
CN108517567B (zh) 用于cfDNA建库的接头、引物组、试剂盒和建库方法
WO2021018127A1 (zh) 一种建库方法及应用
CN111073961A (zh) 一种基因稀有突变的高通量检测方法
US20210095393A1 (en) Method for preparing amplicon library for detecting low-frequency mutation of target gene
WO2018028001A1 (zh) 特异捕获并重复复制低频率dna碱基变异的方法及应用
CN110760936A (zh) 构建dna甲基化文库的方法及其应用
CN109082470A (zh) 微卫星不稳定性状态的二代测序引物探针组及其检测方法
WO2023216707A1 (zh) Nk细胞治疗产品通用型临床前生物分布检测试剂盒
CN111690748B (zh) 使用高通量测序检测微卫星不稳定的探针组、试剂盒及微卫星不稳定的检测方法
CN109385469A (zh) 一种高灵敏度双链循环肿瘤dna检测方法及试剂盒
CN111575347A (zh) 构建用于同时获得血浆中游离dna甲基化和片段化模式信息的文库的方法
CN108103143B (zh) 一种目标区域多重pcr与快速文库构建的方法
WO2023226939A1 (zh) 用于检测结直肠癌淋巴结转移的甲基化生物标记物及其应用
CN107513570A (zh) 基于高通量测序均一化多靶标文库构建的方法及试剂盒

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20845892

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2020845892

Country of ref document: EP

Effective date: 20220204