WO2022007863A1 - Method for rapid enrichment of a target gene region - Google Patents

Method for rapid enrichment of a target gene region

Info

Publication number
WO2022007863A1
Authority
WO
WIPO (PCT)
Prior art keywords
probe
nucleic acid
exonuclease
sequence
purification
Prior art date
Application number
PCT/CN2021/105073
Other languages
English (en)
Chinese (zh)
Inventor
姜正文
丁慧
Original Assignee
天昊基因科技(苏州)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 天昊基因科技(苏州)有限公司
Publication of WO2022007863A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6869 Methods for sequencing

Definitions

  • the invention relates to the field of biotechnology, and more particularly to a method for rapid enrichment of target gene regions.
  • the total cost of genomic research includes the cost of DNA sequencing, data management, and data analysis (producing directly interpretable data); for research at the large-population level as well as for clinical applications, the actual cost of genomic research is therefore difficult to reduce in a short period of time.
  • a new research approach enriches specific regions, such as disease-related regions and biological pathways, genes, or even the entire exome (1% of the genome), and then studies them without bias; this approach is target region enrichment for high-throughput sequencing.
  • Target region enrichment high-throughput sequencing designs probes for one or several sequences of interest, captures and enriches them by various methods, and then sequences and analyzes the captured sequences. Owing to its flexible probe design and high coverage depth, it is well suited to large-scale disease sample analysis and to verifying the results of whole-genome analysis, GWAS, or linkage analysis; it can not only verify loci that have already been discovered but also find further disease susceptibility loci in candidate regions.
  • enriching the target gene and then performing high-throughput sequencing (NGS) has the following advantages: 1) it can significantly reduce cost, 2) the high sequencing depth of the target region gives more accurate sequencing results, 3) it shortens the project turnaround time, and 4) the well-defined function of the target region makes the results easier to analyze.
  • target region enrichment combined with high-throughput sequencing can analyze larger sample populations, and the method also has important application value in biomedical research, in the clinical diagnosis of Mendelian diseases, and ultimately in individualized medicine based on individual genetic characteristics.
  • the enrichment of the target region is the first task. Choosing the most suitable enrichment method for a specific research project requires considering the size of the entire enrichment region, the number of samples, and whether multiple samples need to be sequenced simultaneously (to make the most efficient use of the sequencer's throughput).
  • many enrichment techniques are used in scientific research and on commercial platforms, but they can be divided into three categories according to their core reaction principle: target region enrichment based on PCR amplification, on circularization, and on hybridization capture.
  • PCR amplification: the target region is amplified directly by multiple long-range PCR reactions, or a large number of short fragments are amplified by standard multiplex PCR with limited multiplicity or by several highly multiplexed PCR reactions.
  • innovative multiplex PCR (Ion AmpliSeq™ from Life Technologies, GeneRead DNAseq System from Qiagen, TargetRich™ from Kailos), microdroplet PCR (RainDance), or chip-based PCR (Access Array™ from Fluidigm) can also be used.
  • PCR-based methods are best suited for small target regions in the 10-100 kb range, and such enrichment methods typically require target region-specific primer design and PCR reactions.
  • the main problems of the PCR amplification method are: the sequence variation of the primer binding region can easily lead to the loss of the amplicon, and the structural variation can only be found by the reduction of sequencing reads.
  • Circularization: also called molecular inversion probes (MIPs), gap-fill padlock probes, or selector probes.
  • In the range of 100-500 kb, a single-stranded DNA circle containing the sequence of the target region is formed in a highly specific manner (gap-filling and ligation reactions), generating a structure that contains a common DNA element, through which the target region of interest is selectively amplified; representative methods are HaloPlex (Agilent) and MIPs.
  • the main problems of this method are: sequence variation in the primer binding region easily leads to loss of amplicon, relatively low sensitivity and uniformity, and relatively high probe cost.
  • Hybridization capture: the nucleic acid in the sample is hybridized with DNA/RNA probes complementary to the target region, either anchored on a solid support or free in liquid, and the sequence of interest is then isolated by physical capture.
  • the capture range runs from 500 kb to the whole exome, and several classic commercial hybridization methods have been developed, such as SureSelect (Agilent), Nextera (Illumina), TruSeq (Illumina), SeqCap (NimbleGen), and Ion TargetSeq (Life Technologies); these methods offer better capture efficiency and cost-effectiveness for large, predesigned regions.
  • the main problem of the hybridization capture method is that its requirements for sample quality and quantity are relatively high, so it generally cannot be used for FFPE samples, although in practice the optimized TruSeq and SureSelect methods can also be used for FFPE samples.
  • probe hybridization-based methods are the most broadly applicable and have been widely used for exome capture in humans and mice. They are further divided into solid-phase capture (e.g. chip capture) and liquid-phase capture, depending on how the capture reaction takes place. Liquid-phase capture is more popular because it lends itself better to automated, mechanized capture.
  • in addition to the inherent shortcomings of these hybridization capture methods, such as low capture efficiency and the need for tedious, time-consuming DNA library construction steps, their pre-made probe libraries greatly limit the flexibility of most studies in choosing target regions and species.
  • PCR-based enrichment can bypass the preparation of shotgun libraries and instead directly use appropriate 5' primers for fragment amplification in the final amplification stage before sequencing, which makes it relatively flexible in the selection of candidate regions and in laboratory operation.
  • the main drawback of this method is that it does not scale easily (problems such as cross-matching between multiple primers, dimer formation, and non-specific matching), whether for enrichment of very large genomic regions or for simultaneous processing of large numbers of samples.
  • the circularization method based on molecular inversion probes is very different from the other methods. Its most notable feature is extremely high specificity, but it is difficult to process multiple samples simultaneously in a single reaction.
  • Each pair of probes used for circular enrichment consists of a single-stranded DNA oligonucleotide, the sequences at both ends of which are complementary to part of the discontinuous fragments in the enriched region, respectively, in an inverted linear sequence.
  • The target-complementary arms, i.e. the 5' and 3' ends of a padlock probe, are brought close together when they hybridize to the target sequence, leaving a gap over the target region; if the probe is phosphorylated at its 5' end, DNA ligase joins the two ends to form a circular padlock probe that is linked to the target region. Following circularization, exonuclease digestion removes large amounts of uncircularized probes and DNA fragments.
  • the target region is then amplified by rolling circle amplification or by direct PCR targeting the common sequences on all circles, generating an NGS library. For detecting the presence or absence of target sequences, this reaction requires only a single hybridization event, and its specificity is excellent.
  • TruSeq Custom Amplicon, developed by Illumina, is a fully customizable, amplification-based targeted resequencing system for amplicon detection that lets researchers focus on any key region of the genome of interest, allowing simultaneous sequencing of up to 1,536 amplicons covering a genomic region of 600 kb in a single reaction.
  • the system is based on the extension-ligation reaction.
  • a pair of oligonucleotide probes, each consisting of a universal sequence and a specific sequence complementary to the sequence flanking the amplicon, is designed for each target-region amplicon.
  • multiple pairs of probes (up to 1,536 pairs) are mixed in a reaction tube (custom amplicon tube, CAT) and unfragmented sample DNA is added; the CAT probes hybridize to the flanking sequences of the target region, unhybridized oligonucleotides are removed by fragment size selection, and extension and ligation by polymerase and ligase in succession then yield amplicon fragments containing the target region.
  • primers with complementary sequences are then used for PCR amplification to obtain an amplicon library of multiple target regions; multiple samples (a single MiSeq run supports up to 96 pooled samples) can be mixed into one library, which is sequenced and analyzed on the MiSeq System.
  • this method also has some shortcomings: (1) the enrichment process performs only a single round of extension-ligation after probe hybridization, which is prone to off-target and non-specific hybridization, and the capture efficiency for complex sequences is low; (2) unhybridized probes are removed by fragment size selection, which cannot effectively prevent non-specific hybridization or the carry-over of non-target fragments.
  • the applicant's earlier disclosed and granted invention "A high-throughput nucleic acid analysis method and its application" (ZL201210581830.9) can likewise achieve rapid enrichment of the target region based on the extension-ligation reaction. Unlike TruSeq Custom Amplicon technology, however, it uses extension primers with an exonuclease-resistant modification at the 5' end and ligation probes with an exonuclease-resistant modification at the 3' end, carries out denaturation/hybridization and multiple extension-ligation cycles as one simultaneous reaction, and digests the reaction product with exonucleases (exonuclease I, exonuclease III, lambda exonuclease).
  • This method reduces the number of operating steps by performing sample genomic DNA/probe hybridization and polymerase/ligase extension and ligation simultaneously in the same tube, and improves the utilization of the genomic DNA template through multiple extension-ligation cycles. These are real advantages, but the method also suffers from difficult system optimization and insufficient digestion of non-specific amplification products.
  • the purpose of the present invention is to provide a rapid enrichment method for target gene regions.
  • a first aspect of the present invention provides a method for enriching nucleic acid fragments, the method comprising the steps of:
  • reaction system includes: a sample to be tested and n probe groups;
  • each probe set includes a first probe and a second probe respectively;
  • the first probe and the second probe are respectively specifically hybridized to the 3' end and the 5' end of the same target nucleic acid fragment (the specific hybridization refers to at least partial complementarity or complete complementarity);
  • the first probe cannot be degraded by an exonuclease in the 5'->3' direction and/or the second probe cannot be degraded by an exonuclease in the 3'->5' direction;
  • the first probe includes a first part that specifically hybridizes to the 3' end of the target nucleic acid fragment and a second part corresponding to the sequence of the subsequent PCR amplification primers (the correspondence means that the reverse complement of the second part and the PCR amplification primers can specifically hybridize);
  • the second probe includes a first part that specifically hybridizes to the 5' end of the target nucleic acid fragment and a second part that specifically hybridizes to the sequence of subsequent PCR amplification primers;
  • the 3' end of the first probe and the 5' end of the second probe are separated by a distance of at least one nucleotide
  • (2) performing high-temperature denaturation and annealing on the reaction system, during which the first probe and the second probe specifically hybridize with the target nucleic acid fragment of the sample to be tested to form a hybridization product, thereby obtaining reaction mixture I containing the hybridization product;
  • reaction mixture II contains the undigested hybrid product
  • reaction mixture IV containing the ligation product
  • purification treatment such as nucleic acid-specific exonuclease digestion
  • PCR amplification is performed to obtain a PCR amplification product, that is, an enriched nucleic acid fragment.
  • the purification treatment in steps (4) and (5) also removes salt ions and proteins from the reaction mixture at the same time.
  • the hybridization product is a ternary complex formed by the single-stranded binding of the first probe and the second probe to the target nucleic acid fragment.
  • step (4) a physical method is used for purification.
  • the single-stranded nucleic acid-specific exonuclease cleaves (or digests): single-stranded DNA (especially the complementary single strand that has not specifically hybridized with a probe to form the hybridization product), the unbound (or free) first probe, and the unbound (or free) second probe.
  • step (3) the single-stranded nucleic acid-specific exonuclease does not cleave (or digest) or substantially does not cleave the hybrid product.
  • step (5) a physical method is used for purification.
  • the nucleic acid-specific exonuclease cleaves (or digests) the single-stranded DNA (especially the complementary strand), the unbound (or free) first probe, and the unbound (or free) second probe.
  • the nucleic acid-specific exonuclease does not cleave (or digest) or does not substantially cleave the extension ligation product.
  • the n probe sets target different target nucleic acid fragments respectively.
  • the lower limit of n is 20, 30, 40, 50, 100, 200, or 500, and/or the upper limit of n is 2000, 5000, 10000, 100000, 500000, or 1000000.
  • the method further includes the step of: preparing the PCR amplification product into a nucleic acid fragment library.
  • in step (5), under the action of the nucleic acid polymerase, the first probe is extended along the target nucleic acid fragment until it reaches the 5' end of the second probe and is blocked by it, giving the extended DNA strand of the first probe; under the action of the nucleic acid ligase, the 3' end of the extended DNA strand of the first probe is then ligated to the 5' end of the second probe, forming a reaction mixture containing the ligation product.
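The extension-ligation step described above can be illustrated with a small string model. This is only a sketch under simplifying assumptions: hybridization is treated as exact reverse-complement matching (the source allows partial complementarity), the function and parameter names are hypothetical, and the short sequences are invented for illustration rather than taken from the sequence listing.

```python
# Toy model of extension-ligation on DNA strings (all sequences written 5'->3').
# Assumption: probes match the template exactly; real hybridization is more tolerant.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def extension_ligation_product(template, first_specific, first_tail,
                               second_specific, second_tail):
    """Build the ligation product, or return None if the probes do not fit.

    first_specific  : first-probe part binding the 3' side of the target
    first_tail      : first-probe primer-corresponding part (5' of first_specific)
    second_specific : second-probe part binding the 5' side of the target
    second_tail     : second-probe primer-binding part (3' of second_specific)
    """
    site_first = template.rfind(revcomp(first_specific))
    site_second = template.find(revcomp(second_specific))
    if site_first < 0 or site_second < 0:
        return None  # at least one probe has no binding site
    gap_start = site_second + len(second_specific)
    if site_first - gap_start < 1:
        return None  # the probes must be separated by at least one nucleotide
    # The polymerase copies the gap; the new strand is the gap's reverse complement.
    gap_fill = revcomp(template[gap_start:site_first])
    # The ligase joins the extended first probe to the 5' end of the second probe.
    return first_tail + first_specific + gap_fill + second_specific + second_tail


template = "AAACCC" + "TTGA" + "GGATCA"  # second-probe site, gap, first-probe site
product = extension_ligation_product(template, "TGATCC", "ACAC", "GGGTTT", "GTGT")
print(product)
```

In this model the product carries the first probe's tail at its 5' end and the second probe's tail at its 3' end, which is why a single universal primer pair can amplify every ligation product.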
  • the first probe cannot be degraded by an exonuclease in the 5'->3' direction, but can be degraded by an exonuclease in the 3'->5' direction.
  • the 5' end of the first probe is provided with a protective group to prevent degradation by exonuclease.
  • the second probe cannot be degraded by an exonuclease in the 3'->5' direction, but can be degraded by an exonuclease in the 5'->3' direction.
  • the 3' end of the second probe is provided with a protective group to prevent degradation by exonuclease.
  • the first probe cannot be degraded by an exonuclease in the 5'->3' direction, and the exonuclease used in step (3) is single-stranded in the 5'->3' direction Nucleic acid specific exonuclease.
  • the second probe cannot be degraded by an exonuclease in the 3'->5' direction, and the exonuclease used in step (3) is single-stranded in the 3'->5' direction Nucleic acid specific exonuclease.
  • the first probe cannot be degraded by 5'->3' exonuclease and the second probe cannot be degraded by 3'->5' exonuclease
  • the 3'->5' direction single-stranded nucleic acid-specific exonuclease is used in step (3), and the 5'->3' direction single-stranded nucleic acid-specific exonuclease and 3'->5' direction are simultaneously used in step (5) Single-stranded nucleic acid specific exonuclease.
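The protection logic in the preceding items can be summarized as a small truth-table model. This is a hedged sketch, not the applicant's procedure: it considers only free single-stranded molecules, ignores the protection that hybridization itself gives to the hybridization product, and uses hypothetical names throughout.

```python
# Toy model: does a free single-stranded molecule survive exonuclease digestion?
# Assumption: a 5'->3' exonuclease degrades any molecule whose 5' end is unprotected,
# and a 3'->5' exonuclease degrades any molecule whose 3' end is unprotected.
FIVE_TO_THREE = "5'->3'"
THREE_TO_FIVE = "3'->5'"


def survives_digestion(protected_5_end, protected_3_end, exonucleases):
    """Return True if the molecule survives every exonuclease in the list."""
    for direction in exonucleases:
        if direction == FIVE_TO_THREE and not protected_5_end:
            return False
        if direction == THREE_TO_FIVE and not protected_3_end:
            return False
    return True


both = [FIVE_TO_THREE, THREE_TO_FIVE]
# Free first probe: protected at the 5' end only.
free_first_probe = survives_digestion(True, False, both)
# Free second probe: protected at the 3' end only.
free_second_probe = survives_digestion(False, True, both)
# Ligation product: 5' end from the first probe, 3' end from the second probe.
ligation_product = survives_digestion(True, True, both)
print(free_first_probe, free_second_probe, ligation_product)
```

Under these assumptions only the ligation product, which inherits one protected end from each probe, survives when both exonuclease directions are applied together.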
  • the 5' end of the first probe and/or the 3' end of the second probe carries an exonuclease-resistant modification, so that the first probe cannot be degraded by the 5' exonuclease and/or the second probe cannot be degraded by the 3' exonuclease.
  • the modifications include but are not limited to: Phosphorothioates modification, 5-Propyne pdC modification, pdU modification, 2'-Fluoro bases modification, 2'-O-methyl bases modification, 2'-5' linked bases modification, LNA bases modification, Chimeric linkage modification, 3' Inverted dT modification, or a combination thereof.
  • 1-10, preferably 2-6 bases at the 5' end of the first probe are modified with resistance to exonuclease.
  • 1-10, preferably 2-6 bases at the 3' end of the second probe are modified with resistance to exonuclease.
  • the exonuclease is selected from the group consisting of: T5 Exonuclease, T7 Exonuclease, Lambda Exonuclease, RecJ f, Exonuclease T, Exonuclease I, Exonuclease V, Exonuclease III, or combinations thereof.
  • the nucleic acid polymerase is a high-temperature thermostable nucleic acid polymerase, preferably, the nucleic acid polymerase is selected from the following group: Hemo (NEB), AmpliTaq DNA Polymerase (AmpliTaq DNA Polymerase), Stoffel Fragment (Life Technologies); Hot Start Flex DNA Polymerase (NEB).
  • the nucleic acid polymerase is a polymerase having substantially no 5' to 3' exonuclease activity.
  • the nucleic acid ligase is a high temperature thermostable nucleic acid ligase, preferably, the nucleic acid ligase is selected from the following group: Taq DNA Ligase (NEB); Ampligase (Epicentre); 9°N™ DNA Ligase (NEB).
  • the Tm value of the second probe that amplifies the same target nucleic acid fragment is higher than the Tm value of the first probe.
  • the Tm value of the second probe is 3°C-10°C higher than the Tm value of the first probe; preferably, the Tm value of the second probe is 4°C-6°C higher than that of the first probe, such as 5°C.
  • the Tm value of each first probe in each of the probe sets is 59°C to 68°C.
  • the Tm value of each second probe in each probe set is 68°C-75°C.
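The Tm constraints listed above lend themselves to a simple design check. The sketch below assumes the Tm values have already been computed by some other tool (the source does not say how), and the function name and message wording are hypothetical.

```python
# Check one probe pair against the Tm ranges stated in the text.
# Assumption: Tm values (in degrees C) are supplied by the caller.
def check_probe_pair_tm(tm_first, tm_second):
    """Return a list of constraint violations (an empty list means the pair passes)."""
    problems = []
    if not 59.0 <= tm_first <= 68.0:
        problems.append("first probe Tm outside 59-68 C")
    if not 68.0 <= tm_second <= 75.0:
        problems.append("second probe Tm outside 68-75 C")
    difference = tm_second - tm_first
    if not 3.0 <= difference <= 10.0:
        problems.append("second probe Tm is not 3-10 C above the first probe Tm")
    return problems


print(check_probe_pair_tm(64.0, 69.0))  # a pair with the preferred 5 C difference
print(check_probe_pair_tm(66.0, 68.0))  # difference of only 2 C
```

Returning a list of messages rather than a single boolean makes it easier to see which constraint a candidate pair violates when screening many probe sets.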
  • the 5' end of the second probe is modified by phosphorylation.
  • the n (the number of probe sets) is 20-1,000,000, preferably 30-500,000, more preferably 40-100,000, most preferably 50-10,000, such as 100-10,000, 500-10,000, or 1,000-10,000.
  • the probe sets for the same target (target) nucleic acid fragment are referred to as one (one) probe set.
  • when n = 2, the two probe sets are directed to two different target nucleic acid fragments, respectively.
  • the length of the first part of the first probe is 16-50 bp (preferably 21-36 bp, more preferably 33 bp), and/or the length of the second part is 18-30 bp.
  • the length of the first part of the second probe is 16-50 bp (preferably 21-36 bp, more preferably 32 bp), and/or the length of the second part is 21-36 bp.
  • the second parts of the first probes of each probe set are the same or substantially the same.
  • the second part of the second probe of each probe set is the same or substantially the same.
  • the total amount of target nucleic acid fragments in the sample is 1-2000ng, preferably 200-500ng.
  • the sample is a nucleic acid sample derived from animals, plants or microorganisms, preferably a DNA sample or an RNA reverse transcription product cDNA sample.
  • the sample is a nucleic acid sample derived from an animal (preferably a mammal, more preferably a human), preferably a DNA sample or an RNA reverse transcription product cDNA sample.
  • the sample to be tested includes only one kind of sample, or the sample to be tested includes multiple detection samples from different subjects (for example, samples taken from multiple patients, or multiple different tissue samples).
  • reaction system further includes a buffer.
  • the conditions of high temperature denaturation and annealing treatment in the step (2) are 95-100°C for 2-20min, followed by treatment at 50°C for 0.5-20h, preferably 1-5h.
  • in step (3), the hybridization product is not degraded by the exonuclease, whereas any first probe and/or second probe that has not hybridized is degraded by the exonuclease.
  • the purification treatment in the step (4) includes: magnetic bead purification, silica gel column purification, membrane filtration purification, ethanol or isopropanol precipitation purification, or a combination thereof.
  • the length of the specific sequence of the extension ligation product (ie the target nucleic acid sequence) in the step (5) is 30-5000 bp, preferably 100-1000 bp, more preferably 150-310 bp.
  • no amplification cycle is performed in the step (5).
  • the PCR amplification primers have a tag sequence, and the tag sequence length is 1-100 bp, preferably 5-10 bp.
  • the ligation products of different samples can be amplified with PCR amplification primers with different tag sequences, so that the amplification products of different samples can be mixed together, and the sequenced sequences can be classified according to the tag sequence in the subsequent sequencing data.
  • the length of the PCR amplification primer is 42-58 bp.
  • step (6) only one PCR amplification primer pair is used.
  • the primers (PCR amplification primers) used in the PCR amplification include a forward primer and a reverse primer; the forward primer includes a sequence that specifically hybridizes to the reverse complement of the second part of the first probe, and the reverse primer includes a sequence that specifically hybridizes to the second part of the second probe.
  • the forward primer and/or the reverse primer contains a universal sequence compatible with a high-throughput chip sequencing platform.
  • the forward primer and/or the reverse primer contains a tag sequence, and different tag sequences are used for different samples.
  • the ligation products are amplified using universal primers containing different tag sequences to establish a library suitable for the next-generation sequencing platform, and the libraries constructed using universal primers containing different tag sequences can be mixed together for next-generation sequencing.
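Sorting pooled reads back to their samples by tag sequence, as described above, might look like the following minimal sketch. It assumes the tag sits at the very start of each read and must match exactly; on real platforms the tag is often read separately and some mismatches are tolerated. The tag sequences, sample names, and function name are hypothetical, and the default tag length of 8 follows the "such as 8bp" example in the text.

```python
# Assign pooled reads to samples by an exact-match tag at the start of each read.
from collections import defaultdict


def demultiplex(reads, tag_to_sample, tag_length=8):
    """Group reads by sample; the tag is trimmed from each stored read."""
    bins = defaultdict(list)
    for read in reads:
        tag = read[:tag_length]
        sample = tag_to_sample.get(tag, "unassigned")
        bins[sample].append(read[tag_length:])
    return dict(bins)


tags = {"ACGTACGT": "sample_A", "TTGCAAGC": "sample_B"}  # invented example tags
pooled = ["ACGTACGTTTTT", "TTGCAAGCGGGG", "ACGTACGTCCCC", "GGGGGGGGAAAA"]
result = demultiplex(pooled, tags)
print(result)
```

Reads whose tag is not in the table are kept under "unassigned" rather than discarded, so the fraction of unrecognized tags can be inspected afterwards.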
  • the second partial sequence of the first probe is:
  • the second partial sequence of the second probe is:
  • the forward primer sequence is:
  • [X] is absent or is a tag sequence; the length of [X] is 0bp-100bp, preferably 0bp-10bp, such as 8bp.
  • the reverse primer sequence is:
  • sequence of the first part of the first probe is shown in SEQ ID NO.: 2a
  • sequence of the first part of the second probe is shown in SEQ ID NO.: 2a+1 , where a is an integer from 2 to 51.
  • the method is suitable for enrichment and amplification of multiple gene fragments; the number of gene fragments amplified at the same time can be tens, hundreds, thousands, or even tens of thousands. The fragments may include some reference gene fragments, whose number can be 0-999.
  • the sequencing data of the nucleic acid fragments enriched by the method can be analyzed to obtain the copy number of a target gene fragment. The sequencing depth of each target fragment and each reference fragment is counted; for a patient sample, the sequencing depth of each target fragment is divided by the sequencing depth of each reference fragment to obtain m ratios (m is the number of reference gene fragments; a reference may be any gene fragment other than the target fragment itself); each ratio is divided by the corresponding ratio of a normal sample, or by the median of that ratio across all samples, and multiplied by the copy number of the normal sample at the target fragment, or by the copy number that most samples have at that fragment, giving m values; the median of these m values is taken as the detected copy number of the sample at the target fragment.
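The copy-number calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the applicant's implementation: it uses the normal-sample variant of the normalization (not the all-sample median variant), the default normal copy number of 2 is an assumption for a diploid autosomal fragment, and the fragment names and depths are invented.

```python
# Copy number of one target fragment from sequencing depths, normalized to a normal sample.
from statistics import median


def copy_number(sample_depth, normal_depth, target, references, normal_copies=2):
    """Estimate the copy number of `target` in one sample.

    sample_depth / normal_depth: dict mapping fragment name -> sequencing depth.
    references: names of the reference fragments (the target itself is skipped).
    normal_copies: assumed copy number of the normal sample at the target.
    """
    values = []
    for ref in references:
        if ref == target:
            continue
        sample_ratio = sample_depth[target] / sample_depth[ref]
        normal_ratio = normal_depth[target] / normal_depth[ref]
        values.append(sample_ratio / normal_ratio * normal_copies)
    if not values:
        raise ValueError("at least one reference fragment is required")
    # The median of the m values is the detected copy number.
    return median(values)


normal = {"target": 100, "ref1": 100, "ref2": 200, "ref3": 100}
patient = {"target": 150, "ref1": 100, "ref2": 200, "ref3": 100}
estimate = copy_number(patient, normal, "target", ["ref1", "ref2", "ref3"])
print(estimate)
```

Taking the median over the m reference-normalized values, as the text specifies, keeps a single aberrant reference fragment from distorting the estimate.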
  • the second aspect of the present invention provides a nucleic acid sequencing method, which comprises the steps of: enriching the target nucleic acid fragments by using the method described in the first aspect of the present invention.
  • a high-throughput chip sequencing platform is used to perform single-molecule amplification sequencing or direct single-molecule sequencing on the target nucleic acid fragments enriched by the method described in the first aspect of the present invention.
  • the method further comprises the steps of: analyzing the sequencing data, classifying the samples of the sequencing sequence, reading gene mutation sites and/or calculating the copy number of each gene fragment.
  • a third aspect of the present invention provides a kit for enriching nucleic acid fragments, the kit includes: one or more probe sets corresponding to the nucleotide sequences in the sample to be tested, nucleic acid polymerases and nucleic acid ligases;
  • the probe set includes a first probe and a second probe
  • the first probe and the second probe are respectively specifically hybridized to the 3' end and the 5' end of the same target nucleic acid fragment (the specific hybridization refers to at least partial complementarity or complete complementarity);
  • the first probe cannot be degraded by exonuclease in the 5'->3' direction;
  • the second probe cannot be degraded by exonuclease in the 3'->5' direction;
  • the first probe includes a first part that specifically hybridizes to the 3' end of the target nucleic acid fragment and a second part corresponding to the sequence of the subsequent PCR amplification primers (the correspondence means that the reverse complement of the second part and the PCR amplification primers can specifically hybridize);
  • the second probe includes a first part that specifically hybridizes to the 5' end of the target nucleic acid fragment and a second part that corresponds to the sequence of subsequent PCR amplification primers (the correspondence refers to the second part and the PCR amplification). primers capable of specific hybridization);
  • the 3' end of the first probe and the 5' end of the second probe are separated by a distance of at least one nucleotide.
  • the kit further includes PCR amplification primers, which include a forward primer and a reverse primer; the forward primer includes a sequence that specifically hybridizes to the reverse complement of the second part of the first probe, and the reverse primer includes a sequence that specifically hybridizes to the second part of the second probe.
  • the forward primer and/or the reverse primer contains a universal sequence compatible with a high-throughput chip sequencing platform.
  • the forward primer and/or the reverse primer contains a tag sequence, and different tag sequences are used for different samples.
  • the kit further includes conventional PCR reagents.
  • Figure 1 shows the operational flow of the invention.
  • Fig. 2 shows the detection value of the copy number of the target gene fragment in the three patient samples in the Example.
  • FIG. 3 shows the capillary electrophoresis results of Example 1 and Example 2.
  • the present inventor unexpectedly discovered a new technology of target gene region enrichment based on extension ligation reaction for the first time.
  • the experimental results show that the method of the present invention can achieve rapid enrichment of multiple target gene fragments, significantly improve the enrichment efficiency of target sequences, and improve the effective reading and sequencing depth of target gene fragments.
  • the enriched products of multiple target gene fragments can be used for sequencing analysis on various high-throughput chip sequencing platforms such as next-generation sequencing platforms after modification and purification. The present invention has been completed on this basis.
  • the present invention provides a new rapid enrichment method for multiple target gene regions based on an extension ligation reaction. The method uses extension primers and/or blocking probes carrying exonuclease-resistant modifications. After the primer-probe pairs are denatured and hybridized with the sample genomic DNA, the products are digested and purified with one or more single-stranded-nucleic-acid-specific exonucleases, then purified again by physical methods such as magnetic bead purification, silica gel column purification or membrane filtration purification, and then amplified and purified with universal primers that match the next-generation sequencing platform to obtain a sequencing library.
  • the method is specific and efficient for capturing the target sequence, and the sequencing data of the amplified product can also be used for the analysis of the copy number of the target gene fragment, so as to realize the simultaneous detection of the point mutation and the copy number of the target gene fragment.
  • the second half is the specific sequence hybridized with the target nucleic acid fragment
  • the 5' end of the 3' end probe is phosphorylated
  • the first half is the specific sequence hybridized with the target nucleic acid fragment
  • the second half is a universal sequence consistent with the subsequent PCR amplification primers; the 5' end of the 5' end probe carries a protective modification (1) against exonuclease degradation, or a few bases at the 3' end of the 3' end probe carry a protective modification (1) against exonuclease degradation, or both probes are modified at the same time; there are several bases between the two probes,
  • after the probes are hybridized with the template DNA, the product is digested with one or more single-stranded-nucleic-acid-specific exonucleases (2) to remove the residual primer probes that did not hybridize with the template DNA.
  • Enzymatic digestion products are then purified by physical methods such as magnetic bead purification, silica gel column purification or membrane filtration purification
  • the purified product is subjected to an extension ligation reaction in a reaction system containing both polymerase and ligase: under the action of a polymerase without 5'->3' exonuclease activity, the gap between the two probes is filled, and the two probes are then ligated under the action of the ligase;
  • the ligation reaction product is purified by physical methods such as magnetic bead purification, silica gel column purification or membrane filtration purification;
  • PCR primers also have a tag sequence with a length of several to dozens of bases.
  • the ligation products of different samples can be amplified with PCR primers carrying different tag sequences, so that the amplification products of different samples can be mixed together; at the same time, in the subsequent sequencing data, the sequencing reads can be assigned to different samples according to the tag sequence;
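The tag-based sorting of pooled reads described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the tag is assumed to occupy the first bases of each read and to match exactly, and the function name `demultiplex`, the tag sequences and the sample names are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def demultiplex(reads, tag_to_sample, tag_len):
    """Group reads by sample according to the tag in their first tag_len bases.

    Assumption: the tag sits at the start of the read and must match exactly;
    reads with an unknown tag are collected under "unassigned".
    """
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        sample = tag_to_sample.get(tag, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical 4-base tags and toy reads, for illustration only.
tags = {"ACGT": "P1", "TGCA": "C1"}
reads = ["ACGTGGGAAA", "TGCACCCTTT", "ACGTTTTCCC", "NNNNAAAA"]
result = demultiplex(reads, tags, tag_len=4)
```

A real pipeline would usually also tolerate a small number of mismatches in the tag; exact matching is used here only to keep the sketch short.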
  • Exonuclease-resistant modifications of the present invention include but are not limited to the following types: Phosphorothioates, 5-Propyne pdC, pdU, 2'-Fluoro bases, 2'-O-methyl bases, 2'-5' linked bases, LNA bases, Chimeric linkage, 3' Inverted dT.
  • Exonucleases of the present invention include but are not limited to the following types: T5 Exonuclease, T7 Exonuclease, Lambda Exonuclease, RecJf, Exonuclease T, Exonuclease I, Exonuclease V, Exonuclease III.
  • an exonuclease-resistant modification is first introduced at the 5' end of the extension primer and/or the 3' end of the blocking probe; the hybridization product is then digested and purified enzymatically, followed by a secondary purification by physical methods. This removes the residual primer probes that did not hybridize with the genomic DNA. The purified product then undergoes extension and ligation simultaneously in a single reaction system, using a thermostable ligase and a polymerase. The purification steps remove residual primer probes as far as possible.
  • the method of the present invention can significantly reduce non-specific amplification and improve enrichment efficiency.
  • the method of the present invention can realize the enrichment of multiple target gene fragments, and the number of gene fragments can be from tens to thousands, or even tens of thousands.
  • the method of the present invention is simple and quick to operate, and can achieve the enrichment of target fragments of hundreds of samples within a few hours.
  • the method of the present invention unexpectedly and significantly improves the signal-to-noise ratio of the detection results, especially when multiple probe sets (n probe sets) are used in the same system, for example, n ≥ 20, ≥ 30, ≥ 40, ≥ 50, ≥ 100, ≥ 200, or ≥ 500.
  • the sequencer performs sequencing, and the sequencing data is first sorted according to different tag sequences.
  • the sequencing data of each sample are aligned to the human reference genome using the Burrows-Wheeler Aligner (BWA) software, the sequencing data are then counted, and the copy number of the target gene fragments is estimated.
  • using the primer3 primer design software (http://bioinfo.ut.ee/primer3-0.4.0/primer3/) and a self-developed program, 42 pairs of probes were designed for all exons of the four genes MVK, MVD, PMVK and FDPS; the specific sequence amplification length was 183bp-280bp.
  • 8 pairs of probes were designed for 8 reference gene fragments, and the specific sequence amplification length was 185bp-283bp.
  • the 5' extension primer (the first probe) is composed of the universal sequence at the 5' end (the second part) plus the specific sequence at the 3' end (the first part); the universal sequence at the 5' end is 5'-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-3' (SEQ ID NO: 1). The 3' blocking probe (the second probe) consists of the specific sequence at the 5' end (the first part) plus the universal sequence at the 3' end (the second part); its 5' end is phosphorylated, the phosphodiester bond between the last 2 bases at its 3' end is replaced with a phosphorothioate bond, and the universal sequence at the 3' end is 5'-AGATCGGAAGAGCACACGTCTGAACTCCAGTC-3' (SEQ ID NO: 2).
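The probe layout of Example 1 can be illustrated with a short Python sketch that assembles the two oligonucleotides from a target-specific sequence. The two universal sequences are SEQ ID NO: 1 and SEQ ID NO: 2 as given in the text; everything else is an assumption made for illustration: the function names, the toy specific sequence, and the annotation style (`/5Phos/` for a 5' phosphate and `*` for a phosphorothioate bond, a notation borrowed from common oligo ordering forms and not taken from the patent).

```python
UNIVERSAL_5P = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # SEQ ID NO: 1
UNIVERSAL_3P = "AGATCGGAAGAGCACACGTCTGAACTCCAGTC"   # SEQ ID NO: 2


def extension_primer(specific):
    """5' extension primer: universal sequence (5') + specific sequence (3')."""
    return UNIVERSAL_5P + specific


def blocking_probe(specific):
    """3' blocking probe: specific sequence (5') + universal sequence (3').

    The 5' phosphate and the phosphorothioate bond between the last two
    bases are written in an assumed ordering-style notation.
    """
    seq = specific + UNIVERSAL_3P
    return "/5Phos/" + seq[:-1] + "*" + seq[-1]


# Hypothetical target-specific sequence, for illustration only.
primer = extension_primer("GATTACA")
probe = blocking_probe("GATTACA")
```

The real probes use the target-specific sequences listed in Table 1, which are not reproduced in this section.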
  • the Tm value of the specific sequence of the 5' extension primer is 59°C-68°C
  • the Tm value of the specific sequence of the 3' blocking probe is 68°C-75°C
  • the Tm value of the 3' blocking probe of the same amplified fragment is usually more than 5°C higher than that of the 5' extension primer.
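The Tm design rules stated above can be expressed as a simple check. The following Python sketch assumes the Tm values of the two specific sequences have already been computed by some other tool; the function name `check_probe_pair` and the example Tm values are hypothetical. Because the text says the gap is "usually" more than 5°C, a failed gap check is best read as a warning rather than a hard rejection.

```python
def check_probe_pair(tm_extension, tm_blocking, min_gap=5.0):
    """Return a list of design-rule violations for one probe pair.

    Ranges follow the text: extension primer 59-68 degC, blocking probe
    68-75 degC, and the blocking probe Tm more than min_gap degC higher.
    """
    issues = []
    if not 59.0 <= tm_extension <= 68.0:
        issues.append("extension primer Tm outside 59-68 degC")
    if not 68.0 <= tm_blocking <= 75.0:
        issues.append("blocking probe Tm outside 68-75 degC")
    if tm_blocking - tm_extension <= min_gap:
        issues.append("blocking probe Tm not more than 5 degC above extension primer")
    return issues


# Hypothetical Tm values, for illustration only.
ok_pair = check_probe_pair(62.0, 70.0)
close_pair = check_probe_pair(66.0, 69.0)
```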
  • the enriched fragments and probe-specific sequence information are shown in Table 1.
  • probe hybridization solution: 1.5 μl 10× hybridization solution, 1.5 μl primer probe mixture (0.01 μM per 5' extension primer + 0.02 μM per 3' blocking probe), 2 μl ddH2O.
  • Hybridization reaction: after shaking and mixing, place the tube on a PCR instrument and run the program "95°C for 5 minutes, 50°C for 3 hours", then leave it at room temperature for 10 minutes before use.
  • the PCR amplification primer pair is a forward universal primer (5'AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGC
  • the PCR reaction system is 20 μl, which contains 1× HF buffer (NEB), 2.5 mM MgCl2, 0.3 mM dNTP mix, 0.3 μM of each primer, 1 U Phusion DNA polymerase (NEB) and 10 μl of the above extension ligation purified product.
  • the reaction mixture was run according to the following PCR program: 98°C for 30s; (98°C for 10s, 65°C for 30s, 72°C for 1 min) × 30 cycles; 72°C for 5 minutes; hold at 4°C.
  • the quantified library was sequenced on the MiSeq second-generation sequencer of Illumina Company in the United States.
  • Data analysis: sort the sequencing data according to the different tag sequences to obtain the sequencing data of each sample; use the Burrows-Wheeler Aligner (BWA) program to align the sequencing data to the human genome reference sequence, and count the total sequencing amount of each sample, the sequencing depth of each target and reference fragment, and the enrichment efficiency of each sample. The sequencing depth of each target fragment of the patient sample is divided by the sequencing depth of each of the 8 reference fragments to obtain 8 ratios; each ratio is divided by the corresponding ratio of the normal sample and multiplied by 2, which gives 8 values, and the median is taken as the detected copy number of the sample for that target fragment.
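The copy number calculation described above can be written as a short Python function. This is a sketch of the stated arithmetic only (ratio to each reference fragment, normalization by the normal sample, multiplication by 2, median); the function name `copy_number` and the depth values are hypothetical and are not taken from Table 2.

```python
from statistics import median


def copy_number(target_depth, ref_depths, normal_target_depth, normal_ref_depths):
    """Estimate the copy number of one target fragment in a patient sample.

    For each reference fragment, the patient target/reference depth ratio is
    divided by the same ratio in the normal sample and multiplied by 2
    (the normal sample is assumed to carry two copies); the median of the
    resulting values is returned.
    """
    values = []
    for ref, normal_ref in zip(ref_depths, normal_ref_depths):
        patient_ratio = target_depth / ref
        normal_ratio = normal_target_depth / normal_ref
        values.append(2 * patient_ratio / normal_ratio)
    return median(values)


# Hypothetical depths for 8 reference fragments, for illustration only.
refs = [500, 520, 480, 510, 490, 505, 495, 500]
heterozygous_deletion = copy_number(250, refs, 500, refs)
normal_fragment = copy_number(500, refs, 500, refs)
```

With these toy numbers a target fragment sequenced at half the normal relative depth gives a value close to 1, which would correspond to a single-copy (heterozygous) deletion.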
  • the sequencing depth of each fragment of 3 patient samples (P1, P2, P3) and 1 normal sample (C1) is shown in Table 2, and the statistical results of the sequencing data are shown in Table 3. According to the statistics, all four samples achieved effective enrichment of the 50 gene fragments: the enrichment efficiency was over 85%, the average effective reads were over 500×, and the sequencing depth of all fragments was over 10×.
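The three figures quoted above can be checked per sample with a few lines of Python. The sketch below makes two assumptions that the text does not spell out: enrichment efficiency is taken to be the fraction of reads that map to the target and reference fragments, and "average effective reads" is taken to be the mean per-fragment sequencing depth. The function name `enrichment_qc` and all input numbers are hypothetical.

```python
def enrichment_qc(on_target_reads, total_reads, fragment_depths):
    """Summarize one sample against the thresholds quoted in the text.

    Assumed definitions: efficiency = on-target reads / total reads,
    mean depth = arithmetic mean of per-fragment depths.
    """
    efficiency = on_target_reads / total_reads
    mean_depth = sum(fragment_depths) / len(fragment_depths)
    min_depth = min(fragment_depths)
    return {
        "enrichment_efficiency": efficiency,
        "mean_depth": mean_depth,
        "min_depth": min_depth,
        "pass": efficiency > 0.85 and mean_depth > 500 and min_depth > 10,
    }


# Hypothetical read counts and depths, for illustration only.
good = enrichment_qc(90, 100, [600, 900, 50])
poor = enrichment_qc(80, 100, [600, 900, 50])
```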
  • the copy number calculation of each fragment was performed using the sequencing depth data.
  • the copy number detection values of the 42 gene fragments of the three patient samples (P1, P2 and P3) are shown in Figure 2. It can be seen from the figure that P1 has at least a deletion of exon 1 to exon 5 in the MVK gene, while P2 and P3 have deletions of the exon 1 to exon 3 segment and the exon 5 to exon 8 segment in the FDPS gene, respectively. These deletion mutations were confirmed by RT-PCR experiments.
  • MVK NM_000431.2
  • PMVK NM_006556.3
  • MVD NM_002461.1
  • FDPS NM_002004.2
  • the phosphodiester bond between the first 2 bases at the 5' end of the 5' extension primer (the first probe) is replaced with a phosphorothioate bond, i.e. the phosphodiester bond between the two bases at the 5' end of the universal sequence at the 5' end of the first probe (the second part) is replaced with a phosphorothioate bond; the rest is the same as in Example 1.
  • the 3' blocking probe (the second probe) is exactly the same as in Example 1: the phosphodiester bond between the last 2 bases at the 3' end is replaced with a phosphorothioate bond, that is, the phosphodiester bond between the last 2 bases at the 3' end of the universal sequence at the 3' end of the second probe (the second part) is replaced with a phosphorothioate bond.
  • compared with Example 1, a digestion step for the extension ligation product is added: add 5 μl of digestion and purification mixture: 0.5 μl Exonuclease I (20 U/μl, NEB), 1 μl Lambda Exonuclease (5 U/μl, NEB), 3.5 μl ddH2O.
  • the digested product was purified using 37.5 μl of magnetic beads (1.5×, Vazyme), and finally eluted with 15 μl of 10 mM Tris-Cl, pH 8.0.
  • Example 3: the capillary electrophoresis results of Example 1 and Example 2 are compared in Figure 3, where the upper two panels are the capillary electrophoresis results of the sample of Example 1 and its blank control, respectively, and the lower two panels are the capillary electrophoresis results of the sample of Example 2 and its blank control, respectively. The results show that both methods can enrich the target region, but Example 2 gives fewer non-specific bands and a better enrichment effect.

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention relates to a method for rapid enrichment of a target gene region. The method comprises the following steps: using an extension primer and/or a blocking probe carrying an exonuclease-resistant modification; after the primer probe is denatured and hybridized with the sample DNA, performing enzymatic digestion and purification with a single-stranded-nucleic-acid-specific exonuclease; then subjecting the enzymatically digested and purified product to a secondary purification by physical methods such as magnetic bead purification, silica gel column purification or membrane filtration purification; and finally performing amplification and purification using a universal primer matching a second-generation sequencing platform, so as to obtain a sequencing library. The method of the present invention can achieve the enrichment of multiple target gene fragments, improve the enrichment efficiency of target sequences, improve the effective reads and sequencing depth of target gene fragments, and can be used for sequencing analysis on various high-throughput chip sequencing platforms such as a second-generation sequencing platform.
PCT/CN2021/105073 2020-07-07 2021-07-07 Method for rapid enrichment of a target gene region WO2022007863A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010647922.7A CN113913493B (zh) 2020-07-07 2020-07-07 Method for rapid enrichment of a target gene region
CN202010647922.7 2020-07-07

Publications (1)

Publication Number Publication Date
WO2022007863A1 (fr) 2022-01-13

Family

ID=79231364

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/105073 WO2022007863A1 (fr) 2020-07-07 2021-07-07 Method for rapid enrichment of a target gene region

Country Status (2)

Country Link
CN (1) CN113913493B (fr)
WO (1) WO2022007863A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115011594A (zh) * 2022-05-16 2022-09-06 纳昂达(南京)生物科技有限公司 Liquid-phase hybridization capture probe for detecting HPV, use thereof and kit thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060073511A1 (en) * 2004-10-05 2006-04-06 Affymetrix, Inc. Methods for amplifying and analyzing nucleic acids
WO2014101655A1 (fr) * 2012-12-27 2014-07-03 上海天昊生物科技有限公司 Method for high-throughput nucleic acid analysis and application thereof
CN105803055A (zh) * 2014-12-31 2016-07-27 天昊生物医药科技(苏州)有限公司 New method for target gene region enrichment based on multiple cyclic extension ligation
US20180363039A1 (en) * 2015-12-03 2018-12-20 Accuragen Holdings Limited Methods and compositions for forming ligation products

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10017761B2 (en) * 2013-01-28 2018-07-10 Yale University Methods for preparing cDNA from low quantities of cells

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
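CN115011594A (zh) * 2022-05-16 2022-09-06 纳昂达(南京)生物科技有限公司 Liquid-phase hybridization capture probe for detecting HPV, use thereof and kit thereof
CN115011594B (zh) * 2022-05-16 2023-10-20 纳昂达(南京)生物科技有限公司 Liquid-phase hybridization capture probe for detecting HPV, use thereof and kit thereof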

Also Published As

Publication number Publication date
CN113913493B (zh) 2024-04-09
CN113913493A (zh) 2022-01-11

Similar Documents

Publication Publication Date Title
US10538759B2 (en) Compounds and method for representational selection of nucleic acids from complex mixtures using hybridization
CN106591441B (zh) 基于全基因捕获测序的α和/或β-地中海贫血突变的检测探针、方法、芯片及应用
CN113166797A (zh) 基于核酸酶的rna耗尽
CN109536579B (zh) 单链测序文库的构建方法及其应用
JP7232643B2 (ja) 腫瘍のディープシークエンシングプロファイリング
KR20220162873A (ko) 근접 보존 전위
JP6925424B2 (ja) 短いdna断片を連結することによる一分子シーケンスのスループットを増加する方法
CN110079592B (zh) 用于检测基因突变和已知、未知基因融合类型的高通量测序靶向捕获目标区域的探针和方法
KR102354422B1 (ko) 대량 평행 서열분석을 위한 dna 라이브러리의 생성 방법 및 이를 위한 키트
CN109576346B (zh) 高通量测序文库的构建方法及其应用
TW201321518A (zh) 微量核酸樣本的庫製備方法及其應用
WO2014101655A1 (fr) Procédé pour l'analyse d'un acide nucléique à rendement élevé et son application
WO2018195217A1 (fr) Compositions et procédés pour la construction de bibliothèques et l'analyse de séquences
US11261479B2 (en) Methods and compositions for enrichment of target nucleic acids
EP3480319A1 (fr) Méthode de production d'une banque d'adn et méthode d'analyse d'adn génomique à l'aide d'une banque d'adn
WO2022007863A1 (fr) Procédé d'enrichissement rapide d'une région de gène cible
CN112639127A (zh) 用于对基因改变进行检测和定量的方法
CN110938681A (zh) 等位基因核酸富集和检测方法
EP4215619A1 (fr) Procédés de quantification parallèle, sensible et précise d'acides nucléiques
EP3696279A1 (fr) Procédés de test prénatal non invasif d'anomalies f tales
Barry Overcoming the challenges of applying target enrichment for translational research

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21837447

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21837447

Country of ref document: EP

Kind code of ref document: A1