CN110938681A - Allele nucleic acid enrichment and detection method - Google Patents

Publication number
CN110938681A
Authority
CN
China
Prior art keywords
probe
allele
probes
amplification
extension
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911378483.8A
Other languages
Chinese (zh)
Inventor
杨敬敏
徐辉
卢大儒
王梁辉
唐嘉婕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Wickham Biomedical Technology Co ltd
Original Assignee
Shanghai Wickham Biomedical Technology Co ltd

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6858 Allele-specific amplification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/172 Haplotypes

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides a method of enriching nucleic acid extension products, comprising a. enriching nucleic acids comprising one or more sets of allele pairs in a sample, the enriched products being at least 5kb, preferably at least 8kb, more preferably at least 10kb, and the enriched products comprising at least one set of allele pairs; b. contacting a sample containing nucleic acids with probes under conditions in which the probes hybridize to the nucleic acids, the probes comprising at least two probes that target each of the two alleles of any one of the at least one set of allele pairs and the extension of the probe that targets one of the alleles is inhibited, c.

Description

Allele nucleic acid enrichment and detection method
Technical Field
The invention belongs to the field of nucleic acid enrichment, and particularly relates to a method for enriching and detecting alleles by hybridizing and extending nucleic acids.
Background
Organisms may be haploid, diploid, or polyploid. Many species, including humans, are diploid, i.e., they carry two sets of chromosomes, and a single set of chromosomes is a haploid set. Within one chromosome set, multiple closely linked alleles form a linear combination, and each such combination is a haplotype. A haplotype can consist of multiple SNP loci and carries abundant genetic information. Studying haplotypes gives better analytical results than studying single SNP loci and reflects the genetic mechanisms of disease more effectively, so haplotype analysis is increasingly in demand in the field of genetic disease detection.
Variations in genetic information are a common feature of all genomes, and single base pair differences, known as single nucleotide polymorphisms (SNPs), are the most common form of variation, accounting for more than 90% of all known polymorphisms. SNP sites are not inherited independently but in groups on chromosomes. A SNP site generally has only two alleles and is therefore also called a biallelic marker. SNPs are an important basis for studying genetic variation in human families and in animal and plant lines, and are therefore widely used in population genetics and in the study of disease-associated genes, playing an important role in pharmacogenomics, diagnostics, and biomedical research.
Phasing is also referred to as gene phasing, haplotyping, or haplotype construction. Phasing refers to correctly assigning the alleles (including heterozygous loci, e.g., SNPs) of a diploid (or even polyploid) genome to the paternal or maternal chromosome according to their parental origin, so that all alleles inherited from the same parent are placed on the same chromosome.
Existing NGS technology fragments the DNA and sequences all the fragments together. This approach cannot directly distinguish which sequences come from the father and which from the mother. Usually it can only detect the variants in the genome and the base composition of those variants (homozygous or heterozygous), i.e., what is usually called the genotype. Resolving the parental origin of the alleles in a genotype can only be achieved by phasing.
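The ambiguity described above can be illustrated with a minimal Python sketch. It enumerates the haplotype pairs that are consistent with a set of unphased genotypes. The function name and the example genotypes are hypothetical and are not taken from the patent.

```python
from itertools import product


def possible_haplotype_pairs(genotypes):
    """Enumerate the distinct haplotype pairs consistent with unphased genotypes.

    genotypes: list of 2-character strings, one per SNP site (e.g. "AG").
    Returns a set of frozensets; each frozenset holds the two haplotype strings.
    """
    pairs = set()
    # At every site, either of the two alleles may sit on the first chromosome.
    for orientation in product((0, 1), repeat=len(genotypes)):
        hap1 = "".join(g[o] for g, o in zip(genotypes, orientation))
        hap2 = "".join(g[1 - o] for g, o in zip(genotypes, orientation))
        pairs.add(frozenset((hap1, hap2)))
    return pairs


# Hypothetical example: two heterozygous sites and one homozygous site.
unphased = ["AG", "CT", "GG"]
configs = possible_haplotype_pairs(unphased)
for pair in sorted(sorted(p) for p in configs):
    print(pair)
```

With two heterozygous sites the genotype alone leaves two possible haplotype pairs, and each additional heterozygous site doubles that number, which is why the genotype by itself does not determine the haplotype.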
Phasing is widely used in genetic research. For example, the haplotype reference panel obtained by phasing a population is the essential data for genotype inference, and genotype inference is an essential step in genotype-phenotype association analysis. A high-quality reference panel improves the statistical power of association analysis, and phasing the studied individuals also greatly improves the accuracy of genotype inference. As another example, haplotype groups consisting of multiple loci, rather than simple single-locus genotypes, enable the study of population genetic history. In addition, important genetic parameters such as chromosomal recombination rates and recombination hot spots can be calculated from the haplotype sequences of phased family cohorts. Phasing can also be used to detect recurrent mutations, selection signals, and cis-regulation of gene expression.
SNPs are the most common and stable heritable variation in humans, and reliable SNP analysis and detection are important for disease prevention and diagnosis, personalized medicine, and related applications. Many SNP detection methods currently exist. The TaqMan probe method designs PCR primers and TaqMan probes for different SNP sites on a chromosome and performs real-time fluorescence PCR amplification using a fluorophore and a quencher. The SNaPshot method, also known as minisequencing, is a typing technique developed by Applied Biosystems (ABI) that is based on fluorescence-labeled single-base extension. High-resolution melting curve analysis (HRM) determines whether a SNP is present by monitoring, in real time during heating, the binding of a double-stranded DNA fluorescent dye to the PCR amplification product. The MassArray method performs genotyping by combining primer extension or cleavage reactions with MALDI-TOF-MS. Homogeneous label-free SNP genotyping (Hong-Qi Wang et al., Analytical Chemistry, 2011, 83, 1883-1889) is based on ligase-mediated strand displacement amplification and a DNAzyme-catalyzed chemiluminescence reaction. Finally, there are target region capture technologies based on next-generation sequencing (multiplex PCR and probe capture). These techniques share two drawbacks when used to detect haplotypes.
First, most of these detection methods use DNA as the template. The DNA is extracted from the sample to be tested, and during extraction all chromosomes end up in the same reaction tube, i.e., the two haploid chromosome sets are mixed together. The SNP information in the detection result therefore comes from the sum of the two sets: it is impossible to tell which genotypes come from the same chromosome, and the haplotypes cannot be obtained. Second, because of the limitations of the detection platforms, the DNA is highly fragmented during detection, so the result poorly reflects information in the vicinity of a SNP site. Next-generation sequencing cannot solve these problems because of its short read length. Third-generation SMRT sequencing, with its ultra-long read length, can solve both problems, but the cost of the third-generation sequencing platform prevents it from serving as a routine means of haplotype detection.
The existing HSE (Haplotype-Specific Extraction) technology can physically separate one of the two haplotypes before subsequent genotype detection, and can be applied to downstream STR typing, mixed-sample identification in forensics, and the like. RSE (Region-Specific Extraction), a technology similar to HSE, is used for contiguous capture and study of large-fragment regions and can be applied to high-throughput sequencing.
HSE uses a probe that specifically recognizes one allele of a SNP to bind the DNA of the haplotype carrying that allele. A polymerase starts from the 3' end of the probe, extends along the template sequence, and incorporates biotin-modified bases to form a biotin-containing DNA strand. Streptavidin-modified magnetic beads, which bind biotin, then capture this strand to achieve physical separation. The separation effect depends on the quality of the starting DNA and the specificity of the probe; it is generally best near the capture point and worsens with distance from it. HSE can generally isolate DNA fragments of about 20 kb to 50 kb, far longer than conventional detection methods can handle.
The starting DNA quality on which HSE depends has the following aspects. The first is DNA integrity, i.e., DNA length: the longer the DNA, the further the polymerase can extend along the template and the longer the separated fragment. The second is the randomness of DNA fragmentation: DNA breaks during extraction from the original sample, and the random positions of these breaks cause the capture effect to decrease with increasing distance (Dapprich et al., BMC Genomics (2016) 17:486). The third is the total amount of DNA: the human genome is about 3×10^9 bp, whereas the region that HSE can separate in one step is about 20 kb, i.e., about 6.67×10^-6 of the whole genome. A high input amount is therefore needed to secure enough of the target region, which makes HSE unsuitable for trace samples. All of these pose challenges for the experimental design of HSE.
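The proportion quoted above follows from simple arithmetic, sketched below in Python. The genome size and region size are the approximate figures from the text; the 1000 ng input amount is a hypothetical value chosen only to illustrate how little target DNA a given input contains.

```python
GENOME_SIZE_BP = 3e9   # approximate human genome size, as stated in the text
HSE_REGION_BP = 20e3   # approximate region separated in one HSE step


def target_fraction(region_bp, genome_bp=GENOME_SIZE_BP):
    """Fraction of the genome occupied by the target region."""
    return region_bp / genome_bp


def target_mass_ng(input_ng, region_bp, genome_bp=GENOME_SIZE_BP):
    """Mass of target-region DNA contained in a given genomic DNA input."""
    return input_ng * target_fraction(region_bp, genome_bp)


fraction = target_fraction(HSE_REGION_BP)
print(f"fraction of genome: {fraction:.3g}")

# Hypothetical input amount, for illustration only.
print(f"target DNA in 1000 ng input: {target_mass_ng(1000, HSE_REGION_BP):.3g} ng")
```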
The probe specificity on which HSE relies has the following aspects. The first is complementary base pairing, i.e., adenine (A) pairs with thymine (T) and guanine (G) pairs with cytosine (C). However, mismatches among the four bases have still been found to occur (Mei-Mei Huang et al., Nucleic Acids Research, Vol. 20, No. 17, 4567-4573). The second is that the polymerase should extend only when the bases at the 3' end of the probe are fully paired. However, the polymerase can still extend efficiently from some mismatches (supra). The third is probe length: the success rate of separation decreases as probe length increases, yet shorter probes cause nonspecific binding (J. Zander et al., Forensic Science International: Genetics 29 (2017) 242-249).
HSE combined with next-generation sequencing typically uses WGA (whole genome amplification) (Dapprich et al., BMC Genomics (2016) 17:486) or multiplex PCR. WGA requires a larger amount of sequencing data and costs more. Multiplex PCR is limited by the number of primers, so the number of detectable SNP sites is limited; it can detect only known mutations and is poorly suited to finding low-frequency mutations.
There remains a need in the art for highly specific, high throughput, rapid, low cost methods for haplotype isolation.
Disclosure of Invention
In the method of the invention, long fragments of the target region are enriched before allele detection. This increases the starting amount of the target region, making the method more suitable for trace samples, and it makes the fragment length uniform, ensuring the same capture effect at every target position. The method designs two probes for one allele pair (e.g., a SNP locus) in the enriched target region, one probe for each of the two alleles. Combined with allele detection, the method increases the number of alleles detected and significantly reduces the cost of allele (e.g., haplotype) analysis.
In one aspect, the present invention provides a method of enriching nucleic acid extension products, comprising: a. amplifying nucleic acids comprising one or more sets of allele pairs in a sample, the amplified products being at least 5kb, preferably at least 8kb, more preferably at least 10kb, and the amplified products comprising at least one set of allele pairs; b. contacting probes with the amplified product under conditions wherein the probes hybridize to the nucleic acid, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs and the extension of the probe that targets one of the alleles is inhibited, c.
In one or more embodiments, the binding sites of the probes to the amplification products are near either end of the amplification products. Preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product. Preferably, the binding site of the probe to the amplification product is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5% of the length of the amplification product from either end of the amplification product.
In one or more embodiments, the amplification is primer directed amplification, the primers being designed such that the binding sites of the probes to the amplification products are near either end of the amplification products. Preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product. Preferably, the binding site of the probe to the amplification product is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5% of the length of the amplification product from either end of the amplification product.
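The position constraint in the two paragraphs above can be checked with a small Python sketch. The function names, the 0-based coordinate convention, and the example coordinates are assumptions made for illustration; the patent specifies only the percentage thresholds.

```python
def distance_to_nearest_end(site_pos, amplicon_len):
    """Distance in bases from a probe binding site to the nearer amplicon end.

    site_pos is a 0-based coordinate on the amplicon (an assumed convention).
    """
    if not 0 <= site_pos < amplicon_len:
        raise ValueError("binding site lies outside the amplicon")
    return min(site_pos, amplicon_len - 1 - site_pos)


def is_near_end(site_pos, amplicon_len, max_fraction=0.30):
    """True if the site is closer to an end than max_fraction of the length."""
    return distance_to_nearest_end(site_pos, amplicon_len) < max_fraction * amplicon_len


# Hypothetical 10 kb amplicon with a probe binding site 800 bases from one end.
print(is_near_end(800, 10_000, max_fraction=0.10))
print(is_near_end(800, 10_000, max_fraction=0.05))
```

A primer design step could apply such a check to each candidate primer pair and keep only pairs that place the probe binding site within the chosen fraction of either end.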
In one embodiment, the amplification is PCR amplification. In one embodiment, a probe that targets one of the alleles comprises a modification that inhibits extension of the probe. In one embodiment, the modification is a dideoxy modification of a base. In one embodiment, the dideoxy modification of the base is located at the 3' end of the probe. In one embodiment, the sequences of the two alleles in an allele pair are different. In one embodiment, the nucleic acid may be genomic DNA. In one embodiment, the nucleic acid comprises a plurality of sets of allele pairs. In one embodiment, the amplified products comprise a plurality of sets of allele pairs. In one embodiment, in each of the plurality of sets of allele pairs, extension of the probe targeting one allele is inhibited. In one embodiment, in each of the plurality of sets of allele pairs, the allele whose probe is extension-inhibited is located on the same chromosome. In one embodiment, the extension of step b comprises the use of NTPs and/or dNTPs and a nucleic acid polymerase. In one embodiment, the NTPs and/or dNTPs include NTPs and/or dNTPs carrying an enrichment label. In one embodiment, the enrichment label is biotin, and the enrichment is performed using streptavidin-coated magnetic beads. In one embodiment, the allele pair is a SNP pair.
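As a rough illustration of the probe pair described in these embodiments, the following Python sketch models two allele-specific probes, one of which is marked as non-extendable to stand for a 3' dideoxy-terminated probe. Placing the SNP base at the 3' terminus of the probe, the 20-base probe length, the names `Probe` and `design_probe_pair`, and the example sequence are all assumptions for illustration; the patent does not prescribe them here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Probe:
    allele: str       # allele targeted by this probe
    sequence: str     # probe sequence written 5' to 3'
    extendable: bool  # False models a 3' dideoxy-terminated (blocked) probe


def design_probe_pair(upstream, alleles, blocked_allele, probe_len=20):
    """Build one probe per allele, with the SNP base at the 3' end (assumed).

    upstream: sequence immediately 5' of the SNP on the probe strand.
    alleles: the two allele bases; blocked_allele: the one whose probe is blocked.
    """
    if len(set(alleles)) != 2 or blocked_allele not in alleles:
        raise ValueError("expected two distinct alleles, one of them blocked")
    if probe_len < 2 or len(upstream) < probe_len - 1:
        raise ValueError("upstream sequence is too short for this probe length")
    shared = upstream[-(probe_len - 1):]  # part common to both probes
    return [Probe(a, shared + a, a != blocked_allele) for a in alleles]


# Hypothetical flanking sequence and alleles.
probes = design_probe_pair("GATTACAGGCTTAACCGGTTAGC", ("A", "G"), blocked_allele="G")
for p in probes:
    print(p.allele, p.sequence, "extendable" if p.extendable else "blocked")
```

In this model only the extendable probe would give rise to a labeled extension product, while the blocked probe occupies the other allele without being extended.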
In another aspect, the present invention provides a method for preparing a nucleic acid library, comprising: a. amplifying nucleic acids comprising an allele pair in a sample, the amplified product being at least 5kb, preferably at least 8kb, more preferably at least 10 kb; and the amplified product comprises at least one set of allele pairs; b. contacting probes with the amplified product under conditions in which the probes hybridize to the nucleic acid, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs and the extension of the probe that targets one of the alleles is inhibited, c.
In one or more embodiments, the binding sites of the probes to the amplification products are near either end of the amplification products. Preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product. Preferably, the binding site of the probe to the amplification product is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5% of the length of the amplification product from either end of the amplification product.
In one or more embodiments, the amplification is primer directed amplification, the primers being designed such that the binding sites of the probes to the amplification products are near either end of the amplification products. Preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product. Preferably, the binding site of the probe to the amplification product is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5% of the length of the amplification product from either end of the amplification product.
In one embodiment, the method further comprises step d: amplifying the enriched extension product. In one embodiment, the probe targeting one of the alleles comprises a modification that inhibits extension of the probe. In one embodiment, the modification is a dideoxy modification of a base. In one embodiment, the dideoxy modification of the base is located at the 3' end of the probe. In one embodiment, the sequences of the two alleles in an allele pair are different. In one embodiment, the nucleic acid may be genomic DNA. In one embodiment, the nucleic acid comprises a plurality of sets of allele pairs. In one embodiment, the amplification products of step a comprise a plurality of sets of allele pairs. In one embodiment, in each of the plurality of sets of allele pairs, extension of the probe targeting one allele is inhibited. In one embodiment, in each of the plurality of sets of allele pairs, the allele whose probe is extension-inhibited is located on the same chromosome. In one embodiment, the extension of step b comprises the use of NTPs and/or dNTPs and a nucleic acid polymerase. In one embodiment, the NTPs and/or dNTPs include NTPs and/or dNTPs carrying an enrichment label. In one embodiment, the enrichment label is biotin, and the enrichment is performed using streptavidin-coated magnetic beads. In one embodiment, the allele pair is a SNP pair.
In another aspect, the present invention provides a method for separating a haplotype, comprising: a. amplifying nucleic acids comprising an allele pair in a sample, the amplified product being at least 5kb, preferably at least 8kb, more preferably at least 10 kb; and the amplified product comprises at least one set of allele pairs; b. contacting probes with the amplified product under conditions wherein the probes hybridize to the nucleic acid, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs and the extension of the probe that targets one of the alleles is inhibited, c.
In one or more embodiments, the binding sites of the probes to the amplification products are near either end of the amplification products. Preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product. Preferably, the binding site of the probe to the amplification product is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5% of the length of the amplification product from either end of the amplification product.
In one or more embodiments, the amplification is primer directed amplification, the primers being designed such that the binding sites of the probes to the amplification products are near either end of the amplification products. Preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product. Preferably, the binding site of the probe to the amplification product is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5% of the length of the amplification product from either end of the amplification product.
In one embodiment, the probe targeting one of the alleles comprises a modification that inhibits extension of the probe. In one embodiment, the modification is a dideoxy modification of a base. In one embodiment, the dideoxy modification of the base is located at the 3' end of the probe. In one embodiment, the sequences of the two alleles in an allele pair are different. In one embodiment, the nucleic acid may be genomic DNA. In one embodiment, the nucleic acid comprises a plurality of sets of allele pairs. In one embodiment, the amplification products of step a comprise a plurality of sets of allele pairs. In one embodiment, in each of the plurality of sets of allele pairs, extension of the probe targeting one allele is inhibited. In one embodiment, in each of the plurality of sets of allele pairs, the allele whose probe is extension-inhibited is located on the same chromosome. In one embodiment, the extension of step b comprises the use of NTPs and/or dNTPs and a nucleic acid polymerase. In one embodiment, the NTPs and/or dNTPs include NTPs and/or dNTPs carrying an enrichment label. In one embodiment, the enrichment label is biotin, and the enrichment is performed using streptavidin-coated magnetic beads. In one embodiment, the pair of alleles comprises a SNP pair.
In another aspect, the present invention provides a method for detecting a haplotype, comprising: a. amplifying nucleic acids comprising an allele pair in a sample, the amplified product being at least 5kb, preferably at least 8kb, more preferably at least 10 kb; and the amplified product comprises at least one set of allele pairs; b. contacting probes with the amplified product under conditions wherein the probes hybridize to nucleic acids, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs and the extension of the probe that targets one of the alleles is inhibited, c.
In one embodiment, determining the haplotype of the sample comprises comparing the allele pair sequencing results obtained before the hybridization of step b with those obtained after the amplification of step d. In one or more embodiments, the change is a change in the sequencing depth of the allele pair. In one embodiment, the sequencing result is the proportion of either allele in the allele pair, e.g., a sequencing depth ratio.
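The comparison of sequencing depth ratios before and after enrichment might be sketched in Python as follows. The decision rule (assign the allele whose depth fraction rises by more than a threshold to the enriched haplotype), the `min_shift` threshold, the function names, and all read depths are hypothetical illustrations, not values or rules stated in the patent.

```python
def allele_fraction(depths, allele):
    """Fraction of the reads at one site that carry the given allele."""
    total = sum(depths.values())
    if total == 0:
        raise ValueError("no reads at this site")
    return depths.get(allele, 0) / total


def enriched_haplotype(before, after, min_shift=0.1):
    """For each site, pick the allele whose depth fraction rose after enrichment.

    before / after: {site: {allele: read depth}}.
    Returns {site: allele or None}; None means no allele shifted by more than
    min_shift, so the site is left unassigned.
    """
    calls = {}
    for site, before_depths in before.items():
        best_allele, best_shift = None, min_shift
        for allele in before_depths:
            shift = (allele_fraction(after[site], allele)
                     - allele_fraction(before_depths, allele))
            if shift > best_shift:
                best_allele, best_shift = allele, shift
        calls[site] = best_allele
    return calls


# Hypothetical read depths before hybridization and after the final amplification.
before = {"snp1": {"A": 50, "G": 50}, "snp2": {"C": 48, "T": 52}, "snp3": {"G": 51, "T": 49}}
after = {"snp1": {"A": 90, "G": 10}, "snp2": {"C": 15, "T": 85}, "snp3": {"G": 50, "T": 50}}
print(enriched_haplotype(before, after))
```

Sites whose allele ratio does not change, such as the third site in this made-up data, carry no phase information under this rule and are reported as unassigned.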
In one embodiment, the probe targeting one of the alleles comprises a modification that inhibits extension of the probe. In one embodiment, the modification is a dideoxy modification of a base. In one embodiment, the dideoxy modification of the base is located at the 3' end of the probe. In one embodiment, the sequences of the two alleles in an allele pair are different. In one embodiment, the nucleic acid may be genomic DNA. In one embodiment, the nucleic acid comprises a plurality of sets of allele pairs. In one embodiment, the amplification products of step a comprise a plurality of sets of allele pairs. In one embodiment, the amplifications of steps a and d are PCR amplifications using the same primers. In one embodiment, in each of the plurality of sets of allele pairs, extension of the probe targeting one allele is inhibited. In one embodiment, in each of the plurality of sets of allele pairs, the allele whose probe is extension-inhibited is located on the same chromosome. In one embodiment, the extension of step b comprises the use of NTPs and/or dNTPs and a nucleic acid polymerase. In one embodiment, the NTPs and/or dNTPs include NTPs and/or dNTPs carrying an enrichment label. In one embodiment, the enrichment label is biotin, and the enrichment is performed using streptavidin-coated magnetic beads. In one embodiment, the pair of alleles comprises a SNP pair.
In another aspect, the present invention provides a kit for haplotype isolation or haplotype detection, comprising: primers for amplifying nucleic acids comprising one or more sets of allele pairs in a sample, the amplified product being at least 5 kb, preferably at least 8 kb, more preferably at least 10 kb, and the amplified product comprising at least one set of allele pairs; nucleic acid probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs, wherein the probe that targets one of the alleles comprises a modification that inhibits extension of the probe; a nucleic acid polymerase; and NTPs and/or dNTPs carrying an enrichment label.
In one or more embodiments, the binding sites of the probes to the amplification products are near either end of the amplification products. Preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product. Preferably, the binding site of the probe to the amplification product is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5% of the length of the amplification product from either end of the amplification product.
In one or more embodiments, the amplification is primer directed amplification, the primers being designed such that the binding sites of the probes to the amplification products are near either end of the amplification products. Preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product. Preferably, the binding site of the probe to the amplification product is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5% of the length of the amplification product from either end of the amplification product.
In one embodiment, the nucleic acid comprises a plurality of sets of allele pairs. In one embodiment, the amplified products comprise a plurality of sets of allele pairs. In one embodiment, the modification is a dideoxy modification of a base. In one embodiment, the dideoxy modification of the base is located at the 3' end of the probe. In one embodiment, the sequences of the two alleles in an allele pair are different. In one embodiment, the allele pairs are a plurality of sets of allele pairs. In one embodiment, for each of the plurality of sets of allele pairs, extension of the probe targeting one allele is inhibited. In one embodiment, for each of the plurality of sets of allele pairs, the allele whose probe is extension-inhibited is located on the same chromosome. In one embodiment, the enrichment label is biotin. In one embodiment, the allele pair is a SNP pair. In one embodiment, the nucleic acid polymerase is a DNA polymerase. In one embodiment, the kit further comprises reagents for PCR, including but not limited to lysis reagents, buffers, primers, negative controls, and positive controls. In some embodiments, the kit further comprises instructions for performing the methods described herein.
Drawings
FIG. 1 is a flow chart of one embodiment of the methods described herein.
FIG. 2 is a schematic diagram of one embodiment of a method described herein. Label 1: the probe binds specifically to one haplotype and does not bind completely to the other haplotype. Label 2: the polymerase extends from the binding site and incorporates biotin-modified dNTPs. Label 3: streptavidin-coated magnetic beads capture the DNA strands containing biotin. Label 4: the streptavidin-coated magnetic beads are attracted by a magnet, achieving physical separation.
FIG. 3 is a schematic representation of the results of the methods described herein. In the figure, the 4 solid lines represent 4 copies of the same enrichment (amplification) product from the same sample, and the 4 dotted lines mark SNP polymorphic sites at the same positions on the product. The dot near the end of one dotted line is the hybridization point, i.e., the position where polymerase extension starts, and the arrow indicates the direction of polymerase extension. x4 indicates that the SNP polymorphic site is present in 4 copies. Thus every SNP polymorphic site on the amplification product has the same copy number.
FIG. 4 is a schematic representation of the results of the comparative method. In the figure, the 4 solid lines represent 4 naturally occurring copies of DNA from the same sample, and the 4 dotted lines mark SNP polymorphic sites at the same positions on the different DNA copies. The dot on one dotted line is the hybridization point, i.e., the position where polymerase extension starts, and the arrow indicates the direction of polymerase extension. x4, x3, and x2 indicate that the SNP polymorphic site is present in 4, 3, and 2 copies, respectively. The difference in copy number at different positions is caused by random fragmentation of the native DNA.
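The contrast between FIG. 3 and FIG. 4 can be mimicked with a small Python sketch that counts, for each SNP position, how many DNA copies span both the hybridization point and that SNP. The coordinates and fragment boundaries below are invented for illustration and are not read from the figures.

```python
def linked_copies(fragments, capture_pos, snp_positions):
    """Count the fragments that span both the capture point and each SNP.

    fragments: list of (start, end) coordinates, inclusive.
    Returns {snp position: number of fragments covering it and capture_pos}.
    """
    counts = {}
    for snp in snp_positions:
        counts[snp] = sum(
            1
            for start, end in fragments
            if start <= capture_pos <= end and start <= snp <= end
        )
    return counts


capture_pos = 100            # hypothetical hybridization point
snps = [2000, 5000, 9000]    # hypothetical SNP positions

# Uniform amplification products (as in FIG. 3): four identical 10 kb copies.
amplicons = [(0, 10_000)] * 4
# Randomly fragmented native DNA (as in FIG. 4): four copies with different ends.
native = [(0, 10_000), (50, 9_500), (0, 6_000), (80, 3_000)]

print(linked_copies(amplicons, capture_pos, snps))
print(linked_copies(native, capture_pos, snps))
```

With uniform products every SNP is linked to the capture point in all four copies, whereas with these example native fragments the linked copy number drops as the SNP lies further from the capture point.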
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art. See, e.g., Lackie, Dictionary of Cell and Molecular Biology, Elsevier (4th edition, 2007); Green et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press (Cold Spring Harbor, N.Y., 2012).
The invention provides a haplotype detection method, which comprises the following steps: a. amplifying nucleic acids comprising an allele pair in a sample, the amplified product being at least 5kb, preferably at least 8kb, more preferably at least 10 kb; and the amplified product comprises at least one set of allele pairs; b. contacting the sample containing nucleic acids with probes under conditions wherein the probes hybridize to the nucleic acids, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs and the extension of the probe that targets one of the alleles is inhibited, c. extending the probes, and d. amplifying and sequencing the extension products of step c, thereby determining the haplotype of the sample.
The terms "a" or "an" are intended to mean "one or more". The term "comprising" and its variants, such as "comprises" and "comprise", are intended to mean that further steps or elements may optionally be added and are not excluded. When used to define compositions and methods, "consisting essentially of" means excluding other elements of any essential significance to the combination for its intended purpose. "Consisting of" means excluding more than trace amounts of other components and excluding substantial additional method steps. Any methods, devices, and materials similar or equivalent to those described herein can be used in the practice of the present invention. The following definitions are provided to facilitate understanding of certain terms used frequently herein and are not intended to limit the scope of the present invention.
The term "sample" as used herein refers to any tissue or fluid from a subject that is suitable for enrichment of nucleic acids. The subject may be any living or non-living organism, including but not limited to a human, non-human animal, plant, bacterium, fungus, or protist. Any human or non-human animal can be selected, including, but not limited to, mammals, reptiles, birds, amphibians, fish, ungulates, ruminants, bovines (e.g., cattle), equines (e.g., horses), caprines and ovines (e.g., goats, sheep), porcines (e.g., pigs), camelids (e.g., camels, llamas, alpacas), monkeys, apes (e.g., gorillas, chimpanzees), ursids (e.g., bears), poultry, dogs, cats, mice, rats, dolphins, whales, and sharks. The subject may be male or female (e.g., woman, pregnant woman). The subject may be of any age (e.g., embryo, fetus, infant, child, adult).
Nucleic acids can be isolated from any type of suitable biological sample or specimen. The sample or test sample may be any specimen isolated or obtained from a subject or portion thereof. Non-limiting examples of test samples include fluids or tissues of a subject including, but not limited to, blood or blood products, cord blood, chorionic villi, amniotic fluid, cerebrospinal fluid, spinal fluid, washing fluid, biopsy sample, interstitial fluid sample, cells or parts thereof, female genital tract wash, urine, stool, sputum, saliva, nasal mucus, prostatic fluid, lavage fluid, semen, lymph fluid, bile, tears, sweat, breast milk, breast fluid, and the like, or combinations thereof. In some embodiments, the biological sample may be blood, and sometimes plasma or serum. Other suitable biological samples will be familiar to those of ordinary skill in the relevant art. Biological samples can be obtained using techniques well within the ordinary knowledge of clinical practitioners. The methods disclosed herein can be used to enrich for nucleic acids in any type of sample, such as samples of genomic DNA.
The term "isolated" refers to a material, such as a nucleic acid molecule and/or protein, that is substantially free of, or removed from, components that normally accompany or interact with the material in a naturally occurring environment. Isolated polynucleotides may be purified from host cells in which they naturally occur. Conventional nucleic acid purification methods known to those skilled in the art can be used to obtain isolated polynucleotides.
The terms "nucleic acid", "nucleic acid molecule", "polynucleotide sequence", "nucleic acid sequence" and "nucleic acid fragment" are used interchangeably herein to refer to a polymer of deoxyribonucleotides or ribonucleotides, and their complements, in either single- or double-stranded form. Examples of nucleic acids include single- and double-stranded DNA, single- and double-stranded RNA (including siRNA), and hybrid molecules having a mixture of single- and double-stranded DNA and RNA, optionally containing synthetic, non-natural or altered nucleotide bases. The polynucleotide in the form of a polymer of DNA may comprise one or more strands of cDNA, genomic DNA, synthetic DNA, or mixtures thereof. The nucleic acids of the invention may comprise a combination of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. The nucleic acids of the invention can be synthesized to include unnatural amino acid modifications. The nucleic acids of the invention can be obtained by chemical synthesis methods or by recombinant methods.
In some embodiments, the nucleic acid is fragmented or cleaved before, during, or after the methods of the invention. As used herein, "fragmenting" or "cleaving" refers to a method or conditions under which a nucleic acid molecule (e.g., a nucleic acid template gene molecule or an amplification product thereof) can be separated into two or more smaller nucleic acid molecules. Nucleic acid fragments may contain overlapping nucleotide sequences, such overlapping sequences may facilitate the construction of nucleotide sequences of corresponding nucleic acids or segments thereof that are not fragmented. In certain embodiments, the nucleic acid may be partially fragmented (e.g., from an incomplete or aborted specific cleavage reaction) or completely fragmented. Such fragmentation or cleavage can be sequence specific, base specific or non-specific and can be accomplished by any of a variety of methods, reagents or conditions including, for example, chemical, enzymatic, physical fragmentation.
The term "extension" refers to the process in which, starting from the 3' end of a primer bound to a template (RNA or DNA), a nucleic acid polymerase joins nucleotides one by one on the basis of base pairing, synthesizing a strand complementary to the template in the 5'→3' direction. The term "extension product" refers to a nucleic acid fragment produced during an extension reaction. The extension composition may comprise components for nucleic acid extension, such as: nucleotide triphosphates (NTPs or dNTPs), one or more primers or probes having the appropriate sequence, polymerase, buffer, solutes and proteins. The NTPs or dNTPs include NTPs or dNTPs carrying an enrichment label.
In the present invention, nucleic acids in a sample are enriched prior to hybridization with a probe. Preferably, the enrichment is achieved by amplification, such as PCR amplification. As used herein, the product of "amplification" refers to a nucleic acid fragment produced during a primer directed amplification reaction. Typical methods for primer directed amplification include Polymerase Chain Reaction (PCR), Ligase Chain Reaction (LCR) or Strand Displacement Amplification (SDA). If a PCR method is chosen, the replication composition may comprise components for nucleic acid replication, such as: nucleotide triphosphates, two (or more) primers with appropriate sequences, thermostable polymerase, buffers, solutes, and proteins.
In one or more embodiments, the amplified product is at least 2kb, at least 3kb, at least 4kb, at least 5kb, at least 6kb, at least 7kb, at least 8kb, at least 9kb, at least 10kb or at least 12 kb. In one or more embodiments, the amplified product is 2-20kb, 4-18kb, 6-16kb, 8-14kb, preferably 10-12kb in length. In one or more embodiments, the amplified product comprises at least one set of allele pairs.
Herein, the primers and the probes are designed such that the probe binds the amplification product near one of its ends and extends toward the other end. The inventors have found that, to ensure sufficient extension of the probe in the direction of extension, the binding site should be as close as possible to one end of the template. Thus, the distance from the binding site of the probe to the nearer end of the amplification product is less than 30%, and may be less than as little as 5%, of the length of the amplification product. Preferably, this distance is less than 25%, 20%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, or 5% of the length of the amplification product.
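The positional criterion above can be illustrated with a minimal sketch. The function names, the 0-based half-open coordinates, and the choice to measure the distance from the edge of the binding site nearest to the amplicon end are assumptions made for illustration only; the text does not specify how the distance is measured.

```python
def end_distance_fraction(site_start, site_end, amplicon_length):
    """Distance from the probe binding site to the nearer amplicon end,
    expressed as a fraction of the amplicon length.

    Coordinates are 0-based and half-open: [site_start, site_end).
    (Hypothetical convention, chosen for this sketch.)
    """
    if not 0 <= site_start < site_end <= amplicon_length:
        raise ValueError("binding site must lie inside the amplicon")
    distance_to_left_end = site_start
    distance_to_right_end = amplicon_length - site_end
    return min(distance_to_left_end, distance_to_right_end) / amplicon_length


def is_near_end(site_start, site_end, amplicon_length, max_fraction=0.30):
    """True if the binding site lies within max_fraction of either end."""
    fraction = end_distance_fraction(site_start, site_end, amplicon_length)
    return fraction < max_fraction


# A 20 nt probe binding 400 bp from the left end of a 10 kb amplicon.
print(end_distance_fraction(400, 420, 10_000))            # 0.04
print(is_near_end(400, 420, 10_000, max_fraction=0.05))   # True
# The same probe binding near the middle of the amplicon fails the criterion.
print(is_near_end(4_000, 4_020, 10_000))                  # False
```

Passing a stricter `max_fraction` (for example 0.10 or 0.05) corresponds to the preferred thresholds listed in the text.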
The term "primer" refers to a synthetic oligonucleotide that is capable of acting as a point of initiation of nucleic acid synthesis or replication along a complementary strand when placed under conditions in which synthesis of the complementary strand is catalyzed by a polymerase. The primer may also contain an enrichment marker.
The term "probe" refers to a synthetic oligonucleotide that is complementary (but not necessarily fully complementary) to a polynucleotide of interest and forms a duplex structure upon hybridization to at least one strand of the polynucleotide of interest. Probes of the invention may comprise single-stranded nucleic acids that hybridize under stringent hybridization conditions to a target sequence in a nucleic acid. A "target sequence" is a nucleic acid sequence that defines the portion of a nucleic acid to which a binding molecule, e.g., a probe, will hybridize (bind), provided that sufficient conditions for hybridization exist. The probe may also contain an enrichment marker. The probe can be designed by well-known methods. For example, probe design can follow these principles: (1) the 3' end of the probe comprises a base complementary to the base at the SNP site, and this complementary base can be the last or the penultimate base at the 3' end; (2) the GC content is 30-70%; (3) the Tm value is within 50-58 ℃; (4) there are no hairpins, no self-ligating structures, and no 4 consecutive G or C bases. Preferably, the probe of the present invention has a length of 15 to 22 nt.
In one embodiment, the probes comprise at least two probes that target two alleles of an allele pair, respectively, and extension of the probe that targets one of the alleles is inhibited or blocked. In one embodiment, the allele pairs comprise multiple sets of allele pairs. In each of the sets of allele pairs, extension of the probe targeting one allele is inhibited or blocked. In one embodiment, in each of the plurality of sets of allele pairs, the allele whose extension of the probe is inhibited or blocked is located on the same chromosome. In some embodiments, the methods of the invention comprise unlabeled probes and labeled probes for enriching nucleic acids.
The term "modified" as used throughout this document is intended to indicate that the sequence is altered in any way. The modification of the nucleic acid sequence of interest (and deletions thereof) may be in all desired forms. The modification of the present invention is not limited as long as the modification can inhibit or block nucleotide extension. Modifications to nucleic acids include the addition of a group to a base, such as methylation, and also include the action of catalyzing the entry of a rare base into RNA or DNA. The modification may also be a dideoxy modification of a base, which is capable of terminating extension of the nucleic acid, for example a dideoxy modification of the 3' terminal base of the polynucleotide. In one embodiment, the probes of the invention comprise a modification that inhibits or blocks extension of the probe. In one embodiment, a probe that targets one allele of a pair of alleles comprises a modification that inhibits or blocks extension of the probe. In one embodiment, the probes targeting both alleles of an allele pair comprise a modification that inhibits or blocks extension of the probe. In one embodiment, the modification to the nucleic acid is a dideoxy modification. In one embodiment, the dideoxy modification is at the 3' terminus.
The term "targeting" as used herein refers to an action directed at a particular target (molecule, cell, individual, etc.), such as the targeted insertion of a foreign gene at a desired location in the genomic DNA of a host cell, or the targeted delivery or action of a drug molecule on effector target tissues or cells. In the present invention, "hybridization" allows nucleic acids to be targeted through base pairing between two single-stranded DNA or RNA molecules. The present invention uses sequence-specific oligonucleotides to separate, select and/or enrich subsets of nucleic acids in a sample for further processing.
As used herein, the terms "label," "enrichment label," and the like refer to a molecule that can be used to enrich for nucleic acids, including, but not limited to, enzymes, metal ions, metal sols, semiconductor nanocrystals, and ligands (e.g., biotin, avidin, streptavidin, or haptens). An example of an enrichment marker may be biotin, in which case nucleic acid enrichment may be performed using a solid phase with e.g. streptavidin.
The term "percent identity," when referring to two or more polynucleotides, refers to two or more sequences or subsequences that are the same, or have a specified percentage of nucleotides that are the same (i.e., about 60% identity, preferably 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or more identity over a specified region, when compared and aligned for maximum correspondence over a comparison window or designated region), as measured using the BLAST or BLAST 2.0 sequence alignment algorithm with default parameters, or by manual alignment and visual inspection. See, e.g., the NCBI website. Such sequences may then be said to be "substantially identical". The percent identity is typically determined over optimally aligned sequences, making this definition applicable to sequences having deletions and/or additions, as well as sequences having substitutions. Algorithms commonly used in the art take into account gaps and the like. Typically, identity exists over a region comprising a sequence of at least about 25 nucleotides in length, or over a region of 50-100 nucleotides in length, or over the entire length of a reference sequence.
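As a minimal sketch of the arithmetic behind "percent identity", the function below counts identical positions over the columns of two sequences that have already been aligned (for example by hand). It is not BLAST and performs no alignment itself; the function name and the use of "-" as the gap character are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must already be aligned and therefore have equal length.
    Columns that contain a gap in either sequence count toward the
    alignment length but never count as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper())
        if base_a == base_b and base_a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Ten alignment columns: one gap column and one substitution, so 8 matches.
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))  # 80.0
```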
The term "selectively" or "selective" with respect to a nucleic acid refers to the distinction between a target nucleic acid sequence (e.g., a target sequence of an allele) and a non-target nucleic acid sequence (e.g., a non-target sequence of an allele). Targeting is selective for a sequence if little or no hybridization of the primer or probe to a non-target sequence occurs.
Alleles, also known as allelomorphs, are replicable deoxyribonucleic acid sequences that occupy a chromosomal locus. An allele pair generally refers to a pair of genes, sequences, or even single nucleotides that are located at the same position on a pair of homologous chromosomes and control contrasting traits. The enrichment methods of the invention utilize allele pairs contained in nucleic acids to enrich a subset of nucleic acids. Thus, a nucleic acid herein may comprise an allele pair. In one embodiment, the nucleic acid comprises one or more sets of allele pairs. In one embodiment, the sequences of the two alleles in an allele pair are different. The allele may be a SNP. Alleles may comprise SNPs. SNPs refer to DNA sequence polymorphisms at the genomic level caused by single nucleotide variations (including transitions, transversions, deletions and insertions). Theoretically, there are 4 different variants at each SNP site, but only two types, i.e., transition and transversion, occur in practice. In one embodiment, the allele pair may be a SNP pair.
DNA sequencing refers to the analysis of the base sequence of a particular DNA fragment. The sequencing depth is expressed as the ratio of the total base number (bp) obtained by sequencing to the Genome size (Genome), and is one of the indexes for evaluating the sequencing quantity. The sequencing depth and the genome coverage are in a positive correlation, and the error rate or false positive results brought by sequencing can be reduced along with the increase of the sequencing depth.
The invention also provides a haplotype detection method, which comprises the following steps: a. amplifying nucleic acids comprising an allele pair in a sample, the amplified product being at least 5kb, preferably at least 8kb, more preferably at least 10 kb; and the amplified product comprises at least one set of allele pairs; b. contacting the sample containing nucleic acids with probes under conditions wherein the probes hybridize to the nucleic acids, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs and the extension of the probe that targets one of the alleles is inhibited, c. extending the probes, and d. amplifying and sequencing the extension products of step c, thereby determining the haplotype of the sample.
In one embodiment, determining the haplotype of the sample comprises comparing the sequencing results of the allele pairs before the hybridization in step b with those after the amplification in step d. In one or more embodiments, the change is a change in the sequencing depth of the allele pair. In one embodiment, the sequencing result is the proportion of either allele in the allele pair, for example a ratio of sequencing depths. For example, if a site carries both alleles A1 and A2, the sequencing result before haplotype isolation is A1, A2 (0/1:1535,1765:3300), where 0/1 indicates that the site is heterozygous, 1535 is the sequencing depth of A1, 1765 is the sequencing depth of A2, and 3300 is the sum of the two depths, i.e., the total sequencing depth of the site. A1 therefore accounts for 46.5% of the total sequencing depth before haplotype isolation. After haplotype isolation, the sequencing result is A1, A2 (0/1:2127,672:2976), and A1 accounts for 71.5% of the total sequencing depth. The proportion of A1 in the total sequencing depth thus rises markedly from 46.5% to 71.5% after haplotype isolation, indicating that probe capture succeeded and that A1 was isolated.
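The proportions quoted in this example can be reproduced with a short sketch. The record layout "genotype:depth_A1,depth_A2:total_depth" is read off the example itself, and the function names are hypothetical. The total depth is used exactly as reported; for the post-isolation record the reported total (2976) is not equal to the sum of the two allele depths (2127 + 672 = 2799), and the sketch does not attempt to reconcile this.

```python
def parse_site(record):
    """Split a record such as '0/1:1535,1765:3300' into its fields."""
    genotype, allele_depths, total_depth = record.split(":")
    depth_a1, depth_a2 = (int(value) for value in allele_depths.split(","))
    return genotype, depth_a1, depth_a2, int(total_depth)


def a1_fraction(record):
    """Share of the reported total sequencing depth contributed by allele A1."""
    _, depth_a1, _, total_depth = parse_site(record)
    if total_depth <= 0:
        raise ValueError("total depth must be positive")
    return depth_a1 / total_depth


before = a1_fraction("0/1:1535,1765:3300")
after = a1_fraction("0/1:2127,672:2976")
print(f"before isolation: {before:.1%}")  # before isolation: 46.5%
print(f"after isolation:  {after:.1%}")   # after isolation:  71.5%
```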
As used herein, "enrichment," "concentration," "separation," and "purification" are used interchangeably and refer to the process of utilizing differences in physical or chemical properties of one or more of a plurality of components to allow one or more of the components to partition relatively to different spatial regions or to sequentially partition to the same spatial region at different times. Enrichment can be achieved by pooling one or more of the components in a larger number of fractions, thereby increasing the concentration of the component, separating the component from other components, or removing other components from the component, etc. Herein, enrichment may refer to the process of separating one or more of the multiple alleles (e.g., SNPs) from the other alleles so that the concentration is increased. In one embodiment, the enrichment methods herein can spatially separate one allele from another allele in a set of allele pairs. In one embodiment, the enrichment methods herein can spatially separate one or more alleles from other alleles in one or more sets of allele pairs. The enrichment methods herein can spatially separate the allele targeted by a probe whose extension is not inhibited from the allele targeted by a probe whose extension is inhibited. In one embodiment, when one allele is targeted by a probe whose extension is inhibited in each of the sets of allele pairs, the enrichment methods herein can spatially separate one allele from the other for each of the sets of allele pairs. For example, the enriched product obtained by the methods herein can comprise one allele of each of the plurality of sets of allele pairs. Advantageously, the polynucleotide is attached to a solid support to facilitate said enrichment. For example, the polynucleotide may be present in an acrylamide or agarose gel matrix, or more preferably, immobilized on a membrane surface, particle surface, or in the well of a microtiter plate. 
The attachment to the solid support may be achieved through NTPs or dNTPs carrying an enrichment label and a molecule on the solid support that binds the enrichment label. For example, in embodiments where the enrichment label is biotin, the polynucleotide can be attached to a solid support (e.g., a particle) that carries streptavidin, thereby effecting enrichment. The solid support may be a magnetic particle or a particle that responds to magnetic adsorption, so that the solid support with the polynucleotide attached thereto is conveniently enriched by magnetic adsorption.
The term "polymerase chain reaction" (PCR) is a molecular biological technique for amplifying a specific DNA fragment. The basic principle of PCR technology is similar to the natural replication process of DNA, and its specificity depends on oligonucleotide primers complementary to both ends of the target sequence. Common PCR consists of three basic reaction steps of denaturation, annealing and extension.
The term "about" or "approximately" means within an acceptable error range for the particular value determined by one of ordinary skill in the art, which will depend on the manner in which the value is measured or determined, e.g., the limits of the measurement system. For example, "about" can mean within 1 or more standard deviation ranges. Alternatively, "about" may represent a range of up to 20%, or up to 10%, or up to 5%, or up to 1% of a given value. Alternatively, especially for biological systems or processes, the term may mean within an order of magnitude, preferably within 5-fold, more preferably within 2-fold, of a value. When a particular value is described in this application and in the claims, unless otherwise stated, the term "about" is assumed to mean within an acceptable error range for the particular value.
All percentages and ratios/ratios are by weight unless explicitly stated otherwise.
All percentages and ratios are calculated based on the total composition, unless otherwise indicated.
Every maximum numerical limitation given throughout this disclosure includes every lower numerical limitation, as if such lower numerical limitations were expressly written herein. Every minimum numerical limitation given throughout this disclosure includes every higher numerical limitation, as if such higher numerical limitations were expressly written herein. Every numerical range given throughout this disclosure includes every narrower numerical range that falls within such broader numerical range, as if such narrower numerical ranges were all expressly written herein.
The values recited herein should not be construed as being strictly limited to the exact numerical values recited. Rather, unless specifically stated otherwise, each stated value is intended to mean both the stated value and a functionally equivalent range surrounding that value. For example, a value disclosed as "20 μl" is intended to mean "about 20 μl".
Each document cited herein, including any cross-reference and related patents or applications, is hereby incorporated by reference unless expressly excluded or otherwise limited. The citation of any document is not an admission that it is prior art with respect to any invention disclosed or claimed herein or that it alone, or in any combination with any other reference or references, teaches, suggests or discloses any such invention. Further, to the extent that any meaning or definition of a term in this document conflicts with any meaning or definition of the term in a document incorporated by reference, the meaning or definition assigned to the term in this document shall govern.
While particular embodiments of the present disclosure have been illustrated and described, various other changes and modifications can be made without departing from the spirit and scope of the disclosure. The scope of the appended claims includes all such changes and modifications as fall within the true spirit and scope of the disclosure.
Examples
In order that the invention disclosed herein may be more effectively understood, the following examples are provided. It should be understood that these examples are for illustrative purposes only and are not to be construed as limiting the invention in any way. Unless otherwise indicated, the molecular cloning reactions and other standard recombinant DNA techniques throughout these examples were performed according to the methods described by Maniatis et al., Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Press (1989), using commercially available reagents.
Example 1. Materials and methods
(I) Probe design
The probe design was performed according to the following principles:
1. The 3' end of the probe contains a base complementary to the base at the SNP site.
2. The GC content is 30-70%.
3. The Tm value is within 50-58 ℃.
4. There are no hairpins, no self-ligating structures, and no 4 consecutive G or C bases.
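The sequence-based design rules can be screened with a minimal sketch such as the one below. Several points are assumptions made for illustration: the function names are hypothetical; "4 consecutive G or C bases" is read as a run of four identical bases (GGGG or CCCC); the Tm is supplied by the caller because the text does not name a Tm formula; the 15-22 nt length window and the "last or penultimate base" rule are taken from the probe definition in the Detailed Description; and hairpin and self-ligation checks are left out because they require a secondary-structure tool.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_gc_run(seq, run_length=4):
    """True if the sequence contains run_length identical G or C bases in a row."""
    seq = seq.upper()
    return "G" * run_length in seq or "C" * run_length in seq


def check_probe(seq, tm_celsius, snp_base_on_template):
    """Return a pass/fail flag for each rule that can be checked from sequence alone."""
    seq = seq.upper()
    expected_base = COMPLEMENT[snp_base_on_template.upper()]
    return {
        "length_15_to_22_nt": 15 <= len(seq) <= 22,
        "gc_30_to_70_percent": 0.30 <= gc_content(seq) <= 0.70,
        "tm_50_to_58_c": 50.0 <= tm_celsius <= 58.0,
        "no_run_of_4_g_or_c": not has_gc_run(seq),
        # The SNP-complementary base may be the last or the penultimate base.
        "snp_base_at_3_prime_end": expected_base in (seq[-1], seq[-2]),
    }


# Hypothetical 18 nt probe; the Tm of 54 is an invented input, not a measured value.
result = check_probe("ATGCTAGCTAGGATCCTA", tm_celsius=54.0, snp_base_on_template="T")
print(all(result.values()))  # True
```

A probe that fails any flag would be redesigned, for example by shifting its 5' end, before the hairpin and self-ligation checks are run in a dedicated tool.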
(II) Enrichment of the target region
The target region was amplified using LongAmp Taq DNA polymerase from NEB.
Amplification system
Component          Volume
DNA                14 μl
5x PCR buffer      5 μl
dNTPs (10 mM)      0.75 μl
Taq                1 μl
Primer (10 μM)     1 μl
H2O                x μl
Total              25 μl
Amplification reaction procedure
[The amplification program is shown only as images in the original publication: Figures BDA0002341652380000151 and BDA0002341652380000161.]
(III) Probe hybridization and extension
Hybridization and extension system
Component            Concentration    Volume
DNA (PCR product)    x                100/x μl
Probes               100 μM           3 μl (1.5 μl x 2)
Taq                  -                0.3 μl
10x buffer           -                2 μl
dATP                 4 mM             1 μl
dTTP                 4 mM             1 μl
dCTP                 4 mM             0.8 μl
dGTP                 4 mM             1 μl
bio-14-dCTP          0.4 mM           2 μl
H2O                  -                to 20 μl
Total                -                20 μl
Hybridization and extension procedure: 95 ℃ for 7.5 minutes, then 64 ℃ for 20 minutes.
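The quantities in the hybridization and extension system can be checked with a short calculation. It shows that unlabeled dCTP plus bio-14-dCTP together supply the same molar amount as each of the other dNTPs, with the biotinylated form making up one fifth of the dCTP pool. Two points are assumptions: the variable names are hypothetical, and the DNA concentration x is taken to be in ng/μl, so that 100/x μl corresponds to 100 ng of PCR product (the table does not state the unit).

```python
# (stock concentration in mM, volume in μl), as listed in the table above.
DNTPS = {
    "dATP": (4.0, 1.0),
    "dTTP": (4.0, 1.0),
    "dCTP": (4.0, 0.8),
    "dGTP": (4.0, 1.0),
    "bio-14-dCTP": (0.4, 2.0),
}
PROBES_UL, TAQ_UL, BUFFER_UL, TOTAL_UL = 3.0, 0.3, 2.0, 20.0

# 1 mM equals 1 nmol/μl, so concentration times volume gives nmol.
nmol = {name: conc * vol for name, (conc, vol) in DNTPS.items()}
total_dctp_nmol = nmol["dCTP"] + nmol["bio-14-dCTP"]
biotin_fraction = nmol["bio-14-dCTP"] / total_dctp_nmol

fixed_volume_ul = PROBES_UL + TAQ_UL + BUFFER_UL + sum(vol for _, vol in DNTPS.values())


def dna_and_water_volumes(dna_conc_ng_per_ul, dna_mass_ng=100.0):
    """Volumes of PCR product and water needed to reach the 20 μl total."""
    dna_ul = dna_mass_ng / dna_conc_ng_per_ul
    water_ul = TOTAL_UL - fixed_volume_ul - dna_ul
    if water_ul < 0:
        raise ValueError("PCR product is too dilute to fit in a 20 μl reaction")
    return dna_ul, water_ul


print(round(total_dctp_nmol, 2), round(biotin_fraction, 2))  # 4.0 0.2
dna_ul, water_ul = dna_and_water_volumes(25.0)  # hypothetical 25 ng/μl product
print(round(dna_ul, 2), round(water_ul, 2))     # 4.0 4.9
```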
(IV) Capture
The hybridization and extension products were captured using the Dynabeads kilobaseBINDER Kit from Invitrogen.
1. Preparation of the Dynabeads magnetic beads
1.1 Mix the beads by vortexing and transfer 5 μl to a new 1.5 ml centrifuge tube.
1.2 Place the centrifuge tube on a magnetic rack for 2 minutes; once the beads have separated from the supernatant, remove the supernatant with a pipette.
1.3 Remove the centrifuge tube from the magnetic rack, add 20 μl of binding buffer, and mix gently by pipetting.
1.4 Place the centrifuge tube on the magnetic rack for 2 minutes; once the beads have separated from the supernatant, remove the supernatant with a pipette.
1.5 Remove the centrifuge tube from the magnetic rack, add 20 μl of binding buffer, and mix gently by pipetting.
2. Transfer the entire nucleic acid sample from probe hybridization and extension to the beads prepared in step 1, mix gently by pipetting, then place the tube on a rotary mixer and mix at room temperature for 3 hours.
3. Washing the magnetic beads
3.1 Place the centrifuge tube from step 2 on the magnetic rack for 2 minutes; once the beads have separated from the supernatant, remove the supernatant with a pipette.
3.2 Remove the centrifuge tube from the magnetic rack, add 40 μl of washing buffer, and mix gently by pipetting.
3.3 Place the centrifuge tube on the magnetic rack for 2 minutes; once the beads have separated from the supernatant, remove and discard the supernatant with a pipette.
3.4 Repeat steps 3.2-3.3 once.
3.5 Remove the centrifuge tube from the magnetic rack, add 40 μl of water, and mix gently by pipetting.
3.6 Place the centrifuge tube on the magnetic rack for 2 minutes; once the beads have separated from the supernatant, remove and discard the supernatant with a pipette.
3.7 Remove the centrifuge tube from the magnetic rack and add 14 μl of water to resuspend the beads.
(V) Amplification and enrichment of the captured products
The target region was amplified using LongAmp Taq DNA polymerase from NEB.
Amplification system
Component          Volume
DNA                14 μl
5x PCR buffer      5 μl
dNTPs (10 mM)      0.75 μl
Taq                1 μl
Primer (10 μM)     1 μl
H2O                x μl
Total              25 μl
Amplification reaction procedure
[The amplification program is shown only as an image in the original publication: Figure BDA0002341652380000171.]
(VI) Library construction
The present invention includes two library construction methods.
(1) Library construction was performed using the VAHTS Universal Plus DNA Library Prep Kit for Illumina from Vazyme.
1. DNA fragmentation, end repair, and A-tailing
Reaction system
Component          Volume
PCR product        x μl
FEA Buffer         5 μl
FEA Enzyme Mix     10 μl
H2O                to 35 μl
Total              50 μl
Reaction procedure
Temperature        Time
37 ℃               19 min
65 ℃               30 min
4 ℃                hold
2. Adapter ligation
Reaction system
Component                  Volume
Product of step 1          50 μl
Rapid Ligation Buffer 3    25 μl
Rapid DNA Ligase           5 μl
DNA Adapter X              5 μl
H2O                        15 μl
Total                      100 μl
Reaction procedure
Temperature        Time
20 ℃               15 min
4 ℃                hold
3. Magnetic bead purification
The reaction product was purified using VAHTS DNA Clean Beads from Vazyme.
3.1 Equilibrate the beads to room temperature, then vortex to mix them thoroughly.
3.2 Add 100 μl of beads to the reaction product of step 2, mix gently by pipetting, and let stand at room temperature for 5 min.
3.3 Place the centrifuge tube on a magnetic rack for 2 min; once the beads have separated from the supernatant, remove the supernatant with a pipette.
3.4 Keeping the centrifuge tube on the magnetic rack, add 200 μl of 75% ethanol to wash the beads, let stand at room temperature for 30 s, and remove the supernatant with a pipette.
3.5 Repeat step 3.4 once.
3.6 Air-dry the beads at room temperature; once no ethanol remains on the bead surface, add 100 μl of H2O and mix gently by pipetting.
3.7 Add 60 μl of beads, mix gently by pipetting, and let stand at room temperature for 5 min.
3.8 Place the centrifuge tube on the magnetic rack for 2 min; once the beads have separated from the supernatant, transfer 155 μl of the supernatant to a new centrifuge tube with a pipette.
3.9 Add 20 μl of beads to the supernatant, mix gently by pipetting, and let stand at room temperature for 5 min.
3.10 Place the centrifuge tube on the magnetic rack for 2 min; once the beads have separated from the supernatant, remove the supernatant with a pipette.
3.11 Keeping the centrifuge tube on the magnetic rack, add 200 μl of 75% ethanol to wash the beads, let stand at room temperature for 30 s, and remove the supernatant with a pipette.
3.12 Repeat step 3.11 once.
3.13 Air-dry the beads at room temperature; once no ethanol remains on the bead surface, add 19 μl of H2O and mix gently by pipetting.
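The 60 μl and 20 μl bead additions in the purification above resemble a two-sided size selection, in which the first addition removes large fragments and the second recovers the desired fragments from the supernatant. That reading is an interpretation, not something the text states. Under it, the effective bead-to-sample ratios can be estimated as follows, assuming (hypothetically) that the bead buffer from the first addition stays in the transferred supernatant in proportion to the volume transferred.

```python
def first_round_ratio(sample_ul, first_beads_ul):
    """Bead-to-sample volume ratio after the first bead addition."""
    return first_beads_ul / sample_ul


def second_round_ratio(sample_ul, first_beads_ul, transferred_ul, second_beads_ul):
    """Cumulative bead-to-sample ratio after beads are added to the supernatant.

    Only part of the first-round mixture is transferred, so both the sample
    and the first-round bead buffer are scaled by the transferred share.
    """
    carried_share = transferred_ul / (sample_ul + first_beads_ul)
    bead_buffer_ul = first_beads_ul * carried_share + second_beads_ul
    sample_equivalent_ul = sample_ul * carried_share
    return bead_buffer_ul / sample_equivalent_ul


# Volumes from the protocol: 100 μl eluate, 60 μl beads, 155 μl transferred, 20 μl beads.
print(round(first_round_ratio(100, 60), 3))            # 0.6
print(round(second_round_ratio(100, 60, 155, 20), 3))  # 0.806
```

Under these assumptions the two additions correspond to roughly 0.6x and 0.8x bead ratios.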
4. Library amplification
Library amplification was performed using KAPA HiFi HotStart ReadyMix (2X) from KAPA Biosystems.
PCR system
Name    Volume
2x KAPA HiFi HotStart ReadyMix    20 μl
DNA    19 μl
Primer (with barcode)    1 μl
Total    40 μl
Reaction conditions
Figure BDA0002341652380000191
5. Library quality inspection
Library bands were checked by 1% agarose gel electrophoresis.
6. Sequencing
The sequencing platform was the NextSeq 500.
(2) Library construction was performed using the KAPA 2G Fast Multiplex PCR Kit from KAPA.
1. Target region enrichment
The PCR reaction system was prepared according to the following table:
PCR system
Figure BDA0002341652380000192
Figure BDA0002341652380000201
Reaction conditions
Figure BDA0002341652380000202
2. Magnetic bead purification
The reaction product was purified using VAHTS DNA Clean Beads from Vazyme.
2.1 Equilibrate the magnetic beads to room temperature, then vortex to mix them thoroughly.
2.2 Add 42 μl of magnetic beads to the reaction product from step 1, mix by gentle pipetting, and let stand at room temperature for 5 min.
2.3 Place the centrifuge tube on a magnetic rack for 2 min; once the beads have separated from the supernatant, remove and discard the supernatant with a pipette.
2.4 Keeping the centrifuge tube on the magnetic rack, add 200 μl of 75% ethanol to wash the beads, let stand at room temperature for 30 s, and remove the supernatant with a pipette.
2.5 Repeat step 2.4 once.
2.6 Air-dry the beads at room temperature; once no ethanol remains on the bead surface, add 14 μl of H2O and mix by gentle pipetting.
3. Library amplification
Library amplification was performed using KAPA HiFi HotStart ReadyMix (2X) from KAPA Biosystems.
PCR system
Name    Volume
2x KAPA HiFi HotStart ReadyMix    15 μl
DNA    14 μl
Primer (with barcode)    1 μl
Total    30 μl
Reaction conditions
Figure BDA0002341652380000203
4. Library quality inspection
Library bands were checked by 1% agarose gel electrophoresis.
5. Sequencing
The sequencing platform was the NextSeq 500.
Example 2: determination of the genotype of the sites to be tested
HSE requires that the genotype at the probe design site be known. The sites to be tested were therefore genotyped first, and probes were then designed against heterozygous sites; a homozygous site cannot be used as a site to be tested. Because gene polymorphisms differ between individuals, the set of heterozygous sites differs from one individual to another.
TABLE 1
#CHROM POS A1(REF) A2(ALT) Sequencing results (depth)*
chr11 5243757 G C 0/1:1535,1765:3300
chr11 5244144 A G 0/1:1306,1681:3002
chr11 5244299 T A 0/1:1270,1224:2589
chr11 5245406 G C 0/1:1583,1757:3340
chr11 5245507 G A 0/1:1632,1750:3382
chr11 5246000 C T 0/1:1416,1510:2926
chr11 5246042 G C 0/1:1385,1498:2891
chr11 5246203 A C 0/1:1389,1464:2860
chr11 5246512 T G 0/1:1498,1555:3053
chr11 5247141 G A 0/1:1363,1593:2956
chr11 5247733 A C 0/1:1408,1541:2949
chr11 5247791 C G 0/1:1371,1441:2812
chr11 5248243 A G 0/1:1346,1581:2927
chr11 5248641 G A 0/1:1365,1526:2891
chr11 5248770 G A 0/1:1193,1255:2448
chr11 5248829 T A 0/1:931,893:1919
chr11 5248842 G A 0/1:982,912:1968
chr11 5248852 A G 0/1:1023,1132:2155
chr11 5249004 A G 0/1:1331,1305:2636
chr11 5250168 G A 0/1:1582,1427:3009
chr11 5252251 T C 0/1:1314,1269:2583
*: 0/0 represents homozygous, 0/1 represents heterozygous
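The rows of Table 1 can be checked programmatically. The following is a minimal sketch, not part of the original method: it assumes that each result field is laid out as genotype:REF depth,ALT depth:total depth (similar to the GT:AD:DP fields of a VCF file), and the names `SiteCall`, `parse_row`, `alt_fraction` and `is_heterozygous` are hypothetical helpers introduced only for illustration.

```python
from typing import NamedTuple


class SiteCall(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str
    genotype: str
    ref_depth: int
    alt_depth: int
    total_depth: int


def parse_row(row: str) -> SiteCall:
    """Parse one whitespace-separated row of Table 1.

    Assumes the last field is genotype:REF depth,ALT depth:total depth.
    """
    chrom, pos, ref, alt, call = row.split()
    genotype, allele_depths, total = call.split(":")
    ref_depth, alt_depth = (int(x) for x in allele_depths.split(","))
    return SiteCall(chrom, int(pos), ref, alt, genotype,
                    ref_depth, alt_depth, int(total))


def alt_fraction(site: SiteCall) -> float:
    """Fraction of REF+ALT reads that carry the ALT allele."""
    return site.alt_depth / (site.ref_depth + site.alt_depth)


def is_heterozygous(site: SiteCall) -> bool:
    """Per the table footnote, 0/1 marks a heterozygous site."""
    return site.genotype == "0/1"


# Two rows copied from Table 1.
rows = [
    "chr11 5243757 G C 0/1:1535,1765:3300",
    "chr11 5248829 T A 0/1:931,893:1919",
]
sites = [parse_row(r) for r in rows]

for s in sites:
    print(s.chrom, s.pos, s.genotype, round(alt_fraction(s), 3))
```

Before capture, a heterozygous site is expected to show an ALT fraction close to 0.5, which gives a baseline for judging the later enrichment.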
Example 3: probe design
The site chr11:4657966 (G/C heterozygous) was selected as the candidate site for probe design; the other sites served as candidate sites for verifying the separation effect. A blocking probe was designed in the same way, as shown in the table below. The G allele is captured by the probe and the C allele is targeted by the blocking probe. The 3' end of the blocking probe carries a dideoxy modification.
Figure BDA0002341652380000221
Example 4: target region enrichment
The target region was amplified using one primer pair; the primer information is shown in the following table:
Figure BDA0002341652380000222
Example 5: nucleic acid hybridization, extension and capture
Two experimental groups were set up. In group 1, the probes were hybridized to the target region enrichment product and extended. In group 2, the probes were hybridized to genomic DNA and extended. The Dynabeads kilobaseBINDER Kit was used for capture.
Example 6: amplification enrichment of captured products
For group 1, the target region was amplified using one primer pair; the primer information is shown in the following table:
Figure BDA0002341652380000223
For group 2, 5 PCR primer pairs were used to build a library from the captured product by multiplex PCR; the primer information is shown in the following table:
Figure BDA0002341652380000231
Example 7: library preparation
Group 1 library construction was performed using the VAHTS Universal Plus DNA Library Prep Kit for Illumina from Vazyme. Group 2 library construction was performed using the KAPA 2G Fast Multiplex PCR Kit from KAPA. The primers used are as described in Example 6.
Example 8: haplotype isolation effect
In this scheme, all heterozygous SNP sites within 10 kb downstream of the heterozygous site targeted by the probe are used to judge whether the experiment succeeded. Normally, the two alleles of a heterozygous site each account for about 50% of the sequencing data. When one allele of a heterozygous site accounts for more than 61% of the captured data, separation is considered successful. The positions of the sites relative to the hybridization point are shown in Table 2, the sequencing results are shown in Table 3, and the haplotype isolation results are shown in Table 4.
TABLE 2
Figure BDA0002341652380000232
Figure BDA0002341652380000241
Table 3: sequencing results
Figure BDA0002341652380000242
Figure BDA0002341652380000251
Table 4: haplotype isolation results
Figure BDA0002341652380000252
The above data show the following.
In group 1, the sequencing depth differed little across all 21 heterozygous SNP sites, indicating good uniformity. The capture ratio of one genotype exceeded 61% at every heterozygous SNP, and a capture ratio of 76.5% was still maintained at the farthest site, 8.494 kb from the hybridization point.
In group 2, the sequencing depth differed considerably across the 5 SNP sites, probably because of differences in uniformity among the multiplex PCR primers and random fragmentation of the template. The capture ratio of one genotype decreased with increasing distance from the hybridization point: at 8.494 kb it fell to 57.5%, so separation failed, a large drop from the 89.1% observed at the hybridization point. In addition, group 2 builds the library by multiplex PCR, which requires designing multiple PCR primers and optimizing the experimental conditions, so it is costly, time-consuming and laborious.
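The success criterion of Example 8 can be applied to the capture ratios quoted above. This is a minimal sketch under stated assumptions: the 61% threshold and the three ratios come from the text, the 89.1% value is read here as the group 2 ratio at the hybridization point, and the names `capture_ratio`, `separation_succeeded` and `reported` are hypothetical.

```python
SUCCESS_THRESHOLD = 0.61  # separation succeeds when one allele exceeds 61%


def capture_ratio(captured_reads: int, other_reads: int) -> float:
    """Share of reads at a heterozygous site that carry the captured allele."""
    total = captured_reads + other_reads
    if total == 0:
        raise ValueError("no reads at this site")
    return captured_reads / total


def separation_succeeded(ratio: float,
                         threshold: float = SUCCESS_THRESHOLD) -> bool:
    """Apply the criterion from Example 8: ratio strictly above the threshold."""
    return ratio > threshold


# Capture ratios quoted in the text, keyed by (group, distance in kb).
# Distance 0.0 stands for the hybridization point (an assumption of this sketch).
reported = {
    ("group 1", 8.494): 0.765,
    ("group 2", 8.494): 0.575,
    ("group 2", 0.0): 0.891,
}

results = {key: separation_succeeded(ratio) for key, ratio in reported.items()}

for (group, distance_kb), ok in results.items():
    status = "separated" if ok else "failed"
    print(f"{group} at {distance_kb} kb: {status}")
```

With per-site read counts from Table 3, `capture_ratio` could be used to recompute each ratio before applying the threshold.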
Figure BDA0002341652380000271
Figure BDA0002341652380000281
Figure BDA0002341652380000291
Figure BDA0002341652380000301
Figure BDA0002341652380000311
Figure BDA0002341652380000321
Figure BDA0002341652380000331
Figure BDA0002341652380000341
Sequence listing
<110> Shanghai Wehn biomedical science and technology, Inc
<120> allele nucleic acid enrichment and detection method
<130>WSH003
<160>16
<170>SIPOSequenceListing 1.0
<210>1
<211>22
<212>DNA
<213>Artificial Sequence
<400>1
gggttgataa ttaaataaca gt 22
<210>2
<211>21
<212>DNA
<213>Artificial Sequence
<400>2
gggttgataa ttaaataaca c 21
<210>3
<211>25
<212>DNA
<213>Artificial Sequence
<400>3
tcggttactg aggcttgtgt atttg 25
<210>4
<211>27
<212>DNA
<213>Artificial Sequence
<400>4
attaataata ccatcgcaca gtgtttc 27
<210>5
<211>22
<212>DNA
<213>Artificial Sequence
<400>5
ctggtgtgcc atttgttaag ct 22
<210>6
<211>25
<212>DNA
<213>Artificial Sequence
<400>6
aggaaccact ggtttacata tcaga 25
<210>7
<211>31
<212>DNA
<213>Artificial Sequence
<400>7
tgctaatctt aaacatcctg aggaagaatg g 31
<210>8
<211>31
<212>DNA
<213>Artificial Sequence
<400>8
gcttagtatg aaaagttagg actgagaaga a 31
<210>9
<211>28
<212>DNA
<213>Artificial Sequence
<400>9
tctcccttga tagtttctac tttgggtt 28
<210>10
<211>31
<212>DNA
<213>Artificial Sequence
<400>10
caatcgatta cacaattagg tgtgaaggta a 31
<210>11
<211>29
<212>DNA
<213>Artificial Sequence
<400>11
tacgtaatat ttggaatcac agcttggta 29
<210>12
<211>31
<212>DNA
<213>Artificial Sequence
<400>12
ccatctacat atcccaaagc tgaattatgg t 31
<210>13
<211>28
<212>DNA
<213>Artificial Sequence
<400>13
gatttcacat gttgaatcct gggaaaag 28
<210>14
<211>31
<212>DNA
<213>Artificial Sequence
<400>14
gtctaaatct aaaacaatgc taatgcaggt t 31
<210>15
<211>29
<212>DNA
<213>Artificial Sequence
<400>15
catcactcag tgatgaactt aatacccaa 29
<210>16
<211>27
<212>DNA
<213>Artificial Sequence
<400>16
gagtgtggga gctaaatgat gatacac 27
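The declared lengths in the sequence listing can be cross-checked against the sequences themselves. The sketch below is illustrative only: it copies three entries from the listing, and the helpers `parse_entry` and `common_prefix_length` are hypothetical names introduced here. It also shows that SEQ ID NO: 1 and SEQ ID NO: 2 share their first 20 nucleotides and differ only at the 3' end.

```python
def parse_entry(line: str) -> tuple[str, int]:
    """Split a listing line into (sequence, declared length).

    The line holds space-separated 10-nt blocks followed by the length.
    """
    *blocks, declared = line.split()
    return "".join(blocks), int(declared)


def common_prefix_length(a: str, b: str) -> int:
    """Number of leading positions at which two sequences agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Entries copied from the sequence listing, keyed by SEQ ID NO.
listing = {
    1: "gggttgataa ttaaataaca gt 22",
    2: "gggttgataa ttaaataaca c 21",
    3: "tcggttactg aggcttgtgt atttg 25",
}

sequences = {}
for seq_id, line in listing.items():
    seq, declared = parse_entry(line)
    if len(seq) != declared:
        raise ValueError(f"SEQ ID NO: {seq_id} length mismatch")
    sequences[seq_id] = seq

shared = common_prefix_length(sequences[1], sequences[2])
print(f"SEQ ID NO: 1 and 2 share a {shared}-nt 5' prefix")
```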

Claims (10)

1. A method for enriching a nucleic acid extension product, comprising:
a. enriching nucleic acids comprising one or more sets of allele pairs in a sample, the enriched product being at least 5kb, preferably at least 8kb, more preferably at least 10kb, and the enriched product comprising at least one set of allele pairs;
b. contacting probes with the enriched product under conditions in which the probes hybridize to the nucleic acid, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs, and extension of the probe that targets one of the alleles is inhibited;
c. extending the probe; and
d. enriching the extension product,
preferably, the enrichment in step a is primer directed amplification, such as PCR amplification.
2. A method according to claim 1, wherein the probe targeting one allele of the pair of alleles comprises a modification which inhibits extension of the probe, preferably the modification is a dideoxy modification of a base, preferably the dideoxy modification of a base is located at the 3' end of the probe.
3. The method of claim 1, wherein the nucleic acid comprises a plurality of sets of allele pairs, and in each of the plurality of sets of allele pairs, extension of the probe targeting one allele is inhibited.
4. The method of claim 1, wherein the method has one or more characteristics selected from the group consisting of:
the pair of alleles is a pair of SNPs,
the binding sites of the probes and the amplification products are close to either end of the amplification products; preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product,
the primers are designed such that the binding sites of the probes to the amplification products are near either end of the amplification products.
5. A haplotype isolation method comprising:
a. enriching nucleic acids comprising one or more sets of allele pairs in a sample, the enriched product being at least 5kb, preferably at least 8kb, more preferably at least 10kb, and the enriched product comprising at least one set of allele pairs;
b. contacting probes with the enriched product under conditions in which the probes hybridize to the nucleic acid, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs, and extension of the probe that targets one of the alleles is inhibited;
c. extending the probe; and
d. enriching the extension product,
preferably, the enrichment in step a is primer directed amplification, such as PCR amplification.
6. A method according to claim 5 wherein the probe targeting one of the alleles comprises a modification which inhibits extension of the probe, preferably the modification is a dideoxy modification of a base, preferably the dideoxy modification of a base is at the 3' end of the probe.
7. The method of claim 5, wherein the nucleic acid comprises a plurality of sets of allele pairs, and in each of the plurality of sets of allele pairs, extension of the probe targeting one allele is inhibited.
8. The method of claim 5, wherein the method has one or more characteristics selected from the group consisting of:
the pair of alleles is a pair of SNPs,
the binding sites of the probes and the amplification products are close to either end of the amplification products; preferably, the binding site of the probe to the amplification product is less than 30% to 5% of the length of the amplification product from either end of the amplification product,
the primers are designed such that the binding sites of the probes to the amplification products are near either end of the amplification products.
9. A haplotype detection method comprising:
a. amplifying nucleic acids comprising an allele pair in a sample, the amplified product being at least 5kb, preferably at least 8kb, more preferably at least 10 kb; and the amplified product comprises at least one set of allele pairs;
b. contacting probes with the amplified product under conditions in which the probes hybridize to the nucleic acid, the probes comprising at least two probes that target each of the two alleles of either of the at least one set of allele pairs, and extension of the probe that targets one of the alleles is inhibited;
c. extending the probe; and
d. amplifying and sequencing the extension product of step c, thereby determining the haplotype of the sample,
preferably, the enrichment in step a is primer directed amplification, such as PCR amplification,
preferably, the haplotype of the sample is determined by comparing data changes for the same allele pair; more preferably, determining the haplotype of the sample comprises comparing the sequencing results of the allele pair before the hybridization of step b with those after the amplification of step d; more preferably, the change is a change in the sequencing depth of the allele pair.
10. A kit for haplotype isolation comprising:
nucleic acid probes comprising at least two probes targeting two alleles of an allele pair, respectively, and a probe targeting one of the alleles comprising a modification that inhibits extension of the probe,
a nucleic acid polymerase, and
NTPs and/or dNTPs with enriched labels,
preferably, the allele pair is a SNP pair,
the binding site of the probe to its conjugate is near either end of the conjugate; preferably, the binding site of the probe to the conjugate is less than 30% to 5% of the length of the conjugate from either end of the conjugate.
CN201911378483.8A 2019-12-27 2019-12-27 Allele nucleic acid enrichment and detection method Pending CN110938681A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911378483.8A CN110938681A (en) 2019-12-27 2019-12-27 Allele nucleic acid enrichment and detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911378483.8A CN110938681A (en) 2019-12-27 2019-12-27 Allele nucleic acid enrichment and detection method

Publications (1)

Publication Number Publication Date
CN110938681A true CN110938681A (en) 2020-03-31

Family

ID=69913798

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911378483.8A Pending CN110938681A (en) 2019-12-27 2019-12-27 Allele nucleic acid enrichment and detection method

Country Status (1)

Country Link
CN (1) CN110938681A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information
Inventors after: Yang Jingmin, Xu Hui, Lu Daru, Wang Lianghui, Tang Jiajie, Gao Pengfei
Inventors before: Yang Jingmin, Xu Hui, Lu Daru, Wang Lianghui, Tang Jiajie