CN113512595A - Biomarker for tracking and detecting DNA sample, method and application - Google Patents


Info

Publication number
CN113512595A
Authority
CN
China
Prior art keywords
sample
dna
biomarker
pcr amplification
snp
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110653767.4A
Other languages
Chinese (zh)
Other versions
CN113512595B (en)
Inventor
李妍珂
刘永初
吕佩涛
刘阳
李阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aegicare Shenzhen Technology Co ltd
Original Assignee
Aegicare Shenzhen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aegicare Shenzhen Technology Co ltd
Priority to CN202110653767.4A
Publication of CN113512595A
Application granted
Publication of CN113512595B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/6858 Allele-specific amplification
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/16 Primer sets for multiplex assays
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A biomarker, method, and application for DNA sample tracking detection are provided. Specific multiplex PCR primer sequences are designed for 26 screened SNP loci, so that a sample suspected of contamination can be tracked and detected; the method is particularly suitable for verifying mix-ups in sample loading order. Tracking and detecting a suspected contaminated DNA sample with this method greatly shortens the time needed to troubleshoot an experiment, allows process monitoring of clinical samples, makes full use of the NGS sequencing platform without additional equipment, and meets the strict turnaround requirements of clinical samples.

Description

Biomarker for tracking and detecting DNA sample, method and application
Technical Field
The invention relates to the field of gene detection and tracking, in particular to a biomarker for tracking and detecting a DNA sample, a method and application.
Background
High-throughput sequencing (next-generation sequencing, NGS) is increasingly used in clinical diagnostics, including early cancer screening, companion diagnostics for cancer, auxiliary diagnosis of genetic diseases, noninvasive prenatal testing, and detection of pathogenic microorganisms, and its detection capability continues to grow. While the technology has advanced rapidly, internal laboratory quality control has lagged behind. An NGS workflow involves nucleic acid extraction and library construction, and some workflows also require complex probe hybridization capture, so more than twenty steps may precede the final on-instrument sequencing. It is difficult to guarantee that manual handling introduces no errors along the way, and samples can therefore be mixed up with one another. How to establish a sample tracking system that prevents mix-ups and effectively ensures accurate, reliable results is one of the most urgent problems for large-scale clinical use of NGS.
When no dedicated sample tracking system has been built into the NGS workflow, the most common way to check for sample mix-ups is to infer the sex of each sample during analysis of human whole genome sequencing (WGS) or whole exome sequencing (WES) data. The advantage is that no extra experiment is needed to determine whether the analysis result agrees with the information submitted with the sample. However, sex analysis is too coarse: it only distinguishes samples of different sexes, and it cannot reveal whether samples of the same sex have been mixed up.
The Promega company in the United states develops an STR detection kit based on capillary electrophoresis sequencing to accurately identify individual DNA. STR refers to Short Tandem Repeat (Short Tandem Repeat) which is a kind of DNA sequence formed by repeating and connecting 2-6 bases as core unit, wherein, on average, one STR exists in every 15kb of human genome, and more than 7000 STRs are found at present. Because the STR core unit has high specificity among individuals with different repetition times, the STR core unit has genetic diversity and can identify the genetic relationship and the identity of individual DNA. The technology is widely applied to the fields of criminal investigation and forensic science as a gold standard for DNA identification, and can judge sample confusion which may occur in the experimental process according to the map analysis result. However, this method has its limitations, and although the number of repeats of the core unit can be accurately discriminated, the specific structure of the repeated bases cannot be known, and the sites having the same number of repeats and only the difference in internal bases cannot be accurately judged; meanwhile, the mapping analysis is influenced by various factors, so that the result is wrong.
Roche Diagnostics has included 340 SNP sites in a recently released whole-exome sequencing product, the SeqCap EZ Primer monomer extreme panel. These SNP sites form a sample identification ID derived from the DNA itself, so samples can be tracked and mix-ups between samples monitored throughout the detection process. Because this panel requires no manually added marker, the risk of operator error is avoided. However, because different whole-exome sequencing panels cover different regions, the 340 SNPs are not necessarily covered by the probes of other panels (a panel here means a group or set of detection targets), and which detection method to use for comparing and verifying the 340 sites remains an open question. In addition, using so many SNP sites increases the amount of data, the complexity of data processing, and the detection cost.
Disclosure of Invention
The main purpose of the invention is to provide a method for tracking and detecting a DNA sample and an application thereof. By screening out a nucleic acid containing 26 SNP loci as a biomarker for DNA sample tracking detection, the DNA of a biological sample can be tracked effectively, a sample suspected of having been mixed up can be verified, and whether the sample DNA is contaminated can be determined.
In order to achieve the purpose, the following technical scheme is adopted in the application:
According to a first aspect, the present application provides a biomarker for DNA sample tracking detection. The biomarker is a nucleic acid containing 26 SNP sites, specifically rs1344, rs3917981, rs857870, rs4478844, rs4621, rs1058900, rs1130598, rs4703, rs8481, rs4673, rs1132812, rs9930567, rs1045280, rs1135989, rs17626, rs354021, rs2304186, rs6554, rs2230267, rs11554159, rs5030878, rs3822585, rs9483504, rs11998387, rs11136343, and rs1071583.
The term "SNP" is the abbreviation of single nucleotide polymorphism. The theoretical basis is that, apart from monozygotic twins, every person has a different set of single nucleotide polymorphisms, so SNP analysis can serve as a specific marker of each individual. If the SNP analysis result differs from the SNP status expected for the sample under test, the sample is judged to have possibly been contaminated, or cross-contaminated by the reaction systems of other samples, at some point in the testing process. In the present application, 26 specific SNP sites are screened out so that DNA samples suspected of having been mixed up can be verified to determine whether sample contamination exists.
The more SNP sites are detected, the higher the accuracy, but also the larger the amount of data. A large amount of data raises the detection cost and can interfere with the data of the primary assay. Balancing detection accuracy against the amount of SNP data per sample, the present application therefore prefers 26 SNP sites, selected by genetic distance analysis from 443 family samples for which exon panel results were available.
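The trade-off between the number of sites and discriminating power can be illustrated with a rough calculation. The sketch below is not part of the application: it assumes biallelic SNPs in Hardy-Weinberg equilibrium, statistically independent loci, and unrelated individuals, and the function names are illustrative only.

```python
def genotype_match_probability(p: float) -> float:
    """Chance that two unrelated individuals share a genotype at one biallelic SNP.

    Assumes Hardy-Weinberg proportions for an allele frequency p.
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)  # AA, AB, BB
    return sum(f * f for f in genotype_freqs)


def panel_match_probability(allele_freqs) -> float:
    """Chance of an identical profile across all loci (assumes independent loci)."""
    prob = 1.0
    for p in allele_freqs:
        prob *= genotype_match_probability(p)
    return prob


# 26 sites, each with an allele frequency of 0.5 (the middle of the 0.4-0.6 range).
random_match_26 = panel_match_probability([0.5] * 26)
print(f"per-site match probability: {genotype_match_probability(0.5):.3f}")
print(f"26-site match probability:  {random_match_26:.2e}")
```

Under these assumptions the per-site match probability is lowest at an allele frequency of 0.5, which is consistent with preferring sites whose population frequency is close to 0.5. Related individuals, such as the family samples discussed here, share more genotypes than this estimate suggests.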
The application is aimed at verifying samples that other methods have flagged as possibly mixed up (contaminated). By detecting the 26 SNP loci contained in the biomarker, the DNA of a biological sample can be tracked effectively and contamination of the sample DNA can be verified. Compared with the existing approach of sample tracking and contamination assessment using 340 SNP sites, the biomarker requires detection of only 26 SNP sites, which is easier to operate and more cost-effective. Moreover, the biomarker can quickly identify sample contamination in WES and WGS experiments without repeating the experiment, and the result is accurate. Using the biomarker for sample tracking and contamination detection greatly shortens the time needed to troubleshoot an experiment, allows process monitoring of clinical samples, makes full use of the NGS sequencing platform without additional equipment, and meets the strict turnaround requirements of clinical samples. The key feature of the biomarker is that it contains the 26 designated SNP sites, through which sample tracking and contamination detection can be achieved accurately and effectively; the biomarker may also contain other SNP sites or other sequences to meet other functional requirements, which are not specifically limited here.
In the present application, the 26 SNP sites are screened as follows: SNP loci with a population frequency of 0.4-0.6 that lie in regions deeply covered by the exon panel are selected; genetic distance analysis is then performed on these loci in 443 family samples with available exon panel results; and the 26 SNP loci that meet the requirements are finally selected.
Here, "population frequency" can be understood as the proportion of individuals in a particular population (e.g., the Chinese population) that carry a different base at a site. For example, a SNP site with a population frequency of 0.5 means that 50% of individuals have A and 50% have C/T/G. In the present application, a suitable SNP site should first have a population frequency of about 0.5, such as 0.4-0.6. The second requirement is that the site lies in a region that the exon panel covers deeply, so that insufficient depth at the site does not cause errors in the exon panel result. Genetic distance analysis is then performed on the selected sites in the 443 family samples with available exon panel results; genetic distance analysis is a common technique in the field and is not described further here.
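The first two screening criteria can be sketched as a simple filter. This is an illustration only: the candidate records, the rsID placeholders, and the depth threshold of 100 are hypothetical, since the application does not state a numeric depth cutoff, and the genetic distance analysis step is not shown.

```python
from dataclasses import dataclass


@dataclass
class CandidateSnp:
    rsid: str                    # placeholder identifier, not a real rsID
    population_frequency: float  # frequency in the reference population
    mean_depth: float            # mean read depth at the site on the exon panel


def passes_prescreen(snp, freq_range=(0.4, 0.6), min_depth=100.0):
    """Keep sites with a frequency of 0.4-0.6 that the panel covers deeply.

    min_depth is an assumed threshold chosen for this example.
    """
    low, high = freq_range
    return low <= snp.population_frequency <= high and snp.mean_depth >= min_depth


candidates = [
    CandidateSnp("rs_example_1", 0.48, 220.0),  # passes both criteria
    CandidateSnp("rs_example_2", 0.12, 300.0),  # frequency too far from 0.5
    CandidateSnp("rs_example_3", 0.55, 35.0),   # insufficient depth
]

prescreened = [snp.rsid for snp in candidates if passes_prescreen(snp)]
print(prescreened)
```

Sites that pass this filter would then go on to the genetic distance analysis in the family samples.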
The key point of the present application is the detection of the 26 designated SNP sites in the biomarker. To implement sample tracking and contamination detection simply and effectively, specific primers can be designed for multiplex PCR amplification and high-throughput sequencing of the 26 SNP sites. The specific primer sequences can be determined according to existing primer design principles and are not particularly limited here.
According to a second aspect, the present application provides a kit for DNA sample tracking detection, the kit comprising PCR amplification primers for detecting the 26 SNP sites of the biomarker according to the first aspect. Preferably, the PCR amplification primers have the sequences shown in SEQ ID No.1 to SEQ ID No.52.
The sequences of SEQ ID No.1 to SEQ ID No.52 are shown in the following table:
[Table: primer sequences SEQ ID No.1 to SEQ ID No.52 (images not reproduced)]
The key point of the kit is that the 26 SNP sites of the biomarker are detected by PCR amplification followed by high-throughput sequencing; the specific PCR amplification primer sequences can be designed according to existing primer design principles. In a kit containing 26 pairs of multiplex PCR amplification primers, each primer can be packaged separately and added selectively as needed, each primer pair can be premixed as a unit, or all 52 primers of the 26 pairs can be premixed and added as a single primer mix at the time of use. The other reagents used with the kit, such as PCR enzyme premix, enzyme-free water, end repair reaction buffer, library adapters, ligation reaction solution, high-fidelity PCR enzyme, tag primers, and purification magnetic beads (magnetic beads for short), are commercially available.
In one implementation of the present application, multiplex PCR primers are designed for the 26 SNP sites. They can be applied to sample mix-up verification for whole genome (WGS) or whole exome (WES) testing without adversely affecting the WGS or WES assay itself; that is, analyzing the 26 SNP sites makes it possible to detect any mixed-up sample.
The biomarker of the present application is used to track and detect DNA samples suspected of contamination, and the whole process can be carried out within exome sequencing or genome sequencing: the 26 SNP sites of the biomarker are analyzed at the same time as the exome or genome is sequenced, thereby achieving sample tracking and contamination detection. An existing exome or genome sequencing panel can also be used directly, for example, in one implementation of the present application, the whole-exome panel from IDT.
Further, the kit also comprises PCR enzyme premix, enzyme-free water, end repair reaction buffer, library adapters, ligation reaction solution, high-fidelity PCR enzyme, tag primers, and purification magnetic beads. The enzyme-free water is water free of DNase and RNase and can be used for elution; the PCR enzyme premix used is the Taq enzyme premix manufactured by Takara. Reagents not specifically mentioned may be chosen from the components and adapters conventionally used in the art. The library can be constructed by conventional methods, for example comprising multiplex PCR primer mixing, multiplex PCR amplification, magnetic bead purification, end repair and A-tailing, adapter ligation, magnetic bead purification, library PCR amplification, magnetic bead purification, and library quality control.
In one embodiment, after the PCR amplification primers (primers for short) are designed and synthesized for the 26 SNP sites, each primer is diluted to a stock solution of 100 μM, and the primers are mixed in equal volumes so that each primer is at 1 μM in the working solution. One round of multiplex PCR amplification is performed with Takara Taq enzyme premix (Premix Taq) to obtain PCR products of 190-320 bp. The PCR products are purified to remove primer dimers and the Taq enzyme premix. The purified PCR products then undergo end repair and adapter ligation to add universal adapters, and the ligation products are purified with magnetic beads to remove adapter dimers and the ligation reaction solution. The purified ligation products are amplified with a high-fidelity PCR enzyme and tag primers that distinguish the samples, giving a library that meets the concentration required for loading onto the sequencer. Library fragment size is checked on an Agilent 2100; fragments concentrated between 270 and 400 bp are suitable for downstream sequencing.
The key point of the present application is the screening of the 26 SNP sites and the design of the 26 pairs of multiplex PCR primers. The subsequent library construction and sequencing steps follow existing library construction and high-throughput sequencing procedures, and the SNP analysis of the sequencing results can be performed on an NGS sequencing platform; these are not described further here.
At present, no high-throughput detection product on the market is both simple to operate and cost-effective for accurately determining whether a sample is contaminated. The application therefore discloses a method for tracking samples using SNPs intrinsic to the DNA. The method is suitable for tracking whole genome or whole exome sequencing assays, and is particularly suitable for whole exome sequencing with the xGen Exome Research Panel from Integrated DNA Technologies (IDT). With this method, the variant status at the SNP sites of the sample under test can be compared quickly with the variant status that should actually be present at those sites, so as to determine whether sample contamination (or a mix-up) has occurred.
Therefore, according to a third aspect, the present application also provides a method for tracking and detecting a DNA sample, comprising: performing PCR amplification on the DNA of the sample under test using the PCR amplification primers in the kit of the second aspect; performing high-throughput sequencing on the PCR amplification products; and analyzing the variant status at the 26 SNP sites of the biomarker from the sequencing results, thereby determining whether the sample DNA is contaminated. Specifically, the variant status detected at the SNP sites is compared with the variant status known to be present at those sites in the DNA sample, and contamination is judged from the comparison.
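The comparison step can be sketched as a site-by-site concordance check. The sketch below is an assumption about how such a comparison might be coded, not the procedure of the application: genotypes are represented as two-letter strings, the genotype values are invented for illustration, and no decision threshold is applied because none is given here.

```python
def compare_genotypes(observed: dict, expected: dict):
    """Compare genotypes at the sites present in both profiles.

    Returns (concordance, discordant_sites). Allele order is ignored,
    so "AG" and "GA" count as the same genotype.
    """
    shared = sorted(set(observed) & set(expected))
    if not shared:
        raise ValueError("no SNP sites in common")
    discordant = [
        site for site in shared
        if sorted(observed[site]) != sorted(expected[site])
    ]
    concordance = 1.0 - len(discordant) / len(shared)
    return concordance, discordant


# Invented genotypes at three of the 26 sites, for illustration only.
expected_profile = {"rs1344": "AG", "rs4621": "CC", "rs8481": "TT"}
observed_profile = {"rs1344": "GA", "rs4621": "CT", "rs8481": "TT"}

concordance, discordant_sites = compare_genotypes(observed_profile, expected_profile)
print(f"concordance: {concordance:.2f}, discordant sites: {discordant_sites}")
```

In practice the full set of 26 sites would be compared, and how many discordant sites indicate contamination or a mix-up would have to be set by the laboratory.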
In one embodiment, whether samples have been mixed up is judged by verifying whether the actual loading order of the samples matches the preset loading order. For example, for samples suspected of having been mixed up, the DNA fragments are subjected to multiplex PCR amplification, libraries are constructed according to the preset loading order, and high-throughput sequencing is performed. During verification, the PCR amplification primers designed for the 26 SNP sites are added to each sample as the libraries are constructed in the preset loading order, so that PCR amplification and library construction proceed together with the sample; the pooled library is sequenced and the single nucleotide polymorphisms of each sample are analyzed; and the analysis result is used to judge whether the sample is contaminated by samples from other individuals. The DNA fragments either contain the 26 SNP sites initially or contain them after amplification.
If the initial sequencing result differs from the result expected from the preset loading order, the samples are judged to have been mixed up, that is, the actual loading order did not match the preset loading order, most likely because DNA samples were added in the wrong order. To verify this conjecture, the samples are subjected to PCR amplification, library construction, and sequencing again according to the originally preset loading order. When the verification sequencing result matches the result expected from the preset loading order, this confirms that the samples were mixed up in the initial run and that the preset loading order is the correct one.
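One way to automate the comparison against the preset loading order is to match the SNP profile observed in each well to a set of reference profiles. This is a hypothetical sketch: the well names, sample names, genotypes, and the best-match rule are all assumptions made for illustration and are not described in the application.

```python
def count_matches(profile: dict, reference: dict) -> int:
    """Number of sites where the two profiles carry the same genotype."""
    return sum(
        1 for site, genotype in profile.items()
        if sorted(reference.get(site, "")) == sorted(genotype)
    )


def check_loading_order(observed_by_well, references, preset_order):
    """Return {well: best-matching sample} for wells that disagree with the preset order."""
    mismatched = {}
    for well, profile in observed_by_well.items():
        best = max(references, key=lambda name: count_matches(profile, references[name]))
        if best != preset_order[well]:
            mismatched[well] = best
    return mismatched


# Invented reference genotypes for two samples at three sites.
references = {
    "sample_1": {"rs1344": "AA", "rs4621": "CT", "rs8481": "GG"},
    "sample_2": {"rs1344": "AG", "rs4621": "CC", "rs8481": "GT"},
}
preset_order = {"A1": "sample_1", "A2": "sample_2"}
# Simulate a swap: each well contains the other sample's DNA.
observed_by_well = {
    "A1": dict(references["sample_2"]),
    "A2": dict(references["sample_1"]),
}

swapped_wells = check_loading_order(observed_by_well, references, preset_order)
print(swapped_wells)
```

An empty result would indicate that every well matches the preset order; a non-empty result lists the wells that appear to contain a different sample.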
According to a fourth aspect, the present application also provides the use of the biomarker of the first aspect, or the kit of the second aspect, for tracking detection of a DNA sample suspected of contamination in exome sequencing or genome sequencing.
Due to the adoption of the technical scheme, the beneficial effects of the application are as follows:
The application achieves sample tracking and contamination detection by detecting only 26 SNP sites. It is particularly suitable for verifying mix-ups in sample loading order, reduces the consumption of sample and reagents, and is easy to operate and cost-effective.
Moreover, the biomarker can quickly identify sample contamination in WES and WGS experiments without repeating the experiment, and the result is accurate. Tracking and verifying a sample suspected of having been mixed up with the DNA sample tracking detection method greatly shortens the time needed to troubleshoot an experiment, allows process monitoring of clinical samples, makes full use of the NGS sequencing platform without additional equipment, and meets the strict turnaround requirements of clinical samples.
Drawings
FIG. 1 is a library peak chart in quality control of the library in example 1.
Detailed Description
The present invention will be described in further detail with reference to the following detailed description and accompanying drawings.
In the following description, numerous details are set forth to provide a better understanding of the present application. However, those skilled in the art will readily recognize that some features may be omitted, or replaced with other elements, materials, or methods, in different cases. In some cases, certain operations related to the present application are not shown or described in detail, to avoid obscuring the core of the application with excessive description; a detailed description of these operations is unnecessary for those skilled in the art, who can fully understand them from the specification and general knowledge in the art.
Furthermore, the features, operations, or characteristics described in the specification may be combined in any suitable manner to form various embodiments. Likewise, the steps or actions in the method descriptions may be reordered or rearranged in ways apparent to one of ordinary skill in the art. The sequences in the specification and drawings therefore serve only to describe particular embodiments and do not imply a required order unless it is stated that a given order must be followed.
The prior art lacks a good method for verifying DNA contamination. Because different whole-exome sequencing panels cover different regions, selecting too many SNP sites means that some cannot be covered by panels of other brands, which limits the scope of application and reduces generality; with too many SNP sites, the accuracy of the sites is also difficult to verify. To address these shortcomings, the present application provides a method for tracking DNA sample contamination that accurately determines whether a sample is contaminated (or mixed up). The 26 SNP sites on which the tracking method is based are covered by the panels of most brands and can be used with human whole genome sequencing or whole exome sequencing. Specifically, the SNP sites are the following 26: rs1344, rs3917981, rs857870, rs4478844, rs4621, rs1058900, rs1130598, rs4703, rs8481, rs4673, rs1132812, rs9930567, rs1045280, rs1135989, rs17626, rs354021, rs2304186, rs6554, rs2230267, rs11554159, rs5030878, rs3822585, rs9483504, rs11998387, rs11136343, and rs1071583. The 26 pairs (52 primers) of PCR primers designed for the 26 SNP sites are shown in the following table:
[Table: the 26 primer pairs (52 primers) for the 26 SNP sites (images not reproduced)]
The present application is described in further detail below with reference to specific embodiments and the attached drawings. The following examples are intended to be illustrative of the present application only and should not be construed as limiting the present application.
Example 1
A total of 16 samples, from 4 families and from unrelated individuals, were tested with the IDT whole-exome panel: A1-A3 are family 1, A4-A6 are family 2, A7-A8 are family 3, B6-B8 are family 4, and the remaining samples are independent probands. The preset loading order is shown in the following figure:
[Figure: preset sample loading layout (image not reproduced)]
The actual loading order is shown in the following figure:
[Figure: actual sample loading layout (image not reproduced)]
The test results showed that A1, A2, and A3 did not fit the expected family relationship; A4, A5, and A6 did not fit; A7, A8, and B1 did not fit; and B6, B7, and B8 did not fit. The batch was therefore suspected of a serious sample mix-up, and the tracking method of the present application was used to construct libraries for the 16 DNA samples of this batch according to the preset loading order, as follows:
1. Multiplex primer mix
Add the 52 primer stock solutions (100 μM each) to a 1.5 mL centrifuge tube at 3 μL per primer, then add 144 μL of TE buffer to a final volume of 300 μL. Vortex to mix and centrifuge briefly. The concentration of each primer in the working solution is 1 μM.
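The volumes in this step can be checked with a short calculation. The sketch below simply restates the dilution arithmetic (52 primers at 3 μL each, made up to 300 μL); the function name is illustrative.

```python
def primer_pool(n_primers, stock_um, volume_per_primer_ul, final_volume_ul):
    """Return (buffer volume in uL, concentration of each primer in uM) for a primer pool."""
    primer_volume_ul = n_primers * volume_per_primer_ul
    buffer_volume_ul = final_volume_ul - primer_volume_ul
    if buffer_volume_ul < 0:
        raise ValueError("primers alone exceed the final volume")
    # Each primer is diluted from its stock by the ratio of its volume to the final volume.
    working_um = stock_um * volume_per_primer_ul / final_volume_ul
    return buffer_volume_ul, working_um


te_volume_ul, per_primer_um = primer_pool(
    n_primers=52, stock_um=100.0, volume_per_primer_ul=3.0, final_volume_ul=300.0
)
print(f"TE buffer: {te_volume_ul} uL, each primer: {per_primer_um} uM")
```

The result agrees with the 144 μL of TE buffer and the 1 μM working concentration stated in the step.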
2. Multiplex PCR amplification
Thaw Premix Taq on ice, vortex to mix, centrifuge briefly, and prepare the reaction system according to the following table:
[Table: multiplex PCR reaction system (image not reproduced)]
Vortex the components to mix, centrifuge briefly, and run multiplex PCR amplification with the following program:
[Table: multiplex PCR reaction program (images not reproduced)]
3. Magnetic bead purification
After the reaction is complete, purify the product with XP magnetic beads. Add 35 μL of XP magnetic beads to the 25 μL sample, mix thoroughly, and let stand at room temperature for 5 minutes. Centrifuge briefly and place on a magnetic rack for 3 minutes. Remove and discard the supernatant completely, then rinse twice with 150 μL of freshly prepared 80% ethanol. Remove the 80% ethanol completely and elute with 45 μL of enzyme-free water. Take 1 μL of the supernatant for detection and 3 μL for 2% agarose gel electrophoresis; the band should lie between 190 and 320 bp. Take 40 μL of the supernatant for downstream experiments.
4. End repair and A-tailing
Take out the end repair and A-tailing buffer, thaw at room temperature, vortex to mix, centrifuge briefly, and place on ice for later use; take out the end repair and A-tailing enzyme, centrifuge briefly, and place on ice. Prepare the end-repair reaction system as follows:
Reagent                              Volume (μL)
End repair buffer                    6
End repair and A-tailing enzyme      4
PCR purified product                 40
Total                                50
Vortex to mix the components, centrifuge briefly, and react according to the following program:
Temperature (°C)    Time (min)
20                  30
65                  30
10                  Hold
5. Adapter ligation
Take out the ligation buffer, thaw at room temperature, vortex to mix, centrifuge briefly, and place on ice for later use; take out the ligase, centrifuge briefly, and place on ice for later use; take out the universal adapter, thaw on ice, vortex to mix, centrifuge briefly, and place on ice for later use.
Add 2 μL of universal adapter to the end-repaired, A-tailed product, vortex to mix, centrifuge briefly, place the sample tube on ice, and prepare the ligation reaction system according to the following table:
Reagent            Volume (μL)
Ligation buffer    26
Ligase             2
Total              28
Add the 28 μL of ligation reaction mix to the sample tube, vortex to mix, centrifuge briefly, and keep on ice for the time being.
The ligation reaction program was set as follows:
Temperature (°C)    Time (min)
20                  15
10                  Hold
Place the reaction tube in the PCR instrument once the instrument has reached 20 °C.
6. Magnetic bead purification
After the reaction was completed, the product was purified using XP magnetic beads. Add 40 μL of XP magnetic beads to the 80 μL sample, mix thoroughly, and let stand at room temperature for 5 minutes; centrifuge briefly and place on a magnetic rack for 3 minutes; completely aspirate and discard the supernatant, then rinse twice with 150 μL of freshly prepared 80% ethanol; completely aspirate and discard the 80% ethanol, add 23 μL of enzyme-free water for elution, and take 20 μL of the supernatant for downstream experiments.
7. Library PCR amplification
Take out the high-fidelity PCR enzyme and the index primers, thaw on ice, vortex to mix, centrifuge briefly, and prepare the PCR reaction system according to the following table:
Reagent                        Volume (μL)
High-fidelity PCR enzyme mix   25
Index primer mix               5
Purified ligation product      20
Total                          50
Vortex to mix the above mixture and centrifuge briefly. The library amplification program was set up as follows:
Figure BDA0003112936580000151
8. Magnetic bead purification
After the reaction was completed, the product was purified using XP magnetic beads. Add 50 μL of XP magnetic beads to the 50 μL sample, mix thoroughly, and let stand at room temperature for 5 minutes; centrifuge briefly and place on a magnetic rack for 3 minutes; completely aspirate and discard the supernatant, then rinse twice with 150 μL of freshly prepared 80% ethanol; completely aspirate and discard the ethanol, add 33 μL of enzyme-free water for elution, and collect 30 μL of the supernatant for storage.
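The volumes quoted in steps 3 to 8 can be cross-checked for internal consistency. The sketch below is an illustration only: it tabulates the bead-to-sample ratio of each purification and confirms that the volume withdrawn never exceeds the elution volume. The class name `Cleanup` and its fields are hypothetical, and the ratios are simply derived from the stated volumes rather than quoted from the text.

```python
# Sketch: bookkeeping for the three magnetic-bead purifications (steps 3, 6, 8).
# Volumes are taken from the protocol text; names are illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class Cleanup:
    step: str
    sample_ul: float     # volume of reaction product going into the cleanup
    beads_ul: float      # volume of XP magnetic beads added
    elution_ul: float    # volume of enzyme-free water used for elution
    withdrawn_ul: float  # total supernatant volume taken after elution

    @property
    def bead_ratio(self) -> float:
        return self.beads_ul / self.sample_ul

    @property
    def elution_is_sufficient(self) -> bool:
        return self.withdrawn_ul <= self.elution_ul


CLEANUPS = [
    # Step 3: 1 uL detection + 3 uL gel + 40 uL downstream = 44 uL withdrawn.
    Cleanup("3. after multiplex PCR", 25.0, 35.0, 45.0, 1.0 + 3.0 + 40.0),
    Cleanup("6. after adapter ligation", 80.0, 40.0, 23.0, 20.0),
    Cleanup("8. after library PCR", 50.0, 50.0, 33.0, 30.0),
]

# Input to the step-6 cleanup: end-repair reaction (40 + 6 + 4 uL),
# plus 2 uL adapter, plus 28 uL ligation mix.
ligation_input_ul = 40.0 + 6.0 + 4.0 + 2.0 + 28.0

for c in CLEANUPS:
    print(f"{c.step}: bead ratio {c.bead_ratio:.1f}x, "
          f"elution sufficient: {c.elution_is_sufficient}")
print(f"volume entering step 6: {ligation_input_ul:.0f} uL")
```

The 80 μL entering the second purification matches the sum of the end-repair reaction, adapter and ligation mix, which supports reading the three tables as one continuous workflow.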
9. Library quality inspection
The concentrations of the 16 libraries were measured using Qubit, and all library concentrations were greater than 20 ng/μL; the library fragment sizes were analyzed using an Agilent 2100 Bioanalyzer, and the library peak patterns were similar to those shown in FIG. 1.
The library fragment sizes were distributed between 270 and 400 bp, so the libraries passed quality inspection and were subsequently subjected to high-throughput sequencing.
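The two acceptance criteria stated above (concentration above 20 ng/μL and fragment sizes within 270 to 400 bp) can be expressed as a simple check. This is a minimal sketch; the function name, the argument layout and the example measurements are hypothetical and are not data from the experiment.

```python
# Sketch: apply the stated library QC thresholds to a set of measurements.
# Thresholds come from the text; the sample values below are made up.

MIN_CONC_NG_PER_UL = 20.0
SIZE_WINDOW_BP = (270, 400)


def library_passes_qc(conc_ng_per_ul, frag_min_bp, frag_max_bp):
    """Return True if concentration and fragment size range meet both criteria."""
    conc_ok = conc_ng_per_ul > MIN_CONC_NG_PER_UL
    size_ok = SIZE_WINDOW_BP[0] <= frag_min_bp and frag_max_bp <= SIZE_WINDOW_BP[1]
    return conc_ok and size_ok


# Hypothetical measurements: (concentration, smallest fragment, largest fragment).
example_libraries = {
    "lib_ok": (25.3, 280, 390),
    "lib_low_conc": (12.0, 280, 390),
    "lib_wide": (30.1, 200, 450),
}

qc_results = {name: library_passes_qc(*m) for name, m in example_libraries.items()}
print(qc_results)
```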
10. Results
The high-throughput sequencing results showed that the results obtained according to the preset loading order were consistent with the preset loading layout of the WES experiment (A1-A3 are family 1, A4-A6 are family 2, A7-A8 are family 3, and B6-B8 are family 4). This demonstrates that the actual loading order of the 2 rows in the WES experiment was opposite to the preset order. The WES experiment therefore does not need to be repeated, the cause of the experimental error can be located accurately, and the experimental cycle is effectively shortened.
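The text does not spell out how the panel genotypes were compared with the WES genotypes. One plausible way to locate a loading-order error is to match each sample to the WES record with the highest genotype concordance, as in the sketch below. This is an assumption for illustration, not the method stated in this document; the function names, the 0/1/2 genotype coding, and the toy data (4 wells, 6 sites instead of 16 samples and 26 sites) are all hypothetical.

```python
# Sketch: find which WES record each tracked sample actually corresponds to,
# using genotype concordance over the SNP panel. Toy data only.


def concordance(g1, g2):
    """Fraction of sites with identical genotype calls (None = missing call)."""
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def best_matches(panel, wes):
    """Map each panel well to the WES label whose genotypes agree best."""
    return {
        well: max(wes, key=lambda label: concordance(genotypes, wes[label]))
        for well, genotypes in panel.items()
    }


# Genotypes coded as alternate-allele counts (0, 1, 2) at 6 hypothetical sites.
panel = {
    "A1": [0, 1, 2, 1, 0, 2],
    "A2": [1, 1, 0, 2, 2, 0],
    "A3": [2, 0, 1, 0, 1, 1],
    "A4": [0, 2, 2, 1, 1, 0],
}
# Simulate a WES run in which the row was loaded in reverse order.
wes = {"A1": panel["A4"], "A2": panel["A3"], "A3": panel["A2"], "A4": panel["A1"]}

matches = best_matches(panel, wes)
print(matches)
```

In this toy case every well matches the WES record at the mirrored position, which is the signature of a reversed loading order rather than of random contamination.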
The present invention has been described with reference to specific examples, which are provided to aid understanding of the invention and are not intended to limit it. A person skilled in the art to which the invention pertains may make several simple deductions, modifications or substitutions according to the idea of the invention.

Claims (10)

1. A biomarker for DNA sample tracking detection, characterized in that the biomarker is a nucleic acid containing 26 SNP sites, the 26 SNP sites being rs1344, rs3917981, rs857870, rs4478844, rs4621, rs1058900, rs1130598, rs4703, rs8481, rs4673, rs 1132812812, rs 3059967, rs1045280, rs1135989, rs17626, rs354021, rs 230414186, rs6554, rs2230267, rs11554159, rs5030878, rs 3885, rs9483504, rs11998387, rs11136343 and rs 1071583.
2. The biomarker of claim 1, wherein the screening method for the 26 SNP sites comprises: screening SNP sites with a population frequency of 0.4-0.6, the screened SNP sites being located in regions that can be deeply covered by the exon panel; performing genetic distance analysis on the screened SNP sites in 443 family samples for which exon panel results have been obtained; and finally screening out the 26 SNP sites that meet the requirements.
3. A kit for DNA sample tracking detection, characterized in that the kit comprises PCR amplification primers for detecting the 26 SNP sites in the biomarker of claim 1.
4. The kit according to claim 3, wherein the PCR amplification primers have the sequences shown in SEQ ID No. 1 to SEQ ID No. 52.
5. The kit of claim 4, further comprising: PCR enzyme premix, enzyme-free water, end repair reaction buffer, a library adapter, ligation reaction solution, high-fidelity PCR enzyme, index primers and purification magnetic beads.
6. A method for tracking and detecting a DNA sample, characterized in that PCR amplification is performed on the DNA of the sample to be detected using the PCR amplification primers in the kit of any one of claims 4-5, the PCR amplification product is subjected to high-throughput sequencing, and the variation status of the 26 SNP sites in the biomarker of claim 1 or 2 is analyzed according to the high-throughput sequencing result, thereby determining whether the DNA of the sample to be detected is contaminated.
7. The method according to claim 6, wherein determining whether the sample DNA is contaminated comprises: comparing the detected variation status of the SNP sites with the known variation status of the SNP sites that should be present in the DNA sample to be detected, and determining whether the DNA sample to be detected is contaminated.
8. The method according to claim 6, wherein the PCR amplification of the sample DNA to be detected and the high-throughput sequencing of the PCR amplification product specifically comprise: after PCR amplification of the DNA of the sample to be detected using the PCR enzyme premix, performing purification, end repair, adapter ligation and magnetic bead purification in sequence, then amplifying the bead-purified ligation product to obtain a library that meets the on-machine (sequencer loading) concentration requirement, and performing high-throughput sequencing.
9. The method according to claim 8, wherein, for the library meeting the on-machine concentration requirement, the fragment size of the PCR amplification product is concentrated between 270 and 400 bp as determined by Agilent 2100 quality control.
10. Use of the biomarker according to claim 1 or 2, or the kit according to any one of claims 3 to 5, for the tracking detection of a DNA sample suspected of being contaminated in exome sequencing or genome sequencing.
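As a worked illustration of the frequency window in claim 2, the sketch below estimates how likely two unrelated individuals are to share the same genotype at a biallelic SNP. The claims do not state this rationale; the calculation assumes Hardy-Weinberg proportions and independence between sites, both of which are assumptions introduced here, and the function names are hypothetical.

```python
# Sketch: why allele frequencies near 0.5 discriminate well between samples.
# Assumes Hardy-Weinberg proportions and independent sites (not stated in the claims).
from math import prod


def genotype_match_probability(p):
    """Probability that two unrelated individuals share a genotype at one SNP."""
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)  # hom-ref, het, hom-alt
    return sum(f * f for f in genotype_freqs)


def panel_match_probability(allele_freqs):
    """Probability of matching at every site of a panel of independent SNPs."""
    return prod(genotype_match_probability(p) for p in allele_freqs)


for p in (0.1, 0.4, 0.5, 0.6):
    print(f"p = {p:.1f}: per-site match probability {genotype_match_probability(p):.4f}")

# A hypothetical panel of 26 sites, all with allele frequency 0.5.
print(f"26-site panel: {panel_match_probability([0.5] * 26):.2e}")
```

Under these assumptions the per-site match probability is lowest at a frequency of 0.5 and rises toward the extremes, so sites in the 0.4-0.6 window carry the most identifying information per SNP.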
CN202110653767.4A 2021-06-11 2021-06-11 Biomarker, method and application for DNA sample tracking detection Active CN113512595B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110653767.4A CN113512595B (en) 2021-06-11 2021-06-11 Biomarker, method and application for DNA sample tracking detection

Publications (2)

Publication Number Publication Date
CN113512595A true CN113512595A (en) 2021-10-19
CN113512595B CN113512595B (en) 2023-07-25

Family

ID=78065328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110653767.4A Active CN113512595B (en) 2021-06-11 2021-06-11 Biomarker, method and application for DNA sample tracking detection

Country Status (1)

Country Link
CN (1) CN113512595B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116805510A (en) * 2022-09-01 2023-09-26 杭州链康医学检验实验室有限公司 Site combination for judging sample pairing or pollution and application thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130302801A1 (en) * 2008-11-07 2013-11-14 Sequenta, Inc. Detection and quantification of sample contamination in immune repertoire analysis
US20150191778A1 (en) * 2011-04-19 2015-07-09 Life Technologies Corporation Compositions and methods for detecting and identifying salmonella enterica strains
CN108823296A (en) * 2017-05-05 2018-11-16 深圳华大基因股份有限公司 A kind of method, kit and the application of the pollution of detection sample of nucleic acid
CN109022562A (en) * 2018-08-29 2018-12-18 天津诺禾致源生物信息科技有限公司 For detecting the screening technique of the SNP site of sample contamination and the method for detecting sample contamination in high-flux sequence
CN109280696A (en) * 2017-07-21 2019-01-29 安塞斯(北京)生物技术有限公司 The method of SNP detection technique fractionation mixing sample
WO2020118046A1 (en) * 2018-12-05 2020-06-11 William Marsh Rice University Quantifying foreign dna in low-volume blood samples using snp profiling

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CARL et al.: "Data quality control in genetic case-control association studies", 《NATURE PROTOCOLS》, vol. 5, no. 9, 30 September 2010 (2010-09-30), pages 1564 - 1573, XP055801113, DOI: 10.1038/nprot.2010.116 *
YI Licong (易丽聪) et al.: "Genetic diversity analysis of watermelon germplasm resources based on SNP markers", 《中国瓜菜》 (China Cucurbits and Vegetables), vol. 33, no. 12, 31 December 2020 (2020-12-31), page 9 *

Also Published As

Publication number Publication date
CN113512595B (en) 2023-07-25

Similar Documents

Publication Publication Date Title
CN108004304B (en) Method for detecting clonality of lymphocyte related gene rearrangement
NL1044132B1 (en) Kit and method for abo blood group genotyping based on high-throughput sequencing
CN106011230A (en) Primer composition for detecting fragmentized DNA target area and application thereof
CN111748628B (en) Primer and kit for detecting thyroid cancer prognosis related gene variation
CN111748637A (en) SNP molecular marker combination, multiplex composite amplification primer set, kit and method for genetic relationship analysis and identification
CN113512595B (en) Biomarker, method and application for DNA sample tracking detection
CN108866155B (en) Preparation method of next generation sequencing library
CN109295500B (en) Single cell methylation sequencing technology and application thereof
CN112259165B (en) Method and system for detecting microsatellite instability state
CN114015749A (en) Construction method of mitochondrial genome sequencing library based on high-throughput sequencing and amplification primer
CN109652499B (en) Method and kit for rapidly detecting 3'-5' exoactivity or mismatch of DNA polymerase
CN111394434B (en) CHO host cell DNA residue detection kit adopting TaqMan probe method and application thereof
CN116524999B (en) Screening method and detection system for target set for detecting colorectal cancer tiny residual focus
CN116463408A (en) ABO gene amplification primer, amplification system, amplification method, sequencing library construction method and sequencing method
CN117106877A (en) RHD gene primer, primer mixed system amplification method, amplification product quality detection method, sequencing library construction method and sequencing method
CN116219016A (en) Thyroid nodule benign and malignant detection method, kit and application
CN116064818A (en) Primer group, method and system for detecting IGH gene rearrangement and hypermutation
CN114317696A (en) Kit, library construction method thereof and pollution detection method
CN116355909A (en) Marker for detecting amplification of neuroblastoma MYCN and application thereof
CN107904297B (en) Primer group, joint group and sequencing method for microbial diversity research
CN113913495A (en) Duplex UMI linker and sequencing method
CN112858693A (en) Biomolecule detection method
CN113658639A (en) Somatic mutation hypersensitivity detection method based on nucleic acid mass spectrometry platform
CN108929901B (en) Multiple amplification library building method for trace DNA and special kit thereof
CN114807302B (en) Amplicon library construction method and kit for thalassemia mutant and deletion type gene detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 a3803, building 11, Shenzhen Bay science and technology ecological park, No. 16, Keji South Road, community, high tech Zone, Yuehai street, Nanshan District, Shenzhen, Guangdong

Applicant after: Shenzhen Yaji Technology Co.,Ltd.

Address before: 518000 room 1901, building 6, Baoneng Science Park, Qinghu community, Longhua street, Longhua District, Shenzhen City, Guangdong Province

Applicant before: AEGICARE (SHENZHEN) TECHNOLOGY CO.,LTD.

GR01 Patent grant