CN113512595B - Biomarker, method and application for DNA sample tracking detection - Google Patents

Publication number
CN113512595B
Authority
CN
China
Prior art keywords
sample
snp
biomarker
dna
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110653767.4A
Other languages
Chinese (zh)
Other versions
CN113512595A (en)
Inventor
李妍珂
刘永初
吕佩涛
刘阳
李阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yaji Technology Co ltd
Original Assignee
Shenzhen Yaji Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yaji Technology Co ltd
Priority to CN202110653767.4A
Publication of CN113512595A
Application granted
Publication of CN113512595B
Legal status: Active
Anticipated expiration

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/6844: Nucleic acid amplification reactions
    • C12Q1/6858: Allele-specific amplification
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers
    • C12Q2600/16: Primer sets for multiplex assays
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30: Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A biomarker, method and application for DNA sample tracking detection. Specific multiplex PCR primers are designed for 26 selected SNP loci, so that a sample suspected of contamination can be tracked and detected; the approach is especially suitable for verifying confusion of the sample loading sequence. Tracking and detecting a DNA sample suspected of contamination with this method greatly shortens the time needed to troubleshoot experimental factors, allows workflow monitoring of clinical samples, makes full use of the NGS sequencing platform without additional equipment, and meets the strict turnaround-time requirements of clinical samples.

Description

Biomarker, method and application for DNA sample tracking detection
Technical Field
The invention relates to the field of gene detection tracking, in particular to a biomarker, a method and application for DNA sample tracking detection.
Background
At present, high-throughput sequencing (Next Generation Sequencing, NGS) is gradually moving onto the right track in clinical diagnostic applications. It is widely used in fields such as early cancer screening, companion diagnostics for cancer, auxiliary diagnosis of genetic diseases, noninvasive prenatal diagnosis, and detection of pathogenic microorganisms, and its detection capability is increasingly evident. While the technology has developed rapidly, the corresponding internal laboratory quality control has developed relatively slowly. NGS involves nucleic acid extraction and library construction, and some workflows also require complex probe hybridization capture, so that as many as twenty steps may precede the final on-machine sequencing. It is difficult to guarantee that no manual error occurs during this process, and such errors can cause sample confusion. How to establish a sample tracking system that prevents sample confusion and effectively ensures that results are accurate and reliable is one of the most urgent problems to be solved for large-scale clinical deployment of NGS.
The most common method for monitoring sample confusion is to perform a sex analysis on the sample while completing human whole genome sequencing (Whole Genome Sequencing, WGS) or whole exome sequencing (Whole Exome Sequencing, WES) analysis, without establishing an additional sample tracking system for the situations in which confusion may occur in the NGS workflow. The advantage of this method is that it can show whether the analysis result is consistent with the submitted sample information without any additional experiment. However, sex analysis is too coarse: it can only distinguish male samples from female samples and cannot reveal whether different samples of the same sex have been confused.
Promega (U.S.A.) has developed an STR detection kit based on capillary electrophoresis sequencing to accurately identify individual DNA. STR refers to a short tandem repeat (Short Tandem Repeat), a class of DNA sequences formed by tandem repetition of a core unit of 2-6 bases. On average there is one STR per 15 kb in the human genome, and over 7000 STRs have been found. Because the number of repeats of the STR core unit is highly specific to each individual, STRs are genetically diverse and can be used to identify both genetic relationships and individual identity. The technology is widely applied in criminal investigation and forensics as a gold standard for DNA identification, and sample confusion that may occur during an experiment can also be judged from the electropherogram analysis. However, the method has limitations: although the number of repeats of the core unit can be resolved accurately, the specific base composition of the repeats is not known, so sites that have the same number of repeats and differ only in internal bases cannot be distinguished; in addition, the electropherogram analysis is affected by various factors that can lead to erroneous results.
The SeqCap EZ Primer Exome panel, a whole exon sequencing product recently released by Roche Diagnostics, contains 340 SNP sites. These sites constitute a sample identification ID derived from the DNA itself, which makes it possible to track samples throughout the detection process and monitor confusion between samples. Because no marker has to be added manually, the risk caused by human misoperation is effectively avoided. However, because different whole exon sequencing panels cover different regions, the probes of panels from other brands (a panel is a group or set of detection targets) cannot cover all of the above 340 SNPs, and it remains an open question which detection method should be used to compare and verify the 340 sites. In addition, using a large number of SNP loci increases the amount of data, and with it the complexity and cost of data processing.
Disclosure of Invention
The main object of the invention is to provide a method for DNA sample tracking detection and its application. By screening out a nucleic acid containing 26 SNP loci as a biomarker for DNA sample tracking detection, the DNA of a biological sample can be tracked effectively, a sample suspected of confusion can be verified, and it can be judged whether the sample DNA is contaminated.
In order to achieve the above purpose, the present application adopts the following technical scheme:
According to a first aspect, the present application provides a biomarker for DNA sample tracking detection. The biomarker is a nucleic acid containing 26 SNP sites, specifically rs1344, rs3917981, rs857870, rs4478844, rs4621, rs1058900, rs1130598, rs4703, rs8481, rs4673, rs1132812, rs9930567, rs1045280, rs1135989, rs17626, rs354021, rs2304186, rs6554, rs2230267, rs11554159, rs5030878, rs3822585, rs9483504, rs11998387, rs11136343, rs1071583.
SNP stands for single nucleotide polymorphism. The theoretical basis for using SNP analysis to judge whether a nucleic acid sample contains contamination from a different individual is that, except for identical twins, every individual has a different set of single nucleotide polymorphisms, so the SNP profile serves as a specific mark of each individual. If the SNP analysis result differs from the known SNP profile of the sample under test, the sample is judged to be possibly contaminated, or cross-contamination between the reaction systems of different samples is judged to have occurred during the test. In the present application, the 26 specific SNP loci are used to verify DNA samples suspected of confusion and thereby judge whether sample contamination exists.
The more SNP loci are detected, the higher the accuracy, but the larger the corresponding data volume. A large data volume greatly increases the detection cost and can interfere with the data of the main assay. Therefore, balancing the accuracy of SNP detection against the SNP data volume, the preferred SNP loci of the present application are 26 loci selected by genetic distance analysis from 443 family samples for which exon panel results were available.
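The trade-off between the number of loci and discriminating power can be illustrated with a rough calculation. The sketch below is not part of the patent: it assumes biallelic SNPs in Hardy-Weinberg equilibrium that are inherited independently, and the function names are hypothetical. Under those assumptions it estimates the chance that two unrelated individuals share the same genotype at every locus of a panel.

```python
def genotype_match_probability(allele_freq: float) -> float:
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    Assumes Hardy-Weinberg equilibrium (an assumption, not stated in the source).
    """
    p = allele_freq
    q = 1.0 - p
    genotype_freqs = (p * p, 2.0 * p * q, q * q)  # hom-ref, het, hom-alt
    # Two people match when both draw the same genotype class.
    return sum(f * f for f in genotype_freqs)


def panel_match_probability(allele_freqs) -> float:
    """Chance of an identical genotype at every locus, assuming independent loci."""
    prob = 1.0
    for p in allele_freqs:
        prob *= genotype_match_probability(p)
    return prob


if __name__ == "__main__":
    # Illustrative panel sizes, all loci at allele frequency 0.5.
    for n_loci in (5, 10, 26):
        print(n_loci, panel_match_probability([0.5] * n_loci))
```

Under these assumptions the per-locus match probability is lowest when the allele frequency is 0.5, which is consistent with preferring loci whose population frequency lies near 0.5.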
The application is aimed at verifying samples that other existing methods have flagged as possibly confused (contaminated). By detecting the 26 SNP loci contained in the biomarker, the DNA of a biological sample can be tracked effectively and it can be verified whether the sample DNA is contaminated. Compared with the existing approach of sample tracking and contamination judgment using 340 SNP loci, the biomarker of the application requires only 26 SNP loci, which makes it easier to operate and more cost-effective. Moreover, the biomarker can rapidly identify sample contamination in WES and WGS experiments without repeating the experiments, and the result is accurate. Using the biomarker for sample tracking and contamination detection greatly shortens the time needed to troubleshoot experimental factors, allows workflow monitoring of clinical samples, makes full use of the NGS sequencing platform without additional equipment, and meets the strict turnaround-time requirements of clinical samples. It will be understood that the key point of the biomarker is that it contains the 26 designated SNP loci, through which sample tracking and contamination detection can be achieved accurately and effectively; the biomarker may also include other SNP sites or other sequences for other functional requirements, which are not specifically limited herein.
In the present application, the 26 SNP sites are screened as follows: SNP loci with a population frequency of 0.4-0.6 that lie in regions deeply covered by the exon panel are selected first; genetic distance analysis is then performed on these loci in 443 family samples with exon panel results; and the 26 SNP loci that meet the requirements are finally selected.
It should be noted that "population frequency" is understood as the proportion of individuals in a specific population (e.g., the Chinese population) that carry a given base at a site. For example, a SNP locus with a population frequency of 0.5 means that 50% of people carry A and 50% carry C/T/G. In the present application, a candidate SNP site must first have a population frequency of about 0.5, e.g. 0.4-0.6. Second, the site must lie in a region that the exon panel covers deeply, so that insufficient coverage depth at the site does not cause errors in the exon panel result; the exon panel used in the present application is the one produced by Integrated DNA Technologies (hereinafter referred to as IDT). Genetic distance analysis is then performed on the screened loci in the 443 family samples with exon panel results; genetic distance analysis is a common technical means in the field and is not described further herein.
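The first two screening criteria (population frequency of 0.4-0.6 and deep panel coverage) can be sketched as a simple pre-filter. This is only an illustration under stated assumptions: the `SnpCandidate` record, the `mean_depth` field and the `min_depth` threshold are hypothetical, since the source gives no numeric definition of "deep coverage", and the genetic distance analysis step is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpCandidate:
    """Hypothetical record for one candidate SNP."""
    rsid: str
    allele_freq: float  # population frequency of the allele at this site
    mean_depth: float   # mean panel coverage at the site (assumed metric)


def prefilter(candidates, freq_range=(0.4, 0.6), min_depth=100.0):
    """Keep candidates inside the frequency window that are also well covered.

    freq_range follows the 0.4-0.6 window in the text; min_depth is an
    arbitrary placeholder, not a value taken from the source.
    """
    low, high = freq_range
    return [
        c for c in candidates
        if low <= c.allele_freq <= high and c.mean_depth >= min_depth
    ]


if __name__ == "__main__":
    # Made-up candidates purely for demonstration.
    demo = [
        SnpCandidate("snp_a", 0.50, 250.0),
        SnpCandidate("snp_b", 0.30, 250.0),  # frequency outside the window
        SnpCandidate("snp_c", 0.45, 20.0),   # coverage too shallow
    ]
    print([c.rsid for c in prefilter(demo)])
```

Candidates that pass this pre-filter would then go on to the genetic distance analysis described in the text.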
It should be noted that the key point of the application is the detection of the 26 designated SNP sites in the biomarker. To achieve sample tracking and contamination detection simply and effectively, specific primers can be designed to perform multiplex PCR amplification and high-throughput sequencing of the 26 SNP sites. The specific primer sequences can be determined according to existing primer design principles and are not particularly limited herein.
According to a second aspect, the present application provides a kit for DNA sample tracking detection, the kit comprising PCR amplification primers for detecting the 26 SNP sites of the biomarker of the first aspect. Preferably, the PCR amplification primers are the sequences shown in SEQ ID No.1 to SEQ ID No.52, in order.
The sequences of SEQ ID No.1 to SEQ ID No.52 are shown in the following table:
It should be noted that the key point of the kit is that the 26 SNP loci of the biomarker are detected by PCR amplification and subsequent high-throughput sequencing; for the specific PCR primer sequences, reference may be made to existing primer design principles. When the kit containing 26 pairs of multiplex PCR amplification primers is assembled, each primer can be packaged separately and then selected and added as required, or each primer pair can be premixed as a unit, or all 52 primers of the 26 pairs for the 26 SNP loci can be mixed together and added as a primer mixture at the time of use. In addition, the other reagents used in the kit, such as PCR enzyme premix, nuclease-free water, end repair reaction buffer, library adapters, ligation reaction solution, high-fidelity PCR enzyme, tag primers and purification magnetic beads (abbreviated as magnetic beads), are commercially available.
In one implementation of the application, multiplex PCR primers are designed for the 26 SNP loci. They can be applied to sample confusion verification in whole genome (WGS) or whole exome (WES) detection: any confused sample can be detected by analyzing the 26 SNP loci, without adversely affecting the WGS or WES detection.
It should be noted that when the biomarker of the application is used to track and detect a DNA sample suspected of contamination, the whole process can be carried out within exome sequencing or genome sequencing; that is, the 26 SNP loci of the biomarker are analyzed and detected while the exome or genome is being sequenced, thereby achieving sample tracking and contamination detection. The panel used for exome sequencing or genome sequencing can also be used directly; for example, in one implementation of the present application, the whole-exon panel of IDT is employed.
Still further, the kit comprises: PCR enzyme premix, nuclease-free water, end repair reaction buffer, library adapters, ligation reaction solution, high-fidelity PCR enzyme, tag primers and purification magnetic beads. Nuclease-free water is water that contains no DNase or RNase and can be used to elute reactants; the PCR enzyme premix is the Taq enzyme premix produced by Takara. Other reagents not specifically described may be chosen from components and adapters commonly used in the art. The library can be constructed by conventional methods in the art, for example: multiplex PCR primer mixing, multiplex PCR amplification, magnetic bead purification, end repair and A-tailing, adapter ligation, magnetic bead purification, library PCR amplification, magnetic bead purification, and library quality inspection.
In one example, after PCR amplification primers (abbreviated as primers) are designed and synthesized for the 26 SNP sites, each primer is diluted to a stock solution with a concentration of 100 µM, and the primers are mixed in equal volumes so that the concentration of each primer in the working solution is 1 µM. One round of multiplex PCR amplification is carried out with Takara Taq enzyme premix (Premix Taq) to obtain PCR products with fragment sizes of 190 bp-320 bp. The PCR product is purified to remove primer dimers and the Taq enzyme premix. The purified PCR product then undergoes end repair and adapter ligation, in which a universal adapter is added. The ligation product is purified with magnetic beads to remove adapter dimers and the ligation reaction solution. Finally, the purified ligation product is amplified with high-fidelity PCR enzyme and sample-distinguishing tag primers to obtain a library that meets the concentration required for on-machine sequencing. The library fragment size is checked on an Agilent 2100; it is concentrated between 270 and 400 bp, and the library can then be used for downstream sequencing.
It should be noted that the key point of the present application is the screening of the 26 SNP sites and the design of the 26 pairs of multiplex PCR primers. For the subsequent library construction and sequencing steps, reference may be made to existing library construction and high-throughput sequencing procedures, and the subsequent SNP analysis of the sequencing result can be performed with an NGS sequencing platform or the like; these are not described further herein.
Because no high-throughput detection product currently on the market is both simple to operate and cost-effective for accurately judging whether a sample is contaminated, the application discloses a method for tracking samples using SNPs in DNA. The method is suitable for tracking whole genome or whole exome sequencing detection, and is particularly suitable for tracking whole exome sequencing detection that uses the xGen Exome Research Panel of Integrated DNA Technologies (hereinafter referred to as IDT). With this DNA sample tracking detection method, the SNP loci of a sample under test can be rapidly determined and compared with the variants that the sample is actually known to carry, so as to judge whether sample contamination (or confusion) exists.
Therefore, according to a third aspect, the present application also provides a method for DNA sample tracking detection, which comprises: performing PCR amplification on the DNA of the sample under test using the PCR amplification primers in the kit of the second aspect, performing high-throughput sequencing on the PCR amplification product, and analyzing the variant status of the 26 SNP sites of the biomarker from the high-throughput sequencing result, thereby judging whether the DNA of the sample under test is contaminated. Specifically, judging whether the DNA of the sample under test is contaminated comprises: comparing the detected variant status of the SNP loci with the variant status that the DNA sample is actually known to carry at those loci.
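The comparison step described above can be sketched as a small genotype concordance check. This is a minimal illustration, not the applicant's analysis pipeline: genotypes are assumed to be available as two-letter strings such as "AG" keyed by rsID, and the function names and the `max_mismatches` tolerance are hypothetical.

```python
def normalize(genotype: str) -> str:
    """Make allele order irrelevant, so that 'GA' and 'AG' compare equal."""
    return "".join(sorted(genotype.upper()))


def compare_genotypes(observed: dict, expected: dict, max_mismatches: int = 0) -> dict:
    """Compare observed genotypes with those the sample is known to carry.

    Only loci present in both dictionaries are compared. max_mismatches is
    an assumed tolerance for genotyping error, not a value from the source.
    """
    shared = sorted(set(observed) & set(expected))
    mismatches = [
        rsid for rsid in shared
        if normalize(observed[rsid]) != normalize(expected[rsid])
    ]
    return {
        "compared": len(shared),
        "mismatches": mismatches,
        "consistent": bool(shared) and len(mismatches) <= max_mismatches,
    }


if __name__ == "__main__":
    # Made-up genotypes at three of the 26 loci, for demonstration only.
    expected = {"rs1344": "AG", "rs4621": "CC", "rs8481": "TT"}
    observed = {"rs1344": "GA", "rs4621": "CT", "rs8481": "TT"}
    print(compare_genotypes(observed, expected))
```

A sample flagged as inconsistent would be a candidate for contamination or confusion and would then be followed up as the text describes.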
In one embodiment, whether samples have been confused can be determined by verifying whether the actual loading sequence of the samples is consistent with the preset loading sequence. For example, for samples suspected of confusion, DNA fragments are amplified by multiplex PCR according to the preset loading sequence, a library is constructed, and high-throughput sequencing is performed. During verification, the PCR amplification primers designed for the 26 SNP loci are added during library construction of the samples according to the preset loading sequence, and PCR amplification and library construction are carried out together with the samples; the resulting mixed library is sequenced and the single nucleotide polymorphisms of each sample are analyzed; from the SNP analysis result it can be judged whether a sample contains contamination from a different individual. The DNA fragment either initially contains the 26 SNP sites or contains them after amplification.
When the verification sequencing result is the same as the preset loading result, this indicates that the DNA samples had been confused. Specifically, because the initial sequencing result differs from the preset loading result, the samples are judged to have been confused, i.e. the actual loading sequence was inconsistent with the preset one, presumably because the DNA loading order was reversed. To verify this presumption, the samples are subjected to PCR amplification, library construction and sequencing again according to the original preset loading sequence. When the verification sequencing result matches the preset loading result, it confirms that the samples were confused in the initial run and that the correct loading sequence is the preset one.
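One way to make the loading-order check concrete is to match the genotypes observed in each well against the known genotypes of every sample and report which sample each well most likely contains. The sketch below is an illustration only and is not described in the source: the data layout, the function names and the `min_concordance` threshold are assumptions, and genotypes are assumed to be already normalized strings such as "AG".

```python
def concordance(observed: dict, expected: dict) -> float:
    """Fraction of shared loci at which two genotype profiles agree."""
    shared = set(observed) & set(expected)
    if not shared:
        return 0.0
    return sum(observed[rsid] == expected[rsid] for rsid in shared) / len(shared)


def resolve_positions(expected_by_sample: dict, observed_by_well: dict,
                      min_concordance: float = 0.95) -> dict:
    """For each well, name the best-matching sample, or None if none is close.

    min_concordance is an assumed cut-off, not a value from the source.
    """
    resolved = {}
    for well, observed in observed_by_well.items():
        best_id, best_score = None, 0.0
        for sample_id, expected in expected_by_sample.items():
            score = concordance(observed, expected)
            if score > best_score:
                best_id, best_score = sample_id, score
        resolved[well] = best_id if best_score >= min_concordance else None
    return resolved


def swapped_wells(preset_layout: dict, resolved: dict) -> list:
    """Wells whose resolved sample differs from the preset loading layout."""
    return sorted(w for w, s in preset_layout.items() if resolved.get(w) != s)


if __name__ == "__main__":
    # Made-up profiles for two samples whose wells were exchanged.
    expected = {"S1": {"rs1344": "AA", "rs4621": "AG"},
                "S2": {"rs1344": "GG", "rs4621": "AG"}}
    observed = {"A1": {"rs1344": "GG", "rs4621": "AG"},
                "A2": {"rs1344": "AA", "rs4621": "AG"}}
    resolved = resolve_positions(expected, observed)
    print(resolved, swapped_wells({"A1": "S1", "A2": "S2"}, resolved))
```

With only a handful of loci, distinct samples can tie; the full 26-locus profile is what makes the best match informative.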
According to a fourth aspect, the present application also provides the use of the biomarker of the first aspect, or the kit of the second aspect, for tracking detection of a DNA sample suspected of contamination in exome sequencing or genome sequencing.
By adopting the above technical scheme, the application achieves the following beneficial effects:
The method achieves sample tracking and contamination detection by detecting only 26 SNP loci, is particularly suitable for verifying confusion of the sample loading sequence, effectively reduces the consumption of sample and reagents, and is easier to operate and more cost-effective.
Moreover, the biomarker can rapidly identify sample contamination in WES and WGS experiments without repeating the experiments, and the result is accurate. Tracking and verifying samples suspected of confusion with the DNA sample tracking detection method disclosed in the application greatly shortens the time needed to troubleshoot experimental factors, allows workflow monitoring of clinical samples, makes full use of the NGS sequencing platform without additional equipment, and meets the strict turnaround-time requirements of clinical samples.
Drawings
FIG. 1 is a library peak diagram at the time of library quality inspection in example 1.
Detailed Description
The invention will be described in further detail below with reference to the drawings by means of specific embodiments.
In the following embodiments, numerous specific details are set forth to provide a better understanding of the present application. However, one skilled in the art will readily recognize that some of the features may be omitted, or replaced by other elements, materials, or methods, in different situations. In some instances, certain operations related to the present application are not shown or described in the specification, to avoid obscuring its core; a detailed description of these operations is unnecessary, since a person skilled in the art can fully understand them from the description herein and the general technical knowledge in the field.
Furthermore, the described features, operations, or characteristics of the description may be combined in any suitable manner in various embodiments. Also, various steps or acts in the method descriptions may be interchanged or modified in a manner apparent to those of ordinary skill in the art. Thus, the various orders in the description and drawings are for clarity of description of only certain embodiments, and are not meant to be required orders unless otherwise indicated.
The prior art offers no good verification method for judging DNA contamination. Because different whole-exon sequencing panels cover different regions, a selection with too many SNP sites cannot be covered by panels of other brands, which limits the range of application and makes the approach poorly generalizable; in addition, the accuracy of the SNP sites is difficult to verify when there are too many of them. In view of these shortcomings, the application provides a method for tracking DNA sample contamination that accurately judges whether a sample is contaminated (or confused). The 26 SNP loci used by the tracking method are covered by the panels of most brands and can be used with human whole genome sequencing or whole exome sequencing. Specifically, the SNP sites are the following 26 sites: rs1344, rs3917981, rs857870, rs4478844, rs4621, rs1058900, rs1130598, rs4703, rs8481, rs4673, rs1132812, rs9930567, rs1045280, rs1135989, rs17626, rs354021, rs2304186, rs6554, rs2230267, rs11554159, rs5030878, rs3822585, rs9483504, rs11998387, rs11136343, rs1071583. The 26 pairs (52) of PCR primers designed for the 26 SNP sites are shown in the following table:
The present application is described in further detail below by way of specific examples and figures. The following examples are merely illustrative of the present application and should not be construed as limiting it.
Example 1
A total of 16 samples, from 4 families and non-family forensics, were tested with the whole-exon panel of IDT, where A1-A3 are family 1, A4-A6 are family 2, A7-A8 are family 3, B6-B8 are family 4, and the remaining samples are independent forensics. The preset sample loading sequence is shown in the following chart:
The actual sample loading sequence is shown in the following chart:
The detection result showed that A1, A2 and A3 do not fit a family relationship; A4, A5 and A6 do not fit a family relationship; A7, A8 and B1 do not fit a family relationship; and B6, B7 and B8 do not fit a family relationship. The batch was therefore suspected of serious sample confusion, and the 16 DNA samples in the batch were subjected to library construction according to the preset loading sequence using the tracking method. The steps are as follows:
1. Multiplex primer mixing
Add the 52 single-primer stock solutions (100 µM each) to a 1.5 mL centrifuge tube at 3 µL per primer, add 144 µL of TE buffer to a final volume of 300 µL, vortex thoroughly and centrifuge briefly; the working concentration of each primer is then 1 µM.
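The volumes in this step follow from simple dilution arithmetic, which the sketch below reproduces; the function name is hypothetical and the code only restates the numbers given in the text (52 primers, 100 µM stocks, 3 µL each, 300 µL final volume).

```python
def primer_pool(n_primers: int, stock_um: float,
                vol_per_primer_ul: float, final_volume_ul: float):
    """Return (total primer volume, buffer volume, per-primer concentration).

    Volumes are in microlitres and concentrations in micromolar.
    """
    primer_volume = n_primers * vol_per_primer_ul
    buffer_volume = final_volume_ul - primer_volume
    if buffer_volume < 0:
        raise ValueError("primer volumes exceed the final volume")
    # C1 * V1 = C2 * V2, applied to each primer individually.
    each_primer_um = stock_um * vol_per_primer_ul / final_volume_ul
    return primer_volume, buffer_volume, each_primer_um


if __name__ == "__main__":
    primers_ul, te_ul, working_um = primer_pool(52, 100.0, 3.0, 300.0)
    print(primers_ul, te_ul, working_um)
```

The result agrees with the text: 156 µL of primers plus 144 µL of TE buffer gives 300 µL, and each primer ends up at 1 µM.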
2. Multiplex PCR amplification
Thaw Premix Taq on ice, vortex to mix, centrifuge briefly, and prepare the reaction system according to the following table:
Vortex the components to mix, centrifuge briefly, and carry out multiplex PCR amplification according to the reaction program in the following table:
3. Magnetic bead purification
After the reaction, purify the product with XP magnetic beads. Add 35 µL of XP magnetic beads to the 25 µL sample, mix well, and let stand at room temperature for 5 minutes; centrifuge briefly and place on a magnetic rack for 3 minutes; remove the supernatant completely, add 150 µL of freshly prepared 80% ethanol, and rinse twice; remove the 80% ethanol completely and add 45 µL of nuclease-free water to elute. Take 1 µL of the supernatant to measure the concentration and 3 µL for 2% agarose gel electrophoresis, where the band should lie at 190-320 bp; use 40 µL of the supernatant for downstream experiments.
4. End repair and A-tailing
Take out the end repair and A-tailing buffer, thaw at room temperature, vortex to mix, spin down briefly, and keep on ice; take out the end repair and A-tailing enzyme, spin down briefly, and keep on ice. Prepare the end repair reaction system as follows:
Reagent name | Volume (µL)
End repair buffer | 6
End repair and A-tailing enzyme | 4
Purified PCR product | 40
Total | 50
Vortex the components to mix, spin down briefly, and run the reaction according to the program in the following table:
Temperature (°C) | Time (min)
20 | 30
65 | 30
10 | Hold
5. Adapter ligation
Take out the ligation buffer, thaw at room temperature, vortex to mix, spin down briefly, and keep on ice; take out the ligase, spin down briefly, and keep on ice; take out the universal adapter, thaw on ice, vortex to mix, spin down briefly, and keep on ice.
Add 2 µL of universal adapter to the end-repaired and A-tailed product, vortex to mix, spin down briefly, place the sample tube on ice, and prepare the ligation reaction system according to the following table:
Reagent name | Volume (µL)
Ligation buffer | 26
Ligase | 2
Total | 28
Add 28 µL of the ligation reaction mix to the sample tube, vortex to mix, spin down briefly, and place on ice.
Set up the ligation reaction program as follows:
Temperature (°C) | Time (min)
20 | 15
10 | Hold
Place the reaction tube in the PCR instrument once the instrument has reached 20 °C.
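When all 16 samples are processed together, the ligation mix is usually prepared as one master mix. The sketch below scales the per-sample volumes from the table above and checks that the assembled reaction adds up to the 80 µL that enters the next purification step. The 10% pipetting overage and the function name `master_mix` are assumptions for illustration; the patent gives only per-sample volumes.

```python
# Per-sample volumes (µL) taken from the ligation table above.
LIGATION_MIX_UL = {"ligation buffer": 26.0, "ligase": 2.0}

def master_mix(per_sample_ul, n_samples, overage=0.10):
    """Scale per-sample volumes to a batch; `overage` is an assumed pipetting excess."""
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 1) for name, vol in per_sample_ul.items()}

batch = master_mix(LIGATION_MIX_UL, n_samples=16)
print(batch)

# Per-sample reaction volume: end repair product (50) + universal adapter (2) + ligation mix (28).
per_sample_mix = sum(LIGATION_MIX_UL.values())
reaction_total = 50.0 + 2.0 + per_sample_mix
print(reaction_total)  # 80.0
```

The 80 µL total matches the sample volume stated in the bead purification that follows the ligation.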
6. Magnetic bead purification
After the reaction, purify the product with XP magnetic beads. Add 40 µL of XP beads to the 80 µL sample, mix well, and let stand at room temperature for 5 min; spin down briefly and place on a magnetic rack for 3 min; remove the supernatant completely, add 150 µL of freshly prepared 80% ethanol, and rinse twice; after rinsing, remove the 80% ethanol completely, add 23 µL of nuclease-free water to elute, and use 20 µL of the supernatant for downstream experiments.
7. Library PCR amplification
Take out the high-fidelity PCR enzyme and the index primers, thaw on ice, vortex to mix, spin down briefly, and prepare the PCR reaction system according to the following table:
Reagent name | Volume (µL)
High-fidelity PCR enzyme Mix | 25
Index Primer Mix | 5
Purified ligation product | 20
Total | 50
Vortex the mixture and spin down briefly. Set up the library amplification program as follows:
8. Magnetic bead purification
After the reaction, purify the product with XP magnetic beads. Add 50 µL of XP beads to the 50 µL sample, mix well, and let stand at room temperature for 5 min; spin down briefly and place on a magnetic rack for 3 min; remove the supernatant completely, add 150 µL of freshly prepared 80% ethanol, and rinse twice; after the ethanol has been completely removed, add 33 µL of nuclease-free water to elute, and collect 30 µL of the supernatant for storage.
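The three bead purifications (steps 3, 6 and 8) use different bead-to-sample volume ratios. The short sketch below tabulates them from the volumes stated in the protocol; the labels and the helper name `bead_ratio` are illustrative assumptions.

```python
# (cleanup label, bead volume in µL, sample volume in µL), as stated in steps 3, 6 and 8.
CLEANUPS = [
    ("multiplex PCR product", 35.0, 25.0),
    ("ligation product", 40.0, 80.0),
    ("library PCR product", 50.0, 50.0),
]

def bead_ratio(bead_ul, sample_ul):
    """Bead-to-sample volume ratio."""
    return bead_ul / sample_ul

ratios = {name: bead_ratio(b, s) for name, b, s in CLEANUPS}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

This gives 1.40x, 0.50x and 1.00x respectively; the patent does not state why these ratios were chosen.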
9. Library quality inspection
Measure the concentration of the 16 libraries with Qubit; all library concentrations were above 20 ng/µL. Analyze the library fragment sizes with an Agilent 2100 Bioanalyzer; the library peak profiles were similar to that shown in FIG. 1.
The library fragment sizes were distributed between 270 and 400 bp, so the libraries passed quality inspection and were subjected to high-throughput sequencing.
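The two acceptance criteria stated here (concentration above 20 ng/µL and fragment sizes within 270-400 bp) can be expressed as a simple check. This is a minimal sketch: the `LibraryQC` record, the `passes_qc` function and the example readings are hypothetical, and only the two thresholds come from the text.

```python
from dataclasses import dataclass

@dataclass
class LibraryQC:
    sample_id: str
    conc_ng_per_ul: float  # Qubit concentration
    size_min_bp: int       # lower edge of the main Bioanalyzer peak
    size_max_bp: int       # upper edge of the main Bioanalyzer peak

def passes_qc(lib, min_conc=20.0, size_window=(270, 400)):
    """True if concentration exceeds min_conc and the peak lies inside size_window."""
    low, high = size_window
    return (lib.conc_ng_per_ul > min_conc
            and low <= lib.size_min_bp
            and lib.size_max_bp <= high)

# Hypothetical readings, for illustration only (not data from the patent).
libraries = [
    LibraryQC("A1", 32.5, 280, 390),
    LibraryQC("A2", 18.0, 285, 395),  # fails: concentration too low
    LibraryQC("A3", 41.2, 250, 380),  # fails: peak extends below 270 bp
]
failed = [lib.sample_id for lib in libraries if not passes_qc(lib)]
print(failed)  # ['A2', 'A3']
```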
10. Description of results
According to the high-throughput sequencing results, the results obtained in the preset loading order were consistent with the preset sample layout of the WES experiment (A1-A3 are family 1, A4-A6 are family 2, A7-A8 are family 3, and B6-B8 are family 4). This indicates that, in the WES experiment, the actual loading order of the 2 rows of samples was the reverse of the preset order. The WES experiment did not need to be repeated, the cause of the experimental error was accurately identified, and the experimental cycle was effectively shortened.
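The comparison behind this kind of result can be sketched as a genotype concordance check between the tracking libraries and the WES data. The patent does not describe its analysis software, so everything below is an assumption for illustration: the function names, the best-match rule, and the toy genotypes (only the rs identifiers are taken from claim 1).

```python
def concordance(g1, g2):
    """Fraction of shared SNP sites with identical genotype calls (missing calls skipped)."""
    shared = [s for s in g1 if s in g2 and g1[s] is not None and g2[s] is not None]
    if not shared:
        return 0.0
    return sum(g1[s] == g2[s] for s in shared) / len(shared)

def best_match(wes_calls, tracking_calls):
    """Map each WES well to the tracked sample with the highest concordance."""
    result = {}
    for well, genotypes in wes_calls.items():
        scores = {sid: concordance(genotypes, t) for sid, t in tracking_calls.items()}
        best = max(scores, key=scores.get)
        result[well] = (best, scores[best])
    return result

# Toy genotypes (hypothetical): tracking libraries built in the preset order.
tracking = {
    "A1": {"rs1344": "AG", "rs4621": "CC", "rs4703": "CT", "rs8481": "GG"},
    "B1": {"rs1344": "AA", "rs4621": "CT", "rs4703": "TT", "rs8481": "AG"},
}
# WES calls labelled by preset well; here the two wells were loaded the other way round.
wes = {"A1": dict(tracking["B1"]), "B1": dict(tracking["A1"])}

matches = best_match(wes, tracking)
swapped = {well: sid for well, (sid, _) in matches.items() if sid != well}
print(swapped)  # {'A1': 'B1', 'B1': 'A1'}
```

Any WES well whose best-matching tracked sample differs from its own label is flagged, which is the pattern expected when rows are loaded in the wrong order. A real analysis would use all 26 sites and a concordance threshold chosen to tolerate genotyping errors.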
The foregoing description of the invention has been presented for purposes of illustration and description, and is not intended to be limiting. Several simple deductions, modifications or substitutions may also be made by a person skilled in the art to which the invention pertains, based on the idea of the invention.

Claims (9)

1. A biomarker for DNA sample tracking detection, characterized in that the biomarker is 26 nucleic acid segments, each containing one of 26 SNP sites, wherein the 26 SNP sites are specifically rs1344, rs3917981, rs857870, rs4478844, rs4621, rs1058900, rs1130598, rs4703, rs8481, rs4673, rs1132812, rs9930567, rs1045280, rs1135989, rs17626, rs354021, rs2304186, rs6554, rs2230267, rs11554159, rs5030878, rs3822585, rs9483504, rs11998387, rs11136343 and rs1071583.
2. The biomarker according to claim 1, wherein the 26 SNP sites are screened by the following method: SNP sites with a population frequency of 0.4-0.6 are screened, the screened SNP sites being located in regions that an exon panel can cover deeply; genetic distance analysis is performed on the screened SNP sites in 443 family samples with exon panel results; and the 26 SNP sites meeting the requirements are finally selected.
3. A kit for tracking detection of a DNA sample, comprising PCR amplification primers for detecting the 26 SNP sites in the biomarker of claim 1; the PCR amplification primers have the sequences shown in SEQ ID No.1 to SEQ ID No.52, in order.
4. The kit according to claim 3, wherein the kit further comprises: a PCR enzyme premix, nuclease-free water, an end repair reaction buffer, library adapters, a ligation reaction solution, a high-fidelity PCR enzyme, index primers and purification magnetic beads.
5. A method for tracking detection of a DNA sample, comprising: performing PCR amplification on the DNA sample to be tested using the PCR amplification primers of the kit of claim 3 or 4, performing high-throughput sequencing on the PCR amplification products, and analyzing the mutation status of the 26 SNP sites in the biomarker of claim 1 or 2 according to the high-throughput sequencing results, thereby determining whether the DNA sample to be tested is contaminated.
6. The method according to claim 5, wherein determining whether the DNA sample to be tested is contaminated comprises: comparing the detected mutation status of the SNP sites with the mutation status of the SNP sites that is known to be expected in the DNA sample to be tested, and determining whether the DNA sample to be tested is contaminated.
7. The method according to claim 5, wherein performing PCR amplification on the DNA sample to be tested and performing high-throughput sequencing on the PCR amplification products specifically comprises: performing PCR amplification on the DNA sample to be tested using a PCR enzyme premix; sequentially performing purification, end repair, adapter ligation and magnetic bead purification; amplifying the ligation products after magnetic bead purification to obtain a library meeting the on-machine concentration requirement; and performing high-throughput sequencing.
8. The method of claim 7, wherein the library meeting the on-machine concentration requirement passes Agilent 2100 quality control, with the fragment sizes of the PCR amplification products concentrated between 270 and 400 bp.
9. Use of the biomarker according to claim 1 or 2, or the kit according to claim 3 or 4, in the tracking detection of a DNA sample suspected of being contaminated in exome sequencing or genome sequencing.
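The 0.4-0.6 population frequency window in claim 2 can be motivated with a short calculation: sites with allele frequency near 0.5 are the most informative for telling individuals apart. The sketch below is not part of the patent. It assumes biallelic sites in Hardy-Weinberg equilibrium, independent (unlinked) sites, and unrelated individuals; the match probability for relatives is higher.

```python
def genotype_match_probability(p):
    """P(two unrelated individuals share a genotype) at one biallelic site under HWE."""
    q = 1.0 - p
    freqs = (p * p, 2 * p * q, q * q)  # genotype frequencies: AA, Aa, aa
    return sum(f * f for f in freqs)

def panel_match_probability(allele_freqs):
    """Probability of identical genotypes at every site, assuming independent sites."""
    prob = 1.0
    for p in allele_freqs:
        prob *= genotype_match_probability(p)
    return prob

per_site = genotype_match_probability(0.5)   # 0.375, the minimum over all p
edge = genotype_match_probability(0.4)       # about 0.3856 at the edge of the window
panel = panel_match_probability([0.5] * 26)  # 26 sites, idealized frequency of 0.5
print(per_site, round(edge, 4), f"{panel:.1e}")
```

Under these idealized assumptions, the chance that two unrelated samples agree at all 26 sites is on the order of 1e-11, which is why a small panel can distinguish the samples in a batch.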
CN202110653767.4A 2021-06-11 2021-06-11 Biomarker, method and application for DNA sample tracking detection Active CN113512595B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110653767.4A CN113512595B (en) 2021-06-11 2021-06-11 Biomarker, method and application for DNA sample tracking detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110653767.4A CN113512595B (en) 2021-06-11 2021-06-11 Biomarker, method and application for DNA sample tracking detection

Publications (2)

Publication Number Publication Date
CN113512595A CN113512595A (en) 2021-10-19
CN113512595B true CN113512595B (en) 2023-07-25

Family

ID=78065328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110653767.4A Active CN113512595B (en) 2021-06-11 2021-06-11 Biomarker, method and application for DNA sample tracking detection

Country Status (1)

Country Link
CN (1) CN113512595B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115394357B (en) * 2022-09-01 2023-06-30 杭州链康医学检验实验室有限公司 Site combination for judging sample pairing or pollution and screening method and application thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109022562A (en) * 2018-08-29 2018-12-18 天津诺禾致源生物信息科技有限公司 For detecting the screening technique of the SNP site of sample contamination and the method for detecting sample contamination in high-flux sequence
CN109280696A (en) * 2017-07-21 2019-01-29 安塞斯(北京)生物技术有限公司 The method of SNP detection technique fractionation mixing sample

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9394567B2 (en) * 2008-11-07 2016-07-19 Adaptive Biotechnologies Corporation Detection and quantification of sample contamination in immune repertoire analysis
WO2012145557A1 (en) * 2011-04-19 2012-10-26 Life Technologies Corporation Compositions and methods for detecting and identifying salmonella enterica strains
CN108823296B (en) * 2017-05-05 2021-12-21 深圳华大基因股份有限公司 Method and kit for detecting nucleic acid sample pollution and application
WO2020118046A1 (en) * 2018-12-05 2020-06-11 William Marsh Rice University Quantifying foreign dna in low-volume blood samples using snp profiling

Also Published As

Publication number Publication date
CN113512595A (en) 2021-10-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 518000 a3803, building 11, Shenzhen Bay science and technology ecological park, No. 16, Keji South Road, community, high tech Zone, Yuehai street, Nanshan District, Shenzhen, Guangdong

Applicant after: Shenzhen Yaji Technology Co.,Ltd.

Address before: 518000 room 1901, building 6, Baoneng Science Park, Qinghu community, Longhua street, Longhua District, Shenzhen City, Guangdong Province

Applicant before: AEGICARE (SHENZHEN) TECHNOLOGY CO.,LTD.

GR01 Patent grant