CN115717163A - Molecular coding detection system for monitoring and correcting sequencing pollution and application thereof
- Publication number: CN115717163A (application number CN202211328995.5A)
- Authority: CN (China)
- Prior art keywords: coding, nucleic acid, acid sequence, batch, pollution
- Legal status: Granted (as listed by the source page; an assumption, not a legal conclusion)
Abstract
The invention discloses a molecular coding detection system for monitoring and correcting sequencing contamination and application thereof. The molecular coding detection system comprises at least one insertion coding nucleic acid sequence, which comprises a backbone sequence region of known sequence and at least one variable coding region. The variable coding region is a random sequence consisting of any one or at least two of A, T, C or G, the variable coding regions are randomly distributed within the backbone sequence region, and the insertion coding nucleic acid sequence is single-stranded or double-stranded. The invention designs an insertion coding nucleic acid sequence with a specific structure, labels the sample to be tested with it, and analyzes the raw high-throughput sequencing data, so that cross-contamination among samples within a short-term batch and historical environmental contamination arising from long-term batch testing can be identified quickly and effectively.
Description
Technical Field
The invention belongs to the technical field of gene sequencing, and relates to a molecular coding detection system for monitoring and correcting sequencing pollution and application thereof.
Background
Next-generation sequencing (NGS) has become an emerging technology for modern biological research and medical diagnosis because of its high throughput, large sample capacity, ultra-high sensitivity, ability to detect multiple analysis targets simultaneously, and low per-sample analysis cost. NGS-based diagnostic products are increasingly approved by medical regulators and have been commercialized, standardized and industrialized. However, the long and complex NGS workflow, batch library construction and centralized testing give rise to sample contamination, which is a hidden hazard for industrial-scale diagnosis.
Contamination in NGS testing generally comes from three sources: (1) sample-processing contamination, including sample information errors and cross-contamination during sample collection and nucleic acid extraction; (2) testing-process contamination, typically reagent contamination such as adapter index contamination during the complex library construction process, or carry-over or cross-contamination between library construction intermediates, which is especially common when a large number of samples in the same batch are processed in parallel; (3) testing-environment contamination, caused by high concentrations of aerosolized contaminating molecules in the testing environment.
The existing pooling approach for centralized sequencing labels libraries with molecular tags: each library is built independently using adapters or primers that carry an additional library identification sequence, and after the sequencing run the sample data are separated by tracing back the tag information. Any contamination introduced during pooling is carried into sequencing and cannot be identified or pre-processed by post-run data quality control; whether contamination occurred during the handling of a given sample can only be detected after the data have been analyzed. Cross-contamination of the sample tag reagents themselves during library construction can even create artificial false contamination, that is, data contamination. Existing methods for identifying and monitoring sample contamination are mainly passive: they analyze the sex of the patient sample, or the consistency, impurity and similar properties of genetic SNPs between a reference sample and the test sample. Whether a sample is contaminated is known only after the analysis is complete, and the source of contamination cannot be traced afterwards. These methods also cannot be applied when there is no control sample or when the targeted sequencing panel is small. Industrial-scale NGS testing therefore needs a new system to solve these sample contamination problems.
In summary, providing a method for monitoring, identifying and correcting contamination of high-throughput sequencing samples is of great significance to the technical field of gene sequencing.
Disclosure of Invention
Aiming at the defects and actual requirements of the prior art, the invention provides a molecular coding detection system for monitoring and correcting sequencing pollution and application thereof.
In order to achieve the purpose, the invention adopts the following technical scheme:
In a first aspect, the present invention provides a molecular coding detection system for monitoring and correcting sequencing contamination. The molecular coding detection system comprises at least one insertion coding nucleic acid sequence, which comprises a backbone sequence region of known sequence and at least one variable coding region. The variable coding region is a random sequence composed of any one or at least two of A, T, C or G, the variable coding regions are randomly distributed within the backbone sequence region, and the insertion coding nucleic acid sequence is single-stranded or double-stranded.
In the invention, an insertion coding nucleic acid sequence with a specific structure is designed. One part is a fixed, known reference backbone sequence, used for mapping reads back to the reference during information recovery; the other part is the variable coding region, used to encode sample-specific information so that contamination can be identified. The sample to be tested is labeled with the insertion coding nucleic acid sequence and the raw high-throughput sequencing data are analyzed, so that cross-contamination among samples within a short-term batch and historical environmental contamination arising from long-term batch testing can be identified quickly and effectively. The insertion coding nucleic acid sequence can also serve as a set of standard NGS reagents for assessing the quality of a testing laboratory and for cleaning, correcting and remedying test results affected by sample contamination without retesting.
In the invention, a known sequence is selected as the backbone sequence region such that it has no homology with the sample to be tested.
Preferably, the length of the insertion-encoding nucleic acid sequence is 100-2000 bp, including but not limited to 101bp, 102bp, 103bp, 104bp, 105bp, 120bp, 200bp, 220bp, 240bp, 260bp, 280bp, 300bp, 500bp, 800bp, 1000bp, 1200bp, 1300bp, 1400bp, 1600bp, 1700bp, 1800bp, 1900bp, 1950bp, 1980bp, 1990bp, 1995bp, 1998bp or 1999bp, preferably 200-300 bp.
Preferably, the length of the variable coding region is 1-20 bp, including but not limited to 2bp, 3bp, 4bp, 5bp, 6bp, 7bp, 8bp, 10bp, 12bp, 15bp, 16bp, 17bp, 18bp or 19bp, and the number of variable coding regions is 1-4.
Preferably, depending on the variable coding region, the insertion coding nucleic acid sequences are classified as insertion coding nucleic acid sequences for identifying between-batch contamination or insertion coding nucleic acid sequences for identifying within-batch contamination.
Preferably, the length of the variable coding region in the insertion coding nucleic acid sequence for identifying between-batch contamination is different from the length of the variable coding region in the insertion coding nucleic acid sequence for identifying within-batch contamination.
In the present invention, the length of the insertion coding nucleic acid sequence for identifying within-batch contamination can be designed as required. It may be 100 to 2000 bases, preferably 200 to 300 bases, and more preferably 240 bases. The total length of its variable coding regions is generally 1 to 4 bases, distributed over 1 to 4 positions; preferably each coding region is 1 base long and the regions are distributed over 4 positions of the nucleic acid sequence.
In the present invention, the variable coding region of the insertion coding nucleic acid sequence for identifying between-batch contamination may be 1 to 20 bases long, preferably 5 bases. Preferably, the coding region and coding scheme used for between-batch identification differ from those used for within-batch sample identification; for example, the variable coding region of the batch identification sequence is a contiguous run of bases. More preferably, to guard against signal noise or information loss caused by sequencing errors or uneven sequencing depth, the batch identification sequence has two independent contiguous regions carrying the same code, which adds a filtering condition when the coding information is extracted and improves its reliability.
Preferably, the length of the variable coding region in the insertion coding nucleic acid sequence for identifying within-batch contamination is 1bp, and the number of the variable coding regions is 4.
Preferably, the length of the variable coding region in the insertion coding nucleic acid sequence for identifying between-batch contamination is 5bp, and the number of the variable coding regions is 2.
Preferably, the insertion coding nucleic acid sequence for identifying between-batch contamination comprises the sequence shown in SEQ ID NO.1.
Preferably, the insertion coding nucleic acid sequence for identifying within-batch contamination comprises the sequence shown in SEQ ID NO.2.
SEQ ID NO.1:
CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACTCCNNNNACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACNNNNAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATT
SEQ ID NO.2:
CGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGNCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCANGATCTCCTGTCATCCCACCTTGCTCCTGCCGAGAAAGTATCCATCATGNCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATNCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGA
Here, N is any one of A, T, C and G.
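As a quick consistency check on the two templates above, the following Python sketch locates every run of N and reports its 1-based position. The function name variable_regions and the variable names are illustrative assumptions; they are not part of the invention.

```python
import re

# Templates copied from SEQ ID NO.1 and SEQ ID NO.2 (N marks a variable base).
SEQ_ID_NO_1 = (
    "CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACTCC"
    "NNNN"
    "ACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCC"
    "CTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAAC"
    "NNNN"
    "AACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATT"
)
SEQ_ID_NO_2 = (
    "CGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTG"
    "N"
    "CACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCA"
    "N"
    "GATCTCCTGTCATCCCACCTTGCTCCTGCCGAGAAAGTATCCATCATG"
    "N"
    "CTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCAT"
    "N"
    "CGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGA"
)


def variable_regions(template: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of each run of N."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"N+", template)]


if __name__ == "__main__":
    for name, seq in (("SEQ ID NO.1", SEQ_ID_NO_1), ("SEQ ID NO.2", SEQ_ID_NO_2)):
        print(name, len(seq), variable_regions(seq))
```

Running it lists the template length and the coordinates of each variable coding region, which can be compared with the positions given in Example 1.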
Preferably, the molecular coding detection system further comprises a coding information recovery system.
Preferably, the coding information recovery system comprises a probe or primer complementary to the insertion coding nucleic acid sequence.
Depending on the application scenario, the coding information recovery system can be implemented in different ways, such as liquid-phase hybridization capture or amplicon primer amplification. In some embodiments, recovery probes specific to the insertion coding nucleic acid sequence are added to the hybridization capture probe library. The probes consist of bases matching the insertion coding nucleic acid sequence and, in the variable coding region, preferably consist of degenerate complementary bases. The probe length may be 50 to 200 bases, preferably 120 bases, and the number of probes may be any number from 1 to 1000. The working concentration of the recovery probes may be 0.1 nM to 10 nM. Each recovery probe carries one or more biotin labels to facilitate recovery.
Specifically, FIG. 1 shows the working principle of probe-enrichment library construction with a fragmentation step, together with the insertion coding nucleic acid sequence and the recovery system: genomic DNA and the insertion coding nucleic acid sequence 101 are fragmented by sonication into fragments 102 of about 150-200 bp; after end repair, library adapters are ligated to both ends to form the pre-amplification library 103; during liquid-phase hybridization capture, the insertion coding nucleic acid sequence fragments and the genomic fragments containing the target sequence bind to the coding nucleic acid sequence recovery probes and the gene-specific probes 104, respectively, completing capture.
FIG. 2 shows the working principle of probe-enrichment library construction without a fragmentation step, together with the insertion coding nucleic acid sequence and the recovery system; for some sequencing sample types, such as ctDNA, no fragmentation step is required. The substrate DNA, part of which contains the target sequence, is mixed with the coding nucleic acid sequence 201 and ligated to adapters to form the pre-amplification library 202; during liquid-phase hybridization capture, the insertion coding nucleic acid sequence fragments and the genomic fragments containing the target sequence bind to the coding nucleic acid sequence recovery probes and the gene-specific probes 203, respectively, completing capture.
In other embodiments, the recovery system consists of primers matching 10-30 bases at the 5' end and the 3' end of the insertion coding nucleic acid sequence, preferably 18-25 bases in length, and the working concentration of the recovery primers may be 0.1-10 μM. Specifically, FIG. 3 shows the working principle of amplicon-enrichment library construction together with the insertion coding nucleic acid sequence and the recovery system: genomic DNA containing the target sequence is mixed with the coding nucleic acid sequence 301; in the first round of PCR, the target-gene-specific primer pairs and the insertion coding nucleic acid sequence recovery primer pair, each modified at the 5' end with a universal sequence and a sequencing primer sequence, bind to the genomic fragments and the coding nucleic acid sequence 302, respectively; the second round of PCR uses amplification primers that carry the P5 or P7 sequence and an index sequence at the 5' end and match the universal sequence at the 5' end of the first-round primers, completing library construction 303.
Preferably, the probes may have a length of 50 to 200bp and a number of 1 to 1000.
Preferably, the length of the primer is 18 to 25bp.
Preferably, the probes for the insertion coding nucleic acid sequence that identifies between-batch contamination have nucleic acid sequences selected from the sequences shown in SEQ ID NO.3 and/or SEQ ID NO.4.
SEQ ID NO.3:
CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACTCCNNNNACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCC-biotin
SEQ ID NO.4:
CTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACNNNNAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATT-biotin
Preferably, the probes for the insertion coding nucleic acid sequence that identifies within-batch contamination have nucleic acid sequences selected from the sequences shown in SEQ ID NO.5 and/or SEQ ID NO.6.
SEQ ID NO.5:
CGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGNCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCANGATCTCCTGTCATCCCACCTTG-biotin
SEQ ID NO.6:
CTCCTGCCGAGAAAGTATCCATCATGNCTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATNCGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGA-biotin
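The four probe sequences above can be checked against the templates with a short Python sketch. It tests whether each probe pair, joined in order, reproduces the corresponding 240-base template, that is, whether the probes are the two 120-base halves of the listed strand. The helper name tiles_template is an illustrative assumption, and the biotin label is omitted because it is not a base.

```python
# Templates (SEQ ID NO.1, SEQ ID NO.2) and probes (SEQ ID NO.3 to NO.6),
# copied from the sequence listing; the 3' biotin label is left out.
SEQ_ID_NO_1 = (
    "CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACTCCNNNN"
    "ACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCC"
    "CTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACNNNN"
    "AACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATT"
)
SEQ_ID_NO_2 = (
    "CGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGN"
    "CACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAN"
    "GATCTCCTGTCATCCCACCTTGCTCCTGCCGAGAAAGTATCCATCATGN"
    "CTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATN"
    "CGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGA"
)
PROBE_3 = (
    "CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACTCCNNNN"
    "ACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCC"
)
PROBE_4 = (
    "CTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACNNNN"
    "AACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATT"
)
PROBE_5 = (
    "CGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGN"
    "CACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAN"
    "GATCTCCTGTCATCCCACCTTG"
)
PROBE_6 = (
    "CTCCTGCCGAGAAAGTATCCATCATGN"
    "CTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATN"
    "CGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGA"
)


def tiles_template(template: str, probes: list[str]) -> bool:
    """True if the probes, joined in the given order, equal the template."""
    return "".join(probes) == template


if __name__ == "__main__":
    print(tiles_template(SEQ_ID_NO_1, [PROBE_3, PROBE_4]))
    print(tiles_template(SEQ_ID_NO_2, [PROBE_5, PROBE_6]))
```

Because the probes are written in the same orientation as the listed strand, they would hybridize to the opposite strand of a double-stranded template.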
Preferably, the primers for the insertion coding nucleic acid sequence that identifies between-batch contamination have nucleic acid sequences comprising the sequences shown in SEQ ID NO.7 and SEQ ID NO.8.
SEQ ID NO.7:
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCTAAATCGGGGGCTCCCTTTAGG
SEQ ID NO.8:
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGAATAGGCCGAAATCGGCAAAATCCCT
Preferably, the primers for the insertion coding nucleic acid sequence that identifies within-batch contamination have nucleic acid sequences comprising the sequences shown in SEQ ID NO.9 and SEQ ID NO.10.
SEQ ID NO.9:
TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCGTGGCTGGCCACGACGGGCGTTCCTT
SEQ ID NO.10:
GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTCCGAGTACGTGCTCGCTCGATGCGA
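To see how the recovery primers relate to the templates, the sketch below measures how many 3'-terminal bases of each primer match the start of the template strand (forward primer) or the start of its reverse complement (reverse primer). The remaining 5' bases are the fused library-construction sequence. The function names are illustrative assumptions.

```python
SEQ_ID_NO_1 = (
    "CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACTCCNNNN"
    "ACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCC"
    "CTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACNNNN"
    "AACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATT"
)
SEQ_ID_NO_2 = (
    "CGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCTGTGCTCGACGTTGN"
    "CACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGGGGCAN"
    "GATCTCCTGTCATCCCACCTTGCTCCTGCCGAGAAAGTATCCATCATGN"
    "CTGATGCAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATN"
    "CGACCACCAAGCGAAACATCGCATCGAGCGAGCACGTACTCGGA"
)
PRIMER_7 = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCTAAATCGGGGGCTCCCTTTAGG"
PRIMER_8 = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGAATAGGCCGAAATCGGCAAAATCCCT"
PRIMER_9 = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGCGTGGCTGGCCACGACGGGCGTTCCTT"
PRIMER_10 = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTCCGAGTACGTGCTCGCTCGATGCGA"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (N stays N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def annealing_length(primer: str, template: str, reverse: bool = False) -> int:
    """Length of the longest primer 3' suffix equal to the start of the
    template (forward) or of its reverse complement (reverse)."""
    strand = reverse_complement(template) if reverse else template
    for k in range(min(len(primer), len(strand)), 0, -1):
        if primer.endswith(strand[:k]):
            return k
    return 0


if __name__ == "__main__":
    print(annealing_length(PRIMER_7, SEQ_ID_NO_1))
    print(annealing_length(PRIMER_8, SEQ_ID_NO_1, reverse=True))
    print(annealing_length(PRIMER_9, SEQ_ID_NO_2))
    print(annealing_length(PRIMER_10, SEQ_ID_NO_2, reverse=True))
```

The printed lengths can be compared with the template-matching lengths quoted for the recovery primers in Example 1.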
In the present invention, the molecular coding detection system for monitoring and correcting sequencing contamination can be applied throughout the entire high-throughput sequencing workflow, for example starting from sample nucleic acid extraction. To this end, the insertion coding nucleic acid sequence can be pre-loaded into the nucleic acid container in an amount of 1% to 1000%, preferably 10% to 100%, of the number of sample molecules.
In a second aspect, the present invention provides the use of the molecular coding detection system of the first aspect for monitoring and correcting sequencing contamination in the preparation of a genetic sequencing product.
The molecular coding detection system for monitoring and correcting sequencing pollution designed by the invention can be effectively applied to preparing sequencing products and used as a component for monitoring and correcting sequencing pollution.
In a third aspect, the present invention provides a sequencing kit comprising the molecular coding detection system of the first aspect for monitoring and correcting sequencing contamination.
In a fourth aspect, the present invention provides the use of the molecular coding detection system of the first aspect for monitoring and correcting sequencing contamination in gene sequencing.
In a fifth aspect, the present invention provides a method of monitoring and correcting sequencing contamination, the method comprising:
mixing the molecular coding detection system for monitoring and correcting sequencing pollution and a sample to be detected, constructing and purifying a library, sequencing the purified library, and performing data analysis and pollution correction according to a sequencing result.
The criterion for calling contamination is as follows: coding sequences other than the sample's unique code appear in all variable coding regions of the insertion coding nucleic acid sequence, and the suspected contaminating sequence is supported by more than 3 reads.
Contamination correction includes tracing back to the sample indicated by the contaminating code and comparing against that sample to remove false-positive mutations from the affected sample.
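The text does not spell out how the comparison is performed. One possible reading, sketched below purely as an assumption, is to compare the variant calls of the affected sample with those of the traced source sample and set aside the shared calls as candidate false positives. The variant representation and the function name split_suspect_variants are hypothetical.

```python
# Hypothetical variant representation: (chromosome, position, ref, alt).
Variant = tuple[str, int, str, str]


def split_suspect_variants(
    target_calls: list[Variant], source_calls: list[Variant]
) -> tuple[list[Variant], list[Variant]]:
    """Split the affected sample's calls into (kept, suspect).

    ASSUMPTION: a call also present in the traced contaminating sample is
    treated as a candidate false positive. The source text only says that a
    comparison is made; it does not prescribe this rule.
    """
    source = set(source_calls)
    kept = [v for v in target_calls if v not in source]
    suspect = [v for v in target_calls if v in source]
    return kept, suspect


if __name__ == "__main__":
    target = [("chr7", 100, "A", "T"), ("chr12", 200, "G", "A")]
    source = [("chr12", 200, "G", "A")]
    print(split_suspect_variants(target, source))
```

A real workflow would likely also weigh allele fractions against the measured contamination level, since a variant can be genuinely shared by two samples.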
In the present invention, the flow chart of the method for monitoring and correcting sequencing contamination is shown in FIG. 4. The analysis maps the sequencing data to the reference genome and to the reference coding nucleic acid sequence, then filters the recovered insertion coding nucleic acid sequences and computes effective depth statistics. Filtering and contamination identification of the insertion coding nucleic acid sequence data proceed as follows: read mapping, extraction of the between-batch and within-batch variable coding region sequences, removal of duplicate sequences, and contamination identification. The contamination identification conditions are:
(1) Between-batch contamination: coding sequences other than the sample's unique code appear in the variable coding regions of the between-batch insertion coding nucleic acid sequence, the suspected contaminating code is present in all variable coding regions simultaneously, and the suspected contaminating sequence is supported by more than 3 reads;
(2) Within-batch contamination: (a) coding sequences other than the sample's unique code appear simultaneously in all variable coding regions of the within-batch insertion coding nucleic acid sequence, with more than 3 reads of the suspected contaminating code per coding region, and (b) the suspected contaminating code can be traced to a sample within the batch.
The effective depth statistic (contamination index) is the ratio of the effective depth of the target sample's unique code to the total effective depth of all recovered codes.
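As a minimal worked example of this ratio, assuming the same per-code table of deduplicated read counts as input (the function name contamination_index is illustrative):

```python
def contamination_index(code_counts: dict[str, int], own_code: str) -> float:
    """Effective depth of the sample's own code divided by the total
    effective depth of all recovered codes."""
    total = sum(code_counts.values())
    if total == 0:
        raise ValueError("no coding sequences were recovered")
    return code_counts.get(own_code, 0) / total


if __name__ == "__main__":
    print(contamination_index({"ACGT": 90, "TTGA": 10}, "ACGT"))
```

Under this definition a value close to 1 means that almost all recovered codes belong to the target sample, and lower values indicate a larger share of foreign codes.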
Compared with the prior art, the invention has the following beneficial effects:
In the invention, an insertion coding nucleic acid sequence with a specific structure is designed, the sample to be tested is labeled with it, and the raw high-throughput sequencing data are analyzed, so that cross-contamination among samples within a short-term batch and historical environmental contamination arising from long-term batch testing can be identified quickly and effectively. The insertion coding nucleic acid sequence can also serve as a set of standard NGS reagents for assessing the quality of a testing laboratory and for cleaning, correcting and remedying test results affected by sample contamination without retesting.
Drawings
FIG. 1 is a working schematic of probe-enrichment library construction with a fragmentation step, showing the insertion coding nucleic acid sequence and the recovery system;
FIG. 2 is a working schematic of probe-enrichment library construction without a fragmentation step, showing the insertion coding nucleic acid sequence and the recovery system;
FIG. 3 is a working schematic of amplicon-enrichment library construction, showing the insertion coding nucleic acid sequence and the recovery system;
FIG. 4 is a flow chart of a method for monitoring and correcting sequencing contamination;
FIG. 5 is a graph showing the result of performance verification of the molecular coding assay system for monitoring and correcting sequencing contamination according to the present invention.
Detailed Description
To further illustrate the technical means adopted by the present invention and the effects thereof, the present invention is further described below with reference to the embodiments and the accompanying drawings. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and that no limitation of the invention is intended.
Where specific techniques or conditions are not indicated in the examples, they follow those described in the literature in the field or in the product specifications. Reagents or instruments for which no manufacturer is indicated are conventional products commercially available through regular channels.
Example 1
This example designs the molecular coding detection system for monitoring and correcting sequencing contamination.
1. Design of an inserted coding nucleic acid sequence
The insertion-type coding nucleic acid sequence is double-stranded DNA, and its backbone is an exogenous artificial sequence. The insertion-type coding nucleic acid sequence for identifying inter-batch contamination is shown in SEQ ID NO.1. It consists of 240 bases, and a BLAST search shows no homology with the human genome. Bases 58 to 61 and bases 179 to 182 are the variable coding regions, each written as NNNN, where each N is any one of A, T, C and G; the two regions carry exactly the same sequence. The combinations of N therefore give 4 × 4 × 4 × 4 = 256 different insertion-type coding nucleic acid sequences. The capture recovery probes are single-stranded DNA and consist of two 120-base sequences matching bases 1 to 120 and bases 121 to 240 of SEQ ID NO.1, respectively; the positions matching the variable coding regions are degenerate bases, and each probe carries a biotin label on its 3'-terminal base. The specific probe sequences are shown in SEQ ID NO.3 and SEQ ID NO.4. The amplicon recovery primers for the coding nucleic acid sequence are a forward and a reverse primer corresponding to the 23 bases at the 5' end and the 23 bases at the 3' end of SEQ ID NO.1, respectively; the 5' end of each primer carries a fusion sequence matching the library construction primers. The specific primer sequences are shown in SEQ ID NO.7 and SEQ ID NO.8.
The insertion-type coding nucleic acid sequence for identifying intra-batch contamination is shown in SEQ ID NO.2. It consists of 240 bases, and a BLAST search shows no homology with the human genome. Bases 49, 98, 147 and 196 are the variable coding regions N: base 49 is A, T, C or G; base 98 is T or C; base 147 is A, C or G; and base 196 is A, T, C or G. The combinations of N give 4 × 2 × 3 × 4 = 96 different insertion-type coding nucleic acid sequences. The capture recovery probes are single-stranded DNA and consist of two 120-base sequences matching bases 1 to 120 and bases 121 to 240 of SEQ ID NO.2, respectively; the variable coding regions are degenerate bases, and the 3' end carries a biotin label. The specific probe sequences are shown in SEQ ID NO.5 and SEQ ID NO.6. The amplicon recovery primers are a forward and a reverse primer corresponding to the 27 bases at the 5' end and the 26 bases at the 3' end of SEQ ID NO.2, respectively; the 5' end of each primer carries a fusion sequence matching the library construction primers. The specific primer sequences are shown in SEQ ID NO.9 and SEQ ID NO.10.
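The two code spaces described above can be enumerated directly. The following is a minimal sketch rather than part of the disclosed method; the variable names are illustrative assumptions, and only the positions and allowed bases are taken from the text.

```python
from itertools import product

BASES = "ATCG"

# Inter-batch code: one 4-base NNNN block, repeated identically at
# bases 58-61 and 179-182 of SEQ ID NO.1.
inter_batch_codes = ["".join(p) for p in product(BASES, repeat=4)]

# Intra-batch code: one base at each of positions 49, 98, 147 and 196 of
# SEQ ID NO.2, restricted to the bases allowed at each position.
INTRA_ALLOWED = {49: "ATCG", 98: "TC", 147: "ACG", 196: "ATCG"}
intra_batch_codes = [
    "".join(p)
    for p in product(*(INTRA_ALLOWED[pos] for pos in sorted(INTRA_ALLOWED)))
]

print(len(inter_batch_codes))  # 256
print(len(intra_batch_codes))  # 96
```

Each intra-batch code is written here as a 4-letter string in 5' to 3' order, so the tube used in Example 2 (A, T, C and C) corresponds to "ATCC".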
2. Chemical synthesis
The 256 insertion-type coding nucleic acid sequences designed to identify inter-batch contamination and the 96 insertion-type coding nucleic acid sequences designed to identify intra-batch contamination, together with their matched capture recovery probes and recovery primers, were entrusted to Integrated DNA Technologies for synthesis and supplied as dry powders.
3. Stock solution preparation and quantification
The insertion-type double-stranded nucleic acid sequences and their matched recovery probes and primers are dissolved in ultrapure water according to the instructions for the synthesized products to prepare 100 μM stock solutions. The insertion-type double-stranded nucleic acids are then serially diluted to the production concentration of the pre-made solution, 30000 copies/μL; 2 μL of the pre-made solution is added to the bottom of a 1.5 mL EP tube, and the pre-made tube is stored in a refrigerator at -80 ℃.
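The dilution arithmetic implied by these figures can be checked with a short calculation. This is only a sketch: it assumes that the 100 μM stock concentration refers to intact duplex molecules, and the function and variable names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_microlitre(conc_micromolar: float) -> float:
    """Convert a micromolar concentration into molecules (copies) per microlitre."""
    mol_per_litre = conc_micromolar * 1e-6
    return mol_per_litre * AVOGADRO / 1e6  # 1 L = 1e6 uL


stock_copies = copies_per_microlitre(100.0)     # 100 uM stock solution
working_copies = 30000.0                        # copies/uL in the pre-made solution
dilution_factor = stock_copies / working_copies
copies_per_tube = working_copies * 2.0          # 2 uL dispensed per pre-made tube

print(f"stock: {stock_copies:.3e} copies/uL")
print(f"overall dilution: {dilution_factor:.3e}-fold")
print(f"copies per pre-made tube: {copies_per_tube:.0f}")
```

Under this assumption the overall dilution is on the order of 10^9-fold, which is why it is reached by serial rather than single-step dilution.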
Example 2
This example tests the fragmentation and capture-enrichment library construction method (targeted panel). The working principle is shown in FIG. 1, and the test comprises the following steps:
(1) Sample DNA extraction: the tested sample is an FFPE sample, and the DNA extraction kit is the QIAamp DNA FFPE Tissue Kit. 200 ng of the extracted DNA is quantified and added to a pre-made tube. In this pre-made tube, the variable coding region of the insertion-type coding nucleic acid sequence for identifying inter-batch contamination is AGGT, and the variable coding regions of the insertion-type coding nucleic acid sequence for identifying intra-batch contamination are A, T, C and C, in 5' to 3' order. After the DNA is added, the pre-made tube is gently vortexed for 30 s;
(2) Library construction: the library construction reagents used in this example were purchased from NEB, and the probe hybridization reagents from IDT;
a. Nucleic acid fragmentation: the DNA is made up to 50 μL with 1× TE buffer and fragmented with a Covaris M220 ultrasonicator according to the settings in Table 1;
TABLE 1
Duty factor | 10% |
Peak power | 75 |
Cycles per burst | 200 |
Treatment time | 100-330s |
Water bath temperature | 18-20℃ |
b. Repair of fragmented DNA: the reaction solution is prepared according to Table 2 and incubated at 20 ℃ for 15 min;
TABLE 2
Fragmented FFPE DNA | 50μL |
FFPE DNA Repair Buffer | 6.5μL |
NEBNext FFPE DNA Repair Mix | 2μL |
Ultra-pure water | 3.5μL |
Total | 62μL |
c. Magnetic bead purification and end repair: the nucleic acid is purified with AMPure XP beads, and end repair is performed with the NEBNext Ultra II End Prep kit; the repair reaction system and the PCR program are shown in Tables 3 and 4;
TABLE 3
FFPE DNA | 50μL |
NEBNext Ultra II End Prep Buffer | 7μL |
NEBNext Ultra II End Prep enzyme mix | 3μL |
Total | 60μL |
TABLE 4
Step | Temperature | Time |
Cycle 1 | 20℃ | 30min |
Cycle 2 | 65℃ | 30min |
Cycle 3 | 4℃ | Hold |
d. Adapter ligation: the ligation reaction is prepared according to the reaction system in Table 5 and incubated at 20 ℃ for 15 min;
TABLE 5
DNA Repair Reaction Mixture | 60μL |
NEBNext Ultra II Ligation Master Mix | 30μL |
NEBNext Ligation Enhancer | 1μL |
Duplex Seq Adapters | 2μL |
Total | 93μL |
e. Library fragment size selection and pre-amplification: fragments are size-selected with AMPure XP beads, and the adapter-ligated library is pre-amplified according to the reaction system in Table 6 and the reaction conditions in Table 7;
TABLE 6
NEBNext Ultra II Q5 Master Mix | 25μL |
UDI Primer Mix | 5μL |
Total | 30μL |
TABLE 7
f. Hybrid capture: the target sequences and the coding nucleic acid sequences are captured according to the reaction system in Table 8 and the reaction conditions in Table 9;
TABLE 8
2X Hybridization Buffer | 8.5μL |
Hybridization Buffer Enhancer | 2.7μL |
Targeting gene panel | 4μL |
Inter-batch and intra-batch coding nucleic acid recovery probes | 1.8μL |
Total | 17μL |
TABLE 9
Step | Temperature | Time |
Cycle 1 | 95℃ | 30s |
Cycle 2 | 65℃ | 4h |
Cycle 3 | 65℃ | Hold |
g. Streptavidin magnetic bead recovery, and amplification and purification of the capture library: the hybrid-captured sequences are recovered and washed according to the Dynabeads M-270 instructions, and the capture library is amplified according to the reaction system in Table 10 and the reaction conditions in Table 11;
TABLE 10
Library PCR Master Mix(2×) | 25μL |
Illumina P5/P7 Primer Mix(10×) | 5μL |
Dynabeads | 20μL |
Total | 50μL |
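As a quick consistency check on the reaction systems listed so far, the component volumes can be summed and compared with the stated totals. The sketch below simply transcribes the volumes from Tables 2, 3, 5, 6, 8 and 10; the dictionary layout and function name are illustrative assumptions.

```python
import math

# Component volumes (uL) and stated totals, transcribed from the tables above.
REACTIONS = {
    "Table 2": ([50, 6.5, 2, 3.5], 62),
    "Table 3": ([50, 7, 3], 60),
    "Table 5": ([60, 30, 1, 2], 93),
    "Table 6": ([25, 5], 30),
    "Table 8": ([8.5, 2.7, 4, 1.8], 17),
    "Table 10": ([25, 5, 20], 50),
}


def mismatched_tables(reactions):
    """Return the names of tables whose components do not add up to the stated total."""
    return [
        name
        for name, (volumes, total) in reactions.items()
        if not math.isclose(sum(volumes), total, abs_tol=1e-9)
    ]


# An empty list means every listed reaction system is internally consistent.
print(mismatched_tables(REACTIONS))
```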
TABLE 11
h. Library purification and quantification
After amplification, the library is purified with AMPure XP beads; after purification, the library is quantified with Qubit 3.0.
(3) Sequencing
Sequencing is performed on a NovaSeq 6000 high-throughput sequencer with PE150 read length, at a sequencing depth of 10000×. The data are mapped to the reference genome and to the reference coding nucleic acid sequences, and the recovered insertion-type coding nucleic acid sequences are filtered and their effective depth is counted. The insertion-type coding nucleic acid data are filtered and contamination is judged in the following steps: data alignment, extraction of the inter-batch and intra-batch variable coding region sequences, removal of duplicate sequences, and contamination identification. The contamination identification conditions are as follows:
1) Inter-batch contamination: a coding sequence other than the sample's unique code appears in the variable coding regions of the inter-batch insertion-type coding sequence, the suspected contamination code is present in the first and the second variable coding region simultaneously, and the number of reads of the suspected contamination sequence exceeds 3;
2) Intra-batch contamination: (a) coding sequences other than the sample's unique code appear simultaneously in variable coding regions 1, 2, 3 and 4 of the intra-batch insertion-type coding sequence, and the number of reads of the suspected contamination coding sequence exceeds 3 in each variable coding region; (b) the suspected contamination coding sequence can be traced to a sample in the same batch.
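The identification conditions above can be sketched as a small filter over deduplicated read counts. This is an illustration under stated assumptions, not the patented implementation: the input layout (each observed code mapped to a list of per-region read counts), the function name, and the sample names are hypothetical, and applying the "exceeds 3" threshold to the combined count for inter-batch codes is one possible reading of the text.

```python
from typing import Dict, List, Optional

MIN_READS = 3  # the text requires the suspected read count to exceed 3


def suspected_codes(
    own_code: str,
    support: Dict[str, List[int]],
    per_region_threshold: bool,
    batch_codes: Optional[Dict[str, str]] = None,
) -> Dict[str, Optional[str]]:
    """Map each suspected contaminating code to its source sample (or None)."""
    hits: Dict[str, Optional[str]] = {}
    for code, region_counts in support.items():
        if code == own_code:
            continue  # the sample's own unique code is never contamination
        if any(n == 0 for n in region_counts):
            continue  # a foreign code must be seen in every variable coding region
        if per_region_threshold:
            enough = all(n > MIN_READS for n in region_counts)  # intra-batch rule (a)
        else:
            enough = sum(region_counts) > MIN_READS  # assumed reading for inter-batch
        if not enough:
            continue
        if batch_codes is None:
            hits[code] = None
        elif code in batch_codes:
            hits[code] = batch_codes[code]  # intra-batch rule (b): traceable in batch
    return hits


# Hypothetical counts: inter-batch codes have two regions, intra-batch codes four.
inter = suspected_codes(
    "AGGT",
    {"AGGT": [4500, 4480], "ATAT": [5, 4], "CCGA": [2, 0]},
    per_region_threshold=False,
)
intra = suspected_codes(
    "ATCC",
    {"ATCC": [4600, 4600, 4600, 4600], "TCCT": [6, 5, 7, 4], "GTAG": [9, 8, 2, 7]},
    per_region_threshold=True,
    batch_codes={"TCCT": "sample_07", "GTAG": "sample_11"},
)
print(inter)
print(intra)
```

In this example only ATAT is flagged as inter-batch contamination, because CCGA is missing from the second region, and only TCCT is flagged as intra-batch contamination, because GTAG falls below the threshold in one region.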
Contamination index statistics: the contamination index is the ratio of the effective depth of the target sample's unique code to the total effective depth of all recovered codes.
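The index defined above is a simple ratio. The sketch below assumes the effective depths are available as a mapping from code to depth; the function name and the depth of the foreign code (23) are hypothetical, while 4567 is the inter-batch effective depth reported in Table 12.

```python
from typing import Dict


def contamination_index(own_code: str, depth_by_code: Dict[str, float]) -> float:
    """Share of the total recovered code depth that belongs to the sample's own code."""
    total = sum(depth_by_code.values())
    if total <= 0:
        raise ValueError("no coding sequence was recovered")
    return depth_by_code.get(own_code, 0.0) / total


clean = contamination_index("AGGT", {"AGGT": 4567})
mixed = contamination_index("AGGT", {"AGGT": 4567, "ATAT": 23})
print(clean, mixed)
```

With this definition a value of 1 means that only the sample's own code was recovered, and 1 minus the index is the share of depth carried by foreign codes.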
Contamination correction: the samples to which the inter-batch and intra-batch contamination points are traced back, and false-positive mutations are removed from the sample by comparison with them.
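The text does not specify how the comparison is carried out. One simple way to illustrate it, purely as an assumption, is to split the contaminated sample's variant calls into those shared with the traced source sample and those that are not. The variant tuples and names below are hypothetical.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref, alt)


def split_by_source(
    sample_calls: Set[Variant], source_calls: Set[Variant]
) -> Dict[str, Set[Variant]]:
    """Separate calls shared with the traced source sample from the remaining calls."""
    shared = sample_calls & source_calls
    return {"kept": sample_calls - shared, "removed_as_false_positive": shared}


# Hypothetical calls for a contaminated sample and its traced source sample.
sample = {("chr7", 100, "T", "G"), ("chr12", 200, "C", "A")}
source = {("chr12", 200, "C", "A"), ("chr17", 300, "G", "A")}
result = split_by_source(sample, source)
print(result["kept"])
print(result["removed_as_false_positive"])
```

A set intersection like this would also discard true variants that both samples happen to share, so a practical workflow would likely weigh allele fractions against the measured contamination level before removing a call.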
The depths of the inter-batch and intra-batch insertion-type coding sequences in this example are shown in Table 12, which shows that the method can recover a sufficient number of sample unique identification codes.
TABLE 12
Effective depth of sequencing target | 5192× |
Effective depth of inter-batch insertion-type coding sequence | 4567× |
Effective depth of intra-batch insertion-type coding sequence | 4605× |
Example 3
This example tests the amplicon library construction method (TRB immune repertoire targeted sequencing). The working principle is shown in FIG. 3, and the test comprises the following steps:
1. Sample DNA extraction: the tested sample is a blood sample, and the DNA extraction kit is the QIAamp DNA Blood Kit. After extraction, 1 μg of DNA is quantified and added to a pre-made tube. In this pre-made tube, the variable coding region of the inter-batch insertion-type coding nucleic acid sequence is ATAT, and the variable coding regions of the intra-batch insertion-type coding nucleic acid sequence are T, C and T, in 5' to 3' order. After the DNA is added, the pre-made tube is gently vortexed for 30 s;
2. PCR amplification and enrichment of the target and coding sequences: the target regions are amplified by multiplex PCR with a TRB primer system. The reaction system is prepared according to Table 13, in which the intra-batch and inter-batch coding nucleic acid primer pairs are those shown in SEQ ID NO.7, SEQ ID NO.8, SEQ ID NO.9 and SEQ ID NO.10, and the reaction conditions are shown in Table 14;
TABLE 13
2×Multiplex PCR Buffer | 25μL |
Multiplex Polymerase | 1μL |
TRB primer Mix(10μM) | 2μL |
Inter-batch coding nucleic acid primer working solution | 2μL |
Intra-batch coding nucleic acid primer working solution | 2μL |
Ultra-pure water | 2μL |
DNA(1000ng) | 20μL |
TABLE 14
3. After being purified by AMPure XP beads, library construction PCR is carried out according to a reaction system shown in a table 15 (wherein P5-F and P7-R sequences are shown as SEQ ID NO.11 (aatgatacggcacccagatctacatacgtacatgcgctcgctcgtcggcgcgcgcgcgtc) and SEQ ID NO.12 (caagcagagagaagaccgacatgaagctcgtctcgtgggctcgg)) and reaction conditions shown in a table 16;
TABLE 15
5× Reaction Buffer | 10μL |
DNA Polymerase | 0.5μL |
10mM dNTP | 1μL |
P5-F(10μM) | 1μL |
P7-R(10μM) | 1μL |
Nuclease-free water | 34.5μL |
TABLE 16
4. Library purification and sequencing
After amplification, the library is purified with AMPure XP beads and quantified with Qubit 3.0, and is then sequenced on a NovaSeq 6000 high-throughput sequencer with PE150 read length; the sequencing output is 0.3 M reads.
5. Data analysis and depth statistics
The data analysis and depth statistics methods are as described above. The depths of the inter-batch and intra-batch insertion-type coding sequences in this example are shown in Table 17, which shows that the amplicon library construction method can also recover a sufficient number of sample unique identification codes.
TABLE 17
Inter-batch code effective reads | 63425 |
Intra-batch code effective reads | 52583 |
Example 4
This example verifies the contamination identification performance using artificially mixed samples.
To verify the contamination identification capability, contaminated samples are prepared by artificially simulating contamination at a gradient of mixing ratios of 0.1%, 0.5%, 1%, 5% and 10%. The measured contamination indices are shown in FIG. 5. The molecular coding detection system for monitoring and correcting sequencing contamination can identify contamination down to the 0.1% level.
In summary, the invention designs an insertion-type coding nucleic acid sequence with a specific structure and uses it to label the sample to be tested. Analysis based on the raw high-throughput sequencing data can rapidly and effectively identify both cross-contamination among samples within a short-term batch and historical environmental contamination arising from long-term testing across batches. The insertion-type coding nucleic acid sequence can serve as a set of standard NGS reagents for quality evaluation of testing laboratories, and for cleaning, correcting and remedying test results affected by sample contamination without retesting.
The applicant states that the present invention is illustrated by the above examples to show the detailed method of the present invention, but the present invention is not limited to the above detailed method, that is, it does not mean that the present invention must rely on the above detailed method to be carried out. It should be understood by those skilled in the art that any modification of the present invention, equivalent substitutions of the raw materials of the product of the present invention, addition of auxiliary components, selection of specific modes, etc., are within the scope and disclosure of the present invention.
Claims (10)
1. A molecular coding detection system for monitoring and correcting sequencing contamination, the molecular coding detection system comprising at least one insertion-type coding nucleic acid sequence;
the insertion-type coding nucleic acid sequence comprises a backbone sequence region of known sequence and at least one variable coding region;
the variable coding region is a random sequence consisting of any one or at least two of A, T, C or G;
the variable coding regions are randomly distributed within the backbone sequence region;
the insertion-type coding nucleic acid sequence is single-stranded or double-stranded.
2. The molecular coding detection system for monitoring and correcting sequencing contamination according to claim 1, wherein the length of the insertion-type coding nucleic acid sequence is 100-2000 bp;
the length of the variable coding region is 1-20 bp, and the number of variable coding regions is 1-4.
3. The molecular coding detection system for monitoring and correcting sequencing contamination according to claim 1, wherein the insertion-type coding nucleic acid sequences are classified, according to the variable coding regions, as insertion-type coding nucleic acid sequences for identifying inter-batch contamination or insertion-type coding nucleic acid sequences for identifying intra-batch contamination;
the length of the variable coding region in the insertion-type coding nucleic acid sequence for identifying inter-batch contamination is different from the length of the variable coding region in the insertion-type coding nucleic acid sequence for identifying intra-batch contamination.
4. The molecular coding detection system for monitoring and correcting sequencing contamination according to claim 3, wherein the length of the variable coding region in the insertion-type coding nucleic acid sequence for identifying intra-batch contamination is 1 bp, and the number of variable coding regions is 4;
the length of the variable coding region in the insertion-type coding nucleic acid sequence for identifying inter-batch contamination is 5 bp, and the number of variable coding regions is 2;
the insertion-type coding nucleic acid sequence for identifying inter-batch contamination comprises the sequence shown in SEQ ID NO.1;
the insertion-type coding nucleic acid sequence for identifying intra-batch contamination comprises the sequence shown in SEQ ID NO.2.
5. The molecular coding detection system for monitoring and correcting sequencing contamination according to claim 1, further comprising a coded information recovery system;
the coded information recovery system comprises probes or primers complementary to the insertion-type coding nucleic acid sequence.
6. The molecular coding detection system for monitoring and correcting sequencing contamination according to claim 5, wherein the length of the probe is 50-200 bp, and the number of probes is 1-1000;
the length of the primer is 18-25 bp;
the nucleic acid sequence of the probe for the insertion-type coding nucleic acid sequence for identifying inter-batch contamination is selected from the sequences shown in SEQ ID NO.3 and/or SEQ ID NO.4;
the nucleic acid sequence of the probe for the insertion-type coding nucleic acid sequence for identifying intra-batch contamination is selected from the sequences shown in SEQ ID NO.5 and/or SEQ ID NO.6;
the nucleic acid sequence of the primer for the insertion-type coding nucleic acid sequence for identifying inter-batch contamination comprises the sequences shown in SEQ ID NO.7 and SEQ ID NO.8;
the nucleic acid sequence of the primer for the insertion-type coding nucleic acid sequence for identifying intra-batch contamination comprises the sequences shown in SEQ ID NO.9 and SEQ ID NO.10.
7. Use of the molecular coding detection system for monitoring and correcting sequencing contamination according to any one of claims 1 to 6 in the preparation of a gene sequencing product.
8. A sequencing kit comprising the molecular coding detection system for monitoring and correcting sequencing contamination according to any one of claims 1 to 6.
9. Use of the molecular coding detection system for monitoring and correcting sequencing contamination according to any one of claims 1 to 6 in gene sequencing.
10. A method for monitoring and correcting sequencing contamination, the method comprising:
mixing the molecular coding detection system for monitoring and correcting sequencing contamination according to any one of claims 1 to 6 with a sample to be tested, constructing and purifying a library, sequencing the purified library, and performing data analysis and contamination correction according to the sequencing result;
the criterion for judging contamination is: coding sequences other than the sample's unique code are present in all variable coding regions of the insertion-type coding nucleic acid sequence, and the number of reads of the suspected contamination sequence exceeds 3;
the contamination correction comprises: tracing back the sample to which the contamination points, and removing false-positive mutations from the sample by comparison.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211328995.5A CN115717163B (en) | 2022-10-27 | 2022-10-27 | Molecular coding detection system for monitoring and correcting sequencing pollution and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115717163A true CN115717163A (en) | 2023-02-28 |
CN115717163B CN115717163B (en) | 2023-10-27 |
Family
ID=85254369
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211328995.5A Active CN115717163B (en) | 2022-10-27 | 2022-10-27 | Molecular coding detection system for monitoring and correcting sequencing pollution and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115717163B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115896255A (en) * | 2023-03-08 | 2023-04-04 | 中国环境科学研究院 | Tracing method using DNA identification code |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014128453A1 (en) * | 2013-02-19 | 2014-08-28 | Genome Research Limited | Nucleic acid marker molecules for identifying and detecting cross contamination of nucleic acid samples |
CN109628568A (en) * | 2019-01-10 | 2019-04-16 | 上海境象生物科技有限公司 | A kind of internal standard and its application polluted for differentiating and calibrating high-flux sequence |
JP2019131539A (en) * | 2018-01-31 | 2019-08-08 | 公益財団法人かずさDna研究所 | Detection method of cross-contamination between samples in next-generation sequencing |
WO2019212138A1 (en) * | 2018-05-03 | 2019-11-07 | 주식회사 셀레믹스 | Internal control substance for discovering cross-contamination between samples for next generation sequencing |
CN111944807A (en) * | 2020-08-26 | 2020-11-17 | 天津诺禾医学检验所有限公司 | Human sequencing sample tracking marker, and monitoring method and monitoring device for human sequencing sample cross contamination |
CN113897354A (en) * | 2021-08-27 | 2022-01-07 | 海宁麦凯医学检验有限公司 | Internal standard for sequencing correction and application thereof |
Also Published As
Publication number | Publication date |
---|---|
CN115717163B (en) | 2023-10-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||