WO2020118596A1 - Method for detecting a tag sequence - Google Patents

Method for detecting a tag sequence

Info

Publication number
WO2020118596A1
WO2020118596A1 PCT/CN2018/120820 CN2018120820W WO2020118596A1 WO 2020118596 A1 WO2020118596 A1 WO 2020118596A1 CN 2018120820 W CN2018120820 W CN 2018120820W WO 2020118596 A1 WO2020118596 A1 WO 2020118596A1
Authority
WO
WIPO (PCT)
Prior art keywords
tag
sequence
template
sequences
detection method
Prior art date
Application number
PCT/CN2018/120820
Other languages
English (en)
Chinese (zh)
Inventor
赵霞
赵静
章文蔚
陈奥
Original Assignee
深圳华大生命科学研究院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大生命科学研究院 filed Critical 深圳华大生命科学研究院
Priority to CN201880099610.8A priority Critical patent/CN113168889B/zh
Priority to PCT/CN2018/120820 priority patent/WO2020118596A1/fr
Publication of WO2020118596A1 publication Critical patent/WO2020118596A1/fr

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Definitions

  • the invention relates to the field of sequencing technology, in particular to a method for detecting a tag sequence.
  • multiple libraries can be mixed together for sequencing by adding a library tag of known sequence to the library, and later the information of different samples is separated according to the library tag.
  • the first type is the single-tag library: the tag sequence is carried either on an adapter or on a PCR primer, and is added to the library through adapter ligation or PCR amplification, respectively.
  • the second type is the dual-tag library. There are three ways to add the dual-tag sequence to the library. The first is to add the first tag sequence to the adapter and the second tag sequence to the PCR primer, so that the two tags are added to the library in sequence by adapter ligation and then PCR.
  • the second is to add the two tag sequences to the F primer (forward amplification primer) and the R primer (reverse amplification primer) of the PCR (polymerase chain reaction) primers, so that both tags are added to the library simultaneously by PCR amplification.
  • the third is to add the two tags to the top strand and the bottom strand of the adapter sequence, so that both tags are added to the library at the same time by adapter ligation.
  • the oligo constructing the tag library can be divided into two categories: tag adaptor and tag primer.
  • the standard oligonucleotide QC (quality control) methods of oligonucleotide synthesis suppliers mainly include OD (optical density) value detection, chromatographic detection and mass spectrometry detection.
  • these methods cannot detect the accuracy of oligonucleotide base synthesis. For example, when the planned sequence is AATTCCGGA, 1% of the oligonucleotides actually synthesized may be AATTCCGGT and 1% may be GATTCCGGA. If the wrongly synthesized base is located at the 3' end of the sequencing primer, it directly affects the hybridization success rate of the sequencing primer and the sequencing success rate, reducing the number of effective sequencing templates or causing sequencing errors.
  • in the prior art, the oligonucleotide contamination rate is detected using, as the template, a 180 bp DNA fragment carrying a 10 bp sample index sequence, amplified from plasmid DNA. The template is matched with the oligonucleotides to be tested to build an NGS library, and NGS sequencing is used to count the reads whose library tag sequence matches the sample index sequence, from which the contamination rate of the other, non-matching tags is calculated.
  • in this method, the template DNAs differ from each other in a base sequence of only 10 bp.
  • because of the resulting base imbalance, the quality of second-strand sequencing cannot be guaranteed, so it cannot be used to indirectly detect the accuracy of oligonucleotide base synthesis.
  • oligonucleotide base synthesis is instead checked mainly by PCR-amplifying the full length of the oligonucleotide sequence, adding an A base for TA cloning, and Sanger sequencing the resulting monoclonal libraries; generally at least 100 monoclonal libraries must be sequenced for each oligonucleotide.
  • the invention provides a method for detecting a tag sequence.
  • the method includes:
  • a set of template sequences are matched with a set of tag sequences to be tested to build a library to obtain a set of tag libraries.
  • the above template sequences are different gene sequences amplified or artificially synthesized. Different template sequences are different in sequence from each other.
  • the above template sequences and the above tag sequences to be tested have a one-to-one or many-to-one correspondence;
  • the above template sequence is a different gene sequence amplified from genomic DNA.
  • the number of the above-mentioned set of template sequences is 96.
  • the size of the above template sequence is 50-1000bp, preferably 180bp.
  • all the above template sequences are of equal size.
  • the above template sequences satisfy the following: within the stretches at the 5' end and the 3' end whose length equals the sequencing read length, the proportion of each of the bases A, T, C and G meets the base ratio required for a balanced base signal.
  • the stretches at the 5' and 3' ends of the above template sequences whose length equals the sequencing read length lie within 20 bp to 200 bp of the 5' and 3' ends, preferably within 30 bp.
  • the above base ratio is 10% to 30%, preferably 15%.
  • the number of template sequences is N times the number of tag sequences to be tested, where N is an integer greater than or equal to 1; the template sequences are divided into as many subgroups as there are tag sequences to be tested, and each subgroup contains N template sequences.
  • the tag sequence to be tested is a tag adapter and/or a tag primer.
  • the tag sequence to be tested is a single-tag adapter.
  • the matching library construction includes: ligating the template sequences to the single-tag adapters in a one-to-one or many-to-one correspondence, and then performing PCR amplification with universal primers to obtain a single-tag library ready for sequencing.
  • the tag sequence to be tested is a single-tag primer.
  • the above matching library construction includes: ligating the above template sequences to a universal adapter to obtain ligation products, and then performing PCR amplification with the single-tag primers in a one-to-one or many-to-one correspondence to obtain a single-tag library ready for sequencing.
  • the tag sequence to be tested is a dual-tag sequence composed of a tag adapter and a tag primer.
  • the matching library construction includes: ligating the template sequences to the tag adapters in a one-to-one or many-to-one correspondence, and then performing PCR amplification with the tag primers in a one-to-one or many-to-one correspondence to obtain a dual-tag library ready for sequencing.
  • the tag sequence to be tested is a dual-tag primer composed of two tag primers.
  • the above matching library construction includes: ligating the above template sequences to a universal adapter to obtain ligation products, and then performing PCR amplification with the above dual-tag primers in a one-to-one or many-to-one correspondence to obtain a dual-tag library ready for sequencing.
  • the above tag sequence to be tested is a dual-tag adapter composed of two tag adapters.
  • the matching library construction includes: ligating the template sequences to the dual-tag adapters in a one-to-one or many-to-one correspondence to obtain ligation products, and then performing PCR amplification with universal primers to obtain a dual-tag library ready for sequencing.
  • the above sequencing is paired-end sequencing.
  • the above sequencing is PE30+10 sequencing.
  • the tag sequences to be tested are all tag sequences synthesized in the same batch.
  • the above method further includes obtaining a second-strand (read 2) sequencing quality evaluation result based on the above sequencing reads;
  • the above template sequence is obtained by amplifying the human genome with 96 primer pairs shown in SEQ ID NO: 1 to 192.
  • the invention matches a set of template sequences with the tag sequences to be tested and builds a library in order to detect the tag sequences. Once the template sequences have been successfully prepared, they can be re-amplified from their own amplification products many times, saving template preparation costs. Because the template sequences differ from each other in sequence, there is no need to worry that different templates become indistinguishable because of sequence errors introduced by multiple PCR amplifications.
  • the preferred technical solution indirectly detects the quality of the 5' base synthesis of the tag adapter, or of the adapter related to second-strand sequencing, by detecting the quality of second-strand sequencing.
  • the tag contamination rate and the sequencing-primer-related oligonucleotide synthesis quality can be detected simultaneously in a single experiment, saving quality control labor and cost.
  • FIG. 2 is a schematic diagram of the library construction process in the prior-art quality control method for single-tag adapters.
  • FIG. 3 is a schematic diagram of the sequencing principle in the prior-art quality control method for single-tag adapters.
  • FIG. 5 is a schematic diagram of a template DNA preparation method in an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of the library construction process for quality inspection of single-tag adapters in an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of the library construction process for quality inspection of single-tag primers in an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of the sequencing principle for quality inspection of single-tag sequences in an embodiment of the present invention.
  • FIG. 9 shows the contamination rate after library sequencing and the ESR used for second-strand sequencing quality evaluation, for quality inspection of single-tag sequences in an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of the library construction process for quality inspection of a tag adapter + tag primer in an embodiment of the present invention.
  • FIG. 11 is a schematic diagram of the library construction process for quality inspection of dual-tag primers in an embodiment of the present invention.
  • FIG. 12 is a schematic diagram of the library construction process for quality inspection of dual-tag adapters in an embodiment of the present invention.
  • FIG. 13 is a schematic diagram of the library sequencing principle for quality inspection of dual-tag sequences in an embodiment of the present invention.
  • FIG. 14 shows the contamination rate after library sequencing and the ESR used for second-strand sequencing quality evaluation, for quality inspection of dual-tag sequences in an embodiment of the present invention.
  • FIG. 15 shows the adapter sequence statistics obtained after SE200 sequencing of libraries with a 160 bp main band, one whose second-strand ESR did not improve and one whose second-strand ESR improved, in an embodiment of the present invention.
  • FIG. 16 is a graph of the ESR statistics of single-tag mixed libraries constructed with tag adapters from different manufacturers and different batches in an embodiment of the present invention.
  • FIG. 17 shows the amplification products identified by agarose gel electrophoresis in an embodiment of the present invention.
  • FIG. 18 is an electrophoresis image of the 180 bp specific products after re-amplification and purification in an embodiment of the present invention.
  • FIG. 21 is a graph of the ESR results of the mixed library in an embodiment of the present invention.
  • FIG. 22 is a graph of the ESR results, split by tag, of the library constructed with all 8 tag adapters to be tested in an embodiment of the present invention.
  • a tag sequence oligonucleotide, also referred to herein as a "tag sequence" (barcode) or "oligonucleotide" (oligo), refers to a nucleotide sequence used in the construction of a library (such as a sequencing library) that has the function of distinguishing sample sources and/or molecular sources, including tag adapters and tag primers. These tag sequences are obtained by artificial synthesis.
  • a tag adapter (tag linker) refers to an adapter sequence used in the construction of a library (such as a sequencing library) that has the function of distinguishing different sample sources and/or molecular sources, including single-tag adapters and dual-tag adapters, where a dual-tag adapter consists of two single-tag adapters.
  • a tag primer refers to a primer sequence used in the construction of a library (such as a sequencing library) that has the function of distinguishing different sample sources and/or molecular sources, including single-tag primers and dual-tag primers, where a dual-tag primer consists of two single-tag primers.
  • the tag library refers to a library containing the tag sequence of the present invention obtained by a library construction method, especially a sequencing library.
  • Tag libraries include single tag libraries and dual tag libraries.
  • the single tag library contains a single tag adaptor or a single tag primer.
  • the dual tag library contains dual tag adaptors or dual tag primers.
  • the template sequence refers to a sequence used for matching and building a library with a tag sequence in the present invention, and different template sequences are different from each other in sequence.
  • the template sequence is a different gene sequence amplified from genomic DNA.
  • Sequencing refers to a method for determining a nucleic acid sequence. In the present invention it refers specifically to determining the sequence of the tag library, and includes single-end sequencing and paired-end sequencing.
  • the present invention prefers paired-end sequencing, especially PE30+10 sequencing, a strategy that sequences 30 bp from each end plus the 10 bp tag sequence.
  • the invention can complete, in one experiment, both the quality control of the synthesis contamination rate of tag sequence oligonucleotides and the indirect quality control of the sequencing-related base synthesis quality of the oligonucleotides, providing a new method for the quality inspection of tag sequence oligonucleotides.
  • the detection method of the present invention is applicable not only to single-tag sequence oligonucleotides, but also to double-tag sequence oligonucleotides.
  • the proportion of each of the bases A/T/C/G is not less than a set value, so that a balanced A/T/C/G base signal can be guaranteed.
  • the set value may be, for example, a percentage in the range of 10% to 30%, such as 15%, to ensure the base balance of the insert sequence in tag sequence sequencing (such as PE30+10), that is, to ensure that the sequencing quality is not affected by template base imbalance.
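The base-balance requirement can be illustrated with a minimal sketch. It assumes that the check is applied per sequencing position across the whole template set and only over the first `read_length` bases of each template; the function names, the toy sequences and the default threshold of 0.15 (taken from the 15% example value) are illustrative assumptions, not part of the patent text. To check the 3' end in the same way, the reverse complements of the templates could be passed in.

```python
from collections import Counter

def position_base_fractions(sequences, read_length):
    """Return, for each of the first `read_length` positions, the fraction
    of A/T/C/G observed at that position across all sequences."""
    fractions = []
    for pos in range(read_length):
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts.values())
        fractions.append({base: counts.get(base, 0) / total for base in "ATCG"})
    return fractions

def is_base_balanced(sequences, read_length, min_fraction=0.15):
    """True if every base reaches `min_fraction` at every checked position."""
    return all(
        fraction >= min_fraction
        for column in position_base_fractions(sequences, read_length)
        for fraction in column.values()
    )

# Toy template sets (hypothetical sequences, 4 bases long for brevity).
balanced_set = ["ATCG", "TCGA", "CGAT", "GATC"]
unbalanced_set = ["AAAA", "AAAT", "AACG", "ATCG"]
print(is_base_balanced(balanced_set, 4))
print(is_base_balanced(unbalanced_set, 4))
```

In the first toy set every base makes up 25% of each position, so the check passes; in the second, position 1 is all A, so it fails.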
  • the test template DNAs are all divided into N groups, and then the grouped template DNAs are mixed separately, and matched with the tag sequence oligonucleotides to be tested to build a library.
  • for example, template DNAs No. 1-96 are divided into 8 groups: template DNAs No. 1-12 are mixed in equal mass proportions and matched with tag sequence 1 (Barcode1) to build a library, template DNAs No. 13-24 are mixed in equal proportions and matched with tag sequence 2 (Barcode2), and so on, until template DNAs No. 85-96 are mixed in equal proportions and matched with tag sequence 8 (Barcode8) to build a library.
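The grouping described above can be sketched as follows. The function name and the use of plain integers and strings as identifiers are illustrative assumptions; only the 96-template, 8-tag layout comes from the text.

```python
def assign_templates_to_tags(template_ids, tag_ids):
    """Split the templates into consecutive, equally sized groups,
    one group per tag, and return a {tag: [template ids]} mapping."""
    if len(template_ids) % len(tag_ids) != 0:
        raise ValueError("number of templates must be a multiple of the number of tags")
    group_size = len(template_ids) // len(tag_ids)
    return {
        tag: template_ids[i * group_size:(i + 1) * group_size]
        for i, tag in enumerate(tag_ids)
    }

# 96 templates and 8 tags, as in the example in the text.
groups = assign_templates_to_tags(
    list(range(1, 97)),
    [f"Barcode{i}" for i in range(1, 9)],
)
print(groups["Barcode1"])  # templates 1-12
print(groups["Barcode8"])  # templates 85-96
```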
  • PE30+10 sequencing is used to count the reads whose tag sequence matches the template DNA, so as to calculate the contamination rate of the other, non-matching tag sequences. The quality of second-strand sequencing, such as whether the second-strand ESR (Effective Spot Rate, the ratio of effective sequencing sites) in DNB sequencing improves, is used to indirectly detect the base synthesis accuracy of the sequence in the tag sequence oligonucleotide that hybridizes with the second-strand sequencing primer.
  • the tag sequence oligonucleotide may be a single-tag library-construction oligonucleotide or a dual-tag library-construction oligonucleotide. It should be noted that when there are fewer tag sequences to be tested than test template DNAs (for example, 96), in theory only part of the test template DNAs would be needed. However, the template sequences must remain base-balanced at every sequencing position under the selected sequencing strategy (for example, PE30), and using all the test template DNAs (for example, all 96) helps to ensure this.
  • when the tag sequence oligonucleotide is a single-tag library-construction oligonucleotide, that is, a single-tag adapter or a single-tag primer, its quality inspection method is as shown in Figures 6-9.
  • N 4X, X ⁇ 1
  • single-tag adapters (such as the tag 1 to tag N adapters)
  • and the prepared N template DNA fragments (such as template DNA A to template DNA N, or gene A fragment to gene N fragment)
  • when these are quality-inspected, the template DNAs and the tag adapters are ligated in a one-to-one correspondence, and PCR amplification is then performed with universal primers to obtain a single-tag library ready for sequencing.
  • single-tag primers (such as the tag 1 primer to the tag N primer)
  • and the prepared N template DNA fragments (such as template DNA A to template DNA N, or gene A fragment to gene N fragment) are quality-inspected,
  • the template DNAs are first ligated to the universal adapter, and PCR amplification is then performed on the ligation products with the tag primers in a one-to-one correspondence to obtain the single-tag library.
  • after PE30+10 sequencing, the number of reads of each template DNA (such as gene A fragment to gene N fragment) that corresponds to each of the different tags (such as tag 1 to tag N) is obtained, such as the values 4995, 4, X, 1, 0, 4998, Y, 2, 2, 8, Z, 4990 in Figure 9.
  • from these read counts, the contamination rate of each tag adapter/primer by other tag adapters/primers can be calculated.
  • the contamination rate of the tag 1 adapter is (4+X+1)/(4+X+1+4995)*100%
  • the contamination rate of the tag 2 adapter is (0+Y+2)/(0+Y+2+4998)*100%
  • the contamination rate of the tag N adapter is (2+8+Z)/(2+8+Z+4990)*100%.
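The contamination rate formula above can be written as a short function. This is a sketch under stated assumptions: the read counts for one template group are held in a dictionary keyed by the observed tag, the function name is illustrative, and the placeholder X from the figure is replaced by a made-up count of 0 purely so that the example can be evaluated.

```python
def contamination_rate(observed_counts, expected_tag):
    """Fraction of reads from one template group that carry a tag other
    than the tag that was matched to that group during library construction."""
    total = sum(observed_counts.values())
    if total == 0:
        raise ValueError("no reads for this template group")
    matched = observed_counts.get(expected_tag, 0)
    return (total - matched) / total

# Read counts for the template matched to tag 1: 4995 reads carry tag 1,
# 4 carry tag 2, 1 carries tag N. The count for "tag3" stands in for the
# placeholder X and is a hypothetical value.
gene_a_counts = {"tag1": 4995, "tag2": 4, "tag3": 0, "tagN": 1}
rate = contamination_rate(gene_a_counts, "tag1")
print(f"{rate * 100:.2f}%")  # (4+0+1)/(4+0+1+4995)*100% = 0.10%
```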
  • the sequencing quality evaluation result of the second strand (read 2) can be obtained (the ESR results shown in Figure 9), and the quality of the 5' end base synthesis of the adapters related to second-strand sequencing in the mixed single-tag library can be judged indirectly from this result.
  • further, by splitting the sequencer output data by tag, the sequencing quality evaluation result of each single tag (not shown in the figure) can be obtained.
  • when the tag sequence oligonucleotide is a dual-tag library-construction oligonucleotide, its quality inspection method is as shown in Figures 10-14.
  • N 4X, X ⁇ 1
  • tag adapters (such as the tag 1 to tag N adapters),
  • N tag primers (such as the tag 1 to tag N primers)
  • and the prepared N template DNA fragments (such as template DNA A to template DNA N, or gene A to gene N) are quality-inspected, the template DNAs and the tag adapters are ligated in a one-to-one correspondence, and PCR is then performed with the tag primers in a one-to-one correspondence to obtain a dual-tag library ready for sequencing.
  • N 4X, X ⁇ 1
  • tag primers F such as tag 1 to tag N primer F
  • N tag primers R such as tag 1 to tag N primer R
  • N 4X, X ⁇ 1
  • tag adapter top strands (top) and N tag adapter bottom strands (bottom) (such as the tag 1 + tag 1 to tag N + tag N adapters)
  • N template DNA fragments (such as template DNA A to template DNA N, or gene A to gene N fragments)
  • the DNA ligation products are amplified by PCR with universal primers to obtain a dual-tag library ready for sequencing.
  • after PE30+10 sequencing, for each template DNA (such as gene A fragment to gene N fragment), the number of reads corresponding to each of the different tag adapters, tag primers F or tag adapter top strands (such as tag 1 to tag N) is obtained (such as 4995, 4, X, 1, 0, 4998, Y, 2, 2, 8, Z, 4990 in Figure 14), together with the number of reads corresponding to each of the different tag primers, tag primers R or tag adapter bottom strands (such as 4990, 10, X, 0, 3, 4995, Y, 2, 3, 12, Z, 4985 in Figure 14).
  • from these read counts, the contamination rate of each tag adapter, tag primer F or tag adapter top strand by other tag adapters, tag primers F or tag adapter top strands can be calculated according to the tag contamination rate formula, as shown in the first in Figure 14
  • the contamination rate of the tag 1 adapter, tag 1 primer F or tag 1 adapter top strand is (4+X+1)/(4+X+1+4995)*100%
  • the tag 2 contamination rate is (0+Y+2)/(0+Y+2+4998)*100%
  • the tag N contamination rate is (2+8+Z)/(2+8+Z+4990)*100%
  • the contamination rate of the tag 1 primer, tag 1 primer R or tag 1 adapter bottom strand is (10+X+0)/(10+X+0+4990)*100%
  • the contamination rate of the tag 2 primer, tag 2 primer R or tag 2 adapter bottom strand is (3+Y+2)/(3+Y+2+4995)*100%
  • the sequencing quality evaluation result of the second strand (read 2) can be obtained (the ESR results shown in the figure), and the quality of the 5' base synthesis of the adapters related to second-strand sequencing in the mixed dual-tag library can be judged indirectly from this result. Further, by splitting the sequencer output data by tag, the sequencing quality evaluation result of each single tag (not shown in the figure) can be obtained.
  • the characteristics of the present invention include: (1) The template DNA design follows the principle of base balance, so that sequencing quality is not affected by base imbalance in the template DNA. (2) Once the template DNA has been successfully prepared, it can be re-amplified from its own amplification product many times (PCR on PCR), saving the cost of template DNA preparation. Because the template DNAs differ from each other at every position, there is no need to worry that different templates become indistinguishable because of sequence errors introduced by multiple PCR amplifications; the small number of errors introduced by PCR can be handled with an appropriate mismatch tolerance. In the prior art, only a 10 bp sequence differs between templates, so preparing templates by re-amplification (PCR on PCR) is not suitable.
  • (3) any number of tag oligonucleotides (tag oligos) from 4 to X can be tested, and the experimental arrangement is not affected by the number of tag oligonucleotides to be tested.
  • (4) the invention provides a new method for indirectly detecting the quality of the 5' base synthesis of a tag adapter, or of an adapter related to second-strand sequencing, by detecting the quality of second-strand sequencing.
  • the tag contamination rate and the sequencing-primer-related oligonucleotide synthesis quality can be detected simultaneously in one quality control system, saving quality control labor and cost.
  • the method of the invention can be used for the quality inspection of tag oligonucleotides for all types of tag library construction, and is flexible and convenient.
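Feature (2) above mentions tolerating a small number of PCR-induced errors when telling templates apart. The text does not specify how this is done; the sketch below assumes a simple Hamming-distance comparison against the start of each template, with a hypothetical tolerance of 2 mismatches and toy sequences, and is not the patent's own procedure.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def assign_read(read, templates, max_mismatches=2):
    """Return the name of the single template whose leading bases are within
    `max_mismatches` of the read, or None if no template (or more than one) fits."""
    hits = [
        name
        for name, seq in templates.items()
        if hamming(read, seq[:len(read)]) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None

# Toy templates (hypothetical sequences) that differ at most positions.
templates = {"geneA": "ACGTACGTAC", "geneB": "TTGGCCAATT"}
print(assign_read("ACGTACGAAC", templates))  # one mismatch against geneA
print(assign_read("GGGGGGGGGG", templates))  # too far from both templates
```

Because the templates differ at most positions, a read with a few errors is still far closer to its true template than to any other, which is why a small tolerance is enough here.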
  • the method of indirectly detecting the synthesis quality of the oligonucleotide that hybridizes with the second-strand sequencing primer, by detecting the quality of second-strand sequencing, was arrived at after a series of test investigations.
  • Figure 15 shows the statistics of the adapter sequences obtained after SE200 sequencing of libraries with a 160 bp main band, one whose second-strand ESR did not improve and one whose second-strand ESR improved.
  • in the figure, the adapter sequence (adapterSeq) is the adapter sequence obtained after SE200 sequencing of the library with a 160 bp main band.
  • the number (number) is the number of reads corresponding to each sequence, and the percentage (percent%) is the share of each sequence in the total.
  • the first line is the correct adapter sequence
  • the other lines are adapter sequences containing a wrong base (the boxed base in the figure)
  • the proportion of the correct adapter sequence in the library whose second-strand ESR did not improve is lower than that in the library whose second-strand ESR improved.
  • the following figure compares the ESR results of the single-tag 49-56 adapter mixed libraries built with adapters purchased from manufacturer B in different batches (first batch, second batch, third batch).
  • it can be seen that, among these batches, the second-strand ESR of the mixed library improved significantly for 1 batch and did not improve significantly for 2 batches. It was therefore established that the quality of the 5' base synthesis of the adapter sequence that hybridizes with the second-strand sequencing primer can be judged by detecting whether the second-strand ESR improves.
  • Example 1: Detection of the tag contamination rate and adapter synthesis quality of 8 single-tag adapters for the DNA nanoball (DNB) sequencing platform
  • Design 96 primer pairs that specifically amplify 180 bp fragments from 96 genes covering the 23 pairs of chromosomes of the human genome. The primers are shown in SEQ ID NO: 1-192, in which every two sequences constitute the primer pair for amplifying one gene: for example, SEQ ID NO: 1-2 are the primer pair for amplifying the first gene, SEQ ID NO: 3-4 are the primer pair for amplifying the second gene, and so on, up to SEQ ID NO: 191-192, the primer pair for amplifying the 96th gene.
  • the sequences of the amplification products are shown in SEQ ID NO: 193-288, where each sequence represents the amplification product of one gene: SEQ ID NO: 193-240 are the amplification product sequences of genes No. 1-48, and SEQ ID NO: 241-288 are the amplification product sequences of genes No. 49-96.
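The SEQ ID numbering described above follows a simple arithmetic pattern, sketched here with two illustrative helper functions (the function names are assumptions; the numbering itself is taken from the text).

```python
def primer_pair_seq_ids(gene_number):
    """SEQ ID NOs of the two primers that amplify gene `gene_number` (1-96)."""
    if not 1 <= gene_number <= 96:
        raise ValueError("gene_number must be between 1 and 96")
    return (2 * gene_number - 1, 2 * gene_number)

def amplicon_seq_id(gene_number):
    """SEQ ID NO of the amplification product of gene `gene_number` (1-96)."""
    if not 1 <= gene_number <= 96:
        raise ValueError("gene_number must be between 1 and 96")
    return 192 + gene_number

print(primer_pair_seq_ids(1), amplicon_seq_id(1))    # (1, 2) 193
print(primer_pair_seq_ids(96), amplicon_seq_id(96))  # (191, 192) 288
```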
  • TAE agarose gel: the size of the gel depends on the number of samples to be checked. Usually 2.5 g of agarose is added to each 100 mL of 1× TAE buffer and heated to boiling until the powder is completely dissolved. After cooling in a warm water bath for 2 min, add 2 µL of GelStain (TransGen), mix gently, pour into the gel tray, insert a wide-well comb, and leave at room temperature for 20-30 min until the gel solidifies before use.
  • Electrophoresis conditions: 150 V, 30 min.
  • TAE agarose gel: the size of the gel depends on the number of samples to be checked. Usually 2.5 g of agarose (BIO-RAD Megabase Agarose) is added to 100 mL of 1× TAE buffer and heated to boiling until the powder is completely dissolved and the solution contains no undissolved solids. After cooling in a warm water bath for 2 min, without adding any dye, pour into the gel tray, insert a wide-well comb, and leave at room temperature for 20-30 min until the gel solidifies. Make sure there are no bubbles in the agarose solution.
  • Electrophoresis conditions: 100 V, 2-2.5 h, so that the bromophenol yellow can run to the bottom of the gel.
  • Reagent name and volume: 5 M NaCl, 4 µL; 1 M Tris-HCl, 4 µL; 2 mM EDTA, 20 µL; water, 172 µL; total, 200 µL.
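As a quick consistency check on the reagent list above, the final concentration of each component follows from the usual dilution relation C_final = C_stock × V_stock / V_total. The sketch below only applies this relation to the listed volumes; the variable and function names are illustrative, and concentrations are expressed in mM.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution relation: C_final = C_stock * V_stock / V_total.
    The result is in the same concentration unit as `stock_conc`."""
    return stock_conc * stock_volume_ul / total_volume_ul

# (stock concentration in mM, volume in µL), taken from the reagent list.
components = {
    "NaCl": (5000.0, 4.0),
    "Tris-HCl": (1000.0, 4.0),
    "EDTA": (2.0, 20.0),
}
water_ul = 172.0
total_ul = sum(volume for _, volume in components.values()) + water_ul

final_mM = {
    name: final_concentration(conc, volume, total_ul)
    for name, (conc, volume) in components.items()
}
print(total_ul)
print(final_mM)
```

With the listed volumes the total is 200 µL, giving roughly 100 mM NaCl, 20 mM Tris-HCl and 0.2 mM EDTA in the final mixture.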
  • the 96 specific 180 bp products are numbered 1-96 and divided into 8 groups (numbers 1-12, 13-24, 25-36, 37-48, 49-60, 61-72, 73-84 and 85-96). After the products within each group are mixed in equal mass, 50 ng of each mixture is taken and matched, in order, with tags 501-508 to build the library.
  • the product can be used in the next reaction or stored in a -20°C refrigerator.
  • place the reaction sample in the PCR instrument to react.
  • the reaction conditions are as follows in Table 13:
  • the main band is between 250 and 300 bp. As shown in Figure 19, some template self-ligation products appear above the main band. Experimental verification shows that they do not affect the results, and the self-ligation products disappear after circularization and digestion.
  • the single-stranded circular products obtained after digestion of linear DNA were quantified using the Qubit single-strand analysis kit (Qubit ssDNA Assay Kit).
  • the buffer-to-dye ratio is 199:1; vortex and centrifuge briefly to mix. Take two 190 µL aliquots of the diluted dye working solution, add 10 µL of each of the two standards, then vortex and centrifuge to mix. Add 1 µL of sample to 199 µL of diluted dye working solution, vortex, centrifuge, and quantify on the Qubit instrument.
  • the mixed library was sent to the BGISEQ-500 platform for sequencing by PE30+10 strategy.
  • Figure 21 shows the ESR results of the mixed library. It can be seen that the overall 5' end synthesis quality of this batch of 501-508 tag adapters is good, and the second-strand ESR is improved.
  • Figure 22 shows the ESR results, split by tag, for the library constructed with all 8 tag adapters to be tested, taking 6 FOV data.
  • Tag number   Tag matching rate   Tag contamination rate
    501   99.97%   0.03%
    502   99.95%   0.05%
    503   99.96%   0.04%
    504   98.70%   1.30%
    505   99.96%   0.04%
    506   99.96%   0.04%
    507   99.86%   0.14%
    508   99.97%   0.03%

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biophysics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to a method for detecting tag sequences, comprising: performing matched library construction on a set of template sequences and a set of tag sequences to be tested to obtain a set of tag libraries, wherein the template sequences are different gene sequences that are amplified or artificially synthesized, different template sequences differ from each other in sequence, and the template sequences and the tag sequences to be tested have a one-to-one or many-to-one correspondence; sequencing the tag libraries to obtain the sequencing reads of each tag library; aligning the sequencing reads against all the tag sequences to be tested and counting the number of sequencing reads aligned to each tag sequence to be tested; and calculating, from these numbers, the contamination rate of other tag sequences present in each tag sequence to be tested. In this method, because the template sequences differ from each other in sequence, there is no need to worry about being unable to distinguish different templates because of sequence errors caused by multiple PCR amplifications.
PCT/CN2018/120820 2018-12-13 2018-12-13 Tag sequence detection method WO2020118596A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880099610.8A CN113168889B (zh) 2018-12-13 2018-12-13 Tag sequence detection method
PCT/CN2018/120820 WO2020118596A1 (fr) 2018-12-13 2018-12-13 Tag sequence detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/120820 WO2020118596A1 (fr) 2018-12-13 2018-12-13 Tag sequence detection method

Publications (1)

Publication Number Publication Date
WO2020118596A1 (fr) 2020-06-18

Family

ID=71075863

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/120820 WO2020118596A1 (fr) 2018-12-13 2018-12-13 Tag sequence detection method

Country Status (2)

Country Link
CN (1) CN113168889B (fr)
WO (1) WO2020118596A1 (fr)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
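CN101921841A (zh) * 2010-06-30 2010-12-22 深圳华大基因科技有限公司 High-resolution HLA gene typing method based on Illumina GA sequencing technology
CN106021987A (zh) * 2016-05-24 2016-10-12 人和未来生物科技(长沙)有限公司 Clustering and grouping algorithm for ultra-low-frequency mutation molecular tags
CN106755454A (zh) * 2017-01-06 2017-05-31 杭州杰毅麦特医疗器械有限公司 Molecular tag nucleic acid detection method
CN108932401A (zh) * 2018-06-07 2018-12-04 江西海普洛斯生物科技有限公司 Method for identifying sequencing samples and application thereof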

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2776673T3 (es) * 2012-02-27 2020-07-31 Univ North Carolina Chapel Hill Methods and uses for molecular tags

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3998343A4 (fr) * 2020-08-19 2022-11-02 Nanodigmbio (nanjing) Biotechnology Co., Ltd Double-ended library tag composition and application thereof in MGI sequencing platform
CN116287161A (zh) * 2021-12-31 2023-06-23 安诺优达基因科技(北京)有限公司 Method for detecting oligonucleotide sequence consistency
CN115197999A (zh) * 2022-07-15 2022-10-18 纳昂达(南京)生物科技有限公司 Method and device for quality control of synthesis crosstalk in dual-end unique tag adapters
CN115197999B (zh) * 2022-07-15 2024-01-23 纳昂达(南京)生物科技有限公司 Method and device for quality control of synthesis crosstalk in dual-end unique tag adapters
CN116004763A (zh) * 2022-07-19 2023-04-25 纳昂达(南京)生物科技有限公司 Selection, verification and quality control method for combinatorial adapters
CN116004763B (zh) * 2022-07-19 2024-02-09 纳昂达(南京)生物科技有限公司 Selection, verification and quality control method for combinatorial adapters

Also Published As

Publication number Publication date
CN113168889B (zh) 2023-04-04
CN113168889A (zh) 2021-07-23

Similar Documents

Publication Publication Date Title
WO2020118596A1 (fr) Tag sequence detection method
CN108300716B (zh) Adapter element, use thereof, and method for constructing a targeted sequencing library based on asymmetric multiplex PCR
CN104372093B (zh) SNP detection method based on high-throughput sequencing
CN104531883B (zh) Detection kit and detection method for PKD1 gene mutation
AU2018331434A1 (en) Universal short adapters with variable length non-random unique molecular identifiers
CN111440896B (zh) Novel betacoronavirus variation detection method, probe and kit
CN107541791A (zh) Construction method, kit and application of a plasma cell-free DNA methylation detection library
WO2012068919A1 (fr) DNA library and preparation method thereof, and method and device for detecting SNPs
CN108251504A (zh) Method and kit for ultra-rapid construction of a genomic DNA sequencing library
KR101406720B1 (ko) Method for designing fusion primers for next-generation sequencing, and method for genotyping a target gene using such fusion primers and next-generation sequencing
WO2023284768A1 (fr) High-throughput sequencing kit for the human mitochondrial whole genome based on the fusion primer direct amplification method
WO2020232635A1 (fr) Method and system for constructing a sequencing library based on a methylated DNA target region, and use thereof
CN110656157A (zh) Quality control material for tracing high-throughput sequencing samples, and design and use methods thereof
BR112021006402A2 (pt) Sequence-graph-based tool for determining variation in short tandem repeat regions
CN106939344A (zh) Adapter for next-generation sequencing
CN111676325A (zh) Primer combination for detecting the SARS-CoV-2 whole genome and application method
WO2024104130A1 (fr) Method for developing whole-genome molecular markers using degenerate primer amplification
CN111748606A (zh) Method and kit for rapid construction of a plasma DNA sequencing library
CN109825552A (zh) Primer and method for enriching a target region
CN107002150B (zh) High-throughput detection method for DNA synthesis products
CN109022558A (zh) Genomic SNP molecular marker method for Sebastes schlegelii based on enzyme-digestion combined genotyping-by-sequencing technology
US20230235320A1 (en) Methods and compositions for analyzing nucleic acid
US20230340609A1 (en) Cancer detection, monitoring, and reporting from sequencing cell-free dna
Gao et al. HITAC-seq enables high-throughput cost-effective sequencing of plasmids and DNA fragments with identity
KR20210079309A (ko) Barcoding of nucleic acids

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 18943100; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 18943100; Country of ref document: EP; Kind code of ref document: A1)