CN116377046A - Quality control product and kit for parallel detection of tag primer sequences - Google Patents

Publication number
CN116377046A
Authority
CN
China
Prior art keywords
sequence
primer
quality control
seq
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211734834.6A
Other languages
Chinese (zh)
Inventor
姜锋
张介中
杜洋
王娟
李志民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Annoroad Gene Technology Beijing Co ltd
Original Assignee
Annoroad Gene Technology Beijing Co ltd
Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6869 Methods for sequencing
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/166 Oligonucleotides used as internal standards, controls or normalisation probes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/20 Polymerase chain reaction [PCR]; Primer or probe design; Probe optimisation
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Abstract

The invention provides a quality control product for detecting tag primers. The quality control product is a double-stranded DNA fragment comprising a non-natural oligonucleotide fragment of known sequence that differs from the sequence at any position in the genome of any known species. Both ends of the DNA fragment carry linker sequences, which are complementary sequences capable of binding specifically to the 3' ends of the tag primer pair to be detected. The quality control product can be PCR-amplified with the tag primer pair to be detected, and the amplified product is used as a sequencing library for second-generation sequencing. The raw (off-machine) sequencing data are then split according to the quality control product, and analysis of the split data reveals cross-contamination and/or synthesis errors among the different tag primers.

Description

Quality control product and kit for parallel detection of tag primer sequences
Technical Field
The invention belongs to the field of high-throughput gene sequencing, and relates to a quality control product and a kit for large-scale parallel detection of a tag primer sequence.
Background
High-throughput sequencing, also known as next-generation sequencing (NGS), is characterized by the ability to sequence hundreds of thousands to millions of DNA molecules in parallel in a single run, generally with short read lengths. It can therefore test hundreds or even thousands of samples in parallel at a time. Because so many samples are pooled, a large number of tag sequences is needed to label the different sample libraries so that the sequences from each sample can be distinguished in the sequencing results.
In second-generation sequencing, tag primers (index primers) add the tags used to split the sequencing data of different libraries. Each library must correspond strictly to a unique index sequence, so that the split sequencing data are accurate and free of cross-contamination between libraries. In practice, however, large-scale automated handling and similar factors can cause cross-contamination between tag primers (that is, a small amount of primer B becomes mixed into primer A), which in turn cross-contaminates the sequencing data of different libraries. In addition, the synthesis process for tag primers has inherent limitations, so tag primers can contain synthesis errors. A high proportion of cross-contamination among tag primers reduces the accuracy of the sequencing results and can lead to false positive and false negative findings being reported. A high proportion of synthesis errors in a tag primer increases the fraction of raw data that cannot be assigned to any library during splitting, which raises the sequencing cost.
For NGS tag primers, primer synthesis companies in the prior art typically rely on strict process control, for example isolating different batches of primers to reduce the likelihood of cross-contamination. For quality control, most use NanoDrop concentration measurement, or capillary electrophoresis or mass spectrometry to check the number of nucleotides. These methods cannot effectively verify the sequence accuracy of NGS tag primers, and they fall short of the quality control and sequencing requirements of large-scale parallel sequencing in downstream experiments.
Disclosure of Invention
To address the shortcomings of the prior art and the needs of practical production experiments, the invention provides a quality control product and a kit for large-scale parallel detection of tag primer sequences. The quality control product is an oligonucleotide fragment of known sequence whose ends are complementary to the tag primers to be detected. It can be PCR-amplified with the tag primers to be detected, and the amplified product is loaded as a sequencing library for second-generation sequencing. The raw sequencing data are split according to the quality control product, and analysis of the split data reveals cross-contamination and/or synthesis errors among the different tag primers.
Specifically, the invention adopts the following technical scheme:
1. A quality control product for detecting a tag primer, characterized in that
the quality control product is a double-stranded DNA fragment;
the DNA fragment comprises a non-natural oligonucleotide fragment of known sequence, the non-natural oligonucleotide fragment differing from the sequence at any position in the genome of any known species; and
both ends of the DNA fragment carry linker sequences, the linker sequences being complementary sequences capable of binding specifically to the 3' ends of the tag primer pair to be detected.
2. The quality control product according to item 1, characterized in that
the 3' ends of the sense strand and the antisense strand of the linker sequence have the same sequence and comprise a sequence specifically complementary to either primer of the tag primer pair to be detected; and
the 5' ends of the sense strand and the antisense strand of the linker sequence have the same sequence and comprise a sequence specifically complementary to the other primer of the tag primer pair to be detected.
3. The quality control product according to item 1, wherein the quality control product can be PCR-amplified with the tag primer pair to be detected, and the amplified product is a tag primer detection library.
4. The quality control product according to item 2, wherein the tag primer pair to be detected is a second-generation sequencing tag primer pair;
the tag primer pair to be detected comprises a forward primer sequence and a reverse primer sequence, each of which has a complementary sequence that specifically recognizes a 3' end of the DNA fragment of known sequence; the oligonucleotide (tag) sequence is present in either the forward primer sequence or the reverse primer sequence and is located between the 5' end sequence and the 3' end sequence of that primer;
preferably, the 5' end sequences of the forward and reverse primer sequences each carry a linker sequence complementary to the sequencing platform.
5. The quality control product according to item 4, wherein the sequences of the tag primer pair to be detected are any one of the following pairs: SEQ ID NO.1 and SEQ ID NO.2; SEQ ID NO.3 and SEQ ID NO.4; or SEQ ID NO.1 and SEQ ID NO.4.
6. The quality control product according to item 5, wherein either strand of the quality control product has the sequence shown in SEQ ID NO.55.
7. The quality control product according to item 1, characterized in that the non-natural oligonucleotide fragment of known sequence is 50-1000 bp long, preferably 150-500 bp.
8. The quality control product according to item 1, wherein the sequence of the non-natural oligonucleotide fragment of known sequence is used to split the tag primer detection library data.
9. A kit for detecting a tag primer, wherein the kit comprises the quality control product of any one of items 1-8.
10. The kit according to item 9, further comprising a pair of adaptor primers, a PCR amplification reagent, and a purification reagent.
11. A quality control method for second-generation sequencing tag primers, characterized by comprising the following steps:
A. carrying out PCR amplification with the tag primer pair to be detected and the quality control product;
B. obtaining the amplification product and sequencing it on a sequencer to obtain sequencing data;
C. confirming the sequence consistency of the oligonucleotide to be detected according to the sequencing data;
wherein the quality control product is any one of the quality control products of items 1-8;
the tag primer pair to be detected comprises a forward primer sequence and a reverse primer sequence, each of which has a complementary sequence that specifically recognizes a 3' end of the DNA fragment of known sequence; the oligonucleotide (tag) sequence is present in either the forward primer sequence or the reverse primer sequence and is located between the 5' end sequence and the 3' end sequence of that primer;
preferably, the 5' end sequences of the forward primer sequence and the reverse primer sequence each carry a linker sequence complementary to the sequencing platform;
preferably, the sequences of the tag primer pair to be detected are any one of the following pairs: SEQ ID NO.1 and SEQ ID NO.2; SEQ ID NO.3 and SEQ ID NO.4; or SEQ ID NO.1 and SEQ ID NO.4; and either strand of the quality control product has the sequence shown in SEQ ID NO.5.
12. The quality control method according to item 11, wherein
confirming the sequence consistency of the oligonucleotide to be detected according to the sequencing data means splitting the raw sequencing data by the sequences of the DNA fragments of known sequence, so that reads containing the same known DNA fragment fall into the same data set, and then confirming the sequence consistency of the oligonucleotide to be detected within each data set.
13. The quality control method according to item 12, wherein
the sequence consistency of the oligonucleotide to be detected comprises the cross-contamination rate and the synthesis error rate of the oligonucleotide to be detected;
preferably, the cross-contamination rate and the synthesis error rate are calculated by counting the types and numbers of tag sequences in the sequencing reads of the raw sequencing data.
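The splitting step described in item 12 can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented analysis pipeline: each read is modeled as a hypothetical (read sequence, tag sequence) pair, a read is assigned to a quality control product when it contains an exact match to the first `probe_len` bases of that product's known sequence, and the function name, the `probe_len` parameter and the toy sequences are all invented for illustration.

```python
from collections import defaultdict


def split_by_known_sequence(reads, known_seqs, probe_len=20):
    """Group the tags of reads by the known QC fragment each read contains.

    reads: iterable of (read_sequence, tag_sequence) pairs (assumed layout)
    known_seqs: dict mapping a QC product name to its known sequence
    Reads matching no probe, or more than one, go to "unassigned".
    """
    probes = {name: seq[:probe_len] for name, seq in known_seqs.items()}
    groups = defaultdict(list)
    for read_seq, tag in reads:
        hits = [name for name, probe in probes.items() if probe in read_seq]
        key = hits[0] if len(hits) == 1 else "unassigned"
        groups[key].append(tag)
    return dict(groups)


# Toy data: hypothetical sequences, not taken from the patent.
known = {"QC1": "ACGTTGCAAGCTTAGGCATC", "QC2": "TTGACCGATAGGCTAACGTA"}
reads = [
    ("GGACGTTGCAAGCTT", "CATTGCTT"),
    ("TTGACCGATAGGAAA", "TTCGGATT"),
    ("AAAAAAAAAAAAAAA", "CATTGCTT"),
]
groups = split_by_known_sequence(reads, known, probe_len=10)
```

Exact substring matching is the simplest possible choice; a production workflow would more likely tolerate mismatches, since sequencing errors in the known fragment would otherwise push reads into the unassigned group.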
Effects of the invention
Used together with second-generation sequencing, the quality control product makes it possible to establish a complete quality control workflow for tag primer sequences: the sequence accuracy of the tag primer to be detected is analyzed by detecting the known sequence carried on the quality control product. Because the DNA fragment of known sequence in the quality control product differs from the genome of any known species, the detection is not contaminated by other libraries sequenced in the same batch.
Compared with the prior art, the quality control product allows the consistency of batches of tag primers to be checked in parallel on a large scale, is not affected by homology with exogenous DNA, and can serve as the basis for a standardized quality control workflow. With second-generation sequencing analysis, the sequencing results of the tag primers can be analyzed accurately and the quality inspection completed at high throughput. Cross-contamination and synthesis errors of the tag primers are analyzed through the specific known sequences, providing a high-throughput quality inspection method suited to practical testing.
Drawings
FIGS. 1-2 show the quality control detection workflows using the quality control product for single-tag primer pairs and bidirectional (dual-tag) primer pairs, respectively.
Detailed Description
The technical solutions of the present invention are described clearly and completely below. The embodiments described are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the invention.
It should be noted that certain terms are used throughout the specification and the claims to refer to particular components. Those skilled in the art will understand that the same component may be referred to by different names. The specification and the items distinguish components not by differences in name but by differences in function. As used throughout the specification and the claims, the terms "comprise" and "include" are open-ended and should therefore be interpreted as "including, but not limited to". The description that follows sets out preferred embodiments for practicing the invention; it is intended to illustrate the general principles of the specification and not to limit the scope of the invention. The scope of the invention is defined by the appended claims.
The term "oligonucleotide" as used herein refers to a linear polynucleotide fragment in which 2 to 10 nucleotide residues are linked by phosphodiester bonds. The number of residues is not strictly defined when the term is used, and in many documents polynucleotide molecules containing 30 or more nucleotide residues are also called oligonucleotides. Oligonucleotides can be synthesized automatically by an instrument and can be used as primers for DNA synthesis, as gene probes, and so on.
The term "primer" as used herein refers to a molecule with a specific nucleotide sequence that initiates synthesis at the start of nucleotide polymerization and binds to the template through hydrogen bonds. Primers are typically synthesized as two oligonucleotides, i.e., a primer pair: one primer is complementary to one strand of the DNA template at one end of the target region, and the other is complementary to the other strand at the other end of the target region. A primer serves as the starting point for nucleotide polymerization; a nucleic acid polymerase begins synthesizing a new nucleic acid strand from its 3' end.
The term "amplification" as used herein refers to gene amplification, i.e., the process by which the copy number of a particular gene is selectively increased while the copy number of other genes is not increased proportionally.
Natural gene amplification, also known as gene duplication or chromosomal duplication, is a major mechanism by which new genetic material arises during molecular evolution. It refers to the duplication of any DNA region that contains a gene.
Gene amplification may also be performed artificially by the following methods:
Polymerase chain reaction (PCR): a method that replicates the target DNA fragment repeatedly by polymerizing nucleotides.
Ligase chain reaction (LCR): a gene amplification method that amplifies the nucleic acid probe. For each of the two DNA strands, a ligase joins two partial probes into one complete probe. LCR thus uses two enzymes: a DNA polymerase (for initial template amplification) and a thermostable DNA ligase.
Transcription-mediated amplification: an isothermal gene amplification method that uses two enzymes, RNA polymerase and reverse transcriptase, to rapidly amplify target RNA/DNA.
In the present invention, the amplification method is not particularly limited, and a polymerase chain reaction, i.e., a PCR amplification method, is preferably used.
The term "sequencing" as used herein refers to gene sequencing, a gene detection technique that can determine the complete sequence of a gene from blood or saliva and can be used to predict the likelihood of various diseases and individual behavioral traits. Gene sequencing can identify disease-associated genes in an individual so that prevention and treatment can begin early. Existing gene sequencing technologies can be divided into three generations according to their technical characteristics: first-generation sequencing, also called Sanger sequencing or capillary sequencing; second-generation sequencing (NGS), also called high-throughput sequencing or large-scale parallel sequencing; and third-generation sequencing, also called single-molecule sequencing, which includes HeliScope sequencing, SMRT (Single Molecule Real Time) sequencing, ion semiconductor sequencing (Ion Torrent), and others. Of these, SMRT sequencing is the more mature.
The invention is not limited to the method of sequencing, and preferably, second generation sequencing (NGS) is employed.
The sense strand, also called the coding strand or positive strand, is conventionally drawn as the upper strand of double-stranded DNA, written 5' to 3' from left to right; its base sequence is essentially the same as that of the gene's mRNA. A primer that binds to this strand is a positive-strand primer, which is extended along the positive strand.
The negative strand, i.e., the antisense strand, also called the non-coding strand, is complementary to the positive strand. A primer that binds to this strand is a reverse-strand primer, which lies upstream on the DNA duplex and is extended continuously along the negative strand.
When nucleotides are joined into DNA, the phosphate group of one nucleotide forms a phosphodiester bond with the hydroxyl group of the adjacent nucleotide, leaving one free phosphate group at one end of the chain and one free hydroxyl group at the other. The term "5' end" herein refers to the end carrying the free phosphate group, and the term "3' end" refers to the end carrying the free hydroxyl group.
The term "sequencing platform" as used herein refers to the apparatus, equipment or software used in gene sequencing, including but not limited to Sanger, 454, SOLiD, HiSeq 2000, Helicos, DNA nanoball array, the PacBio RS system, PGM, MiSeq, and Illumina platforms. The invention does not limit the type of sequencing platform; preferably, a second-generation sequencing platform from Illumina is used.
The term "complementary pairing" as used herein, i.e., base complementary pairing, refers to the hydrogen bonding of the bases of nucleotide residues in nucleic acid molecules according to the correspondences A with T, A with U, and G with C.
Specifically, the invention comprises a quality control product for detecting a tag primer, characterized in that
the quality control product is a double-stranded DNA fragment comprising a non-natural oligonucleotide fragment of known sequence, the non-natural oligonucleotide fragment differing from the sequence at any position in the genome of any known species.
Both ends of the DNA fragment carry linker sequences, which are complementary sequences capable of binding specifically to the 3' ends of the tag primer pair to be detected.
The 3' ends of the sense strand and the antisense strand of the linker sequence have the same sequence and comprise a sequence specifically complementary to either primer of the tag primer pair to be detected; the 5' ends of the sense strand and the antisense strand of the linker sequence have the same sequence and comprise a sequence specifically complementary to the other primer of the tag primer pair to be detected.
The quality control product can be PCR-amplified with the tag primer pair to be detected, and the amplified product is a tag primer detection library. The tag primer pair to be detected is a second-generation sequencing tag primer pair;
the tag primer pair to be detected comprises a forward primer sequence and a reverse primer sequence, each of which has a complementary sequence that specifically recognizes a 3' end of the DNA fragment of known sequence; the oligonucleotide (tag) sequence is present in either the forward primer sequence or the reverse primer sequence and is located between the 5' end sequence and the 3' end sequence of that primer;
preferably, the 5' end sequences of the forward and reverse primer sequences each carry a linker sequence complementary to the sequencing platform.
Specifically, the sequences of the tag primer pair to be detected are any one of the following pairs: SEQ ID NO.1 and SEQ ID NO.2; SEQ ID NO.3 and SEQ ID NO.4; or SEQ ID NO.1 and SEQ ID NO.4; and either strand of the quality control product has the sequence shown in SEQ ID NO.55.
Further, the quality control product comprises a non-natural sequence of known sequence that differs from the sequence at any position in the genome of any known species. The non-natural sequence may be obtained by any method, such as artificial synthesis.
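One way to picture the requirement that the non-natural sequence differ from known genomes is a k-mer screen. The sketch below is a toy illustration under loud assumptions: the text does not say how non-naturalness is verified, the k-mer length `k` and the function names are hypothetical, and a real screen would more plausibly run an alignment search against public genome databases rather than compare against an in-memory list.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shares_kmer(candidate, references, k=12):
    """True if the candidate, on either strand, shares any k-mer with a reference.

    `k` is an assumed stringency setting, not a value from the source.
    """
    cand = kmers(candidate, k) | kmers(reverse_complement(candidate), k)
    return any(cand & kmers(ref, k) for ref in references)
```

A candidate for which `shares_kmer` returns `False` against every reference in the screen would pass this simplified check; smaller `k` makes the screen stricter.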
The known DNA sequence is 50-1000 bp long, preferably 150-500 bp, for example 150 bp, 200 bp, 250 bp, 300 bp, 350 bp, 400 bp, 450 bp or 500 bp.
The tag in the tag primer to be detected is 6-20 bp long, preferably 6-12 bp, for example 6 bp, 7 bp, 8 bp, 9 bp, 10 bp, 11 bp or 12 bp.
The tag may be located in the forward primer sequence and/or the reverse primer sequence of the tag primer.
The sequence of the non-natural oligonucleotide fragment of known sequence can be used to split the tag primer detection library data.
The invention also relates to a kit for detecting tag primers, which comprises the quality control product.
Further, the kit may also comprise a pair of adaptor primers, a PCR amplification reagent, and a purification reagent.
The invention also relates to a quality control method for second-generation sequencing tag primers, characterized by comprising the following steps:
A. carrying out PCR amplification with the tag primer pair to be detected and the quality control product;
B. obtaining the amplification product and sequencing it on a sequencer to obtain sequencing data;
C. confirming the sequence consistency of the oligonucleotide to be detected according to the sequencing data;
wherein the quality control product is any one of the quality control products described above;
the tag primer pair to be detected comprises a forward primer sequence and a reverse primer sequence, each of which has a complementary sequence that specifically recognizes a 3' end of the DNA fragment of known sequence; the oligonucleotide (tag) sequence is present in either the forward primer sequence or the reverse primer sequence and is located between the 5' end sequence and the 3' end sequence of that primer;
preferably, the 5' end sequences of the forward primer sequence and the reverse primer sequence each carry a linker sequence complementary to the sequencing platform;
preferably, the sequences of the tag primer pair to be detected are any one of the following pairs: SEQ ID NO.1 and SEQ ID NO.2; SEQ ID NO.3 and SEQ ID NO.4; or SEQ ID NO.1 and SEQ ID NO.4; and either strand of the quality control product has the sequence shown in SEQ ID NO.55.
Confirming the sequence consistency of the oligonucleotide to be detected according to the sequencing data means splitting the raw sequencing data by the sequences of the DNA fragments of known sequence, so that reads containing the same known DNA fragment fall into the same data set, and then confirming the sequence consistency of the oligonucleotide to be detected within each data set.
The sequence consistency of the oligonucleotide to be detected comprises the cross-contamination rate and the synthesis error rate of the oligonucleotide to be detected;
preferably, the cross-contamination rate and the synthesis error rate are calculated by counting the types and numbers of tag sequences in the sequencing reads of the raw sequencing data.
Sequence consistency of the oligonucleotide to be detected means that the sequencing data for the oligonucleotide to be detected match the preset sequence and/or that the sequencing data contain only a single sequence.
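The counting described above can be illustrated with a short sketch. The text states only that the rates come from counting the types and numbers of tag sequences, so the exact formulas here are assumptions: a read whose tag equals another tag in the tested panel is counted as cross-contamination, and a read whose tag matches no panel tag is counted as a synthesis error (in practice that class also absorbs sequencing errors). The function and key names are hypothetical.

```python
from collections import Counter


def tag_consistency(observed_tags, expected_tag, panel_tags):
    """Summarize the tags observed in one data set (one QC product).

    observed_tags: tag sequences read from the data set
    expected_tag: the tag that was supposed to be paired with this QC product
    panel_tags: all tags tested in the same run
    """
    counts = Counter(observed_tags)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("data set contains no reads")
    correct = counts[expected_tag]
    # Assumed definition: another panel tag showing up here is contamination.
    cross = sum(n for tag, n in counts.items()
                if tag != expected_tag and tag in panel_tags)
    # Assumed definition: everything else is attributed to synthesis errors.
    other = total - correct - cross
    return {
        "correct_rate": correct / total,
        "cross_contamination_rate": cross / total,
        "synthesis_error_rate": other / total,
        "n_tag_types": len(counts),
    }


# Toy example: 96 correct reads, 3 reads with another panel tag, 1 unknown tag.
observed = ["CATTGCTT"] * 96 + ["TTCGGATT"] * 3 + ["CATTGCTA"]
panel = {"CATTGCTT", "TTCGGATT", "TCATCATT"}
report = tag_consistency(observed, "CATTGCTT", panel)
```

A data set that "contains only a single sequence" in the sense used above would give `n_tag_types == 1` and a `correct_rate` of 1.0.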
In a specific embodiment, the quality control sequence is selected from any one or more of SEQ ID NO.5-SEQ ID NO. 29.
In one embodiment, the forward and reverse strand primer sequences are:
SEQ ID NO:1
5’-CAAGCAGAAGACGGCATACGAGATNNNN…NNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3’
SEQ ID NO:2
5’-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3’
or
SEQ ID NO:3
5’-CAAGCAGAAGACGGCATACGAGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3’
SEQ ID NO:4
5’-AATGATACGGCGACCACCGAGATCTNNNN…NNNNACACTCTTTCCCTACACGACGCTCTTCCGATCT-3’
SEQ ID NO:55
5’-ACACTCTTTCCCTACACGACGCTCTTCCGATCT-NNNNN........NNNNN-AGATCGGAAGAGCACACGTCT-3’
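The complementarity between the primer 3' ends and the two ends of the quality control sequence listed above can be checked directly. The sketch below uses the two primers that contain no placeholder bases (SEQ ID NO:2 and SEQ ID NO:3) and the two fixed ends of the quality control sequence; the variable and function names are illustrative only. A primer whose 3' end is identical to the left end of the strand shown anneals to the complementary strand, and a primer whose 3' end is the reverse complement of the right end anneals to the strand shown.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from the listing above (placeholder N runs omitted).
SEQ_ID_2 = "AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT"
SEQ_ID_3 = "CAAGCAGAAGACGGCATACGAGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"
QC_LEFT_END = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"
QC_RIGHT_END = "AGATCGGAAGAGCACACGTCT"


def primes_left_end(primer, qc_left_end):
    """Primer 3' end equals the left end, so it pairs with the other strand."""
    return primer.endswith(qc_left_end)


def primes_right_end(primer, qc_right_end):
    """Primer 3' end is the reverse complement of the right end."""
    return primer.endswith(reverse_complement(qc_right_end))
```

With these definitions, `primes_left_end(SEQ_ID_2, QC_LEFT_END)` and `primes_right_end(SEQ_ID_3, QC_RIGHT_END)` both evaluate to `True`, which matches the statement that the linker sequences bind the 3' ends of the tag primer pair.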
Example 1
The technical solutions of the present invention will be clearly and completely described in connection with the embodiments, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Materials, reagents and the like used in the examples described below are commercially available unless otherwise specified.
The sequences of the 25 quality control products are shown in SEQ ID NO.6-30; see Table 1.
TABLE 1. The 25 known DNA sequences
[Table 1 appears as images in the original publication (Figures BDA0004034059690000091 to BDA0004034059690000141); the sequences are not reproduced in this text.]
The sequences of the 25 tag primers to be detected are shown in SEQ ID NO.30-54; see Table 2.
TABLE 2 25 oligonucleotide sequences to be detected
SEQ ID NO. No. Sequence
SEQ ID NO.30 1 CATTGCTT
SEQ ID NO.31 2 TTCGGATT
SEQ ID NO.32 3 TCATCATT
SEQ ID NO.33 4 CAACAGGT
SEQ ID NO.34 5 TTCAAGGT
SEQ ID NO.35 6 CCTAACGT
SEQ ID NO.36 7 CACGTAGT
SEQ ID NO.37 8 TACCTTCT
SEQ ID NO.38 9 CCAGCGCT
SEQ ID NO.39 10 ACCAGACT
SEQ ID NO.40 11 CTATAACT
SEQ ID NO.41 12 CTAGTTAT
SEQ ID NO.42 13 TCTTATAT
SEQ ID NO.43 14 AATAAGAT
SEQ ID NO.44 15 TATGCCAT
SEQ ID NO.45 16 ATTCTAAT
SEQ ID NO.46 17 TAATGTTG
SEQ ID NO.47 18 ATTCACTG
SEQ ID NO.48 19 ATCATATG
SEQ ID NO.49 20 CTTGATGG
SEQ ID NO.50 21 TTAACCGG
SEQ ID NO.51 22 CTAAGTCG
SEQ ID NO.52 23 TATTCGCG
SEQ ID NO.53 24 CCTGTGAG
SEQ ID NO.54 25 CAACTAAG
1. Preparation of quality control with known DNA sequences (known sequences)
(1) Artificially synthesize 25 known DNA sequences carrying PCR linkers (the known sequences are shown in SEQ ID NO.5-29);
(2) The 25 known sequences can be amplified with adaptor primers 1 and 2 described below, which provides a continuous and stable supply of the known sequences;
the sequence of the adaptor primer 1 is shown in SEQ ID NO. 55:
SEQ ID NO.55:GACTGGAGTTCAGACGTGTGCTCTTCCGATCT
the sequence of the adaptor primer 2 is shown in SEQ ID NO. 56:
SEQ ID NO.56:ACACTCTTTCCCTACACGACGCTCTTCCGATCT
(3) Dilution of the artificially synthesized known sequences: measure the concentration of each artificially synthesized known DNA sequence with the Qubit HS assay and dilute it to 1 ng/µL with dilution buffer;
(4) The PCR amplification system is shown in Table 3;
TABLE 3
No. | Component (total system: 50 µL × 1 tube) | Volume per reaction (µL)
1 | Known DNA sequence | 1
2 | HiFi Mix | 25
3 | Adaptor primer 1 (10 pmol/µL) | 4
4 | Adaptor primer 2 (10 pmol/µL) | 4
5 | ddH2O | 16
(5) PCR amplification program: 94 ℃ for 2 min; (94 ℃ for 15 s, 62 ℃ for 30 s, 72 ℃ for 30 s) × 17 cycles; 72 ℃ for 10 min; hold at 4 ℃;
(6) Magnetic bead purification: after amplification is complete, purify with 1.5× magnetic beads and elute with 50 µL of elution buffer.
2. Preparation of label primer to be detected
(1) Primer solubilization
Dissolve the dry powder of the tag primers to be detected to prepare a working solution.
The forward strand primer sequence of the primer pair is shown in SEQ ID NO.1 or SEQ ID NO.3;
the reverse strand primer sequence is shown in SEQ ID NO.2 or SEQ ID NO.4;
the NNNN…NNNN part of the sequence shown is replaced with one of the oligonucleotide sequences to be detected listed in Table 2.
The combination of SEQ ID NO.1 and SEQ ID NO.2 was selected for use in the present embodiment.
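The replacement of the NNNN…NNNN placeholder can be sketched in Python as follows. The flanking sequences are copied from SEQ ID NO.1 as written above. The function name `build_tag_primer` is hypothetical, and the sketch assumes that the Table 2 sequence is inserted verbatim (the text does not state whether it is inserted as written or as its reverse complement).

```python
# Fixed parts of SEQ ID NO.1 as written above; the tag replaces NNNN…NNNN.
SEQ1_PREFIX = "CAAGCAGAAGACGGCATACGAGAT"
SEQ1_SUFFIX = "GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT"


def build_tag_primer(tag, prefix=SEQ1_PREFIX, suffix=SEQ1_SUFFIX):
    """Return the full primer with the tag placed between prefix and suffix.

    Assumption: the tag is inserted exactly as listed in Table 2.
    """
    tag = tag.upper()
    if not tag or set(tag) - set("ACGT"):
        raise ValueError(f"tag must contain only A, C, G, T: {tag!r}")
    return prefix + tag + suffix


# Tag No. 1 from Table 2.
primer = build_tag_primer("CATTGCTT")
```

Keeping the flanking sequences as separate constants makes it straightforward to apply the same substitution to SEQ ID NO.4, whose placeholder sits between different flanks.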
The dissolution method is as follows: place the dry tag primer powder in a high-speed centrifuge and centrifuge at 12000 rpm for 5 min. Dilute the dry primer powder to 10 pmol/µL with sterilized purified water; the volume of water added (in µL) is 100 times the primer amount in nmol. After adding the water, vortex to mix and centrifuge briefly in a palm (mini) centrifuge; let stand for 5 min, then repeat the vortexing and brief centrifugation.
Note: the tag primer working solution is valid for 14 months when stored below -15 ℃.
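The dilution rule above follows from unit conversion: 1 nmol equals 1000 pmol, so reaching 10 pmol/µL requires 100 µL of water per nmol of primer. A minimal Python sketch of this arithmetic is given below; the function name `water_volume_ul` and the example amounts are illustrative assumptions.

```python
def water_volume_ul(primer_nmol, target_pmol_per_ul=10.0):
    """Volume of water (µL) that brings primer_nmol of dry primer to the target concentration."""
    if primer_nmol <= 0 or target_pmol_per_ul <= 0:
        raise ValueError("amount and concentration must be positive")
    # 1 nmol = 1000 pmol
    return primer_nmol * 1000.0 / target_pmol_per_ul


volume_for_1_nmol = water_volume_ul(1.0)    # 100 µL per nmol at 10 pmol/µL
volume_for_2_5_nmol = water_volume_ul(2.5)  # hypothetical 2.5 nmol tube
```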
3. PCR reaction
(1) Take out the known DNA sequences, thaw at room temperature, vortex to mix, centrifuge briefly, and place on an ice box. Measure the concentration of the thawed known DNA sequence templates with a fluorometer and the Qubit dsDNA HS Assay Kit, using 1 µL of sample per measurement. Dilute each known DNA sequence template to 1 ng/µL with sterilized purified water.
(2) Take out the KAPA HiFi Hotstart Ready Mix and the reverse strand primer, thaw at room temperature, vortex to mix, centrifuge briefly, and place on an ice box. Prepare the premix on the ice box according to the PCR reaction system shown in Table 4 below. Vortex the prepared PCR reaction premix to mix and centrifuge briefly.
TABLE 4
Reagent name | Volume per reaction (µL)
KAPA HiFi Hotstart Ready Mix | 25
Reverse strand primer (10 pmol/µL) | 4
Sterilized purified water | 16
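The premix in Table 4 can be scaled to the number of reactions with a short calculation. The Python sketch below uses the per-reaction volumes from Table 4; the function name `scale_premix` and the optional `overage` fraction (extra volume to cover pipetting loss) are assumptions, since the text does not mention any overage. The per-reaction premix volume is 45 µL, which together with the 4 µL of tag primer and 1 µL of template added in the following steps gives 50 µL.

```python
# Per-reaction volumes (µL) taken from Table 4.
PREMIX_PER_REACTION_UL = {
    "KAPA HiFi Hotstart Ready Mix": 25.0,
    "Reverse strand primer (10 pmol/µL)": 4.0,
    "Sterilized purified water": 16.0,
}


def scale_premix(n_reactions, overage=0.0):
    """Scale the Table 4 volumes to n_reactions.

    overage is an assumed extra fraction (for example 0.1 for 10 percent);
    the source text does not specify one.
    """
    if n_reactions <= 0 or overage < 0:
        raise ValueError("n_reactions must be positive and overage non-negative")
    factor = n_reactions * (1.0 + overage)
    return {name: vol * factor for name, vol in PREMIX_PER_REACTION_UL.items()}


premix_per_reaction_ul = sum(PREMIX_PER_REACTION_UL.values())
premix_for_23 = scale_premix(23)
```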
(3) Aspirate the PCR reaction premix and dispense it separately into 23 wells (or PCR tubes) of a 96-well PCR plate.
(4) Aspirate 4 µL of each tag primer to be detected and add it to the PCR plate (or PCR tube) containing the PCR reaction premix; vortex to mix and centrifuge briefly.
(5) Aspirate 1 µL of each known DNA sequence diluted in step (1) and add it to the corresponding well of the 96-well PCR plate (or PCR tube) from step (4), according to the following table. Vortex to mix and centrifuge briefly.
(6) Place the samples in a PCR thermal cycler; the PCR conditions are shown in Table 5 below.
TABLE 5
Figure BDA0004034059690000171
Note: the heated lid of the PCR instrument was set to 105 ℃ and the volume to 55 µL.
PCR reaction product purification
Purify with 0.9× magnetic beads and elute with 50 µL of elution buffer.
4. Pooling of amplified and purified products
Measure the concentration of each amplified and purified product with a fluorometer and the Qubit dsDNA HS Assay Kit, using 1 µL of sample per measurement.
Calculate the pooling volume of each amplified and purified product separately; according to these volumes, pipette the 23 amplified and purified products into one new 1.5 mL centrifuge tube and mix them into a single-tube pooled library.
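The text does not state the rule used to calculate the pooling volumes. One common convention, assumed here purely for illustration, is to pool an equal DNA mass from each library, so that the volume taken is the target mass divided by the measured concentration. The function name `pooling_volumes_ul`, the target mass, and the example concentrations below are hypothetical.

```python
def pooling_volumes_ul(concentrations_ng_per_ul, mass_per_library_ng):
    """Volume of each library (µL) that contributes the same DNA mass to the pool.

    Assumption: equal-mass pooling; the source does not specify the rule.
    """
    volumes = {}
    for name, conc in concentrations_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"non-positive concentration for {name}")
        volumes[name] = mass_per_library_ng / conc
    return volumes


# Invented concentrations for two libraries, pooled at 100 ng each.
example_volumes = pooling_volumes_ul({"lib01": 20.0, "lib02": 40.0}, 100.0)
```

Because the amplified products share the same flanking sequences and differ mainly in the short tag, equal-mass pooling is close to equimolar pooling when the inserts have similar lengths; this is a property of the assumed rule rather than a statement from the text.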
5. Library quality detection
The concentration of the pooled library was determined using a fluorescent quantitative PCR analyzer.
6. Sequencing on machine
The library was sequenced on the instrument. Sequencing type: SE40+8, 8M reads, 75 cycles.
7. Information analysis
The sequencing data are split and analyzed after the run to check, for each tag primer to be detected, whether there is no data output and what the cross-contamination rate is. The reads data set from the NextSeq 550/500 platform is processed, and for each split data set the ratio of reads carrying the corresponding oligonucleotide sequence to be detected to all reads in that data set is counted.
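The per-data-set statistics can be sketched in Python as follows. The text does not give exact formulas, so the definitions used here are assumptions: the cross-contamination rate is taken as the fraction of reads carrying a different tag from the tested set, and reads with a tag outside the tested set are reported separately as candidates for synthesis errors. The function name `summarize_dataset` and the toy counts are illustrative and are not the results of Table 6.

```python
from collections import Counter


def summarize_dataset(tag_counts, expected_tag, all_tags):
    """Summarize one split data set.

    tag_counts: Counter of tag sequences observed in the data set.
    expected_tag: tag that should accompany this quality control sequence.
    all_tags: set of every tag tested in the run (assumed to be known).
    """
    total = sum(tag_counts.values())
    if total == 0:
        return {"no_data_output": True}
    expected = tag_counts.get(expected_tag, 0)
    # Reads carrying another tag from the tested set.
    cross = sum(n for t, n in tag_counts.items() if t != expected_tag and t in all_tags)
    # Reads carrying a tag that is not in the tested set at all.
    other = total - expected - cross
    return {
        "no_data_output": False,
        "expected_ratio": expected / total,
        "cross_contamination_rate": cross / total,
        "other_tag_rate": other / total,
    }


# Toy counts, invented for illustration only.
tags = {"CATTGCTT", "TTCGGATT", "TACCTTCT"}
counts = Counter({"TACCTTCT": 990, "CATTGCTT": 8, "TACCTTCA": 2})
summary = summarize_dataset(counts, "TACCTTCT", tags)
empty_summary = summarize_dataset(Counter(), "TACCTTCT", tags)
```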
The results are shown in Table 6.
TABLE 6
Figure BDA0004034059690000172
Figure BDA0004034059690000181
Analysis conclusion:
Of the 25 groups of oligonucleotides tested, groups 8 and 15 showed cross contamination, with contamination rates of 0.34% and 0.37%, respectively; group 22 had a synthesis error.

Claims (13)

1. A quality control product for detecting a tag primer, characterized in that
the quality control product is a double-stranded DNA fragment;
the DNA fragment comprises a non-natural oligonucleotide segment of known sequence, the non-natural oligonucleotide segment differing completely from the sequence at any position in the genome of any known species;
both ends of the DNA fragment carry linker sequences, and the linker sequences are complementary sequences capable of specifically binding to the 3' ends of the tag primer pair to be detected.
2. The quality control article according to claim 1, wherein,
the 3' ends of the sense strand and the antisense strand of the linker sequence have the same sequence and comprise a sequence specifically complementary to any one of the tag primer pairs to be detected;
the 5' ends of the sense strand and the antisense strand of the linker sequence have the same sequence and comprise a sequence specifically complementary to the other of the pair of tag primers to be detected.
3. The quality control product according to claim 1, wherein the quality control product can be amplified by PCR with a pair of tag primers to be detected, and the amplified product is a tag primer detection library.
4. The quality control article of claim 2, wherein the pair of tag primers to be detected is a pair of second generation sequencing tag primers;
the tag primer pair to be detected comprises a forward primer sequence and a reverse primer sequence, each of which has a sequence complementary to, and specifically recognizing, the 3' end of the DNA fragment of known sequence; the oligonucleotide sequence is present on either the forward primer sequence or the reverse primer sequence and is located between its 5' end sequence and its 3' end sequence;
preferably, the 5' end sequences of the forward and reverse strand primer sequences are provided with a linker sequence complementary to the sequencing platform, respectively.
5. The quality control article according to claim 4, wherein the tag primer sequences to be detected are any pair of SEQ ID NO.1 and SEQ ID NO.2, SEQ ID NO.3 and SEQ ID NO.4, or SEQ ID NO.1 and SEQ ID NO.4.
6. The quality control product according to claim 5, wherein any one of the chain sequences of the quality control product is shown as SEQ ID NO. 55.
7. The quality control according to claim 1, wherein the known non-natural oligonucleotide fragment has a length of 50-1000 bp, preferably 150-500 bp.
8. The quality control product of claim 1, wherein the sequence of the non-natural oligonucleotide fragment of known sequence is used to split the tag primer detection library data.
9. A kit for detecting a tagged primer, comprising the quality control article of claims 1-8.
10. The kit according to claim 9, further comprising a pair of adapter primers, a PCR amplification reagent, and a purification reagent.
11. A quality control method of a second generation sequencing tag primer, which is characterized by comprising the following steps:
A. carrying out PCR amplification on the label primer pair to be detected and the quality control product;
B. obtaining an amplification product, and sequencing the amplification product on a machine to obtain sequencing data;
C. confirming the sequence consistency of the oligonucleotides to be detected according to the sequencing data;
wherein the quality control product is selected from any one of the quality control products of claims 1-8;
the tag primer pair to be detected comprises a forward primer sequence and a reverse primer sequence, each of which has a sequence complementary to, and specifically recognizing, the 3' end of the DNA fragment of known sequence; the oligonucleotide sequence is present on either the forward primer sequence or the reverse primer sequence and is located between its 5' end sequence and its 3' end sequence;
preferably, the 5' end sequences of the forward primer sequence and the reverse primer sequence are respectively provided with a connector sequence which is complementarily matched with the sequencing platform;
preferably, the sequence of the label primer to be detected is any pair of SEQ ID NO.1 and SEQ ID NO.2, SEQ ID NO.3 and SEQ ID NO.4, SEQ ID NO.1 and SEQ ID NO.4; any chain sequence of the quality control product is shown as SEQ ID NO. 5.
12. The method of claim 11, wherein,
the step of confirming the sequence consistency of the oligonucleotide to be detected according to the sequencing data is to split the original sequencing data through the sequence data of the DNA fragments with known sequences, split the data containing the DNA fragments with the same known sequences into the same data set, and confirm the sequence consistency of the oligonucleotide to be detected in the data set.
13. The method of claim 12, wherein,
the sequence consistency of the oligonucleotide to be detected comprises the cross contamination rate and the synthesis error rate of the nucleotide to be detected;
preferably, the cross contamination rate and the synthesis error rate are calculated by counting the types and numbers of tag sequences in the sequencing reads of the original sequencing data.
CN202211734834.6A 2021-12-31 2022-12-31 Quality control product and kit for parallel detection of tag primer sequences Pending CN116377046A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2021116750667 2021-12-31
CN202111675066 2021-12-31

Publications (1)

Publication Number Publication Date
CN116377046A true CN116377046A (en) 2023-07-04

Family

ID=85890819

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202211733871.5A Pending CN115948522A (en) 2021-12-31 2022-12-30 Method for detecting oligonucleotide sequence consistency
CN202211734834.6A Pending CN116377046A (en) 2021-12-31 2022-12-31 Quality control product and kit for parallel detection of tag primer sequences

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202211733871.5A Pending CN115948522A (en) 2021-12-31 2022-12-30 Method for detecting oligonucleotide sequence consistency

Country Status (1)

Country Link
CN (2) CN115948522A (en)

Also Published As

Publication number Publication date
CN115948522A (en) 2023-04-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination