CN113337639B - Method for detecting COVID-19 based on mNGS and application thereof - Google Patents
Method for detecting COVID-19 based on mNGS and application thereof
- Publication number
- CN113337639B CN202110597445.2A
- Authority
- CN
- China
- Prior art keywords
- primer
- genome
- reverse transcription
- primers
- primer group
- Prior art date
- Legal status
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/70—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
- C12Q1/701—Specific hybridization probes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6844—Nucleic acid amplification reactions
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/06—Biochemical methods, e.g. using enzymes or whole viable microorganisms
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/16—Primer sets for multiplex assays
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Engineering & Computer Science (AREA)
- Molecular Biology (AREA)
- Immunology (AREA)
- Microbiology (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Biophysics (AREA)
- Analytical Chemistry (AREA)
- Physics & Mathematics (AREA)
- Biotechnology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- General Chemical & Material Sciences (AREA)
- Medicinal Chemistry (AREA)
- Virology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention provides a method for detecting the COVID-19 novel coronavirus based on mNGS and an application thereof. By designing and preparing a multiple genome-specific reverse transcription primer set for the COVID-19 novel coronavirus, the method improves the reverse transcription efficiency of the viral RNA, reduces aerosol contamination, and improves the enrichment of nucleic acid sequences from different variant viruses.
Description
Technical Field
The invention belongs to the field of gene detection, and particularly relates to a method for detecting COVID-19 based on mNGS and application thereof.
Background
The symptoms of novel coronavirus pneumonia are similar to those of common pneumonia, so rapid and accurate diagnosis plays a crucial role in patient treatment. At present, the mainstream method for diagnosis and epidemic screening is real-time fluorescence RT-PCR, which is simple to operate, fast and highly sensitive, and is the first choice for rapid diagnosis and large-scale population screening. However, the technology also has drawbacks. Current commercial kits usually target two specific gene loci of the novel coronavirus, so only two short sequences can be detected. The viral nucleic acid is RNA, which degrades more easily than DNA; after sampling, virus particles die and lyse and RNA integrity declines, so a false negative can occur if the residual nucleic acid does not cover a target region. In addition, the virus mutates continuously, and once a target region in a sample is mutated, the test risks missing the infection.
Detection of the novel coronavirus genome by second-generation sequencing, by contrast, can cover the full length of the COVID-19 sequence. This overcomes the limitation that RT-PCR detects only a small number of known COVID-19 regions: it improves detection accuracy, identifies possible viral variation, allows the disease to be assessed and a treatment plan formulated according to genotype, and supports tracing the source and transmission route from the variation found. Genome detection of the novel coronavirus has therefore been generally accepted by the medical community in China as an in vitro diagnostic method, and it is listed alongside real-time fluorescence RT-PCR in the diagnosis and treatment protocol for novel coronavirus pneumonia issued by the National Health Commission.
Common genome sequencing approaches include metagenomic sequencing and targeted enrichment. Metagenomic sequencing, however, detects all nucleic acids in a sample indiscriminately; the high proportion of human nucleic acid not only reduces sensitivity but, because the proportion differs between samples, also makes the signal intensity unstable and produces false negatives. Targeted enrichment mostly uses multiplex PCR to enrich novel coronavirus genome sequences and remove interference from human nucleic acid, but the number of cycles used to amplify cDNA after reverse transcription is high, and PCR products readily cause cross-contamination and aerosol contamination, producing false positives. Moreover, designing primers against only the single reference genome sequence provided by NCBI inevitably misses variant sequences. The novel coronavirus mutates quickly and its sequence diversity is very rich; new subtypes are continually discovered and uploaded worldwide, and NCBI (National Center for Biotechnology Information) already holds 50,326 novel coronavirus genome sequences. Designing primers against one fixed genome sequence reduces enrichment of variant virus sequences and causes false negatives.
The invention is provided in view of the above.
Disclosure of Invention
The invention aims to provide a method for effectively detecting the COVID-19 novel coronavirus. To this end, the following technical solutions are provided:
The invention first provides a method for preparing a multiple genome-specific reverse transcription primer set for a virus, comprising the following steps:
Step 1) generating a multiple genome: downloading N1 genome sequences of the virus from a genome database and performing a multiple sequence alignment of all downloaded sequences, where 100 < N1 < 1000.
Step 2) preparing a candidate primer set: dividing the aligned multiple genome sequence into short fragments of 300-500 nt (preferably 394 nt) that overlap by 150-300 nt (preferably 197 nt); taking 30-60 nt (preferably 50 nt) at each end of every fragment as a primer design region; and randomly designing a plurality of forward and/or reverse primers of 10-15 nt (preferably 13 nt) against each primer design region to form that region's primer pool to be merged. All primers in each region's pool are ranked by frequency of occurrence (preferably from high to low); the most frequent primer containing no ambiguous base is selected and all primers with the same sequence are deleted; the remaining primers are ranked and screened again; and this is repeated N2 times to obtain the candidate primer set, where 1 ≤ N2 ≤ 8.
Further, the method also comprises:
Step 3) primer screening: performing a secondary screen of the candidate primer set and deleting primers that meet any one or more of the following conditions: the Tm value deviates from the mean by more than 2 standard deviations; a self-dimer or cross-dimer may form; a homopolymer run of more than 5 nt is present. The final multiple genome reverse transcription primer set is obtained after this screening.
Step 4) conserved-region primer design: for primer design regions located in highly conserved regions (such as the E gene), where the sequence is identical in all genomes, a 13-nt primer is designed at the 3' end of the region using Primer3 software. After deleting primers whose Tm value deviates by more than 2 standard deviations from the mean of the primer set obtained in step 3), or which may form cross-dimers with that primer set, the primer ranked highest by the software is selected to replace the primer for the same primer design region in the step 3) primer set, forming the final reverse transcription primer set.
Further, the genome database in step 1) includes, but is not limited to, the NCBI GenBank, DDBJ and EMBL databases; preferably, multiple genome sequences of the virus are downloaded from the NCBI database and all downloaded genomes are aligned using MAFFT, a multiple sequence alignment program based on the fast Fourier transform.
Further, step 2) is specifically as follows: preparing a candidate primer set: using PYFASTA software, the aligned multiple genome sequence is divided into 394-nt short fragments with 197-nt overlap; 50 nt at each end of every fragment is taken as a primer design region; and a plurality of 13-nt forward and/or reverse primers are designed against each region to form that region's primer pool to be merged. All primers in each pool are ranked by frequency of occurrence from high to low; the most frequent primer containing no ambiguous base is selected and all primers with the same sequence are deleted; the remaining primers are ranked and screened again; and this is repeated N times to obtain the candidate primer set, where 1 ≤ N ≤ 8.
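As a minimal sketch of the fragmentation described above, assuming the alignment is addressed by 0-based column coordinates, the preferred values (394-nt fragments, 197-nt overlap, 50-nt end regions) can be turned into coordinates as follows. The function names `tile_windows` and `design_regions` are hypothetical and are not part of PYFASTA. The text does not say how a trailing partial fragment is handled, so this sketch simply drops it.

```python
def tile_windows(total_len, window=394, overlap=197):
    """Return (start, end) column ranges of overlapping fragments (end exclusive)."""
    step = window - overlap  # consecutive fragments start 197 nt apart by default
    return [(s, s + window) for s in range(0, total_len - window + 1, step)]


def design_regions(fragment, flank=50):
    """Return the two end regions of one fragment that serve as primer design regions."""
    start, end = fragment
    return (start, start + flank), (end - flank, end)


# Example: a 1000-column alignment yields four full fragments.
fragments = tile_windows(1000)
left_region, right_region = design_regions(fragments[0])
```

With the preferred values, each fragment shares exactly half of its length with its neighbour, so every alignment column away from the ends is covered by two fragments.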
Further, the virus is the COVID-19 novel coronavirus;
further, the virus is the COVID-19 novel coronavirus to be detected by mNGS.
In some embodiments, the sequences of the final multiple genome reverse transcription primer set are shown in SEQ ID NO. 1-277.
The invention also provides a library construction method for detecting the COVID-19 novel coronavirus based on mNGS, comprising the following steps:
1) primer set preparation: designing and screening a multiple genome-specific reverse transcription primer set by the preparation method described above;
2) reverse transcription and cDNA synthesis: performing multiplex amplification using the designed multiple genome-specific reverse transcription primer set;
3) library construction: constructing a library from the cDNA sequences.
Further, the reverse transcription and cDNA synthesis of step 2) comprises adding random reverse transcription primers to the multiple genome reverse transcription primer set and then performing multiplex amplification;
further, the multiplex amplification is carried out using a full-length reverse transcriptase;
in some embodiments, the full-length reverse transcriptase is the HiScript III Enzyme system.
In some more preferred embodiments, the multiplex amplification step is as follows:
the sample is heated at 65 °C for 5 min, rapidly cooled on ice, and left on ice for 2 min;
the following reaction system is prepared:
Reagent | Volume |
10× RT Mix | 2 μL |
HiScript III Enzyme Mix | 2 μL |
Reverse transcription primer working solution | 2 μL |
Random hexamers | 1 μL |
First-strand cDNA synthesis is performed under the following conditions:
Heated lid (105 °C) | On |
25 °C | 5 min |
37 °C | 45 min |
85 °C | 5 s |
Further, double-stranded cDNA is synthesized from the multiplex amplification product using a commercial double-stranded synthesis system.
Further, the design and screening of the multiple genome-specific reverse transcription primer set in step 1) follows steps 1) to 4) of the preparation method described above, with the same preferred genome databases, alignment and fragmentation software, parameters (100 < N1 < 1000, 1 ≤ N2 ≤ 8) and virus; in some embodiments, the sequences of the final multiple genome reverse transcription primer set are shown in SEQ ID NO. 1-277.
The invention also provides a method for detecting the COVID-19 novel coronavirus based on mNGS, which comprises the steps of the library construction method described above, followed by sequencing and bioinformatic analysis.
The invention also provides a multiple genome-specific reverse transcription primer set for the COVID-19 novel coronavirus, whose primer sequences are shown in SEQ ID NO. 1-277.
Further, the primer set is obtained by the method for designing and screening multiple genome-specific reverse transcription primer sets described above.
The invention also provides any one of the following uses of the multiple genome-specific reverse transcription primer set:
1) use in multiplex amplification of the COVID-19 novel coronavirus;
2) use in constructing sequencing libraries of the COVID-19 novel coronavirus;
3) use in targeted enrichment of the COVID-19 novel coronavirus;
4) use in preparing a reagent for detecting the COVID-19 novel coronavirus.
The invention also provides a kit for COVID-19 novel coronavirus nucleic acid library construction or nucleic acid detection, which comprises the primer set of SEQ ID NO. 1-277.
Further, the kit also comprises a full-length reverse transcriptase; in some preferred embodiments, the full-length reverse transcriptase is the HiScript III Enzyme system.
Compared with the prior art, the invention has at least the following advantages:
1) The invention adds novel coronavirus-specific primers together with random primers at the reverse transcription stage, which improves the reverse transcription efficiency of the viral RNA and achieves enrichment while reducing the probability of aerosol contamination.
2) For primer design, the invention uses a set of multiple novel coronavirus genome sequences as the reference. Specifically, the reverse transcription primers are designed from more than 100 genomes, which takes the diversity of novel coronavirus sequences into account and improves enrichment of nucleic acid sequences from different variant viruses. Because primer screening selects the primer sequences that occur most frequently across many different variants, new variants that may emerge in the future are also very likely to be enriched successfully.
3) The final primer system SEQ ID NO. 1-277 established through design and screening can be used comprehensively and efficiently to construct libraries of the novel coronavirus and its variant strains, ensuring high sensitivity in subsequent sequencing-based detection.
4) Compared with the traditional approach of fragmenting RNA first and then amplifying and building the library, the method amplifies long fragments first and fragments afterwards. Specifically, reverse transcription uses a full-length reverse transcriptase (HiScript III Enzyme Mix), which increases the extension length from each specific primer and improves enrichment efficiency.
Detailed Description
The technical solutions of the present invention are described clearly and completely below, and it is obvious that the described embodiments are some, not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The following terms or definitions are provided only to aid in understanding the present invention. These definitions should not be construed to have a scope less than understood by those skilled in the art.
Unless defined otherwise below, all technical and scientific terms used in the detailed description of the present invention are intended to have the same meaning as commonly understood by one of ordinary skill in the art. While the following terms are believed to be well understood by those skilled in the art, the following definitions are set forth to better explain the present invention.
As used herein, the terms "comprising," "including," "having," "containing," or "involving" are inclusive or open-ended and do not exclude additional unrecited elements or method steps. The term "consisting of …" is considered to be a preferred embodiment of the term "comprising". If in the following a certain group is defined to comprise at least a certain number of embodiments, this should also be understood as disclosing a group which preferably only consists of these embodiments.
Where an indefinite or definite article is used when referring to a singular noun e.g. "a" or "an", "the", this includes a plural of that noun.
The terms "about" and "substantially" in the present invention denote an interval of accuracy that can be understood by a person skilled in the art, which still guarantees the technical effect of the feature in question. The term generally denotes a deviation of ± 10%, preferably ± 5%, from the indicated value.
Furthermore, the terms first, second, third, (a), (b), (c), and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
Specific examples are as follows.
Experimental example 1 primer design
1. Generation of multiple genomes
More than 100 COVID-19 novel coronavirus genomes were first downloaded from the NCBI website, and all downloaded genomes were aligned using MAFFT (v7.388), a multiple sequence alignment program based on the fast Fourier transform. Note that, to ensure that the resulting primers detect variants reliably, the multiple genome must be built from more than 100 source sequences.
2. Candidate primer design
The aligned multiple genome was divided into 394-nt short fragments with 197-nt overlap using PYFASTA software, and 50 nt at each end of every fragment was taken as a primer design region. A plurality of 13-nt forward or reverse primers were designed against each region to form that region's primer pool to be merged. All primers in each pool were ranked by frequency of occurrence from high to low; the most frequent primer containing no ambiguous base was selected and all primers with the same sequence were deleted; the remaining primers were ranked and screened again. Repeating this several times yielded a large candidate primer set.
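The frequency-based selection described above can be sketched as follows. This is an illustration under stated assumptions, not the inventors' implementation: the function name `candidate_primers` is hypothetical, every 13-mer in a region is enumerated (the text only says that a plurality of primers is designed), and strand orientation of the primers is left out.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")


def candidate_primers(region_seqs, k=13, n_rounds=8):
    """Pick up to n_rounds primers for one primer design region.

    region_seqs holds the same region cut from every aligned genome;
    alignment gaps ('-') are removed before k-mers are extracted.
    """
    pool = []
    for seq in region_seqs:
        seq = seq.replace("-", "").upper()
        pool.extend(seq[i:i + k] for i in range(len(seq) - k + 1))

    selected = []
    for _ in range(n_rounds):
        # Rank the remaining primers, ignoring any that contain ambiguous bases.
        counts = Counter(p for p in pool if set(p) <= UNAMBIGUOUS)
        if not counts:
            break
        best = counts.most_common(1)[0][0]
        selected.append(best)
        # Delete every primer identical to the one just selected.
        pool = [p for p in pool if p != best]
    return selected


# Toy example with 4-nt "primers" so the ranking is easy to follow by hand.
toy = ["AAAAC", "AAAAC", "AAAAG", "NAAAG"]
top = candidate_primers(toy, k=4, n_rounds=1)
all_rounds = candidate_primers(toy, k=4, n_rounds=8)
```

In the toy input, `AAAA` occurs in three sequences and is chosen first, while `NAAA` is never chosen because it contains an ambiguous base.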
3. Primer screening
The resulting candidate primer set is subjected to a secondary screen that deletes primers showing, for example, a Tm value more than 2 standard deviations from the mean, a possible self-dimer or cross-dimer, or a homopolymer run of more than 5 nt. These conditions are examples only; the actual process includes further screening conditions. The final COVID-19 multiple genome reverse transcription primer set is obtained after this secondary screen; the sequences of the primer set are shown in the following table.
The invention further includes manual sequence adjustment and the like, for example conserved-region primer design: for primer design regions located in highly conserved regions (such as the E gene), where the sequence is identical in all genomes, a 13-nt primer is designed at the 3' end of the region using Primer3 software. After deleting primers whose Tm value deviates by more than 2 standard deviations from the mean of the screened primer set, or which may form cross-dimers with that primer set, the primer ranked highest by the software is selected to replace the primer for the same primer design region in the screened set, forming the final reverse transcription primer set.
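Two of the secondary screening conditions named above, the Tm outlier rule and the homopolymer rule, can be sketched as below. The text does not state how Tm is calculated, so the Wallace rule (2 °C per A/T, 4 °C per G/C), which is commonly applied to short oligonucleotides, is used here purely as an assumption. The function names are hypothetical, and dimer prediction is omitted because it requires a thermodynamic model that the text does not specify.

```python
import re
from statistics import mean, pstdev

# A run of the same base longer than 5 nt, i.e. 6 or more identical bases.
HOMOPOLYMER = re.compile(r"(.)\1{5,}")


def wallace_tm(primer):
    """Estimate Tm in °C with the Wallace rule (assumed; not stated in the text)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def screen(primers, n_sd=2.0):
    """Drop primers whose Tm is an outlier or that contain a long homopolymer run."""
    tms = [wallace_tm(p) for p in primers]
    mu, sd = mean(tms), pstdev(tms)
    kept = []
    for primer, tm in zip(primers, tms):
        if sd > 0 and abs(tm - mu) > n_sd * sd:
            continue  # Tm deviates from the mean by more than n_sd standard deviations
        if HOMOPOLYMER.search(primer):
            continue  # homopolymer run of more than 5 nt
        kept.append(primer)
    return kept


example = ["ACGTACGTACGTA"] * 9 + ["GCGCGCGCGCGCG", "ACGTAAAAAAGTC"]
passed = screen(example)
```

In this example the all-GC primer is removed as a Tm outlier and the primer containing `AAAAAA` is removed by the homopolymer rule.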
Experimental example 2 experimental procedure
(I) Preparation of primers
(1) The primers were synthesized at 10 nmol per tube and PAGE-purified, 277 tubes in total.
(2) The primer tubes were centrifuged at 4000 rpm for 1 min.
(3) Tris-HCl was added to each tube, followed by vortex mixing, standing on ice, vortex mixing again, and a brief spin-down.
(4) The primers from all tubes were combined into one tube to obtain the COVID-19 multiple genome reverse transcription primer working solution for later use.
(II) RNA nucleic acid extraction
RNA was extracted using a viral RNA extraction kit from Qiagen, following the product instructions: after the sample is lysed, ethanol is added to precipitate the RNA; the lysate is passed through the spin column by centrifugation, where the RNA binds to the silica membrane under high-salt conditions; residual impurities on the membrane are washed away with wash buffer; and the RNA is finally eluted from the membrane under low-salt, high-pH conditions.
(III) human ribosomal RNA removal
Host ribosomal RNA was removed using a commercial ribosomal RNA removal kit (human/mouse/rat) and RNA purification magnetic beads. First, ribosomal RNA-specific DNA probes are hybridized to human ribosomal RNA using a slowly decreasing incubation temperature; second, the RNA:DNA hybrids are digested with RNase H, which specifically digests the RNA strand of such hybrids; the remaining DNA probes are then digested with DNase; finally, the remaining RNA is captured with the RNA purification magnetic beads supplied with the kit, and the bead-bound RNA is eluted in nuclease-free water.
(IV) reverse transcription
(1) First-strand cDNA synthesis using a full-length reverse transcription system
(1.1) Take 13 μL of sample (if less is available, make up to 13 μL with RNase-free ddH2O), heat at 65 °C for 5 min, quench rapidly on ice, and leave on ice for 2 min.
(1.2) Prepare the reaction system:
Reagent | Volume |
10× RT Mix | 2 μL |
HiScript III Enzyme Mix | 2 μL |
Reverse transcription primer working solution | 2 μL |
Random hexamers | 1 μL |
The HiScript III Enzyme Mix is from the HiScript III 1st Strand cDNA Synthesis Kit (+gDNA wiper) from Vazyme (Nanjing).
(1.3) Run the first-strand cDNA synthesis reaction under the following conditions:
Heated lid (105 °C) | On |
25 °C | 5 min |
37 °C | 45 min |
85 °C | 5 s |
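As a quick consistency check on the first-strand reaction above, the volumes and hold times can be tallied in a few lines. The sketch assumes that the 13 μL of denatured sample from step (1.1) goes into the same tube as the reagents of step (1.2); the variable names are illustrative only.

```python
# Volumes in µL, taken from step (1.1) and the reaction table in (1.2).
reaction_ul = {
    "RNA sample (made up with RNase-free ddH2O)": 13,
    "10x RT Mix": 2,
    "HiScript III Enzyme Mix": 2,
    "Reverse transcription primer working solution": 2,
    "Random hexamers": 1,
}

# (temperature in °C, hold time in seconds) from the program in (1.3).
program = [(25, 5 * 60), (37, 45 * 60), (85, 5)]

total_ul = sum(reaction_ul.values())
# Final strength of the 10x mix after dilution into the full reaction.
rt_mix_fold = 10 * reaction_ul["10x RT Mix"] / total_ul
total_seconds = sum(hold for _, hold in program)

print(f"total volume: {total_ul} µL, RT Mix at {rt_mix_fold:g}x, "
      f"program length: {total_seconds // 60} min {total_seconds % 60} s")
```

Under that assumption the volumes add up to a 20 μL reaction in which the 10× RT Mix ends up at 1×, and the program runs for just over 50 minutes.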
(2) Double-stranded cDNA synthesis using a commercial double-stranded synthesis system:
(2.1) Take the components required for second-strand synthesis out of storage at -30 to -15 °C, thaw them on ice, mix by inversion, and centrifuge briefly to collect the liquid at the bottom of the tube. Prepare the second-strand cDNA synthesis reaction system as follows:
Component | Volume (μL) |
Single-strand cDNA | 20 |
Double-stranded synthesis buffer | 20 |
Double-stranded synthesis enzyme | 5 |
Nuclease-free water | 5 |
Total | 50 |
(2.2) Using a pipette set within the 100 μL range, gently pipette up and down 10 times to mix thoroughly.
(2.3) Place the PCR tube on ice temporarily, set the following program on the PCR instrument, then put the tube into the instrument and run the program:
(2.4) Immediately after the PCR reaction is complete, purify the product with 90 μL of commercial DNA Clean Beads, and elute the cDNA bound to the magnetic beads in 50 μL of nuclease-free water.
(V) library construction
(1) DNA fragmentation/end repair/dA tail addition
The cDNA obtained by reverse transcription was fragmented, end-filled, phosphorylated at the 5' end and dA-tailed at the 3' end using commercial fragmentation and end-repair enzymes.
(2) Adapter ligation
Dual-index adapters compatible with Illumina sequencers were ligated to both ends of the cDNA fragments using commercial T4 ligase.
(3) Magnetic bead purification
Immediately after the ligation reaction was completed, the ligation product was purified using 60 μL of commercial DNA Clean Beads, and the DNA bound to the magnetic beads was eluted in 20 μL of nuclease-free water.
(4) Library amplification
The library was amplified using a commercial high-fidelity PCR enzyme premix and universal primers for Illumina sequencing adapters, raising the concentration of library fragments to a level sufficient for next-generation sequencing.
(5) Magnetic bead purification
Immediately after the PCR reaction was completed, the amplified library was purified using 50 μL of commercial DNA Clean Beads, and the DNA bound to the magnetic beads was eluted in 20 μL of nuclease-free water.
(VI) Library sequencing and analysis
The molar concentration of each purified library was measured with a qPCR library quantification kit; the libraries were then pooled in equimolar amounts and sequenced on an Illumina platform with an SE75 (single-end, 75 bp) strategy and a data volume of 20M. After the data came off the sequencer, the number of novel coronavirus reads detected was obtained by bioinformatic analysis.
Example 1 Experimental validation of reverse transcription primer set
1. Sample source
The novel coronavirus pseudovirus used to prepare the positive reference was purchased from Shanghai assist in san. It uses a retroviral vector carrying a partial sequence of the COVID-19 virus ORF1a/b gene and the full coding-region sequences of the E gene and the N gene, with a length of 2000 bp. It was verified as positive with an in vitro diagnostic reagent (fluorescent quantitative PCR method) and quantified at 2 × 10^8 copies/mL by standard curve. The cells used were purchased from Nanjing Kyobai as cell pellets of 10^6 cells/tube.
2. Preparation of Experimental samples
Since the limit of detection for RNA viruses in clinical respiratory tract samples by metagenomic detection is expected to be 1000 copies/mL, positive samples were prepared at a COVID-19 concentration of 1 × 10^3 copies/mL. To mimic the host content of alveolar lavage fluid, cultured human cell lines were added to the positive samples to a cell concentration of 10^5 cells/mL. A virus-free negative sample containing 10^5 cells/mL of human cells was also prepared.
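The dilution implied by these figures (a stock quantified at 2 × 10^8 copies/mL brought down to 1 × 10^3 copies/mL) can be worked through as below. This is a minimal sketch: the function names and the choice of equal steps of at most 100-fold are assumptions for illustration, and the patent does not state how the dilution was actually carried out.

```python
import math


def dilution_factor(stock_copies_per_ml, target_copies_per_ml):
    """Total fold dilution needed to go from stock to target concentration."""
    if target_copies_per_ml <= 0 or stock_copies_per_ml < target_copies_per_ml:
        raise ValueError("target must be positive and no higher than the stock")
    return stock_copies_per_ml / target_copies_per_ml


def serial_dilution_plan(stock, target, max_step=100.0):
    """Split the total dilution into equal steps, each at most max_step-fold."""
    total = dilution_factor(stock, target)
    # The small epsilon keeps exact powers (e.g. 10^4 with 100-fold steps)
    # from being rounded up to one step too many.
    n_steps = max(1, math.ceil(math.log(total, max_step) - 1e-9))
    step = total ** (1.0 / n_steps)
    return [step] * n_steps


# Stock and target concentrations are the values stated in the text.
plan = serial_dilution_plan(2e8, 1e3)
print(f"total dilution: {dilution_factor(2e8, 1e3):.0f}-fold")
print("per-step dilutions:", [round(s, 1) for s in plan])
```

In practice a protocol would more likely use round step sizes (for example 100-fold, 100-fold, 20-fold); the equal-step plan here only shows that the overall factor is 2 × 10^5.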
3. Experimental methods
In the reverse transcription reaction, 0.4 nmol of the specific primer set was added to a 20 μL reaction system.
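For orientation, 0.4 nmol in 20 μL corresponds to a total primer concentration of about 20 μM. The sketch below reproduces that arithmetic; the per-primer figure additionally assumes that the 277 primers named in the claims are mixed in equal amounts, which the text does not state.

```python
def molar_concentration_um(amount_nmol, volume_ul):
    """Concentration in micromolar from an amount in nmol and a volume in uL."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    # 1 nmol/uL = 1 mmol/L, so multiply by 1000 to express the result in uM.
    return amount_nmol / volume_ul * 1000.0


total_um = molar_concentration_um(0.4, 20.0)  # values from the text
n_primers = 277                               # SEQ ID NO. 1-277 in the claims
# Assumption: all primers are present in equal amounts.
per_primer_nm = total_um / n_primers * 1000.0
print(f"total primer concentration: {total_um:.1f} uM")
print(f"per primer (if equimolar): {per_primer_nm:.1f} nM")
```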
During RNA library construction, experimental schemes with and without removal of ribosomal RNA, and with different second-strand synthesis systems and cDNA library construction systems, were compared to find the scheme giving the highest detection.
The experimental design is as follows:
4. Experimental results
Thus, the only experimental scheme that stably detected the novel coronavirus sequence at a concentration of 10^3 copies/mL was the following: removal of ribosomal RNA, full-length reverse transcription using the specific primers plus random primers, and library construction by enzymatic fragmentation of the double-stranded cDNA. This scheme is therefore the most suitable experimental scheme for the COVID-19 multiple genome-specific reverse transcription primer set.
Example 2: Experimental verification of the reverse transcription primer set (verifying that the primer set improves detection).
First, the sample source was the same as in example 1.
Second, preparation of experimental sample
Positive samples P1 and P2 were prepared with COVID-19 pseudovirus concentrations of 1 × 10^3 copies/mL and 1 × 10^4 copies/mL, respectively. To mimic the host content of alveolar lavage fluid, cultured human cell lines were added to P1 and P2 to a cell concentration of 10^5 cells/mL. A negative sample N1 containing no pseudovirus was also prepared.
Third, Experimental methods
The experimental scheme selected in Example 1 was used: after removal of ribosomal RNA, full-length reverse transcription was performed using the specific primers plus random primers, and the library was then constructed by enzymatic fragmentation of the double-stranded cDNA. The numbers of reads detected with and without the COVID-19 multiple genome-specific reverse transcription primer set were compared.
The experimental design is as follows:
Fourth, experimental results
Thus, adding the specific primer set increases the number of reads detected for the novel coronavirus by more than two-fold. The library construction scheme of rRNA removal + full-length reverse transcription + enzymatic fragmentation gives the most stable detection, with the virus detected in every replicate sample.
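A comparison of this kind can be expressed as a fold change in detected reads, normalized to sequencing depth. The sketch below is illustrative only: the read counts are hypothetical placeholders and not results from the patent, and normalizing to reads per million is an assumption about how such a comparison might be made.

```python
def reads_per_million(target_reads, total_reads):
    """Normalize a target read count to reads per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return target_reads / total_reads * 1_000_000


def fold_change(with_primers_rpm, without_primers_rpm):
    """Ratio of normalized reads with the primer set to reads without it."""
    if without_primers_rpm == 0:
        return float("inf") if with_primers_rpm > 0 else float("nan")
    return with_primers_rpm / without_primers_rpm


# Hypothetical counts for one sample sequenced both ways (not patent data).
rpm_with = reads_per_million(30, 20_000_000)
rpm_without = reads_per_million(12, 20_000_000)
print(f"fold change: {fold_change(rpm_with, rpm_without):.2f}")
```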
Example 3: Experimental verification of the reverse transcription primer set (verifying the performance of the primer set and the detection method).
First, the source of the sample
The COVID-19 pseudovirus and the human cells used were the same as in Example 1, and the virus was purchased from the American Type Culture Collection (ATCC). Concentrations were measured by the fluorescent quantitative PCR method.
Second, preparation of experimental sample
Cells and viruses were diluted with PBS to prepare the following reference materials.
Third, Experimental methods
The experimental scheme selected in Example 1 was used: after removal of ribosomal RNA, full-length reverse transcription was performed using the specific primers plus random primers, the double-stranded cDNA was fragmented enzymatically, and the library was constructed and sequenced on the Illumina platform. The number of reads detected for COVID-19 was analyzed with bioinformatics software, and a read count greater than 0 was used as the threshold for distinguishing positive from negative.
The experimental design was as follows:
1) The positive reference P and the negative reference N are each tested once to verify the accuracy of the detection method.
2) The specificity references N1-N4 are each tested once to verify the specificity of the detection method.
3) The sensitivity reference S is tested once to verify the lowest detection limit of the detection method.
4) The precision reference R is tested 10 times to verify the stability of the detection method.
5) The negative quality control and the positive quality control are each tested once for quality control of the detection process, ensuring that the results are genuine and valid.
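The read-count threshold and the reference design above can be sketched as a simple calling and checking routine. This is a hedged illustration: the function names are hypothetical, and the read counts are placeholders chosen for the example, not results from the patent.

```python
def call_sample(covid_reads, threshold=0):
    """Call a sample positive when its COVID-19 read count exceeds the threshold."""
    if covid_reads < 0:
        raise ValueError("read count cannot be negative")
    return "positive" if covid_reads > threshold else "negative"


def find_mismatches(expected, observed_reads, threshold=0):
    """Return the names of samples whose call disagrees with the expectation."""
    return [
        name
        for name, expected_call in expected.items()
        if call_sample(observed_reads[name], threshold) != expected_call
    ]


# Expected calls follow the reference design; R1-R10 are the ten precision runs.
expected = {"P": "positive", "N": "negative", "S": "positive"}
expected.update({f"R{i}": "positive" for i in range(1, 11)})

# Hypothetical read counts (not data from the patent).
observed = {"P": 120, "N": 0, "S": 3}
observed.update({f"R{i}": 5 + i for i in range(1, 11)})

mismatches = find_mismatches(expected, observed)
print("mismatches:", mismatches)
```

An empty mismatch list corresponds to every reference behaving as designed; any name in the list identifies the criterion (accuracy, sensitivity, or stability) that would need review.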
Fourth, experimental results
As can be seen from the table above, the stability verification results (R1-R10) were all positive, so stability is qualified; the detection limit verification result (S1) was positive, so the lowest detection limit is qualified; the positive reference (P) tested positive and the negative reference (N) tested negative, so accuracy is qualified; the negative reference materials tested negative, so specificity is qualified.
The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.
Sequence listing
<110> Tianjin gold spoon medical science and technology Limited
<120> method for detecting COVID-19 based on mNGS and application thereof
<160> 20
<170> SIPOSequenceListing 1.0
<210> 1
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
gcaggtgact cag 13
<210> 2
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
aattatgagg ttt 13
<210> 3
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
tacttattgt taa 13
<210> 4
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
cctgttttcc ttc 13
<210> 5
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
tgactcttgg tgt 13
<210> 6
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
tgagagtaag act 13
<210> 7
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
gaacttctac atg 13
<210> 8
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
tcacggacag cat 13
<210> 9
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
aacactgttt aca 13
<210> 10
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
atgtgctgga gca 13
<210> 11
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
aaacatgcat tcc 13
<210> 12
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
acagacagca cca 13
<210> 13
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
gctggccttg aag 13
<210> 14
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
acaattgaag aag 13
<210> 15
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
taagagtcat ttt 13
<210> 16
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
cagtacaaaa gac 13
<210> 17
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 17
gatgtaaact tac 13
<210> 18
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
catagaagtc ttt 13
<210> 19
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 19
gtaaataaat ttt 13
<210> 20
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
tttctccctc taa 13
Claims (6)
1. A library construction method for detecting the COVID-19 novel coronavirus based on mNGS, characterized by comprising the following steps:
1) preparation of a multiple genome-specific reverse transcription primer set: designing and screening the multiple genome-specific reverse transcription primer set;
2) reverse transcription and cDNA synthesis: comprising a step of performing multiplex amplification using the designed multiple genome-specific reverse transcription primer set;
3) library construction: constructing a library from the cDNA synthesized by reverse transcription;
wherein the reverse transcription and cDNA synthesis of step 2) comprises adding random reverse transcription primers to the multiple genome reverse transcription primer set and then performing multiplex amplification;
in the reverse transcription and cDNA synthesis step, the multiplex amplification is performed with a full-length reverse transcriptase, the full-length reverse transcriptase being HiScript III Enzyme;
the primer sequences of the multiple genome-specific reverse transcription primer set are shown in SEQ ID NO. 1-277.
2. The library construction method for detecting the COVID-19 novel coronavirus based on mNGS as claimed in claim 1, wherein 1) the preparation of the multiple genome-specific reverse transcription primer set comprises the following steps:
step 1) generating a multiple genome: downloading N1 genome sequences of the virus from a genome database, and aligning all the downloaded genome sequences, wherein 100 < N1 < 1000;
step 2) preparing a candidate primer set: dividing the aligned multiple genome sequence into short segments of 300-500 nt with an overlap of 150-300 nt, taking 30-60 nt at both ends of each segment as primer design regions, and randomly designing a plurality of forward and/or reverse primers of 10-15 nt for each primer design region to form a pending primer set for the region; ranking all primers of the pending primer set in each region by frequency of occurrence, selecting the primer with the highest frequency and no ambiguous base, deleting all primers with the same sequence as that primer, ranking and screening the remaining primers again, and repeating N2 times to obtain the candidate primer set, wherein 1 ≤ N2 ≤ 8;
step 3) primer screening: performing a secondary screening of the candidate primer set and deleting primers that meet any one or more of the following: a Tm value deviating from the mean by more than 2 standard deviations, possible formation of a self-dimer or cross-dimer, or a homopolymer run of more than 5 nt; the multiple genome reverse transcription primer set is obtained after this screening and removal;
step 4) designing conserved-region primers: for the conserved region, designing 13 nt primers at the 3' end of the region using Primer3 software, deleting primers whose Tm value deviates by more than 2 standard deviations from the mean of the primer set obtained in step 3) and primers that may form cross-dimers with the primer set obtained in step 3), and selecting the primer ranked highest by the software to replace the primer in the same primer design region from step 3), thereby forming the final reverse transcription primer set.
3. The library construction method of claim 2, wherein the genome database in step 1) includes but is not limited to the NCBI GenBank database, the DDBJ database, and the EMBL database; and the alignment in step 1) aligns all downloaded genomes using MAFFT (multiple alignment using fast Fourier transform).
4. A multiple genome-specific reverse transcription primer set for the COVID-19 novel coronavirus, characterized in that the primer sequences are shown in SEQ ID NO. 1-277.
5. A method of preparing the multiple genome-specific reverse transcription primer set for the COVID-19 novel coronavirus of claim 4, comprising the following steps:
step 1) generating a multiple genome: downloading N1 genome sequences of the virus from a genome database, and aligning all the downloaded genome sequences, wherein 100 < N1 < 1000;
step 2) preparing a candidate primer set: dividing the aligned multiple genome sequence into short segments of 300-500 nt with an overlap of 150-300 nt, taking 30-60 nt at both ends of each segment as primer design regions, and randomly designing a plurality of forward and/or reverse primers of 10-15 nt for each primer design region to form a pending primer set for the region; ranking all primers of the pending primer set in each region by frequency of occurrence, selecting the primer with the highest frequency and no ambiguous base, deleting all primers with the same sequence as that primer, ranking and screening the remaining primers again, and repeating N2 times to obtain the candidate primer set, wherein 1 ≤ N2 ≤ 8;
step 3) primer screening: performing a secondary screening of the candidate primer set and deleting primers that meet any one or more of the following: a Tm value deviating from the mean by more than 2 standard deviations, possible formation of a self-dimer or cross-dimer, or a homopolymer run of more than 5 nt; the multiple genome reverse transcription primer set is obtained after this screening and removal;
step 4) designing conserved-region primers: for the conserved region, designing 13 nt primers at the 3' end of the region using Primer3 software, deleting primers whose Tm value deviates by more than 2 standard deviations from the mean of the primer set obtained in step 3) and primers that may form cross-dimers with the primer set obtained in step 3), and selecting the primer ranked highest by the software to replace the primer in the same primer design region from step 3), thereby forming the final reverse transcription primer set.
6. Use of the multiple genome-specific reverse transcription primer set according to claim 4, comprising:
1) use in multiplex amplification of the COVID-19 novel coronavirus, said use being a non-disease-diagnostic use;
2) use in constructing a COVID-19 novel coronavirus sequencing library;
3) use in targeted enrichment of the COVID-19 novel coronavirus;
4) use in preparing a reagent for detecting the COVID-19 novel coronavirus.
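The candidate selection and screening steps recited in claims 2 and 5 can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the patented implementation: all function names are hypothetical, every 13 nt k-mer on the given strand is enumerated instead of designing primers randomly, Tm is approximated with the Wallace rule, and the dimer checks and the Primer3 step are omitted. The toy sequences are invented.

```python
from collections import Counter
from statistics import mean, pstdev


def tile(length, seg_len=400, overlap=200):
    """Yield (start, end) for overlapping segments covering [0, length)."""
    if overlap >= seg_len:
        raise ValueError("overlap must be smaller than the segment length")
    start = 0
    while start < length:
        end = min(start + seg_len, length)
        yield start, end
        if end == length:
            break
        start += seg_len - overlap


def design_regions(start, end, flank=45):
    """Primer design regions: `flank` nt at both ends of a segment."""
    return [(start, start + flank), (end - flank, end)]


def pick_frequent_primers(region_seqs, k=13, rounds=3):
    """Greedy pick: take the most frequent unambiguous k-mer, drop all
    identical copies, and repeat up to `rounds` times (N2 in the claims)."""
    counts = Counter(
        seq[i:i + k]
        for seq in (s.upper() for s in region_seqs)
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip k-mers with ambiguous bases
    )
    chosen = []
    for _ in range(rounds):
        if not counts:
            break
        best = counts.most_common(1)[0][0]
        chosen.append(best)
        del counts[best]
    return chosen


def wallace_tm(primer):
    """Rough Tm estimate (Wallace rule): 2*(A+T) + 4*(G+C)."""
    gc = primer.count("G") + primer.count("C")
    return 2 * (len(primer) - gc) + 4 * gc


def has_long_homopolymer(primer, max_run=5):
    """True when any single base repeats more than max_run times in a row."""
    run = 1
    for a, b in zip(primer, primer[1:]):
        run = run + 1 if a == b else 1
        if run > max_run:
            return True
    return False


def screen(primers, n_sd=2.0):
    """Drop primers whose Tm is more than n_sd standard deviations from the
    mean or that contain a long homopolymer run."""
    if not primers:
        return []
    tms = [wallace_tm(p) for p in primers]
    mu, sd = mean(tms), pstdev(tms)
    return [
        p for p, tm in zip(primers, tms)
        if abs(tm - mu) <= n_sd * sd and not has_long_homopolymer(p)
    ]


# Toy design region taken from three aligned genomes (invented sequences).
region = ["ACGTACGTACGTAAA", "ACGTACGTACGTAAA", "ACGTACGTACGTANN"]
picks = pick_frequent_primers(region, k=13, rounds=2)
kept = screen(picks)
print(picks, kept)
```

A fuller version would add self-dimer and cross-dimer checks and a nearest-neighbor Tm model before replacing conserved-region primers as described in step 4).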
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110597445.2A CN113337639B (en) | 2021-05-28 | 2021-05-28 | Method for detecting COVID-19 based on mNGS and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110597445.2A CN113337639B (en) | 2021-05-28 | 2021-05-28 | Method for detecting COVID-19 based on mNGS and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113337639A (en) | 2021-09-03 |
CN113337639B (en) | 2022-01-25 |
Family
ID=77472608
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110597445.2A (Active) CN113337639B (en) | 2021-05-28 | 2021-05-28 | Method for detecting COVID-19 based on mNGS and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113337639B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115101126B (en) * | 2022-02-22 | 2023-04-18 | 中国医学科学院北京协和医院 | Respiratory tract virus and/or bacterial subtype primer design method and system based on CE platform |
CN114550816B (en) * | 2022-03-01 | 2022-11-04 | 上海图灵智算量子科技有限公司 | Method for predicting virus mutation probability based on photonic chip |
CN114317705A (en) * | 2022-03-03 | 2022-04-12 | 天津金匙医学科技有限公司 | Relative quantitative detection method for mNGS pathogen by adopting single label |
CN114574606B (en) * | 2022-04-02 | 2023-04-28 | 予果生物科技(北京)有限公司 | Primer group for detecting mycobacterium tuberculosis in metagenome and high-throughput sequencing method |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111118226B (en) * | 2020-03-25 | 2021-04-02 | 北京微未来科技有限公司 | Novel coronavirus whole genome capture method, primer group and kit |
CN111334868B (en) * | 2020-03-26 | 2023-05-23 | 福州福瑞医学检验实验室有限公司 | Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction |
CN111500781B (en) * | 2020-05-15 | 2021-10-29 | 广州微远医疗器械有限公司 | Amplification primer group for detecting SARS-CoV-2 by mNGS and application thereof |
CN112322788B (en) * | 2020-11-24 | 2021-07-06 | 杭州杰毅生物技术有限公司 | mNGS primer group and kit for detecting SARS-CoV-2 |
2021-05-28: CN application CN202110597445.2A, patent CN113337639B (Active)
Also Published As
Publication number | Publication date |
---|---|
CN113337639A (en) | 2021-09-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113337639B (en) | Method for detecting COVID-19 based on mNGS and application thereof | |
CN110093455B (en) | Respiratory virus detection method | |
CN113073150B (en) | Digital PCR detection kit for novel coronavirus and variant thereof | |
CN106906211B (en) | Molecular joint and application thereof | |
CN111334615B (en) | Novel coronavirus detection method and kit | |
CN111440896B (en) | Novel beta coronavirus variation detection method, probe and kit | |
CN105400776B (en) | Oligonucleotide linker and application thereof in constructing nucleic acid sequencing single-stranded circular library | |
WO2017054302A1 (en) | Sequencing library, and preparation and use thereof | |
CN111808854B (en) | Balanced joint with molecular bar code and method for quickly constructing transcriptome library | |
JP2024105673A (en) | Creation of single-stranded circular DNA templates for single molecule sequencing | |
CN111321202A (en) | Gene fusion variation library construction method, detection method, device, equipment and storage medium | |
CN107699957A (en) | Fusion based on DNA, which is quantitatively sequenced, builds storehouse, detection method and its application | |
CN111593142A (en) | Detection kit for simultaneously detecting nine respiratory viruses including SARS-CoV-2 | |
CN113249437A (en) | Library construction method for sRNA sequencing | |
CN108192965B (en) | Method for detecting heterogeneity of mitochondrial genome A3243G locus | |
US20220033809A1 (en) | Method and kit for construction of rna library | |
TW201321520A (en) | Method and system for virus detection | |
CN111979353A (en) | Library construction method for sequencing novel coronavirus SARS-CoV-2 full-length genome | |
CN106566875A (en) | Primers, kit and method for detecting myelodysplastic syndromes (MDS) gene mutation | |
CN116287162A (en) | Kit for detecting BCR-ABL1 fusion gene and tyrosine kinase region mutation and promoter methylation thereof and application method | |
CN115094164A (en) | Multiple qPCR (quantitative polymerase chain reaction) kit and detection method for ASFV (African swine fever virus) with different gene deletion types | |
CN114790579A (en) | Method for constructing new coronavirus sequencing library, method for determining new coronavirus nucleic acid sequence, sequencing library and kit | |
US20200208140A1 (en) | Methods of making and using tandem, twin barcode molecules | |
CN111394474A (en) | Method for detecting copy number variation of cattle GAL3ST1 gene and application thereof | |
CN109609661A (en) | A kind of combination of kidney-yang deficiency exogenous disease mouse model lung tissue qPCR reference gene and its screening technique |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||