CN113337639B - Method for detecting COVID-19 based on mNGS and application thereof - Google Patents

Method for detecting COVID-19 based on mNGS and application thereof

Info

Publication number
CN113337639B
CN113337639B CN202110597445.2A CN202110597445A CN113337639B CN 113337639 B CN113337639 B CN 113337639B CN 202110597445 A CN202110597445 A CN 202110597445A CN 113337639 B CN113337639 B CN 113337639B
Authority
CN
China
Prior art keywords
primer
genome
reverse transcription
primers
primer group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110597445.2A
Other languages
Chinese (zh)
Other versions
CN113337639A (en)
Inventor
王棪
梁永
李玉龙
李立锋
蒋智
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Jinke Medical Technology Co ltd
Original Assignee
Tianjin Jinke Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Jinke Medical Technology Co ltd
Priority to CN202110597445.2A
Publication of CN113337639A
Application granted
Publication of CN113337639B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/70 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
    • C12Q1/701 Specific hybridization probes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06 Biochemical methods, e.g. using enzymes or whole viable microorganisms
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/16 Primer sets for multiplex assays
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biophysics (AREA)
  • Analytical Chemistry (AREA)
  • Physics & Mathematics (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • Virology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method for detecting the COVID-19 novel coronavirus based on mNGS and an application thereof. By designing and preparing a multiple genome-specific reverse transcription primer group for the COVID-19 novel coronavirus, the method improves the reverse transcription efficiency of the viral RNA, reduces aerosol contamination, and improves the enrichment of nucleic acid sequences from different variant viruses.

Description

Method for detecting COVID-19 based on mNGS and application thereof
Technical Field
The invention belongs to the field of gene detection, and particularly relates to a method for detecting COVID-19 based on mNGS and application thereof.
Background
The symptoms of novel coronavirus pneumonia resemble those of common pneumonia, so a rapid and accurate diagnostic technique is crucial for patient treatment. At present, the mainstream means of diagnosis and epidemic screening is real-time fluorescent RT-PCR, which is simple to operate, fast and highly sensitive, and is therefore the first choice for rapid diagnosis and large-scale population screening. The technique also has disadvantages. Current commercial kits usually target two specific gene loci of the novel coronavirus, so only two short sequences can be detected. The viral nucleic acid is RNA, which degrades more easily than DNA; after sampling the virus dies and lyses and RNA integrity declines, so a false negative can occur if the residual nucleic acid does not cover the target region. In addition, the virus mutates continuously, and once the target region in a sample mutates, the assay risks missing the infection.
Detection of the novel coronavirus genome by second-generation sequencing, by contrast, can cover the full length of the COVID-19 sequence. This overcomes the limitation that RT-PCR detects only a small number of known COVID-19 regions, improves detection accuracy, and identifies possible variation in the virus, so that the disease condition can be judged accurately from the genotype and a treatment plan formulated, and so that the source can be traced and the transmission route found. Genome detection of the novel coronavirus has therefore been generally accepted by the medical community in China as an in vitro diagnostic method, and the Diagnosis and Treatment Protocol for Novel Coronavirus Pneumonia issued by the National Health Commission gives it the same standing as real-time fluorescent RT-PCR.
Common genome sequencing approaches include metagenomic sequencing and targeted enrichment. Metagenomic sequencing detects all nucleic acids in a sample indiscriminately; the high proportion of human nucleic acid both lowers sensitivity and, because that proportion differs between samples, makes the signal intensity unstable and produces false negatives. Targeted enrichment mostly uses multiplex PCR to enrich novel coronavirus genome sequences and remove interference from human nucleic acid, but the cDNA amplification after reverse transcription uses a high cycle number, and the PCR products readily cause cross contamination and aerosol contamination, producing false positives. Moreover, designing primers against only the single reference genome sequence provided by NCBI inevitably misses variant sequences. The novel coronavirus mutates quickly and its sequences are highly diverse; new subtypes are continually discovered and uploaded worldwide, and NCBI (National Center for Biotechnology Information) has recorded 50,326 novel coronavirus gene sequences. Designing primers against one fixed genome sequence reduces enrichment of variant virus sequences and causes false negatives.
The invention is provided in view of the above.
Disclosure of Invention
The invention aims to provide a method for effectively detecting the novel coronavirus COVID-19. To achieve this aim, the following technical scheme is provided:
the invention firstly provides a preparation method of a virus multiple genome specific reverse transcription primer group, which comprises the following steps:
Step 1) generating a multiple genome: downloading N1 genome sequences of the virus from a genome database and aligning all the downloaded genome sequences, where 100 < N1 < 1000.
Step 2) preparing a candidate primer group: dividing the aligned multiple genome sequence into short fragments of 300-500 nt (preferably 394 nt) with an overlap of 150-300 nt (preferably 197 nt); taking 30-60 nt (preferably 50 nt) at each end of every fragment as a primer design region; and designing a plurality of forward and/or reverse primers of 10-15 nt (preferably 13 nt) at random within the primer design region to form that region's primer group to be merged. All primers of the primer group to be merged in each region are ranked by occurrence frequency (preferably from high to low); the primer with the highest occurrence frequency that contains no ambiguous base is selected, all primers with the same sequence as that primer are deleted, and the remaining primers are ranked and screened again. This is repeated N2 times to obtain the candidate primer group, where 1 ≤ N2 ≤ 8.
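The fragment layout described in step 2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the invention: the function name `tile_alignment`, the 0-based half-open coordinates, and the choice to drop a trailing fragment shorter than the full length are all hypothetical. With the preferred values, consecutive fragments start 394 - 197 = 197 nt apart.

```python
def tile_alignment(aln_length, frag_len=394, overlap=197, region_len=50):
    """Split an alignment of aln_length columns into overlapping fragments.

    Returns a list of dicts holding the fragment interval and the two
    primer design regions at its ends, as 0-based half-open intervals.
    Assumption: a trailing partial fragment is ignored.
    """
    if not 0 <= overlap < frag_len:
        raise ValueError("overlap must be smaller than the fragment length")
    if 2 * region_len > frag_len:
        raise ValueError("design regions must fit inside the fragment")

    step = frag_len - overlap  # distance between consecutive fragment starts
    tiles = []
    start = 0
    while start + frag_len <= aln_length:
        end = start + frag_len
        tiles.append({
            "fragment": (start, end),
            "left_region": (start, start + region_len),
            "right_region": (end - region_len, end),
        })
        start += step
    return tiles


# Example with a hypothetical 1000-column alignment.
for tile in tile_alignment(1000):
    print(tile)
```

Each fragment contributes two 50 nt design regions, one at each end, from which the 13 nt candidate primers would then be drawn.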
Further, the method further comprises:
Step 3) primer screening: performing a secondary screening of the candidate primer group and deleting any primer that meets one or more of the following conditions: its Tm value deviates from the mean by more than 2 standard deviations; it may form a self-dimer or cross-dimer; or it contains a homopolymer run of more than 5 nt. The primers that remain form the final multiple genome reverse transcription primer group.
Step 4) designing conserved region primers: some primer design regions lie in highly conserved regions (such as the E gene), where the sequence is identical in all genomes. For such regions, a 13 nt primer is designed at the 3' end of the region using Primer3 software. After deleting primers whose Tm value deviates by more than 2 standard deviations from the mean of the primer group obtained in step 3), or which may form cross-dimers with that primer group, the primer ranked highest by the software is selected to replace the primer of the same primer design region in the primer group of step 3), forming the final reverse transcription primer group.
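The screening criteria of step 3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the screening code used in the invention: the Tm estimate uses the Wallace rule (2 °C per A/T and 4 °C per G/C), the dimer check only looks for a reverse-complementary match of the last five 3' bases, the standard deviation is the population standard deviation, and all function names and the five-base threshold are hypothetical.

```python
import re
from statistics import mean, pstdev

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wallace_tm(primer):
    """Rough Tm estimate (Wallace rule); an assumption, not the patent's formula."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def has_long_homopolymer(primer, max_run=5):
    """True when one base repeats more than max_run times in a row."""
    return re.search(r"(.)\1{%d,}" % max_run, primer) is not None


def revcomp(seq):
    return seq.translate(COMPLEMENT)[::-1]


def may_dimerize(a, b, k=5):
    """Crude dimer check: the last k bases of one primer can pair with the other."""
    return revcomp(a[-k:]) in b or revcomp(b[-k:]) in a


def screen(primers, n_sd=2.0, max_run=5, k=5):
    """Drop Tm outliers, homopolymer runs, self-dimers and cross-dimers.

    Cross-dimers are checked greedily against primers already kept,
    so the result depends on input order.
    """
    tms = [wallace_tm(p) for p in primers]
    mu, sd = mean(tms), pstdev(tms)
    kept = []
    for primer, tm in zip(primers, tms):
        if sd > 0 and abs(tm - mu) > n_sd * sd:
            continue
        if has_long_homopolymer(primer, max_run):
            continue
        if may_dimerize(primer, primer, k):
            continue
        if any(may_dimerize(primer, other, k) for other in kept):
            continue
        kept.append(primer)
    return kept


# Hypothetical 13 nt candidates: one clean, one with a run of six A,
# one ending in the palindrome GAATTC (self-dimer risk).
print(screen(["ACGTCAGTCAGTC", "ACAAAAAAGTCGT", "ACGTCAGGAATTC"]))
```

A production screen would normally use a nearest-neighbour Tm model and a thermodynamic dimer prediction; the simple rules here only show how the three deletion criteria combine.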
Further, the genome database in step 1) includes, but is not limited to, the NCBI GenBank, DDBJ and EMBL databases; preferably, multiple genome sequences of the virus are downloaded from the NCBI database, and all the downloaded genomes are aligned using MAFFT (multiple alignment using fast Fourier transform).
Further, step 2) is as follows: preparing a candidate primer group: using PYFASTA software to divide the aligned multiple genome sequence into 394 nt short fragments with a 197 nt overlap; taking 50 nt at each end of every fragment as a primer design region; and designing a plurality of 13 nt forward and/or reverse primers within the primer design region to form that region's primer group to be merged. All primers of the primer group to be merged in each region are ranked from high to low by occurrence frequency; the primer with the highest occurrence frequency that contains no ambiguous base is selected, all primers with the same sequence as that primer are deleted, and the remaining primers are ranked and screened again. This is repeated N times to obtain the candidate primer group, where 1 ≤ N ≤ 8.
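The frequency-based selection in step 2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the invention: one 13 nt candidate is taken per genome at a fixed offset inside the design region (the text only says the primers are designed within the region), alignment gaps and IUPAC codes are both treated as ambiguous bases, ties are broken alphabetically, and the candidates are written in genome-sense orientation. The function name `select_primers` and its parameters are hypothetical.

```python
from collections import Counter

UNAMBIGUOUS = set("ACGT")


def select_primers(region_seqs, primer_len=13, offset=0, rounds=3):
    """Pick up to `rounds` distinct primers by descending occurrence frequency.

    region_seqs: the same primer design region cut from every aligned genome.
    Returns a list of (primer, count) pairs.
    """
    if not 1 <= rounds <= 8:
        raise ValueError("rounds must satisfy 1 <= rounds <= 8")

    candidates = [s[offset:offset + primer_len].upper() for s in region_seqs]
    counts = Counter(c for c in candidates if len(c) == primer_len)

    selected = []
    for _ in range(rounds):
        # Only primers without ambiguous bases (N, gaps, IUPAC codes) qualify.
        usable = [p for p in counts if set(p) <= UNAMBIGUOUS]
        if not usable:
            break
        best = min(usable, key=lambda p: (-counts[p], p))
        selected.append((best, counts[best]))
        del counts[best]  # delete every copy of the chosen sequence, then re-rank
    return selected


# Hypothetical design region taken from 11 aligned genomes.
region = (
    ["ACGTACGTACGTAAA"] * 5
    + ["ACGTACGTACGTTAA"] * 2
    + ["ACGNACGTACGTAAA"] * 3
    + ["ACGTACGAACGTAAA"]
)
print(select_primers(region, rounds=2))
```

Because each round removes the chosen sequence and re-ranks the rest, the result is the set of the most frequent unambiguous variants of that region, which is how the less common variant sequences also receive a matching primer.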
Further, the virus is the COVID-19 novel coronavirus;
further, the virus is the COVID-19 novel coronavirus to be detected based on mNGS.
In some embodiments, the sequences of the final multiple genome reverse transcription primer group are shown in SEQ ID NO. 1-277.
The invention also provides a library construction method for detecting the COVID-19 novel coronavirus based on the mNGS, which is characterized by comprising the following steps of:
1) designing and screening the multiple genome-specific reverse transcription primer group by the preparation method described above;
2) reverse transcription and cDNA synthesis: comprising the step of performing multiplex amplification using the designed multiple genome-specific reverse transcription primer group;
3) library construction: constructing a library from the cDNA sequence.
Further, step 2), reverse transcription and cDNA synthesis, comprises adding random reverse transcription primers to the multiple genome reverse transcription primer group and then performing multiplex amplification;
further, the multiplex amplification is carried out using a full-length reverse transcriptase;
in some embodiments, the full-length reverse transcriptase is the HiScript III Enzyme system.
In some more preferred embodiments, the multiplex amplification step is specifically as follows:
heating the sample at 65 deg.C for 5min, rapidly cooling on ice, and standing on ice for 2 min;
the following reaction system was prepared:
Reagent                                         Volume
10×RT Mix                                       2 μl
HiScript III Enzyme Mix                         2 μl
Reverse transcription primer working solution   2 μl
Random hexamers                                 1 μl
First-strand cDNA amplification is performed under the following conditions:
Heated lid                                      105 ℃, on
25 ℃                                            5 min
37 ℃                                            45 min
85 ℃                                            5 sec
Further, double-stranded cDNA is synthesized from the multiplex-amplified sequences using a commercial double-stranded synthesis system.
Further, the step 1) of designing and screening multiple genome-specific reverse transcription primer sets comprises the following steps:
Step 1) generating a multiple genome: downloading N1 genome sequences of the virus from a genome database and aligning all the downloaded genome sequences, where 100 < N1 < 1000.
Step 2) preparing a candidate primer group: dividing the aligned multiple genome sequence into short fragments of 300-500 nt (preferably 394 nt) with an overlap of 150-300 nt (preferably 197 nt); taking 30-60 nt (preferably 50 nt) at each end of every fragment as a primer design region; and designing a plurality of forward and/or reverse primers of 10-15 nt (preferably 13 nt) at random within the primer design region to form that region's primer group to be merged. All primers of the primer group to be merged in each region are ranked by occurrence frequency (preferably from high to low); the primer with the highest occurrence frequency that contains no ambiguous base is selected, all primers with the same sequence as that primer are deleted, and the remaining primers are ranked and screened again. This is repeated N2 times to obtain the candidate primer group, where 1 ≤ N2 ≤ 8.
The method further comprises the following steps:
Step 3) primer screening: performing a secondary screening of the candidate primer group and deleting any primer that meets one or more of the following conditions: its Tm value deviates from the mean by more than 2 standard deviations; it may form a self-dimer or cross-dimer; or it contains a homopolymer run of more than 5 nt. The primers that remain form the final multiple genome reverse transcription primer group.
Step 4) designing conserved region primers: some primer design regions lie in highly conserved regions (such as the E gene), where the sequence is identical in all genomes. For such regions, a 13 nt primer is designed at the 3' end of the region using Primer3 software. After deleting primers whose Tm value deviates by more than 2 standard deviations from the mean of the primer group obtained in step 3), or which may form cross-dimers with that primer group, the primer ranked highest by the software is selected to replace the primer of the same primer design region in the primer group of step 3), forming the final reverse transcription primer group.
Further, the genome database in step 1) includes, but is not limited to, the NCBI GenBank, DDBJ and EMBL databases; preferably, multiple genome sequences of the virus are downloaded from the NCBI database, and all the downloaded genomes are aligned using MAFFT (multiple alignment using fast Fourier transform).
Further, step 2) is as follows: preparing a candidate primer group: using PYFASTA software to divide the aligned multiple genome sequence into 394 nt short fragments with a 197 nt overlap; taking 50 nt at each end of every fragment as a primer design region; and designing a plurality of 13 nt forward and/or reverse primers within the primer design region to form that region's primer group to be merged. All primers of the primer group to be merged in each region are ranked from high to low by occurrence frequency; the primer with the highest occurrence frequency that contains no ambiguous base is selected, all primers with the same sequence as that primer are deleted, and the remaining primers are ranked and screened again. This is repeated N times to obtain the candidate primer group, where 1 ≤ N ≤ 8.
Further, the virus is the COVID-19 novel coronavirus;
further, the virus is the COVID-19 novel coronavirus to be detected based on mNGS.
In some embodiments, the sequences of the final multiple genome reverse transcription primer group are shown in SEQ ID NO. 1-277.
The invention also provides a method for detecting the COVID-19 novel coronavirus based on mNGS, which comprises the steps of the library construction method described above and further comprises sequencing and bioinformatics analysis.
The invention also provides a multiple genome specific reverse transcription primer group of the COVID-19 novel coronavirus, and the primer sequence is shown as SEQ ID NO. 1-277.
Further, the primer group is obtained by the method for designing and screening multiple genome-specific reverse transcription primer groups described above.
The invention also provides any one of the following uses of the multiple genome-specific reverse transcription primer set:
1) use in multiplex amplification of the COVID-19 novel coronavirus;
2) use in constructing a sequencing library of the COVID-19 novel coronavirus;
3) use in targeted enrichment of the COVID-19 novel coronavirus;
4) use in preparing a reagent for detecting the COVID-19 novel coronavirus.
The invention also provides a novel coronavirus COVID-19 nucleic acid library construction or nucleic acid detection kit, which comprises the primer group of SEQ ID NO. 1-277.
Further the kit further comprises a full-length reverse transcriptase; in some preferred embodiments, the full length reverse transcriptase is a HiScript III Enzyme system.
Compared with the prior art, the invention has at least the following advantages:
1) The invention adds both novel coronavirus-specific primers and random primers at the reverse transcription stage, which improves the reverse transcription efficiency of novel coronavirus RNA and, because enrichment takes place at this stage, reduces the probability of aerosol contamination.
2) When designing primers, the invention uses a plurality of novel coronavirus genome sequences as the reference sequence set. Specifically, the reverse transcription primers are designed from a multiple genome built from more than 100 genomes, which takes the diversity of novel coronavirus sequences into account and improves enrichment of nucleic acid sequences from different variant viruses. Because primer screening selects the primer sequences that occur most frequently across many different variants, there is also a high probability of successfully enriching new variants that may appear in the future.
3) The final primer system SEQ ID NO. 1-277 established by design and screening can be used comprehensively and efficiently to construct libraries of the novel coronavirus and its variant strains, ensuring high sensitivity in subsequent sequencing detection.
4) Unlike the traditional approach of fragmenting the RNA and then amplifying and constructing the library, the invention first amplifies long fragments and then fragments them. Specifically, the HiScript III Enzyme Mix used during reverse transcription contains a full-length reverse transcriptase, which increases the extension length from the specific primers and improves enrichment efficiency.
Detailed Description
The technical solutions of the present invention are described clearly and completely below, and it is obvious that the described embodiments are some, not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The following terms or definitions are provided only to aid in understanding the present invention. These definitions should not be construed to have a scope less than understood by those skilled in the art.
Unless defined otherwise below, all technical and scientific terms used in the detailed description of the present invention are intended to have the same meaning as commonly understood by one of ordinary skill in the art. While the following terms are believed to be well understood by those skilled in the art, the following definitions are set forth to better explain the present invention.
As used herein, the terms "comprising," "including," "having," "containing," or "involving" are inclusive or open-ended and do not exclude additional unrecited elements or method steps. The term "consisting of …" is considered to be a preferred embodiment of the term "comprising". If in the following a certain group is defined to comprise at least a certain number of embodiments, this should also be understood as disclosing a group which preferably only consists of these embodiments.
Where an indefinite or definite article is used when referring to a singular noun e.g. "a" or "an", "the", this includes a plural of that noun.
The terms "about" and "substantially" in the present invention denote an interval of accuracy that can be understood by a person skilled in the art, which still guarantees the technical effect of the feature in question. The term generally denotes a deviation of ± 10%, preferably ± 5%, from the indicated value.
Furthermore, the terms first, second, third, (a), (b), (c), and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
Specific examples are as follows.
Experimental example 1 primer design
1. Generation of multiple genomes
More than 100 COVID-19 novel coronavirus genomes were first downloaded from the NCBI website, and all downloaded genomes were aligned using MAFFT (multiple alignment using fast Fourier transform, v7.388). Note that, to ensure that the resulting primers can detect mutations, the multiple genome must be built from more than 100 source sequences.
2. Candidate primer design
The aligned multiple genome is divided into 394 nt short fragments with a 197 nt overlap using PYFASTA software, and 50 nt at each end of every fragment is taken as a primer design region. A plurality of 13 nt forward or reverse primers are designed to form that region's primer group to be merged. All primers of the primer group to be merged in each region are ranked from high to low by occurrence frequency; the primer with the highest occurrence frequency that contains no ambiguous base is selected, all primers with the same sequence as that primer are deleted, and the remaining primers are ranked and screened again. Repeating this several times yields a large candidate primer group.
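A quick sanity check on the aligned genomes can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the alignment is taken to be available as aligned FASTA text (for example the output of a command such as `mafft --auto genomes.fasta > aligned.fasta`, where the file names are hypothetical), and the helper names `read_fasta` and `check_alignment` are not from the source. The check enforces the two properties the text relies on: more than 100 source genomes, and one common aligned length.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of name -> sequence (line breaks joined)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def check_alignment(records, min_genomes=101):
    """Return the alignment length, or raise if the input is unusable.

    min_genomes=101 encodes "more than 100" source sequences.
    """
    if len(records) < min_genomes:
        raise ValueError(f"need at least {min_genomes} genomes, got {len(records)}")
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError("sequences differ in length; input is not aligned")
    return lengths.pop()


# Tiny hypothetical alignment; min_genomes is lowered only for the demo.
demo = ">g1 sample\nACGT-ACG\nTT\n>g2\nACGTTACG\nTT\n>g3\nACGT-ACG\nTA\n"
print(check_alignment(read_fasta(demo), min_genomes=3))
```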
3. Primer screening
The resulting candidate primer group is subjected to secondary screening to delete primers with problems such as: a Tm value deviating from the mean by more than 2 standard deviations; possible formation of a self-dimer or cross-dimer; or a homopolymer run of more than 5 nt. These are examples only; more screening conditions are applied in practice. The final COVID-19 multiple genome reverse transcription primer group is obtained through this secondary screening, and its sequences are shown in the following table.
The invention further includes manual sequence adjustment and the like, for example conserved region primer design: some primer design regions lie in highly conserved regions (such as the E gene), where the sequence is identical in all genomes. For such regions, a 13 nt primer is designed at the 3' end of the region using Primer3 software. After deleting primers whose Tm value deviates by more than 2 standard deviations from the mean of the primer group obtained in step 3), or which may form cross-dimers with that primer group, the primer ranked highest by the software is selected to replace the primer of the same primer design region in the primer group of step 3), forming the final reverse transcription primer group.
[Table: sequences of the final COVID-19 multiple genome reverse transcription primer group (SEQ ID NO. 1-277); original table images not reproduced]
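The conserved-region rule described in this example (a design region whose sequence is identical in all genomes, with a 13 nt primer placed at its 3' end) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it only detects conserved regions and returns the 13 nt window at the end of the region in alignment coordinates, and it does not reproduce the Primer3 design or ranking step. The function names are hypothetical.

```python
UNAMBIGUOUS = set("ACGT")


def is_conserved(aligned_seqs, start, end):
    """True when every genome has the same unambiguous sequence in [start, end)."""
    variants = {seq[start:end].upper() for seq in aligned_seqs}
    if len(variants) != 1:
        return False
    return set(next(iter(variants))) <= UNAMBIGUOUS


def conserved_primer_window(aligned_seqs, start, end, primer_len=13):
    """Return the primer_len window at the end of a conserved region, else None.

    Assumption: "3' end of the region" is read as the right-hand end of the
    region in alignment (genome-sense) coordinates.
    """
    if end - start < primer_len or not is_conserved(aligned_seqs, start, end):
        return None
    return aligned_seqs[0][end - primer_len:end].upper()


# Hypothetical aligned genomes that differ only in the last column.
genomes = [
    "AAACCCGGGTTTACGTACGT",
    "AAACCCGGGTTTACGTACGA",
    "AAACCCGGGTTTACGTACGT",
]
print(conserved_primer_window(genomes, 0, 16))
print(conserved_primer_window(genomes, 0, 20))
```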
Experimental example 2 experimental procedure
(I) Preparation of primers
(1) A total of 277 primers were synthesized, one per tube at 10 nmol per tube, and purified by PAGE.
(2) The primer tubes were centrifuged at 4000 rpm for 1 min.
(3) Tris-HCl was added to each tube, followed by vortex mixing, standing on ice, vortex mixing again and brief centrifugation.
(4) The primers from all tubes were combined into one tube to give the COVID-19 multiple genome reverse transcription primer working solution for later use.
(II) RNA nucleic acid extraction
RNA was extracted using the viral RNA extraction kit from Qiagen according to the product instructions. After the sample is lysed, ethanol is added to precipitate the RNA; on centrifugation through the column, the RNA binds to the silica gel membrane under high-salt conditions; residual impurities on the membrane are washed away with wash buffer; and finally the RNA is eluted from the membrane under low-salt, high-pH conditions.
(III) human ribosomal RNA removal
Host ribosomal RNA was removed using a commercial ribosomal RNA removal kit (human/mouse/rat) and RNA purification magnetic beads. First, ribosomal RNA-specific DNA probes are hybridized fully to human ribosomal RNA by incubating at a slowly decreasing temperature. Second, the RNA in the resulting RNA:DNA hybrids is digested with RNase H, which specifically degrades such hybrids. The remaining DNA probes are then digested with DNase. Finally, the remaining RNA is enriched with the RNA purification magnetic beads supplied with the kit, and the bead-bound RNA is redissolved in nuclease-free water.
(IV) reverse transcription
(1) First-strand cDNA synthesis using a full-length reverse transcription system
(1.1) Take 13 μL of sample (if less is available, make up to 13 μL with RNase-free ddH2O), heat at 65 ℃ for 5 min, cool rapidly on ice, and leave on ice for 2 min.
(1.2) preparing a reaction system:
Reagent                                         Volume
10×RT Mix                                       2 μl
HiScript III Enzyme Mix                         2 μl
Reverse transcription primer working solution   2 μl
Random hexamers                                 1 μl
The HiScript III Enzyme Mix is from the HiScript III 1st Strand cDNA Synthesis Kit (+gDNA wiper) of Nanjing Vazyme.
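The arithmetic behind this reaction system can be checked with a short Python sketch. The per-reaction volumes are those given above (13 μL of sample plus the four reagents, 20 μL in total, which matches the 20 μL of single-strand cDNA carried into second-strand synthesis). The master-mix helper and its 10% overage are assumptions added for illustration and are not part of the described protocol.

```python
# Per-reaction volumes in microlitres, as listed in the reaction system.
SAMPLE_UL = 13.0
REAGENTS_UL = {
    "10x RT Mix": 2.0,
    "HiScript III Enzyme Mix": 2.0,
    "Reverse transcription primer working solution": 2.0,
    "Random hexamers": 1.0,
}


def reaction_total():
    """Total volume of one first-strand reaction, sample included."""
    return SAMPLE_UL + sum(REAGENTS_UL.values())


def master_mix(n_reactions, overage=0.1):
    """Reagent volumes for n reactions with a pipetting overage (assumed 10%)."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in REAGENTS_UL.items()}


print(reaction_total())
print(master_mix(10))
```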
(1.3) The first-strand cDNA synthesis reaction is run under the following conditions:
Heated lid                                      105 ℃, on
25 ℃                                            5 min
37 ℃                                            45 min
85 ℃                                            5 sec
(2) Double-stranded cDNA synthesis, using a commercial double-stranded synthesis system:
(2.1) Take the components required for double-stranded synthesis out of storage at -30 to -15 ℃, thaw them on ice, mix by inversion, centrifuge briefly to collect the liquid at the bottom of the tube, and prepare the second-strand cDNA synthesis reaction system as follows:
Component                          Volume (μl)
Single-strand cDNA                 20
Double-stranded synthesis buffer   20
Double-stranded synthetase         5
Nuclease-free water                5
Total                              50
(2.2) Set the pipette to the 100 μL range and gently pipette up and down 10 times to mix well.
(2.3) Keep the PCR tube on ice while setting the following program on the PCR instrument, then place the tube in the instrument and run the program:
[Table: PCR instrument program for double-stranded cDNA synthesis; original table images not reproduced]
(2.4) Immediately after the PCR program finished, the product was purified with 90 μL of commercial DNA Clean Beads, and the cDNA bound to the magnetic beads was redissolved in 50 μL of nuclease-free water.
(V) library construction
(1) DNA fragmentation/end repair/dA tail addition
The cDNA obtained by reverse transcription was fragmented, end-filled, phosphorylated at the 5 'end and dA added at the 3' end using commercial fragmentation and end-repair enzymes.
(2) Connecting joint
A double-ended index linker adapted to the Illumina sequencer was added to both ends of the cDNA fragment using commercial T4 ligase.
(3) Magnetic bead purification
Immediately after the ligation reaction was completed, the ligation product was purified using 60 μL of commercial DNA Clean Beads, and the DNA bound to the magnetic beads was eluted in 20 μL of nuclease-free water.
(4) Library amplification
Library amplification was performed using a commercial high-fidelity PCR enzyme premix and universal primers for Illumina sequencer adapters, raising the concentration of library fragments to a level sufficient for next-generation sequencing.
(5) Magnetic bead purification
Immediately after the PCR reaction was completed, the amplified product was purified using 50 μL of commercial DNA Clean Beads, and the DNA bound to the magnetic beads was eluted in 20 μL of nuclease-free water.
(VI) Sequencing and analysis of the library
The molar concentration of each purified library was measured with a qPCR library quantification kit, and the libraries were then pooled in equimolar amounts and sequenced on an Illumina platform using an SE75 sequencing strategy with a data volume of 20M. After the data came off the sequencer, the number of reads detected for the novel coronavirus was obtained by bioinformatic analysis.
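Equimolar pooling reduces to a simple volume calculation once each library's qPCR concentration is known. The sketch below is illustrative only: the library names, concentrations, and the 10 fmol target per library are hypothetical values, not figures from the experiments.

```python
from typing import Dict


def equimolar_volumes(conc_nm: Dict[str, float], fmol_per_library: float = 10.0) -> Dict[str, float]:
    """Return the volume (µL) of each library that contains the same molar amount.

    1 nM equals 1 fmol/µL, so volume (µL) = amount (fmol) / concentration (nM).
    """
    volumes = {}
    for name, conc in conc_nm.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has a non-positive concentration")
        volumes[name] = fmol_per_library / conc
    return volumes


if __name__ == "__main__":
    # Hypothetical qPCR results in nM, for illustration only.
    libraries = {"P1": 4.0, "P2": 10.0, "N1": 2.5}
    for name, vol in equimolar_volumes(libraries).items():
        print(f"{name}: {vol:.2f} µL")
```

More dilute libraries contribute proportionally larger volumes, so each library enters the pool with the same number of molecules.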
Example 1: Experimental validation of the reverse transcription primer set
1. Sample source
The novel coronavirus pseudovirus used to prepare the positive reference was purchased from Shanghai assist in san. It uses a retroviral vector carrying a partial sequence of the COVID-19 virus ORF1a/b gene and the complete coding regions of the E and N genes, with a length of 2000 bp. It was verified positive with an in vitro diagnostic reagent (fluorescent quantitative PCR method) and quantified at 2 × 10^8 copies/mL from a standard curve. The cells used were purchased from Nanjing Kyobai as cell pellets of 10^6 cells/tube.
2. Preparation of experimental samples
Because the limit of detection of metagenomic sequencing for RNA viruses in clinical respiratory samples is expected to be 1000 copies/mL, positive samples were prepared at a COVID-19 concentration of 1 × 10^3 copies/mL. To mimic the host content of alveolar lavage fluid, cultured human cell lines were added to the positive samples to a cell concentration of 10^5 cells/mL. A virus-free negative sample containing 10^5 cells/mL of human cells was also prepared.
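Getting from the 2 × 10^8 copies/mL stock described under sample source to a 1 × 10^3 copies/mL sample requires a 200,000-fold dilution, which is normally split into several steps. The sketch below shows one way to plan this; the cap of 100-fold per step and the 1000 μL final volume are illustrative assumptions, not details given in the text.

```python
from typing import List, Tuple


def dilution_steps(stock: float, target: float, max_fold: float = 100.0) -> List[float]:
    """Split the total dilution factor into steps no larger than max_fold."""
    if target <= 0 or stock < target:
        raise ValueError("target must be positive and no greater than stock")
    if max_fold <= 1:
        raise ValueError("max_fold must be greater than 1")
    remaining = stock / target
    steps = []
    while remaining > max_fold:
        steps.append(max_fold)
        remaining /= max_fold
    if remaining > 1.0:
        steps.append(remaining)
    return steps


def step_volumes(fold: float, final_ul: float = 1000.0) -> Tuple[float, float]:
    """Return (transfer volume, diluent volume) in µL for one dilution step."""
    transfer = final_ul / fold
    return transfer, final_ul - transfer


if __name__ == "__main__":
    for fold in dilution_steps(2e8, 1e3):
        transfer, diluent = step_volumes(fold)
        print(f"{fold:g}-fold: {transfer:g} µL into {diluent:g} µL diluent")
```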
3. Experimental methods
In the reverse transcription reaction, 0.4 nmol of the specific primer set was added to the 20 μL reaction system.
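For orientation, the primer amount above can be converted to a concentration. The sketch below does the unit conversion; the per-primer figure assumes the set contains 277 primers (the count given in the claims) mixed in equal molar amounts, which the text does not state.

```python
def total_primer_um(total_nmol: float, reaction_ul: float) -> float:
    """Total primer concentration in µM; nmol/µL equals mM, so multiply by 1000."""
    if reaction_ul <= 0:
        raise ValueError("reaction volume must be positive")
    return total_nmol / reaction_ul * 1000.0


def per_primer_nm(total_um: float, n_primers: int) -> float:
    """Per-primer concentration in nM, assuming an equimolar primer mix."""
    if n_primers < 1:
        raise ValueError("n_primers must be at least 1")
    return total_um * 1000.0 / n_primers


if __name__ == "__main__":
    total = total_primer_um(0.4, 20.0)  # values from the text
    each = per_primer_nm(total, 277)    # equimolar mix is an assumption
    print(f"total: {total:.1f} µM, per primer: {each:.1f} nM")
```

Under these assumptions the total primer concentration is about 20 μM, or roughly 72 nM per primer.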
During RNA library construction, protocols with and without rRNA removal, and with different second-strand synthesis systems and cDNA library construction systems, were compared to identify the protocol giving the highest detection.
The experimental design is as follows:
Figure BDA0003090080200000151
Figure BDA0003090080200000161
4. Experimental results
Figure BDA0003090080200000162
Figure BDA0003090080200000171
Figure BDA0003090080200000181
Thus, removing ribosomal RNA, performing full-length reverse transcription with the specific primers plus random primers, and then building the library by enzymatic fragmentation of the double-stranded cDNA was the only protocol that stably detected novel coronavirus sequences at a concentration of 10^3 copies/mL. It is therefore the most suitable protocol for the COVID-19 multiple genome-specific reverse transcription primer set.
Example 2: Experimental validation of the reverse transcription primer set (verifying that the primer set improves detection).
First, the sample source was the same as in Example 1.
Second, preparation of experimental samples
Positive samples P1 and P2 were prepared with COVID-19 pseudovirus concentrations of 1 × 10^3 copies/mL and 1 × 10^4 copies/mL, respectively. To mimic the host content of alveolar lavage fluid, cultured human cell lines were added to P1 and P2 to a cell concentration of 10^5 cells/mL. A negative sample N1 containing no pseudovirus was also prepared.
Third, experimental methods
The protocol selected in Example 1 was used: after removal of ribosomal RNA, full-length reverse transcription was performed using the specific primers plus random primers, and the library was then built by enzymatic fragmentation of the double-stranded cDNA. The number of reads detected with and without the COVID-19 multiple genome-specific reverse transcription primer set was compared.
The experimental design is as follows:
Figure BDA0003090080200000182
Figure BDA0003090080200000191
Fourth, experimental results
Figure BDA0003090080200000192
Thus, adding the specific primer set increased the number of novel coronavirus reads detected more than twofold. The library construction scheme of rRNA removal + full-length reverse transcription + enzymatic fragmentation gave the most stable detection, with the virus detected in every replicate.
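A comparison of this kind is usually made on read counts normalized to sequencing depth. The sketch below shows one way to compute the fold change; the read counts are hypothetical placeholders and are not the results shown in the figure above.

```python
def reads_per_million(target_reads: int, total_reads: int) -> float:
    """Normalize a target read count to reads per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return target_reads * 1_000_000 / total_reads


def fold_change(rpm_with: float, rpm_without: float) -> float:
    """Ratio of normalized reads with the specific primer set to reads without it."""
    if rpm_without <= 0:
        raise ValueError("baseline must be positive to compute a ratio")
    return rpm_with / rpm_without


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not data from the experiments.
    with_primers = reads_per_million(50, 20_000_000)
    without_primers = reads_per_million(20, 20_000_000)
    print(fold_change(with_primers, without_primers))
```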
Example 3: Experimental validation of the reverse transcription primer set (verifying the performance of the primer set and the detection method).
First, sample source
The COVID-19 pseudovirus and the human cells used were the same as in Example 1, and the virus was purchased from ATCC (American Type Culture Collection). Concentrations were measured by the fluorescent quantitative PCR method.
Second, preparation of experimental samples
Cells and viruses were diluted with PBS and formulated into the following reference materials.
Figure BDA0003090080200000193
Figure BDA0003090080200000201
Third, Experimental methods
The protocol selected in Example 1 was used: after removal of ribosomal RNA, full-length reverse transcription was performed using the specific primers plus random primers, the double-stranded cDNA was enzymatically fragmented, and the library was built and sequenced on the Illumina platform. The number of COVID-19 reads detected was analyzed with bioinformatics software, and a sample was called positive if the read count was greater than 0 and negative otherwise.
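The positive/negative call described here is a plain threshold on the read count. A minimal sketch follows; the function names are made up for this example, and the replicate helper is an assumed way of summarizing repeated measurements rather than part of the described method.

```python
from typing import Iterable


def call_sample(covid_reads: int, threshold: int = 0) -> str:
    """Call a sample positive when its COVID-19 read count exceeds the threshold."""
    if covid_reads < 0:
        raise ValueError("read count cannot be negative")
    return "positive" if covid_reads > threshold else "negative"


def all_replicates_positive(read_counts: Iterable[int]) -> bool:
    """True only if there is at least one replicate and every replicate is positive."""
    counts = list(read_counts)
    return bool(counts) and all(call_sample(n) == "positive" for n in counts)


if __name__ == "__main__":
    # Hypothetical read counts for illustration only.
    print(call_sample(0), call_sample(3))
    print(all_replicates_positive([3, 1, 7]))
```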
The experimental design was as follows:
1) The positive reference P and the negative reference N are each tested once to verify the accuracy of the detection method.
2) The specificity references N1-N4 are each tested once to verify the specificity of the detection method.
3) The sensitivity reference S is tested once to verify the lowest detection limit of the detection method.
4) The precision reference R is tested 10 times to verify the stability of the detection method.
5) The negative and positive quality controls are each tested once for quality control of the detection process, ensuring that the results are true and valid.
Fourth, experimental results
Figure BDA0003090080200000202
Figure BDA0003090080200000211
As can be seen from the table above, the stability verification results (R1-R10) were all positive, so stability passed; the detection limit verification result (S1) was positive, so the lowest detection limit passed; the positive reference (P) was detected as positive and the negative reference (N) as negative, so accuracy passed; and the specificity references (N1-N4) were detected as negative, so specificity passed.
The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.
Sequence listing
<110> Tianjin Jinke Medical Technology Co., Ltd.
<120> method for detecting COVID-19 based on mNGS and application thereof
<160> 20
<170> SIPOSequenceListing 1.0
<210> 1
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
gcaggtgact cag 13
<210> 2
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 2
aattatgagg ttt 13
<210> 3
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
tacttattgt taa 13
<210> 4
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
cctgttttcc ttc 13
<210> 5
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
tgactcttgg tgt 13
<210> 6
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
tgagagtaag act 13
<210> 7
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
gaacttctac atg 13
<210> 8
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
tcacggacag cat 13
<210> 9
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
aacactgttt aca 13
<210> 10
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
atgtgctgga gca 13
<210> 11
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
aaacatgcat tcc 13
<210> 12
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
acagacagca cca 13
<210> 13
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
gctggccttg aag 13
<210> 14
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
acaattgaag aag 13
<210> 15
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
taagagtcat ttt 13
<210> 16
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
cagtacaaaa gac 13
<210> 17
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 17
gatgtaaact tac 13
<210> 18
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
catagaagtc ttt 13
<210> 19
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 19
gtaaataaat ttt 13
<210> 20
<211> 13
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
tttctccctc taa 13

Claims (6)

1. A library construction method for detecting the COVID-19 novel coronavirus based on mNGS, characterized by comprising the following steps:
1) preparing a multiple genome-specific reverse transcription primer set: designing and screening the multiple genome-specific reverse transcription primer set;
2) reverse transcription and cDNA synthesis: comprising performing multiplex amplification using the designed multiple genome-specific reverse transcription primer set;
3) library construction: constructing a library from the cDNA synthesized by reverse transcription;
wherein step 2) comprises adding random reverse transcription primers to the multiple genome-specific reverse transcription primer set before performing the multiplex amplification;
in the reverse transcription and cDNA synthesis step, the multiplex amplification is carried out with a full-length reverse transcriptase, the full-length reverse transcriptase being HiScript III Enzyme;
and the primer sequences of the multiple genome-specific reverse transcription primer set are as shown in SEQ ID NO. 1-277.
2. The library construction method for detecting the COVID-19 novel coronavirus based on mNGS as claimed in claim 1, wherein 1) preparing the multiple genome-specific reverse transcription primer set comprises the following steps:
step 1) generating a multiple genome: downloading N1 genome sequences of the virus from a genome database, and aligning all the downloaded genome sequences, wherein 100 < N1 < 1000;
step 2) preparing a candidate primer set: dividing the aligned multiple genome sequence into short segments of 300-500 nt that overlap by 150-300 nt; taking 30-60 nt at each end of every segment as a primer design region; randomly designing a plurality of forward and/or reverse primers of 10-15 nt for each primer design region to form the to-be-merged primer set of that region; sorting all primers of the to-be-merged primer set of each region by frequency of occurrence; selecting the primer with the highest frequency of occurrence and no ambiguous base; deleting all primers with the same sequence as the selected primer; sorting and screening the remaining primers again; and repeating N2 times to obtain the candidate primer set, wherein 1 ≤ N2 ≤ 8;
step 3) primer screening: performing a secondary screening of the candidate primer set and deleting primers that meet any one or more of the following: a Tm value deviating from the average by more than 2 standard deviations, possible formation of a self-dimer or cross-dimer, or a homopolymer run of more than 5 nt; the multiple genome reverse transcription primer set being obtained after this screening;
step 4) designing conserved-region primers: for the conserved region, designing 13 nt primers at the 3' end of the region using Primer3 software; deleting primers whose Tm value deviates by more than 2 standard deviations from the average of the primer set obtained in step 3) and primers that may form cross-dimers with the primer set obtained in step 3); and selecting the primer ranked highest by the software to replace the primer in the same primer design region of the primer set of step 3), forming the final reverse transcription primer set.
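The tiling and frequency-based selection of step 2) can be sketched as follows. This is an illustrative reading of the claim, not the inventors' implementation: it enumerates every k-mer in a design region instead of designing primers randomly, it ignores primer orientation (reverse complementing), and the default values (400 nt segments, 200 nt overlap, 45 nt design regions, 13 nt primers, 3 rounds) are arbitrary choices within the claimed ranges.

```python
from collections import Counter
from typing import List, Tuple


def tile_segments(length: int, seg_len: int = 400, overlap: int = 200) -> List[Tuple[int, int]]:
    """Return (start, end) windows of seg_len nt that overlap by `overlap` nt."""
    if not 0 <= overlap < seg_len:
        raise ValueError("overlap must be non-negative and smaller than seg_len")
    segments, start = [], 0
    while start < length:
        end = min(start + seg_len, length)
        segments.append((start, end))
        if end == length:
            break
        start += seg_len - overlap
    return segments


def design_regions(segment: Tuple[int, int], region_len: int = 45):
    """Return the primer design regions at the two ends of a segment."""
    start, end = segment
    return (start, min(start + region_len, end)), (max(end - region_len, start), end)


def candidate_kmers(aligned: List[str], region: Tuple[int, int], k: int = 13) -> List[str]:
    """Collect every k-mer found in the region across all aligned genomes (gaps removed)."""
    start, end = region
    kmers = []
    for seq in aligned:
        window = seq[start:end].replace("-", "").upper()
        kmers.extend(window[i:i + k] for i in range(len(window) - k + 1))
    return kmers


def pick_by_frequency(kmers: List[str], n_rounds: int = 3) -> List[str]:
    """Repeatedly take the most frequent k-mer without ambiguous bases, then drop it."""
    counts = Counter(kmers)
    chosen = []
    for _ in range(n_rounds):
        valid = [s for s in counts if set(s) <= set("ACGT")]
        if not valid:
            break
        # Highest count first; ties broken alphabetically so the result is deterministic.
        best = min(valid, key=lambda s: (-counts[s], s))
        chosen.append(best)
        del counts[best]  # delete all primers with the same sequence
    return chosen
```

A caller would loop over `tile_segments`, take both `design_regions` of each segment, and feed `candidate_kmers` into `pick_by_frequency` to obtain the candidate set for that region.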
3. The library construction method of claim 2, wherein the genome database in step 1) includes but is not limited to the NCBI GenBank database, the DDBJ database, and the EMBL database; and the alignment in step 1) aligns all downloaded genomes using MAFFT, a multiple alignment program based on the fast Fourier transform.
4. A multiple genome-specific reverse transcription primer set for the COVID-19 novel coronavirus, characterized in that the primer sequences are as shown in SEQ ID NO. 1-277.
5. A method of preparing the multiple genome-specific reverse transcription primer set for the COVID-19 novel coronavirus of claim 4, comprising the following steps:
step 1) generating a multiple genome: downloading N1 genome sequences of the virus from a genome database, and aligning all the downloaded genome sequences, wherein 100 < N1 < 1000;
step 2) preparing a candidate primer set: dividing the aligned multiple genome sequence into short segments of 300-500 nt that overlap by 150-300 nt; taking 30-60 nt at each end of every segment as a primer design region; randomly designing a plurality of forward and/or reverse primers of 10-15 nt for each primer design region to form the to-be-merged primer set of that region; sorting all primers of the to-be-merged primer set of each region by frequency of occurrence; selecting the primer with the highest frequency of occurrence and no ambiguous base; deleting all primers with the same sequence as the selected primer; sorting and screening the remaining primers again; and repeating N2 times to obtain the candidate primer set, wherein 1 ≤ N2 ≤ 8;
step 3) primer screening: performing a secondary screening of the candidate primer set and deleting primers that meet any one or more of the following: a Tm value deviating from the average by more than 2 standard deviations, possible formation of a self-dimer or cross-dimer, or a homopolymer run of more than 5 nt; the multiple genome reverse transcription primer set being obtained after this screening;
step 4) designing conserved-region primers: for the conserved region, designing 13 nt primers at the 3' end of the region using Primer3 software; deleting primers whose Tm value deviates by more than 2 standard deviations from the average of the primer set obtained in step 3) and primers that may form cross-dimers with the primer set obtained in step 3); and selecting the primer ranked highest by the software to replace the primer in the same primer design region of the primer set of step 3), forming the final reverse transcription primer set.
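Two of the screening criteria in step 3), the Tm outlier filter and the homopolymer filter, can be sketched as below. The claim does not say how Tm is calculated, so the Wallace rule (2 °C per A/T, 4 °C per G/C) is used here purely as a stand-in; the dimer check is omitted, and the function names are hypothetical.

```python
import re
from statistics import mean, pstdev
from typing import List


def wallace_tm(primer: str) -> float:
    """Approximate Tm by the Wallace rule; an assumed stand-in, not the patent's method."""
    p = primer.upper()
    return 2.0 * (p.count("A") + p.count("T")) + 4.0 * (p.count("G") + p.count("C"))


def has_long_homopolymer(primer: str, max_run: int = 5) -> bool:
    """True if any base is repeated more than max_run times in a row."""
    return re.search(r"(.)\1{%d,}" % max_run, primer.upper()) is not None


def screen_primers(primers: List[str], n_sd: float = 2.0) -> List[str]:
    """Drop primers whose Tm is more than n_sd standard deviations from the mean,
    or that contain a homopolymer run longer than 5 nt. Dimer checks are not included."""
    if not primers:
        return []
    tms = [wallace_tm(p) for p in primers]
    mu, sd = mean(tms), pstdev(tms)
    kept = []
    for primer, tm in zip(primers, tms):
        if abs(tm - mu) > n_sd * sd:
            continue
        if has_long_homopolymer(primer):
            continue
        kept.append(primer)
    return kept


if __name__ == "__main__":
    # SEQ ID NO. 1 from the sequence listing, used here only as sample input.
    print(wallace_tm("gcaggtgactcag"))
```

Note that a single outlier can only exceed 2 population standard deviations when the set has at least six primers, so the Tm filter has no effect on very small sets.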
6. Use of the multiple genome-specific reverse transcription primer set according to claim 4, comprising:
1) use in the multiplex amplification of the COVID-19 novel coronavirus, said use being a non-disease diagnostic use;
2) use in COVID-19 novel coronavirus sequencing library construction;
3) use in targeted enrichment of the COVID-19 novel coronavirus;
4) use in preparing a reagent for detecting the COVID-19 novel coronavirus.
CN202110597445.2A 2021-05-28 2021-05-28 Method for detecting COVID-19 based on mNGS and application thereof Active CN113337639B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110597445.2A CN113337639B (en) 2021-05-28 2021-05-28 Method for detecting COVID-19 based on mNGS and application thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110597445.2A CN113337639B (en) 2021-05-28 2021-05-28 Method for detecting COVID-19 based on mNGS and application thereof

Publications (2)

Publication Number Publication Date
CN113337639A CN113337639A (en) 2021-09-03
CN113337639B true CN113337639B (en) 2022-01-25

Family

ID=77472608

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110597445.2A Active CN113337639B (en) 2021-05-28 2021-05-28 Method for detecting COVID-19 based on mNGS and application thereof

Country Status (1)

Country Link
CN (1) CN113337639B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115101126B (en) * 2022-02-22 2023-04-18 中国医学科学院北京协和医院 Respiratory tract virus and/or bacterial subtype primer design method and system based on CE platform
CN114550816B (en) * 2022-03-01 2022-11-04 上海图灵智算量子科技有限公司 Method for predicting virus mutation probability based on photonic chip
CN114317705A (en) * 2022-03-03 2022-04-12 天津金匙医学科技有限公司 Relative quantitative detection method for mNGS pathogens using a single label
CN114574606B (en) * 2022-04-02 2023-04-28 予果生物科技(北京)有限公司 Primer group for detecting mycobacterium tuberculosis in metagenome and high-throughput sequencing method

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111118226B (en) * 2020-03-25 2021-04-02 北京微未来科技有限公司 Novel coronavirus whole genome capture method, primer group and kit
CN111334868B (en) * 2020-03-26 2023-05-23 福州福瑞医学检验实验室有限公司 Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction
CN111500781B (en) * 2020-05-15 2021-10-29 广州微远医疗器械有限公司 Amplification primer group for detecting SARS-CoV-2 by mNGS and application thereof
CN112322788B (en) * 2020-11-24 2021-07-06 杭州杰毅生物技术有限公司 mNGS primer group and kit for detecting SARS-CoV-2

Also Published As

Publication number Publication date
CN113337639A (en) 2021-09-03

Similar Documents

Publication Publication Date Title
CN113337639B (en) Method for detecting COVID-19 based on mNGS and application thereof
CN110093455B (en) Respiratory virus detection method
CN113073150B (en) Digital PCR detection kit for novel coronavirus and variant thereof
CN106906211B (en) Molecular joint and application thereof
CN111334615B (en) Novel coronavirus detection method and kit
CN111440896B (en) Novel beta coronavirus variation detection method, probe and kit
CN105400776B (en) Oligonucleotide linker and application thereof in constructing nucleic acid sequencing single-stranded circular library
WO2017054302A1 (en) Sequencing library, and preparation and use thereof
CN111808854B (en) Balanced joint with molecular bar code and method for quickly constructing transcriptome library
JP2024105673A (en) Creation of single-stranded circular DNA templates for single molecule sequencing
CN111321202A (en) Gene fusion variation library construction method, detection method, device, equipment and storage medium
CN107699957A (en) Fusion based on DNA, which is quantitatively sequenced, builds storehouse, detection method and its application
CN111593142A (en) Detection kit for simultaneously detecting nine respiratory viruses including SARS-CoV-2
CN113249437A (en) Library construction method for sRNA sequencing
CN108192965B (en) Method for detecting heterogeneity of mitochondrial genome A3243G locus
US20220033809A1 (en) Method and kit for construction of rna library
TW201321520A (en) Method and system for virus detection
CN111979353A (en) Library construction method for sequencing novel coronavirus SARS-CoV-2 full-length genome
CN106566875A (en) Primers, kit and method for detecting myelodysplastic syndromes (MDS) gene mutation
CN116287162A (en) Kit for detecting BCR-ABL1 fusion gene and tyrosine kinase region mutation and promoter methylation thereof and application method
CN115094164A (en) Multiplex qPCR (quantitative polymerase chain reaction) kit and detection method for ASFV (African swine fever virus) with different gene deletion types
CN114790579A (en) Method for constructing new coronavirus sequencing library, method for determining new coronavirus nucleic acid sequence, sequencing library and kit
US20200208140A1 (en) Methods of making and using tandem, twin barcode molecules
CN111394474A (en) Method for detecting copy number variation of cattle GAL3ST1 gene and application thereof
CN109609661A (en) A kind of combination of kidney-yang deficiency exogenous disease mouse model lung tissue qPCR reference gene and its screening technique

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant