CN113337639B

CN113337639B - Method for detecting COVID-19 based on mNGS and application thereof

Info

Publication number: CN113337639B
Application number: CN202110597445.2A
Authority: CN
Inventors: 王棪; 梁永; 李玉龙; 李立锋; 蒋智
Original assignee: Tianjin Jinke Medical Technology Co ltd
Current assignee: Tianjin Jinke Medical Technology Co ltd
Priority date: 2021-05-28
Filing date: 2021-05-28
Publication date: 2022-01-25
Anticipated expiration: 2041-05-28
Also published as: CN113337639A

Abstract

The invention provides a method for detecting the novel coronavirus (COVID-19) based on mNGS and an application thereof. By designing and preparing a multiple-genome-specific reverse transcription primer set for the novel coronavirus, the method improves the reverse transcription efficiency of viral RNA, reduces aerosol contamination, and improves the enrichment of nucleic acid sequences from different viral variants.

Description

Method for detecting COVID-19 based on mNGS and application thereof

Technical Field

The invention belongs to the field of gene detection, and particularly relates to a method for detecting COVID-19 based on mNGS and application thereof.

Background

The symptoms of novel coronavirus pneumonia resemble those of common pneumonia, so rapid and accurate diagnosis plays a crucial role in patient treatment. At present, the mainstream means of diagnosis and epidemic screening is real-time fluorescent RT-PCR, which is simple to operate, fast, and highly sensitive, and is the first choice for rapid diagnosis and large-scale population screening. However, the technique also has drawbacks. Current commercial kits usually target two specific gene loci of the novel coronavirus, so only two short sequences can be detected. The viral nucleic acid is RNA, which degrades more readily than DNA; after sampling the virus dies and lyses and RNA integrity declines, so a false negative can result if the residual nucleic acid does not include the target region. In addition, the virus mutates continuously, and once the target region in a sample mutates there is a risk of missed detection.

By contrast, genome detection of the novel coronavirus based on second-generation sequencing can cover the full length of the COVID-19 sequence. This overcomes the limitation that RT-PCR detects only a small number of known COVID-19 regions. It can improve detection accuracy, identify possible viral variation, support assessment of the disease according to genotype and formulation of a treatment plan, and allow the source to be traced and the transmission path found from that variation. Genome detection of the novel coronavirus has therefore been widely accepted by the medical community in China as an in vitro diagnostic method, and it is given the same standing as real-time fluorescent RT-PCR in the diagnosis and treatment protocol for novel coronavirus pneumonia issued by the National Health Commission.

Common genome sequencing approaches include metagenomic and targeted-enrichment detection. Metagenomic sequencing currently detects all nucleic acids in a sample indiscriminately; the high proportion of human nucleic acid not only lowers sensitivity but, because that proportion differs between samples, also makes signal intensity unstable and produces false negatives. Targeted enrichment mostly uses multiplex PCR to enrich novel coronavirus genome sequences and remove the interference of human nucleic acid, but the cDNA amplification after reverse transcription uses a high cycle number, and PCR products readily cause cross-contamination and aerosol contamination, producing false positives. Moreover, designing primers against only the single reference genome sequence provided by NCBI inevitably misses variant sequences. The novel coronavirus mutates quickly and its sequence diversity is very rich; new subtypes are continually discovered and uploaded worldwide, and the number of novel coronavirus gene sequences recorded by NCBI (National Center for Biotechnology Information) has reached 50,326. Designing primers against one fixed genome sequence reduces enrichment of variant virus sequences and thus causes false negatives.

The invention is provided in view of the above.

Disclosure of Invention

The invention aims to provide a method for effectively detecting the novel coronavirus COVID-19. To achieve this aim, the following technical scheme is provided:

The invention first provides a method for preparing a multiple-genome-specific reverse transcription primer set for a virus, comprising the following steps:

Step 1) generating a multiple genome: downloading N1 genome sequences of the virus from a genome database and aligning all downloaded genome sequences, where 100 < N1 < 1000.

Step 2) preparing a candidate primer set: dividing the aligned multiple genome sequence into short fragments of 300-500 nt (preferably 394 nt) that overlap by 150-300 nt (preferably 197 nt); taking 30-60 nt (preferably 50 nt) at both ends of each fragment as primer design regions; and randomly designing a plurality of forward and/or reverse primers of 10-15 nt (preferably 13 nt) against each primer design region to form that region's primer pool to be merged. All primers in each region's pool are ranked by frequency of occurrence (preferably from high to low); the most frequent primer containing no uncertain base is selected; all primers with the same sequence as the selected primer are deleted; and the remaining primers are ranked and screened again. This is repeated N2 times to obtain the candidate primer set, where 1 ≤ N2 ≤ 8.

Further, the method further comprises:

Step 3) primer screening: performing a secondary screen on the candidate primer set and deleting primers that meet any one or more of the following conditions: a Tm value deviating from the mean by more than 2 standard deviations; a likelihood of forming self-dimers or cross-dimers; or a homopolymer run of more than 5 nt. The multiple-genome reverse transcription primer set is obtained after this screening and elimination.

Step 4) designing conserved-region primers: primer design regions located in highly conserved regions (such as the E gene) have identical sequences in all genomes. For such regions, 13 nt primers are designed at the 3' end of the region using Primer3 software. Primers whose Tm values deviate by more than 2 standard deviations from the mean of the primer set obtained in step 3), or that may form cross-dimers with that primer set, are deleted; the primer ranked highest by the software is then selected to replace the primer for the same design region in the step 3) primer set, forming the final reverse transcription primer set.
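The screening conditions of step 3) can be illustrated with a minimal Python sketch. The text does not say how Tm or dimer formation is evaluated, so the Wallace rule (2 °C per A/T, 4 °C per G/C) and the simple reverse-complement search used here are assumptions chosen for illustration only; the function names and the `min_stem` parameter are hypothetical, and cross-dimer checking between different primers is left out.

```python
import re
import statistics

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wallace_tm(primer):
    """Approximate Tm by the Wallace rule (an assumption, not from the source)."""
    p = primer.upper()
    return 2 * (p.count("A") + p.count("T")) + 4 * (p.count("G") + p.count("C"))


def has_long_homopolymer(primer, max_run=5):
    """True if any base repeats more than max_run times in a row."""
    return re.search(r"(.)\1{%d,}" % max_run, primer.upper()) is not None


def may_self_dimerize(primer, min_stem=6):
    """Crude check: some min_stem-nt stretch has its reverse complement in the primer."""
    p = primer.upper()
    for i in range(len(p) - min_stem + 1):
        stem = p[i:i + min_stem]
        if stem.translate(COMPLEMENT)[::-1] in p:
            return True
    return False


def screen(primers):
    """Drop Tm outliers (> 2 SD from the mean), long homopolymers and self-dimer risks."""
    tms = [wallace_tm(p) for p in primers]
    mean, sd = statistics.mean(tms), statistics.pstdev(tms)
    kept = []
    for primer, tm in zip(primers, tms):
        if sd > 0 and abs(tm - mean) > 2 * sd:
            continue
        if has_long_homopolymer(primer) or may_self_dimerize(primer):
            continue
        kept.append(primer)
    return kept
```

A production workflow would more likely rely on a nearest-neighbour thermodynamic model for Tm and dimer prediction; the sketch only shows how the three stated rejection rules combine.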

Further, the genome database in step 1) includes, but is not limited to, the NCBI GenBank, DDBJ, and EMBL databases; preferably, multiple genome sequences of the virus are downloaded from the NCBI database and all downloaded genomes are aligned using MAFFT (multiple alignment using fast Fourier transform).

Further, step 2) is as follows: preparing a candidate primer set: using PYFASTA software, the aligned multiple genome sequence is divided into 394 nt short fragments with 197 nt overlap; 50 nt at both ends of each fragment are taken as primer design regions; and a plurality of 13 nt forward and/or reverse primers are designed against each primer design region to form that region's primer pool to be merged. All primers in each region's pool are ranked from high to low by frequency of occurrence; the most frequent primer containing no uncertain base is selected; all primers with the same sequence as the selected primer are deleted; and the remaining primers are ranked and screened again. This is repeated N times to obtain the candidate primer set, where 1 ≤ N ≤ 8.
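The fragmenting and frequency-ranked selection of step 2) can be sketched in Python as follows. This is one possible reading, not the inventors' implementation: it enumerates every 13-mer in each design region instead of sampling primers randomly, handles only the forward orientation, does not use PYFASTA, and assumes the alignment is at least 394 columns long. The function names and the `n_rounds` parameter are hypothetical.

```python
from collections import Counter

FRAGMENT_LEN = 394  # nt, preferred fragment length
OVERLAP = 197       # nt, preferred overlap between neighbouring fragments
REGION_LEN = 50     # nt taken from each fragment end as a primer design region
PRIMER_LEN = 13     # nt, preferred primer length
VALID_BASES = set("ACGT")  # anything else (N, IUPAC codes, gaps) counts as uncertain


def design_regions(aln_len):
    """Yield (start, end) of the design regions at both ends of each fragment.

    Assumes aln_len >= FRAGMENT_LEN; a trailing partial fragment is ignored.
    """
    step = FRAGMENT_LEN - OVERLAP
    for frag_start in range(0, aln_len - FRAGMENT_LEN + 1, step):
        frag_end = frag_start + FRAGMENT_LEN
        yield frag_start, frag_start + REGION_LEN
        yield frag_end - REGION_LEN, frag_end


def candidate_primers(aligned_seqs, n_rounds=3):
    """Pick up to n_rounds most frequent unambiguous 13-mers per design region."""
    candidates = {}
    for start, end in design_regions(len(aligned_seqs[0])):
        # Pool every 13-mer seen in this region across all genomes.
        counts = Counter()
        for seq in aligned_seqs:
            region = seq[start:end].upper()
            for i in range(len(region) - PRIMER_LEN + 1):
                counts[region[i:i + PRIMER_LEN]] += 1
        picked = []
        for _ in range(n_rounds):
            usable = [p for p in counts if set(p) <= VALID_BASES]
            if not usable:
                break
            # Highest frequency first; ties broken alphabetically for reproducibility.
            best = min(usable, key=lambda p: (-counts[p], p))
            picked.append(best)
            del counts[best]  # drop every copy of the selected sequence
        candidates[(start, end)] = picked
    return candidates
```

Because each round removes the selected sequence before ranking again, `n_rounds` plays the role of the repetition count (1 to 8) described in the text.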

Further, the virus is the novel coronavirus COVID-19;

further, the virus is the novel coronavirus COVID-19 detected based on mNGS.

In some embodiments, the sequences of the final multiple-genome reverse transcription primer set are shown in SEQ ID NOs. 1-277.

The invention also provides a library construction method for detecting the novel coronavirus COVID-19 based on mNGS, comprising the following steps:

1) designing and screening a multiple-genome-specific reverse transcription primer set by the preparation method described above;

2) reverse transcription and cDNA synthesis: comprising multiplex amplification using the designed multiple-genome-specific reverse transcription primer set;

3) library construction: constructing a library from the cDNA sequences.

Further, the reverse transcription and cDNA synthesis of step 2) comprises adding random reverse transcription primers to the multiple-genome reverse transcription primer set and then performing multiplex amplification;

further, the multiplex amplification is carried out using a full-length reverse transcriptase;

in some embodiments, the full-length reverse transcriptase is the HiScript III Enzyme system.

In some more preferred embodiments, the multiplex amplification step is specifically as follows:

heating the sample at 65 °C for 5 min, quenching rapidly on ice, and keeping it on ice for 2 min;

the following reaction system was prepared:

Reagent	Volume
10×RT Mix	2 μl
HiScript III Enzyme Mix	2 μl
Reverse transcription primer working solution	2 μl
Random hexamers	1 μl

First-strand cDNA synthesis was performed under the following conditions:

Heated lid (105 °C)	On
25 °C	5 min
37 °C	45 min
85 °C	5 sec
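As a quick consistency check on the reaction system above, the listed reagent volumes add up to 7 μl; together with the 13 μL sample volume given in Experimental example 2, this gives the 20 μL reaction referred to in Example 1. The short Python sketch below scales the reagents to a master mix for several reactions. The 10% pipetting overage and the function name are assumptions for illustration, not values from the source.

```python
# Per-reaction reagent volumes in microlitres, taken from the table above.
REAGENTS_UL = {
    "10x RT Mix": 2.0,
    "HiScript III Enzyme Mix": 2.0,
    "Reverse transcription primer working solution": 2.0,
    "Random hexamers": 1.0,
}
SAMPLE_UL = 13.0  # sample volume stated in Experimental example 2

# 13 uL sample + 7 uL reagents = 20 uL per reaction.
per_reaction_total = SAMPLE_UL + sum(REAGENTS_UL.values())


def master_mix(n_reactions, overage=0.10):
    """Reagent volumes (uL) for n_reactions, with a hypothetical pipetting overage."""
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in REAGENTS_UL.items()}
```

For example, `master_mix(8, overage=0.0)` returns the exact volumes for eight reactions; each reaction then receives 7 μl of mix plus 13 μL of denatured sample.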

Further, double-stranded cDNA is synthesized from the multiplex-amplified sequences using a commercial double-strand synthesis system.

Further, the step 1) of designing and screening multiple genome-specific reverse transcription primer sets comprises the following steps:

The method further comprises the following steps:

Further, the virus is the novel coronavirus COVID-19;

further, the virus is the novel coronavirus COVID-19 detected based on mNGS.

The invention also provides a method for detecting the novel coronavirus COVID-19 based on mNGS, which comprises the steps of the library construction method described above, followed by sequencing and bioinformatic analysis.

The invention also provides a multiple-genome-specific reverse transcription primer set for the novel coronavirus COVID-19, the primer sequences of which are shown in SEQ ID NOs. 1-277.

Further, the primer set is obtained by the method for designing and screening multiple-genome-specific reverse transcription primer sets described above.

The invention also provides any one of the following uses of the multiple genome-specific reverse transcription primer set:

1) use in multiplex amplification of the novel coronavirus COVID-19;

2) use in constructing a sequencing library of the novel coronavirus COVID-19;

3) use in targeted enrichment of the novel coronavirus COVID-19;

4) use in preparing a reagent for detecting the novel coronavirus COVID-19.

The invention also provides a kit for nucleic acid library construction or nucleic acid detection of the novel coronavirus COVID-19, comprising the primer set of SEQ ID NOs. 1-277.

Further, the kit also comprises a full-length reverse transcriptase; in some preferred embodiments, the full-length reverse transcriptase is the HiScript III Enzyme system.

Compared with the prior art, the invention has at least the following advantages:

1) The invention adds novel-coronavirus-specific primers together with random primers at the reverse transcription stage, which improves the reverse transcription efficiency of viral RNA and, while achieving enrichment, reduces the probability of aerosol contamination.

2) For primer design, the invention uses a set of multiple novel coronavirus genome sequences as the reference; specifically, the reverse transcription primers are designed from more than 100 genomes. This takes the diversity of novel coronavirus sequences into account and improves enrichment of nucleic acid from different variants. Because primer screening selects the most frequent primer sequences across many different variants, there is also a high probability of successfully enriching new variants that may appear in the future.

3) The final primer system SEQ ID NOs. 1-277, established through design and screening, can be used comprehensively and efficiently for library construction of the novel coronavirus and its variant strains, ensuring high sensitivity in subsequent sequencing detection.

4) Compared with the conventional approach of fragmenting RNA before amplification and library construction, the method amplifies long fragments first and fragments afterwards. Specifically, HiScript III Enzyme Mix is used as a full-length reverse transcriptase during reverse transcription, which increases the extension length from the specific primers and improves enrichment efficiency.

Detailed Description

The technical solutions of the present invention are described clearly and completely below, and it is obvious that the described embodiments are some, not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The following terms or definitions are provided only to aid in understanding the present invention. These definitions should not be construed to have a scope less than understood by those skilled in the art.

Unless defined otherwise below, all technical and scientific terms used in the detailed description of the present invention are intended to have the same meaning as commonly understood by one of ordinary skill in the art. While the following terms are believed to be well understood by those skilled in the art, the following definitions are set forth to better explain the present invention.

As used herein, the terms "comprising," "including," "having," "containing," or "involving" are inclusive or open-ended and do not exclude additional unrecited elements or method steps. The term "consisting of …" is considered to be a preferred embodiment of the term "comprising". If in the following a certain group is defined to comprise at least a certain number of embodiments, this should also be understood as disclosing a group which preferably only consists of these embodiments.

Where an indefinite or definite article is used when referring to a singular noun e.g. "a" or "an", "the", this includes a plural of that noun.

The terms "about" and "substantially" in the present invention denote an interval of accuracy that can be understood by a person skilled in the art, which still guarantees the technical effect of the feature in question. The term generally denotes a deviation of ± 10%, preferably ± 5%, from the indicated value.

Furthermore, the terms first, second, third, (a), (b), (c), and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.

Specific examples are as follows.

Experimental example 1 primer design

1. Generation of multiple genomes

First, more than 100 COVID-19 novel coronavirus genomes were downloaded from the NCBI website, and all downloaded genomes were aligned using MAFFT (multiple alignment using fast Fourier transform, v7.388). Note that, to ensure that the resulting primers can detect variant sequences, the multiple genome must be built from more than 100 source sequences.
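Before primer design, it can be useful to confirm that the aligned file really satisfies the two conditions stated above: more than 100 genomes, and equal-length (aligned) sequences. The Python sketch below parses FASTA text and reports any problems. It assumes the alignment was written as FASTA, for example by a typical invocation such as `mafft --auto genomes.fasta > aligned.fasta`; the file names and function names are hypothetical and not taken from the source.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dictionary."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {header: "".join(parts) for header, parts in records.items()}


def check_alignment(records, min_genomes=100):
    """Return a list of problems; an empty list means the alignment looks usable."""
    problems = []
    if len(records) <= min_genomes:
        problems.append(
            f"only {len(records)} genomes; more than {min_genomes} are required"
        )
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) > 1:
        problems.append("sequences differ in length; input is not an alignment")
    return problems
```

A call such as `check_alignment(read_fasta(open("aligned.fasta").read()))` would return an empty list for a valid input.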

2. Candidate primer design

Using PYFASTA software, the aligned multiple genome was divided into 394 nt short fragments with 197 nt overlap, and 50 nt at both ends of each fragment were taken as primer design regions. A plurality of 13 nt forward or reverse primers were designed to form each region's primer pool to be merged. All primers in each region's pool were ranked from high to low by frequency of occurrence; the most frequent primer containing no uncertain base was selected; all primers with the same sequence as the selected primer were deleted; and the remaining primers were ranked and screened again. After several repetitions, a large candidate primer set was obtained.

3. Primer screening

The resulting candidate primer set was subjected to a secondary screen to delete primers with problems such as a Tm value deviating from the mean by more than 2 standard deviations, a likelihood of forming self-dimers or cross-dimers, or a homopolymer run of more than 5 nt. These are examples only; the actual process includes further screening conditions. The final COVID-19 multiple-genome reverse transcription primer set was obtained after this secondary screening and elimination, and its sequences are shown in the following table.

The invention further includes manual sequence adjustment and the like, for example conserved-region primer design: primer design regions located in highly conserved regions (such as the E gene) have identical sequences in all genomes. For such regions, 13 nt primers are designed at the 3' end of the region using Primer3 software. Primers whose Tm values deviate by more than 2 standard deviations from the mean of the primer set obtained in step 3), or that may form cross-dimers with that primer set, are deleted; the primer ranked highest by the software is then selected to replace the primer for the same design region in the step 3) primer set, forming the final reverse transcription primer set.
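Identifying which design regions qualify as conserved can be sketched as a simple check that every genome carries the same unambiguous sequence across the region. The Python sketch below is only an illustration: it does not call Primer3, it merely extracts the 13 nt window at the 3' end of a conserved region as a starting candidate, and it ignores strand orientation. The function names are hypothetical.

```python
def is_conserved(aligned_seqs, start, end):
    """True if every genome has the identical, unambiguous sequence in [start, end)."""
    regions = {seq[start:end].upper() for seq in aligned_seqs}
    if len(regions) != 1:
        return False
    return set(next(iter(regions))) <= set("ACGT")


def three_prime_window(aligned_seqs, start, end, primer_len=13):
    """Return the primer_len-nt window at the 3' end of a conserved region, else None."""
    if not is_conserved(aligned_seqs, start, end):
        return None
    return aligned_seqs[0][end - primer_len:end].upper()
```

In the workflow described above, the candidate from such a window would still pass through the Tm and cross-dimer checks before replacing the existing primer for that region.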

Experimental example 2 experimental procedure

(I) Preparation of primers

(1) The primers were synthesized at 10 nmol per tube and PAGE-purified, 277 tubes in total.

(2) The primers were centrifuged at 4000 rpm for 1 min.

(3) Tris-HCl was added to each tube; the tubes were vortexed, placed on ice, vortexed again, and briefly centrifuged.

(4) The primers from all tubes were combined into one tube to give the COVID-19 multiple-genome reverse transcription primer working solution for later use.

(II) RNA nucleic acid extraction

RNA was extracted using the viral RNA extraction kit from Qiagen, following the product instructions. After sample lysis, ethanol is added to precipitate the RNA; during centrifugation through the column, the precipitated RNA binds to the silica membrane under high-salt conditions; residual impurities on the membrane are washed away with wash buffer; and finally the RNA is eluted from the membrane under low-salt, high-pH conditions.

(III) human ribosomal RNA removal

Host ribosomal RNA was removed using a commercial ribosomal RNA removal kit (human/mouse/rat) and RNA purification magnetic beads. First, rRNA-specific DNA probes are hybridized fully to human ribosomal RNA by slowly lowering the incubation temperature. Second, the hybrids are digested with RNase H, which specifically digests the RNA in RNA-DNA hybrids. The remaining DNA probes are then digested with a DNase. Finally, the remaining RNA is captured with the RNA purification magnetic beads supplied with the kit, and the bead-bound RNA is redissolved in nuclease-free water.

(IV) reverse transcription

(1) First-strand cDNA synthesis using a full-length reverse transcription system

(1.1) Take 13 μL of sample; if the volume is insufficient, make up to 13 μL with RNase-free ddH₂O. Heat at 65 °C for 5 min, quench rapidly on ice, and keep on ice for 2 min.

(1.2) preparing a reaction system:

The HiScript III Enzyme Mix was from the HiScript III 1st Strand cDNA Synthesis Kit (+gDNA wiper) from Vazyme (Nanjing).

(1.3) First-strand cDNA synthesis was carried out under the following conditions:

Heated lid (105 °C)	On
25 °C	5 min
37 °C	45 min
85 °C	5 sec

(2) Double-stranded cDNA synthesis, using a commercial double-strand synthesis system:

(2.1) Take the components required for second-strand synthesis out of storage at -30 to -15 °C, thaw them on ice, mix by inversion, and briefly centrifuge to collect them at the bottom of the tube. Prepare the second-strand cDNA synthesis reaction system as follows:

Component	Volume (μl)
Single-stranded cDNA	20
Double-strand synthesis buffer	20
Double-strand synthesis enzyme	5
Nuclease-free water	5
Total	50

(2.2) Use a 100 μL-range pipette and gently pipette up and down 10 times to mix thoroughly.

(2.3) Place the PCR tube on ice temporarily, set the following program on the PCR instrument, then place the PCR tube in the instrument and run the program:

(2.4) Immediately after the PCR reaction is complete, purify the product with 90 μL of commercial DNA Clean Beads and redissolve the cDNA captured on the magnetic beads in 50 μL of nuclease-free water.
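As a check on the volumes in this step, the second-strand reaction components sum to the stated 50 μL total, and the 90 μL of beads added to that reaction corresponds to a bead-to-sample volume ratio of 1.8. This assumes the whole 50 μL reaction is carried into purification, which the text implies but does not state outright.

```latex
\begin{aligned}
V_{\text{reaction}} &= 20 + 20 + 5 + 5 = 50\ \mu\text{L} \\
\frac{V_{\text{beads}}}{V_{\text{reaction}}} &= \frac{90\ \mu\text{L}}{50\ \mu\text{L}} = 1.8
\end{aligned}
```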

(V) library construction

(1) DNA fragmentation/end repair/dA tail addition

Using commercial fragmentation and end-repair enzymes, the cDNA obtained by reverse transcription was fragmented, end-repaired, phosphorylated at the 5' end, and dA-tailed at the 3' end.

(2) Adapter ligation

Dual-index adapters compatible with Illumina sequencers were ligated to both ends of the cDNA fragments using commercial T4 ligase.

(3) Magnetic bead purification

Immediately after the ligation reaction was complete, the ligation product was purified with 60 μL of commercial DNA Clean Beads, and the DNA captured on the magnetic beads was redissolved in 20 μL of nuclease-free water.

(4) Library amplification

Library amplification was performed using a commercial high-fidelity PCR enzyme premix and universal primers for Illumina sequencing adapters, raising the concentration of library fragments to a level sufficient for next-generation sequencing.

(5) Magnetic bead purification

Immediately after the PCR reaction was complete, the amplified product was purified with 50 μL of commercial DNA Clean Beads, and the DNA captured on the magnetic beads was redissolved in 20 μL of nuclease-free water.

(VI) Sequencing and analysis of the library

The molar concentration of each purified library was measured with a qPCR library quantification kit. The libraries were then pooled at equimolar concentration and sequenced on an Illumina platform with an SE75 sequencing strategy and a data volume of 20M. After the data came off the sequencer, the number of reads assigned to the novel coronavirus was obtained by bioinformatic analysis.
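The equimolar pooling and the expected data yield can be worked through with a short Python sketch. It assumes that "20M" means 20 million reads per library and that qPCR concentrations are expressed in nM; the target amount of 10 fmol per library and the function name are hypothetical values chosen only to make the arithmetic concrete.

```python
READ_LEN = 75         # SE75: single-end reads of 75 nt
READS_PER_LIB = 20e6  # "20M", read here as 20 million reads per library (assumption)

# Expected sequenced bases per library: 75 nt x 20 million reads = 1.5e9 bases.
bases_per_library = READ_LEN * READS_PER_LIB


def pooling_volumes(conc_nM, fmol_each=10.0):
    """Volume (uL) of each library that contributes fmol_each femtomoles.

    1 nM equals 1 fmol/uL, so volume = amount / concentration.
    """
    return {name: round(fmol_each / conc, 2) for name, conc in conc_nM.items()}
```

For instance, libraries quantified at 2 nM and 5 nM would be pooled at 5 μL and 2 μL respectively, so that each contributes the same number of molecules.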

Example 1 Experimental validation of reverse transcription primer set

1. Sample source

The novel coronavirus material used to prepare the positive reference was purchased from Shanghai assist in san. It is based on a retroviral vector carrying a partial sequence of the COVID-19 ORF1a/b gene and the complete coding sequences of the E gene and the N gene, with a length of 2000 bp. It was verified as positive with an in vitro diagnostic reagent (fluorescent quantitative PCR method) and quantified at 2 × 10^8 copies/mL against a standard curve. The cells used were purchased from Nanjing Kyobai as cell pellets of 10^6 cells/tube.

2. Preparation of Experimental samples

Because the limit of detection expected of metagenomic detection for RNA viruses in clinical respiratory samples is 1000 copies/mL, positive samples were prepared at a COVID-19 concentration of 1 × 10^3 copies/mL. To mimic the host content of alveolar lavage fluid, cultured human cell lines were added to the positive samples to a cell concentration of 10^5 cells/mL. A virus-free negative sample containing 10^5 cells/mL of human cells was also prepared.
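The dilution needed to go from the quantified stock (2 × 10^8 copies/mL, as stated for the sample source) to the 1 × 10^3 copies/mL positive sample follows from C1·V1 = C2·V2. The Python sketch below computes the overall dilution factor and the stock volume a single-step dilution would need; the function names and the 1 mL final volume are hypothetical, and the source does not describe how the dilution was actually carried out.

```python
STOCK_COPIES_PER_ML = 2e8   # quantified stock concentration
TARGET_COPIES_PER_ML = 1e3  # positive sample concentration


def dilution_factor(stock, target):
    """Overall fold dilution from stock to target concentration."""
    return stock / target


def spike_volume_ul(stock, target, final_volume_ml):
    """Stock volume (uL) for final_volume_ml at the target concentration (C1*V1 = C2*V2)."""
    return target * final_volume_ml / stock * 1000.0


factor = dilution_factor(STOCK_COPIES_PER_ML, TARGET_COPIES_PER_ML)  # 200000-fold
```

A single-step dilution into 1 mL would need about 0.005 μL of stock, which cannot be pipetted reliably, so in practice such a 200,000-fold dilution would normally be reached through serial dilution steps.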

3. Experimental methods

In the reverse transcription reaction, 0.4 nmol of the specific primer set was added to the 20 μL reaction system.
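The primer amount can be converted into a working concentration. The calculation below assumes that 0.4 nmol is the total amount of the pooled primer set and that the 277 primers of SEQ ID NOs. 1-277 are present in equal molar amounts; neither assumption is stated explicitly in the text.

```latex
\begin{aligned}
c_{\text{total}} &= \frac{0.4\ \text{nmol}}{20\ \mu\text{L}} = 0.02\ \text{nmol}/\mu\text{L} = 20\ \mu\text{M} \\
c_{\text{per primer}} &\approx \frac{20\ \mu\text{M}}{277} \approx 0.072\ \mu\text{M} = 72\ \text{nM}
\end{aligned}
```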

During RNA library construction, experimental schemes with and without ribosomal RNA removal, and with different second-strand synthesis systems and cDNA library construction systems, were compared to find the scheme giving the highest detection.

The experimental design is as follows:

4. Experimental results

Therefore, the scheme in which ribosomal RNA is removed, full-length reverse transcription is performed with specific primers plus random primers, and the double-stranded cDNA is then fragmented enzymatically for library construction was the only scheme that stably detected the novel coronavirus sequence at a concentration of 10^3 copies/mL. It is thus the most suitable experimental scheme for the COVID-19 multiple-genome-specific reverse transcription primer set.

Example 2 Experimental verification that the reverse transcription primer set improves detection

First, the sample source was the same as in Example 1.

Second, preparation of experimental samples

Positive samples P1 and P2 were prepared with COVID-19 pseudovirus concentrations of 1 × 10^3 copies/mL and 1 × 10^4 copies/mL, respectively. To mimic the host content of alveolar lavage fluid, cultured human cell lines were added to P1 and P2 to a cell concentration of 10^5 cells/mL. A negative sample N1 containing no pseudovirus was also prepared.

Third, Experimental methods

The experimental scheme selected in Example 1 was used: after ribosomal RNA removal, full-length reverse transcription was performed with specific primers plus random primers, and the double-stranded cDNA was then fragmented enzymatically for library construction. The numbers of reads detected with and without the COVID-19 multiple-genome-specific reverse transcription primer set were compared.

The experimental design is as follows:

Fourth, experimental results

Adding the specific primer set can therefore increase the number of novel coronavirus reads detected more than twofold. The library construction scheme of rRNA removal + full-length reverse transcription + enzymatic fragmentation gave the most stable detection, with the virus detected in every replicate sample.

Example 3 Experimental verification of the reverse transcription primer set (verifying the performance of the primer set and the detection method)

First, the source of the sample

The COVID-19 pseudovirus and the human cells used were the same as in Example 1, and the virus was purchased from ATCC (American Type Culture Collection). Concentrations were measured by the fluorescent quantitative PCR method.

Second, preparation of experimental sample

Cells and viruses were diluted with PBS and formulated into the following references.

Third, Experimental methods

The experimental scheme selected in Example 1 was used: after ribosomal RNA removal, full-length reverse transcription was performed with specific primers plus random primers, the double-stranded cDNA was fragmented enzymatically, and the library was constructed and sequenced on the Illumina platform. The number of COVID-19 reads detected was analyzed with bioinformatics software, and a read count greater than 0 was used as the threshold for calling a sample positive or negative.

The experimental design was as follows:

1) The positive reference P and the negative reference N are each tested once to verify the accuracy of the detection method.

2) The specificity references N1-N4 are each tested once to verify the specificity of the detection method.

3) The sensitivity reference S is tested once to verify the lowest detection limit of the detection method.

4) The precision reference R is tested 10 times to verify the stability of the detection method.

5) The negative and positive quality control products are each tested once for quality control of the detection process, ensuring that the results are genuine and valid.
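The acceptance logic implied by this design can be written as a small Python sketch: a sample is called positive when more than 0 COVID-19 reads are detected, and each reference passes when its call matches what the design expects. The sample identifiers follow the naming in the list above, but the functions and any read counts passed to them are hypothetical illustrations, not results from the source.

```python
def call(read_count):
    """Positive if more than 0 COVID-19 reads are detected, otherwise negative."""
    return "positive" if read_count > 0 else "negative"


def expected_call(sample_id):
    """Expected outcome per reference type.

    The negative reference N and the specificity references N1-N4 should be
    negative; P, S and the precision replicates R should be positive.
    """
    return "negative" if sample_id.startswith("N") else "positive"


def evaluate(read_counts):
    """Map each sample ID to True when its call matches the expected outcome."""
    return {sid: call(n) == expected_call(sid) for sid, n in read_counts.items()}
```

With made-up counts, `evaluate({"P": 120, "N": 0, "S1": 3})` marks all three samples as passing, whereas a specificity reference with any detected reads would be flagged as failing.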

Fourth, experimental results

As can be seen from the table above, all stability verification results (R1-R10) are positive, so stability is qualified; the detection limit verification result (S1) is positive, so the lowest detection limit is qualified; the positive reference (P) is detected as positive and the negative reference (N) as negative, so accuracy is qualified; and the negative references are detected as negative, so specificity is qualified.

The foregoing descriptions of specific exemplary embodiments of the present invention have been presented for purposes of illustration and description. It is not intended to limit the invention to the precise form disclosed, and obviously many modifications and variations are possible in light of the above teaching. The exemplary embodiments were chosen and described in order to explain certain principles of the invention and its practical application to enable one skilled in the art to make and use various exemplary embodiments of the invention and various alternatives and modifications as are suited to the particular use contemplated. It is intended that the scope of the invention be defined by the claims and their equivalents.

Sequence listing

<110> Tianjin Jinke Medical Technology Co ltd

<120> method for detecting COVID-19 based on mNGS and application thereof

<160> 20

<170> SIPOSequenceListing 1.0

<210> 1

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

gcaggtgact cag 13

<210> 2

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 2

aattatgagg ttt 13

<210> 3

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 3

tacttattgt taa 13

<210> 4

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 4

cctgttttcc ttc 13

<210> 5

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 5

tgactcttgg tgt 13

<210> 6

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 6

tgagagtaag act 13

<210> 7

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 7

gaacttctac atg 13

<210> 8

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 8

tcacggacag cat 13

<210> 9

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 9

aacactgttt aca 13

<210> 10

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 10

atgtgctgga gca 13

<210> 11

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 11

aaacatgcat tcc 13

<210> 12

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 12

acagacagca cca 13

<210> 13

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 13

gctggccttg aag 13

<210> 14

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 14

acaattgaag aag 13

<210> 15

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 15

taagagtcat ttt 13

<210> 16

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 16

cagtacaaaa gac 13

<210> 17

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 17

gatgtaaact tac 13

<210> 18

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 18

catagaagtc ttt 13

<210> 19

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 19

gtaaataaat ttt 13

<210> 20

<211> 13

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 20

tttctccctc taa 13
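For illustration only, the numeric-identifier fields of the listing above could be read and checked as in the following minimal Python sketch: `<210>` gives the sequence number, `<211>` the declared length, and `<400>` introduces the residues, which are printed in blocks of ten followed by a running position count. The names `parse_listing` and `gc_fraction` are hypothetical and not part of the patent, and only three of the listed records are copied in (abridged) to keep the example short.

```python
import re

# Three records copied from the listing above (abridged; <213> lines omitted).
LISTING = """\
<210> 1
<211> 13
<212> DNA
<400> 1
gcaggtgact cag 13

<210> 2
<211> 13
<212> DNA
<400> 2
aattatgagg ttt 13

<210> 19
<211> 13
<212> DNA
<400> 19
gtaaataaat ttt 13
"""


def parse_listing(text):
    """Return {sequence number: (declared length, residues)}."""
    records = {}
    seq_id = declared = None
    expect_residues = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        field = re.match(r"<(\d{3})>\s*(.*)", line)
        if field:
            tag, value = field.group(1), field.group(2)
            if tag == "210":
                seq_id = int(value)
            elif tag == "211":
                declared = int(value)
            elif tag == "400":
                expect_residues = True
        elif expect_residues:
            # Keep the residue blocks, drop the trailing position count.
            residues = "".join(tok for tok in line.split() if tok.isalpha())
            records[seq_id] = (declared, residues)
            expect_residues = False
    return records


def gc_fraction(seq):
    """Fraction of G and C bases in a lower-case sequence."""
    return sum(base in "gc" for base in seq) / len(seq)


records = parse_listing(LISTING)
for seq_id, (declared, residues) in records.items():
    # Each record should contain exactly as many residues as <211> declares.
    assert len(residues) == declared
    print(seq_id, residues, round(gc_fraction(residues), 2))
```

A check of this kind only confirms that each record is internally consistent; it says nothing about primer performance.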

Claims

1. A library construction method for detecting the COVID-19 novel coronavirus based on mNGS, characterized by comprising the following steps:

3) library construction step: constructing a library from the cDNA sequences synthesized by reverse transcription;

wherein step 2), synthesizing cDNA by reverse transcription, comprises adding random reverse transcription primers to the multiplex genome-specific reverse transcription primer set and then performing multiplex amplification;

in the step of synthesizing cDNA by reverse transcription, the multiplex amplification is carried out with a full-length reverse transcriptase, the full-length reverse transcriptase being HiScript III Enzyme;

the primer sequences of the multiplex genome-specific reverse transcription primer set are shown in SEQ ID NO. 1-277.

2. The library construction method for detecting the COVID-19 novel coronavirus based on mNGS according to claim 1, wherein 1) the preparation of the multiplex genome-specific reverse transcription primer set comprises the following steps:

step 1) generating the multiple-genome alignment: downloading N1 genome sequences of the virus from a genome database, and aligning all the downloaded genome sequences; wherein 100 < N1 < 1000;

step 2) preparing a candidate primer set: dividing the aligned multiple-genome sequence into short fragments of 300-500 nt with an overlap of 150-300 nt, taking 30-60 nt at both ends of each fragment as a primer design region, and randomly designing a plurality of forward and/or reverse primers of 10-15 nt for the primer design region to form the to-be-merged primer set of that region; sorting all primers of the to-be-merged primer set of each region by frequency of occurrence, selecting the primer with the highest frequency and no ambiguous bases, deleting all primers having the same sequence as that primer, sorting and screening the remaining primers again, and repeating N2 times to obtain the candidate primer set; wherein 1 ≤ N2 ≤ 8;

step 3) primer screening: performing a secondary screening of the candidate primer set and deleting primers that meet any one or more of the following: a Tm value deviating from the mean by more than 2 standard deviations, possible formation of a self-dimer or cross-dimer, or a homopolymer run of more than 5 nt; the multiplex genome reverse transcription primer set is obtained after this screening and removal;

step 4) designing conserved-region primers: for the conserved region, designing 13 nt primers at the 3' end of the region using Primer3 software; deleting primers whose Tm value deviates from the mean of the primer set obtained in step 3) by more than 2 standard deviations and primers that may form cross-dimers with the primer set obtained in step 3); and selecting the primer ranked highest by the software to replace the primer of the same primer design region in the primer set of step 3), forming the final reverse transcription primer set.
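The fragment tiling, frequency-based selection and secondary screening recited in steps 2) and 3) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the claim does not say how Tm is calculated, so the Wallace rule (2 °C per A/T, 4 °C per G/C) is used here as a stand-in; the self-dimer and cross-dimer check is omitted; the default fragment length (400 nt) and overlap (200 nt) are simply values inside the claimed ranges; and all function names are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def tile(length, frag=400, overlap=200):
    """Return (start, end) windows of up to `frag` nt; neighbours share `overlap` nt."""
    step = frag - overlap
    windows = []
    start = 0
    while True:
        end = min(start + frag, length)
        windows.append((start, end))
        if end == length:
            return windows
        start += step


def select_by_frequency(primers, n2=3):
    """Pick the most frequent primer without ambiguous bases, delete its copies, repeat n2 times."""
    pool = [p.lower() for p in primers]
    chosen = []
    for _ in range(n2):
        ranked = [p for p, _ in Counter(pool).most_common() if set(p) <= set("acgt")]
        if not ranked:
            break
        best = ranked[0]
        chosen.append(best)
        pool = [p for p in pool if p != best]
    return chosen


def wallace_tm(primer):
    """Rough Tm estimate (Wallace rule); an assumption, not taken from the patent."""
    p = primer.lower()
    return 2 * (p.count("a") + p.count("t")) + 4 * (p.count("g") + p.count("c"))


def longest_run(primer):
    """Length of the longest homopolymer run in the primer."""
    best = run = 1
    for prev, cur in zip(primer, primer[1:]):
        run = run + 1 if prev == cur else 1
        best = max(best, run)
    return best


def screen(primers, max_run=5, n_sd=2):
    """Keep primers within n_sd standard deviations of the mean Tm and with no run longer than max_run."""
    tms = [wallace_tm(p) for p in primers]
    mu, sd = mean(tms), pstdev(tms)
    return [p for p, tm in zip(primers, tms)
            if abs(tm - mu) <= n_sd * sd and longest_run(p) <= max_run]


# Toy pool for one design region: one 13-mer per genome; 'n' marks an ambiguous base.
pool = (["gcaggtgactcag"] * 5 + ["gcaggtgattcag"] * 3
        + ["gcaggtgacncag"] * 9 + ["gcaggtaactcag"])
print(tile(1000))
print(select_by_frequency(pool, n2=2))
```

In the toy pool the most frequent 13-mer contains an ambiguous base and is therefore skipped, so the two selected primers are the next two most frequent variants. A real pipeline would use a nearest-neighbour Tm model and a dedicated dimer predictor in place of the simplified checks shown here.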

3. The library construction method according to claim 2, wherein the genome database in step 1) includes but is not limited to the NCBI GenBank database, the DDBJ database and the EMBL database; and the alignment in step 1) aligns all downloaded genomes using MAFFT (multiple alignment using fast Fourier transform).

4. A multiplex genome-specific reverse transcription primer set for the COVID-19 novel coronavirus, characterized in that the primer sequences are shown in SEQ ID NO. 1-277.

5. A method of preparing the multiplex genome-specific reverse transcription primer set for the COVID-19 novel coronavirus of claim 4, comprising the following steps:

6. Use of the multiplex genome-specific reverse transcription primer set according to claim 4, comprising:

1) use in the multiplex amplification of the COVID-19 novel coronavirus, said use being for non-disease-diagnostic purposes;

3) use in targeted enrichment of the COVID-19 novel coronavirus;