CN111334868B

CN111334868B - Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction

Info

Publication number: CN111334868B
Application number: CN202010225821.0A
Authority: CN
Inventors: 王洋; 李�杰; 王辰; 高汉林; 郭超; 王健伟; 任丽丽; 杨明; 刘静; 赵晔
Original assignee: Fuzhou Furui Medical Laboratory Co ltd; Chinese Academy of Medical Sciences CAMS
Current assignee: Fuzhou Furui Medical Laboratory Co ltd; Chinese Academy of Medical Sciences CAMS
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2023-05-23
Anticipated expiration: 2040-03-26
Also published as: CN111334868A

Abstract

The invention provides a method for constructing a novel coronavirus whole genome high-throughput sequencing library and a kit for constructing the library. The method comprises the following steps: 1) reverse transcription of viral RNA; 2) a first round of PCR using multiplex amplification primers anchored with partial Illumina linker sequences; 3) a second round of PCR using tagged Illumina library amplification primers, followed by purification of the amplification product to obtain a high-throughput sequencing library. The anchored multiplex amplification primer combination provided by the invention enables efficient targeted enrichment of the genome of the novel coronavirus COVID-19, overcomes the poor targeting, long turnaround time and susceptibility to host background contamination of existing methods, and allows whole genome sequencing of the COVID-19 virus to be completed in a short time with a small amount of sequencing data and at low cost, thereby enabling differential diagnosis of the COVID-19 virus and identification of viral mutations.

Description

Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction

Technical Field

The invention relates to the technical field of biology, in particular to a method for constructing a novel coronavirus whole genome high-throughput sequencing library and a kit for constructing the library.

Background

The novel coronavirus COVID-19 (severe acute respiratory syndrome coronavirus 2, SARS-CoV-2) belongs to the genus Betacoronavirus. Like other known coronavirus genomes, the COVID-19 genome comprises 6 major open reading frames (ORFs), namely ORF1ab, ORF3a, ORF6, ORF7a, ORF8 and ORF10, as well as other accessory genes, namely the S, E, M and N genes. Whole genome deep sequencing of the virus, used to obtain all nucleic acid mutations of the viral genome and to study virus typing and evolutionary relationships, currently offers the highest sensitivity and specificity; however, owing to background interference from host nucleic acid and other factors, current mainstream viral whole genome high-throughput sequencing schemes require large amounts of sequencing data, are costly and have long turnaround times.

At present, most COVID-19 detection kits approved by the National Medical Products Administration (NMPA) are based on TaqMan-probe fluorescent real-time quantitative PCR (qRT-PCR), colloidal gold IgG/IgM antibody detection, or IgM antibody detection by magnetic particle chemiluminescence. The qRT-PCR method is highly specific and can complete relative quantification of virus-positive samples within 2 hours. However, RNA viruses often mutate much faster than other types of viruses; once the probe binding site or a specific primer binding site on the viral genome mutates, and with the added influence of factors such as the quality of the extracted viral RNA, the experimental procedure and operator handling, sensitivity drops, the Ct value becomes unstable or exceeds an unreliable value such as 40, and false negatives increase, raising the rates of misdiagnosis and missed diagnosis. Colloidal gold IgG/IgM antibody detection, based on the immunological antigen-antibody reaction, is extremely fast but has a high false positive rate and still needs support from subsequent clinical diagnosis. CT examination, used as a gold standard for clinical diagnosis, depends on large instruments and is difficult to use for general screening. Droplet digital PCR (ddPCR), based on water-in-oil droplet technology, is highly specific and sensitive, but has low throughput and high cost, and its detection rate can also be affected once a mutation occurs at a primer binding site.

In research on microorganisms and viruses, especially RNA viruses, RNA-seq is the main technique for sequencing the whole viral genome: total RNA isolated from the host is depleted of host ribosomal RNA (hrRNA, human ribosomal RNA), and a next-generation sequencing library is then constructed and sequenced. The drawback of this approach is that host genome and transcriptome material carried over during virus isolation leaves reads from the viral genome as only a very small fraction (0.01-0.1%) of the sequencing output, so the requirement for input RNA is high. In addition, sequence alignment and assembly in the subsequent data analysis usually suffer from a certain proportion of gaps, insufficient coverage and sequencing depth in some regions of the viral genome, and an insufficient overall coverage ratio; the amount of sequencing data required is large (usually more than 10 G of data, depending on the abundance of the virus in the host and the size of the virus), the experimental cost is high and the turnaround time is long. A recent study published in Nature performed RNA-seq on the novel coronavirus isolated from a human patient and analyzed it by a metagenomic method: of the 10038758 sequencing reads obtained, after reads from the human host were filtered out, only 1582 reads remained for subsequent COVID-19 analysis.

Whole genome sequencing of viruses with a targeted liquid-phase hybridization capture system has higher specificity and a low data volume requirement; its drawbacks are a high requirement for input target cDNA, a risk that the viral genome is not captured, high probe design cost, long turnaround time (hybridization capture takes more than 12 hours in total) and poor suitability for clinical translation. A literature search shows that studies on high-throughput sequencing of RNA virus whole genomes using targeted multiplex PCR sequencing (TMS, Targeted Multiplexing-PCR Sequencing) have rarely been reported.

To fill this technical gap, we propose a technology and kit application for rapid differential diagnosis of whole genome mutations of the novel coronavirus (COVID-19) based on targeted multiplex polymerase chain reaction amplicon sequencing (Targeted Multiplexed Amplicon-seq). The method is not affected by the host genome, targets COVID-19 strongly, gives high and uniform coverage, requires little starting sample, greatly reduces experimental and sequencing costs compared with existing high-throughput virus sequencing methods, greatly shortens turnaround time, and enables highly sensitive, accurate and comprehensive differential diagnosis of the COVID-19 virus in biological samples such as throat swabs and alveolar lavage fluid, and in virus culture samples.

Disclosure of Invention

The invention aims to provide a method for constructing a novel coronavirus whole genome high-throughput sequencing library based on a targeted multiplex polymerase chain reaction amplicon sequencing technology (Targeted Multiplexed Amplicon-seq) and a kit for constructing the library.

It is another object of the present invention to provide the use of the above method in the detection of novel coronavirus variants.

To achieve the object of the present invention, in a first aspect, the present invention provides a method for constructing a novel coronavirus whole genome high throughput sequencing library, comprising the steps of:

A. Extracting RNA of a virus sample, and carrying out reverse transcription to obtain single-stranded cDNA or double-stranded cDNA;

B. according to the published novel coronavirus COVID-19 genome sequence, performing tiled (shingled) full-coverage primer design to design multiplex amplification primer set 1 and multiplex amplification primer set 2, each anchored with partial Illumina linker sequences (anchored multiplex amplification primer set 1 and anchored multiplex amplification primer set 2); using the single-stranded cDNA or double-stranded cDNA as a template, performing a first round of PCR separately with primer set 1 and primer set 2, and mixing the amplification products in equimolar amounts to cover the whole genome of the virus;

C. performing a second round of PCR using the mixed amplification products of step B as a template and tagged Illumina library amplification primers, and purifying the amplification products to obtain a high-throughput sequencing library;

wherein the method for designing multiplex amplification primer set 1 and multiplex amplification primer set 2 anchored with partial Illumina linker sequences in step B comprises the following steps:

B1, designing non-anchored multiplex amplification primer sets according to the novel coronavirus COVID-19 genome sequence, namely multiplex specific amplification primer set I and multiplex specific amplification primer set II, wherein primer set I comprises a forward primer pool F and a reverse primer pool R, primer set II comprises a forward primer pool F' and a reverse primer pool R', and each pair of forward and reverse primers corresponds to one amplicon; the forward primer and the reverse primer of each pair in primer set II are designed within two adjacent amplicon sequences of primer set I, respectively, and the forward primer and the reverse primer of each pair in primer set I are designed within two adjacent amplicon sequences of primer set II, respectively, and so on, until the amplicons of primer set I and the amplicons of primer set II together cover the whole genome of the virus in a tiled (shingled) manner;

B2, adding Illumina partial linker sequence (1) to the 5' end of each forward primer in the 5'-3' direction, and adding Illumina partial linker sequence (2) to the 5' end of each reverse primer in the 5'-3' direction; the forward primer pool F carrying Illumina partial linker sequence (1) and the reverse primer pool R carrying Illumina partial linker sequence (2) serve as multiplex amplification primer set 1 anchored with partial Illumina linker sequences; the forward primer pool F' carrying Illumina partial linker sequence (1) and the reverse primer pool R' carrying Illumina partial linker sequence (2) serve as multiplex amplification primer set 2 anchored with partial Illumina linker sequences;

wherein Illumina partial linker sequence (1) is as follows: 5'-(3' terminal sequence of the I7 tagged primer)-AGATGTGTATAAGAGACAG-3';

Illumina partial linker sequence (2) is as follows: 5'-(3' terminal sequence of the I5 tagged primer)-AGATGTGTATAAGAGACAG-3';

and the 3' terminal sequence of the I7 tagged primer is 9-15 bp long and the 3' terminal sequence of the I5 tagged primer is 8-14 bp long, so that the I7 tagged primer and the I5 tagged primer can anneal specifically to their 3' terminal binding positions on the amplicon.

In the method, the Tm difference between primers in step B is kept within ±2 °C; and/or

The amplicon size is 200-300 bp; and/or

Primer pairs that may form primer dimers between primers, or stem-loop structures within a primer, are removed during primer design; and/or

In the same multiplex specific amplification primer set, the 5' end of the reverse primer of an upstream amplicon on the genome is located upstream of the 5' end of the forward primer of the downstream amplicon, to prevent short-fragment by-products from forming and competing in the PCR.
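The design constraints above can be checked mechanically. The following is a minimal sketch, not the patent's design software: it assumes the Wallace rule as a rough Tm estimate and reads the ±2 °C constraint as a limit on the Tm difference within one primer pair. The `PrimerPair` class, its field names and the coordinate convention are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PrimerPair:
    """Hypothetical record for one non-anchored primer pair."""
    name: str
    fwd_seq: str  # genome-specific part of the forward primer, 5'->3'
    rev_seq: str  # genome-specific part of the reverse primer, 5'->3'
    fwd_5p: int   # genome coordinate of the forward primer 5' end (amplicon left edge)
    rev_5p: int   # genome coordinate of the reverse primer 5' end (amplicon right edge)


def wallace_tm(seq: str) -> float:
    """Rough Tm estimate (Wallace rule); an assumption, not the patent's method."""
    s = seq.upper()
    return 2.0 * (s.count("A") + s.count("T")) + 4.0 * (s.count("G") + s.count("C"))


def amplicon_size(pair: PrimerPair) -> int:
    return pair.rev_5p - pair.fwd_5p + 1


def pair_ok(pair: PrimerPair, max_tm_diff: float = 2.0,
            min_size: int = 200, max_size: int = 300) -> bool:
    """Tm difference within the pair <= 2 degrees C and amplicon size 200-300 bp."""
    tm_ok = abs(wallace_tm(pair.fwd_seq) - wallace_tm(pair.rev_seq)) <= max_tm_diff
    size_ok = min_size <= amplicon_size(pair) <= max_size
    return tm_ok and size_ok


def pool_has_no_overlap(pool: List[PrimerPair]) -> bool:
    """Within one pool, each reverse-primer 5' end must lie upstream of the
    5' end of the next amplicon's forward primer."""
    ordered = sorted(pool, key=lambda p: p.fwd_5p)
    return all(up.rev_5p < down.fwd_5p for up, down in zip(ordered, ordered[1:]))
```

Dimer and stem-loop screening is left out of this sketch, since it would need a thermodynamic model that the text does not specify.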

The method of reverse transcription of RNA into single-stranded cDNA in step A is selected from the following a or b:

a. priming single-stranded cDNA synthesis with random primers of 6-10 bases;

b. mixing several primers from reverse primer pool R and reverse primer pool R' into a specific reverse transcription primer set to prime single-stranded cDNA synthesis, wherein the reverse primers are evenly distributed along the 3'-5' direction of the viral genome at intervals of 800-1000 bp.
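Option b amounts to choosing a subset of the reverse primers whose binding positions are spaced 800-1000 bp apart. A minimal greedy sketch is given below, assuming the binding positions are already known as genome coordinates; the function name and the greedy strategy are illustrative assumptions, not the patent's selection procedure.

```python
from typing import Iterable, List


def pick_rt_primers(positions: Iterable[int],
                    min_gap: int = 800, max_gap: int = 1000) -> List[int]:
    """Walk from the 3' end of the genome toward the 5' end and keep the first
    reverse-primer position that lies 800-1000 bp from the last one kept."""
    remaining = sorted(positions, reverse=True)
    if not remaining:
        return []
    chosen = [remaining[0]]
    for pos in remaining[1:]:
        gap = chosen[-1] - pos
        if min_gap <= gap <= max_gap:
            chosen.append(pos)
    return chosen
```

If no candidate falls inside the 800-1000 bp window, this sketch simply stops extending the set; a real design would need a fallback rule for such stretches.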

The method for reverse transcription of RNA into double-stranded cDNA in step A comprises:

i. priming single-stranded cDNA synthesis with random primers of 6-10 bases;

ii. nicking the RNA-cDNA hybrid duplex with RNase H in the presence of dNTPs;

iii. synthesizing double-stranded cDNA with RNA-dependent DNA polymerase, using the small RNA fragments generated at the nicks as primers.

The tagged Illumina library amplification primers in step C are as follows (SEQ ID NOS: 503-504):

I7 tagged primer: 5'-CAAGCAGAAGACGGCATACGAGAT(i7)GTCTCGTGGGCTCGG-3';

I5 tagged primer: 5'-AATGATACGGCGACCACCGAGATCTACAC(i5)TCGTCGGCAGCGTC-3'.

Preferably, the sequence of Illumina partial linker sequence (1) in step B is as follows: 5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3' (SEQ ID NO: 1);

the sequence of Illumina partial linker sequence (2) is as follows: 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3' (SEQ ID NO: 2).
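The relationship between the partial linker sequences and the tagged primers can be illustrated with plain string operations. This is a minimal sketch: the sequences are copied from the text above, while the function names and the placeholder index are hypothetical.

```python
# Sequences copied from the text; names are illustrative.
TN5_SITE = "AGATGTGTATAAGAGACAG"            # Tn5 transposase binding site
I7_PRIMER_3P_END = "GTCTCGTGGGCTCGG"        # 3' terminal sequence of the I7 tagged primer
I5_PRIMER_3P_END = "TCGTCGGCAGCGTC"         # 3' terminal sequence of the I5 tagged primer

LINKER_1 = I7_PRIMER_3P_END + TN5_SITE      # SEQ ID NO: 1, added to forward primers
LINKER_2 = I5_PRIMER_3P_END + TN5_SITE      # SEQ ID NO: 2, added to reverse primers

I7_PRIMER_5P_PART = "CAAGCAGAAGACGGCATACGAGAT"
I5_PRIMER_5P_PART = "AATGATACGGCGACCACCGAGATCTACAC"


def anchor_forward(specific: str) -> str:
    """Prepend partial linker (1) to a genome-specific forward primer (5'->3')."""
    return LINKER_1 + specific.upper()


def anchor_reverse(specific: str) -> str:
    """Prepend partial linker (2) to a genome-specific reverse primer (5'->3')."""
    return LINKER_2 + specific.upper()


def tagged_primer(five_prime_part: str, index: str, three_prime_end: str) -> str:
    """Assemble a tagged library primer around an index (placeholder here)."""
    return five_prime_part + index.upper() + three_prime_end


if __name__ == "__main__":
    # "NNNNNNNN" is a placeholder index, not a real barcode.
    print(anchor_forward("acgtacgtacgtacgtacgt"))
    print(tagged_primer(I7_PRIMER_5P_PART, "NNNNNNNN", I7_PRIMER_3P_END))
```

Because each anchored first-round primer starts with the 3' terminal sequence of the corresponding tagged primer, the second-round primers can anneal to the ends of the first-round amplicons.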

In the method, multiplex amplification primer set 1 and multiplex amplification primer set 2 anchored with partial Illumina linker sequences in step B comprise 250 primer pairs, wherein the forward primers are COV-1-F to COV-250-F, with nucleotide sequences shown in SEQ ID NO: 3-252, respectively, and the reverse primers are COV-1-R to COV-250-R, with nucleotide sequences shown in SEQ ID NO: 253-502, respectively; COV-1-F and COV-1-R form a primer pair, COV-2-F and COV-2-R form a primer pair, and so on.

Preferably, the primer information of the multiplex amplification primer set 1 of the anchor part Illumina adaptor sequence and the multiplex amplification primer set 2 of the anchor part Illumina adaptor sequence in step B are shown in table 1 and table 2, respectively.

TABLE 1 primer information for anchoring multiplex primer set 1

TABLE 2 primer information for Anchor multiplex primer set 2

Wherein the primer number COV-1 corresponds to the primers COV-1-F and COV-1-R, the primer number COV-2 corresponds to the primers COV-2-F and COV-2-R, and so on.

In the present invention, the virus sample may be derived from a biological sample such as a throat swab or alveolar lavage fluid, or from culture supernatant isolated after viral infection of cells.

In a second aspect, the present invention provides a kit for constructing a novel coronavirus whole genome high-throughput sequencing library, the kit comprising the multiplex amplification primer set 1 and multiplex amplification primer set 2 anchored with partial Illumina linker sequences and the tagged Illumina library amplification primers used in the library construction method described above, and optionally comprising other reagents for library construction (e.g. amplification enzyme reagents and corresponding buffers).

In a third aspect, the present invention provides the use of the above library construction method in the detection of novel coronavirus variants, said use comprising:

(1) Constructing and obtaining a novel coronavirus whole genome high-throughput sequencing library to be tested according to the method;

(2) Sequencing the high-throughput sequencing library on a sequencer after it passes quality control;

(3) Bioinformatics analysis and detection of mutation sites.

Preferably, step (3) comprises the sub-steps of:

1) Building the index of the novel coronavirus COVID-19 reference genome MT019531.1 with BWA, and generating the .fai file with samtools faidx;

2) Read quality control: paired-end reads are filtered and quality-checked with SOAPnuke to obtain clean reads (reads remaining after filtering); reads meeting any of the following conditions are removed: condition 1: reads contaminated with linker sequence; condition 2: reads containing more than 10% N bases; condition 3: reads in which low-quality (Q < 38) bases exceed 50% of all bases in the read;

3) Data alignment and sorting: clean reads are aligned to the reference genome MT019531.1 with BWA combined with samtools to generate a BAM file, with alignment parameters "-t 32 -M"; the BAM file is sorted with SortSam.jar from Picard; the sorted BAM file is indexed with samtools index; quality control of the resulting BAM file is performed with Qualimap;

4) Mutation detection: SNPs and InDels of the virus are detected with samtools pileup and VarScan; the SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";

5) Finally, the detected SNPs and InDels are annotated with ANNOVAR based on the GFF file of the MT019531.1 reference genome.
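The alignment and variant-calling parameters in sub-steps 3) and 4) can be assembled into command lines as sketched below. This is a minimal illustration under stated assumptions: the option spellings follow VarScan's documented double-dash names (`--min-reads2`, `--min-avg-qual`), the `mpileup2snp` and `mpileup2indel` subcommands and `bwa mem` are assumed since the text names only the tools, and the jar path and file names are hypothetical. The sketch only builds argument lists and does not run anything.

```python
from typing import List, Optional, Tuple

# Shared SNP/InDel options from sub-step 4); a value of None marks a bare flag.
VARSCAN_OPTIONS: List[Tuple[str, Optional[str]]] = [
    ("--min-coverage", "8"),
    ("--min-reads2", "4"),
    ("--min-var-freq", "0.1"),
    ("--min-avg-qual", "0"),
    ("--p-value", "1.0"),
    ("--strand-filter", "0"),
    ("--variants", None),
    ("--output-vcf", "1"),
]


def varscan_command(subcommand: str, pileup_path: str,
                    jar: str = "VarScan.jar") -> List[str]:
    """Build a VarScan call; subcommand is assumed to be mpileup2snp or mpileup2indel."""
    cmd = ["java", "-jar", jar, subcommand, pileup_path]
    for flag, value in VARSCAN_OPTIONS:
        cmd.append(flag)
        if value is not None:
            cmd.append(value)
    return cmd


def bwa_mem_command(reference: str, read1: str, read2: str,
                    threads: int = 32) -> List[str]:
    """Build the alignment call with the '-t 32 -M' parameters (bwa mem assumed)."""
    return ["bwa", "mem", "-t", str(threads), "-M", reference, read1, read2]


if __name__ == "__main__":
    # File names below are placeholders.
    print(" ".join(bwa_mem_command("MT019531.1.fa", "clean_1.fq.gz", "clean_2.fq.gz")))
    print(" ".join(varscan_command("mpileup2snp", "sample.mpileup")))
    print(" ".join(varscan_command("mpileup2indel", "sample.mpileup")))
```

Keeping the options in one list makes it explicit that the SNP and InDel calls share identical thresholds.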

By means of the technical scheme, the invention has at least the following advantages and beneficial effects:

The anchored multiplex amplification primer combination provided by the invention enables efficient targeted enrichment of the genome of the novel coronavirus COVID-19, overcomes the poor targeting, long turnaround time and susceptibility to host background contamination of existing methods, and allows whole genome sequencing of the COVID-19 virus to be completed in a short time with a small amount of sequencing data and at low cost.

The multiplex polymerase chain reaction amplification primers provided by the invention specifically target the novel coronavirus COVID-19 genome sequence, are not affected by host human RNA, require little starting sample, and enable specific detection, differential diagnosis and mutation identification of the whole genome of COVID-19 RNA virus extracted from samples such as throat swabs and alveolar lavage fluid. The primer sequences for multiplex PCR amplification went through multiple rounds of testing and optimization, and the coverage uniformity of the resulting sequencing data exceeds 90%. The method also overcomes the drawbacks of RNA-seq-based RNA virus genome sequencing, in which large amounts of residual host RNA leave too little sequencing data from the viral genome itself, the experimental cycle is too long and large amounts of input nucleic acid are needed; it can further be used for secondary testing and diagnosis of patients with suspected symptoms who tested false negative by the conventional rapid qRT-PCR method.

In actual use, the invention optimizes a PCR reaction system and a program for targeted enrichment and further amplification in the library construction process, and effectively improves the problems of low amplification efficiency and poor uniformity of the conventional general multiplex PCR reaction.

Drawings

FIG. 1 shows the principle of primer design according to the present invention.

FIG. 2 shows Agilent 2200 electropherograms for sequencing library quality control in Example 1 of the present invention, where a is the quality control result of library 46d1-1 and b is that of library 50d1-1; the abscissa, Size (bp), represents library fragment size, and the ordinate, Sample Intensity, represents signal intensity.

FIG. 3 shows Agilent 2200 electropherograms for sequencing library quality control in Example 2 of the present invention, where a is the quality control result of library 46d1-2 and b is that of library 50d1-2; the abscissa, Size (bp), represents library fragment size, and the ordinate, Sample Intensity, represents signal intensity.

FIG. 4 shows Agilent 2200 electropherograms for sequencing library quality control in Example 3 of the present invention, where a is the quality control result of library 48d5-1 and b is that of library 47d1-1; the abscissa, Size (bp), represents library fragment size, and the ordinate, Sample Intensity, represents signal intensity.

FIG. 5 shows Agilent 2200 electropherograms for sequencing library quality control in Example 4 of the present invention, where a is the quality control result of library 48d5-2 and b is that of library 47d1-2; the abscissa, Size (bp), represents library fragment size, and the ordinate, Sample Intensity, represents signal intensity.

FIG. 6 shows Agilent 2200 electropherograms for sequencing library quality control in Example 5 of the present invention, where a is the quality control result of library XH1P2_R, b is that of library WHP6_R and c is that of library XH1P6_R; the abscissa, Size (bp), represents library fragment size, and the ordinate, Sample Intensity, represents signal intensity.

Detailed Description

The invention provides a technology and kit application for rapid differential diagnosis of whole genome mutations of the novel coronavirus (COVID-19) based on targeted multiplex polymerase chain reaction amplicon sequencing, together with a primer combination and kit designed according to the method. The COVID-19 single-stranded RNA library construction method using this primer combination enables accurate and comprehensive differential diagnosis of the novel coronavirus COVID-19 in biological samples such as pharyngeal swabs and alveolar lavage fluid, and in virus culture samples.

The invention also provides an approach and reference model for rapid mutation identification of all types of RNA viruses with known genome sequences; furthermore, the anchored multiplex PCR primer sequences can be replaced with ones targeting different RNA virus genome sequences according to actual requirements, so as to develop kits suited to different applications.

The technical scheme of the invention is as follows:

the invention provides a library construction method for human novel coronavirus (COVID-19) whole genome high throughput sequencing, which comprises the following steps:

1. Reverse transcription of viral single-stranded RNA.

The method for synthesizing single-stranded cDNA by reverse transcription of viral single-stranded RNA comprises: step a, first-strand cDNA (1st cDNA) synthesis primed by random primers of 6 to 10 bases (Random 6mer-10mer Primer); step b, the 1st cDNA enters the subsequent PCR amplification reaction either unpurified or after purification. Alternatively: step a, several primers from the non-anchored reverse primer pool R of claim 3a are mixed into a specific reverse transcription primer set to prime first-strand cDNA synthesis, wherein the binding positions of the selected specific reverse transcription primers are evenly distributed along the 3'-5' direction of the viral genome at intervals of 800-1000 bp; step b, the first-strand cDNA is purified and then used in the subsequent PCR amplification reaction.

The method for synthesizing double-stranded cDNA by reverse transcription of viral single-stranded RNA comprises: step a, first-strand cDNA synthesis primed by random primers of 6 to 10 bases; step b, RNase H-mediated nicking of the RNA-1st cDNA hybrid duplex (RNA-1st cDNA hybrid) in the presence of deoxyribonucleoside triphosphates (dNTPs); step c, synthesis of second-strand cDNA (2nd cDNA) by RNA-dependent DNA polymerase, using the small RNA fragments generated at the nicks as primers; step d, recovery and purification of the double-stranded cDNA. These steps can be carried out with commercial reverse transcription and second-strand synthesis kits.

2. Designing 250 pairs of multiplex specific primers for the COVID-19 virus genome, anchored with partial Illumina sequencing linker sequences:

a. Two multiplex amplification primer sets, 1 and 2 (each comprising a forward primer pool F and a reverse primer pool R), were designed based on the full length of the COVID-19 genome sequence MT019531.1 published on the National Center for Biotechnology Information (NCBI) website (Accession No: MT019531GWHABKH 00000000). The forward and reverse primers of amplification primer set 2 are designed within two adjacent amplicon sequences of amplification primer set 1, respectively, and the forward and reverse primers of amplification primer set 1 are designed within two adjacent amplicon sequences of amplification primer set 2, respectively; this is repeated until the amplification products of primer sets 1 and 2 cover the whole viral genome in a tiled (shingled) manner (FIG. 1). During primer design, the Tm difference between different primers is set to within ±2 °C and the amplicon size to 200-300 bp, and primer pairs that may form primer dimers between primers and/or stem-loop structures within a primer are removed. Within the same amplification primer set, the 5' end of the reverse primer of an upstream amplicon on the genome is placed upstream of the 5' end of the forward primer of the downstream amplicon wherever possible, to prevent short-fragment by-products from forming and competing in the PCR, and to keep amplification efficiencies within the system as consistent as possible.

b. Design of the anchored multiplex amplification primer pool F carrying the Illumina partial linker sequence: the partial Illumina Nextera linker sequence (5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3') is added to the 5' end of every primer in the forward primer pool F in the 5'-3' direction, where 5'-AGATGTGTATAAGAGACAG-3' is the Tn5 transposase binding site sequence in the Illumina Nextera linker and 5'-GTCTCGTGGGCTCGG-3' is identical to the 3' terminal sequence of the tagged primer (I7-Indexed Primer) used for Illumina complete library amplification. The 5'-GTCTCGTGGGCTCGG-3' portion may be shortened or lengthened as appropriate, provided that the 3' terminal sequence (downstream of the i7 index) of the I7-Indexed Primer can still anneal to it normally.

c. Design of the anchored multiplex amplification primer pool R carrying the Illumina partial linker sequence: the partial Illumina Nextera linker sequence (5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3') is added to the 5' end of every primer in the reverse primer pool R in the 5'-3' direction, where 5'-AGATGTGTATAAGAGACAG-3' is the Tn5 transposase binding site sequence in the Illumina Nextera linker and 5'-TCGTCGGCAGCGTC-3' is identical to the 3' terminal sequence of the tagged primer (I5-Indexed Primer) used for Illumina complete library amplification. The 5'-TCGTCGGCAGCGTC-3' portion may be shortened or lengthened as appropriate, provided that the 3' terminal sequence (downstream of the i5 index) of the I5-Indexed Primer can still anneal to it normally.

d. The synthesized anchored multiplex amplification primer pools F and R are mixed to form anchored multiplex amplification primer set 1 (Anchored Primer Pool 1) and anchored multiplex amplification primer set 2 (Anchored Primer Pool 2); the primer mixes are shown in Tables 1 and 2. In practice, the amplification products of anchored multiplex amplification primer set 1 and anchored multiplex amplification primer set 2 must be mixed in equimolar amounts to cover the whole viral genome.
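The two-pool tiled layout described in item a can be sketched as follows. This is a simplified illustration, not the patent's primer design procedure: it places amplicons of a fixed length at a fixed step and assigns them alternately to pool 1 and pool 2. The default amplicon length (230 bp) and step (120 bp) are hypothetical values chosen within the 200-300 bp range, and the function name is an assumption.

```python
from typing import List, Tuple

Amplicon = Tuple[int, int]  # 0-based, half-open interval [start, end)


def tile_amplicons(genome_len: int, amplicon_len: int = 230,
                   step: int = 120) -> Tuple[List[Amplicon], List[Amplicon]]:
    """Alternate amplicons between two pools so that neighbours in the genome
    overlap, while amplicons within the same pool never overlap."""
    if not (step < amplicon_len <= 2 * step):
        raise ValueError("need step < amplicon_len <= 2 * step")
    pools: Tuple[List[Amplicon], List[Amplicon]] = ([], [])
    start = 0
    index = 0
    while start < genome_len:
        end = min(start + amplicon_len, genome_len)  # last amplicon is clipped
        pools[index % 2].append((start, end))
        if end == genome_len:
            break
        start += step
        index += 1
    return pools


if __name__ == "__main__":
    pool1, pool2 = tile_amplicons(29899)  # 29899 bp is the MT019531.1 length given in the text
    print(len(pool1), len(pool2), pool1[:2], pool2[:1])
```

With these settings each pool-2 amplicon starts inside one pool-1 amplicon and ends inside the next, which is the shingled arrangement that lets the two separately amplified pools jointly cover the genome. The clipped final amplicon may fall below 200 bp; a real design would adjust the last primers instead.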

The library construction method provided by the invention comprises the following steps: step 1) reverse transcribing the viral single-stranded RNA into first-strand cDNA, or synthesizing double-stranded cDNA from the viral single-stranded RNA, wherein the double-stranded cDNA synthesis reagent is preferably the EpiNext Hi-Fi cDNA kit (Epigentek) and the first-strand cDNA synthesis reagent is preferably the TAKARA PrimeScript 1st strand cDNA Synthesis kit (TAKARA, cat No. 6110A); the reverse transcription primer is either a 6-10 base random primer or a specific reverse transcription primer set formed by mixing several primers from the non-anchored reverse primer pool R; step 2) performing PCR reactions using anchored multiplex amplification primer set 1 and anchored multiplex amplification primer set 2, respectively, to enrich the novel coronavirus cDNA in a targeted manner; step 3) purifying the PCR products of step 2) and mixing them in equimolar amounts; step 4) performing PCR library amplification on the target-enriched cDNA to obtain a DNA library that can be sequenced on an Illumina sequencing platform; step 5) purifying the library.

The PCR reaction system of step 2) comprises: 5-10 μL of cDNA template, 5 μL of anchored multiplex amplification primer set 1 or anchored multiplex amplification primer set 2, 15 μL of DNA polymerase and 2× buffer system (half of the total reaction volume), and 0-5 μL of double distilled water, calculated from the total reaction volume; the DNA polymerase and 2× buffer system is preferably KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure of step 2) comprises: step a, pre-denaturation: 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, with the cycle number set to 10-25 according to the viral copy number; step c, final extension: 72 °C for 60 s; hold at 4 °C.

The purification of the first-round anchored PCR amplification products in step 3) comprises: step a, adding DNA purification beads to the amplification product at an equal volume (30 μL), preferably Agencourt AMPure XP Beads (Beckman Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 μL of EB buffer (Qiagen Cat No. 19086). The equimolar mixing of the two groups of PCR products in step 3) comprises: step a, separately measuring the mass concentration of the two purified anchored multiplex amplification products; step b, separately analyzing the two purified anchored multiplex amplification products; step c, separately calculating the molar concentration of the two anchored multiplex amplification products; step d, mixing them in equimolar amounts.

The PCR library amplification of the target-enriched cDNA in step 4) comprises: 20 μL of the mixed anchored multiplex amplification PCR product, 5 μL of Illumina library amplification (Indexed PCR) primer pair, and 25 μL of DNA polymerase and 2× buffer system; the DNA polymerase and 2× buffer system is preferably KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure of step 4) comprises: step a, pre-denaturation: 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 10 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.

The library purification procedure of step 5) comprises: step a, adding DNA purification beads to the amplification product at 0.6-1 volume (30-50 μL), preferably Agencourt AMPure XP Beads (Beckman Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol.
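As an illustration only, the equimolar mixing of the two first-round products can be sketched as a short calculation. The sketch assumes the common approximation of 660 g/mol per base pair of double-stranded DNA and assumes that an average fragment length is available for each product; the concentrations, lengths, and function names are hypothetical and are not taken from the invention.

```python
def molar_conc_nM(mass_conc_ng_per_ul: float, mean_len_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to a molar concentration (nM).

    Assumes an average mass of 660 g/mol per base pair.
    """
    return mass_conc_ng_per_ul * 1e6 / (660.0 * mean_len_bp)


def equimolar_volumes(conc_a_nM: float, conc_b_nM: float, total_ul: float):
    """Volumes (uL) of products A and B that hold equal moles and sum to total_ul."""
    # Equal moles: conc_a * vol_a == conc_b * vol_b, with vol_a + vol_b == total_ul
    vol_a = total_ul * conc_b_nM / (conc_a_nM + conc_b_nM)
    vol_b = total_ul - vol_a
    return vol_a, vol_b


# Hypothetical measurements for the two purified first-round products
conc_1 = molar_conc_nM(6.6, 400)    # product of primer set 1
conc_2 = molar_conc_nM(13.2, 400)   # product of primer set 2
vol_1, vol_2 = equimolar_volumes(conc_1, conc_2, 20.0)
print(f"set 1: {conc_1:.1f} nM, {vol_1:.2f} uL; set 2: {conc_2:.1f} nM, {vol_2:.2f} uL")
```

The 20 μL total corresponds to the volume of mixed product used as input to the library amplification in step 4).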

In some specific embodiments, the viral RNA sample is extracted from virus isolated from alveolar lavage fluid.

In some specific embodiments, the viral RNA sample is extracted from virus in a pharyngeal swab.

In some specific embodiments, the viral RNA sample is a high copy number viral extract isolated from the supernatant of a cell culture infected with the virus in vitro.

In some embodiments, the reverse transcription primer employs a 6 base random primer.

In some embodiments, the reverse transcription primer employs a specific reverse transcription primer set that is a mixture of several primers in the non-anchor reverse primer pool R.

The method for analyzing the off-machine data provided by the invention comprises the following steps: step 1: an index dataset of the reference genome (MT019531.1) is constructed using BWA software, and a fai file is generated using samtools faidx. Basic information of the MT019531.1 genome: total length 29899 bp, GC content 37.98%; step 2: quality control analysis of reads. The paired-end reads are filtered and quality-controlled using SOAPnuke to obtain clean reads (the reads remaining after filtering). Reads meeting any of the following conditions are removed: 1) reads contaminated with adapter sequence; 2) reads in which N bases exceed 10%; 3) reads in which low-quality (Q < 38) bases exceed 50% of the bases of the read; step 3: data alignment and sorting. BWA combined with samtools is used to align the clean reads to the reference genome (MT019531.1) of COVID-19 and generate BAM files, with the alignment parameters "-t 32 -M". Sorting is done using SortSam. The sorted BAM files are indexed using the samtools index tool. The generated BAM files are quality-controlled using tools such as Qualimap; step 4: variant detection: SNP and InDel variants of the virus are detected using samtools pileup and VarScan. The SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; step 5: finally, the detected SNPs are annotated using annovar software based on the GFF file of the MT019531.1 reference genome.
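The three read-removal conditions of step 2 can be illustrated with a minimal sketch. This is not the SOAPnuke implementation: it assumes that adapter contamination is detected by exact substring matching and that condition 3) is evaluated per read; the adapter string and the function name are hypothetical examples.

```python
from typing import Sequence


def keep_read(seq: str, quals: Sequence[int], adapter: str,
              max_n_frac: float = 0.10, min_q: int = 38,
              max_lowq_frac: float = 0.50) -> bool:
    """Return True if a read passes the three filters described in the text."""
    # 1) reads contaminated with adapter sequence (simplified: exact match)
    if adapter and adapter in seq:
        return False
    # 2) reads in which N bases exceed 10%
    if seq.upper().count("N") > max_n_frac * len(seq):
        return False
    # 3) reads in which low-quality (Q < min_q) bases exceed 50%
    low_q = sum(1 for q in quals if q < min_q)
    if low_q > max_lowq_frac * len(quals):
        return False
    return True


ADAPTER = "AGATCGGAAGAGC"  # example adapter prefix, an assumption for this sketch
print(keep_read("ACGT" * 5, [40] * 20, ADAPTER))
```

For paired-end data, a pair would typically be discarded when either mate fails; that pairing rule is likewise an assumption of this sketch.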

The invention also provides a kit for constructing a human novel coronavirus (COVID-19) whole genome high-throughput sequencing library, the kit comprising the following components: a specific reverse transcription primer set formed by mixing several primers from the non-anchored reverse primer pool, anchored multiplex amplification primer set 1, anchored multiplex amplification primer set 2, sequencing library amplification primers, and the various reagents used in library construction. Further, the kit also contains instructions describing the method and the safety information required for use.

The following examples are illustrative of the invention and are not intended to limit its scope. Unless otherwise indicated, the examples follow conventional experimental conditions, such as those in the molecular cloning laboratory manual of Sambrook et al. (Sambrook J & Russell DW, Molecular Cloning: A Laboratory Manual, 2001), or the manufacturer's instructions.

Example 1

The viral RNA used in this example was obtained by magnetic bead extraction from the alveolar lavage fluid of patients with novel coronavirus pneumonia, two cases in total; the extraction and quality inspection of the RNA were performed by the biosafety level 3 (P3) laboratory of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences/Peking Union Medical College.

The method provided by this example can be used to detect the virus type in alveolar lavage fluid, or to detect viral genome mutations, from patients diagnosed with novel coronavirus pneumonia; the viral concentration (copies/μL) was determined as the N gene copy number by absolute quantitative qRT-PCR using the novel coronavirus nucleic acid standard (high concentration) GBW(E)091089 (National Institute of Metrology, China) (Table 3).

Table 3 Viral copy number and clinical information of alveolar lavage fluid RNA samples

The specific experimental method is as follows:

The viral single-stranded RNA extracted from alveolar lavage fluid was reverse transcribed into first-strand cDNA (1st cDNA) using a 6-base random primer; the 1st cDNA synthesis kit was: TAKARA PrimeScript 1st strand cDNA Synthesis kit (TAKARA, cat No. 6110A); the 1st cDNA was purified for subsequent amplification.

A PCR reaction was performed using anchored multiplex amplification primer set 1. The PCR reaction system comprised: 5 μL of purified cDNA template, 5 μL of anchored multiplex primer set 1, 15 μL of DNA polymerase and 2× buffer system, and 5 μL of double distilled water; the DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure comprised: step a, pre-denaturation: 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 15 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.

A PCR reaction was performed using anchored multiplex amplification primer set 2. The PCR reaction system comprised: 5 μL of purified cDNA template from the same sample, 5 μL of anchored multiplex primer set 2, 15 μL of DNA polymerase and 2× buffer system, and 5 μL of double distilled water; the DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure comprised: step a, pre-denaturation: 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 15 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.
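By way of illustration, the first-round reaction mix and cycling program described above can be written down and checked with a short sketch. The class and variable names are hypothetical; the computed time covers only the programmed hold times and ignores ramping between temperatures and the final 4 °C hold.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PcrProgram:
    pre_denaturation_s: int
    cycle_steps_s: Tuple[int, int, int]  # denaturation, annealing, extension
    cycles: int
    final_extension_s: int

    def hold_time_s(self) -> int:
        """Sum of programmed hold times; ramping and the 4 C hold are ignored."""
        return (self.pre_denaturation_s
                + self.cycles * sum(self.cycle_steps_s)
                + self.final_extension_s)


# First-round reaction mix of this example, in microlitres
reaction_mix_ul = {
    "cDNA template": 5.0,
    "anchored multiplex primer set": 5.0,
    "2x polymerase/buffer mix": 15.0,
    "double distilled water": 5.0,
}
total_ul = sum(reaction_mix_ul.values())
# A 2x master mix should make up half of the final reaction volume
assert reaction_mix_ul["2x polymerase/buffer mix"] * 2 == total_ul

first_round = PcrProgram(60, (20, 30, 30), 15, 60)
print(f"total volume: {total_ul} uL, hold time: {first_round.hold_time_s() / 60:.0f} min")
```

Changing `cycles` within the 10-25 range given in the description shows how the run time scales with the cycle number chosen for a given viral copy number.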

The first-round anchored PCR amplification products were purified separately, and the procedure comprised: step a, adding DNA purification beads to the amplification product at 1 volume (30 μL), preferably Agencourt AMPure XP Beads (Beckman Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 μL of EB buffer (Qiagen Cat No. 19086); the two groups of PCR products were then mixed in equimolar amounts.

The mixed first-round PCR products were amplified with the tagged Illumina library amplification primers. The reaction system comprised: 20 μL of mixed first-round PCR product, 5 μL of Illumina library amplification primer pair, and 25 μL of DNA polymerase and 2× buffer system; the DNA polymerase and 2× buffer system was preferably KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure of step 4) comprised: step a, pre-denaturation: 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 10 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.

The purification of the second-round Illumina library amplification product comprised: step a, adding DNA purification beads to the amplification product at 1 volume (50 μL), preferably Agencourt AMPure XP Beads (Beckman Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 μL of EB buffer (Qiagen Cat No. 19086).

High-throughput sequencing: the library purified in the previous step was sequenced according to the operating procedure of the Illumina NovaSeq; the amount of sequencing data was set to 1 G.
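To give a sense of scale for the chosen data amount, the following sketch estimates an upper bound on the mean sequencing depth. It assumes that "1 G" denotes 10^9 sequenced bases and uses the 29899 bp reference length stated in the data analysis steps; the `on_target_fraction` parameter and the function name are hypothetical, and the real depth is lower once filtering and unaligned reads are accounted for.

```python
GENOME_LENGTH_BP = 29_899  # length of MT019531.1 as stated in the text


def max_mean_depth(data_bases: float,
                   genome_length_bp: int = GENOME_LENGTH_BP,
                   on_target_fraction: float = 1.0) -> float:
    """Upper bound on mean depth: usable bases divided by genome length."""
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    return data_bases * on_target_fraction / genome_length_bp


# Assumption: 1 G of data means 1e9 bases
depth_1g = max_mean_depth(1e9)
print(f"upper bound on mean depth for 1 G of data: about {depth_1g:.0f}x")
```

Under these assumptions even a fraction of 1 G leaves ample margin over the 100× depth threshold used in the coverage statistics.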

Off-machine data analysis:

Step 1: an index dataset of the reference genome (MT019531.1) was constructed using BWA software, and a fai file was generated using samtools faidx. Basic information of the MT019531.1 genome: total length 29899 bp, GC content 37.98%;

Step 2: quality control analysis of reads. The paired-end reads were filtered and quality-controlled using SOAPnuke to obtain clean reads (the reads remaining after filtering). Reads meeting any of the following conditions were removed: 1) reads contaminated with adapter sequence; 2) reads in which N bases exceed 10%; 3) reads in which low-quality (Q < 38) bases exceed 50% of the bases of the read;

Step 3: data alignment and sorting. BWA combined with samtools was used to align the clean reads to the reference genome (MT019531.1) of COVID-19 and generate BAM files, with the alignment parameters "-t 32 -M". Sorting was done using SortSam. The sorted BAM files were indexed using the samtools index tool. The generated BAM files were quality-controlled using tools such as Qualimap;

Step 4: variant detection: SNP and InDel variants of the virus were detected using samtools pileup and VarScan. The SNP detection parameters were: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters were: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";

Step 5: the detected SNPs were annotated using annovar software based on the GFF file of the MT019531.1 reference genome.
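The alignment and variant-calling steps above could be assembled into command lines roughly as follows. This is a sketch under stated assumptions, not the exact commands used in the example: file names are hypothetical, `samtools sort` and `samtools mpileup` stand in for the SortSam and pileup tools named in the text, and the VarScan subcommands `mpileup2snp` and `mpileup2indel` are assumed. Only the BWA parameters "-t 32 -M" and the VarScan option values come from the text. The commands are built as strings and printed, not executed.

```python
# Option values as given in the text for both SNP and InDel detection
VARSCAN_OPTS = [
    "--min-coverage", "8", "--min-reads2", "4", "--min-var-freq", "0.1",
    "--min-avg-qual", "0", "--p-value", "1.0", "--strand-filter", "0",
    "--variants", "--output-vcf", "1",
]


def build_commands(sample, ref="MT019531.1.fasta",
                   r1="clean_1.fq.gz", r2="clean_2.fq.gz",
                   varscan_jar="VarScan.jar"):
    """Return shell command strings for indexing, alignment and variant calling."""
    bam = f"{sample}.sorted.bam"      # hypothetical output names
    pileup = f"{sample}.mpileup"
    opts = " ".join(VARSCAN_OPTS)
    return [
        f"bwa index {ref}",
        f"samtools faidx {ref}",
        # Assumption: samtools sort replaces the SortSam step named in the text
        f"bwa mem -t 32 -M {ref} {r1} {r2} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        f"samtools mpileup -f {ref} {bam} > {pileup}",
        f"java -jar {varscan_jar} mpileup2snp {pileup} {opts} > {sample}.snp.vcf",
        f"java -jar {varscan_jar} mpileup2indel {pileup} {opts} > {sample}.indel.vcf",
    ]


commands = build_commands("46d1-1")
for cmd in commands:
    print(cmd)
```

Read filtering with SOAPnuke, BAM quality control with Qualimap, and annotation with annovar are left out of the sketch.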

Analysis of results:

FIG. 2 shows the library construction results of this example. For the libraries constructed from viral RNA isolated from alveolar lavage fluid, the 46d1-1 and 50d1-1 sequencing libraries are bimodal, with a major peak at about 380-400 bp (about 80%), which is in line with the designed average amplicon size and the complete library size, and a minor peak at 800-1000 bp (about 20%), presumably due to: 1) a small amount of genomic by-products generated under the influence of the random primers; 2) potential primer dimers over-amplified by the library amplification primers; or 3) residual anchored multiplex amplification primer set 1 and primer set 2 that remained after the first round of purification and were amplified; the blank NC did not yield a library, as expected (results not shown); the off-machine data volumes were 0.75 G (46d1-1) and 1.1 G (50d1-1), and the raw data Q30 values were 90.38% and 80.72%, respectively (Table 4).

Table 4 Quality control of off-machine data of the alveolar lavage fluid sample libraries

The filtered off-machine data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531, GWHABKH00000000); the number of aligned bases, alignment rate, mismatch rate, average depth, coverage ratio, and the like are shown in Table 5. The alignment rate of the off-machine data of the alveolar lavage fluid samples was more than 92%, and the mismatch rate was less than 0.2%; the 100× sequencing depth coverage ratio of the N and S accessory genes of the novel coronavirus was 100% in both libraries, so the virus can be identified as the novel coronavirus COVID-19; the 100× sequencing depth coverage ratio of the viral genome reached 97.38% and 98.24%, respectively (Table 5).
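The 100× sequencing depth coverage ratio can be illustrated as follows. The sketch assumes the ratio is defined as the fraction of reference positions whose depth is at least 100, and that a per-position depth list is available (for example from a tool such as `samtools depth -a`); the toy depth values and function names are hypothetical and are not data from this example.

```python
from typing import Sequence


def coverage_ratio(depths: Sequence[int], min_depth: int = 100) -> float:
    """Fraction of reference positions whose depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= min_depth) / len(depths)


def mean_depth(depths: Sequence[int]) -> float:
    """Average depth over all reference positions."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depths) / len(depths)


# Toy per-position depths for a 10 bp region (hypothetical values)
toy_depths = [0, 50, 100, 150, 2000, 3000, 120, 99, 101, 500]
print(f"100x coverage ratio: {coverage_ratio(toy_depths):.0%}")
print(f"mean depth: {mean_depth(toy_depths):.1f}x")
```

Restricting `depths` to the positions of a single gene would give a per-gene ratio of the kind reported for the N and S genes.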

Table 5 Analysis statistics of off-machine data of the alveolar lavage fluid sample libraries

Of the two libraries, 46d1-1 had 9 single nucleotide polymorphism (SNP) sites and no indel mutations were found; the SNP sites were: MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 3706 (orf1ab: A3441G), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 5812 (orf1ab: C5547T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 7010 (orf1ab: G6755A), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 18640 (orf1ab: A18375G).

50d1-1 had 8 single nucleotide polymorphism (SNP) sites and no indel mutations were found; the SNP sites were: MT019531.1 genomic position 1880 (orf1ab: G1615A), MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 7010 (orf1ab: G6755A), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 28620 (N: G346A).

Example 2

The viral RNA samples 46d1 and 50d1 used in this example are the same as those used in example 1.

The specific experimental method is as follows:

The viral single-stranded RNA extracted from alveolar lavage fluid was reverse transcribed into first-strand cDNA (1st cDNA) using a mixture of 34 gene-specific reverse primers (COV-1-R, COV-8-R, COV-12-R, COV-20-R, COV-30-R, COV-38-R, COV-47-R, COV-54-R, COV-62-R, COV-71-R, COV-80-R, COV-86-R, COV-94-R, COV-102-R, COV-111-R, COV-119-R, COV-125-R, COV-132-R, COV-141-R, COV-146-R, COV-155-R, COV-162-R, COV-172-R, COV-179-R, COV-187-R, COV-195-R, COV-202-R, COV-210-R, COV-220-R, COV-228-R, COV-233-R, COV-239-R, COV-247-R, COV-252-R; genomic direction 3'-5'); the 1st cDNA synthesis kit was: TAKARA PrimeScript 1st strand cDNA Synthesis kit (TAKARA, cat No. 6110A); the 1st cDNA was purified for subsequent amplification.

A PCR reaction was performed using anchored multiplex amplification primer set 1. The PCR reaction system comprised: 5 μL of purified cDNA template, 5 μL of anchored multiplex primer set 1, 15 μL of DNA polymerase and 2× buffer system, and 5 μL of double distilled water; the DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure comprised: step a, pre-denaturation: 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 15 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.

The mixed first-round PCR products were amplified with the tagged Illumina library amplification primers. The reaction system comprised: 20 μL of mixed first-round PCR product, 5 μL of Illumina library amplification primer pair, and 25 μL of DNA polymerase and 2× buffer system; the DNA polymerase and 2× buffer system was preferably KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure of step 4) comprised: step a, pre-denaturation: 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 10 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.

Off-machine data analysis:

Step 2: quality control analysis of reads. The paired-end reads were filtered and quality-controlled using SOAPnuke to obtain clean reads (the reads remaining after filtering). Reads meeting any of the following conditions were removed: 1) reads contaminated with adapter sequence; 2) reads in which N bases exceed 10%; 3) reads in which low-quality (Q < 38) bases exceed 50% of the bases of the read;

Step 5: the detected SNPs were annotated using annovar software based on the GFF file of the MT019531.1 reference genome;

Analysis of results:

FIG. 3 shows the library construction results of this example. With reverse transcription using the mixed specific primers followed by library construction, the 46d1-2 and 50d1-2 sequencing libraries are unimodal, with an average library fragment size of about 380-420 bp, in line with the expected average amplicon size and complete library size; the blank NC did not yield a library, as expected (results not shown); the off-machine data volumes were 0.9 G (46d1-2) and 1.0 G (50d1-2), and the raw data Q30 values were 94.15% and 92.46%, respectively (Table 6).

Table 6 Quality control of off-machine data of the alveolar lavage fluid sample libraries

The filtered off-machine data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531, GWHABKH00000000); the number of aligned bases, alignment rate, mismatch rate, average depth, coverage ratio, and the like are shown in Table 7. The alignment rate of the off-machine data of the alveolar lavage fluid samples was more than 97%, and the mismatch rate was less than 0.1%; the 100× sequencing depth coverage ratio of the N and S accessory genes of the novel coronavirus was 100% in both libraries, so the virus can be identified as the novel coronavirus COVID-19; the 100× sequencing depth coverage ratio of the viral genome reached 99.08% and 99.24%, respectively (Table 7).

Table 7 Analysis statistics of off-machine data of the alveolar lavage fluid sample libraries

Of the two samples, 46d1-2 had 9 single nucleotide polymorphism (SNP) sites and no indel mutations were found; the SNP sites were: MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 3706 (orf1ab: A3441G), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 5812 (orf1ab: C5547T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 7010 (orf1ab: G6755A), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 18640 (orf1ab: A18375G).

50d1-2 had 8 single nucleotide polymorphism (SNP) sites and no indel mutations were found; the SNP sites were: MT019531.1 genomic position 1880 (orf1ab: G1615A), MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 7010 (orf1ab: G6755A), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 28620 (N: G346A). Taking the results of example 1 and example 2 together, the same mutation sites are identified for the same sample by both reverse transcription methods, while the library obtained by reverse transcription with the specific primers is clearly superior in alignment rate, average sequencing depth, and 100× sequencing depth coverage ratio, and therefore in data utilization.
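The agreement between the two reverse transcription methods can be checked by comparing the sets of called positions. The sketch below uses the genomic SNP positions listed in examples 1 and 2; the dictionary layout, the labels "random" and "specific", and the function name are hypothetical.

```python
# Genomic SNP positions (MT019531.1) as listed in examples 1 and 2
CALLS = {
    ("46d1", "random"): {3127, 3706, 5369, 5812, 6996, 7010, 18395, 18557, 18640},
    ("46d1", "specific"): {3127, 3706, 5369, 5812, 6996, 7010, 18395, 18557, 18640},
    ("50d1", "random"): {1880, 3127, 5369, 6996, 7010, 18395, 18557, 28620},
    ("50d1", "specific"): {1880, 3127, 5369, 6996, 7010, 18395, 18557, 28620},
}


def concordance(a: set, b: set) -> float:
    """Jaccard index of two call sets: shared positions over all positions."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


for sample in ("46d1", "50d1"):
    c = concordance(CALLS[(sample, "random")], CALLS[(sample, "specific")])
    print(f"{sample}: concordance between methods = {c:.2f}")

# Positions shared by the two patients' samples
shared = sorted(CALLS[("46d1", "random")] & CALLS[("50d1", "random")])
print("shared between samples:", shared)
```

A value of 1.00 means both methods call exactly the same positions for a sample; the comparison is on position only and does not check the alternate allele.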

Example 3

The viral RNA used in this example was obtained by magnetic bead extraction from throat swab samples of patients with novel coronavirus pneumonia; the extraction and quality inspection of the RNA were performed by the biosafety level 3 (P3) laboratory of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences/Peking Union Medical College.

The method provided by this example can be used to detect the virus type in throat swab samples, or to detect viral genome mutations, from patients with confirmed or suspected novel coronavirus pneumonia; the viral concentration (copies/μL) was determined as the N gene and E gene copy numbers by absolute quantitative qRT-PCR using the novel coronavirus nucleic acid standard (low concentration) GBW(E)091090 (National Institute of Metrology, China) (Table 8).

Table 8 Viral copy number and clinical information of throat swab RNA samples

The specific experimental method is as follows:

Oral epithelial cells of patients with novel coronavirus pneumonia were collected with a throat swab, and the viral single-stranded RNA extracted by the magnetic bead method was reverse transcribed into first-strand cDNA (1st cDNA) using a 6-base random primer; the 1st cDNA synthesis kit was: TAKARA PrimeScript 1st strand cDNA Synthesis kit (TAKARA, cat No. 6110A); the 1st cDNA was purified for subsequent amplification;

A PCR reaction was performed using anchored multiplex amplification primer set 1. The PCR reaction system comprised: 10 μL of purified cDNA template, 5 μL of anchored multiplex primer set 1, and 15 μL of DNA polymerase and 2× buffer system; the DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure comprised: step a, pre-denaturation: 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 25 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.

A PCR reaction was performed using anchored multiplex amplification primer set 2. The PCR reaction system comprised: 10 μL of purified cDNA template from the same sample, 5 μL of anchored multiplex primer set 2, and 15 μL of DNA polymerase and 2× buffer system; the DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, cat No. KK2602). The PCR amplification procedure comprised: step a, pre-denaturation: 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 25 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.

The first-round anchored PCR amplification products were purified separately, and the procedure comprised: step a, adding DNA purification beads to the amplification product at 1 volume (30 μL), preferably Agencourt AMPure XP Beads (Beckman Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 μL of EB buffer (Qiagen Cat No. 19086); the two groups of PCR products were then mixed in equimolar amounts;

The purification of the second-round Illumina library amplification product comprised: step a, adding DNA purification beads to the amplification product at 0.8 volume (40 μL), preferably Agencourt AMPure XP Beads (Beckman Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 μL of EB buffer (Qiagen Cat No. 19086).
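The bead volumes quoted for the purification steps follow directly from the bead-to-product volume ratio. A minimal sketch, with a hypothetical function name, using the 30 μL first-round and 50 μL second-round reaction volumes given in the description:

```python
def bead_volume_ul(product_volume_ul: float, ratio: float) -> float:
    """Volume of purification beads for a given bead-to-product volume ratio."""
    if product_volume_ul <= 0 or ratio <= 0:
        raise ValueError("volume and ratio must be positive")
    return product_volume_ul * ratio


first_round_beads = bead_volume_ul(30.0, 1.0)    # 1 volume of a 30 uL reaction
second_round_beads = bead_volume_ul(50.0, 0.8)   # 0.8 volume of a 50 uL reaction
print(f"first round: {first_round_beads:.0f} uL, second round: {second_round_beads:.0f} uL")
```

The 0.6-1 volume range given in the description corresponds to 30-50 μL of beads for a 50 μL library amplification product.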

High-throughput sequencing: the library purified in the previous step was sequenced according to the operating procedure of the Illumina NovaSeq; the amount of sequencing data was set to 1 G, and in this example the actual molar amount of library loaded was adjusted because of the library peak profile.

Off-machine data analysis:

Analysis of results:

FIG. 4 shows the results of library construction using throat swab viral RNA with a relatively low viral copy number. Both libraries are multimodal, with a major peak at about 180 bp (about 80%), suspected to result from the low viral copy number of the samples and the low reverse transcription efficiency, whereby the anchored primers formed dimers that were over-amplified by the library amplification primers; the expected main peak appears as two minor peaks at 380-440 bp, accounting for about 20%, so the actual molar amount of library loaded for sequencing was increased 4-fold; the blank NC did not yield a library, as expected (results not shown); the off-machine data volumes were 1.2 G (48d5-1) and 1.3 G (47d1-1), and the raw data Q30 values were 85.28% and 79.77%, respectively (Table 9).

Table 9 Quality control of off-machine data of the throat swab sample libraries

The filtered off-machine data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531, GWHABKH00000000); the number of aligned bases, alignment rate, mismatch rate, and the like are shown in Table 10. The alignment rate of the off-machine data of the throat swab samples was more than 93%, and the mismatch rate was less than 0.03%; the 100× sequencing depth coverage ratio of the N and S accessory genes of the novel coronavirus was 100% in both libraries, so the virus can be identified as the novel coronavirus COVID-19; the 100× sequencing depth coverage ratio of the viral genome reached 98.12% and 96.73%, respectively (Table 10).

Table 10 Analysis statistics of off-machine data of the throat swab sample libraries

48d5-1 had 6 single nucleotide polymorphism (SNP) sites and 4 deletion mutation sites. The SNP sites were: MT019531.1 genomic position 2132 (orf1ab: A1867G), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 11354 (orf1ab: G11089A), MT019531.1 genomic position 17194 (orf1ab: A16929G), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T). The deletion mutations were at: MT019531.1 genomic position 9264 (orf1ab: 9000_9005del), MT019531.1 genomic position 9851 (orf1ab: 9587_9596del), MT019531.1 genomic position 20296 (orf1ab: 20032_20035del), MT019531.1 genomic position 29067 (N: 795_808del).

47d1-1 had 3 single nucleotide polymorphism (SNP) sites and no indel mutations were found; the SNP sites were: MT019531.1 genomic position 1578 (orf1ab: T1313A), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 18123 (orf1ab: T17858C).
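The orf1ab-relative coordinates given in parentheses differ from the genomic positions by a constant offset, which can be confirmed with a short check. The sketch uses the SNP pairs listed for 48d5-1 and 47d1-1 above; the offset is inferred from these pairs and is not stated in the text, and the function names are hypothetical.

```python
# (genomic position, orf1ab-relative coordinate) pairs from the SNP lists above
PAIRS = [
    (2132, 1867), (6996, 6731), (11354, 11089), (17194, 16929),
    (18395, 18130), (18557, 18292),   # 48d5-1
    (1578, 1313), (18123, 17858),     # 47d1-1 (6996 is shared with 48d5-1)
]


def offsets(pairs):
    """Set of differences between genomic and gene-relative coordinates."""
    return {genomic - relative for genomic, relative in pairs}


def genomic_to_orf1ab(position: int, offset: int) -> int:
    """Convert a genomic position to an orf1ab-relative coordinate."""
    return position - offset


found = offsets(PAIRS)
print("offsets found:", found)
```

A single value in the result indicates that all listed pairs are mutually consistent; more than one value would point to a transcription error in one of the coordinates.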

Example 4

The viral RNA samples 48d5 and 47d1 used in this example are the same as in example 3.

The specific experimental method is as follows:

Oral epithelial cells of patients with novel coronavirus pneumonia were collected with a throat swab, and the viral single-stranded RNA extracted by the magnetic bead method was reverse transcribed into first-strand cDNA (1st cDNA) using a mixture of 34 gene-specific reverse primers (genomic direction 3'-5'; the same specific reverse transcription primer combination as in example 2); the 1st cDNA synthesis kit was: TAKARA PrimeScript 1st strand cDNA Synthesis kit (TAKARA, cat No. 6110A); the 1st cDNA was purified for subsequent amplification.

The first-round anchored PCR amplification products were purified separately, and the procedure comprised: step a, adding DNA purification beads to the amplification product at 1 volume (30 μL), preferably Agencourt AMPure XP Beads (Beckman Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 μL of EB buffer (Qiagen Cat No. 19086); the two groups of PCR products were then mixed in equimolar amounts;

Off-machine data analysis:

Analysis of results:

FIG. 5 shows the library construction results of this example. With reverse transcription of throat swab viral RNA of relatively low viral copy number using the mixed specific primers, followed by library construction, both libraries are unimodal, with the main peak at the expected position of 380-420 bp; the blank NC did not yield a library, as expected; the off-machine data volumes were 1.3 G (48d5-2) and 1.3 G (47d1-2), and the raw data Q30 values were 94.18% and 94.37%, respectively (Table 11).

Table 11 Quality control of off-machine data of the throat swab sample libraries

The filtered off-machine data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531, GWHABKH00000000); the number of aligned bases, alignment rate, mismatch rate, and the like are shown in Table 12. The alignment rate of the off-machine data of the throat swab samples was more than 96%, and the mismatch rate was less than 0.03%; the 100× sequencing depth coverage ratio of the N and S accessory genes of the novel coronavirus was 100% in both libraries, so the virus can be identified as the novel coronavirus COVID-19; the 100× sequencing depth coverage ratio of the viral genome reached 99.08% and 98.85%, respectively (Table 12).

Table 12 Analysis statistics of off-machine data of the throat swab sample libraries

48d5-2 had 6 single nucleotide polymorphism (SNP) sites and 4 deletion mutation sites. The SNP sites were: MT019531.1 genomic position 2132 (orf1ab: A1867G), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 11354 (orf1ab: G11089A), MT019531.1 genomic position 17194 (orf1ab: A16929G), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T). The deletion mutations were at: MT019531.1 genomic position 9264 (orf1ab: 9000_9005del), MT019531.1 genomic position 9851 (orf1ab: 9587_9596del), MT019531.1 genomic position 20296 (orf1ab: 20032_20035del), MT019531.1 genomic position 29067 (N: 795_808del).

47d1-2 had 3 single nucleotide polymorphism (SNP) sites and no indel mutations were found; the SNP sites were: MT019531.1 genomic position 1578 (orf1ab: T1313A), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 18123 (orf1ab: T17858C). Taking the results of example 3 and example 4 together, the same mutation sites are identified by both reverse transcription methods, while the library obtained by reverse transcription with the specific primers is superior in alignment rate, average sequencing depth, and 100× sequencing depth coverage ratio, and therefore in data utilization.

Example 5

The virus used in this example came from laboratory cells infected in vitro with isolated novel coronavirus strains; RNA was extracted from the virus culture supernatant, 3 cases in total; the virus culture and RNA extraction were carried out with the assistance of the biosafety level 3 (P3) laboratory of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences/Peking Union Medical College.

The method provided by this example can be used to detect viral genome mutations of the novel coronavirus at high copy number, and to identify and analyze the variation and evolution of the virus at high copy number; the viral concentration (copies/μL) was determined as the N gene copy number by absolute quantitative qRT-PCR using the novel coronavirus nucleic acid standard (high concentration) GBW(E)091089 (National Institute of Metrology, China) (Table 13).

Table 13 Viral copy number and clinical information of cultured virus RNA samples

The specific experimental method is as follows:

The viral single-stranded RNA extracted after virus culture was reverse transcribed into first-strand cDNA (1st cDNA) using a mixture of 34 gene-specific reverse primers (genomic direction 3'-5'; the same specific reverse transcription primer combination as in example 2); the 1st cDNA synthesis kit was: TAKARA PrimeScript 1st strand cDNA Synthesis kit (TAKARA, cat No. 6110A); the 1st cDNA was purified and then used in the subsequent amplification reaction.

A PCR reaction was performed using anchored multiplex amplification primer set 1. The PCR reaction system comprised: 5 μL of cDNA template, 5 μL of anchored multiplex primer set 1, 15 μL of DNA polymerase and 2× buffer system, and 5 μL of double-distilled water. The DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, Cat. No. KK2602). The PCR amplification program comprised: step a, pre-denaturation: 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 10 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.

For the same sample, a PCR reaction was performed using anchored multiplex amplification primer set 2. The PCR reaction system comprised: 5 μL of cDNA template, 5 μL of anchored multiplex primer set 2, 15 μL of DNA polymerase and 2× buffer system, and 5 μL of double-distilled water. The DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, Cat. No. KK2602). The PCR amplification program comprised: step a, pre-denaturation: 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, and extension at 72 °C for 30 s, for 10 cycles; step c, final extension: 72 °C for 60 s; hold at 4 °C.
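The reaction setup and cycling program used for both primer sets can be checked with a short calculation. The following Python sketch is illustrative only: the names `REACTION_UL`, `Step`, `total_volume_ul`, and `programmed_seconds` are hypothetical, and the time estimate counts programmed hold times only, ignoring thermocycler ramp times and the final 4 °C hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One programmed thermocycler step (hypothetical helper type)."""
    name: str
    temp_c: int
    seconds: int


# Reaction components in microlitres, as listed for each first-round PCR.
REACTION_UL = {
    "cDNA template": 5,
    "anchored multiplex primer set": 5,
    "DNA polymerase and 2x buffer system": 15,
    "double-distilled water": 5,
}

PRE_DENATURATION = Step("pre-denaturation", 98, 60)
CYCLE = [
    Step("denaturation", 98, 20),
    Step("annealing", 60, 30),
    Step("extension", 72, 30),
]
CYCLES = 10
FINAL_EXTENSION = Step("final extension", 72, 60)


def total_volume_ul(components):
    """Sum the component volumes of one reaction."""
    return sum(components.values())


def programmed_seconds():
    """Programmed hold time only; ramping between temperatures is ignored."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return PRE_DENATURATION.seconds + CYCLES * per_cycle + FINAL_EXTENSION.seconds


if __name__ == "__main__":
    volume = total_volume_ul(REACTION_UL)
    mix = REACTION_UL["DNA polymerase and 2x buffer system"]
    print(f"total reaction volume: {volume} uL")
    # A 2x mix is at 1x when it makes up half of the final volume.
    print(f"final mix concentration: {2 * mix / volume:g}x")
    print(f"programmed time: {programmed_seconds()} s")
```

With the volumes listed above, the 2× mix makes up half of the reaction volume, which is consistent with a 1× final concentration.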

High-throughput sequencing: the library purified in the previous step was sequenced following the operating procedure of the Illumina NovaSeq. Because the viral copy number was high, the sequencing data volume was set to 500M (0.5G) in order to reduce the influence of the Illumina sequencing platform on the data results.

Off-machine data analysis:

Analysis of results:

FIG. 6 shows the library construction results of this example. Virus culture samples with a viral copy number of about 10⁸ were used for library construction. All libraries were unimodal with a narrow, sharp peak, and the average library fragment size was about 380 to 420 bp, which matches expectations for the average amplicon size and the size of the complete library. Q30 was at least 93.79% (Table 14).

TABLE 14 Quality control of off-machine data for the cultured virus sample libraries

After filtering, the off-machine (clean) data were aligned to the viral reference genome MT019531.1 (Accession No.: MT019531; GWHABKH00000000). The number of aligned reads, number of aligned bases, alignment rate, mismatch rate, and related metrics are shown in Table 6. The alignment rate of the off-machine data for each sample was above 99.61%, and the mismatch rate was below 0.2%. The 100× sequencing depth coverage ratio of the novel coronavirus N and S genes in the two libraries was 100%, so the 3 samples could be determined to be the novel coronavirus COVID-19; the 100× sequencing depth coverage ratio of the viral genome was above 98.65% (Table 15).

TABLE 15 Analysis and statistics of off-machine data for the cultured virus sample libraries
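The alignment rate and the 100× sequencing depth coverage ratio are simple ratios. The sketch below shows one way they could be computed; the function names and the toy depth values are hypothetical and are not taken from the tables of this example.

```python
def alignment_rate(aligned_reads, total_reads):
    """Fraction of filtered reads that align to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return aligned_reads / total_reads


def depth_coverage_ratio(depths, min_depth=100):
    """Fraction of reference positions sequenced to at least min_depth.

    depths holds one sequencing depth value per reference position.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


if __name__ == "__main__":
    # Toy per-position depths for a 5-base region (illustrative values only).
    toy_depths = [0, 50, 100, 250, 1000]
    print(f"100x coverage ratio: {depth_coverage_ratio(toy_depths):.2%}")
    print(f"alignment rate: {alignment_rate(995, 1000):.2%}")
```

In practice the per-position depths would come from the alignment of the filtered reads to MT019531.1; how they are extracted depends on the analysis tools used, which are not specified here.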

XH1P2-R contained 7 single nucleotide polymorphism (SNP) sites, and no insertion/deletion mutation sites were found. The SNPs were: MT019531.1 genomic position 3127 (orf 1ab: T2862C), MT019531.1 genomic position 3706 (orf 1ab: A3441G), MT019531.1 genomic position 5369 (orf 1ab: G5104T), MT019531.1 genomic position 5812 (orf 1ab: C5547T), MT019531.1 genomic position 6996 (orf 1ab: C6731T), MT019531.1 genomic position 18395 (orf 1ab: C18130T), and MT019531.1 genomic position 18557 (orf 1ab: C18292T).

XH1P6-R contained 6 single nucleotide polymorphism (SNP) sites, and no insertion/deletion mutation sites were found. The SNPs were: MT019531.1 genomic position 3127 (orf 1ab: T2862C), MT019531.1 genomic position 5369 (orf 1ab: G5104T), MT019531.1 genomic position 5812 (orf 1ab: C5547T), MT019531.1 genomic position 6996 (orf 1ab: C6731T), MT019531.1 genomic position 18557 (orf 1ab: C18292T), and MT019531.1 genomic position 26308 (E: G64T).

WHP6-R contained 9 single nucleotide polymorphism (SNP) sites and 1 deletion mutation site. The SNPs were: MT019531.1 genomic position 565 (ORF 1ab: T300C), MT019531.1 genomic position 6996 (ORF 1ab: C6731T), MT019531.1 genomic position 7010 (ORF 1ab: G6755A), MT019531.1 genomic position 17825 (ORF 1ab: C17560T), MT019531.1 genomic position 18557 (ORF 1ab: C18292T), MT019531.1 genomic position 21784 (S: T333A), MT019531.1 genomic position 23525 (S: C1965T), MT019531.1 genomic position 23598 (S: A2036G), and MT019531.1 genomic position 29573 (ORF 10: G16A). The deletion was at MT019531.1 genomic position 23594 (S: 2033_2062del).
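Each ORF 1ab variant above is given both as a genomic position and as a position within the coding sequence, and the two differ by a constant offset. The sketch below illustrates the conversion, assuming that the ORF 1ab coding sequence starts at genomic position 266 of MT019531.1. That start position is an assumption inferred from the position pairs listed above, not a value stated in the text, and the names `ORF1AB_CDS_START` and `cds_position` are hypothetical.

```python
# Assumed 1-based start of the ORF 1ab coding sequence in MT019531.1.
# Inferred from the listed pairs (e.g. 565 -> 300), not stated in the text.
ORF1AB_CDS_START = 266


def cds_position(genomic_position, cds_start):
    """Convert a 1-based genomic position to a 1-based position in the CDS."""
    if genomic_position < cds_start:
        raise ValueError("position lies upstream of the coding sequence")
    return genomic_position - cds_start + 1


# (genomic position, reported ORF 1ab position) pairs from the SNP lists.
REPORTED_ORF1AB = [
    (565, 300),
    (3127, 2862),
    (3706, 3441),
    (5369, 5104),
    (5812, 5547),
    (6996, 6731),
    (7010, 6755),
    (17825, 17560),
    (18395, 18130),
    (18557, 18292),
]

if __name__ == "__main__":
    for genomic, reported in REPORTED_ORF1AB:
        computed = cds_position(genomic, ORF1AB_CDS_START)
        status = "ok" if computed == reported else "differs"
        print(f"{genomic} -> {computed} (reported {reported}): {status}")
```

Running the check over the listed pairs is a quick way to spot transcription errors in a variant table; any pair flagged as differing should be compared against the original variant calls.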

While the invention has been described in detail in the foregoing general description and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.

References:

[1] Ge, X.-Y. et al. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature 503, 535–538 (2013).

[2] Yang, L. et al. Novel SARS-like betacoronaviruses in bats, China, 2011. Emerg. Infect. Dis. 19, 989–991 (2013).

[3] Menachery, V. D. et al. SARS-like WIV1-CoV poised for human emergence. Proc. Natl Acad. Sci. USA 113, 3048–3053 (2016).

[4] Cui, J., Li, F. & Shi, Z. L. Origin and evolution of pathogenic coronaviruses. Nat. Rev. Microbiol. 17, 181–192 (2019).

[5] Fan, Y., Zhao, K., Shi, Z.-L. & Zhou, P. Bat coronaviruses in China. Viruses 11, 210 (2019).

[6] Wuhan Municipal Health Commission. Press statement related to novel coronavirus infection (in Chinese). http://wjw.wuhan.gov.cn/front/web/showDetail/2020012709194 (2020).

[7] Zhou, P., Yang, X., Wang, X. et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579, 270–273 (2020). https://doi.org/10.1038/s41586-020-2012-7

Sequence Listing

<110> Fuzhou Furui Medical Laboratory Co., Ltd.

Chinese Academy of Medical Sciences

<120> method for constructing novel coronavirus whole genome high throughput sequencing library and kit for library construction

<130> KHP201111315.5

<160> 504

<170> SIPOSequenceListing 1.0

<210> 1

<211> 34

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 1

gtctcgtggg ctcggagatg tgtataagag acag 34

<210> 2

<211> 33

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 2

tcgtcggcag cgtcagatgt gtataagaga cag 33

<210> 3

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 3

gtctcgtggg ctcggagatg tgtataagag acagaccaac caactttcga tctct 55
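The entries from SEQ ID NO: 3 onward begin with the 34-nt sequence of SEQ ID NO: 1, followed by a target-specific portion. The sketch below shows how an entry could be normalized, checked against its declared <211> length, and split into the two parts. The function names are hypothetical, and treating everything after the shared prefix as the target-specific portion is an interpretation of the listing rather than a statement from it.

```python
# Sequences copied from the listing (grouping spaces kept as printed).
SEQ_ID_1 = "gtctcgtggg ctcggagatg tgtataagag acag"
SEQ_ID_3 = "gtctcgtggg ctcggagatg tgtataagag acagaccaac caactttcga tctct"


def normalize(sequence):
    """Remove the grouping whitespace used in the sequence listing."""
    return "".join(sequence.split()).lower()


def split_anchor(primer, anchor):
    """Split a primer into (anchor, remainder); fail if the prefix is absent."""
    primer, anchor = normalize(primer), normalize(anchor)
    if not primer.startswith(anchor):
        raise ValueError("primer does not start with the anchor sequence")
    return anchor, primer[len(anchor):]


if __name__ == "__main__":
    anchor, specific = split_anchor(SEQ_ID_3, SEQ_ID_1)
    # Declared lengths: <211> 34 for SEQ ID NO: 1 and <211> 55 for SEQ ID NO: 3.
    print(len(anchor), len(specific), len(anchor) + len(specific))
    print(specific)
```

The same check can be applied to any other entry by substituting its sequence and comparing the summed length with the value given in its <211> field.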

<210> 4

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 4

gtctcgtggg ctcggagatg tgtataagag acagtcccag gtaacaaacc aacc 54

<210> 5

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 5

gtctcgtggg ctcggagatg tgtataagag acagggtgtg accgaaaggt aagat 55

<210> 6

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 6

gtctcgtggg ctcggagatg tgtataagag acaggtccct ggtttcaacg agaa 54

<210> 7

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 7

gtctcgtggg ctcggagatg tgtataagag acagggcgaa ataccagtgg ctta 54

<210> 8

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 8

gtctcgtggg ctcggagatg tgtataagag acagttgagc tggtagcaga actc 54

<210> 9

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 9

gtctcgtggg ctcggagatg tgtataagag acagggtgtt acccgtgaac tcat 54

<210> 10

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 10

gtctcgtggg ctcggagatg tgtataagag acagtgtccg aacaactgga ctttat 56

<210> 11

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 11

gtctcgtggg ctcggagatg tgtataagag acaggcttga tggctttatg ggtaga 56

<210> 12

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 12

gtctcgtggg ctcggagatg tgtataagag acagattgtc cagcatgtca caattc 56

<210> 13

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 13

gtctcgtggg ctcggagatg tgtataagag acaggtggaa actgtgaaag gtttgg 56

<210> 14

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 14

gtctcgtggg ctcggagatg tgtataagag acagttctcc cgcactcttg aaac 54

<210> 15

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 15

gtctcgtggg ctcggagatg tgtataagag acaggctcgt gttgtacgat caattt 56

<210> 16

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 16

gtctcgtggg ctcggagatg tgtataagag acagtcgcag tggctaacta acatc 55

<210> 17

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 17

gtctcgtggg ctcggagatg tgtataagag acagagagaa gtttaaggaa ggtgtagag 59

<210> 18

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 18

gtctcgtggg ctcggagatg tgtataagag acagcagaga agaaactggc ctactc 56

<210> 19

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 19

gtctcgtggg ctcggagatg tgtataagag acagcatttg tcacgcactc aaagg 55

<210> 20

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 20

gtctcgtggg ctcggagatg tgtataagag acagctgtgc ccttgcacct aata 54

<210> 21

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 21

gtctcgtggg ctcggagatg tgtataagag acagcattgg ttggtacacc agtttg 56

<210> 22

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 22

gtctcgtggg ctcggagatg tgtataagag acagaatgag aagtgctctg cctatac 57

<210> 23

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 23

gtctcgtggg ctcggagatg tgtataagag acaggcaagg ttacaagagt gtgaatatc 59

<210> 24

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 24

gtctcgtggg ctcggagatg tgtataagag acagcttaca ccactgggca ttga 54

<210> 25

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 25

gtctcgtggg ctcggagatg tgtataagag acagcctcca gatgaggatg aagaag 56

<210> 26

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 26

gtctcgtggg ctcggagatg tgtataagag acagcctgaa gaagagcaag aagaaga 57

<210> 27

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 27

gtctcgtggg ctcggagatg tgtataagag acaggcagtg aggacaatca gacaa 55

<210> 28

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 28

gtctcgtggg ctcggagatg tgtataagag acagaaatgc agacattgtg gaagaag 57

<210> 29

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 29

gtctcgtggg ctcggagatg tgtataagag acagccttaa acatggagga ggtgtt 56

<210> 30

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 30

gtctcgtggg ctcggagatg tgtataagag acaggcggac acaatcttgc taaac 55

<210> 31

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 31

gtctcgtggg ctcggagatg tgtataagag acagggtgct gaccctatac attctt 56

<210> 32

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 32

gtctcgtggg ctcggagatg tgtataagag acaggatcgc tgagattcct aaagagg 57

<210> 33

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 33

gtctcgtggg ctcggagatg tgtataagag acagcttcat ccagattctg ccactc 56

<210> 34

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 34

gtctcgtggg ctcggagatg tgtataagag acagtgatgt tgttcaagag ggtgt 55

<210> 35

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 35

gtctcgtggg ctcggagatg tgtataagag acagaactgc tgtggttata cctactaaa 59

<210> 36

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 36

gtctcgtggg ctcggagatg tgtataagag acaggcttgc acatgcagaa gaaa 54

<210> 37

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 37

gtctcgtggg ctcggagatg tgtataagag acagcatgca gaagaaacac gcaaat 56

<210> 38

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 38

gtctcgtggg ctcggagatg tgtataagag acaggcgtca cttatcaaca cacttaac 58

<210> 39

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 39

gtctcgtggg ctcggagatg tgtataagag acagcatctc acttgctggt tcctat 56

<210> 40

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 40

gtctcgtggg ctcggagatg tgtataagag acagggtcct attctggaca atctacac 58

<210> 41

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 41

gtctcgtggg ctcggagatg tgtataagag acagttgaga gaagtgagga ctattaagg 59

<210> 42

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 42

gtctcgtggg ctcggagatg tgtataagag acaggggtag gtacatgtca gcatt 55

<210> 43

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 43

gtctcgtggg ctcggagatg tgtataagag acagaccaca caactgatcc tagttt 56

<210> 44

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 44

gtctcgtggg ctcggagatg tgtataagag acaggacagt aggtgagtta ggtgatg 57

<210> 45

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 45

gtctcgtggg ctcggagatg tgtataagag acagggtgag ttaggtgatg ttagagaaa 59

<210> 46

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 46

gtctcgtggg ctcggagatg tgtataagag acagcagata ccttgtacgt gtggtaa 57

<210> 47

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 47

gtctcgtggg ctcggagatg tgtataagag acagtacgtg tggtaaacaa gctaca 56

<210> 48

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 48

gtctcgtggg ctcggagatg tgtataagag acagttgcat agacggtgct ttact 55

<210> 49

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 49

gtctcgtggg ctcggagatg tgtataagag acagacaatt cttatttcac agagcaacc 59

<210> 50

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 50

gtctcgtggg ctcggagatg tgtataagag acagccaaac caaccatatc caaacg 56

<210> 51

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 51

gtctcgtggg ctcggagatg tgtataagag acagtggtga tgtggtggct attg 54

<210> 52

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 52

gtctcgtggg ctcggagatg tgtataagag acagtccctg acttaaatgg tgatgt 56

<210> 53

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 53

gtctcgtggg ctcggagatg tgtataagag acagcagtct ctgaagaagt agtggaaa 58

<210> 54

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 54

gtctcgtggg ctcggagatg tgtataagag acagtcttgc ctgcgaagat ctaaa 55

<210> 55

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 55

gtctcgtggg ctcggagatg tgtataagag acagaccctt gctactcatg gtttag 56

<210> 56

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 56

gtctcgtggg ctcggagatg tgtataagag acagtactca tggtttagct gctgtt 56

<210> 57

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 57

gtctcgtggg ctcggagatg tgtataagag acagatgccg actactatag caaagaata 59

<210> 58

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 58

gtctcgtggg ctcggagatg tgtataagag acagaaagca tctatgccga ctactat 57

<210> 59

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 59

gtctcgtggg ctcggagatg tgtataagag acaggtactg gttacagaga aggctatt 58

<210> 60

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 60

gtctcgtggg ctcggagatg tgtataagag acagacagag aaggctattt gaactcta 58

<210> 61

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 61

gtctcgtggg ctcggagatg tgtataagag acagtacttg gattggctgc aatcat 56

<210> 62

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 62

gtctcgtggg ctcggagatg tgtataagag acagggctgc aatcatgcaa ttgtt 55

<210> 63

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 63

gtctcgtggg ctcggagatg tgtataagag acagagagca acaagagtcg aatgta 56

<210> 64

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 64

gtctcgtggg ctcggagatg tgtataagag acaggtgtta caaacgtaat agagcaaca 59

<210> 65

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 65

gtctcgtggg ctcggagatg tgtataagag acagagaatg gttccatcca tctttact 58

<210> 66

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 66

gtctcgtggg ctcggagatg tgtataagag acaggaccag tcttcttaca tcgttga 57

<210> 67

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 67

gtctcgtggg ctcggagatg tgtataagag acagcgtctg tttactacag tcagcttat 59

<210> 68

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 68

gtctcgtggg ctcggagatg tgtataagag acagtgttgg tgatagtgcg gaag 54

<210> 69

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 69

gtctcgtggg ctcggagatg tgtataagag acagcttgca aagaatgtgt ccttagac 58

<210> 70

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 70

gtctcgtggg ctcggagatg tgtataagag acaggtgacc ttggtgcttg tattg 55

<210> 71

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 71

gtctcgtggg ctcggagatg tgtataagag acaggactgt agtgcgcgtc atatt 55

<210> 72

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 72

gtctcgtggg ctcggagatg tgtataagag acagtgacat gtgcaactac tagacaa 57

<210> 73

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 73

gtctcgtggg ctcggagatg tgtataagag acagaagata gcacttaagg gtggtaaa 58

<210> 74

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 74

gtctcgtggg ctcggagatg tgtataagag acagccagcg tggtggtagt tatac 55

<210> 75

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 75

gtctcgtggg ctcggagatg tgtataagag acaggacaaa gcttgcccat tgatt 55

<210> 76

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 76

gtctcgtggg ctcggagatg tgtataagag acagcttctg gtaagccagt accatatt 58

<210> 77

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 77

gtctcgtggg ctcggagatg tgtataagag acaggatgct tctggtaagc cagt 54

<210> 78

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 78

gtctcgtggg ctcggagatg tgtataagag acagcagaag ctggtgtttg tgtatc 56

<210> 79

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 79

gtctcgtggg ctcggagatg tgtataagag acaggtggta gatgggtact taacaatga 59

<210> 80

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 80

gtctcgtggg ctcggagatg tgtataagag acagacacca gtttactcat tcttacct 58

<210> 81

<211> 61

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 81

gtctcgtggg ctcggagatg tgtataagag acagctattc cttatgtcat tcactgtact 60

c 61

<210> 82

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 82

gtctcgtggg ctcggagatg tgtataagag acagcgtgta gtctttaatg gtgtttcc 58

<210> 83

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 83

gtctcgtggg ctcggagatg tgtataagag acagttgaag aagctgcgct gt 52

<210> 84

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 84

gtctcgtggg ctcggagatg tgtataagag acagtctcgc aaaggctctc aatg 54

<210> 85

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 85

gtctcgtggg ctcggagatg tgtataagag acagcatgtg atctgcacct ctgaa 55

<210> 86

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 86

gtctcgtggg ctcggagatg tgtataagag acaggatgac gtagtttact gtccaaga 58

<210> 87

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 87

gtctcgtggg ctcggagatg tgtataagag acagcagcca atcctaagac acctaag 57

<210> 88

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 88

gtctcgtggg ctcggagatg tgtataagag acagaggttg atacagccaa tcctaag 57

<210> 89

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 89

gtctcgtggg ctcggagatg tgtataagag acagcatgct ggcacagact taga 54

<210> 90

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 90

gtctcgtggg ctcggagatg tgtataagag acagtggcac agacttagaa ggtaac 56

<210> 91

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 91

gtctcgtggg ctcggagatg tgtataagag acagtgactt taaccttgtg gctatga 57

<210> 92

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 92

gtctcgtggg ctcggagatg tgtataagag acagggacct ctttctgctc aaact 55

<210> 93

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 93

gtctcgtggg ctcggagatg tgtataagag acagaatcaa gggtacacac cactg 55

<210> 94

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 94

gtctcgtggg ctcggagatg tgtataagag acagcacacc actggttgtt actca 55

<210> 95

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 95

gtctcgtggg ctcggagatg tgtataagag acaggctagt tgggtgatgc gtatt 55

<210> 96

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 96

gtctcgtggg ctcggagatg tgtataagag acaggtctat atgcctgcta gttggg 56

<210> 97

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 97

gtctcgtggg ctcggagatg tgtataagag acagccattt ccatgtgggc tctta 55

<210> 98

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 98

gtctcgtggg ctcggagatg tgtataagag acagaggtgt agttacaact gtcatgt 57

<210> 99

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 99

gtctcgtggg ctcggagatg tgtataagag acaggttggt ggcaaacctt gtatc 55

<210> 100

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 100

gtctcgtggg ctcggagatg tgtataagag acagctccca cccaagaata gcatag 56

<210> 101

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 101

gtctcgtggg ctcggagatg tgtataagag acaggctaaa gatactactg aagcctttg 59

<210> 102

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 102

gtctcgtggg ctcggagatg tgtataagag acaggcaggg tgctgtagac ataaa 55

<210> 103

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 103

gtctcgtggg ctcggagatg tgtataagag acaggctgtt gctaatggtg attctg 56

<210> 104

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 104

gtctcgtggg ctcggagatg tgtataagag acaggctaat ggtgattctg aagttgtt 58

<210> 105

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 105

gtctcgtggg ctcggagatg tgtataagag acagacaaca gcagccaaac taatg 55

<210> 106

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 106

gtctcgtggg ctcggagatg tgtataagag acaggagatg gttgtgttcc cttga 55

<210> 107

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 107

gtctcgtggg ctcggagatg tgtataagag acagggccaa ttctgctgtc aaatta 56

<210> 108

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 108

gtctcgtggg ctcggagatg tgtataagag acagcagctt taagggccaa ttctg 55

<210> 109

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 109

gtctcgtggg ctcggagatg tgtataagag acagaacaca acaaagggag gtagg 55

<210> 110

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 110

gtctcgtggg ctcggagatg tgtataagag acaggaaatg ggctagattc cctaaga 57

<210> 111

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 111

gtctcgtggg ctcggagatg tgtataagag acagtagctg ccacagtacg tcta 54

<210> 112

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 112

gtctcgtggg ctcggagatg tgtataagag acagcaagct ggtaatgcaa cagaag 56

<210> 113

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 113

gtctcgtggg ctcggagatg tgtataagag acagggtact ggtcaggcaa taaca 55

<210> 114

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 114

gtctcgtggg ctcggagatg tgtataagag acaggttgcc acatagatca tccaaatc 58

<210> 115

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 115

gtctcgtggg ctcggagatg tgtataagag acagctgatg tcgtatacag ggcttt 56

<210> 116

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 116

gtctcgtggg ctcggagatg tgtataagag acagaacggg tttgcggtgt aa 52

<210> 117

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 117

gtctcgtggg ctcggagatg tgtataagag acaggattgt ccagctgttg ctaaac 56

<210> 118

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 118

gtctcgtggg ctcggagatg tgtataagag acagggtgac atggtaccac atatatca 58

<210> 119

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 119

gtctcgtggg ctcggagatg tgtataagag acagccatgc gaaatgctgg tattg 55

<210> 120

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 120

gtctcgtggg ctcggagatg tgtataagag acaggtgtac gccaagcttt gtt 53

<210> 121

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 121

gtctcgtggg ctcggagatg tgtataagag acagccttga ccagggcttt aact 54

<210> 122

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 122

gtctcgtggg ctcggagatg tgtataagag acagagggct ttaactgcag agtc 54

<210> 123

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 123

gtctcgtggg ctcggagatg tgtataagag acagtctaca gtgttcccac ctaca 55

<210> 124

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 124

gtctcgtggg ctcggagatg tgtataagag acagcacgct gcttctggta atct 54

<210> 125

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 125

gtctcgtggg ctcggagatg tgtataagag acagtacttg tgtatgctgc tgacc 55

<210> 126

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 126

gtctcgtggg ctcggagatg tgtataagag acagtcagga tggtaatgct gctatc 56

<210> 127

<211> 62

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 127

gtctcgtggg ctcggagatg tgtataagag acaggattca atgagttatg aggatcaaga 60

tg 62

<210> 128

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 128

gtctcgtggg ctcggagatg tgtataagag acaggtggtt ggcacaacat gttaaa 56

<210> 129

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 129

gtctcgtggg ctcggagatg tgtataagag acagaagcaa attctatggt ggttgg 56

<210> 130

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 130

gtctcgtggg ctcggagatg tgtataagag acagaaacca ggtggaacct catc 54

<210> 131

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 131

gtctcgtggg ctcggagatg tgtataagag acagcatgtg tggcggttca ctat 54

<210> 132

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 132

gtctcgtggg ctcggagatg tgtataagag acaggacgat gctgttgtgt gtttc 55

<210> 133

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 133

gtctcgtggg ctcggagatg tgtataagag acagatactc tctgacgatg ctgttg 56

<210> 134

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 134

gtctcgtggg ctcggagatg tgtataagag acagtacctt ccttacccag atcca 55

<210> 135

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 135

gtctcgtggg ctcggagatg tgtataagag acagccttac ccagatccat caagaa 56

<210> 136

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 136

gtctcgtggg ctcggagatg tgtataagag acagacatga tgagttaaca ggacaca 57

<210> 137

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 137

gtctcgtggg ctcggagatg tgtataagag acagcaaggt attgggaacc tgag 54

<210> 138

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 138

gtctcgtggg ctcggagatg tgtataagag acagacacac cgcatacagt cttac 55

<210> 139

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 139

gtctcgtggg ctcggagatg tgtataagag acaggctatg tacacaccgc ataca 55

<210> 140

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 140

gtctcgtggg ctcggagatg tgtataagag acagccattg tgtgctaatg gacaag 56

<210> 141

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 141

gtctcgtggg ctcggagatg tgtataagag acagaaacct agaccaccac ttaacc 56

<210> 142

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 142

gtctcgtggg ctcggagatg tgtataagag acaggggaag ttggtaaacc tagacc 56

<210> 143

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 143

gtctcgtggg ctcggagatg tgtataagag acagctggct tatacccaac actcaa 56

<210> 144

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 144

gtctcgtggg ctcggagatg tgtataagag acagaccacc tggtactggt aaga 54

<210> 145

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 145

gtctcgtggg ctcggagatg tgtataagag acagcgtgct cgtgtagagt gttt 54

<210> 146

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 146

gtctcgtggg ctcggagatg tgtataagag acagcctgag acgacagcag atatag 56

<210> 147

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 147

gtctcgtggg ctcggagatg tgtataagag acagcactgt gagtgctttg gtttatg 57

<210> 148

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 148

gtctcgtggg ctcggagatg tgtataagag acaggttgac actgtgagtg ctttg 55

<210> 149

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 149

gtctcgtggg ctcggagatg tgtataagag acaggattca tcacagggct cagaata 57

<210> 150

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 150

gtctcgtggg ctcggagatg tgtataagag acagcaaacc actgaaacag ctcac 55

<210> 151

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 151

gtctcgtggg ctcggagatg tgtataagag acagatccta cacaggcacc taca 54

<210> 152

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 152

gtctcgtggg ctcggagatg tgtataagag acagaatcac tgggttacat cctacac 57

<210> 153

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 153

gtctcgtggg ctcggagatg tgtataagag acagcacccg cgaagaagct ataa 54

<210> 154

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 154

gtctcgtggg ctcggagatg tgtataagag acagcatgtt tatcacccgc gaaga 55

<210> 155

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 155

gtctcgtggg ctcggagatg tgtataagag acagcaccgc ctggagatca attt 54

<210> 156

<211> 61

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 156

gtctcgtggg ctcggagatg tgtataagag acagctggag atcaatttaa acacctcata 60

c 61

<210> 157

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 157

gtctcgtggg ctcggagatg tgtataagag acagtccact gcttcagaca cttatg 56

<210> 158

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 158

gtctcgtggg ctcggagatg tgtataagag acaggcctgt tggcatcatt ctattg 56

<210> 159

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 159

gtctcgtggg ctcggagatg tgtataagag acagagcgtg ttgactggac tattg 55

<210> 160

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 160

gtctcgtggg ctcggagatg tgtataagag acagagtgct ttgttaagcg tgttg 55

<210> 161

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 161

gtctcgtggg ctcggagatg tgtataagag acagtgccac acattctgac aaattc 56

<210> 162

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 162

gtctcgtggg ctcggagatg tgtataagag acagttcaca gatggtgtat gcctatt 57

<210> 163

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 163

gtctcgtggg ctcggagatg tgtataagag acagccacta aagtctgcta cgtgtataa 59

<210> 164

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 164

gtctcgtggg ctcggagatg tgtataagag acagatgtac cactaaagtc tgctacg 57

<210> 165

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 165

gtctcgtggg ctcggagatg tgtataagag acaggatgga caacagggtg aagt 54

<210> 166

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 166

gtctcgtggg ctcggagatg tgtataagag acagcagggt gaagtaccag tttctatc 58

<210> 167

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 167

gtctcgtggg ctcggagatg tgtataagag acaggtgtgg acattgctgc taatac 56

<210> 168

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 168

gtctcgtggg ctcggagatg tgtataagag acaggctaat actgtgatct gggactac 58

<210> 169

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 169

gtctcgtggg ctcggagatg tgtataagag acagtgcccg taatggtgtt cttat 55

<210> 170

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 170

gtctcgtggg ctcggagatg tgtataagag acaggtgcac cactcactgt cttt 54

<210> 171

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 171

gtctcgtggg ctcggagatg tgtataagag acaggttgat ggtgttgtcc aacaatta 58

<210> 172

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 172

gtctcgtggg ctcggagatg tgtataagag acaggttgtc caacaattac ctgaaact 58

<210> 173

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 173

gtctcgtggg ctcggagatg tgtataagag acagagtcat agtcagttag gtggtttac 59

<210> 174

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 174

gtctcgtggg ctcggagatg tgtataagag acagtaggtg gtttacatct actgattgg 59

<210> 175

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 175

gtctcgtggg ctcggagatg tgtataagag acagttatgc tttggtgtaa agatggc 57

<210> 176

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 176

gtctcgtggg ctcggagatg tgtataagag acaggtaaag atggccatgt agaaaca 57

<210> 177

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 177

gtctcgtggg ctcggagatg tgtataagag acagaaacac attaacatta gctgtaccc 59

<210> 178

<211> 60

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 178

gtctcgtggg ctcggagatg tgtataagag acaggtgata tgtacgaccc taagactaaa 60

<210> 179

<211> 62

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 179

gtctcgtggg ctcggagatg tgtataagag acagctcatt attagtgata tgtacgaccc 60

ta 62

<210> 180

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 180

gtctcgtggg ctcggagatg tgtataagag acagtaagct catgggacac ttcg 54

<210> 181

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 181

gtctcgtggg ctcggagatg tgtataagag acagcaaatc caattcagtt gtcttcct 58

<210> 182

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 182

gtctcgtggg ctcggagatg tgtataagag acaggggtac tgctgttatg tcttta 56

<210> 183

<211> 60

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 183

gtctcgtggg ctcggagatg tgtataagag acagcactag tctctagtca gtgtgttaat 60

<210> 184

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 184

gtctcgtggg ctcggagatg tgtataagag acagttattg ccactagtct ctagtcag 58

<210> 185

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 185

gtctcgtggg ctcggagatg tgtataagag acagggtttg ataaccctgt cctacc 56

<210> 186

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 186

gtctcgtggg ctcggagatg tgtataagag acagatacat gtctctggga ccaatg 56

<210> 187

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 187

gtctcgtggg ctcggagatg tgtataagag acaggaccca gtccctactt attgtt 56

<210> 188

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 188

gtctcgtggg ctcggagatg tgtataagag acagttgaat atgtctctca gccttt 56

<210> 189

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 189

gtctcgtggg ctcggagatg tgtataagag acagcggctt tagaaccatt ggtaga 56

<210> 190

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 190

gtctcgtggg ctcggagatg tgtataagag acagtgaccc tctctcagaa acaaag 56

<210> 191

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 191

gtctcgtggg ctcggagatg tgtataagag acagctgtgc acttgaccct ctc 53

<210> 192

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 192

gtctcgtggg ctcggagatg tgtataagag acagctgtgt tgctgattat tctgtcc 57

<210> 193

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 193

gtctcgtggg ctcggagatg tgtataagag acagtcttga ttctaaggtt ggtggtaa 58

<210> 194

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 194

gtctcgtggg ctcggagatg tgtataagag acagggctgc gttatagctt gga 53

<210> 195

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 195

gtctcgtggg ctcggagatg tgtataagag acagggttac caaccataca gagtagtag 59

<210> 196

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 196

gtctcgtggg ctcggagatg tgtataagag acagacccac taatggtgtt ggttac 56

<210> 197

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 197

gtctcgtggg ctcggagatg tgtataagag acagcagaga cattgctgac actact 56

<210> 198

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 198

gtctcgtggg ctcggagatg tgtataagag acaggtgatc cacagacact tgagat 56

<210> 199

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 199

gtctcgtggg ctcggagatg tgtataagag acagactcct acttggcgtg tttatt 56

<210> 200

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 200

gtctcgtggg ctcggagatg tgtataagag acagacacgt gcaggctgtt ta 52

<210> 201

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 201

gtctcgtggg ctcggagatg tgtataagag acagagtcaa tccatcattg cctaca 56

<210> 202

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 202

gtctcgtggg ctcggagatg tgtataagag acagggaata gctgttgaac aagacaaa 58

<210> 203

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 203

gtctcgtggg ctcggagatg tgtataagag acagccaagc aagaggtcat ttattgaag 59

<210> 204

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 204

gtctcgtggg ctcggagatg tgtataagag acagcacctt tgctcacaga tgaaatg 57

<210> 205

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 205

gtctcgtggg ctcggagatg tgtataagag acaggttagc gggtacaatc acttct 56

<210> 206

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 206

gtctcgtggg ctcggagatg tgtataagag acagtcaaga ctcactttct tccacag 57

<210> 207

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 207

gtctcgtggg ctcggagatg tgtataagag acagggctga agtgcaaatt gatagg 56

<210> 208

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 208

gtctcgtggg ctcggagatg tgtataagag acagtgatca caggcagact tcaaa 55

<210> 209

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 209

gtctcgtggg ctcggagatg tgtataagag acaggggcta tcatcttatg tccttcc 57

<210> 210

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 210

gtctcgtggg ctcggagatg tgtataagag acagcacctc atggtgtagt cttctt 56

<210> 211

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 211

gtctcgtggg ctcggagatg tgtataagag acagtgtgtc tggtaactgt gatgtt 56

<210> 212

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 212

gtctcgtggg ctcggagatg tgtataagag acaggactca ttcaaggagg agttagat 58

<210> 213

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 213

gtctcgtggg ctcggagatg tgtataagag acagccatgg tacatttggc taggt 55

<210> 214

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 214

gtctcgtggg ctcggagatg tgtataagag acagtagctg gcttgattgc catag 55

<210> 215

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 215

gtctcgtggg ctcggagatg tgtataagag acagccagtg ctcaaaggag tcaa 54

<210> 216

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 216

gtctcgtggg ctcggagatg tgtataagag acagttgctg tagttgtctc aaggg 55

<210> 217

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 217

gtctcgtggg ctcggagatg tgtataagag acaggataca agcctcactc cctttc 56

<210> 218

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 218

gtctcgtggg ctcggagatg tgtataagag acagctccct ttcggatggc ttatt 55

<210> 219

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 219

gtctcgtggg ctcggagatg tgtataagag acaggctttg gctttgctgg aaat 54

<210> 220

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 220

gtctcgtggg ctcggagatg tgtataagag acagataatg aggctttggc tttgc 55

<210> 221

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 221

gtctcgtggg ctcggagatg tgtataagag acaggatggc acaacaagtc ctatttc 57

<210> 222

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 222

gtctcgtggg ctcggagatg tgtataagag acagattacc agctgtactc aactcaa 57

<210> 223

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 223

gtctcgtggg ctcggagatg tgtataagag acagcacaca atcgacggtt catc 54

<210> 224

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 224

gtctcgtggg ctcggagatg tgtataagag acagcggttc atccggagtt gttaat 56

<210> 225

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 225

gtctcgtggg ctcggagatg tgtataagag acagcttgct ttcgtggtat tcttgc 56

<210> 226

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 226

gtctcgtggg ctcggagatg tgtataagag acagagttac actagccatc cttactg 57

<210> 227

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 227

gtctcgtggg ctcggagatg tgtataagag acagctcctt gaacaatgga acctagta 58

<210> 228

<211> 60

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 228

gtctcgtggg ctcggagatg tgtataagag acagcctatt ccttacatgg atttgtcttc 60

<210> 229

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 229

gtctcgtggg ctcggagatg tgtataagag acagttcttc tcaacgtgcc actc 54

<210> 230

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 230

gtctcgtggg ctcggagatg tgtataagag acaggttcca tgtggtcatt caatcc 56

<210> 231

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 231

gtctcgtggg ctcggagatg tgtataagag acagcgctac aggattggca actataa 57

<210> 232

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 232

gtctcgtggg ctcggagatg tgtataagag acagaacaca gaccattcca gtagc 55

<210> 233

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 233

gtctcgtggg ctcggagatg tgtataagag acaggcactg ataacactcg ctactt 56

<210> 234

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 234

gtctcgtggg ctcggagatg tgtataagag acaggagcaa ccaatggaga ttgattaaa 59

<210> 235

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 235

gtctcgtggg ctcggagatg tgtataagag acagagttac gtgccagatc agttt 55

<210> 236

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 236

gtctcgtggg ctcggagatg tgtataagag acagctgttc atcagacaag aggaagt 57

<210> 237

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 237

gtctcgtggg ctcggagatg tgtataagag acaggctgca tttcaccaag aatgt 55

<210> 238

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 238

gtctcgtggg ctcggagatg tgtataagag acagaatgaa acttgtcacg cctaaac 57

<210> 239

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 239

gtctcgtggg ctcggagatg tgtataagag acaggctggt tctaaatcac ccattc 56

<210> 240

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 240

gtctcgtggg ctcggagatg tgtataagag acagattgaa ttgtgcgtgg atgag 55

<210> 241

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 241

gtctcgtggg ctcggagatg tgtataagag acaggtttgg tggaccctca gatt 54

<210> 242

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 242

gtctcgtggg ctcggagatg tgtataagag acagtcaact ggcagtaacc agaat 55

<210> 243

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 243

gtctcgtggg ctcggagatg tgtataagag acagcaccaa tagcagtcca gatgac 56

<210> 244

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 244

gtctcgtggg ctcggagatg tgtataagag acagctacta ccgaagagct accaga 56

<210> 245

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 245

gtctcgtggg ctcggagatg tgtataagag acagcctgct aacaatgctg caatc 55

<210> 246

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 246

gtctcgtggg ctcggagatg tgtataagag acagccgcaa tcctgctaac aatg 54

<210> 247

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 247

gtctcgtggg ctcggagatg tgtataagag acaggtgatg ctgctcttgc tttg 54

<210> 248

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 248

gtctcgtggg ctcggagatg tgtataagag acagtgctgc tgcttgacag att 53

<210> 249

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 249

gtctcgtggg ctcggagatg tgtataagag acaggaccag gaactaatca gacaagg 57

<210> 250

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 250

gtctcgtggg ctcggagatg tgtataagag acagcccacc aacagagcct aaa 53

<210> 251

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 251

gtctcgtggg ctcggagatg tgtataagag acagctgact caactcaggc ctaaac 56

<210> 252

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 252

gtctcgtggg ctcggagatg tgtataagag acagagacca cacaaggcag atg 53

<210> 253

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 253

tcgtcggcag cgtcagatgt gtataagaga cagggacaag gctctccatc ttac 54

<210> 254

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 254

tcgtcggcag cgtcagatgt gtataagaga cagctccatc ttacctttcg gtcac 55

<210> 255

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 255

tcgtcggcag cgtcagatgt gtataagaga cagccgaacg tttgatgaac acatag 56

<210> 256

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 256

tcgtcggcag cgtcagatgt gtataagaga cagtgctacc agctcaacca taac 54

<210> 257

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 257

tcgtcggcag cgtcagatgt gtataagaga cagagggcca cagaagttgt tatc 54

<210> 258

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 258

tcgtcggcag cgtcagatgt gtataagaga caggggtaac accactgcta tgt 53

<210> 259

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 259

tcgtcggcag cgtcagatgt gtataagaga caggtgtctg caattcatag ctcttt 56

<210> 260

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 260

tcgtcggcag cgtcagatgt gtataagaga cagttggtga cgcaactgga tag 53

<210> 261

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 261

tcgtcggcag cgtcagatgt gtataagaga cagagactat gctcaggtcc tactt 55

<210> 262

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 262

tcgtcggcag cgtcagatgt gtataagaga cagcttcgga accttctcca aca 53

<210> 263

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 263

tcgtcggcag cgtcagatgt gtataagaga cagtagtatt gttatagcgg ccttctg 57

<210> 264

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 264

tcgtcggcag cgtcagatgt gtataagaga caggttagcc actgcgaagt caa 53

<210> 265

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 265

tcgtcggcag cgtcagatgt gtataagaga cagctgaaca acaccacctg taatg 55

<210> 266

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 266

tcgtcggcag cgtcagatgt gtataagaga cagtagagtc agcacacaaa gcc 53

<210> 267

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 267

tcgtcggcag cgtcagatgt gtataagaga cagggcatga gtaggccagt tt 52

<210> 268

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 268

tcgtcggcag cgtcagatgt gtataagaga cagcagagaa gaaactggcc tactc 55

<210> 269

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 269

tcgtcggcag cgtcagatgt gtataagaga cagtattagg tgcaagggca cag 53

<210> 270

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 270

tcgtcggcag cgtcagatgt gtataagaga cagcaacaca ggcgaactca tttac 55

<210> 271

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 271

tcgtcggcag cgtcagatgt gtataagaga cagtaggcag agcacttctc attaag 56

<210> 272

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 272

tcgtcggcag cgtcagatgt gtataagaga cagtcctcat ctggagggta gaaa 54

<210> 273

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 273

tcgtcggcag cgtcagatgt gtataagaga cagtctggag ggtagaaaga acaatac 57

<210> 274

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 274

tcgtcggcag cgtcagatgt gtataagaga cagaggttga agagcagcag aag 53

<210> 275

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 275

tcgtcggcag cgtcagatgt gtataagaga cagagtctga acaactggtg taagt 55

<210> 276

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 276

tcgtcggcag cgtcagatgt gtataagaga cagaggtaaa cattggctgc attaac 56

<210> 277

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 277

tcgtcggcag cgtcagatgt gtataagaga caggcaacac ctcctccatg ttta 54

<210> 278

<211> 51

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 278

tcgtcggcag cgtcagatgt gtataagaga cagttgggcc gacaacatga a 51

<210> 279

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 279

tcgtcggcag cgtcagatgt gtataagaga caggaatgta tagggtcagc accaa 55

<210> 280

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 280

tcgtcggcag cgtcagatgt gtataagaga cagcatttgt gcgaacagta tctacac 57

<210> 281

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 281

tcgtcggcag cgtcagatgt gtataagaga cagcttccag agttgttgta acttcttc 58

<210> 282

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 282

tcgtcggcag cgtcagatgt gtataagaga caggagtggc agaatctgga tgaa 54

<210> 283

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 283

tcgtcggcag cgtcagatgt gtataagaga cagacccggg taagtggtta tataattg 58

<210> 284

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 284

tcgtcggcag cgtcagatgt gtataagaga cagtctgcat gtgcaagcat ttc 53

<210> 285

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 285

tcgtcggcag cgtcagatgt gtataagaga caggcgtgtt tcttctgcat gtg 53

<210> 286

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 286

tcgtcggcag cgtcagatgt gtataagaga cagcatagcc aagtggcatt gtaac 55

<210> 287

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 287

tcgtcggcag cgtcagatgt gtataagaga cagcgagcag cttcttccaa atttaag 57

<210> 288

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 288

tcgtcggcag cgtcagatgt gtataagaga cagtgtccag aataggacca atctttat 58

<210> 289

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 289

tcgtcggcag cgtcagatgt gtataagaga cagacttgcg tgtggaggtt aat 53

<210> 290

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 290

tcgtcggcag cgtcagatgt gtataagaga cagccaaact gttgtccata tgtcatt 57

<210> 291

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 291

tcgtcggcag cgtcagatgt gtataagaga cagctgacat gtacctaccc agaaa 55

<210> 292

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 292

tcgtcggcag cgtcagatgt gtataagaga cagcacctaa ctcacctact gtcttatt 58

<210> 293

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 293

tcgtcggcag cgtcagatgt gtataagaga cagaacatca cctaactcac ctactg 56

<210> 294

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 294

tcgtcggcag cgtcagatgt gtataagaga cagggtgact cctgttgtac tagatatt 58

<210> 295

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 295

tcgtcggcag cgtcagatgt gtataagaga cagcaggtgg tgctgacatc ataa 54

<210> 296

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 296

tcgtcggcag cgtcagatgt gtataagaga cagggacctt tgtattctga ggactt 56

<210> 297

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 297

tcgtcggcag cgtcagatgt gtataagaga cagaacatcc gtaataggac ctttgt 56

<210> 298

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 298

tcgtcggcag cgtcagatgt gtataagaga cagcgaagct tgcgtttgga tatg 54

<210> 299

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 299

tcgtcggcag cgtcagatgt gtataagaga cagatagcca ccacatcacc attta 55

<210> 300

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 300

tcgtcggcag cgtcagatgt gtataagaga cagcgtggct ttattagttg cattgt 56

<210> 301

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 301

tcgtcggcag cgtcagatgt gtataagaga cagtcttcgc aggcaagatt atcc 54

<210> 302

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 302

tcgtcggcag cgtcagatgt gtataagaga cagcactact tcttcagaga ctggtt 56

<210> 303

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 303

tcgtcggcag cgtcagatgt gtataagaga cagacagcag ctaaaccatg agtag 55

<210> 304

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 304

tcgtcggcag cgtcagatgt gtataagaga cagggacact attaacagca gctaaac 57

<210> 305

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 305

tcgtcggcag cgtcagatgt gtataagaga cagctttgct atagtagtcg gcatagat 58

<210> 306

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 306

tcgtcggcag cgtcagatgt gtataagaga cagagtagtc ggcatagatg ctttaat 57

<210> 307

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 307

tcgtcggcag cgtcagatgt gtataagaga cagaccagta cagtaggttg caatag 56

<210> 308

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 308

tcgtcggcag cgtcagatgt gtataagaga cagagagttc aaatagcctt ctctgt 56

<210> 309

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 309

tcgtcggcag cgtcagatgt gtataagaga cagtgcagcc aatccaagta cata 54

<210> 310

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 310

tcgtcggcag cgtcagatgt gtataagaga cagcatgatt gcagccaatc caa 53

<210> 311

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 311

tcgtcggcag cgtcagatgt gtataagaga cagacattcg actcttgttg ctctatt 57

<210> 312

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 312

tcgtcggcag cgtcagatgt gtataagaga caggactctt gttgctctat tacgtttg 58

<210> 313

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 313

tcgtcggcag cgtcagatgt gtataagaga caggaaccat tcttcactgt aacactatc 59

<210> 314

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 314

tcgtcggcag cgtcagatgt gtataagaga cagcgatgta agaagactgg tcagtag 57

<210> 315

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 315

tcgtcggcag cgtcagatgt gtataagaga cagactgcaa cttccgcact atc 53

<210> 316

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 316

tcgtcggcag cgtcagatgt gtataagaga cagcactatc accaacatca gacacta 57

<210> 317

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 317

tcgtcggcag cgtcagatgt gtataagaga cagctgaatc aacaaaccct tgcc 54

<210> 318

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 318

tcgtcggcag cgtcagatgt gtataagaga cagcgccagt aacttctatg tcagat 56

<210> 319

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 319

tcgtcggcag cgtcagatgt gtataagaga caggcattaa tatgacgcgc actac 55

<210> 320

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 320

tcgtcggcag cgtcagatgt gtataagaga cagtttacca cccttaagtg ctatct 56

<210> 321

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 321

tcgtcggcag cgtcagatgt gtataagaga cagcccttaa gtgctatctt tgttgttac 59

<210> 322

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 322

tcgtcggcag cgtcagatgt gtataagaga cagcgagtga caccaccatc aata 54

<210> 323

<211> 50

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 323

tcgtcggcag cgtcagatgt gtataagaga cagccaccac gctggctaaa 50

<210> 324

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 324

tcgtcggcag cgtcagatgt gtataagaga cagtggctta ccagaagcat cttt 54

<210> 325

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 325

tcgtcggcag cgtcagatgt gtataagaga cagaatatgg tactggctta ccagaag 57

<210> 326

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 326

tcgtcggcag cgtcagatgt gtataagaga cagagataca caaacaccag cttct 55

<210> 327

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 327

tcgtcggcag cgtcagatgt gtataagaga cagatcattg ttaagtaccc atctacca 58

<210> 328

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 328

tcgtcggcag cgtcagatgt gtataagaga cagaaaggca actacatgac tgtattc 57

<210> 329

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 329

tcgtcggcag cgtcagatgt gtataagaga cagacagagt acagtgaatg acataagg 58

<210> 330

<211> 51

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 330

tcgtcggcag cgtcagatgt gtataagaga cagacagcgc agcttcttca a 51

<210> 331

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 331

tcgtcggcag cgtcagatgt gtataagaga cagaaagact acacgtctct ttaggt 56

<210> 332

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 332

tcgtcggcag cgtcagatgt gtataagaga cagggtttgt ggtggttggt aaag 54

<210> 333

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 333

tcgtcggcag cgtcagatgt gtataagaga cagcagctga ggtgatagag gtttg 55

<210> 334

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 334

tcgtcggcag cgtcagatgt gtataagaga caggggttaa gcatgtcttc agagg 55

<210> 335

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 335

tcgtcggcag cgtcagatgt gtataagaga cagagtctgt cctggttgaa tgc 53

<210> 336

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 336

tcgtcggcag cgtcagatgt gtataagaga cagaccagat ggtgaaccat tgtaa 55

<210> 337

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 337

tcgtcggcag cgtcagatgt gtataagaga cagctaagtc tgtgccagca tgaa 54

<210> 338

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 338

tcgtcggcag cgtcagatgt gtataagaga cagagcatga actccagttg gtaat 55

<210> 339

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 339

tcgtcggcag cgtcagatgt gtataagaga caggtttgag cagaaagagg tccta 55

<210> 340

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 340

tcgtcggcag cgtcagatgt gtataagaga cagcaattcc agtttgagca gaaaga 56

<210> 341

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 341

tcgtcggcag cgtcagatgt gtataagaga cagcagtggt gtgtaccctt gatt 54

<210> 342

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 342

tcgtcggcag cgtcagatgt gtataagaga cagcaaagac cattgagtac tctgga 56

<210> 343

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 343

tcgtcggcag cgtcagatgt gtataagaga cagcacccaa ctagcaggca tatag 55

<210> 344

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 344

tcgtcggcag cgtcagatgt gtataagaga cagtaatacg catcacccaa ctagc 55

<210> 345

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 345

tcgtcggcag cgtcagatgt gtataagaga cagtaagagc ccacatggaa atgg 54

<210> 346

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 346

tcgtcggcag cgtcagatgt gtataagaga cagcacatgg aaatggcttg atctaaa 57

<210> 347

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 347

tcgtcggcag cgtcagatgt gtataagaga caggtcagtc taaagtagcg gttgag 56

<210> 348

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 348

tcgtcggcag cgtcagatgt gtataagaga caggctattc ttgggtggga gtag 54

<210> 349

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 349

tcgtcggcag cgtcagatgt gtataagaga cagcctgttg tccagcattt cttc 54

<210> 350

<211> 51

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 350

tcgtcggcag cgtcagatgt gtataagaga cagaccctgc atggaaagca a 51

<210> 351

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 351

tcgtcggcag cgtcagatgt gtataagaga cagagcaaca gcctgctcat aa 52

<210> 352

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 352

tcgtcggcag cgtcagatgt gtataagaga caggctgcat cacggtcaaa ttc 53

<210> 353

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 353

tcgtcggcag cgtcagatgt gtataagaga cagtcaaggg aacacaacca tctc 54

<210> 354

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 354

tcgtcggcag cgtcagatgt gtataagaga cagggctgct gttgtaagag gtat 54

<210> 355

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 355

tcgtcggcag cgtcagatgt gtataagaga caggtcgtag tgcaacagga ctaa 54

<210> 356

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 356

tcgtcggcag cgtcagatgt gtataagaga caggacagca gaattggccc tta 53

<210> 357

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 357

tcgtcggcag cgtcagatgt gtataagaga caggtaccag ttccatcact cttagg 56

<210> 358

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 358

tcgtcggcag cgtcagatgt gtataagaga cagggacctt taggtgtgtc tgtaa 55

<210> 359

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 359

tcgtcggcag cgtcagatgt gtataagaga caggacgtac tgtggcagct aaa 53

<210> 360

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 360

tcgtcggcag cgtcagatgt gtataagaga cagcaggcac ttctgttgca ttac 54

<210> 361

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 361

tcgtcggcag cgtcagatgt gtataagaga cagccatatt ggcttccggt gtaa 54

<210> 362

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 362

tcgtcggcag cgtcagatgt gtataagaga cagggcagta cagacaacac gat 53

<210> 363

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 363

tcgtcggcag cgtcagatgt gtataagaga caggttgatc acaactacag ccataac 57

<210> 364

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 364

tcgtcggcag cgtcagatgt gtataagaga cagaaagccc tgtatacgac atcag 55

<210> 365

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 365

tcgtcggcag cgtcagatgt gtataagaga cagggtacca tgtcaccgtc tattc 55

<210> 366

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 366

tcgtcggcag cgtcagatgt gtataagaga caggtttagc aacagctgga caatc 55

<210> 367

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 367

tcgtcggcag cgtcagatgt gtataagaga cagggcgtac acgttcacct aa 52

<210> 368

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 368

tcgtcggcag cgtcagatgt gtataagaga cagacaccaa caataccagc atttc 55

<210> 369

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 369

tcgtcggcag cgtcagatgt gtataagaga cagcctctct tccgtgaagt catattt 57

<210> 370

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 370

tcgtcggcag cgtcagatgt gtataagaga cagaagccct ggtcaaggtt aata 54

<210> 371

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 371

tcgtcggcag cgtcagatgt gtataagaga cagcttgtag gtgggaacac tgtag 55

<210> 372

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 372

tcgtcggcag cgtcagatgt gtataagaga cagggtggga acactgtaga gaataa 56

<210> 373

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 373

tcgtcggcag cgtcagatgt gtataagaga cagaagcacg tagtgcgttt atct 54

<210> 374

<211> 61

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 374

tcgtcggcag cgtcagatgt gtataagaga cagtaacgat agtagtcata atcgctgata 60

g 61

<210> 375

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 375

tcgtcggcag cgtcagatgt gtataagaga cagagcatta ccatcctgag caaa 54

<210> 376

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 376

tcgtcggcag cgtcagatgt gtataagaga cagagtgcat cttgatcctc ataact 56

<210> 377

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 377

tcgtcggcag cgtcagatgt gtataagaga caggttgtgc caaccaccat aga 53

<210> 378

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 378

tcgtcggcag cgtcagatgt gtataagaga cagcatatag tgaaccgcca caca 54

<210> 379

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 379

tcgtcggcag cgtcagatgt gtataagaga caggatgagg ttccacctgg ttta 54

<210> 380

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 380

tcgtcggcag cgtcagatgt gtataagaga caggaaacac acaacagcat cgtc 54

<210> 381

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 381

tcgtcggcag cgtcagatgt gtataagaga caggcatcgt cagagagtat catcatt 57

<210> 382

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 382

tcgtcggcag cgtcagatgt gtataagaga cagattcttg atggatctgg gtaagg 56

<210> 383

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 383

tcgtcggcag cgtcagatgt gtataagaga cagccctagg attcttgatg gatctg 56

<210> 384

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 384

tcgtcggcag cgtcagatgt gtataagaga cagctcaggt tcccaatacc ttgaa 55

<210> 385

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 385

tcgtcggcag cgtcagatgt gtataagaga caggttccca ataccttgaa gtgttatc 58

<210> 386

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 386

tcgtcggcag cgtcagatgt gtataagaga cagggtcgta acagcattta caacataa 58

<210> 387

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 387

tcgtcggcag cgtcagatgt gtataagaga cagacggatt aacagacaag actaa 55

<210> 388

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 388

tcgtcggcag cgtcagatgt gtataagaga cagaacttgt ccattagcac acaatg 56

<210> 389

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 389

tcgtcggcag cgtcagatgt gtataagaga cagagctcat acctcctaag taaagttg 58

<210> 390

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 390

tcgtcggcag cgtcagatgt gtataagaga cagggttaag tggtggtcta ggttta 56

<210> 391

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 391

tcgtcggcag cgtcagatgt gtataagaga caggagtgtt gggtataagc cagtaa 56

<210> 392

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 392

tcgtcggcag cgtcagatgt gtataagaga cagtgggtat aagccagtaa ttctaaca 58

<210> 393

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 393

tcgtcggcag cgtcagatgt gtataagaga cagcgagcac gtgcaggtat aat 53

<210> 394

<211> 51

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 394

tcgtcggcag cgtcagatgt gtataagaga caggctgtcg tctcaggcaa t 51

<210> 395

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 395

tcgtcggcag cgtcagatgt gtataagaga cagggttcta gtgtgccctt agttag 56

<210> 396

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 396

tcgtcggcag cgtcagatgt gtataagaga caggtcaaca atttcagcag gacaa 55

<210> 397

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 397

tcgtcggcag cgtcagatgt gtataagaga cagcatattc tgagccctgt gatgaa 56

<210> 398

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 398

tcgtcggcag cgtcagatgt gtataagaga caggaatcaa cagtttgagt tggtagtc 58

<210> 399

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 399

tcgtcggcag cgtcagatgt gtataagaga cagtaggtgc ctgtgtagga tgta 54

<210> 400

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 400

tcgtcggcag cgtcagatgt gtataagaga caggccaggt atgtcaacac ataaac 56

<210> 401

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 401

tcgtcggcag cgtcagatgt gtataagaga cagcctcgac atcgaagcca atc 53

<210> 402

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 402

tcgtcggcag cgtcagatgt gtataagaga cagaacagct tctctagtag catgac 56

<210> 403

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 403

tcgtcggcag cgtcagatgt gtataagaga cagagtcctt tgtacataag tggtatga 58

<210> 404

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 404

tcgtcggcag cgtcagatgt gtataagaga caggcggtgg tttagcacta act 53

<210> 405

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 405

tcgtcggcag cgtcagatgt gtataagaga cagaagcatg tggcacgtct atc 53

<210> 406

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 406

tcgtcggcag cgtcagatgt gtataagaga cagcataagt gtctgaagca gtgga 55

<210> 407

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 407

tcgtcggcag cgtcagatgt gtataagaga cagtagtcca gtcaacacgc ttaac 55

<210> 408

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 408

tcgtcggcag cgtcagatgt gtataagaga caggccgcat taatcttcag ttcatc 56

<210> 409

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 409

tcgtcggcag cgtcagatgt gtataagaga cagtgtcaga atgtgtggca taaga 55

<210> 410

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 410

tcgtcggcag cgtcagatgt gtataagaga cagtgtgaat ttgtcagaat gtgtgg 56

<210> 411

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 411

tcgtcggcag cgtcagatgt gtataagaga cagctcacat ggactgtcag agtaatag 58

<210> 412

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 412

tcgtcggcag cgtcagatgt gtataagaga cagcgtagca gactttagtg gtacat 56

<210> 413

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 413

tcgtcggcag cgtcagatgt gtataagaga cagctggtac ttcaccctgt tgtc 54

<210> 414

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 414

tcgtcggcag cgtcagatgt gtataagaga cagtcaccct gttgtccatc aaa 53

<210> 415

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 415

tcgtcggcag cgtcagatgt gtataagaga cagatgtgct ggagcatctc ttt 53

<210> 416

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 416

tcgtcggcag cgtcagatgt gtataagaga cagtcagttg gtttcttggc tatgt 55

<210> 417

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 417

tcgtcggcag cgtcagatgt gtataagaga caggggacct acagatggtt gtaaa 55

<210> 418

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 418

tcgtcggcag cgtcagatgt gtataagaga caggactagc ttgtttggga cctac 55

<210> 419

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 419

tcgtcggcag cgtcagatgt gtataagaga cagttccatt tgactcctgg gttta 55

<210> 420

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 420

tcgtcggcag cgtcagatgt gtataagaga cagtgactcc tgggtttaaa ttcttgta 58

<210> 421

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 421

tcgtcggcag cgtcagatgt gtataagaga cagacgttta gctagtccaa tcagtag 57

<210> 422

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 422

tcgtcggcag cgtcagatgt gtataagaga cagagtagat gtaaaccacc taactgac 58

<210> 423

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 423

tcgtcggcag cgtcagatgt gtataagaga caggccatct ttacaccaaa gcataa 56

<210> 424

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 424

tcgtcggcag cgtcagatgt gtataagaga cagtctacat ggccatcttt acacc 55

<210> 425

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 425

tcgtcggcag cgtcagatgt gtataagaga caggggtaca gctaatgtta atgtgttt 58

<210> 426

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 426

tcgtcggcag cgtcagatgt gtataagaga cagctttatc agaaccagca ccaaa 55

<210> 427

<211> 59

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 427

tcgtcggcag cgtcagatgt gtataagaga caggtcttag ggtcgtacat atcactaat 59

<210> 428

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 428

tcgtcggcag cgtcagatgt gtataagaga cagatctatt tgttcgcgtg gtttg 55

<210> 429

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 429

tcgtcggcag cgtcagatgt gtataagaga cagcgtggtt tgccaagata attaca 56

<210> 430

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 430

tcgtcggcag cgtcagatgt gtataagaga cagtttaaag acataacagc agtaccc 57

<210> 431

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 431

tcgtcggcag cgtcagatgt gtataagaga cagctgacta gagactagtg gcaataaa 58

<210> 432

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 432

tcgtcggcag cgtcagatgt gtataagaga cagacaccac gtgtgaaaga attag 55

<210> 433

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 433

tcgtcggcag cgtcagatgt gtataagaga cagggacagg gttatcaaac ctctta 56

<210> 434

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 434

tcgtcggcag cgtcagatgt gtataagaga cagaaatggt aggacagggt tatca 55

<210> 435

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 435

tcgtcggcag cgtcagatgt gtataagaga cagctctgaa ctcactttcc atcca 55

<210> 436

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 436

tcgtcggcag cgtcagatgt gtataagaga cagtcgcact agaataaact ctgaact 57

<210> 437

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 437

tcgtcggcag cgtcagatgt gtataagaga cagctaaatt aataggcgtg tgcttaga 58

<210> 438

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 438

tcgtcggcag cgtcagatgt gtataagaga caggctgtcc aacctgaaga ag 52

<210> 439

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 439

tcgtcggcag cgtcagatgt gtataagaga caggtttctg agagagggtc aagtg 55

<210> 440

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 440

tcgtcggcag cgtcagatgt gtataagaga cagaggagac actccataac acttaaa 57

<210> 441

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 441

tcgtcggcag cgtcagatgt gtataagaga cagtgatgcg gaattatata ggacagaa 58

<210> 442

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 442

tcgtcggcag cgtcagatgt gtataagaga cagaccacca accttagaat caaga 55

<210> 443

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 443

tcgtcggcag cgtcagatgt gtataagaga cagagttgct ggtgcatgta gaa 53

<210> 444

<211> 60

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 444

tcgtcggcag cgtcagatgt gtataagaga caggtactac tactctgtat ggttggtaac 60

<210> 445

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 445

tcgtcggcag cgtcagatgt gtataagaga caggatcacg gacagcatca gtag 54

<210> 446

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 446

tcgtcggcag cgtcagatgt gtataagaga caggaatctc aagtgtctgt ggatca 56

<210> 447

<211> 50

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 447

tcgtcggcag cgtcagatgt gtataagaga cagagcctgc acgtgtttga 50

<210> 448

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 448

tcgtcggcag cgtcagatgt gtataagaga cagtatacct gcaccaatgg gtatg 55

<210> 449

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 449

tcgtcggcag cgtcagatgt gtataagaga cagttgtggg tatggcaata gagtt 55

<210> 450

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 450

tcgtcggcag cgtcagatgt gtataagaga cagagacact ggtagaattt ctgtgg 56

<210> 451

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 451

tcgtcggcag cgtcagatgt gtataagaga cagtttgtct tgttcaacag ctattcc 57

<210> 452

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 452

tcgtcggcag cgtcagatgt gtataagaga caggaggtct ctagcagcaa tatcac 56

<210> 453

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 453

tcgtcggcag cgtcagatgt gtataagaga cagaccaaag gtccaaccag aag 53

<210> 454

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 454

tcgtcggcag cgtcagatgt gtataagaga cagtgcactt gctgtggaag aa 52

<210> 455

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 455

tcgtcggcag cgtcagatgt gtataagaga cagagcgtgt ttaaagcttg tgc 53

<210> 456

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 456

tcgtcggcag cgtcagatgt gtataagaga caggtctgcc tgtgatcaac ctatc 55

<210> 457

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 457

tcgtcggcag cgtcagatgt gtataagaga cagctgactg agggaaggac ataag 55

<210> 458

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 458

tcgtcggcag cgtcagatgt gtataagaga cagaagacac cttcacgagg aaag 54

<210> 459

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 459

tcgtcggcag cgtcagatgt gtataagaga cagacaacat cacagttacc agaca 55

<210> 460

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 460

tcgtcggcag cgtcagatgt gtataagaga caggagtcta attcaggttg caaagg 56

<210> 461

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 461

tcgtcggcag cgtcagatgt gtataagaga cagcattgag gcggtcaatt tcttt 55

<210> 462

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 462

tcgtcggcag cgtcagatgt gtataagaga caggcaactg gtcatacagc aaag 54

<210> 463

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 463

tcgtcggcag cgtcagatgt gtataagaga cagctgaagg agtagcatcc ttgatt 56

<210> 464

<211> 51

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 464

tcgtcggcag cgtcagatgt gtataagaga cagtgcagta gcgcgaacaa a 51

<210> 465

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 465

tcgtcggcag cgtcagatgt gtataagaga caggaagtgc aacgccaaca ataa 54

<210> 466

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 466

tcgtcggcag cgtcagatgt gtataagaga cagccaacaa taagccatcc gaaag 55

<210> 467

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 467

tcgtcggcag cgtcagatgt gtataagaga caggcaaagc caaagcctca ttatt 55

<210> 468

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 468

tcgtcggcag cgtcagatgt gtataagaga caggaacggc atttccagca aag 53

<210> 469

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 469

tcgtcggcag cgtcagatgt gtataagaga cagcaacacc agtgtctgta ctcaa 55

<210> 470

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 470

tcgtcggcag cgtcagatgt gtataagaga cagacagctg gtaatagtct gaagtg 56

<210> 471

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 471

tcgtcggcag cgtcagatgt gtataagaga caggtcgtcg tcggttcatc ataa 54

<210> 472

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 472

tcgtcggcag cgtcagatgt gtataagaga cagcgtacct gtctcttccg aaac 54

<210> 473

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 473

tcgtcggcag cgtcagatgt gtataagaga cagcagcagt acgcacacaa tc 52

<210> 474

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 474

tcgtcggcag cgtcagatgt gtataagaga cagcgttaac aatattgcag cagtacg 57

<210> 475

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 475

tcgtcggcag cgtcagatgt gtataagaga caggtaccgt tggaatctgc cat 53

<210> 476

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 476

tcgtcggcag cgtcagatgt gtataagaga caggttccat tgttcaagga gcttt 55

<210> 477

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 477

tcgtcggcag cgtcagatgt gtataagaga caggcgcaaa cagtctgaaa gaag 54

<210> 478

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 478

tcgtcggcag cgtcagatgt gtataagaga cagggattga atgaccacat ggaac 55

<210> 479

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 479

tcgtcggcag cgtcagatgt gtataagaga cagaatcctg tagcgactgt atgc 54

<210> 480

<211> 53

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 480

tcgtcggcag cgtcagatgt gtataagaga cagaaacctg agtcacctgc tac 53

<210> 481

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 481

tcgtcggcag cgtcagatgt gtataagaga cagtctccat tggttgctct tcatc 55

<210> 482

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 482

tcgtcggcag cgtcagatgt gtataagaga cagcgagtgt tatcagtgcc aaga 54

<210> 483

<211> 57

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 483

tcgtcggcag cgtcagatgt gtataagaga cagtcttgaa cttcctcttg tctgatg 57

<210> 484

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 484

tcgtcggcag cgtcagatgt gtataagaga caggatctgg cacgtaactg atagac 56

<210> 485

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 485

tcgtcggcag cgtcagatgt gtataagaga cagcgtttag gcgtgacaag tttc 54

<210> 486

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 486

tcgtcggcag cgtcagatgt gtataagaga caggtgaaat gcagctacag ttgtg 55

<210> 487

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 487

tcgtcggcag cgtcagatgt gtataagaga cagacaacgc actacaagac tacc 54

<210> 488

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 488

tcgtcggcag cgtcagatgt gtataagaga cagtcgatgt actgaatggg tgattt 56

<210> 489

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 489

tcgtcggcag cgtcagatgt gtataagaga caggcgttct ccattctggt tact 54

<210> 490

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 490

tcgtcggcag cgtcagatgt gtataagaga caggggtgca tttcgctgat tt 52

<210> 491

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 491

tcgtcggcag cgtcagatgt gtataagaga cagtctggta gctcttcggt agtag 55

<210> 492

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 492

tcgtcggcag cgtcagatgt gtataagaga cagaccatct tggactgaga tctttc 56

<210> 493

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 493

tcgtcggcag cgtcagatgt gtataagaga caggcacgat tgcagcattg ttag 54

<210> 494

<211> 55

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 494

tcgtcggcag cgtcagatgt gtataagaga cagtttggca atgttgttcc ttgag 55

<210> 495

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 495

tcgtcggcag cgtcagatgt gtataagaga caggctctca agctggttca atct 54

<210> 496

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 496

tcgtcggcag cgtcagatgt gtataagaga cagctgtcaa gcagcagcaa ag 52

<210> 497

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 497

tcgtcggcag cgtcagatgt gtataagaga cagttgcggc caatgtttgt aatc 54

<210> 498

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 498

tcgtcggcag cgtcagatgt gtataagaga cagccttgtc tgattagttc ctggtc 56

<210> 499

<211> 52

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 499

tcgtcggcag cgtcagatgt gtataagaga caggctctgt tggtgggaat gt 52

<210> 500

<211> 58

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 500

tcgtcggcag cgtcagatgt gtataagaga caggaattca ttctgcacaa gagtagac 58

<210> 501

<211> 54

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 501

tcgtcggcag cgtcagatgt gtataagaga cagcagctct ccctagcatt gttc 54

<210> 502

<211> 56

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 502

tcgtcggcag cgtcagatgt gtataagaga cagcattagg gctcttccat ataggc 56

<210> 503

<211> 39

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 503

caagcagaag acggcatacg agatgtctcg tgggctcgg 39

<210> 504

<211> 43

<212> DNA

<213> Artificial sequence (Artificial Sequence)

<400> 504

aatgatacgg cgaccaccga gatctacact cgtcggcagc gtc 43

Claims

1. A method for constructing a novel coronavirus whole-genome high-throughput sequencing library, characterized by comprising the following steps:

B. based on the published genome sequence of the novel coronavirus COVID-19, designing tiled, full-coverage primers, namely a multiplex amplification primer set 1 anchored with a partial Illumina adaptor sequence and a multiplex amplification primer set 2 anchored with a partial Illumina adaptor sequence; performing a first round of PCR with primer set 1 and with primer set 2 separately, using single-stranded cDNA or double-stranded cDNA as the template; and mixing the amplification products in equimolar amounts so that they cover the whole viral genome;

b1. designing a multiplex specific amplification primer set I and a multiplex specific amplification primer set II according to the genome sequence of the novel coronavirus COVID-19, wherein primer set I comprises a forward primer pool F and a reverse primer pool R, primer set II comprises a forward primer pool F' and a reverse primer pool R', and each pair of forward and reverse primers corresponds to one amplicon; the forward primer and the reverse primer of a primer set II pair are designed within two adjacent amplicon sequences of primer set I, the forward primer and the reverse primer of a primer set I pair are designed within two adjacent amplicon sequences of primer set II, and so on, until the amplicons of primer set I and the amplicons of primer set II cover the whole viral genome in a tiled manner;

the 3'-end sequence of the I7 tagged primer is 9-15 bp long and the 3'-end sequence of the I5 tagged primer is 8-14 bp long, so that the I7 tagged primer and the I5 tagged primer can anneal specifically to their 3'-end binding positions on the amplicon;

the multiplex amplification primer set 1 anchored with a partial Illumina adaptor sequence and the multiplex amplification primer set 2 anchored with a partial Illumina adaptor sequence in step B comprise 250 primer pairs, wherein the forward primers are COV-1-F to COV-250-F, whose nucleotide sequences are shown in SEQ ID NO. 3 to 252 respectively, and the reverse primers are COV-1-R to COV-250-R, whose nucleotide sequences are shown in SEQ ID NO. 253 to 502 respectively; COV-1-F and COV-1-R form one primer pair, COV-2-F and COV-2-R form another, and so on.
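The interleaved two-pool layout of step b1 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the patentees' design procedure: the function name `tile_two_pools`, the fixed 250 bp window and the fixed 150 bp step are hypothetical, and a real design would shift each window to find usable primer sites.

```python
from typing import List, Tuple

Amplicon = Tuple[int, int, str]  # (start, end, pool); 0-based, end exclusive


def tile_two_pools(genome_len: int, amplicon_len: int = 250, step: int = 150) -> List[Amplicon]:
    """Lay windows along the genome and assign them alternately to pool I and pool II.

    Assumption: a step of at least half the amplicon length keeps amplicons of the
    same pool from overlapping, while a step shorter than the amplicon length makes
    neighbouring amplicons (which sit in different pools) overlap.
    """
    if genome_len <= 0:
        raise ValueError("genome_len must be positive")
    if not amplicon_len / 2 <= step < amplicon_len:
        raise ValueError("step must satisfy amplicon_len / 2 <= step < amplicon_len")
    amplicons: List[Amplicon] = []
    start = 0
    while True:
        # The final window is simply truncated at the genome end in this sketch.
        end = min(start + amplicon_len, genome_len)
        pool = "I" if len(amplicons) % 2 == 0 else "II"
        amplicons.append((start, end, pool))
        if end == genome_len:
            break
        start += step
    return amplicons


# Hypothetical 1000 nt target, used only to show the alternating layout.
for start, end, pool in tile_two_pools(1000):
    print(f"pool {pool}: {start}-{end}")
```

With these illustrative values each gap between two pool I amplicons lies inside a pool II amplicon and vice versa, which is the property that lets the two separately amplified pools cover the target once their products are mixed.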

2. The method of claim 1, wherein the Tm difference threshold for each primer pair in step B is ±2 °C; and/or

the amplicon size is 200-300 bp; and/or

primer pairs that can form dimers or stem-loop structures, between or within primers, are removed during primer design; and/or

within the same multiplex specific amplification primer set, the 5' end of the reverse primer of an upstream amplicon is located upstream of the 5' end of the forward primer of the downstream amplicon on the genome.
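The design criteria of claim 2 lend themselves to a simple automated check. The Python sketch below is a hypothetical illustration: the `PrimerPair` record, the `check_pool` function and all coordinates and Tm values are assumptions. It reads the Tm criterion as the forward/reverse difference within a pair, which is only one possible reading, and it measures amplicon size as the genomic span without adaptors. Dimer and stem-loop screening is not covered.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrimerPair:
    name: str
    start: int         # genomic position of the forward primer 5' end (0-based)
    end: int           # position just past the reverse primer 5' end (exclusive)
    tm_forward: float  # melting temperature supplied by the caller, in deg C
    tm_reverse: float


def check_pool(pairs: List[PrimerPair], max_tm_diff: float = 2.0,
               min_size: int = 200, max_size: int = 300) -> List[str]:
    """Return a list of human-readable problems for one multiplex pool."""
    problems: List[str] = []
    ordered = sorted(pairs, key=lambda p: p.start)
    for p in ordered:
        size = p.end - p.start
        if not min_size <= size <= max_size:
            problems.append(f"{p.name}: amplicon size {size} bp outside {min_size}-{max_size} bp")
        if abs(p.tm_forward - p.tm_reverse) > max_tm_diff:
            problems.append(f"{p.name}: Tm difference exceeds {max_tm_diff} degC")
    # Within one pool, an upstream amplicon must end before the next one starts.
    for up, down in zip(ordered, ordered[1:]):
        if up.end > down.start:
            problems.append(f"{up.name}/{down.name}: amplicons overlap within the pool")
    return problems


# Hypothetical pool with one oversized, Tm-mismatched, overlapping amplicon.
pool = [PrimerPair("X-1", 0, 320, 60.0, 63.0), PrimerPair("X-2", 300, 550, 60.0, 60.0)]
for line in check_pool(pool):
    print(line)
```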

3. The method according to claim 1, wherein the method for reverse transcription of RNA into single-stranded cDNA in step A is selected from a or b:

a. priming single-stranded cDNA synthesis with 6-10 bp random primers;

b. mixing a plurality of primers from the reverse primer pool R and the reverse primer pool R' to form a specific reverse transcription primer set that primes single-stranded cDNA synthesis, wherein the reverse primers are evenly distributed along the 3'-5' direction of the viral genome at intervals of 800-1000 bp;

i. priming single-stranded cDNA synthesis with 6-10 bp random primers;

ii. nicking the RNA-cDNA hybrid duplex with RNase H in the presence of dNTPs;
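Option b of claim 3 selects reverse primers spaced 800-1000 bp apart along the genome. One way this selection might be done is sketched below in Python. The greedy strategy, the function name `pick_rt_primers` and the example positions are assumptions made for illustration; the source does not describe the selection procedure.

```python
from typing import Iterable, List


def pick_rt_primers(positions: Iterable[int], min_gap: int = 800, max_gap: int = 1000) -> List[int]:
    """Greedily pick reverse-primer positions from the 3' end toward the 5' end.

    `positions` are genomic coordinates of candidate reverse primers. Starting at
    the most 3' candidate, the next primer taken is the first one lying at least
    `min_gap` upstream; an error is raised if that primer is more than `max_gap` away.
    """
    remaining = sorted(positions, reverse=True)  # most 3' candidate first
    if not remaining:
        return []
    chosen = [remaining[0]]
    for pos in remaining[1:]:
        gap = chosen[-1] - pos
        if gap >= min_gap:
            if gap > max_gap:
                raise ValueError(f"no candidate within {min_gap}-{max_gap} bp of {chosen[-1]}")
            chosen.append(pos)
    return chosen


# Hypothetical candidates every 150 bp, as a dense tiled primer pool might provide.
print(pick_rt_primers(range(250, 5000, 150)))
```

With candidates available every 150 bp, the first candidate at or beyond 800 bp always falls inside the 800-1000 bp window, so the error branch only triggers when the candidate pool is sparse.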

4. The method of claim 1, wherein the tagged Illumina library amplification primers in step C are as follows:

5. The method of claim 4, wherein the partial Illumina adaptor sequence (1) in step B is as follows: 5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3';

the partial Illumina adaptor sequence (2) is as follows: 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3'.
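Each anchored primer in the sequence listing is a partial adaptor sequence followed by a target-specific portion, and the two parts can be separated mechanically. The Python sketch below illustrates this with SEQ ID NO. 355 as printed in the listing; the helper name `split_anchored_primer` is hypothetical, and the two adaptor constants are copied from claim 5.

```python
from typing import Tuple

# Partial Illumina adaptor sequences as given in claim 5.
ADAPTOR_1 = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
ADAPTOR_2 = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"


def split_anchored_primer(primer: str) -> Tuple[str, str]:
    """Return (adaptor label, target-specific portion) for an anchored primer.

    Whitespace from the sequence-listing layout is removed and case is ignored.
    """
    seq = "".join(primer.split()).upper()
    for label, adaptor in (("adaptor 1", ADAPTOR_1), ("adaptor 2", ADAPTOR_2)):
        if seq.startswith(adaptor) and len(seq) > len(adaptor):
            return label, seq[len(adaptor):]
    raise ValueError("primer does not start with either partial adaptor sequence")


# SEQ ID NO. 355, copied from the sequence listing (54 nt in total).
label, specific = split_anchored_primer(
    "tcgtcggcag cgtcagatgt gtataagaga caggtcgtag tgcaacagga ctaa"
)
print(label, specific, len(specific))
```

For SEQ ID NO. 355 the 33 nt prefix matches partial adaptor sequence (2), leaving a 21 nt target-specific portion.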

6. The method according to claim 1, wherein the primer information for the multiplex amplification primer set 1 anchored with a partial Illumina adaptor sequence in step B is as follows:

the primer information for the multiplex amplification primer set 2 anchored with a partial Illumina adaptor sequence is as follows:

7. The method of any one of claims 1-6, wherein the virus sample is derived from a throat swab, alveolar lavage fluid, or culture supernatant isolated after viral infection of cells.

8. A kit for constructing a novel coronavirus whole-genome high-throughput sequencing library, characterized in that it comprises the multiplex amplification primer set 1 anchored with a partial Illumina adaptor sequence and the multiplex amplification primer set 2 anchored with a partial Illumina adaptor sequence used in the method according to any one of claims 1 to 7, together with the tagged Illumina library amplification primers, and further comprises reagents for library construction.

9. Use of the method of any one of claims 1-7 in the preparation of a kit for detecting novel coronavirus variants, said use comprising:

(1) constructing a whole-genome high-throughput sequencing library of the novel coronavirus to be tested according to the method of any one of claims 1-7;

(3) performing bioinformatic analysis and detecting mutation sites.

10. The use of claim 9, wherein step (3) comprises the sub-steps of:

2) read quality control: filtering the paired-end reads and performing quality control with SOAPnuke to obtain clean reads, wherein reads meeting any of the following conditions are removed: condition 1: reads contaminated with adaptor sequence; condition 2: reads containing more than 10% N bases; condition 3: reads in which low-quality bases, defined as Q < 38, make up more than 50% of the bases;