CN111334868A - Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction - Google Patents

Publication number
CN111334868A
Authority
CN
China
Prior art keywords
artificial sequence
dna
primer
illumina
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010225821.0A
Other languages
Chinese (zh)
Other versions
CN111334868B (en)
Inventor
王洋
李�杰
王辰
高汉林
郭超
王健伟
任丽丽
杨明
刘静
赵晔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou Furui Medical Laboratory Co ltd
Chinese Academy of Medical Sciences CAMS
Original Assignee
Fuzhou Furui Medical Laboratory Co ltd
Chinese Academy of Medical Sciences CAMS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou Furui Medical Laboratory Co ltd, Chinese Academy of Medical Sciences CAMS
Priority to CN202010225821.0A
Publication of CN111334868A
Application granted
Publication of CN111334868B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00 Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/08 Liquid phase synthesis, i.e. wherein all library building blocks are in liquid phase or in solution during library creation; Particular methods of cleavage from the liquid support
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869 Methods for sequencing
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/70 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage
    • C12Q1/701 Specific hybridization probes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/30 Detection of binding sites or motifs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Abstract

The invention provides a method for constructing a whole-genome high-throughput sequencing library of the novel coronavirus and a kit for constructing the library. The method comprises the following steps: 1) reverse transcription of viral RNA; 2) a first round of PCR using multiplex amplification primers anchored with partial Illumina linker sequences; 3) a second round of PCR using tagged Illumina library amplification primers, followed by purification of the amplification product to obtain the high-throughput sequencing library. The anchored multiplex amplification primer combination provided by the invention enables efficient targeted enrichment of the genome of the novel coronavirus COVID-19. It overcomes the low targeting specificity, poor timeliness and susceptibility to host background contamination of existing methods, and makes it possible to complete whole-genome sequencing of COVID-19 in a short time with a smaller sequencing data volume and lower cost, thereby enabling differential diagnosis of the COVID-19 virus and identification of viral mutations.

Description

Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction
Technical Field
The invention relates to the technical field of biology, in particular to a construction method of a novel coronavirus whole genome high-throughput sequencing library and a kit for constructing the library.
Background
The novel coronavirus COVID-19 (severe acute respiratory syndrome coronavirus 2, SARS-CoV-2) belongs to the genus Betacoronavirus. Like other known coronavirus genomes, the COVID-19 genome comprises 6 main open reading frames (ORFs), namely ORF1ab, ORF3a, ORF6, ORF7a, ORF8 and ORF10, together with other accessory genes, namely the S, E, M and N genes. High-depth whole-genome sequencing of the virus is currently among the most sensitive and specific methods for obtaining the complete set of nucleic acid mutations in the viral genome and for studying virus typing and evolutionary relationships. However, owing to factors such as host nucleic acid background interference, the current mainstream schemes for high-throughput sequencing of viral whole genomes all require a large volume of sequencing data, are costly, and have poor timeliness.
At present, most COVID-19 detection kits approved by the National Medical Products Administration (NMPA) of China are based on TaqMan-probe fluorescence real-time quantitative PCR (qRT-PCR), IgG/IgM colloidal gold antibody detection, or IgM antibody detection by magnetic particle chemiluminescence. The qRT-PCR method is highly specific and can complete relative quantification of virus-positive samples within 2 h. However, RNA viruses often mutate much faster than other types of virus; once the probe-binding or specific primer-binding positions on the viral genome mutate, and given the influence of factors such as the quality of the extracted viral RNA, the experimental procedure and operator handling, sensitivity is low (false negatives are frequent), inconclusive results such as unstable Ct values or Ct values greater than 40 readily occur, and the increase in false negatives raises the rates of misdiagnosis and missed diagnosis. IgG/IgM colloidal gold antibody detection, which is based on the immunological antigen-antibody reaction, is extremely fast but has a high false-positive rate and requires subsequent clinical diagnostic support. CT examination, used as the gold standard for clinical confirmation, depends on large instruments, which makes general screening difficult. Droplet digital PCR (ddPCR), based on water-in-oil droplet technology, is highly specific and sensitive, but it has low throughput and high cost, and its detection rate is likewise affected when mutations occur at the primer-binding sites.
In high-throughput sequencing studies of microbial and viral genomes, particularly of RNA viruses, RNA-seq is the main technique used to sequence the whole viral genome: host ribosomal RNA (rRNA) is depleted from the total RNA isolated from the host, and the resulting next-generation sequencing library is sequenced. The drawback is that the host genome and transcriptome material carried over during virus isolation means that reads from the viral genome account for only a very small fraction (0.01-0.1%) of the next-generation sequencing data, so the requirement for input RNA is high. In addition, during sequence alignment and assembly in the subsequent bioinformatics analysis, a certain proportion of gaps often remains, coverage and sequencing depth are insufficient in some regions of the viral genome, and overall coverage is inadequate; the required output data volume is large (often more than 10 G of data, depending on the abundance of the virus in the host and the size of the virus), the experimental cost is high, and timeliness is poor. A recent study published in the journal Nature performed RNA-seq on a sample of the novel coronavirus isolated from a human and analysed it by metagenomics; of all 10038758 sequencing reads obtained from the run, only 1582 remained for subsequent COVID-19 analysis after reads originating from the human host were filtered out.
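As a quick arithmetic check of the figures quoted above, the following Python sketch computes the share of sequencing reads that remained for viral analysis. The variable names are illustrative; only the two read counts come from the text.

```python
# Read counts quoted in the paragraph above (RNA-seq of a human isolate).
total_reads = 10_038_758   # all sequencing reads obtained from the run
viral_reads = 1_582        # reads left after host-derived reads were filtered out

# Share of the run that was usable for viral analysis, as a percentage.
viral_percent = 100 * viral_reads / total_reads

print(f"{viral_percent:.4f}% of reads were retained for viral analysis")
```

The result is roughly 0.016%, which falls inside the 0.01-0.1% range stated above for viral reads in RNA-seq data.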
Whole-genome sequencing of viruses through a targeted liquid-phase hybridization capture system offers higher specificity and a low data volume requirement. Its disadvantages are a high requirement for input target cDNA, a risk that the viral genome is not captured, high probe design cost, poor timeliness (hybridization capture alone takes more than 12 h) and low suitability for clinical translation. A literature search shows that high-throughput whole-genome sequencing of RNA viruses using Targeted Multiplex-PCR Sequencing (TMS) has rarely been reported.
To fill this technical gap, a rapid differential diagnosis technology for whole-genome mutations of the novel coronavirus (COVID-19), based on targeted multiplex polymerase chain reaction amplicon sequencing (Targeted Multiplexed Amplicon-seq), and a corresponding kit are provided. The method is not affected by the host genome; it has strong targeting specificity for COVID-19, high coverage and uniformity, and a low requirement for the starting amount of sample. Compared with existing high-throughput viral sequencing methods, it greatly reduces experimental and sequencing costs and greatly improves timeliness, and it enables highly sensitive, accurate and comprehensive differential diagnosis of the COVID-19 virus in biological samples such as throat swabs and alveolar lavage fluid and in virus culture samples.
Disclosure of Invention
The invention aims to provide a method for constructing a whole-genome high-throughput sequencing library of the novel coronavirus based on targeted multiplex polymerase chain reaction amplicon sequencing (Targeted Multiplexed Amplicon-seq), and a kit for constructing the library.
Another object of the present invention is to provide the use of the above method for the detection of variants of novel coronaviruses.
In order to achieve the object, in a first aspect, the invention provides a method for constructing a novel coronavirus whole genome high-throughput sequencing library, comprising the following steps:
A. extracting RNA from the virus sample and performing reverse transcription to obtain single-stranded cDNA or double-stranded cDNA;
B. designing overlapping, tiled, full-coverage primers according to a published genome sequence of the novel coronavirus COVID-19, namely a multiplex amplification primer set 1 anchored with partial Illumina linker sequences and a multiplex amplification primer set 2 anchored with partial Illumina linker sequences (anchored multiplex amplification primer set 1 and anchored multiplex amplification primer set 2); performing a first round of PCR separately with primer set 1 and with primer set 2, using the single-stranded or double-stranded cDNA as template; and mixing the amplification products in equimolar amounts to cover the whole viral genome;
C. performing a second round of PCR with tagged Illumina library amplification primers, using the amplification products mixed in step B as template, and purifying the amplification product to obtain the high-throughput sequencing library;
wherein the multiplex amplification primer set 1 anchored with partial Illumina linker sequences and the multiplex amplification primer set 2 anchored with partial Illumina linker sequences in step B are designed as follows:
B1. designing non-anchored multiplex amplification primer sets according to the genome sequence of the novel coronavirus COVID-19, namely multiplex specific amplification primer set I and multiplex specific amplification primer set II, wherein primer set I comprises a forward primer pool F and a reverse primer pool R, primer set II comprises a forward primer pool F' and a reverse primer pool R', and each forward/reverse primer pair corresponds to one amplicon; the forward primer and the reverse primer of each primer set II pair are designed within two adjacent amplicon sequences of primer set I, respectively, and the forward primer and the reverse primer of each primer set I pair are designed within two adjacent amplicon sequences of primer set II, respectively, and so on, until the amplicons of primer set I and the amplicons of primer set II together cover the whole viral genome in an overlapping, tiled manner;
B2. adding Illumina partial linker sequence ① to the 5' end of each forward primer in the 5'-3' direction, and Illumina partial linker sequence ② to the 5' end of each reverse primer in the 5'-3' direction; forward primer pool F carrying Illumina partial linker sequence ① and reverse primer pool R carrying Illumina partial linker sequence ② constitute multiplex amplification primer set 1 anchored with partial Illumina linker sequences, and forward primer pool F' carrying Illumina partial linker sequence ① and reverse primer pool R' carrying Illumina partial linker sequence ② constitute multiplex amplification primer set 2 anchored with partial Illumina linker sequences;
wherein Illumina partial linker sequence ① is: 5'-[3' terminal sequence of the i7 tagged primer]-AGATGTGTATAAGAGACAG-3';
Illumina partial linker sequence ② is: 5'-[3' terminal sequence of the i5 tagged primer]-AGATGTGTATAAGAGACAG-3';
and the 3' terminal sequence of the i7 tagged primer is 9-15 bp long and the 3' terminal sequence of the i5 tagged primer is 8-14 bp long, so that the i7 tagged primer and the i5 tagged primer can anneal specifically to their 3' terminal binding positions on the amplicon.
In the above method, the Tm values of the primers in step B differ by no more than ±2 °C; and/or
the amplicon size is 200-300 bp; and/or
primer pairs that can form primer dimers or stem-loop structures between or within primers are removed during primer design; and/or
within the same multiplex specific amplification primer set, the 5' end of the reverse primer of the upstream amplicon on the genome lies upstream of the 5' end of the forward primer of the downstream amplicon, so that short-fragment by-products do not form and compete in the PCR.
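The tiling and pooling constraints above can be illustrated with a minimal Python sketch. It is not the design procedure of the invention: the `Amplicon` type, the helper names, the 250 bp amplicon size, the 200 bp step and the 2050 bp toy genome length are all assumptions chosen for illustration, and Tm, dimer and stem-loop screening are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Amplicon:
    start: int  # 0-based position of the 5' end of the forward primer
    end: int    # exclusive end; end - 1 is the 5' end of the reverse primer


def tile_amplicons(genome_len, size=250, step=200):
    """Lay out overlapping amplicons from the 5' end to the 3' end."""
    amplicons = []
    start = 0
    while start + size < genome_len:
        amplicons.append(Amplicon(start, start + size))
        start += step
    # Anchor the final amplicon to the 3' end so the genome is fully covered.
    amplicons.append(Amplicon(genome_len - size, genome_len))
    return amplicons


def sizes_ok(amplicons, lo=200, hi=300):
    """Every amplicon must be 200-300 bp long."""
    return all(lo <= a.end - a.start <= hi for a in amplicons)


def pool_non_overlapping(pool):
    """Within one pool, each upstream amplicon must end before the next starts."""
    ordered = sorted(pool, key=lambda a: a.start)
    return all(up.end <= down.start for up, down in zip(ordered, ordered[1:]))


def covers_genome(amplicons, genome_len):
    """True if the union of the amplicons leaves no gap from 0 to genome_len."""
    reach = 0
    for a in sorted(amplicons, key=lambda a: a.start):
        if a.start > reach:
            return False
        reach = max(reach, a.end)
    return reach >= genome_len


GENOME_LEN = 2050  # toy length, not the real genome size
amplicons = tile_amplicons(GENOME_LEN)
pool_1 = amplicons[0::2]  # odd-numbered amplicons -> primer set I
pool_2 = amplicons[1::2]  # even-numbered amplicons -> primer set II

print(sizes_ok(amplicons), pool_non_overlapping(pool_1),
      pool_non_overlapping(pool_2), covers_genome(amplicons, GENOME_LEN))
```

Alternating neighbouring amplicons between two pools is what keeps overlapping amplicons out of the same reaction, while the two pools together still tile the whole sequence.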
In the above method, the reverse transcription of RNA into single-stranded cDNA in step A is selected from a or b below:
a. priming single-stranded cDNA synthesis with 6-10 bp random primers;
b. mixing several primers from reverse primer pool R and reverse primer pool R' into a specific reverse transcription primer set to prime single-stranded cDNA synthesis, the reverse primers being evenly distributed along the 3'-5' direction of the viral genome at intervals of 800-1000 bp.
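Option b can be sketched as a simple greedy selection over reverse-primer binding positions. The function name, the primer names and the toy positions below are hypothetical; the sketch only illustrates the 800-1000 bp spacing rule and is not the selection procedure used by the invention.

```python
def pick_rt_primers(reverse_sites, min_gap=800, max_gap=1000):
    """Greedily pick reverse primers spaced min_gap..max_gap bp apart.

    reverse_sites maps a primer name to the genome position of its binding
    site. The walk starts at the 3'-most site and moves toward the 5' end.
    If no candidate falls inside the window, the walk simply stops early;
    a real design tool would need a fallback for that case.
    """
    ordered = sorted(reverse_sites.items(), key=lambda kv: kv[1], reverse=True)
    chosen = [ordered[0]]
    for name, pos in ordered[1:]:
        gap = chosen[-1][1] - pos
        if min_gap <= gap <= max_gap:
            chosen.append((name, pos))
    return [name for name, _ in chosen]


# Toy binding sites every 200 bp (hypothetical names and positions).
sites = {f"R{i}": 250 + 200 * i for i in range(10)}
rt_primers = pick_rt_primers(sites)
print(rt_primers)
```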
The reverse transcription of RNA into double-stranded cDNA in step A comprises:
i. priming single-stranded cDNA synthesis with 6-10 bp random primers;
ii. nicking the RNA-cDNA hybrid duplex with RNase H in the presence of dNTPs;
iii. synthesizing double-stranded cDNA with an RNA-dependent DNA polymerase, using the small RNA fragments generated at the nicks as primers.
In the above method, the tagged Illumina library amplification primers in step C are as follows (SEQ ID NO: 503-504):
i7 tagged primer: 5'-CAAGCAGAAGACGGCATACGAGAT(i7)GTCTCGTGGGCTCGG-3'; i5 tagged primer: 5'-AATGATACGGCGACCACCGAGATCTACAC(i5)TCGTCGGCAGCGTC-3'.
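To make the layout of the tagged primers concrete, the following Python sketch assembles a full primer from its fixed flanks and an index. The flanking sequences are taken from the text above; the helper name and the 8-nt index sequences are placeholders invented for illustration and are not real Illumina indexes.

```python
# Fixed flanks of the tagged primers, as given in the text above.
I7_PREFIX = "CAAGCAGAAGACGGCATACGAGAT"
I7_SUFFIX = "GTCTCGTGGGCTCGG"
I5_PREFIX = "AATGATACGGCGACCACCGAGATCTACAC"
I5_SUFFIX = "TCGTCGGCAGCGTC"


def build_tagged_primer(prefix, index, suffix):
    """Return prefix + index + suffix (5'->3') after validating the index."""
    index = index.upper()
    if not index or set(index) - set("ACGT"):
        raise ValueError(f"index must contain only A, C, G, T: {index!r}")
    return prefix + index + suffix


# Placeholder indexes (hypothetical, for illustration only).
i7_primer = build_tagged_primer(I7_PREFIX, "ACGTACGT", I7_SUFFIX)
i5_primer = build_tagged_primer(I5_PREFIX, "TGCATGCA", I5_SUFFIX)

print(i7_primer)
print(i5_primer)
```

The suffix of each tagged primer is the part that anneals to the anchored linker carried by the first-round PCR products, which is why it has to match the 5' portion of that linker.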
Preferably, Illumina partial linker sequence ① in step B is 5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3' (SEQ ID NO: 1);
Illumina partial linker sequence ② is 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3' (SEQ ID NO: 2).
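The anchoring step can be sketched in Python as a plain concatenation: the linker is placed at the 5' end of the target-specific primer. The two linker sequences are SEQ ID NO: 1 and SEQ ID NO: 2 from the text; the function name and the target-specific primer sequences are placeholders invented for illustration and do not come from Tables 1 and 2.

```python
LINKER_1 = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"  # SEQ ID NO: 1, for forward primers
LINKER_2 = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"   # SEQ ID NO: 2, for reverse primers
SHARED_3P = "AGATGTGTATAAGAGACAG"                # 3' part common to both linkers


def anchor_primer(linker, specific_primer):
    """Prepend the linker to the 5' end of a target-specific primer (5'->3')."""
    return linker + specific_primer.upper()


# Placeholder target-specific primers (hypothetical sequences).
forward_specific = "ACGTTGCAACGTTGCAACGT"
reverse_specific = "TTGGCCAATTGGCCAATTGG"

anchored_forward = anchor_primer(LINKER_1, forward_specific)
anchored_reverse = anchor_primer(LINKER_2, reverse_specific)

print(anchored_forward)
print(anchored_reverse)
```

Because the linker sits at the 5' end, the 3' end of each anchored primer remains the target-specific sequence that primes on the viral cDNA.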
In the above method, the multiplex amplification primer set 1 anchored with partial Illumina linker sequences and the multiplex amplification primer set 2 anchored with partial Illumina linker sequences in step B together comprise 250 primer pairs, wherein the forward primers are COV-1-F to COV-250-F, whose nucleotide sequences are shown in SEQ ID NO: 3-252, respectively, and the reverse primers are COV-1-R to COV-250-R, whose nucleotide sequences are shown in SEQ ID NO: 253-502, respectively; COV-1-F and COV-1-R form one primer pair, COV-2-F and COV-2-R form one primer pair, and so on.
Preferably, the primer information of the multiplex amplification primer set 1 anchored with partial Illumina linker sequences and the multiplex amplification primer set 2 anchored with partial Illumina linker sequences in step B is shown in Table 1 and Table 2, respectively.
Table 1. Primer information for anchored multiplex primer set 1 (provided as images in the original publication: Figures BDA0002427599160000041, BDA0002427599160000051, BDA0002427599160000061)
Table 2. Primer information for anchored multiplex primer set 2 (provided as images in the original publication: Figures BDA0002427599160000071, BDA0002427599160000081, BDA0002427599160000091)
Wherein the primer number COV-1 corresponds to the primers COV-1-F and COV-1-R, the primer number COV-2 corresponds to the primers COV-2-F and COV-2-R, and so on.
In the present invention, the virus sample may be a biological sample such as a pharyngeal swab or alveolar lavage fluid, or a culture supernatant isolated after viral infection of cells.
In a second aspect, the present invention provides a kit for constructing a whole-genome high-throughput sequencing library of the novel coronavirus, the kit comprising the multiplex amplification primer set 1 anchored with partial Illumina linker sequences and the multiplex amplification primer set 2 anchored with partial Illumina linker sequences used in the above library construction method, and the tagged Illumina library amplification primers, and optionally comprising other reagents for library construction (such as amplification enzyme reagents and the corresponding buffers).
In a third aspect, the present invention provides the use of the above library construction method in the detection of variation of a novel coronavirus, wherein the use comprises:
(1) constructing a high-throughput sequencing library of the whole genome of the novel coronavirus to be detected according to the method;
(2) sequencing the high-throughput sequencing library on the instrument after it passes quality control;
(3) bioinformatics analysis and detection of variant sites.
Preferably, step (3) comprises the sub-steps of:
1) building an index of the novel coronavirus COVID-19 reference genome MT019531.1 with BWA, and generating a .fai file with samtools faidx;
2) read quality control: filtering and quality-checking the paired-end reads with SOAPnuke to obtain clean reads (the reads remaining after filtering); reads meeting any of the following conditions are removed: condition 1: reads contaminated with linker sequences; condition 2: reads in which N bases exceed 10%; condition 3: reads in which low-quality bases (Q < 38) exceed 50% of the read;
3) alignment and sorting: aligning the clean reads to the reference genome MT019531.1 with BWA combined with samtools to generate a BAM file, with the alignment parameters "-t 32 -M"; sorting with picard.jar from the Picard software; indexing the sorted BAM file with the samtools index tool; and performing quality control on the resulting BAM file with Qualimap;
4) variant detection: detecting viral SNPs and InDels with samtools pileup and VarScan; the SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";
5) finally, annotating the detected SNPs and InDels with ANNOVAR, based on the GFF file of the MT019531.1 reference genome.
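The read-filtering conditions of sub-step 2) and the main calling thresholds of sub-step 4) can be restated as a small Python sketch. This is a simplified illustration only: it is not how SOAPnuke or VarScan are implemented, the function names are hypothetical, and linker contamination is approximated here by an exact substring match.

```python
def keep_read(seq, quals, linkers, max_n_frac=0.10, min_q=38, max_lowq_frac=0.50):
    """Return True if a read survives the three filtering conditions.

    seq     : read bases as a string
    quals   : per-base Phred quality scores (same length as seq)
    linkers : linker sequences treated as contamination (exact match only)
    """
    if not seq or len(seq) != len(quals):
        return False
    seq = seq.upper()
    if any(linker in seq for linker in linkers):      # condition 1
        return False
    if seq.count("N") / len(seq) > max_n_frac:        # condition 2
        return False
    low_quality = sum(q < min_q for q in quals)
    if low_quality / len(quals) > max_lowq_frac:      # condition 3
        return False
    return True


def passes_call_thresholds(depth, alt_reads, min_coverage=8, min_reads2=4,
                           min_var_freq=0.1):
    """Apply the coverage, supporting-read and allele-frequency thresholds."""
    if depth < min_coverage or alt_reads < min_reads2:
        return False
    return alt_reads / depth >= min_var_freq


# Toy inputs (hypothetical): a clean read and a site with 6 of 20 reads variant.
print(keep_read("ACGT" * 25, [40] * 100, ["AGATGTGTATAAGAGACAG"]))
print(passes_call_thresholds(depth=20, alt_reads=6))
```

Keeping the thresholds as keyword arguments makes it easy to see how each value in the parameter strings above affects whether a read or a candidate variant is retained.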
By the technical scheme, the invention at least has the following advantages and beneficial effects:
The anchored multiplex amplification primer combination provided by the invention enables efficient targeted enrichment of the genome of the novel coronavirus COVID-19. It overcomes the low targeting specificity, poor timeliness and susceptibility to host background contamination of existing methods, and makes it possible to complete whole-genome sequencing of COVID-19 in a short time with a smaller sequencing data volume and lower cost.
The multiplex PCR amplification primers provided by the invention specifically target the genome sequence of the novel coronavirus COVID-19, are unaffected by host human RNA, and require only a small starting amount of sample. They enable specific detection of the whole genome of COVID-19 RNA virus extracted from samples such as throat swabs and alveolar lavage fluid, differential diagnosis of the COVID-19 virus, and identification of viral mutations. The multiplex PCR primer sequences were tested and optimized over multiple rounds, and the coverage uniformity of the resulting sequencing data exceeds 90%. The method also overcomes the drawbacks of RNA-seq sequencing of RNA virus genomes, in which the large amount of residual host RNA leads to too little viral genome output data, an overly long experimental cycle and a large requirement for starting nucleic acid; and it can be used for re-testing and accurate diagnosis of patients with suspected symptoms who test false negative by the conventional rapid qRT-PCR method.
In practice, the invention also optimizes the PCR reaction systems and programs used for target enrichment and further amplification during library construction, effectively alleviating the low amplification efficiency and poor uniformity of conventional general-purpose multiplex PCR.
Drawings
FIG. 1 shows the principle of primer design according to the present invention.
FIG. 2 shows the Agilent 2200 micro-electrophoresis quality-control traces of the sequencing libraries in Example 1 of the present invention: a, library 46d1-1; b, library 50d1-1. The abscissa, Size (bp), is the library fragment size and the ordinate, Sample Intensity, is the signal intensity.
FIG. 3 shows the Agilent 2200 micro-electrophoresis quality-control traces of the sequencing libraries in Example 2 of the present invention: a, library 46d1-2; b, library 50d1-2. The abscissa, Size (bp), is the library fragment size and the ordinate, Sample Intensity, is the signal intensity.
FIG. 4 shows the Agilent 2200 micro-electrophoresis quality-control traces of the sequencing libraries in Example 3 of the present invention: a, library 48d5-1; b, library 47d1-1. The abscissa, Size (bp), is the library fragment size and the ordinate, Sample Intensity, is the signal intensity.
FIG. 5 shows the Agilent 2200 micro-electrophoresis quality-control traces of the sequencing libraries in Example 4 of the present invention: a, library 48d5-2; b, library 47d1-2. The abscissa, Size (bp), is the library fragment size and the ordinate, Sample Intensity, is the signal intensity.
FIG. 6 shows the Agilent 2200 micro-electrophoresis quality-control traces of the sequencing libraries in Example 5 of the present invention: a, library XH1P2_R; b, library WHP6_R; c, library XH1P6_R. The abscissa, Size (bp), is the library fragment size and the ordinate, Sample Intensity, is the signal intensity.
Detailed Description
The invention provides a rapid differential diagnosis technology for whole-genome mutations of the novel coronavirus (COVID-19) based on targeted multiplex polymerase chain reaction amplicon sequencing, the application of a corresponding kit, a primer combination and a kit designed according to the method, and a method for constructing a COVID-19 single-stranded RNA library using the primer combination, which together enable accurate and comprehensive differential diagnosis of the novel coronavirus COVID-19 in biological samples such as throat swabs and alveolar lavage fluid and in virus culture samples.
The invention also provides an approach and a reference model for rapidly identifying and distinguishing mutations in any type of RNA virus with a known genome sequence; the anchored multiplex PCR primer sequences can be replaced, according to actual needs, with ones targeting different RNA virus genomes to develop kits suited to different applications.
The technical scheme of the invention is as follows:
The invention provides a library construction method for whole-genome high-throughput sequencing of the human novel coronavirus (COVID-19), which comprises the following steps:
1. Reverse transcription of viral single-stranded RNA. Single-stranded cDNA is synthesized in either of two ways. In the first: step a, first-strand cDNA (1st cDNA) synthesis is primed with random 6-mer to 10-mer primers (Random 6mer-10mer Primer); step b, the 1st cDNA enters the subsequent PCR amplification reaction with or without purification. Alternatively: step a, primers from the non-anchored reverse primer pool R of claim 3a are combined into a specific reverse transcription primer set for first-strand cDNA synthesis, the binding sites of the selected specific reverse transcription primers being evenly distributed along the 3'-5' direction of the viral genome at intervals of 800-1000 bp; step b, the single-stranded cDNA is purified before the subsequent PCR amplification reaction. Double-stranded cDNA is synthesized from viral single-stranded RNA as follows: step a, first-strand cDNA synthesis is primed with random primers of 6-10 bases; step b, RNase H nicks the RNA-1st cDNA heteroduplex (RNA-1st cDNA hybrid) in the presence of deoxyribonucleoside triphosphates (dNTPs); step c, double-stranded cDNA (2nd cDNA) is synthesized by an RNA-dependent DNA polymerase using the small RNA fragments generated at the nicks as primers; step d, the double-stranded cDNA is recovered and purified. Commercial reverse transcription and second-strand synthesis kits can be used for these steps.
2. Designing a COVID-19 virus genome multiple specificity primer 250 pair of anchoring part Illumina sequencing joint sequence:
a. according to the full length of the sequence of COVID-19 genome MT019531.1(Access No: MT019531GWHABKH00000000) published on the website of the National Center for Biotechnology Information (NCBI), two groups of multiplex amplification primer groups 1 and 2 are designed (each group of primer pools respectively comprises a forward primer F pool and a reverse primer R pool); respectively designing a forward primer and a reverse primer of the amplification primer group 2 in two adjacent amplicon sequences of the amplification primer group 1, and respectively continuing to design the forward primer and the reverse primer of the amplification primer group 1 in two adjacent amplicon sequences of the amplification primer group 2, and repeating the steps until amplification products of the amplification primer group 1 and the amplification primer group 2 can cover the whole virus genome in a shingled manner (FIG. 1). The Tm threshold difference between different primers is set as +/-2 ℃ in Primer design, the size of an amplicon product is set as 200-300bp, and Primer pairs which can cause dimers (Primer dimers) and Stem-Loop structures (Stem-Loop) to be formed between the primers or in the primers are removed in the design; in the same amplification primer group, the 5 'end of the reverse primer sequence of the upstream amplicon of the genome is ensured to be positioned at the upstream of the 5' end of the forward primer sequence of the downstream amplicon as much as possible so as to prevent the formation of short fragment byproducts and carry out PCR competition; and simultaneously ensures that the amplification efficiency in the system is close to high consistency.
b. Designing the anchored multiplex amplification primer F pool, which carries a partial Illumina adapter sequence: an Illumina Nextera adapter sequence (5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3') is added, in the 5'-3' direction, to the 5' end of every primer in the forward primer F pool. Here 5'-AGATGTGTATAAGAGACAG-3' is the Tn5 transposase binding site sequence of the Illumina Nextera adapter, and 5'-GTCTCGTGGGCTCGG-3' is identical to the 3' terminal sequence of the index primer (i7 Index Primer) used for Illumina complete library amplification. The 5'-GTCTCGTGGGCTCGG-3' portion can be shortened or lengthened appropriately, provided that the 3' end of the i7 Index Primer (the sequence downstream of the i7 index) can still anneal to it normally.
c. Designing the anchored multiplex amplification primer R pool, which carries a partial Illumina adapter sequence: an Illumina Nextera adapter sequence (5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3') is added, in the 5'-3' direction, to the 5' end of every primer in the reverse primer R pool. Here 5'-AGATGTGTATAAGAGACAG-3' is the Tn5 transposase binding site sequence of the Illumina Nextera adapter, and 5'-TCGTCGGCAGCGTC-3' is identical to the 3' terminal sequence of the index primer (i5 Index Primer) used for Illumina complete library amplification. The 5'-TCGTCGGCAGCGTC-3' portion can be shortened or lengthened appropriately, provided that the 3' end of the i5 Index Primer (the sequence downstream of the i5 index) can still anneal to it normally.
d. The anchored multiplex amplification primer F pools and R pools are synthesized and mixed to form anchored multiplex amplification primer set 1 (Anchored Primer Pool 1) and anchored multiplex amplification primer set 2 (Anchored Primer Pool 2); the primer mixing schemes are shown in Tables 1 and 2. In practical applications, the amplification products of anchored multiplex amplification primer set 1 and anchored multiplex amplification primer set 2 need to be mixed in equimolar amounts to cover the entire viral genome.
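The anchoring described in steps b and c amounts to prepending a fixed adapter string to each gene-specific primer. A minimal Python sketch follows; the adapter pieces are the ones quoted above, while the function name `anchor_primer` and the two 20-nt target sequences are hypothetical placeholders for illustration, not primers from Tables 1 and 2.

```python
# Partial Illumina Nextera adapter pieces quoted in the text (written 5'->3').
TN5_SITE = "AGATGTGTATAAGAGACAG"   # Tn5 transposase binding site
I7_TAIL = "GTCTCGTGGGCTCGG"        # matches the 3' end of the i7 index primer
I5_TAIL = "TCGTCGGCAGCGTC"         # matches the 3' end of the i5 index primer

F_ANCHOR = I7_TAIL + TN5_SITE      # prepended to every forward (F pool) primer
R_ANCHOR = I5_TAIL + TN5_SITE      # prepended to every reverse (R pool) primer


def anchor_primer(gene_specific: str, pool: str) -> str:
    """Return the anchored primer: adapter at the 5' end, target sequence at the 3' end."""
    seq = gene_specific.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError(f"unexpected primer sequence: {gene_specific!r}")
    if pool == "F":
        return F_ANCHOR + seq
    if pool == "R":
        return R_ANCHOR + seq
    raise ValueError("pool must be 'F' or 'R'")


# Hypothetical 20-nt placeholders, NOT primers from the patent's tables.
example_f = anchor_primer("ACGTACGTACGTACGTACGT", "F")
example_r = anchor_primer("TGCATGCATGCATGCATGCA", "R")
print(example_f)
print(example_r)
```

Keeping the adapter at the 5' end leaves the 3' end of each primer fully complementary to the viral target, which is what allows the index primers to anneal to the adapter tails in the later library amplification.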
The library construction method provided by the invention comprises the following steps: step 1) synthesizing single-stranded cDNA or double-stranded cDNA by reverse transcription of viral single-stranded RNA, wherein the double-stranded cDNA synthesis reagent is preferably the EpiNext Hi-Fi cDNA kit (Epigentek), the single-stranded cDNA synthesis reagent is preferably the TAKARA PrimeScript 1st Strand cDNA Synthesis Kit, and the reverse transcription primers are either random primers of 6-10 bases or a specific reverse transcription primer set formed by mixing several primers from the non-anchored reverse primer pool R; step 2) performing PCR with anchored multiplex amplification primer set 1 and anchored multiplex amplification primer set 2 separately, to enrich novel coronavirus cDNA in a targeted manner; step 3) purifying the PCR products of step 2) and mixing them in equimolar amounts; step 4) performing PCR library amplification on the target-enriched cDNA to obtain a DNA library that can be sequenced on an Illumina sequencing platform; step 5) purifying the library.
The PCR reaction system of step 2) comprises: 5 µL-10 µL of cDNA template; 5 µL of anchored multiplex amplification primer set 1 or anchored multiplex amplification primer set 2; 15 µL of DNA polymerase and 2× buffer system; and 0-5 µL of double distilled water. Step 3) comprises purifying the two groups of anchored multiplex amplification products separately, detecting the two groups of purified products separately, calculating the molar concentration of each group, and mixing the two groups in equimolar amounts. The PCR reaction system of step 4) comprises: 20 µL of the mixed anchored multiplex amplification PCR products; 5 µL of Illumina library amplification (index PCR) primer pair; and 25 µL of DNA polymerase and 2× buffer system, preferably KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program of step 4) comprises: step a, pre-denaturation at 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 10 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C. Step 5) comprises adding DNA purification magnetic beads at 0.6-1 times the volume of the amplification product, incubating, separating the beads on a magnetic rack, washing the beads with freshly prepared ethanol, and eluting the library.
In some embodiments, the viral RNA sample is extracted from virus isolated from alveolar lavage fluid.
In some embodiments, the viral RNA sample is extracted from virus isolated from a pharyngeal swab.
In some embodiments, the viral RNA sample is a high-copy-number viral extract isolated from the supernatant of virus-infected in vitro cell cultures.
In some embodiments, the reverse transcription primer is a 6-base random primer.
In some embodiments, the reverse transcription primer is a specific reverse transcription primer set formed by mixing several primers in the non-anchored reverse primer pool R.
The post-sequencing data analysis method provided by the invention comprises the following steps. Step 1: the BWA software is used to build the index of the reference genome (MT019531.1), and samtools faidx is used to generate the .fai file. Basic statistics of the MT019531.1 genome: total length 29899bp, GC content 37.98%. Step 2: read quality control. SOAPnuke is used to filter the paired-end reads and perform quality control, yielding clean reads (the reads retained after filtering). Reads meeting any of the following conditions are removed: 1) reads contaminated with adapter sequences; 2) reads in which N bases exceed 10%; 3) reads in which low-quality (Q <38) bases exceed 50% of the whole read. Step 3: alignment and sorting. The clean reads are aligned to the COVID-19 reference genome (MT019531.1) using BWA together with samtools to generate BAM files, with alignment parameters "-t 32 -M". The picard software (jar) is used for sorting. The index tool of samtools is applied to index the sorted BAM file. Quality control of the resulting BAM file is performed with tools such as Qualimap. Step 4: variant detection. SNP and InDel variants of the virus are detected using samtools pileup and VarScan. The SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1". Step 5: the detected SNPs are annotated with the annovar software based on the GFF file of the MT019531.1 reference genome.
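The three read-removal criteria of the quality control step can be illustrated with a small Python sketch. This is not SOAPnuke's implementation: the function name `keep_read`, the exact-substring adapter check and the way thresholds are passed are assumptions made for illustration; only the threshold values (10% N bases, Q <38, 50% low-quality bases) come from the text.

```python
from typing import List


def keep_read(seq: str, quals: List[int], adapter: str,
              max_n_frac: float = 0.10, min_q: int = 38,
              max_lowq_frac: float = 0.50) -> bool:
    """Return True if a read passes all three filters described in the text."""
    if not seq or len(seq) != len(quals):
        return False
    # 1) adapter contamination (simplified here to an exact substring match)
    if adapter and adapter in seq:
        return False
    # 2) more than 10% N bases
    if seq.count("N") / len(seq) > max_n_frac:
        return False
    # 3) more than 50% of bases below the quality threshold
    low_quality = sum(1 for q in quals if q < min_q)
    if low_quality / len(seq) > max_lowq_frac:
        return False
    return True


# Toy reads with made-up sequences and quality scores.
print(keep_read("ACGTACGTAC", [40] * 10, adapter="TTTTTT"))
print(keep_read("ACGTNNACGT", [40] * 10, adapter="TTTTTT"))
```

For paired-end data a filter of this kind is normally applied per pair, so that a pair is dropped when either mate fails; that pairing logic is omitted here.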
The invention also provides a kit for constructing a human novel coronavirus (COVID-19) whole genome high-throughput sequencing library, which comprises the following components: a specific reverse transcription primer set formed by mixing several primers from the non-anchored reverse primer pool; anchored multiplex amplification primer set 1 and anchored multiplex amplification primer set 2; sequencing library amplification primers; and the various reagents used in library construction. Further, the kit also contains instructions for the method and instructions for safe use.
The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention. Unless otherwise indicated, the examples follow conventional experimental conditions, such as the Molecular Cloning handbook, Sambrook et al (Sambrook J & Russell DW, Molecular Cloning: a Laboratory Manual,2001), or the conditions as recommended by the manufacturer's instructions.
Example 1
The viral RNA used in this example was extracted by the magnetic bead method from the alveolar lavage fluid of novel coronavirus pneumonia patients (two cases); RNA extraction and quality control were performed in the biosafety level 3 (P3) laboratory of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences / Peking Union Medical College.
The method provided in this example can be used to identify the viral species in alveolar lavage fluid, or to detect mutations in the viral genome, from patients with confirmed novel coronavirus pneumonia. The viral concentration (copies/µL) was determined by absolute quantitative qRT-PCR, using the N gene copy number of the novel coronavirus nucleic acid reference material (high concentration) GBW(E)091089 (National Institute of Metrology, China) as the reference (Table 3).
TABLE 3 alveolar lavage fluid RNA viral copy number, clinical information
Figure BDA0002427599160000141
The specific experimental method is as follows:
Viral single-stranded RNA extracted from alveolar lavage fluid was reverse-transcribed into single-stranded cDNA (1st cDNA) using 6-base random primers and the TAKARA PrimeScript 1st Strand cDNA Synthesis Kit (TAKARA, Cat No. 6110A); the 1st cDNA was purified for subsequent amplification.
PCR was performed with anchored multiplex amplification primer set 1. The PCR reaction system comprised: 5 µL of purified cDNA template; 5 µL of anchored multiplex amplification primer set 1; 15 µL of DNA polymerase and 2× buffer system; 5 µL of double distilled water. The DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprised: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 15 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
PCR was performed with anchored multiplex amplification primer set 2. The PCR reaction system comprised: 5 µL of purified cDNA template from the same sample; 5 µL of anchored multiplex amplification primer set 2; 15 µL of DNA polymerase and 2× buffer system; 5 µL of double distilled water. The DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprised: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 15 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The first-round anchored PCR amplification products were purified separately. The procedure comprised: step a, adding DNA purification magnetic beads at 1 time the volume of the amplification product (30 µL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating at room temperature for 5 min; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 µL of EB buffer (Qiagen, Cat No. 19086). The two sets of PCR products were then mixed in equimolar amounts.
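Equimolar mixing requires converting each measured mass concentration into a molar concentration. The Python sketch below uses the common approximation of about 660 g/mol per base pair of double-stranded DNA; the function names, the 300bp mean product length and the two concentrations are hypothetical illustration values, not measurements from this example.

```python
def ng_per_ul_to_nM(conc_ng_per_ul: float, mean_len_bp: float) -> float:
    """Convert a dsDNA mass concentration to nmol/L, assuming ~660 g/mol per bp."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_len_bp)


def equimolar_volumes(conc_a_nM: float, conc_b_nM: float, max_vol_ul: float):
    """Volumes (µL) of products A and B that contain equal moles of each.

    The more dilute product is used at max_vol_ul and the other is scaled down.
    """
    target_fmol = min(conc_a_nM, conc_b_nM) * max_vol_ul   # nM x µL = fmol
    return target_fmol / conc_a_nM, target_fmol / conc_b_nM


# Hypothetical measurements: 10 and 20 ng/µL, both with a 300bp mean length.
pool1_nM = ng_per_ul_to_nM(10.0, 300.0)
pool2_nM = ng_per_ul_to_nM(20.0, 300.0)
vol1, vol2 = equimolar_volumes(pool1_nM, pool2_nM, max_vol_ul=10.0)
print(round(pool1_nM, 1), round(pool2_nM, 1), vol1, vol2)
```

With these illustrative numbers the more concentrated product contributes half the volume of the other, so both contribute the same number of molecules to the 20 µL used as template in the library amplification.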
The mixed first-round PCR products were amplified with Illumina library index amplification primers. The PCR reaction system comprised: 20 µL of the mixed first-round PCR products; 5 µL of Illumina library amplification primer pair; 25 µL of DNA polymerase and 2× buffer system, preferably KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprised: step a, pre-denaturation at 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 10 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The second-round Illumina library amplification products were purified as follows: step a, adding DNA purification magnetic beads at 1 time the volume of the amplification product (50 µL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating at room temperature for 5 min; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 µL of EB buffer (Qiagen, Cat No. 19086).
High-throughput sequencing: the library purified in the previous step was sequenced on the Illumina NovaSeq according to the instrument's operating procedure; the amount of sequencing data was set to 1G.
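As a rough orientation for what a 1G data target means for a genome of this size, the expected mean depth can be estimated as total sequenced bases divided by genome length. The Python sketch below assumes that "1G" denotes 1e9 bases and that depth is spread evenly, which multiplex amplicon data does not strictly satisfy; the function name and the `aligned_fraction` parameter are illustrative.

```python
GENOME_LEN_BP = 29899   # MT019531.1 total length as stated in the data analysis steps


def expected_mean_depth(total_bases: float, aligned_fraction: float = 1.0) -> float:
    """Mean per-base depth if the aligned bases were spread evenly over the genome."""
    if not 0.0 <= aligned_fraction <= 1.0:
        raise ValueError("aligned_fraction must be between 0 and 1")
    return total_bases * aligned_fraction / GENOME_LEN_BP


# Assumption: "1G" of sequencing data is read as 1e9 bases.
print(round(expected_mean_depth(1e9)))
print(round(expected_mean_depth(1e9, aligned_fraction=0.92)))
```

The resulting mean depth is far above 100×, so whether a position reaches 100× depends mainly on how evenly the individual amplicons amplify rather than on the total data volume.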
Post-sequencing data analysis:
Step 1: The BWA software is used to build the index of the reference genome (MT019531.1), and samtools faidx is used to generate the .fai file. Basic statistics of the MT019531.1 genome: total length 29899bp, GC content 37.98%;
Step 2: Read quality control. SOAPnuke is used to filter the paired-end reads and perform quality control, yielding clean reads (the reads retained after filtering). Reads meeting any of the following conditions are removed: 1) reads contaminated with adapter sequences; 2) reads in which N bases exceed 10%; 3) reads in which low-quality (Q <38) bases exceed 50% of the whole read;
Step 3: Alignment and sorting. The clean reads are aligned to the COVID-19 reference genome (MT019531.1) using BWA together with samtools to generate BAM files, with alignment parameters "-t 32 -M". The picard software (jar) is used for sorting. The index tool of samtools is applied to index the sorted BAM file. Quality control of the resulting BAM file is performed with tools such as Qualimap;
Step 4: Variant detection. SNP and InDel variants of the virus are detected using samtools pileup and VarScan. The SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";
Step 5: The detected SNPs are annotated with the annovar software based on the GFF file of the MT019531.1 reference genome.
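The variant detection parameters of Step 4 can be kept in one place and expanded into a command line, so that SNP and InDel calling are guaranteed to use identical thresholds. The Python sketch below only builds the argument list and does not run anything. The parameter string is the one given in the text; the choice of the VarScan subcommands `mpileup2snp` and `mpileup2indel`, the jar path and the pileup file name are assumptions, since the text names only "samtools pileup and VarScan".

```python
import shlex

# Parameter string as given in the text (identical for SNP and InDel detection).
VARSCAN_PARAMS = ("--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 "
                  "--min-avg-qual 0 --p-value 1.0 --strand-filter 0 "
                  "--variants --output-vcf 1")


def varscan_command(kind: str, pileup_path: str, jar_path: str = "VarScan.jar"):
    """Build (but do not execute) a VarScan command line as an argument list."""
    # Assumed subcommands; the source text does not state which ones were used.
    subcommands = {"snp": "mpileup2snp", "indel": "mpileup2indel"}
    if kind not in subcommands:
        raise ValueError("kind must be 'snp' or 'indel'")
    return (["java", "-jar", jar_path, subcommands[kind], pileup_path]
            + shlex.split(VARSCAN_PARAMS))


# Hypothetical input file name, for illustration only.
print(" ".join(varscan_command("snp", "sample.mpileup")))
print(" ".join(varscan_command("indel", "sample.mpileup")))
```

Read this way, the thresholds require at least 8 reads covering a position, at least 4 reads supporting the variant, and a variant allele frequency of at least 0.1.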
Analysis of results:
FIG. 2 shows the library construction results of this example. The libraries constructed from viral RNA isolated from alveolar lavage fluid showed two peaks, the main peak being about 380-: 1) a small amount of genomic by-products generated under the influence of the random primers; 2) potential primer dimers over-amplified by the library amplification primers; or 3) residual anchored multiplex amplification primer set 1 and anchored multiplex amplification primer set 2 remaining after the first round of purification and forming products on amplification, with a ratio of about 20%. No library was obtained from the blank control (NC), as expected (results not shown). The sequencing data output was 0.75G (46d1-1) and 1.1G (50d1-1), and the Q30 values of the raw data were 90.38% and 80.72%, respectively (Table 4).
TABLE 4 Quality control of post-sequencing data for the alveolar lavage fluid sample libraries
Figure BDA0002427599160000161
After filtering, the post-sequencing data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531GWHABKH00000000); the number of aligned reads, number of aligned bases, alignment rate, mismatch rate, coverage at each depth, and so on are shown in Table 5. For the alveolar lavage fluid samples, the alignment rate was above 92% and the mismatch rate below 0.2%. In both libraries, coverage of the novel coronavirus N and S genes at a sequencing depth of 100× was 100%, so the virus can be identified as the novel coronavirus COVID-19; coverage of the viral genome at a sequencing depth of 100× reached 97.38% and 98.24%, respectively (Table 5).
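Coverage at a sequencing depth of 100× is the percentage of reference positions covered by at least 100 reads. A minimal Python sketch of that calculation follows; it assumes the per-base depths have already been extracted into a list (for example from the third column of `samtools depth -a` output), and the function name and toy depth values are illustrative, not data from Table 5.

```python
from typing import Sequence


def coverage_at_depth(depths: Sequence[int], min_depth: int = 100) -> float:
    """Percentage of positions whose depth is at least min_depth."""
    if not depths:
        raise ValueError("no positions supplied")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Toy per-base depths for a 4-position region.
toy_depths = [0, 50, 100, 150]
print(coverage_at_depth(toy_depths))
print(coverage_at_depth(toy_depths, min_depth=1))
```

The same function applied to the positions of a single gene gives gene-level coverage, which is how statements such as 100% coverage of the N and S genes can be checked.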
TABLE 5 Statistical results of post-sequencing data analysis for the alveolar lavage fluid sample libraries
Figure BDA0002427599160000171
Of the two libraries, 46d1-1 had 9 single nucleotide polymorphisms (SNPs) and no insertion/deletion mutations. The SNPs were: MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 3706 (orf1ab: A3441G), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 5812 (orf1ab: C5547T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 7010 (orf1ab: G6745A), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 18640 (orf1ab: A18375G).
50d1-1 had 8 single nucleotide polymorphisms (SNPs) and no insertion/deletion mutations. The SNPs were: MT019531.1 genomic position 1880 (orf1ab: G1615A), MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 7010 (orf1ab: G6745A), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 28620 (N: G346A).
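The two SNP lists can be compared programmatically. The Python sketch below encodes the positions and annotations reported for 46d1-1 and 50d1-1, finds the shared and library-specific sites, and checks that the orf1ab annotations are internally consistent. It assumes that the number in each annotation is the 1-based position within the gene, which is an interpretation of the notation rather than something the text states; the variable and function names are illustrative.

```python
import re

# Genomic position -> annotation, as reported in the text for each library.
SNPS_46D1 = {3127: "orf1ab:T2862C", 3706: "orf1ab:A3441G", 5369: "orf1ab:G5104T",
             5812: "orf1ab:C5547T", 6996: "orf1ab:C6731T", 7010: "orf1ab:G6745A",
             18395: "orf1ab:C18130T", 18557: "orf1ab:C18292T", 18640: "orf1ab:A18375G"}
SNPS_50D1 = {1880: "orf1ab:G1615A", 3127: "orf1ab:T2862C", 5369: "orf1ab:G5104T",
             6996: "orf1ab:C6731T", 7010: "orf1ab:G6745A", 18395: "orf1ab:C18130T",
             18557: "orf1ab:C18292T", 28620: "N:G346A"}

shared = sorted(set(SNPS_46D1) & set(SNPS_50D1))
only_46 = sorted(set(SNPS_46D1) - set(SNPS_50D1))
only_50 = sorted(set(SNPS_50D1) - set(SNPS_46D1))


def implied_gene_start(genome_pos: int, annotation: str) -> int:
    """1-based genome coordinate of the gene's first base implied by one SNP record."""
    gene_pos = int(re.search(r"\d+", annotation.split(":")[1]).group())
    return genome_pos - gene_pos + 1


# Every orf1ab record should imply the same gene start if the list is consistent.
orf1ab_starts = {implied_gene_start(pos, ann)
                 for table in (SNPS_46D1, SNPS_50D1)
                 for pos, ann in table.items() if ann.startswith("orf1ab:")}

print("shared:", shared)
print("only 46d1-1:", only_46)
print("only 50d1-1:", only_50)
print("implied orf1ab start(s):", orf1ab_starts)
```

A single implied start across all orf1ab records is a quick sanity check that the genomic positions and gene-level annotations were transcribed consistently.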
Example 2
The viral RNA samples 46d1 and 50d1 used in this example were the same as in example 1.
The specific experimental method is as follows:
Viral single-stranded RNA extracted from alveolar lavage fluid was reverse-transcribed into single-stranded cDNA (1st cDNA) using 34 gene-specific reverse primers (COV-1-R, COV-8-R, COV-12-R, COV-20-R, COV-30-R, COV-38-R, COV-47-R, COV-54-R, COV-62-R, COV-71-R, COV-80-R, COV-86-R, COV-94-R, COV-102-R, COV-111-R, COV-119-R, COV-125-R, COV-132-R, COV-141-R, COV-R, COV-155-R, COV-R, COV-172-R, COV-179-R, COV-187-R, COV-195-R, COV-202-R, COV-210-R, COV-220-R, COV-228-R, COV-233-R, COV-239-R, COV-247-R, COV-252-R; genome direction 3'-5') and the TAKARA PrimeScript 1st Strand cDNA Synthesis Kit (TAKARA, Cat No. 6110A); the 1st cDNA was purified for subsequent amplification.
PCR was performed with anchored multiplex amplification primer set 1. The PCR reaction system comprised: 5 µL of purified cDNA template; 5 µL of anchored multiplex amplification primer set 1; 15 µL of DNA polymerase and 2× buffer system; 5 µL of double distilled water. The DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprised: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 15 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
PCR was performed with anchored multiplex amplification primer set 2. The PCR reaction system comprised: 5 µL of purified cDNA template from the same sample; 5 µL of anchored multiplex amplification primer set 2; 15 µL of DNA polymerase and 2× buffer system; 5 µL of double distilled water. The DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprised: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 15 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The first-round anchored PCR amplification products were purified separately. The procedure comprised: step a, adding DNA purification magnetic beads at 1 time the volume of the amplification product (30 µL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating at room temperature for 5 min; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 µL of EB buffer (Qiagen, Cat No. 19086). The two sets of PCR products were then mixed in equimolar amounts.
The mixed first-round PCR products were amplified with Illumina library index amplification primers. The PCR reaction system comprised: 20 µL of the mixed first-round PCR products; 5 µL of Illumina library amplification primer pair; 25 µL of DNA polymerase and 2× buffer system, preferably KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprised: step a, pre-denaturation at 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 10 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The second-round Illumina library amplification products were purified as follows: step a, adding DNA purification magnetic beads at 1 time the volume of the amplification product (50 µL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating at room temperature for 5 min; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting in 30 µL of EB buffer (Qiagen, Cat No. 19086).
High-throughput sequencing: the library purified in the previous step was sequenced on the Illumina NovaSeq according to the instrument's operating procedure; the amount of sequencing data was set to 1G.
Post-sequencing data analysis:
Step 1: The BWA software is used to build the index of the reference genome (MT019531.1), and samtools faidx is used to generate the .fai file. Basic statistics of the MT019531.1 genome: total length 29899bp, GC content 37.98%;
Step 2: Read quality control. SOAPnuke is used to filter the paired-end reads and perform quality control, yielding clean reads (the reads retained after filtering). Reads meeting any of the following conditions are removed: 1) reads contaminated with adapter sequences; 2) reads in which N bases exceed 10%; 3) reads in which low-quality (Q <38) bases exceed 50% of the whole read;
Step 3: Alignment and sorting. The clean reads are aligned to the COVID-19 reference genome (MT019531.1) using BWA together with samtools to generate BAM files, with alignment parameters "-t 32 -M". The picard software (jar) is used for sorting. The index tool of samtools is applied to index the sorted BAM file. Quality control of the resulting BAM file is performed with tools such as Qualimap;
Step 4: Variant detection. SNP and InDel variants of the virus are detected using samtools pileup and VarScan. The SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";
Step 5: The detected SNPs are annotated with the annovar software based on the GFF file of the MT019531.1 reference genome.
Analysis of results:
FIG. 3 shows the library construction results of this example. The sequencing libraries of 46d1-2 and 50d1-2 each showed a single peak, with an average library fragment size of about 380-420bp, consistent with the expected average amplicon size and complete library size. No library was obtained from the blank control (NC), as expected (results not shown). The sequencing data output was 0.9G (46d1-2) and 1.0G (50d1-2), and the Q30 values of the raw data were 94.15% and 92.46%, respectively (Table 6).
TABLE 6 Quality control of post-sequencing data for the alveolar lavage fluid sample libraries
Figure BDA0002427599160000191
After filtering, the post-sequencing data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531GWHABKH00000000); the number of aligned reads, number of aligned bases, alignment rate, mismatch rate, coverage at each depth, and so on are shown in Table 7. For the alveolar lavage fluid samples, the alignment rate was above 97% and the mismatch rate below 0.1%. In both libraries, coverage of the novel coronavirus N and S genes at a sequencing depth of 100× was 100%, so the virus can be identified as the novel coronavirus COVID-19; coverage of the viral genome at a sequencing depth of 100× reached 99.08% and 99.24%, respectively (Table 7).
TABLE 7 Statistical results of post-sequencing data analysis for the alveolar lavage fluid sample libraries
Figure BDA0002427599160000192
Figure BDA0002427599160000201
Of the two samples, 46d1-2 had 9 single nucleotide polymorphisms (SNPs) and no insertion/deletion mutations. The SNPs were: MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 3706 (orf1ab: A3441G), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 5812 (orf1ab: C5547T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 7010 (orf1ab: G6745A), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 18640 (orf1ab: A18375G).
50d1-2 had 8 single nucleotide polymorphisms (SNPs) and no insertion/deletion mutations. The SNPs were: MT019531.1 genomic position 1880 (orf1ab: G1615A), MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 7010 (orf1ab: G6745A), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 28620 (N: G346A). Taken together, the results of Example 1 and Example 2 show that, for the same sample, the two reverse transcription methods identify the same SNP sites; compared with random primers, reverse transcription with the specific primers gives a higher utilization of the sequencing data (alignment rate) and a higher proportion of the genome covered at a sequencing depth of 100×.
Example 3
The viral RNA used in this example was extracted by the magnetic bead method from a throat swab sample of a novel coronavirus pneumonia patient; RNA extraction and quality control were performed in the biosafety level 3 (P3) laboratory of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences / Peking Union Medical College.
The method provided in this example can be used to identify the viral species in pharyngeal swab samples, or to detect mutations in the viral genome, from patients with confirmed or suspected novel coronavirus pneumonia. The viral copy number was determined by absolute quantitative qRT-PCR, using the N gene and E gene copy numbers of the novel coronavirus nucleic acid reference material (low concentration) GBW(E)091090 (National Institute of Metrology, China) as the reference (Table 8).
TABLE 8 pharyngeal swab RNA Virus copy number, clinical information
Figure BDA0002427599160000202
The specific experimental method is as follows:
Oral epithelial cells of the novel coronavirus pneumonia patient were collected with a throat swab. Viral single-stranded RNA extracted by the magnetic bead method was then reverse-transcribed into single-stranded cDNA (1st cDNA) using 6-base random primers and the TAKARA PrimeScript 1st Strand cDNA Synthesis Kit (TAKARA, Cat No. 6110A); the 1st cDNA was purified for subsequent amplification.
PCR was performed with anchored multiplex amplification primer set 1. The PCR reaction system comprised: 10 µL of purified cDNA template; 5 µL of anchored multiplex amplification primer set 1; 15 µL of DNA polymerase and 2× buffer system. The DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprised: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 25 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
PCR was performed with anchored multiplex amplification primer set 2. The PCR reaction system comprised: 10 µL of purified cDNA template from the same sample; 5 µL of anchored multiplex amplification primer set 2; 15 µL of DNA polymerase and 2× buffer system. The DNA polymerase and 2× buffer system was KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprised: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, 25 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The first-round anchored PCR amplification products were purified separately. The procedure comprised: step a, adding DNA purification magnetic beads at 1 times the volume of the amplification product (30 μL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting with 30 μL of EB buffer (Qiagen, Cat No. 19086). The two sets of PCR products were then mixed in equimolar amounts;
The Illumina library was amplified using the Illumina library tagged amplification primers. The reaction system comprises 20 μL of the mixed first-round PCR products, 5 μL of the Illumina library amplification primer pair, and 25 μL of the DNA polymerase and 2× buffer system; preferably, the DNA polymerase and 2× buffer system is KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprises: step a, pre-denaturation at 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, for 10 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The second-round Illumina library amplification product was purified as follows: step a, adding DNA purification magnetic beads at 0.8 times the volume of the amplification product (40 μL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting with 30 μL of EB buffer (Qiagen, Cat No. 19086).
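The bead volume in the second-round purification follows from the reaction volume and the bead ratio. The short sketch below reproduces that arithmetic; the function name is hypothetical and the component volumes are those listed for the second-round reaction.

```python
def bead_volume_ul(reaction_volume_ul, ratio):
    """Volume of purification beads for a given bead-to-sample ratio."""
    return reaction_volume_ul * ratio


# Second-round reaction: 20 uL mixed first-round product + 5 uL primer pair
# + 25 uL polymerase/2x buffer mix.
second_round_volume_ul = 20 + 5 + 25
beads_ul = bead_volume_ul(second_round_volume_ul, 0.8)
print(f"{second_round_volume_ul} uL reaction x 0.8 = {beads_ul:.0f} uL beads")
```

A lower ratio such as 0.8 retains fewer short fragments than a ratio of 1, which is the usual reason for choosing it in a final library clean-up.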
High-throughput sequencing: the library purified in the previous step was sequenced on an Illumina NovaSeq according to the instrument's operating procedure. The sequencing data volume was set to 1 G; in this example, the actual molar amount of library loaded was adjusted to account for the library peak profile.
Analysis of the sequencing output data:
Step 1: BWA is used to build the index of the reference genome (MT019531.1), and samtools faidx is used to generate the .fai file. Basic statistics of the MT019531.1 genome: total length 29,899 bp, GC content 37.98%;
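The basic genome statistics quoted in step 1 (total length and GC content) can be recomputed from any FASTA file. The following is a minimal sketch with a hypothetical helper name, shown on a toy sequence rather than on the MT019531.1 record itself.

```python
def fasta_stats(fasta_text):
    """Return (length in bp, GC content in percent) for a single-record FASTA string."""
    seq = "".join(
        line.strip()
        for line in fasta_text.splitlines()
        if line and not line.startswith(">")
    ).upper()
    if not seq:
        raise ValueError("no sequence found")
    gc = seq.count("G") + seq.count("C")
    return len(seq), 100.0 * gc / len(seq)


# Toy record for illustration only; not the viral reference sequence.
toy = ">toy\nACGT\nGGCC\n"
length, gc_percent = fasta_stats(toy)
print(length, f"{gc_percent:.2f}%")
```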
Step 2: read quality control. SOAPnuke is used to filter the paired-end reads and produce clean reads (the reads remaining after filtering). Reads that meet any of the following conditions are removed: 1) reads contaminated with adapter sequences; 2) reads in which N bases exceed 10%; 3) reads in which low-quality bases (Q < 38) exceed 50% of the whole read;
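The three removal criteria of step 2 can be written as a single predicate. This is an illustrative sketch only and not SOAPnuke's implementation: the adapter string is a placeholder, adapter contamination is reduced to an exact substring match, and quality strings are assumed to be Phred+33 encoded.

```python
ADAPTER = "AGATCGGAAGAGC"  # placeholder adapter; substitute the real adapter sequence


def keep_read(seq, qual, adapter=ADAPTER, max_n_frac=0.10,
              min_q=38, max_lowq_frac=0.50, phred_offset=33):
    """Return True if the read survives all three filters described in step 2."""
    n = len(seq)
    if n == 0 or len(qual) != n:
        return False
    # 1) adapter contamination (simplified to an exact substring match)
    if adapter in seq.upper():
        return False
    # 2) proportion of N bases above 10%
    if seq.upper().count("N") / n > max_n_frac:
        return False
    # 3) low-quality bases (Q below min_q) above 50% of the read
    low = sum(1 for c in qual if ord(c) - phred_offset < min_q)
    if low / n > max_lowq_frac:
        return False
    return True


print(keep_read("ACGTACGTAC", "I" * 10))   # all bases at Q40
print(keep_read("NNACGTACGT", "I" * 10))   # 20% N bases
```

For paired-end data, a pair would typically be discarded when either mate fails the predicate.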
Step 3: alignment and sorting. The clean reads were aligned to the reference genome of COVID-2019 (MT019531.1) using BWA together with samtools to generate BAM files, with the alignment parameters "-t 32 -M". The jar of the Picard software was used for sorting. The samtools index tool was used to build an index for the sorted BAM file. Quality control of the resulting BAM file was performed with tools such as Qualimap;
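The commands behind step 3 might be assembled as follows. This is a sketch under several assumptions: the `bwa mem` subcommand is assumed (the text only names BWA and the parameters), Picard's `SortSam` tool is assumed for the sorting step, and all file names are hypothetical. The commands are only printed, not executed.

```python
import shlex


def alignment_commands(ref, r1, r2, sam, sorted_bam, picard_jar="picard.jar"):
    """Return (argv, stdout_file) pairs; stdout_file is None when nothing is redirected."""
    return [
        # Alignment with the quoted parameters; bwa mem writes SAM to standard output.
        (["bwa", "mem", "-t", "32", "-M", ref, r1, r2], sam),
        # Coordinate sorting with Picard SortSam (assumed tool, see lead-in).
        (["java", "-jar", picard_jar, "SortSam",
          f"I={sam}", f"O={sorted_bam}", "SORT_ORDER=coordinate"], None),
        # Index of the sorted BAM file.
        (["samtools", "index", sorted_bam], None),
    ]


for argv, out in alignment_commands(
    "MT019531.1.fa", "clean_1.fq.gz", "clean_2.fq.gz",
    "sample.sam", "sample.sorted.bam",
):
    line = shlex.join(argv)
    print(line if out is None else f"{line} > {out}")
```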
Step 4: variant detection. SNPs and InDels of the virus were detected using samtools pileup and VarScan. The SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";
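The effect of the three numeric thresholds in step 4 can be illustrated with a small predicate. This is a simplified sketch and not VarScan's own logic: it only applies the coverage, supporting-read and allele-frequency thresholds, on the assumption that the remaining settings (average quality 0, p-value 1.0, strand filter 0) leave those filters effectively switched off. The function name is hypothetical.

```python
def passes_thresholds(depth, var_reads,
                      min_coverage=8, min_reads2=4, min_var_freq=0.1):
    """Apply the coverage, supporting-read and variant-frequency thresholds to one site."""
    if depth < min_coverage:
        return False
    if var_reads < min_reads2:
        return False
    return var_reads / depth >= min_var_freq


# (depth, reads supporting the variant) for a few hypothetical sites
for depth, var_reads in [(100, 20), (100, 5), (6, 6), (50, 3)]:
    print(depth, var_reads, passes_thresholds(depth, var_reads))
```

With a minimum variant frequency of 0.1, minority variants present in at least a tenth of the reads at a site are reported, not only consensus changes.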
Step 5: the detected SNPs were annotated with ANNOVAR based on the GFF file of the MT019531.1 reference genome.
Result analysis:
FIG. 4 shows the library construction result of this example. When libraries were constructed from pharyngeal swab viral RNA with a relatively low viral copy number, both libraries were multimodal. A main peak of about 180 bp was present (about 80% of the library); it is suspected that, because of the low viral copy number in the samples, the reverse transcription efficiency was low and the anchored primers formed dimers that were then over-amplified by the library amplification primers. Two secondary peaks at about 380-440 bp, accounting for about 20%, correspond to the expected complete library structure. The actual molar amount of library loaded was therefore increased 4-fold for sequencing. The blank control (NC) yielded no library, as expected (results not shown). The sequencing data output was 1.2 G (48d5-1) and 1.3 G (47d1-1), and the Q30 values of the raw data were 85.28% and 79.77%, respectively (Table 9).
Table 9. Quality control of the sequencing data of the pharyngeal swab sample libraries
Figure BDA0002427599160000221
After filtering, the sequencing data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531GWHABKH00000000); the aligned reads, aligned bases, alignment rate, mismatch rate and other statistics are shown in Table 10. The alignment rate of the pharyngeal swab sample data exceeded 93% and the mismatch rate was below 0.03%. For both libraries, the proportion of the novel coronavirus accessory genes N and S covered at a sequencing depth of 100× was 100%, so the virus can be identified as the novel coronavirus COVID-19; the proportion of the viral genome covered at a sequencing depth of 100× reached 98.12% and 96.73%, respectively (Table 10).
Table 10. Statistics of the sequencing data analysis of the pharyngeal swab sample libraries
Figure BDA0002427599160000222
Figure BDA0002427599160000231
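The "proportion covered at a sequencing depth of 100×" reported above is the share of reference positions whose depth reaches at least 100. A minimal sketch of that calculation follows; the function name and the toy depth values are hypothetical, and the per-position depths would in practice come from the sorted BAM file.

```python
def coverage_ratio(depths, min_depth=100):
    """Percentage of positions whose depth is at least min_depth."""
    if not depths:
        return 0.0
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Toy per-position depths for a 4-base region
print(coverage_ratio([150, 99, 100, 0]))
```

Restricting the depth list to the coordinates of a single gene gives the per-gene figure quoted for N and S.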
48d5-1 has 6 single nucleotide polymorphism (SNP) sites and 4 deletion mutation sites. The SNP sites are: MT019531.1 genomic position 2132 (orf1ab: A1867G), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 11354 (orf1ab: G11089A), MT019531.1 genomic position 17194 (orf1ab: A16929G), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T). The deletion mutation sites are: MT019531.1 genomic position 9264 (orf1ab:9000_9005del), MT019531.1 genomic position 9851 (orf1ab:9587_9596del), MT019531.1 genomic position 20296 (orf1ab:20032_20035del), MT019531.1 genomic position 29067 (N:795_ del).
47d1-1 has 3 single nucleotide polymorphism (SNP) sites, and no insertion or deletion mutations were found. The SNP sites are: MT019531.1 genomic position 1578 (orf1ab: T1313A), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 18123 (orf1ab: T17858C).
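Each SNP above is given both as a genomic position and as a coordinate within orf1ab. In the pairs listed, the two differ by a constant 265, which the sketch below uses to convert between them. The offset is inferred from the listed pairs, not taken from an annotation file, and the function name is hypothetical.

```python
ORF1AB_OFFSET = 265  # inferred from the listed pairs, e.g. 2132 - 1867


def orf1ab_coordinate(genomic_pos):
    """Convert an MT019531.1 genomic position to the orf1ab coordinate used above."""
    return genomic_pos - ORF1AB_OFFSET


# (genomic position, orf1ab coordinate) pairs quoted in the text
listed_pairs = [
    (2132, 1867), (6996, 6731), (11354, 11089),
    (18395, 18130), (18557, 18292), (1578, 1313), (18123, 17858),
]
for genomic, gene in listed_pairs:
    print(genomic, orf1ab_coordinate(genomic), orf1ab_coordinate(genomic) == gene)
```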
Example 4
The viral RNA samples 48d5 and 47d1 used in this example were the same as in example 3.
The specific experimental method is as follows:
Oral epithelial cells of a patient with novel coronavirus pneumonia were collected with a pharyngeal swab, and viral single-stranded RNA was extracted by the magnetic bead method. A mixture of 34 gene-specific reverse primers (genome orientation 3'-5'; the same specific reverse transcription primer combination as in Example 2) was used to reverse transcribe the RNA into single-stranded cDNA (1st cDNA) with a 1st cDNA synthesis kit, namely TAKARA PrimeScript 1st Strand cDNA Synthesis Kit (TAKARA, Cat No. 6110A); the 1st cDNA was purified for subsequent amplification.
The PCR reaction is carried out using anchored multiplex amplification primer group 1. The PCR reaction system comprises 10 μL of purified cDNA template, 15 μL of the anchored multiplex primer group, and 15 μL of the DNA polymerase and 2× buffer system; the DNA polymerase and 2× buffer system is KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprises: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, for 25 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The PCR reaction is carried out using anchored multiplex amplification primer group 2. The PCR reaction system comprises 10 μL of purified cDNA template from the same sample, 25 μL of the anchored multiplex primer group, and 15 μL of the DNA polymerase and 2× buffer system; the DNA polymerase and 2× buffer system is KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprises: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, for 25 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The first-round anchored PCR amplification products were purified separately. The procedure comprised: step a, adding DNA purification magnetic beads at 1 times the volume of the amplification product (30 μL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting with 30 μL of EB buffer (Qiagen, Cat No. 19086). The two sets of PCR products were then mixed in equimolar amounts;
The Illumina library was amplified using the Illumina library tagged amplification primers. The reaction system comprises 20 μL of the mixed first-round PCR products, 5 μL of the Illumina library amplification primer pair, and 25 μL of the DNA polymerase and 2× buffer system; preferably, the DNA polymerase and 2× buffer system is KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprises: step a, pre-denaturation at 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, for 10 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The second-round Illumina library amplification product was purified as follows: step a, adding DNA purification magnetic beads at 0.8 times the volume of the amplification product (40 μL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting with 30 μL of EB buffer (Qiagen, Cat No. 19086).
High-throughput sequencing: the library purified in the previous step was sequenced on an Illumina NovaSeq according to the instrument's operating procedure; the sequencing data volume was set to 1 G.
Analysis of the sequencing output data:
Step 1: BWA is used to build the index of the reference genome (MT019531.1), and samtools faidx is used to generate the .fai file. Basic statistics of the MT019531.1 genome: total length 29,899 bp, GC content 37.98%;
Step 2: read quality control. SOAPnuke is used to filter the paired-end reads and produce clean reads (the reads remaining after filtering). Reads that meet any of the following conditions are removed: 1) reads contaminated with adapter sequences; 2) reads in which N bases exceed 10%; 3) reads in which low-quality bases (Q < 38) exceed 50% of the whole read;
Step 3: alignment and sorting. The clean reads were aligned to the reference genome of COVID-2019 (MT019531.1) using BWA together with samtools to generate BAM files, with the alignment parameters "-t 32 -M". The jar of the Picard software was used for sorting. The samtools index tool was used to build an index for the sorted BAM file. Quality control of the resulting BAM file was performed with tools such as Qualimap;
Step 4: variant detection. SNPs and InDels of the virus were detected using samtools pileup and VarScan. The SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";
Step 5: the detected SNPs were annotated with ANNOVAR based on the GFF file of the MT019531.1 reference genome.
Result analysis:
FIG. 5 shows the library construction result of this example. When the pharyngeal swab viral RNA with a relatively low viral copy number was reverse transcribed with the specific primers and libraries were then constructed, both libraries were unimodal, with the main peak at the expected position of 380-420 bp. The blank control (NC) yielded no library, as expected. The sequencing data output was 1.3 G (48d5-1) and 1.3 G (47d1-1), and the Q30 values of the raw data were 94.18% and 94.37%, respectively (Table 11).
Table 11. Quality control of the sequencing data of the pharyngeal swab sample libraries
Figure BDA0002427599160000251
After filtering, the sequencing data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531GWHABKH 000000000000); the aligned reads, aligned bases, alignment rate, mismatch rate and other statistics are shown in Table 12. The alignment rate of the pharyngeal swab sample data exceeded 96% and the mismatch rate was below 0.03%. For both libraries, the proportion of the novel coronavirus accessory genes N and S covered at a sequencing depth of 100× was 100%, so the virus can be identified as the novel coronavirus COVID-19; the proportion of the viral genome covered at a sequencing depth of 100× reached 99.08% and 98.85%, respectively (Table 12).
Table 12. Statistics of the sequencing data analysis of the pharyngeal swab sample libraries
Figure BDA0002427599160000252
48d5-2 has 6 single nucleotide polymorphism (SNP) sites and 4 deletion mutation sites. The SNP sites are: MT019531.1 genomic position 2132 (orf1ab: A1867G), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 11354 (orf1ab: G11089A), MT019531.1 genomic position 17194 (orf1ab: A16929G), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T). The deletion mutation sites are: MT019531.1 genomic position 9264 (orf1ab:9000_9005del), MT019531.1 genomic position 9851 (orf1ab:9587_9596del), MT019531.1 genomic position 20296 (orf1ab:20032_20035del), MT019531.1 genomic position 29067 (N:795_ del).
47d1-2 has 3 single nucleotide polymorphism (SNP) sites, and no insertion or deletion mutations were found. The SNP sites are: MT019531.1 genomic position 1578 (orf1ab: T1313A), MT019531.1 genomic position 6996 (orf1ab: C6731T) and MT019531.1 genomic position 18123 (orf1ab: T17858C). Taking the results of Example 3 and Example 4 together, both reverse transcription methods identify the same mutation sites, while the library obtained by reverse transcription with the specific primers is superior in data utilization, including the alignment rate, the average sequencing depth and the proportion covered at a sequencing depth of 100×.
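The agreement between the two reverse transcription methods can be checked by comparing the SNP positions reported for each sample in Example 3 and Example 4. The sketch below does this with set operations; the positions are those listed in the two examples, while the dictionary layout and function name are only illustrative.

```python
# SNP genomic positions (MT019531.1) reported per sample in each example
example3 = {
    "48d5": {2132, 6996, 11354, 17194, 18395, 18557},
    "47d1": {1578, 6996, 18123},
}
example4 = {
    "48d5": {2132, 6996, 11354, 17194, 18395, 18557},
    "47d1": {1578, 6996, 18123},
}


def shared_fraction(a, b):
    """Fraction of all reported positions that both call sets contain."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


concordance = {s: shared_fraction(example3[s], example4[s]) for s in example3}
print(concordance)
```

A value of 1.0 for a sample means that every SNP position was found by both methods.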
Example 5
The virus used in this example was a novel coronavirus strain isolated in the laboratory and used to infect cultured cells in vitro; the viral supernatants were extracted, for a total of 3 samples. Virus culture and RNA extraction were performed with the assistance of the biosafety level 3 (P3) laboratory of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences / Peking Union Medical College.
The method provided in this example can be used to detect viral genome mutations of the novel coronavirus at high copy number, so as to identify viral variation under high-copy-number conditions and to analyze its evolution. The viral concentration (copies/μL) was determined by absolute quantitative qRT-PCR, using the N gene copy number of the novel coronavirus nucleic acid reference material (high concentration) GBW(E)091089 (National Institute of Metrology, China) as the reference (Table 13).
Table 13. RNA viral copy number and clinical information of the cultured viruses
Figure BDA0002427599160000261
The specific experimental method is as follows:
Viral single-stranded RNA extracted after virus culture was reverse transcribed into single-stranded cDNA (1st cDNA) using a mixture of 34 gene-specific reverse primers (genome orientation 3'-5'; the same specific reverse transcription primer combination as in Example 2) and a 1st cDNA synthesis kit, namely TAKARA PrimeScript 1st Strand cDNA Synthesis Kit (TAKARA, Cat No. 6110A); the 1st cDNA was purified and then used in the subsequent amplification reactions.
The PCR reaction is carried out using anchored multiplex amplification primer group 1. The PCR reaction system comprises 5 μL of cDNA template, 15 μL of the anchored multiplex primer group, 15 μL of the DNA polymerase and 2× buffer system, and 5 μL of double-distilled water; the DNA polymerase and 2× buffer system is KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprises: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, for 10 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The PCR reaction is carried out using anchored multiplex amplification primer group 2. The PCR reaction system comprises 5 μL of cDNA template from the same sample, 25 μL of the anchored multiplex primer group, 15 μL of the DNA polymerase and 2× buffer system, and 5 μL of double-distilled water; the DNA polymerase and 2× buffer system is KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprises: step a, pre-denaturation at 98 °C for 1 min; step b, cyclic amplification: denaturation at 98 °C for 20 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, for 10 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The first-round anchored PCR amplification products were purified separately. The procedure comprised: step a, adding DNA purification magnetic beads at 1 times the volume of the amplification product (30 μL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting with 30 μL of EB buffer (Qiagen, Cat No. 19086). The two sets of PCR products were then mixed in equimolar amounts.
The Illumina library was amplified using the Illumina library tagged amplification primers. The reaction system comprises 20 μL of the mixed first-round PCR products, 5 μL of the Illumina library amplification primer pair, and 25 μL of the DNA polymerase and 2× buffer system; preferably, the DNA polymerase and 2× buffer system is KAPA HiFi HotStart ReadyMix (Roche, Cat No. KK2602). The PCR amplification program comprises: step a, pre-denaturation at 98 °C for 45 s; step b, cyclic amplification: denaturation at 98 °C for 15 s, annealing at 60 °C for 30 s, extension at 72 °C for 30 s, for 10 cycles; step c, final extension at 72 °C for 60 s, then storage at 4 °C.
The second-round Illumina library amplification product was purified as follows: step a, adding DNA purification magnetic beads at 1 times the volume of the amplification product (50 μL), preferably Agencourt AMPure XP Beads (Beckman, Cat No. 14403400); step b, incubating for 5 min at room temperature; step c, placing on a magnetic rack for 10 min; step d, washing the magnetic beads twice with freshly prepared 80% ethanol; step e, eluting with 30 μL of EB buffer (Qiagen, Cat No. 19086).
High-throughput sequencing: the library purified in the previous step was sequenced on an Illumina NovaSeq according to the instrument's operating procedure. Because the viral copy number was high, and in order to reduce the influence of the Illumina sequencing platform on the data results, the sequencing data volume was set to 500 M (0.5 G).
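To put the chosen data volume in perspective, the theoretical mean depth can be estimated by dividing the number of sequenced bases by the genome length. The sketch below assumes that "500 M" and "1 G" denote bases, which the text does not state explicitly, and it ignores reads lost to filtering or failing to align unless `aligned_fraction` is lowered; the function name is hypothetical.

```python
GENOME_LENGTH_BP = 29899  # total length of MT019531.1 given in the analysis steps


def expected_mean_depth(total_bases, genome_length=GENOME_LENGTH_BP,
                        aligned_fraction=1.0):
    """Theoretical mean depth if the given fraction of bases aligns uniformly."""
    return total_bases * aligned_fraction / genome_length


depth_500m = expected_mean_depth(500e6)  # data volume used for cultured virus
depth_1g = expected_mean_depth(1e9)      # data volume used for swab samples
print(round(depth_500m), round(depth_1g))
```

Even at the reduced data volume, the theoretical depth is far above the 100× threshold used in the coverage statistics, so uniformity across amplicons, not total yield, is the practical limit.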
Analysis of the sequencing output data:
Step 1: BWA is used to build the index of the reference genome (MT019531.1), and samtools faidx is used to generate the .fai file. Basic statistics of the MT019531.1 genome: total length 29,899 bp, GC content 37.98%;
Step 2: read quality control. SOAPnuke is used to filter the paired-end reads and produce clean reads (the reads remaining after filtering). Reads that meet any of the following conditions are removed: 1) reads contaminated with adapter sequences; 2) reads in which N bases exceed 10%; 3) reads in which low-quality bases (Q < 38) exceed 50% of the whole read;
Step 3: alignment and sorting. The clean reads were aligned to the reference genome of COVID-2019 (MT019531.1) using BWA together with samtools to generate BAM files, with the alignment parameters "-t 32 -M". The jar of the Picard software was used for sorting. The samtools index tool was used to build an index for the sorted BAM file. Quality control of the resulting BAM file was performed with tools such as Qualimap;
Step 4: variant detection. SNPs and InDels of the virus were detected using samtools pileup and VarScan. The SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";
Step 5: the detected SNPs were annotated with ANNOVAR based on the GFF file of the MT019531.1 reference genome.
Result analysis:
FIG. 6 shows the library construction results of this example. Libraries were constructed from virus culture samples with a viral copy number of about 10^8; all libraries were unimodal with a narrow, sharp size distribution, and the average library fragment size was about 380-420 bp, consistent with the expected average amplicon size and overall library size. Q30 exceeded 93.79% (Table 14).
Table 14. Quality control of the sequencing data of the cultured virus sample libraries
Figure BDA0002427599160000281
After filtering (clean reads), the sequencing data were aligned to the viral reference genome MT019531.1 (Accession No: MT019531GWHABKH 000000000000); the aligned reads, aligned bases, alignment rate, mismatch rate and other statistics are shown in Table 15. The alignment rate of the sample data exceeded 99.61% and the mismatch rate was below 0.2%. For the two libraries, the proportion of the novel coronavirus accessory genes N and S covered at a sequencing depth of 100× was 100%, so the virus can be identified as the novel coronavirus COVID-19; the proportion of the viral genome covered at a sequencing depth of 100× exceeded 98.65% for all 3 samples (Table 15).
Table 15. Statistics of the sequencing data analysis of the cultured virus sample libraries
Figure BDA0002427599160000282
Figure BDA0002427599160000291
XH1P2-R has 7 single nucleotide polymorphism (SNP) sites, and no insertion or deletion mutation sites were found. The SNP sites are: MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 3706 (orf1ab: A3441G), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 5812 (orf1ab: C5547T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 18395 (orf1ab: C18130T), MT019531.1 genomic position 18557 (orf1ab: C18292T).
XH1P6-R has 6 single nucleotide polymorphism (SNP) sites, and no insertion or deletion mutation sites were found. The SNP sites are: MT019531.1 genomic position 3127 (orf1ab: T2862C), MT019531.1 genomic position 5369 (orf1ab: G5104T), MT019531.1 genomic position 5812 (orf1ab: C5547T), MT019531.1 genomic position 6996 (orf1ab: C6731T), MT019531.1 genomic position 18557 (orf1ab: C18292T), MT019531.1 genomic position 26308 (E: G64T).
WHP6-R has 9 single nucleotide polymorphism (SNP) sites and 1 deletion mutation site. The SNP sites are: MT019531.1 genomic position 565 (ORF1ab: T300C), MT019531.1 genomic position 6996 (ORF1ab: C6731T), MT019531.1 genomic position 7010 (ORF1ab: G6745A), MT019531.1 genomic position 17825 (ORF1ab: C17560T), MT019531.1 genomic position 18557 (ORF1ab: C18292T), MT019531.1 genomic position 21784 (S: T333A), MT019531.1 genomic position 23525 (S: C1965T), MT019531.1 genomic position 23598 (S: A2036G), MT019531.1 genomic position 019531.1 (ORF 019531.1: G16 019531.1). The deletion mutation site is: MT019531.1 genomic position 23594 (S:2033_2062del).
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.
References:
[1] Ge, X.-Y. et al. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature 503, 535–538 (2013)
[2] Yang, L. et al. Novel SARS-like betacoronaviruses in bats, China, 2011. Emerg. Infect. Dis. 19, 989–991 (2013)
[3] Menachery, V. D. et al. SARS-like WIV1-CoV poised for human emergence. Proc. Natl Acad. Sci. USA 113, 3048–3053 (2016)
[4] Cui, J., Li, F. & Shi, Z. L. Origin and evolution of pathogenic coronaviruses. Nat. Rev. Microbiol. 17, 181–192 (2019)
[5] Fan, Y., Zhao, K., Shi, Z.-L. & Zhou, P. Bat coronaviruses in China. Viruses 11, 210 (2019)
[6] Wuhan Municipal Health Commission. Press statement related to novel coronavirus infection (in Chinese) http://wjw.wuhan.gov.cn/front/web/showDetail/2020012709194 (2020)
[7] Zhou, P., Yang, X., Wang, X. et al. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579, 270–273 (2020). https://doi.org/10.1038/s41586-020-2012-7
sequence listing
<110> Fuzhou Furui medical laboratory Co., Ltd
Chinese Medical Sciences Academy
<120> construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction
<130>KHP201111315.5
<160>504
<170>SIPOSequenceListing 1.0
<210>1
<211>34
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>1
gtctcgtggg ctcggagatg tgtataagag acag 34
<210>2
<211>33
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>2
tcgtcggcag cgtcagatgt gtataagaga cag 33
<210>3
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>3
gtctcgtggg ctcggagatg tgtataagag acagaccaac caactttcga tctct 55
<210>4
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>4
gtctcgtggg ctcggagatg tgtataagag acagtcccag gtaacaaacc aacc 54
<210>5
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>5
gtctcgtggg ctcggagatg tgtataagag acagggtgtg accgaaaggt aagat 55
<210>6
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>6
gtctcgtggg ctcggagatg tgtataagag acaggtccct ggtttcaacg agaa 54
<210>7
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>7
gtctcgtggg ctcggagatg tgtataagag acagggcgaa ataccagtgg ctta 54
<210>8
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>8
gtctcgtggg ctcggagatg tgtataagag acagttgagc tggtagcaga actc 54
<210>9
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>9
gtctcgtggg ctcggagatg tgtataagag acagggtgtt acccgtgaac tcat 54
<210>10
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>10
gtctcgtggg ctcggagatg tgtataagag acagtgtccg aacaactgga ctttat 56
<210>11
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>11
gtctcgtggg ctcggagatg tgtataagag acaggcttga tggctttatg ggtaga 56
<210>12
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>12
gtctcgtggg ctcggagatg tgtataagag acagattgtc cagcatgtca caattc 56
<210>13
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>13
gtctcgtggg ctcggagatg tgtataagag acaggtggaa actgtgaaag gtttgg 56
<210>14
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>14
gtctcgtggg ctcggagatg tgtataagag acagttctcc cgcactcttg aaac 54
<210>15
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>15
gtctcgtggg ctcggagatg tgtataagag acaggctcgt gttgtacgat caattt 56
<210>16
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>16
gtctcgtggg ctcggagatg tgtataagag acagtcgcag tggctaacta acatc 55
<210>17
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>17
gtctcgtggg ctcggagatg tgtataagag acagagagaa gtttaaggaa ggtgtagag 59
<210>18
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>18
gtctcgtggg ctcggagatg tgtataagag acagcagaga agaaactggc ctactc 56
<210>19
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>19
gtctcgtggg ctcggagatg tgtataagag acagcatttg tcacgcactc aaagg 55
<210>20
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>20
gtctcgtggg ctcggagatg tgtataagag acagctgtgc ccttgcacct aata 54
<210>21
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>21
gtctcgtggg ctcggagatg tgtataagag acagcattgg ttggtacacc agtttg 56
<210>22
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>22
gtctcgtggg ctcggagatg tgtataagag acagaatgag aagtgctctg cctatac 57
<210>23
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>23
gtctcgtggg ctcggagatg tgtataagag acaggcaagg ttacaagagt gtgaatatc 59
<210>24
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>24
gtctcgtggg ctcggagatg tgtataagag acagcttaca ccactgggca ttga 54
<210>25
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>25
gtctcgtggg ctcggagatg tgtataagag acagcctcca gatgaggatg aagaag 56
<210>26
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>26
gtctcgtggg ctcggagatg tgtataagag acagcctgaa gaagagcaag aagaaga 57
<210>27
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>27
gtctcgtggg ctcggagatg tgtataagag acaggcagtg aggacaatca gacaa 55
<210>28
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>28
gtctcgtggg ctcggagatg tgtataagag acagaaatgc agacattgtg gaagaag 57
<210>29
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>29
gtctcgtggg ctcggagatg tgtataagag acagccttaa acatggagga ggtgtt 56
<210>30
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>30
gtctcgtggg ctcggagatg tgtataagag acaggcggac acaatcttgc taaac 55
<210>31
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>31
gtctcgtggg ctcggagatg tgtataagag acagggtgct gaccctatac attctt 56
<210>32
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>32
gtctcgtggg ctcggagatg tgtataagag acaggatcgc tgagattcct aaagagg 57
<210>33
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>33
gtctcgtggg ctcggagatg tgtataagag acagcttcat ccagattctg ccactc 56
<210>34
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>34
gtctcgtggg ctcggagatg tgtataagag acagtgatgt tgttcaagag ggtgt 55
<210>35
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>35
gtctcgtggg ctcggagatg tgtataagag acagaactgc tgtggttata cctactaaa 59
<210>36
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>36
gtctcgtggg ctcggagatg tgtataagag acaggcttgc acatgcagaa gaaa 54
<210>37
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>37
gtctcgtggg ctcggagatg tgtataagag acagcatgca gaagaaacac gcaaat 56
<210>38
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>38
gtctcgtggg ctcggagatg tgtataagag acaggcgtca cttatcaaca cacttaac 58
<210>39
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>39
gtctcgtggg ctcggagatg tgtataagag acagcatctc acttgctggt tcctat 56
<210>40
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>40
gtctcgtggg ctcggagatg tgtataagag acagggtcct attctggaca atctacac 58
<210>41
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>41
gtctcgtggg ctcggagatg tgtataagag acagttgaga gaagtgagga ctattaagg 59
<210>42
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>42
gtctcgtggg ctcggagatg tgtataagag acaggggtag gtacatgtca gcatt 55
<210>43
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>43
gtctcgtggg ctcggagatg tgtataagag acagaccaca caactgatcc tagttt 56
<210>44
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>44
gtctcgtggg ctcggagatg tgtataagag acaggacagt aggtgagtta ggtgatg 57
<210>45
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>45
gtctcgtggg ctcggagatg tgtataagag acagggtgag ttaggtgatg ttagagaaa 59
<210>46
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>46
gtctcgtggg ctcggagatg tgtataagag acagcagata ccttgtacgt gtggtaa 57
<210>47
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>47
gtctcgtggg ctcggagatg tgtataagag acagtacgtg tggtaaacaa gctaca 56
<210>48
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>48
gtctcgtggg ctcggagatg tgtataagag acagttgcat agacggtgct ttact 55
<210>49
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>49
gtctcgtggg ctcggagatg tgtataagag acagacaatt cttatttcac agagcaacc 59
<210>50
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>50
gtctcgtggg ctcggagatg tgtataagag acagccaaac caaccatatc caaacg 56
<210>51
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>51
gtctcgtggg ctcggagatg tgtataagag acagtggtga tgtggtggct attg 54
<210>52
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>52
gtctcgtggg ctcggagatg tgtataagag acagtccctg acttaaatgg tgatgt 56
<210>53
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>53
gtctcgtggg ctcggagatg tgtataagag acagcagtct ctgaagaagt agtggaaa 58
<210>54
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>54
gtctcgtggg ctcggagatg tgtataagag acagtcttgc ctgcgaagat ctaaa 55
<210>55
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>55
gtctcgtggg ctcggagatg tgtataagag acagaccctt gctactcatg gtttag 56
<210>56
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>56
gtctcgtggg ctcggagatg tgtataagag acagtactca tggtttagct gctgtt 56
<210>57
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>57
gtctcgtggg ctcggagatg tgtataagag acagatgccg actactatag caaagaata 59
<210>58
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>58
gtctcgtggg ctcggagatg tgtataagag acagaaagca tctatgccga ctactat 57
<210>59
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>59
gtctcgtggg ctcggagatg tgtataagag acaggtactg gttacagaga aggctatt 58
<210>60
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>60
gtctcgtggg ctcggagatg tgtataagag acagacagag aaggctattt gaactcta 58
<210>61
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>61
gtctcgtggg ctcggagatg tgtataagag acagtacttg gattggctgc aatcat 56
<210>62
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>62
gtctcgtggg ctcggagatg tgtataagag acagggctgc aatcatgcaa ttgtt 55
<210>63
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>63
gtctcgtggg ctcggagatg tgtataagag acagagagca acaagagtcg aatgta 56
<210>64
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>64
gtctcgtggg ctcggagatg tgtataagag acaggtgtta caaacgtaat agagcaaca 59
<210>65
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>65
gtctcgtggg ctcggagatg tgtataagag acagagaatg gttccatcca tctttact 58
<210>66
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>66
gtctcgtggg ctcggagatg tgtataagag acaggaccag tcttcttaca tcgttga 57
<210>67
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>67
gtctcgtggg ctcggagatg tgtataagag acagcgtctg tttactacag tcagcttat 59
<210>68
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>68
gtctcgtggg ctcggagatg tgtataagag acagtgttgg tgatagtgcg gaag 54
<210>69
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>69
gtctcgtggg ctcggagatg tgtataagag acagcttgca aagaatgtgt ccttagac 58
<210>70
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>70
gtctcgtggg ctcggagatg tgtataagag acaggtgacc ttggtgcttg tattg 55
<210>71
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>71
gtctcgtggg ctcggagatg tgtataagag acaggactgt agtgcgcgtc atatt 55
<210>72
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>72
gtctcgtggg ctcggagatg tgtataagag acagtgacat gtgcaactac tagacaa 57
<210>73
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>73
gtctcgtggg ctcggagatg tgtataagag acagaagata gcacttaagg gtggtaaa 58
<210>74
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>74
gtctcgtggg ctcggagatg tgtataagag acagccagcg tggtggtagt tatac 55
<210>75
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>75
gtctcgtggg ctcggagatg tgtataagag acaggacaaa gcttgcccat tgatt 55
<210>76
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>76
gtctcgtggg ctcggagatg tgtataagag acagcttctg gtaagccagt accatatt 58
<210>77
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>77
gtctcgtggg ctcggagatg tgtataagag acaggatgct tctggtaagc cagt 54
<210>78
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>78
gtctcgtggg ctcggagatg tgtataagag acagcagaag ctggtgtttg tgtatc 56
<210>79
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>79
gtctcgtggg ctcggagatg tgtataagag acaggtggta gatgggtact taacaatga 59
<210>80
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>80
gtctcgtggg ctcggagatg tgtataagag acagacacca gtttactcat tcttacct 58
<210>81
<211>61
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>81
gtctcgtggg ctcggagatg tgtataagag acagctattc cttatgtcat tcactgtact 60
c 61
<210>82
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>82
gtctcgtggg ctcggagatg tgtataagag acagcgtgta gtctttaatg gtgtttcc 58
<210>83
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>83
gtctcgtggg ctcggagatg tgtataagag acagttgaag aagctgcgct gt 52
<210>84
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>84
gtctcgtggg ctcggagatg tgtataagag acagtctcgc aaaggctctc aatg 54
<210>85
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>85
gtctcgtggg ctcggagatg tgtataagag acagcatgtg atctgcacct ctgaa 55
<210>86
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>86
gtctcgtggg ctcggagatg tgtataagag acaggatgac gtagtttact gtccaaga 58
<210>87
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>87
gtctcgtggg ctcggagatg tgtataagag acagcagcca atcctaagac acctaag 57
<210>88
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>88
gtctcgtggg ctcggagatg tgtataagag acagaggttg atacagccaa tcctaag 57
<210>89
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>89
gtctcgtggg ctcggagatg tgtataagag acagcatgct ggcacagact taga 54
<210>90
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>90
gtctcgtggg ctcggagatg tgtataagag acagtggcac agacttagaa ggtaac 56
<210>91
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>91
gtctcgtggg ctcggagatg tgtataagag acagtgactt taaccttgtg gctatga 57
<210>92
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>92
gtctcgtggg ctcggagatg tgtataagag acagggacct ctttctgctc aaact 55
<210>93
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>93
gtctcgtggg ctcggagatg tgtataagag acagaatcaa gggtacacac cactg 55
<210>94
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>94
gtctcgtggg ctcggagatg tgtataagag acagcacacc actggttgtt actca 55
<210>95
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>95
gtctcgtggg ctcggagatg tgtataagag acaggctagt tgggtgatgc gtatt 55
<210>96
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>96
gtctcgtggg ctcggagatg tgtataagag acaggtctat atgcctgcta gttggg 56
<210>97
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>97
gtctcgtggg ctcggagatg tgtataagag acagccattt ccatgtgggc tctta 55
<210>98
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>98
gtctcgtggg ctcggagatg tgtataagag acagaggtgt agttacaact gtcatgt 57
<210>99
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>99
gtctcgtggg ctcggagatg tgtataagag acaggttggt ggcaaacctt gtatc 55
<210>100
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>100
gtctcgtggg ctcggagatg tgtataagag acagctccca cccaagaata gcatag 56
<210>101
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>101
gtctcgtggg ctcggagatg tgtataagag acaggctaaa gatactactg aagcctttg 59
<210>102
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>102
gtctcgtggg ctcggagatg tgtataagag acaggcaggg tgctgtagac ataaa 55
<210>103
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>103
gtctcgtggg ctcggagatg tgtataagag acaggctgtt gctaatggtg attctg 56
<210>104
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>104
gtctcgtggg ctcggagatg tgtataagag acaggctaat ggtgattctg aagttgtt 58
<210>105
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>105
gtctcgtggg ctcggagatg tgtataagag acagacaaca gcagccaaac taatg 55
<210>106
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>106
gtctcgtggg ctcggagatg tgtataagag acaggagatg gttgtgttcc cttga 55
<210>107
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>107
gtctcgtggg ctcggagatg tgtataagag acagggccaa ttctgctgtc aaatta 56
<210>108
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>108
gtctcgtggg ctcggagatg tgtataagag acagcagctt taagggccaa ttctg 55
<210>109
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>109
gtctcgtggg ctcggagatg tgtataagag acagaacaca acaaagggag gtagg 55
<210>110
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>110
gtctcgtggg ctcggagatg tgtataagag acaggaaatg ggctagattc cctaaga 57
<210>111
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>111
gtctcgtggg ctcggagatg tgtataagag acagtagctg ccacagtacg tcta 54
<210>112
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>112
gtctcgtggg ctcggagatg tgtataagag acagcaagct ggtaatgcaa cagaag 56
<210>113
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>113
gtctcgtggg ctcggagatg tgtataagag acagggtact ggtcaggcaa taaca 55
<210>114
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>114
gtctcgtggg ctcggagatg tgtataagag acaggttgcc acatagatca tccaaatc 58
<210>115
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>115
gtctcgtggg ctcggagatg tgtataagag acagctgatg tcgtatacag ggcttt 56
<210>116
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>116
gtctcgtggg ctcggagatg tgtataagag acagaacggg tttgcggtgt aa 52
<210>117
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>117
gtctcgtggg ctcggagatg tgtataagag acaggattgt ccagctgttg ctaaac 56
<210>118
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>118
gtctcgtggg ctcggagatg tgtataagag acagggtgac atggtaccac atatatca 58
<210>119
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>119
gtctcgtggg ctcggagatg tgtataagag acagccatgc gaaatgctgg tattg 55
<210>120
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>120
gtctcgtggg ctcggagatg tgtataagag acaggtgtac gccaagcttt gtt 53
<210>121
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>121
gtctcgtggg ctcggagatg tgtataagag acagccttga ccagggcttt aact 54
<210>122
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>122
gtctcgtggg ctcggagatg tgtataagag acagagggct ttaactgcag agtc 54
<210>123
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>123
gtctcgtggg ctcggagatg tgtataagag acagtctaca gtgttcccac ctaca 55
<210>124
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>124
gtctcgtggg ctcggagatg tgtataagag acagcacgct gcttctggta atct 54
<210>125
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>125
gtctcgtggg ctcggagatg tgtataagag acagtacttg tgtatgctgc tgacc 55
<210>126
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>126
gtctcgtggg ctcggagatg tgtataagag acagtcagga tggtaatgct gctatc 56
<210>127
<211>62
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>127
gtctcgtggg ctcggagatg tgtataagag acaggattca atgagttatg aggatcaaga 60
tg 62
<210>128
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>128
gtctcgtggg ctcggagatg tgtataagag acaggtggtt ggcacaacat gttaaa 56
<210>129
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>129
gtctcgtggg ctcggagatg tgtataagag acagaagcaa attctatggt ggttgg 56
<210>130
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>130
gtctcgtggg ctcggagatg tgtataagag acagaaacca ggtggaacct catc 54
<210>131
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>131
gtctcgtggg ctcggagatg tgtataagag acagcatgtg tggcggttca ctat 54
<210>132
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>132
gtctcgtggg ctcggagatg tgtataagag acaggacgat gctgttgtgt gtttc 55
<210>133
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>133
gtctcgtggg ctcggagatg tgtataagag acagatactc tctgacgatg ctgttg 56
<210>134
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>134
gtctcgtggg ctcggagatg tgtataagag acagtacctt ccttacccag atcca 55
<210>135
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>135
gtctcgtggg ctcggagatg tgtataagag acagccttac ccagatccat caagaa 56
<210>136
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>136
gtctcgtggg ctcggagatg tgtataagag acagacatga tgagttaaca ggacaca 57
<210>137
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>137
gtctcgtggg ctcggagatg tgtataagag acagcaaggt attgggaacc tgag 54
<210>138
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>138
gtctcgtggg ctcggagatg tgtataagag acagacacac cgcatacagt cttac 55
<210>139
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>139
gtctcgtggg ctcggagatg tgtataagag acaggctatg tacacaccgc ataca 55
<210>140
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>140
gtctcgtggg ctcggagatg tgtataagag acagccattg tgtgctaatg gacaag 56
<210>141
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>141
gtctcgtggg ctcggagatg tgtataagag acagaaacct agaccaccac ttaacc 56
<210>142
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>142
gtctcgtggg ctcggagatg tgtataagag acaggggaag ttggtaaacc tagacc 56
<210>143
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>143
gtctcgtggg ctcggagatg tgtataagag acagctggct tatacccaac actcaa 56
<210>144
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>144
gtctcgtggg ctcggagatg tgtataagag acagaccacc tggtactggt aaga 54
<210>145
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>145
gtctcgtggg ctcggagatg tgtataagag acagcgtgct cgtgtagagt gttt 54
<210>146
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>146
gtctcgtggg ctcggagatg tgtataagag acagcctgag acgacagcag atatag 56
<210>147
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>147
gtctcgtggg ctcggagatg tgtataagag acagcactgt gagtgctttg gtttatg 57
<210>148
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>148
gtctcgtggg ctcggagatg tgtataagag acaggttgac actgtgagtg ctttg 55
<210>149
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>149
gtctcgtggg ctcggagatg tgtataagag acaggattca tcacagggct cagaata 57
<210>150
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>150
gtctcgtggg ctcggagatg tgtataagag acagcaaacc actgaaacag ctcac 55
<210>151
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>151
gtctcgtggg ctcggagatg tgtataagag acagatccta cacaggcacc taca 54
<210>152
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>152
gtctcgtggg ctcggagatg tgtataagag acagaatcac tgggttacat cctacac 57
<210>153
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>153
gtctcgtggg ctcggagatg tgtataagag acagcacccg cgaagaagct ataa 54
<210>154
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>154
gtctcgtggg ctcggagatg tgtataagag acagcatgtt tatcacccgc gaaga 55
<210>155
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>155
gtctcgtggg ctcggagatg tgtataagag acagcaccgc ctggagatca attt 54
<210>156
<211>61
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>156
gtctcgtggg ctcggagatg tgtataagag acagctggag atcaatttaa acacctcata 60
c 61
<210>157
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>157
gtctcgtggg ctcggagatg tgtataagag acagtccact gcttcagaca cttatg 56
<210>158
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>158
gtctcgtggg ctcggagatg tgtataagag acaggcctgt tggcatcatt ctattg 56
<210>159
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>159
gtctcgtggg ctcggagatg tgtataagag acagagcgtg ttgactggac tattg 55
<210>160
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>160
gtctcgtggg ctcggagatg tgtataagag acagagtgct ttgttaagcg tgttg 55
<210>161
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>161
gtctcgtggg ctcggagatg tgtataagag acagtgccac acattctgac aaattc 56
<210>162
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>162
gtctcgtggg ctcggagatg tgtataagag acagttcaca gatggtgtat gcctatt 57
<210>163
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>163
gtctcgtggg ctcggagatg tgtataagag acagccacta aagtctgcta cgtgtataa 59
<210>164
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>164
gtctcgtggg ctcggagatg tgtataagag acagatgtac cactaaagtc tgctacg 57
<210>165
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>165
gtctcgtggg ctcggagatg tgtataagag acaggatgga caacagggtg aagt 54
<210>166
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>166
gtctcgtggg ctcggagatg tgtataagag acagcagggt gaagtaccag tttctatc 58
<210>167
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>167
gtctcgtggg ctcggagatg tgtataagag acaggtgtgg acattgctgc taatac 56
<210>168
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>168
gtctcgtggg ctcggagatg tgtataagag acaggctaat actgtgatct gggactac 58
<210>169
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>169
gtctcgtggg ctcggagatg tgtataagag acagtgcccg taatggtgtt cttat 55
<210>170
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>170
gtctcgtggg ctcggagatg tgtataagag acaggtgcac cactcactgt cttt 54
<210>171
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>171
gtctcgtggg ctcggagatg tgtataagag acaggttgat ggtgttgtcc aacaatta 58
<210>172
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>172
gtctcgtggg ctcggagatg tgtataagag acaggttgtc caacaattac ctgaaact 58
<210>173
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>173
gtctcgtggg ctcggagatg tgtataagag acagagtcat agtcagttag gtggtttac 59
<210>174
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>174
gtctcgtggg ctcggagatg tgtataagag acagtaggtg gtttacatct actgattgg 59
<210>175
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>175
gtctcgtggg ctcggagatg tgtataagag acagttatgc tttggtgtaa agatggc 57
<210>176
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>176
gtctcgtggg ctcggagatg tgtataagag acaggtaaag atggccatgt agaaaca 57
<210>177
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>177
gtctcgtggg ctcggagatg tgtataagag acagaaacac attaacatta gctgtaccc 59
<210>178
<211>60
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>178
gtctcgtggg ctcggagatg tgtataagag acaggtgata tgtacgaccc taagactaaa 60
<210>179
<211>62
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>179
gtctcgtggg ctcggagatg tgtataagag acagctcatt attagtgata tgtacgaccc 60
ta 62
<210>180
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>180
gtctcgtggg ctcggagatg tgtataagag acagtaagct catgggacac ttcg 54
<210>181
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>181
gtctcgtggg ctcggagatg tgtataagag acagcaaatc caattcagtt gtcttcct 58
<210>182
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>182
gtctcgtggg ctcggagatg tgtataagag acaggggtac tgctgttatg tcttta 56
<210>183
<211>60
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>183
gtctcgtggg ctcggagatg tgtataagag acagcactag tctctagtca gtgtgttaat 60
<210>184
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>184
gtctcgtggg ctcggagatg tgtataagag acagttattg ccactagtct ctagtcag 58
<210>185
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>185
gtctcgtggg ctcggagatg tgtataagag acagggtttg ataaccctgt cctacc 56
<210>186
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>186
gtctcgtggg ctcggagatg tgtataagag acagatacat gtctctggga ccaatg 56
<210>187
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>187
gtctcgtggg ctcggagatg tgtataagag acaggaccca gtccctactt attgtt 56
<210>188
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>188
gtctcgtggg ctcggagatg tgtataagag acagttgaat atgtctctca gccttt 56
<210>189
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>189
gtctcgtggg ctcggagatg tgtataagag acagcggctt tagaaccatt ggtaga 56
<210>190
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>190
gtctcgtggg ctcggagatg tgtataagag acagtgaccc tctctcagaa acaaag 56
<210>191
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>191
gtctcgtggg ctcggagatg tgtataagag acagctgtgc acttgaccct ctc 53
<210>192
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>192
gtctcgtggg ctcggagatg tgtataagag acagctgtgt tgctgattat tctgtcc 57
<210>193
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>193
gtctcgtggg ctcggagatg tgtataagag acagtcttga ttctaaggtt ggtggtaa 58
<210>194
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>194
gtctcgtggg ctcggagatg tgtataagag acagggctgc gttatagctt gga 53
<210>195
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>195
gtctcgtggg ctcggagatg tgtataagag acagggttac caaccataca gagtagtag 59
<210>196
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>196
gtctcgtggg ctcggagatg tgtataagag acagacccac taatggtgtt ggttac 56
<210>197
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>197
gtctcgtggg ctcggagatg tgtataagag acagcagaga cattgctgac actact 56
<210>198
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>198
gtctcgtggg ctcggagatg tgtataagag acaggtgatc cacagacact tgagat 56
<210>199
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>199
gtctcgtggg ctcggagatg tgtataagag acagactcct acttggcgtg tttatt 56
<210>200
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>200
gtctcgtggg ctcggagatg tgtataagag acagacacgt gcaggctgtt ta 52
<210>201
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>201
gtctcgtggg ctcggagatg tgtataagag acagagtcaa tccatcattg cctaca 56
<210>202
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>202
gtctcgtggg ctcggagatg tgtataagag acagggaata gctgttgaac aagacaaa 58
<210>203
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>203
gtctcgtggg ctcggagatg tgtataagag acagccaagc aagaggtcat ttattgaag 59
<210>204
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>204
gtctcgtggg ctcggagatg tgtataagag acagcacctt tgctcacaga tgaaatg 57
<210>205
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>205
gtctcgtggg ctcggagatg tgtataagag acaggttagc gggtacaatc acttct 56
<210>206
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>206
gtctcgtggg ctcggagatg tgtataagag acagtcaaga ctcactttct tccacag 57
<210>207
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>207
gtctcgtggg ctcggagatg tgtataagag acagggctga agtgcaaatt gatagg 56
<210>208
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>208
gtctcgtggg ctcggagatg tgtataagag acagtgatca caggcagact tcaaa 55
<210>209
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>209
gtctcgtggg ctcggagatg tgtataagag acaggggcta tcatcttatg tccttcc 57
<210>210
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>210
gtctcgtggg ctcggagatg tgtataagag acagcacctc atggtgtagt cttctt 56
<210>211
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>211
gtctcgtggg ctcggagatg tgtataagag acagtgtgtc tggtaactgt gatgtt 56
<210>212
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>212
gtctcgtggg ctcggagatg tgtataagag acaggactca ttcaaggagg agttagat 58
<210>213
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>213
gtctcgtggg ctcggagatg tgtataagag acagccatgg tacatttggc taggt 55
<210>214
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>214
gtctcgtggg ctcggagatg tgtataagag acagtagctg gcttgattgc catag 55
<210>215
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>215
gtctcgtggg ctcggagatg tgtataagag acagccagtg ctcaaaggag tcaa 54
<210>216
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>216
gtctcgtggg ctcggagatg tgtataagag acagttgctg tagttgtctc aaggg 55
<210>217
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>217
gtctcgtggg ctcggagatg tgtataagag acaggataca agcctcactc cctttc 56
<210>218
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>218
gtctcgtggg ctcggagatg tgtataagag acagctccct ttcggatggc ttatt 55
<210>219
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>219
gtctcgtggg ctcggagatg tgtataagag acaggctttg gctttgctgg aaat 54
<210>220
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>220
gtctcgtggg ctcggagatg tgtataagag acagataatg aggctttggc tttgc 55
<210>221
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>221
gtctcgtggg ctcggagatg tgtataagag acaggatggc acaacaagtc ctatttc 57
<210>222
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>222
gtctcgtggg ctcggagatg tgtataagag acagattacc agctgtactc aactcaa 57
<210>223
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>223
gtctcgtggg ctcggagatg tgtataagag acagcacaca atcgacggtt catc 54
<210>224
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>224
gtctcgtggg ctcggagatg tgtataagag acagcggttc atccggagtt gttaat 56
<210>225
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>225
gtctcgtggg ctcggagatg tgtataagag acagcttgct ttcgtggtat tcttgc 56
<210>226
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>226
gtctcgtggg ctcggagatg tgtataagag acagagttac actagccatc cttactg 57
<210>227
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>227
gtctcgtggg ctcggagatg tgtataagag acagctcctt gaacaatgga acctagta 58
<210>228
<211>60
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>228
gtctcgtggg ctcggagatg tgtataagag acagcctatt ccttacatgg atttgtcttc 60
<210>229
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>229
gtctcgtggg ctcggagatg tgtataagag acagttcttc tcaacgtgcc actc 54
<210>230
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>230
gtctcgtggg ctcggagatg tgtataagag acaggttcca tgtggtcatt caatcc 56
<210>231
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>231
gtctcgtggg ctcggagatg tgtataagag acagcgctac aggattggca actataa 57
<210>232
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>232
gtctcgtggg ctcggagatg tgtataagag acagaacaca gaccattcca gtagc 55
<210>233
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>233
gtctcgtggg ctcggagatg tgtataagag acaggcactg ataacactcg ctactt 56
<210>234
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>234
gtctcgtggg ctcggagatg tgtataagag acaggagcaa ccaatggaga ttgattaaa 59
<210>235
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>235
gtctcgtggg ctcggagatg tgtataagag acagagttac gtgccagatc agttt 55
<210>236
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>236
gtctcgtggg ctcggagatg tgtataagag acagctgttc atcagacaag aggaagt 57
<210>237
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>237
gtctcgtggg ctcggagatg tgtataagag acaggctgca tttcaccaag aatgt 55
<210>238
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>238
gtctcgtggg ctcggagatg tgtataagag acagaatgaa acttgtcacg cctaaac 57
<210>239
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>239
gtctcgtggg ctcggagatg tgtataagag acaggctggt tctaaatcac ccattc 56
<210>240
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>240
gtctcgtggg ctcggagatg tgtataagag acagattgaa ttgtgcgtgg atgag 55
<210>241
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>241
gtctcgtggg ctcggagatg tgtataagag acaggtttgg tggaccctca gatt 54
<210>242
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>242
gtctcgtggg ctcggagatg tgtataagag acagtcaact ggcagtaacc agaat 55
<210>243
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>243
gtctcgtggg ctcggagatg tgtataagag acagcaccaa tagcagtcca gatgac 56
<210>244
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>244
gtctcgtggg ctcggagatg tgtataagag acagctacta ccgaagagct accaga 56
<210>245
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>245
gtctcgtggg ctcggagatg tgtataagag acagcctgct aacaatgctg caatc 55
<210>246
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>246
gtctcgtggg ctcggagatg tgtataagag acagccgcaa tcctgctaac aatg 54
<210>247
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>247
gtctcgtggg ctcggagatg tgtataagag acaggtgatg ctgctcttgc tttg 54
<210>248
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>248
gtctcgtggg ctcggagatg tgtataagag acagtgctgc tgcttgacag att 53
<210>249
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>249
gtctcgtggg ctcggagatg tgtataagag acaggaccag gaactaatca gacaagg 57
<210>250
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>250
gtctcgtggg ctcggagatg tgtataagag acagcccacc aacagagcct aaa 53
<210>251
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>251
gtctcgtggg ctcggagatg tgtataagag acagctgact caactcaggc ctaaac 56
<210>252
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>252
gtctcgtggg ctcggagatg tgtataagag acagagacca cacaaggcag atg 53
<210>253
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>253
tcgtcggcag cgtcagatgt gtataagaga cagggacaag gctctccatc ttac 54
<210>254
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>254
tcgtcggcag cgtcagatgt gtataagaga cagctccatc ttacctttcg gtcac 55
<210>255
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>255
tcgtcggcag cgtcagatgt gtataagaga cagccgaacg tttgatgaac acatag 56
<210>256
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>256
tcgtcggcag cgtcagatgt gtataagaga cagtgctacc agctcaacca taac 54
<210>257
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>257
tcgtcggcag cgtcagatgt gtataagaga cagagggcca cagaagttgt tatc 54
<210>258
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>258
tcgtcggcag cgtcagatgt gtataagaga caggggtaac accactgcta tgt 53
<210>259
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>259
tcgtcggcag cgtcagatgt gtataagaga caggtgtctg caattcatag ctcttt 56
<210>260
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>260
tcgtcggcag cgtcagatgt gtataagaga cagttggtga cgcaactgga tag 53
<210>261
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>261
tcgtcggcag cgtcagatgt gtataagaga cagagactat gctcaggtcc tactt 55
<210>262
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>262
tcgtcggcag cgtcagatgt gtataagaga cagcttcgga accttctcca aca 53
<210>263
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>263
tcgtcggcag cgtcagatgt gtataagaga cagtagtatt gttatagcgg ccttctg 57
<210>264
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>264
tcgtcggcag cgtcagatgt gtataagaga caggttagcc actgcgaagt caa 53
<210>265
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>265
tcgtcggcag cgtcagatgt gtataagaga cagctgaaca acaccacctg taatg 55
<210>266
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>266
tcgtcggcag cgtcagatgt gtataagaga cagtagagtc agcacacaaa gcc 53
<210>267
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>267
tcgtcggcag cgtcagatgt gtataagaga cagggcatga gtaggccagt tt 52
<210>268
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>268
tcgtcggcag cgtcagatgt gtataagaga cagcagagaa gaaactggcc tactc 55
<210>269
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>269
tcgtcggcag cgtcagatgt gtataagaga cagtattagg tgcaagggca cag 53
<210>270
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>270
tcgtcggcag cgtcagatgt gtataagaga cagcaacaca ggcgaactca tttac 55
<210>271
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>271
tcgtcggcag cgtcagatgt gtataagaga cagtaggcag agcacttctc attaag 56
<210>272
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>272
tcgtcggcag cgtcagatgt gtataagaga cagtcctcat ctggagggta gaaa 54
<210>273
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>273
tcgtcggcag cgtcagatgt gtataagaga cagtctggag ggtagaaaga acaatac 57
<210>274
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>274
tcgtcggcag cgtcagatgt gtataagaga cagaggttga agagcagcag aag 53
<210>275
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>275
tcgtcggcag cgtcagatgt gtataagaga cagagtctga acaactggtg taagt 55
<210>276
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>276
tcgtcggcag cgtcagatgt gtataagaga cagaggtaaa cattggctgc attaac 56
<210>277
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>277
tcgtcggcag cgtcagatgt gtataagaga caggcaacac ctcctccatg ttta 54
<210>278
<211>51
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>278
tcgtcggcag cgtcagatgt gtataagaga cagttgggcc gacaacatga a 51
<210>279
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>279
tcgtcggcag cgtcagatgt gtataagaga caggaatgta tagggtcagc accaa 55
<210>280
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>280
tcgtcggcag cgtcagatgt gtataagaga cagcatttgt gcgaacagta tctacac 57
<210>281
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>281
tcgtcggcag cgtcagatgt gtataagaga cagcttccag agttgttgta acttcttc 58
<210>282
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>282
tcgtcggcag cgtcagatgt gtataagaga caggagtggc agaatctgga tgaa 54
<210>283
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>283
tcgtcggcag cgtcagatgt gtataagaga cagacccggg taagtggtta tataattg 58
<210>284
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>284
tcgtcggcag cgtcagatgt gtataagaga cagtctgcat gtgcaagcat ttc 53
<210>285
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>285
tcgtcggcag cgtcagatgt gtataagaga caggcgtgtt tcttctgcat gtg 53
<210>286
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>286
tcgtcggcag cgtcagatgt gtataagaga cagcatagcc aagtggcatt gtaac 55
<210>287
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>287
tcgtcggcag cgtcagatgt gtataagaga cagcgagcag cttcttccaa atttaag 57
<210>288
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>288
tcgtcggcag cgtcagatgt gtataagaga cagtgtccag aataggacca atctttat 58
<210>289
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>289
tcgtcggcag cgtcagatgt gtataagaga cagacttgcg tgtggaggtt aat 53
<210>290
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>290
tcgtcggcag cgtcagatgt gtataagaga cagccaaact gttgtccata tgtcatt 57
<210>291
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>291
tcgtcggcag cgtcagatgt gtataagaga cagctgacat gtacctaccc agaaa 55
<210>292
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>292
tcgtcggcag cgtcagatgt gtataagaga cagcacctaa ctcacctact gtcttatt 58
<210>293
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>293
tcgtcggcag cgtcagatgt gtataagaga cagaacatca cctaactcac ctactg 56
<210>294
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>294
tcgtcggcag cgtcagatgt gtataagaga cagggtgact cctgttgtac tagatatt 58
<210>295
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>295
tcgtcggcag cgtcagatgt gtataagaga cagcaggtgg tgctgacatc ataa 54
<210>296
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>296
tcgtcggcag cgtcagatgt gtataagaga cagggacctt tgtattctga ggactt 56
<210>297
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>297
tcgtcggcag cgtcagatgt gtataagaga cagaacatcc gtaataggac ctttgt 56
<210>298
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>298
tcgtcggcag cgtcagatgt gtataagaga cagcgaagct tgcgtttgga tatg 54
<210>299
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>299
tcgtcggcag cgtcagatgt gtataagaga cagatagcca ccacatcacc attta 55
<210>300
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>300
tcgtcggcag cgtcagatgt gtataagaga cagcgtggct ttattagttg cattgt 56
<210>301
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>301
tcgtcggcag cgtcagatgt gtataagaga cagtcttcgc aggcaagatt atcc 54
<210>302
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>302
tcgtcggcag cgtcagatgt gtataagaga cagcactact tcttcagaga ctggtt 56
<210>303
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>303
tcgtcggcag cgtcagatgt gtataagaga cagacagcag ctaaaccatg agtag 55
<210>304
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>304
tcgtcggcag cgtcagatgt gtataagaga cagggacact attaacagca gctaaac 57
<210>305
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>305
tcgtcggcag cgtcagatgt gtataagaga cagctttgct atagtagtcg gcatagat 58
<210>306
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>306
tcgtcggcag cgtcagatgt gtataagaga cagagtagtc ggcatagatg ctttaat 57
<210>307
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>307
tcgtcggcag cgtcagatgt gtataagaga cagaccagta cagtaggttg caatag 56
<210>308
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>308
tcgtcggcag cgtcagatgt gtataagaga cagagagttc aaatagcctt ctctgt 56
<210>309
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>309
tcgtcggcag cgtcagatgt gtataagaga cagtgcagcc aatccaagta cata 54
<210>310
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>310
tcgtcggcag cgtcagatgt gtataagaga cagcatgatt gcagccaatc caa 53
<210>311
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>311
tcgtcggcag cgtcagatgt gtataagaga cagacattcg actcttgttg ctctatt 57
<210>312
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>312
tcgtcggcag cgtcagatgt gtataagaga caggactctt gttgctctat tacgtttg 58
<210>313
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>313
tcgtcggcag cgtcagatgt gtataagaga caggaaccat tcttcactgt aacactatc 59
<210>314
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>314
tcgtcggcag cgtcagatgt gtataagaga cagcgatgta agaagactgg tcagtag 57
<210>315
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>315
tcgtcggcag cgtcagatgt gtataagaga cagactgcaa cttccgcact atc 53
<210>316
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>316
tcgtcggcag cgtcagatgt gtataagaga cagcactatc accaacatca gacacta 57
<210>317
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>317
tcgtcggcag cgtcagatgt gtataagaga cagctgaatc aacaaaccct tgcc 54
<210>318
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>318
tcgtcggcag cgtcagatgt gtataagaga cagcgccagt aacttctatg tcagat 56
<210>319
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>319
tcgtcggcag cgtcagatgt gtataagaga caggcattaa tatgacgcgc actac 55
<210>320
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>320
tcgtcggcag cgtcagatgt gtataagaga cagtttacca cccttaagtg ctatct 56
<210>321
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>321
tcgtcggcag cgtcagatgt gtataagaga cagcccttaa gtgctatctt tgttgttac 59
<210>322
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>322
tcgtcggcag cgtcagatgt gtataagaga cagcgagtga caccaccatc aata 54
<210>323
<211>50
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>323
tcgtcggcag cgtcagatgt gtataagaga cagccaccac gctggctaaa 50
<210>324
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>324
tcgtcggcag cgtcagatgt gtataagaga cagtggctta ccagaagcat cttt 54
<210>325
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>325
tcgtcggcag cgtcagatgt gtataagaga cagaatatgg tactggctta ccagaag 57
<210>326
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>326
tcgtcggcag cgtcagatgt gtataagaga cagagataca caaacaccag cttct 55
<210>327
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>327
tcgtcggcag cgtcagatgt gtataagaga cagatcattg ttaagtaccc atctacca 58
<210>328
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>328
tcgtcggcag cgtcagatgt gtataagaga cagaaaggca actacatgac tgtattc 57
<210>329
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>329
tcgtcggcag cgtcagatgt gtataagaga cagacagagt acagtgaatg acataagg 58
<210>330
<211>51
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>330
tcgtcggcag cgtcagatgt gtataagaga cagacagcgc agcttcttca a 51
<210>331
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>331
tcgtcggcag cgtcagatgt gtataagaga cagaaagact acacgtctct ttaggt 56
<210>332
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>332
tcgtcggcag cgtcagatgt gtataagaga cagggtttgt ggtggttggt aaag 54
<210>333
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>333
tcgtcggcag cgtcagatgt gtataagaga cagcagctga ggtgatagag gtttg 55
<210>334
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>334
tcgtcggcag cgtcagatgt gtataagaga caggggttaa gcatgtcttc agagg 55
<210>335
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>335
tcgtcggcag cgtcagatgt gtataagaga cagagtctgt cctggttgaa tgc 53
<210>336
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>336
tcgtcggcag cgtcagatgt gtataagaga cagaccagat ggtgaaccat tgtaa 55
<210>337
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>337
tcgtcggcag cgtcagatgt gtataagaga cagctaagtc tgtgccagca tgaa 54
<210>338
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>338
tcgtcggcag cgtcagatgt gtataagaga cagagcatga actccagttg gtaat 55
<210>339
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>339
tcgtcggcag cgtcagatgt gtataagaga caggtttgag cagaaagagg tccta 55
<210>340
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>340
tcgtcggcag cgtcagatgt gtataagaga cagcaattcc agtttgagca gaaaga 56
<210>341
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>341
tcgtcggcag cgtcagatgt gtataagaga cagcagtggt gtgtaccctt gatt 54
<210>342
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>342
tcgtcggcag cgtcagatgt gtataagaga cagcaaagac cattgagtac tctgga 56
<210>343
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>343
tcgtcggcag cgtcagatgt gtataagaga cagcacccaa ctagcaggca tatag 55
<210>344
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>344
tcgtcggcag cgtcagatgt gtataagaga cagtaatacg catcacccaa ctagc 55
<210>345
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>345
tcgtcggcag cgtcagatgt gtataagaga cagtaagagc ccacatggaa atgg 54
<210>346
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>346
tcgtcggcag cgtcagatgt gtataagaga cagcacatgg aaatggcttg atctaaa 57
<210>347
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>347
tcgtcggcag cgtcagatgt gtataagaga caggtcagtc taaagtagcg gttgag 56
<210>348
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>348
tcgtcggcag cgtcagatgt gtataagaga caggctattc ttgggtggga gtag 54
<210>349
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>349
tcgtcggcag cgtcagatgt gtataagaga cagcctgttg tccagcattt cttc 54
<210>350
<211>51
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>350
tcgtcggcag cgtcagatgt gtataagaga cagaccctgc atggaaagca a 51
<210>351
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>351
tcgtcggcag cgtcagatgt gtataagaga cagagcaaca gcctgctcat aa 52
<210>352
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>352
tcgtcggcag cgtcagatgt gtataagaga caggctgcat cacggtcaaa ttc 53
<210>353
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>353
tcgtcggcag cgtcagatgt gtataagaga cagtcaaggg aacacaacca tctc 54
<210>354
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>354
tcgtcggcag cgtcagatgt gtataagaga cagggctgct gttgtaagag gtat 54
<210>355
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>355
tcgtcggcag cgtcagatgt gtataagaga caggtcgtag tgcaacagga ctaa 54
<210>356
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>356
tcgtcggcag cgtcagatgt gtataagaga caggacagca gaattggccc tta 53
<210>357
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>357
tcgtcggcag cgtcagatgt gtataagaga caggtaccag ttccatcact cttagg 56
<210>358
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>358
tcgtcggcag cgtcagatgt gtataagaga cagggacctt taggtgtgtc tgtaa 55
<210>359
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>359
tcgtcggcag cgtcagatgt gtataagaga caggacgtac tgtggcagct aaa 53
<210>360
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>360
tcgtcggcag cgtcagatgt gtataagaga cagcaggcac ttctgttgca ttac 54
<210>361
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>361
tcgtcggcag cgtcagatgt gtataagaga cagccatatt ggcttccggt gtaa 54
<210>362
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>362
tcgtcggcag cgtcagatgt gtataagaga cagggcagta cagacaacac gat 53
<210>363
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>363
tcgtcggcag cgtcagatgt gtataagaga caggttgatc acaactacag ccataac 57
<210>364
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>364
tcgtcggcag cgtcagatgt gtataagaga cagaaagccc tgtatacgac atcag 55
<210>365
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>365
tcgtcggcag cgtcagatgt gtataagaga cagggtacca tgtcaccgtc tattc 55
<210>366
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>366
tcgtcggcag cgtcagatgt gtataagaga caggtttagc aacagctgga caatc 55
<210>367
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>367
tcgtcggcag cgtcagatgt gtataagaga cagggcgtac acgttcacct aa 52
<210>368
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>368
tcgtcggcag cgtcagatgt gtataagaga cagacaccaa caataccagc atttc 55
<210>369
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>369
tcgtcggcag cgtcagatgt gtataagaga cagcctctct tccgtgaagt catattt 57
<210>370
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>370
tcgtcggcag cgtcagatgt gtataagaga cagaagccct ggtcaaggtt aata 54
<210>371
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>371
tcgtcggcag cgtcagatgt gtataagaga cagcttgtag gtgggaacac tgtag 55
<210>372
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>372
tcgtcggcag cgtcagatgt gtataagaga cagggtggga acactgtaga gaataa 56
<210>373
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>373
tcgtcggcag cgtcagatgt gtataagaga cagaagcacg tagtgcgttt atct 54
<210>374
<211>61
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>374
tcgtcggcag cgtcagatgt gtataagaga cagtaacgat agtagtcata atcgctgata 60
g 61
<210>375
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>375
tcgtcggcag cgtcagatgt gtataagaga cagagcatta ccatcctgag caaa 54
<210>376
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>376
tcgtcggcag cgtcagatgt gtataagaga cagagtgcat cttgatcctc ataact 56
<210>377
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>377
tcgtcggcag cgtcagatgt gtataagaga caggttgtgc caaccaccat aga 53
<210>378
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>378
tcgtcggcag cgtcagatgt gtataagaga cagcatatag tgaaccgcca caca 54
<210>379
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>379
tcgtcggcag cgtcagatgt gtataagaga caggatgagg ttccacctgg ttta 54
<210>380
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>380
tcgtcggcag cgtcagatgt gtataagaga caggaaacac acaacagcat cgtc 54
<210>381
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>381
tcgtcggcag cgtcagatgt gtataagaga caggcatcgt cagagagtat catcatt 57
<210>382
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>382
tcgtcggcag cgtcagatgt gtataagaga cagattcttg atggatctgg gtaagg 56
<210>383
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>383
tcgtcggcag cgtcagatgt gtataagaga cagccctagg attcttgatg gatctg 56
<210>384
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>384
tcgtcggcag cgtcagatgt gtataagaga cagctcaggt tcccaatacc ttgaa 55
<210>385
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>385
tcgtcggcag cgtcagatgt gtataagaga caggttccca ataccttgaa gtgttatc 58
<210>386
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>386
tcgtcggcag cgtcagatgt gtataagaga cagggtcgta acagcattta caacataa 58
<210>387
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>387
tcgtcggcag cgtcagatgt gtataagaga cagacggatt aacagacaag actaa 55
<210>388
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>388
tcgtcggcag cgtcagatgt gtataagaga cagaacttgt ccattagcac acaatg 56
<210>389
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>389
tcgtcggcag cgtcagatgt gtataagaga cagagctcat acctcctaag taaagttg 58
<210>390
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>390
tcgtcggcag cgtcagatgt gtataagaga cagggttaag tggtggtcta ggttta 56
<210>391
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>391
tcgtcggcag cgtcagatgt gtataagaga caggagtgtt gggtataagc cagtaa 56
<210>392
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>392
tcgtcggcag cgtcagatgt gtataagaga cagtgggtat aagccagtaa ttctaaca 58
<210>393
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>393
tcgtcggcag cgtcagatgt gtataagaga cagcgagcac gtgcaggtat aat 53
<210>394
<211>51
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>394
tcgtcggcag cgtcagatgt gtataagaga caggctgtcg tctcaggcaa t 51
<210>395
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>395
tcgtcggcag cgtcagatgt gtataagaga cagggttcta gtgtgccctt agttag 56
<210>396
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>396
tcgtcggcag cgtcagatgt gtataagaga caggtcaaca atttcagcag gacaa 55
<210>397
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>397
tcgtcggcag cgtcagatgt gtataagaga cagcatattc tgagccctgt gatgaa 56
<210>398
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>398
tcgtcggcag cgtcagatgt gtataagaga caggaatcaa cagtttgagt tggtagtc 58
<210>399
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>399
tcgtcggcag cgtcagatgt gtataagaga cagtaggtgc ctgtgtagga tgta 54
<210>400
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>400
tcgtcggcag cgtcagatgt gtataagaga caggccaggt atgtcaacac ataaac 56
<210>401
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>401
tcgtcggcag cgtcagatgt gtataagaga cagcctcgac atcgaagcca atc 53
<210>402
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>402
tcgtcggcag cgtcagatgt gtataagaga cagaacagct tctctagtag catgac 56
<210>403
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>403
tcgtcggcag cgtcagatgt gtataagaga cagagtcctt tgtacataag tggtatga 58
<210>404
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>404
tcgtcggcag cgtcagatgt gtataagaga caggcggtgg tttagcacta act 53
<210>405
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>405
tcgtcggcag cgtcagatgt gtataagaga cagaagcatg tggcacgtct atc 53
<210>406
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>406
tcgtcggcag cgtcagatgt gtataagaga cagcataagt gtctgaagca gtgga 55
<210>407
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>407
tcgtcggcag cgtcagatgt gtataagaga cagtagtcca gtcaacacgc ttaac 55
<210>408
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>408
tcgtcggcag cgtcagatgt gtataagaga caggccgcat taatcttcag ttcatc 56
<210>409
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>409
tcgtcggcag cgtcagatgt gtataagaga cagtgtcaga atgtgtggca taaga 55
<210>410
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>410
tcgtcggcag cgtcagatgt gtataagaga cagtgtgaat ttgtcagaat gtgtgg 56
<210>411
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>411
tcgtcggcag cgtcagatgt gtataagaga cagctcacat ggactgtcag agtaatag 58
<210>412
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>412
tcgtcggcag cgtcagatgt gtataagaga cagcgtagca gactttagtg gtacat 56
<210>413
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>413
tcgtcggcag cgtcagatgt gtataagaga cagctggtac ttcaccctgt tgtc 54
<210>414
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>414
tcgtcggcag cgtcagatgt gtataagaga cagtcaccct gttgtccatc aaa 53
<210>415
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>415
tcgtcggcag cgtcagatgt gtataagaga cagatgtgct ggagcatctc ttt 53
<210>416
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>416
tcgtcggcag cgtcagatgt gtataagaga cagtcagttg gtttcttggc tatgt 55
<210>417
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>417
tcgtcggcag cgtcagatgt gtataagaga caggggacct acagatggtt gtaaa 55
<210>418
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>418
tcgtcggcag cgtcagatgt gtataagaga caggactagc ttgtttggga cctac 55
<210>419
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>419
tcgtcggcag cgtcagatgt gtataagaga cagttccatt tgactcctgg gttta 55
<210>420
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>420
tcgtcggcag cgtcagatgt gtataagaga cagtgactcc tgggtttaaa ttcttgta 58
<210>421
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>421
tcgtcggcag cgtcagatgt gtataagaga cagacgttta gctagtccaa tcagtag 57
<210>422
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>422
tcgtcggcag cgtcagatgt gtataagaga cagagtagat gtaaaccacc taactgac 58
<210>423
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>423
tcgtcggcag cgtcagatgt gtataagaga caggccatct ttacaccaaa gcataa 56
<210>424
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>424
tcgtcggcag cgtcagatgt gtataagaga cagtctacat ggccatcttt acacc 55
<210>425
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>425
tcgtcggcag cgtcagatgt gtataagaga caggggtaca gctaatgtta atgtgttt 58
<210>426
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>426
tcgtcggcag cgtcagatgt gtataagaga cagctttatc agaaccagca ccaaa 55
<210>427
<211>59
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>427
tcgtcggcag cgtcagatgt gtataagaga caggtcttag ggtcgtacat atcactaat 59
<210>428
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>428
tcgtcggcag cgtcagatgt gtataagaga cagatctatt tgttcgcgtg gtttg 55
<210>429
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>429
tcgtcggcag cgtcagatgt gtataagaga cagcgtggtt tgccaagata attaca 56
<210>430
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>430
tcgtcggcag cgtcagatgt gtataagaga cagtttaaag acataacagc agtaccc 57
<210>431
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>431
tcgtcggcag cgtcagatgt gtataagaga cagctgacta gagactagtg gcaataaa 58
<210>432
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>432
tcgtcggcag cgtcagatgt gtataagaga cagacaccac gtgtgaaaga attag 55
<210>433
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>433
tcgtcggcag cgtcagatgt gtataagaga cagggacagg gttatcaaac ctctta 56
<210>434
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>434
tcgtcggcag cgtcagatgt gtataagaga cagaaatggt aggacagggt tatca 55
<210>435
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>435
tcgtcggcag cgtcagatgt gtataagaga cagctctgaa ctcactttcc atcca 55
<210>436
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>436
tcgtcggcag cgtcagatgt gtataagaga cagtcgcact agaataaact ctgaact 57
<210>437
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>437
tcgtcggcag cgtcagatgt gtataagaga cagctaaatt aataggcgtg tgcttaga 58
<210>438
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>438
tcgtcggcag cgtcagatgt gtataagaga caggctgtcc aacctgaaga ag 52
<210>439
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>439
tcgtcggcag cgtcagatgt gtataagaga caggtttctg agagagggtc aagtg 55
<210>440
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>440
tcgtcggcag cgtcagatgt gtataagaga cagaggagac actccataac acttaaa 57
<210>441
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>441
tcgtcggcag cgtcagatgt gtataagaga cagtgatgcg gaattatata ggacagaa 58
<210>442
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>442
tcgtcggcag cgtcagatgt gtataagaga cagaccacca accttagaat caaga 55
<210>443
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>443
tcgtcggcag cgtcagatgt gtataagaga cagagttgct ggtgcatgta gaa 53
<210>444
<211>60
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>444
tcgtcggcag cgtcagatgt gtataagaga caggtactac tactctgtat ggttggtaac 60
<210>445
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>445
tcgtcggcag cgtcagatgt gtataagaga caggatcacg gacagcatca gtag 54
<210>446
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>446
tcgtcggcag cgtcagatgt gtataagaga caggaatctc aagtgtctgt ggatca 56
<210>447
<211>50
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>447
tcgtcggcag cgtcagatgt gtataagaga cagagcctgc acgtgtttga 50
<210>448
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>448
tcgtcggcag cgtcagatgt gtataagaga cagtatacct gcaccaatgg gtatg 55
<210>449
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>449
tcgtcggcag cgtcagatgt gtataagaga cagttgtggg tatggcaata gagtt 55
<210>450
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>450
tcgtcggcag cgtcagatgt gtataagaga cagagacact ggtagaattt ctgtgg 56
<210>451
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>451
tcgtcggcag cgtcagatgt gtataagaga cagtttgtct tgttcaacag ctattcc 57
<210>452
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>452
tcgtcggcag cgtcagatgt gtataagaga caggaggtct ctagcagcaa tatcac 56
<210>453
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>453
tcgtcggcag cgtcagatgt gtataagaga cagaccaaag gtccaaccag aag 53
<210>454
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>454
tcgtcggcag cgtcagatgt gtataagaga cagtgcactt gctgtggaag aa 52
<210>455
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>455
tcgtcggcag cgtcagatgt gtataagaga cagagcgtgt ttaaagcttg tgc 53
<210>456
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>456
tcgtcggcag cgtcagatgt gtataagaga caggtctgcc tgtgatcaac ctatc 55
<210>457
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>457
tcgtcggcag cgtcagatgt gtataagaga cagctgactg agggaaggac ataag 55
<210>458
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>458
tcgtcggcag cgtcagatgt gtataagaga cagaagacac cttcacgagg aaag 54
<210>459
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>459
tcgtcggcag cgtcagatgt gtataagaga cagacaacat cacagttacc agaca 55
<210>460
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>460
tcgtcggcag cgtcagatgt gtataagaga caggagtcta attcaggttg caaagg 56
<210>461
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>461
tcgtcggcag cgtcagatgt gtataagaga cagcattgag gcggtcaatt tcttt 55
<210>462
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>462
tcgtcggcag cgtcagatgt gtataagaga caggcaactg gtcatacagc aaag 54
<210>463
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>463
tcgtcggcag cgtcagatgt gtataagaga cagctgaagg agtagcatcc ttgatt 56
<210>464
<211>51
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>464
tcgtcggcag cgtcagatgt gtataagaga cagtgcagta gcgcgaacaa a 51
<210>465
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>465
tcgtcggcag cgtcagatgt gtataagaga caggaagtgc aacgccaaca ataa 54
<210>466
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>466
tcgtcggcag cgtcagatgt gtataagaga cagccaacaa taagccatcc gaaag 55
<210>467
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>467
tcgtcggcag cgtcagatgt gtataagaga caggcaaagc caaagcctca ttatt 55
<210>468
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>468
tcgtcggcag cgtcagatgt gtataagaga caggaacggc atttccagca aag 53
<210>469
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>469
tcgtcggcag cgtcagatgt gtataagaga cagcaacacc agtgtctgta ctcaa 55
<210>470
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>470
tcgtcggcag cgtcagatgt gtataagaga cagacagctg gtaatagtct gaagtg 56
<210>471
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>471
tcgtcggcag cgtcagatgt gtataagaga caggtcgtcg tcggttcatc ataa 54
<210>472
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>472
tcgtcggcag cgtcagatgt gtataagaga cagcgtacct gtctcttccg aaac 54
<210>473
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>473
tcgtcggcag cgtcagatgt gtataagaga cagcagcagt acgcacacaa tc 52
<210>474
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>474
tcgtcggcag cgtcagatgt gtataagaga cagcgttaac aatattgcag cagtacg 57
<210>475
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>475
tcgtcggcag cgtcagatgt gtataagaga caggtaccgt tggaatctgc cat 53
<210>476
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>476
tcgtcggcag cgtcagatgt gtataagaga caggttccat tgttcaagga gcttt 55
<210>477
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>477
tcgtcggcag cgtcagatgt gtataagaga caggcgcaaa cagtctgaaa gaag 54
<210>478
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>478
tcgtcggcag cgtcagatgt gtataagaga cagggattga atgaccacat ggaac 55
<210>479
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>479
tcgtcggcag cgtcagatgt gtataagaga cagaatcctg tagcgactgt atgc 54
<210>480
<211>53
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>480
tcgtcggcag cgtcagatgt gtataagaga cagaaacctg agtcacctgc tac 53
<210>481
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>481
tcgtcggcag cgtcagatgt gtataagaga cagtctccat tggttgctct tcatc 55
<210>482
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>482
tcgtcggcag cgtcagatgt gtataagaga cagcgagtgt tatcagtgcc aaga 54
<210>483
<211>57
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>483
tcgtcggcag cgtcagatgt gtataagaga cagtcttgaa cttcctcttg tctgatg 57
<210>484
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>484
tcgtcggcag cgtcagatgt gtataagaga caggatctgg cacgtaactg atagac 56
<210>485
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>485
tcgtcggcag cgtcagatgt gtataagaga cagcgtttag gcgtgacaag tttc 54
<210>486
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>486
tcgtcggcag cgtcagatgt gtataagaga caggtgaaat gcagctacag ttgtg 55
<210>487
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>487
tcgtcggcag cgtcagatgt gtataagaga cagacaacgc actacaagac tacc 54
<210>488
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>488
tcgtcggcag cgtcagatgt gtataagaga cagtcgatgt actgaatggg tgattt 56
<210>489
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>489
tcgtcggcag cgtcagatgt gtataagaga caggcgttct ccattctggt tact 54
<210>490
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>490
tcgtcggcag cgtcagatgt gtataagaga caggggtgca tttcgctgat tt 52
<210>491
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>491
tcgtcggcag cgtcagatgt gtataagaga cagtctggta gctcttcggt agtag 55
<210>492
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>492
tcgtcggcag cgtcagatgt gtataagaga cagaccatct tggactgaga tctttc 56
<210>493
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>493
tcgtcggcag cgtcagatgt gtataagaga caggcacgat tgcagcattg ttag 54
<210>494
<211>55
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>494
tcgtcggcag cgtcagatgt gtataagaga cagtttggca atgttgttcc ttgag 55
<210>495
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>495
tcgtcggcag cgtcagatgt gtataagaga caggctctca agctggttca atct 54
<210>496
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>496
tcgtcggcag cgtcagatgt gtataagaga cagctgtcaa gcagcagcaa ag 52
<210>497
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>497
tcgtcggcag cgtcagatgt gtataagaga cagttgcggc caatgtttgt aatc 54
<210>498
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>498
tcgtcggcag cgtcagatgt gtataagaga cagccttgtc tgattagttc ctggtc 56
<210>499
<211>52
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>499
tcgtcggcag cgtcagatgt gtataagaga caggctctgt tggtgggaat gt 52
<210>500
<211>58
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>500
tcgtcggcag cgtcagatgt gtataagaga caggaattca ttctgcacaa gagtagac 58
<210>501
<211>54
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>501
tcgtcggcag cgtcagatgt gtataagaga cagcagctct ccctagcatt gttc 54
<210>502
<211>56
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>502
tcgtcggcag cgtcagatgt gtataagaga cagcattagg gctcttccat ataggc 56
<210>503
<211>39
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>503
caagcagaag acggcatacg agatgtctcg tgggctcgg 39
<210>504
<211>43
<212>DNA
<213> Artificial Sequence (Artificial Sequence)
<400>504
aatgatacgg cgaccaccga gatctacact cgtcggcagc gtc 43

Claims (10)

1. A method for constructing a whole-genome high-throughput sequencing library of the novel coronavirus, characterized by comprising the following steps:
A. extracting RNA from a virus sample and reverse transcribing it to obtain single-stranded cDNA or double-stranded cDNA;
B. designing tiled (imbricated) full-coverage primers according to a published genome sequence of the novel coronavirus COVID-19, so as to obtain a multiplex amplification primer set 1 anchored with a partial Illumina linker sequence and a multiplex amplification primer set 2 anchored with a partial Illumina linker sequence; performing a first round of PCR separately with primer set 1 and with primer set 2, using the single-stranded cDNA or double-stranded cDNA as a template; and mixing the amplification products in equimolar amounts so that together they cover the whole genome of the virus;
C. using the amplification products mixed in step B as a template, performing a second round of PCR with tagged Illumina library amplification primers, and purifying the amplification product to obtain the high-throughput sequencing library;
wherein the multiplex amplification primer set 1 anchored with a partial Illumina linker sequence and the multiplex amplification primer set 2 anchored with a partial Illumina linker sequence in step B are designed as follows:
b1. designing a multiplex specific amplification primer set I and a multiplex specific amplification primer set II according to the genome sequence of the novel coronavirus COVID-19, wherein primer set I comprises a forward primer F pool and a reverse primer R pool, primer set II comprises a forward primer F' pool and a reverse primer R' pool, and each pair of forward and reverse primers corresponds to one amplicon; the forward primer and the reverse primer of each primer pair of set II are designed within the sequences of two adjacent amplicons of set I, respectively, and the forward primer and the reverse primer of each primer pair of set I are designed within the sequences of two adjacent amplicons of set II, respectively, and so on, until the amplicons of set I and the amplicons of set II together cover the whole genome of the virus in a tiled (shingled) manner;
b2. adding, in the 5'-3' direction, Illumina partial linker sequence ① to the 5' end of each forward primer and Illumina partial linker sequence ② to the 5' end of each reverse primer; the forward primer F pool carrying Illumina partial linker sequence ① and the reverse primer R pool carrying Illumina partial linker sequence ② constitute multiplex amplification primer set 1 anchored with a partial Illumina linker sequence, and the forward primer F' pool carrying Illumina partial linker sequence ① and the reverse primer R' pool carrying Illumina partial linker sequence ② constitute multiplex amplification primer set 2 anchored with a partial Illumina linker sequence;
wherein Illumina partial linker sequence ① is: 5'-[3'-terminal sequence of the I7 tagged primer]-AGATGTGTATAAGAGACAG-3';
Illumina partial linker sequence ② is: 5'-[3'-terminal sequence of the I5 tagged primer]-AGATGTGTATAAGAGACAG-3';
and the 3'-terminal sequence of the I7 tagged primer is 9-15 bp long and the 3'-terminal sequence of the I5 tagged primer is 8-14 bp long, so that the I7 tagged primer and the I5 tagged primer can anneal specifically to their binding sites at the 3' ends of the amplicons.
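The two-pool tiling described in step b1 can be illustrated with a minimal Python sketch. It is not the design procedure of the patent: the fixed amplicon size, the fixed overlap, the genome length used in the example and all function names are assumptions chosen for illustration. It only shows why alternating amplicons between two pools gives overlap-free pools whose union covers the genome.

```python
from typing import List, Tuple

Amplicon = Tuple[int, int]  # (start, end), 0-based, end exclusive


def tile_amplicons(genome_len: int, size: int = 250, overlap: int = 50) -> List[Amplicon]:
    """Lay fixed-size amplicons along the genome with a fixed overlap (assumed values)."""
    step = size - overlap
    amplicons: List[Amplicon] = []
    start = 0
    while True:
        end = min(start + size, genome_len)
        amplicons.append((start, end))
        if end == genome_len:
            break
        start += step
    return amplicons


def split_into_pools(amplicons: List[Amplicon]) -> Tuple[List[Amplicon], List[Amplicon]]:
    """Odd-numbered amplicons go to pool 1, even-numbered amplicons to pool 2."""
    return amplicons[0::2], amplicons[1::2]


def pool_is_non_overlapping(pool: List[Amplicon]) -> bool:
    """True if no amplicon in the pool overlaps the next one in the same pool."""
    return all(prev[1] <= nxt[0] for prev, nxt in zip(pool, pool[1:]))


def covers_genome(amplicons: List[Amplicon], genome_len: int) -> bool:
    """True if the union of the amplicons leaves no gap from 0 to genome_len."""
    pos = 0
    for start, end in sorted(amplicons):
        if start > pos:
            return False
        pos = max(pos, end)
    return pos >= genome_len


# Hypothetical genome length, used only to exercise the functions.
amplicons = tile_amplicons(30000)
pool_1, pool_2 = split_into_pools(amplicons)
```

With these assumed values each pool amplifies every second amplicon, so neighbouring amplicons never compete in the same reaction, while the mixed products of both pools span the whole template.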
2. The method of claim 1, wherein in step B the Tm difference threshold for each primer pair is ±2 ℃; and/or
the amplicon size is 200-300 bp; and/or
primer pairs that would form dimers or stem-loop structures, between primers or within a primer, are removed during primer design; and/or
within the same multiplex specific amplification primer set, the 5' end of the reverse primer of the upstream amplicon on the genome lies upstream of the 5' end of the forward primer of the downstream amplicon.
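The Tm and amplicon-size criteria of claim 2 can be sketched as a simple filter. The claim does not say how Tm is calculated, so the sketch assumes the Wallace rule (2 ℃ per A/T, 4 ℃ per G/C), a rough approximation for short oligonucleotides, and it assumes the ±2 ℃ threshold applies to the two primers of one pair. The function names and the example sequences are hypothetical.

```python
from typing import Tuple


def wallace_tm(primer: str) -> int:
    """Approximate Tm in degrees Celsius with the Wallace rule (an assumption here)."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def pair_passes(forward: str, reverse: str, amplicon_len: int,
                max_tm_diff: float = 2.0,
                size_range: Tuple[int, int] = (200, 300)) -> bool:
    """Check the Tm difference of a primer pair and the size of its amplicon."""
    tm_ok = abs(wallace_tm(forward) - wallace_tm(reverse)) <= max_tm_diff
    size_ok = size_range[0] <= amplicon_len <= size_range[1]
    return tm_ok and size_ok


# Hypothetical primers, used only to exercise the filter.
fwd = "ACGTACGTACGTACGTACGT"
rev_close = "ACGTACGTACGTACGTACGA"
rev_far = "GCGCGCGCGCGCGCGCGCGC"
```

A production design tool would normally use a nearest-neighbour Tm model and would also screen for dimers and stem-loops, which this sketch does not attempt.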
3. The method of claim 1, wherein in step A the RNA is reverse transcribed into single-stranded cDNA by a method selected from a or b:
a. priming single-stranded cDNA synthesis with 6-10 bp random primers;
b. mixing a plurality of primers from the reverse primer R pool and the reverse primer R' pool to form a specific reverse transcription primer set that primes single-stranded cDNA synthesis, the reverse primers being evenly distributed along the 3'-5' direction of the viral genome at intervals of 800-1000 bp;
and the RNA is reverse transcribed into double-stranded cDNA in step A by:
i. priming single-stranded cDNA synthesis with 6-10 bp random primers;
ii. nicking the RNA-cDNA hybrid duplex with RNase H in the presence of dNTPs;
iii. synthesizing double-stranded cDNA with an RNA-dependent DNA polymerase, using the small RNA fragments generated at the nicks as primers.
4. The method of claim 1, wherein the tagged Illumina library amplification primers in step C are as follows:
I7 tagged primer: 5'-CAAGCAGAAGACGGCATACGAGAT(i7)GTCTCGTGGGCTCGG-3'; I5 tagged primer: 5'-AATGATACGGCGACCACCGAGATCTACAC(i5)TCGTCGGCAGCGTC-3'.
5. The method of claim 4, wherein in step B Illumina partial linker sequence ① is 5'-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3';
and Illumina partial linker sequence ② is 5'-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3'.
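How the partial linker sequences of claims 1 and 5 relate to the tagged primers of claim 4 and to the sequence listing can be checked with a short Python sketch. The sequences are taken from the claims and from SEQ ID NO: 465, 503 and 504; the function and variable names and the 8-nt example index are hypothetical.

```python
# Shared core of both partial linker sequences (claim 1).
CORE = "AGATGTGTATAAGAGACAG"

# 3'-terminal sequences of the tagged primers (claim 4).
I7_3P = "GTCTCGTGGGCTCGG"   # 15 nt, within the claimed 9-15 bp
I5_3P = "TCGTCGGCAGCGTC"    # 14 nt, within the claimed 8-14 bp

LINKER_1 = I7_3P + CORE     # added to forward primers
LINKER_2 = I5_3P + CORE     # added to reverse primers

# Constant 5' portions of the tagged primers (claim 4).
I7_PREFIX = "CAAGCAGAAGACGGCATACGAGAT"
I5_PREFIX = "AATGATACGGCGACCACCGAGATCTACAC"


def anchor_primer(linker: str, specific: str) -> str:
    """Prepend a partial linker to the 5' end of a target-specific primer."""
    return (linker + specific).lower()


def tagged_primer(prefix: str, index: str, tail: str) -> str:
    """Assemble a tagged library primer as prefix + index + 3'-terminal sequence."""
    return (prefix + index + tail).lower()


# Target-specific part of SEQ ID NO: 465, i.e. the listed sequence
# with the 33-nt linker 2 removed from its 5' end.
seq_465 = anchor_primer(LINKER_2, "gaagtgcaacgccaacaataa")

# Hypothetical 8-nt index, for illustration only.
example_i7_primer = tagged_primer(I7_PREFIX, "ACGTACGT", I7_3P)
```

Because each first-round product ends in a linker whose 5' portion equals the 3' end of a tagged primer, the second-round primers can anneal to every amplicon regardless of its viral sequence.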
6. The method of claim 1, wherein the multiplex amplification primer set 1 anchored with a partial Illumina linker sequence and the multiplex amplification primer set 2 anchored with a partial Illumina linker sequence in step B comprise 250 primer pairs, wherein the forward primers are COV-1-F to COV-250-F, whose nucleotide sequences are shown in SEQ ID NO: 3-252, respectively, and the reverse primers are COV-1-R to COV-250-R, whose nucleotide sequences are shown in SEQ ID NO: 253-502, respectively; COV-1-F and COV-1-R form one primer pair, COV-2-F and COV-2-R form one primer pair, and so on.
7. The method of claim 6, wherein the primer information for the multiplex amplification primer set 1 anchored with a partial Illumina linker sequence in step B is as follows:
[Primer table for primer set 1, provided as images in the original publication: Figures FDA0002427599150000021, FDA0002427599150000031, FDA0002427599150000041, FDA0002427599150000051]
and the primer information for the multiplex amplification primer set 2 anchored with a partial Illumina linker sequence is as follows:
[Primer table for primer set 2, provided as images in the original publication: Figures FDA0002427599150000052, FDA0002427599150000061, FDA0002427599150000071, FDA0002427599150000081]
wherein primer number COV-1 corresponds to primers COV-1-F and COV-1-R, primer number COV-2 corresponds to primers COV-2-F and COV-2-R, and so on.
8. The method of any one of claims 1 to 7, wherein the virus sample is derived from a throat swab, bronchoalveolar lavage fluid, or the supernatant of an isolation culture of virus-infected cells.
9. A kit for constructing a novel coronavirus whole-genome high-throughput sequencing library, comprising the multiplex amplification primer set 1 anchored with a partial Illumina linker sequence, the multiplex amplification primer set 2 anchored with a partial Illumina linker sequence, and the tagged Illumina library amplification primers used in the method of any one of claims 1 to 8, and optionally further comprising other reagents required for library construction.
10. Use of the method of any one of claims 1 to 8 for the detection of novel coronavirus variants, the use comprising:
(1) constructing a whole-genome high-throughput sequencing library of the novel coronavirus to be tested according to the method of any one of claims 1 to 8;
(2) sequencing the high-throughput sequencing library on a sequencer after it has passed quality control;
(3) performing bioinformatics analysis and detecting variant sites;
preferably, step (3) comprises the following sub-steps:
1) building an index of the novel coronavirus COVID-19 reference genome MT019531.1 with BWA, and generating a .fai file with samtools faidx;
2) read quality control: filtering the paired-end reads with SOAPnuke to obtain clean reads, wherein reads meeting any of the following conditions are removed: condition 1: reads contaminated with linker sequences; condition 2: reads in which more than 10% of the bases are N; condition 3: reads in which low-quality bases account for more than 50% of the bases of the read, low quality being defined as Q < 38;
3) alignment and sorting: aligning the clean reads to the reference genome MT019531.1 with BWA combined with samtools to generate a BAM file, the alignment parameters being "-t 32 -M"; sorting the BAM file with picard.jar from the Picard software; indexing the sorted BAM file with the samtools index tool; and performing quality control on the resulting BAM file with the Qualimap tool;
4) variant detection: detecting SNPs and InDels of the virus with samtools pileup and VarScan; the SNP detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1"; the InDel detection parameters are: "--min-coverage 8 --min-reads2 4 --min-var-freq 0.1 --min-avg-qual 0 --p-value 1.0 --strand-filter 0 --variants --output-vcf 1";
5) finally, annotating the detected SNPs and InDels with annovar software based on the GFF file of the MT019531.1 reference genome.
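The read filter of sub-step 2) and the numeric thresholds of sub-step 4) can be sketched in plain Python. This is not SOAPnuke or VarScan and does not reproduce their behaviour in detail. It only restates the stated conditions, assuming that the 50% limit applies to the bases of a single read and taking minimum coverage 8, minimum supporting reads 4 (our reading of the min-reads2 setting) and minimum variant frequency 0.1. All function names and the example adapter are hypothetical.

```python
from typing import Sequence


def keep_read(bases: str, quals: Sequence[int], adapter: str = "",
              max_n_frac: float = 0.10, low_q: int = 38,
              max_low_frac: float = 0.50) -> bool:
    """Return True if a read survives conditions 1 to 3 of sub-step 2).

    quals holds one Phred score per base (same length as bases).
    """
    if not bases:
        return False
    seq = bases.upper()
    # Condition 1: read contains the linker (adapter) sequence.
    if adapter and adapter.upper() in seq:
        return False
    # Condition 2: more than 10% of the bases are N.
    if seq.count("N") / len(seq) > max_n_frac:
        return False
    # Condition 3: more than 50% of the bases have quality below the cutoff.
    low = sum(1 for q in quals if q < low_q)
    if low / len(seq) > max_low_frac:
        return False
    return True


def passes_variant_thresholds(depth: int, alt_reads: int,
                              min_coverage: int = 8, min_reads2: int = 4,
                              min_var_freq: float = 0.1) -> bool:
    """Apply the coverage, supporting-read and frequency cutoffs of sub-step 4)."""
    if depth < min_coverage or alt_reads < min_reads2:
        return False
    return alt_reads / depth >= min_var_freq


# Hypothetical adapter fragment, used only to exercise condition 1.
EXAMPLE_ADAPTER = "CTGTCTCTTATACACATCT"
```

For example, a site covered by 100 reads needs at least 10 variant-supporting reads to pass under these assumptions, whereas a site covered by only 7 reads is skipped however many of them carry the variant.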
CN202010225821.0A 2020-03-26 2020-03-26 Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction Active CN111334868B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010225821.0A CN111334868B (en) 2020-03-26 2020-03-26 Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction

Publications (2)

Publication Number Publication Date
CN111334868A true CN111334868A (en) 2020-06-26
CN111334868B CN111334868B (en) 2023-05-23

Family

ID=71180448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010225821.0A Active CN111334868B (en) 2020-03-26 2020-03-26 Construction method of novel coronavirus whole genome high-throughput sequencing library and kit for library construction

Country Status (1)

Country Link
CN (1) CN111334868B (en)

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103484564A (en) * 2013-07-23 2014-01-01 中国医学科学院病原生物学研究所 High-sensitivity method used for detecting and identifying human coronavirus
US20160076094A1 (en) * 2014-09-08 2016-03-17 The Johns Hopkins University Efficient Deep Sequencing and Rapid Genomic Speciation of RNA Viruses (vRNAseq)
CN104975081A (en) * 2015-06-01 2015-10-14 南京市妇幼保健院 Amplimers, kit and method for detecting PKD1 gene mutation
CN108456723A (en) * 2018-03-21 2018-08-28 福州福瑞医学检验实验室有限公司 A kind of the genetic test primer and kit of endometriosis risk profile
CN109371139A (en) * 2018-12-29 2019-02-22 杭州迪安医学检验中心有限公司 A kind of primer and its application being used to detect the variation of thyroid cancer pathogenic related gene based on high throughput sequencing technologies
CN110273028A (en) * 2019-06-27 2019-09-24 深圳市海普洛斯生物科技有限公司 Enrichment method, sequencing data analysis method and the device of viral integrase type DNA
CN110343783A (en) * 2019-07-08 2019-10-18 广东省公共卫生研究院 Norovirus sequencing primer, kit and detection method based on high-flux sequence
CN110387438A (en) * 2019-07-08 2019-10-29 广东省公共卫生研究院 Multi-primers, kit and method for enterovirus high-flux sequence
CN110484655A (en) * 2019-08-30 2019-11-22 中国医学科学院病原生物学研究所 The detection method of two generation of parainfluenza virus full-length genome sequencing
CN110734908A (en) * 2019-11-15 2020-01-31 福州福瑞医学检验实验室有限公司 Construction method of high-throughput sequencing library and kit for library construction

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JOSHUA QUICK: "nCoV-2019 sequencing protocol V.1" *
JOSHUA QUICK等: "Multiplex PCR method for MinION and Illumina sequencing of Zika and other virus genomes directly from clinical samples" *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111676325A (en) * 2020-07-07 2020-09-18 云南科耀生物科技有限公司 Primer combination for detecting SARS-CoV-2 whole genome and application method
CN114067907A (en) * 2020-07-31 2022-02-18 普瑞基准生物医药(苏州)有限公司 Method for accurately identifying RNA virus genome variation
CN112063752A (en) * 2020-08-20 2020-12-11 广东省科学院动物研究所 Universal coronavirus PCR primer and application thereof
CN111996290A (en) * 2020-08-21 2020-11-27 上海交通大学医学院附属第九人民医院 SARS-CoV-2 whole genome nucleic acid amplification specific primer based on multiple PCR
CN111979353A (en) * 2020-08-25 2020-11-24 上海融享生物科技有限公司 Library construction method for sequencing novel coronavirus SARS-CoV-2 full-length genome
CN112063764A (en) * 2020-10-28 2020-12-11 江苏科德生物医药科技有限公司 Multiplex real-time fluorescent RT-PCR primer probe composition and kit for novel coronavirus nucleic acid detection
CN112102945A (en) * 2020-11-09 2020-12-18 电子科技大学 Device for predicting severe condition of COVID-19 patient
WO2022099794A1 (en) * 2020-11-13 2022-05-19 苏州金唯智生物科技有限公司 Method for testing cross-contamination of oligonucleotides
CN112322788A (en) * 2020-11-24 2021-02-05 杭州杰毅生物技术有限公司 mNGS primer group and kit for detecting SARS-CoV-2
CN112322788B (en) * 2020-11-24 2021-07-06 杭州杰毅生物技术有限公司 mNGS primer group and kit for detecting SARS-CoV-2
CN113337639A (en) * 2021-05-28 2021-09-03 天津金匙医学科技有限公司 Method for detecting COVID-19 based on mNGS and application thereof
WO2023003608A1 (en) * 2021-07-22 2023-01-26 Ohio State Innovation Foundation Methods of collecting and analyzing dust samples for surveillance of viral diseases
CN114038501A (en) * 2021-12-21 2022-02-11 广州金匙医学检验有限公司 Background bacterium judgment method based on machine learning
CN114038501B (en) * 2021-12-21 2022-05-27 广州金匙医学检验有限公司 Background bacterium judgment method based on machine learning
CN114672591A (en) * 2022-01-11 2022-06-28 湖北省疾病预防控制中心(湖北省预防医学科学院) Primer group and kit for identifying novel coronavirus and application of primer group and kit
CN115838836A (en) * 2022-11-14 2023-03-24 圣湘生物科技股份有限公司 Composition, kit and method for joint detection of different types of viruses and application thereof
CN115838836B (en) * 2022-11-14 2024-01-30 圣湘生物科技股份有限公司 Composition, kit, method and application of different types of virus joint inspection

Also Published As

Publication number Publication date
CN111334868B (en) 2023-05-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant