WO2002099130A2 - Virus detection using degenerate PCR primers - Google Patents

Publication number
WO2002099130A2
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
primers
pair
virus
sequences
Prior art date
Application number
PCT/GB2002/002642
Other languages
French (fr)
Other versions
WO2002099130A3 (en)
Inventor
David John Griffiths
Paul Kellam
Robert Anthony Weiss
Original Assignee
University College London
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University College London
Priority to AU2002345244A
Publication of WO2002099130A2
Publication of WO2002099130A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/70: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving virus or bacteriophage

Definitions

  • the invention relates to a method of detecting new viruses using a high throughput polymerase chain reaction (PCR) assay.
  • Biological materials can often become contaminated or infected with unidentified organisms.
  • cells grown in tissue culture often exhibit signs of a cytopathic effect consistent with a virus infection but the identity of the virus may not be apparent.
  • Human blood products, such as factor VIII for the treatment of haemophiliacs, can be contaminated with unidentified viruses, as was demonstrated by infection of many haemophiliacs with human immunodeficiency virus in the early 1980s.
  • PCR allows amplification of a specific region of a polynucleotide.
  • the specificity of the reaction is due to the primers which, during the course of PCR, bind to the region to be amplified in a sequence specific manner.
  • Degenerate primers can be designed which amplify sequence from substantially all members of a virus family. Such primers typically bind to nucleotide sequence which is conserved across the virus family.
  • the invention provides a PCR based high throughput screen that uses such degenerate primers for detecting unknown viruses.
  • the invention provides a high throughput method for screening a biological sample for unknown viruses, which method comprises (a) subjecting DNA from the sample to PCR amplification conditions using simultaneously multiple pairs of degenerate primers, wherein each primer binds a sequence that is conserved across members of a family of viruses and each pair of primers selectively directs amplification of sequence of said family;
  • step (b) sequencing PCR product obtained in step (a);
  • unidentified viruses are believed to play a role in cancers such as leukaemia, autoimmune diseases such as rheumatic disease, cardiovascular diseases such as dilated cardiomyopathy and Kawasaki disease, and prostatitis (zur Hausen 2001 The Lancet 357, 381-384; Greaves 1997 The Lancet 349, 344-349; Rowley and Shulman 1998 Clinical Microbiology
  • the invention provides a way of screening for the viruses which may cause or contribute to such diseases. Once identified, the viruses may be used as a target for developing diagnostic tests for, or therapies against, the diseases.
  • the method of the invention is based on obtaining sequences from viruses so
  • the sequences of the novel viruses are amplified using PCR primers which recognise sequences which are conserved (similar/homologous) in known members of virus families.
  • the primers direct amplification of sequence between the conserved regions to give a PCR product whose sequence can be
  • the biological sample which is screened may be any sample susceptible to infection by a virus and may, for example, be a tissue culture sample (e.g. tissue culture supernatant), or a sample of animal (including human) or plant material.
  • the invention is directed to the identification of unknown human viruses, and in this case the sample will generally be derived from one or more humans.
  • a sample derived from a human or animal may be from a range of tissue and fluid types, for example blood serum, seminal fluid, breast milk, saliva, cerebrospinal fluid, urine, bile, bronchial lavage fluid, nasal secretion, eye secretion or vaginal wash.
  • the virus material in the sample is concentrated, for example by ultracentrifugation.
  • the virus material may also be purified in a manner which increases the content of viral nucleic acid relative to non-viral nucleic acid.
  • the viral nucleic acid may be concentrated by centrifuging the biological sample under conditions such that cell debris is pelleted and virus particles remain in the supernatant; collecting the supernatant; and centrifuging the supernatant under conditions such that virus particles are pelleted.
  • the initial centrifugation to pellet the cell debris may, for example, be carried out at 100 to 10,000 g, preferably from 1000 to 10,000 g.
  • the subsequent centrifugation to pellet the virus particles is carried out at a higher g force, for example 50,000 to 500,000 g, preferably about 100,000 g.
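The g forces quoted above depend on rotor geometry when translated into a rotor speed. A minimal sketch of that conversion follows, using the standard relation RCF = 1.118 × 10^-5 × r × rpm² (r in cm). The rotor radius and the function names are illustrative assumptions and do not come from the patent.

```python
import math

# Standard relation between relative centrifugal force (in g) and rotor speed:
#   RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5

def rpm_to_rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (in g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2

def rcf_to_rpm(rcf_g: float, radius_cm: float) -> float:
    """Rotor speed needed to reach a given g force at a given rotor radius."""
    if rcf_g < 0 or radius_cm <= 0:
        raise ValueError("rcf_g must be >= 0 and radius_cm must be > 0")
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))

# Hypothetical rotor with a 10 cm radius: speed for the ~100,000 g virus pellet step.
speed = rcf_to_rpm(100_000, radius_cm=10.0)
print(f"{speed:.0f} rpm")
```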
  • the purification of viral nucleic acid may include a step of treating a suspension comprising the virus with a nuclease so as to digest extraneous nucleic acid, wherein the viral nucleic acid is protected from digestion by viral coat or core protein.
  • the nuclease is preferably a non sequence-specific nuclease which digests DNA and/or RNA, for example micrococcal nuclease S7 (Roche Molecular Biochemicals, Catalogue 107 921).
  • the processing may also comprise a nucleic acid purification, such as phenol/chloroform nucleic acid purification or the use of a column which selectively binds nucleic acid.
  • purification is carried out using a Qiagen™ column.
  • processing of the sample increases the purity of the virus nucleic acid present in the sample (for example leading to an increase in concentration of 2-fold to 1000-fold of viral nucleic acid).
  • the processing of the sample may comprise the reverse transcription of viral RNA in the sample to DNA, i.e. RNA from the unknown virus is processed to produce the equivalent (such as the same or a complementary) DNA sequence.
  • the DNA which is subject to PCR conditions may be cDNA. This is required when the unknown virus has an RNA genome.
  • the processing may comprise reverse transcription of the RNA to produce a complementary DNA strand and then optionally synthesising a second DNA strand before carrying out PCR. This can be achieved by using a primer which directs initiation at random sequences in a reverse transcription reaction and then in a second strand synthesis reaction.
  • Random reverse transcription may be directed using a primer which directs initiation of DNA synthesis at random sequences.
  • a primer may be made by synthesising it so that it contains a random sequence, for example a sequence of at least 6 consecutive nucleotides (e.g. from 6 to 20 nucleotides) wherein each nucleotide may be any of the four possible natural nucleotides, i.e. A, T, C or G.
  • a primer contains a sequence NNNNNN wherein each N is A, T, C or G.
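To give a feel for the size of such a random primer pool, the sketch below counts the distinct sequences in a fully random primer and draws one at random. The function names are hypothetical; only the alphabet (A, T, C, G) and the 6 to 20 nucleotide lengths come from the text.

```python
import random

BASES = "ATCG"  # the four natural nucleotides named in the text

def pool_size(length: int) -> int:
    """Number of distinct sequences in a fully random primer of this length."""
    return len(BASES) ** length

def random_primer(length: int, rng: random.Random) -> str:
    """Draw one member of the random primer pool."""
    return "".join(rng.choice(BASES) for _ in range(length))

print(pool_size(6))                        # 4096 possible hexamers (NNNNNN)
print(random_primer(6, random.Random(0)))  # one example hexamer
```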
  • a "single tube system" is used for the reverse transcription and then PCR with the multiple pairs of degenerate primers.
  • the sample typically after being processed
  • the mixture will comprise both a reverse transcriptase and a thermostable DNA polymerase.
  • the mixture may comprise the Titan™ reagents from Roche Molecular Biochemicals™ (cat no. 1855476) which use the avian myeloblastosis virus reverse transcriptase and a Pwo (Pyrococcus woesei) thermostable DNA polymerase.
  • the ProSTAR™ system from Stratagene™ may be used.
  • the PCR reaction is carried out in a PCR mixture that generally comprises the following: the template DNA (which will be amplified in the event of virus detection), one or more primer pairs specific for members of a virus family, a thermostable polymerase enzyme (typically a DNA polymerase, such as Taq polymerase), deoxynucleotide triphosphates (dATP, dTTP, dCTP and dGTP) and a suitable buffer.
  • the PCR reaction generally comprises cycles of the following steps: a denaturation step, a primer annealing step and a polynucleotide synthesis step.
  • the PCR reaction comprises at least 25 cycles, such as 30, 35, 40 or more cycles, up to a maximum of 60 cycles for example.
  • the PCR mixture is heated to a temperature at which the DNA in the PCR mixture (in particular the region to be amplified) denatures to single-stranded form.
  • the denaturing temperature is generally from 85 to 98 °C.
  • the PCR reaction comprises a "hot start" in which the PCR mixture is kept at the denaturing temperature for an extended amount of time before commencement of the thermal cycles, such as for 5 to 30 minutes, preferably 10 to 20 minutes.
  • the use of Amplitaq Gold™ DNA polymerase (Applied Biosystems™) is preferred when the PCR reaction comprises a hot start.
  • the primers bind to template nucleotide sequence in a sequence specific manner. This step is generally carried out at a temperature of from 30 to 65 °C.
  • the polymerase replicates/synthesises nucleotide sequence based on template sequence by addition of nucleotides to the 3' end of the bound primers. This step is generally carried out at about 72 °C.
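The cycling scheme described above can be written down as a simple data structure. The following is only a sketch: the temperature and cycle ranges are those quoted in the text, while the hold times per step (30 s, 30 s, 60 s) and all names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int

def build_program(cycles: int = 35, denature_c: float = 94.0,
                  anneal_c: float = 55.0, extend_c: float = 72.0,
                  hot_start_min: int = 0) -> List[Step]:
    """Build a thermal cycling program, checking the ranges given in the text."""
    if not 25 <= cycles <= 60:
        raise ValueError("cycles outside the 25 to 60 range given in the text")
    if not 85 <= denature_c <= 98:
        raise ValueError("denaturing temperature outside 85 to 98 C")
    if not 30 <= anneal_c <= 65:
        raise ValueError("annealing temperature outside 30 to 65 C")
    program: List[Step] = []
    if hot_start_min:
        # Hot start: hold at the denaturing temperature before cycling begins.
        program.append(Step("hot start", denature_c, hot_start_min * 60))
    for _ in range(cycles):
        program.append(Step("denature", denature_c, 30))  # hold time is an assumption
        program.append(Step("anneal", anneal_c, 30))      # hold time is an assumption
        program.append(Step("extend", extend_c, 60))      # hold time is an assumption
    return program

program = build_program(cycles=30, hot_start_min=10)
print(len(program), program[0])
```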
  • the sample (generally after processing as described above) is subject to PCR conditions using a panel of multiple degenerate primer pairs.
  • primers are capable of binding the conserved sequences of the genome of a family of viruses. These conserved regions typically have a role in providing a necessary or advantageous activity or property to the virus.
  • the conserved sequences may be coding or non-coding sequences.
  • conserved sequences code for or are from virus proteins which have the following activities: DNA or RNA polymerase (replicase), topoisomerase (helicase/gyrase), endonuclease (integrase), nucleic acid binding protein, protease, transcription factors, envelope glycoproteins, structural protein (e.g. capsid or nucleocapsid protein).
  • each of the primer pairs used being selective (or specific) for members of a virus family (for example selective for a subfamily or genus).
  • where reference is made to the numbers of primers used in different embodiments of the invention, it is understood that this refers to the numbers of primers which are substantially specific for members of a virus family.
  • additional primer pairs may be used which are selective for more than one family (for example selective for 2 to 10, such as 3 to 6 families). Such embodiments are within the scope of the present invention.
  • the panel of primer pairs may comprise sets of primer pairs which perform a nested PCR reaction.
  • a set of primer pairs comprises a first and second primer pair.
  • the first primer pair is able to amplify a template nucleotide sequence from a virus to form a PCR product.
  • the second primer pair is able to amplify a nucleotide sequence using the PCR product generated by the first primer pair as a template.
  • Multiply nested sets of primer pairs may also be used. The use of nested sets of primer pairs allows increased sensitivity and specificity.
  • the panel of primers used is capable of detecting viruses which are single-stranded or double-stranded DNA or single-stranded or double-stranded RNA viruses.
  • the viruses are generally capable of infecting prokaryotic or eukaryotic cells, such as bacterial, animal, plant, yeast or fungal cells.
  • the viruses are mammalian (preferably primate) or avian viruses, such as human, pig, horse, sheep, goat, cow, chicken, turkey or duck viruses.
  • the viruses are typically from any combination of the following families: Adenoviridae, Arenaviridae, Arteriviridae, Astroviridae, Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Deltavirus, Filoviridae, Flaviviridae, Hepadnaviridae, Herpesviridae, Orthomyxoviridae, Papovaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Polydnaviridae, Poxviridae, Reoviridae, Retroviridae, Rhabdoviridae, Togaviridae or Bornavirus.
  • typically in the method 12 to 300 different primer pairs are used, such as 24 to 200 or 48 to 100 primer pairs. These primers may all be used in the same multi-well plate (placed on a thermal cycling machine).
  • the plate may be a 96-well or 384-well plate.
  • at least one of the wells in which the PCR is done comprises more than one primer pair, such as 2, 3, 4, 5, 6, 7, 8 or 9 primer pairs.
  • 3 to 96, such as 12 to 48, of the wells comprise more than one primer pair.
  • some or all of the primer pairs used in the same well carry different labels.
  • one or both primers of each primer pair carries a label.
  • if both primers of a primer pair carry a label then these labels are different from each other.
  • at least one of the primers in each primer pair will carry a different label from that used for the other primer pairs in the same well.
  • the PCR product generated by labelled primers carries the labels present on the primers.
  • all forward primers of the group are labelled with one colour and the reverse primers are labelled with a different colour.
  • the primers are labelled with a fluorescent label, such as fluorescein based labels (e.g. fluorescein isothiocyanate).
  • Different primer pairs may be labelled with fluorescent labels of different colours.
  • the fluorescent labels which are used may be capable of detection by a Beckman Coulter CEQ2000™ or Applied Biosystems A3700™ fluorescent DNA analyser.
  • the fluorescent labels may be obtained from Beckman Coulter™ or Applied Biosystems™.
  • each PCR product which is generated by the group of primers differs in size from all the other PCR products by at least 20, such as at least 50, 100, 200, 500, 1000 or more nucleotides.
  • Each PCR product may for example differ in size from all other PCR products by up to a maximum of 3000 nucleotides.
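The size-spacing requirement for products generated in the same well can be checked mechanically. This is a minimal sketch with a hypothetical function name; the default gap of 20 nucleotides is the smallest spacing mentioned in the text.

```python
from itertools import combinations
from typing import Iterable

def sizes_resolvable(product_sizes: Iterable[int], min_gap: int = 20) -> bool:
    """True if every pair of expected PCR product sizes differs by at least min_gap."""
    return all(abs(a - b) >= min_gap
               for a, b in combinations(list(product_sizes), 2))

print(sizes_resolvable([150, 200, 400]))  # True: all gaps are at least 20 nt
print(sizes_resolvable([150, 160, 400]))  # False: 150 and 160 differ by only 10 nt
```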
  • multiple biological samples are screened simultaneously by subjecting DNA from multiple samples to PCR conditions using simultaneously multiple pairs of primers.
  • each of the samples is from a different (typically human) individual.
  • 2 to 80, such as 5 to 40 samples are screened simultaneously in the method.
  • DNA from multiple samples is mixed together before being subject to PCR conditions. Typically 2 to 10 such as 5 to 8 samples are pooled together in this way. After the DNA has been subject to PCR conditions any PCR product which is obtained may be sequenced. Typically prior to sequencing the PCR product is gel purified and cloned into a vector, for example a plasmid or a bacteriophage vector.
  • Suitable plasmids are known and commercially available, such as pBluescript™.
  • Suitable bacteriophage include bacteriophage λ and M13.
  • the sequencing reaction may be carried out on the PCR product itself, for example using one of the PCR primers as a sequencing primer.
  • an automated sequencer is used to obtain the sequence of the PCR product, such as a Beckman Coulter CEQ2000™ or Applied Biosystems A3700™ DNA analyser.
  • Each of the primer pairs used in the method of the invention binds a sequence conserved across members of a virus family and selectively directs amplification of sequence from the members of the family.
  • the multiple primer pairs which are used are typically designed by:
  • each primer in the pair binds a nucleotide sequence that encodes a conserved region identified in (ii) and wherein the primer pair is designed to amplify by PCR the nucleotide sequence between the nucleotide sequences that encode conserved regions in members of the first virus family, and
  • the multiple primer pairs may also be designed by:
  • each primer in the pair binds a conserved region identified in (ii) and wherein the primer pair is designed to amplify by PCR the nucleotide sequence between the conserved regions in members of the first virus family, and
  • the multiple pairs of primers are capable of detecting unknown viruses in a sample, wherein such a sample originates from a single individual or is a pooled sample from individuals of the same species.
  • the panel of primers detects viruses which infect the same species.
  • the number of primers designed by the above steps is typically the same as the numbers of primers mentioned above for use in the method of the invention.
  • the primer pairs which are designed bind sequence which is conserved across members of a virus family.
  • the panel of primer pairs which is designed may comprise primer pairs that bind sequence which is conserved across substantially all members of the family or across a subset of the members of the family, for example across all members of a subfamily or of a genus.
  • the primer pairs bind at least 70%, at least 80%, or at least 90% of the known viruses of the family, subfamily or genus.
  • the panel of primer pairs is generally capable of detecting viruses from at least 10, 15, 20, 30 or more families, typically up to a maximum of 35 families.
  • the panel of primer pairs may comprise sets of primer pairs which perform a nested PCR reaction.
  • a set of primer pairs comprises a first and second primer pair.
  • the first primer pair is able to amplify a template nucleotide sequence from a virus to form a PCR product.
  • the second primer pair is able to amplify a nucleotide sequence using the PCR product generated by the first primer pair as a template.
  • the use of nested sets of primer pairs allows increased sensitivity.
  • each primer pair is specific for a particular virus family, so that it does not detect viruses of other families.
  • the plurality of amino acid or nucleotide sequences are provided from different known viruses of the same family.
  • the sequences will be for the same protein of the different viruses. Typically at least 5, 10, 20, 50, 100 or more sequences are provided. The maximum number of sequences provided will, for example, be 300 sequences.
  • Each of the sequences which is provided is typically at least 20, 50, 100, 200 or more amino acids or nucleotides in length. In general the maximum length of the nucleotide sequences is 1000 nucleotides and the maximum length of the amino acid sequences is 300 amino acids.
  • the sequences may be obtained from a database of sequences, such as GenBank.
  • the sequences may be obtained from a database comprising virus sequences which are organised into homologous protein families (based on sequence similarity relationships). In a preferred embodiment the sequences are obtained from the VIDA database (described in Alba et al (2001) Nucleic Acids Research 29, 133-136) or the Virus Division of GenBank.
  • the sequences may be provided in the form of a database, preferably in computer-readable form.
  • the sequences are preferably provided in the form of a computer-readable database constructed using programs which identify homologous protein families, such as GeneTableMaker, MKDOM or PSCBuilder.
  • conserved regions typically have a length of at least 12 nucleotides, such as at least 15, 21, 27, 36, 99 or more nucleotides (generally up to a maximum length of 200 nucleotides) or at least 4, 5, 7, 10, 25 or more amino acids (generally up to a maximum length of 50 amino acids).
  • the virus sequences which are being provided will of course share identity or similarity.
  • amino acids or nucleotides in at least 50% of the positions in the region will be the same in at least 50%, 60%, 70%, or 80% of the viruses of the group (i.e. in the family, genus or subfamily).
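One way to read this conservation criterion is as a two-threshold test on an ungapped alignment block. The sketch below is an interpretation rather than the patent's procedure; the function name and the toy block are hypothetical.

```python
from collections import Counter
from typing import Sequence

def is_conserved(block: Sequence[str],
                 min_positions: float = 0.5,
                 min_sequences: float = 0.5) -> bool:
    """Test an aligned, ungapped block of equal-length segments.

    A position "agrees" when its most common residue occurs in at least
    min_sequences of the sequences. The block counts as conserved when at
    least min_positions of its positions agree.
    """
    if not block or not block[0] or len({len(s) for s in block}) != 1:
        raise ValueError("block must contain non-empty, equal-length segments")
    n_seqs = len(block)
    agreeing = 0
    for column in zip(*block):
        top_count = Counter(column).most_common(1)[0][1]
        if top_count / n_seqs >= min_sequences:
            agreeing += 1
    return agreeing / len(block[0]) >= min_positions

toy_block = ["GDDG", "GDDA", "GDEG", "ADDG"]  # made-up segments
print(is_conserved(toy_block, min_sequences=0.7))  # True
print(is_conserved(toy_block, min_sequences=0.8))  # False
```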
  • the algorithm which identifies conserved regions generally uses a multiple sequence alignment method.
  • the method may comprise (a) aligning all pairs of sequences separately to calculate a distance matrix giving the divergence of each pair of sequences, (b) calculating a guide tree from the distance matrix, and (c) aligning the sequences progressively according to the branching order in the guide tree.
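Step (a), and the first decision of step (b), can be illustrated with a toy distance matrix. As a loud simplification, the sketch below assumes the sequences are already the same length and measures divergence as the fraction of mismatched positions, whereas a real progressive aligner first aligns each pair. All names and sequences are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple

def p_distance(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_matrix(seqs: Dict[str, str]) -> Dict[str, Dict[str, float]]:
    """Symmetric matrix of pairwise divergences, keyed by sequence name."""
    names = list(seqs)
    d = {n: {m: 0.0 for m in names} for n in names}
    for a, b in combinations(names, 2):
        d[a][b] = d[b][a] = p_distance(seqs[a], seqs[b])
    return d

def closest_pair(d: Dict[str, Dict[str, float]]) -> Tuple[str, str]:
    """The least divergent pair, i.e. the first pair a guide tree would join."""
    return min(combinations(d, 2), key=lambda pair: d[pair[0]][pair[1]])

seqs = {"v1": "ACGTAC", "v2": "ACGTAA", "v3": "TTGTCC"}  # made-up sequences
matrix = distance_matrix(seqs)
print(closest_pair(matrix))  # ('v1', 'v2')
```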
  • a preferred algorithm for aligning the conserved sequences is
  • BLOCKS (conserved regions of amino acids) may be extracted from the multiple alignments, typically using the program Blocks Multiple Alignment Processor. Alternatively the entire process of performing multiple alignments and extracting BLOCKS can be performed using BLOCKMAKER (Henikoff and Henikoff (1994) Genomics 19, 97-107).
  • the output from the alignment and BLOCK extraction set (i.e. the information describing the identified conserved regions) is then entered into the algorithm which designs the primers.
  • Such output is typically in the form of partial sequences which correspond to the conserved regions (BLOCKS).
  • these BLOCKS are input into a primer design algorithm.
  • such an algorithm is CODEHOP.
  • the conserved regions which are chosen as targets for primers preferably comprise few codons with degenerate counterparts, i.e. preferably the sequence has a low redundancy, such as a redundancy of less than 512 fold, 256 fold or 128 fold.
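Redundancy here can be read as the number of distinct sequences a degenerate primer represents, which is the product of the number of possible bases at each position. A minimal sketch using the standard IUPAC ambiguity codes follows; the function names and the example primer are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total

def low_redundancy(primer: str, limit: int = 128) -> bool:
    """True if the primer's redundancy is below the chosen fold limit."""
    return degeneracy(primer) < limit

print(degeneracy("GGNTAYGAR"))  # 4 * 2 * 2 = 16
print(low_redundancy("NNNN"))   # 256-fold, so False at the 128-fold limit
```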
  • Each primer binds in accordance with Watson-Crick base pairing and thus the binding is sequence specific.
  • Each primer will thus be designed to be wholly or partially complementary to the sequence to which it binds.
  • Each of the primers typically has a length of at least 8 nucleotides, such as at least 10, 12, 15, 20, 30, 40 or more nucleotides (up to a maximum of 50 nucleotides for example).
  • the primer may comprise at least 2, 4 or 6, up to a maximum of 10 for example, inosine bases. Inosine is able to bind to any of the four nucleotides and therefore use of inosine causes a reduction in effective redundancy.
  • Each primer pair will be designed so that the PCR product generally has a length of at least 20, such as at least 50, 100, 200, 500, 1000 or more nucleotides (and typically up to a maximum of 5×10³ nucleotides long).
  • Each primer is preferably designed so that it anneals to a single site, i.e. the primer will not bind to any other site in the genome of the relevant viruses.
  • Each primer is preferably designed so that it does not exhibit secondary structure, i.e. the nucleotides in the primer will not bind substantially to any other nucleotide in the primer apart from those to which it is covalently linked.
  • each primer is designed so that it does not bind other primers with the same sequence.
  • the 3' region, and preferably the 3' terminal nucleotide of the primer binds to the target sequence with high affinity, thus preferably this region or nucleotide comprises a G or C.
  • each primer is designed to have an annealing temperature of from 30 to 65 °C, such as 50 to 60°C or 35 to 45°C.
  • each primer pair may be designed to ensure that the two primers do not bind to each other.
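Two of these design preferences lend themselves to quick checks: a G or C at the 3' terminus, and a temperature within the stated window. The sketch below estimates melting temperature with the Wallace rule of thumb, Tm ≈ 2 × (A+T) + 4 × (G+C) in °C, as a rough proxy. The patent does not say how annealing temperature is calculated, so that choice, the function names and the example primer are all assumptions.

```python
def has_gc_3prime(primer: str) -> bool:
    """True if the 3' terminal nucleotide (the last character) is G or C."""
    return primer.upper()[-1] in "GC"

def wallace_tm(primer: str) -> int:
    """Wallace rule estimate of melting temperature for a short, non-degenerate oligo."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc

def in_window(primer: str, low_c: float = 30.0, high_c: float = 65.0) -> bool:
    """True if the estimated temperature falls in the 30 to 65 C window from the text."""
    return low_c <= wallace_tm(primer) <= high_c

primer = "ATGGCTAGCTAGGATCC"  # made-up 17-mer
print(has_gc_3prime(primer), wallace_tm(primer), in_window(primer))
```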
  • the primers are designed by a computer based algorithm.
  • such an algorithm designs primers according to the following rules:
  • a set of blocks is input, where a block is an aligned array of amino acid sequence segments without gaps that represents a highly conserved region of homologous proteins.
  • a weight is provided for each sequence segment, which can be increased to favour the contribution of selected sequences in designing the primer.
  • a codon usage table is chosen for the target genome.
  • PSSM amino acid position-specific scoring matrix
  • a DNA PSSM is calculated from the amino acid matrix (step 2) and the codon usage table.
  • the DNA matrix has three positions for each position of the amino acid matrix.
  • the score for each amino acid is divided among its codons in proportion to their relative weights from the codon usage table, and the scores for each of the four different nucleotides are combined in each DNA matrix position. Nucleotide positions are treated independently when the scores are combined. As an option, the highest scoring nucleotide residue from each position can replace the most common codons from step 4 that are used in the consensus clamp.
  • the degeneracy is determined at each position of the DNA matrix based on the number of bases found there.
  • a weight threshold can be specified such that bases that contribute less than a minimum weight are ignored in determining degeneracy.
  • Possible degenerate core regions are identified by scanning the DNA matrix in the 3' to 5' direction.
  • a core region must start on an invariant 3' nucleotide position, have length of 11 or 12 positions ending on a codon boundary, and have a maximum degeneracy of 128 (this is the default setting of CODEHOP).
  • the degeneracy of a region is the product of the number of possible bases in each position.
  • Candidate degenerate core regions are extended by addition of a 5' consensus clamp from step 4 or 5.
  • the length of the clamp is controlled by a melting point temperature calculation (the CODEHOP default is 60 °C) and is usually about 20 nucleotides.
  • Steps 7 and 8 are repeated on the reverse complement of the DNA matrix from step 5 for primers corresponding to the opposite DNA strand.
  • CODEHOP (Rose et al (1998) Nucleic Acids Research 26, 1628-1635) is used to design the primer pairs. This program uses the above rules.
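The rule for deriving a DNA matrix from the amino acid matrix can be sketched in a few lines: each amino acid's score is split among its codons in proportion to codon usage, and nucleotide scores are summed independently at each of the three codon positions. This is an illustration of the rule as stated, not CODEHOP's actual code; the toy scores and codon weights are made up.

```python
from typing import Dict, List

def dna_pssm(aa_pssm: List[Dict[str, float]],
             codon_usage: Dict[str, Dict[str, float]]) -> List[Dict[str, float]]:
    """Expand an amino acid matrix into a DNA matrix, three columns per residue."""
    dna: List[Dict[str, float]] = []
    for aa_scores in aa_pssm:
        cols = [dict.fromkeys("ACGT", 0.0) for _ in range(3)]
        for aa, score in aa_scores.items():
            codons = codon_usage[aa]
            total = sum(codons.values())
            for codon, weight in codons.items():
                share = score * weight / total  # this codon's share of the score
                for i, nt in enumerate(codon):
                    cols[i][nt] += share        # positions are treated independently
        dna.extend(cols)
    return dna

def position_degeneracy(col: Dict[str, float], min_weight: float = 0.0) -> int:
    """Count bases at a position, ignoring those below the weight threshold."""
    return sum(1 for s in col.values() if s > 0 and s >= min_weight)

aa_matrix = [{"M": 1.0}, {"D": 0.75, "E": 0.25}]           # toy scores
usage = {"M": {"ATG": 1.0}, "D": {"GAT": 0.5, "GAC": 0.5},
         "E": {"GAA": 0.5, "GAG": 0.5}}                    # toy codon weights
matrix = dna_pssm(aa_matrix, usage)
print(matrix[5], position_degeneracy(matrix[5], min_weight=0.2))
```

In this toy case the third position of the second codon carries four bases in total, but only two (T and C) pass a weight threshold of 0.2, which shows how the threshold lowers the counted degeneracy.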
  • the primers designed by the algorithm may then be mapped back to the original sequence to choose primer pairs which provide the desired length of PCR product.
  • primer pairs can then be synthesised and tested. They are typically tested to determine the optimal conditions for using the primers in a PCR reaction.
  • the primers are tested for their ability to amplify one or more of the plurality of nucleotide sequences from known viruses which were used to design the primers, or in the case of amino acid sequences from known viruses being used to design the primers the primers may be tested for their ability to amplify the nucleotide sequence from the virus which encodes the amino acid sequence.
  • the primers may be tested in a range of buffer conditions to determine optimal buffer conditions for PCR using the primers.
  • the buffer conditions which may be tested include pH (typically between 7 and 10), magnesium concentration (typically from 0.5 mM to 5 mM), potassium chloride (typically from 0 to 100 mM), ammonium chloride (typically 0 to 100 mM), glycerol (typically 0 to 20%), dimethylsulphoxide (typically 0 to 20%), ethanol (typically 0 to 20%), sorbitol (typically 0 to 20%) or betaine (typically 1 M betaine).
  • the primers may be tested at a range of different temperatures to determine the optimal temperatures in the PCR reaction.
  • the primers are tested in PCR reactions in which a range of primer annealing temperatures is tested.
  • the range of temperatures is from 30 to 65 °C.
  • the panel of primer pairs or a group of primers within the panel may be designed to be used together on the same plate (i.e. using the same thermal cycles). Thus such primer pairs will be designed to work at the same annealing temperature.
  • a group of primer pairs within the panel are designed to have similar optimal conditions for use in PCR so that they can be used optimally in the same well or reaction vessel, i.e. that they can be used in multiplex PCR.
  • Such a group typically comprises at least 2, 3, 4, 5, 6 or more primer pairs (up to a maximum of 8 primer pairs for example).
  • the computer based method steps may be used to design primer pairs which are calculated to have similar annealing temperatures and/or the primers are tested to select primer pairs which can be used optimally together. Such testing typically determines whether the primers work optimally with the same buffers and/or whether the primers have similar annealing temperatures.
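Selecting primer pairs that can share a well might be sketched as a greedy grouping by annealing temperature. The 2 °C tolerance, the function name and the example temperatures are assumptions; the cap of 8 pairs per group follows the example maximum given in the text.

```python
from typing import Dict, List

def group_for_multiplex(anneal_temps: Dict[str, float],
                        tolerance_c: float = 2.0,
                        max_group: int = 8) -> List[List[str]]:
    """Greedily group primer pairs whose annealing temperatures lie within
    tolerance_c of the coolest member of the group, up to max_group per group."""
    groups: List[List[str]] = []
    current: List[str] = []
    for name, temp in sorted(anneal_temps.items(), key=lambda kv: kv[1]):
        too_far = bool(current) and temp - anneal_temps[current[0]] > tolerance_c
        full = len(current) >= max_group
        if current and (too_far or full):
            groups.append(current)
            current = []
        current.append(name)
    if current:
        groups.append(current)
    return groups

temps = {"herpes": 50.0, "adeno": 51.0, "retro": 55.0, "flavi": 56.5}  # made-up values
print(group_for_multiplex(temps))  # [['herpes', 'adeno'], ['retro', 'flavi']]
```

A fuller version would also check buffer compatibility and product size spacing before assigning pairs to the same well.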
  • the next step is to determine whether each sequence is present in at least one database of known nucleic acid sequences, typically sequences of viruses known to infect the individuals from which the samples are derived.
  • Appropriate databases include the virus subdivision of GenBank or the VIDA database.
  • each sequence is typically also compared with a database of human sequences to exclude sequences which are human sequences.
  • a database is generally a comprehensive or consensus human genome database.
  • at least one of the human sequence databases searched contains an essentially complete human genome sequence.
  • the human genome contains large areas with repetitive sequences, and much of the unsequenced genome is within these areas.
  • a database comprising expressed sequence tags (ESTs) and a database comprising repetitive elements of the human genome
  • Appropriate databases include GenBank, the EMBL database, the Celera human genome database, the Ensembl human genome database, the DNA Data Bank of Japan (DDBJ), the Incyte LifeSeq™ database of ESTs and the Repbase database of repetitive elements in the human genome.
  • a nucleic acid sequence designated a PCV is not in fact a human sequence.
  • a preferred way of doing this involves designing and synthesising a specific primer set (or sets) to amplify the nucleic acid designated a PCV and determining whether the set(s) are able to amplify any DNA in a sample of complete genomic human DNA.
  • the amplification conditions for each primer set may be optimised using the original sample from which the PCV derives or using the PCR product which is obtained in the method of the invention.
  • the primer set may be used to screen one or more samples of human genomic DNA, for example from 1 to 100 samples, preferably from 5 to 50 samples.
  • human genomic DNA may be probed with a labelled probe containing sequence from the original PCR product (e.g. by Southern blotting). If the PCV cannot be detected in human DNA by experimentation (by PCR or hybridisation with a labelled probe), it may then be subjected to further analysis. It may be designated a Secondary Candidate Virus (SCV).
  • the further analysis of an SCV may include gene walking to determine whether the original cloned nucleic acid sequence exists in nature as part of a longer sequence, such as the genomic sequence of an unknown virus.
  • Gene walking may be carried out using techniques known in the art, such as vectorette PCR (Allen et al, PCR Methods Appl. 4:71-75), rapid amplification of cDNA ends (RACE, Frohman et al Proc Natl Acad Sci U S A. 85:8998-9002), rapid amplification of genomic ends (RAGE, Cormack and Somssich. 1997. Gene. 194:273-276) and methods derived from these.
  • the SCV sequence may be "extended" by screening a DNA or cDNA library using the original cloned nucleic acid sequence as a probe.
  • the additional sequence information obtained through DNA walking may reveal information about the identity of the SCV which cannot be determined from the original clone.
  • the additional information may therefore be analysed, for example to determine whether it contains an open reading frame (i.e. a sequence encoding a protein); the presence of an open reading frame provides further support for the suggestion that the SCV is a virus.
  • the additional information may identify the SCV as being related to a known virus; for example, the information may identify the SCV as being a new member of a known family of viruses.
  • a further step may then be to determine whether a newly-identified candidate virus is associated with a disease, for example with a cancer, autoimmune disease, cardiovascular disease or other disease mentioned above. This may be done by obtaining a specimen from each member of a group of subjects with a disease; determining whether the cloned nucleic acid or other nucleic acid of the same virus is present in each specimen; and determining whether the proportion of subjects in whom the nucleic acid is present is greater in the group of subjects who have the disease than in a control group of subjects who do not have the disease, wherein a said greater proportion suggests that the virus may cause or contribute to the disease.
  • the process of determining whether the nucleic acid is present or absent from a specimen from a subject may be carried out by PCR using primers specific for the novel sequence (including any contiguous sequence obtained by DNA walking). Initially, perhaps from 10 to 50 patients from a disease group may be tested, but if positive results are obtained in initial studies, the investigation may be extended to a larger group (e.g. a group of up to 100, 500, 1000 or 10,000).
  • a group e.g. a group of up to 100, 500, 1000 or 10,000.
  • the nature of the biological specimens taken from the members of the group varies depending on the disease association that is being investigated; where possible specimens are from disease affected tissue and from peripheral blood of the subjects (for a published example of this see Griffiths et al, 1999, Arthritis Rheumatism, 42:448-454).
  • the specimens may be from the same tissue and fluid types as the biological samples used in the initial screening assay described above.
  • serological and genetically-based diagnostic assays for infection by the virus may readily be devised. Genetically-based assays can be developed by using the nucleotide sequence of the virus to design probes and/or PCR primers for specifically detecting the nucleic acid of the virus. Serological assays can be developed by producing recombinant proteins or protein fragments encoded by the virus and testing for the presence of antibodies to these proteins in human sera. Alternatively, antibodies specific for the proteins of the virus may be made and the antibodies used to detect the virus directly.
  • the serological assays may take the format of an ELISA, western blot or immunofluorescence assay. Correlations may be sought between serological data and genetic data. Furthermore, the organism provides a target for the development of therapies and/or prophylactic vaccines against the disease.
  • Example 1 illustrates the invention.
  • a panel of primers was designed for detecting unknown viruses from the family Herpesviridae according to the strategy shown in Figure 1.
  • the amino acid sequences of herpes virus DNA packaging protein UL15 were obtained from the VIDA database (Alba et al, see above). These sequences are shown in Table 1.
  • the sequences obtained from the VIDA database were then imported into CLUSTALW. This compares the protein sequences to identify conserved regions and then aligns the sequences according to the conserved regions. The alignment produced by CLUSTALW is shown in Table 2.
  • [Table 2: CLUSTALW alignment of the herpes virus UL15 amino acid sequences, with alignment position ruler; the aligned sequence text was garbled in extraction and is not reproduced here.]

Abstract

A high throughput method for screening a biological sample for unknown viruses, which method comprises: (a) subjecting DNA from the sample to PCR amplification conditions using simultaneously multiple pairs of degenerate primers, wherein each primer binds a sequence that is conserved across members of a family of viruses and each pair of primers selectively directs amplification of sequence of said family; (b) sequencing PCR product obtained in step (a); and (c) comparing the sequence of the PCR product with the sequences in at least one database comprising viral sequences to determine whether the sequence is present in, or absent from, the database, wherein absence of the sequence from the database suggests that the sequence may be from an unknown virus.

Description

VIRUS DETECTION USING DEGENERATE PCR PRIMERS
Field of the invention
The invention relates to a method of detecting new viruses using a high throughput polymerase chain reaction (PCR) assay.
Background of the invention
Biological materials can often become contaminated or infected with unidentified organisms. For example, cells grown in tissue culture often exhibit signs of a cytopathic effect consistent with a virus infection but the identity of the virus may not be apparent. Human blood products, such as factor VIII for the treatment of haemophiliacs, can be contaminated with unidentified viruses, as was demonstrated by infection of many haemophiliacs with human immunodeficiency virus in the early 1980s. Similarly, two decades ago approximately 20% of individuals who received transfused blood contracted hepatitis C (Randall, 2001, J Pediatr. Oncol. Nurs. 18(1), 4-15).
Summary of the invention
PCR allows amplification of a specific region of a polynucleotide. The specificity of the reaction is due to the primers which, during the course of PCR, bind to the region to be amplified in a sequence specific manner. Degenerate primers can be designed which amplify sequence from substantially all members of a virus family. Such primers typically bind to nucleotide sequence which is conserved across the virus family. The invention provides a PCR based high throughput screen that uses such degenerate primers for detecting unknown viruses.
In particular, the invention provides a high throughput method for screening a biological sample for unknown viruses, which method comprises (a) subjecting DNA from the sample to PCR amplification conditions using simultaneously multiple pairs of degenerate primers, wherein each primer binds a sequence that is conserved across members of a family of viruses and each pair of primers selectively directs amplification of sequence of said family;
(b) sequencing PCR product obtained in step (a); and
(c) comparing the sequence of the PCR product with the sequences in at least one database comprising viral sequences to determine whether the sequence is present in, or absent from, the database, wherein absence of the sequence from the database suggests that the sequence may be from an unknown virus.
Detailed description of the invention
General description
There are a number of human diseases in which unidentified viruses are thought to play a causative role. For example, unidentified viruses are believed to play a role in cancers such as leukaemia, autoimmune diseases such as rheumatic disease, cardiovascular diseases such as dilated cardiomyopathy and Kawasaki disease, and prostatitis (zur Hausen 2001 The Lancet 357, 381-384; Greaves 1997 The Lancet 349, 344-349; Rowley and Shulman 1998 Clinical Microbiology Reviews 11(3), 405-414; Kawai 1999 Circulation 99, 1091-1100; and Domingue and Hellstrom 1998 Clinical Microbiology Reviews 11(4), 604-613). Particles resembling retroviruses have been reported in affected tissue from patients with psoriasis, Sjogren's syndrome and rheumatoid arthritis (Iversen 1990 J. Invest. Dermatol. 90, 41S-3S; Garry et al 1990 Science 250, 1127-9; Yamano et al 1997 J. Clin. Pathol. 50, 223-30; and Stransky et al 1993 Br. J. Rheumatol. 32, 1044-8). The invention provides a way of screening for the viruses which may cause or contribute to such diseases. Once identified, the viruses may be used as a target for developing diagnostic tests for, or therapies against, the diseases.
The method of the invention is based on obtaining sequences from viruses so that they can be compared with known viral sequences to determine whether they are from novel viruses. The sequences of the novel viruses are amplified using PCR primers which recognise sequences which are conserved (similar/homologous) in known members of virus families. The primers direct amplification of sequence between the conserved regions to give a PCR product whose sequence can be compared with that of known viruses.
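As a rough illustration of the comparison step, the following Python sketch scores a PCR product sequence against a small in-memory set of known viral sequences and flags it as a candidate unknown virus when no close match is found. The similarity measure (`difflib.SequenceMatcher`), the 0.9 threshold and the names `best_match` and `is_candidate_unknown` are assumptions made for illustration only; they are not taken from the description, and a real screen would query sequence databases with a dedicated search tool.

```python
from difflib import SequenceMatcher


def best_match(query, known):
    """Return (name, similarity) of the closest known sequence.

    `known` maps a hypothetical virus name to its nucleotide sequence.
    Similarity is difflib's ratio (0.0 to 1.0), a toy stand-in for a
    real database search.
    """
    best_name, best_score = None, 0.0
    for name, seq in known.items():
        score = SequenceMatcher(None, query, seq, autojunk=False).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


def is_candidate_unknown(query, known, threshold=0.9):
    """True when no known sequence reaches the assumed similarity threshold."""
    _, score = best_match(query, known)
    return score < threshold
```

A sequence flagged in this way would only be a candidate; the description goes on to require further checks before any conclusion is drawn.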
The biological sample which is screened may be any sample susceptible to infection by a virus. It may, for example, be a tissue culture sample (e.g. tissue culture supernatant), or a sample of animal (including human) or plant material. In a particularly preferred embodiment the invention is directed to the identification of unknown human viruses, and in this case the sample will generally be derived from one or more humans. A sample derived from a human or animal may be from a range of tissue and fluid types, for example blood serum, seminal fluid, breast milk, saliva, cerebrospinal fluid, urine, bile, bronchial lavage fluid, nasal secretion, eye secretion or vaginal wash.
Before the sample is subjected to PCR it may be processed. In one embodiment the virus material in the sample is concentrated, for example by ultracentrifugation. The virus material may also be purified in a manner which increases the content of viral nucleic acid relative to non-viral nucleic acid. For example the viral nucleic acid may be concentrated by centrifuging the biological sample under conditions such that cell debris is pelleted and virus particles remain in the supernatant; collecting the supernatant; and centrifuging the supernatant under conditions such that virus particles are pelleted.
The initial centrifugation to pellet the cell debris may, for example, be carried out at 100 to 10,000 g, preferably from 1000 to 10,000 g. The subsequent centrifugation to pellet the virus particles is carried out at a higher g force, for example 50,000 to 500,000 g, preferably about 100,000 g.
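The g forces above are relative centrifugal forces, whereas centrifuges are often set in revolutions per minute. The following sketch uses the standard conversion RCF = 1.118 x 10^-5 x r x rpm^2 (r in centimetres), which is general laboratory practice and is not stated in the description; the function names and the 8 cm rotor radius in the usage note are hypothetical.

```python
import math

# Standard conversion factor for rotor radius in cm and speed in rpm.
RCF_CONSTANT = 1.118e-5


def rcf(radius_cm, rpm):
    """Relative centrifugal force (in g) at the given radius and speed."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for(radius_cm, target_g):
    """Rotor speed (rpm) needed to reach target_g at the given radius."""
    return math.sqrt(target_g / (RCF_CONSTANT * radius_cm))
```

For example, `rpm_for(8.0, 100_000)` gives the speed at which a hypothetical rotor of 8 cm radius would reach the preferred force of about 100,000 g for pelleting virus particles.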
The purification of viral nucleic acid may include a step of treating a suspension comprising the virus with a nuclease so as to digest extraneous nucleic acid, wherein the viral nucleic acid is protected from digestion by viral coat or core protein. The nuclease is preferably a non-sequence-specific nuclease which digests DNA and/or RNA, for example micrococcal nuclease S7 (Roche Molecular Biochemicals, Catalogue 107 921).
The processing may also comprise a nucleic acid purification, such as phenol/chloroform nucleic acid purification or the use of a column which selectively binds nucleic acid. In one embodiment purification is carried out using a Qiagen™ column.
Typically processing of the sample increases the purity of the virus nucleic acid present in the sample (for example leading to an increase in concentration of 2-fold to 1000-fold of viral nucleic acid).
The processing of the sample may comprise the reverse transcription of viral RNA in the sample to DNA, i.e. RNA from the unknown virus is processed to produce the equivalent (such as the same or a complementary) DNA sequence. Thus the DNA which is subject to PCR conditions may be cDNA. This is required when the unknown virus has an RNA genome. Thus the processing may comprise reverse transcription of the RNA to produce a complementary DNA strand and then optionally synthesising a second DNA strand before carrying out PCR. This can be achieved by using a primer which directs initiation at random sequences in a reverse transcription reaction and then in a second strand synthesis reaction.
Random reverse transcription may be directed using a primer which directs initiation of DNA synthesis at random sequences. Such a primer may be made by synthesising it so that it contains a random sequence, for example a sequence of at least 6 consecutive nucleotides (e.g. from 6 to 20 nucleotides) wherein each nucleotide may be any of the four possible natural nucleotides, i.e. A, T, C or G. In other words, such a primer contains a sequence NNNNNN wherein each N is A, T, C or G.
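A small sketch of what a random primer of the NNNNNN kind represents: the number of distinct sequences in the mixture, and one sequence drawn from it. The function names are illustrative and not part of the description.

```python
import random

NUCLEOTIDES = "ATCG"  # the four natural nucleotides named in the text


def count_random_primers(length):
    """Number of distinct sequences a fully random primer of this length can take."""
    return len(NUCLEOTIDES) ** length


def random_primer(length, rng):
    """Draw one sequence from the random primer population using the given RNG."""
    return "".join(rng.choice(NUCLEOTIDES) for _ in range(length))
```

A random hexamer therefore corresponds to 4^6 = 4096 different sequences, which is why it can direct initiation at essentially arbitrary sites.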
In one embodiment a "single tube system" is used for the reverse transcription and then PCR with the multiple pairs of degenerate primers. In such a system the sample (typically after being processed) is added to a mixture of reagents which allow both reverse transcription and PCR to occur. Thus typically the mixture will comprise both a reverse transcriptase and a thermostable DNA polymerase. The mixture may comprise the Titan™ reagents from Roche Molecular Biochemicals™ (cat no. 1855476) which use the avian myeloblastosis virus reverse transcriptase and a Pwo (Pyrococcus woesei) thermostable DNA polymerase. Alternatively the ProSTAR™ system from Stratagene™ may be used.
The PCR reaction is carried out in a PCR mixture that generally comprises the following: the template DNA (which will be amplified in the event of virus detection), one or more primer pairs specific for members of a virus family, a thermostable polymerase enzyme (typically a DNA polymerase, such as Taq polymerase), deoxynucleotide triphosphates (dATP, dTTP, dCTP and dGTP) and a suitable buffer. The PCR reaction generally comprises cycles of the following steps: a denaturation step, a primer annealing step and a polynucleotide synthesis step. Typically the PCR reaction comprises at least 25 cycles, such as 30, 35, 40 or more cycles, up to a maximum of 60 cycles for example.
Generally, in the denaturation step, the PCR mixture is heated to a temperature at which the DNA in the PCR mixture (in particular the region to be amplified) denatures to single-stranded form. The denaturing temperature is generally from 85 to 98°C.
In one embodiment the PCR reaction comprises a "hot start" in which the PCR mixture is kept at the denaturing temperature for an extended amount of time before commencement of the thermal cycles, such as for 5 to 30 minutes, preferably 10 to 20 minutes. The use of Amplitaq Gold™ DNA polymerase (Applied Biosystems™) is preferred when the PCR reaction comprises a hot start.
In the primer annealing step the primers bind to template nucleotide sequence in a sequence specific manner. This step is generally carried out at a temperature of from 30 to 65°C.
In the polynucleotide synthesis step the polymerase replicates/synthesises nucleotide sequence based on template sequence by addition of nucleotides to the 3' end of the bound primers. This step is generally carried out at about 72°C.
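The cycling parameters described above can be captured in a small data structure. The sketch below is illustrative only: the class name, field names and the idealised doubling calculation are assumptions, while the numeric ranges are those given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CyclingProgram:
    cycles: int
    denature_c: float
    anneal_c: float
    extend_c: float = 72.0  # synthesis step, "about 72 C" in the text

    def within_described_ranges(self):
        """Check the program against the ranges stated in the description."""
        return (25 <= self.cycles <= 60
                and 85 <= self.denature_c <= 98
                and 30 <= self.anneal_c <= 65)

    def max_fold_amplification(self):
        """Theoretical upper bound, assuming perfect doubling every cycle."""
        return 2 ** self.cycles
```

Real reactions fall well short of perfect doubling, so `max_fold_amplification` is only an upper bound on yield.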
In the method of the invention, the sample (generally after processing as described above) is subject to PCR conditions using a panel of multiple degenerate primer pairs. In the course of a PCR reaction such primers are capable of binding the conserved sequences of the genome of a family of viruses. These conserved regions typically have a role in providing a necessary or advantageous activity or property to the virus. Generally, the conserved sequences may be coding or non-coding sequences. In one embodiment the conserved sequences code for or are from virus proteins which have the following activities: DNA or RNA polymerase (replicase), topoisomerase (helicase/gyrase), endonuclease (integrase), nucleic acid binding protein, protease, transcription factors, envelope glycoproteins, structural protein (e.g. capsid or nucleocapsid protein).
As discussed above, multiple pairs of primers are used in the method, each of the primer pairs used being selective (or specific) for members of a virus family (for example selective for a subfamily or genus). In the disclosure below regarding the numbers of primers used in different embodiments of the invention it is understood that this refers to the numbers of primers which are substantially specific for members of a virus family. However, in some embodiments additional primer pairs may be used which are selective for more than one family (for example selective for 2 to 10, such as 3 to 6 families). Such embodiments are within the scope of the present invention.
The panel of primer pairs may comprise sets of primer pairs which perform a nested PCR reaction. Generally such a set of primer pairs comprises a first and second primer pair. The first primer pair is able to amplify a template nucleotide sequence from a virus to form a PCR product. The second primer pair is able to amplify a nucleotide sequence using the PCR product generated by the first primer pair as a template. Multiply nested sets of primer pairs may also be used. The use of nested sets of primer pairs allows increased sensitivity and specificity.
The panel of primers used is capable of detecting viruses which are single-stranded or double-stranded DNA or single-stranded or double-stranded RNA viruses. The viruses are generally capable of infecting prokaryotic or eukaryotic cells, such as bacterial, animal, plant, yeast or fungal cells. Preferably the viruses are mammalian (preferably primate) or avian viruses, such as human, pig, horse, sheep, goat, cow, chicken, turkey or duck viruses.
The viruses are typically from any combination of the following families: Adenoviridae, Arenaviridae, Arteriviridae, Astroviridae, Birnaviridae, Bunyaviridae, Caliciviridae, Circoviridae, Coronaviridae, Deltavirus, Filoviridae, Flaviviridae, Hepadnaviridae, Herpesviridae, Orthomyxoviridae, Papovaviridae, Paramyxoviridae, Parvoviridae, Picornaviridae, Polydnaviridae, Poxviridae, Reoviridae, Retroviridae, Rhabdoviridae, Togaviridae or Bornavirus.
Typically in the method 12 to 300 different primer pairs are used, such as 24 to 200 or 48 to 100 primer pairs. These primers may all be used in the same multi-well plate (placed on a thermal cycling machine). The plate may be a 96-well or 384-well plate. In a preferred embodiment at least one of the wells in which the PCR is done comprises more than one primer pair, such as 2, 3, 4, 5, 6, 7, 8 or 9 primer pairs. Typically 3 to 96, such as 12 to 48, of the wells comprise more than one primer pair.
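One simple way to lay out a panel on a plate, sketched under assumptions: primer pairs are dealt round-robin into wells so that no well exceeds the stated maximum of 9 pairs. The function name and the round-robin strategy are illustrative; the description does not prescribe how pairs are grouped.

```python
def assign_to_wells(primer_pairs, n_wells=96, max_per_well=9):
    """Distribute primer pairs over wells, round-robin.

    Returns a list of n_wells lists; well i holds the pairs assigned to it.
    Raises ValueError if the panel cannot fit within max_per_well.
    """
    if len(primer_pairs) > n_wells * max_per_well:
        raise ValueError("too many primer pairs for this plate")
    wells = [[] for _ in range(n_wells)]
    for i, pair in enumerate(primer_pairs):
        wells[i % n_wells].append(pair)
    return wells
```

With 100 primer pairs on a 96-well plate, this layout places two pairs in each of the first four wells and one pair in each remaining well. In practice the grouping would also need to consider primer compatibility, which this sketch ignores.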
In one embodiment, some or all of the primer pairs used in the same well carry different labels. Thus, one or both primers of each primer pair carries a label. When both primers of a primer pair carry a label then these labels are different from each other. Typically, at least one of the primers in each primer pair will carry a different label from that used for the other primer pairs in the same well. The PCR product generated by labelled primers carries the labels present on the primers.
Thus, after different primer pairs have been used for PCR in the same well, detection of the labels in the PCR products can be used to deduce which primer pair has directed the PCR reaction. In one embodiment all forward primers of the group are labelled with one colour and the reverse primers are labelled with a different colour.
In a preferred embodiment the primers are labelled with a fluorescent label, such as fluorescein based labels (e.g. fluorescein isothiocyanate). Different primer pairs may be labelled with fluorescent labels of different colours. The fluorescent labels which are used may be capable of detection by a Beckman Coulter CEQ2000™ or Applied Biosystems A3700™ fluorescent DNA analyser. The fluorescent labels may be obtained from Beckman Coulter™ or Applied Biosystems™.
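The deduction of which primer pair directed amplification from the labels detected in a well amounts to a lookup, as in this sketch. The pair names, label names and the function `decode_well` are hypothetical.

```python
def decode_well(label_by_pair, detected_labels):
    """Return the primer pairs whose label was detected in the PCR product.

    label_by_pair maps each primer pair in the well to its distinguishing
    label; the labels must be unique within the well for decoding to work.
    """
    labels = list(label_by_pair.values())
    if len(labels) != len(set(labels)):
        raise ValueError("each primer pair in a well needs a distinct label")
    return sorted(pair for pair, label in label_by_pair.items()
                  if label in detected_labels)
```

For instance, if a well holds a herpesvirus pair labelled green and a retrovirus pair labelled red, detecting only red signal in the product points to the retrovirus pair.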
Another way of being able to determine which PCR products are generated by which primer pair is for each primer pair in the group to generate a PCR product of different size to the PCR products generated by the other primer pairs of the group. Typically each PCR product which is generated by the group of primers differs in size from all the other PCR products by at least 20, such as at least 50, 100, 200, 500, 1000 or more nucleotides. Each PCR product may for example differ in size from all other PCR products by up to a maximum of 3000 nucleotides.
In a preferred embodiment, multiple biological samples are screened simultaneously by subjecting DNA from multiple samples to PCR conditions using simultaneously multiple pairs of primers. Generally each of the samples is from a different (typically human) individual. Typically 2 to 80, such as 5 to 40 samples are screened simultaneously in the method.
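The size-based way of telling PCR products apart, described above, can be checked for a proposed group of primer pairs with a few lines of Python. The function name is illustrative; the default gap of 20 nucleotides is the minimum difference given in the text.

```python
from itertools import combinations


def sizes_resolvable(product_sizes, min_gap=20):
    """True if every pair of expected product sizes differs by at least min_gap."""
    return all(abs(a - b) >= min_gap
               for a, b in combinations(product_sizes, 2))
```

A group with expected products of 200, 260 and 400 nucleotides passes this check, whereas 200 and 210 nucleotides would not.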
In one embodiment, DNA from multiple samples is mixed together before being subject to PCR conditions. Typically 2 to 10 such as 5 to 8 samples are pooled together in this way.
After the DNA has been subject to PCR conditions any PCR product which is obtained may be sequenced. Typically prior to sequencing the PCR product is gel purified and cloned into a vector, for example a plasmid or a bacteriophage vector. Suitable plasmids are known and commercially available, such as pBluescript™ (Stratagene) and pGEM-T-Easy™ (Promega). Suitable bacteriophage include bacteriophage λ and M13. Alternatively the sequencing reaction may be carried out on the PCR product itself, for example using one of the PCR primers as a sequencing primer.
Preferably an automated sequencer is used to obtain the sequence of the PCR product, such as a Beckman Coulter CEQ2000™ or Applied Biosystems A3700™ DNA analyser.
Designing the primers
Each of the primer pairs used in the method of the invention binds a sequence conserved across members of a virus family and selectively directs amplification of sequence from the members of the family. The multiple primer pairs which are used are typically designed by:
(i) providing a plurality of amino acid sequences from members of a first virus family,
(ii) comparing the sequences to identify conserved regions,
(iii) designing a first primer pair using a computer based method, wherein each primer in the pair binds a nucleotide sequence that encodes a conserved region identified in (ii) and wherein the primer pair is designed to amplify by PCR the nucleotide sequence between the nucleotide sequences that encode conserved regions in members of the first virus family, and
(iv) repeating steps (i) to (iii) for each virus family.
The multiple primer pairs may also be designed by:
(i) providing a plurality of nucleotide sequences from members of a first virus family,
(ii) comparing the sequences to identify conserved regions,
(iii) designing a first primer pair using a computer based method, wherein each primer in the pair binds a conserved region identified in (ii) and wherein the primer pair is designed to amplify by PCR the nucleotide sequence between the conserved regions in members of the first virus family, and
(iv) repeating steps (i) to (iii) for each virus family.
The multiple pairs of primers are capable of detecting unknown viruses in a sample, wherein such a sample originates from a single individual or is a pooled sample from individuals of the same species. Thus the panel of primers detects viruses which infect the same species.
The number of primers designed by the above steps is typically the same as the numbers of primers mentioned above for use in the method of the invention. The primer pairs which are designed bind sequence which is conserved across members of a virus family. The panel of primer pairs which is designed may comprise primer pairs that bind sequence which is conserved across substantially all members of the family or across a subset of the members of the family, for example across all members of a subfamily or of a genus. Generally, the primer pairs bind at least 70%, at least 80%, or at least 90% of the known viruses of the family, subfamily or genus. Preferably less than 10, such as less than 5, primer pairs will be used for the detection of any given family, subfamily or genus in the panel.
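The coverage figures above (at least 70%, 80% or 90% of known viruses) can be estimated for a degenerate primer by counting how many known target sites it matches. The sketch below uses the standard IUPAC nucleotide ambiguity codes and, as a simplification, compares the primer with the same-strand target site at exact length with no mismatches allowed. The function names and example sequences are hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(primer, site):
    """True if every base of the site is allowed by the degenerate primer."""
    return len(primer) == len(site) and all(
        base in IUPAC[code] for code, base in zip(primer, site))


def coverage(primer, sites):
    """Fraction of target sites (one per known virus) matched by the primer."""
    return sum(matches(primer, s) for s in sites) / len(sites)
```

A primer `ATGGAYTGG` tested against four sites of which three carry C or T at the degenerate position would give a coverage of 0.75, which satisfies the 70% criterion but not the 80% one.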
The panel of primer pairs is generally capable of detecting viruses from at least 10, 15, 20, 30 or more families, typically up to a maximum of 35 families.
The panel of primer pairs may comprise sets of primer pairs which perform a nested PCR reaction. Generally such a set of primer pairs comprises a first and second primer pair. The first primer pair is able to amplify a template nucleotide sequence from a virus to form a PCR product. The second primer pair is able to amplify a nucleotide sequence using the PCR product generated by the first primer pair as a template. The use of nested sets of primer pairs allows increased sensitivity. In a preferred embodiment each primer pair is specific for a particular virus family, so that it does not detect viruses of other families.
The plurality of amino acid or nucleotide sequences are provided from different known viruses of the same family. The sequences will be for the same protein of the different viruses. Typically at least 5, 10, 20, 50, 100 or more sequences are provided. The maximum number of sequences provided will, for example, be 300 sequences.
Each of the sequences which is provided is typically at least 20, 50, 100, 200 or more amino acids or nucleotides in length. In general the maximum length of the nucleotide sequences is 1000 nucleotides and the maximum length of the amino acid sequences is 300 amino acids. The sequences may be obtained from a database of sequences, such as GenBank. The sequences may be obtained from a database comprising virus sequences which are organised into homologous protein families (based on sequence similarity relationships). In a preferred embodiment the sequences are obtained from the VIDA database (described in Alba et al (2001) Nucleic Acids Research 29, 133-136) or the Virus Division of GenBank. The sequences may be provided in the form of a database, preferably in computer-readable form. The sequences are preferably provided in the form of a computer-readable database constructed using programs which identify homologous protein families, such as GeneTableMaker, MKDOM or PSCBuilder.
The sequences which have been provided are compared to identify conserved regions. Typically such conserved regions will have a length of at least 12 nucleotides, such as at least 15, 21, 27, 36, 99 or more nucleotides (generally up to a maximum length of 200 nucleotides) or at least 4, 5, 7, 10, 25 or more amino acids (generally up to a maximum length of 50 amino acids). Across the conserved region the virus sequences which are being provided will of course share identity or similarity. Typically the amino acids or nucleotides in at least 50% of the positions in the region will be the same in at least 50%, 60%, 70% or 80% of the viruses of the group (i.e. in the family, genus or subfamily).
The algorithm which identifies conserved regions generally uses a multiple sequence alignment method. The method may comprise (a) aligning all pairs of sequences separately to calculate a distance matrix giving the divergence of each pair of sequences, (b) calculating a guide tree from the distance matrix, and (c) aligning the sequences progressively according to the branching order in the guide tree. A preferred algorithm for aligning the conserved sequences is CLUSTALW as described in Thompson et al (1994) Nucleic Acids Research 22, 4673-80. Other algorithms that can be used for aligning sequences are MultAlin (Corpet (1988) Nucleic Acids Research 16, 10881-90) or Jalview (Clamp et al (1998) http://barton.ebi.co.uk). BLOCKS of conserved regions of amino acids may be extracted from the multiple alignments, typically using the program Blocks Multiple Alignment Processor. Alternatively the entire process of performing multiple alignments and extracting BLOCKS can be performed using BLOCKMAKER (Henikoff and Henikoff (1994) Genomics 19, 97-107).
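To illustrate how conserved regions might be read off a finished multiple alignment, the following sketch scores each alignment column by the fraction of sequences sharing the most common residue and reports ungapped runs of well-conserved columns. It is a simplification and not the BLOCKS method: it requires every column in a run to meet the threshold, which is stricter than the criterion stated above, and the function names and default thresholds are assumptions.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences sharing the most common character, per column."""
    n = len(alignment)
    return [Counter(col).most_common(1)[0][1] / n for col in zip(*alignment)]


def conserved_runs(alignment, min_fraction=0.8, min_len=4):
    """Return half-open (start, end) column ranges that are gap-free and conserved."""
    scores = column_conservation(alignment)
    gap_free = ["-" not in col for col in zip(*alignment)]
    runs, start = [], None
    for i, (score, ok) in enumerate(zip(scores, gap_free)):
        if ok and score >= min_fraction:
            if start is None:
                start = i  # open a new run at this column
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(scores) - start >= min_len:
        runs.append((start, len(scores)))
    return runs
```

The input is a list of equal-length aligned strings with `-` marking gaps, such as the rows of a CLUSTALW alignment.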
The output from the alignment and BLOCK extraction set (i.e. the information describing the identified conserved regions) is then entered into the algorithm which designs the primers. Such output is typically in the form of partial sequences which correspond to the conserved regions (BLOCKS). These BLOCKS are input into a primer design algorithm. In one embodiment such an algorithm is CODEHOP. In the primer design step the conserved regions which are chosen as targets for primers preferably comprise few codons with degenerate counterparts, i.e. preferably the sequence has a low redundancy, such as a redundancy of less than 512 fold, 256 fold or 128 fold. Each primer binds in accordance with Watson-Crick base pairing and thus the binding is sequence specific. Each primer will thus be designed to be wholly or partially complementary to the sequence to which it binds.
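The redundancy of a conserved amino acid stretch can be estimated as the product of the number of codons available for each residue. The sketch below assumes the standard genetic code; the function names are illustrative.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def redundancy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide)


def within_limit(peptide, max_fold=128):
    """True if the redundancy is below the chosen fold limit."""
    return redundancy(peptide) < max_fold
```

A stretch such as MKWDY has a redundancy of only 8-fold and would be a favourable primer target, whereas LRS already reaches 216-fold over three residues.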
Each of the primers typically has a length of at least 8 nucleotides, such as at least 10, 12, 15, 20, 30, 40 or more nucleotides (up to a maximum of 50 nucleotides for example). In one embodiment the primer may comprise at least 2, 4 or 6, up to a maximum of 10 for example, inosine bases. Inosine is able to bind to any of the four nucleotides and therefore use of inosine causes a reduction in effective redundancy. Each primer pair will be designed so that the PCR product generally has a length of at least 20, such as at least 50, 100, 200, 500, 1000 or more nucleotides (and typically up to a maximum of 5 x 10^3 nucleotides long).
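The effect of inosine on redundancy can be shown by counting the distinct oligonucleotides represented by a degenerate primer before and after fourfold-degenerate positions are replaced with inosine. In this sketch the letter `I` for inosine is a convention adopted for illustration (it is not an IUPAC ambiguity code), and the function names and example primer are hypothetical.

```python
from math import prod

# Number of bases represented by each IUPAC code; inosine ("I") is a single
# synthesised base, so it contributes a factor of 1.
CODE_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3, "N": 4, "I": 1,
}


def primer_degeneracy(primer):
    """Number of distinct oligonucleotides in the degenerate primer mixture."""
    return prod(CODE_SIZE[code] for code in primer)


def substitute_inosine(primer):
    """Replace every fourfold-degenerate position (N) with inosine (I)."""
    return primer.replace("N", "I")
```

For the example primer `GGNTAYTGGN`, the degeneracy falls from 32-fold to 2-fold once both N positions are replaced by inosine.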
Each primer is preferably designed so that it anneals to a single site, i.e. the primer will not bind to any other site in the genome of the relevant viruses. Each primer is preferably designed so that it does not exhibit secondary structure, i.e. the nucleotides in the primer will not bind substantially to any other nucleotide in the primer apart from those to which it is covalently linked. In addition preferably each primer is designed so that it does not bind other primers with the same sequence. In one embodiment the 3' region, and preferably the 3' terminal nucleotide of the primer binds to the target sequence with high affinity, thus preferably this region or nucleotide comprises a G or C.
Generally each primer is designed to have an annealing temperature of from 30 to 65 °C, such as 50 to 60°C or 35 to 45°C. In addition each primer pair may be designed to ensure that the two primers do not bind to each other.
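Some of the primer properties discussed above lend themselves to simple checks. The sketch below tests for a G or C at the 3' terminus, tests whether a primer is perfectly self-complementary (and so could bind other copies of itself), and estimates a melting temperature with the Wallace rule, Tm = 2(A+T) + 4(G+C). The Wallace rule is a common rule of thumb for short oligonucleotides and is an assumption here, not the calculation used in the description; annealing temperatures are usually set somewhat below the estimated melting temperature.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a non-degenerate DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def has_3prime_gc(primer):
    """True if the 3' terminal nucleotide is G or C."""
    return primer[-1] in "GC"


def is_self_complementary(primer):
    """True if the primer equals its own reverse complement."""
    return primer == reverse_complement(primer)


def wallace_tm(primer):
    """Rule-of-thumb melting temperature in degrees C (Wallace rule)."""
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc
```

These checks cover only non-degenerate sequences; a degenerate primer would need each member of the mixture, or a representative consensus, to be evaluated.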
The primers are designed by a computer based algorithm. In one embodiment such an algorithm designs primers according to the following rules:
1) A set of blocks is input, where a block is an aligned array of amino acid sequence segments without gaps that represents a highly conserved region of homologous proteins. A weight is provided for each sequence segment, which can be increased to favour the contribution of selected sequences in designing the primer. A codon usage table is chosen for the target genome.
2) An amino acid position-specific scoring matrix (PSSM) is computed for each block using the odds ratio method.
3) A consensus amino acid residue is selected for each position of the block as the highest scoring amino acid in the matrix.
4) For each position of the block, the most common codon corresponding to the amino acid chosen in step 3 is selected utilizing the user-selected codon usage table. This selection is used for the default 5' consensus clamp in step 8.
5) A DNA PSSM is calculated from the amino acid matrix (step 2) and the codon usage table. The DNA matrix has three positions for each position of the amino acid matrix. The score for each amino acid is divided among its codons in proportion to their relative weights from the codon usage table, and the scores for each of the four different nucleotides are combined in each DNA matrix position. Nucleotide positions are treated independently when the scores are combined. As an option, the highest scoring nucleotide residue from each position can replace the most common codons from step 4 that are used in the consensus clamp.
6) The degeneracy is determined at each position of the DNA matrix based on the number of bases found there. As an option, a weight threshold can be specified such that bases that contribute less than a minimum weight are ignored in determining degeneracy.
7) Possible degenerate core regions are identified by scanning the DNA matrix in the 3' to 5' direction. A core region must start on an invariant 3' nucleotide position, have length of 11 or 12 positions ending on a codon boundary, and have a maximum degeneracy of 128 (this is the default setting of CODEHOP). The degeneracy of a region is the product of the number of possible bases in each position.
8) Candidate degenerate core regions are extended by addition of a 5' consensus clamp from step 4 or 5. The length of the clamp is controlled by a melting point temperature calculation (the CODEHOP default is 60 °C) and is usually about 20 nucleotides.
9) Steps 7 and 8 are repeated on the reverse complement of the DNA matrix from step 5 for primers corresponding to the opposite DNA strand.
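Steps 6 and 7 may be illustrated by the following minimal sketch, which is not the CODEHOP implementation. It assumes that the DNA matrix has already been reduced to the set of bases retained at each position (written 5' to 3', position 0 being the first base of a codon), and it interprets "ending on a codon boundary" as meaning that the 5' end of the core coincides with the start of a codon. The function name and the toy matrix are hypothetical.

```python
from math import prod


def find_core_regions(base_sets, max_degeneracy=128, lengths=(11, 12)):
    """Scan the matrix 3'->5' for candidate degenerate core regions.

    base_sets[i] is a string of the bases allowed at position i.
    Returns (start, end_exclusive, degeneracy) tuples.
    """
    cores = []
    for three_prime in range(len(base_sets) - 1, -1, -1):
        if len(base_sets[three_prime]) != 1:      # 3' position must be invariant
            continue
        for length in lengths:
            start = three_prime - length + 1
            if start < 0 or start % 3 != 0:       # 5' end must sit on a codon boundary
                continue
            region = base_sets[start:three_prime + 1]
            degeneracy = prod(len(bases) for bases in region)
            if degeneracy <= max_degeneracy:
                cores.append((start, three_prime + 1, degeneracy))
    return cores


# Toy 14-position matrix; each string lists the bases retained at that position.
toy_matrix = ["G", "A", "CT", "A", "C", "ACGT", "T",
              "G", "G", "G", "A", "AG", "T", "A"]
for start, end, degeneracy in find_core_regions(toy_matrix):
    print(start, end, degeneracy)
```

Each candidate returned in this way would then be extended at its 5' end by the consensus clamp of step 8.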
In one embodiment CODEHOP (Rose et al (1998) Nucleic Acids Research 26, 1628-1635) is used to design the primer pairs. This program uses the above rules.
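The selection of consensus residues and codons for the 5' clamp (steps 3 and 4 above) may be sketched, in much simplified form, as follows. The sketch replaces the odds ratio PSSM and the sequence weights with plain residue counts, and the codon table is a hypothetical fragment covering only the residues in the example; neither is taken from CODEHOP.

```python
from collections import Counter

# Hypothetical "most common codon" table; a real table depends on the target genome.
PREFERRED_CODON = {"M": "ATG", "F": "TTC", "G": "GGC", "L": "CTG", "K": "AAG"}


def consensus_residues(block):
    """Most frequent residue in each column of an ungapped block of equal-length segments."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*block))


def consensus_clamp(block, codon_table):
    """Back-translate the consensus residues using the preferred codon for each."""
    return "".join(codon_table[residue] for residue in consensus_residues(block))


toy_block = ["MFGL", "MFGK", "MFGL"]   # toy block of three aligned segments
print(consensus_residues(toy_block))
print(consensus_clamp(toy_block, PREFERRED_CODON))
```

In the weighted scheme described above, a sequence given a higher weight would contribute more to the choice at each column than a simple count allows.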
The primers designed by the algorithm may then be mapped back to the original sequence to choose primer pairs which provide the desired length of PCR product.
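As a hedged example of this mapping step, the sketch below pairs forward and reverse primers according to the length of product they would give. It assumes that each primer has already been mapped to a nucleotide coordinate on the original sequence (the first base of the product for a forward primer, the last base for a reverse primer); the names, coordinates and length limits are hypothetical.

```python
def choose_primer_pairs(forward, reverse, min_product=200, max_product=1000):
    """Return (forward_name, reverse_name, product_length) for every pair whose
    product length lies within the desired range, shortest product first."""
    pairs = []
    for f_name, f_start in forward.items():
        for r_name, r_end in reverse.items():
            product = r_end - f_start + 1
            if min_product <= product <= max_product:
                pairs.append((f_name, r_name, product))
    return sorted(pairs, key=lambda pair: pair[2])


forward_sites = {"F1": 100, "F2": 700}     # hypothetical mapped positions
reverse_sites = {"R1": 450, "R2": 1500}
for pair in choose_primer_pairs(forward_sites, reverse_sites):
    print(pair)
```

Pairs in which the reverse primer lies upstream of the forward primer give a negative length and are excluded by the same range test.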
The above-described computer based method is repeated until the desired number of primer pairs have been designed. Optionally the primer pairs can then be synthesised and tested. They are typically tested to determine the optimal conditions for using the primers in a PCR reaction.
In one embodiment the primers are tested for their ability to amplify one or more of the plurality of nucleotide sequences from known viruses which were used to design the primers. Where amino acid sequences from known viruses were used to design the primers, the primers may instead be tested for their ability to amplify the nucleotide sequence from the virus which encodes the amino acid sequence.
The primers may be tested in a range of buffer conditions to determine optimal buffer conditions for PCR using the primers. The buffer conditions which may be tested include pH (typically between 7 and 10), magnesium concentration (typically from 0.5 mM to 5 mM), potassium chloride (typically from 0 to 100 mM), ammonium chloride (typically 0 to 100 mM), glycerol (typically 0 to 20%), dimethylsulphoxide (typically 0 to 20%), ethanol (typically 0 to 20%), sorbitol (typically 0 to 20%) or betaine (typically 1 M betaine).
The primers may be tested at a range of different temperatures to determine the optimal temperatures in the PCR reaction. Preferably the primers are tested in PCR reactions in which a range of primer annealing temperatures are tested. Typically the range of temperatures is from 30 to 65 °C.
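One way of laying out such an optimisation experiment is sketched below, for illustration only. The particular values are hypothetical points chosen within the ranges stated above, and only four of the listed variables are included to keep the grid small.

```python
from itertools import product

# Hypothetical test points within the ranges given in the text.
TEST_RANGES = {
    "pH": [7.0, 8.0, 9.0, 10.0],
    "magnesium_mM": [0.5, 1.5, 3.0, 5.0],
    "KCl_mM": [0, 50, 100],
    "annealing_C": [30, 40, 50, 60, 65],
}


def condition_grid(ranges):
    """Return one dict per combination of the listed test points."""
    names = list(ranges)
    return [dict(zip(names, values))
            for values in product(*(ranges[name] for name in names))]


grid = condition_grid(TEST_RANGES)
print(len(grid))    # 4 * 4 * 3 * 5 = 240 reactions
print(grid[0])
```

In practice a full factorial grid of this size would usually be reduced, for example by fixing the buffer first and then varying the annealing temperature.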
The panel of primer pairs or a group of primers within the panel may be designed to be used together on the same plate (i.e. using the same thermal cycles). Thus such primer pairs will be designed to work at the same annealing temperature.
In one embodiment a group of primer pairs within the panel are designed to have similar optimal conditions for use in PCR so that they can be used optimally in the same well or reaction vessel, i.e. that they can be used in multiplex PCR. Such a group typically comprises at least 2, 3, 4, 5, 6 or more primer pairs (up to a maximum of 8 primer pairs for example).
To provide such primer pairs the computer based method steps may be used to design primer pairs which are calculated to have similar annealing temperatures and/or the primers are tested to select primer pairs which can be used optimally together. Such testing typically determines whether the primers work optimally with the same buffers and/or whether the primers have similar annealing temperatures.
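The selection of primer pairs that can share a well may be sketched as a simple grouping by annealing temperature. This is an illustrative greedy scheme only; the 2 °C tolerance, the primer pair names and the temperatures are assumptions, and the limit of 8 pairs per group follows the example maximum given above.

```python
def group_for_multiplex(pair_temps, tolerance=2.0, max_group=8):
    """Group primer pairs whose annealing temperatures lie within `tolerance`
    deg C of the coolest member of the group, up to `max_group` pairs per group."""
    groups = []
    for name, temp in sorted(pair_temps.items(), key=lambda item: item[1]):
        current = groups[-1] if groups else None
        if current and len(current) < max_group and temp - current[0][1] <= tolerance:
            current.append((name, temp))
        else:
            groups.append([(name, temp)])
    return [[name for name, _ in group] for group in groups]


# Hypothetical optimal annealing temperatures (deg C) for five primer pairs.
temps = {"P1": 50.0, "P2": 51.5, "P3": 55.0, "P4": 56.0, "P5": 60.0}
print(group_for_multiplex(temps))
```

Buffer compatibility would still need to be confirmed by testing, since the grouping considers temperature alone.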
Validating a PCR product as being from a novel virus
After sequencing the PCR product(s), the next step is to determine whether each sequence is present in at least one database of known nucleic acid sequences, typically sequences of viruses known to infect the individuals from which the samples are derived. Appropriate databases include the virus subdivision of GenBank or the VIDA database.
In addition each sequence is typically also compared with a database of human sequences to exclude sequences which are human sequences. Such a database is generally a comprehensive or consensus human genome database. Preferably, at least one of the human sequence databases searched contains an essentially complete human genome sequence. However, it needs to be borne in mind that, although there has recently been a great deal of publicity about the "completion" of the human genome sequence, not all the human genome has in fact been sequenced, and it is possible that a cloned sequence could fall within the unsequenced part of the genome. The human genome contains large areas with repetitive sequences, and much of the unsequenced genome is within these areas.
In order to make as comprehensive a search as possible, it is desirable to search a range of different types of database; in addition to a human genome database, it is desirable to search, for example, a database comprising expressed sequence tags (ESTs) and a database comprising repetitive elements of the human genome. Appropriate databases include GenBank, the EMBL database, the Celera human genome database, the Ensembl human genome database, the DNA Data Bank of Japan (DDBJ), the Incyte LifeSeq™ database of ESTs and the Repbase database of repetitive elements in the human genome. Where the sequence is found to be not present in any of the interrogated databases of known sequences, this indicates that the nucleic acid may be from a previously unknown virus. The nucleic acid then becomes a candidate for further investigation and may be designated a Primary Candidate Virus (PCV).
It is generally necessary to confirm by experimentation that a nucleic acid sequence designated a PCV is not in fact a human sequence. A preferred way of doing this involves designing and synthesising a specific primer set (or sets) to amplify the nucleic acid designated a PCV and determining whether the set(s) are able to amplify any DNA in a sample of complete genomic human DNA. The amplification conditions for each primer set may be optimised using the original sample from which the PCV derives or using the PCR product which is obtained in the method of the invention.
The primer set may be used to screen one or more samples of human genomic DNA, for example from 1 to 100 samples, preferably from 5 to 50 samples. As an alternative to PCR, human genomic DNA may be probed with a labelled probe containing sequence from the original PCR product (e.g. by Southern blotting). If the PCV cannot be detected in human DNA by experimentation (by PCR or hybridisation with a labelled probe), it may then be subjected to further analysis. It may be designated a Secondary Candidate Virus (SCV).
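The sequence of decisions leading from a sequenced PCR product to a PCV or SCV designation may be summarised by the following sketch. How a database "match" is scored is not specified here, so the sketch simply assumes a yes/no result per database type; the dictionary keys and function name are hypothetical.

```python
from typing import Optional

# Hypothetical labels for the kinds of human-derived database searched.
HUMAN_DATABASES = ("human_genome", "est", "repeats")


def triage(db_hits: dict, detected_in_human_dna: Optional[bool] = None) -> str:
    """Classify a sequenced PCR product.

    db_hits maps a database label to True if a match was found there.
    detected_in_human_dna is the experimental result (PCR or probe) on human
    genomic DNA, or None if that experiment has not yet been done.
    """
    if db_hits.get("known_virus", False):
        return "known virus"
    if any(db_hits.get(name, False) for name in HUMAN_DATABASES):
        return "human sequence"
    if detected_in_human_dna is None:
        return "PCV"                 # candidate on database evidence only
    if detected_in_human_dna:
        return "human sequence"      # unsequenced part of the human genome
    return "SCV"


print(triage({"known_virus": False, "human_genome": False}))
print(triage({"known_virus": False}, detected_in_human_dna=False))
```

The experimental step is kept separate from the database step because, as noted above, a sequence absent from every database may still fall within an unsequenced region of the human genome.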
The further analysis of an SCV may include gene walking to determine whether the original cloned nucleic acid sequence exists in nature as part of a longer sequence, such as the genomic sequence of an unknown virus. Gene walking may be carried out using techniques known in the art, such as vectorette PCR (Allen et al, PCR Methods Appl. 4:71-75), rapid amplification of cDNA ends (RACE, Frohman et al Proc Natl Acad Sci U S A. 85:8998-9002), rapid amplification of genomic ends (RAGE, Cormack and Somssich. 1997. Gene. 194:273-276) and methods derived from these. Alternatively, the SCV sequence may be "extended" by screening a DNA or cDNA library using the original cloned nucleic acid sequence as a probe.
The additional sequence information obtained through DNA walking may reveal information about the identity of the SCV which cannot be determined from the original clone. The additional information may therefore be analysed, for example to determine whether it contains an open reading frame (i.e. a sequence encoding a protein); the presence of an open reading frame provides further support for the suggestion that the SCV is a virus. Furthermore, the additional information may identify the SCV as being related to a known virus; for example, the information may identify the SCV as being a new member of a known family of viruses.
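A simple way of looking for an open reading frame in the extended sequence is sketched below. Because the sequence may be a fragment lacking the true start codon, the sketch assumes that an "open" frame is any run of complete codons free of stop codons, searched in all six reading frames; it does not require an ATG, and what length counts as meaningful is left to the user. The function name and example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def longest_open_frame(seq: str) -> str:
    """Return the longest stop-free run of complete codons in any of the six frames."""
    seq = seq.upper()
    best = ""
    for strand in (seq, seq.translate(COMPLEMENT)[::-1]):   # sense, then reverse complement
        for frame in range(3):
            run_start = frame
            for i in range(frame, len(strand) - 2, 3):
                if strand[i:i + 3] in STOP_CODONS:
                    if i - run_start > len(best):
                        best = strand[run_start:i]
                    run_start = i + 3
            # Close the final run at the last complete codon of this frame.
            end = frame + ((len(strand) - frame) // 3) * 3
            if end - run_start > len(best):
                best = strand[run_start:end]
    return best


orf = longest_open_frame("ATGAAATAGCCC")   # toy sequence
print(orf, len(orf) // 3, "codons")
```

For a real candidate the returned stretch would normally be translated and compared with known viral proteins before any conclusion is drawn.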
A further step may then be to determine whether a newly-identified candidate virus is associated with a disease, for example with a cancer, autoimmune disease, cardiovascular disease or other disease mentioned above. This may be done by obtaining a specimen from each member of a group of subjects with a disease; determining whether the cloned nucleic acid or other nucleic acid of the same virus is present in each specimen; and determining whether the proportion of subjects in whom the nucleic acid is present is greater in the group of subjects who have the disease than in a control group of subjects who do not have the disease, wherein a said greater proportion suggests that the virus may cause or contribute to the disease.
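The comparison of proportions may, for example, be assessed with a one-sided Fisher's exact test. The choice of this test is an assumption made for illustration; no particular statistical method is prescribed here. The counts in the example are hypothetical.

```python
from math import comb


def fisher_one_sided(pos_cases: int, n_cases: int,
                     pos_controls: int, n_controls: int) -> float:
    """Probability of seeing at least `pos_cases` positives in the disease group
    if positives were distributed between the groups by chance alone
    (upper tail of the hypergeometric distribution)."""
    total = n_cases + n_controls
    total_pos = pos_cases + pos_controls
    denominator = comb(total, n_cases)
    p_value = 0.0
    for k in range(pos_cases, min(n_cases, total_pos) + 1):
        p_value += comb(total_pos, k) * comb(total - total_pos, n_cases - k) / denominator
    return p_value


# Hypothetical study: 12 of 30 patients positive versus 2 of 30 controls.
p = fisher_one_sided(12, 30, 2, 30)
print(f"one-sided p = {p:.4g}")
```

A small p-value would only indicate an association; as stated above, it suggests rather than proves that the virus may cause or contribute to the disease.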
Typically, the process of determining whether the nucleic acid is present in or absent from a specimen from a subject may be carried out by PCR using primers specific for the novel sequence (including any contiguous sequence obtained by DNA walking). Initially, perhaps from 10 to 50 patients from a disease group may be tested, but if positive results are obtained in initial studies, the investigation may be extended to a larger group (e.g. a group of up to 100, 500, 1000 or 10,000). The nature of the biological specimens taken from the members of the group varies depending on the disease association that is being investigated; where possible specimens are from disease affected tissue and from peripheral blood of the subjects (for a published example of this see Griffiths et al, 1999, Arthritis Rheumatism, 42:448-454). The specimens may be from the same tissue and fluid types as the biological samples used in the initial screening assay described above.
Once a new virus has been identified and found to be positively associated with a particular disease or condition, serological and genetically-based diagnostic assays for infection by the virus may readily be devised. Genetically-based assays can be developed by using the nucleotide sequence of the virus to design probes and/or PCR primers for specifically detecting the nucleic acid of the virus. Serological assays can be developed by producing recombinant proteins or protein fragments encoded by the virus and testing for the presence of antibodies to these proteins in human sera. Alternatively, antibodies specific for the proteins of the virus may be made and the antibodies used to detect the virus directly. The serological assays may take the format of an ELISA, western blot or immunofluorescence assay. Correlations may be sought between serological data and genetic data. Furthermore, the organism provides a target for the development of therapies and/or prophylactic vaccines against the disease.
The following Example illustrates the invention:
Example
The Example below refers to Figure 1, which shows how primers were designed using a database known as 'VIDA' and computer programs known as 'CLUSTALW', 'BLOCKS' and 'CODEHOP'.
Designing a panel of primers
A panel of primers was designed for detecting unknown viruses from the family Herpesviridae according to the strategy shown in Figure 1. The amino acid sequences of herpes virus DNA packaging protein UL15 were obtained from the VIDA database (Alba et al, see above). These sequences are shown in Table 1.
The sequences obtained from the VIDA database were then imported into CLUSTALW. This compares the protein sequences to identify conserved regions and then aligns the sequences according to the conserved regions. The alignment produced by CLUSTALW is shown in Table 2.
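The kind of region that is carried forward from an alignment may be illustrated by the sketch below, which merely locates runs of alignment columns containing no gap in any sequence. This is a simplification and is not the BLOCKS method, which also takes conservation into account; the function name, the minimum length and the toy alignment are hypothetical.

```python
def ungapped_blocks(alignment, min_length=4):
    """Return (start, end_exclusive) column ranges in which no sequence has a gap.

    `alignment` is a list of equal-length aligned sequences using '-' for gaps.
    """
    n_cols = len(alignment[0])
    blocks = []
    start = None
    for col in range(n_cols + 1):            # one extra step closes a trailing run
        gap_free = col < n_cols and all(seq[col] != "-" for seq in alignment)
        if gap_free and start is None:
            start = col
        elif not gap_free and start is not None:
            if col - start >= min_length:
                blocks.append((start, col))
            start = None
    return blocks


toy_alignment = [
    "MFGG--LLGEETK",
    "MFGQQ-LASDVQQ",
    "MFG-K-ALSRETI",
]
for start, end in ungapped_blocks(toy_alignment):
    print(start, end, [seq[start:end] for seq in toy_alignment])
```

Each range returned corresponds to an array of aligned segments without gaps, which is the form of input required by step 1 of the primer design rules.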
The BLOCKS program was then used to extract the sequences of the conserved regions identified by CLUSTALW, and to enter these sequences into CODEHOP. The primer sequences were then designed by CODEHOP using the conserved sequences. The output from the CODEHOP program is shown in Table 3.
Table 1. All protein sequences of DNA packaging protein UL15 extracted from VIDA. Here written as a list and unaligned.
>gi_10180719
MFGGL GEETKRHFERLMKTKNDRLGASHRNERS IIUJGD VDAPFLNFAIPVPRRHQTVMPAIGI HNCC
DSLGIYSAITTRM YSS IACSEFDELREDSVPRCYPEITNAQAF S PMMMRVANS I IFQEYDEMECAAHR
NAYYSTrøSFISrøTSDAFKQ TVFISRFSK LIASFRDVNKI-DD^^
KMIIJfflATYFVTSVIJ_GDHAERAERI*l*RVAFDTPHFSDIV^
MSSFEGIRIGYTSHIRKAIEPVFEDIGDR RR FGAHRVDHV GETITFSFPSGL STvTFASSHNTNSI RGQDFNLLFVDEANFIRPDAVQTI IGFLNQATCKI IFVSSTNSGKASTSFLYG KGSADDLLNVVTYICaD
EHMKH TDYTN SCSC IJ-^P FI MDGA 1RRTAEMFLPDS MQEIIGGGV D ICQGDRSIFTASA
IDRFLIYRPSTVHNQDPFSQDLYVYVDPAFTAiOTKASGTGVAVIG YG DYIVFGIi3HYFLRA TGESSD
SIGYCVAQO.IQICAIHRKRFGVIKIAIEGNSNQDSAVAIATRIAIEMISYM AAVAPTPHNVSFYHSKS
NGTOVEYPYFTΛQRQKTTAFDFFIAQΗTSGRVTjASQDLVSTOTS^ RTFSGKKGGNDDTWA TMAVYISAHIPDMAFAPIRV
>gi_7673189
MFGGLLGEET RHFERLMKTKNDR GASHRNERS IRDGDMVDAPFIINFAI PVPRRHQTVMPAIGILHNCC DS GIYSAITTRMLYSSIACSEFDE RRDSVPRCYPRITNAQAF SP MMRVANSIIFQEYDEMECAAHR NAYYSTMNSFISMRTSDAFKQLTVFISRFSK LIASFRDTΛ^ IMIFDAC3IJCI^CFTWRSRRASERLIRVAFDTPHFSDIV RHFRQRKTVFLVPRRHGKT FLVPLIAIA MSSFEGIRIGYISHIRKAIEPVFΈDIGDRLRRWFGAHRVDHVKGETITFSFPSGLKSTVTFASSHNTNSI RGQDFN LFVΓEA FIRPDAVQTI IGFUSTQATCKI IFVSSTWSGKASTSFLYG KGSADDLLNVVTYICΓJ EHMKHVTΩY ATSCSCYVLNKPVFITIYRDGAMRRTA.LMFLPDSFMQEIIGGGV^ IDRFLIYRPSTVITOQDPFSQDLYVYVDPAFTAOT' ASGTGVAVIGKYGTDYIWGLEHYFLRAT.TGESSG S IGYCVAQCLIQICAIHRKRFGVI IAIEGNSNQDSAVAIA RIAIEMISYMKAAVAPTPHNVSFYHSKS NGTDVEYPYFLLQRQKTTAFDFFIAQFNSGRVI-ASQDLVSTTVSLTTDPVEYLT QLTNISEVVTGPTCT RTFSGK GGNDD WALTMAVYISAHIPDMAFAPIRV
>gi_5689285
MFGGALGESAK HFERLLRDRTffiRLGASRKiraCTjARGGSL^ DGTGIYSAIATRLLYAGIVSSEFGEVRRES SNGHISKRiraEAIiIiAPTL RVANS ITFHEYDDAQCAAHR
NAYYSTrøroOS]yKTSDAFQQLASFIDRFSKLTJAAS
KMIIJfflATYFLTSVI^DHAERAERTJ^VIFDI^^
MSSFEGIRIGYTSHIRK&IEPVFEEIGDR RR FGTQCVDHVKGETITFSFPSGSRS V FASSHIirrNSI
RGQDF-*T LFVDEAOTIRPDAVQTIIGFTJNQANCKIIFV^^ EHMKHVrmrTNATSCSCYVIJrePVFITMDGAMRRTAEMFLPDSFM EIIGGIT^
VERLIiYRPS VRNQDI SRDLYVYVDPAFTAlSriTtASGTGIAVIGRYGADYIIFGLEHFFLRA TGESAD
AIGECAAQCIAQICAIHCERFGTIRVAVEGNSNQDSAVAIATRISIDLASYVQSGVAPAPHDVCFYHSKP
AGSIWEYPFFIiQRQKTAAFDFFIARF SGRVTJiSQD VSTTISLSTDPVEY TKQL ISrLSEVVTGATGT
RTFSGKKGGYDDTWALV AVYISAHASDATFAPIRGVEATCRGP EA >gi_1869837
MFGQQLASDVQQY ERLEKQRQQ VGVDEASAGLIIGGDA RVPFI^FATATPKRHQTVVPGVG LHDCC
EHSP FSAVARRLLFNSLVPAQI.RGRDFGGDHTA T^FIjAPE VRAVAR RFRECaPEDAV'PQRimYYSV
TjraFQAL-ffiSEAFRQLVHFVRDFAQl^KTSFRASSLAETTG^
HATYF]^AAVL GDHAEQVOT'FLRLVFEIPLFSDTAVRHFRQRATVFLVPRRHGK FLVPLIALSLASFR GIKIGYTAHIRKATEPVFDEIDACLRG FGSSRVDHVKGETISFSFPDGSRSTIVFASSHNTNGIRGQDF NLLFVDEANFIRPDAVQTIMGFI-NQANCKIIFVSSTOTGKA≤TSF^
VVTHTNATACSCYIL.-r PVFIΗTOGAVRRTADI^
PSTTTNSGLMAPELYVYVDPAFTANTF-^GTGIAWGRYRDDFIIFAΪ^^
SlαAQVLAIΛPGAFRSVRVAVEGNSSQDSAVAIATHVHTEMHRIIASAGANGPGPEIiFYHCEPPGGAVLY PFFLIuNKQKTPAI*ΕYFIKKFNSGGVMASQELVS TVRLQTDPVEYLSEQT-NNLIETVSPNTDVRMYSGKR •
NGAADDLMVAVIMAIYLAAPTGIPPAFFPITRTS
>gi_59501
MFGQQLASDVQQYIjERIiEKQRQLKVGADEASAG TMGGDAIΛVPFIJDFATATPKRHQTVVPGVGTωro
EHSPLFSAVARR LFNSLVPAQL GPJJFGGDHTAKI^F APEiVRAVAR RFKECAPADVVPQRNAYYSV l-NTFQALHRSEAFRQLVHFVRDFAQLLKTSFRASSLTETC^
HATΫFLAAVI-LGDHAEQVNTFLRLVFEIPLFSDAAVRI^^
GIKIGYTAHI - EPVFEEIDAC RG FGS RroHVKGETISFSFPDGSRS I FASSHN NGI GQDF
NLLFVDEANFIRPDAVQTIMGFMQANCKIIFVSSTNTGKASTSF^
VV HTNATACSCYILNKPVFITimGAVRRTADLFI-ADSFMQEIIGGQARETGDDRPV T SAGERFL YR PS TNSGIjmPDLYVYVDPAFTANTRASGTGVAVVGRYRDDYI IFA EHFF RALTGSAPADIARCVVH
SLTQVTJU-HPGAFRGVRVAVEGNSSQDSAVAIATHVΗTEMHRIJΛSEσADAGSGPEILFYHCEPPGSAV
YPFFI-TJraQKTPAFEHFIKKFNSGGVMASQEIVSATVRL^^
RNGASDDLMVAVIMAIY AAQAGPPHTFAPITRVS
>gi_2605992
MFGKALSRETIQYFE LRKEVQSRSGA lSrPJAEAQTGGEDDVKTAFLOTAIPTPQRHQ VVPGVG HDC
(^TAQIFASVARRLLFRSLSK RGGES ERr^PSSVEAYVDP VKQALKTISFVEY-TOAEaRSCR.TAYYS
IM1TOFDSLRSSDAFHQVANFVARFSR VDTSFNGADLDGDGQQTS RI VDVPTYGKQRGTLE FQKMI
MHATYFIAAVILGDHADRIGAF KMVF 'PEFSDATIRHFRQRATVFLVPRRHGKTWFTjVPLIALALiATF
KGIKIGYTAHIRKA EPVFDEIGAR RQWFGNSPVDHV GENISFSFPDGS STIVFASSHNTNGIRGQD FNLLFVDEANFIRPEAVQTI IGFLNQTNCKI IFVSSTIWGKASTSFLYH KGAADELLNVVTYICDEHME
RVKAHΪNATSCSCYILNKP IT DGAMR1OTAELFLPDS
YRPSTVANQDIMSNMLYVYVDPAFTTNAMASGTGVAVVGRYRSNWIVFGLEHFFLSALTGSSAELIARCV
AQCLAKWAIHSRPFDSVRIAVEGNSSQDAAVAIATNIQLEL-ITLRQADVVHMPGTVLFYHCTPPGSSVA
YPFFLLQKQKTGAFDHFIKAFNSGLVLASQELISNTVRLQTDPVEYI-IiTQMKNLTEVITGTSETRVFTGK RNGASDDMLVALVMAVYMASLPPTTNAFSSLSTQ
>gi_330792
MFGRVLIGRETVQYFEAIJUFFIVQARRGAKILIRAAΞAQNGGEDDAKTAFIJ-TFAIPTPQRHQTVVPGVGTLHDC CETAQIFASVARRIJ.FRSLSKWQSGEARERLDPASVEAYVDPKVRQALKTISFVEYSDDEARSCRNAYYS IMNTFDALRSSDAFHQVASFVARFSRLVDTSFNGADI^GDGQQAS-OΪARVDVPTYGKQRGTLELFQKMIL Π-ATΪFI I GDHADRIGAFL-^VF T EFSDATIRHFRQRAT FL PRRHGKT F P IAIAR-A F * KGIKIGYTAHIR-^TEPVFDEIGARLRQ FGNSPVXIHVKGENISFSFPDGSKSTIVFASSHCΓΓNGIRGQD F-TLLFVDEANFIRPEAVQTIIGFIINQTNCKIIFVSSTOT RVKAHTNATACSCYIMKPVFI MDGAMRNTAELFLPDSFMQEIIG∞ YRPS VANQDI SSD YVDPAFTIWA ASGTGVA GRYRSN V FG^FFIHFF S I-TGSSAE IARCV AQCLAQVFAIHKRPFRJSVRVAVEGNSSQDAAVAIA NIQLET-TRLRRADVVPMPGAVLFYHCTPHGSSV YPFFT^QKQCTGAFDHFIKAFNSGSVI^QELVSOTVRLQTDPVE RNGASDD LVALVMAVYLSSLPPTSDAFSΞLPAQ
>gi_971317
MFGGAVGEQSARYFQPJI*IffiRQRRAAERGARPDGGGGARGEDDARVPFLDFAVAAPKRHQTVVPGVGTLH * GYCELAPLFAATASRT.TiTiTSMARAEAGIJTOGTGEAHVSRETAGVIiSAI^FAAHPPAEAAAHCNAYHSVMA
AI^SMRASGAFAQVAAFVARFSRLVGTSFSHLGGGDDADPPRAKRARVEPPSGQTRGALELFQKMILMPA TYFVAATIi GEHAERIGAFl_RVAFNTPDFSDAAVAHFRQRATVFLVPRRHGKT FLVPLIALALATFKGI IGYTAHIRKATEPVFEEIVARLRQWFGGERVDHVKGEVISFSFPDGARSTIVTASSHNTNGIRGQDFNL LFVDEANFIRPEAVQTIVGFIJJQASCKIIFVSSTNTGKASTSFLYNLKGASDGI NVVTYIC^ AHGGATACS CYVLNKPVF ITMDAAARNTAETFLPNS FMQE I IGGGEVARRAEPAAVFTRAAGEQFLLYRP STAAARGP PERLYMYIDPAFTSNARASGSGIAVVGRHRGSWi.VLGLEHFFLPALTGSSAAEIARCAVRC FAQVI^VHRRRLDGLFVAVEGNSSQDSAVAIALGVRRELDSLAASGAVPMPAETRFYHCRPPGSAVAYPF FIJLQKQKTAAFDHFIRLFNSGRVVASQDLASLTVRLQTΓJPVEYLFEQLQNLTESTAGPGGARAFSG RRG AADDLMVALVMAVFVGSLPPTDGAFCPLAPRPPAD >gi_5869808 MSLIMFGRTLGEESVRYFERLKRRRDERFGTI£SPTPCSTRQGSLGNATQIPFIαI!rFAIDVTRRHQAVIPG IGTLHNCCEYIP FSATARRAMFGAFLSSTGYNCTPNVVLKPWRYSVNANVSPELKKAVSSVQFYEYSPE EAAPHRNAYSGVMNTFRAFSLSDSFCQLSTFTQRFSYLVETSFESIEECGSHGKRA VDVPIYGRYKGTL ELI^KMILMHTTHFISSVI^^
IALVMATFRGIKVGYTAHIRKATEPVFEGIKSRT.EQWFGANYVDHVKGESITFSFTDGSYSTAVFASSHN TNGIRGQDFNLLFVDEANFIRPDAVQTI VGFLNQTNCKI IFVSSΩ^GKASTSFLYNIΛGSSDQLLNVVT WCTDHMPRVLAHSDVTACSCYVIiNKPWITMDGAMRRTA^
T TARERFILYRPSTVANCAILSSVLYVYVDPAFTSNTRASGTGVAIVGRYKSD IIFGLEHFFLRALTG TSSSEIGRCVTQCLGHIl aiiHP.π'F .rVHVSIEGNSSQDSAVAISIΛIAQQFAVLE GNV SSAPVLLFY HS IPPGCSVAYPFFLLQ QKTPAVDYFVKRFNSGNI IASQELVSLTVKLGVDPVEYLCKQLDNL EVIKG riMGCrLDTKTYTG GTTGTMSDDIiMVALIMSVYIGSSCIPDSVFMPIK >gi_5708110
^rLGKES EIVIKYRDAL KR ^rERGPDD DGQEMSDS FI ASICDR DSA DTM S AS FQFAID QRHQACIAPIGSFHNCCAISRAFSYMASEIIYIΪNLASYSTKYTDTDAALNDLQVSPKRQLFTGAAEDSIL PALRQK ANLNFARFAPSDSLIHDKAFDGIMNGYRGFVKSDEFSQLNl^IYRFHTIiLKKS RAKLEKTTSEQRDGTLELFQK ILMHATYFASS I CLGEGSTERSNRYLSTVFNTSLFSENI IQHFRQRTT VFLVPRRHGKT FLVPLISLLVSSFEGIRIGYTAHLRKATEPVFIEIFTRLYKWFGAKQVEQVKGETITF TFPJ^GNKSAIVFASSQITIWGIiRGQDFNFLFVDEANFIKPAAi TVMGFljNQTNCKLFFVSST^ IJLYNL GKTNSLLNVVTYIC^EHMPEIQKKTDVTTCSCYVLHKPVF^ GGRAGKYDSDRTLVPVRALDQFLIYRPΞTSSKPNISGLGKILTVYVDPAFTTNRSASGTGIALVTALRDS VTJ>1GAEHFY ^ALTGEAALEIAQCVYLCIAYCCLIHAGAFP^IRIAVEGNSSQDSAAAIAGNLTE]^LDS ' LRRRLGFSLTFAHSRQPGTAMAHPFYLIilffiQKSPJ^DLFVSLFNSGRFMASQELVSNTLVLSKDPCEYLV DQIRNITVTHGQGPDSFRTFSGKQGRVPDDMLVAAVMSTYLALEGSPTAGYHPIAPIGRRQRPA >gi_1813970 MIJRGDSAAKIQERYAELQKRKSHPTSCISTAFT-WATL^KRYQMMHPELGLAHSOSIEAFLPLMAFCGRH RDYNSPEESQREIiFHERLKSALDKLTFRPCSEEQRASYQKLDALTELYRDPQFQQINNFMTDFKK LDG GFSTAVEGDAKAIRI-EPFQKNLLIHVIFFIAVTKIPVLANRVLQYLIHAFQIDFLSQTSIDIFKQKATVF LVPRRHG-ΩWFIIPIISFLLKHMIGISIGYVAHQKHVSQFVLKEVEFRσUCTFARDYVVEN DNVISIDH RGAKSTALFASCYNTNSIRGQNFIftiTiTiVDEAHFIKKEAFNTILGFLAQNTTKIIFISSTNTTSDSTCFLT RLNNAPFDMT-N SYVCEEHLHSFTEKGDATACPCTRLHK^^ NKISQNTVLITDQSREEFDILRYSTLNTNAYDYFGKTLYVYLDPAFTTNRKASGTGVAAVGAYRHQFLIY
GLF^FFIJU3LSESSEVAIAEC^UMIISVLSLHPYLDELRIAVEGNTNQAAAVRIACI.IRQSVQSSTLIR VLFYH PDQNHIΞQPFYIMGRDKALAVEQFISRFNSGYIKASQELVSYTIKLSHDPIEYLLEQIQNLHRV TLAEGTTARYSAKRQNRISDDLIIAVIMATYLCIJDIHAIRFRV'S
>gi_27 629S MLRSCDIDAIQKAYQSIΪW HEQDVKISSTFPNSAIFCQKRFIILTPELGFTHAYCRHVKPLYLFCDRQR HV SKIAICDPLNCALSKLKFTAIIEKmΕVQYQKHLELQTSFYRNPMFLQIEKFIQDFQR ICGDFENT NKKERI LEPFQKS ILIHI IFFISVTKLPTIΛNHVLDYLKYKFDIEFINESSVNILKQKASVFLVPRRHG KTWFMIPVICFLLIO^n^GISIGYVAHQKHVSHFV KDVEFKCRRFFPQKNITCQDNVITIEHETIKSTAL FASf^NTHSIRGQSFNLLIVDESHFIKKDAFSTILGFLPQSSTKIIFISSTNSGNHSTSFLTKLSNSPFE LT SYVCΕDHVHILlTORGNATTCACTUuHKPKFISINA^ LITEQGLIEFDLFRYSTISKQIIPFLGKELYΪYIDPAYTINRRASGTGVAAIGTYGDQYIIYGMEHYFLE SLLSNSDAS IAECASHMILAVLELHPFFTELKI I IEGNSNQSSAVKIACILKQTISVIRYKHITFFHTLD QSQIAQPFYIJLGRΞKRLAVEYFISNFNSGYIKASQELISFTIKITYDPIEYVIEQI NLHQININEHVTY NAKKQTCSDDLLISI IMAIYMCHEGKQTSFKEI >gi_32S496 LRT03ITHI NNYEAIIWKGERDCSTISTKYPNSAIFYKKRFIMLTPELGFAHSYNQQVKPLYTFCEKQ RHLKNRKPLTILPSLSHKLQEMKFLPASDKSFESQYTEFLESFKILYREPLFLQIDGFIKDFRKWIKGEF NDFGDTRKIQLEPFQKNILIHVIFFIATCKLPALANRVINYLTHVFDIEFVNE^
RHGKT FIVPIISFIiLKNIEGISIGYVAHQKHVSHFVMKEVEFKCRRMFPE TITCLDNVITIDHQNIKS . TALFASCTNTHQSIRGQSFNl-LIVDΞSHFIKKDAFSTILGFLPQASTKI^^ SPFH-fl-SWSYVOSDHAHMLNERGNATACSCYRLHKPKFISINAEV^
INDVLITEQGQTEFEFFRYSTINKNLIPFLGKDLYVYTjDPAYTG-mRASGTGIAAIGTYLDQYIVYGMEH YFlBSI- π?SSDTAIAECAAHMI SIIΛ HPFFTEVKIIIEG S QAS VKIACIIKENITANKSIQV FF HTPDQNQIAQPFYLLGKEKKLAVEFFISNEWSGNIKASQELISFTIKITYDPVEYAl^QIRNIHQISVNN YITYSAKKQACSDDLIIAIIMAIYVCSGNSSASFREI >gi_854039
MKIJrøSPFEMLS SYVCTDHAHMLϊ-TERGNATACSCYR^
AT rVINDVLITEQGQTEFEFFRYSTINK.TLIPFLGKDLYVYLDPAYTGNRRASGTGIAAIGTYLDQYIV YGMEHYFI^SLMTSSDTAIAE aAH ILSILDLHPFFTEVKIIIEGNSNQASAVKIACIIKENITANKSI QVTFFHTPDQNQIAQPFYIiGKE.OCr.AVEFFISNFNSGKIKASQELISFTIKITYDPVΕYATjEQIRNIHQ ISVNNYITYSAKKQACSDDLIIAIIMAIYVCSGNSSASFREI >gi_5733564
^π-RTCDI HI-*^n*IYEAIIW GE NCSTISTKYPNSAIF K RFIM TPELGF HSY QQVKP TFCEKQ RHLKNRKPLTILPSLTR LQEMKFLPASDKSFESQYTEFLESFKILYSEPLFLQIDGFIKDFRKWΪ GEF NDFGDTRKIQLEPFQKNILIHVIFFIAVTKLPALANRVINYLTHVFDIEFVNESTIΛTTL QKTNVFLVPR RHGKTWFIVPI ISFLLKNIEGIS IGYVAHQKHVSHFVMKEVEFKCRRMFPEKTITCLDNVITIDHQNI S TALFASCYITOISIRGQSFNLLIVDESHFIKKDAFSTILGFLPQASTKILFISSTNSGNHSTSF^ PFEKLSWSYVOTDHAH LlffiRGNATACSCΩtLHKPKFISINAEVKKT^^
NDVLITEQGQTEFEFFRYSTINKNLIPFLGKDLYVYLDPAYTGNRRASGTGIAAIGTYLDQYIVYGMEHY FLESLMTSSDTAIAECAAHMILS ILDLHPFFTEVKI I IEGNSNQASAVKIACI IKENITANKS IQVTFFH TPDQNQIAQPFΥLLGKEKKLAVEFFISNFNSGNIKASQELISFTIKITYDPVEYALEQIRNIHQISVNNY ITYSAKKQACSDDLIIAIIMAIYVCSGNSSASFREI >gi_4996048 KTjraSPFEMLSVVSYVCEDHAHMIiNERGNATACSCYRLHKPKFISINAEVKKTANLFLEGA^ ATaWINDVLITEQGQTEFEFFRYSTINKNLIPFLGKDLYVYIjDPAYTGNRRASGTGIAAIGTYLDQYIV YGMEHYFLESLMTSSDTAIAECAAHMILS ILDLHPFFTEVKI I IEGNSNQASAVKIACI IKENITANKS I QVTFFHTPDQNQIAQPFYLLGKEKKJ^VEFFISNFNSGNIKASQELISFTIKITYDPVEYALEQIRNIHQ ISVNNYITYSAKKQACSDDLIIAIIMAIYVCSGNSSASFREI >gi_1136808 MI*LSRHRERLAAN]^ETAKDAGERWELSAPTFTRHCPKTARMaHPFIGVV^ ' TPTSANPDVGTPRPSEDNVPAKPRLLESLSTYLQMRCVREDAHVSTADQLVEYQAGRKTHDSLHACSVYR ELQAFLVNLSSFLNGCYVPGVHVπ_EPFQQQLVMHTFFFLVSIKA.PQKTHQLFGLFKQYFGLFETPNSVLQ TFKQKASVFLIPRRHGKTWIVVAIISMLLASVENINIGYVAHQKHVANSV7AEIIKTLCKWFPPKNLNIK KENGTIIYTRPGGRSSSIiMCATCFNKNSIRGQTFNLLYVDEANFIKKDALPAILGFMLQKDAKLIFISSV NSSDRSTSFLLNLRNAQEKm-N SYVOTDHREDFm
GAFDTEIMGEGAASSNATLYRVVGDAALTQFDMCRVDTTAQEVQKCLGKQLFVYIDPAYTNNTEASGTGV GA VTSTQTP S ILG^IEHFF RDL G 2 & EIASCaC MIK IAVTJHTTIE V AAVEG SSQDSGVA IATVIiMICPLPIHFLHYTDKSSALQWPIYMLGGEKSSAFETFtYALNSGTLSASQTVVSNTIKISFDPV TYLVEQVRAIKCVPIJIDGGQSYSAKQKHMSDDLLVAVVMAHFMATDDRHMYKPISPQ >gi_1718281
MLQKDAKLIFISSVNSSDRSTSFLLNLRNAQEK TLNWSYVCADiπffiDF IDES IKTTTNLF EGAFDTELMGEGAASSNATLYRVVGDAALTQFDMCRVDTTAQQVQKCLGKQLFVYID PAYTN ΕASGTGVGAVVTSTQTPTRSLILGMEHFFLRDLTGAAAYEIASCΑCTMIKAIAVLHPTIERVN AAVEGNSSQDSGVAIATVLNEICPLPIHFLHYTDKSSALQWPIYfTLGGEKSSAFETFIYALNSGTLSASQ TVVSNTIKISFDPVTYLVEQVRAIKC^PLRDGGQSYSAKQKHMSDDLLVAVVMAHFMATDDRHMYKPISP
Q
>gi_2246515
' mQKDAKLIFISSVNSSDRSTSFLINLRNAQEKMLNWSWCAD∞ IDESIKTTTNLF EGAFDTELMGEGAASSNATLYRVVGDAALTQFDMCRVDTTAQQVQKCLGKQLFVYID PAYTNNTEASGTGVGAVVTSTQTPTRSLILGMEHFFLRDLTGAAAYEIASCACTMIKAIAVLHPTIERVN AAVEGNSSQDSGVAIATVLNEICPLPIHF1HYTDKSSALQWPIYMLGGEKSSAFETFIYALNSGTLSASQ TVVSNTIKISFDPVTYLVEQVRAIKCVPLRDGGQSYSAKQKHMSDDLLVAVVMAHFMATDDRHMYKPISP
Q
>gi_2246552
AmLSRHRERLAAI*rLQETAKDAGERWELSAPTFTRΞCPKTARMAHPFIGVVHRINSYSSVLETYCTRHHPA P SA PDVGTP PSED PAKPRL ESLSTYLQ I C REDAHVSTADQ VEYQA RKTHDS HACS R E QAF ]mSSF ^GCY PGVH LEPFQQQL riϊTFFFL SI PQKTHQ FGLFKQYFG FE PNSVLQ TFKQKASVFLIPRRHGKTWIVVAI ISMLIΛMVENINIGYVAHQKIOTANSVFAEI IKTLC^WFPPKNLNIK KENGTI IYTRPGGRSSSLM<^TCFNKNSIRGQTFNLLYVDEANFIKKDALPAILGFra_QKDAKLIFISSV NSSDRSTSFLI∞ RNAQEKMLNVVSYVCADHREDFHLQDALVSCPCYRLHIPTYITIDESIKTTTNL GAFOTELMGEGAASSNATLYRVVGDAALTQFDMraVDTTAQQVQKCLGKQLFVYIDPAYTNNTEASGTGV GAVVTSTQTPTRSLILGMEHFFLRDLTGAAAYEIASCACTMIKAIAVLHPTIERVNAAVEGNSSQDSGVA lATVIJTEICPLPI-IFIJHYTDKSSALQWPIY GGEKSSAFETFIYALNSGTLSASQTVVSNTIKISFDPV TYLVEQVRAIKCWPLRDGGQSYSAKQKIMSDDLLVAVVMAHFMATDDRHMYKPISPQ >gi_4494933 IttQKDAKLIFISSSNSSDKSTSFLI1NL:KDAHE--3'ffiNVVNYVCPD IDETVRSTTNLFLΞGAFSTELMGDAATSAQSMHKIVSDSSLSQLDLCRVKSTSQDIQGAMKPCLHVYIDP AYTNNTDASGTGIGAVIAVNHKVIKCILLGVEHFFIj LTGTAAYQIASCAAALIRAIVTLHPQITHVNV AVEGNSSQDAGVAIATVIiNEICSVPLSFIHHVDKITmiRSPIYMLGPEKAKAFΕ.SFIYALNSGTFSASQT VVSHTIKLSFDPVAYLIDQIKAIRCIPLKDGGHTYCAKQKTMSDDVLVAAVMAHYMATNDKFVFKSLE >gi_7330018 rø.QKDAKLIFISSSNSSD:KSTSFLIjNLKDAHEKMI NVVN^
IDETVRSTTNLFLΞGAFSTELMGDAATSAQSMHKIVSDSSLSQLDLCRVESTSQDIQGAMKPCLHVYIDP AYTNNTDASGTGIGAVIAVNHKVIKCILLGVEHFFLRDLTGTAAYQIASCAAALIRAIVTIJIPQITHVNV AVEGNSSQDAGVAIATV]^NEICSVPLSFLHH2y3KNTLIRSPIYPrLGPEKAKAFESFIYALNSGTFSASQT VVSHTIKLSFDPVAYLIDQIKAIRCIPLKDGGHTYα^QKTMSDDVT.VAAVMAHYMATNDK^ . >gi_4019255 IjIiKAKKAIMEJffiTEASSTQSETEWTVDTPTMITNIKKSER AYSKIGVIPSINLYSASLTSFCRLYRP - LALKQPLPQTGTLRLLPSEKPYISQKLSlSYVKSLTLKIOTrHDIEA
FIINLSSFLNGCYVKKSTΞIEPFQLQLILHTFYFLISIKSPESTNKLFTJIFKEYFGLGEMDSA LQNFKQ
KAjSIFLIPRRHGKT I AIISty iTSVE.ttHVGYVAHQKHVANSVFra
TL IYKI PGKKPSTLMCASCFNKNS IRGQTFNLLYIDEANFIKKDSLPAILGFMLQKDAKLIFISSVNSGD ICATSFLFIIΠJKNASEKMΓJNIV.WICPDIIKDDFSLQDSLISCPCΎKLYIPTYITIDETIKOT'T^
TEIMGDISVMSK1WIHKVIGETAIMQFDLCRIDTTKPEITQCLNSIMYLYIDPAYTNNSEASGTGIGAII ALKNNSSKCIIVGIEHYFLCTLTGTATYQIASCACSLIRAALVLYPHIQAVHVAVEGNSSQDSAVAISTF TJFFICSPVKVNFMHYKDKTTAMQWPIYMLGSEKSQAFESFIYAINSGTISASQSIISNTIKLTFDPISYLI EQIRAIRCΥPLRDGSHTYC^KKRTVSDDVLVAVVMAHFFSTSNKHIFKQLNSI >gi_4019257 ^QKDAKLIFISS NSGD SFL LK ASEKM]^IVNYICPDHKDDFSLQDSLISCPCYKLYIPTYI IDETIKNTTNLFI*DGAFTTELMGDISVMSKNNIH3OTIGETAIMQFDL< IDTTKPE
PAYTIrøSEASGTGIGAIIALKNNSSKCIIVGIEHYFLKDLTGTATYQIASCACSLIRAALVLYPHIQAVH VAVEGNSSQDSAVAISTFLNECSPVKVNFMHYKDKTTAMQWPIYMLGSEKSQAFESFIYAINSGTISASQ SI ISNTIKLTFDPISYLIEQIRAIRCYPLRDGSHTYCAKKRTVSDD VLVAWMAHFFSTSNKHIFKQLNS I
>gi_60355
MLLLKAKKAIIENLSEVSSTQAETDWDMSTPTIITNTSKSERTAYSKIGVIPSVNLYSSTLTSFCKLYHP LTLNQTQPQTGTLRIiLPHEKPLILQDLSlTΪVKLLTSQNVCHDTEANTEYNAAVQTQKTSMECPTYLELRQ FVINLSSFLiNGCYVKRSTHIEPFQLQLILHTFYFLISIKSPESTNRLFDIFKEYFGLREirDPDMLQIFKQ KASIFLIPRRHGKTWIVVAIISMLLTSVENIHVGYVAHQKHVANSVFTEIINTLQKWFPSRYIDIKKENG TIIYKSPD K S IMCATCF-^NSIRGQTFNLL IDEANFIKKDSLP ILGFrQKD K IFISSV SGD RATSFLFNLKNASEKMLNIVNYICPDHKDDFSLQDSLISCPCYKLYIPTYITIDETIK-W^ TELMGDMSGISKSNMHKVISEMAITQFDLOlADTTKPEITQCLNSTMYIYIDPAYTNlSrSEASGTGIGAIL TFKNNSSKCIIVGMEHYFLKDLTGTATYQIASCACSLIRASLVLYPHIQCVHVAVEGNSSQDSAVAISTL INECSPIKVYFIHYKDKTTTMQWPIYMLGAEKSIAFESFIYAINSGTISASQSIISNTIKLSFDPISYLI EQIRSIRCYPLRDGSHTYCAKKRTVSDDVLVAVVMAYFFATSNKHIFKPIJNST >gi_S95201
MLQKDAKIIFISSVNSSDQTTSFLYNLKNAKEKMI^AΠSΓYVCPQHREDFSLQESVVSCPCYRLHIPTYIA IDENIKDTTNLFMEGAFTTEIJMGDGAAATTQTNMHKVVGEPALVQFTJLCRVDTGSPEAQRGLNPTLFLYV
DPAYTNNTEASGTGMGAVVSMKNSDRCVVVGVEHFFLKELTGASSLQIASCAAALIRSLATLHPFVREAH AIEGNSSQDSAVAIATLLHERSPLPVKFLHHADKATGVQWPMYILGAEKARAFETFIYALNSNTLSCGQ AIVSNTIKLSFDPVAYLIEQIRAIKCTPLKDGTVSY(^AK3IKGGSDDTLVAVVMAHYFATSDFJIVFKNHMK
QI
>gi_4928934
MIJ.SSFRNHLQKISΓ^KYSVQAQNIDWPVETPVLISKDSKTNRLAHPLIGVISRINLYSPTLKYYCDEYST
TKQPKFTPDIGYVRDLK-O-RDQYFLPKLQHHLSTL EAYHHVDRQA
FLINLSCFLINGCYVSKSTCIELFQKQLILHTFYFLISIKTPEETISRKMFTFFKHYVGLFDIDDNMLQCFKQ
KSTVFLIPRRHGKTWI A ISVLTjASVE rølGYVAHQKHVANAVFTEI ITTLYQWFPSKNIEIKKENG I IYTKPGRKPSTLMCATCFNKNSIRGQTENILYVDEANFIKKEALPAILGFMLQKDAKIIFISSVNSAD KSTSFLFNLRNAKEKMLNVVNYVCPEHKEDFNLQSTLTSCPCYRiaiPTYITIDESIKNTTNLF^^ TE]^GDISTFPTSSMFICVVEEQALFHFDICRVDTTQIDTVKIIDNVLYVYVDPAYTSNSEASGTGIGAVV PLKTKVKTIILGIEHFYLKfΛTGTASQQIAYCVTSMIKAILTLHPHINHVNVAVEGNSSQDSAVAISTFI NΞYCPVPVFFAHO^TERSSVFQWPIYILGSEKSQAFEKFICAINTGTLSASQTIVSNTIKISFDPyAYLME QIRAIR- LPLKDGSYTYCAKQKTMSDDTLVAVT ANYMAISEKHTFKELCKT >gi_1632798 * rLYASQRGR EN ALQQDSTTQGCLGAETPSI^r ^G S'DR AHP VG IHAS LYCPr RAYC HY GPRPVFVASDESLPMFGASPAIJHTPVQVQMCLLPEIIRDTLQRLLPPPNLEDSEALTEFKTSVSSARAILE
DPNFLEΓREFI^SLASFLSGQYKHKPARLEAFQKQVVLHSFYFLISIKSLEITDTMFDIFQSAFGLEEMT LEKLHIFKQKASVFLIPRRHGKTWI WAI ISLILSNLSNVQIGYVAHQKHVASAVFTEI IDTLTKSFDSK RVEVNKETSTITFRHSGKISSTVMCATCFNKNS IRGQTFHLLFVDEANFIKKEALPAILGFMLQKDAKI I FISSVNSADQATSFLYKLKEAQERIIIJSTVVSYVCQEHRQDFDMQDSMVSCPCFRLHIPSYITMDSNIRATT T^FΩGAFSTELMGDTSSLSQGSLSRTVRDDAINQIJELCKVDTLNPRVAGRLASSLYVYVΫPAYTNNTSA SGTGIAAVTHDRADPNRVIVLGLEHFFLKDLTGDAALQIATCVVALVSSIVTLHPHLEEVKVAVEGNSSQ DSAVAIAS I IGESCPLPCAFVHTKDKTSSLQWPMYIJLTLFFIKSKAFΕRLIYAVNTASLSASQVTVSNTIQL SFDPVLYLISQIRAIKPIPIJRDGTYTYTGKQRITTSDDVLVALVMAHFIATTQKHTFKKVH
>gi_2337991
MFYVKVMPALQKACEELQNQWSAKSGKWPVPETPLVAVETRRSERWPHPYLGLLPGVAAYSSTLEDY'CΞL YNPYIDALTRCDLGQTHRRVATQPVLSDQLCQQLKKLFSCPPJTTSVKAKLEFEAAVRTHQALDNSQVFLE - r l-N SAFIJS RYSD SSHIE FQ Q IMH FFLVSIKAPE CEKF NI KL F IDTroQA,π.DI FKQKASVFLIPRRHGKTWIVVAIISILIiASVQDLRIGYVAHQKHVANAVFTEVINTLHTFFPGKYMDVKK ENGTIIFGLPNKKPSTLLCATCFNKNSIRGQTFQLLFVDEANFIKKDALPTILGFMLQKDAKIIFISSSN SSDQSTSFLYNLKGASERML-TVVSYVCSNHKEDFSMQDGLISCPCΥSLHVPSYISIDEQIKTTTNLFLDG VFDTELMGDSSCGTLSTFQIISESALSQFELCRIDTASPQVQAHLNSTVHMYIDPAFTNNLDASGTGISV IGRLGAKTKVILGCEHFFLQKLTGTAALQIASCATSLLRSWIIHPMIKCAQITIEGNSSQDSAVAIANF IDE^PIPVTFYHQSDIrKGVLCPLYLLGQEKAVAFΕSFIYAra!π.GLCKASQLIVSHTIKLSFDPVTYLL EQVRAIKCQSLRDGSHTΪHAKQKNLSDDLLVSVVMSLYLSSANTLPFKPLHIERFF >gi_2317977 l QKDA IFFISS SGE TTSFLYNL-αD E M SY CSEH rEDFN QSAI ACPCY VPEFIT INDNIKCTTNIΛLEGSFATEIiMG-MQSHTEVSGNSMIHESSLTR^ AYGNNVHASGTGIVAMSHCKHTKKCI ILGLEHFFLNNLTGTAAHNIASCATATi EGILFQHPWIQEIRCI IEGNSNQDSAVAIATFISHNIKLPTLFASYRDKTGMQWPIYJTLSGDKTLAFQNFISSLNQGLLCASQTVV SNTVIιLSSDPISYLIEQIKNTKCIYHKNKTITFQSKTHTMSDDVLIACTOTCYVMTTNKISYISFSIK
FIASKKSYFEAVYRSTVSSHSEEFWKSDDPVYFTQYKKQ- NRLPNAYLGTLHSASKYSENFRHYVATFS NSPLDFPQSVF-TORNPCEYSVPYIXlSALQCSAKTLVGCSVSTTEPJ-rEYEVCKEATRCFKDAMSHKVLKVF LSN S F KGHY S QAFLE FQ QLILHSFMF ASI CPETT KLFDEFKFL Dt YFDN D LTFLQK SPAFLIPRRHGKTWIVTAIISMLLTSVDDI1HIGYVAHQKHVSLAVFLEISNIIJ.AWFPRKNIDIKKENGV ILYSHPGKKSSTLMC^TCFNKNSIRGQTFNLLFVDEANFIKKEALPAILGFMLQKDAKIFFISSVNSGEK TTSFLYNLi ANEKMVNVVSYVCSEHMEDFNKQSAITACPσπU.YVPEFITINDNIKCTTNLLLEGSFAT ELMGN QSHTEVSGNS IHESSLTRLDFYRCDTAGQGAPTTENTLFVYIDPAYGNNVHASGTGIVAMSHC KHTKKCI ILGLEHFFLNNLTGTAAHNIASCATALLEGILFQHPWIQEIRCI IEGNSNQDSAVAIATFISH NIKLPTLFASYRDKTGMQWPIYMLSGDKTLAFQNFISSLNQGLLCASQTVVSNTVLLSSDPISYLIEQIK NTKCIYIQCNKTITFQSKTHTMSDDVLIACVMTCYVMTTNKISYISFSIK 1 10 20 30 40 50 GO 70 80 90 100 110 120 130
TABLE 2

[Table 2: multiple alignment of viral protein sequences (rows labelled by NCBI gi number, followed by a consensus row), shown as images imgf000027_0001 to imgf000033_0001 in the original publication.]
Table 3. Degenerate primers generated by CODEHOP

Block x7263xbliD
T L Y V Y I D P
oligo: 5'-AACCTGTACGTGtayntngaycc-3' degen=64 temp=33.4 Extend clamp
T L Y V Y I D P A
oligo: 5'-AACCTGTACGTGTACntngayccngc-3' degen=128 temp=36.0 Extend clamp
T L Y V Y I D P A Y
oligo: 5'-AACCTGTACGTGTACATngayccngcnt-3' degen=128 temp=42.5 Extend clamp

Complement of Block x7263xbliD
Y I D P A Y T N N T
atrnanctrggGCGGATGTGGTTGTTGT
oligo: 5'-TGTTGTTGGTGTAGGCGggrtcnanrta-3' degen=64 temp=62.9
D P A Y T N N T R A
anctrggncgnaTGTGGTTGTTGTGGGTCCG
oligo: 5'-GCCTGGGTGTTGTTGGTGTangcnggrtcna-3' degen=128 temp=61.8
D P A Y T N N T R A
ctrggncgnawGTGGTTGTTGTGGGTCCG
oligo: 5'-GCCTGGGTGTTGTTGGTGwangcnggrtc-3' degen=64 temp=61.0

Block x7263xbliE
C I I F G M E H F F
oligo: 5'-TGGATCATCTTCGGCATngarcaytwyt-3' degen=64 temp=55.7 Extend clamp
F G M E H F F L
oligo: 5'-CATCTTCGGCATGGAGcaytwytwyyt-3' degen=64 temp=62.0

Complement of Block x7263xbliE
E H F F L R D L T G
ctygtrawrawGGACTTCCTGGACTGCCC
oligo: 5'-CCCGTCAGGTCCTTCAGGwarwartgytc-3' degen=32 temp=61.7
E H F F L R D L T G
tygtrawrawrrACTTCCTGGACTGCCCG
oligo: 5'-GCCCGTCAGGTCCTTCArrwarwartgyt-3' degen=128 temp=60.8
H F F L R D L T G
gtrawrawrraCTTCCTGGACTGCCCG
oligo: 5'-GCCCGTCAGGTCCTTCarrwarwartg-3' degen=64 temp=S0.8

Block x7263xbliF
E V H I A V E G N
oligo: 5'-GGACGTGCACGTCGCCrtngarggnaa-3' degen=64 temp=63.8

Complement of Block x7263xbliF
E G N S S Q D S A
anctyccnttrwGGTTGGTCCTGAGGCGG
oligo: 5'-GGCGGAGTCCTGGTTGGwrttnccytcna-3' degen=128 temp=62.7
E G N S S Q D S A V
ctyccnttrwsGTTGGTCCTGAGGCGGC
oligo: 5'-CGGCGGAGTCCTGGTTGswrttnccytc-3' degen=64 temp=63.9
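
The degen values in Table 3 can be checked by hand: each IUPAC ambiguity code in the 3' core multiplies the number of distinct sequences in the primer pool. The following is a minimal Python sketch and not part of the original disclosure. The function names `degeneracy` and `split_clamp_and_core` are illustrative, and the sketch assumes the usual CODEHOP convention that the upper-case run is the consensus clamp and the lower-case run is the degenerate core.

```python
# IUPAC nucleotide codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo):
    """Return the number of distinct sequences a degenerate oligo represents."""
    total = 1
    for base in oligo.upper():
        total *= len(IUPAC[base])
    return total


def split_clamp_and_core(oligo):
    """Split an oligo at the first lower-case letter (assumed start of the core)."""
    for i, ch in enumerate(oligo):
        if ch.islower():
            return oligo[:i], oligo[i:]
    return oligo, ""


if __name__ == "__main__":
    primer = "AACCTGTACGTGtayntngaycc"  # first oligo of Block x7263xbliD
    clamp, core = split_clamp_and_core(primer)
    print(clamp, core, degeneracy(primer))
```

For the first oligo of Block x7263xbliD the core `tayntngaycc` contains y, n, n and y, giving 2 x 4 x 4 x 2 = 64, which agrees with the listed degen=64. Reversing an oligo string reproduces the 3' to 5' line printed above each complement-strand primer.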

Claims

1. A high throughput method for screening a biological sample for unknown viruses, which method comprises
(a) subjecting DNA from the sample to PCR amplification conditions using simultaneously multiple pairs of degenerate primers, wherein each primer binds a sequence that is conserved across members of a family of viruses and each pair of primers selectively directs amplification of sequence of said family;
(b) sequencing PCR product obtained in step (a); and
(c) comparing the sequence of the PCR product with the sequences in at least one database comprising viral sequences to determine whether the sequence is present in, or absent from, the database, wherein absence of the sequence from the database suggests that the sequence may be from an unknown virus.
2. A method according to claim 1 wherein from 12 to 300 pairs of degenerate primers are used simultaneously.
3. A method according to claim 1 wherein from 24 to 96 pairs of degenerate primers are used simultaneously.
4. A method according to claim 1, 2 or 3 wherein the PCR reaction step is carried out in a multi-well plate.
5. A method according to claim 4 wherein the multi-well plate is a 96-well or 384-well plate.
6. A method according to claim 4 or 5 wherein each well contains more than one pair of the degenerate primers.
7. A method according to claim 6 wherein each well contains three pairs of the degenerate primers.
8. A method according to claim 6 or 7 wherein each pair of the primers used in the same well as other pair(s) of primers generates a PCR product of a different size to said other pair(s).
9. A method according to any one of claims 6 to 8 wherein each pair of primers used in the same well as other pair(s) of primers carries a different label from said other pair(s).
10. A method according to claim 9 wherein each pair of primers used in the same well as other pair(s) of primers carries a differently-coloured fluorescent label from said other pair(s).
11. A method according to any one of the preceding claims wherein multiple biological samples are screened by simultaneously subjecting DNA from the multiple samples to PCR reaction conditions in step (a) using simultaneously the multiple pairs of degenerate primers on the DNA from each sample.
12. A method according to claim 11 wherein from 2 to 80 samples are screened simultaneously.
13. A method according to claim 12 wherein from 5 to 40 samples are screened simultaneously.
14. A method according to any one of the preceding claims wherein DNA of multiple biological samples is mixed together to produce one or more pooled samples and the PCR reaction of step (a) is carried out on the pooled samples.
15. A method according to any one of the preceding claims wherein the DNA is genomic DNA.
16. A method according to any one of claims 1 to 14 wherein the DNA is cDNA.
17. A method according to any one of the preceding claims wherein the multiple pairs of degenerate primers are designed by
(i) providing a plurality of amino acid sequences from members of a first virus family,
(ii) comparing the sequences to identify conserved regions,
(iii) designing a first primer pair using a computer based method, wherein each primer in the pair binds a nucleotide sequence that encodes a conserved region identified in (ii) and wherein the primer pair is designed to amplify by PCR the nucleotide sequence between the nucleotide sequences that encode conserved regions in members of the first virus family, and
(iv) repeating steps (i) to (iii) for each virus family.
18. A method according to any one of the preceding claims wherein the biological sample is a human sample.
19. A method according to any one of the preceding claims which further comprises determining whether the sequence of the PCR product is a sequence of human DNA.
20. A method according to any one of the preceding claims wherein, if the sequence of the PCR product is absent from the database comprising viral sequences, DNA walking is carried out to determine any sequence which flanks the sequence of the PCR product.
21. A method according to any one of the preceding claims wherein, if the sequence of the PCR product is absent from the database comprising viral sequences, any virus comprising the sequence is isolated.
22. A method according to any one of the preceding claims which further comprises determining whether the sequence of the PCR product or sequence contiguous thereto is present in a specimen from a patient who has a disease, wherein any presence of a said sequence in the specimen suggests that the sequence may be from a virus which is causing or contributing to the disease.
23. A method according to any one of the preceding claims which further comprises obtaining a specimen from each member of a group of subjects with a disease; determining whether the sequence of the PCR product or sequence contiguous thereto is present in the specimen from each member of the group; and
determining whether the proportion of subjects in whom a said sequence is present is greater in the group of subjects who have the disease than in a control group of subjects who do not have the disease, wherein a said greater proportion suggests that the sequence may be from a virus which causes or contributes to the disease.
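
Two of the claimed steps lend themselves to a short illustration: the multiplexing of claims 6 to 8, where primer pairs sharing a well must give PCR products of different sizes, and the proportion comparison of claim 23. The Python sketch below is not taken from the disclosure. The greedy first-fit layout, the 50 bp minimum size gap, the one-sided Fisher exact test and the names `assign_wells` and `fisher_one_sided` are all assumptions chosen for illustration; the claims do not prescribe any particular layout algorithm or statistical test.

```python
from math import comb


def assign_wells(pairs, per_well=3, min_size_gap=50):
    """Greedy first-fit layout of primer pairs into wells.

    pairs: list of (name, expected_product_size_bp) tuples.
    A pair joins the first well that has room and whose existing
    product sizes all differ from it by at least min_size_gap bp
    (an assumed threshold, not a value from the claims).
    """
    wells = []
    for name, size in sorted(pairs, key=lambda p: p[1]):
        for well in wells:
            fits = len(well) < per_well and all(
                abs(size - other) >= min_size_gap for _, other in well
            )
            if fits:
                well.append((name, size))
                break
        else:
            # No existing well can take this pair, so open a new one.
            wells.append([(name, size)])
    return wells


def fisher_one_sided(pos_cases, n_cases, pos_controls, n_controls):
    """One-sided Fisher exact p-value.

    Probability, with the table margins fixed, of seeing at least
    pos_cases sequence-positive subjects in the disease group if
    positivity were unrelated to disease status.
    """
    total_pos = pos_cases + pos_controls
    total = n_cases + n_controls
    upper = min(total_pos, n_cases)
    tail = sum(
        comb(total_pos, k) * comb(total - total_pos, n_cases - k)
        for k in range(pos_cases, upper + 1)
    )
    return tail / comb(total, n_cases)


if __name__ == "__main__":
    # Hypothetical primer pairs and product sizes, for illustration only.
    demo = [("famA", 200), ("famB", 210), ("famC", 300),
            ("famD", 400), ("famE", 405), ("famF", 500)]
    for i, well in enumerate(assign_wells(demo), start=1):
        print("well", i, well)
    print(fisher_one_sided(3, 3, 0, 3))
```

With three pairs per well as in claim 7, the six hypothetical pairs above fit in two wells. A small p-value from `fisher_one_sided` would be consistent with the "greater proportion" of claim 23, though it does not by itself establish that the virus causes the disease.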
PCT/GB2002/002642 2001-06-07 2002-06-07 Virus detection using degenerate pcr primers WO2002099130A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2002345244A AU2002345244A1 (en) 2001-06-07 2002-06-07 Virus detection using degenerate pcr primers

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0113907.0 2001-06-07
GB0113907A GB0113907D0 (en) 2001-06-07 2001-06-07 Virus detection using degenerate PCR primers

Publications (2)

Publication Number Publication Date
WO2002099130A2 true WO2002099130A2 (en) 2002-12-12
WO2002099130A3 WO2002099130A3 (en) 2003-09-04

Family

ID=9916127

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB2002/002642 WO2002099130A2 (en) 2001-06-07 2002-06-07 Virus detection using degenerate pcr primers

Country Status (3)

Country Link
AU (1) AU2002345244A1 (en)
GB (1) GB0113907D0 (en)
WO (1) WO2002099130A2 (en)

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006094238A2 (en) * 2005-03-03 2006-09-08 Isis Pharmaceuticals, Inc. Compositions for use in identification of adventitious viruses
WO2007100397A3 (en) * 2005-11-28 2008-03-06 Isis Pharmaceuticals Inc Compositions for use in identification of adventitious contaminant viruses
US7956175B2 (en) 2003-09-11 2011-06-07 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US7964343B2 (en) 2003-05-13 2011-06-21 Ibis Biosciences, Inc. Method for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US8017358B2 (en) 2001-03-02 2011-09-13 Ibis Biosciences, Inc. Method for rapid detection and identification of bioagents
US8026084B2 (en) 2005-07-21 2011-09-27 Ibis Biosciences, Inc. Methods for rapid identification and quantitation of nucleic acid variants
US8057993B2 (en) 2003-04-26 2011-11-15 Ibis Biosciences, Inc. Methods for identification of coronaviruses
US8073627B2 (en) 2001-06-26 2011-12-06 Ibis Biosciences, Inc. System for indentification of pathogens
US8084207B2 (en) 2005-03-03 2011-12-27 Ibis Bioscience, Inc. Compositions for use in identification of papillomavirus
US8097416B2 (en) 2003-09-11 2012-01-17 Ibis Biosciences, Inc. Methods for identification of sepsis-causing bacteria
US8148163B2 (en) 2008-09-16 2012-04-03 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US8158354B2 (en) 2003-05-13 2012-04-17 Ibis Biosciences, Inc. Methods for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US8158936B2 (en) 2009-02-12 2012-04-17 Ibis Biosciences, Inc. Ionization probe assemblies
US8163895B2 (en) 2003-12-05 2012-04-24 Ibis Biosciences, Inc. Compositions for use in identification of orthopoxviruses
US8173957B2 (en) 2004-05-24 2012-05-08 Ibis Biosciences, Inc. Mass spectrometry with selective ion filtration by digital thresholding
US8187814B2 (en) 2004-02-18 2012-05-29 Ibis Biosciences, Inc. Methods for concurrent identification and quantification of an unknown bioagent
US8214154B2 (en) 2001-03-02 2012-07-03 Ibis Biosciences, Inc. Systems for rapid identification of pathogens in humans and animals
US8268565B2 (en) 2001-03-02 2012-09-18 Ibis Biosciences, Inc. Methods for identifying bioagents
US8298760B2 (en) 2001-06-26 2012-10-30 Ibis Bioscience, Inc. Secondary structure defining database and methods for determining identity and geographic origin of an unknown bioagent thereby
US8407010B2 (en) 2004-05-25 2013-03-26 Ibis Biosciences, Inc. Methods for rapid forensic analysis of mitochondrial DNA
US8534447B2 (en) 2008-09-16 2013-09-17 Ibis Biosciences, Inc. Microplate handling systems and related computer program products and methods
US8546082B2 (en) 2003-09-11 2013-10-01 Ibis Biosciences, Inc. Methods for identification of sepsis-causing bacteria
US8550694B2 (en) 2008-09-16 2013-10-08 Ibis Biosciences, Inc. Mixing cartridges, mixing stations, and related kits, systems, and methods
US8563250B2 (en) 2001-03-02 2013-10-22 Ibis Biosciences, Inc. Methods for identifying bioagents
US8822156B2 (en) 2002-12-06 2014-09-02 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US8871471B2 (en) 2007-02-23 2014-10-28 Ibis Biosciences, Inc. Methods for rapid forensic DNA analysis
US8950604B2 (en) 2009-07-17 2015-02-10 Ibis Biosciences, Inc. Lift and mount apparatus
US9149473B2 (en) 2006-09-14 2015-10-06 Ibis Biosciences, Inc. Targeted whole genome amplification method for identification of pathogens
US9194877B2 (en) 2009-07-17 2015-11-24 Ibis Biosciences, Inc. Systems for bioagent indentification
US9434997B2 (en) 2007-08-24 2016-09-06 Lawrence Livermore National Security, Llc Methods, compounds and systems for detecting a microorganism in a sample
US9598724B2 (en) 2007-06-01 2017-03-21 Ibis Biosciences, Inc. Methods and compositions for multiple displacement amplification of nucleic acids
EP3221472A4 (en) * 2014-11-21 2017-11-22 Nantomics, LLC Systems and methods for identification and differentiation of viral infection
US9873906B2 (en) 2004-07-14 2018-01-23 Ibis Biosciences, Inc. Methods for repairing degraded DNA
US9890408B2 (en) 2009-10-15 2018-02-13 Ibis Biosciences, Inc. Multiple displacement amplification

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5478724A (en) * 1991-08-16 1995-12-26 The Rockefeller University Lentivirus-specific nucleotide probes and methods of use
US5639871A (en) * 1988-09-09 1997-06-17 Roche Molecular Systems, Inc. Detection of human papillomavirus by the polymerase chain reaction
US6168917B1 (en) * 1996-10-02 2001-01-02 The United States Of America As Represented By The Department Of Health And Human Services Detection and identification of non-polio enteroviruses
WO2002099129A2 (en) * 2001-06-07 2002-12-12 University College London Designing degenerate pcr primers

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KINGHAM B F ET AL: "Identification of avian infectious bronchitis virus by direct automated cycle sequencing of the S-1 gene." AVIAN DISEASES, vol. 44, no. 2, April 2000 (2000-04), pages 325-335, XP009011612 ISSN: 0005-2086 *
STEPHENSEN CHARLES B ET AL: "Phylogenetic analysis of a highly conserved region of the polymerase gene from 11 coronaviruses and development of a consensus polymerase chain reaction assay." VIRUS RESEARCH, vol. 60, no. 2, April 1999 (1999-04), pages 181-189, XP002242737 ISSN: 0168-1702 *
VOELTER CHRISTIANE ET AL: "Screening human tumor samples with a broad-spectrum polymerase chain reaction method for the detection of polyomaviruses." VIROLOGY, vol. 237, no. 2, 27 October 1997 (1997-10-27), pages 389-396, XP002242736 ISSN: 0042-6822 *

Cited By (59)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9752184B2 (en) 2001-03-02 2017-09-05 Ibis Biosciences, Inc. Methods for rapid forensic analysis of mitochondrial DNA and characterization of mitochondrial DNA heteroplasmy
US9416424B2 (en) 2001-03-02 2016-08-16 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US8815513B2 (en) 2001-03-02 2014-08-26 Ibis Biosciences, Inc. Method for rapid detection and identification of bioagents in epidemiological and forensic investigations
US8802372B2 (en) 2001-03-02 2014-08-12 Ibis Biosciences, Inc. Methods for rapid forensic analysis of mitochondrial DNA and characterization of mitochondrial DNA heteroplasmy
US8563250B2 (en) 2001-03-02 2013-10-22 Ibis Biosciences, Inc. Methods for identifying bioagents
US8268565B2 (en) 2001-03-02 2012-09-18 Ibis Biosciences, Inc. Methods for identifying bioagents
US8265878B2 (en) 2001-03-02 2012-09-11 Ibis Bioscience, Inc. Method for rapid detection and identification of bioagents
US8017358B2 (en) 2001-03-02 2011-09-13 Ibis Biosciences, Inc. Method for rapid detection and identification of bioagents
US8214154B2 (en) 2001-03-02 2012-07-03 Ibis Biosciences, Inc. Systems for rapid identification of pathogens in humans and animals
US8073627B2 (en) 2001-06-26 2011-12-06 Ibis Biosciences, Inc. System for indentification of pathogens
US8921047B2 (en) 2001-06-26 2014-12-30 Ibis Biosciences, Inc. Secondary structure defining database and methods for determining identity and geographic origin of an unknown bioagent thereby
US8298760B2 (en) 2001-06-26 2012-10-30 Ibis Bioscience, Inc. Secondary structure defining database and methods for determining identity and geographic origin of an unknown bioagent thereby
US9725771B2 (en) 2002-12-06 2017-08-08 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US8822156B2 (en) 2002-12-06 2014-09-02 Ibis Biosciences, Inc. Methods for rapid identification of pathogens in humans and animals
US8057993B2 (en) 2003-04-26 2011-11-15 Ibis Biosciences, Inc. Methods for identification of coronaviruses
US8158354B2 (en) 2003-05-13 2012-04-17 Ibis Biosciences, Inc. Methods for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US8476415B2 (en) 2003-05-13 2013-07-02 Ibis Biosciences, Inc. Methods for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US7964343B2 (en) 2003-05-13 2011-06-21 Ibis Biosciences, Inc. Method for rapid purification of nucleic acids for subsequent analysis by mass spectrometry by solution capture
US8097416B2 (en) 2003-09-11 2012-01-17 Ibis Biosciences, Inc. Methods for identification of sepsis-causing bacteria
US7956175B2 (en) 2003-09-11 2011-06-07 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US8013142B2 (en) 2003-09-11 2011-09-06 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US8546082B2 (en) 2003-09-11 2013-10-01 Ibis Biosciences, Inc. Methods for identification of sepsis-causing bacteria
US8163895B2 (en) 2003-12-05 2012-04-24 Ibis Biosciences, Inc. Compositions for use in identification of orthopoxviruses
US8187814B2 (en) 2004-02-18 2012-05-29 Ibis Biosciences, Inc. Methods for concurrent identification and quantification of an unknown bioagent
US9447462B2 (en) 2004-02-18 2016-09-20 Ibis Biosciences, Inc. Methods for concurrent identification and quantification of an unknown bioagent
US8173957B2 (en) 2004-05-24 2012-05-08 Ibis Biosciences, Inc. Mass spectrometry with selective ion filtration by digital thresholding
US8987660B2 (en) 2004-05-24 2015-03-24 Ibis Biosciences, Inc. Mass spectrometry with selective ion filtration by digital thresholding
US9449802B2 (en) 2004-05-24 2016-09-20 Ibis Biosciences, Inc. Mass spectrometry with selective ion filtration by digital thresholding
US8407010B2 (en) 2004-05-25 2013-03-26 Ibis Biosciences, Inc. Methods for rapid forensic analysis of mitochondrial DNA
US9873906B2 (en) 2004-07-14 2018-01-23 Ibis Biosciences, Inc. Methods for repairing degraded DNA
US8182992B2 (en) 2005-03-03 2012-05-22 Ibis Biosciences, Inc. Compositions for use in identification of adventitious viruses
WO2006094238A2 (en) * 2005-03-03 2006-09-08 Isis Pharmaceuticals, Inc. Compositions for use in identification of adventitious viruses
US8084207B2 (en) 2005-03-03 2011-12-27 Ibis Bioscience, Inc. Compositions for use in identification of papillomavirus
WO2006094238A3 (en) * 2005-03-03 2007-02-08 Isis Pharmaceuticals Inc Compositions for use in identification of adventitious viruses
US8551738B2 (en) 2005-07-21 2013-10-08 Ibis Biosciences, Inc. Systems and methods for rapid identification of nucleic acid variants
US8026084B2 (en) 2005-07-21 2011-09-27 Ibis Biosciences, Inc. Methods for rapid identification and quantitation of nucleic acid variants
JP2009519010A (en) * 2005-11-28 2009-05-14 アイビス バイオサイエンシズ インコーポレイティッド Composition for use in the identification of foreign contaminants
WO2007100397A3 (en) * 2005-11-28 2008-03-06 Isis Pharmaceuticals Inc Compositions for use in identification of adventitious contaminant viruses
US9149473B2 (en) 2006-09-14 2015-10-06 Ibis Biosciences, Inc. Targeted whole genome amplification method for identification of pathogens
US8871471B2 (en) 2007-02-23 2014-10-28 Ibis Biosciences, Inc. Methods for rapid forensic DNA analysis
US9598724B2 (en) 2007-06-01 2017-03-21 Ibis Biosciences, Inc. Methods and compositions for multiple displacement amplification of nucleic acids
US9434997B2 (en) 2007-08-24 2016-09-06 Lawrence Livermore National Security, Llc Methods, compounds and systems for detecting a microorganism in a sample
US8534447B2 (en) 2008-09-16 2013-09-17 Ibis Biosciences, Inc. Microplate handling systems and related computer program products and methods
US8148163B2 (en) 2008-09-16 2012-04-03 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US9023655B2 (en) 2008-09-16 2015-05-05 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US9027730B2 (en) 2008-09-16 2015-05-12 Ibis Biosciences, Inc. Microplate handling systems and related computer program products and methods
US8609430B2 (en) 2008-09-16 2013-12-17 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US8550694B2 (en) 2008-09-16 2013-10-08 Ibis Biosciences, Inc. Mixing cartridges, mixing stations, and related kits, systems, and methods
US8252599B2 (en) 2008-09-16 2012-08-28 Ibis Biosciences, Inc. Sample processing units, systems, and related methods
US8158936B2 (en) 2009-02-12 2012-04-17 Ibis Biosciences, Inc. Ionization probe assemblies
US9165740B2 (en) 2009-02-12 2015-10-20 Ibis Biosciences, Inc. Ionization probe assemblies
US8796617B2 (en) 2009-02-12 2014-08-05 Ibis Biosciences, Inc. Ionization probe assemblies
US8950604B2 (en) 2009-07-17 2015-02-10 Ibis Biosciences, Inc. Lift and mount apparatus
US9194877B2 (en) 2009-07-17 2015-11-24 Ibis Biosciences, Inc. Systems for bioagent indentification
US9890408B2 (en) 2009-10-15 2018-02-13 Ibis Biosciences, Inc. Multiple displacement amplification
EP3221472A4 (en) * 2014-11-21 2017-11-22 Nantomics, LLC Systems and methods for identification and differentiation of viral infection
JP2017535270A (en) * 2014-11-21 2017-11-30 ナントミクス,エルエルシー System and method for identification and differentiation of viral infections
CN107429302A (en) * 2014-11-21 2017-12-01 南托米克斯有限责任公司 System and method for the identification and differentiation of virus infection
AU2015349661B2 (en) * 2014-11-21 2019-05-16 Nantomics, Llc Systems and methods for identification and differentiation of viral infection

Also Published As

Publication number Publication date
GB0113907D0 (en) 2001-08-01
WO2002099130A3 (en) 2003-09-04
AU2002345244A1 (en) 2002-12-16

Similar Documents

Publication Publication Date Title
WO2002099130A2 (en) Virus detection using degenerate pcr primers
WO2002099129A2 (en) Designing degenerate pcr primers
Sabanadzovic et al. Southern tomato virus: the link between the families Totiviridae and Partitiviridae
Ito et al. Discrimination between dog-related and vampire bat-related rabies viruses in Brazil by strain-specific reverse transcriptase-polymerase chain reaction and restriction fragment length polymorphism analysis
Laor et al. Detection of FMDV RNA amplified by the polymerase chain reaction (PCR)
CN107849618A (en) Genetic marker for identifying and detecting causative viruses of infectious disease in aquatic organisms, and method for identifying and detecting the causative virus using the same
CN110628944A (en) Duck type-3 adenovirus and duck novel reovirus differential diagnosis kit
CN108220393B (en) High-throughput method for rapid detection of mitochondrial gene mutations or deletions
Khaldi et al. Genetic characterization of three ovine breeds in Tunisia using randomly amplified polymorphic DNA markers
Chan et al. Differential diagnosis of orf viruses by a single-step PCR
JP6436598B1 (en) Primer set for specifically amplifying nucleic acid derived from strawberry pathogenic virus and method for detecting strawberry pathogenic virus
CN112322768B (en) Rapid RPA detection method for diagnosing sea-buckthorn branch blight and pathogenic bacteria
CN111454943A (en) Novel coronavirus detection kit
CN115094164A (en) Multiplex qPCR (quantitative polymerase chain reaction) kit and detection method for ASFV (African swine fever virus) with different gene deletion types
CN104611420A (en) Tubercle bacillus detection kit
Murakami et al. Nucleotide sequence and polymerase chain reaction/restriction fragment length polymorphism analyses of Aleutian disease virus in ferrets in Japan
Diaz-Lara et al. Development of RT-PCR degenerate primers to overcome the high genetic diversity of grapevine virus T
KR102019950B1 (en) Novel primer set for detecting rotavirus and kit to distinguish wild rotavirus and vaccine rotavirus using the same
KR101617142B1 (en) Nucleic acid test based avian influenza virus detection kit with improved detection accuracy
CN112680546A (en) Specific amplification primer pair and fluorescent quantitative PCR kit
JP3118572B2 (en) Simultaneous detection of citrus tatter leaf virus and citrus viroid by RT-PCR
CN107988429B (en) Reagent for detecting rabies virus and application thereof
CN112831608A (en) Primer for detecting goat endemic intranasal tumor virus and application of primer in HRM detection reagent
JPWO2020138435A1 (en) Auxiliary diagnostic method for intraocular malignant lymphoma
CN111500782A (en) Establishment and application of novel HIV-1 reverse transcriptase drug-resistant mutation site detection method

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP