WO2009017902A2 - Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis - Google Patents

Publication number
WO2009017902A2
Authority
WO
WIPO (PCT)
Prior art date
Application number
PCT/US2008/067911
Other languages
French (fr)
Other versions
WO2009017902A3 (en)
Inventor
Christian Massire
Rangarajan Sampath
Lawrence B. Blyn
David J. Ecker
Original Assignee
Ibis Biosciences, Inc.
Priority date
Filing date
Publication date
Application filed by Ibis Biosciences, Inc.
Priority to US12/666,239 (US20110105531A1)
Priority to EP08826817A (EP2179041A4)
Publication of WO2009017902A2
Publication of WO2009017902A3
Priority to ZA2010/00218A (ZA201000218B)

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6844 Nucleic acid amplification reactions
    • C12Q1/686 Polymerase chain reaction [PCR]
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2565/00 Nucleic acid analysis characterised by mode or means of detection
    • C12Q2565/60 Detection means characterised by use of a special device
    • C12Q2565/627 Detection means characterised by use of a special device being a mass spectrometer
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/16 Primer sets for multiplex assays

Definitions

  • the present invention provides compositions, kits and methods for rapid identification of subspecies characteristics of Mycobacterium tuberculosis by molecular mass and base composition analysis.
  • a problem in determining the cause of a natural infectious outbreak or a bioterrorist attack is the sheer variety of organisms that can cause human disease. There are over 1400 organisms infectious to humans; many of these have the potential to emerge suddenly in a natural epidemic or to be used in a malicious attack by bioterrorists (Taylor et al. Philos. Trans. R. Soc. London B. Biol. Sci., 2001, 356, 983-989). This number does not include numerous strain variants, bioengineered versions, or pathogens that infect plants or animals.
  • PCR: polymerase chain reaction
  • Selection for drug-resistant mutants in patients mainly occurs when patients are treated inappropriately or are exposed to, even transiently, subtherapeutic drug levels, conditions that may provide adequate positive selection pressure for the emergence and maintenance of drug-resistant organisms de novo.
  • One of the contributing factors is the exceptional length of chemotherapy required to treat and cure infection with Mycobacterium tuberculosis. The need to maintain high drug levels over many months of treatment, combined with the inherent toxicity of the agents, results in reduced patient compliance and subsequently higher likelihood of acquisition of drug resistance. Therefore, in addition to identifying new antituberculosis agents, the need for shortening the length of chemotherapy is paramount, as it would greatly impact clinical management and the emergence of drug resistance.
  • MDRTB: tuberculosis resistant to multiple drugs
  • INH: isoniazid
  • RIF: rifampin
  • Mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated. DNA chips with specific probes can only determine the presence or absence of specifically anticipated organisms. Because there are hundreds of thousands of species of benign bacteria, some very similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed to identify a particular organism.
  • the present invention provides oligonucleotide primers and compositions and kits containing the oligonucleotide primers, which define bacterial bioagent identifying amplicons and, upon amplification, produce corresponding amplification products whose molecular masses provide the means to identify subspecies characteristics of Mycobacterium tuberculosis at and below the species taxonomic level.
  • the present invention provides a method of identifying a Mycobacterium tuberculosis genotype in a sample comprising obtaining a sample suspected of containing Mycobacterium tuberculosis, isolating nucleic acid from the sample, contacting the nucleic acid with one or more primer pairs configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying the nucleic acid with the primers such that one or more amplification products corresponding to bioagent identifying amplicons are produced, and measuring the molecular masses of the one or more amplification products, thereby identifying the Mycobacterium tuberculosis genotype.
  • the method comprises calculating base compositions of the amplification products from the molecular masses. In other embodiments the method comprises comparing the molecular masses or the base compositions with a database containing molecular masses or base compositions of bioagent identifying amplicons of genotypes of Mycobacterium tuberculosis, wherein the bioagent identifying amplicons are defined by the one or more primer pairs.
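The database comparison described above can be sketched as a simple lookup. This is only an illustration under stated assumptions: the database layout, the function name `match_genotypes`, the genotype labels and every base composition below are hypothetical placeholders, not data from the patent.

```python
# Hypothetical reference database:
# primer pair number -> {genotype label: base composition of the amplicon}.
# All entries are invented placeholders, not values from the patent.
REFERENCE_DB = {
    3600: {
        "genotype_X": {"A": 20, "G": 30, "C": 35, "T": 15},
        "genotype_Y": {"A": 21, "G": 29, "C": 35, "T": 15},
    },
}


def match_genotypes(primer_pair, measured):
    """Return the labels whose stored base composition equals the measured one."""
    entries = REFERENCE_DB.get(primer_pair, {})
    return sorted(label for label, comp in entries.items() if comp == measured)


# A measured composition that matches exactly one placeholder entry.
hits = match_genotypes(3600, {"A": 21, "G": 29, "C": 35, "T": 15})
```

An exact-match lookup is the simplest choice; a real comparison might instead tolerate small deviations or combine evidence from several primer pairs.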
  • the one or more primer pairs is a primer pair having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair number 3600 (SEQ ID NOs: 1515:1538).
  • the one or more primer pairs further comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:15
  • the one or more primer pairs further comprises five or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:15
  • the one or more primer pairs comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers selected from the group consisting of: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1493:1517
  • the one or more primer pairs comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers selected from the group consisting of: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NO
  • the present invention provides a method wherein the Mycobacterium tuberculosis genotype is distinguished from Mycobacterium africanum, Mycobacterium bovis, Mycobacterium microti, and Mycobacterium canettii.
  • the Mycobacterium tuberculosis genotype comprises a drug-resistant strain of Mycobacterium tuberculosis.
  • the drug-resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
  • the drug-resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
  • three or more of the primer pairs are combined in a multiplex reaction to produce a plurality of amplification products corresponding to bioagent identifying amplicons.
  • the molecular masses are measured by mass spectrometry.
  • the sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, and tissue biopsy.
  • the sample comprises a population of distinct genotypes of Mycobacterium tuberculosis.
  • the present invention provides an oligonucleotide primer pair comprising a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer has at least 70% sequence identity with SEQ ID NO: 1515 and the reverse primer has at least 70% sequence identity with SEQ ID NO: 1538.
  • the forward primer of the primer pair comprises at least 80% sequence identity with SEQ ID NO: 1515.
  • the forward primer comprises at least 90% sequence identity with SEQ ID NO: 1515.
  • the forward primer is SEQ ID NO: 1515.
  • the reverse primer of the primer pair comprises at least 80% sequence identity with SEQ ID NO: 1538.
  • the reverse primer comprises at least 90% sequence identity with SEQ ID NO: 1538.
  • the reverse primer is SEQ ID NO: 1538.
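The length and sequence-identity thresholds above can be illustrated with a short check. This is a minimal sketch, assuming that identity is counted position by position between two equal-length, ungapped sequences and that the 13 to 35 nucleotide range is inclusive; the excerpt does not define either point. The function names and the sequences are hypothetical and are not SEQ ID NO: 1515 or 1538.

```python
def percent_identity(primer, reference):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(primer) != len(reference):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(p == r for p, r in zip(primer.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_threshold(primer, reference, threshold=70.0):
    """True if the primer is 13 to 35 nt long and meets the identity threshold."""
    return 13 <= len(primer) <= 35 and percent_identity(primer, reference) >= threshold


# Hypothetical 20-nt reference and a candidate differing at two positions.
reference = "ACGTACGTACGTACGTACGT"
candidate = "TCGTACGTACGTACGTACGA"
identity = percent_identity(candidate, reference)
```

Raising `threshold` to 80.0, 90.0 or 100.0 mirrors the progressively narrower embodiments listed above.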
  • the present invention provides a kit for identifying a Mycobacterium tuberculosis genotype in a sample comprising a first oligonucleotide primer pair comprising a forward primer and a reverse primer, each configured to hybridize to a Mycobacterium tuberculosis gyrB gene, and each between 13 and 35 linked nucleotides in length wherein the forward primer has at least 70% sequence identity with SEQ ID NO: 1515 and the reverse primer has at least 70% sequence identity with SEQ ID NO: 1538, and at least one additional primer pair wherein the primers of each of the at least one additional primer pair are configured to hybridize to sequence regions within a Mycobacterium tuberculosis gene selected from the group consisting of: rpoB, embB, fabG, inhA, katG, gyrA, pncA, prcA, rv2348c, rv3815c, rv0147, erg3, r
  • each of the at least one additional primer pairs is a primer pair comprising a forward primer and a reverse primer, the forward primer and the reverse primer each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding forward and reverse primers of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (
  • the present invention provides a kit for identifying a Mycobacterium tuberculosis genotype in a sample comprising a first oligonucleotide primer pair comprising a forward primer and a reverse primer, each configured to hybridize to a Mycobacterium tuberculosis gyrB gene, and each between 13 and 35 linked nucleotides in length selected from the group consisting of: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546
  • the present invention provides a method for identifying a drug-resistant strain of Mycobacterium tuberculosis comprising obtaining a sample suspected of containing Mycobacterium tuberculosis, isolating nucleic acid from the sample, contacting the nucleic acid with a primer pair configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying the nucleic acid with the primer pair to obtain an amplification product containing a mutation of a codon known to confer drug resistance upon
  • the method comprises calculating a base composition of the amplification product from the molecular mass, thereby identifying a base composition for the codon.
  • the primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer and the reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 35
  • the primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer and the reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 42
  • the drug-resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
  • the drug-resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
  • molecular mass is measured by mass spectrometry.
  • the sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, tissue biopsy, tissue swab, tissue aspirate, abscess biopsy, and cerebrospinal fluid.
  • the sample comprises a population of distinct genotypes of Mycobacterium tuberculosis.
  • the population of distinct genotypes comprises a drug-resistant genotype and a drug-sensitive genotype.
  • the present invention provides a method of treating a human infected with a drug-resistant strain of Mycobacterium tuberculosis comprising obtaining a sample from a human infected with Mycobacterium tuberculosis, isolating nucleic acid from the sample, contacting the nucleic acid with a primer pair configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying the nucleic acid with the primer pair to obtain an amplification product containing a mutation of a codon known to confer drug resistance upon Mycobacterium tuberculosis, measuring the molecular mass of the amplification product, thereby identifying the drug-resistant strain of Mycobacterium tuberculosis, selecting one or more alternative drugs to which the drug-resistant strain is not resistant, and administering the alternative drugs to the human.
  • the primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer and the reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559
  • the drug-resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
  • the drug-resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
  • the molecular mass is measured by mass spectrometry.
  • the sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, and tissue biopsy.
  • the sample comprises a population of distinct genotypes of Mycobacterium tuberculosis.
  • the population of distinct genotypes comprises a drug-resistant genotype and a drug-sensitive genotype.
  • the present invention provides a method for determining the identity and quantity of Mycobacterium tuberculosis in a sample comprising contacting the sample with a pair of primers and a known quantity of a calibration polynucleotide comprising a calibration sequence, concurrently amplifying nucleic acid from the Mycobacterium tuberculosis in the sample with the pair of primers and amplifying nucleic acid from the calibration polynucleotide in the sample with the pair of primers to obtain a first amplification product comprising a Mycobacterium tuberculosis identifying amplicon and a second amplification product comprising a calibration amplicon, obtaining molecular mass and abundance data for the Mycobacterium tuberculosis identifying amplicon and for the calibration amplicon wherein the 5' and 3' ends of the Mycobacterium tuberculosis identifying amplicon and the calibration amplicon are the sequences of the pair of primers or complements thereof,
  • the primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer and the reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559
  • the calibration polynucleotide is selected from the group consisting of: calibration polynucleotide SEQ ID NO. 1561, calibration polynucleotide SEQ ID NO. 1562, calibration polynucleotide SEQ ID NO. 1563, and calibration polynucleotide SEQ ID NO. 1564.
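The calibration idea above (a known quantity of calibrant co-amplified with the same primer pair, followed by abundance measurements for both amplicons) could be turned into a quantity estimate along the following lines. This is a sketch under an assumption that the excerpt does not state: that the ratio of measured abundances equals the ratio of starting quantities. The function name and all numbers are hypothetical.

```python
def estimate_quantity(bioagent_abundance, calibrant_abundance, calibrant_copies):
    """Estimate starting copies of the target from the abundance ratio.

    Assumes (not stated in the excerpt) that both templates amplify with the
    same efficiency, so abundance ratio == starting quantity ratio.
    """
    if calibrant_abundance <= 0:
        raise ValueError("calibrant amplicon was not detected")
    return calibrant_copies * bioagent_abundance / calibrant_abundance


# Hypothetical peak abundances from one spectrum, with 50 calibrant copies added.
copies = estimate_quantity(bioagent_abundance=300.0,
                           calibrant_abundance=100.0,
                           calibrant_copies=50)
```

Because both amplicons share the same primer-binding ends, using a single primer pair for target and calibrant is what makes the equal-efficiency assumption plausible.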
  • Figure 1 process diagram illustrating a representative primer pair selection process.
  • Figure 2 process diagram illustrating an embodiment of the calibration method.
  • Figure 3 common pathogenic bacteria and primer pair coverage.
  • the primer pair number in the upper right hand corner of each polygon indicates that the primer pair can produce a bioagent identifying amplicon for all species within that polygon.
  • Figure 4 a representative 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA).
  • the diagram indicates that the experimentally determined base compositions of the clinical samples (labeled NHRC samples) closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.
  • Figure 5 a representative mass spectrum of amplification products indicating the presence of bioagent identifying amplicons of Streptococcus pyogenes, Neisseria meningitidis, and Haemophilus influenzae obtained from amplification of nucleic acid from a clinical sample with primer pair number 349 which targets 23S rRNA. Experimentally determined molecular masses and base compositions for the sense strand of each amplification product are shown.
  • Figure 6 a representative mass spectrum of amplification products representing a bioagent identifying amplicon of Streptococcus pyogenes, and a calibration amplicon obtained from amplification of nucleic acid from a clinical sample with primer pair number 356 which targets rplB.
  • the experimentally determined molecular mass and base composition for the sense strand of the Streptococcus pyogenes amplification product is shown.
  • Figure 7 a representative mass spectrum of an amplified nucleic acid mixture which contained the Ames strain of Bacillus anthracis, a known quantity of combination calibration polynucleotide
  • Figure 8 a schematic representation of the phylogeny of the M. tuberculosis cluster indicating principal genetic groups (PPGs) including nine genotypes. Selected primer pair numbers used to distinguish PPGs, genotypes and species are indicated.
  • Figure 9 base compositions of amplification products using primer pair BCT3908 to amplify a region of the rpoB gene. Six critical mutations may be uniquely resolved compared to the wild type sequence (WT) using dedicated primer pairs.
  • Figure 10 base compositions of amplification products using primer pair BCT3552 to amplify a region of the inhA gene. Rare mutations may be simultaneously queried compared to wild type sequence (WT) using a shared primer pair.
  • Figure 11 a schematic representation of determination of resistance-conferring mutations by PCR/ESI-MS with resolution of mass spectra. Primer pairs sharing the same well yield amplicons of distinct lengths and base compositions from assay and internal calibrant templates.
  • Figure 12 an outline of the conventional process flow in tuberculosis diagnostic testing compared to molecular genotyping by PCR/ESI-MS.
  • Figure 13 sequences of calibration sequences SEQ ID NO. 1561, SEQ ID NO. 1562, SEQ ID NO. 1563, and SEQ ID NO. 1564.
  • the term “abundance” refers to an amount. The amount may be described in terms of concentration units that are common in molecular biology, such as “copy number” or “pfu” (plaque-forming unit), which are well known to those with ordinary skill. Concentration may be relative to a known standard or may be absolute.
  • the term “amplifiable nucleic acid” is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” also comprises “sample template.”
  • amplification refers to a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out. Template specificity is achieved in most amplification techniques by the choice of enzyme.
  • Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid.
  • MDV-1 RNA is the specific template for the replicase (D.L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]).
  • Other nucleic acid will not be replicated by this amplification enzyme.
  • this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]).
  • in the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (D.Y. Wu and R. B. Wallace, Genomics 4:560 [1989]).
  • Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]).
  • amplification reagents refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification, excluding primers, nucleic acid template, and the amplification enzyme.
  • amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
  • bioagent identifying amplicon “A” and bioagent identifying amplicon “B”, produced with the same pair of primers, are analogous with respect to each other.
  • Bioagent identifying amplicon “C”, produced with a different pair of primers, is not analogous to either bioagent identifying amplicon “A” or bioagent identifying amplicon “B”.
  • anion exchange functional group refers to a positively charged functional group capable of binding an anion through an electrostatic interaction.
  • anion exchange functional groups are the amines, including primary, secondary, tertiary and quaternary amines.
  • bacteria refers to any member of the groups of eubacteria and archaebacteria.
  • a "base composition” is the exact number of each nucleobase (for example, A, T, C and G) in a segment of nucleic acid.
  • amplification of nucleic acid of a Staphylococcus aureus strain carrying the lukS-PV gene with primer pair number 2095 produces an amplification product 117 nucleobases in length from nucleic acid of the lukS-PV gene that has a base composition of A35 G17 C19 T46 (by convention, with reference to the sense strand of the amplification product).
  • a measured molecular mass can be deconvoluted to a list of possible base compositions.
  • Identification of a base composition of a sense strand which is complementary to the corresponding antisense strand in terms of base composition provides a confirmation of the true base composition of an unknown amplification product.
  • the base composition of the antisense strand of the 117 nucleobase amplification product described above is A46 G19 C17 T35.
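The relationship between molecular mass, base composition and strand complementarity described above can be sketched in a few lines. This is an illustration, not the patent's algorithm: the residue masses are approximate average values for DNA nucleotide residues, terminal groups and adducts are ignored, the 0.5 Da tolerance is arbitrary, and all function names are hypothetical. The worked values use the A35 G17 C19 T46 sense-strand composition quoted above.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a strand.
# Terminal groups and salt adducts are ignored (a simplifying assumption).
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def strand_mass(comp):
    """Sum residue masses for a base composition such as {"A": 35, ...}."""
    return sum(RESIDUE_MASS[base] * count for base, count in comp.items())


def antisense(comp):
    """Base composition of the complementary strand (A<->T, G<->C)."""
    return {COMPLEMENT[base]: count for base, count in comp.items()}


def candidates(mass, length, tol):
    """Enumerate every composition of the given length within tol Da of mass."""
    found = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                comp = {"A": a, "G": g, "C": c, "T": length - a - g - c}
                if abs(strand_mass(comp) - mass) <= tol:
                    found.append(comp)
    return found


def confirmed(sense_mass, antisense_mass, length, tol=0.5):
    """Keep sense-strand candidates whose complement also fits the other strand."""
    anti = candidates(antisense_mass, length, tol)
    return [c for c in candidates(sense_mass, length, tol) if antisense(c) in anti]


SENSE = {"A": 35, "G": 17, "C": 19, "T": 46}   # 117 nucleobases in total
ANTISENSE = antisense(SENSE)                   # A46 G19 C17 T35
CONFIRMED = confirmed(strand_mass(SENSE), strand_mass(ANTISENSE), 117)
```

A single measured mass usually admits several compositions; requiring that the two strands be complementary in composition is what narrows the list, which is the confirmation step the text describes.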
  • a “base composition probability cloud” is a representation of the diversity in base composition resulting from a variation in sequence that occurs among different isolates of a given species.
  • the “base composition probability cloud” represents the base composition constraints for each species and is typically visualized using a pseudo four-dimensional plot.
  • a "bioagent” is any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus.
  • bioagents include, but are not limited to, cells (including but not limited to human clinical samples, bacterial cells and other pathogens), viruses, fungi, protists, parasites, and pathogenicity markers (including but not limited to: pathogenicity islands, antibiotic resistance genes, virulence factors, toxin genes and other bioregulating compounds). Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered.
  • a "pathogen” is a bioagent which causes a disease or disorder.
  • bioagent division is defined as a group of bioagents above the species level and includes, but is not limited to, orders, families, classes, clades, genera or other such groupings of bioagents above the species level.
  • bioagent identifying amplicon refers to a polynucleotide that is amplified from nucleic acid of a bioagent in an amplification reaction and which 1) provides sufficient variability to distinguish among bioagents from whose nucleic acid the bioagent identifying amplicon is produced and 2) whose molecular mass is amenable to a rapid and convenient molecular mass determination modality such as mass spectrometry, for example.
  • biological product refers to any product originating from an organism. Biological products are often products of processes of biotechnology. Examples of biological products include, but are not limited to: cultured cell lines, cellular components, antibodies, proteins and other cell-derived biomolecules, growth media, growth harvest fluids, natural products and bio-pharmaceutical products.
  • biowarfare agent and “bioweapon” are synonymous and refer to a bacterium, virus, fungus or protozoan that could be deployed as a weapon to cause bodily harm to individuals.
  • military or terrorist groups may be implicated in deployment of biowarfare agents.
  • the term "broad range survey primer pair” refers to a primer pair designed to produce bioagent identifying amplicons across different broad groupings of bioagents.
  • the ribosomal RNA-targeted primer pairs are broad range survey primer pairs which have the capability of producing bacterial bioagent identifying amplicons for essentially all known bacteria.
  • for broad range primer pairs employed for identification of bacteria, a broad range survey primer pair for bacteria such as 16S rRNA primer pair number 346 (SEQ ID NOs: 202:1110), for example, will produce a bacterial bioagent identifying amplicon for essentially all known bacteria.
  • calibration amplicon refers to a nucleic acid segment representing an amplification product obtained by amplification of a calibration sequence with a pair of primers designed to produce a bioagent identifying amplicon.
  • calibration sequence refers to a polynucleotide sequence to which a given pair of primers hybridizes for the purpose of producing an internal (i.e., included in the reaction) calibration standard amplification product for use in determining the quantity of a bioagent in a sample.
  • the calibration sequence may be expressly added to an amplification reaction, or may already be present in the sample prior to analysis.
  • clade primer pair refers to a primer pair designed to produce bioagent identifying amplicons for species belonging to a clade group.
  • a clade primer pair may also be considered as a "speciating" primer pair which is useful for distinguishing among closely related species.
  • codon refers to a set of three adjoined nucleotides (triplet) that codes for an amino acid or a termination signal.
  • the term "codon base composition analysis,” refers to determination of the base composition of an individual codon by obtaining a bioagent identifying amplicon that includes the codon.
  • the bioagent identifying amplicon will at least include regions of the target nucleic acid sequence to which the primers hybridize for generation of the bioagent identifying amplicon as well as the codon being analyzed, located between the two primer hybridization regions. Codon base composition analysis is particularly useful for interrogating codons suspected of containing mutations that confer drug resistance to bacterial and viral pathogens.
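One way to picture codon base composition analysis is to compare the base composition of a reference amplicon with that of an observed amplicon spanning the same codon. The sketch below is an illustration only: the compositions are hypothetical, the function names are not from the source, and it assumes the two amplicons differ by at most a single substitution.

```python
def composition_shift(reference, observed):
    """Per-base difference (observed minus reference), omitting unchanged bases."""
    return {
        base: observed[base] - reference[base]
        for base in "AGCT"
        if observed[base] != reference[base]
    }


def describe_single_substitution(reference, observed):
    """Return 'X->Y' when exactly one base was lost and one gained, else None."""
    shift = composition_shift(reference, observed)
    lost = [base for base, delta in shift.items() if delta == -1]
    gained = [base for base, delta in shift.items() if delta == 1]
    if len(shift) == 2 and len(lost) == 1 and len(gained) == 1:
        return f"{lost[0]}->{gained[0]}"
    return None


# Hypothetical compositions of a short amplicon that spans the codon of interest.
wild_type = {"A": 10, "G": 12, "C": 14, "T": 9}
mutant = {"A": 10, "G": 12, "C": 13, "T": 10}
print(describe_single_substitution(wild_type, mutant))     # C->T
print(describe_single_substitution(wild_type, wild_type))  # None
```

A composition shift identifies which base changed but not its position, which is why the amplicon is kept short enough that the codon is the only plausible site of variation.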
  • the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) related by the base-pairing rules. For example, the sequence “5'-A-G-T-3'” is complementary to the sequence “3'-T-C-A-5'.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids.
  • the degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. Either term may also be used in reference to individual nucleotides, especially within the context of polynucleotides. For example, a particular nucleotide within an oligonucleotide may be noted for its complementarity, or lack thereof, to a nucleotide within another nucleic acid strand, in contrast or comparison to the complementarity between the rest of the oligonucleotide and the nucleic acid strand.
  • nucleic acid sequence refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired with the 3' end of the other, is in "antiparallel association.”
  • Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids disclosed herein and include, for example, inosine and 7-deazaguanine. Complementarity need not be perfect; stable duplexes may contain mismatched base pairs or unmatched bases.
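The base-pairing rules and the distinction between partial and complete complementarity can be sketched as follows. This is a simplified illustration that assumes only the four standard DNA bases and two strands of equal length; modified bases such as inosine or 7-deazaguanine are not modeled, and the function names are illustrative.

```python
# Watson-Crick pairing partners (assumption: standard DNA bases only).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def fraction_complementary(strand_a, strand_b):
    """Fraction of paired positions when two 5'->3' strands align antiparallel."""
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch only compares strands of equal length")
    partner = strand_b.upper()[::-1]  # read strand_b in the 3'->5' direction
    matches = sum(PAIRS[a] == b for a, b in zip(strand_a.upper(), partner))
    return matches / len(strand_a)


# 5'-A-G-T-3' pairs with 3'-T-C-A-5', which reads ACT when written 5'->3'.
print(reverse_complement("AGT"))             # ACT
print(fraction_complementary("AGT", "ACT"))  # 1.0 (complete complementarity)
print(fraction_complementary("AGT", "ACC"))  # partial: two of three positions pair
```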
  • oligonucleotide is complementary to a region of a target nucleic acid and a second oligonucleotide has complementarity to the same region (or a portion of this region), a "region of overlap" exists along the target nucleic acid. The degree of overlap will vary depending upon the extent of the complementarity.
  • the term "division-wide primer pair” refers to a primer pair designed to produce bioagent identifying amplicons within sections of a broader spectrum of bioagents.
  • primer pair number 352 SEQ ID NOs: 687:1411
  • a division-wide primer pair is designed to produce bacterial bioagent identifying amplicons for members of the Bacillus group of bacteria which comprises, for example, members of the genera Streptococci, Enterococci, and Staphylococci.
  • Other division-wide primer pairs may be used to produce bacterial bioagent identifying amplicons for other groups of bacterial bioagents.
  • the term “concurrently amplifying” used with respect to more than one amplification reaction refers to the act of simultaneously amplifying more than one nucleic acid in a single reaction mixture.
  • the term "drill-down primer pair” refers to a primer pair designed to produce bioagent identifying amplicons for identification of sub-species characteristics or confirmation of a species assignment.
  • primer pair number 2146 SEQ ID NOs: 437:11307
  • a drill-down Staphylococcus aureus genotyping primer pair is designed to produce Staphylococcus aureus genotyping amplicons.
  • Other drill-down primer pairs may be used to produce bioagent identifying amplicons for Staphylococcus aureus and other bacterial species.
  • duplex refers to the state of nucleic acids in which the base portions of the nucleotides on one strand are bound through hydrogen bonding to their complementary bases arrayed on a second strand.
  • the condition of being in a duplex form reflects on the state of the bases of a nucleic acid.
  • the strands of nucleic acid also generally assume the tertiary structure of a double helix, having a major and a minor groove. The assumption of the helical form is implicit in the act of becoming duplexed.
  • the term "etiology” refers to the causes or origins of diseases or abnormal physiological conditions.
  • RNA having a non-coding function e.g., a ribosomal or transfer RNA
  • the RNA or polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or function is retained.
  • sequence identity is meant to be properly determined when the query sequence and the subject sequence are both described and aligned in the 5' to 3' direction. Sequence alignment algorithms such as BLAST will return results in two different alignment orientations. In the Plus/Plus orientation, both the query sequence and the subject sequence are aligned in the 5' to 3' direction. On the other hand, in the Plus/Minus orientation, the query sequence is in the 5' to 3' direction while the subject sequence is in the 3' to 5' direction. It should be understood that with respect to the primers disclosed herein, sequence identity is properly determined when the alignment is designated as Plus/Plus.
  • Sequence identity may also encompass alternate or modified nucleobases that perform in a functionally similar manner to the regular nucleobases adenine, thymine, guanine and cytosine with respect to hybridization and primer extension in amplification reactions.
  • the two primers will have 100% sequence identity with each other.
  • Inosine (I) may be used as a replacement for G or T and effectively hybridize to C, A or U (uracil).
  • inosine replaces one or more C, A or U residues in one primer which is otherwise identical to another primer in sequence and length
  • the two primers will have 100% sequence identity with each other.
  • Other such modified or universal bases may exist which would perform in a functionally similar manner for hybridization and amplification reactions and will be understood to fall within this definition of sequence identity.
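As a rough illustration of this definition, percent identity between two primers of equal length can be computed position by position, with a universal base counted as a match. The sketch below is an assumption-laden simplification: it treats inosine ("I") as matching any base, whereas the text limits the residues inosine stands in for, and it assumes both primers are already written 5' to 3' (the Plus/Plus orientation). The primer sequences are hypothetical.

```python
def percent_identity(primer_a, primer_b, universal=("I",)):
    """Percent of positions that match, counting universal bases as matches."""
    if len(primer_a) != len(primer_b):
        raise ValueError("this sketch only compares primers of equal length")
    matches = 0
    for a, b in zip(primer_a.upper(), primer_b.upper()):
        if a == b or a in universal or b in universal:
            matches += 1
    return 100.0 * matches / len(primer_a)


# Hypothetical primers, both written 5'->3'.
print(percent_identity("ACGTAC", "ACITAC"))  # 100.0 (inosine counted as a match)
print(percent_identity("ACGTAC", "ACCTAC"))  # one mismatch in six positions
```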
  • Housekeeping gene refers to a gene encoding a protein or RNA involved in basic functions required for survival and reproduction of a bioagent. Housekeeping genes include, but are not limited to genes encoding RNA or proteins involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like.
  • hybridization is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are influenced by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, and the Tm of the formed hybrid. "Hybridization” methods involve the annealing of one nucleic acid to another, complementary nucleic acid, i.e., a nucleic acid having a complementary nucleotide sequence. The ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction is a well-recognized phenomenon.
  • ePCR electronic PCR
  • intelligent primers are primers that are designed to bind to highly conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and, upon amplification, yield amplification products which ideally provide enough variability to distinguish individual bioagents, and which are amenable to molecular mass analysis.
  • highly conserved it is meant that the sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity among all, or at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% of species or strains.
  • LCR has also been used in combination with PCR to achieve enhanced detection of single-base changes.
  • the four oligonucleotides used in this assay can pair to form two short ligatable fragments, there is the potential for the generation of target-independent background signal.
  • the use of LCR for mutant screening is limited to the examination of specific nucleic acid positions.
  • LNA locked nucleic acid
  • LNA refers to a nucleic acid analogue containing one or more 2'-O,4'-C-methylene-β-D-ribofuranosyl nucleotide monomers in an RNA mimicking sugar conformation.
  • LNA oligonucleotides display unprecedented hybridization affinity toward complementary single-stranded RNA and complementary single- or double-stranded DNA. LNA oligonucleotides induce A-type (RNA-like) duplex conformations.
  • the primers disclosed herein may contain LNA modifications.
  • mass-modifying tag refers to any modification to a given nucleotide which results in an increase in mass relative to the analogous non-mass modified nucleotide.
  • Mass-modifying tags can include heavy isotopes of one or more elements included in the nucleotide such as carbon-13 for example.
  • Other possible modifications include addition of substituents such as iodine or bromine at the 5 position of the nucleobase for example.
  • mass spectrometry refers to measurement of the mass of atoms or molecules.
  • the molecules are first converted to ions, which are separated using electric or magnetic fields according to the ratio of their mass to electric charge.
  • the measured masses are used to identify the molecules.
  • microorganism as used herein means an organism too small to be observed with the unaided eye and includes, but is not limited to, bacteria, viruses, protozoans, fungi, and ciliates.
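The step from a measured molecular mass to a list of possible base compositions can be sketched as a brute-force search. This is a minimal illustration, not the method of the source: the residue masses are approximate average values for deoxynucleotide monophosphate residues, the strand is assumed to be a single linear DNA strand carrying a 5' phosphate, and the tolerance and length limit are arbitrary parameters chosen for the example.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA strand.
# These values and the added water mass are assumptions made for illustration.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
WATER = 18.02


def strand_mass(comp):
    """Approximate mass of a linear single strand with the given composition."""
    return sum(RESIDUE_MASS[base] * n for base, n in comp.items()) + WATER


def candidate_compositions(measured_mass, tolerance=0.5, max_length=40):
    """List every composition whose computed mass lies within the tolerance."""
    hits = []
    for a in range(max_length + 1):
        for g in range(max_length + 1 - a):
            for c in range(max_length + 1 - a - g):
                partial = strand_mass({"A": a, "G": g, "C": c, "T": 0})
                # Solve for the T count that best fills the remaining mass.
                t = round((measured_mass - partial) / RESIDUE_MASS["T"])
                if t < 0 or a + g + c + t > max_length:
                    continue
                comp = {"A": a, "G": g, "C": c, "T": t}
                if abs(strand_mass(comp) - measured_mass) <= tolerance:
                    hits.append(comp)
    return hits


# Round trip on a hypothetical 20-mer: its own composition must be a candidate.
target = {"A": 5, "G": 4, "C": 6, "T": 5}
candidates = candidate_compositions(strand_mass(target), max_length=25)
print(target in candidates)  # True
```

Several compositions can fall inside the tolerance window for a single strand, which is why the complementary-strand check is useful for narrowing the list to one consistent answer.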
  • multi-drug resistant or multiple-drug resistant refers to a microorganism which is resistant to more than one of the antibiotics or antimicrobial agents used in the treatment of said microorganism.
  • multiplex PCR refers to a PCR reaction where more than one primer set is included in the reaction pool allowing 2 or more different DNA targets to be amplified by PCR in a single reaction tube.
  • non-template tag refers to a stretch of at least three guanine or cytosine nucleobases of a primer used to produce a bioagent identifying amplicon which are not complementary to the template.
  • a non-template tag is incorporated into a primer for the purpose of increasing the primer-duplex stability of later cycles of amplification by incorporation of extra G-C pairs which each have one additional hydrogen bond relative to an A-T pair.
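The hydrogen-bond argument can be made concrete by counting bonds in a fully paired duplex: three per G-C pair and two per A-T pair. The sketch below assumes perfect Watson-Crick pairing along the whole primer; the primer and tag sequences are hypothetical.

```python
# Hydrogen bonds contributed by each base when Watson-Crick paired.
H_BONDS_PER_PAIR = {"A": 2, "T": 2, "G": 3, "C": 3}


def duplex_hydrogen_bonds(seq):
    """Total hydrogen bonds when every base of seq is paired with its complement."""
    return sum(H_BONDS_PER_PAIR[base] for base in seq.upper())


template_matching_part = "ACGTTGCA"  # hypothetical primer region matching the template
non_template_tag = "GCG"             # hypothetical three-base G/C tag

without_tag = duplex_hydrogen_bonds(template_matching_part)
with_tag = duplex_hydrogen_bonds(non_template_tag + template_matching_part)
print(without_tag, with_tag)  # 20 29
```

The extra bonds only contribute in later cycles, once the tag has been copied into the amplification product and therefore has a complementary partner.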
  • nucleic acid sequence refers to the linear composition of the nucleic acid residues A, T, C or G or any modifications thereof, within an oligonucleotide, nucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single or double stranded, and represent the sense or antisense strand.
  • nucleobase is synonymous with other terms in use in the art including “nucleotide,” “deoxynucleotide,” “nucleotide residue,” “deoxynucleotide residue,” “nucleotide triphosphate (NTP),” or “deoxynucleotide triphosphate (dNTP).”
  • nucleotide analog refers to modified or non-naturally occurring nucleotides such as 5-propynyl pyrimidines (i.e., 5-propynyl-dTTP and 5-propynyl-dCTP), 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP). Nucleotide analogs include base analogs and comprise modified forms of deoxyribonucleotides as well as ribonucleotides.
  • oligonucleotide as used herein is defined as a molecule comprising two or more deoxyribonucleotides or ribonucleotides, preferably at least 5 nucleotides, more preferably at least about 13 to 35 nucleotides. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide.
  • the oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, PCR, or a combination thereof.
  • an end of an oligonucleotide is referred to as the "5'-end” if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring and as the "3'-end” if its 3' oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose ring.
  • a nucleic acid sequence even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends.
  • a first region along a nucleic acid strand is said to be upstream of another region if the 3' end of the first region is before the 5' end of the second region when moving along a strand of nucleic acid in a 5' to 3' direction.
  • All oligonucleotide primers disclosed herein are understood to be presented in the 5' to 3' direction when reading left to right.
  • When two different, non-overlapping oligonucleotides anneal to different regions of the same linear complementary nucleic acid sequence, and the 3' end of one oligonucleotide points towards the 5' end of the other, the former may be called the "upstream” oligonucleotide and the latter the "downstream” oligonucleotide.
  • when two overlapping oligonucleotides are hybridized to the same linear complementary nucleic acid sequence, with the first oligonucleotide positioned such that its 5' end is upstream of the 5' end of the second oligonucleotide, and the 3' end of the first oligonucleotide is upstream of the 3' end of the second oligonucleotide, the first oligonucleotide may be called the "upstream” oligonucleotide and the second oligonucleotide may be called the "downstream" oligonucleotide.
  • PCR product refers to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
  • PNA peptide nucleic acid
  • the term "peptide nucleic acid” refers to a molecule comprising bases or base analogs such as would be found in natural nucleic acid, but attached to a peptide backbone rather than the sugar-phosphate backbone typical of nucleic acids. The attachment of the bases to the peptide is such as to allow the bases to base pair with complementary bases of nucleic acid in a manner similar to that of an oligonucleotide.
  • These small molecules, also designated anti-gene agents, stop transcript elongation by binding to their complementary strand of nucleic acid (Nielsen, et al. Anticancer Drug Des. 8:53-63).
  • the primers disclosed herein may comprise PNAs.
  • polymerase refers to an enzyme having the ability to synthesize a complementary strand of nucleic acid from a starting template nucleic acid strand and free dNTPs.
  • PCR polymerase chain reaction
  • the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule.
  • the primers are extended with a polymerase so as to form a new pair of complementary strands.
  • the steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one "cycle”; there can be numerous "cycles") to obtain a high concentration of an amplified segment of the desired target sequence.
  • the length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter.
  • any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules.
  • the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
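Because each amplified segment serves as a template in the next cycle, the number of copies grows geometrically with the number of cycles. The sketch below models that growth under idealized conditions; the per-cycle efficiency parameter is an assumption introduced for illustration and does not appear in the source.

```python
def copies_after_cycles(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after PCR cycling.

    efficiency is the fraction of templates duplicated per cycle
    (1.0 means perfect doubling); it is a modeling assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


print(copies_after_cycles(10, 30))       # 10737418240.0 (10 x 2**30)
print(copies_after_cycles(10, 30, 0.9))  # fewer copies at 90% efficiency
```

Real reactions plateau as primers and nucleoside triphosphates are consumed, so this expression only describes the early, exponential cycles.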
  • polymerization means or “polymerization agent” refers to any agent capable of facilitating the addition of nucleoside triphosphates to an oligonucleotide.
  • Preferred polymerization means comprise DNA and RNA polymerases.
  • a primer pair is used for amplification of a nucleic acid sequence.
  • a pair of primers comprises a forward primer and a reverse primer.
  • the forward primer hybridizes to a sense strand of a target gene sequence to be amplified and primes synthesis of an antisense strand (complementary to the sense strand) using the target sequence as a template.
  • a reverse primer hybridizes to the antisense strand of a target gene sequence to be amplified and primes synthesis of a sense strand (complementary to the antisense strand) using the target sequence as a template.
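The way a primer pair delimits an amplification product can be sketched as a simple in-silico search. This is an illustration under stated assumptions: the template is given as a single 5' to 3' string, both primers are written 5' to 3', only exact matches are considered, and the strand naming conventions of the text are not modeled. All sequences are hypothetical.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def predict_amplicon(template, forward_primer, reverse_primer):
    """Return the segment bounded by the two primers, or None if either is absent."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    # The reverse primer binds the opposite strand, so its site appears in the
    # given strand as the reverse complement of the primer sequence.
    reverse_site = reverse_complement(reverse_primer)
    site = template.find(reverse_site, start + len(forward_primer))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


# Hypothetical template: flank + forward site + variable region + reverse site + flank.
template = "TT" + "ACGGATCC" + "AGTTTGAC" + "CATGGCAA" + "ATT"
amplicon = predict_amplicon(template, "ACGGATCC", "TTGCCATG")
print(amplicon)       # ACGGATCCAGTTTGACCATGGCAA
print(len(amplicon))  # 24
```

The amplicon length follows directly from the positions of the two primer sites, which is what makes it a controllable design parameter.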
  • the primers are designed to bind to highly conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and yield amplification products which ideally provide enough variability to distinguish each individual bioagent, and which are amenable to molecular mass analysis.
  • the highly conserved sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity, or between about 99-100% identity.
  • the molecular mass of a given amplification product provides a means of identifying the bioagent from which it was obtained, due to the variability of the variable region.
  • design of the primers requires selection of a variable region with appropriate variability to resolve the identity of a given bioagent.
  • Bioagent identifying amplicons are ideally specific to the identity of the bioagent.
  • Properties of the primers may include any number of properties related to structure including, but not limited to: nucleobase length which may be contiguous (linked together) or non-contiguous (for example, two or more contiguous segments which are joined by a linker or loop moiety), modified or universal nucleobases (used for specific purposes such as, for example, increasing hybridization affinity, preventing non-templated adenylation and modifying molecular mass), and percent complementarity to a given target sequence.
  • Properties of the primers also include functional features including, but not limited to, orientation of hybridization (forward or reverse) relative to a nucleic acid template.
  • the coding or sense strand is the strand to which the forward priming primer hybridizes (forward priming orientation) while the reverse priming primer hybridizes to the non-coding or antisense strand (reverse priming orientation).
  • the functional properties of a given primer pair also include the generic template nucleic acid to which the primer pair hybridizes. For example, identification of bioagents can be accomplished at different levels using primers suited to resolution of each individual level of identification.
  • Broad range survey primers are designed with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of bioagents above the species level of bioagents).
  • broad range survey intelligent primers are capable of identification of bioagents at the species or sub-species level.
  • Other primers may have the functionality of producing bioagent identifying amplicons for members of a given taxonomic genus, clade, species, sub-species or genotype (including genetic variants which may include presence of virulence genes or antibiotic resistance genes or mutations). Additional functional properties of primer pairs include the functionality of performing amplification either singly (single primer pair per amplification reaction vessel) or in a multiplex fashion (multiple primer pairs and multiple amplification reactions within a single reaction vessel).
  • kits can comprise one or more purified oligonucleotide primer pairs. When the kit comprises more than one purified oligonucleotide primer pair, each of those primer pairs can be in separate vials of the kit.
  • each of the desired purified oligonucleotide primer pairs can be in the same vial.
  • each of the desired primer pairs are referred to as purified, meaning that there are no nucleic acids in said vial other than the plurality of desired primer pairs.
  • reverse transcriptase refers to an enzyme having the ability to transcribe DNA from an RNA template. This enzymatic activity is known as reverse transcriptase activity. Reverse transcriptase activity is desirable in order to obtain DNA from RNA viruses which can then be amplified and analyzed by the methods disclosed herein.
  • Ribosomal RNA refers to the primary ribonucleic acid constituent of ribosomes. Ribosomes are the protein-manufacturing organelles of cells and exist in the cytoplasm. Ribosomal RNAs are transcribed from the DNA genes encoding them.
  • sample in the present specification and claims is used in its broadest sense. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples.
  • a sample may include a specimen of synthetic origin.
  • Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste.
  • Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, lagomorphs, rodents, etc.
  • Environmental samples include environmental material such as surface matter, soil, water, air and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to the methods disclosed herein.
  • source of target nucleic acid refers to any sample that contains nucleic acids (RNA or DNA). Particularly preferred sources of target nucleic acids are biological samples including, but not limited to blood, saliva, cerebral spinal fluid, pleural fluid, milk, lymph, sputum and semen.
  • sample template refers to nucleic acid originating from a sample that is analyzed for the presence of "target” (defined below).
  • background template is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is often a contaminant. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
  • a “segment” is defined herein as a region of nucleic acid within a target sequence.
  • the "self-sustained sequence replication reaction” (Guatelli et al., Proc. Natl. Acad. Sci., 87:1874-1878 [1990], with an erratum at Proc. Natl. Acad. Sci., 87:7797 [1990]) is a transcription-based in vitro amplification system (Kwok et al., Proc. Natl. Acad. Sci., 86:1173-1177 [1989]) that can exponentially amplify RNA sequences at a uniform temperature. The amplified RNA can then be utilized for mutation detection (Fahy et al., PCR Meth. Appl., 1:25-33 [1991]).
  • an oligonucleotide primer is used to add a phage RNA polymerase promoter to the 5' end of the sequence of interest.
  • a cocktail of enzymes and substrates that includes a second primer, reverse transcriptase, RNase H, RNA polymerase and ribo- and deoxyribonucleoside triphosphates, the target sequence undergoes repeated rounds of transcription, cDNA synthesis and second-strand synthesis to amplify the area of interest.
  • the use of 3SR to detect mutations is kinetically limited to screening small segments of DNA (e.g., 200-300 base pairs).
  • sequence alignment refers to a listing of multiple DNA or amino acid sequences that aligns them to highlight their similarities. The listings can be made using bioinformatics computer programs.
  • "sepsis" and "septicemia" refer to disease caused by the spread of bacteria and their toxins in the bloodstream.
  • a "sepsis-causing bacterium” is the causative agent of sepsis, i.e., the bacterium infecting the bloodstream of an individual with sepsis.
  • the term "speciating primer pair” refers to a primer pair designed to produce a bioagent identifying amplicon with the diagnostic capability of identifying species members of a group of genera or a particular genus of bioagents.
  • Primer pair number 2249 (SEQ ID NOs: 430:1321), for example, is a speciating primer pair used to distinguish Staphylococcus aureus from other species of the genus Staphylococcus.
  • a "sub-species characteristic” is a genetic characteristic that provides the means to distinguish two members of the same bioagent species. For example, one viral strain could be distinguished from another viral strain of the same species by possessing a genetic change (e.g., a nucleotide deletion, addition or substitution) in one of the viral genes, such as the RNA-dependent RNA polymerase. Sub-species characteristics such as virulence genes and drug resistance genes are responsible for the phenotypic differences among the different strains of bacteria.
  • the term "target” is used in a broad sense to indicate the gene or genomic region being amplified by the primers. Because the methods disclosed herein provide a plurality of amplification products from any given primer pair (depending on the bioagent being analyzed), multiple amplification products from different specific nucleic acid sequences may be obtained. Thus, the term “target” is not used to refer to a single specific nucleic acid sequence. The “target” is sought to be sorted out from other nucleic acid sequences and contains a sequence that has at least partial complementarity with an oligonucleotide primer. The target nucleic acid may comprise single- or double-stranded DNA or RNA. A “segment” is defined as a region of nucleic acid within the target sequence.
  • template refers to a strand of nucleic acid on which a complementary copy is built from nucleoside triphosphates through the activity of a template-dependent nucleic acid polymerase. Within a duplex the template strand is, by convention, depicted and described as the "bottom” strand. Similarly, the non-template strand is often depicted and described as the "top” strand.
  • Tm is used in reference to the "melting temperature.”
  • the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands.
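For short oligonucleotides, a commonly cited rule of thumb (often called the Wallace rule) estimates the melting temperature as 2 °C per A or T plus 4 °C per G or C. The source does not specify how Tm is calculated, so the sketch below is offered only as a rough illustration of how base content drives Tm; the primer sequence is hypothetical.

```python
def wallace_tm(seq):
    """Rule-of-thumb Tm in degrees Celsius: 2*(A+T) + 4*(G+C).

    Only a crude estimate for short oligonucleotides; salt and strand
    concentration, which also affect Tm, are ignored here.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


print(wallace_tm("ACGTTGCA"))  # 24
```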
  • triangulation genotyping analysis refers to a method of genotyping a bioagent by measurement of molecular masses or base compositions of amplification products, corresponding to bioagent identifying amplicons, obtained by amplification of regions of more than one gene.
  • triangulation refers to a method of establishing the accuracy of information by comparing three or more types of independent points of view bearing on the same findings.
  • Triangulation genotyping analysis carried out with a plurality of triangulation genotyping analysis primers yields a plurality of base compositions that then provide a pattern or "barcode" from which a species type can be assigned.
  • the species type may represent a previously known sub-species or strain, or may be a previously unknown strain having a specific and previously unobserved base composition barcode indicating the existence of a previously unknown genotype.
  • triangulation genotyping analysis primer pair is a primer pair designed to produce bioagent identifying amplicons for determining species types in a triangulation genotyping analysis.
  • Triangulation identification is pursued by analyzing a plurality of bioagent identifying amplicons produced with different primer pairs. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.
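The "barcode" idea can be illustrated as a lookup in which each primer pair contributes one base composition and a type is assigned only when every composition agrees with a reference entry. The sketch below is hypothetical throughout: the primer pair labels, type names and compositions are invented for the example, and exact matching is assumed.

```python
def match_barcode(observed, reference_barcodes):
    """Return the reference types whose compositions match for every primer pair."""
    return [
        name
        for name, barcode in reference_barcodes.items()
        if all(barcode.get(pair) == comp for pair, comp in observed.items())
    ]


# Hypothetical reference barcodes: primer pair label -> (A, G, C, T) counts.
reference_barcodes = {
    "type_X": {"pp1": (10, 12, 14, 9), "pp2": (20, 18, 15, 22), "pp3": (8, 9, 7, 11)},
    "type_Y": {"pp1": (10, 12, 14, 9), "pp2": (19, 19, 15, 22), "pp3": (8, 9, 7, 11)},
}

observed = {"pp1": (10, 12, 14, 9), "pp2": (19, 19, 15, 22), "pp3": (8, 9, 7, 11)}
print(match_barcode(observed, reference_barcodes))  # ['type_Y']

novel = {"pp1": (10, 12, 14, 9), "pp2": (21, 17, 15, 22), "pp3": (8, 9, 7, 11)}
print(match_barcode(novel, reference_barcodes))     # [] (previously unobserved barcode)
```

In this example a single primer pair cannot separate the two types, since both share the same pp1 and pp3 compositions; it is the combination across primer pairs that resolves them.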
  • the term "unknown bioagent” may mean either: (i) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus for example) but which is not known to be in a sample to be analyzed, or (ii) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003). For example, if the method for identification of coronaviruses disclosed in commonly owned U.S. Patent Serial No.
  • variable sequence refers to differences in nucleic acid sequence between two nucleic acids.
  • the genes of two different bacterial species may vary in sequence by the presence of single base substitutions and/or deletions or insertions of one or more nucleotides. These two forms of the structural gene are said to vary in sequence from one another.
  • viral nucleic acid includes, but is not limited to, DNA, RNA, or DNA that has been obtained from viral RNA, such as, for example, by performing a reverse transcription reaction. Viral RNA can either be single-stranded (of positive or negative polarity) or double-stranded.
  • virus refers to obligate, ultramicroscopic, parasites that are incapable of autonomous replication (i.e., replication requires the use of the host cell's machinery). Viruses can survive outside of a host cell but cannot replicate.
  • wild-type refers to a gene or a gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source.
  • a wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene.
  • modified refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
  • a "wobble base” is a variation in a codon found at the third nucleotide position of a DNA triplet. Variations in conserved regions of sequence are often found at the third nucleotide position due to redundancy in the amino acid code.
  • a match is obtained when an experimentally-determined molecular mass or base composition of an analyzed amplification product is compared with known molecular masses or base compositions of known bioagent identifying amplicons and the experimentally determined molecular mass or base composition is the same as the molecular mass or base composition of one of the known bioagent identifying amplicons.
  • the experimentally-determined molecular mass or base composition may be within experimental error of the molecular mass or base composition of a known bioagent identifying amplicon and still be classified as a match.
  • the match may also be classified using a probability of match model such as the models described in U.S. Serial No. 11/073,362, which is commonly owned and incorporated herein by reference in entirety.
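The notion of a match within experimental error can be sketched as a simple tolerance comparison. This is a minimal illustration only: the function name, the database entries, and the tolerance values are assumptions made for the example, and it does not implement the probability-of-match models referenced above.

```python
def find_matches(measured_mass, known_masses, tolerance_da):
    """Return names of known amplicons whose mass lies within tolerance of the measurement."""
    return sorted(
        name
        for name, mass in known_masses.items()
        if abs(mass - measured_mass) <= tolerance_da
    )


# Hypothetical database entries (Da); not values from this disclosure.
known = {"amplicon A": 14208.0, "amplicon B": 14079.0, "amplicon C": 14209.5}
```

With these placeholder values, a measurement of 14208.4 Da matches only "amplicon A" at a 1 Da tolerance, while widening the tolerance to 2 Da also admits "amplicon C", which shows how the assumed experimental error controls ambiguity.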
  • the method can be applied to rapid parallel multiplex analyses, the results of which can be employed in a triangulation identification strategy.
  • the present method provides rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for bioagent detection and identification.
  • Unlike bacterial genomes, which exhibit conservation of numerous genes (i.e., housekeeping genes) across all organisms, viruses do not share a gene that is essential and conserved among all virus families. Therefore, viral identification is achieved within smaller groups of related viruses, such as members of a particular virus family or genus. For example, RNA-dependent RNA polymerase is present in all single-stranded RNA viruses and can be used for broad priming as well as resolution within the virus family.
  • At least one bacterial nucleic acid segment is amplified in the process of identifying the bacterial bioagent.
  • the nucleic acid segments that can be amplified by the primers disclosed herein and that provide enough variability to distinguish each individual bioagent and whose molecular masses are amenable to molecular mass determination are herein described as bioagent identifying amplicons.
  • bioagent identifying amplicons comprise from about 27 to about 200 nucleobases (i.e., from about 27 to about 200 linked nucleosides), although both longer and shorter regions may be used.
  • these embodiments include compounds of 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100
  • bioagent identifying amplicons amenable to molecular mass determination which are produced by the primers described herein are either of a length, size or mass compatible with the particular mode of molecular mass determination or compatible with a means of providing a predictable fragmentation pattern in order to obtain predictable fragments of a length compatible with the particular mode of molecular mass determination.
  • Such means of providing a predictable fragmentation pattern of an amplification product include, but are not limited to, cleavage with chemical reagents, restriction enzymes or cleavage primers, for example.
  • bioagent identifying amplicons are larger than 200 nucleobases and are amenable to molecular mass determination following restriction digestion. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art.
  • amplification products corresponding to bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR), which is a routine method to those with ordinary skill in the molecular biology arts.
  • Other amplification methods may be used such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA). These methods are also known to those with ordinary skill.
  • the primers are designed to bind to conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and yield amplification products which provide variability sufficient to distinguish each individual bioagent, and which are amenable to molecular mass analysis.
  • the highly conserved sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity, or between about 99-100% identity.
  • the molecular mass of a given amplification product provides a means of identifying the bioagent from which it was obtained, due to the variability of the variable region.
  • design of the primers involves selection of a variable region with sufficient variability to resolve the identity of a given bioagent.
  • bioagent identifying amplicons are specific to the identity of the bioagent.
  • identification of bioagents is accomplished at different levels using primers suited to resolution of each individual level of identification.
  • Broad range survey primers are designed with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of bioagents above the species level of bioagents).
  • broad range survey intelligent primers are capable of identification of bioagents at the species or sub-species level.
  • Examples of broad range survey primers include, but are not limited to: primer pair numbers 346 (SEQ ID NOs: 202:1110), 347 (SEQ ID NOs: 560:1278), 348 (SEQ ID NOs: 706:895), and 361 (SEQ ID NOs: 697:1398), which target DNA encoding 16S rRNA, and primer pair numbers 349 (SEQ ID NOs: 401:1156) and 360 (SEQ ID NOs: 409:1434), which target DNA encoding 23S rRNA.
  • drill-down primers are designed with the objective of identifying a bioagent at the sub-species level (including strains, subtypes, variants and isolates) based on subspecies characteristics which may, for example, include single nucleotide polymorphisms (SNPs), variable number tandem repeats (VNTRs), deletions, drug resistance mutations or any other modification of a nucleic acid sequence of a bioagent relative to other members of a species having different sub-species characteristics.
  • Drill-down intelligent primers are not always required for identification at the sub-species level because broad range survey intelligent primers may, in some cases, provide sufficient identification resolution to accomplish this identification objective.
  • Examples of drill-down primers include, but are not limited to: confirmation primer pairs such as primer pair numbers 351 (SEQ ID NOs: 355:1423) and 353 (SEQ ID NOs: 220:1394), which target the pXO1 virulence plasmid of Bacillus anthracis.
  • drill-down primer pairs are found in sets of triangulation genotyping primer pairs such as, for example, primer pair number 2146 (SEQ ID NOs: 437:1137), which targets the arcC gene (encoding carbamate kinase) and is included in an 8 primer pair panel or kit for use in genotyping Staphylococcus aureus, or in other panels or kits of primer pairs used for determining drug-resistant bacterial strains, such as, for example, primer pair number 2095 (SEQ ID NOs: 456:1261), which targets the pv-luk gene (encoding Panton-Valentine leukocidin) and is included in an 8 primer pair panel or kit for use in identification of drug-resistant strains of Staphylococcus aureus.
  • a representative process flow diagram for the primer selection and validation process is outlined in Figure 1.
  • candidate target sequences are identified (200) from which nucleotide alignments are created (210) and analyzed (220).
  • Primers are then designed by selecting appropriate priming regions (230) to facilitate the selection of candidate primer pairs (240).
  • the primer pairs are then subjected to in silico analysis by electronic PCR (ePCR) (300) wherein bioagent identifying amplicons are obtained from sequence databases such as GenBank or other sequence collections (310) and checked for specificity in silico (320).
  • Bioagent identifying amplicons obtained from GenBank sequences (310) can also be analyzed by a probability model which predicts the capability of a given amplicon to identify unknown bioagents such that the base compositions of amplicons with favorable probability scores are then stored in a base composition database (325).
  • base compositions of the bioagent identifying amplicons obtained from the primers and GenBank sequences can be directly entered into the base composition database (330).
  • Candidate primer pairs (240) are validated by testing their ability to hybridize to target nucleic acid by an in vitro amplification by a method such as PCR analysis (400) of nucleic acid from a collection of organisms (410). Amplification products thus obtained are analyzed by gel electrophoresis or by mass spectrometry to confirm the sensitivity, specificity and reproducibility of the primers used to obtain the amplification products (420).
  • synthesis of primers is well known and routine in the art.
  • the primers may be conveniently and routinely made through the well-known technique of solid phase synthesis.
  • Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, CA). Any other means for such synthesis known in the art may additionally or alternatively be employed.
  • primers are employed as compositions for use in methods for identification of bacterial bioagents as follows: a primer pair composition is contacted with nucleic acid (such as, for example, bacterial DNA or DNA reverse transcribed from the rRNA) of an unknown bacterial bioagent. The nucleic acid is then amplified by a nucleic acid amplification technique, such as PCR for example, to obtain an amplification product that represents a bioagent identifying amplicon.
  • the molecular mass of each strand of the double-stranded amplification product is determined by a molecular mass measurement technique such as mass spectrometry for example, wherein the two strands of the double-stranded amplification product are separated during the ionization process.
  • the mass spectrometry is electrospray Fourier transform ion cyclotron resonance mass spectrometry (ESI-FTICR-MS) or electrospray time of flight mass spectrometry (ESI-TOF-MS).
  • the molecular mass or base composition thus determined is then compared with a database of molecular masses or base compositions of analogous bioagent identifying amplicons for known bacterial bioagents.
  • a match between the molecular mass or base composition of the amplification product and the molecular mass or base composition of an analogous bioagent identifying amplicon for a known bacterial bioagent indicates the identity of the unknown bioagent.
  • the primer pair used is one of the primer pairs of Table 2.
  • the method is repeated using one or more different primer pairs to resolve possible ambiguities in the identification process or to improve the confidence level for the identification assignment.
  • a bioagent identifying amplicon may be produced using only a single primer (either the forward or reverse primer of any given primer pair), provided an appropriate amplification method is chosen, such as, for example, low stringency single primer PCR (LSSP-PCR). Adaptation of this amplification method in order to produce bioagent identifying amplicons can be accomplished by one with ordinary skill in the art without undue experimentation.
  • the oligonucleotide primers are broad range survey primers which hybridize to conserved regions of nucleic acid encoding the hexon gene of all (or between 80% and 100%, between 85% and 100%, between 90% and 100% or between 95% and 100%) known bacteria and produce bacterial bioagent identifying amplicons.
  • the molecular mass or base composition of a bacterial bioagent identifying amplicon defined by a broad range survey primer pair does not provide enough resolution to unambiguously identify a bacterial bioagent at or below the species level.
  • These cases benefit from further analysis of one or more bacterial bioagent identifying amplicons generated from at least one additional broad range survey primer pair or from at least one additional division-wide primer pair.
  • the employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as triangulation identification.
  • the oligonucleotide primers are division-wide primers which hybridize to nucleic acid encoding genes of species within a genus of bacteria.
  • the oligonucleotide primers are drill-down primers which enable the identification of sub-species characteristics. Drill-down primers provide the functionality of producing bioagent identifying amplicons for drill-down analyses such as strain typing when contacted with nucleic acid under amplification conditions. Identification of such sub-species characteristics is often critical for determining proper clinical treatment of bacterial infections. In some embodiments, sub-species characteristics are identified using only broad range survey primers and division-wide and drill-down primers are not used.
  • the primers used for amplification hybridize to and amplify genomic DNA, and DNA of bacterial plasmids.
  • various computer software programs may be used to aid in design of primers for amplification reactions such as Primer Premier 5 (Premier Biosoft, Palo Alto, CA) or OLIGO Primer Analysis Software (Molecular Biology Insights, Cascade, CO). These programs allow the user to input desired hybridization conditions such as melting temperature of a primer-template duplex for example.
  • an in silico PCR search algorithm such as electronic PCR (ePCR) is used to analyze primer specificity across a plurality of template sequences which can be readily obtained from public sequence databases such as GenBank, for example.
  • An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety).
  • This also provides information on primer specificity of the selected primer pairs.
  • the hybridization conditions applied to the algorithm can limit the results of primer specificity obtained from the algorithm.
  • the melting temperature threshold for the primer template duplex is specified to be 35°C or a higher temperature.
  • the number of acceptable mismatches is specified to be seven mismatches or less.
  • the buffer components and concentrations and primer concentrations may be specified and incorporated into the algorithm; for example, an appropriate primer concentration is about 250 nM and appropriate buffer components are 50 mM sodium or potassium and 1.5 mM Mg2+.
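The mismatch limit described above can be illustrated with a much simpler screen than the modified search algorithm itself. The sketch below is an assumption-laden stand-in: it slides a primer along a same-strand template sequence, counts mismatches in each ungapped window, and keeps windows at or below the limit. It does not model melting temperature, buffer composition, primer concentration, or reverse-complement priming, and the function name is hypothetical.

```python
def priming_sites(primer, template, max_mismatches=7):
    """Return (position, mismatches) for each ungapped window within the mismatch limit."""
    primer, template = primer.upper(), template.upper()
    sites = []
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        # Count positions where the primer and the template window disagree.
        mismatches = sum(1 for p, w in zip(primer, window) if p != w)
        if mismatches <= max_mismatches:
            sites.append((start, mismatches))
    return sites
```

The default limit of seven follows the threshold stated above; for short toy sequences a smaller limit is needed to see any filtering at all.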
  • a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or a hairpin structure).
  • the primers may comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least
  • Percent homology, sequence identity or complementarity can be determined by, for example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for UNIX, Genetics Computer Group, University Research Park, Madison WI), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489).
  • complementarity of primers with respect to the conserved priming regions of bacterial nucleic acid is between about 70% and about 75%. In other embodiments, homology, sequence identity or complementarity is between about 75% and about 80%.
  • homology, sequence identity or complementarity is at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or is 100%.
  • the primers described herein comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 98%, or at least 99%, or 100% (or any range therewithin) sequence identity with the primer sequences specifically disclosed herein.
  • the primers are at least 13 nucleobases in length. In another embodiment, the primers are less than 36 nucleobases in length.
  • the oligonucleotide primers are 13 to 35 nucleobases in length (13 to 35 linked nucleotide residues). These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17,
  • primers may also be linked to one or more other desired moieties, including, but not limited to, affinity groups, ligands, regions of nucleic acid that are not complementary to the nucleic acid to be amplified, labels, etc.
  • Primers may also form hairpin structures.
  • hairpin primers may be used to amplify short target nucleic acid molecules. The presence of the hairpin may stabilize the amplification complex (see e.g., TAQMAN MicroRNA Assays, Applied Biosystems, Foster City, California).
  • any oligonucleotide primer pair may have one or both primers with less than 70% sequence homology with a corresponding member of any of the primer pairs of Table 2 if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon.
  • any oligonucleotide primer pair may have one or both primers with a length greater than 35 nucleobases if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon.
  • the function of a given primer may be substituted by a combination of two or more primer segments that hybridize adjacent to each other or that are linked by a nucleic acid loop structure or linker which allows a polymerase to extend the two or more primers in an amplification reaction.
  • the primer pairs used for obtaining bioagent identifying amplicons are the primer pairs of Table 2.
  • other combinations of primer pairs are possible by combining certain members of the forward primers with certain members of the reverse primers.
  • An example can be seen in Table 2 for two primer pair combinations of forward primer 16S_EC_789_810_F (SEQ ID NO: 206) with the reverse primers 16S_EC_880_894_R (SEQ ID NO: 796) or 16S_EC_882_899_R (SEQ ID NO: 818).
  • a bioagent identifying amplicon that would be produced by the primer pair which preferably is between about 27 to about 200 nucleobases in length.
  • a bioagent identifying amplicon longer than 200 nucleobases in length could be cleaved into smaller segments by cleavage reagents such as chemical reagents, or restriction enzymes, for example.
  • the primers are configured to amplify nucleic acid of a bioagent to produce amplification products that can be measured by mass spectrometry and from whose molecular masses candidate base compositions can be readily calculated.
  • any given primer comprises a modification comprising the addition of a non-templated T residue to the 5' end of the primer (i.e., the added T residue does not necessarily hybridize to the nucleic acid being amplified).
  • the addition of a non-templated T residue has an effect of minimizing the addition of non-templated adenosine residues as a result of the non-specific enzyme activity of Taq polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.
  • primers may contain one or more universal bases. Because any variation (due to codon wobble in the 3rd position) in the conserved regions among species is likely to occur in the third position of a DNA (or RNA) triplet, oligonucleotide primers can be designed such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a "universal nucleobase." For example, under this "wobble" pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C; and uridine (U) binds to A or G.
  • nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).
  • the oligonucleotide primers are designed such that the first and second positions of each triplet are occupied by nucleotide analogs that bind with greater affinity than the unmodified nucleotide.
  • these analogs include, but are not limited to, 2,6-diaminopurine, which binds to thymine; 5-propynyluracil (also known as propynylated thymine), which binds to adenine; and 5-propynylcytosine and phenoxazines, including G-clamp, which binds to G.
  • Propynylated pyrimidines are described in U.S.
  • Propynylated primers are described in U.S. Pre-Grant Publication No. 2003-0170682, which is also commonly owned and incorporated herein by reference in its entirety.
  • Phenoxazines are described in U.S. Patent Nos. 5,502,177, 5,763,588, and 6,005,096, each of which is incorporated herein by reference in its entirety.
  • G-clamps are described in U.S. Patent Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.
  • primer hybridization is enhanced using primers containing 5-propynyl deoxycytidine and deoxythymidine nucleotides. These modified primers offer increased affinity and base pairing selectivity.
  • non-template primer tags are used to increase the melting temperature (Tm) of a primer-template duplex in order to improve amplification efficiency.
  • a non-template tag is at least three consecutive A or T nucleotide residues on a primer which are not complementary to the template. In any given non-template tag, A can be replaced by C or G and T can also be replaced by C or G.
  • propynylated tags may be used in a manner similar to that of the non-template tag, wherein two or more 5-propynylcytidine or 5-propynyluridine residues replace template matching residues on a primer.
  • a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.
  • the primers contain mass-modifying tags. Reducing the total number of possible base compositions of a nucleic acid of specific molecular weight provides a means of avoiding a persistent source of ambiguity in determination of base composition of amplification products. Addition of mass-modifying tags to certain nucleobases of a given primer will result in simplification of de novo determination of base composition of a given bioagent identifying amplicon from its molecular mass.
  • the mass modified nucleobase comprises one or more of the following: for example, 7-deaza-2'-deoxyadenosine-5'-triphosphate, 5-iodo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxycytidine-5'-triphosphate, 5-iodo-2'-deoxycytidine-5'-triphosphate, 5-hydroxy-2'-deoxyuridine-5'-triphosphate, 4-thiothymidine-5'-triphosphate, 5-aza-2'-deoxyuridine-5'-triphosphate, 5-fluoro-2'-deoxyuridine-5'-triphosphate, O6-methyl-2'-deoxyguanosine-5'-triphosphate, N2-methyl-2'-deoxyguanosine-5'-triphosphate, 8-oxo-2'-deoxyguanosine-5'-triphosphate or thiothymidine
  • multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with a plurality of primer pairs.
  • the advantages of multiplexing are that fewer reaction containers (for example, wells of a 96- or 384-well plate) are needed for each molecular mass measurement, providing time, resource and cost savings because additional bioagent identification data can be obtained within a single analysis.
  • Multiplex amplification methods are well known to those with ordinary skill and can be developed without undue experimentation.
  • one useful and non-obvious step in selecting a plurality of candidate bioagent identifying amplicons for multiplex amplification is to ensure that each strand of each amplification product will be sufficiently different in molecular mass that mass spectral signals will not overlap and lead to ambiguous analysis results.
  • a 10 Da difference in mass of two strands of one or more amplification products is sufficient to avoid overlap of mass spectral peaks.
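The separation requirement for multiplexing can be checked with a pairwise comparison of expected strand masses. The sketch below is illustrative only: the function name and the strand labels used in any call are hypothetical, and the 10 Da default simply mirrors the figure stated above.

```python
from itertools import combinations


def overlapping_pairs(strand_masses, min_separation_da=10.0):
    """Return pairs of strand labels whose masses differ by less than the minimum separation."""
    return [
        (label_a, label_b)
        for (label_a, mass_a), (label_b, mass_b) in combinations(
            sorted(strand_masses.items()), 2
        )
        if abs(mass_a - mass_b) < min_separation_da
    ]
```

An empty result suggests the candidate amplicons could share a reaction under this criterion; any pair returned would need a different primer pair or a separate well.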
  • single amplification reactions can be pooled before analysis by mass spectrometry.
  • the molecular mass of a given bioagent identifying amplicon is determined by mass spectrometry.
  • Mass spectrometry has several advantages, not the least of which is high bandwidth characterized by the ability to separate (and isolate) many molecular peaks across a broad range of mass to charge ratio (m/z).
  • mass spectrometry is intrinsically a parallel detection scheme without the need for radioactive or fluorescent labels, since every amplification product is identified by its molecular mass.
  • the current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample.
  • An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.
  • intact molecular ions are generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase.
  • ionization techniques include, but are not limited to, electrospray ionization (ES), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB).
  • Electrospray ionization mass spectrometry is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
  • the mass detectors used in the methods described herein include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight (TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple quadrupole.
  • base composition is the exact number of each nucleobase (A, T, C and G) determined from the molecular mass of a bioagent identifying amplicon.
  • a base composition provides an index of a specific organism. Base compositions can be calculated from known sequences of known bioagent identifying amplicons and can be experimentally determined by measuring the molecular mass of a given bioagent identifying amplicon, followed by determination of all possible base compositions which are consistent with the measured molecular mass within acceptable experimental error.
  • the following example illustrates determination of base composition from an experimentally obtained molecular mass of a 46-mer amplification product originating at position 1337 of the 16S rRNA of Bacillus anthracis.
  • the forward and reverse strands of the amplification product have measured molecular masses of 14208 and 14079 Da, respectively.
  • the possible base compositions derived from the molecular masses of the forward and reverse strands for the B. anthracis products are listed in Table 1.
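The enumeration of base compositions consistent with a measured mass can be sketched as a brute-force search. This is a minimal illustration under stated assumptions: the residue masses and terminal correction are approximate average values of the kind used by common oligonucleotide mass calculators, not values taken from this disclosure, so the sketch is not expected to reproduce the entries of Table 1. All function names are hypothetical.

```python
# Approximate average residue masses (Da); assumed values, not from this disclosure.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
TERMINAL_ADJUSTMENT = -61.96  # assumed correction for a strand lacking a 5' phosphate


def strand_mass(a, c, g, t):
    """Estimate the mass of a single strand from its base counts."""
    counts = {"A": a, "C": c, "G": g, "T": t}
    return sum(RESIDUE_MASS[b] * n for b, n in counts.items()) + TERMINAL_ADJUSTMENT


def candidate_compositions(measured_mass, length, tolerance_da):
    """List every (A, C, G, T) of the given length within tolerance of the mass."""
    hits = []
    for a in range(length + 1):
        for c in range(length + 1 - a):
            for g in range(length + 1 - a - c):
                t = length - a - c - g
                if abs(strand_mass(a, c, g, t) - measured_mass) <= tolerance_da:
                    hits.append((a, c, g, t))
    return hits


def complementary_candidates(forward_mass, reverse_mass, length, tolerance_da):
    """Keep forward-strand candidates whose complement also fits the reverse-strand mass."""
    reverse_hits = set(candidate_compositions(reverse_mass, length, tolerance_da))
    return [
        (a, c, g, t)
        for (a, c, g, t) in candidate_compositions(forward_mass, length, tolerance_da)
        if (t, g, c, a) in reverse_hits  # complement swaps A<->T and C<->G
    ]
```

Requiring the forward and reverse candidates to be complementary is what narrows a list of mass-consistent compositions for each strand down to a smaller set that fits both measurements.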
  • assignment of previously unobserved base compositions can be accomplished via the use of pattern classifier model algorithms.
  • Base compositions, like sequences, vary slightly from strain to strain within species, for example.
  • the pattern classifier model is the mutational probability model.
  • the pattern classifier is the polytope model.
  • the mutational probability model and polytope model are both commonly owned and described in U.S. Patent application Serial No. 11/073,362 which is incorporated herein by reference in entirety.
  • it is possible to manage this diversity by building "base composition probability clouds" around the composition constraints for each species.
  • a "pseudo four-dimensional plot" can be used to visualize the concept of base composition probability clouds.
  • Optimal primer design requires optimal choice of bioagent identifying amplicons and maximizes the separation between the base composition signatures of individual bioagents. Areas where clouds overlap indicate regions that may result in a misclassification, a problem which is overcome by a triangulation identification process using bioagent identifying amplicons not affected by overlap of base composition probability clouds.
  • base composition probability clouds provide the means for screening potential primer pairs in order to avoid potential misclassifications of base compositions.
  • base composition probability clouds provide the means for predicting the identity of a bioagent whose assigned base composition was not previously observed and/or indexed in a bioagent identifying amplicon base composition database due to evolutionary transitions in its nucleic acid sequence.
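  • As a rough illustration of assigning a previously unobserved base composition to a nearby cloud, the sketch below ranks species by the Manhattan distance between base counts. This distance rule is a deliberately simple stand-in and is not the mutational probability model or the polytope model referenced above; the function names, the max_distance cutoff, and the data layout are assumptions.

```python
def composition_distance(x, y):
    """Manhattan distance between two (A, G, C, T) base-count tuples."""
    return sum(abs(p - q) for p, q in zip(x, y))


def nearest_species(observed, known, max_distance=4):
    """Return (distance, species) pairs whose closest indexed composition is near.

    known maps a species name to the list of (A, G, C, T) tuples already indexed
    for one bioagent identifying amplicon. More than one result suggests that the
    clouds overlap for this amplicon and that another amplicon is needed.
    """
    scored = []
    for species, compositions in known.items():
        best = min(composition_distance(observed, c) for c in compositions)
        if best <= max_distance:
            scored.append((best, species))
    return sorted(scored)
```

  • An empty result means no indexed species lies within the cutoff, and several results flag an amplicon for which triangulation with additional amplicons would be required.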
  • mass spectrometry determination of base composition does not require prior knowledge of the composition or sequence in order to make the measurement.
  • the methods disclosed herein provide bioagent classifying information similar to DNA sequencing and phylogenetic analysis at a level sufficient to identify a given bioagent. Furthermore, the process of determination of a previously unknown base composition for a given bioagent (for example, in a case where sequence information is unavailable) has downstream utility by providing additional bioagent indexing information with which to populate base composition databases. The process of future bioagent identification is thus greatly improved as more BCS indexes become available in base composition databases.
  • a molecular mass of a single bioagent identifying amplicon alone does not provide enough resolution to unambiguously identify a given bioagent.
  • the employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as "triangulation identification."
  • Triangulation identification is pursued by determining the molecular masses of a plurality of bioagent identifying amplicons selected within a plurality of housekeeping genes. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.
  • the triangulation identification process can be pursued by characterization of bioagent identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR), such as multiplex PCR where multiple primers are employed in the same amplification reaction mixture, or PCR in multi-well plate format wherein a different and unique pair of primers is used in multiple wells containing otherwise identical reaction mixtures.
  • multiplex and multi-well PCR methods are well known to those with ordinary skill in the arts of rapid throughput amplification of nucleic acids.
  • one PCR reaction per well or container may be carried out, followed by an amplicon pooling step wherein the amplification products of different wells are combined in a single well or container which is then subjected to molecular mass analysis.
  • the combination of pooled amplicons can be chosen such that the expected ranges of molecular masses of individual amplicons are not overlapping and thus will not complicate identification of signals.
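  • The pooling constraint above amounts to an interval-overlap check. The sketch below assumes each amplicon has an expected mass range given as a (low, high) pair in Da; the function name and the data layout are hypothetical.

```python
def pool_is_resolvable(expected_ranges):
    """Check that no two amplicons in a pool have overlapping expected mass ranges.

    expected_ranges maps an amplicon label to a closed (low_da, high_da) interval.
    Returns (ok, conflicts), where conflicts lists the overlapping label pairs.
    """
    items = sorted(expected_ranges.items(), key=lambda kv: kv[1][0])
    conflicts = []
    for i, (label_a, range_a) in enumerate(items):
        for label_b, range_b in items[i + 1:]:
            if range_b[0] > range_a[1]:
                break  # sorted by low end, so later ranges cannot overlap either
            conflicts.append((label_a, label_b))
    return (not conflicts, conflicts)
```

  • A pool that returns conflicts can be split across additional wells until every pool passes the check.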
  • one or more nucleotide substitutions within a codon of a gene of an infectious organism confer drug resistance upon an organism which can be determined by codon base composition analysis.
  • the organism can be a bacterium, virus, fungus or protozoan.
  • the amplification product containing the codon being analyzed is of a length of about 35 to about 200 nucleobases.
  • the primers employed in obtaining the amplification product can hybridize to upstream and downstream sequences directly adjacent to the codon, or can hybridize to upstream and downstream sequences one or more sequence positions away from the codon.
  • the primers may have between about 70% to 100% sequence complementarity with the sequence of the gene containing the codon being analyzed.
  • the codon base composition analysis is undertaken
  • the codon analysis is undertaken for the purpose of investigating genetic disease in an individual. In other embodiments, the codon analysis is undertaken for the purpose of investigating a drug resistance mutation or any other deleterious mutation in an infectious organism such as a bacterium, virus, fungus or protozoan.
  • the bioagent is a bacterium identified in a biological product.
  • the molecular mass of an amplification product containing the codon being analyzed is measured by mass spectrometry.
  • the mass spectrometry can be either electrospray (ESI) mass spectrometry or matrix-assisted laser desorption ionization (MALDI) mass spectrometry. Time-of-flight (TOF) is an example of one mode of mass spectrometry compatible with the methods disclosed herein.
  • the methods disclosed herein can also be employed to determine the relative abundance of drug resistant strains of the organism being analyzed. Relative abundances can be calculated from amplitudes of mass spectral signals with relation to internal calibrants. In some embodiments, known quantities of internal amplification calibrants can be included in the amplification reactions and abundances of analyte amplification product estimated in relation to the known quantities of the calibrants.
  • one or more alternative treatments can be devised to treat the individual.
  • the identity and quantity of an unknown bioagent can be determined using the process illustrated in Figure 2.
  • Primers (500) and a known quantity of a calibration polynucleotide (505) are added to a sample containing nucleic acid of an unknown bioagent.
  • the total nucleic acid in the sample is then subjected to an amplification reaction (510) to obtain amplification products.
  • the molecular masses of amplification products are determined (515) from which are obtained molecular mass and abundance data.
  • the molecular mass of the bioagent identifying amplicon (520) provides the means for its identification (525) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (530) provides the means for its identification (535).
  • the abundance data of the bioagent identifying amplicon is recorded (540) and the abundance data for the calibration amplicon is recorded (545), both of which are used in a calculation (550) which determines the quantity of unknown bioagent in the sample.
  • a sample comprising an unknown bioagent is contacted with a pair of primers that provide the means for amplification of nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence.
  • the nucleic acids of the bioagent and of the calibration sequence are amplified and the rate of amplification is reasonably assumed to be similar for the nucleic acid of the bioagent and of the calibration sequence.
  • the amplification reaction then produces two amplification products: a bioagent identifying amplicon and a calibration amplicon.
  • the bioagent identifying amplicon and the calibration amplicon should be distinguishable by molecular mass while being amplified at essentially the same rate.
  • Differential molecular masses can be achieved by choosing, as a calibration sequence, a representative bioagent identifying amplicon (from a specific species of bioagent) and performing, for example, a 2-8 nucleobase deletion or insertion within the variable region between the two priming sites.
  • the amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by mass spectrometry, for example.
  • the resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence.
  • the molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent and the abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample.
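  • The quantity calculation can be sketched as a simple ratio, under the stated assumption that the bioagent nucleic acid and the calibration sequence amplify at essentially the same rate. The function and parameter names are hypothetical, and the abundance values stand for mass spectral signal amplitudes in any consistent unit.

```python
def estimate_bioagent_quantity(calibrant_copies, bioagent_abundance, calibrant_abundance):
    """Estimate starting copies of the bioagent from one internal calibrant.

    Assumes equal amplification efficiency, so the ratio of amplicon abundances
    equals the ratio of starting template quantities.
    """
    if calibrant_abundance <= 0:
        # No calibration amplicon indicates a failed amplification or analysis step.
        raise ValueError("no measurable calibration amplicon")
    return calibrant_copies * bioagent_abundance / calibrant_abundance
```

  • For example, estimate_bioagent_quantity(100, 3.0, 1.5) scales the 100 calibrant copies by the 2:1 signal ratio.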
  • construction of a standard curve where the amount of calibration polynucleotide spiked into the sample is varied provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample.
  • the use of standard curves for analytical determination of molecular quantities is well known to one with ordinary skill and can be performed without undue experimentation.
  • multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple primer pairs which also amplify the corresponding standard calibration sequences.
  • the standard calibration sequences are optionally included within a single vector which functions as the calibration polynucleotide. Multiplex amplification methods are well known to those with ordinary skill and can be performed without undue experimentation.
  • the calibrant polynucleotide is used as an internal positive control to confirm that amplification conditions and subsequent analysis steps are successful in producing a measurable amplicon. Even in the absence of copies of the genome of a bioagent, the calibration polynucleotide should give rise to a calibration amplicon. Failure to produce a measurable calibration amplicon indicates a failure of amplification or of a subsequent analysis step such as amplicon purification or molecular mass determination. Reaching a conclusion that such failures have occurred is, in itself, a useful event.
  • the calibration sequence is comprised of DNA. In some embodiments, the calibration sequence is comprised of RNA. In some embodiments, the calibration sequence is SEQ ID NO.
  • the calibration sequence is SEQ ID NO. 1562 (Figure 13B). In further embodiments, the calibration sequence is SEQ ID NO. 1563 (Figure 13C). In additional embodiments, the calibration sequence is SEQ ID NO. 1564 (Figure 13D).
  • the calibration sequence is inserted into a vector that itself functions as the calibration polynucleotide. In some embodiments, more than one calibration sequence is inserted into the vector that functions as the calibration polynucleotide.
  • such a calibration polynucleotide is herein termed a "combination calibration polynucleotide."
  • the process of inserting polynucleotides into vectors is routine to those skilled in the art and can be accomplished without undue experimentation. Thus, it should be recognized that the calibration method should not be limited to the embodiments described herein.
  • the calibration method can be applied for determination of the quantity of any bioagent identifying amplicon when an appropriate standard calibrant polynucleotide sequence is designed and used.
  • the process of choosing an appropriate vector for insertion of a calibrant is also a routine operation that can be accomplished by one with ordinary skill without undue experimentation.
  • the primer pairs produce bioagent identifying amplicons within stable and highly conserved regions of bacteria.
  • the advantage to characterization of an amplicon defined by priming regions that fall within a highly conserved region is that there is a low probability that the region will evolve past the point of primer recognition, in which case, the primer hybridization of the amplification step would fail.
  • Such a primer set is thus useful as a broad range survey-type primer.
  • the intelligent primers produce bioagent identifying amplicons including a region which evolves more quickly than the stable region described above.
  • the advantage of characterizing a bioagent identifying amplicon corresponding to an evolving genomic region is that it is useful for distinguishing emerging strain variants or the presence of virulence genes, drug resistance genes, or codon mutations that induce drug resistance.
  • the methods disclosed herein have significant advantages as a platform for identification of diseases caused by emerging bacterial strains such as, for example, drug-resistant strains of Staphylococcus aureus.
  • the methods disclosed herein eliminate the need for prior knowledge of bioagent sequence to generate hybridization probes. This is possible because the methods are not confounded by naturally occurring evolutionary variations occurring in the sequence acting as the template for production of the bioagent identifying amplicon. Measurement of molecular mass and determination of base composition is accomplished in an unbiased manner without sequence prejudice.
  • Another embodiment also provides a means of tracking the spread of a bacterium, such as a particular drug-resistant strain when a plurality of samples obtained from different locations are analyzed by the methods described above in an epidemiological setting.
  • a plurality of samples from a plurality of different locations is analyzed with primer pairs which produce bioagent identifying amplicons, a subset of which contains a specific drug-resistant bacterial strain.
  • the corresponding locations of the members of the drug-resistant strain subset indicate the spread of the specific drug-resistant strain to the corresponding locations.
  • kits for carrying out the methods described herein may comprise a sufficient quantity of one or more primer pairs to perform an amplification reaction on a target polynucleotide from a bioagent to form a bioagent identifying amplicon.
  • the kit may comprise from one to fifty primer pairs, from one to twenty primer pairs, from one to ten primer pairs, or from two to five primer pairs.
  • the kit may comprise one or more primer pairs recited in Table 2.
  • the kit comprises one or more broad range survey primer(s), division wide primer(s), or drill-down primer(s), or any combination thereof. If a given problem involves identification of a specific bioagent, the solution to the problem may require the selection of a particular combination of primers to provide the solution to the problem.
  • a kit may be designed so as to comprise particular primer pairs for identification of a particular bioagent.
  • a drill-down kit may be used, for example, to distinguish different genotypes or strains, drug-resistant, or otherwise.
  • the primer pair components of any of these kits may be additionally combined to comprise additional combinations of broad range survey primers and division-wide primers so as to be able to identify a bacterium.
  • the kit contains standardized calibration polynucleotides for use as internal amplification calibrants. Internal calibrants are described in commonly owned PCT Publication Number WO 2005/098047 which is incorporated herein by reference in its entirety.
  • the kit comprises a sufficient quantity of reverse transcriptase (if RNA is to be analyzed for example), a DNA polymerase, suitable nucleoside triphosphates (including alternative dNTPs such as inosine or modified dNTPs such as the 5-propynyl pyrimidines or any dNTP containing molecular mass-modifying tags such as those described above), a DNA ligase, and/or reaction buffer, or any combination thereof, for the amplification processes described above.
  • a kit may further include instructions pertinent for the particular embodiment of the kit, such instructions describing the primer pairs and amplification conditions for operation of the method.
  • a kit may also comprise amplification reaction containers such as microcentrifuge tubes and the like.
  • a kit may also comprise reagents or other materials for isolating bioagent nucleic acid or bioagent identifying amplicons from amplification, including, for example, detergents, solvents, or ion exchange resins which may be linked to magnetic beads.
  • a kit may also comprise a table of measured or calculated molecular masses and/or base compositions of bioagents using the primer pairs of the kit.
  • kits that contain one or more survey bacterial primer pairs represented by primer pair compositions wherein each member of each pair of primers has 70% to 100% sequence identity with the corresponding member from the group of primer pairs represented by any of the primer pairs of Table 5.
  • the survey primer pairs may include broad range primer pairs which hybridize to ribosomal RNA, and may also include division-wide primer pairs which hybridize to housekeeping genes such as rplB, tufB, rpoB, rpoC, valS, and infB, for example.
  • a kit may contain one or more survey bacterial primer pairs and one or more triangulation genotyping analysis primer pairs such as the primer pairs of Tables 8, 12, 14, 19, 21, 23, or 24.
  • the kit may represent a less expansive genotyping analysis but include triangulation genotyping analysis primer pairs for more than one genus or species of bacteria.
  • a kit for surveying nosocomial infections at a health care facility may include, for example, one or more broad range survey primer pairs, one or more division wide primer pairs, one or more Acinetobacter baumannii triangulation genotyping analysis primer pairs and one or more Staphylococcus aureus triangulation genotyping analysis primer pairs.
  • One with ordinary skill will be capable of analyzing in silico amplification data to determine which primer pairs will be able to provide optimal identification resolution for the bacterial bioagents of interest.
  • a kit may be assembled for identification of strains of bacteria involved in contamination of food.
  • a kit may be assembled for identification of sepsis-causing bacteria.
  • An example of such a kit embodiment is a kit comprising one or more of the primer pairs of Table 25 which provide for a broad survey of sepsis-causing bacteria.
  • kits are 96-well or 384-well plates with a plurality of wells containing any or all of the following components: dNTPs, buffer salts, Mg2+, betaine, and primer pairs.
  • a polymerase is also included in the plurality of wells of the 96-well or 384-well plates.
  • kits contain instructions for PCR and mass spectrometry analysis of amplification products obtained using the primer pairs of the kits.
  • kits include a barcode which uniquely identifies the kit and the components contained therein according to production lots and may also include any other information relative to the components such as concentrations, storage temperatures, etc.
  • the barcode may also include analysis information to be read by optical barcode readers and sent to a computer controlling amplification, purification and mass spectrometric measurements.
  • the barcode provides access to a subset of base compositions in a base composition database which is in digital communication with base composition analysis software such that a base composition measured with primer pairs from a given kit can be compared with known base compositions of bioagent identifying amplicons defined by the primer pairs of that kit.
  • the kit contains a database of base compositions of bioagent identifying amplicons defined by the primer pairs of the kit.
  • the database is stored on a convenient computer readable medium such as a compact disk or USB drive, for example.
  • the kit includes a computer program stored on a computer formatted medium (such as a compact disk or portable USB disk drive, for example) comprising instructions which direct a processor to analyze data obtained from the use of the primer pairs disclosed herein.
  • the instructions of the software transform data related to amplification products into a molecular mass or base composition which is a useful concrete and tangible result used in identification and/or classification of bioagents.
  • the kits contain all of the reagents sufficient to carry out one or more of the methods described herein.
  • bacterial genome segment sequences are obtained, aligned and scanned for regions where pairs of PCR primers would amplify products of about 27 to about 200 nucleotides in length and distinguish subgroups and/or individual strains from each other by their molecular masses or base compositions.
  • a typical process shown in Figure 1 is employed for this type of analysis.
  • a database of expected base compositions for each primer region is generated using an in silico PCR search algorithm, such as ePCR.
  • An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs.
  • Table 2 represents a collection of primers (sorted by primer pair number) designed to identify bacteria using the methods described herein.
  • the primer pair number is an in-house database index number. Primer sites were identified on segments of genes, such as, for example, the 16S rRNA gene.
  • the forward or reverse primer name shown in Table 2 indicates the gene region of the bacterial genome to which the primer hybridizes relative to a reference sequence.
  • the forward primer name 16S_EC_1077_1106_F indicates that the forward primer (_F) hybridizes to residues 1077-1106 of the reference sequence represented by a sequence extraction of coordinates 4033120..4034661 from GenBank gi number 16127994 (as indicated in Table 3).
  • the forward primer name BONTA_X52066_450_473 indicates that the primer hybridizes to residues 450-473 of the gene encoding Clostridium botulinum neurotoxin type A (BoNT/A) represented by GenBank Accession No. X52066 (primer pair name codes appearing in Table 2 are defined in Table 3).
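  • The naming convention just described can be decoded mechanically. The sketch below is an assumption about the general pattern, inferred only from the two names quoted above and the TMOD designation mentioned elsewhere in this document: a name code, start and end coordinates on the reference sequence, an optional TMOD marker, and an optional _F or _R suffix. The PrimerName type and the parse_primer_name function are hypothetical, and other names in Table 2 may not follow this pattern.

```python
import re
from typing import NamedTuple, Optional


class PrimerName(NamedTuple):
    code: str                 # primer pair name code, defined in Table 3
    start: int                # first hybridization residue on the reference
    end: int                  # last hybridization residue on the reference
    t_modified: bool          # True when a TMOD marker is present
    direction: Optional[str]  # "F", "R", or None when no suffix is given


_PATTERN = re.compile(
    r"(?P<code>.+?)_(?P<start>\d+)_(?P<end>\d+)"
    r"(?:_(?P<tmod>TMOD))?(?:_(?P<direction>[FR]))?"
)


def parse_primer_name(name):
    """Split a primer name such as 16S_EC_1077_1106_F into its parts."""
    match = _PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return PrimerName(
        code=match["code"],
        start=int(match["start"]),
        end=int(match["end"]),
        t_modified=match["tmod"] is not None,
        direction=match["direction"],
    )
```

  • The parsed code can then be looked up in Table 3 to find the reference sequence against which the coordinates are defined.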
  • GenBank Accession Numbers for reference sequences of bacteria are shown in Table 3 (below). In some cases, the reference sequences are extractions from bacterial genomic sequences or complements thereof.
  • Primer pair name codes and reference sequences are shown in Table 3.
  • the primer name code typically represents the gene to which the given primer pair is targeted.
  • the primer pair name may include specific coordinates with respect to a reference sequence defined by an extraction of a section of sequence or defined by a GenBank gi number, or the corresponding complementary sequence of the extraction, or the entire GenBank gi number as indicated by the label "no extraction." Where "no extraction" is indicated for a reference sequence, the coordinates of a primer pair named to the reference sequence are with respect to the GenBank gi listing. Gene abbreviations are shown in bold type in the "Gene Name" column.
  • Alignments can be done using a bioinformatics tool such as BLASTn provided to the public by NCBI (Bethesda, MD).
  • a relevant GenBank sequence may be downloaded and imported into custom programmed or commercially available bioinformatics programs wherein the alignment can be carried out to determine the primer hybridization coordinates and the sequences, molecular masses and base compositions of the amplification product.
  • primer pair number 2095 (SEQ ID NOs: 456:1261)
  • First the forward primer (SEQ ID NO: 456) is subjected to a BLASTn search on the publicly available NCBI BLAST website.
  • RefSeq_Genomic is chosen as the BLAST database since the gi numbers refer to genomic sequences.
  • the hybridization coordinates of the reverse primer (SEQ ID NO: 1261) can be determined in a similar manner and thus, the bioagent identifying amplicon can be defined in terms of genomic coordinates.
  • Table 3 contains sufficient information to determine the primer hybridization coordinates of any of the primers of Table 2 to the applicable reference sequences described therein.
  • artificial reference sequences represent concatenations of partial gene extractions from the indicated reference gi number. Partial sequences were used to create the concatenated sequence because complete gene sequences were not necessary for primer design.
  • Genomic DNA is prepared from samples using the DNeasy Tissue Kit (Qiagen, Valencia, CA) according to the manufacturer's protocols.
  • PCR reactions are assembled in 50 µL reaction volumes in a 96-well microtiter plate format using a Packard MPII liquid handling robotic platform and M.J. Dyad thermocyclers (MJ Research, Waltham, MA) or Eppendorf Mastercycler thermocyclers (Eppendorf, Westbury, NY).
  • the PCR reaction mixture generally consists of 4 units of Amplitaq Gold, 1x buffer II (Applied Biosystems, Foster City, CA), 1.5 mM MgCl2, 0.4 M betaine, 800 µM dNTP mixture and 250 nM of each primer.
  • the following typical PCR conditions are generally used: 95°C for 10 min followed by 8 cycles of 95°C for 30 seconds, 48°C for 30 seconds, and 72°C for 30 seconds with the 48°C annealing temperature increasing 0.9°C with each of the eight cycles. The reaction is then continued for 37 additional cycles of 95°C for 15 seconds, 56°C for 20 seconds, and 72°C for 20 seconds.
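  • The cycling program above can be written out step by step, for example to configure a thermocycler or to estimate run time. This sketch assumes that the first of the eight cycles anneals at 48°C and that each later cycle adds 0.9°C; the function name and the (temperature, seconds) representation are hypothetical, and ramp times between steps are ignored.

```python
def build_thermocycle_program(touch_cycles=8, start_anneal_c=48.0,
                              anneal_step_c=0.9, main_cycles=37):
    """Return the program as a list of (temperature_c, hold_seconds) steps."""
    steps = [(95.0, 600)]  # initial 10 min hold at 95 C
    for cycle in range(touch_cycles):
        # Assumed interpretation: cycle 0 anneals at the starting temperature.
        anneal = round(start_anneal_c + anneal_step_c * cycle, 1)
        steps += [(95.0, 30), (anneal, 30), (72.0, 30)]
    for _ in range(main_cycles):
        steps += [(95.0, 15), (56.0, 20), (72.0, 20)]
    return steps


def total_hold_seconds(steps):
    """Sum of hold times only; excludes temperature ramping."""
    return sum(seconds for _, seconds in steps)
```

  • Under this interpretation the eighth annealing step runs at 54.3°C, just below the 56°C annealing temperature of the remaining 37 cycles.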
  • the ESI-FTICR mass spectrometer is based on a Bruker Daltonics (Billerica, MA) Apex II 70e electrospray ionization Fourier transform ion cyclotron resonance mass spectrometer that employs an actively shielded 7 Tesla superconducting magnet.
  • the active shielding constrains the majority of the fringing magnetic field from the superconducting magnet to a relatively small volume.
  • components that might be adversely affected by stray magnetic fields such as CRT monitors, robotic components, and other electronics, can operate in close proximity to the FTICR spectrometer.
  • Ions are formed via electrospray ionization in a modified Analytica (Branford, CT) source employing an off-axis, grounded electrospray probe positioned approximately 1.5 cm from the metallized terminus of a glass desolvation capillary. The atmospheric pressure end of the glass capillary is biased at 6000 V relative to the ESI needle during data acquisition. A counter-current flow of dry N2 is employed to assist in the desolvation process. Ions are accumulated in an external ion reservoir comprised of an rf-only hexapole, a skimmer cone, and an auxiliary gate electrode, prior to injection into the trapped ion cell where they are mass analyzed.
  • Ionization duty cycles greater than 99% are achieved by simultaneously accumulating ions in the external ion reservoir during ion detection.
  • Each detection event consists of 1M data points digitized over 2.3 s.
  • the ESI-TOF mass spectrometer is based on a Bruker Daltonics MicroTOF™. Ions from the ESI source undergo orthogonal ion extraction and are focused in a reflectron prior to detection.
  • the TOF and FTICR are equipped with the same automated sample handling and fluidics described above. Ions are formed in the standard MicroTOF™ ESI source that is equipped with the same off-axis sprayer and glass capillary as the FTICR ESI source. Consequently, source conditions are the same as those described above. External ion accumulation is also employed to improve ionization duty cycle during data acquisition. Each detection event on the TOF consists of 75,000 data points digitized over 75 µs.
  • the sample delivery scheme allows sample aliquots to be rapidly injected into the electrospray source at high flow rate and subsequently be electrosprayed at a much lower flow rate for improved ESI sensitivity.
  • Prior to injecting a sample, a bolus of buffer is injected at a high flow rate to rinse the transfer line and spray needle to avoid sample contamination/carryover.
  • the autosampler injects the next sample and the flow rate is switched to low flow.
  • data acquisition commences. As spectra are co-added, the autosampler continues rinsing the syringe and picking up buffer to rinse the injector and sample transfer line.
  • one 99-mer nucleic acid strand having a base composition of A27G30C21T21 has a theoretical molecular mass of 30779.058 while another 99-mer nucleic acid strand having a base composition of A26G31C22T20 has a theoretical molecular mass of 30780.052.
  • a 1 Da difference in molecular mass may be within the experimental error of a molecular mass measurement and thus, the relatively narrow molecular mass range of the four natural nucleobases imposes an uncertainty factor.
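  • The near-degeneracy of these two compositions can be checked with a short calculation. The residue masses below are approximate average values chosen for illustration, so the absolute masses they imply need not match the theoretical values quoted above; only the difference between the two strands is computed, since end-group terms cancel when both strands have the same termini. The function name is hypothetical.

```python
# Approximate average residue masses (Da); assumed values for illustration.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}


def composition_mass_difference(first, second):
    """Mass of `second` minus mass of `first` for two {base: count} dicts."""
    return sum(RESIDUE_MASS[b] * (second[b] - first[b]) for b in RESIDUE_MASS)


strand_1 = {"A": 27, "G": 30, "C": 21, "T": 21}
strand_2 = {"A": 26, "G": 31, "C": 22, "T": 20}

# Going from strand_1 to strand_2 swaps one A for a G and one T for a C.
# The G-for-A gain and the C-for-T loss nearly cancel, leaving about 1 Da.
delta = composition_mass_difference(strand_1, strand_2)
```

  • Because a gain of roughly 16 Da is offset by a loss of roughly 15 Da, the two 99-mers differ by only about 1 Da, in line with the difference between the theoretical masses given above.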
  • nucleobase as used herein is synonymous with other terms in use in the art including "nucleotide," "deoxynucleotide," "nucleotide residue," "deoxynucleotide residue," "nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate (dNTP)."
  • Mass spectra of bioagent-identifying amplicons were analyzed independently using a maximum-likelihood processor, such as is widely used in radar signal processing.
  • This processor, referred to as GenX, first makes maximum likelihood estimates of the input to the mass spectrometer for each primer by running matched filters for each base composition aggregate on the input data. This includes the GenX response to a calibrant for each primer.
  • the algorithm emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-alarm plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants.
  • Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents.
  • a genomic sequence database is used to define the mass base count matched filters. The database contains the sequences of known bacterial bioagents and includes threat organisms as well as benign background organisms. The latter is used to estimate and subtract the spectral signature produced by the background organisms.
  • a maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this "cleaned up" data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise covariance for the cleaned up data.
  • Base count blurring can be carried out as follows. "Electronic PCR" can be conducted on nucleotide sequences of the desired bioagents to obtain the different expected base counts that could be obtained for each primer pair. See for example, ncbi.nlm.nih.gov/sutils/e-pcr/; Schuler, Genome Res. 7:541-50, 1997.
  • one or more spreadsheets such as Microsoft Excel workbooks contain a plurality of worksheets. First in this example, there is a worksheet with a name similar to the workbook name; this worksheet contains the raw electronic PCR data.
  • a second worksheet, named "filtered bioagents base count," contains bioagent name and base count; there is a separate record for each strain after removing sequences that are not identified with a genus and species and removing all sequences for bioagents with fewer than 10 strains.
  • Application of an exemplary script involves the user defining a threshold that specifies the fraction of the strains that are represented by the reference set of base counts for each bioagent.
  • the reference set of base counts for each bioagent may contain as many different base counts as are needed to meet or exceed the threshold.
  • the set of reference base counts is defined by taking the most abundant strain's base type composition and adding it to the reference set and then the next most abundant strain's base type composition is added until the threshold is met or exceeded.
  • the current set of data was obtained using a threshold of 55%, which was obtained empirically.
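  • The thresholding rule for building a reference set can be sketched as follows. This is an interpretation rather than the exemplary script itself: it assumes "most abundant" means the base count shared by the largest number of strains, and the function name and input layout are hypothetical.

```python
from collections import Counter


def reference_base_counts(strain_base_counts, threshold=0.55):
    """Pick the most common base counts until they cover `threshold` of strains.

    strain_base_counts holds one (A, G, C, T) tuple per strain of one bioagent.
    """
    if not strain_base_counts:
        return []
    tally = Counter(strain_base_counts)
    total = len(strain_base_counts)
    reference, covered = [], 0
    for base_count, n_strains in tally.most_common():
        reference.append(base_count)
        covered += n_strains
        if covered / total >= threshold:
            break  # the threshold is met or exceeded
    return reference
```

  • With ten strains of which five share one base count and three share another, a 55% threshold selects both of those base counts, since the first alone covers only 50% of the strains.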
  • Example 6 Use of Broad Range Survey and Division Wide Primer Pairs for Identification of Bacteria in an Epidemic Surveillance Investigation
  • This investigation employed a set of 16 primer pairs which is herein designated the "surveillance primer set" and comprises broad range survey primer pairs, division wide primer pairs and a single Bacillus clade primer pair.
  • the surveillance primer set is shown in Table 5 and consists of primer pairs originally listed in Table 2.
  • This surveillance set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.
  • Primer pair 449 (non-T modified) has been modified twice. Its predecessors are primer pairs 70 and 357, displayed below in the same row.
  • Primer pair 360 has also been modified twice and its predecessors are primer pairs 17 and 118.
  • Table 5 Bacterial Primer Pairs of the Surveillance Primer Set
  • the 16 primer pairs of the surveillance set are used to produce bioagent identifying amplicons whose base compositions are sufficiently different amongst all known bacteria at the species level to identify, at a reasonable confidence level, any given bacterium at the species level.
  • common respiratory bacterial pathogens can be distinguished by the base compositions of bioagent identifying amplicons obtained using the 16 primer pairs of the surveillance set.
  • triangulation identification improves the confidence level for species assignment.
  • nucleic acid from Streptococcus pyogenes can be amplified by nine of the sixteen surveillance primer pairs and Streptococcus pneumoniae can be amplified by ten of the sixteen surveillance primer pairs.
  • the base compositions of the bioagent identifying amplicons are identical for only one of the analogous bioagent identifying amplicons and differ in all of the remaining analogous bioagent identifying amplicons by up to four bases per bioagent identifying amplicon.
  • the resolving power of the surveillance set was confirmed by determination of base compositions for 120 isolates of respiratory pathogens representing 70 different bacterial species and the results indicated that natural variations (usually only one or two base substitutions per bioagent identifying amplicon) amongst multiple isolates of the same species did not prevent correct identification of major pathogenic organisms at the species level.
  • Bacillus anthracis is a well known biological warfare agent which has emerged in domestic terrorism in recent years. Since it was envisioned to produce bioagent identifying amplicons for identification of Bacillus anthracis, additional drill-down analysis primers were designed to target genes present on virulence plasmids of Bacillus anthracis so that additional confidence could be reached in positive identification of this pathogenic organism. Three drill-down analysis primers were designed and are listed in Tables 2 and 6. In Table 6, the drill-down set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.
  • Physical coverage of bacterial space of the sixteen surveillance primers of Table 5 and the three Bacillus anthracis drill-down primers of Table 6 is shown in Figure 3, which lists common pathogenic bacteria.
  • Figure 3 is not meant to be comprehensive in illustrating all species identified by the primers. Only pathogenic bacteria are listed as representative examples of the bacterial species that can be identified by primers and methods disclosed herein.
  • Nucleic acid of groups of bacteria enclosed within the polygons of Figure 3 can be amplified to obtain bioagent identifying amplicons using the primer pair numbers listed in the upper right hand corner of each polygon. Primer coverage for polygons within polygons is additive.
  • bioagent identifying amplicons can be obtained for Chlamydia trachomatis by amplification with, for example, primer pairs 346-349, 360 and 361, but not with any of the remaining primers of the surveillance primer set.
  • bioagent identifying amplicons can be obtained from nucleic acid originating from Bacillus anthracis (located within 5 successive polygons) using, for example, any of the following primer pairs: 346-349, 360, 361 (base polygon), 356, 449 (second polygon), 352 (third polygon), 355 (fourth polygon), 350, 351 and 353 (fifth polygon).
  • the third set were historical samples, including twenty-seven isolates of group A Streptococcus, from disease outbreaks at this and other military training facilities during previous years.
  • the fourth set of samples was collected from five geographically separated military facilities in the continental U.S. in the winter immediately following the severe November/December 2002 outbreak.
  • FIG. 4 is a 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.
  • primer pair number 356 (SEQ ID NOs: 449: 1380) primarily amplifies the nucleic acid of members of the classes Bacilli and Clostridia and is not expected to amplify proteobacteria such as Neisseria meningitidis and Haemophilus influenzae.
  • As expected, analysis of the mass spectrum of amplification products obtained with primer pair number 356 does not indicate the presence of Neisseria meningitidis and Haemophilus influenzae but does indicate the presence of Streptococcus pyogenes (Figures 3 and 6, Table 7B). Thus, these primers or types of primers can confirm the absence of particular bioagents from a sample.
  • the 15 throat swabs from military recruits were found to contain a relatively small set of microbes in high abundance. The most common were Haemophilus influenzae, Neisseria meningitidis, and Streptococcus pyogenes. Staphylococcus epidermidis, Moraxella catarrhalis, Corynebacterium pseudodiphtheriticum, and Staphylococcus aureus were present in fewer samples. An equal number of samples from healthy volunteers from three different geographic locations were identically analyzed.
  • Example 7 Triangulation Genotyping Analysis for Determination of emm-Type of Streptococcus pyogenes in Epidemic Surveillance
  • MLST Typing
  • classic MLST analysis internal fragments of several housekeeping genes are amplified and sequenced (Enright et al. Infection and Immunity, 2001, 69, 2416-2427).
  • bioagent identifying amplicons from housekeeping genes were produced using drill-down primers and analyzed by mass spectrometry. Since mass spectral analysis results in molecular mass, from which base composition can be determined, the challenge was to determine whether resolution of emm classification of strains of Streptococcus pyogenes could be determined.
  • This drill-down set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.
  • the primers of Table 8 were used to produce bioagent identifying amplicons from nucleic acid present in the clinical samples.
  • the bioagent identifying amplicons which were subsequently analyzed by mass spectrometry and base compositions corresponding to the molecular masses were calculated.
  • Table 9C Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus samples from Six Military Installations Obtained with Primer Pair Nos. 438 and 441
  • Example 8 Design of Calibrant Polynucleotides based on Bioagent Identifying Amplicons for Identification of Species of Bacteria (Bacterial Bioagent Identifying Amplicons)
  • This example describes the design of 19 calibrant polynucleotides based on bacterial bioagent identifying amplicons corresponding to the primers of the broad surveillance set (Table 5) and the Bacillus anthracis drill-down set (Table 6).
  • Calibration sequences were designed to simulate bacterial bioagent identifying amplicons produced by the T modified primer pairs shown in Tables 5 and 6 (primer names have the designation "TMOD").
  • the calibration sequences were chosen as a representative member of the section of bacterial genome from specific bacterial species which would be amplified by a given primer pair.
  • the model bacterial species upon which the calibration sequences are based are also shown in Table 10.
  • the calibration sequence chosen to correspond to an amplicon produced by primer pair no. 361 is SEQ ID NO: 1445.
  • the forward (F) or reverse (R) primer name indicates the coordinates of an extraction representing a gene of a standard reference bacterial genome to which the primer hybridizes. For example, the forward primer name 16S_EC_713_732_TMOD_F indicates that the forward primer hybridizes to residues 713-732 of the gene encoding 16S ribosomal RNA in an E. coli reference sequence (in this case, the reference sequence is an extraction consisting of residues 4033120-4034661 of the genomic sequence of E. coli K12, GenBank gi number 16127994). Additional gene coordinate reference information is shown in Table 11.
  • the designation "TMOD" in the primer names indicates that the 5' end of the primer has been modified with a non-matched template T residue which prevents the PCR polymerase from adding non-templated adenosine residues to the 5' end of the amplification product, an occurrence which may result in miscalculation of base composition from molecular mass data (vide supra).
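The primer naming convention can be illustrated with a short parser. This is a sketch that assumes every name follows the layout of the single example given (gene, reference organism code, start, end, optional TMOD, direction); names with additional fields would need a more permissive pattern. The `PrimerName` type and `parse_primer_name` function are hypothetical helpers.

```python
import re
from typing import NamedTuple


class PrimerName(NamedTuple):
    gene: str          # e.g. "16S"
    reference: str     # reference organism code, e.g. "EC" for E. coli
    start: int         # first residue of the hybridization site
    end: int           # last residue of the hybridization site
    t_modified: bool   # True when the name carries the TMOD designation
    direction: str     # "forward" or "reverse"


# Assumed layout: GENE_REF_START_END[_TMOD]_F|R
_PATTERN = re.compile(
    r"^(?P<gene>[^_]+)_(?P<ref>[^_]+)_(?P<start>\d+)_(?P<end>\d+)"
    r"(?P<tmod>_TMOD)?_(?P<dir>[FR])$"
)


def parse_primer_name(name):
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return PrimerName(
        gene=match["gene"],
        reference=match["ref"],
        start=int(match["start"]),
        end=int(match["end"]),
        t_modified=match["tmod"] is not None,
        direction="forward" if match["dir"] == "F" else "reverse",
    )


primer = parse_primer_name("16S_EC_713_732_TMOD_F")
print(primer)
print("hybridization site length:", primer.end - primer.start + 1)
```

For the example name, the hybridization site spans residues 713 to 732 of the reference gene, that is, 20 residues, not counting the added 5' T.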
  • a calibration amplicon based on primer pair 346 (16S rRNA) will be produced in an amplification reaction with primer pair 346 and a calibration amplicon based on primer pair 363 (rpoC) will be produced with primer pair 363.
  • Example 9 Use of a Calibration Polynucleotide for Determining the Quantity of Bacillus Anthracis in a Sample Containing a Mixture of Microbes
  • the process described in this example is shown in Figure 2.
  • the capC gene is a gene involved in capsule synthesis which resides on the pXO2 plasmid of Bacillus anthracis.
  • Primer pair number 350 (see Tables 10 and 11) was designed to identify Bacillus anthracis via production of a bacterial bioagent identifying amplicon.
  • Known quantities of the combination calibration polynucleotide vector described in Example 8 were added to amplification mixtures containing bacterial bioagent nucleic acid from a mixture of microbes which included the Ames strain of Bacillus anthracis. Upon amplification of the bacterial bioagent nucleic acid and the combination calibration polynucleotide vector with primer pair no.
  • bacterial bioagent identifying amplicons and calibration amplicons were obtained and characterized by mass spectrometry.
  • a mass spectrum measured for the amplification reaction is shown in Figure 7.
  • the molecular masses of the bioagent identifying amplicons provided the means for identification of the bioagent from which they were obtained (Ames strain of Bacillus anthracis) and the molecular masses of the calibration amplicons provided the means for their identification as well.
  • the relationship between the abundance (peak height) of the calibration amplicon signals and the bacterial bioagent identifying amplicon signals provides the means of calculation of the copies of the pXO2 plasmid of the Ames strain of Bacillus anthracis. Methods of calculating quantities of molecules based on internal calibration procedures are well known to those of ordinary skill in the art.
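One simple form of such an internal calibration is a peak-height ratio, sketched below. The text does not specify the calculation, so this assumes that the calibrant and the target amplify with equal efficiency and that peak height scales linearly with copy number; the function name and the readings are hypothetical.

```python
def estimate_target_copies(calibrant_copies, calibrant_peak_height, target_peak_height):
    """Estimate target copies from the ratio of target to calibrant peak heights.

    Assumes equal amplification efficiency for both templates and a linear
    detector response; real assays may need a fitted calibration curve.
    """
    if calibrant_peak_height <= 0:
        raise ValueError("calibrant peak height must be positive")
    return calibrant_copies * target_peak_height / calibrant_peak_height


# Hypothetical readings: 100 calibrant copies added, peak heights in arbitrary units.
print(estimate_target_copies(100, calibrant_peak_height=2.0e4, target_peak_height=5.0e4))
```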
  • The optical density of the isolated genomic material is measured in order to estimate the number of genome copies present in the sample. Serial dilutions are then performed to obtain a maximum concentration of 200 genome copies per microliter.
  • a stock solution of Taq polymerase is prepared such that 3 units of Taq polymerase per microliter are present in the final reaction mixture. An aliquot of 40 microliters of this stock solution is mixed with 40 microliters of the diluted genomic DNA in an Eppendorf tube. A volume of 10 microliters of the mixture is then added to a well of a 96-well plate containing primer pairs used for obtaining amplification products corresponding to bioagent identifying amplicons. The plate is sealed and centrifuged at 800 rpm for one minute prior to beginning the PCR cycle.
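The dilution arithmetic in this preparation can be checked with a short calculation. The volumes and the 200 copies per microliter ceiling are taken from the text; the function names are hypothetical, and the sketch ignores pipetting losses.

```python
def copies_per_well(dna_copies_per_ul, dna_volume_ul=40.0,
                    polymerase_volume_ul=40.0, aliquot_ul=10.0):
    """Genome copies delivered to one well after mixing diluted DNA with the
    polymerase stock and dispensing one aliquot of the mixture."""
    mixture_volume_ul = dna_volume_ul + polymerase_volume_ul
    mixture_copies_per_ul = dna_copies_per_ul * dna_volume_ul / mixture_volume_ul
    return mixture_copies_per_ul * aliquot_ul


def wells_per_mixture(dna_volume_ul=40.0, polymerase_volume_ul=40.0, aliquot_ul=10.0):
    """Number of full aliquots that one mixture can supply."""
    return int((dna_volume_ul + polymerase_volume_ul) // aliquot_ul)


print(copies_per_well(200.0))   # upper bound on genome copies per well
print(wells_per_mixture())      # wells that one 80 microliter mixture can fill
```

At the maximum concentration, the 1:1 mixture holds 100 copies per microliter, so each 10 microliter aliquot carries at most 1000 genome copies, and one mixture fills eight wells.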
  • Example 11 Selection of Primer Pairs for Genotyping of Members of the Bacterial Genus Mycobacterium and for Identification of Drug-Resistant Strains of Mycobacterium tuberculosis
  • In Mycobacterium tuberculosis, the acquisition of drug resistance is mostly associated with the emergence of discrete key mutations that can be unambiguously determined using the methods disclosed herein. The evolution of the Mycobacterium tuberculosis genome is essentially clonal, thus allowing strain typing through the query of distinct genomic markers that are lineage-specific and only vertically inherited. Co-infections of mixed populations of genotypes of Mycobacterium tuberculosis can be revealed simultaneously in the mass spectra of amplification products produced using the primers of Table 12. The high G+C content of the Mycobacterium tuberculosis genome itself greatly facilitates the development of short, efficient primers which are appropriate for multiplexing (inclusion of a plurality of primers in each amplification reaction mixture).
  • Table 12 Primer Pairs for Genotyping and Determination of Drug Resistance of Strains of
  • primer pairs are designed to be multiplexed into 8 amplification reactions. Thirteen primer pairs were designed with the objective of identifying mutations associated with resistance to drugs including rifampin (primer pair numbers 3546, 3547 and 3548), ethambutol (primer pair numbers 3550 and 3551), isoniazid (primer pair numbers 3552 and 3553), fluoroquinolone (primer pair numbers 3555 and 3556), streptomycin (primer pair number 3557) and pyrazinamide (primer pair numbers 3558, 3559, 3560 and 3561).
  • Four of these thirteen primer pairs were specifically designed to provide bioagent identifying amplicons for base composition analysis of single codons (primer pair numbers 3547 (rpoB codon D526), 3548 (rpoB codon H516), 3551 (embB codon M306), and 3553 (katG codon S315)).
  • detection of a mutation identifies a drug-resistant strain of Mycobacterium tuberculosis.
  • primer pair 3546 contains two rpoB codons: D526 and H516.
  • Shown in Table 13 are classifications of members of the bacterial genus Mycobacterium according to principal genetic group (PGG, determined using primer pair numbers 3554 and 3556), genotype of Mycobacterium tuberculosis, or species of selected other members of the genus Mycobacterium (determined using primer pair numbers 3581-3584, 3586, 3587 and 3599-3601), and drug resistance to rifampin, ethambutol, isoniazid, fluoroquinolone, streptomycin, and pyrazinamide.
  • codon mutations are indicated by the amino acid single letter code and codon position convention which is well known to those with ordinary skill in the art. For example, when nucleic acid of Mycobacterium tuberculosis strain 13599 is amplified using primer pair number 3555, and the molecular mass or base composition is determined, mutation of codon 90 from alanine (A) to valine (V) is indicated and the conclusion is drawn that strain 13599 is resistant to the drug fluoroquinolone.
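The single letter code and codon position convention can be made explicit with a small parser. This is an illustrative sketch; `parse_mutation` and `describe_mutation` are hypothetical helpers, and the pattern also accepts alternatives written with a slash, as in S315N/T.

```python
import re

AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartate",
    "C": "cysteine", "Q": "glutamine", "E": "glutamate", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

# Wild-type residue, codon position, then one or more variant residues.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def parse_mutation(code):
    """Split a code such as 'A90V' into (wild type, codon position, variants)."""
    match = _MUTATION.match(code)
    if match is None:
        raise ValueError(f"unrecognized mutation code: {code!r}")
    return match.group(1), int(match.group(2)), tuple(match.group(3).split("/"))


def describe_mutation(code):
    wild_type, position, variants = parse_mutation(code)
    names = " or ".join(f"{AMINO_ACIDS[v]} ({v})" for v in variants)
    return f"codon {position}: {AMINO_ACIDS[wild_type]} ({wild_type}) to {names}"


print(describe_mutation("A90V"))
print(describe_mutation("S315N/T"))
```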
  • Primer pair number 3600 is a speciation primer pair which is useful for distinguishing members of Mycobacterium tuberculosis PGG1 (including genotypes I, II and IIA) from other species of the genus Mycobacterium (such as, for example, Mycobacterium africanum, Mycobacterium bovis, Mycobacterium microti, and Mycobacterium canettii; see Figure 8).
  • Table 13 Classification and Drug Resistance Profiles of Strains of Members of the Genus Mycobacterium and Genotypes of Mycobacterium tuberculosis
  • Each primer pair was individually validated using the reference Mycobacterium tuberculosis strain H37Rv. Dilution To Extinction (DTE) experiments yielded the expected base composition down to 16 genomic copies per well.
  • a multiplexing scheme was then determined in order to spread into different wells the primer pairs targeting the same gene, to spread within a single well the expected amplicon masses, and to avoid cross-formation of primer duplexes.
  • the multiplexing scheme is shown in Table 14 where multiplexed amplification reactions are indicated in headings numbered A through H and the primer pairs utilized for each reaction are shown below.
  • An example of an experimentally determined table of base compositions is shown in Table 15.
  • Base compositions of amplification products obtained from nucleic acid isolated from Mycobacterium tuberculosis strain 5170 using the primer pair multiplex reactions indicated in Table 14 are shown.
  • Molecular masses of the amplification products were measured by electrospray time of flight mass spectrometry in order to calculate the base compositions. It should be noted that the lengths of the amplification products within each reaction mixture vary greatly in order to avoid overlap of molecular masses during the measurements.
  • reaction A has three amplification products which have lengths of 46 (A13 T11 C15 G07), 68 (A14 T18 C21 G15) and 129 (A21 T37 C44 G27).
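The lengths quoted for reaction A follow directly from the base compositions, which the sketch below checks, reading the first composition as A13 T11 C15 G07. The residue masses are approximate average values for deoxynucleotide residues within a DNA strand and are an assumption of this sketch, not figures from the text; end-group corrections are ignored, so the masses are rough single-strand estimates only.

```python
import re

# Approximate average residue masses in daltons (assumed textbook values).
RESIDUE_MASS = {"A": 313.21, "T": 304.20, "C": 289.18, "G": 329.21}


def parse_base_composition(text):
    """Turn a string such as 'A13 T11 C15 G07' into a dict of base counts."""
    return {base: int(count) for base, count in re.findall(r"([ATCG])(\d+)", text)}


def amplicon_length(composition):
    return sum(composition.values())


def approximate_strand_mass(composition):
    return sum(RESIDUE_MASS[base] * count for base, count in composition.items())


reaction_a = ["A13 T11 C15 G07", "A14 T18 C21 G15", "A21 T37 C44 G27"]
for text in reaction_a:
    composition = parse_base_composition(text)
    print(text, amplicon_length(composition), round(approximate_strand_mass(composition)))
```

The three lengths come out as 46, 68 and 129, matching the text, and the corresponding masses are well separated, which is the stated purpose of varying amplicon length within a well.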
  • Table 15 Base Compositions Obtained in the Multiplex Amplification Reactions of Nucleic Acid of Mycobacterium tuberculosis Strain 5170
  • Example 13 Diagnosis and Treatment of a Human Subject Infected with a Multi-Drug Resistant Strain of Mycobacterium tuberculosis
  • This example illustrates how the methods disclosed herein would be useful for diagnosis of a human infected with a drug resistant strain of Mycobacterium tuberculosis.
  • a sample is obtained from a human suspected of being infected with Mycobacterium tuberculosis. At this stage, the specific genotype or strain is not known.
  • the sample can be any sample appropriate for identifying a Mycobacterium tuberculosis infection in a human and can be obtained by established clinical methods known to those with ordinary skill in the art.
  • Nucleic acid can be isolated from the sample by known methods or by methods generally similar to those disclosed in Example 10.
  • the nucleic acid is then amplified by known methods or by methods generally similar to those disclosed in Example 2 to obtain amplification products corresponding to bioagent identifying amplicons which are defined, for example, by the primer pairs of Table 12 (whose sequences are shown in Table 2), or functional variants thereof.
  • the amplification products are purified by methods generally similar to that described in Example 3 and analyzed according to the methods described in Example 4, and, optionally, Example 5.
  • the quantity of Mycobacterium tuberculosis may be determined by preparing calibration polynucleotides for Mycobacterium tuberculosis using methods similar to those described in Example 9.
  • the series of base compositions of the amplification products obtained in the analyses indicates that the sample contains distinct populations of two strains of Mycobacterium tuberculosis.
  • the first strain belongs to PGG1 as indicated by base compositions of amplification products of primer pair numbers 3554 and 3556 and has genotype I as indicated by base compositions of amplification products of primer pair numbers 3581, 3582, 3583, 3584, 3586, 3587, 3599, 3600, and 3601. None of the drug resistance primer pairs indicate mutations of codons that confer drug resistance so it is concluded that the strain could be either of the known strains 14157 or 15042, neither of which is drug-resistant.
  • the second strain of Mycobacterium tuberculosis in the sample belongs to PGG1 as indicated by base compositions of amplification products of primer pair numbers 3554 and 3556 and has genotype II as indicated by base compositions of amplification products of primer pair numbers 3581, 3582, 3583, 3584, 3586, 3587, 3599, 3600, and 3601.
  • Drug resistance primer pairs 3546, 3547 and 3548 indicate the presence of a H528Y mutation indicating resistance to rifampin.
  • Drug resistance primer pairs 3550 and 3551 indicate the presence of a M307V mutation indicating resistance to ethambutol.
  • Drug resistance primer pair 3553 indicates the presence of a S315N/T mutation indicating resistance to isoniazid and drug resistance primer pair 3557 indicates the presence of a K43R mutation indicating resistance to streptomycin. It is then determined that this second strain could be strain 13598, a multi-drug resistant strain. Since this strain does not have resistance to fluoroquinolone or pyrazinamide, these drugs would, in theory, be appropriate to treat the individual by killing this strain and presumably would also be useful to kill the first strain, which is not resistant to any of the drugs listed in Table 13. The methods could be repeated over the time course of treatment of the subject with fluoroquinolone or pyrazinamide to investigate and verify the eradication of the infection. Likewise, other bacterial co-infections could be investigated using amplification products corresponding to bioagent identifying amplicons defined by other primer pairs disclosed in Table 2.
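The reasoning of this example, from detected mutations to candidate drugs, can be sketched as a lookup. The grouping of primer pairs by drug follows Example 11, with the isoniazid and fluoroquinolone pairs read as 3552/3553 and 3555/3556 (an assumption based on the numbers used elsewhere in the examples). The function and variable names are hypothetical, each mutation is recorded against a single primer pair for brevity, and the output is an illustration of the logic only, not clinical guidance.

```python
DRUG_PRIMER_PAIRS = {
    "rifampin": (3546, 3547, 3548),
    "ethambutol": (3550, 3551),
    "isoniazid": (3552, 3553),
    "fluoroquinolone": (3555, 3556),
    "streptomycin": (3557,),
    "pyrazinamide": (3558, 3559, 3560, 3561),
}


def resistance_profile(mutations_by_primer_pair):
    """Return (resistant, candidates).

    `mutations_by_primer_pair` maps a primer pair number to the resistance
    mutation it revealed. Markers that do not confer resistance should be
    left out of the input.
    """
    resistant = {}
    for drug, primer_pairs in DRUG_PRIMER_PAIRS.items():
        found = sorted({mutations_by_primer_pair[p] for p in primer_pairs
                        if mutations_by_primer_pair.get(p)})
        if found:
            resistant[drug] = found
    candidates = [drug for drug in DRUG_PRIMER_PAIRS if drug not in resistant]
    return resistant, candidates


# Findings for the second strain as described in this example.
second_strain = {3546: "H528Y", 3551: "M307V", 3553: "S315N/T", 3557: "K43R"}
resistant, candidates = resistance_profile(second_strain)
print("resistant to:", resistant)
print("no resistance mutation detected for:", candidates)
```

For the second strain, the drugs left without a detected resistance mutation are fluoroquinolone and pyrazinamide, in agreement with the conclusion drawn in the example.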
  • Example 14 Analysis of 102 Diverse Strains of Mycobacterium tuberculosis from the PHRC Collection.
  • tuberculosis genome are simultaneously queried in order to discriminate the different sub-species of the M. tuberculosis complex, down to the nine M. tuberculosis SNP-based clusters (Mathema B, et al., Molecular Epidemiology of Tuberculosis: Current Insights. Clin. Microbiol. Rev. (2006) 19:658-685).
  • the assay was tested using 102 diverse strains from the Public Health Research Institute (PHRC). We found that a 24-primer pair panel, which can be multiplexed into 8 PCR reactions, efficiently characterizes M. tuberculosis into the appropriate subspecies and provides the essential drug resistance profiling needed for prescribing the correct drugs and understanding the epidemiology of an outbreak.
  • Table 16 illustrates the genotype and drug-resistance profiles from the analysis of 102 diverse strains from the PHRC collection. Multiple signatures from individual primer pairs, hinting at the presence of different strains within the same sample, are seen in the Table.
  • Table 16 Base Composition Analysis of Bioagent Identifying Amplicons of Mycobacterium tuberculosis
  • Example 15 Selection of Additional Primer Pairs for Genotyping of Members of the Bacterial Genus Mycobacterium and for Identification of Drug-Resistant Strains of Mycobacterium tuberculosis
  • rifampin resistant strains have mutations within the rifampin resistance determining region (RRDR) spanning rpoB 505 to 533. Mutations are frequently seen in rpoB codons 516, 526 and 531. As well, at least 54% of isoniazid resistant strains have a mutation in katG codon S315. Secondary mutations are seen in inhA (promoter, S94A and I21V/T) as well as in the ahpC promoter. These mutations are not observed in susceptible strains.
  • RIF R 95% of the multiple drug-resistant genotypes
  • INH R 95% of the multiple drug-resistant genotypes
  • primer pairs targeting mutations conferring resistance to other first and second line drugs were also developed.
  • the first primer pair targets the rpoB, the rrs, embB, the katG, and/or the gyrA gene.
  • the second primer pair targets the inhA, the ahpC, the rrs, and/or the rpoB gene.
  • the third primer pair targets the pncA and/or rpsL genes.
  • the first primer pair targets the rpoB 516 and 526 polymorphisms, the rrs 1484 polymorphism, the embB 306 polymorphism, the katG 315 and 463 polymorphisms, and/or the gyrA 90..95 and 95 polymorphisms.
  • the second primer pair targets the inhA 189..199 and promoter polymorphisms, the ahpC promoter polymorphism, the rrs 1401-1402 and 511..513 polymorphisms, and/or the rpoB 531, 505..526, and 562..572 polymorphisms.
  • the third primer pair targets the pncA 22..48, 77..102, 103..135, ⁇ 1..2O, 139..171, 49..80 and/or rpsL 29..58, and 59..91 polymorphisms.
  • a panel of primer pairs is used to target multiple genes and polymorphisms.
  • Table 17 shows an exemplary Table of multiplex primer pairs used, for example, for drug resistance testing.
  • Figure 9 shows that critical mutations may be uniquely resolved using dedicated primer pairs.
  • Figure 10 shows that rare mutations may be simultaneously queried using a shared primer pair.
  • Figure 11 shows determination of resistance-conferring mutations by PCR/ESI-MS with resolution of mass spectra, and that primer pairs sharing the same well yield amplicons of distinct lengths.
  • the evolution of the Mycobacterium tuberculosis genome is essentially clonal, thus allowing strain typing through the query of distinct genomic markers that are lineage-specific and only vertically inherited. Co-infections of mixed populations of genotypes of Mycobacterium tuberculosis can be revealed simultaneously in the mass spectra of amplification products produced using the primers of Table 19.
  • the high G+C content of the Mycobacterium tuberculosis genome itself greatly facilitates the development of short, efficient primers which are appropriate for multiplexing (inclusion of a plurality of primers in each amplification reaction mixture).
  • Table 20 Multiplex assay plate layout: Two primer pairs per well, 8 wells per sample, 12 samples per plate. Primer pairs targeting each of the drugs of choice are coded as follows: Isoniazid (A), Rifampin (B), Fluoroquinolone (C), Diarylquinoline (D) and multiple drug resistance (E).
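The plate arithmetic of this layout can be checked with a few lines. The counts (two primer pairs per well, 8 wells per sample, 12 samples per plate) are from the table heading; placing each sample in its own column with the eight reactions in rows A to H is an assumption of this sketch, as is the `well_position` helper.

```python
ROWS = "ABCDEFGH"
PAIRS_PER_WELL = 2
WELLS_PER_SAMPLE = 8
SAMPLES_PER_PLATE = 12


def well_position(sample_index, reaction_index):
    """Map a zero-based sample and reaction index to a well label such as 'A1'.

    Assumes one sample per column and one multiplex reaction per row.
    """
    if not 0 <= sample_index < SAMPLES_PER_PLATE:
        raise ValueError("sample_index out of range")
    if not 0 <= reaction_index < WELLS_PER_SAMPLE:
        raise ValueError("reaction_index out of range")
    return f"{ROWS[reaction_index]}{sample_index + 1}"


print("wells per plate:", WELLS_PER_SAMPLE * SAMPLES_PER_PLATE)
print("primer pairs per sample:", PAIRS_PER_WELL * WELLS_PER_SAMPLE)
print(well_position(0, 0), well_position(11, 7))
```

The layout fills a standard 96-well plate exactly and queries 16 primer pairs per sample.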
  • Primer pairs configured to detect isoniazid resistance include: primer pair BCT3553 (molecular target katG codon 315. Mutations at position S315, in particular S315T (ACC), are present in about 54% of the isoniazid-resistant isolates (mutation frequencies for INH resistance according to Hazbon, AAC 2006, 50:2640-9; INH mutation frequencies vary greatly depending on authors, location and sample size). All mutants are distinguished from the wild-type, but a double mutant S315T (ACA) yields the same base composition as the simple mutant S315N (AAC)); primer pair BCT3552 (molecular target inhA operon promoter.
  • primer pair BCT4234 (molecular target ahpC promoter. Twelve distinct mutations located 4 to 39 nt upstream of ahpC are detected by this primer pair. These mutations are found in ~8% of isoniazid-resistant isolates); primer pair BCT4235 (molecular target inhA S94A. This mutation is found in ~5% of isoniazid-resistant isolates); and primer pair BCT4236 (molecular target inhA I21V/T. This mutation is found in ~2% of isoniazid-resistant isolates).
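The statement that the double mutant S315T (ACA) cannot be told apart from S315N (AAC) follows from the fact that base composition counts bases without regard to their order. A minimal check, using only the codons quoted in the text (the helper name is hypothetical):

```python
from collections import Counter


def base_composition(sequence):
    """Count each base in a sequence; the order of bases is discarded."""
    return Counter(sequence.upper())


codons = {"S315T (ACC)": "ACC", "S315T (ACA)": "ACA", "S315N (AAC)": "AAC"}
for label, codon in codons.items():
    print(label, dict(base_composition(codon)))

# ACA and AAC both contain two A and one C, so amplicons that differ only
# in this codon have the same base composition and the same mass.
print(base_composition("ACA") == base_composition("AAC"))
```

Because the rest of the amplicon is unchanged, equal codon compositions mean equal amplicon compositions, whereas the single mutant ACC remains distinguishable from both.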
  • Primer pairs configured to detect rifampin (RIF) resistance target rpoB, the beta subunit of RNA polymerase. Approximately 95% of RIF-resistant isolates harbor mutations within the Rifampin Resistance Determining Region (RRDR), between rpoB codons 507 and 533 (McCammon, AAC 2005, 49:2200-9, incorporated by reference herein in its entirety). Primary regions within the RRDR are detected by the primer pairs BCT3828, BCT3908, BCT3633, and BCT4366 for the determination of RIF resistance, and primer pairs BCT4237 and BCT3697 detect secondary sites within rpoB.
  • Primer pairs configured to detect rifampin resistance include: primer pair BCT3828 (molecular target rpoB codons 531-533. Mutations at position S531, in particular S531L, are present in about half of resistant isolates. Single mutations S531L and S531W, as well as double mutations S531F and S531Y, are resolved from one another. The rare L533P mutation is also captured and segregated from the S531L/Y/F/W mutations); primer pair BCT3908 (molecular target rpoB codon 526 only.
  • This primer pair unambiguously resolves the mutations H526N/D/Y/G/L/R found in ~25% of the resistant isolates); primer pair BCT3633 (molecular target rpoB codons 515 and 516. This primer pair resolves mutations D516V, D516G and D516Y, even in the event of duplication of codon F515); primer pair BCT4366 (molecular target rpoB codons 505 to 516. This primer pair detects RRDR mutations present in the remaining 9-10% of resistant isolates, but located outside of the three regions described above (including rare single codon insertions or deletions around positions 510-515).
  • Base compositions from this primer pair are analyzed in view of the mutations already detected using primer pair BCT3633); primer pair BCT4237 (molecular target rpoB codons 130 to 140. Mutation V146F is typically found in resistant isolates without RRDR mutations, and accounts for 1% to 4% of the resistant isolates (Heep, JCM 2001, 39:107-110; McCammon, AAC 2005, 49:2200-9, both of which are incorporated by reference herein in their entireties)); and primer pair BCT3697 (molecular target rpoB codons 562 to 572. Mutation I572F may be found in isolates carrying mutations within the RRDR (~1%)).
  • Primer pairs configured to detect multiple drug resistance include: primer pair BCT3551 (molecular target embB codon 306).
  • Primer pairs configured to detect diarylquinoline resistance include: primer pair BCT4364 (molecular target atpE. Mutations (A63P, I66M) conferring resistance to diarylquinolines (Petrella, AAC 2006, 50:2853-6, incorporated by reference herein in its entirety) are deduced from the amplicon base composition of this primer pair).
  • Primer pairs configured to detect fluoroquinolone resistance include: primer pair BCT3555 (molecular target gyrA codons 90 to 95, the Quinolone Resistance Determining Region (QRDR). Within this locus, frequently observed mutations include A90V, S91P and D94A ⁇ 7N/G; and primer pair BCT3556 (molecular target gyrA codon 95.
  • the mutation T95S is a phylogenetic marker not associated with fluoroquinolone resistance. However, because of its proximity to the QRDR, codon gyrA 95 is detected by BCT3555 in order to ensure the production of an amplicon by BCT3555 regardless of the composition of codon 95. Knowledge of the base composition of codon 95 alone is desired to correctly provide the base composition of the QRDR amplicon. For example, the double mutant D94H+T95S might otherwise be indistinguishable from the wild-type QRDR.
  • Primer pairs configured to detect the principal genetic group include: primer pair BCT3554 (molecular target katG codon 463). Similar to mutations detected by primer pair BCT3556, this mutation is not associated with drug resistance. But in association with BCT3556, this primer pair provides the PGG1/2/3 classification scheme (Sreevatsan, PNAS 1997, 94:9869-74, incorporated by reference herein in its entirety).
  • The present invention includes any combination of the various species and subgeneric groupings falling within the generic disclosure. This invention therefore includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.
  • While, in accordance with the patent statutes, descriptions of the various embodiments and examples have been provided, the scope of the invention is not to be limited thereto or thereby. Modifications and alterations of the present invention will be apparent to those skilled in the art without departing from the scope and spirit of the present invention.
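
The gyrA codon 94/95 example in the primer pair list above can be checked with a short base-count sketch. The codon sequences used here (GAC for D94, CAC for H94, ACC for T95, AGC for S95) are assumptions drawn from the standard genetic code and commonly reported gyrA alleles, not from this disclosure. A real amplicon also carries flanking sequence, which is identical in both alleles and therefore does not change the comparison.

```python
from collections import Counter

# Assumed codon sequences (illustrative only; not taken from the disclosure).
CODONS = {
    "D94": "GAC",  # aspartate at codon 94
    "H94": "CAC",  # histidine, the D94H substitution
    "T95": "ACC",  # threonine at codon 95
    "S95": "AGC",  # serine, the T95S phylogenetic marker
}


def base_composition(seq):
    """Return the counts of A, G, C and T in a DNA sequence."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "AGCT"}


def composition_of(*codon_names):
    """Base composition of the concatenated named codons."""
    return base_composition("".join(CODONS[name] for name in codon_names))


wild_type = composition_of("D94", "T95")      # GACACC
double_mutant = composition_of("H94", "S95")  # CACAGC
single_mutant = composition_of("H94", "T95")  # CACACC

# The G->C change at codon 94 and the C->G change at codon 95 cancel out,
# so base counts alone cannot separate the double mutant from wild type.
print("double mutant masked:", wild_type == double_mutant)
print("single mutant masked:", wild_type == single_mutant)
```

Under these assumed codons, a separate measurement of codon 95 alone is what breaks the tie, which is the role the text assigns to primer pair BCT3556.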

Abstract

The present invention provides compositions, kits and methods for rapid genotyping of strains of Mycobacterium tuberculosis by molecular mass and base composition analysis. Drug-resistant strains of Mycobacterium tuberculosis may be identified in human clinical samples and as such, provide for methods of treatment of humans infected with drug resistant strains of Mycobacterium tuberculosis.

Description

COMPOSITIONS AND METHODS FOR IDENTIFICATION OF SUBSPECIES CHARACTERISTICS OF MYCOBACTERIUM TUBERCULOSIS
RELATED APPLICATIONS
[01] This application claims priority to U.S. Provisional Application Serial Number 60/945,850, filed on June 22, 2007, and U.S. Provisional Application Serial Number 61/037,884, filed on March 19, 2008, each of which is incorporated by reference in its entirety.
STATEMENT OF GOVERNMENT SUPPORT
[02] This invention was made with United States Government support under CDC SBIR grant 200-2006-M-18965. The United States Government has certain rights in the invention.
FIELD OF THE INVENTION
[03] The present invention provides compositions, kits and methods for rapid identification of subspecies characteristics of Mycobacterium tuberculosis by molecular mass and base composition analysis.
BACKGROUND OF THE INVENTION
[04] A problem in determining the cause of a natural infectious outbreak or a bioterrorist attack is the sheer variety of organisms that can cause human disease. There are over 1400 organisms infectious to humans; many of these have the potential to emerge suddenly in a natural epidemic or to be used in a malicious attack by bioterrorists (Taylor et al. Philos. Trans. R. Soc. London B. Biol. Sci., 2001, 356, 983-989). This number does not include numerous strain variants, bioengineered versions, or pathogens that infect plants or animals.
[05] Much of the new technology being developed for detection of biological weapons incorporates a polymerase chain reaction (PCR) step based upon the use of highly specific primers and probes designed to selectively detect certain pathogenic organisms. Although this approach is appropriate for the most obvious bioterrorist organisms, like smallpox and anthrax, experience has shown that it is very difficult to predict which of hundreds of possible pathogenic organisms might be employed in a terrorist attack. Likewise, naturally emerging human disease that has caused devastating consequence in public health has come from unexpected families of bacteria, viruses, fungi, or protozoa. Plants and animals also have their natural burden of infectious disease agents and there are equally important biosafety and security concerns for agriculture.
[06] A major conundrum in public health protection, biodefense, and agricultural safety and security is that these disciplines need to be able to rapidly identify and characterize infectious agents, while there is no existing technology with the breadth of function to meet this need. Currently used methods for identification of bacteria rely upon culturing the bacterium to effect isolation from other organisms and to obtain sufficient quantities of nucleic acid followed by sequencing of the nucleic acid, both processes which are time and labor intensive.
[07] Today, despite the availability of effective antituberculosis chemotherapy for over 50 years, TB remains a major global health problem. As the rates of TB infection have fallen dramatically in industrialized countries in the past century, resource-poor countries now bear over 90% of all cases globally. In fact, there are more cases of TB today than ever recorded. As such, there is a need for new therapeutics, diagnostics, and vaccines in conjunction with improved operational guidelines to enhance current TB control strategies.
[08] While much is known about the epidemiology of TB, key questions have eluded classical epidemiologists for decades. These include the current rates of active transmission by differentiating disease due to recent or previous infection; the determination of whether recurrent tuberculosis is attributable to exogenous reinfection; whether all M. tuberculosis strains exert similar epidemiologic characteristics in populations; and an understanding of transmission dynamics on a population- or group-specific level, as well as in identifying extensive transmission or outbreaks from what appear to be sporadic, epidemiologically unrelated cases (Mathema et al., Clinical Microbiology Reviews, 2006, 19, 658-685).
[09] No sooner were the first antituberculosis agents introduced in humans than the emergence of drug-resistant isolates of M. tuberculosis was observed. In vitro studies showed that spontaneous mutations in Mycobacterium tuberculosis can be associated with drug resistance, while selective (antibiotic) pressure can lead to enhanced accumulation of these drug-resistant mutants. The efficient selection of drug resistance in the presence of a single antibiotic led investigators to recommend combination therapy using more than one antibiotic to reduce the emergence of drug resistance during treatment. Indeed, when adequate drug supplies are available and combination treatment is properly managed, TB control has been effective. Selection for drug-resistant mutants in patients mainly occurs when patients are treated inappropriately or are exposed to, even transiently, subtherapeutic drug levels, conditions that may provide adequate positive selection pressure for the emergence and maintenance of drug-resistant organisms de novo.
[10] One of the contributing factors is the exceptional length of chemotherapy required to treat and cure infection with Mycobacterium tuberculosis. The need to maintain high drug levels over many months of treatment, combined with the inherent toxicity of the agents, results in reduced patient compliance and subsequently higher likelihood of acquisition of drug resistance. Therefore, in addition to identifying new antituberculosis agents, the need for shortening the length of chemotherapy is paramount, as it would greatly impact clinical management and the emergence of drug resistance. Since the early 1990s, an alarming trend and a growing source of public health concern has been the emergence of resistance to multiple drugs (MDRTB), defined as an isolate that is resistant to at least isoniazid (INH) and rifampin (RIF), the two most potent antituberculosis drugs (Mathema et al., Clinical Microbiology Reviews, 2006, 19, 658-685).
[11] Mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated. DNA chips with specific probes can only determine the presence or absence of specifically anticipated organisms. Because there are hundreds of thousands of species of benign bacteria, some very similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed to identify a particular organism.
[12] The present invention provides oligonucleotide primers and compositions and kits containing the oligonucleotide primers, which define bacterial bioagent identifying amplicons and, upon amplification, produce corresponding amplification products whose molecular masses provide the means to identify subspecies characteristics of Mycobacterium tuberculosis at and below the species taxonomic level.
SUMMARY OF THE INVENTION
[13] The present invention provides compositions, kits and methods for rapid identification of subspecies characteristics of Mycobacterium tuberculosis by molecular mass and base composition analysis.
[14] In some embodiments, the present invention provides a method of identifying a Mycobacterium tuberculosis genotype in a sample comprising obtaining a sample suspected of containing Mycobacterium tuberculosis, isolating nucleic acid from the sample, contacting the nucleic acid with one or more primer pairs configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying the nucleic acid with the primers such that one or more amplification products corresponding to bioagent identifying amplicons are produced, and measuring the molecular masses of the one or more amplification products, thereby identifying the Mycobacterium tuberculosis genotype. In some embodiments the method comprises calculating base compositions of the amplification products from the molecular masses. In other embodiments the method comprises comparing the molecular masses or the base compositions with a database containing molecular masses or base compositions of bioagent identifying amplicons of genotypes of Mycobacterium tuberculosis, wherein the bioagent identifying amplicons are defined by the one or more primer pairs. In other embodiments, the one or more primer pairs is a primer pair having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair number 3600 (SEQ ID NOs: 1515:1538).
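
The comparison of a measured molecular mass against a database of expected base compositions, as described in paragraph [14], might be sketched as follows. This is a minimal illustration, not the method of the disclosure. The residue masses and end-group correction are commonly quoted approximations for single-stranded DNA. The genotype names, the base compositions, and the `tolerance_ppm` parameter are hypothetical, and only one strand is modeled, whereas a real measurement would typically consider both strands of the amplification product.

```python
# Approximate average masses (Da) of nucleotide residues inside a DNA strand.
# Commonly quoted oligonucleotide-calculator values, assumed for illustration.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}

# Assumed end-group correction for a strand with 5'-OH and 3'-OH termini.
TERMINAL_CORRECTION = -61.96


def strand_mass(composition):
    """Approximate average mass of one DNA strand from its base counts."""
    return sum(RESIDUE_MASS[base] * count
               for base, count in composition.items()) + TERMINAL_CORRECTION


def match_genotypes(measured_mass, database, tolerance_ppm=25.0):
    """Return names of database entries whose expected strand mass lies
    within tolerance_ppm of the measured mass."""
    matches = []
    for genotype, composition in database.items():
        expected = strand_mass(composition)
        error_ppm = abs(measured_mass - expected) / expected * 1e6
        if error_ppm <= tolerance_ppm:
            matches.append(genotype)
    return matches


# Hypothetical database: two genotypes whose amplicons differ by one G -> C.
DATABASE = {
    "genotype_X": {"A": 25, "G": 30, "C": 32, "T": 18},
    "genotype_Y": {"A": 25, "G": 29, "C": 33, "T": 18},
}

# Simulated measurement: the genotype_Y strand with a 0.2 Da error.
measured = strand_mass(DATABASE["genotype_Y"]) + 0.2
print(match_genotypes(measured, DATABASE))
```

A single G to C substitution shifts the strand mass by roughly 40 Da in this model, far more than the assumed tolerance, which is why the two hypothetical genotypes do not collide.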
[15] In some embodiments, the one or more primer pairs further comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3581 (SEQ ID NOs: 1508:1531), 3582 (SEQ ID NOs: 1509:1532), 3583 (SEQ ID NOs: 1510:1533), 3584 (SEQ ID NOs: 1511:1534), 3586 (SEQ ID NOs: 1512:1535), 3587 (SEQ ID NOs: 1513:1536), 3599 (SEQ ID NOs: 1514:1537), 3601 (SEQ ID NOs: 1516:1539), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
[16] In other embodiments, the one or more primer pairs further comprises five or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3581 (SEQ ID NOs: 1508:1531), 3582 (SEQ ID NOs: 1509:1532), 3583 (SEQ ID NOs: 1510:1533), 3584 (SEQ ID NOs: 1511:1534), 3586 (SEQ ID NOs: 1512:1535), 3587 (SEQ ID NOs: 1513:1536), 3599 (SEQ ID NOs: 1514:1537), 3601 (SEQ ID NOs: 1516:1539), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
[17] In further embodiments, the one or more primer pairs comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers selected from the group consisting of: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3581 (SEQ ID NOs: 1508:1531), 3582 (SEQ ID NOs: 1509:1532), 3583 (SEQ ID NOs: 1510:1533), 3584 (SEQ ID NOs: 1511:1534), 3586 (SEQ ID NOs: 1512:1535), 3587 (SEQ ID NOs: 1513:1536), 3599 (SEQ ID NOs: 1514:1537), 3601 (SEQ ID NOs: 1516:1539), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
[18] In still further embodiments the one or more primer pairs comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers selected from the group consisting of: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
[19] In some embodiments, the present invention provides a method wherein the Mycobacterium tuberculosis genotype is distinguished from Mycobacterium africanum, Mycobacterium bovis, Mycobacterium microti, and Mycobacterium canettii. In other embodiments, the Mycobacterium tuberculosis genotype comprises a drug-resistant strain of Mycobacterium tuberculosis. In further embodiments, the drug-resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide. In still further embodiments, the drug-resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide. In some embodiments, three or more of the primer pairs are combined in a multiplex reaction to produce a plurality of amplification products corresponding to bioagent identifying amplicons.
[20] In some embodiments, the molecular masses are measured by mass spectrometry. In other embodiments, the sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, and tissue biopsy. In further embodiments, the sample comprises a population of distinct genotypes of Mycobacterium tuberculosis.
[21] In some embodiments, the present invention provides an oligonucleotide primer pair comprising a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer has at least 70% sequence identity with SEQ ID NO: 1515 and the reverse primer has at least 70% sequence identity with SEQ ID NO: 1538. In some embodiments, the forward primer of the primer pair comprises at least 80% sequence identity with SEQ ID NO: 1515. In other embodiments, the forward primer comprises at least 90% sequence identity with SEQ ID NO: 1515. In further embodiments, the forward primer is SEQ ID NO: 1515. In some embodiments, the reverse primer of the primer pair comprises at least 80% sequence identity with SEQ ID NO: 1538. In other embodiments, the reverse primer comprises at least 90% sequence identity with SEQ ID NO: 1538. In further embodiments, the reverse primer is SEQ ID NO: 1538.
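
The length and sequence identity thresholds recited in paragraph [21] can be illustrated with a small check. This sketch assumes an ungapped, position-by-position comparison of two primers of equal length and treats the 13 to 35 nucleotide range as inclusive. Both choices are simplifications made here, and this section does not define how percent identity is to be computed. The sequences are hypothetical placeholders and are not SEQ ID NO: 1515 or SEQ ID NO: 1538.

```python
def percent_identity(candidate, reference):
    """Ungapped percent identity of two equal-length primer sequences."""
    if len(candidate) != len(reference):
        raise ValueError("this sketch only compares primers of equal length")
    matches = sum(1 for a, b in zip(candidate.upper(), reference.upper())
                  if a == b)
    return 100.0 * matches / len(reference)


def meets_criteria(candidate, reference, min_identity=70.0,
                   min_length=13, max_length=35):
    """True if the candidate satisfies the length range and identity floor."""
    if not min_length <= len(candidate) <= max_length:
        return False
    return percent_identity(candidate, reference) >= min_identity


# Hypothetical 20-nucleotide sequences differing at two positions.
reference_primer = "ACGTACGTACGTACGTACGT"
candidate_primer = "ACGTTCGTACGTACGAACGT"

print(percent_identity(candidate_primer, reference_primer))
print(meets_criteria(candidate_primer, reference_primer, min_identity=90.0))
```

Raising `min_identity` from 70 to 80 or 90 mirrors the progressively narrower embodiments in the paragraph above.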
[22] In some embodiments, the present invention provides a kit for identifying a Mycobacterium tuberculosis genotype in a sample comprising a first oligonucleotide primer pair comprising a forward primer and a reverse primer, each configured to hybridize to a Mycobacterium tuberculosis gyrB gene, and each between 13 and 35 linked nucleotides in length wherein the forward primer has at least 70% sequence identity with SEQ ID NO: 1515 and the reverse primer has at least 70% sequence identity with SEQ ID NO: 1538, and at least one additional primer pair wherein the primers of each of the at least one additional primer pair are configured to hybridize to sequence regions within a Mycobacterium tuberculosis gene selected from the group consisting of: rpoB, embB, fabG, inhA, katG, gyrA, pncA, prcA, rv2348c, rv3815c, rv0147, erg3, rv0083, rv1047, rv1814, rv0041, and rv0260c. In some embodiments, each of the at least one additional primer pairs is a primer pair comprising a forward primer and a reverse primer, the forward primer and the reverse primer each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding forward and reverse primers of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3581 (SEQ ID NOs: 1508:1531), 3582 (SEQ ID NOs: 1509:1532), 3583 (SEQ ID NOs: 1510:1533), 3584 (SEQ ID NOs: 1511:1534), 3586 (SEQ ID NOs: 1512:1535), 3587 (SEQ ID NOs: 1513:1536), 3599 (SEQ ID NOs: 1514:1537), 3601 (SEQ ID NOs: 1516:1539), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
[23] In some embodiments, the present invention provides a kit for identifying a Mycobacterium tuberculosis genotype in a sample comprising a first oligonucleotide primer pair comprising a forward primer and a reverse primer, each configured to hybridize to a Mycobacterium tuberculosis gyrB gene, and each between 13 and 35 linked nucleotides in length selected from the group consisting of: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543), and at least one additional primer pair wherein the primers of each of the at least one additional primer pair are configured to hybridize to sequence regions within a Mycobacterium tuberculosis gene selected from the group consisting of: rpoB, embB, fabG, inhA, katG, gyrA, pncA, prcA, rv2348c, rv3815c, rv0147, erg3, rv0083, rv1047, rv1814, rv0041, and rv0260c.
[24] In some embodiments, the present invention provides a method for identifying a drug-resistant strain of Mycobacterium tuberculosis comprising obtaining a sample suspected of containing Mycobacterium tuberculosis, isolating nucleic acid from the sample, contacting the nucleic acid with a primer pair configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying the nucleic acid with the primer pair to obtain an amplification product containing a mutation of a codon known to confer drug resistance upon Mycobacterium tuberculosis, and measuring the molecular mass of the amplification product, thereby identifying the drug-resistant strain of Mycobacterium tuberculosis. In some embodiments, the method comprises calculating a base composition of the amplification product from the molecular mass, thereby identifying a base composition for the codon. In other embodiments, the primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer and the reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543). In further embodiments, the primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer and the reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
[25] In some embodiments, the drug-resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide. In other embodiments, the drug-resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide. In further embodiments, molecular mass is measured by mass spectrometry. In some embodiments, the sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, tissue biopsy, tissue swab, tissue aspirate, abscess biopsy, and cerebrospinal fluid. In further embodiments, the sample comprises a population of distinct genotypes of Mycobacterium tuberculosis. In other embodiments, the population of distinct genotypes comprises a drug-resistant genotype and a drug-sensitive genotype.
[26] In some embodiments, the present invention provides a method of treating a human infected with a drug-resistant strain of Mycobacterium tuberculosis comprising obtaining a sample from a human infected with Mycobacterium tuberculosis, isolating nucleic acid from the sample, contacting the nucleic acid with a primer pair configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying the nucleic acid with the primer pair to obtain an amplification product containing a mutation of a codon known to confer drug resistance upon Mycobacterium tuberculosis, measuring the molecular mass of the amplification product, thereby identifying the drug-resistant strain of Mycobacterium tuberculosis, selecting one or more alternative drugs to which the drug-resistant strain is not resistant, and administering the alternative drugs to the human. In some embodiments, the primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer and the reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543). In other embodiments, the drug-resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide. In further embodiments, the drug-resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide. In other embodiments, the molecular mass is measured by mass spectrometry. In some embodiments, the sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, and tissue biopsy. In other embodiments, the sample comprises a population of distinct genotypes of Mycobacterium tuberculosis. In further embodiments, the population of distinct genotypes comprises a drug-resistant genotype and a drug-sensitive genotype.
[27] In some embodiments, the present invention provides a method for determining the identity and quantity of Mycobacterium tuberculosis in a sample comprising contacting the sample with a pair of primers and a known quantity of a calibration polynucleotide comprising a calibration sequence, concurrently amplifying nucleic acid from the Mycobacterium tuberculosis in the sample with the pair of primers and amplifying nucleic acid from the calibration polynucleotide in the sample with the pair of primers to obtain a first amplification product comprising a Mycobacterium tuberculosis identifying amplicon and a second amplification product comprising a calibration amplicon, obtaining molecular mass and abundance data for the Mycobacterium tuberculosis identifying amplicon and for the calibration amplicon wherein the 5' and 3' ends of the Mycobacterium tuberculosis identifying amplicon and the calibration amplicon are the sequences of the pair of primers or complements thereof, and distinguishing the Mycobacterium tuberculosis identifying amplicon from the calibration amplicon based on their respective molecular masses, wherein the molecular mass of the Mycobacterium tuberculosis identifying amplicon indicates the identity of the Mycobacterium tuberculosis, and comparison of Mycobacterium tuberculosis identifying amplicon abundance data and calibration amplicon abundance data indicates the quantity of Mycobacterium tuberculosis in the sample. In some embodiments, the primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein the forward primer and the reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543). In other embodiments, the calibration polynucleotide is selected from the group consisting of: calibration polynucleotide SEQ ID NO. 1561, calibration polynucleotide SEQ ID NO. 1562, calibration polynucleotide SEQ ID NO. 1563, and calibration polynucleotide SEQ ID NO. 1564.
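
The calibration approach of paragraph [27] can be sketched as a ratio calculation: the two amplicons are told apart by mass, and their relative abundance is scaled by the known calibrant quantity. The sketch assumes that both templates amplify with equal efficiency because they share one primer pair, and that peak abundance is proportional to amplicon quantity. The peak list, the expected masses, the `tolerance_da` parameter, and the function names are all hypothetical.

```python
def abundance_near(peaks, expected_mass, tolerance_da=1.0):
    """Sum the abundance of peaks within tolerance_da of expected_mass.

    peaks is a list of (mass_in_Da, abundance) tuples.
    """
    return sum(abundance for mass, abundance in peaks
               if abs(mass - expected_mass) <= tolerance_da)


def estimate_quantity(bioagent_abundance, calibrant_abundance,
                      calibrant_copies):
    """Scale the known calibrant quantity by the observed abundance ratio."""
    if calibrant_abundance <= 0:
        raise ValueError("calibration amplicon was not detected")
    return calibrant_copies * bioagent_abundance / calibrant_abundance


# Hypothetical spectrum containing two peaks (mass in Da, abundance).
peaks = [(32510.4, 1500.0), (30980.1, 500.0)]

# Hypothetical expected masses for the two amplicons.
bioagent = abundance_near(peaks, expected_mass=32510.0)
calibrant = abundance_near(peaks, expected_mass=30980.0)

# 100 copies of calibration polynucleotide were assumed to be added.
print(estimate_quantity(bioagent, calibrant, calibrant_copies=100))
```

Raising an error when the calibration amplicon is absent is a design choice made for this sketch, so that a failed amplification is not silently reported as a quantity.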
[28] Additional embodiments of the present invention are described in the description and examples below.
BRIEF DESCRIPTION OF THE DRAWINGS
[29] The foregoing summary, as well as the following detailed description, is better understood when read in conjunction with the accompanying drawings which are included by way of example and not by way of limitation.
[30] Figure 1: process diagram illustrating a representative primer pair selection process.
[31] Figure 2: process diagram illustrating an embodiment of the calibration method.
[32] Figure 3: common pathogenic bacteria and primer pair coverage. The primer pair number in the upper right hand corner of each polygon indicates that the primer pair can produce a bioagent identifying amplicon for all species within that polygon.
[33] Figure 4: a representative 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples (labeled NHRC samples) closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.
[34] Figure 5: a representative mass spectrum of amplification products indicating the presence of bioagent identifying amplicons of Streptococcus pyogenes, Neisseria meningitidis, and Haemophilus influenzae obtained from amplification of nucleic acid from a clinical sample with primer pair number 349 which targets 23S rRNA. Experimentally determined molecular masses and base compositions for the sense strand of each amplification product are shown.
[35] Figure 6: a representative mass spectrum of amplification products representing a bioagent identifying amplicon of Streptococcus pyogenes, and a calibration amplicon obtained from amplification of nucleic acid from a clinical sample with primer pair number 356 which targets rplB. The experimentally determined molecular mass and base composition for the sense strand of the Streptococcus pyogenes amplification product are shown.
[36] Figure 7: a representative mass spectrum of an amplified nucleic acid mixture which contained the Ames strain of Bacillus anthracis, a known quantity of combination calibration polynucleotide (SEQ ID NO: 1464), and primer pair number 350 which targets the capC gene on the virulence plasmid pXO2 of Bacillus anthracis. Calibration amplicons produced in the amplification reaction are visible in the mass spectrum as indicated, and abundance data (peak height) are used to calculate the quantity of the Ames strain of Bacillus anthracis.
[37] Figure 8: a schematic representation of the phylogeny of the M. tuberculosis cluster indicating principal genetic groups (PPGs) including nine genotypes. Selected primer pair numbers used to distinguish PPGs, genotypes and species are indicated.
[38] Figure 9: base compositions of amplification products using primer pair BCT3908 to amplify a region of the rpoB gene. Six critical mutations may be uniquely resolved compared to the wild type sequence (WT) using dedicated primer pairs.
[39] Figure 10: base compositions of amplification products using primer pair BCT3552 to amplify a region of the inhA gene. Rare mutations may be simultaneously queried compared to wild type sequence (WT) using a shared primer pair.
[40] Figure 11: a schematic representation of determination of resistance-conferring mutations by PCR/ESI-MS with resolution of mass spectra. Primer pairs sharing the same well yield amplicons of distinct lengths and base compositions from assay and internal calibrant templates.
[41] Figure 12: an outline of the conventional process flow in tuberculosis diagnostic testing compared to molecular genotyping by PCR/ESI-MS.
[42] Figure 13: sequences of calibration sequences SEQ ID NO. 1561, SEQ ID NO. 1562, SEQ ID NO. 1563, and SEQ ID NO. 1564.
DEFINITIONS
[43] As used herein, the term "abundance" refers to an amount. The amount may be described in terms of concentration, using units which are common in molecular biology such as "copy number" or "pfu" (plaque-forming unit), which are well known to those with ordinary skill. Concentration may be relative to a known standard or may be absolute.
[44] As used herein, the term "amplifiable nucleic acid" is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid" also comprises "sample template."
[45] As used herein the term "amplification" refers to a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of "target" specificity. Target sequences are "targets" in the sense that they are sought to be sorted out from other nucleic acid. Amplification techniques have been designed primarily for this sorting out. Template specificity is achieved in most amplification techniques by the choice of enzyme. Amplification enzymes are enzymes that, under the conditions in which they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acid. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (D.L. Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic acid will not be replicated by this amplification enzyme. Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (D.Y. Wu and R.B. Wallace, Genomics 4:560 [1989]).
Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (H. A. Erlich (ed.), PCR Technology, Stockton Press [1989]).
[46] As used herein, the term "amplification reagents" refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification, excluding primers, nucleic acid template, and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
[47] As used herein, the term "analogous" when used in context of comparison of bioagent identifying amplicons indicates that the bioagent identifying amplicons being compared are produced with the same pair of primers. For example, bioagent identifying amplicon "A" and bioagent identifying amplicon "B", produced with the same pair of primers are analogous with respect to each other. Bioagent identifying amplicon "C", produced with a different pair of primers is not analogous to either bioagent identifying amplicon "A" or bioagent identifying amplicon "B".
[48] As used herein, the term "anion exchange functional group" refers to a positively charged functional group capable of binding an anion through an electrostatic interaction. The most well known anion exchange functional groups are the amines, including primary, secondary, tertiary and quaternary amines.
[49] The term "bacteria" or "bacterium" refers to any member of the groups of eubacteria and archaebacteria.
[50] As used herein, a "base composition" is the exact number of each nucleobase (for example, A, T, C and G) in a segment of nucleic acid. For example, amplification of nucleic acid of a Staphylococcus aureus strain carrying the lukS-PV gene with primer pair number 2095 (SEQ ID NOs: 456:1261) produces an amplification product 117 nucleobases in length from nucleic acid of the lukS-PV gene that has a base composition of A35 G17 C19 T46 (by convention, with reference to the sense strand of the amplification product). Because the molecular masses of each of the four natural nucleotides and chemical modifications thereof are known (if applicable), a measured molecular mass can be deconvoluted to a list of possible base compositions. Identification of a base composition of a sense strand which is complementary to the corresponding antisense strand in terms of base composition provides a confirmation of the true base composition of an unknown amplification product. For example, the base composition of the antisense strand of the 117 nucleobase amplification product described above is A46 G19 C17 T35.
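The relationship between the sense-strand and antisense-strand base compositions can be illustrated with a minimal Python sketch. The function names below are hypothetical and are not part of the disclosure; the sketch assumes only the four natural nucleobases and uses the A35 G17 C19 T46 composition given above.

```python
from collections import Counter

# Watson-Crick partner of each natural nucleobase.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_composition(sequence):
    """Count each nucleobase (A, G, C, T) in a nucleic acid segment."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "AGCT"}


def antisense_composition(sense):
    """Derive the antisense composition: each base count equals the
    count of its pairing partner on the sense strand."""
    return {base: sense[PARTNER[base]] for base in "AGCT"}


def format_composition(composition):
    """Render a composition in the 'A35 G17 C19 T46' convention."""
    return " ".join(f"{base}{composition[base]}" for base in "AGCT")


# Sense-strand composition quoted in the text for primer pair number 2095.
sense = {"A": 35, "G": 17, "C": 19, "T": 46}
print(sum(sense.values()))                              # total amplicon length
print(format_composition(antisense_composition(sense)))  # antisense strand
```

The counts of the sense strand sum to the amplicon length, and swapping A with T and G with C yields the antisense composition, which is the consistency check described in the paragraph above.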
[51] As used herein, a "base composition probability cloud" is a representation of the diversity in base composition resulting from a variation in sequence that occurs among different isolates of a given species. The "base composition probability cloud" represents the base composition constraints for each species and is typically visualized using a pseudo four-dimensional plot.
[52] As used herein, a "bioagent" is any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus. Examples of bioagents include, but are not limited to, cells (including but not limited to human clinical samples, bacterial cells and other pathogens), viruses, fungi, protists, parasites, and pathogenicity markers (including but not limited to: pathogenicity islands, antibiotic resistance genes, virulence factors, toxin genes and other bioregulating compounds). Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered. As used herein, a "pathogen" is a bioagent which causes a disease or disorder.
[53] As used herein, a "bioagent division" is defined as a group of bioagents above the species level and includes, but is not limited to, orders, families, classes, clades, genera or other such groupings of bioagents above the species level.
[54] As used herein, the term "bioagent identifying amplicon" refers to a polynucleotide that is amplified from nucleic acid of a bioagent in an amplification reaction and which 1) provides sufficient variability to distinguish among bioagents from whose nucleic acid the bioagent identifying amplicon is produced and 2) whose molecular mass is amenable to a rapid and convenient molecular mass determination modality such as mass spectrometry, for example.
[55] As used herein, the term "biological product" refers to any product originating from an organism. Biological products are often products of processes of biotechnology. Examples of biological products include, but are not limited to: cultured cell lines, cellular components, antibodies, proteins and other cell-derived biomolecules, growth media, growth harvest fluids, natural products and bio-pharmaceutical products.
[56] The terms "biowarfare agent" and "bioweapon" are synonymous and refer to a bacterium, virus, fungus or protozoan that could be deployed as a weapon to cause bodily harm to individuals. Military or terrorist groups may be implicated in deployment of biowarfare agents.
[57] As used herein, the term "broad range survey primer pair" refers to a primer pair designed to produce bioagent identifying amplicons across different broad groupings of bioagents. For example, the ribosomal RNA-targeted primer pairs are broad range survey primer pairs which have the capability of producing bacterial bioagent identifying amplicons for essentially all known bacteria. With respect to broad range primer pairs employed for identification of bacteria, a broad range survey primer pair for bacteria such as 16S rRNA primer pair number 346 (SEQ ID NOs: 202:1110), for example, will produce a bacterial bioagent identifying amplicon for essentially all known bacteria.
[58] The term "calibration amplicon" refers to a nucleic acid segment representing an amplification product obtained by amplification of a calibration sequence with a pair of primers designed to produce a bioagent identifying amplicon.
[59] The term "calibration sequence" refers to a polynucleotide sequence to which a given pair of primers hybridizes for the purpose of producing an internal (i.e., included in the reaction) calibration standard amplification product for use in determining the quantity of a bioagent in a sample. The calibration sequence may be expressly added to an amplification reaction, or may already be present in the sample prior to analysis.
[60] The term "clade primer pair" refers to a primer pair designed to produce bioagent identifying amplicons for species belonging to a clade group. A clade primer pair may also be considered as a "speciating" primer pair which is useful for distinguishing among closely related species.
[61] The term "codon" refers to a set of three adjoined nucleotides (triplet) that codes for an amino acid or a termination signal.
[62] As used herein, the term "codon base composition analysis," refers to determination of the base composition of an individual codon by obtaining a bioagent identifying amplicon that includes the codon. The bioagent identifying amplicon will at least include regions of the target nucleic acid sequence to which the primers hybridize for generation of the bioagent identifying amplicon as well as the codon being analyzed, located between the two primer hybridization regions. Codon base composition analysis is particularly useful for interrogating codons suspected of containing mutations that confer drug resistance to bacterial and viral pathogens.
[63] As used herein, the terms "complementary" or "complementarity" are used in reference to polynucleotides (i.e., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) related by the base-pairing rules. For example, the sequence "5'-A-G-T-3'" is complementary to the sequence "3'-T-C-A-5'." Complementarity may be "partial," in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be "complete" or "total" complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. Either term may also be used in reference to individual nucleotides, especially within the context of polynucleotides. For example, a particular nucleotide within an oligonucleotide may be noted for its complementarity, or lack thereof, to a nucleotide within another nucleic acid strand, in contrast or comparison to the complementarity between the rest of the oligonucleotide and the nucleic acid strand.
[64] The term "complement of a nucleic acid sequence" as used herein refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired with the 3' end of the other, is in "antiparallel association." Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids disclosed herein and include, for example, inosine and 7-deazaguanine. Complementarity need not be perfect; stable duplexes may contain mismatched base pairs or unmatched bases.
Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, base composition and sequence of the oligonucleotide, ionic strength and incidence of mismatched base pairs. Where a first oligonucleotide is complementary to a region of a target nucleic acid and a second oligonucleotide has complementarity to the same region (or a portion of this region), a "region of overlap" exists along the target nucleic acid. The degree of overlap will vary depending upon the extent of the complementarity.
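The base-pairing rules and the distinction between partial and complete complementarity can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names are hypothetical, only the natural bases A, C, G and T are handled, the two strands are assumed to be of equal length, and duplex stability (which the text notes is determined empirically) is not modeled.

```python
# Translation table implementing the base-pairing rules A-T and G-C.
PAIRING = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the complementary strand, written 5' to 3'."""
    return sequence.upper().translate(PAIRING)[::-1]


def complementarity(strand_a, strand_b):
    """Fraction of positions that base-pair when two equal-length strands
    (both written 5' to 3') are aligned in antiparallel association."""
    if len(strand_a) != len(strand_b):
        raise ValueError("this sketch only compares strands of equal length")
    partner = reverse_complement(strand_b)
    paired = sum(x == y for x, y in zip(strand_a.upper(), partner))
    return paired / len(strand_a)


# 5'-AGT-3' pairs with 3'-TCA-5', which reads 5'-ACT-3'.
print(reverse_complement("AGT"))      # ACT
print(complementarity("AGT", "ACT"))  # 1.0, complete complementarity
print(complementarity("AGT", "ACC"))  # two of three positions pair
```

A value of 1.0 corresponds to "complete" complementarity; any lower value corresponds to "partial" complementarity in the sense used above.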
[65] As used herein, the term "division-wide primer pair" refers to a primer pair designed to produce bioagent identifying amplicons within sections of a broader spectrum of bioagents. For example, primer pair number 352 (SEQ ID NOs: 687:1411), a division-wide primer pair, is designed to produce bacterial bioagent identifying amplicons for members of the Bacillus group of bacteria which comprises, for example, members of the genera Streptococci, Enterococci, and Staphylococci. Other division-wide primer pairs may be used to produce bacterial bioagent identifying amplicons for other groups of bacterial bioagents.
[66] As used herein, the term "concurrently amplifying" used with respect to more than one amplification reaction refers to the act of simultaneously amplifying more than one nucleic acid in a single reaction mixture.
[67] As used herein, the term "drill-down primer pair" refers to a primer pair designed to produce bioagent identifying amplicons for identification of sub-species characteristics or confirmation of a species assignment. For example, primer pair number 2146 (SEQ ID NOs: 437:1137), a drill-down Staphylococcus aureus genotyping primer pair, is designed to produce Staphylococcus aureus genotyping amplicons. Other drill-down primer pairs may be used to produce bioagent identifying amplicons for Staphylococcus aureus and other bacterial species.
[68] The term "duplex" refers to the state of nucleic acids in which the base portions of the nucleotides on one strand are bound through hydrogen bonding to their complementary bases arrayed on a second strand. The condition of being in a duplex form reflects on the state of the bases of a nucleic acid. By virtue of base pairing, the strands of nucleic acid also generally assume the tertiary structure of a double helix, having a major and a minor groove. The assumption of the helical form is implicit in the act of becoming duplexed.
[69] As used herein, the term "etiology" refers to the causes or origins of diseases or abnormal physiological conditions.
[70] The term "gene" refers to a DNA sequence that comprises control and coding sequences necessary for the production of an RNA having a non-coding function (e.g., a ribosomal or transfer RNA), a polypeptide or a precursor. The RNA or polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or function is retained.
[71] The terms "homology," "homologous" and "sequence identity" refer to a degree of identity. There may be partial homology or complete homology. A partially homologous sequence is one that is less than 100% identical to another sequence. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but having two non-identical residues has 18 of 20 identical residues (18/20 = 0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20 = 0.75 or 75% sequence identity with the 20 nucleobase primer. As used herein, sequence identity is meant to be properly determined when the query sequence and the subject sequence are both described and aligned in the 5' to 3' direction. Sequence alignment algorithms such as BLAST will return results in two different alignment orientations. In the Plus/Plus orientation, both the query sequence and the subject sequence are aligned in the 5' to 3' direction. On the other hand, in the Plus/Minus orientation, the query sequence is in the 5' to 3' direction while the subject sequence is in the 3' to 5' direction. It should be understood that with respect to the primers disclosed herein, sequence identity is properly determined when the alignment is designated as Plus/Plus. Sequence identity may also encompass alternate or modified nucleobases that perform in a functionally similar manner to the regular nucleobases adenine, thymine, guanine and cytosine with respect to hybridization and primer extension in amplification reactions. In a non-limiting example, if the 5-propynyl pyrimidines propyne C and/or propyne T replace one or more C or T residues in one primer which is otherwise identical to another primer in sequence and length, the two primers will have 100% sequence identity with each other.
In another non-limiting example, Inosine (I) may be used as a replacement for G or T and effectively hybridize to C, A or U (uracil). Thus, if inosine replaces one or more C, A or U residues in one primer which is otherwise identical to another primer in sequence and length, the two primers will have 100% sequence identity with each other. Other such modified or universal bases may exist which would perform in a functionally similar manner for hybridization and amplification reactions and will be understood to fall within this definition of sequence identity.
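The two worked examples of sequence identity above (18/20 = 90% and 15/20 = 75%) can be reproduced with a short Python sketch. It rests on stated assumptions: both sequences are written 5' to 3' (the Plus/Plus orientation), the alignment is ungapped, identical residues are divided by the length of the longer sequence, and modified or universal bases such as inosine are not treated specially. The function name and the 20-nucleobase test sequence are hypothetical; this is not a substitute for an alignment tool such as BLAST.

```python
def sequence_identity(query, subject):
    """Best ungapped Plus/Plus identity between two primers, expressed as
    identical residues divided by the length of the longer primer."""
    q, s = query.upper(), subject.upper()
    shorter, longer = (q, s) if len(q) <= len(s) else (s, q)
    best = 0
    # Slide the shorter primer along the longer one and keep the placement
    # with the most identical residues.
    for offset in range(len(longer) - len(shorter) + 1):
        window = longer[offset:offset + len(shorter)]
        identical = sum(a == b for a, b in zip(shorter, window))
        best = max(best, identical)
    return best / len(longer)


# Hypothetical 20-nucleobase primer used only for illustration.
reference = "ACGTACGTACGTACGTACGT"
# Same primer with two non-identical residues (positions 0 and 10).
variant = "T" + reference[1:10] + "T" + reference[11:]
# 15-nucleobase primer identical to a segment of the reference.
segment = reference[3:18]

print(sequence_identity(variant, reference))  # 18 of 20 residues identical
print(sequence_identity(segment, reference))  # 15 of 20 residues identical
```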
[72] As used herein, "housekeeping gene" refers to a gene encoding a protein or RNA involved in basic functions required for survival and reproduction of a bioagent. Housekeeping genes include, but are not limited to genes encoding RNA or proteins involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like.
[73] As used herein, the term "hybridization" is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are influenced by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, and the Tm of the formed hybrid. "Hybridization" methods involve the annealing of one nucleic acid to another, complementary nucleic acid, i.e., a nucleic acid having a complementary nucleotide sequence. The ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction is a well-recognized phenomenon. The initial observations of the "hybridization" process by Marmur and Lane, Proc. Natl. Acad. Sci. USA 46:453 (1960) and Doty et al., Proc. Natl. Acad. Sci. USA 46:461 (1960) have been followed by the refinement of this process into an essential tool of modern biology.
[74] The term "in silico" refers to processes taking place via computer calculations. For example, electronic PCR (ePCR) is a process analogous to ordinary PCR except that it is carried out using nucleic acid sequences and primer pair sequences stored on a computer formatted medium.
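As an illustration of what an in silico PCR involves, the following Python sketch locates a forward primer and the binding site of a reverse primer on a stored template sequence and returns the predicted amplicon. It is a deliberately simplified sketch and not the ePCR program itself: exact primer matches are assumed, only the sense strand of the template is searched, and the function names, the `max_length` parameter and all sequences are hypothetical.

```python
PAIRING = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the complementary strand, written 5' to 3'."""
    return sequence.upper().translate(PAIRING)[::-1]


def electronic_pcr(template, forward, reverse, max_length=500):
    """Predict amplicons (sense strand, primers included) for a primer pair,
    assuming exact matches of both primers to the template."""
    template = template.upper()
    forward = forward.upper()
    # The reverse primer anneals to the sense strand, so its binding site
    # appears in the sense strand as its reverse complement.
    reverse_site = reverse_complement(reverse)
    amplicons = []
    start = template.find(forward)
    while start != -1:
        end = template.find(reverse_site, start + len(forward))
        if end != -1:
            product = template[start:end + len(reverse_site)]
            if len(product) <= max_length:
                amplicons.append(product)
        start = template.find(forward, start + 1)
    return amplicons


# Hypothetical template and primers, for illustration only.
template = "TTTTACGTACGGGAAACCCTTGACAAAAA"
print(electronic_pcr(template, forward="ACGTAC", reverse="TGTCAA"))
```

A practical tool would also tolerate mismatches and search both strands; those refinements are omitted here to keep the sketch short.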
[75] As used herein, "intelligent primers" are primers that are designed to bind to highly conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and, upon amplification, yield amplification products which ideally provide enough variability to distinguish individual bioagents, and which are amenable to molecular mass analysis. By the term "highly conserved," it is meant that the sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity among all, or at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% of species or strains.
[76] The "ligase chain reaction" (LCR; sometimes referred to as "Ligase Amplification Reaction" (LAR)), described by Barany, Proc. Natl. Acad. Sci., 88:189 (1991); Barany, PCR Methods and Applic., 1:5 (1991); and Wu and Wallace, Genomics 4:560 (1989), has developed into a well-recognized alternative method for amplifying nucleic acids. In LCR, four oligonucleotides (two adjacent oligonucleotides which uniquely hybridize to one strand of target DNA, and a complementary set of adjacent oligonucleotides that hybridize to the opposite strand) are mixed and DNA ligase is added to the mixture. Provided that there is complete complementarity at the junction, ligase will covalently link each set of hybridized molecules. Importantly, in LCR, two probes are ligated together only when they base-pair with sequences in the target sample, without gaps or mismatches. Repeated cycles of denaturation, hybridization and ligation amplify a short segment of DNA. LCR has also been used in combination with PCR to achieve enhanced detection of single-base changes. However, because the four oligonucleotides used in this assay can pair to form two short ligatable fragments, there is the potential for the generation of target-independent background signal. The use of LCR for mutant screening is limited to the examination of specific nucleic acid positions.
[77] The term "locked nucleic acid" or "LNA" refers to a nucleic acid analogue containing one or more 2'-O, 4'-C-methylene-β-D-ribofuranosyl nucleotide monomers in an RNA mimicking sugar conformation. LNA oligonucleotides display unprecedented hybridization affinity toward complementary single-stranded RNA and complementary single- or double-stranded DNA. LNA oligonucleotides induce A-type (RNA-like) duplex conformations. The primers disclosed herein may contain LNA modifications.
[78] As used herein, the term "mass-modifying tag" refers to any modification to a given nucleotide which results in an increase in mass relative to the analogous non-mass modified nucleotide. Mass-modifying tags can include heavy isotopes of one or more elements included in the nucleotide, such as carbon-13 for example. Other possible modifications include addition of substituents such as iodine or bromine at the 5 position of the nucleobase, for example.
[79] The term "mass spectrometry" refers to measurement of the mass of atoms or molecules. The molecules are first converted to ions, which are separated using electric or magnetic fields according to the ratio of their mass to electric charge. The measured masses are used to identify the molecules.
[80] The term "microorganism" as used herein means an organism too small to be observed with the unaided eye and includes, but is not limited to, bacteria, viruses, protozoans, fungi, and ciliates.
[81] The term "multi-drug resistant" or "multiple-drug resistant" refers to a microorganism which is resistant to more than one of the antibiotics or antimicrobial agents used in the treatment of said microorganism.
[82] The term "multiplex PCR" refers to a PCR reaction where more than one primer set is included in the reaction pool, allowing 2 or more different DNA targets to be amplified by PCR in a single reaction tube.
[83] The term "non-template tag" refers to a stretch of at least three guanine or cytosine nucleobases of a primer used to produce a bioagent identifying amplicon which are not complementary to the template. A non-template tag is incorporated into a primer for the purpose of increasing the primer-duplex stability of later cycles of amplification by incorporation of extra G-C pairs which each have one additional hydrogen bond relative to an A-T pair.
[84] The term "nucleic acid sequence" as used herein refers to the linear composition of the nucleic acid residues A, T, C or G or any modifications thereof, within an oligonucleotide, nucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single or double stranded, and represent the sense or antisense strand.
[85] As used herein, the term "nucleobase" is synonymous with other terms in use in the art including "nucleotide," "deoxynucleotide," "nucleotide residue," "deoxynucleotide residue," "nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate (dNTP)."
[86] The term "nucleotide analog" as used herein refers to modified or non-naturally occurring nucleotides such as 5-propynyl pyrimidines (i.e., 5-propynyl-dTTP and 5-propynyl-dCTP) and 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP). Nucleotide analogs include base analogs and comprise modified forms of deoxyribonucleotides as well as ribonucleotides.
[87] The term "oligonucleotide" as used herein is defined as a molecule comprising two or more deoxyribonucleotides or ribonucleotides, preferably at least 5 nucleotides, more preferably at least about 13 to 35 nucleotides. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, PCR, or a combination thereof. Because mononucleotides are reacted to make oligonucleotides in a manner such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via a phosphodiester linkage, an end of an oligonucleotide is referred to as the "5'-end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring and as the "3'-end" if its 3' oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends. A first region along a nucleic acid strand is said to be upstream of another region if the 3' end of the first region is before the 5' end of the second region when moving along a strand of nucleic acid in a 5' to 3' direction. All oligonucleotide primers disclosed herein are understood to be presented in the 5' to 3' direction when reading left to right. When two different, non-overlapping oligonucleotides anneal to different regions of the same linear complementary nucleic acid sequence, and the 3' end of one oligonucleotide points towards the 5' end of the other, the former may be called the "upstream" oligonucleotide and the latter the "downstream" oligonucleotide.
Similarly, when two overlapping oligonucleotides are hybridized to the same linear complementary nucleic acid sequence, with the first oligonucleotide positioned such that its 5' end is upstream of the 5' end of the second oligonucleotide, and the 3' end of the first oligonucleotide is upstream of the 3' end of the second oligonucleotide, the first oligonucleotide may be called the "upstream" oligonucleotide and the second oligonucleotide may be called the "downstream" oligonucleotide.
[88] As used herein, a "pathogen" is a bioagent which causes a disease or disorder.
[89] As used herein, the terms "PCR product," "PCR fragment," and "amplification product" refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
[90] The term "peptide nucleic acid" ("PNA") as used herein refers to a molecule comprising bases or base analogs such as would be found in natural nucleic acid, but attached to a peptide backbone rather than the sugar-phosphate backbone typical of nucleic acids. The attachment of the bases to the peptide is such as to allow the bases to base pair with complementary bases of nucleic acid in a manner similar to that of an oligonucleotide. These small molecules, also designated antigene agents, stop transcript elongation by binding to their complementary strand of nucleic acid (Nielsen, et al. Anticancer Drug Des. 8:53-63). The primers disclosed herein may comprise PNAs.
[91] The term "polymerase" refers to an enzyme having the ability to synthesize a complementary strand of nucleic acid from a starting template nucleic acid strand and free dNTPs.
[92] As used herein, the term "polymerase chain reaction" ("PCR") refers to the method of K.B. Mullis U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference, that describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one "cycle"; there can be numerous "cycles") to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the "polymerase chain reaction" (hereinafter "PCR"). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be "PCR amplified." 
With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
[93] The term "polymerization means" or "polymerization agent" refers to any agent capable of facilitating the addition of nucleoside triphosphates to an oligonucleotide. Preferred polymerization means comprise DNA and RNA polymerases.
[94] As used herein, the terms "pair of primers," or "primer pair" are synonymous. A primer pair is used for amplification of a nucleic acid sequence. A pair of primers comprises a forward primer and a reverse primer. The forward primer hybridizes to a sense strand of a target gene sequence to be amplified and primes synthesis of an antisense strand (complementary to the sense strand) using the target sequence as a template. A reverse primer hybridizes to the antisense strand of a target gene sequence to be amplified and primes synthesis of a sense strand (complementary to the antisense strand) using the target sequence as a template.
[95] The primers are designed to bind to highly conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and yield amplification products which ideally provide enough variability to distinguish each individual bioagent, and which are amenable to molecular mass analysis. In some embodiments, the highly conserved sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity, or between about 99-100% identity. The molecular mass of a given amplification product provides a means of identifying the bioagent from which it was obtained, due to the variability of the variable region. Thus, design of the primers requires selection of a variable region with appropriate variability to resolve the identity of a given bioagent. Bioagent identifying amplicons are ideally specific to the identity of the bioagent.
[96] Properties of the primers may include any number of properties related to structure including, but not limited to: nucleobase length, which may be contiguous (linked together) or non-contiguous (for example, two or more contiguous segments which are joined by a linker or loop moiety); modified or universal nucleobases (used for specific purposes such as, for example, increasing hybridization affinity, preventing non-templated adenylation and modifying molecular mass); and percent complementarity to a given target sequence.
[97] Properties of the primers also include functional features including, but not limited to, orientation of hybridization (forward or reverse) relative to a nucleic acid template. The coding or sense strand is the strand to which the forward priming primer hybridizes (forward priming orientation) while the reverse priming primer hybridizes to the non-coding or antisense strand (reverse priming orientation). The functional properties of a given primer pair also include the generic template nucleic acid to which the primer pair hybridizes. For example, identification of bioagents can be accomplished at different levels using primers suited to resolution of each individual level of identification. Broad range survey primers are designed with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of bioagents above the species level of bioagents). In some embodiments, broad range survey intelligent primers are capable of identification of bioagents at the species or sub-species level. Other primers may have the functionality of producing bioagent identifying amplicons for members of a given taxonomic genus, clade, species, sub-species or genotype (including genetic variants which may include presence of virulence genes or antibiotic resistance genes or mutations). Additional functional properties of primer pairs include the functionality of performing amplification either singly (single primer pair per amplification reaction vessel) or in a multiplex fashion (multiple primer pairs and multiple amplification reactions within a single reaction vessel).
[98] As used herein, the terms "purified" or "substantially purified" refer to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated, and are at least 60% free, preferably 75% free, and most preferably 90% free from other components with which they are naturally associated. An "isolated polynucleotide" or "isolated oligonucleotide" is therefore a substantially purified polynucleotide. As used herein, a kit can comprise one or more purified oligonucleotide primer pairs. When the kit comprises more than one purified oligonucleotide primer pair, each of those primer pairs can be in a separate vial of the kit. Alternatively, all of the desired purified oligonucleotide primer pairs can be in the same vial. In this instance, each of the desired primer pairs is referred to as purified, meaning that there are no nucleic acids in said vial other than the plurality of desired primer pairs.
[99] The term "reverse transcriptase" refers to an enzyme having the ability to transcribe DNA from an RNA template. This enzymatic activity is known as reverse transcriptase activity. Reverse transcriptase activity is desirable in order to obtain DNA from RNA viruses which can then be amplified and analyzed by the methods disclosed herein.
[100] The term "ribosomal RNA" or "rRNA" refers to the primary ribonucleic acid constituent of ribosomes. Ribosomes are the protein-manufacturing organelles of cells and exist in the cytoplasm. Ribosomal RNAs are transcribed from the DNA genes encoding them.
[101] The term "sample" in the present specification and claims is used in its broadest sense. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples. A sample may include a specimen of synthetic origin. Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste. Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, lagomorphs, rodents, etc. Environmental samples include environmental material such as surface matter, soil, water, air and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to the methods disclosed herein. The term "source of target nucleic acid" refers to any sample that contains nucleic acids (RNA or DNA). Particularly preferred sources of target nucleic acids are biological samples including, but not limited to, blood, saliva, cerebrospinal fluid, pleural fluid, milk, lymph, sputum and semen.
[102] As used herein, the term "sample template" refers to nucleic acid originating from a sample that is analyzed for the presence of "target" (defined below). In contrast, "background template" is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is often a contaminant. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
[103] A "segment" is defined herein as a region of nucleic acid within a target sequence.
[104] The "self-sustained sequence replication reaction" (3SR) (Guatelli et al., Proc. Natl. Acad. Sci., 87:1874-1878 [1990], with an erratum at Proc. Natl. Acad. Sci., 87:7797 [1990]) is a transcription-based in vitro amplification system (Kwok et al., Proc. Natl. Acad. Sci., 86:1173-1177 [1989]) that can exponentially amplify RNA sequences at a uniform temperature. The amplified RNA can then be utilized for mutation detection (Fahy et al., PCR Meth. Appl., 1:25-33 [1991]). In this method, an oligonucleotide primer is used to add a phage RNA polymerase promoter to the 5' end of the sequence of interest. In a cocktail of enzymes and substrates that includes a second primer, reverse transcriptase, RNase H, RNA polymerase and ribo- and deoxyribonucleoside triphosphates, the target sequence undergoes repeated rounds of transcription, cDNA synthesis and second-strand synthesis to amplify the area of interest. The use of 3SR to detect mutations is kinetically limited to screening small segments of DNA (e.g., 200-300 base pairs).
[105] As used herein, the term "sequence alignment" refers to a listing of multiple DNA or amino acid sequences arranged (aligned) to highlight their similarities. The listings can be made using bioinformatics computer programs.
[106] As used herein, the terms "sepsis" and "septicemia" refer to disease caused by the spread of bacteria and their toxins in the bloodstream. For example, a "sepsis-causing bacterium" is the causative agent of sepsis, i.e., the bacterium infecting the bloodstream of an individual with sepsis.
[107] As used herein, the term "speciating primer pair" refers to a primer pair designed to produce a bioagent identifying amplicon with the diagnostic capability of identifying species members of a group of genera or a particular genus of bioagents. Primer pair number 2249 (SEQ ID NOs: 430:1321), for example, is a speciating primer pair used to distinguish Staphylococcus aureus from other species of the genus Staphylococcus.
[108] As used herein, a "sub-species characteristic" is a genetic characteristic that provides the means to distinguish two members of the same bioagent species. For example, one viral strain could be distinguished from another viral strain of the same species by possessing a genetic change (e.g., a nucleotide deletion, addition or substitution) in one of the viral genes, such as the RNA-dependent RNA polymerase. Sub-species characteristics such as virulence genes and drug resistance genes are responsible for the phenotypic differences among the different strains of bacteria.
[109] As used herein, the term "target" is used in a broad sense to indicate the gene or genomic region being amplified by the primers. Because the methods disclosed herein provide a plurality of amplification products from any given primer pair (depending on the bioagent being analyzed), multiple amplification products from different specific nucleic acid sequences may be obtained. Thus, the term "target" is not used to refer to a single specific nucleic acid sequence. The "target" is sought to be sorted out from other nucleic acid sequences and contains a sequence that has at least partial complementarity with an oligonucleotide primer. The target nucleic acid may comprise single- or double-stranded DNA or RNA. A "segment" is defined as a region of nucleic acid within the target sequence.
[110] The term "template" refers to a strand of nucleic acid on which a complementary copy is built from nucleoside triphosphates through the activity of a template-dependent nucleic acid polymerase. Within a duplex the template strand is, by convention, depicted and described as the "bottom" strand. Similarly, the non-template strand is often depicted and described as the "top" strand.
[111] As used herein, the term "Tm" is used in reference to the "melting temperature." The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. Several equations for calculating the Tm of nucleic acids are well known in the art. As indicated by standard references, a simple estimate of the Tm value may be calculated by the equation: Tm = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see, e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi, H. T. & SantaLucia, J., Jr. Thermodynamics and NMR of internal G.T mismatches in DNA. Biochemistry 36, 10581-94 (1997)) include more sophisticated computations which take structural and environmental, as well as sequence, characteristics into account for the calculation of Tm.
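By way of illustration only, the simple estimate above can be written as a short Python sketch. The function names percent_gc and simple_tm are hypothetical and are not part of the disclosure; the sketch assumes an unambiguous DNA sequence and the 1 M NaCl condition stated above, and it does not implement the more sophisticated nearest-neighbor computations.

```python
def percent_gc(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = sum(1 for base in seq if base in "GC")
    return 100.0 * gc_count / len(seq)


def simple_tm(seq: str) -> float:
    """Estimate Tm (degrees C) as 81.5 + 0.41 * (% G+C).

    Assumption: aqueous solution at 1 M NaCl, as stated in the text.
    """
    return 81.5 + 0.41 * percent_gc(seq)


if __name__ == "__main__":
    # A hypothetical 20-mer with 50% G+C content.
    print(round(simple_tm("ACGTACGTACGTACGTACGT"), 1))
```

The estimate depends only on G+C content, so two sequences of equal composition receive the same value regardless of base order.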
[112] The term "triangulation genotyping analysis" refers to a method of genotyping a bioagent by measurement of molecular masses or base compositions of amplification products, corresponding to bioagent identifying amplicons, obtained by amplification of regions of more than one gene. In this sense, the term "triangulation" refers to a method of establishing the accuracy of information by comparing three or more types of independent points of view bearing on the same findings. Triangulation genotyping analysis carried out with a plurality of triangulation genotyping analysis primers yields a plurality of base compositions that then provide a pattern or "barcode" from which a species type can be assigned. The species type may represent a previously known sub-species or strain, or may be a previously unknown strain having a specific and previously unobserved base composition barcode indicating the existence of a previously unknown genotype.
[113] As used herein, the term "triangulation genotyping analysis primer pair" is a primer pair designed to produce bioagent identifying amplicons for determining species types in a triangulation genotyping analysis.
[114] The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as "triangulation identification." Triangulation identification is pursued by analyzing a plurality of bioagent identifying amplicons produced with different primer pairs. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.
[115] As used herein, the term "unknown bioagent" may mean either: (i) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus for example) but which is not known to be in a sample to be analyzed, or (ii) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003). For example, if the method for identification of coronaviruses disclosed in commonly owned U.S. Patent Serial No. 10/829,826 (incorporated herein by reference in its entirety) was to be employed prior to April 2003 to identify the SARS coronavirus in a clinical sample, both meanings of "unknown" bioagent are applicable since the SARS coronavirus was unknown to science prior to April, 2003 and since it was not known what bioagent (in this case a coronavirus) was present in the sample. On the other hand, if the method of U.S. Patent Serial No. 10/829,826 was to be employed subsequent to April 2003 to identify the SARS coronavirus in a clinical sample, only the first meaning (i) of "unknown" bioagent would apply since the SARS coronavirus became known to science subsequent to April 2003 and since it was not known what bioagent was present in the sample.
[116] The term "variable sequence" as used herein refers to differences in nucleic acid sequence between two nucleic acids. For example, the genes of two different bacterial species may vary in sequence by the presence of single base substitutions and/or deletions or insertions of one or more nucleotides. These two forms of the structural gene are said to vary in sequence from one another. As used herein, the term "viral nucleic acid" includes, but is not limited to, DNA, RNA, or DNA that has been obtained from viral RNA, such as, for example, by performing a reverse transcription reaction. Viral RNA can either be single-stranded (of positive or negative polarity) or double-stranded.
[117] The term "virus" refers to obligate, ultramicroscopic, parasites that are incapable of autonomous replication (i.e., replication requires the use of the host cell's machinery). Viruses can survive outside of a host cell but cannot replicate.
[118] The term "wild-type" refers to a gene or a gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene. In contrast, the term "modified", "mutant" or "polymorphic" refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
[119] As used herein, a "wobble base" is a variation in a codon found at the third nucleotide position of a DNA triplet. Variations in conserved regions of sequence are often found at the third nucleotide position due to redundancy in the amino acid code.
DETAILED DESCRIPTION OF EMBODIMENTS
A. Bioagent Identifying Amplicons
[120] Disclosed herein are methods for detection and identification of unknown bioagents using bioagent identifying amplicons. Primers are selected to hybridize to conserved sequence regions of nucleic acids derived from a bioagent, and which bracket variable sequence regions to yield a bioagent identifying amplicon, which can be amplified and which is amenable to molecular mass determination. The molecular mass then provides a means to uniquely identify the bioagent without a requirement for prior knowledge of the possible identity of the bioagent. The molecular mass or corresponding base composition signature of the amplification product is then matched against a database of molecular masses or base composition signatures. A match is obtained when an experimentally-determined molecular mass or base composition of an analyzed amplification product is compared with known molecular masses or base compositions of known bioagent identifying amplicons and the experimentally determined molecular mass or base composition is the same as the molecular mass or base composition of one of the known bioagent identifying amplicons.
Alternatively, the experimentally-determined molecular mass or base composition may be within experimental error of the molecular mass or base composition of a known bioagent identifying amplicon and still be classified as a match. In some cases, the match may also be classified using a probability of match model such as the models described in U.S. Serial No. 11/073,362, which is commonly owned and incorporated herein by reference in entirety. Furthermore, the method can be applied to rapid parallel multiplex analyses, the results of which can be employed in a triangulation identification strategy. The present method provides rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for bioagent detection and identification.
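The matching step described above, in which a measured molecular mass is accepted as a match when it lies within experimental error of a database entry, may be sketched as follows. This is a minimal illustration only: the function name match_mass, the parts-per-million tolerance and the database entries are assumptions chosen for the example and do not reflect actual measured values or the probability of match models referenced above.

```python
from typing import Dict, List


def match_mass(measured: float,
               database: Dict[str, float],
               tol_ppm: float = 25.0) -> List[str]:
    """Return database entries whose mass is within tol_ppm of the measurement.

    Assumption: experimental error is modeled as a symmetric ppm window.
    """
    hits = []
    for name, known_mass in database.items():
        allowed_error = known_mass * tol_ppm / 1e6
        if abs(measured - known_mass) <= allowed_error:
            hits.append(name)
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical amplicon masses (Da) for two hypothetical bioagents.
    db = {"organism_A": 30000.00, "organism_B": 30015.00}
    print(match_mass(30000.50, db))  # within 25 ppm of organism_A only
    print(match_mass(30005.00, db))  # outside the window of both entries
```

An empty result indicates that no known bioagent identifying amplicon lies within the chosen error window for that primer pair.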
[121] Despite enormous biological diversity, all forms of life on earth share sets of essential, common features in their genomes. Since genetic data provide the underlying basis for identification of bioagents by the methods disclosed herein, it is necessary to select segments of nucleic acids which ideally provide enough variability to distinguish each individual bioagent and whose molecular mass is amenable to molecular mass determination.
[122] Unlike bacterial genomes, which exhibit conservation of numerous genes (i.e. housekeeping genes) across all organisms, viruses do not share a gene that is essential and conserved among all virus families. Therefore, viral identification is achieved within smaller groups of related viruses, such as members of a particular virus family or genus. For example, RNA-dependent RNA polymerase is present in all single-stranded RNA viruses and can be used for broad priming as well as resolution within the virus family.
[123] In some embodiments, at least one bacterial nucleic acid segment is amplified in the process of identifying the bacterial bioagent. Thus, the nucleic acid segments that can be amplified by the primers disclosed herein and that provide enough variability to distinguish each individual bioagent and whose molecular masses are amenable to molecular mass determination are herein described as bioagent identifying amplicons.
[124] In some embodiments, bioagent identifying amplicons comprise from about 27 to about 200 nucleobases (i.e. from about 27 to about 200 linked nucleosides), although both longer and shorter regions may be used. One of ordinary skill in the art will appreciate that these embodiments include compounds of 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199 or 200 nucleobases in length, or any range therewithin.
[125] It is the combination of the portions of the bioagent nucleic acid segment to which the primers hybridize (hybridization sites) and the variable region between the primer hybridization sites that comprises the bioagent identifying amplicon. Thus, it can be said that a given bioagent identifying amplicon is "defined by" a given pair of primers.
[126] In some embodiments, bioagent identifying amplicons amenable to molecular mass determination which are produced by the primers described herein are either of a length, size or mass compatible with the particular mode of molecular mass determination or compatible with a means of providing a predictable fragmentation pattern in order to obtain predictable fragments of a length compatible with the particular mode of molecular mass determination. Such means of providing a predictable fragmentation pattern of an amplification product include, but are not limited to, cleavage with chemical reagents, restriction enzymes or cleavage primers, for example. Thus, in some embodiments, bioagent identifying amplicons are larger than 200 nucleobases and are amenable to molecular mass determination following restriction digestion. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art.
[127] In some embodiments, amplification products corresponding to bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR) that is a routine method to those with ordinary skill in the molecular biology arts. Other amplification methods may be used such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA). These methods are also known to those with ordinary skill.
Primers and Primer Pairs
[128] In some embodiments, the primers are designed to bind to conserved sequence regions of a bioagent identifying amplicon that flank an intervening variable region and yield amplification products which provide variability sufficient to distinguish each individual bioagent, and which are amenable to molecular mass analysis. In some embodiments, the highly conserved sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity, or between about 99-100% identity. The molecular mass of a given amplification product provides a means of identifying the bioagent from which it was obtained, due to the variability of the variable region. Thus, design of the primers involves selection of a variable region with sufficient variability to resolve the identity of a given bioagent. In some embodiments, bioagent identifying amplicons are specific to the identity of the bioagent.
[129] In some embodiments, identification of bioagents is accomplished at different levels using primers suited to resolution of each individual level of identification. Broad range survey primers are designed with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of bioagents above the species level of bioagents). In some embodiments, broad range survey intelligent primers are capable of identification of bioagents at the species or sub-species level. Examples of broad range survey primers include, but are not limited to: primer pair numbers 346 (SEQ ID NOs: 202:1110), 347 (SEQ ID NOs: 560:1278), 348 (SEQ ID NOs: 706:895), and 361 (SEQ ID NOs: 697:1398), which target DNA encoding 16S rRNA, and primer pair numbers 349 (SEQ ID NOs: 401:1156) and 360 (SEQ ID NOs: 409:1434), which target DNA encoding 23S rRNA.
[130] In some embodiments, drill-down primers are designed with the objective of identifying a bioagent at the sub-species level (including strains, subtypes, variants and isolates) based on sub-species characteristics which may, for example, include single nucleotide polymorphisms (SNPs), variable number tandem repeats (VNTRs), deletions, drug resistance mutations or any other modification of a nucleic acid sequence of a bioagent relative to other members of a species having different sub-species characteristics. Drill-down intelligent primers are not always required for identification at the sub-species level because broad range survey intelligent primers may, in some cases, provide sufficient identification resolution to accomplish this identification objective. Examples of drill-down primers include, but are not limited to: confirmation primer pairs such as primer pair numbers 351 (SEQ ID NOs: 355:1423) and 353 (SEQ ID NOs: 220:1394), which target the pXO1 virulence plasmid of Bacillus anthracis. Other examples of drill-down primer pairs are found in sets of triangulation genotyping primer pairs such as, for example, the primer pair number 2146 (SEQ ID NOs: 437:1137) which targets the arcC gene (encoding carbamate kinase) and is included in an 8 primer pair panel or kit for use in genotyping Staphylococcus aureus, or in other panels or kits of primer pairs used for determining drug-resistant bacterial strains, such as, for example, primer pair number 2095 (SEQ ID NOs: 456:1261) which targets the pv-luk gene (encoding Panton-Valentine leukocidin) and is included in an 8 primer pair panel or kit for use in identification of drug resistant strains of Staphylococcus aureus.
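The percent identity criterion for candidate priming regions may be illustrated with the following Python sketch. It assumes that the sequences have already been aligned to equal length and treats identity as the fraction of matching positions between two aligned regions; the function names and the example alignment are hypothetical and serve only to show the arithmetic.

```python
from itertools import combinations
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions at which two equal-length regions match."""
    if len(a) != len(b) or not a:
        raise ValueError("regions must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def min_pairwise_identity(aligned: List[str], start: int, end: int) -> float:
    """Lowest pairwise identity over columns [start, end) of an alignment."""
    regions = [seq[start:end] for seq in aligned]
    return min(percent_identity(a, b) for a, b in combinations(regions, 2))


if __name__ == "__main__":
    # Hypothetical pre-aligned sequences from three organisms.
    alignment = ["ACGTACGTAA", "ACGTACGTAC", "ACGTTCGTAA"]
    print(min_pairwise_identity(alignment, 0, 4))   # candidate priming region
    print(min_pairwise_identity(alignment, 0, 10))  # full alignment
```

A region whose lowest pairwise identity falls within one of the ranges recited above would be a candidate priming region under this simplified criterion.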
[131] A representative process flow diagram for the primer selection and validation process is outlined in Figure 1. For each group of organisms, candidate target sequences are identified (200) from which nucleotide alignments are created (210) and analyzed (220). Primers are then designed by selecting appropriate priming regions (230) to facilitate the selection of candidate primer pairs (240). The primer pairs are then subjected to in silico analysis by electronic PCR (ePCR) (300) wherein bioagent identifying amplicons are obtained from sequence databases such as GenBank or other sequence collections (310) and checked for specificity in silico (320). Bioagent identifying amplicons obtained from GenBank sequences (310) can also be analyzed by a probability model which predicts the capability of a given amplicon to identify unknown bioagents such that the base compositions of amplicons with favorable probability scores are then stored in a base composition database (325). Alternatively, base compositions of the bioagent identifying amplicons obtained from the primers and GenBank sequences can be directly entered into the base composition database (330). Candidate primer pairs (240) are validated by testing their ability to hybridize to target nucleic acid by in vitro amplification using a method such as PCR analysis (400) of nucleic acid from a collection of organisms (410). Amplification products thus obtained are analyzed by gel electrophoresis or by mass spectrometry to confirm the sensitivity, specificity and reproducibility of the primers used to obtain the amplification products (420).
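The in silico step, in which a candidate primer pair is applied to a database sequence to obtain an amplicon and its base composition, may be sketched in Python as follows. This is a simplified illustration and not the ePCR tool itself: it assumes exact primer matches on a single sense-strand template, whereas practical electronic PCR tolerates mismatches. The function names and the example sequences are hypothetical.

```python
from typing import Dict, Optional

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def in_silico_amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Return the sense-strand amplicon bounded by the primers, or None.

    Assumption: both primers must match the template exactly.
    """
    template = template.upper()
    forward = forward.upper()
    reverse_site = reverse_complement(reverse)  # reverse primer site on the sense strand
    start = template.find(forward)
    if start == -1:
        return None
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


def base_composition(seq: str) -> Dict[str, int]:
    """Count A, C, G and T in a sequence."""
    return {base: seq.count(base) for base in "ACGT"}


if __name__ == "__main__":
    # Hypothetical template and primers.
    amplicon = in_silico_amplicon("TTTACGTGGATCCAAGCTTCCGGTT", "ACGTGG", "CCGGAA")
    print(amplicon, base_composition(amplicon))
```

Base compositions computed in this manner for each database sequence are the kind of entries that could populate a base composition database.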
[132] Many of the important pathogens, including the organisms of greatest concern as biowarfare agents, have been completely sequenced. This effort has greatly facilitated the design of primers for the detection of unknown bioagents. The combination of broad-range priming with division-wide and drill-down priming has been used very successfully in several applications of the technology, including environmental surveillance for biowarfare threat agents and clinical sample analysis for medically important pathogens.
[133] Synthesis of primers is well known and routine in the art. The primers may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, CA). Any other means for such synthesis known in the art may additionally or alternatively be employed.
[134] In some embodiments, primers are employed as compositions for use in methods for identification of bacterial bioagents as follows: a primer pair composition is contacted with nucleic acid (such as, for example, bacterial DNA or DNA reverse transcribed from the rRNA) of an unknown bacterial bioagent. The nucleic acid is then amplified by a nucleic acid amplification technique, such as PCR for example, to obtain an amplification product that represents a bioagent identifying amplicon. The molecular mass of each strand of the double-stranded amplification product is determined by a molecular mass measurement technique such as mass spectrometry for example, wherein the two strands of the double-stranded amplification product are separated during the ionization process. In some embodiments, the mass spectrometry is electrospray Fourier transform ion cyclotron resonance mass spectrometry (ESI-FTICR-MS) or electrospray time of flight mass spectrometry (ESI-TOF-MS). A list of possible base compositions can be generated for the molecular mass value obtained for each strand and the choice of the correct base composition from the list is facilitated by matching the base composition of one strand with a complementary base composition of the other strand. The molecular mass or base composition thus determined is then compared with a database of molecular masses or base compositions of analogous bioagent identifying amplicons for known bacterial bioagents. A match between the molecular mass or base composition of the amplification product and the molecular mass or base composition of an analogous bioagent identifying amplicon for a known bacterial bioagent indicates the identity of the unknown bioagent. In some embodiments, the primer pair used is one of the primer pairs of Table 2.
In some embodiments, the method is repeated using one or more different primer pairs to resolve possible ambiguities in the identification process or to improve the confidence level for the identification assignment.
[135] In some embodiments, a bioagent identifying amplicon may be produced using only a single primer (either the forward or reverse primer of any given primer pair), provided an appropriate amplification method is chosen, such as, for example, low stringency single primer PCR (LSSP-PCR). Adaptation of this amplification method in order to produce bioagent identifying amplicons can be accomplished by one with ordinary skill in the art without undue experimentation.
[136] In some embodiments, the oligonucleotide primers are broad range survey primers which hybridize to conserved regions of nucleic acid encoding the hexon gene of all (or between 80% and 100%, between 85% and 100%, between 90% and 100% or between 95% and 100%) known bacteria and produce bacterial bioagent identifying amplicons.
[137] In some cases, the molecular mass or base composition of a bacterial bioagent identifying amplicon defined by a broad range survey primer pair does not provide enough resolution to unambiguously identify a bacterial bioagent at or below the species level. These cases benefit from further analysis of one or more bacterial bioagent identifying amplicons generated from at least one additional broad range survey primer pair or from at least one additional division-wide primer pair. The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as triangulation identification.
[138] In other embodiments, the oligonucleotide primers are division-wide primers which hybridize to nucleic acid encoding genes of species within a genus of bacteria. In other embodiments, the oligonucleotide primers are drill-down primers which enable the identification of sub-species characteristics. Drill-down primers provide the functionality of producing bioagent identifying amplicons for drill-down analyses such as strain typing when contacted with nucleic acid under amplification conditions. Identification of such sub-species characteristics is often critical for determining proper clinical treatment of bacterial infections. In some embodiments, sub-species characteristics are identified using only broad range survey primers and division-wide and drill-down primers are not used.
[139] In some embodiments, the primers used for amplification hybridize to and amplify genomic DNA, and DNA of bacterial plasmids.
[140] In some embodiments, various computer software programs may be used to aid in design of primers for amplification reactions such as Primer Premier 5 (Premier Biosoft, Palo Alto, CA) or OLIGO Primer Analysis Software (Molecular Biology Insights, Cascade, CO). These programs allow the user to input desired hybridization conditions such as melting temperature of a primer-template duplex for example. In some embodiments, an in silico PCR search algorithm, such as ePCR, is used to analyze primer specificity across a plurality of template sequences which can be readily obtained from public sequence databases such as GenBank for example. An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs. In some embodiments, the hybridization conditions applied to the algorithm can limit the results of primer specificity obtained from the algorithm. In some embodiments, the melting temperature threshold for the primer-template duplex is specified to be 35°C or a higher temperature. In some embodiments the number of acceptable mismatches is specified to be seven mismatches or fewer. In some embodiments, the buffer components and concentrations and primer concentrations may be specified and incorporated into the algorithm, for example, an appropriate primer concentration is about 250 nM and appropriate buffer components are 50 mM sodium or potassium and 1.5 mM Mg2+.
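By way of illustration only, the mismatch-limited specificity search described above can be sketched in Python as follows. This is a simplified sketch and is not the ePCR program or the modified RNA structure search algorithm: it only slides a primer along one strand of a template and reports sites at or below a mismatch limit, and it omits the melting temperature and buffer calculations. The function name, the example primer and the example template are hypothetical.

```python
def find_priming_sites(primer, template, max_mismatches=7):
    """Return (start, mismatches) for every ungapped site on the given
    template strand where the primer differs at no more than
    max_mismatches positions. Only the supplied strand is searched."""
    primer = primer.upper()
    template = template.upper()
    sites = []
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        mismatches = sum(1 for p, t in zip(primer, window) if p != t)
        if mismatches <= max_mismatches:
            sites.append((start, mismatches))
    return sites


# Hypothetical 20-mer primer and a short hypothetical template.
example_primer = "ACGTTGCAAGGCTTAGCCAT"
example_template = "TTTT" + example_primer + "GGGG"
exact_sites = find_priming_sites(example_primer, example_template, max_mismatches=0)
```

A reverse primer would be searched against the reverse complement of the template, and a fuller implementation would apply a thermodynamic filter of the kind described above before accepting a site.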
[141] One with ordinary skill in the art of design of amplification primers will recognize that a given primer need not hybridize with 100% complementarity in order to effectively prime the synthesis of a complementary nucleic acid strand in an amplification reaction. Moreover, a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or a hairpin structure). The primers may comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% sequence identity with any of the primers listed in Table 2. Thus, in some embodiments, an extent of variation of 70% to 100%, or any range therewithin, of the sequence identity is possible relative to the specific primer sequences disclosed herein. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but has two non-identical residues has 18 of 20 identical residues (18/20 = 0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20 = 0.75 or 75% sequence identity with the 20 nucleobase primer.
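By way of illustration only, the two worked examples above can be reproduced with a short Python sketch. The sketch assumes an ungapped comparison in which the shorter primer is slid along the reference primer and the best count of identical residues is divided by the length of the reference primer. It is not the Gap program, and the function name and the example sequences are hypothetical and are not primers of Table 2.

```python
def percent_identity(primer, reference):
    """Percent sequence identity of `primer` relative to `reference`,
    using the best ungapped placement of the primer on the reference
    and the reference length as the denominator."""
    primer = primer.upper()
    reference = reference.upper()
    if len(primer) > len(reference):
        raise ValueError("primer must not be longer than the reference")
    best = 0
    for offset in range(len(reference) - len(primer) + 1):
        window = reference[offset:offset + len(primer)]
        matches = sum(1 for p, r in zip(primer, window) if p == r)
        best = max(best, matches)
    return 100 * best / len(reference)


reference_20mer = "ACGTACGTACGTACGTACGT"   # hypothetical 20 nucleobase primer
variant_20mer = "ACGAACGTACCTACGTACGT"     # same primer with two substitutions
segment_15mer = reference_20mer[2:17]      # identical 15 nucleobase segment

identity_variant = percent_identity(variant_20mer, reference_20mer)   # 18/20
identity_segment = percent_identity(segment_15mer, reference_20mer)   # 15/20
```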
[142] Percent homology, sequence identity or complementarity, can be determined by, for example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for UNIX, Genetics Computer Group, University Research Park, Madison WI), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489). In some embodiments, complementarity of primers with respect to the conserved priming regions of bacterial nucleic acid is between about 70% and about 75%. In other embodiments, homology, sequence identity or complementarity, is between about 75% and about 80%. In yet other embodiments, homology, sequence identity or complementarity, is at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or is 100%.
[143] In some embodiments, the primers described herein comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 98%, or at least 99%, or 100% (or any range therewithin) sequence identity with the primer sequences specifically disclosed herein.
[144] One with ordinary skill is able to calculate percent sequence identity or percent sequence homology and able to determine, without undue experimentation, the effects of variation of primer sequence identity on the function of the primer in its role in priming synthesis of a complementary strand of nucleic acid for production of an amplification product of a corresponding bioagent identifying amplicon.
[145] In one embodiment, the primers are at least 13 nucleobases in length. In another embodiment, the primers are less than 36 nucleobases in length.
[146] In some embodiments, the oligonucleotide primers are 13 to 35 nucleobases in length (13 to 35 linked nucleotide residues). These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleobases in length, or any range therewithin. The methods disclosed herein contemplate use of both longer and shorter primers. Furthermore, the primers may also be linked to one or more other desired moieties, including, but not limited to, affinity groups, ligands, regions of nucleic acid that are not complementary to the nucleic acid to be amplified, labels, etc. Primers may also form hairpin structures. For example, hairpin primers may be used to amplify short target nucleic acid molecules. The presence of the hairpin may stabilize the amplification complex (see e.g., TAQMAN MicroRNA Assays, Applied Biosystems, Foster City, California).
[147] In some embodiments, any oligonucleotide primer pair may have one or both primers with less than 70% sequence homology with a corresponding member of any of the primer pairs of Table 2 if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon. In other embodiments, any oligonucleotide primer pair may have one or both primers with a length greater than 35 nucleobases if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon.
[148] In some embodiments, the function of a given primer may be substituted by a combination of two or more primer segments that hybridize adjacent to each other or that are linked by a nucleic acid loop structure or linker which allows a polymerase to extend the two or more primers in an amplification reaction.
[149] In some embodiments, the primer pairs used for obtaining bioagent identifying amplicons are the primer pairs of Table 2. In other embodiments, other combinations of primer pairs are possible by combining certain members of the forward primers with certain members of the reverse primers. An example can be seen in Table 2 for two primer pair combinations of forward primer 16S_EC_789_810_F (SEQ ID NO: 206), with the reverse primers 16S_EC_880_894_R (SEQ ID NO: 796), or 16S_EC_882_899_R (SEQ ID NO: 818). Arriving at a favorable alternate combination of primers in a primer pair depends upon the properties of the primer pair, most notably the size of the bioagent identifying amplicon that would be produced by the primer pair, which preferably is between about 27 and about 200 nucleobases in length. Alternatively, a bioagent identifying amplicon longer than 200 nucleobases in length could be cleaved into smaller segments by cleavage reagents such as chemical reagents, or restriction enzymes, for example.
[150] In some embodiments, the primers are configured to amplify nucleic acid of a bioagent to produce amplification products that can be measured by mass spectrometry and from whose molecular masses candidate base compositions can be readily calculated.
[151] In some embodiments, any given primer comprises a modification comprising the addition of a non-templated T residue to the 5' end of the primer (i.e., the added T residue does not necessarily hybridize to the nucleic acid being amplified). The addition of a non-templated T residue has an effect of minimizing the addition of non-templated adenosine residues as a result of the non-specific enzyme activity of Taq polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.
[152] In some embodiments, primers may contain one or more universal bases. Because any variation in the conserved regions among species (due to codon wobble) is likely to occur in the third position of a DNA (or RNA) triplet, oligonucleotide primers can be designed such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a "universal nucleobase." For example, under this "wobble" pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C, and uridine (U) binds to U or C. Other examples of universal nucleobases include nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).
[153] In some embodiments, to compensate for the somewhat weaker binding by the wobble base, the oligonucleotide primers are designed such that the first and second positions of each triplet are occupied by nucleotide analogs that bind with greater affinity than the unmodified nucleotide. Examples of these analogs include, but are not limited to, 2,6-diaminopurine which binds to thymine, 5-propynyluracil (also known as propynylated thymine) which binds to adenine and 5-propynylcytosine and phenoxazines, including G-clamp, which binds to G. Propynylated pyrimidines are described in U.S. Patent Nos. 5,645,985, 5,830,653 and 5,484,908, each of which is commonly owned and incorporated herein by reference in its entirety. Propynylated primers are described in U.S. Pre-Grant Publication No. 2003-0170682, which is also commonly owned and incorporated herein by reference in its entirety. Phenoxazines are described in U.S. Patent Nos. 5,502,177, 5,763,588, and 6,005,096, each of which is incorporated herein by reference in its entirety. G-clamps are described in U.S. Patent Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.
[154] In some embodiments, primer hybridization is enhanced using primers containing 5-propynyl deoxycytidine and deoxythymidine nucleotides. These modified primers offer increased affinity and base pairing selectivity.
[155] In some embodiments, non-template primer tags are used to increase the melting temperature (Tm) of a primer-template duplex in order to improve amplification efficiency. A non-template tag is at least three consecutive A or T nucleotide residues on a primer which are not complementary to the template. In any given non-template tag, A can be replaced by C or G and T can also be replaced by C or G. Although Watson-Crick hybridization is not expected to occur for a non-template tag relative to the template, the extra hydrogen bond in a G-C pair relative to an A-T pair confers increased stability of the primer-template duplex and improves amplification efficiency for subsequent cycles of amplification when the primers hybridize to strands synthesized in previous cycles.
[156] In other embodiments, propynylated tags may be used in a manner similar to that of the non-template tag, wherein two or more 5-propynylcytidine or 5-propynyluridine residues replace template matching residues on a primer. In other embodiments, a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.
[157] In some embodiments, the primers contain mass-modifying tags. Reducing the total number of possible base compositions of a nucleic acid of specific molecular weight provides a means of avoiding a persistent source of ambiguity in determination of base composition of amplification products. Addition of mass-modifying tags to certain nucleobases of a given primer will result in simplification of de novo determination of base composition of a given bioagent identifying amplicon from its molecular mass.
[158] In some embodiments, the mass modified nucleobase comprises one or more of the following: for example, 7-deaza-2'-deoxyadenosine-5'-triphosphate, 5-iodo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxycytidine-5'-triphosphate, 5-iodo-2'-deoxycytidine-5'-triphosphate, 5-hydroxy-2'-deoxyuridine-5'-triphosphate, 4-thiothymidine-5'-triphosphate, 5-aza-2'-deoxyuridine-5'-triphosphate, 5-fluoro-2'-deoxyuridine-5'-triphosphate, O6-methyl-2'-deoxyguanosine-5'-triphosphate, N2-methyl-2'-deoxyguanosine-5'-triphosphate, 8-oxo-2'-deoxyguanosine-5'-triphosphate or thiothymidine-5'-triphosphate. In some embodiments, the mass-modified nucleobase comprises 15N or 13C or both 15N and 13C.
[159] In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with a plurality of primer pairs. The advantages of multiplexing are that fewer reaction containers (for example, wells of a 96- or 384-well plate) are needed for each molecular mass measurement, providing time, resource and cost savings because additional bioagent identification data can be obtained within a single analysis. Multiplex amplification methods are well known to those with ordinary skill and can be developed without undue experimentation. However, in some embodiments, one useful and non-obvious step in selecting a plurality of candidate bioagent identifying amplicons for multiplex amplification is to ensure that each strand of each amplification product will be sufficiently different in molecular mass that mass spectral signals will not overlap and lead to ambiguous analysis results. In some embodiments, a 10 Da difference in mass of two strands of one or more amplification products is sufficient to avoid overlap of mass spectral peaks.
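By way of illustration only, the mass separation check described above can be sketched in Python as follows. The sketch flags every pair of strands whose expected masses differ by less than a chosen threshold, using the 10 Da figure given above as the default. The function name, the strand labels and the example masses are hypothetical.

```python
from itertools import combinations


def overlapping_strands(strand_masses, min_separation_da=10.0):
    """Return pairs of strand labels whose expected molecular masses
    differ by less than min_separation_da."""
    conflicts = []
    for (name_a, mass_a), (name_b, mass_b) in combinations(strand_masses.items(), 2):
        if abs(mass_a - mass_b) < min_separation_da:
            conflicts.append((name_a, name_b))
    return conflicts


# Hypothetical expected strand masses (Da) for two candidate amplicons.
candidate_panel = {
    "amplicon1_forward": 14208.0,
    "amplicon1_reverse": 14079.0,
    "amplicon2_forward": 14212.5,
    "amplicon2_reverse": 15000.0,
}
panel_conflicts = overlapping_strands(candidate_panel)
```

In this hypothetical panel the two forward strands differ by only 4.5 Da, so one of the two amplicons would be moved to a different reaction or replaced.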
[160] In some embodiments, as an alternative to multiplex amplification, single amplification reactions can be pooled before analysis by mass spectrometry. In these embodiments, as for multiplex amplification embodiments, it is useful to select a plurality of candidate bioagent identifying amplicons to ensure that each strand of each amplification product will be sufficiently different in molecular mass that mass spectral signals will not overlap and lead to ambiguous analysis results.
Determination of Molecular Mass of Bioagent Identifying Amplicons
[161] In some embodiments, the molecular mass of a given bioagent identifying amplicon is determined by mass spectrometry. Mass spectrometry has several advantages, not the least of which is high bandwidth characterized by the ability to separate (and isolate) many molecular peaks across a broad range of mass to charge ratio (m/z). Thus mass spectrometry is intrinsically a parallel detection scheme without the need for radioactive or fluorescent labels, since every amplification product is identified by its molecular mass. The current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample. An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.
[162] In some embodiments, intact molecular ions are generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase. These ionization methods include, but are not limited to, electrospray ionization (ES), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB). Upon ionization, several peaks are observed from one sample due to the formation of ions with different charges. Averaging the multiple readings of molecular mass obtained from a single mass spectrum affords an estimate of molecular mass of the bioagent identifying amplicon. Electrospray ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
[163] The mass detectors used in the methods described herein include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight (TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple quadrupole.
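By way of illustration only, the averaging of multiple charge-state readings from a single electrospray spectrum can be sketched in Python as follows. The sketch assumes negative-ion mode, in which an ion of charge z is taken to have lost z protons, and it assumes that the charge of each peak has already been assigned. Both assumptions, the function names and the example peaks are not taken from the present disclosure.

```python
PROTON_MASS_DA = 1.00728  # approximate mass of a proton in Da


def neutral_mass(mz, charge):
    """Neutral mass implied by one peak, assuming an [M - zH]z- ion."""
    return charge * (mz + PROTON_MASS_DA)


def average_neutral_mass(peaks):
    """Average the neutral masses implied by (m/z, charge) peaks."""
    masses = [neutral_mass(mz, charge) for mz, charge in peaks]
    return sum(masses) / len(masses)


# Hypothetical peaks generated from a 14208 Da strand at charges 8, 9 and 10.
true_mass = 14208.0
example_peaks = [((true_mass - z * PROTON_MASS_DA) / z, z) for z in (8, 9, 10)]
estimated_mass = average_neutral_mass(example_peaks)
```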
Base Compositions of Bioagent Identifying Amplicons
[164] Although the molecular mass of amplification products obtained using intelligent primers provides a means for identification of bioagents, conversion of molecular mass data to a base composition signature is useful for certain analyses. As used herein, "base composition" is the exact number of each nucleobase (A, T, C and G) determined from the molecular mass of a bioagent identifying amplicon. In some embodiments, a base composition provides an index of a specific organism. Base compositions can be calculated from known sequences of known bioagent identifying amplicons and can be experimentally determined by measuring the molecular mass of a given bioagent identifying amplicon, followed by determination of all possible base compositions which are consistent with the measured molecular mass within acceptable experimental error. The following example illustrates determination of base composition from an experimentally obtained molecular mass of a 46-mer amplification product originating at position 1337 of the 16S rRNA of Bacillus anthracis. The forward and reverse strands of the amplification product have measured molecular masses of 14208 and 14079 Da, respectively. The possible base compositions derived from the molecular masses of the forward and reverse strands for the B. anthracis products are listed in Table 1.
Table 1 Possible Base Compositions for B. anthracis 46mer Amplification Product
Figure imgf000043_0001
[165] Among the 16 possible base compositions for the forward strand and the 18 possible base compositions for the reverse strand that were calculated, only one pair (shown in bold) is a pair of complementary base compositions, which indicates the true base composition of the amplification product. It should be recognized that this logic is applicable for determination of base compositions of any bioagent identifying amplicon, regardless of the class of bioagent from which the corresponding amplification product was obtained.
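By way of illustration only, the enumeration of candidate base compositions and the selection of a complementary pair can be sketched in Python as follows. The residue masses are approximate average masses of the kind used by common oligonucleotide calculators, the end-group correction assumes 5'-hydroxyl and 3'-hydroxyl termini, and the mass tolerance is arbitrary. These values and the function names are assumptions made for this sketch, and the sketch is not intended to reproduce Table 1.

```python
# Approximate average residue masses (Da); assumed values for illustration.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
END_CORRECTION = -61.96  # assumed correction for 5'-OH and 3'-OH termini


def strand_mass(comp):
    """Approximate mass of a DNA strand with the given base composition."""
    return sum(RESIDUE_MASS[base] * n for base, n in comp.items()) + END_CORRECTION


def candidate_compositions(measured_mass, length, tol_da):
    """All compositions of the given length within tol_da of the measured mass."""
    hits = []
    for a in range(length + 1):
        for g in range(length - a + 1):
            for c in range(length - a - g + 1):
                comp = {"A": a, "G": g, "C": c, "T": length - a - g - c}
                if abs(strand_mass(comp) - measured_mass) <= tol_da:
                    hits.append(comp)
    return hits


def complement(comp):
    """Base composition of the complementary strand."""
    return {"A": comp["T"], "T": comp["A"], "G": comp["C"], "C": comp["G"]}


def complementary_pairs(forward_hits, reverse_hits):
    """Pairs of forward and reverse candidates that are complementary."""
    return [(f, r) for f in forward_hits for r in reverse_hits if complement(f) == r]


# Hypothetical 46-mer used only to exercise the functions.
true_forward = {"A": 14, "G": 9, "C": 14, "T": 9}
true_reverse = complement(true_forward)
forward_hits = candidate_compositions(strand_mass(true_forward), 46, tol_da=0.5)
reverse_hits = candidate_compositions(strand_mass(true_reverse), 46, tol_da=0.5)
pairs = complementary_pairs(forward_hits, reverse_hits)
```

Each strand alone may be consistent with several compositions; requiring that the forward and reverse candidates be complementary is what narrows the list.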
[166] In some embodiments, assignment of previously unobserved base compositions (also known as "true unknown base compositions") to a given phylogeny can be accomplished via the use of pattern classifier model algorithms. Base compositions, like sequences, vary slightly from strain to strain within species, for example. In some embodiments, the pattern classifier model is the mutational probability model. In other embodiments, the pattern classifier is the polytope model. The mutational probability model and polytope model are both commonly owned and described in U.S. Patent application Serial No. 11/073,362 which is incorporated herein by reference in its entirety.
[167] In one embodiment, it is possible to manage this diversity by building "base composition probability clouds" around the composition constraints for each species. This permits identification of organisms in a fashion similar to sequence analysis. A "pseudo four-dimensional plot" can be used to visualize the concept of base composition probability clouds. Optimal primer design requires optimal choice of bioagent identifying amplicons and maximizes the separation between the base composition signatures of individual bioagents. Areas where clouds overlap indicate regions that may result in a misclassification, a problem which is overcome by a triangulation identification process using bioagent identifying amplicons not affected by overlap of base composition probability clouds.
[168] In some embodiments, base composition probability clouds provide the means for screening potential primer pairs in order to avoid potential misclassifications of base compositions. In other embodiments, base composition probability clouds provide the means for predicting the identity of a bioagent whose assigned base composition was not previously observed and/or indexed in a bioagent identifying amplicon base composition database due to evolutionary transitions in its nucleic acid sequence. Thus, in contrast to probe-based techniques, mass spectrometry determination of base composition does not require prior knowledge of the composition or sequence in order to make the measurement.
[169] The methods disclosed herein provide bioagent classifying information similar to DNA sequencing and phylogenetic analysis at a level sufficient to identify a given bioagent. Furthermore, the process of determination of a previously unknown base composition for a given bioagent (for example, in a case where sequence information is unavailable) has downstream utility by providing additional bioagent indexing information with which to populate base composition databases. The process of future bioagent identification is thus greatly improved as more base composition signature (BCS) indexes become available in base composition databases.
Triangulation Identification
[170] In some cases, a molecular mass of a single bioagent identifying amplicon alone does not provide enough resolution to unambiguously identify a given bioagent. The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as "triangulation identification." Triangulation identification is pursued by determining the molecular masses of a plurality of bioagent identifying amplicons selected within a plurality of housekeeping genes. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.
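By way of illustration only, triangulation identification can be viewed as intersecting the sets of candidate bioagents that are consistent with each bioagent identifying amplicon. The following Python sketch assumes exact matching of base compositions against a small in-memory database; the function name, the primer pair labels, the organism labels and the base compositions (given as A, G, C, T counts) are hypothetical.

```python
def triangulate(observations, database):
    """Return the organisms whose database entry matches every observed
    base composition.

    observations: {primer_pair_label: (A, G, C, T)}
    database:     {organism_label: {primer_pair_label: (A, G, C, T)}}
    """
    candidates = set(database)
    for primer_pair, composition in observations.items():
        matching = {
            organism
            for organism, signature in database.items()
            if signature.get(primer_pair) == composition
        }
        candidates &= matching
    return candidates


# Hypothetical database of base compositions for two primer pairs.
example_database = {
    "organism_1": {"pair_a": (10, 12, 11, 13), "pair_b": (20, 15, 18, 17)},
    "organism_2": {"pair_a": (10, 12, 11, 13), "pair_b": (21, 15, 17, 17)},
    "organism_3": {"pair_a": (11, 12, 10, 13), "pair_b": (20, 15, 18, 17)},
}

one_amplicon = triangulate({"pair_a": (10, 12, 11, 13)}, example_database)
two_amplicons = triangulate(
    {"pair_a": (10, 12, 11, 13), "pair_b": (20, 15, 18, 17)}, example_database
)
```

With one amplicon two hypothetical organisms remain indistinguishable; adding the second amplicon leaves a single candidate.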
[171] In some embodiments, the triangulation identification process can be pursued by characterization of bioagent identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR), such as multiplex PCR where multiple primers are employed in the same amplification reaction mixture, or PCR in multi-well plate format wherein a different and unique pair of primers is used in multiple wells containing otherwise identical reaction mixtures. Such multiplex and multi-well PCR methods are well known to those with ordinary skill in the arts of rapid throughput amplification of nucleic acids. In other related embodiments, one PCR reaction per well or container may be carried out, followed by an amplicon pooling step wherein the amplification products of different wells are combined in a single well or container which is then subjected to molecular mass analysis. The combination of pooled amplicons can be chosen such that the expected ranges of molecular masses of individual amplicons are not overlapping and thus will not complicate identification of signals.
Codon Base Composition Analysis
[172] In some embodiments, one or more nucleotide substitutions within a codon of a gene of an infectious organism confer drug resistance upon an organism which can be determined by codon base composition analysis. The organism can be a bacterium, virus, fungus or protozoan.
[173] In some embodiments, the amplification product containing the codon being analyzed is of a length of about 35 to about 200 nucleobases. The primers employed in obtaining the amplification product can hybridize to upstream and downstream sequences directly adjacent to the codon, or can hybridize to upstream and downstream sequences one or more sequence positions away from the codon. The primers may have between about 70% to 100% sequence complementarity with the sequence of the gene containing the codon being analyzed.
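By way of illustration only, the effect of a single nucleotide substitution within a codon on the base composition of an amplification product can be sketched in Python as follows. The function names and the two short sequences are hypothetical and do not correspond to any gene or primer pair disclosed herein.

```python
from collections import Counter


def base_composition(sequence):
    """Count of each nucleobase (A, G, C, T) in a sequence."""
    counts = Counter(sequence.upper())
    return {base: counts.get(base, 0) for base in "AGCT"}


def composition_shift(wild_type, mutant):
    """Change in the count of each nucleobase, mutant minus wild type,
    listing only the nucleobases whose count changed."""
    wt = base_composition(wild_type)
    mt = base_composition(mutant)
    return {base: mt[base] - wt[base] for base in "AGCT" if mt[base] != wt[base]}


# Hypothetical amplicon segment in which the codon TCG is changed to TTG.
wild_type_segment = "ATGGCCTCGAAG"
mutant_segment = "ATGGCCTTGAAG"
shift = composition_shift(wild_type_segment, mutant_segment)
```

A substitution of this kind changes the count of two nucleobases by one each. A change that merely reorders the same nucleobases within the amplified region would leave the base composition unchanged, which is one reason the placement of the primers around the codon matters.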
[174] In some embodiments, the codon base composition analysis is undertaken
[175] In some embodiments, the codon analysis is undertaken for the purpose of investigating genetic disease in an individual. In other embodiments, the codon analysis is undertaken for the purpose of investigating a drug resistance mutation or any other deleterious mutation in an infectious organism such as a bacterium, virus, fungus or protozoan. In some embodiments, the bioagent is a bacterium identified in a biological product.
[176] In some embodiments, the molecular mass of an amplification product containing the codon being analyzed is measured by mass spectrometry. The mass spectrometry can be either electrospray (ESI) mass spectrometry or matrix-assisted laser desorption ionization (MALDI) mass spectrometry. Time-of-flight (TOF) is an example of one mode of mass spectrometry compatible with the methods disclosed herein.
[177] The methods disclosed herein can also be employed to determine the relative abundance of drug resistant strains of the organism being analyzed. Relative abundances can be calculated from amplitudes of mass spectral signals with relation to internal calibrants. In some embodiments, known quantities of internal amplification calibrants can be included in the amplification reactions and abundances of analyte amplification product estimated in relation to the known quantities of the calibrants.
[178] In some embodiments, upon identification of one or more drug-resistant strains of an infectious organism infecting an individual, one or more alternative treatments can be devised to treat the individual.
Determination of the Quantity of a Bioagent
[179] In some embodiments, the identity and quantity of an unknown bioagent can be determined using the process illustrated in Figure 2. Primers (500) and a known quantity of a calibration polynucleotide (505) are added to a sample containing nucleic acid of an unknown bioagent. The total nucleic acid in the sample is then subjected to an amplification reaction (510) to obtain amplification products. The molecular masses of amplification products are determined (515) from which are obtained molecular mass and abundance data. The molecular mass of the bioagent identifying amplicon (520) provides the means for its identification (525) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (530) provides the means for its identification (535). The abundance data of the bioagent identifying amplicon is recorded (540) and the abundance data for the calibration amplicon is recorded (545), both of which are used in a calculation (550) which determines the quantity of unknown bioagent in the sample.
[180] A sample comprising an unknown bioagent is contacted with a pair of primers that provide the means for amplification of nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence. The nucleic acids of the bioagent and of the calibration sequence are amplified and the rate of amplification is reasonably assumed to be similar for the nucleic acid of the bioagent and of the calibration sequence. The amplification reaction then produces two amplification products: a bioagent identifying amplicon and a calibration amplicon. The bioagent identifying amplicon and the calibration amplicon should be distinguishable by molecular mass while being amplified at essentially the same rate. Effecting differential molecular masses can be accomplished by choosing as a calibration sequence, a representative bioagent identifying amplicon (from a specific species of bioagent) and performing, for example, a 2-8 nucleobase deletion or insertion within the variable region between the two priming sites. The amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by mass spectrometry, for example. The resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence. The molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent and the abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample.
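By way of illustration only, the quantity calculation described above can be sketched in Python as follows. The sketch assumes, as stated above, that the bioagent nucleic acid and the calibration sequence amplify at essentially the same rate, so that the ratio of the two mass spectral abundances approximates the ratio of the starting quantities. The function name, the units and the example values are hypothetical.

```python
def estimate_bioagent_quantity(calibrant_copies, bioagent_abundance, calibrant_abundance):
    """Estimate the starting quantity of bioagent nucleic acid from the
    known calibrant quantity and the two measured abundances, assuming
    equal amplification efficiency for both templates."""
    if calibrant_abundance <= 0:
        # No measurable calibration amplicon indicates a failed reaction
        # or analysis step rather than a quantity of zero.
        raise ValueError("calibration amplicon was not detected")
    return calibrant_copies * bioagent_abundance / calibrant_abundance


# Hypothetical run: 100 calibrant copies added, and the bioagent signal
# is twice the calibrant signal.
estimated_copies = estimate_bioagent_quantity(100, 3.0, 1.5)
```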
[181] In some embodiments, construction of a standard curve where the amount of calibration polynucleotide spiked into the sample is varied provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample. The use of standard curves for analytical determination of molecular quantities is well known to one with ordinary skill and can be performed without undue experimentation.
[182] In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple primer pairs which also amplify the corresponding standard calibration sequences. In this or other embodiments, the standard calibration sequences are optionally included within a single vector which functions as the calibration polynucleotide. Multiplex amplification methods are well known to those with ordinary skill and can be performed without undue experimentation.
[183] In some embodiments, the calibrant polynucleotide is used as an internal positive control to confirm that amplification conditions and subsequent analysis steps are successful in producing a measurable amplicon. Even in the absence of copies of the genome of a bioagent, the calibration polynucleotide should give rise to a calibration amplicon. Failure to produce a measurable calibration amplicon indicates a failure of amplification or of a subsequent analysis step such as amplicon purification or molecular mass determination. Reaching a conclusion that such failures have occurred is, in itself, a useful event.
[184] In some embodiments, the calibration sequence is comprised of DNA. In some embodiments, the calibration sequence is comprised of RNA. In some embodiments, the calibration sequence is SEQ ID NO. 1561 (Figure 13A). In other embodiments, the calibration sequence is SEQ ID NO. 1562 (Figure 13B). In further embodiments, the calibration sequence is SEQ ID NO. 1563 (Figure 13C). In additional embodiments, the calibration sequence is SEQ ID NO. 1564 (Figure 13D).
[185] In some embodiments, the calibration sequence is inserted into a vector that itself functions as the calibration polynucleotide. In some embodiments, more than one calibration sequence is inserted into the vector that functions as the calibration polynucleotide. Such a calibration polynucleotide is herein termed a "combination calibration polynucleotide." The process of inserting polynucleotides into vectors is routine to those skilled in the art and can be accomplished without undue experimentation. Thus, it should be recognized that the calibration method should not be limited to the embodiments described herein. The calibration method can be applied for determination of the quantity of any bioagent identifying amplicon when an appropriate standard calibrant polynucleotide sequence is designed and used. The process of choosing an appropriate vector for insertion of a calibrant is also a routine operation that can be accomplished by one with ordinary skill without undue experimentation.
Identification of Bacteria
[186] In other embodiments, the primer pairs produce bioagent identifying amplicons within stable and highly conserved regions of bacteria. The advantage to characterization of an amplicon defined by priming regions that fall within a highly conserved region is that there is a low probability that the region will evolve past the point of primer recognition, in which case the primer hybridization of the amplification step would fail. Such a primer set is thus useful as a broad range survey-type primer. In another embodiment, the intelligent primers produce bioagent identifying amplicons including a region which evolves more quickly than the stable region described above. The advantage of characterizing a bioagent identifying amplicon corresponding to an evolving genomic region is that it is useful for distinguishing emerging strain variants or the presence of virulence genes, drug resistance genes, or codon mutations that induce drug resistance.
[187] The methods disclosed herein have significant advantages as a platform for identification of diseases caused by emerging bacterial strains such as, for example, drug-resistant strains of Staphylococcus aureus. The methods disclosed herein eliminate the need for prior knowledge of bioagent sequence to generate hybridization probes. This is possible because the methods are not confounded by naturally occurring evolutionary variations occurring in the sequence acting as the template for production of the bioagent identifying amplicon. Measurement of molecular mass and determination of base composition is accomplished in an unbiased manner without sequence prejudice.
[188] Another embodiment also provides a means of tracking the spread of a bacterium, such as a particular drug-resistant strain, when a plurality of samples obtained from different locations are analyzed by the methods described above in an epidemiological setting. In one embodiment, a plurality of samples from a plurality of different locations is analyzed with primer pairs which produce bioagent identifying amplicons, a subset of which contains a specific drug-resistant bacterial strain. The corresponding locations of the members of the drug-resistant strain subset indicate the spread of the specific drug-resistant strain to the corresponding locations.
Kits
[189] Also provided are kits for carrying out the methods described herein. In some embodiments, the kit may comprise a sufficient quantity of one or more primer pairs to perform an amplification reaction on a target polynucleotide from a bioagent to form a bioagent identifying amplicon. In some embodiments, the kit may comprise from one to fifty primer pairs, from one to twenty primer pairs, from one to ten primer pairs, or from two to five primer pairs. In some embodiments, the kit may comprise one or more primer pairs recited in Table 2.
[190] In some embodiments, the kit comprises one or more broad range survey primer(s), division-wide primer(s), or drill-down primer(s), or any combination thereof. If a given problem involves identification of a specific bioagent, its solution may require the selection of a particular combination of primers. A kit may be designed so as to comprise particular primer pairs for identification of a particular bioagent. A drill-down kit may be used, for example, to distinguish different genotypes or strains, drug-resistant or otherwise. In some embodiments, the primer pair components of any of these kits may be additionally combined to comprise additional combinations of broad range survey primers and division-wide primers so as to be able to identify a bacterium.
[191] In some embodiments, the kit contains standardized calibration polynucleotides for use as internal amplification calibrants. Internal calibrants are described in commonly owned PCT Publication Number WO 2005/098047 which is incorporated herein by reference in its entirety.
[192] In some embodiments, the kit comprises a sufficient quantity of reverse transcriptase (if RNA is to be analyzed for example), a DNA polymerase, suitable nucleoside triphosphates (including alternative dNTPs such as inosine or modified dNTPs such as the 5-propynyl pyrimidines or any dNTP containing molecular mass-modifying tags such as those described above), a DNA ligase, and/or reaction buffer, or any combination thereof, for the amplification processes described above. A kit may further include instructions pertinent for the particular embodiment of the kit, such instructions describing the primer pairs and amplification conditions for operation of the method. A kit may also comprise amplification reaction containers such as microcentrifuge tubes and the like. A kit may also comprise reagents or other materials for isolating bioagent nucleic acid or bioagent identifying amplicons from amplification, including, for example, detergents, solvents, or ion exchange resins which may be linked to magnetic beads. A kit may also comprise a table of measured or calculated molecular masses and/or base compositions of bioagents using the primer pairs of the kit.
[193] Some embodiments are kits that contain one or more survey bacterial primer pairs represented by primer pair compositions wherein each member of each pair of primers has 70% to 100% sequence identity with the corresponding member from the group of primer pairs represented by any of the primer pairs of Table 5. The survey primer pairs may include broad range primer pairs which hybridize to ribosomal RNA, and may also include division-wide primer pairs which hybridize to housekeeping genes such as rplB, tufB, rpoB, rpoC, valS, and infB, for example.
[194] In some embodiments, a kit may contain one or more survey bacterial primer pairs and one or more triangulation genotyping analysis primer pairs such as the primer pairs of Tables 8, 12, 14, 19, 21, 23, or 24. In some embodiments, the kit may represent a less expansive genotyping analysis but include triangulation genotyping analysis primer pairs for more than one genus or species of bacteria. For example, a kit for surveying nosocomial infections at a health care facility may include, for example, one or more broad range survey primer pairs, one or more division wide primer pairs, one or more Acinetobacter baumannii triangulation genotyping analysis primer pairs and one or more Staphylococcus aureus triangulation genotyping analysis primer pairs. One with ordinary skill will be capable of analyzing in silico amplification data to determine which primer pairs will be able to provide optimal identification resolution for the bacterial bioagents of interest.
[195] In some embodiments, a kit may be assembled for identification of strains of bacteria involved in contamination of food.
[196] In some embodiments, a kit may be assembled for identification of sepsis-causing bacteria. An example of such a kit embodiment is a kit comprising one or more of the primer pairs of Table 25 which provide for a broad survey of sepsis-causing bacteria.
[197] Some embodiments of the kits are 96-well or 384-well plates with a plurality of wells containing any or all of the following components: dNTPs, buffer salts, Mg2+, betaine, and primer pairs. In some embodiments, a polymerase is also included in the plurality of wells of the 96-well or 384-well plates.
[198] Some embodiments of the kit contain instructions for PCR and mass spectrometry analysis of amplification products obtained using the primer pairs of the kits.
[199] Some embodiments of the kit include a barcode which uniquely identifies the kit and the components contained therein according to production lots and may also include any other information relative to the components such as concentrations, storage temperatures, etc. The barcode may also include analysis information to be read by optical barcode readers and sent to a computer controlling amplification, purification and mass spectrometric measurements. In some embodiments, the barcode provides access to a subset of base compositions in a base composition database which is in digital communication with base composition analysis software such that a base composition measured with primer pairs from a given kit can be compared with known base compositions of bioagent identifying amplicons defined by the primer pairs of that kit.
[200] In some embodiments, the kit contains a database of base compositions of bioagent identifying amplicons defined by the primer pairs of the kit. The database is stored on a convenient computer readable medium such as a compact disk or USB drive, for example.
[201] In some embodiments, the kit includes a computer program stored on a computer formatted medium (such as a compact disk or portable USB disk drive, for example) comprising instructions which direct a processor to analyze data obtained from the use of the primer pairs disclosed herein. The instructions of the software transform data related to amplification products into a molecular mass or base composition which is a useful concrete and tangible result used in identification and/or classification of bioagents. In some embodiments, the kits contain all of the reagents sufficient to carry out one or more of the methods described herein.
[202] While the present invention has been described with specificity in accordance with certain of its embodiments, the following examples serve only to illustrate the invention and are not intended to limit the same. In order that the invention disclosed herein may be more efficiently understood, examples are provided below. It should be understood that these examples are for illustrative purposes only and are not to be construed as limiting the invention in any manner.
EXAMPLES
Example 1: Design and Validation of Primers that Define Bioagent Identifying Amplicons for Identification of Bacteria
[203] For design of primers that define bacterial bioagent identifying amplicons, bacterial genome segment sequences are obtained, aligned and scanned for regions where pairs of PCR primers would amplify products of about 27 to about 200 nucleotides in length and distinguish subgroups and/or individual strains from each other by their molecular masses or base compositions. A typical process shown in Figure 1 is employed for this type of analysis.
[204] A database of expected base compositions for each primer region is generated using an in silico PCR search algorithm, such as ePCR. An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs.
[205] Table 2 represents a collection of primers (sorted by primer pair number) designed to identify bacteria using the methods described herein. The primer pair number is an in-house database index number. Primer sites were identified on segments of genes, such as, for example, the 16S rRNA gene. The forward or reverse primer name shown in Table 2 indicates the gene region of the bacterial genome to which the primer hybridizes relative to a reference sequence. In Table 2, for example, the forward primer name 16S_EC_1077_1106_F indicates that the forward primer (_F) hybridizes to residues 1077-1106 of the reference sequence represented by a sequence extraction of coordinates 4033120..4034661 from GenBank gi number 16127994 (as indicated in Table 3). As an additional example, the forward primer name BONTA_X52066_450_473 indicates that the primer hybridizes to residues 450-473 of the gene encoding Clostridium botulinum neurotoxin type A (BoNT/A) represented by GenBank Accession No. X52066 (primer pair name codes appearing in Table 2 are defined in Table 3). One with ordinary skill will know how to obtain individual gene sequences or portions thereof from genomic sequences present in GenBank. In Table 2, Tp = 5-propynyluracil; Cp = 5-propynylcytosine; * = phosphorothioate linkage; I = inosine. GenBank Accession Numbers for reference sequences of bacteria are shown in Table 3 (below). In some cases, the reference sequences are extractions from bacterial genomic sequences or complements thereof.
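The naming convention described above can be illustrated with a short Python sketch. It assumes that a primer name consists of a name code, two integer residue coordinates, and an optional _F or _R suffix; the function and class names are hypothetical and are not part of the disclosure.

```python
import re
from typing import NamedTuple, Optional


class PrimerName(NamedTuple):
    target_code: str          # primer name code, e.g. "16S_EC"
    start: int                # first residue on the reference sequence
    end: int                  # last residue on the reference sequence
    direction: Optional[str]  # "F", "R", or None when no suffix is present


# Assumed layout: <code>_<start>_<end>[_F|_R]
_PATTERN = re.compile(
    r"^(?P<code>.+)_(?P<start>\d+)_(?P<end>\d+)(?:_(?P<dir>[FR]))?$"
)


def parse_primer_name(name: str) -> PrimerName:
    """Split a primer name into name code, coordinates, and direction."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return PrimerName(
        match["code"], int(match["start"]), int(match["end"]), match["dir"]
    )


def footprint_length(primer: PrimerName) -> int:
    """Number of reference residues covered, counting both end points."""
    return primer.end - primer.start + 1
```

Under these assumptions, parse_primer_name("16S_EC_1077_1106_F") yields the name code 16S_EC with coordinates 1077 and 1106, a footprint of 30 residues.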
Table 2: Primer Pairs for Identification of Bacteria
[Table 2 is reproduced as page images in the original publication (imgf000054_0001 through imgf000092_0001).]
[206] Primer pair name codes and reference sequences are shown in Table 3. The primer name code typically represents the gene to which the given primer pair is targeted. The primer pair name may include specific coordinates with respect to a reference sequence defined by an extraction of a section of sequence or defined by a GenBank gi number, or the corresponding complementary sequence of the extraction, or the entire GenBank gi number as indicated by the label "no extraction." Where "no extraction" is indicated for a reference sequence, the coordinates of a primer pair named to the reference sequence are with respect to the GenBank gi listing. Gene abbreviations are shown in bold type in the "Gene Name" column.
[207] To determine the exact primer hybridization coordinates of a given pair of primers on a given bioagent nucleic acid sequence and to determine the sequences, molecular masses and base compositions of an amplification product to be obtained upon amplification of nucleic acid of a known bioagent with known sequence information in the region of interest with a given pair of primers, one with ordinary skill in bioinformatics is capable of obtaining alignments of the primers disclosed herein with the GenBank gi number of the relevant nucleic acid sequence of the known bioagent. For example, the reference sequence GenBank gi numbers (Table 3) provide the identities of the sequences which can be obtained from GenBank. Alignments can be done using a bioinformatics tool such as BLASTn provided to the public by NCBI (Bethesda, MD). Alternatively, a relevant GenBank sequence may be downloaded and imported into custom programmed or commercially available bioinformatics programs wherein the alignment can be carried out to determine the primer hybridization coordinates and the sequences, molecular masses and base compositions of the amplification product. For example, to obtain the hybridization coordinates of primer pair number 2095 (SEQ ID NOs: 456:1261), the forward primer (SEQ ID NO: 456) is first subjected to a BLASTn search on the publicly available NCBI BLAST website. "RefSeq_Genomic" is chosen as the BLAST database since the gi numbers refer to genomic sequences. The BLAST query is then performed. Among the top results returned is a match to GenBank gi number 21281729 (Accession Number NC_003923). The result shown below indicates that the forward primer hybridizes to positions 1530282..1530307 of the genomic sequence of Staphylococcus aureus subsp. aureus MW2 (represented by gi number 21281729).
Staphylococcus aureus subsp. aureus MW2, complete genome
Length=2820462
Features in this part of subject sequence:
Panton-Valentine leukocidin chain F precursor
Score = 52.0 bits (26), Expect = 2e-05
Identities = 26/26 (100%), Gaps = 0/26 (0%)
Strand=Plus/Plus
Figure imgf000094_0001
[208] The hybridization coordinates of the reverse primer (SEQ ID NO: 1261) can be determined in a similar manner and thus, the bioagent identifying amplicon can be defined in terms of genomic coordinates. The query/subject arrangement of the result would be presented in Strand = Plus/Minus format because the reverse strand hybridizes to the reverse complement of the genomic sequence. The preceding sequence analyses are well known to one with ordinary skill in bioinformatics and thus, Table 3 contains sufficient information to determine the primer hybridization coordinates of any of the primers of Table 2 to the applicable reference sequences described therein.
Table 3: Primer Name Codes and Reference Sequences
[Table 3 is reproduced as page images in the original publication (imgf000094_0002 through imgf000102_0001).]
[209] Note: artificial reference sequences represent concatenations of partial gene extractions from the indicated reference gi number. Partial sequences were used to create the concatenated sequence because complete gene sequences were not necessary for primer design.
Example 2: Sample Preparation and PCR
[210] Genomic DNA is prepared from samples using the DNeasy Tissue Kit (Qiagen, Valencia, CA) according to the manufacturer's protocols.
[211] All PCR reactions are assembled in 50 μL reaction volumes in a 96-well microtiter plate format using a Packard MPII liquid handling robotic platform and MJ Dyad thermocyclers (MJ Research, Waltham, MA) or Eppendorf Mastercycler thermocyclers (Eppendorf, Westbury, NY). The PCR reaction mixture generally consists of 4 units of Amplitaq Gold, 1x buffer II (Applied Biosystems, Foster City, CA), 1.5 mM MgCl2, 0.4 M betaine, 800 μM dNTP mixture and 250 nM of each primer. The following typical PCR conditions are generally used: 95°C for 10 min followed by 8 cycles of 95°C for 30 seconds, 48°C for 30 seconds, and 72°C for 30 seconds with the 48°C annealing temperature increasing 0.9°C with each of the eight cycles. The reaction is then continued for 37 additional cycles of 95°C for 15 seconds, 56°C for 20 seconds, and 72°C for 20 seconds.
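The cycling conditions above can be written out as an explicit schedule. The following Python sketch is illustrative only: it assumes that the first of the eight ramped cycles anneals at 48.0°C and that each later cycle is 0.9°C warmer, and it ignores temperature ramp times. The function and variable names are hypothetical.

```python
from typing import List, Tuple

# (label, temperature in degrees C, hold time in seconds)
Step = Tuple[str, float, int]


def build_schedule() -> List[Step]:
    """Expand the typical PCR conditions into a flat list of hold steps."""
    steps: List[Step] = [("initial denaturation", 95.0, 600)]
    # Eight cycles with a rising annealing temperature.
    # Assumption: cycle 1 anneals at 48.0 C, cycle 8 at 48.0 + 7 * 0.9 C.
    for cycle in range(8):
        anneal = round(48.0 + 0.9 * cycle, 1)
        steps += [("denature", 95.0, 30),
                  ("anneal", anneal, 30),
                  ("extend", 72.0, 30)]
    # Thirty-seven further cycles at a fixed annealing temperature.
    for _ in range(37):
        steps += [("denature", 95.0, 15),
                  ("anneal", 56.0, 20),
                  ("extend", 72.0, 20)]
    return steps


schedule = build_schedule()
ramp_anneals = [temp for label, temp, _ in schedule if label == "anneal"][:8]
total_hold_seconds = sum(seconds for _, _, seconds in schedule)
```

Under these assumptions the ramped annealing temperatures run from 48.0°C to 54.3°C, and the programmed hold time totals 3355 seconds across 45 cycles plus the initial denaturation.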
Example 3: Purification of PCR Products for Mass Spectrometry with Ion Exchange Resin-Magnetic Beads
[212] For solution capture of amplification products with ion exchange resin linked to magnetic beads, 25 μl of a 2.5 mg/mL suspension of BioClone amine terminated superparamagnetic beads are added to 25 to 50 μl of a PCR (or RT-PCR) reaction containing approximately 10 pM of a typical PCR amplification product. The suspension is mixed for approximately 5 minutes by vortexing or pipetting, after which the liquid is removed using a magnetic separator. The beads containing bound amplification product are then washed three times with 50 mM ammonium bicarbonate/50% MeOH or 100 mM ammonium bicarbonate/50% MeOH, followed by three more washes with 50% MeOH. The bound amplification product is eluted with a solution of 25 mM piperidine, 25 mM imidazole, 35% MeOH which includes peptide calibration standards.
Example 4: Mass Spectrometry and Base Composition Analysis
[213] The ESI-FTICR mass spectrometer is based on a Bruker Daltonics (Billerica, MA) Apex II 70e electrospray ionization Fourier transform ion cyclotron resonance mass spectrometer that employs an actively shielded 7 Tesla superconducting magnet. The active shielding constrains the majority of the fringing magnetic field from the superconducting magnet to a relatively small volume. Thus, components that might be adversely affected by stray magnetic fields, such as CRT monitors, robotic components, and other electronics, can operate in close proximity to the FTICR spectrometer. All aspects of pulse sequence control and data acquisition are performed on a 600 MHz Pentium II data station running Bruker's Xmass software under the Windows NT 4.0 operating system. Sample aliquots, typically 15 μl, are extracted directly from 96-well microtiter plates using a CTC HTS PAL autosampler (LEAP Technologies, Carrboro, NC) triggered by the FTICR data station. Samples are injected directly into a 10 μl sample loop integrated with a fluidics handling system that supplies the 100 μl/hr flow rate to the ESI source. Ions are formed via electrospray ionization in a modified Analytica (Branford, CT) source employing an off-axis, grounded electrospray probe positioned approximately 1.5 cm from the metallized terminus of a glass desolvation capillary. The atmospheric pressure end of the glass capillary is biased at 6000 V relative to the ESI needle during data acquisition. A counter-current flow of dry N2 is employed to assist in the desolvation process. Ions are accumulated in an external ion reservoir comprised of an rf-only hexapole, a skimmer cone, and an auxiliary gate electrode, prior to injection into the trapped ion cell where they are mass analyzed. Ionization duty cycles greater than 99% are achieved by simultaneously accumulating ions in the external ion reservoir during ion detection. Each detection event consists of 1M data points digitized over 2.3 s. To improve the signal-to-noise ratio (S/N), 32 scans are co-added for a total data acquisition time of 74 s.
[214] The ESI-TOF mass spectrometer is based on a Bruker Daltonics MicroTOF™. Ions from the ESI source undergo orthogonal ion extraction and are focused in a reflectron prior to detection. The TOF and FTICR are equipped with the same automated sample handling and fluidics described above. Ions are formed in the standard MicroTOF™ ESI source that is equipped with the same off-axis sprayer and glass capillary as the FTICR ESI source. Consequently, source conditions are the same as those described above. External ion accumulation is also employed to improve ionization duty cycle during data acquisition. Each detection event on the TOF is comprised of 75,000 data points digitized over 75 μs.
[215] The sample delivery scheme allows sample aliquots to be rapidly injected into the electrospray source at high flow rate and subsequently be electrosprayed at a much lower flow rate for improved ESI sensitivity. Prior to injecting a sample, a bolus of buffer is injected at a high flow rate to rinse the transfer line and spray needle to avoid sample contamination/carryover. Following the rinse step, the autosampler injects the next sample and the flow rate is switched to low flow. Following a brief equilibration delay, data acquisition commences. As spectra are co-added, the autosampler continues rinsing the syringe and picking up buffer to rinse the injector and sample transfer line. In general, two syringe rinses and one injector rinse are required to minimize sample carryover. During a routine screening protocol a new sample mixture is injected every 106 seconds. More recently a fast wash station for the syringe needle has been implemented which, when combined with shorter acquisition times, facilitates the acquisition of mass spectra at a rate of just under one spectrum/minute.
[216] Raw mass spectra are post-calibrated with an internal mass standard and deconvoluted to monoisotopic molecular masses. Unambiguous base compositions are derived from the exact mass measurements of the complementary single-stranded oligonucleotides. Quantitative results are obtained by comparing the peak heights with an internal PCR calibration standard present in every PCR well at 500 molecules per well. Calibration methods are commonly owned and disclosed in PCT Publication Number WO 2005/098047 which is incorporated herein by reference in entirety.
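The quantitative step described above can be illustrated by a minimal Python sketch. It assumes that peak height is proportional to the amount of amplicon and that the bioagent and calibrant templates amplify at essentially the same rate; the function and parameter names are hypothetical, and the default of 500 molecules is the per-well calibrant quantity stated above.

```python
def estimate_input_copies(amplicon_peak_height: float,
                          calibrant_peak_height: float,
                          calibrant_copies: float = 500.0) -> float:
    """Estimate starting bioagent copies from a peak-height ratio.

    Assumption: both templates amplify with the same efficiency, so the
    ratio of peak heights equals the ratio of starting copy numbers.
    """
    if calibrant_peak_height <= 0:
        # No measurable calibrant peak: the run cannot be quantified.
        raise ValueError("calibrant peak height must be positive")
    return calibrant_copies * amplicon_peak_height / calibrant_peak_height
```

For example, an amplicon peak twice as high as the calibrant peak would correspond, under these assumptions, to about 1000 starting copies in the well.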
Example 5: De Novo Determination of Base Composition of Amplification Products using Molecular Mass Modified Deoxynucleotide Triphosphates
[217] Because the molecular masses of the four natural nucleobases have a relatively narrow molecular mass range (A = 313.058, G = 329.052, C = 289.046, T = 304.046 - See Table 4), a persistent source of ambiguity in assignment of base composition can occur as follows: two nucleic acid strands having different base composition may have a difference of about 1 Da when the base composition difference between the two strands is G ↔ A (-15.994) combined with C ↔ T (+15.000). For example, one 99-mer nucleic acid strand having a base composition of A27G30C21T21 has a theoretical molecular mass of 30779.058 while another 99-mer nucleic acid strand having a base composition of A26G31C22T20 has a theoretical molecular mass of 30780.052. A 1 Da difference in molecular mass may be within the experimental error of a molecular mass measurement and thus, the relatively narrow molecular mass range of the four natural nucleobases imposes an uncertainty factor.
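The arithmetic behind this example can be checked with a few lines of Python. This is a sketch using only the four nucleobase masses quoted above; the function and variable names are illustrative.

```python
from typing import Dict

# Masses quoted in the text for the four natural nucleobases.
NATURAL_MASSES: Dict[str, float] = {
    "A": 313.058, "G": 329.052, "C": 289.046, "T": 304.046,
}


def composition_mass(counts: Dict[str, int],
                     masses: Dict[str, float] = NATURAL_MASSES) -> float:
    """Sum the per-base masses weighted by the base counts."""
    return sum(masses[base] * n for base, n in counts.items())


first = {"A": 27, "G": 30, "C": 21, "T": 21}    # A27G30C21T21
second = {"A": 26, "G": 31, "C": 22, "T": 20}   # A26G31C22T20

mass_first = composition_mass(first)     # about 30779.058
mass_second = composition_mass(second)   # about 30780.052
difference = mass_second - mass_first    # about 0.994, i.e. roughly 1 Da
```

The difference of about 0.994 is the sum of the A to G change (+15.994) and the T to C change (-15.000).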
[218] The methods provide a means for removing this theoretical 1 Da uncertainty factor through amplification of a nucleic acid with one mass-tagged nucleobase and three natural nucleobases. The term "nucleobase" as used herein is synonymous with other terms in use in the art including "nucleotide," "deoxynucleotide," "nucleotide residue," "deoxynucleotide residue," "nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate (dNTP)."
[219] Addition of significant mass to one of the 4 nucleobases (dNTPs) in an amplification reaction, or in the primers themselves, will result in a significant difference in mass of the resulting amplification product (significantly greater than 1 Da) for the otherwise ambiguous G ↔ A combined with C ↔ T event (Table 4). Thus, the same G ↔ A (-15.994) event combined with a 5-Iodo-C ↔ T (-110.900) event would result in a molecular mass difference of 126.894. If the molecular mass of the base composition A27G30 5-Iodo-C21 T21 (33422.958) is compared with that of A26G31 5-Iodo-C22 T20 (33549.852), the theoretical molecular mass difference is +126.894. The experimental error of a molecular mass measurement is not significant with regard to this molecular mass difference. Furthermore, the only base composition consistent with a measured molecular mass of the 99-mer nucleic acid is A27G30 5-Iodo-C21 T21. In contrast, the analogous amplification without the mass tag has 18 possible base compositions.
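The effect of the mass tag can be explored by enumerating every 99-mer base composition whose theoretical mass falls within a tolerance of a measured mass. The Python sketch below is illustrative only: the 1.0 Da tolerance is a hypothetical value chosen for the example, the 5-Iodo-C mass of 414.946 is derived here from the T mass plus the 110.900 difference quoted above, and the number of hits depends on the tolerance, so the sketch is not expected to reproduce the count of 18 stated above.

```python
from typing import List, Tuple

# Per-base masses in the order (A, G, C, T).
NATURAL = (313.058, 329.052, 289.046, 304.046)
# Same order with C replaced by 5-Iodo-C (304.046 + 110.900 = 414.946).
TAGGED = (313.058, 329.052, 414.946, 304.046)


def consistent_compositions(measured_mass: float, length: int,
                            masses: Tuple[float, float, float, float],
                            tolerance: float) -> List[Tuple[int, int, int, int]]:
    """List every (A, G, C, T) count of the given length within tolerance."""
    a_mass, g_mass, c_mass, t_mass = masses
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                t = length - a - g - c
                mass = a * a_mass + g * g_mass + c * c_mass + t * t_mass
                if abs(mass - measured_mass) <= tolerance:
                    hits.append((a, g, c, t))
    return hits


TOLERANCE_DA = 1.0  # hypothetical measurement tolerance, not from the text

natural_hits = consistent_compositions(30779.058, 99, NATURAL, TOLERANCE_DA)
tagged_hits = consistent_compositions(33422.958, 99, TAGGED, TOLERANCE_DA)
```

With the natural masses, both (27, 30, 21, 21) and (26, 31, 22, 20) fall within the tolerance. With the tagged masses, the second composition lies 126.894 Da away and is excluded.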
Table 4: Molecular Masses of Natural Nucleobases and the Mass-Modified Nucleobase 5-Iodo-C and Molecular Mass Differences Resulting from Transitions
Figure imgf000106_0001
[220] Mass spectra of bioagent-identifying amplicons were analyzed independently using a maximum-likelihood processor, such as is widely used in radar signal processing. This processor, referred to as GenX, first makes maximum likelihood estimates of the input to the mass spectrometer for each primer by running matched filters for each base composition aggregate on the input data. This includes the GenX response to a calibrant for each primer.
[221] The algorithm emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-alarm plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants. Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents. A genomic sequence database is used to define the mass base count matched filters. The database contains the sequences of known bacterial bioagents and includes threat organisms as well as benign background organisms. The latter is used to estimate and subtract the spectral signature produced by the background organisms. A maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this "cleaned up" data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise covariance for the cleaned up data.
[222] The amplitudes of all base compositions of bioagent-identifying amplicons for each primer are calibrated and a final maximum likelihood amplitude estimate per organism is made based upon the multiple single primer estimates. Models of all system noise are factored into this two-stage maximum likelihood calculation. The processor reports the number of molecules of each base composition contained in the spectra. The quantity of amplification product corresponding to the appropriate primer set is reported as well as the quantities of primers remaining upon completion of the amplification reaction.
[223] Base count blurring can be carried out as follows. "Electronic PCR" can be conducted on nucleotide sequences of the desired bioagents to obtain the different expected base counts that could be obtained for each primer pair. See for example, ncbi.nlm.nih.gov/sutils/e-pcr/; Schuler, Genome Res. 7:541-50, 1997. In one illustrative embodiment, one or more spreadsheets, such as Microsoft Excel workbooks, contain a plurality of worksheets. First in this example, there is a worksheet with a name similar to the workbook name; this worksheet contains the raw electronic PCR data. Second, there is a worksheet named "filtered bioagents base count" that contains bioagent name and base count; there is a separate record for each strain after removing sequences that are not identified with a genus and species and removing all sequences for bioagents with fewer than 10 strains. Third, there is a worksheet that contains the frequency of substitutions, insertions, or deletions for this primer pair. This data is generated by first creating a pivot table from the data in the "filtered bioagents base count" worksheet and then executing an Excel VBA macro. The macro creates a table of differences in base counts for bioagents of the same species, but different strains. One of ordinary skill in the art may understand additional pathways for obtaining similar table differences without undue experimentation.
[224] Application of an exemplary script involves the user defining a threshold that specifies the fraction of the strains that are represented by the reference set of base counts for each bioagent. The reference set of base counts for each bioagent may contain as many different base counts as are needed to meet or exceed the threshold. The set of reference base counts is defined by taking the most abundant strain's base type composition and adding it to the reference set; the next most abundant strain's base type composition is then added, and so on, until the threshold is met or exceeded. The current set of data was obtained using a threshold of 55%, which was determined empirically.
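The reference-set construction described above can be sketched as follows. This is a minimal illustration, not the script referred to in the text: the function name `build_reference_set`, the representation of a base count as an (A, G, C, T) tuple, and the example counts are all assumptions made here.

```python
from collections import Counter

def build_reference_set(strain_base_counts, threshold=0.55):
    """Return the most abundant base counts, in order of abundance,
    until they cover at least `threshold` of all strains."""
    tally = Counter(strain_base_counts)   # base count -> number of strains
    total = sum(tally.values())
    reference, covered = [], 0
    for base_count, n_strains in tally.most_common():
        if covered / total >= threshold:  # threshold met or exceeded
            break
        reference.append(base_count)
        covered += n_strains
    return reference

# Hypothetical data: ten strains of one bioagent with three distinct base counts.
strains = [(27, 32, 24, 18)] * 5 + [(26, 33, 24, 18)] * 3 + [(27, 31, 25, 18)] * 2
print(build_reference_set(strains))
```

With these hypothetical counts, the most abundant composition covers only 50% of strains, so a second composition is added to exceed the 55% threshold.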
[225] For each base count not included in the reference base count set for that bioagent, the script then proceeds to determine the manner in which the current base count differs from each of the base counts in the reference set. This difference may be represented as a combination of substitutions, Si=Xi, and insertions, Ii=Yi, or deletions, Di=Zi. If there is more than one reference base count, then the reported difference is chosen using rules that aim to minimize the number of changes and, in instances with the same number of changes, minimize the number of insertions or deletions. Therefore, the primary rule is to identify the difference with the minimum sum (Xi+Yi) or (Xi+Zi), e.g., one insertion rather than two substitutions. If there are two or more differences with the minimum sum, then the one that will be reported is the one that contains the most substitutions.
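The selection rule above can be illustrated with a short sketch. It assumes, for illustration only, that a base count is an (A, G, C, T) tuple and that the smallest set of changes between two compositions is found by pairing gained bases with lost bases as substitutions; the function names are hypothetical.

```python
def classify_difference(current, reference):
    """Return (substitutions, insertions, deletions) that turn the
    reference composition into the current one with the fewest changes."""
    gained = sum(max(c - r, 0) for c, r in zip(current, reference))
    lost = sum(max(r - c, 0) for c, r in zip(current, reference))
    substitutions = min(gained, lost)      # each pairs one gain with one loss
    return substitutions, gained - substitutions, lost - substitutions

def reported_difference(current, reference_set):
    """Primary rule: minimum total number of changes.
    Tie-break: the difference containing the most substitutions."""
    candidates = [classify_difference(current, ref) for ref in reference_set]
    return min(candidates, key=lambda d: (sum(d), -d[0]))

# One insertion relative to the first reference beats two substitutions
# relative to the second (hypothetical compositions).
print(reported_difference((28, 32, 24, 18),
                          [(27, 32, 24, 18), (26, 33, 25, 18)]))
```

Because insertions and deletions are computed from the net length change, at most one of the two is non-zero for any pair of compositions, which matches the (Xi+Yi) or (Xi+Zi) form of the sums in the text.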
[226] Differences between a base count and a reference composition are categorized as one, two, or more substitutions, one, two, or more insertions, one, two, or more deletions, and combinations of substitutions and insertions or deletions. The different classes of nucleobase changes and their probabilities of occurrence have been delineated in U.S. Patent Application Publication No. 2004209260 (U.S. Application Serial No. 10/418,514) which is incorporated herein by reference in its entirety.
Example 6: Use of Broad Range Survey and Division Wide Primer Pairs for Identification of Bacteria in an Epidemic Surveillance Investigation
[227] This investigation employed a set of 16 primer pairs which is herein designated the "surveillance primer set" and comprises broad range survey primer pairs, division wide primer pairs and a single Bacillus clade primer pair. The surveillance primer set is shown in Table 5 and consists of primer pairs originally listed in Table 2. This surveillance set comprises primers with T modifications (note TMOD designation in primer names) which constitute a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row. Primer pair 449 (non-T modified) has been modified twice. Its predecessors are primer pairs 70 and 357, displayed below in the same row. Primer pair 360 has also been modified twice and its predecessors are primer pairs 17 and 118.
Table 5: Bacterial Primer Pairs of the Surveillance Primer Set
Figure imgf000109_0001
Figure imgf000110_0001
[228] The 16 primer pairs of the surveillance set are used to produce bioagent identifying amplicons whose base compositions are sufficiently different amongst all known bacteria at the species level to identify, at a reasonable confidence level, any given bacterium at the species level. As shown in Tables 7A-E, common respiratory bacterial pathogens can be distinguished by the base compositions of bioagent identifying amplicons obtained using the 16 primer pairs of the surveillance set. In some cases, triangulation identification improves the confidence level for species assignment. For example, nucleic acid from Streptococcus pyogenes can be amplified by nine of the sixteen surveillance primer pairs and Streptococcus pneumoniae can be amplified by ten of the sixteen surveillance primer pairs. The base compositions of the bioagent identifying amplicons are identical for only one of the analogous bioagent identifying amplicons and differ in all of the remaining analogous bioagent identifying amplicons by up to four bases per bioagent identifying amplicon. The resolving power of the surveillance set was confirmed by determination of base compositions for 120 isolates of respiratory pathogens representing 70 different bacterial species, and the results indicated that natural variations (usually only one or two base substitutions per bioagent identifying amplicon) amongst multiple isolates of the same species did not prevent correct identification of major pathogenic organisms at the species level.
[229] Bacillus anthracis is a well known biological warfare agent which has emerged in domestic terrorism in recent years. Since it was envisioned to produce bioagent identifying amplicons for identification of Bacillus anthracis, additional drill-down analysis primers were designed to target genes present on virulence plasmids of Bacillus anthracis so that additional confidence could be reached in positive identification of this pathogenic organism. Three drill-down analysis primers were designed and are listed in Tables 2 and 6. In Table 6, the drill-down set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.
Table 6: Drill-Down Primer Pairs for Confirmation of Identification of Bacillus anthracis
Figure imgf000111_0001
[230] Phylogenetic coverage of bacterial space of the sixteen surveillance primers of Table 5 and the three Bacillus anthracis drill-down primers of Table 6 is shown in Figure 3 which lists common pathogenic bacteria. Figure 3 is not meant to be comprehensive in illustrating all species identified by the primers. Only pathogenic bacteria are listed as representative examples of the bacterial species that can be identified by primers and methods disclosed herein. Nucleic acid of groups of bacteria enclosed within the polygons of Figure 3 can be amplified to obtain bioagent identifying amplicons using the primer pair numbers listed in the upper right hand corner of each polygon. Primer coverage for polygons within polygons is additive. As an illustrative example, bioagent identifying amplicons can be obtained for Chlamydia trachomatis by amplification with, for example, primer pairs 346-349, 360 and 361, but not with any of the remaining primers of the surveillance primer set. On the other hand, bioagent identifying amplicons can be obtained from nucleic acid originating from Bacillus anthracis (located within 5 successive polygons) using, for example, any of the following primer pairs: 346-349, 360, 361 (base polygon), 356, 449 (second polygon), 352 (third polygon), 355 (fourth polygon), 350, 351 and 353 (fifth polygon). Multiple coverage of a given organism with multiple primers provides for increased confidence level in identification of the organism as a result of enabling broad triangulation identification.
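The additive coverage of nested polygons can be sketched as a simple accumulation over nesting levels. The primer pair numbers below are those given in the text for the chain of polygons enclosing Bacillus anthracis; the data structure and the function name `primer_coverage` are assumptions for illustration, and other organisms in Figure 3 may sit in different polygon chains.

```python
# Primer pairs attached to each successive polygon enclosing Bacillus anthracis,
# outermost first, as listed in the text.
POLYGON_PRIMERS = [
    [346, 347, 348, 349, 360, 361],  # base polygon
    [356, 449],                      # second polygon
    [352],                           # third polygon
    [355],                           # fourth polygon
    [350, 351, 353],                 # fifth polygon
]

def primer_coverage(depth):
    """Primer pairs expected to amplify an organism located `depth`
    polygons deep in this chain; coverage of enclosing polygons is additive."""
    pairs = []
    for level in POLYGON_PRIMERS[:depth]:
        pairs.extend(level)
    return pairs

print(primer_coverage(1))  # an organism in the base polygon only
print(primer_coverage(5))  # Bacillus anthracis, within five successive polygons
```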
[231] In Tables 7A-E, base compositions of respiratory pathogens for primer target regions are shown. Two entries in a cell represent variation in ribosomal DNA operons. The most predominant base composition is shown first and the minor (frequently a single operon) is indicated by an asterisk (*). Entries with NO DATA mean that the primer would not be expected to prime this species due to mismatches between the primer and target region, as determined by theoretical PCR.
Table 7A - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 346, 347 and 348
Figure imgf000112_0001
Figure imgf000113_0001
Figure imgf000114_0001
Figure imgf000115_0002
Table 7C - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 449, 354, and 352
Figure imgf000115_0001
Figure imgf000116_0001
Figure imgf000117_0001
Table 7D - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 355, 358, and 359
Figure imgf000117_0002
Figure imgf000118_0001
Table 7E - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 362, 363, and 367
Figure imgf000118_0002
Figure imgf000119_0001
Figure imgf000120_0001
[232] Four sets of throat samples from military recruits at different military facilities taken at different time points were analyzed using selected primers disclosed herein. The first set was collected at a military training center from November 1 to December 20, 2002 during one of the most severe outbreaks of pneumonia associated with group A Streptococcus in the United States since 1968. During this outbreak, fifty-one throat swabs were taken from both healthy and hospitalized recruits and plated on blood agar for selection of putative group A Streptococcus colonies. A second set of 15 original patient specimens was taken during the height of this group A Streptococcus-associated respiratory disease outbreak. The third set consisted of historical samples, including twenty-seven isolates of group A Streptococcus, from disease outbreaks at this and other military training facilities during previous years. The fourth set of samples was collected from five geographically separated military facilities in the continental U.S. in the winter immediately following the severe November/December 2002 outbreak.
[233] Pure colonies isolated from group A Streptococcus-selective media from all four collection periods were analyzed with the surveillance primer set. All samples showed base compositions that precisely matched the four completely sequenced strains of Streptococcus pyogenes. Shown in Figure 4 is a 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.
[234] In addition to the identification of Streptococcus pyogenes, other potentially pathogenic organisms were identified concurrently. Mass spectral analysis of a sample whose nucleic acid was amplified by primer pair number 349 (SEQ ID NOs: 401:1156) exhibited signals of bioagent identifying amplicons with molecular masses that were found to correspond to analogous base compositions of bioagent identifying amplicons of Streptococcus pyogenes (A27 G32 C24 T18), Neisseria meningitidis (A25 G27 C22 T18), and Haemophilus influenzae (A28 G28 C25 T20) (see Figure 5 and Table 7B). These organisms were present in a ratio of 4:5:20 as determined by comparison of peak heights with peak height of an internal PCR calibration standard as described in commonly owned PCT Publication Number WO 2005/098047 which is incorporated herein by reference in its entirety.
[235] Since certain division-wide primers that target housekeeping genes are designed to provide coverage of specific divisions of bacteria to increase the confidence level for identification of bacterial species, they are not expected to yield bioagent identifying amplicons for organisms outside of the specific divisions. For example, primer pair number 356 (SEQ ID NOs: 449:1380) primarily amplifies the nucleic acid of members of the classes Bacilli and Clostridia and is not expected to amplify proteobacteria such as Neisseria meningitidis and Haemophilus influenzae. As expected, analysis of the mass spectrum of amplification products obtained with primer pair number 356 does not indicate the presence of Neisseria meningitidis and Haemophilus influenzae but does indicate the presence of Streptococcus pyogenes (Figures 3 and 6, Table 7B). Thus, these primers or types of primers can confirm the absence of particular bioagents from a sample.
[236] The 15 throat swabs from military recruits were found to contain a relatively small set of microbes in high abundance. The most common were Haemophilus influenzae, Neisseria meningitidis, and Streptococcus pyogenes. Staphylococcus epidermidis, Moraxella catarrhalis, Corynebacterium pseudodiphtheriticum, and Staphylococcus aureus were present in fewer samples. An equal number of samples from healthy volunteers from three different geographic locations were identically analyzed. Results indicated that the healthy volunteers have bacterial flora dominated by multiple, commensal non-beta-hemolytic Streptococcal species, including the viridans group streptococci (S. parasanguinis, S. vestibularis, S. mitis, S. oralis and S. pneumoniae; data not shown), and none of the organisms found in the military recruits were found in the healthy controls at concentrations detectable by mass spectrometry. Thus, the military recruits in the midst of a respiratory disease outbreak had a dramatically different microbial population than that experienced by the general population in the absence of epidemic disease.
Example 7: Triangulation Genotyping Analysis for Determination of emm-Type of Streptococcus pyogenes in Epidemic Surveillance
[237] As a continuation of the epidemic surveillance investigation of Example 6, determination of sub-species characteristics (genotyping) of Streptococcus pyogenes was carried out based on a strategy that generates strain-specific signatures according to the rationale of Multi-Locus Sequence Typing (MLST). In classic MLST analysis, internal fragments of several housekeeping genes are amplified and sequenced (Enright et al. Infection and Immunity, 2001, 69, 2416-2427). In the present investigation, bioagent identifying amplicons from housekeeping genes were produced using drill-down primers and analyzed by mass spectrometry. Since mass spectral analysis results in molecular mass, from which base composition can be determined, the challenge was to determine whether resolution of emm classification of strains of Streptococcus pyogenes could be achieved.
[238] For the purpose of development of a triangulation genotyping assay, an alignment was constructed of concatenated alleles of seven MLST housekeeping genes (glucose kinase (gki), glutamine transporter protein (gtr), glutamate racemase (murI), DNA mismatch repair protein (mutS), xanthine phosphoribosyl transferase (xpt), and acetyl-CoA acetyl transferase (yqiL)) from each of the 212 previously emm-typed strains of Streptococcus pyogenes. From this alignment, the number and location of primer pairs that would maximize strain identification via base composition was determined. As a result, 6 primer pairs were chosen as standard drill-down primers for determination of emm-type of Streptococcus pyogenes. These six primer pairs are displayed in Table 8. This drill-down set comprises primers with T modifications (note TMOD designation in primer names) which constitute a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.
Figure imgf000122_0001
Figure imgf000123_0001
[239] The primers of Table 8 were used to produce bioagent identifying amplicons from nucleic acid present in the clinical samples. The bioagent identifying amplicons were subsequently analyzed by mass spectrometry, and base compositions corresponding to the molecular masses were calculated.
[240] Of the 51 samples taken during the peak of the November/December 2002 epidemic (Tables 9A-C, rows 1-3), all except three samples were found to represent emm3, a Group A Streptococcus genotype previously associated with high respiratory virulence. The three outliers were from samples obtained from healthy individuals and probably represent non-epidemic strains. Archived samples (Tables 9A-C, rows 5-13) from historical collections showed a greater heterogeneity of base compositions and emm types, as would be expected from different epidemics occurring at different places and dates. The results of the mass spectrometry analysis and emm gene sequencing were found to be concordant for the epidemic and historical samples.
Figure imgf000124_0001
Table 9B: Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus samples from Six Military Installations Obtained with Primer Pair Nos. 438 and 441
Figure imgf000125_0001
Figure imgf000126_0001
Table 9C: Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus samples from Six Military Installations Obtained with Primer Pair Nos. 438 and 441
Figure imgf000126_0002
Figure imgf000127_0001
Example 8: Design of Calibrant Polynucleotides based on Bioagent Identifying Amplicons for Identification of Species of Bacteria (Bacterial Bioagent Identifying Amplicons)
[241] This example describes the design of 19 calibrant polynucleotides based on bacterial bioagent identifying amplicons corresponding to the primers of the broad surveillance set (Table 5) and the Bacillus anthracis drill-down set (Table 6).
[242] Calibration sequences were designed to simulate bacterial bioagent identifying amplicons produced by the T modified primer pairs shown in Tables 5 and 6 (primer names have the designation "TMOD"). The calibration sequences were chosen as a representative member of the section of bacterial genome from specific bacterial species which would be amplified by a given primer pair. The model bacterial species upon which the calibration sequences are based are also shown in Table 10. For example, the calibration sequence chosen to correspond to an amplicon produced by primer pair no. 361 is SEQ ID NO: 1445. In Table 10, the forward (_F) or reverse (_R) primer name indicates the coordinates of an extraction representing a gene of a standard reference bacterial genome to which the primer hybridizes; e.g., the forward primer name 16S_EC_713_732_TMOD_F indicates that the forward primer hybridizes to residues 713-732 of the gene encoding 16S ribosomal RNA in an E. coli reference sequence (in this case, the reference sequence is an extraction consisting of residues 4033120-4034661 of the genomic sequence of E. coli K12 (GenBank gi number 16127994)). Additional gene coordinate reference information is shown in Table 11. The designation "TMOD" in the primer names indicates that the 5' end of the primer has been modified with a non-matched template T residue which prevents the PCR polymerase from adding non-templated adenosine residues to the 5' end of the amplification product, an occurrence which may result in miscalculation of base composition from molecular mass data (vide supra).
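The primer naming convention described above can be decoded mechanically. The sketch below handles names of the form shown in the example (gene, reference organism code, start, end, optional TMOD, direction); the regular expression and the function name `parse_primer_name` are assumptions for illustration, and primer names elsewhere in the tables may carry additional fields that this pattern does not cover.

```python
import re

# Assumed layout: GENE_REF_START_END[_TMOD]_F|R, e.g. 16S_EC_713_732_TMOD_F
PRIMER_NAME = re.compile(
    r"^(?P<gene>.+?)_(?P<ref>[A-Za-z0-9]+)_(?P<start>\d+)_(?P<end>\d+)"
    r"(?P<tmod>_TMOD)?_(?P<direction>[FR])$"
)

def parse_primer_name(name):
    """Split a primer name into gene, reference code, hybridization
    coordinates, T-modification flag and direction."""
    match = PRIMER_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return {
        "gene": match["gene"],
        "reference": match["ref"],
        "start": int(match["start"]),
        "end": int(match["end"]),
        "t_modified": match["tmod"] is not None,
        "direction": "forward" if match["direction"] == "F" else "reverse",
    }

print(parse_primer_name("16S_EC_713_732_TMOD_F"))
```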
[243] The 19 calibration sequences described in Tables 10 and 11 were combined into a single calibration polynucleotide sequence (SEQ ID NO: 1464 - which is herein designated a "combination calibration polynucleotide") which was then cloned into a pCR®-Blunt vector (Invitrogen, Carlsbad, CA). This combination calibration polynucleotide can be used in conjunction with the primers of Tables 5 or 6 as an internal standard to produce calibration amplicons for use in determination of the quantity of any bacterial bioagent. Thus, for example, when the combination calibration polynucleotide vector is present in an amplification reaction mixture, a calibration amplicon based on primer pair 346 (16S rRNA) will be produced in an amplification reaction with primer pair 346 and a calibration amplicon based on primer pair 363 (rpoC) will be produced with primer pair 363. Coordinates of each of the 19 calibration sequences within the calibration polynucleotide (SEQ ID NO: 1464) are indicated in Table 11.
Table 10: Bacterial Primer Pairs for Production of Bacterial Bioagent Identifying Amplicons and Corresponding Representative Calibration Sequences
Figure imgf000128_0001
Figure imgf000129_0002
Table 11: Primer Pair Gene Coordinate References and Calibration Polynucleotide Sequence Coordinates within the Combination Calibration Polynucleotide
Figure imgf000129_0001
Figure imgf000130_0001
Example 9: Use of a Calibration Polynucleotide for Determining the Quantity of Bacillus anthracis in a Sample Containing a Mixture of Microbes
[244] The process described in this example is shown in Figure 2. The capC gene is a gene involved in capsule synthesis which resides on the pXO2 plasmid of Bacillus anthracis. Primer pair number 350 (see Tables 10 and 11) was designed to identify Bacillus anthracis via production of a bacterial bioagent identifying amplicon. Known quantities of the combination calibration polynucleotide vector described in Example 8 were added to amplification mixtures containing bacterial bioagent nucleic acid from a mixture of microbes which included the Ames strain of Bacillus anthracis. Upon amplification of the bacterial bioagent nucleic acid and the combination calibration polynucleotide vector with primer pair no. 350, bacterial bioagent identifying amplicons and calibration amplicons were obtained and characterized by mass spectrometry. A mass spectrum measured for the amplification reaction is shown in Figure 7. The molecular masses of the bioagent identifying amplicons provided the means for identification of the bioagent from which they were obtained (Ames strain of Bacillus anthracis) and the molecular masses of the calibration amplicons provided the means for their identification as well. The relationship between the abundance (peak height) of the calibration amplicon signals and the bacterial bioagent identifying amplicon signals provides the means of calculation of the copies of the pXO2 plasmid of the Ames strain of Bacillus anthracis. Methods of calculating quantities of molecules based on internal calibration procedures are well known to those of ordinary skill in the art.
[245] Averaging the results of 10 repetitions of the experiment described above enabled a calculation indicating that the quantity of the Ames strain of Bacillus anthracis present in the sample corresponds to approximately 10 copies of the pXO2 plasmid.
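The text does not spell out the calibration arithmetic, so the following is only a minimal sketch of one common internal-standard calculation. It assumes that peak height is proportional to the number of starting template copies and that bioagent and calibrant amplify with equal efficiency; the function names and the numerical values are hypothetical.

```python
def estimate_copies(bioagent_peak, calibrant_peak, calibrant_copies):
    """Scale the known calibrant copy number by the ratio of the
    bioagent amplicon peak height to the calibrant amplicon peak height."""
    if calibrant_peak <= 0:
        raise ValueError("calibrant peak height must be positive")
    return calibrant_copies * bioagent_peak / calibrant_peak

def mean_estimate(replicates, calibrant_copies):
    """Average the estimate over repeated measurements.
    `replicates` is a list of (bioagent_peak, calibrant_peak) pairs."""
    estimates = [estimate_copies(b, c, calibrant_copies) for b, c in replicates]
    return sum(estimates) / len(estimates)

# Hypothetical peak heights with 100 calibrant copies per reaction.
print(mean_estimate([(40, 400), (60, 400)], calibrant_copies=100))
```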
Example 10: Preparation of PCR Reaction Mixtures from Genomic DNA Isolated from Mycobacterium tuberculosis Samples
[246] This specific protocol is suitable for obtaining amplification products from samples of Mycobacterium tuberculosis. The optical density of the isolated genomic material is measured in order to estimate the number of genome copies present in the sample. Serial dilutions are then performed to obtain a maximum concentration of 200 genome copies per microliter. A stock solution of Taq polymerase is prepared such that 3 units of Taq polymerase per microliter are present in the final reaction mixture. An aliquot of 40 microliters of this stock solution is mixed with 40 microliters of the diluted genomic DNA in an Eppendorf tube. A volume of 10 microliters of the mixture is then added to a well of a 96-well plate containing primer pairs used for obtaining amplification products corresponding to bioagent identifying amplicons. The plate is sealed and centrifuged at 800 rpm for one minute prior to beginning the PCR cycle.
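The copy-number estimate and dilution step above can be sketched with standard conversion factors. None of the constants below come from the text: the sketch assumes an absorbance at 260 nm of 1.0 corresponds to about 50 ng/µL of double-stranded DNA (1 cm path length), an average of about 660 g/mol per base pair, and a Mycobacterium tuberculosis genome of roughly 4.4 million base pairs. The function names are hypothetical.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one double-stranded base pair
GENOME_BP = 4_400_000      # assumed approximate M. tuberculosis genome length

def od260_to_ng_per_ul(od260, dilution=1.0):
    """Convert an A260 reading to ng/uL of dsDNA (assumes 50 ng/uL per A260 unit)."""
    return od260 * 50.0 * dilution

def genome_copies_per_ul(ng_per_ul, genome_bp=GENOME_BP):
    """Estimate genome copies per microliter from a DNA concentration."""
    grams_per_ul = ng_per_ul * 1e-9
    return grams_per_ul * AVOGADRO / (genome_bp * BP_MASS_G_PER_MOL)

def dilution_factor(copies_per_ul, target=200.0):
    """Fold dilution needed to reach at most `target` copies per microliter."""
    return max(copies_per_ul / target, 1.0)

copies = genome_copies_per_ul(od260_to_ng_per_ul(0.02))
print(round(copies), round(dilution_factor(copies)))
```

Under these assumptions, 1 ng/µL of genomic DNA corresponds to roughly 2 x 10^5 genome copies per microliter, so the dilution to 200 copies per microliter is typically about a thousand-fold or more.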
Example 11: Selection of Primer Pairs for Genotyping of Members of the Bacterial Genus Mycobacterium and for Identification of Drug-Resistant Strains of Mycobacterium tuberculosis
[247] To combine the power of high-throughput mass spectrometric analysis of bioagent identifying amplicons with the sub-species characteristic resolving power provided by genotyping analysis and codon base composition analysis, a panel of twenty-four genotyping analysis primer pairs was selected. The primer pairs are designed to produce bioagent identifying amplicons within sixteen different housekeeping genes which are listed in Table 12. The primer sequences are found in Table 2 and are cross-referenced by the primer pair numbers, primer pair names or SEQ ID NOs listed in Table 12.
[248] In Mycobacterium tuberculosis, the acquisition of drug resistance is mostly associated with the emergence of discrete key mutations that can be unambiguously determined using the methods disclosed herein.
[249] The evolution of the Mycobacterium tuberculosis genome is essentially clonal, thus allowing strain typing through the query of distinct genomic markers that are lineage-specific and only vertically inherited. Co-infections of mixed populations of genotypes of Mycobacterium tuberculosis can be revealed simultaneously in the mass spectra of amplification products produced using the primers of Table 12. The high G+C content of the Mycobacterium tuberculosis genome itself greatly facilitates the development of short, efficient primers which are appropriate for multiplexing (inclusion of a plurality of primers in each amplification reaction mixture).
Table 12: Primer Pairs for Genotyping and Determination of Drug Resistance of Strains of Mycobacterium tuberculosis
Figure imgf000132_0001
Figure imgf000133_0001
[250] The panel of 24 primer pairs is designed to be multiplexed into 8 amplification reactions. Thirteen primer pairs were designed with the objective of identifying mutations associated with resistance to drugs including rifampin (primer pair numbers 3546, 3547 and 3548), ethambutol (primer pair numbers 3550 and 3551), isoniazid (primer pair numbers 3552 and 3553), fluoroquinolone (primer pair numbers 3555 and 3556), streptomycin (primer pair number 3557) and pyrazinamide (primer pair numbers 3558, 3559, 3560 and 3561). Four of these thirteen primer pairs were specifically designed to provide bioagent identifying amplicons for base composition analysis of single codons (primer pair numbers 3547 (rpoB codon D526), 3548 (rpoB codon H516), 3551 (embB codon M306), and 3553 (katG codon S315)). In any of these bioagent identifying amplicons used for base composition analysis, detection of a mutation identifies a drug-resistant strain of Mycobacterium tuberculosis. The remaining nine primer pairs define larger bioagent identifying amplicons that contain secondary drug resistance-conferring sites which are rarer than the four codons discussed above, but certain of these nine primer pairs define bioagent identifying amplicons that also contain some of these four codons (for example, the amplicon defined by primer pair 3546 contains two rpoB codons, D526 and H516).
[251] Shown in Table 13 are classifications of members of the bacterial genus Mycobacterium according to principal genetic group (PGG, determined using primer pair numbers 3554 and 3556), genotype of Mycobacterium tuberculosis, or species of selected other members of the genus Mycobacterium (determined using primer pair numbers 3581-3584, 3586, 3587 and 3599-3601), and drug resistance to rifampin, ethambutol, isoniazid, fluoroquinolone, streptomycin, and pyrazinamide. The primer pairs used to define the bioagent identifying amplicons for each PGG group, genotype or drug resistant strain are shown in the column headings. In the drug resistance columns, codon mutations are indicated by the amino acid single letter code and codon position convention which is well known to those with ordinary skill in the art. For example, when nucleic acid of Mycobacterium tuberculosis strain 13599 is amplified using primer pair number 3555, and the molecular mass or base composition is determined, mutation of codon 90 from alanine (A) to valine (V) is indicated and the conclusion is drawn that strain 13599 is resistant to the drug fluoroquinolone.
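The single-letter codon mutation convention mentioned above (original residue, codon position, replacement residue) can be decoded as in the sketch below. The function name `parse_mutation` is hypothetical, and compound entries such as those listing alternative replacement residues are deliberately not handled here.

```python
import re

AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine", "D": "aspartic acid",
    "C": "cysteine", "Q": "glutamine", "E": "glutamic acid", "G": "glycine",
    "H": "histidine", "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline", "S": "serine",
    "T": "threonine", "W": "tryptophan", "Y": "tyrosine", "V": "valine",
}

def parse_mutation(code):
    """Decode e.g. 'A90V' into (original residue, codon position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None or match[1] not in AMINO_ACIDS or match[3] not in AMINO_ACIDS:
        raise ValueError(f"not a simple codon mutation: {code!r}")
    return AMINO_ACIDS[match[1]], int(match[2]), AMINO_ACIDS[match[3]]

print(parse_mutation("A90V"))  # the codon 90 example given in the text
```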
[252] Primer pair number 3600 is a speciation primer pair which is useful for distinguishing members of Mycobacterium tuberculosis PGG1 (including genotypes I, II and IIA) from other species of the genus Mycobacterium (such as, for example, Mycobacterium africanum, Mycobacterium bovis, Mycobacterium microti, and Mycobacterium canettii; see Figure 8).
Table 13: Classification and Drug Resistance Profiles of Strains of Members of the Genus Mycobacterium and Genotypes of Mycobacterium tuberculosis
Figure imgf000135_0001
Example 12: Validation of the Panel of 24 Primer Pairs
[253] Each primer pair was individually validated using the reference Mycobacterium tuberculosis strain H37Rv. Dilution To Extinction (DTE) experiments yielded the expected base composition down to 16 genomic copies per well. A multiplexing scheme was then determined in order to spread into different wells the primer pairs targeting the same gene, to spread within a single well the expected amplicon masses, and to avoid cross-formation of primer duplexes. The multiplexing scheme is shown in Table 14 where multiplexed amplification reactions are indicated in headings numbered A through H and the primer pairs utilized for each reaction are shown below.
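Two of the multiplexing constraints named above (primer pairs targeting the same gene go to different wells, and expected amplicons within a well are spread apart) can be checked with a short sketch. The primer pair labels, gene names, and the minimum length gap of 10 bases are hypothetical, amplicon length is used here as a simple proxy for amplicon mass, and the third constraint, avoiding cross-formation of primer duplexes, is not modeled.

```python
from itertools import combinations

def well_conflicts(well, min_length_gap=10):
    """Report pairwise conflicts in one multiplex well.
    `well` is a list of (primer_pair, gene, amplicon_length) tuples."""
    problems = []
    for (p1, g1, n1), (p2, g2, n2) in combinations(well, 2):
        if g1 == g2:
            problems.append((p1, p2, "same gene"))
        if abs(n1 - n2) < min_length_gap:
            problems.append((p1, p2, "amplicon lengths too close"))
    return problems

# Hypothetical triplet of primer pairs with well-separated amplicon lengths.
print(well_conflicts([("pp1", "geneA", 46), ("pp2", "geneB", 68), ("pp3", "geneC", 129)]))
```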
Table 14: Multiplexing Scheme for Panel of 24 Primer Pairs
Figure imgf000136_0001
[254] An example of an experimentally determined table of base compositions is shown in Table 15. Base compositions of amplification products obtained from nucleic acid isolated from Mycobacterium tuberculosis strain 5170 using the primer pair multiplex reactions indicated in Table 14 are shown. Molecular masses of the amplification products were measured by electrospray time of flight mass spectrometry in order to calculate the base compositions. It should be noted that the lengths of the amplification products within each reaction mixture vary greatly in order to avoid overlap of molecular masses during the measurements. For example, reaction A has three amplification products which have lengths of 46 (A13 T11 C15 G07), 68 (A14 T18 C21 G15) and 129 (A21 T37 C44 G27).
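The relationship between a base composition and the amplicon length quoted above is a simple sum of the four base counts. The sketch below parses the notation used in the text; the function names are assumptions for illustration.

```python
import re

def parse_base_composition(text):
    """Parse a composition such as 'A13 T11 C15 G07' into a dict of counts."""
    return {base: int(count) for base, count in re.findall(r"([ACGT])(\d+)", text)}

def amplicon_length(text):
    """Amplicon length is the sum of the A, T, C and G counts."""
    return sum(parse_base_composition(text).values())

# The three amplification products of reaction A given in the text.
for composition in ("A13 T11 C15 G07", "A14 T18 C21 G15", "A21 T37 C44 G27"):
    print(composition, amplicon_length(composition))
```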
Table 15: Base Compositions Obtained in the Multiplex Amplification Reactions of Nucleic Acid of Mycobacterium tuberculosis Strain 5170
Figure imgf000136_0002
Figure imgf000137_0001
[255] Dilution to extinction experiments were then carried out with the chosen triplets of primer pairs in multiplex conditions. Base compositions expected on the basis of the known sequence of the reference strain were observed down to 32 genomic copies per well on average. The assay was finally tested using a collection of 36 diverse strains from the Public Health Research Institute. As expected, the base composition results were in accordance with the genotyping and drug-resistance profiles already determined for these reference strains.
Example 13: Diagnosis and Treatment of a Human Subject Infected with a Multi-Drug Resistant Strain of Mycobacterium tuberculosis
[256] This example illustrates how the methods disclosed herein would be useful for diagnosis of a human infected with a drug resistant strain of Mycobacterium tuberculosis. A sample is obtained from a human suspected of being infected with Mycobacterium tuberculosis. At this stage, the specific genotype or strain is not known. The sample can be any sample appropriate for identifying a Mycobacterium tuberculosis infection in a human and can be obtained by established clinical methods known to those with ordinary skill in the art. Nucleic acid can be isolated from the sample by known methods or by methods generally similar to those disclosed in Example 10. The nucleic acid is then amplified by known methods or by methods generally similar to those disclosed in Example 2 to obtain amplification products corresponding to bioagent identifying amplicons which are defined, for example, by the primer pairs of Table 12 (whose sequences are shown in Table 2), or functional variants thereof. The amplification products are purified by methods generally similar to that described in Example 3 and analyzed according to the methods described in Example 4, and, optionally, Example 5. Optionally, the quantity of Mycobacterium tuberculosis may be determined by preparing calibration polynucleotides for Mycobacterium tuberculosis using methods similar to those described in Example 9. In this example, the series of base compositions of the amplification products obtained in the analyses indicate that the sample contains two distinct populations of two strains of Mycobacterium tuberculosis. The first strain belongs to PGG1 as indicated by base compositions of amplification products of primer pair numbers 3554 and 3556 and has genotype I as indicated by base compositions of amplification products of primer pair numbers 3581, 3582, 3583, 3584, 3586, 3587, 3599, 3600, and 3601.
None of the drug resistance primer pairs indicate mutations of codons that confer drug resistance, so it is concluded that the strain could be either of the known strains 14157 or 15042, neither of which is drug-resistant. On the other hand, the second strain of Mycobacterium tuberculosis in the sample belongs to PGG1 as indicated by base compositions of amplification products of primer pair numbers 3554 and 3556 and has genotype II as indicated by base compositions of amplification products of primer pair numbers 3581, 3582, 3583, 3584, 3586, 3587, 3599, 3600, and 3601. Drug resistance primer pairs 3546, 3547 and 3548 indicate the presence of a H528Y mutation indicating resistance to rifampin. Drug resistance primer pairs 3550 and 3551 indicate the presence of a M307V mutation indicating resistance to ethambutol. Drug resistance primer pair 3553 indicates the presence of a S315N/T mutation indicating resistance to isoniazid and drug resistance primer pair 3557 indicates the presence of a K43R mutation indicating resistance to streptomycin. It is then determined that this second strain could be strain 13598, a multi-drug resistant strain. Since this strain does not have resistance to fluoroquinolone or pyrazinamide, these drugs would, in theory, be appropriate to treat the individual by killing this strain and presumably would also be useful to kill the first strain, which is not resistant to any of the drugs listed in Table 13. The methods could be repeated over the time course of treatment of the subject with fluoroquinolone or pyrazinamide to investigate and verify the eradication of the infection. Likewise, other bacterial co-infections could be investigated using amplification products corresponding to bioagent identifying amplicons defined by other primer pairs disclosed in Table 2.
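The drug-selection reasoning of this example can be sketched in a few lines of Python. This is a minimal illustration only, not part of the disclosed methods: the mutation calls are copied from the example above, while the drug panel list and the function name candidate_drugs are assumptions introduced here.

```python
# Illustrative sketch only. The panel below is an assumed list of the drugs
# queried by the assay; the mutation calls are copied from Example 13.
DRUG_PANEL = ["rifampin", "ethambutol", "isoniazid", "streptomycin",
              "fluoroquinolone", "pyrazinamide"]


def candidate_drugs(strain_profiles, panel=DRUG_PANEL):
    """Return panel drugs for which no strain in the sample shows a resistance mutation."""
    resistant = set()
    for profile in strain_profiles:
        for drug, mutations in profile.items():
            if mutations:  # at least one resistance-associated mutation was called
                resistant.add(drug)
    return [drug for drug in panel if drug not in resistant]


# First strain: no resistance mutations detected.
first_strain = {}
# Second strain: mutation calls as stated in the example text.
second_strain = {
    "rifampin": ["H528Y"],
    "ethambutol": ["M307V"],
    "isoniazid": ["S315N/T"],
    "streptomycin": ["K43R"],
}

print(candidate_drugs([first_strain, second_strain]))
```

Because the sample is treated as a mixed population, resistance found in any one strain excludes the drug for the whole sample, which mirrors the conclusion drawn in the example.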
Example 14: Analysis of 102 Diverse Strains of Mycobacterium tuberculosis from the PHRC Collection.
[257] Recent outbreaks of multidrug-resistant tuberculosis underline the urgent need for new resistance-profiling methods that would allow for timely determination of proper treatment. The instant compositions and methods provide rapid analysis of large numbers of samples and resolving power approximating sequence-based methods. As discussed above, PCR amplicons are analyzed by electrospray ionization mass spectrometry (ESI-MS) and base composition determination. The M. tuberculosis assay scrutinizes mutations associated with resistance to Rifampin, Isoniazid, Ethambutol, Pyrazinamide, Streptomycin and Fluoroquinolone. In addition, several silent mutations disseminated throughout the M. tuberculosis genome are simultaneously queried in order to discriminate the different sub-species of the M. tuberculosis complex, down to the nine M. tuberculosis SNP-based clusters (Mathema B, et al., Molecular Epidemiology of Tuberculosis: Current Insights. Clin. Microbiol. Rev. (2006) 19:658-685). The assay was tested using 102 diverse strains from the Public Health Research Institute (PHRC). We found that a 24-primer pair panel, which can be multiplexed into 8 PCR reactions, efficiently characterizes M. tuberculosis into the appropriate subspecies and provides the essential drug resistance profiling needed for prescribing the correct drugs and understanding the epidemiology of an outbreak. Table 16 illustrates the genotype and drug-resistance profiles from the analysis of 102 diverse strains from the PHRC collection. Multiple signatures from individual primer pairs, hinting at the presence of different strains within the same sample, are seen in the Table.
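As a rough illustration of how a base composition may be deduced from a measured molecular mass, the following Python sketch enumerates the compositions of a single DNA strand that fall within a mass tolerance. It is not the algorithm of the disclosed system. The residue masses are commonly used average values for an unmodified strand with 5'- and 3'-hydroxyl ends, and the assumption that the amplicon length is known, the tolerance, and the function names are all introduced here for illustration.

```python
# Illustrative sketch only: average residue masses (Da) for an unmodified
# single DNA strand; the terminal correction assumes 5'-OH and 3'-OH ends.
AVG_RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
TERMINAL_CORRECTION = -61.96


def strand_mass(a, c, g, t):
    """Approximate average mass of a strand containing a, c, g and t residues."""
    return (a * AVG_RESIDUE_MASS["A"] + c * AVG_RESIDUE_MASS["C"]
            + g * AVG_RESIDUE_MASS["G"] + t * AVG_RESIDUE_MASS["T"]
            + TERMINAL_CORRECTION)


def candidate_compositions(measured_mass, length, tol_da=0.5):
    """List every (A, C, G, T) count of the given length within tol_da of the mass."""
    hits = []
    for a in range(length + 1):
        for c in range(length + 1 - a):
            for g in range(length + 1 - a - c):
                t = length - a - c - g
                if abs(strand_mass(a, c, g, t) - measured_mass) <= tol_da:
                    hits.append((a, c, g, t))
    return hits


# Hypothetical 20-mer used only to exercise the search.
mass = strand_mass(5, 4, 6, 5)
print(candidate_compositions(mass, 20))
```

More than one composition can fall inside the tolerance window, because certain substitution combinations nearly cancel in mass. A practical system would therefore need additional constraints, such as the mass of the complementary strand, to settle on a single composition.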
Table 16: Base Composition Analysis of Bioagent Identifying Amplicons of Mycobacterium tuberculosis
Figure imgf000140_0001
Example 15: Selection of Additional Primer Pairs for Genotyping of Members of the Bacterial Genus Mycobacterium and for Identification of Drug-Resistant Strains of Mycobacterium tuberculosis
[258] For tuberculosis, resistance to first-line antibiotics is associated with the acquisition of point-specific mutations clustered in genes. For example, 95% of rifampin-resistant strains have mutations within the rifampin resistance determining region (RRDR) spanning rpoB codons 505 to 533. Mutations are frequently seen in rpoB codons 516, 526 and 531. As well, at least 54% of isoniazid-resistant strains have a mutation in katG codon S315. Secondary mutations are seen in inhA (promoter, S94A and I21V/T) as well as in the ahpC promoter. These mutations are not observed in susceptible strains. For example, 95% of the multiple drug-resistant genotypes (RIFR, INHR) are detectable with fewer than 10 primer pairs. In addition to RIF and INH resistance, primer pairs targeting mutations conferring resistance to other first- and second-line drugs were also developed. To combine the power of high-throughput mass spectrometric analysis of bioagent identifying amplicons with the sub-species characteristic resolving power provided by genotyping analysis and codon base composition analysis, a panel of sixty-nine genotyping analysis primer pairs was selected and individually evaluated by dilution-to-extinction experiments using genomic DNA of the H37Rv strain. Primer pairs were individually validated and tested in multiplex settings of increasing complexity. In some embodiments of multiplex testing, the first primer pair targets the rpoB, the rrs, the embB, the katG, and/or the gyrA gene. In some embodiments of multiplex testing, the second primer pair targets the inhA, the ahpC, the rrs, and/or the rpoB gene. In some embodiments of multiplex testing, the third primer pair targets the pncA and/or rpsL genes. In some embodiments of multiplex testing, the first primer pair targets the rpoB 516 and 526 polymorphisms, the rrs 1484 polymorphism, the embB 306 polymorphism, the katG 315 and 463 polymorphisms, and/or the gyrA 90..95 and 95 polymorphisms.
In some embodiments of multiplex testing, the second primer pair targets the inhA 189..199 and promoter polymorphisms, the ahpC promoter polymorphism, the rrs 1401-1402 and 511..513 polymorphisms, and/or the rpoB 531, 505..526, and 562..572 polymorphisms. In some embodiments of multiplex testing, the third primer pair targets the pncA 22..48, 77..102, 103..135, <1..20, 139..171, 49..80 and/or rpsL 29..58 and 59..91 polymorphisms. In other embodiments of multiplex testing, a panel of primer pairs is used to target multiple genes and polymorphisms. Table 17 shows an exemplary table of multiplex primer pairs used, for example, for drug resistance testing. Figure 9 shows that critical mutations may be uniquely resolved using dedicated primer pairs. Figure 10 shows that rare mutations may be simultaneously queried using a shared primer pair. Figure 11 shows determination of resistance-conferring mutations by PCR/ESI-MS with resolution of mass spectra, and that primer pairs sharing the same well yield amplicons of distinct lengths.
Table 17. Target Genes and Polymorphisms for Mycobacterium tuberculosis Multiplex Drug Resistance Testing
Figure imgf000142_0001
[259] Twenty-four primer pairs were configured in an eight-well multiplexed assay. The assay was first tested using genomic DNA of 102 strains of known phenotypes (PHRI Center/UMDNJ, Newark, NJ). An additional set of 25 multi-drug resistant strains from South Africa was tested. Drug-resistance genotypes were deduced from the determined base composition signatures and compared to independently determined phenotypes (Table 18).
Table 18. Sensitivity and Specificity of Mycobacterium tuberculosis Drug Resistance Testing
Figure imgf000142_0002
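The comparison of deduced genotypes against independently determined phenotypes reduces to the standard sensitivity and specificity calculation, sketched below in Python. The sketch does not reproduce the values of Table 18: the calls and phenotypes in the usage lines are hypothetical placeholders, and the function names are assumptions introduced here.

```python
# Illustrative sketch only: standard sensitivity/specificity arithmetic.
def confusion_counts(genotype_resistant, phenotype_resistant):
    """Count TP, FN, TN, FP from paired boolean calls (True means resistant)."""
    tp = fn = tn = fp = 0
    for called, observed in zip(genotype_resistant, phenotype_resistant):
        if observed and called:
            tp += 1
        elif observed and not called:
            fn += 1
        elif not observed and not called:
            tn += 1
        else:
            fp += 1
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of phenotypically resistant isolates called resistant by genotype."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of phenotypically susceptible isolates called susceptible by genotype."""
    return tn / (tn + fp)


# Hypothetical calls for five isolates; these are NOT data from Table 18.
calls = [True, True, False, False, True]
phenotypes = [True, True, True, False, False]
tp, fn, tn, fp = confusion_counts(calls, phenotypes)
print(sensitivity(tp, fn), specificity(tn, fp))
```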
[260] These results demonstrate that PCR/ESI-MS technology has been successfully applied to the characterization of drug resistance mutations of Mycobacterium tuberculosis. Sensitivity levels achieved for determination of isoniazid and rifampin resistance permit reliable molecular diagnosis of multiple drug resistant strains.
Example 16: Selection of Primer Pairs for Genotyping of Members of the Bacterial Genus Mycobacterium and for Identification of Drug-Resistant Strains of Mycobacterium tuberculosis
[261] To combine the power of high-throughput mass spectrometry analysis of bioagent identifying amplicons with the sub-species characteristic resolving power provided by genotyping analysis and codon base composition analysis, a panel of 16 genotyping analysis primer pairs was selected. The primer pairs are designed to produce bioagent identifying amplicons within different housekeeping genes which are listed in Table 19. The primer sequences are found in Table 2 and are cross-referenced by the primer pair numbers, primer pair names or SEQ ID NOs listed in Table 19.
[262] In Mycobacterium tuberculosis, the acquisition of drug resistance is mostly associated with the emergence of discrete key mutations that can be unambiguously determined using the methods disclosed herein.
[263] The evolution of the Mycobacterium tuberculosis genome is essentially clonal, thus allowing strain typing through the query of distinct genomic markers that are lineage-specific and only vertically inherited. Co-infections of mixed populations of genotypes of Mycobacterium tuberculosis can be revealed simultaneously in the mass spectra of amplification products produced using the primers of Table 19. The high G+C content of the Mycobacterium tuberculosis genome itself greatly facilitates the development of short, efficient primers which are appropriate for multiplexing (inclusion of a plurality of primers in each amplification reaction mixture).
Table 19: Primer Pairs for Genotyping and Determination of Drug Resistance of Strains of Mycobacterium tuberculosis
Figure imgf000143_0001
Figure imgf000144_0001
[264] Conventional Mycobacterium tuberculosis culture methods often take 3 months from the presumptive diagnosis of tuberculosis to determine the appropriate treatment regimen for a confirmed MDR case (Figure 12). The challenge in identification of MTb resistance is that multiple mutations in multiple regions of over a dozen genes must be determined simultaneously. Accordingly, a multiplexed assay is provided that characterizes first- and second-line drug resistance of MTb isolates using a high-throughput system, for example, the Ibis Biosciences, Inc. (Carlsbad, CA) Ibis T5000 Biosensor System, described, for example, in U.S. Patent Application No. 10/754,415, filed January 9, 2004, incorporated by reference herein in its entirety. This assay is capable of providing multidrug resistance profiling of up to 180 TB isolates post-culture in 24 h. The primer pairs that amplify relevant regions of resistance in target genes are shown in Table 20.
Table 20. Multiplex assay plate layout: Two primer pairs per well, 8 wells per sample, 12 samples per plate. Primer pairs targeting each of the drugs of choice are coded as follows: Isoniazid (A), Rifampin (B), Fluoroquinolone (C), Diarylquinoline (D) and multiple drug resistance (E).
Figure imgf000144_0002
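The plate arithmetic stated in the caption of Table 20 can be sketched as follows. The sketch assumes a standard 96-well plate with rows A to H and columns 1 to 12, and assumes that each sample occupies one column; that column-per-sample mapping, like the function names, is introduced here for illustration and is not stated in the table.

```python
import math

# Values from the Table 20 caption.
PRIMER_PAIRS_PER_WELL = 2
WELLS_PER_SAMPLE = 8
SAMPLES_PER_PLATE = 12

ROWS = "ABCDEFGH"  # assumed 96-well plate: one row per well of a sample


def well_for(sample_index, well_index):
    """Map a zero-based sample and well index to (plate number, well label).

    Assumes each sample fills one plate column, which is a hypothetical layout.
    """
    plate = sample_index // SAMPLES_PER_PLATE
    column = sample_index % SAMPLES_PER_PLATE + 1
    return plate, f"{ROWS[well_index]}{column}"


def plates_needed(n_samples):
    """Number of plates required to run n_samples."""
    return math.ceil(n_samples / SAMPLES_PER_PLATE)


print(PRIMER_PAIRS_PER_WELL * WELLS_PER_SAMPLE)   # primer pairs queried per sample
print(WELLS_PER_SAMPLE * SAMPLES_PER_PLATE)       # wells used per plate
print(plates_needed(180))                         # plates for the stated 180 isolates
```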
[265] Primer pairs configured to detect isoniazid resistance include: primer pair BCT3553 (molecular target katG codon 315. Mutations at position S315, in particular S315T (ACC), are present in about 54% of the isoniazid-resistant isolates (mutation frequencies for INH resistance according to Hazbon, AAC 2006, 50:2640-9; INH mutation frequencies vary greatly depending on authors, location and sample size). All mutants are distinguished from the wild-type, but a double mutant S315T (ACA) yields the same base composition as the simple mutant S315N (AAC)); primer pair BCT3552 (molecular target inhA operon promoter. Four distinct mutations located 8 to 17 nt upstream of mabA, the first gene of the inhA operon, are covered by this primer pair. These mutations are found in ~10% of isoniazid-resistant isolates); primer pair BCT4234 (molecular target ahpC promoter. Twelve distinct mutations located 4 to 39 nt upstream of ahpC are detected by this primer pair. These mutations are found in ~8% of isoniazid-resistant isolates); primer pair BCT4235 (molecular target inhA S94A. This mutation is found in ~5% of isoniazid-resistant isolates); and primer pair BCT4236 (molecular target inhA I21V/T. This mutation is found in ~2% of isoniazid-resistant isolates).
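The ambiguity noted for katG codon 315 follows directly from counting bases, as the short Python sketch below shows. The mutant codons ACC, ACA and AAC are those given in the paragraph above; the wild-type codon AGC is an assumption added here, and the helper name composition is hypothetical.

```python
from collections import Counter


def composition(codon):
    """Base composition (count of each nucleotide) of a codon or sequence."""
    return Counter(codon)


WILD_TYPE = "AGC"        # assumed wild-type serine codon at katG 315
S315T_SINGLE = "ACC"     # single mutant, as given in the text
S315T_DOUBLE = "ACA"     # double mutant, as given in the text
S315N = "AAC"            # single mutant, as given in the text

for name, codon in [("wild type", WILD_TYPE), ("S315T (ACC)", S315T_SINGLE),
                    ("S315T (ACA)", S315T_DOUBLE), ("S315N (AAC)", S315N)]:
    print(name, dict(composition(codon)))

# ACA and AAC contain the same bases in a different order, so a measurement
# that reports only base composition cannot tell them apart.
print(composition(S315T_DOUBLE) == composition(S315N))
```

Every mutant composition differs from that of the assumed wild-type codon, which is consistent with the statement that all mutants are distinguished from the wild type.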
[266] Primer pairs configured to detect rifampin (RIF) resistance target rpoB, the beta subunit of RNA polymerase. Approximately 95% of RIFr isolates harbor mutations within the Rifampin Resistance Determining Region (RRDR), between rpoB codons 507 and 533 (McCammon, AAC 2005, 49:2200-9, incorporated by reference herein in its entirety). Primary regions within the RRDR are detected by the primer pairs BCT3828, BCT3908, BCT3633, and BCT4366 for the determination of RIF resistance, and primer pairs BCT4237 and BCT3697 detect secondary sites within rpoB. Primer pairs configured to detect rifampin resistance include: primer pair BCT3828 (molecular target rpoB codons 531-533. Mutations at position S531, in particular S531L, are present in about half of resistant isolates. Single mutations S531L and S531W, as well as double mutations S531F and S531Y, are resolved from one another. The rare L533P mutation is also captured and segregated from the S531L/Y/F/W mutations); primer pair BCT3908 (molecular target rpoB codon 526 only. This primer pair unambiguously resolves the mutations H526N/D/Y/G/L/R found in ~25% of the resistant isolates); primer pair BCT3633 (molecular target rpoB codons 515 and 516. This primer pair resolves mutations D516V, D516G and D516Y, even in the event of duplication of codon F515); primer pair BCT4366 (molecular target rpoB codons 505 to 516. This primer pair detects RRDR mutations present in the remaining 9-10% of resistant isolates, but located outside of the three regions described above (including rare single codon insertions or deletions around positions 510-515). Base compositions from this primer pair are analyzed in view of the mutations already detected using primer pair BCT3633); primer pair BCT4237 (molecular target rpoB codons 130 to 140. Mutation V146F is typically found in resistant isolates without RRDR mutations, and accounts for 1% to 4% of the resistant isolates (Heep, JCM 2001, 39:107-110; McCammon, AAC 2005, 49:2200-9, both of which are incorporated by reference herein in their entireties)); and primer pair BCT3697 (molecular target rpoB codons 562 to 572. Mutation I572F may be found in isolates carrying mutations within the RRDR (~1%)).
[267] Primer pairs configured to detect multiple drug resistance include: primer pair BCT3551 (molecular target embB codon 306). A close correlation exists between mutations M306I/L/V/R and broad multi-drug resistance (Hazbon, AAC 2005, 49:3794-3802; Shi, AAC 2007, 51:4515-7, both of which are incorporated by reference herein in their entireties): a) mutations at this codon are found in isolates with INH- and RIF-resistance-conferring mutations, and b) mutations associated with resistance to pyrazinamide are present in isolates carrying an embB 306 mutation. Testing this locus thereby provides a consistency check for the panel of primer pairs.
[268] Primer pairs configured to detect diarylquinoline resistance include: primer pair BCT4364 (molecular target atpE. Mutations (A63P, I66M) conferring resistance to diarylquinolines (Petrella, AAC 2006, 50:2853-6, incorporated by reference herein in its entirety) are deduced from the amplicon base composition of this primer pair).
[269] Primer pairs configured to detect fluoroquinolone resistance include: primer pair BCT3555 (molecular target gyrA codons 90 to 95, the Quinolone Resistance Determining Region (QRDR). Within this locus, frequently observed mutations include A90V, S91P and D94A/Y/N/G); and primer pair BCT3556 (molecular target gyrA codon 95. The mutation T95S is a phylogenetic marker not associated with fluoroquinolone resistance. However, because of its proximity to the QRDR, codon gyrA 95 is detected by BCT3555 in order to ensure the production of an amplicon by BCT3555 regardless of the composition of codon 95. Knowledge of the base composition of codon 95 alone is needed to correctly interpret the base composition of the QRDR amplicon. For example, the double mutant D94H+T95S might otherwise be indistinguishable from the wild-type QRDR).
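The reason codon 95 must be measured separately can be illustrated by subtracting its base composition from that of the codon 94-95 region. The Python sketch below is a simplified illustration restricted to two codons, not the full amplicon analysis. The codon sequences used (GAC and CAC at position 94, ACC and AGC at position 95) are assumptions introduced here, as is the function name codon94_composition.

```python
from collections import Counter

# Assumed codon sequences, for illustration only.
WILD_TYPE = "GAC" + "ACC"    # D94 followed by T95
DOUBLE_MUTANT = "CAC" + "AGC"  # D94H followed by T95S


def codon94_composition(codons_94_95, codon_95):
    """Composition of codon 94 obtained by removing the known codon 95 bases."""
    return Counter(codons_94_95) - Counter(codon_95)


# Over both codons together, the two genotypes have the same base composition.
print(Counter(WILD_TYPE) == Counter(DOUBLE_MUTANT))

# Once codon 95 is known from the dedicated primer pair, codon 94 is resolved.
print(dict(codon94_composition(WILD_TYPE, "ACC")))
print(dict(codon94_composition(DOUBLE_MUTANT, "AGC")))
```

Under these assumed sequences the combined compositions coincide, while the residual composition attributed to codon 94 differs, which is the situation described for the D94H+T95S double mutant.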
[270] Primer pairs configured to detect the principal genetic group include: primer pair BCT3554 (molecular target katG codon 463). Similar to mutations detected by primer pair BCT3556, this mutation is not associated with drug resistance. In association with BCT3556, however, this primer pair provides the PGG1/2/3 classification scheme (Sreevatsan, PNAS 1997, 94:9869-74, incorporated by reference herein in its entirety).
CONCLUDING STATEMENTS
[271] The present invention includes any combination of the various species and subgeneric groupings falling within the generic disclosure. This invention therefore includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.
[272] While in accordance with the patent statutes, descriptions of the various embodiments and examples have been provided, the scope of the invention is not to be limited thereto or thereby. Modifications and alterations of the present invention will be apparent to those skilled in the art without departing from the scope and spirit of the present invention.
[273] Therefore, it will be appreciated that the scope of this invention is to be defined by the appended claims, rather than by the specific examples which have been presented by way of example.
[274] Each reference (including, but not limited to, journal articles, U.S. and non-U.S. patents, patent application publications, international patent application publications, gene bank gi or accession numbers, internet web sites, and the like) cited in the present application is incorporated herein by reference in its entirety.

Claims

What is claimed is:
1. A method of identifying a Mycobacterium tuberculosis genotype in a sample comprising: obtaining a sample suspected of containing Mycobacterium tuberculosis; isolating nucleic acid from said sample; contacting said nucleic acid with one or more primer pairs configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying said nucleic acid with said primers such that one or more amplification products corresponding to bioagent identifying amplicons are produced; and measuring the molecular masses of said one or more amplification products, thereby identifying said Mycobacterium tuberculosis genotype.
2. The method of claim 1 further comprising calculating base compositions of said amplification products from said molecular masses.
3. The method of claim 2 further comprising comparing said molecular masses or said base compositions with a database containing molecular masses or base compositions of bioagent identifying amplicons of genotypes of Mycobacterium tuberculosis, said bioagent identifying amplicons defined by said one or more primer pairs.
4. The method of any one of claims 1 to 3 wherein said one or more primer pairs is a primer pair having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair number 3600 (SEQ ID NOs: 1515:1538).
5. The method of any one of claims 1 to 4 wherein said one or more primer pairs further comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3581 (SEQ ID NOs: 1508:1531), 3582 (SEQ ID NOs: 1509:1532), 3583 (SEQ ID NOs: 1510:1533), 3584 (SEQ ID NOs: 1511:1534), 3586 (SEQ ID NOs: 1512:1535), 3587 (SEQ ID NOs: 1513:1536), 3599 (SEQ ID NOs: 1514:1537), 3601 (SEQ ID NOs: 1516:1539), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
6. The method of any one of claims 1 to 4 wherein said one or more primer pairs further comprises five or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3581 (SEQ ID NOs: 1508:1531), 3582 (SEQ ID NOs: 1509:1532), 3583 (SEQ ID NOs: 1510:1533), 3584 (SEQ ID NOs: 1511:1534), 3586 (SEQ ID NOs: 1512:1535), 3587 (SEQ ID NOs: 1513:1536), 3599 (SEQ ID NOs: 1514:1537), 3601 (SEQ ID NOs: 1516:1539), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
7. The method of any one of claims 1 to 3 wherein said one or more primer pairs comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers selected from the group consisting of: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3581 (SEQ ID NOs: 1508:1531), 3582 (SEQ ID NOs: 1509:1532), 3583 (SEQ ID NOs: 1510:1533), 3584 (SEQ ID NOs: 1511:1534), 3586 (SEQ ID NOs: 1512:1535), 3587 (SEQ ID NOs: 1513:1536), 3599 (SEQ ID NOs: 1514:1537), 3601 (SEQ ID NOs: 1516:1539), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
8. The method of any one of claims 1 to 3 wherein said one or more primer pairs comprises one or more primer pairs having a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding primer of primer pair numbers selected from the group consisting of: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
9. The method of any one of claims 1 to 4 wherein said Mycobacterium tuberculosis genotype is distinguished from Mycobacterium africanum, Mycobacterium bovis, Mycobacterium microti, and Mycobacterium canettii.
10. The method of any one of claims 1 to 9 wherein said Mycobacterium tuberculosis genotype comprises a drug-resistant strain of Mycobacterium tuberculosis.
11. The method of claim 10 wherein said drug resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinoline, fluoroquinolone, streptomycin and pyrazinamide.
12. The method of claim 10 wherein said drug resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinoline, fluoroquinolone, streptomycin and pyrazinamide.
13. The method of any one of claims 1 to 8 wherein three or more of said primer pairs are combined in a multiplex reaction to produce a plurality of amplification products corresponding to bioagent identifying amplicons.
14. The method of claim 1 wherein said molecular masses are measured by mass spectrometry.
15. The method of claim 1 wherein said sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, and tissue biopsy.
16. The method of claim 1 wherein said sample comprises a population of distinct genotypes of Mycobacterium tuberculosis.
17. An oligonucleotide primer pair comprising a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein said forward primer has at least 70% sequence identity with SEQ ID NO: 1515 and said reverse primer has at least 70% sequence identity with SEQ ID NO: 1538.
18. The oligonucleotide primer pair of claim 17 wherein said forward primer comprises at least 80% sequence identity with SEQ ID NO: 1515.
19. The oligonucleotide primer pair of claim 18 wherein said forward primer comprises at least 90% sequence identity with SEQ ID NO: 1515.
20. The oligonucleotide primer pair of claim 17 wherein said forward primer is SEQ ID NO: 1515.
21. The oligonucleotide primer pair of claim 17 wherein said reverse primer comprises at least 80% sequence identity with SEQ ID NO: 1538.
22. The oligonucleotide primer pair of claim 21 wherein said reverse primer comprises at least 90% sequence identity with SEQ ID NO: 1538.
23. The oligonucleotide primer pair of claim 17 wherein said reverse primer is SEQ ID NO: 1538.
24. A kit for identifying a Mycobacterium tuberculosis genotype in a sample comprising: i) a first oligonucleotide primer pair comprising a forward primer and a reverse primer, each configured to hybridize to a Mycobacterium tuberculosis gyrB gene, and each between 13 and 35 linked nucleotides in length wherein said forward primer has at least 70% sequence identity with SEQ ID NO: 1515 and said reverse primer has at least 70% sequence identity with SEQ ID NO: 1538; and ii) at least one additional primer pair wherein the primers of each of said at least one additional primer pair are configured to hybridize to sequence regions within a Mycobacterium tuberculosis gene selected from the group consisting of: rpoB, embB, fabG, inhA, katG, gyrA, pncA, prcA, rv2348c, rv3815c, rv0147, erg3, rv0083, rv1047, rv1814, rv0041, and rv0260c.
25. The kit of claim 24 wherein each of said at least one additional primer pairs is a primer pair comprising a forward primer and a reverse primer, said forward primer and said reverse primer each between 13 and 35 linked nucleotides in length and each having at least 70% sequence identity with the corresponding forward and reverse primers of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3581 (SEQ ID NOs: 1508:1531), 3582 (SEQ ID NOs: 1509:1532), 3583 (SEQ ID NOs: 1510:1533), 3584 (SEQ ID NOs: 1511:1534), 3586 (SEQ ID NOs: 1512:1535), 3587 (SEQ ID NOs: 1513:1536), 3599 (SEQ ID NOs: 1514:1537), 3601 (SEQ ID NOs: 1516:1539), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
26. A kit for identifying a Mycobacterium tuberculosis genotype in a sample comprising: i) a first oligonucleotide primer pair comprising a forward primer and a reverse primer, each configured to hybridize to a Mycobacterium tuberculosis gyrB gene, and each between 13 and 35 linked nucleotides in length selected from the group consisting of: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543); and ii) at least one additional primer pair wherein the primers of each of said at least one additional primer pair are configured to hybridize to sequence regions within a Mycobacterium tuberculosis gene selected from the group consisting of: rpoB, embB, fabG, inhA, katG, gyrA, pncA, prcA, rv2348c, rv3815c, rv0147, erg3, rv0083, rv1047, rv1814, rv0041, and rv0260c.
27. A method for identifying a drug-resistant strain of Mycobacterium tuberculosis comprising: obtaining a sample suspected of containing Mycobacterium tuberculosis; isolating nucleic acid from said sample; contacting said nucleic acid with a primer pair configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying said nucleic acid with said primer pair to obtain an amplification product containing a mutation of a codon known to confer drug resistance upon Mycobacterium tuberculosis; and measuring the molecular mass of said amplification product, thereby identifying said drug resistant strain of Mycobacterium tuberculosis.
28. The method of claim 27 further comprising calculating a base composition of said amplification product from said molecular mass, thereby identifying a base composition for said codon.
29. The method of claim 27 wherein said primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein said forward primer and said reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
30. The method of claim 27 wherein said primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein said forward primer and said reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
31. The method of claim 27 wherein said drug resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
32. The method of claim 27 wherein said drug resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
33. The method of claim 27 wherein said molecular mass is measured by mass spectrometry.
34. The method of claim 27 wherein said sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, tissue biopsy, tissue swab, tissue aspirate, abscess biopsy, and cerebrospinal fluid.
35. The method of claim 27 wherein said sample comprises a population of distinct genotypes of Mycobacterium tuberculosis.
36. The method of claim 35 wherein said population of distinct genotypes comprises a drug-resistant genotype and a drug-sensitive genotype.
37. A method of treating a human infected with a drug-resistant strain of Mycobacterium tuberculosis comprising: obtaining a sample from a human infected with Mycobacterium tuberculosis; isolating nucleic acid from said sample; contacting said nucleic acid with a primer pair configured to produce one or more bioagent identifying amplicons from nucleic acid of Mycobacterium tuberculosis and amplifying said nucleic acid with said primer pair to obtain an amplification product containing a mutation of a codon known to confer drug resistance upon Mycobacterium tuberculosis; measuring the molecular mass of said amplification product, thereby identifying said drug-resistant strain of Mycobacterium tuberculosis; selecting one or more alternative drugs to which said drug-resistant strain is not resistant; and administering said alternative drugs to said human.
38. The method of claim 37 wherein said primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein said forward primer and said reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
39. The method of claim 37 wherein said drug resistant strain of Mycobacterium tuberculosis is resistant to one or more drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
40. The method of claim 37 wherein said drug resistant strain of Mycobacterium tuberculosis is a multi-drug resistant strain which is resistant to a plurality of drugs selected from the group consisting of: rifampin, ethambutol, isoniazid, diarylquinolone, fluoroquinolone, streptomycin and pyrazinamide.
41. The method of claim 37 wherein said molecular mass is measured by mass spectrometry.
42. The method of claim 37 wherein said sample is a human clinical sample selected from the group consisting of: blood, sputum, urine, and tissue biopsy.
43. The method of claim 37 wherein said sample comprises a population of distinct genotypes of Mycobacterium tuberculosis.
44. The method of claim 37 wherein said population of distinct genotypes comprises a drug-resistant genotype and a drug-sensitive genotype.
45. A method for determining the identity and quantity of Mycobacterium tuberculosis in a sample comprising: contacting said sample with a pair of primers and a known quantity of a calibration polynucleotide comprising a calibration sequence; concurrently amplifying nucleic acid from said Mycobacterium tuberculosis in said sample with said pair of primers and amplifying nucleic acid from said calibration polynucleotide in said sample with said pair of primers to obtain a first amplification product comprising a Mycobacterium tuberculosis identifying amplicon and a second amplification product comprising a calibration amplicon; obtaining molecular mass and abundance data for said Mycobacterium tuberculosis identifying amplicon and for said calibration amplicon wherein the 5' and 3' ends of said Mycobacterium tuberculosis identifying amplicon and said calibration amplicon are the sequences of said pair of primers or complements thereof; and distinguishing said Mycobacterium tuberculosis identifying amplicon from said calibration amplicon based on their respective molecular masses, wherein the molecular mass of said Mycobacterium tuberculosis identifying amplicon indicates the identity of said Mycobacterium tuberculosis, and comparison of Mycobacterium tuberculosis identifying amplicon abundance data and calibration amplicon abundance data indicates the quantity of Mycobacterium tuberculosis in said sample.
46. The method of claim 41 wherein said primer pair comprises a forward primer and a reverse primer, each between 13 and 35 linked nucleotides in length wherein said forward primer and said reverse primer both have at least 70% sequence identity with the corresponding forward primer and reverse primer of a primer pair selected from the group consisting of primer pair numbers: 3546 (SEQ ID NOs: 1493:1517), 3547 (SEQ ID NOs: 1494:1518), 3548 (SEQ ID NOs: 1495:1519), 3550 (SEQ ID NOs: 1496:1520), 3551 (SEQ ID NOs: 1497:1521), 3552 (SEQ ID NOs: 1498:1522), 3553 (SEQ ID NOs: 1499:1523), 3554 (SEQ ID NOs: 1500:1524), 3555 (SEQ ID NOs: 1501:1525), 3556 (SEQ ID NOs: 1502:1525), 3557 (SEQ ID NOs: 1503:1526), 3558 (SEQ ID NOs: 1504:1527), 3559 (SEQ ID NOs: 1505:1528), 3560 (SEQ ID NOs: 1506:1529), 3561 (SEQ ID NOs: 1507:1530), 3908 (SEQ ID NOs: 1540:1541), 3633 (SEQ ID NOs: 1542:1543), 3697 (SEQ ID NOs: 1544:1545), 3828 (SEQ ID NOs: 1546:1547), 4234 (SEQ ID NOs: 1548:1549), 4235 (SEQ ID NOs: 1550:1551), 4236 (SEQ ID NOs: 1552:1553), 4237 (SEQ ID NOs: 1554:1555), 4362 (SEQ ID NOs: 1556:1557), 4364 (SEQ ID NOs: 1558:1559), and 4366 (SEQ ID NOs: 1560:1543).
47. The method of claim 41 wherein said calibration polynucleotide is selected from the group consisting of: calibration polynucleotide SEQ ID NO. 1561, calibration polynucleotide SEQ ID NO. 1562, calibration polynucleotide SEQ ID NO. 1563, and calibration polynucleotide SEQ ID NO. 1564.
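The following sketch is illustrative only and forms no part of the claims. It shows one way the calculation recited in claim 28 (deriving a base composition from a measured molecular mass) and the comparison recited in claim 45 (estimating quantity from calibration amplicon abundance) might be carried out. The residue masses are approximate average values for DNA nucleotides, the terminal correction assumes a single strand with 5'-OH and 3'-OH ends, and the simple abundance ratio assumes that target and calibrant amplify with equal efficiency. The function names, the mass tolerance, and the length limit are hypothetical and do not come from the specification.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a strand.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
# Assumed correction for a single strand with 5'-OH and 3'-OH termini.
TERMINAL_CORRECTION = -61.96


def strand_mass(a, c, g, t):
    """Approximate mass of a single strand with the given base counts."""
    return (a * RESIDUE_MASS["A"] + c * RESIDUE_MASS["C"]
            + g * RESIDUE_MASS["G"] + t * RESIDUE_MASS["T"]
            + TERMINAL_CORRECTION)


def base_compositions(measured_mass, max_length, tolerance=0.5):
    """List every (A, C, G, T) count whose mass lies within tolerance.

    A, C and G counts are enumerated; the T count is solved from the
    remaining mass and then checked against the tolerance.
    """
    hits = []
    for a in range(max_length + 1):
        for c in range(max_length + 1 - a):
            for g in range(max_length + 1 - a - c):
                partial = strand_mass(a, c, g, 0)
                t = round((measured_mass - partial) / RESIDUE_MASS["T"])
                if t < 0 or a + c + g + t > max_length:
                    continue
                if abs(strand_mass(a, c, g, t) - measured_mass) <= tolerance:
                    hits.append((a, c, g, t))
    return hits


def estimate_quantity(target_abundance, calibrant_abundance, calibrant_copies):
    """Scale the known calibrant quantity by the abundance ratio.

    Assumes equal amplification efficiency for target and calibrant.
    """
    if calibrant_abundance <= 0:
        raise ValueError("calibrant abundance must be positive")
    return calibrant_copies * target_abundance / calibrant_abundance
```

A single measured mass can match more than one base composition within a given tolerance, so the search returns a list of candidates rather than a single answer.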
PCT/US2008/067911 2007-06-22 2008-06-23 Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis WO2009017902A2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US12/666,239 US20110105531A1 (en) 2007-06-22 2008-06-23 Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis
EP08826817A EP2179041A4 (en) 2007-06-22 2008-06-23 Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis
ZA2010/00218A ZA201000218B (en) 2007-06-22 2010-01-12 Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US94585007P 2007-06-22 2007-06-22
US60/945,850 2007-06-22
US3788408P 2008-03-19 2008-03-19
US61/037,884 2008-03-19

Publications (2)

Publication Number Publication Date
WO2009017902A2 WO2009017902A2 (en) 2009-02-05
WO2009017902A3 WO2009017902A3 (en) 2009-10-15

Family

ID=40305160

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/067911 WO2009017902A2 (en) 2007-06-22 2008-06-23 Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis

Country Status (3)

Country Link
EP (1) EP2179041A4 (en)
WO (1) WO2009017902A2 (en)
ZA (1) ZA201000218B (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011047307A1 (en) 2009-10-15 2011-04-21 Ibis Biosciences, Inc. Multiple displacement amplification
WO2011140237A2 (en) * 2010-05-04 2011-11-10 The Government Of The Usa Of America As Represented By The Secretary Of The Department Of Health And Human Services, Centers For Disease Control And Prevention Process for detection of multidrug resistant tuberculosis using real-time pcr and high resolution melt analysis
WO2013132443A1 (en) * 2012-03-06 2013-09-12 Vela Operations Pte. Ltd. Real-time pcr detection of mycobacterium tuberculosis complex
CN108165561A (en) * 2017-12-01 2018-06-15 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and its application
CN108165562A (en) * 2017-12-01 2018-06-15 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and its application
CN108165560A (en) * 2017-12-01 2018-06-15 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and its application
CN110475863A (en) * 2017-01-19 2019-11-19 株式会社钟化 For detecting primer sets, probe, kit and the method for mycobacterium kansasii
CN112481399A (en) * 2020-12-21 2021-03-12 上海国际旅行卫生保健中心(上海海关口岸门诊部) Primer group for mycobacterium tuberculosis MLST typing based on second-generation sequencing technology and library construction method
CN115029453A (en) * 2021-11-16 2022-09-09 江汉大学 MNP (protein-protein) marker site of streptococcus pyogenes, primer composition, kit and application of MNP marker site
EP3936615A4 (en) * 2019-03-04 2023-04-19 Mitsui Chemicals, Inc. Method for determining whether organism having cell wall exists and method for identifying organism having cell wall

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008151023A2 (en) 2007-06-01 2008-12-11 Ibis Biosciences, Inc. Methods and compositions for multiple displacement amplification of nucleic acids

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5851763A (en) * 1992-09-17 1998-12-22 Institut Pasteur Rapid detection of antibiotic resistance in mycobacterium tuberculosis
US5658733A (en) * 1994-04-18 1997-08-19 Mayo Foundation For Medical Education And Research Detection of isoniazid resistant strains of M. tuberculosis
WO2000036142A1 (en) * 1998-12-11 2000-06-22 Visible Genetics Inc. METHOD AND KIT FOR THE CHARACTERIZATION OF ANTIBIOTIC-RESISTANCE MUTATIONS IN MYCOBACTERIUM TUBERCULOSIS
US6892139B2 (en) * 1999-01-29 2005-05-10 The Regents Of The University Of California Determining the functions and interactions of proteins by comparative analysis
JP3565752B2 (en) * 1999-11-02 2004-09-15 株式会社海洋バイオテクノロジー研究所 Identification and specific detection of slow-growing mycobacteria using characteristic nucleotide sequences present in DNA gyrase gene
US20030027135A1 (en) * 2001-03-02 2003-02-06 Ecker David J. Method for rapid detection and identification of bioagents
KR100454585B1 (en) * 2001-10-09 2004-11-02 주식회사 에스제이하이테크 Microarray comprising probes for Mycobacteria genotyping, M. tuberculosis strain differentiation and antibiotic-resistance detection
WO2004061134A1 (en) * 2002-12-27 2004-07-22 University Of Medicine And Dentistry Of New Jersey Method for single nucleotide polymorphism detection
GB0428255D0 (en) * 2004-12-23 2005-01-26 Health Prot Agency Detection of nucleic acid mutations

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of EP2179041A4 *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011047307A1 (en) 2009-10-15 2011-04-21 Ibis Biosciences, Inc. Multiple displacement amplification
EP2957641A1 (en) 2009-10-15 2015-12-23 Ibis Biosciences, Inc. Multiple displacement amplification
EP3225695A1 (en) 2009-10-15 2017-10-04 Ibis Biosciences, Inc. Multiple displacement amplification
US9890408B2 (en) 2009-10-15 2018-02-13 Ibis Biosciences, Inc. Multiple displacement amplification
WO2011140237A2 (en) * 2010-05-04 2011-11-10 The Government Of The Usa Of America As Represented By The Secretary Of The Department Of Health And Human Services, Centers For Disease Control And Prevention Process for detection of multidrug resistant tuberculosis using real-time pcr and high resolution melt analysis
WO2011140237A3 (en) * 2010-05-04 2012-04-05 The Government Of The Usa Of America As Represented By The Secretary Of The Department Of Health And Human Services, Centers For Disease Control And Prevention Process for detection of multidrug resistant tuberculosis using real-time pcr and high resolution melt analysis
WO2013132443A1 (en) * 2012-03-06 2013-09-12 Vela Operations Pte. Ltd. Real-time pcr detection of mycobacterium tuberculosis complex
CN110475863A (en) * 2017-01-19 2019-11-19 株式会社钟化 For detecting primer sets, probe, kit and the method for mycobacterium kansasii
EP3572509A4 (en) * 2017-01-19 2020-08-26 Kaneka Corporation Primer set, probe, kit, and method for detecting mycobacterium kansasii
CN110475863B (en) * 2017-01-19 2023-10-13 株式会社钟化 Primer group, probe, kit and method for detecting mycobacterium kansasii
CN108165560B (en) * 2017-12-01 2021-06-08 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and application thereof
CN108165561A (en) * 2017-12-01 2018-06-15 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and its application
CN108165562B (en) * 2017-12-01 2021-06-08 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and application thereof
CN108165562A (en) * 2017-12-01 2018-06-15 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and its application
CN108165561B (en) * 2017-12-01 2021-06-18 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and application thereof
CN108165560A (en) * 2017-12-01 2018-06-15 北京蛋白质组研究中心 Mycobacterium tuberculosis H37Rv encoding gene and its application
EP3936615A4 (en) * 2019-03-04 2023-04-19 Mitsui Chemicals, Inc. Method for determining whether organism having cell wall exists and method for identifying organism having cell wall
CN112481399A (en) * 2020-12-21 2021-03-12 上海国际旅行卫生保健中心(上海海关口岸门诊部) Primer group for mycobacterium tuberculosis MLST typing based on second-generation sequencing technology and library construction method
CN112481399B (en) * 2020-12-21 2022-08-30 上海国际旅行卫生保健中心(上海海关口岸门诊部) Primer group for mycobacterium tuberculosis MLST typing based on second-generation sequencing technology and library construction method
CN115029453A (en) * 2021-11-16 2022-09-09 江汉大学 MNP (protein-protein) marker site of streptococcus pyogenes, primer composition, kit and application of MNP marker site
CN115029453B (en) * 2021-11-16 2023-06-16 江汉大学 MNP (MNP) marking site of streptococcus pyogenes, primer composition, kit and application of MNP marking site

Also Published As

Publication number Publication date
EP2179041A2 (en) 2010-04-28
EP2179041A4 (en) 2010-12-22
ZA201000218B (en) 2012-04-25
WO2009017902A3 (en) 2009-10-15

Similar Documents

Publication Publication Date Title
US8013142B2 (en) Compositions for use in identification of bacteria
JP5081144B2 (en) Composition for use in bacterial identification
EP2064332B1 (en) Targeted whole genome amplification method for identification of pathogens
EP1891244B1 (en) Compositions for use in identification of adenoviruses
EP2179041A2 (en) Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis
US20120100543A1 (en) Compositions for the use in identification of fungi
WO2006071241A2 (en) Compositions for use in identification of bacteria
US20080145847A1 (en) Methods for identification of sepsis-causing bacteria
WO2012044956A1 (en) Targeted genome amplification methods
US20110105531A1 (en) Compositions and methods for identification of subspecies characteristics of mycobacterium tuberculosis
WO2008127839A2 (en) Compositions for use in identification of bacteria
US20110065111A1 (en) Compositions For Use In Genotyping Of Klebsiella Pneumoniae
US20100291544A1 (en) Compositions for use in identification of strains of hepatitis c virus

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2008826817

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 12666239

Country of ref document: US