WO2008127839A2 - Compositions for use in identification of bacteria - Google Patents

Compositions for use in identification of bacteria

Info

Publication number
WO2008127839A2
Authority
WO
WIPO (PCT)
Prior art keywords
seq
primer
sequence identity
primer pair
oligonucleotide
Prior art date
Application number
PCT/US2008/057717
Other languages
French (fr)
Other versions
WO2008127839A3 (en)
Inventor
David J. Ecker
Rangarajan Sampath
Thomas A. Hall
Lawrence Blyn
Feng Li
Original Assignee
Ibis Biosciences, Inc.
Priority date
Filing date
Publication date
Application filed by Ibis Biosciences, Inc.
Priority to US12/532,809 (US20110256541A1)
Priority to EP08780484A (EP2076612A2)
Publication of WO2008127839A2
Publication of WO2008127839A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/16 Primer sets for multiplex assays

Definitions

  • the present invention provides compositions, kits and methods for rapid identification and quantification of bacteria by molecular mass and base composition analysis.
  • a problem in determining the cause of a natural infectious outbreak or a bioterrorist attack is the sheer variety of organisms that can cause human disease. There are over 1400 organisms infectious to humans; many of these have the potential to emerge suddenly in a natural epidemic or to be used in a malicious attack by bioterrorists (Taylor et al. Philos. Trans. R. Soc. London B. Biol. Sci., 2001, 356, 983-989). This number does not include numerous strain variants, bioengineered versions, or pathogens that infect plants or animals.
  • PCR polymerase chain reaction
  • Mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated. DNA chips with specific probes can only determine the presence or absence of specifically anticipated organisms. Because there are hundreds of thousands of species of benign bacteria, some very similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed to identify a particular organism.
  • oligonucleotide primers and compositions and kits containing the oligonucleotide primers which define bacterial bioagent identifying amplicons and, upon amplification, produce corresponding amplification products whose molecular masses provide the means to identify bacteria, for example, at and below the species taxonomic level.
  • oligonucleotide primers, oligonucleotide primer pairs, compositions and kits comprising the same, and methods for their use in rapid identification, characterization and quantification of bacteria (also referred to herein as bacterial bioagents) by molecular mass and base composition analysis.
  • the bacteria are members of the Staphylococcus genus. In a preferred embodiment, they are members of the Staphylococcus aureus species.
  • the forward and reverse primer members of the oligonucleotide primer pairs are configured to amplify one or more nucleic acids from bioagents, thereby generating amplicons (amplification products) for the nucleic acids.
  • the primers generate bioagent identifying nucleic acid amplicons. The amplicons are preferably generated from gene sequences within the nucleic acid.
  • Each of the oligonucleotide primer pairs comprises a forward and a reverse primer.
  • each of the forward and reverse primers comprises between 13 and 35 linked nucleotides in length.
  • the primer may comprise 13, 14, 15, 16, 17, 18, 19,
  • the forward primer of the oligonucleotide primer pair comprises between 70% and 100% sequence identity with SEQ ID NO.: 1465.
  • the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1465.
  • the forward primer comprises at least 80% sequence identity with SEQ ID NO.: 1465.
  • the forward primer comprises at least 90% sequence identity with SEQ ID NO.: 1465.
  • the forward primer comprises at least 95% sequence identity with SEQ ID NO.: 1465.
  • the forward primer comprises 100% sequence identity with SEQ ID NO.: 1465.
  • the forward primer is SEQ ID NO.: 1465 with 0-10 nucleotide deletions, additions, and/or substitutions.
  • the forward primer is SEQ ID NO.: 1465.
  • the reverse primer of the oligonucleotide primer pair comprises between 70% and 100% sequence identity with SEQ ID NO.: 1466.
  • the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1466.
  • the reverse primer comprises at least 80% sequence identity with SEQ ID NO.: 1466.
  • the reverse primer comprises at least 90% sequence identity with SEQ ID NO.: 1466.
  • the reverse primer comprises at least 95% sequence identity with SEQ ID NO.: 1466.
  • the reverse primer comprises 100% sequence identity with SEQ ID NO.: 1466.
  • the reverse primer is SEQ ID NO.: 1466 with 0-10 nucleotide deletions, additions, and/or substitutions.
  • the reverse primer is SEQ ID NO.: 1466.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1465.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1466.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1465 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1466.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 288.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1269.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 288 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1269.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 698.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1420.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 698 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1420.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 217.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1167.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 217 and wherein the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1167.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 399.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1041.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 399 and wherein the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1041.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 430.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1321.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 430 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1321.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 174.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 853.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 174 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 853.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 172.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1360.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 172 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1360.
  • One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 205.
  • Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 876.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 205 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 876.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 456.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1261.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 456 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1261.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 437.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1137.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1231.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 456 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1231 or with SEQ ID NO.: 1137.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 530.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 891.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 530 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 891.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 474.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 869.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 474 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 869.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 268.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1284.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 268 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1284.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 418.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1301.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 418 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1301.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 318.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1300.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 318 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1300.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 440.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1076.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 440 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1076.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 219.
  • Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1013.
  • Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 219 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1013.
  • kits comprising one or more of the oligonucleotide primer pairs.
  • the kit comprises an oligonucleotide primer pair comprising a forward primer that comprises at least 70% sequence identity with SEQ ID NO.: 1465 and a reverse primer that comprises at least 70% sequence identity with SEQ ID NO.: 1466, the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1467 and the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1468, or the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1469 and the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1470.
  • the primer pair comprises at least 70% sequence identity with SEQ ID NO.: 1465:SEQ ID NO.: 1466, SEQ ID NO.: 1467:SEQ ID NO.: 1468, or SEQ ID NO.: 1469: SEQ ID NO.: 1470.
  • the kit comprises at least one additional oligonucleotide primer pair that is configured to generate an amplicon between 45 and 200 linked nucleotides in length, and comprises a forward and a reverse primer, each comprising between 13 and 35 linked nucleotides in length and each configured to hybridize to conserved sequence regions within a Staphylococcus aureus gene, said gene selected from the group consisting of: ermA, ermC, pvluk, nuc, tufB, mecA, mec-R1, tsst1, and mupR, arcC, aroE, gmk, pta, tpi and yqi.
  • each of the at least one additional oligonucleotide primer pair comprises at least 70% sequence identity with a primer pair selected from: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, SEQ ID NO.: 437:SEQ ID NO.: 1232, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO: 3
  • the kit comprises eight primer pairs, said eight oligonucleotide primer pairs having at least 70% sequence identity to: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 1465:SEQ ID NO.: 1466, SEQ ID NO.: 1467:SEQ ID NO.: 1468, or SEQ ID NO.: 1469:SEQ ID NO.: 1470.
  • the kit comprises eight oligonucleotide primer pairs consisting of SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 1465:SEQ ID NO.: 1466, SEQ ID NO.: 1467:SEQ ID NO.: 1468, or SEQ ID NO.: 1469:SEQ ID NO.: 1470.
  • the kit further comprises eight additional primer pairs, comprising at least 70% sequence identity with SEQ ID NO.: 437:SEQ ID NO.: 1232, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
  • the eight additional primer pairs consist of: SEQ ID NO.: 437:SEQ ID NO.: 1232, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
  • the kit is a kit for identifying a Staphylococcus aureus bioagent comprising: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 288 and a reverse primer with at least 70% sequence identity with SEQ ID NO.
  • a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 698 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1420; a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 217 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.
  • a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 399 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1041; a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 456 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1261; a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 430 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1321; a seventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 174 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 853; and an eighth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with
  • the kit comprises the eight oligonucleotide primer pairs:
  • the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 288 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1269, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 698 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.
  • a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 217 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1167
  • a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 399 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1041
  • a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 456, and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1261
  • a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 430 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1321
  • a seventh oligonucleotide primer pair comprising a forward primer with at
  • the kit comprises eight oligonucleotide primer pairs consisting of: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.:853, and SEQ ID NO.: 205:SEQ ID NO.:876.
  • the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 288 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1269, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 698 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.
  • a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 217 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1167
  • a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 399 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1041
  • a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 456 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1261
  • a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 430 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1321
  • a seventh oligonucleotide primer pair comprising a forward primer with at
  • the kit comprises eight oligonucleotide primer pairs consisting of: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 1465:SEQ ID NO.: 1466.
  • the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 437 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.
  • a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 530 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 891
  • a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 474 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 869
  • a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 268 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1284
  • a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 418 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1301
  • a sixth oligonucleotide primer pair comprising a forward primer with at least
  • the kit comprises eight oligonucleotide primer pairs consisting of: SEQ ID NO.: 437:SEQ ID NO.: 1137, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
  • the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 437 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1232, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 530 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.:891, a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 474 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 869, a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 268 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1284
  • the kit comprises eight oligonucleotide primer pairs consisting of: SEQ ID NO : 437: SEQ ID NO : 1232, SEQ ID NO.: 530:SEQ ID NO.:891, SEQ ID NO.: 474:SEQ ID NO.:869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
  • the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 437 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1232, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 530 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.:891, a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 474 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 869, a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 268 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1284
  • a tenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 698 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1420
  • an eleventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 217 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1167
  • a twelfth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 399 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.
  • a thirteenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 456 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1261
  • a fourteenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 430 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 1321
  • a fifteenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 174 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.: 853
  • a sixteenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with: SEQ ID NO.: 205 and a reverse primer with at least 70% sequence identity with: SEQ ID NO.:876.
  • each of the oligonucleotide primer pairs is configured to generate an amplicon comprising between 45 and 200 linked nucleotides in length
  • the forward primer comprises between 13 and 35 linked nucleotides in length and is configured to hybridize within a first conserved sequence region of a Staphylococcus aureus gene sequence
  • the reverse primer comprises between 13 and 35 linked nucleotides in length and is configured to hybridize within a second conserved sequence region of said Staphylococcus aureus gene sequence.
  • At least one of the forward primer and the reverse primer comprises at least one modified nucleobase.
  • at least one of the at least one modified nucleobase is a mass modified nucleobase.
  • the mass modified nucleobase is 5-Iodo-C.
  • it comprises a mass modified tag.
  • at least one of the at least one modified nucleobase is a universal nucleobase, for example, inosine.
  • primer pair comprises at least one non-templated T residue on the 5'-end.
  • at least one of the forward primer and the reverse primer comprises at least one non-template tag.
  • At least one of the forward primer and the reverse primer comprises a non-templated T residue on the 5'-end. In another embodiment, at least one of the forward primer and the reverse primer lacks a non-templated T residue on the 5'-end.
  • kits that comprise one or more of the primer pairs.
  • each member of the one or more primer pairs of the kit is of a length of between 13 and 35 linked nucleotides and has 70% to 100% sequence identity with the corresponding member from any of the primer pairs listed in Table 2.
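The length and identity limits above can be checked mechanically. The following is a minimal sketch only: it assumes an ungapped, position-by-position definition of sequence identity (the text does not specify how identity is computed), and the function names and the example sequences are hypothetical.

```python
def percent_identity(primer: str, reference: str) -> float:
    """Ungapped identity, as a percentage of the longer of the two sequences.

    Assumption: no alignment gaps are considered; real identity calculations
    usually rely on a pairwise alignment.
    """
    primer, reference = primer.upper(), reference.upper()
    longest = max(len(primer), len(reference))
    if longest == 0:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(primer, reference))
    return 100.0 * matches / longest


def meets_kit_criteria(primer: str, reference: str,
                       min_len: int = 13, max_len: int = 35,
                       min_identity: float = 70.0) -> bool:
    """True if the primer is 13-35 nucleotides long and >= 70% identical."""
    return (min_len <= len(primer) <= max_len
            and percent_identity(primer, reference) >= min_identity)


# Made-up sequences, not taken from Table 2.
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGTACGTACGTACAA"  # differs at the last two positions
print(percent_identity(candidate, reference), meets_kit_criteria(candidate, reference))
```

A gapped alignment would be the more usual basis for an identity figure; the ungapped comparison is used here only to keep the sketch short.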
  • kits comprise at least one calibration polynucleotide for use in quantitation of bacteria in a given sample, and also for use as a positive control for amplification.
  • kits further comprise at least one anion exchange functional group linked to a magnetic bead.
  • the method is for identification of a bioagent in a sample.
  • the bioagent is a bacterial bioagent, preferably a Staphylococcus aureus bioagent.
  • Nucleic acid from the sample is amplified using the oligonucleotide primer pairs described above to obtain at least one amplification product.
  • the amplification product is between 45 and 200 linked nucleotides in length.
  • the molecular mass of the amplification product is determined by mass spectrometry.
  • the base composition of the amplification product is calculated from the determined molecular mass.
  • the molecular mass and/or base composition is compared to or queried against a database comprising a plurality of base compositions or molecular masses.
  • each base composition/molecular mass within the plurality of base compositions and/or molecular masses in the database is indexed to the primer pair and to a bioagent.
  • a match between the calculated base composition or the determined molecular mass with a base composition or molecular mass comprised in the database identifies the bioagent in the sample.
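The lookup described above, in which a database indexed by primer pair and bioagent is queried with a calculated base composition, can be sketched as a dictionary keyed on both values. This is an illustrative sketch only: the primer pair numbers, base compositions and bioagent names below are invented, and the names `DATABASE` and `identify` are hypothetical.

```python
from typing import Dict, List, Tuple

BaseComposition = Tuple[int, int, int, int]  # counts of (A, G, C, T)

# Hypothetical entries: every number and name here is invented for illustration.
DATABASE: Dict[Tuple[int, BaseComposition], List[str]] = {
    (1, (25, 30, 22, 18)): ["bioagent X"],
    (1, (26, 29, 22, 18)): ["bioagent Y"],
    (2, (30, 20, 20, 30)): ["bioagent X", "bioagent Z"],
}


def identify(primer_pair: int, composition: BaseComposition) -> List[str]:
    """Return the bioagents indexed to this primer pair and base composition."""
    return list(DATABASE.get((primer_pair, composition), []))


print(identify(1, (25, 30, 22, 18)))
print(identify(1, (1, 1, 1, 1)))  # no database match
```

When one primer pair returns more than one bioagent, as for pair 2 in this toy table, results from additional primer pairs could be intersected to narrow the identification.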
  • the mass spectrometry used to determine the molecular mass is electrospray ionization (ESI) time of flight (TOF) mass spectrometry or ESI Fourier transform ion cyclotron resonance (FTICR) mass spectrometry, for example.
  • Other mass spectrometry techniques can also be used to measure the molecular mass of bacterial bioagent identifying amplicons.
  • the identification in the method comprises detecting the presence or absence of a bacterial bioagent in a sample. In another embodiment, it comprises determining the presence or absence of virulence of the bioagent in the sample. In another embodiment, the identifying comprises identifying one or more sub-species characteristics of the bioagent in the sample. In another embodiment, the identifying comprises determining sensitivity or resistance of the bioagent to a drug, preferably an antibiotic.
  • the methods are for determination of the quantity of an unknown bacterial bioagent in a sample.
  • the sample is contacted with the primer pair and a known quantity of a calibration polynucleotide comprising a calibration sequence.
  • Nucleic acid from the unknown bioagent in the sample is concurrently amplified with the composition described above and nucleic acid from the calibration polynucleotide in the sample is concurrently amplified with the composition described above to obtain a first amplification product comprising a bacterial bioagent identifying amplicon and a second amplification product comprising a calibration amplicon.
  • the molecular masses and abundances for the bacterial bioagent identifying amplicon and the calibration amplicon are determined.
  • the bacterial bioagent identifying amplicon is distinguished from the calibration amplicon based on molecular mass and comparison of bacterial bioagent identifying amplicon abundance and calibration amplicon abundance indicates the quantity of bacterium in the sample.
  • the base composition of the bacterial bioagent identifying amplicon is determined.
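The calibration steps above can be sketched as two small functions: one assigns measured peaks to the bioagent identifying amplicon or the calibration amplicon by molecular mass, the other compares their abundances. The simple proportional relationship is an assumption made for this sketch (it presumes both templates amplify with equal efficiency); the text states only that the comparison indicates the quantity. All names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple


def assign_peaks(peaks: List[Tuple[float, float]],
                 expected_masses: Dict[str, float],
                 tolerance_da: float = 1.0) -> Dict[str, float]:
    """Sum peak abundances for each expected amplicon, matched by mass.

    peaks: (molecular mass in Da, abundance) pairs.
    expected_masses: label -> expected molecular mass in Da.
    """
    assigned: Dict[str, float] = {}
    for mass, abundance in peaks:
        for label, expected in expected_masses.items():
            if abs(mass - expected) <= tolerance_da:
                assigned[label] = assigned.get(label, 0.0) + abundance
    return assigned


def estimate_quantity(bioagent_abundance: float,
                      calibrant_abundance: float,
                      calibrant_copies: float) -> float:
    """Assumed proportional model: copies scale with the abundance ratio."""
    if calibrant_abundance <= 0:
        raise ValueError("calibrant abundance must be positive")
    return calibrant_copies * bioagent_abundance / calibrant_abundance


# Invented example values.
peaks = [(30000.2, 300.0), (31000.1, 100.0)]
expected = {"bioagent": 30000.0, "calibrant": 31000.0}
found = assign_peaks(peaks, expected)
print(estimate_quantity(found["bioagent"], found["calibrant"], calibrant_copies=100))
```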
  • the methods comprise detecting or quantifying bacteria by combining a nucleic acid amplification process with molecular mass determination.
  • such methods identify or otherwise analyze the bacterium by comparing mass information from an amplification product with a calibration or control product. Such methods can be carried out in a highly multiplexed and/or parallel manner allowing for the analysis of as many as 300 samples per 24 hours on a single mass measurement platform.
  • the accuracy of the mass determination methods in some embodiments provided herein allows discrimination between different bacteria such as, for example, various genotypes and drug resistant strains of Staphylococcus aureus.
  • Figure 1 process diagram illustrating a representative primer pair selection process.
  • Figure 2 process diagram illustrating an embodiment of the calibration method.
  • Figure 3 common pathogenic bacteria and primer pair coverage.
  • the primer pair number in the upper right hand corner of each polygon indicates that the primer pair can produce a bioagent identifying amplicon for all species within that polygon.
  • Figure 4 a representative 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples (labeled NHRC samples) closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.
  • Figure 5 a representative mass spectrum of amplification products indicating the presence of bioagent identifying amplicons of Streptococcus pyogenes, Neisseria meningitidis, and Haemophilus influenzae obtained from amplification of nucleic acid from a clinical sample with primer pair number 349 which targets 23S rRNA. Experimentally determined molecular masses and base compositions for the sense strand of each amplification product are shown.
  • Figure 6 a representative mass spectrum of amplification products representing a bioagent identifying amplicon of Streptococcus pyogenes, and a calibration amplicon obtained from amplification of nucleic acid from a clinical sample with primer pair number 356 which targets rplB.
  • the experimentally determined molecular mass and base composition for the sense strand of the Streptococcus pyogenes amplification product is shown.
  • Figure 7 a representative mass spectrum of an amplified nucleic acid mixture which contained the Ames strain of Bacillus anthracis, a known quantity of combination calibration polynucleotide (SEQ ID NO: 1464), and primer pair number 350 which targets the capC gene on the virulence plasmid pXO2 of Bacillus anthracis. Calibration amplicons produced in the amplification reaction are visible in the mass spectrum as indicated and abundance data (peak height) are used to calculate the quantity of the Ames strain of Bacillus anthracis.
  • the term “abundance” refers to an amount.
  • the amount may be described in terms of concentration, using units common in molecular biology such as “copy number” or “pfu” (plaque-forming unit), which are well known to those with ordinary skill. Concentration may be relative to a known standard or may be absolute.
  • the primer pairs and methods provided herein determine the abundance of one or more bioagents in a sample.
  • amplifiable nucleic acid is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” also comprises “sample template.”
  • amplification reagents refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification, excluding primers, nucleic acid template, and the amplification enzyme.
  • amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
  • the term “analogous,” when used in the context of comparison of bioagent identifying amplicons, indicates that the bioagent identifying amplicons being compared are produced with the same pair of primers.
  • bioagent identifying amplicon "A” and bioagent identifying amplicon "B”, produced with the same pair of primers are analogous with respect to each other.
  • Bioagent identifying amplicon "C”, produced with a different pair of primers is not analogous to either bioagent identifying amplicon "A” or bioagent identifying amplicon "B".
  • anion exchange functional group refers to a positively charged functional group capable of binding an anion through an electrostatic interaction.
  • anion exchange functional groups are the amines, including primary, secondary, tertiary and quaternary amines.
  • bacteria refers to any member of the groups of eubacteria and archaebacteria.
  • a “base composition probability cloud” is a representation of the diversity in base composition resulting from a variation in sequence that occurs among different isolates of a given species.
  • the “base composition probability cloud” represents the base composition constraints for each species and is typically visualized using a pseudo four-dimensional plot.
  • a “bioagent” is any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus.
  • bioagents include, but are not limited to, cells (including but not limited to human clinical samples, bacterial cells and other pathogens), viruses, fungi, protists, parasites, and pathogenicity markers (including but not limited to pathogenicity islands, antibiotic resistance genes, virulence factors, toxin genes and other bioregulating compounds).
  • Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered.
  • a "pathogen” is a bioagent which causes a disease or disorder.
  • the term "unknown bioagent” can mean either: (i) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003), which is also called a "true unknown bioagent,” and/or (ii) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus for example) but which is not known to be in a sample to be analyzed and/or (iii) a bioagent that is known or suspected of being present in a sample but whose sub-species characteristics are not known (such as a bacterial resistance genotype like the QRDR region of Staphylococcus aureus species).
  • if the method of US2005-0266397 were employed subsequent to April 2003 to identify the SARS coronavirus in a clinical sample, only the second meaning (ii) of "unknown" bioagent would apply, because the SARS coronavirus had by then become known to science, but it was not known what bioagent was present in the sample.
  • a “bioagent division” is defined as a group of bioagents above the species level and includes, but is not limited to, orders, families, classes, clades, genera or other such groupings of bioagents above the species level.
  • virus refers to obligate, ultramicroscopic, parasites that are incapable of autonomous replication (i.e., replication requires the use of the host cell's machinery). Viruses can survive outside of a host cell but cannot replicate.
  • biological product refers to any product originating from an organism. Biological products are often products of processes of biotechnology. Examples of biological products include, but are not limited to: cultured cell lines, cellular components, antibodies, proteins and other cell-derived biomolecules, growth media, growth harvest fluids, natural products and bio-pharmaceutical products.
  • biowarfare agent and “bioweapon” are synonymous and refer to a bacterium, virus, fungus or protozoan that could be deployed as a weapon to cause bodily harm to individuals.
  • military or terrorist groups may be implicated in deployment of biowarfare agents.
  • calibration amplicon refers to a nucleic acid segment representing an amplification product obtained by amplification of a calibration sequence with a pair of primers configured to produce a bioagent identifying amplicon.
  • calibration sequence refers to a polynucleotide sequence to which a given pair of primers hybridizes for the purpose of producing an internal (i.e., included in the reaction) calibration standard amplification product for use in determining the quantity of a bioagent in a sample.
  • the calibration sequence may be expressly added to an amplification reaction, or may already be present in the sample prior to analysis.
  • the term “codon” refers to a set of three adjoined nucleotides (a triplet) that codes for an amino acid or a termination signal.
  • the term "codon base composition analysis,” refers to determination of the base composition of an individual codon by obtaining a bioagent identifying amplicon that includes the codon.
  • the bioagent identifying amplicon will at least include regions of the target nucleic acid sequence to which the primers hybridize for generation of the bioagent identifying amplicon as well as the codon being analyzed, located between the two primer hybridization regions.
  • primer pairs are synonymous terms referring to pairs of oligonucleotides (herein called “primers” or “oligonucleotide primers”) that are configured to bind to conserved sequence regions of a bioagent nucleic acid (that is conserved among two or more bioagents) and to generate bioagent identifying amplicons.
  • the bound primers flank an intervening variable region of the bioagent between the conserved sequences.
  • the primer pairs yield amplicons that provide base composition variability between two or more bioagents.
  • the primer pairs are also configured to generate amplicons that are amenable to molecular mass analysis.
  • Each primer pair comprises two primer pair members.
  • the primer pair members are a "forward primer” (“forward primer pair member,” or “forward member”), which comprises at least a percentage of sequence identity with the top strand of the reference sequence used in configuring the primer pair, and a “reverse primer” (“reverse primer pair member” or “reverse member”), which comprises at least a percentage of reverse complementarity with the top strand of the reference sequence used in configuring the primer pair.
  • Primer pair configuration is well known in the art and is described in detail herein.
  • Primer pair nomenclature includes the identification of a reference sequence.
  • the forward primer for primer pair number 3106 is named TSST1_NC002758.2-2137509-2138213_519_546_F.
  • This forward primer name indicates that the forward primer (“_F”) hybridizes to residues 519-546 ("519_546") of a reference sequence, which in this case is represented by a sequence extraction of coordinates 2137509-2138213 from GenBank gi number 57634611 (corresponding to the GenBank number NC002758.2, as is indicated by the prefix "TSST1_NC002758.2" and cross-reference in Table 3).
  • the reference sequence is the gene within a Staphylococcus aureus genome encoding tsst1.
  • Primer pair name codes for the primers provided herein are defined in Table 3, which lists gene abbreviations and GenBank gi numbers that correspond with each primer name code. Sequences of the primers are also provided.
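The naming convention described above can be unpacked with a regular expression. This is a sketch under assumptions: it covers only the pattern seen in the one example name (gene code, accession, extraction coordinates, primer coordinates, direction), it presumes the gene code contains no underscore, and the function and field names are hypothetical. Other primer names in Table 3 may not follow this exact layout.

```python
import re
from typing import Dict, Union

# Pattern inferred from a single example name; treat it as an assumption.
_PRIMER_NAME = re.compile(
    r"^(?P<gene>[^_]+)_(?P<accession>[^-_]+)"
    r"-(?P<extract_start>\d+)-(?P<extract_end>\d+)"
    r"_(?P<primer_start>\d+)_(?P<primer_end>\d+)_(?P<direction>[FR])$"
)


def parse_primer_name(name: str) -> Dict[str, Union[str, int]]:
    """Split a primer name into gene code, accession, coordinates and direction."""
    match = _PRIMER_NAME.match(name.replace(" ", ""))  # tolerate stray spaces
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    fields: Dict[str, Union[str, int]] = dict(match.groupdict())
    for key in ("extract_start", "extract_end", "primer_start", "primer_end"):
        fields[key] = int(fields[key])
    # Length of the hybridization region, coordinates being inclusive.
    fields["primer_length"] = fields["primer_end"] - fields["primer_start"] + 1
    return fields


info = parse_primer_name("TSST1_NC002758.2-2137509-2138213_519_546_F")
print(info["gene"], info["accession"], info["primer_length"], info["direction"])
```

The computed hybridization length can then be compared with the 13 to 35 nucleotide primer length range stated for the primers.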
  • the primer pairs are selected and configured, however, to hybridize with two or more bioagents.
  • the reference sequence in the primer name is used merely to provide a reference, and not to indicate that the primers are selected and configured to hybridize with and generate a bioagent identifying amplicon only from the reference sequence. Rather, the primers hybridize with and generate amplicons from a number of sequences.
  • the sequences of the primer members of the primer pairs are not necessarily fully complementary to the conserved region of the reference bioagent. Rather, the sequences are configured to be "best fit" amongst a plurality of bioagents at these conserved binding sequences. Therefore, the primer members of the primer pairs have substantial complementarity with the conserved regions of the bioagents, including the reference bioagent.
  • the primers provided herein are configured to hybridize within conserved sequence regions of bioagent nucleic acids, which are conserved among two or more bioagents, that preferably flank an intervening variable region, which varies among two or more bioagents, and, upon amplification, yield amplification products which ideally provide enough variability to distinguish individual bioagents, and which are amenable to molecular mass analysis.
  • the conserved sequence regions are highly conserved sequence regions.
  • by “highly conserved” it is meant that the sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity among all, or at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% of species or strains.
  • the molecular mass of a given amplification product provides a means of identifying the bioagent from which it was obtained, due to the variability of the variable region, which preferably results in amplicons that vary in base composition among bioagents, for example, among different species or strains.
  • configuring of the primers involves selection of a variable region with appropriate variability to resolve the identity of a given bioagent.
  • Bioagent identifying amplicons are ideally specific to the identity of the bioagent.
  • variable region is used to describe a region that is flanked by the two conserved sequence regions to which the primers of a primer pair hybridize.
  • the variable region is a region that is flanked by the primers of any one primer pair described herein.
  • the region possesses distinct base compositions among at least two bioagents, such that at least one bioagent can be identified at the family, genus, species or sub-species level using the primer pairs and the methods provided herein.
  • the degree of variability between the at least two bioagents need only be sufficient to allow for identification using mass spectrometry analysis, as described herein. Such a difference can be as slight as a single nucleotide difference occurring between two bioagents.
  • primer pairs configured to prime amplification of a double stranded sequence are configured and named using one strand of the double stranded sequence as a reference.
  • the forward primer is the primer of the pair that comprises full or partial sequence identity to the one strand (usually the coding, or sense strand) of the sequence being used as a reference.
  • the reverse primer is the primer of the pair that comprises reverse complementarity to the one strand of the sequence being used as a reference.
  • the "plus” or “top” strand (the primary sequence as submitted to GenBank) of the nucleic acid to which the primers hybridize is used as a reference when designing primer pairs.
  • the forward primer will comprise identity and the reverse primer will comprise reverse complementarity, to the sequence listed in GenBank for the reference sequence.
  • the primer pair is configured using the "minus" or “bottom” strand (reverse complement of the primary sequence as submitted to and listed in GenBank).
  • the forward primer comprises sequence identity to the minus strand, and thus comprises reverse complementarity to the top strand, the sequence listed in GenBank.
  • the reverse primer comprises reverse complementarity to the minus strand, and thus comprises identity to the top strand.
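The strand conventions above reduce to one operation, the reverse complement. The sketch below covers the common case in which the top (plus) strand is the reference: the forward primer is copied from the top strand and the reverse primer is the reverse complement of a downstream region. The function names, the 1-based inclusive coordinates and the example sequence are assumptions for illustration.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unmodified DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primers_from_top_strand(top: str,
                            fwd_start: int, fwd_end: int,
                            rev_start: int, rev_end: int) -> Tuple[str, str]:
    """Derive a primer pair from top-strand coordinates (1-based, inclusive).

    The forward primer has identity to the top strand; the reverse primer is
    the reverse complement of its binding region on the top strand.
    """
    forward = top[fwd_start - 1:fwd_end].upper()
    reverse = reverse_complement(top[rev_start - 1:rev_end])
    return forward, reverse


# Made-up 12-nucleotide reference, far shorter than any real target.
forward, reverse = primers_from_top_strand("ATGCCGTTAGCA", 1, 4, 9, 12)
print(forward, reverse)
```

If the bottom strand is used as the reference instead, the same functions apply after taking the reverse complement of the GenBank sequence first.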
  • the primer pairs may be configured to generate an amplicon from "within a region" of a particular SEQ ID NO., which may comprise a specific region of the GenBank gi No. to which the primers were configured.
  • Configuring a primer pair to generate an amplicon from "within a region" of a particular nucleic acid means that each primer of the pair hybridizes to a portion of the reference sequence that is within that region.
  • shifting the coordinates of the portion of a reference sequence to which one or both primers hybridizes slightly, in one direction or the other relative to the region given, such that the portion is not entirely within the region will often result in an equally effective primer pair.
  • Such primer pairs are also encompassed by this description.
  • clade primer pair refers to a primer pair configured to produce bioagent identifying amplicons for species belonging to a clade group.
  • a clade primer pair may also be considered as a "speciating" primer pair which is useful for distinguishing among closely related species.
  • the primer pairs comprise "broad range survey primers," primers configured to identify an unknown bioagent as a member of a particular division (e.g., an order, family, class, clade, or genus). However, in some cases the broad range survey primers are also able to identify unknown bioagents at the species or sub-species level. In other embodiments, the primer pairs comprise "division-wide primers,” configured to identify a bioagent at the species level. In some embodiments, the primer pairs comprise "drill-down" primers, configured to identify a bioagent at the sub-species level.
  • sub-species level of identification includes, but is not limited to, strains, subtypes, variants, and isolates. Drill-down primers are not always required for identification at the sub-species level because broad range survey intelligent primers may, in some cases, provide sufficient identification resolution to accomplish this identification objective.
  • the term "speciating primer pair” refers to a primer pair configured to produce a bioagent identifying amplicon with the diagnostic capability of identifying species members of a group of genera or a particular genus of bioagents.
  • Primer pair number 2249 (SEQ ID NOs: 430: 1321), for example, is a speciating primer pair used to distinguish Staphylococcus aureus from other species of the genus Staphylococcus.
  • a "sub-species characteristic” is a genetic characteristic that provides the means to distinguish two members of the same bioagent species. For example, one viral strain could be distinguished from another viral strain of the same species by possessing a genetic change (e.g., a nucleotide deletion, addition or substitution) in one of the viral genes, such as the RNA-dependent RNA polymerase. Sub-species characteristics such as virulence genes and drug resistance genes are responsible for the phenotypic differences among the different strains of bacteria.
  • Properties of the primers may include any number of properties related to structure including, but not limited to: nucleobase length, which may be contiguous (linked together) or noncontiguous (for example, two or more contiguous segments joined by a linker or loop moiety); modified or universal nucleobases (used for specific purposes such as increasing hybridization affinity, preventing non-templated adenylation and modifying molecular mass); and percent complementarity to a given target sequence.
  • the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) related by the base-pairing rules.
  • Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids.
  • the degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. Either term may also be used in reference to individual nucleotides, especially within the context of polynucleotides. For example, a particular nucleotide within an oligonucleotide may be noted for its complementarity, or lack thereof, to a nucleotide within another nucleic acid strand, in contrast or comparison to the complementarity between the rest of the oligonucleotide and the nucleic acid strand.
  • the term “complement of a nucleic acid sequence” refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired with the 3' end of the other, is in "antiparallel association.”
  • Complementarity relates to base pairing ability.
  • a nucleobase that is complementary to another nucleobase can base pair with that other nucleobase.
  • Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids provided herein, and include, for example, inosine and 7-deazaguanine. Complementarity need not be perfect; stable duplexes may contain mismatched base pairs or unmatched bases.
  • where a first oligonucleotide is complementary to a region of a target nucleic acid and a second oligonucleotide has complementarity to the same region (or a portion of this region), a "region of overlap" exists along the target nucleic acid. The degree of overlap will vary depending upon the extent of the complementarity.
  • the term "substantial complementarity" means that a primer member of a primer pair comprises between about 70-100%, or between about 80-100%, or between about 90-100%, or between about 95-100%, or between about 99-100% sequence identity with the conserved binding sequence of any given bioagent.
  • ranges of identity are inclusive of all whole or partial numbers embraced within the recited range numbers. For example, and not limitation, 75.667%, 82%, 91.2435% and 97% sequence identity are all numbers that fall within the above recited range of 70% to 100%, therefore forming a part of this description.
  • amplicon and “bioagent identifying amplicon” refer to a nucleic acid generated using the primer pairs described herein.
  • the amplicon is preferably double stranded DNA; however, it may be RNA and/or DNA:RNA.
  • the amplicon comprises the sequences of the conserved regions/primer pairs and the intervening variable region. Since the primer pairs provided herein are configured such that two or more different bioagents, when amplified with a given primer pair, will yield amplicons with unique base composition signatures, the base composition signatures can be used to identify bioagents based on association with amplicons. As discussed herein, primer pairs are configured to generate amplicons from two or more bioagents.
  • the base composition of any given amplicon will include the primer pair, the complement of the primer pair, the conserved regions and the variable region from the bioagent that was amplified to generate the amplicon.
  • the incorporation of the configured primer pair sequences into any amplicon will replace the native bioagent sequences at the primer binding site, and complement thereof.
  • the resultant amplicons having the primer sequences generate the molecular mass data.
  • Amplicons having any native bioagent sequences at the primer binding sites, or complement thereof, are undetectable because of their low abundance. Such is accounted for when identifying one or more bioagents using any particular primer pair.
  • the amplicon further comprises a length that is compatible with mass spectrometry analysis. Bioagent identifying amplicons generate base composition signatures that are preferably unique to the identity of a bioagent.
  • the term "molecular mass” refers to the mass of a compound as determined using mass spectrometry.
  • the compound is preferably a nucleic acid, more preferably a double stranded nucleic acid, still more preferably a double stranded DNA nucleic acid and is most preferably an amplicon.
  • when the nucleic acid is double stranded, the molecular mass is determined for both strands.
  • the strands are separated either before introduction into the mass spectrometer, or the strands are separated by the mass spectrometer (for example, electro-spray ionization will separate the hybridized strands).
  • the molecular mass of each strand is measured by the mass spectrometer.
  • the term “mass spectrometry” refers to measurement of the mass of atoms or molecules. The molecules are first converted to ions, which are separated using electric or magnetic fields according to the ratio of their mass to electric charge. The measured masses are used to identify the molecules.
  • base composition refers to the number of each residue comprising an amplicon, without consideration for the linear arrangement of these residues in the strand(s) of the amplicon.
  • the amplicon residues comprise adenosine (A), guanosine (G), cytidine (C), (deoxy)thymidine (T), uracil (U), inosine (I), nitroindoles such as 5-nitroindole or 3-nitropyrrole, dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056), the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide, 2,6-diaminopurine, 5-propynyl
  • the mass-modified nucleobase comprises ¹⁵N or ¹³C or both ¹⁵N and ¹³C.
  • the non-natural nucleosides used herein include 5-propynyluracil, 5-propynylcytosine and inosine.
  • the base composition for an unmodified DNA amplicon is notated as $A_w G_x C_y T_z$, wherein w, x, y and z are each independently a whole number representing the number of said nucleoside residues in an amplicon.
  • Base compositions for amplicons comprising modified nucleosides are similarly notated to indicate the number of said natural and modified nucleosides in an amplicon.
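Because a base composition ignores the linear arrangement of residues, it can be obtained from a known sequence by simple counting. The following is a minimal sketch for unmodified DNA only; the function names and the plain-text output format are assumptions, and modified nucleosides would need additional symbols.

```python
from collections import Counter
from typing import Dict


def base_composition(strand: str) -> Dict[str, int]:
    """Count A, G, C and T residues in one unmodified DNA strand."""
    counts = Counter(strand.upper())
    unexpected = set(counts) - set("AGCT")
    if unexpected:
        raise ValueError(f"unsupported residues: {sorted(unexpected)}")
    return {base: counts.get(base, 0) for base in "AGCT"}


def format_composition(composition: Dict[str, int]) -> str:
    """Render the counts in A, G, C, T order as plain text."""
    return " ".join(f"{base}{composition[base]}" for base in "AGCT")


# Two different made-up sequences share the same base composition.
print(format_composition(base_composition("AAGCTT")))
print(base_composition("AAGCTT") == base_composition("TTCGAA"))
```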
  • Base compositions are calculated from a molecular mass measurement of an amplicon, as described below. The calculated base composition for any given amplicon is then compared to a database of base compositions. A match between the calculated base composition and a single database entry reveals the identity of the bioagent.
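One way to picture the calculation of a base composition from a measured mass is a brute-force search over A, G, C and T counts whose summed mass falls within a tolerance of the measurement. This is a sketch under stated assumptions, not the method of the text: the residue masses are approximate average masses for unmodified DNA, the end correction assumes 5'-OH and 3'-OH termini, and the 0.5 Da tolerance is arbitrary. Since the masses of both strands are determined, the sketch also keeps only forward-strand candidates whose Watson-Crick complement matches the reverse-strand mass.

```python
from typing import List, Tuple

# Approximate average residue masses in Da (assumed values for unmodified DNA).
MASS_A, MASS_G, MASS_C, MASS_T = 313.21, 329.21, 289.18, 304.20
END_CORRECTION = -61.96  # assumes 5'-OH and 3'-OH strand termini

Composition = Tuple[int, int, int, int]  # counts of (A, G, C, T)


def strand_mass(a: int, g: int, c: int, t: int) -> float:
    """Approximate average mass of a single DNA strand with the given counts."""
    return a * MASS_A + g * MASS_G + c * MASS_C + t * MASS_T + END_CORRECTION


def candidate_compositions(measured_mass: float,
                           tolerance_da: float = 0.5) -> List[Composition]:
    """Enumerate compositions whose mass is within tolerance of the measurement."""
    target = measured_mass - END_CORRECTION
    limit = target + tolerance_da
    hits: List[Composition] = []
    for a in range(int(limit // MASS_A) + 1):
        after_a = a * MASS_A
        for g in range(int((limit - after_a) // MASS_G) + 1):
            after_g = after_a + g * MASS_G
            for c in range(int((limit - after_g) // MASS_C) + 1):
                partial = after_g + c * MASS_C
                # The nearest whole number of T residues fills the remainder.
                t = round((target - partial) / MASS_T)
                if t >= 0 and abs(partial + t * MASS_T - target) <= tolerance_da:
                    hits.append((a, g, c, t))
    return hits


def consistent_with_both_strands(forward_mass: float, reverse_mass: float,
                                 tolerance_da: float = 0.5) -> List[Composition]:
    """Keep forward candidates whose complement (A<->T, G<->C) fits the reverse mass."""
    return [
        (a, g, c, t)
        for (a, g, c, t) in candidate_compositions(forward_mass, tolerance_da)
        if abs(strand_mass(t, c, g, a) - reverse_mass) <= tolerance_da
    ]


# Invented 46-nucleotide example: simulate both strand masses, then search.
forward_mass = strand_mass(12, 10, 11, 13)
reverse_mass = strand_mass(13, 11, 10, 12)
print(len(candidate_compositions(forward_mass)))
print(consistent_with_both_strands(forward_mass, reverse_mass))
```

A single strand mass usually fits several compositions at a realistic tolerance, which is why the second-strand constraint is useful for narrowing the candidates.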
  • base composition signature refers to the base composition generated by any one particular amplicon.
  • the base composition signature for each of one or more amplicons provides a fingerprint for identifying the bioagent(s) present in a sample.
  • the term "database” is used to refer to a collection of base composition and/or molecular mass data.
  • the base composition and/or molecular mass data in the database is indexed to bioagents and to primer pairs.
  • the base composition data reported in the database comprises the number of each nucleoside in an amplicon that would be generated for each bioagent using each primer pair.
  • the database can be populated by empirical data. In this aspect of populating the database, a bioagent is selected and a primer pair is used to generate an amplicon.
  • the amplicon's molecular mass is determined using a mass spectrometer and the base composition calculated therefrom.
  • An entry in the database is made to associate the base composition and/or molecular mass with the bioagent and the primer pair used.
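The indexing described above, in which each entry associates a base composition with a bioagent and the primer pair used, could be held in a simple in-memory mapping. The class name, the composition tuples and the organism labels below are hypothetical placeholders rather than data from the specification.

```python
from collections import defaultdict


class BaseCompositionDB:
    """Maps (primer pair number, (A, G, C, T) composition) to bioagent names."""

    def __init__(self):
        self._index = defaultdict(set)

    def add(self, primer_pair: int, composition: tuple, bioagent: str) -> None:
        """Record that this primer pair yields this composition for the bioagent."""
        self._index[(primer_pair, composition)].add(bioagent)

    def lookup(self, primer_pair: int, composition: tuple) -> set:
        """Return all bioagents indexed under this primer pair and composition."""
        return set(self._index.get((primer_pair, composition), set()))


if __name__ == "__main__":
    db = BaseCompositionDB()
    # Invented entries for illustration only.
    db.add(346, (25, 30, 22, 20), "organism X")
    db.add(346, (25, 30, 22, 20), "organism Y")
    db.add(347, (28, 27, 21, 24), "organism X")
    print(db.lookup(346, (25, 30, 22, 20)))
```

A lookup that returns more than one name corresponds to the case where a single primer pair does not resolve the bioagent.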
  • the database may also be populated using other databases comprising bioagent information. For example, using the GenBank database it is possible to perform electronic PCR using an electronic representation of a primer pair. This in silico method will provide the base composition for any or all selected bioagent(s) stored in the GenBank database. The information is then used to populate the base composition database as described above.
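A minimal sketch of the in silico approach, assuming exact primer matches on a single stored strand, is given below. Real electronic PCR tools tolerate mismatches and apply thermodynamic filters; the template, the primers and the function names here are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def electronic_pcr(template: str, forward: str, reverse: str):
    """Return the predicted amplicon (primer sites included), or None.

    Exact matching only: the forward primer must occur in the template and
    the reverse complement of the reverse primer must occur downstream of it.
    """
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    template = "TTACGTAGGCTAGCTAGGATCCGATTT"  # hypothetical stored sequence
    print(electronic_pcr(template, "ACGTAG", "CGGATC"))
```

The returned amplicon can then be reduced to a base composition and stored against the bioagent and primer pair.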
  • a base composition database can be in silico, a written table, a reference book, a spreadsheet or any form generally amenable to databases. Preferably, it is in silico.
  • the database can similarly be populated with molecular masses that are gathered either empirically or calculated from other sources such as GenBank.
  • nucleobase is synonymous with other terms in use in the art including “nucleotide,” “deoxynucleotide,” “nucleotide residue,” “deoxynucleotide residue,” “nucleotide triphosphate (NTP),” or “deoxynucleotide triphosphate (dNTP).”
  • a nucleobase includes natural and modified residues, as described herein.
  • a "wobble base” is a variation in a codon found at the third nucleotide position of a DNA triplet. Variations in conserved regions of sequence are often found at the third nucleotide position due to redundancy in the amino acid code.
  • Housekeeping gene refers to a gene encoding a protein or RNA involved in basic functions required for survival and reproduction of a bioagent.
  • Housekeeping genes include, but are not limited to, genes encoding RNA or proteins involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like.
  • the primers are configured to produce amplicons from within a housekeeping gene.
  • a "sub-species characteristic” is a genetic characteristic that provides the means to distinguish two members of the same bioagent species.
  • one bacterial strain could be distinguished from another bacterial strain of the same species by possessing a genetic change (e.g., a nucleotide deletion, addition or substitution) in one of the bacterial genes, for example, a gene conferring drug resistance or virulence.
  • triangulation identification means the employment of more than one primer pair to generate a corresponding amplicon for identification of a bioagent.
  • the more than one primer pair can be used in individual wells or in a multiplex PCR assay. Alternatively, PCR reactions may be carried out in separate wells, with a different primer pair in each well.
  • the amplicons are pooled into a single well or container which is then subjected to molecular mass analysis. The combination of pooled amplicons can be chosen such that the expected ranges of molecular masses of individual amplicons are not overlapping and thus will not complicate identification of signals.
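The pooling criterion above, that the expected mass ranges of the pooled amplicons should not overlap, amounts to an interval check. The sketch below assumes each amplicon is described by a hypothetical (low, high) mass range in daltons; the numbers are invented.

```python
def pool_is_resolvable(mass_ranges) -> bool:
    """Return True if no two (low, high) mass ranges overlap.

    After sorting by lower bound, any overlap must show up between
    neighbouring ranges, so only adjacent pairs are compared.
    """
    ordered = sorted(mass_ranges)
    for (_, prev_high), (low, _) in zip(ordered, ordered[1:]):
        if low <= prev_high:
            return False
    return True


if __name__ == "__main__":
    # Invented expected mass ranges (Da) for three pooled amplicons.
    print(pool_is_resolvable([(30000, 31000), (35000, 36000), (41000, 42500)]))
```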
  • Triangulation works as a process of elimination, wherein a first primer pair identifies that an unknown bioagent may be one of a group of bioagents. Subsequent primer pairs are used in triangulation identification to further refine the identity of the bioagent amongst the subset of possibilities generated with the earlier primer pair. Triangulation identification is complete when the identity of the bioagent is determined. The triangulation identification process is also used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al, J. Appl. Microbiol, 1999, 87, 270-278) in the absence of the expected signatures from the B.
  • a first pair of primers might determine that a given bioagent is a member of the Staphylococcus genus.
  • a second primer pair may identify the bioagent as a member of the Staphylococcus aureus species, while a third primer may identify a sub-species characteristic of the bioagent, for example, resistance to a particular antibiotic or strain information.
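The process of elimination described above can be sketched as repeated set intersection: each primer pair contributes the set of bioagents consistent with its observed base composition, and the candidate set shrinks until one identity remains. The database contents and labels below are hypothetical, and the function is an illustration rather than the procedure of the specification.

```python
def triangulate(observations, db) -> set:
    """Narrow the candidate bioagents using successive primer pair results.

    observations: iterable of (primer_pair, composition) keys, in the order
                  the primer pairs were applied.
    db:           dict mapping those keys to sets of bioagent names.
    """
    candidates = None
    for key in observations:
        matches = db.get(key, set())
        candidates = set(matches) if candidates is None else candidates & matches
        if len(candidates) <= 1:
            break  # identified, or no consistent candidate remains
    return candidates or set()


if __name__ == "__main__":
    # Invented entries: three primer pairs of increasing resolution.
    db = {
        (1, "comp-a"): {"organism A", "organism B", "organism C"},
        (2, "comp-b"): {"organism B", "organism C"},
        (3, "comp-c"): {"organism C"},
    }
    print(triangulate([(1, "comp-a"), (2, "comp-b"), (3, "comp-c")], db))
```

An empty result signals that the observations are mutually inconsistent with the database, which is one way an unexpected or engineered bioagent might show up.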
  • triangulation genotyping analysis primer pair is a primer pair configured to produce bioagent identifying amplicons for determining species types in a triangulation genotyping analysis.
  • single primer pair identification means that one or more bioagents can be identified using a single primer pair.
  • a base composition signature for an amplicon may singly identify one or more bioagents.
  • the term “etiology” refers to the causes or origins of diseases or abnormal physiological conditions.
  • duplex refers to the state of nucleic acids in which the base portions of the nucleotides on one strand are bound through hydrogen bonding to their complementary bases arrayed on a second strand. The condition of being in a duplex form reflects on the state of the bases of a nucleic acid. By virtue of base pairing, the strands of nucleic acid also generally assume the tertiary structure of a double helix, having a major and a minor groove. The assumption of the helical form is implicit in the act of becoming duplexed.
  • RNA having a non-coding function e.g., a ribosomal or transfer RNA
  • the RNA or polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or function is retained.
  • sequence identity is meant to be properly determined when the query sequence and the subject sequence are both described and aligned in the 5' to 3' direction.
  • Sequence alignment algorithms, such as BLAST, will return results in two different alignment orientations. In the Plus/Plus orientation, both the query sequence and the subject sequence are aligned in the 5' to 3' direction. On the other hand, in the Plus/Minus orientation, the query sequence is in the 5' to 3' direction while the subject sequence is in the 3' to 5' direction. It should be understood that with respect to the primers provided herein, sequence identity is properly determined when the alignment is designated as Plus/Plus.
  • Sequence identity may also encompass alternate or modified nucleobases that perform in a functionally similar manner to the regular nucleobases adenine, thymine, guanine and cytosine with respect to hybridization and primer extension in amplification reactions.
  • the two primers will have 100% sequence identity with each other.
  • Inosine (I) may be used as a replacement for G or T and effectively hybridize to C, A or U (uracil).
  • inosine replaces one or more C, A or U residues in one primer which is otherwise identical to another primer in sequence and length
  • the two primers will have 100% sequence identity with each other.
  • Other such modified or universal bases may exist which would perform in a functionally similar manner for hybridization and amplification reactions and will be understood to fall within this definition of sequence identity.
  • hybridization is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are influenced by such factors as the degree of complementarity between the nucleic acids, the stringency of the conditions involved, and the T_m of the formed hybrid. “Hybridization” methods involve the annealing of one nucleic acid to another, complementary nucleic acid, i.e., a nucleic acid having a complementary nucleotide sequence. The ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction is a well-recognized phenomenon.
  • PCR polymerase chain reaction
  • the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule.
  • the primers are extended with a polymerase so as to form a new pair of complementary strands.
  • the steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one "cycle”; there can be numerous "cycles") to obtain a high concentration of an amplified segment of the desired target sequence.
  • the length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter.
  • any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules.
  • the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
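Because each amplified segment serves as a template in later cycles, the number of copies grows geometrically with the cycle count. The sketch below uses the idealized textbook model in which every template is copied with a fixed per-cycle efficiency; the model and its parameter names are assumptions for illustration, and real reactions plateau as reagents are consumed.

```python
def copies_after(cycles: int, initial_copies: float = 1, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (1.0 = perfect
    doubling). Plateau effects are not modelled.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, copies_after(n))
```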
  • ePCR electronic PCR
  • polymerase refers to an enzyme having the ability to synthesize a complementary strand of nucleic acid from a starting template nucleic acid strand and free dNTPs.
  • polymerization means or “polymerization agent” refers to any agent capable of facilitating the addition of nucleoside triphosphates to an oligonucleotide.
  • Preferred polymerization means comprise DNA and RNA polymerases.
  • PCR product refers to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
  • mass-modifying tag refers to any modification to a given nucleotide which results in an increase in mass relative to the analogous non-mass modified nucleotide.
  • Mass-modifying tags can include heavy isotopes of one or more elements included in the nucleotide such as carbon- 13 for example.
  • Other possible modifications include addition of substituents such as iodine or bromine at the 5 position of the nucleobase for example.
  • microorganism as used herein means an organism too small to be observed with the unaided eye and includes, but is not limited to, bacteria, viruses, protozoans, fungi, and ciliates.
  • multi-drug resistant or “multiple-drug resistant” refers to a microorganism which is resistant to more than one of the antibiotics or antimicrobial agents used in the treatment of said microorganism.
  • non-template tag refers to a stretch of at least three guanine or cytosine nucleobases of a primer used to produce a bioagent identifying amplicon which are not complementary to the template.
  • a non-template tag is incorporated into a primer for the purpose of increasing the primer-duplex stability of later cycles of amplification by incorporation of extra G-C pairs which each have one additional hydrogen bond relative to an A-T pair.
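The stabilizing effect of a non-template tag can be illustrated by counting Watson-Crick hydrogen bonds, with two per A-T pair and three per G-C pair. In the sketch below the tag is placed at the 5' end of the primer, which is an assumption made for illustration; the primer and tag sequences are invented.

```python
# Hydrogen bonds contributed by each base when it is Watson-Crick paired.
HBONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def duplex_hbonds(seq: str) -> int:
    """Total hydrogen bonds if every base of seq is paired with its complement."""
    return sum(HBONDS[base] for base in seq.upper())


def add_tag(primer: str, tag: str = "GCG") -> str:
    """Prepend a G/C-only tag of at least three bases (5' placement assumed)."""
    if len(tag) < 3 or set(tag.upper()) - {"G", "C"}:
        raise ValueError("tag must be at least three G or C bases")
    return tag + primer


if __name__ == "__main__":
    primer = "ATGCATTAGC"  # hypothetical primer
    tagged = add_tag(primer)
    print(duplex_hbonds(primer), duplex_hbonds(tagged))
```

The extra bonds only count in later cycles, once the tag has been copied into the amplicon and has a complementary partner.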
  • nucleic acid sequence refers to the linear composition of the nucleic acid residues A, T, C, G, U, or any modifications thereof, within an oligonucleotide, nucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single or double stranded, and represent the sense or antisense strand
  • nucleotide analog refers to modified or non-naturally occurring nucleotides such as 5-propynyl pyrimidines (i.e., 5-propynyl-dTTP and 5-propynyl-dCTP), 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP). Nucleotide analogs include base analogs and comprise modified forms of deoxyribonucleotides as well as ribonucleotides.
  • oligonucleotide as used herein is defined as a molecule comprising two or more deoxyribonucleotides or ribonucleotides, preferably at least 5 nucleotides, more preferably at least about 13 to 35 nucleotides. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide.
  • the oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, PCR, or a combination thereof.
  • an end of an oligonucleotide is referred to as the "5'-end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring and as the "3'-end" if its 3' oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose ring.
  • a nucleic acid sequence even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends.
  • a first region along a nucleic acid strand is said to be upstream of another region if the 3' end of the first region is before the 5' end of the second region when moving along a strand of nucleic acid in a 5' to 3' direction.
  • All oligonucleotide primers disclosed herein are understood to be presented in the 5' to 3' direction when reading left to right.
  • When two different, non-overlapping oligonucleotides anneal to different regions of the same linear complementary nucleic acid sequence, and the 3' end of one oligonucleotide points towards the 5' end of the other, the former may be called the "upstream" oligonucleotide and the latter the "downstream" oligonucleotide.
  • when two overlapping oligonucleotides are hybridized to the same linear complementary nucleic acid sequence, with the first oligonucleotide positioned such that its 5' end is upstream of the 5' end of the second oligonucleotide, and the 3' end of the first oligonucleotide is upstream of the 3' end of the second oligonucleotide, the first oligonucleotide may be called the "upstream" oligonucleotide and the second oligonucleotide may be called the "downstream" oligonucleotide.
  • the terms “purified” or “substantially purified” refer to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated, and are at least 60% free, preferably 75% free, and most preferably 90% free from other components with which they are naturally associated.
  • An "isolated polynucleotide” or “isolated oligonucleotide” is therefore a substantially purified polynucleotide.
  • reverse transcriptase refers to an enzyme having the ability to transcribe DNA from an RNA template. This enzymatic activity is known as reverse transcriptase activity. Reverse transcriptase activity is desirable in order to obtain DNA from RNA viruses which can then be amplified and analyzed by the methods provided herein.
  • Ribosomal RNA refers to the primary ribonucleic acid constituent of ribosomes. Ribosomes are the protein-manufacturing organelles of cells and exist in the cytoplasm. Ribosomal RNAs are transcribed from the DNA genes encoding them.
  • sample in the present specification and claims is used in its broadest sense. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples.
  • a sample may include a specimen of synthetic origin.
  • Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste.
  • Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, lagomorphs, rodents, etc.
  • sample template refers to nucleic acid originating from a sample that is analyzed for the presence of "target” (defined below).
  • background template is used in reference to nucleic acid other than sample template that may or may not be present in a sample.
  • Background template is often a contaminant. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
  • a “segment” is defined herein as a region of nucleic acid within a target sequence.
  • sequence alignment refers to a listing of multiple DNA or amino acid sequences that aligns them to highlight their similarities. The listings can be made using bioinformatics computer programs.
  • the term "target” is used in a broad sense to indicate the gene or genomic region being amplified by the primers. Because a given primer pair provided herein is configured to generate a plurality of amplification products (depending on the bioagent being analyzed), multiple amplification products from different specific nucleic acid sequences may be obtained. Thus, the term “target” is not used to refer to a single specific nucleic acid sequence. The “target” is sought to be sorted out from other nucleic acid sequences and contains a sequence that has at least partial complementarity with an oligonucleotide primer. The target nucleic acid may comprise single- or double-stranded DNA or RNA. Primers herein can be targeted to, or configured to hybridize within portions, segments, or regions of nucleic acids. These terms are used when referring to specific regions of nucleic acid sequences used in primer design.
  • template refers to a strand of nucleic acid on which a complementary copy is built from nucleoside triphosphates through the activity of a template-dependent nucleic acid polymerase. Within a duplex the template strand is, by convention, depicted and described as the "bottom" strand. Similarly, the non-template strand is often depicted and described as the "top" strand.
  • As used herein, the term “T_m” is used in reference to the “melting temperature.” The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands.
  • Other references (e.g., Allawi, H. T. & SantaLucia, J., Jr. Thermodynamics and NMR of internal G·T mismatches in DNA. Biochemistry 36, 10581-94 (1997)) include more sophisticated computations which take structural and environmental, as well as sequence characteristics into account for the calculation of T_m.
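For orientation, the simplest common estimate of melting temperature for short oligonucleotides is the Wallace rule, which adds 2 °C for each A or T and 4 °C for each G or C. The sketch below uses that rule of thumb as a stand-in; it is not the nearest-neighbor thermodynamic calculation of the cited reference and ignores salt and primer concentration.

```python
def wallace_tm(primer: str) -> int:
    """Rule-of-thumb T_m in degrees C: 2 per A/T plus 4 per G/C.

    Only a rough guide for short oligonucleotides; structural and
    environmental effects are not taken into account.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    print(wallace_tm("ATGCATTAGCGGTACC"))  # hypothetical primer
```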
  • wild-type refers to a gene or a gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source.
  • a wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene.
  • modified refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
  • the methods are for detection and identification of population genotype for a population of bioagents.
  • Primers are selected to hybridize to conserved sequence regions of nucleic acids derived from a bioagent and which bracket (flank) variable sequence regions to yield a bioagent identifying amplicon which can be amplified and which is amenable to molecular mass determination.
  • the molecular mass is converted to a base composition, which indicates the number of each nucleotide in the amplicon.
  • the molecular mass or corresponding base composition signature of the amplicon is then queried against a database of molecular masses or base composition signatures indexed to bioagents and to the primer pair used to generate the amplicon.
  • a match of the measured base composition to a database entry base composition associates the sample bioagent to an indexed bioagent in the database.
  • the identity of the unknown bioagent or population of bioagents is determined. Prior knowledge of the unknown bioagent or population of bioagents is not necessary.
  • the measured base composition associates with more than one database entry base composition.
  • a second/subsequent primer pair is used to generate an amplicon, and its measured base composition is similarly compared to the database to determine its identity in triangulation identification.
  • the method can be applied to rapid parallel multiplex analyses, the results of which can be employed in a triangulation identification strategy.
  • the present method provides rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for bioagent detection and identification.
  • the upper length as a practical length limit is about 200 consecutive nucleobases. Incorporating modified nucleotides into the amplicon can allow for an increase in this upper limit.
  • the amplicons generated using any single primer pair will provide sufficient base composition information to allow for identification of at least one bioagent at the family, genus, species or subspecies level.
  • amplicons greater than 200 nucleobases can be generated and then digested to form two or more fragments that are less than 200 nucleobases. Analysis of one or more of the fragments will provide sufficient base composition information to allow for identification of at least one bioagent.
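The digestion step above can be sketched as cutting a sequence at every occurrence of a recognition site and checking that each fragment falls under the length limit. The example uses the EcoRI site GAATTC, cut after the first G, purely as an illustration; the choice of enzyme, the single-strand view and the test sequence are assumptions.

```python
def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Split seq at each occurrence of site, cutting cut_offset bases into it.

    Works on one strand only; staggered ends on the opposite strand are not
    modelled.
    """
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def all_measurable(fragments, max_length: int = 200) -> bool:
    """True if every fragment is shorter than max_length nucleobases."""
    return all(len(fragment) < max_length for fragment in fragments)


if __name__ == "__main__":
    amplicon = "A" * 150 + "GAATTC" + "T" * 150  # hypothetical 306-mer
    pieces = digest(amplicon)
    print([len(p) for p in pieces], all_measurable(pieces))
```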
  • amplicons comprise from about 45 to about 200 consecutive nucleobases (i.e., from about 45 to about 200 linked nucleosides).
  • this range expressly embodies compounds of every whole-number length from 45 to 200 nucleobases, inclusive (45, 46, 47, ..., 199, 200).
  • bioagent identifying amplicons amenable to molecular mass determination that are produced by the primers described herein are either of a length, size and/or mass compatible with the particular mode of molecular mass determination or compatible with a means of providing a predictable fragmentation pattern in order to obtain predictable fragments of a length compatible with the particular mode of molecular mass determination.
  • Such means of providing a predictable fragmentation pattern of an amplicon include, but are not limited to, cleavage with restriction enzymes or cleavage primers, for example.
  • bioagent identifying amplicons are larger than 200 nucleobases and are amenable to molecular mass determination following restriction digestion. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art.
  • amplicons corresponding to bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR) which is a routine method to those with ordinary skill in the molecular biology arts.
  • Other amplification methods may be used such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA). These methods are also known to those with ordinary skill (Michael, S.F., Biotechniques (1994), 16:411-412 and Dean et al., Proc. Natl. Acad. Sci. U.S.A. (2002), 99, 5261-5266).
  • the amplification is carried out in a multiplex assay, a PCR amplification reaction where more than one primer pair is included in the reaction pool allowing two or more different DNA targets to be amplified in a single tube or well.
  • viruses do not share a gene that is essential and conserved among all virus families. Therefore, viral identification is achieved within smaller groups of related viruses, such as members of a particular virus family or genus.
  • RNA-dependent RNA polymerase is present in all single-stranded RNA viruses and can be used for broad priming as well as resolution within the virus family.
  • At least one bacterial nucleic acid segment is amplified in the process of identifying the bacterial bioagent.
  • the nucleic acid segments that can be amplified by the primers disclosed herein and that provide enough variability to distinguish each individual bioagent and whose molecular masses are amenable to molecular mass determination are herein described as bioagent identifying amplicons.
  • identification of bioagents is accomplished at different levels using primers suited to resolution of each individual level of identification.
  • Broad range survey primers are configured with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of bioagents above the species level of bioagents).
  • broad range survey intelligent primers are capable of identification of bioagents at the species or sub-species level.
  • Examples of broad range survey primers include, but are not limited to: primer pair numbers 346 (SEQ ID NOs: 202:1110), 347 (SEQ ID NOs: 560:1278), 348 (SEQ ID NOs: 706:895), and 361 (SEQ ID NOs: 697:1398), which target DNA encoding 16S rRNA, and primer pair numbers 349 (SEQ ID NOs: 401:1156) and 360 (SEQ ID NOs: 409:1434), which target DNA encoding 23S rRNA.
  • drill-down primers are configured with the objective of identifying a bioagent at the sub-species level (including strains, subtypes, variants and isolates) based on subspecies characteristics which may, for example, include single nucleotide polymorphisms (SNPs), variable number tandem repeats (VNTRs), deletions, drug resistance mutations or any other modification of a nucleic acid sequence of a bioagent relative to other members of a species having different sub-species characteristics.
  • Drill-down intelligent primers are not always required for identification at the sub-species level because broad range survey intelligent primers may, in some cases, provide sufficient identification resolution to accomplish this identification objective.
  • drill-down primers include, but are not limited to: confirmation primer pairs such as primer pair numbers 351 (SEQ ID NOs: 355:1423) and 353 (SEQ ID NOs: 220:1394), which target the pXO1 virulence plasmid of Bacillus anthracis.
  • drill-down primer pairs are found in sets of triangulation genotyping primer pairs such as, for example, the primer pair number 2146 (SEQ ID NOs: 437:1137) which targets the arcC gene (encoding carbamate kinase) and is included in an 8 primer pair panel or kit for use in genotyping Staphylococcus aureus, or in other panels or kits of primer pairs used for determining drug-resistant bacterial strains, such as, for example, primer pair number 2095 (SEQ ID NOs: 456:1261) which targets the pv-luk gene (encoding Panton-Valentine leukocidin) and is included in an 8 primer pair panel or kit for use in identification of drug resistant strains of Staphylococcus aureus.
  • a representative process flow diagram for the primer selection and validation process is outlined in Figure 1.
  • candidate target sequences are identified (200) from which nucleotide alignments are created (210) and analyzed (220).
  • Primers are then configured by selecting appropriate priming regions (230) to facilitate the selection of candidate primer pairs (240).
  • the primer pairs are then subjected to in silico analysis by electronic PCR (ePCR) (300) wherein bioagent identifying amplicons are obtained from sequence databases such as GenBank or other sequence collections (310) and checked for specificity in silico (320).
  • Bioagent identifying amplicons obtained from GenBank sequences (310) can also be analyzed by a probability model which predicts the capability of a given amplicon to identify unknown bioagents such that the base compositions of amplicons with favorable probability scores are then stored in a base composition database (325).
  • base compositions of the bioagent identifying amplicons obtained from the primers and GenBank sequences can be directly entered into the base composition database (330).
  • Candidate primer pairs (240) are validated by testing their ability to hybridize to target nucleic acid by an in vitro amplification by a method such as PCR analysis (400) of nucleic acid from a collection of organisms (410). Amplification products thus obtained are analyzed by gel electrophoresis or by mass spectrometry to confirm the sensitivity, specificity and reproducibility of the primers used to obtain the amplification products (420).
  • the synthesis of primers is well known and routine in the art.
  • the primers may be conveniently and routinely made through the well-known technique of solid phase synthesis.
  • Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, CA). Any other means for such synthesis known in the art may additionally or alternatively be employed.
  • the oligonucleotide primers are broad range survey primers which hybridize to conserved regions of nucleic acid encoding the hexon gene of all (or between 80% and 100%, between 85% and 100%, between 90% and 100% or between 95% and 100%) known bacteria and produce bacterial bioagent identifying amplicons.
  • the molecular mass or base composition of a bacterial bioagent identifying amplicon defined by a broad range survey primer pair does not provide enough resolution to unambiguously identify a bacterial bioagent at or below the species level.
  • These cases benefit from further analysis of one or more bacterial bioagent identifying amplicons generated from at least one additional broad range survey primer pair or from at least one additional division-wide primer pair.
  • the employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as triangulation identification.
  • the oligonucleotide primers are division-wide primers which hybridize to nucleic acid encoding genes of species within a genus of bacteria.
  • the oligonucleotide primers are drill-down primers which enable the identification of sub-species characteristics. Drill-down primers provide the functionality of producing bioagent identifying amplicons for drill-down analyses such as strain typing when contacted with nucleic acid under amplification conditions. Identification of such sub-species characteristics is often critical for determining proper clinical treatment of viral infections. In some embodiments, sub-species characteristics are identified using only broad range survey primers and division-wide and drill-down primers are not used.
  • In some embodiments, the primers used for amplification hybridize to and amplify genomic DNA, and DNA of bacterial plasmids.
  • various computer software programs may be used to aid in design of primers for amplification reactions such as Primer Premier 5 (Premier Biosoft, Palo Alto, CA) or OLIGO Primer Analysis Software (Molecular Biology Insights, Cascade, CO). These programs allow the user to input desired hybridization conditions such as melting temperature of a primer-template duplex for example.
  • an in silico PCR search algorithm such as ePCR is used to analyze primer specificity across a plurality of template sequences which can be readily obtained from public sequence databases such as GenBank for example.
  • An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety).
  • This also provides information on primer specificity of the selected primer pairs.
  • the hybridization conditions applied to the algorithm can limit the results of primer specificity obtained from the algorithm.
  • the melting temperature threshold for the primer-template duplex is specified to be 35 °C or a higher temperature.
  • the number of acceptable mismatches is specified to be seven mismatches or less.
  • the buffer components and concentrations and primer concentrations may be specified and incorporated into the algorithm; for example, an appropriate primer concentration is about 250 nM and appropriate buffer components are 50 mM sodium or potassium and 1.5 mM Mg²⁺.
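The mismatch limit described above can be illustrated with a minimal Python sketch. It slides a primer along a same-strand template and reports every window with no more than the allowed number of mismatches. The function names and the example sequences are hypothetical and are not taken from the specification. The sketch does not model the melting temperature threshold or the buffer conditions, which would require a thermodynamic model, and a reverse primer would first need to be reverse-complemented.

```python
def count_mismatches(primer: str, site: str) -> int:
    """Count positions at which a primer and an equal-length site differ."""
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def find_priming_sites(primer: str, template: str, max_mismatches: int = 7):
    """Return (position, mismatches) for each window within the mismatch limit.

    The default of seven mismatches follows the limit stated in the text;
    everything else here is an illustrative assumption.
    """
    n = len(primer)
    hits = []
    for i in range(len(template) - n + 1):
        mm = count_mismatches(primer, template[i:i + n])
        if mm <= max_mismatches:
            hits.append((i, mm))
    return hits


# Hypothetical sequences, for illustration only.
template = "AAAACGTACGTTTT"
primer = "CGTACG"
exact_hits = find_priming_sites(primer, template, max_mismatches=0)
```

A stricter mismatch limit narrows the hits to near-exact sites, while the permissive limit of seven is best combined with a separate melting-temperature filter.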
  • a given primer need not hybridize with 100% complementarity in order to effectively prime the synthesis of a complementary nucleic acid strand in an amplification reaction.
  • a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or a hairpin structure).
  • the primers provided herein may comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% sequence identity with any of the primers listed in Table 2.
  • either or both of the primers of the primer pairs provided herein may comprise 0-10 nucleobase deletions, additions, and/or substitutions relative to any of the primers listed in Table 2, or elsewhere herein.
  • either or both of the primers may comprise 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleobase deletions, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleobase additions, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleobase substitutions relative to the sequences of any of the primers disclosed herein.
  • the primers comprise the sequence of any of the primers listed in Table 2 with the T modification removed from the 5' terminus.
  • the primers comprise the sequence of any of the primers listed in Table 2 with the T modification removed from the 5' terminus and comprising 0-10 nucleobase deletions, additions, and/or substitutions.
  • Percent homology, sequence identity or complementarity can be determined by, for example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for UNIX, Genetics Computer Group, University Research Park, Madison WI), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489).
  • complementarity of primers with respect to the conserved priming regions of viral nucleic acid is between about 70% and about 80%. In other embodiments, homology, sequence identity or complementarity is between about 75% and about 80%.
  • homology, sequence identity or complementarity is at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or is 100%.
  • the primers described herein comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 98%, or at least 99%, or 100% (or any range therewithin) sequence identity with the primer sequences specifically disclosed herein.
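The identity thresholds and the limit of 0-10 deletions, additions, and/or substitutions can be checked with a short Python sketch. It uses a plain edit distance and defines percent identity against the longer of the two sequences. Both choices are simplifying assumptions and do not reproduce the Gap program named above. The function names and the example sequences are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-base deletions, additions, or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # addition
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def percent_identity(variant: str, reference: str) -> float:
    """Identity relative to the longer sequence (an assumed definition)."""
    longest = max(len(variant), len(reference))
    if longest == 0:
        return 100.0
    return 100.0 * (longest - edit_distance(variant, reference)) / longest


# Hypothetical 20-mer and a variant carrying two substitutions.
reference = "ACGTACGTACGTACGTACGT"
variant = "TCGTACGTACGTACGTACGA"
within_limits = (edit_distance(variant, reference) <= 10
                 and percent_identity(variant, reference) >= 70.0)
```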
  • the oligonucleotide primers are 13 to 35 nucleobases in length (13 to 35 linked nucleotide residues). In these embodiments, the primers are at least 13 nucleobases in length, and less than 36 nucleobases in length. These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleobases in length, or any range therewithin. Herein is contemplated using both longer and shorter primers.
  • primers may also be linked to one or more other desired moieties, including, but not limited to, affinity groups, ligands, regions of nucleic acid that are not complementary to the nucleic acid to be amplified, labels, etc.
  • Primers may also form hairpin structures.
  • hairpin primers may be used to amplify short target nucleic acid molecules. The presence of the hairpin may stabilize the amplification complex (see e.g., TAQMAN MicroRNA Assays, Applied Biosystems, Foster City, California).
  • any oligonucleotide primer pair may have one or both primers with less than 70% sequence homology with a corresponding member of any of the primer pairs of Table 2 if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon.
  • any oligonucleotide primer pair may have one or both primers with a length greater than 35 nucleobases if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon.
  • the function of a given primer may be substituted by a combination of two or more primer segments that hybridize adjacent to each other or that are linked by a nucleic acid loop structure or linker which allows a polymerase to extend the two or more primers in an amplification reaction.
  • the primer pairs used for obtaining bioagent identifying amplicons are the primer pairs of Table 2.
  • other combinations of primer pairs are possible by combining certain members of the forward primers with certain members of the reverse primers.
  • An example can be seen in Table 2 for two primer pair combinations of forward primer 16S_EC_789_810_F (SEQ ID NO: 206) with the reverse primers 16S_EC_880_894_R (SEQ ID NO: 796) or 16S_EC_882_899_R (SEQ ID NO: 818).
  • the bioagent identifying amplicon that would be produced by the primer pair is preferably between about 45 and about 150 nucleobases in length.
  • a bioagent identifying amplicon longer than 150 nucleobases in length could be cleaved into smaller segments by cleavage reagents such as chemical reagents, or restriction enzymes, for example.
  • the primers are configured to amplify nucleic acid of a bioagent to produce amplification products that can be measured by mass spectrometry and from whose molecular masses candidate base compositions can be readily calculated.
  • any given primer comprises a modification comprising the addition of a non-templated T residue to the 5' end of the primer (i.e., the added T residue does not necessarily hybridize to the nucleic acid being amplified).
  • the addition of a non-templated T residue has an effect of minimizing the addition of non-templated adenosine residues as a result of the nonspecific enzyme activity of Taq polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.
  • primers may contain one or more universal bases. Because any variation (due to codon wobble in the third position) in the conserved regions among species is likely to occur in the third position of a DNA (or RNA) triplet, oligonucleotide primers can be configured such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a "universal nucleobase." For example, under this "wobble" pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C, and uridine (U) binds to U or C.
  • nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-beta-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).
  • the oligonucleotide primers are configured such that the first and second positions of each triplet are occupied by nucleotide analogs that bind with greater affinity than the unmodified nucleotide.
  • these analogs include, but are not limited to, 2,6-diaminopurine which binds to thymine, 5-propynyluracil (also known as propynylated thymine) which binds to adenine and 5-propynylcytosine and phenoxazines, including G-clamp, which binds to G.
  • Propynylated pyrimidines are described in U.S.
  • Propynylated primers are described in U.S. Pre-Grant Publication No. 2003-0170682, which is also commonly owned and incorporated herein by reference in its entirety.
  • Phenoxazines are described in U.S. Patent Nos. 5,502,177, 5,763,588, and 6,005,096, each of which is incorporated herein by reference in its entirety.
  • G-clamps are described in U.S. Patent Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.
  • primer hybridization is enhanced using primers containing 5-propynyl deoxy-cytidine and deoxy-thymidine nucleotides. These modified primers offer increased affinity and base pairing selectivity.
  • non-template primer tags are used to increase the melting temperature (Tm) of a primer-template duplex in order to improve amplification efficiency.
  • a non-template tag is at least three consecutive A or T nucleotide residues on a primer which are not complementary to the template.
  • A can be replaced by C or G and T can also be replaced by C or G.
  • a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.
  • the primers comprise mass-modifying tags. Reducing the total number of possible base compositions of a nucleic acid of specific molecular weight provides a means of avoiding a persistent source of ambiguity in determination of base composition of amplification products. Addition of mass-modifying tags to certain nucleobases of a given primer will result in simplification of de novo determination of base composition of a given bioagent identifying amplicon from its molecular mass.
  • the mass-modified nucleobase comprises one or more of the following: for example, 7-deaza-2'-deoxyadenosine-5-triphosphate, 5-iodo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxycytidine-5'-triphosphate, 5-iodo-2'-deoxycytidine-5'-triphosphate, 5-hydroxy-2'-deoxyuridine-5'-triphosphate, 4-thiothymidine-5'-triphosphate, 5-aza-2'-deoxyuridine-5'-triphosphate, 5-fluoro-2'-deoxyuridine-5'-triphosphate, O6-methyl-2'-deoxyguanosine-5'-triphosphate, N2-methyl-2'-deoxyguanosine-5'-triphosphate, 8-oxo-2'-deoxyguanosine-5'-triphosphate or thioth
  • multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with a plurality of primer pairs.
  • the advantages of multiplexing are that fewer reaction containers (for example, wells of a 96- or 384-well plate) are needed for each molecular mass measurement, providing time, resource and cost savings because additional bioagent identification data can be obtained within a single analysis.
  • Multiplex amplification methods are well known to those with ordinary skill and can be developed without undue experimentation.
  • one useful and non-obvious step in selecting a plurality of candidate bioagent identifying amplicons for multiplex amplification is to ensure that each strand of each amplification product will be sufficiently different in molecular mass that mass spectral signals will not overlap and lead to ambiguous analysis results.
  • a 10 Da difference in mass of two strands of one or more amplification products is sufficient to avoid overlap of mass spectral peaks.
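The 10 Da separation criterion lends itself to a simple pairwise check when candidate amplicons are selected for a multiplex reaction. The following Python sketch flags every pair of strands whose masses differ by less than a chosen threshold. The function name, the strand labels, and the third mass value are hypothetical, and the sketch assumes that the masses are already deconvoluted neutral masses rather than raw m/z values.

```python
from itertools import combinations


def overlapping_strands(strand_masses: dict, min_separation: float = 10.0):
    """Return (label_a, label_b, difference) for strands closer than the threshold."""
    clashes = []
    for (name_a, mass_a), (name_b, mass_b) in combinations(strand_masses.items(), 2):
        difference = abs(mass_a - mass_b)
        if difference < min_separation:
            clashes.append((name_a, name_b, difference))
    return clashes


# Two strand masses echo the 46-mer example in the text; the third is invented.
masses = {
    "amplicon1_forward": 14208.0,
    "amplicon1_reverse": 14079.0,
    "amplicon2_forward": 14214.0,  # hypothetical second amplicon
}
clashes = overlapping_strands(masses)
```

An empty result suggests that the candidate set is compatible under this criterion; any reported pair would call for a different primer pair or for separate reactions.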
  • single amplification reactions can be pooled before analysis by mass spectrometry.
  • the molecular mass of a given bioagent identifying amplicon is determined by mass spectrometry.
  • Mass spectrometry has several advantages, not the least of which is high bandwidth characterized by the ability to separate (and isolate) many molecular peaks across a broad range of mass to charge ratio (m/z).
  • mass spectrometry is intrinsically a parallel detection scheme without the need for radioactive or fluorescent labels, since every amplification product is identified by its molecular mass.
  • the current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample.
  • An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.
  • intact molecular ions are generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase.
  • ionization techniques include, but are not limited to, electrospray ionization (ES), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB).
  • Electrospray ionization mass spectrometry is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
  • the mass detectors used in the methods provided herein include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight (TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple quadrupole.
  • the base composition is the exact number of each nucleobase (A, T, C and G) in an oligonucleotide, for example, an amplicon, and can be calculated, for amplicons generated using the primer pairs provided here, from the molecular mass of the amplicons.
  • a base composition provides an index of a specific organism.
  • Base compositions can be calculated from known sequences of known bioagent identifying amplicons and can also be experimentally determined by measuring the molecular mass of a given bioagent identifying amplicon, followed by determination of all possible base compositions which are consistent with the measured molecular mass within acceptable experimental error.
  • the following example illustrates determination of base composition from an experimentally obtained molecular mass of a 46-mer amplification product originating at position 1337 of the 16S rRNA of Bacillus anthracis.
  • the forward and reverse strands of the amplification product have measured molecular masses of 14208 and 14079 Da, respectively.
  • the possible base compositions derived from the molecular masses of the forward and reverse strands for the B. anthracis products are listed in Table 1.
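The determination of candidate base compositions from a measured mass can be sketched in Python as a brute-force enumeration. The residue masses and end correction below are commonly used average values for a DNA strand with 5'-OH and 3'-OH termini. They are assumptions of this sketch, as are the function names and the tolerance, and the specification does not state which mass scale underlies its figures. The second function applies the complementarity constraint between forward and reverse strands to narrow the candidates.

```python
# Assumed average masses (Da) of nucleotide residues within a DNA strand.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
END_CORRECTION = -61.96  # assumes 5'-OH and 3'-OH termini, no 5' phosphate


def strand_mass(a: int, g: int, c: int, t: int) -> float:
    """Approximate average mass of a single strand with the given composition."""
    return (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
            + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"] + END_CORRECTION)


def candidate_compositions(measured_mass: float, length: int, tolerance: float):
    """List every (A, G, C, T) of the given length consistent with the mass."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                t = length - a - g - c
                if abs(strand_mass(a, g, c, t) - measured_mass) <= tolerance:
                    hits.append((a, g, c, t))
    return hits


def paired_compositions(forward_mass: float, reverse_mass: float,
                        length: int, tolerance: float):
    """Keep forward compositions whose complement also fits the reverse strand."""
    reverse_hits = set(candidate_compositions(reverse_mass, length, tolerance))
    # The complement of forward (A, G, C, T) is reverse (A=T, G=C, C=G, T=A).
    return [(a, g, c, t)
            for (a, g, c, t) in candidate_compositions(forward_mass, length, tolerance)
            if (t, c, g, a) in reverse_hits]


# Self-consistency demonstration with a hypothetical 46-mer composition.
forward = strand_mass(10, 12, 12, 12)
reverse = strand_mass(12, 12, 12, 10)
pairs = paired_compositions(forward, reverse, length=46, tolerance=0.5)
```

Requiring agreement between both strands typically removes most single-strand candidates, which is the reason both strand masses are measured.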
  • assignment of previously unobserved base compositions can be accomplished via the use of pattern classifier model algorithms.
  • Base compositions, like sequences, vary slightly from strain to strain within species.
  • the pattern classifier model is the mutational probability model.
  • the pattern classifier is the polytope model. The mutational probability model and polytope model are both commonly owned and described in U.S. Patent Application Serial No. 11/073,362, which is incorporated herein by reference in its entirety.
  • base composition probability clouds around the composition constraints for each species. This permits identification of organisms in a fashion similar to sequence analysis.
  • a "pseudo four-dimensional plot" can be used to visualize the concept of base composition probability clouds.
  • Optimal primer design requires optimal choice of bioagent identifying amplicons and maximizes the separation between the base composition signatures of individual bioagents. Areas where clouds overlap indicate regions that may result in a misclassification, a problem which is overcome by a triangulation identification process using bioagent identifying amplicons not affected by overlap of base composition probability clouds.
  • base composition probability clouds provide the means for screening potential primer pairs in order to avoid potential misclassifications of base compositions.
  • base composition probability clouds provide the means for predicting the identity of a bioagent whose assigned base composition was not previously observed and/or indexed in a bioagent identifying amplicon base composition database due to evolutionary transitions in its nucleic acid sequence.
  • mass spectrometry determination of base composition does not require prior knowledge of the composition or sequence in order to make the measurement.
  • a molecular mass of a single bioagent identifying amplicon alone does not provide enough resolution to unambiguously identify a given bioagent.
  • the employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as "triangulation identification."
  • Triangulation identification is pursued by determining the molecular masses of a plurality of bioagent identifying amplicons selected within a plurality of housekeeping genes. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al, J. Appl. Microbiol, 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.
  • the triangulation identification process can be pursued by characterization of bioagent identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR), such as multiplex PCR where multiple primers are employed in the same amplification reaction mixture, or PCR in multi-well plate format wherein a different and unique pair of primers is used in multiple wells containing otherwise identical reaction mixtures.
  • multiplex and multi-well PCR methods are well known to those with ordinary skill in the arts of rapid throughput amplification of nucleic acids.
  • one PCR reaction per well or container may be carried out, followed by an amplicon pooling step wherein the amplification products of different wells are combined in a single well or container which is then subjected to molecular mass analysis.
  • the combination of pooled amplicons can be chosen such that the expected ranges of molecular masses of individual amplicons are not overlapping and thus will not complicate identification of signals.
  • one or more nucleotide substitutions within a codon of a gene of an infectious organism confer drug resistance upon an organism which can be determined by codon base composition analysis.
  • the organism can be a bacterium, virus, fungus or protozoan.
  • the amplification product containing the codon being analyzed is of a length of about 35 to about 200 nucleobases.
  • the primers employed in obtaining the amplification product can hybridize to upstream and downstream sequences directly adjacent to the codon, or can hybridize to upstream and downstream sequences one or more sequence positions away from the codon.
  • the primers may have between about 70% to 100% sequence complementarity with the sequence of the gene containing the codon being analyzed.
  • the codon base composition analysis is undertaken
  • the codon analysis is undertaken for the purpose of investigating genetic disease in an individual. In other embodiments, the codon analysis is undertaken for the purpose of investigating a drug resistance mutation or any other deleterious mutation in an infectious organism such as a bacterium, virus, fungus or protozoan.
  • the bioagent is a bacterium identified in a biological product.
  • the molecular mass of an amplification product containing the codon being analyzed is measured by mass spectrometry.
  • the mass spectrometry can be either electrospray (ESI) mass spectrometry or matrix-assisted laser desorption ionization (MALDI) mass spectrometry.
  • the methods provided here can also be employed to determine the relative abundance of drug resistant strains of the organism being analyzed. Relative abundances can be calculated from amplitudes of mass spectral signals with relation to internal calibrants. In some embodiments, known quantities of internal amplification calibrants can be included in the amplification reactions and abundances of analyte amplification product estimated in relation to the known quantities of the calibrants.
  • one or more alternative treatments can be devised to treat the individual.
  • the identity and quantity of an unknown bioagent can be determined using the process illustrated in Figure 2.
  • Primers (500) and a known quantity of a calibration polynucleotide (505) are added to a sample containing nucleic acid of an unknown bioagent.
  • the total nucleic acid in the sample is then subjected to an amplification reaction (510) to obtain amplification products.
  • the molecular masses of amplification products are determined (515) from which are obtained molecular mass and abundance data.
  • the molecular mass of the bioagent identifying amplicon (520) provides the means for its identification (525) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (530) provides the means for its identification (535).
  • the abundance data of the bioagent identifying amplicon is recorded (540) and the abundance data for the calibration amplicon is recorded (545), both of which are used in a calculation (550) which determines the quantity of unknown bioagent in the sample.
  • a sample comprising an unknown bioagent is contacted with a pair of primers that provide the means for amplification of nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence.
  • the nucleic acids of the bioagent and of the calibration sequence are amplified and the rate of amplification is reasonably assumed to be similar for the nucleic acid of the bioagent and of the calibration sequence.
  • the amplification reaction then produces two amplification products: a bioagent identifying amplicon and a calibration amplicon.
  • the bioagent identifying amplicon and the calibration amplicon should be distinguishable by molecular mass while being amplified at essentially the same rate.
  • Effecting differential molecular masses can be accomplished by choosing as a calibration sequence, a representative bioagent identifying amplicon (from a specific species of bioagent) and performing, for example, a 2-8 nucleobase deletion or insertion within the variable region between the two priming sites.
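The construction of a calibration sequence by deletion can be illustrated with a short Python sketch. It removes a block of 2 to 8 nucleobases from the middle of the variable region while leaving both priming sites intact. The function name, the choice to delete from the center of the variable region, and the example sequence are all hypothetical.

```python
def make_calibrant(amplicon: str, forward_len: int, reverse_len: int,
                   deletion: int = 5) -> str:
    """Delete `deletion` bases from the variable region between the priming sites."""
    if not 2 <= deletion <= 8:
        raise ValueError("deletion must be between 2 and 8 nucleobases")
    end_of_variable = len(amplicon) - reverse_len
    variable = amplicon[forward_len:end_of_variable]
    if len(variable) <= deletion:
        raise ValueError("variable region is too short for this deletion")
    start = (len(variable) - deletion) // 2  # delete from the center (assumption)
    trimmed = variable[:start] + variable[start + deletion:]
    return amplicon[:forward_len] + trimmed + amplicon[end_of_variable:]


# Hypothetical amplicon: 4-base priming sites flanking a 10-base variable region.
amplicon = "AAAA" + "CCCCGGGGTT" + "TTTT"
calibrant = make_calibrant(amplicon, forward_len=4, reverse_len=4, deletion=4)
```

Because the priming sites are unchanged, the same primer pair would be expected to amplify both products, while the length difference separates them by mass.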
  • the amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by mass spectrometry, for example.
  • the resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence.
  • the molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent and the abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample.
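The quantity calculation described above reduces to a ratio when the two amplification rates are taken to be equal, as the text assumes. The following Python sketch shows that single-point estimate. The function name, the abundance units, and the example numbers are hypothetical.

```python
def estimate_bioagent_quantity(analyte_abundance: float,
                               calibrant_abundance: float,
                               calibrant_copies: float) -> float:
    """Scale the known calibrant quantity by the ratio of measured abundances."""
    if calibrant_abundance <= 0:
        # A missing calibration amplicon points to a failed amplification or analysis.
        raise ValueError("no measurable calibration amplicon")
    return calibrant_copies * analyte_abundance / calibrant_abundance


# Hypothetical signal amplitudes with 100 calibrant copies spiked into the sample.
estimated_copies = estimate_bioagent_quantity(3000.0, 1500.0, 100)
```

Raising an error when the calibrant signal is absent keeps a failed reaction from being reported as a quantity of zero.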
  • construction of a standard curve where the amount of calibration polynucleotide spiked into the sample is varied provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample.
  • the use of standard curves for analytical determination of molecular quantities is well known to one with ordinary skill and can be performed without undue experimentation.
  • multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple primer pairs which also amplify the corresponding standard calibration sequences.
  • the standard calibration sequences are optionally included within a single vector which functions as the calibration polynucleotide. Multiplex amplification methods are well known to those with ordinary skill and can be performed without undue experimentation.
  • the calibrant polynucleotide is used as an internal positive control to confirm that amplification conditions and subsequent analysis steps are successful in producing a measurable amplicon. Even in the absence of copies of the genome of a bioagent, the calibration polynucleotide should give rise to a calibration amplicon. Failure to produce a measurable calibration amplicon indicates a failure of amplification or of a subsequent analysis step such as amplicon purification or molecular mass determination. Reaching a conclusion that such failures have occurred is, in itself, a useful event.
  • the calibration sequence is comprised of DNA. In some embodiments, the calibration sequence is comprised of RNA.
  • the calibration sequence is inserted into a vector that itself functions as the calibration polynucleotide. In some embodiments, more than one calibration sequence is inserted into the vector that functions as the calibration polynucleotide.
  • Such a calibration polynucleotide is herein termed a "combination calibration polynucleotide." The process of inserting polynucleotides into vectors is routine to those skilled in the art and can be accomplished without undue experimentation. Thus, it should be recognized that the calibration method should not be limited to the embodiments described herein. The calibration method can be applied for determination of the quantity of any bioagent identifying amplicon when an appropriate standard calibrant polynucleotide sequence is configured and used.
  • the primer pairs produce bioagent identifying amplicons within stable and highly conserved regions of bacteria.
  • the advantage to characterization of an amplicon defined by priming regions that fall within a highly conserved region is that there is a low probability that the region will evolve past the point of primer recognition, in which case, the primer hybridization of the amplification step would fail.
  • Such a primer set is thus useful as a broad range survey-type primer.
  • the primers produce bioagent identifying amplicons including a region which evolves more quickly than the stable region described above.
  • the advantage of characterizing a bioagent identifying amplicon corresponding to an evolving genomic region is that it is useful for distinguishing emerging strain variants or the presence of virulence genes, drug resistance genes, or codon mutations that induce drug resistance.
  • the embodiments provided here also have significant advantages in providing a platform for identification of diseases caused by emerging bacterial strains such as, for example, drug-resistant strains of Staphylococcus aureus.
  • the present embodiments eliminate the need for prior knowledge of bioagent sequence to generate hybridization probes. This is possible because the methods are not confounded by naturally occurring evolutionary variations occurring in the sequence acting as the template for production of the bioagent identifying amplicon. Measurement of molecular mass and determination of base composition is accomplished in an unbiased manner without sequence prejudice.
  • Another embodiment provides a means of tracking the spread of a bacterium, such as a particular drug-resistant strain when a plurality of samples obtained from different locations are analyzed by the methods described above in an epidemiological setting.
  • a plurality of samples from a plurality of different locations is analyzed with primer pairs which produce bioagent identifying amplicons, a subset of which contains a specific drug-resistant bacterial strain.
  • the corresponding locations of the members of the drug-resistant strain subset indicate the spread of the specific drug-resistant strain to the corresponding locations.
  • kits for carrying out the methods described herein may comprise a sufficient quantity of one or more primer pairs to perform an amplification reaction on a target polynucleotide from a bioagent to form a bioagent identifying amplicon.
  • the kit may comprise from one to fifty primer pairs, from one to twenty primer pairs, from one to ten primer pairs, or from two to five primer pairs.
  • the kit may comprise one or more primer pairs recited in Table 2.
  • the kit comprises one or more broad range survey primer(s), division wide primer(s), or drill-down primer(s), or any combination thereof. If a given problem involves identification of a specific bioagent, the solution to the problem may require the selection of a particular combination of primers to provide the solution to the problem.
  • a kit may be configured so as to comprise particular primer pairs for identification of a particular bioagent.
  • a drill-down kit may be used, for example, to distinguish different genotypes or strains, drug-resistant, or otherwise.
  • the primer pair components of any of these kits may be additionally combined to comprise additional combinations of broad range survey primers and division-wide primers so as to be able to identify a bacterium.
  • the kit contains standardized calibration polynucleotides for use as internal amplification calibrants. Internal calibrants are described in commonly owned PCT pre-grant publication, publication number WO 2005/094421, which is incorporated herein by reference in its entirety.
  • the kit comprises a sufficient quantity of reverse transcriptase (if RNA is to be analyzed for example), a DNA polymerase, suitable nucleoside triphosphates (including alternative dNTPs such as inosine or modified dNTPs such as the 5-propynyl pyrimidines or any dNTP containing molecular mass-modifying tags such as those described above), a DNA ligase, and/or reaction buffer, or any combination thereof, for the amplification processes described above.
  • a kit may further include instructions pertinent for the particular embodiment of the kit, such instructions describing the primer pairs and amplification conditions for operation of the method.
  • a kit may also comprise amplification reaction containers such as microcentrifuge tubes and the like.
  • a kit may also comprise reagents or other materials for isolating bioagent nucleic acid or bioagent identifying amplicons from amplification, including, for example, detergents, solvents, or ion exchange resins which may be linked to magnetic beads.
  • a kit may also comprise a table of measured or calculated molecular masses and/or base compositions of bioagents using the primer pairs of the kit.
  • a kit may contain one or more survey bacterial primer pairs and one or more triangulation genotyping analysis primer pairs such as the primer pairs of Tables 8, 12, 14, 19, 21, 23, or 24.
  • the kit may represent a less expansive genotyping analysis but include triangulation genotyping analysis primer pairs for more than one genus or species of bacteria.
  • a kit for surveying nosocomial infections at a health care facility may include, for example, one or more broad range survey primer pairs, one or more division wide primer pairs, one or more Acinetobacter baumannii triangulation genotyping analysis primer pairs and one or more Staphylococcus aureus triangulation genotyping analysis primer pairs.
  • One with ordinary skill will be capable of analyzing in silico amplification data to determine which primer pairs will be able to provide optimal identification resolution for the bacterial bioagents of interest.
  • a kit may be assembled for identification of strains of bacteria involved in contamination of food.
  • An example of such a kit embodiment is a kit comprising one or more bacterial survey primer pairs of Table 5 with one or more triangulation genotyping analysis primer pairs of Table 12 which provide strain resolving capabilities for identification of specific strains of Campylobacter jejuni.
  • kits are 96-well or 384-well plates with a plurality of wells containing any or all of the following components: dNTPs, buffer salts, Mg2+, betaine, and primer pairs.
  • a polymerase is also included in the plurality of wells of the 96-well or 384-well plates.
  • kits contain instructions for PCR and mass spectrometry analysis of amplification products obtained using the primer pairs of the kits.
  • kits include a barcode which uniquely identifies the kit and the components contained therein according to production lots and may also include any other information relative to the components such as concentrations, storage temperatures, etc.
  • the barcode may also include analysis information to be read by optical barcode readers and sent to a computer controlling amplification, purification and mass spectrometric measurements.
  • the barcode provides access to a subset of base compositions in a base composition database which is in digital communication with base composition analysis software such that a base composition measured with primer pairs from a given kit can be compared with known base compositions of bioagent identifying amplicons defined by the primer pairs of that kit.
  • the kit contains a database of base compositions of bioagent identifying amplicons defined by the primer pairs of the kit.
  • the database is stored on a convenient computer readable medium such as a compact disk or USB drive, for example.
  • the kit includes a computer program stored on a computer formatted medium (such as a compact disk or portable USB disk drive, for example) comprising instructions which direct a processor to analyze data obtained from the use of the primer pairs provided herein.
  • the instructions of the software transform data related to amplification products into a molecular mass or base composition which is a useful concrete and tangible result used in identification and/or classification of bioagents.
  • the kits of the present invention contain all of the reagents sufficient to carry out one or more of the methods described herein.
  • PCR primers would amplify products of about 45 to about 150 nucleotides in length and distinguish subgroups and/or individual strains from each other by their molecular masses or base compositions. A typical process shown in Figure 1 is employed for this type of analysis.
  • a database of expected base compositions for each primer region was generated using an in silico PCR search algorithm, such as ePCR.
  • An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs.
  • Table 2 represents a collection of primers (sorted by primer pair number) configured to identify bacteria using the methods described herein.
  • the primer pair number is an in-house database index number. Conserved regions which primers were configured to hybridize within were identified on bacterial bioagent genes including, for example, arcC, aroE, ermA, ermC, gmk, gyrA, mecA, mecR1, mupR, nuc, pta, pvluk, tpi, tsst, tufB, and yqi.
  • the forward and reverse primer names shown in Table 1 indicate the gene region of a bacterial genome to which the forward and reverse primers hybridize relative to a reference sequence.
  • the forward primer name TSST1_NC002758.2-2137509-2138213_519_546_F indicates that the forward primer ("_F") hybridizes to the GyrA gene ("GYRA"), specifically to residues 519-546 ("519_546") of a reference sequence represented by a sequence extraction of coordinates 2137509-2138213 from GenBank gi number 57634611 (as indicated by cross-references in Table 2 for the prefix "GYRA NC002953").
  • This sequence extraction reference includes sequence encoding for tsst.
  • the primer pair name codes appearing in Table 2 are defined in Table 3. For example, Table 2 lists gene abbreviations and GenBank gi numbers that correspond with each primer name code.
  • primer pair has the code "TSST1 NC002758.2" and is thus configured to hybridize to sequence encoding the tsst gene, and the extraction sequence corresponds to coordinates 2137509-2138213 from GenBank gi number 57634611, which is a Staphylococcus aureus sequence.
  • the reference nomenclature in the primer name is selected to provide a reference, and does not necessarily mean that the primer pair has been configured with 100% complementarity to that target site on the reference sequence.
  • Tp 5-propynyluracil
  • Cp 5-propynylcytosine
  • * phosphorothioate linkage
  • I inosine.
  • GenBank Accession Numbers for reference sequences of bacteria are shown in Table 3 (below).
  • the reference sequences are extractions from bacterial genomic sequences or complements thereof.
  • a description of the primer design is provided herein.
  • CAPC BA 315 334 CCGTGGTATTGGAGTTATT
  • RPOC EC 2218 22 CTGGCAGGTATGCGTGGTC CGCACCGTGGGTTGAGATGAAGT
  • RPOC EC 993 101 CAAAGGTAAGCAAGGACGT
  • RPOC EC 1036 1059 2 CGAACGGCCAGAGTAGTCAACAC
  • TUFB EC 976 100 AACTACCGTCCGCAGTTCT GTTGTCGCCAGGCATAACCATTT
  • TUFB EC 976 100 AACTACCGTCCTCAGTTCT TUFB EC 1045 1068 2 GTTGTCACCAGGCATTACCATTT
  • RPLB EC 688 710 CATCCACACGGTGGTGGTG TGTTTTGTATCCAAGTGCTGGTT
  • VALS EC 1105 11 CGTGGCGGCGTGGTTATCG CGGTACGAACTGGATGTCGCCGT
  • RPLB EC 671 700 TAATGAACCCTAATGACCA TCCAAGTGCTGGTTTACCCCATG
  • SP101 SPET11 1 AACCTTAATTGGAAAGAAA SP101 SPET11 92 116 CCTACCCAACGTTCACCAAGGGC
  • SP101 SPET11 65 TGGGGATTGATATCACCGA SP101 SPET11 756 784 TGATTGGCGATAAAGTGATATTT
  • SP101 SPET11 30 TAGCTAATGGTCAGGCAGC SP101 SPET11 3170 31 TCGACGACCATCTTGGAAAGATT
  • CJST CJ 2060 20 TCCCGGACTTAATATCAAT TCGATCCGCATCACCATCAAAAG
  • Primer pair name codes and reference sequences are shown in Table 3.
  • the primer name code typically represents the gene to which the given primer pair is targeted.
  • the primer pair name may include specific coordinates with respect to a reference sequence defined by an extraction of a section of sequence or defined by a GenBank gi number, or the corresponding complementary sequence of the extraction, or the entire GenBank gi number as indicated by the label "no extraction." Where "no extraction" is indicated for a reference sequence, the coordinates of a primer pair named to the reference sequence are with respect to the GenBank gi listing. Gene abbreviations are shown in bold type in the "Gene Name" column.
  • primer pairs configured to prime amplification of double stranded sequences will be configured and named using one strand of a double-stranded reference sequence.
  • the forward primer is the primer of the pair that comprises full or partial sequence identity to the one strand of the sequence being used as a reference during design.
  • the reverse primer is the primer of the pair that comprises reverse complementarity.
  • Alignments can be done using a bioinformatics tool such as BLASTn provided to the public by NCBI (Bethesda, MD).
  • a relevant GenBank sequence may be downloaded and imported into custom programmed or commercially available bioinformatics programs wherein the alignment can be carried out to determine the primer hybridization coordinates and the sequences, molecular masses and base compositions of the amplification product.
  • primer pair number 2095 SEQ ID NOs: 456: 1261
  • First the forward primer (SEQ ID NO: 456) is subjected to a BLASTn search on the publicly available NCBI BLAST website.
  • RefSeq Genomic is chosen as the BLAST database since the gi numbers refer to genomic sequences.
  • the BLAST query is then performed. Among the top results returned is a match to GenBank gi number 21281729 (Accession Number NC_003923). The result shown below indicates that the forward primer hybridizes to positions 1530282..1530307 of the genomic sequence of Staphylococcus aureus subsp. aureus MW2 (represented by gi number 21281729).
  • the hybridization coordinates of the reverse primer (SEQ ID NO: 1261) can be determined in a similar manner and thus, the bioagent identifying amplicon can be defined in terms of genomic coordinates.
  • Table 3 contains sufficient information to determine the primer hybridization coordinates of any of the primers of Table 2 to the applicable reference sequences described therein.
  • Table 3 Primer Name Codes and Reference Sequence
  • Genomic DNA was prepared from samples using the DNeasy Tissue Kit (Qiagen, Valencia, CA) according to the manufacturer's protocols.
  • PCR reactions were assembled in 50 µL reaction volumes in a 96-well microtiter plate format using a Packard MPII liquid handling robotic platform and MJ Dyad thermocyclers (MJ Research, Waltham, MA) or Eppendorf Mastercycler thermocyclers (Eppendorf, Westbury, NY).
  • the PCR reaction mixture consisted of 4 units of Amplitaq Gold, 1x buffer II (Applied Biosystems, Foster City, CA), 1.5 mM MgCl2, 0.4 M betaine, 800 µM dNTP mixture and 250 nM of each primer.
  • the following typical PCR conditions were used: 95°C for 10 min followed by 8 cycles of 95°C for 30 seconds, 48°C for 30 seconds, and 72°C for 30 seconds with the 48°C annealing temperature increasing 0.9°C with each of the eight cycles. The PCR was then continued for 37 additional cycles of 95°C for 15 seconds, 56°C for 20 seconds, and 72°C for 20 seconds.
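
As a minimal sketch of the cycling program described above, the per-cycle annealing temperatures can be tabulated as follows. The function name and its parameters are hypothetical. The sketch assumes that the first of the eight cycles anneals at 48°C and that each subsequent cycle is 0.9°C warmer; the text does not state whether the increment is applied before or after the first cycle.

```python
def annealing_schedule(start_c=48.0, step_c=0.9, ramp_cycles=8,
                       plateau_c=56.0, plateau_cycles=37):
    """Return the annealing temperature (in degrees C) for every PCR cycle.

    Assumption: cycle 1 anneals at start_c and each later ramp cycle
    is step_c warmer than the one before it.
    """
    ramp = [round(start_c + step_c * i, 1) for i in range(ramp_cycles)]
    plateau = [plateau_c] * plateau_cycles
    return ramp + plateau


schedule = annealing_schedule()
print("total cycles:", len(schedule))
print("ramp cycles:", schedule[:8])
```

Under this assumption the ramp ends at 54.3°C, just below the 56°C used for the remaining 37 cycles, for 45 cycles in total.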
  • the ESI-FTICR mass spectrometer is based on a Bruker Daltonics (Billerica, MA) Apex II 70e electrospray ionization Fourier transform ion cyclotron resonance mass spectrometer that employs an actively shielded 7 Tesla superconducting magnet.
  • the active shielding constrains the majority of the fringing magnetic field from the superconducting magnet to a relatively small volume.
  • components that might be adversely affected by stray magnetic fields such as CRT monitors, robotic components, and other electronics, can operate in close proximity to the FTICR spectrometer.
  • Ions were formed via electrospray ionization in a modified Analytica (Branford, CT) source employing an off axis, grounded electrospray probe positioned approximately 1.5 cm from the metalized terminus of a glass desolvation capillary.
  • the atmospheric pressure end of the glass capillary was biased at 6000 V relative to the ESI needle during data acquisition.
  • a counter-current flow of dry N2 was employed to assist in the desolvation process.
  • Ions were accumulated in an external ion reservoir comprised of an rf-only hexapole, a skimmer cone, and an auxiliary gate electrode, prior to injection into the trapped ion cell where they were mass analyzed.
  • Ionization duty cycles greater than 99% were achieved by simultaneously accumulating ions in the external ion reservoir during ion detection. Each detection event consisted of 1M data points digitized over 2.3 s. To improve the signal-to-noise ratio (S/N), 32 scans were co-added for a total data acquisition time of 74 s.
  • the ESI-TOF mass spectrometer is based on a Bruker Daltonics MicroTOF™. Ions from the ESI source undergo orthogonal ion extraction and are focused in a reflectron prior to detection.
  • the TOF and FTICR are equipped with the same automated sample handling and fluidics described above. Ions are formed in the standard MicroTOF™ ESI source that is equipped with the same off-axis sprayer and glass capillary as the FTICR ESI source. Consequently, source conditions were the same as those described above. External ion accumulation was also employed to improve ionization duty cycle during data acquisition. Each detection event on the TOF was comprised of 75,000 data points digitized over 75 µs.
  • the sample delivery scheme allows sample aliquots to be rapidly injected into the electrospray source at high flow rate and subsequently be electrosprayed at a much lower flow rate for improved ESI sensitivity.
  • a bolus of buffer was injected at a high flow rate to rinse the transfer line and spray needle to avoid sample contamination/carryover.
  • the autosampler injected the next sample and the flow rate was switched to low flow.
  • data acquisition commenced.
  • the autosampler continued rinsing the syringe and picking up buffer to rinse the injector and sample transfer line.
  • one 99-mer nucleic acid strand having a base composition of A27G30C21T21 has a theoretical molecular mass of 30779.058 while another 99-mer nucleic acid strand having a base composition of A26G31C22T20 has a theoretical molecular mass of 30780.052.
  • a 1 Da difference in molecular mass may be within the experimental error of a molecular mass measurement and thus, the relatively narrow molecular mass range of the four natural nucleobases imposes an uncertainty factor.
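
The 1 Da ambiguity can be illustrated with a short calculation. The sketch below sums monoisotopic residue masses for the two 99-mer compositions given above. The residue masses are an assumption (computed here from the elemental formulas of the deoxynucleotide monophosphate residues), and end groups are ignored, so the absolute masses need not match the figures quoted above. Only the difference between the two strands is of interest.

```python
# Monoisotopic masses (Da) of the deoxynucleotide monophosphate residues.
# Assumption: derived from elemental formulas, not taken from the source text.
RESIDUE_MASS = {
    "A": 313.057607,  # C10H12N5O5P
    "G": 329.052522,  # C10H12N5O6P
    "C": 289.046374,  # C9H12N3O6P
    "T": 304.046040,  # C10H13N2O7P
}


def strand_mass(composition):
    """Sum the residue masses for a base composition such as {"A": 27, ...}.

    End groups are left out because they cancel when two strands are compared.
    """
    return sum(RESIDUE_MASS[base] * count for base, count in composition.items())


strand_1 = {"A": 27, "G": 30, "C": 21, "T": 21}
strand_2 = {"A": 26, "G": 31, "C": 22, "T": 20}

mass_difference = strand_mass(strand_2) - strand_mass(strand_1)
print(f"mass difference: {mass_difference:.3f} Da")
```

The two compositions differ by one A-to-G exchange and one T-to-C exchange, whose mass changes nearly cancel, which is why two distinct base compositions can sit roughly 1 Da apart.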
  • nucleobase as used herein is synonymous with other terms in use in the art including "nucleotide," "deoxynucleotide," "nucleotide residue," "deoxynucleotide residue," "nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate (dNTP)."
  • Mass spectra of bioagent-identifying amplicons were analyzed independently using a maximum-likelihood processor, such as is widely used in radar signal processing.
  • This processor referred to as GenX, first makes maximum likelihood estimates of the input to the mass spectrometer for each primer by running matched filters for each base composition aggregate on the input data. This includes the GenX response to a calibrant for each primer.
  • the algorithm emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-alarm plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants.
  • Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents.
  • a genomic sequence database is used to define the mass base count matched filters. The database contains the sequences of known bacterial bioagents and includes threat organisms as well as benign background organisms. The latter is used to estimate and subtract the spectral signature produced by the background organisms.
  • a maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this "cleaned up" data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise covariance for the cleaned up data.
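
The estimate-subtract-estimate flow described above can be sketched in a few lines. This is not the GenX processor: the function names and spectra are hypothetical, and the sketch assumes white noise (an identity noise covariance) in place of the running-sum covariance estimate, in which case the maximum likelihood amplitude reduces to a least-squares projection onto the matched filter.

```python
def dot(u, v):
    """Inner product of two equal-length sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))


def matched_filter_amplitude(data, signature):
    """Least-squares amplitude of signature in data (white-noise assumption)."""
    return dot(signature, data) / dot(signature, signature)


def subtract_signature(data, signature, amplitude):
    """Remove amplitude * signature from data, point by point."""
    return [d - amplitude * s for d, s in zip(data, signature)]


# Hypothetical binned spectra: the expected response of a background organism,
# the expected response of a target organism, and a measurement with both.
background = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
target = [0.0, 0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0]
measured = [2.0 * b + 0.5 * t for b, t in zip(background, target)]

# Step 1: estimate the background strength and subtract its signature.
background_strength = matched_filter_amplitude(measured, background)
cleaned = subtract_signature(measured, background, background_strength)

# Step 2: apply the same estimator to the cleaned-up data.
target_strength = matched_filter_amplitude(cleaned, target)
print(background_strength, target_strength)
```

The two signatures in this toy example do not overlap, so the sequential estimates are exact; overlapping signatures or correlated noise would call for a joint estimate that uses the noise covariance.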
  • Base count blurring can be carried out as follows. "Electronic PCR" can be conducted on nucleotide sequences of the desired bioagents to obtain the different expected base counts that could be obtained for each primer pair. See for example, ncbi.nlm.nih.gov/sutils/e-pcr/; Schuler, Genome Res. 7:541-50, 1997.
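
A minimal sketch of the electronic PCR step follows, assuming exact primer matches on a single strand. The template and primer sequences are hypothetical and the function names are illustrative; they are not the interface of the NCBI e-PCR tool, which also tolerates mismatches.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def electronic_pcr(template, forward, reverse):
    """Return the predicted amplicon (primers included), or None if no product.

    Exact matching only: the forward primer must occur in the template and the
    reverse complement of the reverse primer must occur downstream of it.
    """
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


def base_count(seq):
    """Base composition as (A, G, C, T) counts."""
    counts = Counter(seq)
    return tuple(counts[base] for base in "AGCT")


# Hypothetical primers and template, for illustration only.
forward = "ACCGTAGGCTAAC"
reverse = "ATCGGACTTAGCC"
insert = "GGATTCCAGTTGACGCATTC"
template = "TTG" + forward + insert + reverse_complement(reverse) + "TAGC"

amplicon = electronic_pcr(template, forward, reverse)
print(amplicon, base_count(amplicon))
```

Running this over every strain sequence for a bioagent and collecting the resulting tuples gives the set of expected base counts for a primer pair.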
  • one or more spreadsheets such as Microsoft Excel workbooks contain a plurality of worksheets. First in this example, there is a worksheet with a name similar to the workbook name; this worksheet contains the raw electronic PCR data.
  • filtered bioagents base count that contains bioagent name and base count; there is a separate record for each strain after removing sequences that are not identified with a genus and species and removing all sequences for bioagents with less than 10 strains.
  • Application of an exemplary script involves the user defining a threshold that specifies the fraction of the strains that are represented by the reference set of base counts for each bioagent.
  • the reference set of base counts for each bioagent may contain as many different base counts as are needed to meet or exceed the threshold.
  • the set of reference base counts is defined by taking the most abundant strain's base type composition and adding it to the reference set and then the next most abundant strain's base type composition is added until the threshold is met or exceeded.
  • the current set of data was obtained using a threshold of 55%, which was obtained empirically.
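
The threshold rule described above can be sketched as follows. The function name and the example strain data are hypothetical, and the sketch assumes that "most abundant" means the base count shared by the largest number of strains of the bioagent.

```python
from collections import Counter


def reference_base_counts(strain_base_counts, threshold=0.55):
    """Select reference base counts covering at least `threshold` of strains.

    strain_base_counts holds one (A, G, C, T) tuple per strain. Base counts
    are added from most to least common until the covered fraction of strains
    meets or exceeds the threshold.
    """
    tally = Counter(strain_base_counts)
    total = sum(tally.values())
    reference = []
    covered = 0
    for composition, strains in tally.most_common():
        if covered / total >= threshold:
            break
        reference.append(composition)
        covered += strains
    return reference


# Hypothetical bioagent with 10 strains and four distinct base counts.
strains = (
    [(25, 32, 29, 30)] * 4
    + [(25, 32, 30, 29)] * 3
    + [(26, 31, 29, 30)] * 2
    + [(24, 33, 29, 30)]
)
reference = reference_base_counts(strains, threshold=0.55)
print(reference)
```

With a 55% threshold, the two most common base counts (7 of 10 strains) form the reference set in this example; raising the threshold pulls in rarer variants.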
  • Example 6 Use of Broad Range Survey and Division Wide Primer Pairs for Identification of Bacteria in an Epidemic Surveillance Investigation
  • This investigation employed a set of 16 primer pairs which is herein designated the "surveillance primer set" and comprises broad range survey primer pairs, division wide primer pairs and a single Bacillus clade primer pair.
  • the surveillance primer set is shown in Table 5 and consists of primer pairs originally listed in Table 2.
  • This surveillance set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.
  • Primer pair 449 non-T modified
  • Its predecessors are primer pairs 70 and 357, displayed below in the same row.
  • Primer pair 360 has also been modified twice and its predecessors are primer pairs 17 and 118.
  • the 16 primer pairs of the surveillance set are used to produce bioagent identifying amplicons whose base compositions are sufficiently different amongst all known bacteria at the species level to identify, at a reasonable confidence level, any given bacterium at the species level.
  • As shown in Tables 6A-E, common respiratory bacterial pathogens can be distinguished by the base compositions of bioagent identifying amplicons obtained using the 16 primer pairs of the surveillance set. In some cases, triangulation identification improves the confidence level for species assignment.
  • nucleic acid from Streptococcus pyogenes can be amplified by nine of the sixteen surveillance primer pairs and Streptococcus pneumoniae can be amplified by ten of the sixteen surveillance primer pairs.
  • the base compositions of the bioagent identifying amplicons are identical for only one of the analogous bioagent identifying amplicons and differ in all of the remaining analogous bioagent identifying amplicons by up to four bases per bioagent identifying amplicon.
  • the resolving power of the surveillance set was confirmed by determination of base compositions for 120 isolates of respiratory pathogens representing 70 different bacterial species and the results indicated that natural variations (usually only one or two base substitutions per bioagent identifying amplicon) amongst multiple isolates of the same species did not prevent correct identification of major pathogenic organisms at the species level.
  • Bacillus anthracis is a well known biological warfare agent which has emerged in domestic terrorism in recent years. Since it was envisioned to produce bioagent identifying amplicons for identification of Bacillus anthracis, additional drill-down analysis primers were configured to target genes present on virulence plasmids of Bacillus anthracis so that additional confidence could be reached in positive identification of this pathogenic organism. Three drill-down analysis primers were configured and are listed in Tables 2 and 6. In Table 6, the drill-down set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row. Table 6: Drill-Down Primer Pairs for Confirmation of Identification of Bacillus anthracis
  • Phylogenetic coverage of bacterial space of the sixteen surveillance primers of Table 5 and the three Bacillus anthracis drill-down primers of Table 6 is shown in Figure 3 which lists common pathogenic bacteria.
  • Figure 3 is not meant to be comprehensive in illustrating all species identified by the primers. Only pathogenic bacteria are listed as representative examples of the bacterial species that can be identified by the primers and methods of the present invention.
  • Nucleic acid of groups of bacteria enclosed within the polygons of Figure 3 can be amplified to obtain bioagent identifying amplicons using the primer pair numbers listed in the upper right hand corner of each polygon. Primer coverage for polygons within polygons is additive.
  • bioagent identifying amplicons can be obtained for Chlamydia trachomatis by amplification with, for example, primer pairs 346-349, 360 and 361, but not with any of the remaining primers of the surveillance primer set.
  • bioagent identifying amplicons can be obtained from nucleic acid originating from Bacillus anthracis (located within 5 successive polygons) using, for example, any of the following primer pairs: 346-349, 360, 361 (base polygon), 356, 449 (second polygon), 352 (third polygon), 355 (fourth polygon), 350, 351 and 353 (fifth polygon).
  • Francisella tularensis schu 4 [32 29 22 16] [28 38 26 26] [25 32 28 31]
  • Neisseria meningitidis MC58 [29 28 26 16] [27 34 27 27] [25 35 30 26]
  • Neisseria meningitidis Z2491 (serogroup A) [29 28 26 16] [27 34 27 27] [25 35 30 26]
  • Mycobacterium tuberculosis CDC 1551 [27 36 21 15] [22 30 28] [21 36 27 30]
  • Streptococcus pneumoniae R6 [26 32 23 18] [25 35 28 28] [25 32 29 30]
  • Streptococcus pneumoniae TIGR4 [26 32 23 18] [25 35 28 28] [25 32 30 29]
  • Neisseria meningitidis MC58 (serogroup B) [25 27 22 18] [34 37 25 26] NO DATA
  • Staphylococcus aureus MRSA252 [26 30 25 20] [31 38 24 29] [33 30 31 27]
  • Staphylococcus aureus MSSA476 [26 30 25 20] [31 38 24 29] [33 30 31 27]
  • Streptococcus pneumoniae TIGR4 [28 31 22 20] [34 36 24 28] [37 30 29 25]
  • Streptococcus pneumoniae TIGR4 [22 20 19 14] [25 33 29 35] [30 29 21 25]
  • the third set were historical samples, including twenty-seven isolates of group A Streptococcus, from disease outbreaks at this and other military training facilities during previous years.
  • the fourth set of samples was collected from five geographically separated military facilities in the continental U.S. in the winter immediately following the severe November/December 2002 outbreak.
  • FIG. 4 is a 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.
  • primer pair number 356 (SEQ ID NOs: 449: 1380) primarily amplifies the nucleic acid of members of the classes Bacilli and Clostridia and is not expected to amplify proteobacteria such as Neisseria meningitidis and Haemophilus influenzae.
  • As expected, analysis of the mass spectrum of amplification products obtained with primer pair number 356 does not indicate the presence of Neisseria meningitidis and Haemophilus influenzae but does indicate the presence of Streptococcus pyogenes (Figures 3 and 6, Table 7B). Thus, these primers or types of primers can confirm the absence of particular bioagents from a sample.
  • the 15 throat swabs from military recruits were found to contain a relatively small set of microbes in high abundance. The most common were Haemophilus influenzae, Neisseria meningitidis, and Streptococcus pyogenes. Staphylococcus epidermidis, Moraxella catarrhalis, Corynebacterium pseudodiphtheriticum, and Staphylococcus aureus were present in fewer samples. An equal number of samples from healthy volunteers from three different geographic locations were identically analyzed. Results indicated that the healthy volunteers have bacterial flora dominated by multiple, commensal non-beta-hemolytic Streptococcal species, including the viridans group streptococci (S.
  • Example 7 Triangulation Genotyping Analysis for Determination of emm-Type of Streptococcus pyogenes in Epidemic Surveillance
  • This drill-down set comprises primers with T modifications (note TMOD designation in primer names) which constitutes a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.
  • the primers of Table 8 were used to produce bioagent identifying amplicons from nucleic acid present in the clinical samples.
  • the bioagent identifying amplicons were subsequently analyzed by mass spectrometry, and base compositions corresponding to the molecular masses were calculated.
  • Table 9A Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus samples from Six Military Installations Obtained with Primer Pair Nos. 426 and

Abstract

The present invention provides compositions, kits and methods for rapid identification and quantification of bacteria by molecular mass and base composition analysis.

Description

COMPOSITIONS FOR USE IN IDENTIFICATION OF BACTERIA
CROSS-REFERENCE TO RELATED APPLICATIONS
[01] This application claims the benefit of priority to U.S. Provisional Application Serial Nos. 60/896813, filed March 23, 2007 and 60/896822, filed March 23, 2007, the disclosures of which are incorporated by reference in their entirety for any purpose.
SEQUENCE LISTING
[02] Computer-readable forms of the sequence listing, on CD-ROM, containing the file named DIBIS0096WOSEQ.txt, which is 257,746 bytes (measured in MS-DOS), and were created on March 20, 2008, are herein incorporated by reference.
STATEMENT OF GOVERNMENT SUPPORT
[03] This invention was made with United States Government support under CDC contract R01 CI000099-01. The United States Government has certain rights in the invention.
FIELD OF THE INVENTION
[04] The present invention provides compositions, kits and methods for rapid identification and quantification of bacteria by molecular mass and base composition analysis.
BACKGROUND OF THE INVENTION
[05] A problem in determining the cause of a natural infectious outbreak or a bioterrorist attack is the sheer variety of organisms that can cause human disease. There are over 1400 organisms infectious to humans; many of these have the potential to emerge suddenly in a natural epidemic or to be used in a malicious attack by bioterrorists (Taylor et al. Philos. Trans. R. Soc. London B. Biol. Sci., 2001, 356, 983-989). This number does not include numerous strain variants, bioengineered versions, or pathogens that infect plants or animals.
[06] Much of the new technology being developed for detection of biological weapons incorporates a polymerase chain reaction (PCR) step based upon the use of highly specific primers and probes designed to selectively detect certain pathogenic organisms. Although this approach is appropriate for the most obvious bioterrorist organisms, like smallpox and anthrax, experience has shown that it is very difficult to predict which of hundreds of possible pathogenic organisms might be employed in a terrorist attack. Likewise, naturally emerging human disease that has caused devastating consequence in public health has come from unexpected families of bacteria, viruses, fungi, or protozoa. Plants and animals also have their natural burden of infectious disease agents and there are equally important biosafety and security concerns for agriculture.
[07] A major conundrum in public health protection, biodefense, and agricultural safety and security is that these disciplines need to be able to rapidly identify and characterize infectious agents, while there is no existing technology with the breadth of function to meet this need. Currently used methods for identification of bacteria rely upon culturing the bacterium to effect isolation from other organisms and to obtain sufficient quantities of nucleic acid followed by sequencing of the nucleic acid, both processes which are time and labor intensive.
[08] Mass spectrometry provides detailed information about the molecules being analyzed, including high mass accuracy. It is also a process that can be easily automated. DNA chips with specific probes can only determine the presence or absence of specifically anticipated organisms. Because there are hundreds of thousands of species of benign bacteria, some very similar in sequence to threat organisms, even arrays with 10,000 probes lack the breadth needed to identify a particular organism.
[09] Provided herein are oligonucleotide primers and compositions and kits containing the oligonucleotide primers, which define bacterial bioagent identifying amplicons and, upon amplification, produce corresponding amplification products whose molecular masses provide the means to identify bacteria, for example, at and below the species taxonomic level.
SUMMARY OF THE INVENTION
[10] Provided herein are, inter alia, oligonucleotide primers, oligonucleotide primer pairs, compositions and kits comprising the same, and methods for their use in rapid identification, characterization and quantification of bacteria (also referred to herein as bacterial bioagents) by molecular mass and base composition analysis. In one embodiment, the bacteria are members of the Staphylococcus genus. In a preferred embodiment, they are members of the Staphylococcus aureus species. The forward and reverse primer members of the oligonucleotide primer pairs are configured to amplify one or more nucleic acids from bioagents, thereby generating amplicons (amplification products) for the nucleic acids. In one embodiment, the primers generate bioagent identifying nucleic acid amplicons. The amplicons are preferably generated from gene sequences within the nucleic acid.
[11] Each of the oligonucleotide primer pairs comprises a forward and a reverse primer. In a preferred embodiment, each of the forward and reverse primers comprises between 13 and 35 linked nucleotides in length. Thus, in this embodiment, the primer may comprise 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 35 linked nucleotides in length.
[12] In a preferred embodiment, the forward primer of the oligonucleotide primer pair comprises between 70% and 100% sequence identity with SEQ ID NO.: 1465. In one aspect, the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1465. In another aspect, the forward primer comprises at least 80% sequence identity with SEQ ID NO.: 1465. In another aspect, the forward primer comprises at least 90% sequence identity with SEQ ID NO. : 1465. In another aspect, the forward primer comprises at least 95% sequence identity with SEQ ID NO.: 1465. In another aspect, the forward primer comprises at least 100% sequence identity with SEQ ID NO.: 1465. In another aspect, the forward primer is SEQ ID NO.: 1465 with 0-10 nucleotide deletions, additions, and/or substitutions. In another aspect, the forward primer is SEQ ID NO.: 1465.
[13] In one embodiment, the reverse primer of the oligonucleotide primer pair comprises between 70% and 100% sequence identity with SEQ ID NO.: 1466. In one aspect, the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1466. In another aspect, the reverse primer comprises at least 80% sequence identity with SEQ ID NO.: 1466. In another aspect, the reverse primer comprises at least 90% sequence identity with SEQ ID NO.: 1466. In another aspect, the reverse primer comprises at least 95% sequence identity with SEQ ID NO.: 1466. In another aspect, the reverse primer comprises 100% sequence identity with SEQ ID NO.: 1466. In another aspect, the reverse primer is SEQ ID NO.: 1466 with 0-10 nucleotide deletions, additions, and/or substitutions. In another aspect, the reverse primer is SEQ ID NO.: 1466.
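The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the simplest reading of sequence identity: an ungapped, position-by-position comparison of two sequences of equal length. The function names and the 20-nucleotide sequences are hypothetical and are not taken from the sequence listing; primers with additions or deletions would need an alignment step that this sketch omits.

```python
def percent_identity(primer: str, reference: str) -> float:
    """Percent of matching positions in an ungapped, end-to-end comparison.

    Assumption: both sequences have the same length. Handling additions or
    deletions would require an alignment, which is outside this sketch.
    """
    if len(primer) != len(reference):
        raise ValueError("this sketch only compares sequences of equal length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(primer.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


def meets_threshold(primer: str, reference: str, minimum: float = 70.0) -> bool:
    """True if the primer reaches the stated minimum percent identity."""
    return percent_identity(primer, reference) >= minimum


# Hypothetical 20-mers for illustration only (NOT any SEQ ID NO).
reference = "ACGTACGTACGTACGTACGT"
candidate = "ACGTACGTTCGTACGTACGA"  # differs from the reference at two positions

print(percent_identity(candidate, reference))       # 18 of 20 positions match
print(meets_threshold(candidate, reference, 70.0))  # passes the 70% threshold
print(meets_threshold(candidate, reference, 95.0))  # fails the 95% threshold
```

Under this reading, a 20-nucleotide primer with two substituted positions has 18 of 20 identical residues, so it satisfies the 70%, 80% and 90% aspects but not the 95% aspect.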
[14] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1465.
[15] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1466.
[16] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1465 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1466.
[17] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 288.
[18] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1269.
[19] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 288 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1269.
[20] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 698.
[21] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1420.
[22] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 698 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1420.
[23] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 217.
[24] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1167.
[25] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 217 and wherein the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1167.
[26] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 399.
[27] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1041.
[28] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 399 and wherein the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1041.
[29] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 430.
[30] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1321.
[31] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 430 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1321.
[32] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 174.
[33] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 853.
[34] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 174 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 853.
[35] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 172.
[36] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 1360.
[37] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 172 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1360.
[38] One embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 205.
[39] Another embodiment is an oligonucleotide primer between 13 and 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO: 876.
[40] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 205 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 876.
[41] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 456.
[42] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1261.
[43] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 456 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1261.
[44] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 437.
[45] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1137.
[46] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1231.
[47] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 456 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1231 or with SEQ ID NO.: 1137.
[48] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 530.
[49] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 891.
[50] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 530 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 891.
[51] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 474.
[52] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 869.
[53] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 474 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 869.
[54] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 268.
[55] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1284.
[56] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 268 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1284.
[57] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 418.
[58] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1301.
[59] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 418 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1301.
[60] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 318.
[61] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1300.
[62] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 318 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1300.
[63] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 440.
[64] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1076.
[65] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 440 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1076.
[66] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 219.
[67] Another embodiment is an oligonucleotide primer pair 13 to 35 linked nucleotides in length having at least 70% sequence identity with SEQ ID NO.: 1013.
[68] Another embodiment is an oligonucleotide primer pair wherein the forward primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 219 and the reverse primer is between 13 and 35 linked nucleotides in length and comprises at least 70% sequence identity with SEQ ID NO: 1013.
[69] Also provided herein are kits comprising one or more of the oligonucleotide primer pairs. In one embodiment, the kit comprises an oligonucleotide primer pair comprising a forward primer that comprises at least 70% sequence identity with SEQ ID NO.: 1465 and a reverse primer that comprises at least 70% sequence identity with SEQ ID NO.: 1466, the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1467 and the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1468, or the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1469 and the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1470. In a preferred embodiment, the primer pair comprises at least 70% sequence identity with SEQ ID NO.: 1465:SEQ ID NO.: 1466, SEQ ID NO.: 1467:SEQ ID NO.: 1468, or SEQ ID NO.: 1469:SEQ ID NO.: 1470. In another embodiment, the kit comprises at least one additional oligonucleotide primer pair that is configured to generate an amplicon between 45 and 200 linked nucleotides in length, and comprises a forward and a reverse primer, each comprising between 13 and 35 linked nucleotides in length and each configured to hybridize to conserved sequence regions within a Staphylococcus aureus gene, said gene selected from the group consisting of: ermA, ermC, pvluk, nuc, tufB, mecA, mec-R1, tsst1, and mupR, arcC, aroE, gmk, pta, tpi and yqi. In a preferred embodiment, each of the at least one additional oligonucleotide primer pair comprises at least 70% sequence identity with a primer pair selected from: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, SEQ ID NO.: 437:SEQ ID NO.: 1232, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
In another aspect, the kit comprises eight primer pairs, said eight oligonucleotide primer pairs having at least 70% sequence identity to: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 1465:SEQ ID NO.: 1466, SEQ ID NO.: 1467:SEQ ID NO.: 1468, or SEQ ID NO.: 1469:SEQ ID NO.: 1470. In another aspect, the kit comprises eight oligonucleotide primer pairs consisting of SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 1465:SEQ ID NO.: 1466, SEQ ID NO.: 1467:SEQ ID NO.: 1468, or SEQ ID NO.: 1469:SEQ ID NO.: 1470. In one aspect, the kit further comprises eight additional primer pairs, comprising at least 70% sequence identity with SEQ ID NO.: 437:SEQ ID NO.: 1232, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013. In another aspect, the eight additional primer pairs consist of: SEQ ID NO.: 437:SEQ ID NO.: 1232, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
[70] In a preferred embodiment, the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 288 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1269; a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 698 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1420; a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 217 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1167; a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 399 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1041; a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 456 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1261; a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 430 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1321; a seventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 174 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 853; and an eighth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 172 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1360.
[71] In another preferred embodiment, the kit comprises the eight oligonucleotide primer pairs: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 172:SEQ ID NO.: 1360.
[72] In another preferred embodiment, the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 288 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1269, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 698 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1420, a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 217 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1167, a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 399 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1041, a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 456 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1261, a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 430 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1321, a seventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 174 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 853; and an eighth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 205 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 876.
[73] In a preferred embodiment, the kit comprises eight oligonucleotide primer pairs consisting of: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.:853, and SEQ ID NO.: 205:SEQ ID NO.:876.
[74] In another preferred embodiment, the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 288 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1269, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 698 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1420, a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 217 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1167, a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 399 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1041, a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 456 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1261, a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 430 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1321, a seventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 174 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 853; and an eighth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 1465 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1466.
[75] In a preferred embodiment, the kit comprises eight oligonucleotide primer pairs consisting of: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 1465:SEQ ID NO.: 1466.
[76] In another preferred embodiment, the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 437 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1137, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 530 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 891, a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 474 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 869, a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 268 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1284, a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 418 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1301, a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 318 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1300, a seventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 440 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1076, and an eighth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 219 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1013.
[77] In a preferred embodiment, the kit comprises eight oligonucleotide primer pairs consisting of: SEQ ID NO.: 437:SEQ ID NO.: 1137, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
[78] In another preferred embodiment, the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 437 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1232, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 530 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 891, a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 474 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 869, a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 268 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1284, a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 418 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1301, a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 318 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1300, a seventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 440 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1076; and an eighth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 219 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1013.
[79] In a preferred embodiment, the kit comprises eight oligonucleotide primer pairs consisting of: SEQ ID NO.: 437:SEQ ID NO.: 1232, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
[80] In another preferred embodiment, the kit for identifying a Staphylococcus aureus bioagent comprises: a first oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 437 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1232, a second oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 530 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 891, a third oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 474 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 869, a fourth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 268 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1284, a fifth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 418 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1301, a sixth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 318 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1300, a seventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 440 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1076, an eighth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 219 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1013, a ninth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 288 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1269, a tenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 698 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1420, an eleventh oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 217 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1167, a twelfth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 399 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1041, a thirteenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 456 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1261, a fourteenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 430 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 1321, a fifteenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 174 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 853; and a sixteenth oligonucleotide primer pair comprising a forward primer with at least 70% sequence identity with SEQ ID NO.: 205 and a reverse primer with at least 70% sequence identity with SEQ ID NO.: 876.
[81] Preferably, each of the oligonucleotide primer pairs is configured to generate an amplicon comprising between 45 and 200 linked nucleotides in length, and wherein, for each of the oligonucleotide primer pairs, the forward primer comprises between 13 and 35 linked nucleotides in length and is configured to hybridize within a first conserved sequence region of a Staphylococcus aureus gene sequence, and the reverse primer comprises between 13 and 35 linked nucleotides in length and is configured to hybridize within a second conserved sequence region of said Staphylococcus aureus gene sequence.
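The length constraints in the preceding paragraph can be expressed as a simple check. The following is a minimal sketch, assuming that both ranges are inclusive (the primer range is enumerated as 13 through 35 elsewhere in this description; treating the amplicon range the same way is an assumption). The class name, field names and sequences are hypothetical, and the check does not test hybridization to a conserved region.

```python
from dataclasses import dataclass

PRIMER_MIN, PRIMER_MAX = 13, 35       # linked nucleotides per primer
AMPLICON_MIN, AMPLICON_MAX = 45, 200  # linked nucleotides per amplicon


@dataclass(frozen=True)
class PrimerPair:
    forward: str
    reverse: str
    amplicon_length: int  # length of the amplification product, primers included


def within_stated_ranges(pair: PrimerPair) -> bool:
    """Check only the length ranges; hybridization behavior is not modeled."""
    primers_ok = all(
        PRIMER_MIN <= len(p) <= PRIMER_MAX for p in (pair.forward, pair.reverse)
    )
    amplicon_ok = AMPLICON_MIN <= pair.amplicon_length <= AMPLICON_MAX
    return primers_ok and amplicon_ok


# Hypothetical placeholder sequences, not taken from the sequence listing.
ok_pair = PrimerPair(forward="A" * 20, reverse="C" * 22, amplicon_length=120)
short_primer = PrimerPair(forward="A" * 12, reverse="C" * 22, amplicon_length=120)
long_amplicon = PrimerPair(forward="A" * 20, reverse="C" * 22, amplicon_length=201)

print(within_stated_ranges(ok_pair))        # True
print(within_stated_ranges(short_primer))   # False: forward primer is 12 nt
print(within_stated_ranges(long_amplicon))  # False: amplicon exceeds 200 nt
```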
[82] In some embodiments, at least one of the forward primer and the reverse primer comprises at least one modified nucleobase. In one embodiment, at least one of the at least one modified nucleobase is a mass modified nucleobase. In one aspect, the mass modified nucleobase is 5-Iodo-C. In another aspect, it comprises a mass modified tag. In another embodiment, at least one of the at least one modified nucleobase is a universal nucleobase, for example, inosine. In another embodiment, the primer pair comprises at least one non-templated T residue on the 5'-end. In another embodiment, at least one of the forward primer and the reverse primer comprises at least one non-template tag. In one embodiment, at least one of the forward primer and the reverse primer comprises a non-templated T residue on the 5'-end. In another embodiment, at least one of the forward primer and the reverse primer lacks a non-templated T residue on the 5'-end.
[83] Some embodiments are kits that comprise one or more of the primer pairs. In some embodiments, each member of the one or more primer pairs of the kit is of a length of between 13 and 35 linked nucleotides and has 70% to 100% sequence identity with the corresponding member from any of the primer pairs listed in Table 2.
[84] In some embodiments, the kits comprise at least one calibration polynucleotide for use in quantitation of bacteria in a given sample, and also for use as a positive control for amplification.
[85] In some embodiments, the kits further comprise at least one anion exchange functional group linked to a magnetic bead.
[86] Also provided herein are methods for identification of bacteria using one or more of the primer pairs provided herein. In one embodiment, the method is for identification of a bioagent in a sample. In one aspect, the bioagent is a bacterial bioagent, preferably a Staphylococcus aureus bioagent. Nucleic acid from the sample is amplified using the oligonucleotide primer pairs described above to obtain at least one amplification product. In a preferred aspect, the amplification product is between 45 and 200 linked nucleotides in length. The molecular mass of the amplification product is determined by mass spectrometry. In a preferred embodiment, the base composition of the amplification product is calculated from the determined molecular mass. The molecular mass and/or base composition is compared to or queried against a database comprising a plurality of base compositions or molecular masses. Preferably, each base composition/molecular mass within the plurality of base compositions and/or molecular masses in the database is indexed to the primer pair and to a bioagent. A match between the calculated base composition or the determined molecular mass with a base composition or molecular mass comprised in the database identifies the bioagent in the sample. In preferred embodiments, the mass spectrometry used to determine the molecular mass is electrospray ionization (ESI) time of flight (TOF) mass spectrometry or ESI Fourier transform ion cyclotron resonance (FTICR) mass spectrometry, for example. Other mass spectrometry techniques can also be used to measure the molecular mass of bacterial bioagent identifying amplicons.
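By way of illustration and not limitation, the database comparison described above can be sketched in Python as follows. The record layout, the 25 ppm mass tolerance, and the organism names and masses used in the usage note are hypothetical placeholders and are not values taken from this disclosure.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DatabaseEntry:
    """One database record, indexed to a primer pair and to a bioagent."""
    primer_pair: int
    bioagent: str
    base_composition: Tuple[int, int, int, int]  # counts of (A, G, C, T)
    mass_da: float                               # strand mass in daltons


def find_matches(database: List[DatabaseEntry],
                 primer_pair: int,
                 measured_mass_da: float,
                 tolerance_ppm: float = 25.0) -> List[DatabaseEntry]:
    """Return entries for the given primer pair whose stored mass lies
    within tolerance_ppm of the measured mass (tolerance is an assumption)."""
    matches = []
    for entry in database:
        if entry.primer_pair != primer_pair:
            continue  # only compare amplicons made with the same primer pair
        error_ppm = abs(measured_mass_da - entry.mass_da) / entry.mass_da * 1e6
        if error_ppm <= tolerance_ppm:
            matches.append(entry)
    return matches
```

For example, with placeholder entries `DatabaseEntry(1, "organism X", (25, 24, 23, 25), 30000.0)` and `DatabaseEntry(1, "organism Y", (26, 24, 22, 25), 30015.0)`, a measured mass of 30000.3 Da for primer pair 1 would match only the "organism X" entry. A single match identifies the bioagent; an empty or multi-entry result would call for further primer pairs.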
[87] In some embodiments, the identification in the method comprises detecting the presence or absence of a bacterial bioagent in a sample. In another embodiment, it comprises determining the presence or absence of virulence of the bioagent in the sample. In another embodiment, the identifying comprises identifying one or more sub-species characteristics of the bioagent in the sample. In another embodiment, the identifying comprises determining sensitivity or resistance of the bioagent to a drug, preferably an antibiotic.
[88] In some embodiments, the methods are for determination of the quantity of an unknown bacterial bioagent in a sample. The sample is contacted with the primer pair and a known quantity of a calibration polynucleotide comprising a calibration sequence. Nucleic acid from the unknown bioagent in the sample is concurrently amplified with the composition described above and nucleic acid from the calibration polynucleotide in the sample is concurrently amplified with the composition described above to obtain a first amplification product comprising a bacterial bioagent identifying amplicon and a second amplification product comprising a calibration amplicon. The molecular masses and abundances for the bacterial bioagent identifying amplicon and the calibration amplicon are determined. The bacterial bioagent identifying amplicon is distinguished from the calibration amplicon based on molecular mass and comparison of bacterial bioagent identifying amplicon abundance and calibration amplicon abundance indicates the quantity of bacterium in the sample. In some embodiments, the base composition of the bacterial bioagent identifying amplicon is determined.
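By way of illustration only, the comparison of abundances described above can be sketched as a simple ratio calculation. This sketch assumes that the bioagent identifying amplicon and the calibration amplicon amplify with equal efficiency, so that the ratio of their measured abundances (for example, peak heights) equals the ratio of starting template quantities; that proportionality and the function name are assumptions made for the example.

```python
def estimate_bioagent_quantity(calibrant_quantity: float,
                               bioagent_abundance: float,
                               calibrant_abundance: float) -> float:
    """Estimate the starting quantity of bioagent template.

    calibrant_quantity   known quantity of calibration polynucleotide added
    bioagent_abundance   measured abundance of the bioagent identifying amplicon
    calibrant_abundance  measured abundance of the calibration amplicon

    Assumes abundance is proportional to starting template for both amplicons.
    """
    if calibrant_abundance <= 0:
        raise ValueError("calibration amplicon was not detected")
    if bioagent_abundance < 0:
        raise ValueError("abundance cannot be negative")
    return calibrant_quantity * bioagent_abundance / calibrant_abundance
```

Under these assumptions, a calibrant added at 100 copies that yields a peak half the height of the bioagent peak would indicate roughly 200 copies of bioagent template in the reaction.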
[89] In some embodiments, the methods comprise detecting or quantifying bacteria by combining a nucleic acid amplification process with molecular mass determination. In some embodiments, such methods identify or otherwise analyze the bacterium by comparing mass information from an amplification product with a calibration or control product. Such methods can be carried out in a highly multiplexed and/or parallel manner allowing for the analysis of as many as 300 samples per 24 hours on a single mass measurement platform. The accuracy of the mass determination methods in some embodiments provided herein permits discrimination between different bacteria such as, for example, various genotypes and drug resistant strains of Staphylococcus aureus.
BRIEF DESCRIPTION OF THE DRAWINGS
[90] The foregoing summary and the following detailed description are better understood when read in conjunction with the accompanying drawings which are included by way of example and not by way of limitation.
[91] Figure 1: process diagram illustrating a representative primer pair selection process.
[92] Figure 2: process diagram illustrating an embodiment of the calibration method.
[93] Figure 3: common pathogenic bacteria and primer pair coverage. The primer pair number in the upper right hand corner of each polygon indicates that the primer pair can produce a bioagent identifying amplicon for all species within that polygon.
[94] Figure 4: a representative 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples (labeled NHRC samples) closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.
[95] Figure 5: a representative mass spectrum of amplification products indicating the presence of bioagent identifying amplicons of Streptococcus pyogenes, Neisseria meningitidis, and Haemophilus influenzae obtained from amplification of nucleic acid from a clinical sample with primer pair number 349 which targets 23S rRNA. Experimentally determined molecular masses and base compositions for the sense strand of each amplification product are shown.
[96] Figure 6: a representative mass spectrum of amplification products representing a bioagent identifying amplicon of Streptococcus pyogenes, and a calibration amplicon obtained from amplification of nucleic acid from a clinical sample with primer pair number 356 which targets rplB. The experimentally determined molecular mass and base composition for the sense strand of the Streptococcus pyogenes amplification product is shown.
[97] Figure 7: a representative mass spectrum of an amplified nucleic acid mixture which contained the Ames strain of Bacillus anthracis, a known quantity of combination calibration polynucleotide (SEQ ID NO: 1464), and primer pair number 350 which targets the capC gene on the virulence plasmid pXO2 of Bacillus anthracis. Calibration amplicons produced in the amplification reaction are visible in the mass spectrum as indicated and abundance data (peak height) are used to calculate the quantity of the Ames strain of Bacillus anthracis.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[98] As used herein, the term "abundance" refers to an amount. The amount may be described in terms of concentration, using units that are common in molecular biology such as "copy number" or "pfu" (plaque-forming unit), which are well known to those with ordinary skill. Concentration may be relative to a known standard or may be absolute. In some embodiments, the primer pairs and methods provided herein determine the abundance of one or more bioagents in a sample.
[99] As used herein, the term "amplifiable nucleic acid" is used in reference to nucleic acids that may be amplified by any amplification method. It is contemplated that "amplifiable nucleic acid" also comprises "sample template."
[100] As used herein, the term "amplification reagents" refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.), needed for amplification, excluding primers, nucleic acid template, and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).
[101] As used herein, the term "analogous" when used in context of comparison of bioagent identifying amplicons indicates that the bioagent identifying amplicons being compared are produced with the same pair of primers. For example, bioagent identifying amplicon "A" and bioagent identifying amplicon "B", produced with the same pair of primers are analogous with respect to each other. Bioagent identifying amplicon "C", produced with a different pair of primers is not analogous to either bioagent identifying amplicon "A" or bioagent identifying amplicon "B".
[102] As used herein, the term "anion exchange functional group" refers to a positively charged functional group capable of binding an anion through an electrostatic interaction. The most well known anion exchange functional groups are the amines, including primary, secondary, tertiary and quaternary amines.
[103] The term "bacteria" or "bacterium" refers to any member of the groups of eubacteria and archaebacteria.
[104] As used herein, a "base composition probability cloud" is a representation of the diversity in base composition resulting from a variation in sequence that occurs among different isolates of a given species. The "base composition probability cloud" represents the base composition constraints for each species and is typically visualized using a pseudo four-dimensional plot.
[105] Herein, a "bioagent" is any organism, cell, or virus, living or dead, or a nucleic acid derived from such an organism, cell or virus. Examples of bioagents include, but are not limited to, cells (including but not limited to human clinical samples, bacterial cells and other pathogens), viruses, fungi, protists, parasites, and pathogenicity markers (including but not limited to: pathogenicity islands, antibiotic resistance genes, virulence factors, toxin genes and other bioregulating compounds). Samples may be alive or dead or in a vegetative state (for example, vegetative bacteria or spores) and may be encapsulated or bioengineered. Herein, a "pathogen" is a bioagent which causes a disease or disorder.
[106] As used herein, the term "unknown bioagent" can mean: (i) a bioagent whose existence is not known (for example, the SARS coronavirus was unknown prior to April 2003), which is also called a "true unknown bioagent," and/or (ii) a bioagent whose existence is known (such as the well known bacterial species Staphylococcus aureus, for example) but which is not known to be in a sample to be analyzed, and/or (iii) a bioagent that is known or suspected of being present in a sample but whose sub-species characteristics are not known (such as a bacterial resistance genotype like the QRDR region of Staphylococcus aureus species). For example, if the method for identification of coronaviruses disclosed in commonly owned U.S. Pre-Grant Publication No. US2005-0266397 (incorporated herein by reference in its entirety) was to be employed prior to April 2003 to identify the SARS coronavirus in a clinical sample, both meanings (i) and (ii) of "unknown" bioagent are applicable since the SARS coronavirus was unknown to science prior to April 2003 and since it was not known what bioagent (in this case a coronavirus) was present in the sample. On the other hand, if the method of U.S. Pre-Grant Publication No. US2005-0266397 was to be employed subsequent to April 2003 to identify the SARS coronavirus in a clinical sample, only the second meaning (ii) of "unknown" bioagent would apply because the SARS coronavirus became known to science subsequent to April 2003 but it was still not known what bioagent was present in the sample.
[107] As used herein, a "bioagent division" is defined as a group of bioagents above the species level and includes, but is not limited to, orders, families, genus, classes, clades, genera or other such groupings of bioagents above the species level.
[108] Herein, a "pathogen" is a bioagent which causes a disease or disorder.
[109] The term "virus" refers to obligate, ultramicroscopic, parasites that are incapable of autonomous replication (i.e., replication requires the use of the host cell's machinery). Viruses can survive outside of a host cell but cannot replicate.
[110] As used herein, the term "biological product" refers to any product originating from an organism. Biological products are often products of processes of biotechnology. Examples of biological products include, but are not limited to: cultured cell lines, cellular components, antibodies, proteins and other cell-derived biomolecules, growth media, growth harvest fluids, natural products and bio-pharmaceutical products.
[111] The terms "biowarfare agent" and "bioweapon" are synonymous and refer to a bacterium, virus, fungus or protozoan that could be deployed as a weapon to cause bodily harm to individuals. Military or terrorist groups may be implicated in deployment of biowarfare agents.
[112] The term "calibration amplicon" refers to a nucleic acid segment representing an amplification product obtained by amplification of a calibration sequence with a pair of primers configured to produce a bioagent identifying amplicon.
[113] The term "calibration sequence" refers to a polynucleotide sequence to which a given pair of primers hybridizes for the purpose of producing an internal (i.e., included in the reaction) calibration standard amplification product for use in determining the quantity of a bioagent in a sample. The calibration sequence may be expressly added to an amplification reaction, or may already be present in the sample prior to analysis.
[114] The term "codon" refers to a set of three adjoined nucleotides (triplet) that codes for an amino acid or a termination signal.
[115] Herein, the term "codon base composition analysis," refers to determination of the base composition of an individual codon by obtaining a bioagent identifying amplicon that includes the codon. The bioagent identifying amplicon will at least include regions of the target nucleic acid sequence to which the primers hybridize for generation of the bioagent identifying amplicon as well as the codon being analyzed, located between the two primer hybridization regions.
[116] As used herein, "primer pairs," or "oligonucleotide primer pairs" are synonymous terms referring to pairs of oligonucleotides (herein called "primers" or "oligonucleotide primers") that are configured to bind to conserved sequence regions of a bioagent nucleic acid (that is conserved among two or more bioagents) and to generate bioagent identifying amplicons. The bound primers flank an intervening variable region of the bioagent between the conserved sequences. Upon amplification, the primer pairs yield amplicons that provide base composition variability between two or more bioagents. The variability of the base compositions allows for the identification of one or more individual bioagents from two or more bioagents based on the base composition distinctions. The primer pairs are also configured to generate amplicons that are amenable to molecular mass analysis. Each primer pair comprises two primer pair members. The primer pair members are a "forward primer" ("forward primer pair member," or "forward member"), which comprises at least a percentage of sequence identity with the top strand of the reference sequence used in configuring the primer pair, and a "reverse primer" ("reverse primer pair member" or "reverse member"), which comprises at least a percentage of reverse complementarity with the top strand of the reference sequence used in configuring the primer pair. Primer pair configuration is well known in the art and is described in detail herein.
[117] Primer pair nomenclature, as used herein, includes the identification of a reference sequence. For example, the forward primer for primer pair number 3106 is named TSST1_NC002758.2-2137509-2138213_519_546_F. This forward primer name indicates that the forward primer ("_F") hybridizes to residues 519-546 ("519_546") of a reference sequence, which in this case is represented by a sequence extraction of coordinates 2137509-2138213 from GenBank gi number 57634611 (corresponding to the GenBank number NC002758.2, as is indicated by the prefix "TSST1_NC002758.2" and cross-reference in Table 3). In the case of this primer, the reference sequence is the gene within a Staphylococcus aureus genome encoding tsst1. Primer pair name codes for the primers provided herein are defined in Table 3, which lists gene abbreviations and GenBank gi numbers that correspond with each primer name code. Sequences of the primers are also provided. One of skill in the art will understand how to determine exact hybridization coordinates of primers with respect to GenBank sequences, given the information provided herein. The primer pairs are, however, selected and configured to hybridize with two or more bioagents. So, the reference sequence in the primer name is used merely to provide a reference, and not to indicate that the primers are selected and configured to hybridize with and generate a bioagent identifying amplicon only from the reference sequence. Rather, the primers hybridize with and generate amplicons from a number of sequences. Further, the sequences of the primer members of the primer pairs are not necessarily fully complementary to the conserved region of the reference bioagent. Rather, the sequences are configured to be "best fit" amongst a plurality of bioagents at these conserved binding sequences. Therefore, the primer members of the primer pairs have substantial complementarity with the conserved regions of the bioagents, including the reference bioagent.
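By way of illustration only, a primer name of the form shown above can be split into its parts with a short Python routine. The field names and the regular expression below are assumptions made for this example; they cover only names laid out as gene code, accession, optional extraction coordinates, residue range and direction, and are not a definition of every name code listed in Table 3.

```python
import re
from typing import NamedTuple, Optional


class PrimerName(NamedTuple):
    gene: str
    accession: str
    extraction_start: Optional[int]  # start of the extracted reference region
    extraction_end: Optional[int]    # end of the extracted reference region
    first_residue: int               # first residue of the hybridization site
    last_residue: int                # last residue of the hybridization site
    direction: str                   # "F" (forward) or "R" (reverse)


# Hypothetical pattern: GENE_ACCESSION[-start-end]_first_last_F|R
_NAME_PATTERN = re.compile(
    r"^(?P<gene>[^_]+)_(?P<accession>[A-Za-z]+\d+(?:\.\d+)?)"
    r"(?:-(?P<ext_start>\d+)-(?P<ext_end>\d+))?"
    r"_(?P<first>\d+)_(?P<last>\d+)_(?P<direction>[FR])$"
)


def parse_primer_name(name: str) -> PrimerName:
    """Split a primer name into gene, accession, coordinates and direction."""
    match = _NAME_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    ext_start = match.group("ext_start")
    ext_end = match.group("ext_end")
    return PrimerName(
        gene=match.group("gene"),
        accession=match.group("accession"),
        extraction_start=int(ext_start) if ext_start else None,
        extraction_end=int(ext_end) if ext_end else None,
        first_residue=int(match.group("first")),
        last_residue=int(match.group("last")),
        direction=match.group("direction"),
    )
```

The length of the hybridization site follows from the parsed residue range as `last_residue - first_residue + 1`.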
[118] The primers provided herein are configured to hybridize within conserved sequence regions of bioagent nucleic acids, which are conserved among two or more bioagents, that preferably flank an intervening variable region, which varies among two or more bioagents, and, upon amplification, yield amplification products which ideally provide enough variability to distinguish individual bioagents, and which are amenable to molecular mass analysis. In a preferred embodiment, the conserved sequence regions are highly conserved sequence regions. By the term "highly conserved," it is meant that the sequence regions exhibit between about 80-100%, or between about 90-100%, or between about 95-100% identity among all, or at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% of species or strains. The molecular mass of a given amplification product provides a means of identifying the bioagent from which it was obtained, due to the variability of the variable region, which preferably results in amplicons that vary in base composition among bioagents, for example, among different species or strains. Thus configuring of the primers involves selection of a variable region with appropriate variability to resolve the identity of a given bioagent. Bioagent identifying amplicons are ideally specific to the identity of the bioagent.
[119] As used herein, the term "variable region" is used to describe a region that is flanked by the two conserved sequence regions to which the primers of a primer pair hybridize. In other words, the variable region is a region that is flanked by the primers of any one primer pair described herein. The region possesses distinct base compositions among at least two bioagents, such that at least one bioagent can be identified at the family, genus, species or sub-species level using the primer pairs and the methods provided herein. The degree of variability between the at least two bioagents need only be sufficient to allow for identification using mass spectrometry analysis, as described herein. Such a difference can be as slight as a single nucleotide difference occurring between two bioagents.
[120] Methods of oligonucleotide primer pair design are well known. One of skill in the art will understand that primer pairs configured to prime amplification of a double stranded sequence are configured and named using one strand of the double stranded sequence as a reference. The forward primer is the primer of the pair that comprises full or partial sequence identity to the one strand (usually the coding, or sense strand) of the sequence being used as a reference. The reverse primer is the primer of the pair that comprises reverse complementarity to the one strand of the sequence being used as a reference.
[121] In one embodiment, the "plus" or "top" strand (the primary sequence as submitted to GenBank) of the nucleic acid to which the primers hybridize is used as a reference when designing primer pairs. In this case, the forward primer will comprise identity and the reverse primer will comprise reverse complementarity, to the sequence listed in GenBank for the reference sequence. In some embodiments, the primer pair is configured using the "minus" or "bottom" strand (reverse complement of the primary sequence as submitted to and listed in GenBank). In this case, the forward primer comprises sequence identity to the minus strand, and thus comprises reverse complementarity to the top strand, the sequence listed in GenBank. Similarly, in this case, the reverse primer comprises reverse complementarity to the minus strand, and thus comprises identity to the top strand. The ordinarily skilled artisan will know how to design the primers provided herein armed with this disclosure.
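By way of illustration only, the relationship between a top-strand reference and the two primers can be sketched in Python. The sketch assumes the top strand is the reference, uses 1-based inclusive coordinates, and handles only the unmodified bases A, C, G and T; the function names and the short sequence in the usage note are made up for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement, written 5' to 3' (A/C/G/T only)."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def primers_from_top_strand(top_strand: str,
                            forward_first: int, forward_last: int,
                            reverse_first: int, reverse_last: int):
    """Derive a primer pair from 1-based inclusive top-strand coordinates.

    The forward primer is identical to the top strand at its site; the
    reverse primer is the reverse complement of the top strand at its site.
    """
    top = top_strand.upper()
    forward = top[forward_first - 1:forward_last]
    reverse = reverse_complement(top[reverse_first - 1:reverse_last])
    return forward, reverse
```

For the made-up 21-residue top strand `ATGCCGTTAGCATCGGATCCA`, sites 1-5 and 17-21 give the forward primer `ATGCC` and the reverse primer `TGGAT`. If the minus strand were used as the reference instead, the same routine would be applied to `reverse_complement(top_strand)`.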
[122] In a preferred embodiment, the primer pairs may be configured to generate an amplicon from "within a region" of a particular SEQ ID NO., which may comprise a specific region of the GenBank gi No. to which the primers were configured. Configuring a primer pair to generate an amplicon from "within a region" of a particular nucleic acid means that each primer of the pair hybridizes to a portion of the reference sequence that is within that region. However, one of ordinary skill in the art of primer design will understand that shifting the coordinates of the portion of a reference sequence to which one or both primers hybridizes slightly, in one direction or the other relative to the region given, such that the portion is not entirely within the region, will often result in an equally effective primer pair. Such primer pairs are also encompassed by this description.
[123] The term "clade primer pair" refers to a primer pair configured to produce bioagent identifying amplicons for species belonging to a clade group. A clade primer pair may also be considered as a "speciating" primer pair which is useful for distinguishing among closely related species.
[124] In some embodiments, the primer pairs comprise "broad range survey primers," primers configured to identify an unknown bioagent as a member of a particular division (e.g., an order, family, class, clade, or genus). However, in some cases the broad range survey primers are also able to identify unknown bioagents at the species or sub-species level. In other embodiments, the primer pairs comprise "division-wide primers," configured to identify a bioagent at the species level. In some embodiments, the primer pairs comprise "drill-down" primers, configured to identify a bioagent at the sub-species level. As used herein, the "sub-species" level of identification includes, but is not limited to, strains, subtypes, variants, and isolates. Drill-down primers are not always required for identification at the sub-species level because broad range survey intelligent primers may, in some cases, provide sufficient identification resolution to accomplish this identification objective.
[125] Herein, the term "speciating primer pair" refers to a primer pair configured to produce a bioagent identifying amplicon with the diagnostic capability of identifying species members of a group of genera or a particular genus of bioagents. Primer pair number 2249 (SEQ ID NOs: 430:1321), for example, is a speciating primer pair used to distinguish Staphylococcus aureus from other species of the genus Staphylococcus.
[126] As used herein, a "sub-species characteristic" is a genetic characteristic that provides the means to distinguish two members of the same bioagent species. For example, one viral strain could be distinguished from another viral strain of the same species by possessing a genetic change (e.g., a nucleotide deletion, addition or substitution) in one of the viral genes, such as the RNA-dependent RNA polymerase. Sub-species characteristics such as virulence genes and drug resistance genes are responsible for the phenotypic differences among the different strains of bacteria.
[127] Properties of the primers may include any number of properties related to structure including, but not limited to: nucleobase length, which may be contiguous (linked together) or noncontiguous (for example, two or more contiguous segments which are joined by a linker or loop moiety); modified or universal nucleobases (used for specific purposes such as, for example, increasing hybridization affinity, preventing non-templated adenylation and modifying molecular mass); and percent complementarity to a given target sequence.
[128] As used herein, the terms "complementary" or "complementarity" are used in reference to polynucleotides (i.e., a sequence of nucleotides such as an oligonucleotide or a target nucleic acid) related by the base-pairing rules. For example, the sequence "5'-A-G-T-3'" is complementary to the sequence "3'-T-C-A-5'." Complementarity may be "partial," in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be "complete" or "total" complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods that depend upon binding between nucleic acids. Either term may also be used in reference to individual nucleotides, especially within the context of polynucleotides. For example, a particular nucleotide within an oligonucleotide may be noted for its complementarity, or lack thereof, to a nucleotide within another nucleic acid strand, in contrast or comparison to the complementarity between the rest of the oligonucleotide and the nucleic acid strand.
[129] The term "complement of a nucleic acid sequence" as used herein refers to an oligonucleotide which, when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired with the 3' end of the other, is in "antiparallel association." Complementarity relates to base pairing ability. A nucleobase that is complementary to another nucleobase can base pair with that other nucleobase. Certain bases not commonly found in natural nucleic acids may be included in the nucleic acids provided herein, and include, for example, inosine and 7-deazaguanine. Complementarity need not be perfect; stable duplexes may contain mismatched base pairs or unmatched bases. Those skilled in the art of nucleic acid technology can determine duplex stability empirically considering a number of variables including, for example, the length of the oligonucleotide, base composition and sequence of the oligonucleotide, ionic strength and incidence of mismatched base pairs. Where a first oligonucleotide is complementary to a region of a target nucleic acid and a second oligonucleotide has complementarity to the same region (or a portion of this region) a "region of overlap" exists along the target nucleic acid. The degree of overlap will vary depending upon the extent of the complementarity.
[130] As used herein, the term "substantial complementarity" means that a primer member of a primer pair comprises between about 70-100%, or between about 80-100%, or between about 90-100%, or between about 95-100%, or between about 99-100% sequence identity with the conserved binding sequence of any given bioagent. These ranges of identity are inclusive of all whole or partial numbers embraced within the recited range numbers. For example, and not limitation, 75.667%, 82%, 91.2435% and 97% sequence identity are all numbers that fall within the above recited range of 70% to 100%, therefore forming a part of this description.
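By way of illustration only, a percent sequence identity of the kind recited above can be computed as follows. This sketch assumes a simple ungapped, position-by-position comparison of a primer against a binding sequence of the same length; this passage does not prescribe a particular alignment method, so both the method and the function names are assumptions made for the example.

```python
def percent_identity(primer: str, binding_sequence: str) -> float:
    """Percentage of positions at which two equal-length sequences agree.

    Ungapped, position-by-position comparison (an assumption of this sketch).
    """
    if not primer or len(primer) != len(binding_sequence):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b
                  for a, b in zip(primer.upper(), binding_sequence.upper()))
    return 100.0 * matches / len(primer)


def has_substantial_identity(primer: str, binding_sequence: str,
                             minimum_percent: float = 70.0) -> bool:
    """True if the identity falls within minimum_percent to 100%, inclusive."""
    return percent_identity(primer, binding_sequence) >= minimum_percent
```

For example, a 20-residue primer that differs from the binding sequence at 3 positions has 85% identity, which falls within each of the recited ranges that begin at 70% or 80%.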
[131] As used herein, the terms "amplicon" and "bioagent identifying amplicon" refer to a nucleic acid generated using the primer pairs described herein. The amplicon is preferably double stranded DNA; however, it may be RNA and/or DNA:RNA. The amplicon comprises the sequences of the conserved regions/primer pairs and the intervening variable region. Since the primer pairs provided herein are configured such that two or more different bioagents, when amplified with a given primer pair, will yield amplicons with unique base composition signatures, the base composition signatures can be used to identify bioagents based on association with amplicons. As discussed herein, primer pairs are configured to generate amplicons from two or more bioagents. As such, the base composition of any given amplicon will include the primer pair, the complement of the primer pair, the conserved regions and the variable region from the bioagent that was amplified to generate the amplicon. One skilled in the art understands that the incorporation of the configured primer pair sequences into any amplicon will replace the native bioagent sequences at the primer binding site, and complement thereof. After amplification of the target region using the primers the resultant amplicons having the primer sequences generate the molecular mass data. Amplicons having any native bioagent sequences at the primer binding sites, or complement thereof, are undetectable because of their low abundance. Such is accounted for when identifying one or more bioagents using any particular primer pair. The amplicon further comprises a length that is compatible with mass spectrometry analysis. Bioagent identifying amplicons generate base composition signatures that are preferably unique to the identity of a bioagent.
[132] As used herein, the term "molecular mass" refers to the mass of a compound as determined using mass spectrometry. Herein, the compound is preferably a nucleic acid, more preferably a double stranded nucleic acid, still more preferably a double stranded DNA nucleic acid and is most preferably an amplicon. When the nucleic acid is double stranded the molecular mass is determined for both strands. Here, the strands are separated either before introduction into the mass spectrometer, or the strands are separated by the mass spectrometer (for example, electro-spray ionization will separate the hybridized strands). The molecular mass of each strand is measured by the mass spectrometer. The term "mass spectrometry" refers to measurement of the mass of atoms or molecules. The molecules are first converted to ions, which are separated using electric or magnetic fields according to the ratio of their mass to electric charge. The measured masses are used to identify the molecules.
[133] As used herein, the term "base composition" refers to the number of each residue comprising an amplicon, without consideration for the linear arrangement of these residues in the strand(s) of the amplicon. The amplicon residues comprise adenosine (A), guanosine (G), cytidine (C), (deoxy)thymidine (T), uracil (U), inosine (I), nitroindoles such as 5-nitroindole or 3-nitropyrrole, dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056), the purine analog 1-(2-deoxy-β-D-ribofuranosyl)-imidazole-4-carboxamide, 2,6-diaminopurine, 5-propynyluracil, 5-propynylcytosine, phenoxazines, including G-clamp, 5-propynyl deoxy-cytidine, deoxy-thymidine nucleotides, 5-propynylcytidine, 5-propynyluridine and mass tag modified versions thereof, including 7-deaza-2'-deoxyadenosine-5'-triphosphate, 5-iodo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxycytidine-5'-triphosphate, 5-iodo-2'-deoxycytidine-5'-triphosphate, 5-hydroxy-2'-deoxyuridine-5'-triphosphate, 4-thiothymidine-5'-triphosphate, 5-aza-2'-deoxyuridine-5'-triphosphate, 5-fluoro-2'-deoxyuridine-5'-triphosphate, O6-methyl-2'-deoxyguanosine-5'-triphosphate, N2-methyl-2'-deoxyguanosine-5'-triphosphate, 8-oxo-2'-deoxyguanosine-5'-triphosphate or thiothymidine-5'-triphosphate. In some embodiments, the mass-modified nucleobase comprises ¹⁵N or ¹³C or both ¹⁵N and ¹³C. Preferably, the non-natural nucleosides used herein include 5-propynyluracil, 5-propynylcytosine and inosine. Herein the base composition for an unmodified DNA amplicon is notated as A_wG_xC_yT_z, wherein w, x, y and z are each independently a whole number representing the number of said nucleoside residues in an amplicon.
Base compositions for amplicons comprising modified nucleosides are similarly notated to indicate the number of said natural and modified nucleosides in an amplicon. Base compositions are calculated from a molecular mass measurement of an amplicon, as described below. The calculated base composition for any given amplicon is then compared to a database of base compositions. A match between the calculated base composition and a single database entry reveals the identity of the bioagent.
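As an illustration of the notation only, the following minimal Python sketch counts the residues of a single unmodified DNA strand whose sequence is already known and prints the result in the A.sub.wG.sub.xC.sub.yT.sub.z style. The function names and the example strand are hypothetical and are not taken from this specification; in the methods described herein the base composition is derived from a measured molecular mass rather than from a known sequence.

```python
from collections import Counter


def base_composition(strand: str) -> dict:
    """Count A, G, C and T residues in one strand, ignoring their order."""
    counts = Counter(strand.upper())
    return {base: counts.get(base, 0) for base in "AGCT"}


def notate(composition: dict) -> str:
    """Render a composition as 'Aw Gx Cy Tz' (plain-text form of the notation)."""
    return " ".join(f"{base}{composition[base]}" for base in "AGCT")


# Hypothetical 12-residue strand, for illustration only.
example = "ATGCGGATTACA"
print(notate(base_composition(example)))  # prints "A4 G3 C2 T3"

# Rearranging the residues changes the sequence but not the base composition.
shuffled = "".join(sorted(example))
print(base_composition(shuffled) == base_composition(example))  # prints True
```

The second check reflects the definition above: the base composition disregards the linear arrangement of the residues.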
[134] As used herein, the term "base composition signature" refers to the base composition generated by any one particular amplicon. The base composition signature for each of one or more amplicons provides a fingerprint for identifying the bioagent(s) present in a sample.
[135] As used herein, the term "database" is used to refer to a collection of base composition and/or molecular mass data. The base composition and/or molecular mass data in the database is indexed to bioagents and to primer pairs. The base composition data reported in the database comprises the number of each nucleoside in an amplicon that would be generated for each bioagent using each primer pair. The database can be populated by empirical data. In this aspect of populating the database, a bioagent is selected and a primer pair is used to generate an amplicon. The amplicon's molecular mass is determined using a mass spectrometer and the base composition is calculated therefrom. An entry in the database is made to associate the base composition and/or molecular mass with the bioagent and the primer pair used. The database may also be populated using other databases comprising bioagent information. For example, using the GenBank database it is possible to perform electronic PCR using an electronic representation of a primer pair. This in silico method will provide the base composition for any or all selected bioagent(s) stored in the GenBank database. The information is then used to populate the base composition database as described above. A base composition database can be in silico, a written table, a reference book, a spreadsheet or any form generally amenable to databases. Preferably, it is in silico. The database can similarly be populated with molecular masses that are gathered empirically or calculated from other sources such as GenBank.
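The in silico route to populating such a database can be sketched as follows. This is a simplified illustration under stated assumptions, not the electronic PCR tooling referred to above: primers must match the stored sequence exactly, only the top strand is stored, and all sequences, bioagent labels and primer pair labels are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def electronic_pcr(template: str, forward: str, reverse: str):
    """Return the top-strand amplicon bounded by an exact-match primer pair, or None.

    Assumption: no mismatches are tolerated, unlike practical ePCR tools.
    """
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the bottom strand, so its reverse
    # complement is what appears on the stored top strand.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]


def populate(genomes: dict, primer_pairs: dict) -> dict:
    """Index (A, G, C, T) counts by (bioagent, primer pair), as the text describes."""
    database = {}
    for bioagent, genome in genomes.items():
        for pair_id, (forward, reverse) in primer_pairs.items():
            amplicon = electronic_pcr(genome, forward, reverse)
            if amplicon is not None:
                counts = Counter(amplicon)
                database[(bioagent, pair_id)] = tuple(counts.get(b, 0) for b in "AGCT")
    return database


# Hypothetical stored sequences and a hypothetical primer pair.
genomes = {
    "agent_1": "TTGACCGTAAGCTTAGGCATCGGA",
    "agent_2": "GACCGTAAGGTTAGGCATCGG",
    "agent_3": "AAAAAAAAAAAAAAAAAAAA",  # no priming sites, so no entry is made
}
primer_pairs = {"pair_1": ("GACCGT", "CCGATG")}

for key, composition in populate(genomes, primer_pairs).items():
    print(key, composition)
```

Keying each entry by both the bioagent and the primer pair mirrors the indexing described above, so that a measured base composition can later be looked up for the primer pair that produced it.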
[136] As used herein, the term "nucleobase" is synonymous with other terms in use in the art including "nucleotide," "deoxynucleotide," "nucleotide residue," "deoxynucleotide residue," "nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate (dNTP)." As used herein, a nucleobase includes natural and modified residues, as described herein.
[137] As used herein, a "wobble base" is a variation in a codon found at the third nucleotide position of a DNA triplet. Variations in conserved regions of sequence are often found at the third nucleotide position due to redundancy in the amino acid code.
[138] As used herein, "housekeeping gene" refers to a gene encoding a protein or RNA involved in basic functions required for survival and reproduction of a bioagent. Housekeeping genes include, but are not limited to, genes encoding RNA or proteins involved in translation, replication, recombination and repair, transcription, nucleotide metabolism, amino acid metabolism, lipid metabolism, energy generation, uptake, secretion and the like. In some embodiments, the primers are configured to produce amplicons from within a housekeeping gene.
[139] As used herein, a "sub-species characteristic" is a genetic characteristic that provides the means to distinguish two members of the same bioagent species. For example, one bacterial strain could be distinguished from another bacterial strain of the same species by possessing a genetic change (e.g., a nucleotide deletion, addition or substitution) in one of the bacterial genes, for example, a gene conferring drug resistance or virulence.
[140] As used herein, "triangulation identification" means the employment of more than one primer pair to generate a corresponding amplicon for identification of a bioagent. The more than one primer pair can be used in individual wells or in a multiplex PCR assay. Alternatively, PCR reactions may be carried out in single wells comprising a different primer pair in each well. Following amplification, the amplicons are pooled into a single well or container which is then subjected to molecular mass analysis. The combination of pooled amplicons can be chosen such that the expected ranges of molecular masses of individual amplicons are not overlapping and thus will not complicate identification of signals. Triangulation works as a process of elimination, wherein a first primer pair identifies that an unknown bioagent may be one of a group of bioagents. Subsequent primer pairs are used in triangulation identification to further refine the identity of the bioagent amongst the subset of possibilities generated with the earlier primer pair. Triangulation identification is complete when the identity of the bioagent is determined. The triangulation identification process is also used to reduce false negative and false positive signals, and to enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al., J. Appl. Microbiol., 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event. In one example, a first pair of primers might determine that a given bioagent is a member of the Staphylococcus genus. A second primer pair may identify the bioagent as a member of the Staphylococcus aureus species, while a third primer pair may identify a sub-species characteristic of the bioagent, for example, resistance to a particular antibiotic or strain information.
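The process of elimination described above can be sketched as a repeated set intersection. The following Python fragment is an illustrative assumption about how such a lookup might be organized; the database layout, the labels and the base compositions are hypothetical and carry no experimental meaning.

```python
def triangulate(measurements, database):
    """Narrow the candidate bioagents by intersecting matches from successive primer pairs.

    measurements: ordered list of (primer_pair_id, measured (A, G, C, T) counts)
    database:     {(bioagent, primer_pair_id): (A, G, C, T) counts}
    """
    candidates = None
    for pair_id, measured in measurements:
        matches = {
            agent
            for (agent, pid), composition in database.items()
            if pid == pair_id and composition == measured
        }
        candidates = matches if candidates is None else candidates & matches
        if len(candidates) <= 1:
            break  # identified, or no database entry is consistent
    return candidates if candidates is not None else set()


# Hypothetical database entries.
database = {
    ("agent_A", "pair_1"): (25, 30, 28, 22),
    ("agent_B", "pair_1"): (25, 30, 28, 22),
    ("agent_C", "pair_1"): (24, 31, 28, 22),
    ("agent_A", "pair_2"): (18, 20, 21, 19),
    ("agent_B", "pair_2"): (19, 20, 20, 19),
}

# The first primer pair leaves two candidates; the second resolves them.
measurements = [("pair_1", (25, 30, 28, 22)), ("pair_2", (19, 20, 20, 19))]
print(triangulate(measurements, database))  # prints {'agent_B'}
```

An empty result would indicate that no single database entry is consistent with all of the measurements, which is the kind of inconsistency the text associates with false signals or engineered bioagents.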
[141] As used herein, the term "triangulation genotyping analysis primer pair" is a primer pair configured to produce bioagent identifying amplicons for determining species types in a triangulation genotyping analysis.
[142] As is used herein, the term "single primer pair identification" means that one or more bioagents can be identified using a single primer pair. A base composition signature for an amplicon may singly identify one or more bioagents.
[143] As used herein, the term "etiology" refers to the causes or origins of diseases or abnormal physiological conditions.
[144] As used herein, the term "concurrently amplifying" used with respect to more than one amplification reaction refers to the act of simultaneously amplifying more than one nucleic acid in a single reaction mixture.
[145] The term "duplex" refers to the state of nucleic acids in which the base portions of the nucleotides on one strand are bound through hydrogen bonding to their complementary bases arrayed on a second strand. The condition of being in a duplex form reflects on the state of the bases of a nucleic acid. By virtue of base pairing, the strands of nucleic acid also generally assume the tertiary structure of a double helix, having a major and a minor groove. The assumption of the helical form is implicit in the act of becoming duplexed.
[146] The term "gene" refers to a DNA sequence that comprises control and coding sequences necessary for the production of an RNA having a non-coding function (e.g., a ribosomal or transfer RNA), a polypeptide or a precursor. The RNA or polypeptide can be encoded by a full length coding sequence or by any portion of the coding sequence so long as the desired activity or function is retained.
[147] The terms "homology," "homologous" and "sequence identity" refer to a degree of identity. There may be partial homology or complete homology. A partially homologous sequence is one that is less than 100% identical to another sequence. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but having two non-identical residues has 18 of 20 identical residues (18/20 = 0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20 = 0.75 or 75% sequence identity with the 20 nucleobase primer. Herein, sequence identity is meant to be properly determined when the query sequence and the subject sequence are both described and aligned in the 5' to 3' direction. Sequence alignment algorithms, such as BLAST, will return results in two different alignment orientations. In the Plus/Plus orientation, both the query sequence and the subject sequence are aligned in the 5' to 3' direction. On the other hand, in the Plus/Minus orientation, the query sequence is in the 5' to 3' direction while the subject sequence is in the 3' to 5' direction. It should be understood that with respect to the primers provided herein, sequence identity is properly determined when the alignment is designated as Plus/Plus. Sequence identity may also encompass alternate or modified nucleobases that perform in a functionally similar manner to the regular nucleobases adenine, thymine, guanine and cytosine with respect to hybridization and primer extension in amplification reactions. In a non-limiting example, if the 5-propynyl pyrimidines propyne C and/or propyne T replace one or more C or T residues in one primer which is otherwise identical to another primer in sequence and length, the two primers will have 100% sequence identity with each other.
In another non-limiting example, inosine (I) may be used as a replacement for G or T and effectively hybridize to C, A or U (uracil). Thus, if inosine replaces one or more G or T residues in one primer which is otherwise identical to another primer in sequence and length, the two primers will have 100% sequence identity with each other. Other such modified or universal bases may exist which would perform in a functionally similar manner for hybridization and amplification reactions and will be understood to fall within this definition of sequence identity.
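The two worked examples in this definition (18/20 and 15/20) can be reproduced with a short calculation. The sketch below is one possible reading, under these assumptions: both primers are written 5' to 3' (Plus/Plus), the alignment is ungapped, the shorter primer is slid along the longer one, and the number of identical residues at the best position is divided by the length of the longer primer. Functional equivalence of modified bases such as inosine or propynyl pyrimidines is not modeled, and the sequences are hypothetical.

```python
def sequence_identity(query: str, subject: str) -> float:
    """Fraction of identical residues, relative to the longer of two primers.

    Both sequences are assumed to be written 5' to 3' (Plus/Plus orientation).
    """
    if not query or not subject:
        raise ValueError("both sequences must be non-empty")
    shorter, longer = sorted((query.upper(), subject.upper()), key=len)
    best = 0
    # Slide the shorter primer along the longer one without gaps.
    for offset in range(len(longer) - len(shorter) + 1):
        window = longer[offset:offset + len(shorter)]
        best = max(best, sum(a == b for a, b in zip(shorter, window)))
    return best / len(longer)


# Hypothetical 20-nucleobase primer.
primer = "ACGTACGTACGTACGTACGT"

# Same length, two non-identical residues (first and last positions).
variant = "TCGTACGTACGTACGTACGA"
print(sequence_identity(variant, primer))  # prints 0.9

# A 15-nucleobase segment of the 20-nucleobase primer.
segment = primer[2:17]
print(sequence_identity(segment, primer))  # prints 0.75
```

Dividing by the length of the longer primer is what yields 75% in the second example even though every residue of the 15-nucleobase primer is matched.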
[148] As used herein, the term "hybridization" is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are influenced by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, and the Tm of the formed hybrid. "Hybridization" methods involve the annealing of one nucleic acid to another, complementary nucleic acid, i.e., a nucleic acid having a complementary nucleotide sequence. The ability of two polymers of nucleic acid containing complementary sequences to find each other and anneal through base pairing interaction is a well-recognized phenomenon. The initial observations of the "hybridization" process by Marmur and Lane, Proc. Natl. Acad. Sci. USA 46:453 (1960) and Doty et al., Proc. Natl. Acad. Sci. USA 46:461 (1960) have been followed by the refinement of this process into an essential tool of modern biology.
[149] As used herein, the term "polymerase chain reaction" ("PCR") refers to the method of K. B. Mullis (U.S. Patent Nos. 4,683,195, 4,683,202, and 4,965,188, hereby incorporated by reference), which describe a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers are then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing, and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one "cycle"; there can be numerous "cycles") to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the "polymerase chain reaction" (hereinafter "PCR"). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be "PCR amplified."
With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of 32P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process itself are, themselves, efficient templates for subsequent PCR amplifications.
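The effect of repeating the cycle can be illustrated with simple arithmetic. The sketch below assumes the commonly used idealization that each cycle multiplies the number of target copies by (1 + E), where E is a per-cycle efficiency between 0 and 1; this model and the function name are illustrative assumptions and are not stated in this specification.

```python
def amplified_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    Assumes every cycle multiplies the copy number by (1 + efficiency);
    efficiency = 1.0 corresponds to perfect doubling.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# A single starting copy under perfect doubling for 30 cycles.
print(amplified_copies(1, 30))  # prints 1073741824.0

# The same run at a hypothetical 90% per-cycle efficiency gives fewer copies.
print(amplified_copies(1, 30, efficiency=0.9) < amplified_copies(1, 30))  # prints True
```

Real reactions plateau as reagents are consumed, so the figure above is an upper bound on this simple model rather than a prediction.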
[150] The term "in silico" refers to processes taking place via computer calculations. For example, electronic PCR (ePCR) is a process analogous to ordinary PCR except that it is carried out using nucleic acid sequences and primer pair sequences stored on a computer formatted medium.
[151] The term "polymerase" refers to an enzyme having the ability to synthesize a complementary strand of nucleic acid from a starting template nucleic acid strand and free dNTPs.
[152] The term "polymerization means" or "polymerization agent" refers to any agent capable of facilitating the addition of nucleoside triphosphates to an oligonucleotide. Preferred polymerization means comprise DNA and RNA polymerases.
[153] As used herein, the terms "PCR product," "PCR fragment," and "amplification product" refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.
[154] As used herein, the term "mass-modifying tag" refers to any modification to a given nucleotide which results in an increase in mass relative to the analogous non-mass modified nucleotide. Mass-modifying tags can include heavy isotopes of one or more elements included in the nucleotide such as carbon-13 for example. Other possible modifications include addition of substituents such as iodine or bromine at the 5 position of the nucleobase for example.
[155] The term "microorganism" as used herein means an organism too small to be observed with the unaided eye and includes, but is not limited to, bacteria, viruses, protozoans, fungi, and ciliates.
[156] The term "multi-drug resistant" or "multiple-drug resistant" refers to a microorganism which is resistant to more than one of the antibiotics or antimicrobial agents used in the treatment of said microorganism.
[157] The term "non-template tag" refers to a stretch of at least three guanine or cytosine nucleobases of a primer used to produce a bioagent identifying amplicon which are not complementary to the template. A non-template tag is incorporated into a primer for the purpose of increasing the primer-duplex stability of later cycles of amplification by incorporation of extra G-C pairs which each have one additional hydrogen bond relative to an A-T pair.
[158] The term "nucleic acid sequence" as used herein refers to the linear composition of the nucleic acid residues A, T, C, G, U, or any modifications thereof, within an oligonucleotide, nucleotide or polynucleotide, and fragments or portions thereof, and to DNA or RNA of genomic or synthetic origin which may be single or double stranded, and represent the sense or antisense strand.
[159] As used herein, the term "nucleobase" is synonymous with other terms in use in the art including "nucleotide," "deoxynucleotide," "nucleotide residue," "deoxynucleotide residue," "nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate (dNTP)."
[160] The term "nucleotide analog" as used herein refers to modified or non-naturally occurring nucleotides such as 5-propynyl pyrimidines (i.e., 5-propynyl-dTTP and 5-propynyl-dCTP) and 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP). Nucleotide analogs include base analogs and comprise modified forms of deoxyribonucleotides as well as ribonucleotides.
[161] The term "oligonucleotide" as used herein is defined as a molecule comprising two or more deoxyribonucleotides or ribonucleotides, preferably at least 5 nucleotides, more preferably at least about 13 to 35 nucleotides. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. The oligonucleotide may be generated in any manner, including chemical synthesis, DNA replication, reverse transcription, PCR, or a combination thereof. Because mononucleotides are reacted to make oligonucleotides in a manner such that the 5' phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbor in one direction via a phosphodiester linkage, an end of an oligonucleotide is referred to as the "5'-end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose ring and as the "3'-end" if its 3' oxygen is not linked to a 5' phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide, also may be said to have 5' and 3' ends. A first region along a nucleic acid strand is said to be upstream of another region if the 3' end of the first region is before the 5' end of the second region when moving along a strand of nucleic acid in a 5' to 3' direction. All oligonucleotide primers disclosed herein are understood to be presented in the 5' to 3' direction when reading left to right. When two different, non-overlapping oligonucleotides anneal to different regions of the same linear complementary nucleic acid sequence, and the 3' end of one oligonucleotide points towards the 5' end of the other, the former may be called the "upstream" oligonucleotide and the latter the "downstream" oligonucleotide.
Similarly, when two overlapping oligonucleotides are hybridized to the same linear complementary nucleic acid sequence, with the first oligonucleotide positioned such that its 5' end is upstream of the 5' end of the second oligonucleotide, and the 3' end of the first oligonucleotide is upstream of the 3' end of the second oligonucleotide, the first oligonucleotide may be called the "upstream" oligonucleotide and the second oligonucleotide may be called the "downstream" oligonucleotide.
[162] As used herein, the terms "purified" or "substantially purified" refer to molecules, either nucleic or amino acid sequences, that are removed from their natural environment, isolated or separated, and are at least 60% free, preferably 75% free, and most preferably 90% free from other components with which they are naturally associated. An "isolated polynucleotide" or "isolated oligonucleotide" is therefore a substantially purified polynucleotide.
[163] The term "reverse transcriptase" refers to an enzyme having the ability to transcribe DNA from an RNA template. This enzymatic activity is known as reverse transcriptase activity. Reverse transcriptase activity is desirable in order to obtain DNA from RNA viruses which can then be amplified and analyzed by the methods provided herein.
[164] The term "ribosomal RNA" or "rRNA" refers to the primary ribonucleic acid constituent of ribosomes. Ribosomes are the protein-manufacturing organelles of cells and exist in the cytoplasm. Ribosomal RNAs are transcribed from the DNA genes encoding them.
[165] The term "sample" in the present specification and claims is used in its broadest sense. On the one hand it is meant to include a specimen or culture (e.g., microbiological cultures). On the other hand, it is meant to include both biological and environmental samples. A sample may include a specimen of synthetic origin. Biological samples may be animal, including human, fluid, solid (e.g., stool) or tissue, as well as liquid and solid food and feed products and ingredients such as dairy items, vegetables, meat and meat by-products, and waste. Biological samples may be obtained from all of the various families of domestic animals, as well as feral or wild animals, including, but not limited to, such animals as ungulates, bear, fish, lagomorphs, rodents, etc. Environmental samples include environmental material such as surface matter, soil, water, air and industrial samples, as well as samples obtained from food and dairy processing instruments, apparatus, equipment, utensils, disposable and non-disposable items. These examples are not to be construed as limiting the sample types applicable to embodiments provided herein. The term "source of target nucleic acid" refers to any sample that contains nucleic acids (RNA or DNA). Particularly preferred sources of target nucleic acids are biological samples including, but not limited to, blood, saliva, cerebrospinal fluid, pleural fluid, milk, lymph, sputum and semen.
[166] As used herein, the term "sample template" refers to nucleic acid originating from a sample that is analyzed for the presence of "target" (defined below). In contrast, "background template" is used in reference to nucleic acid other than sample template that may or may not be present in a sample. Background template is often a contaminant. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.
[167] A "segment" is defined herein as a region of nucleic acid within a target sequence.
[168] As used herein, the term "sequence alignment" refers to a listing of multiple DNA or amino acid sequences that are aligned to highlight their similarities. The listings can be made using bioinformatics computer programs.
[169] As used herein, the term "target" is used in a broad sense to indicate the gene or genomic region being amplified by the primers. Because a given primer pair provided herein is configured to generate a plurality of amplification products (depending on the bioagent being analyzed), multiple amplification products from different specific nucleic acid sequences may be obtained. Thus, the term "target" is not used to refer to a single specific nucleic acid sequence. The "target" is sought to be sorted out from other nucleic acid sequences and contains a sequence that has at least partial complementarity with an oligonucleotide primer. The target nucleic acid may comprise single- or double-stranded DNA or RNA. Primers herein can be targeted to, or configured to hybridize within portions, segments, or regions of nucleic acids. These terms are used when referring to specific regions of nucleic acid sequences used in primer design.
[170] The term "template" refers to a strand of nucleic acid on which a complementary copy is built from nucleoside triphosphates through the activity of a template-dependent nucleic acid polymerase. Within a duplex the template strand is, by convention, depicted and described as the "bottom" strand. Similarly, the non-template strand is often depicted and described as the "top" strand.
[171] As used herein, the term "T.sub.m" is used in reference to the "melting temperature." The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. Several equations for calculating the T.sub.m of nucleic acids are well known in the art. As indicated by standard references, a simple estimate of the T.sub.m value may be calculated by the equation: T.sub.m = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (see e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization (1985)). Other references (e.g., Allawi, H. T. & SantaLucia, J., Jr. Thermodynamics and NMR of internal G·T mismatches in DNA. Biochemistry 36, 10581-94 (1997)) include more sophisticated computations which take structural and environmental, as well as sequence characteristics into account for the calculation of T.sub.m.
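The simple estimate quoted above, T.sub.m = 81.5 + 0.41(% G+C), can be evaluated directly. The following Python sketch applies only that equation, with the result read in degrees Celsius under the stated 1 M NaCl condition; the function names and the example sequences are hypothetical, and none of the more sophisticated computations mentioned above is attempted.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C residues in a nucleic acid sequence."""
    if not sequence:
        raise ValueError("sequence must be non-empty")
    sequence = sequence.upper()
    return 100.0 * sum(base in "GC" for base in sequence) / len(sequence)


def simple_tm(sequence: str) -> float:
    """Simple estimate from the text: Tm = 81.5 + 0.41 * (% G+C)."""
    return 81.5 + 0.41 * gc_percent(sequence)


# Hypothetical sequences with 50% and 25% G+C content.
print(round(simple_tm("ATGCATGCATGC"), 2))  # prints 102.0
print(round(simple_tm("AATTAATTGCAA"), 2))
```

Because the equation depends only on the G+C percentage, two sequences of different length but equal G+C content receive the same estimate, which is one reason the text points to more detailed methods.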
[172] The term "wild-type" refers to a gene or a gene product that has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene. In contrast, the term "modified", "mutant" or "polymorphic" refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.
[173] Provided herein are methods for detection and identification of bioagents in an unbiased manner using bioagent identifying amplicons. In one aspect, the methods are for detection and identification of population genotype for a population of bioagents. Primers are selected to hybridize to conserved sequence regions of nucleic acids derived from a bioagent and which bracket (flank) variable sequence regions to yield a bioagent identifying amplicon which can be amplified and which is amenable to molecular mass determination. The molecular mass is converted to a base composition, which indicates the number of each nucleotide in the amplicon. The molecular mass or corresponding base composition signature of the amplicon is then queried against a database of molecular masses or base composition signatures indexed to bioagents and to the primer pair used to generate the amplicon. A match of the measured base composition to a database entry base composition associates the sample bioagent to an indexed bioagent in the database. Thus, the identity of the unknown bioagent or population of bioagents is determined. Prior knowledge of the unknown bioagent or population of bioagents is not necessary. In some instances, the measured base composition associates with more than one database entry base composition. Thus, a second/subsequent primer pair is used to generate an amplicon, and its measured base composition is similarly compared to the database to determine its identity in triangulation identification. Furthermore, the method can be applied to rapid parallel multiplex analyses, the results of which can be employed in a triangulation identification strategy. The present method provides rapid throughput and does not require nucleic acid sequencing of the amplified target sequence for bioagent detection and identification.
[174] Calculation of base composition from a mass spectrometer generated molecular mass becomes increasingly complex as the length of the amplicon increases. For amplicons comprising unmodified nucleic acid, the practical upper length limit is about 200 consecutive nucleobases. Incorporating modified nucleotides into the amplicon can allow for an increase in this upper limit. In one embodiment, the amplicons generated using any single primer pair will provide sufficient base composition information to allow for identification of at least one bioagent at the family, genus, species or subspecies level. Alternatively, amplicons greater than 200 nucleobases can be generated and then digested to form two or more fragments that are less than 200 nucleobases. Analysis of one or more of the fragments will provide sufficient base composition information to allow for identification of at least one bioagent.
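One way to appreciate why the calculation grows with amplicon length is to count how many distinct base compositions a single unmodified strand of a given length could have. The sketch below is offered as an illustration and is not a statement of the method used herein: it counts the tuples (w, x, y, z) of non-negative whole numbers that sum to the strand length, which by a standard combinatorial identity is the binomial coefficient C(n + 3, 3).

```python
from math import comb


def composition_count(length: int) -> int:
    """Number of distinct (A, G, C, T) count tuples that sum to `length`."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return comb(length + 3, 3)


# Candidate compositions at the two ends of the preferred length range.
print(composition_count(45))   # prints 17296
print(composition_count(200))  # prints 1373701
```

In practice only those compositions consistent with the measured mass, within the instrument's mass accuracy, need to be considered, so these counts describe the size of the search space rather than the number of plausible answers.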
[175] Preferably, amplicons comprise from about 45 to about 200 consecutive nucleobases (i.e., from about 45 to about 200 linked nucleosides). One of ordinary skill in the art will appreciate that this range expressly embodies compounds of 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, and 200 nucleobases in length. One ordinarily skilled in the art will further appreciate that the above range is not an absolute limit to the length of an amplicon, but instead represents a preferred length range. Amplicon lengths falling outside of this range are also included herein so long as the amplicon is amenable to calculation of a base composition signature as herein described.
[176] In some embodiments, bioagent identifying amplicons amenable to molecular mass determination that are produced by the primers described herein are either of a length, size and/or mass compatible with the particular mode of molecular mass determination or compatible with a means of providing a predictable fragmentation pattern in order to obtain predictable fragments of a length compatible with the particular mode of molecular mass determination. Such means of providing a predictable fragmentation pattern of an amplicon include, but are not limited to, cleavage with restriction enzymes or cleavage primers, for example. Thus, in some embodiments, bioagent identifying amplicons are larger than 200 nucleobases and are amenable to molecular mass determination following restriction digestion. Methods of using restriction enzymes and cleavage primers are well known to those with ordinary skill in the art.
[177] In some embodiments, amplicons corresponding to bioagent identifying amplicons are obtained using the polymerase chain reaction (PCR) which is a routine method to those with ordinary skill in the molecular biology arts. Other amplification methods may be used such as ligase chain reaction (LCR), low-stringency single primer PCR, and multiple strand displacement amplification (MDA). These methods are also known to those with ordinary skill (Michael, SF., Biotechniques (1994), 16:411-412 and Dean et al., Proc. Natl. Acad. Sci. U.S.A. (2002), 99, 5261-5266). In some embodiments, the amplification is carried out in a multiplex assay, a PCR amplification reaction where more than one primer pair is included in the reaction pool allowing two or more different DNA targets to be amplified in a single tube or well.
[178] Unlike bacterial genomes, which exhibit conservation of numerous genes (i.e., housekeeping genes) across all organisms, viruses do not share a gene that is essential and conserved among all virus families. Therefore, viral identification is achieved within smaller groups of related viruses, such as members of a particular virus family or genus. For example, RNA-dependent RNA polymerase is present in all single-stranded RNA viruses and can be used for broad priming as well as resolution within the virus family.
[179] In some embodiments, at least one bacterial nucleic acid segment is amplified in the process of identifying the bacterial bioagent. Thus, the nucleic acid segments that can be amplified by the primers disclosed herein and that provide enough variability to distinguish each individual bioagent and whose molecular masses are amenable to molecular mass determination are herein described as bioagent identifying amplicons.
[180] In some embodiments, identification of bioagents is accomplished at different levels using primers suited to resolution of each individual level of identification. Broad range survey primers are configured with the objective of identifying a bioagent as a member of a particular division (e.g., an order, family, genus or other such grouping of bioagents above the species level of bioagents). In some embodiments, broad range survey intelligent primers are capable of identification of bioagents at the species or sub-species level. Examples of broad range survey primers include, but are not limited to: primer pair numbers: 346 (SEQ ID NOs: 202:1110), 347 (SEQ ID NOs: 560:1278), 348 (SEQ ID NOs: 706:895), and 361 (SEQ ID NOs: 697:1398), which target DNA encoding 16S rRNA, and primer pair numbers 349 (SEQ ID NOs: 401:1156) and 360 (SEQ ID NOs: 409:1434), which target DNA encoding 23S rRNA.
[181] In some embodiments, drill-down primers are configured with the objective of identifying a bioagent at the sub-species level (including strains, subtypes, variants and isolates) based on sub-species characteristics which may, for example, include single nucleotide polymorphisms (SNPs), variable number tandem repeats (VNTRs), deletions, drug resistance mutations or any other modification of a nucleic acid sequence of a bioagent relative to other members of a species having different sub-species characteristics. Drill-down intelligent primers are not always required for identification at the sub-species level because broad range survey intelligent primers may, in some cases, provide sufficient identification resolution to accomplish this identification objective. Examples of drill-down primers include, but are not limited to: confirmation primer pairs such as primer pair numbers 351 (SEQ ID NOs: 355:1423) and 353 (SEQ ID NOs: 220:1394), which target the pXO1 virulence plasmid of Bacillus anthracis. Other examples of drill-down primer pairs are found in sets of triangulation genotyping primer pairs such as, for example, the primer pair number 2146 (SEQ ID NOs: 437:1137), which targets the arcC gene (encoding carbamate kinase) and is included in an 8 primer pair panel or kit for use in genotyping Staphylococcus aureus, or in other panels or kits of primer pairs used for determining drug-resistant bacterial strains, such as, for example, primer pair number 2095 (SEQ ID NOs: 456:1261), which targets the pv-luk gene (encoding Panton-Valentine leukocidin) and is included in an 8 primer pair panel or kit for use in identification of drug resistant strains of Staphylococcus aureus.
[182] A representative process flow diagram used for primer selection and validation process is outlined in Figure 1. For each group of organisms, candidate target sequences are identified (200) from which nucleotide alignments are created (210) and analyzed (220). Primers are then configured by selecting appropriate priming regions (230) to facilitate the selection of candidate primer pairs (240). The primer pairs are then subjected to in silico analysis by electronic PCR (ePCR) (300) wherein bioagent identifying amplicons are obtained from sequence databases such as GenBank or other sequence collections (310) and checked for specificity in silico (320). Bioagent identifying amplicons obtained from GenBank sequences (310) can also be analyzed by a probability model which predicts the capability of a given amplicon to identify unknown bioagents such that the base compositions of amplicons with favorable probability scores are then stored in a base composition database (325). Alternatively, base compositions of the bioagent identifying amplicons obtained from the primers and GenBank sequences can be directly entered into the base composition database (330). Candidate primer pairs (240) are validated by testing their ability to hybridize to target nucleic acid by an in vitro amplification by a method such as PCR analysis (400) of nucleic acid from a collection of organisms (410). Amplification products thus obtained are analyzed by gel electrophoresis or by mass spectrometry to confirm the sensitivity, specificity and reproducibility of the primers used to obtain the amplification products (420).
[183] Many of the important pathogens, including the organisms of greatest concern as biowarfare agents, have been completely sequenced. This effort has greatly facilitated the design of primers for the detection of unknown bioagents. The combination of broad-range priming with division-wide and drill-down priming has been used very successfully in several applications of the technology, including environmental surveillance for biowarfare threat agents and clinical sample analysis for medically important pathogens.
[184] Synthesis of primers is well known and routine in the art. The primers may be conveniently and routinely made through the well-known technique of solid phase synthesis. Equipment for such synthesis is sold by several vendors including, for example, Applied Biosystems (Foster City, CA). Any other means for such synthesis known in the art may additionally or alternatively be employed.
[185] In some embodiments, the oligonucleotide primers are broad range survey primers which hybridize to conserved regions of nucleic acid encoding the hexon gene of all (or between 80% and 100%, between 85% and 100%, between 90% and 100% or between 95% and 100%) known bacteria and produce bacterial bioagent identifying amplicons.
[186] In some cases, the molecular mass or base composition of a bacterial bioagent identifying amplicon defined by a broad range survey primer pair does not provide enough resolution to unambiguously identify a bacterial bioagent at or below the species level. These cases benefit from further analysis of one or more bacterial bioagent identifying amplicons generated from at least one additional broad range survey primer pair or from at least one additional division-wide primer pair. The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as triangulation identification.
[187] In other embodiments, the oligonucleotide primers are division-wide primers which hybridize to nucleic acid encoding genes of species within a genus of bacteria. In other embodiments, the oligonucleotide primers are drill-down primers which enable the identification of sub-species characteristics. Drill-down primers provide the functionality of producing bioagent identifying amplicons for drill-down analyses such as strain typing when contacted with nucleic acid under amplification conditions. Identification of such sub-species characteristics is often critical for determining proper clinical treatment of viral infections. In some embodiments, sub-species characteristics are identified using only broad range survey primers and division-wide and drill-down primers are not used.
[188] In some embodiments, the primers used for amplification hybridize to and amplify genomic DNA, and DNA of bacterial plasmids.
[189] In some embodiments, various computer software programs may be used to aid in design of primers for amplification reactions, such as Primer Premier 5 (Premier Biosoft, Palo Alto, CA) or OLIGO Primer Analysis Software (Molecular Biology Insights, Cascade, CO). These programs allow the user to input desired hybridization conditions, such as the melting temperature of a primer-template duplex, for example. In some embodiments, an in silico PCR search algorithm, such as ePCR, is used to analyze primer specificity across a plurality of template sequences which can be readily obtained from public sequence databases such as GenBank, for example. An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs. In some embodiments, the hybridization conditions applied to the algorithm can limit the results of primer specificity obtained from the algorithm. In some embodiments, the melting temperature threshold for the primer-template duplex is specified to be 35°C or a higher temperature. In some embodiments, the number of acceptable mismatches is specified to be seven mismatches or less. In some embodiments, the buffer components and concentrations and primer concentrations may be specified and incorporated into the algorithm; for example, an appropriate primer concentration is about 250 nM and appropriate buffer components are 50 mM sodium or potassium and 1.5 mM Mg2+.
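The mismatch-tolerant in silico search described above can be illustrated with a minimal Python sketch. It is not the modified structure search algorithm referenced in the text and performs no thermodynamic calculation; it only counts mismatches in an ungapped comparison. The function names and the example sequences are hypothetical and serve only for illustration.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def find_sites(template, primer, max_mismatches):
    """Return start positions where primer aligns to template (ungapped)
    with no more than max_mismatches mismatched positions."""
    hits = []
    n = len(primer)
    for i in range(len(template) - n + 1):
        window = template[i:i + n]
        mismatches = sum(1 for a, b in zip(window, primer) if a != b)
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


def in_silico_pcr(template, forward, reverse, max_mismatches=0):
    """Return predicted amplicons as top-strand template sequences.

    The forward primer is searched on the top strand; the reverse primer
    is searched as its reverse complement on the same strand, and must
    lie downstream of the forward priming site.
    """
    reverse_site = revcomp(reverse)
    amplicons = []
    for f in find_sites(template, forward, max_mismatches):
        for r in find_sites(template, reverse_site, max_mismatches):
            if r >= f + len(forward):
                amplicons.append(template[f:r + len(reverse_site)])
    return amplicons
```

Raising max_mismatches widens the set of templates predicted to amplify, which is one way in which the conditions applied to such a search limit the specificity results obtained from it.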
[190] One with ordinary skill in the art of design of amplification primers will recognize that a given primer need not hybridize with 100% complementarity in order to effectively prime the synthesis of a complementary nucleic acid strand in an amplification reaction. Moreover, a primer may hybridize over one or more segments such that intervening or adjacent segments are not involved in the hybridization event (e.g., a loop structure or a hairpin structure). The primers provided herein may comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95% or at least 99% sequence identity with any of the primers listed in Table 2. Thus, in some embodiments, an extent of variation of 70% to 100%, or any range therewithin, of the sequence identity is possible relative to the specific primer sequences disclosed herein. Determination of sequence identity is described in the following example: a primer 20 nucleobases in length which is otherwise identical to another 20 nucleobase primer but has two non-identical residues has 18 of 20 identical residues (18/20 = 0.9 or 90% sequence identity). In another example, a primer 15 nucleobases in length having all residues identical to a 15 nucleobase segment of a primer 20 nucleobases in length would have 15/20 = 0.75 or 75% sequence identity with the 20 nucleobase primer. Similarly, either or both of the primers of the primer pairs provided herein may comprise 0-10 nucleobase deletions, additions, and/or substitutions relative to any of the primers listed in Table 2, or elsewhere herein. In other words, either or both of the primers may comprise 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleobase deletions, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleobase additions, and/or 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleobase substitutions relative to the sequences of any of the primers disclosed herein.
In one aspect, the primers comprise the sequence of any of the primers listed in Table 2 with the T modification removed from the 5' terminus. In one aspect, the primers comprise the sequence of any of the primers listed in Table 2 with the T modification removed from the 5' terminus and comprising 0-10 nucleobase deletions, additions, and/or substitutions.
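The two sequence identity examples given above (18/20 and 15/20) can be reproduced with a short Python sketch. It assumes an ungapped comparison in which the shorter sequence is slid along the longer one and the best match count is divided by the length of the longer sequence. It is not the Gap program, and the function name and example sequences are hypothetical.

```python
def sequence_identity(primer, reference):
    """Fraction of identical residues relative to the longer sequence.

    The shorter sequence is slid along the longer one without gaps and
    the best-matching placement is used.
    """
    short, long_ = sorted((primer, reference), key=len)
    best = 0
    for offset in range(len(long_) - len(short) + 1):
        window = long_[offset:offset + len(short)]
        matches = sum(1 for a, b in zip(short, window) if a == b)
        best = max(best, matches)
    return best / len(long_)


reference = "ACGTACGTACGTACGTACGT"   # hypothetical 20 nucleobase primer
two_changes = "CCGTACGTACATACGTACGT"  # same length, two substituted residues
segment = reference[3:18]             # identical 15 nucleobase segment

print(sequence_identity(two_changes, reference))  # 18 of 20 residues match
print(sequence_identity(segment, reference))      # 15 of 20 residues match
```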
[191] Percent homology, sequence identity or complementarity, can be determined by, for example, the Gap program (Wisconsin Sequence Analysis Package, Version 8 for UNIX, Genetics Computer Group, University Research Park, Madison WI), using default settings, which uses the algorithm of Smith and Waterman (Adv. Appl. Math., 1981, 2, 482-489). In some embodiments, complementarity of primers with respect to the conserved priming regions of viral nucleic acid is between about 70% and about 75% 80%. In other embodiments, homology, sequence identity or complementarity, is between about 75% and about 80%. In yet other embodiments, homology, sequence identity or complementarity, is at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or is 100%. In some embodiments, the primers described herein comprise at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 95%, at least 96%, at least 98%, or at least 99%, or 100% (or any range therewithin) sequence identity with the primer sequences specifically disclosed herein.
[192] One with ordinary skill is able to calculate percent sequence identity or percent sequence homology and able to determine, without undue experimentation, the effects of variation of primer sequence identity on the function of the primer in its role in priming synthesis of a complementary strand of nucleic acid for production of an amplification product of a corresponding bioagent identifying amplicon.
[193] In some embodiments, the oligonucleotide primers are 13 to 35 nucleobases in length (13 to 35 linked nucleotide residues). In these embodiments, the primers are at least 13 nucleobases in length, and less than 36 nucleobases in length. These embodiments comprise oligonucleotide primers 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34 or 35 nucleobases in length, or any range therewithin. Herein is contemplated using both longer and shorter primers. Furthermore, the primers may also be linked to one or more other desired moieties, including, but not limited to, affinity groups, ligands, regions of nucleic acid that are not complementary to the nucleic acid to be amplified, labels, etc. Primers may also form hairpin structures. For example, hairpin primers may be used to amplify short target nucleic acid molecules. The presence of the hairpin may stabilize the amplification complex (see e.g., TAQMAN MicroRNA Assays, Applied Biosystems, Foster City, California).
[194] In some embodiments, any oligonucleotide primer pair may have one or both primers with less than 70% sequence homology with a corresponding member of any of the primer pairs of Table 2 if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon. In other embodiments, any oligonucleotide primer pair may have one or both primers with a length greater than 35 nucleobases if the primer pair has the capability of producing an amplification product corresponding to a bioagent identifying amplicon.
[195] In some embodiments, the function of a given primer may be substituted by a combination of two or more primer segments that hybridize adjacent to each other or that are linked by a nucleic acid loop structure or linker which allows a polymerase to extend the two or more primers in an amplification reaction.
[196] In some embodiments, the primer pairs used for obtaining bioagent identifying amplicons are the primer pairs of Table 2. In other embodiments, other combinations of primer pairs are possible by combining certain members of the forward primers with certain members of the reverse primers. An example can be seen in Table 2 for two primer pair combinations of forward primer 16S_EC_789_810_F (SEQ ID NO: 206) with the reverse primers 16S_EC_880_894_R (SEQ ID NO: 796) or 16S_EC_882_899_R (SEQ ID NO: 818). Arriving at a favorable alternate combination of primers in a primer pair depends upon the properties of the primer pair, most notably the size of the bioagent identifying amplicon that would be produced by the primer pair, which preferably is between about 45 and about 150 nucleobases in length. Alternatively, a bioagent identifying amplicon longer than 150 nucleobases in length could be cleaved into smaller segments by cleavage reagents such as chemical reagents or restriction enzymes, for example.
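The size criterion for alternate primer combinations can be illustrated with a minimal Python sketch. It assumes that the two numbers embedded in each primer name are the first and last positions of the priming region on a common reference sequence, so that the amplicon spans from the forward primer start to the reverse primer end. That reading of the names, and the function names below, are assumptions made for illustration only.

```python
def parse_primer_name(name):
    """Split a name such as '16S_EC_789_810_F' into (start, end, direction).

    Assumes the last three underscore-separated fields are the start
    coordinate, the end coordinate and the direction flag (F or R).
    """
    fields = name.split("_")
    return int(fields[-3]), int(fields[-2]), fields[-1]


def amplicon_length(forward_name, reverse_name):
    """Amplicon length implied by the coordinates in the two primer names."""
    f_start, _, f_dir = parse_primer_name(forward_name)
    _, r_end, r_dir = parse_primer_name(reverse_name)
    if f_dir != "F" or r_dir != "R":
        raise ValueError("expected a forward (_F) and a reverse (_R) primer")
    return r_end - f_start + 1


def in_preferred_range(length, low=45, high=150):
    """True if the length falls within the preferred amplicon size range."""
    return low <= length <= high


for reverse in ("16S_EC_880_894_R", "16S_EC_882_899_R"):
    size = amplicon_length("16S_EC_789_810_F", reverse)
    print(reverse, size, in_preferred_range(size))
```

Under the stated coordinate assumption, both combinations named above would give amplicons within the preferred range of about 45 to about 150 nucleobases.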
[197] In some embodiments, the primers are configured to amplify nucleic acid of a bioagent to produce amplification products that can be measured by mass spectrometry and from whose molecular masses candidate base compositions can be readily calculated.
[198] In some embodiments, any given primer comprises a modification comprising the addition of a non-templated T residue to the 5' end of the primer (i.e., the added T residue does not necessarily hybridize to the nucleic acid being amplified). The addition of a non-templated T residue has an effect of minimizing the addition of non-templated adenosine residues as a result of the nonspecific enzyme activity of Taq polymerase (Magnuson et al., Biotechniques, 1996, 21, 700-709), an occurrence which may lead to ambiguous results arising from molecular mass analysis.
[199] In some embodiments, primers may contain one or more universal bases. Because any variation (due to codon wobble in the third position) in the conserved regions among species is likely to occur in the third position of a DNA (or RNA) triplet, oligonucleotide primers can be configured such that the nucleotide corresponding to this position is a base which can bind to more than one nucleotide, referred to herein as a "universal nucleobase." For example, under this "wobble" pairing, inosine (I) binds to U, C or A; guanine (G) binds to U or C, and uridine (U) binds to U or C. Other examples of universal nucleobases include nitroindoles such as 5-nitroindole or 3-nitropyrrole (Loakes et al., Nucleosides and Nucleotides, 1995, 14, 1001-1003), the degenerate nucleotides dP or dK (Hill et al.), an acyclic nucleoside analog containing 5-nitroindazole (Van Aerschot et al., Nucleosides and Nucleotides, 1995, 14, 1053-1056) or the purine analog 1-(2-deoxy-beta-D-ribofuranosyl)-imidazole-4-carboxamide (Sala et al., Nucl. Acids Res., 1996, 24, 3302-3306).
[200] In some embodiments, to compensate for the somewhat weaker binding by the wobble base, the oligonucleotide primers are configured such that the first and second positions of each triplet are occupied by nucleotide analogs that bind with greater affinity than the unmodified nucleotide. Examples of these analogs include, but are not limited to, 2,6-diaminopurine, which binds to thymine; 5-propynyluracil (also known as propynylated thymine), which binds to adenine; and 5-propynylcytosine and phenoxazines, including G-clamp, which bind to G. Propynylated pyrimidines are described in U.S. Patent Nos. 5,645,985, 5,830,653 and 5,484,908, each of which is commonly owned and incorporated herein by reference in its entirety. Propynylated primers are described in U.S. Pre-Grant Publication No. 2003-0170682, which is also commonly owned and incorporated herein by reference in its entirety. Phenoxazines are described in U.S. Patent Nos. 5,502,177, 5,763,588, and 6,005,096, each of which is incorporated herein by reference in its entirety. G-clamps are described in U.S. Patent Nos. 6,007,992 and 6,028,183, each of which is incorporated herein by reference in its entirety.
[201] In some embodiments, primer hybridization is enhanced using primers containing 5-propynyl deoxy-cytidine and deoxy-thymidine nucleotides. These modified primers offer increased affinity and base pairing selectivity.
[202] In some embodiments, non-template primer tags are used to increase the melting temperature (Tm) of a primer-template duplex in order to improve amplification efficiency. A non-template tag is at least three consecutive A or T nucleotide residues on a primer which are not complementary to the template. In any given non-template tag, A can be replaced by C or G and T can also be replaced by C or G. Although Watson-Crick hybridization is not expected to occur for a non-template tag relative to the template, the extra hydrogen bond in a G-C pair relative to an A-T pair confers increased stability of the primer-template duplex and improves amplification efficiency for subsequent cycles of amplification when the primers hybridize to strands synthesized in previous cycles.
[203] In other embodiments, propynylated tags may be used in a manner similar to that of the non-template tag, wherein two or more 5-propynylcytidine or 5-propynyluridine residues replace template matching residues on a primer. In other embodiments, a primer contains a modified internucleoside linkage such as a phosphorothioate linkage, for example.
[204] In some embodiments, the primers comprise mass-modifying tags. Reducing the total number of possible base compositions of a nucleic acid of specific molecular weight provides a means of avoiding a persistent source of ambiguity in determination of base composition of amplification products. Addition of mass-modifying tags to certain nucleobases of a given primer will result in simplification of de novo determination of base composition of a given bioagent identifying amplicon from its molecular mass.
[205] In some embodiments, the mass-modified nucleobase comprises one or more of the following: for example, 7-deaza-2'-deoxyadenosine-5'-triphosphate, 5-iodo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxyuridine-5'-triphosphate, 5-bromo-2'-deoxycytidine-5'-triphosphate, 5-iodo-2'-deoxycytidine-5'-triphosphate, 5-hydroxy-2'-deoxyuridine-5'-triphosphate, 4-thiothymidine-5'-triphosphate, 5-aza-2'-deoxyuridine-5'-triphosphate, 5-fluoro-2'-deoxyuridine-5'-triphosphate, O6-methyl-2'-deoxyguanosine-5'-triphosphate, N2-methyl-2'-deoxyguanosine-5'-triphosphate, 8-oxo-2'-deoxyguanosine-5'-triphosphate or thiothymidine-5'-triphosphate. In some embodiments, the mass-modified nucleobase comprises 15N or 13C or both 15N and 13C.
[206] In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with a plurality of primer pairs. The advantages of multiplexing are that fewer reaction containers (for example, wells of a 96- or 384-well plate) are needed for each molecular mass measurement, providing time, resource and cost savings because additional bioagent identification data can be obtained within a single analysis. Multiplex amplification methods are well known to those with ordinary skill and can be developed without undue experimentation. However, in some embodiments, one useful and non-obvious step in selecting a plurality of candidate bioagent identifying amplicons for multiplex amplification is to ensure that each strand of each amplification product will be sufficiently different in molecular mass that mass spectral signals will not overlap and lead to ambiguous analysis results. In some embodiments, a 10 Da difference in mass of two strands of one or more amplification products is sufficient to avoid overlap of mass spectral peaks.
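The mass separation check described above can be sketched in Python as a pairwise comparison of expected strand masses. The 10 Da threshold is taken from the text; the function name, the strand labels and the third mass value in the usage example are hypothetical.

```python
from itertools import combinations


def overlapping_strand_pairs(strand_masses, min_separation=10.0):
    """Return pairs of strand labels whose expected masses (in Da) differ
    by less than min_separation and could therefore give overlapping
    mass spectral peaks.

    strand_masses maps a strand label to its expected molecular mass.
    """
    conflicts = []
    for a, b in combinations(sorted(strand_masses), 2):
        if abs(strand_masses[a] - strand_masses[b]) < min_separation:
            conflicts.append((a, b))
    return conflicts


# Illustrative values only: the third entry is a made-up strand that
# falls within 10 Da of the first.
expected = {
    "amplicon1_forward": 14208.0,
    "amplicon1_reverse": 14079.0,
    "amplicon2_forward": 14213.5,
}
print(overlapping_strand_pairs(expected))
```

A candidate set of amplicons for a multiplex or pooled reaction would be accepted under this sketch only when the returned list is empty.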
[207] In some embodiments, as an alternative to multiplex amplification, single amplification reactions can be pooled before analysis by mass spectrometry. In these embodiments, as for multiplex amplification embodiments, it is useful to select a plurality of candidate bioagent identifying amplicons to ensure that each strand of each amplification product will be sufficiently different in molecular mass that mass spectral signals will not overlap and lead to ambiguous analysis results.
[208] In some embodiments, the molecular mass of a given bioagent identifying amplicon is determined by mass spectrometry. Mass spectrometry has several advantages, not the least of which is high bandwidth characterized by the ability to separate (and isolate) many molecular peaks across a broad range of mass to charge ratio (m/z). Thus mass spectrometry is intrinsically a parallel detection scheme without the need for radioactive or fluorescent labels, since every amplification product is identified by its molecular mass. The current state of the art in mass spectrometry is such that less than femtomole quantities of material can be readily analyzed to afford information about the molecular contents of the sample. An accurate assessment of the molecular mass of the material can be quickly obtained, irrespective of whether the molecular weight of the sample is several hundred, or in excess of one hundred thousand atomic mass units (amu) or Daltons.
[209] In some embodiments, intact molecular ions are generated from amplification products using one of a variety of ionization techniques to convert the sample to gas phase. These ionization methods include, but are not limited to, electrospray ionization (ESI), matrix-assisted laser desorption ionization (MALDI) and fast atom bombardment (FAB). Upon ionization, several peaks are observed from one sample due to the formation of ions with different charges. Averaging the multiple readings of molecular mass obtained from a single mass spectrum affords an estimate of molecular mass of the bioagent identifying amplicon. Electrospray ionization mass spectrometry (ESI-MS) is particularly useful for very high molecular weight polymers such as proteins and nucleic acids having molecular weights greater than 10 kDa, since it yields a distribution of multiply-charged molecules of the sample without causing a significant amount of fragmentation.
[210] The mass detectors used in the methods provided herein include, but are not limited to, Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS), time of flight (TOF), ion trap, quadrupole, magnetic sector, Q-TOF, and triple quadrupole.
[211] Although the molecular mass of amplification products obtained using intelligent primers provides a means for identification of bioagents, conversion of molecular mass data to a base composition signature is useful for certain analyses. The base composition is the exact number of each nucleobase (A, T, C and G) in an oligonucleotide, for example, an amplicon, and can be calculated, for amplicons generated using the primer pairs provided here, from the molecular mass of the amplicons. In some embodiments, a base composition provides an index of a specific organism. Base compositions can be calculated from known sequences of known bioagent identifying amplicons and can also be experimentally determined by measuring the molecular mass of a given bioagent identifying amplicon, followed by determination of all possible base compositions which are consistent with the measured molecular mass within acceptable experimental error. The following example illustrates determination of base composition from an experimentally obtained molecular mass of a 46-mer amplification product originating at position 1337 of the 16S rRNA of Bacillus anthracis. The forward and reverse strands of the amplification product have measured molecular masses of 14208 and 14079 Da, respectively. The possible base compositions derived from the molecular masses of the forward and reverse strands for the B. anthracis products are listed in Table 1.
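The averaging of multiple charge-state readings from a single mass spectrum can be sketched in Python as follows. The sketch assumes negative-ion electrospray, in which an ion of charge z has lost z protons, and assumes that the charge state of each peak has already been assigned. The function names are hypothetical, and the peak list in the usage example is simulated rather than measured.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def mz_negative_mode(neutral_mass, charge):
    """m/z of an ion formed by loss of `charge` protons from a neutral
    molecule of mass neutral_mass (used here to simulate peaks)."""
    return (neutral_mass - charge * PROTON_MASS) / charge


def neutral_mass_from_peak(mz, charge):
    """Neutral molecular mass implied by one peak of known charge state."""
    return charge * (mz + PROTON_MASS)


def average_neutral_mass(peaks):
    """Average the neutral mass estimates from several (m/z, charge) peaks."""
    estimates = [neutral_mass_from_peak(mz, z) for mz, z in peaks]
    return sum(estimates) / len(estimates)


# Simulated charge-state series for a strand of mass 14208 Da
peaks = [(mz_negative_mode(14208.0, z), z) for z in range(8, 13)]
print(average_neutral_mass(peaks))
```

With measured spectra each peak carries its own error, so the individual estimates differ slightly and the average serves as the reported molecular mass.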
Table 1: Possible Base Compositions for B. anthracis 46mer Amplification Product
[Table 1 is reproduced as an image in the published application.]
[212] Among the 16 possible base compositions for the forward strand and the 18 possible base compositions for the reverse strand that were calculated, only one pair (shown in bold) is complementary, which indicates the true base composition of the amplification product. It should be recognized that this logic is applicable for determination of base compositions of any bioagent identifying amplicon, regardless of the class of bioagent from which the corresponding amplification product was obtained.
[213] In some embodiments, assignment of previously unobserved base compositions (also known as "true unknown base compositions") to a given phylogeny can be accomplished via the use of pattern classifier model algorithms. Base compositions, like sequences, vary slightly from strain to strain within species, for example. In some embodiments, the pattern classifier model is the mutational probability model. In other embodiments, the pattern classifier is the polytope model. The mutational probability model and polytope model are both commonly owned and described in U.S. Patent application Serial No. 11/073,362, which is incorporated herein by reference in its entirety.
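The logic of enumerating base compositions consistent with each strand mass and then retaining only complementary pairs can be sketched in Python. The residue masses and the end-group correction below are approximate average values for unmodified DNA strands with 5'-OH and 3'-OH termini and are assumptions of this sketch, as are the function names and the mass tolerance. The usage example is built from a hypothetical 46-mer composition and does not reproduce the entries of Table 1.

```python
# Approximate average residue masses (Da) of nucleotides within a DNA
# strand, and the correction for 5'-OH / 3'-OH termini (assumed values).
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
END_CORRECTION = -61.96


def strand_mass(comp):
    """Approximate mass of a strand with base counts comp (dict A, G, C, T)."""
    return sum(RESIDUE_MASS[b] * n for b, n in comp.items()) + END_CORRECTION


def candidate_compositions(measured_mass, length, tolerance):
    """All base compositions of the given length whose calculated mass is
    within tolerance (Da) of the measured mass."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                comp = {"A": a, "G": g, "C": c, "T": length - a - g - c}
                if abs(strand_mass(comp) - measured_mass) <= tolerance:
                    hits.append(comp)
    return hits


def complement(comp):
    """Base composition of the complementary strand."""
    return {"A": comp["T"], "T": comp["A"], "G": comp["C"], "C": comp["G"]}


def complementary_pairs(forward_hits, reverse_hits):
    """Pairs of forward and reverse candidates that are complementary."""
    return [(f, r) for f in forward_hits for r in reverse_hits
            if complement(f) == r]


# Hypothetical 46-mer used only to exercise the functions
true_forward = {"A": 13, "G": 12, "C": 10, "T": 11}
forward_hits = candidate_compositions(strand_mass(true_forward), 46, 1.0)
reverse_hits = candidate_compositions(
    strand_mass(complement(true_forward)), 46, 1.0)
print(len(forward_hits), len(reverse_hits))
print(complementary_pairs(forward_hits, reverse_hits))
```

Each strand alone is typically consistent with several compositions; requiring complementarity between the two strands removes most of them, which is the constraint applied in the example above.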
[214] In one embodiment, it is possible to manage this diversity by building "base composition probability clouds" around the composition constraints for each species. This permits identification of organisms in a fashion similar to sequence analysis. A "pseudo four-dimensional plot" can be used to visualize the concept of base composition probability clouds. Optimal primer design requires optimal choice of bioagent identifying amplicons and maximizes the separation between the base composition signatures of individual bioagents. Areas where clouds overlap indicate regions that may result in a misclassification, a problem which is overcome by a triangulation identification process using bioagent identifying amplicons not affected by overlap of base composition probability clouds.
[215] In some embodiments, base composition probability clouds provide the means for screening potential primer pairs in order to avoid potential misclassifications of base compositions. In other embodiments, base composition probability clouds provide the means for predicting the identity of a bioagent whose assigned base composition was not previously observed and/or indexed in a bioagent identifying amplicon base composition database due to evolutionary transitions in its nucleic acid sequence. Thus, in contrast to probe-based techniques, mass spectrometry determination of base composition does not require prior knowledge of the composition or sequence in order to make the measurement.
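The idea of assigning a previously unobserved base composition to the nearest known organism can be illustrated with a deliberately simple Python sketch. It is a plain nearest-neighbor rule in the four-dimensional space of base counts, not the mutational probability model or the polytope model named above. The distance measure, the threshold, the function names, and the species labels and compositions in the usage example are all hypothetical.

```python
def composition_distance(a, b):
    """Sum of absolute differences in base counts between two compositions."""
    return sum(abs(a[base] - b[base]) for base in "AGCT")


def classify(observed, reference, max_distance):
    """Return (species, distance) for the closest indexed composition, or
    None if no indexed composition lies within max_distance.

    reference maps a species label to a list of known base compositions
    for one bioagent identifying amplicon.
    """
    best = None
    for species, compositions in reference.items():
        for comp in compositions:
            d = composition_distance(observed, comp)
            if d <= max_distance and (best is None or d < best[1]):
                best = (species, d)
    return best


# Hypothetical database entries for a single amplicon
reference = {
    "species_X": [{"A": 13, "G": 12, "C": 10, "T": 11}],
    "species_Y": [{"A": 10, "G": 15, "C": 12, "T": 9}],
}
# One A-to-G substitution relative to the species_X entry
observed = {"A": 12, "G": 13, "C": 10, "T": 11}
print(classify(observed, reference, max_distance=4))
```

A single substitution changes two base counts by one each, so a small distance threshold plays the role of a crude cloud around each indexed composition. Where the clouds of two species overlap, a second amplicon would be needed to resolve the assignment.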
[216] Provided herein is bioagent classifying information similar to DNA sequencing and phylogenetic analysis at a level sufficient to identify a given bioagent and methods for obtaining such information. Furthermore, the process of determination of a previously unknown base composition for a given bioagent (for example, in a case where sequence information is unavailable) has downstream utility by providing additional bioagent indexing information with which to populate base composition databases. The process of future bioagent identification is thus greatly improved as more BCS indexes become available in base composition databases.
[217] In some cases, a molecular mass of a single bioagent identifying amplicon alone does not provide enough resolution to unambiguously identify a given bioagent. The employment of more than one bioagent identifying amplicon for identification of a bioagent is herein referred to as "triangulation identification." Triangulation identification is pursued by determining the molecular masses of a plurality of bioagent identifying amplicons selected within a plurality of housekeeping genes. This process is used to reduce false negative and false positive signals, and enable reconstruction of the origin of hybrid or otherwise engineered bioagents. For example, identification of the three part toxin genes typical of B. anthracis (Bowen et al, J. Appl. Microbiol, 1999, 87, 270-278) in the absence of the expected signatures from the B. anthracis genome would suggest a genetic engineering event.
[218] In some embodiments, the triangulation identification process can be pursued by characterization of bioagent identifying amplicons in a massively parallel fashion using the polymerase chain reaction (PCR), such as multiplex PCR where multiple primers are employed in the same amplification reaction mixture, or PCR in multi-well plate format wherein a different and unique pair of primers is used in multiple wells containing otherwise identical reaction mixtures. Such multiplex and multi-well PCR methods are well known to those with ordinary skill in the arts of rapid throughput amplification of nucleic acids. In other related embodiments, one PCR reaction per well or container may be carried out, followed by an amplicon pooling step wherein the amplification products of different wells are combined in a single well or container which is then subjected to molecular mass analysis. The combination of pooled amplicons can be chosen such that the expected ranges of molecular masses of individual amplicons are not overlapping and thus will not complicate identification of signals.
[219] In some embodiments, one or more nucleotide substitutions within a codon of a gene of an infectious organism confer drug resistance upon the organism, and these substitutions can be determined by codon base composition analysis. The organism can be a bacterium, virus, fungus or protozoan.
[220] In some embodiments, the amplification product containing the codon being analyzed is of a length of about 35 to about 200 nucleobases. The primers employed in obtaining the amplification product can hybridize to upstream and downstream sequences directly adjacent to the codon, or can hybridize to upstream and downstream sequences one or more sequence positions away from the codon. The primers may have between about 70% and 100% sequence complementarity with the sequence of the gene containing the codon being analyzed.
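As a rough illustration of the complementarity figure mentioned above, the sketch below computes the percentage of primer positions that base-pair with a target site of equal length. It assumes both sequences are written 5' to 3', contain only A, C, G and T, and are compared without gaps; the function names and the test sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(primer, target_site):
    """Percent of primer positions that pair with the target site.

    Both arguments are written 5'->3' and must have the same length.
    """
    if len(primer) != len(target_site) or not primer:
        raise ValueError("primer and target site must be non-empty and equal in length")
    ideal_primer = reverse_complement(target_site)
    matches = sum(p == q for p, q in zip(primer.upper(), ideal_primer))
    return 100.0 * matches / len(primer)
```

A primer with a single mismatch over ten positions would score 90% under this simple measure, which falls inside the stated range.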
[221] In some embodiments, the codon base composition analysis is undertaken
[222] In some embodiments, the codon analysis is undertaken for the purpose of investigating genetic disease in an individual. In other embodiments, the codon analysis is undertaken for the purpose of investigating a drug resistance mutation or any other deleterious mutation in an infectious organism such as a bacterium, virus, fungus or protozoan. In some embodiments, the bioagent is a bacterium identified in a biological product.
[223] In some embodiments, the molecular mass of an amplification product containing the codon being analyzed is measured by mass spectrometry. The mass spectrometry can be either electrospray (ESI) mass spectrometry or matrix-assisted laser desorption ionization (MALDI) mass spectrometry. Time-of-flight (TOF) is an example of one mode of mass spectrometry compatible with the methods provided herein.
[224] The methods provided here can also be employed to determine the relative abundance of drug resistant strains of the organism being analyzed. Relative abundances can be calculated from amplitudes of mass spectral signals with relation to internal calibrants. In some embodiments, known quantities of internal amplification calibrants can be included in the amplification reactions and abundances of analyte amplification product estimated in relation to the known quantities of the calibrants.
[225] In some embodiments, upon identification of one or more drug-resistant strains of an infectious organism infecting an individual, one or more alternative treatments can be devised to treat the individual.
[226] In some embodiments, the identity and quantity of an unknown bioagent can be determined using the process illustrated in Figure 2. Primers (500) and a known quantity of a calibration polynucleotide (505) are added to a sample containing nucleic acid of an unknown bioagent. The total nucleic acid in the sample is then subjected to an amplification reaction (510) to obtain amplification products. The molecular masses of amplification products are determined (515) from which are obtained molecular mass and abundance data. The molecular mass of the bioagent identifying amplicon (520) provides the means for its identification (525) and the molecular mass of the calibration amplicon obtained from the calibration polynucleotide (530) provides the means for its identification (535). The abundance data of the bioagent identifying amplicon is recorded (540) and the abundance data for the calibration amplicon is recorded (545), both of which are used in a calculation (550) which determines the quantity of unknown bioagent in the sample.
[227] A sample comprising an unknown bioagent is contacted with a pair of primers that provide the means for amplification of nucleic acid from the bioagent, and a known quantity of a polynucleotide that comprises a calibration sequence. The nucleic acids of the bioagent and of the calibration sequence are amplified and the rate of amplification is reasonably assumed to be similar for the nucleic acid of the bioagent and of the calibration sequence. The amplification reaction then produces two amplification products: a bioagent identifying amplicon and a calibration amplicon. The bioagent identifying amplicon and the calibration amplicon should be distinguishable by molecular mass while being amplified at essentially the same rate. Effecting differential molecular masses can be accomplished by choosing as a calibration sequence, a representative bioagent identifying amplicon (from a specific species of bioagent) and performing, for example, a 2-8 nucleobase deletion or insertion within the variable region between the two priming sites. The amplified sample containing the bioagent identifying amplicon and the calibration amplicon is then subjected to molecular mass analysis by mass spectrometry, for example. The resulting molecular mass analysis of the nucleic acid of the bioagent and of the calibration sequence provides molecular mass data and abundance data for the nucleic acid of the bioagent and of the calibration sequence. The molecular mass data obtained for the nucleic acid of the bioagent enables identification of the unknown bioagent and the abundance data enables calculation of the quantity of the bioagent, based on the knowledge of the quantity of calibration polynucleotide contacted with the sample.
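The quantity calculation described above can be sketched as a simple ratio. This is a minimal illustration, not the calculation used in the disclosure: it assumes, as the text does, that the bioagent nucleic acid and the calibration sequence amplify at essentially the same rate, and it further assumes that mass spectral abundance is proportional to the amount of each amplicon. The function and parameter names are hypothetical.

```python
def estimate_bioagent_quantity(bioagent_abundance, calibrant_abundance, calibrant_copies):
    """Estimate starting copies of bioagent nucleic acid in a sample.

    bioagent_abundance:  signal amplitude of the bioagent identifying amplicon
    calibrant_abundance: signal amplitude of the calibration amplicon
    calibrant_copies:    known quantity of calibration polynucleotide added
    """
    if calibrant_abundance <= 0:
        # No calibration amplicon means amplification or analysis failed.
        raise ValueError("no calibration amplicon signal; cannot quantify")
    return calibrant_copies * bioagent_abundance / calibrant_abundance
```

For instance, a bioagent signal twice as large as the calibrant signal, with 100 calibrant copies added, would give an estimate of 200 copies under these assumptions.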
[228] In some embodiments, construction of a standard curve where the amount of calibration polynucleotide spiked into the sample is varied provides additional resolution and improved confidence for the determination of the quantity of bioagent in the sample. The use of standard curves for analytical determination of molecular quantities is well known to one with ordinary skill and can be performed without undue experimentation.
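One way to picture such a standard curve is sketched below. The model is an assumption made for illustration: if the sample contains Q copies of bioagent nucleic acid and a reaction is spiked with c copies of calibrant, the calibrant-to-bioagent abundance ratio is taken to be c / Q, so a least-squares line through the origin has slope 1 / Q. The function name and this model are hypothetical and are not taken from this disclosure.

```python
def estimate_quantity_from_curve(calibrant_amounts, calibrant_to_bioagent_ratios):
    """Fit ratio = amount / Q through the origin and return Q.

    calibrant_amounts:            copies of calibrant spiked into each reaction
    calibrant_to_bioagent_ratios: measured calibrant/bioagent abundance ratios
    """
    if not calibrant_amounts or len(calibrant_amounts) != len(calibrant_to_bioagent_ratios):
        raise ValueError("need matching, non-empty lists of amounts and ratios")
    numerator = sum(c * y for c, y in zip(calibrant_amounts, calibrant_to_bioagent_ratios))
    denominator = sum(c * c for c in calibrant_amounts)
    if denominator == 0 or numerator <= 0:
        raise ValueError("cannot fit a positive slope to these data")
    slope = numerator / denominator  # least-squares slope for a line through the origin
    return 1.0 / slope
```

Using several spike levels averages out noise in any single measurement, which is the additional confidence the paragraph above refers to.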
[229] In some embodiments, multiplex amplification is performed where multiple bioagent identifying amplicons are amplified with multiple primer pairs which also amplify the corresponding standard calibration sequences. In this or other embodiments, the standard calibration sequences are optionally included within a single vector which functions as the calibration polynucleotide. Multiplex amplification methods are well known to those with ordinary skill and can be performed without undue experimentation.
[230] In some embodiments, the calibrant polynucleotide is used as an internal positive control to confirm that amplification conditions and subsequent analysis steps are successful in producing a measurable amplicon. Even in the absence of copies of the genome of a bioagent, the calibration polynucleotide should give rise to a calibration amplicon. Failure to produce a measurable calibration amplicon indicates a failure of amplification or of a subsequent analysis step such as amplicon purification or molecular mass determination. Reaching a conclusion that such failures have occurred is, in itself, a useful event.
[231] In some embodiments, the calibration sequence is comprised of DNA. In some embodiments, the calibration sequence is comprised of RNA.
[232] In some embodiments, the calibration sequence is inserted into a vector that itself functions as the calibration polynucleotide. In some embodiments, more than one calibration sequence is inserted into the vector that functions as the calibration polynucleotide. Such a calibration polynucleotide is herein termed a "combination calibration polynucleotide." The process of inserting polynucleotides into vectors is routine to those skilled in the art and can be accomplished without undue experimentation. Thus, it should be recognized that the calibration method should not be limited to the embodiments described herein. The calibration method can be applied for determination of the quantity of any bioagent identifying amplicon when an appropriate standard calibrant polynucleotide sequence is configured and used. The process of choosing an appropriate vector for insertion of a calibrant is also a routine operation that can be accomplished by one with ordinary skill without undue experimentation.
[233] In some embodiments, the primer pairs produce bioagent identifying amplicons within stable and highly conserved regions of bacteria. The advantage to characterization of an amplicon defined by priming regions that fall within a highly conserved region is that there is a low probability that the region will evolve past the point of primer recognition, in which case, the primer hybridization of the amplification step would fail. Such a primer set is thus useful as a broad range survey-type primer. In another embodiment, the primers produce bioagent identifying amplicons including a region which evolves more quickly than the stable region described above. The advantage of characterizing a bioagent identifying amplicon corresponding to an evolving genomic region is that it is useful for distinguishing emerging strain variants or the presence of virulence genes, drug resistance genes, or codon mutations that induce drug resistance.
[234] The embodiments provided here also have significant advantages in providing a platform for identification of diseases caused by emerging bacterial strains such as, for example, drug-resistant strains of Staphylococcus aureus. The present embodiments eliminate the need for prior knowledge of bioagent sequence to generate hybridization probes. This is possible because the methods are not confounded by naturally occurring evolutionary variations occurring in the sequence acting as the template for production of the bioagent identifying amplicon. Measurement of molecular mass and determination of base composition is accomplished in an unbiased manner without sequence prejudice.
[235] Another embodiment provides a means of tracking the spread of a bacterium, such as a particular drug-resistant strain when a plurality of samples obtained from different locations are analyzed by the methods described above in an epidemiological setting. In one embodiment, a plurality of samples from a plurality of different locations is analyzed with primer pairs which produce bioagent identifying amplicons, a subset of which contains a specific drug-resistant bacterial strain. The corresponding locations of the members of the drug-resistant strain subset indicate the spread of the specific drug-resistant strain to the corresponding locations.
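The epidemiological tracking described above amounts to grouping sample locations by the strain identified in each sample. The sketch below is a minimal illustration; the function name, the input layout, and the example strain and location labels are hypothetical.

```python
from collections import defaultdict


def locations_by_strain(samples):
    """Group sampling locations by the strain identified in each sample.

    samples: iterable of (location, strain) pairs, where strain is None
             when no strain of interest was identified in that sample.
    """
    spread = defaultdict(set)
    for location, strain in samples:
        if strain is not None:
            spread[strain].add(location)
    return {strain: sorted(locations) for strain, locations in spread.items()}
```

The locations listed for a given drug-resistant strain then indicate where that strain has been observed.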
[236] Also provided herein are kits for carrying out the methods described herein. In some embodiments, the kit may comprise a sufficient quantity of one or more primer pairs to perform an amplification reaction on a target polynucleotide from a bioagent to form a bioagent identifying amplicon. In some embodiments, the kit may comprise from one to fifty primer pairs, from one to twenty primer pairs, from one to ten primer pairs, or from two to five primer pairs. In some embodiments, the kit may comprise one or more primer pairs recited in Table 2.
[237] In some embodiments, the kit comprises one or more broad range survey primer(s), division wide primer(s), or drill-down primer(s), or any combination thereof. If a given problem involves identification of a specific bioagent, the solution to the problem may require the selection of a particular combination of primers to provide the solution to the problem. A kit may be configured so as to comprise particular primer pairs for identification of a particular bioagent. A drill-down kit may be used, for example, to distinguish different genotypes or strains, drug-resistant, or otherwise. In some embodiments, the primer pair components of any of these kits may be additionally combined to comprise additional combinations of broad range survey primers and division-wide primers so as to be able to identify a bacterium.
[238] In some embodiments, the kit contains standardized calibration polynucleotides for use as internal amplification calibrants. Internal calibrants are described in commonly owned PCT pre-grant publication, publication number WO 2005/094421, which is incorporated herein by reference in its entirety.
[239] In some embodiments, the kit comprises a sufficient quantity of reverse transcriptase (if RNA is to be analyzed for example), a DNA polymerase, suitable nucleoside triphosphates (including alternative dNTPs such as inosine or modified dNTPs such as the 5-propynyl pyrimidines or any dNTP containing molecular mass-modifying tags such as those described above), a DNA ligase, and/or reaction buffer, or any combination thereof, for the amplification processes described above. A kit may further include instructions pertinent for the particular embodiment of the kit, such instructions describing the primer pairs and amplification conditions for operation of the method. A kit may also comprise amplification reaction containers such as microcentrifuge tubes and the like. A kit may also comprise reagents or other materials for isolating bioagent nucleic acid or bioagent identifying amplicons from amplification, including, for example, detergents, solvents, or ion exchange resins which may be linked to magnetic beads. A kit may also comprise a table of measured or calculated molecular masses and/or base compositions of bioagents using the primer pairs of the kit.
[240] In some embodiments, a kit may contain one or more survey bacterial primer pairs and one or more triangulation genotyping analysis primer pairs such as the primer pairs of Tables 8, 12, 14, 19, 21, 23, or 24. In some embodiments, the kit may represent a less expansive genotyping analysis but include triangulation genotyping analysis primer pairs for more than one genus or species of bacteria. For example, a kit for surveying nosocomial infections at a health care facility may include, for example, one or more broad range survey primer pairs, one or more division wide primer pairs, one or more Acinetobacter baumannii triangulation genotyping analysis primer pairs and one or more Staphylococcus aureus triangulation genotyping analysis primer pairs. One with ordinary skill will be capable of analyzing in silico amplification data to determine which primer pairs will be able to provide optimal identification resolution for the bacterial bioagents of interest.
[241] In some embodiments, a kit may be assembled for identification of strains of bacteria involved in contamination of food. An example of such a kit embodiment is a kit comprising one or more bacterial survey primer pairs of Table 5 with one or more triangulation genotyping analysis primer pairs of Table 12 which provide strain resolving capabilities for identification of specific strains of Campylobacter jejuni.
[242] Some embodiments of the kits are 96-well or 384-well plates with a plurality of wells containing any or all of the following components: dNTPs, buffer salts, Mg2+, betaine, and primer pairs. In some embodiments, a polymerase is also included in the plurality of wells of the 96-well or 384-well plates.
[243] Some embodiments of the kit contain instructions for PCR and mass spectrometry analysis of amplification products obtained using the primer pairs of the kits.
[244] Some embodiments of the kit include a barcode which uniquely identifies the kit and the components contained therein according to production lots and may also include any other information relative to the components such as concentrations, storage temperatures, etc. The barcode may also include analysis information to be read by optical barcode readers and sent to a computer controlling amplification, purification and mass spectrometric measurements. In some embodiments, the barcode provides access to a subset of base compositions in a base composition database which is in digital communication with base composition analysis software such that a base composition measured with primer pairs from a given kit can be compared with known base compositions of bioagent identifying amplicons defined by the primer pairs of that kit.
[245] In some embodiments, the kit contains a database of base compositions of bioagent identifying amplicons defined by the primer pairs of the kit. The database is stored on a convenient computer readable medium such as a compact disk or USB drive, for example.
[246] In some embodiments, the kit includes a computer program stored on a computer formatted medium (such as a compact disk or portable USB disk drive, for example) comprising instructions which direct a processor to analyze data obtained from the use of the primer pairs provided herein. The instructions of the software transform data related to amplification products into a molecular mass or base composition which is a useful concrete and tangible result used in identification and/or classification of bioagents. In some embodiments, the kits of the present invention contain all of the reagents sufficient to carry out one or more of the methods described herein.
[247] The following examples are provided so that the embodiments disclosed herein may be more efficiently understood. They serve only to illustrate the embodiments provided herein and are not to be construed as limiting in any manner.
EXAMPLES
Example 1: Design and Validation of Primers that Define Bioagent Identifying Amplicons for Identification of Bacteria
[248] For design of primers that define bacterial bioagent identifying amplicons, a series of bacterial genome segment sequences were obtained, aligned and scanned for regions where pairs of PCR primers would amplify products of about 45 to about 150 nucleotides in length and distinguish subgroups and/or individual strains from each other by their molecular masses or base compositions. A typical process shown in Figure 1 is employed for this type of analysis.
[249] A database of expected base compositions for each primer region was generated using an in silico PCR search algorithm, such as ePCR. An existing RNA structure search algorithm (Macke et al., Nucl. Acids Res., 2001, 29, 4724-4735, which is incorporated herein by reference in its entirety) has been modified to include PCR parameters such as hybridization conditions, mismatches, and thermodynamic calculations (SantaLucia, Proc. Natl. Acad. Sci. U.S.A., 1998, 95, 1460-1465, which is incorporated herein by reference in its entirety). This also provides information on primer specificity of the selected primer pairs.
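The idea of generating expected base compositions by in silico PCR can be sketched in a few lines. The example below is a deliberately simplified stand-in, not the ePCR program or the modified search algorithm referred to above: it requires exact primer matches on a single strand and ignores mismatches, hybridization conditions, and thermodynamics. All function names and the toy template are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def in_silico_amplicon(template, forward_primer, reverse_primer):
    """Return the amplicon bounded by two exactly matching primers, or None."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start < 0:
        return None
    # The reverse primer anneals to the top strand's reverse-complement site.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end < 0:
        return None
    return template[start:end + len(site)]


def base_composition(seq):
    """Count A, C, G and T in a sequence."""
    seq = seq.upper()
    return {base: seq.count(base) for base in "ACGT"}
```

Running every primer pair against every reference sequence in this way, and storing the resulting counts, would yield a lookup table of expected base compositions per primer pair.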
[250] Table 2 represents a collection of primers (sorted by primer pair number) configured to identify bacteria using the methods described herein. The primer pair number is an in-house database index number. Conserved regions within which the primers were configured to hybridize were identified on bacterial bioagent genes including, for example, arcC, aroE, ermA, ermC, gmk, gyrA, mecA, mecR1, mupR, nuc, pta, pvluk, tpi, tsst, tufB, and yqi. The forward and reverse primer names shown in Table 2 indicate the gene region of a bacterial genome to which the forward and reverse primers hybridize relative to a reference sequence. The forward primer name TSST1_NC002758.2-2137509-2138213_519_546_F indicates that the forward primer ("_F") hybridizes to the tsst gene ("TSST1"), specifically to residues 519-546 ("519_546") of a reference sequence represented by a sequence extraction of coordinates 2137509-2138213 from GenBank gi number 57634611 (as indicated by cross-references in Table 3 for the prefix "TSST1_NC002758.2"). This sequence extraction reference includes sequence encoding tsst. The primer pair name codes appearing in Table 2 are defined in Table 3. For example, Table 3 lists gene abbreviations and GenBank gi numbers that correspond with each primer name code. The above-mentioned primer pair has the code "TSST1_NC002758.2" and is thus configured to hybridize to sequence encoding the tsst gene, and the extraction sequence corresponds to coordinates 2137509-2138213 from GenBank gi number 57634611, which is a Staphylococcus aureus sequence. One of skill in the art will understand how to determine the exact hybridization coordinates of the primers with respect to the GenBank sequences, given this information. The reference nomenclature in the primer name is selected to provide a reference, and does not necessarily mean that the primer pair has been configured with 100% complementarity to that target site on the reference sequence.
One with ordinary skill knows how to obtain individual gene sequences or portions thereof from genomic sequences present in GenBank. In Table 2, Tp = 5-propynyluracil; Cp = 5-propynylcytosine; * = phosphorothioate linkage; I = inosine. GenBank Accession Numbers for reference sequences of bacteria are shown in Table 3 (below). In some cases, the reference sequences are extractions from bacterial genomic sequences or complements thereof. A description of the primer design is provided herein.
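The primer naming convention and the sequence notation described above lend themselves to simple parsing. The sketch below is illustrative only and makes two assumptions: primer names follow the basic pattern code_start_end_direction shown in the example name (names carrying extra suffixes such as variant numbers or TMOD are not handled), and sequences use only the symbols A, C, G, T, I, Tp, Cp and *. The function names are hypothetical.

```python
import re

# Longest symbols first so that "Tp" and "Cp" are not split into "T"/"C" + "p".
_TOKEN = re.compile(r"[TC]p|[ACGTI]|\*")

SYMBOLS = {
    "Tp": "5-propynyluracil",
    "Cp": "5-propynylcytosine",
    "I": "inosine",
    "*": "phosphorothioate linkage",
}


def parse_primer_name(name):
    """Split a basic primer name into reference code, coordinates and direction."""
    code, start, end, direction = name.rsplit("_", 3)
    if direction not in ("F", "R"):
        raise ValueError("primer name must end in _F or _R")
    return {"code": code, "start": int(start), "end": int(end), "direction": direction}


def tokenize_primer(sequence):
    """Split a primer sequence into nucleobase and linkage symbols."""
    tokens = _TOKEN.findall(sequence)
    if "".join(tokens) != sequence:
        raise ValueError("sequence contains unrecognized symbols")
    return tokens


def nucleobase_count(sequence):
    """Number of nucleobases, ignoring phosphorothioate linkage markers."""
    return sum(1 for token in tokenize_primer(sequence) if token != "*")
```

Counting nucleobases this way gives a quick consistency check between a modified primer's sequence and the coordinate span encoded in its name.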
DOCKET NO.: DIBIS-0096WO
Table 2: Primer Pairs for Identification of Bacteria
Figure imgf000068_0001
Primer Pair Number | Forward Primer Name | Forward Sequence | Forward SEQ ID NO | Reverse Primer Name | Reverse Sequence | Reverse SEQ ID NO
17 | 23S_EC_2645_2669_F | TCTGTCCCTAGTACGAGAGGACCGG | 408 | 23S_EC_2744_2761_R | TGCTTAGATGCTTTCAGC | 1252
18 | 23S_EC_2645_2669_2_F | CTGTCCCTAGTACGAGAGGACCGG | 83 | 23S_EC_2751_2767_R | GTTTCATGCTTAGATGCTTTCAGC | 846
19 | 23S_EC_493_518_F | GGGGAGTGAAAGAGATCCTGAAACCG | 125 | 23S_EC_551_571_R | ACAAAAGGTACGCCGTCACCC | 717
20 | 23S_EC_493_518_2_F | GGGGAGTGAAAGAGATCCTGAAACCG | 125 | 23S_EC_551_571_2_R | ACAAAAGGCACGCCATCACCC | 716
21 | 23S_EC_971_992_F | CGAGAGGGAAACAACCCAGACC | 66 | 23S_EC_1059_1077_R | TGGCTGCTTCTAAGCCAAC | 1282
22 | CAPC_BA_104_131_F | GTTATTTAGCACTCGTTTTTAATCAGCC | 139 | CAPC_BA_180_205_R | TGAATCTTGAAACACCATACGTAACG | 1150
23 | CAPC_BA_114_133_F | ACTCGTTTTTAATCAGCCCG | 20 | CAPC_BA_185_205_R | TGAATCTTGAAACACCATACG | 1149
24 | CAPC_BA_274_303_F | GATTATTGTTATCCTGTTATGCCATTTGAG | 109 | CAPC_BA_349_376_R | GTAACCCTTGTCTTTGAATTGTATTTGC | 837
25 | CAPC_BA_276_296_F | TTATTGTTATCCTGTTATGCC | 663 | CAPC_BA_358_377_R | GGTAACCCTTGTCTTTGAAT | 834
26 | CAPC_BA_281_301_F | GTTATCCTGTTATGCCATTTG | 138 | CAPC_BA_361_378_R | TGGTAACCCTTGTCTTTG | 1298
27 | CAPC_BA_315_334_F | CCGTGGTATTGGAGTTATTG | 59 | CAPC_BA_361_378_R | TGGTAACCCTTGTCTTTG | 1298
28 | CYA_BA_1055_1072_F | GAAAGAGTTCGGATTGGG | 92 | CYA_BA_1112_1130_R | TGTTGACCATGCTTCTTAG | 1352
29 | CYA_BA_1349_1370_F | ACAACGAAGTACAATACAAGAC | 12 | CYA_BA_1447_1426_R | CTTCTACATTTTTAGCCATCAC | 800
30 | CYA_BA_1353_1379_F | CGAAGTACAATACAAGACAAAAGAAGG | 64 | CYA_BA_1448_1467_R | TGTTAACGGCTTCAAGACCC | 1342
31 | CYA_BA_1359_1379_F | ACAATACAAGACAAAAGAAGG | 13 | CYA_BA_1447_1461_R | CGGCTTCAAGACCCC | 794
32 | CYA_BA_914_937_F | CAGGTTTAGTACCAGAACATGCAG | 53 | CYA_BA_999_1026_R | ACCACTTTTAATAAGGTTTGTAGCTAAC | 728
33 | CYA_BA_916_935_F | GGTTTAGTACCAGAACATGC | 131 | CYA_BA_1003_1025_R | CCACTTTTAATAAGGTTTGTAGC | 768
34 | INFB_EC_1365_1393_F | TGCTCGTGGTGCACAAGTAACGGATATTA | 524 | INFB_EC_1439_1467_R | TGCTGCTTTCGCATGGTTAATTGCTTCAA | 1248
35 | LEF_BA_1033_1052_F | TCAAGAAGAAAAAGAGC | 254 | LEF_BA_1119_1135_R | GAATATCAATTTGTAGC | 803
36 | LEF_BA_1036_1066_F | CAAGAAGAAAAAGAGCTTCTAAAAAGAATAC | 44 | LEF_BA_1119_1149_R | AGATAAAGAATCACGAATATCAATTTGTAGC | 745
37 | LEF_BA_756_781_F | AGCTTTTGCATATTATATCGAGCCAC | 26 | LEF_BA_843_872_R | TCTTCCAAGGATAGATTTATTTCTTGTTCG | 1135
38 | LEF_BA_758_778_F | CTTTTGCATATTATATCGAGC | 90 | LEF_BA_843_865_R | AGGATAGATTTATTTCTTGTTCG | 748
39 | LEF_BA_795_813_F | TTTACAGCTTTATGCACCG | 700 | LEF_BA_883_900_R | TCTTGACAGCATCCGTTG | 1140
40 | LEF_BA_883_899_F | CAACGGATGCTGGCAAG | 43 | LEF_BA_939_958_R | CAGATAAAGAATCGCTCCAG | 762
41 | PAG_BA_122_142_F | CAGAATCAAGTTCCCAGGGG | 49 | PAG_BA_190_209_R | CCTGTAGTAGAAGAGGTAAC | 781
42 | PAG_BA_123_145_F | AGAATCAAGTTCCCAGGGGTTAC | 22 | PAG_BA_187_210_R | CCCTGTAGTAGAAGAGGTAACCAC | 774
43 | PAG_BA_269_287_F | AATCTGCTATTTGGTCAGG | 11 | PAG_BA_326_344_R | TGATTATCAGCGGAAGTAG | 1186
44 | PAG_BA_655_675_F | GAAGGATATACGGTTGATGTC | 93 | PAG_BA_755_772_R | CCGTGCTCCATTTTTCAG | 778
45 | PAG_BA_753_772_F | TCCTGAAAAATGGAGCACGG | 341 | PAG_BA_849_868_R | TCGGATAAGCTGCCACAAGG | 1089
46 | PAG_BA_763_781_F | TGGAGCACGGCTTCTGATC | 552 | PAG_BA_849_868_R | TCGGATAAGCTGCCACAAGG | 1089
47 | RPOC_EC_1018_1045_F | CAAAACTTATTAGGTAAGCGTGTTGACT | 39 | RPOC_EC_1095_1124_R | TCAAGCGCCATTTCTTTTGGTAAACCACAT | 959
48 | RPOC_EC_1018_1045_2_F | CAAAACTTATTAGGTAAGCGTGTTGACT | 39 | RPOC_EC_1095_1124_2_R | TCAAGCGCCATCTCTTTCGGTAATCCACAT | 958
49 | RPOC_EC_114_140_F | TAAGAAGCCGGAAACCATCAACTACCG | 158 | RPOC_EC_213_232_R | GGCGCTTGTACTTACCGCAC | 831
50 | RPOC_EC_2178_2196_F | TGATTCTGGTGCCCGTGGT | 478 | RPOC_EC_2225_2246_R | TTGGCCATCAGGCCACGCATAC | 1414
51 | RPOC_EC_2178_2196_2_F | TGATTCCGGTGCCCGTGGT | 477 | RPOC_EC_2225_2246_2_R | TTGGCCATCAGACCACGCATAC | 1413
52 | RPOC_EC_2218_2241_F | CTGGCAGGTATGCGTGGTCTGATG | 81 | RPOC_EC_2313_2337_R | CGCACCGTGGGTTGAGATGAAGTAC | 790
53 | RPOC_EC_2218_2241_2_F | CTTGCTGGTATGCGTGGTCTGATG | 86 | RPOC_EC_2313_2337_2_R | CGCACCATGCGTAGAGATGAAGTAC | 789
54 | RPOC_EC_808_833_F | CGTCGGGTGATTAACCGTAACAACCG | 75 | RPOC_EC_865_889_R | GTTTTTCGTTGCGTACGATGATGTC | 847
55 | RPOC_EC_808_833_2_F | CGTCGTGTAATTAACCGTAACAACCG | 76 | RPOC_EC_865_891_R | ACGTTTTTCGTTTTGAACGATAATGCT | 741
56 | RPOC_EC_993_1019_F | CAAAGGTAAGCAAGGTCGTTTCCGTCA | 41 | RPOC_EC_1036_1059_R | CGAACGGCCTGAGTAGTCAACACG | 785
57 | RPOC_EC_993_1019_2_F | CAAAGGTAAGCAAGGACGTTTCCGTCA | 40 | RPOC_EC_1036_1059_2_R | CGAACGGCCAGAGTAGTCAACACG | 784
58 | SSPE_BA_115_137_F | CAAGCAAACGCACAATCAGAAGC | 45 | SSPE_BA_197_222_R | TGCACGTCTGTTTCAGTTGCAAATTC | 1201
59 | TUFB_EC_239_259_F | TAGACTGCCCAGGACACGCTG | 204 | TUFB_EC_283_303_R | GCCGTCCATCTGAGCAGCACC | 815
60 | TUFB_EC_239_259_2_F | TTGACTGCCCAGGTCACGCTG | 678 | TUFB_EC_283_303_2_R | GCCGTCCATTTGAGCAGCACC | 816
61 | TUFB_EC_976_1000_F | AACTACCGTCCGCAGTTCTACTTCC | 4 | TUFB_EC_1045_1068_R | GTTGTCGCCAGGCATAACCATTTC | 845
62 | TUFB_EC_976_1000_2_F | AACTACCGTCCTCAGTTCTACTTCC | 5 | TUFB_EC_1045_1068_2_R | GTTGTCACCAGGCATTACCATTTC | 844
63 | TUFB_EC_985_1012_F | CCACAGTTCTACTTCCGTACTACTGACG | 56 | TUFB_EC_1033_1062_R | TCCAGGCATTACCATTTCTACTCCTTCTGG | 1006
66 | RPLB_EC_650_679_F | GACCTACAGTAAGAGGTTCTGTAATGAACC | 98 | RPLB_EC_739_762_R | TCCAAGTGCTGGTTTACCCCATGG | 999
67 | RPLB_EC_688_710_F | CATCCACACGGTGGTGGTGAAGG | 54 | RPLB_EC_736_757_R | GTGCTGGTTTACCCCATGGAGT | 842
68 | RPOC_EC_1036_1060_F | CGTGTTGACTATTCGGGGCGTTCAG | 78 | RPOC_EC_1097_1126_R | ATTCAAGAGCCATTTCTTTTGGTAAACCAC | 754
69 | RPOB_EC_3762_3790_F | TCAACAACCTCTTGGAGGTAAAGCTCAGT | 248 | RPOB_EC_3836_3865_R | TTTCTTGAAGAGTATGAGCTGCTCCGTAAG | 1435
70 | RPLB_EC_688_710_F | CATCCACACGGTGGTGGTGAAGG | 54 | RPLB_EC_743_771_R | TGTTTTGTATCCAAGTGCTGGTTTACCCC | 1356
71 | VALS_EC_1105_1124_F | CGTGGCGGCGTGGTTATCGA | 77 | VALS_EC_1195_1218_R | CGGTACGAACTGGATGTCGCCGTT | 795
72 | RPOB_EC_1845_1866_F | TATCGCTCAGGCGAACTCCAAC | 233 | RPOB_EC_1909_1929_R | GCTGGATTCGCCTTTGCTACG | 825
73 | RPLB_EC_669_698_F | TGTAATGAACCCTAATGACCATCCACACGG | 623 | RPLB_EC_735_761_R | CCAAGTGCTGGTTTACCCCATGGAGTA | 767
74 | RPLB_EC_671_700_F | TAATGAACCCTAATGACCATCCACACGGTG | 169 | RPLB_EC_737_762_R | TCCAAGTGCTGGTTTACCCCATGGAG | 1000
75 | SP101_SPET11_1_29_F | AACCTTAATTGGAAAGAAACCCAAGAAGT | 2 | SP101_SPET11_92_116_R | CCTACCCAACGTTCACCAAGGGCAG | 779
Figure imgf000072_0001
Figure imgf000073_0001
126 | 23S_EC_891_910_F | GACTTACCAACCCGATGCAA | 100 | 23S_EC_1908_1931_R | TACCTTAGGACCGTTATAGTTACG | 893
127 | 23S_EC_1424_1442_F | GGACGGAGAAGGCTATGTT | 117 | 23S_EC_2475_2494_R | CCAAACACCGCCGTCGATAT | 765
128 | 23S_EC_1908_1931_F | CGTAACTATAACGGTCCTAAGGTA | 73 | 23S_EC_2833_2852_R | GCTTACACACCCGGCCTATC | 826
129 | 23S_EC_2475_2494_F | ATATCGACGGCGGTGTTTGG | 31 | TRNA_ASP-RRNH_EC_23_41.2_R | GCGTGACAGGCAGGTATTC | 820
131 | 16S_EC_-60_-39_F | AGTCTCAAGAGTGAACACGTAA | 28 | 16S_EC_508_525_R | GCTGCTGGCACGGAGTTA | 823
132 | 16S_EC_326_345_F | GACACGGTCCAGACTCCTAC | 95 | 16S_EC_1041_1058_R | CCATGCAGCACCTGTCTC | 771
133 | 16S_EC_705_724_F | GATCTGGAGGAATACCGGTG | 107 | 16S_EC_1493_1512_R | ACGGTTACCTTGTTACGACT | 739
134 | 16S_EC_1268_1287_F | GAGAGCAAGCGGACCTCATA | 101 | TRNA_ALA-RRNH_EC_30_46.2_R | CCTCCTGCGTGCAAAGC | 780
135 | 16S_EC_969_985_F | ACGCGAAGAACCTTACC | 19 | 16S_EC_1061_1078.2_R | ACAACACGAGCTGACGAC | 719
137 | 16S_EC_969_985_F | ACGCGAAGAACCTTACC | 19 | 16S_EC_1061_1078.2_I14_R | ACAACACGAGCTGICGAC | 721
138 | 16S_EC_969_985_F | ACGCGAAGAACCTTACC | 19 | 16S_EC_1061_1078.2_I12_R | ACAACACGAGCIGACGAC | 718
139 | 16S_EC_969_985_F | ACGCGAAGAACCTTACC | 19 | 16S_EC_1061_1078.2_I11_R | ACAACACGAGITGACGAC | 722
140 | 16S_EC_969_985_F | ACGCGAAGAACCTTACC | 19 | 16S_EC_1061_1078.2_I16_R | ACAACACGAGCTGACIAC | 720
141 | 16S_EC_969_985_F | ACGCGAAGAACCTTACC | 19 | 16S_EC_1061_1078.2_2I_R | ACAACACGAICTIACGAC | 723
142 | 16S_EC_969_985_F | ACGCGAAGAACCTTACC | 19 | 16S_EC_1061_1078.2_3I_R | ACAACACIAICTIACGAC | 724
143 | 16S_EC_969_985_F | ACGCGAAGAACCTTACC | 19 | 16S_EC_1061_1078.2_4I_R | ACAACACIAICTIACIAC | 725
147 | 23S_EC_2652_2669_F | CTAGTACGAGAGGACCGG | 79 | 23S_EC_2741_2760_R | ACTTAGATGCTTTCAGCGGT | 743
158 | 16S_EC_683_700_F | GTGTAGCGGTGAAATGCG | 137 | 16S_EC_880_894_R | CGTACTCCCCAGGCG | 796
159 | 16S_EC_1100_1116_F | CAACGAGCGCAACCCTT | 42 | 16S_EC_1174_1188_R | TCCCCACCTTCCTCC | 1019
Figure imgf000075_0001
239 | 23S_EC_2599_2616_F | GACAGTTCGGTCCCTATC | 96 | 23S_EC_2653_2669_R | CCGGTCCTCTCGTACTA | 777
240 | 23S_EC_2653_2669_F | TAGTACGAGAGGACCGG | 227 | 23S_EC_2737_2758_R | TTAGATGCTTTCAGCACTTATC | 1369
241 | 23S_BS_-68_-44_F | AAACTAGATAACAGTAGACATCAC | 1 | 23S_BS_5_21_R | GTGCGCCCTTTCTAACTT | 841
242 | 16S_EC_8_27_F | AGAGTTTGATCATGGCTCAG | 23 | 16S_EC_342_358_R | ACTGCTGCCTCCCGTAG | 742
243 | 16S_EC_314_332_F | CACTGGAACTGAGACACGG | 48 | 16S_EC_556_575_R | CTTTACGCCCAGTAATTCCG | 801
244 | 16S_EC_518_536_F | CCAGCAGCCGCGGTAATAC | 57 | 16S_EC_774_795_R | GTATCTAATCCTGTTTGCTCCC | 839
245 | 16S_EC_683_700_F | GTGTAGCGGTGAAATGCG | 137 | 16S_EC_967_985_R | GGTAAGGTTCTTCGCGTTG | 835
246 | 16S_EC_937_954_F | AAGCGGTGGAGCATGTGG | 7 | 16S_EC_1220_1240_R | ATTGTAGCACGTGTGTAGCCC | 757
247 | 16S_EC_1195_1213_F | CAAGTCATCATGGCCCTTA | 46 | 16S_EC_1525_1541_R | AAGGAGGTGATCCAGCC | 714
248 | 16S_EC_8_27_F | AGAGTTTGATCATGGCTCAG | 23 | 16S_EC_1525_1541_R | AAGGAGGTGATCCAGCC | 714
249 | 23S_EC_1831_1849_F | ACCTGCCCAGTGCTGGAAG | 18 | 23S_EC_1919_1936_R | TCGCTACCTTAGGACCGT | 1080
250 | 16S_EC_1387_1407_F | GCCTTGTACACACCTCCCGTC | 112 | 16S_EC_1494_1513_R | CACGGCTACCTTGTTACGAC | 761
251 | 16S_EC_1390_1411_F | TTGTACACACCGCCCGTCATAC | 693 | 16S_EC_1486_1505_R | CCTTGTTACGACTTCACCCC | 783
252 | 16S_EC_1367_1387_F | TACGGTGAATACGTTCCCGGG | 191 | 16S_EC_1485_1506_R | ACCTTGTTACGACTTCACCCCA | 731
253 | 16S_EC_804_822_F | ACCACGCCGTAAACGATGA | 14 | 16S_EC_909_929_R | CCCCCGTCAATTCCTTTGAGT | 773
254 | 16S_EC_791_812_F | GATACCCTGGTAGTCCACACCG | 106 | 16S_EC_886_904_R | GCCTTGCGACCGTACTCCC | 817
255 | 16S_EC_789_810_F | TAGATACCCTGGTAGTCCACGC | 206 | 16S_EC_882_899_R | GCGACCGTACTCCCCAGG | 818
256 | 16S_EC_1092_1109_F | TAGTCCCGCAACGAGCGC | 228 | 16S_EC_1174_1195_R | GACGTCATCCCCACCTTCCTCC | 810
257 | 23S_EC_2586_2607_F | TAGAACGTCGCGAGACAGTTCG | 203 | 23S_EC_2658_2677_R | AGTCCATCCCGGTCCTCTCG | 749
Figure imgf000077_0001
Figure imgf000078_0001
Figure imgf000079_0001
Figure imgf000080_0001
442 | SP101_SPET11_358_387_TMOD_F | TGGGGATTCAGCCATCAAAGCAGCTATTGAC | 588 | SP101_SPET11_448_473_TMOD_R | TCCAACCTTTTCCACAACAGAATCAGC | 998
443 | SP101_SPET11_600_629_TMOD_F | TCCTTACTTCGAACTATGAATCTTTTGGAAG | 348 | SP101_SPET11_686_714_TMOD_R | TCCCATTTTTTCACGCATGCTGAAAATATC | 1018
444 | SP101_SPET11_658_684_TMOD_F | TGGGGATTGATATCACCGATAAGAAGAA | 589 | SP101_SPET11_756_784_TMOD_R | TGATTGGCGATAAAGTGATATTTTCTAAAA | 1189
445 | SP101_SPET11_776_801_TMOD_F | TTCGCCAATCAAAACTAAGGGAATGGC | 673 | SP101_SPET11_871_896_TMOD_R | TGCCCACCAGAAAGACTAGCAGGATAA | 1217
446 | SP101_SPET11_1_29_TMOD_F | TAACCTTAATTGGAAAGAAACCCAAGAAGT | 154 | SP101_SPET11_92_116_TMOD_R | TCCTACCCAACGTTCACCAAGGGCAG | 1044
447 | SP101_SPET11_364_385_F | TCAGCCATCAAAGCAGCTATTG | 276 | SP101_SPET11_448_471_R | TACCTTTTCCACAACAGAATCAGC | 894
448 | SP101_SPET11_3085_3104_F | TAGCTAATGGTCAGGCAGCC | 216 | SP101_SPET11_3170_3194_R | TCGACGACCATCTTGGAAAGATTTC | 1066
449 | RPLB_EC_690_710_F | TCCACACGGTGGTGGTGAAGG | 309 | RPLB_EC_737_758_R | TGTGCTGGTTTACCCCATGGAG | 1336
481 | BONTA_X52066_538_552_F | TATGGCTCTACTCAA | 239 | BONTA_X52066_647_660_R | TGTTACTGCTGGAT | 1346
482 | BONTA_X52066_538_552P_F | TA*TpGGC*Tp*Cp*TpA*Cp*Tp*CpAA | 143 | BONTA_X52066_647_660P_R | TG*Tp*TpA*Cp*TpG*Cp*TpGGAT | 1146
483 | BONTA_X52066_701_720_F | GAATAGCAATTAATCCAAAT | 94 | BONTA_X52066_759_775_R | TTACTTCTAACCCACTC | 1367
484 | BONTA_X52066_701_720P_F | GAA*TpAG*CpAA*Tp*TpAA*Tp*Cp*CpAAAT | 91 | BONTA_X52066_759_775P_R | TTA*Cp*Tp*Tp*Cp*TpAA*Cp*Cp*CpA*Cp*TpC | 1359
485 | BONTA_X52066_450_473_F | TCTAGTAATAATAGGACCCTCAGC | 393 | BONTA_X52066_517_539_R | TAACCATTTCGCGTAAGATTCAA | 859
486 | BONTA_X52066_450_473P_F | T*Cp*TpAGTAATAATAGGA*Cp*Cp*Cp*Tp*CpAGC | 142 | BONTA_X52066_517_539P_R | TAACCA*Tp*Tp*Tp*CpGCGTAAGA*Tp*Tp*CpAA | 857
487 | BONTA_X52066_591_620_F | TGAGTCACTTGAAGTTGATACAAATCCTCT | 463 | BONTA_X52066_644_671_R | TCATGTGCTAATGTTACTGCTGGATCTG | 992
608 | SSPE_BA_156_168P_F | TGGTpGCpTpAGCpATT | 616 | SSPE_BA_243_255P_R | TGCpAGCpTGATpTpGT | 1241
609 | SSPE_BA_75_89P_F | TACpAGAGTpTpTpGCpGAC | 192 | SSPE_BA_163_177P_R | TGTGCTpTpTpGAATpGCpT | 1338
610 | SSPE_BA_150_168P_F | TGCTTCTGGTpGCpTpAGCpATT | 533 | SSPE_BA_243_264P_R | TGATTGTTTTGCpAGCpTGATpTpGT | 1191
611 | SSPE_BA_72_89P_F | TGGTACpAGAGTpTpTpGCpGAC | 602 | SSPE_BA_163_182P_R | TCATTTGTGCTpTpTpGAATpGCpT | 995
978 | RPOC EC 2145 2175 F | TCAGGAGTCGTTCAACTCGATCTACATGATG | 285 | RPOC EC 2228 2247 R | TTACGCCATCAGGCCACGCA | 1363
1045 | CJST CJ 1668 1700 F | TGCTCGAGTGATTGACTTTGCTAAATTTAGAGA | 522 | CJST CJ 1774 1799 R | TGAGCGTGTGGAAAAGGACTTGGATG | 1170
1046 | CJST CJ 2171 2197 F | TCGTTTGGTGGTGGTAGATGAAAAAGG | 388 | CJST CJ 2283 2313 R | TCTCTTTCAAAGCACCATTGCTCATTATAGT | 1126
1047 | CJST CJ 584 616 F | TCCAGGACAAATGTATGAAAAATGTCCAAGAAG | 315 | CJST CJ 663 692 R | TTCATTTTCTGGTCCAAAGTAAGCAGTATC | 1379
1048 | CJST CJ 360 394 F | TCCTGTTATCCCTGAAGTAGTTAATCAAGTTTGTT | 346 | CJST CJ 442 476 R | TCAACTGGTTCAAAAACATTAAGTTGTAATTGTCC | 955
1049 | CJST CJ 2636 2668 F | TGCCTAGAAGATCTTAAAAATTTCCGCCAACTT | 504 | CJST CJ 2753 2777 R | TTGCTGCCATAGCAAAGCCTACAGC | 1409
1050 | CJST CJ 1290 1320 F | TGGCTTATCCAAATTTAGATCGTGGTTTTAC | 575 | CJST CJ 1406 1433 R | TTTGCTCATGATCTGCATGAAGCATAAA | 1437
1051 | CJST CJ 3267 3293 F | TTTGATTTTACGCCGTCCTCCAGGTCG | 707 | CJST CJ 3356 3385 R | TCAAAGAACCCGCACCTAATTCATCATTTA | 951
1052 | CJST CJ 5 39 F | TAGGCGAAGATATACAAAGAGTATTAGAAGCTAGA | 222 | CJST CJ 104 137 R | TCCCTTATTTTTCTTTCTACTACCTTCGGATAAT | 1029
1053 | CJST CJ 1080 1110 F | TTGAGGGTATGCACCGTCTTTTTGATTCTTT | 681 | CJST CJ 1166 1198 R | TCCCCTCATGTTTAAATGATCAGGATAAAAAGC | 1022
1054 | CJST CJ 2060 2090 F | TCCCGGACTTAATATCAATGAAAATTGTGGA | 323 | CJST CJ 2148 2174 R | TCGATCCGCATCACCATCAAAAGCAAA | 1068
1055 | CJST CJ 2869 2895 F | TGAAGCTTGTTCTTTAGCAGGACTTCA | 432 | CJST CJ 2979 3007 R | TCCTCCTTGTGCCTCAAAACGCATTTTTA | 1045
1056 | CJST CJ 1880 1910 F | TCCCAATTAATTCTGCCATTTTTCCAGGTAT | 317 | CJST CJ 1981 2011 R | TGGTTCTTACTTGCTTTGCATAAACTTTCCA | 1309
1057 | CJST CJ 2185 2212 F | TAGATGAAAAGGGCGAAGTGGCTAATGG | 208 | CJST CJ 2283 2316 R | TGAATTCTTTCAAAGCACCATTGCTCATTATAGT | 1152
1058 | CJST CJ 1643 1670 F | TTATCGTTTGTGGAGCTAGTGCTTATGC | 660 | CJST CJ 1724 1752 R | TGCAATGTGTGCTATGTCAGCAAAAAGAT | 1198
1059 | CJST CJ 2165 2194 F | TGCGGATCGTTTGGTGGTTGTAGATGAAAA | 511 | CJST CJ 2247 2278 R | TCCACACTGGATTGTAATTTACCTTGTTCTTT | 1002
1060 | CJST CJ 599 632 F | TGAAAAATGTCCAAGAAGCATAGCAAAAAAAGCA | 424 | CJST CJ 711 743 R | TCCCGAACAATGAGTTGTATCAACTATTTTTAC | 1024
1061 | CJST CJ 360 393 F | TCCTGTTATCCCTGAAGTAGTTAATCAAGTTTGT | 345 | CJST CJ 443 477 R | TACAACTGGTTCAAAAACATTAAGCTGTAATTGTC | 882
1062 | CJST CJ 2678 2703 F | TCCCCAGGACACCCTGAAATTTCAAC | 321 | CJST CJ 2760 2787 R | TGTGCTTTTTTTGCTGCCATAGCAAAGC | 1339
[251] Primer pair name codes and reference sequences are shown in Table 3. The primer name code typically represents the gene to which the given primer pair is targeted. The primer pair name may include specific coordinates with respect to a reference sequence defined by an extraction of a section of sequence or defined by a GenBank gi number, or the corresponding complementary sequence of the extraction, or the entire GenBank gi number as indicated by the label "no extraction." Where "no extraction" is indicated for a reference sequence, the coordinates of a primer pair named to the reference sequence are with respect to the GenBank gi listing. Gene abbreviations are shown in bold type in the "Gene Name" column.
[252] Methods of primer design are well-known, and one of skill in the art will understand that primer pairs configured to prime amplification of double-stranded sequences will be configured and named using one strand of a double-stranded reference sequence. The forward primer is the primer of the pair that comprises full or partial sequence identity to the one strand of the sequence being used as a reference during design. The reverse primer is the primer of the pair that comprises reverse complementarity to that strand.
[253] To determine the exact primer hybridization coordinates of a given pair of primers on a given bioagent nucleic acid sequence and to determine the sequences, molecular masses and base compositions of an amplification product to be obtained upon amplification of nucleic acid of a known bioagent with known sequence information in the region of interest with a given pair of primers, one with ordinary skill in bioinformatics is capable of obtaining alignments of the primers of the present invention with the sequence identified by the GenBank gi number of the relevant nucleic acid sequence of the known bioagent. For example, the reference sequence GenBank gi numbers (Table 3) provide the identities of the sequences which can be obtained from GenBank. Alignments can be done using a bioinformatics tool such as BLASTn provided to the public by NCBI (Bethesda, MD). Alternatively, a relevant GenBank sequence may be downloaded and imported into custom programmed or commercially available bioinformatics programs wherein the alignment can be carried out to determine the primer hybridization coordinates and the sequences, molecular masses and base compositions of the amplification product. For example, to obtain the hybridization coordinates of primer pair number 2095 (SEQ ID NOs: 456:1261), the forward primer (SEQ ID NO: 456) is first subjected to a BLASTn search on the publicly available NCBI BLAST website. "RefSeq Genomic" is chosen as the BLAST database since the gi numbers refer to genomic sequences. The BLAST query is then performed. Among the top results returned is a match to GenBank gi number 21281729 (Accession Number NC_003923). The result, shown below, indicates that the forward primer hybridizes to positions 1530282..1530307 of the genomic sequence of Staphylococcus aureus subsp. aureus MW2 (represented by gi number 21281729).
Staphylococcus aureus subsp. aureus MW2, complete genome Length=2820462
Features in this part of subject sequence:
Panton-Valentine leukocidin chain F precursor
Score = 52.0 bits (26), Expect = 2e-05 Identities = 26/26 (100%), Gaps = 0/26 (0%) Strand=Plus/Plus
Query 1 TGAGCTGCATCAACTGTATTGGATAG 26 (SEQ ID NO: 456)
Sbjct 1530282 TGAGCTGCATCAACTGTATTGGATAG 1530307 (SEQ ID NO: 456)
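Once a reference sequence has been downloaded, the coordinate lookup illustrated by the BLAST result above can be approximated with an exact string search. The following Python sketch is illustrative only and is not part of the disclosed method: the function names, the toy reference sequence and the made-up reverse primer are hypothetical, no mismatches are tolerated (unlike BLASTn), and coordinates are reported as 1-based inclusive positions on the plus strand.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def locate_primer(primer, reference):
    """Find exact primer matches on either strand of a reference sequence.

    Returns (start, end, strand) tuples using 1-based inclusive plus-strand
    coordinates. "Plus/Plus" means the primer matches the reference as
    written; "Plus/Minus" means its reverse complement matches.
    """
    reference = reference.upper()
    hits = []
    for strand, query in (("Plus/Plus", primer.upper()),
                          ("Plus/Minus", reverse_complement(primer))):
        position = reference.find(query)
        while position != -1:
            hits.append((position + 1, position + len(query), strand))
            position = reference.find(query, position + 1)
    return hits


# Toy reference (hypothetical): the forward primer of SEQ ID NO: 456 placed in
# arbitrary flanking sequence, followed by the reverse complement of a
# made-up reverse primer.
forward_primer = "TGAGCTGCATCAACTGTATTGGATAG"
made_up_reverse_primer = "GGATCCTTAGCA"
toy_reference = ("ACGTACGTAC" + forward_primer + "TTTTGGGGCC"
                 + reverse_complement(made_up_reverse_primer) + "AAAA")

print(locate_primer(forward_primer, toy_reference))
print(locate_primer(made_up_reverse_primer, toy_reference))
```

In this toy case the forward primer is reported on Plus/Plus and the reverse primer on Plus/Minus, mirroring the strand designations used in BLAST output.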
[254] The hybridization coordinates of the reverse primer (SEQ ID NO: 1261) can be determined in a similar manner and thus, the bioagent identifying amplicon can be defined in terms of genomic coordinates. The query/subject arrangement of the result would be presented in Strand = Plus/Minus format because the reverse strand hybridizes to the reverse complement of the genomic sequence. The preceding sequence analyses are well known to one with ordinary skill in bioinformatics and thus, Table 3 contains sufficient information to determine the primer hybridization coordinates of any of the primers of Table 2 to the applicable reference sequences described therein.
Table 3: Primer Name Codes and Reference Sequence
Figure imgf000125_0001
Figure imgf000126_0001
Figure imgf000127_0001
[255] Note: artificial reference sequences represent concatenations of partial gene extractions from the indicated reference gi number. Partial sequences were used to create the concatenated sequence because complete gene sequences were not necessary for primer design.
Example 2: Sample Preparation and PCR
[256] Genomic DNA was prepared from samples using the DNeasy Tissue Kit (Qiagen, Valencia, CA) according to the manufacturer's protocols.
[257] All PCR reactions were assembled in 50 μL reaction volumes in a 96-well microtiter plate format using a Packard MPII liquid handling robotic platform and MJ Dyad thermocyclers (MJ Research, Waltham, MA) or Eppendorf Mastercycler thermocyclers (Eppendorf, Westbury, NY). The PCR reaction mixture consisted of 4 units of Amplitaq Gold, 1x buffer II (Applied Biosystems, Foster City, CA), 1.5 mM MgCl2, 0.4 M betaine, 800 μM dNTP mixture and 250 nM of each primer. The following typical PCR conditions were used: 95°C for 10 min followed by 8 cycles of 95°C for 30 seconds, 48°C for 30 seconds, and 72°C for 30 seconds, with the 48°C annealing temperature increasing 0.9°C with each of the eight cycles. The PCR was then continued for 37 additional cycles of 95°C for 15 seconds, 56°C for 20 seconds, and 72°C for 20 seconds.
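The cycling conditions of paragraph [257] can be written out step by step, which makes the stepped annealing temperature explicit. The following Python sketch is illustrative only: the function name and tuple layout are hypothetical, and it assumes that the first of the eight cycles anneals at 48.0°C and each later cycle is 0.9°C warmer (the text does not state whether the increment is applied before or after the first cycle). Ramp times between steps are ignored.

```python
def build_pcr_program():
    """Return the cycling steps as (label, temperature_C, seconds) tuples."""
    steps = [("initial denaturation", 95.0, 600)]  # 95 C for 10 min

    # Stage 1: 8 cycles; annealing starts at 48.0 C and rises by 0.9 C per
    # cycle (assumption: the first cycle uses 48.0 C).
    for cycle in range(8):
        anneal_temp = round(48.0 + 0.9 * cycle, 1)
        steps.append(("denature", 95.0, 30))
        steps.append(("anneal", anneal_temp, 30))
        steps.append(("extend", 72.0, 30))

    # Stage 2: 37 further cycles with a fixed 56 C annealing temperature.
    for _ in range(37):
        steps.append(("denature", 95.0, 15))
        steps.append(("anneal", 56.0, 20))
        steps.append(("extend", 72.0, 20))
    return steps


program = build_pcr_program()
stage1_anneal = [temp for label, temp, _ in program[1:25] if label == "anneal"]
hold_seconds = sum(seconds for _, _, seconds in program)
print(stage1_anneal)
print(hold_seconds / 60.0, "minutes of programmed hold time")
```

Under these assumptions the first-stage annealing temperature climbs from 48.0°C to 54.3°C, just below the 56°C used in the second stage.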
Example 3: Purification of PCR Products for Mass Spectrometry with Ion Exchange Resin- Magnetic Beads
[258] For solution capture of nucleic acids with ion exchange resin linked to magnetic beads, 25 μl of a 2.5 mg/mL suspension of BioClone amine terminated superparamagnetic beads were added to 25 to 50 microliters of a PCR (or RT-PCR) reaction containing approximately 10 pM of a typical PCR amplification product. The above suspension was mixed for approximately 5 minutes by vortexing or pipetting, after which the liquid was removed using a magnetic separator. The beads containing bound PCR amplification product were then washed three times with 50 mM ammonium bicarbonate/50% MeOH or 100 mM ammonium bicarbonate/50% MeOH, followed by three more washes with 50% MeOH. The bound PCR amplicon was eluted with a solution of 25 mM piperidine, 25 mM imidazole, 35% MeOH which included peptide calibration standards.
Example 4: Mass Spectrometry and Base Composition Analysis
[259] The ESI-FTICR mass spectrometer is based on a Bruker Daltonics (Billerica, MA) Apex II 70e electrospray ionization Fourier transform ion cyclotron resonance mass spectrometer that employs an actively shielded 7 Tesla superconducting magnet. The active shielding constrains the majority of the fringing magnetic field from the superconducting magnet to a relatively small volume. Thus, components that might be adversely affected by stray magnetic fields, such as CRT monitors, robotic components, and other electronics, can operate in close proximity to the FTICR spectrometer. All aspects of pulse sequence control and data acquisition were performed on a 600 MHz Pentium II data station running Bruker's Xmass software under the Windows NT 4.0 operating system. Sample aliquots, typically 15 μl, were extracted directly from 96-well microtiter plates using a CTC HTS PAL autosampler (LEAP Technologies, Carrboro, NC) triggered by the FTICR data station. Samples were injected directly into a 10 μl sample loop integrated with a fluidics handling system that supplies the 100 μl/hr flow rate to the ESI source. Ions were formed via electrospray ionization in a modified Analytica (Branford, CT) source employing an off-axis, grounded electrospray probe positioned approximately 1.5 cm from the metalized terminus of a glass desolvation capillary. The atmospheric pressure end of the glass capillary was biased at 6000 V relative to the ESI needle during data acquisition. A counter-current flow of dry N2 was employed to assist in the desolvation process. Ions were accumulated in an external ion reservoir comprised of an rf-only hexapole, a skimmer cone, and an auxiliary gate electrode, prior to injection into the trapped ion cell where they were mass analyzed. Ionization duty cycles greater than 99% were achieved by simultaneously accumulating ions in the external ion reservoir during ion detection. Each detection event consisted of 1M data points digitized over 2.3 s. To improve the signal-to-noise ratio (S/N), 32 scans were co-added for a total data acquisition time of 74 s.
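The acquisition time quoted above follows directly from the scan count and the scan duration. The short Python sketch below reproduces that arithmetic; the function names are hypothetical, and the square-root S/N gain is a textbook assumption that holds only when the noise in successive scans is uncorrelated. It is not a figure reported in this document.

```python
import math


def total_acquisition_time(scans, seconds_per_scan):
    """Total detection time when scans are acquired back to back."""
    return scans * seconds_per_scan


def expected_snr_gain(scans):
    """Idealized S/N improvement from co-adding scans (uncorrelated noise)."""
    return math.sqrt(scans)


acquisition_s = total_acquisition_time(32, 2.3)
print(round(acquisition_s, 1))            # 73.6 s, reported in the text as 74 s
print(round(expected_snr_gain(32), 2))    # idealized gain factor
```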
[260] The ESI-TOF mass spectrometer is based on a Bruker Daltonics MicroTOF™. Ions from the ESI source undergo orthogonal ion extraction and are focused in a reflectron prior to detection. The TOF and FTICR are equipped with the same automated sample handling and fluidics described above. Ions are formed in the standard MicroTOF™ ESI source that is equipped with the same off-axis sprayer and glass capillary as the FTICR ESI source. Consequently, source conditions were the same as those described above. External ion accumulation was also employed to improve ionization duty cycle during data acquisition. Each detection event on the TOF was comprised of 75,000 data points digitized over 75 μs.
[261] The sample delivery scheme allows sample aliquots to be rapidly injected into the electrospray source at high flow rate and subsequently be electrosprayed at a much lower flow rate for improved ESI sensitivity. Prior to injecting a sample, a bolus of buffer was injected at a high flow rate to rinse the transfer line and spray needle to avoid sample contamination/carryover. Following the rinse step, the autosampler injected the next sample and the flow rate was switched to low flow. Following a brief equilibration delay, data acquisition commenced. As spectra were co-added, the autosampler continued rinsing the syringe and picking up buffer to rinse the injector and sample transfer line. In general, two syringe rinses and one injector rinse were required to minimize sample carryover. During a routine screening protocol a new sample mixture was injected every 106 seconds. More recently a fast wash station for the syringe needle has been implemented which, when combined with shorter acquisition times, facilitates the acquisition of mass spectra at a rate of just under one spectrum/minute.
[262] Raw mass spectra were post-calibrated with an internal mass standard and deconvoluted to monoisotopic molecular masses. Unambiguous base compositions were derived from the exact mass measurements of the complementary single-stranded oligonucleotides. Quantitative results are obtained by comparing the peak heights with an internal PCR calibration standard present in every PCR well at 500 molecules per well. Calibration methods are commonly owned and disclosed in U.S. Provisional Patent Application Serial No. 60/545,425 which is incorporated herein by reference in entirety.
Example 5: De Novo Determination of Base Composition of Amplification Products using Molecular Mass Modified Deoxynucleotide Triphosphates
[263] Because the molecular masses of the four natural nucleobases have a relatively narrow molecular mass range (A = 313.058, G = 329.052, C = 289.046, T = 304.046 - See Table 4), a persistent source of ambiguity in assignment of base composition can occur as follows: two nucleic acid strands having different base composition may have a difference of about 1 Da when the base composition difference between the two strands is G ↔ A (-15.994) combined with C ↔ T (+15.000). For example, one 99-mer nucleic acid strand having a base composition of A₂₇G₃₀C₂₁T₂₁ has a theoretical molecular mass of 30779.058 while another 99-mer nucleic acid strand having a base composition of A₂₆G₃₁C₂₂T₂₀ has a theoretical molecular mass of 30780.052. A 1 Da difference in molecular mass may be within the experimental error of a molecular mass measurement and thus, the relatively narrow molecular mass range of the four natural nucleobases imposes an uncertainty factor.
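The two theoretical masses quoted in paragraph [263] can be reproduced by summing the per-nucleobase masses given there. The following Python sketch is illustrative; the dictionary and function names are hypothetical, and the calculation simply multiplies each stated mass by its count, as the quoted totals imply.

```python
# Per-nucleobase masses as stated in paragraph [263].
NUCLEOBASE_MASS = {"A": 313.058, "G": 329.052, "C": 289.046, "T": 304.046}


def strand_mass(composition, masses=NUCLEOBASE_MASS):
    """Sum the stated masses for a base composition such as {"A": 27, ...}."""
    return round(sum(masses[base] * count for base, count in composition.items()), 3)


strand_1 = {"A": 27, "G": 30, "C": 21, "T": 21}   # 99-mer
strand_2 = {"A": 26, "G": 31, "C": 22, "T": 20}   # 99-mer

mass_1 = strand_mass(strand_1)
mass_2 = strand_mass(strand_2)
print(mass_1, mass_2, round(mass_2 - mass_1, 3))
```

The two compositions differ by one A to G change (+15.994) and one T to C change (-15.000), so the net difference is just under 1 Da.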
[264] The present invention provides a means for removing this theoretical 1 Da uncertainty factor through amplification of a nucleic acid with one mass-tagged nucleobase and three natural nucleobases. The term "nucleobase" as used herein is synonymous with other terms in use in the art including "nucleotide," "deoxynucleotide," "nucleotide residue," "deoxynucleotide residue," "nucleotide triphosphate (NTP)," or "deoxynucleotide triphosphate (dNTP)."
[265] Addition of significant mass to one of the 4 nucleobases (dNTPs) in an amplification reaction, or in the primers themselves, will result in a significant difference in mass of the resulting amplification product (significantly greater than 1 Da) for the ambiguities arising from the G ↔ A combined with C ↔ T event (Table 4). Thus, the same G ↔ A (-15.994) event combined with a 5-Iodo-C ↔ T (-110.900) event would result in a molecular mass difference of 126.894. If the molecular mass of the base composition A₂₇G₃₀(5-Iodo-C)₂₁T₂₁ (33422.958) is compared with that of A₂₆G₃₁(5-Iodo-C)₂₂T₂₀ (33549.852), the theoretical molecular mass difference is +126.894. The experimental error of a molecular mass measurement is not significant with regard to this molecular mass difference. Furthermore, the only base composition consistent with a measured molecular mass of the 99-mer nucleic acid is A₂₇G₃₀(5-Iodo-C)₂₁T₂₁. In contrast, the analogous amplification without the mass tag has 18 possible base compositions.
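The effect of the mass tag in paragraph [265] can be checked with the same kind of summation. In the Python sketch below, the 5-Iodo-C mass of 414.946 is not quoted directly in the text; it is inferred here from the stated T mass (304.046) and the stated 5-Iodo-C ↔ T difference (110.900), and it reproduces the totals given in the paragraph. The names used are hypothetical.

```python
# Masses of A, G and T as stated in the text. The 5-Iodo-C value is an
# inference (304.046 + 110.900), not a figure quoted directly.
TAGGED_MASS = {"A": 313.058, "G": 329.052, "5-Iodo-C": 414.946, "T": 304.046}


def tagged_strand_mass(composition, masses=TAGGED_MASS):
    """Sum per-nucleobase masses for a composition containing 5-Iodo-C."""
    return round(sum(masses[base] * count for base, count in composition.items()), 3)


tagged_1 = {"A": 27, "G": 30, "5-Iodo-C": 21, "T": 21}
tagged_2 = {"A": 26, "G": 31, "5-Iodo-C": 22, "T": 20}

tagged_mass_1 = tagged_strand_mass(tagged_1)
tagged_mass_2 = tagged_strand_mass(tagged_2)
print(tagged_mass_1, tagged_mass_2, round(tagged_mass_2 - tagged_mass_1, 3))
```

With the tag in place, the two compositions that differed by less than 1 Da are separated by more than 100 Da.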
Table 4: Molecular Masses of Natural Nucleobases and the Mass-Modified Nucleobase 5-Iodo-C and Molecular Mass Differences Resulting from Transitions
Figure imgf000131_0001
Figure imgf000132_0001
[266] Mass spectra of bioagent-identifying amplicons were analyzed independently using a maximum-likelihood processor, such as is widely used in radar signal processing. This processor, referred to as GenX, first makes maximum likelihood estimates of the input to the mass spectrometer for each primer by running matched filters for each base composition aggregate on the input data. This includes the GenX response to a calibrant for each primer.
[267] The algorithm emphasizes performance predictions culminating in probability-of-detection versus probability-of-false-alarm plots for conditions involving complex backgrounds of naturally occurring organisms and environmental contaminants. Matched filters consist of a priori expectations of signal values given the set of primers used for each of the bioagents. A genomic sequence database is used to define the mass base count matched filters. The database contains the sequences of known bacterial bioagents and includes threat organisms as well as benign background organisms. The latter is used to estimate and subtract the spectral signature produced by the background organisms. A maximum likelihood detection of known background organisms is implemented using matched filters and a running-sum estimate of the noise covariance. Background signal strengths are estimated and used along with the matched filters to form signatures which are then subtracted. The maximum likelihood process is applied to this "cleaned up" data in a similar manner employing matched filters for the organisms and a running-sum estimate of the noise covariance for the cleaned up data.
[268] The amplitudes of all base compositions of bioagent-identifying amplicons for each primer are calibrated and a final maximum likelihood amplitude estimate per organism is made based upon the multiple single primer estimates. Models of all system noise are factored into this two-stage maximum likelihood calculation. The processor reports the number of molecules of each base composition contained in the spectra. The quantity of amplification product corresponding to the appropriate primer set is reported as well as the quantities of primers remaining upon completion of the amplification reaction.
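The two-stage estimate described in paragraphs [266] to [268] can be illustrated with a heavily simplified matched filter. The Python sketch below is not the GenX processor: it assumes a known template per signature, independent Gaussian noise with a known per-channel variance (a diagonal covariance rather than a running-sum estimate), and non-overlapping background and target templates. All names and the toy spectrum are hypothetical.

```python
def ml_amplitude(spectrum, template, noise_variance):
    """Maximum likelihood amplitude of a known template in Gaussian noise.

    With independent noise of variance v_i per channel, the estimate is
    sum(s_i * x_i / v_i) / sum(s_i * s_i / v_i).
    """
    numerator = sum(s * x / v for s, x, v in zip(template, spectrum, noise_variance))
    denominator = sum(s * s / v for s, v in zip(template, noise_variance))
    return numerator / denominator


def subtract_signature(spectrum, template, amplitude):
    """Remove an estimated signature, leaving the "cleaned up" data."""
    return [x - amplitude * s for x, s in zip(spectrum, template)]


# Toy, noise-free spectrum: 3 units of background plus 5 units of target.
background_template = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
target_template = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0]
spectrum = [3.0 * b + 5.0 * t for b, t in zip(background_template, target_template)]
variance = [1.0] * len(spectrum)

# Stage 1: estimate and subtract the background signature.
background_amplitude = ml_amplitude(spectrum, background_template, variance)
cleaned = subtract_signature(spectrum, background_template, background_amplitude)

# Stage 2: estimate the target amplitude from the cleaned data.
target_amplitude = ml_amplitude(cleaned, target_template, variance)
print(background_amplitude, target_amplitude)
```

A fuller treatment would use the complete noise covariance matrix and would handle overlapping signatures jointly rather than sequentially.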
[269] Base count blurring can be carried out as follows. "Electronic PCR" can be conducted on nucleotide sequences of the desired bioagents to obtain the different expected base counts that could be obtained for each primer pair. See for example, ncbi.nlm.nih.gov/sutils/e-pcr/; Schuler, Genome Res. 7:541-50, 1997. In one illustrative embodiment, one or more spreadsheets, such as Microsoft Excel workbooks, contain a plurality of worksheets. First, in this example, there is a worksheet with a name similar to the workbook name; this worksheet contains the raw electronic PCR data. Second, there is a worksheet named "filtered bioagents base count" that contains bioagent name and base count; there is a separate record for each strain after removing sequences that are not identified with a genus and species and removing all sequences for bioagents with fewer than 10 strains. Third, there is a worksheet, "Sheet1," that contains the frequency of substitutions, insertions, or deletions for this primer pair. This data is generated by first creating a pivot table from the data in the "filtered bioagents base count" worksheet and then executing an Excel VBA macro. The macro creates a table of differences in base counts for bioagents of the same species, but different strains. One of ordinary skill in the art may understand additional pathways for obtaining similar tables of differences without undue experimentation.
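The electronic PCR step described above can be sketched as a search for the two primer sites followed by a base count of the sequence between them. The Python example below is a minimal illustration and not the NCBI e-PCR tool: it requires exact primer matches, takes the first forward site it finds, and uses a made-up template and made-up primers. The function names are hypothetical.

```python
from collections import Counter


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def electronic_pcr(template, forward_primer, reverse_primer):
    """Return the predicted amplicon (primers included) or None if not primed."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    site_start = template.find(reverse_site, start + len(forward_primer))
    if site_start == -1:
        return None
    return template[start:site_start + len(reverse_site)]


def base_count(sequence):
    """Return the base composition as an (A, G, C, T) tuple."""
    counts = Counter(sequence.upper())
    return tuple(counts[base] for base in "AGCT")


# Made-up example: 8-mer primers flanking a 6-base insert.
toy_forward = "ATGCATGC"
toy_reverse = "TTGACCAA"
toy_template = "CC" + toy_forward + "AAGGTT" + reverse_complement(toy_reverse) + "GG"

amplicon = electronic_pcr(toy_template, toy_forward, toy_reverse)
print(amplicon, base_count(amplicon))
```

Running this over every strain sequence for a bioagent would give the per-strain base counts that populate a "filtered bioagents base count" style table.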
[270] Application of an exemplary script involves the user defining a threshold that specifies the fraction of the strains that are represented by the reference set of base counts for each bioagent. The reference set of base counts for each bioagent may contain as many different base counts as are needed to meet or exceed the threshold. The set of reference base counts is defined by taking the most abundant strain's base type composition and adding it to the reference set; the next most abundant strain's base type composition is then added, and so on, until the threshold is met or exceeded. The current set of data was obtained using a threshold of 55%, which was obtained empirically.
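The reference-set selection of paragraph [270] can be sketched as a cumulative tally over base counts ordered by abundance. The Python example below is one possible reading of that paragraph, not the exemplary script itself: the function name is hypothetical, the strain data are made up, and ties between equally abundant base counts are broken by order of first appearance.

```python
from collections import Counter


def reference_base_counts(strain_base_counts, threshold=0.55):
    """Pick the most abundant base counts until the threshold is covered.

    strain_base_counts holds one (A, G, C, T) tuple per strain of a bioagent.
    """
    tally = Counter(strain_base_counts)
    total = sum(tally.values())
    reference_set = []
    covered = 0
    for composition, strain_count in tally.most_common():
        if covered / total >= threshold:
            break  # threshold already met or exceeded
        reference_set.append(composition)
        covered += strain_count
    return reference_set


# Made-up data: 20 strains of one bioagent with three distinct base counts.
strains = ([(27, 30, 21, 21)] * 9
           + [(26, 31, 22, 20)] * 6
           + [(27, 30, 22, 20)] * 5)

print(reference_base_counts(strains))          # 45% is not enough, so two entries
print(reference_base_counts(strains, 0.40))    # 45% already exceeds 40%
```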
[271] For each base count not included in the reference base count set for that bioagent, the script then proceeds to determine the manner in which the current base count differs from each of the base counts in the reference set. This difference may be represented as a combination of substitutions, Si=Xi, and insertions, Ii=Yi, or deletions, Di=Zi. If there is more than one reference base count, then the reported difference is chosen using rules that aim to minimize the number of changes and, in instances with the same number of changes, minimize the number of insertions or deletions. Therefore, the primary rule is to identify the difference with the minimum sum (Xi+ Yi) or (Xi+Zi), e.g., one insertion rather than two substitutions. If there are two or more differences with the minimum sum, then the one that will be reported is the one that contains the most substitutions.
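The selection rules of paragraph [271] can be sketched by comparing two base counts position by position. The Python example below is an illustrative reading of those rules rather than the script itself: it assumes that the smallest set of changes explaining a base-count difference pairs each lost base with a gained base as a substitution and treats any surplus as insertions or deletions. The function names are hypothetical.

```python
def classify_difference(observed, reference):
    """Return (substitutions, insertions, deletions) between two base counts.

    Both arguments are (A, G, C, T) tuples. Bases gained and lost are paired
    off as substitutions; the remainder is counted as insertions or deletions.
    """
    deltas = [obs - ref for obs, ref in zip(observed, reference)]
    gained = sum(d for d in deltas if d > 0)
    lost = -sum(d for d in deltas if d < 0)
    substitutions = min(gained, lost)
    insertions = max(gained - lost, 0)
    deletions = max(lost - gained, 0)
    return substitutions, insertions, deletions


def best_difference(observed, reference_set):
    """Apply the reporting rules: fewest changes, then most substitutions."""
    differences = [classify_difference(observed, ref) for ref in reference_set]
    return min(differences, key=lambda diff: (sum(diff), -diff[0]))


print(classify_difference((26, 31, 22, 20), (27, 30, 21, 21)))   # two substitutions
print(classify_difference((28, 30, 21, 21), (27, 30, 21, 21)))   # one insertion
print(best_difference((28, 30, 21, 21),
                      [(27, 30, 21, 21), (27, 31, 21, 21)]))     # tie, prefer substitution
```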
[272] Differences between a base count and a reference composition are categorized as one, two, or more substitutions, one, two, or more insertions, one, two, or more deletions, and combinations of substitutions and insertions or deletions. The different classes of nucleobase changes and their probabilities of occurrence have been delineated in U.S. Patent Application Publication No. 2004209260 which is incorporated herein by reference in entirety.
Example 6: Use of Broad Range Survey and Division Wide Primer Pairs for Identification of Bacteria in an Epidemic Surveillance Investigation
[273] This investigation employed a set of 16 primer pairs which is herein designated the "surveillance primer set" and comprises broad range survey primer pairs, division wide primer pairs and a single Bacillus clade primer pair. The surveillance primer set is shown in Table 5 and consists of primer pairs originally listed in Table 2. This surveillance set comprises primers with T modifications (note TMOD designation in primer names) which constitute a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row. Primer pair 449 (non-T modified) has been modified twice. Its predecessors are primer pairs 70 and 357, displayed below in the same row. Primer pair 360 has also been modified twice and its predecessors are primer pairs 17 and 118.
Table 5: Bacterial Primer Pairs of the Surveillance Primer Set
Figure imgf000135_0001
RPLB EC 650 679 TMOD F | 232 | RPLB EC 739 762 TMOD R | 592 | rplB
RPLB EC 650 679 F | 98 | RPLB EC 739 762 R | 999 | rplB
VALS EC 1105 1124 TMOD F | 385 | VALS EC 1195 1218 TMOD R | 1093 | valS
VALS EC 1105 1124 F | 77 | VALS EC 1195 1218 R | 795 | valS
RPOB EC 1845 1866 TMOD F | 659 | RPOB EC 1909 1929 TMOD R | 1250 | rpoB
RPOB EC 1845 1866 F | 233 | RPOB EC 1909 1929 R | 825 | rpoB
23S EC 2646 2667 TMOD F | 409 | 23S EC 2745 2765 TMOD R | 1434 | 23S rRNA
23S EC 2646 2667 F | 84 | 23S EC 2745 2765 R | 1389 | 23S rRNA
23S EC 2645 2669 F | 408 | 23S EC 2744 2761 R | 1252 | 23S rRNA
16S EC 1090 1111 2 TMOD F | 697 | 16S EC 1175 1196 TMOD R | 1398 | 16S rRNA
16S EC 1090 1111 2 F | 651 | 16S EC 1175 1196 R | 1159 | 16S rRNA
RPOB EC 3799 3821 TMOD F | 581 | RPOB EC 3862 TMOD R | 1325 | rpoB
RPOB EC 3799 3821 F | 124 | RPOB EC 3862 388c R | 840 | rpoB
RPOC EC 2146 2174 TMOD F | 284 | RPOC EC 2227 2245 TMOD R | 898 | rpoC
RPOC EC 2146 2174 F | 52 | RPOC EC 2227 2245 R | 736 | rpoC
TUFB EC 957 979 TMOD F | 308 | TUFB EC 1034 1058 TMOD R | 1276 | tufB
TUFB EC 957 979 F | 55 | TUFB EC 1034 1058 R | 829 | tufB
RPLB EC 690 710 F | 309 | RPLB EC 737 758 R | 1336 | rplB
RPLB EC 688 710 TMOD F | 296 | RPLB EC 736 757 TMOD R | 1337 | rplB
RPLB EC 688 710 F | 54 | RPLB EC 736 757 R | 842 | rplB
[274] The 16 primer pairs of the surveillance set are used to produce bioagent identifying amplicons whose base compositions are sufficiently different amongst all known bacteria at the species level to identify, at a reasonable confidence level, any given bacterium at the species level. As shown in Tables 7A-E, common respiratory bacterial pathogens can be distinguished by the base compositions of bioagent identifying amplicons obtained using the 16 primer pairs of the surveillance set. In some cases, triangulation identification improves the confidence level for species assignment. For example, nucleic acid from Streptococcus pyogenes can be amplified by nine of the sixteen surveillance primer pairs and Streptococcus pneumoniae can be amplified by ten of the sixteen surveillance primer pairs. The base compositions of the bioagent identifying amplicons are identical for only one of the analogous bioagent identifying amplicons and differ in all of the remaining analogous bioagent identifying amplicons by up to four bases per bioagent identifying amplicon. The resolving power of the surveillance set was confirmed by determination of base compositions for 120 isolates of respiratory pathogens representing 70 different bacterial species and the results indicated that natural variations (usually only one or two base substitutions per bioagent identifying amplicon) amongst multiple isolates of the same species did not prevent correct identification of major pathogenic organisms at the species level.
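The triangulation idea in paragraph [274] can be illustrated with a small lookup over base compositions from several primer pairs. The Python sketch below uses three entries taken from Table 7A of this document (primer pairs 346, 347 and 348); the function name and data layout are hypothetical, and the exact-match rule deliberately ignores the strain-to-strain variation that the full method accommodates.

```python
# Base compositions as (A, G, C, T), taken from Table 7A for three organisms.
BASE_COMPOSITIONS = {
    "Pseudomonas fluorescens Pf0-1": {
        346: (30, 31, 23, 15), 347: (26, 35, 29, 25), 348: (28, 31, 28, 29)},
    "Pseudomonas putida KT2440": {
        346: (30, 31, 23, 15), 347: (28, 33, 27, 27), 348: (27, 32, 29, 28)},
    "Legionella pneumophila Philadelphia-1": {
        346: (30, 30, 24, 15), 347: (33, 33, 23, 27), 348: (29, 28, 28, 31)},
}


def candidate_organisms(observed, table=BASE_COMPOSITIONS):
    """Return organisms whose entries match every observed primer pair result.

    observed maps a primer pair number to a measured (A, G, C, T) tuple.
    """
    return [organism for organism, compositions in table.items()
            if all(compositions.get(pair) == composition
                   for pair, composition in observed.items())]


# Primer pair 346 alone cannot separate the two Pseudomonas species...
print(candidate_organisms({346: (30, 31, 23, 15)}))
# ...but adding primer pair 347 resolves them.
print(candidate_organisms({346: (30, 31, 23, 15), 347: (28, 33, 27, 27)}))
```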
[275] Bacillus anthracis is a well known biological warfare agent which has emerged in domestic terrorism in recent years. Since it was envisioned to produce bioagent identifying amplicons for identification of Bacillus anthracis, additional drill-down analysis primers were configured to target genes present on virulence plasmids of Bacillus anthracis so that additional confidence could be reached in positive identification of this pathogenic organism. Three drill-down analysis primers were configured and are listed in Tables 2 and 6. In Table 6, the drill-down set comprises primers with T modifications (note TMOD designation in primer names) which constitute a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to originally selected primers which are displayed below in the same row.
Table 6: Drill-Down Primer Pairs for Confirmation of Identification of Bacillus anthracis
Figure imgf000138_0001
[276] Phylogenetic coverage of bacterial space of the sixteen surveillance primers of Table 5 and the three Bacillus anthracis drill-down primers of Table 6 is shown in Figure 3 which lists common pathogenic bacteria. Figure 3 is not meant to be comprehensive in illustrating all species identified by the primers. Only pathogenic bacteria are listed as representative examples of the bacterial species that can be identified by the primers and methods of the present invention. Nucleic acid of groups of bacteria enclosed within the polygons of Figure 3 can be amplified to obtain bioagent identifying amplicons using the primer pair numbers listed in the upper right hand corner of each polygon. Primer coverage for polygons within polygons is additive. As an illustrative example, bioagent identifying amplicons can be obtained for Chlamydia trachomatis by amplification with, for example, primer pairs 346-349, 360 and 361, but not with any of the remaining primers of the surveillance primer set. On the other hand, bioagent identifying amplicons can be obtained from nucleic acid originating from Bacillus anthracis (located within 5 successive polygons) using, for example, any of the following primer pairs: 346-349, 360, 361 (base polygon), 356, 449 (second polygon), 352 (third polygon), 355 (fourth polygon), 350, 351 and 353 (fifth polygon). Multiple coverage of a given organism with multiple primers provides for increased confidence level in identification of the organism as a result of enabling broad triangulation identification.
[277] In Tables 7A-E, base compositions of respiratory pathogens for primer target regions are shown. Two entries in a cell represent variation in ribosomal DNA operons. The most predominant base composition is shown first and the minor (frequently a single operon) is indicated by an asterisk (*). Entries with NO DATA mean that the primer would not be expected to prime this species due to mismatches between the primer and target region, as determined by theoretical PCR.
Table 7A - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 346, 347 and 348
Primer 346 Primer 347 Primer 348
Organism Strain [A G C T] [A G C T] [A G C T]
Klebsiella [29 32 25 13] [23 26] [26 32 28 30] pneumoniae MGH78578 [29 31 25 13]* [23 37 28 26]* [26 31 30]*
CO-92 Biovar [29 30 29]
Yersinia pestis Orientalis [29 32 25 13] [22 39 28 26] [30 30 27 29]*
KIM5 P12 (Biovar
Yersinia pestis Mediaevalis ) [29 32 25 13] [22 39 28 26] [29 30 28 29]
[29 30 28 29]
Yersinia pestis 91001 [29 32 25 13] [22 39 28 26] [30 30 27 29]*
Haemophilus influenzae KW20 [28 31 23 17] [24 37 25 27] [29 30 28 29]
Pseudomonas [26 36 24] aeruginosa PAOl [30 31 23 15] [27 36 29 23]* [26 32 29 29]
Pseudomonas fl uorescens PfO-I [30 31 23 15] [26 35 29 25] [28 31 28 29]
Pseudomonas putida KT2440 [30 31 23 15] [28 33 27 27] [27 32 29 28]
Legionella pneumophila Philadelphia-1 [30 30 24 15] [33 33 23 27] [29 28 28 31]
Francisella tularensis schu 4 [32 29 22 16] [28 38 26 26] [25 32 28 31]
Bordetella pertussis Tohama I [30 29 24 16] [23 30 24] [30 32 30 26]
Burkholderia [27 36 31 24] cepacia J2315 [29 29 27 14] [27 32 26 29] [20 42 35 19]*
Burkholderia pseudomallei K96243 [29 29 27 14] [27 32 26 29] [27 36 31 24]
Neisseria gonorrhoeae FA 1090, ATCC 700825 [29 28 24 18] [27 34 26 28] [24 36 29 27]
Neisseria meningitidis MC58 (serogroup B) [29 28 26 16] [27 34 27 27] [25 35 30 26]
Neisseria meningitidis serogroup C, FAM18 [29 28 26 16] [27 34 27 27] [25 35 30 26]
Neisseria meningitidis Z2491 (serogroup A) [29 28 26 16] [27 34 27 27] [25 35 30 26]
Chlamydophila pneumoniae TW-183 [31 27 22 19] NO DATA [32 27 27 29]
Chlamydophila pneumoniae AR39 [31 27 22 19] NO DATA [32 27 27 29]
Chlamydophila pneumoniae CWL029 [31 27 22 19] NO DATA [32 27 27 29]
Chlamydophila pneumoniae J138 [31 27 22 19] NO DATA [32 27 27 29]
Corynebacterium diphtheriae NCTC13129 [29 34 21 15] [22 38 31 25] [22 33 25 34]
Mycobacterium avium k10 [27 36 21 15] [22 37 30 28] [21 36 27 30]
Mycobacterium avium 104 [27 36 21 15] [22 37 30 28] [21 36 27 30]
Mycobacterium tuberculosis CSU#93 [27 36 21 15] [22 30 28] [21 36 27 30]
Mycobacterium tuberculosis CDC 1551 [27 36 21 15] [22 30 28] [21 36 27 30]
Mycobacterium tuberculosis H37Rv (lab strain) [27 36 21 15] [22 30 28] [21 36 27 30]
Mycoplasma pneumoniae M129 [31 29 19 20] NO DATA NO DATA
Staphylococcus [30 29 30 29] aureus MRSA252 [27 30 21 21] [25 35 30 26] [29 31 30 29]*
Staphylococcus [30 29 30 29] aureus MSSA476 [27 30 21 21] [25 35 30 26] [30 29 29 30]*
Staphylococcus [30 29 30 29] aureus COL [27 30 21 21] [25 35 30 26] [30 29 29 30]*
Staphylococcus [30 29 30 29] aureus Mu50 [27 30 21 21] [25 35 30 26] [30 29 29 30]*
Staphylococcus [30 29 30 29] aureus MW2 [27 30 21 21] [25 35 30 26] [30 29 29 30]*
Staphylococcus [30 29 30 29] aureus N315 [27 30 21 21] [25 35 30 26] [30 29 29 30]*
Staphylococcus [25 35 30 26] [30 29 30 29] aureus NCTC 8325 [27 30 21 21] [25 35 31 26]* [30 29 29 30]
Streptococcus [24 36 31 25] agalactiae NEM316 [26 32 23 18] [24 36 30 26]* [25 32 29 30]
Streptococcus equi NC 002955 [26 32 23 18] [23 37 31 25] [29 30 25 32]
Streptococcus pyogenes MGAS8232 [26 32 23 18] [24 30 25] [25 31 29 31]
Streptococcus pyogenes MGAS315 [26 32 23 18] [24 30 25] [25 31 29 31]
Streptococcus pyogenes SSI-I [26 32 23 18] [24 30 25] [25 31 29 31]
Streptococcus pyogenes MGAS10394 [26 32 23 18] [24 37 30 25] [25 31 29 31]
Streptococcus pyogenes Manfredo (M5) [26 32 23 18] [24 37 30 25] [25 31 29 31]
Streptococcus pyogenes SF370 (M1) [26 32 23 18] [24 37 30 25] [25 31 29 31]
Streptococcus pneumoniae 670 [26 32 23 18] [25 35 28 28] [25 32 29 30]
Streptococcus pneumoniae R6 [26 32 23 18] [25 35 28 28] [25 32 29 30]
Streptococcus pneumoniae TIGR4 [26 32 23 18] [25 35 28 28] [25 32 30 29]
Streptococcus gordonii NCTC7868 [25 33 23 18] [24 36 31 25] [25 31 29 31]
Streptococcus [25 32 29 30] mitis NCTC 12261 [26 32 23 18] [25 35 30 26] [24 31 35 29]*
Streptococcus mutans UA159 [24 32 24 19] [25 37 30 24] [28 31 26 31]
Table 7B - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 349, 360, and 356
Primer 349 Primer 360 Primer 356
Organism Strain [A G C T] [A G C T] [A G C T]
Klebsiella pneumoniae MGH78578 [25 31 25 22] [33 37 25 27] NO DATA
CO-92 Biovar [25 31 20]
Yersinia pestis Orientalis [25 32 26 20]* [34 35 25 28] NO DATA
KIM5 P12 (Biovar [25 31 27 20]
Yersinia pestis Mediaevalis ) [25 32 26 20]* [34 35 25 28] NO DATA
Yersinia pestis 91001 [25 31 27 20] [34 35 25 28] NO DATA
Haemophilus influenzae KW20 [28 28 25 20] [32 38 25 27] NO DATA
Pseudomonas 27] aeruginosa PAO1 [24 31 26 20] 27 28]* NO DATA
Pseudomonas [30 37 28] fluorescens Pf0-1 NO DATA [30 37 27 28] NO DATA
Pseudomonas putida KT2440 [24 31 26 20] [30 37 27 28] NO DATA
Legionella pneumophila Philadelphia-1 [23 30 25 23] [30 39 29 24] NO DATA
Francisella tularensis schu 4 [26 31 25 19] [32 36 27 27] NO DATA
Bordetella pertussis Tohama I [21 29 24 18] [33 36 26 27] NO DATA
Burkholderia cepacia J2315 [23 27 22 20] [31 37 28 26] NO DATA
Burkholderia pseudomallei K96243 [23 27 22 20] [31 37 28 26] NO DATA
Neisseria gonorrhoeae FA 1090, ATCC 700825 [24 27 24 17] [34 37 25 26] NO DATA
Neisseria meningitidis MC58 (serogroup B) [25 27 22 18] [34 37 25 26] NO DATA
Neisseria meningitidis serogroup C, FAM18 [25 26 23 18] [34 37 25 26] NO DATA
Neisseria meningitidis Z2491 (serogroup A) [25 26 23 18] [34 37 25 26] NO DATA
Chlamydophila pneumoniae TW-183 [30 28 27 18] NO DATA NO DATA
Chlamydophila pneumoniae AR39 [30 28 27 18] NO DATA NO DATA
Chlamydophila pneumoniae CWL029 [30 28 27 18] NO DATA NO DATA
Chlamydophila pneumoniae J138 [30 28 27 18] NO DATA NO DATA
Corynebacterium diphtheriae NCTC13129 NO DATA [29 40 28 25] NO DATA
Mycobacterium avium k10 NO DATA [33 35 32 22] NO DATA
Mycobacterium avium 104 NO DATA [33 35 32 22] NO DATA
Mycobacterium tuberculosis CSU#93 NO DATA [30 36 34 22] NO DATA
Mycobacterium tuberculosis CDC 1551 NO DATA [30 36 34 22] NO DATA
Mycobacterium tuberculosis H37Rv (lab strain) NO DATA [30 36 34 22] NO DATA
Mycoplasma pneumoniae M129 [28 30 24 19] [34 31 29 28] NO DATA
Staphylococcus aureus MRSA252 [26 30 25 20] [31 38 24 29] [33 30 31 27]
Staphylococcus aureus MSSA476 [26 30 25 20] [31 38 24 29] [33 30 31 27]
Staphylococcus aureus COL [26 30 25 20] [31 38 24 29] [33 30 31 27]
Staphylococcus aureus Mu50 [26 30 25 20] [31 38 24 29] [33 30 31 27]
Staphylococcus aureus MW2 [26 30 25 20] [31 38 24 29] [33 30 31 27]
Staphylococcus aureus N315 [26 30 25 20] [31 38 24 29] [33 30 31 27]
Staphylococcus aureus NCTC 8325 [26 30 25 20] [31 38 24 29] [33 30 31 27]
Streptococcus agalactiae NEM316 [28 31 22 20] [33 24 28] [37 30 28 26]
Streptococcus equi NC 002955 [28 31 23 19] [33 38 24 27] [37 31 28 25]
Streptococcus pyogenes MGAS8232 [28 31 23 19] [33 37 24 28] [38 31 29 23]
Streptococcus pyogenes MGAS315 [28 31 23 19] [33 37 24 28] [38 31 29 23]
Streptococcus pyogenes SSI-I [28 31 23 19] [33 37 24 28] [38 31 29 23]
Streptococcus pyogenes MGAS10394 [28 31 23 19] [33 24 28] [38 31 29 23]
Streptococcus pyogenes Manfredo (M5) [28 31 23 19] [33 24 28] [38 31 29 23]
Streptococcus [28 31 19] pyogenes SF370 (M1) [28 31 22 20]* [33 24 28] [38 31 29 23]
Streptococcus pneumoniae 670 [28 31 22 20] [34 36 24 28] [37 30 29 25]
Streptococcus pneumoniae R6 [28 31 22 20] [34 36 24 28] [37 30 29 25]
Streptococcus pneumoniae TIGR4 [28 31 22 20] [34 36 24 28] [37 30 29 25]
Streptococcus gordonii NCTC7868 [28 32 23 20] [34 36 24 28] [36 31 29 25]
Streptococcus [28 31 20] mitis NCTC 12261 [29 30 22 20]* [34 36 24 28] [37 30 29 25]
Streptococcus mutans UA159 [26 32 23 22] [34 37 24 27] NO DATA
Table 7C - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 449, 354, and 352
Primer 449 Primer 354 Primer 352
Organism Strain [A G C T] [A G C T] [A G C T]
Klebsiella pneumoniae MGH78578 NO DATA [27 33 36 26] NO DATA
CO-92 Biovar
Yersinia pestis Orientalis NO DATA [29 31 33 29] [32 28 20 25]
KIM5 P12 (Biovar
Yersinia pestis Mediaevalis ) NO DATA [29 31 33 29] [32 28 20 25]
Yersinia pestis 91001 NO DATA [29 31 33 29] NO DATA
Haemophilus influenzae KW20 NO DATA [30 29 31 32] NO DATA
Pseudomonas aeruginosa PAO1 NO DATA [26 33 39 24] NO DATA
Pseudomonas fluorescens Pf0-1 NO DATA [26 33 34 29] NO DATA
Pseudomonas putida KT2440 NO DATA [25 34 36 27] NO DATA
Legionella pneumophila Philadelphia-1 NO DATA NO DATA NO DATA
Francisella tularensis schu 4 NO DATA [33 32 25 32] NO DATA
Bordetella pertussis Tohama I NO DATA [26 33 39 24] NO DATA
Burkholderia cepacia J2315 NO DATA [25 37 33 27] NO DATA
Burkholderia pseudomallei K96243 NO DATA [25 37 34 26] NO DATA
Neisseria gonorrhoeae FA 1090, ATCC 700825 [17 23 22 10] [29 31 32 30] NO DATA
Neisseria meningitidis MC58 (serogroup B) NO DATA [29 30 32 31] NO DATA
Neisseria meningitidis serogroup C, FAM18 NO DATA [29 30 32 31] NO DATA
Neisseria meningitidis Z2491 (serogroup A) NO DATA [29 30 32 31] NO DATA
Chlamydophila pneumoniae TW-183 NO DATA NO DATA NO DATA
Chlamydophila pneumoniae AR39 NO DATA NO DATA NO DATA
Chlamydophila pneumoniae CWL029 NO DATA NO DATA NO DATA
Chlamydophila pneumoniae J138 NO DATA NO DATA NO DATA
Corynebacterium diphtheriae NCTC13129 NO DATA NO DATA NO DATA
Mycobacterium avium k10 NO DATA NO DATA NO DATA
Mycobacterium avium 104 NO DATA NO DATA NO DATA
Mycobacterium tuberculosis CSU#93 NO DATA NO DATA NO DATA
Mycobacterium tuberculosis CDC 1551 NO DATA NO DATA NO DATA
Mycobacterium tuberculosis H37Rv (lab strain) NO DATA NO DATA NO DATA
Mycoplasma pneumoniae M129 NO DATA NO DATA NO DATA
Staphylococcus aureus MRSA252 [17 20 21 17] [30 27 30 35] [36 24 19 26]
Staphylococcus aureus MSSA476 [17 20 21 17] [30 27 30 35] [36 24 19 26]
Staphylococcus aureus COL [17 20 21 17] [30 27 30 35] [35 24 19 27]
Staphylococcus aureus Mu50 [17 20 21 17] [30 27 30 35] [36 24 19 26]
Staphylococcus aureus MW2 [17 20 21 17] [30 27 30 35] [36 24 19 26]
Staphylococcus aureus N315 [17 20 21 17] [30 27 30 35] [36 24 19 26]
Staphylococcus aureus NCTC 8325 [17 20 21 17] [30 27 30 35] [35 24 19 27]
Streptococcus agalactiae NEM316 [22 20 19 14] [26 31 27 38] [29 26 22 28]
Streptococcus equi NC 002955 [22 21 19 13] NO DATA NO DATA
Streptococcus pyogenes MGAS8232 [23 21 19 12] [24 32 30 36] NO DATA
Streptococcus pyogenes MGAS315 [23 21 19 12] [24 32 30 36] NO DATA
Streptococcus pyogenes SSI-I [23 21 19 12] [24 32 30 36] NO DATA
Streptococcus pyogenes MGAS10394 [23 21 19 12] [24 32 30 36] NO DATA
Streptococcus pyogenes Manfredo (M5) [23 21 19 12] [24 32 30 36] NO DATA
Streptococcus pyogenes SF370 (M1) [23 21 19 12] [24 32 30 36] NO DATA
Streptococcus pneumoniae 670 [22 20 19 14] [25 33 29 35] [30 29 21 25]
Streptococcus pneumoniae R6 [22 20 19 14] [25 33 29 35] [30 29 21 25]
Streptococcus pneumoniae TIGR4 [22 20 19 14] [25 33 29 35] [30 29 21 25]
Streptococcus gordonii NCTC7868 [21 21 19 14] NO DATA [29 26 22 28]
Streptococcus mitis NCTC 12261 [22 20 19 14] [26 30 32 34] NO DATA
Streptococcus mutans UA159 NO DATA NO DATA NO DATA
Table 7D - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 355, 358, and 359
Figure imgf000144_0001
Figure imgf000145_0001
Table 7E - Base Compositions of Common Respiratory Pathogens for Bioagent Identifying Amplicons Corresponding to Primer Pair Nos: 362, 363, and 367
Primer 362 Primer 363 Primer 367
Organism Strain [A G C T] [A G C T] [A G C T]
Klebsiella pneumoniae MGH78578 [21 33 22 16] [16 34 26 26] NO DATA
CO-92 Biovar
Yersinia pestis Orientalis [20 34 18 20] NO DATA NO DATA
KIM5 P12 (Biovar
Yersinia pestis Mediaevalis ) [20 34 18 20] NO DATA NO DATA
Yersinia pestis 91001 [20 34 18 20] NO DATA NO DATA
Haemophilus influenzae KW20 NO DATA NO DATA NO DATA
Pseudomonas aeruginosa PAO1 [19 35 21 17] [16 36 28 22] NO DATA
Pseudomonas fluorescens Pf0-1 NO DATA [18 35 26 23] NO DATA
Pseudomonas putida KT2440 NO DATA [16 35 28 23] NO DATA
Legionella pneumophila Philadelphia-1 NO DATA NO DATA NO DATA
Francisella tularensis schu 4 NO DATA NO DATA NO DATA
Bordetella pertussis Tohama I [20 31 24 17] [15 34 32 21] [26 25 34 19]
Burkholderia cepacia J2315 [20 33 21 18] [15 36 26 25] [25 27 32 20]
Burkholderia pseudomallei K96243 [19 34 19 20] [15 37 28 22] [25 27 32 20]
Neisseria gonorrhoeae FA 1090, ATCC 700825 NO DATA NO DATA NO DATA
Neisseria meningitidis MC58 (serogroup B) NO DATA NO DATA NO DATA
Neisseria meningitidis serogroup C, FAM18 NO DATA NO DATA NO DATA
Neisseria meningitidis Z2491 (serogroup A) NO DATA NO DATA NO DATA
Chlamydophila pneumoniae TW-183 NO DATA NO DATA NO DATA
Chlamydophila pneumoniae AR39 NO DATA NO DATA NO DATA
Chlamydophila pneumoniae CWL029 NO DATA NO DATA NO DATA
Chlamydophila pneumoniae J138 NO DATA NO DATA NO DATA
Corynebacterium diphtheriae NCTC13129 NO DATA NO DATA NO DATA
Mycobacterium avium k10 [19 34 23 16] NO DATA [24 26 35 19]
Mycobacterium avium 104 [19 34 23 16] NO DATA [24 26 35 19]
Mycobacterium tuberculosis CSU#93 [19 31 25 17] NO DATA [25 25 34 20]
Mycobacterium tuberculosis CDC 1551 [19 31 24 18] NO DATA [25 25 34 20]
Mycobacterium tuberculosis H37Rv (lab strain) [19 31 24 18] NO DATA [25 25 34 20]
Mycoplasma pneumoniae M129 NO DATA NO DATA NO DATA
Staphylococcus aureus MRSA252 NO DATA NO DATA NO DATA
Staphylococcus aureus MSSA476 NO DATA NO DATA NO DATA
Staphylococcus aureus COL NO DATA NO DATA NO DATA
Staphylococcus aureus Mu50 NO DATA NO DATA NO DATA
Staphylococcus aureus MW2 NO DATA NO DATA NO DATA
Staphylococcus aureus N315 NO DATA NO DATA NO DATA
Staphylococcus aureus NCTC 8325 NO DATA NO DATA NO DATA
Streptococcus agalactiae NEM316 NO DATA NO DATA NO DATA
Streptococcus equi NC 002955 NO DATA NO DATA NO DATA
Streptococcus pyogenes MGAS8232 NO DATA NO DATA NO DATA
Streptococcus pyogenes MGAS315 NO DATA NO DATA NO DATA
Streptococcus pyogenes SSI-I NO DATA NO DATA NO DATA
Streptococcus pyogenes MGAS10394 NO DATA NO DATA NO DATA
Streptococcus pyogenes Manfredo (M5) NO DATA NO DATA NO DATA
Streptococcus pyogenes SF370 (M1) NO DATA NO DATA NO DATA
Streptococcus pneumoniae 670 NO DATA NO DATA NO DATA
Streptococcus pneumoniae R6 [20 30 19 23] NO DATA NO DATA
Streptococcus pneumoniae TIGR4 [20 30 19 23] NO DATA NO DATA
Streptococcus gordonii NCTC7868 NO DATA NO DATA NO DATA
Streptococcus mitis NCTC 12261 NO DATA NO DATA NO DATA
Streptococcus mutans UA159 NO DATA NO DATA NO DATA
[278] Four sets of throat samples from military recruits at different military facilities, taken at different time points, were analyzed using the primers of the present invention. The first set was collected at a military training center from November 1 to December 20, 2002 during one of the most severe outbreaks of pneumonia associated with group A Streptococcus in the United States since 1968. During this outbreak, fifty-one throat swabs were taken from both healthy and hospitalized recruits and plated on blood agar for selection of putative group A Streptococcus colonies. A second set of 15 original patient specimens was taken during the height of this group A Streptococcus-associated respiratory disease outbreak. The third set consisted of historical samples, including twenty-seven isolates of group A Streptococcus, from disease outbreaks at this and other military training facilities during previous years. The fourth set of samples was collected from five geographically separated military facilities in the continental U.S. in the winter immediately following the severe November/December 2002 outbreak.
[279] Pure colonies isolated from group A Streptococcus-selective media from all four collection periods were analyzed with the surveillance primer set. All samples showed base compositions that precisely matched the four completely sequenced strains of Streptococcus pyogenes. Shown in Figure 4 is a 3D diagram of base composition (axes A, G and C) of bioagent identifying amplicons obtained with primer pair number 14 (a precursor of primer pair number 348 which targets 16S rRNA). The diagram indicates that the experimentally determined base compositions of the clinical samples closely match the base compositions expected for Streptococcus pyogenes and are distinct from the expected base compositions of other organisms.
[280] In addition to the identification of Streptococcus pyogenes, other potentially pathogenic organisms were identified concurrently. Mass spectral analysis of a sample whose nucleic acid was amplified by primer pair number 349 (SEQ ID NOs: 401:1156) exhibited signals of bioagent identifying amplicons with molecular masses that were found to correspond to analogous base compositions of bioagent identifying amplicons of Streptococcus pyogenes (A27 G32 C24 T18), Neisseria meningitidis (A25 G27 C22 T18), and Haemophilus influenzae (A28 G28 C25 T20) (see Figure 5 and Table 7B). These organisms were present in a ratio of 4:5:20 as determined by comparison of peak heights with the peak height of an internal PCR calibration standard as described in commonly owned U.S. Patent Application Serial No: 60/545,425, which is incorporated herein by reference in its entirety.
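The matching step described above can be sketched as a simple lookup in which a base composition derived from a measured molecular mass is compared with the compositions expected for one primer pair. The Python sketch below is illustrative only: the function name match_base_composition and the dictionary layout are assumptions, and the reference values are copied from the primer pair 349 column of Table 7B for a few strains.

```python
from typing import Dict, List, Tuple

BaseComp = Tuple[int, int, int, int]  # counts in the order (A, G, C, T)

# Expected base compositions for primer pair 349, copied from Table 7B.
REFERENCE_349: Dict[str, BaseComp] = {
    "Streptococcus pyogenes MGAS8232": (28, 31, 23, 19),
    "Neisseria meningitidis MC58": (25, 27, 22, 18),
    "Haemophilus influenzae KW20": (28, 28, 25, 20),
    "Staphylococcus aureus MRSA252": (26, 30, 25, 20),
}


def match_base_composition(measured: BaseComp,
                           reference: Dict[str, BaseComp]) -> List[str]:
    """Return the names of all reference entries that equal the measured composition."""
    return sorted(name for name, comp in reference.items() if comp == measured)


if __name__ == "__main__":
    # Compositions reported for this sample in the paragraph above.
    print(match_base_composition((25, 27, 22, 18), REFERENCE_349))
    print(match_base_composition((28, 28, 25, 20), REFERENCE_349))
```

An empty result means that no entry in this small illustrative table has the measured composition; it says nothing about organisms that are not listed.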
[281] Since certain division-wide primers that target housekeeping genes are configured to provide coverage of specific divisions of bacteria to increase the confidence level for identification of bacterial species, they are not expected to yield bioagent identifying amplicons for organisms outside of the specific divisions. For example, primer pair number 356 (SEQ ID NOs: 449:1380) primarily amplifies the nucleic acid of members of the classes Bacilli and Clostridia and is not expected to amplify proteobacteria such as Neisseria meningitidis and Haemophilus influenzae. As expected, analysis of the mass spectrum of amplification products obtained with primer pair number 356 does not indicate the presence of Neisseria meningitidis and Haemophilus influenzae but does indicate the presence of Streptococcus pyogenes (Figures 3 and 6, Table 7B). Thus, these primers or types of primers can confirm the absence of particular bioagents from a sample.
[282] The 15 throat swabs from military recruits were found to contain a relatively small set of microbes in high abundance. The most common were Haemophilus influenzae, Neisseria meningitidis, and Streptococcus pyogenes. Staphylococcus epidermidis, Moraxella catarrhalis, Corynebacterium pseudodiphtheriticum, and Staphylococcus aureus were present in fewer samples. An equal number of samples from healthy volunteers from three different geographic locations were analyzed identically. Results indicated that the healthy volunteers have bacterial flora dominated by multiple, commensal non-beta-hemolytic Streptococcal species, including the viridans group streptococci (S. parasanguinis, S. vestibularis, S. mitis, S. oralis and S. pneumoniae; data not shown), and none of the organisms found in the military recruits were found in the healthy controls at concentrations detectable by mass spectrometry. Thus, the military recruits in the midst of a respiratory disease outbreak had a dramatically different microbial population than that experienced by the general population in the absence of epidemic disease.
Example 7: Triangulation Genotyping Analysis for Determination of emm-Type of Streptococcus pyogenes in Epidemic Surveillance
[283] As a continuation of the epidemic surveillance investigation of Example 6, determination of sub-species characteristics (genotyping) of Streptococcus pyogenes was carried out based on a strategy that generates strain-specific signatures according to the rationale of Multi-Locus Sequence Typing (MLST). In classic MLST analysis, internal fragments of several housekeeping genes are amplified and sequenced (Enright et al. Infection and Immunity, 2001, 69, 2416-2427). In the present investigation, bioagent identifying amplicons from housekeeping genes were produced using drill-down primers and analyzed by mass spectrometry. Since mass spectral analysis results in molecular mass, from which base composition can be determined, the challenge was to determine whether resolution of emm classification of strains of Streptococcus pyogenes could be achieved.
[284] For the purpose of development of a triangulation genotyping assay, an alignment was constructed of concatenated alleles of seven MLST housekeeping genes (glucose kinase (gki), glutamine transporter protein (gtr), glutamate racemase (murI), DNA mismatch repair protein (mutS), xanthine phosphoribosyl transferase (xpt), and acetyl-CoA acetyl transferase (yqiL)) from each of the 212 previously emm-typed strains of Streptococcus pyogenes. From this alignment, the number and location of primer pairs that would maximize strain identification via base composition were determined. As a result, six primer pairs were chosen as standard drill-down primers for determination of emm-type of Streptococcus pyogenes. These six primer pairs are displayed in Table 8. This drill-down set comprises primers with T modifications (note the TMOD designation in primer names), which constitute a functional improvement with regard to prevention of non-templated adenylation (vide supra) relative to the originally selected primers, which are displayed below in the same row.
Table 8: Triangulation Genotyping Analysis Primer Pairs for Group A Streptococcus Drill- Down
Figure imgf000150_0001
426 SP101_SPET11_1314_1336_TMOD_F 363 SP101_SPET11_1403_1431_TMOD_R 849 murI
86 SP101_SPET11_1314_1336_F 68 SP101_SPET11_1403_1431_R 711 murI
430 SP101_SPET11_1807_1835_TMOD_F 235 SP101_SPET11_1901_1927_TMOD_R 1439 mutS
90 SP101_SPET11_1807_1835_F 33 SP101_SPET11_1901_1927_R 1412 mutS
438 SP101_SPET11_3075_3103_TMOD_F 473 SP101_SPET11_3168_3196_TMOD_R 875 xpt
96 SP101_SPET11_3075_3103_F 108 SP101_SPET11_3168_3196_R 715 xpt
441 SP101_SPET11_3511_3535_TMOD_F 531 SP101_SPET11_3605_3629_TMOD_R 1294 yqiL
98 SP101_SPET11_3511_3535_F 116 SP101_SPET11_3605_3629_R 832 yqiL
[285] The primers of Table 8 were used to produce bioagent identifying amplicons from nucleic acid present in the clinical samples. The bioagent identifying amplicons were subsequently analyzed by mass spectrometry, and base compositions corresponding to the molecular masses were calculated.
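The relationship between molecular mass and base composition can be illustrated with a short Python sketch. It is a simplified illustration and not the procedure used in this investigation: it treats a single DNA strand, uses approximate average residue masses, assumes the strand length is known, and enumerates every composition of that length whose calculated mass lies within a chosen tolerance. The function names and the tolerance value are assumptions.

```python
# Approximate average masses (Da) of deoxynucleotide residues within a strand.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
# Approximate correction for a strand with 5'-hydroxyl and 3'-hydroxyl ends.
END_CORRECTION = -61.96


def strand_mass(a, g, c, t):
    """Approximate average mass of one DNA strand with the given base counts."""
    return (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
            + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"] + END_CORRECTION)


def candidate_compositions(measured_mass, length, tolerance):
    """List every (A, G, C, T) of the given length whose mass fits the measurement."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                t = length - a - g - c
                if abs(strand_mass(a, g, c, t) - measured_mass) <= tolerance:
                    hits.append((a, g, c, t))
    return hits


if __name__ == "__main__":
    # A39 G25 C20 T34 is a murI composition listed in Table 9A (118 bases).
    mass = strand_mass(39, 25, 20, 34)
    print(round(mass, 2), candidate_compositions(mass, 118, 0.5))
```

More than one composition can fall within a given tolerance, so the result is a list of candidates rather than a single answer.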
[286] Of the 51 samples taken during the peak of the November/December 2002 epidemic (Tables 9A-C, rows 1-3), all except three samples were found to represent emm3, a Group A Streptococcus genotype previously associated with high respiratory virulence. The three outliers were from samples obtained from healthy individuals and probably represent non-epidemic strains. Archived samples (Tables 9A-C, rows 5-13) from historical collections showed a greater heterogeneity of base compositions and emm types, as would be expected from different epidemics occurring at different places and dates. The results of the mass spectrometry analysis and emm gene sequencing were found to be concordant for the epidemic and historical samples.
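The triangulation idea, in which each additional locus narrows the set of candidate emm types, can be sketched as follows. This Python sketch is illustrative only: the function name triangulate and the data layout are assumptions, and the four reference signatures are copied from rows of Tables 9A and 9B rather than from the full 212-strain alignment.

```python
# Base compositions per locus for four emm types, copied from Tables 9A and 9B.
SIGNATURES = {
    "emm3":  {"murI": "A39 G25 C20 T34", "mutS": "A38 G27 C23 T33",
              "xpt": "A30 G36 C20 T36", "yqiL": "A40 G29 C19 T31"},
    "emm6":  {"murI": "A40 G24 C20 T34", "mutS": "A38 G27 C23 T33",
              "xpt": "A30 G36 C20 T36", "yqiL": "A40 G29 C19 T31"},
    "emm12": {"murI": "A40 G24 C20 T34", "mutS": "A38 G26 C24 T33",
              "xpt": "A30 G36 C19 T37", "yqiL": "A40 G29 C19 T31"},
    "emm28": {"murI": "A39 G25 C20 T34", "mutS": "A38 G27 C23 T33",
              "xpt": "A30 G36 C20 T36", "yqiL": "A41 G28 C18 T32"},
}


def triangulate(observed, signatures=SIGNATURES):
    """Return emm types whose reference signature agrees at every observed locus.

    Loci with no detection are simply left out of `observed`.
    """
    return sorted(
        emm for emm, signature in signatures.items()
        if all(signature.get(locus) == comp for locus, comp in observed.items())
    )


if __name__ == "__main__":
    # One locus leaves two candidates; a second locus resolves them.
    print(triangulate({"murI": "A39 G25 C20 T34"}))
    print(triangulate({"murI": "A39 G25 C20 T34", "yqiL": "A40 G29 C19 T31"}))
```

With only these four signatures, murI alone cannot separate emm3 from emm28, while adding yqiL does; a larger reference set would generally need more loci.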
Table 9A: Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus samples from Six Military Installations Obtained with Primer Pair Nos. 426 and 430
Columns: # of Instances; emm-type by Mass Spectrometry; emm-Gene Sequencing; Location (sample); Year; murI (Primer Pair No. 426); mutS (Primer Pair No. 430)
48 3 3 A39 G25 C20 T34 A38 G27 C23 T33
MCRD San
2 6 6 Diego A40 G24 C20 T34 A38 G27 C23 T33
2002
1 28 28 A39 G25 C20 T34 A38 G27 C23 T33
(Cultured)
15 3 ND A39 G25 C20 T34 A38 G27 C23 T33
6 3 3 A39 G25 C20 T34 A38 G27 C23 T33
3 5,58 5 A40 G24 C20 T34 A38 G27 C23 T33
6 6 6 A40 G24 C20 T34 A38 G27 C23 T33
NHRC San
1 11 11 Diego- A39 G25 C20 T34 A38 G27 C23 T33
3 12 12 Archive 2003 A40 G24 C20 T34 A38 G26 C24 T33
1 22 22 A39 G25 C20 T34 A38 G27 C23 T33
(Cultured)
3 25,75 75 A39 G25 C20 T34 A38 G27 C23 T33
4 44/61, 82, 9 44/61 A40 G24 C20 T34 A38 G26 C24 T33
2 53, 91 91 A39 G25 C20 T34 A38 G27 C23 T33
1 2 2 A39 G25 C20 T34 A38 G27 C24 T32
2 3 3 A39 G25 C20 T34 A38 G27 C23 T33
1 4 4 A39 G25 C20 T34 A38 G27 C23 T33
1 6 6 Ft. A40 G24 C20 T34 A38 G27 C23 T33 Leonard
11 25 or 75 75 A39 G25 C20 T34 A38 G27 C23 T33 Wood 2003
25,75, 33,
1 34, 4,52, 84 75 (Cultured) A39 G25 C20 T34 A38 G27 C23 T33
44/61 or 82
1 or 9 44/61 A40 G24 C20 T34 A38 G26 C24 T33
2 5 or 58 5 A40 G24 C20 T34 A38 G27 C23 T33
3 1 1 A40 G24 C20 T34 A38 G27 C23 T33
2 Ft. Sill
3 3 A39 G25 C20 T34 A38 G27 C23 T33
1 4 4 (Cultured) A39 G25 C20 T34 A38 G27 C23 T33
1 28 28 A39 G25 C20 T34 A38 G27 C23 T33
1 3 3 Ft. 2003 A39 G25 C20 T34 A38 G27 C23 T33
Benning
1 4 4 A39 G25 C20 T34 A38 G27 C23 T33
3 6 6 (Cultured) A40 G24 C20 T34 A38 G27 C23 T33
1 11 11 A39 G25 C20 T34 A38 G27 C23 T33
1 13 94** A40 G24 C20 T34 A38 G27 C23 T33
44/61 or 82
1 or 9 82 A40 G24 C20 T34 A38 G26 C24 T33
1 5 or 58 58 A40 G24 C20 T34 A38 G27 C23 T33
1 78 or 89 89 A39 G25 C20 T34 A38 G27 C23 T33
2 5 or 58 A40 G24 C20 T34 A38 G27 C23 T33
Lackland
1 2 AFB A39 G25 C20 T34 A38 G27 C24 T32
1 81 or 90 ND 2003 A40 G24 C20 T34 A38 G27 C23 T33
1 78 (Throat A38 G26 C20 T34 A38 G27 C23 T33 Swabs)
No detection No detection No detection
3 ND A39 G25 C20 T34 A38 G27 C23 T33
1 3 ND MCRD San No detection A38 G27 C23 T33
1 3 ND Diego
No detection No detection
1 3 ND (Throat No detection No detection
2 3 ND Swabs) No detection A38 G27 C23 T33
3 No detection ND No detection No detection
Table 9B: Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus samples from Six Military Installations Obtained with Primer Pair Nos. 438 and 441
Columns: # of Instances; emm-type by Mass Spectrometry; emm-Gene Sequencing; Location (sample); Year; xpt (Primer Pair No. 438); yqiL (Primer Pair No. 441)
48 3 3 A30 G36 C20 T36 A40 G29 C19 T31
MCRD San
2 6 6 Diego A30 G36 C20 T36 A40 G29 C19 T31
2002
1 28 28 A30 G36 C20 T36 A41 G28 C18 T32
(Cultured)
15 3 ND A30 G36 C20 T36 A40 G29 C19 T31
6 3 3 A30 G36 C20 T36 A40 G29 C19 T31
3 5,58 5 A30 G36 C20 T36 A40 G29 C19 T31
6 6 6 A30 G36 C20 T36 A40 G29 C19 T31
NHRC San
1 11 11 Diego- A30 G36 C20 T36 A40 G29 C19 T31
3 12 12 Archive 2003 A30 G36 C19 T37 A40 G29 C19 T31
1 22 22 A30 G36 C20 T36 A40 G29 C19 T31
(Cultured)
3 25,75 75 A30 G36 C20 T36 A40 G29 C19 T31
4 44/61, 82, 9 44/61 A30 G36 C20 T36 A41 G28 C19 T31
2 53, 91 91 A30 G36 C19 T37 A40 G29 C19 T31
1 2 2 A30 G36 C20 T36 A40 G29 C19 T31
2 3 3 A30 G36 C20 T36 A40 G29 C19 T31
1 4 4 A30 G36 C19 T37 A41 G28 C19 T31
1 6 6 Ft. A30 G36 C20 T36 A40 G29 C19 T31 Leonard
11 25 or 75 75 A30 G36 C20 T36 A40 G29 C19 T31 Wood 2003
25,75, 33,
1 34, 4,52, 84 75 (Cultured) A30 G36 C19 T37 A40 G29 C19 T31
44/61 or 82
1 or 9 44/61 A30 G36 C20 T36 A41 G28 C19 T31
2 5 or 58 5 A30 G36 C20 T36 A40 G29 C19 T31
3 1 1 Ft. Sill 2003 A30 G36 C19 T37 A40 G29 C19 T31
2 3 3 A30 G36 C20 T36 A40 G29 C19 T31
(Cultured)
1 4 4 A30 G36 C19 T37 A41 G28 C19 T31
1 28 28 A30 G36 C20 T36 A41 G28 C18 T32
1 3 3 A30 G36 C20 T36 A40 G29 C19 T31
1 4 4 A30 G36 C19 T37 A41 G28 C19 T31
3 6 6 A30 G36 C20 T36 A40 G29 C19 T31
1 11 11 Ft. A30 G36 C20 T36 A40 G29 C19 T31
1 13 94** 2003 A30 G36 C20 T36 A41 G28 C19 T31
44/61 or 82 (Cultured)
1 or 9 82 A30 G36 C20 T36 A41 G28 C19 T31
1 5 or 58 58 A30 G36 C20 T36 A40 G29 C19 T31
1 78 or 89 89 A30 G36 C20 T36 A41 G28 C19 T31
2 5 or 58 A30 G36 C20 T36 A40 G29 C19 T31
Lackland
1 2 AFB A30 G36 C20 T36 A40 G29 C19 T31
1 81 or 90 ND 2003 A30 G36 C20 T36 A40 G29 C19 T31
1 78 (Throat
A30 G36 C20 T36 A41 G28 C19 T31 Swabs)
3*** No detection No detection No detection
7 3 ND A30 G36 C20 T36 A40 G29 C19 T31
1 3 ND MCRD San A30 G36 C20 T36 A40 G29 C19 T31
Diego
1 3 ND A30 G36 C20 T36 No detection
2002
1 3 ND (Throat No detection A40 G29 C19 T31
2 3 ND Swabs) A30 G36 C20 T36 A40 G29 C19 T31
3 No detection ND No detection No detection
Table 9C: Base Composition Analysis of Bioagent Identifying Amplicons of Group A Streptococcus samples from Six Military Installations Obtained with Primer Pair Nos. 438 and 441
Figure imgf000154_0001
11 25 or 75 75 A30 G36 C17 T33 A39 G28 C15 T33
25,75, 33,
1 34,4,52,84 75 A30 G36 C17 T33 A39 G28 C15 T33
44/61 or 82
1 or 9 44/61 A30 G36 C18 T32 A39 G28 C15 T33
2 5 or 58 5 A30 G36 C20 T30 A39 G28 C15 T33
3 1 1 A30 G36 C18 T32 A39 G28 C15 T33
Ft. Sill
2 3 3 A32 G35 C17 T32 A39 G28 C16 T32
1 4 4 (Cultured) A31 G35 C17 T33 A39 G28 C15 T33
1 28 28 A30 G36 C17 T33 A39 G28 C16 T32
1 3 3 A32 G35 C17 T32 A39 G28 C16 T32
1 4 4 A31 G35 C17 T33 A39 G28 C15 T33
3 6 6 A31 G35 C17 T33 A39 G28 C15 T33
1 11 11 Ft. A30 G36 C20 T30 A39 G28 C16 T32
1 13 94** 2003 A30 G36 C19 T31 A39 G28 C15 T33
44/61 or 82 (Cultured)
1 or 9 82 A30 G36 C18 T32 A39 G28 C15 T33
1 5 or 58 58 A30 G36 C20 T30 A39 G28 C15 T33
1 78 or 89 89 A30 G36 C18 T32 A39 G28 C15 T33
2 5 or 58 A30 G36 C20 T30 A39 G28 C15 T33
Lackland
1 2 AFB A30 G36 C17 T33 A39 G28 C15 T33
1 81 or 90 ND 2003 A30 G36 C17 T33 A39 G28 C15 T33
1 (Throat
78 A30 G36 C18 T32 A39 G28 C15 T33 Swabs)
3*** No detection No detection No detection
3 ND A32 G35 C17 T32 A39 G28 C16 T32
1 3 ND MCRD San No detection No detection
Diego
1 3 ND A32 G35 C17 T32 A39 G28 C16 T32
1 3 ND (Throat A32 G35 C17 T32 No detection
2 3 ND Swabs) A32 G35 C17 T32 No detection
3 No detection ND No detection No detection
Example 8: Design of Calibrant Polynucleotides based on Bioagent Identifying Amplicons for Identification of Species of Bacteria (Bacterial Bioagent Identifying Amplicons)
[287] This example describes the design of 19 calibrant polynucleotides based on bacterial bioagent identifying amplicons corresponding to the primers of the broad surveillance set (Table 5) and the Bacillus anthracis drill-down set (Table 6).
[288] Calibration sequences were designed to simulate bacterial bioagent identifying amplicons produced by the T modified primer pairs shown in Tables 5 and 6 (primer names have the designation "TMOD"). Each calibration sequence was chosen as a representative member of the section of bacterial genome, from a specific bacterial species, which would be amplified by a given primer pair. The model bacterial species upon which the calibration sequences are based are also shown in Table 10. For example, the calibration sequence chosen to correspond to an amplicon produced by primer pair no. 361 is SEQ ID NO: 1445. In Table 10, the forward (_F) or reverse (_R) primer name indicates the coordinates of an extraction representing a gene of a standard reference bacterial genome to which the primer hybridizes. For example, the forward primer name 16S_EC_713_732_TMOD_F indicates that the forward primer hybridizes to residues 713-732 of the gene encoding 16S ribosomal RNA in an E. coli reference sequence (in this case, the reference sequence is an extraction consisting of residues 4033120-4034661 of the genomic sequence of E. coli K12 (GenBank gi number 16127994)). Additional gene coordinate reference information is shown in Table 11. The designation "TMOD" in the primer names indicates that the 5' end of the primer has been modified with a non-matched template T residue which prevents the PCR polymerase from adding non-templated adenosine residues to the 5' end of the amplification product, an occurrence which may result in miscalculation of base composition from molecular mass data (vide supra).
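The primer naming convention described above lends itself to mechanical parsing. The Python sketch below is illustrative only; the class name PrimerName, the function name parse_primer_name and the regular expression are assumptions derived from the names that appear in this document (for example 16S_EC_713_732_TMOD_F), and names that deviate from that pattern are rejected.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerName:
    gene: str          # target gene code, e.g. "16S"
    reference: str     # reference organism code, e.g. "EC" for E. coli
    start: int         # first residue of the hybridization region
    end: int           # last residue of the hybridization region
    t_modified: bool   # True when the name carries the TMOD designation
    direction: str     # "F" (forward) or "R" (reverse)


_PATTERN = re.compile(
    r"^(?P<gene>[A-Za-z0-9]+)_(?P<reference>[A-Za-z0-9]+)"
    r"_(?P<start>\d+)_(?P<end>\d+)(?P<tmod>_TMOD)?_(?P<direction>[FR])$"
)


def parse_primer_name(name: str) -> PrimerName:
    """Split a primer name of the form GENE_REF_START_END[_TMOD]_F|R into fields."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized primer name: {name!r}")
    return PrimerName(
        gene=match.group("gene"),
        reference=match.group("reference"),
        start=int(match.group("start")),
        end=int(match.group("end")),
        t_modified=match.group("tmod") is not None,
        direction=match.group("direction"),
    )


if __name__ == "__main__":
    print(parse_primer_name("16S_EC_713_732_TMOD_F"))
```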
[0143] The 19 calibration sequences described in Tables 10 and 11 were combined into a single calibration polynucleotide sequence (SEQ ID NO: 1464 - which is herein designated a "combination calibration polynucleotide") which was then cloned into a pCR®-Blunt vector (Invitrogen, Carlsbad, CA). This combination calibration polynucleotide can be used in conjunction with the primers of Tables 5 or 6 as an internal standard to produce calibration amplicons for use in determination of the quantity of any bacterial bioagent. Thus, for example, when the combination calibration polynucleotide vector is present in an amplification reaction mixture, a calibration amplicon based on primer pair 346 (16S rRNA) will be produced in an amplification reaction with primer pair 346 and a calibration amplicon based on primer pair 363 (rpoC) will be produced with primer pair 363. Coordinates of each of the 19 calibration sequences within the calibration polynucleotide (SEQ ID NO: 1464) are indicated in Table 11.
Table 10: Bacterial Primer Pairs for Production of Bacterial Bioagent Identifying Amplicons and Corresponding Representative Calibration Sequences
Figure imgf000157_0001
362 RPOB_EC_3799_3821_TMOD_F 581 RPOB_EC_3862_3888_TMOD_R 1325 Burkholderia mallei 1458
363 RPOC_EC_2146_2174_TMOD_F 284 RPOC_EC_2227_2245_TMOD_R 898 Burkholderia mallei 1459
354 RPOC_EC_2218_2241_TMOD_F 405 RPOC_EC_2313_2337_TMOD_R 1072 Bacillus anthracis 1460
355 SSPE_BA_115_137_TMOD_F 255 SSPE_BA_197_222_TMOD_R 1402 Bacillus anthracis 1461
367 TUFB_EC_957_979_TMOD_F 308 TUFB_EC_1034_1058_TMOD_R 1276 Burkholderia mallei 1462
358 VALS_EC_1105_1124_TMOD_F 385 VALS_EC_1195_1218_TMOD_R 1093 Yersinia pestis 1463
Table 11: Primer Pair Gene Coordinate References and Calibration Polynucleotide Sequence Coordinates within the Combination Calibration Polynucleotide
Figure imgf000158_0001
rpoB E. coli | 4178823..4182851 (complement strand) | 16127994 (G) | 359 | 1591..1672
rpoB E. coli | 4178823..4182851 (complement strand) | 16127994 (G) | 362 | 2081..2167
rpoC E. coli | 4182928..4187151 | 16127994 (G) | 354 | 1810..1926
rpoC E. coli | 4182928..4187151 | 16127994 (G) | 363 | 2183..2279
infB E. coli | 3313655..3310983 (complement strand) | 16127994 (G) | 352 | 1692..1791
tufB E. coli | 4173523..4174707 | 16127994 (G) | 367 | 2400..2498
rplB E. coli | 3449001..3448180 | 16127994 (G) | 356 | 1945..2060
rplB E. coli | 3449001..3448180 | 16127994 (G) | 449 | 1986..2055
valS E. coli | 4481405..4478550 (complement strand) | 16127994 (G) | 358 | 1462..1572
capC B. anthracis | 56074..55628 (complement strand) | 6470151 (P) | 350 | 2517..2616
cya B. anthracis | 156626..154288 (complement strand) | 4894216 (P) | 351 | 1338..1449
lef B. anthracis | 127442..129921 | 4894216 (P) | 353 | 1121..1234
sspE B. anthracis | 226496..226783 | 30253828 (G) | 355 | 1007..1104
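Coordinate entries of the form start..end in Table 11 can be read as inclusive ranges. The short Python sketch below parses such an entry, reports the number of residues it spans, and flags ranges written from high to low. The helper names are hypothetical, and treating a descending range as a reverse-orientation extraction is an assumption made for illustration only.

```python
from typing import Tuple


def parse_range(text: str) -> Tuple[int, int]:
    """Parse 'start..end' into two integers, tolerating stray spaces."""
    start_text, end_text = text.replace(" ", "").split("..")
    return int(start_text), int(end_text)


def span_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range, whichever way it is written."""
    return abs(end - start) + 1


def is_descending(start: int, end: int) -> bool:
    """True when the range is written from the higher coordinate to the lower."""
    return start > end


if __name__ == "__main__":
    # capC extraction and its calibration-sequence coordinates from Table 11.
    for entry in ("56074..55628", "2517 ..2616"):
        start, end = parse_range(entry)
        print(entry, span_length(start, end), is_descending(start, end))
```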
Example 9: Use of a Calibration Polynucleotide for Determining the Quantity of Bacillus anthracis in a Sample Containing a Mixture of Microbes
[289] The process described in this example is shown in Figure 2. The capC gene is a gene involved in capsule synthesis which resides on the pXO2 plasmid of Bacillus anthracis. Primer pair number 350 (see Tables 10 and 11) was configured to identify Bacillus anthracis via production of a bacterial bioagent identifying amplicon. Known quantities of the combination calibration polynucleotide vector described in Example 8 were added to amplification mixtures containing bacterial bioagent nucleic acid from a mixture of microbes which included the Ames strain of Bacillus anthracis. Upon amplification of the bacterial bioagent nucleic acid and the combination calibration polynucleotide vector with primer pair no. 350, bacterial bioagent identifying amplicons and calibration amplicons were obtained and characterized by mass spectrometry. A mass spectrum measured for the amplification reaction is shown in Figure 7. The molecular masses of the bioagent identifying amplicons provided the means for identification of the bioagent from which they were obtained (Ames strain of Bacillus anthracis) and the molecular masses of the calibration amplicons provided the means for their identification as well. The relationship between the abundance (peak height) of the calibration amplicon signals and the bacterial bioagent identifying amplicon signals provides the means of calculation of the copies of the pXO2 plasmid of the Ames strain of Bacillus anthracis. Methods of calculating quantities of molecules based on internal calibration procedures are well known to those of ordinary skill in the art.
[290] Averaging the results of 10 repetitions of the experiment described above enabled a calculation indicating that the quantity of the Ames strain of Bacillus anthracis present in the sample corresponds to approximately 10 copies of the pXO2 plasmid.
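The internal calibration calculation referred to above can be sketched as follows. This is a minimal Python illustration, assuming the simplest possible model in which peak height is directly proportional to starting copy number and the calibration and bioagent amplicons amplify with equal efficiency. The function names and the replicate values are hypothetical and are not measured data from this example.

```python
from statistics import mean
from typing import Iterable, Tuple


def estimate_target_copies(calibrant_copies: float,
                           target_peak: float,
                           calibrant_peak: float) -> float:
    """Scale the known calibrant quantity by the ratio of the two peak heights."""
    if calibrant_peak <= 0:
        raise ValueError("calibrant peak height must be positive")
    if calibrant_copies < 0 or target_peak < 0:
        raise ValueError("quantities must be non-negative")
    return calibrant_copies * target_peak / calibrant_peak


def average_estimate(calibrant_copies: float,
                     replicates: Iterable[Tuple[float, float]]) -> float:
    """Average the per-replicate estimates; each item is (target_peak, calibrant_peak)."""
    return mean(
        estimate_target_copies(calibrant_copies, target, calibrant)
        for target, calibrant in replicates
    )


if __name__ == "__main__":
    # Hypothetical peak heights for two replicates spiked with 100 calibrant copies.
    print(average_estimate(100.0, [(40.0, 400.0), (60.0, 600.0)]))
```

A real assay would likely need a calibration curve or efficiency correction rather than a single ratio. The proportional model is used here only to make the arithmetic explicit.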
Example 10: Triangulation Genotyping Analysis of Campylobacter Species
[291] A series of triangulation genotyping analysis primers were configured as described in Example 1 with the objective of identification of different strains of Campylobacter jejuni. The primers are listed in Table 12 with the designation "CJST_CJ." Housekeeping genes to which the primers hybridize and produce bioagent identifying amplicons include: tkt (transketolase), glyA (serine hydroxymethyltransferase), gltA (citrate synthase), aspA (aspartate ammonia lyase), glnA (glutamine synthase), pgm (phosphoglycerate mutase), and uncA (ATP synthetase alpha chain).
Table 12: Campylobacter Genotyping Primer Pairs
Primer Pair No. | Forward Primer Name | Forward Primer (SEQ ID NO:) | Reverse Primer Name | Reverse Primer (SEQ ID NO:) | Target Gene
1053 | CJST_CJ_1080_1110_F | 681 | CJST_CJ_1166_1198_R | 1022 | gltA
1047 | CJST_CJ_584_616_F | 315 | CJST_CJ_663_692_R | 1379 | glnA
1048 | CJST_CJ_360_394_F | 346 | CJST_CJ_442_476_R | 955 | aspA
1049 | CJST_CJ_2636_2668_F | 504 | CJST_CJ_2753_2777_R | 1409 | tkt
1054 | CJST_CJ_2060_2090_F | 323 | CJST_CJ_2148_2174_R | 1068 | pgm
1064 | CJST_CJ_1680_1713_F | 479 | CJST_CJ_1795_1822_R | 938 | glyA
[292] The primers were used to amplify nucleic acid from 50 food product samples provided by the USDA, 25 of which contained Campylobacter jejuni and 25 of which contained Campylobacter coli. Primers used in this study were developed primarily for the discrimination of Campylobacter jejuni clonal complexes and for distinguishing Campylobacter jejuni from Campylobacter coli. Finer discrimination between Campylobacter coli types is also possible by using specific primers targeted to loci where closely-related Campylobacter coli isolates demonstrate polymorphisms between strains. The conclusions of the comparison of base composition analysis with sequence analysis are shown in Tables 13A-C.
Table 13A - Results of Base Composition Analysis of 50 Campylobacter Samples with Drill-down MLST Primer Pair Nos: 1048 and 1047
Figure imgf000161_0001
J-4 | C. jejuni | Human | Complex 257 | ST 257, complex 257 | RM4197 | A30 G25 C16 T46 | A48 G21 C18 T22
J-5 | C. jejuni | Human | Complex 52 | ST 52, complex 52 | RM4277 | A30 G25 C16 T46 | A48 G21 C17 T23
J-6 | C. jejuni | Human | Complex 443 | ST 51, complex 443 | RM4275 | A30 G25 C15 T47 | A48 G21 C17 T23
J-6 | C. jejuni | Human | Complex 443 | ST 51, complex 443 | RM4279 | A30 G25 C15 T47 | A48 G21 C17 T23
J-7 | C. jejuni | Human | Complex 42 | ST 604, complex 42 | RM1864 | A30 G25 C15 T47 | A48 G21 C18 T22
J-8 | C. jejuni | Human | Complex 42/49/362 | ST 362, complex 362 | RM3193 | A30 G25 C15 T47 | A48 G21 C18 T22
J-9 | C. jejuni | Human | Complex 45/283 | ST 147, Complex 45 | RM3203 | A30 G25 C15 T47 | A47 G21 C18 T23
C-1 | Consistent with 74 closely related sequence types (none belong to a clonal complex)
C. jejuni | 32E | RM4183 | A31 G27 C20 T39 | A48 G21 C16 T24
Human | Ϊ32 | RM1169 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1056 | RMl E | A31 G27 C20 T39 | A48 G21 C16 T24
ST 889 | RM1166 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 829 | RM1182 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1050 | RM1518 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1051 | RM1521 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1053 | RM1523 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1055 | RM1527 | A31 G27 C20 T39 | A48 G21 C16 T24
Poultry
ST 1017 | RM1529 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 860 | RM1840 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1063 | RM2219 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1066 | RM2241 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1067 | RM2243 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1068 | RM2439 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1016 | RM3230 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1069 | RM3231 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1061 | RM1904 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 825 | RM1534 | A31 G27 C20 T39 | A48 G21 C16 T24
Unknown
ST 901 | RM1505 | A31 G27 C20 T39 | A48 G21 C16 T24
C-2 | C. coli | Human | ST 895 | RM1532 | A31 G27 C19 T40 | A48 G21 C16 T24
C-3 | Consistent with 63 closely related sequence types (none belong to a clonal complex)
Poultry
ST 1064 | RM2223 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1082 | RM1178 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1054 | RM1525 | A31 G27 C20 T39 | A48 G21 C16 T24
ST 1049 | RM1517 | A31 G27 C20 T39 | A48 G21 C16 T24
Marmoset | ST 891 | RM1531 | A31 G27 C20 T39 | A48 G21 C16 T24
Table 13B - Results of Base Composition Analysis of 50 Campylobacter Samples with Drill-down MLST Primer Pair Nos: 1053 and 1064
Figure imgf000163_0001
T46 T4:
ST 1050 | RM1518 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1051 | RM1521 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1053 | RM1523 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1055 | RM1527 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1017 | RM1529 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 860 | RM1840 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1063 | RM2219 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1066 | RM2241 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1067 | RM2243 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1068 | RM2439 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1016 | RM3230 | A23 G24 C26 T46 | A39 G30 C27 T47
Swine | ST 1069 | RM3231 | A23 G24 C26 T46 | NO DATA
ST 1061 | RM1904 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 825 | RM1534 | A23 G24 C26 T46 | A39 G30 C27 T47
Unknown
ST 901 | RM1505 | A23 G24 C26 T46 | A39 G30 C27 T47
C-2 | C. coli | Human | ST 895 | ST 895 | RM1532 | A23 G24 C26 T46 | A39 G30 C27 T47
C-3 | Consistent with 63 closely related sequence types (none belong to a clonal complex)
ST 1064 | RM2223 | A23 G24 C26 T46 | A39 G30 C27 T47
ST 1082 | RM1178 | A23 G24 C26 T46 | A39 G30 C27 T47
Poultry
ST 1054 | RM1525 | A23 G24 C25 T47 | A39 G30 C27 T47
ST 1049 | RM1517 | A23 G24 C26 T46 | A39 G30 C27 T47
Marmoset | ST 891 | RM1531 | A23 G24 C26 T46 | A39 G30 C27 T47
Table 13C - Results of Base Composition Analysis of 50 Campylobacter Samples with Drill-down MLST Primer Pair Nos: 1054 and 1049
Figure imgf000165_0001
ST 1063 | RM2219 | A27 G30 C19 T39 | A46 G28 C32 T36
ST 1066 | RM2241 | A27 G30 C19 T39 | A46 G28 C32 T36
ST 1067 | RM2243 | A27 G30 C19 T39 | A46 G28 C32 T36
ST 1068 | RM2439 | A27 G30 C19 T39 | A46 G28 C32 T36
ST 1016 | RM3230 | A27 G30 C19 T39 | A46 G28 C32 T36
Swine | ST 1069 | RM3231 | A27 G30 C19 T39 | A46 G28 C32 T36
ST 1061 | RM1904 | A27 G30 C19 T39 | A46 G28 C32 T36
ST 825 | RM1534 | A27 G30 C19 T39 | A46 G28 C32 T36
Unknown
ST 901 | RM1505 | A27 G30 C19 T39 | A46 G28 C32 T36
C-2 | C. coli | Human | ST 895 | ST 895 | RM1532 | A27 G30 C19 T39 | A45 G29 C32 T36
C-3 | C. coli | Consistent with 63 closely related sequence types (none belong to a clonal complex)
ST 1064 | RM2223 | A27 G30 C19 T39 | A45 G29 C32 T36
ST 1082 | RM1178 | A27 G30 C19 T39 | A45 G29 C32 T36
Poultry
ST 1054 | RM1525 | A27 G30 C19 T39 | A45 G29 C32 T36
ST 1049 | RM1517 | A27 G30 C19 T39 | A45 G29 C32 T36
Marmoset | ST 891 | RM1531 | A27 G30 C19 T39 | A45 G29 C32 T36
[293] The base composition analysis method was successful in identification of 12 different strain groups. Campylobacter jejuni and Campylobacter coli are generally differentiated by all loci. Ten clearly differentiated Campylobacter jejuni isolates and 2 major Campylobacter coli groups were identified even though the primers were configured for strain typing of Campylobacter jejuni. One isolate (RM4183), which was designated as Campylobacter jejuni, was found to group with Campylobacter coli and also appears to actually be Campylobacter coli by full MLST sequencing.
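The grouping step behind triangulation can be illustrated with a short Python sketch: isolates are assigned to the same group only when their base compositions agree at every locus examined. The function name is hypothetical. The three profiles are transcribed from the first composition column of Table 13A and the second composition column of Table 13C (taken here, as an assumption, to correspond to primer pairs 1048 and 1049), with evident character-recognition errors in the table text repaired.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Signature = Tuple[str, ...]


def group_by_signature(profiles: Dict[str, Signature]) -> Dict[Signature, List[str]]:
    """Group strains whose base compositions match at every listed locus."""
    groups: Dict[Signature, List[str]] = defaultdict(list)
    for strain, signature in profiles.items():
        groups[tuple(signature)].append(strain)
    return dict(groups)


# Strain -> (composition from Table 13A, composition from Table 13C).
profiles = {
    "RM2219": ("A31 G27 C20 T39", "A46 G28 C32 T36"),
    "RM2223": ("A31 G27 C20 T39", "A45 G29 C32 T36"),
    "RM1532": ("A31 G27 C19 T40", "A45 G29 C32 T36"),
}

combined = group_by_signature(profiles)
first_locus_only = group_by_signature({s: (sig[0],) for s, sig in profiles.items()})
second_locus_only = group_by_signature({s: (sig[1],) for s, sig in profiles.items()})

if __name__ == "__main__":
    # Each locus alone yields two groups; the two loci together separate all three.
    print(len(first_locus_only), len(second_locus_only), len(combined))
```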
Example 11: Identification of Acinetobacter baumannii Using Broad Range Survey and Division-Wide Primers in Epidemiological Surveillance
[294] To test the capability of the broad range survey and division-wide primer sets of Table 5 in identification of Acinetobacter species, 183 clinical samples were obtained from individuals participating in, or in contact with individuals participating in, Operation Iraqi Freedom (including US service personnel, US civilian patients at the Walter Reed Army Institute of Research (WRAIR), medical staff, Iraqi civilians and enemy prisoners). In addition, 34 environmental samples were obtained from hospitals in Iraq, Kuwait, Germany, the United States and the USNS Comfort, a hospital ship.
[295] Upon amplification of nucleic acid obtained from the clinical samples, primer pairs 346-349, 360, 361, 354, 362 and 363 (Table 5) all produced bacterial bioagent amplicons which identified Acinetobacter baumannii in 215 of 217 samples. The organism Klebsiella pneumoniae was identified in the remaining two samples. In addition, 14 different strain types (containing single nucleotide polymorphisms relative to a reference strain of Acinetobacter baumannii) were identified and assigned arbitrary numbers from 1 to 14. Strain type 1 was found in 134 of the sample isolates and strains 3 and 7 were found in 46 and 9 of the isolates respectively.
[296] The epidemiology of strain type 7 of Acinetobacter baumannii was investigated. Strain 7 was found in 4 patients and 5 environmental samples (from field hospitals in Iraq and Kuwait). The index patient infected with strain 7 was a pre-war patient who had a traumatic amputation in March of 2003 and was treated at a Kuwaiti hospital. The patient was subsequently transferred to a hospital in Germany and then to WRAIR. Two other patients from Kuwait infected with strain 7 were found to be non-infectious and were not further monitored. The fourth patient was diagnosed with a strain 7 infection in September of 2003 at WRAIR. Since the fourth patient was not involved in Operation Iraqi Freedom, it was inferred that the fourth patient was the subject of a nosocomial infection acquired at WRAIR as a result of the spread of strain 7 from the index patient.
[297] The epidemiology of strain type 3 of Acinetobacter baumannii was also investigated. Strain type 3 was found in 46 samples, all of which were from patients (US service members, Iraqi civilians and enemy prisoners) who were treated on the USNS Comfort hospital ship and subsequently returned to Iraq or Kuwait. The occurrence of strain type 3 in a single locale may provide evidence that at least some of the infections at that locale were a result of nosocomial infections.
[298] This example thus illustrates an embodiment of the present invention wherein the methods of analysis of bacterial bioagent identifying amplicons provide the means for epidemiological surveillance.
Example 12: Selection and Use of Triangulation Genotyping Analysis Primer Pairs for Acinetobacter baumannii
[299] To combine the power of high-throughput mass spectrometric analysis of bioagent identifying amplicons with the sub-species characteristic resolving power provided by triangulation genotyping analysis, an additional 21 primer pairs were selected based on analysis of housekeeping genes of the genus Acinetobacter. Genes to which the drill-down triangulation genotyping analysis primers hybridize for production of bacterial bioagent identifying amplicons include anthranilate synthase component I (trpE), adenylate kinase (adk), adenine glycosylase (mutY), fumarate hydratase (fumC), and pyrophosphate phospho-hydratase (ppa). These 21 primer pairs are indicated with reference to sequence listings in Table 14. Primer pair numbers 1151-1154 hybridize to and amplify segments of trpE. Primer pair numbers 1155-1157 hybridize to and amplify segments of adk. Primer pair numbers 1158-1164 hybridize to and amplify segments of mutY. Primer pair numbers 1165-1170 hybridize to and amplify segments of fumC. Primer pair number 1171 hybridizes to and amplifies a segment of ppa. Primer pair numbers 2846-2848 hybridize to and amplify segments of the parC gene of DNA topoisomerase which include a codon known to confer quinolone drug resistance upon sub-types of Acinetobacter baumannii. Primer pair numbers 2852-2854 hybridize to and amplify segments of the gyrA gene of DNA gyrase which include a codon known to confer quinolone drug resistance upon sub-types of Acinetobacter baumannii. Primer pair numbers 2922 and 2972 are speciating primers which are useful for identifying different species members of the genus Acinetobacter. The primer names given in Table 14A (with the exception of primer pair numbers 2846-2848 and 2852-2854) indicate the coordinates at which the primers hybridize to a reference sequence which comprises a concatenation of the genes trpE, efp (elongation factor p), adk, mutT, fumC, and ppa.
For example, the forward primer of primer pair 1151 is named AB_MLST-11-OIF007_62_91_F because it hybridizes to the Acinetobacter primer reference sequence of strain type 11 in sample 007 of Operation Iraqi Freedom (OIF) at positions 62 to 91. DNA was sequenced from strain type 11, and from this sequence data an artificial concatenated sequence of partial gene extractions was assembled for use in design of the triangulation genotyping analysis primers. The stretches of arbitrary residues ("N"s) in the concatenated sequence were added for the convenience of separation of the partial gene extractions (40N for AB_MLST (SEQ ID NO: 1444)).
[300] The hybridization coordinates of primer pair numbers 2846-2848 are with respect to GenBank Accession number X95819. The hybridization coordinates of primer pair numbers 2852-2854 are with respect to GenBank Accession number AY642140. Sequence residue "I" appearing in the forward and reverse primers of primer pair number 2972 represents inosine.
Table 14A: Triangulation Genotyping Analysis Primer Pairs for Identification of Sub-species characteristics (Strain Type) of Members of the Bacterial Genus Acinetobacter
Figure imgf000169_0001
Figure imgf000170_0001
Figure imgf000171_0001
Table 14B: Triangulation Genotyping Analysis Primer Pairs for Identification of Sub-species characteristics (Strain Type) of Members of the Bacterial Genus Acinetobacter
Figure imgf000171_0002
Figure imgf000172_0001
[301] Analysis of bioagent identifying amplicons obtained using the primers of Table 14B for over 200 samples from Operation Iraqi Freedom resulted in the identification of 50 distinct strain type clusters. The largest cluster, designated strain type 11 (ST11), includes 42 sample isolates, all of which were obtained from US service personnel and Iraqi civilians treated at the 28th Combat Support Hospital in Baghdad. Several of these individuals were also treated on the hospital ship USNS Comfort. These observations are indicative of significant epidemiological correlation/linkage.
[302] All of the sample isolates were tested against a broad panel of antibiotics to characterize their antibiotic resistance profiles. As an example of a representative result from antibiotic susceptibility testing, ST11 was found to consist of four different clusters of isolates, each with a varying degree of sensitivity/resistance to the various antibiotics tested, which included penicillins, extended spectrum penicillins, cephalosporins, carbapenems, protein synthesis inhibitors, nucleic acid synthesis inhibitors, anti-metabolites, and anti-cell membrane antibiotics. Thus, the genotyping power of bacterial bioagent identifying amplicons, particularly drill-down bacterial bioagent identifying amplicons, has the potential to increase the understanding of the transmission of infections in combat casualties, to identify the source of infection in the environment, to track hospital transmission of nosocomial infections, and to rapidly characterize drug-resistance profiles which enable development of effective infection control measures on a time-scale previously not achievable.
Example 13: Triangulation Genotyping Analysis and Codon Analysis of Acinetobacter baumannii Samples from Two Health Care Facilities
[303] In this investigation, 88 clinical samples were obtained from Walter Reed Hospital and 95 clinical samples were obtained from Northwestern Medical Center. All samples from both healthcare facilities were suspected of containing sub-types of Acinetobacter baumannii, at least some of which were expected to be resistant to quinolone drugs. Each of the 183 samples was analyzed by the method of the present invention. DNA was extracted from each of the samples and amplified with eight triangulation genotyping analysis primer pairs represented by primer pair numbers: 1151, 1156, 1158, 1160, 1165, 1167, 1170, and 1171. The DNA was also amplified with speciating primer pair number 2922 and codon analysis primer pair numbers 2846-2848 which interrogate a codon present in the parC gene, and primer pair numbers 2852-2854 which bracket a codon present in the gyrA gene. The parC and gyrA codon mutations are both responsible for causing drug resistance in Acinetobacter baumannii. During evolution of drug resistant strains, the gyrA mutation usually occurs before the parC mutation. Amplification products were measured by ESI-TOF mass spectrometry as indicated in Example 4. The base compositions of the amplification products were calculated from the average molecular masses of the amplification products and are shown in Tables 15-18. The entries in each of the tables are grouped according to strain type number, which is an arbitrary number assigned to Acinetobacter baumannii strains in the order of observance beginning from the triangulation genotyping analysis OIF genotyping study described in Example 12. For example, strain type 11 which appears in samples from the Walter Reed Hospital is the same strain as the strain type 11 mentioned in Example 12. Ibis# refers to the order in which each sample was analyzed. 
Isolate refers to the original sample isolate numbering system used at the location from which the samples were obtained (either Walter Reed Hospital or Northwestern Medical Center). ST = strain type. ND = not detected. Base compositions highlighted with bold type indicate that the base composition is a unique base composition for the amplification product obtained with the pair of primers indicated.
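Base composition entries such as those in the tables that follow can be compared programmatically. The Python sketch below parses a composition string into per-base counts and reports which counts differ between two compositions, which is how a single-nucleotide substitution at an interrogated codon would show up. The function names are hypothetical. The two strings are gyrA compositions reported for primer pair 2852 in Table 15A, and no claim is made here about which of them is the drug-resistant form.

```python
import re
from typing import Dict


def parse_composition(text: str) -> Dict[str, int]:
    """Convert a string such as 'A25G23C21T32' into per-base counts."""
    compact = text.replace(" ", "")
    if not re.fullmatch(r"(?:[AGCT]\d+)+", compact):
        raise ValueError(f"not a base composition: {text!r}")
    counts = {base: 0 for base in "AGCT"}
    for base, number in re.findall(r"([AGCT])(\d+)", compact):
        counts[base] += int(number)
    return counts


def composition_shift(reference: Dict[str, int],
                      observed: Dict[str, int]) -> Dict[str, int]:
    """Return observed-minus-reference counts for the bases that differ."""
    return {
        base: observed[base] - reference[base]
        for base in "AGCT"
        if observed[base] != reference[base]
    }


if __name__ == "__main__":
    common = parse_composition("A25G23C21T32")
    variant = parse_composition("A25G23C22T31")
    # Equal totals with one count up and one down suggest a single substitution.
    print(sum(common.values()), sum(variant.values()))
    print(composition_shift(common, variant))
```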
Table 15A: Base Compositions of Amplification Products of 88 A. baumannii Samples Obtained from Walter Reed Hospital and Amplified with Codon Analysis Primer Pairs Targeting the gyrA Gene
Species Ibis# Isolate ST PP No: 2852 gyrA PP No: 2853 gyrA PP No: 2854 gyrA
A. baumannii 20 1082 1 A25G23C22T31 A29G28C22T42 A17G13C14T20
A. baumannii 13 854 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 22 1162 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 27 1230 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 31 1367 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 37 1459 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 55 1700 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 64 1777 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 73 1861 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 74 1877 10 ND A29G28C21T43 A17G13C13T21
A. baumannii 86 1972 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 3 684 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 6 720 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 7 726 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 19 1079 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 21 1123 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 23 1188 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 33 1417 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 34 1431 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 38 1496 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 40 1523 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 42 1640 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 50 1666 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 51 1668 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 52 1695 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 65 1781 11 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 44 1649 12 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 49A 1658.1 12 A25G23C22T31 A29G28C21T43 A17G13C13T21
A. baumannii 49B 1658.2 12 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 56 1707 12 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 80 1893 12 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 5 693 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 8 749 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 10 839 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 14 865 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 16 888 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 29 1326 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 35 1440 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 41 1524 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 46 1652 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 47 1653 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 48 1657 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 57 1709 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 61 1727 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 63 1762 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 67 1806 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 75 1881 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 77 1886 14 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 1 649 46 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 2 653 46 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 39 1497 16 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 24 1198 15 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 28 1243 15 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 43 1648 15 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 62 1746 15 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 4 689 15 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 68 1822 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 69 1823A 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 70 1823B 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 71 1826 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 72 1860 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 81 1924 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 82 1929 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 85 1966 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 11 841 3 A25G23C22T31 A29G28C22T42 A17G13C14T20
A. baumannii 32 1415 24 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 45 1651 24 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 54 1697 24 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 58 1712 24 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 60 1725 24 A25G23C21T32 A29G28C21T43 A17G13C13T21
Figure imgf000176_0001
Table 15B: Base Compositions Determined from A. baumannii DNA Samples Obtained from Walter Reed Hospital and Amplified with Codon Analysis Primer Pairs Targeting the parC Gene
Figure imgf000176_0002
A. baumannii 33 1417 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 34 1431 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 38 1496 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 40 1523 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 42 1640 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 50 1666 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 51 1668 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 52 1695 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 65 1781 11 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 44 1649 12 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 49A 1658.1 12 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 49B 1658.2 12 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 56 1707 12 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 80 1893 12 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 5 693 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 8 749 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 10 839 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 14 865 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 16 888 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 29 1326 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 35 1440 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 41 1524 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 46 1652 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 47 1653 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 48 1657 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 57 1709 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 61 1727 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 63 1762 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 67 1806 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 75 1881 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 77 1886 14 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 1 649 46 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 2 653 46 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 39 1497 16 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 24 1198 15 A33G26C28T34 A29G29C23T33 A16G14C14T16
A. baumannii 28 1243 15 A33G26C28T34 A29G29C23T33 A16G14C14T16
A. baumannii 43 1648 15 A33G26C28T34 A29G29C23T33 A16G14C14T16
A. baumannii 62 1746 15 A33G26C28T34 A29G29C23T33 A16G14C14T16
A. baumannii 4 689 15 A34G25C29T33 A30G27C26T31 A16G14C15T15
A. baumannii 68 1822 3 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 69 1823A 3 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 70 1823B 3 A33G26C28T34 A29G28C25T32 A16G14C14T16
A. baumannii 71 1826 3 A33G26C28T34 A29G28C25T32 A16G14C14T16
Figure imgf000178_0001
Table 16A: Base Compositions Determined from A. baumannii DNA Samples Obtained from Northwestern Medical Center and Amplified with Codon Analysis Primer Pairs Targeting the gyrA Gene
Species Ibis# Isolate ST PP No: 2852 gyrA PP No: 2853 gyrA PP No: 2854 gyrA
A. baumannii 54 536 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 87 665 3 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 8 80 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 9 91 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 10 92 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 11 131 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
A. baumannii 12 137 10 A25G23C21T32 A29G28C21T43 A17G13C13T21
Figure imgf000179_0001
Figure imgf000180_0001
mixture 7 71 ND ND A17G13C15T19
Table 16B: Base Compositions Determined from A. baumannii DNA Samples Obtained from Northwestern Medical Center and Amplified with Codon Analysis Primer Pairs Targeting the parC Gene
Figure imgf000181_0001
Figure imgf000182_0001
Figure imgf000183_0001
Table 17A: Base Compositions Determined from A. baumannii DNA Samples Obtained from Walter Reed Hospital and Amplified with Speciating Primer Pair No. 2922 and Triangulation Genotyping Analysis Primer Pair Nos. 1151 and 1156
Figure imgf000183_0002
Figure imgf000184_0001
Figure imgf000185_0001
Table 17B: Base Compositions Determined from A. baumannii DNA Samples Obtained from Walter Reed Hospital and Amplified with Triangulation Genotyping Analysis Primer Pair Nos. 1158, 1160 and 1165
Figure imgf000185_0002
A. baumannii 55 1700 10 A27G21C26T21 A32G35C28T34 A40G33C30T36
A. baumannii 64 1777 10 A27G21C26T21 A32G35C 28T34 A40G33 C30T36
A. baumannii 73 1861 10 A27G21C26T21 A32G35C 28T34 A40G33 C30T36
A. baumannii 74 1877 10 A27G21C26T21 A32G35C28T34 A40G33C30T36
A. baumannii 86 1972 10 A27G21C26T21 A32G35C28T34 A40G33C30T36
A. baumannii 3 684 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 6 720 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 7 726 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 19 1079 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 21 1123 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 23 1188 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 33 1417 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 34 1431 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 38 1496 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 40 1523 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 42 1640 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 50 1666 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 51 1668 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 52 1695 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 65 1781 11 A27G21C25T22 A32G34C28T35 A40G33C30T36
A. baumannii 44 1649 12 A27G21C26T21 A32G34C29T34 A40G33C30T36
A. baumannii 49A 1658.1 12 A27G21C26T21 A32G34C29T34 A40G33C30T36
A. baumannii 49B 1658.2 12 A27G21C26T21 A32G34C29T34 A40G33C30T36
A. baumannii 56 1707 12 A27G21C26T21 A32G34C29T34 A40G33C30T36
A. baumannii 80 1893 12 A27G21C26T21 A32G34C29T34 A40G33C30T36
A. baumannii 5 693 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 8 749 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 10 839 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 14 865 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 16 888 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 29 1326 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 35 1440 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 41 1524 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 46 1652 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 47 1653 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 48 1657 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 57 1709 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 61 1727 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 63 1762 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 67 1806 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 75 1881 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 77 1886 14 A27G21C25T22 A31G36C28T34 A40G33C29T37
A. baumannii 1 649 46 A29G19C26T21 A31G35C29T34 A40G33C29T37
Figure imgf000187_0001
Table 17C: Base Compositions Determined from A. baumannii DNA Samples Obtained from Walter Reed Hospital and Amplified with Triangulation Genotyping Analysis Primer Pair Nos. 1167, 1170 and 1171
Species Ibis* Isolate ST PP No: 1167 fumC PP No: 1170 fumC PP No: 1171 ppa
A. baumannii 20 1082 1 A41G34C34T38 A36G27C21T50 A35G37C33T44
A. baumannii 13 854 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 22 1162 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 27 1230 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 31 1367 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 37 1459 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 55 1700 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 64 1777 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 73 1861 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 74 1877 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 86 1972 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 3 684 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 6 720 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 7 726 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 19 1079 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 21 1123 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 23 1188 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 33 1417 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 34 1431 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 38 1496 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 40 1523 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 42 1640 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 50 1666 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 51 1668 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 52 1695 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 65 1781 11 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 44 1649 12 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 49A 1658.1 12 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 49B 1658.2 12 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 56 1707 12 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 80 1893 12 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 5 693 14 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 8 749 14 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 10 839 14 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 14 865 14 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 16 888 14 A40G35C34T38 A38G27C21T50 A35G37C30T47
Figure imgf000189_0001
Figure imgf000190_0001
Table 18A: Base Compositions Determined from A. baumannii DNA Samples Obtained from Northwestern Medical Center and Amplified with Speciating Primer Pair No. 2922 and Triangulation Genotyping Analysis Primer Pair Nos. 1151 and 1156
Figure imgf000190_0002
Figure imgf000191_0001
Figure imgf000192_0001
Table 18B: Base Compositions Determined from A. baumannii DNA Samples Obtained from Northwestern Medical Center and Amplified with Triangulation Genotyping Analysis Primer Pair Nos. 1158, 1160 and 1165
Figure imgf000192_0002
Figure imgf000193_0001
Figure imgf000194_0001
Table 18C: Base Compositions Determined from A. baumannii DNA Samples Obtained from Northwestern Medical Center and Amplified with Triangulation Genotyping Analysis Primer Pair Nos. 1167, 1170 and 1171
Species Ibis* Isolate ST PP No: 1167 fumC PP No: 1170 fumC PP No: 1171 ppa
A. baumannii 54 536 3 A41G34C35T37 A38G27C20T51 A35G37C31T46
A. baumannii 87 665 3 A41G34C35T37 A38G27C20T51 A35G37C31T46
A. baumannii 8 80 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 9 91 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 10 92 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 11 131 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 12 137 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 21 218 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 26 242 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 94 678 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 1 9 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 2 13 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 3 19 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 4 24 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 5 36 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 6 39 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 13 139 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 15 165 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 16 170 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 17 186 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 20 202 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 22 221 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 24 234 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 25 239 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 33 370 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 34 389 10 A41G34C34T38 A38G27C21T50 A35G37C33T44
A. baumannii 19 201 14 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 27 257 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 29 301 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 31 354 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 36 422 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 424 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 38 434 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 39 473 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 40 482 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
A. baumannii 44 512 51 A40G35C34T38 A38G27C21T50 A35G37C30T47
Figure imgf000196_0001
Figure imgf000197_0001
[304] Base composition analysis of the samples obtained from Walter Reed Hospital indicated that a majority of the strain types identified were the same strain types already characterized by the OIF study of Example 12. This is not surprising since at least some patients from which clinical samples were obtained in OIF were transferred to the Walter Reed Hospital (WRAIR). Examples of these common strain types include: ST10, ST11, ST12, ST14, ST15, ST16 and ST46. A strong correlation was noted between these strain types and the presence of mutations in gyrA and parC, which confer quinolone drug resistance.
[305] In contrast, the results of base composition analysis of samples obtained from Northwestern Medical Center indicate the presence of 4 major strain types: ST10, ST51, ST53 and ST54. All of these strain types have the gyrA quinolone resistance mutation and most also have the parC quinolone resistance mutation, with the exception of ST35. This observation is consistent with the current understanding that the gyrA mutation generally appears before the parC mutation and suggests that the acquisition of these drug resistance mutations is rather recent and that resistant isolates are taking over the wild-type isolates. Another interesting observation was that a single isolate of ST3 (isolate 841) displays a triangulation genotyping analysis pattern similar to other isolates of ST3, but the codon analysis amplification product base compositions indicate that this isolate has not yet undergone the quinolone resistance mutations in gyrA and parC.
[306] The six isolates that represent species other than Acinetobacter baumannii in the samples obtained from the Walter Reed Hospital were each found not to carry the drug resistance mutations.
[307] The results described above involved analysis of 183 samples using the methods and compositions of the present invention. Results were provided to collaborators at the Walter Reed Hospital and Northwestern Medical Center within a week of obtaining samples. This example highlights the rapid throughput characteristics of the analysis platform and the resolving power of triangulation genotyping analysis and codon analysis for identification of, and determination of drug resistance in, bacteria.
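As an illustration only, the way triangulation genotyping resolves strain types can be sketched by grouping isolates on the combination of base compositions observed across several loci. The Python sketch below uses a few rows copied from Table 18C (primer pair nos. 1167, 1170 and 1171). The function name `group_by_signature` and the data layout are assumptions made for this illustration and are not part of the disclosed method.

```python
from collections import defaultdict

# Rows copied from Table 18C: (isolate, strain type, base compositions for
# primer pairs 1167 (fumC), 1170 (fumC) and 1171 (ppa)).
TABLE_18C_ROWS = [
    ("536", "ST3", ("A41G34C35T37", "A38G27C20T51", "A35G37C31T46")),
    ("665", "ST3", ("A41G34C35T37", "A38G27C20T51", "A35G37C31T46")),
    ("80", "ST10", ("A41G34C34T38", "A38G27C21T50", "A35G37C33T44")),
    ("201", "ST14", ("A40G35C34T38", "A38G27C21T50", "A35G37C30T47")),
    ("257", "ST51", ("A40G35C34T38", "A38G27C21T50", "A35G37C30T47")),
]


def group_by_signature(rows):
    """Group isolates that share the same base compositions at every locus."""
    groups = defaultdict(list)
    for isolate, strain_type, signature in rows:
        groups[signature].append((isolate, strain_type))
    return dict(groups)


groups = group_by_signature(TABLE_18C_ROWS)
for signature, members in groups.items():
    print(" / ".join(signature), "->", members)
```

In this subset, the ST14 and ST51 isolates share one signature across the three loci of Table 18C, so separating them would presumably rely on the additional primer pairs reported in Tables 18A and 18B.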
Example 14: Identification of Drug Resistance Genes and Virulence Factors in Staphylococcus aureus
[308] Three primer pair panels, each comprising eight primer pairs, were configured for identification of the Staphylococcus aureus species and for identification of drug resistance genes and virulence factors of Staphylococcus aureus bioagents. These panels are shown in Tables 19A, 19B and 19C. The primer sequences in these panels can also be found in Table 2, and are cross-referenced in Tables 19A-C by primer pair numbers, primer pair names, and SEQ ID NOs.
Table 19A: Panel of Primer Pairs for Identification of Drug Resistance Genes and Virulence Factors in Staphylococcus aureus
Figure imgf000198_0001
Figure imgf000199_0001
Table 19B: Panel of Primer Pairs for Identification of Drug Resistance Genes and Virulence Factors in Staphylococcus aureus
Figure imgf000199_0002
Table 19C: Panel of Primer Pairs for Identification of Drug Resistance Genes and Virulence Factors in Staphylococcus aureus
Figure imgf000199_0003
[309] Primer pair numbers 2256 and 2249 are confirmation primers configured with the aim of high-level identification of Staphylococcus aureus. The nuc gene is a Staphylococcus aureus-specific marker gene. The tufB gene is a universal housekeeping gene, but the bioagent identifying amplicon defined by primer pair number 2249 provides a unique base composition (A43 G28 C19 T35) which distinguishes Staphylococcus aureus from other members of the genus Staphylococcus.
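For readers handling base compositions programmatically, the notation used above can be read as a count for each of the four nucleobases. The following Python sketch, offered only as an illustration, parses the tufB composition quoted in paragraph [309] and derives the implied amplicon length. The helper name `parse_base_composition` is hypothetical and does not come from the disclosure.

```python
import re


def parse_base_composition(text):
    """Parse 'A43 G28 C19 T35' or 'A43G28C19T35' into a dict of base counts."""
    counts = {base: int(n) for base, n in re.findall(r"([AGCT])\s*(\d+)", text)}
    if set(counts) != set("AGCT"):
        raise ValueError(f"expected counts for A, G, C and T in {text!r}")
    return counts


# Base composition reported for the tufB amplicon of primer pair number 2249.
tufb = parse_base_composition("A43 G28 C19 T35")

# The amplicon length is the sum of the four counts: 43 + 28 + 19 + 35.
amplicon_length = sum(tufb.values())
print(tufb, amplicon_length)
```

The same helper accepts the compact form used in Tables 17 and 18 (for example `A41G34C34T38`), because the whitespace between a base letter and its count is optional in the pattern.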
[310] High level methicillin resistance in a given strain of Staphylococcus aureus is indicated by bioagent identifying amplicons defined by primer pair numbers 879 and 2056. Analyses have indicated that primer pair number 879 is not expected to prime the S. sciuri homolog or the Enterococcus faecalis/faecium ampicillin-resistant PBP5 homologs.
[311] Macrolide and erythromycin resistance in a given strain of Staphylococcus aureus is indicated by bioagent identifying amplicons defined by primer pair numbers 2081 and 2086.
[312] Resistance to mupirocin in a given strain of Staphylococcus aureus is indicated by bioagent identifying amplicons defined by primer pair numbers 2313 and 3016.
[313] In the above panels, virulence in a given strain of Staphylococcus aureus can be indicated by bioagent identifying amplicons defined by primer pair numbers 2095 and 3106. Primer pair number 2095 can identify both the pvl (lukS-PV) gene and the lukD gene which encodes a homologous enterotoxin. A bioagent identifying amplicon of the lukD gene defined by primer pair number 2095 has a six nucleobase length difference relative to the lukS-PV gene. Further, primer pair number 3106 is configured to generate amplicons within the tsst-1 gene, which encodes toxic shock syndrome toxin, the cause of toxic shock syndrome (TSS).
[314] A total of 32 blinded samples of different strains of Staphylococcus aureus were provided by the Centers for Disease Control and Prevention (CDC). Each sample was analyzed by PCR amplification with the eight primer pair panel of Table 19A, followed by purification and measurement of molecular masses of the amplification products by mass spectrometry. Base compositions for the amplification products were calculated. The base compositions provide the information summarized above for each primer pair. The results are shown in Tables 20A and 20B. One result noted upon un-blinding of the samples is that each of the PVL+ identifications agreed with PVL+ identified in the same samples by standard PCR assays. These results indicate that the panel of eight primer pairs is useful for identification of drug resistance and virulence sub-species characteristics for Staphylococcus aureus. Thus, it is expected that a kit comprising one or more of the members of the panels provided in Tables 19A-C will be a useful embodiment.
Table 20A: Drug Resistance and Virulence Identified in Blinded Samples of Various Strains of Staphylococcus aureus with Primer Pair Nos. 2081, 2086, 2095 and 2256
Figure imgf000201_0001
Figure imgf000202_0001
Table 20B: Drug Resistance and Virulence Identified in Blinded Samples of Various Strains of Staphylococcus aureus with Primer Pair Nos. 2249, 879, 2056, and 2313
Figure imgf000202_0002
Figure imgf000203_0001
Example 15: Selection and Use of Triangulation Genotyping Analysis Primer Pairs for Staphylococcus aureus
[315] To combine the power of high-throughput mass spectrometric analysis of bioagent identifying amplicons with the sub-species characteristic resolving power provided by triangulation genotyping analysis, two panels, each with eight triangulation genotyping analysis primer pairs, were selected and are listed in Tables 21A and 21B. The primer pairs are configured to produce bioagent identifying amplicons within six different housekeeping genes which are listed in the tables. The primer sequences are found in Table 2 and are cross-referenced by the primer pair numbers, primer pair names or SEQ ID NOs listed in Tables 21A and 21B. Further, another panel of primer pairs was developed to combine the identification/drug resistance/virulence identifying power of the primer pairs of Tables 19A-C with the triangulation genotyping analysis of Tables 21A-B. This panel comprises sixteen primer pairs and is shown in Table 21C. The panel shown in Table 21C combines primer pairs of Tables 19B and 21B. However, other combinations of primer pairs from the Staphylococcus aureus genotyping panels and the identification/virulence/drug resistance panels shown in Examples 14 and 15 are encompassed by this disclosure.
Table 21A: Primer Pairs for Triangulation Genotyping Analysis of Staphylococcus aureus
Figure imgf000204_0001
Table 21B: Primer Pairs for Triangulation Genotyping Analysis of Staphylococcus aureus
Figure imgf000204_0002
Table 21C: Panel of Primer Pairs for Identification/Drug Resistance/Virulence and Triangulation Genotyping Analysis of Staphylococcus aureus
Figure imgf000204_0003
Figure imgf000205_0001
[316] The same samples analyzed for drug resistance and virulence in Example 14 were subjected to triangulation genotyping analysis. The primer pairs of Table 21A were used to produce amplification products by PCR, which were subsequently purified and measured by mass spectrometry. Base compositions were calculated from the molecular masses and are shown in Tables 22A and 22B.
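The step from a measured molecular mass to a base composition can be pictured as a search over candidate nucleobase counts. The Python sketch below is a simplified illustration under stated assumptions, not the calculation used to produce Tables 22A and 22B. It treats a single unmodified DNA strand, uses approximate average residue masses and the end correction commonly quoted by oligonucleotide calculators, assumes the amplicon length is known, and ignores adducts and instrument calibration. The names `strand_mass` and `candidate_compositions` are hypothetical.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA strand,
# with the end correction commonly quoted by oligonucleotide calculators.
# These constants are illustrative assumptions, not values from the disclosure.
RESIDUE_MASS = {"A": 313.21, "G": 329.21, "C": 289.18, "T": 304.20}
END_CORRECTION = -61.96


def strand_mass(a, g, c, t):
    """Approximate average mass of one unmodified DNA strand."""
    return (a * RESIDUE_MASS["A"] + g * RESIDUE_MASS["G"]
            + c * RESIDUE_MASS["C"] + t * RESIDUE_MASS["T"] + END_CORRECTION)


def candidate_compositions(measured_mass, length, tolerance=0.5):
    """Return every (A, G, C, T) of the given length within the mass tolerance."""
    hits = []
    for a in range(length + 1):
        for g in range(length + 1 - a):
            for c in range(length + 1 - a - g):
                t = length - a - g - c
                if abs(strand_mass(a, g, c, t) - measured_mass) <= tolerance:
                    hits.append((a, g, c, t))
    return hits


# Use the tufB composition from paragraph [309] to simulate a measured mass.
target_mass = strand_mass(43, 28, 19, 35)
hits = candidate_compositions(target_mass, length=125)
print(len(hits), (43, 28, 19, 35) in hits)
```

A single-strand mass on its own may be matched by more than one composition within the tolerance, which is why the sketch returns a list of candidates instead of a single answer.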
Table 22A: Triangulation Genotyping Analysis of Blinded Samples of Various Strains of Staphylococcus aureus with Primer Pair Nos. 2146, 2149, 2150 and 2156
Sample Index No. Strain Primer Pair No. 2146 (arcC) Primer Pair No. 2149 (aroE) Primer Pair No. 2150 (aroE) Primer Pair No. 2156 (gmk)
CDC0010 COL A44G24C18T29 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC0015 COL A44G24C18T29 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC0019 COL A44G24C18T29 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC0026 COL A44G24C18T29 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC0030 COL A44G24C18T29 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC004 COL A44G24C18T29 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC0014 COL A44G24C18T29 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC008 ???? A44G24C18T29 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC001 Mu50 A45G23C20T27 A58G24C18T52 A40G36C13T43 A51G29C21T31
CDC0022 Mu50 A45G23C20T27 A58G24C18T52 A40G36C13T43 A51G29C21T31
CDC006 Mu50 A45G23C20T27 A58G24C18T52 A40G36C13T43 A51G29C21T31
CDC0011 MRSA252 A45G24C18T28 A58G24C19T51 A41G36C12T43 A51G29C21T31
CDC0012 MRSA252 A45G24C18T28 A58G24C19T51 A41G36C12T43 A51G29C21T31
CDC0021 MRSA252 A45G24C18T28 A58G24C19T51 A41G36C12T43 A51G29C21T31
CDC0023 ST:110 A45G24C18T28 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC0025 ST:110 A45G24C18T28 A59G24C18T51 A40G36C13T43 A50G30C20T32
CDC005 ST:338 A44G24C18T29 A59G23C19T51 A40G36C14T42 A51G29C21T31
CDC0018 ST:338 A44G24C18T29 A59G23C19T51 A40G36C14T42 A51G29C21T31
CDC002 ST:108 A46G23C20T26 A58G24C19T51 A42G36C12T42 A51G29C20T32
Figure imgf000207_0001
Table 22B: Triangulation Genotyping Analysis of Blinded Samples of Various Strains of Staphylococcus aureus with Primer Pair Nos. 2157, 2161, 2163 and 2166
Sample Index No. Strain Primer Pair No. 2157 (pta) Primer Pair No. 2161 (tpi) Primer Pair No. 2163 (yqi) Primer Pair No. 2166 (yqi)
CDC0010 COL A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC0015 COL A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC0019 COL A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC0026 COL A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC0030 COL A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC004 COL A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC0014 COL A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC008 unknown A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC001 Mu50 A33G25C22T29 A50G28C22T29 A42G36C22T43 A36G31C19T36
CDC0022 Mu50 A33G25C22T29 A50G28C22T29 A42G36C22T43 A36G31C19T36
CDC006 Mu50 A33G25C22T29 A50G28C22T29 A42G36C22T43 A36G31C19T36
CDC0011 MRSA252 A32G25C23T29 A50G28C22T29 A42G36C22T43 A37G30C18T37
CDC0012 MRSA252 A32G25C23T29 A50G28C22T29 A42G36C22T43 A37G30C18T37
CDC0021 MRSA252 A32G25C23T29 A50G28C22T29 A42G36C22T43 A37G30C18T37
CDC0023 ST:110 A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC0025 ST:110 A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
CDC005 ST:338 A32G25C24T28 A51G27C21T30 A42G36C22T43 A37G30C18T37
CDC0018 ST:338 A32G25C24T28 A51G27C21T30 A42G36C22T43 A37G30C18T37
CDC002 ST:108 A33G25C23T28 A50G28C22T29 A42G36C22T43 A37G30C18T37
CDC0028 ST:108 A33G25C23T28 A50G28C22T29 A42G36C22T43 A37G30C18T37
CDC003 ST:107 A32G25C23T29 A51G28C22T28 A41G37C22T43 A37G30C18T37
Figure imgf000209_0001
[317] Note: *** The sample CDC0031 was identified as Staphylococcus schleiferi as indicated in Example 14. Thus, the triangulation genotyping primers configured for Staphylococcus aureus would generally not be expected to prime and produce amplification products of this organism. Tables 22A and 22B indicate that amplification products are obtained for this organism only with primer pair numbers 2157 and 2161.
[318] A total of thirteen different genotypes of Staphylococcus aureus were identified according to the unique combinations of base compositions across the eight different bioagent identifying amplicons obtained with the eight primer pairs in Table 21A. These results indicate that this eight primer pair panel is useful for analysis of unknown or newly emerging strains of Staphylococcus aureus. It is expected that a kit comprising one or more of the members of the panels in Tables 21A and 21B will be a useful embodiment provided herein. It is envisioned that a kit comprising the primer pairs of Table 21C, or another combination of primer pairs from Examples 14 and 15, would be a useful embodiment provided herein that could be useful in identification of Staphylococcus aureus bioagents at multiple levels.
Example 16: Selection and Use of Triangulation Genotyping Analysis Primer Pairs for Members of the Bacterial Genus Vibrio
[319] To combine the power of high-throughput mass spectrometric analysis of bioagent identifying amplicons with the sub-species characteristic resolving power provided by triangulation genotyping analysis, a panel of eight triangulation genotyping analysis primer pairs was selected. The primer pairs are configured to produce bioagent identifying amplicons within seven different housekeeping genes which are listed in Table 23. The primer sequences are found in Table 2 and are cross-referenced by the primer pair numbers, primer pair names or SEQ ID NOs listed in Table 23.
Table 23: Primer Pairs for Triangulation Genotyping Analysis of Members of the Bacterial Genus Vibrio
Primer Pair No. Forward Primer Name Forward Primer (SEQ ID NO:) Reverse Primer Name Reverse Primer (SEQ ID NO:) Target Gene
1098 RNASEP_VBC_331_349_F 325 RNASEP_VBC_388_414_R 1163 RNAse P
2000 CTXB_NC002505_46_70_F 278 CTXB_NC002505_132_162_R 1039 ctxB
2001 FUR_NC002505_87_113_F 465 FUR_NC002505_205_228_R 1037 fur
2011 GYRB_NC002505_1161_1190_F 148 GYRB_NC002505_1255_1284_R 1172 gyrB
2012 OMPU_NC002505_85_110_F 190 OMPU_NC002505_154_180_R 1254 ompU
2014 OMPU_NC002505_431_455_F 266 OMPU_NC002505_544_567_R 1094 ompU
2323 CTXA_NC002505-1568114-1567341_122_149_F 508 CTXA_NC002505-1568114-1567341_186_214_R 1297 ctxA
2927 GAPA_NC002505_694_721_F 259 GAPA_NC002505_29_58_R 1060 gapA
[320] A group of 50 bacterial isolates containing multiple strains of both environmental and clinical isolates of Vibrio cholerae, 9 other Vibrio species, and 3 species of Photobacteria were tested using this panel of primer pairs. Base compositions of amplification products obtained with these 8 primer pairs were used to distinguish amongst various species tested, including sub-species differentiation within Vibrio cholerae isolates. For instance, the non-O1/non-O139 isolates were clearly resolved from the O1 and the O139 isolates, as were several of the environmental isolates of Vibrio cholerae from the clinical isolates.
[321] It is expected that a kit comprising one or more of the members of this panel will be a useful embodiment of the present invention.
Example 17: Selection and Use of Triangulation Genotyping Analysis Primer Pairs for Members of the Bacterial Genus Pseudomonas
[322] To combine the power of high-throughput mass spectrometric analysis of bioagent identifying amplicons with the sub-species characteristic resolving power provided by triangulation genotyping analysis, a panel of twelve triangulation genotyping analysis primer pairs was selected. The primer pairs are configured to produce bioagent identifying amplicons within seven different housekeeping genes which are listed in Table 24. The primer sequences are found in Table 2 and are cross-referenced by the primer pair numbers, primer pair names or SEQ ID NOs listed in Table 24.
Table 24: Primer Pairs for Triangulation Genotyping Analysis of Members of the Bacterial Genus Pseudomonas
Figure imgf000211_0001
Figure imgf000212_0001
[323] It is expected that a kit comprising one or more of the members of this panel will be a useful embodiment of the present invention.
Example 18: Analysis Involving a Staphylococcus aureus tsst1 Gene Calibrant Polynucleotide
[324] Primer pairs 3105, 3106, and 3107 were used in respective dilution series analyses in which the amplification target was a calibrant polynucleotide comprising a segment of the Staphylococcus aureus tsst1 gene. The individual primer sequences of primer pairs 3105, 3106, and 3107 are found in Table 2. Aside from varied calibrant polynucleotide copy numbers, the amplification reaction conditions, PCR product purification protocol, and base composition analysis utilized in this example were the same as those described in Examples 2-4, above. The results of this analysis are provided in Table 25, where the average calibrant polynucleotide copy numbers utilized in the various reactions are specified and "X" denotes that the calibrant polynucleotide was detected in the particular reaction mixture.
Table 25
Figure imgf000213_0001
Example 19: Analysis of Isolated Clinical Samples
[325] Primer pair 3106, which targets the Staphylococcus aureus tsst1 gene, was used against eight isolated clinical samples received from the CDC. The sequences of primer pair 3106 (i.e., SEQ ID NOS: 1465 and 1466) are found in Table 2. Each sample was analyzed in two parallel replicates, as was a control reaction that only included a calibrant polynucleotide (i.e., the control was the same as the other replicates aside from lacking DNA from any of the clinical samples). The amplification reaction conditions, PCR product purification protocol, and base composition analysis utilized in this example were the same as those described in Examples 2-4, above. The results of this analysis are provided in Table 26. As expected, two of the clinical samples were positive for Staphylococcus aureus (i.e., CDC0011 and CDC0021).
Table 26
Figure imgf000213_0002
Figure imgf000214_0001
[326] The present invention includes any combination of the various species and subgeneric groupings falling within the generic disclosure. This invention therefore includes the generic description of the invention with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.
[327] While, in accordance with the patent statutes, descriptions of the various embodiments and examples have been provided, the scope of the invention is not to be limited thereto or thereby. Modifications and alterations of the present invention will be apparent to those skilled in the art without departing from the scope and spirit of the present invention.
[328] Therefore, it will be appreciated that the scope of this invention is to be defined by the appended claims, rather than by the specific examples which have been presented by way of example.
[329] Each reference (including, but not limited to, journal articles, U.S. and non-U.S. patents, patent application publications, international patent application publications, gene bank gi or accession numbers, internet web sites, and the like) cited in the present application is incorporated herein by reference in its entirety.

Claims

What is claimed is:
1. A kit comprising an oligonucleotide primer pair comprising a forward primer and a reverse primer, each comprising between 13 and 35 linked nucleotides in length, wherein: the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1465 and the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1466, the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1467 and the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1468, or the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1469 and the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1470.
2. The kit of claim 1, further comprising at least one additional oligonucleotide primer pair that is configured to generate an amplicon between 45 and 200 linked nucleotides in length, and comprises a forward and a reverse primer, each comprising between 13 and 35 linked nucleotides in length and each configured to hybridize to conserved sequence regions within a Staphylococcus aureus gene, said gene selected from the group consisting of: ermA, ermC, pvluk, nuc, tufB, mecA, mec-R1, tsst1, and mupR.
3. The kit of claim 2, wherein said oligonucleotide primer pair and said at least one additional oligonucleotide primer pair comprises eight primer pairs, said eight oligonucleotide primer pairs having at least 70% sequence identity to: SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 1465:SEQ ID NO.: 1466, SEQ ID NO.: 1467:SEQ ID NO.: 1468, or SEQ ID NO.: 1469:SEQ ID NO.: 1470.
4. The kit of claim 3 wherein said eight oligonucleotide primers consist of SEQ ID NO.: 288:SEQ ID NO.: 1269, SEQ ID NO.: 698:SEQ ID NO.: 1420, SEQ ID NO.: 217:SEQ ID NO.: 1167, SEQ ID NO.: 399:SEQ ID NO.: 1041, SEQ ID NO.: 456:SEQ ID NO.: 1261, SEQ ID NO.: 430:SEQ ID NO.: 1321, SEQ ID NO.: 174:SEQ ID NO.: 853, and SEQ ID NO.: 1465:SEQ ID NO.: 1466, SEQ ID NO.: 1467:SEQ ID NO.: 1468, or SEQ ID NO.: 1469:SEQ ID NO.: 1470.
5. The kit of claim 4 further comprising eight additional primer pairs, said eight additional primer pairs comprising at least 70% sequence identity with: SEQ ID NO.: 437:SEQ ID NO.: 1232, SEQ ID NO.: 530:SEQ ID NO.: 891, SEQ ID NO.: 474:SEQ ID NO.: 869, SEQ ID NO.: 268:SEQ ID NO.: 1284, SEQ ID NO.: 418:SEQ ID NO.: 1301, SEQ ID NO.: 318:SEQ ID NO.: 1300, SEQ ID NO.: 440:SEQ ID NO.: 1076, and SEQ ID NO.: 219:SEQ ID NO.: 1013.
6. An oligonucleotide primer pair comprising a forward primer and a reverse primer, each comprising between 13 and 35 linked nucleotides in length, wherein the forward primer comprises at least 70% sequence identity with SEQ ID NO.: 1465, SEQ ID NO.: 1467, or SEQ ID NO.: 1469.
7. The oligonucleotide primer pair of claim 6, wherein said forward primer comprises at least 80% sequence identity with SEQ ID NO.: 1465, SEQ ID NO.: 1467, or SEQ ID NO.: 1469.
8. The oligonucleotide primer pair of claim 6, wherein said forward primer comprises at least 90% sequence identity with SEQ ID NO.: 1465, SEQ ID NO.: 1467, or SEQ ID NO.: 1469.
9. The oligonucleotide primer pair of claim 6, wherein said forward primer comprises at least 95% sequence identity with SEQ ID NO.: 1465, SEQ ID NO.: 1467, or SEQ ID NO.: 1469.
10. The oligonucleotide primer pair of claim 6, wherein said forward primer comprises at least 100% sequence identity with SEQ ID NO.: 1465, SEQ ID NO.: 1467, or SEQ ID NO.: 1469.
11. The oligonucleotide primer pair of claim 6, wherein said forward primer is SEQ ID NO.: 1465, SEQ ID NO.: 1467, or SEQ ID NO.: 1469 with 0-10 nucleobase deletions, insertions and/or substitutions.
12. The oligonucleotide primer pair of claim 6, wherein said forward primer is SEQ ID NO.: 1465, SEQ ID NO.: 1467, or SEQ ID NO.: 1469.
13. A composition comprising the oligonucleotide primer of claim 6.
14. The oligonucleotide primer pair of claim 6, wherein at least one of said forward primer and said reverse primer comprises at least one modified nucleobase.
15. The oligonucleotide primer pair of claim 14, wherein at least one of said at least one modified nucleobase is a mass modified nucleobase.
16. The oligonucleotide primer pair of claim 15, wherein said mass modified nucleobase is 5-Iodo-C.
17. The oligonucleotide primer pair of claim 15, wherein said mass modified nucleobase comprises a molecular mass modifying tag.
18. The oligonucleotide primer pair of claim 14, wherein at least one of said at least one modified nucleobase is a universal nucleobase.
19. The oligonucleotide primer pair of claim 18, wherein said universal nucleobase is inosine.
20. The oligonucleotide primer pair of claim 6, wherein at least one of said forward primer and said reverse primer comprises a non-templated T residue at its 5' end.
21. An oligonucleotide primer pair comprising a forward primer and a reverse primer, each comprising between 13 and 35 linked nucleotides in length, wherein the reverse primer comprises at least 70% sequence identity with SEQ ID NO.: 1466, SEQ ID NO.: 1468, or SEQ ID NO.: 1470.
22. The oligonucleotide primer pair of claim 13, wherein said reverse primer comprises at least 80% sequence identity with SEQ ID NO.: 1466, SEQ ID NO.: 1468, or SEQ ID NO.: 1470.
23. The oligonucleotide primer pair of claim 13, wherein said reverse primer comprises at least 90% sequence identity with SEQ ID NO.: 1466, SEQ ID NO.: 1468, or SEQ ID NO.: 1470.
24. The oligonucleotide primer pair of claim 13, wherein said reverse primer comprises at least 95% sequence identity with SEQ ID NO.: 1466, SEQ ID NO.: 1468, or SEQ ID NO.: 1470.
25. The oligonucleotide primer pair of claim 13, wherein said reverse primer comprises at least 100% sequence identity with SEQ ID NO.: 1466, SEQ ID NO.: 1468, or SEQ ID NO.: 1470.
26. The oligonucleotide primer pair of claim 13, wherein said reverse primer is SEQ ID NO.: 1466, SEQ ID NO.: 1468, or SEQ ID NO.: 1470 with 0-10 nucleobase deletions, insertions and/or substitutions.
27. The oligonucleotide primer pair of claim 13, wherein said reverse primer is SEQ ID NO.: 1466, SEQ ID NO.: 1468, or SEQ ID NO.: 1470.
28. A composition comprising the oligonucleotide primer of claim 21.
29. The oligonucleotide primer pair of claim 21, wherein at least one of said forward primer and said reverse primer comprises at least one modified nucleobase.
30. The oligonucleotide primer pair of claim 29, wherein at least one of said at least one modified nucleobase is a mass modified nucleobase.
31. The oligonucleotide primer pair of claim 30, wherein said mass modified nucleobase is 5-Iodo-C.
32. The oligonucleotide primer pair of claim 30, wherein said mass modified nucleobase comprises a molecular mass modifying tag.
33. The oligonucleotide primer pair of claim 29, wherein at least one of said at least one modified nucleobase is a universal nucleobase.
34. The oligonucleotide primer pair of claim 33, wherein said universal nucleobase is inosine.
35. The oligonucleotide primer pair of claim 21, wherein at least one of said forward primer and said reverse primer comprises a non-templated T residue at its 5' end.
36. A method for identifying a Staphylococcus aureus bioagent in a sample comprising: a) amplifying a nucleic acid from said sample using an oligonucleotide primer pair comprising a forward primer and a reverse primer, each comprising between 13 and 35 linked nucleotides in length, said forward primer comprising at least 70% sequence identity with SEQ ID NO.: 1465, SEQ ID NO.: 1467, or SEQ ID NO.: 1469 and said reverse primer comprising at least 70% sequence identity with SEQ ID NO.: 1466, SEQ ID NO.: 1468, or SEQ ID NO.: 1470, wherein said amplifying generates at least one amplification product that comprises between 45 and 200 linked nucleotides; and b) determining the molecular mass of said at least one amplification product by mass spectrometry.
37. The method of claim 36, further comprising comparing said determined molecular mass to a database comprising a plurality of molecular masses of bioagent identifying amplicons, wherein a match between said determined molecular mass and a molecular mass comprised in said database identifies said Staphylococcus aureus bioagent in said sample.
38. The method of claim 36, further comprising calculating a base composition of said at least one amplification product using said molecular mass.
39. The method of claim 38, further comprising comparing said calculated base composition to a database comprising a plurality of base compositions of bioagent identifying amplicons, wherein a match between said calculated base composition and a base composition comprised in said database identifies said Staphylococcus aureus bioagent in said sample.
40. The method of claim 36, further comprising repeating said amplifying and determining steps using at least one additional oligonucleotide primer pair, wherein the primers of each of said at least one additional primer pair are configured to hybridize to conserved sequence regions within a Staphylococcus aureus gene selected from the group consisting of ermA, ermC, pvluk, nuc, tufB, mecA, mec-R1, tsst1, and mupR.
41. The method of claim 36, wherein said identifying comprises detecting the presence of said Staphylococcus aureus bioagent in said sample.
42. The method of claim 36, wherein said identifying comprises determining the presence or absence of virulence of said Staphylococcus aureus bioagent in said sample.
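Claims 36 to 39 recite determining a molecular mass, calculating a base composition from it, and comparing that composition to a database. The following Python sketch illustrates one way such a calculation could be arranged; it is not the applicant's implementation. The residue masses are approximate average values for a single strand, the terminal correction assumes a strand with no 5' phosphate, and the function names, the tolerance, and the database layout are all hypothetical.

```python
# Approximate average masses (Da) of nucleotide residues within a DNA strand.
# Illustrative assumption: a real analysis would use instrument-appropriate
# masses and would typically consider both strands of the amplicon.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
TERMINAL_CORRECTION = -61.96  # assumed: strand without a 5' phosphate


def strand_mass(comp):
    """Mass of one strand from a base composition such as {'A': 15, ...}."""
    return sum(RESIDUE_MASS[b] * n for b, n in comp.items()) + TERMINAL_CORRECTION


def candidate_compositions(mass, min_len=45, max_len=200, tol=0.5):
    """List every base composition whose computed mass lies within tol of mass.

    tol must stay below half the mass of a T residue so that rounding
    yields the only possible T count for each (A, C, G) combination.
    """
    m = RESIDUE_MASS
    target = mass - TERMINAL_CORRECTION
    hits = []
    for a in range(max_len + 1):
        for c in range(max_len + 1 - a):
            for g in range(max_len + 1 - a - c):
                rest = target - (a * m["A"] + c * m["C"] + g * m["G"])
                if rest < -tol:
                    break  # adding more G only overshoots further
                t = round(rest / m["T"])
                n = a + c + g + t
                if t >= 0 and min_len <= n <= max_len and abs(rest - t * m["T"]) <= tol:
                    hits.append({"A": a, "C": c, "G": g, "T": t})
    return hits


def identify(mass, database, **kwargs):
    """Return names of database entries whose composition matches the mass."""
    candidates = candidate_compositions(mass, **kwargs)
    return [name for name, comp in database.items() if comp in candidates]
```

A single measured mass can be consistent with more than one composition at a given tolerance, so this sketch returns every match rather than asserting a unique identification. The default length bounds follow the 45 to 200 nucleotide amplicon range recited in claim 36.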

Priority Applications (2)

Application Number Priority Date Filing Date Title
US12/532,809 US20110256541A1 (en) 2007-03-23 2008-03-20 Compositions for use in identification of bacteria
EP08780484A EP2076612A2 (en) 2007-03-23 2008-03-20 Compositions for use in identification of bacteria

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US89682207P 2007-03-23 2007-03-23
US89681307P 2007-03-23 2007-03-23
US60/896,813 2007-03-23
US60/896,822 2007-03-23

Publications (2)

Publication Number Publication Date
WO2008127839A2 (en) 2008-10-23
WO2008127839A3 (en) 2009-02-26

Family

ID=39758816

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/057717 WO2008127839A2 (en) 2007-03-23 2008-03-20 Compositions for use in identification of bacteria

Country Status (3)

Country Link
US (1) US20110256541A1 (en)
EP (1) EP2076612A2 (en)
WO (1) WO2008127839A2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008151023A2 (en) 2007-06-01 2008-12-11 Ibis Biosciences, Inc. Methods and compositions for multiple displacement amplification of nucleic acids
CN110273026B (en) * 2019-06-20 2022-07-05 广州达安基因股份有限公司 Multiple detection kit and detection method for respiratory tract infection
US11376588B2 (en) 2020-06-10 2022-07-05 Checkable Medical Incorporated In vitro diagnostic device

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0549477A (en) * 1991-08-05 1993-03-02 Wakunaga Pharmaceut Co Ltd Detection of bacteria of the genus staphylococcus
DE69433201T2 (en) * 1994-02-28 2004-07-29 Shimadzu Corp. Oligonucleotides and methods for the detection of bacteria
EP1629124A2 (en) * 2003-06-05 2006-03-01 Wyeth Nucleic acid arrays for detecting multiple strains of a non-viral species
US20120122101A1 (en) * 2003-09-11 2012-05-17 Rangarajan Sampath Compositions for use in identification of bacteria
WO2006116127A2 (en) * 2005-04-21 2006-11-02 Isis Pharmaceuticals, Inc. Compositions for use in identification of bacteria
WO2006116010A2 (en) * 2005-04-21 2006-11-02 Advandx, Inc. Detection of virulence markers of staphylococci
CA2663029C (en) * 2006-09-14 2016-07-19 Ibis Biosciences, Inc. Targeted whole genome amplification method for identification of pathogens

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
DATABASE Geneseq [Online] 11 January 2007 (2007-01-11), "Bacterial DNA PCR primer SEQ ID NO:874." XP002497027 retrieved from EBI accession no. GSN:AEM14131 Database accession no. AEM14131 *
KURODA M ET AL: "Whole genome sequencing of meticillin-resistant Staphylococcus aureus" LANCET THE, LANCET LIMITED. LONDON, GB, vol. 357, no. 9264, 21 April 2001 (2001-04-21), pages 1225-1240, XP004806262 ISSN: 0140-6736 *
MEHROTRA M ET AL: "Multiplex PCR for detection of genes for Staphylococcus aureus enterotoxins, exfoliative toxins, toxic shock syndrome toxin 1, and methicillin resistance" JOURNAL OF CLINICAL MICROBIOLOGY, WASHINGTON, DC, US, vol. 38, no. 3, 1 March 2000 (2000-03-01), pages 1032-1035, XP002224611 ISSN: 0095-1137 *
See also references of EP2076612A2 *

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7956175B2 (en) 2003-09-11 2011-06-07 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US8013142B2 (en) 2003-09-11 2011-09-06 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US8242254B2 (en) 2003-09-11 2012-08-14 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US8288523B2 (en) 2003-09-11 2012-10-16 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
US8394945B2 (en) 2003-09-11 2013-03-12 Ibis Biosciences, Inc. Compositions for use in identification of bacteria
WO2010076013A1 (en) * 2008-12-30 2010-07-08 Qiagen Gmbh Method for detecting methicillin-resistant staphylococcus aureus (mrsa) strains
WO2011047307A1 (en) 2009-10-15 2011-04-21 Ibis Biosciences, Inc. Multiple displacement amplification
EP2957641A1 (en) 2009-10-15 2015-12-23 Ibis Biosciences, Inc. Multiple displacement amplification
EP3225695A1 (en) 2009-10-15 2017-10-04 Ibis Biosciences, Inc. Multiple displacement amplification
US9890408B2 (en) 2009-10-15 2018-02-13 Ibis Biosciences, Inc. Multiple displacement amplification

Also Published As

Publication number Publication date
US20110256541A1 (en) 2011-10-20
EP2076612A2 (en) 2009-07-08
WO2008127839A3 (en) 2009-02-26

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 08780484; Country of ref document: EP; Kind code of ref document: A2)
WWE WIPO information: entry into national phase (Ref document number: 2008780484; Country of ref document: EP)
NENP Non-entry into the national phase (Ref country code: DE)
WWE WIPO information: entry into national phase (Ref document number: 12532809; Country of ref document: US)