WO2007108829A9

WO2007108829A9 - Tuberculosis nucleic acids, polypeptides and immunogenic compositions

Info

Publication number: WO2007108829A9
Application number: PCT/US2006/041526
Authority: WO
Inventors: David Roth; Huaping He
Original assignee: Gene Therapy Systems Inc; David Roth; Huaping He
Priority date: 2005-10-26
Filing date: 2006-10-25
Publication date: 2007-11-08
Also published as: WO2007108829A3; WO2007108829A2

Abstract

The present invention provides transcriptionally active Mtb polynucleotides, recombinant Mtb peptides and polypeptides, and immunogenic Mtb antigens. Immunogenic compositions are also provided that may be useful as recombinant, subunit and DNA vaccines. In addition the invention provides diagnostic kits for Mtb.

Description

TUBERCULOSIS NUCLEIC ACIDS, POLYPEPTIDES AND IMMUNOGENIC COMPOSITIONS

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. Provisional Application No. 60/730,951, filed October 26, 2005, entitled "TUBERCULOSIS NUCLEIC ACIDS, POLYPEPTIDES AND IMMUNOGENIC COMPOSITIONS", the disclosure of which is hereby expressly incorporated by reference in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED R&D

[0002] The research leading to the present invention was supported, at least in part, by a grant from the National Institute of Allergy and Infectious Diseases. Accordingly, the Government may have certain rights in the invention.

BACKGROUND OF THE INVENTION

Field of the Invention

[0003] The present invention relates to immunogenic Mycobacterium tuberculosis (Mtb) peptides, polypeptides, polynucleotides encoding immunogenic Mtb peptides and polypeptides and immunogenic compositions comprising Mtb polypeptides or polynucleotides.

Background

[0004] Traditional vaccine technology suffers from the problem that it often produces various degrees of immunogenicity in different hosts. Often, the only reliably immunogenic composition is a pathogenic microorganism. But the manufacture and administration of the pathogenic organism carries a risk of infection by the very pathogen the vaccine is designed to treat. Furthermore, recent well-publicized problems with influenza vaccine production highlight the difficulties in producing large quantities of conventional vaccines and the precarious state of worldwide vaccine supplies. In light of general health concerns and the growing threat of bioterrorism, there is a need to develop recombinant and subunit vaccines capable of inducing an appropriate immune response in the context of multiple and genetically diverse hosts. This approach requires the identification of a number of specific antigenic polypeptides. One of the most difficult tasks in developing a protective or therapeutic vaccine, be it a recombinant or genetic, subunit or multi-valent vaccine, is the identification of the appropriate antigens that can stimulate the most rapid, sustained and efficacious immune responses against a particular pathogen for protection and/or therapeutic effect. This is especially challenging when the genome of the pathogen is large and screening for immunogenic antigens is tedious.

[0005] Tuberculosis is a chronic infectious disease that kills approximately 3 million people per year. It has been estimated that two billion people are infected with M. tuberculosis worldwide, including 7.5 million with active cases of tuberculosis. In recent years there has been an unexpected rise in tuberculosis cases.

[0006] In the U.S., tuberculosis continues to be a major problem especially among the homeless, Native Americans, African-Americans, immigrants, and the elderly. Immunocompromised individuals are particularly susceptible to tuberculosis. Of the 88 million new cases of tuberculosis projected in this decade, approximately 10% are expected to be attributable to HIV infection. The emergence of AIDS has reactivated millions of dormant cases of tuberculosis (Mtb), causing a sharp rise in the number of tuberculosis-associated deaths.

[0007] The only available vaccine for tuberculosis, BCG, is both unpredictable and highly variable in protective efficacy. Hundreds of millions of children and newborns have been vaccinated with BCG, yet this has not consistently stopped the spread of the disease. Tuberculosis has become one of the fastest spreading infectious diseases in both industrialized and developing countries worldwide. Doubtful efficacy of vaccination has spurred interest in developing effective alternatives to BCG.

[0008] The emergence of multi-drug resistant strains of M. tuberculosis (Mtb) has complicated matters further, with some experts predicting a new tuberculosis epidemic. In the U.S., about 14% of M. tuberculosis isolates are resistant to at least one drug, and approximately 3% are resistant to at least two drugs. Some M. tuberculosis strains have been isolated that are resistant to as many as seven drugs in the repertoire of drugs commonly used to combat tuberculosis. Resistant strains make treatment of tuberculosis extremely difficult, leading to a mortality rate of about 90%, which is one of the reasons Mtb has gained priority as a defined CDC Category C biodefense organism.

[0009] In the current age, where treatment of tuberculosis is becoming more challenging and immunosuppressive diseases are more prevalent, new vaccines are essential. Thus, there is a need for developing and commercializing effective and reliable Mtb vaccines. In addition, there is a considerable need for additional diagnostic tests or tests to detect active TB in the face of other diseases such as HIV.

SUMMARY OF THE INVENTION

[0010] The present invention provides isolated polynucleotides encoding Mtb polypeptides that are antigenic in any mammal, including SEQ ID NOS: 46-64, 110-121, and fragments thereof, that encode antigenic polypeptides. Also provided are isolated polynucleotides that have at least 85%, 90%, 95%, 96%, 97%, 98% or 99% nucleic acid sequence identity to the nucleic acid sequences of SEQ ID NOs: 46-64, 110-121 or fragments thereof, and which encode antigenic or immunogenic polypeptides. Further provided herein are isolated polynucleotides that consist essentially of the sequence of SEQ ID NOs: 46-64, 110-121 or fragments thereof that encode antigenic or immunogenic polypeptides. Additionally, provided herein are isolated polynucleotides that hybridize under stringent conditions to the nucleic acid sequences of SEQ ID NOs: 46-64, 110-121 or fragments thereof that encode antigenic or immunogenic polypeptides, or the complements thereof. In some embodiments, the polynucleotides disclosed herein can be optimized for codons most frequently used in an animal host, particularly a human host. The mammal can be, for example, a mouse, rabbit, non-human primate, or human. The invention also provides isolated polynucleotides encoding immunogenic Mtb antigens including SEQ ID NOS: 46-64, 110-121 and fragments thereof that encode immunogenic polypeptides. In some embodiments, immunogenic Mtb antigens react with polyclonal antibodies directed to Mtb bacteria (Mtb) from at least two different species. In another embodiment, immunogenic Mtb antigens are detected by ELISA, Western blotting, or both using polyclonal antibodies that are directed to Mtb bacteria.

[0011] The present invention also provides TAP polynucleotides, e.g., polynucleotides produced by Transcriptionally-Activated PCR (TAP) technology as described in U.S. Patent 6,280,977, which is expressly incorporated herein by reference. Such polynucleotides include a 5' TAP polynucleotide sequence, a Mtb polynucleotide sequence, and a 3' TAP polynucleotide sequence. The Mtb polynucleotide sequence can, for example, comprise one of SEQ ID NOS: 46-64 and 110-121. In some embodiments, the Mtb polynucleotide sequences have at least 85%, 90%, 95%, 96%, 97%, 98% or 99% nucleic acid sequence identity to the nucleic acid sequences of SEQ ID NOs: 46-64, 110-121 or fragments thereof, and encode antigenic or immunogenic polypeptides. In some embodiments, the Mtb polynucleotides can consist essentially of the nucleic acid of SEQ ID NOs: 46-64, 110-121 or fragments thereof, and encode immunogenic or antigenic polypeptides. In some embodiments, the Mtb polynucleotides comprise nucleic acids that hybridize under stringent conditions to the nucleic acid sequences of SEQ ID NOs: 46-64, 110-121 or fragments thereof, which encode antigenic or immunogenic polypeptides, or the complements thereof. Further, the Mtb polynucleotides can be optimized for codons most frequently used in an animal host, preferably a human host. In some embodiments, the 5' TAP polynucleotide sequence comprises a promoter. In certain embodiments, the 5' TAP polynucleotide sequence is selected from SEQ ID NOS: 2, 3, 6, and 84. In some embodiments the 3' TAP polynucleotide sequence comprises a terminator. In certain embodiments, the 3' TAP polynucleotide sequence is selected from SEQ ID NOS: 4, 5, 7, and 85.

[0012] Also provided are primers that hybridize to Mtb sequences. The primers can be at least 12 residues in length, and hybridize under stringent conditions to at least 12 consecutive bases of a nucleic acid sequence selected from SEQ ID NOs: 8-45 and 86-109 or the complement thereof.
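
The length and "12 consecutive bases" criteria can be illustrated with a minimal sketch. It assumes, as a deliberate simplification, that exact Watson-Crick complementarity over a 12-base run stands in for hybridization under stringent conditions, which in practice also depends on temperature, salt concentration and mismatches. The function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shares_run(primer: str, target: str, min_run: int = 12) -> bool:
    """True if some run of min_run consecutive primer bases occurs in target."""
    primer, target = primer.upper(), target.upper()
    return any(
        primer[i:i + min_run] in target
        for i in range(len(primer) - min_run + 1)
    )


def may_hybridize(primer: str, target: str, min_run: int = 12) -> bool:
    """Crude proxy for hybridizing to the target or to its complement.

    A primer anneals to the target strand when the primer's reverse
    complement occurs in the target, and to the complement of the target
    when the primer itself occurs in the target.
    """
    return (
        shares_run(reverse_complement(primer), target, min_run)
        or shares_run(primer, target, min_run)
    )


# Hypothetical 25-base target, invented for illustration only.
target = "ATGGCCGATTACGGCTTAAGCCTGA"
forward = target[:14]                        # identical to the 5' end
reverse = reverse_complement(target[-14:])   # anneals to the 3' end

forward_ok = may_hybridize(forward, target)
reverse_ok = may_hybridize(reverse, target)
too_short = may_hybridize(forward[:11], target)   # only 11 bases
unrelated = may_hybridize("T" * 14, target)
```

Under these assumptions the forward and reverse primers pass, while an 11-base primer and an unrelated primer do not.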

[0013] Also provided are primer pairs for amplifying an Mtb polynucleotide. These primer pairs include SEQ ID NOS: 8 and 9; 10 and 11; 12 and 13; 14 and 15; 16 and 17; 18 and 19; 20 and 21; 22 and 23; 24 and 25; 26 and 27; 28 and 29; 30 and 31; 32 and 33; 34 and 35; 36 and 37; 38 and 39; 40 and 41; 42 and 43; 44 and 45; 86 and 87; 88 and 89; 90 and 91; 92 and 93; 94 and 95; 96 and 97; 98 and 99; 100 and 101; 102 and 103; 104 and 105; 106 and 107; 108 and 109.
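
The pairing pattern listed above is regular: each pair consists of two consecutive SEQ ID NOs within the ranges 8-45 and 86-109. A short sketch, using a hypothetical helper named `consecutive_pairs`, enumerates the pairs and counts them.

```python
def consecutive_pairs(first: int, last: int) -> list[tuple[int, int]]:
    """Pair SEQ ID NOs as (first, first + 1), (first + 2, first + 3), ... up to last."""
    return [(n, n + 1) for n in range(first, last, 2)]


# The two primer ranges named in the text.
primer_pairs = consecutive_pairs(8, 45) + consecutive_pairs(86, 109)

first_block = len(consecutive_pairs(8, 45))     # pairs from SEQ ID NOS: 8-45
second_block = len(consecutive_pairs(86, 109))  # pairs from SEQ ID NOS: 86-109
total_pairs = len(primer_pairs)
```

The 19 and 12 pairs equal the number of polynucleotide sequences in SEQ ID NOS: 46-64 and 110-121 respectively, which is consistent with one primer pair per sequence, although the text here does not state which pair amplifies which sequence.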

[0014] Isolated antigenic Mtb peptides and polypeptides are provided, including SEQ ID NOS: 65-83, 122-133, and fragments thereof. Isolated Mtb peptides and polypeptides that are immunogenic, including SEQ ID NOS: 65-83, 122-133, and fragments thereof, that are immunogenic, are also provided. Further provided are Mtb peptides and polypeptides that have at least 85%, 90%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to SEQ ID NOS: 65-83, 122-133 or fragments thereof that are antigenic or immunogenic. In some embodiments, the Mtb polypeptides can consist essentially of the amino acids of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof. In some embodiments, the Mtb peptide and polypeptide sequences can be optimized for expression in an animal host, such as a human host. Embodiments also include polynucleotides encoding any of the peptides and polypeptides disclosed herein. In one embodiment immunogenic peptides and polypeptides react with polyclonal antibodies that are directed to Mtb bacteria (Mtb). In one aspect of this embodiment, the peptides and polypeptides react with polyclonal antibodies that are directed to Mtb bacteria from at least two different species. In another embodiment, immunogenic peptides and polypeptides are detected by ELISA, Western blotting or both using polyclonal antibodies that are directed to Mtb bacteria.

[0015] The present invention also includes recombinant Mtb peptides and polypeptides, wherein the amino terminus of the peptide or polypeptide comprises an HA tag or a (6x)His tag, and the carboxy terminus of the polypeptide is selected from the group consisting of: SEQ ID NOS: 65-83 and 122-133. In some embodiments, the carboxy terminus of the polypeptide can be a polypeptide that has at least 85%, 90%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to SEQ ID NOS: 65-83, 122-133, or a fragment thereof that is antigenic or immunogenic. In some embodiments, the carboxy terminus of the polypeptide is a polypeptide that consists essentially of the amino acids of SEQ ID NOs: 65-83, 122-133, or an immunogenic or antigenic fragment thereof. In some embodiments, the amino acid sequence of the carboxy terminus can be optimized for expression in humans. Also included are recombinant Mtb peptides and polypeptides, wherein the carboxy terminus of the polypeptide comprises an HA tag or a His tag, and the amino terminus of the polypeptide is selected from the group consisting of: SEQ ID NOS: 65-83 and 122-133. In some embodiments, the amino terminus of the polypeptide can be a polypeptide that has at least 85%, 90%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to SEQ ID NOS: 65-83, 122-133, or a fragment thereof that is antigenic or immunogenic. In some embodiments, the amino terminus of the polypeptide can be a polypeptide that consists essentially of the amino acids of SEQ ID NOs: 65-83, 122-133, or an immunogenic or antigenic fragment thereof. In some embodiments, the amino acid sequence of the amino terminus of the polypeptide can be optimized for expression in a particular host, such as a human host. The peptides and polypeptides of the invention may be expressed in an appropriate in vitro transcription and translation system, such as a T7 polymerase system.

[0016] Immunogenic compositions for inducing an immunological response in a mammalian host against Mtb are also included in the invention. In one embodiment, the immunogenic compositions comprise nucleic acids that encode and express in vivo in a mammalian host cell at least one immunogenic peptide or polypeptide, which may be any one of SEQ ID NOS: 65-83, 122-133, or immunogenic fragments thereof. In some embodiments, the polynucleotide can have at least 85%, 90%, 95%, 96%, 97%, 98% or 99% nucleic acid sequence identity to the nucleic acid sequences of SEQ ID NOs: 46-64, 110-121 or a fragment thereof that encodes an antigenic or immunogenic polypeptide. In some embodiments, the polynucleotides can consist essentially of the sequence of SEQ ID NOs: 46-64, 110-121 or fragments thereof, and encode an antigenic or immunogenic polypeptide. In some embodiments, the polynucleotides can be optimized for codon usage in a particular host, such as a human host. The nucleic acids can be, for example, plasmids or TAP fragments. The compositions can induce either a humoral- or cell-mediated immune response. Furthermore, the immunogenic compositions can include additional components, such as adjuvants, and can be used in other applications such as serodiagnostics.

[0017] Immunogenic compositions for inducing an immunological response in a mammalian host against Mtb of the invention can also comprise isolated Mtb peptides and/or polypeptides, such as SEQ ID NOS: 65-83, 122-133 and immunogenic fragments thereof. In some embodiments, the Mtb peptides and polypeptides can have at least 85%, 90%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to SEQ ID NOS: 65-83, 122-133, or fragments thereof and are antigenic or immunogenic. In some embodiments, the Mtb polypeptides can consist essentially of the amino acids of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof. In some embodiments, the amino acid sequence of the Mtb peptides and polypeptides can be optimized for expression in a particular host, such as a human host. In one embodiment, the immunogenic peptides and polypeptides are expressed in an in vitro transcription and translation system. The immunogenic peptide and polypeptide compositions can induce either a humoral- or cell-mediated immune response. Furthermore, the immunogenic peptide and polypeptide compositions can include additional components, such as adjuvants, and include other applications such as diagnostics.

[0018] Similarly, detection of Mtb, its constituent proteins, and/or its immunologically reactive products (e.g. antibodies) is clinically relevant for the diagnosis of Mtb, and to track the efficacy of therapeutic treatments for Mtb, especially as translated to serodiagnostic tests. The present application therefore provides antigens for detection of Mtb for immune assays, including humoral immune assays. These antigens are applicable to detection of active Mtb in the face of HIV and other mycobacterial coinfections, multi-drug resistant infections by Mtb (MDRI), and rapidly mutating forms of Mtb depending on genetic makeup, geographical location, and immunocompetency status.

[0019] As such, the present invention also provides diagnostic compositions, including one or more antibodies directed against the peptide epitopes identified herein. Also, the present invention provides diagnostic kits that include at least one or more of such antibodies.

BRIEF DESCRIPTION OF THE DRAWINGS

[0020] Figure 1 illustrates one method used to generate TAP Expression Fragments.

[0021] Figure 2 displays a method of amplifying multiple genes using TAP technology, expressing said gene products, then purifying and quantifying the resulting polypeptides.

[0022] Figure 3 demonstrates how a plurality of polypeptides from a target organism can be assayed to determine each polypeptide's ability to elicit a humoral immune response.

[0023] Figure 4 demonstrates how a plurality of polypeptides from a target organism can be assayed to determine each polypeptide's ability to elicit a cell-mediated response.

[0024] Figure 5 demonstrates that fluorescent proteins (goat IgG antibody) can be more effectively delivered into both NIH-3T3 cells (A&B) and human dendritic cells (C&D) with a protein delivery reagent (B&D) as opposed to without a protein delivery reagent (A&C).

[0025] Figure 6 shows the results of scanning the Mtb proteome for antigenic targets of humoral immunity by ELISA and Western blotting.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

[0026] Before proceeding further with a description of the specific embodiments of the present invention, a number of terms will be defined and described in detail.

[0027] Unless specific definitions are provided, the nomenclature utilized in connection with, and the laboratory procedures, techniques and methods described herein are those known in the art to which they pertain. Standard chemical symbols and abbreviations are used interchangeably with the full names represented by such symbols. Thus, for example, the terms "carbon" and "C" are understood to have identical meaning. Standard techniques may be used for chemical syntheses, chemical analyses, pharmaceutical preparation, formulation, delivery, and treatment of patients. Standard techniques may be used for recombinant DNA methodology, oligonucleotide synthesis, tissue culture and the like. Reactions and purification techniques may be performed e.g., using kits according to manufacturer's specifications, as commonly accomplished in the art or as described herein. The foregoing techniques and procedures may be generally performed according to conventional methods well known in the art and as described in various general or more specific references that are cited and discussed throughout the present specification. All references cited herein are incorporated by reference in their entirety and are not admitted to be prior art. See e.g., Sambrook et al. Molecular Cloning: A Laboratory Manual (3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (2000)), Harlow & Lane, Antibodies: A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1988)), which are incorporated herein by reference in their entirety for any purpose.

[0028] The terms "polynucleotide" and "nucleic acid (molecule)" are used interchangeably to refer to polymeric forms of nucleotides of any length. The polynucleotides may contain deoxyribonucleotides, ribonucleotides and/or their analogs. Nucleotides may have any three-dimensional structure, and may perform any function, known or unknown. The term "polynucleotide" includes single-stranded, double-stranded and triple helical molecules. The following are non-limiting embodiments of polynucleotides: a gene, a gene fragment, exons, introns, mRNA, tRNA, rRNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, cosmids, viruses and other vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes and primers. A nucleic acid molecule may also comprise modified nucleic acid molecules, such as methylated nucleic acid molecules and nucleic acid molecule analogs. Analogs of purines and pyrimidines are known in the art, and include, but are not limited to, aziridinylcytosine, 4-acetylcytosine, 5-fluorouracil, 5-bromouracil, 5-carboxymethylaminomethyl-2-thiouracil, 5-carboxymethylaminomethyluracil, inosine, N6-isopentenyladenine, 1-methyladenine, 1-methylpseudouracil, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, pseudouracil, 5-pentyluracil and 2,6-diaminopurine. The use of uracil as a substitute for thymine in a deoxyribonucleic acid is also considered an analogous form of pyrimidine.

[0029] Sugar modifications (e.g., 2'-O-methyl, 2'-fluoro and the like) and phosphate backbone modifications (e.g., morpholino, PNA, thioates, dithioates and the like) can be incorporated singly, or in combination, into the nucleic acid molecules of the present invention. In one embodiment, for example, a nucleic acid of the invention may comprise a modified sugar and a modified phosphate backbone. In another embodiment, a nucleic acid of the invention may comprise modifications to sugar, base and phosphate backbone.

[0030] "Oligonucleotide" refers generally to polynucleotides of between 5 and about 100 nucleotides of single- or double-stranded nucleic acid, typically DNA. Oligonucleotides are also known as oligomers or oligos and may be isolated from genes, or synthesized (e.g., chemically or enzymatically) by methods known in the art. A "primer" refers to an oligonucleotide, usually single-stranded, that provides a 3'-hydroxyl end for the initiation of enzyme-mediated nucleic acid synthesis. In some embodiments, primers can be 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, or more nucleotide residues in length, or any number in between.

[0031] "Peptide" generally refers to a short chain of amino acids linked by peptide bonds. Typically peptides comprise amino acid chains of about 2-100, more typically about 4-50, and most commonly about 6-20 amino acids. "Polypeptide" generally refers to individual straight or branched chain sequences of amino acids that are typically longer than peptides. "Polypeptides" usually comprise at least about 100 to about 1000 amino acids in length, more typically at least about 150 to about 600 amino acids, and frequently at least about 200 to about 500 amino acids. "Proteins" include single polypeptides as well as complexes of multiple polypeptide chains, which may be the same or different. Multiple chains in a protein may be characterized by secondary, tertiary and quaternary structure as well as the primary amino acid sequence structure; may be held together, for example, by disulfide bonds; and may include post-synthetic modifications such as, without limitation, glycosylation, phosphorylation, truncations or other processing. Antibodies such as IgG proteins, for example, are typically comprised of four polypeptide chains (i.e., two heavy and two light chains) that are held together by disulfide bonds. Furthermore, proteins may include additional components such as associated metals (e.g., iron, copper and sulfur), or other moieties. The definitions of peptides, polypeptides and proteins include, without limitation, biologically active and inactive forms; denatured and native forms; as well as variant, modified, truncated, hybrid, and chimeric forms thereof. Non-naturally occurring amino acids include, for example, beta-amino acids, as well as D, L and racemic configurations of hydrophobic amino acids.
Amino acid analogs include the D or L configuration of an amino acid having the following formula: -NH-CHR-CO-, wherein R is an aliphatic group, a substituted aliphatic group, a benzyl group, a substituted benzyl group, an aromatic group or a substituted aromatic group and wherein R does not correspond to the side chain of a naturally-occurring amino acid. As used herein, aliphatic groups include straight chained, branched or cyclic C1-C8 hydrocarbons which are completely saturated, which contain one or two heteroatoms such as nitrogen, oxygen or sulfur and/or which contain one or more units of unsaturation. Aromatic groups include carbocyclic aromatic groups such as phenyl and naphthyl and heterocyclic aromatic groups such as imidazolyl, indolyl, thienyl, furanyl, pyridyl, pyranyl, oxazolyl, benzothienyl, benzofuranyl, quinolinyl, isoquinolinyl and acridinyl. Suitable substituents on an aliphatic, aromatic or benzyl group include -OH, halogen (-Br, -Cl, -I and -F), -O(aliphatic, substituted aliphatic, benzyl, substituted benzyl, aryl or substituted aryl group), -CN, -NO₂, -COOH, -NH₂, -NH(aliphatic group, substituted aliphatic, benzyl, substituted benzyl, aryl or substituted aryl group), -N(aliphatic group, substituted aliphatic, benzyl, substituted benzyl, aryl or substituted aryl group)₂, -COO(aliphatic group, substituted aliphatic, benzyl, substituted benzyl, aryl or substituted aryl group), -CONH₂, -CONH(aliphatic, substituted aliphatic group, benzyl, substituted benzyl, aryl or substituted aryl group), -SH, -S(aliphatic, substituted aliphatic, benzyl, substituted benzyl, aromatic or substituted aromatic group) and -NH-C(=NH)-NH₂. A substituted benzylic or aromatic group can also have an aliphatic or substituted aliphatic group as a substituent. A substituted aliphatic group can also have a benzyl, substituted benzyl, aryl or substituted aryl group as a substituent.
A substituted aliphatic, substituted aromatic or substituted benzyl group can have one or more substituents. Modifying an amino acid substituent can increase, for example, the lipophilicity or hydrophobicity of natural amino acids which are hydrophilic.

[0032] Non-naturally occurring amino acids and amino acid analogs and salts thereof can be obtained commercially. Others can be synthesized by methods known in the art. Synthetic techniques are described, for example, in Greene and Wuts, "Protective Groups in Organic Synthesis", John Wiley and Sons, Chapters 5 and 7, 1991.

[0033] Hydrophobicity is generally defined with respect to the partition of an amino acid between a nonpolar solvent and water. Hydrophobic amino acids are those amino acids which show a preference for the nonpolar solvent. Relative hydrophobicity of amino acids can be expressed on a hydrophobicity scale on which glycine has the value 0.5. On such a scale, amino acids which have a preference for water have values below 0.5 and those that have a preference for nonpolar solvents have a value above 0.5. As used herein, the term hydrophobic amino acid refers to an amino acid that, on the hydrophobicity scale, has a value greater than or equal to 0.5; in other words, it has a tendency to partition into the nonpolar solvent which is at least equal to that of glycine.

[0034] The peptides, polypeptides and proteins of the present invention may be derived from any source or by any method, including, but not limited to extraction from naturally occurring tissues or other materials; recombinant production in host organisms such as bacteria, fungi, plant, insect or animal cells; and chemical synthesis using methods that will be well known to the skilled artisan.

[0035] "Variant polynucleotides" and "variant polypeptides" can be used to describe nucleic acid molecules or amino acid molecules that share a specified percent nucleic acid sequence or amino acid sequence identity with a reference sequence. Variant polynucleotides can also refer to nucleotides that are capable of hybridizing, preferably under stringent hybridization and wash conditions, to the reference polynucleotide sequences. "Variant polynucleotides" and "variant polypeptides" can also refer to polynucleotides that comprise nucleotide or non-naturally occurring amino acids or amino acid analogs, respectively. Nucleotide analogs and non-naturally occurring amino acids and amino acid analogs are discussed herein.

[0036] "Percent (%) nucleic acid sequence identity" is defined as the percentage of nucleotides in a candidate sequence that are identical with the nucleotides in the nucleic acid sequence of interest, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity. Alignment for purposes of determining percent nucleic acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or Megalign (DNASTAR) software. The ALIGN-2 program is publicly available through Genentech, Inc., South San Francisco, California.

[0037] In situations where ALIGN-2 is employed for nucleic acid sequence comparisons, the % nucleic acid sequence identity of a given nucleic acid sequence C to, with, or against a given nucleic acid sequence D (which can alternatively be phrased as a given nucleic acid sequence C that has or comprises a certain % nucleic acid sequence identity to, with, or against a given nucleic acid sequence D) is calculated as follows:

[0038] 100 times the fraction W/Z

[0039] where W is the number of nucleotides scored as identical matches by the sequence alignment program ALIGN-2 in that program's alignment of C and D, and where Z is the total number of nucleotides in D. It will be appreciated that where the length of nucleic acid sequence C is not equal to the length of nucleic acid sequence D, the % nucleic acid sequence identity of C to D will not equal the % nucleic acid sequence identity of D to C.
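
The W/Z calculation and its asymmetry can be sketched as follows. The sketch assumes the two sequences have already been aligned by an alignment program and are supplied as equal-length strings with `-` marking gaps; the function name `percent_identity` and the example sequences are hypothetical and do not reproduce the behavior of ALIGN-2 itself.

```python
def percent_identity(aligned_c: str, aligned_d: str) -> float:
    """Return 100 * W / Z for sequence C relative to sequence D.

    W = aligned positions where C and D carry the same nucleotide.
    Z = total number of nucleotides in D (gap characters excluded).
    Both inputs must come from the same alignment, so they have equal length.
    """
    if len(aligned_c) != len(aligned_d):
        raise ValueError("aligned sequences must have the same length")
    gap = "-"
    w = sum(1 for c, d in zip(aligned_c, aligned_d) if c == d and c != gap)
    z = sum(1 for d in aligned_d if d != gap)
    if z == 0:
        raise ValueError("sequence D contains no nucleotides")
    return 100.0 * w / z


# Hypothetical alignment: C has 8 nucleotides, D has 10.
aligned_c = "ACGT--ACGT"
aligned_d = "ACGTTTACGA"

c_to_d = percent_identity(aligned_c, aligned_d)  # W = 7, Z = 10
d_to_c = percent_identity(aligned_d, aligned_c)  # W = 7, Z = 8
```

Here the same seven matches give 70.0% identity of C to D but 87.5% identity of D to C, because the denominator is the length of whichever sequence plays the role of D.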

[0040] Percent nucleic acid sequence identity values may also be obtained as described below by using the WU-BLAST-2 computer program (Altschul et al., Methods in Enzymology 266:460-480 (1996)). Most of the WU-BLAST-2 search parameters are set to the default values. Those not set to default values, i.e., the adjustable parameters, are set with the following values: overlap span = 1, overlap fraction = 0.125, word threshold (T) = 11, and scoring matrix = BLOSUM62. When WU-BLAST-2 is employed, a % nucleic acid sequence identity value is determined by dividing (a) the number of matching identical nucleotides between the nucleic acid sequence of the nucleic acid molecule of interest and a comparison nucleic acid molecule of interest, for example a percent variant polynucleotide, as determined by WU-BLAST-2 by (b) the total number of nucleotides of the polynucleotide of interest. For example, in the statement "an isolated nucleic acid molecule comprising a nucleic acid sequence A which has or having at least 80% nucleic acid sequence identity to the nucleic acid sequence B," the nucleic acid sequence A is the comparison nucleic acid molecule of interest and the nucleic acid sequence B is the nucleic acid sequence of interest.

[0041] Percent nucleic acid sequence identity may also be determined using the sequence comparison program NCBI-BLAST2 (Altschul et al., Nucleic Acids Res. 25:3389-3402 (1997)). The NCBI-BLAST2 sequence comparison program may be downloaded from http://www.ncbi.nlm.nih.gov or otherwise obtained from the National Institutes of Health, Bethesda, MD. NCBI-BLAST2 uses several search parameters, wherein all of those search parameters are set to default values including, for example, unmask = yes, strand = all, expected occurrences = 10, minimum low complexity length = 15/5, multi-pass e-value = 0.01, constant for multi-pass = 25, dropoff for final gapped alignment = 25 and scoring matrix = BLOSUM62.

[0042] In situations where NCBI-BLAST2 is employed for sequence comparisons, the % nucleic acid sequence identity of a given nucleic acid sequence C to, with, or against a given nucleic acid sequence D (which can alternatively be phrased as a given nucleic acid sequence C that has or comprises a certain % nucleic acid sequence identity to, with, or against a given nucleic acid sequence D) is calculated as follows:

[0043] 100 times the fraction W/Z

[0044] where W is the number of nucleotides scored as identical matches by the sequence alignment program NCBI-BLAST2 in that program's alignment of C and D, and where Z is the total number of nucleotides in D. It will be appreciated that where the length of nucleic acid sequence C is not equal to the length of nucleic acid sequence D, the % nucleic acid sequence identity of C to D will not equal the % nucleic acid sequence identity of D to C.
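By way of illustration only, the calculation of 100 times the fraction W/Z may be sketched as follows in Python. The function name `percent_identity`, the use of "-" as a gap character, and the example alignment are assumptions introduced for this sketch; they are not part of NCBI-BLAST2, which produces the alignment itself.

```python
def percent_identity(aligned_c: str, aligned_d: str) -> float:
    """Return 100 * W / Z for two gapped, equal-length aligned sequences.

    W = positions scored as identical matches in the alignment of C and D.
    Z = total number of nucleotides in D (gap characters are not counted).
    """
    if len(aligned_c) != len(aligned_d):
        raise ValueError("aligned sequences must have equal length")
    w = sum(1 for c, d in zip(aligned_c, aligned_d) if c == d and c != "-")
    z = sum(1 for d in aligned_d if d != "-")
    if z == 0:
        raise ValueError("sequence D contains no nucleotides")
    return 100.0 * w / z


# Hypothetical alignment: C has 8 nucleotides, D has 10 nucleotides.
aligned_c = "ACGTAC--GT"
aligned_d = "ACGTACGGGT"

c_to_d = percent_identity(aligned_c, aligned_d)  # W = 8, Z = 10
d_to_c = percent_identity(aligned_d, aligned_c)  # W = 8, Z = 8
print(c_to_d, d_to_c)
```

In this hypothetical case the identity of C to D is 80% while the identity of D to C is 100%, which illustrates why the two values differ when the sequences are of unequal length.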

[0045] "Percent amino acid sequence identity" is defined as the percentage of amino acid residues in a candidate sequence that are positives with respect to the amino acids in the polypeptide sequence of interest, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity.

[0046] The term "positives", in the context of sequence comparison performed as described above, includes residues in the sequences compared that are not identical but have similar properties, for example as a result of conservative substitutions. Conservative substitutions include the following:

Original Residue    Exemplary Conservative Substitutions        Preferred Conservative Substitutions
Ala (A)             val; leu; ile                               val
Arg (R)             lys; gln; asn                               lys
Asn (N)             gln; his; lys; arg                          gln
Asp (D)             glu                                         glu
Cys (C)             ser                                         ser
Gln (Q)             asn                                         asn
Glu (E)             asp                                         asp
Gly (G)             pro; ala                                    ala
His (H)             asn; gln; lys; arg                          arg
Ile (I)             leu; val; met; ala; phe; norleucine         leu
Leu (L)             norleucine; ile; val; met; ala; phe         ile
Lys (K)             arg; gln; asn                               arg
Met (M)             leu; phe; ile                               leu
Phe (F)             leu; val; ile; ala; tyr                     leu
Pro (P)             ala                                         ala
Ser (S)             thr                                         thr
Thr (T)             ser                                         ser
Trp (W)             tyr; phe                                    tyr
Tyr (Y)             trp; phe; thr; ser                          phe
Val (V)             ile; leu; met; phe; ala; norleucine         leu

[0047] For purposes herein, the % value of positives is determined by dividing (a) the number of amino acid residues scoring a positive value between the polypeptide amino acid sequence of interest and the comparison amino acid sequence of interest (i.e., the amino acid sequence against which the polypeptide sequence is being compared) as determined in the BLOSUM62 matrix of WU-BLAST-2 by (b) the total number of amino acid residues of the polypeptide of interest.

[0048] Unless specifically stated otherwise, the % value of positives is calculated as described in the immediately preceding paragraph. However, in the context of the amino acid sequence identity comparisons performed as described for ALIGN-2 and NCBI-BLAST-2 above, the term "positives" includes amino acid residues in the sequences compared that are not only identical, but also those that have similar properties. Amino acid residues that score a positive value to an amino acid residue of interest are those that are either identical to the amino acid residue of interest or are a preferred substitution of the amino acid residue of interest.

[0049] For amino acid sequence comparisons using ALIGN-2 or NCBI-BLAST2, the % value of positives of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % positives to, with, or against a given amino acid sequence B) is calculated as follows:

[0050] 100 times the fraction X/Y

[0051] where X is the number of amino acid residues scoring a positive value as defined above by the sequence alignment program ALIGN-2 or NCBI-BLAST2 in that program's alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that where the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % positives of A to B will not equal the % positives of B to A.
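By way of illustration only, the % positives calculation (100 times the fraction X/Y) may be sketched in Python as follows. The dictionary reproduces the preferred conservative substitutions listed in the table above in one-letter code. The function name, the gap character "-", the example alignment, and the choice to treat a residue of A as positive when it is identical to, or the preferred substitution of, the aligned residue of B are assumptions made for this sketch and do not describe the internal scoring of ALIGN-2 or NCBI-BLAST2.

```python
# Preferred conservative substitution for each original residue (one-letter code).
PREFERRED = {
    "A": "V", "R": "K", "N": "Q", "D": "E", "C": "S",
    "Q": "N", "E": "D", "G": "A", "H": "R", "I": "L",
    "L": "I", "K": "R", "M": "L", "F": "L", "P": "A",
    "S": "T", "T": "S", "W": "Y", "Y": "F", "V": "L",
}


def percent_positives(aligned_a: str, aligned_b: str) -> float:
    """Return 100 * X / Y for two gapped, equal-length aligned sequences.

    X = aligned positions where the residue of A is identical to the residue
        of B or is the preferred substitution for the residue of B.
    Y = total number of amino acid residues in B (gaps are not counted).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    x = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap positions never score as positives
        if a == b or PREFERRED.get(b) == a:
            x += 1
    y = sum(1 for b in aligned_b if b != "-")
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical alignment: M/M identical, K for R preferred, V for L not
# preferred, L/L identical, D for E preferred, so X = 4 and Y = 5.
positives = percent_positives("MKVLD", "MRLLE")
print(positives)
```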

[0052] "Codon Optimized" refers to changes in the codons of the gene of interest to those preferentially used in a particular organism such that the gene is efficiently expressed in the organism. Although the genetic code is degenerate in that most amino acids are represented by several codons, called synonyms or synonymous codons, it is well known that codon usage by particular organisms is nonrandom and biased towards particular codon triplets. This codon usage bias may be higher in reference to a given gene, genes of common function or ancestral origin, highly expressed proteins versus low copy number proteins, and the aggregate protein coding regions of an organism's genome. Although codon bias may arise from nucleotide composition or mutational biases in different organisms, codon usage bias in bacteria and yeast correlates with the abundance of tRNA species in the cell. In general, codon bias is often associated with the level of gene expression. That is, certain codons are preferentially represented in the protein coding regions of highly expressed gene products. Thus, changing the codons to the preferred codons of a particular organism may allow higher level expression of the encoded protein in that organism. In other words, codons are preferably selected to fit the host cell in which the protein is being produced. For example, preferred codons used in bacteria are used to express the gene in bacteria; preferred codons used in yeast are used for expression in yeast; and preferred codons used in mammalian cells are used for expression in mammalian cells.

[0053] By "preferred", "optimal" or "favored" codons, or "high codon usage bias" or grammatical equivalents as used herein is meant codons used at higher frequency in the protein coding regions than other codons that code for the same amino acid. The preferred codons may be determined in relation to codon usage in a single gene, a set of genes of common function or origin, highly expressed genes, the codon frequency in the aggregate protein coding regions of the whole organism, codon frequency in the aggregate protein coding regions of related organisms, or combinations thereof.
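By way of illustration only, one simple way of determining preferred codons relative to a chosen set of coding sequences may be sketched in Python as follows. The codon table is the standard genetic code; the function name, the tie-breaking behavior, and the two short example sequences are assumptions made for this sketch and are not Mtb sequences or real codon usage data.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def preferred_codons(coding_sequences):
    """Return, for each amino acid, the codon used most often in the given set."""
    counts = defaultdict(Counter)
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            aa = CODON_TABLE.get(codon)  # None for codons with ambiguous bases
            if aa is not None and aa != "*":
                counts[aa][codon] += 1
    # Ties are resolved by first occurrence, an arbitrary choice for this sketch.
    return {aa: usage.most_common(1)[0][0] for aa, usage in counts.items()}


# Hypothetical coding sequences used only to exercise the function.
genes = ["ATGGCCGCCGCGAAA", "ATGGCCAAGAAGTAA"]
preferred = preferred_codons(genes)
print(preferred)
```

Supplying a single gene, a set of highly expressed genes, or all coding regions of a genome as `coding_sequences` corresponds to the different reference sets mentioned above.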

[0054] A variety of methods are known for determining the codon frequency (e.g., codon usage, relative synonymous codon usage) and codon preference in specific organisms, including multivariate analysis, for example, using cluster analysis or correspondence analysis, and the effective number of codons used in a gene (see GCG CodonPreference, Genetics Computer Group Wisconsin Package; Codon W, John Peden, University of Nottingham; McInerney, J. O. (1998) Bioinformatics 14:372-73; Stenico, M. et al. (1994) Nucleic Acids Res. 22:2437-46; Wright, F. (1990) Gene 87:23-29). Codon usage tables are available for a growing list of organisms (see for example, Wada, K. et al. (1992) Nucleic Acids Res. 20:2111-2118; Nakamura, Y. et al. (2000) Nucleic Acids Res. 28:292; Duret, et al. supra). The data source for obtaining codon usage may rely on any available nucleotide sequence capable of coding for a protein. These data sets include nucleic acid sequences actually known to encode expressed proteins (e.g., complete protein coding sequences-CDS), expressed sequence tags (ESTs), or predicted coding regions of genomic sequences (see for example, Mount, D. Bioinformatics: Sequence and Genome Analysis, Chapter 8, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001; Uberbacher, E. C. (1996) Methods Enzymol. 266:259-281; and Tiwari, S. et al. (1997) Comput. Appl. Biosci. 13:263-270).

[0055] When several preferred codons are available for the same amino acid, the choice of substitution can rely on other considerations such as ease of constructing the variant, concerns for limiting introduction of mutations during propagation of the gene in the host organism (i.e., mutational bias), secondary structure of the mRNA that may affect expression levels, and concern for generating splice sites. Other considerations may take into account the intended uses of the codon optimized variants, such as insertion of restriction sites for generating fusion proteins. Thus, some deviations from strict adherence to preferred codons are permissible to accommodate restriction sites in the resulting gene for the purposes of constructing the variant, replacement of gene segments (e.g., to simplify insertion of mutated gene segments), and for creating fusion proteins, as described below.

[0056] In some embodiments, all codons need not be replaced to optimize the codon usage of the polypeptide or peptide of interest since the natural sequence will comprise the preferred codons and because use of preferred codons may not be required for all amino acid residues. In one aspect, about 10 to about 35% of the codons are replaced or changed. Additional changes may be introduced to maximize expression. Consequently, codon optimized polynucleotide sequences may contain preferred codons at about 40%, 50%, 60%, 70%, 80%, or greater than 90% of codon positions of the full length coding region.
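By way of illustration only, partial codon replacement and the resulting percentage of changed codons may be sketched in Python as follows. The mapping `PREFERRED_SYNONYM`, the function name, and the example coding sequence are hypothetical and introduced only for this sketch; the mapping is not derived from actual codon usage data for any organism.

```python
# Hypothetical map from a non-preferred codon to the preferred synonymous codon.
PREFERRED_SYNONYM = {
    "GCG": "GCC",  # Ala
    "GCA": "GCC",  # Ala
    "AAA": "AAG",  # Lys
}


def replace_with_preferred(cds: str, preferred_synonym: dict):
    """Replace listed codons with their preferred synonyms.

    Returns the modified coding sequence and the percentage of codons changed.
    Codons absent from the map, including those already preferred, are kept.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    new_codons = [preferred_synonym.get(codon, codon) for codon in codons]
    changed = sum(1 for old, new in zip(codons, new_codons) if old != new)
    return "".join(new_codons), 100.0 * changed / len(codons)


# Hypothetical 10-codon sequence: ATG GCG AAA GCC AAG GCC CTG GAC GCA TAA
native = "ATGGCGAAAGCCAAGGCCCTGGACGCATAA"
optimized, percent_changed = replace_with_preferred(native, PREFERRED_SYNONYM)
print(optimized, percent_changed)
```

In this hypothetical example three of the ten codons are replaced, so 30% of the codons are changed while the codons that were already preferred are left as they are.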

[0057] "Stringency" of hybridization reactions is readily determinable by one of ordinary skill in the art, and generally is an empirical calculation dependent upon probe length, washing temperature, and salt concentration. In general, longer probes require higher temperatures for proper annealing, while shorter probes need lower temperatures. Hybridization generally depends on the ability of denatured DNA to reanneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and hybridizable sequence, the higher the relative temperature which can be used. As a result, it follows that higher relative temperatures would tend to make the reaction conditions more stringent, while lower temperatures less so. For additional details and explanation of stringency of hybridization reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience Publishers, (1995).

[0058] "Stringent conditions" or "high stringency conditions", as defined herein, may be identified by those that: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride/0.0015 M sodium citrate/0.1% sodium dodecyl sulfate at 50°C; (2) employ during hybridization a denaturing agent, such as formamide, for example, 50% (v/v) formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride, 75 mM sodium citrate at 42°C; or (3) employ 50% formamide, 5 x SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5 x Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2 x SSC (sodium chloride/sodium citrate) and 50% formamide at 55°C, followed by a high-stringency wash consisting of 0.1 x SSC containing EDTA at 55°C.

[0059] "Moderately stringent conditions" may be identified as described by Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and include the use of washing solution and hybridization conditions (e.g., temperature, ionic strength and %SDS) less stringent than those described above. An example of moderately stringent conditions is overnight incubation at 37°C in a solution comprising: 20% formamide, 5 x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5 x Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing the filters in 1 x SSC at about 37-50°C. The skilled artisan will recognize how to adjust the temperature, ionic strength, etc. as necessary to accommodate factors such as probe length and the like.

[0060] "Isolated," when used to describe the various polynucleotides and polypeptides disclosed herein, refers to a polynucleotide or polypeptide that has been identified and separated and/or recovered from a component of its natural environment. Contaminant components of its natural environment are materials that would typically interfere with diagnostic or therapeutic uses for an isolated polypeptide, and may include enzymes, hormones, and other proteinaceous or non-proteinaceous solutes. In preferred embodiments, the polypeptide will be purified (1) to a degree sufficient to obtain at least 15 residues of N-terminal or internal amino acid sequence by use of a spinning cup sequenator, or (2) to homogeneity by SDS-PAGE under non-reducing or reducing conditions using Coomassie blue or, preferably, silver stain. Isolated polypeptide includes polypeptide in situ within recombinant cells, since at least one component of the PRO polypeptide natural environment will not be present. Ordinarily, however, isolated polypeptide will be prepared by at least one purification step.

[0060] "Polyclonal antibodies" or "antisera" are heterogeneous populations of antibody molecules derived from the sera of animals immunized with an antigen, or an antigenic functional derivative thereof. For the production of polyclonal antibodies, host animals such as rabbits, mice and goats, may be immunized by injection with an antigen or hapten-carrier conjugate optionally supplemented with adjuvants. Polyclonal antibodies may also be derived from the sera of humans or non-human animals exposed to a pathogen or vaccinated against a pathogen using a commercially available or experimental vaccine. An antiserum against TB (Mtb), for example, may be obtained from a human patient vaccinated with a rδvaccine, or from an animal, such as a mouse, rabbit, goat or sheep immunized with Mtb bacteria or a Mtb preparation.

[0061] "Monoclonal antibodies," which are abbreviated MAb, are homogeneous populations of antibodies to a particular antigen and may be obtained by any technique that provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique of Köhler and Milstein, Nature, 256:495-7 (1975) and U.S. Patent No. 4,376,110; the human B-cell hybridoma technique (Kosbor, et al., Immunology Today, 4:72 (1983); Cote, et al., Proc. Natl. Acad. Sci. USA, 80:2026-30 (1983)); and the EBV-hybridoma technique (Cole, et al., in Monoclonal Antibodies And Cancer Therapy, Alan R. Liss, Inc., New York, pp. 77-96 (1985)). Such antibodies may be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any subclass thereof. The hybridoma producing the MAb of this invention may be cultivated in vitro or in vivo. Production of high titers of MAbs in vivo makes this a presently preferred method of production.

[0062] In addition, techniques developed for the production of "chimeric antibodies" (Morrison, et al., Proc. Natl. Acad. Sci., 81:6851-6855 (1984); Takeda, et al., Nature, 314:452-54 (1985)) by splicing the genes from a mouse antibody molecule of appropriate antigen specificity together with genes from a human antibody molecule of appropriate biological activity can be used. A chimeric antibody is a molecule in which different portions are derived from different sources, such as those having a variable region derived from a murine MAb and a human immunoglobulin constant region. Humanized antibodies can also be generated in which certain parts (e.g., framework regions) of a non-human antibody are altered to make the antibody more like a human antibody, while retaining antigen binding features of the parent molecule.

[0063] Alternatively, techniques described for the production of single chain antibodies (U.S. Patent No. 4,946,778; Bird, Science 242:423-26 (1988); Huston, et al, Proc. Natl. Acad. Sci. USA, 85:5879-83 (1988); and Ward, et al., Nature, 334:544-46 (1989)) can be adapted to produce single chain antibodies. Single chain antibodies are typically formed by linking the heavy and light chain fragments of the Fv region via an amino acid bridge, resulting in a single chain polypeptide.

[0064] Antibody fragments that recognize specific epitopes may be generated by known techniques. For example, such fragments include but are not limited to: the Fab fragments that can be produced by papain digestion of the antibody molecule, the F(ab')₂ fragments that can be produced by pepsin digestion of the antibody molecule and the Fab' fragments that can be generated by reducing the disulfide bridges of the F(ab')₂ fragments. Alternatively, Fab expression libraries may be constructed (Huse, et al., Science, 246:1275-81 (1989)) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity.

[0065] The term "hapten" as used herein, refers to a small proteinaceous or non-protein antigenic determinant that is capable of being recognized by an antibody. Typically, haptens do not elicit antibody formation in an animal unless part of a larger species. For example, small peptide haptens are frequently coupled to a carrier protein such as keyhole limpet hemocyanin in order to generate an anti-hapten antibody response. "Antigens" are macromolecules capable of generating an antibody response in an animal and being recognized by the resulting antibody. Both antigens and haptens comprise at least one antigenic determinant or "epitope," which is the region of the antigen or hapten that binds to the antibody. Typically, the epitope on a hapten is the entire molecule.

[0066] By the terms "specifically binding" and "specific binding" as used herein is meant that an antibody or other molecule binds to a target, such as an antigen, with greater affinity than it binds to other molecules under the specified conditions of the present invention. Antibodies or antibody fragments, as known in the art, are polypeptide molecules that contain regions that can bind other molecules, such as antigens. In various embodiments of the invention, "specifically binding" may mean that an antibody or other specificity molecule binds to a target molecule with at least about a 10⁶-fold greater affinity, preferably at least about a 10⁷-fold greater affinity, more preferably at least about a 10⁸-fold greater affinity, and most preferably at least about a 10⁹-fold greater affinity than it binds molecules unrelated to the target molecule. Typically, specific binding refers to affinities in the range of about 10⁶-fold to about 10⁹-fold greater than non-specific binding. In some embodiments, specific binding may be characterized by affinities greater than 10⁹-fold over non-specific binding. Whenever a range appears herein, as in "1-10" or "one to ten," the range refers without limitation to each integer or unit of measure in the given range. Thus, by 1-10 is meant each of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 and any subunit in between.

[0067] "Immunogenic compositions" of the present invention are preparations that, when administered to a human or non-human animal, elicit a humoral and/or cellular immune response. "Vaccine," as used herein, can refer to immunogenic compositions that are administered to a human or non-human patient for the prevention, amelioration or treatment of diseases, typically infectious diseases. "Traditional vaccines" or "whole vaccines" typically may be live, attenuated or killed microorganisms, such as bacteria or viruses. Vaccines can also encompass preparations that elicit or stimulate an immune response that may be useful in the prevention, amelioration or treatment of non-infectious diseases. For example, a cancer cell vaccine can be administered to stimulate or supplement a patient's immune response to neoplastic disease. "Subunit vaccines" can be prepared from purified or partially purified proteins or other antigens from a microorganism, cancer cell or other vaccine target. The term "recombinant vaccine" refers to any vaccine that is prepared using recombinant DNA technology and includes certain subunit vaccines (for example, where subunits are cloned and expressed in vitro prior to administration) and "polynucleotide vaccines" such as DNA vaccines that may encode immunogenic polypeptides. Vaccines typically contain at least one immunogenic component (e.g., a cell, virus, polypeptide, polynucleotide, and the like) but may also include additional agents such as adjuvants, which may enhance or stimulate the patient's immune response to the immunogenic component. In certain embodiments, vaccines or components of vaccines may be conjugated, e.g., to a polysaccharide or other molecule, to improve stability or immunogenicity of one or more vaccine components.

[0068] As used herein, the term "promoter" refers to a DNA sequence having a regulatory function, which is recognized (directly or indirectly) and bound by a DNA-dependent RNA polymerase during the initiation of transcription. Promoters are typically adjacent to the coding sequence of a gene and extend upstream from the transcription initiation site. The promoter regions may contain several short (<10 base pair) sequence elements that bind transcription factors, generally located within the first 100-200 nucleotides upstream of the transcription initiation site. Sequence elements that regulate transcription from greater distances are generally referred to as "enhancers" and may be located several hundred or thousand nucleotides away from the gene they regulate. Promoters and enhancers may be cell- and tissue-specific; they may be developmentally programmed; they may be constitutive or inducible, e.g., by hormones, cytokines, antibiotics, or by physiological and metabolic states. For example, the human metallothionein (MT) promoter is upregulated by heavy metal ions and glucocorticoids. Inducible promoters and other elements may be operatively positioned to allow the inducible control or activation of expression of the desired TAP fragment. Examples of such inducible promoters and other regulatory elements include, but are not limited to, tetracycline, metallothionein, ecdysone, and other steroid-responsive promoters, rapamycin responsive promoters, and the like (see e.g., No, et al., Proc. Natl. Acad. Sci. USA, 93:3346-51 (1996); Furth, et al., Proc. Natl. Acad. Sci. USA, 91:9302-6 (1994)). Certain promoters are operative in prokaryotic cells, while different promoter sequences are required for transcription in eukaryotic cells. Additional control elements that can be used include promoters requiring specific transcription factors, such as viral promoters that may require virally encoded factors.
Promoters can be selected for incorporation into TAP fragments based on the intended use of the polynucleotide, as one skilled in the art will readily appreciate. For example, if the polynucleotide encodes a polypeptide with potential utility in human cells, then a promoter capable of promoting transcription in mammalian cells can be selected. Typical mammalian promoters include muscle creatine kinase promoter, actin promoter, elongation factor promoter as well as those found in mammalian viruses such as CMV, SV40, RSV, MMV, HIV, and the like. In certain embodiments, it may be advantageous to incorporate a promoter from a plant or a plant pathogen (e.g., cauliflower mosaic virus promoter), a promoter from a fungus such as yeast (e.g., Gal 4 promoter), or a promoter from a bacterium or bacterial virus, such as bacteriophage lambda, T3, T7, SP6, and the like.

[0069] The term "terminator" refers to DNA sequences, typically located at the end of a coding region, that cause RNA polymerase to terminate transcription. As used herein, the term "terminator" also encompasses terminal polynucleotide sequences that direct the processing of RNA transcripts prior to translation, such as, for example, polyadenylation signals. Any type of terminator can be used for the methods and compositions of the invention. For example, TAP terminator sequences can be derived from a prokaryote, eukaryote, or a virus, including, but not limited to animal, plant, fungal, insect, bacterial and viral sources. In one embodiment, artificial mammalian transcriptional terminator elements are used. A nonexclusive list of terminator sequences that may be used in the present invention include the SV40 transcription terminator, bovine growth hormone (BGH) terminator, synthetic terminators, rabbit β-globin terminator, and the like. Terminators can also be a consecutive stretch of adenine nucleotides at the 3' end of a TAP fragment.

[0070] By "pharmaceutically acceptable" or "pharmacologically acceptable" is meant a material which is not biologically or otherwise undesirable, i.e., the material may be administered to an individual in a formulation or composition without causing any undesirable biological effects or interacting in a deleterious manner with any of the components of the composition in which it is contained.

[0071] By "serodiagnostic test" or grammatical equivalents herein is meant diagnostic tests to detect Mtb by serum of infected organisms, animals or patients.

[0072] By "diagnostic test" or grammatical equivalents herein is meant an assay or test to detect Mtb by any scientific technique from infected organisms, animals or patients.

[0073] The term "consisting essentially of," when used in reference to nucleic acid and/or amino acid sequences can refer to the specified nucleic acid sequences and/or amino acid sequences, and can include any additional nucleotide or amino acid residues, respectively, that do not materially affect the basic and novel characteristics of the specified sequence. The term "consisting essentially of" also can refer to variants that are substantially similar to, and differ from, a reference sequence in an inconsequential way as judged by examination of the sequence. Nucleic acid sequences encoding the same amino acid sequence are substantially similar despite differences in degenerate positions or modest differences in length or composition of any non-coding regions. Amino acid sequences differing only by conservative substitution, as described below, or minor length variations are substantially similar. Additionally, nucleic acid sequences that encode and amino acid sequences comprising immunogenic or antigenic epitopes that differ in the number of N- or C-terminal flanking residues are substantially similar. For example, a variant polypeptide can be considered to consist essentially of a specified sequence where the variant includes between 0-30 amino acid residues at either terminus, so long as the additional amino acids do not substantially affect the basic and novel characteristics of the polypeptide. Preferably, a variant peptide can have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more amino acid residues at either terminus, so long as the additional amino acids do not substantially affect the basic and novel characteristics of the polypeptide. Likewise, a variant polypeptide can be considered to consist essentially of a specified sequence where the variant lacks between 0-30 amino acid residues compared to the specified sequence, at either terminus. In some aspects, for example, the variant polypeptides and polynucleotides are also immunogenic or antigenic. Nucleic acids that encode substantially similar amino acid sequences are themselves also substantially similar.

Overview

[0074] The present invention generally relates to Mtb polypeptide libraries, methods of determining the immunogenic effect of Mtb polypeptides, and methods of developing immunogenic compositions and vaccines against Mtb, as well as immunogenic and pharmaceutical compositions. The invention also provides immunogenic Mtb polypeptides and mixtures of polypeptides, polynucleotides encoding immunogenic Mtb polypeptides and immunogenic compositions comprising Mtb polypeptides or polynucleotides.

Polypeptide Libraries

[0075] According to a method of the present invention, a library or array of Mtb polypeptides, oligonucleotides, or polynucleotides is generated. The immunogenicity of individual polypeptides in the library or array is determined by immunological screening where suitable, immunogenic Mtb polypeptides are selected for vaccine development. Conveniently, individual polypeptides in the library may be arranged in an array to facilitate screening in a rapid and high throughput manner.

[0076] The term "array" includes any arrangement wherein a plurality of different molecules, compounds or other species are contained, held, presented, positioned, situated, or supported. Arrays can be arranged on microtiter plates, such as 48-well, 96-well, 144-well, 192-well, 240-well, 288-well, 336-well, 384-well, 432-well, 480-well, 576-well, 672-well, 768-well, 864-well, 960-well, 1056-well, 1152-well, 1248-well, 1344-well, 1440-well, or 1536-well plates, tubes, slides, chips, flasks, or any other suitable laboratory apparatus. In one embodiment, molecules arranged in an array are peptides, polypeptides or proteins. In another embodiment, the molecules are oligonucleotides or polynucleotides. In one aspect of the invention, polypeptides or polynucleotides in solution are arranged in 96 well plate arrays. In another embodiment, polypeptides or polynucleotides are immobilized on a solid support in an array format. Furthermore, an array can be sub-divided into a plurality of sub-arrays, as for example, where multiple 96-well plates (each an individual sub-array) are required to hold all of the samples of a single, large array.
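By way of illustration only, the sub-division of a large array into 96-well plate sub-arrays may be sketched in Python as follows. The 8 row by 12 column layout is the conventional 96-well format; the row-by-row fill order, the function names, and the sample count of 4000 are assumptions made for this sketch.

```python
import math
import string

ROWS, COLS = 8, 12  # conventional 96-well plate: rows A-H, columns 1-12


def well_position(sample_index: int, rows: int = ROWS, cols: int = COLS):
    """Map a zero-based sample index to (plate number, well label).

    Plates (sub-arrays) are numbered from 1 and each plate is filled row by
    row; this fill order is an assumption of the sketch.
    """
    if sample_index < 0:
        raise ValueError("sample_index must be non-negative")
    plate, offset = divmod(sample_index, rows * cols)
    row, col = divmod(offset, cols)
    return plate + 1, f"{string.ascii_uppercase[row]}{col + 1}"


def plates_needed(n_samples: int, rows: int = ROWS, cols: int = COLS) -> int:
    """Number of plates (sub-arrays) required to hold all samples."""
    return math.ceil(n_samples / (rows * cols))


print(well_position(0))     # first well of the first plate
print(well_position(95))    # last well of the first plate
print(well_position(96))    # first well of the second plate
print(plates_needed(4000))  # hypothetical library of 4000 samples
```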

[0077] The term "library" is likewise to be construed broadly, and includes any non-naturally occurring collection of molecules, whether arranged or not. A library therefore encompasses an array but the two terms are not necessarily synonymous.

TAP Technology

[0078] Libraries of Mtb polypeptides may be prepared by any method known in the art. Conveniently, GTS' patented Transcriptionally Active PCR ("TAP") products can be used to amplify DNA in preparation for producing Mtb polypeptide libraries. With TAP technology, a particular polynucleotide of interest can be made transcriptionally active and ready for expression in less than one day. "TAP fragments" are transcriptionally active coding sequences prepared using TAP technology, and the two terms can be used interchangeably. TAP fragments encompass polynucleotides that can be readily expressed, for example, by transfection into animal cells or tissues by any nucleic acid transfection technique, without the need for subcloning into expression vectors or purification of plasmid DNA from bacteria. TAP fragments can be synthesized by amplification (e.g., polymerase chain reaction, or PCR) of any polynucleotide of interest using nested oligonucleotide primers. Two polynucleotide sequences are typically incorporated into TAP fragments, one of which comprises an active transcriptional promoter and the other comprises a transcriptional terminator.

[0079] TAP fragments and methods of making the same are described in detail in U.S. Patent No. 6,280,977, entitled "Method for Generating Transcriptionally Active DNA Fragments" which is hereby incorporated by reference in its entirety. In one embodiment, methods for creating TAP fragments include the steps of: i) designing oligonucleotide primers; ii) amplifying TAP primary fragments; and iii) amplifying TAP expression fragments. FIGURE 1 illustrates one method for generating TAP fragments.

[0080] TAP fragments can be prepared using custom oligonucleotide primers designed to amplify a target polynucleotide sequence of interest from the Mtb genome. Primers complementary to the 5' and 3' ends of the polynucleotide of interest can be designed and synthesized using methods well known in the art, and can include any suitable number of nucleotides to permit amplification of the coding region. Typically, the polynucleotide sequence of interest is an open reading frame (ORF) that consists of an uninterrupted stretch of triplet amino acid codons, without stop codons. In certain embodiments, the polynucleotide is a Mtb polypeptide-encoding sequence.
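By way of illustration only, the definition of an ORF given above (an uninterrupted stretch of triplet codons without stop codons) may be checked with a short Python sketch. The function name and the example sequences are assumptions made for this sketch; the sequences are not Mtb sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(seq: str) -> bool:
    """Return True if seq is a whole number of codons with no in-frame stop codon."""
    seq = seq.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False  # not a whole number of triplets
    if set(seq) - set("ACGT"):
        return False  # ambiguous or non-DNA characters are rejected
    return all(seq[i:i + 3] not in STOP_CODONS for i in range(0, len(seq), 3))


print(is_open_reading_frame("ATGGCCAAG"))  # ATG GCC AAG
print(is_open_reading_frame("ATGTAAGCC"))  # ATG TAA GCC has an in-frame stop
print(is_open_reading_frame("ATGATAGCC"))  # "TAG" occurs only out of frame
```

Only in-frame triplets are examined, so a stop-like triplet that straddles two codons does not interrupt the reading frame.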

[0081] In one embodiment of the invention, 5'-custom oligonucleotide primers of about 41, 42, 43, 44, 45 or 46 nucleotides are designed and synthesized; about 6 nucleotides of which comprise the 5'-TAP end universal sequence 5'-GAAGGAGATATACCATGCATCATCATCATCATCAT-3' (SEQ ID NO: 84) and about 15 to 20 nucleotides are complementary to the Mtb sequence. Accordingly, the target-specific sequence can be, for example, about 15, 16, 17, 18, 19, or 20 nucleotides in length. The 5' oligonucleotide may also incorporate a Kozak consensus sequence (A/GCCAUGG) near an ATG start codon (initiator methionine) for more efficient translation of mRNA. In one embodiment, an ATG start codon is included in the target-specific primer sequence. In another embodiment, an ATG start codon is incorporated into the custom 5'-oligonucleotide when the target sequence encoding a polypeptide of interest lacks an initiation methionine codon at its 5' end.

[0082] In one embodiment of the invention, 3'-custom oligonucleotide primers comprise about 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, or 45 nucleotides; of these, about 20 nucleotides comprise the 3'-TAP end universal sequence 5'-TGATGATGAGAACCCCCCCC-3' (SEQ ID NO: 85) and about 20 nucleotides are complementary to the target Mtb sequence. In one aspect, a stop codon sequence can be added to the end of the target Mtb sequence to achieve proper translational termination by incorporating a TCA, TTA, or CTA into the 3'-custom oligonucleotide.

Bioinformatics analysis of Mtb polynucleotides

[0083] In one embodiment of the invention, a bioinformatics approach is used to identify, prioritize and select Mtb genes, coding sequences, ORFs and other sequences of interest for TAP amplification and to design custom 5' and 3' oligonucleotide primers. According to this approach, a database of Mtb genomic information is compiled from available nucleic acid and amino acid sequence information, including the polynucleotide, gene, locus, polypeptide, and protein names, locations and sizes. In certain embodiments, the location of known coding sequences is included in the database. The sequence information may also be analyzed for unidentified ORFs and putative coding sequences. Any method can be used to identify ORFs and coding sequences including free or commercially available sequence analysis software. For example, the GLIMMER program may be used to predict putative coding regions or genes in prokaryotic nucleotide sequences. See e.g., Salzberg et al., Nucleic Acids Res. 26:544-548 (1998); Delcher et al., Nucleic Acids Res. 27:4636-4641 (1999).
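The ORF identification step of paragraph [0083] can be illustrated with a minimal sketch. This is a naive six-frame scan written for illustration only; it is not GLIMMER and makes no use of its statistical models. The function name `find_orfs`, the `min_codons` cutoff and the set of start codons are assumptions introduced here, not values taken from this disclosure.

```python
# Naive six-frame ORF scan. An illustrative stand-in only; it is not GLIMMER.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of prokaryotic start codons
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end, orf) tuples for start-to-stop ORFs.

    Coordinates are 0-based and end-exclusive on the scanned strand.
    The reported ORF runs from the start codon up to, but not including,
    the stop codon, so it contains no stop codons.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None:
                    if codon in START_CODONS:
                        start = i  # first start codon opens a candidate ORF
                elif codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i, s[start:i]))
                    start = None  # look for the next ORF in this frame
    return orfs
```

A real pipeline would replace this scan with a dedicated gene finder and would record each hit, with its location and size, in the genome database described above.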

[0084] In certain aspects of the invention, the genome database includes the entire genomic DNA sequence of Mtb. In one embodiment, the sequence information is obtained from information that is in the public domain. In other embodiments, some or all of the sequence information can be obtained by nucleotide and/or amino acid sequencing.

[0085] As previously described in U.S.S.N. 10/159,428, which is hereby incorporated by reference in its entirety, the methods of the present invention, particularly TAP technology, enable the skilled artisan to prepare a library representing all or substantially all of the polypeptides expressed in an organism or cell type. In certain embodiments of the present invention, however, it may be preferable to prepare a library of polypeptides with selected properties. Thus, one aspect of the present invention utilizes a set of ranking criteria to identify polypeptides predicted to have properties desirable, e.g., for vaccine development. Polypeptide ranking criteria, which may be identified using bioinformatics tools, include but are not limited to the presence of membrane domains, ORF size, secreted protein signatures, signal sequences, hydrophobicity, B-cell and T-cell epitopes, homology to human proteins, and protein and gene expression levels. The ranking criteria may be assigned a numerical score based on relative importance. Coding regions or putative coding regions identified in the database of Mtb sequences are then scored using the numerical ranking criteria and the sum of the scores for each sequence is used to establish a rank order. According to this aspect of the invention, primers are designed to amplify Mtb polynucleotides in rank order. A library may be constructed, for example, from the top 5%, 10%, 20%, 30%, 40% or 50% by rank of Mtb polynucleotides.

Amplification of Mtb polynucleotides
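The rank-ordering scheme of paragraph [0085] (score each criterion, sum the scores, sort, keep the top fraction) can be sketched as follows. The weights, the criterion names used as dictionary keys, and the function names are hypothetical; the disclosure names the criteria but does not give numerical scores.

```python
import math

# Hypothetical weights: the criteria come from the text, the numbers do not.
WEIGHTS = {
    "signal_sequence": 3.0,
    "membrane_domain": 2.0,
    "b_cell_epitope": 2.0,
    "t_cell_epitope": 2.0,
    "human_homology": -3.0,  # assumed penalty for similarity to human proteins
}


def rank_candidates(features, weights=WEIGHTS):
    """Sum weighted criterion values per sequence and return the rank order.

    `features` maps a sequence name to a dict of criterion -> value
    (for example 1 if a predicted signal sequence is present, else 0).
    Ties are broken alphabetically so the order is deterministic.
    """
    totals = {
        name: sum(weights[c] * v for c, v in crit.items() if c in weights)
        for name, crit in features.items()
    }
    ranked = sorted(totals, key=lambda name: (-totals[name], name))
    return ranked, totals


def top_fraction(ranked, fraction):
    """Return the top `fraction` (e.g. 0.05, 0.10, 0.50) of a ranked list."""
    count = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:count]
```

Under these assumptions, primers would then be designed for the names returned by `top_fraction`, in rank order.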

[0086] Using the custom 5' and 3' oligonucleotide primers, TAP primary fragments may be amplified by methods well known in the art. The term "TAP primary fragment" refers to an amplified Mtb polynucleotide, and in one embodiment relates to a polynucleotide sequence that has been amplified but is not transcriptionally active. Generation of TAP primary fragments involves performing PCR, which generates a polynucleotide fragment that contains the Mtb polynucleotide sequence with 5'- and 3'-TAP universal end sequences and may contain other sequences incorporated into the custom 5' and 3' oligonucleotide primers. The 5'- and 3'-TAP universal end sequences are particularly useful for incorporating into the TAP primary fragment one or more nucleotide sequences that confer transcriptional activity. In one embodiment, these sequences can include TAP Express™ promoter and terminator fragments (e.g., SEQ ID NOS: 2-7). The skilled artisan will be familiar with methods for amplifying polynucleotides (e.g., by using PCR) and can adjust the above methods in order to optimize the amplification reaction.
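As a sketch of how the custom primers of paragraphs [0081] and [0082] relate to the TAP primary fragment of paragraph [0086], the following builds both primers for a coding sequence and predicts the sense strand of the PCR product. The universal sequences are copied as printed for SEQ ID NOS: 84 and 85; the function names, the default of 20 target-specific nucleotides and the choice of TGA as the default stop codon are assumptions made for illustration.

```python
UNIVERSAL_5 = "GAAGGAGATATACCATGCATCATCATCATCATCAT"  # SEQ ID NO: 84 as printed
UNIVERSAL_3 = "TGATGATGAGAACCCCCCCC"                 # SEQ ID NO: 85 as printed
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_primers(orf, n_target=20, stop_codon="TGA"):
    """Build 5' and 3' custom primers for a sense-strand ORF without a stop.

    The 3' primer anneals to the sense strand, so the stop codon appears in
    it as its reverse complement (TGA -> TCA, TAA -> TTA, TAG -> CTA).
    """
    forward = UNIVERSAL_5 + orf[:n_target]
    reverse = UNIVERSAL_3 + reverse_complement(orf[-n_target:] + stop_codon)
    return forward, reverse


def primary_fragment(orf, stop_codon="TGA"):
    """Predict the sense strand of the amplified TAP primary fragment."""
    return UNIVERSAL_5 + orf + stop_codon + reverse_complement(UNIVERSAL_3)
```

The predicted fragment begins with the 5' primer and ends with the reverse complement of the 3' primer, which is the property the second amplification step relies on.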

[0087] An additional step in the generation of TAP fragments involves incorporating at least one polynucleotide sequence that confers transcriptional activity into the TAP primary fragment. Typically, at least one polynucleotide sequence is incorporated by performing a second PCR reaction. Examples of polynucleotide sequences that confer transcriptional activity are promoter sequences (e.g., prokaryotic Pribnow boxes and eukaryotic TATA box sequences), binding sites for transcription factors, and enhancers. In one embodiment, one promoter and one terminator sequence are added to the TAP fragment. These promoter and terminator sequences can be obtained in numerous ways. For example, one can use restriction enzyme digestion of commercially available plasmids and cDNA molecules, or one can synthesize these sequences with an automated DNA synthesizer by methods well known in the art.

[0088] The end product of the second PCR reaction is referred to as a "TAP expression fragment," which is a transcriptionally active polynucleotide, and which is generally a transcriptionally active coding sequence. In certain embodiments, the TAP expression fragments are used directly for in vivo or in vitro (e.g., cell-free) expression. In other embodiments, TAP expression fragments are transfected into cultured cells or injected into animals.
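The second amplification step can be pictured as joining a promoter fragment and a terminator fragment to the primary fragment through the shared universal end sequences. The sketch below assumes that the promoter fragment ends with the 5' universal sequence and that the terminator fragment begins with the reverse complement of the 3' universal sequence; that arrangement, the placeholder promoter and terminator bodies, and the function name `join_by_overlap` are illustrative assumptions, not sequences or methods taken from this disclosure.

```python
UNIVERSAL_5 = "GAAGGAGATATACCATGCATCATCATCATCATCAT"  # SEQ ID NO: 84 as printed
UNIVERSAL_3_RC = "GGGGGGGGTTCTCATCATCA"  # reverse complement of SEQ ID NO: 85

# Placeholders only; the real promoter and terminator fragments are not shown.
PROMOTER_BODY = "ACGT" * 6
TERMINATOR_BODY = "TGCA" * 6


def join_by_overlap(left, right, min_overlap=15):
    """Merge two sense strands where a suffix of `left` equals a prefix of `right`.

    The longest shared end of at least `min_overlap` nucleotides is used,
    mimicking the product of an overlap-based PCR assembly.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no overlap of at least %d nt" % min_overlap)


def expression_fragment(primary):
    """Add the placeholder promoter and terminator to a TAP primary fragment."""
    promoter = PROMOTER_BODY + UNIVERSAL_5
    terminator = UNIVERSAL_3_RC + TERMINATOR_BODY
    return join_by_overlap(join_by_overlap(promoter, primary), terminator)
```

Because only the universal ends are shared, the same two flanking fragments can in principle be attached to every primary fragment in a library.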

[0089] Generating TAP fragments is a rapid and efficient way of making a large number of polynucleotide sequences transcriptionally active. Accordingly, a plurality of different genes from Mtb can be made transcriptionally active using TAP technology. Thus, a library representing all, substantially all, or a selected subset of the coding sequences in the Mtb genome can be constructed using TAP technology.

TAP Tags and Linker Molecules

[0090] As described above, TAP technology provides powerful methods for amplifying and expressing Mtb polynucleotides. Coding sequences can be rendered transcriptionally active by the PCR-mediated addition of promoter sequences, enhancers, terminators and other regulatory sequences.

[0091] In addition, Mtb polynucleotides can be amplified with additional coding or non-coding sequences that can facilitate rapid screening, characterization, purification and study of the polypeptides that they encode. These additional sequences include, for example, reporter genes, affinity tags, antibody tags, PNA binding sites, secretory signals, and the like.

[0092] According to the present invention, Mtb polynucleotides can be synthesized with an epitope tag. An "epitope tag" is a short stretch of polynucleotide sequence encoding an epitope. In one embodiment, this epitope is preferably recognized by a well-characterized antibody. By incorporating an epitope tag into TAP fragments, the Mtb polynucleotide of interest can be fused in-frame to an epitope-encoding sequence. Expression of an epitope-tagged TAP fragment produces a fusion protein comprising a tagged Mtb polypeptide. Suitable epitope tags will be well known to those skilled in the art, including the hemagglutinin (HA) epitope tag, the 6xHis epitope tag, and the Flag epitope tag. The HA epitope tag is well characterized and highly immunoreactive. Upon transfection of an HA-tagged TAP fragment into cells, the resulting HA-tagged polypeptides can be identified with commercially available anti-HA antibodies. Epitope tagging of TAP fragments is useful for rapidly and conveniently detecting expression of TAP fragments. Epitope tagging of TAP fragments can also help determine the intracellular distribution of Mtb polypeptides and help characterize and purify the Mtb polypeptide. Furthermore, epitope-tagged expression products can be quickly captured and/or purified using antibodies specific for the epitope. Antibodies directed against the HA epitope can be used in the full range of immunological techniques for detection and analysis of tagged polypeptides, including but not limited to Western blotting, ELISAs, radioimmune assays, immunoprecipitation, immunocytochemistry and immunofluorescence, fluorescence-activated cell sorting (FACS) and immunoaffinity purification of the desired fusion polypeptides.

[0093] Also provided herein are chimeric polynucleotides. In some embodiments, the chimeric polynucleotides can comprise a polynucleotide of SEQ ID NO: 46-64, 110-121 or fragments thereof that encode antigenic or immunogenic polypeptides, fused to a sequence that is not naturally contiguous with that sequence as it is found in nature. For example, in some embodiments, the polynucleotides of SEQ ID NO: 46-64, 110-121 or fragments thereof that encode antigenic polypeptides can be fused to a polynucleotide sequence derived from Mycobacterium tuberculosis which is not contiguous with the polynucleotides SEQ ID NO: 46-64, 110-121 or fragments thereof that encode antigenic or immunogenic polypeptides. Alternatively, or in addition, the polynucleotides of SEQ ID NO: 46-64, 110-121 or fragments thereof that encode antigenic or immunogenic peptides can be fused to a heterologous polynucleotide sequence. Also provided are chimeric polynucleotides comprising polynucleotides having at least 85%, 90%, 95%, 96%, 97%, 98% or 99% nucleic acid sequence identity to the nucleic acid sequences of SEQ ID NOs: 46-64, 110-121 or fragments thereof that encode antigenic or immunogenic polypeptides, fused to a polynucleotide sequence that is not naturally contiguous with that sequence as it is found in nature. Further provided herein are chimeric polynucleotides comprising polynucleotides that consist essentially of the sequence of SEQ ID NOs: 46-64, 110-121 or fragments thereof that encode antigenic or immunogenic polypeptides, fused to a polynucleotide sequence that is not naturally contiguous with that sequence as it is found in nature.
Additionally, provided herein are chimeric polynucleotides comprising polynucleotide sequences that hybridize under stringent conditions to the nucleic acid sequences of SEQ ID NOs: 46-64, 110-121, or immunogenic or antigenic fragments thereof, or the complements thereof, fused to a polynucleotide sequence that is not naturally contiguous with that sequence as it is found in nature. In some embodiments, the polynucleotides disclosed herein can be optimized for codons most frequently used in an animal host, particularly a human host.

[0094] The present invention also provides Mtb polypeptides fused to affinity tags. For example, a polynucleotide sequence encoding a histidine tag can be incorporated into the TAP fragment to enable the expressed gene product to be conveniently purified. These His tags consist of six consecutive histidine residues (6xHis) and are a powerful tool for recombinant polypeptide purification. The 6xHis tag interacts with metals, such as nickel. Thus, polypeptides fused to a 6xHis tag can be purified by metal affinity chromatography, for example, using a nickel-nitrilotriacetic acid (Ni-NTA) resin. The 6xHis tag is much smaller than most other affinity tags and is uncharged at physiological pH. It rarely alters or contributes to polypeptide immunogenicity, rarely interferes with polypeptide structure or function, does not interfere with secretion, does not require removal by protease cleavage, and is compatible with denaturing buffer systems. Accordingly, this tag is a powerful adjunct to expression and purification of recombinant proteins.

[0095] Also provided are chimeric polypeptides wherein the Mtb polypeptides or variants, including but not limited to variants that are 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99% identical to SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof, polypeptides that consist essentially of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof, or variants that have been optimized for expression in a particular host, are fused to an amino acid sequence that is not naturally contiguous with that sequence as it is found in nature. For example, in some embodiments, the polypeptides of SEQ ID NO: 65-83, 122-133 or immunogenic or antigenic fragments thereof can be fused to an amino acid sequence derived from Mycobacterium tuberculosis which is not contiguous with the polypeptides of SEQ ID NO: 65-83, 122-133, or antigenic or immunogenic fragments thereof. Alternatively, or in addition, the polypeptides of SEQ ID NO: 65-83, 122-133, or antigenic or immunogenic fragments thereof can be fused to a heterologous amino acid sequence.

[0096] In one aspect of this embodiment, TAP primers can be designed to include the nucleotide sequence encoding the 6xHis epitope tag: to add the 6xHis epitope to the 5' end of a Mtb polynucleotide, a sequence encoding histidine residues can be included along with the promoter-containing primer; to add the 6xHis epitope to the 3' end, the His sequence can be included in the terminator-containing primer.

[0097] Commercially available nickel affinity resins can be used to purify 6xHis-tagged polypeptides. For example, the well-established QIAexpress Protein Expression and Purification Systems, available from QIAGEN (Seattle, Washington), are based on the remarkable selectivity and affinity of patented nickel-nitrilotriacetic acid (Ni-NTA) metal-affinity chromatography matrices for polypeptides tagged with six consecutive histidine residues (the 6xHis tag). This technology allows purification, detection, and assay of almost any 6xHis-tagged polypeptide from any expression system.

[0098] The HA and the 6xHis epitope embodiments are not to be construed as limiting, and are provided for illustrative purposes only. Those skilled in the art will appreciate that any type of tag can be attached to the expressed products such as for example, a Ix, 8x, 9x, or 10x histidine tag, GST tag, fluorescent protein tag, and the like.

[0099] In addition to providing a convenient means for detection and purification of Mtb polypeptides, various tags can provide a "linker" through which the polypeptides of the invention can be immobilized on a solid support. The term "linker molecule," as used herein, encompasses any molecule that is capable of immobilizing the polypeptides to a solid support.

TAP fragment with secretory signal

[0100] For many gene therapy and DNA vaccine applications it may be beneficial for the gene product to be secreted from the transfected cells. Thus, one embodiment of the invention provides a version of the TAP system designed to express Mtb polypeptides containing a fused secretory signal. A commonly used signal peptide is the first 23 amino acids from human tissue plasminogen activator (tPA) with the coding sequence as follows: ATG GAT GCA ATG AAG AGA GGG CTC TGC TGT GTG CTG CTG CTG TGT GGA GCA GTC TTC GTT TCG CCC AGC (SEQ ID NO: 1). This sequence can be built into the TAP promoter fragment to create a new TAP fragment in a fashion similar to the construction of the tagged polypeptides described above.

Incorporating TAP fragments into a plasmid vector

[0101] Once the function or immunogenicity of an Mtb polypeptide is identified, it may be desirable to clone the TAP fragment into a plasmid or other vector to facilitate further gene characterization and manipulation.

[0102] Standard cloning techniques can involve the use of restriction enzymes to digest the plasmid and the gene fragment to be inserted. Annealing and ligation of the compatible ends can lead to insertion of the gene into the vector. An alternative to cloning directed by restriction ends is to prepare a linearized plasmid with T overhangs on the 3' ends of the double-stranded DNA to accommodate DNA fragments amplified by PCR with the aid of specific polymerases. This method is sometimes called "T/A cloning". Other methods of cloning TAP fragments will be well known to those of skill in the art.

[0103] In certain embodiments, the TAP Cloning systems, methods, and kits can further simplify the cloning process by taking advantage of the universal 5' and 3' sequences that are present on the TAP Express fragment after the first or second PCR step. These regions overlap with the end sequences of the linearized TAP Express Cloning Vector. When the TAP fragment and the linearized plasmid are mixed together and directly electroporated into TAP Express Electro-Comp cells, endogenous bacterial recombinase activity recombines the two DNA fragments, resulting in a plasmid with the inserted TAP Express fragment. This process can replace conventional cloning with two simple PCR steps. In some embodiments it does not require cutting, pasting and ligating DNA fragments. In addition, this process can be highly suited for fast and convenient cloning of TAP PCR fragments without having to resort to restriction enzymes, DNA ligase, topoisomerase or other DNA modifying enzymes. "TAP" systems, vectors and cells are readily available from Gene Therapy Systems, Inc., San Diego, California.

[0104] GeneGrip PNA compatible TAP system can also be used to couple polypeptides onto DNA through PNA-Dependent Gene Chemistry, thereby avoiding many of the limitations of previously described methodologies. GeneGrip is available through Gene Therapy Systems, Inc., San Diego, California. This approach takes advantage of the property of peptide nucleic acids (PNA) to hybridize with duplex DNA in a sequence specific and very high affinity manner. PNA binding sites can be used for attaching a series of peptides onto DNA in order to target the transfected plasmid and improve transgene expression, for example. This can facilitate a rational approach to improve the efficiency and efficacy of gene delivery by adding elements intended to increase nuclear uptake, facilitate endosomal escape, or target gene delivery to the cell surface or to intracellular receptors.

[0105] Incorporating a GeneGrip site into TAP enables peptide nucleic acids (PNAs) to be hybridized to the TAP gene product. Ligands can then be attached to the PNA in order to improve the bioavailability and DNA vaccine potency of the gene.

System for performing TAP method

[0106] In another embodiment of the invention, a system can be used to perform every step involved in generating TAP fragments from a Tb genome, and in particular the Mtb genome. Additionally, each individual step is capable of being controlled by a system. For example, a system can design customized PCR primers, obtain said primers, perform PCR reactions utilizing TAP technology, attach promoters and terminators, and attach sequences that encode linker molecules to the primary or expression fragment. The system can be either automated or non-automated. In one embodiment of the invention, the system comprises a computer program linked to robotic technologies for rapid and high throughput gene amplification of the genome.

Expression of the TAP fragment

[0107] TAP fragments can be used directly as templates in various expression systems in order to obtain the corresponding polypeptide for each coding sequence in the Mtb genome. The invention provides simple, efficient methods for generating TAP fragments from Mtb that can be readily transfected into animal cells or tissues by any nucleic acid transfection technique. The methods of the invention can avoid the need for subcloning into expression vectors and for purification of plasmid DNA from bacteria. As skilled artisans can appreciate, TAP fragments can be rapidly expressed using in vivo or in vitro (e.g., cell-free) expression systems. For example, the amplified TAP fragments can be directly transfected into a eukaryotic or prokaryotic cell for expression. Examples of eukaryotic cells that can be used for expression include mammalian, insect (e.g., Baculovirus expression systems), yeast (e.g., Pichia pastoris), and the like. An example of a prokaryotic cell expression system includes E. coli.

[0108] Alternatively, expression can be accomplished in cell-free systems, for example, a T7 transcription and translation system. Cell-free translation systems can include extracts from rabbit reticulocytes, wheat germ and Escherichia coli. These systems can be prepared as crude extracts containing the macromolecular components (30S, 70S or 80S ribosomes, tRNAs, aminoacyl-tRNA synthetases, initiation, elongation and termination factors, etc.) required for translation of exogenous RNA. To promote efficient translation, each extract can be supplemented with amino acids, energy sources (ATP, GTP), energy regenerating systems (creatine phosphate and creatine phosphokinase for eukaryotic systems, and phosphoenol pyruvate and pyruvate kinase for the E. coli lysate), and other co-factors (Mg²⁺, K⁺, etc.).

[0109] The use of TAP technology allows skilled artisans to rapidly express polypeptides from a plurality of polynucleotides. After a particular Mtb polynucleotide of interest is rendered transcriptionally active, other Mtb polynucleotides can also be made to be transcriptionally active according to the methods of the invention. Accordingly, in one embodiment of the invention, a plurality of polynucleotides from Mtb are amplified and expressed in order to generate a library or array of Mtb polypeptides.

[0110] Other embodiments of the invention relate to expressing the product of a Mtb polynucleotide that encodes an epitope tag, affinity tag or other tags, and which may function as linkers. A polynucleotide sequence encoding a linker molecule can be incorporated into a TAP primary fragment or a TAP expression fragment. Accordingly, the linker molecule can be expressed as a fusion to the Mtb polypeptide.

[0111] The generation of Mtb polypeptide libraries according to the methods of the invention allows skilled artisans to easily use them in subsequent research and study. For example, it is possible to organize the expressed Mtb polypeptides into an array for further analysis. The expressed polypeptide arrays can be screened in order to identify, for example, new vaccine and drug targets against microbial, neoplastic disease and the like. The expressed polypeptides can be used to screen antibody libraries, to develop unique research reagents, for functional proteomic studies, and the like. These steps can be rapidly accomplished at rates far exceeding traditional methods.

Adapter Technology

[0112] In addition to amplifying Mtb polynucleotides of interest using TAP technology, the present invention also encompasses amplifying Mtb polynucleotides using "adapter technology". In some embodiments adapter technology is performed using a one-step PCR reaction. The term "adapter technology" as used herein relates to methods of cloning a desired polynucleotide into a vector by flanking a desired nucleic acid sequence, a Mtb TAP fragment for example, with first and second adapter sequences. The resulting fragment can be contacted with the vector having sequences homologous to the first and second adapter sequences under conditions such that the nucleic acid fragment is incorporated into the vector by homologous recombination in vivo in a host cell. Accordingly, adapter technology allows for fast and enzyme-less cloning of nucleic acid fragments into vectors and can also be used for forced cloning selection for successful transformation. Adapter technology is described in more detail in U.S. Patent Application No. 09/836,436, entitled "Fast and Enzymeless Cloning of Nucleic Acid Fragments", U.S. Patent Application No. 10/125789, entitled "Rapid and Enzymeless Cloning of Nucleic Acid Fragments", and PCT Application No. PCT/US02/12334, all of which are hereby incorporated by reference in their entirety.

[0113] The nucleic acid fragment can be incorporated into any vector using adapter technology. In certain embodiments, the vector that the fragment is incorporated into can be, for example, a plasmid, a cosmid, a bacterial artificial chromosome (BAC), and the like. The plasmid can be ColE1, PR100, R2, pACYC, and the like. The vector can also include a functional selection marker. The functional selection marker can be, for example, a resistance gene for kanamycin, ampicillin, blasticidin, carbenicillin, tetracycline, chloramphenicol, and the like. The vector further can include a dysfunctional selection marker that lacks a critical element, wherein the critical element is supplied by said nucleic acid fragment upon successful homologous recombination. The dysfunctional selection marker can be, for example, a kanamycin resistance gene, ampicillin resistance gene, blasticidin resistance gene, carbenicillin resistance gene, tetracycline resistance gene, chloramphenicol resistance gene, and the like. Further, the dysfunctional selection marker can be, for example, a reporter gene, such as the lacZ gene, and the like.

[0114] The vector can include a negative selection element detrimental to host cell growth. The negative selection element can be disabled by said nucleic acid fragment upon successful homologous recombination. The negative selection element can be inducible. The negative selection element can be, for example, a mouse GATA-1 gene. The vector can include a dysfunctional selection marker and a negative selection element.

[0115] The host cell used in adapter technology can be a bacterium. The bacterium can be capable of in vivo recombination. Examples of suitable bacterial strains include JC8679, TB1, DHa, DH5, HB101, JM101, JM109, LE392, and the like. The plasmid can be maintained in the host cell under the selection condition selecting for the functional selection marker.

[0116] The first and second adapters can be any length sufficient to bind to the homologous sequences of the vector such that the desired nucleic acid sequence is incorporated into the vector. The first and second adapter sequences can be, for example, at least 11 bp, 12 bp, 13 bp, 14 bp, 15 bp, 16 bp, 17 bp, 18 bp, 19 bp, 20 bp, 21 bp, 22 bp, 23 bp, 24 bp, 25 bp, 26 bp, 27 bp, 28 bp, 29 bp, 30 bp, 31 bp, 32 bp, 33 bp, 34 bp, 35 bp, 36 bp, 37 bp, 38 bp, 40 bp, 50 bp, 60 bp and the like. Furthermore, the first and second adapter sequences can be greater than 60 bp.

[0117] The first and second adapter sequences further can include a functional element. The functional element can include a promoter, a terminator, a nucleic acid fragment encoding a selection marker gene, a nucleic acid encoding a linker molecule, a nucleic acid fragment encoding a known protein, a fusion tag, a nucleic acid fragment encoding a portion of a selection marker gene, a nucleic acid fragment encoding a growth promoting protein, a nucleic acid fragment encoding a transcription factor, a nucleic acid fragment encoding an autofluorescent protein (e.g. GFP), and the like.

[0118] When the common sequences on both the 5' and 3' ends of the nucleic acid fragment are complementary to terminal sequences in a linearized empty vector, and the fragment and linearized vector are introduced together into a host cell, by electroporation for example, they recombine, resulting in a new expression vector with the fragment directionally inserted. In alternative embodiments the host cell can include the linearized empty vector so that only the nucleic acid fragment is introduced into the host cell. It should be noted that in alternative embodiments of the present invention the vector can be circularized, and as used herein a vector can be either linearized or circular. Within the host cell, the vector is converted into an expression vector through homologous recombination. In principle this approach can be applied generally as an alternative to conventional cloning methods.
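The directional insertion described in paragraph [0118] can be modeled on sequence strings. The sketch below assumes exact matches between the adapter arms and the vector termini and represents the circular product as a linear string that starts at the first base of the vector; the function name `recombine` and the `arm_len` parameter are hypothetical, and real recombination tolerates conditions this model ignores.

```python
def recombine(fragment, vector, arm_len=20):
    """Model insertion of an adapter-flanked fragment into a linearized vector.

    `fragment` is read as first adapter + insert + second adapter.
    `vector` is the linearized vector, whose last `arm_len` bases must match
    the first adapter and whose first `arm_len` bases must match the second.
    Returns the circular product written linearly from the vector's first base.
    """
    if len(fragment) <= 2 * arm_len:
        raise ValueError("fragment is too short to carry two adapters")
    first_adapter = fragment[:arm_len]
    second_adapter = fragment[-arm_len:]
    if not vector.endswith(first_adapter):
        raise ValueError("first adapter does not match the vector 3' terminus")
    if not vector.startswith(second_adapter):
        raise ValueError("second adapter does not match the vector 5' terminus")
    insert = fragment[arm_len:-arm_len]
    # Each shared arm appears once in the product, contributed by the vector.
    return vector + insert
```

Because the two adapters differ, a fragment presented in the opposite orientation fails both checks, which is one way to picture why the insertion is directional.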

[0119] A nucleic acid fragment having first and second adapter sequences can be generated by methods well known to those of skill in the art. For example, a gene of interest with known 5' and 3' sequences undergoes PCR along with overlapping 5' and 3' priming oligonucleotides. The priming oligonucleotides can be obtained by methods known in the art, including manufacture by commercial suppliers. A primary fragment with adapter sequences can be generated. The adapter sequences flanking the gene of interest can be homologous to sequences on a vector or to sequences from other 5' or 3' fragments to be used in a subsequent PCR.

[0120] In some embodiments of the invention, a particular polynucleotide of interest from Mtb can be amplified with an adapter sequence on both the 3' and 5' ends. In other embodiments adapters can be attached to a plurality of polynucleotides, for example every coding region in the Mtb genome. In certain embodiments adapters can make the desired coding regions transcriptionally active. Once incorporated into the desired vector, the Mtb coding region can be rapidly replicated and expressed, such that a plurality of Mtb's genes, for example every gene, is expressed.

[0121] Pluralities of expression products can be stored in libraries or arrays and can be assayed for their immunogenic properties as will be discussed below. While most embodiments relating to the assay methodologies are discussed in terms of TAP technology, all of the following assays can be used on adapter technology expression products as well. Once the appropriate assays are conducted on the adapter technology expression products, methods of developing vaccines can be utilized. While most of the embodiments relating to developing immunogenic compositions and vaccines, discussed below, pertain to TAP technology, all of the vaccine embodiments can also be used with polypeptide libraries and arrays resulting from adapter technology.

Identifying Immunogenic Polypeptides

[0122] Libraries and arrays of polypeptides, prepared through TAP or adapter technology with subsequent expression, can be useful in the development of polypeptide or nucleic acid subunit vaccines. DNA vaccines are effective vaccines that are inexpensive to manufacture, easy and safe to deliver, and can be widely distributed. It has been found that plasmid DNA, when injected into mice without being associated with any adjuvant, can generate antibody and CTL responses to viral antigens encoded by the plasmid DNA, and elicit protective immunity against viral infection (Ulmer et al., Science, 259:1745, 1993). Since then, many studies have reported the induction of humoral and cellular immune responses resulting from the introduction of DNA vaccines containing various viral genes in animal models (Chow et al., J. Virol., 71:169, 1997; McClements et al., Proc. Natl. Acad. Sci. USA, 93:11414, 1996; Xiang et al., Virology, 199:132, 1994; Wang et al., Virology, 211:102, 1995; Lee et al., Vaccine, 17:473, 1999; Lee et al., J. Virol., 72:8430, 1998). In addition, DNA delivery by electroporation techniques has been well described (Heller et al., Expert Opin. Drug Deliv. 2(2):1-14, 2005).

[0123] One of the most difficult tasks in developing a DNA vaccine (or any recombinant subunit vaccine) to a pathogen such as Mtb is the identification of antigens that can stimulate the most effective immune response against the pathogen, particularly when the genome of the organism is large.

[0124] A comprehensive means to accomplish this task, which is embodied by the present invention, is to obtain a plurality of polypeptides from the particular pathogen in the mode of a library or array. These polypeptides can be tested to determine their capability to evoke a humoral and/or a cell-mediated immune response. Polypeptides that evoke immunogenic responses can be tested individually or with other antigens for effectiveness as subunit vaccines. In addition, nucleic acids that encode identified antigenic polypeptides can be used alone or with other nucleic acids that encode antigens to develop a recombinant vaccine, such as a DNA vaccine, for the particular pathogen.

Mtb Scanning

[0125] One embodiment of the invention incorporates a Rapid High-Throughput Vaccine Antigen Scanning approach, using TAP Express, that is able to systematically screen and identify all, substantially all, or a subset of the antigens in Mtb that give rise to a humoral and cell-mediated immune response. The identification of the Mtb antigens allows for the development of a highly specific subunit vaccine.

[0126] FIGURE 2 illustrates a method for amplifying multiple Mtb polynucleotides using TAP technology, expressing the gene products of the resultant TAP fragments, purifying, and quantifying the resulting polypeptides. FIGURE 2 further illustrates a method of preparing polypeptides, which can be assayed to identify their ability to evoke a cell-mediated or humoral immune response.

[0127] In certain methods of developing a Mtb vaccine, a plurality of Mtb polynucleotides can be made transcriptionally active. In one embodiment, all of the open reading frames from the Mtb genome can be made transcriptionally active using TAP technology. The present invention thus provides Mtb polynucleotides (SEQ ID NOs: 46-64, 110-121) or variants thereof that have been made transcriptionally active. In some embodiments, the variants can be polynucleotide sequences having at least 85%, 90%, 95%, 96%, 97%, 98% or 99% nucleic acid sequence identity to the nucleic acid sequences of SEQ ID NOs: 46-64, 110-121, and fragments thereof that encode antigenic or immunogenic polypeptides. In some embodiments, the Mtb polynucleotides can consist essentially of the nucleic acid of SEQ ID NOs: 46-64, 110-121, or fragments thereof that encode immunogenic or antigenic polypeptides.

[0128] The resulting Mtb TAP fragments of the present invention can be purified and expressed in vitro or in vivo according to any method known in the art. The expression products, which encompass SEQ ID NOs: 65-83, 122-133, or variants thereof, can be assayed by various methods to determine their ability to evoke a humoral and/or a cell-mediated immunogenic response. In some embodiments, the variants can be peptides and polypeptides that have at least 85%, 90%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to SEQ ID NOs: 65-83, 122-133, and fragments thereof that are antigenic or immunogenic. In some embodiments, the Mtb polypeptides can consist essentially of the amino acids of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof. Polypeptides that are identified as capable of evoking an immune response can be used as candidates to develop polynucleotide or polypeptide subunit vaccines. The complete method will be described in more detail below.
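The percent identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool (gaps written as "-") and counts identical residues over aligned columns; the function names and the choice of denominator are illustrative assumptions, since the specification does not prescribe a particular identity calculation and alignment programs differ on this point.

```python
from typing import List

# Thresholds recited in the specification for variant sequences.
IDENTITY_THRESHOLDS = (85, 90, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: the denominator is the number of aligned columns,
    excluding columns where both sequences carry a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        columns += 1
        if a == b and a != "-":
            matches += 1
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


def thresholds_met(identity: float) -> List[int]:
    """Return every recited threshold that the given identity satisfies."""
    return [t for t in IDENTITY_THRESHOLDS if identity >= t]
```

For example, two aligned 10-residue sequences that differ at one position score 90% and satisfy the 85% and 90% thresholds only.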

[0129] According to one embodiment of the present invention, TAP fragments from Mtb are used to generate a DNA array, and then, if desired, a protein array. In certain embodiments, primers are designed for every gene in the Mtb genome. In another embodiment, designing the primers allows a skilled artisan to make any given Mtb polynucleotide transcriptionally active using TAP technology. In yet another embodiment, coding regions, ORFs and other polynucleotide sequences of interest are ranked according to and the top Mtb polynucleotides are made transcriptionally active using TAP technology.

[0130] As mentioned above, the custom PCR primers can be designed by using an automated system, such as a computerized robotics system. For example, in order to design custom primers for use in the TAP process, a robotic workstation can be interfaced with a dual Pentium III CPU (1.4 GHz) computer running the Linux operating system. In addition, a customized MySQL database can manage the input sequence data from GenBank and from other sources. This database can track all the operations, samples and analytical data generated by the robot. In another embodiment, PCR primers, PCR products and polypeptides can be tracked by the database. For example, PCR primers, PCR products and polypeptides can be tracked by using bar coded 96-well plates. While the embodiments below discuss using 96-well plates in certain embodiments, those skilled in the art can appreciate that any sized well plate can be used. For example, the well plates can consist of about 48, about 96, about 144, about 192, about 240, about 288, about 336, about 384, about 432, about 480, about 576, about 672, about 768, about 864, about 960, about 1056, about 1152, about 1248, about 1344, about 1440, about 1536 or more wells. In addition to well plates, the PCR products and polypeptides can be tracked using any suitable receptacles, for example test tubes.
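The tracking described above can be sketched as a small relational schema. The sketch below is an illustration only: it uses Python's built-in sqlite3 module as a stand-in for the MySQL database named in the text, and the table names, column names, barcodes and gene labels are all hypothetical.

```python
import sqlite3


def create_tracking_db(path=":memory:"):
    """Create a hypothetical plate/sample tracking schema (sqlite3 stand-in)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # enforce plate references
    conn.executescript(
        """
        CREATE TABLE plate (
            barcode TEXT PRIMARY KEY,
            n_wells INTEGER NOT NULL,
            content TEXT NOT NULL
                CHECK (content IN ('primer', 'pcr_product', 'polypeptide'))
        );
        CREATE TABLE sample (
            id      INTEGER PRIMARY KEY,
            barcode TEXT NOT NULL REFERENCES plate(barcode),
            well    TEXT NOT NULL,
            gene    TEXT NOT NULL,
            UNIQUE (barcode, well)  -- one sample per well
        );
        """
    )
    return conn


def register_plate(conn, barcode, n_wells, content):
    """Record a bar-coded plate and the kind of material it holds."""
    with conn:
        conn.execute(
            "INSERT INTO plate (barcode, n_wells, content) VALUES (?, ?, ?)",
            (barcode, n_wells, content),
        )


def add_sample(conn, barcode, well, gene):
    """Record which gene's material sits in a given well of a plate."""
    with conn:
        conn.execute(
            "INSERT INTO sample (barcode, well, gene) VALUES (?, ?, ?)",
            (barcode, well, gene),
        )


def trace_gene(conn, gene):
    """List (content, barcode, well) for one gene, in the order recorded."""
    rows = conn.execute(
        """
        SELECT p.content, s.barcode, s.well
        FROM sample AS s JOIN plate AS p ON p.barcode = s.barcode
        WHERE s.gene = ?
        ORDER BY s.id
        """,
        (gene,),
    )
    return rows.fetchall()
```

Keying every sample to a plate barcode and well lets a single query follow one gene from primer plate to PCR product plate to polypeptide plate.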

[0131] Custom oligonucleotide pairs of the present invention (SEQ ID NOS: 8 and 9; 10 and 11; 12 and 13; 14 and 15; 16 and 17; 18 and 19; 20 and 21; 22 and 23; 24 and 25; 26 and 27; 28 and 29; 30 and 31; 32 and 33; 34 and 35; 36 and 37; 38 and 39; 40 and 41; 42 and 43; 44 and 45; 86 and 87; 88 and 89; 90 and 91; 92 and 93; 94 and 95; 96 and 97; 98 and 99; 100 and 101; 102 and 103; 104 and 105; 106 and 107; 108 and 109), which are needed for the TAP PCR reactions, can be synthesized or obtained in order to perform the TAP technology. Accordingly, some embodiments relate to primers that consist or consist essentially of SEQ ID NOs: 8-45 and/or 86-109. Some embodiments relate to a primer that hybridizes to Mtb sequences, wherein the primer is at least 12 residues in length and hybridizes under stringent conditions to at least 12 consecutive bases of a nucleic acid sequence of SEQ ID NOs: 8-45 and 86-109, or the complement thereof. In certain embodiments, the Mtb genome sequence data and primer design software (e.g., Primer3) can be used by the system to generate custom primer pairs for all, substantially all, or a subset of the genes in the Mtb genome. The primers can be organized into arrays of about 48, about 96, about 144, about 192, about 240, about 288, about 336, about 384, about 432, about 480, about 576, about 672, about 768, about 864, about 960, about 1056, about 1152, about 1248, about 1344, about 1440, or about 1536, or any number in between, of 5' primers and 3' primers according to polynucleotide size and GC content, such that PCR reaction conditions can be optimized on a plate by plate basis. The present invention further contemplates that sequences for each of the custom Mtb primer pairs can be sent to an oligonucleotide synthesis provider (e.g., MWG Biotech, Inc., High Point, NC) where they can be synthesized.
Synthesized primers can be organized and dispensed into bar-coded plates at a desired concentration, such as 100 pmole/μl, frozen and shipped to the practitioner. In one embodiment, 600 Mtb-specific PCR primers, which are capable of amplifying 300 Mtb coding sequences, are designed, generated, ordered, and organized.
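Grouping targets by size and GC content so that each plate shares similar PCR conditions can be sketched as follows. This is a minimal illustration under stated assumptions: targets are sorted by amplicon length first and GC fraction second (the text does not specify how the two criteria are combined), plates are filled in sorted order, and the function names and well labeling scheme are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def well_label(index, n_cols=12):
    """Map a 0-based well index to a label such as 'A1' (row-major order).

    Assumption: a 96-well plate laid out as 8 rows (A-H) by 12 columns.
    """
    row = chr(ord("A") + index // n_cols)
    col = index % n_cols + 1
    return f"{row}{col}"


def assign_plates(targets, wells_per_plate=96):
    """Group (gene_id, amplicon_sequence) targets into plates.

    Targets are ordered by amplicon length, then GC fraction, so that
    neighbouring wells (and therefore each plate) hold similar amplicons.
    Returns a list of plates; each plate maps well label -> gene_id.
    """
    ordered = sorted(targets, key=lambda t: (len(t[1]), gc_content(t[1])))
    plates = []
    for start in range(0, len(ordered), wells_per_plate):
        chunk = ordered[start:start + wells_per_plate]
        plates.append({well_label(i): gene for i, (gene, _) in enumerate(chunk)})
    return plates
```

With the 300 coding sequences mentioned above and 96 wells per plate, this arrangement yields three full plates and a fourth plate holding the remaining 12 targets.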

[0132] After obtaining or generating the custom Mtb PCR primers, the Mtb polynucleotides of interest can be amplified. In one embodiment, the primers can be organized into arrays of 96 5' primers and 96 3' primers according to polynucleotide size, and placed onto a robotic workstation. The robot can be programmed to generate a plate of about 48, about 96, about 144, about 192, about 240, about 288, about 336, about 384, about 432, about 480, about 576, about 672, about 768, about 864, about 960, about 1056, about 1152, about 1248, about 1344, about 1440, about 1536, or any number in between, PCR reactions by mixing the appropriate 5' and 3' primers with Taq polymerase and Mtb genomic DNA. In addition to Taq, any thermally stable polymerase can be used in the PCR reactions. For example, Vent, Pfu, Tfl, Tth, and Tgo polymerases can be used. The robotic workstation can transfer the PCR reaction plate containing the mixed reagents to a PCR machine for amplification. In one embodiment, the robotic workstation can use a robotic arm to transfer the PCR reaction plate to the PCR machine.

[0133] The first TAP PCR procedure can be run for any number of cycles. In one embodiment, the PCR machine is run for about 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, or more cycles. The first TAP PCR reactions can be transferred robotically to a Millipore Montage 96-well cleanup kit, for example, when desired. Any method, kit or system can, however, be used to purify the PCR products from these reactions. According to one embodiment, a vacuum station of the robotic platform can carry out the purification step. In some embodiments, an aliquot of the resulting product can be transferred robotically to an analysis plate containing the Pico-Green fluorescent probe (Molecular Probes, Eugene, OR) that reacts only with the dsDNA products. Depending on the number of wells, the plate can be transferred to an about 48, about 96, about 144, about 192, about 240, about 288, about 336, about 384, about 432, about 480, about 576, about 672, about 768, about 864, about 960, about 1056, about 1152, about 1248, about 1344, about 1440, about 1536 or more well fluorescent plate reader. The fluorescent signal can be compared to a standard curve to determine the amount of double stranded PCR product generated in this first PCR step. Persons with skill in the art can adjust the above methods in order to optimize their particular PCR reaction, should the need arise.
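Comparing a fluorescent signal to a standard curve can be sketched as an ordinary least-squares line fitted to dsDNA standards, which is then inverted for each sample well. The sketch assumes a linear response over the working range; the function names, and any concentrations or fluorescence readings passed to them, are hypothetical.

```python
def fit_standard_curve(concentrations, signals):
    """Least-squares fit of signal = slope * concentration + intercept.

    Assumption: the dye response is linear over the range of the standards.
    """
    n = len(concentrations)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired standards")
    mean_x = sum(concentrations) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def dna_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate dsDNA concentration in a well."""
    return (signal - intercept) / slope


def quantify_plate(signals_by_well, slope, intercept):
    """Estimate the dsDNA concentration for every well of a plate."""
    return {
        well: dna_from_signal(signal, slope, intercept)
        for well, signal in signals_by_well.items()
    }
```

Readings that fall outside the range covered by the standards are extrapolations and would normally be diluted and measured again rather than trusted.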

[0134] In addition to the first TAP PCR procedure, a second TAP PCR reaction can be performed to add at least one sequence that confers transcriptional activity to the TAP primary fragment. In one embodiment, a robot can be programmed to transfer an aliquot of each TAP primary fragment from the first TAP PCR reaction into a PCR reaction containing promoter- and terminator-containing primers. In a particular embodiment, the promoter can be a T7-His tag promoter sequence and the terminator can be a T7-His tag terminator sequence. Those with skill in the art can appreciate that any promoter or terminator sequence can be added to the primary transcript. In addition, any polynucleotide sequence that encodes a tag or linker allowing the expressed polypeptide to be detected or purified is also contemplated.

[0135] Like the first TAP PCR reaction, the second TAP PCR reaction can be run for any desired number of cycles. In one embodiment, the second TAP PCR reaction is run for about 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40 cycles or more. Furthermore, any type of thermally stable polymerase can be used for the second TAP PCR reaction. In a particular embodiment the polymerase can be Taq. In some embodiments Vent, Pfu, Tfl, Tth, and Tgo polymerases can be used. The resulting TAP Express PCR fragments from the second PCR reaction can be cleaned by any kit, method or system. A particular kit that can be used to clean the resulting TAP fragments is a Millipore Montage 96-well cleanup kit. Additionally, as discussed above, the level of PCR product recovered can be determined using any detection agent, for example, Pico-Green.

[0136] The resulting TAP fragments can be expressed by using any method of gene expression. In one embodiment, the TAP fragments can be expressed using in vivo or in vitro (e.g., cell-free) systems. For example, the fragments can be directly transfected into any eukaryotic or prokaryotic cell for expression. Examples of eukaryotic cells that can be used for expression include mammalian, insect, yeast, and the like. An example of a prokaryotic cell expression system is E. coli. The TAP fragments can also be expressed by a cell-free system. According to one embodiment of the invention, the resulting TAP fragments can be expressed in a high-throughput cell-free expression machine, such as, for example, the Roche RTS (Rapid Translation System) 100. In a further embodiment, the TAP fragments can be incubated in the Roche RTS 100 system at 30°C for 5 hours. A person with skill in the art can readily appreciate the utility in following the particular cell-free translation machine's instructions. If a T7-histidine promoter or terminator fragment is added to a primary transcript, translation of the TAP fragment can result in histidine tagged polypeptides, which can be purified as discussed below. As discussed herein, any tag can be used.

[0137] The expressed Mtb polypeptides can be purified using any purification method for purifying expressed polypeptides. In one embodiment histidine tagged polypeptides can be purified with Qiagen nickel columns, such as Ni-NTA Superflow 96 Biorobot Kit. A person with skill in the art can readily appreciate the utility in following the instructions of the particular polypeptide purification system. Other methods that can be used to purify polypeptides include ultrafiltration, extraction, and chromatography.

[0138] The identity, quantity and purity of the purified Mtb polypeptides can be verified by SDS gel electrophoresis. According to one embodiment of the invention, MALDI-TOF MS (Matrix Assisted Laser Desorption/Ionization-Time of Flight Mass Spectrometry) can be employed to confirm the fidelity of the purified polypeptides. According to this embodiment, aliquots of each polypeptide (1-2 μg) can be dispensed into about 48, about 96, about 144, about 192, about 240, about 288, about 336, about 384, about 432, about 480, about 576, about 672, about 768, about 864, about 960, about 1056, about 1152, about 1248, about 1344, about 1440, about 1536 or more well plates and digested with modified trypsin. The resulting material can be mixed with matrix (alpha-cyano-4-hydroxycinnamic acid (CHCA)) and spotted onto any target plate with a suitable number of spots, for example, about 48, about 96, about 144, about 192, about 240, about 288, about 336, about 384, about 432, about 480, about 576, about 672, about 768, about 864, about 960, about 1056, about 1152, about 1248, about 1344, about 1440, about 1536 or more spots. In one embodiment, a 384-spot "anchor chip" target plate (Bruker Daltonics, Billerica, MA) can be used. The plate can be transferred to the sample stage of a Bruker Autoflex MALDI-TOF mass spectrometer. The spectrometer can be set up to automatically scan the plate and search the Mascot polypeptide database via the Internet. Accordingly, a very rapid verification system can verify purity, identity, and quantity in less than a day, for example, depending on the amount of polypeptides. Purified Mtb polypeptides can be placed in libraries or organized into arrays for subsequent testing and analysis.

Humoral Immune Response

[0139] Mtb polypeptide libraries and arrays prepared, for example, according to the methods above (e.g., using TAP or adapter technology) can be used to identify antigenic targets of humoral immunity in Mtb non-human animals and human patients. A humoral immune response relates to the generation of antibodies and their ability to bind to a particular antigen. In general, the humoral immune system uses white blood cells (B-cells), which have the ability to recognize antigens, to generate antibodies that are capable of binding to the antigens.

[0140] In one embodiment, the Mtb polypeptides of the invention are generated according to the methods described above. In certain aspects of this embodiment, additional polynucleotide sequences that encode linker molecules are added to the TAP primary fragment or the TAP expression fragment such that the expressed Mtb polypeptides are fused to a linker molecule. As discussed previously, the term "linker molecule" encompasses molecules that are capable of immobilizing the polypeptides to a solid support.

[0141] In a particular embodiment, a Mtb polynucleotide of interest is fused to a HA epitope tag such that the expressed product can include the Mtb gene product fused to the HA epitope. In another embodiment, a Mtb polynucleotide of interest is combined with a histidine (His) coding sequence, such that the expressed product can include the Mtb gene product and a 6x, 7x, 8x, 9x, or 10x histidine tag. In other embodiments a Mtb polynucleotide is combined with a sequence that codes for a GST tag, fluorescent protein tag, or Flag tag. Using these methods it is possible to express and tag every polypeptide encoded by the Mtb genome. In another embodiment, the tagged Mtb polypeptide can be attached to a solid support, such as a 96-well plate. The immobilized polypeptides can be contacted with an antiserum or other fluid containing antibodies from an animal that has been immunized with one or more antigens from Mtb. In one embodiment, ELISA and Western blot assays are performed in parallel to detect the presence of immunogenic Mtb polypeptides.

[0142] As an example of an ELISA assay, tagged Mtb polypeptides can be immobilized on a solid support, such as a 96-well plate. The immobilized Mtb polypeptides are then incubated with serum from an animal that has been immunized with one or more antigens from Mtb, or has been infected directly with Mtb by inoculation, aerosol delivery, or the like. The reaction mixture can be washed to remove any unbound serum antibodies. The ability of the serum antibodies to bind to the bound Mtb polypeptides can then be detected using any one of a number of methods. For example, enzyme linked secondary antibodies can be added to detect the presence of an antigen specific antibody. Any enzyme linked secondary antibody can be used in this invention, depending on the source of the serum. For example, if vaccinated mouse serum is used to provide the primary antibody, enzyme linked anti-mouse antibody can be used as a secondary antibody. Likewise, if human serum is used to provide the primary antibody, enzyme linked anti-human antibody can be used as a secondary antibody.

[0143] Any suitable assay can be used to determine the amount of bound polypeptide specific antibody. Also, skilled artisans can develop the enzyme assay to determine the amount of polypeptide specific antibody that is bound. In one embodiment, the readout from an assay can show the presence of different levels of antibody in each of the 96 wells. For example, while some Mtb polypeptides are not able to elicit any serum antibodies, other Mtb polypeptides can elicit intermediate levels of antibodies, and some can elicit high antibody levels. In one embodiment, polypeptides that generate high antibody titers can be further researched to determine which polypeptides are present on the surface of the bacterium. In a particular embodiment of the invention, Mtb polypeptides that generate high antibody titers and that are located on the surface of the bacterium are candidates for use in the development of a subunit Mtb vaccine.
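Sorting wells into no, intermediate and high antibody levels can be sketched as a simple thresholding step. The cutoff rule used here (mean of the negative-control wells plus three standard deviations) and the multiplier that separates "intermediate" from "high" are illustrative assumptions, not values taken from the specification; the polypeptide labels are hypothetical.

```python
from statistics import mean, stdev


def classify_wells(od_by_polypeptide, negative_controls, high_factor=3.0):
    """Label each polypeptide's ELISA reading as 'none', 'intermediate' or 'high'.

    Assumptions (not from the specification):
      * positivity cutoff = mean(negative controls) + 3 * standard deviation
      * a reading at or above high_factor * cutoff counts as 'high'
    """
    if len(negative_controls) < 2:
        raise ValueError("need at least two negative-control wells")
    cutoff = mean(negative_controls) + 3 * stdev(negative_controls)
    high_cutoff = high_factor * cutoff
    labels = {}
    for name, od in od_by_polypeptide.items():
        if od <= cutoff:
            labels[name] = "none"
        elif od < high_cutoff:
            labels[name] = "intermediate"
        else:
            labels[name] = "high"
    return labels


def high_responders(labels):
    """Names of polypeptides labeled 'high', in sorted order."""
    return sorted(name for name, label in labels.items() if label == "high")
```

Polypeptides returned by `high_responders` would be the ones carried forward for the surface localization studies described above.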

[0144] In addition, serodiagnostic tests may be developed using antigens identified and characterized by these methods. That is, the peptide (epitopes) identified herein find use in detecting antibodies in serum from Mtb infected or exposed organisms, animals or patients.

[0145] FIGURE 3 demonstrates one embodiment of determining the humoral immune response generated by an array of polypeptides. One of skill in the art may deviate in certain details from those shown in FIGURE 3. For example, the HA tag, or any other tag as described above, may be placed at either the C-terminal or N-terminal end of the polypeptide to ensure that epitopes are not concealed due to binding to the plate. Instead of HA tagged polypeptides, a histidine tag can be used, and the polypeptides can be bound to nickel coated plates. For example, a 6x, 7x, 8x, 9x, or 10x histidine tag can be used. Alternatively, histidine tagged polypeptides can be purified from either transfected cells or from the in vitro transcription translation system. Furthermore, purified Mtb polypeptides can be attached non-specifically to polypeptide-absorbing plates such as Immulon plates, for example.

[0146] In one aspect of the present invention, immunogenic Mtb antigens are detected by comparing the results of Western blotting analysis with ELISA. Western blotting and ELISA are two independent yet complementary methods that may be used to detect immunogenic Mtb polypeptides in qualitative and quantitative ways. Western blotting is often used to examine the quality of a polypeptide or protein sample, including such parameters as purity, protein integrity, and degradation. Western blotting detects polypeptides in their denatured form. In one aspect of this embodiment, ELISA, which detects native polypeptides, is used to further examine Western-positive Mtb polypeptides in a more quantitative fashion, to illustrate the strength of the Mtb epitope's immunogenicity.

Cell-Mediated Immune Response

[0147] The TAP-expressed Mtb polypeptide libraries and arrays prepared according to the methods above (e.g., using TAP or adapter technology) can also be exploited to identify the immunogenic targets of cell-mediated immunity in Mtb vaccinated non-human animals. In contrast to a humoral immune response, where an antibody binds directly to an antigen, a cell-mediated immune response relates to T-cells binding to the surface of other cells that display the antigen. When certain T-cells come into contact with a presented antigen, they produce and release cytokines such as interferon-γ (IFN-γ) or Tumor Necrosis Factor-alpha (TNF-α). Cytokines are cellular signals that can alter the behavior or properties of another cell. For example, cytokines may inhibit viral replication, induce increased expression of MHC class I and peptide transporter molecules in infected cells, or activate macrophages. Accordingly, cytokines released by T-cells, associated with the binding to an antigen, can be used to identify and detect T-cell/antigen interactions.

[0148] Some cells have MHC molecules on their membranes to present antigens to T-cells. Efficient T-cell function relies on proper recognition of the MHC-antigen complex. There are two types of MHC molecules: Class I and Class II. The two different classes of MHC molecules bind peptides from different sources inside the cell for presentation at the cell surface to different classes of T-cells. Any T-cell can be used in the present invention, including, for example, both CD4⁺ and CD8⁺ T-cells. CD8⁺ cells (cytotoxic T-cells) bind epitopes that are part of class I MHC molecules. CD4⁺ T-cells, which include inflammatory CD4⁺ T-cells and helper CD4⁺ T-cells, bind epitopes that are part of class II MHC molecules. Only specialized antigen-presenting cells express class II molecules.

[0149] There are three main types of antigen-presenting cells: B cells, macrophages and dendritic cells. Each of these cell types is specialized to process and present antigens from different sources to T-cells, and two of them, the macrophages and the B cells, are also the targets of subsequent actions of armed effector T-cells. These three cell types can express the specialized co-stimulatory molecules that enable them to activate naive T-cells, although macrophages and B cells express those molecules only when suitably activated by infection.

[0150] Embodiments of the present invention relate to detecting Mtb polypeptides capable of evoking a cell-mediated immune response in order to identify potential candidates for use in a subunit vaccine or other pharmaceutical composition. According to one method of detecting a cell-mediated immune response, an Mtb polypeptide is delivered to an antigen-presenting cell where it can be presented in a manner that is recognized by antigen specific T-cells. In another embodiment of the invention, a transcriptionally active gene can be delivered to an antigen-presenting cell where it is expressed and presented in a manner that can be recognized by antigen specific T-cells. Mtb antigen specific T-cells can be acquired from numerous sources. For example, animals that have been infected, or immunized with one or more antigens from Mtb, are a good source of antigen specific T-cells. Alternatively, human Mtb patients and volunteers immunized with Mtb can be a source of antigen specific T-cells.

[0151] FIGURE 4 demonstrates one embodiment of determining the cell-mediated immune response generated by an array of polypeptides. One of skill in the art may deviate in certain details from those shown in FIGURE 4.

[0152] In order to test the ability of Mtb polypeptides to elicit a cell-mediated response, a plurality of Mtb polynucleotides can be amplified and made transcriptionally active using TAP technology. In one embodiment about 10, about 20, about 30, about 40, about 50, about 60, about 70, about 80, about 90, about 100, about 110, about 120, about 130, about 140, about 150, about 160, about 170, about 180, about 190, about 200, about 210, about 220, about 230, about 240, about 250, about 260, about 266, or any number in between, Mtb polynucleotides can be made transcriptionally active using TAP technology.

[0153] In one embodiment, transcriptionally active Mtb polynucleotides can be transfected into an antigen-presenting cell and expressed within the cell. In another embodiment, instead of transfecting the genes into an antigen-presenting cell, the Mtb TAP fragments can be expressed in an in vivo or in vitro (cell-free) expression system and the expressed polypeptide can be delivered into the antigen-presenting cell. The polypeptide can be delivered into the antigen-presenting cell according to any method. In one embodiment, the polypeptide can be delivered using the technology described in U.S. Patent Application No. 09/738046, entitled "Intracellular Protein Delivery Reagent" and U.S. Patent Application No. 10/141535, entitled "Intracellular Protein Delivery Compositions and Methods of Use," both of which are hereby incorporated by reference in their entirety. The reagents described therein are capable of delivering any type of polypeptide into any type of cell. Furthermore, the results of FIGURE 5 demonstrate that dendritic cells can present antigens to T-cells supplied from an immunized host after antigenic polypeptides were delivered to the dendritic cells with reagents from the above mentioned applications.

[0154] In certain embodiments of the invention, reagents used to deliver polypeptides into cultured cells can be a cationic lipid formulation. In one embodiment, these reagents can deliver fluorescently labeled antibodies, high and low molecular weight dextrans, phycoerythrin-BSA, caspase 3, caspase 8, granzyme B, and β-galactosidase into the cytoplasm of a variety of different adherent and suspension cells. Caspases delivered to cells with these reagents are functional, since they can be shown to send cells into apoptosis. In one embodiment, Mtb polypeptides are delivered into dendritic cells using these reagents.

[0155] Detecting a T-cell's ability to bind to an antigen-presenting cell, after the antigen-presenting cell has processed a particular polypeptide, is useful in determining whether the particular polypeptide evokes a cell-mediated immune response. Once a particular polypeptide is delivered into or expressed in the antigen-presenting cell, an assay can be performed to identify T-cell interaction with the MHC-antigen complex. In one embodiment, it can be determined if T-cells obtained from an animal that was immunized with Mtb can bind to a particular antigen presented by an antigen-presenting cell. For example, an ELIspot assay (Enzyme-Linked Immunospot; ELIspot) can be performed to identify antigen specific T-cells. Similar immunoassays can be performed to identify Mtb antigens (presented by antigen-presenting cells) that stimulate T-cells from active Mtb patients or immunized individuals.

[0156] One method of detecting a T-cell/antigen interaction is to measure the amount of a particular cytokine released by the T-cell when it interacts with a MHC-antigen complex. The skilled artisan can appreciate that other cellular signals can be used to indicate a cell-mediated immune response. In one embodiment, the levels of IFN-γ released by T-cells can indicate whether a particular peptide is capable of evoking a cell-mediated immune response. In a particular embodiment, an antibody specific for IFN-γ can be coated onto a solid support. Unbound antibodies can be washed away and IFN-γ obtained from the supernatant containing T-cells plus antigen-presenting cells or antigen transduced antigen-presenting cells can be added to the wells. A biotinylated secondary antibody specific for IFN-γ can be added. Excess secondary antibody can be removed and Streptavidin-Peroxidase can be added to the mixture. Streptavidin-Peroxidase is capable of binding to the biotinylated antibody to complete the four-member immunoassay "sandwich." Excess or unbound Streptavidin-Peroxidase is easily removed from the mixture. In order to detect the amount of bound Streptavidin-Peroxidase, a substrate solution can be added which reacts with the Streptavidin-Peroxidase to produce color. The intensity of the colored product is directly proportional to the concentration of IFN-γ present in the T-cell/antigen-presenting cell supernatant. Kits for performing these types of immunoassay are readily available from many commercial suppliers, or the necessary reagents composing such kits can be purchased separately or produced in-house. In one embodiment, a processed and presented Mtb polypeptide that evokes T-cells to produce a high level of IFN-γ can be considered a strong candidate for use in developing a subunit vaccine.
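Because the color intensity is described as directly proportional to the IFN-γ concentration, the readout can be sketched as a proportionality constant estimated from IFN-γ standards and then used to convert and rank sample wells. The sketch assumes blank-subtracted absorbance values and a line through the origin; the function names, polypeptide labels and any numbers supplied are hypothetical.

```python
def fit_proportionality(standard_conc, standard_absorbance):
    """Least-squares estimate of k in absorbance = k * concentration.

    Assumption: absorbances are already blank-subtracted, so the fitted
    line is constrained to pass through the origin.
    """
    if len(standard_conc) != len(standard_absorbance) or not standard_conc:
        raise ValueError("need paired, non-empty standards")
    denom = sum(c * c for c in standard_conc)
    if denom == 0:
        raise ValueError("standards must include a non-zero concentration")
    return sum(c * a for c, a in zip(standard_conc, standard_absorbance)) / denom


def ifn_gamma_concentrations(absorbance_by_polypeptide, k):
    """Convert each well's absorbance to an estimated IFN-γ concentration."""
    return {name: a / k for name, a in absorbance_by_polypeptide.items()}


def rank_candidates(absorbance_by_polypeptide, k):
    """Return (polypeptide, concentration) pairs, highest IFN-γ first."""
    conc = ifn_gamma_concentrations(absorbance_by_polypeptide, k)
    return sorted(conc.items(), key=lambda item: item[1], reverse=True)
```

The polypeptides at the top of the ranking correspond to those that evoke a high level of IFN-γ and would be the strongest subunit vaccine candidates under the criterion stated above.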

[0157] Those with skill in the art will appreciate that other methods can be used to detect T-cell/Antigen interactions. These methods include bead based assays, flow-based assays, RT-PCR based assays, cytokine ELISAs, lymphoproliferation assays, cytotoxic T cell assays, or any other assay that can detect the interaction of a T-cell with a responder cell (e.g. macrophage).

Developing a Subunit Vaccine, Pharmaceutical Composition, or Immunogenic Composition

[0158] A particular Mtb polypeptide that has been identified to elicit a humoral or cell-mediated immune response can be further explored to determine its ability to be used in a subunit vaccine, pharmaceutical composition, or immunogenic composition. The terms "subunit vaccine," "DNA vaccine," "recombinant vaccine" and "immunogenic composition" encompass vaccines that are comprised of polypeptides, nucleic acids or a combination of both. Further exploration of a Mtb polypeptide vaccine candidate includes testing the Mtb polypeptide or nucleic acid encoding the Mtb polypeptide in a large number of animal subjects, volunteers or patients. In a particular embodiment, surface antigens can be studied closely because of the likelihood that they can inhibit Mtb infectivity. In one embodiment, every polypeptide encoded by the Mtb genome is assayed to determine its immunogenic effect. Polypeptides that elicit an immune response, whether cell-mediated or humoral, can be more closely studied to determine potential use alone or in conjunction with other polypeptides and genes as a subunit vaccine, pharmaceutical composition, or immunogenic composition. Suitable methodologies for eliciting and detecting an immune response are well established in the art.

Uses of Immunogenic Compositions

[0159] As noted previously, the present invention provides peptide immunogens and nucleic acids encoding the immunogens. As such, the present invention also provides methods of using the immunogens to generate an immune response in a mammalian host.

[0160] Methods of generating immune responses in a host are known in the art. However, according to the present invention, the method includes administering to the host an immunogenic composition. The immunogenic composition includes at least one nucleic acid selected from SEQ ID NOs: 46-64 and/or 110-121. In some embodiments, the immunogenic composition includes at least one nucleic acid having 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% sequence identity to at least one nucleic acid selected from SEQ ID NOs: 46-64 and 110-121, wherein the nucleic acid encodes an antigenic and/or immunogenic Mtb epitope. In addition, fragments of these sequences can be used. Also, it should be noted that combinations of these sequences may be used to generate an immune response against Mtb. When using nucleic acids to generate an immune response, the nucleic acids preferably encode peptides found in SEQ ID NOs: 65-83 and/or 122-133. In addition, fragments of these sequences can be used. Also, combinations of these sequences can be used.

[0161] When combinations of the above immunogenic compositions are to be used at least 2, 3, 4 or 5 or more of the nucleic acids or fragments thereof can be combined to generate an immunogenic composition. Any combination of the nucleic acids finds use in this method.

[0162] Also, methods of generating an immune response include administering to the host at least one peptide selected from the peptides found in SEQ ID NOs: 65-83 and/or 122-133, or peptides that share 85%, 90%, 95%, 96%, 97%, 98% or 99% amino acid sequence identity to SEQ ID NOs: 65-83 and 122-133, and fragments thereof that are antigenic or immunogenic. In some embodiments, Mtb polypeptides that consist essentially of the amino acids of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof are administered to the host. In addition, fragments of these sequences can be used. Also, it should be noted that combinations of these sequences may be used to generate an immune response against Mtb. When combinations of the above immunogenic compositions are to be used at least 2, 3, 4 or 5 or more of the nucleic acids or fragments thereof can be combined to generate an immunogenic composition. Any combination of screened nucleic acids finds use in this method.

Kits

[0163] Various nucleic acids and peptides have been identified that generate an immune response. As such, the nucleic acids and peptides find use in kits. The kits of the invention are useful for a variety of applications including combining reagents necessary for producing immunogenic compositions and/or vaccine compositions. Such immunogenic compositions and/or vaccine compositions include the polypeptides and polynucleotides described herein as well as carriers, diluents and other pharmaceutically acceptable carriers. It should be noted, as described above, that the kits may include fragments of the nucleic acids or peptides described herein as well as combinations of the nucleic acids and/or peptides described herein. Preferably the kits include at least 2, 3, 5, 10, 15, 20, 25, 30 or more nucleic acids or peptides described herein. Any combination of the nucleic acids or peptides can be used. In addition, the kits may include adjuvants. In addition, the kits may include instructions for preparing and administering the immunogenic compositions or vaccines.

[0164] In addition, the kits of the invention find use as diagnostic kits. In particular, the kits find use as serodiagnostic kits. As such, the kits include at least one peptide as described herein. Preferably, however, the kits include a plurality of peptides, such as at least 2, 3, 5, 10, 15 or 20 or more peptides for diagnosis of Mtb infection or exposure of an organism, animal or patient.

[0165] In some embodiments, the nucleic acids encoding the polypeptides find use in diagnostic kits. The nucleic acids encoding the antigenic peptides find use as probes to detect complementary nucleic acids of Mtb. However, in an alternative embodiment, the kits include the polypeptides produced from the in vitro transcription-translation reaction, which find use in detecting antibodies from an organism, animal or patient exposed to Mtb.

EXAMPLES

Example 1. Procedure for generating histidine tagged TAP express fragments

[0166] A detailed procedure that is used to produce tagged T7-TAP Express fragments is as follows: 96 different genes were amplified from a mixture of plasmid templates. A first PCR reaction was run with customized 5' and 3' primers. The 5' primers contained between 43-48 bases. In particular, the T7-His TAP ends contained 28 bases while the gene-specific component contained between 15-20 bases. The 3' primers contained between 45-50 bases. Specifically, the T7-terminator TAP ends contained 30 bases while the gene specific component contained between 15-20 bases. The reaction temperature and times for the first PCR reaction were: 94 °C for 2 minutes, followed by 28 cycles of: 94 °C for 20 seconds, 58 °C for 35 seconds, and 70 °C for 2 minutes (for genes that contained more than 2kb, 1 minute was added for each kb).
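The cycling conditions of the first PCR reaction can be summarized in a short sketch. The temperatures, times and cycle count are taken from the paragraph above. The reading of the extension rule is an assumption: one extra minute is added for each started kb beyond 2 kb. The function names and the dictionary layout are hypothetical.

```python
import math


def extension_minutes(gene_length_bp: int) -> int:
    """Extension time at 70 C in minutes.

    Base time is 2 minutes. Assumption: for genes longer than 2 kb,
    1 minute is added per started kb above 2 kb.
    """
    base_minutes = 2
    if gene_length_bp <= 2000:
        return base_minutes
    extra_kb = math.ceil((gene_length_bp - 2000) / 1000)
    return base_minutes + extra_kb


def first_pcr_program(gene_length_bp: int) -> dict:
    """First PCR reaction as (temperature in C, duration in seconds) steps."""
    return {
        "initial_denaturation": (94, 120),
        "cycles": 28,
        "steps": [
            (94, 20),  # denaturation
            (58, 35),  # annealing
            (70, extension_minutes(gene_length_bp) * 60),  # extension
        ],
    }
```

Under this reading, an 813 bp gene uses the 2 minute extension, while a 3 kb gene would use 3 minutes.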

[0167] After the first PCR reaction was performed, an aliquot of each PCR reaction from the previous step was transferred into a PCR reaction containing the T7-histidine promoter fragment and T7 terminator fragment. The T7 promoter primer contained 25 bases, while the T7-promoter-His tag fragment contained a 104 base EcoRV/BglII fragment. The T7-terminator fragment was a 74 base oligonucleotide. The reaction temperature and times for the second PCR reaction were: 94 °C for 2 minutes, followed by 30 cycles of: 94 °C for 20 seconds, 60 °C for 35 seconds, and 70 °C for 2 minutes (for genes that contained more than 2kb, 1 minute was added for each kb).

Example 2. Using the Mtb proteome to identify the antigenic targets of humoral immunity in Mtb mice and humans.

[0168] The following is a method used to systematically screen and identify antigens in Mtb that give rise to a protective humoral immune response. A bioinformatics approach was used to order the M. tuberculosis polynucleotide sequences for amplification. The Mtb genome was first analyzed for hydrophobicity by the method of Doolittle. Hydrophilic polynucleotide sequences were then further grouped by size. Hydrophilic open reading frames/coding regions longer than 500 bp were selected for TAP amplification. Initially, three hundred Mtb genes were synthesized by TAP and ~100 proteins were translated and purified in arrays, as described below.

TAP PCR

[0169] The PCR reactions were performed such that a nucleotide sequence encoding a 6xHis tag was fused to these amplified transcriptionally active genes. The resulting His tagged TAP fragments were expressed to produce ~100 Mtb polypeptides containing the His tag.
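The hydrophobicity and size selection described in paragraph [0168] can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used: it assumes that "the method of Doolittle" refers to the Kyte-Doolittle hydropathy scale, that a protein counts as hydrophilic when its mean hydropathy is below zero, and that the 500 bp cutoff applies to the coding sequence length. The function names and the example open reading frames are hypothetical.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter code).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(protein: str) -> float:
    """Average Kyte-Doolittle value over all recognized residues."""
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not values:
        raise ValueError("no recognized amino acids in sequence")
    return sum(values) / len(values)


def select_for_tap(orfs, min_length_bp=500, max_hydropathy=0.0):
    """Return names of ORFs that are hydrophilic and longer than the cutoff.

    orfs: iterable of (name, coding_length_bp, protein_sequence).
    The zero hydropathy cutoff is an assumption made for illustration.
    """
    return [
        name
        for name, length_bp, protein in orfs
        if length_bp > min_length_bp and mean_hydropathy(protein) < max_hydropathy
    ]


# Hypothetical ORFs: only orfA is both hydrophilic and longer than 500 bp.
example_orfs = [
    ("orfA", 600, "KDEKRDEK"),
    ("orfB", 600, "IVLLIVAL"),
    ("orfC", 300, "KDEK"),
]
selected = select_for_tap(example_orfs)
```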

[0170] A detailed procedure that was used to produce tagged T7-TAP Express fragments is as follows: groups of 96 Mtb polynucleotide sequences were amplified from Mtb genomic DNA. A first PCR reaction was performed using customized 5' and 3' primers, as shown in Table 1 (SEQ ID NOS: 8 and 9; 10 and 11; 12 and 13; 14 and 15; 16 and 17; 18 and 19; 20 and 21; 22 and 23; 24 and 25; 26 and 27; 28 and 29; 30 and 31; 32 and 33; 34 and 35; 36 and 37; 38 and 39; 40 and 41; 42 and 43; 44 and 45; 86 and 87; 88 and 89; 90 and 91; 92 and 93; 94 and 95; 96 and 97; 98 and 99; 100 and 101; 102 and 103; 104 and 105; 106 and 107; 108 and 109). The 5' primers contained between 43-48 bases. In particular, the T7-His TAP ends contained 28 bases while the gene-specific component contained between 15-20 bases. The 3' primers contained between 45-50 bases. Specifically, the T7-terminator TAP ends contained 30 bases while the gene specific component contained between 15-20 bases.
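The relationship between a coding sequence and its primer pair can be sketched as follows. The two adapter strings are not stated in the text; they are read off the primers listed in Table 1, where every 5' primer begins with one shared sequence and every 3' primer with another. The default gene-specific lengths (15 bases at the 5' end, 20 bases at the 3' end) are likewise taken from the Table 1 entries. The function names are hypothetical.

```python
# Shared leading sequences observed in the Table 1 primers (assumption:
# these act as the TAP ends; the text does not spell them out).
FIVE_PRIME_ADAPTER = "GAAGGAGATATACCATGCATCATCATCATCATCAT"
THREE_PRIME_ADAPTER = "TGATGATGAGAACCCCCCCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tap_primers(coding_sequence: str, n_five: int = 15, n_three: int = 20):
    """Build (5' primer, 3' primer) for a coding sequence.

    The 5' primer is the adapter plus the first n_five bases of the gene;
    the 3' primer is the adapter plus the reverse complement of the last
    n_three bases. Whitespace in the input is ignored.
    """
    gene = "".join(coding_sequence.split()).upper()
    five = FIVE_PRIME_ADAPTER + gene[:n_five]
    three = THREE_PRIME_ADAPTER + reverse_complement(gene[-n_three:])
    return five, three


# Start and end of the Rv2031c coding sequence from Table 1; the middle
# of the gene is omitted because only the two ends affect the primers.
rv2031c_ends = "ATGGCCACCACCCTTCCCGTTCAG" + "AAGCACATTCAGATCCGGTCCACCAAC"
five_primer, three_primer = tap_primers(rv2031c_ends)
```

Applied to the Rv2031c sequence, this reproduces the primers listed in Table 1 as SEQ ID NO: 8 and SEQ ID NO: 9.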

Table 1: Immunogenic Mtb Polypeptides: Primers, Polynucleotide Sequences, and Amino Acid Sequences

All polynucleotide sequences are shown in the 5' to 3' orientation

Rv2031c

HEAT SHOCK PROTEIN HSPX (ALPHA-CRYSTALLIN HOMOLOG) (14 kDa ANTIGEN) (HSP16.3)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGGCCACCACCCTT (SEQ ID NO: 8)

3 ' primer :

TGATGATGAGAACCCCCCCCGTTGGTGGACCGGATCTGAA (SEQ ID NO : 9)

Polynucleotide sequence:

ATGGCCACCACCCTTCCCGTTCAGCGCCACCCGCGGTCCCTCTTCCCCGAGTTTTCTGAG

CTGTTCGCGGCCTTCCCGTCATTCGCCGGACTCCGGCCCACCTTCGACACCCGGTTGATG

CGGCTGGAAGACGAGATGAAAGAGGGGCGCTACGAGGTACGCGCGGAGCTTCCCGGGGTC

GACCCCGACAAGGACGTCGACATTATGGTCCGCGATGGTCAGCTGACCATCAAGGCCGAG CGCACCGAGCAGAAGGACTTCGACGGTCGCTCGGAATTCGCGTACGGTTCCTTCGTTCGC ACGGTGTCGCTGCCGGTAGGTGCTGACGAGGACGACATTAAGGCCACCTACGACAAGGGC ATTCTTACTGTGTCGGTGGCGGTTTCGGAAGGGAAGCCAACCGAAAAGCACATTCAGATC CGGTCCACCAAC 435 bp(SEQ ID NO: 46)

Amino acid sequence:

MATTLPVQRHPRSLFPEFSELFAAFPSFAGLRPTFDTRLMRLEDEMKEGRYEVRAELPGV DPDKDVDIMVRDGQLTIKAERTEQKDFDGRSEFAYGSFVRTVSLPVGADEDDIKATYDKG ILTVSVAVSEGKPTEKHIQIRSTN(SEQ ID NO: 65)

Rv3763

19 KDA LIPOPROTEIN ANTIGEN PRECURSOR LPQH

5 ' primer :

GAAGGAGATATACCATGCATCATCATCATCATCATGTGAAGCGTGGACTG (SEQ ID NO: 10)

3' primer:

TGATGATGAGAACCCCCCCCGGAACAGGTCACCTCGATTT (SEQ ID NO : 11)

Polynucleotide sequence:

GTGAAGCGTGGACTGACGGTCGCGGTAGCCGGAGCCGCCATTCTGGTCGCAGGTCTTTCC GGATGTTCAAGCAACAAGTCGACTACAGGAAGCGGTGAGACCACGACCGCGGCAGGCACG ACGGCAAGCCCCGGCGCCGCCTCCGGGCCGAAGGTCGTCATCGACGGTAAGGACCAGAAC GTCACCGGCTCCGTGGTGTGCACAACCGCGGCCGGCAATGTCAACATCGCGATCGGCGGG GCGGCGACCGGCATTGCCGCCGTGCTCACCGACGGCAACCCTCCGGAGGTGAAGTCCGTT GGGCTCGGTAACGTCAACGGCGTCACGCTGGGATACACGTCGGGCACCGGACAGGGTAAC GCCTCGGCAACCAAGGACGGCAGCCACTACAAGATCACTGGGACCGCTACCGGGGTCGAC ATGGCCAACCCGATGTCACCGGTGAACAAGTCGTTCGAAATCGAGGTGACCTGTTCC 480 bp (SEQ ID NO:47)

Amino acid sequence:

VKRGLTVAVAGAAILVAGLSGCSSNKSTTGSGETTTAAGTTASPGAASGPKVVIDGKDQN

VTGSVVCTTAAGNVNIAIGGAATGIAAVLTDGNPPEVKSVGLGNVNGVTLGYTSGTGQGN

ASATKDGSHYKITGTATGVDMANPMSPVNKSFEIEVTCS (SEQ ID NO : 66 )

Rv2744c

CONSERVED 35 KDA ALANINE RICH PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGGCCAATCCGTTC (SEQ ID NO: 12)

3' primer:

TGATGATGAGAACCCCCCCCCTGACCGTAGGGCTGCTCGG (SEQ ID NO : 13 )

Polynucleotide sequence:

ATGGCCAATCCGTTCGTTAAAGCCTGGAAGTACCTCATGGCGCTGTTCAGCTCGAAGATC GACGAGCATGCCGACCCCAAGGTGCAGATTCAACAGGCCATTGAGGAAGCACAGCGCACC CACCAAGCGCTGACTCAACAGGCGGCGCAAGTGATCGGTAACCAGCGTCAATTGGAGATG CGACTCAACCGACAGCTGGCGGACATCGAAAAGCTTCAGGTCAATGTGCGCCAAGCCCTG ACGCTGGCCGACCAGGCCACCGCCGCCGGAGACGCTGCCAAGGCCACCGAATACAACAAC GCCGCCGAGGCGTTCGCAGCCCAGCTGGTGACCGCCGAGCAGAGCGTCGAAGACCTCAAG ACGCTGCATGACCAGGCGCTTAGCGCCGCAGCTCAGGCCAAGAAGGCCGTCGAACGAAAT GCGATGGTGCTGCAGCAGAAGATCGCCGAGCGAACCAAGCTGCTCAGCCAGCTCGAGCAG GCGAAGATGCAGGAGCAGGTCAGCGCATCGTTGCGGTCGATGAGTGAGCTCGCCGCGCCA GGCAACACGCCGAGCCTCGACGAGGTGCGCGACAAGATCGAGCGTCGCTACGCCAACGCG ATCGGTTCGGCTGAACTTGCCGAGAGTTCGGTGCAGGGCCGGATGCTCGAGGTGGAGCAG GCCGGGATCCAGATGGCCGGTCATTCACGGTTGGAACAGATCCGCGCATCGATGCGCGGT GAAGCGTTGCCGGCCGGCGGGACCACGGCTACCCCCAGACCGGCCACCGAGACTTCTGGC GGGGCTATTGCCGAGCAGCCCTACGGTCAG 813 bp (SEQ ID NO: 48)

Amino acid sequence:

MANPFVKAWKYLMALFSSKIDEHADPKVQIQQAIEEAQRTHQALTQQAAQVIGNQRQLEM

RLNRQLADIEKLQVNVRQALTLADQATAAGDAAKATEYNNAAEAFAAQLVTAEQSVEDLK

TLHDQALSAAAQAKKAVERNAMVLQQKIAERTKLLSQLEQAKMQEQVSASLRSMSELAAP

GNTPSLDEVRDKIERRYANAIGSAELAESSVQGRMLEVEQAGIQMAGHSRLEQIRASMRG

EALPAGGTTATPRPATETSGGAIAEQPYGQ(SEQ ID NO: 67)

Rv0097

POSSIBLE OXIDOREDUCTASE

5 ' primer :

GAAGGAGATATACCATGCATCATCATCATCATCATATGACGCTTAAGGTC (SEQ ID NO : 14 )

3' primer:

TGATGATGAGAACCCCCCCCTGCCGCGTATCCCGGCGTCT (SEQ ID NO: 15)

Polynucleotide sequence:

ATGACGCTTAAGGTCAAAGGCGAGGGACTCGGTGCGCAGGTCACAGGGGTCGATCCC

AAGAATCTGGACGATATAACCACCGACGAGATCCGGGATATCGTTTACACGAACAAGCTC

GTTGTGCTAAAAGACGTCCATCCGTCTCCGCGGGAGTTCATCAAACTCGGCAGGATAATT

GGACAAATCGTTCCGTATTACGAACCCATGTACCATCACGAAGACCACCCGGAGATCTTT

GTCTCCTCCACTGAGGAAGGTCAGGGGGTCCCAAAAACCGGCGCGTTCTGGCATATCGAC

TATATGTTTATGCCGGAACCTTTCGCGTTTTCCATGGTGCTGCCGCTGGCGGTGCCTGGA

CACGACCGCGGGACCTATTTCATCGATCTCGCCAGGGTCTGGCAGTCGCTGCCCGCCGCC

AAGCGAGACCCGGCCCGCGGAACCGTCAGCACCCACGACCCTCGACGCCACATCAAGATC

CGACCCAGCGACGTCTACCGGCCCATCGGAGAGGTATGGGACGAGATCAACCGGACCACG

CCCCCAATAAAGTGGCCTACGGTCATCCGGCACCCAAAGACCGGCCAAGAGATCCTCTAC

ATCTGCGCGACGGGCACCACCAAGATCGAGGACAAGGACGGCAATCCGGTTGATCCGGAG

GTGCTGCAAGAACTCATGGCCGCGACCGGACAGCTCGATCCTGAGTACCAGTCGCCGTTC

ATACATACTCAGCACTACCAGGTTGGCGACATCATCTTGTGGGACAACCGGGTTCTCATG

CACCGAGCGAAGCACGGCAGCGCCGCGGGCACTCTGACGACCTACCGCCTGACCATGCTT

GATGGCCTCAAGACGCCGGGATACGCGGCA 870 bp (SEQ ID NO: 49)

Amino acid sequence:

MTLKVKGEGLGAQVTGVDPKNLDDITTDEIRDIVYTNKLVVLKDVHPSPREFIKLGRIIG

QIVPYYEPMYHHEDHPEIFVSSTEEGQGVPKTGAFWHIDYMFMPEPFAFSMVLPLAVPGH

DRGTYFIDLARVWQSLPAAKRDPARGTVSTHDPRRHIKIRPSDVYRPIGEVWDEINRTTP PIKWPTVIRHPKTGQEILYICATGTTKIEDKDGNPVDPEVLQELMAATGQLDPEYQSPFI HTQHYQVGDIILWDNRVLMHRAKHGSAAGTLTTYRLTMLDGLKTPGYAA (SEQ ID NO: 68)

Rv0475

IRON-REGULATED HEPARIN BINDING HEMAGGLUTININ HBHA (ADHESIN)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGGCTGAAAACTCG (SEQ ID NO: 16)

3 ' primer:

TGATGATGAGAACCCCCCCCCTTCTGGGTGACCTTCTTGG (SEQ ID NO : 17 )

Polynucleotide sequence:

ATGGCTGAAAACTCGAACATTGATGACATCAAGGCTCCGTTGCTTGCCGCGCTTGGAG CGGCCGACCTGGCCTTGGCCACTGTCAACGAGTTGATCACGAACCTGCGTGAGCGTGCGG AGGAGACTCGTACGGACACCCGCAGCCGGGTCGAGGAGAGCCGTGCTCGCCTGACCAAGC TGCAGGAAGATCTGCCCGAGCAGCTCACCGAGCTGCGTGAGAAGTTCACCGCCGAGGAGC TGCGTAAGGCCGCCGAGGGCTACCTCGAGGCCGCGACTAGCCGGTACAACGAGCTGGTCG AGCGCGGTGAGGCCGCTCTAGAGCGGCTGCGCAGCCAGCAGAGCTTCGAGGAAGTGTCGG CGCGCGCCGAAGGCTACGTGGACCAGGCGGTGGAGTTGACCCAGGAGGCGTTGGGTACGG TCGCATCGCAGACCCGCGCGGTCGGTGAGCGTGCCGCCAAGCTGGTCGGCATCGAGCTGC CTAAGAAGGCTGCTCCGGCCAAGAAGGCCGCTCCGGCCAAGAAGGCCGCTCCGGCCAAGA AGGCGGCGGCCAAGAAGGCGCCCGCGAAGAAGGCGGCGGCCAAGAAGGTCACCCAGAAG 600 bp (SEQ ID NO: 50)

Amino acid sequence:

MAENSNIDDIKAPLLAALGAADLALATVNELITNLRERAEETRTDTRSRVEESRARLTKL QEDLPEQLTELREKFTAEELRKAAEGYLEAATSRYNELVERGEAALERLRSQQSFEEVSA RAEGYVDQAVELTQEALGTVASQTRAVGERAAKLVGIELPKKAAPAKKAAPAKKAAPAKK AAAKKAPAKKAAAKKVTQK (SEQ ID NO: 69)

Rv3117

PROBABLE THIOSULFATE SULFURTRANSFERASE CYSA3 (RHODANESE-LIKE PROTEIN) (THIOSULFATE CYANIDE TRANSSULFURASE) (THIOSULFATE THIOTRANSFERASE)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGGCACGCTGCGAT (SEQ ID NO: 18)

3' primer:

TGATGATGAGAACCCCCCCCGCTTCCCAACTCGATCGGGG (SEQ ID NO : 19)

Polynucleotide sequence:

ATGGCACGCTGCGATGTCCTGGTCTCCGCCGACTGGGCTGAGAGCAATCTGCACGCGCCG

AAGGTCGTTTTCGTCGAAGTGGACGAGGACACCAGTGCATATGACCGTGACCATATTGCC

GGCGCGATCAAGTTGGACTGGCGCACCGACCTGCAGGATCCGGTCAAACGTGACTTCGTC

GACGCCCAGCAATTCTCCAAGCTGCTGTCCGAGCGTGGCATCGCCAACGAGGACACGGTG

ATCCTGTACGGCGGCAACAACAATTGGTTCGCCGCCTACGCGTACTGGTATTTCAAGCTC

TACGGCCATGAGAAGGTCAAGTTGCTCGACGGCGGCCGCAAGAAGTGGGAGCTCGACGGA

CGCCCGCTGTCCAGCGACCCGGTCAGCCGGCCGGTGACCTCCTACACCGCCTCCCCGCCG

GATAACACGATTCGGGCATTCCGCGACGAGGTCCTGGCGGCCATCAACGTCAAGAACCTC

ATCGACGTGCGCTCTCCCGACGAGTTCTCCGGCAAGATCCTGGCCCCCGCGCACCTGCCG

CAGGAACAAAGCCAGCGGCCCGGACACATTCCTGGTGCCATCAACGTGCCGTGGAGCAGG

GCCGCCAACGAGGACGGCACCTTCAAGTCCGATGAGGAGTTGGCCAAGCTTTACGCCGAC GCCGGCCTAGACAACAGCAAGGAAACGATTGCCTACTGCCGAATCGGGGAACGGTCCTCG CACACCTGGTTCGTGTTGCGGGAATTACTCGGACACCAAAACGTCAAGAACTACGACGGC AGTTGGACAGAATACGGCTCCCTGGTGGGCGCCCCGATCGAGTTGGGAAGC 834 bp (SEQ ID NO: 51)

Amino acid sequence:

MARCDVLVSADWAESNLHAPKVVFVEVDEDTSAYDRDHIAGAIKLDWRTDLQDPVKRDFV

DAQQFSKLLSERGIANEDTVILYGGNNNWFAAYAYWYFKLYGHEKVKLLDGGRKKWELDG

RPLSSDPVSRPVTSYTASPPDNTIRAFRDEVLAAINVKNLIDVRSPDEFSGKILAPAHLP

QEQSQRPGHIPGAINVPWSRAANEDGTFKSDEELAKLYADAGLDNSKETIAYCRIGERSS

HTWFVLRELLGHQNVKNYDGSWTEYGSLVGAPIELGS (SEQ ID NO : 70 )

Rv1347c

CONSERVED HYPOTHETICAL PROTEIN

5 ' primer :

GAAGGAGATATACCATGCATCATCATCATCATCATATGACCAAACCCACA (SEQ ID NO:20)

3' primer:

TGATGATGAGAACCCCCCCCCGCAGCCGTGGTCGGAGCTT (SEQ ID NO: 21)

Polynucleotide sequence:

ATGACCAAACCCACATCCGCTGGCCAGGCCGACGACGCGCTGGTTCGGCTAGCCCGCGAG

CGATTCGACCTACCTGACCAGGTACGACGCCTCGCCCGCCCGCCCGTTCCATCGTTGGAG

CCGCCATACGGGTTGCGGGTCGCACAGCTGACCGACGCGGAGATGTTGGCGGAGTGGATG

AACCGTCCTCATCTGGCGGCGGCCTGGGAGTACGACTGGCCGGCGTCACGTTGGCGTCAA

CACCTGAACGCCCAACTTGAGGGAACCTATTCGTTGCCATTGATCGGCAGCTGGCACGGA

ACAGATGGTGGTTATCTCGAATTATACTGGGCAGCAAAGGATTTGATTTCTCACTACTAC

GACGCAGACCCCTACGATTTGGGGCTGCACGCGGCCATCGCGGACTTGTCGAAGGTCAAT

CGGGGCTTCGGCCCGCTGCTGCTACCGCGGATCGTGGCCAGCGTCTTTGCCAACGAGCCG

CGTTGCCGGCGGATCATGTTCGACCCCGATCACCGCAACACCGCGACCCGTCGGTTGTGT

GAGTGGGCCGGATGCAAGTTCCTCGGTGAGCATGACACGACAAACCGGCGCATGGCGCTC

TACGCTTTGGAAGCTCCGACCACGGCTGCG 633 bp (SEQ ID NO: 52)

Amino acid sequence:

MTKPTSAGQADDALVRLARERFDLPDQVRRLARPPVPSLEPPYGLRVAQLTDAEMLAEWM NRPHLAAAWEYDWPASRWRQHLNAQLEGTYSLPLIGSWHGTDGGYLELYWAAKDLISHYY DADPYDLGLHAAIADLSKVNRGFGPLLLPRIVASVFANEPRCRRIMFDPDHRNTATRRLC EWAGCKFLGEHDTTNRRMALYALEAPTTAA(SEQ ID NO: 71)

Rv0815c

PROBABLE THIOSULFATE SULFURTRANSFERASE CYSA2 (RHODANESE-LIKE PROTEIN) (THIOSULFATE CYANIDE TRANSSULFURASE) (THIOSULFATE THIOTRANSFERASE)

5 ' primer :

GAAGGAGATATACCATGCATCATCATCATCATCATATGGCACGCTGCGAT (SEQ ID NO: 22)

3' primer:

TGATGATGAGAACCCCCCCCGCTTCCCAACTCGATCGGGG (SEQ ID NO: 23)

Polynucleotide sequence:

ATGGCACGCTGCGATGTCCTGGTCTCCGCCGACTGGGCTGAGAGCAATCTGCACGCG

CCGAAGGTCGTTTTCGTCGAAGTGGACGAGGACACCAGTGCATATGACCGTGACCATATT

GCCGGCGCGATCAAGTTGGACTGGCGCACCGACCTGCAGGATCCGGTCAAACGTGACTTC

GTCGACGCCCAGCAATTCTCCAAGCTGCTGTCCGAGCGTGGCATCGCCAACGAGGACACG

GTGATCCTGTACGGCGGCAACAACAATTGGTTCGCCGCCTACGCGTACTGGTATTTCAAG

CTCTACGGCCATGAGAAGGTCAAGTTGCTCGACGGCGGCCGCAAGAAGTGGGAGCTCGAC

GGACGCCCGCTGTCCAGCGACCCGGTCAGCCGGCCGGTGACCTCCTACACCGCCTCCCCG

CCGGATAACACGATTCGGGCATTCCGCGACGAGGTCCTGGCGGCCATCAACGTCAAGAAC

CTCATCGACGTGCGCTCTCCCGACGAGTTCTCCGGCAAGATCCTGGCCCCCGCGCACCTG

CCGCAGGAACAAAGCCAGCGGCCCGGACACATTCCTGGTGCCATCAACGTGCCGTGGAGC

AGGGCCGCCAACGAGGACGGCACCTTCAAGTCCGATGAGGAGTTGGCCAAGCTTTACGCC

GACGCCGGCCTAGACAACAGCAAGGAAACGATTGCCTACTGCCGAATCGGGGAACGGTCC

TCGCACACCTGGTTCGTGTTGCGGGAATTACTCGGACACCAAAACGTCAAGAACTACGAC

GGCAGTTGGACAGAATACGGCTCCCTGGTGGGCGCCCCGATCGAGTTGGGAAGC 834 bp (SEQ ID NO: 53)

Amino acid sequence:

MARCDVLVSADWAESNLHAPKVVFVEVDEDTSAYDRDHIAGAIKLDWRTDLQDPVKRDFV

DAQQFSKLLSERGIANEDTVILYGGNNNWFAAYAYWYFKLYGHEKVKLLDGGRKKWELDG

RPLSSDPVSRPVTSYTASPPDNTIRAFRDEVLAAINVKNLIDVRSPDEFSGKILAPAHLP

QEQSQRPGHIPGAINVPWSRAANEDGTFKSDEELAKLYADAGLDNSKETIAYCRIGERSS

HTWFVLRELLGHQNVKNYDGSWTEYGSLVGAPIELGS(SEQ ID NO: 72)

______

CONSERVED HYPOTHETICAL PROTEIN

5 ' primer :

GAAGGAGATATACCATGCATCATCATCATCATCATGTGAGTGACGAGGAC (SEQ ID NO: 24)

3 ' primer : TGATGATGAGAACCCCCCCCTGGTTGCCGAGCCCACTCGG (SEQ ID NO: 25)

Polynucleotide sequence:

GTGAGTGACGAGGACCGCACGGATCGGGCCACCGAGGACCACACCATCTTCGATCGGGGT

GTCGGCCAGCGCGACCAGCTGCAGCGGTTATGGACCCCCTACCGGATGAACTACCTGGCC

GAAGCGCCAGTGAAGCGTGACCCCAATTCCTCGGCCAGCCCTGCGCAGCCGTTCACCGAG

ATCCCGCAGCTGTCCGACGAAGAGGGTCTGGTGGTCGCTCGTGGCAAGCTGGTCTACGCC

GTGCTCAACCTGTACCCGTACAACCCCGGGCACTTGATGGTGGTGCCCTATCGTCGGGTA

TCCGAACTCGAGGATCTCACCGATTTGGAGAGCGCCGAGTTGATGGCGTTCACCCAGAAG

GCGATTCGCGTGATCAAGAACGTGTCGCGTCCGCACGGCTTCAATGTCGGCCTGAACCTA

GGGACATCGGCGGGCGGGTCGCTGGCCGAGCACCTGCACGTGCATGTGGTGCCACGGTGG

GGTGGCGATGCGAATTTCATCACCATCATCGGGGGCTCCAAGGTGATTCCGCAGCTGCTG

CGCGACACCCGTCGGCTGCTTGCCACCGAGTGGGCTCGGCAACCA 588 bp (SEQ ID NO: 54)

Amino acid sequence :

VSDEDRTDRATEDHTIFDRGVGQRDQLQRLWTPYRMNYLAEAPVKRDPNSSASPAQPFTE IPQLSDEEGLVVARGKLVYAVLNLYPYNPGHLMVVPYRRVSELEDLTDLESAELMAFTQK AIRVIKNVSRPHGFNVGLNLGTSAGGSLAEHLHVHVVPRWGGDANFITIIGGSKVIPQLL

RDTRRLLATEWARQP (SEQ ID NO: 73)

Rv3226c

CONSERVED HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGTGCGGACGGTTT (SEQ ID NO: 26)

3 ' primer:

TGATGATGAGAACCCCCCCCCAGCAGCTGGATCTGCTCGG (SEQ ID NO : 27)

Polynucleotide sequence:

ATGTGCGGACGGTTTGCGGTCACCACTGATCCGGCCCAGCTGGCCGAGAAAATCACGGCC

ATAGACGAGGCCACCGGGTGCGGTGGCGGGAAGACGAGCTACAACGTGGCACCCACCGAC

ACGATCGCGACAGTGGTGTCCCGCCACAGCGAGCCCGACGACGAGCCCACCCGCCGGGTG

CGGCTCATGCGCTGGGGACTGATTCCGTCGTGGATCAAGGCCGGGCCCGGCGGCGCACCC

GATGCCAAAGGCCCACCGCTGATCAACGCCCGCGCCGATAAGGTCGCCACGTCGCCGGCG

TTCCGGAGTGCGGTCAGAAGTAAGCGTTGCCTGGTGCCGATGGACGGCTGGTACGAATGG

CGCGTCGACCCCGACGCCACCCCGGGGAGGCCGAACGCCAAGACGCCGTTCTTCCTGCAC

CGCCACGACGGCGCCCTGTTGTTCACGGCCGGGCTGTGGTCGGTTTGGAAGTCTTACAGG

TCCGCCCCACCGCTGCTGAGCTGCACGGTGATCACCACCGATGCCGTGGGCGAGCTGGCC

GAGATCCATGACCGGATGCCGCTGCTGCTGGCCGAAGAGGACTGGGACGACTGGCTGAAT

CCAGACGCCCCGCCGGATCCTGAGCTGCTGGCCCGCCCGCCGGATGTGCGCGACATCGCG

CTGCGCCAAGTGTCCACGTTGGTCAACAACGTGCGCAACAACGGGCCTGAGCTGTTGGAG

CCGGCCAGGTCGCAGCCCGAGCAGATCCAGCTGCTG 759 bp (SEQ ID NO: 55)

Amino acid sequence:

MCGRFAVTTDPAQLAEKITAIDEATGCGGGKTSYNVAPTDTIATVVSRHSEPDDEPTRRV

RLMRWGLIPSWIKAGPGGAPDAKGPPLINARADKVATSPAFRSAVRSKRCLVPMDGWYEW

RVDPDATPGRPNAKTPFFLHRHDGALLFTAGLWSVWKSYRSAPPLLSCTVITTDAVGELA

EIHDRMPLLLAEEDWDDWLNPDAPPDPELLARPPDVRDIALRQVSTLVNNVRNNGPELLE

PARSQPEQIQLL(SEQ ID NO: 74)

Rv0349

HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATGTGCCAGAGCTGGAG (SEQ ID NO: 28)

3 ' primer:

TGATGATGAGAACCCCCCCCGTCCGCCAGCTTGACCGACT (SEQ ID NO : 29)

Polynucleotide sequence:

GTGCCAGAGCTGGAGACGCCCGACGACCCAGAGTCGATATACCTTGCCCGCCTCGAGGAT

GTCGGAGAACACAGACCGACGTTCACGGGCGACATCTACCGACTCGGCGATGGTCGCATG

GTGATGATCCTCCAGCACCCATGCGCGCTGCGGCACGGCGTTGACCTCCATCCGCGACTG

CTGGTCGCTCCCGTAAGACCCGACTCGCTTCGTTCCAACTGGGCTAGAGCCCCGTTCGGC

ACGATGCCGCTTCCGAAGCTCATCGACGGTCAGGATCACTCGGCGGACTTCATCAATCTT

GAACTCATCGATTCACCAACGCTTCCGACCTGTGAGCGGATCGCGGTGCTCAGCCAGTCA

GGCGTCAACTTGGTCATGCAACGGTGGGTGTACCACAGCACCCGGCTCGCCGTGCCCACG CACACCTACTCCGACAGCACCGTTGGCCCGTTCGATGAGGCAGACCTGATCGAGGAGTGG GTGACGGATCGCGTCGACGATGGGGCCGACCCGCAGGCGGCCGAACACGAATGCGCCTCC TGGCTCGATGAAAGAATCAGCGGCCGCACTCGGCGAGCGCTGCTCAGCGACCGTCAGCAC GCCAGTTCAATACGGCGAGAAGCGCGTTCTCATCGAAAGTCGGTCAAGCTGGCGGAC 660 bp (SEQ ID NO: 56)

Amino acid sequence:

VPELETPDDPESIYLARLEDVGEHRPTFTGDIYRLGDGRMVMILQHPCALRHGVDLHPRL LVAPVRPDSLRSNWARAPFGTMPLPKLIDGQDHSADFINLELIDSPTLPTCERIAVLSQS GVNLVMQRWVYHSTRLAVPTHTYSDSTVGPFDEADLIEEWVTDRVDDGADPQAAEHECAS WLDERISGRTRRALLSDRQHASSIRREARSHRKSVKLAD (SEQ ID NO : 75 )

Rv0009

PROBABLE IRON-REGULATED PEPTIDYL-PROLYL CIS-TRANS ISOMERASE A PPIA (PPIase A) (ROTAMASE A)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGGCAGACTGTGAT (SEQ ID NO: 30)

3' primer:

TGATGATGAGAACCCCCCCCGGAGATGGTGATCGACTCGA (SEQ ID NO : 31)

Polynucleotide sequence:

ATGGCAGACTGTGATTCCGTGACTAACAGCCCCCTTGCGACCGCTACCGCCACGCTGCAC

ACTAACCGCGGCGACATCAAGATCGCCCTGTTCGGAAACCATGCGCCCAAGACCGTCGCC

AATTTTGTGGGCCTTGCGCAGGGCACCAAGGACTATTCGACCCAAAACGCATCAGGTGGC

CCGTCCGGCCCGTTCTACGACGGCGCGGTCTTTCACCGGGTGATCCAGGGCTTCATGATC

CAGGGTGGCGATCCAACCGGGACGGGTCGCGGCGGACCCGGCTACAAGTTCGCCGACGAG

TTCCACCCCGAGCTGCAATTCGACAAGCCCTATCTGCTCGCGATGGCCAACGCCGGTCCG

GGCACCAACGGCTCACAGTTTTTCATCACCGTCGGCAAGACTCCGCACCTGAACCGGCGC

CACACCATTTTCGGTGAAGTGATCGACGCGGAGTCACAGCGGGTTGTGGAGGCGATCTCC

AAGACGGCCACCGACGGCAACGATCGGCCGACGGACCCGGTGGTGATCGAGTCGATCACC

ATCTCC 549 bp (SEQ ID NO: 57)

Amino acid sequence:

MADCDSVTNSPLATATATLHTNRGDIKIALFGNHAPKTVANFVGLAQGTKDYSTQNASGG PSGPFYDGAVFHRVIQGFMIQGGDPTGTGRGGPGYKFADEFHPELQFDKPYLLAMANAGP GTNGSQFFITVGKTPHLNRRHTIFGEVIDAESQRVVEAISKTATDGNDRPTDPVVIESIT IS (SEQ ID NO: 76)

Rv1073

CONSERVED HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGGGGGCGCAGCCG (SEQ ID NO:32)

3' primer:

TGATGATGAGAACCCCCCCCACGTCGTGATGTCAACGTGT (SEQ ID NO: 33)

Polynucleotide sequence:

ATGGGGGCGCAGCCGTTCATCGGCAGCGAGGCGTTGGCGGCGGGACTCATCAGCTGGCAT

GAGCTGGGCAAGTACTACACCGCGATCATGCCCAACGTCTATCTGGACAAGCGGCTGAAG CCCTCCCTGCGGCAACGCGTTATCGCGGCCTGGCTGTGGTCGGGCCGCAAAGGGGTGATC GCCGGCGCTTCGGCATCAGCGCTGCACGGCGCGAAATGGGTCGATGACCACGCATTGGTG GAGTTGATCTGGCGCAACGCCAGGGCGCCGAACGGGGTGCGGACTAAGGATGAGCTACTG CTCGACGGCGAAGTCCAGCGCTTGTGCGGGCTTACTGTGACTACCGTTGAACGTACGGCC TTCGACTTGGGCAGGCGTCCACCCTTAGGTCAGGCGATAACCAGACTGGATGCGCTTGCC AATGCCACCGATTTCAAGATCAACGATGTTAGGGAGCTCGCGAGGAAGCACCCCCATACT CGCGGGCTGCGTCAACTAGACAAGGCGCTGGATCTCGTCGACCCAGGTGCGCAGTCGCCG AAGGAGACGTGGCTGCGGCTCTTGCTGATAAACGCCGGCTTTCCACGGCCGTCCACTCAG ATCCCCTTGCTCGGCGTCTACGGGCATCCAAAGTATTTCCTCGACATGGGATGGGAGGAC ATCATGCTCGCGGTCGAGTACGACGGCGAGCAACACCGTCTCAGCCGAGACCAGTTCGTC AAAGACGTCGAACGCCTGGAATACATCCGGCGCGCCGGCTGGACTCACATCAGGGTGCTG GCAGACCACAAGGGACCCGACGTCGTCCGCCGGGTTCGGCAGGCTTGGGACACGTTGACA TCACGACGT 852 bp (SEQ ID NO: 58)

Amino acid sequence:

MGAQPFIGSEALAAGLISWHELGKYYTAIMPNVYLDKRLKPSLRQRVIAAWLWSGRKGVI

AGASASALHGAKWVDDHALVELIWRNARAPNGVRTKDELLLDGEVQRLCGLTVTTVERTA

FDLGRRPPLGQAITRLDALANATDFKINDVRELARKHPHTRGLRQLDKALDLVDPGAQSP

KETWLRLLLINAGFPRPSTQIPLLGVYGHPKYFLDMGWEDIMLAVEYDGEQHRLSRDQFV

KDVERLEYIRRAGWTHIRVLADHKGPDVVRRVRQAWDTLTSRR (SEQ ID NO: 77)

Rv0781

PROBABLE PROTEASE II PTRBA [FIRST PART] (OLIGOPEPTIDASE B)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGATGCACCGAACC (SEQ ID NO: 34)

3' primer:

TGATGATGAGAACCCCCCCCTCGGCTTCGTGGTAAACCCG (SEQ ID NO : 35)

Polynucleotide sequence:

ATGATGCACCGAACCGCACTACCCTCACCGCCCGTGGCCAAGCGGGTGCAGACCCGC

CGGGAGCACCACGGCGACGTCTTTGTCGACCCATATGAATGGTTGCGCGACAAGGACAGC

CCTGAAGTAATCGCCTACCTCGAAGCTGAAAACGACTACACCGAACGGACCACCGCGCAC

CTTGAGCCATTGCGGCAAAAGATCTTCCACGAAATCAAAGCGCGTACCAAGGAAACCGAC

TTATCGGTGCCGACGCGACGTGGCAACTGGTGGTACTACGCGCGGACCTTTGAGGGAAAG

CAGTATGGCGTACACTGTCGTTGCCCGGTAACCGATCCCGACGACTGGAACCCACCAGAG

TTCGACGAGCGCACCGAAATACCCGGTGAACAGCTTCTGCTCGACGAGAACGTGGAAGCT

GACGGCCACGACTTCTTCGCACTGGGCGCGGCCAGCGTCAGCCTGGACGATAACCTCTTA

GCGTATTCCGTTGATGTCGTAGGTGACGAACGATATACCTTGCGGTTCAAGGATTTACGC

ACCGGAGAACAGTACCCGGACGAGATCGCCGGGATCGGAGCGGGAGTCACCTGGGCAGCT

GACAACCACTGTCTACTACACCACCGTGGACGCGGCCTGGCGTCCGGACACAGTGTGGCG

ATACCGACTAGGGTCCGGCGAATCGTCGGAGCGGGTTTACCACGAAGCCGA 711 bp (SEQ ID NO: 59)

Amino acid sequence:

MMHRTALPSPPVAKRVQTRREHHGDVFVDPYEWLRDKDSPEVIAYLEAENDYTERTTAHL EPLRQKIFHEIKARTKETDLSVPTRRGNWWYYARTFEGKQYGVHCRCPVTDPDDWNPPEF DERTEIPGEQLLLDENVEADGHDFFALGAASVSLDDNLLAYSVDVVGDERYTLRFKDLRT GEQYPDEIAGIGAGVTWAADNHCLLHHRGRGLASGHSVAIPTRVRRIVGAGLPRSR (SEQ ID NO: 78)

Rv2108

PPE FAMILY PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGCCCAATTTCTGG (SEQ ID NO:36)

3' primer:

TGATGATGAGAACCCCCCCCAAACTTAGGATGTTCCTTGT (SEQ ID NO : 37)

Polynucleotide sequence:

ATGCCCAATTTCTGGGCGTTGCCGCCCGAGATCAACTCCACCCGGATATATCTCGGCC

CGGGTTCTGGCCCGATACTGGCCGCCGCCCAGGGATGGAACGCTCTGGCCAGTGAGCTGG

AAAAGACGAAGGTGGGGTTGCAGTCAGCGCTCGACACGTTGCTGGAGTCGTATAGGGGTC

AGTCGTCGCAGGCTTTGATACAGCAGACCTTGCCGTATGTGCAGTGGCTGACCACGACCG

CCGAGCACGCCCATAAGACCGCGATCCAGCTCACGGCAGCGGCGAACGCCTACGAGCAGG

CTAGAGCGGCGATGGTGCCGCCGGCGATGGTGCGCGCGAACCGCGTGCAGACCACAGTGT

TGAAGGCAATCAACTGGTTCGGGCAATTCTCCACCAGGATCGCCGACAAGGAGGCCGACT

ACGAACAGATGTGGTTCCAAGACGCGCTAGTGATGGAGAACTATTGGGAAGCCGTGCAAG

AGGCGATACAGTCGACGTCGCATTTTGAGGATCCACCGGAGATGGCCGACGACTACGACG

AGGCCTGGATGCTCAACACCGTGTTCGACTATCACAACGAGAACGCAAAAGAGGAGGTCA

TCCATCTCGTGCCCGACGTGAACAAGGAGAGGGGGCCCATCGAACTCGTAACCAAGGTAG

ACAAAGAGGGGACCATCAGACTCGTCTACGATGGGGAGCCCACGTTTTCATACAAGGAAC

ATCCTAAGTTT 732 bp (SEQ ID NO: 60)

Amino acid sequence:

MPNFWALPPEINSTRIYLGPGSGPILAAAQGWNALASELEKTKVGLQSALDTLLESYRGQ

SSQALIQQTLPYVQWLTTTAEHAHKTAIQLTAAANAYEQARAAMVPPAMVRANRVQTTVL

KAINWFGQFSTRIADKEADYEQMWFQDALVMENYWEAVQEAIQSTSHFEDPPEMADDYDE

AWMLNTVFDYHNENAKEEVIHLVPDVNKERGPIELVTKVDKEGTIRLVYDGEPTFSYKEH

PKF(SEQ ID NO: 79)

Rv3920c

HYPOTHETICAL PROTEIN SIMILAR TO JAG PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGGCCGACGCTGAC (SEQ ID NO: 38)

3 ' primer :

TGATGATGAGAACCCCCCCCGTCGCGGAGCACAACGACTC (SEQ ID NO : 39)

Polynucleotide sequence:

ATGGCCGACGCTGACACCACCGACTTCGACGTCGACGCAGAAGCACCGGGTGGAGGC

GTCCGGGAGGACACGGCGACGGATGCTGACGAGGCCGACGATCAAGAAGAGAGATTGGTC

GCCGAGGGCGAGATTGCAGGCGACTACCTGGAAGAGTTATTGGACGTGTTGGACTTCGAT

GGCGACATCGACCTCGATGTCGAAGGCAATCGTGCGGTGGTGAGCATCGACGGCAGTGAC

GACCTGAACAAGTTGGTCGGGCGCGGGGGCGAGGTGCTCGACGCTCTGCAGGAACTCACC

CGGTTGGCGGTGCATCAGAAGACCGGTGTGCGGAGCCGGTTGATGCTAGACATCGCGAGG

TGGCGACGGCGGCGCCGGGAGGAATTGGCGGCGCTGGCCGACGAGGTGGCGCGGCGAGTG

GCCGAAACCGGTGACCGCGAGGAACTCGTTCCAATGACGCCGTTCGAACGGAAGATCGTC

CACGATGCGGTTGCAGCGGTGCCAGGTGTGCACAGCGAAAGCGAAGGCGTGGAGCCAGAA

CGCCGAGTCGTTGTGCTCCGCGAC 564 bp (SEQ ID NO: 61)

Amino acid sequence:

MADADTTDFDVDAEAPGGGVREDTATDADEADDQEERLVAEGEIAGDYLEELLDVLDFDG

DIDLDVEGNRAVVSIDGSDDLNKLVGRGGEVLDALQELTRLAVHQKTGVRSRLMLDIARW RRRRREELAALADEVARRVAETGDREELVPMTPFERKIVHDAVAAVPGVHSESEGVEPER RVVVLRD (SEQ ID NO: 80)

Rv1044

CONSERVED HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATTTGTGTGCAAAACCG (SEQ ID NO : 40)

3' primer:

TGATGATGAGAACCCCCCCCCGCCGATGCTCGCTTCGGCC (SEQ ID NO : 41)

Polynucleotide sequence:

TTGTGTGCAAAACCGTATCTAATTGATACGATTGCGCACATGGCTATCTGGGATCGCC

TCGTCGAGGTTGCCGCCGAGCAACATGGCTACGTCACGACTCGCGATGCGCGAGACATCG

GCGTCGACCCTGTGCAGCTCCGCCTCCTAGCGGGGCGCGGACGTCTTGAGCGTGTCGGCC

GAGGTGTGTACCGGGTGCCCGTGCTGCCGCGTGGTGAGCACGACGATCTCGCAGCCGCAG

TGTCGTGGACTTTGGGGCGTGGCGTTATCTCGCATGAGTCGGCCTTGGCGCTTCATGCCC

TCGCTGACGTGAACCCGTCGCGCATCCATCTCACCGTCCCGCGCAACAACCATCCGCGTG

CGGCCGGGGGCGAGCTGTACCGAGTTCACCGCCGCGACCTCCAGGCAGCCCACGTCACTT

CGGTCGACGGAATACCCGTCACGACGGTTGCGCGCACCATCAAAGACTGCGTGAAGACGG

GCACGGATCCTTATCAGCTTCGGGCCGCGATCGAGCGAGCCGAAGCCGAGGGCACGCTTC

GTCGTGGGTCAGCAGCTGAGCTACGCGCTGCGCTCGATGAGACCACTGCCGGATTACGCG

CTCGGCCGAAGCGAGCATCGGCG 624 bp (SEQ ID NO: 62)

Amino acid sequence:

LCAKPYLIDTIAHMAIWDRLVEVAAEQHGYVTTRDARDIGVDPVQLRLLAGRGRLERVGR GVYRVPVLPRGEHDDLAAAVSWTLGRGVISHESALALHALADVNPSRIHLTVPRNNHPRA AGGELYRVHRRDLQAAHVTSVDGIPVTTVARTIKDCVKTGTDPYQLRAAIERAEAEGTLR RGSAAELRAALDETTAGLRARPKRASA(SEQ ID NO: 81)

Rv2882c

RIBOSOME RECYCLING FACTOR FRR (RIBOSOME RELEASING FACTOR) (RRF)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGATTGATGAGGCT (SEQ ID NO: 42)

3' primer:

TGATGATGAGAACCCCCCCCGACCTCCAGCAGCTCGCCTT (SEQ ID NO : 43 )

Polynucleotide sequence:

ATGATTGATGAGGCTCTCTTCGACGCCGAAGAGAAAATGGAGAAGGCTGTGGCGGTGGCA

CGTGACGACCTGTCAACTATCCGTACCGGCCGCGCCAACCCTGGCATGTTCTCTCGGATC

ACCATCGACTACTACGGTGCGGCCACCCCGATCACGCAACTGGCCAGCATCAATGTCCCC

GAGGCGCGGCTAGTCGTGATAAAGCCGTATGAAGCCAATCAGTTGCGCGCTATCGAGACT

GCAATTCGCAACTCCGACCTTGGAGTGAATCCCACCAACGACGGCGCCCTTATTCGCGTG

GCCGTACCGCAGCTCACCGAAGAACGTCGGCGAGAGCTGGTCAAACAGGCAAAGCATAAG

GGGGAGGAGGCCAAGGTTTCGGTGCGTAATATCCGTCGCAAAGCGATGGAGGAACTCCAT

CGCATCCGTAAGGAAGGCGAGGCCGGCGAGGATGAGGTCGGTCGCGCAGAAAAGGATCTC GACAAGACCACGCACCAATACGTCACCCAAATTGATGAGCTGGTTAAACACAAAGAAGGC GAGCTGCTGGAGGTC 558 bp (SEQ ID NO: 63)

Amino acid sequence:

MIDEALFDAEEKMEKAVAVARDDLSTIRTGRANPGMFSRITIDYYGAATPITQLASINVP EARLVVIKPYEANQLRAIETAIRNSDLGVNPTNDGALIRVAVPQLTEERRRELVKQAKHK GEEAKVSVRNIRRKAMEELHRIRKEGEAGEDEVGRAEKDLDKTTHQYVTQIDELVKHKEG ELLEV (SEQ ID NO: 82)

Rv3733c

CONSERVED HYPOTHETICAL PROTEIN

5 ' primer :

GAAGGAGATATACCATGCATCATCATCATCATCATATGCCCAAGCTCAGC (SEQ ID NO: 44)

3' primer:

TGATGATGAGAACCCCCCCCGCGAGGCAGGGATTCTGGTC (SEQ ID NO: 45)

Polynucleotide sequence:

ATGCCCAAGCTCAGCGCGGGTGTGCTGCTGTATCGGGCGCGCGCCGGTGTCGTCGACGTC

CTTCTGGCGCATCCGGGCGGCCCGTTTTGGGCGGGAAAGGACGACGGCGCTTGGTCGATC

CCGAAGGGCGAATACACCGGCGGCGAAGATCCGTGGCTGGCCGCCCGGCGCGAGTTCTCC

GAGGAGATCGGGTTGTGCGTGCCTGACGGGCCGCGAATCGACTTCGGGTCGCTGAAACAG

TCCGGCGGCAAGGTGGTGACCGTGTTCGGTGTCCGGGCGGATCTGGACATCACCGACGCA

CGAAGCAGCACCTTCGAATTGGACTGGCCGAAGGGCTCGGGCAAGATGCGTAAGTTCCCC

GAGGTCGACCGGGTGAGCTGGTTTCCGGTAGCGCGGGCACGCACCAAACTGCTCAAGGGG

CAGCGGGGTTTTCTCGACCGGTTGATGGCGCACCCGGCCGTGGCGGGTTTGTCTGAAGGA

CCAGAATCCCTGCCTCGC 501 bp (SEQ ID NO: 64)

Amino acid sequence:

MPKLSAGVLLYRARAGVVDVLLAHPGGPFWAGKDDGAWSIPKGEYTGGEDPWLAARREFS

EEIGLCVPDGPRIDFGSLKQSGGKVVTVFGVRADLDITDARSSTFELDWPKGSGKMRKFP

EVDRVSWFPVARARTKLLKGQRGFLDRLMAHPAVAGLSEGPESLPR (SEQ ID NO: 83)

Rv0138

CONSERVED HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATGTGAGCGCTTCGGAG (SEQ ID NO:86)

3' primer:

TGATGATGAGAACCCCCCCCAGGACCTCCATGCCGGCGCA (SEQ ID NO: 87)

Polynucleotide sequence:

GTGAGCGCTTCGGAGTTCTCCCGTGCTGAACTCGCCGCCGCCTTCGAGAAGTTCGAGAAG

ACCGTGGCCCGCGCCGCCGCGACGCGCGACTGGGATTGCTGGGTGCAGCACTACACCCCC

GACGTCGAATACATCGAGCACGCGGCGGGCATCATGCGAGGCCGCCAGCGGGTACGTGCC

TGGATTCAAGAAACGATGACGACCTTCCCGGGCAGTCACATGGTGGCCTTCCCGTCGCTG

TGGTCGGTGATCGACGAGTCCACCGGGCGAATTATCTGCGAATTGGACAACCCCATGCTC

GACCCCGGCGACGGCAGCGTGATCAGCGCGACGAACATTTCGATCATCACCTATGCCGGC AATGGCCAGTGGTGCCGTCAAGAAGACATCTACAACCCGTTGCGGTTCCTGCGGGCGGCG ATGAAGTGGTGTCGCAAGGCGCAGGAGTTGGGCACCCTCGACGAGGACGCGGCGCGTTGG ATGCGCCGGCATGGAGGTCCT (SEQ ID NO: 110)

Amino acid sequence :

VSASEFSRAELAAAFEKFEKTVARAAATRDWDCWVQHYTPDVEYIEHAAGIMRGRQRVRA

WIQETMTTFPGSHMVAFPSLWSVIDESTGRIICELDNPMLDPGDGSVISATNISIITYAG

NGQWCRQEDIYNPLRFLRAAMKWCRKAQELGTLDEDAARWMRRHGGP (SEQ ID NO: 122)

Rv0740

CONSERVED HYPOTHETICAL PROTEIN

5 ' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGCTGCCGAAGAAC (SEQ ID NO: 88)

3' primer:

TGATGATGAGAACCCCCCCCGCCCTCGGCGGCGTCTTTCG (SEQ ID NO: 89)

Polynucleotide sequence:

ATGCTGCCGAAGAACACCAGACCCACCTCGGAAACCGCCGAAGAGTTCTGGGACAACTCG

CTGTGGTGCAGCTGGGGCGACCGAGAAACGGGATACACCCGCACCGTCACGGTTTCGATC

TGCCAGGTGGCGGACGGCGAACGTGAGGCCGAAGGGGTTCGGGACATGATGCGGCTGGAG

TGTCCGGCTGGGCTGGATCTACGGACACCCAACCCGGAGGCATACGAGATTACCGGTCAG

CGGCCCGGAGAATTCGTGTTCGTGCTCGGCTATCTGGGGCATGTGCGGGCCATCGTGGGC

AACTGTTACATCGAGATCATGCCGATGGGCACCAGGGTCGAGCTGAGCAAGTTGGCCGAT

GTGGCATTGGATATCGGCCGCAGTGTCGGATGCTCGGCCTACGAGAACGACTTCACGCTG

CCGGACATTCCAACGCAGTGGCGCAACCAGCCGCTGGGCTGGTACACGCAAGGCCTTGCC

CCCTACCTGCCGGGGCTGTCGGACCCGAAAGACGCCGCCGAGGGC (SEQ ID NO: 111)

Amino acid sequence:

MLPKNTRPTSETAEEFWDNSLWCSWGDRETGYTRTVTVSICQVADGEREAEGVRDMMRLE CPAGLDLRTPNPEAYEITGQRPGEFVFVLGYLGHVRAIVGNCYIEIMPMGTRVELSKLAD VALDIGRSVGCSAYENDFTLPDIPTQWRNQPLGWYTQGLAPYLPGLSDPKDAAEG (SEQ ID NO: 123)

Rv0733

PROBABLE ADENYLATE KINASE ADK (ATP-AMP TRANSPHOSPHORYLASE)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATGTGAGAGTTTTGTTG (SEQ ID NO:90)

3' primer:

TGATGATGAGAACCCCCCCCCTTTCCCAGAGCCCGCAACG (SEQ ID NO: 91)

Polynucleotide sequence:

GTGAGAGTTTTGTTGCTGGGACCGCCCGGGGCGGGCAAGGGGACGCAGGCGGTGAAGCTG

GCCGAGAAGCTCGGGATCCCGCAGATCTCCACCGGCGAACTCTTCCGGCGCAACATCGAA GAGGGCACCAAGCTCGGCGTGGAAGCCAAACGCTACTTGGATGCCGGTGACTTGGTGCCG TCCGACTTGACCAATGAACTCGTCGACGACCGGCTGAACAATCCGGACGCGGCCAACGGA TTCATCTTGGATGGCTATCCACGCTCGGTCGAGCAGGCCAAGGCGCTTCACGAGATGCTC GAACGCCGGGGGACCGACATCGACGCGGTGCTGGAGTTTCGTGTGTCCGAGGAGGTGTTG TTGGAGCGACTCAAGGGGCGTGGCCGCGCCGACGACACCGACGACGTCATCCTCAACCGG ATGAAGGTCTACCGCGACGAGACCGCGCCGCTGCTGGAGTACTACCGCGACCAATTGAAG ACCGTCGACGCCGTCGGCACCATGGACGAGGTGTTCGCCCGTGCGTTGCGGGCTCTGGGA AAG (SEQ ID NO: 112)

Amino acid sequence:

VRVLLLGPPGAGKGTQAVKLAEKLGIPQISTGELFRRNIEEGTKLGVEAKRYLDAGDLVP SDLTNELVDDRLNNPDAANGFILDGYPRSVEQAKALHEMLERRGTDIDAVLEFRVSEEVL LERLKGRGRADDTDDVILNRMKVYRDETAPLLEYYRDQLKTVDAVGTMDEVFARALRALG K (SEQ ID NO: 124)

Rv1065

CONSERVED HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATGTGGTTATGCCTCTT (SEQ ID NO: 92)

3' primer:

TGATGATGAGAACCCCCCCCTCCCGACCCTTCGGGCTGGT (SEQ ID NO: 93)

Polynucleotide sequence:

GTGGTTATGCCTCTTGTCACGCCAACCACCGCGGTTCCATCACCGGGACCCACACGGCTG

CGTGTAGCCGATCTCCTGCGCGCCACCGACCAAGCCGCAGACGACGTGCTTGGCGGGCGC

TGCGACCACCTGCTACCCGACGGTGGTGTCCCGCAGACGCAGCGCTGGTACACCCGCATC

CACGGTGACGAGGAGCTGGATATCTGGCTGATTAGCTGGGTTCCCGGTCAACCGACCGAG

CTGCACGACCATGGCGGGTCCCTGGGAGCGTTGACCGTGCTGAGCGGGTCGCTCAACGAA

TATCGTTGGGACGGCCGTCGGTTGCGACGGCGCCGCCTCGATGCCGGTGATCAGGCAGGG

TTCCCGTTGGGTTGGGTGCACGACGTGGTGTGGGCGCCCCGGCCGATTGGGGGGCCTGAT

GCGGCCGGGATGGCTGTGGCGCCAACCCTGAGCGTGCACGCCTACTCGCCGCCGCTGACG

GCGATGTCGTACTACGAGATCACCGAACGCAACACGCTGCGCCGCCAGCGCACCGAATTG

ACCGACCAGCCCGAAGGGTCGGGA (SEQ ID NO: 113)

Amino acid sequence:

VVMPLVTPTTAVPSPGPTRLRVADLLRATDQAADDVLGGRCDHLLPDGGVPQTQRWYTRI HGDEELDIWLISWVPGQPTELHDHGGSLGALTVLSGSLNEYRWDGRRLRRRRLDAGDQAG FPLGWVHDVVWAPRPIGGPDAAGMAVAPTLSVHAYSPPLTAMSYYEITERNTLRRQRTEL TDQPEGSG (SEQ ID NO: 125)

Rv2114

HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGTCGGCTCCCGAA (SEQ ID NO: 94)

3' primer:

TGATGATGAGAACCCCCCCCGGCGGTCACCAGCGAGTAGC (SEQ ID NO: 95)

Polynucleotide sequence:

ATGTCGGCTCCCGAACGGGTAACCGGCTTGTCCGGGCAACGTTACGGGGAAGTCCTTCTC GTAACACCCGGGGAGGCCGGTCCACAGGCCACCGTTTACAACAGCTTCCCGCTTAACGAT TGTCCGGCCGAGCTGTGGTCCGCGCTCGATCCGCAAGCCCTAGCCACCGAACACAAAGCG GCCACCGCCCTGCTCAACGGTCCGCGCTATTGGTTGATGAACGCCATCGAGAAGGCGCCC CAGGGCCCGCCGGTGACGAAGACCTTCGGCGGGATCGAGATGCTCCAGCAGGCCACGGTG CTGCTGTCATCGATGAACCCTGCCCCATACACCGTCAGCCAGGTCAGCCGCAACACGGTC TTTGTGTTCAACGCCGGCGAAGAGGTCTACGAACTGCAGGACCCCAAGGGACAGCGCTGG GTGATGCAGACGTGGAGTCAAGTGGTGGACCCCAACCTGTCCCGAGCCGACCTGCCCAAG CTGGGTGAACGGCTCAACCTGCCAGCCGGGTGGTCCTATCATACCCGCGTGCTTACCAGC GAGTTGCGGGTCGACACTACCAACCGGGAGGCCCGCGTCCTGCAAGACGACCTCACCAAC AGCTACTCGCTGGTGACCGCC (SEQ ID NO: 114)

Amino acid sequence:

MSAPERVTGLSGQRYGEVLLVTPGEAGPQATVYNSFPLNDCPAELWSALDPQALATEHKA ATALLNGPRYWLMNAIEKAPQGPPVTKTFGGIEMLQQATVLLSSMNPAPYTVSQVSRNTV FVFNAGEEVYELQDPKGQRWVMQTWSQVVDPNLSRADLPKLGERLNLPAGWSYHTRVLTS ELRVDTTNREARVLQDDLTNSYSLVTA (SEQ ID NO: 126)

Rv2466c

CONSERVED HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGCTCGAGAAGGCC (SEQ ID NO: 96)

3' primer:

TGATGATGAGAACCCCCCCCGTCGAACTGAGGCGGCTCGG (SEQ ID NO: 97)

Polynucleotide sequence:

ATGCTCGAGAAGGCCCCCCAGAAGTCTGTCGCCGATTTCTGGTTCGATCCGCTGTGCCCG

TGGTGCTGGATCACGTCGCGCTGGATCCTCGAGGTGGCAAAGGTCCGCGACATCGAGGTG

AACTTCCACGTCATGAGCCTGGCAATACTCAACGAAAACCGTGACGACCTGCCCGAGCAA

TACCGCGAAGGCATGGCGAGGGCATGGGGACCGGTACGGGTGGCGATCGCCGCCGAGCAG

GCCCATGGGGCGAAAGTCCTGGACCCGCTGTACACCGCGATGGGCAACCGGATTCACAAC

CAGGGCAACCACGAACTCGACGAGGTCATCACCCAGTCGCTGGCGGACGCCGGTCTGCCC

GCGGAGTTGGCCAAGGCCGCTACCAGCGACGCTTACGACAACGCCCTGCGCAAAAGCCAC

CACGCCGGGATGGACGCGGTGGGCGAGGACGTCGGTACGCCGACGATCCATGTCAATGGT

GTGGCGTTCTTCGGGCCGGTGCTCTCGAAGATTCCGCGCGGCGAGGAAGCCGGCAAGCTC

TGGGATGCCTCGGTTACCTTCGCTTCCTACCCGCACTTTTTTGAGCTCAAGCGGACCCGC

ACCGAGCCGCCTCAGTTCGAC (SEQ ID NO: 115)

Amino acid sequence:

MLEKAPQKSVADFWFDPLCPWCWITSRWILEVAKVRDIEVNFHVMSLAILNENRDDLPEQ

YREGMARAWGPVRVAIAAEQAHGAKVLDPLYTAMGNRIHNQGNHELDEVITQSLADAGLP

AELAKAATSDAYDNALRKSHHAGMDAVGEDVGTPTIHVNGVAFFGPVLSKIPRGEEAGKL

WDASVTFASYPHFFELKRTRTEPPQFD (SEQ ID NO: 127)

Rv0158

PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (POSSIBLY TETR-FAMILY)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGCCATCCGACACC (SEQ ID NO: 98)

3' primer:

TGATGATGAGAACCCCCCCCCGTTTCCTTCCGAGTTCCAA (SEQ ID NO: 99)

Polynucleotide sequence:

ATGCCATCCGACACCAGCCCCAACGGGCTAAGCCGCCGTGAGGAGTTGCTGGCTGTTGCC

ACCAAACTATTCGCGGCGCGCGGTTATCACGGCACCCGGATGGACGACGTCGCCGATGTG

ATCGGGCTCAACAAAGCAACGGTCTATCACTACTACGCCAGCAAGTCGCTGATCCTGTTC

GACATTTACCGTCAGGCGGCCGAGGGCACCCTGGCCGCCGTGCACGACGATCCGTCCTGG

ACGGCCCGTGAAGCGCTGTACCAGTACACGGTCCGGCTGCTCACTGCGATCGCGAGCAAC

CCCGAGCGGGCCGCCGTGTACTTCCAGGAGCAGCCCTACATCACCGAGTGGTTCACCAGC

GAGCAGGTCGCCGAGGTCCGCGAGAAGGAGCAGCAAGTCTACGAGCACGTACACGGCCTG

ATCGACCGCGGGATTGCCAGCGGCGAGTTCTATGAGTGCGACTCGCATGTGGTGGCGCTG

GGGTACATCGGGATGACGCTGGGCAGCTACCGCTGGCTGCGGCCGAGCGGGCGCCGAACG

GCCAAGGAGATCGCGGCGGAGTTCAGCACGGCACTGCTGCGCGGGCTGATCCGCGACGAA

TCGATCCGCAACCAGTCTCCGCTTGGAACTCGGAAGGAAACG (SEQ ID NO: 116)

Amino acid sequence:

MPSDTSPNGLSRREELLAVATKLFAARGYHGTRMDDVADVIGLNKATVYHYYASKSLILF DIYRQAAEGTLAAVHDDPSWTAREALYQYTVRLLTAIASNPERAAVYFQEQPYITEWFTS EQVAEVREKEQQVYEHVHGLIDRGIASGEFYECDSHVVALGYIGMTLGSYRWLRPSGRRT AKEIAAEFSTALLRGLIRDESIRNQSPLGTRKET (SEQ ID NO: 128)

Rv3676

PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN (PROBABLY CRP/FNR-FAMILY)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATGTGGACGAGATCCTG (SEQ ID NO: 100)

3' primer:

TGATGATGAGAACCCCCCCCCCTCGCTCGGCGGGCCAGTC (SEQ ID NO: 101)

Polynucleotide sequence:

GTGGACGAGATCCTGGCCAGGGCAGGAATCTTCCAAGGCGTGGAGCCCAGCGCAATCGCC

GCACTGACGAAACAGCTGCAGCCCGTCGACTTCCCCCGTGGACACACGGTCTTCGCGGAA

GGGGAGCCGGGCGATCGGCTGTACATCATCATCTCGGGGAAGGTCAAGATCGGTCGCCGG

GCACCAGACGGCCGAGAAAACCTGTTAACCATCATGGGCCCGTCGGACATGTTCGGCGAG

TTGTCGATCTTCGACCCGGGTCCGCGCACGTCCAGCGCGACCACGATCACCGAGGTGCGG

GCGGTGTCGATGGACCGCGACGCGCTGCGGTCATGGATCGCCGATCGTCCCGAAATCTCC

GAACAGCTGCTGCGGGTGCTGGCCCGCCGGCTGCGCCGCACCAACAACAACCTGGCCGAC

CTCATCTTCACCGATGTGCCCGGTCGGGTGGCCAAGCAGCTGTTGCAGCTCGCCCAGCGT

TTCGGCACCCAGGAAGGTGGCGCATTGCGGGTCACCCACGACCTGACACAGGAAGAAATC

GCCCAGCTGGTCGGGGCCTCACGCGAGACGGTGAACAAGGCACTGGCTGATTTCGCTCAC

CGCGGCTGGATCCGCCTTGAGGGCAAGAGTGTGCTGATCTCTGACTCCGAAAGACTGGCC

CGCCGAGCGAGG (SEQ ID NO: 117)

Amino acid sequence:

VDEILARAGIFQGVEPSAIAALTKQLQPVDFPRGHTVFAEGEPGDRLYIIISGKVKIGRR APDGRENLLTIMGPSDMFGELSIFDPGPRTSSATTITEVRAVSMDRDALRSWIADRPEIS EQLLRVLARRLRRTNNNLADLIFTDVPGRVAKQLLQLAQRFGTQEGGALRVTHDLTQEEI AQLVGASRETVNKALADFAHRGWIRLEGKSVLISDSERLARRAR (SEQ ID NO: 129)

Rv2821c

CONSERVED HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGACTACGAGCTAC (SEQ ID NO: 102)

3' primer:

TGATGATGAGAACCCCCCCCAACAGCCGCGAGTTCATGGT (SEQ ID NO: 103)

Polynucleotide sequence:

ATGACTACGAGCTACGCCAAGATCGAGATAACCGGGACACTGACCGTCCTGACGGGCCTG

CAGATCGGGGCCGGCGATGGCTTCTCCGCCATCGGCGCGGTCGACAAGCCTGTCGTTCGT

GATCCGCTGAGCAGGCTGCCGATGATTCCGGGTACCAGCCTGAAGGGCAAGGTCCGCACC

TTGCTGTCCCGCCAATACGGCGCCGACACAGAAACGTTTTACAGGAAGCCGAATGAGGAC

CACGCCCATATCCGTCGGCTTTTCGGCGACACCGAGGAGTACATGACGGGCCGACTCGTC

TTCCGCGACACGAAGCTCACCAACAAAGACGACCTCGAAGCCCGCGGCGCTAAGACTCTC

ACCGAGGTGAAATTCGAGAACGCCATCAACCGGGTGACCGCAAAGGCAAACCTTCGCCAG

ATGGAACGCGTGATCCCCGGCAGCGAGTTCGCGTTCTCACTTGTCTACGAGGTCTCCTTC

GGCACCCCCGGCGAGGAACAGAAGGCGTCTCTGCCTTCCTCCGATGAGATCATCGAGGAC

TTCAACGCCATCGCGCGCGGCCTGAAGTTGCTCGAACTCGACTACCTCGGCGGCAGCGGA

ACCCGTGGCTACGGGCAGGTCAAGTTCAGCAACCTGAAAGCCCGCGCCGCAGTCGGCGCC

CTCGACGGTTCTCTGCTGGAGAAGCTAAACCATGAACTCGCGGCTGTT (SEQ ID NO: 118)

Amino acid sequence:

MTTSYAKIEITGTLTVLTGLQIGAGDGFSAIGAVDKPVVRDPLSRLPMIPGTSLKGKVRT

LLSRQYGADTETFYRKPNEDHAHIRRLFGDTEEYMTGRLVFRDTKLTNKDDLEARGAKTL

TEVKFENAINRVTAKANLRQMERVIPGSEFAFSLVYEVSFGTPGEEQKASLPSSDEIIED FNAIARGLKLLELDYLGGSGTRGYGQVKFSNLKARAAVGALDGSLLEKLNHELAAV (SEQ ID NO: 130)

Rv1056

CONSERVED HYPOTHETICAL PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGAGCGTGGATTAC (SEQ ID NO: 104)

3' primer:

TGATGATGAGAACCCCCCCCGCTGAACTGAGTGTGCGGCC (SEQ ID NO: 105)

Polynucleotide sequence:

ATGAGCGTGGATTACCCCCAAATGGCTGCTACCCGGGGAAGAATAGAACCGGCCCCGCGG

CGAGTTCGCGGCTATCTCGGACATGTGCTCGTCTTCGACACCAGTGCGGCGCGCTATGTC

TGGGAGGTTCCCTACTACCCGCAGTACTACATCCCGCTGGCGGATGTCCGCATGGAGTTC

CTGCGCGACGAGAACCACCCGCAGCGAGTGCAGCTGGGTCCGTCGCGGCTGCACTCCTTG GTAAGCGCCGGTCAGACCCACCGATCGGCGGCGCGGGTATTCGATGTCGACGGCGACAGC CCGGTGGCGGGCACCGTGCGTTTCAACTGGGATCCGCTGCGGTGGTTCGAGGAGGACGAG CCGATCTACGGCCATCCGCGCAATCCCTATCAGCGGGCCGATGCGCTGCGCTCGCACCGA CACGTCCGTGTCGAGCTGGACGGCATTGTGCTCGCTGACACCCGATCGCCCGTTCTGCTA TTCGAAACTGGGATACCCACAAGGTATTACATCGATCCGGCCGACATCGCTTTCGAGCAT CTGGAGCCCACCTCGACGCAGACGTTGTGTCCGTACAAGGGGACGACGTCGGGCTATTGG TCTGTGCGCGTCGGCGACGCCGTGCACCGCGACCTGGCCTGGACGTATCACTATCCACTG CCCGCCGTTGCCCCGATCGCCGGCCTGGTGGCGTTTTACAACGAGAAGGTCGACCTCACC GTCGACGGCGTCGCCCTGCCGCGGCCGCACACTCAGTTCAGC (SEQ ID NO: 119)

Amino acid sequence:

MSVDYPQMAATRGRIEPAPRRVRGYLGHVLVFDTSAARYVWEVPYYPQYYIPLADVRMEF

LRDENHPQRVQLGPSRLHSLVSAGQTHRSAARVFDVDGDSPVAGTVRFNWDPLRWFEEDE PIYGHPRNPYQRADALRSHRHVRVELDGIVLADTRSPVLLFETGIPTRYYIDPADIAFEH LEPTSTQTLCPYKGTTSGYWSVRVGDAVHRDLAWTYHYPLPAVAPIAGLVAFYNEKVDLT VDGVALPRPHTQFS (SEQ ID NO: 131)

Rv1353c

PROBABLE TRANSCRIPTIONAL REGULATORY PROTEIN

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGCAGACAACCCCA (SEQ ID NO: 106)

3' primer:

TGATGATGAGAACCCCCCCCACGCGCCACCGCTTTGGCCC (SEQ ID NO: 107)

Polynucleotide sequence:

ATGCAGACAACCCCAGGCAAGCGTCAACGACGGCAGCGCGGATCCATCAACCCCGAGGAC

ATCATCAGCGGCGCATTCGAACTCGCCCAGCAGGTATCGATAGACAACTTGAGCATGCCA

TTGCTCGGCAAACACCTTGGCGTCGGGGTCACCAGCATCTACTGGTACTTCCGCAAGAAG

GACGATCTGCTCAACGCGATGACCGACCGCGCTTTGAGCAAGTACGTGTTCGCTACCCCG

TACATCGAAGCCGGCGACTGGCGCGAAACGTTGCGCAATCATGCCCGCTCGATGCGGAAG

ACGTTCGCGGACAACCCCGTACTGTGCGATCTGATACTGATTCGAGCGGCGCTGTCCCCG

AAAACGGCGCGGTTGGGCGCCCAAGAGATGGAGAAGGCCATCGCCAATCTGGTGACGGCG

GGCCTGTCGCTCGAAGACGCTTTCGACATCTACTCGGCGGTTTCGGTCCACGTGCGCGGA

TCGGTGGTGCTAGATCGGCTCTCCCGCAAGAGCCAGTCGGCGGGCAGCGGACCATCCGCC

ATTGAACACCCCGTGGCCATCGATCCCGCGACGACTCCGCTGCTTGCTCACGCAACTGGG

AGGGGGCATCGGATCGGGGCCCCCGATGAAACCAATTTCGAATATGGTCTCGAATGCATC

CTCGACCATGCTGGCCGGTTGATCGAACAAAGCTCGAAAGCCGCTGGTGAGGTCGCAGTG

CGCCGCCCCACGGCCACCGCCGATGCGCCTACGCCGGGCGCGCGGGCCAAAGCGGTGGCG

CGT (SEQ ID NO: 120)

Amino acid sequence:

MQTTPGKRQRRQRGSINPEDIISGAFELAQQVSIDNLSMPLLGKHLGVGVTSIYWYFRKK

DDLLNAMTDRALSKYVFATPYIEAGDWRETLRNHARSMRKTFADNPVLCDLILIRAALSP

KTARLGAQEMEKAIANLVTAGLSLEDAFDIYSAVSVHVRGSVVLDRLSRKSQSAGSGPSA

IEHPVAIDPATTPLLAHATGRGHRIGAPDETNFEYGLECILDHAGRLIEQSSKAAGEVAV

RRPTATADAPTPGARAKAVAR (SEQ ID NO: 132)

5' primer:

GAAGGAGATATACCATGCATCATCATCATCATCATATGACGATCCCTGAT (SEQ ID NO: 108)

3' primer:

TGATGATGAGAACCCCCCCCCAGGCCATCAAAAAAGTCCT (SEQ ID NO: 109)

Polynucleotide sequence:

ATGACGATCCCTGATGCCCAGACGTTGATGCGGCCGATTCTCGCGTATCTTGCCGATGGA

CAAGCGAAGTCGGCCAAGGACGTCATCGCGGCGATGTCCGACGAGTTCGGTCTGTCCGAC

GACGAGCGGGCGCAGATGTTGCCCAGCGGTCGGCAAAGGACCATGTACGACAGGGTGCAC

TGGTCTCTCACTCACATGTCGCAGGCCGGATTGCTCGACCGTCCCACGCGGGGCCACGTC

CAGGTCACGGACACGGGCCGTCAAGTCCTGAAGGCGCATCCCGAGCGCGTCGACATGGCT

GTGCTGCGGGAGTTCCCGTCGTACATCGCTTTTCGTGAGCGAACCAAAGCCAAGCAGCCA

GTCGACGCGACCGCCAAGCGACCGTCCGGGGACGATGTGCAGGTCTCACCCGAGGATCTC

ATCGACGCTGCGCTTGCGGAGAACCGGGCAGCCGTCGAGGGGGAGATCCTGAAGAAGGCA

CTCACGTTGTCGCCCACCGGGTTTGAAGATCTGGTTATCAGACTTTTGGAGGCGATGGGT

TACGGGCGAGCCGGCGCGGTGGAACGGACGAGTGCCTCCGGTGACGCTGGCATCGACGGA

ATCATCAGCCAGGACCCGCTCGGGCTGGACCGCATCTACGTGCAGGCCAAGCGATACGCC

GTCGACCAAACGATTGGCCGGCCGAAGATCCACGAGTTCGCCGGCGCCCTCCTGGGCAAG

CAGGGCGACCGGGGCGTCTACATCACCACGTCATCGTTTTCCCGCGGTGCCCGCGAGGAA

GCTGAGCGGATCAACGCCCGGATCGAACTCATCGACGGCGCTCGGCTGGCCGAGCTGCTC

GTGCGGTATCGAGTCGGTGTCCAGGCGGTGCAGACCGTCGAACTCTTACGGCTCGACGAG

GACTTTTTTGATGGCCTG (SEQ ID NO: 121)

Amino acid sequence:

MTIPDAQTLMRPILAYLADGQAKSAKDVIAAMSDEFGLSDDERAQMLPSGRQRTMYDRVH WSLTHMSQAGLLDRPTRGHVQVTDTGRQVLKAHPERVDMAVLREFPSYIAFRERTKAKQP VDATAKRPSGDDVQVSPEDLIDAALAENRAAVEGEILKKALTLSPTGFEDLVIRLLEAMG YGRAGAVERTSASGDAGIDGIISQDPLGLDRIYVQAKRYAVDQTIGRPKIHEFAGALLGK QGDRGVYITTSSFSRGAREEAERINARIELIDGARLAELLVRYRVGVQAVQTVELLRLDE DFFDGL (SEQ ID NO: 133)

[0171] The PCR reactions contained 100 ng Mtb genomic DNA and a 25 nM final concentration of 5' and 3' primers. Polymerase, PCR buffer and nucleotides were from Clontech. The reaction temperatures and times for the first PCR reaction were: 94°C for 2 minutes, followed by 30 cycles of 94°C for 30 seconds, 48°C for 1 minute, and 68°C for 2.5 minutes.

[0172] Following the first PCR reaction, an aliquot of each PCR reaction containing 100 ng of PCR product from the previous step was transferred into a PCR reaction containing the TAP promoter and terminator fragments. The sequences of these fragments were:

[0173] Promoter fragment:

[0174] 5' CGGTCACGCTTGGGACTGCCATAGGCTGGCCCGGTGATGCCGGCCACGATGCGTCCGGCGTAGAGGATCGAGATCTCGATCCCGCGAAATTAATACGACTCACTATAGGGAGACCACAACGGTTTCCCTCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACC 3' (SEQ ID NO: 6)

[0175] Terminator fragment:

[0176] 5' GGGGGGGGTTCTCATCATCATCATCATCATTAATAAAAGGGCGAATTCCAGCACACTGGCGGCCGTTACTAGTGGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGAGCGACTCCCACGGCACGTTGGCAAGCTCG 3' (SEQ ID NO: 7)
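The gene-specific primers listed above share common 5' tails, and those tails overlap the ends of the promoter and terminator fragments, which is what allows the second PCR to join the three pieces. The sketch below illustrates this relationship. It rests on assumptions read off the listed primers rather than stated in the text: a 5' tail consisting of the promoter overlap, ATG and six CAT codons, followed by the first 15 nucleotides of the gene, and a 3' tail followed by the reverse complement of the last 20 nucleotides of the gene. The names `tap_primers`, `PROMOTER_END` and `TERMINATOR_START` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


# Common primer tails, as they appear at the start of every listed primer.
FIVE_PRIME_TAIL = "GAAGGAGATATACCATGCATCATCATCATCATCAT"  # overlap + ATG + 6 x CAT
THREE_PRIME_TAIL = "TGATGATGAGAACCCCCCCC"

# Excerpts of the TAP fragments: 3' end of SEQ ID NO: 6, 5' end of SEQ ID NO: 7.
PROMOTER_END = "TTTAAGAAGGAGATATACC"
TERMINATOR_START = "GGGGGGGGTTCTCATCATCA"


def tap_primers(cds: str, fwd_len: int = 15, rev_len: int = 20):
    """Build a primer pair for a coding sequence (lengths are assumptions)."""
    cds = "".join(cds.split()).upper()
    forward = FIVE_PRIME_TAIL + cds[:fwd_len]
    reverse = THREE_PRIME_TAIL + reverse_complement(cds[-rev_len:])
    return forward, reverse


# Only the two ends of a gene affect its primers, so the middle of the
# Rv0740 sequence (SEQ ID NO: 111) is omitted from this excerpt.
rv0740_ends = "ATGCTGCCGAAGAACACCAGACCCACCTCG" + "GACCCGAAAGACGCCGCCGAGGGC"
fwd, rev = tap_primers(rv0740_ends)
print(fwd)  # expected to equal SEQ ID NO: 88
print(rev)  # expected to equal SEQ ID NO: 89

# The tails overlap the fragment ends used in the second PCR reaction.
print(PROMOTER_END.endswith(FIVE_PRIME_TAIL[:14]))
print(reverse_complement(THREE_PRIME_TAIL) == TERMINATOR_START)
```

Under these assumptions the sketch reproduces the Rv0740 primers as listed, and both overlap checks hold for the fragment ends quoted in the code.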

[0177] The reaction temperatures and times for the second PCR reaction were: 94°C for 2 minutes, followed by 30 cycles of 94°C for 30 seconds, 48°C for 60 seconds, and 68°C for 2.5 minutes.

Protein expression

[0178] The TAP fragments generated by PCR were used as templates for in vitro protein expression using a Roche RTS 100 transcription/translation kit according to the manufacturer's instructions. Approximately 0.5-1.0 μg of PCR product was used as template, producing approximately 0.5-5.0 μg of protein per template.

Protein purification

[0179] MagneHis nickel-coated magnetic beads (Promega) were used to purify the expressed proteins. 15 μl of Ni-magnetic beads were pipetted into each well of a microtiter plate. To each well, 50 μl of wash buffer (50 mM NaHPO₄, pH 8.0, 300 mM NaCl, 100 mM imidazole) was added with mixing, and the plates were placed on a magnetic stand. The supernatant was removed and the wash was repeated. 50 μl of the protein mixture was added with gentle pipetting. The mixture was incubated at room temperature for 2 minutes. The beads were then separated using a magnetic stand and washed 3 times with 150 μl of wash buffer, and the bound protein was eluted from the beads with 50 μl of 50 mM NaHPO₄, pH 8.0, 300 mM NaCl, 250 mM imidazole.

Western blot

[0180] 15 μl of the purified proteins were resolved on 4-12% SDS-polyacrylamide gels and transferred to nitrocellulose membranes. The membranes were blocked in TBST/1% BSA, followed by incubation with TBST/1% BSA containing 1000-fold diluted rabbit anti-Mtb serum. The blots were washed and then incubated with alkaline phosphatase-conjugated goat anti-rabbit serum secondary antibody. Colorimetric development was used to develop the blots. The results of these analyses are shown in FIGURE 6.

ELISA

[0181] The wells of Nunc-Immuno MaxiSorp 96-well plates were coated with 5 μl of expressed protein diluted in 95 μl PBS. The plates were mixed well on a shaker and then incubated overnight at 4°C. The plates were washed with PBS + 0.05% Tween-20 for 5 min with shaking at 200 rpm. 200 μl of PBS + 1% BSA blocking solution was added and the plates were incubated for 1 hr at room temperature, with shaking at 150 rpm. The blocking buffer was removed and 100 μl of primary antibody (O4.E293.1.11.WCL(-)LAM rabbit polyclonal antibodies diluted 1:200000 in blocking solution) was added. Following 1 hr of incubation at room temperature with shaking at 150 rpm, the plates were washed 3 times with PBS + 0.05% Tween-20. 100 μl of secondary antibody (anti-rabbit IgG (H+L)-HRP conjugate (Promega) diluted 1:2500 in blocking solution) was added to each well and the plates were incubated for 1 hr at room temperature, with shaking at 150 rpm. After washing 3 times, 100 μl of TMB substrate solution (Promega) was added to each well and the blue color was allowed to develop for 15 min at room temperature without shaking. 100 μl of 1 N HCl was added to each well to stop the reaction and change the blue color to yellow. The plates were read in a spectrophotometer at 450 nm after 30 min. The results of this analysis are shown in FIGURE 6.

[0182] As shown in FIGURE 6, rabbit anti-Mtb serum identified 19 and 12 proteins that were reactive to the antiserum by Western blot and ELISA, respectively. The results showed a strong correlation in 'hits' between the two methods. In addition, a few antigen proteins at low abundance exhibited high reactivity relative to the others, suggesting the presence of strong B-cell epitopes and making them prime candidates for additional study.

Example 3. Using the Mtb proteome to identify the antigenic targets of cell-mediated immunity in Mtb vaccinated mice and humans.

[0183] The following is a method that is used to systematically screen and identify antigens in Mtb that give rise to a protective cell-mediated immune response. Through the use of TAP technology, coding sequences of the Mtb genome are amplified. The PCR reactions are performed such that each amplified coding sequence becomes transcriptionally active. The resulting TAP fragments are expressed to produce Mtb polypeptides. Each of the polypeptides is delivered into dendritic cells, located in 96-well plates, using a polypeptide delivery reagent. Serum from Mtb immunized humans is added to each of the different wells.

[0184] An IFN-γ ELIspot assay is run using the following materials and method:

MATERIALS:

[0185] Millipore 96-well multi-screen filtration plates (Millipore #MAIP S45-10) (Millipore, Bedford, MA)

[0186] Anti-IFN-γ purified MAb (Clone 1-D1K) (MABTECH #3420-3) (Mabtech, Naka, Sweden)

[0187] Anti-IFN-γ biotinylated MAb (Clone 7-B6-1) (MABTECH #3420-6) (Mabtech, Naka, Sweden)

[0188] Streptavidin-Alkaline Phosphatase (MABTECH #3310-8) (Mabtech, Naka, Sweden)

[0189] Alkaline Phosphate Substrate Kit (BIO-RAD #170-6432) (Bio-Rad, Hercules, CA)

[0190] Carbonate Buffer pH 9.6 (0.2 μm sterile filtered)

[0191] RPMI-1640 Medium (GIBCO #22400-089) (Gibco, Grand Island, N.Y.)

[0192] Fetal Bovine Serum (Sigma #F4135-500mL) (Sigma, St. Louis, MO)

[0193] 1X PBS (prepared from 10X PBS, DIGENE #3400-1010) (DIGENE, Gaithersburg, MD)

[0194] Tween® 20 (J.T. Baker #X251-07) (J.T. Baker, Phillipsburg, NJ)

METHOD:

[0195] 96-well plates are coated with Coating Antibody (anti-IFN-γ Clone 1-D1K) at 10-15 μg/mL (100 μL/well) and incubated at 4°C overnight. Using aseptic technique, plates are flicked to remove Coating Antibody and washed 6 times with RPMI-1640. Plates are blocked with 100 μL/well of RPMI-1640 + 10% FBS (or Human AB serum) for 1-2 hours at room temperature. Plates are flicked to remove blocking buffer and 100 μL/well of antigen-specific or control peptides are added at a final concentration of 10 μg/well. Peripheral blood lymphocytes (PBL) are added at 4×10⁵/well and 1×10⁵/well. Plates are incubated at 37°C/5% CO₂ for 36 hours. Plates are flicked to remove cells and washed 6 times with PBS + 0.05% Tween® 20 at 200-250 μL/well. Plates are blot dried on paper towels.

[0196] Biotinylated antibody (anti-IFN-γ Clone 7-B6-1) diluted 1:1,000 in 1X PBS is added at 100 μL/well. The plates are incubated for 3 hours at room temperature. Plates are flicked to remove biotinylated antibody and washed 6 times with PBS + 0.05% Tween® 20 at 200-250 μL/well. Plates are blot dried on paper towels. Streptavidin alkaline phosphatase diluted 1:1,000 in 1X PBS is added at 100 μL/well. The plates are incubated for 1 hour at room temperature. Plates are flicked to remove the streptavidin alkaline phosphatase and washed 6 times with 0.05% Tween® 20 at 200-250 μL/well. The plates are washed again 3 times with 1X PBS at 200-250 μL/well. The plates are blot dried on paper towels.

[0197] Substrate is added at 100 μL/well for 10-15 minutes at room temperature. The substrate is prepared according to the manufacturer's protocol. The 25X substrate buffer is diluted in dH₂O to a 1X concentration. Reagents A and B are each diluted 1:100 in the 1X substrate buffer. The colorimetric reaction is stopped by rinsing the plates with generous amounts of tap water (flooding the plate and flicking several times). Plates are allowed to dry overnight at room temperature in the dark. Spots corresponding to IFN-γ producing cells are counted visually using a stereomicroscope (Zeiss KS ELIspot). Results can be expressed as the number of IFN-γ-secreting cells per 10⁶ spleen cells. Responses are considered positive if the response to the test Mtb peptide epitope is significantly different (p<0.05) from the response to no peptide and if the stimulation index (SI = response with test peptide / response with control peptide) is greater than 2.0.

Example 4. Cellular Vaccine Antigen Screen

[0198] A human volunteer was immunized with irradiated sporozoites from P. falciparum, the infectious agent responsible for malaria. Dendritic cells from the volunteer were isolated and cultured. Recombinant CSP polypeptide from P. falciparum was delivered to dendritic cells with or without polypeptide delivery reagents described in U.S. Patent Application No. 09/738046, entitled "Intracellular Protein Delivery Reagent," which is hereby incorporated by reference in its entirety. T-cells isolated from the immunized volunteer were added to the cultures. The ELIspot assay identified 120 CSP antigen-specific T-cells out of 250,000 T-cells that were added to the culture when CSP was added to the culture together with said delivery reagents. When CSP was added without said delivery reagents, the signal was barely above background.

Example 5. DNA Immunization of mice

[0199] Experiments were set up with five animals per group, consisting of four week old BALB/c female mice, averaging 40 animals per experiment. These mice were immunized IM in each tibialis anterior muscle with 50 μg plasmid DNA or transcriptionally active PCR fragment encoding selected Mtb antigens, 3 times at 3 week intervals.

[0200] Sera were collected 10 days after each immunization for antibody studies. Blood samples (~50 μl) were collected from the mice by orbital bleed with a sterilized Pasteur pipette. The mice were bled about once a week at a volume of approximately 50 μl.
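The dosing and bleed schedule described in this example can be laid out as a simple day count. The sketch below is only an illustration: it assumes that the first immunization is day 0 and that a 3 week interval means exactly 21 days, neither of which is stated in the text, and all names are hypothetical.

```python
# Assumptions (not stated in the text): day 0 is the first immunization and
# "3 week intervals" means exactly 21 days.
INTERVAL_DAYS = 21
N_IMMUNIZATIONS = 3
SERUM_OFFSET_DAYS = 10   # sera collected 10 days after each immunization
ANIMALS_PER_GROUP = 5
ANIMALS_PER_EXPERIMENT = 40  # stated as an average

immunization_days = [i * INTERVAL_DAYS for i in range(N_IMMUNIZATIONS)]
serum_days = [day + SERUM_OFFSET_DAYS for day in immunization_days]
groups_per_experiment = ANIMALS_PER_EXPERIMENT // ANIMALS_PER_GROUP

print(immunization_days)      # [0, 21, 42]
print(serum_days)             # [10, 31, 52]
print(groups_per_experiment)  # 8
```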

[0201] Splenocytes were harvested 14 days after the 3rd immunization and pooled for T-cell studies such as IFN-γ ELIspot assays. Tissue collections were performed on animals euthanized via CO₂ (SOP 98.19) at the end of the experiment. The experiments can be five animals/group, averaging 40 animals/experiment x 4 experiments for a total of 160 mice.

Example 6. Preparation of human dendritic cells

[0202] Dendritic cells were ordered from Allcells: Cat # PB002 (NPB-Mononuclear Cells). The cells were in 50 mL buffer. The cells were counted immediately; the total number was 312.5 x 10⁶. The cells were pelleted and resuspended in 25 mL RPMI-1640 containing DNAse. This solution (30 μg/mL) was incubated for 5 minutes at room temperature. The cells were washed twice with complete medium. The cells were resuspended at 10 x 10⁶ cells/3 mL. Twelve 10mm dishes containing 10 mL complete medium in each dish were used. The cells were incubated at 37°C for 3 hours. The non-adherent cells were removed by gently shaking the plates and aspirating the supernatant. Afterwards, the dishes containing adherent cells were washed 3 times with 10 mL of RPMI-1640 containing 2% Human Serum. 10 mL of culture medium containing 50 ng/mL GM-CSF and 500 U/mL IL-4 was added to each plate. This culture medium was added until day 4. After day 4, culture medium without GM-CSF and IL-4 was added. The transfection was done on day 5. The complete medium consisted of RPMI-1640 (455 mL), 5% Human AB Serum (25 mL), Non-essential Amino Acids (5 mL), Sodium Pyruvate (5 mL), L-Glutamine (5 mL), and Penicillin-Streptomycin (5 mL).

Example 7. Generation of dendritic cells from mouse bone marrow

[0203] Cells were taken from the bones of one mouse (2 femurs and 2 tibiae, without removing the macrophages). The red blood cells were obtained from the bone marrow and lysed. The cells were counted (51 x 10⁶ cells, total) and cultured in a growth medium (2.5 x 10⁶ cells/plate, 10 mL/plate) for 8 days before transfection. On day 4 another 10 mL of growth medium was added. On day 6, 10 mL of the old medium was taken from each plate and the cells were pelleted. The cells were resuspended in 10 mL medium with 10 ng/mL GM-CSF and 2.5 ng/mL IL-4. The cells were placed back into the culture. The cells were cultured until transfection on day 8. On the day of transfection, 2.5 x 10⁶ cells were harvested from each dish. The growth medium for mbmDC contained DMEM/Iscove, 10% FCS, 50 μM β-mercaptoethanol, 1x Penicillin/Streptomycin, 2 mM L-Glutamine, 10 mM Hepes, 1x Non-essential amino acids, 20 ng/mL rmGM-CSF, and 5 ng/mL rmIL-4.

Example 8. Adding an HA Epitope Tag

[0204] Oligos were designed using TAP promoter and terminator fragments from pCMVm and pTP-SV40, respectively, and adding the nucleotide sequence encoding the HA epitope tag. For adding the HA epitope to the 5' end of the coding sequence, the following sequences are used:

[0205] Promoter 5': CCGCCATGTTGACATTG (SEQ ID NO: 2)

[0206] Promoter 3':

[0207] GGCAGATCTGGGAGGCTAGCGTAATCCGGAACATCGTATGGGTACATTGTTAAGTCGACGGTGC (SEQ ID NO: 3)

[0208] For adding the HA epitope to the 3' end of the coding sequence, the following sequences are used:

[0209] Terminator 5':

[0210] GATCCCGGGTACCCATACGATGTTCCGGATTACGCTTAGGGGAGATCTCAGACATG (SEQ ID NO: 4)

[0211] Terminator 3':

CAGGATATCATGCCTGCAGGACGACTCTAGAG (SEQ ID NO: 5)
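The position of the HA coding sequence within these oligonucleotides can be located by translating each oligonucleotide on both strands. The sketch below assumes the commonly used HA epitope sequence YPYDVPDYA, which is general knowledge and is not recited in this section, and reads the Promoter 3' and Terminator 5' oligonucleotides (SEQ ID NOS: 3 and 4) as single continuous sequences. All function and variable names are hypothetical.

```python
HA_EPITOPE = "YPYDVPDYA"  # assumed: the commonly used HA tag, not stated in the text

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def find_epitope(oligo: str, epitope: str):
    """Return (strand, offset) of the first window encoding the epitope, or None."""
    width = 3 * len(epitope)
    for strand, seq in (("+", oligo), ("-", reverse_complement(oligo))):
        for start in range(len(seq) - width + 1):
            if translate(seq[start:start + width]) == epitope:
                return strand, start
    return None


# SEQ ID NOS: 4 and 3, each read as one continuous oligonucleotide.
TERMINATOR_5 = "GATCCCGGGTACCCATACGATGTTCCGGATTACGCTTAGGGGAGATCTCAGACATG"
PROMOTER_3 = "GGCAGATCTGGGAGGCTAGCGTAATCCGGAACATCGTATGGGTACATTGTTAAGTCGACGGTGC"

print(find_epitope(TERMINATOR_5, HA_EPITOPE))  # ('+', 9)
print(find_epitope(PROMOTER_3, HA_EPITOPE))
```

Under these assumptions the epitope is encoded on the forward strand of the Terminator 5' oligonucleotide, immediately followed by a TAG stop codon, and on the reverse strand of the Promoter 3' oligonucleotide, which is what would be expected of a reverse primer.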

[0212] The method includes:

[0213] PCR is used to amplify a new HA-promoter utilizing pCMVm as a template and a new HA-terminator utilizing pTP-SV40 as a template. The resulting PCR products are gel purified using the QIAGEN QIAquick Gel Extraction Kit (Qiagen, Seattle, WA). The PCR products and both plasmids (pCMVm and pTP-SV40) are digested with EcoRV and BglII restriction enzymes. All digested products are gel purified using the QIAquick Gel Extraction Kit. The HA-promoter and HA-terminator are ligated separately into the digested pCMVm and pTP-SV40 plasmids. These plasmids are transformed into DH5 and grown overnight on LB plates containing kanamycin; colonies are selected and grown in LB media containing kanamycin. The QIAGEN QIAprep Spin Miniprep Kit is used to isolate plasmids. Plasmids are digested using EcoRV and BglII. Digests are run on a gel to identify clones containing plasmid with an insert of the correct size. The plasmids are sequenced to confirm that the inserts are correct. A prep culture is grown, plasmids are isolated and digested with EcoRV and BglII, and the promoter and terminator fragments are gel purified. Epi-TAP-5'HA and Epi-TAP-3'HA kits are used.

Example 9. Intracellular Cytokine Staining (ICS)

[0214] Bone marrow derived dendritic cells (BMDCs) were prepared by culturing bone marrow cell suspensions with RPMI tissue culture media plus 10% fetal bovine serum and GM-CSF (20 ng/ml) for 6-7 days at 37°C, 5% CO₂. Cells were then primed with 1 μg/ml of antigen for 4 hrs at 37°C, 5% CO₂.

[0215] Cell suspensions obtained from naive or M. tuberculosis infected mice were used as a source of CD4 T cells. CD4 T cells were isolated by magnetic cell sorting, overlaid onto BMDCs primed with specific antigens and cultured at 37°C for 24 hrs. After this time, T cells were harvested, stained for CD3/CD4/intracellular IFNγ and analyzed by flow cytometry.

[0216] The sequences disclosed in Table 1 yielded positive results in at least one assay described herein, e.g. Western blot, ELISA or ICS.

Example 10: In one embodiment the method includes detection of antigen-specific CD4⁺ T-cell responses by intracellular cytokine staining (ICS)

[0217] A panel of immunogenic Mtb proteins discovered in Phase I studies that were recognized by rabbit anti-TB sera was selected for further analysis to determine if these proteins could lead to enhanced induction of CD4⁺ T-cells. Thirty-six purified Mtb proteins along with positive controls, culture filtrate proteins (CFP), and recombinant ESAT-6, were included in the ICS assay. The results are summarized in Table 2 and demonstrate that 11 of the 36 proteins significantly stimulated CD4⁺ T-cell responses. Moreover, with equal protein amounts used, 6 Mtb proteins showed greater stimulatory activity than that of ESAT-6.

Table 2. Antigen-specific stimulation of CD4⁺ T-cells

ID        % T-cells
Rv3733c   4.3
Rv0138    4.3
Rv0740    8.3
Rv0733    3.7
Rv0009    4.0
Rv2882c   3.6
Rv1065    2.8
Rv2613c   2.0
Rv0475    2.1
Rv2114    2.9
Rv2466c   2.6
Rv3763    2.5
Rv2031c   2.2
Rv1347c   2.0
Rv0158    2.1
Rv3676    1.5
Rv2821    1.5
Rv2108    1.3
Rv3226c   2.5
Rv1056    2.2
Rv0815c   1.7
Rv3117    1.9
Rv1073    1.6
Rv0097    1.4
ESAT-6    3.6
Media     1.4

[0218] One μg each of 36 purified Mtb proteins, along with the control protein ESAT-6, was incubated with mouse dendritic cells for 24 hr. Spleen cells harvested from Mtb-infected mice were added and incubated for an additional 72 hr. The splenocytes were labeled with cychrome-conjugated anti-CD4 antibody and then stained with fluorescein-conjugated anti-γIFN. The cells were washed, fixed and analyzed by flow cytometry. The "% T-cells" indicates the percentage of CD4⁺ T-cells that released γIFN. Based on previous studies, a percent value at or above 2.5% is significant.
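The 2.5% significance threshold can be applied directly to the values of Table 2. The following sketch does so for the 24 proteins that are tabulated (Table 2 does not list all 36 proteins tested); the dictionary and variable names are illustrative only.

```python
THRESHOLD = 2.5  # percent; significance cut-off stated in paragraph [0218]

# "% T-cells" values as tabulated in Table 2 (letter O in IDs read as zero).
table2 = {
    "Rv3733c": 4.3, "Rv0138": 4.3, "Rv0740": 8.3, "Rv0733": 3.7,
    "Rv0009": 4.0, "Rv2882c": 3.6, "Rv1065": 2.8, "Rv2613c": 2.0,
    "Rv0475": 2.1, "Rv2114": 2.9, "Rv2466c": 2.6, "Rv3763": 2.5,
    "Rv2031c": 2.2, "Rv1347c": 2.0, "Rv0158": 2.1, "Rv3676": 1.5,
    "Rv2821": 1.5, "Rv2108": 1.3, "Rv3226c": 2.5, "Rv1056": 2.2,
    "Rv0815c": 1.7, "Rv3117": 1.9, "Rv1073": 1.6, "Rv0097": 1.4,
}
controls = {"ESAT-6": 3.6, "Media": 1.4}

significant = [rv for rv, pct in table2.items() if pct >= THRESHOLD]
at_or_above_esat6 = [rv for rv, pct in table2.items() if pct >= controls["ESAT-6"]]
strictly_above_esat6 = [rv for rv, pct in table2.items() if pct > controls["ESAT-6"]]

print(len(significant))           # 11
print(len(at_or_above_esat6))     # 6
print(len(strictly_above_esat6))  # 5
```

With the values as tabulated to one decimal place, eleven proteins reach the threshold, in agreement with paragraph [0217]. Rv2882c ties ESAT-6 at 3.6, so the count of six proteins compared with ESAT-6 is reproduced only when that tie is included.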

Claims

WHAT IS CLAIMED IS:

1. An isolated polynucleotide encoding a Mtb polypeptide that is antigenic in a mammal, wherein the polynucleotide is selected from the group consisting of:

SEQ ID NOS: 46-64, 110-121; or a fragment thereof, wherein the fragment encodes an antigenic peptide epitope.

2. The isolated polynucleotide of claim 1, wherein said mammal is a rabbit, human or mouse.

3. An isolated polynucleotide encoding an immunogenic Mtb antigen, wherein the polynucleotide is selected from the group consisting of:

SEQ ID NOS: 46-64, 110-121; or a fragment thereof, wherein the fragment encodes an immunogenic peptide epitope.

4. The polynucleotide of claim 1, wherein said immunogenic Mtb antigen reacts with polyclonal antibodies directed to Mtb.

5. The polynucleotide of claim 1, wherein said immunogenic Mtb antigen is detected by either ELISA or Western blotting using a polyclonal antibody directed to Mtb.

6. A TAP polynucleotide comprising: a 5' TAP polynucleotide sequence; a Mtb polynucleotide sequence; and a 3' TAP polynucleotide sequence.

7. The TAP polynucleotide of claim 6, wherein the Mtb polynucleotide sequence is selected from the group consisting of: SEQ ID NOS: 46-64, 110-121.

8. The polynucleotide of claim 6, wherein the 5' TAP polynucleotide sequence comprises a promoter.

9. The polynucleotide of claim 6, wherein the 5' TAP polynucleotide sequence is selected from the group consisting of: SEQ ID NOS: 2, 3, 6, and 84.

10. The polynucleotide of claim 6, wherein the 3' TAP polynucleotide sequence comprises a terminator.

11. The polynucleotide of claim 6, wherein the 3' TAP polynucleotide sequence is selected from the group consisting of: SEQ ID NOS: 4, 5, 7, and 85.

12. A primer pair for amplifying a Mtb polynucleotide selected from the group consisting of: SEQ ID NOS: 8 and 9; 10 and 11; 12 and 13; 14 and 15; 16 and 17; 18 and 19; 20 and 21; 22 and 23; 24 and 25; 26 and 27; 28 and 29; 30 and 31; 32 and 33; 34 and 35; 36 and 37; 38 and 39; 40 and 41; 42 and 43; 44 and 45; 86 and 87; 88 and 89; 90 and 91; 92 and 93; 94 and 95; 96 and 97; 98 and 99; 100 and 101; 102 and 103; 104 and 105; 106 and 107; 108 and 109.

13. An isolated Mtb peptide or polypeptide that is antigenic, wherein said peptide or polypeptide is selected from:

SEQ ID NOS: 65-83, 122-133; or a fragment thereof, wherein the fragment is antigenic.

14. An isolated Mtb peptide or polypeptide that is immunogenic, wherein the peptide or polypeptide is selected from the group consisting of:

SEQ ID NOS: 65-83, 122-133; or a fragment thereof, wherein the fragment is immunogenic.

15. The peptide or polypeptide of claim 14, wherein said peptide or polypeptide reacts with polyclonal antibodies directed to Mtb bacteria.

16. The peptide or polypeptide of claim 14, wherein said peptide or polypeptide is detected by either ELISA or Western blotting using a polyclonal antibody that is directed to Mtb.

17. A recombinant Mtb peptide or polypeptide, wherein the amino terminus of said peptide or polypeptide comprises a HA tag or a His tag, and the carboxy terminus of said peptide or polypeptide is selected from the group consisting of: SEQ ID NOS: 65-83, 110-121.

18. A recombinant Mtb peptide or polypeptide, wherein the carboxy terminus of said peptide or polypeptide comprises a HA tag or a His tag, and the amino terminus of said peptide or polypeptide is selected from the group consisting of: SEQ ID NOS: 65-83, 110-121.

19. A recombinant Mtb peptide or polypeptide selected from the group consisting of: SEQ ID NOS: 65-83, 110-121, wherein said peptide or polypeptide is expressed in an in vitro transcription and translation system.

20. The recombinant Mtb peptide or polypeptide of claim 19, wherein said in vitro transcription-translation system is a T7 polymerase system.

21. An immunogenic composition for inducing an immunological response in a mammalian host against Mtb, comprising a nucleic acid that encodes at least one immunogenic peptide or polypeptide selected from the group consisting of: SEQ ID NOS: 65-83, 110-121; or a fragment thereof, wherein the fragment is immunogenic.

22. The immunogenic composition of claim 21, wherein said nucleic acid comprises a plasmid.

23. The immunogenic composition of claim 21, wherein said nucleic acid comprises a TAP fragment.

24. The immunogenic composition of claim 21, wherein said nucleic acid induces a humoral immune response.

25. The immunogenic composition of claim 21, wherein said nucleic acid induces a cell-mediated immune response.

26. The immunogenic composition of claim 21, further comprising an adjuvant.

27. An immunogenic composition comprising at least one isolated Mtb peptide or polypeptide selected from the group consisting of: SEQ ID NOS: 65-83, 110-121; or a fragment thereof, wherein the fragment is immunogenic.

28. The immunogenic composition of claim 27, wherein said isolated Mtb peptide or polypeptide is expressed in an in vitro transcription and translation system.

29. The immunogenic composition of claim 27, further comprising an adjuvant.

30. A method of generating an immune response in a mammalian host against Mtb comprising: administering to said mammalian host an immunogenic composition comprising at least one nucleic acid selected from the group of SEQ ID NO: 46-64, 110-121, fragments thereof or combinations thereof, wherein said nucleic acid encodes at least one immunogenic peptide or polypeptide, whereby said immune response against Mtb is generated.

31. A method of generating an immune response in a mammalian host against Mtb comprising: administering to said mammalian host an immunogenic composition comprising at least one nucleic acid encoding at least one immunogenic peptide or polypeptide selected from the group of SEQ ID NO: 65-83, 122-133, fragments thereof or combinations thereof, whereby said immune response against Mtb is generated.

32. A method of generating an immune response in a mammalian host against Mtb comprising: administering to said mammalian host an immunogenic composition comprising at least one immunogenic peptide or polypeptide selected from SEQ ID NO: 65-83, 122-133, fragments thereof or combinations thereof, whereby said immune response against Mtb is generated.

33. The method according to claim 30, 31 or 32, wherein said immunogenic composition further comprises an adjuvant.

34. A kit comprising: a) at least one Mtb immunogenic composition selected from: i) a nucleic acid selected from the group consisting of SEQ ID NO: 46-64, 110-121, fragments thereof or combinations thereof; or ii) a peptide selected from the group consisting of SEQ ID NO: 65-83, 122-133, fragments thereof or combinations thereof; and b) an adjuvant.

35. The kit of claim 34, wherein, when said Mtb immunogenic composition is a nucleic acid, the kit further comprises an expression system.

36. The kit of claim 34 or 35 comprising at least two of said immunogenic compositions.

37. The kit of claim 36 comprising at least two of said peptides.

38. The kit of claim 36, comprising at least two of said nucleic acids.

39. A kit comprising: a) at least one Mtb immunogenic composition selected from: i) a nucleic acid selected from the group consisting of SEQ ID NO: 46-64, 110-121, fragments thereof or combinations thereof; or ii) a peptide selected from the group consisting of SEQ ID NO: 65-83, 122-133, fragments thereof or combinations thereof; and b) a positive control.

40. An isolated polynucleotide having at least 85% nucleic acid sequence identity to a nucleic acid sequence encoding a polypeptide selected from the group consisting of: SEQ ID NOs: 65-83 and 122-133, or immunogenic or antigenic fragment thereof; or a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 46-64 and 110-121, or immunogenic or antigenic fragment thereof, wherein said isolated polynucleotide encodes an immunogenic or antigenic Mtb polypeptide.

41. The isolated polynucleotide of Claim 40, wherein said polynucleotide has at least 90% nucleic acid sequence identity to a nucleic acid sequence encoding a polypeptide selected from the group consisting of: SEQ ID NOs: 65-83 and 122-133, or immunogenic or antigenic fragment thereof; or a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 46-64 and 110-121, or immunogenic or antigenic fragment thereof, wherein said isolated polynucleotide encodes an immunogenic or antigenic Mtb polypeptide.

42. The isolated polynucleotide of Claim 40, wherein said polynucleotide has at least 95% nucleic acid sequence identity to a nucleic acid sequence encoding a polypeptide selected from the group consisting of: SEQ ID NOs: 65-83 and 122-133, or immunogenic or antigenic fragment thereof; or a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 46-64 and 110-121, or immunogenic or antigenic fragment thereof, wherein said isolated polynucleotide encodes an immunogenic or antigenic Mtb polypeptide.

43. The isolated polynucleotide of Claim 40, wherein said polynucleotide has at least 99% nucleic acid sequence identity to a nucleic acid sequence encoding a polypeptide selected from the group consisting of: SEQ ID NOs: 65-83 and 122-133, or immunogenic or antigenic fragment thereof; or a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 46-64 and 110-121, or immunogenic or antigenic fragment thereof, wherein said isolated polynucleotide encodes an immunogenic or antigenic Mtb polypeptide.

44. An isolated polynucleotide consisting essentially of a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 46-64 and 110-121, or an immunogenic or antigenic fragment thereof.

45. The isolated polynucleotide of Claim 40, further comprising a 5' TAP polynucleotide sequence.

46. The isolated polynucleotide of Claim 40, further comprising a 3' TAP polynucleotide sequence.

47. The isolated polynucleotide of Claim 44, further comprising a 5' TAP polynucleotide sequence.

48. The isolated polynucleotide of Claim 44, further comprising a 3' TAP polynucleotide sequence.

49. An isolated polypeptide having at least 85% amino acid sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 65-83 and 122-133, or immunogenic or antigenic fragments thereof, wherein said isolated polypeptide encodes an immunogenic or antigenic Mtb polypeptide.

50. The isolated polypeptide of Claim 49 having at least 90% amino acid sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 65-83 and 122-133, or immunogenic or antigenic fragments thereof, wherein said isolated polypeptide encodes an immunogenic or antigenic Mtb polypeptide.

51. The isolated polypeptide of Claim 49 having at least 95% amino acid sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 65-83 and 122-133, or immunogenic or antigenic fragments thereof, wherein said isolated polypeptide encodes an immunogenic or antigenic Mtb polypeptide.

52. The isolated polypeptide of Claim 49 having at least 99% amino acid sequence identity to an amino acid sequence selected from the group consisting of SEQ ID NOs: 65-83 and 122-133, or immunogenic or antigenic fragments thereof, wherein said isolated polypeptide encodes an immunogenic or antigenic Mtb polypeptide.

53. An isolated polypeptide consisting essentially of an amino acid sequence selected from the group consisting of SEQ ID NOs: 65-83 and 122-133, or an immunogenic or antigenic fragment thereof.

54. An isolated polypeptide comprising:

(1) an amino acid sequence selected from the group consisting of SEQ ID NOs: 65-83 and 122-123;

(2) an immunogenic or antigenic fragment thereof; or

(3) a polypeptide having at least 85% identity to (1) or (2); and wherein the isolated polypeptide further comprises a heterologous polypeptide sequence or a non-contiguous homologous polypeptide sequence.

55. An isolated nucleic acid comprising:

(1) a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 46-64 and 110-121;

(2) a fragment thereof that encodes an immunogenic or antigenic polypeptide or peptide; or

(3) a nucleic acid having at least 85% identity to (1) or (2); and wherein the isolated nucleic acid further comprises a heterologous polynucleotide sequence or a non-contiguous polynucleotide sequence.

56. A primer that hybridizes to Mtb sequences, wherein said primer is at least 12 residues in length, and wherein said primer hybridizes under stringent conditions to at least 12 consecutive bases of a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 8-45 and 86-109, or the complement thereof.

57. Use of a composition comprising at least one nucleic acid selected from the group consisting of SEQ ID NOs: 46-64, 110-121, or fragments thereof that encode immunogenic or antigenic polypeptides, or variants thereof, or combinations thereof to generate an immune response in a mammalian host against Mtb.

58. Use of a composition comprising at least one nucleic acid that encodes at least one peptide selected from the group consisting of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof, or combinations thereof, to generate an immune response in a mammalian host against Mtb.

59. Use of a composition comprising at least one polypeptide selected from the group consisting of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof, or combinations thereof, to generate an immune response in a mammalian host against Mtb.

60. Use of a composition comprising at least one nucleic acid selected from the group consisting of SEQ ID NOs: 46-64, 110-121, or fragments thereof that encode immunogenic or antigenic polypeptides, or variants thereof, or combinations thereof in the manufacture of a medicament to generate an immune response in a mammalian host against Mtb.

61. Use of a composition comprising at least one nucleic acid that encodes at least one peptide selected from the group consisting of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof, or combinations thereof, in the manufacture of a medicament to generate an immune response in a mammalian host against Mtb.

62. Use of a composition comprising at least one polypeptide selected from the group consisting of SEQ ID NOs: 65-83, 122-133, or immunogenic or antigenic fragments thereof, or combinations thereof, in the manufacture of a medicament to generate an immune response in a mammalian host against Mtb.

63. The immunogenic composition of Claim 21, wherein said nucleic acid is adapted to express said immunogenic or antigenic polypeptide or peptide in vivo in a mammalian host cell.

64. The method of Claim 30, wherein said nucleic acid is adapted to express said immunogenic or antigenic polypeptide or peptide in vivo in a mammalian host cell.

65. The method of Claim 31, wherein said nucleic acid is adapted to express said immunogenic or antigenic polypeptide or peptide in vivo in a mammalian host cell.
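
Illustrative note on sequence identity. Claims 40-43 and 49-52 recite thresholds of at least 85%, 90%, 95% and 99% sequence identity but do not themselves state how identity is calculated. The following is a minimal sketch of one common convention, offered only as an illustration and not as the method used by the applicants. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it counts identical residues over all aligned columns, with a gap column counting as a mismatch. The function names percent_identity and meets_threshold, and the example sequences, are hypothetical and do not correspond to any SEQ ID NO in this application.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two PRE-ALIGNED sequences (gaps as '-').

    Assumption: identity = identical residues / aligned columns * 100,
    where columns that are gaps in both sequences are ignored and a
    column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Keep every column except those that are gaps in both sequences.
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")

    # A match requires the same residue in both sequences (never a gap).
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if percent identity is at least `threshold` (e.g. 85, 90, 95, 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-residue sequences differing at one position.
    reference = "ACGTACGTAC"
    candidate = "ACGTACGTAA"
    print(percent_identity(reference, candidate))
    print(meets_threshold(reference, candidate, 85))
    print(meets_threshold(reference, candidate, 95))
```

Other conventions exist, for example dividing by the length of the shorter sequence or excluding gap columns entirely, and the choice can change whether a borderline sequence falls above or below a recited threshold.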