WO1999024577A1

WO1999024577A1 - NUCLEIC ACID FRAGMENTS AND POLYPEPTIDE FRAGMENTS DERIVED FROM M. TUBERCULOSIS

Info

Publication number: WO1999024577A1
Application number: PCT/DK1998/000438
Authority: WO
Inventors: Peter Andersen; Rikke SKJØT
Original assignee: Statens Serum Institut
Priority date: 1997-11-10
Filing date: 1998-10-08
Publication date: 1999-05-20
Also published as: EP1029053A1; NZ504951A; AU750173B2; AU9433898A; CA2319380A1

Abstract

The present invention is based on the identification and characterization of a number of M. tuberculosis derived novel proteins and protein fragments (SEQ ID NOs: 175, 177, 179, 181, 183, and 185). The invention is directed to the polypeptides and immunologically active fragments thereof, the genes encoding them, immunological compositions such as vaccines and skin test reagents containing the polypeptides. Another part of the invention is based on the surprising discovery that CFP7A induces a high protective immune response.

Description

NUCLEIC ACID FRAGMENTS AND POLYPEPTIDE FRAGMENTS DERIVED FROM M. TUBERCULOSIS

FIELD OF THE INVENTION

The present invention relates to a number of immunologically active, novel polypeptide fragments derived from Mycobacterium tuberculosis, vaccines and other immunologic compositions containing the fragments as immunogenic components, and methods of production and use of the polypeptides. The invention also relates to novel nucleic acid fragments derived from M. tuberculosis which are useful in the preparation of the polypeptide fragments of the invention or in the diagnosis of infection with M. tuberculosis.

BACKGROUND OF THE INVENTION

Human tuberculosis (hereinafter designated "TB") caused by Mycobacterium tuberculosis is a severe global health problem responsible for approximately 3 million deaths annually, according to the WHO. The worldwide incidence of new TB cases had been progressively falling for the last decade, but recent years have markedly changed this trend due to the advent of AIDS and the appearance of multidrug resistant strains of M. tuberculosis.

The only vaccine presently available for clinical use is BCG, a vaccine whose efficacy remains a matter of controversy. BCG generally induces a high level of acquired resistance in animal models of TB, but several human trials in developing countries have failed to demonstrate significant protection. Notably, BCG is not approved by the FDA for use in the United States.

This makes the development of a new and improved vaccine against TB an urgent matter which has been given a very high priority by the WHO. Many attempts to define protective mycobacterial substances have been made, and from 1950 to 1970 several investigators reported an increased resistance after experimental vaccination. However, the demonstration of a specific long-term protective immune response with the potency of BCG has not yet been achieved by administration of soluble proteins or cell wall fragments, although progress is currently being made by relying on polypeptides derived from short term-culture filtrate, cf. the discussion below.

Immunity to M. tuberculosis is characterized by three basic features: i) Living bacilli efficiently induce a protective immune response in contrast to killed preparations; ii) Specifically sensitized T lymphocytes mediate this protection; iii) The most important mediator molecule seems to be interferon gamma (IFN-γ).

Short term-culture filtrate (ST-CF) is a complex mixture of proteins released from M. tuberculosis during the first few days of growth in a liquid medium (Andersen et al., 1991). Culture filtrates have been suggested to hold protective antigens recognized by the host in the first phase of TB infection (Andersen et al., 1991; Orme et al., 1993). Recent data from several laboratories have demonstrated that experimental subunit vaccines based on culture filtrate antigens can provide high levels of acquired resistance to TB (Pal and Horwitz, 1992; Roberts et al., 1995; Andersen, 1994; Lindblad et al., 1997). Culture filtrates are, however, complex protein mixtures and until now very limited information has been available on the molecules responsible for this protective immune response. In this regard, only two culture filtrate antigens have been described as involved in protective immunity, the low mass antigen ESAT-6 (Andersen et al., 1995 and EP-A-0 706 571) and the 31 kDa molecule Ag85B (EP-0 432 203).

There is therefore a need for the identification of further antigens involved in the induction of protective immunity against TB in order to eventually produce an effective subunit vaccine.

OBJECT OF THE INVENTION

It is an object of the invention to provide novel antigens which are effective as components in a subunit vaccine against TB or which are useful as components in diagnostic compositions for the detection of infection with mycobacteria, especially virulence-associated mycobacteria. The novel antigens may also be important drug targets.

SUMMARY OF THE INVENTION

The present invention is i.a. based on the identification and characterization of a number of previously uncharacterized culture filtrate antigens from M. tuberculosis. In animal models of TB, T cells mediating immunity are focused predominantly to antigens in the regions 6-12 and 17-30 kDa of ST-CF. In the present invention 6 antigens in the low molecular weight region (ORF7-1, ORF7-2, ORF11-1, ORF11-2, ORF11-3, ORF11-4) have been identified.

Furthermore immunological and biological data on several important antigens are presented.

The encoding genes for 8 antigens have been determined. The panel holds antigens with potential for vaccine purposes as well as for diagnostic purposes, since the antigens are all secreted by metabolizing mycobacteria.

The following table lists the antigens of the invention by the names used herein as well as by reference to relevant SEQ ID NOs of N-terminal sequences, full amino acid sequences and sequences of DNA encoding the antigens:

Antigen     N-terminal sequence   Nucleotide sequence   Amino acid sequence
            SEQ ID NO:            SEQ ID NO:            SEQ ID NO:
CFP7        -                     1                     2
CFP7A       81                    47                    48
CFP7B       168                   146                   147
CFP8A       73                    148                   149
CFP8B       74                    150                   151
CFP9        -                     3                     4
CFP10A      169                   140                   141
CFP11       170                   142                   143
CFP16       79                    63                    64
CFP17       17                    5                     6
CFP19       82                    49                    50
CFP19A      -                     51                    52
CFP19B      80                    -                     -
CFP20       18                    7                     8
CFP21       19                    9                     10
CFP22       20                    11                    12
CFP22A      83                    53                    54
CFP23       -                     55                    56
CFP23A      76                    -                     -
CFP23B      75                    -                     -
CFP25       21                    13                    14
CFP25A      78                    65                    66
CFP27       84                    57                    58
CFP28       22                    -                     -
CFP29       23                    15                    16
CFP30A      85                    59                    60
CFP30B      171                   144                   145
CFP50       86                    61                    62
MPT51       -                     41                    42
CWP32       77                    152                   153
RD1-ORF8    -                     67                    68
RD1-ORF2    -                     71                    72
RD1-ORF9B   -                     69                    70
RD1-ORF3    -                     87                    88
RD1-ORF9A   -                     93                    94
RD1-ORF4    -                     89                    90
RD1-ORF5    -                     91                    92
ORF7-1      -                     174                   175
ORF7-2      -                     176                   177
ORF11-1     -                     178                   179
ORF11-2     -                     180                   181
ORF11-3     -                     182                   183
ORF11-4     -                     184                   185

MPT59-ESAT6: SEQ ID NO: 172
ESAT6-MPT59: SEQ ID NO: 173

It is well-known in the art that T-cell epitopes are responsible for the elicitation of the acquired immunity against TB, whereas B-cell epitopes are without any significant influence on acquired immunity and recognition of mycobacteria in vivo. Since such T-cell epitopes are linear and are known to have a minimum length of 6 amino acid residues, the present invention is especially concerned with the identification and utilisation of such T-cell epitopes.

Hence, in its broadest aspect the invention relates to a substantially pure polypeptide fragment which

a) comprises an amino acid sequence selected from the sequences shown in SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, and 185,

b) comprises a subsequence of the polypeptide fragment defined in a) which has a length of at least 6 amino acid residues, said subsequence being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex, or

c) comprises an amino acid sequence having a sequence identity with the polypeptide defined in a) or the subsequence defined in b) of at least 70% and at the same time being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex,

with the proviso that

i) the polypeptide fragment is in essentially pure form when consisting of the amino acid sequence 1-96 of SEQ ID NO: 2 or when consisting of the amino acid sequence 87-108 of SEQ ID NO: 4 fused to β-galactosidase,

ii) the degree of sequence identity in c) is at least 95% when the polypeptide comprises a homologue of a polypeptide which has the amino acid sequence SEQ ID NO: 12 or a subsequence thereof as defined in b), and

iii) the polypeptide fragment contains a threonine residue corresponding to position 213 in SEQ ID NO: 42 when comprising an amino acid sequence of at least 6 amino acids in SEQ ID NO: 42.

Other parts of the invention pertain to the DNA fragments encoding a polypeptide with the above definition as well as to DNA fragments useful for determining the presence of DNA encoding such polypeptides.

DETAILED DISCLOSURE OF THE INVENTION

In the present specification and claims, the term "polypeptide fragment" denotes both short peptides with a length of at least two amino acid residues and at most 10 amino acid residues, oligopeptides (11-100 amino acid residues), and longer peptides (the usual interpretation of "polypeptide", i.e. more than 100 amino acid residues in length) as well as proteins (the functional entity comprising at least one peptide, oligopeptide, or polypeptide which may be chemically modified by being glycosylated, by being lipidated, or by comprising prosthetic groups). The definition of polypeptides also comprises native forms of peptides/proteins in mycobacteria as well as recombinant proteins or peptides in any type of expression vectors transforming any kind of host, and also chemically synthesized peptides.

In the present context the term "substantially pure polypeptide fragment" means a polypeptide preparation which contains at most 5% by weight of other polypeptide material with which it is natively associated (lower percentages of other polypeptide material are preferred, e.g. at most 4%, at most 3%, at most 2%, at most 1%, and at most ½%). It is preferred that the substantially pure polypeptide is at least 96% pure, i.e. that the polypeptide constitutes at least 96% by weight of total polypeptide material present in the preparation, and higher percentages are preferred, such as at least 97%, at least 98%, at least 99%, at least 99.25%, at least 99.5%, and at least 99.75%. It is especially preferred that the polypeptide fragment is in "essentially pure form", i.e. that the polypeptide fragment is essentially free of any other antigen with which it is natively associated, i.e. free of any other antigen from bacteria belonging to the tuberculosis complex. This can be accomplished by preparing the polypeptide fragment by means of recombinant methods in a non-mycobacterial host cell as will be described in detail below, or by synthesizing the polypeptide fragment by the well-known methods of solid or liquid phase peptide synthesis, e.g. by the method described by Merrifield or variations thereof.

The term "subsequence" when used in connection with a polypeptide of the invention having a SEQ ID NO selected from 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, and 185 denotes any continuous stretch of at least 6 amino acid residues taken from the M. tuberculosis derived polypeptides in SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, and 185 and being immunologically equivalent thereto with respect to the ability of conferring increased resistance to infections with bacteria belonging to the tuberculosis complex. Thus, included is also a polypeptide from different sources, such as other bacteria or even from eukaryotic cells.

By an "immunologically equivalent" polypeptide is herein meant that the polypeptide, when formulated in a vaccine or a diagnostic agent (i.e. together with a pharmaceutically acceptable carrier or vehicle and optionally an adjuvant), will

I) confer, upon administration (either alone or as an immunologically active constituent together with other antigens), an acquired increased specific resistance in a mouse and/or in a guinea pig and/or in a primate such as a human being against infections with bacteria belonging to the tuberculosis complex which is at least 20% of the acquired increased resistance conferred by Mycobacterium bovis BCG and also at least 20% of the acquired increased resistance conferred by the parent polypeptide comprising SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 (said parent polypeptide having substantially the same relative location and pattern in a 2DE gel prepared as the 2DE gel shown in Fig. 6, cf. the examples), the acquired increased resistance being assessed by the observed reduction in mycobacterial counts from spleen, lung or other organ homogenates isolated from the mouse or guinea pig receiving a challenge infection with a virulent strain of M. tuberculosis, or, in a primate such as a human being, being assessed by determining the protection against development of clinical tuberculosis in a vaccinated group versus that observed in a control group receiving a placebo or BCG (preferably the increased resistance is higher and corresponds to at least 50% of the protective immune response elicited by M. bovis BCG, such as at least 60%, or even more preferred to at least 80% of the protective immune response elicited by M. bovis BCG, such as at least 90%; in some cases it is expected that the increased resistance will supersede that conferred by M. bovis BCG, and hence it is preferred that the resistance will be at least 100%, such as at least 110% of said increased resistance); and/or

II) elicit a diagnostically significant immune response in a mammal indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex; this diagnostically significant immune response can be in the form of a delayed type hypersensitivity reaction which can e.g. be determined by a skin test, or can be in the form of IFN-γ release determined e.g. by an IFN-γ assay as described in detail below. A diagnostically significant response in a skin test setup will be a reaction which gives rise to a skin reaction which is at least 5 mm in diameter and which is at least 65% (preferably at least 75%, such as at least 85%) of the skin reaction (assessed as the skin reaction diameter) elicited by the parent polypeptide comprising SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185.

The ability of the polypeptide fragment to confer increased immunity may thus be assessed by measuring in an experimental animal, e.g. a mouse or a guinea pig, the reduction in mycobacterial counts from the spleen, lung or other organ homogenates isolated from the experimental animal which has received a challenge infection with a virulent strain of mycobacteria belonging to the tuberculosis complex after previously having been immunized with the polypeptide, as compared to the mycobacterial counts in a control group of experimental animals infected with the same virulent strain, which experimental animals have not previously been immunized against tuberculosis. The comparison of the mycobacterial counts may also be carried out with mycobacterial counts from a group of experimental animals receiving a challenge infection with the same virulent strain after having been immunized with Mycobacterium bovis BCG.

The mycobacterial counts in homogenates from the experimental animals immunized with a polypeptide fragment according to the present invention must at the most be 5 times the counts in the mice or guinea pigs immunized with Mycobacterium bovis BCG, such as at the most 3 times the counts, and preferably at the most 2 times the counts.
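The count criterion above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the use of one summary count per group, and the example numbers are hypothetical and are not taken from the specification, which does not say how per-animal counts are to be pooled.

```python
def count_ratio_to_bcg(count_polypeptide: float, count_bcg: float) -> float:
    """Fold difference between the mycobacterial count in the polypeptide-immunized
    group and the count in the BCG-immunized group (one summary value per group;
    how the per-animal counts are summarized is an assumption left to the caller)."""
    if count_polypeptide < 0 or count_bcg <= 0:
        raise ValueError("counts must be non-negative, and the BCG count positive")
    return count_polypeptide / count_bcg


def meets_count_criterion(count_polypeptide: float, count_bcg: float,
                          max_fold: float = 5.0) -> bool:
    """True when the polypeptide group's count is at most `max_fold` times the
    BCG group's count (5 by default; 3 and 2 are the preferred stricter limits)."""
    return count_ratio_to_bcg(count_polypeptide, count_bcg) <= max_fold


# Hypothetical counts per organ homogenate, for illustration only.
print(meets_count_criterion(4e4, 1e4))                # ratio 4.0, within the 5-fold limit
print(meets_count_criterion(4e4, 1e4, max_fold=3.0))  # ratio 4.0, outside the 3-fold limit
```

Passing `max_fold=3.0` or `max_fold=2.0` applies the stricter, preferred limits named in the paragraph above.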

A more relevant assessment of the ability of the polypeptide fragment of the invention to confer increased resistance is to compare the incidence of clinical tuberculosis in two groups of individuals (e.g. humans or other primates) where one group receives a vaccine as described herein which contains an antigen of the invention and the other group receives either a placebo or another known TB vaccine (e.g. BCG). In such a setup, the antigen of the invention should give rise to a protective immunity which is significantly higher than the one provided by the administration of the placebo (as determined by statistical methods known to the skilled artisan).

In the context of the present application, the term "wide genetically" should be understood in a meaning of at least two strains. That is, if a polypeptide is recognised by at least two different strains, it is considered to have a wide genetic recognition.

A subunit vaccine component is defined as a reagent which stimulates protective immunity in an animal model of infection with an organism of the M. tuberculosis complex, when given prior to infection, and which also generates a significant immune response in human volunteers.

The "tuberculosis-complex" has its usual meaning, i.e. the complex of mycobacteria causing TB which are Mycobacterium tuberculosis, Mycobacterium bovis, Mycobacterium bovis BCG, and Mycobacterium africanum.

In the present context the term "metabolizing mycobacteria" means live mycobacteria that are multiplying logarithmically and releasing polypeptides into the culture medium wherein they are cultured.

The term "sequence identity" indicates a quantitative measure of the degree of homology between two amino acid sequences or between two nucleotide sequences of equal length, or if not of equal length aligned to best possible fit. The sequence identity can be calculated as $\frac{(N_{ref} - N_{dif}) \cdot 100}{N_{ref}}$, wherein $N_{dif}$ is the total number of non-identical residues in the two sequences when aligned and wherein $N_{ref}$ is the number of residues in one of the sequences. Hence, the DNA sequence AGTCAGTC will have a sequence identity of 75% with the sequence AATCAATC ($N_{dif}$ = 2 and $N_{ref}$ = 8). A gap is counted as non-identity of the specific residue(s), i.e. the DNA sequence AGTGTC will have a sequence identity of 75% with the DNA sequence AGTCAGTC ($N_{dif}$ = 2 and $N_{ref}$ = 8). Sequence identity can alternatively be calculated by the BLASTP program (Pearson W.R. and D.J. Lipman (1988) PNAS USA 85:2444-2448) in the EMBL database (www.ncbi.nlm.gov/cgi-bin/BLAST). Generally, the default settings with respect to e.g. "scoring matrix" and "gap penalty" will be used for alignment.
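As a worked illustration of this definition, the following Python sketch computes the identity as (N_ref - N_dif) · 100 / N_ref for two sequences that have already been aligned. The function name and the use of "-" as the gap character are assumptions made for the example; the sketch does not perform the alignment itself, which would be done with a separate alignment tool.

```python
def sequence_identity(aligned_seq: str, aligned_ref: str, gap: str = "-") -> float:
    """Percent identity, (N_ref - N_dif) * 100 / N_ref, for two pre-aligned sequences.

    N_ref is the number of residues in the reference sequence (gap characters
    excluded); N_dif is the number of aligned positions that differ, so a
    residue facing a gap counts as a non-identity.
    """
    if len(aligned_seq) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    n_ref = sum(1 for residue in aligned_ref if residue != gap)
    if n_ref == 0:
        raise ValueError("reference sequence contains no residues")
    n_dif = sum(1 for a, r in zip(aligned_seq, aligned_ref) if a != r)
    return (n_ref - n_dif) * 100 / n_ref


# The two examples given in the text (N_dif = 2, N_ref = 8 in both cases).
print(sequence_identity("AATCAATC", "AGTCAGTC"))  # 75.0
print(sequence_identity("AGT--GTC", "AGTCAGTC"))  # 75.0
```

In the second call the six-residue sequence AGTGTC is written with two gap characters so that it lines up with the eight-residue reference; each gap is counted as one non-identical residue.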

The sequence identity is used here to illustrate the degree of identity between the amino acid sequence of a given polypeptide and the amino acid sequence shown in SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185. The amino acid sequence to be compared with the amino acid sequence shown in SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 may be deduced from a DNA sequence, e.g. obtained by hybridization as defined below, or may be obtained by conventional amino acid sequencing methods. The sequence identity is preferably determined on the amino acid sequence of a mature polypeptide, i.e. without taking any leader sequence into consideration.

As appears from the above disclosure, polypeptides which are not identical to the polypeptides having SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 are embraced by the present invention. The invention allows for minor variations which do not have an adverse effect on immunogenicity compared to the parent sequences and which may give interesting and useful novel binding properties or biological functions and immunogenicities etc.

Each polypeptide fragment may thus be characterized by specific amino acid and nucleic acid sequences. It will be understood that such sequences include analogues and variants produced by recombinant methods wherein such nucleic acid and polypeptide sequences have been modified by substitution, insertion, addition and/or deletion of one or more nucleotides in said nucleic acid sequences to cause the substitution, insertion, addition or deletion of one or more amino acid residues in the recombinant polypeptide. When the term DNA is used in the following, it should be understood that for the number of purposes where DNA can be substituted with RNA, the term DNA should be read to include RNA embodiments which will be apparent for the man skilled in the art. For the purposes of hybridization, PNA or LNA may be used instead of DNA. PNA has been shown to exhibit a very dynamic hybridization profile (PNA is described in Nielsen P E et al., 1991, Science 254: 1497-1500). LNA (Locked Nucleic Acids) is a recently introduced oligonucleotide analogue containing bicyclo nucleoside monomers (Koshkin et al., 1998, 54, 3607-3630; Nielsen, N.K. et al. J.Am.Chem.Soc 1998, 120, 5458-5463).

In both immunodiagnostics and vaccine preparation, it is often possible and practical to prepare antigens from segments of a known immunogenic protein or polypeptide. Certain epitopic regions may be used to produce responses similar to those produced by the entire antigenic polypeptide. Potential antigenic or immunogenic regions may be identified by any of a number of approaches, e.g., Jameson-Wolf or Kyte-Doolittle antigenicity analyses or Hopp and Woods (1981) hydrophobicity analysis (see, e.g., Jameson and Wolf, 1988; Kyte and Doolittle, 1982; or U.S. Patent No. 4,554,101). Hydrophobicity analysis assigns average hydrophilicity values to each amino acid residue; from these values average hydrophilicities can be calculated and regions of greatest hydrophilicity determined. Using one or more of these methods, regions of predicted antigenicity may be derived from the amino acid sequence assigned to the polypeptides of the invention.
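The windowed averaging described above can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the specification: it uses the published Kyte-Doolittle hydropathy scale (on which positive values are hydrophobic, so the most hydrophilic window is the one with the lowest mean), and the window length of 7, the function names and the example sequence are assumptions chosen for the example.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982); positive = hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_hydropathy(seq: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every run of `window` consecutive residues."""
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[residue] for residue in seq.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def most_hydrophilic_window(seq: str, window: int = 7) -> tuple[int, str, float]:
    """Return (1-based start position, segment, mean score) of the window with
    the lowest mean hydropathy, i.e. the most hydrophilic stretch."""
    scores = window_hydropathy(seq, window)
    start = min(range(len(scores)), key=scores.__getitem__)
    return start + 1, seq[start:start + window], scores[start]


# Hypothetical sequence, for illustration only.
position, segment, score = most_hydrophilic_window("MLLAVIVGKDEKRSDEALLVIA")
print(position, segment, round(score, 2))
```

A window flagged this way is only a candidate region; whether it is actually recognized would still have to be tested experimentally, for example in the IFN-γ assay.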

Alternatively, in order to identify relevant T-cell epitopes which are recognized during an immune response, it is also possible to use a "brute force" method: Since T-cell epitopes are linear, deletion mutants of polypeptides having SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 will, if constructed systematically, reveal what regions of the polypeptides are essential in immune recognition, e.g. by subjecting these deletion mutants to the IFN-γ assay described herein. Another method utilises overlapping oligomers (preferably synthetic having a length of e.g. 20 amino acid residues) derived from polypeptides having SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185. Some of these will give a positive response in the IFN-γ assay whereas others will not.
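The overlapping-oligomer approach can be sketched as a small Python routine that lists the peptides to be synthesized. The peptide length of 20 residues follows the example in the text; the step of 10 residues (giving a 10-residue overlap), the function name and the placeholder sequence are assumptions made for illustration.

```python
def overlapping_peptides(seq: str, length: int = 20, step: int = 10) -> list[tuple[int, str]]:
    """Split `seq` into overlapping peptides of `length` residues.

    Returns (1-based start position, peptide) pairs. Consecutive peptides are
    offset by `step` residues, and a final peptide is added when needed so that
    the C-terminal residues are covered as well.
    """
    if length < 1 or step < 1:
        raise ValueError("length and step must be positive")
    if len(seq) <= length:
        return [(1, seq)]
    last_start = len(seq) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # extra peptide ending at the C-terminus
    return [(start + 1, seq[start:start + length]) for start in starts]


# Hypothetical 45-residue placeholder sequence, for illustration only.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACDEF"
for start, peptide in overlapping_peptides(placeholder):
    print(start, peptide)
```

Each peptide in the resulting panel would then be tested separately, and the positions of the responding peptides indicate where the recognized epitopes lie.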

In a preferred embodiment of the invention, the polypeptide fragment of the invention comprises an epitope for a T-helper cell.

Although the minimum length of a T-cell epitope has been shown to be at least 6 amino acids, it is normal that such epitopes are constituted of longer stretches of amino acids. Hence it is preferred that the polypeptide fragment of the invention has a length of at least 7 amino acid residues, such as at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 22, at least 24, and at least 30 amino acid residues.

As will appear from the examples, a number of the polypeptides of the invention are natively translation products which include a leader sequence (or other short peptide sequences), whereas the products which can be isolated from short-term culture filtrates from bacteria belonging to the tuberculosis complex are free of these sequences. Although it may in some applications be advantageous to produce these polypeptides recombinantly and in this connection facilitate export of the polypeptides from the host cell by including information encoding the leader sequence in the gene for the polypeptide, it is more often preferred to either substitute the leader sequence with one which has been shown to be superior in the host system for effecting export, or to totally omit the leader sequence (e.g. when producing the polypeptide by peptide synthesis). Hence, a preferred embodiment of the invention is a polypeptide which is free from amino acid residues -30 to -1 in SEQ ID NO: 6 and/or -32 to -1 in SEQ ID NO: 10 and/or -8 to -1 in SEQ ID NO: 12 and/or -32 to -1 in SEQ ID NO: 14 and/or -33 to -1 in SEQ ID NO: 42 and/or -38 to -1 in SEQ ID NO: 52 and/or -33 to -1 in SEQ ID NO: 56 and/or -56 to -1 in SEQ ID NO: 58 and/or -28 to -1 in SEQ ID NO: 151.

In another preferred embodiment, the polypeptide fragment of the invention is free from any signal sequence; this is especially interesting when the polypeptide fragment is produced synthetically but even when the polypeptide fragments are produced recombinantly it is normally acceptable that they are not exported by the host cell to the periplasm or the extracellular space; the polypeptide fragments can be recovered by traditional methods (cf. the discussion below) from the cytoplasm after disruption of the host cells, and if there is need for refolding of the polypeptide fragments, general refolding schemes can be employed, cf. e.g. the disclosure in WO 94/18227 where such a generally applicable refolding method is described.

A suitable assay for the potential utility of a given polypeptide fragment derived from SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 is to assess the ability of the polypeptide fragment to effect IFN-γ release from primed memory T-lymphocytes. Polypeptide fragments which have this capability are according to the invention especially interesting embodiments of the invention: It is contemplated that polypeptide fragments which stimulate T lymphocyte immune response shortly after the onset of the infection are important in the control of the mycobacteria causing the infection before the mycobacteria have succeeded in multiplying up to the number of bacteria that would have resulted in fulminant infection.

It is presently contemplated that when this application refers to IFN-γ release as a measure of immunogenicity, other cytokines could be relevant, such as IL-12, TNF-α, IL-4, IL-5, IL-10, IL-6, TGF-β. Usually one or more cytokines will be measured utilising for example the PCR technique or ELISA. It will be appreciated by the person skilled in the art that a significant increase or decrease in any of these cytokines will be indicative of an immunologically effective polypeptide or polypeptide fragment.

Thus, an important embodiment of the invention is a polypeptide fragment defined above which

1) induces a release of IFN-γ from primed memory T-lymphocytes withdrawn from a mouse within 2 weeks of primary infection or within 4 days after the mouse has been re-challenge infected with mycobacteria belonging to the tuberculosis complex, the induction performed by the addition of the polypeptide to a suspension comprising about 200,000 spleen cells per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension, and/or

2) induces a release of IFN-γ of at least 1,500 pg/ml above background level from about 1,000,000 human PBMC (peripheral blood mononuclear cells) per ml isolated from TB patients in the first phase of infection, or from healthy BCG vaccinated donors, or from healthy contacts to TB patients, the induction being performed by the addition of the polypeptide to a suspension comprising the about 1,000,000 PBMC per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension; and/or

3) induces an IFN-γ release from bovine PBMC derived from animals previously sensitized with mycobacteria belonging to the tuberculosis complex, said release being at least two times the release observed from bovine PBMC derived from animals not previously sensitized with mycobacteria belonging to the tuberculosis complex.

Preferably, in alternatives 1 and 2, the release effected by the polypeptide fragment gives rise to at least 1,500 pg/ml IFN-γ in the supernatant but higher concentrations are preferred, e.g. at least 2,000 pg/ml and even at least 3,000 pg/ml IFN-γ in the supernatant. The IFN-γ release from bovine PBMC can e.g. be measured as the optical density (OD) index over background in a standard cytokine ELISA and should thus be at least two, but higher numbers such as at least 3, 5, 8, and 10 are preferred.
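The numerical read-out criteria above can be expressed as simple checks, sketched here in Python. The function names and the example readings are hypothetical, and interpreting the "OD index over background" as the ratio of the stimulated optical density to the background optical density is an assumption, since the text does not define the index further.

```python
def human_pbmc_positive(ifn_pg_per_ml: float, background_pg_per_ml: float,
                        threshold_pg_per_ml: float = 1500.0) -> bool:
    """Alternative 2: IFN-γ release above background of at least the threshold
    (1,500 pg/ml by default; 2,000 and 3,000 pg/ml are preferred higher values)."""
    return ifn_pg_per_ml - background_pg_per_ml >= threshold_pg_per_ml


def bovine_od_index(od_stimulated: float, od_background: float) -> float:
    """OD index over background, taken here as stimulated OD divided by
    background OD (an assumed reading of the term)."""
    if od_background <= 0:
        raise ValueError("background OD must be positive")
    return od_stimulated / od_background


def bovine_pbmc_positive(od_stimulated: float, od_background: float,
                         min_index: float = 2.0) -> bool:
    """Alternative 3: OD index of at least two (3, 5, 8 and 10 are preferred)."""
    return bovine_od_index(od_stimulated, od_background) >= min_index


# Hypothetical readings, for illustration only.
print(human_pbmc_positive(2100.0, 400.0))  # 1,700 pg/ml above background
print(bovine_pbmc_positive(1.5, 0.5))      # OD index of 3.0
```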

The polypeptide fragments of the invention preferably comprise an amino acid sequence of at least 6 amino acid residues in length which has a higher sequence identity than 70 percent with SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185. A preferred minimum percentage of sequence identity is at least 80%, such as at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and at least 99.5%.

As mentioned above, it will normally be interesting to omit the leader sequences from the polypeptide fragments of the invention. However, by producing fusion polypeptides, superior characteristics of the polypeptide fragments of the invention can be achieved. For instance, fusion partners which facilitate export of the polypeptide when produced recombinantly, fusion partners which facilitate purification of the polypeptide, and fusion partners which enhance the immunogenicity of the polypeptide fragment of the invention are all interesting possibilities. Therefore, the invention also pertains to a fusion polypeptide comprising at least one polypeptide fragment defined above and at least one fusion partner. The fusion partner can, in order to enhance immunogenicity, e.g. be selected from the group consisting of another polypeptide fragment as defined above (so as to allow for multiple expression of relevant epitopes), and another polypeptide derived from a bacterium belonging to the tuberculosis complex, such as ESAT-6, CFP7, CFP10, CFP17, CFP21, CFP25, CFP29, MPB59, MPT59, MPB64, and MPT64 or at least one T-cell epitope of any of these antigens. Other immunogenicity enhancing polypeptides which could serve as fusion partners are T-cell epitopes (e.g. derived from the polypeptides ESAT-6, MPB64, MPT64, or MPB59) or other immunogenic epitopes enhancing the immunogenicity of the target gene product, e.g. lymphokines such as IFN-γ, IL-2 and IL-12. In order to facilitate expression and/or purification the fusion partner can e.g. be a bacterial fimbrial protein, e.g. the pilus components pilin and papA; protein A; the ZZ-peptide (ZZ-fusions are marketed by Pharmacia in Sweden); the maltose binding protein; glutathione S-transferase; β-galactosidase; or poly-histidine.

Other interesting fusion partners are polypeptides which are lipidated and thereby effect that the immunogenic polypeptide is presented in a suitable manner to the immune system. This effect is e.g. known from vaccines based on the Borrelia burgdorferi OspA polypeptide, wherein the lipidated membrane anchor in the polypeptide confers a self-adjuvating effect to the polypeptide (which is natively lipidated) when isolated from cells producing it. In contrast, the OspA polypeptide is relatively silent immunologically when prepared without the lipidation anchor.

As evidenced in Example 6A, the fusion polypeptide consisting of MPT59 fused directly N-terminally to ESAT-6 enhances the immunogenicity of ESAT-6 beyond what would be expected from the immunogenicities of MPT59 and ESAT-6 alone. The precise reason for this surprising finding is not yet known, but it is expected that either the presence of both antigens leads to a synergistic effect with respect to immunogenicity or the presence of a sequence N-terminally to the ESAT-6 sequence protects this immune dominant protein from loss of important epitopes known to be present in the N-terminus. A third, alternative, possibility is that the presence of a sequence C-terminally to the MPT59 sequence enhances the immunologic properties of this antigen. Hence, one part of the invention pertains to a fusion polypeptide fragment which comprises a first amino acid sequence including at least one stretch of amino acids constituting a T-cell epitope derived from the M. tuberculosis protein ESAT-6 or MPT59, and a second amino acid sequence including at least one T-cell epitope derived from a M. tuberculosis protein different from ESAT-6 (if the first stretch of amino acids is derived from ESAT-6) or MPT59 (if the first stretch of amino acids is derived from MPT59) and/or including a stretch of amino acids which protects the first amino acid sequence from in vivo degradation or post-translational processing. The first amino acid sequence may be situated N- or C-terminally to the second amino acid sequence, but in line with the above considerations regarding protection of the ESAT-6 N-terminus it is preferred that the first amino acid sequence is C-terminal to the second when the first amino acid sequence is derived from ESAT-6.

Although only the effect of fusion between MPT59 and ESAT-6 has been investigated at present, it is believed that ESAT-6 and MPT59 or epitopes derived therefrom could advantageously be fused to other fusion partners having substantially the same effect on overall immunogenicity of the fusion construct. Hence, it is preferred that such a fusion polypeptide fragment according to the invention is one wherein the at least one T-cell epitope included in the second amino acid sequence is derived from a M. tuberculosis polypeptide (the "parent" polypeptide) selected from the group consisting of a polypeptide fragment according to the present invention and described in detail above and in the examples, or the amino acid sequence could be derived from any one of the M. tuberculosis proteins DnaK, GroEL, urease, glutamine synthetase, the proline rich complex, L-alanine dehydrogenase, phosphate binding protein, Ag 85 complex, HBHA (heparin binding hemagglutinin), MPT51, MPT64, superoxide dismutase, 19 kDa lipoprotein, α-crystallin, GroES, MPT59 (when the first amino acid sequence is derived from ESAT-6), and ESAT-6 (when the first amino acid sequence is derived from MPT59). It is preferred that the first and second T-cell epitopes each have a sequence identity of at least 70% with the natively occurring sequence in the proteins from which they are derived and it is even further preferred that the first and/or second amino acid sequence has a sequence identity of at least 70% with the protein from which they are derived. A most preferred embodiment of this fusion polypeptide is one wherein the first amino acid sequence is the amino acid sequence of ESAT-6 or MPT59 and/or the second amino acid sequence is the full-length amino acid sequence of the possible "parent" polypeptides listed above.

In the most preferred embodiment, the fusion polypeptide fragment comprises ESAT-6 fused to MPT59 (advantageously, ESAT-6 is fused to the C-terminus of MPT59) and in one special embodiment, there are no linkers introduced between the two amino acid sequences constituting the two parent polypeptide fragments.

Another part of the invention pertains to a nucleic acid fragment in isolated form which

1) comprises a nucleic acid sequence which encodes a polypeptide or fusion polypeptide as defined above, or comprises a nucleic acid sequence complementary thereto, and/or

2) has a length of at least 10 nucleotides and hybridizes readily under stringent hybridization conditions (as defined in the art, i.e. 5-10°C under the melting point T_m, cf. Sambrook et al, 1989, pages 11.45-11.49) with a nucleic acid fragment which has a nucleotide sequence selected from SEQ ID NO: 1 or a sequence complementary thereto,

SEQ ID NO: 3 or a sequence complementary thereto,

SEQ ID NO: 5 or a sequence complementary thereto,

SEQ ID NO: 7 or a sequence complementary thereto,

SEQ ID NO: 9 or a sequence complementary thereto, SEQ ID NO: 11 or a sequence complementary thereto,

SEQ ID NO: 13 or a sequence complementary thereto,

SEQ ID NO: 15 or a sequence complementary thereto,

SEQ ID NO: 41 or a sequence complementary thereto,

SEQ ID NO: 47 or a sequence complementary thereto, SEQ ID NO: 49 or a sequence complementary thereto,

SEQ ID NO: 51 or a sequence complementary thereto,

SEQ ID NO: 53 or a sequence complementary thereto,

SEQ ID NO: 55 or a sequence complementary thereto,

SEQ ID NO: 57 or a sequence complementary thereto, SEQ ID NO: 59 or a sequence complementary thereto, SEQ ID NO: 61 or a sequence complementary thereto, SEQ ID NO: 63 or a sequence complementary thereto, SEQ ID NO: 65 or a sequence complementary thereto, SEQ ID NO: 67 or a sequence complementary thereto, SEQ ID NO: 69 or a sequence complementary thereto, SEQ ID NO: 71 or a sequence complementary thereto, SEQ ID NO: 87 or a sequence complementary thereto, SEQ ID NO: 89 or a sequence complementary thereto, SEQ ID NO: 91 or a sequence complementary thereto, SEQ ID NO: 93 or a sequence complementary thereto, SEQ ID NO: 140 or a sequence complementary thereto, SEQ ID NO: 142 or a sequence complementary thereto, SEQ ID NO: 144 or a sequence complementary thereto, SEQ ID NO: 146 or a sequence complementary thereto, SEQ ID NO: 148 or a sequence complementary thereto, SEQ ID NO: 150 or a sequence complementary thereto, SEQ ID NO: 152 or a sequence complementary thereto, SEQ ID NO: 174 or a sequence complementary thereto, SEQ ID NO: 176 or a sequence complementary thereto, SEQ ID NO: 178 or a sequence complementary thereto, SEQ ID NO: 180 or a sequence complementary thereto, SEQ ID NO: 182 or a sequence complementary thereto, and SEQ ID NO: 184 or a sequence complementary thereto

with the proviso that when the nucleic acid fragment comprises a subsequence of SEQ ID NO: 41, then the nucleic acid fragment contains an A corresponding to position 781 in SEQ ID NO: 41 and when the nucleic acid fragment comprises a subsequence of a nucleotide sequence exactly complementary to SEQ ID NO: 41, then the nucleic acid fragment comprises a T corresponding to position 781 in SEQ ID NO: 41.
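The stringency condition in item 2 (hybridization at 5-10°C below the melting point T_m) can be made concrete with a small sketch. The text refers to Sambrook et al. for the definition and gives no formula; the estimate used here is the common Wallace rule of thumb for short oligonucleotides (2°C per A/T, 4°C per G/C), which is an assumption of this example, as are the function names and the toy probe sequence.

```python
# Estimate a hybridization temperature window 5-10 °C below T_m.
# The Wallace rule is a rough approximation for short oligonucleotides
# and is NOT taken from the text; it is used here only for illustration.

def wallace_tm(oligo: str) -> float:
    """Approximate melting temperature (°C) of a short DNA oligonucleotide."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm: float, low_offset: float = 5.0,
                     high_offset: float = 10.0) -> tuple[float, float]:
    """Temperature range lying 5-10 °C under T_m, as (lowest, highest)."""
    return (tm - high_offset, tm - low_offset)


probe = "ATGCATGCATGCATGCAT"  # hypothetical 18-mer, not from any SEQ ID NO
tm = wallace_tm(probe)
window = stringent_window(tm)
```

For longer probes, the salt- and length-dependent formulas in the cited reference would be more appropriate than this rule of thumb.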

It is preferred that the nucleic acid fragment is a DNA fragment. To provide certainty of the advantages in accordance with the invention, the preferred nucleic acid sequence when employed for hybridization studies or assays includes sequences that are complementary to at least a 10 to 40, or so, nucleotide stretch of the selected sequence. A size of at least 10 nucleotides in length helps to ensure that the fragment will be of sufficient length to form a duplex molecule that is both stable and selective. Molecules having complementary sequences over stretches greater than 10 bases in length are generally preferred, though, in order to increase stability and selectivity of the hybrid, and thereby improve the quality and degree of specific hybrid molecules obtained.

Hence, the term "subsequence" when used in connection with the nucleic acid fragments of the invention is intended to indicate a continuous stretch of at least 10 nucleotides which exhibits the above hybridization pattern. Normally this will require a minimum sequence identity of at least 70% with a subsequence of the hybridization partner having SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 15, 21, 41, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 87, 89, 91, 93, 140, 142, 144, 146, 148, 150, 152, 174, 176, 178, 180, 182, or 184. It is preferred that the nucleic acid fragment is longer than 10 nucleotides, such as at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, and at least 80 nucleotides long, and the sequence identity should preferably also be higher than 70%, such as at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, and at least 98%. It is most preferred that the sequence identity is 100%. Such fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, by application of nucleic acid reproduction technology, such as the PCR technology of U.S. Patent 4,603,102, or by introducing selected sequences into recombinant vectors for recombinant production.

It is well known that the same amino acid may be encoded by various codons, the codon usage being related, inter alia, to the preference of the organisms in question expressing the nucleotide sequence. Thus, at least one nucleotide or codon of a nucleic acid fragment of the invention may be exchanged by others which, when expressed, result in a polypeptide identical or substantially identical to the polypeptide encoded by the nucleic acid fragment in question. The invention thus allows for variations in the sequence such as substitution, insertion (including introns), addition, deletion and rearrangement of one or more nucleotides, which variations do not have any substantial effect on the polypeptide encoded by the nucleic acid fragment or a subsequence thereof. The term "substitution" is intended to mean the replacement of one or more nucleotides in the full nucleotide sequence with one or more different nucleotides, "addition" is understood to mean the addition of one or more nucleotides at either end of the full nucleotide sequence, "insertion" is intended to mean the introduction of one or more nucleotides within the full nucleotide sequence, "deletion" is intended to indicate that one or more nucleotides have been deleted from the full nucleotide sequence whether at either end of the sequence or at any suitable point within it, and "rearrangement" is intended to mean that two or more nucleotide residues have been exchanged with each other.
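The codon degeneracy described above can be shown with a short sketch: two nucleotide sequences that differ by synonymous substitutions translate to the same peptide, whereas a non-synonymous substitution changes it. The sketch uses the standard genetic code; the nine-nucleotide sequences are toy examples and do not correspond to any sequence of the invention.

```python
# Translate DNA with the standard genetic code and compare variants.
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (standard genetic code).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


original = "ATGGCTGAA"       # toy sequence: Met-Ala-Glu
synonymous = "ATGGCCGAG"     # GCT->GCC and GAA->GAG, same amino acids
nonsynonymous = "ATGGATGAA"  # GCT->GAT changes Ala to Asp

same_peptide = translate(original) == translate(synonymous)
changed_peptide = translate(original) != translate(nonsynonymous)
```

Here `same_peptide` and `changed_peptide` are both true, which corresponds to the distinction drawn in the text between variations that leave the encoded polypeptide unchanged and those that alter it.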

The nucleotide sequence to be modified may be of cDNA or genomic origin as discussed above, but may also be of synthetic origin. Furthermore, the sequence may be of mixed cDNA and genomic, mixed cDNA and synthetic or genomic and synthetic origin as discussed above. The sequence may have been modified, e.g. by site-directed mutagenesis, to result in the desired nucleic acid fragment encoding the desired polypeptide. The following discussion focused on modifications of nucleic acid encoding the polypeptide should be understood to encompass also such possibilities, as well as the possibility of building up the nucleic acid by ligation of two or more DNA fragments to obtain the desired nucleic acid fragment, and combinations of the above-mentioned principles.

The nucleotide sequence may be modified using any suitable technique which results in the production of a nucleic acid fragment encoding a polypeptide of the invention.

The modification of the nucleotide sequence encoding the amino acid sequence of the polypeptide of the invention should be one which does not impair the immunological function of the resulting polypeptide.

A preferred method of preparing variants of the antigens disclosed herein is site-directed mutagenesis. This technique is useful in the preparation of individual peptides, or biologically functional equivalent proteins or peptides, derived from the antigen sequences, through specific mutagenesis of the underlying nucleic acid. The technique further provides a ready ability to prepare and test sequence variants, for example, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the nucleic acid. Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the nucleotide sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction being traversed. Typically, a primer of about 17 to 25 nucleotides in length is preferred, with about 5 to 10 residues on both sides of the junction of the sequence being altered.
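The primer geometry described above (a mutated site flanked by about 5 to 10 matching residues on each side, for a total of about 17 to 25 nucleotides) can be sketched as follows. The function names, the 0-based coordinates and the toy template are assumptions for illustration; the primer is written in the sense of the supplied template, and converting it to the strand that anneals to a single-stranded vector is left out of the sketch.

```python
# Build a mutagenic primer: flank + replacement + flank.
# Coordinates are 0-based; template[start:end] is the segment being altered.

def design_mutagenic_primer(template: str, start: int, end: int,
                            replacement: str, flank: int = 10) -> str:
    """Return a primer carrying `replacement` in place of template[start:end]."""
    if not 0 <= start <= end <= len(template):
        raise ValueError("altered segment lies outside the template")
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("template too short for the requested flanks")
    left = template[start - flank:start]
    right = template[end:end + flank]
    return left + replacement + right


def within_preferred_ranges(primer: str, flank: int) -> bool:
    """Check the ranges given in the text: 17-25 nt total, 5-10 nt flanks."""
    return 17 <= len(primer) <= 25 and 5 <= flank <= 10


template = "ACGTACGTACGTACGTACGTACGTACGTAC"  # hypothetical 30-nt template
primer = design_mutagenic_primer(template, start=15, end=16,
                                 replacement="G", flank=10)
```

With a single-base change and 10-residue flanks the primer is 21 nucleotides long, inside the preferred range; an 8-residue flank would give the 17-nucleotide lower bound.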

In general, the technique of site-specific mutagenesis is well known in the art as exemplified by publications (Adelman et al., 1983). As will be appreciated, the technique typically employs a phage vector which exists in both a single stranded and double stranded form. Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage (Messing et al., 1981). These phage are readily commercially available and their use is generally well known to those skilled in the art.

In general, site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector which includes within its sequence a nucleic acid sequence which encodes the polypeptides of the invention. An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically, for example by the method of Crea et al. (1978). This primer is then annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand. Thus, a heteroduplex is formed wherein one strand encodes the original non-mutated sequence and the second strand bears the desired mutation. This heteroduplex vector is then used to transform appropriate cells, such as E. coli cells, and clones are selected which include recombinant vectors bearing the mutated sequence arrangement.

The preparation of sequence variants of the selected nucleic acid fragments of the invention using site-directed mutagenesis is provided as a means of producing potentially useful species of the genes and is not meant to be limiting as there are other ways in which sequence variants of the nucleic acid fragments of the invention may be obtained. For example, recombinant vectors encoding the desired genes may be treated with mutagenic agents to obtain sequence variants (see, e.g., the method described by Eichenlaub, 1979, for the mutagenesis of plasmid DNA using hydroxylamine).

The invention also relates to a replicable expression vector which comprises a nucleic acid fragment defined above, especially a vector which comprises a nucleic acid fragment encoding a polypeptide fragment of the invention.

The vector may be any vector which may conveniently be subjected to recombinant DNA procedures, and the choice of vector will often depend on the host cell into which it is to be introduced. Thus, the vector may be an autonomously replicating vector, i.e. a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication; examples of such a vector are a plasmid, phage, cosmid, mini-chromosome or virus. Alternatively, the vector may be one which, when introduced in a host cell, is integrated in the host cell genome and replicated together with the chromosome(s) into which it has been integrated.

Expression vectors may be constructed to include any of the DNA segments disclosed herein. Such DNA might encode an antigenic protein specific for virulent strains of mycobacteria or even hybridization probes for detecting mycobacteria nucleic acids in samples. Longer or shorter DNA segments could be used, depending on the antigenic protein desired. Epitopic regions of the proteins expressed or encoded by the disclosed DNA could be included as relatively short segments of DNA. A wide variety of expression vectors is possible including, for example, DNA segments encoding reporter gene products useful for identification of heterologous gene products and/or resistance genes such as antibiotic resistance genes which may be useful in identifying transformed cells.

The vector of the invention may be used to transform cells so as to allow propagation of the nucleic acid fragments of the invention or so as to allow expression of the polypeptide fragments of the invention. Hence, the invention also pertains to a transformed cell harbouring at least one such vector according to the invention, said cell being one which does not natively harbour the vector and/or the nucleic acid fragment of the invention contained therein. Such a transformed cell (which is also a part of the invention) may be any suitable bacterial host cell or any other type of cell such as a unicellular eukaryotic organism, a fungus or yeast, or a cell derived from a multicellular organism, e.g. an animal or a plant. It is especially in cases where glycosylation is desired that a mammalian cell is used, although glycosylation of proteins is a rare event in prokaryotes. Normally, however, a prokaryotic cell is preferred such as a bacterium belonging to the genera Mycobacterium, Salmonella, Pseudomonas, Bacillus and Escherichia. It is preferred that the transformed cell is an E. coli, B. subtilis, or M. bovis BCG cell, and it is especially preferred that the transformed cell expresses a polypeptide according to the invention. The latter opens the possibility of producing the polypeptide of the invention by simply recovering it from the culture containing the transformed cell. In the most preferred embodiment of this part of the invention the transformed cell is Mycobacterium bovis BCG strain: Danish 1331, which is the Mycobacterium bovis strain Copenhagen from the Copenhagen BCG Laboratory, Statens Seruminstitut, Denmark.

The nucleic acid fragments of the invention allow for the recombinant production of the polypeptide fragments of the invention. However, isolation from the natural source is also a way of providing the polypeptide fragments, as is peptide synthesis.

Therefore, the invention also pertains to a method for the preparation of a polypeptide fragment of the invention, said method comprising inserting a nucleic acid fragment as defined above into a vector which is able to replicate in a host cell, introducing the resulting recombinant vector into the host cell (transformed cells may be selected using various techniques, including screening by differential hybridization, identification of fused reporter gene products, resistance markers, anti-antigen antibodies and the like), culturing the host cell in a culture medium under conditions sufficient to effect expression of the polypeptide (of course the cell may be cultivated under conditions appropriate to the circumstances, and if DNA is desired, replication conditions are used), and recovering the polypeptide from the host cell or culture medium; or

isolating the polypeptide from a short-term culture filtrate as defined in claim 1; or

isolating the polypeptide from whole mycobacteria of the tuberculosis complex or from lysates or fractions thereof, e.g. cell wall containing fractions; or

synthesizing the polypeptide by solid or liquid phase peptide synthesis.

The medium used to grow the transformed cells may be any conventional medium suitable for the purpose. A suitable vector may be any of the vectors described above, and an appropriate host cell may be any of the cell types listed above. The methods employed to construct the vector and effect introduction thereof into the host cell may be any methods known for such purposes within the field of recombinant DNA. In the following a more detailed description of the possibilities will be given:

In general, of course, prokaryotes are preferred for the initial cloning of nucleic acid sequences of the invention and constructing the vectors useful in the invention. For example, in addition to the particular strains mentioned in the more specific disclosure below, one may mention by way of example, strains such as E. coli K12 strain 294 (ATCC No. 31446), E. coli B, and E. coli X1776 (ATCC No. 31537). These examples are, of course, intended to be illustrative rather than limiting.

Prokaryotes are also preferred for expression. The aforementioned strains, as well as E. coli W3110 (F-, lambda-, prototrophic, ATCC No. 273325), bacilli such as Bacillus subtilis, or other Enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species may be used. Especially interesting are rapid-growing mycobacteria, e.g. M. smegmatis, as these bacteria have a high degree of resemblance with mycobacteria of the tuberculosis complex and therefore stand a good chance of reducing the need for performing post-translational modifications of the expression product.

In general, plasmid vectors containing replicon and control sequences which are derived from species compatible with the host cell are used in connection with these hosts. The vector ordinarily carries a replication site, as well as marking sequences which are capable of providing phenotypic selection in transformed cells. For example, E. coli is typically transformed using pBR322, a plasmid derived from an E. coli species (see, e.g., Bolivar et al., 1977, Gene 2: 95). The pBR322 plasmid contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying transformed cells. The pBR plasmid, or other microbial plasmid or phage must also contain, or be modified to contain, promoters which can be used by the microorganism for expression.

Those promoters most commonly used in recombinant DNA construction include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., 1978; Itakura et al., 1977; Goeddel et al., 1979) and a tryptophan (trp) promoter system (Goeddel et al., 1979; EPO Appl. Publ. No. 0036776). While these are the most commonly used, other microbial promoters have been discovered and utilized, and details concerning their nucleotide sequences have been published, enabling a skilled worker to ligate them functionally with plasmid vectors (Siebenlist et al., 1980). Certain genes from prokaryotes may be expressed efficiently in E. coli from their own promoter sequences, precluding the need for addition of another promoter by artificial means.

After the recombinant preparation of the polypeptide according to the invention, the isolation of the polypeptide may for instance be carried out by affinity chromatography (or other conventional biochemical procedures based on chromatography), using a monoclonal antibody which substantially specifically binds the polypeptide according to the invention. Another possibility is to employ the simultaneous electroelution technique described by Andersen et al. in J. Immunol. Methods 161: 29-39.

According to the invention the post-translational modifications involve lipidation, glycosylation, cleavage, or elongation of the polypeptide.

In certain aspects, the DNA sequence information provided by this invention allows for the preparation of relatively short DNA (or RNA or PNA) sequences having the ability to specifically hybridize to mycobacterial gene sequences. In these aspects, nucleic acid probes of an appropriate length are prepared based on a consideration of the relevant sequence. The ability of such nucleic acid probes to specifically hybridize to the mycobacterial gene sequences lends them particular utility in a variety of embodiments. Most importantly, the probes can be used in a variety of diagnostic assays for detecting the presence of pathogenic organisms in a given sample. However, other uses are envisioned, including the use of the sequence information for the preparation of mutant species primers, or primers for use in preparing other genetic constructs.

Apart from their use as starting points for the synthesis of polypeptides of the invention and for hybridization probes (useful for direct hybridization assays or as primers in e.g. PCR or other molecular amplification methods) the nucleic acid fragments of the invention may be used for effecting in vivo expression of antigens, i.e. the nucleic acid fragments may be used in so-called DNA vaccines. Recent research has revealed that a DNA fragment cloned in a vector which is non-replicative in eukaryotic cells may be introduced into an animal (including a human being) by e.g. intramuscular injection or percutaneous administration (the so-called "gene gun" approach). The DNA is taken up by e.g. muscle cells and the gene of interest is expressed by a promoter which is functioning in eukaryotes, e.g. a viral promoter, and the gene product thereafter stimulates the immune system. These newly discovered methods are reviewed in Ulmer et al., 1993, which is hereby incorporated by reference.

Hence, the invention also relates to a vaccine comprising a nucleic acid fragment according to the invention, the vaccine effecting in vivo expression of antigen by an animal, including a human being, to whom the vaccine has been administered, the amount of expressed antigen being effective to confer substantially increased resistance to infections with mycobacteria of the tuberculosis complex in an animal, including a human being.

The efficacy of such a "DNA vaccine" can possibly be enhanced by administering the gene encoding the expression product together with a DNA fragment encoding a polypeptide which has the capability of modulating an immune response. For instance, a gene encoding lymphokine precursors or lymphokines (e.g. IFN-γ, IL-2, or IL-12) could be administered together with the gene encoding the immunogenic protein, either by administering two separate DNA fragments or by administering both DNA fragments included in the same vector. It also is a possibility to administer DNA fragments comprising a multitude of nucleotide sequences which each encode relevant epitopes of the polypeptides disclosed herein so as to effect a continuous sensitization of the immune system with a broad spectrum of these epitopes.

As explained above, the polypeptide fragments of the invention are excellent candidates for vaccine constituents or for constituents in an immune diagnostic agent due to their extracellular presence in culture media containing metabolizing virulent mycobacteria belonging to the tuberculosis complex, or because of their high homologies with such extracellular antigens, or because of their absence in M. bovis BCG.

Thus, another part of the invention pertains to an immunologic composition comprising a polypeptide or fusion polypeptide according to the invention. In order to ensure optimum performance of such an immunologic composition it is preferred that it comprises an immunologically and pharmaceutically acceptable carrier, vehicle or adjuvant.

Suitable carriers are selected from the group consisting of a polymer to which the polypeptide(s) is/are bound by hydrophobic non-covalent interaction, such as a plastic, e.g. polystyrene, or a polymer to which the polypeptide(s) is/are covalently bound, such as a polysaccharide, or a polypeptide, e.g. bovine serum albumin, ovalbumin or keyhole limpet haemocyanin. Suitable vehicles are selected from the group consisting of a diluent and a suspending agent. The adjuvant is preferably selected from the group consisting of dimethyldioctadecylammonium bromide (DDA), Quil A, poly I:C, Freund's incomplete adjuvant, IFN-γ, IL-2, IL-12, monophosphoryl lipid A (MPL), and muramyl dipeptide (MDP).

A preferred immunologic composition according to the present invention comprises at least two different polypeptide fragments, each different polypeptide fragment being a polypeptide or a fusion polypeptide defined above. It is preferred that the immunologic composition comprises between 3 and 20 different polypeptide fragments or fusion polypeptides.

Such an immunologic composition may preferably be in the form of a vaccine or in the form of a skin test reagent.

In line with the above, the invention therefore also pertains to a method for producing an immunologic composition according to the invention, the method comprising preparing, synthesizing or isolating a polypeptide according to the invention, and solubilizing or dispersing the polypeptide in a medium for a vaccine, and optionally adding other M. tuberculosis antigens and/or a carrier, vehicle and/or adjuvant substance.

Preparation of vaccines which contain peptide sequences as active ingredients is generally well understood in the art, as exemplified by U.S. Patents 4,608,251; 4,601,903; 4,599,231; 4,599,230; 4,596,792; and 4,578,770, all incorporated herein by reference. Typically, such vaccines are prepared as injectables either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may also be prepared. The preparation may also be emulsified. The active immunogenic ingredient is often mixed with excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like, and combinations thereof. In addition, if desired, the vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, or adjuvants which enhance the effectiveness of the vaccines.

The vaccines are conventionally administered parenterally, by injection, for example, either subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably 1-2%. Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and contain 10-95% of active ingredient, preferably 25-70%.

The proteins may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts include acid addition salts (formed with the free amino groups of the peptide), which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine, and the like.

The vaccines are administered in a manner compatible with the dosage formulation, and in such amount as will be therapeutically effective and immunogenic. The quantity to be administered depends on the subject to be treated, including, e.g., the capacity of the individual's immune system to mount an immune response, and the degree of protection desired. Suitable dosage ranges are of the order of several hundred micrograms active ingredient per vaccination with a preferred range from about 0.1 μg to 1000 μg, such as in the range from about 1 μg to 300 μg, and especially in the range from about 10 μg to 50 μg. Suitable regimens for initial administration and booster shots are also variable but are typified by an initial administration followed by subsequent inoculations or other administrations.

The manner of application may be varied widely. Any of the conventional methods for administration of a vaccine are applicable. These are believed to include oral application on a solid physiologically acceptable base or in a physiologically acceptable dispersion, parenterally, by injection or the like. The dosage of the vaccine will depend on the route of administration and will vary according to the age of the person to be vaccinated and, to a lesser degree, the size of the person to be vaccinated.

Some of the polypeptides of the vaccine are sufficiently immunogenic in a vaccine, but for some of the others the immune response will be enhanced if the vaccine further comprises an adjuvant substance.

Various methods of achieving adjuvant effect for the vaccine include use of agents such as aluminum hydroxide or phosphate (alum), commonly used as a 0.05 to 0.1 percent solution in phosphate buffered saline; admixture with synthetic polymers of sugars (Carbopol) used as a 0.25 percent solution; and aggregation of the protein in the vaccine by heat treatment at temperatures ranging between 70° and 101°C for 30 second to 2 minute periods, respectively. Aggregation by reactivating with pepsin treated (Fab) antibodies to albumin, mixture with bacterial cells such as C. parvum or endotoxins or lipopolysaccharide components of gram-negative bacteria, emulsion in physiologically acceptable oil vehicles such as mannide mono-oleate (Aracel A) or emulsion with a 20 percent solution of a perfluorocarbon (Fluosol-DA) used as a blood substitute may also be employed. According to the invention DDA (dimethyldioctadecylammonium bromide) is an interesting candidate for an adjuvant, but also Freund's complete and incomplete adjuvants as well as QuilA and RIBI are interesting possibilities. Further possibilities are monophosphoryl lipid A (MPL), and muramyl dipeptide (MDP).

Another highly interesting (and thus, preferred) possibility of achieving adjuvant effect is to employ the technique described in Gosselin et al., 1992 (which is hereby incorporated by reference herein). In brief, the presentation of a relevant antigen such as an antigen of the present invention can be enhanced by conjugating the antigen to antibodies (or antigen binding antibody fragments) against the Fcγ receptors on monocytes/macrophages. Especially conjugates between antigen and anti-FcγRI have been demonstrated to enhance immunogenicity for the purposes of vaccination.

Other possibilities involve the use of immune modulating substances such as lymphokines (e.g. IFN-γ, IL-2 and IL-12) or synthetic IFN-γ inducers such as poly I:C in combination with the above-mentioned adjuvants. As discussed in example 3, it is contemplated that such mixtures of antigen and adjuvant will lead to superior vaccine formulations.

In many instances, it will be necessary to have multiple administrations of the vaccine, usually not exceeding six vaccinations, more usually not exceeding four vaccinations and preferably one or more, usually at least about three vaccinations. The vaccinations will normally be at from two to twelve week intervals, more usually from three to five week intervals. Periodic boosters at intervals of 1-5 years, usually three years, will be desirable to maintain the desired levels of protective immunity. The course of the immunization may be followed by in vitro proliferation assays of PBL (peripheral blood lymphocytes) co-cultured with ESAT-6 or ST-CF, and especially by measuring the levels of IFN-γ released from the primed lymphocytes. The assays may be performed using conventional labels, such as radionuclides, enzymes, fluorescers, and the like. These techniques are well known and may be found in a wide variety of patents, such as U.S. Patent Nos. 3,791,932; 4,174,384 and 3,949,064, as illustrative of these types of assays.

Due to genetic variation, different individuals may react with immune responses of varying strength to the same polypeptide. Therefore, the vaccine according to the invention may comprise several different polypeptides in order to increase the immune response. The vaccine may comprise two or more polypeptides, where all of the polypeptides are as defined above, or some but not all of the peptides may be derived from a bacterium belonging to the M. tuberculosis complex. In the latter example the polypeptides not necessarily fulfilling the criteria set forth above for polypeptides may either act due to their own immunogenicity or merely act as adjuvants. Examples of such interesting polypeptides are MPB64, MPT64, and MPB59, but any other substance which can be isolated from mycobacteria is a possible candidate.

The vaccine may comprise 3-20 different polypeptides, such as 3-10 different polypeptides.

One reason for admixing the polypeptides of the invention with an adjuvant is to effectively activate a cellular immune response. However, this effect can also be achieved in other ways, for instance by expressing the effective antigen in a vaccine in a non-pathogenic microorganism. A well-known example of such a microorganism is Mycobacterium bovis BCG.

Therefore, another important aspect of the present invention is an improvement of the living BCG vaccine presently available, which is a vaccine for immunizing an animal, including a human being, against TB caused by mycobacteria belonging to the tuberculosis-complex, comprising as the effective component a microorganism, wherein one or more copies of a DNA sequence encoding a polypeptide as defined above has been incorporated into the genome of the microorganism in a manner allowing the microorganism to express and secrete the polypeptide.

In the present context the term "genome" refers to the chromosome of the microorganisms as well as extrachromosomal DNA or RNA, such as plasmids. It is, however, preferred that the DNA sequence of the present invention has been introduced into the chromosome of the non-pathogenic microorganism, since this will prevent loss of the genetic material introduced. It is preferred that the non-pathogenic microorganism is a bacterium, e.g. selected from the group consisting of the genera Mycobacterium, Salmonella, Pseudomonas and Escherichia. It is especially preferred that the non-pathogenic microorganism is Mycobacterium bovis BCG, such as Mycobacterium bovis BCG strain: Danish 1331.

The incorporation of one or more copies of a nucleotide sequence encoding the polypeptide according to the invention in a mycobacterium from a M. bovis BCG strain will enhance the immunogenic effect of the BCG strain. The incorporation of more than one copy of a nucleotide sequence of the invention is contemplated to enhance the immune response even more, and consequently an aspect of the invention is a vaccine wherein at least 2 copies of a DNA sequence encoding a polypeptide are incorporated in the genome of the microorganism, such as at least 5 copies. The copies of DNA sequences may either be identical, encoding identical polypeptides, or be variants of the same DNA sequence encoding identical polypeptides or homologues of a polypeptide, or in another embodiment be different DNA sequences encoding different polypeptides where at least one of the polypeptides is according to the present invention.

The living vaccine of the invention can be prepared by cultivating a transformed non-pathogenic cell according to the invention, and transferring these cells to a medium for a vaccine, and optionally adding a carrier, vehicle and/or adjuvant substance.

The invention also relates to a method of diagnosing TB caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis in an animal, including a human being, comprising intradermally injecting, in the animal, a polypeptide according to the invention or a skin test reagent described above, a positive skin response at the location of injection being indicative of the animal having TB, and a negative skin response at the location of injection being indicative of the animal not having TB. A positive response is a skin reaction having a diameter of at least 5 mm, but larger reactions are preferred, such as at least 1 cm, 1.5 cm, and at least 2 cm in diameter. The composition used as the skin test reagent can be prepared in the same manner as described for the vaccines above.

In line with the disclosure above pertaining to vaccine preparation and use, the invention also pertains to a method for immunising an animal, including a human being, against TB caused by mycobacteria belonging to the tuberculosis complex, comprising administering to the animal the polypeptide of the invention, or a vaccine composition of the invention as described above, or a living vaccine described above. Preferred routes of administration are the parenteral (such as intravenous and intraarterial), intraperitoneal, intramuscular, subcutaneous, intradermal, oral, buccal, sublingual, nasal, rectal or transdermal route.

The protein ESAT-6, which is present in short-term culture filtrates from mycobacteria, as well as the esat-6 gene in the mycobacterial genome, has been demonstrated to have a very limited distribution in mycobacterial strains other than M. tuberculosis; e.g. esat-6 is absent in both BCG and the majority of mycobacterial species isolated from the environment, such as M. avium and M. terrae. It is believed that this is also the case for at least one of the antigens of the present invention and their genes. Therefore, the diagnostic embodiments of the invention are especially well-suited for performing the diagnosis of on-going or previous infection with virulent mycobacterial strains of the tuberculosis complex, and it is contemplated that it will be possible to distinguish between 1) subjects (animal or human) which have been previously vaccinated with e.g. BCG vaccines or subjected to antigens from non-virulent mycobacteria and 2) subjects which have or have had active infection with virulent mycobacteria.

A number of possible diagnostic assays and methods can be envisaged:

When diagnosis of previous or ongoing infection with virulent mycobacteria is the aim, a blood sample comprising mononuclear cells (i.a. T-lymphocytes) from a patient could be contacted with a sample of one or more polypeptides of the invention. This contacting can be performed in vitro and a positive reaction could e.g. be proliferation of the T-cells or release of cytokines such as γ-interferon into the extracellular phase (e.g. into a culture supernatant); a suitable in vivo test would be a skin test as described above. It is also conceivable to contact a serum sample from a subject with a polypeptide of the invention, the demonstration of a binding between antibodies in the serum sample and the polypeptide being indicative of previous or ongoing infection. The invention therefore also relates to an in vitro method for diagnosing ongoing or previous sensitization in an animal or a human being with bacteria belonging to the tuberculosis complex, the method comprising providing a blood sample from the animal or human being, and contacting the sample from the animal with the polypeptide of the invention, a significant release into the extracellular phase of at least one cytokine by mononuclear cells in the blood sample being indicative of the animal being sensitized. By the term "significant release" is herein meant that the release of the cytokine is significantly higher than the cytokine release from a blood sample derived from a non-tuberculous subject (e.g. a subject which does not react in a traditional skin test for TB). Normally, a significant release is at least two times the release observed from such a sample.
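The two-fold criterion for a "significant release" can be written as a simple comparison. The following Python sketch is illustrative only: the function name, the choice of units and the example readings are assumptions made for the illustration and are not part of the described method.

```python
def is_significant_release(sample_release, control_release, fold_threshold=2.0):
    """Return True when the cytokine release measured in the test sample is at
    least `fold_threshold` times the release measured in a sample from a
    non-tuberculous subject.

    Both values must be given in the same units (for example pg/ml of IFN-gamma).
    """
    if sample_release < 0 or control_release < 0:
        raise ValueError("release values must be non-negative")
    if control_release == 0:
        # With no detectable release in the control, any detectable release
        # in the test sample exceeds every fold threshold.
        return sample_release > 0
    return sample_release >= fold_threshold * control_release


# Hypothetical readings, for illustration only.
print(is_significant_release(sample_release=850.0, control_release=300.0))
print(is_significant_release(sample_release=500.0, control_release=300.0))
```

The default threshold of 2.0 mirrors the "at least two times" wording; a stricter statistical criterion could be substituted without changing the interface.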

Alternatively, a sample of a possibly infected organ may be contacted with an antibody raised against a polypeptide of the invention. The demonstration of the reaction by means of methods well-known in the art between the sample and the antibody will be indicative of ongoing infection. It is of course also a possibility to demonstrate the presence of anti-mycobacterial antibodies in serum by contacting a serum sample from a subject with at least one of the polypeptide fragments of the invention and using well-known methods for visualizing the reaction between the antibody and antigen.

Also included in the invention is a method of determining the presence of mycobacterial nucleic acids in an animal, including a human being, or in a sample, comprising administering a nucleic acid fragment of the invention to the animal or incubating the sample with the nucleic acid fragment of the invention or a nucleic acid fragment complementary thereto, and detecting the presence of hybridized nucleic acids resulting from the incubation (by using the hybridization assays which are well-known in the art). Such a method of diagnosing TB might involve the use of a composition comprising at least a part of a nucleotide sequence as defined above and detecting the presence of nucleotide sequences in a sample from the animal or human being to be tested which hybridize with the nucleic acid fragment (or a complementary fragment) by the use of PCR technique.

The fact that certain of the disclosed antigens are not present in M. bovis BCG but are present in virulent mycobacteria points them out as interesting drug targets; the antigens may constitute receptor molecules or toxins which facilitate the infection by the mycobacterium, and if such functionalities are blocked the infectivity of the mycobacterium will be diminished.

To determine particularly suitable drug targets among the antigens of the invention, the gene encoding at least one of the polypeptides of the invention and the necessary control sequences can be introduced into avirulent strains of mycobacteria (e.g. BCG) so as to determine which of the polypeptides are critical for virulence. Once particular proteins are identified as critical for/contributory to virulence, anti-mycobacterial agents can be designed rationally to inhibit expression of the critical genes or to attack the critical gene products. For instance, antibodies or fragments thereof (such as Fab and F(ab')₂ fragments) can be prepared against such critical polypeptides by methods known in the art and thereafter used as prophylactic or therapeutic agents. Alternatively, small molecules can be screened for their ability to selectively inhibit expression of the critical gene products, e.g. using recombinant expression systems which include the gene's endogenous promoter, or for their ability to directly interfere with the action of the target. These small molecules are then used as therapeutics or as prophylactic agents to inhibit mycobacterial virulence.

Alternatively, anti-mycobacterial agents which render a virulent mycobacterium avirulent can be operably linked to expression control sequences and used to transform a virulent mycobacterium. Such anti-mycobacterial agents inhibit the replication of a specified mycobacterium upon transcription or translation of the agent in the mycobacterium. Such a "newly avirulent" mycobacterium would constitute a superb alternative to the above described modified BCG for vaccine purposes since it would be immunologically very similar to a virulent mycobacterium compared to e.g. BCG.

Finally, a monoclonal or polyclonal antibody which reacts specifically with a polypeptide of the invention in an immunoassay, or a specific binding fragment of said antibody, is also a part of the invention. The production of such polyclonal antibodies requires that a suitable animal be immunized with the polypeptide and that these antibodies are subsequently isolated, suitably by immune affinity chromatography. The production of monoclonals can be effected by methods well-known in the art, since the present invention provides for adequate amounts of antigen for both immunization and screening of positive hybridomas.

LEGENDS TO THE FIGURES

Fig. 1: Long-term memory immune mice are very efficiently protected against an infection with M. tuberculosis. Mice were given a challenge of M. tuberculosis and spleens were isolated at different time points. Spleen lymphocytes were stimulated in vitro with ST-CF and the release of IFN-γ investigated (panel A). The counts of CFU in the spleens of the two groups of mice are indicated in panel B. The memory immune mice control infection within the first week and produce large quantities of IFN-γ in response to antigens in ST-CF.

Fig. 2: T cells involved in protective immunity are predominantly directed to molecules from 6-12 and 17-38 kDa. Splenic T cells were isolated four days after the challenge with M. tuberculosis and stimulated in vitro with narrow molecular mass fractions of ST-CF. The release of IFN-γ was investigated.

Fig. 3: Nucleotide sequence (SEQ ID NO: 1) of cfp7. The deduced amino acid sequence (SEQ ID NO: 2) of CFP7 is given in conventional one-letter code below the nucleotide sequence. The putative ribosome-binding site is written in underlined italics as are the putative -10 and -35 regions. Nucleotides written in bold are those encoding CFP7.

Fig. 4: Nucleotide sequence (SEQ ID NO: 3) of cfp9. The deduced amino acid sequence (SEQ ID NO: 4) of CFP9 is given in conventional one-letter code below the nucleotide sequence. The putative ribosome-binding site (Shine-Dalgarno sequence) is written in underlined italics as are the putative -10 and -35 regions. Nucleotides in bold writing are those encoding CFP9. The nucleotide sequence obtained from the lambda 226 phage is double underlined.

Fig. 5: Nucleotide sequence of mpt51. The deduced amino acid sequence of MPT51 is given in a one-letter code below the nucleotide sequence. The signal is indicated in italics. The putative ribosome-binding site is underlined. The nucleotide difference and amino acid difference compared to the nucleotide sequence of MPB51 (Ohara et al., 1995) are underlined at position 780. The nucleotides given in italics are not present in M. tuberculosis H37Rv.

Fig. 6: The positions of the purified antigens in the 2DE system have been determined and mapped in a reference gel. The newly purified antigens are encircled and the positions of well-known proteins are also indicated.

EXAMPLE 1

Identification of single culture filtrate antigens involved in protective immunity

A group of efficiently protected mice was generated by infecting 8-12-week-old female C57BL/6J mice with 5 x 10⁴ M. tuberculosis i.v. After 30 days of infection the mice were subjected to 60 days of antibiotic treatment with isoniazid and were then left for 200-240 days to ensure the establishment of resting long-term memory immunity. Such memory immune mice are very efficiently protected against a secondary infection (Fig. 1). Long lasting immunity in this model is mediated by a population of highly reactive CD4 cells recruited to the site of infection and triggered to produce large amounts of IFN-γ in response to ST-CF (Fig. 1) (Andersen et al., 1995).

We have used this model to identify single antigens recognized by protective T cells. Memory immune mice were reinfected with 1 x 10⁶ M. tuberculosis i.v. and splenic lymphocytes were harvested at day 4-6 of reinfection, a time point where this population is highly reactive to ST-CF. The antigens recognized by these T cells were mapped by the multi-elution technique (Andersen and Heron, 1993). This technique divides complex protein mixtures separated in SDS-PAGE into narrow fractions in a physiological buffer. These fractions were used to stimulate spleen lymphocytes in vitro and the release of IFN-γ was monitored (Fig. 2). Long-term memory immune mice did not recognize these fractions before TB infection, but splenic lymphocytes obtained during the recall of protective immunity recognized a range of culture filtrate antigens and peak production of IFN-γ was found in response to proteins of apparent molecular weight 6-12 and 17-30 kDa (Fig. 2). It is therefore concluded that culture filtrate antigens within these regions are the major targets recognized by memory effector T-cells triggered to release IFN-γ during the first phase of a protective immune response.

EXAMPLE 2

Cloning of genes expressing low mass culture filtrate antigens

In example 1 it was demonstrated that antigens in the low molecular mass fraction are recognized strongly by cells isolated from memory immune mice. Monoclonal antibodies (mAbs) to these antigens were therefore generated by immunizing with the low mass fraction in RIBI adjuvant (first and second immunization) followed by two injections with the fractions in aluminium hydroxide. Fusion and cloning of the reactive cell lines were done according to standard procedures (Kohler and Milstein, 1975). The procedure resulted in the provision of two mAbs: ST-3 directed to a 9 kDa culture filtrate antigen (CFP9) and PV-2 directed to a 7 kDa antigen (CFP7), when the molecular weight is estimated from migration of the antigens in an SDS-PAGE.

In order to identify the antigens binding to the mAbs, the following experiments were carried out:

The recombinant λgt11 M. tuberculosis DNA library constructed by R. Young (Young, R.A. et al., 1985) and obtained through the World Health Organization IMMTUB programme (WHO.0032.wibr) was screened for phages expressing gene products which would bind the monoclonal antibodies ST-3 and PV-2.

Approximately 1 x 10⁵ pfu of the gene library (containing approximately 25% recombinant phages) were plated on Escherichia coli Y1090 (ΔlacU169, proA⁺, Δlon, araD139, supF, trpC22::Tn10 [pMC9] ATCC#37197) in soft agar and incubated for 2.5 hours at 42°C.

The plates were overlaid with sheets of nitrocellulose saturated with isopropyl-β-D-thiogalactopyranoside and incubation was continued for 2.5 hours at 37°C. The nitrocellulose was removed and incubated with samples of the monoclonal antibodies in PBS with Tween 20 added to a final concentration of 0.05%. Bound monoclonal antibodies were visualized by horseradish peroxidase-conjugated rabbit anti-mouse immunoglobulins (P260, Dako, Glostrup, DK) and a staining reaction involving 5,5',3,3'-tetramethylbenzidine and H₂O₂.

Positive plaques were recloned and the phages originating from a single plaque were used to lysogenize E. coli Y1089 (ΔlacU169, proA⁺, Δlon, araD139, strA, hflA150 [chr::Tn10] [pMC9] ATCC no. 37196). The resultant lysogenic strains were used to propagate phage particles for DNA extraction. These lysogenic E. coli strains have been named:

AA226 (expressing ST-3 reactive polypeptide CFP9) which has been deposited 28 June 1993 with the collection of Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM) under the accession number DSM 8377 and in accordance with the provisions of the Budapest Treaty, and

AA242 (expressing PV-2 reactive polypeptide CFP7) which has been deposited 28 June 1993 with the collection of Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM) under the accession number DSM 8379 and in accordance with the provisions of the Budapest Treaty.

These two lysogenic E. coli strains are disclosed in WO 95/01441 as are the mycobacterial polypeptide products expressed thereby. However, no information concerning the amino acid sequences of these polypeptides or their genetic origin is given, and therefore only the direct expression products of AA226 and AA242 are made available to the public.

The ST-3 binding protein is expressed as a protein fused to β-galactosidase, whereas the PV-2 binding protein appears to be expressed in an unfused version.

Sequencing of the nucleotide sequences encoding the PV-2 and ST-3 binding proteins

In order to obtain the nucleotide sequence of the gene encoding the PV-2 binding protein, the approximately 3 kb M. tuberculosis derived EcoRI - EcoRI fragment from AA242 was subcloned in the EcoRI site in the pBluescript SK+ (Stratagene) and used to transform E. coli XL1-Blue (Stratagene).

Similarly, to obtain the nucleotide sequence of the gene encoding the ST-3 binding protein, the approximately 5 kb M. tuberculosis derived EcoRI - EcoRI fragment from AA226 was subcloned in the EcoRI site in the pBluescript SK+ (Stratagene) and used to transform E. coli XL1-Blue (Stratagene).

The complete DNA sequences of both genes were obtained by the dideoxy chain termination method adapted for supercoiled DNA by use of the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., Cleveland, OH) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. The DNA sequences are shown in SEQ ID NO: 1 (CFP7) and in SEQ ID NO: 3 (CFP9) as well as in Figs. 3 and 4, respectively. Both strands of the DNA were sequenced.

CFP7

An open reading frame (ORF) encoding a sequence of 96 amino acid residues was identified from an ATG start codon at position 91-93 extending to a TAG stop codon at position 379-381. The deduced amino acid sequence is shown in SEQ ID NO: 2 (and in Fig. 3 where conventional one-letter amino acid codes are used).
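The stated codon positions and the stated number of residues can be cross-checked with simple arithmetic. The Python sketch below is an illustration only; the function name and the returned keys are assumptions introduced for the example. It takes the 1-based position of the first nucleotide of the start codon and of the stop codon and derives the number of encoded residues and the ORF length including the stop codon.

```python
def orf_summary(start_codon_first, stop_codon_first):
    """Summarise an ORF from the 1-based positions of the first nucleotide of
    its start codon and of its stop codon."""
    coding_nt = stop_codon_first - start_codon_first
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return {
        # Codons from the start codon up to, but not including, the stop codon.
        "residues": coding_nt // 3,
        # Length of the ORF when the three nucleotides of the stop codon are counted.
        "orf_length_with_stop_bp": coding_nt + 3,
    }


# cfp7: ATG at position 91-93, TAG at position 379-381.
print(orf_summary(91, 379))
```

For cfp7 this gives 96 residues and an ORF of 291 bp including the stop codon, in agreement with the 291 bp ORF referred to in the subcloning of CFP7.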

CFP7 appears to be expressed in E. coli as an unfused version. The nucleotide sequence at position 78-84 is expected to be the Shine-Dalgarno sequence and the sequences from position 47-50 and 14-19 are expected to be the -10 and -35 regions, respectively.

CFP9

The protein recognised by ST-3 was produced as a β-galactosidase fusion protein when expressed from the AA226 lambda phage. The fusion protein had an approximate size of 116-117 kDa (Mw for β-galactosidase 116.25 kDa), which may suggest that only part of the CFP9 gene was included in the lambda clone (AA226). Based on the 90 bp nucleotide sequence obtained on the insert from lambda phage AA226, a homology search against the nucleotide sequence of the M. tuberculosis genome was performed in the Sanger database (Sanger Mycobacterium tuberculosis database: http://www.sanger.ac.uk/pathogens/TB-blast-server.html; Williams, 1996). 100% identity to the cloned sequence was found on the MTCY48 cosmid. An open reading frame (ORF) encoding a sequence of 109 amino acid residues was identified from a GTG start codon at position 141-143 extending to a TGA stop codon at position 465-467. The deduced amino acid sequence is shown in Fig. 4 using conventional one-letter code.

The nucleotide sequence at position 123-130 is expected to be the Shine-Dalgarno sequence and the sequences from position 73-78 and 4-9 are expected to be the -10 and -35 regions, respectively (Fig. 4). The ORF overlapping with the 5'-end of the sequence of AA229 is shown in Fig. 4 by double underlining.

Subcloning CFP7 and CFP9 in expression vectors

The two ORFs encoding CFP7 and CFP9 were cloned by PCR into the pMST24 expression vector (Theisen et al., 1995), yielding pRVN01, and into the pQE-32 expression vector (QIAGEN), yielding pRVN02, respectively.

The PCR amplification was carried out in a thermal reactor (Rapid cycler, Idaho Technology, Idaho) by mixing 10 ng plasmid DNA with the mastermix (0.5 μM of each oligonucleotide primer, 0.25 μM BSA (Stratagene), low salt buffer (20 mM Tris-HCl, pH 8.8, 10 mM KCl, 10 mM (NH₄)₂SO₄, 2 mM MgSO₄ and 0.1% Triton X-100) (Stratagene), 0.25 mM of each deoxynucleoside triphosphate and 0.5 U Taq Plus Long DNA polymerase (Stratagene)). Final volume was 10 μl (all concentrations given are concentrations in the final volume). Predenaturation was carried out at 94°C for 30 s. Thirty cycles of the following were performed: denaturation at 94°C for 30 s, annealing at 55°C for 30 s and elongation at 72°C for 1 min. The oligonucleotide primers were synthesised automatically on a DNA synthesizer (Applied Biosystems, Foster City, CA, ABI-391, PCR-mode), deblocked, and purified by ethanol precipitation.
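The cycling conditions described above can be captured as a small data structure, which also makes it easy to estimate the total programmed hold time. The following Python sketch is illustrative only; the class and function names are assumptions, and ramp times between temperatures, which depend on the instrument, are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temperature_c: float
    seconds: int


# Values taken from the amplification conditions described in the text.
PREDENATURATION = Step("predenaturation", 94.0, 30)
CYCLE = (
    Step("denaturation", 94.0, 30),
    Step("annealing", 55.0, 30),
    Step("elongation", 72.0, 60),
)
N_CYCLES = 30


def total_hold_seconds(initial, cycle, n_cycles):
    """Sum of the programmed hold times, excluding ramp times."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


total = total_hold_seconds(PREDENATURATION, CYCLE, N_CYCLES)
print(f"{total} s ({total / 60:.1f} min), excluding ramp times")
```

With the stated values the programmed hold time is 30 s plus 30 cycles of 120 s, i.e. 3630 s or about one hour.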

The cfp7 oligonucleotides (TABLE 1) were synthesised on the basis of the nucleotide sequence from the CFP7 sequence (Fig. 3). The oligonucleotides were engineered to include an SmaI restriction enzyme site at the 5' end and a BamHI restriction enzyme site at the 3' end for directed subcloning.

The cfp9 oligonucleotides (TABLE 1) were synthesized partly on the basis of the nucleotide sequence from the sequence of the AA229 clone and partly from the identical sequence found in the Sanger database cosmid MTCY48 (Fig. 4). The oligonucleotides were engineered to include a SmaI restriction enzyme site at the 5' end and a HindIII restriction enzyme site at the 3' end for directed subcloning.

CFP7

By the use of PCR a SmaI site was engineered immediately 5' of the first codon of the ORF of 291 bp, encoding the cfp7 gene, so that only the coding region would be expressed, and a BamHI site was incorporated right after the stop codon at the 3' end. The 291 bp PCR fragment was cleaved by SmaI and BamHI, purified from an agarose gel and subcloned into the SmaI - BamHI sites of the pMST24 expression vector. Vector DNA containing the gene fusion was used to transform the E. coli XL1-Blue (pRVN01).

CFP9

By the use of PCR a SmaI site was engineered immediately 5' of the first codon of an ORF of 327 bp, encoding the cfp9 gene, so that only the coding region would be expressed, and a HindIII site was incorporated after the stop codon at the 3' end. The 327 bp PCR fragment was cleaved by SmaI and HindIII, purified from an agarose gel, and subcloned into the SmaI - HindIII sites of the pQE-32 (QIAGEN) expression vector. Vector DNA containing the gene fusion was used to transform the E. coli XL1-Blue (pRVN02).

Purification of recombinant CFP7 and CFP9

The ORFs were fused N-terminally to the (His)₆-tag (cf. EP-A-0 282 242). Recombinant antigen was prepared as follows: Briefly, a single colony of E. coli harbouring either the pRVN01 or the pRVN02 plasmid was inoculated into Luria-Bertani broth containing 100 μg/ml ampicillin and 12.5 μg/ml tetracycline and grown at 37°C to OD_600nm = 0.5. IPTG (isopropyl-β-D-thiogalactoside) was then added to a final concentration of 2 mM (expression was regulated either by the strong IPTG inducible P_tac or the T5 promoter) and growth was continued for a further 2 hours. The cells were harvested by centrifugation at 4,200 x g at 4°C for 8 min. The pelleted bacteria were stored overnight at -20°C. The pellet was resuspended in BC 40/100 buffer (20 mM Tris-HCl pH 7.9, 20% glycerol, 100 mM KCl, 40 mM imidazole) and cells were broken by sonication (5 times for 30 s with intervals of 30 s) at 4°C, followed by centrifugation at 12,000 x g for 30 min at 4°C. The supernatant (crude extract) was used for purification of the recombinant antigens.

The two histidine fusion proteins (His-rCFP7 and His-rCFP9) were purified from the crude extract by affinity chromatography on a Ni²⁺-NTA column from QIAGEN with a volume of 100 ml. His-rCFP7 and His-rCFP9 bind to Ni²⁺. After extensive washes of the column in BC 40/100 buffer, the fusion protein was eluted with a BC 1000/100 buffer containing 100 mM imidazole, 20 mM Tris pH 7.9, 20% glycerol and 1 M KCl. Subsequently, the purified products were dialysed extensively against 10 mM Tris pH 8.0. His-rCFP7 and His-rCFP9 were then separated from contaminants by fast protein liquid chromatography (FPLC) over an anion-exchange column (Mono Q, Pharmacia, Sweden), in 10 mM Tris pH 8.0 with a linear gradient of NaCl from 0 to 1 M. Aliquots of the fractions were analyzed by 10%-20% gradient sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE). Fractions containing purified His-rCFP7 or His-rCFP9 were pooled.

TABLE 1. Sequence of the cfp7 and cfp9 oligonucleotides^a

Orientation and oligonucleotide    Sequences (5' → 3')                                       Position^b (nucleotide)

Sense
pvR3    GCAACACCCGGGATGTCGCAAATCATG (SEQ ID NO: 43)               91 - 105 (SEQ ID NO: 1)
stR2    GTAACACCCGGGGTGGCCGCCGACCCG (SEQ ID NO: 44)               141 - 155 (SEQ ID NO: 3)

Antisense
pvF4    CTACTAAGCTTGGATCCCTAGCCGCCCCATTTGGCGG (SEQ ID NO: 45)     381 - 362 (SEQ ID NO: 1)
stF2    CTACTAAGCTTCCATGGTCAGGTCTTTTCGATGCTTAC (SEQ ID NO: 46)    467 - 447 (SEQ ID NO: 3)

^a The cfp7 oligonucleotides were based on the nucleotide sequence shown in Fig. 3 (SEQ ID NO: 1). The cfp9 oligonucleotides were based on the nucleotide sequence shown in Fig. 4 (SEQ ID NO: 3). Nucleotides underlined are not contained in the nucleotide sequence of cfp7 and cfp9.

^b The positions referred to are of the non-underlined part of the primers and correspond to the nucleotide sequence shown in Fig. 3 and Fig. 4, respectively.
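The engineered restriction sites can be located in the TABLE 1 primers with a short script. This is a sketch only: the recognition sequences are the standard ones for SmaI, BamHI, HindIII and NcoI. Because the underlining is not reproduced here, the script assumes that the template-matching part of each primer is whatever follows the last engineered site, and checks that assumption against the length implied by the listed positions. All function and variable names are hypothetical.

```python
# Standard recognition sequences of the enzymes involved.
SITES = {"SmaI": "CCCGGG", "BamHI": "GGATCC", "HindIII": "AAGCTT", "NcoI": "CCATGG"}

# Primer name -> (sequence from TABLE 1, first listed position, last listed position).
PRIMERS = {
    "pvR3": ("GCAACACCCGGGATGTCGCAAATCATG", 91, 105),
    "stR2": ("GTAACACCCGGGGTGGCCGCCGACCCG", 141, 155),
    "pvF4": ("CTACTAAGCTTGGATCCCTAGCCGCCCCATTTGGCGG", 381, 362),
    "stF2": ("CTACTAAGCTTCCATGGTCAGGTCTTTTCGATGCTTAC", 467, 447),
}


def sites_in(primer):
    """Return {enzyme: 0-based start index} for every site found in the primer."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def gene_specific_part(primer):
    """Assumption: the template-matching part starts right after the last site."""
    found = sites_in(primer)
    end_of_last_site = max(start + len(SITES[name]) for name, start in found.items())
    return primer[end_of_last_site:]


for name, (seq, first, last) in PRIMERS.items():
    core = gene_specific_part(seq)
    span = abs(last - first) + 1  # nucleotides covered by the listed positions
    print(name, sorted(sites_in(seq)), core, len(core) == span)
```

Under this assumption the part following the last site has exactly the length implied by the positions in TABLE 1 for all four primers.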

EXAMPLE 2A

Identification of antigens which are not expressed in BCG strains.

In an effort to control the threat of TB, attenuated bacillus Calmette-Guérin (BCG) has been used as a live attenuated vaccine. BCG is an attenuated derivative of a virulent Mycobacterium bovis. The original BCG from the Pasteur Institute in Paris, France was developed from 1908 to 1921 by 231 passages in liquid culture and has never been shown to revert to virulence in animals, indicating that the attenuating mutation(s) in BCG are stable deletions and/or multiple mutations which do not readily revert. While physiological differences between BCG and M. tuberculosis and M. bovis have been noted, the attenuating mutations which arose during serial passage of the original BCG strain were unknown until recently. The first mutations described were the loss of the gene encoding MPB64 in some BCG strains (Li et al., 1993, Oettinger and Andersen, 1994) and of the gene encoding ESAT-6 in all BCG strains tested (Harboe et al., 1996); later, 3 large deletions in BCG were identified (Mahairas et al., 1996). The region named RD1 includes the gene encoding ESAT-6, and another region (RD2) includes the gene encoding MPT64. Both antigens have been shown to have diagnostic potential and ESAT-6 has been shown to have properties as a vaccine candidate (cf. PCT/DK94/00273 and PCT/DK/00270). In order to find new M. tuberculosis specific diagnostic antigens as well as antigens for a new vaccine against TB, the RD1 region (17,499 bp) of M. tuberculosis H37Rv has been analyzed for open reading frames (ORFs). ORFs with a minimum length of 96 bp have been predicted using the algorithm described by Borodovsky and McIninch (1993). In total 27 ORFs have been predicted, 20 of which have possible diagnostic and/or vaccine potential, as they are deleted from all known BCG strains. The predicted ORFs include ESAT-6 (RD1-ORF7) and CFP10 (RD1-ORF6), described previously (Sørensen et al., 1995), which serve as a positive control for the ability of the algorithm. The present example describes the potential of 7 of the predicted antigens for diagnosis of TB as well as their potential as candidates for a new vaccine against TB.

Seven open reading frames (ORFs) from the 17,499 bp RD1 region (Accession no. U34848) with possible diagnostic and vaccine potential have been identified and cloned.

Identification of the ORFs rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a, and rd1-orf9b.

The nucleotide sequence of rd1-orf2 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 71. The deduced amino acid sequence of RD1-ORF2 is set forth in SEQ ID NO: 72.

The nucleotide sequence of rd1-orf3 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 87. The deduced amino acid sequence of RD1-ORF3 is set forth in SEQ ID NO: 88.

The nucleotide sequence of rd1-orf4 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 89. The deduced amino acid sequence of RD1-ORF4 is set forth in SEQ ID NO: 90.

The nucleotide sequence of rd1-orf5 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 91. The deduced amino acid sequence of RD1-ORF5 is set forth in SEQ ID NO: 92.

The nucleotide sequence of rd1-orf8 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 67. The deduced amino acid sequence of RD1-ORF8 is set forth in SEQ ID NO: 68.

The nucleotide sequence of rd1-orf9a from M. tuberculosis H37Rv is set forth in SEQ ID NO: 93. The deduced amino acid sequence of RD1-ORF9a is set forth in SEQ ID NO: 94.

The nucleotide sequence of rd1-orf9b from M. tuberculosis H37Rv is set forth in SEQ ID NO: 69. The deduced amino acid sequence of RD1-ORF9b is set forth in SEQ ID NO: 70.

The DNA sequence rd1-orf2 (SEQ ID NO: 71) contained an open reading frame starting with an ATG codon at position 889 - 891 and ending with a termination codon (TAA) at position 2662 - 2664 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 72) contains 591 residues corresponding to a molecular weight of 64,525.

The DNA sequence rd1-orf3 (SEQ ID NO: 87) contained an open reading frame starting with an ATG codon at position 2807 - 2809 and ending with a termination codon (TAA) at position 3101 - 3103 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 88) contains 98 residues corresponding to a molecular weight of 9,799.

The DNA sequence rd1-orf4 (SEQ ID NO: 89) contained an open reading frame starting with a GTG codon at position 4014 - 4012 and ending with a termination codon (TAG) at position 3597 - 3595 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 90) contains 139 residues corresponding to a molecular weight of 14,210.

The DNA sequence rd1-orf5 (SEQ ID NO: 91) contained an open reading frame starting with a GTG codon at position 3128 - 3130 and ending with a termination codon (TGA) at position 4241 - 4243 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 92) contains 371 residues corresponding to a molecular weight of 37,647.

The DNA sequence rd1-orf8 (SEQ ID NO: 67) contained an open reading frame starting with a GTG codon at position 5502 - 5500 and ending with a termination codon (TAG) at position 5084 - 5082 (position numbers referring to the location in RD1), and the deduced amino acid sequence (SEQ ID NO: 68) contains 139 residues with a molecular weight of 11,737.

The DNA sequence rd1-orf9a (SEQ ID NO: 93) contained an open reading frame starting with a GTG codon at position 6146 - 6148 and ending with a termination codon (TAA) at position 7070 - 7072 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 94) contains 308 residues corresponding to a molecular weight of 33,453.

The DNA sequence rd1-orf9b (SEQ ID NO: 69) contained an open reading frame starting with an ATG codon at position 5072 - 5074 and ending with a termination codon (TAA) at position 7070 - 7072 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 70) contains 666 residues corresponding to a molecular weight of 70,650.

Cloning of the ORFs rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a, and rd1-orf9b.

The ORFs rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b were PCR cloned in the pMST24 (Theisen et al., 1995) (rd1-orf3) or the pQE32 (QIAGEN) (rd1-orf2, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b) expression vector. Preparation of oligonucleotides and PCR amplification of the rd1-orf encoding genes were carried out as described in example 2. Chromosomal DNA from M. tuberculosis H37Rv was used as template in the PCR reactions. Oligonucleotides were synthesized on the basis of the nucleotide sequence from the RD1 region (Accession no. U34848). The oligonucleotide primers were engineered to include a restriction enzyme site at the 5' end and at the 3' end to allow later subcloning. Primers are listed in TABLE 2.

rd1-orf2. A BamHI site was engineered immediately 5' of the first codon of rd1-orf2, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf2 was subcloned in pQE32, giving pT096.

rd1-orf3. A SmaI site was engineered immediately 5' of the first codon of rd1-orf3, and a NcoI site was incorporated right after the stop codon at the 3' end. The gene rd1-orf3 was subcloned in pMST24, giving pT087.

rd1-orf4. A BamHI site was engineered immediately 5' of the first codon of rd1-orf4, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf4 was subcloned in pQE32, giving pT089.

rd1-orf5. A BamHI site was engineered immediately 5' of the first codon of rd1-orf5, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf5 was subcloned in pQE32, giving pT088.

rd1-orf8. A BamHI site was engineered immediately 5' of the first codon of rd1-orf8, and a NcoI site was incorporated right after the stop codon at the 3' end. The gene rd1-orf8 was subcloned in pMST24, giving pT098.

rd1-orf9a. A BamHI site was engineered immediately 5' of the first codon of rd1-orf9a, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf9a was subcloned in pQE32, giving pT091.

rd1-orf9b. A ScaI site was engineered immediately 5' of the first codon of rd1-orf9b, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf9b was subcloned in pQE32, giving pTO90.

The PCR fragments were digested with the suitable restriction enzymes, purified from an agarose gel and cloned into either pMST24 or pQE-32. The seven constructs were used to transform E. coli XL1-Blue. Endpoints of the gene fusions were determined by the dideoxy chain termination method. Both strands of the DNA were sequenced.

Purification of recombinant RD1-ORF2, RD1-ORF3, RD1-ORF4, RD1-ORF5, RD1-ORF8, RD1-ORF9a and RD1-ORF9b.

The rRD1-ORFs were fused N-terminally to the (His)₆-tag. Recombinant antigen was prepared as described in example 2 (with the exception that pT091 was expressed at 30°C and not at 37°C), using a single colony of E. coli harbouring either the pT087, pT088, pT089, pTO90, pT091, pT096 or pT098 for inoculation. Purification of recombinant antigen by Ni²⁺ affinity chromatography was also carried out as described in example 2. Fractions containing purified His-rRD1-ORF2, His-rRD1-ORF3, His-rRD1-ORF4, His-rRD1-ORF5, His-rRD1-ORF8, His-rRD1-ORF9a or His-rRD1-ORF9b were pooled. The His-rRD1-ORFs were extensively dialysed against 10 mM Tris/HCl, pH 8.5, 3 M urea, followed by an additional purification step performed on an anion exchange column (Mono Q) using fast protein liquid chromatography (FPLC) (Pharmacia, Uppsala, Sweden). The purification was carried out in 10 mM Tris/HCl, pH 8.5, 3 M urea and protein was eluted by a linear gradient of NaCl from 0 to 1 M. Fractions containing the His-rRD1-ORFs were pooled and subsequently dialysed extensively against 25 mM Hepes, pH 8.0 before use.

TABLE 2. Sequence of the rd1-orf oligonucleotides^a

Orientation and oligonucleotide    Sequences (5' → 3')                      Position (nt)

Sense
RD1-ORF2f      CTGGGGATCCGCATGACTGCTGAACCG            886 - 903
RD1-ORF3f      CTTCCCGGGATGGAAAAAATGTCAC              2807 - 2822
RD1-ORF4f      GTAGGATCCTAGGAGACATCAGCGGC             4028 - 4015
RD1-ORF5f      CTGGGGATCCGCGTGATCACCATGCTGTGG         3028 - 3045
RD1-ORF8f      CTCGGATCCTGTGGGTGCAGGTCCGGCGATGGGC     5502 - 5479
RD1-ORF9af     GTGATGTGAGCTCAGGTGAAGAAGGTGAAG         6144 - 6160
RD1-ORF9bf     GTGATGTGAGCTCCTATGGCGGCCGACTACGAC      5072 - 5089

Antisense
RD1-ORF2r      TGCAAGCTTTTAACCGGCGCTTGGGGGTGC         2664 - 2644
RD1-ORF3r      GATGCCATGGTTAGGCGAAGACGCCGGC           3103 - 3086
RD1-ORF4r      CGATCTAAGCTTGGCAATGGAGGTCTA            3582 - 3597
RD1-ORF5r      TGCAAGCTTTCACCAGTCGTCCTCTTCGTC         4243 - 4223
RD1-ORF8r      CTCCCATGGCTACGACAAGCTCTTCCGGCCGC       5083 - 5105
RD1-ORF9a/br   CGATCTAAGCTTTCAACGACGTCCAGCC           7073 - 7056

^a The oligonucleotides were constructed from the Accession number U34848 nucleotide sequence (Mahairas et al., 1996). Nucleotides (nt) underlined are not contained in the nucleotide sequence of the RD1-ORFs. The positions correspond to the nucleotide sequence of Accession number U34848.

The nucleotide sequences of rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a, and rd1-orf9b from M. tuberculosis H37Rv are set forth in SEQ ID NO: 71, 87, 89, 91, 67, 93, and 69, respectively. The deduced amino acid sequences of rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a, and rd1-orf9b are set forth in SEQ ID NO: 72, 88, 90, 92, 68, 94, and 70, respectively.
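The residue counts quoted for the RD1 reading frames in this example follow from their start and stop coordinates, and the arithmetic can be checked with a few lines of code. The sketch below is illustrative only and its names are hypothetical. It takes the first nucleotide of the start codon and the last nucleotide of the stop codon as given in the text, treats frames with decreasing coordinates as lying on the reverse strand, and subtracts one codon for the stop. rd1-orf8 is left out of the table because the coordinates printed for it (5502 to 5082) span 421 nt, which is not a whole number of codons.

```python
# name -> (first nt of start codon, last nt of stop codon, residues stated in the text).
# Reverse-strand ORFs have start > stop.
ORFS = {
    "rd1-orf2": (889, 2664, 591),
    "rd1-orf3": (2807, 3103, 98),
    "rd1-orf4": (4014, 3595, 139),
    "rd1-orf5": (3128, 4243, 371),
    "rd1-orf9a": (6146, 7072, 308),
    "rd1-orf9b": (5072, 7072, 666),
}


def residues_from_coordinates(start, stop):
    """Encoded residues, or None if the span is not a whole number of codons."""
    span = abs(stop - start) + 1  # nucleotides, both end points included
    codons, remainder = divmod(span, 3)
    # The stop codon is part of the span but encodes no residue.
    return codons - 1 if remainder == 0 else None


for name, (start, stop, stated) in ORFS.items():
    strand = "+" if start < stop else "-"
    print(name, strand, residues_from_coordinates(start, stop), stated)
```

For the six frames listed, the computed residue count equals the number stated in the text.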

EXAMPLE 3

Cloning of the genes expressing 17-30 kDa antigens from ST-CF

Isolation of CFP17, CFP20, CFP21, CFP22, CFP25, and CFP28

ST-CF was precipitated with ammonium sulphate at 80% saturation. The precipitated proteins were removed by centrifugation and after resuspension washed with 8 M urea. CHAPS and glycerol were added to a final concentration of 0.5% (w/v) and 5% (v/v) respectively and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad). The Rotofor Cell had been equilibrated with an 8 M urea buffer containing 0.5% (w/v) CHAPS, 5% (v/v) glycerol, 3% (v/v) Biolyt 3/5 and 1% (v/v) Biolyt 4/6 (BioRad). Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed on silver-stained 10-20% SDS-PAGE. Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut off membrane to a final volume of 1-3 ml. An equal volume of SDS containing sample buffer was added and the protein solution was boiled for 5 min before further separation on a Prep Cell (BioRad) in a matrix of 16% polyacrylamide under an electrical gradient. Fractions containing pure proteins with a molecular mass from 17-30 kDa were collected.

Isolation of CFP29

Anti-CFP29, reacting with CFP29, was generated by immunization of BALB/c mice with crushed gel pieces in RIBI adjuvant (first and second immunization) or aluminium hydroxide (third immunization and boosting) with two week intervals. SDS-PAGE gel pieces containing 2-5 μg of CFP29 were used for each immunization. Mice were boosted with antigen 3 days before removal of the spleen. Generation of a monoclonal cell line producing antibodies against CFP29 was obtained essentially as described by Köhler and Milstein (1975). Screening of supernatants from growing clones was carried out by immunoblotting of nitrocellulose strips containing ST-CF separated by SDS-PAGE. Each strip contained approximately 50 μg of ST-CF. The antibody class of anti-CFP29 was identified as IgM by the mouse monoclonal antibody isotyping kit, RPN29 (Amersham) according to the manufacturer's instructions.

CFP29 was purified by the following method: ST-CF was concentrated 10 fold by ultrafiltration, and ammonium sulphate precipitation in the 45 to 55% saturation range was performed. The pellet was redissolved in 50 mM sodium phosphate, 1.5 M ammonium sulphate, pH 8.5, and subjected to thiophilic adsorption chromatography (Porath et al., 1985) on an Affi-T gel column (Kem-En-Tec). Protein was eluted by a linear 1.5 to 0 M gradient of ammonium sulphate and fractions collected in the range 0.44 to 0.31 M ammonium sulphate were identified as CFP29 containing fractions in Western blot experiments with mAb Anti-CFP29. These fractions were pooled and anion exchange chromatography was performed on a Mono Q HR 5/5 column connected to an FPLC system (Pharmacia). The column was equilibrated with 10 mM Tris-HCl, pH 8.5 and the elution was performed with a linear gradient from 0 to 500 mM NaCl. From 400 to 500 mM sodium chloride, rather pure CFP29 was eluted. As a final purification step the Mono Q fractions containing CFP29 were loaded on a 12.5% SDS-PAGE gel and pure CFP29 was obtained by the multi-elution technique (Andersen and Heron, 1993).

N-terminal sequencing and amino acid analysis

CFP17, CFP20, CFP21, CFP22, CFP25, and CFP28 were washed with water on a Centricon concentrator (Amicon) with cutoff at 10 kDa and then applied to a ProSpin concentrator (Applied Biosystems) where the proteins were collected on a PVDF membrane. The membrane was washed 5 times with 20% methanol before sequencing on a Procise sequencer (Applied Biosystems).

CFP29 containing fractions were blotted to PVDF membrane after tricine SDS-PAGE (Ploug et al., 1989). The relevant bands were excised and subjected to amino acid analysis (Barkholt and Jensen, 1989) and N-terminal sequence analysis on a Procise sequencer (Applied Biosystems).

The following N-terminal sequences were obtained:

For CFP17: A/S E L D A P A Q A G T E X A V (SEQ ID NO: 17)

For CFP20: A Q I T L R G N A I N T V G E (SEQ ID NO: 18)

For CFP21: D P X S D I A V V F A R G T H (SEQ ID NO: 19)

For CFP22: T N S P L A T A T A T L H T N (SEQ ID NO: 20)

For CFP25: A X P D A E V V F A R G R F E (SEQ ID NO: 21)

For CFP28: X I/V Q K S L E L I V/T V/F T A D/Q E (SEQ ID NO: 22)

For CFP29: M N N L Y R D L A P V T E A A W A E I (SEQ ID NO: 23)

"X" denotes an amino acid which could not be determined by the sequencing method used, whereas a "/" between two amino acids denotes that the sequencing method could not determine which of the two amino acids is the one actually present.

Cloning the gene encoding CFP29

The N-terminal sequence of CFP29 was used for a homology search in the EMBL database using the TFASTA program of the Genetics Computer Group sequence analysis software package. The search identified a protein, Linocin M18, from Brevibacterium linens that shares 74% identity with the 19 N-terminal amino acids of CFP29.

Based on this identity between the N-terminal sequence of CFP29 and the sequence of the Linocin M18 protein from Brevibacterium linens, a set of degenerate primers was constructed for PCR cloning of the M. tuberculosis gene encoding CFP29. PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1 x low salt Taq+ buffer from Stratagene supplemented with 250 μM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in a 10 μl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles of the program: 94°C for 15 sec., 55°C for 15 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology. An approx. 300 bp fragment was obtained using primers with the sequences:

1: 5'-CCCGGCTCGAGAACCTSTACCGCGACCTSGCSCC (SEQ ID NO: 24)

2: 5'-GGGCCGGATCCGASGCSGCGTCCTTSACSGGYTGCCA (SEQ ID NO: 25)

where S = G/C and Y = T/C
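To make the degeneracy of these two primers concrete, the sketch below counts and enumerates the individual sequences that each degenerate primer stands for, using only the two symbols defined above (S = G/C, Y = T/C). The function names are hypothetical and the code is an illustration, not part of the described method.

```python
from itertools import product

# Degenerate symbols as defined in the text.
DEGENERATE = {"S": "GC", "Y": "TC"}

PRIMER_1 = "CCCGGCTCGAGAACCTSTACCGCGACCTSGCSCC"
PRIMER_2 = "GGGCCGGATCCGASGCSGCGTCCTTSACSGGYTGCCA"


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(DEGENERATE.get(base, base))
    return n


def expand(primer):
    """List every non-degenerate sequence covered by the primer."""
    choices = [DEGENERATE.get(base, base) for base in primer]
    return ["".join(combo) for combo in product(*choices)]


print(degeneracy(PRIMER_1), degeneracy(PRIMER_2))
print(expand(PRIMER_1)[0])
```

Primer 1 carries three S positions and primer 2 carries four S positions and one Y position, so the mixtures contain 8 and 32 sequences, respectively.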

The fragment was excised from a 1% agarose gel, purified by Spin-X spin columns (Costar), cloned into pBluescript SK II+ - T vector (Stratagene) and finally sequenced with the Sequenase kit from United States Biochemical.

The first 150 bp of this sequence was used for a homology search using the Blast program of the Sanger Mycobacterium tuberculosis database:

(http://www.sanger.ac.uk/projects/M-tuberculosis/blast_server).

This program identified a Mycobacterium tuberculosis sequence on cosmid cy444 in the database that is nearly 100% identical to the 150 bp sequence of the CFP29 protein. The sequence is contained within a 795 bp open reading frame of which the 5' end translates into a sequence that is 100% identical to the N-terminally sequenced 19 amino acids of the purified CFP29 protein.

Finally, the 795 bp open reading frame was PCR cloned under the same PCR conditions as described above using the primers:

3: 5'-GGAAGCCCCATATGAACAATCTCTACCG (SEQ ID NO: 26)

4: 5'-CGCGCTCAGCCCTTAGTGACTGAGCGCGACCG (SEQ ID NO: 27)

The resulting DNA fragments were purified from agarose gels as described above and sequenced with primers 3 and 4 in addition to the following primers:

5: 5'-GGACGTTCAAGCGACACATCGCCG-3' (SEQ ID NO: 115)

6: 5'-CAGCACGAACGCGCCGTCGATGGC-3' (SEQ ID NO: 116)

Three independent clones were sequenced. All three clones were in 100% agreement with the sequence on cosmid cy444.

All other DNA manipulations were done according to Maniatis et al. (1989).

All enzymes other than Taq polymerase were from New England Biolabs.

Homology searches in the Sanger database

For CFP17, CFP20, CFP21, CFP22, CFP25, and CFP28 the N-terminal amino acid sequence from each of the proteins was used for a homology search using the blast program of the Sanger Mycobacterium tuberculosis database:

http://www.sanger.ac.uk/pathogens/TB-blast-server.html.

For CFP29 the first 150 bp of the DNA sequence was used for the search. Furthermore, the EMBL database was searched for proteins with homology to CFP29.

Thereby, the following information was obtained:

CFP17

Of the 14 determined amino acids in CFP17, a 93% identical sequence was found with MTCY1A11.16c. The difference between the two sequences is in the first amino acid: it is an A or an S in the N-terminally determined sequence and an S in MTCY1A11. From the N-terminal sequencing it was not possible to determine amino acid number 13.

Within the open reading frame the translated protein is 162 amino acids long. The N-terminal of the protein purified from culture filtrate starts at amino acid 31, in agreement with the presence of a signal sequence that has been cleaved off. This gives a length of the mature protein of 132 amino acids, which corresponds to a theoretical molecular mass of 13833 Da and a theoretical pI of 4.4. The observed mass in SDS-PAGE is 17 kDa.

CFP20

A sequence 100% identical to the 15 determined amino acids of CFP20 was found on the translated cosmid cscy09F9. A stop codon is found at amino acid 166 from the amino acid M at position 1. This gives a predicted length of 165 amino acids, which corresponds to a theoretical molecular mass of 16897 Da and a pI of 4.2. The observed molecular weight in SDS-PAGE is 20 kDa.

Searching the GenEMBL database using the TFASTA algorithm (Pearson and Lipman, 1988) revealed a number of proteins with homology to the predicted 164 amino acids long translated protein.

The highest homology, 51.5% identity in a 163 amino acid overlap, was found to a Haemophilus influenzae Rd toxR reg. (HIH10751).

CFP21

A sequence 100% identical to the 14 determined amino acids of CFP21 was found at MTCY39. From the N-terminal sequencing it was not possible to determine amino acid number 3; this amino acid is a C in MTCY39. The amino acid C cannot be detected on a Sequencer, which is probably the explanation of this difference.

Within the open reading frame the translated protein is 217 amino acids long. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 33, in agreement with the presence of a signal sequence that has been cleaved off. This gives a length of the mature protein of 185 amino acids, which corresponds to a theoretical molecular weight of 18657 Da and a theoretical pI of 4.6. The observed weight in SDS-PAGE is 21 kDa.

In a 193 amino acid overlap the protein has 32.6% identity to a cutinase precursor with a length of 209 amino acids (CUTI_ALTBR P41744). A comparison of the 14 N-terminally determined amino acids with the translated region (RD2) deleted in M. bovis BCG revealed a 100% identical sequence (mb3484) (Mahairas et al. (1996)).

CFP22

A sequence 100% identical to the 15 determined amino acids of CFP22 was found at MTCY10H4. Within the open reading frame the translated protein is 182 amino acids long. The N-terminal sequence of the protein purified from culture filtrate starts at amino acid 8 and therefore the length of the protein occurring in M. tuberculosis culture filtrate is 175 amino acids. This gives a theoretical molecular weight of 18517 Da and a pI of 6.8. The observed weight in SDS-PAGE is 22 kDa.

In a 182 amino acid overlap the translated protein has 90.1% identity with E235739, a peptidyl-prolyl cis-trans isomerase.

CFP25

A sequence 93% identical to the 15 determined amino acids was found on the cosmid MTCY339.08c. The one amino acid that differs between the two sequences is a C in MTCY339.08c and an X in the N-terminal sequence data. On a Sequencer a C cannot be detected, which is a probable explanation for this difference.

The N-terminally determined sequence from the protein purified from culture filtrate begins at amino acid 33, in agreement with the presence of a signal sequence that has been cleaved off. This gives a length of the mature protein of 187 amino acids, which corresponds to a theoretical molecular weight of 19665 Da and a theoretical pI of 4.9. The observed weight in SDS-PAGE is 25 kDa.

In a 217 amino acid overlap the protein has 42.9% identity to CFP21 (MTCY39.35).

CFP28

No homology was found when using the 10 determined amino acid residues 2-8, 11, 12, and 14 of SEQ ID NO: 22 in the database search.

CFP29

Sanger database searching: A sequence nearly 100% identical to the 150 bp sequence of the CFP29 protein was found on cosmid cy444. The sequence is contained within a 795 bp open reading frame of which the 5' end translates into a sequence that is 100% identical to the N-terminally sequenced 19 amino acids of the purified CFP29 protein. The open reading frame encodes a 265 amino acid protein.

The amino acid analysis performed on the purified protein further confirmed the identity of CFP29 with the protein encoded in the open reading frame on cosmid cy444.

EMBL database searching: The open reading frame encodes a 265 amino acid protein that is 58% identical and 74% similar to the Linocin M18 protein (61% identity on DNA level). This is a 28.6 kDa protein with bacteriocin activity (Valdes-Stauber and Scherer, 1994; Valdes-Stauber and Scherer, 1996). The two proteins have the same length (except for 1 amino acid) and share the same theoretical physicochemical properties. We therefore suggest that CFP29 is a mycobacterial homolog to the Brevibacterium linens Linocin M18 protein.

The amino acid sequences of the purified antigens as picked from the Sanger database are shown in the following list. The amino acids determined by N-terminal sequencing are marked with bold.

CFP17 (SEQ ID NO: 6):

1 MTDMNPDIEK DQTSDEVTVE TTSVFRADFL SELDAPAQAG TESAVSGVEG
51 LPPGSALLVV KRGPNAGSRF LLDQAITSAG RHPDSDIFLD DVTVSRRHAE
101 FRLENNEFNV VDVGSLNGTY VNREPVDSAV LANGDEVQIG KFRLVFLTGP
151 KQGEDDGSTG GP

CFP20 (SEQ ID NO: 8):

1 MAQITLRGNA INTVGELPAV GSPAPAFTLT GGDLGVISSD QFRGKSVLLN
51 IFPSVDTPVC ATSVRTFDER AAASGATVLC VSKDLPFAQK RFCGAEGTEN
101 VMPASAFRDS FGEDYGVTIA DGPMAGLLAR AIVVIGADGN VAYTELVPEI
151 AQEPNYEAAL AALGA

CFP21 (SEQ ID NO: 10):

1 MTPRSLVRIV GVVVATTLAL VSAPAGGRAA HADPCSDIAV
41 VFARGTHQAS GLGDVGEAFV DSLTSQVGGR SIGVYAVNYP ASDDYRASAS
91 NGSDDASAHI QRTVASCPNT RIVLGGYSQG ATVIDLSTSA MPPAVADHVA
141 AVALFGEPSS GFSSMLWGGG SLPTIGPLYS SKTINLCAPD DPICTGGGNI
191 MAHVSYVQSG MTSQAATFAA NRLDHAG

CFP22 (SEQ ID NO: 12):

1 MADCDSVTNS PLATATATLH TNRGDIKIAL FGNHAPKTVA NFVGLAQGTK
51 DYSTQNASGG PSGPFYDGAV FHRVIQGFMI QGGDPTGTGR GGPGYKFADE
101 FHPELQFDKP YLLAMANAGP GTNGSQFFIT VGKTPHLNRR HTIFGEVIDA
151 ESQRVVEAIS KTATDGNDRP TDPVVIESIT IS

CFP25 (SEQ ID NO: 14):

1 MGAAAAMLAA VLLLTPITVP AGYPGAVAPA TAACPDAEVV FARGRFEPPG
51 IGTVGNAFVS ALRSKVNKNV GVYAVKYPAD NQIDVGANDM SAHIQSMANS
101 CPNTRLVPGG YSLGAAVTDV VLAVPTQMWG FTNPLPPGSD EHIAAVALFG
151 NGSQWVGPIT NFSPAYNDRT IELCHGDDPV CHPADPNTWE ANWPQHLAGA
201 YVSSGMVNQA ADFVAGKLQ

CFP29 (SEQ ID NO: 16):

1 MNNLYRDLAP VTEAAWAEIE LEAARTFKRH IAGRRVVDVS DPGGPVTAAV
51 STGRLIDVKA PTNGVIAHLR ASKPLVRLRV PFTLSRNEID DVERGSKDSD
101 WEPVKEAAKK LAFVEDRTIF EGYSAASIEG IRSASSNPAL TLPEDPREIP
151 DVISQALSEL RLAGVDGPYS VLLSADVYTK VSETSDHGYP IREHLNRLVD
201 GDIIWAPAID GAFVLTTRGG DFDLQLGTDV AIGYASHDTD TVRLYLQETL
251 TFLCYTAEAS VALSH

For all six proteins the molecular weights predicted from the sequences are in agreement with the molecular weights observed on SDS-PAGE.
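The way an N-terminal sequence is located in a database sequence, and the way the length of the mature protein follows from that position, can be sketched for CFP17 as follows. The code is an illustration with hypothetical names: it turns the notation used for the N-terminal sequences ("X" for an undetermined residue, "/" between alternatives) into a regular expression and searches the CFP17 sequence (SEQ ID NO: 6) listed above. It is not the database search program used in the example.

```python
import re


def nterm_to_regex(nterm):
    """Translate the N-terminal notation: 'X' = undetermined, 'A/S' = either residue."""
    s = nterm.replace(" ", "")
    parts, i = [], 0
    while i < len(s):
        options = s[i]
        i += 1
        while i < len(s) and s[i] == "/":  # collect alternatives such as A/S
            options += s[i + 1]
            i += 2
        if "X" in options:
            parts.append(".")
        elif len(options) == 1:
            parts.append(options)
        else:
            parts.append("[" + options + "]")
    return "".join(parts)


# CFP17 (SEQ ID NO: 6) as listed above; the spaces are only for readability.
CFP17 = "".join("""
MTDMNPDIEK DQTSDEVTVE TTSVFRADFL SELDAPAQAG TESAVSGVEG
LPPGSALLVV KRGPNAGSRF LLDQAITSAG RHPDSDIFLD DVTVSRRHAE
FRLENNEFNV VDVGSLNGTY VNREPVDSAV LANGDEVQIG KFRLVFLTGP
KQGEDDGSTG GP
""".split())

pattern = nterm_to_regex("A/S ELDAPAQAGTEXAV")
match = re.search(pattern, CFP17)
first_residue = match.start() + 1  # 1-based position in the translated ORF
mature_length = len(CFP17) - first_residue + 1
print(pattern, first_residue, mature_length)
```

For CFP17 the match begins at residue 31 of the 162 residue translation, which leaves 132 residues for the mature protein, as stated in the text.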

Cloning of the genes encoding CFP17, CFP20, CFP21, CFP22 and CFP25.

The genes encoding CFP17, CFP20, CFP21, CFP22 and CFP25 were all cloned into the expression vector pMCT6, by PCR amplification with gene specific primers, for recombinant expression of the proteins in E. coli.

PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1 x low salt Taq+ buffer from Stratagene supplemented with 250 μM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in a 10 μl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles according to the following program: 94°C for 10 sec., 55°C for 10 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology.

The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II+ - T vector (Stratagene). Plasmid DNA was thereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMCT6 in frame with 8 histidine residues which are added to the N-terminal of the expressed proteins. The resulting clones were then sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced. For cloning of the individual antigens, the following gene specific primers were used:

CFP17: Primers used for cloning of cfp17:

OPBR-51: ACAGATCTGTGACGGACATGAACCCG (SEQ ID NO: 117)
OPBR-52: TTTTCCATGGTCACGGGCCCCCGGTACT (SEQ ID NO: 118)

OPBR-51 and OPBR-52 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP20: Primers used for cloning of cfp20:

OPBR-53: ACAGATCTGTGCCCATGGCACAGATA (SEQ ID NO: 119)
OPBR-54: TTTAAGCTTCTAGGCGCCCAGCGCGGC (SEQ ID NO: 120)

OPBR-53 and OPBR-54 create BglII and HindIII sites, respectively, used for the cloning in pMCT6.

CFP21: Primers used for cloning of cfp21:

OPBR-55: ACAGATCTGCGCATGCGGATCCGTGT (SEQ ID NO: 121)
OPBR-56: TTTTCCATGGTCATCCGGCGTGATCGAG (SEQ ID NO: 122)

OPBR-55 and OPBR-56 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP22: Primers used for cloning of cfp22:

OPBR-57: ACAGATCTGTAATGGCAGACTGTGAT (SEQ ID NO: 123)
OPBR-58: TTTTCCATGGTCAGGAGATGGTGATCGA (SEQ ID NO: 124)

OPBR-57 and OPBR-58 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP25: Primers used for cloning of cfp25:

OPBR-59: ACAGATCTGCCGGCTACCCCGGTGCC (SEQ ID NO: 125)
OPBR-60: TTTTCCATGGCTATTGCAGCTTTCCGGC (SEQ ID NO: 126)

OPBR-59 and OPBR-60 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.
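The restriction sites that the primers are stated to create can be located by a simple string search. The sketch below is illustrative only: the helper name and the choice of four primers are assumptions for the example, while the recognition sequences are the standard ones for BglII (AGATCT), NcoI (CCATGG) and HindIII (AAGCTT).

```python
# Standard recognition sequences of the enzymes named in the text.
SITES = {"BglII": "AGATCT", "NcoI": "CCATGG", "HindIII": "AAGCTT"}

# Primer sequences copied from the text, with the site each is stated to create.
PRIMERS = {
    "OPBR-51": ("ACAGATCTGTGACGGACATGAACCCG", "BglII"),
    "OPBR-52": ("TTTTCCATGGTCACGGGCCCCCGGTACT", "NcoI"),
    "OPBR-53": ("ACAGATCTGTGCCCATGGCACAGATA", "BglII"),
    "OPBR-54": ("TTTAAGCTTCTAGGCGCCCAGCGCGGC", "HindIII"),
}

def site_position(primer, enzyme):
    """Return the 0-based index of the enzyme's site in the primer, or -1."""
    return primer.find(SITES[enzyme])

for name, (seq, enzyme) in PRIMERS.items():
    print(name, enzyme, site_position(seq, enzyme))
```

A return value of -1 would indicate that the stated site is absent from the primer as transcribed.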

Expression/purification of recombinant CFP17, CFP20, CFP21, CFP22 and CFP25 proteins.

Expression and metal affinity purification of recombinant proteins was undertaken essentially as described by the manufacturers. For each protein, 1 l LB-media containing 100 μg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmids. Cultures were shaken at 37°C until they reached a density of OD₆₀₀ = 0.4 - 0.6. IPTG was hereafter added to a final concentration of 1 mM and the cultures were further incubated 4 - 16 hours. Cells were harvested, resuspended in 1 X sonication buffer + 8 M urea and sonicated 5 X 30 sec. with 30 sec. pausing between the pulses.

After centrifugation, the lysate was applied to a column containing 25 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.

After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analyzed by SDS-PAGE and protein concentrations were estimated at OD₂₈₀. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH 8.5. Finally the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.

EXAMPLE 3A

Identification of CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19B, CFP22A, CFP23A, CFP23B, CFP25A, CFP27, CFP30A, CWP32 and CFP50.

Identification of CFP16 and CFP19B.

ST-CF was precipitated with ammonium sulphate at 80% saturation. The precipitated proteins were removed by centrifugation and after resuspension washed with 8 M urea. CHAPS and glycerol were added to a final concentration of 0.5 % (w/v) and 5 % (v/v) respectively and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad). The Rotofor Cell had been equilibrated with an 8 M urea buffer containing 0.5 % (w/v) CHAPS, 5% (v/v) glycerol, 3% (v/v) Biolyt 3/5 and 1 % (v/v) Biolyt 4/6 (BioRad). Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed on silver-stained 10-20% SDS-PAGE. Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut off membrane to a final volume of 1-3 ml. An equal volume of SDS containing sample buffer was added and the protein solution boiled for 5 min before further separation on a Prep Cell (BioRad) in a matrix of 16% polyacrylamide under an electrical gradient. Fractions containing well separated bands in SDS-PAGE were selected for N-terminal sequencing after transfer to PVDF membrane.

Isolation of CFP8A, CFP8B, CFP19, CFP23A, and CFP23B.

ST-CF was precipitated with ammonium sulphate at 80% saturation, redissolved in PBS, pH 7.4, dialysed 3 times against 25 mM Piperazin-HCl, pH 5.5, and subjected to chromatofocusing on a matrix of PBE 94 (Pharmacia) in a column connected to an FPLC system (Pharmacia). The column was equilibrated with 25 mM Piperazin-HCl, pH 5.5, and the elution was performed with 10% PB74-HCl, pH 4.0 (Pharmacia). Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut off membrane to a final volume of 1-3 ml and separated on a Prep Cell as described above.

Identification of CFP22A

ST-CF was concentrated approximately 10 fold by ultrafiltration and proteins were precipitated at 80 % saturation, redissolved in PBS, pH 7.4, and dialysed 3 times against PBS, pH 7.4. 5.1 ml of the dialysed ST-CF was treated with RNase (0.2 mg/ml, QIAGEN) and DNase (0.2 mg/ml, Boehringer Mannheim) for 6 h and placed on top of 6.4 ml of 48 % (w/v) sucrose in PBS, pH 7.4, in Sorvall tubes (Ultracrimp 03987, DuPont Medical Products) and ultracentrifuged for 20 h at 257,300 x g_max, 10°C. The pellet was redissolved in 200 μl of 25 mM Tris-192 mM glycine, 0.1 % SDS, pH 8.3.

Identification of CFP7A, CFP25A, CFP27, CFP30A and CFP50

For CFP27, CFP30A and CFP50, ST-CF was concentrated approximately 10 fold by ultrafiltration and ammonium sulphate precipitation in the 45 to 55 % saturation range was performed. Proteins were redissolved in 50 mM sodium phosphate, 1.5 M ammonium sulphate, pH 8.5, and subjected to thiophilic adsorption chromatography on an Affi-T gel column (Kem-En-Tec). Proteins were eluted by a 1.5 to 0 M decreasing gradient of ammonium sulphate. Fractions with similar band patterns in SDS-PAGE were pooled and anion exchange chromatography was performed on a Mono Q HR 5/5 column connected to an FPLC system (Pharmacia). The column was equilibrated with 10 mM Tris-HCl, pH 8.5, and the elution was performed with a gradient of NaCl from 0 to 1 M. Fractions containing well separated bands in SDS-PAGE were selected.

CFP7A and CFP25A were obtained as described above except for the following modification: ST-CF was concentrated approximately 10 fold by ultrafiltration and proteins were precipitated at 80 % saturation, redissolved in PBS, pH 7.4, and dialysed 3 times against PBS, pH 7.4. Ammonium sulphate was added to a concentration of 1.5 M, and ST-CF proteins were loaded on an Affi-T gel column. Elution from the Affi-T gel column and anion exchange were performed as described above.

Isolation of CWP32

Heat treated H37Rv was subfractionated into subcellular fractions as described in Sørensen et al 1995. The cell wall fraction was resuspended in 8 M urea, 0.2 % (w/v) N-octyl β-D-glucopyranoside (Sigma) and 5 % (v/v) glycerol and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad) which was equilibrated with the same buffer. Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed by SDS-PAGE and fractions containing well separated bands were pooled and subjected to N-terminal sequencing after transfer to PVDF membrane.

N-terminal sequencing

Fractions containing CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19B, CFP22A, CFP23A, CFP23B, CFP27, CFP30A, CWP32, and CFP50 were blotted to PVDF membrane after Tricine SDS-PAGE (Ploug et al, 1989). The relevant bands were excised and subjected to N-terminal amino acid sequence analysis on a Procise 494 sequencer (Applied Biosystems). The fraction containing CFP25A was blotted to PVDF membrane after 2-DE PAGE (isoelectric focusing in the first dimension and Tricine SDS-PAGE in the second dimension). The relevant spot was excised and sequenced as described above.

The following N-terminal sequences were obtained:

CFP7A: AEDVRAEIVA SVLEVVVNEG DQIDKGDVVV LLESMYMEIP VLAEAAGTVS (SEQ ID NO: 81)

CFP8A: DPVDDAFIAKLNTAG (SEQ ID NO: 73)

CFP8B: DPVDAIINLDNYGX (SEQ ID NO: 74)

CFP16: AKLSTDELLDAFKEM (SEQ ID NO: 79)

CFP19: TTSPDPYAALPKLPS (SEQ ID NO: 82)

CFP19B: DPAXAPDVPTAAQLT (SEQ ID NO: 80)

CFP22A: TEYEGPKTKF HALMQ (SEQ ID NO: 83)

CFP23A: VIQ/AGMVT/GHIHXVAG (SEQ ID NO: 76)

CFP23B: AEMKXFKNAIVQEID (SEQ ID NO: 75)

CFP25A: AIEVSVLRVF TDSDG (SEQ ID NO: 78)

CWP32: TNIVVLIKQVPDTWS (SEQ ID NO: 77)

CFP27: TTIVALKYPG GVVMA (SEQ ID NO: 84)

CFP30A: SFPYFISPEX AMRE (SEQ ID NO: 85)

CFP50: THYDVVVLGA GPGGY (SEQ ID NO: 86)

N-terminal homology searching in the Sanger database and identification of the corresponding genes.

The N-terminal amino acid sequence from each of the proteins was used for a homology search using the blast program of the Sanger Mycobacterium tuberculosis database:

http://www.sanger.ac.uk/projects/m-tuberculosis/TB-blast-server.

For CFP23B, CFP23A, and CFP19B no similarities were found in the Sanger database. This could be due to the fact that only approximately 70% of the M. tuberculosis genome had been sequenced when the searches were performed. The genes encoding these proteins could be contained in the remaining 30% of the genome for which no sequence data is yet available.

For CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19B, CFP22A, CFP25A, CFP27, CFP30A, CWP32, and CFP50, the following information was obtained:

CFP7A: Of the 50 determined amino acids in CFP7A a 98% identical sequence was found in cosmid csCY07D1 (contig 256):

Score = 226 (100.4 bits), Expect = 1.4e-24, P = 1.4e-24

Identities = 49/50 (98%), Positives = 49/50 (98%), Frame = -1

Query:      1 AEDVRAEIVASVLEVVVNEGDQIDKGDVVVLLESMYMEIPVLAEAAGTVS 50
              AEDVRAEIVASVLEVVVNEGDQIDKGDVVVLLESM MEIPVLAEAAGTVS
Sbjct: 257679 AEDVRAEIVASVLEVVVNEGDQIDKGDVVVLLESMKMEIPVLAEAAGTVS 257530

(SEQ ID NOs: 127, 128, and 129)

The identity is found within an open reading frame of 71 amino acids length corresponding to a theoretical MW of CFP7A of 7305.9 Da and a pI of 3.762. The observed molecular weight in an SDS-PAGE gel is 7 kDa.
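The identity figure reported for CFP7A can be reproduced from the two aligned sequences. The sketch below is illustrative; the function and variable names are hypothetical, and it handles only ungapped alignments of equal length such as the one shown above.

```python
# Query (N-terminal sequence) and subject (database sequence) from the alignment.
QUERY = "".join("AEDVRAEIVA SVLEVVVNEG DQIDKGDVVV LLESMYMEIP VLAEAAGTVS".split())
SBJCT = "".join("AEDVRAEIVA SVLEVVVNEG DQIDKGDVVV LLESMKMEIP VLAEAAGTVS".split())

def identity(a, b):
    """Return (number of identical positions, percent identity), ungapped."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches, 100.0 * matches / len(a)

matches, percent = identity(QUERY, SBJCT)

# 1-based positions at which query and subject differ.
mismatches = [(i + 1, q, s) for i, (q, s) in enumerate(zip(QUERY, SBJCT)) if q != s]

# The subject coordinates span 150 nucleotides, i.e. 50 codons.
codons = (257679 - 257530 + 1) // 3

print(matches, percent, mismatches, codons)
```

The single mismatch falls at position 36 of the determined sequence, where Y was read and the database sequence has K.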

CFP8A: A sequence 80% identical to the 15 N-terminal amino acids was found on contig TB_1884. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 32. This gives a length of the mature protein of 98 amino acids corresponding to a theoretical MW of 9700 Da and a pI of 3.72. This is in good agreement with the observed MW on SDS-PAGE at approximately 8 kDa. The full length protein has a theoretical MW of 12989 Da and a pI of 4.38.

CFP8B: A sequence 71 % identical to the 14 N-terminal amino acids was found on contig TB_653. However, careful re-evaluation of the original N-terminal sequence data confirmed the identification of the protein. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 29. This gives a length of the mature protein of 82 amino acids corresponding to a theoretical MW of 8337 Da and a pI of 4.23. This is in good agreement with the observed MW on SDS-PAGE at approximately 8 kDa. Analysis of the amino acid sequence predicts the presence of a signal peptide which has been cleaved off the mature protein found in culture filtrate.

CFP16: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY20H1.

The identity is found within an open reading frame of 130 amino acids length corresponding to a theoretical MW of CFP16 of 13440.4 Da and a pI of 4.59. The observed molecular weight in an SDS-PAGE gel is 16 kDa.

CFP19: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY270. The identity is found within an open reading frame of 176 amino acids length corresponding to a theoretical MW of CFP19 of 18633.9 Da and a pI of 5.41. The observed molecular weight in an SDS-PAGE gel is 19 kDa.

CFP22A: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY1A6.

The identity is found within an open reading frame of 181 amino acids length corresponding to a theoretical MW of CFP22A of 20441.9 Da and a pI of 4.73. The observed molecular weight in an SDS-PAGE gel is 22 kDa.

CFP25A: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on contig 255.

The identity is found within an open reading frame of 228 amino acids length corresponding to a theoretical MW of CFP25A of 24574.3 Da and a pI of 4.95. The observed molecular weight in an SDS-PAGE gel is 25 kDa.

CFP27: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY261.

The identity is found within an open reading frame of 291 amino acids length. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 58. This gives a length of the mature protein of 233 amino acids, which corresponds to a theoretical molecular weight of 24422.4 Da and a theoretical pI of 4.64. The observed weight in an SDS-PAGE gel is 27 kDa.

CFP30A: Of the 13 determined amino acids in CFP30A, a 100% identical sequence was found on cosmid MTCY261.

The identity is found within an open reading frame of 248 amino acids length corresponding to a theoretical MW of CFP30A of 26881.0 Da and a pI of 5.41. The observed molecular weight in an SDS-PAGE gel is 30 kDa.

CWP32: The 15 amino acid N-terminal sequence was found to be 100% identical to a sequence found on contig 281. The identity was found within an open reading frame of 266 amino acids length, corresponding to a theoretical MW of CWP32 of 28083 Da and a pI of 4.563. The observed molecular weight in an SDS-PAGE gel is 32 kDa.

CFP50: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found in MTV038.06. The identity is found within an open reading frame of 464 amino acids length corresponding to a theoretical MW of CFP50 of 49244 Da and a pI of 5.66. The observed molecular weight in an SDS-PAGE gel is 50 kDa.

Use of homology searching in the EMBL database for identification of CFP19A and CFP23.

Homology searching in the EMBL database (using the GCG package of the Biobase, Århus-DK) with the amino acid sequences of two earlier identified highly immunoreactive ST-CF proteins, using the TFASTA algorithm, revealed that these proteins (CFP21 and CFP25, EXAMPLE 3) belong to a family of fungal cutinase homologs. Among the most homologous sequences were also two Mycobacterium tuberculosis sequences found on cosmid MTCY13E12. The first, MTCY13E12.04, has 46% and 50% identity to CFP25 and CFP21, respectively. The second, MTCY13E12.05, also has 46% and 50% identity to CFP25 and CFP21. The two proteins share 62.5% aa identity in a 184 residue overlap. On the basis of the high homology to the strong T-cell antigens CFP21 and CFP25, respectively, it is believed that CFP19A and CFP23 are possible new T-cell antigens.

The first reading frame encodes a 254 amino acid protein of which the first 26 aa constitute a putative leader peptide that strongly indicates an extracellular location of the protein. The mature protein is thus 228 aa in length corresponding to a theoretical MW of 23149.0 Da and a pI of 5.80. The protein is named CFP23.

The second reading frame encodes a 231 aa protein of which the first 44 aa constitute a putative leader peptide that strongly indicates an extracellular location of the protein. The mature protein is thus 187 aa in length corresponding to a theoretical MW of 19020.3 Da and a pI of 7.03. The protein is named CFP19A. The presence of putative leader peptides in both proteins (and thereby their presence in the ST-CF) is confirmed by theoretical sequence analysis using the signalP program at the Expasy molecular Biology server (http://expasy.hcuge.ch/www/tools.html).
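The mature lengths stated for CFP23 and CFP19A follow from subtracting the putative leader peptide from the length of the encoded precursor. A minimal sketch of that arithmetic, with hypothetical function and variable names:

```python
def mature_length(precursor_aa, leader_aa):
    """Length of the mature protein after removal of the leader peptide."""
    if not 0 <= leader_aa < precursor_aa:
        raise ValueError("leader must be shorter than the precursor")
    return precursor_aa - leader_aa

# (precursor length, leader length) in amino acids, as stated in the text.
READING_FRAMES = {"CFP23": (254, 26), "CFP19A": (231, 44)}

for name, (precursor, leader) in READING_FRAMES.items():
    print(name, mature_length(precursor, leader), "aa")
```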

Searching for homologies to CFP7A, CFP16, CFP19, CFP19A, CFP19B, CFP22A, CFP23, CFP25A, CFP27, CFP30A, CWP32 and CFP50 in the EMBL database.

The amino acid sequences derived from the translated genes of the individual antigens were used for homology searching in the EMBL and Genbank databases using the TFASTA algorithm, in order to find homologous proteins and to address possible functional roles of the antigens.

CFP7A: CFP7A has 44% identity and 70% similarity to a hypothetical Methanococcus jannaschii protein (M. jannaschii from base 1162199-1175341), as well as 43% and 38% identity and 68% and 64% similarity to the C-terminal part of B. stearothermophilus pyruvate carboxylase and Streptococcus mutans biotin carboxyl carrier protein, respectively.

CFP7A contains a consensus sequence EAMKM for a biotin binding site motif which in this case was slightly modified (ESMKM in amino acid residues 34 to 38). By incubation with alkaline phosphatase conjugated streptavidin after SDS-PAGE and transfer to nitrocellulose it was demonstrated that native CFP7A was biotinylated.
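The position of the modified biotin binding motif can be located with a short pattern search over the database sequence shown in the CFP7A alignment. The sketch below is illustrative only. The pattern `E[AS]MKM`, which accepts both the consensus and the modified form, and the names used are choices made for this example. It also assumes that exactly one residue (an initiator methionine) precedes the determined N-terminus in the open reading frame, which the text does not state explicitly.

```python
import re

# Database sequence from the CFP7A alignment (the subject line).
ORF_FRAGMENT = "".join(
    "AEDVRAEIVA SVLEVVVNEG DQIDKGDVVV LLESMKMEIP VLAEAAGTVS".split()
)

# Accepts the consensus EAMKM as well as the modified ESMKM.
MOTIF = re.compile(r"E[AS]MKM")

match = MOTIF.search(ORF_FRAGMENT)
start_in_fragment = match.start() + 1  # 1-based position in the fragment

# Assumption: one extra residue (initiator Met) precedes the fragment.
start_in_orf = start_in_fragment + 1
end_in_orf = start_in_orf + len(match.group()) - 1

print(match.group(), start_in_orf, end_in_orf)
```

Under that assumption the motif occupies residues 34 to 38, in line with the positions given above.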

CFP16: rplL gene, 130 aa. Identical to the M. bovis 50S ribosomal protein L7/L12 (acc. No. P37381).

CFP19: CFP19 has 47% identity and 55% similarity to E. coli pectinesterase homolog (ybhC gene) in a 150 aa overlap.

CFP19A: CFP19A has between 38% and 45% identity to several cutinases from different fungal sp. In addition CFP19A has 46% identity and 61 % similarity to CFP25 as well as 50% identity and 64% similarity to CFP21 (both proteins were earlier isolated from the ST-CF).

CFP1 9B: No apparent homology

CFP22A: No apparent homology

CFP23: CFP23 has between 38% and 46% identity to several cutinases from different fungal sp. In addition CFP23 has 46% identity and 61 % similarity to CFP25 as well as 50% identity and 63% similarity to CFP21 (both proteins were earlier isolated from the ST-CF).

CFP25A: CFP25A has 95% identity in a 241 aa overlap to a putative M. tuberculosis thymidylate synthase (450 aa, accession No. P28176).

CFP27: CFP27 has 81 % identity to a hypothetical M. leprae protein and 64% identity and 78% similarity to Rhodococcus sp. proteasome beta-type subunit 2 (prcB(2) gene).

CFP30A: CFP30A has 67% identity to Rhodococcus proteasome alpha-type 1 subunit.

CWP32: The CWP32 N-terminal sequence is 100% identical to the Mycobacterium leprae sequence MLCB637.03.

CFP50: The CFP50 N-terminal sequence is 100% identical to a putative lipoamide dehydrogenase from M. leprae (Accession 415183).

Cloning of the genes encoding CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19A, CFP22A, CFP23, CFP25A, CFP27, CFP30A, CWP32, and CFP50.

The genes encoding CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19A, CFP22A, CFP23, CFP25A, CFP27, CFP30A, CWP32 and CFP50 were all cloned into the expression vector pMCT6 by PCR amplification with gene specific primers, for recombinant expression of the proteins in E. coli.

PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1 X low salt Taq+ buffer from Stratagene supplemented with 250 mM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1 % DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in a 10 μl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles of the program: 94°C for 10 sec., 55°C for 10 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology.

The DNA fragments were subsequently run on 1 % agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II+ T vector (Stratagene). Plasmid DNA was thereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMCT6 in frame with 8 histidines which are added to the N-terminus of the expressed proteins. The resulting clones were then sequenced by the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.

For cloning of the individual antigens, the following gene specific primers were used:

CFP7A: Primers used for cloning of cfp7A:

OPBR-79: AAGAGTAGATCTATGATGGCCGAGGATGTTCGCG (SEQ ID NO: 95)
OPBR-80: CGGCGACGACGGATCCTACCGCGTCGG (SEQ ID NO: 96)

OPBR-79 and OPBR-80 create BglII and BamHI sites, respectively, used for the cloning in pMCT6.

CFP8A: Primers used for cloning of cfp8A:

CFP8A-F: CTGAGATCTATGAACCTACGGCGCC (SEQ ID NO: 154)
CFP8A-R: CTCCCATGGTACCCTAGGACCCGGGCAGCCCCGGC (SEQ ID NO: 155)

CFP8A-F and CFP8A-R create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP8B: Primers used for cloning of cfp8B:

CFP8B-F: CTGAGATCTATGAGGCTGTCGTTGACCGC (SEQ ID NO: 156)
CFP8B-R: CTCCCCGGGCTTAATAGTTGTTGCAGGAGC (SEQ ID NO: 157)

CFP8B-F and CFP8B-R create BglII and SmaI sites, respectively, used for the cloning in pMCT6.

CFP16: Primers used for cloning of cfp16:

OPBR-104: CCGGGAGATCTATGGCAAAGCTCTCCACCGACG (SEQ ID NOs: 111 and 130)
OPBR-105: CGCTGGGCAGAGCTACTTGACGGTGACGGTGG (SEQ ID NOs: 112 and 131)

OPBR-104 and OPBR-105 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP19: Primers used for cloning of cfp19:

OPBR-96: GAGGAAGATCTATGACAACTTCACCCGACCCG (SEQ ID NO: 107)
OPBR-97: CATGAAGCCATGGCCCGCAGGCTGCATG (SEQ ID NO: 108)

OPBR-96 and OPBR-97 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP19A: Primers used for cloning of cfp19A:

OPBR-88: CCCCCCAGATCTGCACCACCGGCATCGGCGGGC (SEQ ID NO: 99)
OPBR-89: GCGGCGGATCCGTTGCTTAGCCGG (SEQ ID NO: 100)

OPBR-88 and OPBR-89 create BglII and BamHI sites, respectively, used for the cloning in pMCT6.

CFP22A: Primers used for cloning of cfp22A:

OPBR-90: CCGGCTGAGATCTATGACAGAATACGAAGGGC (SEQ ID NO: 101)
OPBR-91: CCCCGCCAGGGAACTAGAGGCGGC (SEQ ID NO: 102)

OPBR-90 and OPBR-91 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP23: Primers used for cloning of cfp23:

OPBR-86: CCTTGGGAGATCTTTGGACCCCGGTTGC (SEQ ID NO: 97)
OPBR-87: GACGAGATCTTATGGGCTTACTGAC (SEQ ID NO: 98)

OPBR-86 and OPBR-87 both create a BglII site used for the cloning in pMCT6.

CFP25A: Primers used for cloning of cfp25A:

OPBR-106: GGCCCAGATCTATGGCCATTGAGGTTTCGGTGTTGC (SEQ ID NO: 113)
OPBR-107: CGCCGTGTTGCATGGCAGCGCTGAGC (SEQ ID NO: 114)

OPBR-106 and OPBR-107 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP27: Primers used for cloning of cfp27:

OPBR-92: CTGCCGAGATCTACCACCATTGTCGCGCTGAAATACCC (SEQ ID NO: 103)
OPBR-93: CGCCATGGCCTTACGCGCCAACTCG (SEQ ID NO: 104)

OPBR-92 and OPBR-93 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP30A: Primers used for cloning of cfp30A:

OPBR-94: GGCGGAGATCTGTGAGTTTTCCGTATTTCATC (SEQ ID NO: 105)
OPBR-95: CGCGTCGAGCCATGGTTAGGCGCAG (SEQ ID NO: 106)

OPBR-94 and OPBR-95 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CWP32: Primers used for cloning of cwp32:

CWP32-F: GCTTAGATCTATGATTTTCTGGGCAACCAGGTA (SEQ ID NO: 158)
CWP32-R: GCTTCCATGGGCGAGGCACAGGCGTGGGAA (SEQ ID NO: 159)

CWP32-F and CWP32-R create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP50: Primers used for cloning of cfp50:

OPBR-100: GGCCGAGATCTGTGACCCACTATGACGTCGTCG (SEQ ID NO: 109)
OPBR-101: GGCGCCCATGGTCAGAAATTGATCATGTGGCCAA (SEQ ID NO: 110)

OPBR-100 and OPBR-101 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

Expression/purification of recombinant CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19A, CFP22A, CFP23, CFP25A, CFP27, CFP30A, CWP32, and CFP50 proteins.

Expression and metal affinity purification of recombinant proteins was undertaken essentially as described by the manufacturers. For each protein, 1 l LB-media containing 100 μg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmids. Cultures were shaken at 37°C until they reached a density of OD₆₀₀ = 0.4 - 0.6. IPTG was hereafter added to a final concentration of 1 mM and the cultures were further incubated 4-16 hours. Cells were harvested, resuspended in 1 X sonication buffer + 8 M urea and sonicated 5 X 30 sec. with 30 sec. pausing between the pulses.

After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analyzed by SDS-PAGE and protein concentrations were estimated at OD₂₈₀. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH 8.5. Finally the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.

EXAMPLE 3B

Identification of CFP7B, CFP10A, CFP11 and CFP30B.

Isolation of CFP7B

ST-CF was precipitated with ammonium sulphate at 80% saturation, redissolved in PBS, pH 7.4, dialyzed 3 times against 25 mM Piperazin-HCl, pH 5.5, and subjected to chromatofocusing on a matrix of PBE 94 (Pharmacia) in a column connected to an FPLC system (Pharmacia). The column was equilibrated with 25 mM Piperazin-HCl, pH 5.5, and the elution was performed with 10% PB74-HCl, pH 4.0 (Pharmacia). Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut off membrane to a final volume of 1-3 ml. An equal volume of SDS containing sample buffer was added and the protein solution boiled for 5 min before further separation on a MultiEluter (BioRad) in a matrix of 10-20 % polyacrylamide (Andersen, P. & Heron, I., 1993). The fraction containing a well separated band below 10 kDa was selected for N-terminal sequencing after transfer to a PVDF membrane.

Isolation of CFP11

ST-CF was precipitated with ammonium sulphate at 80% saturation. The precipitated proteins were removed by centrifugation and after resuspension washed with 8 M urea. CHAPS and glycerol were added to a final concentration of 0.5 % (w/v) and 5% (v/v) respectively and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad). The Rotofor Cell had been equilibrated with an 8M urea buffer containing 0.5 % (w/v) CHAPS, 5% (v/v) glycerol, 3% (v/v) Biolyt 3/5 and 1 % (v/v) Biolyt 4/6 (BioRad). Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed on silver-stained 10-20% SDS-PAGE. The fractions in the pH gradient 5.5 to 6 were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut off membrane to a final volume of 1 ml. 300 mg of the protein preparation was separated on a 10-20% Tricine SDS-PAGE (Ploug et al 1989) and transferred to a PVDF membrane and Coomassie stained. The lowest band occurring on the membrane was excised and submitted for N-terminal sequencing.

Isolation of CFP10A and CFP30B

ST-CF was concentrated approximately 10-fold by ultrafiltration and ammonium sulphate precipitation at 80 % saturation. Proteins were redissolved in 50 mM sodium phosphate, 1.5 M ammonium sulphate, pH 8.5, and subjected to thiophilic adsorption chromatography on an Affi-T gel column (Kem-En-Tec). Proteins were eluted by a 1.5 to 0 M decreasing gradient of ammonium sulphate. Fractions with similar band patterns in SDS-PAGE were pooled and anion exchange chromatography was performed on a Mono Q HR 5/5 column connected to an FPLC system (Pharmacia). The column was equilibrated with 10 mM Tris-HCl, pH 8.5, and the elution was performed with a gradient of NaCl from 0 to 1 M. Fractions containing well separated bands in SDS-PAGE were selected.

Fractions containing CFP10A and CFP30B were blotted to PVDF membrane after 2-DE PAGE (Ploug et al, 1989). The relevant spots were excised and subjected to N-terminal amino acid sequence analysis.

N-terminal sequencing

N-terminal amino acid sequence analysis was performed on a Procise 494 sequencer (Applied Biosystems).

The following N-terminal sequences were obtained:

CFP7B: PQGTVKWFNAEKGFG (SEQ ID NO: 168)

CFP10A: NVTVSIPTILRPXXX (SEQ ID NO: 169)

CFP11: TRFMTDPHAMRDMAG (SEQ ID NO: 170)

CFP30B: PKRSEYRQGTPNWVD (SEQ ID NO: 171)

"X" denotes an amino acid which could not be determined by the sequencing method used.

N-terminal homology searching in the Sanger database and identification of the corresponding genes.

The N-terminal amino acid sequence from each of the proteins was used for a homology search using the blast program of the Sanger Mycobacterium tuberculosis genome database:

http://www.sanger.ac.uk/projects/m-tuberculosis/TB-blast-server.

For CFP11 a sequence 100% identical to 15 N-terminal amino acids was found on contig TB_1314. The identity was found within an open reading frame of 98 amino acids length corresponding to a theoretical MW of 10977 Da and a pI of 5.14.

Amino acid number one can also be an Ala (instead of a Thr) as this sequence was also obtained (results not shown), and a sequence 100% identical to this N-terminus is found on contig TB_671 and on locus MTCI364.09.

For CFP7B a sequence 100% identical to 15 N-terminal amino acids was found on contig TB_2044 and on locus MTY15C10.04 with EMBL accession number Z95436. The identity was found within an open reading frame of 67 amino acids length corresponding to a theoretical MW of 7240 Da and a pI of 5.18.

For CFP10A a sequence 100% identical to 12 N-terminal amino acids was found on contig TB_752 and on locus CY130.20 with EMBL accession numbers Q10646 and Z73902. The identity was found within an open reading frame of 93 amino acids length corresponding to a theoretical MW of 9557 Da and a pI of 4.78.

For CFP30B a sequence 100% identical to 15 N-terminal amino acids was found on contig TB_335. The identity was found within an open reading frame of 261 amino acids length corresponding to a theoretical MW of 27345 Da and a pI of 4.24.

The amino acid sequences of the purified antigens as picked from the Sanger database are shown in the following list.

CFP7B (SEQ ID NO: 147)

  1 MPQGTVKWFN AEKGFGFIAP EDGSADVFVH YTEIQGTGFR TLEENQKVEF
 51 EIGHSPKGPQ ATGVRSL

CFP10A (SEQ ID NO: 141)

  1 MNVTVSIPTI LRPHTGGQKS VSASGDTLGA VISDLEANYS GISERLMDPS
 51 SPGKLHRFVN IYVNDEDVRF SGGLATAIAD GDSVTILPAV AGG

CFP11 (SEQ ID NO: 143)

  1 MATRFMTDPH AMRDMAGRFE VHAQTVEDEA RRMWASAQNI SGAGWSGMAE
 51 ATSLDTMAQM NQAFRNIVNM LHGVRDGLVR DANNYEQQEQ ASQQILSS

CFP30B (SEQ ID NO: 145)

  1 MPKRSEYRQG TPNWVDLQTT DQSAAKKFYT SLFGWGYDDN PVPGGGGVYS
 51 MATLNGEAVA AIAPMPPGAP EGMPPIWNTY IAVDDVDAVV DKVVPGGGQV
101 MMPAFDIGDA GRMSFITDPT GAAVGLWQAN RHIGATLVNE TGTLIWNELL
151 TDKPDLALAF YEAVVGLTHS SMEIAAGQNY RVLKAGDAEV GGCMEPPMPG
201 VPNHWHVYFA VDDADATAAK AAAAGGQVIA EPADIPSVGR FAVLSDPQGA
251 IFSVLKPAPQ Q
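A theoretical molecular weight of the kind quoted above can be estimated from a sequence by summing average residue masses and adding one water. The sketch below is a rough illustration rather than the method actually used for the figures in the text: the residue masses are standard average values rounded to two decimals, and the function name is hypothetical. For CFP7B, the value computed without the initiator methionine comes out within a few Da of the 7240 Da quoted; whether the original figure was calculated in that way is an assumption.

```python
# Average residue masses in Da (standard values, rounded to two decimals).
RESIDUE_MASS = {
    "A": 71.08, "R": 156.19, "N": 114.10, "D": 115.09, "C": 103.14,
    "Q": 128.13, "E": 129.12, "G": 57.05, "H": 137.14, "I": 113.16,
    "L": 113.16, "K": 128.17, "M": 131.19, "F": 147.18, "P": 97.12,
    "S": 87.08, "T": 101.11, "W": 186.21, "Y": 163.18, "V": 99.13,
}
WATER = 18.015  # one water is added for the free N- and C-termini

def average_mass(sequence):
    """Approximate average molecular mass (Da) of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER

# CFP7B sequence as listed in the text (SEQ ID NO: 147).
CFP7B = "".join(
    "MPQGTVKWFN AEKGFGFIAP EDGSADVFVH YTEIQGTGFR TLEENQKVEF "
    "EIGHSPKGPQ ATGVRSL".split()
)

full_length = average_mass(CFP7B)
without_met = average_mass(CFP7B[1:])  # assumes removal of the initiator Met

print(f"{len(CFP7B)} aa, {full_length:.0f} Da; without Met: {without_met:.0f} Da")
```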

Cloning of the genes encoding CFP7B, CFP10A, CFP11, and CFP30B.

PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1 X low salt Taq+ buffer from Stratagene supplemented with 250 mM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1 % DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in a 10 μl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles of the program: 94°C for 10 sec., 55°C for 10 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology.

The DNA fragments were subsequently run on 1 % agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II+ T vector (Stratagene). Plasmid DNA was thereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMCT6 in frame with 8 histidines which are added to the N-terminus of the expressed proteins. The resulting clones were then sequenced by the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.

CFP7B: Primers used for cloning of cfp7B:

CFP7B-F: CTGAGATCTAGAATGCCACAGGGAACTGTG (SEQ ID NO: 160)
CFP7B-R: TCTCCCGGGGGTAACTCAGAGCGAGCGGAC (SEQ ID NO: 161)

CFP7B-F and CFP7B-R create BglII and SmaI sites, respectively, used for the cloning in pMCT6.

CFP10A: Primers used for cloning of cfp10A:

CFP10A-F: CTGAGATCTATGAACGTCACCGTATCC (SEQ ID NO: 162)
CFP10A-R: TCTCCCGGGGCTCACCCACCGGCCACG (SEQ ID NO: 163)

CFP10A-F and CFP10A-R create BglII and SmaI sites, respectively, used for the cloning in pMCT6.

CFP11: Primers used for cloning of cfp11:

CFP11-F: CTGAGATCTATGGCAACACGTTTTATGACG (SEQ ID NO: 164)
CFP11-R: CTCCCCGGGTTAGCTGCTGAGGATCTGCTH (SEQ ID NO: 165)

CFP11-F and CFP11-R create BglII and SmaI sites, respectively, used for the cloning in pMCT6.

CFP30B: Primers used for cloning of cfp30B:

CFP30B-F: CTGAAGATCTATGCCCAAGAGAAGCGAATAC (SEQ ID NO: 166)
CFP30B-R: CGGCAGCTGCTAGCATTCTCCGAATCTGCCG (SEQ ID NO: 167)

CFP30B-F and CFP30B-R create BglII and PvuII sites, respectively, used for the cloning in pMCT6.
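As a quick consistency check on the primer listings, each primer can be searched for the recognition sequence of the enzyme it is said to introduce. The sketch below is illustrative only: the recognition sequences are the standard ones for BglII, SmaI and PvuII, the primer strings are copied as printed in this section, and the function name `site_position` is hypothetical.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {
    "BglII": "AGATCT",
    "SmaI": "CCCGGG",
    "PvuII": "CAGCTG",
}

# Primers as printed above, paired with the site each is said to create.
PRIMERS = {
    "CFP7B-F": ("CTGAGATCTAGAATGCCACAGGGAACTGTG", "BglII"),
    "CFP7B-R": ("TCTCCCGGGGGTAACTCAGAGCGAGCGGAC", "SmaI"),
    "CFP10A-F": ("CTGAGATCTATGAACGTCACCGTATCC", "BglII"),
    "CFP10A-R": ("TCTCCCGGGGCTCACCCACCGGCCACG", "SmaI"),
    "CFP11-F": ("CTGAGATCTATGGCAACACGTTTTATGACG", "BglII"),
    "CFP11-R": ("CTCCCCGGGTTAGCTGCTGAGGATCTGCTH", "SmaI"),
    "CFP30B-F": ("CTGAAGATCTATGCCCAAGAGAAGCGAATAC", "BglII"),
    "CFP30B-R": ("CGGCAGCTGCTAGCATTCTCCGAATCTGCCG", "PvuII"),
}

def site_position(primer, enzyme):
    """Return the 1-based start of the enzyme's site in the primer, or None if absent."""
    idx = primer.find(SITES[enzyme])
    return None if idx < 0 else idx + 1

for name, (seq, enzyme) in PRIMERS.items():
    print(name, enzyme, site_position(seq, enzyme))
```

Every primer in the list carries its stated site within the first ten nucleotides, which is consistent with the sites being added as 5' extensions to the gene-specific part of the primer.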

Expression/purification of recombinant CFP7B, CFP10A, CFP11 and CFP30B protein.

Expression and metal affinity purification of recombinant protein was undertaken essentially as described by the manufacturers. One litre of LB medium containing 100 μg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmid. The culture was shaken at 37°C until it reached a density of OD₆₀₀ = 0.5. IPTG was thereafter added to a final concentration of 1 mM and the culture was incubated for a further 4 hours. Cells were harvested, resuspended in 1 x sonication buffer + 8 M urea and sonicated 5 x 30 sec. with 30 sec. pauses between the pulses.

After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analysed by SDS-PAGE and protein concentrations were estimated at OD₂₈₀. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH 8.5.

Finally the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.

EXAMPLE 3C

Using homology searching for identification of ORF11-1, ORF11-2, ORF11-3 and ORF11-4.

A search of the Mycobacterium tuberculosis Sanger sequence database with the amino acid sequence of CFP11, a previously identified ST-CF protein, identified 4 new, highly homologous proteins. All 4 proteins were at least 96% homologous to CFP11.

On the basis of the strong homology to CFP11, it is believed that ORF11-1, ORF11-2, ORF11-3 and ORF11-4 are potential new T-cell antigens.

The first open reading frame, MTCY10G2.11, homologous to CFP11, encodes a protein of 98 amino acids corresponding to a theoretical molecular mass of 10994 Da and a pI of 5.14. The protein was named ORF11-1.

The second open reading frame, MTCI364.09, homologous to CFP11, encodes a protein of 98 amino acids corresponding to a theoretical molecular mass of 10964 Da and a pI of 5.14. The protein was named ORF11-2.

The third open reading frame, MTV049.14, has an in-frame stop codon. Because the DNA sequence at this position is highly conserved among the 4 open reading frames, it is however suggested that this is due to a sequencing error. The "T" in position 175 of the DNA sequence is therefore suggested to be a "C", as in the four other ORFs. The Q in position 59 of the amino acid sequence would have been a "stop" if the T in position 175 of the DNA sequence had not been substituted. The open reading frame encodes a protein of 98 amino acids corresponding to a theoretical molecular mass of 10994 Da and a pI of 5.14. The protein was named ORF11-3.
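The relation between nucleotide position 175 and amino acid position 59 follows from simple codon arithmetic, and the T-to-C correction can be checked against the standard genetic code. The sketch below is illustrative; the function names are hypothetical, and the text does not state which stop codon appears in the database entry, so both T-initial stop codons that a first-position C would turn into glutamine are covered.

```python
def codon_of(nt_position):
    """Map a 1-based nucleotide position in an ORF to (codon number, position within codon)."""
    codon = (nt_position - 1) // 3 + 1
    offset = (nt_position - 1) % 3 + 1
    return codon, offset

# Standard genetic code: stop codons and glutamine (Q) codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
GLN_CODONS = {"CAA", "CAG"}

def t_to_c_gives_gln(codon):
    """True if a stop codon becomes a Gln codon when its first base T is replaced by C."""
    return codon in STOP_CODONS and ("C" + codon[1:]) in GLN_CODONS

print(codon_of(175))             # nucleotide 175 within the ORF
print(t_to_c_gives_gln("TAA"))   # TAA -> CAA
print(t_to_c_gives_gln("TAG"))   # TAG -> CAG
print(t_to_c_gives_gln("TGA"))   # TGA -> CGA, which encodes Arg, not Gln
```

Nucleotide 175 is the first base of codon 59, so a T in that position is compatible with a stop codon (TAA or TAG) that reads as glutamine once the base is taken to be C.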

The fourth open reading frame, MTCY15C10.32, homologous to CFP11, encodes a protein of 98 amino acids corresponding to a theoretical molecular mass of 11024 Da and a pI of 5.14. The protein was named ORF11-4.

Using homology searching for identification of ORF7-1 and ORF7-2.

A search of the Mycobacterium tuberculosis Sanger sequence database with the amino acid sequence of a previously identified immunoreactive ST-CF protein, CFP7, identified 2 new, highly homologous proteins. The protein ORF7-1 (MTV012.33) was 84% identical to CFP7, with a primary structure of the same size as CFP7, and the protein ORF7-2 (MTV012.31) was 68% identical to CFP7 in a 69 amino acid overlap. On the basis of the strong homology to the potent human T-cell antigen CFP7, ORF7-1 and ORF7-2 are believed to be potential new T-cell antigens.

The first open reading frame homologous to CFP7 encodes a protein of 96 amino acids corresponding to a theoretical molecular mass of 10313 Da and a pI of 4.186. The protein was named ORF7-1.

The second open reading frame homologous to CFP7 encodes a protein of 120 amino acids corresponding to a theoretical molecular mass of 12923.00 Da and a pI of 7.889. The protein was named ORF7-2.

Cloning of the homologous orf7-1 and orf7-2.

Since ORF7-1 and ORF7-2 are nearly identical to CFP7 it was necessary to use the flanking DNA regions in the cloning procedure, to ensure the cloning of the correct ORF. Two PCR reactions were carried out with two different primer sets. PCR reaction 1 was carried out using M. tuberculosis chromosomal DNA and a primer set corresponding to the flanking DNA. PCR reaction 2 was carried out directly on the first PCR product using ORF-specific primers which introduced restriction sites for use in the later cloning procedure. The sequences of the primers used are given below;

Orf7-1:

Primers used for the initial PCR reaction (1) using M. tuberculosis chromosomal DNA as template;

Sense: MTV012.33-R1: 5'- GGAATGAAAAGGGGTTTGTG - 3' (SEQ ID NO: 186)
Antisense: MTV012.33-F1: 5'- GACCACGCCCGCGCCGTGTG - 3' (SEQ ID NO: 187)

Primers used for the second round of PCR (2) using PCR product 1 as template;

Sense: MTV012.33-R2: 5' - GC ΛC CCCGGGATGTCGCAGATTATG - 3' (SEQ ID NO: 188) (introduces a SmaI site upstream of the orf7-1 start codon)
Antisense: MTV012.33-F2: 5' - CT GCΓTGGΛΓCCCTAGCCGCCCCACTTG - 3' (SEQ ID NO: 189) (introduces a BamHI site downstream of the orf7-1 stop codon).

Orf7-2:

Sense: MTV012.31-R1: 5'- GAATATTTGAAAGGGATTCGTG - 3' (SEQ ID NO: 190)
Antisense: MTV012.31-F1: 5'- CTACTAAGCTTGGATCCTTAGTCTCCGGCG - 3' (SEQ ID NO: 191) (introduces a BamHI site downstream of the orf7-2 stop codon)

Primers used for the second round of PCR (2) using PCR product 1 as template;

Sense: MTV012.31-R2: 5' - GCAdC CCCGGGGTGTCGCAGAGTATG - 3' (SEQ ID NO: 192) (introduces a SmaI site upstream of the orf7-2 start codon)
Antisense: MTV012.31-F1: 5' - CT Cr ΛGC7TGGΛ TCCTTAGTCTCCGGCG - 3' (SEQ ID NO: 193) (introduces a BamHI site downstream of the orf7-2 stop codon)

The genes encoding ORF7-1 and ORF7-2 were cloned into the expression vector pMST24, by PCR amplification with gene specific primers, for recombinant expression in E. coli of the proteins.

The PCR reactions contained either 10 ng of M. tuberculosis chromosomal DNA (PCR reaction 1) or 10 ng of PCR product 1 (PCR reaction 2) in 1 x low salt Taq+ buffer from Stratagene supplemented with 250 mM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in 10 μl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles of the program; 94°C for 10 sec., 55°C for 10 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology.

The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II+ T vector (Stratagene). Plasmid DNA was thereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMST24 in frame with 6 histidines which are added to the N-terminus of the expressed proteins. The resulting clones were thereafter sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.

Expression/purification of recombinant ORF7-1 and ORF7-2 protein.

Expression and metal affinity purification of recombinant protein was undertaken essentially as described by the manufacturers. One litre of LB medium containing 100 μg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmid. The culture was shaken at 37°C until it reached a density of OD₆₀₀ = 0.5. IPTG was thereafter added to a final concentration of 1 mM and the culture was incubated for a further 2 hours. Cells were harvested, resuspended in 1 x sonication buffer + 8 M urea and sonicated 5 x 30 sec. with 30 sec. pauses between the pulses.

After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analysed by SDS-PAGE. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH 8.5.

EXAMPLE 4

Cloning of the gene expressing CFP26 (MPT51)

Synthesis and design of probes

Oligonucleotide primers were synthesized automatically on a DNA synthesizer (Applied Biosystems, Foster City, CA, ABI-391, PCR mode), deblocked and purified by ethanol precipitation.

Three oligonucleotides were synthesized (TABLE 3) on the basis of the nucleotide sequence of mpb51 described by Ohara et al. (1995). The oligonucleotides were engineered to include an EcoRI restriction enzyme site at the 5' end and at the 3' end, which made later subcloning possible. Four additional oligonucleotides were synthesized on the basis of the nucleotide sequence of MPT51 (Fig. 5 and SEQ ID NO: 41). The four combinations of the primers were used for the PCR studies.

DNA cloning and PCR technology

Standard procedures were used for the preparation and handling of DNA (Sambrook et al., 1989). The gene mpt51 was cloned from M. tuberculosis H37Rv chromosomal DNA by the use of the polymerase chain reaction (PCR) technology as described previously (Oettinger and Andersen, 1994). The PCR product was cloned in pBluescript SK+ (Stratagene).

Cloning of mpt51

The gene, the signal sequence and the Shine-Dalgarno region of MPT51 were cloned by use of the PCR technology as two fragments of 952 bp and 815 bp in pBluescript SK+, designated pT052 and pT053.

DNA Sequencing

The nucleotide sequence of the cloned 952 bp M. tuberculosis H37Rv PCR fragment, pT052, containing the Shine-Dalgarno sequence, the signal peptide sequence and the structural gene of MPT51, and the nucleotide sequence of the cloned 815 bp PCR fragment containing the structural gene of MPT51, pT053, were determined by the dideoxy chain termination method adapted for supercoiled DNA by use of the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., Cleveland, OH) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.

The nucleotide sequences of pT052 and pT053 and the deduced amino acid sequence are shown in Figure 5. The DNA sequence contained an open reading frame starting with an ATG codon at position 45 - 47 and ending with a termination codon (TAA) at position 942 - 944. The nucleotide sequence of the first 33 codons was expected to encode the signal sequence. On the basis of the known N-terminal amino acid sequence (Ala - Pro - Tyr - Glu - Asn) of the purified MPT51 (Nagai et al., 1991) and the features of the signal peptide, it is presumed that the signal peptidase recognition sequence (Ala-X-Ala) (von Heijne, 1984) is located in front of the N-terminal region of the mature protein at position 144. Therefore, a structural gene encoding MPT51, mpt51, derived from M. tuberculosis H37Rv was found to be located at position 144 - 945 of the sequence shown in Fig. 5. The nucleotide sequence of mpt51 differed by one nucleotide from the nucleotide sequence of MPB51 described by Ohara et al. (1995) (Fig. 5). In mpt51 a substitution of a guanine to an adenine was found at position 780. From the deduced amino acid sequence this change occurs at the first position of the codon, giving an amino acid change from alanine to threonine. Thus it is concluded that mpt51 consists of 801 bp and that the deduced amino acid sequence contains 266 residues with a molecular weight of 27,842, and that MPT51 shows 99,8% identity to MPB51.
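The coordinates quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is only a worked check under the assumption that all positions are 1-based and refer to the same numbering of the sequence in Fig. 5; the variable names are hypothetical.

```python
ATG_START = 45                    # first nucleotide of the initiating ATG
SIGNAL_CODONS = 33                # codons expected to encode the signal sequence
STOP_START, STOP_END = 942, 944   # TAA termination codon
SUBSTITUTION = 780                # position of the G -> A difference from MPB51

# Start of the mature coding region: skip the 33 signal codons.
mature_start = ATG_START + 3 * SIGNAL_CODONS

# Nucleotides encoding the mature protein (stop codon excluded) and residue count.
coding_nt = STOP_START - mature_start
residues = coding_nt // 3

# Length of the structural gene when the stop codon is counted.
gene_nt_with_stop = STOP_END - mature_start + 1

# Position of the substituted base within its codon (1, 2 or 3).
codon_offset = (SUBSTITUTION - mature_start) % 3 + 1

print(mature_start, coding_nt, residues, gene_nt_with_stop, codon_offset)
```

Under these assumptions the mature region starts at position 144, encodes 266 residues, spans 801 nucleotides when the stop codon is included (positions 144 to 944 inclusive), and position 780 falls on the first base of a codon, in agreement with the alanine (GCN) to threonine (ACN) change described.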

Subcloning of mpt51

An EcoRI site was engineered immediately 5' of the first codon of mpt51 so that only the coding region of the gene encoding MPT51 would be expressed, and an EcoRI site was incorporated right after the stop codon at the 3' end.

DNA of the recombinant plasmid pT053 was cleaved at the EcoRI sites. The 815 bp fragment was purified from an agarose gel and subcloned into the EcoRI site of the pMAL-cRI expression vector (New England Biolabs), giving pT054. Vector DNA containing the gene fusion was used to transform E. coli XL1-Blue by the standard procedures for DNA manipulation.

The endpoints of the gene fusion were determined by the dideoxy chain termination method as described under section DNA sequencing. Both strands of the DNA were sequenced.

Preparation and purification of rMPT51

Recombinant antigen was prepared in accordance with instructions provided by New England Biolabs. Briefly, single colonies of E. coli harbouring the pT054 plasmid were inoculated into Luria-Bertani broth containing 50 μg/ml ampicillin and 12.5 μg/ml tetracycline and grown at 37°C to 2 x 10⁸ cells/ml. Isopropyl-β-D-thiogalactoside (IPTG) was then added to a final concentration of 0.3 mM and growth was continued for a further 2 hours. The pelleted bacteria were stored overnight at -20°C in new column buffer (20 mM Tris/HCl, pH 7.4, 200 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol (DTT)) and thawed at 4°C, followed by incubation with 1 mg/ml lysozyme on ice for 30 min and sonication (20 times for 10 sec with intervals of 20 sec). After centrifugation at 9,000 x g for 30 min at 4°C, the maltose binding protein-MPT51 fusion protein (MBP-rMPT51) was purified from the crude extract by affinity chromatography on an amylose resin column. MBP-rMPT51 binds to amylose. After extensive washes of the column, the fusion protein was eluted with 10 mM maltose. Aliquots of the fractions were analyzed on 10% SDS-PAGE. Fractions containing the fusion protein of interest were pooled and dialysed extensively against physiological saline.

Protein concentration was determined by the BCA method supplied by Pierce (Pierce Chemical Company, Rockford, IL).

TABLE 3.

Sequence of the mpt51 oligonucleotides^a.

Orientation and oligonucleotide^a    Sequence (5' → 3')    Position^b (nucleotide)

Sense

MPT51-1    CTCGAATTCGCCGGGTGCACACAQ (SEQ ID NO: 28)    6 - 21 (SEQ ID NO: 41)

MPT51-3    CTCGAATTCGCCCCATACGAGAAC (SEQ ID NO: 29)    143 - 158 (SEQ ID NO: 41)

MPT51-5    GTGTATCTGCTGGAC (SEQ ID NO: 30)    228 - 242 (SEQ ID NO: 41)

MPT51-7    CCGACTGGCTGGCCG (SEQ ID NO: 31)    418 - 432 (SEQ ID NO: 41)

Antisense

MPT51-2    GAGGAATTCGCTTAGCGGATCGCA (SEQ ID NO: 32)    946 - 932 (SEQ ID NO: 41)

MPT51-4    CCCACATTCCGTTGG (SEQ ID NO: 33)    642 - 628 (SEQ ID NO: 41)

MPT51-6    GTCCAGCAGATACAC (SEQ ID NO: 34)    242 - 228 (SEQ ID NO: 41)

^a The oligonucleotides MPT51-1 and MPT51-2 were constructed from the MPB51 nucleotide sequence (Ohara et al., 1995). The other oligonucleotide constructions were based on the nucleotide sequence obtained from mpt51 reported in this work. Nucleotides (nt) underlined are not contained in the nucleotide sequence of MPB/T51.
^b The positions referred to are of the non-underlined parts of the primers and correspond to the nucleotide sequence shown in SEQ ID NO: 41.

Cloning of mpt51 in the expression vector pMST24.

A PCR fragment was produced from pT052 using the primer combination MPT51-F and MPT51-R (TABLE 4). A BamHI site was engineered immediately 5' of the first codon of mpt51 so that only the coding region of the gene encoding MPT51 would be expressed, and an NcoI site was incorporated right after the stop codon at the 3' end.

The PCR product was cleaved at the BamHI and the NcoI site. The 811 bp fragment was purified from an agarose gel and subcloned into the BamHI and the NcoI site of the pMST24 expression vector, giving pT086. Vector DNA containing the gene fusion was used to transform E. coli XL1-Blue by the standard procedures for DNA manipulation.

The nucleotide sequence of the complete gene fusion was determined by the dideoxy chain termination method as described under section DNA sequencing. Both strands of the DNA were sequenced.

Preparation and purification of rMPT51.

Recombinant antigen was prepared from single colonies of E. coli harbouring the pT086 plasmid inoculated into Luria-Bertani broth containing 50 μg/ml ampicillin and 12.5 μg/ml tetracycline and grown at 37°C to 2 x 10⁸ cells/ml. Isopropyl-β-D-thiogalactoside (IPTG) was then added to a final concentration of 1 mM and growth was continued for a further 2 hours. The pelleted bacteria were resuspended in BC 100/20 buffer (100 mM KCl, 20 mM imidazole, 20 mM Tris/HCl, pH 7.9, 20% glycerol). Cells were broken by sonication (20 times for 10 sec with intervals of 20 sec). After centrifugation at 9,000 x g for 30 min. at 4°C the insoluble matter was resuspended in BC 100/20 buffer with 8 M urea, followed by sonication and centrifugation as above. The 6 x His tag-MPT51 fusion protein (His-rMPT51) was purified by affinity chromatography on a Ni-NTA resin column (Qiagen, Hilden, Germany). His-rMPT51 binds to Ni-NTA. After extensive washes of the column, the fusion protein was eluted with BC 100/40 buffer (100 mM KCl, 40 mM imidazole, 20 mM Tris/HCl, pH 7.9, 20% glycerol) with 8 M urea and BC 1000/40 buffer (1000 mM KCl, 40 mM imidazole, 20 mM Tris/HCl, pH 7.9, 20% glycerol) with 8 M urea. His-rMPT51 was extensively dialysed against 10 mM Tris/HCl, pH 8.5, 3 M urea, followed by purification using fast protein liquid chromatography (FPLC) (Pharmacia, Uppsala, Sweden) over an anion exchange column (Mono Q) using 10 mM Tris/HCl, pH 8.5, 3 M urea with a 0 - 1 M NaCl linear gradient. Fractions containing rMPT51 were pooled and subsequently dialysed extensively against 25 mM Hepes, pH 8.0 before use.

Protein concentration was determined by the BCA method supplied by Pierce (Pierce Chemical Company, Rockford, IL) .

The lipopolysaccharide (LPS) content was determined by the limulus amoebocyte lysate test (LAL) to be less than 0.004 ng/μg rMPT51, and this concentration had no influence on cellular activity.

TABLE 4. Sequence of the mpt51 oligonucleotides.

Orientation and oligonucleotide    Sequence (5' → 3')    Position (nt)

Sense

MPT51-F    CTCGGATCCTGCCCCATACGAGAACCTG    139 - 156

Antisense

MPT51-R    CTCCCATGGTTAGCGGATCGCACCG    939 - 924

EXAMPLE 4A

Cloning of the ESAT6-MPT59 and the MPT59-ESAT6 hybrids.

Background for ESAT6-MPT59 and MPT59-ESAT6 fusion

Several studies have demonstrated that ESAT-6 is an immunogen which is relatively difficult to adjuvate in order to obtain consistent results when immunizing therewith. Detecting in vitro recognition of ESAT-6 after immunization with the antigen is very difficult compared to the strong recognition of the antigen that has been found during the recall of memory immunity to M. tuberculosis. ESAT-6 has been found in ST-CF in a truncated version where amino acids 1-15 have been deleted. The deletion includes the main T-cell epitopes recognized by C57BL/6J mice (Brandt et al., 1996). This result indicates that ESAT-6 is either N-terminally processed or proteolytically degraded in ST-CF. In order to optimize ESAT-6 as an immunogen, a gene fusion between ESAT-6 and another major T-cell antigen, MPT59, has been constructed. Two different constructs have been made: MPT59-ESAT-6 (SEQ ID NO: 172) and ESAT-6-MPT59 (SEQ ID NO: 173). In the first hybrid ESAT-6 is N-terminally protected by MPT59 and in the latter it is expected that the fusion of two dominant T-cell antigens can have a synergistic effect.

The genes encoding the ESAT6-MPT59 and the MPT59-ESAT6 hybrids were cloned into the expression vector pMCT6, by PCR amplification with gene specific primers, for recombinant expression in E. coli of the hybrid proteins.

Construction of the hybrid MPT59-ESAT6.

The cloning was carried out in three steps. First the genes encoding the two components of the hybrid, ESAT6 and MPT59, were PCR amplified using the following primer constructions:

ESAT6:

OPBR-4: GGCGCCGGCAAGCTTGCCATGACAGAGCAGCAGTGG (SEQ ID NO: 132)
OPBR-28: CGAACTCGCCGGATCCCGTGTTTCGC (SEQ ID NO: 133)

OPBR-4 and OPBR-28 create HindIII and BamHI sites, respectively.

MPT59:

OPBR-48: GGCAACCGCGAGATCTTTCTCCCGGCCGGGGC (SEQ ID NO: 134)
OPBR-3: GGCAAGCTTGCCGGCGCCTAACGAACT (SEQ ID NO: 135)

OPBR-48 and OPBR-3 create BglII and HindIII sites, respectively. Additionally OPBR-3 deletes the stop codon of MPT59.

PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1 x low salt Taq+ buffer from Stratagene supplemented with 250 mM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in 10 μl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles of the program; 94°C for 10 sec., 55°C for 10 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology.

The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar). The two PCR fragments were digested with HindIII and ligated. A PCR amplification of the ligated PCR fragments encoding MPT59-ESAT6 was carried out using the primers OPBR-48 and OPBR-28. The PCR reaction was initially heated to 94°C for 25 sec. and run for 30 cycles of the program; 94°C for 30 sec., 55°C for 30 sec. and 72°C for 90 sec. The resulting PCR fragment was digested with BglII and BamHI and cloned into the expression vector pMCT6 in frame with 8 histidines which are added to the N-terminus of the expressed protein hybrid. The resulting clones were thereafter sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.
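The two cycling programs used here (one for amplifying the individual components, one for amplifying the ligated hybrid) differ only in the denaturation and annealing holds. A minimal sketch of the programmed hold time for each is given below; it ignores temperature ramping, which the text does not specify, and the class and variable names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Program:
    initial_hold_s: int      # initial denaturation hold, in seconds
    cycles: int              # number of cycles
    step_holds_s: tuple      # (denaturation, annealing, extension) holds, in seconds

    def hold_seconds(self):
        """Total programmed hold time, excluding ramping between temperatures."""
        return self.initial_hold_s + self.cycles * sum(self.step_holds_s)

# Program for the individual ESAT6 and MPT59 fragments: 10 s / 10 s / 90 s.
component_pcr = Program(initial_hold_s=25, cycles=30, step_holds_s=(10, 10, 90))

# Program for the ligated MPT59-ESAT6 fragment: 30 s / 30 s / 90 s.
hybrid_pcr = Program(initial_hold_s=25, cycles=30, step_holds_s=(30, 30, 90))

print(component_pcr.hold_seconds(), hybrid_pcr.hold_seconds())
```

The extension hold (90 s) dominates both programs, so lengthening the denaturation and annealing holds for the hybrid amplification adds 20 minutes of programmed time over 30 cycles.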

Construction of the hybrid ESAT6-MPT59.

Construction of the hybrid ESAT6-MPT59 was carried out as described for the hybrid MPT59-ESAT6. The primers used for the construction and cloning were:

ESAT6:

OPBR-75: GGACCCAGATCTATGACAGAGCAGCAGTGG (SEQ ID NO: 136)
OPBR-76: CCGGCAGCCCCGGCCGGGAGAAAAGCTTTGCGAACATCCCAGTGACG (SEQ ID NO: 137)

OPBR-75 and OPBR-76 create BglII and HindIII sites, respectively. Additionally OPBR-76 deletes the stop codon of ESAT6.

MPT59:

OPBR-77: GTTCGCAAAGCTTTTCTCCCGGCCGGGGCTGCCGGTCGAGTACC (SEQ ID NO: 138)
OPBR-18: CCTTCGGTGGATCCCGTCAG (SEQ ID NO: 139)

OPBR-77 and OPBR-18 create HindIII and BamHI sites, respectively.

Expression/purification of MPT59-ESAT6 and ESAT6-MPT59 hybrid proteins.

Expression and metal affinity purification of recombinant proteins was undertaken essentially as described by the manufacturers. For each protein, one litre of LB medium containing 100 μg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmids. Cultures were shaken at 37°C until they reached a density of OD₆₀₀ = 0.4 - 0.6. IPTG was thereafter added to a final concentration of 1 mM and the cultures were incubated for a further 4 - 16 hours. Cells were harvested, resuspended in 1 x sonication buffer + 8 M urea and sonicated 5 x 30 sec. with 30 sec. pauses between the pulses.

After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analyzed by SDS-PAGE and protein concentrations were estimated at OD₂₈₀. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH 8.5.

Finally the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.

The biological activity of the MPT59-ESAT6 fusion protein is described in Example 6A.

EXAMPLE 5

Mapping of the purified antigens in a 2DE system.

In order to characterize the purified antigens they were mapped in a 2-dimensional electrophoresis (2DE) reference system. This consists of a silver stained gel containing ST-CF proteins separated by isoelectric focusing followed by a separation according to size by polyacrylamide gel electrophoresis. The 2DE was performed according to Hochstrasser et al. (1988). 85 μg of ST-CF was applied to the isoelectric focusing tubes, where BioRad ampholytes BioLyt 4-6 (2 parts) and BioLyt 5-7 (3 parts) were included. The first dimension was performed in acrylamide/piperazine diacrylamide tube gels in the presence of urea, the detergent CHAPS and the reducing agent DTT at 400 V for 18 hours and 800 V for 2 hours. The second dimension, 10-20% SDS-PAGE, was performed at 100 V for 18 hours and silver stained. The identification of CFP7, CFP7A, CFP7B, CFP8A, CFP8B, CFP9, CFP11, CFP16, CFP17, CFP19, CFP20, CFP21, CFP22, CFP25, CFP27, CFP28, CFP29, CFP30A, CFP50, and MPT51 in the 2DE reference gel was done by comparing the spot pattern of the purified antigen with ST-CF with and without the purified antigen. With the assistance of an analytical 2DE software system (Phoretix International, UK) the spots have been identified in Fig. 6. The positions of MPT51 and CFP29 were confirmed by a Western blot of the 2DE gel using the Mab's anti-CFP29 and HBT 4.

EXAMPLE 6

Biological activity of the purified antigens.

IFN-γ induction in the mouse model of TB infection

The recognition of the purified antigens in the mouse model of memory immunity to TB (described in example 1) was investigated. The results shown in TABLE 5 are representative for three experiments.

A very high IFN-γ response was induced by two of the antigens, CFP17 and CFP21, at almost the same high level as ST-CF.

TABLE 5

IFN-γ release from splenic memory effector cells from C57BL/6J mice isolated after reinfection with M. tuberculosis after stimulation with native antigens.

Antigen^a IFN-γ (pg/ml)^b

ST-CF 12564

CFP7 ND^d

CFP9 ND

CFP17 9251

CFP20 2388

CFP21 10732

CFP22 + CFP25^c 5342

CFP26 (MPT51) ND

CFP28 2818

CFP29 3700

The data is derived from a representative experiment out of three.
^a ST-CF was tested in a concentration of 5 μg/ml and the individual antigens in a concentration of 2 μg/ml.
^b Four days after rechallenge a pool of cells from three mice was tested. The results are expressed as the mean of duplicate values and the difference between duplicate cultures is < 15% of the mean. The IFN-γ release of cultures incubated without antigen was 390 pg/ml.
^c A pool of CFP22 and CFP25 was tested.
^d ND, not determined.
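To compare the antigens in TABLE 5 on a common scale, each response can be expressed as a fraction of the ST-CF response. The sketch below is illustrative only: subtracting the 390 pg/ml no-antigen background before taking the ratio is an assumption made here, not a step described in the text, and the function name is hypothetical. Antigens listed as ND are omitted.

```python
BACKGROUND = 390  # pg/ml, IFN-γ release of cultures incubated without antigen

# Values from TABLE 5 (pg/ml).
IFN_GAMMA = {
    "ST-CF": 12564,
    "CFP17": 9251,
    "CFP20": 2388,
    "CFP21": 10732,
    "CFP22 + CFP25": 5342,
    "CFP28": 2818,
    "CFP29": 3700,
}

def relative_to_stcf(antigen, subtract_background=True):
    """Response to an antigen as a fraction of the ST-CF response."""
    base = BACKGROUND if subtract_background else 0
    return (IFN_GAMMA[antigen] - base) / (IFN_GAMMA["ST-CF"] - base)

# Rank the individual antigens by background-corrected relative response.
ranked = sorted((a for a in IFN_GAMMA if a != "ST-CF"), key=relative_to_stcf, reverse=True)
for antigen in ranked:
    print(f"{antigen}: {relative_to_stcf(antigen):.2f}")
```

With or without background correction, CFP21 and CFP17 rank highest, which matches the statement that these two antigens induced responses at almost the same level as ST-CF.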

Skin test reaction in TB infected guinea pigs

The skin test activity of the purified proteins was tested in M. tuberculosis infected guinea pigs.

One group of guinea pigs was infected via an ear vein with 1 x 10⁴ CFU of M. tuberculosis H37Rv in 0.2 ml PBS. After 4 weeks skin tests were performed, and 24 hours after injection the erythema diameter was measured.

As seen in TABLES 6 and 6a all of the antigens induced a significant Delayed Type Hypersensitivity (DTH) reaction.

TABLE 6

DTH erythema diameter in guinea pigs infected with 1 x 10⁴ CFU of M. tuberculosis, after stimulation with native antigens.

Antigen^a Skin reaction (mm)^b

Control 2.00

PPD^c 15.40 (0.53)

CFP7 ND^e

CFP9 ND

CFP17 11.25 (0.84)

CFP20 8.88 (0.13)

CFP21 12.44 (0.79)

CFP22 + CFP25^d 9.19 (3.10)

CFP26 (MPT51) ND

CFP28 2.90 (1.28)

CFP29 6.63 (0.88)

The values presented are the mean of erythema diameter of four animals and the SEMs are indicated in the brackets. For PPD and CFP29 the values are the mean of erythema diameter of ten animals.
^a The antigens were tested in a concentration of 0,1 μg except for CFP29 which was tested in a concentration of 0,8 μg.
^b The skin reactions are measured in mm erythema 24 h after intradermal injection.
^c 10 TU of PPD was used.
^d A pool of CFP22 and CFP25 was tested.
^e ND, not determined.

Together these analyses indicate that most of the antigens identified were highly biologically active and recognized during TB infection in different animal models.

TABLE 6a

DTH erythema diameter of recombinant antigens in outbred guinea pigs infected with 1 x 10⁴ CFU of M. tuberculosis.

Antigen^a Skin reaction (mm)^b

Control 2.9 (0.3)

PPD^c 14.5 (1.0)

CFP 7a 13.6 (1.4)

CFP 17 6.8 (1.9)

CFP 20 6.4 (1.4)

CFP 21 5.3 (0.7)

CFP 25 10.8 (0.8)

CFP 29 7.4 (2.2)

MPT 51 4.9 (1.1)

The values presented are the mean of erythema diameter of four animals and the SEMs are indicated in the brackets. For Control, PPD, and CFP 20 the values are the mean of erythema diameter of eight animals.
^a The antigens were tested in a concentration of 1,0 μg.
^b The skin test reactions are measured in mm erythema 24 h after intradermal injection.
^c 10 TU of PPD was used.

Table 6B.

DTH erythema diameter in guinea pigs i.v. infected with 1 x 10^ CFU M. tuberculosis, after stimulation with 1 0 g antigen.

Antigen Mean (mm) SEM

PBS 3,25 0,48

PPD (2TU) 10,88 1

nCFP7B 7,0 0,46

nCFP19 6,5 0,74

MPT59-ESAT6 14,75 1,5

The values presented are the mean of erythema diameter of four animals.

The results in Table 6B indicate biological activity of nCFP7B, nCFP19 and MPT59-ESAT-6, with MPT59-ESAT-6 giving a DTH response at the level of PPD.

Biological activity of the purified recombinant antigens.

Interferon-γ induction in the mouse model of TB infection.

Primary infections. 8 to 12 weeks old female C57BL/6J (H-2^b), CBA/J (H-2^k), DBA.2 (H-2^d) and A.SW (H-2^s) mice (Bomholtegaard, Ry) were given intravenous infections via the lateral tail vein with an inoculum of 5 x 10⁴ M. tuberculosis suspended in PBS in a vol. of 0.1 ml. 14 days postinfection the animals were sacrificed and spleen cells were isolated and tested for the recognition of recombinant antigen.

As seen in TABLE 7 the recombinant antigens rCFP7A, rCFP1 7, rCFP21 , rCFP25, and rCFP29 were all recognized in at Ieast two strains of mice at a level comparable to STCF. rMPT51 and rCFP7 were only recognized in one or two strains respectively, at a level corresponding to no more than 1 /3 of the response detected after ST-CF stimula- tion. Neither of the antigens rCFP20 and rCFP22 were recognized by any of the four mouse strains.

As shown in TABLE 7A, the recombinant antigens rCFP27, RD1-ORF2, MPT59-ESAT6, rCFP10A, rCFP19, and rCFP25A were all recognized in at least two strains of mice at a level comparable to ST-CF, whereas ESAT6-MPT59, rCFP23, and rCFP30B were recognized in only one strain at this level. rCFP30A, RD1-ORF5 and rCFP16 gave rise to an IFN-γ release in two mouse strains at a level corresponding to 2/3 of the response after stimulation with ST-CF. RD1-ORF3 was recognized in two strains at a level of 1/3 of ST-CF.

The native CFP7B was recognized in two strains at a level of 1/3 of the response seen after stimulation with ST-CF.

Memory responses. 8-12 weeks old female C57BL/6j (H-2^b) mice (Bomholtegaard, Ry) were given intravenous infections via the lateral tail vein with an inoculum of 5 x 10⁴ M. tuberculosis suspended in PBS in a volume of 0.1 ml. After 1 month of infection the mice were treated with isoniazid (Merck and Co., Rahway, NJ) and rifabutin (Farmatalia Carlo Erba, Milano, Italy) in the drinking water for two months. The mice were rested for 4-6 months before being used in experiments. For the study of the recall of memory immunity, animals were infected with an inoculum of 1 x 10⁶ bacteria i.v. and sacrificed at day 4 postinfection. Spleen cells were isolated and tested for the recognition of recombinant antigen.

As seen from TABLE 8, IFN-γ release after stimulation with rCFP17, rCFP21 and rCFP25 was at the same level as seen from spleen cells stimulated with ST-CF. Stimulation with rCFP7, rCFP7A and rCFP29 all resulted in an IFN-γ release no higher than 1/3 of the response seen with ST-CF. rCFP22 was not recognized by IFN-γ producing cells. None of the antigens stimulated IFN-γ release in naive mice. Additionally, none of the antigens were toxic to the cell cultures.

As shown in TABLE 8A, IFN-γ release after stimulation with RD1-ORF2, MPT59-ESAT6, ESAT6-MPT59, and rCFP19 was at the same level as seen from spleen cells stimulated with ST-CF. Stimulation with rCFP10A and rCFP30A gave rise to an IFN-γ release of 2/3 of the response after stimulation with ST-CF, whereas rCFP27, RD1-ORF5, rCFP23, rCFP25A and rCFP30B all resulted in an IFN-γ release no higher than 1/3 of the response seen with ST-CF. RD1-ORF3 and rCFP16 were not recognized by IFN-γ producing memory cells.

TABLE 7. T cell responses in primary TB infection.

Name C57BL/6J (H2^b) DBA.2 (H2^d) CBA/J (H2^k) A.SW (H2^s)

rCFP7A +++ +++ +++ +

rCFP17 +++ + +++ +

rCFP20 - -

rCFP21 +++ +++ +++ +

rCFP22 - -

rCFP25 +++ ++ +++ +

rCFP29 +++ +++ +++ ++

rMPT51 + -

Mouse IFN-γ release 14 days after primary infection with M. tuberculosis. -: no response; +: 1/3 of ST-CF; ++: 2/3 of ST-CF; +++: level of ST-CF.
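The legend grades each antigen relative to the ST-CF response. The following is a minimal sketch of how such a grading could be reproduced from raw IFN-γ values. The nearest-level rule and the function name `score` are assumptions made for illustration; the text does not state how intermediate ratios were binned.

```python
# Reference fractions of the ST-CF response taken from the table legend.
LEVELS = [(0.0, "-"), (1 / 3, "+"), (2 / 3, "++"), (1.0, "+++")]


def score(antigen_pg_ml: float, stcf_pg_ml: float) -> str:
    """Return the legend symbol whose reference fraction is nearest to
    the antigen/ST-CF ratio (assumed binning rule, not from the source)."""
    if stcf_pg_ml <= 0:
        raise ValueError("ST-CF response must be positive")
    ratio = antigen_pg_ml / stcf_pg_ml
    return min(LEVELS, key=lambda level: abs(level[0] - ratio))[1]


if __name__ == "__main__":
    # Hypothetical readings in pg/ml, chosen only to exercise each level.
    for value in (0, 1000, 2000, 3000):
        print(value, score(value, 3000))
```

Under this rule any response at or above the ST-CF level is reported as +++.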

TABLE 7A. T cell responses in primary TB infection.

Name C57BL/6J (H2^b) DBA.2 (H2^d) CBA/J (H2^k) A.SW (H2^s)

rCFP27 ++ ++ +++ +++

rCFP30A - + ++ ++

RD1-ORF2 +++ +++ +++ ++

RD1-ORF3 - - + +

RD1-ORF5 + + ++ ++

MPT59-ESAT6 +++ +++ +++ ++

ESAT6-MPT59 +++ - + -

rCFP10A +++ n.d. +++ n.d.

rCFP16 ++ n.d. ++ n.d.

rCFP19 +++ n.d. +++ n.d.

rCFP23 ++ n.d. +++ n.d.

rCFP25A +++ n.d. +++ n.d.

rCFP30B + n.d. +++ n.d.

CFP7B (native) + n.d. + n.d.

Mouse IFN-γ release 14 days after primary infection with M. tuberculosis. -: no response; +: 1/3 of ST-CF; ++: 2/3 of ST-CF; +++: level of ST-CF. n.d. = not determined.

TABLE 8. T cell responses in memory immune animals.

Name Memory response

rCFP7A ++

rCFP17 +++

rCFP21 +++

rCFP22 -

rCFP29 +

rCFP25 +++

rMPT51 +

Mouse IFN-γ release during recall of memory immunity to M. tuberculosis. -: no response; +: 1/3 of ST-CF; ++: 2/3 of ST-CF; +++: level of ST-CF.

TABLE 8A. T cell responses in memory immune animals.

Name Memory response

rCFP30A ++

RD1-ORF2 +++

RD1-ORF3 -

RD1-ORF5 +

MPT59-ESAT6 +++

ESAT6-MPT59 +++

rCFP10A ++

rCFP16 -

rCFP19 +++

rCFP23 +

rCFP25A +

rCFP30B +

Mouse IFN-γ release during recall of memory immunity to M. tuberculosis. -: no response; +: 1/3 of ST-CF; ++: 2/3 of ST-CF; +++: level of ST-CF.

Interferon-γ induction in human TB patients and BCG vaccinated people.

Human donors: PBMC were obtained from healthy BCG vaccinated donors with no known exposure to patients with TB and from patients with culture or microscopy proven infection with Mycobacterium tuberculosis. Blood samples were drawn from the TB patients 1-4 months after diagnosis.

Lymphocyte preparations and cell culture: PBMC were freshly isolated by gradient centrifugation of heparinized blood on Lymphoprep (Nycomed, Oslo, Norway). The cells were resuspended in complete medium: RPMI 1640 (Gibco, Grand Island, N.Y.) supplemented with 40 μg/ml streptomycin, 40 U/ml penicillin, and 0.04 mM/ml glutamine (all from Gibco Laboratories, Paisley, Scotland) and 10% normal human ABO serum (NHS) from the local blood bank. The number and the viability of the cells were determined by trypan blue staining. Cultures were established with 2,5 x 10⁵ PBMC in 200 μl in microtitre plates (Nunc, Roskilde, Denmark) and stimulated with no antigen, ST-CF, PPD (2.5 μg/ml); rCFP7, rCFP7A, rCFP17, rCFP20, rCFP21, rCFP22, rCFP25, rCFP26, rCFP29, in a final concentration of 5 μg/ml. Phytohaemagglutinin, 1 μg/ml (PHA, Difco Laboratories, Detroit, MI) was used as a positive control. Supernatants for the detection of cytokines were harvested after 5 days of culture, pooled and stored at -80°C until use.

Cytokine analysis: Interferon-γ (IFN-γ) was measured with a standard ELISA technique using a commercially available pair of mAbs from Endogen, used according to the manufacturer's instructions. Recombinant IFN-γ (Gibco Laboratories) was used as a standard. The detection level for the assay was 50 pg/ml. The variation between the duplicate wells did not exceed 10% of the mean. Responses of 9 individual donors are shown in TABLE 9.

As seen in TABLE 9, high levels of IFN-γ release are obtained after stimulation with several of the recombinant antigens. rCFP7A and rCFP17 give rise to responses comparable to ST-CF in almost all donors. rCFP7 seems to be most strongly recognized by BCG vaccinated healthy donors. rCFP21, rCFP25, rCFP26, and rCFP29 give rise to a mixed picture with intermediate responses in each group, whereas low responses are obtained with rCFP20 and rCFP22.

As is seen from Table 9A, RD1-ORF2 and RD1-ORF5 give rise to IFN-γ responses close to the level of ST-CF. Between 60% and 90% of the donors show high IFN-γ responses (> 1000 pg/ml). rCFP30A gives rise to a mixed response with 40-50% high responders, whilst low responses are obtained with RD1-ORF3.

As seen from Table 9B MPT59-ESAT6 and ESAT6-MPT59 both give rise to IFN-γ responses at the level of ST-CF and 67-89 % show high responses ( > 1000 pg/ml).
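The 67-89% range quoted above can be checked directly against the Table 9B values. The sketch below is a minimal illustration: the dictionary layout and the function name `high_responders` are assumptions, entries reported as "<" (below 300 pg/ml) are encoded as `None`, and the 1000 pg/ml threshold is the one stated in the text.

```python
HIGH_PG_ML = 1000  # threshold for a "high" response, as stated in the text

# IFN-γ values (pg/ml) copied from Table 9B; None stands for "<".
TABLE_9B = {
    "healthy": {
        "MPT59-ESAT6": [2030, 5660, None, 8540, 4930, 2070, 10270, 3880, 2240],
        "ESAT6-MPT59": [None, 5800, None, None, 8870, 6450, 11030, 4540, 5820],
    },
    "TB patients": {
        "MPT59-ESAT6": [4150, 5050, 800, 13660, 1740, 1270, 3680, 2070],
        "ESAT6-MPT59": [4120, 2040, 1350, 5630, 2390, 1570, 5340, 620],
    },
}


def high_responders(values, threshold=HIGH_PG_ML):
    """Return (number of donors above threshold, number of donors)."""
    hits = sum(1 for v in values if v is not None and v > threshold)
    return hits, len(values)


if __name__ == "__main__":
    for group, antigens in TABLE_9B.items():
        for antigen, values in antigens.items():
            hits, total = high_responders(values)
            print(f"{group:12s} {antigen:12s} {hits}/{total} = {100 * hits / total:.1f}%")
```

Counted this way, the lowest proportion is 6 of 9 healthy donors (ESAT6-MPT59) and the highest is 8 of 9 healthy donors (MPT59-ESAT6), which matches the quoted range.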

TABLE 9. Results from the stimulation of human blood cells from 5 healthy BCG vaccinated and 4 Tb patients with recombinant antigen. ST-CF, PPD and PHA are shown for comparison. Results are given in pg IFN-γ/ml.

Controls, healthy, BCG vaccinated, no known TB exposure

Donor no ag PHA PPD STCF CFP7 CFP17 CFP7A CFP20 CFP21 CFP22 CFP25 CFP26 CFP29

1 6 9564 6774 3966 7034 69 1799 58 152 73 182 946 86

2 48 12486 6603 8067 3146 10044 5267 29 6149 51 1937 526 2065

3 190 11929 10000 8299 8015 11563 8641 437 3194 669 2531 8076 6098

4 10 21029 4106 3537 1323 1939 5211 1 284 1 1344 20 125

5 1 18750 14209 13027 17725 8038 19002 1 3008 1 2103 974 8181

TB patients, 1-4 months after diagnosis

Donor no ag PHA PPD STCF CFP7 CFP17 CFP7A CFP20 CFP21 CFP22 CFP25 CFP26 CFP29

6 9 8973 5096 6145 852 4250 4019 284 1131 48 2400 1078 4584

7 1 12413 6281 3393 168 6375 4505 11 4335 16 3082 1370 5115

8 4 11915 7671 7375 104 2753 3356 119 407 437 2069 712 5284

9 32 22130 16417 17213 8450 9783 16319 91 5957 67 10043 13313 9953

Table 9A.

Results from the stimulation of human blood cells from 10 healthy BCG vaccinated or non vaccinated ST-CF responsive healthy donors and 10 TB patients with recombinant antigen are shown. ST-CF, PPD and PHA are included for comparison. Results are given in pg IFN-γ/ml and negative values below 300 pg/ml are shown as "<". nd = not done.

Controls, Healthy BCG vaccinated, or non vaccinated ST-CF positive

Donor no ag PHA PPD STCF RD1-ORF2 RD1-ORF3 RD1-ORF5 rCFP30A

10 < nd 3500 4200 1250 < 690 nd

11 < nd 5890 4040 5650 880 9030 nd

12 < nd 6480 3330 2310 < 3320 nd

13 < nd 7440 4570 920 < 1230 nd

14 < 8310 nd 2990 1870 < 4880 <

15 < 10820 nd 4160 5690 < 810 3380

16 < 8710 nd 5690 1630 < 5600 <

17 < 7020 4480 5340 2030 nd 670 <

18 < 8370 6250 4780 3850 nd 370 1730

19 < 8520 1600 310 5110 nd 2330 1800

TB patients, 1-4 months after diagnosis

Donor no ag PHA PPD STCF RD1-ORF2 RD1-ORF3 RD1-ORF5 rCFP30A

20 < nd 10670 12680 2020 < 9670 nd

21 < nd 3010 1420 850 < 350 nd

22 < nd 8450 7850 430 < 1950 nd

23 < 10060 nd 3730 < < 350 <

24 < 10830 nd 6180 2090 < 320 730

25 < 9000 nd 3200 4760 < 4960 2820

26 < 10740 nd 7650 4710 < 1170 2280

27 < 7550 6430 6220 2030 nd 3390 3069

28 < 8090 5790 4850 1100 nd 2095 550

29 < 7790 4800 4260 2800 nd 1210 420

Table 9B.

Results from the stimulation of human blood cells from 9 Healthy BCG vaccinated, or non vaccinated ST-CF positive and 8 Tb patients with recombinant MPT59-ESAT6 and ESAT6-MPT59 are shown. ST-CF, PPD and PHA are included for comparison. Results are given in pg IFN-γ/ml and negative values below 300 pg/ml are shown as " < " . nd = not done.

Controls, Healthy BCG vaccinated, or non vaccinated ST-CF positive.

Donor no ag PHA PPD STCF MPT59-ESAT6 ESAT6-MPT59

1 < 9560 6770 3970 2030 <

2 < 12490 6600 8070 5660 5800

4 < 21030 4100 3540 < <

5 < 18750 14200 13030 8540 <

11 < nd 5890 4040 4930 8870

12 < nd 6470 3330 2070 6450

14 < 8310 nd 2990 10270 11030

15 < 10830 nd 4160 3880 4540

16 < 8710 nd 5690 2240 5820

TB patients, 1-4 months after diagnosis

Donor no ag PHA PPD STCF MPT59-ESAT6 ESAT6-MPT59

6 < 8970 5100 6150 4150 4120

7 < 12410 6280 3390 5050 2040

8 < 11920 7670 7370 800 1350

9 < 22130 16420 17210 13660 5630

23 < 10070 nd 3730 1740 2390

24 < 10820 nd 6180 1270 1570

25 < 9010 nd 3200 3680 5340

26 < 10740 nd 7650 2070 620

EXAMPLE 6A

Four groups of 6-8 weeks old, female C57BL/6J mice (Bomholtegard, Denmark) were immunized subcutaneously at the base of the tail with vaccines of the following compositions:

Group 1: 10 μg ESAT-6/DDA (250 μg)

Group 2: 10 μg MPT59/DDA (250 μg)

Group 3: 10 μg MPT59-ESAT-6/DDA (250 μg)

Group 4: Adjuvant control group: DDA (250 μg) in NaCl

The animals were injected with a volume of 0.2 ml. Two weeks after the first injection and 3 weeks after the second injection the mice were boosted a little further up the back. One week after the last immunization the mice were bled and the blood cells were isolated. The immune response induced was monitored by release of IFN-γ into the culture supernatants when stimulated in vitro with relevant antigens (see the following table).

Immunogen For restimulation^a): Ag in vitro

10 μg/dose no antigen ST-CF ESAT-6 MPT59

ESAT-6 219 ± 219 569 ± 569 835 ± 633 -

MPT59 0 802 ± 182 - 5647 ± 159

Hybrid: MPT59-ESAT-6 127 ± 127 7453 ± 581 15133 ± 861 16363 ± 1002

^a) Blood cells were isolated 1 week after the last immunization and the release of IFN-γ (pg/ml) after 72 h of antigen stimulation (5 μg/ml) was measured. The values shown are the mean of triplicates performed on cells pooled from three mice ± SEM.

^b) - not determined

The experiment demonstrates that immunization with the hybrid stimulates T cells which recognize ESAT-6 and MPT59 more strongly than after single antigen immunization. In particular, the recognition of ESAT-6 was enhanced by immunization with the MPT59-ESAT-6 hybrid. IFN-γ release in control mice immunized with DDA never exceeded 1000 pg/ml.

EXAMPLE 6B

The recombinant antigens were tested individually as subunit vaccines in mice. Eleven groups of 6-8 weeks old, female C57BL/6J mice (Bomholtegard, Denmark) were immunized subcutaneously at the base of the tail with vaccines of the following composition:

Group 1: 10 μg CFP7

Group 2: 10 μg CFP17

Group 3: 10 μg CFP21

Group 4: 10 μg CFP22

Group 5: 10 μg CFP25

Group 6: 10 μg CFP29

Group 7: 10 μg MPT51

Group 8: 50 μg ST-CF

Group 9: Adjuvant control group

Group 10: BCG 2,5 x 10⁵/ml, 0,2 ml

Group 11: Control group: Untreated

All the subunit vaccines were given with DDA as adjuvant. The animals were vaccinated with a volume of 0.2 ml. Two weeks after the first injection and three weeks after the second injection, groups 1-9 were boosted a little further up the back. One week after the last injection the mice were bled and the blood cells were isolated. The immune response induced was monitored by release of IFN-γ into the culture supernatant when stimulated in vitro with the homologous protein.

6 weeks after the last immunization the mice were aerosol challenged with 5 x 10⁶ viable Mycobacterium tuberculosis/ml. After 6 weeks of infection the mice were killed and the number of viable bacteria in the lung and spleen of infected mice was determined by plating serial 3-fold dilutions of organ homogenates on 7H11 plates. Colonies were counted after 2-3 weeks of incubation. The protective efficacy is expressed as the difference between the log10 values of the geometric mean of counts obtained from five mice of the relevant group and the geometric mean of counts obtained from five mice of the relevant control group.
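The protective efficacy measure described above can be written out as a short calculation. This is a minimal sketch under stated assumptions: the function name `protective_efficacy` is hypothetical, the sign convention (control minus vaccinated, so that protection is positive) is assumed because the text only says "difference", and the CFU counts in the usage example are invented for illustration and are not data from the experiment.

```python
import math
from statistics import geometric_mean  # available in Python 3.8+


def protective_efficacy(control_cfu, vaccinated_cfu):
    """log10 of the geometric mean CFU in the control group minus
    log10 of the geometric mean CFU in the vaccinated group."""
    return (math.log10(geometric_mean(control_cfu))
            - math.log10(geometric_mean(vaccinated_cfu)))


if __name__ == "__main__":
    # Hypothetical organ counts from five mice per group.
    control = [2.0e6, 1.0e6, 5.0e5, 1.0e6, 1.0e6]
    vaccinated = [1.0e5, 1.0e5, 1.0e5, 1.0e5, 1.0e5]
    print(f"{protective_efficacy(control, vaccinated):.2f} log10 units")
```

Using the geometric rather than the arithmetic mean keeps a single high-count animal from dominating the group value, which is why the difference is taken on the log10 scale.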

The results from the experiments are presented in the following table.

Immunogenicity and protective efficacy in mice, of ST-CF and 7 subunit vaccines

Subunit vaccine Immunogenicity Protective efficacy

ST-CF +++ +++

CFP7 ++ -

CFP17 +++ +++

CFP21 +++ ++

CFP22 - -

CFP25 +++ +++

CFP29 +++ +++

MPT51 +++ ++

+++ Strong immunogen / high protection (level of BCG)

++ Medium immunogen / medium protection

- No recognition / no protection

In conclusion, we have identified a number of proteins inducing high levels of protection. Three of these, CFP17, CFP25 and CFP29, give rise to levels of protection similar to ST-CF and BCG, while two proteins, CFP21 and MPT51, induce protection at around 2/3 of the level of BCG and ST-CF. Two of the proteins, CFP7 and CFP22, did not induce protection in the mouse model.

As described for rCFP7, rCFP17, rCFP21, rCFP22, rCFP25, rCFP29 and rMPT51, the two antigens rCFP7A and rCFP30A were tested individually as subunit vaccines in mice. C57BL/6J mice were immunized as described for rCFP7, rCFP17, rCFP21, rCFP22, rCFP25, rCFP29 and rMPT51 using either 10 μg rCFP7A or 10 μg rCFP30A. Controls were the same as in the experiment including rCFP7, rCFP17, rCFP21, rCFP22, rCFP25, rCFP29 and rMPT51.

Immunogenicity and protective efficacy in mice, of ST-CF and 2 subunit vaccines.

Subunit vaccine Immunogenicity Protective efficacy

ST-CF +++ +++

rCFP7A +++ +++

rCFP30A +++ -

+++ Strong immunogen / high protection (level of BCG)

++ Medium immunogen / medium protection

- No recognition / no protection

In conclusion we have identified two strong immunogens of which one, rCFP7A, induces protection at the level of ST-CF.

EXAMPLE 7

Species distribution of cfp7, cfp9, mpt51, rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b as well as of cfp7a, cfp7b, cfp10a, cfp17, cfp20, cfp21, cfp22, cfp22a, cfp23, cfp25 and cfp25a.

Presence of cfp7, cfp9, mpt51, rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b in different mycobacterial species.

In order to determine the distribution of the cfp7, cfp9, mpt51, rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b genes in species belonging to the M. tuberculosis-complex and in other mycobacteria, PCR and/or Southern blotting was used. The bacterial strains used are listed in TABLE 10. Genomic DNA was prepared from mycobacterial cells as described previously (Andersen et al., 1992).

PCR analyses were used in order to determine the distribution of the cfp7, cfp9 and mpt51 genes in species belonging to the tuberculosis-complex and in other mycobacteria. The bacterial strains used are listed in TABLE 10. PCR was performed on genomic DNA prepared from mycobacterial cells as described previously (Andersen et al., 1992). The oligonucleotide primers used were synthesised automatically on a DNA synthesizer (Applied Biosystems, Foster City, CA, ABI-391, PCR-mode), deblocked, and purified by ethanol precipitation. The primers used for the analyses are shown in TABLE 11.

The PCR amplification was carried out in a thermal reactor (Rapid cycler, Idaho Technology, Idaho) by mixing 20 ng chromosomal DNA with the mastermix (containing 0.5 μM of each oligonucleotide primer, 0.25 μM BSA (Stratagene), low salt buffer (20 mM Tris-HCl, pH 8.8, 10 mM KCl, 10 mM (NH₄)₂SO₄, 2 mM MgSO₄ and 0.1% Triton X-100) (Stratagene), 0.25 mM of each deoxynucleoside triphosphate and 0.5 U Taq Plus Long DNA polymerase (Stratagene)). The final volume was 10 μl (all concentrations given are concentrations in the final volume). Predenaturation was carried out at 94°C for 30 s, followed by 30 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 30 s and elongation at 72°C for 1 min.
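As a quick check on the reaction set-up, the quantities per 10 μl reaction and the total programmed hold time follow from the figures given above. The sketch below is illustrative only: the constant and function names are assumptions, and temperature ramp times of the cycler are not included because the text does not give them.

```python
FINAL_VOLUME_UL = 10.0      # final reaction volume from the protocol
PREDENATURATION_S = 30      # 94 °C predenaturation
CYCLES = 30
CYCLE_STEPS_S = {           # hold time of each step within one cycle
    "denaturation 94C": 30,
    "annealing 55C": 30,
    "elongation 72C": 60,
}


def amount_pmol(conc_um: float, volume_ul: float = FINAL_VOLUME_UL) -> float:
    """Amount of a component in pmol (μM x μl = pmol)."""
    return conc_um * volume_ul


def programmed_hold_seconds() -> int:
    """Sum of all programmed hold times, excluding ramping."""
    return PREDENATURATION_S + CYCLES * sum(CYCLE_STEPS_S.values())


if __name__ == "__main__":
    print("each primer:", amount_pmol(0.5), "pmol")        # 0.5 μM
    print("each dNTP:", amount_pmol(250.0), "pmol")        # 0.25 mM = 250 μM
    print("hold time:", programmed_hold_seconds() / 60, "min")
```

Each primer is therefore present at 5 pmol and each dNTP at 2.5 nmol per reaction, and the programmed holds add up to 3630 s, or 60.5 min.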

The following primer combinations were used (the length of the amplified products are given in parentheses):

mpt51: MPT51-3 and MPT51-2 (820 bp), MPT51-3 and MPT51-6 (108 bp), MPT51-5 and MPT51-4 (415 bp), MPT51-7 and MPT51-4 (325 bp).

cfp7: pvF1 and pvR1 (274 bp), pvF1 and pvR2 (197 bp), pvF3 and pvR1 (302 bp), pvF3 and pvR2 (125 bp).

cfp9: stR3 and stF1 (351 bp).

TABLE 10.

Mycobacterial strains used in this Example.

Species and strain (s) Source

1. M. tuberculosis H37Rv ATCC^a (ATCC 27294)

2. H37Ra ATCC (ATCC 25177)

3. Erdman Obtained from A. Lazlo, Ottawa, Canada

4. M. bovis BCG substrain: Danish 1331 SSI^b

5. Chinese SSI^c

6. Canadian SSI^c

7. Glaxo SSI^c

8. Russia SSI^c

9. Pasteur SSI^c

10. Japan WHO^e

11. M. bovis MNC 27 SSI^c

12. M. africanum Isolated from a Danish patient

13. M. leprae (armadillo-derived) Obtained from J. M. Colston, London, UK

14. M. avium (ATCC 15769) ATCC

15. M. kansasii (ATCC 12478) ATCC

16. M. marinum (ATCC 927) ATCC

17. M. scrofulaceum (ATCC 19275) ATCC

18. M. intracellulare (ATCC 15985) ATCC

19. M. fortuitum (ATCC 6841) ATCC

20. M. xenopi Isolated from a Danish patient

21. M. flavescens Isolated from a Danish patient

22. M. szulgai Isolated from a Danish patient

23. M. terrae SSI^c

24. E. coli SSI^d

25. S. aureus SSI^d

^a American Type Culture Collection, USA.

^b Statens Serum Institut, Copenhagen, Denmark.

^c Our collection, Department of Mycobacteriology, Statens Serum Institut, Copenhagen, Denmark.

^d Department of Clinical Microbiology, Statens Serum Institut, Denmark.

^e WHO International Laboratory for Biological Standards, Statens Serum Institut, Copenhagen, Denmark.

TABLE 11.

Sequence of the mpt51, cfp7 and cfp9 oligonucleotides.

Orientation and oligonucleotide Sequences (5'→3')^a Position^b (nucleotides)

Sense

MPT51-1 CTCGAATTCGCCGGGTGCACACAG (SEQ ID NO: 28) 6-21 (SEQ ID NO: 41)

MPT51-3 CTCGAATTCGCCCCATACGAGAAC (SEQ ID NO: 29) 143-158 (SEQ ID NO: 41)

MPT51-5 GTGTATCTGCTGGAC (SEQ ID NO: 30) 228-242 (SEQ ID NO: 41)

MPT51-7 CCGACTGGCTGGCCG (SEQ ID NO: 31) 418-432 (SEQ ID NO: 41)

pvR1 GTACGAGAATTCATGTCGCAAATCATG (SEQ ID NO: 35) 91-105 (SEQ ID NO: 1)

pvR2 GTACGAGAATTCGAGCTTGGGGTGCCG (SEQ ID NO: 36) 168-181 (SEQ ID NO: 1)

stR3 CGATTCCAAGCTTGTGGCCGCCGACCCG (SEQ ID NO: 37) 141-155 (SEQ ID NO: 3)

Antisense

MPT51-2 GAGGAATTCGCTTAGCGGATCGCA (SEQ ID NO: 32) 946-932 (SEQ ID NO: 41)

MPT51-4 CCCACATTCCGTTGG (SEQ ID NO: 33) 642-628 (SEQ ID NO: 41)

MPT51-6 GTCCAGCAGATACAC (SEQ ID NO: 34) 242-228 (SEQ ID NO: 41)

pvF1 CGTTAGGGATCCTCATCGCCATGGTGTTGG (SEQ ID NO: 38) 340-323 (SEQ ID NO: 1)

pvF3 CGTTAGGGATCCGGTTCCACTGTGCC (SEQ ID NO: 39) 268-255 (SEQ ID NO: 1)

stF1 CGTTAGGGATCCTCAGGTCTTTTCGATG (SEQ ID NO: 40) 467-452 (SEQ ID NO: 3)

^a Nucleotides underlined are not contained in the nucleotide sequences of mpt51, cfp7, and cfp9.

^b The positions referred to are of the non-underlined parts of the primers and correspond to the nucleotide sequence shown in SEQ ID NOs: 41, 1, and 3 for mpt51, cfp7, and cfp9, respectively.

The Southern blotting was carried out as described previously (Oettinger and Andersen, 1994) with the following modifications: 2 μg of genomic DNA was digested with PvuII, electrophoresed in an 0.8% agarose gel, and transferred onto a nylon membrane (Hybond N-plus; Amersham International plc, Little Chalfont, United Kingdom) with a vacuum transfer device (Milliblot, TM-v; Millipore Corp., Bedford, MA). The cfp7, cfp9, mpt51, rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b gene fragments were amplified by PCR from the plasmids pRVN01, pRVN02, pT052, pT087, pT088, pT089, pT090, pT091, pT096 or pT098 by using the primers shown in TABLE 11 and TABLE 2 (in Example 2a). The probes were labelled non-radioactively with an enhanced chemiluminescence kit (ECL; Amersham International plc, Little Chalfont, United Kingdom). Hybridization and detection were performed according to the instructions provided by the manufacturer. The results are summarized in TABLES 12 and 13.

TABLE 12. Interspecies analysis of the cfp7, cfp9 and mpt51 genes by PCR and/or Southern blotting and of MPT51 protein by Western blotting.

Species and strain PCR (cfp7 cfp9 mpt51) Southern blot (cfp7 cfp9 mpt51) Western blot (MPT51)

1. M. tub. H37Rv + + + + + + +

2. M. tub. H37Ra + + + N.D. N.D. + +

3. M. tub. Erdman + + + + + + +

4. M. bovis + + + + +

5. M. bovis BCG Danish 1331 + + + + + + +

6. M. bovis BCG Japan + + N.D. + + + N.D.

7. M. bovis BCG Chinese + + N.D. + + N.D. N.D.

8. M. bovis BCG Canadian + + N.D. + + N.D. N.D.

9. M. bovis BCG Glaxo + + N.D. + + N.D. N.D.

10. M. bovis BCG Russia + + N.D. + + N.D. N.D.

11. M. bovis BCG Pasteur + + N.D. + + N.D. N.D.

12. M. africanum + + + + + + +

13. M. leprae - - - - -

14. M. avium + + - + + + -

15. M. kansasii + - - + + +

16. M. marinum - (+) - + + + -

17. M. scrofulaceum - - - - - - -

18. M. intracellulare + (+) - + + + -

19. M. fortuitum - - - - - - -

20. M. flavescens + (+) - + + + N.D.

21. M. xenopi - - - N.D. N.D. +

22. M. szulgai (+) (+) - - + - -

23. M. terrae - - N.D. N.D. N.D. N.D. N.D.

+ , positive reaction; -, no reaction, N.D. not determined.

cfp7, cfp9 and mpt51 were found in the M. tuberculosis complex including BCG and in the environmental mycobacteria M. avium, M. kansasii, M. marinum, M. intracellulare and M. flavescens. cfp9 was additionally found in M. szulgai and mpt51 in M. xenopi.

Furthermore, the presence of native MPT51 in culture filtrates from different mycobacterial strains was investigated with Western blots developed with Mab HBT4. There is a strong band at around 26 kDa in M. tuberculosis H37Rv, Ra, Erdman, M. bovis AN5, M. bovis BCG substrain Danish 1331 and M. africanum. No band was seen in this region in any other tested mycobacterial strain.

TABLE 13a. Interspecies analysis of the rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b genes by Southern blotting.

Species and strain rd1-orf2 rd1-orf3 rd1-orf4 rd1-orf5 rd1-orf8 rd1-orf9a rd1-orf9b

1. M. tub. H37Rv + + + + + + +

2. M. bovis + + + + N.D. + +

3. M. bovis BCG Danish 1331 + - - - N.D. - -

4. M. bovis BCG Japan + - - - N.D. - -

5. M. avium - - - - N.D. - -

6. M. kansasii - - - - N.D. - -

7. M. marinum + - + - N.D. - -

8. M. scrofulaceum + - - - N.D. - -

9. M. intracellulare - - - - N.D. - -

10. M. fortuitum - - - - N.D. - -

11. M. xenopi - - - - N.D. - -

12. M. szulgai + - - - N.D. - -

+, positive reaction; -, no reaction; N.D., not determined.

Positive results for rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b were only obtained when using genomic DNA from M. tuberculosis and M. bovis, and not from M. bovis BCG or the other mycobacteria analyzed, except rd1-orf4, which was also found in M. marinum.

Presence of cfp7a, cfp7b, cfp10a, cfp17, cfp20, cfp21, cfp22, cfp22a, cfp23, cfp25 and cfp25a in different mycobacterial species.

Southern blotting was carried out as described for rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b. The cfp7a, cfp7b, cfp10a, cfp17, cfp20, cfp21, cfp22, cfp22a, cfp23, cfp25 and cfp25a gene fragments were amplified by PCR from the recombinant pMCT6 plasmids encoding the individual genes. The primers used (the same as the primers used for cloning) are described in Examples 3, 3A and 3B. The results are summarized in Table 13b.

TABLE 13b. Interspecies analysis of the cfp7a, cfp7b, cfp10a, cfp17, cfp20, cfp21, cfp22, cfp22a, cfp23, cfp25, and cfp25a genes by Southern blotting.

Species and strain cfp7a cfp7b cfp10a cfp17 cfp20 cfp21 cfp22 cfp22a cfp23 cfp25 cfp25a

1. M. tub. H37Rv + + + + + + + + + + +

2. M. bovis + + + + + + + + + + +

3. M. bovis BCG Danish 1331 + + + + + N.D. + + + + +

4. M. bovis BCG Japan + + + + + + + + + + +

5. M. avium + N.D. - + - + + + + + -

6. M. kansasii - N.D. + - - - + - + - -

7. M. marinum + + - + + + + + + + +

8. M. scrofulaceum - - + - + + - + + + -

9. M. intracellulare + + - + - + + - + + -

10. M. fortuitum - N.D. - - - - - - + - -

11. M. xenopi + + + + + + + + + + +

12. M. szulgai + + - + + + + + + + +

+, positive reaction; -, no reaction; N.D., not determined.

LIST OF REFERENCES

Andersen, P. and Heron, I., 1993, J. Immunol. Methods 161: 29-39.

Andersen, A. B. et al., 1992, Infect. Immun. 60: 2317-2323.

Andersen, P., 1994, Infect. Immun. 62: 2536-44.

Andersen, P. et al., 1995, J. Immunol. 154: 3359-72.

Barkholt, V. and Jensen, A. L., 1989, Anal. Biochem. 177: 318-322.

Borodovsky, M. and McIninch, J., 1993, Computers Chem. 17: 123-133.

van Dyke, M. W. et al., 1992, Gene pp. 99-104.

Gosselin et al., 1992, J. Immunol. 149: 3477-3481.

Harboe, M. et al., 1996, Infect. Immun. 64: 16-22.

von Heijne, G., 1984, J. Mol. Biol. 173: 243-251.

Hochstrasser, D. F. et al., 1988, Anal. Biochem. 173: 424-435.

Kohler, G. and Milstein, C., 1975, Nature 256: 495-497.

Li, H. et al., 1993, Infect. Immun. 61: 1730-1734.

Lindblad, E. B. et al., 1997, Infect. Immun. 65: 623-629.

Mahairas, G. G. et al., 1996, J. Bacteriol. 178: 1274-1282.

Maniatis, T. et al., 1989, "Molecular cloning: a laboratory manual", 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

Nagai, S. et al., 1991, Infect. Immun. 59: 372-382.

Oettinger, T. and Andersen, A. B., 1994, Infect. Immun. 62: 2058-2064.

Ohara, N. et al., 1995, Scand. J. Immunol. 41: 233-442.

Pal, P. G. and Horwitz, M. A., 1992, Infect. Immun. 60: 4781-92.

Pearson, W. R. and Lipman, D. J., 1988, Proc. Natl. Acad. Sci. USA 85: 2444-2448.

Ploug, M. et al., 1989, Anal. Biochem. 181: 33-39.

Porath, J. et al., 1985, FEBS Lett. 185: 306-310.

Roberts, A. D. et al., 1995, Immunol. 85: 502-508.

Sørensen, A. L. et al., 1995, Infect. Immun. 63: 1710-1717.

Theisen, M. et al., 1995, Clinical and Diagnostic Laboratory Immunology 2: 30-34.

Valdes-Stauber, N. and Scherer, S., 1994, Appl. Environ. Microbiol. 60: 3809-3814.

Valdes-Stauber, N. and Scherer, S., 1996, Appl. Environ. Microbiol. 62: 1283-1286.

Williams, N., 1996, Science 272: 27.

Young, R. A. et al., 1985, Proc. Natl. Acad. Sci. USA 82: 2583-2587.

Claims

1. A substantially pure polypeptide fragment which

a) comprises an amino acid sequence selected from the sequences shown in SEQ ID NO: 175, 177, 179, 181, 183, and 185,

b) comprises a subsequence of the polypeptide fragment defined in a) which has a length of at least 6 amino acid residues, said subsequence being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex, or

c) comprises an amino acid sequence having a sequence identity with the polypeptide defined in a) or the subsequence defined in b) of at least 70% and at the same time being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex.
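For orientation, the 70% threshold in c) can be illustrated with a simple identity calculation over two pre-aligned sequences. This is a minimal sketch only: the function name `percent_identity`, the use of the alignment length as denominator, and the example peptides are all assumptions made for illustration, and any definition of sequence identity given elsewhere in the description takes precedence.

```python
def percent_identity(ref: str, other: str) -> float:
    """Percentage of aligned positions carrying the same residue.

    Both sequences must already be aligned to equal length; '-' marks
    a gap and never counts as a match.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to equal length")
    if not ref:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(ref, other) if a == b and a != "-")
    return 100.0 * matches / len(ref)


if __name__ == "__main__":
    # Hypothetical 10-residue peptides, not taken from any SEQ ID NO.
    reference = "MTEQQWNFAG"
    for candidate in ("MTEQQWNFSG", "MAEKKWDFAG"):
        identity = percent_identity(reference, candidate)
        print(candidate, f"{identity:.0f}%", "meets 70%" if identity >= 70 else "below 70%")
```

Identity alone does not satisfy c): the claim additionally requires immunological equivalence to the polypeptide defined in a).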

2. The polypeptide fragment according to claim 1 in essentially pure form.

3. The polypeptide fragment according to claim 1 or 2, which comprises an epitope for a T-helper cell.

4. The polypeptide fragment according to any of the preceding claims, which has a length of at least 7 amino acid residues, such as at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 22, at least 24, and at least 30 amino acid residues.

5. The polypeptide fragment according to any of the preceding claims which is free from any signal sequence.

6. The polypeptide fragment according to any of the preceding claims which

1) induces a release of IFN-γ from primed memory T-lymphocytes withdrawn from a mouse within 2 weeks of primary infection or within 4 days after the mouse has been re-challenge infected with mycobacteria belonging to the tuberculosis complex, the induction performed by the addition of the polypeptide to a suspension comprising about 200,000 spleen cells per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension, and/or

2) induces a release of IFN-γ of at least 300 pg above background level from about 1,000,000 human PBMC (peripheral blood mononuclear cells) per ml isolated from TB patients in the first phase of infection, or from healthy BCG vaccinated donors, or from healthy contacts to TB patients, the induction being performed by the addition of the polypeptide to a suspension comprising the about 1,000,000 PBMC per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension; and/or

3) induces an IFN-γ release from bovine PBMC derived from animals previously sensitized with mycobacteria belonging to the tuberculosis complex, said release being at least two times the release observed from bovine PBMC derived from animals not previously sensitized with mycobacteria belonging to the tuberculosis complex.

7. A polypeptide fragment according to any of the preceding claims, wherein the sequence identity in c) is at least 80%, such as at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and at least 99.5%.

8. A fusion polypeptide comprising at least one polypeptide fragment according to any of the preceding claims and at least one fusion partner.

9. A fusion polypeptide according to claim 8, wherein the fusion partner is selected from the group consisting of a polypeptide fragment as defined in any of claims 1-8, and another polypeptide fragment derived from a bacterium belonging to the tuberculosis complex, such as ESAT-6 or at least one T-cell epitope thereof, MPB64 or at least one T-cell epitope thereof, MPT64 or at least one T-cell epitope thereof, and MPB59 or at least one T-cell epitope thereof.

10. A fusion polypeptide fragment which comprises

1) a first amino acid sequence including at least one stretch of amino acids constituting a T-cell epitope derived from the M. tuberculosis protein ESAT-6, and a second amino acid sequence including at least one T-cell epitope derived from a M. tuberculosis protein different from ESAT-6 and/or including a stretch of amino acids which protects the first amino acid sequence from in vivo degradation or post-translational processing; or

2) a first amino acid sequence including at least one stretch of amino acids constituting a T-cell epitope derived from the M. tuberculosis protein MPT59, and a second amino acid sequence including at least one T-cell epitope derived from a M. tuberculosis protein different from MPT59 and/or including a stretch of amino acids which protects the first amino acid sequence from in vivo degradation or post-translational processing.

11. A fusion polypeptide fragment according to claim 10, wherein the first amino acid sequence is situated C-terminally to the second amino acid sequence.

12. A fusion polypeptide fragment according to claim 10, wherein the first amino acid sequence is situated N-terminally to the second amino acid sequence.

13. A fusion polypeptide fragment according to any of claims 10-12, wherein the at least one T-cell epitope included in the second amino acid sequence is derived from a M. tuberculosis polypeptide selected from the group consisting of a polypeptide fragment according to any of claims 1-55, DnaK, GroEL, urease, glutamine synthetase, the proline rich complex, L-alanine dehydrogenase, phosphate binding protein, Ag 85 complex, HBHA (heparin binding hemagglutinin), MPT51, MPT64, superoxide dismutase, 19 kDa lipoprotein, α-crystallin, GroES, MPT59 when the first T-cell epitope is derived from ESAT-6, and ESAT-6 when the first T-cell epitope is derived from MPT59.

14. A fusion polypeptide fragment according to any of claims 10-13, wherein the first and second T-cell epitopes each have a sequence identity of at least 70% with the natively occurring sequence in the proteins from which they are derived.

15. A fusion polypeptide according to any of claims 10-14, wherein the first and/or second amino acid sequence have a sequence identity of at least 70% with the protein from which they are derived.

16. A fusion polypeptide fragment according to any of claims 10-15, wherein the first amino acid sequence is the amino acid sequence of ESAT-6 or of MPT59 and/or the second amino acid sequence is the amino acid sequence of a M. tuberculosis polypeptide selected from the group consisting of a polypeptide fragment according to any of claims 1-7, DnaK, GroEL, urease, glutamine synthetase, the proline rich complex, L-alanine dehydrogenase, phosphate binding protein, Ag 85 complex, HBHA (heparin binding hemagglutinin), MPT51, MPT64, superoxide dismutase, 19 kDa lipoprotein, α-crystallin, GroES, ESAT-6 when the first amino acid sequence is that of MPT59, and MPT59 when the first amino acid sequence is that of ESAT-6.

17. A fusion polypeptide fragment according to any of claims 10-16, wherein no linkers are introduced between the two amino acid sequences.

18. A polypeptide according to any of the preceding claims which is lipidated so as to allow a self-adjuvating effect of the polypeptide.

19. A substantially pure polypeptide according to any of claims 1-18 for use as a pharmaceutical.

20. The use of a substantially pure polypeptide according to any of claims 1-19 in the preparation of a pharmaceutical composition for the diagnosis of tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.

21. The use of a substantially pure polypeptide according to any of claims 1-19 in the preparation of a pharmaceutical composition for the vaccination against tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.

22. A nucleic acid fragment in isolated form which

1) comprises a nucleic acid sequence which encodes a polypeptide as defined in any of claims 1-18, or comprises a nucleic acid sequence complementary thereto,

2) has a length of at least 10 nucleotides and hybridizes readily under stringent hybridization conditions with a nucleic acid fragment which has a nucleotide sequence selected from

SEQ ID NO: 174 or a sequence complementary thereto,

SEQ ID NO: 176 or a sequence complementary thereto,

SEQ ID NO: 178 or a sequence complementary thereto,

SEQ ID NO: 180 or a sequence complementary thereto,

SEQ ID NO: 182 or a sequence complementary thereto, and

SEQ ID NO: 184 or a sequence complementary thereto.

23. A nucleic acid fragment according to claim 22, which is a DNA fragment.

24. A nucleic acid fragment according to claims 22 or 23 for use as a pharmaceutical.

25. The use of a nucleic acid fragment according to any of claims 22-24 in the preparation of a pharmaceutical composition for the vaccination against tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.

26. The use of a nucleic acid fragment according to any of claims 22-24 in the preparation of a pharmaceutical composition for the diagnosis of tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.

27. A vaccine comprising a nucleic acid fragment according to any of claims 22-24, the vaccine effecting in vivo expression of antigen by an animal, including a human being, to whom the vaccine has been administered, the amount of expressed antigen being effective to confer substantially increased resistance to infections with mycobacteria of the tuberculosis complex in an animal, including a human being.

28. An immunologic composition comprising a polypeptide according to any of claims 1-19.

29. An immunologic composition according to claim 28, which further comprises an immunologically and pharmaceutically acceptable carrier, vehicle or adjuvant.

30. An immunologic composition according to claim 29, wherein the carrier is selected from the group consisting of a polymer to which the polypeptide(s) is/are bound by hydrophobic non-covalent interaction, such as a plastic, e.g. polystyrene, a polymer to which the polypeptide(s) is/are covalently bound, such as a polysaccharide, and a polypeptide, e.g. bovine serum albumin, ovalbumin or keyhole limpet hemocyanin; the vehicle is selected from the group consisting of a diluent and a suspending agent; and the adjuvant is selected from the group consisting of dimethyldioctadecylammonium bromide (DDA), Quil A, poly I:C, Freund's incomplete adjuvant, IFN-γ, IL-2, IL-12, monophosphoryl lipid A (MPL), and muramyl dipeptide (MDP).

31. An immunologic composition according to any of claims 28-30, comprising at least two different polypeptide fragments, each different polypeptide fragment being a polypeptide according to any of claims 1-19.

32. An immunologic composition according to claim 31, comprising 3-20 different polypeptide fragments, each different polypeptide fragment being according to any of claims 1-19.

33. An immunologic composition according to any of claims 28-32, which is in the form of a vaccine.

34. An immunologic composition according to any of claims 28-32, which is in the form of a skin test reagent.

35. A vaccine for immunizing an animal, including a human being, against tuberculosis caused by mycobacteria belonging to the tuberculosis complex, comprising as the effective component a non-pathogenic microorganism, wherein at least one copy of a DNA fragment comprising a DNA sequence encoding a polypeptide according to any of claims 1-19 has been incorporated into the genome of the microorganism in a manner allowing the microorganism to express and optionally secrete the polypeptide.

36. A vaccine according to claim 35, wherein the microorganism is a bacterium.

37. A vaccine according to claim 36, wherein the bacterium is selected from the group consisting of the genera Mycobacterium, Salmonella, Pseudomonas and Escherichia.

38. A vaccine according to claim 37, wherein the microorganism is Mycobacterium bovis BCG, such as Mycobacterium bovis BCG strain: Danish 1331.

39. A vaccine according to any of claims 35-38, wherein at least 2 copies of a DNA fragment encoding a polypeptide according to any of claims 1-12 are incorporated into the genome of the microorganism.

40. A vaccine according to claim 39, wherein the number of copies is at least 5.

41. A replicable expression vector which comprises a nucleic acid fragment according to any of claims 22-24.

42. A vector according to claim 41, which is selected from the group consisting of a virus, a bacteriophage, a plasmid, a cosmid, and a microchromosome.

43. A transformed cell harbouring at least one vector according to claim 41 or 42.

44. A transformed cell according to claim 43, which is a bacterium belonging to the tuberculosis complex, such as a M. bovis BCG cell.

45. A transformed cell according to claim 43 or 44, which expresses a polypeptide according to any of claims 1-19.

46. A method for producing a polypeptide according to any of claims 1-19, comprising

inserting a nucleic acid fragment according to any of claims 15-17 into a vector which is able to replicate in a host cell, introducing the resulting recombinant vector into the host cell, culturing the host cell in a culture medium under conditions sufficient to effect expression of the polypeptide, and recovering the polypeptide from the host cell or culture medium; or

isolating the polypeptide from a short-term culture filtrate; or

isolating the polypeptide from whole mycobacteria of the tuberculosis complex or from lysates or fractions thereof, e.g. cell wall containing fractions; or

synthesizing the polypeptide by solid or liquid phase peptide synthesis.

47. A method for producing an immunologic composition according to any of claims 28-34 comprising

preparing, synthesizing or isolating a polypeptide according to any of claims 1-19, and

solubilizing or dispersing the polypeptide in a medium for a vaccine, and

optionally adding other M. tuberculosis antigens and/or a carrier, vehicle and/or adjuvant substance, or

cultivating a cell according to any of claims 41-45, and

transferring the cells to a medium for a vaccine, and

optionally adding a carrier, vehicle and/or adjuvant substance.

48. A method of diagnosing tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis in an animal, including a human being, comprising intradermally injecting, in the animal, a polypeptide according to any of claims 1-19 or an immunologic composition according to any of claims 28-34, a positive skin response at the location of injection being indicative of the animal having tuberculosis, and a negative skin response at the location of injection being indicative of the animal not having tuberculosis.

49. A method for immunising an animal, including a human being, against tuberculosis caused by mycobacteria belonging to the tuberculosis complex, comprising administering to the animal the polypeptide according to any of claims 1-19, the immunologic composition according to any of claims 28-34, or the vaccine according to any of claims 35-40.

50. A method according to claim 49, wherein the polypeptide, immunologic composition, or vaccine is administered by the parenteral (such as intravenous and intraarterial), intraperitoneal, intramuscular, subcutaneous, intradermal, oral, buccal, sublingual, nasal, rectal or transdermal route.

51. A method for diagnosing ongoing or previous sensitization in an animal or a human being with bacteria belonging to the tuberculosis complex, the method comprising providing a blood sample from the animal or human being, and contacting the sample from the animal with the polypeptide according to any of claims 1-19, a significant release into the extracellular phase of at least one cytokine by mononuclear cells in the blood sample being indicative of the animal being sensitized.

52. A composition for diagnosing tuberculosis in an animal, including a human being, comprising a polypeptide according to any of claims 1-19, or a nucleic acid fragment according to any of claims 22-24, optionally in combination with a means for detection.

53. A monoclonal or polyclonal antibody, which is specifically reacting with a polypeptide according to any of claims 1-19 in an immunoassay, or a specific binding fragment of said antibody.

54. Use of CFP7A or CFP30A, or a T-cell epitope thereof, for the induction of a strong immune response in a mammal including a human being.

55. Use of CFP7A, or a T-cell epitope thereof, for the induction of a high protective immune response in a mammal including a human being.

56. Use of CFP7B, CFP19, or MPT59-ESAT6, or a T-cell epitope thereof, for the diagnosis of tuberculosis in a mammal including a human being by performing a DTH type skin test.

57. Use of CFP27, CFP30A, RD1-ORF2, RD1-ORF3, RD1-ORF5, MPT59-ESAT6, ESAT6-MPT59, CFP10A, CFP16, CFP19, CFP23, CFP25A, CFP30B, CFP7B, or a T-cell epitope thereof, for the preparation of an immunological composition with a wide genetic recognition.

58. Use of CFP27, CFP30A, RD1-ORF2, RD1-ORF5, MPT59-ESAT6, ESAT6-MPT59, CFP10A, CFP19, CFP23, CFP25A, CFP30B, or a T-cell epitope thereof, for the preparation of a vaccine such as a subunit vaccine.