AU750173B2 - Nucleic acid fragments and polypeptide fragments derived from M. tuberculosis - Google Patents

Publication number
AU750173B2
Authority
AU
Australia
Prior art keywords
polypeptide
tuberculosis
seq
sequence
ala
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
AU94338/98A
Other versions
AU9433898A (en
Inventor
Peter Andersen
Rikke Skjot
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Statens Serum Institut SSI
Original Assignee
Statens Serum Institut SSI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from PCT/DK1998/000132 external-priority patent/WO1998044119A1/en
Application filed by Statens Serum Institut SSI filed Critical Statens Serum Institut SSI
Publication of AU9433898A publication Critical patent/AU9433898A/en
Application granted granted Critical
Publication of AU750173B2 publication Critical patent/AU750173B2/en
Anticipated expiration legal-status Critical
Expired legal-status Critical Current

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/35 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Mycobacteriaceae (F)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K2039/51 Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
  • Peptides Or Proteins (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Description

WO 99/24577 PCT/DK98/00438
NUCLEIC ACID FRAGMENTS AND POLYPEPTIDE FRAGMENTS DERIVED FROM M. TUBERCULOSIS
FIELD OF THE INVENTION
The present invention relates to a number of immunologically active, novel polypeptide fragments derived from Mycobacterium tuberculosis, vaccines and other immunologic compositions containing the fragments as immunogenic components, and methods of production and use of the polypeptides. The invention also relates to novel nucleic acid fragments derived from M. tuberculosis which are useful in the preparation of the polypeptide fragments of the invention or in the diagnosis of infection with M. tuberculosis.
BACKGROUND OF THE INVENTION
Human tuberculosis (hereinafter designated TB), caused by Mycobacterium tuberculosis, is a severe global health problem responsible for approximately 3 million deaths annually, according to the WHO. The worldwide incidence of new TB cases had been falling progressively for the last decade, but recent years have markedly changed this trend due to the advent of AIDS and the appearance of multidrug-resistant strains of M. tuberculosis.
The only vaccine presently available for clinical use is BCG, a vaccine whose efficacy remains a matter of controversy. BCG generally induces a high level of acquired resistance in animal models of TB, but several human trials in developing countries have failed to demonstrate significant protection. Notably, BCG is not approved by the FDA for use in the United States.
This makes the development of a new and improved vaccine against TB an urgent matter which has been given a very high priority by the WHO. Many attempts to define protective mycobacterial substances have been made, and from 1950 to 1970 several investigators reported an increased resistance after experimental vaccination.
However, the demonstration of a specific long-term protective immune response with the potency of BCG has not yet been achieved by administration of soluble proteins or cell wall fragments, although progress is currently being made by relying on polypeptides derived from short-term culture filtrate, cf. the discussion below.
Immunity to M. tuberculosis is characterized by three basic features: i) living bacilli efficiently induce a protective immune response, in contrast to killed preparations; ii) specifically sensitized T lymphocytes mediate this protection; iii) the most important mediator molecule seems to be interferon gamma (IFN-γ).
Short-term culture filtrate (ST-CF) is a complex mixture of proteins released from M. tuberculosis during the first few days of growth in a liquid medium (Andersen et al., 1991). Culture filtrates have been suggested to hold protective antigens recognized by the host in the first phase of TB infection (Andersen et al., 1991; Orme et al., 1993). Recent data from several laboratories have demonstrated that experimental subunit vaccines based on culture filtrate antigens can provide high levels of acquired resistance to TB (Pal and Horwitz, 1992; Roberts et al., 1995; Andersen, 1994; Lindblad et al., 1997). Culture filtrates are, however, complex protein mixtures and until now very limited information has been available on the molecules responsible for this protective immune response. In this regard, only two culture filtrate antigens have been described as involved in protective immunity, the low mass antigen ESAT-6 (Andersen et al., 1995 and EP-A-0 706 571) and the 31 kDa molecule Ag85B (EP-0 432 203).
There is therefore a need for the identification of further antigens involved in the induction of protective immunity against TB in order to eventually produce an effective subunit vaccine.
Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is solely for the purpose of providing a context for the present invention. It is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention as it existed in Australia before the priority date of each claim of this application.
OBJECT OF THE INVENTION
It is an object of the invention to provide novel antigens which are effective as components in a subunit vaccine against TB or which are useful as components in diagnostic compositions for the detection of infection with mycobacteria, especially virulence-associated mycobacteria. The novel antigens may also be important drug targets.
SUMMARY OF THE INVENTION
The present invention is i.a. based on the identification and characterization of a number of previously uncharacterized culture filtrate antigens from M. tuberculosis. In animal models of TB, T cells mediating immunity are focused predominantly on antigens in the regions 6-12 and 17-30 kDa of ST-CF. In the present invention 6 antigens in the low molecular weight region (ORF7-1, ORF7-2, ORF11-1, ORF11-2, ORF11-3, ORF11-4) have been identified.
Furthermore immunological and biological data on several important antigens are presented.
The encoding genes for 8 antigens have been determined. The panel holds antigens with potential for vaccine purposes as well as for diagnostic purposes, since the antigens are all secreted by metabolizing mycobacteria.
The following table lists the antigens of the invention by the names used herein as well as by reference to relevant SEQ ID NOs of N-terminal sequences, full amino acid sequences and sequences of DNA encoding the antigens:
Antigen: CFP7, CFP7A, CFP7B, CFP8A, CFP8B, CFP9, CFP10A, CFP11, CFP16, CFP17, CFP19, CFP19A, CFP19B, CFP21, CFP22, CFP22A, CFP23, CFP23A, CFP23B, CFP25S, CFP27, CFP28, CFP29, MPT51, CWP32, RD1-ORF8, RD1-ORF2, RD1-ORF9B, RD1-ORF3, RD1-ORF9A, RD1-ORF4, RD1-ORF5, MPT59-ESAT6, ESAT6-MPT59, ORF7-1, ORF7-2, ORF11-1, ORF11-2, ORF11-3, ORF11-4
N-terminal sequence SEQ ID NO: 81 168 73 74 169 170 79 17 82 18 19 20 83 76 21 78 84 22 23 85 171 86
Nucleotide sequence SEQ ID NO: 1 47 146 148 150 3 140 142 63 5 49 51 7 9 11 53 55
Amino acid sequence SEQ ID NO: 2 48 147 149 151 4 141 143 64 6 52 13 65 57 15 59 144 61 41 152 67 71 69 87 93 89 91 8 12 54 56 14 66 58 16 145 62 42 153 68 72 88 94 92 172 174 176 178 180 182 184 175 177 179 181 183 185
It is well-known in the art that T-cell epitopes are responsible for the elicitation of the acquired immunity against TB, whereas B-cell epitopes are without any significant influence on acquired immunity and recognition of mycobacteria in vivo. Since such T-cell epitopes are linear and are known to have a minimum length of 6 amino acid residues, the present invention is especially concerned with the identification and utilisation of such T-cell epitopes.
Hence, in its broadest aspect the invention relates to a substantially pure polypeptide fragment which
a) comprises an amino acid sequence selected from the sequences shown in SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, and 185,
b) comprises a subsequence of the polypeptide fragment defined in a) which has a length of at least 6 amino acid residues, said subsequence being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex, or
c) comprises an amino acid sequence having a sequence identity with the polypeptide defined in a) or the subsequence defined in b) of at least 70% and at the same time being immunologically equivalent to the polypeptide defined in a) with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex,
with the proviso that
i) the polypeptide fragment is in essentially pure form when consisting of the amino acid sequence 1-96 of SEQ ID NO: 2 or when consisting of the amino acid sequence 87-108 of SEQ ID NO: 4 fused to β-galactosidase,
ii) the degree of sequence identity in c) is at least 95% when the polypeptide comprises a homologue of a polypeptide which has the amino acid sequence SEQ ID NO: 12 or a subsequence thereof as defined in b), and
iii) the polypeptide fragment contains a threonine residue corresponding to position 213 in SEQ ID NO: 42 when comprising an amino acid sequence of at least 6 amino acids in SEQ ID NO: 42.
Other parts of the invention pertain to the DNA fragments encoding a polypeptide with the above definition as well as to DNA fragments useful for determining the presence of DNA encoding such polypeptides.
DETAILED DISCLOSURE OF THE INVENTION
In the present specification and claims, the term "polypeptide fragment" denotes both short peptides with a length of at least two amino acid residues and at most 10 amino acid residues, oligopeptides (11-100 amino acid residues), and longer peptides (the usual interpretation of "polypeptide", i.e. more than 100 amino acid residues in length) as well as proteins (the functional entity comprising at least one peptide, oligopeptide, or polypeptide which may be chemically modified by being glycosylated, by being lipidated, or by comprising prosthetic groups). The definition of polypeptides also comprises native forms of peptides/proteins in mycobacteria as well as recombinant proteins or peptides in any type of expression vectors transforming any kind of host, and also chemically synthesized peptides.
In the present context the term "substantially pure polypeptide fragment" means a polypeptide preparation which contains at most 5% by weight of other polypeptide material with which it is natively associated (lower percentages of other polypeptide material are preferred, e.g. at most ½%). It is preferred that the substantially pure polypeptide is at least 96% pure, i.e. that the polypeptide constitutes at least 96% by weight of total polypeptide material present in the preparation, and higher percentages are preferred, such as at least 97%, at least 98%, at least 99%, at least 99,25%, at least 99,5%, and at least 99,75%. It is especially preferred that the polypeptide fragment is in "essentially pure form", i.e. that the polypeptide fragment is essentially free of any other antigen with which it is natively associated, i.e. free of any other antigen from bacteria belonging to the tuberculosis complex. This can be accomplished by preparing the polypeptide fragment by means of recombinant methods in a non-mycobacterial host cell as will be described in detail below, or by synthesizing the polypeptide fragment by the well-known methods of solid or liquid phase peptide synthesis, e.g. by the method described by Merrifield or variations thereof.
The term "subsequence" when used in connection with a polypeptide of the invention having a SEQ ID NO selected from 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, and 185 denotes any continuous stretch of at least 6 amino acid residues taken from the M. tuberculosis derived polypeptides in SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, and 185 and being immunologically equivalent thereto with respect to the ability of conferring increased resistance to infections with bacteria belonging to the tuberculosis complex. Thus, also included is a polypeptide from different sources, such as other bacteria or even from eukaryotic cells.
By an "immunologically equivalent" polypeptide is herein meant that the polypeptide, when formulated in a vaccine or a diagnostic agent (together with a pharmaceutically acceptable carrier or vehicle and optionally an adjuvant), will
I) confer, upon administration (either alone or as an immunologically active constituent together with other antigens), an acquired increased specific resistance in a mouse and/or in a guinea pig and/or in a primate such as a human being against infections with bacteria belonging to the tuberculosis complex which is at least 20% of the acquired increased resistance conferred by Mycobacterium bovis BCG and also at least of the acquired increased resistance conferred by the parent polypeptide comprising SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 (said parent polypeptide having substantially the same relative location and pattern in a 2DE gel prepared as the 2DE gel shown in Fig. 6, cf. the examples), the acquired increased resistance being assessed by the observed reduction in mycobacterial counts from spleen, lung or other organ homogenates isolated from the mouse or guinea pig receiving a challenge infection with a virulent strain of M. tuberculosis, or, in a primate such as a human being, being assessed by determining the protection against development of clinical tuberculosis in a vaccinated group versus that observed in a control group receiving a placebo or BCG (preferably the increased resistance is higher and corresponds to at least 50% of the protective immune response elicited by M. bovis BCG, such as at least 60%, or even more preferred to at least 80% of the protective immune response elicited by M. bovis BCG, such as at least 90%; in some cases it is expected that the increased resistance will exceed that conferred by M. bovis BCG, and hence it is preferred that the resistance will be at least 100%, such as at least 110% of said increased resistance); and/or
II) elicit a diagnostically significant immune response in a mammal indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex; this diagnostically significant immune response can be in the form of a delayed type hypersensitivity reaction which can e.g. be determined by a skin test, or can be in the form of IFN-γ release determined e.g. by an IFN-γ assay as described in detail below.
A diagnostically significant response in a skin test setup will be a reaction which gives rise to a skin reaction which is at least 5 mm in diameter and which is at least 65% (preferably at least 75% such as at the least 85%) of the skin reaction (assessed as the skin reaction diameter) elicited by the parent polypeptide comprising SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185.
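The skin-test criterion stated above combines an absolute threshold (at least 5 mm) with a relative one (at least 65% of the reaction elicited by the parent polypeptide). A minimal Python sketch of that check follows; the function and parameter names are illustrative assumptions and do not appear in the specification.

```python
def skin_test_is_significant(diameter_mm, parent_diameter_mm,
                             min_fraction=0.65, min_diameter_mm=5.0):
    """Return True if a skin reaction meets both criteria from the text.

    diameter_mm        -- reaction diameter elicited by the tested polypeptide
    parent_diameter_mm -- reaction diameter elicited by the parent polypeptide
    min_fraction       -- 0.65 by default; 0.75 or 0.85 for the preferred levels
    (All names are hypothetical, chosen for this illustration only.)
    """
    if parent_diameter_mm <= 0:
        raise ValueError("parent reaction diameter must be positive")
    large_enough = diameter_mm >= min_diameter_mm
    close_to_parent = diameter_mm >= min_fraction * parent_diameter_mm
    return large_enough and close_to_parent
```

For example, a 7 mm reaction against a 10 mm parent reaction passes at the 65% level but not at the preferred 75% level.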
The ability of the polypeptide fragment to confer increased immunity may thus be assessed by measuring in an experimental animal, e.g. a mouse or a guinea pig, the reduction in mycobacterial counts from the spleen, lung or other organ homogenates isolated from the experimental animal which has received a challenge infection with a virulent strain of mycobacteria belonging to the tuberculosis complex after previously having been immunized with the polypeptide, as compared to the mycobacterial counts in a control group of experimental animals infected with the same virulent strain, which experimental animals have not previously been immunized against tuberculosis. The comparison of the mycobacterial counts may also be carried out with mycobacterial counts from a group of experimental animals receiving a challenge infection with the same virulent strain after having been immunized with Mycobacterium bovis BCG.
The mycobacterial counts in homogenates from the experimental animals immunized with a polypeptide fragment according to the present invention must at the most be times the counts in the mice or guinea pigs immunized with Mycobacterium bovis BCG, such as at the most 3 times the counts, and preferably at the most 2 times the counts.
A more relevant assessment of the ability of the polypeptide fragment of the invention to confer increased resistance is to compare the incidence of clinical tuberculosis in two groups of individuals (e.g. humans or other primates) where one group receives a vaccine as described herein which contains an antigen of the invention and the other group receives either a placebo or another known TB vaccine (e.g. BCG). In such a setup, the antigen of the invention should give rise to a protective immunity which is significantly higher than the one provided by the administration of the placebo (as determined by statistical methods known to the skilled artisan).
In the context of the present application, the term "wide genetically" should be understood in a meaning of at least two strains. That is, if a polypeptide is recognised by at least two different strains, it is considered to have a wide genetically recognition.
A subunit vaccine component is defined as a reagent which stimulates protective immunity in an animal model of infection with an organism of the M. tuberculosis complex, when given prior to infection, and which also generates a significant immune response in human volunteers.
The "tuberculosis-complex" has its usual meaning, i.e. the complex of mycobacteria causing TB which are Mycobacterium tuberculosis, Mycobacterium bovis, Mycobacterium bovis BCG, and Mycobacterium africanum.
In the present context the term "metabolizing mycobacteria" means live mycobacteria that are multiplying logarithmically and releasing polypeptides into the culture medium wherein they are cultured.
The term "sequence identity" indicates a quantitative measure of the degree of homology between two amino acid sequences or between two nucleotide sequences of equal length or, if not of equal length, aligned to best possible fit. The sequence identity can be calculated as $\frac{(N_{ref} - N_{dif}) \cdot 100}{N_{ref}}$, wherein $N_{dif}$ is the total number of non-identical residues in the two sequences when aligned and wherein $N_{ref}$ is the number of residues in one of the sequences. Hence, the DNA sequence AGTCAGTC will have a sequence identity of 75% with the sequence AATCAATC ($N_{dif} = 2$ and $N_{ref} = 8$). A gap is counted as non-identity of the specific residue(s), i.e. the DNA sequence AGTGTC will have a sequence identity of 75% with the DNA sequence AGTCAGTC ($N_{dif} = 2$ and $N_{ref} = 8$). Sequence identity can alternatively be calculated by the BLASTP program (Pearson W.R. and D.J. Lipman (1988) PNAS USA 85:2444-2448) in the EMBL database (www.ncbi.nlm.gov/cgi-bin/BLAST). Generally, the default settings with respect to e.g. "scoring matrix" and "gap penalty" will be used for alignment.
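The sequence identity calculation defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are supplied already aligned, gaps are written as "-", and the function name is hypothetical. The reference count is taken as the number of residues in the first sequence, and each gap position counts as one non-identical residue, as the text prescribes.

```python
def sequence_identity(reference, other, gap="-"):
    """Percent identity of two pre-aligned sequences of equal aligned length.

    n_ref is the number of residues (non-gap characters) in `reference`;
    n_dif is the number of aligned positions where the two sequences differ,
    so a gap opposite a residue counts as a non-identity.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    n_ref = sum(1 for r in reference if r != gap)
    if n_ref == 0:
        raise ValueError("reference contains no residues")
    n_dif = sum(1 for r, o in zip(reference, other) if r != o)
    return (n_ref - n_dif) * 100 / n_ref


# The two worked examples from the text:
print(sequence_identity("AGTCAGTC", "AATCAATC"))  # 75.0
print(sequence_identity("AGTCAGTC", "AGT--GTC"))  # 75.0 (AGTGTC with a two-residue gap)
```

The alignment itself (placing the gap so that AGTGTC fits AGTCAGTC best) is assumed to have been done beforehand, for instance by an alignment program run with its default settings.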
The sequence identity is used here to illustrate the degree of identity between the amino acid sequence of a given polypeptide and the amino acid sequence shown in SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185. The amino acid sequence to be compared with the amino acid sequence shown in SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 may be deduced from a DNA sequence, e.g. obtained by hybridization as defined below, or may be obtained by conventional amino acid sequencing methods. The sequence identity is preferably determined on the amino acid sequence of a mature polypeptide, i.e. without taking any leader sequence into consideration.
As appears from the above disclosure, polypeptides which are not identical to the polypeptides having SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 are embraced by the present invention. The invention allows for minor variations which do not have an adverse effect on immunogenicity compared to the parent sequences and which may give interesting and useful novel binding properties or biological functions and immunogenicities etc.
Each polypeptide fragment may thus be characterized by specific amino acid and nucleic acid sequences. It will be understood that such sequences include analogues and variants produced by recombinant methods wherein such nucleic acid and polypeptide sequences have been modified by substitution, insertion, addition and/or deletion of one or more nucleotides in said nucleic acid sequences to cause the substitution, insertion, addition or deletion of one or more amino acid residues in the recombinant polypeptide. When the term DNA is used in the following, it should be understood that for the number of purposes where DNA can be substituted with RNA, the term DNA should be read to include RNA embodiments which will be apparent for the man skilled in the art. For the purposes of hybridization, PNA or LNA may be used instead of DNA. PNA has been shown to exhibit a very dynamic hybridization profile (PNA is described in Nielsen P E et al., 1991, Science 254: 1497-1500). LNA (Locked Nucleic Acids) is a recently introduced oligonucleotide analogue containing bicyclo nucleoside monomers (Koshkin et al., 1998, 54, 3607-3630;Nielsen, N.K. et al. J.Am.Chem.Soc 1998, 120, 5458-5463).
In both immunodiagnostics and vaccine preparation, it is often possible and practical to prepare antigens from segments of a known immunogenic protein or polypeptide.
Certain epitopic regions may be used to produce responses similar to those produced by the entire antigenic polypeptide. Potential antigenic or immunogenic regions may be identified by any of a number of approaches, e.g. Jameson-Wolf or Kyte-Doolittle antigenicity analyses or Hopp and Woods (1981) hydrophobicity analysis (see e.g. Jameson and Wolf, 1988; Kyte and Doolittle, 1982; or U.S. Patent No. 4,554,101). Hydrophobicity analysis assigns average hydrophilicity values to each amino acid residue; from these values average hydrophilicities can be calculated and regions of greatest hydrophilicity determined. Using one or more of these methods, regions of predicted antigenicity may be derived from the amino acid sequence assigned to the polypeptides of the invention.
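The hydrophilicity averaging described above can be illustrated with a sliding-window sketch in Python. The per-residue values below are the scale as commonly tabulated from Hopp and Woods (1981) and should be checked against the original publication before any real use; the six-residue window, the function names and the example sequence are assumptions made for this illustration, not part of the specification.

```python
# Hydrophilicity values as commonly tabulated from Hopp and Woods (1981);
# treat them as an assumption and verify against the original publication.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "H": -0.5, "A": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(sequence, window=6):
    """Average hydrophilicity of every window of `window` residues."""
    values = [HOPP_WOODS[res] for res in sequence.upper()]
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and the sequence length")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def most_hydrophilic_window(sequence, window=6):
    """Return (start index, peptide, average) of the most hydrophilic window."""
    profile = hydrophilicity_profile(sequence, window)
    start = max(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]
```

A call such as `most_hydrophilic_window("LLLLLLKDEKRDLLLLLL")` picks out the charged stretch in the middle of this made-up sequence; regions found this way are only predictions of antigenicity and would still need experimental confirmation.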
Alternatively, in order to identify relevant T-cell epitopes which are recognized during an immune response, it is also possible to use a "brute force" method: Since T-cell epitopes are linear, deletion mutants of polypeptides having SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 will, if constructed systematically, reveal what regions of the polypeptides are essential in immune recognition, e.g. by subjecting these deletion mutants to the IFN-γ assay described herein. Another method utilises overlapping oligomers (preferably synthetic, having a length of e.g. 20 amino acid residues) derived from polypeptides having SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185. Some of these will give a positive response in the IFN-γ assay whereas others will not.
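The overlapping-oligomer approach mentioned above amounts to tiling a polypeptide sequence with fixed-length peptides. The Python sketch below shows one way to enumerate such a tiling; the 20-residue length follows the example in the text, whereas the 10-residue overlap, the function name and the handling of the C-terminal end are assumptions made for illustration.

```python
def overlapping_peptides(sequence, length=20, overlap=10):
    """Tile `sequence` with peptides of `length` residues sharing `overlap`.

    A final peptide anchored at the C-terminus is added when the regular
    tiling would otherwise leave the last residues uncovered.
    """
    if length <= 0 or not 0 <= overlap < length:
        raise ValueError("need length > 0 and 0 <= overlap < length")
    if len(sequence) <= length:
        return [sequence]
    step = length - overlap
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


# Short illustrative run with a toy sequence and small peptides:
print(overlapping_peptides("ABCDEFGHIJK", length=4, overlap=2))
# ['ABCD', 'CDEF', 'EFGH', 'GHIJ', 'HIJK']
```

Each peptide in the resulting panel would then be tested individually, e.g. in the IFN-γ assay, and the positive peptides delimit the regions that carry T-cell epitopes.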
In a preferred embodiment of the invention, the polypeptide fragment of the invention comprises an epitope for a T-helper cell.
Although the minimum length of a T-cell epitope has been shown to be at least 6 amino acids, it is normal that such epitopes are constituted of longer stretches of amino acids. Hence it is preferred that the polypeptide fragment of the invention has a length of at least 7 amino acid residues, such as at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 22, at least 24, and at least 30 amino acid residues.
As will appear from the examples, a number of the polypeptides of the invention are natively translation products which include a leader sequence (or other short peptide sequences), whereas the product which can be isolated from short-term culture filtrates from bacteria belonging to the tuberculosis complex are free of these sequences. Although it may in some applications be advantageous to produce these polypeptides recombinantly and in this connection facilitate export of the polypeptides from the host cell by including information encoding the leader sequence in the gene for the polypeptide, it is more often preferred to either substitute the leader sequence with one which has been shown to be superior in the host system for effecting export, or to totally omit the leader sequence when producing the polypeptide by peptide synthesis. Hence, a preferred embodiment of the invention is a polypeptide which is free from amino acid residues -30 to -1 in SEQ ID NO: 6 and/or -32 to -1 in SEQ ID NO: 10 and/or -8 to -1 in SEQ ID NO: 12 and/or -32 to -1 in SEQ ID NO: 14 and/or -33 to -1 in SEQ ID NO: 42 and/or -38 to -1 in SEQ ID NO: 52 and/or -33 to -1 in SEQ ID NO: 56 and/or -56 to -1 in SEQ ID NO: 58 and/or -28 to -1 in SEQ ID NO: 151.
In another preferred embodiment, the polypeptide fragment of the invention is free from any signal sequence; this is especially interesting when the polypeptide fragment is produced synthetically but even when the polypeptide fragments are produced recombinantly it is normally acceptable that they are not exported by the host cell to the periplasm or the extracellular space; the polypeptide fragments can be recovered by traditional methods (cf. the discussion below) from the cytoplasm after disruption of the host cells, and if there is need for refolding of the polypeptide fragments, general refolding schemes can be employed, cf. e.g. the disclosure in WO 94/18227 where such a general applicable refolding method is described.
A suitable assay for the potential utility of a given polypeptide fragment derived from SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185 is to assess the ability of the polypeptide fragment to effect IFN-γ release from primed memory T-lymphocytes. Polypeptide fragments which have this capability are according to the invention especially interesting embodiments of the invention: It is contemplated that polypeptide fragments which stimulate T lymphocyte immune response shortly after the onset of the infection are important in the control of the mycobacteria causing the infection before the mycobacteria have succeeded in multiplying up to the number of bacteria that would have resulted in fulminant infection.
It is presently contemplated that when this application refers to IFN-γ release as a measure of immunogenicity, other cytokines could be relevant, such as IL-12, TNF-α, IL-4, IL-5, IL-10, IL-6, TGF-β. Usually one or more cytokines will be measured utilising for example the PCR technique or ELISA. It will be appreciated by the person skilled in the art that a significant increase or decrease in any of these cytokines will be indicative of an immunologically effective polypeptide or polypeptide fragment.
Thus, an important embodiment of the invention is a polypeptide fragment defined above which 1) induces a release of IFN-γ from primed memory T-lymphocytes withdrawn from a mouse within 2 weeks of primary infection or within 4 days after the mouse has been re-challenge infected with mycobacteria belonging to the tuberculosis complex, the induction performed by the addition of the polypeptide to a suspension comprising about 200,000 spleen cells per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension, and/or 2) induces a release of IFN-γ of at least 1,500 pg/ml above background level from about 1,000,000 human PBMC (peripheral blood mononuclear cells) per ml isolated from TB patients in the first phase of infection, or from healthy BCG vaccinated donors, or from healthy contacts to TB patients, the induction being performed by the addition of the polypeptide to a suspension comprising the about 1,000,000 PBMC per ml, the addition of the polypeptide resulting in a concentration of 1-4 μg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension; and/or 3) induces an IFN-γ release from bovine PBMC derived from animals previously sensitized with mycobacteria belonging to the tuberculosis complex, said release being at least two times the release observed from bovine PBMC derived from animals not previously sensitized with mycobacteria belonging to the tuberculosis complex.
Preferably, in alternatives 1 and 2, the release effected by the polypeptide fragment gives rise to at least 1,500 pg/ml IFN-γ in the supernatant but higher concentrations are preferred, e.g. at least 2,000 pg/ml and even at least 3,000 pg/ml IFN-γ in the supernatant. The IFN-γ release from bovine PBMC can e.g. be measured as the optical density (OD) index over background in a standard cytokine ELISA and should thus be at least two, but higher numbers such as at least 3, 5, 8, and 10 are preferred.
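The numerical thresholds stated above can be illustrated with a minimal sketch. The function and constant names are hypothetical, and the OD index is assumed here to be a simple ratio of the stimulated reading to the background reading; the text itself does not prescribe a particular calculation.

```python
# Thresholds come from the text; all names below are hypothetical illustrations.
MIN_RELEASE_PG_ML = 1500.0   # minimum IFN-gamma release above background, pg/ml
MIN_BOVINE_OD_INDEX = 2.0    # minimum OD index over background for bovine PBMC


def release_above_background(stimulated_pg_ml, background_pg_ml):
    """IFN-gamma attributable to the polypeptide, in pg/ml."""
    return stimulated_pg_ml - background_pg_ml


def meets_release_criterion(stimulated_pg_ml, background_pg_ml):
    """True if the release above background reaches at least 1,500 pg/ml."""
    net = release_above_background(stimulated_pg_ml, background_pg_ml)
    return net >= MIN_RELEASE_PG_ML


def od_index(od_stimulated, od_background):
    """OD index, assumed to be the ratio of stimulated to background OD."""
    if od_background <= 0:
        raise ValueError("background OD must be positive")
    return od_stimulated / od_background


def meets_bovine_criterion(od_stimulated, od_background):
    """True if the OD index over background is at least two."""
    return od_index(od_stimulated, od_background) >= MIN_BOVINE_OD_INDEX
```

A supernatant reading of 2,100 pg/ml against a background of 400 pg/ml would pass the first check under these assumptions, whereas 1,600 pg/ml against the same background would not.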
The polypeptide fragments of the invention preferably comprise an amino acid sequence of at least 6 amino acid residues in length which has a higher sequence identity than 70 percent with SEQ ID NO: 2, 4, 6, 8, 10, 12, 14, 16, any one of 17-23, 42, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, any one of 72-86, 88, 90, 92, 94, 141, 143, 145, 147, 149, 151, 153, any one of 168-171, 175, 177, 179, 181, 183, or 185. A preferred minimum percentage of sequence identity is at least such as at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and at least 99.5%.
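As a rough illustration of the identity criterion, the sketch below compares a fragment against every ungapped window of a reference sequence and reports the best percentage of identical positions. This is an assumption made for illustration only: the text does not state which alignment method is used, gapped alignments are not handled, and the sequences in the usage note are toy strings, not any of the listed SEQ ID NOs.

```python
def percent_identity(a, b):
    """Percentage of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_ungapped_identity(fragment, reference):
    """Best identity of the fragment against any ungapped window of the reference."""
    n = len(fragment)
    if n == 0 or n > len(reference):
        raise ValueError("fragment must be non-empty and no longer than the reference")
    return max(
        percent_identity(fragment, reference[i:i + n])
        for i in range(len(reference) - n + 1)
    )


def satisfies_identity_criterion(fragment, reference, min_len=6, threshold=70.0):
    """True if the fragment has at least min_len residues and more than threshold % identity."""
    if len(fragment) < min_len:
        return False
    return best_ungapped_identity(fragment, reference) > threshold
```

With the toy reference `MKTAYIAKQR`, the fragment `TAYLAK` matches five of six positions of the window `TAYIAK` and would satisfy the criterion, while `TAYLGK` matches only four of six and would not.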
As mentioned above, it will normally be interesting to omit the leader sequences from the polypeptide fragments of the invention. However, by producing fusion polypeptides, superior characteristics of the polypeptide fragments of the invention can be achieved. For instance, fusion partners which facilitate export of the polypeptide when produced recombinantly, fusion partners which facilitate purification of the polypeptide, and fusion partners which enhance the immunogenicity of the polypeptide fragment of the invention are all interesting possibilities. Therefore, the invention also pertains to a fusion polypeptide comprising at least one polypeptide fragment defined above and at least one fusion partner. The fusion partner can, in order to enhance immunogenicity, e.g. be selected from the group consisting of another polypeptide fragment as defined above (so as to allow for multiple expression of relevant epitopes), and another polypeptide derived from a bacterium belonging to the tuberculosis complex, such as ESAT-6, CFP7, CFP10, CFP17, CFP21, CFP25, CFP29, MPB59, MPT59, MPB64, and MPT64 or at least one T-cell epitope of any of these antigens. Other immunogenicity enhancing polypeptides which could serve as fusion partners are T-cell epitopes (e.g. derived from the polypeptides ESAT-6, MPB64, MPT64, or MPB59) or other immunogenic epitopes enhancing the immunogenicity of the target gene product, e.g. lymphokines such as IFN-γ, IL-2 and IL-12. In order to facilitate expression and/or purification the fusion partner can e.g. be a bacterial fimbrial protein, e.g. the pilus components pilin and papA; protein A; the ZZ-peptide (ZZ-fusions are marketed by Pharmacia in Sweden); the maltose binding protein; glutathione S-transferase; β-galactosidase; or poly-histidine.
Other interesting fusion partners are polypeptides which are lipidated and thereby effect that the immunogenic polypeptide is presented in a suitable manner to the immune system. This effect is e.g. known from vaccines based on the Borrelia burgdorferi OspA polypeptide, wherein the lipidated membrane anchor in the polypeptide confers a self-adjuvating effect to the polypeptide (which is natively lipidated) when isolated from cells producing it. In contrast, the OspA polypeptide is relatively silent immunologically when prepared without the lipidation anchor.
As evidenced in Example 6A, the fusion polypeptide consisting of MPT59 fused directly N-terminally to ESAT-6 enhances the immunogenicity of ESAT-6 beyond what would be expected from the immunogenicities of MPT59 and ESAT-6 alone. The precise reason for this surprising finding is not yet known, but it is expected that either the presence of both antigens leads to a synergistic effect with respect to immunogenicity or the presence of a sequence N-terminally to the ESAT-6 sequence protects this immune dominant protein from loss of important epitopes known to be present in the N-terminus. A third, alternative, possibility is that the presence of a sequence C-terminally to the MPT59 sequence enhances the immunologic properties of this antigen.
Hence, one part of the invention pertains to a fusion polypeptide fragment which comprises a first amino acid sequence including at least one stretch of amino acids constituting a T-cell epitope derived from the M. tuberculosis protein ESAT-6 or MPT59, and a second amino acid sequence including at least one T-cell epitope derived from a M. tuberculosis protein different from ESAT-6 (if the first stretch of amino acids is derived from ESAT-6) or MPT59 (if the first stretch of amino acids is derived from MPT59) and/or including a stretch of amino acids which protects the first amino acid sequence from in vivo degradation or post-translational processing. The first amino acid sequence may be situated N- or C-terminally to the second amino acid sequence, but in line with the above considerations regarding protection of the ESAT-6 N-terminus it is preferred that the first amino acid sequence is C-terminal to the second when the first amino acid sequence is derived from ESAT-6.
Although only the effect of fusion between MPT59 and ESAT-6 has been investigated at present, it is believed that ESAT-6 and MPT59 or epitopes derived therefrom could advantageously be fused to other fusion partners having substantially the same effect on overall immunogenicity of the fusion construct. Hence, it is preferred that such a fusion polypeptide fragment according to the invention is one wherein the at least one T-cell epitope included in the second amino acid sequence is derived from a M. tuberculosis polypeptide (the "parent" polypeptide) selected from the group consisting of a polypeptide fragment according to the present invention and described in detail above and in the examples, or the amino acid sequence could be derived from any one of the M. tuberculosis proteins DnaK, GroEL, urease, glutamine synthetase, the proline rich complex, L-alanine dehydrogenase, phosphate binding protein, Ag 85 complex, HBHA (heparin binding hemagglutinin), MPT51, MPT64, superoxide dismutase, 19 kDa lipoprotein, α-crystallin, GroES, MPT59 (when the first amino acid sequence is derived from ESAT-6), and ESAT-6 (when the first amino acid sequence is derived from MPT59). It is preferred that the first and second T-cell epitopes each have a sequence identity of at least 70% with the natively occurring sequence in the proteins from which they are derived and it is even further preferred that the first and/or second amino acid sequence has a sequence identity of at least 70% with the protein from which they are derived. A most preferred embodiment of this fusion polypeptide is one wherein the first amino acid sequence is the amino acid sequence of ESAT-6 or MPT59 and/or the second amino acid sequence is the full-length amino acid sequence of the possible "parent" polypeptides listed above.
In the most preferred embodiment, the fusion polypeptide fragment comprises ESAT-6 fused to MPT59 (advantageously, ESAT-6 is fused to the C-terminus of MPT59) and in one special embodiment, there are no linkers introduced between the two amino acid sequences constituting the two parent polypeptide fragments.
Another part of the invention pertains to a nucleic acid fragment in isolated form which 1) comprises a nucleic acid sequence which encodes a polypeptide or fusion polypeptide as defined above, or comprises a nucleic acid sequence complementary thereto, and/or 2) has a length of at least 10 nucleotides and hybridizes readily under stringent hybridization conditions (as defined in the art, i.e. 5-10°C under the melting point Tm, cf. Sambrook et al, 1989, pages 11.45-11.49) with a nucleic acid fragment which has a nucleotide sequence selected from SEQ ID NO: 1, 3, 5, 7, 9, 11, 13, 15, 41, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 87, 89, 91, 93, 140, 142, 144, 146, 148, 150, 152, 174, 176, 178, 180, 182, and 184, or a sequence complementary to any one of these, with the proviso that when the nucleic acid fragment comprises a subsequence of SEQ ID NO: 41, then the nucleic acid fragment contains an A corresponding to position 781 in SEQ ID NO: 41 and when the nucleic acid fragment comprises a subsequence of a nucleotide sequence exactly complementary to SEQ ID NO: 41, then the nucleic acid fragment comprises a T corresponding to position 781 in SEQ ID NO: 41.
It is preferred that the nucleic acid fragment is a DNA fragment.
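The stringency window of 5-10°C under Tm can be sketched for a short oligonucleotide. The estimate below uses the Wallace rule (2°C per A/T, 4°C per G/C), which is an assumption introduced here for illustration and is only a rough guide for short oligonucleotides; the text refers to Sambrook et al. for the applicable definition, and all function names are hypothetical.

```python
def wallace_tm(oligo):
    """Rough melting temperature (deg C) of a short DNA oligo by the Wallace rule."""
    s = oligo.upper()
    if not s or set(s) - set("ACGT"):
        raise ValueError("oligo must be a non-empty string of A, C, G, T")
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_window(oligo):
    """Temperature range lying 10 to 5 deg C under the estimated Tm."""
    tm = wallace_tm(oligo)
    return (tm - 10.0, tm - 5.0)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, i.e. the strand it would pair with."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]
```

For the 16-mer `ATGCATGCATGCATGC` this estimate gives a Tm of 48°C and hence a window of 38-43°C; real hybridization conditions also depend on salt and formamide concentration, which the sketch ignores.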
To provide certainty of the advantages in accordance with the invention, the preferred nucleic acid sequence when employed for hybridization studies or assays includes sequences that are complementary to at least a 10 to 40, or so, nucleotide stretch of the selected sequence. A size of at least 10 nucleotides in length helps to ensure that the fragment will be of sufficient length to form a duplex molecule that is both stable and selective. Molecules having complementary sequences over stretches greater than bases in length are generally preferred, though, in order to increase stability and selectivity of the hybrid, and thereby improve the quality and degree of specific hybrid molecules obtained.
Hence, the term "subsequence" when used in connection with the nucleic acid fragments of the invention is intended to indicate a continuous stretch of at least 10 nucleotides which exhibits the above hybridization pattern. Normally this will require a minimum sequence identity of at least 70% with a subsequence of the hybridization partner having SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 15, 21, 41, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 87, 89, 91, 93, 140, 142, 144, 146, 148, 150, 152, 174, 176, 178, 180, 182, or 184. It is preferred that the nucleic acid fragment is longer than nucleotides, such as at least 15, at least 20, at least 25, at least 30, at least 35, at least 40, at least 45, at least 50, at least 55, at least 60, at least 65, at least 70, and at least 80 nucleotides long, and the sequence identity should preferably also be higher than 70%, such as at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 94%, at least 96%, and at least 98%. It is most preferred that the sequence identity is 100%. Such fragments may be readily prepared by, for example, directly synthesizing the fragment by chemical means, by application of nucleic acid reproduction technology, such as the PCR technology of U.S. Patent 4,603,102, or by introducing selected sequences into recombinant vectors for recombinant production.
It is well known that the same amino acid may be encoded by various codons, the codon usage being related, inter alia, to the preference of the organisms in question expressing the nucleotide sequence. Thus, at least one nucleotide or codon of a nucleic acid fragment of the invention may be exchanged by others which, when expressed, result in a polypeptide identical or substantially identical to the polypeptide encoded by the nucleic acid fragment in question. The invention thus allows for variations in the sequence such as substitution, insertion (including introns), addition, deletion and rearrangement of one or more nucleotides, which variations do not have any substantial effect on the polypeptide encoded by the nucleic acid fragment or a subsequence thereof. The term "substitution" is intended to mean the replacement of one or more nucleotides in the full nucleotide sequence with one or more different nucleotides, "addition" is understood to mean the addition of one or more nucleotides at either end of the full nucleotide sequence, "insertion" is intended to mean the introduction of one or more nucleotides within the full nucleotide sequence, "deletion" is intended to indicate that one or more nucleotides have been deleted from the full nucleotide sequence whether at either end of the sequence or at any suitable point within it, and "rearrangement" is intended to mean that two or more nucleotide residues have been exchanged with each other.
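The point about codon degeneracy can be shown with a short sketch that translates DNA using the standard genetic code and compares two sequences differing only by synonymous codon substitutions. The sequences are invented toy examples and the function names are hypothetical; the sketch does not model codon usage preferences of any host.

```python
# Standard genetic code, codons enumerated in T, C, A, G order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence (length divisible by 3) into one-letter amino acids."""
    s = dna.upper()
    if len(s) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, len(s), 3))


def encode_same_polypeptide(dna_a, dna_b):
    """True if two nucleotide sequences encode an identical polypeptide."""
    return translate(dna_a) == translate(dna_b)
```

For instance, `ATGACCGAACAG` and `ATGACGGAGCAA` differ at three positions yet both translate to `MTEQ`, whereas changing `GAA` to `GAT` alters the encoded residue from E to D.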
The nucleotide sequence to be modified may be of cDNA or genomic origin as discussed above, but may also be of synthetic origin. Furthermore, the sequence may be of mixed cDNA and genomic, mixed cDNA and synthetic or genomic and synthetic origin as discussed above. The sequence may have been modified, e.g. by site-directed mutagenesis, to result in the desired nucleic acid fragment encoding the desired polypeptide. The following discussion focused on modifications of nucleic acid encoding the polypeptide should be understood to encompass also such possibilities, as well as the possibility of building up the nucleic acid by ligation of two or more DNA fragments to obtain the desired nucleic acid fragment, and combinations of the above-mentioned principles.
The nucleotide sequence may be modified using any suitable technique which results in the production of a nucleic acid fragment encoding a polypeptide of the invention.
The modification of the nucleotide sequence encoding the amino acid sequence of the polypeptide of the invention should be one which does not impair the immunological function of the resulting polypeptide.
A preferred method of preparing variants of the antigens disclosed herein is site-directed mutagenesis. This technique is useful in the preparation of individual peptides, or biologically functional equivalent proteins or peptides, derived from the antigen sequences, through specific mutagenesis of the underlying nucleic acid. The technique further provides a ready ability to prepare and test sequence variants, for example, incorporating one or more of the foregoing considerations, by introducing one or more nucleotide sequence changes into the nucleic acid. Site-specific mutagenesis allows the production of mutants through the use of specific oligonucleotide sequences which encode the nucleotide sequence of the desired mutation, as well as a sufficient number of adjacent nucleotides, to provide a primer sequence of sufficient size and sequence complexity to form a stable duplex on both sides of the deletion junction being traversed. Typically, a primer of about 17 to 25 nucleotides in length is preferred, with about 5 to 10 residues on both sides of the junction of the sequence being altered.
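The primer geometry described above (a mutated position flanked by roughly 5 to 10 unaltered residues on each side) can be sketched as follows. This is a simplified illustration for a single-base substitution: the function name and the toy template are hypothetical, the primer is returned in the same orientation as the sequence passed in, and taking the reverse complement where the single-stranded template requires it is left out.

```python
def mutagenic_primer(template, position, new_base, flank=10):
    """Build a primer carrying a single-base substitution.

    template -- DNA sequence containing the site to be altered
    position -- 0-based index of the base to change
    new_base -- replacement base (must differ from the original)
    flank    -- unaltered residues kept on each side (5 to 10, per the text)
    """
    template = template.upper()
    new_base = new_base.upper()
    if not 5 <= flank <= 10:
        raise ValueError("flank should be between 5 and 10 residues")
    if new_base not in ("A", "C", "G", "T"):
        raise ValueError("new_base must be one of A, C, G, T")
    if position - flank < 0 or position + flank >= len(template):
        raise ValueError("not enough sequence on both sides of the position")
    if template[position] == new_base:
        raise ValueError("new_base is identical to the original base")
    left = template[position - flank:position]
    right = template[position + 1:position + 1 + flank]
    return left + new_base + right
```

With ten flanking residues on each side, a single-base change yields a 21-nucleotide primer, which falls within the 17 to 25 nucleotide range given above.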
In general, the technique of site-specific mutagenesis is well known in the art as exemplified by publications (Adelman et al., 1983). As will be appreciated, the technique typically employs a phage vector which exists in both a single stranded and double stranded form. Typical vectors useful in site-directed mutagenesis include vectors such as the M13 phage (Messing et al., 1981). These phage are readily commercially available and their use is generally well known to those skilled in the art.
In general, site-directed mutagenesis in accordance herewith is performed by first obtaining a single-stranded vector which includes within its sequence a nucleic acid sequence which encodes the polypeptides of the invention. An oligonucleotide primer bearing the desired mutated sequence is prepared, generally synthetically, for example by the method of Crea et al. (1978). This primer is then annealed with the single-stranded vector, and subjected to DNA polymerizing enzymes such as E. coli polymerase I Klenow fragment, in order to complete the synthesis of the mutation-bearing strand. Thus, a heteroduplex is formed wherein one strand encodes the original nonmutated sequence and the second strand bears the desired mutation. This heteroduplex vector is then used to transform appropriate cells, such as E. coli cells, and clones are selected which include recombinant vectors bearing the mutated sequence arrangement.
The preparation of sequence variants of the selected nucleic acid fragments of the invention using site-directed mutagenesis is provided as a means of producing potentially useful species of the genes and is not meant to be limiting as there are other ways in which sequence variants of the nucleic acid fragments of the invention may be obtained. For example, recombinant vectors encoding the desired genes may be treated with mutagenic agents to obtain sequence variants (see, e.g., a method described by Eichenlaub, 1979, for the mutagenesis of plasmid DNA using hydroxylamine).
The invention also relates to a replicable expression vector which comprises a nucleic acid fragment defined above, especially a vector which comprises a nucleic acid fragment encoding a polypeptide fragment of the invention.
The vector may be any vector which may conveniently be subjected to recombinant DNA procedures, and the choice of vector will often depend on the host cell into which it is to be introduced. Thus, the vector may be an autonomously replicating vector, i.e. a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication; examples of such a vector are a plasmid, phage, cosmid, mini-chromosome or virus. Alternatively, the vector may be one which, when introduced in a host cell, is integrated in the host cell genome and replicated together with the chromosome(s) into which it has been integrated.
Expression vectors may be constructed to include any of the DNA segments disclosed herein. Such DNA might encode an antigenic protein specific for virulent strains of mycobacteria or even hybridization probes for detecting mycobacteria nucleic acids in samples. Longer or shorter DNA segments could be used, depending on the antigenic protein desired. Epitopic regions of the proteins expressed or encoded by the disclosed DNA could be included as relatively short segments of DNA. A wide variety of expression vectors is possible including, for example, DNA segments encoding reporter gene products useful for identification of heterologous gene products and/or resistance genes such as antibiotic resistance genes which may be useful in identifying transformed cells.
The vector of the invention may be used to transform cells so as to allow propagation of the nucleic acid fragments of the invention or so as to allow expression of the polypeptide fragments of the invention. Hence, the invention also pertains to a transformed cell harbouring at least one such vector according to the invention, said cell being one which does not natively harbour the vector and/or the nucleic acid fragment of the invention contained therein. Such a transformed cell (which is also a part of the invention) may be any suitable bacterial host cell or any other type of cell such as a unicellular eukaryotic organism, a fungus or yeast, or a cell derived from a multicellular organism, e.g. an animal or a plant. It is especially in cases where glycosylation is desired that a mammalian cell is used, although glycosylation of proteins is a rare event in prokaryotes. Normally, however, a prokaryotic cell is preferred such as a bacterium belonging to the genera Mycobacterium, Salmonella, Pseudomonas, Bacillus and Escherichia. It is preferred that the transformed cell is an E. coli, B. subtilis, or M. bovis BCG cell, and it is especially preferred that the transformed cell expresses a polypeptide according to the invention. The latter opens the possibility of producing the polypeptide of the invention by simply recovering it from the culture containing the transformed cell. In the most preferred embodiment of this part of the invention the transformed cell is Mycobacterium bovis BCG strain: Danish 1331, which is the Mycobacterium bovis strain Copenhagen from the Copenhagen BCG Laboratory, Statens Seruminstitut, Denmark.
The nucleic acid fragments of the invention allow for the recombinant production of the polypeptide fragments of the invention. However, isolation from the natural source is also a way of providing the polypeptide fragments, as is peptide synthesis.
Therefore, the invention also pertains to a method for the preparation of a polypeptide fragment of the invention, said method comprising inserting a nucleic acid fragment as defined above into a vector which is able to replicate in a host cell, introducing the resulting recombinant vector into the host cell (transformed cells may be selected using various techniques, including screening by differential hybridization, identification of fused reporter gene products, resistance markers, anti-antigen antibodies and the like), culturing the host cell in a culture medium under conditions sufficient to effect expression of the polypeptide (of course the cell may be cultivated under conditions appropriate to the circumstances, and if DNA is desired, replication conditions are used), and recovering the polypeptide from the host cell or culture medium; or isolating the polypeptide from a short-term culture filtrate as defined in claim 1; or isolating the polypeptide from whole mycobacteria of the tuberculosis complex or from lysates or fractions thereof, e.g. cell wall containing fractions, or synthesizing the polypeptide by solid or liquid phase peptide synthesis.
The medium used to grow the transformed cells may be any conventional medium suitable for the purpose. A suitable vector may be any of the vectors described above, and an appropriate host cell may be any of the cell types listed above. The methods employed to construct the vector and effect introduction thereof into the host cell may be any methods known for such purposes within the field of recombinant DNA. In the following a more detailed description of the possibilities will be given: In general, of course, prokaryotes are preferred for the initial cloning of nucleic acid sequences of the invention and constructing the vectors useful in the invention. For example, in addition to the particular strains mentioned in the more specific disclosure below, one may mention by way of example, strains such as E. coli K12 strain 294 (ATCC No. 31446), E. coli B, and E. coli χ 1776 (ATCC No. 31537). These examples are, of course, intended to be illustrative rather than limiting.
Prokaryotes are also preferred for expression. The aforementioned strains, as well as E. coli W3110 (lambda-, prototrophic, ATCC No. 273325), bacilli such as Bacillus subtilis, or other Enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species may be used. Especially interesting are rapid-growing mycobacteria, e.g. M. smegmatis, as these bacteria have a high degree of resemblance with mycobacteria of the tuberculosis complex and therefore stand a good chance of reducing the need of performing post-translational modifications of the expression product.
In general, plasmid vectors containing replicon and control sequences which are derived from species compatible with the host cell are used in connection with these hosts. The vector ordinarily carries a replication site, as well as marking sequences which are capable of providing phenotypic selection in transformed cells. For example, E. coli is typically transformed using pBR322, a plasmid derived from an E. coli species (see, Bolivar et al., 1977, Gene 2: 95). The pBR322 plasmid contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying transformed cells. The pBR plasmid, or other microbial plasmid or phage must also contain, or be modified to contain, promoters which can be used by the microorganism for expression.
Those promoters most commonly used in recombinant DNA construction include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., 1978; Itakura et al., 1977; Goeddel et al., 1979) and a tryptophan (trp) promoter system (Goeddel et al., 1979; EPO Appl. Publ. No. 0036776). While these are the most commonly used, other microbial promoters have been discovered and utilized, and details concerning their nucleotide sequences have been published, enabling a skilled worker to ligate them functionally with plasmid vectors (Siebenlist et al., 1980). Certain genes from prokaryotes may be expressed efficiently in E. coli from their own promoter sequences, precluding the need for addition of another promoter by artificial means.
After the recombinant preparation of the polypeptide according to the invention, the isolation of the polypeptide may for instance be carried out by affinity chromatography (or other conventional biochemical procedures based on chromatography), using a monoclonal antibody which substantially specifically binds the polypeptide according to the invention. Another possibility is to employ the simultaneous electroelution technique described by Andersen et al. in J. Immunol. Methods 161: 29-39.
According to the invention, the post-translational modifications involve lipidation, glycosylation, cleavage, or elongation of the polypeptide.
In certain aspects, the DNA sequence information provided by this invention allows for the preparation of relatively short DNA (or RNA or PNA) sequences having the ability to specifically hybridize to mycobacterial gene sequences. In these aspects, nucleic acid probes of an appropriate length are prepared based on a consideration of the relevant sequence. The ability of such nucleic acid probes to specifically hybridize to the mycobacterial gene sequences lends them particular utility in a variety of embodiments.
Most importantly, the probes can be used in a variety of diagnostic assays for detecting the presence of pathogenic organisms in a given sample. However, other uses are envisioned, including the use of the sequence information for the preparation of mutant species primers, or primers for use in preparing other genetic constructs.
Apart from their use as starting points for the synthesis of polypeptides of the invention and for hybridization probes (useful for direct hybridization assays or as primers in e.g. PCR or other molecular amplification methods) the nucleic acid fragments of the invention may be used for effecting in vivo expression of antigens, i.e. the nucleic acid fragments may be used in so-called DNA vaccines. Recent research has revealed that a DNA fragment cloned in a vector which is non-replicative in eukaryotic cells may be introduced into an animal (including a human being) by e.g. intramuscular injection or percutaneous administration (the so-called "gene gun" approach). The DNA is taken up by e.g. muscle cells and the gene of interest is expressed by a promoter which is functioning in eukaryotes, e.g. a viral promoter, and the gene product thereafter stimulates the immune system. These newly discovered methods are reviewed in Ulmer et al., 1993, which hereby is included by reference.
Hence, the invention also relates to a vaccine comprising a nucleic acid fragment according to the invention, the vaccine effecting in vivo expression of antigen by an animal, including a human being, to whom the vaccine has been administered, the amount of expressed antigen being effective to confer substantially increased resistance to infections with mycobacteria of the tuberculosis complex in an animal, including a human being.
The efficacy of such a "DNA vaccine" can possibly be enhanced by administering the gene encoding the expression product together with a DNA fragment encoding a polypeptide which has the capability of modulating an immune response. For instance, a gene encoding lymphokine precursors or lymphokines (e.g. IFN-γ, IL-2, or IL-12) could be administered together with the gene encoding the immunogenic protein, either by administering two separate DNA fragments or by administering both DNA fragments included in the same vector. It is also a possibility to administer DNA fragments comprising a multitude of nucleotide sequences which each encode relevant epitopes of the polypeptides disclosed herein so as to effect a continuous sensitization of the immune system with a broad spectrum of these epitopes.
As explained above, the polypeptide fragments of the invention are excellent candidates for vaccine constituents or for constituents in an immune diagnostic agent due to their extracellular presence in culture media containing metabolizing virulent mycobacteria belonging to the tuberculosis complex, or because of their high homologies with such extracellular antigens, or because of their absence in M. bovis BCG.
Thus, another part of the invention pertains to an immunologic composition comprising a polypeptide or fusion polypeptide according to the invention. In order to ensure optimum performance of such an immunologic composition it is preferred that it comprises an immunologically and pharmaceutically acceptable carrier, vehicle or adjuvant.
Suitable carriers are selected from the group consisting of a polymer to which the polypeptide(s) is/are bound by hydrophobic non-covalent interaction, such as a plastic, e.g. polystyrene, or a polymer to which the polypeptide(s) is/are covalently bound, such as a polysaccharide, or a polypeptide, e.g. bovine serum albumin, ovalbumin or keyhole limpet haemocyanin. Suitable vehicles are selected from the group consisting of a diluent and a suspending agent. The adjuvant is preferably selected from the group consisting of dimethyldioctadecylammonium bromide (DDA), Quil A, poly I:C, Freund's incomplete adjuvant, IFN-γ, IL-2, IL-12, monophosphoryl lipid A (MPL), and muramyl dipeptide (MDP).
A preferred immunologic composition according to the present invention comprises at least two different polypeptide fragments, each different polypeptide fragment being a polypeptide or a fusion polypeptide defined above. It is preferred that the immunologic composition comprises between 3 and 20 different polypeptide fragments or fusion polypeptides.
Such an immunologic composition may preferably be in the form of a vaccine or in the form of a skin test reagent.
In line with the above, the invention therefore also pertains to a method for producing an immunologic composition according to the invention, the method comprising preparing, synthesizing or isolating a polypeptide according to the invention, and solubilizing or dispersing the polypeptide in a medium for a vaccine, and optionally adding other M. tuberculosis antigens and/or a carrier, vehicle and/or adjuvant substance.
Preparation of vaccines which contain peptide sequences as active ingredients is generally well understood in the art, as exemplified by U.S. Patents 4,608,251; 4,601,903; 4,599,231; 4,599,230; 4,596,792; and 4,578,770, all incorporated herein by reference. Typically, such vaccines are prepared as injectables either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid prior to injection may also be prepared. The preparation may also be emulsified.
The active immunogenic ingredient is often mixed with excipients which are pharmaceutically acceptable and compatible with the active ingredient. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, or the like, and combinations thereof. In addition, if desired, the vaccine may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH buffering agents, or adjuvants which enhance the effectiveness of the vaccines.
The vaccines are conventionally administered parenterally, by injection, for example, either subcutaneously or intramuscularly. Additional formulations which are suitable for other modes of administration include suppositories and, in some cases, oral formulations. For suppositories, traditional binders and carriers may include, for example, polyalkylene glycols or triglycerides; such suppositories may be formed from mixtures containing the active ingredient in the range of 0.5% to 10%, preferably Oral formulations include such normally employed excipients as, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharin, cellulose, magnesium carbonate, and the like. These compositions take the form of solutions, suspensions, tablets, pills, capsules, sustained release formulations or powders and contain 10-95% of active ingredient, preferably 25-70%.
The proteins may be formulated into the vaccine as neutral or salt forms. Pharmaceutically acceptable salts include acid addition salts (formed with the free amino groups of the peptide) which are formed with inorganic acids such as, for example, hydrochloric or phosphoric acids, or such organic acids as acetic, oxalic, tartaric, mandelic, and the like. Salts formed with the free carboxyl groups may also be derived from inorganic bases such as, for example, sodium, potassium, ammonium, calcium, or ferric hydroxides, and such organic bases as isopropylamine, trimethylamine, 2-ethylamino ethanol, histidine, procaine, and the like.
The vaccines are administered in a manner compatible with the dosage formulation, and in such amount as will be therapeutically effective and immunogenic. The quantity to be administered depends on the subject to be treated, including the capacity of the individual's immune system to mount an immune response, and the degree of protection desired. Suitable dosage ranges are of the order of several hundred micrograms active ingredient per vaccination with a preferred range from about 0.1 µg to 1000 µg, such as in the range from about 1 µg to 300 µg, and especially in the range from about 10 µg to 50 µg. Suitable regimens for initial administration and booster shots are also variable but are typified by an initial administration followed by subsequent inoculations or other administrations.
The manner of application may be varied widely. Any of the conventional methods for administration of a vaccine are applicable. These are believed to include oral application on a solid physiologically acceptable base or in a physiologically acceptable dispersion, parenterally, by injection or the like. The dosage of the vaccine will depend on the route of administration and will vary according to the age of the person to be vaccinated and, to a lesser degree, the size of the person to be vaccinated.
Some of the polypeptides of the vaccine are sufficiently immunogenic in a vaccine, but for some of the others the immune response will be enhanced if the vaccine further comprises an adjuvant substance.
Various methods of achieving adjuvant effect for the vaccine include use of agents such as aluminum hydroxide or phosphate (alum), commonly used as 0.05 to 0.1 percent solution in phosphate buffered saline, admixture with synthetic polymers of sugars (Carbopol) used as 0.25 percent solution, aggregation of the protein in the vaccine by heat treatment with temperatures ranging between 70° and 101°C for 30 second to 2 minute periods respectively. Aggregation by reactivating with pepsin treated (Fab) antibodies to albumin, mixture with bacterial cells such as C. parvum or endotoxins or lipopolysaccharide components of gram-negative bacteria, emulsion in physiologically acceptable oil vehicles such as mannide mono-oleate (Aracel A) or emulsion with percent solution of a perfluorocarbon (Fluosol-DA) used as a block substitute may also be employed. According to the invention DDA (dimethyldioctadecylammonium bromide) is an interesting candidate for an adjuvant, but also Freund's complete and incomplete adjuvants as well as QuilA and RIBI are interesting possibilities. Further possibilities are monophosphoryl lipid A (MPL), and muramyl dipeptide (MDP).
Another highly interesting (and thus, preferred) possibility of achieving adjuvant effect is to employ the technique described in Gosselin et al., 1992 (which is hereby incorporated by reference herein). In brief, the presentation of a relevant antigen such as an antigen of the present invention can be enhanced by conjugating the antigen to antibodies (or antigen binding antibody fragments) against the Fcγ receptors on monocytes/macrophages. Especially conjugates between antigen and anti-FcγRI have been demonstrated to enhance immunogenicity for the purposes of vaccination.
Other possibilities involve the use of immune modulating substances such as lymphokines (e.g. IFN-γ, IL-2 and IL-12) or synthetic IFN-γ inducers such as poly I:C in combination with the above-mentioned adjuvants. As discussed in example 3, it is contemplated that such mixtures of antigen and adjuvant will lead to superior vaccine formulations.
In many instances, it will be necessary to have multiple administrations of the vaccine, usually not exceeding six vaccinations, more usually not exceeding four vaccinations and preferably one or more, usually at least about three vaccinations. The vaccinations will normally be at from two to twelve week intervals, more usually from three to five week intervals. Periodic boosters at intervals of 1-5 years, usually three years, will be desirable to maintain the desired levels of protective immunity. The course of the immunization may be followed by in vitro proliferation assays of PBL (peripheral blood lymphocytes) co-cultured with ESAT-6 or ST-CF, and especially by measuring the levels of IFN-γ released from the primed lymphocytes. The assays may be performed using conventional labels, such as radionuclides, enzymes, fluorescers, and the like.
These techniques are well known and may be found in a wide variety of patents, such as U.S. Patent Nos. 3,791,932; 4,174,384 and 3,949,064, as illustrative of these types of assays.
Due to genetic variation, different individuals may react with immune responses of varying strength to the same polypeptide. Therefore, the vaccine according to the invention may comprise several different polypeptides in order to increase the immune response. The vaccine may comprise two or more polypeptides, where all of the polypeptides are as defined above, or some but not all of the peptides may be derived from a bacterium belonging to the M. tuberculosis complex. In the latter example the polypeptides not necessarily fulfilling the criteria set forth above for polypeptides may either act due to their own immunogenicity or merely act as adjuvants. Examples of such interesting polypeptides are MPB64, MPT64, and MPB59, but any other substance which can be isolated from mycobacteria is a possible candidate.
The vaccine may comprise 3-20 different polypeptides, such as 3-10 different polypeptides.
One reason for admixing the polypeptides of the invention with an adjuvant is to effectively activate a cellular immune response. However, this effect can also be achieved in other ways, for instance by expressing the effective antigen in a vaccine in a non-pathogenic microorganism. A well-known example of such a microorganism is Mycobacterium bovis BCG.
Therefore, another important aspect of the present invention is an improvement of the living BCG vaccine presently available, which is a vaccine for immunizing an animal, including a human being, against TB caused by mycobacteria belonging to the tuberculosis-complex, comprising as the effective component a microorganism, wherein one or more copies of a DNA sequence encoding a polypeptide as defined above has been incorporated into the genome of the microorganism in a manner allowing the microorganism to express and secrete the polypeptide.
In the present context the term "genome" refers to the chromosome of the microorganisms as well as extrachromosomal DNA or RNA, such as plasmids. It is, however, preferred that the DNA sequence of the present invention has been introduced into the chromosome of the non-pathogenic microorganism, since this will prevent loss of the genetic material introduced.
It is preferred that the non-pathogenic microorganism is a bacterium, e.g. selected from the group consisting of the genera Mycobacterium, Salmonella, Pseudomonas and Escherichia. It is especially preferred that the non-pathogenic microorganism is Mycobacterium bovis BCG, such as Mycobacterium bovis BCG strain: Danish 1331.
The incorporation of one or more copies of a nucleotide sequence encoding the polypeptide according to the invention in a mycobacterium from a M. bovis BCG strain will enhance the immunogenic effect of the BCG strain. The incorporation of more than one copy of a nucleotide sequence of the invention is contemplated to enhance the immune response even more, and consequently an aspect of the invention is a vaccine wherein at least 2 copies of a DNA sequence encoding a polypeptide is incorporated in the genome of the microorganism, such as at least 5 copies. The copies of DNA sequences may either be identical encoding identical polypeptides or be variants of the same DNA sequence encoding identical or homologues of a polypeptide, or in another embodiment be different DNA sequences encoding different polypeptides where at least one of the polypeptides is according to the present invention.
The living vaccine of the invention can be prepared by cultivating a transformed nonpathogenic cell according to the invention, and transferring these cells to a medium for a vaccine, and optionally adding a carrier, vehicle and/or adjuvant substance.
The invention also relates to a method of diagnosing TB caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis in an animal, including a human being, comprising intradermally injecting, in the animal, a polypeptide according to the invention or a skin test reagent described above, a positive skin response at the location of injection being indicative of the animal having TB, and a negative skin response at the location of injection being indicative of the animal not having TB. A positive response is a skin reaction having a diameter of at least 5 mm, but larger reactions are preferred, such as at least 1 cm, 1.5 cm, and at least 2 cm in diameter. The composition used as the skin test reagent can be prepared in the same manner as described for the vaccines above.
In line with the disclosure above pertaining to vaccine preparation and use, the invention also pertains to a method for immunising an animal, including a human being, against TB caused by mycobacteria belonging to the tuberculosis complex, comprising administering to the animal the polypeptide of the invention, or a vaccine composition of the invention as described above, or a living vaccine described above. Preferred routes of administration are the parenteral (such as intravenous and intraarterial), intraperitoneal, intramuscular, subcutaneous, intradermal, oral, buccal, sublingual, nasal, rectal or transdermal route.
The protein ESAT-6 which is present in short-term culture filtrates from mycobacteria as well as the esat-6 gene in the mycobacterial genome has been demonstrated to have a very limited distribution in other mycobacterial strains than M. tuberculosis, e.g. esat-6 is absent in both BCG and the majority of mycobacterial species isolated from the environment, such as M. avium and M. terrae. It is believed that this is also the case for at least one of the antigens of the present invention and their genes and therefore, the diagnostic embodiments of the invention are especially well-suited for performing the diagnosis of on-going or previous infection with virulent mycobacterial strains of the tuberculosis complex, and it is contemplated that it will be possible to distinguish between 1) subjects (animal or human) which have been previously vaccinated with e.g. BCG vaccines or subjected to antigens from non-virulent mycobacteria and 2) subjects which have or have had active infection with virulent mycobacteria.
A number of possible diagnostic assays and methods can be envisaged: When diagnosis of previous or ongoing infection with virulent mycobacteria is the aim, a blood sample comprising mononuclear cells (e.g. T-lymphocytes) from a patient could be contacted with a sample of one or more polypeptides of the invention. This contacting can be performed in vitro and a positive reaction could e.g. be proliferation of the T-cells or release of cytokines such as γ-interferon into the extracellular phase (e.g. into a culture supernatant); a suitable in vivo test would be a skin test as described above. It is also conceivable to contact a serum sample from a subject with a polypeptide of the invention, the demonstration of a binding between antibodies in the serum sample and the polypeptide being indicative of previous or ongoing infection.
The invention therefore also relates to an in vitro method for diagnosing ongoing or previous sensitization in an animal or a human being with bacteria belonging to the tuberculosis complex, the method comprising providing a blood sample from the animal or human being, and contacting the sample from the animal with the polypeptide of the invention, a significant release into the extracellular phase of at least one cytokine by mononuclear cells in the blood sample being indicative of the animal being sensitized. By the term "significant release" is herein meant that the release of the cytokine is significantly higher than the cytokine release from a blood sample derived from a non-tuberculous subject (e.g. a subject which does not react in a traditional skin test for TB). Normally, a significant release is at least two times the release observed from such a sample.
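The two-fold criterion for a "significant release" can be written out as a simple check. The following is only a minimal sketch in Python and is not part of the original disclosure; the function name, the argument names and the example readings are hypothetical, and the cytokine units are arbitrary provided both measurements use the same unit.

```python
def is_significant_release(sample_release, control_release, fold=2.0):
    """Return True if cytokine release from the test sample is at least
    `fold` times the release from a non-tuberculous control sample.

    The default fold of 2.0 mirrors the 'at least two times' criterion
    in the text; names and units are illustrative assumptions.
    """
    if control_release <= 0:
        raise ValueError("control release must be positive to form a ratio")
    return sample_release >= fold * control_release


# Hypothetical IFN-gamma readings in arbitrary units
print(is_significant_release(500.0, 200.0))  # 2.5-fold over control
print(is_significant_release(300.0, 200.0))  # 1.5-fold over control
```

In practice the control value would come from a blood sample of a subject that does not react in a traditional skin test, as described above.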
Alternatively, a sample of a possibly infected organ may be contacted with an antibody raised against a polypeptide of the invention. The demonstration of the reaction by means of methods well-known in the art between the sample and the antibody will be indicative of ongoing infection. It is of course also a possibility to demonstrate the presence of anti-mycobacterial antibodies in serum by contacting a serum sample from a subject with at least one of the polypeptide fragments of the invention and using well-known methods for visualizing the reaction between the antibody and antigen.
Also a method of determining the presence of mycobacterial nucleic acids in an animal, including a human being, or in a sample, comprising administering a nucleic acid fragment of the invention to the animal or incubating the sample with the nucleic acid fragment of the invention or a nucleic acid fragment complementary thereto, and detecting the presence of hybridized nucleic acids resulting from the incubation (by using the hybridization assays which are well-known in the art), is also included in the invention. Such a method of diagnosing TB might involve the use of a composition comprising at least a part of a nucleotide sequence as defined above and detecting the presence of nucleotide sequences in a sample from the animal or human being to be tested which hybridize with the nucleic acid fragment (or a complementary fragment) by the use of PCR technique.
The fact that certain of the disclosed antigens are not present in M. bovis BCG but are present in virulent mycobacteria points them out as interesting drug targets; the antigens may constitute receptor molecules or toxins which facilitate the infection by the mycobacterium, and if such functionalities are blocked the infectivity of the mycobacterium will be diminished.
To determine particularly suitable drug targets among the antigens of the invention, the gene encoding at least one of the polypeptides of the invention and the necessary control sequences can be introduced into avirulent strains of mycobacteria (e.g. BCG) so as to determine which of the polypeptides are critical for virulence. Once particular proteins are identified as critical for/contributory to virulence, anti-mycobacterial agents can be designed rationally to inhibit expression of the critical genes or to attack the critical gene products. For instance, antibodies or fragments thereof (such as Fab and F(ab')2 fragments) can be prepared against such critical polypeptides by methods known in the art and thereafter used as prophylactic or therapeutic agents. Alternatively, small molecules can be screened for their ability to selectively inhibit expression of the critical gene products, e.g. using recombinant expression systems which include the gene's endogenous promoter, or for their ability to directly interfere with the action of the target. These small molecules are then used as therapeutics or as prophylactic agents to inhibit mycobacterial virulence.
Alternatively, anti-mycobacterial agents which render a virulent mycobacterium avirulent can be operably linked to expression control sequences and used to transform a virulent mycobacterium. Such anti-mycobacterial agents inhibit the replication of a specified mycobacterium upon transcription or translation of the agent in the mycobacterium. Such a "newly avirulent" mycobacterium would constitute a superb alternative to the above described modified BCG for vaccine purposes since it would be immunologically very similar to a virulent mycobacterium compared to e.g. BCG.
Finally, a monoclonal or polyclonal antibody, which is specifically reacting with a polypeptide of the invention in an immuno assay, or a specific binding fragment of said antibody, is also a part of the invention. The production of such polyclonal antibodies requires that a suitable animal be immunized with the polypeptide and that these antibodies are subsequently isolated, suitably by immune affinity chromatography. The production of monoclonals can be effected by methods well-known in the art, since the present invention provides for adequate amounts of antigen for both immunization and screening of positive hybridomas.
LEGENDS TO THE FIGURES
Fig. 1: Long term memory immune mice are very efficiently protected towards an infection with M. tuberculosis. Mice were given a challenge of M. tuberculosis and spleens were isolated at different time points. Spleen lymphocytes were stimulated in vitro with ST-CF and the release of IFN-γ investigated (panel A). The counts of CFU in the spleens of the two groups of mice are indicated in panel B. The memory immune mice control infection within the first week and produce large quantities of IFN-γ in response to antigens in ST-CF.
Fig. 2: T cells involved in protective immunity are predominantly directed to molecules from 6-12 and 17-38 kDa. Splenic T cells were isolated four days after the challenge with M. tuberculosis and stimulated in vitro with narrow molecular mass fractions of ST-CF. The release of IFN-γ was investigated.
Fig. 3: Nucleotide sequence (SEQ ID NO: 1) of cfp7. The deduced amino acid sequence (SEQ ID NO: 2) of CFP7 is given in conventional one-letter code below the nucleotide sequence. The putative ribosome-binding site is written in underlined italics as are the putative -10 and -35 regions. Nucleotides written in bold are those encoding CFP7.
Fig. 4: Nucleotide sequence (SEQ ID NO: 3) of cfp9. The deduced amino acid sequence (SEQ ID NO: 4) of CFP9 is given in conventional one-letter code below the nucleotide sequence. The putative ribosome-binding site (Shine-Dalgarno sequence) is written in underlined italics as are the putative -10 and -35 regions. Nucleotides in bold writing are those encoding CFP9. The nucleotide sequence obtained from the lambda 226 phage is double underlined.
Fig. 5: Nucleotide sequence of mpt51. The deduced amino acid sequence of MPT51 is given in a one-letter code below the nucleotide sequence. The signal is indicated in italics. The putative ribosome-binding site is underlined. The nucleotide difference and amino acid difference compared to the nucleotide sequence of MPB51 (Ohara et al., 1995) are underlined at position 780. The nucleotides given in italics are not present in M. tuberculosis H37Rv.
Fig. 6: The positions of the purified antigens in the 2DE system have been determined and mapped in a reference gel. The newly purified antigens are encircled and the positions of well-known proteins are also indicated.
EXAMPLE 1
Identification of single culture filtrate antigens involved in protective immunity
A group of efficiently protected mice was generated by infecting 8-12 week old female C57BL/6J mice with 5 x 10^4 M. tuberculosis i.v. After 30 days of infection the mice were subjected to 60 days of antibiotic treatment with isoniazid and were then left for 200-240 days to ensure the establishment of resting long-term memory immunity. Such memory immune mice are very efficiently protected against a secondary infection (Fig. 1). Long lasting immunity in this model is mediated by a population of highly reactive CD4 cells recruited to the site of infection and triggered to produce large amounts of IFN-γ in response to ST-CF (Fig. 1) (Andersen et al. 1995).
We have used this model to identify single antigens recognized by protective T cells.
Memory immune mice were reinfected with 1 x 10^6 M. tuberculosis i.v. and splenic lymphocytes were harvested at day 4-6 of reinfection, a time point where this population is highly reactive to ST-CF. The antigens recognized by these T cells were mapped by the multi-elution technique (Andersen and Heron, 1993). This technique divides complex protein mixtures separated in SDS-PAGE into narrow fractions in a physiological buffer. These fractions were used to stimulate spleen lymphocytes in vitro and the release of IFN-γ was monitored (Fig. 2). Long-term memory immune mice did not recognize these fractions before TB infection, but splenic lymphocytes obtained during the recall of protective immunity recognized a range of culture filtrate antigens and peak production of IFN-γ was found in response to proteins of apparent molecular weight 6-12 and 17-30 kDa (Fig. 2). It is therefore concluded that culture filtrate antigens within these regions are the major targets recognized by memory effector T-cells triggered to release IFN-γ during the first phase of a protective immune response.
EXAMPLE 2
Cloning of genes expressing low mass culture filtrate antigens
In example 1 it was demonstrated that antigens in the low molecular mass fraction are recognized strongly by cells isolated from memory immune mice. Monoclonal antibodies (mAbs) to these antigens were therefore generated by immunizing with the low mass fraction in RIBI adjuvant (first and second immunization) followed by two injections with the fractions in aluminium hydroxide. Fusion and cloning of the reactive cell lines were done according to standard procedures (Kohler and Milstein 1975). The procedure resulted in the provision of two mAbs: ST-3 directed to a 9 kDa culture filtrate antigen (CFP9) and PV-2 directed to a 7 kDa antigen (CFP7), when the molecular weight is estimated from migration of the antigens in an SDS-PAGE.
In order to identify the antigens binding to the mAbs, the following experiments were carried out: The recombinant λgt11 M. tuberculosis DNA library constructed by R. Young (Young, R.A. et al. 1985) and obtained through the World Health Organization IMMTUB programme (WHO.0032.wibr) was screened for phages expressing gene products which would bind the monoclonal antibodies ST-3 and PV-2.
Approximately 1 x 10" pfu of the gene library (containing approximately 25% recombinant phages) were plated on Escherichia coli Y1090 (ΔlacU169, proA+, Δlon, araD139, supF, trpC22::Tn10 [pMC9] ATCC#37197) in soft agar and incubated for 2.5 hours at 42°C.
The plates were overlaid with sheets of nitrocellulose saturated with isopropyl-β-D-thiogalactopyranoside and incubation was continued for 2.5 hours at 37°C. The nitrocellulose was removed and incubated with samples of the monoclonal antibodies in PBS with Tween 20 added to a final concentration of 0.05%. Bound monoclonal antibodies were visualized by horseradish peroxidase-conjugated rabbit anti-mouse immunoglobulins (P260, Dako, Glostrup, DK) and a staining reaction involving tetramethylbenzidine and H2O2. Positive plaques were recloned and the phages originating from a single plaque were used to lysogenize E. coli Y1089 (ΔlacU169, proA+, Δlon, araD139, strA, hfl150 [pMC9] ATCC no. 37196). The resultant lysogenic strains were used to propagate phage particles for DNA extraction. These lysogenic E. coli strains have been named: AA226 (expressing ST-3 reactive polypeptide CFP9), which was deposited on 28 June 1993 with the collection of Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM) under the accession number DSM 8377 and in accordance with the provisions of the Budapest Treaty, and AA242 (expressing PV-2 reactive polypeptide CFP7), which was deposited on 28 June 1993 with the collection of Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM) under the accession number DSM 8379 and in accordance with the provisions of the Budapest Treaty.
These two lysogenic E. coli strains are disclosed in WO 95/01441 as are the mycobacterial polypeptide products expressed thereby. However, no information concerning the amino acid sequences of these polypeptides or their genetic origin is given, and therefore only the direct expression products of AA226 and AA242 are made available to the public.
The ST-3 binding protein is expressed as a protein fused to β-galactosidase, whereas the PV-2 binding protein appears to be expressed in an unfused version.
Sequencing of the nucleotide sequence encoding the PV-2 and ST-3 binding protein
In order to obtain the nucleotide sequence of the gene encoding the PV-2 binding protein, the approximately 3 kb M. tuberculosis derived EcoRI-EcoRI fragment from AA242 was subcloned in the EcoRI site in the pBluescriptSK (Stratagene) and used to transform E. coli XL-1 Blue (Stratagene).
Similarly, to obtain the nucleotide sequence of the gene encoding the ST-3 binding protein, the approximately 5 kb M. tuberculosis derived EcoRI-EcoRI fragment from AA226 was subcloned in the EcoRI site in the pBluescriptSK (Stratagene) and used to transform E. coli XL-1 Blue (Stratagene).
The complete DNA sequences of both genes were obtained by the dideoxy chain termination method adapted for supercoiled DNA by use of the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., Cleveland, OH) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. The DNA sequences are shown in SEQ ID NO: 1 (CFP7) and in SEQ ID NO: 3 (CFP9) as well as in Figs. 3 and 4, respectively. Both strands of the DNA were sequenced.
CFP7
An open reading frame (ORF) encoding a sequence of 96 amino acid residues was identified from an ATG start codon at position 91-93 extending to a TAG stop codon at position 379-381. The deduced amino acid sequence is shown in SEQ ID NO: 2 (and in Fig. 3 where conventional one-letter amino acid codes are used).
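The stated coordinates for the CFP7 ORF can be checked with simple arithmetic: positions 91 to 381 inclusive span 291 nucleotides, i.e. 97 codons, of which the last is the stop codon, leaving 96 encoded residues. The short Python sketch below is only an illustration of that calculation; the function name is hypothetical and the code is not part of the original disclosure.

```python
def orf_summary(first_start_nt, last_stop_nt):
    """Return (length in bp, number of encoded residues) for an ORF given
    the 1-based position of the first nucleotide of the start codon and
    the last nucleotide of the stop codon (both inclusive)."""
    length_bp = last_stop_nt - first_start_nt + 1
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_bp // 3
    # The stop codon is counted in the length but encodes no residue.
    return length_bp, codons - 1


# CFP7: ATG at 91-93, TAG at 379-381 (coordinates taken from the text)
length_bp, residues = orf_summary(91, 381)
print(length_bp, residues)
```

The resulting 291 bp and 96 residues agree with the figures given for the cfp7 ORF in the text.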
CFP7 appears to be expressed in E. coli as an unfused version. The nucleotide sequence at position 78-84 is expected to be the Shine-Dalgarno sequence and the sequences from position 47-50 and 14-19 are expected to be the -10 and -35 regions, respectively.
CFP9
The protein recognised by ST-3 was produced as a β-galactosidase fusion protein, when expressed from the AA226 lambda phage. The fusion protein had an approximate size of 116-117 kDa (Mw for β-galactosidase: 116.25 kDa), which may suggest that only part of the CFP9 gene was included in the lambda clone (AA226).
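The size argument can be made explicit with a rough calculation. The sketch below is illustrative only and rests on two assumptions that are not part of the original disclosure: an average residue mass of about 110 Da (a common rule of thumb) and the use of the apparent 9 kDa SDS-PAGE mass of CFP9 as the mass a complete fusion partner would add. The function names are hypothetical.

```python
BETA_GAL_KDA = 116.25      # Mw of beta-galactosidase as given in the text
AVG_RESIDUE_KDA = 0.110    # assumed average mass per amino acid residue


def expected_full_fusion_kda(antigen_kda):
    """Mass expected if the entire antigen were fused to beta-galactosidase."""
    return BETA_GAL_KDA + antigen_kda


def residues_beyond_beta_gal(observed_kda):
    """Rough number of extra residues implied by the observed fusion size."""
    extra_kda = max(observed_kda - BETA_GAL_KDA, 0.0)
    return extra_kda / AVG_RESIDUE_KDA


# A complete 9 kDa CFP9 fused to beta-galactosidase
print(expected_full_fusion_kda(9.0))
# Upper end of the observed 116-117 kDa range
print(round(residues_beyond_beta_gal(117.0), 1))
```

Under these assumptions a complete fusion would run at roughly 125 kDa, whereas the observed size leaves room for only a handful of additional residues, which is consistent with only part of the gene being present in the clone.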
Based on the 90 bp nucleotide sequence obtained on the insert from lambda phage AA226, a search of homology to the nucleotide sequence of the M. tuberculosis genome was performed in the Sanger database (Sanger Mycobacterium tuberculosis database: http://www.sanger.ac.uk/pathogens/TB-blast-server.html; Williams, 1996). 100% identity to the cloned sequence was found on the MTCY48 cosmid. An open reading frame (ORF) encoding a sequence of 109 amino acid residues was identified from a GTG start codon at position 141-143 extending to a TGA stop codon at position 465-467. The deduced amino acid sequence is shown in Fig. 4 using conventional one letter code.
The nucleotide sequence at position 123-130 is expected to be the Shine-Dalgarno sequence and the sequences from position 73-78 and 4-9 are expected to be the -10 and -35 regions, respectively (Fig. 4). The ORF overlapping with the 5'-end of the sequence of AA229 is shown in Fig. 4 by double underlining.
Subcloning CFP7 and CFP9 in expression vectors
The two ORFs encoding CFP7 and CFP9 were PCR cloned into the pMST24 (Theisen et al., 1995) or the pQE-32 (QIAGEN) expression vector, giving pRVN01 and pRVN02, respectively.
The PCR amplification was carried out in a thermal reactor (Rapid cycler, Idaho Technology, Idaho) by mixing 10 ng plasmid DNA with the mastermix (0.5 µM of each oligonucleotide primer, 0.25 µM BSA (Stratagene), low salt buffer (20 mM Tris-HCl, pH 8.8, 10 mM KCl, 10 mM (NH4)2SO4, 2 mM MgSO4 and 0.1% Triton X-100) (Stratagene), 0.25 mM of each deoxynucleoside triphosphate and 0.5 U Taq Plus Long DNA polymerase (Stratagene)). Final volume was 10 µl (all concentrations given are concentrations in the final volume). Predenaturation was carried out at 94°C for 30 s. Cycles of the following were then performed: denaturation at 94°C for 30 s, annealing for 30 s and elongation at 72°C for 1 min.
The oligonucleotide primers were synthesised automatically on a DNA synthesizer (Applied Biosystems, Foster City, CA, ABI-391, PCR-mode), deblocked, and purified by ethanol precipitation.
The cfp7 oligonucleotides (TABLE 1) were synthesised on the basis of the nucleotide sequence from the CFP7 sequence (Fig. 3). The oligonucleotides were engineered to include a SmaI restriction enzyme site at the 5' end and a BamHI restriction enzyme site at the 3' end for directed subcloning.
The cfp9 oligonucleotides (TABLE 1) were synthesised partly on the basis of the nucleotide sequence of the AA229 clone and partly on the identical sequence found in the Sanger database cosmid MTCY48 (Fig. 4). The oligonucleotides were engineered to include a SmaI restriction enzyme site at the 5' end and a HindIII restriction enzyme site at the 3' end for directed subcloning.
CFP7
By the use of PCR a SmaI site was engineered immediately 5' of the first codon of the ORF of 291 bp, encoding the cfp7 gene, so that only the coding region would be expressed, and a BamHI site was incorporated right after the stop codon at the 3' end.
The 291 bp PCR fragment was cleaved by SmaI and BamHI, purified from an agarose gel and subcloned into the SmaI-BamHI sites of the pMST24 expression vector. Vector DNA containing the gene fusion was used to transform E. coli XL1-Blue (pRVN01).
CFP9
By the use of PCR a SmaI site was engineered immediately 5' of the first codon of an ORF of 327 bp, encoding the cfp9 gene, so that only the coding region would be expressed, and a HindIII site was incorporated after the stop codon at the 3' end. The 327 bp PCR fragment was cleaved by SmaI and HindIII, purified from an agarose gel, and subcloned into the SmaI-HindIII sites of the pQE-32 (QIAGEN) expression vector.
Vector DNA containing the gene fusion was used to transform E. coli XL1-Blue (pRVN02).
Purification of recombinant CFP7 and CFP9
The ORFs were fused N-terminally to the (His)6-tag (cf. EP-A-0 282 242). Recombinant antigen was prepared as follows: Briefly, a single colony of E. coli harbouring either the pRVN01 or the pRVN02 plasmid was inoculated into Luria-Bertani broth containing 100 µg/ml ampicillin and 12.5 µg/ml tetracycline and grown at 37°C to an OD600nm of 0.5. IPTG (isopropyl-β-D-thiogalactoside) was then added to a final concentration of 2 mM (expression was regulated either by the strong IPTG-inducible Ptac or the T5 promoter) and growth was continued for a further 2 hours. The cells were harvested by centrifugation at 4,200 x g at 4°C for 8 min. The pelleted bacteria were stored overnight at -20°C. The pellet was resuspended in BC 40/100 buffer (20 mM Tris-HCl pH 7.9, 20% glycerol, 100 mM KCl, 40 mM imidazole) and the cells were broken by sonication (5 times for 30 s with intervals of 30 s) at 4°C. Following centrifugation at 12,000 x g for 30 min at 4°C, the supernatant (crude extract) was used for purification of the recombinant antigens.
The two histidine fusion proteins (His-rCFP7 and His-rCFP9) were purified from the crude extract by affinity chromatography on a Ni2+-NTA column from QIAGEN with a volume of 100 ml. His-rCFP7 and His-rCFP9 bind to Ni2+. After extensive washes of the column in BC 40/100 buffer, the fusion protein was eluted with a BC 1000/100 buffer containing 100 mM imidazole, 20 mM Tris pH 7.9, 20% glycerol and 1 M KCl.
Subsequently, the purified products were dialysed extensively against 10 mM Tris pH 8.0. His-rCFP7 and His-rCFP9 were then separated from contaminants by fast protein liquid chromatography (FPLC) over an anion-exchange column (Mono Q, Pharmacia, Sweden) in 10 mM Tris pH 8.0 with a linear gradient of NaCl from 0 to 1 M. Aliquots of the fractions were analyzed by 10%-20% gradient sodium dodecyl sulphate polyacrylamide gel electrophoresis (SDS-PAGE). Fractions containing either purified His-rCFP7 or purified His-rCFP9 were pooled.
TABLE 1. Sequence of the cfp7 and cfp9 oligonucleotides a.

Orientation and oligonucleotide    Sequences                                                 Position b (nucleotide)
Sense
  pvR3                             GCAACACCCGGGATGTCGCAAATCATG (SEQ ID NO: 43)               91-105 (SEQ ID NO: 1)
  stR2                             GTAACACCCGGGGTGGCCGCCGACCCG (SEQ ID NO: 44)               141-155 (SEQ ID NO: 3)
Antisense
  pvF4                             CTACTAAGCTTGGATCCCTAGCCGCCCCATTTGGCGG (SEQ ID NO: )       381-362 (SEQ ID NO: 1)
  stF2                             CTACTAAGCTTCCATGGTCAGGTCTTTTCGATGCTTAC (SEQ ID NO: 46)    467-447 (SEQ ID NO: 3)

a The cfp7 oligonucleotides were based on the nucleotide sequence shown in Fig. 3 (SEQ ID NO: 1). The cfp9 oligonucleotides were based on the nucleotide sequence shown in Fig. 4 (SEQ ID NO: 3). Nucleotides underlined are not contained in the nucleotide sequence of cfp7 and cfp9.
b The positions referred to are of the non-underlined part of the primers and correspond to the nucleotide sequence shown in Fig. 3 and Fig. 4, respectively.
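The engineered sites described for the TABLE 1 primers can be checked directly against the primer sequences. The following is a minimal sketch, not part of the original disclosure: the helper names (`sites_in`, `codon_after_sites`, `reverse_complement`) are hypothetical, and the recognition sequences are the standard ones for the enzymes named in the text (SmaI CCCGGG, BamHI GGATCC, HindIII AAGCTT, NcoI CCATGG).

```python
# Recognition sequences (5'->3') for the enzymes named in the text.
SITES = {
    "SmaI": "CCCGGG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "NcoI": "CCATGG",
}

# Primers as listed in TABLE 1, with hyphenated line breaks joined.
PRIMERS = {
    "pvR3": "GCAACACCCGGGATGTCGCAAATCATG",
    "stR2": "GTAACACCCGGGGTGGCCGCCGACCCG",
    "pvF4": "CTACTAAGCTTGGATCCCTAGCCGCCCCATTTGGCGG",
    "stF2": "CTACTAAGCTTCCATGGTCAGGTCTTTTCGATGCTTAC",
}


def sites_in(primer):
    """Return {enzyme: 0-based offset} for every recognition site present."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def codon_after_sites(primer):
    """Return the three bases that directly follow the last recognition site."""
    end = max(pos + len(SITES[name]) for name, pos in sites_in(primer).items())
    return primer[end:end + 3]


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


for name, seq in PRIMERS.items():
    print(name, sorted(sites_in(seq)), codon_after_sites(seq))
```

In the sense primers the bases after the SmaI site are the start codon (ATG for cfp7, GTG for cfp9). In the antisense primers the bases after the last site are the reverse complement of the stop codon (TAG for cfp7, TGA for cfp9).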
EXAMPLE 2A
Identification of antigens which are not expressed in BCG strains.
In an effort to control the threat of TB, attenuated bacillus Calmette-Guérin (BCG) has been used as a live attenuated vaccine. BCG is an attenuated derivative of a virulent Mycobacterium bovis. The original BCG from the Pasteur Institute in Paris, France was developed from 1908 to 1921 by 231 passages in liquid culture and has never been shown to revert to virulence in animals, indicating that the attenuating mutation(s) in BCG are stable deletions and/or multiple mutations which do not readily revert. While physiological differences between BCG and M. tuberculosis and M. bovis have been noted, the attenuating mutations which arose during serial passage of the original BCG strain were unknown until recently. The first mutations described were the loss of the gene encoding MPB64 in some BCG strains (Li et al., 1993; Oettinger and Andersen, 1994) and of the gene encoding ESAT-6 in all BCG strains tested (Harboe et al., 1996); later, 3 large deletions in BCG were identified (Mahairas et al., 1996). The region named RD1 includes the gene encoding ESAT-6, and another (RD2) the gene encoding MPT64. Both antigens have been shown to have diagnostic potential, and ESAT-6 has been shown to have properties as a vaccine candidate (cf. PCT/DK94/00273 and PCT/DK/00270). In order to find new M. tuberculosis specific diagnostic antigens as well as antigens for a new vaccine against TB, the RD1 region (17,499 bp) of M. tuberculosis H37Rv has been analyzed for open reading frames (ORFs). ORFs with a minimum length of 96 bp were predicted using the algorithm described by Borodovsky and McIninch (1993). In total 27 ORFs were predicted, a number of which have possible diagnostic and/or vaccine potential, as they are deleted from all known BCG strains. The predicted ORFs include ESAT-6 (RD1-ORF7) and RD1-ORF6, described previously (Sørensen et al., 1995), which served as a positive control for the ability of the algorithm. The present example describes the potential of 7 of the predicted antigens for diagnosis of TB as well as their potential as candidates for a new vaccine against TB.
Seven open reading frames (ORFs) from the 17,499 bp RD1 region (Accession no. U34848) with possible diagnostic and vaccine potential have been identified and cloned.
Identification of the ORFs rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a, and rd1-orf9b.
The nucleotide sequence of rd1-orf2 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 71. The deduced amino acid sequence of RD1-ORF2 is set forth in SEQ ID NO: 72.
The nucleotide sequence of rd1-orf3 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 87. The deduced amino acid sequence of RD1-ORF3 is set forth in SEQ ID NO: 88.
The nucleotide sequence of rd1-orf4 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 89. The deduced amino acid sequence of RD1-ORF4 is set forth in SEQ ID NO: 90.
The nucleotide sequence of rd1-orf5 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 91. The deduced amino acid sequence of RD1-ORF5 is set forth in SEQ ID NO: 92.
The nucleotide sequence of rd1-orf8 from M. tuberculosis H37Rv is set forth in SEQ ID NO: 67. The deduced amino acid sequence of RD1-ORF8 is set forth in SEQ ID NO: 68.
The nucleotide sequence of rd1-orf9a from M. tuberculosis H37Rv is set forth in SEQ ID NO: 93. The deduced amino acid sequence of RD1-ORF9a is set forth in SEQ ID NO: 94.
The nucleotide sequence of rd1-orf9b from M. tuberculosis H37Rv is set forth in SEQ ID NO: 69. The deduced amino acid sequence of RD1-ORF9b is set forth in SEQ ID NO: 70.
The DNA sequence rd1-orf2 (SEQ ID NO: 71) contained an open reading frame starting with an ATG codon at position 889-891 and ending with a termination codon (TAA) at position 2662-2664 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 72) contains 591 residues corresponding to a molecular weight of 64,525.
The DNA sequence rd1-orf3 (SEQ ID NO: 87) contained an open reading frame starting with an ATG codon at position 2807-2809 and ending with a termination codon (TAA) at position 3101-3103 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 88) contains 98 residues corresponding to a molecular weight of 9,799.
The DNA sequence rd1-orf4 (SEQ ID NO: 89) contained an open reading frame starting with a GTG codon at position 4014-4012 and ending with a termination codon (TAG) at position 3597-3595 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 90) contains 139 residues corresponding to a molecular weight of 14,210.
The DNA sequence rd1-orf5 (SEQ ID NO: 91) contained an open reading frame starting with a GTG codon at position 3128-3130 and ending with a termination codon (TGA) at position 4241-4243 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 92) contains 371 residues corresponding to a molecular weight of 37,647.
The DNA sequence rd1-orf8 (SEQ ID NO: 67) contained an open reading frame starting with a GTG codon at position 5502-5500 and ending with a termination codon (TAG) at position 5084-5082 (position numbers referring to the location in RD1), and the deduced amino acid sequence (SEQ ID NO: 68) contains 139 residues with a molecular weight of 11,737.
The DNA sequence rd1-orf9a (SEQ ID NO: 93) contained an open reading frame starting with a GTG codon at position 6146-6148 and ending with a termination codon (TAA) at position 7070-7072 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 94) contains 308 residues corresponding to a molecular weight of 33,453.
The DNA sequence rd1-orf9b (SEQ ID NO: 69) contained an open reading frame starting with an ATG codon at position 5072-5074 and ending with a termination codon (TAA) at position 7070-7072 (position numbers referring to the location in RD1). The deduced amino acid sequence (SEQ ID NO: 70) contains 666 residues corresponding to a molecular weight of 70,650.
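The residue counts quoted for the RD1 ORFs follow from the start and stop coordinates: the span in base pairs divided by three gives the number of codons, and the stop codon is then subtracted. A minimal sketch of that arithmetic is given below. The function name `orf_residues` and the dictionary layout are illustrative assumptions; the coordinates and residue counts are the ones stated in the text.

```python
def orf_residues(start_first, stop_last):
    """Residues encoded by an ORF, given the first base of the start codon
    and the last base of the stop codon. Works for either strand because
    only the absolute span is used."""
    span = abs(stop_last - start_first) + 1
    if span % 3:
        raise ValueError(f"{span} bp is not a whole number of codons")
    return span // 3 - 1  # the stop codon encodes no residue


# (first base of start codon, last base of stop codon, residues stated in the text)
ORFS = {
    "rd1-orf2": (889, 2664, 591),
    "rd1-orf3": (2807, 3103, 98),
    "rd1-orf4": (4014, 3595, 139),   # reverse strand
    "rd1-orf5": (3128, 4243, 371),
    "rd1-orf9a": (6146, 7072, 308),
    "rd1-orf9b": (5072, 7072, 666),
}

for name, (start, stop, stated) in ORFS.items():
    print(name, orf_residues(start, stop), stated)
```

The rd1-orf8 coordinates as printed (5502 to 5082) span 421 bp, which is not a multiple of three, so the helper raises `ValueError` for them and they are left out of the dictionary.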
Cloning of the ORFs rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a, and rd1-orf9b.
The ORFs rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b were PCR cloned in the pMST24 (Theisen et al., 1995) (rd1-orf3) or the pQE32 (QIAGEN) (rd1-orf2, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b) expression vector. Preparation of oligonucleotides and PCR amplification of the rd1-orf encoding genes were carried out as described in example 2. Chromosomal DNA from M. tuberculosis H37Rv was used as template in the PCR reactions. Oligonucleotides were synthesized on the basis of the nucleotide sequence from the RD1 region (Accession no. U34848). The oligonucleotide primers were engineered to include a restriction enzyme site at the 5' end and at the 3' end to allow subsequent subcloning. Primers are listed in TABLE 2.
rd1-orf2. A BamHI site was engineered immediately 5' of the first codon of rd1-orf2, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf2 was subcloned in pQE32, giving pTO96.
rd1-orf3. A SmaI site was engineered immediately 5' of the first codon of rd1-orf3, and a NcoI site was incorporated right after the stop codon at the 3' end. The gene rd1-orf3 was subcloned in pMST24, giving pTO87.
rd1-orf4. A BamHI site was engineered immediately 5' of the first codon of rd1-orf4, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf4 was subcloned in pQE32, giving pTO89.
rd1-orf5. A BamHI site was engineered immediately 5' of the first codon of rd1-orf5, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf5 was subcloned in pQE32, giving pTO88.
rd1-orf8. A BamHI site was engineered immediately 5' of the first codon of rd1-orf8, and a NcoI site was incorporated right after the stop codon at the 3' end. The gene rd1-orf8 was subcloned in pMST24, giving pTO98.
rd1-orf9a. A BamHI site was engineered immediately 5' of the first codon of rd1-orf9a, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf9a was subcloned in pQE32, giving pTO91.
rd1-orf9b. A ScaI site was engineered immediately 5' of the first codon of rd1-orf9b, and a HindIII site was incorporated right after the stop codon at the 3' end. The gene rd1-orf9b was subcloned in pQE32.
The PCR fragments were digested with the suitable restriction enzymes, purified from an agarose gel and cloned into either pMST24 or pQE-32. The seven constructs were used to transform E. coli XL1-Blue. Endpoints of the gene fusions were determined by the dideoxy chain termination method. Both strands of the DNA were sequenced.
Purification of recombinant RD1-ORF2, RD1-ORF3, RD1-ORF4, RD1-ORF5, RD1-ORF8, RD1-ORF9a and RD1-ORF9b.
The rRD1-ORFs were fused N-terminally to the (His)6-tag. Recombinant antigen was prepared as described in example 2 (with the exception that pTO91 was expressed at a temperature other than 37°C) using a single colony of E. coli harbouring either the pTO87, pTO88, pTO89, pTO90, pTO91, pTO96 or pTO98 plasmid for inoculation. Purification of recombinant antigen by Ni2+ affinity chromatography was also carried out as described in example 2. Fractions containing purified His-rRD1-ORF2, His-rRD1-ORF3, His-rRD1-ORF4, His-rRD1-ORF5, His-rRD1-ORF8, His-rRD1-ORF9a or His-rRD1-ORF9b were pooled. The His-rRD1-ORFs were extensively dialysed against 10 mM Tris/HCl, pH 8.5, 3 M urea, followed by an additional purification step performed on an anion exchange column (Mono Q) using fast protein liquid chromatography (FPLC) (Pharmacia, Uppsala, Sweden). The purification was carried out in 10 mM Tris/HCl, pH 8.5, 3 M urea, and protein was eluted by a linear gradient of NaCl from 0 to 1 M. Fractions containing the His-rRD1-ORFs were pooled and subsequently dialysed extensively against 25 mM Hepes, pH 8.0 before use.
TABLE 2. Sequence of the rd1-orf oligonucleotides a.

Orientation and oligonucleotide    Sequences (5'-3')                    Position (nt)
Sense
  RD1-ORF2f                        CTGGGGATCCGCATGACTGCTGAACCG          886-903
  RD1-ORF3f                        CTTCCCGGGATGGAAAAAATGTCAC            2807-2822
  RD1-ORF4f                        GTAGGATCCTAGGAGACATCAGCGGG           4028-4015
  RD1-ORF5f                        CTGGGGATCCGCGTGATCACCATGGTGTGG       3028-3045
  RD1-ORF8f                        CTCGGATCCTGTGGGTGCAGGTCCGGCGATGGGC   5502-5479
  RD1-ORF9af                       GTGATGTGAGCTCAGGTGAAGAAGGTGAAG       6144-6160
  RD1-ORF9bf                       GTGATGTGAGCTCCTATGGCGGCCGACTAGGAG    5072-5089
Antisense
  RD1-ORF2r                        TGCAAGCTTTTAACCGGGGCTTGGGGGTGC       2664-2644
  RD1-ORF3r                        GATGCCATGGTTAGGCGAAGACGCCGGC         3103-3086
  RD1-ORF4r                        CGATCTAAGCTTGGCAATGGAGGTCTA          3582-3597
  RD1-ORF5r                        TGCAAGCTTTCAGCAGTCGTCCTCTTCGTC       4243-4223
  RD1-ORF8r                        CTGCCATGGCTACGACAAGCTCTTCCGGCCGC     5083-5105
  RD1-ORF9a/br                     CGATCTAAGCTTTCAACGACGTCGAGGC         7073-7056

a The oligonucleotides were constructed from the Accession number U34848 nucleotide sequence (Mahairas et al., 1996). Nucleotides (nt) underlined are not contained in the nucleotide sequence of the RD1-ORFs. The positions correspond to the nucleotide sequence of Accession number U34848.
The nucleotide sequences of rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a, and rd1-orf9b from M. tuberculosis H37Rv are set forth in SEQ ID NO: 71, 87, 89, 91, 67, 93, and 69, respectively. The deduced amino acid sequences of rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a, and rd1-orf9b are set forth in SEQ ID NO: 72, 88, 90, 92, 68, 94, and 70, respectively.
EXAMPLE 3
Cloning of the genes expressing 17-30 kDa antigens from ST-CF
Isolation of CFP17, CFP20, CFP21, CFP22, CFP25, and CFP28
ST-CF was precipitated with ammonium sulphate at 80% saturation. The precipitated proteins were removed by centrifugation and, after resuspension, washed with 8 M urea. CHAPS was added to a final concentration of 0.5%, glycerol was also added, and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad). The Rotofor Cell had been equilibrated with an 8 M urea buffer containing CHAPS, 5% glycerol, 3% Biolyt 3/5 and 1% Biolyt 4/6 (BioRad). Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed on silver-stained 10-20% SDS-PAGE. Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut-off membrane to a final volume of 1-3 ml. An equal volume of SDS-containing sample buffer was added and the protein solution was boiled for 5 min before further separation on a Prep Cell (BioRad) in a matrix of 16% polyacrylamide under an electrical gradient. Fractions containing pure proteins with a molecular mass of 17-30 kDa were collected.
Isolation of CFP29
Anti-CFP29, reacting with CFP29, was generated by immunization of BALB/c mice with crushed gel pieces in RIBI adjuvant (first and second immunization) or aluminium hydroxide (third immunization and boosting) at two-week intervals. SDS-PAGE gel pieces containing 2-5 µg of CFP29 were used for each immunization. Mice were boosted with antigen 3 days before removal of the spleen. A monoclonal cell line producing antibodies against CFP29 was generated essentially as described by Köhler and Milstein (1975). Screening of supernatants from growing clones was carried out by immunoblotting of nitrocellulose strips containing ST-CF separated by SDS-PAGE. Each strip contained approximately 50 µg of ST-CF. The antibody class of anti-CFP29 was identified as IgM by the mouse monoclonal antibody isotyping kit, RPN29 (Amersham), according to the manufacturer's instructions.
CFP29 was purified by the following method: ST-CF was concentrated 10 fold by ultrafiltration, and ammonium sulphate precipitation in the 45 to 55% saturation range was performed. The pellet was redissolved in 50 mM sodium phosphate, 1.5 M ammonium sulphate, pH 8.5, and subjected to thiophilic adsorption chromatography (Porath et al., 1985) on an Affi-T gel column (Kem-En-Tec). Protein was eluted by a linear gradient of ammonium sulphate decreasing to 0 M, and fractions collected in the range 0.44 to 0.31 M ammonium sulphate were identified as CFP29-containing fractions in Western blot experiments with mAb Anti-CFP29. These fractions were pooled and anion exchange chromatography was performed on a Mono Q HR 5/5 column connected to an FPLC system (Pharmacia). The column was equilibrated with 10 mM Tris-HCl buffer and the elution was performed with a linear gradient from 0 to 500 mM NaCl.
From 400 to 500 mM sodium chloride, rather pure CFP29 was eluted. As a final purification step the Mono Q fractions containing CFP29 were loaded on a 12.5% SDS-PAGE gel and pure CFP29 was obtained by the multi-elution technique (Andersen and Heron, 1993).
N-terminal sequencing and amino acid analysis
CFP17, CFP20, CFP21, CFP22, CFP25, and CFP28 were washed with water on a Centricon concentrator (Amicon) with cutoff at 10 kDa and then applied to a ProSpin concentrator (Applied Biosystems) where the proteins were collected on a PVDF membrane. The membrane was washed 5 times with 20% methanol before sequencing on a Procise sequencer (Applied Biosystems).
CFP29 containing fractions were blotted to PVDF membrane after tricine SDS-PAGE (Ploug et al., 1989). The relevant bands were excised and subjected to amino acid analysis (Barkholt and Jensen, 1989) and N-terminal sequence analysis on a Procise sequencer (Applied Biosystems).
The following N-terminal sequences were obtained:
For CFP17: A/S E L D A P A Q A G T E X A V (SEQ ID NO: 17)
For CFP20: N A I N T V G E (SEQ ID NO: 18)
For CFP21: D P X S D I A V V F A R G T H (SEQ ID NO: 19)
For CFP22: T N S P L A T A T A T L H T N (SEQ ID NO: )
For CFP25: A X P D A E V V F A R G R F E (SEQ ID NO: 21)
For CFP28: X I/V Q K S L E L I V/T V/F T A D/Q E (SEQ ID NO: 22)
For CFP29: M N N L Y R D L A P V T E A A W A E I (SEQ ID NO: 23)
"X" denotes an amino acid which could not be determined by the sequencing method used, whereas a "/" between two amino acids denotes that the sequencing method could not determine which of the two amino acids is the one actually present.
Cloning the gene encoding CFP29
The N-terminal sequence of CFP29 was used for a homology search in the EMBL database using the TFASTA program of the Genetics Computer Group sequence analysis software package. The search identified a protein, Linocin M18, from Brevibacterium linens that shares 74% identity with the 19 N-terminal amino acids of CFP29.
Based on this identity between the N-terminal sequence of CFP29 and the sequence of the Linocin M18 protein from Brevibacterium linens, a set of degenerate primers was constructed for PCR cloning of the M. tuberculosis gene encoding CFP29. PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1x low salt Taq+ buffer from Stratagene supplemented with 250 µM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in a 10 µl reaction volume. Reactions were initially heated to 94°C for 25 sec and run for 30 cycles of the program 94°C for 15 sec, 55°C for 15 sec and 72°C for 90 sec, using thermocycler equipment from Idaho Technology.
An approx. 300 bp fragment was obtained using primers with the sequences:
1: 5'-CCCGGCTCGAGAACCTSTACCGCGACCTSGCSCC (SEQ ID NO: 24)
2: (SEQ ID NO: )
where S = G/C and Y = T/C.
The fragment was excised from a 1% agarose gel, purified by Spin-X spin columns (Costar), cloned into pBluescript SK II+ T vector (Stratagene) and finally sequenced with the Sequenase kit from United States Biochemical.
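A degenerate primer stands for a pool of concrete sequences, one per combination of the ambiguous positions. As a minimal sketch (the function name `expand` is hypothetical, and only the two ambiguity codes defined in the text are handled), primer 1 can be expanded as follows:

```python
from itertools import product

# Ambiguity codes as defined in the text: S = G or C, Y = T or C.
DEGENERATE = {"S": "GC", "Y": "TC"}

PRIMER_1 = "CCCGGCTCGAGAACCTSTACCGCGACCTSGCSCC"


def expand(primer):
    """Return every concrete sequence represented by a degenerate primer."""
    choices = [DEGENERATE.get(base, base) for base in primer]
    return ["".join(combo) for combo in product(*choices)]


variants = expand(PRIMER_1)
print(len(variants))
for v in variants:
    print(v)
```

Primer 1 contains three S positions and no Y, so the pool holds 2 x 2 x 2 = 8 sequences.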
The first 150 bp of this sequence was used for a homology search using the Blast program of the Sanger Mycobacterium tuberculosis database (http://www.sanger.ac.uk/projects/M-tuberculosis/blast_server).
This program identified a Mycobacterium tuberculosis sequence on cosmid cy444 in the database that is nearly 100% identical to the 150 bp sequence of the CFP29 protein. The sequence is contained within a 795 bp open reading frame of which the 5' end translates into a sequence that is 100% identical to the N-terminally sequenced 19 amino acids of the purified CFP29 protein.
Finally, the 795 bp open reading frame was PCR cloned under the same PCR conditions as described above using the primers:
3: 5'-GGAAGCCCCATATGAACAATCTCTACCG (SEQ ID NO: 26)
4: 5'-CGCGCTCAGCCCTTAGTGACTGAGCGCGACCG (SEQ ID NO: 27)
The resulting DNA fragments were purified from agarose gels as described above and sequenced with primers 3 and 4 in addition to the following primers:
5: 5'-GGACGTTCAAGCGACACATCGCCG-3' (SEQ ID NO: 115)
6: 5'-CAGCACGAACGCGCCGTCGATGGC-3' (SEQ ID NO: 116)
Three independent clones were sequenced. All three clones were in 100% agreement with the sequence on cosmid cy444.
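That primer 3 starts at the beginning of the CFP29 coding region can be illustrated by translating the primer from its first ATG and comparing the result with the N-terminal sequence determined for the purified protein (M N N L Y R D L A P ...). The sketch below uses the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative and not taken from the source.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of an unambiguous DNA string; a trailing
    partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


PRIMER_3 = "GGAAGCCCCATATGAACAATCTCTACCG"
CFP29_N_TERMINAL = "MNNLYRDLAPVTEAAWAEI"  # from the N-terminal sequencing above

start = PRIMER_3.find("ATG")
peptide = translate(PRIMER_3[start:])
print(peptide, CFP29_N_TERMINAL.startswith(peptide))
```

The bases of primer 3 that precede the ATG are not translated here; they form the 5' extension of the primer.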
All other DNA manipulations were done according to Maniatis et al. (1989).
All enzymes other than Taq polymerase were from New England Biolabs.
Homology searches in the Sanger database
For CFP17, CFP20, CFP21, CFP22, CFP25, and CFP28 the N-terminal amino acid sequence from each of the proteins was used for a homology search using the blast program of the Sanger Mycobacterium tuberculosis database: http://www.sanger.ac.uk/pathogens/TB-blast-server.html.
For CFP29 the first 150 bp of the DNA sequence was used for the search. Furthermore, the EMBL database was searched for proteins with homology to CFP29.
Thereby, the following information was obtained:
CFP17
Of the 14 determined amino acids in CFP17, a 93% identical sequence was found with MTCY1A11.16c. The difference between the two sequences is in the first amino acid: it is an A or an S in the N-terminally determined sequence and an S in MTCY1A11. From the N-terminal sequencing it was not possible to determine amino acid number 13.
Within the open reading frame the translated protein is 162 amino acids long. The N-terminus of the protein purified from culture filtrate starts at amino acid 31, in agreement with the presence of a signal sequence that has been cleaved off. This gives a length of the mature protein of 132 amino acids, which corresponds to a theoretical molecular mass of 13833 Da and a theoretical pI of 4.4. The observed mass in SDS-PAGE is 17 kDa.
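The CFP17 figures (start at residue 31, 132 mature residues, 93% identity over 14 determined positions) can be reproduced by sliding the N-terminal sequence along the translated ORF. This is a minimal sketch, not the search procedure used in the source: the helper names are hypothetical, the protein sequence is the CFP17 sequence given in this document under SEQ ID NO: 6, and identity is counted the way the text counts it, with the ambiguous A/S position treated as a difference and the undetermined X position excluded.

```python
# CFP17 as listed under SEQ ID NO: 6 (whitespace removed by the join).
CFP17 = "".join("""
MTDMNPDIEK DQTSDEVTVE TTSVFRADFL SELDAPAQAG TESAVSGVEG
LPPGSALLVV KRGPNAGSRF LLDQAITSAG RHPDSDIFLD DVTVSRRHAE
FRLENNEFNV VDVGSLNGTY VNREPVDSAV LANGDEVQIG KFRLVFLTGP
KQGEDDGSTG GP
""".split())

# N-terminal sequencing result: "A/S" is ambiguous, "X" is undetermined.
N_TERMINAL = "A/S E L D A P A Q A G T E X A V".split()


def compatible(token, residue):
    """True if a sequencing token does not contradict the residue."""
    return token == "X" or residue in token.split("/")


def locate(protein, tokens):
    """1-based position of the first window compatible with all tokens."""
    for i in range(len(protein) - len(tokens) + 1):
        if all(compatible(t, protein[i + j]) for j, t in enumerate(tokens)):
            return i + 1
    return None


start = locate(CFP17, N_TERMINAL)
mature_length = len(CFP17) - start + 1
determined = [t for t in N_TERMINAL if t != "X"]
exact = sum(1 for j, t in enumerate(N_TERMINAL) if t == CFP17[start - 1 + j])
identity = round(100 * exact / len(determined))
print(start, mature_length, identity)
```

The same helper can be pointed at the other N-terminal sequences, for example to confirm that the CFP22 sequence begins at residue 8 of its translated ORF.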
CFP20
A sequence 100% identical to the 15 determined amino acids of CFP20 was found on the translated cosmid cscy09F9. A stop codon is found at amino acid 166 from the amino acid M at position 1. This gives a predicted length of 165 amino acids, which corresponds to a theoretical molecular mass of 16897 Da and a pI of 4.2. The observed molecular weight in SDS-PAGE is 20 kDa.
Searching the GenEMBL database using the TFASTA algorithm (Pearson and Lipman, 1988) revealed a number of proteins with homology to the predicted 164 amino acids long translated protein.
The highest homology, 51.5% identity in a 163 amino acid overlap, was found to a Haemophilus influenzae Rd toxR reg. (HIH10751).
CFP21
A sequence 100% identical to the 14 determined amino acids of CFP21 was found at MTCY39. From the N-terminal sequencing it was not possible to determine amino acid number 3; this amino acid is a C in MTCY39. The amino acid C cannot be detected on a Sequencer, which is probably the explanation of this difference.
Within the open reading frame the translated protein is 217 amino acids long. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 33, in agreement with the presence of a signal sequence that has been cleaved off. This gives a length of the mature protein of 185 amino acids, which corresponds to a theoretical molecular weight of 18657 Da and a theoretical pI of 4.6. The observed weight in SDS-PAGE is 21 kDa.
In a 193 amino acid overlap the protein has 32.6% identity to a cutinase precursor with a length of 209 amino acids (CUTIALTBR P41744).
A comparison of the 14 N-terminally determined amino acids with the translated region (RD2) deleted in M. bovis BCG revealed a 100% identical sequence (mb3484) (Mahairas et al. (1996)).
CFP22
A sequence 100% identical to the 15 determined amino acids of CFP22 was found at MTCY10H4. Within the open reading frame the translated protein is 182 amino acids long. The N-terminal sequence of the protein purified from culture filtrate starts at amino acid 8, and therefore the length of the protein occurring in M. tuberculosis culture filtrate is 175 amino acids. This gives a theoretical molecular weight of 18517 Da and a pI of 6.8. The observed weight in SDS-PAGE is 22 kDa.
In a 182 amino acid overlap the translated protein has 90.1% identity with E235739, a peptidyl-prolyl cis-trans isomerase.
CFP25
A sequence 93% identical to the 15 determined amino acids was found on the cosmid MTCY339.08c. The one amino acid that differs between the two sequences is a C in MTCY339.08c and an X in the N-terminal sequence data. On a Sequencer a C cannot be detected, which is a probable explanation for this difference.
The N-terminally determined sequence from the protein purified from culture filtrate begins at amino acid 33, in agreement with the presence of a signal sequence that has been cleaved off. This gives a length of the mature protein of 187 amino acids, which corresponds to a theoretical molecular weight of 19665 Da and a theoretical pI of 4.9. The observed weight in SDS-PAGE is 25 kDa.
In a 217 amino acid overlap the protein has 42.9% identity to CFP21 (MTCY39.35).
CFP28
No homology was found when using the 10 determined amino acid residues 2-8, 11, 12, and 14 of SEQ ID NO: 22 in the database search.
CFP29
Sanger database searching: A sequence nearly 100% identical to the 150 bp sequence of the CFP29 protein was found on cosmid cy444. The sequence is contained within a 795 bp open reading frame of which the 5' end translates into a sequence that is 100% identical to the N-terminally sequenced 19 amino acids of the purified CFP29 protein. The open reading frame encodes a 265 amino acid protein.
The amino acid analysis performed on the purified protein further confirmed the identity of CFP29 with the protein encoded in the open reading frame on cosmid cy444.
EMBL database searching: The open reading frame encodes a 265 amino acid protein that is 58% identical and 74% similar to the Linocin M18 protein (61% identity on DNA level). This is a 28.6 kDa protein with bacteriocin activity (Valdés-Stauber and Scherer, 1994; Valdés-Stauber and Scherer, 1996). The two proteins have the same length (except for 1 amino acid) and share the same theoretical physicochemical properties. We therefore suggest that CFP29 is a mycobacterial homolog of the Brevibacterium linens Linocin M18 protein.
The amino acid sequences of the purified antigens as picked from the Sanger database are shown in the following list. The amino acids determined by N-terminal sequencing are marked with bold.
CFP17 (SEQ ID NO: 6):
1 MTDMNPDIEK DQTSDEVTVE TTSVFRADFL SELDAPAQAG TESAVSGVEG
51 LPPGSALLVV KRGPNAGSRF LLDQAITSAG RHPDSDIFLD DVTVSRRHAE
101 FRLENNEFNV VDVGSLNGTY VNREPVDSAV LANGDEVQIG KFRLVFLTGP
151 KQGEDDGSTG GP

CFP20 (SEQ ID NO: 8):
1 MAQITLRGNA INTVGELPAV GSPAPAFTLT GGDLGVISSD QFRGKSVLLN
51 IFPSVDTPVC ATSVRTFDER AAASGATVLC VSKDLPFAQK RFCGAEGTEN
101 VMPASAFRDS FGEDYGVTIA DGPMAGLLAR AIVVIGADGN VAYTELVPEI
151 AQEPNYEAAL AALGA

CFP21 (SEQ ID NO: ):
1 MTPRSLVRIV GVVVATTLAL VSAPAGGRAA HADPGSDIAV
41 VFARGTHQAS GLGDVGEAFV DSLTSQVGGR SIGVYAVNYP ASDDYRASAS
91 NGSDDASAHI QRTVASCPNT RIVLGGYSQG ATVIDLSTSA MPPAVADHVA
141 AVALFGEPSS GFSSMLWGGG SLPTIGPLYS SKTINLCAPD DPICTGGGNI
191 MAHVSYVQSG MTSQAATFAA NRLDHAG

CFP22 (SEQ ID NO: 12):
1 MADCDSVTNS PLATATATLH TNRGDIKIAL FGNHAPKTVA NFVGLAQGTK
51 DYSTQNASGG PSGPFYDGAV FHRVIQGFMI QGGDPTGTGR GGPGYKFADE
101 FHPELQFDKP YLLAMANAGP GTNGSQFFIT VGKTPHLNRR HTIFGEVIDA
151 ESQRVVEAIS KTATDGNDRP TDPVVIESIT IS

CFP25 (SEQ ID NO: 14):
1 MGAAAAMLAA VLLLTPITVP AGYPGAVAPA TAACPDAEVV FARGRFEPPG
51 IGTVGNAFVS ALRSKVNKNV GVYAVKYPAD NQIDVGANDM SAHIQSMANS
101 CPNTRLVPGG YSLGAAVTDV VLAVPTQMWG FTNPLPPGSD EHIAAVALFG
151 NGSQWVGPIT NFSPAYNDRT IELCHGDDPV CHPADPNTWE ANWPQHLAGA
201 YVSSGMVNQA ADFVAGKLQ

CFP29 (SEQ ID NO: 16):
1 MNNLYRDLAP VTEAAWAEIE LEAARTFKRH IAGRRVVDVS DPGGPVTAAV
51 STGRLIDVKA PTNGVIAHLR ASKPLVRLRV PFTLSRNEID DVERGSKDSD
101 WEPVKEAAKK LAFVEDRTIF EGYSAASIEG IRSASSNPAL TLPEDPREIP
151 DVISQALSEL RLAGVDGPYS VLLSADVYTK VSETSDHGYP IREHLNRLVD
201 GDIIWAPAID GAFVLTTRGG DFDLQLGTDV AIGYASHDTD TVRLYLQETL
251 TFLCYTAEAS VALSH

For all six proteins the molecular weights predicted from the sequences are in agreement with the molecular weights observed on SDS-PAGE.
Cloning of the genes encoding CFP17, CFP20, CFP21, CFP22 and CFP25
The genes encoding CFP17, CFP20, CFP21, CFP22 and CFP25 were all cloned into the expression vector pMCT6 by PCR amplification with gene-specific primers, for recombinant expression of the proteins in E. coli.
PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1X low salt Taq+ buffer from Stratagene supplemented with 250 mM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in 10 µl reaction volume. Reactions were initially heated to 94 °C for 25 sec. and run for 30 cycles according to the following program: 94 °C for 10 sec., 55 °C for 10 sec. and 72 °C for sec., using thermocycler equipment from Idaho Technology.
The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II T vector (Stratagene). Plasmid DNA was thereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMCT6 in frame with 8 histidine residues which are added to the N-terminal of the expressed proteins. The resulting clones were hereafter sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.
For cloning of the individual antigens, the following gene-specific primers were used:

CFP17: Primers used for cloning of cfp17:
OPBR-51: ACAGATCTGTGACGGACATGAACCCG (SEQ ID NO: 117)
OPBR-52: TTTTCCATGGTCACGGGCCCCCGGTACT (SEQ ID NO: 118)
OPBR-51 and OPBR-52 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP20: Primers used for cloning of cfp20:
OPBR-53: ACAGATCTGTGCCCATGGCACAGATA (SEQ ID NO: 119)
OPBR-54: TTTAAGCTTCTAGGGGCCCAGCGCGGC (SEQ ID NO: 120)
OPBR-53 and OPBR-54 create BglII and HindIII sites, respectively, used for the cloning in pMCT6.

CFP21: Primers used for cloning of cfp21:
OPBR-55: ACAGATCTGCGCATGCGGATCCGTGT (SEQ ID NO: 121)
OPBR-56: TTTTCCATGGTGATCCGGCGTGATCGAG (SEQ ID NO: 122)
OPBR-55 and OPBR-56 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP22: Primers used for cloning of cfp22:
OPBR-57: AGAGATCTGTAATGGCAGACTGTGAT (SEQ ID NO: 123)
OPBR-58: TTTTCCATGGTCAGGAGATGGTGATCGA (SEQ ID NO: 124)
OPBR-57 and OPBR-58 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP25: Primers used for cloning of cfp25:
OPBR-59: ACAGATCTGCCGGCTACCCCGGTGCC (SEQ ID NO: 125)
OPBR-60: TTTTCCATGGCTATTGCAGCTTTCCGGC (SEQ ID NO: 126)
OPBR-59 and OPBR-60 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.
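The restriction sites named for the primers above can be checked by scanning each primer for the corresponding recognition sequence. The following is a minimal sketch, not part of the original protocol: the recognition sequences are the standard ones for BglII, NcoI and HindIII, while the dictionary and function names are illustrative assumptions.

```python
# Standard recognition sequences for the enzymes named in the primer list.
# RECOGNITION_SITES and sites_in_primer are illustrative names, not from the source.
RECOGNITION_SITES = {
    "BglII": "AGATCT",
    "NcoI": "CCATGG",
    "HindIII": "AAGCTT",
}

# Primers copied from the list above.
PRIMERS = {
    "OPBR-51": "ACAGATCTGTGACGGACATGAACCCG",
    "OPBR-52": "TTTTCCATGGTCACGGGCCCCCGGTACT",
    "OPBR-54": "TTTAAGCTTCTAGGGGCCCAGCGCGGC",
}


def sites_in_primer(primer):
    """Return the names of the enzymes whose recognition site occurs in the primer."""
    seq = primer.upper()
    return [name for name, site in RECOGNITION_SITES.items() if site in seq]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, sites_in_primer(seq))
```

A plain substring search is enough here because all three recognition sequences are palindromic, so the site reads the same on either strand.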
Expression/purification of recombinant CFP17, CFP20, CFP21, CFP22 and CFP25 proteins.
Expression and metal affinity purification of recombinant proteins was undertaken essentially as described by the manufacturers. For each protein, 1 l LB medium containing 100 µg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmids. Cultures were shaken at 37 °C until they reached a density of OD600 = 0.4-0.6. IPTG was then added to a final concentration of 1 mM and the cultures were incubated for a further 4-16 hours. Cells were harvested, resuspended in 1X sonication buffer + 8 M urea and sonicated 5 X sec. with 30 sec. pausing between the pulses.
After centrifugation, the lysate was applied to a column containing 25 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.
After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analyzed by SDS-PAGE and protein concentrations were estimated at OD280. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH . Finally, the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.
EXAMPLE 3A
Identification of CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19B, CFP22A, CFP23A, CFP23B, CFP25A, CFP27, CFP30A, CWP32 and CFP50
Identification of CFP16 and CFP19B.
ST-CF was precipitated with ammonium sulphate at 80% saturation. The precipitated proteins were removed by centrifugation and after resuspension washed with 8 M urea. CHAPS and glycerol were added to a final concentration of 0.5% and 5%, respectively, and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad). The Rotofor Cell had been equilibrated with an 8 M urea buffer containing CHAPS, 5% glycerol, 3% Biolyt 3/5 and 1% Biolyt 4/6 (BioRad). Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed on silver-stained 10-20% SDS-PAGE. Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut-off membrane to a final volume of 1-3 ml. An equal volume of SDS-containing sample buffer was added and the protein solution boiled for 5 min before further separation on a Prep Cell (BioRad) in a matrix of 16% polyacrylamide under an electrical gradient. Fractions containing well separated bands in SDS-PAGE were selected for N-terminal sequencing after transfer to PVDF membrane.
Isolation of CFP8A, CFP8B, CFP19, CFP23A, and CFP23B.
ST-CF was precipitated with ammonium sulphate at 80% saturation, redissolved in PBS, pH 7.4, dialysed 3 times against 25 mM Piperazin-HCl, pH 5.5, and subjected to chromatofocusing on a matrix of PBE 94 (Pharmacia) in a column connected to an FPLC system (Pharmacia). The column was equilibrated with 25 mM Piperazin-HCl, pH and the elution was performed with 10% PB74-HCl, pH 4.0 (Pharmacia). Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut-off membrane to a final volume of 1-3 ml and separated on a Prep Cell as described above.
Identification of CFP22A
ST-CF was concentrated approximately 10-fold by ultrafiltration and proteins were precipitated at 80% saturation, redissolved in PBS, pH 7.4, and dialysed 3 times against PBS, pH 7.4. 5.1 ml of the dialysed ST-CF was treated with RNase (0.2 mg/ml, QIAGEN) and DNase (0.2 mg/ml, Boehringer Mannheim) for 6 h and placed on top of 6.4 ml of 48% sucrose in PBS, pH 7.4, in Sorvall tubes (Ultracrimp 03987, DuPont Medical Products) and ultracentrifuged for 20 h at 257,300 x gmax, 0 C. The pellet was redissolved in 200 µl of 25 mM Tris-192 mM glycine, 0.1% SDS, pH 8.3.
Identification of CFP7A, CFP25A, CFP27, CFP30A and CFP50
For CFP27, CFP30A and CFP50, ST-CF was concentrated approximately 10-fold by ultrafiltration and ammonium sulphate precipitation in the 45 to 55% saturation range was performed. Proteins were redissolved in 50 mM sodium phosphate, 1.5 M ammonium sulphate, pH 8.5, and subjected to thiophilic adsorption chromatography on an Affi-T gel column (Kem-En-Tec). Proteins were eluted by a 1.5 to 0 M decreasing gradient of ammonium sulphate. Fractions with similar band patterns in SDS-PAGE were pooled and anion exchange chromatography was performed on a Mono Q HR 5/5 column connected to an FPLC system (Pharmacia). The column was equilibrated with mM Tris-HCl, pH 8.5, and the elution was performed with a gradient of NaCl from 0 to 1 M. Fractions containing well separated bands in SDS-PAGE were selected.
CFP7A and CFP25A were obtained as described above except for the following modification: ST-CF was concentrated approximately 10-fold by ultrafiltration and proteins were precipitated at 80% saturation, redissolved in PBS, pH 7.4, and dialysed 3 times against PBS, pH 7.4. Ammonium sulphate was added to a concentration of M, and ST-CF proteins were loaded on an Affi-T gel column. Elution from the Affi-T gel column and anion exchange were performed as described above.
Isolation of CWP32
Heat-treated H37Rv was subfractionated into subcellular fractions as described in Sørensen et al., 1995. The cell wall fraction was resuspended in 8 M urea, 0.2% (w/v) N-octyl glucopyranoside (Sigma) and 5% glycerol and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad) which was equilibrated with the same buffer. Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed by SDS-PAGE and fractions containing well separated bands were pooled and subjected to N-terminal sequencing after transfer to PVDF membrane.
N-terminal sequencing
Fractions containing CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19B, CFP22A, CFP23A, CFP23B, CFP27, CFP30A, CWP32, and CFP50A were blotted to PVDF membrane after Tricine SDS-PAGE (Ploug et al., 1989). The relevant bands were excised and subjected to N-terminal amino acid sequence analysis on a Procise 494 sequencer (Applied Biosystems). The fraction containing CFP25A was blotted to PVDF membrane after 2-DE PAGE (isoelectric focusing in the first dimension and Tricine SDS-PAGE in the second dimension). The relevant spot was excised and sequenced as described above.
The following N-terminal sequences were obtained:
CFP7A: AEDVRAEIVA SVLEVVVNEG DQIDKGDVVV LLESMYMEIP VLAEAAGTVS (SEQ ID NO: 81)
CFP8A: DPVDDAFIAKLNTAG (SEQ ID NO: 73)
CFP8B: DPVDAIINLDNYGX (SEQ ID NO: 74)
CFP16: AKLSTDELLDAFKEM (SEQ ID NO: 79)
CFP19: TTSPDPYAALPKLPS (SEQ ID NO: 82)
CFP19B: DPAXAPDVPTAAQLT (SEQ ID NO: )
CFP22A: TEYEGPKTKF HALMQ (SEQ ID NO: 83)
CFP23A: VIQ/AGMVT/GHIHXVAG (SEQ ID NO: 76)
CFP23B: AEMKXFKNAIVQEID (SEQ ID NO: )
CFP25A: AIEVSVLRVF TDSDG (SEQ ID NO: 78)
CWP32: TNIVVLIKQVPDTWS (SEQ ID NO: 77)
CFP27: TTIVALKYPG GVVMA (SEQ ID NO: 84)
CFP30A: SFPYFISPEX AMRE (SEQ ID NO: )
CFP50: THYDVVVLGA GPGGY (SEQ ID NO: 86)

N-terminal homology searching in the Sanger database and identification of the corresponding genes.
The N-terminal amino acid sequence from each of the proteins was used for a homology search using the blast program of the Sanger Mycobacterium tuberculosis database: http://www.sanger.ac.uk/projects/m-tuberculosis/TB-blast-server.
For CFP23B, CFP23A, and CFP19B no similarities were found in the Sanger database.
This could be due to the fact that only approximately 70% of the M. tuberculosis genome had been sequenced when the searches were performed. The genes encoding these proteins could be contained in the remaining 30% of the genome for which no sequence data is yet available.
For CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19B, CFP22A, CFP25A, CFP27, CWP32, and CFP50, the following information was obtained:
CFP7A: Of the 50 determined amino acids in CFP7A a 98% identical sequence was found in cosmid csCY07D1 (contig 256):
Score = 226 (100.4 bits), Expect = 1.4e-24, P = 1.4e-24
Identities = 49/50, Positives = 49/50, Frame = -1
Query:      1 AEDVRAEIVASVLEVVVNEGDQIDKGDVVVLLESMYMEIPVLAEAAGTVS
              AEDVRAEIVASVLEVVVNEGDQIDKGDVVVLLESM MEIPVLAEAAGTVS
Sbjct: 257679 AEDVRAEIVASVLEVVVNEGDQIDKGDVVVLLESMKMEIPVLAEAAGTVS 257530
(SEQ ID NOs: 127, 128, and 129)
The identity is found within an open reading frame of 71 amino acids length corresponding to a theoretical MW of CFP7A of 7305.9 Da and a pI of 3.762. The observed molecular weight in an SDS-PAGE gel is 7 kDa.
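The identity figure reported for the CFP7A alignment (49 of 50 positions, 98%) can be reproduced by a position-by-position comparison of the two aligned sequences. The sketch below is only an illustration of that arithmetic; the function name is an assumption, and it handles ungapped alignments of equal length only, which is the case here.

```python
def percent_identity(query, subject):
    """Count identical positions in an ungapped alignment of equal-length sequences.

    Returns (identical positions, alignment length, percent identity).
    """
    if len(query) != len(subject):
        raise ValueError("sequences must have the same length")
    matches = sum(q == s for q, s in zip(query, subject))
    return matches, len(query), 100.0 * matches / len(query)


# Query and Sbjct lines of the CFP7A alignment quoted above.
QUERY = "AEDVRAEIVASVLEVVVNEGDQIDKGDVVVLLESMYMEIPVLAEAAGTVS"
SBJCT = "AEDVRAEIVASVLEVVVNEGDQIDKGDVVVLLESMKMEIPVLAEAAGTVS"

if __name__ == "__main__":
    matches, length, pct = percent_identity(QUERY, SBJCT)
    print(f"{matches}/{length} identical ({pct:.0f}%)")
```

The single mismatch is at position 36 of the determined sequence (Y in the query, K in the database entry).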
CFP8A: A sequence 80% identical to the 15 N-terminal amino acids was found on contig TB_1884. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 32. This gives a length of the mature protein of 98 amino acids corresponding to a theoretical MW of 9700 Da and a pI of 3.72. This is in good agreement with the observed MW on SDS-PAGE at approximately 8 kDa. The full length protein has a theoretical MW of 12989 Da and a pI of 4.38.
CFP8B: A sequence 71% identical to the 14 N-terminal amino acids was found on contig TB_653. However, careful re-evaluation of the original N-terminal sequence data confirmed the identification of the protein. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 29. This gives a length of the mature protein of 82 amino acids corresponding to a theoretical MW of 8337 Da and a pI of 4.23. This is in good agreement with the observed MW on SDS-PAGE at approximately 8 kDa. Analysis of the amino acid sequence predicts the presence of a signal peptide which has been cleaved off the mature protein found in culture filtrate.
CFP16: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY20H1.
The identity is found within an open reading frame of 130 amino acids length corresponding to a theoretical MW of CFP16 of 13440.4 Da and a pI of 4.59. The observed molecular weight in an SDS-PAGE gel is 16 kDa.
CFP19: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY270.
The identity is found within an open reading frame of 176 amino acids length corresponding to a theoretical MW of CFP19 of 18633.9 Da and a pI of 5.41. The observed molecular weight in an SDS-PAGE gel is 19 kDa.
CFP22A: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY1A6.
The identity is found within an open reading frame of 181 amino acids length corresponding to a theoretical MW of CFP22A of 20441.9 Da and a pI of 4.73. The observed molecular weight in an SDS-PAGE gel is 22 kDa.
CFP25A: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on contig 255.
The identity is found within an open reading frame of 228 amino acids length corresponding to a theoretical MW of CFP25A of 24574.3 Da and a pI of 4.95. The observed molecular weight in an SDS-PAGE gel is 25 kDa.
CFP27: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found on cosmid MTCY261.
The identity is found within an open reading frame of 291 amino acids length. The N-terminally determined sequence from the protein purified from culture filtrate starts at amino acid 58. This gives a length of the mature protein of 233 amino acids, which corresponds to a theoretical molecular weight of 24422.4 Da and a theoretical pI of 4.64. The observed weight in an SDS-PAGE gel is 27 kDa.
CFP30A: Of the 13 determined amino acids in CFP30A, a 100% identical sequence was found on cosmid MTCY261.
The identity is found within an open reading frame of 248 amino acids length corresponding to a theoretical MW of CFP30A of 26881.0 Da and a pI of 5.41. The observed molecular weight in an SDS-PAGE gel is 30 kDa.
CWP32: The 15 amino acid N-terminal sequence was found to be 100% identical to a sequence found on contig 281. The identity was found within an open reading frame of 266 amino acids length, corresponding to a theoretical MW of CWP32 of 28083 Da and a pI of 4.563. The observed molecular weight in an SDS-PAGE gel is 32 kDa.
CFP50: The 15 aa N-terminal sequence was found to be 100% identical to a sequence found in MTVO38.06. The identity is found within an open reading frame of 464 amino acids length corresponding to a theoretical MW of CFP50 of 49244 Da and a pI of 5.66. The observed molecular weight in an SDS-PAGE gel is 50 kDa.
Use of homology searching in the EMBL database for identification of CFP19A and CFP23.
Homology searching in the EMBL database (using the GCG package of the Biobase, Arhus-DK) with the amino acid sequences of two earlier identified highly immunoreactive ST-CF proteins, using the TFASTA algorithm, revealed that these proteins (CFP21 and CFP25, EXAMPLE 3) belong to a family of fungal cutinase homologs. Among the most homologous sequences were also two Mycobacterium tuberculosis sequences found on cosmid MTCY13E12. The first, MTCY13E12.04 has 46% and 50% identity to CFP25 and CFP21 respectively. The second, MTCY13E12.05, has also 46% and identity to CFP25 and CFP21. The two proteins share 62.5% aa identity in a 184 residues overlap. On the basis of the high homology to the strong T-cell antigens CFP21 and CFP25, respectively, it is believed that CFP19A and CFP23 are possible new T-cell antigens.
The first reading frame encodes a 254 amino acid protein of which the first 26 aa constitute a putative leader peptide that strongly indicates an extracellular location of the protein. The mature protein is thus 228 aa in length corresponding to a theoretical MW of 23149.0 Da and a pI of 5.80. The protein is named CFP23.
The second reading frame encodes a 231 aa protein of which the first 44 aa constitute a putative leader peptide that strongly indicates an extracellular location of the protein. The mature protein is thus 187 aa in length corresponding to a theoretical MW of 19020.3 Da and a pI of 7.03. The protein is named CFP19A.
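The mature lengths quoted for CFP23 and CFP19A follow from subtracting the putative leader peptide from the length of the encoded precursor. A minimal sketch of that arithmetic is given here; the function name and the input checks are illustrative assumptions.

```python
def mature_length(precursor_length, leader_length):
    """Length of the mature protein once the leader peptide has been removed."""
    if not 0 <= leader_length < precursor_length:
        raise ValueError("leader must be shorter than the precursor")
    return precursor_length - leader_length


if __name__ == "__main__":
    # Precursor and leader lengths as stated in the text above.
    print("CFP23:", mature_length(254, 26))
    print("CFP19A:", mature_length(231, 44))
```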
The presence of putative leader peptides in both proteins (and thereby their presence in the ST-CF) is confirmed by theoretical sequence analysis using the SignalP program at the Expasy molecular Biology server (http://expasy.hcuge.ch/www/tools.html).
Searching for homologies to CFP7A, CFP16, CFP19, CFP19A, CFP19B, CFP22A, CFP23, CFP25A, CFP27, CFP30A, CWP32 and CFP50 in the EMBL database.
The amino acid sequences derived from the translated genes of the individual antigens were used for homology searching in the EMBL and Genbank databases using the TFASTA algorithm, in order to find homologous proteins and to address possible functional roles of the antigens.
CFP7A: CFP7A has 44% identity and 70% similarity to a hypothetical Methanococcus jannaschii protein (M. jannaschii from base 1162199-1175341), as well as 43% and 38% identity and 68% and 64% similarity to the C-terminal part of B. stearothermophilus pyruvate carboxylase and Streptococcus mutans biotin carboxyl carrier protein.
CFP7A contains a consensus sequence EAMKM for a biotin binding site motif which in this case was slightly modified (ESMKM in amino acid residues 34 to 38). By incubation with alkaline phosphatase conjugated streptavidin after SDS-PAGE and transfer to nitrocellulose it was demonstrated that native CFP7A was biotinylated.
CFP16: RplL gene, 130 aa. Identical to the M. bovis 50s ribosomal protein L7/L12 (acc. No P37381).
CFP19: CFP19 has 47% identity and 55% similarity to E. coli pectinesterase homolog (ybhC gene) in a 150 aa overlap.
CFP19A: CFP19A has between 38% and 45% identity to several cutinases from different fungal sp.
In addition CFP19A has 46% identity and 61% similarity to CFP25 as well as identity and 64% similarity to CFP21 (both proteins were isolated earlier from the ST-CF).
CFP19B: No apparent homology.
CFP22A: No apparent homology.
CFP23: CFP23 has between 38% and 46% identity to several cutinases from different fungal sp.
In addition CFP23 has 46% identity and 61% similarity to CFP25 as well as identity and 63% similarity to CFP21 (both proteins were isolated earlier from the ST-CF).
CFP25A: CFP25A has 95% identity in a 241 aa overlap to a putative M. tuberculosis thymidylate synthase (450 aa, accession No. P28176).
CFP27: CFP27 has 81% identity to a hypothetical M. leprae protein and 64% identity and 78% similarity to Rhodococcus sp. proteasome beta-type subunit 2 (prcB(2) gene).
CFP30A: CFP30A has 67% identity to Rhodococcus proteasome alpha-type 1 subunit.
CWP32: The CWP32 N-terminal sequence is 100% identical to the Mycobacterium leprae sequence MLCB637.03.
CFP50: The CFP50 N-terminal sequence is 100% identical to a putative lipoamide dehydrogenase from M. leprae (Accession 415183).

Cloning of the genes encoding CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19A, CFP22A, CFP23, CFP25A, CFP27, CFP30A, CWP32, and CFP50
The genes encoding CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19A, CFP22A, CFP23, CFP25A, CFP27, CFP30A, CWP32 and CFP50 were all cloned into the expression vector pMCT6 by PCR amplification with gene-specific primers, for recombinant expression of the proteins in E. coli.
PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1X low salt Taq+ buffer from Stratagene supplemented with 250 mM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in 10 µl reaction volume. Reactions were initially heated to 94 °C for 25 sec. and run for 30 cycles of the program: 94 °C for 10 sec., 55 °C for 10 sec. and 72 °C for 90 sec., using thermocycler equipment from Idaho Technology.
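The cycling program described above can be written down as a small data structure, which also gives the nominal hold time of a run. This is a sketch for illustration only: the names are assumptions, and the total ignores temperature ramping between steps, so the real run time on any instrument will be longer.

```python
# Cycling program as stated in the text: (temperature in degrees C, hold time in seconds).
INITIAL_DENATURATION = (94, 25)
CYCLE_STEPS = [
    (94, 10),  # denaturation
    (55, 10),  # annealing
    (72, 90),  # extension
]
N_CYCLES = 30


def nominal_hold_time(initial, steps, n_cycles):
    """Sum of programmed hold times in seconds, ignoring ramp times."""
    per_cycle = sum(seconds for _temperature, seconds in steps)
    return initial[1] + n_cycles * per_cycle


if __name__ == "__main__":
    total = nominal_hold_time(INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES)
    print(f"{total} s (about {total / 60:.0f} min) of programmed hold time")
```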
The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II T vector (Stratagene). Plasmid DNA was then prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMCT6 in frame with 8 histidines which are added to the N-terminal of the expressed proteins. The resulting clones were then sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.
For cloning of the individual antigens, the following gene-specific primers were used:

CFP7A: Primers used for cloning of cfp7A:
OPBR-79: AAGAGTAGATCTATGATGGCCGAGGATGTTCGCG (SEQ ID NO: )
OPBR-80: CGGCGACGACGGATCCTACCGCGTCGG (SEQ ID NO: 96)
OPBR-79 and OPBR-80 create BglII and BamHI sites, respectively, used for the cloning in pMCT6.

CFP8A: Primers used for cloning of cfp8A:
CFP8A-F: CTGAGATCTATGAACGTACGGCGCC (SEQ ID NO: 154)
CFP8A-R: CTCCGATGGTACCGTAGGACCCGGGCAGCCCCGGC (SEQ ID NO: 155)
CFP8A-F and CFP8A-R create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP8B: Primers used for cloning of cfp8B:
CFP8B-F: CTGAGATCTATGAGGCTGTCGTTGACCGC (SEQ ID NO: 156)
CFP8B-R: CTCCCCGGGCTTAATAGTTGTTGCAGGAGC (SEQ ID NO: 157)
CFP8B-F and CFP8B-R create BglII and SmaI sites, respectively, used for the cloning in pMCT6.

CFP16: Primers used for cloning of cfp16:
OPBR-104: CCGGGAGATCTATGGCAAAGCTCTCCACCGACG (SEQ ID NOs: 111 and 130)
OPBR-105: CGCTGGGCAGAGCTACTTGACGGTGACGGTGG (SEQ ID NOs: 112 and 131)
OPBR-104 and OPBR-105 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP19: Primers used for cloning of cfp19:
OPBR-96: GAGGAAGATCTATGACAACTTCACCCGACGCG (SEQ ID NO: 107)
OPBR-97: CATGAAGCCATGGCCCGCAGGCTGCATG (SEQ ID NO: 108)
OPBR-96 and OPBR-97 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP19A: Primers used for cloning of cfp19A:
OPBR-88: CCCCGCAGATCTGCACGACCGGCATCGGCGGGC (SEQ ID NO: 99)
OPBR-89: GCGGCGGATCGGTTGGTTAGCCGG (SEQ ID NO: 100)
OPBR-88 and OPBR-89 create BglII and BamHI sites, respectively, used for the cloning in pMCT6.

CFP22A: Primers used for cloning of cfp22A:
OPBR-90: CCGGCTGAGATCTATGACAGAATACGAAGGGC (SEQ ID NO: 101)
OPBR-91: CCCCGCCAGGGAACTAGAGGCGGC (SEQ ID NO: 102)
OPBR-90 and OPBR-91 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP23: Primers used for cloning of cfp23:
OPBR-86: GCTTGGGAGATCTTTGGACCCCGGTTGC (SEQ ID NO: 97)
OPBR-87: GACGAGATCTTATGGGCTTACTGAC (SEQ ID NO: 98)
OPBR-86 and OPBR-87 both create a BglII site used for the cloning in pMCT6.

CFP25A: Primers used for cloning of cfp25A:
OPBR-106: GGCCAGATCTATGGCCATTGAGGTTTCGGTGTTGC (SEQ ID NO: 113)
OPBR-107: GGCCGTGTTGCATGGCAGCGCTGAGC (SEQ ID NO: 114)
OPBR-106 and OPBR-107 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP27: Primers used for cloning of cfp27:
OPBR-92: CTGCCGAGATCTACCACCATTGTCGCGCTGAAATACCC (SEQ ID NO: 103)
OPBR-93: CGCCATGGCCTTACGCGCCAACTCG (SEQ ID NO: 104)
OPBR-92 and OPBR-93 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP30A: Primers used for cloning of cfp30A:
OPBR-94: GGCGGAGATCTGTGAGTTTTCCGTATTTCATC (SEQ ID NO: 105)
OPBR-95: CGCGTCGAGCCATGGTTAGGGGCAG (SEQ ID NO: 106)
OPBR-94 and OPBR-95 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CWP32: Primers used for cloning of cwp32:
CWP32-F: GCTTAGATCTATGATTTTCTGGGCAACCAGGTA (SEQ ID NO: 158)
CWP32-R: GCTTCCATGGGCGAGGCACAGGCGTGGGAA (SEQ ID NO: 159)
CWP32-F and CWP32-R create BglII and NcoI sites, respectively, used for the cloning in pMCT6.

CFP50: Primers used for cloning of cfp50:
OPBR-100: GGCCGAGATCTGTGACCCACTATGACGTCGTCG (SEQ ID NO: 109)
OPBR-101: GGCGCCCATGGTCAGAAATTGATCATGTGGCCAA (SEQ ID NO: 110)
OPBR-100 and OPBR-101 create BglII and NcoI sites, respectively, used for the cloning in pMCT6.
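A forward primer can be checked against the N-terminal sequence of its antigen by translating the primer from its first ATG. The sketch below is one way to do this and is not taken from the patent: it uses the standard genetic code, the helper names are assumptions, and it is only meaningful for forward primers that contain an ATG start codon (several of the primers above start at GTG or within the mature sequence instead).

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons of a DNA string; a trailing partial codon is ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def translate_from_first_atg(primer):
    """Translate a forward primer starting at its first ATG (hypothetical helper)."""
    start = primer.upper().find("ATG")
    if start < 0:
        raise ValueError("no ATG found in primer")
    return translate(primer[start:])


if __name__ == "__main__":
    # Forward primers copied from the list above.
    print("OPBR-106:", translate_from_first_atg("GGCCAGATCTATGGCCATTGAGGTTTCGGTGTTGC"))
    print("OPBR-104:", translate_from_first_atg("CCGGGAGATCTATGGCAAAGCTCTCCACCGACG"))
```

After the initiator methionine, the translated residues can be compared by eye with the N-terminal sequences determined for CFP25A and CFP16.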
Expression/purification of recombinant CFP7A, CFP8A, CFP8B, CFP16, CFP19, CFP19A, CFP22A, CFP23, CFP25A, CFP27, CFP30A, CWP32, and CFP50 proteins.
Expression and metal affinity purification of recombinant proteins was undertaken essentially as described by the manufacturers. For each protein, 1 l LB medium containing 100 µg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmids. Cultures were shaken at 37 °C until they reached a density of OD600 = 0.4-0.6. IPTG was then added to a final concentration of 1 mM and the cultures were incubated for a further 4-16 hours. Cells were harvested, resuspended in 1X sonication buffer + 8 M urea and sonicated 5 X 30 sec. with 30 sec. pausing between the pulses.
After centrifugation, the lysate was applied to a column containing 25 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.
After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analyzed by SDS-PAGE and protein concentrations were estimated at OD280. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH . Finally, the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.
EXAMPLE 3B
Identification of CFP7B, CFP10A, CFP11 and CFP30B
Isolation of CFP7B
ST-CF was precipitated with ammonium sulphate at 80% saturation, redissolved in PBS, pH 7.4, dialysed 3 times against 25 mM Piperazin-HCl, pH 5.5, and subjected to chromatofocusing on a matrix of PBE 94 (Pharmacia) in a column connected to an FPLC system (Pharmacia). The column was equilibrated with 25 mM Piperazin-HCl, pH 5.5, and the elution was performed with 10% PB74-HCl, pH 4.0 (Pharmacia). Fractions with similar band patterns were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut-off membrane to a final volume of 1-3 ml. An equal volume of SDS-containing sample buffer was added and the protein solution boiled for 5 min before further separation on a MultiEluter (BioRad) in a matrix of 10-20% polyacrylamide (Andersen, P. and Heron, I., 1993). The fraction containing a well separated band below 10 kDa was selected for N-terminal sequencing after transfer to a PVDF membrane.
Isolation of CFP11
ST-CF was precipitated with ammonium sulphate at 80% saturation. The precipitated proteins were removed by centrifugation and after resuspension washed with 8 M urea. CHAPS and glycerol were added to a final concentration of 0.5 and respectively and the protein solution was applied to a Rotofor isoelectrical Cell (BioRad). The Rotofor Cell had been equilibrated with an 8 M urea buffer containing CHAPS, 5% glycerol, 3% Biolyt 3/5 and 1% Biolyt 4/6 (BioRad). Isoelectric focusing was performed in a pH gradient from 3-6. The fractions were analyzed on silver-stained 10-20% SDS-PAGE. The fractions in the pH gradient to 6 were pooled and washed three times with PBS on a Centriprep concentrator (Amicon) with a 3 kDa cut-off membrane to a final volume of 1 ml. 300 mg of the protein preparation was separated on a 10-20% Tricine SDS-PAGE (Ploug et al., 1989), transferred to a PVDF membrane and Coomassie stained. The lowest band occurring on the membrane was excised and submitted for N-terminal sequencing.
Isolation of CFP10A and CFP30B
ST-CF was concentrated approximately 10-fold by ultrafiltration and ammonium sulphate precipitation at 80% saturation. Proteins were redissolved in 50 mM sodium phosphate, 1.5 M ammonium sulphate, pH 8.5, and subjected to thiophilic adsorption chromatography on an Affi-T gel column (Kem-En-Tec). Proteins were eluted by a to 0 M decreasing gradient of ammonium sulphate. Fractions with similar band patterns in SDS-PAGE were pooled and anion exchange chromatography was performed on a Mono Q HR 5/5 column connected to an FPLC system (Pharmacia). The column was equilibrated with 10 mM Tris-HCl, pH 8.5, and the elution was performed with a gradient of NaCl from 0 to 1 M. Fractions containing well separated bands in SDS-PAGE were selected.
Fractions containing CFP10A and CFP30B were blotted to PVDF membrane after 2-DE PAGE (Ploug et al, 1989). The relevant spots were excised and subjected to N-terminal amino acid sequence analysis.
N-terminal sequencing
N-terminal amino acid sequence analysis was performed on a Procise 494 sequencer (Applied Biosystems).
The following N-terminal sequences were obtained:
CFP7B: PQGTVKWFNAEKGFG (SEQ ID NO: 168)
CFP10A: NVTVSIPTILRPXXX (SEQ ID NO: 169)
CFP11: TRFMTDPHAMRDMAG (SEQ ID NO: 170)
CFP30B: PKRSEYRQGTPNWVD (SEQ ID NO: 171)
X denotes an amino acid which could not be determined by the sequencing method used.
N-terminal homology searching in the Sanger database and identification of the corresponding genes.
The N-terminal amino acid sequence from each of the proteins was used for a homology search using the blast program of the Sanger Mycobacterium tuberculosis genome database: http://www.sanger.ac.uk/projects/m-tuberculosis/TB-blast-server.
For CFP11 a sequence 100% identical to 15 N-terminal amino acids was found on contig TB_1314. The identity was found within an open reading frame of 98 amino acids length corresponding to a theoretical MW of 10977 Da and a pI of 5.14.
Amino acid number one can also be an Ala (instead of a Thr) as this sequence was also obtained (results not shown), and a 100% identical sequence to this N-terminal is found on contig TB_671 and on locus MTC1364.09.
For CFP7B a sequence 100% identical to 15 N-terminal amino acids was found on contig TB_2044 and on locus MTY15C10.04 with EMBL accession number: Z95436.
The identity was found within an open reading frame of 67 amino acids length corresponding to a theoretical MW of 7240 Da and a pI of 5.18.
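A theoretical molecular weight of this kind can be estimated by summing average residue masses and adding one water molecule. The sketch below does this for the 67 amino acid CFP7B sequence listed in this example. It is an illustration, not the method used in the patent: the mass table holds standard average residue masses, the names are assumptions, and dropping the initiator methionine is an assumption made here because the determined N-terminus starts at the proline and because it brings the result close to the 7240 Da quoted above.

```python
# Standard average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per chain for the free N- and C-termini


def average_mass(sequence):
    """Average molecular mass in Da of an unmodified peptide chain."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# CFP7B as listed in this example (67 residues including the initiator Met).
CFP7B = "MPQGTVKWFNAEKGFGFIAPEDGSADVFVHYTEIQGTGFRTLEENQKVEFEIGHSPKGPQATGVRSL"

if __name__ == "__main__":
    print(f"with initiator Met:    {average_mass(CFP7B):.1f} Da")
    print(f"without initiator Met: {average_mass(CFP7B[1:]):.1f} Da")
```

The two values differ by the mass of one methionine residue, which is enough to matter when comparing a calculated weight against a figure quoted to the nearest dalton.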
For CFP10A a sequence 100% identical to 12 N-terminal amino acids was found on contig TB_752 and on locus CY130.20 with EMBL accession number: Q10646 and Z73902. The identity was found within an open reading frame of 93 amino acids length corresponding to a theoretical MW of 9557 Da and a pl of 4.78.
For CFP30B a sequence 100% identical to 15 N-terminal amino acids was found on contig TB_335. The identity was found within an open reading frame of 261 amino acids length corresponding to a theoretical MW of 27345 Da and a pl of 4.24.
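The theoretical molecular weights quoted above follow from the amino acid composition of each open reading frame. A minimal sketch of such a calculation is given here, assuming commonly tabulated average residue masses; the function name and the rounding are illustrative and are not taken from the original work, so the result may differ slightly from the quoted values depending on the mass table used.

```python
# Approximate average residue masses in daltons (amino acid minus one water).
# These are commonly tabulated values, assumed here for illustration.
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886, "C": 103.1388,
    "E": 129.1155, "Q": 128.1307, "G": 57.0519, "H": 137.1411, "I": 113.1594,
    "L": 113.1594, "K": 128.1741, "M": 131.1926, "F": 147.1766, "P": 97.1167,
    "S": 87.0782, "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER = 18.0152  # one water is added back for the free N- and C-termini


def theoretical_mw(sequence):
    """Return the average molecular weight (Da) of an unmodified polypeptide."""
    residues = "".join(sequence.split()).upper()  # drop spaces between blocks
    return sum(RESIDUE_MASS[aa] for aa in residues) + WATER


# CFP7B open reading frame as listed for SEQ ID NO: 147 (67 residues).
CFP7B = (
    "MPQGTVKWFN AEKGFGFIAP EDGSADVFVH YTEIQGTGFR TLEENQKVEF "
    "EIGHSPKGPQ ATGVRSL"
)

if __name__ == "__main__":
    print(len("".join(CFP7B.split())), "residues")
    print(round(theoretical_mw(CFP7B)), "Da (compare with the value quoted in the text)")
```

The same function can be applied to the other open reading frames; post-translational processing, such as removal of the initiator methionine, is not modelled.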
The amino acid sequences of the purified antigens as picked from the Sanger database are shown in the following list.
CFP7B (SEQ ID NO: 147)
  1 MPQGTVKWFN AEKGFGFIAP EDGSADVFVH YTEIQGTGFR TLEENQKVEF
 51 EIGHSPKGPQ ATGVRSL
CFP10A (SEQ ID NO: 141)
  1 MNVTVSIPTI LRPHTGGQKS VSASGDTLGA VISDILEANYS GISERLMDPS
 51 SPGKLHRFVN IYVNDEDVRF SGGLATAIAD GDSVTILPAV AGG
CFP11 protein sequence (SEQ ID NO: 143)
  1 MATRFMTDPH AMRDMAGRFE VHAQTVEDEA RRMWASAQNI SGAGWSGMAE
 51 ATSLDTMAQM NQAFRNIVNM LHGVRDGLVR DANNYEQQEQ ASQQILSS
CFP30B (SEQ ID NO: 145)
  1 MPKRSEYRQG TPNWVDLQTT DQSAAKKFYT SLFGWGYDDN PVPGGGGVYS
 51 MATLNGEAVA AIAPMPPGAP EGMPPIWNTY IAVDDVDAVV DKVVPGGGQV
101 MMPAFDIGDA GRMSFITDPT GAAVGLWQAN RHIGATLVNE TGTLIWNELL
151 TDKPDLALAF YEAVVGLTHS SMEIAAGQNY RVLKAGDAEV GGCMEPPMPG
201 VPNHWHVYFA VDDADATAAK AAAAGGQVIA EPADIPSVGR FAVLSDPQGA
251 IFSVLKPAPQ Q
Cloning of the genes encoding CFP7B, CFP10A, CFP11, and CFP30B.
PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1 X low salt Taq buffer from Stratagene supplemented with 250 µM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), pmoles of each primer and 0.5 unit Taq+ DNA polymerase (Stratagene) in 10 µl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles of the program: 94°C for 10 sec., 55°C for 10 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology.
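The cycling program described above can be written down as a small data structure, which makes the total programmed hold time easy to check. This is only a sketch: the class and function names are illustrative, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


# Values taken from the protocol text: initial denaturation, then 30 cycles.
INITIAL = Step(94, 25)
CYCLE = (Step(94, 10), Step(55, 10), Step(72, 90))
N_CYCLES = 30


def hold_time_seconds(initial, cycle, n_cycles):
    """Sum of programmed hold times, excluding temperature ramps."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


if __name__ == "__main__":
    total = hold_time_seconds(INITIAL, CYCLE, N_CYCLES)
    print(f"{total} s of programmed holds (about {total / 60:.0f} min)")
```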
The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II T vector (Stratagene). Plasmid DNA was thereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMCT6 in frame with 8 histidines which are added to the N-terminus of the expressed proteins. The resulting clones were thereafter sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.
For cloning of the individual antigens, the following gene specific primers were used:
CFP7B: Primers used for cloning of cfp7B:
CFP7B-F: CTGAGATCTAGAATGCCACAGGGAACTGTG (SEQ ID NO: 160)
CFP7B-R: TCTCCCGGGGGTAACTCAGAGCGAGCGGAC (SEQ ID NO: 161)
CFP7B-F and CFP7B-R create BglII and SmaI sites, respectively, used for the cloning in pMCT6.
CFP10A: Primers used for cloning of cfp10A:
CFP10A-F: CTGAGATCTATGAACGTCACCGTATCC (SEQ ID NO: 162)
CFP10A-R: TCTCCCGGGGCTCACCCACCGGCCACG (SEQ ID NO: 163)
CFP10A-F and CFP10A-R create BglII and SmaI sites, respectively, used for the cloning in pMCT6.
CFP11: Primers used for cloning of cfp11:
CFP11-F: CTGAGATCTATGGCAACACGTTTTATGACG (SEQ ID NO: 164)
CFP11-R: CTCCCCGGGTTAGCTGCTGAGGATCTGCTH (SEQ ID NO: 165)
CFP11-F and CFP11-R create BglII and SmaI sites, respectively, used for the cloning in pMCT6.
CFP30B: Primers used for cloning of cfp30B:
CFP30B-F: CTGAAGATCTATGCCCAAGAGAAGCGAATAC (SEQ ID NO: 166)
CFP30B-R: CGGCAGCTGCTAGCATTCTCCGAATCTGCCG (SEQ ID NO: 167)
CFP30B-F and CFP30B-R create BglII and PvuII sites, respectively, used for the cloning in pMCT6.
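The primer design can be checked in silico: each primer should contain the stated restriction site, and each forward primer should encode the start of the protein after its ATG. The following is a minimal sketch using the well-known recognition sequences of BglII (AGATCT), SmaI (CCCGGG) and PvuII (CAGCTG) and the standard genetic code; the helper names are illustrative and not part of the original work.

```python
# Recognition sequences of the enzymes named in the text.
SITES = {"BglII": "AGATCT", "SmaI": "CCCGGG", "PvuII": "CAGCTG"}

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def find_sites(primer):
    """Map enzyme name -> 0-based index of its site, for sites present."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def translate_from_atg(primer):
    """Translate the primer from its first ATG, ignoring a trailing partial codon."""
    start = primer.find("ATG")
    if start < 0:
        return ""
    coding = primer[start:]
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    cfp7b_f = "CTGAGATCTAGAATGCCACAGGGAACTGTG"   # SEQ ID NO: 160
    cfp30b_r = "CGGCAGCTGCTAGCATTCTCCGAATCTGCCG"  # SEQ ID NO: 167
    print(find_sites(cfp7b_f), translate_from_atg(cfp7b_f))
    print(find_sites(cfp30b_r))
```

For CFP7B-F the translated stretch begins with Met followed by the N-terminal residues PQGTV determined by sequencing, which is the expected result for a correctly designed forward primer.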
Expression/purification of recombinant CFP7B, CFP10A, CFP11 and CFP30B protein.
Expression and metal affinity purification of recombinant protein was undertaken essentially as described by the manufacturers. 1 l LB medium containing 100 µg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmid. The culture was shaken at 37°C until it reached a density of OD600 = 0.5. IPTG was then added to a final concentration of 1 mM and the culture was incubated for a further 4 hours. Cells were harvested, resuspended in 1 X sonication buffer with 8 M urea and sonicated 5 X 30 sec. with 30 sec. pausing between the pulses.
After centrifugation, the lysate was applied to a column containing 25 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.
After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analysed by SDS-PAGE and protein concentrations were estimated at OD280. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH . Finally, the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.
EXAMPLE 3C
Using homology searching for identification of ORF11-1, ORF11-2, ORF11-3 and ORF11-4.
A search of the Mycobacterium tuberculosis Sanger sequence database with the amino acid sequences of CFP11, a previously identified ST-CF protein, identified 4 new very homologous proteins. All 4 proteins were at least 96% homologous to CFP11.
On the basis of the strong homology to CFP11, it is believed that ORF11-1, ORF11-2, ORF11-3 and ORF11-4 are potential new T-cell antigens.
The first open reading frame, MTCY10G2.11, homologous to CFP11, encodes a protein of 98 amino acids corresponding to a theoretical molecular mass of 10994 Da and a pI of 5.14. The protein was named ORF11-1.
The second open reading frame, MTC1364.09, homologous to CFP11, encodes a protein of 98 amino acids corresponding to a theoretical molecular mass of 10964 Da and a pI of 5.14. The protein was named ORF11-2.
The third open reading frame, MTV049.14, has an in-frame stop codon. Because the DNA sequence at this position is highly conserved amongst the 4 open reading frames, it is however suggested that this is due to a sequencing error.
The T in position 175 of the DNA sequence is therefore suggested to be a C, as in the other ORFs. The Q in position 59 in the amino acid sequence would have been a "stop" if the T in position 175 in the DNA sequence had not been substituted.
The open reading frame encodes a protein of 98 amino acids corresponding to a theoretical molecular mass of 10994 Da and a pI of 5.14. The protein was named ORF11-3.
The fourth open reading frame, MTCY15C10.32, homologous to CFP11, encodes a protein of 98 amino acids corresponding to a theoretical molecular mass of 11024 Da and a pI of 5.14. The protein was named ORF11-4.
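The relationship between nucleotide position 175 and amino acid position 59 in ORF11-3 can be verified with simple codon arithmetic. The sketch below assumes 1-based numbering from the first base of the start codon; the function name is illustrative. It also lists which stop codons become a glutamine codon when their first base is changed from T to C.

```python
def codon_of(nt_position):
    """Map a 1-based nucleotide position to (codon number, position in codon)."""
    index, offset = divmod(nt_position - 1, 3)
    return index + 1, offset + 1


STOP_CODONS = ("TAA", "TAG", "TGA")
GLN_CODONS = ("CAA", "CAG")  # codons for glutamine (Q) in the standard code

# Stop codons that encode Gln once the first base is changed from T to C.
rescued = [stop for stop in STOP_CODONS if "C" + stop[1:] in GLN_CODONS]

if __name__ == "__main__":
    print(codon_of(175))  # nucleotide 175 is the first base of codon 59
    print(rescued)        # stop codons consistent with a Q at position 59
```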
Using homology searching for identification of ORF7-1 and ORF7-2.
A search of the Mycobacterium tuberculosis Sanger sequence database with the amino acid sequences of a previously identified immunoreactive ST-CF protein, CFP7, identified 2 new very homologous proteins. The protein ORF7-1 (MTV012.33) was 84% identical to CFP7, with a primary structure of the same size as CFP7, and the protein ORF7-2 (MTV012.31) was 68% identical to CFP7 in a 69 amino acid overlap.
On the basis of the strong homology to the potent human T-cell antigen CFP7, ORF7-1 and ORF7-2 are believed to be potential new T-cell antigens.
The first open reading frame homologous to CFP7 encodes a protein of 96 amino acids corresponding to a theoretical molecular mass of 10313 Da and a pI of 4.186. The protein was named ORF7-1.
The second open reading frame homologous to CFP7 encodes a protein of 120 amino acids corresponding to a theoretical molecular mass of 12923.00 Da and a pI of 7.889. The protein was named ORF7-2.
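Identity figures such as "84% identical" or "68% identical in a 69 amino acid overlap" are computed over the aligned region of two sequences. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned (with "-" marking gaps); the alignment step itself and the function name are outside the original text.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over aligned, non-gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Only columns where both sequences have a residue count as overlap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy strings for illustration only; these are not the CFP7/ORF7 sequences.
    print(percent_identity("MTEQ", "MTEK"))
```

Different programs handle gap columns differently, so the denominator chosen here (columns without gaps) is one reasonable convention rather than the only one.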
Cloning of the homologous orf7-1 and orf7-2.
Since ORF7-1 and ORF7-2 are nearly identical to CFP7, it was necessary to use the flanking DNA regions in the cloning procedure to ensure the cloning of the correct ORF. Two PCR reactions were carried out with two different primer sets. PCR reaction 1 was carried out using M. tuberculosis chromosomal DNA and a primer set corresponding to the flanking DNA. PCR reaction 2 was carried out directly on the first PCR product using ORF specific primers which introduced restriction sites for use in the later cloning procedure.
The sequences of the primers used are given below:
Orf7-1:
Primers used for the initial PCR reaction using M. tuberculosis chromosomal DNA as template:
Sense: MTV012.33-R1: 5'-GGAATGAAAAGGGGTTTGTG-3' (SEQ ID NO: 186)
Antisense: MTV012.33-F1: 5'-GACCACGCCCGCGCCGTGTG-3' (SEQ ID NO: 187)
Primers used for the second round of PCR using PCR product 1 as template:
Sense: MTV012.33-R2: 5'-GCAACACCCGGGATGTCGCAGATTATG-3' (SEQ ID NO: 188) (introduces a SmaI site upstream of the orf7-1 start codon)
Antisense: MTV012.33-F2: 5'-CTAAGCTTGGATCCCTAGCCGCCCCACTTG-3' (SEQ ID NO: 189) (introduces a BamHI site downstream of the orf7-1 stop codon).
Orf7-2:
Primers used for the initial PCR reaction using M. tuberculosis chromosomal DNA as template:
Sense: MTV012.31-R1: 5'-GAATATTTGAAAGGGATTCGTG-3' (SEQ ID NO: 190)
Antisense: MTV012.31-F1: 5'-CTACTAAGCTTGGATCCTTAGTCTCCGGCG-3' (SEQ ID NO: 191) (introduces a BamHI site downstream of the orf7-2 stop codon)
Primers used for the second round of PCR using PCR product 1 as template:
Sense: MTV012.31-R2: 5'-GCAACACCCGGGGTGTCGCAGAGTATG-3' (SEQ ID NO: 192) (introduces a SmaI site upstream of the orf7-2 start codon)
Antisense: MTV012.31-F1: 5'-CTACTAAGCTTGGATCCTTAGTCTCCGGCG-3' (SEQ ID NO: 193) (introduces a BamHI site downstream of the orf7-2 stop codon)
The genes encoding ORF7-1 and ORF7-2 were cloned into the expression vector pMST24, by PCR amplification with gene specific primers, for recombinant expression of the proteins in E. coli.
The first PCR reactions contained either 10 ng of M. tuberculosis chromosomal DNA (PCR reaction 1) or 10 ng PCR product 1 (PCR reaction 2) in 1 x low salt Taq buffer from Stratagene supplemented with 250 µM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), 5 pmoles of each primer and 0.5 unit Taq DNA polymerase (Stratagene) in 10 µl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles of the program: 94°C for 10 sec., 55°C for 10 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology.
The DNA fragments were subsequently run on 1% agarose gels, the bands were excised and purified by Spin-X spin columns (Costar) and cloned into pBluescript SK II T vector (Stratagene). Plasmid DNA was thereafter prepared from clones harbouring the desired fragments, digested with suitable restriction enzymes and subcloned into the expression vector pMST24 in frame with 6 histidines which are added to the N-terminus of the expressed proteins. The resulting clones were thereafter sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.
Expression/purification of recombinant ORF7-1 and ORF7-2 protein.
Expression and metal affinity purification of recombinant protein was undertaken essentially as described by the manufacturers. 1 l LB medium containing 100 µg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmid. The culture was shaken at 37°C until it reached a density of OD600 = 0.5. IPTG was then added to a final concentration of 1 mM and the culture was incubated for a further 2 hours. Cells were harvested, resuspended in 1 X sonication buffer with 8 M urea and sonicated 5 X 30 sec. with 30 sec. pausing between the pulses.
After centrifugation, the lysate was applied to a column containing 25 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.
After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analysed by SDS-PAGE. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH . Finally, the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.
EXAMPLE 4
Cloning of the gene expressing CFP26 (MPT51)
Synthesis and design of probes
Oligonucleotide primers were synthesized automatically on a DNA synthesizer (Applied Biosystems, Foster City, CA, ABI-391, PCR-mode), deblocked and purified by ethanol precipitation.
Three oligonucleotides were synthesized (TABLE 3) on the basis of the nucleotide sequence from mpb51 described by Ohara et al. (1995). The oligonucleotides were engineered to include an EcoRI restriction enzyme site at the 5' end and at the 3' end by which a later subcloning was possible.
Four additional oligonucleotides were synthesized on the basis of the nucleotide sequence from MPT51 (Fig. 5 and SEQ ID NO: 41). The four combinations of the primers were used for the PCR studies.
DNA cloning and PCR technology
Standard procedures were used for the preparation and handling of DNA (Sambrook et al., 1989). The gene mpt51 was cloned from M. tuberculosis H37Rv chromosomal DNA by the use of polymerase chain reaction (PCR) technology as described previously (Oettinger and Andersen, 1994). The PCR product was cloned in pBluescript SK (Stratagene).
Cloning of mpt51
The gene, the signal sequence and the Shine-Dalgarno region of MPT51 were cloned by use of PCR technology as two fragments of 952 bp and 815 bp in pBluescript SK, designated pTO52 and pTO53.
DNA Sequencing
The nucleotide sequence of the cloned 952 bp M. tuberculosis H37Rv PCR fragment, pTO52, containing the Shine-Dalgarno sequence, the signal peptide sequence and the structural gene of MPT51, and the nucleotide sequence of the cloned 815 bp PCR fragment containing the structural gene of MPT51, pTO53, were determined by the dideoxy chain termination method adapted for supercoiled DNA by use of the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., Cleveland, OH) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.
The nucleotide sequences of pTO52 and pTO53 and the deduced amino acid sequence are shown in Figure 5. The DNA sequence contained an open reading frame starting with an ATG codon at position 45-47 and ending with a termination codon (TAA) at position 942-944. The nucleotide sequence of the first 33 codons was expected to encode the signal sequence. On the basis of the known N-terminal amino acid sequence (Ala Pro Tyr Glu Asn) of the purified MPT51 (Nagai et al., 1991) and the features of the signal peptide, it is presumed that the signal peptidase recognition sequence (Ala-X-Ala) (von Heijne, 1984) is located in front of the N-terminal region of the mature protein at position 144. Therefore, a structural gene encoding MPT51, mpt51, derived from M. tuberculosis H37Rv was found to be located at position 144-945 of the sequence shown in Fig. 5. The nucleotide sequence of mpt51 differed by one nucleotide from the nucleotide sequence of MPB51 described by Ohara et al. (1995) (Fig. ). In mpt51, a substitution of a guanine by an adenine was found at position 780. From the deduced amino acid sequence, this change occurs at the first position of the codon, giving an amino acid change from alanine to threonine. Thus it is concluded that mpt51 consists of 801 bp and that the deduced amino acid sequence contains 266 residues with a molecular weight of 27,842, and that MPT51 shows 99.8% identity to MPB51.
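The coordinates given for mpt51 can be cross-checked with codon arithmetic. The sketch below assumes 1-based, inclusive coordinates and takes position 944, the last base of the TAA codon, as the end of the reading frame so that the length is divisible by three; the function name is illustrative.

```python
def orf_summary(first_nt, last_nt_of_stop):
    """Return (length in bp including the stop codon, encoded residues)."""
    length = last_nt_of_stop - first_nt + 1
    if length % 3 != 0:
        raise ValueError("coordinates do not span a whole number of codons")
    codons = length // 3
    return length, codons - 1  # the stop codon encodes no residue


ATG_START = 45        # first base of the start codon
STOP_END = 944        # last base of the TAA codon (positions 942-944)
SIGNAL_CODONS = 33    # codons attributed to the signal sequence

# First base of the mature coding region after the signal sequence.
mature_start = ATG_START + 3 * SIGNAL_CODONS

if __name__ == "__main__":
    print(mature_start)                          # start of the mature region
    print(orf_summary(mature_start, STOP_END))   # mature gene: bp, residues
    print(orf_summary(ATG_START, STOP_END))      # precursor incl. signal peptide
```

Under these assumptions the mature region starts at position 144 and spans 801 bp encoding 266 residues, in agreement with the figures stated in the text.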
Subcloning of mpt51
An EcoRI site was engineered immediately 5' of the first codon of mpt51 so that only the coding region of the gene encoding MPT51 would be expressed, and an EcoRI site was incorporated right after the stop codon at the 3' end.
DNA of the recombinant plasmid pTO53 was cleaved at the EcoRI sites. The 815 bp fragment was purified from an agarose gel and subcloned into the EcoRI site of the pMAL-cR1 expression vector (New England Biolabs), giving pTO54. Vector DNA containing the gene fusion was used to transform E. coli XL1-Blue by the standard procedures for DNA manipulation.
The endpoints of the gene fusion were determined by the dideoxy chain termination method as described under section DNA sequencing. Both strands of the DNA were sequenced.
Preparation and purification of rMPT51
Recombinant antigen was prepared in accordance with instructions provided by New England Biolabs. Briefly, single colonies of E. coli harbouring the pTO54 plasmid were inoculated into Luria-Bertani broth containing 50 µg/ml ampicillin and 12.5 µg/ml tetracycline and grown at 37°C to 2 x 10^8 cells/ml. Isopropyl-β-D-thiogalactoside (IPTG) was then added to a final concentration of 0.3 mM and growth was continued for a further 2 hours. The pelleted bacteria were stored overnight at -20°C in new column buffer (20 mM Tris/HCl, pH 7.4, 200 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol (DTT)) and thawed at 4°C, followed by incubation with 1 mg/ml lysozyme on ice for min and sonication (20 times for 10 sec with intervals of 20 sec). After centrifugation at 9,000 x g for 30 min at 4°C, the maltose binding protein-MPT51 fusion protein (MBP-rMPT51) was purified from the crude extract by affinity chromatography on an amylose resin column. MBP-rMPT51 binds to amylose. After extensive washes of the column, the fusion protein was eluted with 10 mM maltose. Aliquots of the fractions were analyzed on 10% SDS-PAGE. Fractions containing the fusion protein of interest were pooled and dialysed extensively against physiological saline.
Protein concentration was determined by the BCA method supplied by Pierce (Pierce Chemical Company, Rockford, IL).
TABLE 3.
Sequence of the mpt51 oligonucleotides a.
Orientation and oligonucleotide a   Sequence (5'-3')                            Position b (nucleotide)
Sense
MPT51-1   CTCGAATTCGCCGGGTGCACACAG (SEQ ID NO: 28)   6-21 (SEQ ID NO: 41)
MPT51-3   CTCGAATTCGCCCCATACGAGAAC (SEQ ID NO: 29)   143-158 (SEQ ID NO: 41)
MPT51-5   GTGTATCTGCTGGAC (SEQ ID NO: 30)            228-242 (SEQ ID NO: 41)
MPT51-7   CCGACTGGCTGGCCG (SEQ ID NO: 31)            418-432 (SEQ ID NO: 41)
Antisense
MPT51-2   GAGGAATTCGCTTAGCGGATCGCA (SEQ ID NO: 32)   946-932 (SEQ ID NO: 41)
MPT51-4   CCCACATTCCGTTGG (SEQ ID NO: 33)            642-628 (SEQ ID NO: 41)
MPT51-6   GTCCAGCAGATACAC (SEQ ID NO: 34)            242-228 (SEQ ID NO: 41)
a The oligonucleotides MPT51-1 and MPT51-2 were constructed from the MPB51 nucleotide sequence (Ohara et al., 1995). The other oligonucleotide constructions were based on the nucleotide sequence obtained from mpt51 reported in this work. Nucleotides (nt) underlined are not contained in the nucleotide sequence of MPB/T51.
b The positions referred to are of the non-underlined parts of the primers and correspond to the nucleotide sequence shown in SEQ ID NO: 41.
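Sense and antisense primers that cover the same positions should be reverse complements of one another, as is the case for MPT51-5 (228-242) and MPT51-6 (242-228). A short sketch of that check follows; the function name is illustrative, and the EcoRI recognition sequence GAATTC is the standard one.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


MPT51_5 = "GTGTATCTGCTGGAC"            # sense, positions 228-242
MPT51_6 = "GTCCAGCAGATACAC"            # antisense, positions 242-228
MPT51_1 = "CTCGAATTCGCCGGGTGCACACAG"   # carries the engineered EcoRI site
ECORI = "GAATTC"

if __name__ == "__main__":
    print(reverse_complement(MPT51_5) == MPT51_6)  # primers cover the same region
    print(MPT51_1.find(ECORI))                     # 0-based index of the EcoRI site
```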
Cloning of mpt51 in the expression vector pMST24.
A PCR fragment was produced from pTO52 using the primer combination MPT51-F and MPT51-R (TABLE 4). A BamHI site was engineered immediately 5' of the first codon of mpt51 so that only the coding region of the gene encoding MPT51 would be expressed, and an NcoI site was incorporated right after the stop codon at the 3' end.
The PCR product was cleaved at the BamHI and the NcoI sites. The 811 bp fragment was purified from an agarose gel and subcloned into the BamHI and NcoI sites of the pMST24 expression vector, giving pTO86. Vector DNA containing the gene fusion was used to transform E. coli XL1-Blue by the standard procedures for DNA manipulation.
The nucleotide sequence of complete gene fusion was determined by the dideoxy chain termination method as described under section DNA sequencing. Both strands of the DNA were sequenced.
Preparation and purification of rMPT51.
Recombinant antigen was prepared from single colonies of E. coli harbouring the pTO86 plasmid inoculated into Luria-Bertani broth containing 50 µg/ml ampicillin and 12.5 µg/ml tetracycline and grown at 37°C to 2 x 10^8 cells/ml. Isopropyl-β-D-thiogalactoside (IPTG) was then added to a final concentration of 1 mM and growth was continued for a further 2 hours. The pelleted bacteria were resuspended in BC 100/20 buffer (100 mM KCl, 20 mM imidazole, 20 mM Tris/HCl, pH 7.9, 20% glycerol). Cells were broken by sonication (20 times for 10 sec with intervals of 20 sec). After centrifugation at 9,000 x g for 30 min. at 4°C, the insoluble matter was resuspended in BC 100/20 buffer with 8 M urea, followed by sonication and centrifugation as above.
The 6 x His tag-MPT51 fusion protein (His-rMPT51) was purified by affinity chromatography on a Ni-NTA resin column (Qiagen, Hilden, Germany). His-rMPT51 binds to Ni-NTA. After extensive washes of the column, the fusion protein was eluted with BC 100/40 buffer (100 mM KCl, 40 mM imidazole, 20 mM Tris/HCl, pH 7.9, 20% glycerol) with 8 M urea and BC 1000/40 buffer (1000 mM KCl, 40 mM imidazole, mM Tris/HCl, pH 7.9, 20% glycerol) with 8 M urea. His-rMPT51 was extensively dialysed against 10 mM Tris/HCl, pH 8.5, 3 M urea, followed by purification using fast protein liquid chromatography (FPLC) (Pharmacia, Uppsala, Sweden) over an anion exchange column (Mono Q) using 10 mM Tris/HCl, pH 8.5, 3 M urea with a 0-1 M NaCl linear gradient. Fractions containing rMPT51 were pooled and subsequently dialysed extensively against 25 mM Hepes, pH 8.0 before use.
Protein concentration was determined by the BCA method supplied by Pierce (Pierce Chemical Company, Rockford, IL).
The lipopolysaccharide (LPS) content was determined by the limulus amoebocyte lysate test (LAL) to be less than 0.004 ng/µg rMPT51, and this concentration had no influence on cellular activity.
TABLE 4. Sequence of the mpt51 oligonucleotides.
Orientation and oligonucleotide   Sequence                       Position (nt)
Sense
MPT51-F   CTCGGATCCTGCCCCATACGAGAACCTG   139-156
Antisense
MPT51-R   CTCCCATGGTTAGCGGATCGCACCG      939-924
EXAMPLE 4A
Cloning of the ESAT6-MPT59 and the MPT59-ESAT6 hybrids.
Background for ESAT-MPT59 and MPT59-ESAT6 fusion
Several studies have demonstrated that ESAT-6 is an immunogen which is relatively difficult to adjuvate in order to obtain consistent results when immunizing therewith.
Detecting in vitro recognition of ESAT-6 after immunization with the antigen is very difficult compared to the strong recognition of the antigen that has been found during the recall of memory immunity to M. tuberculosis. ESAT-6 has been found in ST-CF in a truncated version where amino acids 1-15 have been deleted. The deletion includes the main T-cell epitopes recognized by C57BL/6j mice (Brandt et al., 1996). This result indicates that ESAT-6 is either N-terminally processed or proteolytically degraded in ST-CF. In order to optimize ESAT-6 as an immunogen, a gene fusion between ESAT-6 and another major T cell antigen, MPT59, has been constructed. Two different constructs have been made: MPT59-ESAT-6 (SEQ ID NO: 172) and ESAT-6-MPT59 (SEQ ID NO: 173). In the first hybrid ESAT-6 is N-terminally protected by MPT59, and in the latter it is expected that the fusion of two dominant T-cell antigens can have a synergistic effect.
The genes encoding the ESAT6-MPT59 and the MPT59-ESAT6 hybrids were cloned into the expression vector pMCT6, by PCR amplification with gene specific primers, for recombinant expression of the hybrid proteins in E. coli.
Construction of the hybrid MPT59-ESAT6.
The cloning was carried out in three steps. First the genes encoding the two components of the hybrid, ESAT6 and MPT59, were PCR amplified using the following primer constructions:
ESAT6:
OPBR-4: GGCGCCGGCAAGCTTGCCATGACAGAGCAGCAGTGG (SEQ ID NO: 132)
OPBR-28: CGAACTCGCCGGATCCCGTGTTTCGC (SEQ ID NO: 133)
OPBR-4 and OPBR-28 create HindIII and BamHI sites, respectively.
MPT59:
OPBR-48: GGCAACCGCGAGATCTTTCTCCCGGCCGGGGC (SEQ ID NO: 134)
OPBR-3: GGCAAGCTTGCCGGCGCCTAACGAACT (SEQ ID NO: 135)
OPBR-48 and OPBR-3 create BglII and HindIII sites, respectively. Additionally, OPBR-3 deletes the stop codon of MPT59.
PCR reactions contained 10 ng of M. tuberculosis chromosomal DNA in 1 x low salt Taq buffer from Stratagene supplemented with 250 µM of each of the four nucleotides (Boehringer Mannheim), 0.5 mg/ml BSA (IgG technology), 1% DMSO (Merck), pmoles of each primer and 0.5 unit Taq DNA polymerase (Stratagene) in 10 µl reaction volume. Reactions were initially heated to 94°C for 25 sec. and run for 30 cycles of the program: 94°C for 10 sec., 55°C for 10 sec. and 72°C for 90 sec., using thermocycler equipment from Idaho Technology.
The DNA fragments were subsequently run on 1% agarose gels, and the bands were excised and purified by Spin-X spin columns (Costar). The two PCR fragments were digested with HindIII and ligated. A PCR amplification of the ligated PCR fragments encoding MPT59-ESAT6 was carried out using the primers OPBR-48 and OPBR-28. The PCR reaction was initially heated to 94°C for 25 sec. and run for 30 cycles of the program: 94°C for 30 sec., 55°C for 30 sec. and 72°C for 90 sec. The resulting PCR fragment was digested with BglII and BamHI and cloned into the expression vector pMCT6 in frame with 8 histidines which are added to the N-terminus of the expressed protein hybrid. The resulting clones were thereafter sequenced by use of the dideoxy chain termination method adapted for supercoiled DNA using the Sequenase DNA sequencing kit version 1.0 (United States Biochemical Corp., USA) and by cycle sequencing using the Dye Terminator system in combination with an automated gel reader (model 373A; Applied Biosystems) according to the instructions provided. Both strands of the DNA were sequenced.
Construction of the hybrid ESAT6-MPT59.
Construction of the hybrid ESAT6-MPT59 was carried out as described for the hybrid MPT59-ESAT6. The primers used for the construction and cloning were:
ESAT6:
GGACCCAGATCTATGACAGAGCAGCAGTGG (SEQ ID NO: 136)
OPBR-76: CCGGCAGCCCCGGCCGGGAGAAAAGCTTTGCGAACATCCCAGTGACG (SEQ ID NO: 137)
The first primer and OPBR-76 create BglII and HindIII sites, respectively. Additionally, OPBR-76 deletes the stop codon of ESAT6.
MPT59:
OPBR-77: GTTCGCAAAGCTTTTCTCCCGGCCGGGGCTGCCGGTCGAGTACC (SEQ ID NO: 138)
OPBR-18: CCTTCGGTGGATCCCGTCAG (SEQ ID NO: 139)
OPBR-77 and OPBR-18 create HindIII and BamHI sites, respectively.
Expression/purification of MPT59-ESAT6 and ESAT6-MPT59 hybrid proteins.
Expression and metal affinity purification of recombinant proteins was undertaken essentially as described by the manufacturers. For each protein, 1 l LB medium containing 100 µg/ml ampicillin was inoculated with 10 ml of an overnight culture of XL1-Blue cells harbouring recombinant pMCT6 plasmids. Cultures were shaken at 37°C until they reached a density of OD600 = 0.4-0.6. IPTG was then added to a final concentration of 1 mM and the cultures were incubated for a further 4-16 hours. Cells were harvested, resuspended in 1 X sonication buffer with 8 M urea and sonicated 5 X sec. with 30 sec. pausing between the pulses.
After centrifugation, the lysate was applied to a column containing 25 ml of resuspended Talon resin (Clontech, Palo Alto, USA). The column was washed and eluted as described by the manufacturers.
After elution, all fractions (1.5 ml each) were subjected to analysis by SDS-PAGE using the Mighty Small (Hoefer Scientific Instruments, USA) system and the protein concentrations were estimated at 280 nm. Fractions containing recombinant protein were pooled and dialysed against 3 M urea in 10 mM Tris-HCl, pH 8.5. The dialysed protein was further purified by FPLC (Pharmacia, Sweden) using a 6 ml Resource-Q column, eluted with a linear 0-1 M gradient of NaCl. Fractions were analyzed by SDS-PAGE and protein concentrations were estimated at OD280. Fractions containing protein were pooled and dialysed against 25 mM Hepes buffer, pH . Finally, the protein concentration and the LPS content were determined by the BCA (Pierce, Holland) and LAL (Endosafe, Charleston, USA) tests, respectively.
The biological activity of the MPT59-ESAT6 fusion protein is described in Example 6A.
EXAMPLE Mapping of the purified antigens in a 2DE system.
In order to characterize the purified antigens, they were mapped in a 2-dimensional electrophoresis (2DE) reference system. This consists of a silver stained gel containing ST-CF proteins separated by isoelectric focusing followed by a separation according to size by polyacrylamide gel electrophoresis. The 2DE was performed according to Hochstrasser et al. (1988). 85 µg of ST-CF was applied to the isoelectric focusing tubes, where BioRad ampholytes BioLyt 4-6 (2 parts) and BioLyt 5-7 (3 parts) were included. The first dimension was performed in acrylamide/piperazine diacrylamide tube gels in the presence of urea, the detergent CHAPS and the reducing agent DTT at 400 V for 18 hours and 800 V for 2 hours. The second dimension, 10-20% SDS-PAGE, was performed at 100 V for 18 hours and silver stained. The identification of CFP7, CFP7A, CFP7B, CFP8A, CFP8B, CFP9, CFP11, CFP16, CFP17, CFP19, CFP21, CFP22, CFP25, CFP27, CFP28, CFP29, CFP30A, CFP50, and MPT51 in the 2DE reference gel was done by comparing the spot pattern of the purified antigen with that of ST-CF with and without the purified antigen. With the assistance of an analytical 2DE software system (Phoretix International, UK), the spots have been identified in Fig. 6. The positions of MPT51 and CFP29 were confirmed by a Western blot of the 2DE gel using the Mab's anti-CFP29 and HBT 4.
EXAMPLE 6
Biological activity of the purified antigens.
IFN-γ induction in the mouse model of TB infection
The recognition of the purified antigens in the mouse model of memory immunity to TB (described in example 1) was investigated. The results shown in TABLE 5 are representative of three experiments.
A very high IFN-γ response was induced by two of the antigens, CFP17 and CFP21, at almost the same high level as ST-CF.
TABLE 5. IFN-γ release from splenic memory effector cells from C57BL/6J mice isolated after reinfection with M. tuberculosis after stimulation with native antigens.
Antigen a          IFN-γ (pg/ml) b
ST-CF              12564
CFP7               ND d
CFP9               ND
CFP17              9251
                   2388
CFP21              10732
CFP22 CFP25 c      5342
CFP26 (MPT51)      ND
CFP28              2818
CFP29              3700
The data are derived from a representative experiment out of three.
a ST-CF was tested in a concentration of 5 µg/ml and the individual antigens in a concentration of 2 µg/ml.
b Four days after rechallenge a pool of cells from three mice was tested. The results are expressed as mean of duplicate values and the difference between duplicate cultures are 15% of mean. The IFN-γ release of cultures incubated without antigen was 390 pg/ml.
c A pool of CFP22 and CFP25 was tested.
d ND, not determined.
Skin test reaction in TB infected guinea pigs
The skin test activity of the purified proteins was tested in M. tuberculosis infected guinea pigs.
One group of guinea pigs was infected via an ear vein with 1 x 10^4 CFU of M. tuberculosis H37Rv in 0.2 ml PBS. After 4 weeks skin tests were performed, and 24 hours after injection the erythema diameter was measured.
As seen in TABLES 6 and 6a, all of the antigens induced a significant Delayed Type Hypersensitivity (DTH) reaction.
TABLE 6. DTH erythema diameter in guinea pigs infected with 1 x 10^4 CFU of M. tuberculosis, after stimulation with native antigens.
Antigen (a)           Skin reaction (mm) (b)
Control               2.00
PPD (c)               15.40 (0.53)
CFP7                  ND (e)
CFP9                  ND
CFP17                 11.25 (0.84)
                      8.88 (0.13)
CFP21                 12.44 (0.79)
CFP22 + CFP25 (d)     9.19 (3.10)
CFP26 (MPT51)         ND
CFP28                 2.90 (1.28)
CFP29                 6.63 (0.88)
The values presented are the mean erythema diameter of four animals, and the SEMs are indicated in brackets. For PPD and CFP29 the values are the mean erythema diameter of ten animals.
a The antigens were tested at a concentration of 0.1 µg, except for CFP29, which was tested at a concentration of 0.8 µg.
b The skin reactions are measured in mm erythema 24 h after intradermal injection.
c 10 TU of PPD was used.
d A pool of CFP22 and CFP25 was tested.
e ND, not determined.
Together these analyses indicate that most of the antigens identified were highly biologically active and recognized during TB infection in different animal models.
TABLE 6a. DTH erythema diameter of recombinant antigens in outbred guinea pigs infected with 1 x 10^4 CFU of M. tuberculosis.
Antigena Skin reaction (mm)b Control
PPDC
CFP 7a CFP 17 CFP 20 CFP 21 CFP 25 CFP 29 MPT 51 2.9 14.5 13.6 6.8 6.4 5.3 10.8 7.4 4.9 (0.3) (1.4) (1.9) (1.4) (0.7) (0.8) (2.2) (1.1) The values presented are the mean of erythema diameter of four animals and the SEM's are indicated in the brackets. For Control, PPD, and CFP 20 the values are mean of erythema diameter of eight animals.
a The antigens were tested at a concentration of 1.0 µg.
b The skin test reactions are measured in mm erythema 24 h after intradermal injection.
c 10 TU of PPD was used.
Table 6B.
DTH erythema diameter in guinea pigs i.v. infected with 1 x 10^4 CFU M. tuberculosis, after stimulation with 10 µg antigen.
Antigen          Mean (mm)    SEM
PBS              3.25         0.48
PPD (2 TU)       10.88        1
nCFP7B           7.0          0.46
nCFP19           6.5          0.74
MPT59-ESAT6      14.75
The values presented are the mean erythema diameter of four animals.
The results in Table 6B indicate biological activity of nCFP7B, nCFP19 and MPT59-ESAT6, with MPT59-ESAT6 resulting in a DTH response at the level of PPD.
Biological activity of the purified recombinant antigens.
Interferon-γ induction in the mouse model of TB infection.
Primary infections. 8 to 12 week old female C57BL/6J (H-2b), CBA/J (H-2k), DBA.2 (H-2d) and A.SW (H-2s) mice (Bomholtegaard, Ry) were given intravenous infections via the lateral tail vein with an inoculum of 5 x 10^4 M. tuberculosis suspended in PBS in a volume of 0.1 ml. 14 days postinfection the animals were sacrificed and spleen cells were isolated and tested for the recognition of recombinant antigen.
As seen in TABLE 7, the recombinant antigens rCFP7A, rCFP17, rCFP21, rCFP25, and rCFP29 were all recognized in at least two strains of mice at a level comparable to ST-CF. rMPT51 and rCFP7 were only recognized in one or two strains, respectively, at a level corresponding to no more than 1/3 of the response detected after ST-CF stimulation. Neither rCFP20 nor rCFP22 was recognized by any of the four mouse strains.
As shown in TABLE 7A, the recombinant antigens rCFP27, RD1-ORF2, MPT59-ESAT6, rCFP10A, rCFP19, and rCFP25A were all recognized in at least two strains of mice at a level comparable to ST-CF, whereas ESAT6-MPT59, rCFP23, and only were recognized in one strain at this level. rCFP30A, RD1-ORF5, and rCFP16 gave rise to an IFN-γ release in two mouse strains at a level corresponding to 2/3 of the response after stimulation with ST-CF. RD1-ORF3 was recognized in two strains at a level of 1/3 of ST-CF.
The native CFP7B was recognized in two strains at a level of 1/3 of the response seen after stimulation with ST-CF.
Memory responses. 8-12 week old female C57BL/6J (H-2b) mice (Bomholtegaard, Ry) were given intravenous infections via the lateral tail vein with an inoculum of 5 x 10^4 M. tuberculosis suspended in PBS in a volume of 0.1 ml. After 1 month of infection the mice were treated with isoniazid (Merck and Co., Rahway, NJ) and rifabutin (Farmitalia Carlo Erba, Milano, Italy) in the drinking water for two months. The mice were rested for 4-6 months before being used in experiments. For the study of the recall of memory immunity, animals were infected with an inoculum of 1 x 10^6 bacteria i.v. and sacrificed at day 4 postinfection. Spleen cells were isolated and tested for the recognition of recombinant antigen.
As seen from TABLE 8, IFN-y release after stimulation with rCFP17, rCFP21 and was at the same level as seen from spleen cells stimulated with ST-CF.
Stimulation with rCFP7, rCFP7A and rCFP29 all resulted in an IFN-γ release no higher than 1/3 of the response seen with ST-CF. rCFP22 was not recognized by IFN-γ producing cells. None of the antigens stimulated IFN-γ release in naive mice. Additionally, none of the antigens were toxic to the cell cultures.
As shown in TABLE 8A, IFN-γ release after stimulation with RD1-ORF2, MPT59-ESAT6, ESAT6-MPT59, and rCFP19 was at the same level as seen from spleen cells stimulated with ST-CF. Stimulation with rCFP10A and rCFP30A gave rise to an IFN-γ release of 2/3 of the response after stimulation with ST-CF, whereas rCFP27, RD1rCFP23, rCFP25A and rCFP30B all resulted in an IFN-γ release no higher than 1/3 of the response seen with ST-CF. RD1-ORF3 and rCFP16 were not recognized by IFN-γ producing memory cells.
TABLE 7. T cell responses in primary TB infection.
Name    C57BL/6J (H-2b)    DBA.2 (H-2d)    CBA/J (H-2k)    A.SW (H-2s)
rCFP7
rCFP7A
rCFP17
rCFP20
rCFP21
rCFP22
rCFP29
rMPT51
Mouse IFN-γ release 14 days after primary infection with M. tuberculosis.
-: no response; 1/3 of ST-CF; 2/3 of ST-CF; level of ST-CF.
TABLE 7A. T cell responses in primary TB infection.
Name    C57BL/6J (H-2b)    DBA.2 (H-2d)    CBA/J (H-2k)    A.SW (H-2s)
rCFP27
rCFP30A
RD1-ORF2
RD1-ORF3
MPT59-ESAT6
ESAT6-MPT59
rCFP10A n.d. n.d.
rCFP16 n.d. n.d.
rCFP19 n.d. 4 n.d.
rCFP23 n.d. n.d.
n.d. n.d.
rCFP30B n.d. n.d.
CFP7B (native) n.d. n.d.
Mouse IFN-γ release 14 days after primary infection with M. tuberculosis.
-: no response; 1/3 of ST-CF; 2/3 of ST-CF; level of ST-CF.
n.d. not determined.
TABLE 8. T cell responses in memory immune animals.
Name    Memory response
rCFP7 +
rCFP7A
rCFP17
rCFP21
rCFP22
rCFP29
rMPT51
Mouse IFN-γ release during recall of memory immunity to M. tuberculosis.
-: no response; 1/3 of ST-CF; 2/3 of ST-CF; level of ST-CF.
TABLE 8A. T cell responses in memory immune animals.
Name    Memory response
rCFP27
rCFP30A
RD1-ORF2
RD1-ORF3
RD1-ORF5
MPT59-ESAT6
ESAT6-MPT59
rCFP10A
rCFP16
rCFP19
rCFP23
rCFP30B
Mouse IFN-γ release during recall of memory immunity to M. tuberculosis.
-: no response; 1/3 of ST-CF; 2/3 of ST-CF; level of ST-CF.
Interferon-γ induction in human TB patients and BCG vaccinated people.
Human donors: PBMC were obtained from healthy BCG vaccinated donors with no known exposure to patients with TB and from patients with culture or microscopy proven infection with Mycobacterium tuberculosis. Blood samples were drawn from the TB patients 1-4 months after diagnosis.
Lymphocyte preparations and cell culture: PBMC were freshly isolated by gradient centrifugation of heparinized blood on Lymphoprep (Nycomed, Oslo, Norway). The cells were resuspended in complete medium: RPMI 1640 (Gibco, Grand Island, NY) supplemented with 40 µg/ml streptomycin, 40 U/ml penicillin, and 0.04 mM/ml glutamine (all from Gibco Laboratories, Paisley, Scotland) and 10% normal human ABO serum (NHS) from the local blood bank. The number and the viability of the cells were determined by trypan blue staining. Cultures were established with 2.5 x 10^5 PBMC in 200 µl in microtitre plates (Nunc, Roskilde, Denmark) and stimulated with no antigen, ST-CF, PPD (2.5 µg/ml), or rCFP7, rCFP7A, rCFP17, rCFP20, rCFP21, rCFP22, rCFP26, rCFP29 at a final concentration of 5 µg/ml. Phytohaemagglutinin, 1 µg/ml (PHA, Difco Laboratories, Detroit, MI), was used as a positive control. Supernatants for the detection of cytokines were harvested after 5 days of culture, pooled and stored at -80°C until use.
Cytokine analysis: Interferon-γ (IFN-γ) was measured with a standard ELISA technique using a commercially available pair of mAbs from Endogen, used according to the manufacturer's instructions. Recombinant IFN-γ (Gibco Laboratories) was used as a standard. The detection level for the assay was 50 pg/ml. The variation between the duplicate wells did not exceed 10% of the mean. Responses of 9 individual donors are shown in TABLE 9.
As seen in TABLE 9, high levels of IFN-γ release are obtained after stimulation with several of the recombinant antigens. rCFP7A and rCFP17 give rise to responses comparable to ST-CF in almost all donors. rCFP7 seems to be most strongly recognized by BCG vaccinated healthy donors. rCFP21, rCFP25, rCFP26, and rCFP29 give rise to a mixed picture with intermediate responses in each group, whereas low responses are obtained with rCFP20 and rCFP22.
As is seen from Table 9A, RD1-ORF2 and RD1-ORF5 give rise to IFN-γ responses close to the level of ST-CF. Between 60% and 90% of the donors show high IFN-γ responses (> 1000 pg/ml). rCFP30A gives rise to a mixed response with 40-50% high responders, whilst low responses are obtained with RD1-ORF3.
As seen from Table 9B, MPT59-ESAT6 and ESAT6-MPT59 both give rise to IFN-γ responses at the level of ST-CF, and 67-89% show high responses (> 1000 pg/ml).
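The proportion of high responders quoted above can be tallied with a few lines of code. The following is a minimal sketch that applies a 1000 pg/ml cut-off to the four TB patient rows of TABLE 9 for four of the antigens. Applying this cut-off to TABLE 9, reading it as "strictly greater than", and the names `tb_patients` and `high_responder_fraction` are all assumptions made for illustration.

```python
HIGH_RESPONSE_PG_ML = 1000  # assumed cut-off for a "high" IFN-gamma response

# IFN-gamma release (pg/ml) for TB patients 6-9, copied from TABLE 9.
tb_patients = {
    6: {"CFP7": 852, "CFP17": 4250, "CFP7A": 4019, "CFP20": 284},
    7: {"CFP7": 168, "CFP17": 6375, "CFP7A": 4505, "CFP20": 11},
    8: {"CFP7": 104, "CFP17": 2753, "CFP7A": 3356, "CFP20": 119},
    9: {"CFP7": 8450, "CFP17": 9783, "CFP7A": 16319, "CFP20": 91},
}


def high_responder_fraction(donors, antigen, threshold=HIGH_RESPONSE_PG_ML):
    """Fraction of donors whose response to `antigen` exceeds `threshold`."""
    values = [row[antigen] for row in donors.values()]
    return sum(value > threshold for value in values) / len(values)


for antigen in ("CFP7", "CFP17", "CFP7A", "CFP20"):
    print(antigen, high_responder_fraction(tb_patients, antigen))
```

With only four donors per group the fractions move in steps of 25%, so they are best read as a rough summary rather than a precise estimate.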
TABLE 9. Results from the stimulation of human blood cells from 5 healthy BCG vaccinated donors and 4 TB patients with recombinant antigen. ST-CF, PPD and PHA are shown for comparison. Results are given in pg IFN-γ/ml.
Controls: healthy, BCG vaccinated, no known TB exposure
Donor  no ag  PHA    PPD    STCF   CFP7   CFP17  CFP7A  CFP20  CFP21  CFP22  CFP25  CFP26  CFP29
1      6      9564   6774   3966   7034   69     1799   58     152    73     182    946    86
2      48     12486  6603   8067   3146   10044  5267   29     6149   51     1937   526    2065
3      190    11929  10000  8299   8015   11563  8641   437    3194   669    2531   8076   6098
4      10     21029  4106   3537   1323   1939   5211   1      284    1      1344   20     125
5      1      18750  14209  13027  17725  8038   19002  1      3008   1      2103   974    8181
TB patients, 1-4 months after diagnosis
Donor  no ag  PHA    PPD    STCF   CFP7   CFP17  CFP7A  CFP20  CFP21  CFP22  CFP25  CFP26  CFP29
6      9      8973   5096   6145   852    4250   4019   284    1131   48     2400   1078   4584
7      1      12413  6281   3393   168    6375   4505   11     4335   16     3082   1370   5115
8      4      11915  7671   7375   104    2753   3356   119    407    437    2069   712    5284
9      32     22130  16417  17213  8450   9783   16319  91     5957   67     10043  13313  9953
Table 9A.
Results from the stimulation of human blood cells from 10 healthy BCG vaccinated or non vaccinated ST-CF responsive donors and 10 TB patients with recombinant antigen are shown. ST-CF, PPD and PHA are included for comparison. Results are given in pg IFN-γ/ml, and negative values below 300 pg/ml are shown as nd (not done).
Controls, Healthy BCG vaccinated, or non vaccinated ST-CF positive Donor no ag PHA PPD STCF RD1-ORF2 RD1-ORF3 RD1-ORF5 nd 3500 4200 1250 690 nd 11 nd 5890 4040 5650 880 9030 nd 12 nd 6480 3330 2310 3320 nd 13 nd 7440 4570 920 1230 nd 14 8310 nd 2990 1870 4880 10820 nd 4160 5690 810 3380 16 8710 nd 5690 1630 5600 17 7020 4480 5340 2030 nd 670 18 8370 6250 4780 3850 nd 370 1730 19 8520 1600 310 5110 nd 2330 1800 Tb patients, 1-4 month after diagnosis Donor no ag PHA PPD STCF RD1-ORF2 RD1-ORF3 RD1-ORF5 nd 10670 12680 2020 9670 nd 21 nd 3010 1420 850 350 nd 22 nd 8450 7850 430 1950 nd 23 10060 nd 3730 350 24 10830 nd 6180 2090 320 730 9000 nd 3200 4760 4960 2820 26 10740 nd 7650 4710 1170 2280 27 7550 6430 6220 2030 nd 3390 3069 28 8090 5790 4850 1100 nd 2095 550 29 7790 4800 4260 2800 nd 1210 420 WO 99/24577 PCT/DK98/00438 112 Table 9B.
Results from the stimulation of human blood cells from 9 healthy BCG vaccinated or non vaccinated ST-CF positive donors and 8 TB patients with recombinant MPT59-ESAT6 and ESAT6-MPT59 are shown. ST-CF, PPD and PHA are included for comparison. Results are given in pg IFN-γ/ml, and negative values below 300 pg/ml are shown as nd (not done).
Controls, Healthy BCG vaccinated, or non vaccinated ST-CF positive.
Donor no ag PHA PPD STCF MPT59- ESAT6- ESAT6 MPT59 1 9560 6770 3970 2030 2 12490 6600 8070 5660 5800 4 21030 4100 3540 18750 14200 13030 8540 11 nd 5890 4040 4930 8870 12 nd 6470 3330 2070 6450 14 8310 nd 2990 10270 11030 10830 nd 4160 3880 4540 16 8710 nd 5690 2240 5820 Tb patients, 1-4 month after diagnosis Donor no ag PHA PPD STCF MPT59- ESAT6- ESAT6 MPT59 6 8970 5100 6150 4150 4120 7 12410 6280 3390 5050 2040 8 11920 7670 7370 800 1350 9 22130 16420 17210 13660 5630 23 10070 nd 3730 1740 2390 24 10820 nd 6180 1270 1570 9010 nd 3200 3680 5340 26 10740 nd 7650 2070 620
EXAMPLE 6A
Four groups of 6-8 week old female C57BL/6J mice (Bomholtegård, Denmark) were immunized subcutaneously at the base of the tail with vaccines of the following compositions:
Group 1: 10 µg ESAT-6/DDA (250 µg)
Group 2: 10 µg MPT59/DDA (250 µg)
Group 3: 10 µg MPT59-ESAT-6/DDA (250 µg)
Group 4: Adjuvant control group: DDA (250 µg) in NaCl
The animals were injected with a volume of 0.2 ml. Two weeks after the first injection and 3 weeks after the second injection the mice were boosted a little further up the back.
One week after the last immunization the mice were bled and the blood cells were isolated. The immune response induced was monitored by release of IFN-y into the culture supernatants when stimulated in vitro with relevant antigens (see the following table).
Immunogen For restimulation" Ag in vitro pg/dose no antigen ST-CF ESAT-6 MPT59 ESAT-6 219 219 569 569 835 633 MPT59 0 802 182 5647 159 Hybrid: 127 127 7453 i 581 15133 861 16363 MPT59-ESAT-6 1002 Blood cells were isolated 1 week after the last immunization and the release of IFN-y (pg/ml) after 72h of antigen stimulation (5 pg/ml) was measured.
The values shown are the mean of triplicates performed on cells pooled from three mice ± SEM.
b) not determined
The experiment demonstrates that immunization with the hybrid stimulates T cells which recognize ESAT-6 and MPT59 more strongly than after single antigen immunization.
In particular, the recognition of ESAT-6 was enhanced by immunization with the MPT59-ESAT-6 hybrid. IFN-γ release in control mice immunized with DDA never exceeded 1000 pg/ml.
EXAMPLE 6B
The recombinant antigens were tested individually as subunit vaccines in mice. Eleven groups of 6-8 week old female C57BL/6J mice (Bomholtegård, Denmark) were immunized subcutaneously at the base of the tail with vaccines of the following composition:
Group 1: 10 µg CFP7
Group 2: 10 µg CFP17
Group 3: 10 µg CFP21
Group 4: 10 µg CFP22
Group 5: 10 µg CFP25
Group 6: 10 µg CFP29
Group 7: 10 µg MPT51
Group 8: 50 µg ST-CF
Group 9: Adjuvant control group
Group 10: BCG 2.5 x 10^5/ml, 0.2 ml
Group 11: Control group: Untreated
All the subunit vaccines were given with DDA as adjuvant. The animals were vaccinated with a volume of 0.2 ml. Two weeks after the first injection and three weeks after the second injection, groups 1-9 were boosted a little further up the back. One week after the last injection the mice were bled and the blood cells were isolated. The immune response induced was monitored by release of IFN-γ into the culture supernatant when stimulated in vitro with the homologous protein.
6 weeks after the last immunization the mice were aerosol challenged with 5 x 10^6 viable Mycobacterium tuberculosis/ml. After 6 weeks of infection the mice were killed and the number of viable bacteria in lung and spleen of infected mice was determined by plating serial 3-fold dilutions of organ homogenates on 7H11 plates. Colonies were counted after 2-3 weeks of incubation. The protective efficacy is expressed as the difference between the log10 values of the geometric mean of counts obtained from five mice of the relevant group and the geometric mean of counts obtained from five mice of the relevant control group.
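The protective efficacy measure described above can be sketched in a few lines. The CFU counts below are hypothetical values chosen only to illustrate the arithmetic and are not data from the experiment. The sign convention (control minus vaccinated, so that a positive number means fewer bacteria in vaccinated animals) and the function names are also assumptions.

```python
import math


def log10_geometric_mean(counts):
    """log10 of the geometric mean, i.e. the mean of the log10 counts."""
    if not counts or any(c <= 0 for c in counts):
        raise ValueError("counts must be a non-empty sequence of positive numbers")
    return sum(math.log10(c) for c in counts) / len(counts)


def protective_efficacy(control_counts, vaccinated_counts):
    """Difference of log10 geometric means (assumed: control minus vaccinated)."""
    return log10_geometric_mean(control_counts) - log10_geometric_mean(vaccinated_counts)


# Hypothetical CFU counts per organ for five mice in each group.
control = [1.0e6, 2.0e6, 5.0e5, 1.0e6, 1.0e6]
vaccinated = [1.0e5, 2.0e5, 5.0e4, 1.0e5, 1.0e5]

print(round(protective_efficacy(control, vaccinated), 2))
```

Working with the mean of the log10 counts is equivalent to taking the log10 of the geometric mean and avoids multiplying large numbers together.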
The results from the experiments are presented in the following table.
Immunogenicity and protective efficacy in mice of ST-CF and 7 subunit vaccines
Subunit vaccine    Immunogenicity    Protective efficacy
ST-CF
CFP7
CFP17
CFP21
CFP22
CFP29
MPT51
Strong immunogen / high protection (level of BCG); Medium immunogen / medium protection; No recognition / no protection
In conclusion, we have identified a number of proteins inducing high levels of protection. Three of these, CFP17, CFP25 and CFP29, give rise to levels of protection similar to ST-CF and BCG, while two proteins, CFP21 and MPT51, induce protection at around 2/3 of the level of BCG and ST-CF. Two of the proteins, CFP7 and CFP22, did not induce protection in the mouse model.
As described for rCFP7, rCFP17, rCFP21, rCFP22, rCFP25, rCFP29 and rMPT51, the two antigens rCFP7A and rCFP30A were tested individually as subunit vaccines in mice. C57BL/6J mice were immunized as described for rCFP7, rCFP17, rCFP21, rCFP22, rCFP25, rCFP29 and rMPT51, using either 10 µg rCFP7A or 10 µg rCFP30A. Controls were the same as in the experiment including rCFP7, rCFP17, rCFP21, rCFP22, rCFP25, rCFP29 and rMPT51.
Immunogenicity and protective efficacy in mice of ST-CF and 2 subunit vaccines.
Subunit vaccine    Immunogenicity    Protective efficacy
ST-CF
rCFP7A
Strong immunogen / high protection (level of BCG); Medium immunogen / medium protection; No recognition / no protection
In conclusion, we have identified two strong immunogens, of which one, rCFP7A, induces protection at the level of ST-CF.
EXAMPLE 7
Species distribution of cfp7, cfp9, mpt51, rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b as well as of cfp7a, cfp7b, cfp10a, cfp17, cfp21, cfp22, cfp22a, cfp23, cfp25 and cfp25a.
Presence of cfp7, cfp9, mpt51, rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b in different mycobacterial species.
In order to determine the distribution of the cfp7, cfp9, mpt51, rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b genes in species belonging to the M. tuberculosis complex and in other mycobacteria, PCR and/or Southern blotting was used. The bacterial strains used are listed in TABLE 10. Genomic DNA was prepared from mycobacterial cells as described previously (Andersen et al., 1992).
PCR analyses were used to determine the distribution of the cfp7, cfp9 and mpt51 genes in species belonging to the tuberculosis complex and in other mycobacteria. The bacterial strains used are listed in TABLE 10. PCR was performed on genomic DNA prepared from mycobacterial cells as described previously (Andersen et al., 1992).
The oligonucleotide primers used were synthesised automatically on a DNA synthesizer (Applied Biosystems, Foster City, CA, ABI-391, PCR-mode), deblocked, and purified by ethanol precipitation. The primers used for the analyses are shown in TABLE 11.
The PCR amplification was carried out in a thermal reactor (Rapid cycler, Idaho Technology, Idaho) by mixing 20 ng chromosomal DNA with the mastermix (containing 0.5 µM of each oligonucleotide primer, 0.25 µM BSA (Stratagene), low salt buffer (20 mM Tris-HCl, pH 8.8, 10 mM KCl, 10 mM (NH4)2SO4, 2 mM MgSO4 and 0.1% Triton X-100) (Stratagene), 0.25 mM of each deoxynucleoside triphosphate and 0.5 U Taq Plus Long DNA polymerase (Stratagene)). Final volume was 10 µl (all concentrations given are concentrations in the final volume). Predenaturation was carried out at 94°C for s. 30 cycles of the following were performed: denaturation at 94°C for 30 s, annealing at 55°C for 30 s and elongation at 72°C for 1 min.
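The cycling programme above can be written down as data, which also makes the total hold time easy to check. This is a minimal sketch: the variable names are illustrative, ramp times between temperatures are ignored, and the predenaturation step is left out because its duration is not legible in the text.

```python
# (step, temperature in degrees C, hold time in seconds) for one cycle, as stated in the text.
cycle_steps = [
    ("denaturation", 94, 30),
    ("annealing", 55, 30),
    ("elongation", 72, 60),
]
n_cycles = 30

# Hold time per cycle and over the whole cycling phase (ramping not included).
seconds_per_cycle = sum(hold for _, _, hold in cycle_steps)
total_cycling_seconds = n_cycles * seconds_per_cycle

print(seconds_per_cycle, total_cycling_seconds / 60)
```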
The following primer combinations were used (the lengths of the amplified products are given in parentheses):
mpt51: MPT51-3 and MPT51-2 (820 bp), MPT51-3 and MPT51-6 (108 bp), MPT51-5 and MPT51-4 (415 bp), MPT51-7 and MPT51-4 (325 bp).
cfp7: pvF1 and pvR1 (274 bp), pvF1 and pvR2 (197 bp), pvF3 and pvR1 (302 bp), pvF3 and pvR2 (125 bp).
cfp9: stR3 and stF1 (351 bp).
TABLE 10. Mycobacterial strains used in this Example.
Species and strain(s)                               Source
1.  M. tuberculosis H37Rv (ATCC 27294)              ATCC (a)
2.  M. tuberculosis H37Ra (ATCC 25177)              ATCC
3.  M. tuberculosis Erdman                          Obtained from A. Lazlo, Ottawa, Canada
4.  M. bovis BCG substrain: Danish 1331             SSI (b)
5.  M. bovis BCG substrain: Chinese                 SSI (c)
6.  M. bovis BCG substrain: Canadian                SSI (c)
7.  M. bovis BCG substrain: Glaxo                   SSI (c)
8.  M. bovis BCG substrain: Russia                  SSI (c)
9.  M. bovis BCG substrain: Pasteur                 SSI (c)
10. M. bovis BCG substrain: Japan                   WHO (e)
11. M. bovis MNC 27                                 SSI (c)
12. M. africanum                                    Isolated from a Danish patient
13. M. leprae (armadillo-derived)                   Obtained from J. M. Colston, London, UK
14. M. avium (ATCC 15769)                           ATCC
15. M. kansasii (ATCC 12478)                        ATCC
16. M. marinum (ATCC 927)                           ATCC
17. M. scrofulaceum (ATCC 19275)                    ATCC
18. M. intracellulare (ATCC 15985)                  ATCC
19. M. fortuitum (ATCC 6841)                        ATCC
20. M. xenopi                                       Isolated from a Danish patient
21. M. flavescens                                   Isolated from a Danish patient
22. M. szulgai                                      Isolated from a Danish patient
23. M. terrae                                       SSI (c)
24. E. coli                                         SSI (d)
25. S. aureus                                       SSI (d)
a American Type Culture Collection, USA.
b Statens Serum Institut, Copenhagen, Denmark.
c Our collection, Department of Mycobacteriology, Statens Serum Institut, Copenhagen, Denmark.
d Department of Clinical Microbiology, Statens Serum Institut, Denmark.
e WHO International Laboratory for Biological Standards, Statens Serum Institut, Copenhagen, Denmark.
TABLE 11. Sequence of the mpt51, cfp7 and cfp9 oligonucleotides (a).
Orientation and oligonucleotide    Sequence (5' to 3')    Position (b) (nucleotides)
Sense
MPT51-1    CTCGAATTCGCCGGGTGCACACAG (SEQ ID NO: 28)          6-21 (SEQ ID NO: 41)
MPT51-3    CTCGAATTCGCCCCATACGAGAAC (SEQ ID NO: 29)          143-158 (SEQ ID NO: 41)
MPT51-5    GTGTATCTGCTGGAC (SEQ ID NO: 30)                   228-242 (SEQ ID NO: 41)
MPT51-7    CCGACTGGCTGGCCG (SEQ ID NO: 31)                   418-432 (SEQ ID NO: 41)
pvR1       GTACGAGAATTCATGTCGCAAATCATG (SEQ ID NO: 35)       91-105 (SEQ ID NO: 1)
pvR2       GTACGAGAATTCGAGCTTGGGGTGCCG (SEQ ID NO: 36)       168-181 (SEQ ID NO: 1)
stR3       CGATTCCAAGCTTGTGGCCGCCGACCCG (SEQ ID NO: 37)      141-155 (SEQ ID NO: 3)
Antisense
MPT51-2    GAGGAATTCGCTTAGCGGATCGCA (SEQ ID NO: 32)          946-932 (SEQ ID NO: 41)
MPT51-4    CCCACATTCCGTTGG (SEQ ID NO: 33)                   642-628 (SEQ ID NO: 41)
MPT51-6    GTCCAGCAGATACAC (SEQ ID NO: 34)                   242-228 (SEQ ID NO: 41)
pvF1       CGTTAGGGATCCTCATCGCCATGGTGTTGG (SEQ ID NO: 38)    340-323 (SEQ ID NO: 1)
pvF3       CGTTAGGGATCCGGTTCCACTGTGCC (SEQ ID NO: 39)        268-255 (SEQ ID NO: 1)
stF1       CGTTAGGGATCCTCAGGTCTTTTCGATG (SEQ ID NO: 40)      467-452 (SEQ ID NO: 3)
a Nucleotides underlined are not contained in the nucleotide sequences of mpt51, cfp7, and cfp9.
b The positions referred to are of the non-underlined parts of the primers and correspond to the nucleotide sequence shown in SEQ ID NOs: 41, 1, and 3 for mpt51, cfp7, and cfp9, respectively.
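The primer positions in TABLE 11 can be related to the product lengths quoted for the primer combinations. The sketch below assumes that a product spans the template from the first annealing position of the sense primer to the first annealing position of the antisense primer, plus any 5' primer bases that do not anneal to the template (taken here as primer length minus the length of the stated position range). The helper names are hypothetical, and only three of the listed pairs are worked through.

```python
# name -> (sequence, first template position, last template position), from TABLE 11.
PRIMERS = {
    "MPT51-3": ("CTCGAATTCGCCCCATACGAGAAC", 143, 158),
    "MPT51-5": ("GTGTATCTGCTGGAC", 228, 242),
    "MPT51-4": ("CCCACATTCCGTTGG", 642, 628),
    "MPT51-6": ("GTCCAGCAGATACAC", 242, 228),
    "pvR1": ("GTACGAGAATTCATGTCGCAAATCATG", 91, 105),
    "pvF1": ("CGTTAGGGATCCTCATCGCCATGGTGTTGG", 340, 323),
}


def tail_length(name):
    """Number of 5' primer bases outside the stated template range (assumed non-annealing)."""
    sequence, start, end = PRIMERS[name]
    annealing_length = abs(end - start) + 1
    return len(sequence) - annealing_length


def expected_product_length(sense, antisense):
    """Template span between the two primers plus both non-annealing tails."""
    sense_start = PRIMERS[sense][1]
    antisense_start = PRIMERS[antisense][1]
    template_span = antisense_start - sense_start + 1
    return template_span + tail_length(sense) + tail_length(antisense)


for pair in (("MPT51-3", "MPT51-6"), ("MPT51-5", "MPT51-4"), ("pvR1", "pvF1")):
    print(pair, expected_product_length(*pair))
```

For these three pairs the calculation gives the lengths stated in the text (108, 415 and 274 bp); the remaining pairs are not checked here.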
The Southern blotting was carried out as described previously (Oettinger and Andersen, 1994) with the following modifications: 2 µg of genomic DNA was digested with PvuII, electrophoresed in a 0.8% agarose gel, and transferred onto a nylon membrane (Hybond N-plus; Amersham International plc, Little Chalfont, United Kingdom) with a vacuum transfer device (Milliblot, TM-v; Millipore Corp., Bedford, MA). The cfp7, cfp9, mpt51, rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b gene fragments were amplified by PCR from the plasmids pRVN01, pRVN02, pTO52, pTO87, pTO88, pTO89, pTO90, pTO91, pTO96 or pTO98 by using the primers shown in TABLE 11 and TABLE 2 (in Example 2a). The probes were labelled non-radioactively with an enhanced chemiluminescence kit (ECL; Amersham International plc, Little Chalfont, United Kingdom). Hybridization and detection were performed according to the instructions provided by the manufacturer. The results are summarized in TABLES 12 and 13.
TABLE 12. Interspecies analysis of the cfp7, cfp9 and mpt51 genes by PCR and/or Southern blotting and of MPT51 protein by Western blotting.
:PCR :Southern blot :Western blot Species and strain :cfp7 cfp mptS1 cfp7 cfp9 mpt5 :MPT51 9 1 1. M. tub. H37Rv 2. M. tub. H37Ra N.D. N.D. 3. M. tub. Erdmarm 4. M. bovis M. bovis BCG Danish 1331 6. M. bovis BCG N.D. :N.D.
Japan 7. M. bovis BCG N.D. N.D. N.D.
Chinese S. M. bovis BCG N.D. N.D. N.D.
Canadian 9. M. bovis BCG N.D. N.D. :N.D.
Glaxo M. bovis BCG N.D. N.D. N.D.
Russia 11. M. bovis BCG N.D. N.D. N.D.
Pasteur 12. M. africanun 13. M. leprae- 14. M. aviun M. kansasii 16. M. marinum 17. M. scrofulaceun- 18. M. intercellul- are 19. M. fortuitum M. flavescens 1+ :N.D.
21. M. xenopi N.D. N.D. 22. M. szulgai 23. N. terrae N.D. N.D. N.D. N.D. N.E.
+,positive reaction; no reaction, N.D. not determined.
cfp7, cfp9 and mpt51 were found in the M. tuberculosis complex including BCG and in the environmental mycobacteria M. avium, M. kansasii, M. marinum, M. intracellulare and M. flavescens. cfp9 was additionally found in M. szulgai and mpt51 in M. xenopi.
Furthermore, the presence of native MPT51 in culture filtrates from different mycobacterial strains was investigated with Western blots developed with Mab HBT4.
There is a strong band at around 26 kDa in M. tuberculosis H37Rv, Ra, Erdman, M. bovis AN5, M. bovis BCG substrain Danish 1331 and M. africanum. No band was seen in this region in any other tested mycobacterial strain.
TABLE 13a. Interspecies analysis of the rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b genes by Southern blotting.
Species and rdl- rdl- rdl- rdl- rdl- rdl- rdlstrain orf2 or! 3 orff4 or! 5 orf8 or!9a orf9b 1. M. tub. H3 7Rv 2. M. bovis N.D. 3. M. bovis N.D.
BCG
Danish 1331 4. M. bovis N.D. BCG Japan S. M.aviun N.D. 6. M. N.D. kansasii 7. M. marinumn+ 8. M. scrofu- N.D. laceum 9. M. N.D. intercellulare M. N.D. fortul turn 11. M. xenopa-- N.D. 12. M. N.D. sz ulgal positive reaction; -,no reaction, N.D. not determined.
Positive results for rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b were only obtained when using genomic DNA from M. tuberculosis and M. bovis, and not from M. bovis BCG or the other mycobacteria analyzed, except for rd1-orf4, which was also found in M. marinum.
Presence of cfp7a, cfp7b, cfp10a, cfp17, cfp20, cfp21, cfp22, cfp22a, cfp23, cfp25 and cfp25a in different mycobacterial species.
Southern blotting was carried out as described for rd1-orf2, rd1-orf3, rd1-orf4, rd1-orf5, rd1-orf8, rd1-orf9a and rd1-orf9b. The cfp7a, cfp7b, cfp10a, cfp17, cfp20, cfp21, cfp22, cfp22a, cfp23, cfp25 and cfp25a gene fragments were amplified by PCR from the recombinant pMCT6 plasmids encoding the individual genes. The primers used (the same as the primers used for cloning) are described in examples 3, 3A and 3B. The results are summarized in Table 13b.
TABLE 13b. Interspecies analysis of the cfp7a, cfp7b, cfp10a, cfp17, cfp20, cfp21, cfp22, cfp22a, cfp23, cfp25, and cfp25a genes by Southern blotting.
Species and cfp7a cfp7 cfp- cfp cfp cfp2 cfp2 cfp- cfp cfp cfpstrain b 10a 17 20 1 2 22a 23 25 1. M. tub. H 37 Rv 2. M. bovis 3. M. bovis N. D.
BCG
Danish 1 331 4. M. bovis BCG Japan M. a viurn 6. M. kansasi 7. M. marinum 8. M. scrofulaceum 9. M. intercellu/are M. fortulturn
N.D.
N.D.
N.D.
11. M. xenopi 12. M. szu/gai positive reaction; -no reaction, N.D. not determined.
LIST OF REFERENCES
Andersen, P. and Heron, I., 1993, J. Immunol. Methods 161: 29-39.
Andersen, A. B. et al., 1992, Infect. Immun. 60: 2317-2323.
Andersen, 1994, Infect. Immun. 62: 2536-44.
Andersen, P. et al., 1995, J. Immunol. 154: 3359-72.
Barkholt, V. and Jensen, A., 1989, Anal. Biochem. 177: 318-322.
Borodovsky, M. and McIninch, J., 1993, Computers Chem. 17: 123-133.
van Dyke, M. W. et al., 1992, Gene, pp. 99-104.
Gosselin et al., 1992, J. Immunol. 149: 3477-3481.
Harboe, M. et al., 1996, Infect. Immun. 64: 16-22.
von Heijne, G., 1984, J. Mol. Biol. 173: 243-251.
Hochstrasser, D. F. et al., 1988, Anal. Biochem. 173: 424-435.
Köhler, G. and Milstein, C., 1975, Nature 256: 495-497.
Li, H. et al., 1993, Infect. Immun. 61: 1730-1734.
Lindblad, E. B. et al., 1997, Infect. Immun. 65: 623-629.
Mahairas, G. G. et al., 1996, J. Bacteriol. 178: 1274-1282.
Maniatis, T. et al., 1989, "Molecular cloning: a laboratory manual", 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
Nagai, S. et al., 1991, Infect. Immun. 59: 372-382.
Oettinger, T. and Andersen, A., 1994, Infect. Immun. 62: 2058-2064.
Ohara, N. et al., 1995, Scand. J. Immunol. 41: 233-442.
Pal, P. G. and Horwitz, M., 1992, Infect. Immun. 60: 4781-92.
Pearson, W. R. and Lipman, D., 1988, Proc. Natl. Acad. Sci. USA 85: 2444-2448.
Ploug, M. et al., 1989, Anal. Biochem. 181: 33-39.
Porath, J. et al., 1985, FEBS Lett. 185: 306-310.
Roberts, A. D. et al., 1995, Immunol. 85: 502-508.
Sorensen, A. L. et al., 1995, Infect. Immun. 63: 1710-1717.
Theisen, M. et al., 1995, Clinical and Diagnostic Laboratory Immunology 2: 30-34.
Valdés-Stauber, N. and Scherer, S., 1994, Appl. Environ. Microbiol. 60: 3809-3814.
Valdés-Stauber, N. and Scherer, S., 1996, Appl. Environ. Microbiol. 62: 1283-1286.
Williams, 1996, Science 272: 27.
Young, R. A. et al., 1985, Proc. Natl. Acad. Sci. USA 82: 2583-2587.
EDITORIAL NOTE 94338/98
SEQUENCE LISTING PAGES 1 TO 116 FOLLOW PAGE 125
SEQUENCE LISTING
GENERAL INFORMATION:
APPLICANT:
NAME: Statens Seruminstitut
STREET: Artillerivej
CITY: Copenhagen
COUNTRY: Denmark
POSTAL CODE (ZIP): 2300 S
(ii) TITLE OF INVENTION: Nucleic acid fragments and polypeptide fragments derived from M. tuberculosis
(iii) NUMBER OF SEQUENCES: 193
(iv) COMPUTER READABLE FORM:
MEDIUM TYPE: Floppy disk
COMPUTER: IBM PC compatible
OPERATING SYSTEM: PC-DOS/MS-DOS
SOFTWARE: PatentIn Release Version #1.30 (EPO)
INFORMATION FOR SEQ ID NO: 1:
SEQUENCE CHARACTERISTICS:
LENGTH: 381 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(vi) ORIGINAL SOURCE:
ORGANISM: Mycobacterium tuberculosis
STRAIN: H37Rv
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 91..381
(ix) FEATURE:
NAME/KEY:
LOCATION: 14..19
(ix) FEATURE:
NAME/KEY:
LOCATION: 47..50
(ix) FEATURE:
NAME/KEY: RBS
LOCATION: 78..84
(ix) FEATURE:
NAME/KEY: mat_peptide
LOCATION: 91..381
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGCCGCCGGT ACCTATGTGG CCGCCGATGC TGCGGACGCG TCGACCTATA CCGGGTTCTG
ATCGAACCCT GCTGACCGAG AGGACTTGTG ATG TCG CAA ATC ATG TAC AAC TAC 114
Met Ser Gln Ile Met Tyr Asn Tyr
CCC GCG ATG TTG GGT CAC GCC GGG GAT ATG GCC GGA TAT GCC GGC ACG 162
Pro Ala Met Leu Gly His Ala Gly Asp Met Ala Gly Tyr Ala Gly Thr
CTG CAG AGC TTG GGT GCC GAG ATC GCC GTG GAG CAG GCC GCG TTG CAG 210
Leu Gln Ser Leu Gly Ala Glu Ile Ala Val Glu Gln Ala Ala Leu Gln
AGT GCG TGG CAG GGC GAT ACC GGG ATC ACG TAT CAG GCG TGG CAG GCA 258
Ser Ala Trp Gln Gly Asp Thr Gly Ile Thr Tyr Gln Ala Trp Gln Ala
CAG TGG AAC CAG GCC ATG GAA GAT TTG GTG CGG GCC TAT CAT GCG ATG 306
Gln Trp Asn Gln Ala Met Glu Asp Leu Val Arg Ala Tyr His Ala Met
TCC AGC ACC CAT GAA GCC AAC ACC ATG GCG ATG ATG GCC CGC GAC ACC 354
Ser Ser Thr His Glu Ala Asn Thr Met Ala Met Met Ala Arg Asp Thr
GCC GAA GCC GCC AAA TGG GGC GGC TAG 381
Ala Glu Ala Ala Lys Trp Gly Gly
INFORMATION FOR SEQ ID NO: 2:
SEQUENCE CHARACTERISTICS:
LENGTH: 96 amino acids
TYPE: amino acid
TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Ser Gln Ile Met Tyr Asn Tyr Pro Ala Met Leu Gly His Ala Gly
Asp Met Ala Gly Tyr Ala Gly Thr Leu Gln Ser Leu Gly Ala Glu Ile
Ala Val Glu Gln Ala Ala Leu Gln Ser Ala Trp Gln Gly Asp Thr Gly
Ile Thr Tyr Gln Ala Trp Gln Ala Gln Trp Asn Gln Ala Met Glu Asp
Leu Val Arg Ala Tyr His Ala Met Ser Ser Thr His Glu Ala Asn Thr
Met Ala Met Met Ala Arg Asp Thr Ala Glu Ala Ala Lys Trp Gly Gly
INFORMATION FOR SEQ ID NO: 3:
SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 SEQUENCE CHARACTERISTICS: LENGTH: 467 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: circular (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium STRAIN: H37Rv tuberculosis (ix) FEATURE:
NAME/KEY:
LOCATION:
(ix) FEATURE:
NAME/KEY:
LOCATION:
(ix) FEATURE:
NAME/KEY:
LOCATION:
(ix) FEATURE:
NAME/KEY:
LOCATION:
(ix) FEATURE:
NAME/KEY:
LOCATION:
CDS
141..467 73..78 4..9
RBS
123..130 mat_peptide 141..467 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: GGGTAGCCGG ACCACGGCTG GGCAAAGATG TGCAGGCCGC CATCAAGGCG GTCAAGGCCG GCGACGGCGT CATAAACCCG GACGGCACCT TGTTGGCGGG CCCCGCGGTG CTGACGCCCG ACGAGTACAA CTCCCGGCTG GTG GCC GCC GAC CCG GAG TCC ACC GCG GCG Met Ala Ala Asp Pro Glu Ser Thr Ala Ala TTG CCC GAC Leu Pro Asp GAA CTC GAA Glu Leu Glu GGC GCC GGG CTG GTC GTT CTG GAT GGC ACC GTC Gly Ala Gly Leu Val Val Leu Asp Gly Thr Val ACT GCC Thr Ala
GCC
Ala GAG GGC TGG GCC Glu Gly Trp Ala GAT CGC ATC CGC Asp Arg Ile Arg GAA CTG CAA Glu Leu Gin ATC CGG GTG Ile Arg Val GAG CTG CGT AAG TCG ACC GGG CTG GAC GTT TCC GAC Glu Leu Arg Lys Ser Thr Gly Leu Asp Val Ser Asp GTG ATG Val Met TCG GTG CCT GCG Ser Val Pro Ala
GAA
Glu CGC GAA GAC TGG Arg Glu Asp Trp CGC ACC CAT CGC Arg Thr His Arg GAC CTC ATT GCC GGA GAA ATC TTG GCT ACC GAC TTC GAA TTC GCC GAC SUBSTITUTE SHEET (RULE 26) U9 nn/24577 PCT/IqR/ndl438 vT, 771*J I I 4 Asp Leu Ile Ala Gly Glu Ile Leu Ala Thr Asp Phe Glu Phe Ala Asp 80 85 CTC GCC GAT GGT GTG GCC ATC GGC GAC GGC GTG CGG GTA AGC ATC GAA 458 Leu Ala Asp Gly Val Ala Ile Gly Asp Gly Val Arg Val Ser Ile Glu 100 105 AAG ACC TGA 467 Lys Thr INFORMATION FOR SEQ ID NO: 4: SEQUENCE CHARACTERISTICS: LENGTH: 108 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: Met Ala Ala Asp Pro Glu Ser Thr Ala Ala Leu Pro Asp Gly Ala Gly 1 5 10 Leu Val Val Leu Asp Gly Thr Val Thr Ala Glu Leu Glu Ala Glu Gly 25 Trp Ala Lys Asp Arg Ile Arg Glu Leu Gin Glu Leu Arg Lys Ser Thr 40 Gly Leu Asp Val Ser Asp Arg Ile Arg Val Val Met Ser Val Pro Ala 55 Glu Arg Glu Asp Trp Ala Arg Thr His Arg Asp Leu Ile Ala Gly Glu 70 75 Ile Leu Ala Thr Asp Phe Glu Phe Ala Asp Leu Ala Asp Gly Val Ala 90 Ile Gly Asp Gly Val Arg Val Ser Ile Glu Lys Thr 100 105 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 889 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: circular (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv (ix) FEATURE: NAME/KEY: CDS LOCATION: 201..689 SUBSTITUTE SHEET (RULE 26) Wn Q /'7AC7'7 PCT/DK98/00438 (ix) FEATURE: NAME/KEY: sigpeptide LOCATION: 201..290 (ix) FEATURE: NAME/KEY: mat~peptide LOCATION: 291..689 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: CGGGTCTGCA CGGATCCGGG CCGGGCAGGG CAATCGAGCC TGGGATCCGC TGGGGTGCGC ACATCGCGGA CCCGTGCGCG GTACGGTCGA GACAGCGGCA CGAGAAAGTA GTAAGGGCGA TAATAGGCGG TAAAGAGTAG CGGGAAGCCG GCCGAACGAC TCGGTCAGAC AACGCCACAG CGGCCAGTGA GGAGCAGCGG GTG ACG GAC ATG AAC CCG GAT ATT GAG AAG Met Thr Asp Met Asn Pro Asp Ile Giu Lys
GAC
Asp CAG ACC TCC GAT Gin Thr Ser Asp GTC ACG GTA GAG Vai Thr Val Glu ACC TCC GTC TTC Thr Ser Val Phe GCA GAC TTC CTC AGC GAG CTG GAC Ala Asp Phe Leu Ser Giu Leu Asp 1 CCT GCG CAA GCG GGT ACG GAG Pro Ala Gin Ala Gly Thr Glu AGC GCG GTC TCC GGG GTG GAA Ser Ala Val Ser Gly Val Glu
GGG
Gly CTC CCG CCG GGC Leu Pro Pro Gly GCG TTG CTG Ala Leu Leu GTA GTC Val Val AAA CGA GGC CCC Lys Arg Gly Pro GCC GGG TCC CGG Ala Gly Ser Arg CTA CTC GAC CAA Leu Leu Asp Gin GCC ATC ACG TCG GCT GGT CGG CAT CCC GAC Ala Ile Thr Ser Ala Gly Arg His Pro Asp GAC ATA TTT CTC Asp Ile Phe Leu GAC 470 Asp GAC GTG ACC GTG Asp Val Thr Val CGT CGC CAT GCT Arg Arg His Ala
GAA
Glu TTC CGG TTG GAA Phe Arg Leu Glu AAC AAC Asn Asn GAA TTC AAT Glu Phe Asn CGC GAG CCC Arg Glu Pro
GTC
Val GTC GAT GTC GGG Val Asp Val Gly CTC AAC GGC ACC Leu Asn Gly Thr TAC GTC AAC Tyr Val Asn GAG GTC CAG Glu Val Gin 566 GTG GAT TCG GCG Val Asp Ser Ala
GTG
Val 100 CTG GCG AAC GGC Leu Ala Asn Gly ATC GGC Ile Gly 110 AAG TTC CGG TTG GTG TTC TTG ACC GGA CCC AAG CAA GGC GAG Lys Phe Arg Leu Val Phe Leu Thr Gly Pro Lys Gin Gly Glu
GAT
Asp 125 GAC GGG AGT ACC Asp Gly Ser Thr GGG GGC CCG Gly Gly Pro 130 TGA GCGCACCCGA TAGCCCCGCG SUBSTTUTE SHEET (RULE 26) PCT/DK98/00438 WO /14flA77PTfV~~l 6 CTGGCCGGGA TGTCGATCGG GGCGGTCCTC GACCTGCTAC GACCGGATTT TCCTGATGTC ACCATCTCCA AGATTCGATT CTTGGAGGCT GAGGGTCTGG TGACGCCCCG GCGGGCCTCA TCGGGGTATC GGCGGTTCAC CGCATACGAC TGCGCACGGC TGCGATTCAT TCTCAC!TGCC INFORMATION FOR SEQ ID NO: 6: SEQUENCE CHARACTERISTICS: LENGTH: 162 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: Met Thr Asp Met Asn Pro Asp Ile Glu Lys Val Leu Glu Asn Arg Arg Val Ala Val Gly Phe Thr Leu Asp Leu Asn Val Val Gly Arg Glu Leu Gin Asp Asn Asn Gin Glu Asp Ala Ser Val1 Ala Asp Glu Arg Ile Asp 125 Phe Val1 Lys Thr Thr Asn Pro Lys Gly Gin Thr Ser Asp Glu Leu Ser Arg Ser Val Val Val Phe Ser INFORMATION FOR SEQ ID NO: 7: SEQUENCE CHARACTERISTICS: LENGTH: 898 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: circular (ii) MOLECULE TYPE: DNA (genomic) SUBSTITUTE SHEET (RULE 28) PCT/DK98/f00438 H^ f\f\ M A V^^ wu vy3 7 (vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv (ix) FEATURE: NAME/KEY: CDS LOCATION: 201..698 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 201..698 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: TCGACTCCGG CGCCACCGGG CAGGATCACG GTGTCGACGG GGTCGCCGGG GAATCCCACG ATAACCACTC TTCGCGCCAT GAATGCCAGT GTTGGCCAGG CGCTGGCCTG GCGTCCACGC CACACACCGC ACAGATTAGG ACACGCCGGC GGCGCAGCCC TGCCCGAAAG ACCGTGCACC GGTCTTGGCA GACTGTGCCC ATG GCA CAG ATA ACC CTG CGA GGA AAC GCG Met Ala Gin Ile Thr Leu Arg Gly Asn Ala 120 180 230 ATC AAT ACC GTC GGT GAG CTA Ile Asn Thr Val Gly Glu Leu CCT GCT GTC Pro Ala Val 20 GGA TCC CCG GCC Gly Ser Pro Ala CCG GCC Pro Ala 278 TTC ACC CTG Phe Thr Leu
ACC
Thr GGG GGC GAT CTG Gly Gly Asp Leu
GGG
Gly 35 GTG ATC AGC AGC Val Ile Ser Ser GAC CAG TTC Asp Gin Phe GAC ACA CCG Asp Thr Pro CGG GGT AAG TCC GTG TTG CTG Arg Gly Lys Ser Val Leu Leu AAC ATC Asn Ile TTT CCA TCC Phe Pro Ser GTG TGC Val Cys GCG ACG AGT GTG Ala Thr Ser Val ACC TTC GAC GAG Thr Phe Asp Glu GCG GCG GCA AGT Ala Ala Ala Ser
GGC
Gly GCT ACC GTG CTG Ala Thr Val Leu GTC TCG AAG GAT Val Ser Lys Asp CCG TTC GCC CAG Pro Phe Ala Gin CGC TTC TGC GGC Arg Phe Cys Gly GAG GGC ACC GAA Glu Gly Thr Glu GTC ATG CCC GCG Val Met Pro Ala TCG GCA Ser Ala 105 TTC CGG GAC Phe Arg Asp CCG ATG GCC Pro Met Ala 125
AGC
Ser 110 TTC GGC GAG GAT Phe Gly Glu Asp GGC GTG ACC ATC Gly Val Thr Ile GCC GAC GGG Ala Asp Gly 120 GGC GCG GAC Gly Ala Asp GGG CTG CTC GCC Gly Leu Leu Ala GCA ATC GTG GTG Ala Ile Val Val GGC AAC GTC GCC TAC ACG GAA TTG GTG CCG GAA ATC GCG CAA GAA CCC SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 Gly Asn Val Ala Tyr Thr Glu Leu Val Pro Glu Ile Ala Gin Glu Pro 140 145 150 AAC TAC GAA GCG GCG CTG GCC GCG CTG GGC GCC TAG GCTTTCACAA Asn Tyr Glu Ala Ala Leu Ala Ala Leu Gly Ala 155 160 165 GCCCCGCGCG TTCGGCGAGC AGCGCACGAT TTCGAGCGCT GCTCCCGAAA AGCGCCTCGG TGGTCTTGGC CCGGCGGTAA TACAGGTGCA GGTCGTGCTC CCACGTGAAG GCGATGGCAC CGTGGATCTG AAGAGCGGAG CCGGCGCATA ACACAAAGGT TTCCGCGGTC TGCGCCTTCG
CCAGCGGCGC
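The CDS locations and the declared protein lengths in the records above can be cross-checked against each other. The following is a minimal sketch, not part of the listing as filed: it assumes that each CDS span is given in 1-based inclusive coordinates and ends with a stop codon (as the TAG at 379..381 of SEQ ID NO: 1 does), and the names `CDS_RECORDS`, `expected_residues` and `check_records` are hypothetical. The coordinates and lengths are copied from SEQ ID NOs 1 to 8.

```python
# CDS spans (1-based, inclusive) and declared protein lengths, copied from the
# feature tables of SEQ ID NOs 1/2, 3/4, 5/6 and 7/8 in this listing.
# Keys are the SEQ ID NO of the nucleic acid record.
CDS_RECORDS = {
    1: {"cds": (91, 381), "protein_seq_id": 2, "declared_aa": 96},
    3: {"cds": (141, 467), "protein_seq_id": 4, "declared_aa": 108},
    5: {"cds": (201, 689), "protein_seq_id": 6, "declared_aa": 162},
    7: {"cds": (201, 698), "protein_seq_id": 8, "declared_aa": 165},
}


def expected_residues(start: int, end: int) -> int:
    """Number of residues encoded by a CDS span.

    Assumption: the span includes a terminal stop codon, so one codon
    is subtracted from the total.
    """
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"CDS span {start}..{end} is not a whole number of codons")
    return span // 3 - 1


def check_records(records: dict) -> dict:
    """Map each nucleic acid SEQ ID NO to True if its CDS span matches
    the length declared for the paired protein record."""
    return {
        seq_id: expected_residues(*rec["cds"]) == rec["declared_aa"]
        for seq_id, rec in records.items()
    }


if __name__ == "__main__":
    for seq_id, ok in check_records(CDS_RECORDS).items():
        rec = CDS_RECORDS[seq_id]
        status = "consistent" if ok else "MISMATCH"
        print(f"SEQ ID NO: {seq_id} -> SEQ ID NO: {rec['protein_seq_id']}: {status}")
```

A check of this kind is useful mainly for spotting transcription damage: a CDS span that is not a multiple of three, or that disagrees with the declared protein length, suggests that a coordinate or a length field was corrupted.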
INFORMATION FOR SEQ ID NO: 8: SEQUENCE CHARACTERISTICS: LENGTH: 165 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 708 768 828 888 898 Met Ala Gin Ile Thr Leu Arg Gly Asn Ala Ile Asn Thr Val Leu Asp Leu Arg Val Gly Glu Ala Glu 145 Ala Pro Leu Asn Thr Ser Thr Asp Arg 130 Leu Ala Ala Gly Ile Phe Lys Glu Tyr 115 Ala Val Leu Val Val Phe Asp Asp Asn 100 Gly Ile Pro Gly Gly Ile Pro Glu Leu Val Val Val Glu Ala Ser Ser Ser Arg 70 Pro Met Thr Val Ile 150 Ala Pro 25 Asp Gin 40 Asp Thr Ala Ala Ala Gin Ala Ser 105 Ala Asp 120 Gly Ala Gin Glu Thr Gly Cys Ala Phe Arg Met Asn 140 Tyr Leu Lys Ala Thr Cys Asp Ala 125 Val Glu Thr Ser Thr Val Gly Ser 110 Gly Ala Ala Gly Glu Gly Gly Val Leu Ser Val Leu Cys Ala Glu Phe Gly Leu Leu Tyr Thr Ala Leu 160 INFORMATION FOR SEQ ID NO: 9: SUBSTITUTE SHEET (RULE 26) WO 99/24577 9 SEQUENCE CHARACTERISTICS: LENGTH: 1054 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: circular (ii) MOLECULE TYPE: DNA (genomic) PCTIDK98/00438 (vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium STRAIN: H37Rv tuberculosis (ix) FEATURE:
NAME/KEY:
LOCATION:
(ix) FEATURE:
NAME/KEY:
LOCATION:
(ix) FEATURE:
NAME/KEY:
LOCATION:
CDS
201..854 sig_peptide 201..296 mat peptide 297..854 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: ATAATCAGCT CACCGTTGGG ACCGACCTCG ACCAGGGGTC CTTTGTGACT GCCGGGCTTG ACGCGGACGA CCACAGAGTC GGTCATCGCC TAAGGCTACC GTTCTGACCT GGGGCTGCGT GGGCGCCGAC GACGTGAGGC ACGTCATGTC TCAGCGGCCC ACCGCCACCT CGGTCGCCGG CAGTATGTCA GCATGTGCAG ATG ACT CCA CGC AGC CTT GTT CGC ATC GTT Met Thr Pro Arg Ser Leu Val Arg Ile Val -32 -30 GGT GTC GTG Gly Val Val GTT GCG ACG ACC Val Ala Thr Thr GCG CTG GTG AGC Ala Leu Val Ser
GCA
Ala CCC GCC GGC Pro Ala Gly 278 GGT CGT Gly Arg GCC GCG CAT GCG GAT CCG TGT TCG Ala Ala His Ala Asp Pro Cys Ser 1 ATC GCG GTC GTT TTC Ile Ala Val Val Phe GCT CGC GGC ACG Ala Arg Gly Thr
CAT
His CAG GCT TCT GGT Gin Ala Ser Gly GGC GAC GTC GGT Gly Asp Val Gly GAG GCG Glu Ala TTC GTC GAC Phe Val Asp TAC GCG GTG Tyr Ala Val CTT ACC TCG CAA Leu Thr Ser Gin
GTT
Val GGC GGG CGG TCG Gly Gly Arg Ser ATT GGG GTC Ile Gly Val AAC TAC CCA GCA AGC GAC GAC TAC CGC GCG AGC GCG TCA Asn Tyr Pro Ala Ser Asp Asp Tyr Arg Ala Ser Ala Ser AAC GGT TCC GAT GAT GCG AGC GCC CAC ATC CAG CGC ACC GTC GCC AGC Asn Gly Ser Asp Asp Ala Ser Ala His Ile Gin Arg Thr Val Ala Ser SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 TGC CCG Cys Pro AAC ACC AGG ATT GTG CTT GGT GGC Asn Thr Arg Ile Val Leu Gly Gly TCG CAG GGT GCG Ser Gin Gly Ala
ACG
Thr GTC ATC GAT TTG Val Ile Asp Leu ACC TCG GCG ATG Thr Ser Ala Met
CCG
Pro 100 GCG GTG GCA Ala Val Ala GAT CAT Asp His 105 GTC GCC GCT Val Ala Ala ATG TTG TGG Met Leu Trp 125 TCT AAG ACC Ser Lys Thr 140 CTT TTC GGC Leu Phe Gly
GAG
Glu 115
CCG
Pro CCA TCC AGT GGT Pro Ser Ser Gly GGC GGG TCG Gly Gly Ser
TTG
Leu 130
GCT
Ala ACA ATC GGT Thr Ile Gly
CCG
Pro 135
ATA
Ile TTC TCC AGC Phe Ser Ser 120 CTG TAT AGC Leu Tyr Ser TGC ACC GGA Cys Thr Gly ATA AAC TTG Ile Asn Leu CCC GAC GAT Pro Asp Asp
GGC
Gly 155
AGC
Ser GGC AAT ATT ATG Gly Asn Ile Met CAG GCG GCG ACA Gin Ala Ala Thr GTT TCG TAT Val Ser Tyr
GTT
Val 165
CTC
Leu TCG GGG ATG Ser Gly Met 710 758 806 854 914 974 1034 1054 GCG GCG AAC AGG Ala Ala Asn Arg GAT CAC GCC GGA Asp His Ala Gly
TCAAAGACTG
AAGGGCAAGA
GTGTTGAACG
GCCGGCAGGG
TTGTCCCTAT ACCGCTGGGG CTGTAGTCGA ACCCGGTATT CATCAGGCCG GATGAAATGA CGTAGAGCCG ATCACCGCCG GGGCTGGTGT
TTCCGGATCC
TGTACACCGG CTGGAATCTG CGGTCGGGCG GTAATCGTTT AGACCTCAAT GTTTGTGTTC INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 217 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Met Thr Pro Arg Ser Leu Val Arg Ile Val Gly Val Val Val -32 -30 -25 Thr Leu Ala Leu Val Ser Ala Pro Ala Gly Gly Arg Ala Ala -10 Asp Pro Cys Ser Asp Ile Ala Val Val Phe Ala Arg Gly Thr 1 5 10 Ala Ser Gly Leu Gly Asp Val Gly Glu Ala Phe Val Asp Ser 25 Ala Thr His Ala His Gin Leu Thr SUBSTITUTE SHEET (RULE 26) rOl nnQ/lC T' PCT/DK98/00438 "TW I 11 Ser Gin Val Gly Gly Arg Ser Ile Gly Val Tyr Ala Val Asn Tyr Pro 40 Ala Ser Asp Asp Tyr Arg Ala Ser Ala Ser Asn Gly Ser Asp Asp Ala 55 Ser Ala His Ile Gin Arg Thr Val Ala Ser Cys Pro Asn Thr Arg Ile 70 75 Val Leu Gly Gly Tyr Ser Gin Gly Ala Thr Val Ile Asp Leu Ser Thr 90 Ser Ala Met Pro Pro Ala Val Ala Asp His Val Ala Ala Val Ala Leu 100 105 110 Phe Gly Glu Pro Ser Ser Gly Phe Ser Ser Met Leu Trp Gly Gly Gly 115 120 125 Ser Leu Pro Thr Ile Gly Pro Leu Tyr Ser Ser Lys Thr Ile Asn Leu 130 135 140 Cys Ala Pro Asp Asp Pro Ile Cys Thr Gly Gly Gly Asn Ile Met Ala 145 150 155 160 His Val Ser Tyr Val Gin Ser Gly Met Thr Ser Gin Ala Ala Thr Phe 165 170 175 Ala Ala Asn Arg Leu Asp His Ala Gly 180 185 INFORMATION FOR SEQ ID NO: 11: SEQUENCE CHARACTERISTICS: LENGTH: 949 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: circular (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv (ix) FEATURE: NAME/KEY: CDS LOCATION: 201..749 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 224..749 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: AGCCGCTCGC GTGGGGTCAA CCGGGTTTCC ACCTGCTCAC TCATTTTGCC GCCTTTCTGT GTCCGGGCCG AGGCTTGCGC TCAATAACTC GGTCAAGTTC CTTCACAGAC TGCCATCACT 120 GGCCCGTCGG CGGGCTCGTT GCGGGTGCGC CGCGTGCGGG TTTGTGTTCC GGGCACCGGG 180 SUBSTITUTE SHEET (RULE 26) CT/DK98/00438 111d"% fllIAC!77 P TGGGGGCCCG CCCGGGCGTA ATG GCA GAC TGT GAT TCC GTG ACT 
AAC AGC Met Ala Asp Cys Asp Ser Val Thr Asn Ser
CCC
Pro
AAG
Lys
GTG
Val
GGT
Gly
ATC
Ile
GGC
Gly
TTC
Phe 100
AAC
Asn
CGG
Arg
GTT
Val
ACG
CTT
Leu
ATC
Ile
GGC
Gly
GGC
Gly
CAG
Gin
GGA
Gly
GAC
Asp
GGC
Gly
CGC
Arg
GTG
Val
GAC
GCG
Ala
GCC
Ala
CTT
Leu
CCG
Pro
GGC
Gly
CCC
Pro
AAG
Lys
TCA
Ser
CAC
His
GAG
Glu 150
CCG
ACC
Thr
CTG
Leu
GCG
Ala
TCC
Ser
TTC
Phe
GGC
Gly
CCC
Pro
CAG
Gin
ACC
Thr 135
GCG
Ala
GTG
ACC GCC ACG CTG CAC ACT AAC CGC GGC GAC ATC Thr
GGA
Gly 25
GGC
Gly
CCG
Pro
ATC
Ile
AAG
Lys
CTG
Leu 105
TTC
Phe
TTC
Phe
TCC
Ser
ATC
Ala 10
AAC
Asn
ACC
Thr
TTC
Phe
CAG
Gin
TTC
Phe 90
CTC
Leu
ATC
Ile
GGT
Gly
AAG
Lys
GAG
Thr Leu CAT GCG His Ala AAG GAC Lys Asp TAC GAC Tyr Asp 60 GGT GGC Gly Gly 75 GCC GAC Ala Asp GCG ATG Ala Met ACC GTC Thr Val GAA GTG Giu Val 140 ACG GCC Thr Ala 155 TCG ATC His
CCC
Pro
TAT
Tyr
GGC
Gly
GAT
Asp
GAG
Giu
GCC
Ala
GGC
Gly 125
ATC
Ile
ACC
Thr
ACC
Thr
AAG
Lys 30
TCG
Ser
GCG
Ala
CCA
Pro
TTC
Phe
AAC
Asn 110
AAG
Lys
GAC
Asp
GAC
Asp
ATC
Asn
ACC
Thr
ACC
Thr
GTC
Val
ACC
Thr
CAC
His
GCC
Ala
ACT
Thr
GCG
Ala
GGC
Gly
TCC
Arg
GTC
Val
CAA
Gin
TTT
Phe
GGG
Gly
CCC
Pro
GGT
Gly
CCG
Pro
GAG
Giu
AAC
Asn 160
TGA
Gly
GCC
Ala
AAC
Asn
CAC
His
ACG
Thr
GAG
Giu
CCG
Pro
CAC
His
TCA
Ser 145
GAT
Asp
AAT
Asn
GCA
Ala
CGG
Arg
GGT
Gly
CTG
Leu
GGC
Gly
CTG
Leu 130
CAG
Gin
CGG
Ile
TTT
Phe
TCA
Ser
GTG
Val
CGC
Arg
CAA
Gin
ACC
Thr 115
AAC
Asn
CGG
Arg
CCG
326 374 422 470 Asp Arg Pro
CCCGAAGCTA
Thr Asp 165 Pro Val Val Ile Giu Ser Ile Thr Ile Ser CGTCGGCTCG TCGCTCGAAT ACACCTTGTG GACCCGCCAG CACGCCGTTG GGGCCGTTCA ACCGGACGCC CTCACGCCAA GACCGGCGTA ACCGGCAGCG GTAAGCGCAT CGAGCACCTC
CCCAGCGGGA
INFORMATION FOR SEQ ID NO: 12: SEQUENCE CHARACTERISTICS: LENGTH: 182 amino acids GGCACGTGGC GGTACACCGA GTCCGCTCAC! CTTTGGCCGC CACTGGGTCG GTGCCGAGAT 819 879 939 949 SUBSTITUTE SHEET (RULE 26) WO 99/24577 13 TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: Met Ala Asp Cys Asp Ser Val Thr Asn Ser Pro Le PCT/DK98/00438 -7 -5 Ala Thr Leu Asn His Ala Thr Lys Asp Phe Tyr Asp Gln Gly Gly Phe Ala Asp Leu Ala Met Ile Thr Val Gly Glu Val 140 Lys Thr Ala 155 Glu Ser Ile 170 u His Pro Tyr Gly Asp Glu Ala Gly 125 Ile Thr Thr Gly Ala Asn His 65 Thr Glu Pro His Ser 145 Asp Lys 20 Val Gly Ile Gly Phe 100 Asn Arg Val Thr Ile Gly Gly Gln Gly Asp Gly Arg Val Asp 165 Ala Ala Leu Pro Gly Pro Thr Leu Ala Ser Phe Gly Pro Gin Thr 135 Ala Val Thr Gly Gly Pro Ile Lys Leu 105 Phe Phe Ser Ile INFORMATION FOR SEQ ID NO: 13: SEQUENCE CHARACTERISTICS: LENGTH: 1060 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: circular (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv (ix) FEATURE: NAME/KEY: CDS LOCATION: 201..860 SUBSTITUTE SHEET (RULE 26) wn 99/24577 PrTm ,K98/00438 WO 99/24577 PCT~r~9IO3 14 (ix) FEATURE: NAME/KEY: sigpeptide LOCATION: 201..296 (ix) FEATURE: NAME/KEY: mat peptide LOCATION: 297..860 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: TGGACCTTCA CCGGCGGTCC CTTCGCTTCG GGGGCGACAC CTAACATACT GGTCGTCAAC CTACCGCGAC ACCGCTGGGA CTTTGTGCCA TTGCCGGCCA CTCGGGGCCG CTGCGGCCTG GAAAAATTGG TCGGGCACGG GCGGCCGCGG GTCGCTACCA TCCCACTGTG AATGATTTAC TGACCCGCCG ACTGCTCACC ATG GGC GCG GCC GCC GCA ATG CTG GCC GCG Met Gly Ala Ala Ala Ala Met Leu Ala Ala 120 180 230 GTG CTT CTG Val Leu Leu CTT ACT CCC ATC Leu Thr Pro Ile ACC GTT CCC GCC GGC TAC Thr Val Pro Ala Gly Tyr -15 CCC GGT GCC Pro Gly Ala GTT GCA Val Ala CCG GCC ACT GCA GCC TGC CCC GAC Pro Ala Thr Ala Ala Cys Pro Asp 1
GCC
Ala GAA GTG GTG TTC Glu Val Val Phe CGC GGC CGC TTC Arg Gly Arg Phe
GAA
Glu CCG CCC GGG ATT Pro Pro Gly Ile ACG GTC GGC AAC Thr Val Gly Asn GCA TTC Ala Phe GTC AGC GCG Val Ser Ala GTG AAA TAC Val Lys Tyr CGC TCG AAG GTC Arg Ser Lys Val AAG AAT GTC GGG Lys Asn Val Gly GTC TAC GCG Val Tyr Ala AAC GAC ATG Asn Asp Met CCC GCC GAC AAT Pro Ala Asp Asn ATC GAT GTG GGC Ile Asp Val Gly AGC GCC Ser Ala CAC ATT CAG AGC ATG GCC AAC AGC TGT His Ile Gin Ser Met Ala Asn Ser Cys 65
CCG
Pro AAT ACC CGC CTG Asn Thr Arg Leu GTG CCC GGC GGT TAC Val Pro Gly Gly Tyr
TCG
Ser CTG GGC GCG GCC Leu Gly Ala Ala ACC GAC GTG GTA Thr Asp Val Val GCG GTG CCC ACC Ala Val Pro Thr ATG TGG GGC TTC Met Trp Gly Phe AAT CCC CTG CCT Asn Pro Leu Pro CCC GGC Pro Gly 105 614 662 AGT GAT GAG Ser Asp Glu TGG GTC GGC Trp Val Gly 125
CAC
His 110 ATC GCC GCG GTC GCG CTG TTC GGC AAT Ile Ala Ala Val Ala Leu Phe Gly Asn 115 GGC AGT CAG Gly Ser Gin 120 GAT CGG ACC Asp Arg Thr CCC ATC ACC AAC Pro Ile Thr Asn
TTC
Phe 130 AGC CCC GCC TAC Ser Pro Ala Tyr 710 SUBSTITUTE SHEET (RULE 26) Wn O/2A77 DPCT/9/innAi ATC GAG TTG TGT CAC GGC GAC GAC CCC GTC TGC CAC CCT GCC GAC CCC 758 Ile Glu Leu Cys His Gly Asp Asp Pro Val Cys His Pro Ala Asp Pro 140 145 150 AAC ACC TGG GAG GCC AAC TGG CCC CAG CAC CTC GCC GGG GCC TAT GTC 806 Asn Thr Trp Glu Ala Asn Trp Pro Gin His Leu Ala Gly Ala Tyr Val 155 160 165 170 TCG TCG GGC ATG GTC AAC CAG GCG GCT GAC TTC GTT GCC GGA AAG CTG 854 Ser Ser Gly Met Val Asn Gin Ala Ala Asp Phe Val Ala Gly Lys Leu 175 180 185 CAA TAG CCACCTAGCC CGTGCGCGAG TCTTTGCTTC ACGCTTTCGC TAACCGACCA 910 Gin ACGCGCGCAC GATGGAGGGG TCCGTGGTCA TATCAAGACA AGAAGGGAGT AGGCGATGCA 970 CGCAAAAGTC GGCGACTACC TCGTGGTGAA GGGCACAACC ACGGAACGGC ATGATCAACA 1030 TGCTGAGATC ATCGAGGTGC GCTCCGCAGA 1060 INFORMATION FOR SEQ ID NO: 14: SEQUENCE CHARACTERISTICS: LENGTH: 219 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: Met Gly Ala Ala Ala Ala Met Leu Ala Ala Val Leu Leu Leu Thr Pro -32 -30 -25 Ile Thr Val Pro Ala Gly Tyr Pro Gly Ala Val Ala Pro Ala Thr Ala -10 Ala Cys Pro Asp Ala Glu Val Val Phe Ala Arg Gly Arg Phe Glu Pro 1 5 10 Pro Gly Ile Gly Thr Val Gly Asn Ala Phe Val Ser Ala Leu Arg Ser 25 Lys Val Asn Lys Asn Val Gly Val Tyr Ala Val Lys Tyr Pro Ala Asp 40 Asn Gin Ile Asp Val Gly Ala Asn Asp Met Ser Ala His Ile Gin Ser 55 Met Ala Asn Ser Cys Pro Asn Thr Arg Leu Val Pro Gly Gly Tyr Ser 70 75 Leu Gly Ala Ala Val Thr Asp Val Val Leu Ala Val Pro Thr Gin Met 90 Trp Gly Phe Thr Asn Pro Leu Pro Pro Gly Ser Asp Glu His Ile Ala 100 105 110 SUBSTITUTE SHEET (RULE 26) PCTDK98/00438 WO 99/24577 Ala Val Ala Leu Phe Gly Asn Gly Ser Gin Trp 115 120 Asn Phe Ser Pro Ala Tyr Asn Asp Arg Thr Ile 130 135 Asp Asp Pro Val Cys His Pro Ala Asp Pro Asn 145 150 155 Trp Pro Gin His Leu Ala Gly Ala Tyr Val Ser 165 170 Val Gly Pro Ile Thr 125 Glu Leu Cys His Gly 140 Thr Trp Glu Ala Asn 160 Ser Gly Met Val Asn 175 Gin Ala Ala Asp Phe Val Ala Gly Lys Leu Gin 
180 185 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 1198 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: circular (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv (ix) FEATURE: NAME/KEY: CDS LOCATION: 201..998 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 201..998 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: CAGATGCTGC GCAACATGTT TCTCGGCGAT CCGGCAGGCA ACACCGATCG AGTGCTTGAC TTTTCCACCG CGGTGACCGG CGGACTGTTC TTCTCACCCA CCATCGACTT TCTCGACCAT CCACCGCCCC TACCGCAGGC GGCGACGCCA ACTCTGGCAG CCGGGTCGCT ATCGATCGGC AGCTTGAAAG GAAGCCCCCG ATG AAC AAT CTC TAC CGC GAT TTG GCA CCG Met Asn Asn Leu Tyr Arg Asp Leu Ala Pro 1 5 GTC ACC GAA GCC GCT TGG GCG GAA ATC GAA TTG GAG GCG GCG CGG ACG Val Thr Glu Ala Ala Trp Ala Glu Ile Glu Leu Glu Ala Ala Arg Thr 20 TTC AAG CGA CAC ATC GCC GGG CGC CGG GTG GTC GAT GTC AGT GAT CCC Phe Lys Arg His Ile Ala Gly Arg Arg Val Val Asp Val Ser Asp Pro 35 GGG GGG CCC GTC ACC GCG GCG GTC AGC ACC GGC CGG CTG ATC GAT GTT Gly Gly Pro Val Thr Ala Ala Val Ser Thr Gly Arg Leu Ile Asp Val 50 120 180 230 278 326 374 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 AAG GCA Lys Ala CCA ACC AAC GGC Pro Thr Asn Gly ATC GCC CAC CTG Ile Ala His Leu
CGG
Arg GCC AGC AAA CCC Ala Ser Lys Pro 422 CTT GTC CGG CTA CGG Leu Val Arg Leu Arg CCG TTT ACC CTG Pro Phe Thr Leu CGC AAC GAG ATC Arg Asn Giu Ile GAC GTG GAA CGT Asp Val Giu Arg
GGC
Gly TCT AAG GAC TCC Ser Lys Asp Ser
GAT
Asp 100 TGG GAA CCG GTA Trp Giu Pro Val AAG GAG Lys Glu 105 GCG GCC AAG Ala Ala Lys TAC AGC GCC Tyr Ser Ala 125 AAG CTG GCC TTC GTC GAG GAC CGC ACA ATA Lys Leu Ala Phe Val Glu Asp Arg Thr Ile 110 115 GCA TCA ATC GAA GGG ATC CGC AGC GCG AGT Ala Ser Ile Glu Gly Ile Arg Ser Ala Ser TTC GAA GGC Phe Glu Gly 120 TCG AAC CCG Ser Asn Pro GCG CTG Ala Leu 140 ACG TTG CCC GAG Thr Leu Pro Glu
GAT
Asp 145 CCC CGT GAA ATC Pro Arg Giu Ile
CCT
Pro 150 GAT GTC ATC TCC Asp Val Ile Ser
CAG
Gin 155 GCA TTG TCC GAA Ala Leu Ser Glu
CTG
Leu 160 CGG TTG GCC GGT Arg Leu Ala Gly GAC GGA CCG TAT Asp Gly Pro Tyr GTG TTG CTC TCT Val Leu Leu Ser
GCT
Ala 175 GAC GTC TAC ACC Asp Val Tyr Thr
AAG
Lys 180 GTT AGC GAG ACT Vai Ser Glu Thr TCC GAT Ser Asp 185 CAC GGC TAT CCC ATC CGT GAG CAT His Gly Tyr Pro Ile Arg Glu His 190 AAC CGG CTG GTG Asn Arg Leu Val GAC GGG GAC Asp Gly Asp 200 ACC ACT CGA Thr Thr Arg ATC ATT TGG Ile Ile Trp 205 GCC CCG GCC ATC Ala Pro Ala Ile GGC GCG TTC GTG Gly Ala Phe Val
CTG
Leu 215 GGC GGC Gly Gly 220 GAC TTC GAC CTA Asp Phe Asp Leu
CAG
Gin 225 CTG GGC ACC GAC Leu Gly Thr Asp
GTT
Va1 230 GCA ATC GGG TAC Ala Ile Gly Tyr
GCC
Ala 235 AGC CAC GAC ACG Ser His Asp Thr ACC GAG CGC CTC Thr Glu Arg Leu CTG CAG GAG ACG Leu Gin Glu Thr ACG TTC CTT TGC Thr Phe Leu Cys
TAC
Tyr 255 ACC GCC GAG GCG Thr Ala Giu Ala
TCG
Ser 260 GTC GCG CTC AGC Val Ala Leu Ser CAC TAA His 265 GGCACGAGCG CGAGCAATAG CTCCTATGGC AAGCGGCCGC GGGTTGGGTG TGTTCGGAGC TGGGCTGGTG GACGGTGCGC AGGGCCTGGA AGACGGTGCG GGCTAGGCGG CGTTTGAGGC AGCGTAGTGC TGCGCGTTTG GTTTTCCCGG CGTCTTGCAG CCTTTGGTAG TAGGCCTGGC CCCGGCTGTC GGTCATCCGG 1058 1118 1178 1198 SUBSTmm SHEET (RULE 26) W\ 00/9 /277 PCT/DK98/00438 18 INFORMATION FOR SEQ ID NO: 16: SEQUENCE CHARACTERISTICS: LENGTH: 265 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: Met Asn Asn Leu Tyr Arg Asp Leu Ala Pro Val Thr Glu Ala Ala Trp 1 5 10 Ala Glu Ile Glu Leu Glu Ala Ala Arg Thr Phe Lys Arg His Ile Ala 25 Gly Arg Arg Val Val Asp Val Ser Asp Pro Gly Gly Pro Val Thr Ala 40 Ala Val Ser Thr Gly Arg Leu Ile Asp Val Lys Ala Pro Thr Asn Gly 55 Val Ile Ala His Leu Arg Ala Ser Lys Pro Leu Val Arg Leu Arg Val 70 75 Pro Phe Thr Leu Ser Arg Asn Glu Ile Asp Asp Val Glu Arg Gly Ser 90 Lys Asp Ser Asp Trp Glu Pro Val Lys Glu Ala Ala Lys Lys Leu Ala 100 105 110 Phe Val Glu Asp Arg Thr Ile Phe Glu Gly Tyr Ser Ala Ala Ser Ile 115 120 125 Glu Gly Ile Arg Ser Ala Ser Ser Asn Pro Ala Leu Thr Leu Pro Glu 130 135 140 Asp Pro Arg Glu Ile Pro Asp Val Ile Ser Gin Ala Leu Ser Glu Leu 145 150 155 160 Arg Leu Ala Gly Val Asp Gly Pro Tyr Ser Val Leu Leu Ser Ala Asp 165 170 175 Val Tyr Thr Lys Val Ser Glu Thr Ser Asp His Gly Tyr Pro Ile Arg 180 185 190 Glu His Leu Asn Arg Leu Val Asp Gly Asp Ile Ile Trp Ala Pro Ala 195 200 205 Ile Asp Gly Ala Phe Val Leu Thr Thr Arg Gly Gly Asp Phe Asp Leu 210 215 220 Gin Leu Gly Thr Asp Val Ala Ile Gly Tyr Ala Ser His Asp Thr Asp 225 230 235 240 Thr Glu Arg Leu Tyr Leu Gin Glu Thr Leu Thr Phe Leu Cys Tyr Thr 245 250 255 Ala Glu Ala Ser Val Ala Leu Ser His Y1-~ Y' SUBSTITUTE SHEET (RULE 26) nA477 PrTr )K98/00438 Sv 771^*J I I 19 260 265 INFORMATION FOR SEQ ID NO: 17: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide 
FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv
(ix) FEATURE: NAME/KEY: Duplication LOCATION: 1 OTHER INFORMATION: Ala is Ala or Ser
(ix) FEATURE: NAME/KEY: Duplication LOCATION: 13 OTHER INFORMATION: Xaa is unknown
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
Ala Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Xaa Ala Val
1               5                   10                  15

INFORMATION FOR SEQ ID NO: 18:
SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Ala Gln Ile Thr Leu Arg Gly Asn Ala Ile Asn Thr Val Gly Glu
1               5                   10                  15

INFORMATION FOR SEQ ID NO: 19:
SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv
(ix) FEATURE: NAME/KEY: Other LOCATION: 3 OTHER INFORMATION: Xaa is unknown
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
Asp Pro Xaa Ser Asp Ile Ala Val Val Phe Ala Arg Gly Thr His
1               5                   10                  15

INFORMATION FOR SEQ ID NO: 20:
SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
Thr Asn Ser Pro Leu Ala Thr Ala Thr Ala Thr Leu His Thr Asn
1               5                   10                  15

INFORMATION FOR SEQ ID NO: 21:
SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv
(ix) FEATURE: NAME/KEY: Other LOCATION: 2 OTHER INFORMATION: Xaa is unknown
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
Ala Xaa Pro Asp Ala Glu Val Val Phe Ala Arg Gly Arg Phe Glu
1               5                   10                  15

INFORMATION FOR SEQ ID NO: 22:
SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv
(ix) FEATURE: NAME/KEY: Other LOCATION: 1 OTHER INFORMATION: Xaa is unknown
(ix) FEATURE: NAME/KEY: Duplication LOCATION: 2 OTHER INFORMATION: Ile is Ile or Val
(ix) FEATURE: NAME/KEY: Duplication LOCATION: 10 OTHER INFORMATION: Val is Val or Thr
(ix) FEATURE: NAME/KEY: Duplication LOCATION: 11 OTHER INFORMATION: Val is Val or Phe
(ix) FEATURE: NAME/KEY: Duplication LOCATION: 14 OTHER INFORMATION: Asp is Asp or Gln
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
Xaa Ile Gln Lys Ser Leu Glu Leu Ile Val Val Thr Ala Asp Glu
1               5                   10                  15

INFORMATION FOR SEQ ID NO: 23:
SEQUENCE CHARACTERISTICS: LENGTH: 19 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal
(vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
Met Asn Asn Leu Tyr Arg Asp Leu Ala Pro Val Thr Glu Ala Ala Trp
1               5                   10                  15
Ala Glu Ile

INFORMATION FOR SEQ ID NO: 24:
SEQUENCE CHARACTERISTICS: LENGTH: 34 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
CCCGGCTCGA GAACCTSTAC CGCGACCTSG CSCC 34

INFORMATION FOR SEQ ID NO: 25:
SEQUENCE CHARACTERISTICS: LENGTH: 37 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
GGGCCGGATC CGASGCSGCG TCCTTSACSG GYTGCCA 37

INFORMATION FOR SEQ ID NO: 26:
SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
GGAAGCCCCA TATGAACAAT CTCTACCG 28

INFORMATION FOR SEQ ID NO: 27:
SEQUENCE CHARACTERISTICS: LENGTH: 32 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
CGCGCTCAGC CCTTAGTGAC TGAGCGCGAC CG 32

INFORMATION FOR SEQ ID NO: 28:
SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
CTCGAATTCG CCGGGTGCAC ACAG 24

INFORMATION FOR SEQ ID NO: 29:
SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
CTCGAATTCG CCCCCATACG AGAAC 25

INFORMATION FOR SEQ ID NO: 30:
SEQUENCE CHARACTERISTICS: LENGTH: 15 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
GTGTATCTGC TGGAC 15

INFORMATION FOR SEQ ID NO: 31:
SEQUENCE CHARACTERISTICS: LENGTH: 15 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CCGACTGGCT GGCCG 15

INFORMATION FOR SEQ ID NO: 32:
SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
GAGGAATTCG CTTAGCGGAT CGCA 24

INFORMATION FOR SEQ ID NO: 33:
SEQUENCE CHARACTERISTICS: LENGTH: 15 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
CCCACATTCC GTTGG 15

INFORMATION FOR SEQ ID NO: 34:
SEQUENCE CHARACTERISTICS: LENGTH: 15 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
GTCCAGCAGA TACAC 15

INFORMATION FOR SEQ ID NO: 35:
SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
GTACGAGAAT TCATGTCGCA AATCATG 27

INFORMATION FOR SEQ ID NO: 36:
SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
GTACGAGAAT TCGAGCTTGG GGTGCCG 27

INFORMATION FOR SEQ ID NO: 37:
SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
CGATTCCAAG CTTGTGGCCG CCGACCCG 28

INFORMATION FOR SEQ ID NO: 38:
SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
CGTTAGGGAT CCTCATCGCC ATGGTGTTGG 30

INFORMATION FOR SEQ ID NO: 39:
SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
CGTTAGGGAT CCGGTTCCAC TGTGCC 26

INFORMATION FOR SEQ ID NO: 40:
SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (synthetic)
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
CGTTAGGGAT CCTCAGGTCT TTTCGATG 28 INFORMATION FOR SEQ ID NO: 41: SEQUENCE CHARACTERISTICS: LENGTH: 952 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: circular (ii) MOLECULE TYPE: DNA (genomic) (vi) ORIGINAL SOURCE: ORGANISM: Mycobacterium tuberculosis STRAIN: H37Rv (ix) FEATURE: NAME/KEY: CDS LOCATION: 45..944 (ix) FEATURE: NAME/KEY: sig_peptide LOCATION: 45..143 (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 144..941 SUBSTITUTE SHEET (RULE 26) WO 99/24577 28 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: PCT/DK98/00438 GAATTCGCCG GGTGCACACA GCCTTACACG ACGGAGGTGG ACAC ATG AAG GGT CGG Met Lys Gly Arg -33 TCG GCG CTG CTG Ser Ala Leu Leu GCG CTC TGG ATT Ala Leu Trp Ile GCA CTG TCA TTC Ala Leu Ser Phe GGG TTG Gly Leu GGC GGT GTC Gly Gly Val GTA GCC GCG GAA Val Ala Ala Glu ACC GCC AAG GCC GCC CCA TAC Thr Ala Lys Ala Ala Pro Tyr 1 GAG AAC CTG ATG GTG CCG Glu Asn Leu Met Val Pro
TCG
Ser 10 CCC TCG ATG GGC Pro Ser Met Gly
CGG
Arg GAC ATC CCG GTG Asp Ile Pro Val
GCC
Ala TTC CTA GCC GGT Phe Leu Ala Gly CCG CAC GCG GTG TAT CTG CTG GAC GCC Pro His Ala Val Tyr Leu Leu Asp Ala AAC GCC GGC CCG Asn Ala Gly Pro GTC AGT AAC TGG Val Ser Asn Trp ACC GCG GGT AAC Thr Ala Gly Asn GCG ATG Ala Met AAC ACG TTG Asn Thr Leu GCG TAC AGC Ala Tyr Ser GGC AAG GGG ATT Gly Lys Gly Ile GTG GTG GCA CCG Val Val Ala Pro GCC GGT GGT Ala Gly Gly AAG CAG TGG Lys Gln Trp ATG TAC ACC AAC Met Tyr Thr Asn GAG CAG GAT GGC Glu Gln Asp Gly GAC ACC Asp Thr TTC TTG TCC GCT Phe Leu Ser Ala CTG CCC GAC TGG Leu Pro Asp Trp
CTG
Leu GCC GCT AAC CGG Ala Ala Asn Arg
GGC
Gly 100 TTG GCC CCC GGT Leu Ala Pro Gly
GGC
Gly 105 CAT GCG GCC GTT His Ala Ala Val GCC GCT CAG GGC Ala Ala Gln Gly
GGT
Gly 115 488 TAC GGG GCG ATG Tyr Gly Ala Met
GCG
Ala 120 CTG GCG GCC TTC Leu Ala Ala Phe CCC GAC CGC TTC Pro Asp Arg Phe GGC TTC Gly Phe 130 536 GCT GGC TCG Ala Gly Ser GGT GCG ATC Gly Ala Ile 150
ATG
Met 135 TCG GGC TTT TTG Ser Gly Phe Leu
TAC
Tyr 140 CCG TCG AAC ACC Pro Ser Asn Thr GCG GCG GGC ATG Ala Ala Gly Met CAA TTC GGC GGT Gln Phe Gly Gly
GTG
Val 160 ACC ACC AAC Thr Thr Asn 145 GAC ACC AAC Asp Thr Asn CAC GAC CCG His Asp Pro 632 GGA ATG Gly Met 165 TGG GGA GCA CCA Trp Gly Ala Pro CTG GGT CGG TGG Leu Gly Arg Trp AAG TGG Lys Trp 175
TGG
Trp 180 GTG CAT GCC AGC Val His Ala Ser CTG GCG CAA AAC AAC ACC CGG GTG TGG Leu Ala Gln Asn Asn Thr Arg Val Trp 190 GTG 728 Val 195
TGG AGC CCG ACC AAC CCG GGA GCC AGC GAT CCC GCC GCC ATG ATC GGC 776
Trp Ser Pro Thr Asn Pro Gly Ala Ser Asp Pro Ala Ala Met Ile Gly
200 205 210
CAA ACC GCC GAG GCG ATG GGT AAC AGC CGC ATG TTC TAC AAC CAG TAT 824
Gln Thr Ala Glu Ala Met Gly Asn Ser Arg Met Phe Tyr Asn Gln Tyr
215 220 225
CGC AGC GTC GGC GGG CAC AAC GGA CAC TTC GAC TTC CCA GCC AGC GGT 872
Arg Ser Val Gly Gly His Asn Gly His Phe Asp Phe Pro Ala Ser Gly
230 235 240
GAC AAC GGC TGG GGC TCG TGG GCG CCC CAG CTG GGC GCT ATG TCG GGC 920
Asp Asn Gly Trp Gly Ser Trp Ala Pro Gln Leu Gly Ala Met Ser Gly
245 250 255
GAT ATC GTC GGT GCG ATC CGC TAA GCGAATTC 952
Asp Ile Val Gly Ala Ile Arg
260 265
INFORMATION FOR SEQ ID NO: 42: SEQUENCE CHARACTERISTICS: LENGTH: 299 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
Met Lys Gly Arg Ser Ala Leu Leu Arg Ala Leu Trp Ile Ala Ala Leu
-33 -30 -25
Ser Phe Gly Leu Gly Gly Val Ala Val Ala Ala Glu Pro Thr Ala Lys
-10
Ala Ala Pro Tyr Glu Asn Leu Met Val Pro Ser Pro Ser Met Gly Arg
1 5 10
Asp Ile Pro Val Ala Phe Leu Ala Gly Gly Pro His Ala Val Tyr Leu
25
Leu Asp Ala Phe Asn Ala Gly Pro Asp Val Ser Asn Trp Val Thr Ala
40
Gly Asn Ala Met Asn Thr Leu Ala Gly Lys Gly Ile Ser Val Val Ala
55
Pro Ala Gly Gly Ala Tyr Ser Met Tyr Thr Asn Trp Glu Gln Asp Gly
70
Ser Lys Gln Trp Asp Thr Phe Leu Ser Ala Glu Leu Pro Asp Trp Leu
85 90
Ala Ala Asn Arg Gly Leu Ala Pro Gly Gly His Ala Ala Val Gly Ala
100 105 110
Ala Gln Gly Gly Tyr Gly Ala Met Ala Leu Ala Ala Phe His Pro Asp
115 120 125
Arg Phe Gly Phe Ala Gly Ser Met Ser Gly Phe Leu Tyr Pro Ser Asn
130 135 140
Thr Thr Thr Asn Gly Ala Ile Ala Ala Gly Met Gln Gln Phe Gly Gly
145 150 155
Val Asp Thr Asn Gly Met Trp Gly Ala Pro Gln Leu Gly Arg
Trp Lys
160 165 170 175
Trp His Asp Pro Trp Val His Ala Ser Leu Leu Ala Gln Asn Asn Thr
180 185 190
Arg Val Trp Val Trp Ser Pro Thr Asn Pro Gly Ala Ser Asp Pro Ala
195 200 205
Ala Met Ile Gly Gln Thr Ala Glu Ala Met Gly Asn Ser Arg Met Phe
210 215 220
Tyr Asn Gln Tyr Arg Ser Val Gly Gly His Asn Gly His Phe Asp Phe
225 230 235
Pro Ala Ser Gly Asp Asn Gly Trp Gly Ser Trp Ala Pro Gln Leu Gly
240 245 250 255
Ala Met Ser Gly Asp Ile Val Gly Ala Ile Arg
260 265
INFORMATION FOR SEQ ID NO: 43: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (synthetic) (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: GCAACACCCG GGATGTCGCA AATCATG 27
INFORMATION FOR SEQ ID NO: 44: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (synthetic) (iv) ANTI-SENSE: NO (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: GTAACACCCG GGGTGGCCGC CGACCCG 27
INFORMATION FOR SEQ ID NO: 45: SEQUENCE CHARACTERISTICS: LENGTH: 37 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (synthetic) (iv) ANTI-SENSE: YES (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: CTACTAAGCT TGGATCCCTA GCCGCCCCAT TTGGCGG 37
INFORMATION FOR SEQ ID NO: 46: SEQUENCE CHARACTERISTICS: LENGTH: 38 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (synthetic) (iv) ANTI-SENSE: YES (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: CTACTAAGCT TCCATGGTCA GGTCTTTTCG ATGCTTAC 38
INFORMATION FOR SEQ ID NO: 47: SEQUENCE CHARACTERISTICS: LENGTH: 450 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 105...320 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
GTGCCGCGCT CCCCAGGGTT CTTATGGTTC GATATACCTG AGTTTGATGG AAGTCCGATG
ACCAGCAGTC AGCATACGGC ATGGCCGAAA
AGAGTGGGGT GATG ATG GCC GAG GAT 116
Met Ala Glu Asp
1
GTT CGC GCC GAG ATC GTG GCC AGC GTT CTC GAA GTC GTT GTC AAC GAA 164
Val Arg Ala Glu Ile Val Ala Ser Val Leu Glu Val Val Val Asn Glu
10 15
GGC GAT CAG ATC GAC AAG GGC GAC GTC GTG GTG CTG CTG GAG TCG ATG 212
Gly Asp Gln Ile Asp Lys Gly Asp Val Val Val Leu Leu Glu Ser Met
30
AAG ATG GAG ATC CCC GTC CTG GCC GAA GCT GCC GGA ACG GTC AGC AAG 260
Lys Met Glu Ile Pro Val Leu Ala Glu Ala Ala Gly Thr Val Ser Lys
45
GTG GCG GTA TCG GTG GGC GAT GTC ATT CAG GCC GGC GAC CTT ATC GCG 308
Val Ala Val Ser Val Gly Asp Val Ile Gln Ala Gly Asp Leu Ile Ala
60
GTG ATC AGC TAGTCGTTGA TAGTCACTCA TGTCCACACT CGGTGATCTG CTCGCCGAA 366
Val Ile Ser
CACACGGTGC TGCCGGGCAG CGCGGTGGAC CACCTGCATG CGGTGGTCGG GGAGTGGCAG 426
CTCCTTGCCG ACTTGTCGTT TGCC 450
INFORMATION FOR SEQ ID NO: 48: SEQUENCE CHARACTERISTICS: LENGTH: 71 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
Met Ala Glu Asp Val Arg Ala Glu Ile Val Ala Ser Val Leu Glu Val
1 5 10
Val Val Asn Glu Gly Asp Gln Ile Asp Lys Gly Asp Val Val Val Leu
25
Leu Glu Ser Met Lys Met Glu Ile Pro Val Leu Ala Glu Ala Ala Gly
40
Thr Val Ser Lys Val Ala Val Ser Val Gly Asp Val Ile Gln Ala Gly
55
Asp Leu Ile Ala Val Ile Ser
INFORMATION FOR SEQ ID NO: 49: SEQUENCE CHARACTERISTICS: LENGTH: 750 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 113...640 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
GGGTACCCAT CGATGGGTTG CGGTTCGGCA CCGAGGTGCT AACGCACTTG CTGACACACT
GCTAGTCGAA AACGAGGCTA GTCGCAACGT CGATCACACG AGAGGACTGA CC ATG ACA
Met Thr
1
ACT TCA CCC GAC CCG TAT GCC GCG CTG CCC AAG CTG Thr Ser Pro Asp Pro Tyr Ala Ala Leu Pro Lys Leu TCC TTC AGC Ser Phe Ser CTG ACG Leu Thr TCA ACC TCG ATC Ser Thr Ser Ile GAT GGG CAG CCG Asp Gly Gln Pro
CTG
Leu GCT ACA CCC CAG Ala Thr Pro Gln
GTC
Val AGC GGG ATC ATG Ser Gly Ile Met
GGT
Gly 40 GCG GGC GGG GCG Ala Gly Gly Ala
GAT
Asp GCC AGT CCG CAG Ala Ser Pro Gln 262 AGG TGG TCG GGA TTT CCC AGC GAG ACC Arg Trp Ser Gly Phe Pro Ser Glu Thr AGC TTC GCG GTA Ser Phe Ala Val ACC GTC Thr Val TAC GAC CCT Tyr Asp Pro GCC AAC CTG Ala Asn Leu GCC CCC ACC CTG TCC GGG TTC TGG CAC Ala Pro Thr Leu Ser Gly Phe Trp His TGG GCG GTG Trp Ala Val GTC GGC GAT Val Gly Asp CCT GCC AAC GTC Pro Ala Asn Val GAG TTG CCC GAG Glu Leu Pro Glu
GGT
Gly GGC CGC Gly Arg 100 GAA CTG CCG GGC Glu Leu Pro Gly GCA CTG ACA TTG Ala Leu Thr Leu AAC GAC GCC GGT Asn Asp Ala Gly ATG CGC CGG TAT GTG GGT Met Arg Arg Tyr Val Gly 115 120 GCG GCG CCG CCT Ala Ala Pro Pro GGT CAT GGG GTG Gly His Gly Val
CAT
His 130 CGC TAC TAC GTC Arg Tyr Tyr Val
GCG
Ala 135 GTA CAC GCG GTG Val His Ala Val
AAG
Lys 140 GTC GAA AAG Val Glu Lys CTC GAC CTC 550 Leu Asp Leu 145 CCC GAG GAC Pro Glu Asp
GCG
Ala 150 AGT CCT GCA TAT Ser Pro Ala Tyr CTG GGA Leu Gly 155 TTC AAC CTG TTC CAG CAC 598 Phe Asn Leu Phe Gln His 160 SUBSTITUTE SHEET (RULE 26) WO9 90/4577 PCT/DK98/00n43 34 GCG ATT GCA CGA GCG GTC ATC TTC GGC ACC TAC GAG CAG CGT TAGCGCTTT Ala Ile Ala Arg Ala Val Ile Phe Gly Thr Tyr Glu Gin Arg 165 170 175 AGCTGGGTTG CCGACGTCTT GCCGAGCCGA CCGCTTCGTG CAGCGAGCCG AACCCGCCGT CATGCAGCCT GCGGGCAATG CCTTCATGGA TGTCCTTGGC C INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 176 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Met Thr Thr Ser Pro Asp Pro Tyr Ala Ala Leu Pro Lys Leu Pro Ser J J l V Phe Pro Gin Thr Ala Gly Ala Val Asp 145 Gin Ser Gin Leu Val Val Asp Gly His 130 Leu His Thr Ala Ser Thr Val Gly 105 Ala His Ala Ile Asp Gly Glu Leu Thr Ala Ala Ala Tyr Phe 170 Gin Ala Arg Gly Leu Thr Pro Lys 140 Gly Thr Pro Asp Ser Phe Pro Leu Pro 125 Val Phe Tyr Ala Ser Ala His Gly Asn His Lys Leu Gin 175 Thr Pro Val Trp Val Asp Gly Leu Phe 160 Arg INFORMATION FOR SEQ ID NO: 51: SEQUENCE CHARACTERISTICS: LENGTH: 800 base pairs TYPE: nucleic acid SUBSTITUTE SHEET (RULE 26) 11/ nn 717 PCi r/DK98/00438 V 1 7LJ I 1 35 STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 18...695 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 18...134 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: TCATGAGGTT CATCGGG GTG ATC CCA CGC CCG CAG CCG CAT TCG GGC CGC Met Ile Pro Arg Pro Gin Pro His Ser Gly Arg TGG CGA GCC Trp Arg Ala GGT GCC GCA CGC Gly Ala Ala Arg CTC ACC AGC CTG Leu Thr Ser Leu
GTG
Val GCC GCC GCC Ala Ala Ala TTT GCG Phe Ala GCG GCC ACA CTG TTG CTT ACC CCC GCG CTG GCA CCA CCG Ala Ala Thr Leu Leu Leu Thr Pro Ala Leu Ala Pro Pro
GCA
Ala TCG GCG GGC TGC Ser Ala Gly Cys GAT GCC GAG GTG GTG TTC GCC CGC GGA Asp Ala Glu Val Val Phe Ala Arg Gly ACC GGC 194 Thr Gly GAA CCA CCT Glu Pro Pro CGC CAG CAG Arg Gln Gln
GGC
Gly CTC GGT CGG GTA Leu Gly Arg Val CAA GCT TTC GTC Gln Ala Phe Val AGT TCA TTG Ser Ser Leu AAC TAC CCG Asn Tyr Pro ACC AAC AAG AGC Thr Asn Lys Ser GGG ACA TAC GGA Gly Thr Tyr Gly 290 GCC AAC Ala Asn GGT GAT TTC TTG GCC GCC GCT GAC GGC Gly Asp Phe Leu Ala Ala Ala Asp Gly AAC GAC GCC AGC 338 Asn Asp Ala Ser CAC ATT CAG CAG His Ile Gln Gln GCC AGC GCG TGC Ala Ser Ala Cys
CGG
Arg GCC ACG AGG TTG Ala Thr Arg Leu
GTG
Val CTC GGC GGC TAC Leu Gly Gly Tyr CAG GGT GCG GCC Gln Gly Ala Ala ATC GAC ATC GTC Ile Asp Ile Val ACC GCC Thr Ala 100 GCA CCA CTG Ala Pro Leu GAC GAT CAC Asp Asp His 120
CCC
Pro 105 GGC CTC GGG TTC ACG CAG CCG TTG CCG Gly Leu Gly Phe Thr Gln Pro Leu Pro 110 CCC GCA GCG Pro Ala Ala 115 TCG GGC CGC Ser Gly Arg ATC GCC GCG ATC Ile Ala Ala Ile
GCC
Ala 125 CTG TTC GGG AAT Leu Phe Gly Asn GCT GGC GGG CTG ATG AGC GCC CTG ACC CCT CAA TTC GGG TCC AAG ACC Ala Gly Gly Leu Met Ser Ala Leu Thr Pro Gin Phe Gly Ser Lys Thr SUBSTITUTE SHEET (RULE 26) WA 001/A'77 PCT/K98/n438 36 135 140 145 ATC AAC CTC TGC AAC AAC GGC GAC CCG ATT TGT TCG GAC GGC AAC CGG 626 Ile Asn Leu Cys Asn Asn Gly Asp Pro Ile Cys Ser Asp Gly Asn Arg 150 155 160 165 TGG CGA GCG CAC CTA GGC TAC GTG CCC GGG ATG ACC AAC CAG GCG GCG 674 Trp Arg Ala His Leu Gly Tyr Val Pro Gly Met Thr Asn Gin Ala Ala 170 175 180 CGT TTC GTC GCG AGC AGG ATC TAACGCGAGC CGCCCCATAG ATTCCGGCTA AGCA 729 Arg Phe Val Ala Ser Arg Ile 185 ACGGCTGCGC CGCCGCCCGG CCACGAGTGA CCGCCGCCGA CTGGCACACC GCTTACCACG 789 GCCTTATGCT G 800 INFORMATION FOR SEQ ID NO: 52: SEQUENCE CHARACTERISTICS: LENGTH: 226 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...38 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: Met Ile Pro Arg Pro Gin Pro His Ser Gly Arg Trp Arg Ala Gly Ala -30 Ala Arg Arg Leu Thr Ser Leu Val Ala Ala Ala Phe Ala Ala Ala Thr -15 Leu Leu Leu Thr Pro Ala Leu Ala Pro Pro Ala Ser Ala Gly Cys Pro 1 5 Asp Ala Glu Val Val Phe Ala Arg Gly Thr Gly Glu Pro Pro Gly Leu 20 Gly Arg Val Gly Gin Ala Phe Val Ser Ser Leu Arg Gin Gin Thr Asn 35 Lys Ser Ile Gly Thr Tyr Gly Val Asn Tyr Pro Ala Asn Gly Asp Phe 50 Leu Ala Ala Ala Asp Gly Ala Asn Asp Ala Ser Asp His Ile Gin Gin 65 Met Ala Ser Ala Cys Arg Ala Thr Arg Leu Val Leu Gly Gly Tyr Ser 80 85 SUBSTITUTE SHEET (RULE 26) W OQ/2 4577 PCT/DK98/00438 37 Gln Gly Ala Ala Val Ile Asp Ile Val Thr Ala Ala Pro Leu Pro Gly 100 105 Leu Gly Phe Thr Gin Pro Leu Pro Pro Ala Ala Asp Asp His Ile Ala 110 115 120 Ala Ile Ala Leu Phe Gly Asn Pro Ser Gly Arg Ala Gly Gly Leu Met 125 130 135 Ser Ala Leu Thr Pro Gin Phe Gly Ser Lys Thr Ile Asn Leu Cys Asn 140 145 150 Asn Gly Asp Pro Ile Cys Ser Asp Gly Asn Arg Trp Arg Ala His Leu 155 160 165 
170 Gly Tyr Val Pro Gly Met Thr Asn Gin Ala Ala Arg Phe Val Ala Ser 175 180 185 Arg Ile INFORMATION FOR SEQ ID NO: 53: SEQUENCE CHARACTERISTICS: LENGTH: 700 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 73...615 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: CTAGGAAAGC CTTTCCTGAG TAAGTATTGC CTTCGTTGCA TACCGCCCTT TACCTGCGTT AATCTGCATT TT ATG ACA GAA TAC GAA GGG CCT AAG ACA AAA TTC CAC GCG 111 Met Thr Glu Tyr Glu Gly Pro Lys Thr Lys Phe His Ala 1 5 TTA ATG CAG GAA CAG ATT CAT AAC GAA TTC ACA GCG GCA CAA CAA TAT 159 Leu Met Gin Glu Gin Ile His Asn Glu Phe Thr Ala Ala Gin Gin Tyr 20 GTC GCG ATC GCG GTT TAT TTC GAC AGC GAA GAC CTG CCG CAG TTG GCG 207 Val Ala Ile Ala Val Tyr Phe Asp Ser Glu Asp Leu Pro Gin Leu Ala 35 40 AAG CAT TTT TAC AGC CAA GCG GTC GAG GAA CGA AAC CAT GCA ATG ATG 255 Lys His Phe Tyr Ser Gin Ala Val Glu Glu Arg Asn His Ala Met Met 55 CTC GTG CAA CAC CTG CTC GAC CGC GAC CTT CGT GTC GAA ATT CCC GGC 303 Leu Val Gin His Leu Leu Asp Arg Asp Leu Arg Val Glu Ile Pro Gly 70 GTA GAC ACG GTG CGA AAC CAG TTC GAC AGA CCC CGC GAG GCA CTG GCG 351 Val Asp Thr Val Arg Asn Gin Phe Asp Arg Pro Arg Glu Ala Leu Ala SUBSTrTUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 85 CTG GCG CTC GAT CAG GAA CGC ACA GTC ACC GAC CAG GTC GGT CGG Leu Ala Leu Asp Gin Glu Arg Thr Val Thr Asp Gin Val Gly Arg 100 105 ACA GCG GTG GCC CGC GAC GAG GGC GAT TTC CTC GGC GAG CAG TTC Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu Gly Glu Gin Phe 110 115 120 CAG TGG TTC TTG CAG GAA CAG ATC GAA GAG GTG GCC TTG ATG GCA Gin Trp Phe Leu Gin Glu Gin Ile Glu Glu Val Ala Leu Met Ala 130 135 140 CTG GTG CGG GTT GCC GAT CGG GCC GGG GCC AAC CTG TTC GAG CTA Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn Leu Phe Glu Leu 145 150 155 AAC TTC GTC GCA CGT GAA GTG GAT GTG GCG CCG GCC GCA TCA GGC Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro Ala Ala Ser Gly 160 165 170 CCG CAC GCT GCC GGG GGC CGC CTC TAGATCCCTG GCGGGGATCA GCGAGT< Pro His Ala 
Ala Gly Gly Arg Leu 175 180 CCGTTCGCCC GCCCGTCTTC CAGCCAGGCC TTGGTGCGGC CGGGGTGGTG AGTAC INFORMATION FOR SEQ ID NO: 54: SEQUENCE CHARACTERISTICS: LENGTH: 181 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: Met Thr Glu Tyr Glu Gly Pro Lys Thr Lys Phe His Ala Leu Met 1 5 10 Glu Gln Ile His Asn Glu Phe Thr Ala Ala Gln Gin Tyr Val Ala 25 Ala Val Tyr Phe Asp Ser Glu Asp Leu Pro Gin Leu Ala Lys His 40 Tyr Ser Gin Ala Val Glu Glu Arg Asn His Ala Met Met Leu Val 55 His Leu Leu Asp Arg Asp Leu Arg Val Glu Ile Pro Gly Val Asp 70 75 Val Arg Asn Gin Phe Asp Arg Pro Arg Glu Ala Leu Ala Leu Ala 90 Asp Gin Glu Arg Thr Val Thr Asp Gln Val Gly Arg Leu Thr Ala
CTG
Leu
ATG
Met 125
ACC
Thr
GAG
Glu 3CC Ala 3GTC Gin Ile Phe Gin Thr Leu Val SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 Ala Arg Asp 115 Glu Gly Asp Phe Leu 120 Gly Glu Gin Phe Gin Trp Phe Leu Gin 130 Glu Gin Ile Glu Glu 135 Val Ala Leu Met Ala Thr Leu Val Arg 140 Ala Asp Arg Ala Ala Asn Leu Phe Glu 155 Leu Glu Asn Phe Val 160 Ala Arg Glu Val Val Ala Pro Ala Ser Gly Ala Pro His Ala 175 Ala Gly Gly Arg Leu 180 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 950 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 133...918 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 133...233 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: TGGGCTCGGC ACTGGCTCTC CCACGGTGGC GCGCTGATTT CTCCCCACGG TAGGCGTTGC GACGCATGTT CTTCACCGTC TATCCACAGC TACCGACATT TGCTCCGGCT GGATCGCGGG TAAAATTCCG TC GTG AAC AAT CGA CCC ATC CGC CTG CTG ACA TCC GGC AGG Met Asn Asn Arg Pro Ile Arg Leu Leu Thr Ser Gly Arg
GCT
Ala GGT TTG GGT GCG Gly Leu Gly Ala
GGC
Gly -15 GCC TTG GGC GCT GTT TGG Ala Leu Gly Ala Val Trp 1 GAC GCC GAA GTC ACG TTC Asp Ala Glu Val Thr Phe GGG CGC GTT GGC CAG GCG Gly Arg Val Gly Gin Ala GCA TTG ATC ACC GCC GTC GTC CTG CTC ATC Ala Leu Ile Thr Ala Val Val Leu Leu Ile -10 ACC CCG GTT GCC TTC GCC GAT GGA TGC CCG Thr Pro Val Ala Phe Ala Asp Gly Cys Pro 5 GCC CGC GGC ACC GGC GAG CCG CCC GGA ATC Ala Arg Gly Thr Gly Glu Pro Pro Gly Ile 20 TTC GTC GAC TCG CTG CGC CAG CAG ACT GGC Phe Val Asp Ser Leu Arg Gin Gin Thr Gly 35 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438
ATG
Met GAG ATC GGA GTA TAC CCG GTG AAT TAC Glu Ile Gly Val Tyr Pro Val Asn Tyr GCC AGC CGC CTA Ala Ser Arg Leu CTG CAC GGG GGA Leu His Gly Gly GGC GCC AAC GAC Gly Ala Asn Asp ATA TCG CAC ATT Ile Ser His Ile AAG TCC Lys Ser ATG GCC TCG Met Ala Ser CAG GGC GCA Gln Gly Ala TGC CCG AAC ACC AAG CTG GTC TTG GGC Cys Pro Asn Thr Lys Leu Val Leu Gly GGC TAT TCG Gly Tyr Ser 459 507 555 ACC GTG ATC GAT Thr Val Ile Asp GTG GCC GGG GTT CCG TTG GGC AGC Val Ala Gly Val Pro Leu Gly Ser 105 ATC AGC Ile Ser 110 TTT GGC AGT CCG Phe Gly Ser Pro CCT GCG GCA TAC Pro Ala Ala Tyr GAC AAC GTC GCA Asp Asn Val Ala
GCG
Ala 125 GTC GCG GTC TTC Val Ala Val Phe
GGC
Gly 130 AAT CCG TCC AAC CGC GCC GGC GGA TCG CTG Asn Pro Ser Asn Arg Ala Gly Gly Ser Leu TCG AGC CTG AGC Ser Ser Leu Ser CTA TTC GGT TCC Leu Phe Gly Ser
AAG
Lys 150 GCG ATT GAC CTG Ala Ile Asp Leu TGC AAT Cys Asn 155 CCC ACC GAT Pro Thr Asp CAC ATC GAC His Ile Asp 175
CCG
Pro 160 ATC TGC CAT GTG Ile Cys His Val
GGC
Gly 165 CCC GGC AAC GAA Pro Gly Asn Glu TTC AGC GGA Phe Ser Gly 170 GGC TAC ATA CCC Gly Tyr Ile Pro TAC ACC ACC CAG GCG GCT AGT TTC Tyr Thr Thr Gln Ala Ala Ser Phe 185 GTC GTG Val Val 190 CAG AGG CTC CGC Gln Arg Leu Arg GGG TCG GTG CCA Gly Ser Val Pro CTG CCT GGA TCC Leu Pro Gly Ser
GTC
Val 205 CCG CAG CTG CCC Pro Gln Leu Pro
GGG
Gly 210 TCT GTC CTT CAG Ser Val Leu Gln
ATG
Met 215 CCC GGC ACT GCC Pro Gly Thr Ala
GCA
Ala 220 CCG GCT CCC GAA Pro Ala Pro Glu
TCG
Ser 225 CTG CAC GGT CGC Leu His Gly Arg TGACGCTTTG TCAGTAAGCC CATAAAA
TCGCG
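The "Coding Sequence" locations in the entries above can be cross-checked against the LENGTH stated for each paired protein entry. The following is a minimal sketch of that arithmetic, not part of the patent: the helper names are hypothetical, and it assumes the locations are 1-based and inclusive and that the listed range ends at the last sense codon. That assumption holds for the three entries used here, but not for every entry; SEQ ID NO: 41, for example, lists CDS 45..944, which includes the TAA stop codon, so its 300 codons correspond to the 299 amino acids of SEQ ID NO: 42.

```python
# Illustrative consistency check; helper names are hypothetical, not from the patent.
# Coordinates are the "Coding Sequence" locations stated in the listing, read as
# 1-based inclusive ranges. Protein lengths are the stated LENGTH values.


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in the inclusive range start..end."""
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in start..end; raises if the span is out of frame."""
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"span {start}..{end} has {length} nt, not a multiple of 3")
    return length // 3


# (nucleotide entry, CDS start, CDS end, protein entry, stated protein length)
ENTRIES = [
    ("SEQ ID NO: 51", 18, 695, "SEQ ID NO: 52", 226),
    ("SEQ ID NO: 53", 73, 615, "SEQ ID NO: 54", 181),
    ("SEQ ID NO: 55", 133, 918, "SEQ ID NO: 56", 262),
]


def check_entries(entries=ENTRIES):
    """Return (nucleotide id, protein id, codons, stated length, match) per entry."""
    results = []
    for nt_id, start, end, aa_id, stated in entries:
        codons = codon_count(start, end)
        results.append((nt_id, aa_id, codons, stated, codons == stated))
    return results


if __name__ == "__main__":
    for nt_id, aa_id, codons, stated, ok in check_entries():
        print(f"{nt_id}: {codons} codons; {aa_id} states {stated} aa; match={ok}")
```

A span that is not a multiple of three raises an error rather than being rounded, so an out-of-frame location is reported instead of silently accepted.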
INFORMATION FOR SEQ ID NO: 56: SEQUENCE CHARACTERISTICS: LENGTH: 262 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...33 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
Met Asn Asn Arg Pro Ile Arg Leu Leu Thr Ser Gly Arg Ala Gly Leu
-25
Gly Ala Gly Ala Leu Ile Thr Ala Val Val Leu Leu Ile Ala Leu Gly
-10
Ala Val Trp Thr Pro Val Ala Phe Ala Asp Gly Cys Pro Asp Ala Glu
1 5 10
Val Thr Phe Ala Arg Gly Thr Gly Glu Pro Pro Gly Ile Gly Arg Val
25
Gly Gln Ala Phe Val Asp Ser Leu Arg Gln Gln Thr Gly Met Glu Ile
40
Gly Val Tyr Pro Val Asn Tyr Ala Ala Ser Arg Leu Gln Leu His Gly
55
Gly Asp Gly Ala Asn Asp Ala Ile Ser His Ile Lys Ser Met Ala Ser
70
Ser Cys Pro Asn Thr Lys Leu Val Leu Gly Gly Tyr Ser Gln Gly Ala
85 90
Thr Val Ile Asp Ile Val Ala Gly Val Pro Leu Gly Ser Ile Ser Phe
100 105 110
Gly Ser Pro Leu Pro Ala Ala Tyr Ala Asp Asn Val Ala Ala Val Ala
115 120 125
Val Phe Gly Asn Pro Ser Asn Arg Ala Gly Gly Ser Leu Ser Ser Leu
130 135 140
Ser Pro Leu Phe Gly Ser Lys Ala Ile Asp Leu Cys Asn Pro Thr Asp
145 150 155
Pro Ile Cys His Val Gly Pro Gly Asn Glu Phe Ser Gly His Ile Asp
160 165 170 175
Gly Tyr Ile Pro Thr Tyr Thr Thr Gln Ala Ala Ser Phe Val Val Gln
180 185 190
Arg Leu Arg Ala Gly Ser Val Pro His Leu Pro Gly Ser Val Pro Gln
195 200 205
Leu Pro Gly Ser Val Leu Gln Met Pro Gly Thr Ala Ala Pro Ala Pro
210 215 220
Glu Ser Leu His Gly Arg
225
INFORMATION FOR SEQ ID NO: 57: SEQUENCE CHARACTERISTICS: LENGTH: 1000 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 94...966 OTHER INFORMATION: NAME/KEY: Signal Sequence LOCATION: 94...264 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
CGAGGAGACC GACGATCTGC TCGACGAAAT CGACGACGTC CTCGAGGAGA ACGCCGAGGA
CTTCGTCCGC GCATACGTCC AAAAGGGCGG ACA GTG ACC TGG CCG TTG CCC GAT Met Thr Trp Pro Leu Pro Asp 114 162 CGC CTG TCC ATT AAT TCA CTC TCT GGA ACA CCC GCT GTA GAC Arg Leu Ser Ile Asn Ser Leu Ser Gly Thr Pro Ala Val Asp CTA TCT Leu Ser TCT TTC ACT Ser Phe Thr AGC ATC AGC Ser Ile Ser TTC CTG CGC CGC Phe Leu Arg Arg
CAG
Gln GCG CCG GAG TTG CTG CCG GCA Ala Pro Glu Leu Leu Pro Ala 210 GGC GGT GCG CCA CTC GCA GGC GGC GAT GCG CAA CTG CCG Gly Gly Ala Pro Leu Ala Gly Gly Asp Ala Gln Leu Pro CAC GGC ACC ACC ATT GTC GCG CTG AAA TAC His Gly Thr Thr Ile Val Ala Leu Lys Tyr 1 5
CCC
Pro GGC GGT GTT GTC Gly Gly Val Val GCG GGT GAC CGG Ala Gly Asp Arg TCG ACG CAG GGC Ser Thr Gln Gly ATG ATT TCT GGG Met Ile Ser Gly CGT GAT Arg Asp GTG CGC AAG Val Arg Lys GGC ACG GCT Gly Thr Ala
GTG
Val TAT ATC ACC GAT Tyr Ile Thr Asp TAC ACC GCT ACC Tyr Thr Ala Thr GGC ATC GCT Gly Ile Ala GCC GTG GAA Ala Val Glu GCG GTC GCG GTT Ala Val Ala Val
GAG
Glu TTT GCC CGG CTG Phe Ala Arg Leu CTT GAG Leu Glu CAC TAC GAG AAG CTC GAG GGT GTG CCG His Tyr Glu Lys Leu Glu Gly Val Pro ACG TTT GCC GGC Thr Phe Ala Gly
AAA
Lys ATC AAC CGG CTG Ile Asn Arg Leu ATT ATG GTG CGT Ile Met Val Arg GGC AAT CTG GCG GCC GCG Gly Asn Leu Ala Ala Ala 90 ATG CAG GGT CTG Met Gln Gly Leu
CTG
Leu 100 GCG TTG CCG TTG Ala Leu Pro Leu
CTG
Leu 105 GCG GGC TAC GAC Ala Gly Tyr Asp ATT CAT Ile His 110 594 642 GCG TCT GAC Ala Ser Asp GGC GGT TGG Gly Gly Trp 130
CCG
Pro 115 CAG AGC GCG GGT CGT ATC GTT TCG TTC Gln Ser Ala Gly Arg Ile Val Ser Phe 120 GAC GCC GCC Asp Ala Ala 125 AAC ATC GAG GAA GAG GGC TAT CAG GCG GTG GGC TCG GGT Asn Ile Glu Glu Glu Gly Tyr Gln Ala Val Gly Ser Gly 690 738 TCG CTG Ser Leu 145 TTC GCG AAG TCG TCG ATG AAG AAG TTG Phe Ala Lys Ser Ser Met Lys Lys Leu 150 TCG CAG GTT ACC Ser Gln Val Thr GGT GAT TCG GGG Gly Asp Ser Gly
CTG
Leu 165 CGG GTG GCG GTC Arg Val Ala Val
GAG
Glu 170 GCG CTC TAC GAC Ala Leu Tyr Asp GCC GAC GAC GAC Ala Asp Asp Asp GCC ACC GGC GGT CCG GAC CTG GTG CGG Ala Thr Gly Gly Pro Asp Leu Val Arg 185 GGC ATC Gly Ile 190 TTT CCG ACG Phe Pro Thr GAG AGC CGG Glu Ser Arg 210 GTG ATC ATC GAC Val Ile Ile Asp GAC GGG GCG GTT Asp Gly Ala Val GAC GTG CCG Asp Val Pro 205 AGC CGT TCG Ser Arg Ser 882 930 ATT GCC GAA TTG Ile Ala Glu Leu CGC GCG ATC ATC Arg Ala Ile Ile GGT GCG Gly Ala 225 GAT ACT TTC GGC Asp Thr Phe Gly GAT GGC GGT GAG Asp Gly Gly Glu
AAG
Lys 235 TGAGTTTTCC GTATTT 982
CATCTCGCCT GAGCAGGC 1000
INFORMATION FOR SEQ ID NO: 58: SEQUENCE CHARACTERISTICS: LENGTH: 291 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (ix) FEATURE: NAME/KEY: Signal Sequence LOCATION: 1...56 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
Met Thr Trp Pro Leu Pro Asp Arg Leu Ser Ile Asn Ser Leu Ser Gly
-50
Thr Pro Ala Val Asp Leu Ser Ser Phe Thr Asp Phe Leu Arg Arg Gln
-35 -30
Ala Pro Glu Leu Leu Pro Ala Ser Ile Ser Gly Gly Ala Pro Leu Ala
-15
Gly Gly Asp Ala Gln Leu Pro His Gly Thr Thr Ile Val Ala Leu Lys
1
Tyr Pro Gly Gly Val Val Met Ala Gly Asp Arg Arg Ser Thr Gln Gly
15
Asn Met Ile Ser Gly Arg Asp Val Arg Lys Val Tyr Ile Thr Asp Asp
30 35
Tyr Thr Ala Thr Gly Ile Ala Gly Thr Ala Ala Val Ala Val Glu Phe
50
Ala Arg Leu Tyr Ala Val Glu Leu Glu His Tyr Glu Lys Leu Glu Gly
65
Val Pro Leu Thr Phe Ala Gly Lys Ile Asn Arg Leu Ala Ile Met Val
80
Arg Gly Asn Leu Ala Ala Ala Met Gln Gly Leu Leu Ala Leu Pro Leu
95 100
Leu Ala Gly Tyr Asp Ile His Ala Ser Asp Pro Gln Ser Ala Gly Arg
105 110 115 120
Ile Val Ser Phe Asp Ala Ala Gly Gly Trp Asn Ile Glu Glu Glu Gly
125 130 135
Tyr Gln Ala Val Gly Ser Gly Ser Leu Phe Ala Lys Ser Ser Met Lys
140 145 150
Lys Leu Tyr Ser Gln Val Thr Asp Gly Asp Ser Gly Leu Arg Val Ala
155 160 165
Val Glu Ala Leu Tyr Asp Ala Ala Asp Asp Asp Ser Ala Thr Gly Gly
170 175 180
Pro Asp Leu Val Arg Gly Ile Phe Pro Thr Ala Val Ile Ile Asp Ala
185 190 195 200
Asp Gly Ala Val Asp Val Pro Glu Ser Arg Ile Ala Glu Leu Ala Arg
205 210 215
Ala Ile Ile Glu Ser Arg Ser Gly Ala Asp Thr Phe Gly Ser Asp Gly
220 225 230
Gly Glu Lys
235
INFORMATION FOR SEQ ID NO: 59: SEQUENCE CHARACTERISTICS: LENGTH: 900 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 66...808 OTHER
INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
TTGGCCCGCG CGATCATCGA AAGCCGTTCG GGTGCGGATA CTTTCGGCTC CGATGGCGGT
GAGAA GTG AGT TTT CCG TAT TTC ATC TCG CCT GAG CAG GCG ATG CGC GAG 110
Met Ser Phe Pro Tyr Phe Ile Ser Pro Glu Gln Ala Met Arg Glu
1 5 10
CGC AGC GAG TTG GCG CGT AAG GGC ATT GCG CGG GCC AAA AGC GTG GTG 158
Arg Ser Glu Leu Ala Arg Lys Gly Ile Ala Arg Ala Lys Ser Val Val
25
GCG CTG GCC TAT GCC GGT GGT GTG CTG TTC GTC GCG GAG AAT CCG TCG 206
Ala Leu Ala Tyr Ala Gly Gly Val Leu Phe Val Ala Glu Asn Pro Ser
40
CGG TCG CTG CAG AAG ATC AGT GAG CTC TAC GAT CGG GTG GGT TTT GCG 254
Arg Ser Leu Gln Lys Ile Ser Glu Leu Tyr Asp Arg Val Gly Phe Ala
55
GCT GCG GGC AAG TTC AAC GAG TTC GAC AAT TTG CGC CGC GGC GGG ATC 302
Ala Ala Gly Lys Phe Asn Glu Phe Asp Asn Leu Arg Arg Gly Gly Ile
70
CAG TTC GCC GAC ACC CGC GGT TAC GCC TAT GAC CGT CGT GAC GTC ACG 350
Gln Phe Ala Asp Thr Arg Gly Tyr Ala Tyr Asp Arg Arg Asp Val Thr
85 90
GGT CGG CAG TTG GCC AAT GTC TAC GCG CAG ACT CTA GGC ACC ATC TTC 398
Gly Arg Gln Leu Ala Asn Val Tyr Ala Gln Thr Leu Gly Thr Ile Phe
100 105 110
ACC GAA CAG GCC AAG CCC TAC GAG GTT GAG TTG TGT GTG GCC GAG GTG 446
Thr Glu Gln Ala Lys Pro Tyr Glu Val Glu Leu Cys Val Ala Glu Val
115 120 125
GCG CAT TAC GGC GAG ACG AAA CGC CCT GAG TTG TAT CGT ATT ACC TAC 494
Ala His Tyr Gly Glu Thr Lys Arg Pro Glu Leu Tyr Arg Ile Thr Tyr
130 135 140
GAC GGG TCG ATC GCC GAC GAG CCG CAT TTC GTG GTG ATG GGC GGC ACC 542
Asp Gly Ser Ile Ala Asp Glu Pro His Phe Val Val Met Gly Gly Thr
145 150 155
ACG GAG CCG ATC GCC AAC GCG CTC AAA GAG TCG TAT GCC GAG AAC GCC 590
Thr Glu Pro Ile Ala Asn Ala Leu Lys Glu Ser Tyr Ala Glu Asn Ala
160 165 170 175
AGC CTG ACC GAC GCC CTG CGT ATC GCG GTC GCT GCA TTG CGG GCC GGC 638
Ser Leu Thr Asp Ala Leu Arg Ile Ala Val Ala Ala Leu Arg Ala Gly
180 185 190
AGT GCC GAC ACC TCG GGT GGT GAT CAA CCC ACC CTT GGC GTG GCC AGC 686
Ser Ala Asp Thr Ser Gly Gly Asp Gln Pro Thr Leu Gly Val Ala Ser
195
200 205
TTA GAG GTG GCC GTT CTC GAT GCC AAC CGG CCA CGG CGC GCG TTC CGG 734
Leu Glu Val Ala Val Leu Asp Ala Asn Arg Pro Arg Arg Ala Phe Arg
210 215 220
CGC ATC ACC GGC TCC GCC CTG CAA GCG TTG CTG GTA GAC CAG GAA AGC 782
Arg Ile Thr Gly Ser Ala Leu Gln Ala Leu Leu Val Asp Gln Glu Ser
225 230 235
CCG CAG TCT GAC GGC GAA TCG TCG GG CTGAGTCCGA AAGTCCGACG CGTGTCTG 836
Pro Gln Ser Asp Gly Glu Ser Ser Gly
240 245
GGACCCCGCT GCGACGTTAA CTGCGCCTAA CCCCGGCTCG ACGCGTCGCC GGCCGTCCTG 896
ACTT 900
INFORMATION FOR SEQ ID NO: 60: SEQUENCE CHARACTERISTICS: LENGTH: 248 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
Met Ser Phe Pro Tyr Phe Ile Ser Pro Glu Gln Ala Met Arg Glu Arg
1 5 10
Ser Glu Leu Ala Arg Lys Gly Ile Ala Arg Ala Lys Ser Val Val Ala
25
Leu Ala Tyr Ala Gly Gly Val Leu Phe Val Ala Glu Asn Pro Ser Arg
40
Ser Leu Gln Lys Ile Ser Glu Leu Tyr Asp Arg Val Gly Phe Ala Ala
55
Ala Gly Lys Phe Asn Glu Phe Asp Asn Leu Arg Arg Gly Gly Ile Gln
70 75
Phe Ala Asp Thr Arg Gly Tyr Ala Tyr Asp Arg Arg Asp Val Thr Gly
90
Arg Gln Leu Ala Asn Val Tyr Ala Gln Thr Leu Gly Thr Ile Phe Thr
100 105 110
Glu Gln Ala Lys Pro Tyr Glu Val Glu Leu Cys Val Ala Glu Val Ala
115 120 125
His Tyr Gly Glu Thr Lys Arg Pro Glu Leu Tyr Arg Ile Thr Tyr Asp
130 135 140
Gly Ser Ile Ala Asp Glu Pro His Phe Val Val Met Gly Gly Thr Thr
145 150 155 160
Glu Pro Ile Ala Asn Ala Leu Lys Glu Ser Tyr Ala Glu Asn Ala Ser
165 170 175
Leu Thr Asp Ala Leu Arg Ile Ala Val Ala Ala Leu Arg Ala Gly Ser
180 185 190
Ala Asp Thr Ser Gly Gly Asp Gln Pro Thr Leu Gly Val Ala Ser Leu
195 200 205
Glu Val Ala Val Leu Asp Ala Asn Arg Pro Arg Arg Ala Phe Arg Arg
210 215 220
Ile Thr Gly Ser Ala Leu Gln Ala Leu Leu Val Asp Gln Glu Ser Pro
225 230 235 240
Gln Ser Asp Gly Glu Ser Ser Gly
245
INFORMATION FOR SEQ ID NO: 61: SEQUENCE CHARACTERISTICS: LENGTH:
1560 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 98...1487 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
GAGTCATTGC CTGGTCGGCG TCATTCCGTA CTAGTCGGTT GTCGGACTTG ACCTACTGGG
TCAGGCCGAC GAGCACTCGA CCATTAGGGT AGGGGCC GTG ACC CAC TAT GAC GTC 115
Met Thr His Tyr Asp Val
1
GTC GTT CTC GGA GCC GGT CCC GGC GGG TAT GTC GCG GCG ATT CGC GCC 163
Val Val Leu Gly Ala Gly Pro Gly Gly Tyr Val Ala Ala Ile Arg Ala
15
GCA CAG CTC GGC CTG AGC ACT GCA ATC GTC GAA CCC AAG TAC TGG GGC 211
Ala Gln Leu Gly Leu Ser Thr Ala Ile Val Glu Pro Lys Tyr Trp Gly
30
GGA GTA TGC CTC AAT GTC GGC TGT ATC CCA TCC AAG GCG CTG TTG CGC 259
Gly Val Cys Leu Asn Val Gly Cys Ile Pro Ser Lys Ala Leu Leu Arg
45
AAC GCC GAA CTG GTC CAC ATC TTC ACC AAG GAC GCC AAA GCA TTT GGC 307
Asn Ala Glu Leu Val His Ile Phe Thr Lys Asp Ala Lys Ala Phe Gly
60 65
ATC AGC GGC GAG GTG ACC TTC GAC TAC GGC ATC GCC TAT GAC CGC AGC 355
Ile Ser Gly Glu Val Thr Phe Asp Tyr Gly Ile Ala Tyr Asp Arg Ser
80
CGA AAG GTA Arg Lys Val AAG AAC AAG Lys Asn Lys 105 GAG GGC AGG GTG Glu Gly Arg Val
GCC
Ala GGT GTG CAC TTC Gly Val His Phe CTG ATG AAG Leu Met Lys 100 GCC GAC GCC Ala Asp Ala ATC ACC GAG ATC CAC GGG TAC GGC ACA Ile Thr Glu Ile His Gly Tyr Gly Thr
TTT
Phe 115 AAC ACG TTG TTG GTT Asn Thr Leu Leu Val 120 GAT CTC AAC GAC GGC GGT ACA GAA TCG GTC ACG Asp Leu Asn Asp Gly Gly Thr Glu Ser Val Thr 125 130
TTC
Phe 135 GAC AAC GCC ATC Asp Asn Ala Ile GCG ACC GGC AGT Ala Thr Gly Ser ACC CGG CTG GTT Thr Arg Leu Val GGC ACC TCA CTG Gly Thr Ser Leu GCC AAC GTA GTC Ala Asn Val Val
ACC
Thr 160 TAC GAG GAA CAG Tyr Glu Glu Gln ATC CTG Ile Leu 165 TCC CGA GAG Ser Arg Glu GGC ATG GAG Gly Met Glu 185 CCG AAA TCG ATC Pro Lys Ser Ile
ATT
Ile 175 ATT GCC GGA GCT GGT GCC ATT Ile Ala Gly Ala Gly Ala Ile TTC GGC TAC GTG Phe Gly Tyr Val AAG AAC TAC GGC GTT GAC GTG ACC Lys Asn Tyr Gly Val Asp Val Thr 195 ATC GTG Ile Val 200 GAA TTC CTT CCG Glu Phe Leu Pro
CGG
Arg 205 GCG CTG CCC AAC Ala Leu Pro Asn
GAG
Glu 210 GAC GCC GAT GTG Asp Ala Asp Val
TCC
Ser 215 AAG GAG ATC GAG Lys Glu Ile Glu CAG TTC AAA AAG Gln Phe Lys Lys GGT GTC ACG ATC Gly Val Thr Ile ACC GCC ACG AAG Thr Ala Thr Lys GAG TCC ATC GCC Glu Ser Ile Ala
GAT
Asp 240 GGC GGG TCG CAG Gly Gly Ser Gln GTC ACC Val Thr 245 GTG ACC GTC Val Thr Val GTG TTG CAG Val Leu Gln 265 AAG GAC GGC GTG GCG CAA GAG CTT AAG Lys Asp Gly Val Ala Gln Glu Leu Lys 255 GCG GAA AAG Ala Glu Lys 260 TAC GGG CTG Tyr Gly Leu GCC ATC GGA TTT Ala Ile Gly Phe CCC AAC GTC GAA Pro Asn Val Glu
GGG
Gly 275 GAC AAG GCA GGC GTC GCG Asp Lys Ala Gly Val Ala 280
CTG
Leu 285 ACC GAC CGC AAG GCT ATC GGT GTC GAC Thr Asp Arg Lys Ala Ile Gly Val Asp 290
GAC
Asp 295 TAC ATG CGT ACC Tyr Met Arg Thr
AAC
Asn 300 GTG GGC CAC ATC Val Gly His Ile
TAC
Tyr 305 GCT ATC GGC GAT Ala Ile Gly Asp 1027 AAT GGA TTA CTG CAG CTG GCG CAC GTC Asn Gly Leu Leu Gln Leu Ala His Val 315 GAG GCA CAA GGC Glu Ala Gln Gly GTG GTA Val Val 325 1075 1123 GCC GCC GAA ACC ATT GCC GGT GCA GAG ACT TTG ACG CTG GGC GAC CAT Ala Ala Glu Thr Ile Ala Gly Ala Glu Thr Leu Thr Leu Gly Asp His 330
CCG
Pro 335
TGT
Cys CGG ATG TTG Arg Met Leu 345 GGG CTC ACC Gly Leu Thr CGC GCG ACG Arg Ala Thr CAG CCA AAC Gln Pro Asn GCC AGC TTC Ala Ser Phe GTG GTG GTG Val Val Val GAG CAG CAA Glu Gln Gln AAC GAA GGT Asn Glu Gly 360
AAG
Lys
TAC
Tyr 370
CAC
His
GCC
Ala 375
CCC
Pro TTC CCG TTC Phe Pro Phe
ACG
Thr 380
AAG
Lys AAC GCC AAG Asn Ala Lys
GCG
Ala 385
GCC
Ala GGC GTG GGT Gly Val Gly
GAC
Asp 390 AGT GGG TTC Ser Gly Phe CTG GTG GCC Leu Val Ala
GAC
Asp 400
GTG
Val AAG CAC GGC Lys His Gly GAG CTA Glu Leu 405 1171 1219 1267 1315 1363 1411 1459 1512 CTG GGT GGG Leu Gly Gly CTC ACG CTG Leu Thr Leu 425 AAC GTC CAC Asn Val His
CAC
His 410
GCG
Ala GTC GGC CAC Val Gly His
GAC
Asp 415
CTG
Leu GCC GAG CTG Ala Glu Leu CAG AGG TGG Gln Arg Trp ACC GCC AGC Thr Ala Ser CTG CCG GAG Leu Pro Glu 420 CTG GCT CGC Leu Ala Arg GAG TGC TTC Glu Cys Phe ACC CAC CCA Thr His Pro ATG TCT GAG GCG CTG Met Ser Glu Ala Leu 440
GGC
Gly
CAC
His 455 CTG GTT GGC CAC ATG ATC AAT T TCTGAGCGGC TCATGACGAG
GCGCG
Leu Val Gly His Met Ile Asn Phe CGAGCACTGA CACCCCCCAG ATCATCATGG GTGCCATCGG TGGTGTGG INFORMATION FOR SEQ ID NO: 62: SEQUENCE CHARACTERISTICS: LENGTH: 464 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: Met Thr His Tyr Asp Val Val Val Leu Gly Ala Gly Pro 1 5 10 Val Ala Ala Ile Arg Ala Ala Gin Leu Gly Leu Ser Thr 25 Glu Pro Lys Tyr Trp Gly Gly Val Cys Leu Asn Val Gly 40 Ser Lys Ala Leu Leu Arg Asn Ala Glu Leu Val His Ile 55 Asp Ala Lys Ala Phe Gly Ile Ser Gly Glu Val Thr Phe 70 75 Ile Ala Tyr Asp Arg Ser Arg Lys Val Ala Glu Gly Arg 1560 Gly Ala Cys Phe Asp Val Gly Ile Ile Thr Tyr Ala Tyr Val Pro Lys Gly Gly SUBSTITUTE SHEET (RULE 26) WO 99/2477 PCT/DK98/00438 90 Val His Phe Leu Met Lys Lys Asn Lys Ile Thr Glu Ile His Gly Tyr 100 105 110 Gly Thr Phe Ala Asp Ala Asn Thr Leu Leu Val Asp Leu Asn Asp Gly 115 120 125 Gly Thr Glu Ser Val Thr Phe Asp Asn Ala Ile Ile Ala Thr Gly Ser 130 135 140 Ser Thr Arg Leu Val Pro Gly Thr Ser Leu Ser Ala Asn Val Val Thr 145 150 155 160 Tyr Glu Glu Gin Ile Leu Ser Arg Glu Leu Pro Lys Ser Ile Ile Ile 165 170 175 Ala Gly Ala Gly Ala Ile Gly Met Glu Phe Gly Tyr Val Leu Lys Asn 180 185 190 Tyr Gly Val Asp Val Thr Ile Val Glu Phe Leu Pro Arg Ala Leu Pro 195 200 205 Asn Glu Asp Ala Asp Val Ser Lys Glu Ile Glu Lys Gin Phe Lys Lys 210 215 220 Leu Gly Val Thr Ile Leu Thr Ala Thr Lys Val Glu Ser Ile Ala Asp 225 230 235 240 Gly Gly Ser Gin Val Thr Val Thr Val Thr Lys Asp Gly Val Ala Gin 245 250 255 Glu Leu Lys Ala Glu Lys Val Leu Gin Ala Ile Gly Phe Ala Pro Asn 260 265 270 Val Glu Gly Tyr Gly Leu Asp Lys Ala Gly Val Ala Leu Thr Asp Arg 275 280 285 Lys Ala Ile Gly Val Asp Asp Tyr Met Arg Thr Asn Val Gly His Ile 290 295 300 Tyr Ala Ile Gly Asp Val Asn Gly Leu Leu Gin Leu Ala His Val Ala 305 310 315 320 Glu Ala Gin Gly Val Val Ala Ala Glu Thr Ile Ala Gly Ala Glu Thr 325 330 335 Leu Thr Leu Gly Asp His Arg Met Leu Pro Arg Ala Thr Phe Cys Gin 
340 345 350 Pro Asn Val Ala Ser Phe Gly Leu Thr Glu Gin Gin Ala Arg Asn Glu 355 360 365 Gly Tyr Asp Val Val Val Ala Lys Phe Pro Phe Thr Ala Asn Ala Lys 370 375 380 Ala His Gly Val Gly Asp Pro Ser Gly Phe Val Lys Leu Val Ala Asp 385 390 395 400 Ala Lys His Gly Glu Leu Leu Gly Gly His Leu Val Gly His Asp Val 405 410 415 Ala Glu Leu Leu Pro Glu Leu Thr Leu Ala Gin Arg Trp Asp Leu Thr 420 425 430 Ala Ser Glu Leu Ala Arg Asn Val His Thr His Pro Thr Met Ser Glu 435 440 445 Ala Leu Gin Glu Cys Phe His Gly Leu Val Gly His Met Ile Asn Phe 450 455 460 INFORMATION FOR SEQ ID NO: 63: SEQUENCE CHARACTERISTICS: LENGTH: 550 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 101...490 OTHER INFORMATION: SUBSTITUTE SHEET (RULE 26) lO- 00/41C77 PCT/DK98/00438 T d -5 1 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: GGCCCGGCTC GCGGCCGCCC TGCAGGAAAA GAAGGCCTGC CCAGGCCCAG ACTCAGCCGA GTAGTCACCC AGTACCCCAC ACCAGGAAGG ACCGCCCATC ATG GCA AAG CTC TCC Met Ala Lys Leu Ser 1 ACC CTG TTG GAG CTC 115 ACC GAC Thr Asp TCC GAC Ser Asp GCT CCA Ala Pro GTC GAG Val Glu GCC GGC Ala Gly TCC GGC Ser Gly AAG CCG Lys Pro GCC AAG GAA CTG CTG Glu Leu Leu TTC GTC AAG Phe Val Lys GTC GCC GTC Val Ala Val GCT GCC GAG Ala Ala Glu GAC AAG AAG Asp Lys Lys CTG GGC CTC Leu Gly Leu CTG CTG GAG Leu Leu Glu 105 CTG GAG GCC
GAC
Asp
AAG
Lys
GCC
Ala
GAG
Glu
ATC
Ile 75
AAG
Lys
AAG
Lys
GCC
GCG TTC AAG Ala Phe Lys TTC GAG GAG Phe Glu Glu 30 GCC GCC GGT Ala Ala Gly 45 CAG TCC GAG Gln Ser Glu 60 GGC GTC ATC Gly Val Ile GAG GCC AAG Glu Ala Lys GTC GCC AAG Val Ala Lys 110 GGC GCC ACC GAA ATG Glu Met 15 ACC TTC Thr Phe GCC GCC Ala Ala TTC GAC Phe Asp AAG GTG Lys Val 80 GAC CTG Asp Leu 95 GAG GCC Glu Ala GTC ACC Thr Leu Leu
GAG
Glu
CCG
Pro
GTG
Val
GTC
Val
GTC
Val
GCC
Ala
GTC
GTC ACC Val Thr GCC GGT Ala Gly ATC CTT Ile Leu CGG GAG Arg Glu GAC GGC Asp Gly GAC GAG Asp Glu 115 AAG TAG( Glu Leu GCC GCC Ala Ala GCC GCC Ala Ala GAG GCC Glu Ala ATC GTT Ile Val GCG CCC Ala Pro 100 GCC AAG Ala Lys CTCTGCC CA 502 Ala Lys Leu Glu Ala Ala Gly Ala Thr Val Thr Val Lys 120 125 130 GCGTGTTCTT TTGCGTCTGC TCGGCCCGTA GCGAACACTG CGCCCGCT INFORMATION FOR SEQ ID NO: 64: SEQUENCE CHARACTERISTICS: LENGTH: 130 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: Met Ala Lys Leu Ser Thr Asp Glu Leu Leu Asp Ala Phe Lys Glu Met 1 5 10 Thr Leu Leu Glu Leu Ser Asp Phe Val Lys Lys Phe Glu Glu Thr Phe SUBSTITUTE SHEET (RULE 26) PCT/DK98/00438 r/t. AJV JJ* J Pf^ Wu O L 7h'JI' 52 25 Glu Val Thr Ala Ala Ala Pro Val Ala Val Ala Ala Ala Gly Ala Ala 40 Pro Ala Gly Ala Ala Val Glu Ala Ala Glu Glu Gin Ser Glu Phe Asp 55 Val Ile Leu Glu Ala Ala Gly Asp Lys Lys Ile Gly Val Ile Lys Val 70 75 Val Arg Glu Ile Val Ser Gly Leu Gly Leu Lys Glu Ala Lys Asp Leu 90 Val Asp Gly Ala Pro Lys Pro Leu Leu Glu Lys Val Ala Lys Glu Ala 100 105 110 Ala Asp Glu Ala Lys Ala Lys Leu Glu Ala Ala Gly Ala Thr Val Thr 115 120 125 Val Lys 130 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 900 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 87...770 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: TGAACGCCAT CGGGTCCAAC GAACGCAGCG CTACCTGATC ACCACCGGGT CTGTTAGGGC TCTTCCCCAG GTCGTACAGT CGGGCC ATG GCC ATT GAG GTT TCG GTG TTG CGG 113 Met Ala Ile Glu Val Ser Val Leu Arg 1 GTT TTC ACC GAT TCA GAC GGG AAT TTC GGT AAT CCG CTG GGG GTG ATC 161 Val Phe Thr Asp Ser Asp Gly Asn Phe Gly Asn Pro Leu Gly Val Ile 15 20 AAC GCC AGC AAG GTC GAA CAC CGC GAC AGG CAG CAG CTG GCA GCC CAA 209 Asn Ala Ser Lys Val Glu His Arg Asp Arg Gin Gin Leu Ala Ala Gin 35 TCG GGC TAC AGC GAA ACC ATA TTC GTC GAT CTT CCC AGC CCC GGC TCA 257 Ser 
Gly Tyr Ser Glu Thr Ile Phe Val Asp Leu Pro Ser Pro Gly Ser 50 ACC ACC GCA CAC GCC ACC ATC CAT ACT CCC CGC ACC GAA ATT CCG TTC 305 Thr Thr Ala His Ala Thr Ile His Thr Pro Arg Thr Glu Ile Pro Phe 65 SUBSTITUTE SHEET (RULE 26) WO 99/24577 GCC GGA CAC Ala Gly His PCT/DK98/00438 CCG ACC GTG Pro Thr Val
GGA
Gly 80 GCG TCC TGG TGG CTG CGC GAG AGG GGG Ala Ser Trp Trp Leu Arg Glu Arg Gly 353
ACG
Thr CCA ATT AAC ACG Pro Ile Asn Thr CAG GTG CCG GCC Gln Val Pro Ala
GGC
Gly 100 ATC GTC CAG GTG Ile Val Gln Val 401 TAC CAC GGT GAT Tyr His Gly Asp
CTC
Leu 110 ACC GCC ATC AGC GCC CGC TCG GAA TGG GCA CCC Thr Ala Ile Ser Ala Arg Ser Glu Trp Ala Pro 449 GAG TTC GCC Glu Phe Ala
ATC
Ile 125 CAC GAC CTG GAT His Asp Leu Asp CTT GAT GCG CTT Leu Asp Ala Leu GCC GCC GCC Ala Ala Ala 135 TGG ACC TGG Trp Thr Trp GAC CCC GCC GAC TTT CCG GAC GAC ATC GCG CAC TAC Asp Pro Ala Asp Phe Pro Asp Asp Ile Ala His Tyr 140 145
CTC
Leu 150 ACC GAC Thr Asp 155 CGC TCC GCT GGC Arg Ser Ala Gly
TCG
Ser 160 CTG CGC GCC CGC Leu Arg Ala Arg TTT GCC GCC AAC Phe Ala Ala Asn
TTG
Leu 170 GGC GTC ACC GAA Gly Val Thr Glu
GAC
Asp 175 GAA GCG ACC GGT Glu Ala Thr Gly GCG GCC ATC CGG Ala Ala Ile Arg ACC GAT TAC CTC Thr Asp Tyr Leu
AGC
Ser 190 CGT GAC CTC ACC Arg Asp Leu Thr ACC CAG GGC AAA Thr Gln Gly Lys GGA TCG Gly Ser 200 TTG ATC CAC Leu Ile His CGA GTT GTC Arg Val Val 220 ACC TGG AGT CCC Thr Trp Ser Pro
GAG
Glu 210 GGC TGG GTT CGG Gly Trp Val Arg GTA GCC GGC Val Ala Gly 215 AGC GAC GGT GTG Ser Asp Gly Val
GCA
Ala 225 CAA CTC GAC Gin Leu Asp TGACGTAGAG CTCAGCGCTG CCGATGCAAC ACGGCGGCAA GGTGATCCTG CAGGGGTTGC CCGACCGCGC GCATCTGCAA CGAGTACGAA AGCTCGTCGC CGTCGATGCG GTAGGAACGG TCAAGGGCGG INFORMATION FOR SEQ ID NO: 66: SEQUENCE CHARACTERISTICS: LENGTH: 228 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: Met Ala Ile Glu Val Ser Val Leu Arg Val Phe Thr Asp Ser Asp Gly 1 5 10 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 Asn Arg Phe His Ala Val Ile Asp Asp 145 Leu Ala Leu Pro Ala 225 Asn Pro Leu Gly Val Leu Ser Glu 70 Arg Val Glu Leu Leu 150 Phe Ala Gly Arg Ala Gly Pro Arg Val Ala 120 Ala Thr Ala Arg Gly 200 Ala Ile Asn Gin Ser Ser Thr Phe Ala Gly Thr Ser Tyr 105 Pro Glu Ala Asp Trp Thr Asn Leu 170 Ile Thr 185 Ser Leu Gly Arg Ser Tyr Ala His Ile Gly Ala Ala 140 Arg Val Tyr His Val 220 Lys Ser His Pro Asn Asp Ile 125 Asp Ser Thr Leu Thr 205 Ser Val Glu Ala Thr Thr Leu 110 His Phe Ala Glu Ser 190 Thr Asp Glu Thr Thr Val Leu Thr Asp Pro Gly Asp 175 Arg Trp Gly His Ile Ile Gly Gin Ala Leu Asp Ser 160 Glu Asp Ser Val INFORMATION FOR SEQ ID NO: 67: SEQUENCE CHARACTERISTICS: LENGTH: 500 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 49...465 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: GTTTGTGGTG TCGGTGGTCT GGGGGGCGCC AACTGGGATT CGGTTGGG GTG GGT GCA Met Gly Ala SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 GGT CCG GCG ATG GGC ATC GGA GGT GTG GGT GGT Gly Pro Ala Met Gly Ile Gly Gly Val Gly Gly
TTG
Leu GGT GGG GCC GGT Gly Gly Ala Gly 105 153 TCG Ser GGT CCG GCG ATG Gly Pro Ala Met ATG GGG GGT GTG GGT GGT TTG GGT GGG Met Gly Gly Val Gly Gly Leu Gly Gly GGT TCG GGT CCG Gly Ser Gly Pro ATG GGC ATG GGG Met Gly Met Gly GTG GGT GGT TTA Val Gly Gly Leu GAT GCG Asp Ala 201 GCC GGT TCC Ala Gly Ser GGC GGA GGC Gly Gly Gly
GGC
Gly GAG GGC GGC TCT Glu Gly Gly Ser
CCT
Pro 60 GCG GCG ATC GGC Ala Ala Ile Gly ATC GGA GTT Ile Gly Val GCC GAC ACG Ala Asp Thr 249 297 GGA GGT GGG GGT Gly Gly Gly Gly
GGG
Gly GGT GGC GGC GGC Gly Gly Gly Gly AAC CGC Asn Arg TCC GAC AGG TCG Ser Asp Arg Ser GAC GTC GGG GGC GGA GTC TGG CCG TTG Asp Val Gly Gly Gly Val Trp Pro Leu 345 393 GGC Gly 100 TTC GGT AGG TTT GCC GAT GCG GGC GCC Phe Gly Arg Phe Ala Asp Ala Gly Ala 105 GGA AAC GAA GCA Gly Asn Glu Ala
CTG
Leu 115 GGG TCG AAG AAC Gly Ser Lys Asn TGC GCT GCC ATA Cys Ala Ala Ile TCC GGA GCT TCC Ser Gly Ala Ser ATA CCT Ile Pro 130 TCG TGC GGC Ser Cys Gly
CGG
Arg 135 AAG AGC TTG TCG Lys Ser Leu Ser TAGTCGGCCG CCATGACAAC CTCTCAGAGT
GCGCT
INFORMATION FOR SEQ ID NO: 68: SEQUENCE CHARACTERISTICS: LENGTH: 139 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: Met Gly Ala Gly Pro Ala Met Gly Ile Gly Gly Val Gly Gly Leu Gly Gly Ala Gly Ser Gly Pro Ala Met Gly Met Gly Gly Val Giy Gly Leu 25 Gly Gly Ala Gly Ser Gly Pro Ala Met Gly Met Gly Gly Val Gly Gly SUBSTTUVTE SHEET (RULE 26) WO 99/24577 56 56 40 Leu Asp Ala Ala Gly Ser Gly Glu Gly Gly Ser Pro 55 Ile Gly Val Gly Gly Gly Gly Gly Gly Gly Gly Gly 70 75 Ala Asp Thr Asn Arg Ser Asp Arg Ser Ser Asp Val 90 Trp Pro Leu Gly Phe Gly Arg Phe Ala Asp Ala Gly 100 105 Glu Ala Leu Gly Ser Lys Asn Gly Cys Ala Ala Ile 115 120 Ser Ile Pro Ser Cys Gly Arg Lys Ser Leu Ser 130 135 INFORMATION FOR SEQ ID NO: 69: SEQUENCE CHARACTERISTICS: LENGTH: 2050 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 22...2019 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: AGCGCACTCT GAGAGGTTGT C ATG GCG GCC GAC TAC GAC Met Ala Ala Asp Tyr Asp 1 5 CCG CAC GAA GGT ATG GAA GCT CCG GAC GAT ATG GCA Pro His Glu Gly Met Glu Ala Pro Asp Asp Met Ala 20 TTC GAC CCC AGT GCT TCG TTT CCG CCG GCG CCC GCA Phe Asp Pro Ser Ala Ser Phe Pro Pro Ala Pro Ala 35 CCG AAG CCC AAC GGC CAG ACT CCG CCC CCG ACG TCC Pro Lys Pro Asn Gly Gin Thr Pro Pro Pro Thr Ser 50 GAG CGG TTC GTG TCG GCC CCG CCG CCG CCA CCC CCA Glu Arg Phe Val Ser Ala Pro Pro Pro Pro Pro Pro 65 CCT CCG CCA ACT CCG ATG CCG ATC GCC GCA GGA GAG Pro Pro Pro Thr Pro Met Pro Ile Ala Ala Gly Glu 80 85 PCT/DK98/00438 Ala Ala Ile Gly Gly Gly Gly Gly Gly Gly Gly Val Ala Gly Gly Asn 110 Ser Ser Gly Ala 125
AAG
Lys
GCG
Ala
TCG
Ser
GAC
Asp
CCC
Pro
CCG
Pro
CTC
Leu
CAG
Gln
GCA
Ala
GAC
Asp
CCA
Pro
CCC
Pro
TTC
Phe
CCG
Pro
AAC
Asn
CTG
Leu
CCT
Pro
TCG
Ser
CGG
Arg
TTC
Phe
CTA
Leu
TCG
Ser
CCG
Pro
CCG
Pro GAA CCG GCC GCA Glu Pro Ala Ala
TCT
Ser AAA CCA CCC ACA Lys Pro Pro Thr CCC CCC Pro Pro 100 ATG CCC ATC Met Pro Ile GCC GGA Ala Gly 105 339 CCC GAA CCG GCC CCA CCC AAA CCA CCC ACA CCC CCC ATG Pro Glu Pro Ala Pro Pro Lys Pro Pro Thr Pro Pro Met CCC ATC GCC Pro Ile Ala 120 ATG CCC ATC Met Pro Ile 387 GGA CCC GAA Gly Pro Glu 125 CCG GCC CCA CCC Pro Ala Pro Pro
AAA
Lys 130 CCA CCC ACA CCT Pro Pro Thr Pro GCC GGA Ala Gly 140 CCT GCA CCC ACC Pro Ala Pro Thr
CCA
Pro 145 ACC GAA TCC CAG Thr Glu Ser Gln
TTG
Leu 150 GCG CCC CCC AGA Ala Pro Pro Arg
CCA
Pro 155 CCG ACA CCA CAA Pro Thr Pro Gln
ACG
Thr 160 CCA ACC GGA GCG Pro Thr Gly Ala
CCG
Pro 165 CAG CAA CCG GAA Gln Gln Pro Glu
TCA
Ser 170 CCG GCG CCC CAC Pro Ala Pro His CCC TCG CAC GGG Pro Ser His Gly
CCA
Pro 180 CAT CAA CCC CGG His Gln Pro Arg CGC ACC Arg Thr 185 GCA CCA GCA Ala Pro Ala GCT CCG TCC Ala Pro Ser 205
CCG
Pro 190 CCC TGG GCA AAG Pro Trp Ala Lys
ATG
Met 195 CCA ATC GGC GAA Pro Ile Gly Glu CCC CCG CCC Pro Pro Pro 200 ACC CGG CCT Thr Arg Pro AGA CCG TCT GCG Arg Pro Ser Ala CCG GCC GAA CCA Pro Ala Glu Pro GCC CCC Ala Pro 220 CAA CAC TCC CGA Gln His Ser Arg GCG CGC CGG GGT Ala Arg Arg Gly CGC TAT CGC ACA Arg Tyr Arg Thr
GAC
Asp 235 ACC GAA CGA AAC Thr Glu Arg Asn GGG AAG GTA GCA Gly Lys Val Ala
ACT
Thr 245 GGT CCA TCC ATC Gly Pro Ser Ile
CAG
Gln 250 GCG CGG CTG CGG Ala Arg Leu Arg GAG GAA GCA TCC Glu Glu Ala Ser
GGC
Gly 260 GCG CAG CTC GCC Ala Gln Leu Ala CCC GGA Pro Gly 265 ACG GAG CCC Thr Glu Pro CCG CCC ACC Pro Pro Thr 285
TCG
Ser 270 CCA GCG CCG TTG Pro Ala Pro Leu
GGC
Gly 275 CAA CCG AGA TCG Gln Pro Arg Ser TAT CTG GCT Tyr Leu Ala 280 CCC TCG CCG Pro Ser Pro CGC CCC GCG CCG Arg Pro Ala Pro
ACA
Thr 290 GAA CCT CCC CCC Glu Pro Pro Pro CAG CGC Gln Arg 300 AAC TCC GGT CGG Asn Ser Gly Arg
CGT
Arg 305 GCC GAG CGA CGC Ala Glu Arg Arg CAC CCC GAT TTA His Pro Asp Leu 963 1011 GCC CAA CAT GCC Ala Gln His Ala
GCG
Ala 320 GCG CAA CCT GAT Ala Gln Pro Asp
TCA
Ser 325 ATT ACG GCC GCA Ile Thr Ala Ala
ACC
Thr 330 ACT GGC GGT CGT CGC CGC AAG CGT GCA GCG CCG GAT CTC GAC GCG ACA 1059 Thr
CAG
Gln
AAG
Lys
CGC
Arg
CTG
Leu 395
CGC
Arg
GGG
Gly
CAG
Gln
GGA
Gly
GAT
Asp 475
CAC
His
TAC
Tyr
GCC
Ala
GCC
Ala
GGT
Gly 555 Gly
AAA
Lys
CCC
Pro
GGC
Gly 380
TCA
Ser
AAT
Asn
GCT
Ala
GTG
Val
AAC
Asn 460
GTG
Val
ACT
Thr
AGC
Ser
GAT
Asp
GGC
Gly 540
GTC
Val Gly
TCC
Ser
CAG
Gln 365
TGG
Trp
CCC
Pro
CCC
Pro
GGC
Gly
CGG
Arg 445
CTC
Leu
CTT
Leu
AGC
Ser
TCG
Ser
CCT
Pro 525
TTC
Phe
GTG
Val Arg Arg Arg Lys Arg Ala 335
TTA
Leu 350
AAA
Lys
CGA
Arg
GAC
Asp
CGC
Arg
AAA
Lys 430
GCC
Ala
GCC
Ala
GCA
Ala
GTC
Val
GCG
Ala 510
GCG
Ala
TTC
Phe
GTC
Val
AGG
Arg
CCG
Pro
CAT
His
GAG
Glu
GGG
Gly 415
ACC
Thr
GAC
Asp
GAT
Asp
GAA
Glu
AAT
Asn 495
CAG
Gln
TCG
Ser
GAC
Asp
GTG
Val
CCG
Pro
AAG
Lys
TGG
Trp
AAG
Lys 400
TCG
Ser
ACG
Thr
CGG
Arg
CGG
Arg
AAA
Lys 480
GCG
Ala
CGC
Arg
AGG
Arg
CCG
Pro
GCA
Ala 560
GCG
Ala
GCC
Ala
GTG
Val 385
TAC
Tyr
TAT
Tyr
CTG
Leu
ATC
Ile
GTA
Val 465
GAG
Glu
GTC
Val
GCG
Ala
TTT
Phe
CTG
Leu 545
AGT
Ser
GCC
Ala
ACG
Thr 370
CAT
His
GAG
Glu
CAG
Gln
ACA
Thr
CTG
Leu 450
GGG
Gly
CTG
Leu
AAT
Asn
CTC
Leu
TAC
Tyr 530
ACC
Thr
GTC
Val
AAG
Lys 355
AAG
Lys
GCG
Ala
CTG
Leu
ATC
Ile
GCA
Ala 435
GCT
Ala
CGA
Arg
TCG
Ser
CTG
Leu
AGC
Ser 515
AAC
Asn
CGC
Arg
TCA
Ser Ala Pro 340 GGG CCG Gly Pro CCG CCC Pro Pro TTG ACG Leu Thr GAC CTG Asp Leu 405 GCC GTC Ala Val 420 GCG TTG Ala Leu CTA GAC Leu Asp CAA TCG Gln Ser CAC TAC His Tyr 485 GAA GTG Glu Val 500 GAC GCC Asp Ala CTC GTC Leu Val GGC GTG Gly Val ATC GAC Ile Asp 565 Asp Leu Asp Ala Thr 345 AAG GTG Lys Val AAA GTG Lys Val 375 CGA ATC Arg Ile 390 CAC GCT His Ala GTC GGT Val Gly GGG TCG Gly Ser GCG GAT Ala Asp 455 GGC GCG Gly Ala 470 AAC GAC Asn Asp CTG CCG Leu Pro GAC TGG Asp Trp TTG GCT Leu Ala 535 CTG TCC Leu Ser 550 GGC GCA Gly Ala TAC CAA Tyr Gln
AAG
Lys 360
GTG
Val
AAC
Asn
CGA
Arg
CTC
Leu
ACG
Thr 440
CCA
Pro
ACC
Thr
ATC
Ile
GCA
Ala
CAT
His 520
GAT
Asp
ACG
Thr
CAA
Gln
GAT
Asp
AAG
Lys
TCG
Ser
CTG
Leu
GTC
Val
AAA
Lys 425
TTG
Leu
GGC
Gly
ATC
Ile
CGC
Arg
CCG
Pro 505
TTC
Phe
TGT
Cys
GTG
Val
CAG
Gln
TTG
Leu
GTG
Val
CAG
Gln
GGC
Gly
CGC
Arg 410
GGT
Gly
GCT
Ala
GCC
Ala
GCT
Ala
GCA
Ala 490
GAA
Glu
ATC
Ile
GGG
Gly
TCC
Ser
GCG
Ala 570
GCG
Ala 1107 1155 1203 1251 1299 1347 1395 1443 1491 1539 1587 1635 1683 1731 1779 TCG GTC GCG TTG GAC TGG TTG CGC AAC AAC GGT Ser Val Ala Leu Asp Trp Leu Arg Asn Asn Gly SUBSTITUTE SHEET (RULE 26) I1/2" nn4l577 PCT/InVO/nlA0R V/77/1*J 1 59 575 580 585 AGC CGC GCA TGC GTG GTC ATC AAT CAC ATC ATG CCG GGA GAA CCC AAT 1827 Ser Arg Ala Cys Val Val Ile Asn His Ile Met Pro Gly Glu Pro Asn 590 595 600 GTC GCA GTT AAA GAC CTG GTG CGG CAT TTC GAA CAG CAA GTT CAA CCC 1875 Val Ala Val Lys Asp Leu Val Arg His Phe Glu Gin Gin Val Gin Pro 605 610 615 GGC CGG GTC GTG GTC ATG CCG TGG GAC AGG CAC ATT GCG GCC GGA ACC 1923 Gly Arg Val Val Val Met Pro Trp Asp Arg His Ile Ala Ala Gly Thr 620 625 630 GAG ATT TCA CTC GAC TTG CTC GAC CCT ATC TAC AAG CGC AAG GTC CTC 1971 Glu Ile Ser Leu Asp Leu Leu Asp Pro Ile Tyr Lys Arg Lys Val Leu 635 640 645 650 GAA TTG GCC GCA GCG CTA TCC GAC GAT TTC GAG AGG GCT GGA CGT CGT T 2020 Glu Leu Ala Ala Ala Leu Ser Asp Asp Phe Glu Arg Ala Gly Arg Arg 655 660 665 GAGCGCACCT GCTGTTGCTG CTGGTCCTAC 2050 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 666 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Met Ala Ala Asp Tyr Asp Lys Leu Phe Arg Pro His Glu Gly Met Glu 1 5 10 Ala Pro Asp Asp Met Ala Ala Gin Pro Phe Phe Asp Pro Ser Ala Ser 25 Phe Pro Pro Ala Pro Ala Ser Ala Asn Leu Pro Lys Pro Asn Gly Gin 40 Thr Pro Pro Pro Thr Ser Asp Asp Leu Ser Glu Arg Phe Val Ser Ala 55 Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Thr Pro Met 70 75 Pro Ile Ala Ala Gly Glu Pro Pro Ser Pro Glu Pro Ala Ala Ser Lys 90 Pro Pro Thr Pro Pro Met Pro Ile Ala Gly Pro Glu Pro Ala Pro Pro 100 105 110 Lys Pro Pro Thr Pro Pro Met Pro Ile Ala Gly Pro Glu Pro Ala Pro 115 120 125 v SUBSTITUTE SHEET (RULE 26) PCT/DK98/00438 IlWI nn-ifl A "7 %J .J I e 60 Pro Lys Pro Pro Thr Pro Pro Met Pro Ile Ala Gly Pro Ala Pro Thr 130 135 140 Pro Thr Glu Ser Gin Leu Ala Pro Pro Arg Pro Pro 
Thr Pro Gin Thr 145 150 155 160 Pro Thr Gly Ala Pro Gin Gin Pro Glu Ser Pro Ala Pro His Val Pro 165 170 175 Ser His Gly Pro His Gin Pro Arg Arg Thr Ala Pro Ala Pro Pro Trp 180 185 190 Ala Lys Met Pro Ile Gly Glu Pro Pro Pro Ala Pro Ser Arg Pro Ser 195 200 205 Ala Ser Pro Ala Glu Pro Pro Thr Arg Pro Ala Pro Gin His Ser Arg 210 215 220 Arg Ala Arg Arg Gly His Arg Tyr Arg Thr Asp Thr Glu Arg Asn Val 225 230 235 240 Gly Lys Val Ala Thr Gly Pro Ser Ile Gin Ala Arg Leu Arg Ala Glu 245 250 255 Glu Ala Ser Gly Ala Gin Leu Ala Pro Gly Thr Glu Pro Ser Pro Ala 260 265 270 Pro Leu Gly Gin Pro Arg Ser Tyr Leu Ala Pro Pro Thr Arg Pro Ala 275 280 285 Pro Thr Glu Pro Pro Pro Ser Pro Ser Pro Gin Arg Asn Ser Gly Arg 290 295 300 Arg Ala Glu Arg Arg Val His Pro Asp Leu Ala Ala Gin His Ala Ala 305 310 315 320 Ala Gin Pro Asp Ser Ile Thr Ala Ala Thr Thr Gly Gly Arg Arg Arg 325 330 335 Lys Arg Ala Ala Pro Asp Leu Asp Ala Thr Gin Lys Ser Leu Arg Pro 340 345 350 Ala Ala Lys Gly Pro Lys Val Lys Lys Val Lys Pro Gin Lys Pro Lys 355 360 365 Ala Thr Lys Pro Pro Lys Val Val Ser Gin Arg Gly Trp Arg His Trp 370 375 380 Val His Ala Leu Thr Arg Ile Asn Leu Gly Leu Ser Pro Asp Glu Lys 385 390 395 400 Tyr Glu Leu Asp Leu His Ala Arg Val Arg Arg Asn Pro Arg Gly Ser 405 410 415 Tyr Gin Ile Ala Val Val Gly Leu Lys Gly Gly Ala Gly Lys Thr Thr 420 425 430 Leu Thr Ala Ala Leu Gly Ser Thr Leu Ala Gin Val Arg Ala Asp Arg 435 440 445 f u J J SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 Ile Leu Ala Leu Asp Ala Asp Pro Gly 450 455 Val 465 Glu Val Ala Phe Leu 545 Ser Leu Ile Val Pro 625 Leu Ser Gin His Glu 500 Asp Leu Gly Ile Asn 580 Ile Phe Arg Ile Phe 660 Ser Tyr 485 Val Ala Val Val Asp 565 Gly Met Glu His Tyr 645 Glu Gly 470 Asn Leu Asp Leu Leu 550 Gly Tyr Pro Gin Ile 630 Lys Arg Ala Asp Pro Trp Ala 535 Ser Ala Gin Gly Gin 615 Ala Arg Ala Thr Ile Ala His 520 Asp Thr Gin Asp Glu 600 Val Ala Lys Gly Ile Arg Pro 505 Phe Cys Val Gin Leu 585 Pro Gin Gly Val Arg 665 Ala Ala Ala 490 Glu Ile Gly Ser Ala 570 
Ala Asn Pro Thr Leu 650 Arg Asn 460 Val Thr Ser Asp Gly 540 Val Val Arg Ala Arg 620 Ile Leu Leu Leu Ser Ser Pro 525 Phe Val Ala Ala Val 605 Val Ser Ala Ala Ala Val Ala 510 Ala Phe Val Leu Cys 590 Lys Val Leu Ala Arg Lys 480 Ala Arg Arg Pro Ala 560 Trp Val Leu Met Leu 640 Leu INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 1890 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 79...1851 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: GCAGCGATGA GGAGGAGCGG CGCCAACGGC CCGCGCCGGC GACGATGCAA AGCGCAGCGA SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 62 ATG ACT GCT GAA CCG GAA GTA CGG ACG Met Thr Ala Glu Pro Giu Val Arg Thr TGAGGAGGAG CGGCGCGC CTG CGC Leu Arg GAG GTT GTG CTG GAC CAG CTC GGC Glu Val Val Leu Asp Gin Leu Gly ACT GCT Thr Ala 20 GAA TCG CGT Glu Ser Arg GCG TAC AAG Ala Tyr Lys ATG TGG CTG Met Trp Leu CCG CCG TTG ACC Pro Pro Leu Thr CCG GTC CCG CTC AAC GAG CTC ATC Pro Val Pro Leu Asn Giu Leu Ile 207 GCC CGT Ala Arg GAT CGG CGA CAA Asp Arg Arg Gin CTG CGA TTT GCC Leu Arg Phe Ala
CTG
Leu GGG ATC ATG GAT Gly Ile Met Asp 255 303
GAA
Glu CCG CGC CGC CAT Pro Arg Arg His
CTA
Leu CAG GAT GTG TGG Gln Asp Val Trp GTA GAC GTT TCC Val Asp Val Ser GCC GGC GGC AAC Ala Gly Gly Asn GGT ATT GGG GGC GCA CCT CAA ACC GGG Gly Ile Gly Gly Ala Pro Gln Thr Gly AAG TCG Lys Ser ACG CTA CTG Thr Leu Leu CGC AAC GTT Arg Asn Val 110 ACG ATG GTG ATG Thr Met Val Met GCC GCC GCC ACA Ala Ala Ala Thr CAC TCA CCG His Ser Pro 105 GGG CTG ATC Gly Leu Ile CAG TTC TAT TGC Gln Phe Tyr Cys GAC CTA GGT GGC Asp Leu Gly Gly
GGC
Gly 120 TAT CTC Tyr Leu 125 GAA AAC CTT CCA Glu Asn Leu Pro GTC GGT GGG GTA Val Gly Gly Val AAT CGG TCC GAG Asn Arg Ser Glu
CCC
Pro 140 GAC AAG GTC AAC Asp Lys Val Asn GTG GTC GCA GAG Val Val Ala Glu
ATG
Met 150 CAA GCC GTC ATG Gln Ala Val Met CAA CGG GAA ACC Gln Arg Glu Thr TTC AAG GAA CAC Phe Lys Glu His
CGA
Arg 165 GTG GGC TCG ATC Val Gly Ser Ile GGG ATG Gly Met 170 TAC CGG CAG Tyr Arg Gln TAC GGC GAC Tyr Gly Asp 190 CGT GAC GAT CCA Arg Asp Asp Pro
AGT
Ser 180 CAA CCC GTT GCG Gln Pro Val Ala TCC GAT CCA Ser Asp Pro TTT GTC GGC Phe Val Gly GTC TTT CTG ATC Val Phe Leu Ile GAC GGA TGG CCC Asp Gly Trp Pro GAG TTC Glu Phe 205 CCC GAC CTT GAG Pro Asp Leu Glu CAG GTT CAA GAT Gln Val Gln Asp GCC GCC CAG GGG Ala Ala Gln Gly GGG TTC GGC GTC Gly Phe Gly Val GTC ATC ATC TCC Val Ile Ile Ser
ACG
Thr 230 CCA CGC TGG ACA Pro Arg Trp Thr CTG AAG TCG CGT GTT CGC GAC TAC CTC GGC ACC AAG ATC GAG TTC CGG Leu Lys Ser Arg Val 240 Arg Asp Tyr Leu Gly 245 Thr Lys Ile Glu Phe Arg 250 CTT GGT GAC Leu Gly Asp CCG GCG AAT Pro Ala Asn 270 AAT GAA ACC CAG Asn Glu Thr Gln GAC CGG ATT ACC Asp Arg Ile Thr CGC GAG ATC Arg Glu Ile 265 CAC CAT CTG His His Leu CGT CCG GGT CGG Arg Pro Gly Arg GTG TCG ATG GAA Val Ser Met Glu
AAG
Lys 280 ATG ATC Met Ile 285 GGC GTG CCC AGG Gly Val Pro Arg
TTC
Phe 290 GAC GGC GTG CAC Asp Gly Val His
AGC
Ser 295 GCC GAT AAC CTG Ala Asp Asn Leu
GTG
Val 300 GAG GCG ATC ACC Glu Ala Ile Thr
GCG
Ala 305 GGG GTG ACG CAG Gly Val Thr Gln GCT TCC CAG CAC Ala Ser Gln His
ACC
Thr 315 975 1023 1071 GAA CAG GCA CCT Glu Gln Ala Pro
CCG
Pro 320 GTG CGG GTC CTG Val Arg Val Leu GAG CGT ATC CAC CTG CAC Glu Arg Ile His Leu His 330 GAA CTC GAC Glu Leu Asp TGG GAG ATT Trp Glu Ile 350
CCG
Pro 335 AAC CCG CCG GGA Asn Pro Pro Gly
CCA
Pro 340 GAG TCC GAC TAC Glu Ser Asp Tyr CGC ACT CGC Arg Thr Arg 345 CCG GCT CAC Pro Ala His CCG ATC GGC TTG Pro Ile Gly Leu
CGC
Arg 355 GAG ACG GAC CTG Glu Thr Asp Leu TGC CAC Cys His 365 ATG CAC ACG AAC Met His Thr Asn CAC CTA CTG ATC His Leu Leu Ile GGT GCG GCC AAA Gly Ala Ala Lys 1119 1167 1215 1263 1311
TCG
Ser 380
CGA
Arg GGC AAG ACG ACC Gly Lys Thr Thr AAC AGT CCC CAG Asn Ser Pro Gln 400 GCC CAC GCG ATC Ala His Ala Ile
GCG
Ala 390 CGC GCC ATT TGT Arg Ala Ile Cys CAG GTG CGG TTC Gln Val Arg Phe
ATG
Met 405 CTC GCG GAC TAC Leu Ala Asp Tyr CGC TCG Arg Ser 410 GGC CTG CTG Gly Leu Leu GCG GTG CCG GAC Ala Val Pro Asp CAT CTG CTG GGC His Leu Leu Gly GCC GGC GCG Ala Gly Ala 425 GCA CTG GCG Ala Leu Ala 1359 ATC AAC CGC AAC AGC GCG TCG Ile Asn Arg Asn Ser Ala Ser 430 GAC GAG GCC GCT Asp Glu Ala Ala
CAA
Gln 440 1407 GTC AAC Val Asn 445 CTG AAG AAG CGG Leu Lys Lys Arg
TTG
Leu 450 CCG CCG ACC GAC CTG ACG ACG GCG CAG Pro Pro Thr Asp Leu Thr Thr Ala Gln 455 1455
CTA
Leu 460 CGC TCG CGT TCG Arg Ser Arg Ser TGG AGC GGA TTT Trp Ser Gly Phe
GAC
Asp 470 GTC GTG CTT CTG Val Val Leu Leu
GTC
Val 475 1503 GAO GAT TGG CAC ATG Asp Asp Trp His Met ATO GTG GGT GOC GOC GGG GGG ATG COG COG ATG Ile Val Gly Ala Ala Gly Gly Met Pro Pro Met 1551 SUB3STITuE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 480 48! GCA CCG CTG GCC CCG TTA TTG CCG GCG GC( Ala Pro Leu Ala Pro Leu Leu Pro Ala Al 495 500 ATC ATT GTC ACC TGT CAG ATG AGC CAG GC" Ile Ile Val Thr Cys Gin Met Ser Gin Al 510 515 AAG TTC GTC GGC GCC GCA TTC GGG TCG GG( Lys Phe Val Gly Ala Ala Phe Gly Ser Gl 525 530 TCG GGC GAG AAG CAG GAA TTC CCA TCC AG' Ser Gly Glu Lys Gin Glu Phe Pro Ser Set 540 545 CGC CCC CCT GGC CAG GCA TTT CTC GTC TC( Arg Pro Pro Gly Gin Ala Phe Leu Val Se] 560 56! ATC CAG GCC CCC TAC ATC GAG CCT CCA GAj Ile Gin Ala Pro Tyr Ile Glu Pro Pro Gl 575 580 CCA AGC GCC GGT TAAGATTATT TCATTGCCGG Pro Ser Ala Gly 590 INFORMATION FOR SEQ ID NO: 72: SEQUENCE CHARACTERISTICS: LENGTH: 591 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID Met Thr Ala Glu Pro Glu Val Arg Thr Le 1 5 1 Gin Leu Gly Thr Ala Glu Ser Arg Ala Ty: Leu Thr Asn Pro Val Pro Leu Asn Glu Le Gin Pro Leu Arg Phe Ala Leu Gly Ile Me 55 Leu Gin Asp Val Trp Gly Val Asp Val Se 70 Gly Ile Gly Gly Ala Pro Gin Thr Gly Ly 9 34 5 490 3 GCA GAT ATC GGG TTG a Ala Asp Ile Gly Leu 505 I TAC AAG GCA ACC ATG a Tyr Lys Ala Thr Met 520 C GCT CCG ACA ATG TTC y Ala Pro Thr Met Phe 535 T GAG TTC AAG GTC AAG r Glu Phe Lys Val Lys 550 3 CCA GAC GGC AAA GAG r Pro Asp Gly Lys Glu 5 570 A GAA GTG TTC GCA GCA i Glu Val Phe Ala Ala 585 TGTAGCAGGA CCCGAGCTC
CAC
His
GAC
Asp
CTT
Leu
CGG
Arg 555
GTC
Val
CCC
Pro 1599 1647 1695 1743 1791 1839 1890 NO: 72: u Arg Glu 0 r Lys Met u Ile Ala t Asp Glu r Gly Ala s Ser Thr 0 SUBSTITUTE SHEET (RULE 26) WOn 099/2477 PCT/DK98/00438 Met Val Met Ser Ala Ala Ala Thr His Ser Pro Arg Asn Val Gin Phe 100 105 110 Tyr Cys Ile Asp Leu Gly Gly Gly Gly Leu Ile Tyr Leu Glu Asn Leu 115 120 125 Pro His Val Gly Gly Val Ala Asn Arg Ser Glu Pro Asp Lys Val Asn 130 135 140 Arg Val Val Ala Giu Met Gin Ala Val Met Arg Gin Arg Glu Thr Thr 145 150 155 160 Phe Lys Glu His Arg Val Gly Ser Ile Gly Met Tyr Arg Gin Leu Arg 165 170 175 Asp Asp Pro Ser Gin Pro Val Ala Ser Asp Pro Tyr Gly Asp Val Phe 180 185 190 Leu Ile Ile Asp Gly Trp Pro Gly Phe Val Gly Glu Phe Pro Asp Leu 195 200 205 Glu Gly Gin Val Gin Asp Leu Ala Ala Gin Gly Leu Gly Phe Gly Val 210 215 220 His Val Ile Ile Ser Thr Pro Arg Trp Thr Glu Leu Lys Ser Arg Val 225 230 235 240 Arg Asp Tyr Leu Gly Thr Lys Ile Glu Phe Arg Leu Gly Asp Val Asn 245 250 255 Glu Thr Gin Ile Asp Arg Ile Thr Arg Glu Ile Pro Ala Asn Arg Pro 260 265 270 Gly Arg Ala Val Ser Met Glu Lys His His Leu Met Ile Gly Val Pro 275 280 285 Arg Phe Asp Gly Val His Ser Ala Asp Asn Leu Val Glu Ala Ile Thr 290 295 300 Ala Gly Val Thr Gin Ile Ala Ser Gin His Thr Glu Gin Ala Pro Pro 305 310 315 320 Val Arg Val Leu Pro Glu Arg Ile His Leu His Glu Leu Asp Pro Asn 325 330 335 Pro Pro Gly Pro Glu Ser Asp Tyr Arg Thr Arg Trp Glu Ile Pro Ile 340 345 350 Gly Leu Arg Glu Thr Asp Leu Thr Pro Ala His Cys His Met His Thr 355 360 365 Asn Pro His Leu Leu Ile Phe Gly Ala Ala Lys Ser Gly Lys Thr Thr 370 375 380 Ile Ala His Ala Ile Ala Arg Ala Ile Cys Ala Arg Asn Ser Pro Gin 385 390 395 400 Gin Val Arg Phe Met Leu Ala Asp Tyr Arg Ser Gly Leu Leu Asp Ala 405 410 415 SUBSTITUTE SHEET (RULE 26) W 00/' 4A 77 PCT/DK98/00438 66 Val Pro Asp Thr His Leu Leu Gly Ala Gly Ala Ile Asn Arg Asn Ser 420 425 430 Ala Ser Leu Asp Glu Ala Ala Gln Ala Leu Ala Val Asn Leu Lys Lys 435 440 445 Arg Leu Pro Pro Thr Asp Leu Thr Thr Ala Gin Leu Arg Ser Arg Ser 450 455 460 Trp Trp Ser 
Gly Phe Asp Val Val Leu Leu Val Asp Asp Trp His Met 465 470 475 480 Ile Val Gly Ala Ala Gly Gly Met Pro Pro Met Ala Pro Leu Ala Pro 485 490 495 Leu Leu Pro Ala Ala Ala Asp Ile Gly Leu His Ile Ile Val Thr Cys 500 505 510 Gln Met Ser Gln Ala Tyr Lys Ala Thr Met Asp Lys Phe Val Gly Ala 515 520 525 Ala Phe Gly Ser Gly Ala Pro Thr Met Phe Leu Ser Gly Glu Lys Gln 530 535 540 Glu Phe Pro Ser Ser Glu Phe Lys Val Lys Arg Arg Pro Pro Gly Gln 545 550 555 560 Ala Phe Leu Val Ser Pro Asp Gly Lys Glu Val Ile Gln Ala Pro Tyr 565 570 575 Ile Glu Pro Pro Glu Glu Val Phe Ala Ala Pro Pro Ser Ala Gly 580 585 590
INFORMATION FOR SEQ ID NO: 73: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: Asp Pro Val Asp Asp Ala Phe Ile Ala Lys Leu Asn Thr Ala Gly 1 5 10
INFORMATION FOR SEQ ID NO: 74: SEQUENCE CHARACTERISTICS: LENGTH: 14 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (ix) Feature: NAME/KEY: Other LOCATION: 14 OTHER INFORMATION: Xaa is unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: Asp Pro Val Asp Ala Ile Ile Asn Leu Asp Asn Tyr Gly Xaa
INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (ix) Feature: NAME/KEY: Other LOCATION: OTHER INFORMATION: Xaa is unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Ala Glu Met Lys Xaa Phe Lys Asn Ala Ile Val Gln Glu Ile Asp 1 5 10
INFORMATION FOR SEQ ID NO: 76: SEQUENCE CHARACTERISTICS: LENGTH: 14 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (ix) FEATURE: NAME/KEY: Other LOCATION: 3...3 OTHER INFORMATION: Ala is Ala or Gln NAME/KEY: Other LOCATION: 7...7 OTHER INFORMATION: Thr is Gly or Thr (ix) Feature: NAME/KEY: Other LOCATION: 11 OTHER INFORMATION: Xaa is unknown (xi) SEQUENCE
DESCRIPTION: SEQ ID NO: 76: Val Ile Ala Gly Met Val Thr His Ile His Xaa Val Ala Gly 1 5
INFORMATION FOR SEQ ID NO: 77: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: Thr Asn Ile Val Val Leu Ile Lys Gln Val Pro Asp Thr Trp Ser 1 5 10
INFORMATION FOR SEQ ID NO: 78: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: Ala Ile Glu Val Ser Val Leu Arg Val Phe Thr Asp Ser Asp Gly 1 5 10
INFORMATION FOR SEQ ID NO: 79: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: Ala Lys Leu Ser Thr Asp Glu Leu Leu Asp Ala Phe Lys Glu Met 1 5 10
INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (ix) FEATURE: NAME/KEY: Other LOCATION: 4...4 OTHER INFORMATION: Asp is Asp or Glu (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Asp Pro Ala Asp Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr 1 5 10
INFORMATION FOR SEQ ID NO: 81: SEQUENCE CHARACTERISTICS: LENGTH: 50 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: Ala Glu Asp Val Arg Ala Glu Ile Val Ala Ser Val Leu Glu Val Val 1 5 10 Val Asn Glu Gly Asp Gln Ile Asp Lys Gly Asp Val Val Val Leu Leu 25 Glu Ser Met Tyr Met Glu Ile Pro Val Leu Ala Glu Ala Ala Gly Thr 40 Val Ser
INFORMATION FOR SEQ ID NO: 82: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE:
amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: Thr Thr Ser Pro Asp Pro Tyr Ala Ala Leu Pro Lys Leu Pro Ser 1 5 10
INFORMATION FOR SEQ ID NO: 83: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: Thr Glu Tyr Glu Gly Pro Lys Thr Lys Phe His Ala Leu Met Gln 1 5 10
INFORMATION FOR SEQ ID NO: 84: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: Thr Thr Ile Val Ala Leu Lys Tyr Pro Gly Gly Val Val Met Ala 1 5 10
INFORMATION FOR SEQ ID NO: 85: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (ix) FEATURE: NAME/KEY: Other LOCATION: OTHER INFORMATION: Xaa is unknown (ix) FEATURE: NAME/KEY: Other LOCATION: OTHER INFORMATION: Xaa is unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: Ser Phe Pro Tyr Phe Ile Ser Pro Glu Xaa Ala Met Arg Glu Xaa 1 5 10
INFORMATION FOR SEQ ID NO: 86: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide FRAGMENT TYPE: N-terminal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: Thr His Tyr Asp Val Val Val Leu Gly Ala Gly Pro Gly Gly Tyr 1 5 10
INFORMATION FOR SEQ ID NO: 87: SEQUENCE CHARACTERISTICS: LENGTH: 450 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: Other (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 107...400 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: AGCCCGGTAA TCGAGTTCGG GCAATGCTGA CCATCGGGTT 
TGTTTCCGGC TATAACCGAA CGGTTTGTGT ACGGGATACA AATACAGGGA GGGAAGAAGT AGGCAA ATG GAA AAA Met Glu Lys 115 163 ATG TCA Met Ser CAT GAT CCG ATC GCT GCC GAC ATT GGC ACG CAA GTG AGC GAC His Asp Pro Ile Ala Ala Asp Ile Gly Thr Gin Val Ser Asp
AAC
Asn GCT CTG CAC GGC Ala Leu His Gly ACG GCC GGC TCG ACG GCG CTG ACG TCG Thr Ala Gly Ser Thr Ala Leu Thr Ser 211 259 ACC GGG CTG GTT CCC GCG GGG GCC GAT Thr Gly Leu Val Pro Ala Gly Ala Asp ACG GCG TTC ACA TCG GAG GGC ATC CAA Thr Ala Phe Thr Ser Glu Gly Ile Gin GTC TCC GCC CAA Val Ser Ala Gin GCG GCG Ala Ala TTG CTG GCT Leu Leu Ala TCC AAT GCA TCG Ser Asn Ala Ser 307 GCC CAA GAC Ala Gin Asp CAG CTC CAC CGT Gin Leu His Arg GGC GAA GCG GTC Gly Glu Ala Val GAC GTC GCC Asp Val Ala 355 CGC ACC Arg Thr TAT TCG CAA ATC Tyr Ser Gin Ile GAC GGC GCC GCC Asp Gly Ala Ala GTC TTC GCC TAATA Val Phe Ala GGCCCCCAAC ACATCGGAGG GAGTGATCAC CATGCTGTGG CACGC INFORMATION FOR SEQ ID NO: 88: SEQUENCE CHARACTERISTICS: LENGTH: 98 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal SUBSTITUTE SHEET (RULE 26) wn 99/2477 PCT/DK98/00438 72 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: Met Glu Lys Met Ser His Asp Pro Ile Ala Ala Asp Ile Gly Thr Gin 1 5 10 Val Ser Asp Asn Ala Leu His Gly Val Thr Ala Gly Ser Thr Ala Leu 25 Thr Ser Val Thr Gly Leu Val Pro Ala Gly Ala Asp Glu Val Ser Ala 40 Gin Ala Ala Thr Ala Phe Thr Ser Glu Gly Ile Gin Leu Leu Ala Ser 55 Asn Ala Ser Ala Gin Asp Gin Leu His Arg Ala Gly Glu Ala Val Gin 70 75 Asp Val Ala Arg Thr Tyr Ser Gin Ile Asp Asp Gly Ala Ala Gly Val 90 Phe Ala INFORMATION FOR SEQ ID NO: 89: SEQUENCE CHARACTERISTICS: LENGTH: 460 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 37...453 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: GCAACCGGCT TTTCGATCAG CTGAGACATC AGCGGC GTG CGG GTC AAC GAC CCA 54 Met Arg Val Asn Asp Pro 1 CCT GCG CCA GGT AGC GAC TCC GCG CGC AGC AGG CCC GCG CCC GCG CTG 102 Pro Ala Pro Gly Ser Asp Ser Ala Arg Ser Arg Pro Ala Pro Ala Leu 15 GGG CCT GAT CCA CCA GCC AGC GGA TGG TTC GAC AGC GGA CTG GTG CCG 150 Gly Pro Asp Pro Pro Ala Ser Gly Trp Phe Asp Ser Gly Leu Val Pro 30 AGC AGG CCC ATC TGC GCG GCT TCC 
TCG TCG GCT GGG TTG CCG CCG CCG 198 Ser Arg Pro Ile Cys Ala Ala Ser Ser Ser Ala Gly Leu Pro Pro Pro 45 GTG CCG CCC ACC TGG CTG AAC AAC GAC GTC ACC TGC TGC AGC GGC TGG 246 Val Pro Pro Thr Trp Leu Asn Asn Asp Val Thr Cys Cys Ser Gly Trp 60 65 GTC AGC TGC TGC ATC GGG CCG CTC ATC TCA CCC AGT TGG CCG AGG GTC 294 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 73 Val Ser Cys Cys Ile Gly Pro Leu Ile Ser Pro Ser Trp Pro Arg Val 80 TGG GTA GCC GCC GGC GGC AAC TGG CCA ACC GGT GTT GAG CTG CCA GGG 342 Trp Val Ala Ala Gly Gly Asn Trp Pro Thr Gly Val Glu Leu Pro Gly 95 100 GAG GGC ATT CCG AAG ATC GGG TTC GTC GTG CTC TGG CTC GCG CCG GGA 390 Glu Gly Ile Pro Lys Ile Gly Phe Val Val Leu Trp Leu Ala Pro Gly 105 110 115 TCA AGG ATC GAC GCC ATC GGC TCG AGC TTC TCG AAA AGC GTG TTA ACC 438 Ser Arg Ile Asp Ala Ile Gly Ser Ser Phe Ser Lys Ser Val Leu Thr 120 125 130 GCG GTC TCG GCC TGG TAGACCT 460 Ala Val Ser Ala Trp 135 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 139 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Met Arg Val Asn Asp Pro Pro Ala Pro Gly Ser Asp Ser Ala Arg Ser 1 5 10 Arg Pro Ala Pro Ala Leu Gly Pro Asp Pro Pro Ala Ser Gly Trp Phe 25 Asp Ser Gly Leu Val Pro Ser Arg Pro Ile Cys Ala Ala Ser Ser Ser 40 Ala Gly Leu Pro Pro Pro Val Pro Pro Thr Trp Leu Asn Asn Asp Val 55 Thr Cys Cys Ser Gly Trp Val Ser Cys Cys Ile Gly Pro Leu Ile Ser 70 75 Pro Ser Trp Pro Arg Val Trp Val Ala Ala Gly Gly Asn Trp Pro Thr 90 Gly Val Glu Leu Pro Gly Glu Gly Ile Pro Lys Ile Gly Phe Val Val 100 105 110 Leu Trp Leu Ala Pro Gly Ser Arg Ile Asp Ala Ile Gly Ser Ser Phe 115 120 125 Ser Lys Ser Val Leu Thr Ala Val Ser Ala Trp 130 135 SUBSTITUTE SHEET (RULE 26) WO 99/24577 74 INFORMATION FOR SEQ ID NO: 91: SEQUENCE CHARACTERISTICS: LENGTH: 1200 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: PCT/DK98/00438 NAME/KEY: Coding Sequence LOCATION: 
28...1140 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: TAATAGGCCC CCAACACATC GGAGGGA GTG ATC ACC ATG CTG TGG CAC GCA ATG Met Ile Thr Met Leu Trp His Ala Met
CCA
Pro CCG GAG CTA AAT Pro Glu Leu Asn GCA CGG CTG ATG Ala Arg Leu Met
GCC
Ala GGC GCG GGT CCG Gly Ala Gly Pro CCA ATG CTT GCG Pro Met Leu Ala GCC GCG GGA TGG Ala Ala Gly Trp ACG CTT TCG GCG Thr Leu Ser Ala GCT CTG Ala Leu GAC GCT CAG Asp Ala Gln GCC TGG ACT Ala Trp Thr GTC GAG TTG ACC Val Glu Leu Thr
GCG
Ala CGC CTG AAC TCT Arg Leu Asn Ser CTG GGA GAA Leu Gly Glu GCA ACG CCG Ala Thr Pro GGA GGT GGC AGC Gly Gly Gly Ser AAG GCG CTT GCG Lys Ala Leu Ala ATG GTG GTC TGG CTA CAA Met Val Val Trp Leu Gln GCG TCA ACA CAG Ala Ser Thr Gln AAG ACC CGT GCG Lys Thr Arg Ala
ATG
Met CAG GCG ACG GCG Gln Ala Thr Ala GCC GCG GCA TAC Ala Ala Ala Tyr
ACC
Thr 100 CAG GCC ATG GCC Gln Ala Met Ala
ACG
Thr 105 ACG CCG TCG CTG Thr Pro Ser Leu GAG ATC GCC GCC Glu Ile Ala Ala
AAC
Asn 115 CAC ATC ACC CAG His Ile Thr Gln GCC GTC Ala Val 120 CTT ACG GCC Leu Thr Ala ACC GAG ATG Thr Glu Met 140 AAC TTC TTC GGT Asn Phe Phe Gly AAC ACG ATC CCG Asn Thr Ile Pro ATC GCG TTG Ile Ala Leu 135 GCC CTG GCA Ala Leu Ala 438 486 GAT TAT TTC ATC Asp Tyr Phe Ile ATG TGG AAC CAG Met Trp Asn Gln ATG GAG Met Glu 155 GTC TAC CAG GCC GAG ACC GCG GTT AAC Val Tyr Gln Ala Glu Thr Ala Val Asn CTT TTC GAG AAG Leu Phe Glu Lys
CTC GAG CCG ATG GCG TCG ATC CTT GAT CCC GGC GCG AGC CAG AGC ACG
Leu Glu Pro Met Ala Ser Ile Leu Asp Pro Gly Ala Ser Gln Ser Thr 170 175 180 185
ACG AAC CCG ATC TTC GGA ATG CCC TCC CCT GGC AGC TCA ACA CCG GTT 630
Thr Asn Pro Ile Phe Gly Met Pro Ser Pro Gly Ser Ser Thr Pro Val 190 195 200
GGC CAG TTG CCG CCG GCG GCT ACC CAG ACC CTC GGC CAA CTG GGT GAG 678
Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr Leu Gly Gln Leu Gly Glu 205 210 215
ATG AGC GGC CCG ATG CAG CAG CTG ACC CAG CCG CTG CAG CAG GTG ACG 726
Met Ser Gly Pro Met Gln Gln Leu Thr Gln Pro Leu Gln Gln Val Thr 220 225 230
TCG TTG TTC AGC CAG GTG GGC GGC ACC GGC GGC GGC AAC CCA GCC GAC 774
Ser Leu Phe Ser Gln Val Gly Gly Thr Gly Gly Gly Asn Pro Ala Asp 235 240 245
GAG GAA GCC GCG CAG ATG GGC CTG CTC GGC ACC AGT CCG CTG TCG AAC 822
Glu Glu Ala Ala Gln Met Gly Leu Leu Gly Thr Ser Pro Leu Ser Asn 250 255 260 265
CAT CCG CTG GCT GGT GGA TCA GGC CCC AGC GCG GGC GCG GGC CTG CTG 870
His Pro Leu Ala Gly Gly Ser Gly Pro Ser Ala Gly Ala Gly Leu Leu 270 275 280
CGC GCG GAG TCG CTA CCT GGC GCA GGT GGG TCG TTG ACC CGC ACG CCG 918
Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly Ser Leu Thr Arg Thr Pro 285 290 295
CTG ATG TCT CAG CTG ATC GAA AAG CCG GTT GCC CCC TCG GTG ATG CCG 966
Leu Met Ser Gln Leu Ile Glu Lys Pro Val Ala Pro Ser Val Met Pro 300 305 310
GCG GCT GCT GCC GGA TCG TCG GCG ACG GGT GGC GCC GCT CCG GTG GGT 1014
Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly Gly Ala Ala Pro Val Gly 315 320 325
GCG GGA GCG ATG GGC CAG GGT GCG CAA TCC GGC GGC TCC ACC AGG CCG 1062
Ala Gly Ala Met Gly Gln Gly Ala Gln Ser Gly Gly Ser Thr Arg Pro 330 335 340 345
GGT CTG GTC GCG CCG GCA CCG CTC GCG CAG GAG CGT GAA GAA GAC GAC 1110
Gly Leu Val Ala Pro Ala Pro Leu Ala Gln Glu Arg Glu Glu Asp Asp 350 355 360
GAG GAC GAC TGG GAC GAA GAG GAC GAC TGG TGAGCTCCCG TAATGACAAC AGA 1163
Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp 365 370
CTTCCCGGCC ACCCGGGCCG GAAGACTTGC CAACATT 1200
INFORMATION FOR SEQ ID NO: 92: SEQUENCE CHARACTERISTICS: LENGTH: 371 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:
Met Ile Thr Met Leu Trp His Ala Met Pro Pro Glu Leu Asn Thr Ala 1 5 10 Arg Leu Met Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala 25 Gly Trp Gln Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu 40 Thr Ala Arg Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser 55 Asp Lys Ala Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr 70 75 Ala Ser Thr Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala 90 Ala Ala Tyr Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile 100 105 110 Ala Ala Asn His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe 115 120 125 Gly Ile Asn Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile 130 135 140 Arg Met Trp Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu 145 150 155 160 Thr Ala Val Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile 165 170 175 Leu Asp Pro Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met 180 185 190 Pro Ser Pro Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala 195 200 205 Thr Gln Thr Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln 210 215 220 Leu Thr Gln Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly 225 230 235 240 Gly Thr Gly Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly 245 250 255 Leu Leu Gly Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser 260 265 270 Gly Pro Ser Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly 275 280 285 Ala Gly Gly Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu 290 295 300 Lys Pro Val Ala Pro Ser Val Met Pro Ala Ala Ala 
Ala Gly Ser Ser 305 310 315 320 Ala Thr Gly Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gin Gly 325 330 335 Ala Gln Ser Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro 340 345 350 Leu Ala Gin Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu 355 360 365 Asp Asp Trp 370 INFORMATION FOR SEQ ID NO: 93: SEQUENCE CHARACTERISTICS: LENGTH: 1000 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 46...969 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: GACGCGACAC AGAAATCCTT AAGGCCGGCG GCCAAGGGGC CGAAG GTG AAG AAG GTG 57 Met Lys Lys Val 1 AAG CCC CAG AAA CCG AAG GCC ACG AAG CCG CCC AAA GTG GTG TCG CAG 105 Lys Pro Gin Lys Pro Lys Ala Thr Lys Pro Pro Lys Val Val Ser Gin 10 15 CGC GGC TGG CGA CAT TGG GTG CAT GCG TTG ACG CGA ATC AAC CTG GGC 153 Arg Gly Trp Arg His Trp Val His Ala Leu Thr Arg Ile Asn Leu Gly 30 CTG TCA CCC GAC GAG AAG TAC GAG CTG GAC CTG CAC GCT CGA GTC CGC 201 Leu Ser Pro Asp Glu Lys Tyr Glu Leu Asp Leu His Ala Arg Val Arg 45 CGC AAT CCC CGC GGG TCG TAT CAG ATC GCC GTC GTC GGT CTC AAA GGT 249 Arg Asn Pro Arg Gly Ser Tyr Gin Ile Ala Val Val Gly Leu Lys Gly 60 GGG GCT GGC AAA ACC ACG CTG ACA GCA GCG TTG GGG TCG ACG TTG GCT 297 Gly Ala Gly Lys Thr Thr Leu Thr Ala Ala Leu Gly Ser Thr Leu Ala 75 CAG GTG CGG GCC GAC CGG ATC CTG GCT CTA GAC GCG GAT CCA GGC GCC 345 Gin Val Arg Ala Asp Arg Ile Leu Ala Leu Asp Ala Asp Pro Gly Ala 90 95 100 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 GGA AAC CTC GCC Gly Asn Leu Ala CGG GTA GGG CGA Arg Val Gly Arg CAA TCG Gin Ser 110 GGC GCG ACC ATC GCT Gly Ala Thr Ile Ala 115 GAT GTG CTT GCA GAA AAA GAG CTG Asp Val Leu Ala Glu Lys Glu Leu 120
TCG
Ser 125 CAC TAC AAC GAC His Tyr Asn Asp ATC CGC GCA Ile Arg Ala 130 GCA CCG GAA Ala Pro Glu CAC ACT AGC His Thr Ser 135 GTC AAT GCG GTC Val Asn Ala Val
AAT
Asn 140 CTG GAA GTG CTG Leu Glu Val Leu 489 TAC AGC Tyr Ser 150 TCG GCG CAG CGC Ser Ala Gin Arg CTC AGC GAC GCC Leu Ser Asp Ala
GAC
Asp 160 TGG CAT TTC ATC Trp His Phe Ile
GCC
Ala 165 GAT CCT GCG TCG Asp Pro Ala Ser
AGG
Arg 170 TTT TAC AAC CTC GTC TTG GCT GAT TGT Phe Tyr Asn Leu Val Leu Ala Asp Cys 175
GGG
Gly 180 GCC GGC TTC TTC Ala Gly Phe Phe
GAC
Asp 185 CCG CTG ACC CGC Pro Leu Thr Arg GTG CTG TCC ACG Val Leu Ser Thr GTG TCC Val Ser 195 GGT GTC GTG Gly Val Val TCG GTC GCG Ser Val Ala 215
GTC
Val 200 GTG GCA AGT GTC Val Ala Ser Val
TCA
Ser 205 ATC GAC GGC GCA Ile Asp Gly Ala CAA CAG GCG Gin Gin Ala 210 GAT TTG GCG Asp Leu Ala TTG GAC TGG TTG Leu Asp Trp Leu
CGC
Arg 220 AAC AAC GGT TAC Asn Asn Gly Tyr AGC CGC Ser Arg 230 GCA TGC GTG GTC Ala Cys Val Val
ATC
Ile 235 AAT CAC ATC ATG Asn His Ile Met
CCG
Pro 240 GGA GAA CCC AAT Gly Glu Pro Asn
GTC
Val 245 GCA GTT AAA GAC Ala Val Lys Asp
CTG
Leu 250 GTG CGG CAT TTC Val Arg His Phe CAG CAA GTT CAA Gin Gin Val Gin
CCC
Pro 260 GGC CGG GTC GTG Gly Arg Val Val
GTC
Val 265 ATG CCG TGG GAC Met Pro Trp Asp
AGG
Arg 270 CAC ATT GCG GCC His Ile Ala Ala GGA ACC Gly Thr 275 GAG ATT TCA Glu Ile Ser GAA TTG GCC Glu Leu Ala 295
CTC
Leu 280 GAC TTG CTC GAC Asp Leu Leu Asp
CCT
Pro 285 ATC TAC AAG CGC Ile Tyr Lys Arg AAG GTC CTC Lys Val Leu 290 GGA CGT CGT T Gly Arg Arg GCA GCG CTA TCC Ala Ala Leu Ser GAT TTC GAG AGG Asp Phe Glu Arg GAGCGCACCT GCTGTTGCTG CTGGTCCTAC INFORMATION FOR SEQ ID NO: 94: SEQUENCE CHARACTERISTICS: LENGTH: 308 amino acids TYPE: amino acid STRANDEDNESS: single 1000 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 79 TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: Met Lys Lys Val Lys Pro Gin Lys Pro Lys Ala Thr Lys Pro Pro Lys Val Ile Ala Gly Ser Asp Ala Asp Pro 145 Trp Ala Ser Ala Gin 225 Gly Gin Val Asn Arg Leu Thr Pro Thr Ile 130 Ala His Asp Thr Gin 210 Asp Giu Vali Ser Leu Val Lys Leu Gly Ile Arg Pro Phe Cys Vai 195 Gin Leu Pro Gin Gin Gly Arg Gly Ala Ala 100 Ala Ala Giu Ile Giy 180 Ser Ala Ala Asn Pro Arg Leu Arg Gly Gin Gly Asp His Tyr Ala 165 Ala Gly Ser Ser Val 245 Gly Gly Ser Asn Ala 70 Val1 Asn Val Thr Ser i50 Asp Gly Val Vai Arg 230 Ala Arg Trp Pro Pro 55 Gly Arg Leu Leu Ser 135 Ser Pro Phe Val Ala 215 Ala Val Val Arg Asp 40 Arg Lys Ala Ala Ala 120 Vai Ala Ala Phe Val 200 Leu Cys Lys Val His 25 Giu Gly Thr Asp Asp 105 Giu Asn Gin Ser Asp 185 Val1 Asp Val Asp Val 265 Trp, Lys Ser Thr Arg 90 Arg Lys Ala Arg Arg 170 Pro Ala Trp Val Leu 250 Met Val Tyr Tyr Leu 75 Ile Val Giu Val Ala 155 Phe Leu Ser Leu Ile 235 Val Pro His Giu Gin Thr Leu Gly Leu Asn 140 Leu Tyr Thr Val Arg 220 Asn Arg Trp Ala Leu Ile Ala Ala Arg Ser 125 Leu Ser Asn Arg Ser 205 Asn His His Asp Leu Asp Ala Ala Leu Gin 110 His Glu Asp Leu Gly 190 Ile Asn Ile Phe Arg 270 Thr Leu Val Leu Asp Ser Tyr Val Ala Val 175 Val Asp Gly Met Giu 255 His Arg His Val Gly Ala Gly Asn Leu Asp 160 Leu Leu Gly Tyr Pro 240 Gin Ile Ala Ala Gly Thr Giu Ile Ser Leu Asp Leu Leu Asp Pro Ile Tyr Lys SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCTIDK98/00438 Arg Lys Val Leu Glu Leu Ala Ala Ala Leu Ser Asp Asp Phe Glu Arg 290 295 300 Ala Gly Arg Arg 305 INFORMATION FOR SEQ ID NO: SEQUENCE CHARACTERISTICS: LENGTH: 
34 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: AAGAGTAGAT CTATGATGGC CGAGGATGTT CGCG 34
INFORMATION FOR SEQ ID NO: 96: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: CGGCGACGAC GGATCCTACC GCGTCGG 27
INFORMATION FOR SEQ ID NO: 97: SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: CCTTGGGAGA TCTTTGGACC CCGGTTGC 28
INFORMATION FOR SEQ ID NO: 98: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: GACGAGATCT TATGGGCTTA CTGAC 25
INFORMATION FOR SEQ ID NO: 99: SEQUENCE CHARACTERISTICS: LENGTH: 33 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: CCCCCCAGAT CTGCACCACC GGCATCGGCG GGC 33
INFORMATION FOR SEQ ID NO: 100: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: GCGGCGGATC CGTTGCTTAG CCGG 24
INFORMATION FOR SEQ ID NO: 101: SEQUENCE CHARACTERISTICS: LENGTH: 32 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: CCGGCTGAGA TCTATGACAG AATACGAAGG GC 32
INFORMATION FOR SEQ ID NO: 102: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: CCCCGCCAGG GAACTAGAGG CGGC 24
INFORMATION FOR SEQ ID NO: 103: SEQUENCE CHARACTERISTICS: LENGTH: 38 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: CTGCCGAGAT CTACCACCAT TGTCGCGCTG AAATACCC 38
INFORMATION FOR SEQ ID 
NO: 104: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: CGCCATGGCC TTACGCGCCA ACTCG 25
INFORMATION FOR SEQ ID NO: 105: SEQUENCE CHARACTERISTICS: LENGTH: 32 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: GGCGGAGATC TGTGAGTTTT CCGTATTTCA TC 32
INFORMATION FOR SEQ ID NO: 106: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: CGCGTCGAGC CATGGTTAGG CGCAG 25
INFORMATION FOR SEQ ID NO: 107: SEQUENCE CHARACTERISTICS: LENGTH: 32 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: GAGGAAGATC TATGACAACT TCACCCGACC CG 32
INFORMATION FOR SEQ ID NO: 108: SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: CATGAAGCCA TGGCCCGCAG GCTGCATG 28
INFORMATION FOR SEQ ID NO: 109: SEQUENCE CHARACTERISTICS: LENGTH: 33 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: GGCCGAGATC TGTGACCCAC TATGACGTCG TCG 33
INFORMATION FOR SEQ ID NO: 110: SEQUENCE CHARACTERISTICS: LENGTH: 36 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: GGCGCCCATG GTCAGAAATT GATCATGTGG CCAACC 36
INFORMATION FOR SEQ ID NO: 111: SEQUENCE CHARACTERISTICS: LENGTH: 33 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: CCGGGAGATC TATGGCAAAG CTCTCCACCG ACG 33
INFORMATION FOR SEQ ID NO: 112: SEQUENCE CHARACTERISTICS: LENGTH: 32 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: CGCTGGGCAG AGCTACTTGA CGGTGACGGT GG 32
INFORMATION FOR SEQ ID NO: 113: SEQUENCE CHARACTERISTICS: LENGTH: 36 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: GGCCCAGATC TATGGCCATT GAGGTTTCGG TGTTGC 36
INFORMATION FOR SEQ ID NO: 114: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: CGCCGTGTTG CATGGCAGCG CTGAGC 26
INFORMATION FOR SEQ ID NO: 115: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: GGACGTTCAA GCGACACATC GCCG 24
INFORMATION FOR SEQ ID NO: 116: SEQUENCE CHARACTERISTICS: LENGTH: 24 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: CAGCACGAAC GCGCCGTCGA TGGC 24
INFORMATION FOR SEQ ID NO: 117: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: ACAGATCTGT GACGGACATG AACCCG 26
INFORMATION FOR SEQ ID NO: 118: SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: TTTTCCATGG TCACGGGCCC CCGGTACT 28
INFORMATION FOR SEQ ID NO: 119: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: ACAGATCTGT GCCCATGGCA CAGATA 26
INFORMATION FOR SEQ ID NO: 120: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: TTTAAGCTTC TAGGCGCCCA GCGCGGC 27
INFORMATION FOR SEQ ID NO: 121: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: ACAGATCTGC 
GCATGCGGAT CCGTGT 26 INFORMATION FOR SEQ ID NO: 122: SEQUENCE CHARACTERISTICS: SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 86 LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: TTTTCCATGG TCATCCGGCG TGATCGAG 28 INFORMATION FOR SEQ ID NO: 123: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: ACAGATCTGT AATGGCAGAC TGTGAT 26 INFORMATION FOR SEQ ID NO: 124: SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: TTTTCCATGG TCAGGAGATG GTGATCGA 28 INFORMATION FOR SEQ ID NO: 125: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: ACAGATCTGC CGGCTACCCC GGTGCC 26 INFORMATION FOR SEQ ID NO: 126: SEQUENCE CHARACTERISTICS: LENGTH: 28 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: TTTTCCATGG CTATTGCAGC TTTCCGGC 28 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 87 INFORMATION FOR SEQ ID NO: 127: SEQUENCE CHARACTERISTICS: LENGTH: 50 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: Ala Glu Asp Val Arg Ala Glu Ile Val Ala Ser Val Leu Glu Val Val 1 5 10 Val Asn Glu Gly Asp Gin Ile Asp Lys Gly Asp Val Val Val Leu Leu 25 Glu Ser Met Tyr Met Glu Ile Pro Val Leu Ala Glu Ala Ala Gly Thr 40 Val Ser INFORMATION FOR SEQ ID NO: 128: SEQUENCE CHARACTERISTICS: LENGTH: 49 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: Ala Glu Asp Val Arg Ala Glu Ile Val Ala Ser Val Leu Glu Val Val 1 5 10 Val Asn Glu Gly Asp Gin Ile Asp Lys Gly Asp Val Val Val Leu Leu 25 Glu Ser Met Met Glu Ile Pro Val Leu Ala 
Glu Ala Ala Gly Thr Val 40 Ser INFORMATION FOR SEQ ID NO: 129: SEQUENCE CHARACTERISTICS: LENGTH: 50 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: Ala Glu Asp Val Arg Ala Glu Ile Val Ala Ser Val Leu Glu Val Val SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCTIDK98/00438 88 1 5 10 Val Asn Glu Gly Asp Gin Ile Asp Lys Gly Asp Val Val Val Leu Leu 25 Glu Ser Met Lys Met Glu Ile Pro Val Leu Ala Glu Ala Ala Gly Thr 40 Val Ser INFORMATION FOR SEQ ID NO: 130: SEQUENCE CHARACTERISTICS: LENGTH: 33 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: CCGGGAGATC TATGGCAAAG CTCTCCACCG ACG 33 INFORMATION FOR SEQ ID NO: 131: SEQUENCE CHARACTERISTICS: LENGTH: 32 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: CGCTGGGCAG AGCTACTTGA CGGTGACGGT GG 32 INFORMATION FOR SEQ ID NO: 132: SEQUENCE CHARACTERISTICS: LENGTH: 36 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: GGCGCCGGCA AGCTTGCCAT GACAGAGCAG CAGTGG 36 INFORMATION FOR SEQ ID NO: 133: SEQUENCE CHARACTERISTICS: LENGTH: 26 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 89 CGAACTCGCC GGATCCCGTG TTTCGC 26 INFORMATION FOR SEQ ID NO: 134: SEQUENCE CHARACTERISTICS: LENGTH: 32 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: GGCAACCGCG AGATCTTTCT CCCGGCCGGG GC 32 INFORMATION FOR SEQ ID NO: 135: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: GGCAAGCTTG CCGGCGCCTA ACGAACT 27 INFORMATION FOR SEQ ID NO: 136: SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: 
linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: GGACCCAGAT CTATGACAGA GCAGCAGTGG INFORMATION FOR SEQ ID NO: 137: SEQUENCE CHARACTERISTICS: LENGTH: 47 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: CCGGCAGCCC CGGCCGGGAG AAAAGCTTTG CGAACATCCC AGTGACG 47 INFORMATION FOR SEQ ID NO: 138: SEQUENCE CHARACTERISTICS: LENGTH: 44 base pairs TYPE: nucleic acid STRANDEDNESS: single SUBSTITUTE SHEET (RULE 26) WO 99/24577 TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: GTTCGCAAAG CTTTTCTCCC GGCCGGGGCT GCCGGTCGAG TACC INFORMATION FOR SEQ ID NO: 139: SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: CCTTCGGTGG ATCCCGTCAG INFORMATION FOR SEQ ID NO: 140: SEQUENCE CHARACTERISTICS: LENGTH: 450 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: PCT/DK98/00438 NAME/KEY: Coding Sequence LOCATION: 68...346 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: TGGCGCTGTC ACCGAGGAAC CTGTCAATGT CGTCGAGCAG TACTGAACCG TTCCGAGAAA GGCCAGC ATG AAC GTC ACC GTA TCC ATT CCG ACC ATC CTG CGG CCC CAC Met Asn Val Thr Val Ser Ile Pro Thr Ile Leu Arg Pro His
ACC
Thr GGC GGC CAG AAG Gly Gly Gln Lys GTC TCG GCC AGC Val Ser Ala Ser GAT ACC TTG GGT Asp Thr Leu Gly GTC ATC AGC Val Ile Ser ATG GAC CCG Met Asp Pro GTC AAC GAC Val Asn Asp GAC CTG Asp Leu GAG GCC AAC TAT TCG GGC ATT TCC GAG Glu Ala Asn Tyr Ser Gly Ile Ser Glu 40 CGC CTG Arg Leu TCT TCC CCA GGT AAG TTG Ser Ser Pro Gly Lys Leu 55 CAC CGC TTC GTG AAC ATC TAC His Arg Phe Val Asn Ile Tyr GAG GAC GTG CGG Glu Asp Val Arg
TTC
Phe TCC GGC GGC TTG Ser Gly Gly Leu ACC GCG ATC Thr Ala Ile GCT GAC GGT GAC TCG GTC ACC ATC CTC CCC GCC GTG GCC GGT GGG TGAGC Ala Asp Gly Asp Ser Val Thr Ile Leu Pro Ala Val Ala Gly Gly 351 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 91 85 GGAGCACATG ACACGATACG ACTCGCTGTT GCAGGCCTTG GGCAACACGC CGCTGGTTGG 411 CCTGCAGCGA TTGTCGCCAC GCTGGGATGA CGGGCGAGA 450 INFORMATION FOR SEQ ID NO: 141: SEQUENCE CHARACTERISTICS: LENGTH: 93 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: Met Asn Val Thr Val Ser Ile Pro Thr Ile Leu Arg Pro His Thr Gly 1 5 10 Gly Gin Lys Ser Val Ser Ala Ser Gly Asp Thr Leu Gly Ala Val Ile 25 Ser Asp Leu Glu Ala Asn Tyr Ser Gly Ile Ser Glu Arg Leu Met Asp 40 Pro Ser Ser Pro Gly Lys Leu His Arg Phe Val Asn Ile Tyr Val Asn 55 Asp Glu Asp Val Arg Phe Ser Gly Gly Leu Ala Thr Ala Ile Ala Asp 70 75 Gly Asp Ser Val Thr Ile Leu Pro Ala Val Ala Gly Gly INFORMATION FOR SEQ ID NO: 142: SEQUENCE CHARACTERISTICS: LENGTH: 480 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 88...381 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: GGTGTTCCCG CGGCCGGCTA TGACAACAGT CAATGTGCAT GACAAGTTAC AGGTATTAGG TCCAGGTTCA ACAAGGAGAC AGGCAAC ATG GCA ACA CGT TTT ATG ACG GAT CCG 114 Met Ala Thr Arg Phe Met Thr Asp Pro 1 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 92 CAC GCG ATG CGG GAC ATG GCG GGC CGT TTT GAG GTG CAC GCC CAG ACG 162 His Ala Met Arg Asp Met Ala Gly Arg Phe Glu Val His Ala Gin Thr 15 20 GTG GAG GAC GAG GCT CGC CGG ATG TGG GCG TCC GCG CAA AAC ATC TCG 210 Val Glu Asp Glu Ala Arg Arg Met Trp Ala Ser Ala Gin Asn Ile Ser 35 GGC GCG GGC TGG AGT GGC ATG GCC GAG GCG ACC TCG CTA GAC ACC ATG 258 Gly Ala Gly Trp Ser Gly Met Ala Glu Ala Thr Ser Leu Asp Thr Met 50 GCC CAG ATG AAT CAG GCG TTT CGC AAC ATC GTG AAC ATG CTG CAC GGG 306 Ala Gin Met Asn Gin Ala Phe Arg Asn Ile 
Val Asn Met Leu His Gly 65 GTG CGT GAC GGG CTG GTT CGC GAC GCC AAC AAC TAC GAG CAG CAA GAG 354 Val Arg Asp Gly Leu Val Arg Asp Ala Asn Asn Tyr Glu Gin Gin Glu 80 CAG GCC TCC CAG CAG ATC CTC AGC AGC TAACGTCAGC CGCTGCAGCA CAATACT 408 Gin Ala Ser Gin Gin Ile Leu Ser Ser TTTACAAGCG AAGGAGAACA GGTTCGATGA CCATCAACTA TCAGTTCGGT GATGTCGACG 468 CTCATGGCGC CA 480 INFORMATION FOR SEQ ID NO: 143: SEQUENCE CHARACTERISTICS: LENGTH: 98 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: Met Ala Thr Arg Phe Met Thr Asp Pro His Ala Met Arg Asp Met Ala 1 5 10 Gly Arg Phe Glu Val His Ala Gin Thr Val Glu Asp Glu Ala Arg Arg 25 Met Trp Ala Ser Ala Gin Asn Ile Ser Gly Ala Gly Trp Ser Gly Met 40 Ala Glu Ala Thr Ser Leu Asp Thr Met Ala Gin Met Asn Gin Ala Phe 55 Arg Asn Ile Val Asn Met Leu His Gly Val Arg Asp Gly Leu Val Arg 70 75 Asp Ala Asn Asn Tyr Glu Gin Gin Glu Gin Ala Ser Gin Gin Ile Leu 90 Ser Ser SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 93 INFORMATION FOR SEQ ID NO: 144: SEQUENCE CHARACTERISTICS: LENGTH: 940 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 86...868 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: GCCCCAGTCC TCGATCGCCT CATCGCCTTC ACCGGCCGCC AGCCGACCGC AGGCCACGTG TCCGCCACCT AACGAAAGGA TGATC ATG CCC AAG AGA Met Pro Lys Arg AGC GAA TAC AGG Ser Glu Tyr Arg GAT CAG TCC GCC Asp Gin Ser Ala
CAA
Gln
GCC
Ala
GGC
Gly ACG CCG AAC TGG Thr Pro Asn Trp
GTC
Val 15 GAC CTT CAG ACC Asp Leu Gin Thr
ACC
Thr 20 AAA AAG TTC TAC Lys Lys Phe Tyr TCG TTG TTC GGC Ser Leu Phe Gly
TGG
Trp GGT TAC GAC GAC Gly Tyr Asp Asp AAC CCG Asn Pro GTC CCC GGA Val Pro Gly GCC GTG GCC Ala Val Ala GGT GGG GTC TAT Gly Gly Val Tyr ATG GCC ACG CTG Met Ala Thr Leu AAC GGC GAA Asn Gly Glu GAG GGG ATG Glu Gly Met GCC ATC GCA CCG Ala Ile Ala Pro CCC CCG GGT GCA Pro Pro Gly Ala CCG CCG Pro Pro ATC TGG AAC ACC Ile Trp Asn Thr ATC GCG GTG GAC Ile Ala Val Asp GTC GAT GCG GTG Val Asp Ala Val
GTG
Val GAC AAG GTG GTG Asp Lys Val Val GGG GGC GGG CAG Gly Gly Gly Gin ATG ATG CCG GCC Met Met Pro Ala
TTC
Phe 105 GAC ATC GGC GAT Asp Ile Gly Asp GGC CGG ATG TCG TTC ATC ACC GAT CCG Gly Arg Met Ser Phe Ile Thr Asp Pro 115 ACC GGC Thr Gly 120 GCT GCC GTG Ala Ala Val CTA TGG CAG GCC Leu Trp Gin Ala CGG CAC ATC GGA Arg His Ile Gly GCG ACG TTG Ala Thr Leu 135 ACG GAC AAG Thr Asp Lys 496 GTC AAC GAG ACG GGC ACG CTC Val Asn Glu Thr Gly Thr Leu 140
ATC
Ile 145 TGG AAC GAA CTG Trp Asn Glu Leu 544 CCG GAT TTG GCG CTA GCG TTC TAC GAG GCT GTG GTT GGC CTC ACC CAC SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCTIDK98/00438 Pro
TCG
Ser 170
GGC
Gly
CCG
Pro
GCG
Ala
GAC
Asp
GCG
Asp 155
AGC
Ser
GAC
Asp
AAT
Asn
GCC
Ala
ATT
Ile 235
ATC
Leu
ATG
Met
GCG
Ala
CAT
His
AAA
Lys 220
CCG
Pro
TTC
Ala
GAG
Glu
GAA
Glu
TGG
Trp 205
GCC
Ala
TCG
Ser
AGT
Leu
ATA
Ile
GTC
Val 190
CAC
His
GCC
Ala
GTG
Val
GTG
Ala
GCT
Ala 175
GGC
Gly
GTC
Val
GCA
Ala
GGC
Gly
TTG
Phe 160
GCG
Ala
GGC
Gly
TAC
Tyr
GCG
Ala
CGG
Arg 240
AAG
Tyr
GGC
Gly
TGT
Cys
TTT
Phe
GGC
Gly 225
TTC
Phe
CCC
Val
TAT
Tyr 180
CCG
Pro
GAT
Asp
GTC
Val
TTG
Leu
CAG
Val 165
CGG
Arg
CCG
Pro
GAC
Asp
ATT
Ile
TCC
Ser 245
CAA
Gin Gly Leu Thr His GTG CTC AAG GCC Val Leu Lys Ala 185 ATG CCC GGC GTG Met Pro Gly Val 200 GCC GAC GCC ACG Ala Asp Ala Thr 215 GCG GAA CCG GCT Ala Glu Pro Ala 230 GAT CCG CAG GGC Asp Pro Gin Gly TAGGGAGCAT CCCGGG Ala Ile Phe Ser Val Leu Lys Pro Ala Pro Gin CAGGCCCGCC GGCCGGCAGA TTCGGAGAAT GCTAGAAGCT GCCG INFORMATION FOR SEQ ID NO: 145: SEQUENCE CHARACTERISTICS: LENGTH: 261 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: Met Pro Lys Arg Ser Glu Tyr Arg Gln Gly Thr Pro 1 5 Leu Gin Thr Thr Asp Gin Ser Ala Ala Lys Lys Phe 25 Phe Gly Trp Gly Tyr Asp Asp Asn Pro Val Pro Gly Tyr Ser Met Ala Thr Leu Asn Gly Glu Ala Val Ala 55 Met Pro Pro Gly Ala Pro Glu Gly Met Pro Pro Ile 70 Ile Ala Val Asp Asp Val Asp Ala Val Val Asp Lys Gly Gly Gin Val Met Met Pro Ala Phe Asp Ile Gly CCGGCG CCGCCG Trp Thr Gly Ile Asn Val Ala SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 100 105 110 Met Ser Phe Ile Thr Asp Pro Thr Gly Ala Ala Val Gly Leu Trp Gin 115 120 125 Ala Asn Arg His Ile Gly Ala Thr Leu Val Asn Glu Thr Gly Thr Leu 130 135 140 Ile Trp Asn Glu Leu Leu Thr Asp Lys Pro Asp Leu Ala Leu Ala Phe 145 150 155 160 Tyr Glu Ala Val Val Gly Leu Thr His Ser Ser Met Glu Ile Ala Ala 165 170 175 Gly Gin Asn Tyr Arg Val Leu Lys Ala Gly Asp Ala Glu Val Gly Gly 180 185 190 Cys Met Glu Pro Pro Met Pro Gly Val Pro Asn His Trp His Val Tyr 195 200 205 Phe Ala Val Asp Asp Ala Asp Ala Thr Ala Ala Lys Ala Ala Ala Ala 210 215 220 Gly Gly Gin Val Ile Ala Glu Pro Ala Asp Ile Pro Ser Val Gly Arg 225 230 235 240 Phe Ala Val Leu Ser Asp Pro Gin Gly Ala Ile Phe Ser Val Leu Lys 245 250 255 Pro Ala Pro Gin Gin 260 INFORMATION FOR SEQ ID NO: 146: SEQUENCE CHARACTERISTICS: LENGTH: 280 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 47...247 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: CCGAAAGGCG GTGCACCGCA 
CCCAGAAGAA AAGGAAAGAT CGAGAA ATG CCA CAG
Met Pro Gln 1
GGA ACT GTG AAG TGG TTC AAC GCG GAG AAG GGG TTC GGC TTT ATC GCC 103
Gly Thr Val Lys Trp Phe Asn Ala Glu Lys Gly Phe Gly Phe Ile Ala 10
CCC GAA GAC GGT TCC GCG GAT GTA TTT GTC CAC TAC ACG GAG ATC CAG 151
Pro Glu Asp Gly Ser Ala Asp Val Phe Val His Tyr Thr Glu Ile Gln 25 30
GGA ACG GGC TTC CGC ACC CTT GAA GAA AAC CAG AAG GTC GAG TTC GAG 199
Gly Thr Gly Phe Arg Thr Leu Glu Glu Asn Gln Lys Val Glu Phe Glu 45
ATC GGC CAC AGC CCT AAG GGC CCC CAG GCC ACC GGA GTC CGC TCG CTC T 248
Ile Gly His Ser Pro Lys Gly Pro Gln Ala Thr Gly Val Arg Ser Leu 60
GAGTTACCCC CGCGAGCAGA CGCAAAAAGC CC 280

INFORMATION FOR SEQ ID NO: 147: SEQUENCE CHARACTERISTICS: LENGTH: 67 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:
Met Pro Gln Gly Thr Val Lys Trp Phe Asn Ala Glu Lys Gly Phe Gly 1 5 10
Phe Ile Ala Pro Glu Asp Gly Ser Ala Asp Val Phe Val His Tyr Thr 25
Glu Ile Gln Gly Thr Gly Phe Arg Thr Leu Glu Glu Asn Gln Lys Val 40
Glu Phe Glu Ile Gly His Ser Pro Lys Gly Pro Gln Ala Thr Gly Val 55
Arg Ser Leu

INFORMATION FOR SEQ ID NO: 148: SEQUENCE CHARACTERISTICS: LENGTH: 540 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 105...491 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:
ATCGTGTCGT ATCGAGAACC CCGGCCGGTA TCAGAACGCG CCAGAGCGCA AACCTTTATA
ACTTCGTGTC CCAAATGTGA CGACCATGGA CCAAGGTTCC TGAG ATG AAC CTA CGG 116
Met Asn Leu Arg 1
CGC CAT CAG ACC CTG ACG CTG CGA CTG CTG GCG GCA TCC GCG GGC ATT 164
Arg His Gln Thr Leu Thr Leu Arg Leu Leu Ala Ala Ser Ala Gly Ile 10 15
CTC AGC GCC GCG GCC TTC GCC GCG CCA GCA CAG GCA AAC CCC GTC GAC 212
Leu Ser Ala Ala Ala Phe Ala Ala Pro Ala Gln Ala Asn Pro Val Asp 30
GAC GCG TTC ATC GCC GCG CTG AAC AAT GCC GGC
GTC AAC TAC GGC GAT 260
Asp Ala Phe Ile Ala Ala Leu Asn Asn Ala Gly Val Asn Tyr Gly Asp 45
CCG GTC GAC GCC AAA GCG CTG GGT CAG TCC GTC TGC CCG ATC CTG GCC 308
Pro Val Asp Ala Lys Ala Leu Gly Gln Ser Val Cys Pro Ile Leu Ala 60
GAG CCC GGC GGG TCG TTT AAC ACC GCG GTA GCC AGC GTT GTG GCG CGC 356
Glu Pro Gly Gly Ser Phe Asn Thr Ala Val Ala Ser Val Val Ala Arg 75
GCC CAA GGC ATG TCC CAG GAC ATG GCG CAA ACC TTC ACC AGT ATC GCG 404
Ala Gln Gly Met Ser Gln Asp Met Ala Gln Thr Phe Thr Ser Ile Ala 90 95 100
ATT TCG ATG TAC TGC CCC TCG GTG ATG GCA GAC GTC GCC AGC GGC AAC 452
Ile Ser Met Tyr Cys Pro Ser Val Met Ala Asp Val Ala Ser Gly Asn 105 110 115
CTG CCG GCC CTG CCA GAC ATG CCG GGG CTG CCC GGG TCC TAGGCGTGCG CG 503
Leu Pro Ala Leu Pro Asp Met Pro Gly Leu Pro Gly Ser 120 125
GCTCCTAGCC GGTCCCTAAC GGATCGATCG TGGATGC 540

INFORMATION FOR SEQ ID NO: 149: SEQUENCE CHARACTERISTICS: LENGTH: 129 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:
Met Asn Leu Arg Arg His Gln Thr Leu Thr Leu Arg Leu Leu Ala Ala 1 5 10
Ser Ala Gly Ile Leu Ser Ala Ala Ala Phe Ala Ala Pro Ala Gln Ala 25
Asn Pro Val Asp Asp Ala Phe Ile Ala Ala Leu Asn Asn Ala Gly Val 40
Asn Tyr Gly Asp Pro Val Asp Ala Lys Ala Leu Gly Gln Ser Val Cys 55
Pro Ile Leu Ala Glu Pro Gly Gly Ser Phe Asn Thr Ala Val Ala Ser 70 75
Val Val Ala Arg Ala Gln Gly Met Ser Gln Asp Met Ala Gln Thr Phe
Thr Ser Ile Ala Ile Ser Met Tyr Cys Pro Ser Val Met Ala Asp Val 100 105 110
Ala Ser Gly Asn Leu Pro Ala Leu Pro Asp Met Pro Gly Leu Pro Gly 115 120 125

INFORMATION FOR SEQ ID NO: 150: SEQUENCE CHARACTERISTICS: LENGTH: 400 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 25...354 OTHER INFORMATION: (ix) FEATURE: NAME/KEY: mat_peptide LOCATION: 109..357 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:
ATAGTTTGGG
GAAGGTGTCC ATAA ATG AGG CTG TCG TTG ACC GCA TTG AGC Met Arg Leu Ser Leu Thr Ala Leu Ser GCC GGT GTA GGC GCC GTG GCA ATG TCG TTG ACC GTC GGG GCC Ala Gly Val Gly Ala Val Ala Met Ser Leu Thr Val Gly Ala GGG GTC Gly Val GCC TCC GCA GAT CCC GTG GAC GCG GTC ATT AAC ACC ACC TGC AAT TAC Ala Ser Ala Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr GGG CAG Gly Gin GTA GTA GCT GCG Val Val Ala Ala AAC GCG ACG GAT Asn Ala Thr Asp GGG GCT GCC GCA Gly Ala Ala Ala
CAG
Gln TTC AAC GCC TCA Phe Asn Ala Ser
CCG
Pro 35 GTG GCG CAG TCC TAT TTG CGC AAT TTC Val Ala Gin Ser Tyr Leu Arg Asn Phe GCC GCA CCG CCA Ala Ala Pro Pro CCT CAG Pro Gin CGC GCT GCC ATG GCC GCG Arg Ala Ala Met Ala Ala CAA TTG CAA GCT Gin Leu Gin Ala TCG GTT GCC GGC Ser Val Ala Gly 291 339 GTG CCG GGG GCG GCA CAG TAC ATC Val Pro Gly Ala Ala Gin Tyr Ile
GGC
Gly 70 CTT GTC GAG Leu Val Glu TCC TGC AAC AAC TAT Ser Cys Asn Asn Tyr TAAGCCCATG CGGGCCCCAT CCCGCGACCC GGCATCGTCG SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 99 CCGGGG 400 INFORMATION FOR SEQ ID NO: 151: SEQUENCE CHARACTERISTICS: LENGTH: 110 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: Met Arg Leu Ser Leu Thr Ala Leu Ser Ala Gly Val'Gly Ala Val Ala -28 -25 -20 Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro Val Asp -5 1 Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gin Val Val Ala Ala Leu 10 15 Asn Ala Thr Asp Pro Gly Ala Ala Ala Gin Phe Asn Ala Ser Pro Val 30 Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro Gin Arg 45 Ala Ala Met Ala Ala Gin Leu Gin Ala Val Pro Gly Ala Ala Gin Tyr 60 Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr 75 INFORMATION FOR SEQ ID NO: 152: SEQUENCE CHARACTERISTICS: LENGTH: 990 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 93...890 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: AATAGTAATA TCGCTGTGCG GTTGCAAAAC GTGTGACCGA GGTTCCGCAG TCGAGCGCTG CGGGCCGCCT TCGAGGAGGA CGAACCACAG TC ATG ACG AAC ATC GTG GTC CTG 113 Met Thr Asn Ile Val Val Leu 1 ATC AAG CAG GTC CCA GAT ACC TGG TCG GAG CGC AAG CTG ACC GAC GGC 161 SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 Ile Lys Gin Vai Pro Asp Thr GAT TTC ACG CTG GAC CGC GAG Asp Phe Thr Leu Asp Arg Glu Trp 15 Ser Glu Arg Lys Leu Thr Asp Gly GCC GCC GAC GCG Ala Ala Asp Ala CTG GAC GAG ATC Leu Asp Giu Ile 209 AAC GAG CGC GCC GTG GAG GAA GCG CTA CAG Asn Glu Arg Ala Val Giu Glu Ala Leu Gin 45 GCC GAC GGC ATC GAA GGG TCG GTA ACC GTG Ala Asp Gly Ile Giu Gly Ser Val Thr Val 65
ATT
Ile CGG GAG AAA GAG Arg Glu Lys Glu 257 305 CTG ACG GCG GGC Leu Thr Ala Gly CCC GAG Pro Glu CGC GCC ACC Arg Ala Thr GCC GTC CAC Ala Val His GCG ATC CGC AAG Ala Ile Arg Lys CTG TCG ATG GGT Leu Ser Met Gly GCC GAC AAG Ala Asp Lys GTC ATC CAA Val Ile Gln 353 401 CTA AAG GAC GAC Leu Lys Asp Asp
GGC
Gly ATG CAC GGC TCG Met His Gly Ser ACC GGG Thr Gly 105 TGG GCT TTG GCG Trp Ala Leu Ala GCG TTG GGC ACC Ala Leu Gly Thr GAG GGC ACC GAG Glu Gly Thr Glu
CTG
Leu 120 GTG ATC GCA GGC Val Ile Ala Gly GAA TCG ACC GAC Glu Ser Thr Asp
GGG
Gly 130 GTG GGC GGT GCG Val Gly Gly Ala CCG GCC ATC ATC GCC GAG TAC CTG GGC Pro Ala Ile Ile Ala Glu Tyr Leu Gly 140 CCG CAG CTC ACC Pro Gln Leu Thr CAC CTG His Leu 150 CGC AAA GTG Arg Lys Val GAT GAG GGC Asp Glu Gly 170 ATC GAG GGC GGC Ile Glu Gly Gly
AAG
Lys 160 ATC ACC GGC GAG Ile Thr Gly Glu CGT GAG ACC Arg Glu Thr 165 GTG ATC AGC Val Ile Ser GTA TTC ACC CTC Val Phe Thr Leu
GAG
Glu 175 GCC ACG CTG CCC Ala Thr Leu Pro GTG AAC Val Asn 185 GAG AAG ATC AAC Glu Lys Ile Asn
GAG
Glu 190 CCG CGC TTC CCG Pro Arg Phe Pro
TCC
Ser 195 TTC AAA GGC ATC Phe Lys Gly Ile ATG GCC GCC AAG AAG AAG Met Ala Ala Lys Lys Lys 200 205 GAA GTT ACC GTG Glu Val Thr Val
CTG
Leu 210 ACC CTG GCC GAG Thr Leu Ala Glu GGT GTC GAG AGC Gly Val Giu Ser GAG GTG GGG CTG Glu Val Gly Leu AAC GCC GGA TCC ACC GTG Asn Ala Gly Ser Thr Val 230 CTG GCG TCG Leu Ala Ser
ACG
Thr 235 CCC AAA CCG GCC Pro Lys Pro Ala ACT GCC GGG GAG Thr Ala Gly Glu AAG GTC ACC Lys Val Thr 245 GAC GAG GGT GAA GGC GGC AAC CAG ATC GTG CAG TAC CTG GTT GCC CAG Asp Giu Gly Giu Gly Gly Asn Gin Ile Val Gin Tyr Leu Val Ala Gin SUBSTITUTE SHEET (RULE 28) WO 99/24577 PCT/DK98/00438 101 250 255 260 AAA ATC ATC TAAGACATAC GCACCTCCCA AAGACGAGAG CGATATAACC CATGGCTGA Lys Ile Ile 265 AGTACTGGTG CTCGTTGAGC ACGCTGAAGG CGCGTTAAAG, AAGGTCAGCG C INFORMATION FOR SEQ ID NO: 153: SEQUENCE CHARACTERISTICS: LENGTH: 266 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: Met Thr Asn Ile Val Val Leu Ile Lys Gin Vai Pro Asp Thr Trp Ser Giu Asp Gin Val Leu His Gly Asp Leu 145 Ile Thr Phe Arg Ala Ile Leu Ser Gly Thr Gly 130 Pro Thr Leu Pro Asp Glu Glu Pro 70 Asp Ile Thr Ala His 150 Glu Ile Gly Gly Ile Ala 55 Giu Lys Gin Glu Val1 135 Leu Thr Ser Ile Asp Asn Ala Arg Ala Thr Leu 120 Pro Arg Asp Val Met 200 Gly Phe Glu Asp Ala Val Gly 105 Val Ala Lys Glu Asn 185 Ala Val Thr Arg Gly Thr His Trp Ile Ile Val Gly 170 Glu Ala Glu Asp Arg Val Glu Giu Gly Ala Ile Lys Asp Leu Ala Gly Asn 125 Ala Glu 140 Ile Glu Phe Thr Ile Asn Lys Lys 205 Asp Glu 220 Glu Glu Ser Arg Asp Arg 110 Glu Tyr Gly Leu Giu 190 Glu Val Ala Ala Val Lys Gly Ala Ser Leu Gly Giu 175 Pro Val Giy Ala Leu Thr Ala Met Leu Thr Gly Lys 160 Ala Arg Thr Leu Val Leu Thr Leu Ala Giu Ile SUBSTIE SHEET (RULE 28) WO 99/24577 PCT/DK98/00438 102 Ala Asn Ala Gly Ser Thr Val Leu Ala 225 230 Thr Ala Gly Glu Lys Val Thr Asp Glu 245 Val Gin Tyr Leu Val Ala Gin Lys Ile 260 265 INFORMATION FOR SEQ ID NO: 154: SEQUENCE CHARACTERISTICS: LENGTH: 25 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear Ser Thr Pro Lys Pro Ala Lys 235 240 Gly Glu Gly Gly Asn Gin Ile 250 255 Ile (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: CTGAGATCTA TGAACCTACG GCGCC INFORMATION FOR SEQ ID NO: 155: SEQUENCE CHARACTERISTICS: LENGTH: 35 base pairs TYPE: 
nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: CTCCCATGGT ACCCTAGGAC CCGGGCAGCC CCGGC

INFORMATION FOR SEQ ID NO: 156: SEQUENCE CHARACTERISTICS: LENGTH: 29 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: CTGAGATCTA TGAGGCTGTC GTTGACCGC

INFORMATION FOR SEQ ID NO: 157: SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: CTCCCCGGGC TTAATAGTTG TTGCAGGAGC

INFORMATION FOR SEQ ID NO: 158: SEQUENCE CHARACTERISTICS: LENGTH: 33 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: GCTTAGATCT ATGATTTTCT GGGCAACCAG GTA 33

INFORMATION FOR SEQ ID NO: 159: SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: GCTTCCATGG GCGAGGCACA GGCGTGGGAA

INFORMATION FOR SEQ ID NO: 160: SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: CTGAGATCTA GAATGCCACA GGGAACTGTG

INFORMATION FOR SEQ ID NO: 161: SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: TCTCCCGGGG GTAACTCAGA GCGAGCGGAC

INFORMATION FOR SEQ ID NO: 162: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: CTGAGATCTA TGAACGTCAC CGTATCC 27

INFORMATION FOR SEQ ID NO: 163: SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163: TCTCCCGGGG CTCACCCACC GGCCACG 27

INFORMATION FOR SEQ ID NO: 164: SEQUENCE
CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: CTGAGATCTA TGGCAACACG TTTTATGACG

INFORMATION FOR SEQ ID NO: 165: SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165: CTCCCCGGGT TAGCTGCTGA GGATCTGCTH

INFORMATION FOR SEQ ID NO: 166: SEQUENCE CHARACTERISTICS: LENGTH: 31 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: CTGAAGATCT ATGCCCAAGA GAAGCGAATA C 31

INFORMATION FOR SEQ ID NO: 167: SEQUENCE CHARACTERISTICS: LENGTH: 31 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167: CGGCAGCTGC TAGCATTCTC CGAATCTGCC G 31

INFORMATION FOR SEQ ID NO: 168: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168: Pro Gln Gly Thr Val Lys Trp Phe Asn Ala Glu Lys Gly Phe Gly 1 5 10

INFORMATION FOR SEQ ID NO: 169: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (ix) FEATURE: NAME/KEY: Other LOCATION: OTHER INFORMATION: Xaa is unknown (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169: Asn Val Thr Val Ser Ile Pro Thr Ile Leu Arg Pro Xaa Xaa Xaa 1 5 10

INFORMATION FOR SEQ ID NO: 170: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (ix) FEATURE: NAME/KEY: Other LOCATION: 1 OTHER INFORMATION: Thr Could also be Ala (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170: Thr Arg Phe Met Thr Asp Pro His Ala Met Arg Asp Met Ala Gly 1 5 10

INFORMATION FOR SEQ ID NO: 171: SEQUENCE CHARACTERISTICS: LENGTH: 15 amino acids TYPE: amino acid
STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: None (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: Pro Lys Arg Ser Glu Tyr Arg Gin Gly Thr Pro Asn Trp Val Asp 1 5 10 INFORMATION FOR SEQ ID NO:172: SEQUENCE CHARACTERISTICS: LENGTH: 404 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear PCT/DK98/00438 Met 1 Ile Gin Ser Ala Trp Ser Cys Trp Gly 145 Pro Ser Gly Trp Asn 225 Gly (xi) SEQUENCE Ala Thr Val Asn 5 Glu Gly Arg Ser Val Pro Ser Pro Gly Gly Asn Asn Gin Asp Asp Tyr Tyr Tyr Gin Ser Ser Phe Tyr Ser 100 Gin Thr Tyr Lys 115 Leu Ser Ala Asn 130 Leu Ser Met Ala Gin Gin Phe Ile 165 Gin Gly Met Gly 180 Gly Tyr Lys Ala 195 Glu Arg Asn Asp 210 Thr Arg Leu Trp Gly Ala Asn Ile Arg Phe Ser Ser Asn 70 Gly Asp Trp Arg Gly 150 Tyr Pro Ala Pro Val 230 Pro Ser Arg Ser Arg Met Gly 40 Pro Ala 55 Gly Trp Leu Ser Trp Tyr Glu Thr 120 Ala Val 135 Ser Ser Ala Gly Ser Leu Asp Met 200 Thr Gin 215 Tyr Cys Ala Glu His Pro 25 Arg Val Asp Ile Ser 105 Phe Lys Ala Ser Ile 185 Trp Gin Gly Phe His 10 Gly Asp Tyr Ile Val 90 Pro Leu Pro Met Leu 170 Gly Gly Ile Asn Leu His Leu Ile Leu Asn 75 Met Ala Thr Thr Ile 155 Ser Leu Pro Pro Gly 235 Glu His Pro Lys Leu Thr Pro Cys Ser Gly 140 Leu Ala Ala Ser Lys 220 Thr Asn His Val Val Asp Pro Val Gly Glu 125 Ser Ala Leu Met Ser 205 Leu Pro Phe DESCRIPTION: SEQ ID NO:172: His Glu Gin Gly Ala Gly Lys 110 Leu Ala Ala Leu Gly 190 Asp Val Asn Val His His Tyr Leu Phe Gin Leu Arg Phe Glu Gly Gin Ala Gly Pro Gin Ala Ile Tyr His 160 Asp Pro 175 Asp Ala Pro Ala Ala Asn Glu Leu 240 Arg Ser SUBSTITUTE SHEET (RULE 26) WO 99/24577 PTD9/03 PCT/DK98/00438 Ser Ala Gly Ala 305 Glu Leu Gly Ala Ile 385 Gly Asn Val Ala 290 Gly Ala Leu Gly Thr 370 Ser Met Leu Phe 275 Gin Lys Ala Lys 260 Asn Leu Leu Ala Ala Asn 280 Lys Glu Gln Ser Tyr 360 Asn Met 107 250 Asn Thr Asp Gln Asn 330 Thr Gly Leu Ser Ala Ser Gin 300 Asn Thr Leu Gin Asn 380 Glu Gly 270 Giu Ser Ala I le Ala 350 Lys Ala Asn 255 His Tyr Leu Gly His 335 Ala Trp, Arg Val1 Asp Glu 340 Ser Gly 355 Ala Thr Glu 
Ala Phe Ala INFORMATION FOR SEQ ID NO:173: Wi SEQUENCE CHARACTERISTICS: LENGTH: 403 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (xi) SEQUENCE DESCRIPTION: SEQ ID NO:173: Met Ala Thr Val Asn 1 Ile Glu Gly Glu Ala Ala Leu Leu Asp Gly Gly Ser Ala Thr Ala Ile Ser Giu Gly Met Phe 115 Leu Gin Val 130 Gln Ser Gly 145 Arg Ala Gin Glu Trp Tyr Gin Ser Ser 195 5 Ser Ser Gly Ser Glu Gly Lys Ser Asn Asp 165 Gin Tyr Arg Met Ala Lys Glu 70 Leu Gin Leu Pro Asn 150 Tyr Ser Ser Ser Arg Thr Glu Ile Gin 40 Gin Ser 55 Ala Tyr Asn Asn Ala Met Phe Ser 120 Ser Met 135 Ser Pro Asn Gly Gly Leu Asp Trp 200 Trp Glu 215 His Gin Gly Leu Gin Ala Ala 105 Arg Gly Ala Trp Ser 185 Tyr Thr His Gin Asn Thr Gly Leu 90 Ser Pro Arg Val Asp 170 Ile Ser Phe His Trp Val Lys Val1 75 Gin Thr Gly Asp Tyr 155 Ile Val Pro Leu His Asn Thr Leu Gln Asn Giu Leu Ile 140 Leu Asn Met Ala Thr 220 His Phe Ser Ala Gin Leu Gly Pro 125 Lys Leu Thr Pro Cys 205 Ser His Ala Ile Ala Lys Ala Asn 110 Val Val1 Asp Pro Val 190 Gly Glu His Gly His Ala Trp Arg Val Glu Gin Gly Ala 175 Gly Lys Leu His Ile Ser Trp Asp Thr Thr Tyr Phe Leu 160 Phe Gly Ala Pro Gly Cys Gin Thr Tyr Lys 210 SUBSTITE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 Gin 225 Ile His Pro Ala Ala 305 Asn Leu Ser Asn Trp 385 Leu Leu Gin Gin 275 Gly Glu Thr Gly Asn 355 Val Ala Asn 230 Ala Ile Gly Ala Asp 310 Trp Ile Phe Phe Asn 390 Arg Gly Tyr Pro Ala 295 Pro Val Pro Gin Pro 375 Ala Val Ser Gly 265 Leu Met Gin Cys Glu 345 Ala Asn Lys 108 Lys Ala i 250 Ser Ile Trp Gin Gly 330 Phe Tyr Gly Gly Ser Ala Leu 270 Met Ser Leu Pro Phe 350 Gly Trp Ser Ala Ala 255 Leu Gly Asp Val Asn 335 Val Gly Glu Ser Ala 240 Tyr Asp Asp Pro Ala 320 Glu Arg His Tyr Leu 400 Gly Ala Gly INFORMATION FOR SEQ ID NO:174: SEQUENCE CHARACTERISTICS: LENGTH: 291 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 1...288 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:174: ATG TCG CAG ATT ATG TAC AAC TAT CCG GCG 
Met Ser Gln Ile Met Tyr Asn Tyr Pro Ala
CAG
Gln ATG ATG GCT CAT GCC GGG Met Met Ala His Ala Gly AGC TTG GGG GCC GAT ATC Ser Leu Gly Ala Asp Ile GAC ATG GCC Asp Met Ala GCC AGT GAG Ala Ser Glu ATC ACG TAT Ile Thr Tyr TAT GCG GGC ACG Tyr Ala Gly Thr
CTG
Leu
AGT
Ser GAT ACC GGG Asp Thr Gly GCC GTG CTG Ala Val Leu
TCC
Ser 40
ACC
Thr GCT TGG CAG Ala Trp Gin
GGT
Gly CAG GGC TGG Gin Gly Trp CTG GTG Leu Val
CAG
Gin 55
TCG
Ser CAG TGG AAC Gin Trp Asn
CAG
Gin
CAT
His GCC CTA GAG GAT Ala Leu Glu Asp CGG GCC TAT Arg Ala Tyr
CAG
Gin 70 ATG TCT GGC ACC Met Ser Gly Thr 75 GAG TCC AAC Glu Ser Asn
ATG
ACC
Thr GGC T GCG ATG TTG GCT CGA GAT GGG GCC GAA GCC GCC AAG TGG GGC SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 109 Met Ala Met Leu Ala Arg Asp Gly Ala Glu Ala Ala Lys Trp Gly Gly 90
AG
INFORMATION FOR SEQ ID NO:175: SEQUENCE CHARACTERISTICS: LENGTH: 96 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO: Met Ser Gin Ile Met Tyr Asn Tyr Pro Ala Met 1 5 10 Asp Met Ala Gly Tyr Ala Gly Thr Leu Gin Ser Ala Ser Glu Gln Ala Val Leu Ser Ser Ala Trp Ile Thr Tyr Gin Gly Trp Gin Thr Gin Trp Asn 55 Leu Val Arg Ala Tyr Gin Ser Met Ser Gly Thr 70 75 Met Ala Met Leu Ala Arg Asp Gly Ala Glu Ala 90 INFORMATION FOR SEQ ID NO:176: SEQUENCE CHARACTERISTICS: LENGTH: 363 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 1...360 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO: GTG TCG CAG AGT ATG TAC AGC TAC CCG GCG ATG Met Ser Gin Ser Met Tyr Ser Tyr Pro Ala Met 1 5 10 GAC ATG GCC GGT TAT ACG GGC ACG ACG CAG AGC Asp Met Ala Gly Tyr Thr Gly Thr Thr Gin Ser 25 GCC AGT GAG CGC ACC GCG CCG TCG CGT GCT TGC Ala Ser Glu Arg Thr Ala Pro Ser Arg Ala Cys 40 ATG AGT CAT CAG GAC TGG CAG GCC CAG TGG AAT 175: Met Leu Gin Gin His Ala Ala Asp Thr Glu Asn Gly Gly Ile Gly Asp Thr Gly 176:
ACG
Thr
TTG
Leu GCC AAT GTC GGA Ala Asn Val Gly GGG GCC GAT ATC Gly Ala Asp Ile CAA GGT Gin Gly CAG GCC GAT CTC GGG Asp Leu Gly ATG GAG GCT SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 110 Met Ser CTC GCG Leu Ala His Gin Asp Trp Gin 55
CGG
Arg Ala Gin Trp Asn Gin Ala Met Glu Ala CGG GCC TAC Arg Ala Tyr
CGT
Arg 70 TGC CGG CGA Cys Arg Arg CTA CGC CAG ATC Leu Arg Gin Ile
GTG
Val
GGG
Gly CTG GAA AGG Leu Glu Arg
CCG
Pro GTA GGC GAT TCG Val Gly Asp Ser
TCA
Ser 90
GAC
Asp GAC TGC GGA ACG Asp Cys Gly Thr ATT AGG Ile Arg GGT CCA Gly Pro GTG GGG TCG Val Gly Ser GCC ACG GCC Ala Thr Ala 115 CGG GGT CGG TGG Arg Gly Arg Trp
CTG
Leu 105
TAA
CCG CGC CAT Pro Arg His GAC GCC GGA Asp Ala Gly
GAC
Asp 120 INFORMATION FOR SEQ ID NO:177: SEQUENCE CHARACTERISTICS: LENGTH: 120 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal Met 1 Asp Ala Met Leu Val Val Ala (xi) SEQUENCE Ser Gin Ser Met 5 Met Ala Gly Tyr Ser Glu Arg Thr Ser His Gin Asp Ala Arg Ala Tyr Leu Glu Arg Pro Gly Ser Phe Arg 100 Thr Ala Ala Asp 115 DESCRIPTION: SEQ ID Tyr Ser Tyr Pro Ala 10 Thr Gly Thr Thr Gin 25 Ala Pro Ser Arg Ala Trp Gin Ala Gin Trp 55 Arg Arg Cys Arg Arg 70 Val Gly Asp Ser Ser 90 Gly Arg Trp Leu Asp 105 Ala Gly Asp 120 NO:177: Met Thr Ser Leu Cys Gin Asn Gin Ala Leu Asp Cys Pro Arg Asn Ala Asp Met Gin Thr Ala 110 Val Asp Leu Glu Ile Ile Gly INFORMATION FOR SEQ ID NO:178: SEQUENCE CHARACTERISTICS: LENGTH: 297 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence SUBSTITUTE SHEET (RULE 26) WO 99/24577 111 LOCATION: 1...294 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:178: PCT/DK98/00438 ATG GCC TCG CGT TTT ATG ACG GAT CCG CAC GCG ATG CGG GAC Ala Met Arg Asp Met 1
GGC
Gly Ala Ser Arg Phe Met Thr Asp Pro His 10
GTG
Val ATG GCG Met Ala CGT TTT GAG Arg Phe Glu GTG CAC GCC CAG Val His Ala Gin
ACG
Thr 25
TCG
Ser GAG GAC GAG Glu Asp Glu ATG TGG GCG Met Trp Ala GCC GAG GCG Ala Glu Ala
TCC
Ser GCG CAA AAC Ala Gin Asn GGC GCG GGC Gly Ala Gly
TGG
Trp
AAT
Asn GCT CGC CGG Ala Arg Arg AGT GGC ATG Ser Gly Met CAG GCG TTT Gln Ala Phe ACC TCG CTA Thr Ser Leu ATG ACC CAG Met Thr Gin CGC AAC Arg Asn
ATG
Met
GAC
Asp ATC GTG AAC Ile Val Asn CAC GGG GTG His Gly Val GGG CTG GTT Gly Leu Val
GAC
Asp GCC AAC AAC Ala Asn Asn
TAC
Tyr GAA CAG CAA GAG Glu Gin Gin Glu TCC CAG CAG Ser Gin Gin
ATC
Ile AGC AGC TGA Ser Ser INFORMATION FOR SEQ ID NO:179: Met 1 Gly Met Ala Arg Asp Ser SEQUENCE CHARACTERISTICS: LENGTH: 98 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID Ala Ser Arg Phe Met Thr Asp Pro His 5 10 Arg Phe Glu Val His Ala Gin Thr Val 25 Trp Ala Ser Ala Gin Asn Ile Ser Gly Glu Ala Thr Ser Leu Asp Thr Met Thr 55 Asn Ile Val Asn Met Leu His Gly Val 70 Ala Asn Asn Tyr Glu Gin Gin Glu Gin 90 Ser NO:179: Ala Met Glu Asp Ala Gly Gln Met Arg Asp 75 Ala Ser Asp Ala Ser Gin Leu Gin Met Arg Gly Ala Val Ile Ala Arg Met Phe Arg Leu SUBSTITUTE SHEET (RULE 26) WO 99/24577 112 INFORMATION FOR SEQ ID NO:180: SEQUENCE CHARACTERISTICS: LENGTH: 297 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 1...294 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:180: ATG GCC TCA CGT TTT ATG ACG GAT CCG CAC GCG ATG Met Ala Ser Arg Phe Met Thr Asp Pro His Ala Met PCT/DK98/00438 CGG GAC ATG Arg Asp Met GGC CGT TTT GAG GTG CAC GCC CAG ACG GTG GAG GAC GAG Gly Arg Phe Glu Val His Ala Gin Thr Val Glu Asp Glu GCT CGC CGG Ala Arg Arg AGT GGC ATG Ser Gly Met
ATG
Met TGG GCG TCC GCG CAA AAC Trp Ala Ser Ala Gin Asn
ATT
Ile TCC GGT GCG GGC Ser Gly Ala Gly GCC GAG GCG ACC TCG CTA Ala Glu Ala Thr Ser Leu ACC ATG GCC CAG Thr Met Ala Gin AAT CAG GCG TTT Asn Gin Ala Phe
CGC
Arg AAC ATC GTG AAC Asn Ile Val Asn
ATG
Met 70 CTG CAC GGG GTG Leu His Gly Val
CGT
Arg GAC GGG CTG GTT CGC Asp Gly Leu Val Arg GAC GCC AAC AAC Asp Ala Asn Asn
TAC
Tyr GAG CAG CAA GAG Glu Gln Gln Glu GCC TCC CAG CAG Ala Ser Gln Gln ATC CTC Ile Leu AGC AGC TAA 297 Ser Ser

INFORMATION FOR SEQ ID NO:181: SEQUENCE CHARACTERISTICS: LENGTH: 98 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal (xi) SEQUENCE DESCRIPTION: SEQ ID NO:181:
Met Ala Ser Arg Phe Met Thr Asp Pro His Ala Met Arg Asp Met Ala 1 5 10
Gly Arg Phe Glu Val His Ala Gln Thr Val Glu Asp Glu Ala Arg Arg 25
Met Trp Ala Ser Ala Gln Asn Ile Ser Gly Ala Gly Trp Ser Gly Met 40
Ala Glu Ala Thr Ser Leu Asp Thr Met Ala Gln Met Asn Gln Ala Phe 55
Arg Asn Ile Val Asn Met Leu His Gly Val Arg Asp Gly Leu Val Arg 70 75
Asp Ala Asn Asn Tyr Glu Gln Gln Glu Gln Ala Ser Gln Gln Ile Leu 90
Ser Ser

INFORMATION FOR SEQ ID NO:182: SEQUENCE CHARACTERISTICS: LENGTH: 297 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ix) FEATURE: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:182:
ATGGCCTCAC GTTTTATGAC GGATCCGCAT GCGATGCGGG ACATGGCGGG CCGTTTTGAG
GTGCACGCCC AGACGGTGGA GGACGAGGCT CGCCGGATGT GGGCGTCCGC GCAAAACATT 120
TCCGGTGCGG GCTGGAGTGG CATGGCCGAG GCGACCTCGC TAGACACCAT GACCTAGATG 180
AATCAGGCGT TTCGCAACAT CGTGAACATG CTGCACGGGG TGCGTGACGG GCTGGTTCGC 240
GACGCCAACA ACTACGAACA GCAAGAGCAG GCCTCCCAGC AGATCCTGAG CAGCTAG 297

INFORMATION FOR SEQ ID NO:183: SEQUENCE CHARACTERISTICS: LENGTH: 98 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: peptide (xi) SEQUENCE DESCRIPTION: SEQ ID NO:183:
Met Ala Ser Arg Phe Met Thr Asp Pro His Ala Met Arg Asp Met Ala 1 5 10
Gly Arg Phe Glu Val His Ala Gln Thr Val Glu Asp Glu Ala Arg Arg 25
Met Trp Ala Ser Ala Gln Asn Ile Ser Gly Ala Gly Trp Ser Gly Met 40
Ala Glu Ala Thr Ser Leu Asp Thr Met Thr Gln Met Asn Gln Ala Phe 55
Arg Asn Ile Val Asn Met Leu His Gly Val Arg Asp Gly Leu Val Arg 70 75
Asp Ala Asn Asn Tyr Glu Gln Gln Glu Gln Ala Ser Gln Gln Ile Leu 90
Ser Ser

INFORMATION FOR SEQ ID
NO:184: SEQUENCE CHARACTERISTICS: SUBSTITUTE SHEET (RULE 26) WO 99/24577 PCT/DK98/00438 114 LENGTH: 297 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear (ii) MOLECULE TYPE: cDNA (ix) FEATURE: NAME/KEY: Coding Sequence LOCATION: 1...294 OTHER INFORMATION: (xi) SEQUENCE DESCRIPTION: SEQ ID NO:184: ATG ACC TCG CGT TTT ATG ACG GAT CCG CAC GCG ATG CGG GAC Met Thr Ser Arg Phe Met Thr Asp Pro His Ala Met Arg Asp ATG GCG Met Ala GGC CGT TTT Gly Arg Phe ATG TGG GCG Met Trp Ala
GAG
Glu GTG CAC GCC CAG Val His Ala Gln
ACG
Thr 25 GTG GAG GAC GAG Val Glu Asp Glu GCT CGC CGG Ala Arg Arg TCC GCG CAA AAC ATT TCC GGC GCG GGC TGG AGT GGC ATG Ser Ala Gln Asn Ile Ser Gly Ala Gly Trp Ser Gly Met GCC GAG Ala Glu GCG ACC TCG CTA Ala Thr Ser Leu ACC ATG ACC CAG Thr Met Thr Gln
ATG
Met AAT CAG GCG TTT Asn Gln Ala Phe
CGC
Arg AAC ATC GTG AAC Asn Ile Val Asn CTG CAC GGG GTG Leu His Gly Val
CGT
Arg GAC GGG CTG GTT Asp Gly Leu Val GAC GCC AAC AAC Asp Ala Asn Asn
TAC
Tyr GAA CAG CAA GAG Glu Gln Gln Glu GCC TCC CAG CAG Ala Ser Gln Gln ATC CTC Ile Leu AGC AGC TGA Ser Ser

INFORMATION FOR SEQ ID NO:185:
SEQUENCE CHARACTERISTICS: LENGTH: 98 amino acids TYPE: amino acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: protein FRAGMENT TYPE: internal
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:185:
Met 1 Gly Thr Ser Arg Phe Met Thr Asp Pro His Ala Met Arg Asp Met Ala 5 10 Arg Phe Glu Val His Ala Gln Thr Val 25 Glu Asp Glu Ala Arg Arg Ser Gly Met Met Trp Ala Ser Ala Gln Asn Ile Ser Gly Ala Gly Trp 40 Ala Glu Ala Thr Ser Leu Asp Thr Met Thr Gln Met Asn Gln Ala Phe 55 Arg Asn Ile Val Asn Met Leu His Gly Val Arg Asp Gly Leu Val Arg 70 75 Asp Ala Asn Asn Tyr Glu Gln Gln Glu Gln Ala Ser Gln Gln Ile Leu 90 Ser Ser

INFORMATION FOR SEQ ID NO:186:
SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:186:
GGAATGAAAA GGGGTTTGTG

INFORMATION FOR SEQ ID NO:187:
SEQUENCE CHARACTERISTICS: LENGTH: 20 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:187:
GACCACGCCC GCGCCGTGTG

INFORMATION FOR SEQ ID NO:188:
SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:188:
GCAACACCCG GGATGTCGCA GATTATG

INFORMATION FOR SEQ ID NO:189:
SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:189:
CTAAGCTTGG ATCCCTAGCC GCCCCACTTG

INFORMATION FOR SEQ ID NO:190:
SEQUENCE CHARACTERISTICS: LENGTH: 22 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:190:
GAATATTTGA AAGGGATTCG TG

INFORMATION FOR SEQ ID NO:191:
SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:191:
CTACTAAGCT TGGATCCTTA GTCTCCGGCG

INFORMATION FOR SEQ ID NO:192:
SEQUENCE CHARACTERISTICS: LENGTH: 27 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:192:
GCAACACCCG GGGTGTCGCA GAGTATG

INFORMATION FOR SEQ ID NO:193:
SEQUENCE CHARACTERISTICS: LENGTH: 30 base pairs TYPE: nucleic acid STRANDEDNESS: single TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:193:
CTACTAAGCT TGGATCCTTA GTCTCCGGCG

Claims (58)

1. A substantially pure polypeptide fragment which a) comprises an amino acid sequence selected from the sequences shown in SEQ ID NO: 175, b) comprises a subsequence of the polypeptide fragment shown in SEQ ID NO: 175 which has a length of at least 6 amino acid residues, said subsequence being immunologically equivalent to the polypeptide shown in SEQ ID NO: 175 with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitization with antigens derived from mycobacteria belonging to the tuberculosis complex, or c) comprises an amino acid sequence having a sequence identity with the polypeptide shown in SEQ ID NO: 175 or the subsequence of the amino acid sequence shown in SEQ ID NO: 175 (defined in b) of at least 70%, and at the same time being immunologically equivalent to the polypeptide shown in SEQ ID NO: 175 with respect to the ability of evoking a protective immune response against infections with mycobacteria belonging to the tuberculosis complex or with respect to the ability of eliciting a diagnostically significant immune response indicating previous or ongoing sensitisation with antigens derived from mycobacteria belonging to the tuberculosis complex.
2. The polypeptide fragment according to claim 1 in essentially pure form.
3. The polypeptide fragment according to claim 1 or 2, which comprises an epitope for a T-helper cell.
4. The polypeptide fragment according to any one of the preceding claims, which has a length of at least 7 amino acid residues, such as at least 8, at least 9, at least 10, at least 12, at least 14, at least 16, at least 18, at least 20, at least 22, at least 24, and at least 30 amino acid residues.
5. The polypeptide fragment according to any one of the preceding claims which is free from any signal sequence.
6. The polypeptide fragment according to any one of the preceding claims which 1) induces a release of IFN-γ from primed memory T-lymphocytes withdrawn from a mouse within 2 weeks of primary infection or within 4 days after the mouse has been re-challenge infected with mycobacteria belonging to the tuberculosis complex, the induction performed by the addition of the polypeptide to a suspension comprising about 200,000 spleen cells per ml, the addition of the polypeptide resulting in a concentration of 1-4 µg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension, and/or 2) induces a release of IFN-γ of at least 300 pg above background level from about 1,000,000 human PBMC (peripheral blood mononuclear cells) per ml isolated from TB patients in the first phase of infection, or from healthy BCG vaccinated donors, or from healthy contacts to TB patients, the induction being performed by the addition of the polypeptide to a suspension comprising the about 1,000,000 PBMC per ml, the addition of the polypeptide resulting in a concentration of 1-4 µg polypeptide per ml suspension, the release of IFN-γ being assessable by determination of IFN-γ in supernatant harvested 2 days after the addition of the polypeptide to the suspension; and/or 3) induces an IFN-γ release from bovine PBMC derived from animals previously sensitised with mycobacteria belonging to the tuberculosis complex, said release being at least two times the release observed from bovine PBMC derived from animals not previously sensitized with mycobacteria belonging to the tuberculosis complex.
7. A polypeptide fragment according to any one of the preceding claims, wherein the sequence identity in c) is at least 80%, such as at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, and at least 99.5%.
8. A fusion polypeptide comprising at least one polypeptide fragment according to any one of the preceding claims and at least one fusion partner.
9. A fusion polypeptide according to claim 8, wherein the fusion partner is selected from the group consisting of a polypeptide fragment as defined in any one of claims 1-8, and another polypeptide fragment derived from a bacterium belonging to the tuberculosis complex, such as ESAT-6 or at least one T-cell epitope thereof, MPB64 or at least one T-cell epitope thereof, MPT64 or at least one T-cell epitope thereof, and MPB59 or at least one T-cell epitope thereof.
10. A fusion polypeptide fragment which comprises 1) a first amino acid sequence including at least one stretch of amino acids constituting a T-cell epitope derived from the M. tuberculosis protein ESAT-6, and a second amino acid sequence including at least one T-cell epitope derived from a M. tuberculosis protein selected from the group consisting of a polypeptide fragment according to any one of claims 1-7, DnaK, GroEL, urease, glutamine synthetase, the proline rich complex, L-alanine dehydrogenase, phosphate binding protein, Ag 85 complex, HBHA (heparin binding hemagglutinin), MPT51, MPT64, superoxide dismutase, 19 kDa lipoprotein, α-crystallin, GroES, MPT59 and/or including a stretch of amino acids which protects the first amino acid sequence from in vivo degradation or post-translational processing; or 2) a first amino acid sequence including at least one stretch of amino acids constituting a T-cell epitope derived from the M. tuberculosis protein MPT59, and a second amino acid sequence including at least one T-cell epitope derived from a M. tuberculosis protein different from MPT59 and/or including a stretch of amino acids which protects the first amino acid sequence from in vivo degradation or post-translational processing.
11. A fusion polypeptide fragment according to claim 10, wherein the first amino acid sequence is situated C-terminally to the second amino acid sequence.
12. A fusion polypeptide fragment according to claim 10, wherein the first amino acid sequence is situated N-terminally to the second amino acid sequence.
13. A fusion polypeptide fragment according to any one of claims 10-12, wherein the at least one T-cell epitope included in the first amino acid sequence is derived from the M. tuberculosis protein MPT59 and the at least one T-cell epitope included in the second amino acid sequence is derived from the M. tuberculosis protein ESAT-6.
14. A fusion polypeptide fragment according to any one of claims 10-13, wherein the first and second T-cell epitopes each have a sequence identity of at least 70% with the natively occurring sequence in the proteins from which they are derived.
15. A fusion polypeptide according to any one of claims 10-14, wherein the first and/or second amino acid sequence have a sequence identity of at least 70% with the protein from which they are derived.
16. A fusion polypeptide fragment according to any one of claims 10-15, wherein the first amino acid sequence is the amino acid sequence of ESAT-6 or of MPT59 and/or the second amino acid sequence is the amino acid sequence of a M. tuberculosis polypeptide selected from the group consisting of a polypeptide fragment according to any one of claims 1-7, DnaK, GroEL, urease, glutamine synthetase, the proline rich complex, L-alanine dehydrogenase, phosphate binding protein, Ag 85 complex, HBHA (heparin binding hemagglutinin), MPT51, MPT64, superoxide dismutase, 19 kDa lipoprotein, α-crystallin, GroES, ESAT-6 when the first amino acid sequence is that of MPT59, and MPT59 when the first amino acid sequence is that of ESAT-6.
17. A fusion polypeptide according to any one of claims 10-16, wherein no linkers are introduced between the two amino acid sequences.
18. A polypeptide according to any one of the preceding claims which is lipidated so as to allow a self-adjuvating effect of the polypeptide.
19. A substantially pure polypeptide according to any one of claims 1-18 for use as a pharmaceutical.
20. The use of a substantially pure polypeptide according to any one of claims 1-19 in the preparation of a pharmaceutical composition for the diagnosis of tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.
21. The use of a substantially pure polypeptide according to any one of claims 1-19 in the preparation of a pharmaceutical composition for the vaccination against tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.
22. A nucleic acid fragment in isolated form which 1) comprises a nucleic acid sequence which encodes a polypeptide as defined in any one of claims 1-18, or comprises a nucleic acid sequence complementary thereto; 2) has a length of at least 10 nucleotides and hybridizes readily under stringent hybridization conditions with a nucleic acid fragment which has a nucleotide sequence shown in SEQ ID NO: 174 or a sequence complementary thereto.
23. A nucleic acid fragment according to claim 22, which is a DNA fragment.
24. A nucleic acid fragment according to claims 22 or 23 for use as a pharmaceutical.
25. The use of a nucleic acid fragment according to any one of claims 22-24 in the preparation of a pharmaceutical composition for the vaccination against tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.
26. The use of a nucleic acid fragment according to any one of claims 22-24 in the preparation of a pharmaceutical composition for the diagnosis of tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis.
27. A vaccine comprising a nucleic acid fragment according to any one of claims 22-24, the vaccine effecting in vivo expression of antigen by an animal, including a human being, to whom the vaccine has been administered, the amount of expressed antigen being effective to confer substantially increased resistance to infections with mycobacteria of the tuberculosis complex in an animal, including a human being.
28. An immunologic composition comprising a polypeptide according to any one of claims 1-19.
29. An immunologic composition according to claim 28, which further comprises an immunologically and pharmaceutically acceptable carrier, vehicle or adjuvant.
30. An immunologic composition according to claim 29, wherein the carrier is selected from the group consisting of a polymer to which the polypeptide(s) is/are bound by hydrophobic non-covalent interaction, such as a plastic, e.g. polystyrene, a polymer to which the polypeptide(s) is/are covalently bound, such as a polysaccharide, and a polypeptide, e.g. bovine serum albumin, ovalbumin or keyhole limpet hemocyanin; the vehicle is selected from the group consisting of a diluent and a suspending agent; and the adjuvant is selected from the group consisting of dimethyldioctadecylammonium bromide (DDA), Quil A, poly I:C, Freund's incomplete adjuvant, IFN-γ, IL-2, IL-12, monophosphoryl lipid A (MPL), and muramyl dipeptide (MDP).
31. An immunologic composition according to any one of claims 28-30, comprising at least two different polypeptide fragments, each different polypeptide fragment being a polypeptide according to any one of claims 1-19.
32. An immunologic composition according to claim 31, comprising 3-20 different polypeptide fragments, each different polypeptide fragment being according to any one of claims 1-19.
33. An immunologic composition according to any one of claims 28-32, which is in the form of a vaccine.
34. An immunologic composition according to any one of claims 28-32, which is in the form of a skin test reagent.
35. A vaccine for immunizing an animal, including a human being, against tuberculosis caused by mycobacteria belonging to the tuberculosis complex, comprising as the effective component a non-pathogenic microorganism, wherein at least one copy of a DNA fragment comprising a DNA sequence encoding a polypeptide according to any one of claims 1-19 has been incorporated into the genome of the microorganism in a manner allowing the microorganism to express and optionally secrete the polypeptide.
36. A vaccine according to claim 35, wherein the microorganism is a bacterium.
37. A vaccine according to claim 36, wherein the bacterium is selected from the group consisting of the genera Mycobacterium, Salmonella, Pseudomonas and Escherichia.
38. A vaccine according to claim 37, wherein the microorganism is Mycobacterium bovis BCG, such as Mycobacterium bovis BCG strain: Danish S. 1331.
39. A vaccine according to any one of claims 35-38, wherein at least 2 copies of a DNA fragment encoding a polypeptide according to any one of claims 1-12 are incorporated into the genome of the microorganism.
40. A vaccine according to claim 39, wherein the number of copies is at least
41. A replicable expression vector which comprises a nucleic acid fragment according to any one of claims 22-24.
42. A vector according to claim 41, which is selected from the group consisting of a virus, a bacteriophage, a plasmid, a cosmid, and a microchromosome.
43. A transformed cell harbouring at least one vector according to claim 41 or 42.
44. A transformed cell according to claim 43, which is a bacterium belonging to the tuberculosis complex, such as a M. tuberculosis bovis BCG cell.
45. A transformed cell according to claim 43 or 44, which expresses a polypeptide according to any one of claims 1-19.
46. A method for producing a polypeptide according to any one of claims 1-19, comprising inserting a nucleic acid fragment according to any one of claims 22-24 into a vector which is able to replicate in a host cell, introducing the resulting recombinant vector into the host cell, culturing the host cell in a culture medium under conditions sufficient to effect expression of the polypeptide, and recovering the polypeptide from the host cell or culture medium; or isolating the polypeptide from a short-term culture filtrate; or isolating the polypeptide from whole mycobacteria of the tuberculosis complex or from lysates or fractions thereof, e.g. cell wall containing fractions; or synthesizing the polypeptide by solid or liquid phase peptide synthesis.
47. A method for producing an immunologic composition according to any one of claims 28-34 comprising preparing, synthesizing or isolating a polypeptide according to any one of claims 1-19, and solubilizing or dispersing the polypeptide in a medium for a vaccine, and optionally adding other M. tuberculosis antigens and/or a carrier, vehicle and/or adjuvant substance, or cultivating a cell according to any one of claims 43-45, and transferring the cells to a medium for a vaccine, and optionally adding a carrier, vehicle and/or adjuvant substance.
48. A method of diagnosing tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis in an animal, including a human being, comprising intradermally injecting, in the animal, a polypeptide according to any one of claims 1-19 or an immunologic composition according to any one of claims 28-34, a positive skin response at the location of injection being indicative of the animal having tuberculosis, and a negative skin response at the location of injection being indicative of the animal not having tuberculosis.
49. A method for immunising an animal, including a human being, against tuberculosis caused by mycobacteria belonging to the tuberculosis complex, comprising administering to the animal the polypeptide according to any one of claims 1-19, the immunologic composition according to any one of claims 28-34, or the vaccine according to any one of claims 35-40.
50. A method according to claim 49, wherein the polypeptide, immunologic composition, or vaccine is administered by the parenteral (such as intravenous and intraarterially), intraperitoneal, intramuscular, subcutaneous, intradermal, oral, buccal, sublingual, nasal, rectal or transdermal route.
51. A method for diagnosing ongoing or previous sensitization in an animal or a human being with bacteria belonging to the tuberculosis complex, the method comprising providing a blood sample from the animal or human being, and contacting the sample from the animal with the polypeptide according to any one of claims 1-19, a significant release into the extracellular phase of at least one cytokine by mononuclear cells in the blood sample being indicative of the animal being sensitized.
52. A composition for diagnosing tuberculosis in an animal, including a human being, comprising a polypeptide according to any one of claims 1-19, or a nucleic acid fragment according to any one of claims 22-24, optionally in combination with a means for detection.
53. A monoclonal or polyclonal antibody, which is specifically reacting with a polypeptide according to any one of claims 1-19 in an immunoassay, or a specific binding fragment of said antibody.
Dated this fourth day of January 2001
STATENS SERUM INSTITUT
Patent Attorneys for the Applicant: F B RICE CO
43. A transformed cell harbouring at least one vector according to claim 41 or 42.
44. A transformed cell according to claim 43, which is a bacterium belonging to the tuberculosis complex, such as a M. tuberculosis bovis BCG cell.
45. A transformed cell according to claim 43 or 44, which expresses a polypeptide according to any of claims 1-19.
46. A method for producing a polypeptide according to any of claims 1-19, comprising inserting a nucleic acid fragment according to any of claims 15-17 into a vector which is able to replicate in a host cell, introducing the resulting recombinant vector into the host cell, culturing the host cell in a culture medium under conditions sufficient to effect expression of the polypeptide, and recovering the polypeptide from the host cell or culture medium; or isolating the polypeptide from a short-term culture filtrate; or isolating the polypeptide from whole mycobacteria of the tuberculosis complex or from lysates or fractions thereof, e.g. cell wall containing fractions; or synthesizing the polypeptide by solid or liquid phase peptide synthesis.
47. A method for producing an immunologic composition according to any of claims 28-34 comprising preparing, synthesizing or isolating a polypeptide according to any of claims 1-19, and solubilizing or dispersing the polypeptide in a medium for a vaccine, and optionally adding other M. tuberculosis antigens and/or a carrier, vehicle and/or adjuvant substance, or cultivating a cell according to any of claims 41-45, and transferring the cells to a medium for a vaccine, and optionally adding a carrier, vehicle and/or adjuvant substance.
48. A method of diagnosing tuberculosis caused by Mycobacterium tuberculosis, Mycobacterium africanum or Mycobacterium bovis in an animal, including a human being, comprising intradermally injecting, in the animal, a polypeptide according to any of claims 1-19 or an immunologic composition according to any of claims 28-34, a positive skin response at the location of injection being indicative of the animal having tuberculosis, and a negative skin response at the location of injection being indicative of the animal not having tuberculosis.
49. A method for immunising an animal, including a human being, against tuberculosis caused by mycobacteria belonging to the tuberculosis complex, comprising administering to the animal the polypeptide according to any of claims 1-19, the immunologic composition according to any of claims 28-34, or the vaccine according to any of claims 35-40.
50. A method according to claim 49, wherein the polypeptide, immunologic composition, or vaccine is administered by the parenteral (such as intravenous and intraarterially), intraperitoneal, intramuscular, subcutaneous, intradermal, oral, buccal, sublingual, nasal, rectal or transdermal route.
51. A method for diagnosing ongoing or previous sensitization in an animal or a human being with bacteria belonging to the tuberculosis complex, the method comprising providing a blood sample from the animal or human being, and contacting the sample from the animal with the polypeptide according to any of claims 1-19, a significant release into the extracellular phase of at least one cytokine by mononuclear cells in the blood sample being indicative of the animal being sensitized.
52. A composition for diagnosing tuberculosis in an animal, including a human being, comprising a polypeptide according to any of claims 1-19, or a nucleic acid fragment according to any of claims 22-24, optionally in combination with a means for detection.
53. A monoclonal or polyclonal antibody, which is specifically reacting with a polypeptide according to any of claims 1-19 in an immunoassay, or a specific binding fragment of said antibody.
54. Use of CFP7A or CFP30A, or a T-cell epitope thereof, for the induction of a strong immune response in a mammal including a human being.
55. Use of CFP7A, or a T-cell epitope thereof, for the induction of a high protective immune response in a mammal including a human being.
56. Use of CFP7B, CFP19, or MPT59-ESAT6, or a T-cell epitope thereof, for the diagnosis of tuberculosis in a mammal including a human being by performing a DTH type skin test.
57. Use of CFP27, CFP30A, RD1-ORF2, RD1-ORF3, RD1-ORF5, MPT59-ESAT6, ESAT6-MPT59, CFP10A, CFP16, CFP19, CFP23, CFP25A, CFP30B, CFP7B, or a T-cell epitope thereof, for the preparation of an immunological composition with a wide genetically recognition.
58. Use of CFP27, CFP30A, RD1-ORF2, RD1-ORF5, MPT59-ESAT6, ESAT6-MPT59, CFP19, CFP23, CFP25A, CFP30B, or a T-cell epitope thereof, for the preparation of a vaccine such as a subunit vaccine.
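
Claims 1(c), 7, 14 and 15 set percentage thresholds for sequence identity. As a minimal sketch only, the following Python computes a per-position identity for two sequences that have already been aligned to equal length, counting the non-identical positions (n_dif) against the reference length (n_ref). The function names, the equal-length requirement, the handling of gap characters, and the example sequences are illustrative assumptions and are not taken from the claims; the definition of sequence identity given in the description governs, and a real comparison would first need an alignment step that this sketch does not perform.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Return the percentage of aligned positions that carry the same residue.

    Both inputs must already be aligned to the same length. A gap character
    paired with a residue simply counts as a non-identical position.
    """
    if len(reference) != len(candidate):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not reference:
        raise ValueError("sequences must be non-empty")
    n_ref = len(reference)
    # Count positions where the two aligned sequences differ.
    n_dif = sum(1 for a, b in zip(reference, candidate) if a != b)
    return (n_ref - n_dif) * 100.0 / n_ref


def meets_threshold(reference: str, candidate: str, threshold: float = 70.0) -> bool:
    """Check a candidate against an identity threshold given in percent."""
    return percent_identity(reference, candidate) >= threshold


if __name__ == "__main__":
    # Hypothetical 8-residue example with two differing positions.
    print(percent_identity("AGTCAGTC", "AATCAATC"))
    print(meets_threshold("AGTCAGTC", "AATCAATC", threshold=70.0))
```

With the hypothetical pair above, two of eight positions differ, so the pair would pass a 70% threshold but not an 80% one. For protein sequences the same functions can be applied to one-letter residue strings.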
AU94338/98A 1997-11-10 1998-10-08 Nucleic acid fragments and polypeptide fragments derived from M. tuberculosis Expired AU750173B2 (en)

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
DK127797 1997-11-10
DK97/01277 1997-11-10
US7048898P 1998-01-05 1998-01-05
US60/070488 1998-01-05
PCT/DK1998/000132 WO1998044119A1 (en) 1997-04-02 1998-04-01 NUCLEIC ACID FRAGMENTS AND POLYPEPTIDE FRAGMENTS DERIVED FROM M. TUBERCULOSIS
WO98/00132 1998-04-01
PCT/DK1998/000438 WO1999024577A1 (en) 1997-11-10 1998-10-08 NUCLEIC ACID FRAGMENTS AND POLYPEPTIDE FRAGMENTS DERIVED FROM M. TUBERCULOSIS

Related Child Applications (1)

Application Number Title Priority Date Filing Date
AU2002301509A Division AU2002301509A1 (en) 1997-11-10 2002-10-10 Nucleic acid fragments and polypeptide fragments derived from M. tuberculosis

Publications (2)

Publication Number Publication Date
AU9433898A AU9433898A (en) 1999-05-31
AU750173B2 AU750173B2 (en) 2002-07-11

Family

ID=27221205

Family Applications (1)

Application Number Title Priority Date Filing Date
AU94338/98A Expired AU750173B2 (en) 1997-11-10 1998-10-08 Nucleic acid fragments and polypeptide fragments derived from M. tuberculosis

Country Status (5)

Country Link
EP (1) EP1029053A1 (en)
AU (1) AU750173B2 (en)
CA (1) CA2319380A1 (en)
NZ (1) NZ504951A (en)
WO (1) WO1999024577A1 (en)

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB9706957D0 (en) 1997-04-05 1997-05-21 Smithkline Beecham Plc Formulation
CA2322505A1 (en) * 1998-03-06 1999-09-10 Statens Serum Institut Production of mycobacterial polypeptides by lactic acid bacteria
GB9820525D0 (en) 1998-09-21 1998-11-11 Allergy Therapeutics Ltd Formulation
JP5010079B2 (en) 1999-07-13 2012-08-29 スタテンズ セーラム インスティテュート Mycobacterium tuberculosis esat-6 gene family based tuberculosis vaccine and diagnostic method
US6537552B1 (en) 1999-10-19 2003-03-25 Iowa State University Research Foundation Vaccine adjuvant
GB0000891D0 (en) * 2000-01-14 2000-03-08 Allergy Therapeutics Ltd Formulation
WO2001079274A2 (en) * 2000-04-19 2001-10-25 Statens Serum Institut Tuberculosis antigens and methods of use thereof
WO2002004018A2 (en) * 2000-07-10 2002-01-17 Colorado State University Research Foundation Mid-life vaccine and methods for boosting anti-mycobacterial immunity
US7105170B2 (en) * 2001-01-08 2006-09-12 The United States Of America As Represented By The Department Of Health And Human Services Latent human tuberculosis model, diagnostic antigens, and methods of use
BRPI0518933A2 (en) 2004-11-16 2008-12-16 Crucell Holland B V E Aeras Gl replication defective recombinant adenovirus, recombinant polynucleotide vector, multivalent tuberculosis vaccine, and use of mycobacterium antigen tb10.4
CN100354004C (en) * 2004-11-19 2007-12-12 李忠明 Tubercle bacillus chimeric gene vaccine and preparation process thereof
CN101438166A (en) * 2006-03-14 2009-05-20 俄勒冈健康科学大学 Methods for producing an immune response to tuberculosis
GB0618127D0 (en) * 2006-09-14 2006-10-25 Isis Innovation Biomarker
EP2368568A1 (en) 2006-11-01 2011-09-28 Immport Therapeutics, INC. Compositions and methods for immunodominant antigens
WO2008116468A2 (en) 2007-03-26 2008-10-02 Dako Denmark A/S Mhc peptide complexes and uses thereof in infectious diseases
EP3023436A1 (en) 2007-07-03 2016-05-25 Dako Denmark A/S Improved methods for generation, labeling and use of mhc multimers
EP2197908A2 (en) * 2007-09-27 2010-06-23 Dako Denmark A/S Mhc multimers in tuberculosis diagnostics, vaccine and therapeutics
US10968269B1 (en) 2008-02-28 2021-04-06 Agilent Technologies, Inc. MHC multimers in borrelia diagnostics and disease
WO2010009735A2 (en) 2008-07-23 2010-01-28 Dako Denmark A/S Combinatorial analysis and repair
US11992518B2 (en) 2008-10-02 2024-05-28 Agilent Technologies, Inc. Molecular vaccines for infectious disease
US10369204B2 (en) 2008-10-02 2019-08-06 Dako Denmark A/S Molecular vaccines for infectious disease
EP2711706A1 (en) 2012-09-21 2014-03-26 LIONEX Diagnostics and Therapeutics GmbH Mycobacterial thiolperoxidase and its use
WO2015188839A2 (en) 2014-06-13 2015-12-17 Immudex Aps General detection and isolation of specific cells by binding of labeled molecules
CN104306408B (en) * 2014-10-31 2019-02-01 潘霞 A kind of polysaccharide nucleic acid pharmaceutical composition for treating infantile eczema
CN104383162A (en) * 2014-11-21 2015-03-04 王正琦 Ointment for infantile eczema
CN106248934B (en) * 2016-08-25 2018-04-06 中国疾病预防控制中心传染病预防控制所 Antigen of mycobacterium tuberculosis albumen Rv0446c and its t cell epitope peptide application

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997009428A2 (en) * 1995-09-01 1997-03-13 Corixa Corporation Compounds and methods for immunotherapy and diagnosis of tuberculosis
WO1997009429A2 (en) * 1995-09-01 1997-03-13 Corixa Corporation Compounds and methods for diagnosis of tuberculosis
AU6820498A (en) * 1997-04-02 1998-10-22 Statens Serum Institut Nucleic acid fragments and polypeptide fragments derived from M. tuberculosis

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DK79893D0 (en) * 1993-07-02 1993-07-02 Statens Seruminstitut NEW VACCINE
AU6024596A (en) * 1995-05-23 1996-12-11 Regents Of The University Of California, The Abundant extracellular products and methods for their production and use
US6555653B2 (en) * 1997-05-20 2003-04-29 Corixa Corporation Compounds for diagnosis of tuberculosis and methods for their use
US6613881B1 (en) * 1997-05-20 2003-09-02 Corixa Corporation Compounds for immunotherapy and diagnosis of tuberculosis and methods of their use

Also Published As

Publication number Publication date
EP1029053A1 (en) 2000-08-23
NZ504951A (en) 2001-06-29
WO1999024577A1 (en) 1999-05-20
CA2319380A1 (en) 1999-05-20
AU9433898A (en) 1999-05-31

Similar Documents

Publication Publication Date Title
AU750173B2 (en) Nucleic acid fragments and polypeptide fragments derived from M. tuberculosis
JP4544434B2 (en) Nucleic acid fragments and polypeptide fragments derived from Mycobacterium tuberculosis
US6991797B2 (en) M. tuberculosis antigens
CA2165949C (en) Tuberculosis vaccine
US5955077A (en) Tuberculosis vaccine
US6641814B1 (en) Nucleic acids fragments and polypeptide fragments derived from M. tuberculosis
US8076469B2 (en) TB diagnostic based on antigens from M. tuberculosis
AU689075B2 (en) Membrane-associated immunogens of mycobacteria
AU766093B2 (en) Tuberculosis vaccine and diagnostic reagents based on antigens from the mycobacterium tuberculosis cell
EP1484405A1 (en) Nucleic acid fragments and polypeptide fragments derived from M. Tuberculosis
AU2006252186A1 (en) Nucleic acid fragments and polypeptide fragments derived from M. tuberculosis
EP1787994A1 (en) TB vaccine and diagnostic based on antigens from M. tuberculosis cell

Legal Events

Date Code Title Description
MK6 Application lapsed section 142(2)(f)/reg. 8.3(3) - pct applic. not entering national phase
TH Corrigenda

Free format text: IN VOL 14, NO 3, PAGE(S) 464-468 UNDER THE HEADING APPLICATIONS LAPSED, REFUSED OR WITHDRAWN PLEASE DELETE ALL REFERENCE TO APPLICATION NO 94338/98

FGA Letters patent sealed or granted (standard patent)
MK14 Patent ceased section 143(a) (annual fees not paid) or expired