WO1997041234A2

WO1997041234A2 - PROTEINS INVOLVED IN THE SYNTHESIS AND ASSEMBLY OF O-ANTIGEN IN PSEUDOMONAS AERUGINOSA

Info

Publication number: WO1997041234A2
Application number: PCT/CA1997/000295
Authority: WO
Inventors: Joseph S. Lam; Lori Burrows; Deborah Charter; Teresa De Kievit
Original assignee: University Of Guelph
Priority date: 1996-04-30
Filing date: 1997-04-30
Publication date: 1997-11-06
Also published as: EP0904376A2; WO1997041234A3; CA2175435A1; US5994072A; AU2377497A

Abstract

Nucleic acid molecules encoding proteins involved in the synthesis and assembly of O-antigen in P. aeruginosa; and proteins encoded by the nucleic acid molecules are described. Methods are disclosed for detecting P. aeruginosa in a sample by determining the presence of the proteins or a nucleic acid molecule encoding the proteins in the sample.

Description

PROTEINS INVOLVED IN THE SYNTHESIS AND ASSEMBLY OF O-ANTIGEN IN PSEUDOMONAS AERUGINOSA

FIELD OF THE INVENTION

The invention relates to novel nucleic acid molecules encoding proteins involved in the synthesis and assembly of O-antigen in P. aeruginosa, the novel proteins encoded by the nucleic acid molecules, and uses of the proteins and nucleic acid molecules.

BACKGROUND OF THE INVENTION

The opportunistic pathogen P. aeruginosa remains a problem in the nosocomial infection of immunocompromised individuals. P. aeruginosa infections are particularly a problem in burn patients, people receiving medical implants, and in individuals suffering from cystic fibrosis (Fick, R. B. Jr., 1993). The organism is intrinsically resistant to many antibiotics and capable of forming biofilms which are recalcitrant to treatment. Several virulence factors have been identified in the pathogenesis of P. aeruginosa infections, including proteins such as exotoxin A and proteases, and exopolysaccharides including alginate and lipopolysaccharide (LPS). The LPS of P. aeruginosa is typical of Gram-negative bacteria, composed of lipid A-core oligosaccharide-O antigen repeating units.

P. aeruginosa is capable of coexpressing two distinct forms of LPS, designated A-band and B-band LPS, respectively. A-band LPS is a shorter, common form expressed by the majority of P. aeruginosa serotypes, and has a trisaccharide repeating unit of α-D-rhamnose linked 1→3, 1→3, 1→2. B-band LPS is the serotype-specific, O-antigen-containing form, and is a heteropolymer composed of di- to pentasaccharide repeats containing a wide variety of acyl sugars, amino sugars, and uronic acids. Both the A- and B-band repeating units are attached to lipid A-core, but there appear to be differences between them regarding point of attachment to and composition of the outer core region (Rivera et al., 1992).

The gene clusters for biosynthesis of core oligosaccharides/O-antigens (rfb) have been cloned and characterized from several bacterial species, including some from non-enteric genera such as Bordetella (Allen and Maskell, 1996), Haemophilus (Jarosik and Hansen, 1994), Neisseria (Gotschlich, 1994), Vibrio (Stroeher et al., 1992; Amor and Mutharia, 1995; Comstock et al., 1996), and Xanthomonas (Kingsley et al., 1993). rfb clusters appear to be composed of mosaics of biosynthetic genes acquired horizontally from different sources (Reeves, 1993). Biochemical characterization of O-antigens from various species has shown that conservation of structure does not necessarily mirror conservation at the genetic level. Strains with identical O-antigens can differ significantly in their rfb clusters, while unique O-antigens can be encoded by only slightly variant rfb genes in other strains (Whitfield and Valvano, 1993).

Lightfoot and Lam were the first to report the cloning of genes involved in the expression of A-band (Lightfoot and Lam, 1991) and B-band (Lightfoot and Lam, 1993) LPS of P. aeruginosa. A recombinant cosmid clone, pFV3, complemented A-band LPS synthesis in an A-band-deficient mutant, rd7513. pFV3 also mediated A-band LPS synthesis in five of the six P. aeruginosa O serotypes which lack A-band LPS. Another cosmid clone, pFV100, complemented B-band LPS synthesis in mutant ge6, which lacks B-band LPS. Physical mapping of the genes involved in A-band and B-band LPS synthesis indicated that the two gene clusters are physically distinct and are separated by more than 1.9 Mbp on the P. aeruginosa PAO1 genome. A-band LPS genes mapped between 5.75 and 5.89 Mbp (10.5 to 13.3 min), and B-band LPS genes mapped at 1.9 Mbp (near 37 min) on the 5.9-Mbp chromosome.

The structure of the P. aeruginosa O5 O-antigen has been elucidated (Knirel et al., 1988). O5 has a trisaccharide repeating unit of 2-acetamido-3-acetamidino-2,3-dideoxy-D-mannuronic acid, 2,3-diacetamido-D-mannuronic acid, and N-acetyl-D-fucosamine (Figure 30). Serotypes O2, O16, O18, and O20 of P. aeruginosa have similar O-antigens to serotype O5, varying only in one linkage or one epimer from O5 (Knirel et al., 1988) (Figure 30). Immunochemical cross reactions have also been demonstrated among LPS of serotypes O2, O5 and O16 by the use of monoclonal antibodies (Lam et al., 1992). The rfbA gene (herein also referred to as "psbL" and "wbpL") from the O5 gene cluster has been characterized (Dasgupta and Lam, 1995). This O5 O-antigen biosynthetic gene has been shown to hybridize only with chromosomal DNA from the group of five serotypes with similar O-antigens, and not with the remaining fifteen serotypes.

There are currently two pathways proposed for biosynthesis and assembly of LPS, the Rfc-dependent and Rfc-independent pathways. Rfc is the O-antigen polymerase, and appears to be required for assembly of heteropolymeric O-antigens (Makela and Stocker, 1984). In contrast, homopolymeric O-antigens appear to be assembled without an O-antigen polymerase (Whitfield, 1995). Rfc-dependent (or Wzy-dependent) LPS synthesis has been shown to involve at least two other gene products which act in concert with Rfc: RfbX (or Wzx), the putative flippase which translocates individual O-antigen units across the cytoplasmic membrane where they are polymerized by Rfc (or Wzy), and Rol (or Wzz), the regulator of O-antigen chain length, which determines the preferred O-antigen chain length characteristic of the individual strain or serotype (Batchelor et al., 1993; Bastin et al., 1993; Morona et al., 1994b; Dodgson et al., 1996).

SUMMARY OF THE INVENTION

The present inventors have characterized a P. aeruginosa B-band (psb) gene cluster involved in the synthesis and assembly of B-band lipopolysaccharide, i.e. O-antigen. The gene cluster is also known as and referred to herein as the wbp gene cluster.

The cluster contains two groups of genes, one of which is found in P. aeruginosa serotypes O2, O5, O16, O18, and O20, and the other is found in serotypes O1 to O20. The genes found in serotypes O2, O5, O16, O18, and O20 include the psbL gene, also known as wbpL and rfbA (Dasgupta and Lam, 1995), and the novel genes designated rol, psbA, psbB, psbC, psbD, psbE, rfc, psbF, psbG, psbH, psbI, psbJ, and psbK ("Group I genes"), also known as and referred to herein as wzz, wbpA, wbpB, wbpC, wbpD, wbpE, wzy, wbpF, wbpG, wbpH, wbpI, wbpJ, and wbpK respectively. The genes found in serotypes O1 to O20 include the novel genes psbM and psbN, which are also known as and referred to herein as wbpM and wbpN respectively ("Group II genes"). The psb gene cluster also contains genes which are not involved in LPS synthesis, including the genes rpsA and himD and the novel genes designated uvrB, insertion element IS407, hisH and hisF. The arrangement of the genes in the wbp gene cluster is shown in Figure 1.

The identification and sequencing of the genes and proteins in the wbp gene cluster permits the identification of substances which affect O-antigen synthesis or assembly in P. aeruginosa. These substances may be useful in inhibiting O-antigen synthesis or assembly, thereby rendering the microorganisms more susceptible to attack by host defence mechanisms.

Broadly stated, the present invention relates to an isolated P. aeruginosa B-band gene cluster containing the following genes: rol (wzz), psbA (wbpA), psbB (wbpB), psbC (wbpC), psbD (wbpD), psbE (wbpE), rfc (wzy), psbF (wbpF), psbG (wbpG), psbH (wbpH), psbI (wbpI), psbJ (wbpJ), psbK (wbpK), psbL (wbpL), psbM (wbpM), and psbN (wbpN), involved in the synthesis and assembly of lipopolysaccharide in P. aeruginosa. The terms in parentheses correspond to other designations that have been given to these genes. The gene cluster may also contain the non-LPS gene uvrB, the insertion element IS407 (IS1209), the genes hisH and hisF involved in histidine synthesis, the gene rpsA which encodes a 30S ribosomal subunit protein S1, and the gene himD which encodes an integration host factor.

The present invention also relates to nucleic acid molecules encoding the following proteins: (1) (a) Rol (also known as Wzz); (b) PsbA (also known as WbpA); (c) PsbB (also known as WbpB); (d) PsbC (also known as WbpC); (e) PsbD (also known as WbpD); (f) PsbE (also known as WbpE); (g) Rfc (also known as Wzy); (h) PsbF (also known as WbpF); (i) PsbG (also known as WbpG); (j) PsbI (also known as WbpI); (k) PsbJ (also known as WbpJ); (l) PsbK (also known as WbpK); (m) PsbM (also known as WbpM); (n) PsbH (also known as WbpH); or (o) PsbN (also known as WbpN), involved in P. aeruginosa O-antigen synthesis and assembly; (2) UvrB involved in ultraviolet repair; (3) HisH or HisF involved in histidine synthesis; or (4) RpsA, a 30S ribosomal subunit protein S1. In addition, nucleic acid molecules are provided which contain sequences encoding two or more of the following proteins: (1) (a) Rol (also known as Wzz); (b) PsbA (also known as WbpA); (c) PsbB (also known as WbpB); (d) PsbC (also known as WbpC); (e) PsbD (also known as WbpD); (f) PsbE (also known as WbpE); (g) Rfc (also known as Wzy); (h) PsbF (also known as WbpF); (i) HisH; (j) HisF; (k) PsbG (also known as WbpG); (l) PsbI (also known as WbpI); (m) PsbJ (also known as WbpJ); (n) PsbK (also known as WbpK); (o) PsbM (also known as WbpM); (p) PsbN (also known as WbpN); (q) PsbH (also known as WbpH); (r) PsbL (also known as WbpL); and (s) RpsA.

The invention also contemplates a nucleic acid molecule comprising a sequence encoding a truncation of a protein of the invention, an analog, or a homolog of a protein of the invention, or a truncation thereof.

The nucleic acid molecules of the invention may be inserted into an appropriate expression vector, i.e. a vector which contains the necessary elements for the transcription and translation of the inserted coding sequence. Accordingly, recombinant expression vectors adapted for transformation of a host cell may be constructed which comprise a nucleic acid molecule of the invention and one or more transcription and translation elements operatively linked to the nucleic acid molecule. The recombinant expression vector may be used to prepare transformed host cells expressing a protein of the invention. Therefore, the invention further provides host cells containing a recombinant molecule of the invention.

The invention further provides a method for preparing a protein of the invention utilizing the purified and isolated nucleic acid molecules of the invention. In an embodiment, a method for preparing a protein of the invention is provided comprising (a) transferring a recombinant expression vector of the invention into a host cell; (b) selecting transformed host cells from untransformed host cells; (c) culturing a selected transformed host cell under conditions which allow expression of the protein; and (d) isolating the protein.

The invention further broadly contemplates an isolated protein characterized in that it has part or all of the primary structural conformation (i.e. continuous sequence of amino acid residues) of a novel protein encoded by a gene of the wbp gene cluster of the invention. In an embodiment of the invention, a purified protein is provided which has the amino acid sequence as shown in Figure 3 or SEQ ID NO:2; Figure 4 or SEQ ID NO:3; Figure 5 or SEQ ID NO:4; Figure 6 or SEQ ID NO:5; Figure 7 or SEQ ID NO:6; Figure 8 or SEQ ID NO:7; Figure 9 or SEQ ID NO:8; Figure 10 or SEQ ID NO:9; Figure 11 or SEQ ID NO:10; Figure 12 or SEQ ID NO:11; Figure 13 or SEQ ID NO:12; Figure 14 or SEQ ID NO:13; Figure 15 or SEQ ID NO:14; Figure 16 or SEQ ID NO:15; Figure 17 or SEQ ID NO:16; Figure 18 or SEQ ID NO:17; Figure 19 or SEQ ID NO:18; or Figure 20 or SEQ ID NO:19. The invention also includes truncations of the protein and analogs, homologs, and isoforms of the protein and truncations thereof. The proteins of the invention may be conjugated with other molecules, such as proteins, to prepare fusion proteins. This may be accomplished, for example, by the synthesis of N-terminal or C-terminal fusion proteins.

The nucleic acid molecules of the invention allow those skilled in the art to construct nucleotide probes for use in the detection of nucleotide sequences in samples such as biological (e.g. clinical specimens), food, or environmental samples. The nucleotide probes may also be used to detect nucleotide sequences that encode proteins related to or analogous to the proteins of the invention.

Accordingly, the invention provides a method for detecting the presence of a nucleic acid molecule having a sequence encoding a protein of the invention, comprising contacting the sample with a nucleotide probe which hybridizes with the nucleic acid molecule, to form a hybridization product under conditions which permit the formation of the hybridization product, and assaying for the hybridization product.
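The probe-based detection described above can be illustrated in silico with a deliberately simplified sketch. It assumes a DNA probe and treats hybridization as an exact match between the sample strand and the reverse complement of the probe; real hybridization tolerates mismatches to a degree set by the stringency of the conditions. The function names and the toy sequences are hypothetical and are not taken from the specification.

```python
# Simplified model: a single-stranded probe "hybridizes" to a sample strand
# if the sample contains the exact reverse complement of the probe.
# Assumption: DNA alphabet only (A, C, G, T); no mismatches are tolerated.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


def probe_binds(sample_strand: str, probe: str) -> bool:
    """Report whether the probe could pair perfectly with the sample strand."""
    return reverse_complement(probe) in sample_strand.upper()


if __name__ == "__main__":
    toy_sample = "TTTGCATTT"   # hypothetical sample strand
    toy_probe = "ATGC"         # hypothetical probe; its reverse complement is GCAT
    print(probe_binds(toy_sample, toy_probe))
```

An exact-match test is the strictest possible stand-in for stringent conditions; a mismatch-tolerant comparison would be needed to model probes that detect related or analogous sequences.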

The invention further provides a kit for detecting the presence of a nucleic acid molecule having a sequence encoding a protein of the invention, comprising a nucleotide probe which hybridizes with the nucleic acid molecule, reagents required for hybridization of the nucleotide probe with the nucleic acid molecule, and directions for its use.

The nucleic acid molecules of the invention also permit the identification and isolation, or synthesis, of nucleotide sequences which may be used as primers to amplify a nucleic acid molecule of the invention, for example in the polymerase chain reaction (PCR).

Accordingly, the invention relates to a method of determining the presence of a nucleic acid molecule having a sequence encoding a protein of the invention in a sample, comprising treating the sample with primers which are capable of amplifying the nucleic acid molecule in an amplification reaction, preferably in a polymerase chain reaction, to form amplified sequences, under conditions which permit the formation of amplified sequences, and assaying for amplified sequences.

The invention further relates to a kit for determining the presence of a nucleic acid molecule having a sequence encoding a protein of the invention in a sample, comprising primers which are capable of amplifying the nucleic acid molecule in an amplification reaction, preferably a polymerase chain reaction, to form amplified sequences, reagents required for amplifying the nucleic acid molecule thereof in the amplification reaction, means for assaying the amplified sequences, and directions for its use.

The invention also relates to an antibody specific for an epitope of a protein of the invention, and methods for preparing the antibodies. Antibodies specific for a protein encoded by a Group I gene can be used to detect P. aeruginosa serotypes O2, O5, O16, O18, and O20 in a sample, and antibodies specific for a protein encoded by a Group II gene can be used to detect P. aeruginosa serotypes O1 to O20 in a sample. Therefore, the invention also relates to a method for detecting P. aeruginosa serotypes O2, O5, O16, O18, and O20 in a sample comprising contacting a sample with an antibody specific for an epitope of a protein encoded by a Group I gene, which antibody is capable of being detected after it becomes bound to a protein in the sample, and assaying for antibody bound to protein in the sample, or unreacted antibody. A method is also provided for detecting P. aeruginosa serotypes O1 to O20 in a sample comprising contacting a sample with an antibody specific for an epitope of a protein encoded by a Group II gene, which antibody is capable of being detected after it becomes bound to a protein in the sample, and assaying for antibody bound to protein in the sample, or unreacted antibody.

A kit for detecting P. aeruginosa serotypes in a sample comprising an antibody of the invention, preferably a monoclonal antibody, and directions for its use is also provided. The kit may also contain reagents which are required for binding of the antibody to the protein in the sample.

As discussed above, the identification and sequencing of genes in the wbp gene cluster in P. aeruginosa permits the identification of substances which affect the activity of the proteins encoded by the genes in the cluster, or the expression of the proteins, thereby affecting O-antigen synthesis or assembly. These substances may be useful in rendering the microorganisms more susceptible to attack by host defence mechanisms. Accordingly, the invention provides a method for assaying for a substance that affects one or both of P. aeruginosa O-antigen synthesis or assembly comprising mixing a protein or nucleic acid molecule of the invention with a test substance which is suspected of affecting P. aeruginosa O-antigen synthesis or assembly, and determining the effect of the substance by comparing to a control.

Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF DRAWINGS

The invention will now be described in relation to the drawings:

Figure 1 shows the organization of the P. aeruginosa PAO1 psb (wbp) gene cluster;

Figure 2 shows the nucleic acid sequence of the P. aeruginosa PAO1 gene cluster (SEQ. ID. NO. 1);

Figure 3 shows the amino acid sequence of the Rol protein of the invention (SEQ. ID NO. 2);

Figure 4 shows the amino acid sequence of the PsbA (WbpA) protein of the invention (SEQ. ID NO. 3);

Figure 5 shows the amino acid sequence of the PsbB (WbpB) protein of the invention (SEQ. ID NO. 4);

Figure 6 shows the amino acid sequence of the PsbC (WbpC) protein of the invention (SEQ. ID NO. 5);

Figure 7 shows the amino acid sequence of the PsbD (WbpD) protein of the invention (SEQ. ID NO. 6);

Figure 8 shows the amino acid sequence of the PsbE (WbpE) protein of the invention (SEQ. ID NO. 7);

Figure 9 shows the amino acid sequence of the Rfc (Wzy) protein of the invention (SEQ. ID NO. 8);

Figure 10 shows the amino acid sequence of the PsbF (WbpF) protein of the invention (SEQ. ID NO. 9);

Figure 11 shows the amino acid sequence of the HisH protein of the invention (SEQ. ID NO. 10);

Figure 12 shows the amino acid sequence of the HisF protein of the invention (SEQ. ID NO. 11);

Figure 13 shows the amino acid sequence of the PsbG (WbpG) protein of the invention (SEQ. ID NO. 12);

Figure 14 shows the amino acid sequence of the PsbH (WbpH) protein of the invention (SEQ. ID NO. 13);

Figure 15 shows the amino acid sequence of the PsbI (WbpI) protein of the invention (SEQ. ID NO. 14);

Figure 16 shows the amino acid sequence of the PsbJ (WbpJ) protein of the invention (SEQ. ID NO. 15);

Figure 17 shows the amino acid sequence of the PsbK (WbpK) protein of the invention (SEQ. ID NO. 16);

Figure 18 shows the amino acid sequence of the PsbM (WbpM) protein of the invention (SEQ. ID NO. 17);

Figure 19 shows the amino acid sequence of the PsbN (WbpN) protein of the invention (SEQ. ID NO. 18);

Figure 20 shows the amino acid sequence of the UvrB protein of the invention (SEQ. ID NO. 19);

Figure 21 shows the amino acid sequence of PsbL (WbpL) (SEQ. ID NO. 20);

Figure 22 shows a silver-stained SDS-PAGE gel of LPS from PAO1, AK1401, AK1401(pFV100), and AK1401(pFV TK8) (Panel A) and Western immunoblots of this LPS reacted with O5-specific MAb MF15-4 (Panel B);

Figure 23 shows restriction maps of the chromosomal inserts from pFV100 and several pFV subclones, and the results of complementation studies of the SR mutants AK1401 and rd7513 with the pFV subclones are also shown;

Figure 24 shows a Southern analysis of the three rfc (wzy) chromosomal mutants, OP5.2, OP5.3, and OP5.5, showing the insertion of an 875 bp Gm^R cassette into the rfc (wzy) gene (panel C), and restriction maps of the PAO1 wild-type (panel A) and mutant (panel B) rfc (wzy) coding regions are shown;

Figure 25 shows a silver-stained SDS-PAGE gel (panel A) and Western blots of LPS from PAO1, AK1401 and the three rfc (wzy) chromosomal mutants, OP5.2, OP5.3, and OP5.5 (Panels B and C);

Figure 26 shows the restriction maps of recombinant plasmids pFV161, pFV401, and pFV402;

Figure 27 shows blots of Southern hybridizations of chromosomal DNA from PAO1 (lane 2) and rol (wzz) mutants (lanes 3 and 4);

Figure 28 shows Western immunoblots characterizing LPS from PAO1 and PAO1 rol (wzz) chromosomal mutants;

Figure 29 is an autoradiogram showing ³⁵S-labeled proteins expressed by pFV401, which contains the rol (wzz) gene, and the corresponding control plasmid vector pBluescript II SK in E. coli JM109DE3 by use of the T7 expression system;

Figure 30 is a diagram showing the structures of the O-antigens of P. aeruginosa serotypes related to O5;

Figure 31 shows E. coli σ⁷⁰ and similar regions in psbA (wbpA), hisH, psbG (wbpG), IS407 and psbN (wbpN);

Figure 32 shows features of the psb genes of the psb gene cluster, identifying the presumed start codon and spaces between the RBS (ribosome binding sequence) and the first codon;

Figure 33 shows the sequences of the NAD-binding domains of PsbA, PsbK, and PsbM aligned with those of other bacterial proteins involved in polysaccharide biosynthesis;

Figure 34 shows a sequence alignment for PsbA (WbpA), E. coli RffD, and B. solanacearum EpsD;

Figure 35 shows a sequence alignment for PsbD (WbpD) and Bordetella pertussis BplB, CysE of a number of bacteria;

Figure 36 shows a sequence alignment for PsbE (WbpE) and BP-BplC, BS-DegT, S-EryCl, S-DnrJ, and BS-SpsC;

Figure 37 shows a hydropathy index computation for sequence PsbF;

Figure 38 shows a sequence alignment for PA-PsbI, BP-BplD, EC-NfrC, BS-OrfX, and SB-RfbC;

Figure 39 shows a sequence alignment for PA-PsbJ, BP-BplE, and YE-TrsE;

Figure 40 shows a sequence alignment for PA-PsbL, YE-TrsF and HI-Rfe;

Figure 41 shows a sequence alignment for PsbM, TrsG, BP-BplL, and SA-CapD;

Figure 42 shows the nucleotide sequence of the rol (wzz) gene;

Figure 43 is a physical map of the 5' end of the wbp cluster;

Figure 44 is a comparison of hydropathy plots of selected Wzz-like proteins;

Figure 45 shows the expression of P. aeruginosa Wzz in vitro;

Figure 46A shows an SDS-PAGE gel of LPS from Wzz knockout mutants;

Figure 46B shows a western immunoblot using Mab 18-19;

Figure 46C shows a western immunoblot using Mab MF15-4;

Figure 47 shows the ability of P. aeruginosa O5 Wzz to function in E. coli;

Figure 48 shows an SDS-PAGE gel from WbpF knockout mutants;

Figure 49 shows the amino acid and nucleotide sequence encoding RpsA; and

Figure 50 shows the amino acid and nucleotide sequence encoding HimD.

DETAILED DESCRIPTION OF THE INVENTION

The following standard abbreviations for the amino acid residues are used throughout the specification: A, Ala - alanine; C, Cys - cysteine; D, Asp - aspartic acid; E, Glu - glutamic acid; F, Phe - phenylalanine; G, Gly - glycine; H, His - histidine; I, Ile - isoleucine; K, Lys - lysine; L, Leu - leucine; M, Met - methionine; N, Asn - asparagine; P, Pro - proline; Q, Gln - glutamine; R, Arg - arginine; S, Ser - serine; T, Thr - threonine; V, Val - valine; W, Trp - tryptophan; Y, Tyr - tyrosine; and pY, P.Tyr - phosphotyrosine.

I. Nucleic Acid Molecules of the Invention

As hereinbefore mentioned, the present invention relates to an isolated P. aeruginosa B-band gene cluster containing genes involved in the synthesis and assembly of O-antigen in P. aeruginosa. The present invention also relates to the isolated genes which comprise the cluster.

The term "isolated" refers to a nucleic acid substantially free of cellular material or culture medium when produced by recombinant DNA techniques, or chemical precursors, or other chemicals when chemically synthesized. The term "nucleic acid" is intended to include DNA and RNA and can be either double stranded or single stranded.

The P. aeruginosa B-band gene cluster comprises the following genes: rol (wzz), psbA (wbpA), psbB (wbpB), psbC (wbpC), psbD (wbpD), psbE (wbpE), rfc (wzy), psbF (wbpF), psbG (wbpG), psbH (wbpH), psbI (wbpI), psbJ (wbpJ), psbK (wbpK), psbL (wbpL), psbM (wbpM), and psbN (wbpN), involved in the synthesis and assembly of lipopolysaccharide in P. aeruginosa. The gene cluster may also contain the non-LPS genes hisH, hisF, himD, rpsA, uvrB, and the insertion element IS407 (IS1209). The genes preferably have the organization as shown in Figure 1 (SEQ ID NO 1). In Figure 1, the genes necessary for sugar biosynthesis (Man(2NAc3N)A and Man(2NAc3NAc) biosynthesis) are scattered throughout the gene cluster (wbpI (psbI), wbpE (psbE), wbpD (psbD), wbpB (psbB), wbpC (psbC)). The genes encoding transferases are interspersed throughout the wbp (psb) cluster (wbpH (psbH), wbpJ (psbJ), wbpL (psbL)), and are separated from one another by one gene each. The gene encoding the putative first transferase (WbpL (PsbL)), thought to initiate O-antigen assembly by attachment of an FucNAc residue to undecaprenol, is the most distal.

The invention provides nucleic acid molecules encoding the following proteins: (1) (a) Rol (Wzz), (b) PsbA (WbpA), (c) PsbB (WbpB), (d) PsbC (WbpC), (e) PsbD (WbpD), (f) PsbE (WbpE), (g) Rfc (Wzy), (h) PsbF (WbpF), (i) PsbG (WbpG), (j) PsbI (WbpI), (k) PsbJ (WbpJ), (l) PsbK (WbpK), (m) PsbM (WbpM), (n) PsbH (WbpH), and (o) PsbN (WbpN), involved in P. aeruginosa O-antigen synthesis and assembly; (2) UvrB involved in ultraviolet repair; (3) HisH or HisF involved in histidine synthesis; (4) HimD involved in host factor integration; and (5) RpsA, a 30S ribosomal subunit protein S1. In addition, nucleic acid molecules are provided which contain sequences encoding two or more of the following proteins: (1) (a) Rol (Wzz); (b) PsbA (WbpA); (c) PsbB (WbpB); (d) PsbC (WbpC); (e) PsbD (WbpD); (f) PsbE (WbpE); (g) Rfc (Wzy); (h) PsbF (WbpF); (i) HisH; (j) HisF; (k) PsbG (WbpG); (l) PsbI (WbpI); (m) PsbJ (WbpJ); (n) PsbK (WbpK); (o) PsbM (WbpM); (p) PsbN (WbpN); (q) PsbH (WbpH); (r) PsbL (WbpL); (s) RpsA; or (t) HimD.

In an embodiment of the invention, an isolated nucleic acid molecule is provided having a sequence which encodes a protein having an amino acid sequence as shown in Figure 3 or SEQ.ID. No.: 2; Figure 4 or SEQ.ID. No.: 3; Figure 5 or SEQ.ID. No.: 4; Figure 6 or SEQ.ID. No.: 5; Figure 7 or SEQ.ID. No.: 6; Figure 8 or SEQ.ID. No.: 7; Figure 9 or SEQ.ID. No.: 8; Figure 10 or SEQ.ID. No.: 9; Figure 11 or SEQ.ID. No.: 10; Figure 12 or SEQ.ID. No.: 11; Figure 13 or SEQ.ID. No.: 12; Figure 14 or SEQ.ID. No.: 13; Figure 15 or SEQ.ID. No.: 14; Figure 16 or SEQ.ID. No.: 15; Figure 17 or SEQ.ID. No.: 16; Figure 18 or SEQ.ID. No.: 17; Figure 19 or SEQ.ID. No.: 18; and Figure 20 or SEQ.ID. No.: 19.

Preferably, the purified and isolated nucleic acid molecule comprises

(a) a nucleic acid sequence containing nucleotides 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 1, wherein T can also be U;

(b) a nucleic acid sequence containing two or more of nucleotides 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9830-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 1, wherein T can also be U;

(c) nucleic acid sequences complementary to (a) or (b);

(d) nucleic acid sequences which are homologous to (a) or (b);

(e) a fragment of (a) to (d) that is at least 15 bases, preferably 20 to 30 bases, and which will hybridize to (a) to (d) under stringent hybridization conditions; or

(f) a nucleic acid molecule differing from any of the nucleic acids of (a) to (c) in codon sequences due to the degeneracy of the genetic code.
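The nucleotide ranges recited in item (a) can be handled programmatically. The following is a minimal sketch, assuming the ranges are 1-based and inclusive relative to SEQ ID NO: 1; the helper names are hypothetical, and the short toy string used in the usage lines stands in for the real sequence of Figure 2, which is not reproduced here.

```python
# Nucleotide ranges as recited in item (a), assumed 1-based and inclusive.
REGIONS = [
    (1, 479), (1286, 2596), (2670, 3620), (3689, 5578), (5575, 6066),
    (6152, 6982), (7236, 8552), (8549, 9499), (9831, 10388), (10388, 11143),
    (11281, 12411), (12427, 13548), (13545, 14633), (14651, 15892),
    (15889, 16851), (17935, 19144), (19678, 21675), (22302, 23693),
    (23704, 24417),
]


def region_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based, inclusive range."""
    return end - start + 1


def overlap(first: tuple, second: tuple) -> int:
    """Number of nucleotides shared by two inclusive ranges (0 if none)."""
    return max(0, min(first[1], second[1]) - max(first[0], second[0]) + 1)


def extract_region(sequence: str, start: int, end: int, as_rna: bool = False) -> str:
    """Slice a 1-based, inclusive range; optionally write T as U."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("range lies outside the sequence")
    fragment = sequence[start - 1:end]
    return fragment.replace("T", "U") if as_rna else fragment


def meets_minimum_fragment_length(fragment: str, minimum: int = 15) -> bool:
    """Check the 'at least 15 bases' length condition of item (e)."""
    return len(fragment) >= minimum


if __name__ == "__main__":
    for start, end in REGIONS:
        print(f"{start}-{end}: {region_length(start, end)} nt")
    toy = "ATGCATGC"  # hypothetical stand-in for SEQ ID NO: 1
    print(extract_region(toy, 2, 4, as_rna=True))
```

The overlap helper makes it easy to see that some adjacent ranges share a few nucleotides, for example 3689-5578 and 5575-6066.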

Specific embodiments of the nucleic acid molecule of the invention include the following:

1. An isolated nucleic acid molecule characterized by having a sequence encoding a Rol (Wzz) protein of P. aeruginosa which regulates O-antigen linking. The nucleic acid molecule preferably encodes Rol having the amino acid sequence as shown in Figure 3 or SEQ.ID. No.: 2, and most preferably comprises nucleotides 1-479 as shown in Figure 2 or SEQ.ID. No.: 1, or a nucleotide sequence as shown in Figure 42, which shows the full length nucleotide sequence of the rol gene.

2. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbA (WbpA) protein of P. aeruginosa which has dehydrogenase activity. The nucleic acid molecule preferably encodes PsbA having the amino acid sequence as shown in Figure 4 or SEQ.ID. No.: 3, and most preferably comprises nucleotides 1286-2596 as shown in Figure 2 or SEQ.ID. No.: 1.

3. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbB (WbpB) protein of P. aeruginosa. The nucleic acid molecule preferably encodes PsbB having the amino acid sequence as shown in Figure 5 or SEQ.ID. No.: 4, and most preferably comprises nucleotides 2670-3620 as shown in Figure 2 or SEQ.ID. No.: 1.

4. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbC (WbpC) protein of P. aeruginosa which has acetyltransferase activity. The nucleic acid molecule preferably encodes PsbC having the amino acid sequence as shown in Figure 6 or SEQ.ID. No.: 5, and most preferably comprises nucleotides 3689-5578 as shown in Figure 2 or SEQ.ID. No.: 1.

5. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbD (WbpD) protein of P. aeruginosa which has acetyltransferase activity. The nucleic acid molecule preferably encodes PsbD having the amino acid sequence as shown in Figure 7 or SEQ.ID. No.: 6, and most preferably comprises nucleotides 5575-6066 as shown in Figure 2 or SEQ.ID. No.: 1.

6. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbE (WbpE) protein of P. aeruginosa. The nucleic acid molecule preferably encodes PsbE having the amino acid sequence as shown in Figure 8 or SEQ.ID. No.: 7, and most preferably comprises nucleotides 6152-6982 as shown in Figure 2 or SEQ.ID. No.: 1.

7. An isolated nucleic acid molecule characterized by having a sequence encoding a Rfc (Wzy) protein of P. aeruginosa which has O-polymerase activity. The nucleic acid molecule preferably encodes Rfc having the amino acid sequence as shown in Figure 9 or SEQ.ID. No.: 8, and most preferably comprises nucleotides 7236-8552 as shown in Figure 2 or SEQ.ID. No.: 1. The nucleic acid molecule may comprise nucleotides 7236 to 8552 where base 8059 is "G". The Rfc coding region has a lower mol.% G+C than the P. aeruginosa chromosomal average and it has similar amino acid composition and codon usage to that reported for other Rfc proteins. Using a novel gene-replacement vector, the present inventors were able to generate PAO1 chromosomal rfc mutants. These knockout mutants express LPS containing complete core plus one O-repeat unit, indicating that they are no longer producing a functional O-polymerase enzyme.

8. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbF (WbpF) protein of P. aeruginosa. The nucleic acid molecule preferably encodes PsbF having the amino acid sequence as shown in Figure 10 or SEQ.ID. No.: 9, and most preferably comprises nucleotides 8549-9499 as shown in Figure 2 or SEQ.ID. No.: 1.

9. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbG (WbpG) protein of P. aeruginosa. The nucleic acid molecule preferably encodes PsbG having the amino acid sequence as shown in Figure 13 or SEQ.ID. No.: 12, and most preferably comprises nucleotides 11281-12411 as shown in Figure 2 or SEQ.ID. No.: 1. The present inventors have inserted a gentamicin cassette into psbG which resulted in B-band deficient mutants of PAO1.

10. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbH (WbpH) protein of P. aeruginosa which has ManA transferase activity. The nucleic acid molecule preferably encodes PsbH having the amino acid sequence as shown in Figure 14 or SEQ.ID. No.: 13, and most preferably comprises nucleotides 12427-13548 as shown in Figure 2 or SEQ.ID. No.: 1. The present inventors have produced a psbH knockout mutant of PAO1 which is B-band deficient.

11. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbI (WbpI) protein of P. aeruginosa which converts UDP-N-acetylglucosamine to UDP-N-acetylmannosamine. The nucleic acid molecule preferably encodes PsbI having the amino acid sequence as shown in Figure 15 or SEQ.ID. No.: 14, and most preferably comprises nucleotides 13545-14633 as shown in Figure 2 or SEQ.ID. No.: 1.

12. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbJ (WbpJ) protein of P. aeruginosa which has ManA transferase activity. The nucleic acid molecule preferably encodes PsbJ having the amino acid sequence as shown in Figure 16 or SEQ.ID. No.: 15, and most preferably comprises nucleotides 14651-15892 as shown in Figure 2 or SEQ.ID. No.: 1.

13. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbK (WbpK) protein of P. aeruginosa which has dehydratase activity. The nucleic acid molecule preferably encodes PsbK having the amino acid sequence as shown in Figure 17 or SEQ.ID. No.: 16, and most preferably comprises nucleotides 15889-16851 as shown in Figure 2 or SEQ.ID. No.: 1.

14. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbM (WbpM) protein of P. aeruginosa and having dehydrogenase activity. The nucleic acid molecule preferably encodes PsbM having the amino acid sequence as shown in Figure 18 or SEQ.ID. No.: 17, and most preferably comprises nucleotides 19678-21675 as shown in Figure 2 or SEQ.ID. No.: 1. PsbM knockout mutants do not produce LPS.

15. An isolated nucleic acid molecule characterized by having a sequence encoding a PsbN (WbpN) protein of P. aeruginosa. The nucleic acid molecule preferably encodes PsbN having the amino acid sequence as shown in Figure 19 or SEQ.ID. No.: 18, and most preferably comprises nucleotides 22302-23693 as shown in Figure 2 or SEQ.ID. No.: 1.

16. An isolated nucleic acid molecule characterized by having a sequence encoding a UvrB protein of P. aeruginosa which is involved in ultraviolet repair. The nucleic acid molecule preferably encodes UvrB having the amino acid sequence as shown in Figure 20 or SEQ.ID. No.: 19, and most preferably comprises nucleotides 23704-24417 as shown in Figure 2 or SEQ.ID. No.: 1.

17. An isolated nucleic acid molecule characterized by having a sequence encoding a RpsA protein for a 30S ribosomal subunit. The nucleic acid molecule preferably encodes RpsA having the amino acid sequence as shown in Figure 49.

18. An isolated nucleic acid molecule characterized by having a sequence encoding a HimD protein for a host integration factor. The nucleic acid molecule preferably encodes HimD having the amino acid sequence as shown in Figure 50.

In an embodiment of the invention, the nucleic acid molecule contains two genes from the gene cluster of the invention, preferably two genes which are adjacent in the gene cluster. For example, the present inventors have found that rfc (wzy) and psbF (wbpF) are cotranscribed and they are both required for B-band synthesis. If psbF (wbpF) is absent, both A-band and B-band synthesis are knocked out, indicating that its gene product is required for expression of A-band and B-band LPS onto the core oligosaccharide. Accordingly, the invention provides a nucleic acid molecule encoding a PsbF (WbpF) protein and an Rfc (Wzy) protein. Preferably, the nucleic acid molecule comprises nucleotides 7239 to 9499 as shown in Figure 2 or SEQ.ID. No.: 1.

It will be appreciated that the invention includes nucleic acid molecules encoding truncations of the proteins of the invention, and analogs and homologs of the proteins of the invention and truncations thereof, as described below. It will further be appreciated that variant forms of the nucleic acid molecules of the invention which arise by alternative splicing of an mRNA corresponding to a cDNA of the invention are encompassed by the invention.

Further, it will be appreciated that the invention includes nucleic acid molecules comprising nucleic acid sequences having substantial sequence homology with the nucleic acid sequences containing nucleotides 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 2 and fragments thereof. The term "sequences having substantial sequence homology" means those nucleic acid sequences which have slight or inconsequential sequence variations from these sequences, i.e. the sequences function in substantially the same manner to produce functionally equivalent proteins. The variations may be attributable to local mutations or structural modifications. Nucleic acid sequences having substantial homology include nucleic acid sequences having at least 80-90%, preferably 90% identity with the nucleic acid sequence 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 2. By way of example, it is expected that a sequence having 80% sequence homology with the DNA sequence encoding PsbM of the invention will provide a functional PsbM protein.
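The identity thresholds above can be illustrated with a minimal sketch. The specification does not state how percent identity is to be computed, so the following assumes two sequences that have already been aligned to equal length (gaps written as "-"); the function names and the toy sequences are hypothetical and chosen only for illustration.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical columns between two pre-aligned sequences.

    Assumption: both inputs are already aligned and have equal length,
    with '-' marking gaps. Columns that are gaps in both are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 80.0) -> bool:
    """True when the aligned pair reaches the given percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    reference = "ATGCATGCAT"   # toy sequence, not taken from Figure 2
    candidate = "ATGCATGCAA"   # differs at one of ten positions
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate, 90.0))
```

In practice the alignment step itself (and its gap scoring) affects the reported value, so a figure such as "80%" is only meaningful alongside the alignment method used.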

Another aspect of the invention provides a nucleic acid molecule, and fragments thereof having at least 15 bases, which hybridizes to the nucleic acid molecules of the invention under hybridization conditions, preferably stringent hybridization conditions. Appropriate stringency conditions which promote DNA hybridization are known to those skilled in the art, or may be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. For example, the following may be employed: 6.0 x sodium chloride/sodium citrate (SSC) at about 45°C, followed by a wash of 2.0 x SSC at 50°C. The stringency may be selected based on the conditions used in the wash step. For example, the salt concentration in the wash step can be selected from a high stringency of about 0.2 x SSC at 50°C. In addition, the temperature in the wash step can be at high stringency conditions, at about 65°C.

Isolated and purified nucleic acid molecules having sequences which differ from the nucleic acid sequence shown in SEQ ID NO:1 or Figure 2, and the nucleic acid sequences 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 1, due to degeneracy in the genetic code are also within the scope of the invention. Such nucleic acids encode functionally equivalent proteins (e.g., a PsbM (WbpM) protein having dehydrogenase activity) but differ in sequence from the above mentioned sequences due to degeneracy in the genetic code.
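Degeneracy of the genetic code can be illustrated with a short sketch: two DNA sequences that differ at synonymous codon positions translate to the same amino acid sequence. The example below assumes the standard codon assignments and uses toy sequences that are not taken from Figure 2; the function name is hypothetical.

```python
from itertools import product

# Standard codon assignments, listed in T, C, A, G order for each position.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    variant_1 = "ATGGCTAAA"  # Met-Ala-Lys using GCT and AAA
    variant_2 = "ATGGCCAAG"  # Met-Ala-Lys using GCC and AAG
    print(translate(variant_1), translate(variant_2))
    print(translate(variant_1) == translate(variant_2))
```

The two variants differ at two of nine nucleotides yet encode the same peptide, which is the sense in which degenerate sequences "encode functionally equivalent proteins".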

An isolated nucleic acid molecule of the invention which comprises DNA can be isolated by preparing a labelled nucleic acid probe based on all or part of the nucleic acid sequences containing nucleotides 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 2, and using this labelled nucleic acid probe to screen an appropriate DNA library (e.g. a cDNA or genomic DNA library). For example, a whole genomic library isolated from a microorganism, such as a serotype of P. aeruginosa, can be used to isolate a DNA encoding a novel protein of the invention by screening the library with the labelled probe using standard techniques. Nucleic acids isolated by screening of a cDNA or genomic DNA library can be sequenced by standard techniques.

An isolated nucleic acid molecule of the invention which is DNA can also be isolated by selectively amplifying a nucleic acid encoding a novel protein of the invention using the polymerase chain reaction (PCR) methods and cDNA or genomic DNA. It is possible to design synthetic oligonucleotide primers from the nucleic acid molecules containing the nucleotides 1-479, 1286-2596, 2670-3620, 3689-5578, 5575-6066, 6152-6982, 7236-8552, 8549-9499, 9831-10388, 10388-11143, 11281-12411, 12427-13548, 13545-14633, 14651-15892, 15889-16851, 17935-19144, 19678-21675, 22302-23693, or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 2, for use in PCR. A nucleic acid can be amplified from cDNA or genomic DNA using these oligonucleotide primers and standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. It will be appreciated that cDNA may be prepared from mRNA, by isolating total cellular mRNA by a variety of techniques, for example, by using the guanidinium-thiocyanate extraction procedure of Chirgwin et al., Biochemistry, 18, 5294-5299 (1979). cDNA is then synthesized from the mRNA using reverse transcriptase (for example, Moloney MLV reverse transcriptase available from Gibco/BRL, Bethesda, MD, or AMV reverse transcriptase available from Seikagaku America, Inc., St. Petersburg, FL).

An isolated nucleic acid molecule of the invention which is RNA can be isolated by cloning a cDNA encoding a novel protein of the invention into an appropriate vector which allows for transcription of the cDNA to produce an RNA molecule which encodes a novel protein of the invention. For example, a cDNA can be cloned downstream of a bacteriophage promoter (e.g. a T7 promoter) in a vector, cDNA can be transcribed in vitro with T7 polymerase, and the resultant RNA can be isolated by standard techniques.

A nucleic acid molecule of the invention may also be chemically synthesized using standard techniques. Various methods of chemically synthesizing polydeoxynucleotides are known, including solid-phase synthesis which, like peptide synthesis, has been fully automated in commercially available DNA synthesizers (See e.g., Itakura et al. U.S. Patent No. 4,598,049; Caruthers et al. U.S. Patent No. 4,458,066; and Itakura U.S. Patent Nos. 4,401,796 and 4,373,071).

Determination of whether a particular nucleic acid molecule encodes a novel protein of the invention may be accomplished by expressing the cDNA in an appropriate host cell by standard techniques, and testing the activity of the protein using the methods as described herein. For example, the activity of a putative PsbM protein may be tested by mixing with an appropriate substrate and assaying for dehydrogenase activity. A cDNA having the activity of a novel protein of the invention so isolated can be sequenced by standard techniques, such as dideoxynucleotide chain termination or Maxam-Gilbert chemical sequencing, to determine the nucleic acid sequence and the predicted amino acid sequence of the encoded protein.

The initiation codon and untranslated sequences of the nucleic acid molecules of the invention may be determined using currently available computer software designed for the purpose, such as PC/Gene (IntelliGenetics Inc., Calif.). Regulatory elements can be identified using conventional techniques. The function of the elements can be confirmed by using these elements to express a reporter gene which is operatively linked to the elements. These constructs may be introduced into cultured cells using standard procedures. In addition to identifying regulatory elements in DNA, such constructs may also be used to identify proteins interacting with the elements, using techniques known in the art.

The sequence of a nucleic acid molecule of the invention may be inverted relative to its normal presentation for transcription to produce an antisense nucleic acid molecule. Preferably, an antisense sequence is constructed by inverting a region preceding the initiation codon or an unconserved region. In particular, the nucleic acid sequences contained in the nucleic acid molecules of the invention or a fragment thereof, preferably one or more of the nucleic acid sequences shown in the Sequence Listing as SEQ. ID. NO. 1 and in Figure 2 (i.e. a nucleic acid molecule containing nucleotides 1-479; 1286-2596; 2670-3620; 3689-5578; 5575-6066; 6152-6982; 7236-8552; 8549-9499; 9831-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 17935-19144; 19678-21675; 22302-23693; or 23704-24417) may be inverted relative to their normal presentation for transcription to produce antisense nucleic acid molecules.
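As an illustration of the inversion described above, the following sketch extracts a region by nucleotide coordinates and returns its reverse complement. It assumes that the coordinate ranges are 1-based and inclusive and that "inverted" corresponds to the reverse complement of the sense strand; both are assumptions for illustration, and the function names and toy sequence are hypothetical.

```python
# Complement map for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def region(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end, assuming 1-based inclusive coordinates."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


def antisense(sequence: str) -> str:
    """Reverse complement of a sense-strand DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    cluster = "AATGGCCTT"           # toy stand-in for the Figure 2 sequence
    sense = region(cluster, 2, 7)   # "ATGGCC"
    print(sense, antisense(sense))  # ATGGCC GGCCAT
```

Applying `antisense` twice returns the original sequence, which is a quick sanity check when preparing such constructs in silico.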

The antisense nucleic acid molecules of the invention or a fragment thereof, may be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed with mRNA or the native gene e.g. phosphorothioate derivatives and acridine substituted nucleotides. The antisense sequences may be produced biologically using an expression vector introduced into cells in the form of a recombinant plasmid, phagemid or attenuated virus in which antisense sequences are produced under the control of a high efficiency regulatory region, the activity of which may be determined by the cell type into which the vector is introduced. The invention also provides nucleic acids encoding fusion proteins comprising a novel protein of the invention and a selected protein, or a selectable marker protein (see below).

II. Novel Proteins of the Invention

The invention further broadly contemplates an isolated protein characterized in that it has part or all of the primary structural conformation (i.e. continuous sequence of amino acid residues) of a novel protein encoded by a gene of the psb gene cluster of the invention. In an embodiment of the invention, an isolated protein is provided which has the amino acid sequence as shown in Figure 3 or SEQ ID NO:2 (Rol or Wzz); Figure 4 or SEQ ID NO:3 (PsbA or WbpA); Figure 5 or SEQ ID NO:4 (PsbB or WbpB); Figure 6 or SEQ ID NO:5 (PsbC or WbpC); Figure 7 or SEQ ID NO:6 (PsbD or WbpD); Figure 8 or SEQ ID NO:7 (PsbE or WbpE); Figure 9 or SEQ ID NO:8 (Rfc or Wzy); Figure 10 or SEQ ID NO:9 (PsbF or WbpF); Figure 11 or SEQ ID NO:10 (HisH); Figure 12 or SEQ ID NO:11 (HisF); Figure 13 or SEQ ID NO:12 (PsbG or WbpG); Figure 14 or SEQ ID NO:13 (PsbH or WbpH); Figure 15 or SEQ ID NO:14 (PsbI or WbpI); Figure 16 or SEQ ID NO:15 (PsbJ or WbpJ); Figure 17 or SEQ ID NO:16 (PsbK or WbpK); Figure 18 or SEQ ID NO:17 (PsbM or WbpM); Figure 19 or SEQ ID NO:18 (PsbN or WbpN); or Figure 20 or SEQ ID NO:19 (UvrB).

The gene products of rol, psbA, psbB, psbC, psbD, psbE, rfc, psbF, hisH, hisF, psbG, psbH, psbI, psbJ, psbL, and psbK (also known as wzz, wbpA, wbpB, wbpC, wbpD, wbpE, wzy, wbpF, hisH, hisF, wbpG, wbpH, wbpI, wbpJ, respectively) are expected to be found in serotypes O2, O5, O16, O18, and O20, and the gene products of psbM and psbN (also known as wbpM and wbpN, respectively) are expected to be found in serotypes O1 to O20. The gene products of hisF and hisH are not found in serotype O6.

Specific embodiments of the invention include the following:

1. An isolated Rol (Wzz) protein of P. aeruginosa which regulates O-antigen linking, having the amino acid sequence as shown in Figure 3 or SEQ.ID. No.: 2. The function of Rol may be associated with the Rfc protein.

2. An isolated PsbA (WbpA) protein of P. aeruginosa which has dehydrogenase activity, and the amino acid sequence as shown in Figure 4 or SEQ.ID. No.: 3. PsbA may be involved in the biosynthesis of mannuronic acid residues.

3. An isolated PsbB (WbpB) protein of P. aeruginosa having the amino acid sequence as shown in Figure 5 or SEQ.ID. No.: 4. PsbB may be involved in Fuc2NAc biosynthesis.

4. An isolated PsbC (WbpC) protein of P. aeruginosa which has acetyltransferase activity and the amino acid sequence as shown in Figure 6 or SEQ.ID. No.: 5. PsbC may be involved in the acetylation of mannuronic acid residues in the O-antigen.

5. An isolated PsbD (WbpD) protein of P. aeruginosa which has acetyltransferase activity and the amino acid sequence as shown in Figure 7 or SEQ.ID. No.: 6. PsbD may be involved in the acetylation of mannuronic acid residues in the O-antigen.

6. An isolated PsbE (WbpE) protein of P. aeruginosa having the amino acid sequence as shown in Figure 8 or SEQ.ID. No.: 7. PsbE may be involved in the biosynthesis of 2,3-, 2,4-, and 2,6-dideoxy sugars such as 2,3-dideoxy mannuronic acid produced by P. aeruginosa O5.

7. An isolated Rfc (Wzy) protein of P. aeruginosa which has O-polymerase activity and the amino acid sequence as shown in Figure 9 or SEQ.ID. No.: 8. The Rfc protein is characterized as very hydrophobic, and it is an integral membrane protein with 11 putative membrane spanning domains.

8. An isolated PsbF (WbpF) protein of P. aeruginosa having the amino acid sequence as shown in Figure 10 or SEQ.ID. No.: 9. PsbF is translationally coupled with rfc and it is a putative flippase.

9. An isolated PsbG (WbpG) protein of P. aeruginosa which has the amino acid sequence as shown in Figure 13 or SEQ.ID. No.: 12.

10. An isolated PsbH (WbpH) protein of P. aeruginosa which has ManA transferase activity and the amino acid sequence as shown in Figure 14 or SEQ.ID. No.: 13. PsbH may be involved in the addition of ManA (i.e. Man(2NAc3N)A) to the O-antigen unit.

11. An isolated PsbI (WbpI) protein of P. aeruginosa which converts UDP-N-acetylglucosamine to UDP-N-acetylmannosamine, and has the amino acid sequence as shown in Figure 15 or SEQ.ID. No.: 14.

12. An isolated PsbJ (WbpJ) protein of P. aeruginosa which has ManA transferase activity, and the amino acid sequence as shown in Figure 16 or SEQ.ID. No.: 15.

Based on their gene order and their relative hydropathic indices, the psbJ and psbH gene products are thought to transfer Man(NAc)2A and Man(2NAc3N)A, respectively.

13. An isolated PsbK (WbpK) protein of P. aeruginosa which has dehydratase activity, and the amino acid sequence as shown in Figure 17 or SEQ.ID. No.: 16.

14. An isolated PsbM (WbpM) protein of P. aeruginosa having dehydrogenase activity, and the amino acid sequence as shown in Figure 18 or SEQ.ID. No.: 17. PsbM is involved in the biosynthesis of N-acetylfucosamine residues of the O-antigen. PsbM contains 2 NAD binding domains.

15. An isolated PsbN (WbpN) protein of P. aeruginosa having the amino acid sequence as shown in Figure 19 or SEQ.ID. No.: 18.

16. An UvrB protein of P. aeruginosa which is involved in ultraviolet repair and has the amino acid sequence as shown in Figure 20 or SEQ.ID. No.: 19.

The molecular weights, isoelectric points, and hydropathic indices of the Rol (Wzz), PsbA (WbpA), PsbB (WbpB), PsbC (WbpC), PsbD (WbpD), PsbE (WbpE), Rfc (Wzy), PsbF (WbpF), PsbG (WbpG), PsbH (WbpH), PsbI (WbpI), PsbJ (WbpJ), PsbK (WbpK), PsbM (WbpM) and PsbN (WbpN) proteins are shown in Table 1.

Within the context of the present invention, a protein of the invention may include various structural forms of the primary protein which retain biological activity. For example, a protein of the invention may be in the form of acidic or basic salts or in neutral form. In addition, individual amino acid residues may be modified by oxidation or reduction.

In addition to the full length amino acid sequences (Figures 3 to 20 or SEQ ID NOS: 2 to 19), the proteins of the present invention may also include truncations of the proteins, and analogs, and homologs of the proteins and truncations thereof as described herein. Truncated proteins may comprise peptides of at least fifteen amino acid residues.

The proteins of the invention may also include analogs of the proteins having the amino acid sequences shown in Figures 3 to 20, or SEQ ID NOS: 2 to 19 and/or truncations thereof as described herein, which may include, but are not limited to, an amino acid sequence containing one or more amino acid substitutions, insertions, and/or deletions. Amino acid substitutions may be of a conserved or non-conserved nature. Conserved amino acid substitutions involve replacing one or more amino acids of the proteins of the invention with amino acids of similar charge, size, and/or hydrophobicity characteristics. When only conserved substitutions are made the resulting analog should be functionally equivalent. Non-conserved substitutions involve replacing one or more amino acids of the amino acid sequence with one or more amino acids which possess dissimilar charge, size, and/or hydrophobicity characteristics.

One or more amino acid insertions may be introduced into the amino acid sequences shown in Figures 3 to 20, or SEQ ID NOS: 2 to 19. Amino acid insertions may consist of single amino acid residues or sequential amino acids ranging from 2 to 15 amino acids in length. For example, amino acid insertions may be used to destroy target sequences so that the protein is no longer active. This procedure may be used in vivo to inhibit the activity of a protein of the invention.

Deletions may consist of the removal of one or more amino acids, or discrete portions from the amino acid sequences shown in Figures 3 to 20 or SEQ ID NOS: 2 to 19. The deleted amino acids may or may not be contiguous. The lower limit length of the resulting analog with a deletion mutation is about 10 amino acids, preferably 100 amino acids.

Analogs of a protein of the invention may be prepared by introducing mutations in the nucleotide sequence encoding the protein. Mutations in nucleotide sequences constructed for expression of analogs of a protein of the invention must preserve the reading frame of the coding sequences. Furthermore, the mutations will preferably not create complementary regions that could hybridize to produce secondary mRNA structures, such as loops or hairpins, which could adversely affect translation of the receptor mRNA.
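The requirement that mutations preserve the reading frame can be checked with a minimal sketch: a net insertion or deletion keeps the downstream frame only when the change in length is a multiple of three nucleotides, and an in-frame change should not introduce a stop codon before the end of the coding sequence. The function names and toy sequences below are hypothetical and assume the standard stop codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def preserves_reading_frame(native: str, mutant: str) -> bool:
    """True when the net length change is a whole number of codons."""
    return (len(mutant) - len(native)) % 3 == 0


def has_internal_stop(coding: str) -> bool:
    """True when a stop codon occurs before the final codon of the sequence."""
    coding = coding.upper()
    internal = [coding[i:i + 3] for i in range(0, len(coding) - 3, 3)]
    return any(codon in STOP_CODONS for codon in internal)


if __name__ == "__main__":
    native = "ATGAAACCCTAA"          # toy coding sequence: Met-Lys-Pro-stop
    in_frame = "ATGAAAGGGCCCTAA"     # three-nucleotide insertion
    frameshift = "ATGAAACCTAA"       # single-nucleotide deletion
    print(preserves_reading_frame(native, in_frame))    # True
    print(preserves_reading_frame(native, frameshift))  # False
    print(has_internal_stop(in_frame))                  # False
```

A length check alone does not detect compensating frameshifts within the mutated region, so translating the mutant and comparing it with the intended analog remains the more complete test.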

Mutations may be introduced at particular loci by synthesizing oligonucleotides containing a mutant sequence, flanked by restriction sites enabling ligation to fragments of the native sequence. Following ligation, the resulting reconstructed sequence encodes an analog having the desired amino acid insertion, substitution, or deletion.

Alternatively, oligonucleotide-directed site specific mutagenesis procedures may be employed to provide an altered gene having particular codons altered according to the substitution, deletion, or insertion required. Deletion or truncation of a protein of the invention may also be constructed by utilizing convenient restriction endonuclease sites adjacent to the desired deletion. Subsequent to restriction, overhangs may be filled in, and the DNA religated. Exemplary methods of making the alterations set forth above are disclosed by Sambrook et al (Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, 1989).

The proteins of the invention also include homologs of the amino acid sequences shown in Figures 3 to 20, or SEQ.ID. NOS:2 to 19 and/or truncations thereof as described herein. Such homologs are proteins whose amino acid sequences are comprised of amino acid sequences that hybridize under stringent hybridization conditions (see discussion of stringent hybridization conditions herein) with a probe used to obtain a protein of the invention. Homologs of a protein of the invention will have the same regions which are characteristic of the protein.

Amino acid homologies for WbpA, WbpD, WbpE, HisH, HisF, WbpI, WbpJ, WbpK, WbpM and Wzz proteins are shown in Tables 2 to 4. It will be appreciated that the invention includes WbpA, WbpD, WbpE, HisH, HisF, WbpI, WbpJ, WbpK, WbpM and Wzz proteins having at least 51%, 84%, 76%, 57%, 54%, 70%, 53%, 54%, 61% and 51% homology, respectively.
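The "respectively" pairing of proteins and minimum homology values can be made explicit with a small lookup sketch. The values are those listed in the paragraph above; the dictionary and function names are hypothetical, and how the observed percentage is measured is left open, as it is in the text.

```python
# Pair each protein with its stated minimum percent homology, in the order given.
_PROTEINS = ["WbpA", "WbpD", "WbpE", "HisH", "HisF",
             "WbpI", "WbpJ", "WbpK", "WbpM", "Wzz"]
_MINIMA = [51, 84, 76, 57, 54, 70, 53, 54, 61, 51]
MIN_HOMOLOGY_PERCENT = dict(zip(_PROTEINS, _MINIMA))


def meets_stated_minimum(protein: str, observed_percent: float) -> bool:
    """True when the observed homology reaches the minimum stated for that protein."""
    return observed_percent >= MIN_HOMOLOGY_PERCENT[protein]


if __name__ == "__main__":
    for name, minimum in MIN_HOMOLOGY_PERCENT.items():
        print(f"{name}: at least {minimum}%")
    print(meets_stated_minimum("WbpM", 61.0))
```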

The invention also contemplates isoforms of the proteins of the invention. An isoform contains the same number and kinds of amino acids as a protein of the invention, but the isoform has a different molecular structure. The isoforms contemplated by the present invention are those having the same properties as a protein of the invention as described herein.

The present invention also includes a protein of the invention conjugated with a selected protein, or a selectable marker protein (see below) to produce fusion proteins. Additionally, immunogenic portions of a protein of the invention are within the scope of the invention.

The proteins of the invention (including truncations, analogs, etc.) may be prepared using recombinant DNA methods. Accordingly, the nucleic acid molecules of the present invention having a sequence which encodes a protein of the invention may be incorporated in a known manner into an appropriate expression vector which ensures good expression of the protein. Possible expression vectors include but are not limited to cosmids, plasmids, or modified viruses (e.g. replication defective retroviruses, adenoviruses and adeno-associated viruses), so long as the vector is compatible with the host cell used. That the expression vectors are "suitable for transformation of a host cell" means that the expression vectors contain a nucleic acid molecule of the invention and regulatory sequences selected on the basis of the host cells to be used for expression, which are operatively linked to the nucleic acid molecule. Operatively linked is intended to mean that the nucleic acid is linked to regulatory sequences in a manner which allows expression of the nucleic acid. The invention therefore contemplates a recombinant expression vector of the invention containing a nucleic acid molecule of the invention, or a fragment thereof, and the necessary regulatory sequences for the transcription and translation of the inserted protein-sequence. Suitable regulatory sequences may be derived from a variety of sources, including bacterial, fungal, or viral genes (for example, see the regulatory sequences described in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990)). Selection of appropriate regulatory sequences is dependent on the host cell chosen as discussed below, and may be readily accomplished by one of ordinary skill in the art. Examples of such regulatory sequences include: a transcriptional promoter and enhancer or RNA polymerase binding sequence, a ribosomal binding sequence, including a translation initiation signal.
Additionally, depending on the host cell chosen and the vector employed, other sequences, such as an origin of replication, additional DNA restriction sites, enhancers, and sequences conferring inducibility of transcription may be incorporated into the expression vector. It will also be appreciated that the necessary regulatory sequences may be supplied by the native protein and/or its flanking regions. The invention further provides a recombinant expression vector comprising a DNA nucleic acid molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in a manner which allows for expression, by transcription of the DNA molecule, of an RNA molecule which is antisense to a nucleotide sequence comprising 1-479; 1293-2596; 2670-3620; 3277-5577; 5574-6065; 6151-6981; 7235-8551; 8548-9498; 9830-10388; 10388-11143; 11281-12411; 12427-13548; 13545-14633; 14651-15892; 15889-16851; 18032-19141; 19678-21675; 22302-23693; or 23704-24417, as shown in Figure 2 or SEQ. ID. NO.: 2. Regulatory sequences operatively linked to the antisense nucleic acid can be chosen which direct the continuous expression of the antisense RNA molecule.

The recombinant expression vectors of the invention may also contain a selectable marker gene which facilitates the selection of host cells transformed or transfected with a recombinant molecule of the invention. Examples of selectable marker genes are genes encoding a protein such as G418 and hygromycin which confer resistance to certain drugs, β-galactosidase, chloramphenicol acetyltransferase, or firefly luciferase. Transcription of the selectable marker gene is monitored by changes in the concentration of the selectable marker protein such as β-galactosidase, chloramphenicol acetyltransferase, or firefly luciferase. If the selectable marker gene encodes a protein conferring antibiotic resistance such as neomycin resistance, transformant cells can be selected with G418. Cells that have incorporated the selectable marker gene will survive, while the other cells die. This makes it possible to visualize and assay for expression of recombinant expression vectors of the invention and in particular to determine the effect of a mutation on expression and phenotype. It will be appreciated that selectable markers can be introduced on a separate vector from the nucleic acid of interest.

The recombinant expression vectors may also contain genes which encode a fusion moiety which provides increased expression of the recombinant protein, increased solubility of the recombinant protein, and aids in the purification of a target recombinant protein by acting as a ligand in affinity purification. For example, a proteolytic cleavage site may be added to the target recombinant protein to allow separation of the recombinant protein from the fusion moiety subsequent to purification of the fusion protein. Typical fusion expression vectors include pGEX (Amrad Corp., Melbourne, Australia), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the recombinant protein.

Recombinant expression vectors can be introduced into host cells to produce a transformant host cell. The term "transformant host cell" is intended to include prokaryotic and eukaryotic cells which have been transformed or transfected with a recombinant expression vector of the invention. The terms "transformed with", "transfected with", "transformation" and "transfection" are intended to encompass introduction of nucleic acid (e.g. a vector) into a cell by one of many possible techniques known in the art. Prokaryotic cells can be transformed with nucleic acid by, for example, electroporation or calcium-chloride mediated transformation. Nucleic acid can be introduced into mammalian cells via conventional techniques such as calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofectin, electroporation or microinjection. Suitable methods for transforming and transfecting host cells can be found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory textbooks.

Suitable host cells include a wide variety of prokaryotic and eukaryotic host cells. For example, the proteins of the invention may be expressed in bacterial cells such as E. coli, insect cells (using baculovirus), yeast cells or mammalian cells. Other suitable host cells can be found in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1991).

More particularly, bacterial host cells suitable for carrying out the present invention include E. coli, as well as many other bacterial species well known to one of ordinary skill in the art. Bacterial expression vectors preferably comprise a promoter which functions in the host cell, one or more selectable phenotypic markers, and a bacterial origin of replication. Representative promoters include the β-lactamase (penicillinase) and lactose promoter system (see Chang et al., Nature 275:615, 1978), the trp promoter (Nichols and Yanofsky, Meth. in Enzymology 101:155, 1983) and the tac promoter (Russell et al., Gene 20:231, 1982). Representative selectable markers include various antibiotic resistance markers such as the kanamycin or ampicillin resistance genes. Suitable expression vectors include but are not limited to bacteriophages such as lambda derivatives or plasmids such as pBR322 (see Bolivar et al., Gene 2:95, 1977), the pUC plasmids pUC18, pUC19, pUC118, pUC119 (see Messing, Meth. in Enzymology 101:20-77, 1983 and Vieira and Messing, Gene 19:259-268, 1982), and pNH8A, pNH16a, pNH18a, and Bluescript M13 (Stratagene, La Jolla, Calif.).

Yeast and fungi host cells suitable for carrying out the present invention include, but are not limited to, Saccharomyces cerevisiae, the genera Pichia or Kluyveromyces and various species of the genus Aspergillus. Examples of vectors for expression in the yeast S. cerevisiae include pYepSec1 (Baldari et al., (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz, (1982) Cell 30:933-943), pJRY88 (Schultz et al., (1987) Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, CA). Protocols for the transformation of yeast and fungi are well known to those of ordinary skill in the art (see Hinnen et al., PNAS USA 75:1929, 1978; Itoh et al., J. Bacteriology 153:163, 1983; and Cullen et al., Bio/Technology 5:369, 1987).

The proteins of the invention may also be prepared by chemical synthesis using techniques well known in the chemistry of proteins such as solid phase synthesis (Merrifield, 1964, J. Am. Chem. Assoc. 85:2149-2154) or synthesis in homogeneous solution (Houbenweyl, 1987, Methods of Organic Chemistry, ed. E. Wansch, Vol. 15 I and II, Thieme, Stuttgart).

III. Applications

Detection of Nucleic Acid Molecules, Antibodies, and Diagnostic Applications

The nucleic acid molecules of the invention allow those skilled in the art to construct nucleotide probes for use in the detection of nucleotide sequences in a sample. A nucleotide probe may be labelled with a detectable marker such as a radioactive label which provides for an adequate signal and has sufficient half life, such as ³²P, ³H, ¹⁴C or the like. Other detectable markers which may be used include antigens that are recognized by a specific labelled antibody, fluorescent compounds, enzymes, antibodies specific for a labelled antigen, and chemiluminescent compounds. An appropriate label may be selected having regard to the rate of hybridization and binding of the probe to the nucleotide to be detected and the amount of nucleotide available for hybridization.

The nucleotide probes may be used to detect genes that encode proteins related to or analogous to proteins of the invention.

Accordingly, the present invention also relates to a method of detecting the presence of nucleic acid molecules encoding a protein of the invention in a sample, comprising contacting the sample under hybridization conditions with one or more nucleotide probes which hybridize to the nucleic acid molecules and are labelled with a detectable marker, and determining the degree of hybridization between the nucleic acid molecule in the sample and the nucleotide probes. In an embodiment of the invention, a method for detecting P. aeruginosa serotypes O1 to O20 in a sample is provided, comprising contacting the sample with a nucleotide sequence encoding PsbM, or PsbN, or a fragment thereof, under conditions which permit the nucleic acid molecule to hybridize with a complementary sequence in the sample to form a hybridization product, and assaying for the hybridization product. In another embodiment of the invention, a method for detecting P. aeruginosa serotypes O2, O5, O16, O18 and O20 in a sample is provided, comprising contacting the sample with a nucleotide sequence encoding one or more of Rol, PsbB, PsbC, PsbD, PsbE, Rfc, PsbF, PsbG, PsbH, PsbI, PsbJ, PsbK (also known as Wzz, WbpB, WbpC, WbpD, WbpE, Wzy, WbpF, WbpG, WbpH, WbpI, WbpJ, WbpK, respectively), HisH, or HisF, or a fragment thereof, under conditions which permit the nucleic acid molecule to hybridize with complementary sequences in the sample to form hybridization products, and assaying for the hybridization products.

Hybridization conditions which may be used in the methods of the invention are known in the art and are described, for example, in Sambrook J, Fritsch EF, Maniatis T. In: Molecular Cloning, A Laboratory Manual, 1989 (Nolan C, Ed.), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. The hybridization product may be assayed using techniques known in the art. The nucleotide probe may be labelled with a detectable marker as described herein and the hybridization product may be assayed by detecting the detectable marker or the detectable change produced by the detectable marker.

The nucleic acid molecule of the invention also permits the identification and isolation, or synthesis, of nucleotide sequences which may be used as primers to amplify a nucleic acid molecule of the invention, for example in the polymerase chain reaction (PCR), which is discussed in more detail below. The primers may be used to amplify the genomic DNA of other bacterial species known to have LPS. The PCR amplified sequences can be examined to determine the relationship between the various LPS genes.

The length and bases of the primers for use in the PCR are selected so that they will hybridize to different strands of the desired sequence and at relative positions along the sequence such that an extension product synthesized from one primer, when it is separated from its template, can serve as a template for extension of the other primer into a nucleic acid of defined length.
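The geometry described above can be illustrated with a minimal in silico sketch. It is not part of the disclosed method: the function names, the exact-match requirement, and the toy sequences are all assumptions made for illustration. The forward primer is taken to match the top strand and the reverse primer the bottom strand, so that the two primers bracket a product of defined length.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def predicted_amplicon(template: str, forward: str, reverse: str):
    """Return the top-strand product bracketed by the two primers, or None.

    Assumption: both primers match their target perfectly. Real primers
    tolerate mismatches, which this sketch does not model.
    """
    top = template.upper()
    start = top.find(forward.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the top strand, so its reverse
    # complement is what appears when reading the top strand 5' to 3'.
    reverse_site = reverse_complement(reverse)
    end = top.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return top[start:end + len(reverse_site)]


# Hypothetical toy template and primers, not sequences from the psb cluster.
template = "GGATCC" + "ATGCGTACC" + "GTTAGCAAGCTT" + "GGCATTCGA" + "TT"
product = predicted_amplicon(template, "ATGCGTACC", "TCGAATGCC")
print(len(product))
```

The product length is fixed by the two primer positions alone, which is why the amplified fragment can later be identified by its size.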

Primers which may be used in the invention are oligonucleotides, i.e. molecules containing two or more deoxyribonucleotides of the nucleic acid molecule of the invention, which occur naturally as in a purified restriction endonuclease digest or are produced synthetically using techniques known in the art, such as for example phosphotriester and phosphodiester methods (see Good et al., Nucl. Acid Res. 4:2157, 1977) or automated techniques (see for example Conolly, B.A., Nucleic Acids Res. 15(7):3131, 1987). The primers are capable of acting as a point of initiation of synthesis when placed under conditions which permit the synthesis of a primer extension product which is complementary to the DNA sequence of the invention, i.e. in the presence of nucleotide substrates, an agent for polymerization such as DNA polymerase, and at suitable temperature and pH. Preferably, the primers are sequences that do not form secondary structures by base pairing with other copies of the primer, and that do not form a hairpin configuration. The primer preferably contains between about 7 and 25 nucleotides.
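The two preferences stated above (a length of about 7 to 25 nucleotides, and no base pairing of the primer with itself or with other copies of itself) can be screened with a short sketch such as the following. The function names and the 4-base stem threshold are illustrative assumptions, not values taken from this description, and the check is a simple heuristic rather than a thermodynamic calculation.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def has_self_complementarity(primer: str, stem: int = 4) -> bool:
    """True if any `stem`-base stretch of the primer could pair with a
    stretch of the same primer (a possible hairpin or primer dimer).

    The default stem length of 4 is an assumed, arbitrary threshold.
    """
    p = primer.upper()
    kmers = {p[i:i + stem] for i in range(len(p) - stem + 1)}
    rc = reverse_complement(p)
    return any(rc[i:i + stem] in kmers for i in range(len(rc) - stem + 1))


def check_primer(primer: str, min_len: int = 7, max_len: int = 25) -> dict:
    """Report whether a candidate primer meets the stated preferences."""
    return {
        "length_ok": min_len <= len(primer) <= max_len,
        "self_complementary": has_self_complementarity(primer),
    }


# Hypothetical candidate primers for illustration only.
print(check_primer("AACCACTCAACCTAC"))
print(check_primer("GGGGAAAACCCC"))
```

A candidate flagged as self-complementary is not necessarily unusable; the flag only marks it for closer inspection.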

The primers may be labelled with detectable markers which allow for detection of the amplified products. Suitable detectable markers are radioactive markers such as P-32, S-35, I-125, and H-3; luminescent markers such as chemiluminescent markers, preferably luminol; fluorescent markers, preferably dansyl chloride, fluorescein-5-isothiocyanate, and 4-fluor-7-nitrobenz-2-oxa-1,3 diazole; enzyme markers such as horseradish peroxidase, alkaline phosphatase, β-galactosidase, acetylcholinesterase; or biotin.

It will be appreciated that the primers may contain non-complementary sequences provided that a sufficient amount of the primer contains a sequence which is complementary to a nucleic acid molecule of the invention or oligonucleotide fragment thereof, which is to be amplified. Restriction site linkers may also be incorporated into the primers, allowing for digestion of the amplified products with the appropriate restriction enzymes, facilitating cloning and sequencing of the amplified product.

In an embodiment of the invention, a method of determining the presence of a nucleic acid molecule having a sequence encoding a protein of the invention is provided, comprising treating the sample with primers which are capable of amplifying the nucleic acid molecule or a predetermined oligonucleotide fragment thereof in a polymerase chain reaction to form amplified sequences, under conditions which permit the formation of amplified sequences, and assaying for amplified sequences.

In a preferred embodiment of the invention, a method for detecting P. aeruginosa serotypes O1 to O20 in a sample is provided, comprising treating the sample with a primer which is capable of amplifying nucleic acid molecules comprising nucleotide sequences encoding PsbM (WbpM), or PsbN (WbpN), or a predetermined oligonucleotide fragment thereof, in a polymerase chain reaction to form amplified sequences, under conditions which permit the formation of amplified sequences, and assaying for amplified sequences.

In another preferred embodiment of the invention, a method for detecting P. aeruginosa serotypes O2, O5, O16, O18 and O20 in a sample is provided, comprising treating the sample with a primer which is capable of amplifying nucleic acid molecules comprising nucleotide sequences encoding Rol, PsbA, PsbB, PsbC, PsbD, PsbE, Rfc, PsbF, PsbG, PsbH, PsbI, PsbJ, PsbK (also known as Wzz, WbpA, WbpB, WbpC, WbpD, WbpE, Wzy, WbpF, WbpG, WbpH, WbpI, WbpJ, WbpK, respectively), HisH or HisF, or a predetermined oligonucleotide fragment thereof, in a polymerase chain reaction to form amplified sequences, under conditions which permit the formation of amplified sequences, and assaying for amplified sequences.
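The relationship between amplification targets and the serotypes they are stated to detect can be summarized as a lookup, sketched below. The table restates only the two groupings given in the preferred embodiments above; the function name, the use of the Wbp/Wzz/Wzy names as keys, and the rule of intersecting the coverage of positive targets are illustrative assumptions. No inference is drawn from negative results.

```python
# Serotypes O1 to O20, stated to be detectable via WbpM (PsbM) or WbpN (PsbN).
ALL_SEROTYPES = frozenset(f"O{i}" for i in range(1, 21))

# Serotypes stated to be detectable via the remaining targets.
NARROW_SET = frozenset({"O2", "O5", "O16", "O18", "O20"})

NARROW_TARGETS = [
    "Wzz", "WbpA", "WbpB", "WbpC", "WbpD", "WbpE", "Wzy", "WbpF",
    "WbpG", "WbpH", "WbpI", "WbpJ", "WbpK", "HisH", "HisF",
]

COVERAGE = {"WbpM": ALL_SEROTYPES, "WbpN": ALL_SEROTYPES}
COVERAGE.update({name: NARROW_SET for name in NARROW_TARGETS})


def candidate_serotypes(positive_targets):
    """Intersect the stated coverage of every target that amplified.

    Returns an empty set when nothing amplified. Raises KeyError for a
    target name that is not in the table.
    """
    candidates = None
    for name in positive_targets:
        coverage = COVERAGE[name]
        candidates = coverage if candidates is None else candidates & coverage
    return candidates if candidates is not None else frozenset()


print(sorted(candidate_serotypes(["WbpM", "Wzy"])))
```

A sample positive only for WbpM or WbpN is left at the full O1 to O20 set, while an additional positive for any of the other targets narrows the candidates to the five-serotype set.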

The polymerase chain reaction refers to a process for amplifying a target nucleic acid sequence as generally described in Innis et al., Academic Press, 1990, in Mullis et al., U.S. Pat. No. 4,863,195 and Mullis, U.S. Patent No. 4,683,202, which are incorporated herein by reference. Conditions for amplifying a nucleic acid template are described in M.A. Innis and D.H. Gelfand, PCR Protocols, A Guide to Methods and Applications, M.A. Innis, D.H. Gelfand, J.J. Sninsky and T.J. White eds, pp 3-12, Academic Press 1989, which is also incorporated herein by reference. The amplified products can be isolated and distinguished based on their respective sizes using techniques known in the art. For example, after amplification, the DNA sample can be separated on an agarose gel and visualized, after staining with ethidium bromide, under ultraviolet (UV) light. DNA may be amplified to a desired level and a further extension reaction may be performed to incorporate nucleotide derivatives having detectable markers such as radioactive labelled or biotin labelled nucleoside triphosphates. The primers may also be labelled with detectable markers as discussed above. The detectable markers may be analyzed by restriction and electrophoretic separation or other techniques known in the art.

The conditions which may be employed in the methods of the invention using PCR are those which permit hybridization and amplification reactions to proceed in the presence of DNA in a sample and appropriate complementary hybridization primers. Conditions suitable for the polymerase chain reaction are generally known in the art. For example, see M.A. Innis and D.H. Gelfand, PCR Protocols, A Guide to Methods and Applications, M.A. Innis, D.H. Gelfand, J.J. Sninsky and T.J. White eds, pp 3-12, Academic Press 1989, which is incorporated herein by reference. Preferably, the PCR utilizes polymerase obtained from the thermophilic bacterium Thermus aquaticus (Taq polymerase, GeneAmp Kit, Perkin Elmer Cetus), although another thermostable polymerase may be used to amplify DNA template strands.

It will be appreciated that other techniques such as the Ligase Chain Reaction (LCR) and NASBA may be used to amplify a nucleic acid molecule of the invention (Barney in "PCR Methods and Applications", August 1991, Vol. 1(1), page 5, and European Published Application No. 0320308, published June 14, 1989, and U.S. Serial No. 5,130,238 to Malek).

A protein of the invention can be used to prepare antibodies specific for the protein. Antibodies can be prepared which bind a distinct epitope in an unconserved region of the protein. An unconserved region of the protein is one which does not have substantial sequence homology to other proteins. Alternatively, a region from a well-characterized domain can be used to prepare an antibody to a conserved region of a protein of the invention. Antibodies having specificity for a protein of the invention may also be raised from fusion proteins.

Conventional methods can be used to prepare the antibodies. For example, by using a peptide of a protein of the invention, polyclonal antisera or monoclonal antibodies can be made using standard methods. A mammal (e.g., a mouse, hamster, or rabbit) can be immunized with an immunogenic form of the peptide which elicits an antibody response in the mammal. Techniques for conferring immunogenicity on a peptide include conjugation to carriers or other techniques well known in the art. For example, the peptide can be administered in the presence of adjuvant. The progress of immunization can be monitored by detection of antibody titers in plasma or serum. Standard ELISA or other immunoassay procedures can be used with the immunogen as antigen to assess the levels of antibodies. Following immunization, antisera can be obtained and, if desired, polyclonal antibodies isolated from the sera.

To produce monoclonal antibodies, antibody producing cells (lymphocytes) can be harvested from an immunized animal and fused with myeloma cells by standard somatic cell fusion procedures, thus immortalizing these cells and yielding hybridoma cells. Such techniques are well known in the art (e.g., the hybridoma technique originally developed by Kohler and Milstein (Nature 256, 495-497 (1975)), as well as other techniques such as the human B-cell hybridoma technique (Kozbor et al., Immunol. Today 4, 72 (1983)), the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., Monoclonal Antibodies in Cancer Therapy (1985) Allen R. Bliss, Inc., pages 77-96), and screening of combinatorial antibody libraries (Huse et al., Science 246, 1275 (1989))). Hybridoma cells can be screened immunochemically for production of antibodies specifically reactive with the peptide, and the monoclonal antibodies can be isolated. Therefore, the invention also contemplates hybridoma cells secreting monoclonal antibodies with specificity for a protein of the invention.

The term "antibody" as used herein is intended to include fragments thereof which also specifically react with a protein of the invention, or peptide thereof. Antibodies can be fragmented using conventional techniques and the fragments screened for utility in the same manner as described above. For example, F(ab')2 fragments can be generated by treating antibody with pepsin. The resulting F(ab')2 fragment can be treated to reduce disulfide bridges to produce Fab fragments.

Chimeric antibody derivatives, i.e., antibody molecules that combine a non-human animal variable region and a human constant region, are also contemplated within the scope of the invention. Chimeric antibody molecules can include, for example, the antigen binding domain from an antibody of a mouse, rat, or other species, with human constant regions. Conventional methods may be used to make chimeric antibodies containing the immunoglobulin variable region which recognizes the gene product of the genes of the psb cluster of the invention (see, for example, Morrison et al., Proc. Natl. Acad. Sci. U.S.A. 81, 6851 (1985); Takeda et al., Nature 314, 452 (1985); Cabilly et al., U.S. Patent No. 4,816,567; Boss et al., U.S. Patent No. 4,816,397; Tanaguchi et al., European Patent Publication EP171496; European Patent Publication 0173494; United Kingdom patent GB 2177096B).

Monoclonal or chimeric antibodies specifically reactive with a protein of the invention as described herein can be further humanized by producing human constant region chimeras, in which parts of the variable regions, particularly the conserved framework regions of the antigen-binding domain, are of human origin and only the hypervariable regions are of non-human origin. Such immunoglobulin molecules may be made by techniques known in the art (e.g., Teng et al., Proc. Natl. Acad. Sci. U.S.A., 80, 7308-7312 (1983); Kozbor et al., Immunology Today, 4, 7279 (1983); Olsson et al., Meth. Enzymol., 92, 3-16 (1982); and PCT Publication WO92/06193 or EP 0239400). Humanized antibodies can also be commercially produced (Scotgen Limited, 2 Holly Road, Twickenham, Middlesex, Great Britain).

Specific antibodies, or antibody fragments, reactive against proteins of the invention may also be generated by screening expression libraries encoding immunoglobulin genes, or portions thereof, expressed in bacteria with peptides produced from the nucleic acid molecules of the present invention. For example, complete Fab fragments, VH regions and FV regions can be expressed in bacteria using phage expression libraries (see for example Ward et al., Nature 341, 544-546 (1989); Huse et al., Science 246, 1275-1281 (1989); and McCafferty et al., Nature 348, 552-554 (1990)). In an embodiment of the invention, antibodies that bind to an epitope of a protein of the invention are engineered using the procedures described in N. Tout and J. Lam (Clin. Diagn. Lab. Immunol. Vol. 4(2):147-155, 1997).

The antibodies may be labelled with a detectable marker including various enzymes, fluorescent materials, luminescent materials and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, biotin, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; and examples of suitable radioactive material include S-35, Cu-64, Ga-67, Zr-89, Ru-97, Tc-99m, Rh-105, Pd-109, In-111, I-123, I-125, I-131, Re-186, Au-198, Au-199, Pb-203, At-211, Pb-212 and Bi-212. The antibodies may also be labelled or conjugated to one partner of a ligand binding pair. Representative examples include avidin-biotin and riboflavin-riboflavin binding protein. Methods for conjugating or labelling the antibodies discussed above with the representative labels set forth above may be readily accomplished using conventional techniques.

The antibodies reactive against proteins of the invention (e.g. enzyme conjugates or labeled derivatives) may be used to detect a protein of the invention in various samples; for example, they may be used in any known immunoassays which rely on the binding interaction between an antigenic determinant of a protein of the invention and the antibodies. Examples of such assays are radioimmunoassays, enzyme immunoassays (e.g. ELISA), immunofluorescence, immunoprecipitation, latex agglutination, hemagglutination, and histochemical tests. Thus, the antibodies may be used to identify or quantify the amount of a protein of the invention in a sample in order to diagnose P. aeruginosa infections. A sample may be tested for the presence or absence of P. aeruginosa serotypes O1 to O20 by contacting the sample with an antibody specific for an epitope of PsbM (WbpM) or PsbN (WbpN), which antibody is capable of being detected after it becomes bound to PsbM (WbpM) or PsbN (WbpN) in the sample, and assaying for antibody bound to PsbM (WbpM) or PsbN (WbpN) in the sample, or unreacted antibody. A sample may also be tested for the presence or absence of P. aeruginosa serotypes O2, O5, O16, O18, and O20 by contacting the sample with an antibody specific for an epitope of a Rol, PsbA, PsbB, PsbC, PsbD, PsbE, Rfc, PsbF, PsbG, PsbH, PsbI, PsbJ, PsbK (also known as Wzz, WbpA, WbpB, WbpC, WbpD, WbpE, Wzy, WbpF, WbpG, WbpH, WbpI, WbpJ, WbpK, respectively), HisH or HisF protein, which antibody is capable of being detected after it becomes bound to the protein in the sample, and assaying for antibody bound to protein in the sample, or unreacted antibody.

In a method of the invention a predetermined amount of a sample or concentrated sample is mixed with antibody or labelled antibody. The amount of antibody used in the process is dependent upon the labelling agent chosen. The resulting protein bound to antibody or labelled antibody may be isolated by conventional isolation techniques, for example, salting out, chromatography, electrophoresis, gel filtration, fractionation, absorption, polyacrylamide gel electrophoresis, agglutination, or combinations thereof.

The sample or antibody may be insolubilized, for example, the sample or antibody can be reacted using known methods with a suitable carrier. Examples of suitable carriers are Sepharose or agarose beads. When an insolubilized sample or antibody is used protein bound to antibody or unreacted antibody is isolated by washing. For example, when the sample is blotted onto a nitrocellulose membrane, the antibody bound to a protein of the invention is separated from the unreacted antibody by washing with a buffer, for example, phosphate buffered saline (PBS) with bovine serum albumin (BSA).

When labelled antibody is used, the presence of a P. aeruginosa serotype can be determined by measuring the amount of labelled antibody bound to a protein of the invention in the sample or of the unreacted labelled antibody. The appropriate method of measuring the labelled material is dependent upon the labelling agent. When unlabelled antibody is used in the method of the invention, the presence of a P. aeruginosa serotype can be determined by measuring the amount of antibody bound to the P. aeruginosa serotype using substances that interact specifically with the antibody to cause agglutination or precipitation. In particular, labelled antibody against an antibody specific for a protein of the invention can be added to the reaction mixture. The presence of a P. aeruginosa serotype can be determined by a suitable method from among the already described techniques depending on the type of labelling agent. The antibody against an antibody specific for a protein of the invention can be prepared and labelled by conventional procedures known in the art which have been described herein. The antibody against an antibody specific for a protein of the invention may be a species specific anti-immunoglobulin antibody or monoclonal antibody; for example, goat anti-rabbit antibody may be used to detect rabbit antibody specific for a protein of the invention.

The reagents suitable for applying the methods of the invention may be packaged into convenient kits providing the necessary materials, packaged into suitable containers. Such kits may include all the reagents required to detect a P. aeruginosa serotype in a sample by means of the methods described herein, and optionally suitable supports useful in performing the methods of the invention. In one embodiment of the invention, the kit contains a nucleotide probe which hybridizes with a nucleic acid molecule of the invention, reagents required for hybridization of the nucleotide probe with the nucleic acid molecule, and directions for its use. In another embodiment of the invention, the kit includes antibodies of the invention and reagents required for binding of the antibody to a protein specific for a P. aeruginosa serotype in a sample. In still another embodiment of the invention, the kit includes primers which are capable of amplifying a nucleic acid molecule of the invention or a predetermined oligonucleotide fragment thereof, all the reagents required to produce the amplified nucleic acid molecule or predetermined fragment thereof in the polymerase chain reaction, and means for assaying the amplified sequences.

The methods and kits of the present invention have many practical applications. For example, the methods and kits of the present invention may be used to detect a P. aeruginosa serotype in any medical or veterinary sample suspected of containing P. aeruginosa. Samples which may be tested include bodily materials such as blood, urine, tissues and the like. Typically the sample is a clinical specimen from wound, burn and urinary tract infections. In addition to human samples, samples may be taken from mammals such as non-human primates, etc. Further, water and food samples and other environmental samples and industrial wastes may be tested.

Before testing a sample in accordance with the methods described herein, the sample may be concentrated using techniques known in the art, such as centrifugation and filtration. For the hybridization and/or PCR-based methods described herein, nucleic acids may be extracted from cell extracts of the test sample using techniques known in the art.

Substances that Affect O-antigen Synthesis and Assembly

A protein of the invention may also be used to assay for a substance which affects O-antigen synthesis or assembly in P. aeruginosa. Accordingly, the invention provides a method for assaying for a substance that affects O-antigen synthesis or assembly in P. aeruginosa, comprising mixing a protein of the invention with a test substance which is suspected of affecting the expression or activity of the protein, and determining the effect of the substance by comparing to a control.

In an embodiment of the invention the protein is an enzyme, and a method is provided for assaying for a substance that affects O-antigen synthesis and assembly in P. aeruginosa, comprising incubating a protein of the invention with a substrate of the protein and a test substance which is suspected of affecting the activity of the protein, and determining the effect of the substance by comparing to a control.

In a preferred embodiment the protein is PsbM, which has dehydrogenase activity. Representative substrates which may be used with PsbM in the assay are precursor sugars such as glucose. Dehydrogenase activity may be assayed using conventional methods.

Compositions and Methods of Treatment

The substances identified by the methods described herein, antisense nucleic acid molecules, and antibodies may be used for modulating one or both of O-antigen synthesis and assembly in P. aeruginosa and accordingly may be used in the treatment of infections caused by P. aeruginosa. O-antigen is a virulence factor of P. aeruginosa and it is responsible for serum resistance. Therefore, substances which can target LPS biosynthesis in P. aeruginosa to change the organism into making "rough" LPS devoid of the long chain O-antigen (B-band) polymers will be useful in rendering the bacterium susceptible to attack by host defense mechanisms. The substances identified by the methods described herein, antisense nucleic acid molecules, and antibodies are preferably used to treat infections caused by P. aeruginosa serotypes O2, O5, O16, O18 and O20. The substances etc. are also preferably used to treat infections caused by P. aeruginosa serotypes O3 or O6, which are predominant clinical isolates. It will be appreciated that the substances may also be useful to treat infections caused by other members of the family Pseudomonadaceae (e.g. P. cepacia and P. pseudomallei), and to treat other bacteria which produce O-antigen (e.g. other gram negative bacteria such as E. coli, S. enterica, Vibrio cholerae, Yersinia enterocolitica and Shigella flexneri).

The substances identified using the methods described herein may be formulated into pharmaceutical compositions for administration to subjects in a biologically compatible form suitable for administration in vivo. By "biologically compatible form suitable for administration in vivo" is meant a form of the substance to be administered in which any toxic effects are outweighed by the therapeutic effects. The substances may be administered to living organisms including humans and animals. Administration of a therapeutically active amount of the pharmaceutical compositions of the present invention is defined as an amount effective, at dosages and for periods of time necessary, to achieve the desired result. For example, a therapeutically active amount of a substance may vary according to factors such as the disease state, age, sex, and weight of the individual, and the ability of antibody to elicit a desired response in the individual. Dosage regimens may be adjusted to provide the optimum therapeutic response. For example, several divided doses may be administered daily or the dose may be proportionally reduced as indicated by the exigencies of the therapeutic situation.

The active substance may be administered in a convenient manner such as by injection (subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or rectal administration. Depending on the route of administration, the active substance may be coated in a material to protect the compound from the action of enzymes, acids and other natural conditions which may inactivate the compound. The compositions described herein can be prepared by per se known methods for the preparation of pharmaceutically acceptable compositions which can be administered to subjects, such that an effective quantity of the active substance is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described, for example, in Remington's Pharmaceutical Sciences (Remington's Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., USA 1985). On this basis, the compositions include, albeit not exclusively, solutions of the substances in association with one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered solutions with a suitable pH and iso-osmotic with the physiological fluids.

The reagents suitable for applying the methods of the invention to identify substances that affect O-antigen synthesis and assembly in P. aeruginosa may be packaged into convenient kits providing the necessary materials packaged into suitable containers. The kits may also include suitable supports useful in performing the methods of the invention.

The utility of the substances, antibodies, and compositions of the invention may be confirmed in experimental model systems.

The invention will be more fully understood by reference to the following examples. However, the examples are merely intended to illustrate embodiments of the invention and are not to be construed to limit the scope of the invention.

EXAMPLES

Materials and methods used in Examples 1 to 3 described herein include the following:

Bacterial strains and culture conditions

The bacterial strains used in this study are listed in Table 6. All bacterial strains were maintained on Tryptic Soy Agar (Difco Laboratories, Detroit, MI). Pseudomonas Isolation Agar (PIA, Difco) was used for selection of transconjugants following mating experiments. Antibiotics used in selection media include ampicillin at 100 μg/ml for E. coli and carbenicillin at 450 μg/ml for P. aeruginosa; tetracycline at 15 μg/ml for E. coli and 90 μg/ml for P. aeruginosa (250 μg/ml in PIA); and gentamicin at 10 μg/ml for E. coli and 300 μg/ml for P. aeruginosa.

DNA procedures

Small-scale preparation of plasmid DNA was done utilizing the alkaline lysis method of Birnboim and Doly (1979). Large-scale preparations of plasmid DNA were obtained using the Qiagen midi plasmid kit (Qiagen Inc., Chatsworth, CA), according to procedures specified by the manufacturer. Whole genomic DNA was isolated from P. aeruginosa following the method of Goldberg and Ohman (1984). Restriction enzymes were purchased from GIBCO/BRL and Boehringer-Mannheim (Mannheim, Germany). T4 DNA ligase, T4 DNA polymerase and alkaline phosphatase were purchased from Boehringer-Mannheim. All enzymes were used following suppliers' recommendations. DNA was transformed into E. coli and P. aeruginosa by electroporation using a Bio-Rad electroporation unit (Bio-Rad Laboratories, Richmond, CA) and according to the protocols supplied by the manufacturer. Electrocompetent cells of E. coli and P. aeruginosa were prepared according to the methods of Binotto et al. (1991) and Farinha and Kropinski (1990), respectively. Recombinant plasmids were mobilized from E. coli DH5α to P. aeruginosa through triparental matings as described by Ruvkun and Ausubel (1981). Plasmids were also mobilized from E. coli SM10 to P. aeruginosa using the method of Simon et al. (1983). Genomic DNA was transferred to Zetaprobe membrane (Bio-Rad) by capillary transfer following the manufacturer's instructions. Southern hybridizations were done at 42°C for 18-24 h with DNA previously labelled with dUTP conjugated to digoxigenin (DIG) (Boehringer-Mannheim). Labelling of DNA was done according to the manufacturer's recommendations. Hybridized DNA was detected using an anti-DIG polyclonal antibody conjugated to alkaline phosphatase and AMPPD (0.235 mM 3-(2'-spiroadamantane)-4-methoxy-4-(3''-phosphoryloxy)-phenyl-1,2-dioxetane) (Boehringer-Mannheim), followed by exposure to X-ray film (E.I. Du Pont de Nemours & Co., Wilmington, DE).

Tn1000 mutagenesis of pFV.TK6

Tn1000 mutagenesis of pFV.TK6 was performed as described previously (Lightfoot and Lam, 1993) using the method of de Lencastre et al. (1983).

DNA sequencing

DNA sequence analysis of the 1.9 kb insert of pFV.TK8 was performed by the MOBIX facility (McMaster University, Hamilton, ON). The 1.9 kb XhoI-HindIII insert of pFV.TK8 was cloned into the sequencing vector pBluescript II KS and double-strand sequenced using a model 373A DNA sequencing unit (Applied Biosystems, Foster City, CA). Oligodeoxynucleotide primers for sequencing were synthesized on an Applied Biosystems model 391 DNA synthesizer and purified according to the manufacturer's instructions. The Taq DyeDeoxy™ Terminator Cycle Sequencing Kit (Applied Biosystems) was used for cycle sequencing reactions, which were carried out in an Ericomp (San Diego, CA) model TCX15 thermal cycler.

Sequence Analysis

The computer software programs Gene Runner for Windows (Hastings Software, New York, NY) and PCGENE (IntelliGenetics, Mountain View, CA) were used for nucleic acid sequence analysis, amino acid sequence analysis, and characterization of the predicted protein. DNA and protein database searches were performed using the NCBI BLAST network server (Altschul et al., 1990; Gish and States, 1993).

Mutagenesis of the rfc gene of P. aeruginosa PAO1

In order to construct P. aeruginosa rfc chromosomal mutants, a novel gene replacement vector, pEX100T (Schweizer and Hoang, 1995), was used. This vector contains the sacB gene of B. subtilis, which imparts sucrose sensitivity on Gram-negative organisms and allows for positive selection of true mutants from the more frequently occurring merodiploids. In the first step of this experiment, the 5.6 kb HindIII fragment of pFV.TK6 was blunt-ended using T4 DNA polymerase and subcloned into the SmaI site of pEX100T. An 875 bp Gm^R cassette from pUCGM (Schweizer, 1993) was then cloned into the single BamHI site of the insert DNA. The resulting plasmid, pFV.TK9, was transformed into the mobilizer strain E. coli SM10 and then conjugally transferred into PAO1 (Simon et al., 1983). After mating, cells were plated on PIA containing 300 μg/ml of Gm. Colonies that grew on the Gm-containing medium were picked and streaked on PIA containing 300 μg/ml Gm and 5% sucrose to identify isolates that had lost the vector-associated sacB gene and thus had become resistant to sucrose. Southern blot analysis was performed to verify that gene replacement had occurred (Figure 24).

Preparation of LPS

LPS used in sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and Western immunoblotting experiments was prepared according to the proteinase K digest method of Hitchcock and Brown (1983).

SDS-PAGE

The discontinuous SDS-PAGE procedure of Hancock and Carey (1979) utilizing 15% running gels was used. LPS separated by SDS-PAGE was visualized by silver staining according to the method of Dubray and Bezard (1982).

Immunoblotting

The Western immunoblotting procedure of Burnette (1981) was used with the following modifications. Nitrocellulose blots were blocked with 3% (w/v) skim milk, followed by incubation with hybridoma culture supernatant containing either MAb MF15-4, specific for O5 LPS, or MAb N1F10, specific for A-band LPS. The blots were developed at room temperature, using goat anti-mouse F(ab')₂ fragment conjugated antibody (Jackson Immunoresearch Laboratories, West Grove, PA) and a substrate consisting of 30 mg of Nitro Blue Tetrazolium and 15 mg of 5-bromo-4-chloro-3-indolyl phosphate toluidine (Sigma, St. Louis, MO) in 100 ml of 0.1 M bicarbonate buffer (pH 9.8).

EXAMPLE 1

Analysis of the LPS from mutants AK1401 and rd7513. Strain AK1401 has been previously shown to contain A-band LPS; its B-band LPS consists of complete core plus one O-repeat unit (SR phenotype) (Berry and Kropinski, 1986; Lam et al., 1992). Strain rd7513 is a mutant of AK1401 that has the SR phenotype but no longer produces A-band LPS, due to a mutation in an A-band biosynthetic gene (Lightfoot and Lam, 1991). Strain rd7513 was used, in addition to AK1401, in the study described in the examples, but the majority of this investigation focuses on AK1401.

Complementation of O-antigen expression in P. aeruginosa AK1401. Mobilization of pFV100, which contains the O5 rfb gene cluster, into SR mutant AK1401 resulted in production of O5 B-band LPS. These results suggest that an O-polymerase gene might be localized on the cloned DNA. Analysis of LPS isolated from PAO1 and AK1401(pFV100) in both silver-stained SDS-PAGE gels and Western immunoblots, reacted with O5-specific MAb MF15-4, revealed that the two strains expressed similar high molecular weight LPS profiles (Figure 22 a, b). In order to localize the putative rfc gene on the 26 kb insert of pFV100, various subclones were made (Figure 23) and used in complementation studies with AK1401. Plasmid pFV.TK2, which contains a 16.5 kb XbaI fragment from pFV100, was able to complement O5 O-antigen production after mobilization into AK1401 (data not shown). Plasmids pFV.TK3, pFV.TK4, and pFV.TK5 were generated and mobilized into AK1401; however, none of the three plasmids was able to complement B-band synthesis in this mutant. Subsequently, pFV.TK6, which contains a 5.6 kb HindIII insert, was made and was able to complement the SR phenotype of AK1401 (data not shown).

Transposon Tn1000 mutagenesis of pFV.TK6. Transposon mutagenesis using Tn1000 was performed in order to more precisely define the region of insert DNA in pFV.TK6 responsible for complementation of O-antigen expression in AK1401. pFV.TK6::Tn1000 recombinants were mobilized into AK1401 and then screened for the lack of expression of O-antigen using O5-specific MAb MF15-4. Plasmid DNA was isolated from colonies that did not react with MAb MF15-4, and subjected to restriction enzyme analysis to determine the location of the Tn1000 insertion in pFV.TK6. Three Tn1000 insertions in a 1.5 kb XhoI fragment were found to interrupt O-antigen expression in AK1401 (Fig. 23). This 1.5 kb XhoI fragment was cloned into vector pUCP26 (pFV.TK7) and mobilized into AK1401. In Western immunoblots of LPS from AK1401(pFV.TK7) with MAb MF15-4, no reaction of this antibody with high molecular weight B-band LPS could be detected (data not shown). Therefore, the 1.5 kb XhoI insert in pFV.TK7 was unable to restore the O-polymerase function in AK1401. A 1.9 kb XhoI-HindIII fragment was then subcloned into pUCP26 and the resulting plasmid was designated pFV.TK8 (Figure 23). Mobilization of this recombinant plasmid into both SR mutants, AK1401 and rd7513, resulted in restoration of O-antigen expression. Silver-stained SDS-PAGE gels and Western blots reacted with MAb MF15-4 showed that the AK1401(pFV.TK8) transconjugants expressed levels of O5 B-band LPS comparable to that produced by the wild-type PAO1 (Figure 22).

Southern analysis using a 1.5 kb XhoI probe. The 1.5 kb XhoI insert of pFV.TK7, internal to the rfc coding region, was labelled with dUTP conjugated to digoxigenin and used to probe XhoI-digested chromosomal DNA from the twenty P. aeruginosa serotypes. The probe hybridized to a 1.5 kb fragment in serotypes O2, O5, O16, O18 and O20 (data not shown), suggesting that these serotypes may share a similar O-polymerase gene. These hybridization results are not surprising in that serotypes O2, O5, O16, and O20 share a similar O-repeat backbone structure (Knirel, 1990). Although the O-antigen structure of serotype O18 has not yet been determined, it exhibits cross-reactivity with polyclonal antisera raised against serotype O5 (data not shown), suggesting that it has an O-repeat unit structure similar to that of O5. In a recent study, Collins and Hackett (1991) found that a probe generated from the rfc gene of S. enterica (typhimurium) cross-hybridized to chromosomal DNA of Salmonella groups A, B, and D1 strains but not with strains of groups D2 or E2, suggesting that the former may share a common rfc gene. In addition, studies done by Nurminen and coworkers (1971) have shown that the O-polymerase enzymes of Salmonella groups B and D1 strains are able to polymerize O-repeat units of either serotype.

Generation of P. aeruginosa chromosomal rfc mutants. In order to confirm that the insert DNA of pFV.TK8 codes for an O-polymerase gene, insertional mutagenesis was performed and the resulting plasmid used for homologous recombination with the PAO1 chromosome. In the first step, the 5.6 kb insert of plasmid pFV.TK6 was cloned into a novel gene replacement vector, pEX100T (Schweizer and Hoang, 1995). pEX100T is a pUC19-based plasmid that does not replicate in P. aeruginosa; therefore, maintenance of plasmid DNA can only occur after homologous recombination into the chromosome. The 5.6 kb insert of pFV.TK6 was used for gene replacement instead of the 1.9 kb insert of pFV.TK8 to ensure that there was sufficient DNA for homologous recombination. The next step involved insertion of an 875 bp Gm^R cassette into a unique BamHI site in the insert DNA (Figure 24b). This step generated a mutation in the rfc gene and provided a means of later selecting for colonies that had undergone homologous recombination. Because the vector, pEX100T, contains the sacB gene of Bacillus subtilis, it renders Gram-negative organisms sensitive to sucrose. Streaking Gm^R recombinants on media containing 5% sucrose allowed separation of true recombinants from merodiploids, since merodiploids exhibit sucrose sensitivity because of the presence of the vector-associated sacB gene. Of the eighty Gm^R colonies that were isolated, twenty-four were found to be sucrose-resistant. Three of the twenty-four isolates were randomly chosen for further characterization and were designated OP5.2, OP5.3, and OP5.5. Southern blot analysis of chromosomal DNA from these three putative mutants was performed in order to confirm that gene replacement had occurred. The 1.5 kb XhoI fragment of pFV.TK8 was used to probe XhoI-digested chromosomal DNA isolated from the PAO1 wild-type strain as well as OP5.2, OP5.3, and OP5.5.
In strains that had undergone gene replacement, XhoI digestion should yield a probe-hybridizable fragment of 2.4 kb instead of 1.5 kb because of the insertion of the 875 bp Gm^R cassette (Figure 24 a, b). Southern blot analysis of the three Gm^R, sucrose-resistant isolates revealed a probe-reactive fragment of 2.4 kb (Figure 24 c, lanes 2-4), whereas the probe reacted with a 1.5 kb fragment of the PAO1 control DNA (Figure 24 c, lane 1), demonstrating that gene replacement had occurred in OP5.2, OP5.3, and OP5.5. Analysis of LPS from these three strains in silver-stained gels and Western immunoblots with O5-specific MAb MF15-4 demonstrated that they were not capable of producing long-chain B-band O-antigen (Fig. 25a, b). Immunoblots reacted with A-band-specific MAb N1F10 revealed that, like the SR mutant AK1401, these three mutants were still producing A-band LPS (Figure 25c). Biosynthesis of A-band LPS therefore appears to be unaffected by this chromosomal mutation. The relative mobility of the core-lipid A bands was also similar to that of the SR mutant AK1401 (Figure 25a); therefore the LPS phenotype of the three rfc knockout mutants was identical to that of AK1401. Mobilization of pFV.TK8 into OP5.2, OP5.3 and OP5.5 restored O-antigen expression in the three mutants (data not shown), indicating that the PAO1 chromosomal modification was the result of a direct mutation of the rfc gene and not caused by a secondary mutation.

Nucleotide sequence determination and analysis of rfc. The 1.9 kb XhoI-HindIII insert of pFV.TK8, containing the rfc coding region, was cloned into pBluescript and subjected to double-strand nucleotide sequence analysis. Examination of the nucleotide sequence (Figure 9; GenBank accession number U17294) revealed one open reading frame (ORF) that coded for a protein of 438 amino acids, with a predicted mass of 48.9 kDa. This ORF was designated ORF48.9. Analysis of the P. aeruginosa rfc mol % G + C content (44.8%; Table 6) revealed that it is significantly lower than that of the rest of the genome (67.2%; Palleroni, 1984). A low G + C content is a common feature of reported rfc genes (Collins and Hackett, 1991; Brown et al., 1992; Klena and Schnaitman, 1993; Morona et al., 1994) and has also been observed in all of the rfb clusters so far analyzed. The finding that the gene coding for the O-polymerase enzyme and the genes encoding the O-antigen repeat units have a compatible G + C content is not surprising, since the specificity of the enzyme must relate to the structure of its substrate. Homology searches of both the nucleotide and the amino acid sequences of the P. aeruginosa rfc gene were performed using EMBL/GenBank/PDB and Swiss-PROT (release 28.0) databases (Altschul et al., 1990; Gish and States, 1993). Comparison of the P. aeruginosa rfc sequences with sequences reported for other prokaryotic genes revealed no significant homology, including with those reported for other rfc genes. Previous studies on the structure of P. aeruginosa O-antigens have revealed that their sugar compositions differ significantly from those of most enterobacterial O-antigens (Knirel et al., 1988). Neutral sugars, which are commonly found in enteric O-antigens, are only rarely found in O-antigens of P. aeruginosa. In addition, P. aeruginosa O-antigens are rich in amino sugars, many of which are substituted with acyl groups, a phenomenon rarely found in natural carbohydrates. Given the unique sugar composition of P. aeruginosa O-antigens, and the finding by Morona et al. (1994) that the S. flexneri Rfc protein showed no homology with other enteric Rfc proteins, it is not surprising that the P. aeruginosa Rfc protein exhibited no sequence homology with those of enteric organisms. The P. aeruginosa rfc gene product does, however, have several features in common with other reported Rfc proteins, including the fact that it is very hydrophobic. The mean hydropathic index of the P. aeruginosa Rfc is 0.8, while those of Rfc proteins from enteric organisms have been reported to range from 0.65 to 1.08 (Table 7). Examination of the hydropathy profile of this protein and analysis of the amino acid sequence, using the software program PCGENE, revealed that it is an integral membrane protein with 11 putative membrane-spanning domains (Klein et al., 1985). The Rfc proteins of S. enterica (typhimurium) and S. enterica (muenchen) are reported to have 11 membrane-spanning domains, while that of S. flexneri is reported to have 13 (Morona et al., 1994); therefore, structural similarities appear to exist among the Rfc proteins of these four organisms.

Codon usage and amino acid composition analysis. When the codon usage and amino acid composition of the P. aeruginosa Rfc protein were compared with those reported for the S. enterica (typhimurium), S. enterica (muenchen), and Shigella flexneri Rfc proteins (Collins and Hackett, 1991; Brown et al., 1992; Morona et al., 1994), significant similarities were found between them (data not shown). Rfc proteins have been reported to contain a high content of three amino acids, namely leucine, isoleucine, and phenylalanine (Morona et al., 1994). These three amino acids account for 27, 30, and 37% of the total amino acids of the Rfc proteins of S. enterica (typhimurium), S. enterica (muenchen), and Shigella flexneri, respectively (Morona et al., 1994). In the Rfc protein of P. aeruginosa, these amino acids represent 30% of the total amino acid composition.

In summary, the present inventors have isolated an rfc gene in P. aeruginosa O5 encoding an O-polymerase enzyme. Using a gene-replacement system, P. aeruginosa rfc chromosomal mutants were generated which expressed the typical SR LPS phenotype. The P. aeruginosa Rfc is similar to other reported Rfc proteins in that it is very hydrophobic, containing 11 membrane-spanning domains; the rfc coding region has a lower mol % G + C than the P. aeruginosa chromosomal average; and it has a similar amino acid composition and codon usage to that reported for other Rfc proteins.

EXAMPLE 2

Isolation of a rol gene in P. aeruginosa O5 (PAO1) Encoding a Protein which Regulates O-antigen Chain Length

The P. aeruginosa serotype O5 (PAO1) rol gene (regulator of O-chain length) was cloned from a genomic DNA cosmid library. An open reading frame (ORF) of 1046 bp, encoding a 39.3 kDa protein, was identified. The characterization of the function of Rol was facilitated by the generation of knockout mutants.

The DNA sequence of a subclone of pFV100, pFV161 (Figure 26), was found to have homology to the rol genes from a number of members of the family Enterobacteriaceae. However, only the 3' end of the putative rol gene was present on pFV161. A cosmid library of P. aeruginosa (PAO1) genomic DNA was screened using a digoxigenin-labeled probe from pFV161 to identify an overlapping cosmid (pFV400) containing the complete rol gene. Southern blot analysis of DNA from pFV400, digested with a number of different restriction enzymes, was performed. The pFV161 probe hybridized to an approximately 2.3 kb HindIII fragment of pFV400. Assuming the rol gene of P. aeruginosa serotype O5 (PAO1) was similar in size (approx. 1 kb) to those of members of the family Enterobacteriaceae (Morona et al., 1995), this fragment would be sufficient to contain the entire putative rol gene. This 2.3 kb HindIII fragment was subcloned into the vector pBluescript II SK (PDI Biosciences, Aurora, Ontario, Canada) and named pFV401 (Figure 26). Nucleotide sequencing of the 2.3 kb HindIII insert was performed using dye terminator cycle sequencing (GenAlyTiC sequencing facility, University of Guelph), and an open reading frame (ORF) that coded for a protein of 348 amino acids, with a predicted mass of 39.3 kDa, was identified (GenBank accession #U50397). Homology searches of both the nucleotide and the deduced amino acid sequences of the putative P. aeruginosa rol gene were performed using the GenBank database through the NCBI BLAST network server (Altschul et al., 1990; Gish and States, 1993). The putative Rol protein showed approximately 33-35% amino acid homology with the Rol proteins of Salmonella enterica serovar typhimurium, Escherichia coli, and Shigella flexneri (Morona et al., 1995) (Table 5).
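Percentage figures such as the 33-35% reported above are commonly derived by counting matching residues over the aligned positions of two sequences. The following is a minimal sketch of that calculation, assuming the two sequences have already been aligned and that gaps are written as "-". The function name and the toy sequences are hypothetical illustrations; they are not Rol sequences and this is not the BLAST procedure used in the study.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over aligned, non-gap columns.

    Both inputs must be pre-aligned strings of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where neither sequence has a gap
    matches = 0   # compared columns with identical residues
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy example (hypothetical sequences): 4 matches over 5 compared columns.
print(percent_identity("MKT-LL", "MRTALL"))
```

Note that different tools normalize differently (for example, by alignment length including gaps), so values from this sketch are not directly comparable with database search output.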

To confirm that the insert DNA of pFV401 codes for a Rol protein, insertional mutagenesis was performed and the resulting plasmid construct used for homologous recombination with the PAO1 chromosome. Briefly, the 2.3 kb insert of pFV401 was cloned into a novel gene-replacement vector, pEX100T (Schweizer and Hoang, 1995), that does not replicate in P. aeruginosa. pEX100T also contains the sacB gene of B. subtilis, which imparts sucrose sensitivity on Gram-negative organisms and allows for positive selection of true mutants from the more frequently occurring merodiploids. Next, an 875 bp gentamicin-resistance (Gm^R) cassette from pUCGM (Schweizer, 1993) was inserted into a unique XhoI site in the insert DNA. The resulting plasmid (pFV401TG) was transformed into the mobilizer strain E. coli SM10 and then conjugally transferred into PAO1 (Simon et al., 1983). After mating, cells were plated on Pseudomonas isolation agar (PIA, Difco Laboratories, Detroit, Mich.) containing 300 μg/ml gentamicin (Sigma Chemical Co., St. Louis, Mo.) and 5% sucrose. This selective medium allows the identification of isolates that have undergone homologous recombination and lost the vector-associated sacB gene, thus becoming resistant to sucrose. Southern blot analysis with both wild-type rol gene and Gm^R cassette probes was used to confirm the insertional mutation. The wild-type control and the mutants showed probe-reactive fragments of 2.3 kb and 3.1 kb, respectively (Fig. 27).
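The size shift seen on the Southern blot follows from simple addition of the cassette length to the wild-type fragment. The sketch below works through that arithmetic under two stated assumptions: the cassette is inserted without loss of flanking DNA, and it contains no site for the enzyme used in the digest. The function name is hypothetical.

```python
def expected_fragment_bp(wild_type_bp: int, cassette_bp: int) -> int:
    """Expected size of a probe-reactive fragment after cassette insertion.

    Assumes a clean insertion (no deletion of flanking sequence) and no
    additional restriction site introduced by the cassette.
    """
    return wild_type_bp + cassette_bp


GM_CASSETTE_BP = 875  # Gm^R cassette size given in the text

# rol mutant: 2.3 kb wild-type fragment
rol_mutant_kb = expected_fragment_bp(2300, GM_CASSETTE_BP) / 1000
# rfc mutant (Example 1): 1.5 kb wild-type fragment
rfc_mutant_kb = expected_fragment_bp(1500, GM_CASSETTE_BP) / 1000

print(f"rol mutant: ~{rol_mutant_kb:.1f} kb")
print(f"rfc mutant: ~{rfc_mutant_kb:.1f} kb")
```

The calculated rol value (about 3.2 kb) is close to the roughly 3.1 kb band reported, a difference well within the sizing resolution of an agarose gel, and the rfc value matches the 2.4 kb fragment reported in Example 1.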

The LPS of the mutants was prepared according to the proteinase K digest method of Hitchcock and Brown (1983). The LPS was analyzed using sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and Western immunoblots according to the methods described previously (de Kievit et al., 1995). When compared with the wild-type strain, the mutant LPS showed a marked alteration in the O-antigen ladder-like banding pattern, in which there was a decrease in high molecular weight bands and an increase in visible low molecular weight bands. This change corresponds to a loss of bimodal distribution in O-antigen length (Fig. 28).

A T7 expression system (Tabor and Richardson, 1985) was used for expression of the Rol protein. A unique protein band with an apparent molecular mass of 39 kDa was observed. This expressed polypeptide corresponded well to the predicted mass of 39.3 kDa. This band was not observed in the vector-only control (Fig. 29).
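As a rough plausibility check on predicted masses of this kind, a common rule of thumb multiplies the number of residues by an average residue mass of about 110 Da. The sketch below applies that approximation; the constant and function name are assumptions introduced here for illustration, and the 39.3 kDa value in the text was presumably computed from the actual amino acid composition rather than by this shortcut.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption


def approx_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from residue count alone."""
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Rol: 348 amino acids (text reports a predicted mass of 39.3 kDa)
print(f"Rol ~{approx_mass_kda(348):.1f} kDa")
# Rfc: 438 amino acids (text reports a predicted mass of 48.9 kDa)
print(f"Rfc ~{approx_mass_kda(438):.1f} kDa")
```

Both estimates fall within about 1 kDa of the reported values, which is the level of agreement this approximation can be expected to give.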

In conclusion, a rol gene was isolated in P. aeruginosa O5 (PAO1) encoding a protein which regulates O-antigen chain length. Using a gene-replacement system, P. aeruginosa rol Gm^R knockout mutants were generated which express LPS with unregulated O-antigen chain length. Thus, the P. aeruginosa O5 (PAO1) Rol protein has both sequence and functional homology to other reported Rol proteins. This also confirms that the pathway for P. aeruginosa B-band LPS biosynthesis is Rfc-dependent. The function of Rol is often associated with the Rfc protein, an O-polymerase (Whitfield, 1995; de Kievit et al., 1995).

EXAMPLE 3 Sequencing of the psb gene cluster

The isolation of a cosmid clone, pFV100, containing the psb gene cluster of P. aeruginosa O5 identified in accordance with the present invention, was previously described (Lightfoot and Lam, 1993). Several subclones of pFV100 containing the psb genes were constructed. The sequencing and characterization of two of these clones (pFV111 and pFV110), containing the rfc and psbL (rfbA) genes respectively, have previously been described (de Kievit et al., 1995; Dasgupta and Lam, 1995). Sequencing of the remainder of the pFV100 insert was undertaken in order to identify all the genes required for synthesis of the O5 O-antigen.

Sequencing of the entire insert of pFV100, a total of 24416 bp, revealed a large number of open reading frames (ORFs) on both strands. ORFs which were reading in the same direction as rfc and psbL and which had homology either to any previously identified polysaccharide or antibiotic biosynthetic genes or to highly conserved bacterial genes were characterized further. A total of 21 ORFs which could be involved in synthesis of the O5 O-antigen were identified (Table 1). These genes were designated psbA through psbN in the 5' to 3' direction, with the exceptions of rol and rfc, which were named according to convention. A further 4 ORFs with high homology to other bacterial genes or insertion sequences, but which are not thought to be involved with LPS synthesis, were identified (hisH, hisF, uvrB, IS407; Table 1).

Distribution of the psb genes among the 20 serotypes of P. aeruginosa and localization of the O5-specific region.

Southern blot analysis of the 20 serotypes of P. aeruginosa using various psb genes as probes revealed an interesting dichotomy. All of the probes tested which were 5' to the IS407 element hybridized only with chromosomal DNA from serotypes O2, O5, O16, O18 and O20 (Table 1). As stated above, these five serotypes have biochemically and structurally similar O-antigens (Figure 1). Although the O-antigens of serotypes O2, O5, O16, O18, and O20 are serologically distinct and have been shown to have clear biochemical differences, none of the psb genes tested hybridized only to serotype O5 chromosomal DNA at high stringency.

In contrast with these findings, probes for DNA sequences 3' to the IS407 element, and the IS407 element itself, hybridized with the chromosomal DNA from all 20 serotypes of P. aeruginosa (Table 1). These results show that the insertion sequence is the junction between the portion of the psb cluster specific for O5 and related serotypes (hereinafter referred to as the O5-specific region, or sometimes as the Group I genes) and the non-specific chromosomal DNA. Therefore, psbL appears to be the last gene of the O5-specific region. Despite the fact that the DNA 3' of the insertion element is not O5-specific, this region is thought to contain at least two ORFs (psbM and psbN, sometimes referred to as the Group II genes) which may be involved in O5 LPS biosynthesis (see below). A 1.2 kb probe from the extreme 5' end of the insert of pFV100 hybridized only to the five related serotypes, indicating that the 5' end of the O5-specific region had not been cloned. This probe was used to isolate an overlapping cosmid, pFV400. Various subclones of pFV400 were constructed to localize the 5' end of the O5-specific region to within a 1.3 kb SstI-XhoI fragment located 1.7 kb upstream of the 5' end of pFV100. Preliminary sequence analysis of this upstream region revealed no additional ORFs thought to be involved with LPS synthesis. Also, no insertion sequences could be found in this region of DNA. Localization of the 5' end of the O5-specific region to the 1.3 kb SstI-XhoI fragment means the total amount of DNA which is specific to O5 and related serotypes is approximately 20 kb.

The composition and chromosomal milieu of the O5 psb cluster.

The %G+C of the P. aeruginosa chromosome has been determined by various methods to be approximately 65-67% (Palleroni, 1984; West and Iglewski, 19XX). The %G+C content of the P. aeruginosa O5 psb cluster within the O5-specific region averages 51.1% overall, with individual genes ranging from a low of 44.5% (psbG) to a high of 56.8% (psbK) (Table 1). These results are consistent with those seen for other rfb genes, averaging at least 10% below the chromosomal background, and this is thought to be reflective either of origin in a low %G+C background (Reeves, 1993) or of possible regulatory constraints (Collins and Hackett, 1991; Morona et al., 1994a). The %G+C content of the psbM and psbN genes, which fall outside the O5-specific region, averages 62.6%.
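The %G+C values compared above are the fraction of G and C bases among all unambiguous bases in a sequence. A minimal sketch of that calculation follows; the function name is hypothetical, the example sequences are toy strings rather than psb sequences, and ambiguous bases such as N are simply ignored, which is an assumption of this sketch.

```python
def gc_percent(sequence: str) -> float:
    """Percent G+C among the unambiguous bases (A, C, G, T) of a DNA sequence."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequences (hypothetical), one GC-rich and one AT-rich.
for name, seq in [("gc_rich", "GGCGCCATGC"), ("at_rich", "ATTAGCATAT")]:
    print(f"{name}: {gc_percent(seq):.1f}% G+C")
```

Applied gene by gene, a calculation of this kind yields per-gene values like those listed in Table 1, which can then be compared against the chromosomal average.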

Sequence analysis of pFV100/pFV400 revealed no homology to gnd (encoding 6-phosphogluconate dehydrogenase) in the regions flanking the LPS genes. However, P. aeruginosa has been shown to convert glucose-6-phosphate to 6-phosphogluconate as part of the Entner-Doudoroff pathway, suggesting a homologue of the gnd gene is located elsewhere on the chromosome. The location of the P. aeruginosa his operon is not known, but the few his auxotrophic lesions that have been mapped on the chromosome of serotype O5 (strain PAO1) are several minutes from the A- and B-band LPS clusters (Lightfoot and Lam, 1993; Holloway et al., 1994). Interestingly, two his genes (hisH and hisF) were found in the middle of the psb cluster, within the O5-specific region (see below). Because these genes fail to hybridize with all twenty serotypes of P. aeruginosa at high stringency, it is likely they are not native P. aeruginosa his genes, but were acquired along with the psb genes in a horizontal transfer event.

Homology searches of the GenBank databases with each of the ORFs in the psb cluster were performed. Assignment of putative function for the products of the ORFs was made based on homology of the encoded proteins to those previously described. Because the O-antigen of P. aeruginosa O5 contains two similar 2,3-diacetamido-mannuronic acid residues, it is anticipated that both residues share a common biosynthetic pathway.

The 5' end of the pFV100 insert contains a partial rol gene.

The partial open reading frame at the 5' end of the insert of pFV100 was found to have low homology at the amino acid level (34-37%) with the Rol proteins of Escherichia coli (Batchelor et al., 1992; Bastin et al., 1993), Salmonella enterica sv Typhimurium (Batchelor et al., 1992; Bastin et al., 1993), and Shigella flexneri (Morona et al., 1994b). Only 479 bp of rol-homologous DNA (encoding 159 amino acids) were present from the XhoI cloning site of pFV100. This sequence represented approximately the 3' half of the putative rol gene, based on the sizes of previously described rol genes. Using the partial gene as a probe, the entire rol gene has been cloned from an overlapping cosmid, pFV400, and its function confirmed by mutational analysis (Example 2). In other Rfc-dependent LPS gene clusters, the rol gene is positioned near or at the end of the cluster. These results, along with the large number of ORFs already identified on pFV100, suggested that most, if not all, of the genes required for O5 O-antigen biosynthesis are present on this cosmid.

psbA.

There is a distance of 807 bases between the rol gene and the first adjacent gene, psbA. Although P. aeruginosa promoters are not well defined, there are similarities with E. coli promoters (Harley and Reynolds, 1987; Deretic et al., 1989). There is a possible σ⁷⁰-like promoter sequence and a putative ribosomal binding site (RBS) located 93 bp and 7 bp, respectively, upstream of the start of psbA (Figure 31). PsbA has homology (summarized in Table 2) to EpsD, thought to be a dehydrogenase required for synthesis of exopolysaccharide in Burkholderia solanacearum (Huang and Schell, 1995); to VipA, involved in synthesis of the Vi antigen in S. enterica sv Typhi (Hashimoto et al., 1993); and to RffD, a UDP-N-acetyl-D-mannosaminuronic acid dehydrogenase involved in synthesis of Enterobacterial Common Antigen (ECA) in E. coli (Meier-Dieter et al., 1992). ECA is an exopolysaccharide common to most enterics that can be linked to lipid A-core in rough strains. It is composed of N-acetyl-D-glucosamine (GlcNAc), N-acetyl-D-mannosaminuronic acid (ManNAcA), and 4-acetamido-4,6-dideoxy-D-galactose (Fuc4NAc).

PsbA also has homology with CapL, involved in type 1 capsular polysaccharide production in Staphylococcus aureus (Lin et al., 1994). The type 1 capsule is composed of taurine, 2-acetamido-2-deoxy-fucose (Fuc2NAc) and 2-acetamido-2-D-galacturonic acid (Gal2NAcA). The sugar compositions of both ECA and the type 1 capsule are similar to that of the P. aeruginosa O5 O-antigen. PsbA also has a low level of homology with ORF7 of the Vi antigen region of E. coli/Citrobacter freundii (accession #Z21706), and several GDP-mannose and UDP-glucose dehydrogenases, including AlgD of P. aeruginosa (Deretic et al., 1987). AlgD is a GDP-mannose dehydrogenase required for alginate synthesis. These homologies suggest that PsbA functions as a dehydrogenase involved in the biosynthesis of the mannuronic acid residues, possibly converting UDP-N-acetyl-D-mannosamine into UDP-N-acetyl-D-mannosaminuronic acid. A large number of dehydrogenases including PsbA (as well as PsbK and PsbM, below) contain a consensus nicotinamide adenine dinucleotide (NAD)-binding domain, thought to be important for activity (Figure 33). An alignment of the amino acid sequences of some PsbA-like proteins is shown in Figure 34.

psbB.

The psbB gene start is 74 bases from the termination codon of psbA, but no separate promoter sequence for psbB could be detected. A putative RBS is located 6 bp from the initiation codon for psbB and the second codon is AAA, the preferred second codon in E. coli (Gold and Stormo, 1987; Figure 32). The psbB gene product is possibly an oxidoreductase, dehydratase, or dehydrogenase. It is 28.2% homologous to the LmbZ protein of Streptomyces lincolnensis required for lincomycin production (Peschke et al., 1995), and also has homology with the pur10 gene product of Streptomyces alboniger required for puromycin production (Tercero et al., 1996). PsbB has 17% homology to the BplA protein from B. pertussis required for LPS production (Allen and Maskell, 1996) and even weaker homology to ORF334 and MocA from Rhizobium meliloti found in the operon for rhizopine catabolism (Rossbach et al., 1994). In B. pertussis, the BplA protein is thought to catalyze the final step in the biosynthesis of UDP-diNAcManA from UDP-diNAcMan (Allen and Maskell, 1996).
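Features such as a putative RBS a few bases upstream of the initiation codon, and the identity of the second codon, can be located with a simple motif scan. The sketch below is one hedged way to do this: the list of Shine-Dalgarno-like motifs, the 20 bp search window, and the function names are all assumptions made for illustration, and the example sequence is a toy string, not the psbB upstream region.

```python
from typing import Optional, Tuple

# Shine-Dalgarno-like motifs, longest first (an illustrative assumption).
SD_MOTIFS = ("AGGAGG", "GGAGG", "AGGAG", "GGAG", "AGGA", "GAGG")


def find_rbs(seq: str, start_index: int,
             max_upstream: int = 20) -> Optional[Tuple[str, int]]:
    """Return (motif, spacing) for the first listed motif found upstream.

    spacing is the number of bases between the end of the motif and the
    first base of the start codon at start_index. Returns None if no
    motif from SD_MOTIFS occurs within max_upstream bases.
    """
    seq = seq.upper()
    window = seq[max(0, start_index - max_upstream):start_index]
    for motif in SD_MOTIFS:
        pos = window.rfind(motif)  # occurrence closest to the start codon
        if pos != -1:
            return motif, len(window) - (pos + len(motif))
    return None


def second_codon(seq: str, start_index: int) -> str:
    """Return the codon immediately following the start codon."""
    return seq.upper()[start_index + 3:start_index + 6]


toy = "TTTTAGGAGGTTTTTTATGAAAGCT"  # hypothetical sequence
start = toy.find("ATG")
print(find_rbs(toy, start), second_codon(toy, start))
```

A scan like this only flags candidates; whether a given site actually functions as an RBS has to be established experimentally.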

Several of the psb genes were found to have high homology with bpl genes, suggesting a common ancestry. B. pertussis has semi-rough LPS, with only one O-antigen unit attached to the core oligosaccharide. The composition of the B. pertussis O-antigen unit is N-acetylglucosamine (GlcNAc), 2,3-dideoxy-2,3-N-acetylmannosaminuronic acid (2,3-diNAcManA), and N-acetyl-N-methyl fucosamine (FucNAcMe) (Allen and Maskell, 1996). These sugars are similar to those comprising ECA, S. aureus type 1 capsule, and the P. aeruginosa O5 O-antigen. The amino acid homology between PsbB and BplA, as well as the similarities in O-antigen unit composition, suggest that PsbB could have a homologous function to that of BplA. Unlike the other putative dehydrogenases encoded in the psb cluster, PsbB does not contain a consensus NAD-binding domain.

psbC.

The start of psbC overlaps significantly (343 bases) with the stop of psbB, and psbC could encode a large protein of 85.3 kDa (766 amino acids). Careful scrutiny of the DNA sequencing results confirmed no sequencing errors were present. Protein expression will determine whether this entire large ORF is translated. The large size of this protein may indicate it resulted from a fusion event. There is a weak potential RBS upstream of the AUG codon of psbC (Figure 32).

The carboxy-terminal portion of PsbC has homology with a hypothetical protein (HI0392) derived from the Haemophilus influenzae genome sequence (Fleischmann et al., 1995). HI0392 is a 245 amino acid protein of unknown function, with several hydrophobic domains, and is thought to be an integral membrane protein. There is homology between PsbC and the macrolide 3-O-acyltransferase acyA gene from the Streptomyces thermotolerans carbomycin biosynthetic cluster (Arisawa et al., 1995). PsbC also has weak homology with ExoZ of R. meliloti, involved in succinoglycan production (Buendia et al., 1991), and with NodX of R. leguminosarum, involved in nodulation (Davis et al., 1988). ExoZ is a 317 amino acid protein, also with multiple hydrophobic domains, while NodX is a 367 amino acid protein thought to be located in the cytoplasmic membrane. ExoZ and NodX are both putative 3-O-acyltransferases. A summary of the homologies between the above proteins is shown in Table 2. The similarities indicate PsbC, particularly the carboxy-terminal portion, may have 3-O-acyltransferase activity, and could be involved in acetylation of the mannuronic acid residues in the O5 O-antigen.

psbD

The psbD gene appears to be translationally coupled with the psbC gene, since its start codon overlaps the stop codon of psbC. A potential RBS is located 9 bp upstream of the psbD AUG codon (Figure 32). The product of the psbD gene is most homologous with the product of the bplB gene in the B. pertussis LPS biosynthetic cluster (Allen and Maskell, 1996). PsbD and BplB appear to be O-acetyl transferases, and have some homology to serine O-acetyl transferases (CysE) from a variety of bacteria, including Buchnera aphidicola (Lai and Baumann, 1992), Bacillus stearothermophilus (Gagnon et al., 1994), B. subtilis (Ogasawata et al., 1994), E. coli (Denk and Bock, 1987), S. enterica sv Typhimurium (accession #P29847), H. influenzae (Fleischmann et al., 1995), and the plant Arabidopsis thaliana (Bogdanova et al., 1995) (Table 2, Figure 35). As with PsbC, PsbD is probably involved in the acetylation of the mannuronic acid residues comprising two-thirds of the O5 repeat unit. While bplA and bplB are contiguous on the B. pertussis chromosome, the psb homologues, psbB and psbD respectively, are separated by the large psbC gene.

psbE

psbE has high homology with a B. pertussis LPS biosynthetic gene, bplC. psbD and psbE are adjacent to one another in the psb cluster, as are bplB and bplC in the bpl cluster (Allen and Maskell, 1996). However, they do not appear to be translationally coupled, since there are 86 bases between the end of psbD and the start of psbE. While there is a potential RBS 9 bp before the psbE start (Figure 32), it is not known whether this gene can be transcribed from a promoter internal to the psbD gene. There are some sequences with weak homology to the E. coli consensus promoter sequence in that area.

Also homologous to PsbE are DegT from B. subtilis (Takagi et al., 1990), Saccharopolyspora erythraea ErbS (ERYC1) involved in erythromycin synthesis (Dhillon et al., 1989), DnrJ from Streptomyces peucetius required for daunorubicin biosynthesis (Stutzman et al., 1992) and SpsC from B. subtilis involved in spore coat polysaccharide biosynthesis (Glaser et al., 1993) (summarized in Table 2). There is also weak homology between PsbE and both MosB for rhizopine synthesis in R. meliloti (Murphy et al., 1993) and Yifl, a hypothetical protein in the rffE/rffT intragenic region of E. coli (Daniels et al., 1992). The proteins DegT/DnrJ/ERYC1/SpsC form a family of proteins formerly thought to form the DNA-binding component of sensory-transduction two-component regulatory systems. More recently, however, their function is suggested to be in the biosynthesis of 2,3-, 2,4-, and 2,6-dideoxy sugars such as the 2,3-dideoxy mannuronic acid produced by P. aeruginosa O5 (Thorsen et al., 1993). An alignment of the amino acid sequences of the PsbE-like proteins is shown in Figure 36.

The O-antigen polymerase, rfc.

The rfc gene starts 254 bases downstream of the end of the psbE gene. This gene was cloned, sequenced and characterized as described in Example 1. Knockout mutations generated by insertion of a gentamicin cassette into rfc were used to confirm that this gene encodes the O-antigen polymerase. Gentamicin-resistant mutants were shown to have the semi-rough phenotype (see Example 1) characteristic of an rfc mutant (Makela and Stacker, 1984).

psbF.

The psbF gene appears to be translationally coupled with the rfc gene, since they have an overlapping stop and start. There is an RBS sequence 8 bp upstream of the initiation codon of psbF. PsbF is most homologous to the ExoT protein of R. meliloti (Glucksmann et al., 1993), which is thought to be involved in succinoglycan transport. There is also a small amount of homology to FeuC of B. subtilis, part of its iron uptake system (Quirk et al., 1994). PsbF is the most hydrophobic protein encoded by the psb cluster (Table 1) and has 9-10 membrane-spanning domains. This secondary structure is reminiscent of that of RfbX, the putative flippase found in Rfc-dependent O-antigen clusters (Figure 37) (Schnaitman and Klena, 1993). Mutations in RfbX have been found to be unstable and deleterious to the host strain (Schnaitman and Klena, 1993). Recently Liu et al. (1996) confirmed that RfbX (Wzx) mutants accumulate one O-antigen unit on undecaprenol on the inside of the cytoplasmic membrane. PsbF knockout mutants generated by insertion of a gentamicin resistance cassette into psbF are both A- and B-band minus (Figure 48). PsbF may be the P. aeruginosa O5 equivalent of RfbX.

The hisH and hisF genes.

The histidine operon, containing genes required for the biosynthesis of the amino acid histidine, has previously been shown to lie adjacent to the rfb clusters of several enteric species (reviewed in Schnaitman and Klena, 1993). Comparison of the chromosomal map locations of the P. aeruginosa O5 A- and B-band LPS clusters with those of known PAO1 his mutations showed there were no his genes located adjacent to either the psa (11-13 min) or psb (37 min) clusters (Lightfoot and Lam, 1993; Holloway et al., 1994). Therefore, the identification of two genes with high homology to the genes hisF and hisH of various bacterial species in the middle of the psb cluster was unexpected. The hisH and hisF genes are located between the psbF and psbG genes (Figure 1), and are transcribed in the same direction. The direction of transcription of the his genes in previously characterized rfb clusters is opposite to that of the rfb genes (Ames and Hartman, 1974; Macpherson et al., 1994).

While the deduced amino acid sequence of hisF appears to give a complete open reading frame (from bases 10387 to 11142), the sequence of hisH appears to be lacking an AUG initiation codon at the location predicted for the start of the protein based on amino acid homology. However, there are potential starts at three GUG codons located 51, 72, and 132 bp upstream of the first AUG, located at base 9830. The size of the protein corresponding to the product of hisH is approximately 21 kDa, indicating it is probably translated from one of these putative starts. Only the GUG codon at 9777 is preceded by a good RBS (Figure 32); none of the other potential start codons have consensus RBS sites. N-terminal analysis of the HisH product will confirm the translational start.
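The start-site reasoning above (in-frame GUG candidates upstream of the first AUG, ranked by whether an RBS precedes them) can be sketched in a few lines. This is only an illustrative sketch: the toy sequence, the `GGAG` motif and the function names are assumptions, not taken from the sequence data, and the 6-10 bp window simply reuses the 8 ± 2 bp RBS spacing quoted in the psbG discussion. Sequences are written as DNA (GTG/ATG) rather than RNA (GUG/AUG). Note that the three reported distances (51, 72 and 132 bp) are all multiples of three, so each candidate is in frame with the AUG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_inframe_starts(dna, atg_index, alt_starts=("GTG",), max_upstream=150):
    """Return (index, distance) pairs for alternative start codons that lie
    upstream of, and in frame with, the ATG beginning at atg_index."""
    hits = []
    lowest = max(atg_index - max_upstream, 0)
    # Step back one codon at a time so every position stays in frame.
    for pos in range(atg_index - 3, lowest - 1, -3):
        codon = dna[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # an in-frame stop ends the possible upstream extension
        if codon in alt_starts:
            hits.append((pos, atg_index - pos))
    return hits


def has_rbs(dna, start_index, motif="GGAG", min_gap=6, max_gap=10):
    """True if the (assumed) RBS motif ends min_gap..max_gap bases before the start codon."""
    for gap in range(min_gap, max_gap + 1):
        end = start_index - gap
        begin = end - len(motif)
        if begin >= 0 and dna[begin:end] == motif:
            return True
    return False


# Hypothetical toy sequence: two in-frame GTG codons upstream of an ATG,
# only the more distal one preceded by the motif at an 8 bp spacing.
TOY = "CCGGAGCATTACCAGTGAAAGTGCCCATGAAATAA"

if __name__ == "__main__":
    ref = TOY.index("ATG")
    for pos, dist in upstream_inframe_starts(TOY, ref):
        print(pos, dist, has_rbs(TOY, pos))
```

A scan like this only narrows the candidates; as the text notes, N-terminal analysis of the protein is what would confirm the actual start.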

Protein expression analysis of this region shows the products of these genes are expressed in vitro in both orientations, indicating there is a promoter region preceding the his genes that can be recognized by E. coli. Analysis of the sequence upstream of the putative start sites of hisH shows there is a potential promoter sequence with partial homology to the E. coli consensus -35 and -10 regions (Figure 31). This homology is within the range seen in previously reported P. aeruginosa promoter sequences that can function in E. coli (Deretic et al., 1989; Ronald et al., 1992). In K. pneumoniae, the products of the hisH and hisF genes have been shown to form a heterodimeric enzyme complex required for the conversion of N'-[(5'-phosphoribulosyl)-formimino]-5-aminoimidazole-4-carboxamide-ribonucleotide (5'-PRFAR) to imidazole glycerol-phosphate (IGP) and 5'-phosphoribosyl-4-carboxamide-5-aminoimidazole (ZMP) (Rieder et al., 1994). Although the products of the hisH and hisF genes have been shown to function together, the hisH and hisF genes themselves are separated by a third gene, hisA (Alifano et al., 1996). The hisA and hisH genes are highly related and are thought to have arisen through gene duplication. The gene order hisHAF has been found in all bacterial species characterized to date (Alifano et al., 1996).

Comparison of the amino acid sequence homologies of various HisF and HisH proteins (Tables 3 and 4) showed that the P. aeruginosa psb HisF and HisH proteins are not closely related to any of the HisF/HisH proteins characterized thus far. Comparison of P. aeruginosa psb HisF with the other HisF proteins shown in Table 6 shows that it is the most distantly related protein of the group analyzed, at approximately 50% homology.

psbG

There is a distance of 138 bp between hisF and psbG, and a putative promoter is identified in this region (Figure 31). An RBS is identified 4 bp from a putative GUG start and 7 bp from the adjacent AUG start codon (Figure 32). The optimum spacing of an RBS from the initiation site is 8 ± 2 bp, suggesting the AUG codon is likely to be the start. PsbG has limited homology to ORF2 (11.2%) of Vibrio cholerae O-antigen (Comstock et al., 1996), and less homology with NfrB of H. influenzae, a formate-dependent nitrate reductase (Fleischmann et al., 1993), and Pfk, a phosphofructokinase of the Gram-positive bacterium Lactococcus lactis (Xiao and Moore, 1993). Interestingly, the homology with NfrB centres on the metal-binding recognition site CXXCH, of which there are five in NfrB and one in PsbG (amino acids 24-28). Insertion of a gentamicin cassette into psbG results in B-band deficient mutants of PAO1, suggesting a role for it in O-antigen biosynthesis.

psbH

There are 15 bp between psbG and psbH; however, no RBS can be detected upstream of the psbH start codon. The third codon is AAA (Figure 32). PsbH demonstrates low homology with CapM (14.2%) of S. aureus (L et al., 1994), involved in the synthesis of N-acetylgalactosaminuronic acid. PsbH also has homology with a number of glycosyl transferases, including IcsA (17.1%) (accession #U39810) and RfaK (13%) (accession #U35713) of Neisseria meningitidis, and RfbF (11.3%) of Klebsiella pneumoniae (Keenleyside and Whitfield, 1994). There is also a low level of homology with RfpB of Shigella dysenteriae (Gohmann et al., 1994), and BplH and BplE of B. pertussis (Allen and Maskell, 1996). These enzymes are likely to belong to a family of transferases involved in the addition of a similar sugar to the growing O-antigen unit.
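The percentages quoted above come from pairwise sequence comparisons. The description does not state how each figure was computed, or whether "homology" means identity or similarity, so the following is only a minimal sketch of one common convention: percent identity over the aligned, non-gap columns of two pre-aligned sequences. The function name and the example peptides are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumes the two strings are already aligned (equal length, gaps as '-').
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # gapped columns are excluded under this convention
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned peptides, not taken from the patent sequences.
    print(percent_identity("MKT-AL", "MRTGAL"))
```

Other conventions divide by the full alignment length or by the shorter sequence, which gives lower values for the same alignment, so figures from different sources are comparable only if the denominator is known.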

RfpB, RfaK, and RfbF are glucosyl- or galactosyl transferases, and it is likely that CapM is the transferase involved in the addition of N-acetylgalactosaminuronic acid. This suggests that PsbH is one of the two ManA transferases. PsbH also has very limited homology to the DnaK proteins of R. meliloti (Falah and Gupta, 1994) and Agrobacterium tumefaciens (Segal and Ron, 1995). However, the homology is concentrated around the central region of PsbH. DnaK is a chaperonin, and is thought to have a role in gene regulation. Homology around the functional domain of DnaK may suggest a role for psbH/PsbH in regulation of the psb cluster.

psbI.

The start codon of psbI overlaps the stop codon of psbH. A putative RBS is situated 6 bp upstream of the AUG start and the second codon is AAA (Figure 32). PsbI demonstrates strong homology with BplD of B. pertussis (Allen and Maskell, 1996) (Table 2). BplD is purported to initiate the first step in the biosynthesis of 2,3-diNAcManA. PsbI also demonstrates moderate homology to NfrC and ORF o389 (RffD) of E. coli (Daniels et al., 1992), EpsC of Burkholderia solanacearum (Huang and Schell, 1995), YvyH of B. subtilis (Soldo et al., 1993) and RfbC of S. enterica sv Borreze (Keenleyside and Whitfield, 1995). EpsC is thought to be involved in the biosynthesis of N-acetylgalactosaminuronic acid, and RfbC is thought to be UDP-N-acetylglucosamine 2-epimerase. Alignment of PsbI and related proteins is shown in Figure 10. Based on these homologies, it is likely that PsbI converts UDP-N-acetylglucosamine to UDP-N-acetylmannosamine as the first step in the biosynthesis of mannuronic acid. Interestingly, the genes encoding the remaining enzymes in this pathway are located upstream and somewhat removed from the psbI gene (psbABDE).

psbJ.

The distance between psbI and psbJ is 17 bp. A putative RBS is present immediately following the stop codon of psbI, 13 bp from the AUG start codon of psbJ (Figure 4). PsbJ demonstrates reasonable homology to BplE (52.6%) of B. pertussis, a glycosyl transferase thought to attach either 2,3-diNAcManA or FucNAcMe to the O-unit (Allen and Maskell, 1996) (Table 2). TrsE of Yersinia enterocolitica also has homology to PsbJ (Skurnik et al., 1995), and is thought to be one of the galactosyl- or mannosyl transferases. An alignment of PsbJ and PsbJ-like proteins is shown in Figure 39. As BplE also has limited homology with PsbH, it is likely that both PsbH and PsbJ are the transferases involved in the addition of the two mannuronic acid residues to the B-band O-antigen unit. PsbJ has two putative membrane-spanning domains at the N-terminus, and may be anchored in the cytoplasmic membrane.

psbK.

The start codon of psbK overlaps the stop codon of psbJ, and the second codon is AAA (Figure 32). PsbK demonstrates homology to a series of glucose dehydratases, including StrP of Streptomyces glaucescens involved in streptomycin biosynthesis (accession number 629223), ExoB of R. meliloti (Buendia et al., 1991), ORF o355 (incorrectly assigned RffE) of E. coli (Daniels et al., 1992; Macpherson et al., 1994), GraE of Streptomyces violaceoruber (Bechtold et al., 1995) and RfbB of a number of organisms including N. meningitidis (Hamerschmidt et al., 1994) and E. coli (Marolda and Valvano, 1995). Alignment of these proteins shows the presence of an NAD-binding domain (GXXGXXG) near the N-terminal end (Figure 5; Macpherson et al., 1994). RfbB and o355 are known to be involved in the biosynthesis of FucNAc (Meier-Dieter et al., 1992). Based on these homologies, PsbK is thought to be dTDP-D-glucose 4,6-dehydratase, required as the second step in the biosynthesis of FucNAc.

psbL

There are 59 bp between the end of psbK and the start of psbL, but no RBS could be detected in the region preceding the double start codons (Figure 32). Identification of the psbL (rfbA) gene has previously been reported (Dasgupta and Lam, 1995). Further characterization of PsbL suggests it functions as a transferase, and it is thought to initiate O-antigen unit biosynthesis with the addition of FucNAc to undecaprenol, based on its homology to Rfe. The alignment of PsbL with TrsF from Y. enterocolitica (Skurnik et al., 1995) and Rfe from E. coli (Daniels et al., 1992) is shown in Figure 40. Rfe is the initial transferase involved in the biosynthesis of ECA and some O-antigens (Schnaitman and Klena, 1993; Macpherson et al., 1994), transferring GlcNAc to undecaprenol (Meier-Dieter et al., 1992). Because the first transferase in the biosynthesis of O-antigen interacts with undecaprenol, it would be expected to be a hydrophobic protein. PsbL is the most hydrophobic (hydropathy index of 0.84, Table 1) of the three putative transferases encoded in the psb cluster (PsbH, PsbJ, PsbL).

IS407_Pa

Following the psbL gene is an insertion sequence with 61.5% nucleotide identity with the previously characterized IS407 element of B. cepacia (Wood et al., 1991). This homology prompted the designation IS407_Pa, with the subscript _Pa to indicate it is the P. aeruginosa version. Both elements are similar in size (1243 bp for IS407_Bc and 1211 bp for IS407_Pa) and have very similar imperfect inverted repeats (IR) of 12 and 11 bp respectively. The IS407 elements are similar to IS sequences from other soil-, water- and plant-associated bacteria, including ISR1 from R. meliloti (Priefer et al., 1989), IS52 7 from Caulobacter crescentus, IS1222 from Enterobacter agglomerans, IS476 from Xanthomonas campestris (Kearney and Staskawicz, 1990), and IS911 from S. dysenteriae (Prere et al., 1990). There have been previous reports of IS elements in P. aeruginosa (Pritchard and Vasil, 1990; Sokol et al., 1994), but none of these have homology to the above group; therefore this is the first report of IS407 in P. aeruginosa. Southern blot analysis using IS407_Pa as a probe showed it is present in all 20 serotypes of P. aeruginosa (Table 2), and most serotypes appear to have only a single copy of the element.

psbM.

The psbM gene follows the IS407_Pa element and may be transcribed from one of three potential promoters present in the right IR (Figure 31). A gene-activating promoter was previously shown to be present in the right IR of IS407_Bc (Wood et al., 1991). psbM is unusual because, in contrast to the other psb genes described above, it hybridizes to chromosomal DNA from all 20 serotypes (Table 1). PsbM mutants, generated by insertion of a gentamicin cassette into a unique NruI site within psbM, exhibit a B-band LPS-minus phenotype. This confirms the involvement of the psbM product in LPS biosynthesis, despite the fact that it lies outside of the O5-specific region (Figure 41). PsbM has homology to a range of proteins involved in exopolysaccharide synthesis, including BplL from the B. pertussis LPS cluster (Allen and Maskell, 1996), TrsG from the core biosynthetic cluster of Y. enterocolitica O3 (Skurnik et al., 1995), and CapD from the S. aureus capsular gene cluster (L et al., 1994). These homologies are summarized in Table 2.

As shown previously for BplL, only the carboxy half of the PsbM protein has homology to GalE from several bacterial species, suggesting it may have originated as a fusion protein. In support of this hypothesis, PsbM also has homology to two adjacent ORFs (ORF10 and ORF11) in the LPS cluster of V. cholerae O139 (Comstock et al., 1996). The homology to ORF10 and ORF11 lies in the amino-terminal and carboxy-terminal halves of PsbM, respectively (Table 2), suggesting that two similar ORFs were fused during the evolution of PsbM and the BplL/TrsG/CapD group.

Based on these homologies, PsbM is thought to be involved in the biosynthesis of the N-acetylfucosamine residue of the O5 O-antigen. As mentioned above, the O-antigen of B. pertussis, the type 1 capsule of S. aureus and the outer core of Y. enterocolitica O3 all contain N-acetylfucosamine. PsbM could function as a dehydrogenase, and it contains two putative NAD-binding domains (Figure 33), as do BplL and TrsG. Again, these duplications may have arisen from an ancestral fusion of two proteins each containing an NAD-binding domain, and the resulting proteins may be bifunctional.

psbN.

The psbN gene has some homology to eryA, a gene involved in erythromycin biosynthesis in Saccharopolyspora erythraea. Generation of knockout mutations in psbN will demonstrate its function in biosynthesis of the O5 O-antigen.

uvrB.

The last partial open reading frame present on pFV100 has high homology to the highly conserved uvrB gene from several bacterial species, including E. coli, S. enterica sv Typhimurium, and Micrococcus luteus. UvrB is a subunit of the UvrABC DNA excision repair complex involved in removal of thymidine dimers induced by irradiation with ultraviolet light. The presence of uvrB adjacent to psbN confirms that psbN is the last gene in the psb cluster that could be involved in O-antigen biosynthesis.

Organization of the psb gene cluster in P. aeruginosa O5.

Several entire rfb clusters, particularly from enteric bacteria, have been characterized to date (reviewed in Whitfield and Valvano, 1993, and Schnaitman and Klena, 1993). In general, rfb clusters are located on the chromosome adjacent to the his operon and the gnd gene. Amongst the enterics, it has previously been shown that the rfb clusters are organized in a specific fashion (Reeves, 1993; Schnaitman and Klena, 1993). Genes necessary for sugar biosynthesis are arranged in discrete blocks located 5' to the transferases and other assembly genes (rfbX, rfc and rol). The psb cluster, however, appears to be almost randomly organised, with genes thought to be involved in the biosynthesis of Man(2NAc3N)A and Man(2NAc3NAc)A scattered throughout the gene cluster (psbI, psbE, psbD, psbB and psbC). The genes thought to encode the biosynthesis of FucNAc are also scattered throughout the cluster (psbK, psbM, psbG, psbN). Further, the genes encoding transferases are interspersed throughout the psb cluster (psbH, psbJ, psbL), and are separated from one another by one gene each. However, the transferase genes do appear to be organized such that the gene encoding the putative first transferase (PsbL), thought to initiate O-antigen assembly on undecaprenol, is the most distal. Recent results from detailed spectroscopic analysis, using high-resolution NMR and mass spectroscopy, of an rfc mutant of PAO1, strain AK1401, show that FucNAc is the first sugar of the O-antigen unit, attached to the core oligosaccharide. PsbL's homology to Rfe and its hydropathicity support the interpretation that it is the first transferase, and is responsible for attachment of the FucNAc residue to undecaprenol. Therefore, based on their gene order and their relative hydropathic indices (-0.21 and 0.10), the psbJ and psbH gene products are thought to transfer Man(NAc)₂A and Man(2NAc3N)A, respectively.

The O-antigen of P. aeruginosa O5 is an Rfc-dependent heteropolymer

The psb cluster was shown to contain an rfc gene (see Example 1), the interruption of which (by knockout mutation and gene replacement) resulted in an SR phenotype (de Kievit et al., 1995). At least two other gene products, Rol and RfbX, are thought to be involved in Rfc-dependent synthesis of heteropolymeric O-antigens (Whitfield, 1994). Here a rol gene has been identified in the psb cluster. However, in the analysis of the psb genes, no rfbX-like gene was identified. The psbF gene product appeared to be the most likely candidate, based on its hydropathy profile (Figure 9), but insertional mutants of psbF do not have the phenotype expected of rfbX mutants.

Identification of his genes within the psb gene cluster.

The identification of the hisH and hisF genes in the middle of the psb cluster raises some interesting evolutionary questions. It appears that these two his genes are not native to P. aeruginosa, because they have a lower %G+C content than background (50% vs. 67%) and they hybridize only to a limited number of serotypes with related O-antigens instead of all 20 serotypes. It is not uncommon for his operons to be located adjacent to rfb clusters, and it is likely that the his genes were acquired simultaneously with some or all of the psb genes. The lack of significant homology with any of the HisF and HisH proteins characterized to date, and particularly with those of other Gram-negative bacteria, precludes the use of these genes as evolutionary "luggage tags". The lack of homology with other Gram-negative HisH/F proteins suggests either that they came from an as-yet uncharacterized source or that they have been resident in P. aeruginosa for a long time. The latter possibility is bolstered by the divergence over time of the O-antigen structures/genes from the ancestral psb cluster in the five O5-related serotypes in which these hisH and hisF genes are found. The location of hisH and hisF adjacent to one another is unique in bacteria.
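The %G+C comparison used above (roughly 50% for the his genes against a 67% genomic background) is a simple base-composition calculation. The sketch below shows one way to compute it, along with a sliding-window variant that can highlight regions of atypical composition. The function names, window sizes and toy sequences are illustrative assumptions, not the method or data used in this work.

```python
def gc_percent(seq):
    """Percent G+C among unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def sliding_gc(seq, window, step):
    """Return (start, %G+C) for each full window along seq."""
    return [
        (start, gc_percent(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical toy sequences, chosen only to exercise the functions.
    print(gc_percent("GGCCAATT"))
    print(sliding_gc("GGGGAAAA", window=4, step=4))
```

On real data, a gene whose %G+C sits well below the windows around it is a candidate for horizontal acquisition, which is the argument made for hisH and hisF here; composition alone is suggestive rather than conclusive.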
The similarity between the hisH and hisA genes, and the usual location of hisA, rather than hisH, adjacent to hisF, raises the possibility that the P. aeruginosa psb hisH gene was originally a hisA gene that has diverged so as to be more similar to hisH than to hisA. However, there is precedent for the juxtaposition of hisH and hisF; in the yeast Saccharomyces cerevisiae, the homologues of the hisH and hisF genes are adjacent, and are fused into one translational unit called HIS7 (Kuenzler et al., 1993). Alternatively, the hisHF arrangement may be ancestral to the duplication event which resulted in the hisHAF gene order. Another possibility is that the hisA gene may have been lost, leaving hisH and hisF adjacent.

psb gene dissemination amongst the 20 serotypes of P. aeruginosa.

The observation that no genes were found in the O5 cluster which hybridize only to chromosomal DNA from serotype O5 and not to the other related serotypes was intriguing. The differences among these five serotypes are confined to changes in the type of linkage between sugars or to the epimer present in the O-antigen, either mannuronic or guluronic acid (Figure 30). These differences could result from variation in transferase activity or in epimerization activity, respectively. Further analysis of the putative transferase activities will be necessary to determine whether there are differences in activity among serotypes despite the obvious homology at the genetic level. It will be interesting to determine whether the introduction of multicopy plasmids containing the O5 transferase genes into the related serotypes will result in an alteration in O-antigen structure that could be detected with serotype-specific monoclonal antibodies. There is precedent for this, as a phage-induced mutant of P. aeruginosa strain PAO1 (serotype O5), strain AK1380, was isolated which was identified as serotype O16 (see Lam et al., 1992; Fig. 30; and Kuzio and Kropinski, 1993).

The genetic differences among the five serotypes with related O-antigens are obviously quite minor. Comparison of the DNA sequences of the O2 rfc and the O5 rfc genes revealed they are very homologous at the nucleotide level.

EXAMPLE 4 Further Characterization of Rol (Wzz) Gene and Region Upstream

In this example the rol gene is generally referred to as the wzz gene. The materials and methods used in Example 4 are as follows.

Bacterial strains and plasmids.

The bacterial strains and plasmids used in this study are listed in Table 8. P. aeruginosa strains were cultured either on Luria broth or plates or on Pseudomonas Isolation Agar (PIA; Difco, Detroit, MI). E. coli strains were cultured on Luria broth or plates. Media were supplemented with the antibiotics ampicillin, carbenicillin, tetracycline, or gentamicin (all from Sigma, St. Louis, MO) as required, using the concentrations outlined in de Kievit et al., 1995.

DNA methods.

Chromosomal DNA was isolated from P. aeruginosa using the method of Goldberg and Ohman, 1984. Plasmid and cosmid DNA was isolated using the Qiagen midi-prep kit (Qiagen Inc., Chatsworth, CA) as directed by the manufacturer. Restriction and modification enzymes were supplied by Gibco/BRL (Gaithersburg, MD), Boehringer Mannheim (Laval, PQ), and/or New England Biolabs (Beverly, MA) and were used as directed by the manufacturers.

Plasmids were introduced into E. coli by CaCl₂ transformation (Huff et al., 1990) and into P. aeruginosa by electroporation using a BioRad (Richmond, CA) Gene Pulser apparatus following the manufacturer's protocols. P. aeruginosa electrocompetent cells were prepared by washing early log phase cells twice for 5 min each in sterile 15% room-temperature glycerol, followed by immediate resuspension in the same solution. Cells were either used immediately or frozen at -80°C for future use. Alternatively, plasmids were mobilized into P. aeruginosa through biparental mating with E. coli SM10 carrying plasmids of interest (Simon et al., 1983).

Construction of plasmids.

The cosmid pFV100, containing the P. aeruginosa wbp cluster, was used as a source of DNA for the construction of pFV161 (Fig. 43). An overlapping cosmid, pFV400, was the source of a 2.3-kb HindIII fragment cloned into pBluescript II SK (pFV401). For DNA sequencing, a 0.8 kb HindIII-XhoI fragment from pFV401 was subcloned into pBluescript II SK (pFV402). A 3.0 kb SstI fragment containing the 5' portion of wzz and upstream sequences was cloned from pFV400 into pBluescript II SK (pFV403). For complementation experiments, the 2.3 kb insert of pFV401 was cloned into the Pseudomonas-E. coli shuttle vector pUCP26 (Table 14), downstream of the vector's lacZ promoter (pFV401-26).

DNA sequencing and analysis.

Using the above plasmids, the DNA sequences of both strands of the pFV401 insert were determined by the GenAlyTiC facility (University of Guelph, Guelph, ON) employing the Taq DyeDeoxy Terminator Cycle Sequencing Kit (Applied Biosystems, Mississauga, ON) and an Ericomp Model TCX15 thermal cycler. Oligonucleotide primers were synthesized on an Applied Biosystems model 391 DNA synthesizer and purified as directed by the manufacturer.

DNA sequences were collated and analyzed using GENE RUNNER for Windows (Hastings Software, Newark, NJ), DNAsis for Windows (Hitachi Software, Helixx, Scarborough, ON), and PC/GENE (IntelliGenetics Inc., Mountain View, CA). DNA and protein database searches were performed using the NCBI BLAST network server (Altschul et al., 1990; Gish and States, 1993).

Expression of the Wzz protein.

An E. coli S30 extract in vitro protein expression kit (Promega, Madison, WI) was used to examine the product encoded by the O5 wzz gene. Column-purified (Qiagen) plasmid DNA of pBluescript II SK, pFV401a (containing the O5 wzz gene cloned downstream of the lacZ promoter of pBluescript II SK) and pFV401b (containing the same DNA cloned in the opposite orientation) were used as templates in the coupled transcription/translation reaction in the presence of ³⁵S-labelled methionine (Trans35-Label, ICN, Costa Mesa, CA). The labelled proteins were precipitated with acetone, separated on standard discontinuous 12.5% SDS-PAGE along with unstained BioRad low-molecular-weight markers, and visualized by autoradiography using ³⁵S-sensitive film (BioMax, Kodak, Toronto, ON).

Preparation and visualization of LPS.

LPS from P. aeruginosa was prepared by the method of Hitchcock and Brown, 1983. The LPS preparations were separated on standard discontinuous 12.5% SDS-PAGE gels and visualized by silver staining using the method of Dubray and Bezard, 1982. Alternatively, LPS separated on SDS-PAGE gels was transferred to nitrocellulose and visualized by immunoblotting (Burnete, 1981). Nitrocellulose blots were blocked with 3% skim milk, followed by overnight incubation with hybridoma culture supernatants containing MAb MF15-4 (specific for O5 B-band LPS), MAb 18-19 (cross-reactive for O2, O5, and O16 B-band LPS core-plus-one O-antigen unit, 28) or MAb N1F10 (specific for A-band LPS, 30). The second antibody was a goat anti-mouse F(ab)₂-alkaline phosphatase conjugate (Jackson Laboratories, Bio/Can Scientific, Mississauga, ON). The blots were developed using a substrate containing 0.3 mg/ml NBT (Nitro Blue Tetrazolium) and 0.15 mg/ml BCIP (5-bromo-4-chloro-3-indolyl phosphate toluidine) (Sigma) in 0.1 M bicarbonate buffer (pH 9.8).

Creation of wzz knockout mutants through gene replacement.

The gene replacement strategy of Schweitzer and Hoang, 1985 was used for generation of knockout mutations in wzz. The 2.3 kb HindIII insert of pFV401 was cloned into pEX100T, a pUC19-based vector containing the sacB gene as a selectable marker (pFV401T). An 875 bp gentamicin resistance cassette from the plasmid pUCGM was then cloned into the unique XhoI site within the insert (pFV401TGm). Constructs containing the interrupted wzz gene were mobilized into P. aeruginosa O5 by biparental mating with E. coli SM10. Since pEX100T does not replicate in P. aeruginosa, selection for gentamicin resistance allows detection of chromosomally-integrated copies of the mutated gene. Determination of sucrose and carbenicillin (Cb) sensitivities distinguishes between merodiploids (sucrose^S, Cb^R) and true recombinants (sucrose^R, Cb^S). The presence of the gentamicin cassette in the chromosomal DNA of P. aeruginosa O5 and O16 wzz mutants was confirmed by Southern blot analysis (not shown).

RESULTS

Cloning and sequencing of the P. aeruginosa O5 wzz gene.

Nucleotide sequences with homology to wzz from E. coli, Salmonella enterica sv Typhimurium and Shigella flexneri (Bastin et al., 1993; Batchelor et al., 1992; Morona et al., 1995) were identified ending approximately 800 bp upstream of the first gene of the P. aeruginosa O5 wbp gene cluster, wbpA (Fig. 43). The amount of DNA with homology to wzz was 479 bp, starting at the XhoI cloning site of the insert of pFV100 and ending with a stop codon. Based on the average size (1 kb) of previously characterized wzz genes (Bastin et al., 1993; Batchelor et al., 1992; Morona et al., 1995), this sequence represented approximately half of the putative P. aeruginosa wzz gene.

A 1.5 kb XhoI-HindIII fragment from pFV161 containing the 3' end of the putative wzz gene (Fig. 43) was used as a probe to screen a P. aeruginosa O5 cosmid library. One cosmid (pFV400) which hybridized with the probe was isolated. A probe-reactive 2.3 kb HindIII fragment from pFV400 was subcloned into pBluescript II SK to form pFV401 (Fig. 43).

DNA sequence analysis revealed an open reading frame (ORF) of 1046 base pairs (bp), sufficient to encode a protein of 348 amino acids with a molecular mass of 39.3 kilodaltons (kDa), and an isoelectric point of 6.26. Comparison of the deduced amino acid sequence of the P. aeruginosa O5 protein with those in GenBank revealed from 11.5 to 20.0% amino acid identity with Wzz-like proteins of other species (Table 15). P. aeruginosa Wzz also has similarity with proteins thought to be involved in polymerization or export of exopolysaccharide capsules in E. coli O8/O9 (13, 15; accession #U39306), Vibrio cholerae O139 (4; OtnB, X90547), Klebsiella pneumoniae (ORF6, 747665), and Rhizobium meliloti (ExoP, Z22636). P. aeruginosa Wzz also has similarity with FepE from E. coli, thought to be a component of the ferric enterobactin permease (Ozenburger et al., 1987; X74129).
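The relationship between ORF length, residue count and predicted mass quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the 110 Da average residue mass are assumptions (a common rule of thumb), not values taken from the specification.

```python
# Back-of-envelope check relating ORF length, residue count and protein mass.
# AVG_RESIDUE_MASS_DA is an assumed rule-of-thumb value, not a figure from the text.
AVG_RESIDUE_MASS_DA = 110.0


def orf_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length in bp of an ORF encoding n_residues amino acids."""
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


def approx_mass_kda(n_residues: int, avg_residue_mass: float = AVG_RESIDUE_MASS_DA) -> float:
    """Rough protein mass in kDa from the residue count alone."""
    return n_residues * avg_residue_mass / 1000.0


if __name__ == "__main__":
    print(orf_length_bp(348, include_stop=False))  # coding region only
    print(orf_length_bp(348))                      # coding region plus stop codon
    print(round(approx_mass_kda(348), 1))          # rough mass estimate in kDa
```

Under these assumptions, 348 residues correspond to 1044 bp of coding sequence (1047 bp with a stop codon) and a rough mass of about 38 kDa, which is in line with the 39.3 kDa predicted from the actual sequence.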

While there is poor primary sequence homology between the Wzz protein of P. aeruginosa O5 and related proteins, their predicted secondary structures are similar (Fig. 44). There are conserved hydrophobic regions at both the amino and carboxy termini, and hydrophilic regions in the central portion of the protein. The predicted transmembrane helices in P. aeruginosa O5 Wzz are between amino acids 29-49 and 319-339. These hydrophobic regions contain the amino acid residues which are most highly conserved among Wzz-like proteins.

Analysis of the region upstream of wzz.

The wzz gene is upstream of the wbp cluster of P. aeruginosa O5. As described in Example 3, most of the genes in this cluster, including wzz, are serogroup-specific, and are found only in serotypes O2, O5, O16, O18, and O20. These serotypes have chemically- and structurally-related O antigens (Knirel and Kochetkov, 1994). Based on Southern blot hybridization results, the 5' end of the serogroup-specific region was previously localized to a 1.9-kb SstI-XhoI fragment located 1.1 kb upstream of the 5' end of pFV100. DNA sequence analysis of this fragment revealed a gene with 85% nucleotide identity with the E. coli gene rpsA, encoding 30S ribosomal protein S1 (Schnier et al., 1982), and a second gene which has 98% identity with P. aeruginosa himD, encoding the β subunit of integration host factor (IHF) (Delic-Attree et al., 1995). The rpsA and himD genes are transcribed in the same direction as wzz. These data locate rpsA and himD adjacent to the wbp cluster at 37 minutes on the chromosomal map of P. aeruginosa O5 strain PAO1 (Holloway et al., 1994; Lightfoot and Lam, 1993).

Expression of the putative Wzz protein.

Using an E. coli S30 extract expression system, the putative wzz gene was shown to encode a protein with an apparent molecular weight of 40 kDa which was not present in samples containing only the vector, pBluescript II SK (Fig. 45). The estimated size of 40 kDa is in good agreement with that predicted from the DNA sequence (39.3 kDa). A reduced amount of the same protein was detected in the sample in which the insert DNA was cloned in the opposite orientation (pFV401b), indicating that there is a native promoter present upstream of the wzz gene which functions weakly in E. coli. Examination of the DNA sequence upstream of wzz revealed at least three potential promoter sequences with partial homology to the E. coli σ⁷⁰ consensus. The -10 regions of these putative promoters are located approximately 60, 140, or 155 bp upstream of the wzz initiation codon.

Analysis of the putative Wzz protein function using chromosomal knockout mutants.

A gentamicin-resistance (Gm^R) cassette was inserted into the putative wzz gene of P. aeruginosa O5, and the interrupted gene was reintroduced into the O5 chromosome by homologous recombination. Comparison of LPS from the wild-type strain and the Gm^R mutant on silver-stained SDS-PAGE gels and Western immunoblots using B-band-specific MAbs MF15-4 and 18-19 showed that the mutant had an altered LPS banding pattern. When MAb 18-19 was used, the LPS from the wzz mutant showed an increase in both shorter and longer B-band LPS O chains and a decrease in B-band O chains whose length corresponded to that preferred in the O5 parent strain (Fig. 46). On the immunoblot using MAb MF15-4, which is specific for high-molecular-weight LPS (Lam et al., 1992), there is also an increase in both shorter and longer B-band O chains. Similar Western immunoblots using the A-band LPS-specific MAb N1F10 showed that the modality of A-band LPS was unaffected by the wzz mutation (not shown). Although the B-band LPS pattern of the wzz mutant is significantly different from that of the parent strain, it does not show the linear distribution of O-antigen chain lengths seen in enteric wzz mutants (Fig. 47A). Reintroduction of the O5 wzz gene on pFV401-26 restored the mutant to a phenotype similar to that of the parent but missing both the shortest and longest groups of chain lengths (Fig. 46).

Comparison of the function of wzz in two related serotypes of P. aeruginosa.

A DNA probe containing the O5 wzz gene hybridized with chromosomal DNA only from serotypes O2, O5, O16, O18, and O20 of P. aeruginosa, all of which have chemically- and structurally-related O antigens (Example 3). The O antigens of both O5 and O16 are composed of two mannuronic acid residues and one N-acetyl fucosamine residue, but differ in one glycosidic linkage. In O5, the linkage is (l(3)-(-D-Fuc2NAc, while in O16, the linkage is (l(3)-(-D-Fuc2NAc. This change results in a discernible difference in the LPS patterns of O5 and O16 (Fig. 46).

Taking advantage of the similarity between the O-antigen gene clusters of O5 and O16, a wzz knockout mutation was introduced into O16 using the O5 wzz knockout construct. As an additional benefit, O16 does not express A-band LPS (Lam et al., 1989), so any changes in B-band LPS patterns on silver-stained gels were more easily visualized. The structural difference between O5 and O16 LPS is detected by MAb MF15-4, which recognizes only O5 and not O16 LPS. To examine LPS from both O5 and O16 simultaneously on Western immunoblots, MAb 18-19, which cross-reacts with all five serotypes in the O5 serogroup (Lam et al., 1992), was used. Comparison of LPS from the wild-type O16 parent and the O16 wzz knockout mutant showed that the mutant displayed a loss of modality corresponding to the preferred chain lengths of the parent, and an increase in higher-molecular-weight LPS (Fig. 46). Interestingly, there still appeared to be chain length modulation in the O16 wzz mutant that was different from that of the parent, with a decrease in short O chains in comparison to the O5 wzz mutant. Bastin and coworkers (1996) showed that the modality of chain length distribution was dependent on the source of the wzz gene. However, the pattern of LPS chain length distribution of O16 wzz mutants carrying the O5 wzz gene on pFV401-26 resembled that of the O16 parent strain, rather than the O5 strain (Fig. 46).

Ability of the P. aeruginosa O5 wzz gene to function in E. coli.

In order to determine whether wzz from P. aeruginosa O5 could complement an enteric wzz mutation, E. coli strain CLM4, which is deleted for O-antigen genes including wzz (Marolda and Valvano, 1993), was used. CLM4 was transformed with either pSS37 (containing the O-antigen biosynthetic genes from S. dysenteriae type I without a wzz gene) alone, or with both pSS37 and pFV401, containing P. aeruginosa O5 wzz. While LPS from E. coli CLM4/pSS37 showed an unregulated distribution of chain lengths, LPS from E. coli CLM4/pSS37/pFV401 showed a restoration to modality, with a decrease in short and very long O chains, and an increase in chains with approximately 10-20 repeats.

The core oligosaccharide of the E. coli K-12 hybrid strain HB101, but not K-12 itself, can act as an acceptor for P. aeruginosa O antigens (Goldberg et al., 1992; Lightfoot and Lam, 1993). The structure of the HB101 core has not been elucidated. Although E. coli HB101 carrying pFV100 had previously been shown to express LPS which could be recognized by B-band-specific MAb MF15-4, its chain-length regulation had not been examined. pFV100 is now known to contain a truncated wzz gene. The expression of LPS from E. coli HB101 carrying both pFV100 and the complete O5 wzz gene on pFV401 was examined. E. coli HB101 carrying pFV100 alone expressed an O5 O antigen with modulated, short-chain O-antigen molecules (Fig. 47B). When both pFV100 and pFV401 were present in E. coli HB101, a dual LPS banding pattern was visible on Western immunoblots (Fig. 47B). The coexpression of both E. coli and P. aeruginosa Wzz proteins resulted in a major group of short O chains attributable to HB101 Wzz, and a minor group with longer chains attributable to the P. aeruginosa O5 Wzz protein.

The identification of the rpsA and himD genes upstream of wzz completes the delineation of the region of serogroup-specific DNA responsible for encoding the B-band LPS O antigen of P. aeruginosa O5 and related serotypes. The entire O5 wbp cluster is thus bounded by himD on the 5' end and uvrB on the 3' end, and is approximately 24.3 kb from the start of wzz to the end of wbpN. The serogroup-specific portion is approximately 18.4 kb from the start of wzz to the end of wbpL. Unlike enteric O-antigen (rfb) clusters, the wbp cluster is not flanked by his and gnd, although there are two his genes, hisH and hisF, located in the center of the cluster. The location of wzz upstream of the wbp cluster in P. aeruginosa is opposite to that in many enteric bacteria, where wzz is located downstream of the O-antigen cluster (Batchelor et al., 1992; Morona et al., 1995). The presence of the rpsA and himD genes, which are highly conserved among bacterial species, at the junction between the serogroup-specific and common regions suggests they may have been the site of a past recombination event. himD encodes the β-subunit of IHF, which has previously been shown to be involved in regulation of biosynthesis of the exopolysaccharide alginate (Wozniak and Ohman, 1993; Wozniak, 1994). The presence of a functional wzz gene in P. aeruginosa O5 confirms that both the O-antigen polymerase, Wzy, and Wzz are required for expression of the heteropolymeric B-band O antigen, as predicted by current models. Growing evidence suggests that Wzz proteins may also play a role in the modulation of the length of capsular exopolysaccharide polymers (Bik et al., 1996; Dodgson et al., 1996; Franco et al., 1996). A possible homologue of the third component of Wzy-dependent systems, Wzx, is present in the wbp cluster (Burrows et al., 1996).

The LPS banding pattern of enteric wzz mutants consists mainly of short O chains with steadily decreasing amounts of longer chains (Fig. 47A). In contrast, neither the O5 nor the O16 wzz mutants display this typical wzz phenotype, and the O16 mutant in particular continues to display some chain length regulation. It is possible that chain length regulation in P. aeruginosa is not simply dependent on wzz. In the case of O16, there may be a second wzz gene present in the O16 chromosome whose activity is normally masked by the wzz of the O5 serogroup. Complementation of the O5 and O16 mutants by wzz on a multicopy plasmid gave rise to strains whose LPS appeared even more tightly regulated for size than that of the parent strains, since the complemented wzz mutants lacked both short- and very long-chain modal groups, and had an increase in medium-length groups. One possible interpretation of these results is that the regulation of chain length by wzz in P. aeruginosa is normally imprecise, giving rise to groups with multiples of the preferred chain length instead of a single group. This interpretation fits the model of Bastin et al., 1993, who suggested that multimodal distributions of chain lengths could result from reinitiation of polymerization without an intervening ligation step.
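The contrast between an unregulated and a modal chain-length distribution can be illustrated with a toy calculation. The sketch below is not the authors' model or that of Bastin et al.; it simply assumes that after each repeat unit is added the chain is ligated (and so stops growing) with some probability. All names and parameter values (`unregulated`, `modal`, the 0.15, 0.02 and 0.5 probabilities, and the preferred length of 15) are hypothetical.

```python
# Toy model: a chain gains one repeat unit per step and is then ligated
# (terminated) with probability stop_prob(n). All numbers are hypothetical.

def chain_length_distribution(stop_prob, max_len=60):
    """Return a list whose item n-1 is P(final chain length == n), n = 1..max_len."""
    dist = []
    still_growing = 1.0  # probability the chain has not yet been ligated
    for n in range(1, max_len + 1):
        p = stop_prob(n)
        dist.append(still_growing * p)
        still_growing *= 1.0 - p
    return dist


def unregulated(n):
    """Constant ligation probability: no length preference."""
    return 0.15


def modal(n, preferred=15, low=0.02, high=0.5):
    """Ligation is rare below the preferred length and frequent at or above it."""
    return high if n >= preferred else low


def most_common_length(dist):
    """Chain length (1-based) with the highest probability."""
    return max(range(len(dist)), key=dist.__getitem__) + 1


if __name__ == "__main__":
    print(most_common_length(chain_length_distribution(unregulated)))
    print(most_common_length(chain_length_distribution(modal)))
```

With a constant ligation probability the distribution falls off steadily from the shortest chains, resembling the enteric wzz mutant pattern described above; making ligation length-dependent shifts the peak to the preferred length.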

Complementation of the O16 mutants by the O5 wzz gene restored them to a phenotype resembling the O16 parent. Contrary to the findings of Bastin and colleagues, 1993, these results show that in these closely-related serotypes, the structure of the O antigen, or possibly a difference in the O5 vs O16 genetic background, determines the preferred O-antigen chain length. While the O16 wzz and wzy genes have not been isolated, they are probably highly similar to those of O5 based on the results of high-stringency Southern blot analysis. The analysis of wzy from the related serotypes O2 and O5 demonstrated that the genes are essentially identical.

The P. aeruginosa O5 Wzz protein can modulate expression of both homologous (P. aeruginosa O5) and heterologous (S. dysenteriae) O antigens in E. coli, although it has only 20% identity with the Wzz protein of E. coli. The ability of P. aeruginosa Wzz to modulate a heterologous O antigen is consistent with previous work showing Wzz is not specific for O-antigen type. When E. coli and P. aeruginosa Wzz proteins are coexpressed in E. coli, the modulating effect of the native protein predominates, although the P. aeruginosa wzz is present in multicopy. This difference can be seen in the increased proportion of short O chains versus longer O chains which are expressed. Despite variations in efficacy, it appears that the Wzz proteins from different Gram-negative families function in an analogous manner and can act as interchangeable components of the O-antigen assembly complex. The ability of Wzz, Wzy and WaaL proteins with divergent primary sequences to act reciprocally suggests that they are interacting through recognition of common, conserved structural features. Although the amino acid similarities between the Wzz proteins are low, their secondary structures are alike (Fig. 44). Similarly, although the primary sequence similarities of the Wzy proteins from a number of bacteria are poor, all have highly similar secondary structures containing multiple membrane-spanning domains (Cryz et al., 1984). Comparison of the WaaL proteins from E. coli and S. enterica sv Typhimurium, the only O-antigen ligases characterized to date, shows that they too have conserved secondary structures, but less than 20% primary sequence homology (Liu and Wang, 1990). In light of this information, it is now possible to target conserved structural features of these proteins for modification in order to further define the areas critical for putative protein interactions.

Having illustrated and described the principles of the invention in a preferred embodiment, it should be appreciated by those skilled in the art that the invention can be modified in arrangement and detail without departure from such principles. We claim all modifications coming within the scope of the following claims.

All publications, patents and patent applications referred to herein are incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety. Full citations for the references referred to in the specification are set out below, and detailed legends for the figures are provided. The application contains sequence listings which form part of the application.

TABLE 1

Pseudomonas aeruginosa serotype O5 wbp gene cluster.

^a truncated ORF

^b de Kievit et al. (1995)

^c wbpL was originally named rfbA; Dasgupta and Lam (1995)

^d number of amino acids

^e isoelectric point of the protein, calculated using GeneRunner for Windows (Hastings Software).

^f hydropathic index of the protein, calculated using DNAsis for Windows (Hitachi Software). Positive values indicate the protein is hydrophobic, while negative values indicate the protein is hydrophilic.

^g distribution of this gene among the 20 serotypes of P. aeruginosa, based on positive hybridization in high-stringency Southern blot analysis.

TABLE 2


TABLE 2 Cont'd

WbpE: BplC-B. pertussis; DegT-Bacillus subtilis; ERYC1-Saccharopolyspora erythraea; SpsC-Ba. subtilis; DnrJ-Str. peucetius

WbpF: ExoT-R. meliloti; FeuC-Ba. subtilis

WbpG: ORF1-Vibrio cholerae O139; Pfk-Lactococcus lactis; NrfB-H. influenzae

WbpH: RfaK-Neisseria meningitidis; CapM-S. aureus; IcsA-N. meningitidis; BplH-B. pertussis; BplE-B. pertussis

WbpI: BplD-B. pertussis; EpsC-R. solanacearum; RffE (o389)-E. coli; YvyH-Ba. subtilis; RfbC-S. enterica sv Borreze

WbpJ: BplE-B. pertussis; TrsE-Yersinia enterocolitica O:3


TABLE 3

Amino acid homologies of HisH proteins

Amino acid homologies of HisH proteins from various bacterial species. The amino acid sequences of various HisH proteins were aligned pairwise using the PC/GENE PALIGN program with the following parameters: K-tuple value = 1; gap penalty = 5; window size = 10; open gap cost = 10; unit gap cost = 10; filtering level = 2.5. The numbers shown are a summation of identical and conserved amino acid residues. Key: PA, Pseudomonas aeruginosa O5 psb cluster HisH; AB, Azospirillum brasilense HisH; EC, Escherichia coli HisH; HI, Haemophilus influenzae HisH; LL, Lactobacillus lactis HisH; RS, Rhodobacter sphaeroides HisH; and ST, Salmonella enterica typhimurium HisH.

TABLE 4

Amino acid homologies of HisF proteins.

Amino acid homologies of HisF proteins from various bacterial species. The amino acid sequences of various HisF proteins were aligned pairwise using the PC/GENE PALIGN program with the following parameters: K-tuple value = 1; gap penalty = 5; window size = 10; open gap cost = 10; unit gap cost = 10; filtering level = 2.5. The numbers shown are a summation of identical and conserved amino acid residues. Key: Pa, Pseudomonas aeruginosa O5 psb cluster HisF; Ab, Azospirillum brasilense HisF; Ec, Escherichia coli HisF; Hi, Haemophilus influenzae HisF; Ll, Lactobacillus lactis HisF; Rs, Rhodobacter sphaeroides HisF; and St, Salmonella enterica typhimurium HisF.
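The "summation of identical and conserved amino acid residues" reported in these tables can be illustrated with a small scoring routine. This is a minimal sketch, not the PALIGN algorithm: it assumes the two sequences have already been aligned (gaps written as "-"), it counts only ungapped columns, and the conservative-substitution groups in `CONSERVED_GROUPS` are a hypothetical choice made for illustration.

```python
# Percent identity and percent (identity + conservative substitution) for a
# pre-aligned pair of protein sequences. The residue groups are an assumption
# for illustration and are not the table used by PC/GENE PALIGN.
CONSERVED_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def _group_of(residue):
    """Index of the conservative group containing residue, or None."""
    for index, group in enumerate(CONSERVED_GROUPS):
        if residue in group:
            return index
    return None


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent identity + conserved) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = conserved = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
        elif _group_of(a) is not None and _group_of(a) == _group_of(b):
            conserved += 1
    if compared == 0:
        return 0.0, 0.0
    return (100.0 * identical / compared,
            100.0 * (identical + conserved) / compared)


if __name__ == "__main__":
    print(identity_and_similarity("MKV-LA", "MRVTLG"))
```

Other conventions are possible, for example dividing by the full alignment length or by the length of the shorter sequence; which denominator PALIGN uses is not stated in the text.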

TABLE 5

Pairwise comparison of Rol amino acid homologies¹

PA EC1 EC2 SF ST

PA 100.0

EC1

EC2

SF

ST

¹ Analyses were done using the PC/GENE PALIGN program.

² PA, Pseudomonas aeruginosa O5 Rol; EC1, E. coli O75 Rol; EC2, E. coli O111 CLD; SF, Shigella flexneri Rol; ST, Salmonella enterica serovar typhimurium strain LT2 CLD. Note that CLD (chain length determinant) is another nomenclature used by some researchers (Bastin et al., 1993) to describe the same class of Rol proteins.

TABLE 6

Bacterial strains and plasmids

^a OT684 is the immediate progenitor strain of AK1401 and is a restrictionless mutant of PAO1 (Potter and Loutit, 1982).

TABLE 7

Rfc proteins of P. aeruginosa and other gram-negative organisms

^a Molecular weight based on nucleotide sequence

^b Hydropathy index deduced from hydrophobicity analysis (Kyte and Doolittle, 1982)

^c Percentage of the bases G and C in the coding sequence
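The two derived quantities named in these footnotes can be sketched as follows. The hydropathy values are the published Kyte and Doolittle (1982) scale; the function names, and the choice to report a whole-sequence average rather than a sliding-window profile, are assumptions made for illustration and may differ from how the tabulated values were obtained.

```python
# Mean Kyte-Doolittle hydropathy of a protein and percent G+C of a coding
# sequence. Function names are illustrative; the scale is Kyte and Doolittle (1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(protein):
    """Average hydropathy per residue; positive means net hydrophobic."""
    residues = protein.upper()
    if not residues:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


def gc_percent(dna):
    """Percentage of G and C bases in a nucleotide sequence."""
    bases = dna.upper()
    if not bases:
        raise ValueError("empty sequence")
    return 100.0 * sum(bases.count(b) for b in "GC") / len(bases)


if __name__ == "__main__":
    print(mean_hydropathy("IK"))
    print(gc_percent("ATGC"))
```

A membrane-spanning protein such as an O-antigen polymerase would be expected to give a positive average, while a soluble hydrophilic protein would give a negative one.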

TABLE 8

Bacterial strains and plasmids used in this study.

Strain or plasmid | Genotype, phenotype or properties | Reference/source

P. aeruginosa

O5 | strain PAO1, wild type, A+ B+ | 20

O5 wzz | PAO1, wzz insertion mutation at XhoI, A+ B+ | this study

IATS O16 | Serotype O16, wild type, A- B+ | 33

O16 wzz | Serotype O16, wzz insertion mutation at XhoI, A- B+ | this study

E. coli

JM109 | recA1 supE endA1 hsdR17 relA1 thi Δ(lac-proAB) F'[traD36, proAB+, lacIq, lacZΔM15] | 53

SM10 | thi-1 thr leu tonA lacY supE recA RP4-2-Tc::Mu, Km^R | 45

HB101 | F- thi-1 hsdS20 serA ara proA1 lacY1 galK2 rpsL20 xyl mtl-1 supE44 recA13 leuB6 Str^R | 27

CLM4 | lacZ2286 trp-49 Δ(sbcB-rfb)86 upp-12 relA1 rpsL150 λ- recA | 35

Plasmids

pFV100 | 24.4 kb XhoI fragment in cosmid pCP13, contains the wbp cluster | 8, 31

pFV400 | 25.0 kb Sau3AI fragment in pCP13, overlaps pFV100 | this study

pFV401 | 2.3 kb HindIII fragment in pBluescript II SK, contains the P. aeruginosa O5 wzz gene | this study

pFV401-26 | same insert in pUCP26 | this study

pFV401TGm | same insert in pEX100T, with Gm^R cassette inserted at unique XhoI site within wzz | this study

pFV403 | 3.0 kb SstI fragment in pBluescript II SK, contains 5' portion of wzz and upstream sequences | this study

pBluescript II SK | 2.9 kb cloning vector containing T7 promoter, Ap^R | Stratagene

pUCP26 | 4.9 kb pUC18-based broad-host-range vector, Tc^R | 48

pEX100T | gene-replacement vector, oriT⁺, sacB⁺, Ap^R | 44

pUCPGM | source of gentamicin resistance cassette; Ap^R, Gm^R | 44

TABLE 9

Amino acid identities/similarities of various wzz-like proteins.

Columns: Ec Wzz | Ec o349 | Sf Wzz | St Wzz | Ec O8 Wzz | Ye Wzz | Yp Wzz | Ec FepE | Vc OtnB

Rows: Pa Wzz; Ec Wzz; Ec o349; Sf Wzz; St Wzz; Ec O8 Wzz; Ye Wzz; Yp Wzz; Ec FepE

Numbers shown are percent identity, with percent similarity in brackets.

Pa, P. aeruginosa O5, accession U50397; Ec Wzz, E. coli O111, Z17241; Ec o349, E. coli, M87049; Sf Wzz, Shigella flexneri, X71970; St Wzz, S. enterica sv Typhimurium LT2, M89933; Ec O8 Wzz, E. coli O8, U39306; Ye Wzz, Yersinia enterocolitica O:8, U43708; Yp Wzz, Y. pseudotuberculosis, U13685; Ec FepE, E. coli, P26266; Vc OtnB, Vibrio cholerae O139, X90547.

REFERENCES

Alifano, P., Fani, R., Liò, P., Lazcano, A., Bazzicalupo, M., Stella Carlomagno, M., and Bruni, C.B. (1996) Histidine biosynthetic pathway and genes: structure, regulation, and evolution. Microbiol Rev 60: 44-69.

Allen and Maskell (1996) The identification, cloning and mutagenesis of a genetic locus required for lipopolysaccharide biosynthesis in Bordetella pertussis. Mol Microbiol 19: 37-52.

Altschul, S.F., W. Gish, W. Miller, E.W. Myers, and D.J. Lipman. 1990. Basic local alignment search tool. J Mol Biol 215: 403-410.

Amor, P., and L. Mutharia. (1995) Cloning and expression of rfb genes from Vibrio anguillarum serotype O2 in Escherichia coli: evidence for cross-reactive epitopes. Infect Immun 63: 3537-3542.

Arisawa, A., Tsunekawa, H., Okamura, K. and Okamoto, R. (1995) Nucleotide sequence analysis of the carbomycin biosynthetic genes including the 3-O-acyltransferase gene from Streptomyces thermotolerans. Biosci Biotechnol Biochem 59: 582-588.

Arsenault, T. L., Hughes, D. W., MacLean, D. B., Szarek, W. A., Kropinski, A. M. B. and Lam, J. S. 1991. Structural studies on the polysaccharide portion of "A-band" lipopolysaccharide from a mutant (AK1401) of P. aeruginosa strain PAO1. Can J Chem 69: 1273-1280.

Bastin, D.A., G. Stevenson, P.K. Brown, A. Haase, and P.R. Reeves. 1993. Repeat unit polysaccharides of bacteria: a model for polymerization resembling that of ribosomes and fatty acid synthetase, with a novel mechanism for determining chain length. Mol Microbiol 7: 725-734.

Batchelor, R.A., P. Alifano, E. Biffali, S.I. Hull, and R.A. Hull. 1992. Nucleotide sequences of the genes regulating O-polysaccharide antigen chain length (rol) from Escherichia coli and Salmonella typhimurium: protein homology and functional complementation. J Bacteriol 174: 5228-5236.

Bechthold, A., Sohng, J.K., Smith, T.M., Chu, X. and Floss, H.G. (1995) Identification of Streptomyces violaceoruber Tu22 genes involved in the biosynthesis of granaticin. Mol Gen Genet 248: 610-620.

Berry, D., and Kropinski, A. M. 1986. Effect of lipopolysaccharide mutations and temperature on plasmid transformation efficiency in P. aeruginosa. Can J Microbiol 32: 436-438.

Bik, E.M., A.E. Bunschoten, R.J.L. Willems, A.C.Y. Chang, and F.R. Mooi. 1996. Genetic organization and functional analysis of the otn DNA essential for cell-wall polysaccharide synthesis in Vibrio cholerae O139. Mol Microbiol 20: 799-811.

Binotto, J., MacLachlan, R., and Sanderson, K. E. 1991. Electrotransformation in Salmonella typhimurium LT2. Can J Microbiol 37: 474-477.

Birnboim, H. C., and Doly, J. 1979. A rapid extraction procedure for screening recombinant plasmid. Nucleic Acids Res 7: 1513-1523.

Bogdanova, N., Bork, C., and Hell, R. (1995) Cysteine biosynthesis in plants: isolation and functional identification of a cDNA encoding a serine acetyltransferase from Arabidopsis thaliana. FEBS Lett 358: 43-47.

Boyer, H. W., and Roulland-Dussoix, D. 1969. A complementation analysis of the restriction and modification of DNA in Escherichia coli. J Mol Biol 41: 459-496.

Brown, P. K., Romana, L. K., and Reeves, P. R. 1992. Molecular analysis of the rfb gene cluster of Salmonella serovar muenchen (strain M67), the genetic basis of the polymorphism between groups C2 and B. Mol Microbiol 6: 1385-1394.

Buendia, A.M., Enenkel, B., Köplin, R., Niehaus, K., Arnold, W., and Pühler, A. (1991) The Rhizobium meliloti exoZ/exoB fragment of megaplasmid 2: ExoB functions as a UDP-glucose-4-epimerase and ExoZ shows homology to NodX of Rhizobium leguminosarum biovar viciae strain TOM. Mol Microbiol 5: 1519-1530.

Burnette, W.N. 1981. Western blotting: electrophoretic transfer of proteins from sodium dodecyl sulphate-polyacrylamide gels to unmodified nitrocellulose and radiographic detection with antibody and radioiodinated protein A. Anal Biochem 112: 195-203.

Burrows, L.L., D. Chow, and J.S. Lam. 1997. Pseudomonas aeruginosa B-band O antigen chain length is modulated by Wzz (Rol). J Bacteriol 179: in press.

Burrows, L.L., D.F. Charter, and J.S. Lam. 1996. Molecular characterization of the Pseudomonas aeruginosa serotype O5 B-band lipopolysaccharide gene cluster. Mol Microbiol 22: 481-495.

Collins, L. V., and Hackett, J. 1991. Molecular cloning, characterization, and nucleotide sequence of the rfc gene, which encodes an O-antigen polymerase of Salmonella typhimurium. J Bacteriol 173: 2521-2529.

Comstock, L.E., Johnson, J.A., Michalski, J.M., Morris, J.G., Jr., and Kaper, J.P. (1996) Cloning and sequence of a region encoding a surface polysaccharide of Vibrio cholerae O139 and characterization of the insertion site in the chromosome of Vibrio cholerae O1. Mol Microbiol 19: 815-826.

Cryz, S.J. Jr., T.L. Pitt, E. Furer, and R. Germanier. 1984. Role of lipopolysaccharide in virulence of Pseudomonas aeruginosa. Infect. Immun. 44:508-513.

Daniels, D.L., Plunkett, G., Burland, V., and Blattner, F.R. (1992) Analysis of the Escherichia coli genome: DNA sequence of the region from 84.5 to 86.5 minutes. Science 257: 771-778.

Darzins, A., and Chakrabarty, A. M. 1984. Cloning of genes controlling alginate biosynthesis from a mucoid cystic fibrosis isolate of P. aeruginosa. J Bacteriol 159: 9-18.

Dasgupta, T., and Lam, J. S. Identification of putative rfb genes involved in B-band lipopolysaccharide biosynthesis in P. aeruginosa serotype O5. Submitted for publication.

Dasgupta, T., and J.S. Lam. (1995) Identification of rfbA, involved in B-band lipopolysaccharide biosynthesis in Pseudomonas aeruginosa serotype O5. Infection and Immunity 63: 1674-1680.

Dasgupta, T., Malburg, S., and Lam, J. S. 1993. Program Abstr 93rd Gen Meet Amer Soc Microbiol, abstr D-240.

Davis, E.O., Evans, I.J. and Johnston, A.W. (1988) Identification of nodX, a gene that allows Rhizobium leguminosarum biovar viciae strain TOM to nodulate Afghanistan peas. Mol Gen Genet 212: 531-535.

Denk, D. and Bock, A. (1987) L-cysteine biosynthesis in Escherichia coli: nucleotide sequence and expression of the serine acetyltransferase (cysE) gene from the wild-type and a cysteine-excreting mutant. J Gen Microbiol 133: 515-525.

de Kievit, T.R., T. Dasgupta, H. Schweizer, and J.S. Lam. 1995. Molecular cloning and characterization of the rfc gene of Pseudomonas aeruginosa (serotype O5). Mol Microbiol 16: 565-574.

de Kievit, T.R., and J.S. Lam. 1997. Pseudomonas aeruginosa rfc genes of serotypes O2 and O5 could complement O-polymerase deficient SR mutants of either serotype. FEMS Microbiol Letters, in press.

de Kievit, T. R., and Lam, J. S. 1994. Program Abstr 94th Gen Meet Amer Soc Microbiol, abstr D-192.

de Lencastre, H., Chak, K.-F., and Piggot, P. J. 1983. Use of Escherichia coli transposon Tn1000 (γδ) to generate mutations in Bacillus subtilis DNA. J Gen Microbiol 129: 3202-3210.

Delic-Attree, I., B. Toussaint, and P.M. Vignais. 1995. Cloning and sequence analyses of the genes coding for the integration host factor (IHF) and HU proteins of Pseudomonas aeruginosa. Gene 154: 61-64.

Deretic, V., Gill, J.F., and Chakrabarty, A.M. (1987) Gene algD coding for GDPmannose dehydrogenase is transcriptionally activated in mucoid Pseudomonas aeruginosa. J Bacteriol 169: 351-358.

Dhillon, N., Hale, R.S., Cortes, J., and Leadlay, P.F. (1989) Molecular characterization of a gene from Saccharopolyspora erythraea (Streptomyces erythraeus) which is involved in erythromycin biosynthesis. Mol Microbiol 3: 1404-1414.

Ditta, G., Schmidhauser, T., Yakobson, E., Su, P., Liang, X.-W., Finlay, D. R., Guiney, D., and Helinski, D. R. 1985. Plasmids related to the broad host range vector, pRK290, useful for gene cloning and for monitoring gene expression. Plasmid 13: 149-153.

Dodgson, C., P. Amor, and C. Whitfield. 1996. Distribution of the rol gene encoding the regulator of lipopolysaccharide O-chain length in Escherichia coli and its influence on the expression of group I capsular K antigens. J Bacteriol 178: 1895-1902.

Dubray, G., and G. Bezard. 1982. A highly sensitive periodic acid-silver stain for 1,2-diol groups of glycoproteins and polysaccharides in polyacrylamide gels. Anal Biochem 119: 325-329.

Falah, M. and R. S. Gupta. 1994. Cloning of the hsp70 (dnaK) genes from Rhizobium meliloti and Pseudomonas cepacia: phylogenetic analyses of mitochondrial origin based on a highly conserved protein sequence. J Bacteriol 176: 7748-7753.

Farinha, M. A., and Kropinski, A. M. 1990. High efficiency electroporation of P. aeruginosa using frozen cell suspensions. FEMS Microbiol Lett 70:221-226.

Fleischmann, R.D., Adams, M.D., White, O., Clayton, R.A., Kirkness, E.F., Kerlavage, A.R., Bult, C.J., Tomb, J.-F., Dougherty, B.A., Merrick, J.M., McKenney, K., Sutton, G., FitzHugh, W., Fields, C.A., Gocayne, J.D., Scott, J.D., Shirley, R., Liu, L.-I., Glodek, A., Kelley, J.M., Weidman, J.F., Phillips, C.A., Spriggs, T., Hedblom, E., Cotton, M.D., Utterback, T.R., Hanna, M.C., Nguyen, D.T., Saudek, D.M., Brandon, R.C., Fine, L.D., Fritchman, J.L., Fuhrmann, J.L., Geoghagen, N.S.M., Gnehm, C.L., McDonald, L.A., Small, K.V., Fraser, C.M., Smith, H.O. and Venter, J.C. (1995) Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496-512.

Franco, A.V., D. Liu, and P.R. Reeves. 1996. A Wzz (Cld) protein determines the chain length of K lipopolysaccharide in Escherichia coli O8 and O9 strains. J Bacteriol 178: 1903-1907.

Gagnon, Y., Breton, R., Putzer, H., Pelchat, M., Grunberg-Manago, M., and Lapointe, J. (1994) Clustering and co-transcription of the Bacillus subtilis genes encoding the aminoacyl-tRNA synthetases specific for glutamate and for cysteine and the first enzyme for cysteine biosynthesis. J Biol Chem 269: 7473-7482.

Gish, W., and D.J. States. 1993. Identification of protein coding regions by database similarity search. Nature Genet 3: 266-272.

Glaser, P., Kunst, F., Arnaud, M., Coudart, M.-P., Gonzales, W., Hullo, M.-F., Ionescu, M., Lubochinsky, B., Marcelino, L., Moszer, I., Presecan, E., Santana, M., Schneider, E., Schweizer, J., Vertes, A., Rapoport, G., and Danchin, A. (1993) Bacillus subtilis genome project: cloning and sequencing of the 97 kb region from 325° to 333°. Mol Microbiol 10: 371-384.

Glucksmann, M.A., Reuber, T.L., Walker, G.C (1993) Genes needed for the modification, polymerization, export, and processing of succmoglycan by Rhizobium meliloti. a model for succmoglycan biosynthesis / Bacteriol 175: 7045-7055.

Göhmann, S., Manning, P.A., Alpert, C.A., Walker, M.J., and Timmis, K.N. (1994) Lipopolysaccharide O-antigen biosynthesis in Shigella dysenteriae serotype 1: analysis of the plasmid-carried rfp determinant. Microb Pathog 16: 53-64.

Gold, L., and Stormo, G. (1987) Transcriptional initiation. In Escherichia coli and Salmonella typhimurium: Cellular and Molecular Biology, Vol. 2. Neidhardt, F.C. (ed). Washington, D.C.: American Society for Microbiology, pp. 807-876.

Goldberg, J.B., K. Hatano, G. Small Meluleni, and G.B. Pier. 1992. Cloning and surface expression of Pseudomonas aeruginosa O antigen in Escherichia coli. Proc. Natl. Acad. Sci. USA 89:10716-10720.

Goldberg, J.B., and D.E. Ohman. 1984. Cloning and expression in Pseudomonas aeruginosa of a gene involved with the production of alginate. J. Bacteriol. 158:1115-1121.

Goldman, R.C., and L. Leive. 1980. Heterogeneity of antigenic-side-chain length in lipopolysaccharide from Escherichia coli O111 and Salmonella typhimurium LT2. Eur. J. Biochem. 107:145-153.

Gotschlich, 1994.

Hammerschmidt, S., Birkholz, C., Zahringer, U., Robertson, B.D., van Putten, J., Ebelling, O., and Frosch, M. (1994) Contribution of genes from the capsule gene complex (cps) to lipooligosaccharide biosynthesis and serum resistance in Neisseria meningitidis. Mol Microbiol 11: 885-896.

Hancock, R.E.W., and A.M. Carey. 1979. Outer membrane of Pseudomonas aeruginosa: heat- and 2-mercaptoethanol-modifiable proteins. J. Bacteriol. 158:1115-1121.

Harley, C.B., and R.P. Reynolds (1987) Analysis of E. coli promoter sequences. Nucleic Acids Res 15: 2343-2361.

Hashimoto, Y., Li, N., Yokoyama, H. and Ezaki, T. (1993) Complete nucleotide sequence and molecular characterization of ViaB region encoding Vi antigen in Salmonella typhi. J Bacteriol 175: 4456-4465.

Hitchcock, P.J., and T.M. Brown. 1983. Morphological heterogeneity among Salmonella lipopolysaccharide chemotypes in silver-stained polyacrylamide gels. J. Bacteriol. 154:269-277.

Holloway, B.W., Römling, U., Tümmler, B. (1994) Genomic mapping of Pseudomonas aeruginosa PAO. Microbiology 140: 2907-2929.

Huang, J., and Schell, M. (1995) Molecular characterization of the eps gene cluster of Pseudomonas solanacearum and its transcriptional regulation at a single promoter. Mol Microbiol 16: 977-989.

Huff, J.P., B.J. Grant, C.A. Penning, and K.F. Sullivan. 1990. Optimization of routine transformation of Escherichia coli with plasmid DNA. Biotechniques 9: 570-577.

Jarosik, G.P., and E.J. Hansen. 1994. Identification of a new locus involved in expression of Haemophilus influenzae type b lipooligosaccharide. Infect Immun 62: 4861-4867.

Jiang, X.M., B. Neal, F. Santiago, S.J. Lee, L.K. Romana, and P.R. Reeves (1991) Structure and sequence of the rfb (O antigen) gene cluster of Salmonella serovar typhimurium (strain LT2). Mol Microbiol 5: 695-713.

Kao, C.C., and L. Sequeira. 1991. A gene cluster required for coordinated biosynthesis of lipopolysaccharide and extracellular polysaccharide also affects virulence of Pseudomonas solanacearum. J Bacteriol 173: 7841-7847.

Kearney, B., and Staskawicz, B.J. (1990) Characterization of IS476 and its role in bacterial spot disease of tomato and pepper. J Bacteriol 172: 143-148.

Keenleyside, W.J., M. Perry, L. Maclean, C. Poppe, and C. Whitfield. 1994. A plasmid-encoded rfbO:54 gene cluster is required for biosynthesis of the O:54 antigen in Salmonella enterica serovar Borreze. Mol Microbiol 11: 437-448.

Keenleyside, W.J., and Whitfield, C. (1995) Lateral transfer of rfb genes: a mobilizable ColE1-type plasmid carries the rfbO:54 (O:54 antigen biosynthesis) gene cluster from Salmonella enterica serovar Borreze. J Bacteriol 177: 5247-5253.

Keenleyside, W.J., and C. Whitfield. 1996. A novel pathway for O-polysaccharide biosynthesis in Salmonella enterica serovar Borreze. J Biol Chem 271: 28581-28592.

Kingsley, M.T., D.W. Gabriel, G.C. Marlow, and P.D. Roberts. 1993. The opsX locus of Xanthomonas campestris affects host range and biosynthesis of lipopolysaccharide and extracellular polysaccharide. J Bacteriol 175: 5839-50.

Klein, P., Kanehisa, M., and DeLisi, C. 1985. Description of one of the methods used in SOAP. Biochimica et Biophysica Acta 815: 468-476.

Klena, J.D., and Schnaitman, C.A. 1993. Function of the rfb gene cluster and the rfe gene in the synthesis of O-antigen by Shigella dysenteriae 1. Mol Microbiol 9: 393-402.

Knirel, Y.A. 1990. Polysaccharide antigens of P. aeruginosa. Crit Rev Microbiol 17: 273-304.

Knirel, Y.A., and N.K. Kochetkov. 1994. The structure of lipopolysaccharides of Gram-negative bacteria. III. The structure of O-antigens: a review. Biochemistry (Moscow) 59: 1325-1383.

Knirel, Y.A., E.V. Vinogradov, N.A. Kocharova, N.A. Paramonov, N.K. Kochetkov, B.A. Dmitriev, E.S. Stanislavsky, and B. Lanyi. 1988. The structure of O-specific polysaccharides and the serological classification of Pseudomonas aeruginosa. Acta Microbiol Hung 35: 3-24.

Kuenzler, M., Balmelli, T., Egli, C.M., Paravicini, G., and Braus, G.H. (1993) Cloning, primary structure, and regulation of the HIS7 gene encoding a bifunctional glutamine amidotransferase:cyclase from Saccharomyces cerevisiae. J Bacteriol 175: 5548-5558.

Kuzio, J., and Kropinski, A.M. (1983) O-antigen conversion in Pseudomonas aeruginosa PAO1 by bacteriophage D3. J Bacteriol 155: 203-212.

Lacks, S., and J.R. Greenberg. 1977. Complementary specificity of restriction endonucleases of Diplococcus pneumoniae with respect to DNA methylation. J Mol Biol 114: 153-168.

Lam, M.Y.C., E.J. McGroarty, A.M. Kropinski, L.A. MacDonald, S.S. Pedersen, N. Høiby, and J.S. Lam. 1989. Occurrence of a common lipopolysaccharide antigen in standard and clinical strains of Pseudomonas aeruginosa. J Clin Microbiol 27: 962-967.

Lam, J.S., M.Y.C. Handelsman, T.R. Chivers, and L.A. MacDonald. 1992. Monoclonal antibodies as probes to examine serotype-specific and cross-reactive epitopes of lipopolysaccharides from serotypes O2, O5, and O16 of Pseudomonas aeruginosa. J Bacteriol 174: 2178-2184.

Lai, C.-Y. and Baumann, P. (1992) Sequence analysis of a DNA fragment from Buchnera aphidicola (an endosymbiont of aphids) containing genes homologous to dnaG, rpoD, cysE, and secB. Gene 119: 113-118.

Lightfoot, J.L., and J.S. Lam. 1991. Molecular cloning of genes involved with expression of A-band lipopolysaccharide, an antigenically conserved form, in Pseudomonas aeruginosa. J Bacteriol 173: 5624-5630.

Lightfoot, J.L., and J.S. Lam. 1993. Chromosomal mapping, expression and synthesis of lipopolysaccharide in Pseudomonas aeruginosa: a role for guanosine diphospho (GDP)-D-mannose. Mol Microbiol 8: 771-782.

Liu, D., R.A. Cole, and P.R. Reeves. 1996. An O-antigen processing function for Wzx (RfbX): a promising candidate for O-unit flippase. J Bacteriol 178: 2102-2107.

Liu, P.V., and S. Wang. 1990. Three new major somatic antigens of Pseudomonas aeruginosa. J Clin Microbiol 28: 922-925.

Lin, W.S., Cunneen, T. and Lee, C.Y. (1994) Sequence analysis and molecular characterization of genes required for the biosynthesis of type 1 capsular polysaccharide in Staphylococcus aureus. J Bacteriol 176: 7005-7016.

Liu, P.V., Matsumoto, H., Kusama, H., and Bergan, T. 1983. Survey of heat-stable major somatic antigens of P. aeruginosa. Int J Syst Bacteriol 33: 256-264.

Macpherson, D.F., Manning, P.A., and Morona, R. (1994) Characterization of the dTDP-rhamnose biosynthetic genes encoded in the rfb locus of Shigella flexneri. Mol Microbiol 11: 281-292.

MacLachlan, P.R., S.K. Kadam, and K.E. Sanderson. 1991. Cloning, characterization, and DNA sequence of the rfaLK region for lipopolysaccharide synthesis in Salmonella typhimurium LT2. J Bacteriol 173: 7151-7163.

Makela, P.H., and Stocker, B.A.D. 1984. Genetics of lipopolysaccharide, p. 59-137. In E.T. Rietschel (ed.), Handbook of endotoxin, vol. 1. Elsevier Science Publishing, Amsterdam.

Marolda, C.L., and M.A. Valvano. 1993. Identification, expression, and DNA sequence of the GDP-mannose biosynthesis genes encoded by the O7 rfb cluster of strain VW187 (Escherichia coli O7:K1). J Bacteriol 175: 148-158.

Marolda, C.L., and Valvano, M.A. (1995) Genetic analysis of the dTDP-rhamnose biosynthesis region of the Escherichia coli VW187 (O7:K1) rfb gene cluster: identification of functional homologs of rfbB and rfbA in the rff cluster and correct location of the rffE gene. J Bacteriol 177: 5539-5546.

May, T.B., D. Shinabarger, R. Maharaj, J. Kato, L. Chu, J.D. DeVault, S. Roychoudhury, N.A. Zielinski, A. Berry, R.K. Rothmel, T.K. Misra, and A.M. Chakrabarty. 1991. Alginate synthesis by Pseudomonas aeruginosa: a key pathogenic factor in chronic pulmonary infections of cystic fibrosis patients. Clin Microbiol Rev 4: 191-206.

Meier-Dieter, U., Barr, K., Starman, R., Hatch, L. and Rick, P.D. (1992) Nucleotide sequence of the Escherichia coli rfe gene involved in the synthesis of enterobacterial common antigen. Molecular cloning of the rfe-rff gene cluster. J Biol Chem 267: 746-753.

Morona, R., Mavris, M., Fallarino, A., and Manning, P.A. 1994. Characterization of the rfc region of Shigella flexneri. J Bacteriol 176: 733-747.

Morona, R., L. van den Bosch, and P.A. Manning. 1995. Molecular, genetic, and topological characterization of O-antigen chain length regulation in Shigella flexneri. J Bacteriol 177: 1059-1068.

Nurminen, M., Hellerqvist, C.E., Valtonen, V.V., and Makela, P.H. 1971. The smooth lipopolysaccharide character of 1, 4, (5), 12 and 1, 9, 12 transductants formed as hybrids between groups B and D of Salmonella. Eur J Biochem 22: 500-505.

Ogasawara, N., Nakai, S. and Yoshikawa, H. (1994) Systematic sequencing of the 180 kilobase region of the Bacillus subtilis chromosome containing the replication origin. DNA Res 1: 1-14.

Ozenberger, B.A., M. Schrodt Nahlik, and M.A. McIntosh. 1987. Genetic organization of multiple fep genes encoding ferric enterobactin transport functions in Escherichia coli. J Bacteriol 169: 3638-3646.

Palleroni, N.J. 1984. Genus I. Pseudomonas, p. 141-199. In N.R. Krieg and J.C. Holt (ed.), Bergey's Manual of Systematic Bacteriology, Vol. 1. Williams and Wilkins, Baltimore.

Peschke, U., Schmidt, H., Zhang, H.Z. and Piepersberg, W. (1995) Molecular characterization of the lincomycin-production gene cluster of Streptomyces lincolnensis 78-11. Mol Microbiol 16: 1137-1156.

Potter, A.A., and Loutit, J.S. 1982. Exonuclease activity from P. aeruginosa which is missing in phenotypically restrictionless mutants. J Bacteriol 151: 1204-1209.

Prere, M.F., Chandler, M., and Fayet, O. (1990) Transposition in Shigella dysenteriae: isolation and analysis of IS911, a new member of the IS3 group of insertion sequences. J Bacteriol 172: 4090-4099.

Priefer, U.B., Kalinowski, J., Ruger, B., Heumann, W., and Puhler, A. (1989) ISR2, a transposable DNA sequence resident in Rhizobium class IV strains, shows structural characteristics of classical insertion elements. Plasmid 21: 120-128.

Pritchard, A.E., and Vasil, M.L. (1990) Possible insertion sequences in a mosaic genome organization upstream of the exotoxin A gene in Pseudomonas aeruginosa. J Bacteriol 172: 2020-2028.

Quirk, P.G., Guffanti, A.A., Clejan, S., Cheng, J., and Krulwich, T.A. (1994) Isolation of Tn917 insertional mutants of Bacillus subtilis that are resistant to the protonophore carbonyl cyanide m-chlorophenylhydrazone. Biochim Biophys Acta 1186: 27-34.

Reeves, P. (1993) Evolution of Salmonella O antigen variation by interspecific gene transfer on a large scale. Trends Genet 9: 17-22.

Reeves, P.R., M. Hobbs, M. Valvano, M. Skurnik, C. Whitfield, D. Coplin, N. Kido, J. Klena, D. Maskell, C. Raetz, and P. Rick. 1996. Proposal for a new nomenclature for bacterial surface polysaccharide genes. Trends Microbiol 4: 495-503.

Rieder, B., Merrick, M.J., Castorph, H., Kleiner, D. (1994) Function of hisl and hisH gene products in histidine biosynthesis. J Biol Chem 269: 14386-14390.

Rivera, M., Bryan, L.E., Hancock, R.E.W. and McGroarty, E.J. 1988. Heterogeneity of lipopolysaccharides from P. aeruginosa: analysis of lipopolysaccharide chain length. J Bacteriol 170: 512-521.

Rivera, M., T.R. Chivers, J.S. Lam, and E.J. McGroarty. 1992. Common antigen lipopolysaccharide from Pseudomonas aeruginosa AK1401 as a receptor for bacteriophage A7. J Bacteriol 174: 2407-2411.

Rossbach, S., D.A. Kulpa, U. Rossbach and F.J. de Bruijn (1994) Molecular and genetic characterization of the rhizopine catabolism (mocABRC) genes of Rhizobium meliloti L5-30. Mol Gen Genet 245: 11-24.

Ruvkun, G.B., and Ausubel, F.M. 1981. A general method for site-directed mutagenesis in prokaryotes. Nature (London) 289:85-88.

Schnaitman, C.A., and J.D. Klena. 1993. Genetics of lipopolysaccharide biosynthesis in enteric bacteria. Microbiol. Rev. 57: 655-682.

Schnier, J., M. Kimura, K. Foulaki, A.R. Subramanian, K. Isono, and B. Wittmann-Liebold. 1982. Primary structure of Escherichia coli ribosomal protein S1 and of its gene rpsA. Proc. Natl. Acad. Sci. U.S.A. 79:1008-1011.

Schweizer, H.P. 1993. Small broad-host-range gentamycin resistance gene cassettes for site-specific insertion and deletion mutagenesis. BioTechniques 15:831-833.

Schweizer, H.P., and T.T. Hoang. 1995. An improved system for gene replacement and xylE fusion analysis in Pseudomonas aeruginosa. Gene 158: 15-22.

Segal, G., and E.Z. Ron (1995) The dnaKJ operon of Agrobacterium tumefaciens: transcriptional analysis and evidence for a new heat shock promoter. J Bacteriol 177: 5952-5958.

Simon, R., Priefer, U., and Pühler, A. 1983. A broad-host-range mobilization system for in vivo genetic engineering: transposon mutagenesis in gram negative bacteria. Bio/Technology 1:784-791.

Skurnik, M., Venho, R., Toivanen, P., and Al-Hendy, A. (1995) A novel locus of Yersinia enterocolitica serotype O:3 involved in lipopolysaccharide outer core biosynthesis. Mol Microbiol 17: 575-594.

Sokol, P.A., Luan, M.Z., Storey, D.G., and Thirukkumaran, P. (1994) Genetic rearrangement associated with in vivo mucoid conversion of Pseudomonas aeruginosa PAO is due to insertion elements. J Bacteriol 176: 553-562.

Soldo, B., Lazarevic, V., Margot, P., and Karamata, D. (1993) Sequencing and analysis of the divergon comprising gtaB, the structural gene of UDP-glucose pyrophosphorylase of Bacillus subtilis 168. J Gen Microbiol 139: 3185-3195.

Stutzman-Engwall, K.J., Otten, S.L., and Hutchinson, C.R. (1992) Regulation of secondary metabolism in Streptomyces spp. and overproduction of daunorubicin in Streptomyces peucetius. J Bacteriol 174: 144-154.

Sturm, S., and K.N. Timmis. 1986. Cloning of the rfb region of Shigella dysenteriae 1 and construction of an rfb-rfp gene cassette for the development of lipopolysaccharide-based live anti-dysentery vaccines. Microb. Pathog. 1:289-297.

Tabor, S., and C.C. Richardson. 1985. A bacteriophage T7 RNA polymerase/promoter system for controlled exclusive expression of specific genes. Proc. Natl. Acad. Sci. USA 82:1074-1078.

Takagi, M., Takada, H., and Imanaka, T. (1990) Nucleotide sequence and cloning in Bacillus subtilis of the Bacillus stearothermophilus pleiotropic regulatory gene degT. J Bacteriol 172: 411-418.

Tercero, J.A., Espinosa, J.C., Lacalle, R.A. and Jimenez, A. (1996) The biosynthetic pathway of the aminonucleoside antibiotic puromycin, as deduced from the molecular analysis of the pur cluster of Streptomyces alboniger. J Biol Chem 271: 1579-1590.

Thorson, J.S., Lo, S.F., Ploux, O., He, X., and Liu, H.-W. (1994) Studies of the biosynthesis of 3,6-dideoxyhexoses: molecular cloning and characterization of the asc (ascarylose) region from Yersinia pseudotuberculosis serogroup VA. J Bacteriol 176: 5483-5493.

West, S.E., and Iglewski, B.H. (1988) Codon usage in Pseudomonas aeruginosa. Nucleic Acids Res 16: 9323-9335.

West, S.E.H., H.P. Schweizer, C. Dall, A.K. Sample, and L.J. Runyen-Janecky. 1994. Construction of improved Escherichia-Pseudomonas shuttle vectors derived from pUC18/19 and the sequence of the region required for their replication in Pseudomonas aeruginosa. Gene 128: 81-86.

Whitfield, C. 1995. Biosynthesis of lipopolysaccharide O-antigens. Trends Microbiol 3:178-185.

Whitfield, C., and M.A. Valvano. 1993. Biosynthesis and expression of cell-surface polysaccharides in gram-negative bacteria. Adv. Microb. Physiol. 35:135-246.

Wozniak, D.J. 1994. Integration host factor and sequences downstream of the Pseudomonas aeruginosa algD transcription start site are required for expression. J. Bacteriol. 176:5068-5076.

Wozniak, D.J., and D.E. Ohman. 1993. Involvement of the alginate algT gene and integration host factor in the regulation of the Pseudomonas aeruginosa algB gene. J. Bacteriol. 175:4145-4153.

Wood, M.S., Byrne, A., and Lessie, T.G. (1991) IS406 and IS407, two gene-activating insertion sequences from Pseudomonas cepacia. Gene 105: 101-105.

Xiao, Q. and Moore, C.H. (1993) The primary structure of phosphofructokinase from Lactococcus lactis. Biochem Biophys Res Commun 194: 65-71.

Yanisch-Perron, C., J. Vieira, and J. Messing. 1985. Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33: 103-119.

Detailed Figure Legends for Figures 22 to 29, 32, 33, and 43 to 47

Figure 22. Silver-stained SDS-PAGE gel of LPS from PAO1, AK1401, AK1401(pFV100), and AK1401(pFV.TK8) (Panel A) and Western immunoblots of this LPS reacted with O5-specific MAb MF15-4 (Panel B). Note that the two transconjugant strains, AK1401(pFV100) and AK1401(pFV.TK8), produce levels of B-band LPS similar to the PAO1 wild-type strain.

Figure 23. Restriction maps of the chromosomal inserts from pFV100 and several pFV subclones. Results of complementation studies of the SR mutants AK1401 and rd7513 with the pFV subclones are also shown. The three Tn1000 insertions in the 1.5 kb XhoI fragment of pFV.TK6 that were found to interrupt O-antigen complementation in AK1401 are indicated. This XhoI fragment was later purified and used as a probe in Southern blot analysis. Restriction sites: B, BamHI; X, XhoI; S, SpeI; Xb, XbaI; H, HindIII.

Figure 24. Southern analysis of the three rfc chromosomal mutants, OP5.2, OP5.3, and OP5.5, showing the insertion of an 875 bp Gm^R cassette into the rfc gene. Restriction maps of the PAO1 wild-type (panel A) and mutant (panel B) rfc coding regions are shown. Southern hybridizations of chromosomal DNA from PAO1 (lane 1) and mutants OP5.2, OP5.3, and OP5.5 (lanes 2-4, respectively) digested with XhoI were performed using an rfc probe (panel C). This DIG-labelled probe was generated from the 1.5 kb XhoI insert of pFV.TK7 (shown in panel A). The probe hybridized to a 1.5 kb fragment of PAO1 and a 2.4 kb fragment of the three rfc mutants. The molecular sizes of the probe-reactive fragments are shown on the left (in kb).
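The fragment sizes quoted in the Figure 24 legend can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that the Gm^R cassette itself carries no XhoI site, so that the mutant fragment is simply the wild-type fragment plus the cassette, and it treats the 1.5 kb figure as exactly 1500 bp. The function name is hypothetical.

```python
# Sizes taken from the Figure 24 legend; treating 1.5 kb as exactly 1500 bp is an assumption.
WILD_TYPE_FRAGMENT_BP = 1500  # XhoI fragment carrying rfc in PAO1
CASSETTE_BP = 875             # Gm^R cassette inserted into rfc


def mutant_fragment_bp(wild_type_bp: int, cassette_bp: int) -> int:
    """Expected probe-reactive fragment size after insertion.

    Assumes the cassette adds no new XhoI site, so the two sizes simply add.
    """
    return wild_type_bp + cassette_bp


size_bp = mutant_fragment_bp(WILD_TYPE_FRAGMENT_BP, CASSETTE_BP)
print(f"{size_bp} bp, i.e. about {size_bp / 1000:.1f} kb")
```

Under these assumptions the expected mutant fragment is 2375 bp, which agrees with the 2.4 kb band reported for the three rfc mutants.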

Figure 25. Silver-stained SDS-PAGE gel and Western blots of LPS from PAO1, AK1401 and the three rfc chromosomal mutants, OP5.2, OP5.3, and OP5.5. Panel A: silver-stained SDS-PAGE gel; Panel B: Western blot reacted with O5-specific MAb MF15-4; Panel C: Western blot reacted with A-band specific MAb N1F10. Note that the chromosomal rfc mutants are not able to produce long-chain O-antigen; however, they still express A-band LPS, like the SR mutant AK1401.

Figure 26. Restriction maps of recombinant plasmids pFV161, pFV401 and pFV402. The shaded box represents the DIG-labeled probe generated from pFV161. Restriction sites: B, BamHI; H, HindIII; X, XhoI.

Figure 27. Southern hybridizations of chromosomal DNA from PAO1 (lane 2) and rol mutants (lanes 3 and 4). Chromosomal DNA in Panel A was digested with PstI and SstI. DNA in Panel B was digested with HindIII. The samples in Panel A were probed with the Gm^R cassette (Schweizer, 1993). The probe used in Panel B is the 2.3 kb HindIII insert from pFV401. Molecular weight markers, using λ DNA digested with HindIII, are indicated to the left of each panel.

Figure 28. Characterization of LPS from PAO1 and PAO1 rol chromosomal mutants. The samples in each lane are as labeled. Panel A is a silver-stained SDS-PAGE gel. Panel B is the corresponding Western immunoblot reacted with an O5 (B-band)-specific mAb MF15-4.

Figure 29. T7 protein expression of P. aeruginosa O5 Rol. This autoradiogram shows ³⁵S-labeled proteins expressed by pFV401, which contains the rol gene, and the corresponding control plasmid vector pBluescript II SK in E. coli JM109DE3 by use of the T7 expression system. The arrow indicates the putative Rol protein. Molecular size markers are indicated to the left of the figure.

Figure 32. Features of the initiation regions. Capital letters for bases indicate one of the following sites: potential ribosomal binding sites (RBS), the presumed start codon (also in bold and double underlined), the second codon where it is AAA (the preferred second codon), and components of the sequences TTAA and AAA from +10 to +13 and from -1 to -3, respectively (Gold and Stormo, 1987). The termination codon of the preceding gene is indicated by a bar above if it is in the region shown. The reference sequences involved are also shown above the set of sequences.
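As an illustration of the kind of annotation summarized in Figure 32, the following sketch reports three of the listed features for a presumed start codon: the start codon itself, whether the second codon is AAA, and the nearest purine-rich RBS-like motif upstream. It is not the procedure used to prepare the figure. The motif list, the 20-nt upstream window, the function name, and the toy sequence are all assumptions made for the example.

```python
# Assumed Shine-Dalgarno-like core motifs, longest first; not taken from the patent.
RBS_CORES = ("AGGAGG", "GGAGG", "AGGAG", "GGAG", "GAGG", "AGGA")


def annotate_initiation_region(seq: str, start: int, upstream: int = 20) -> dict:
    """Describe the initiation region around a presumed start codon.

    seq      -- DNA sequence (coding strand, 5' to 3')
    start    -- 0-based index of the first base of the presumed start codon
    upstream -- number of bases before the start codon searched for an RBS-like motif
    """
    seq = seq.upper()
    start_codon = seq[start:start + 3]
    second_codon = seq[start + 3:start + 6]

    window_begin = max(0, start - upstream)
    window = seq[window_begin:start]

    rbs = None
    for core in RBS_CORES:           # prefer the longest motif that is present
        pos = window.rfind(core)     # occurrence closest to the start codon
        if pos != -1:
            rbs = (window_begin + pos, core)
            break

    return {
        "start_codon": start_codon,
        "second_codon_is_AAA": second_codon == "AAA",
        "rbs": rbs,                  # (0-based position, motif) or None
    }


# Hypothetical toy sequence, not taken from SEQ ID NO:1.
toy = "TTTAGGAGGTTTACCATGAAACGT"
print(annotate_initiation_region(toy, toy.index("ATG")))
```

A real analysis would take the start positions from the annotated open reading frames rather than from the first ATG, and would also need to consider alternative start codons such as GTG.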

Figure 33. NAD-binding domains of PsbA, PsbK and PsbM aligned with those of other bacterial proteins involved in polysaccharide biosynthesis. The consensus sequence for an NAD-binding domain (Macpherson et al., 1994) is shown at the bottom of the figure in bold underline. The first column contains the protein names; the second column indicates the location of the NAD-binding site within the protein; the third column shows the alignment of the NAD-binding domains, with highly conserved residues indicated in bold type; and the fourth column gives the reference for the protein shown. Most of the proteins in this group of sugar biosynthesis enzymes function as dehydrogenases/dehydratases. Note that PsbM, BplL, and TrsG have two putative NAD-binding domains, instead of one. The presence of two domains supports the proposal that these large proteins arose from fusion of two smaller proteins.

Figure 43. Physical map of the 5' end of the wbp cluster. The wzz gene ends approximately 800 bp upstream of wbpA, the first gene of the wbp cluster (8). The probe used to identify a HindIII fragment containing the intact wzz gene for cloning into pFV401 is shown as a black bar above the restriction map. The site of insertion of the gentamicin cassette used to create the wzz knockout mutants is indicated by a black triangle. Key: B, BamHI; H, HindIII; S, SstI; X, XhoI.

Figure 44. Comparison of hydropathy plots of selected Wzz-like proteins. The hydropathy plots of selected Wzz-like proteins were calculated using PC/GENE SOAP. The X axis represents amino acid residues, while the Y axis represents relative hydropathy. Positive values indicate hydrophobicity; negative values indicate hydrophilicity. A, P. aeruginosa O5 Wzz, U50397; B, E. coli O111 Wzz, Z17241; C, E. coli o349, M87049; D, E. coli FepE, P26266; E, Y. enterocolitica O8 Wzz, U43708; F, Y. pseudotuberculosis Wzz, ; G, V. cholerae O139 OtnB, X90547.
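A hydropathy plot of the kind described for Figure 44 is, in general terms, a sliding-window average of per-residue hydropathy values. The sketch below is not the PC/GENE SOAP program and does not reproduce its parameters. It uses the published Kyte-Doolittle scale as a stand-in, and the window length of 7, the function name, and the example peptide are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list:
    """Mean hydropathy of each full-length window along the sequence.

    Element i of the result covers residues i .. i + window - 1 (0-based),
    so the result has len(protein) - window + 1 entries.
    """
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical peptide: a charged stretch followed by a hydrophobic stretch.
example = "KKDEERLLIVVALLIGG"
for index, value in enumerate(hydropathy_profile(example)):
    print(index + 1, round(value, 2))
```

Plotting the window index on the X axis against the averaged value on the Y axis gives a profile in which membrane-spanning candidates appear as sustained positive stretches, matching the sign convention stated in the legend.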

Figure 45. Expression of P. aeruginosa Wzz in vitro. The 40 kDa Wzz protein (indicated by black arrowhead) was expressed from the insert of pFV401 in both orientations. A 28 kDa protein was also expressed in both orientations and may represent either a breakdown product of the 40 kDa polypeptide, or initiation of translation from a secondary ribosome-binding site. There are several smaller ORFs encoded on the positive strand of the 2.3 kb insert of pFV401 which could correspond to the 10 kDa protein.

Figure 46. Analysis of LPS from wzz knockout mutants. LPS from P. aeruginosa serotypes O5 and O16 and their corresponding wzz mutants was examined. Figure 46A: Silver-stained 12.5% SDS-PAGE. Figure 46B: Western immunoblot using MAb 18-19, specific for B-band LPS from the O5 serogroup (serotypes O2, O5, O16, O18, O20). Figure 46C: Western immunoblot using MAb MF15-4, specific for serotype O5 B-band LPS. The plasmid pFV401-26 contains the O5 wzz gene cloned downstream of the lacZ promoter of shuttle vector pUCP26.

Figure 47. Ability of P. aeruginosa O5 Wzz to function in E. coli. Panel A. Silver-stained SDS-PAGE gel of E. coli CLM4 containing the Shigella dysenteriae rfb cluster on pSS37, with and without the P. aeruginosa wzz gene in pFV401. Panel B. Western immunoblot of E. coli HB101 containing the P. aeruginosa O5 wbp cluster in pFV100, with and without the P. aeruginosa wzz gene in pFV401. The membrane was incubated with MAb MF15-4, specific for serotype O5 B-band LPS.

Figure 48. Western immunoblot analysis of lipopolysaccharide (LPS) isolated using the hot water-phenol method of Westphal and Jann. Lanes O5 are LPS from the parent strain, while lanes F1 and F2 are LPS from two mutants containing a gentamicin cassette inserted at the SstI site within the open reading frame of wbpF. The monoclonal antibodies used are N1F10, specific for A-band LPS, and 18-19, specific for B-band LPS. Note that a knockout mutation of wbpF abrogates both A-band and B-band LPS expression.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANTS:

(A) NAME: UNIVERSITY OF GUELPH

(B) STREET: Office of Vice President of Research, Room 214, Reynolds Building

(C) CITY: Guelph

(D) STATE: Ontario

(E) COUNTRY: Canada

(F) POSTAL CODE: N1G 2W1

(G) TELEPHONE NO.: (519) 824-4120

(H) TELEFAX NO.: (519) 821-5236

(A) NAME: LAM, Joseph S.

(B) STREET: 2 Bridlewood Drive

(C) CITY: Guelph

(D) STATE: Ontario

(E) COUNTRY: Canada

(F) POSTAL CODE: N1G 4A6

(A) NAME: BURROWS, Lori

(B) STREET: 22 Devere Drive

(C) CITY: Guelph

(D) STATE: Ontario

(E) COUNTRY: Canada

(F) POSTAL CODE: N1G 2S9

(A) NAME: CHARTER, Deborah

(B) STREET: 78 College Street West

(C) CITY: Guelph

(D) STATE: Ontario

(E) COUNTRY: Canada

(F) POSTAL CODE: N1G 4S7

(A) NAME: de KIEVIT, Teresa

(B) STREET: 2-100 Sunny Lea Crescent

(C) CITY: Guelph

(D) STATE: Ontario

(E) COUNTRY: Canada

(F) POSTAL CODE: N1G 1W6

(ii) TITLE OF INVENTION: Novel Proteins Involved in the Synthesis and Assembly of O-Antigen in Pseudomonas Aeruginosa

(iii) NUMBER OF SEQUENCES: 20

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: BERESKIN & PARR

(B) STREET: 40 King Street West

(C) CITY: Toronto

(D) STATE: Ontario

(E) COUNTRY: Canada

(F) ZIP: M5H 3Y2

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: PCT

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Kurdydyk, Linda M.

(B) REGISTRATION NUMBER: 34,971

(C) REFERENCE/DOCKET NUMBER: 6580-87

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (416) 364-7311

(B) TELEFAX: (416) 361-1398

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24417 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

CTCGAGATAT TGAGCAGCGC ATACAGAACT TGCGGAGAGA ATGCCAAGGC AGACGTGAAG 60

ATCGTATTGT TCAGCTCAAG GAGGCGTTGA AGGTCGCAGG TGCGCTGAAA TTGGAGGAGC 120

CTCCACTGAT CAGTGGGCAA TCCTCTGAGG AGCTCTCGGC TATCATGAAT GGAAGTCTGA 180

TGTATATGCG TGGCAGTAAG GCGATTATGG CCGAGATTCA GACATTGGAG GCGCGTAGCT 240

CTGATGATCC TTTTATTCCG GCGTTGCGTA CTCTTCAGGA GCAGCAGTTA TTGCTGAGTA 300

GCTTGCGTGT TAATTCGGAG CGGGTTTCTG TTTTTCGACA AGACGGTCCG ATAGAAACGC 360

CGGACTCACC AGTTCGTCCA AGGAGAGCGA TGATTTTGAT TTTTGGGTTG ATAATTGGTG 420

GTGTGCTTGG TGGTTTTCTG GCGTTGTGCC GGATTTTTTT GAAGAAGTAT GCTCGTTAGG 480

AAAGAGCTAG TTATTGAAGT GGTGATGCGT TGCACGTACT TTGGTCGAGT AATTTTGTGG 540

AGTAGGTTTT CGTTGGGTGG CTCGATTGCT GAGGGGTGAG AACGTTTCCA TGCGGTGTTT 600

CCTCAGCTCT GTCTCCTGTG CCTTGGCTCC TTGAACGCAG AGGTTAACAG TTGAGCTGTG 660

GTTGTGGGTA TGTGACGTCT GTTGCGGTGG TGTCTGGTTC CTGGTGTCGG GTGTGCGAGA 720

AGATGCCAAG TTGCCTGGCA GGTCGTTACG TGTCGTAGCC GTATTCGAAG CTCGGCAATC 780

GCGGGGTGAT TTACAGGACT GTGCTTAATA CGGCGCAGGC TTGGTCAGGG TCGAGTCGGG 840

TCTTCGGGTG TCAACTGGAT CGTGCGAAAA CCGGTTTCGT GGATGCTGAT AAGCTCGGCT 900

TGACTGGCAG TCCAGGGCGG TTACCAGGTC TGTGGAGGCG CAAAATGTAT AGGAGCCTGC 960

GTGAGCTGGG CAGGCTGAAG GCCTGCTCGA AAGCGAGTTA GCATTGTGGT CCGGAAGGGC 1020

ATGGGTGGAC CAGAGTGCCG TTCTGCACGG CAAAAGCCAA CTTGCTCGGA GGTTCCCTAG 1080

CGCCTATGAT TACGACGCCC TTCATTTTTG GCCATTGCCG CCAGGTGCTG TGGAAAGCGA 1140

CAGTATCCCT TCTTTATCGA TCTTGTGAAG ATGTCGAGAG TGGTCGCAGA AAGGATTCAC 1200

TCGACTGACG AATGAATCGT GGAAGATTTA AGTTCCCGTT GTGCGGTCGC AGGCGCGGGC 1260

AGGTAAAATT GAGGTGAGTT GGAAAATGAT AGATGTTAAC ACAGTGGTAG AGAAGTTCAA 1320

AAGCCGACAG GCCTTGATTG GTATCGTGGG TCTGGGTTAT GTCGGTTTAC CACTGATGCT 1380

GCGATACAAC GCCATTGGTT TCGATGTCTT GGGTATCGAT ATCGATGATG TCAAGGTTGA 1440

CAAGCTTAAT GCCGGGCAGT GCTATATCGA ACATATTCCG CAAGCCAAAA TTGCTAAGGC 1500

CCGTGCAAGC GGTTTCGAGG CTACGACCGA TTTCAGCCGT GTCAGTGAAT GTGATGCCCT 1560

GATCCTTTGT GTGCCGACGC CGCTGAACAA GTATCGCGAG CCGGATATGA GCTTTGTCAT 1620

CAATACCACC GACGCACTAA AACCGTATCT GCGCGTAGGG CAGGTGGTTT CGCTGGAAAG 1680

TACCACCTAT CCGGGAACTA CCGAGGAAGA GTTGTTGCCA CGCGTGCAGG AGGGTGGCCT 1740

CGTGGTTGGC CGGGACATCT ACCTGGTCTA TTCTCCGGAG CGTGAAGATC CGGGCAACCC 1800

GAACTTCGAG ACTCGTACCA TTCCGAAAGT GATCGGTGGT CACACTCCTC AGTGTCTGGA 1860

AGTCGGCATT GCCCTGTATG AACAGGCCAT CGACCGGGTC GTGCCGGTCA GTTCCACCAA 1920

GGCCGCCGAG ATGACCAAGC TGTTGGAGAA CATTCATCGC GCGGTCAATA TCGGTCTGGT 1980

CAACGAAATG AAGATCGTTG CTGATCGCAT GGGTATCGAC ATCTTTGAAG TGGTTGATGC 2040

TGCGGCGACC AAGCCGTTCG GTTTCACTCC TTACTACCCA GGGCCGGGAC TGGGCGGGCA 2100

CTGTATCCCG ATCGATCCCT TCTACCTGAC TTGGAAGGCT CGCGAATACG GACTGCATAC 2160

CCGCTTCATC GAACTGTCTG GTGAGGTCAA CCAGGCCATG CCGGAATACG TACTGGGCAA 2220

ACTCATGGAT GGCCTGAACG AGGCAGGCAG GGCCCTCAAG GGCAGTCGTG TACTGGTATT 2280

GGGTATCGCT TATAAGAAGA ATGTCGACGA CATGCGCGAG TCGCCATCCG TGGAAATCAT 2340

GGAGCTGATC GAAGCCAAGG GTGGGATGGT CGCCTATAGC GATCCGCATG TGCCGGTGTT 2400

CCCGAAGATG CGTGAACACC ACTTCGAACT GAGCAGTGAG CCGCTGACTG CCGAAAACCT 2460

GGCTAGGTTC GACGCTGTAG TGCTTGCGAC CGACCATGAC AAGTTTGACT ATGAGCTGAT 2520

CAAGGCCGAA GCCAAGCTAG TTGTTGACAG CCGTGGCAAG TACCGCTCCC CGGCGGCACA 2580

CATCATCAAG GCTTGATCAC CCATCCCAGC ATGTCCATCC GCTCGTGCCA GAAGGCCGGG 2640

CGGATCCGCT CATTTCCATA GGACGAACCA TGAAAAATTT CGCTCTCATC GGTGCTGCCG 2700

GCTACATCGC TCCTCGCCAT ATGCGCGCCA TCAAAGACAC CGGTAACTGC CTGGTTTCGG 2760

CCTATGACAT CAATGACTCG GTCGGTATTA TTGATAGCAT CTCTCCCCAG AGCGAGTTTT 2820

TTACCGAGTT CGAGTTCTTT CTTGATCATG CGAGCAACCT CAAGCGCGAC TCTGCTACCG 2880

CGCTGGACTA CGTATCGATC TGCTCGCCCA ATTACCTGCA CTACCCGCAT ATCGCTGCAG 2940

GTCTGCGCTT GGGTTGCGAC GTAATCTGCG AAAAGCCGCT TGTTCCAACC CCAGAGATGC 3000

TCGATCAGTT GGCTGTTATC GAGCGCGAAA CCGATAAGCG CCTCTACAAC ATTCTGCAAC 3060

TGCGTCATCA CCAGGCGATC ATCGCATTGA AGGACAAGGT CGCCCGCGAA AAAAGTCCGC 3120

ATAAGTACGA GGTCGATCTG ACTTACATTA CTTCCCGCGG CAACTGGTAT CTGAAAAGCT 3180

GGAAGGGAGA TCCACGTAAG TCGTTCGGCG TGGCTACCAA CATCGGTGTG CACTTCTACG 3240

ACATGCTGCA CTTCATCTTT GGCAAGCTGC AGCGTAATGT TGTGCACTTC ACTTCCGAGT 3300

ACAAGACAGC TGGTTATCTG GAGTACGAGC AGGCCCGTGT GCGTTGGTTT CTGTCCGTGG 3360

ATGCTAACGA CCTGCCGGAG TCGGTCAAGG GCAAAAAGCC GACCTATCGT TCGATTACCG 3420

TCAACGGTGA GGAAATGGAG TTCTCTGAAG GCTTTACCGA TCTACATACA ACCAGCTACG 3480

AAGAAATTCT CGCTGGTCGT GGTTATGGCA TCGATGACGC TCGTCATTGT GTGGAAACTG 3540

TCAATACCAT TCGCAGCGCC GTCATCGTAC CGGCCTCTGA TAACGAAGGG CATCCGTTCG 3600

TCGCGGCGCT TGCGCGTTGA GGTAGAAAAG GAGTGGCCGT CCTCGGTCAC CTGTTTACAG 3660

CAGGTTTCCG CAGGATCATT CATCAGCATG TCATCTAGTA GCTCTAAATT GCTGAACGGT 3720

ATGGTCGCGG TAAGTTCAGG CAGAAACATT CGGCTGGATG TCCAGGGGCT GCGGGCTGTT 3780

GCAGTTCTGG CTGTGCTAGC TTACCACGCC AACAGTGCCT GGCTCAGGGC TGGGTTTGTC 3840

GGCGTTGACG TGTTCTTCGT CATTTCCGGG TTTATCATTA CCGCCTTACT GGTCGAGCGC 3900

GGTGTAAAAG TTGATCTGGT AGAGTTTTAC GCGGGCCGTA TCAAACGTAT TTTTCCAGCC 3960

TATTTCGTCA TGTTGGCGAT TGTCTGCATT GTCTCGACAA TTCTGTTTCT GCCTGATGAC 4020

TATGTTTTTT TTGAAAAAAG TCTACAGTCA TCTGTATTTT TTTCCAGTAA TCACTATTTC 4080

GCTAATTTTG GTAGTTACTT TGCTCCGAGA GCTGAAGAGC TGCCGCTGCT GCATACTTGT 4140

TCAATAGCCA ACGAGATGCA GTTTTATCTG TTCTACCCTG TACTGTTCAT GTGCCTGCCA 4200

TGTCGATGGC GCTTGCCGGT GTTCATCCTA TTAGCTATTT TGCTGTTCAT TTGGAGTGGC 4260

TATTGCGTAT TCAGCGGCAG CCAAGATGCT CAGTACTTCG CCTTGCTAGC TCGTGTACCT 4320

GAGTTCATGT CGGGAGCTGT TGTCGCATTA TCATTACGTG ATCGTGAGCT ACCCGCCAGG 4380

CTTGCGA AC TTGCGGGGTT ATTGGGGGCG GCGTTGCTGG TCTGCTCCTT CATTATCATC 4440

GACAAGCAGC ACTTTCCCGG ATTCTGGTCG CTCCTGCCAT GCCTGGGAGC CGCTCTGCTC 4500

ATTGCTGCCC GACGTGGCCC TGCCAGCCTG CTGCTGGCCA GCAGGCCCAT GGTCTGGATA 4560

GGTGGTATCT CCTATTCGTT GTATCTGTGG CACTGGCCAA TTCTGGCATT CATCCGTTAC 4620

TACACCGGCC AATACGAATT GAGCTTCGTG GCGCTGTTGG CATTTCTCAC AGGTTCGTTC 4680

CTGCTGGCCT GGTTCTCATA CCGCTACATC GAGACACCTG CCAGAAAGGC TGTGGGTCTG 4740

CGCCAGCAGG CGCTGAAGTG GATGTTGGCC GCCAGTGTGG TAGCTATAGT GGTTACGGGG 4800

GGGGCGCAGT TCAATGTGTT GGTTGTGGCG CCGGCGCCAA TTCAGTTGAC GCGCTACGCT 4860

GTACCAGAGT CGATCTGCCA TGGTGTTCAG GTAGGGGAGT GCAAGCGAGG CAGCGTCAAT 4920

GCCGTACCCC GTGTGCTGGT GATCGGTGAT AGCCATGCTG CGCAGCTTAA CTACTTCTTC 4980

GACGTGGTTG GCAACGAGTC AGGTGTGGCT TACCGAGTAC TCACCGGAAG CAGTTGTGTG 5040

CCAATACCTG CTTTCGATCT TGAACGTTTG CCCCGTTGGG CGCGGAAACC CTGCCAAGCG 5100

CAGATTGATG CAGTTGCCCA ATCAATGTTG AACTTTGACA AGATCATTGT GGCGGGCATG 5160

TGGCAGTATC AGATGCAGAG TCCGGCATTT GCCCAGGCTA TGCGTGCCTT CCTTGTCGAT 5220

ACCAGCTATG CCGGCAAGCA GGTCGCTCTA CTCGGGCAGA TACCGATGTT CGAATCAAAC 5280

GTGCAGCGTG TGCGTCGTTT CAGGGAGCTG GGTTTGTCAG CTCCGCTTGT TAGCTCCAGC 5340

TGGCAAGGTG CGAACCAGCT GTTGCGTGCT CTAGCCGAGG GTATTCCAAA CGTACGGTTC 5400

ATGGATTTTT CTTCCAGCGC CTTCTTCGCC GATGCTCCTT ATCAGGACGG AGAGCTTATT 5460

TACCAGGATA GCCATCACCT TAACGAGGTG GGGGCTCGCC GCTATGGATA TTTCGCGAGC 5520

CGTCAATTGC AGCGGCTGTT TGAACAACCA CAATCGAGTG TGAGTCTCAA GCCATGAGTT 5580

ATTATCAGCA CCCCAGCGCG ATCGTCGACG ACGGTGCGCA GATCGGTAGC GACTCCCGAG 5640

TTTGGCACTT CGTGCACATC TGTGCAGGTG CCCGGATTGG CGCAGGGGTT TCGTTGGGTC 5700

AGAACGTATT CGTCGGCAAC AAGGTCGTTA TTGGTGATCG CTGCAAGATC CAGAACAACG 5760

TGTCGGTATA TGACAATGTC ACTCTCGAAG AGGGCGTGTT CTGCGGGCCG AGCATGGTAT 5820

TTACCAACGT TTACAACCCC CGCTCGTTGA TCGAGCGCAA GGATCAGTAC CGTAACACGT 5880

TGGTAAAAAA AGGTGCCACG CTTGGTGCCA ACTGCACTAT CGTCTGTGGC GTGACTATTG 5940

GTGAATATGC CTTCCTGGGT GCGGGTGCGG TCATTAACAA GAATGTTCCA TCTTATGCCC 6000

TGATGGTAGG CGTGCCCGCT CGACAGATTG GTTGGATAGC GAATTCGGTG AGCAGCTGCA 6060

GCTGAACGAG CAGGGCGAAG CTGTCTGCTC ACACTCCGGT GCGCGCTATG TACTCAATGG 6120

AAAGATCCTG AGCAAGGTGG ACGTGTGACC ATGATTGAAT TCATCGACCT GAAGAACCAG 6180

CAAGCGCGTA TCAAGGACAA GATCGATGCC GGTATCCAGC GCGTGCTGAG ACACGGGCAG 6240

TACATTCTTG GCCCGGAAGT CACTGAGCTT GAGGATCGCC TCGCCGATTT CGTCGGCGCT 6300

AAGTACTGCA TCAGTTGCGC CAACGGTACT GACGCTCTAC AGATTGTGCA GATGGCCTTG 6360

GGTGTTGGCC CAGGTGACGA AGTAATCACC CCTGGTTTTA CTTATGTTGC GACAGCGGAG 6420

ACCGTCGCGC TTTTGGGAGC CAAGCCGGTT TACGTGGATA TTGATCCACG CACCTACAAT 6480

CTTGATCCGC AGTTGCTGGA GGCTGCGATC ACACCGCGTA CGAAGGCTAT CATTCCTGTT 6540

TCGCTGTATG GCCAGTGTGC AGACTTCGAT GCAATCAACG CCATTGCCTC CAAATATGGT 6600

ATCCCTGTCA TTGAGGATGC TGCACAGAGC TTCGGTGCTT CGTACAAGGG TAAGCGTTCT 6660

TGTAATCTGA GTACCGTTGC CTGCACCAGC TTCTTCCCGA GCAAACCGTT GGGTTGCTAT 6720

GGGGATGGTG GAGCGATCTT CACTAACGAC GATGAACTGG CTACTGCTAT TCGTCAAATT 6780

GCCCGGCATG GTCAGGACCG CCGCTATCAT CACATTCGTG TGGGGGTGAA TAGTCGGTTG 6840

GACACATTGC AGGCTGCGAT TCTTCTACCG AAGCTTGAAA TTTTCGAGGA GGAGATTGCG 6900

TTGCGCCAGA AGGTAGCCGC GGAGTATGAC CTATCACTGA AACAGGTCGG TATCGGCACG 6960

CCGTTTATTG GAAGTGGATA ACATCAGTGT TTATGCCCAG TATACGGTGC GTATGGATAA 7020

TCGAGAGTCT GTTCAGGCTT CTTTGAAAGC TGCCGGGGTT CCAACTGCTG TGCATTACCC 7080

TATTCCGCTT AATAAGCAGC CTGCTGTTGC GGATGAGAAA GCGAAACTAC CAGTGGGTGA 7140

CAAGGCTGCT ACTCAAGTAA TGAGCCTACC CATGCATCCC TATCTGGATA CGGCATCCAT 7200

CAAAATCATC TGTGCTGCGT TGACGAATTG ACGGATGTAT ATACTTGCTC GAGTCGACAG 7260

GTCTATTCTG CTGAACACAG TGTTACTGTT TGCTTTCTTT TCAGCGACAG TGTGGGTGAA 7320

TAATAATTAT ATCTATCATC TCTATGATTA TATGGGGTCT GCGAAAAAAA CTGTCGACTT 7380

CGGCTTGTAT CCGTACTTGA TGGTCTTGGC GCTCATCTGT GCCCTGTTGT GTGGAGGGGC 7440

AATTCGCAGG CCAGGTGATC TGTTAGTTAC ATTATTAGTT GTAATACTTG TTCCTCATTC 7500

ATTGGTTCTT AATGGAGCTA ATCAATATTC TCCGGATGCG CAACCATGGG CTGGCGTGCC 7560

TCTGGCAATT GCTTTTGGTA TTTTGATCAT CGGCATTGTC AATAAGATAA GATTCCATCC 7620

GCTAGGTGCA TTGCAGCGAG AAAACCAAGG AAGGCGAATG TTAGTGCTAC TGTCAGTACT 7680

CAACATAGTA GTGCTTGTGT TTATTTTCTT TAAAAGCGCT GGTTATTTTT CCTTTGACTT 7740

TGCTGGGCAG TATGCTCGCC GTGCACTTGC TCGTGAGGTT TTTGCTGCGG GTTCTGCAAA 7800

CGGCTACTTG TCGTCAATCG GTACCCAGGC ATTCTTTCCT GTGTTGTTTG CCTGGGGGGT 7860

CTACAGACGA CAATGGTTCT ACTTGGTCCT GGGTATTGTC AATGCACTAG TGCTGTGGGG 7920

AGCGTTTGGA CAGAAGTATC CTTTTGTCGT GTTGTTTCTA ATTTATGGCC TGATGGTTTA 7980

TTTTCGACGA TTCGGTCAGG TCAGAGTGTC TTGGGTTGTC TGCGCACTAT TGATGCTTTT 8040

GCTTTTAGGG GCGTTGGAAC ATGAGGTGTT TGGCTATTCA TTCTTGAATG ATTATTTTCT 8100

ACGTCGTGCT TTTATTGTGC CTTCCACCCT GTTGGGGGCA GTTGATCAGT TTGTGTCTCA 8160

GTTCGGATCC AATTATTACA GGGATACCCT GTTGGGCGCG CTCTTGGGTC AGGGTAGGAC 8220

TGAGCCGTTG AGCTTTCGTC TGGGGACGGA AATTTTCAAT AATCCCGATA TGAATGCGAA 8280

TGTAAACTTC TTCGCGATAG CCTATATGCA GTTGGGTTAT GTGGGGGTTA TGGCTGAGTC 8340

GATGTTGGTG GGCGGTAGTG TCGTTCTCAT GAATTTCTTA TTTTCGAGGT ATGGTGCATT 8400

CATGGCCATT CCGGTTGCTT TGTTATTTAC TACAAAGATT CTTGAGCAGC CCCTGCTAAC 8460

TGTAATGCTT GGCTCTGGTG TTTTCTTGAT ACTGCTTTTC CTTGCGCTAA TTTCTTTTCC 8520

ACTCAAGATG TCTTTAGGAA AAACTCTATG AGTGCGGCTT TTATCAACCG TGTCGCACGA 8580

GTATTAGTAG GCACCTTGGG AGCACAGCTC ATAACGATTG GTGTCACTCT GCTACTGGTT 8640

CGTCTGTATT CTCCTGCTGA AATGGGCGCT TTCAGTGTTT GGCTATCGTT CGCTACGATT 8700

TTTGCAGTTG TAGTTACTGG GCGCTATGAG TTGGCTATTT TTTCGACTCG AGAAGAGGGC 8760

GAACTCCAGG CAATCGTCAA GCTGATACTT CAGTTGACAC TATTGATTTT CGTTGCCGTG 8820

GCGATTGCTG TTGTTATAGG TAGACATCTG ATTGAGTCGA TGCCAGTTGT GATCGGTGAA 8880

TACTGGTTCG CATTGGCGGT GGCTTCGCTG GGGTTGGGGA TAAATAAGCT AGTCTTGTCG 8940

TTACTTACAT TTCAACAATC TTTTAATCGG TTGGGAGTTG CTCGTGTAAG CCTGGCTGCA 9000

TGTATTGCCG TTGCACAAGT TTCAGCTGCA TATTTACTGG AGGGCGTATC AGGGCTGATC 9060

TATGGCCAGC TGTTTGGTGT CGTCGTAGCC ACGGCGCTTG CGGCCCTTTG GGTAGGAAAG 9120

TCGCTGATTT TAAATTGTAT CGAGACACCG TGGCGTATGG TACGACAAGT AGCGGTACAG 9180

TACATCAATT TCCCGAAGTT TTCTCTGCCT GCGGATCTGG TCAACACGGT TGCCAGTCAG 9240

GTGCCTGTGA TTTTATTGGC GGCAAAGTTT GGTGGAGACA GTGCAGGCTG GTTTGCCCTG 9300

ACTCTGAAGA TAATGGGAGC TCCCATTTCC TTGTTGGCTG CTTCGGTGCT CGATGTGTTC 9360

AAAGAACAAG CCGCTCGTGA CTACCGAGAG TTTGGTAATT GCCGAGGTAT CTTCCTCAAG 9420

ACTTTCAGGT TGCTTGCCGT CCTCGCGCTA CCTCCTTTTA TTATATTTGG TTCATTGGCG 9480

AGTGGGCCTT TGGGTTAGTC TTTGGCGAAG CGTGGGCTGA GTCGGGGCGT TATGCTGTAT 9540

TGATGGTTCC GTTGTTTTAT ATGCGTTTCG TGGTGAGTCC GCTCAGCTAT ACAATCTATA 9600

TTGCCCAGCG GCAGAGTATG GATTTGTTGT GGCAGCTAGC CTTGTTGCTC CTGACGTTTA 9660

TCTGTTTTAC CTTGCCTGAC TCTGTCGACT CGGTGTTGTG GTTTTACTCC ATAGCATATG 9720

CTGTTATGTA TTTTGTCTAT TTCTGGATGT CCTTCCAGTG TGCCAAGGGA GATGCCAAGT 9780

GATCGTTGTT ATTGATTACG GTGTAGGTAA CATTGCTTCA GTCTTGAACA TGCTGAAGCG 9840

AGTTGGTGCC AAAGCCAAGG CATCCGATAG CCGAGAGGAT ATCGAGCAGG CGGAGAAACT 9900

GATTTTGCCT GGTGTCGGTG CTTTTGACGC CGGAATGCAA ACACTACGCA AGAGTGGGCT 9960

GGTGGATGTA CTGACAGAGC AGGTCATGAT CAAACGAAAG CCGGTCATGG GGGTGTGTCT 10020

CGGGAGTCAA GATGCTGGGG CTGCGATCTG AGGAGGGAGC GGAACCGGGG CTTGGATGGA 10080

TCGATATGGA TAGCGTCCGT TTCGAAAGGC GTGACGACCG AAAGGTTCCA CATATGGGCT 10140

GGAATCAAGT GTCCCCGCAA TTGGAGCATC CTATACTTAG CGGTATAAAC GAGCAAAGCC 10200

GATTCTATTT TGTTCATAGT TATTATATGG TTCCGAAAGA CCCAGACGAT ATCCTGTTGA 10260

GTTGTAATTA TGGACAAAAA TTCACTGCGG CGGTGGCTCG GGATAATGTT TTCGGATTTC 10320

AGTTTCATCC TGAGAAGAGT CATAAATTCG GTATGCAGTT ATTCAAAAAC TTCGTGGAGC 10380

TTGTCTGATG GTCCGGAGGC GCGTTATCCC ATGCTTGCTG CTCAAGGATC GCGGTCTAGT 10440

GAAAACCGTG AAGTTCAAGG AGCCCAAGTA CGTTGGAGAC CCGATCAACG CAATACGCAT 10500

CTTCAATGAG AAAGAAGTCG ACGAACTGAT TTTGCTGGAT ATAGATGCTT CCAGGCTCAA 10560

TCAAGAGCCT AACTATGAGT TGATCGCGGA AGTGGCTGGT GAGTGTTTTA TGCCTATTTG 10620

CTATGGGGGC GGTATCAAGA CATTGGAGCA TGCGGAAAAA ATCTTTTCCC TAGGTGTCGA 10680

AAAAGTTTCG ATAAATACCG CCGCTCTTAT GGATCTTTCG TTGATTCGAA GAATTGCCGA 10740

TAAGTTTGGT TCGCAAAGCG TAGTTGGCTC TATCGACTGC CGCAAGGGTT TCTGGGGAGG 10800

ACACTCCGTG TTCTCAGAGA ATGGGACGCG CGACATGAAA CGCTCCCCAT TGGAGTGGGC 10860

GCAAGCGCTC GAAGAGGCTG GAGTGGGTGA GATTTTTCTA AATTCTATTG ATCGAGATGG 10920

AGTGCAGAAA GGCTTCGACA ACGCTCTAGT GGAAAATATC GCTTCTAACG TCCATGTGCC 10980

AGTGATCGCC TGTGGTGGAG CTGGCTCCAT CGCTGACCTC ATCGATCTTT TTGAGCGTAC 11040

GTGTGTGTCG GCAGTAGCGG CGGGAAGCCT ATTCGTTTTC CATGGCAAGC ATCGTGCGGT 11100

ACTGATTAGT TATCCGGATG TCAACAAGCT CGACGTCGGT TAGAGTGAGC TGAGTTATTT 11160

ATGGCAAGGA CGCTTGTTGG CAACGCTATA TGCGCTTCAA GATTGTCGAA CTAAATTTGA 11220

GTTTGTCAGT GGGGCGTTCC ATTAGGCAGG CCGAGGTGAG TGCTTCGGGA GGTTGTTGTG 11280

ATGAAGATCT GTTCGCGCTG TGTTATGGAT ACATCTGACG CTGAAATCGT ATTTGATGAG 11340

GCGGGAGTCT GTAATCACTG CCATAAATTT GACAATGTTC AGTCCCGGCA GCTGTTTTCC 11400

GATGCTAGTG GTGAGCAGCG CCTTCAAAAG ATAATTGGGC AGATCAAGAA GGACGGTTCA 11460

GGTAAGGATT ATGACTGCAT CATTGGCCTT AGTGGCGGCG TAGATAGTTC CTATCTTGCT 11520

GTAAAGGTCA AGGATCTTGG CTTGCGCCCA CTGGTTGTGC ATGTGGACGC CGGCTGGAAT 11580

AGCGAACTTG CAGTCAGTAA TATTGAAAAG ATTGTAAAAT ATTGCGGTTT TGATTTACAT 11640

ACTCATGTAA TAAACTGGGA GGAAATTCGT GATCTTCAGT TGGCTTATAT GAAAGCTGCT 11700

GTCGCCAATC AGGATGTGCC TCAAGATCAT GCCTTCTTCG CTAGTATGTA TCACTTTGCT 11760

GTGAAGAATA ATATTAAGTA CATTCTGAGT GGTGGTAATT TGGCCACTGA GGCAGTATTC 11820

CCAGATACAT GGCACGGCAG CGCTATGGAT GCAATAAACC TAAAGGCTAT TCACAAAAAA 11880

TATGGTGAGC GTCCGCTAAG GGACTACAAG ACTATTAGTT TTCTTGAGTA CTATTTCTGG 11940

TATCCCTTTG TCAAAGGAAT GAGAACGGTC CGTCCGTTGA ATTTCATGGC CTATGATAAG 12000

GCCAAGGCTG AAACCTTCCT TCAAGAAACG ATAGGCTATC GTTCTTACGC GCGAAAGCAT 12060

GGAGAGTCGA TTTTCACCAA GCTTTTCCAG AACTACTATC TACCGACCAA GTTTGGCTAT 12120

GATAAACGCA AACTGCACTA CTCCAGCATG ATTTTGTCTG GGCAAATGAC GCGTGACGAA 12180

GCTCAGGCTA AACTGGCTGA GCCGCTATAT GATGCAGATG AACTGCAGTT TGATATCGAA 12240

TATTTCTGCA AGAAGATGCG AATCACCCAG GCTCAATTTG AAGAGTTGAT GAATGCACCT 12300

GTTCATGACT ATTCGGAGTT TGCCAACTGG GATTCTCGAC AGAGGATTGC GAAAAAAGTT 12360

CAAATGATTG TCCAGCGTGC GCTGGGTCGT CGCATCAATG TCTACTCGTG ATGACCGGGG 12420

CCGCTCATGA CTAAAGTTGC TCATTTGACA TCGGTTCACT CGCGTTATGA TATTCGTATA 12480

TTTCGAAAGC AGTGTAGAAC ACTCTCTCAA TACGGATACG ATGTGTATCT GGTTGTCGCA 12540

GATGGTAAGG GTGATGAAGT CAAGGATGGT GTAAGGATTG TTGATGTCGG AGTACTCTCA 12600

GGTCGCTTGA ATCGTATTCT AAAAACCACC CGAAAAATTT ATGAACAGGC TTTGGCGCTT 12660

GGGGCTGATG TCTATCATTT TCATGATCCC GAACTGATAC CTGTTGGTCT TCGACTGAAA 12720

AAGCAAGGTA AGCAGGT AT CTTCGACTCC CATGAGGATG TGCCGAAGCA ACTGCTGAGT 12780

AAACCT AC TGCGACCGTT TTTACGCCGT GTAGTGGCTG TGTTATTTTC CTGCTATGAG 12840

AAATATGCAT GCCCTAAGCT GGATGCAGTC CTTACGGCAA CGCCGCATAT TCGTGAAAAA 12900

TTTAAAAATA TTAATGGGAA TGTTCTAGAT ATTAATAACT TTCCCATGTT GGGTGAGTTG 12960

GATGCGATGG TTCCTTGGGC AAGCAAGAAA ACTGAAGTCT GCTACGTCGG TGGTATCACT 13020

TCCATTCGTG GTGTTCGTGA AGTCGTTAAG AGTCTTGAGT GCTTGAAGTC CTCGGCGCGC 13080

TTGAATTTAG TGGGAAAGTT TTCAGAGCCA GAGATAGAAA AAGAAGTCAG AGCGCTCAAG 13140

GGATGGAACT CCGTTAACGA ACATGGTCAG CTTGATCGAG AAGATGTTCG TCGTGTACTC 13200

GGTGACTCTG TTGCCGGGTT GGTGACATTT CTCCCAATGC CTAATCATGT TGATGCACAA 13260

CCTAATAAGA TGTTCGAGTA TATGTCGTCG GGAATCCCTG TGATCGCTTC CAATTTTCCT 13320

CTCTGGCGGG AAATTGTTGA AGGTAGCAAT TGTGGTATAT GCGTAGATCC TCTAAGTCCT 13380

GCTGCCATTG CTGAAGCGAT CGACTATCTG GTAAGTAATC CGTGTGAGGC GGCAGCGCTG 13440

GGACGTAATG GCCAGCGGGC AGTGAACGAA CGTTATAACT GGGATTTGGA AGGGCGCAAA 13500

CTAGCGCGGT TCTATTCCGA TCTACTGAGT AAGCGAGATT CCATATGAAA ATTCTGACCA 13560

TCATTGGTGC GCGTCCGCAG TTTATTAAAG CGAGTGTGGT TTCAAAGGCT ATCATTGAGC 13620

AGCAGACCCT TTCGGAAATC ATCGTTCATA CTGGTCAGCA TTTTGATGCC AATATGTCTG 13680

AAATATTTTT CGAACAGCTG GGTATTCCAA AGCCGGATTA CCAGTTGGAT ATCCATGGTG 13740

GTACTCACGG CCAAATGACC GGGCGTATGC TAATGGAGAT CGAGGATGTA ATTCTCAAGG 13800

AGAAACCTCA TCGCGTATTG GTATACGGCG ATACCAACTC TACCTTGGCT GGAGCGTTGG 13860

CTGCCTCCAA GCTGCATGTT CCTATCGCAC ACATCGAAGC CGGCCTGCGA AGTTTCAATA 13920

TGCGGATGCC GGAGGAAATT AACCGTATTC TTACTGATCA GGTTAGTGAT ATTCTGTTTT 13980

GCCCTACTCG AGTTGCAATT GATAATCTCA AGAATGAAGG TTTCGAAAGA AAGGCTGCGA 14040

AGATAGTCAA CGTGGGTGAT GTGATGCAGG ATAGCGCTCT ATTCTTTGCG CAGCGTGCAA 14100

CCTCGCCAAT TGGACTTGCG TCACAAGATG GGTTTATTCT CGCGACCCTG CATCGTGCCG 14160

AGAACACCGA CGATCCAGTT CGCCTGACTT CGATAGTCGA GGCTCTGAAT GAAATCCAGA 14220

TTAATGTTGC ACCTGTGGTG CTACCCCTGC ATCCACGTAC CCGCGGTGTC ATCGAGCGCC 14280

TAGGGCTCAA GCTGGAAGTG CAGGTTATCG ATCCTGTCGG ATATCTGGAA ATGATCTGGC 14340

TGTTGCAACG CTCTGGCCTG GTGCTCACGG ACAGCGGCGG TGTTCAGAAA GAAGCATTCT 14400

TCTTCGGCAA GCCCTGCGTG ACCATGCGTG ACCAGACCGA ATGGGTGGAG CTAGTGACCT 14460

GTGGAGCCAA CGTTCTTGTG GGAGCGGCCC GCGACATGAT TGTCGAATCT GCACGGACTA 14520

GCCTGGGAAA GACCATTCAA GACGATGGTC AGCTTTACGG AGGCGGTCAA GCCTCTCTCG 14580

GATTGCTGAA TATCTTGCCA AGCTGTGATG CTTTGCGTGT CGAGTTTAAA TAAAGGATTT 14640

ATTTAGTTCC ATGAACGTCT GGTATGTGCA TCCCTATGCT GGCGGCCCCG GAGTTGGTCG 14700

TTATTGGCGG CCTTATTATT TCTCCAAGTT TTGGAATCAG GCTGGGCATC GGTCGGTCAT 14760

AATCTCGGCA GGCTATCACC ATCTGCTGGA ACCGGATGAA AAGCGTTCGG GCGTCACCTG 14820

TGTAAATGGA GCCGAATACG CATATGTACC TACTTTGCGC TATTTGGGCA ATGGCGTGGG 14880

CAGAATGCTA TCGATGCTCA TATTTACCAT GATGTTGCTG CCATTCTGCC TGATCTTGGC 14940

CCTGAAGCGT GGAACGCCGG ATGCGATTAT CTACTCATCG CCTCACCCGT TTGGCGTCGT 15000

TAGCTGTTGG CTGGCTGCTC GCCTGCTAGG TGCGAAATTT GTATTTGAGG TGCGCGATAT 15060

CTGGCCTTTG AGTCTGGTCG AACTGGGAGG CTTGAAAGCT GACAATCCCC TGGTGCGTGT 15120

TACCGGTTGG ATCGAAAGAT TCTCCTATGC GCGAGCTGAT AAGATCATCA GTCTGCTGCC 15180

ATGTGCGGAG CCGCACATGG CCGACAAAGG ACTTCCCGCT GGAAAGTTCC TGTGGGTTCC 15240

GAATGGCGTT GACAGCAGCG ATATCTCTCC TGATAGCGCT GTGAGTTCAA GTGATTTGGT 15300

CCGGCATGTA CAAGTTCTCA AGGAGCAGGG TGTTTTCGTT GTGATCTATG CTGGAGCGCA 15360

CGGCGAACCC AATGCTCTGG AGGGATTGGT TCGCTCTGCC GGACTGCTGC GCGAGCGTGG 15420

TGCAAGTATC AGAATCATTC TGGTGGGCAA GGGAGAGTGC AAAGAGCAAC TCAAGGCGAT 15480

TGCCGCACAG GATGCCAGCG GGCTAGTGGA GTTTTTCGAT CAGCAGCCCA AAGAGACTAT 15540

CATGGCTGTC CTGAAGCTGG CGTCGGCGGG CTACATCTCG CTCAAGTCAG AACCGATCTT 15600

CCGCTTTGGC GTGAGCCCCA ACAAGCTATG GGATTACATG CTGGTTGGGT TGCCAGTCAT 15660

TTTCGCCTGC AAGGCAGGGA ACGACCCGGT TAGTGACTAC GATTGCGGTG TATCTGCCGA 15720

CCCAGATGCC CCTGAGGATA TTACTGCAGC CATCTTCCGT CTGTTGCTGC TGAGCGAAGA 15780

CGAGCGTCGC ACAATGGGGC AAAGAGGGCG TGATGCGGTC CTGGAGCATT ATACCTACGA 15840

GAGTCTGGCT CTTCAGGTGT TGAACGCCCT TGCTGATGGG CGCGCAGCAT GAAAGCTGTC 15900

ATGGTGACCG GTGCATCAGG ATTCGTCGGA TCGGCCTTGT GCTGTGAGCT TGCTCGGACA 15960

GGGTATGCGG TGATTGCGGT GGTACGGCGG GTTGTTGAAA GAATACCTTC TGTGACGTAC 16020

ATCGAAGCTG ATCTGACCGA TCCAGCCACG TTTGCCGGCG AGTTCCCGAC GGTGGATTGC 16080

ATTATTCATC TCGCTGGACG TGCCCATATA CTCACTGACA AGGTTGCAGA CCCGCTCGCC 16140

GCATTTCGTG AAGTCAACCG AGATGCGACT GTCCGGTTGG CTACCCGTGC GCTCGAGGCT 16200

GGGGTGAAGC GTTTCGTGTT TGTCAGTTCA ATTGGCGTTA ACGGTAACAG CACCCGGCAA 16260

CAGGCTTTCA ACGAAGATTC TCCAGCCGGC CCACATGCGC CCTATGCCAT CTCCAAATAC 16320

GAGGCTGAGC AGGAGCTGGG GACTTTGCTC CGGGGTAAAG GTATGGAGTT GGTGGTTGTC 16380

CGACCGCCTT TGATCTATGC CAATGATGCG CCAGGTAACT TCGGCCGTTT GCTCAAGCTC 16440

GTCGCTAGTG GTCTGCCGCT TCCGCTTGAC GGTGTCCGTA ATGCGCGCAG CCTGGTTTCT 16500

AGGAGAAACA TCGTGGGTTT CCTGAGTCTT TGTGCCGAAC ACCCCGATGC TGCGGGCGAA 16560

CTGTTTCTGG TGGCGGATGG CGAGGATGTT TCCATTGCGC AAATGATCGA GGCCCTGAGT 16620

CGGGGAATGG GCAGGCGTCC AGCTCTTTTC ACGTTTCCAG CGGTGCTGCT GAAGCTTGTA 16680

ATGTGCTTGC TGGGTAAGGC TTCCATGCAT GAACAGCTCT GTGGCTCGTT ACAGGTCGAT 16740

GCTTCCAAGG CCCGCCGGCT GCTCGGCTGG GTTCCCGTCG AGACTATTGG TGCCGGTCTG 16800

CAAGCAGCAG GTCGAGAGTA CATTCTTCGC CAGAGGGAGC GCCGAAAATG ACGGACACAT 16860

CCAAACCCCT GGTCGGCAAT TACGCTGAAC TTTAATAAGT TCTCTTTCCA ATGATGATCT 16920

GGATGATCGC GTGTCTAGTT GTCTTGCTGT TTTCATTTGT CGCTACCTGG GGGCTGCGTC 16980

GCTATGCATT AGCGACGAAA CTGATGGATG TTCCGAATGC CCGTAGCTCC CACAGTCAAC 17040

CGACGCCTAG GGGGGGAGGT GTTGCAATCG TTCTGGTCTT CCTTGCAGCG TTGGTGTGGA 17100

TGCTGAGTGC AGGCAGTATC TCCGGCGGCT GGGGGGGGGC GATGCTGGGT GCAGGTTCTG 17160

GCGTGGCACT GTTAGGGTTC CTGGATGACC ATGGGCACAT TGCTGCGCGT TGGCGGCTGC 17220

TCGGCCATTT CTCAGCAGCG ATATGGATCT TGCTGTGGAC GGGTGGTTTC CCGCCGCTGG 17280

ATGTGGTTGG GCATGCTGTC GACTTAGGAT GGCTGGGCCA CGTATTGGCA GTTTTCTATT 17340

TGGTATGGGT GCTGAACCTT TATAACTTCA TGGATGGCAT TGATGGTATT GCCAGTGTCG 17400

AGGCCATTGG TGTCTGTGTA GGAGGGGCCC TGATCTACTG GCTTACAGGG CATGTCGCGA 17460

TGGTTGGTAT CCCTCTGTTG CTGGCGTGCG CGGTCGCCGG CTTCCTGATC TGGAACTTCC 17520

CTCCAGCTCG AATCTTCATG GGTGATGCGG GGAGTGGTTT TCTTGGTATG GTTATTGGTG 17580

CACTAGCTAT TCAGGCTGCA TGGACCGCCC CCTCGCTGTT CTGGTGCTGG TTGATATTGC 17640

TGGGAGTGTT CATCGTTGAT GCAACCTATA CTCTGATCCG CCGGATCGCC AGAGGGGAGA 17700

AATTCTATGA GGCGCATCGC AGCCACGCTT ATCAGTTTGC CTCGCGTCGT TATGCTAGCC 17760

ATCTGCGGGT TACCTTGGGT GTTCTGGCTA TCAACACTCT TTGGTTGTTG CGTTGGCACT 17820

GATGGTTGCA TTGGGTTGGA TCAGCGGCTT CATCGGTATC CTGGTTGCTT ATGCTCCTCT 17880

TTGCCTCTTG GCGGTAGGAT ACAAGGCGGG TTCCTTGGAA AAATCCTAAG CCGTGGATTG 17940

ACCTGCTCCC CGATTTCAGT ACCACGCCGA ACTTAGTAGA GTCTGTTTTC CGAGCAGGAG 18000

ACGGCAGTGA AAAAGCGTTT TACTGAAGAA CAGATTCTAG ACTTTCTGAA GCAGGCAGAA 18060

GCCGGTGTGC CGGTGAAGGA GCTGTGTCGC CGACACAGCT TCAGTGATGC CACGTTCTAC 18120

ACCTAGCGGG CCAAGTTCGT CGGCATGACC GTGCCGGATG CCAAGCGCCT GAAGGATCTC 18180

GAACTGGAAA ACAGCCGGCT GAAGAAGTTG CTCGCCGAGT CCCTCCTCGA CATCGGGGCG 18240

CTGAAAGTGG TCACCCGGGG AAAGGGGGAG CCCGGCAGCG GGGCGGGGGG GCAGGAGATT 18300

CAGGCGCAAA CCGACATCTC CGAGCGTCGT GCCCTGTCAG TTGTTCAGGC TGTCCCGCTC 18360

TGTGTTGTGC CACCAGCCGC GAACTAGTGT GCAAAACACC GAGCTGCAAG CCCAACTGGT 18420

GGAACTGGCA AGGGCTTCGG CACTTTGGCT ATCACCGCCT GCACATTCTG CTGCGGCGTG 18480

CTGGTGTGCA GATCAACTAC AAGCGGACTT ACCGGCTATA CTGAGCCGTC GGCTTGATGG 18540

TGAAGCGGCG GAGGCGCCGC CACAGGGGCG CGGTGGCGTG CGAATGCCTG AGCCTGCCGA 18600

GCGCACCGAA CTAGGTCTTG TCGATGGATT TCGTCTTCGA CGCGCTCAGC ACTGGGCGAC 18660

GGATCAAATG CCTGACGGTG GTCGATGACT TCACCAAGGA GTCGGTTGGC ATCCTGGTGG 18720

AGCACGGTAT CAGCGGTTTT CGTGTCACAC GGGCGCTGGA CAGATGGCAC GGTTGCGCGG 18780

TTACCCGAAG GCGATCCGCA CCCCCGAGTT CACCGGCAAG GCGCTTGATC AGTGGGCCTA 18840

TCGGCGTGAT ATTAAGTTGA AGCTGACTCA GTCCGGCAAG CCCACGCAGA ACGCCTTCAT 18900

CGTCATTCCA ACGGCAAGTT CCGCAATGAG CACTGCTGCT CGCTGGTCGA AGCCAGAATC 18960

CGCATCGTGG CCTGGCGGCA CGATTACAAC GAGCACCGAC CGTCCAGCGC CATTGGCAAT 19020

CTCACCTCGC TAGAGTTTGC TGCAAGTTGG CGAACTCGCC AGCAGCAACT GAAGCAGGAA 19080

AATTGATGTC AACCCCAGGG CCTACTACCT AGGCAGCGTA CTAAAACTGG GGGCAGGTCA 19140

TCTACGATCC TTGTGATAGG TATCGACGGT GCTGTGGCGA TCCGTGCATG TGGAACTGAT 19200

CTGGGATTTT CCCTGCGTGT GTTTTCAGGG GCCTGGCAGT GATTTTTTGA GCATTGCCAT 19260

GGGGGGGCGG GTTTTTGCAT CCTGCTCGGA CGCTGGCTGA TTCCCACTCG ACGTGCTCGT 19320

GTTCGATGTC ACTTTTACTT TGCTGCTGCA TCGTTTGTTA TGAGGCGATA AAATTCGGCA 19380

GAGCTATCGA GTCACGCATG ATGGCACGTT GGTGTCGTGC TGAAGTGGCA TTTGCCGGTT 19440

ATCCTTTGTG GCTGTGATCA GTTTCTTCTG GTTATTACCC TAGCATTGCT GGTAGTACTA 19500

AGCATTATCG ACGGAGTACT TGGGGGCTTA TCGCGTATGC TCCTATGGCT TGGATGGCGA 19560

CGAGTCTTGG GAGGGGATGT CCTGAGACGT AGCGTGGGCC TTGCCATATT GTTGCCATGG 19620

TTATCTGTCT GATCTGTCTG GTTGGTATGG ATGTATTGAA CGGGGCTGAT AAATAGGATG 19680

TTGGATAATT TGAGGATAAA GCTCCTGGGA TTGCCGCGCC GCTATAAGCG AATGCTGCAA 19740

GTCGCTGCCG ATGTGACTCT TGTGTGGCTA TCCCTCTGGC TGGCTTTCTT GGTCAGGTTG 19800

GGCACAGAAG ACATGATCAG CCCGTTTAGC GGCCATGCCT GGCTGTTCAT CGCCGCCCCG 19860

TTGGTGGCCA TTCCCCTGTT CATCCGCTTC GGCATGTACC GGGCGGTGAT GCGCTACCTG 19920

GGCAACGACG CCCTTATCGC GATCGCCAAG GCCGTCACCA TTTCCGCGCT GGTCCTGTCG 19980

TTGCTGGTCT ACTGGTACCG CTCCCCGCCG GCGGTGGTGC CGCGTTCCCT GGTGTTCAAC 20040

TACTGGTGGT TGAGCATGCT GCTGATCGGC GGCTTGCGTC TGGCCATGCG CCAGTATTTC 20100

ATGGGAGACT GGTACTCTGC TGTGCAGTCG GTACCATTTC TCAACCGCCA GGATGGCCTG 20160

CCCAGGGTGG CTATCTATGG CGCGGGGGCG GCCGCCAACC AGTTGGTTGC GGCATTGCGT 20220

CTCGGTCGGG CGATGCGTCC GGTGGCGTTC ATCGATGATG ACAAGCAGAT CGCCAACCGG 20280

GTCATCGCCG GTCTGCGGGT CTATACCGCC AAGCATATCC GCCAGATGAT CGACGAGACG 20340

GGCGCGCAGG AGGTTCTCCT GGCGATTCCT TCCGCCACTC GGGCCCGGCG CCGAGAGATT 20400

CTCGAGTCCC TGGAGCCGTT CCCGCTGCAC GTGCGCAGCA TGCCCGGCTT CATGGACCTG 20460

ACCAGCGGCC GGGTCAAGGT GGACGACCTG CAGGAGGTGG ACATCGCTGA CCTGCTGGGG 20520

CGCGACAGCG TCGCACCGCG CAAGGAGCTG CTGGAACGTT GCATCCGCGG TCAGGTGGTG 20580

ATGGTGACCG GGGCGGGCGG CTCTATCGGT TCGGAACTCT GTCGGCAGAT CATGAGTTGT 20640

TCGCCTAGCG TGCTGATCCT GTTCGAGCAC AGCGAATACA ACCTCTATAG CATCCATCAG 20700

GAACTGGAGC GTCGGATCAA GCGCGAGTCG CTTTCGGTGA ACCTGTTGCC GATCCTCGGT 20760

TCGGTGCGCA ATCCCGAGCG CCTGGTGGAC GTGATGCGTA CCTGGAAGGT CAATACCGTC 20820

TACCATGCGG CGGCCTACAA GCATGTGCCG ATCGTCGAGC ACAACATCGC CGAGGGCGTT 20880

CTCAACAACG TGATAGGCAC CTTGCATGCG GTGCAGGCCG CGGTGCAGGT CGGCGTGCAG 20940

AACTTCGTGC TGATTTCCAC CGACAAGGCG GTGCGACCGA CCAATGTGAT GGGCAGCACC 21000

AAGCGCCTGG CGGAGATGGT CCTTCAGGCG CTCAGCAACG AATCGGCACC GTTGCTGTTC 21060

GGCGATCGGA AGGACGTGCA TCACGTCAAC AAGACCCGTT TCACAATGGT CCGCTTCGGC 21120

AACGTCCTCG GTTCGTCCGG TTCGGTCATT CCGCTGTTCC GCGAGCAGAT CAAGCGCGGC 21180

GGCCCGGTGA CGGTCACCCA CCCGAGCATC ACCCGTTACT TCATGACCAT TCCCGAGGCA 21240

GCGCAGTTGG TCATCCAGGC CGGTTCGATG GGGCAGGGCG GAGATGTATT CGTGCTGGAC 21300

ATGGGGCCGC CGGTGAAGAT CCTGGAGCTC GCCGAGAAGA TGATCCACCT GTCCGGCCTG 21360

AGCGTGCGTT CCGAGCGTTC GCCCCATGGT GACATCGCCA TCGAGTTCAG TGGCCTGCGT 21420

CCTGGCGAGA AGCTCTACGA AGAGCTGCTG ATCGGTGACA ACGTGAATCC CACCGACCAT 21480

CCGATGATCA TGCGGGCCAA CGAGGAACAC CTGAGCTGGG AGGCCTTCAA GGTCGTGCTG 21540

GAGCAGTTGC TGGCCGCCGT GGAGAAGGAC GACTACTCGC GGGTTCGCCA GTTGCTGCGG 21600

GAAACCGTCA GCGGCTATGC GCCTGACGGT GAAATCGTCG ACTGGATCTA TCGCCAGAGG 21660

CGGCGAGAAC CCTGAGTCAT CGTTCTCCGG AAAAGGCCGC CTAGCGGCCT TTTTTGTTTT 21720

CTCCGTACGA TGTTTCCGGT GCCGGACCAG GAAGCGACTG CTTTGCTGGG GCTGTCGATC 21780

CAGGTGCGTT CCACGGCGAT AAGGTGGTTT CGTGGATGGG CATGAAGCCC TCTACGTGGT 21840

CATTCATCTC TGAAGGAGTG CACCCATGCA CCTAATCAAA TCCGCTCTGC TTCTCATCCT 21900

GTTCGCCTGT CTTCCGTTTT CGGCTTCCGC CGCACCGGTC GCCGTCGCCA AGAATCCGCT 21960

GGCCGCAACG ACACCTGCGA CGACCGTGTC GCCGGGGGAG CAGGTCAATA TCAATACGGT 22020

CGACGAGGCC GCCCTGATAC GGGGGCTCAA CGGTGTCGGC GAGGCCAAGG CCAGGGCGAT 22080

CCTCGAGTAT CGTGCGGCCC ATGGTCCGTT CGTCTCGGTG GATCAACTGC TGGAAGTGAA 22140

AGGGGTAGGC CCGGCGTTGC TGGAGAAGAA CCGGGCGCGG ATCGTCATCG AGTGAGGTGC 22200

GACTGAAGGG GCGAACTTTC GTCCCGATAA CGAAAAAGCC CCCGGCATGT GCCGAGGGCT 22260

TTGAATTTGG CTCCGCGACC TGGACTCGAA CCAGGGACCC AATGATTAAC AGTCATTTGC 22320

TCTACCGACT GAGCTATCGC GGAACAGCGA GGCGTATGTT ACTGATTAAA AAGGGGAAGC 22380

CTCTCCCGAT GACTTCCCCA TTTTCCCTAC AGGACCTGGA CGATGGCCTT GGTGATGGTC 22440

TCCAGGTTCG ATTTGTTCAG CGCGGCGACG CAGATACGGC CGGTGCTGAC GGCGTAGATA 22500

CCGAACTCGG TCTTCAGGCG CTCGACCTGG TCGGCGGTCA GGCCGGAATA GGAGAACATG 22560

CCACGTTGGC GACCGACGAA ACTGAAGTCG CGCTTGGCGC CGTGGGCTGC CAGTTGCTCG 22620

ACCATCGCCA GGCGCATGTC GCGGATGCGG TCGCGCATCT CGCCCAGTTC CTGCTCCCAG 22680

AGGGCCCGCA GTTCCGGGCT GTTGAGCACG GAGGAGACGA CGCTGGCGCC GTGGGTCGGT 22740

GGGTTCGAAT AGTTGGTGCG GATCACCCGC TTCACCTGGG ACAGCACGCG GGCCGATTCA 22800

TCGCGGCTTT CGGTCACGAT CGAGAGGGCG CCGACGCGTT CGCCATAGAG CGAGAAGGAT 22860

TTGGAGAACG AGCTGGAAAC GAAGAAGCTC AGGCCCGACT GGGCGAACAG GCGCACCGCG 22920

GCGGCGTCTT CCTCGATGCC GTTGCCGAAG CCCTGGTAGG CGATGTCGAG GAACGGCACG 22980

TGGCCCTTGG CCTTGAGCAC GTCCAGCACC TGTTTCCAGT CGTCCAGCTC GAGATCGACG 23040

CCGGTCGGAT TATGGCAGCA GGCGTGCAGA ACCACGATCG AGCGGGCCGG CAGGGCATTC 23100

AGGTCTTCCA GCAGGCCGGC GCGGTTCACG CCATTGCTGG CGGCGTCGTA ATAGCGGTAG 23160

TTCTGCACCG GGAAGCCGGC GGCTTCGAAC AGTGCGCGGT GGTTTTCCCA GCTCGGGTCG 23220

CTGATGGCCA CGGTGGCGTC GGGCAGCAGG CGCTTGAGGA AGTCGGCGCC GAGCTTGAGC 23280

GCGCCGGTGC CGCCGACGGC CTGGGTCGTG ACCACACGGC CGGCGGCCAG CAGCTCGGAC 23340

TCGTTACCGA ACAGCAGTTT CTGTACGCCC TGGTCGTAGG CGGCGATCCC TTCGATCGGC 23400

AGGTAGCCGC GCGGCGCGTG GGCCTCGATG CGGGCCTTCT CGGCAGCCTG CACGGCACGC 23460

AACAGCGGAA TGCGCCCCTC CTCGTTGTAG TACACGCCCA CGCCCAGGTT GATCTTGCCC 23520

GGACGGGTAT CGGCGTTGAA GGCTTCGTTC AGGCCAAGGA TGGGATCACG CGGTGCCATT 23580

TCGACGGCAG AAAACAGACT CATTTTGCGG CTGCTCGGAG TGTGAAGAGA GGAGGGCAAC 23640

GCAACCCGTT ATGCGGGGGC GCAAAGGGTT GCGCAAACGG GGGGTTATTA TAGACACCCC 23700

TTGATGCATG CGGCGACATT TAGGTGCATG CTTTCAGCTA TTTCTGACGC CGGATTTTCC 23760

TTGGCGTCAC AGCTCCCTGC GAGGTTTTTC ATGGATACGT TCCAACTCGA CTCGCGCTTC 23820

AAGCCCGCCG GCGACCAGCC GGAAGCCATC CGGCAAATGG TCGAGGGGCT GGAGGCGGGG 23880

CTTTCGCACC AGACCCTGCT GGGGGTGACG GGCTCTGGCA AGACTTTCAG CATCGCCAAC 23940

GTGATTGCCC AGGTGCAGCG CCCGACCCTG GTCCTGGCGC CGAACAAGAC CCTGGCGGCC 24000

CAGCTCTACG GGGAGTTCAA GACGTTCTTC CCGCACAATT CCGTGGAGTA CTTCGTTTCC 24060

TACTACGACT ACTACCAGCC GGAGGCCTAC GTCCCGTCTT CCGATACCTA TATCGAGAAG 24120

GACTCCTCGA TCAACGACCA TATCGAGCAG ATGCGCCTGT CGGCGACCAA GGCGCTGCTC 24180

GAGCGTCCGG ATGCGATCAT CGTCGCCACC GTGTCGTCCA TCTACGGCCT CGGTGATCCC 24240

GCGTCCTACC TGAAGATGGT CCTGCACCTG GACCGCGGCG ACCGCATCGA CCAGCGCGAA 24300

CTGCTGCGGC GACTGACCAG CCTGCAGTAC ACCCGCAACG ACATGGATTT CGCCCGTGCG 24360

ACTTTCCGTG TGCGTGGCGA TGTGATCGAC ATCTTCCCGG CCGAATCCGA TCTCGAG 24417

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 158 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: rol

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Arg Asp Ile Glu Gln Arg Ile Gln Asn Leu Arg Arg Glu Cys Gln Gly 1 5 10 15

Arg Arg Glu Asp Arg Ile Val Gln Leu Lys Glu Ala Leu Lys Val Ala 20 25 30

Gly Ala Leu Lys Leu Glu Glu Pro Pro Leu Ile Ser Gly Gln Ser Ser 35 40 45

Glu Glu Leu Ser Ala Ile Met Asn Gly Ser Leu Met Tyr Met Arg Gly 50 55 60

Ser Lys Ala Ile Met Ala Glu Ile Gln Thr Leu Glu Ala Arg Ser Ser 65 70 75 80

Asp Asp Pro Phe Ile Pro Ala Leu Arg Thr Leu Gln Glu Gln Gln Leu 85 90 95

Leu Leu Ser Ser Leu Arg Val Asn Ser Glu Arg Val Ser Val Phe Arg 100 105 110

Gln Asp Gly Pro Ile Glu Thr Pro Asp Ser Pro Val Arg Pro Arg Arg 115 120 125

Ala Met Ile Leu Ile Phe Gly Leu Ile Ile Gly Gly Val Leu Gly Gly 130 135 140

Phe Leu Ala Leu Cys Arg Ile Phe Leu Lys Lys Tyr Ala Arg 145 150 155

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 436 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Ile Asp Val Asn Thr Val Val Glu Lys Phe Lys Ser Arg Gln Ala

1 5 10 15

Leu Ile Gly Ile Val Gly Leu Gly Tyr Val Gly Leu Pro Leu Met Leu 20 25 30

Arg Tyr Asn Ala Ile Gly Phe Asp Val Leu Gly Ile Asp Ile Asp Asp 35 40 45

Val Lys Val Asp Lys Leu Asn Ala Gly Gln Cys Tyr Ile Glu His Ile 50 55 60

Pro Gln Ala Lys Ile Ala Lys Ala Arg Ala Ser Gly Phe Glu Ala Thr 65 70 75 80

Thr Asp Phe Ser Arg Val Ser Glu Cys Asp Ala Leu Ile Leu Cys Val 85 90 95

Pro Thr Pro Leu Asn Lys Tyr Arg Glu Pro Asp Met Ser Phe Val Ile 100 105 110

Asn Thr Thr Asp Ala Leu Lys Pro Tyr Leu Arg Val Gly Gln Val Val 115 120 125

Ser Leu Glu Ser Thr Thr Tyr Pro Gly Thr Thr Glu Glu Glu Leu Leu 130 135 140

Pro Arg Val Gln Glu Gly Gly Leu Val Val Gly Arg Asp Ile Tyr Leu 145 150 155 160

Val Tyr Ser Pro Glu Arg Glu Asp Pro Gly Asn Pro Asn Phe Glu Thr 165 170 175

Arg Thr Ile Pro Lys Val Ile Gly Gly His Thr Pro Gln Cys Leu Glu 180 185 190

Val Gly Ile Ala Leu Tyr Glu Gln Ala Ile Asp Arg Val Val Pro Val 195 200 205

Ser Ser Thr Lys Ala Ala Glu Met Thr Lys Leu Leu Glu Asn Ile His 210 215 220

Arg Ala Val Asn Ile Gly Leu Val Asn Glu Met Lys Ile Val Ala Asp 225 230 235 240

Arg Met Gly Ile Asp Ile Phe Glu Val Val Asp Ala Ala Ala Thr Lys 245 250 255

Pro Phe Gly Phe Thr Pro Tyr Tyr Pro Gly Pro Gly Leu Gly Gly His 260 265 270

Cys Ile Pro Ile Asp Pro Phe Tyr Leu Thr Trp Lys Ala Arg Glu Tyr 275 280 285

Gly Leu His Thr Arg Phe Ile Glu Leu Ser Gly Glu Val Asn Gln Ala 290 295 300

Met Pro Glu Tyr Val Leu Gly Lys Leu Met Asp Gly Leu Asn Glu Ala 305 310 315 320

Gly Arg Ala Leu Lys Gly Ser Arg Val Leu Val Leu Gly Ile Ala Tyr 325 330 335

Lys Lys Asn Val Asp Asp Met Arg Glu Ser Pro Ser Val Glu Ile Met 340 345 350

Glu Leu Ile Glu Ala Lys Gly Gly Met Val Ala Tyr Ser Asp Pro His 355 360 365

Val Pro Val Phe Pro Lys Met Arg Glu His His Phe Glu Leu Ser Ser 370 375 380

Glu Pro Leu Thr Ala Glu Asn Leu Ala Arg Phe Asp Ala Val Val Leu 385 390 395 400

Ala Thr Asp His Asp Lys Phe Asp Tyr Glu Leu Ile Lys Ala Glu Ala 405 410 415

Lys Leu Val Val Asp Ser Arg Gly Lys Tyr Arg Ser Pro Ala Ala His 420 425 430

Ile Ile Lys Ala 435

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 316 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbB

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Lys Asn Phe Ala Leu Ile Gly Ala Ala Gly Tyr Ile Ala Pro Arg

1 5 10 15

His Met Arg Ala Ile Lys Asp Thr Gly Asn Cys Leu Val Ser Ala Tyr 20 25 30

Asp Ile Asn Asp Ser Val Gly Ile Ile Asp Ser Ile Ser Pro Gln Ser 35 40 45

Glu Phe Phe Thr Glu Phe Glu Phe Phe Leu Asp His Ala Ser Asn Leu 50 55 60

Lys Arg Asp Ser Ala Thr Ala Leu Asp Tyr Val Ser Ile Cys Ser Pro 65 70 75 80

Asn Tyr Leu His Tyr Pro His Ile Ala Ala Gly Leu Arg Leu Gly Cys 85 90 95

Asp Val Ile Cys Glu Lys Pro Leu Val Pro Thr Pro Glu Met Leu Asp 100 105 110

Gln Leu Ala Val Ile Glu Arg Glu Thr Asp Lys Arg Leu Tyr Asn Ile 115 120 125

Leu Gln Leu Arg His His Gln Ala Ile Ile Ala Leu Lys Asp Lys Val 130 135 140

Ala Arg Glu Lys Ser Pro His Lys Tyr Glu Val Asp Leu Thr Tyr Ile 145 150 155 160

Thr Ser Arg Gly Asn Trp Tyr Leu Lys Ser Trp Lys Gly Asp Pro Arg 165 170 175

Lys Ser Phe Gly Val Ala Thr Asn Ile Gly Val His Phe Tyr Asp Met 180 185 190

Leu His Phe Ile Phe Gly Lys Leu Gln Arg Asn Val Val His Phe Thr 195 200 205

Ser Glu Tyr Lys Thr Ala Gly Tyr Leu Glu Tyr Glu Gln Ala Arg Val 210 215 220

Arg Trp Phe Leu Ser Val Asp Ala Asn Asp Leu Pro Glu Ser Val Lys 225 230 235 240

Gly Lys Lys Pro Thr Tyr Arg Ser Ile Thr Val Asn Gly Glu Glu Met 245 250 255

Glu Phe Ser Glu Gly Phe Thr Asp Leu His Thr Thr Ser Tyr Glu Glu 260 265 270

Ile Leu Ala Gly Arg Gly Tyr Gly Ile Asp Asp Ala Arg His Cys Val 275 280 285

Glu Thr Val Asn Thr Ile Arg Ser Ala Val Ile Val Pro Ala Ser Asp 290 295 300

Asn Glu Gly His Pro Phe Val Ala Ala Leu Ala Arg 305 310 315

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 766 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbC

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Met Leu Cys Thr Ser Leu Pro Ser Thr Arg Gln Leu Val Ile Trp Ser 1 5 10 15

Thr Ser Arg Pro Val Cys Val Gly Phe Cys Pro Trp Met Leu Thr Thr 20 25 30

Cys Arg Ser Arg Ser Arg Ala Lys Ser Arg Pro Ile Val Arg Leu Pro 35 40 45

Ser Thr Val Arg Lys Trp Ser Ser Leu Lys Ala Leu Pro Ile Tyr Ile 50 55 60

Gln Pro Ala Thr Lys Lys Phe Ser Leu Val Val Val Met Ala Ser Met 65 70 75 80

Thr Leu Val Ile Val Trp Lys Leu Ser Ile Pro Phe Ala Ala Pro Ser 85 90 95

Ser Tyr Arg Pro Leu Ile Thr Lys Gly Ile Arg Ser Ser Arg Arg Leu 100 105 110

Arg Val Glu Val Glu Lys Glu Trp Pro Ser Ser Val Thr Cys Leu Gln 115 120 125

Gln Val Ser Ala Gly Ser Phe Ile Ser Met Ser Ser Ser Ser Ser Lys 130 135 140

Leu Leu Asn Gly Met Val Ala Val Ser Ser Gly Arg Asn Ile Arg Leu 145 150 155 160

Asp Val Gln Gly Leu Arg Ala Val Ala Val Leu Ala Val Leu Ala Tyr 165 170 175

His Ala Asn Ser Ala Trp Leu Arg Ala Gly Phe Val Gly Val Asp Val 180 185 190

Phe Phe Val Ile Ser Gly Phe Ile Ile Thr Ala Leu Leu Val Glu Arg 195 200 205

Gly Val Lys Val Asp Leu Val Glu Phe Tyr Ala Gly Arg Ile Lys Arg 210 215 220

Ile Phe Pro Ala Tyr Phe Val Met Leu Ala Ile Val Cys Ile Val Ser 225 230 235 240

Thr Ile Leu Phe Leu Pro Asp Asp Tyr Val Phe Phe Glu Lys Ser Leu 245 250 255

Gln Ser Ser Val Phe Phe Ser Ser Asn His Tyr Phe Ala Asn Phe Gly 260 265 270

Ser Tyr Phe Ala Pro Arg Ala Glu Glu Leu Pro Leu Leu His Thr Cys 275 280 285

Ser Ile Ala Asn Glu Met Gln Phe Tyr Leu Phe Tyr Pro Val Leu Phe 290 295 300

Met Cys Leu Pro Cys Arg Trp Arg Leu Pro Val Phe Ile Leu Leu Ala 305 310 315 320

Ile Leu Leu Phe Ile Trp Ser Gly Tyr Cys Val Phe Ser Gly Ser Gln 325 330 335

Asp Ala Gln Tyr Phe Ala Leu Leu Ala Arg Val Pro Glu Phe Met Ser 340 345 350

Gly Ala Val Val Ala Leu Ser Leu Arg Asp Arg Glu Leu Pro Ala Arg 355 360 365

Leu Ala Ile Leu Ala Gly Leu Leu Gly Ala Ala Leu Leu Val Cys Ser 370 375 380

Phe Ile Ile Ile Asp Lys Gln His Phe Pro Gly Phe Trp Ser Leu Leu 385 390 395 400

Pro Cys Leu Gly Ala Ala Leu Leu Ile Ala Ala Arg Arg Gly Pro Ala 405 410 415

Ser Leu Leu Leu Ala Ser Arg Pro Met Val Trp Ile Gly Gly Ile Ser 420 425 430

Tyr Ser Leu Tyr Leu Trp His Trp Pro Ile Leu Ala Phe Ile Arg Tyr 435 440 445

Tyr Thr Gly Gln Tyr Glu Leu Ser Phe Val Ala Leu Leu Ala Phe Leu 450 455 460

Thr Gly Ser Phe Leu Leu Ala Trp Phe Ser Tyr Arg Tyr Ile Glu Thr 465 470 475 480

Pro Ala Arg Lys Ala Val Gly Leu Arg Gln Gln Ala Leu Lys Trp Met 485 490 495

Leu Ala Ala Ser Val Val Ala Ile Val Val Thr Gly Gly Ala Gln Phe 500 505 510

Asn Val Leu Val Val Ala Pro Ala Pro Ile Gln Leu Thr Arg Tyr Ala 515 520 525

Val Pro Glu Ser He Cys His Gly Val Gin Val Gly Glu Cys Lys Arg 530 535 540

Gly Ser Val Asn Ala Val Pro Arg Val Leu Val He Gly Asp Ser His 545 550 555 560

Ala Ala Gin Leu Asn Tyr Phe Phe Asp Val Val Gly Asn Glu Ser Gly 565 570 575

Val Ala Tyr Arg Val Leu Thr Gly Ser Ser Cys Val Pro He Pro Ala 580 585 590

Phe Asp Leu Glu Arg Leu Pro Arg Trp Ala Arg Lys Pro Cys Gin Ala 595 600 605

Gin He Asp Ala Val Ala Gin Ser Met Leu Asn Phe Asp Lys He He 610 615 620

Val Ala Gly Met Trp Gin Tyr Gin Met Gin Ser Pro Ala Phe Ala Gin 625 630 635 640

Ala Met Arg Ala Phe Leu Val Asp Thr Ser Tyr Ala Gly Lys Gin Val 645 650 655

Ala Leu Leu Gly Gin He Pro Met Phe Glu Ser Asn Val Gin Arg Val 660 665 670

Arg Arg Phe Arg Glu Leu Gly Leu Ser Ala Pro Leu Val Ser Ser Ser 675 680 685

Trp Gin Gly Ala Asn Gin Leu Leu Arg Ala Leu Ala Glu Gly He Pro 690 695 700

Asn Val Arg Phe Met Asp Phe Ser Ser Ser Ala Phe Phe Ala Asp Ala 705 710 715 720

Pro Tyr Gin Asp Gly Glu Leu He Tyr Gin Asp Ser His His Leu Asn 725 730 735

Glu Val Gly Ala Arg Arg Tyr Gly Tyr Phe Ala Ser Arg Gin Leu Gin 740 745 750

Arg Leu Phe Glu Gin Pro Gin Ser Ser Val Ser Leu Lys Pro 755 760 765

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 160 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbD

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Ser Tyr Tyr Gln His Pro Ser Ala Ile Val Asp Asp Gly Ala Gln 1 5 10 15

Ile Gly Ser Asp Ser Arg Val Trp His Phe Val His Ile Cys Ala Gly 20 25 30

Ala Arg Ile Gly Ala Gly Val Ser Leu Gly Gln Asn Val Phe Val Gly 35 40 45

Asn Lys Val Val Ile Gly Asp Arg Cys Lys Ile Gln Asn Asn Val Ser 50 55 60

Val Tyr Asp Asn Val Thr Leu Glu Glu Gly Val Phe Cys Gly Pro Ser 65 70 75 80

Met Val Phe Thr Asn Val Tyr Asn Pro Arg Ser Leu Ile Glu Arg Lys 85 90 95

Asp Gln Tyr Arg Asn Thr Leu Val Lys Lys Gly Ala Thr Leu Gly Ala 100 105 110

Asn Cys Thr Ile Val Cys Gly Val Thr Ile Gly Glu Tyr Ala Phe Leu 115 120 125

Gly Ala Gly Ala Val Ile Asn Lys Asn Val Pro Ser Tyr Ala Leu Met 130 135 140

Val Gly Val Pro Ala Arg Gln Ile Gly Trp Ile Ala Asn Ser Val Ser 145 150 155 160

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 276 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbE

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Met Ile Glu Phe Ile Asp Leu Lys Asn Gln Gln Ala Arg Ile Lys Asp 1 5 10 15

Lys Ile Asp Ala Gly Ile Gln Arg Val Leu Arg His Gly Gln Tyr Ile 20 25 30

Leu Gly Pro Glu Val Thr Glu Leu Glu Asp Arg Leu Ala Asp Phe Val 35 40 45

Gly Ala Lys Tyr Cys Ile Ser Cys Ala Asn Gly Thr Asp Ala Leu Gln 50 55 60

Ile Val Gln Met Ala Leu Gly Val Gly Pro Gly Asp Glu Val Ile Thr 65 70 75 80

Pro Gly Phe Thr Tyr Val Ala Thr Ala Glu Thr Val Ala Leu Leu Gly 85 90 95

Ala Lys Pro Val Tyr Val Asp Ile Asp Pro Arg Thr Tyr Asn Leu Asp 100 105 110

Pro Gln Leu Leu Glu Ala Ala Ile Thr Pro Arg Thr Lys Ala Ile Ile 115 120 125

Pro Val Ser Leu Tyr Gly Gln Cys Ala Asp Phe Asp Ala Ile Asn Ala 130 135 140

Ile Ala Ser Lys Tyr Gly Ile Pro Val Ile Glu Asp Ala Ala Gln Ser 145 150 155 160

Phe Gly Ala Ser Tyr Lys Gly Lys Arg Ser Cys Asn Leu Ser Thr Val 165 170 175

Ala Cys Thr Ser Phe Phe Pro Ser Lys Pro Leu Gly Cys Tyr Gly Asp 180 185 190

Gly Gly Ala Ile Phe Thr Asn Asp Asp Glu Leu Ala Thr Ala Ile Arg 195 200 205

Gln Ile Ala Arg His Gly Gln Asp Arg Arg Tyr His His Ile Arg Val 210 215 220

Gly Val Asn Ser Arg Leu Asp Thr Leu Gln Ala Ala Ile Leu Leu Pro 225 230 235 240

Lys Leu Glu Ile Phe Glu Glu Glu Ile Ala Leu Arg Gln Lys Val Ala 245 250 255

Ala Glu Tyr Asp Leu Ser Leu Lys Gln Val Gly Ile Gly Thr Pro Phe 260 265 270

Ile Gly Ser Gly 275

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 438 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: rfc a

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Met Tyr Ile Leu Ala Arg Val Asp Arg Ser Ile Leu Leu Asn Thr Val 1 5 10 15

Leu Leu Phe Ala Phe Phe Ser Ala Thr Val Trp Val Asn Asn Asn Tyr 20 25 30

Ile Tyr His Leu Tyr Asp Tyr Met Gly Ser Ala Lys Lys Thr Val Asp 35 40 45

Phe Gly Leu Tyr Pro Tyr Leu Met Val Leu Ala Leu Ile Cys Ala Leu 50 55 60

Leu Cys Gly Gly Ala Ile Arg Arg Pro Gly Asp Leu Leu Val Thr Leu 65 70 75 80

Leu Val Val Ile Leu Val Pro His Ser Leu Val Leu Asn Gly Ala Asn 85 90 95

Gln Tyr Ser Pro Asp Ala Gln Pro Trp Ala Gly Val Pro Leu Ala Ile 100 105 110

Ala Phe Gly Ile Leu Ile Ile Gly Ile Val Asn Lys Ile Arg Phe His 115 120 125

Pro Leu Gly Ala Leu Gln Arg Glu Asn Gln Gly Arg Arg Met Leu Val 130 135 140

Leu Leu Ser Val Leu Asn Ile Val Val Leu Val Phe Ile Phe Phe Lys 145 150 155 160

Ser Ala Gly Tyr Phe Ser Phe Asp Phe Ala Gly Gln Tyr Ala Arg Arg 165 170 175

Ala Leu Ala Arg Glu Val Phe Ala Ala Gly Ser Ala Asn Gly Tyr Leu 180 185 190

Ser Ser Ile Gly Thr Gln Ala Phe Phe Pro Val Leu Phe Ala Trp Gly 195 200 205

Val Tyr Arg Arg Gln Trp Phe Tyr Leu Val Leu Gly Ile Val Asn Ala 210 215 220

Leu Val Leu Trp Gly Ala Phe Gly Gln Lys Tyr Pro Phe Val Val Leu 225 230 235 240

Phe Leu Ile Tyr Gly Leu Met Val Tyr Phe Arg Arg Phe Gly Gln Val 245 250 255

Arg Val Ser Trp Val Val Cys Ala Leu Leu Met Leu Leu Leu Leu Gly 260 265 270

Ala Leu Glu His Glu Val Phe Gly Tyr Ser Phe Leu Asn Asp Tyr Phe 275 280 285

Leu Arg Arg Ala Phe Ile Val Pro Ser Thr Leu Leu Gly Ala Val Asp 290 295 300

Gln Phe Val Ser Gln Phe Gly Ser Asn Tyr Tyr Arg Asp Thr Leu Leu 305 310 315 320

Gly Ala Leu Leu Gly Gln Gly Arg Thr Glu Pro Leu Ser Phe Arg Leu 325 330 335

Gly Thr Glu Ile Phe Asn Asn Pro Asp Met Asn Ala Asn Val Asn Phe 340 345 350

Phe Ala Ile Ala Tyr Met Gln Leu Gly Tyr Val Gly Val Met Ala Glu 355 360 365

Ser Met Leu Val Gly Gly Ser Val Val Leu Met Asn Phe Leu Phe Ser 370 375 380

Arg Tyr Gly Ala Phe Met Ala Ile Pro Val Ala Leu Leu Phe Thr Thr 385 390 395 400

Lys Ile Leu Glu Gln Pro Leu Leu Thr Val Met Leu Gly Ser Gly Val 405 410 415

Phe Leu Ile Leu Leu Phe Leu Ala Leu Ile Ser Phe Pro Leu Lys Met 420 425 430

Ser Leu Gly Lys Thr Leu 435

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 316 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbF

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Met Ser Ala Ala Phe Ile Asn Arg Val Ala Arg Val Leu Val Gly Thr 1 5 10 15

Leu Gly Ala Gln Leu Ile Thr Ile Gly Val Thr Leu Leu Leu Val Arg 20 25 30

Leu Tyr Ser Pro Ala Glu Met Gly Ala Phe Ser Val Trp Leu Ser Phe 35 40 45

Ala Thr Ile Phe Ala Val Val Val Thr Gly Arg Tyr Glu Leu Ala Ile 50 55 60

Phe Ser Thr Arg Glu Glu Gly Glu Leu Gln Ala Ile Val Lys Leu Ile 65 70 75 80

Leu Gln Leu Thr Leu Leu Ile Phe Val Ala Val Ala Ile Ala Val Val 85 90 95

Ile Gly Arg His Leu Ile Glu Ser Met Pro Val Val Ile Gly Glu Tyr 100 105 110

Trp Phe Ala Leu Ala Val Ala Ser Leu Gly Leu Gly Ile Asn Lys Leu 115 120 125

Val Leu Ser Leu Leu Thr Phe Gln Gln Ser Phe Asn Arg Leu Gly Val 130 135 140

Ala Arg Val Ser Leu Ala Ala Cys Ile Ala Val Ala Gln Val Ser Ala 145 150 155 160

Ala Tyr Leu Leu Glu Gly Val Ser Gly Leu Ile Tyr Gly Gln Leu Phe 165 170 175

Gly Val Val Val Ala Thr Ala Leu Ala Ala Leu Trp Val Gly Lys Ser 180 185 190

Leu Ile Leu Asn Cys Ile Glu Thr Pro Trp Arg Met Val Arg Gln Val 195 200 205

Ala Val Gln Tyr Ile Asn Phe Pro Lys Phe Ser Leu Pro Ala Asp Leu 210 215 220

Val Asn Thr Val Ala Ser Gln Val Pro Val Ile Leu Leu Ala Ala Lys 225 230 235 240

Phe Gly Gly Asp Ser Ala Gly Trp Phe Ala Leu Thr Leu Lys Ile Met 245 250 255

Gly Ala Pro Ile Ser Leu Leu Ala Ala Ser Val Leu Asp Val Phe Lys 260 265 270

Glu Gln Ala Ala Arg Asp Tyr Arg Glu Phe Gly Asn Cys Arg Gly Ile 275 280 285

Phe Leu Lys Thr Phe Arg Leu Leu Ala Val Leu Ala Leu Pro Pro Phe 290 295 300

Ile Ile Phe Gly Ser Leu Ala Ser Gly Pro Leu Gly 305 310 315

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 118 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: hisH

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Leu Gly Leu Arg Ser Glu Glu Gly Ala Glu Pro Gly Leu Gly Trp 1 5 10 15

Ile Asp Met Asp Ser Val Arg Phe Glu Arg Arg Asp Asp Arg Lys Val 20 25 30

Pro His Met Gly Trp Asn Gln Val Ser Pro Gln Leu Glu His Pro Ile 35 40 45

Leu Ser Gly Ile Asn Glu Gln Ser Arg Phe Tyr Phe Val His Ser Tyr 50 55 60

Tyr Met Val Pro Lys Asp Pro Asp Asp Ile Leu Leu Ser Cys Asn Tyr 65 70 75 80

Gly Gln Lys Phe Thr Ala Ala Val Ala Arg Asp Asn Val Phe Gly Phe 85 90 95

Gln Phe His Pro Glu Lys Ser His Lys Phe Gly Met Gln Leu Phe Lys 100 105 110

Asn Phe Val Glu Leu Val 115

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 251 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: hisF

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

Met Val Arg Arg Arg Val Ile Pro Cys Leu Leu Leu Lys Asp Arg Gly 1 5 10 15

Leu Val Lys Thr Val Lys Phe Lys Glu Pro Lys Tyr Val Gly Asp Pro 20 25 30

Ile Asn Ala Ile Arg Ile Phe Asn Glu Lys Glu Val Asp Glu Leu Ile 35 40 45

Leu Leu Asp Ile Asp Ala Ser Arg Leu Asn Gln Glu Pro Asn Tyr Glu 50 55 60

Leu Ile Ala Glu Val Ala Gly Glu Cys Phe Met Pro Ile Cys Tyr Gly 65 70 75 80

Gly Gly Ile Lys Thr Leu Glu His Ala Glu Lys Ile Phe Ser Leu Gly 85 90 95

Val Glu Lys Val Ser Ile Asn Thr Ala Ala Leu Met Asp Leu Ser Leu 100 105 110

Ile Arg Arg Ile Ala Asp Lys Phe Gly Ser Gln Ser Val Val Gly Ser 115 120 125

Ile Asp Cys Arg Lys Gly Phe Trp Gly Gly His Ser Val Phe Ser Glu 130 135 140

Asn Gly Thr Arg Asp Met Lys Arg Ser Pro Leu Glu Trp Ala Gln Ala 145 150 155 160

Leu Glu Glu Ala Gly Val Gly Glu Ile Phe Leu Asn Ser Ile Asp Arg 165 170 175

Asp Gly Val Gln Lys Gly Phe Asp Asn Ala Leu Val Glu Asn Ile Ala 180 185 190

Ser Asn Val His Val Pro Val Ile Ala Cys Gly Gly Ala Gly Ser Ile 195 200 205

Ala Asp Leu Ile Asp Leu Phe Glu Arg Thr Cys Val Ser Ala Val Ala 210 215 220

Ala Gly Ser Leu Phe Val Phe His Gly Lys His Arg Ala Val Leu Ile 225 230 235 240

Ser Tyr Pro Asp Val Asn Lys Leu Asp Val Gly 245 250

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 376 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbG

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Met Lys Ile Cys Ser Arg Cys Val Met Asp Thr Ser Asp Ala Glu Ile 1 5 10 15

Val Phe Asp Glu Ala Gly Val Cys Asn His Cys His Lys Phe Asp Asn 20 25 30

Val Gln Ser Arg Gln Leu Phe Ser Asp Ala Ser Gly Glu Gln Arg Leu 35 40 45

Gln Lys Ile Ile Gly Gln Ile Lys Lys Asp Gly Ser Gly Lys Asp Tyr 50 55 60

Asp Cys Ile Ile Gly Leu Ser Gly Gly Val Asp Ser Ser Tyr Leu Ala 65 70 75 80

Val Lys Val Lys Asp Leu Gly Leu Arg Pro Leu Val Val His Val Asp 85 90 95

Ala Gly Trp Asn Ser Glu Leu Ala Val Ser Asn Ile Glu Lys Ile Val 100 105 110

Lys Tyr Cys Gly Phe Asp Leu His Thr His Val Ile Asn Trp Glu Glu 115 120 125

Ile Arg Asp Leu Gln Leu Ala Tyr Met Lys Ala Ala Val Ala Asn Gln 130 135 140

Asp Val Pro Gln Asp His Ala Phe Phe Ala Ser Met Tyr His Phe Ala 145 150 155 160

Val Lys Asn Asn Ile Lys Tyr Ile Leu Ser Gly Gly Asn Leu Ala Thr 165 170 175

Glu Ala Val Phe Pro Asp Thr Trp His Gly Ser Ala Met Asp Ala Ile 180 185 190

Asn Leu Lys Ala Ile His Lys Lys Tyr Gly Glu Arg Pro Leu Arg Asp 195 200 205

Tyr Lys Thr Ile Ser Phe Leu Glu Tyr Tyr Phe Trp Tyr Pro Phe Val 210 215 220

Lys Gly Met Arg Thr Val Arg Pro Leu Asn Phe Met Ala Tyr Asp Lys 225 230 235 240

Ala Lys Ala Glu Thr Phe Leu Gln Glu Thr Ile Gly Tyr Arg Ser Tyr 245 250 255

Ala Arg Lys His Gly Glu Ser Ile Phe Thr Lys Leu Phe Gln Asn Tyr 260 265 270

Tyr Leu Pro Thr Lys Phe Gly Tyr Asp Lys Arg Lys Leu His Tyr Ser 275 280 285

Ser Met Ile Leu Ser Gly Gln Met Thr Arg Asp Glu Ala Gln Ala Lys 290 295 300

Leu Ala Glu Pro Leu Tyr Asp Ala Asp Glu Leu Gln Phe Asp Ile Glu 305 310 315 320

Tyr Phe Cys Lys Lys Met Arg Ile Thr Gln Ala Gln Phe Glu Glu Leu 325 330 335

Met Asn Ala Pro Val His Asp Tyr Ser Glu Phe Ala Asn Trp Asp Ser 340 345 350

Arg Gln Arg Ile Ala Lys Lys Val Gln Met Ile Val Gln Arg Ala Leu 355 360 365

Gly Arg Arg Ile Asn Val Tyr Ser 370 375

(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 373 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbH

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

Met Thr Lys Val Ala His Leu Thr Ser Val His Ser Arg Tyr Asp Ile 1 5 10 15

Arg Ile Phe Arg Lys Gln Cys Arg Thr Leu Ser Gln Tyr Gly Tyr Asp 20 25 30

Val Tyr Leu Val Val Ala Asp Gly Lys Gly Asp Glu Val Lys Asp Gly 35 40 45

Val Arg Ile Val Asp Val Gly Val Leu Ser Gly Arg Leu Asn Arg Ile 50 55 60

Leu Lys Thr Thr Arg Lys Ile Tyr Glu Gln Ala Leu Ala Leu Gly Ala 65 70 75 80

Asp Val Tyr His Phe His Asp Pro Glu Leu Ile Pro Val Gly Leu Arg 85 90 95

Leu Lys Lys Gln Gly Lys Gln Val Ile Phe Asp Ser His Glu Asp Val 100 105 110

Pro Lys Gln Leu Leu Ser Lys Pro Tyr Met Arg Pro Phe Leu Arg Arg 115 120 125

Val Val Ala Val Leu Phe Ser Cys Tyr Glu Lys Tyr Ala Cys Pro Lys 130 135 140

Leu Asp Ala Val Leu Thr Ala Thr Pro His Ile Arg Glu Lys Phe Lys 145 150 155 160

Asn Ile Asn Gly Asn Val Leu Asp Ile Asn Asn Phe Pro Met Leu Gly 165 170 175

Glu Leu Asp Ala Met Val Pro Trp Ala Ser Lys Lys Thr Glu Val Cys 180 185 190

Tyr Val Gly Gly Ile Thr Ser Ile Arg Gly Val Arg Glu Val Val Lys 195 200 205

Ser Leu Glu Cys Leu Lys Ser Ser Ala Arg Leu Asn Leu Val Gly Lys 210 215 220

Phe Ser Glu Pro Glu Ile Glu Lys Glu Val Arg Ala Leu Lys Gly Trp 225 230 235 240

Asn Ser Val Asn Glu His Gly Gln Leu Asp Arg Glu Asp Val Arg Arg 245 250 255

Val Leu Gly Asp Ser Val Ala Gly Leu Val Thr Phe Leu Pro Met Pro 260 265 270

Asn His Val Asp Ala Gln Pro Asn Lys Met Phe Glu Tyr Met Ser Ser 275 280 285

Gly Ile Pro Val Ile Ala Ser Asn Phe Pro Leu Trp Arg Glu Ile Val 290 295 300

Glu Gly Ser Asn Cys Gly Ile Cys Val Asp Pro Leu Ser Pro Ala Ala 305 310 315 320

Ile Ala Glu Ala Ile Asp Tyr Leu Val Ser Asn Pro Cys Glu Ala Ala 325 330 335

Ala Leu Gly Arg Asn Gly Gln Arg Ala Val Asn Glu Arg Tyr Asn Trp 340 345 350

Asp Leu Glu Gly Arg Lys Leu Ala Arg Phe Tyr Ser Asp Leu Leu Ser 355 360 365

Lys Arg Asp Ser Ile 370

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 362 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbI

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Met Lys Ile Leu Thr Ile Ile Gly Ala Arg Pro Gln Phe Ile Lys Ala 1 5 10 15

Ser Val Val Ser Lys Ala Ile Ile Glu Gln Gln Thr Leu Ser Glu Ile 20 25 30

Ile Val His Thr Gly Gln His Phe Asp Ala Asn Met Ser Glu Ile Phe 35 40 45

Phe Glu Gln Leu Gly Ile Pro Lys Pro Asp Tyr Gln Leu Asp Ile His 50 55 60

Gly Gly Thr His Gly Gln Met Thr Gly Arg Met Leu Met Glu Ile Glu 65 70 75 80

Asp Val Ile Leu Lys Glu Lys Pro His Arg Val Leu Val Tyr Gly Asp 85 90 95

Thr Asn Ser Thr Leu Ala Gly Ala Leu Ala Ala Ser Lys Leu His Val 100 105 110

Pro Ile Ala His Ile Glu Ala Gly Leu Arg Ser Phe Asn Met Arg Met 115 120 125

Pro Glu Glu Ile Asn Arg Ile Leu Thr Asp Gln Val Ser Asp Ile Leu 130 135 140

Phe Cys Pro Thr Arg Val Ala Ile Asp Asn Leu Lys Asn Glu Gly Phe 145 150 155 160

Glu Arg Lys Ala Ala Lys Ile Val Asn Val Gly Asp Val Met Gln Asp 165 170 175

Ser Ala Leu Phe Phe Ala Gln Arg Ala Thr Ser Pro Ile Gly Leu Ala 180 185 190

Ser Gln Asp Gly Phe Ile Leu Ala Thr Leu His Arg Ala Glu Asn Thr 195 200 205

Asp Asp Pro Val Arg Leu Thr Ser Ile Val Glu Ala Leu Asn Glu Ile 210 215 220

Gln Ile Asn Val Ala Pro Val Val Leu Pro Leu His Pro Arg Thr Arg 225 230 235 240

Gly Val Ile Glu Arg Leu Gly Leu Lys Leu Glu Val Gln Val Ile Asp 245 250 255

Pro Val Gly Tyr Leu Glu Met Ile Trp Leu Leu Gln Arg Ser Gly Leu 260 265 270

Val Leu Thr Asp Ser Gly Gly Val Gln Lys Glu Ala Phe Phe Phe Gly 275 280 285

Lys Pro Cys Val Thr Met Arg Asp Gln Thr Glu Trp Val Glu Leu Val 290 295 300

Thr Cys Gly Ala Asn Val Leu Val Gly Ala Ala Arg Asp Met Ile Val 305 310 315 320

Glu Ser Ala Arg Thr Ser Leu Gly Lys Thr Ile Gln Asp Asp Gly Gln 325 330 335

Leu Tyr Gly Gly Gly Gln Ala Ser Leu Gly Leu Leu Asn Ile Leu Pro 340 345 350

Ser Cys Asp Ala Leu Arg Val Glu Phe Lys 355 360

(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 413 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbJ

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Asn Val Trp Tyr Val His Pro Tyr Ala Gly Gly Pro Gly Val Gly 1 5 10 15

Arg Tyr Trp Arg Pro Tyr Tyr Phe Ser Lys Phe Trp Asn Gln Ala Gly 20 25 30

His Arg Ser Val Ile Ile Ser Ala Gly Tyr His His Leu Leu Glu Pro 35 40 45

Asp Glu Lys Arg Ser Gly Val Thr Cys Val Asn Gly Ala Glu Tyr Ala 50 55 60

Tyr Val Pro Thr Leu Arg Tyr Leu Gly Asn Gly Val Gly Arg Met Leu 65 70 75 80

Ser Met Leu Ile Phe Thr Met Met Leu Leu Pro Phe Cys Leu Ile Leu 85 90 95

Ala Leu Lys Arg Gly Thr Pro Asp Ala Ile Ile Tyr Ser Ser Pro His 100 105 110

Pro Phe Gly Val Val Ser Cys Trp Leu Ala Ala Arg Leu Leu Gly Ala 115 120 125

Lys Phe Val Phe Glu Val Arg Asp Ile Trp Pro Leu Ser Leu Val Glu 130 135 140

Leu Gly Gly Leu Lys Ala Asp Asn Pro Leu Val Arg Val Thr Gly Trp 145 150 155 160

Ile Glu Arg Phe Ser Tyr Ala Arg Ala Asp Lys Ile Ile Ser Leu Leu 165 170 175

Pro Cys Ala Glu Pro His Met Ala Asp Lys Gly Leu Pro Ala Gly Lys 180 185 190

Phe Leu Trp Val Pro Asn Gly Val Asp Ser Ser Asp Ile Ser Pro Asp 195 200 205

Ser Ala Val Ser Ser Ser Asp Leu Val Arg His Val Gln Val Leu Lys 210 215 220

Glu Gln Gly Val Phe Val Val Ile Tyr Ala Gly Ala His Gly Glu Pro 225 230 235 240

Asn Ala Leu Glu Gly Leu Val Arg Ser Ala Gly Leu Leu Arg Glu Arg 245 250 255

Gly Ala Ser Ile Arg Ile Ile Leu Val Gly Lys Gly Glu Cys Lys Glu 260 265 270

Gln Leu Lys Ala Ile Ala Ala Gln Asp Ala Ser Gly Leu Val Glu Phe 275 280 285

Phe Asp Gln Gln Pro Lys Glu Thr Ile Met Ala Val Leu Lys Leu Ala 290 295 300

Ser Ala Gly Tyr Ile Ser Leu Lys Ser Glu Pro Ile Phe Arg Phe Gly 305 310 315 320

Val Ser Pro Asn Lys Leu Trp Asp Tyr Met Leu Val Gly Leu Pro Val 325 330 335

Ile Phe Ala Cys Lys Ala Gly Asn Asp Pro Val Ser Asp Tyr Asp Cys 340 345 350

Gly Val Ser Ala Asp Pro Asp Ala Pro Glu Asp Ile Thr Ala Ala Ile 355 360 365

Phe Arg Leu Leu Leu Leu Ser Glu Asp Glu Arg Arg Thr Met Gly Gln 370 375 380

Arg Gly Arg Asp Ala Val Leu Glu His Tyr Thr Tyr Glu Ser Leu Ala 385 390 395 400

Leu Gln Val Leu Asn Ala Leu Ala Asp Gly Arg Ala Ala 405 410

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 320 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbK

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

Met Lys Ala Val Met Val Thr Gly Ala Ser Gly Phe Val Gly Ser Ala 1 5 10 15

Leu Cys Cys Glu Leu Ala Arg Thr Gly Tyr Ala Val Ile Ala Val Val 20 25 30

Arg Arg Val Val Glu Arg Ile Pro Ser Val Thr Tyr Ile Glu Ala Asp 35 40 45

Leu Thr Asp Pro Ala Thr Phe Ala Gly Glu Phe Pro Thr Val Asp Cys 50 55 60

Ile Ile His Leu Ala Gly Arg Ala His Ile Leu Thr Asp Lys Val Ala 65 70 75 80

Asp Pro Leu Ala Ala Phe Arg Glu Val Asn Arg Asp Ala Thr Val Arg 85 90 95

Leu Ala Thr Arg Ala Leu Glu Ala Gly Val Lys Arg Phe Val Phe Val 100 105 110

Ser Ser Ile Gly Val Asn Gly Asn Ser Thr Arg Gln Gln Ala Phe Asn 115 120 125

Glu Asp Ser Pro Ala Gly Pro His Ala Pro Tyr Ala Ile Ser Lys Tyr 130 135 140

Glu Ala Glu Gln Glu Leu Gly Thr Leu Leu Arg Gly Lys Gly Met Glu 145 150 155 160

Leu Val Val Val Arg Pro Pro Leu Ile Tyr Ala Asn Asp Ala Pro Gly 165 170 175

Asn Phe Gly Arg Leu Leu Lys Leu Val Ala Ser Gly Leu Pro Leu Pro 180 185 190

Leu Asp Gly Val Arg Asn Ala Arg Ser Leu Val Ser Arg Arg Asn Ile 195 200 205

Val Gly Phe Leu Ser Leu Cys Ala Glu His Pro Asp Ala Ala Gly Glu 210 215 220

Leu Phe Leu Val Ala Asp Gly Glu Asp Val Ser Ile Ala Gln Met Ile 225 230 235 240

Glu Ala Leu Ser Arg Gly Met Gly Arg Arg Pro Ala Leu Phe Thr Phe 245 250 255

Pro Ala Val Leu Leu Lys Leu Val Met Cys Leu Leu Gly Lys Ala Ser 260 265 270

Met His Glu Gln Leu Cys Gly Ser Leu Gln Val Asp Ala Ser Lys Ala 275 280 285

Arg Arg Leu Leu Gly Trp Val Pro Val Glu Thr Ile Gly Ala Gly Leu 290 295 300

Gln Ala Ala Gly Arg Glu Tyr Ile Leu Arg Gln Arg Glu Arg Arg Lys 305 310 315 320

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 665 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbM

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Met Leu Asp Asn Leu Arg Ile Lys Leu Leu Gly Leu Pro Arg Arg Tyr 1 5 10 15

Lys Arg Met Leu Gln Val Ala Ala Asp Val Thr Leu Val Trp Leu Ser 20 25 30

Leu Trp Leu Ala Phe Leu Val Arg Leu Gly Thr Glu Asp Met Ile Ser 35 40 45

Pro Phe Ser Gly His Ala Trp Leu Phe Ile Ala Ala Pro Leu Val Ala 50 55 60

Ile Pro Leu Phe Ile Arg Phe Gly Met Tyr Arg Ala Val Met Arg Tyr 65 70 75 80

Leu Gly Asn Asp Ala Leu Ile Ala Ile Ala Lys Ala Val Thr Ile Ser 85 90 95

Ala Leu Val Leu Ser Leu Leu Val Tyr Trp Tyr Arg Ser Pro Pro Ala 100 105 110

Val Val Pro Arg Ser Leu Val Phe Asn Tyr Trp Trp Leu Ser Met Leu 115 120 125

Leu Ile Gly Gly Leu Arg Leu Ala Met Arg Gln Tyr Phe Met Gly Asp 130 135 140

Trp Tyr Ser Ala Val Gln Ser Val Pro Phe Leu Asn Arg Gln Asp Gly 145 150 155 160

Leu Pro Arg Val Ala Ile Tyr Gly Ala Gly Ala Ala Ala Asn Gln Leu 165 170 175

Val Ala Ala Leu Arg Leu Gly Arg Ala Met Arg Pro Val Ala Phe Ile 180 185 190

Asp Asp Asp Lys Gln Ile Ala Asn Arg Val Ile Ala Gly Leu Arg Val 195 200 205

Tyr Thr Ala Lys His Ile Arg Gln Met Ile Asp Glu Thr Gly Ala Gln 210 215 220

Glu Val Leu Leu Ala Ile Pro Ser Ala Thr Arg Ala Arg Arg Arg Glu 225 230 235 240

Ile Leu Glu Ser Leu Glu Pro Phe Pro Leu His Val Arg Ser Met Pro 245 250 255

Gly Phe Met Asp Leu Thr Ser Gly Arg Val Lys Val Asp Asp Leu Gln 260 265 270

Glu Val Asp Ile Ala Asp Leu Leu Gly Arg Asp Ser Val Ala Pro Arg 275 280 285

Lys Glu Leu Leu Glu Arg Cys Ile Arg Gly Gln Val Val Met Val Thr 290 295 300

Gly Ala Gly Gly Ser Ile Gly Ser Glu Leu Cys Arg Gln Ile Met Ser 305 310 315 320

Cys Ser Pro Ser Val Leu Ile Leu Phe Glu His Ser Glu Tyr Asn Leu 325 330 335

Tyr Ser Ile His Gln Glu Leu Glu Arg Arg Ile Lys Arg Glu Ser Leu 340 345 350

Ser Val Asn Leu Leu Pro Ile Leu Gly Ser Val Arg Asn Pro Glu Arg 355 360 365

Leu Val Asp Val Met Arg Thr Trp Lys Val Asn Thr Val Tyr His Ala 370 375 380

Ala Ala Tyr Lys His Val Pro Ile Val Glu His Asn Ile Ala Glu Gly 385 390 395 400

Val Leu Asn Asn Val Ile Gly Thr Leu His Ala Val Gln Ala Ala Val 405 410 415

Gln Val Gly Val Gln Asn Phe Val Leu Ile Ser Thr Asp Lys Ala Val 420 425 430

Arg Pro Thr Asn Val Met Gly Ser Thr Lys Arg Leu Ala Glu Met Val 435 440 445

Leu Gln Ala Leu Ser Asn Glu Ser Ala Pro Leu Leu Phe Gly Asp Arg 450 455 460

Lys Asp Val His His Val Asn Lys Thr Arg Phe Thr Met Val Arg Phe 465 470 475 480

Gly Asn Val Leu Gly Ser Ser Gly Ser Val Ile Pro Leu Phe Arg Glu 485 490 495

Gln Ile Lys Arg Gly Gly Pro Val Thr Val Thr His Pro Ser Ile Thr 500 505 510

Arg Tyr Phe Met Thr Ile Pro Glu Ala Ala Gln Leu Val Ile Gln Ala 515 520 525

Gly Ser Met Gly Gln Gly Gly Asp Val Phe Val Leu Asp Met Gly Pro 530 535 540

Pro Val Lys Ile Leu Glu Leu Ala Glu Lys Met Ile His Leu Ser Gly 545 550 555 560

Leu Ser Val Arg Ser Glu Arg Ser Pro His Gly Asp Ile Ala Ile Glu 565 570 575

Phe Ser Gly Leu Arg Pro Gly Glu Lys Leu Tyr Glu Glu Leu Leu Ile 580 585 590

Gly Asp Asn Val Asn Pro Thr Asp His Pro Met Ile Met Arg Ala Asn 595 600 605

Glu Glu His Leu Ser Trp Glu Ala Phe Lys Val Val Leu Glu Gln Leu 610 615 620

Leu Ala Ala Val Glu Lys Asp Asp Tyr Ser Arg Val Arg Gln Leu Leu 625 630 635 640

Arg Glu Thr Val Ser Gly Tyr Ala Pro Asp Gly Glu Ile Val Asp Trp 645 650 655

Ile Tyr Arg Gln Arg Arg Arg Glu Pro 660 665

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 463 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbN

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

Met Ile Asn Ser His Leu Leu Tyr Arg Leu Ser Tyr Arg Gly Thr Ala 1 5 10 15

Arg Arg Met Leu Leu Ile Lys Lys Gly Lys Pro Leu Pro Met Thr Ser 20 25 30

Pro Phe Ser Leu Gln Asp Leu Asp Asp Gly Leu Gly Asp Gly Leu Gln 35 40 45

Val Arg Phe Val Gln Arg Gly Asp Ala Asp Thr Ala Gly Ala Asp Gly 50 55 60

Val Asp Thr Glu Leu Gly Leu Gln Ala Leu Asp Leu Val Gly Gly Gln 65 70 75 80

Ala Gly Ile Gly Glu His Ala Thr Leu Ala Thr Asp Glu Thr Glu Val 85 90 95

Ala Leu Gly Ala Val Gly Cys Gln Leu Leu Asp His Arg Gln Ala His 100 105 110

Val Ala Asp Ala Val Ala His Leu Ala Gln Phe Leu Leu Pro Glu Gly 115 120 125

Pro Gln Phe Arg Ala Val Glu His Gly Gly Asp Asp Ala Gly Ala Val 130 135 140

Gly Arg Trp Val Arg Ile Val Gly Ala Asp His Pro Leu His Leu Gly 145 150 155 160

Gln His Ala Gly Arg Phe Ile Ala Ala Phe Gly His Asp Arg Glu Gly 165 170 175

Ala Asp Ala Phe Ala Ile Glu Arg Glu Gly Phe Gly Glu Arg Ala Gly 180 185 190

Asn Glu Glu Ala Gln Ala Arg Leu Gly Glu Gln Ala His Arg Gly Gly 195 200 205

Val Phe Leu Asp Ala Val Ala Glu Ala Leu Val Gly Asp Val Glu Glu 210 215 220

Arg His Val Ala Leu Gly Leu Glu His Val Gln His Leu Phe Pro Val 225 230 235 240

Val Gln Leu Glu Ile Asp Ala Gly Arg Ile Met Ala Ala Gly Val Gln 245 250 255

Asn His Asp Arg Ala Gly Arg Gln Gly Ile Gln Val Phe Gln Gln Ala 260 265 270

Gly Ala Val His Ala Ile Ala Gly Gly Val Val Ile Ala Val Val Leu 275 280 285

His Arg Glu Ala Gly Gly Phe Glu Gln Cys Ala Val Val Phe Pro Ala 290 295 300

Arg Val Ala Asp Gly His Gly Gly Val Gly Gln Gln Ala Leu Glu Glu 305 310 315 320

Val Gly Ala Glu Leu Glu Arg Ala Gly Ala Ala Asp Gly Leu Gly Arg 325 330 335

Asp His Thr Ala Gly Gly Gln Gln Leu Gly Leu Val Thr Glu Gln Gln 340 345 350

Phe Leu Tyr Ala Leu Val Val Gly Gly Asp Pro Phe Asp Arg Gln Val 355 360 365

Ala Ala Arg Arg Val Gly Leu Asp Ala Gly Leu Leu Gly Ser Leu His 370 375 380

Gly Thr Gln Gln Arg Asn Ala Pro Leu Leu Val Val Val His Ala His 385 390 395 400

Ala Gln Val Asp Leu Ala Arg Thr Gly Ile Gly Val Glu Gly Phe Val 405 410 415

Gln Ala Lys Asp Gly Ile Thr Arg Cys His Phe Asp Gly Arg Lys Gln 420 425 430

Thr His Phe Ala Ala Ala Arg Ser Val Lys Arg Gly Gly Gln Arg Asn 435 440 445

Pro Leu Cys Gly Gly Ala Lys Gly Cys Ala Asn Gly Gly Leu Leu 450 455 460

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 238 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(vii) IMMEDIATE SOURCE: (B) CLONE: uvrB

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met His Ala Ala Thr Phe Arg Cys Met Leu Ser Ala Ile Ser Asp Ala 1 5 10 15

Gly Phe Ser Leu Ala Ser Gln Leu Pro Ala Arg Phe Phe Met Asp Thr 20 25 30

Phe Gln Leu Asp Ser Arg Phe Lys Pro Ala Gly Asp Gln Pro Glu Ala 35 40 45

Ile Arg Gln Met Val Glu Gly Leu Glu Ala Gly Leu Ser His Gln Thr 50 55 60

Leu Leu Gly Val Thr Gly Ser Gly Lys Thr Phe Ser Ile Ala Asn Val 65 70 75 80

Ile Ala Gln Val Gln Arg Pro Thr Leu Val Leu Ala Pro Asn Lys Thr 85 90 95

Leu Ala Ala Gln Leu Tyr Gly Glu Phe Lys Thr Phe Phe Pro His Asn 100 105 110

Ser Val Glu Tyr Phe Val Ser Tyr Tyr Asp Tyr Tyr Gln Pro Glu Ala 115 120 125

Tyr Val Pro Ser Ser Asp Thr Tyr Ile Glu Lys Asp Ser Ser Ile Asn 130 135 140

Asp His Ile Glu Gln Met Arg Leu Ser Ala Thr Lys Ala Leu Leu Glu 145 150 155 160

Arg Pro Asp Ala Ile Ile Val Ala Thr Val Ser Ser Ile Tyr Gly Leu 165 170 175

Gly Asp Pro Ala Ser Tyr Leu Lys Met Val Leu His Leu Asp Arg Gly 180 185 190

Asp Arg Ile Asp Gln Arg Glu Leu Leu Arg Arg Leu Thr Ser Leu Gln 195 200 205

Tyr Thr Arg Asn Asp Met Asp Phe Ala Arg Ala Thr Phe Arg Val Arg 210 215 220

Gly Asp Val Ile Asp Ile Phe Pro Ala Glu Ser Asp Leu Glu 225 230 235

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 303 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Pseudomonas aeruginosa

(B) STRAIN: PAO1

(vii) IMMEDIATE SOURCE: (B) CLONE: psbL

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

Met Met Ile Trp Met Ile Ala Cys Leu Val Val Leu Leu Phe Ser Phe 1 5 10 15

Val Ala Thr Trp Gly Leu Arg Arg Tyr Ala Leu Ala Thr Lys Leu Met 20 25 30

Asp Val Pro Asn Ala Arg Ser Ser His Ser Gln Pro Thr Pro Arg Gly 35 40 45

Gly Gly Val Ala Ile Val Leu Val Phe Leu Ala Ala Leu Val Trp Met 50 55 60

Leu Ser Ala Gly Ser Ile Ser Gly Gly Trp Gly Gly Ala Met Leu Gly 65 70 75 80

Ala Gly Ser Gly Val Ala Leu Leu Gly Phe Leu Asp Asp His Gly His 85 90 95

Ile Ala Ala Arg Trp Arg Leu Leu Gly His Phe Ser Ala Ala Ile Trp 100 105 110

Ile Leu Leu Trp Thr Gly Gly Phe Pro Pro Leu Asp Val Val Gly His 115 120 125

Ala Val Asp Leu Gly Trp Leu Gly His Val Leu Ala Val Phe Tyr Leu 130 135 140

Val Trp Val Leu Asn Leu Tyr Asn Phe Met Asp Gly Ile Asp Gly Ile 145 150 155 160

Ala Ser Val Glu Ala Ile Gly Val Cys Val Gly Gly Ala Leu Ile Tyr 165 170 175

Trp Leu Thr Gly His Val Ala Met Val Gly Ile Pro Leu Leu Leu Ala 180 185 190

Cys Ala Val Ala Gly Phe Leu Ile Trp Asn Phe Pro Pro Ala Arg Ile 195 200 205

Phe Met Gly Asp Ala Gly Ser Gly Phe Leu Gly Met Val Ile Gly Ala 210 215 220

Leu Ala Ile Gln Ala Ala Trp Thr Ala Pro Ser Leu Phe Trp Cys Trp 225 230 235 240

Leu Ile Leu Leu Gly Val Phe Ile Val Asp Ala Thr Tyr Thr Leu Ile 245 250 255

Arg Arg Ile Ala Arg Gly Glu Lys Phe Tyr Glu Ala His Arg Ser His 260 265 270

Ala Tyr Gln Phe Ala Ser Arg Arg Tyr Ala Ser His Leu Arg Val Thr 275 280 285

Leu Gly Val Leu Ala Ile Asn Thr Leu Trp Leu Leu Arg Trp His 290 295 300
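
The amino acid sequences above are listed as three-letter residue codes interleaved with position numbers. As an illustrative aid only, and not part of the disclosed invention, the following Python sketch shows one way a reader might convert such a listing to one-letter codes and compare the residue count with the declared LENGTH field. The function names `to_one_letter` and `matches_declared_length` are hypothetical, and the alias table assumes that a scanned copy of the listing may render Gln as "Gin" and Ile as "He".

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Assumption: common scanning misreads of Gln and Ile in copies of such listings.
OCR_ALIASES = {"Gin": "Gln", "He": "Ile"}


def to_one_letter(listing_text):
    """Convert listing rows to a one-letter string.

    Position numbers and stray punctuation are ignored because only
    alphabetic tokens are extracted.
    """
    residues = []
    for token in re.findall(r"[A-Za-z]+", listing_text):
        token = OCR_ALIASES.get(token, token)
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue code: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


def matches_declared_length(listing_text, declared_length):
    """Return True if the residue count equals the declared LENGTH value."""
    return len(to_one_letter(listing_text)) == declared_length


if __name__ == "__main__":
    # First row of SEQ ID NO:6 as it appears in the listing.
    first_row = ("Met Ser Tyr Tyr Gln His Pro Ser Ala Ile Val Asp Asp "
                 "Gly Ala Gln 1 5 10 15")
    print(to_one_letter(first_row))  # MSYYQHPSAIVDDGAQ
```

Passing the full text of one SEQUENCE DESCRIPTION block together with its declared length to `matches_declared_length` gives a quick consistency check on a transcribed copy.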

Claims

WE CLAIM:

1. An isolated P. aeruginosa B-band gene cluster containing the following genes: wzz, wbpA, wbpB, wbpC, wbpD, wbpE, wzy, wbpF, wbpG, wbpH, wbpI, wbpJ, wbpK, wbpL, wbpM and wbpN involved in the synthesis and assembly of lipopolysaccharide in P. aeruginosa.

2. An isolated P. aeruginosa B-band gene cluster as claimed in claim 1 wherein the genes are organized as shown in Figure 1 (SEQ. ID. NO: 1).

3. An isolated nucleic acid molecule encoding:

(1) (a) Wzz; (b) WbpA; (c) WbpB; (d) WbpC; (e) WbpD; (f) WbpE; (g) Wzy; (h) WbpF; (i) WbpG; (j) WbpI; (k) WbpJ; (l) WbpK; (m) WbpM; (n) WbpH; and (o) WbpN involved in P. aeruginosa O-antigen synthesis and assembly;

(2) UvrB involved in ultraviolet repair,

(3) HisH or HisF involved in histidine synthesis,

(4) RpsA, a 30S ribosomal subunit protein S1.

4. A nucleic acid molecule comprising nucleic acid sequences encoding two or more of the following proteins: (1) (a) Wzz; (b) WbpA; (c) WbpB; (d) WbpC; (e) WbpD; (f) WbpE; (g) Wzy; (h) WbpF; (i) HisH; (j) HisF; (k) WbpG; (l) WbpI; (m) WbpJ; (n) WbpK; (o) WbpM; (p) WbpN; (q) WbpH; (r) WbpL; and (s) RpsA.

5. A recombinant molecule adapted for transformation of a host cell comprising a nucleic acid molecule as claimed in claim 3 and an expression control sequence operatively linked to the DNA segment.

6. A transformant host cell including a recombinant molecule as claimed in claim 5.

7. An isolated protein characterized in that it has part or all of the primary structural conformation of a protein encoded by a gene of the psb gene cluster as claimed in claim 1.

8. A purified protein having the amino acid sequence as shown in Figure 3 or SEQ.ID. NO: 2; Figure 4 or SEQ.ID. NO: 3; Figure 5 or SEQ.ID. NO: 4; Figure 6 or SEQ.ID. NO: 5; Figure 7 or SEQ.ID. NO: 6; Figure 8 or SEQ.ID. NO: 7; Figure 9 or SEQ.ID. NO: 8; Figure 10 or SEQ.ID. NO: 9; Figure 11 or SEQ.ID. NO: 10; Figure 12 or SEQ.ID. NO: 11; Figure 13 or SEQ.ID. NO: 12; Figure 14 or SEQ.ID. NO: 13; Figure 15 or SEQ.ID. NO: 14; Figure 16 or SEQ.ID. NO: 15; Figure 17 or SEQ.ID. NO: 16; Figure 18 or SEQ.ID. NO: 17; Figure 19 or SEQ.ID. NO: 18; or Figure 20 or SEQ.ID. NO: 19.

9. A monoclonal or polyclonal antibody specific for an epitope of a purified protein as claimed in claim 8.

10. A method for detecting P. aeruginosa in a sample comprising contacting the sample with a monoclonal or polyclonal antibody as claimed in claim 9 which is capable of being detected after it becomes bound to protein in the sample.

11. A method for detecting the presence of a nucleic acid molecule as claimed in claim 3 in a sample, comprising contacting the sample with a nucleotide probe capable of hybridizing with the nucleic acid molecule, to form a hybridization product, under conditions which permit the formation of the hybridization product, and assaying for the hybridization product.

12. A method for detecting the presence of a nucleic acid molecule as claimed in claim 3, or a predetermined oligonucleotide fragment thereof in a sample, comprising treating the sample with primers which are capable of amplifying the nucleic acid molecule or the predetermined oligonucleotide fragment thereof in a polymerase chain reaction to form amplified sequences under conditions which permit the formation of amplified sequences, and assaying for amplified sequences.

13. A kit for detecting P. aeruginosa by assaying for a protein involved in O-antigen synthesis or assembly in a sample comprising a monoclonal or polyclonal antibody as claimed in claim 9, reagents required for binding of the antibody to protein in the sample, and directions for its use.

14. A kit for detecting the presence of a nucleic acid molecule as claimed in claim 3 in a sample comprising a nucleotide probe capable of hybridizing with the nucleic acid molecule, reagents required for hybridization of the nucleotide probe with the nucleic acid molecule, and directions for its use.