WO2000027994A2 - Chlamydia pneumoniae genome sequence - Google Patents

Info

Publication number
WO2000027994A2
Authority
WO
WIPO (PCT)
Prior art keywords
protein
hypothetical
nucleic acid
Prior art date
Application number
PCT/US1999/026923
Other languages
French (fr)
Other versions
WO2000027994A3 (en
Inventor
Richard Stephens
Wayne Mitchell
Sue Kalman
Ronald Davis
Original Assignee
The Regents Of The University Of California
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Regents Of The University Of California filed Critical The Regents Of The University Of California
Priority to EP99960323A priority Critical patent/EP1133572A4/en
Priority to JP2000581161A priority patent/JP2002529069A/en
Priority to CA002350775A priority patent/CA2350775A1/en
Priority to AU17223/00A priority patent/AU1722300A/en
Publication of WO2000027994A2 publication Critical patent/WO2000027994A2/en
Publication of WO2000027994A3 publication Critical patent/WO2000027994A3/en

Classifications

    • CCHEMISTRY; METALLURGY
    • C07ORGANIC CHEMISTRY
    • C07KPEPTIDES
    • C07K14/00Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/295Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Chlamydiales (O)
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies
    • A61K2039/505Medicinal preparations containing antigens or antibodies comprising antibodies
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00Medicinal preparations containing peptides
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61KPREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00Medicinal preparations containing antigens or antibodies

Definitions

  • This invention relates to nucleic acids and polypeptides from Chlamydia pneumoniae and to their use in the diagnosis, prevention and treatment of diseases associated with C. pneumoniae.
  • Chlamydiaceae is a family of obligate intracellular parasites with a tropism for epithelial cells lining the mucous membranes.
  • The bacteria have two morphologically distinct forms, the "elementary body" and the "reticulate body".
  • the elementary body is the infectious form, and has a rigid cell wall, primarily of cross-linked outer membrane proteins.
  • The reticulate body is the intracellular, metabolically active form. A unique developmental cycle between these two forms characterizes Chlamydia growth.
  • C. pneumoniae is a human respiratory pathogen that causes acute respiratory disease, and approximately 10% of community-acquired pneumonia. Antibody prevalence studies have shown that virtually everyone is infected with C. pneumoniae at some time, and that reinfection is common.
  • C. pneumoniae is related to other Chlamydia species, but the level of sequence similarity is relatively low. Very little is known about the biology of this organism, although it appears to be an important human pathogen. Allelic diversity and structural relationships between specific genes of Chlamydial species is described in Kaltenboeck et al. (1993) J Bacteriol 175(2):487-502; Gaydos et al.
  • This invention provides the genomic sequence of Chlamydia pneumoniae.
  • the sequence information is useful for a variety of diagnostic and analytical methods.
  • the genomic sequence may be embodied in a variety of media, including computer readable forms, or as a nucleic acid comprising a selected fragment of the sequence.
  • Such fragments generally consist of an open reading frame, transcriptional or translational control elements, or fragments derived therefrom. Proteins encoded by the open reading frames are useful for diagnostic purposes, as well as for their enzymatic or structural activity.
  • amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
  • Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
  • Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
  • Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
  • Amino acids may be referred to herein by either their commonly known three-letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
  • “Amplification” primers are oligonucleotides comprising either natural or analogue nucleotides that can serve as the basis for the amplification of a select nucleic acid sequence. They include, e.g., polymerase chain reaction primers and ligase chain reaction oligonucleotides.
  • Antibody refers to an immunoglobulin molecule able to bind to a specific epitope on an antigen.
  • Antibodies can be a polyclonal mixture or monoclonal. Antibodies can be intact immunoglobulins derived from natural sources or from recombinant sources and can be immunoreactive portions of intact immunoglobulins. Antibodies may exist in a variety of forms including, for example, Fv, Fab, and F(ab)2, as well as in single chains. Single-chain antibodies, in which genes for a heavy chain and a light chain are combined into a single coding sequence, may also be used.
  • an "antigen” is a molecule that is recognized and bound by an antibody, e.g., peptides, carbohydrates, organic molecules, or more complex molecules such as glycolipids and glycoproteins.
  • the part of the antigen that is the target of antibody binding is an antigenic determinant and a small functional group that corresponds to a single antigenic determinant is called a hapten.
  • "Biological sample" refers to any sample obtained from a living or dead organism.
  • biological samples include biological fluids and tissue specimens. Such biological samples can be prepared for analysis of the presence of C. pneumoniae nucleic acids, proteins, or antibodies specifically reactive with the proteins.
  • "C. pneumoniae gene" shall be intended to mean the open reading frame encoding specific C. pneumoniae polypeptides, as well as adjacent 5' and 3' non-coding nucleotide sequences involved in the regulation of expression, up to about 2 kb beyond the coding region, but possibly further in either direction.
  • the gene may be introduced into an appropriate vector for extrachromosomal maintenance or for integration into a host genome.
  • "Conservatively modified variants" applies to both amino acid and nucleic acid sequences.
  • conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences.
  • degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res . 19:5081 (1991); Ohtsuka et al., J. Biol. Chem.
  • nucleic acid variations are "silent variations," which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid.
  • each codon in a nucleic acid can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
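The idea of silent variation can be illustrated with a short sketch. It assumes the standard genetic code and DNA-alphabet input; the function names (`translate`, `synonymous_codons`, `count_synonymous_sequences`) are hypothetical and are not taken from the patent. The count includes the original sequence itself.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for codon positions 1, 2 and 3.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(seq):
    """Split a coding sequence into complete codons (a trailing partial codon is ignored)."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def translate(seq):
    """Translate a DNA coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[c] for c in codons(seq))


def synonymous_codons(codon):
    """Return every codon (including the input) that encodes the same residue."""
    aa = CODON_TABLE[codon]
    return sorted(c for c, a in CODON_TABLE.items() if a == aa)


def count_synonymous_sequences(seq):
    """Count the nucleic acid sequences that encode exactly the same polypeptide."""
    total = 1
    for codon in codons(seq):
        total *= len(synonymous_codons(codon))
    return total


if __name__ == "__main__":
    print(translate("ATGCTT"))                   # Met-Leu
    print(synonymous_codons("CTT"))              # the six leucine codons
    print(count_synonymous_sequences("ATGCTT"))  # one Met codon times six Leu codons
```

Methionine (ATG) and tryptophan (TGG) each have a single codon in the standard code, so a sequence made only of those residues has no silent variant other than itself.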
  • amino acid sequences one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a "conservatively modified variant" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.
  • nucleic acids or polypeptide sequences refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection.
  • This definition also refers to the complement of a test sequence, which has a designated percent sequence or subsequence complementarity when the test sequence has a designated or substantial identity to a reference sequence.
  • a designated amino acid percent identity of 95% refers to sequences or subsequences that have at least about 95% amino acid identity when aligned for maximum correspondence over a comparison window as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences would then be said to have substantial identity, or to be substantially identical to each other.
  • sequences have at least about 70% identity, more preferably 80% identity, more preferably 90-95% identity and above.
  • the percent identity exists over a region of the sequence that is at least about 25 amino acids in length, more preferably over a region that is 50-100 amino acids in length.
  • sequence identity When percentage of sequence identity is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acids residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity.
  • a conservative substitution is given a score between zero and 1.
  • the scoring of conservative substitutions is calculated according to, e.g., the algorithm of Meyers & Miller, Computer Applic. Biol. Sci. 4:11-17 (1988) e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, California, USA).
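The adjustment described above, scoring a conservative substitution as a partial rather than a full mismatch, can be sketched as follows. This is not the Meyers & Miller algorithm or the PC/GENE implementation; the similarity groups, the default partial score of 0.5, and the function names are illustrative assumptions, and the two sequences are assumed to be already aligned to the same length.

```python
# Hypothetical groups of chemically similar residues, for illustration only;
# real tools rely on published substitution tables.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def is_conservative(a, b):
    """True when two different residues fall in the same similarity group."""
    return a != b and any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity(reference, test, conservative_score=0.0):
    """Percent identity of two pre-aligned sequences of equal length.

    Identical positions score 1, conservative substitutions score
    `conservative_score` (between zero and 1), everything else, including
    gap characters ('-'), scores 0.
    """
    if len(reference) != len(test) or not reference:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    if not 0.0 <= conservative_score <= 1.0:
        raise ValueError("conservative_score must lie between zero and 1")
    score = 0.0
    for a, b in zip(reference, test):
        if a == "-" or b == "-":
            continue
        if a == b:
            score += 1.0
        elif is_conservative(a, b):
            score += conservative_score
    return 100.0 * score / len(reference)


if __name__ == "__main__":
    ref, tst = "MKVLD", "MRVLE"
    print(percent_identity(ref, tst))        # strict identity
    print(percent_identity(ref, tst, 0.5))   # adjusted upwards for K/R and D/E
```

With a partial score of zero the function reduces to plain percent identity, which makes the size of the upward adjustment easy to inspect.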
  • sequence comparison typically one sequence acts as a reference sequence, to which test sequences are compared.
  • test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated.
  • the sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated or default program parameters.
  • a comparison window includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 25 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned.
  • Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc.
  • PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendogram showing the clustering relationships used to create the alignment.
  • PILEUP uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol. Evol. 35:351-360 (1987). The method used is similar to the method described by Higgins & Sharp, CABIOS 5:151-153 (1989).
  • the program can align up to 300 sequences, each of a maximum length of 5,000 nucleotides or amino acids.
  • the multiple alignment procedure begins with the pairwise alignment of the two most similar sequences, producing a cluster of two aligned sequences.
  • This cluster is then aligned to the next most related sequence or cluster of aligned sequences.
  • Two clusters of sequences are aligned by a simple extension of the pairwise alignment of two individual sequences.
  • the final alignment is achieved by a series of progressive, pairwise alignments.
  • the program is run by designating specific sequences and their amino acid or nucleotide coordinates for regions of sequence comparison and by designating the program parameters.
  • PILEUP a reference sequence is compared to other test sequences to determine the percent sequence identity relationship using the following parameters: default gap weight (3.00), default gap length weight (0.10), and weighted end gaps.
  • PILEUP can be obtained from the GCG sequence analysis software package, e.g., version 7.0 (Devereaux et al., Nuc. Acids Res. 12:387-395 (1984)).
  • BLAST algorithm Another example of algorithm that is suitable for determining percent sequence identity (i.e., substantial similarity or identity) is the BLAST algorithm, which is described in Altschul et al, J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al, supra).
  • initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them.
  • the word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
  • the BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment.
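The seed-and-extend idea can be sketched for nucleotide sequences as below. This is a simplified illustration, not the NCBI BLAST implementation: seeds are exact word matches only (the neighborhood threshold T is not modeled), extension is ungapped, and the function names and default values for W, M, N and X are assumptions chosen for the example.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) for every exact word match of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def extend(query, subject, qi, sj, step, start_score, m, n, x):
    """Ungapped extension in one direction (step = +1 or -1).

    Stops when the score drops by x from its maximum, falls to zero or
    below, or either sequence ends. Returns (best_score, best_length).
    """
    score = best = start_score
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x or score <= 0:
            break
        qi += step
        sj += step
    return best, best_len


def best_hsp(query, subject, w=4, m=1, n=-2, x=3):
    """Return the top-scoring pair as (score, q_start, q_end, s_start); q_end is exclusive."""
    best = None
    for i, j in find_seeds(query, subject, w):
        score, right_len = extend(query, subject, i + w, j + w, 1, w * m, m, n, x)
        score, left_len = extend(query, subject, i - 1, j - 1, -1, score, m, n, x)
        hsp = (score, i - left_len, i + w + right_len, j - left_len)
        if best is None or hsp[0] > best[0]:
            best = hsp
    return best


if __name__ == "__main__":
    print(best_hsp("GGACGTACGTTT", "CCACGTACGTAA"))
```

A smaller W finds more seeds and a larger X extends through more mismatches, which raises sensitivity at the cost of speed, in line with the trade-off noted above.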
  • the BLASTP program uses as default parameters a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1989)).
  • the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l.
  • nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
  • An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross-reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below.
  • a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions.
  • Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below.
  • stringent conditions are sequence dependent and will be different in different circumstances.
  • stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
  • Typical stringent conditions for a Southern blot protocol involve hybridizing in a buffer comprising 5x SSC, 1% SDS at 65°C or hybridizing in a buffer containing 5x SSC and 1% SDS at 42°C and washing at 65°C with a 0.2x SSC, 0.1% SDS wash.
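The rule of thumb that stringent conditions sit about 5°C below Tm can be turned into a small calculation. The sketch below assumes the Wallace rule (2°C per A/T and 4°C per G/C) as a rough Tm estimate for short oligonucleotide probes; that estimate is not part of the text above and ignores ionic strength and pH, so it stands in for a measured or properly modeled Tm. The function names are hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm estimate in degrees C for a short oligonucleotide (Wallace rule).

    Assumption for illustration only: ignores ionic strength, pH and
    probe concentration.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("oligo must be a non-empty string over A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(tm_celsius, offset=5.0):
    """Temperature about `offset` degrees C below Tm, as described above."""
    return tm_celsius - offset


if __name__ == "__main__":
    probe = "ACGTACGTACGTACGT"
    tm = wallace_tm(probe)
    print(tm, stringent_temperature(tm))
```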
  • label is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means.
  • useful labels include ³²P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available.
  • nucleic acid refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form.
  • nucleic acids containing known nucleotide analogs or modified backbone residues or linkages which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides.
  • analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs).
  • nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide, and polynucleotide.
  • nucleic acid probe or oligonucleotide is defined as a nucleic acid capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation.
  • a probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.).
  • the bases in a probe may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization.
  • probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages.
  • probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions.
  • the probes are preferably directly labeled as with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin complex may later bind. By assaying for the presence or absence of the probe, one can detect the presence or absence of the select sequence or subsequence.
  • a labeled nucleic acid probe or oligonucleotide is one that is bound, either covalently, through a linker, or through ionic, van der Waals or hydrogen bonds to a label such that the presence of the probe may be detected by detecting the presence of the label bound to the probe.
  • “Pharmaceutically acceptable” means a material that is not biologically or otherwise undesirable, i.e., the material can be administered to an individual along with a Chlamydia antigen without causing any undesirable biological effects or interacting in a deleterious manner with any of the other components of the pharmaceutical composition.
  • The terms "polypeptide," "peptide," and "protein" are used interchangeably herein to refer to a polymer of amino acid residues.
  • the terms apply to amino acid polymers in which one or more amino acid residue is an analog or mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
  • the phrase "specifically or selectively hybridizing to,” refers to hybridization between a probe and a target sequence in which the probe binds substantially only to the target sequence, forming a hybridization complex, when the target is in a heterogeneous mixture of polynucleotides and other compounds. Such hybridization is determinative of the presence of the target sequence.
  • the probe may bind other unrelated sequences, at least 90%, preferably 95% or more of the hybridization complexes formed are with the target sequence.
  • recombinant when used with reference to a cell, or nucleic acid, or vector, indicates that the cell, or nucleic acid, or vector, has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid, or that the cell is derived from a cell so modified.
  • recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, under expressed or not expressed at all.
  • the specified antibodies bind to a particular protein and do not bind in a significant amount to other proteins present in the sample.
  • Specific binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein.
  • a variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein and are described in detail below.
  • substantially pure or “isolated” when referring to a Chlamydia peptide or protein means a chemical composition which is free of other subcellular components of the Chlamydia organism.
  • a monomeric protein is substantially pure when at least about 85% or more of a sample exhibits a single polypeptide backbone. Minor variants or chemical modifications may typically share the same polypeptide sequence. Depending on the purification procedure, purities of 85%, and preferably over 95% pure are possible.
  • Protein purity or homogeneity may be indicated by a number of means well known in the art, such as polyacrylamide gel electrophoresis of a protein sample, followed by visualizing a single polypeptide band on a polyacrylamide gel upon silver staining. For certain purposes high resolution will be needed and HPLC or a similar means for purification utilized.
  • the present invention provides the nucleotide sequence of the C. pneumoniae genome SEQ ID NO: 1 or a representative fragment thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan.
  • a "representative fragment" of the nucleotide sequence depicted in SEQ ID NO: 1 refers to any portion which is not presently represented within a publicly available database.
  • Preferred representative fragments of the present invention are open reading frames, expression modulating fragments, uptake modulating fragments, and fragments which can be used to diagnose the presence of C. pneumoniae in a sample.
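Open reading frames can be located in a genomic sequence with a six-frame scan. The sketch below is a minimal illustration, not the annotation method used for the genome: it assumes ATG as the only start codon (bacteria also use alternatives such as GTG and TTG), treats the sequence as linear rather than circular, and reports coordinates on the strand being scanned. All names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf) for ATG...stop ORFs in all six frames.

    `start` and `end` (exclusive) index the scanned strand; for strand '-'
    that is the reverse complement of the input. `min_codons` counts the
    stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTTAGGG"):
        print(orf)
```

Raising `min_codons` filters out the short spurious ORFs that appear by chance in any long sequence.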
  • nucleic acid hybridization-based assays for the detection of C. pneumoniae. Any of a number of well known techniques for the specific detection of target nucleic acids can be used.
  • Exemplary hybridization-based assays include, but are not limited to, traditional "direct probe" methods such as Southern blots, dot blots, in situ hybridization (e.g., FISH), PCR, and the like.
  • the methods can be used in a wide variety of formats including, but not limited to, substrate- (e.g., membrane or glass) bound methods or array-based approaches as described below.
  • this invention also embraces methods for detecting the presence of Chlamydia DNA or RNA in biological samples. These sequences can be used to detect Chlamydia in biological samples from patients suspected of being infected.
  • a variety of methods of specific DNA and RNA measurement using nucleic acid hybridization techniques are known to those of skill in the art (see Sambrook et al., supra). In situ hybridization assays are well known (e.g., Angerer (1987) Meth
  • in situ hybridization comprises the following major steps: (1) fixation of tissue or biological structure to be analyzed; (2) prehybridization treatment of the biological structure to increase accessibility of target DNA, and to reduce nonspecific binding; (3) hybridization of the mixture of nucleic acids to the nucleic acid in the biological structure or tissue; (4) post-hybridization washes to remove nucleic acid fragments not bound in the hybridization; and (5) detection of the hybridized nucleic acid fragments.
  • the reagent used in each of these steps and the conditions for use vary depending on the particular application.
  • cells are fixed to a solid support, typically a glass slide. If a nucleic acid is to be probed, the cells are typically denatured with heat or alkali. The cells are then contacted with a hybridization solution at a moderate temperature to permit annealing of labeled probes specific to the nucleic acid sequence encoding the protein.
  • the targets (e.g., cells) are then typically washed at a predetermined stringency or at an increasing stringency until an appropriate signal to noise ratio is obtained.
  • nucleic acids of this invention are particularly well suited to array-based hybridization formats.
  • Arrays are a multiplicity of different "probe" or "target" nucleic acids (or other compounds) attached to one or more surfaces (e.g., solid, membrane, or gel).
  • the multiplicity of nucleic acids (or other moieties) is attached to a single contiguous surface or to a multiplicity of surfaces juxtaposed to each other.
  • material for the solid surface
  • Illustrative solid surfaces include, e.g., nitrocellulose, nylon, glass, quartz, diazotized membranes (paper or nylon), silicones, polyformaldehyde, cellulose, and cellulose acetate.
  • plastics such as polyethylene, polypropylene, polystyrene, and the like can be used.
  • materials which may be employed include paper, ceramics, metals, metalloids, semiconductive materials, cermets or the like.
  • substances that form gels can be used.
  • Such materials include, e.g., proteins (e.g., gelatins), lipopolysaccharides, silicates, agarose and polyacrylamides.
  • various pore sizes may be employed depending upon the nature of the system. In preparing the surface, a plurality of different materials may be employed, particularly as laminates, to obtain various properties.
  • proteins (e.g., bovine serum albumin)
  • macromolecules (e.g., Denhardt's solution)
  • the surface will usually be polyfunctional or be capable of being polyfunctionalized.
  • Functional groups which may be present on the surface and used for linking can include carboxylic acids, aldehydes, amino groups, cyano groups, ethylenic groups, hydroxyl groups, mercapto groups and the like. The manner of linking a wide variety of compounds to various surfaces is well known and is amply illustrated in the literature.
  • Target elements of various sizes ranging from 1 mm diameter down to 1 μm can be used.
  • Smaller target elements containing low amounts of concentrated, fixed probe DNA are used for high complexity comparative hybridizations since the total amount of sample available for binding to each target element will be limited.
  • Such small array target elements are typically used in arrays with densities greater than 10 /cm 2 .
  • Relatively simple approaches capable of quantitative fluorescent imaging of 1 cm² areas have been described that permit acquisition of data from a large number of target elements in a single image (see, e.g., Wittrup (1994) Cytometry 16:206-213).
  • Substrates such as glass or fused silica are advantageous in that they provide a very low fluorescence substrate, and a highly efficient hybridization environment.
  • Covalent attachment of the target nucleic acids to glass or synthetic fused silica can be accomplished according to a number of known techniques (described above). Nucleic acids can be conveniently coupled to glass using commercially available reagents.
  • materials for preparation of silanized glass with a number of functional groups are commercially available or can be prepared using standard techniques (see, e.g., Gait (1984) Oligonucleotide Synthesis: A Practical Approach, IRL Press, Wash., D.C.). Quartz cover slips, which have at least 10-fold lower autofluorescence than glass, can also be silanized.
  • probes can also be immobilized on commercially available coated beads or other surfaces.
  • biotin end-labeled nucleic acids can be bound to commercially available avidin-coated beads.
  • Streptavidin or anti-digoxigenin antibody can also be attached to silanized glass slides by protein-mediated coupling using, e.g., protein A following standard protocols (see, e.g., Smith (1992) Science 258:1122-1126).
  • Biotin or digoxigenin end-labeled nucleic acids can be prepared according to standard techniques. Hybridization to nucleic acids attached to beads is accomplished by suspending them in the hybridization mix, and then depositing them on the glass substrate for analysis after washing.
  • paramagnetic particles such as ferric oxide particles, with or without avidin coating, can be used.
  • nucleic acid hybridization formats are known to those skilled in the art.
  • common formats include sandwich assays and competition or displacement assays.
  • Hybridization techniques are generally described in Hames and Higgins (1985) Nucleic Acid Hybridization, A Practical Approach, IRL Press; Gall and Pardue (1969) Proc. Natl. Acad. Sci. USA 63: 378-383; and John et al. (1969) Nature 223: 582-587.
  • Sandwich assays are commercially useful hybridization assays for detecting or isolating nucleic acid sequences. Such assays utilize a "capture" nucleic acid covalently immobilized to a solid support and a labeled "signal" nucleic acid in solution. The sample will provide the target nucleic acid. The "capture" nucleic acid and "signal" nucleic acid probe hybridize with the target nucleic acid to form a "sandwich" hybridization complex. To be most effective, the signal nucleic acid should not hybridize with the capture nucleic acid. Detection of a hybridization complex may require the binding of a signal generating complex to a duplex of target and probe polynucleotides or nucleic acids.
  • such binding occurs through ligand and anti-ligand interactions as between a ligand-conjugated probe and an anti-ligand conjugated with a signal.
  • the sensitivity of the hybridization assays may be enhanced through use of a nucleic acid amplification system that multiplies the target nucleic acid being detected. Examples of such systems include the polymerase chain reaction (PCR) system and the ligase chain reaction (LCR) system.
  • Other methods recently described in the art are the nucleic acid sequence based amplification (NASBA®, Cangene, Mississauga, Ontario) and Q Beta Replicase systems.
  • Nucleic acid hybridization simply involves providing a denatured probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing. The nucleic acids that do not form hybrid duplexes are then washed away, leaving the hybridized nucleic acids to be detected, typically through detection of an attached detectable label. It is generally recognized that nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids, by the addition of chemical agents, or by raising the pH.
  • Under low stringency conditions (e.g., low temperature and/or high salt and/or high target concentration) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower salt) successful hybridization requires fewer mismatches.
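As a rough illustration of the mismatch idea in the stringency discussion above, the following Python sketch counts mismatched positions when a probe is compared with the strand that would pair perfectly with a target. The function names and any sequences passed to them are hypothetical, and the model ignores temperature, salt, and duplex thermodynamics entirely; it only shows how an imperfectly complementary duplex yields a nonzero mismatch count.

```python
# Watson-Crick partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def count_mismatches(probe, target):
    """Count positions where `probe` fails to pair with `target`.

    Both sequences are written 5'->3' and must have equal length.
    A perfectly complementary probe equals the reverse complement
    of the target, so mismatches are counted against that string.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    perfect_probe = reverse_complement(target)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)
```

Under this toy model, a duplex with zero mismatches would survive the most stringent wash, while duplexes with more mismatches would be expected to be lost first as stringency rises.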
  • hybridization conditions may be selected to provide any degree of stringency. In a preferred embodiment, hybridization is performed at low stringency to ensure hybridization and then subsequent washes are performed at higher stringency to eliminate mismatched hybrid duplexes.
  • Successive washes may be performed at increasingly higher stringency (e.g., down to as low as 0.25 X SSPE-T at 37°C to 70°C) until a desired level of hybridization specificity is obtained.
  • Stringency can also be increased by addition of agents such as formamide.
  • Hybridization specificity may be evaluated by comparison of hybridization to the test probes with hybridization to the various controls that can be present.
  • the wash is performed at the highest stringency that produces consistent results and that provides a signal intensity greater than approximately 10% of the background intensity.
  • the hybridized array may be washed at successively higher stringency solutions and read between each wash. Analysis of the data sets thus produced will reveal a wash stringency above which the hybridization pattern is not appreciably altered and which provides adequate signal for the particular probes of interest.
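The wash-selection rule described above (read the array after each successively more stringent wash, then keep the most stringent wash that still gives adequate signal) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the `WashReading` type, its field names, and the wash labels are hypothetical, the 10% figure is taken from the text as a default threshold, and the "consistent results" criterion is not modeled.

```python
from dataclasses import dataclass


@dataclass
class WashReading:
    label: str         # hypothetical description of the wash condition
    signal: float      # measured signal intensity after this wash
    background: float  # measured background intensity after this wash


def pick_wash(readings, min_fraction=0.10):
    """Return the most stringent wash whose signal exceeds the threshold.

    `readings` must be ordered from lowest to highest stringency.
    Returns None if no wash satisfies the signal criterion.
    """
    chosen = None
    for reading in readings:
        if reading.signal > min_fraction * reading.background:
            chosen = reading  # later entries are more stringent
    return chosen
```

In practice the decision would also compare hybridization patterns between successive reads, which this sketch leaves out.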
  • the hybridized nucleic acids are detected by detecting one or more labels attached to the sample or probe nucleic acids.
  • the labels may be incorporated by any of a number of means well known to those of skill in the art.
  • Means of attaching labels to nucleic acids include, for example, nick translation or end-labeling (e.g., with a labeled RNA) by kinasing of the nucleic acid and subsequent attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label (e.g., a fluorophore).
  • linkers for the attachment of labels to nucleic acids are also known.
  • intercalating dyes and fluorescent nucleotides can also be used.
  • Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means.
  • Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like, see, e.g., Molecular Probes, Eugene, Oregon, USA), radiolabels (e.g., ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold (e.g., gold particles in the 40-80 nm diameter size range scatter green light with high efficiency) or colored glass or plastic (e.g., polysty
  • a fluorescent label is preferred because it provides a very strong signal with low background. It is also optically detectable at high resolution and sensitivity through a quick scanning procedure.
  • the nucleic acid samples can all be labeled with a single label, e.g., a single fluorescent label.
  • different nucleic acid samples can be simultaneously hybridized where each nucleic acid sample has a different label. For instance, one target could have a green fluorescent label and a second target could have a red fluorescent label. The scanning step will distinguish sites of binding of the red label from those binding the green fluorescent label.
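The two-label scanning step can be pictured with a small Python sketch that assigns each array site to the red target, the green target, both, or neither. The function name, the dictionary layout, and the intensity threshold are all hypothetical choices made for illustration; a real scanner would supply its own calibration and background correction.

```python
def classify_sites(red, green, threshold=100.0):
    """Label each site by which fluorescent channel(s) exceed a threshold.

    `red` and `green` map a site identifier to the intensity measured
    in that channel. Sites missing from a channel are treated as 0.
    """
    calls = {}
    for site in sorted(set(red) | set(green)):
        red_bound = red.get(site, 0.0) >= threshold
        green_bound = green.get(site, 0.0) >= threshold
        if red_bound and green_bound:
            calls[site] = "both"
        elif red_bound:
            calls[site] = "red"
        elif green_bound:
            calls[site] = "green"
        else:
            calls[site] = "none"
    return calls
```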
  • Each nucleic acid sample (target nucleic acid) can be analyzed independently from one another.
  • Suitable chromogens which can be employed include those molecules and compounds which absorb light in a distinctive range of wavelengths so that a color can be observed or, alternatively, which emit light when irradiated with radiation of a particular wave length or wave length range, e.g., fluorescers.
  • fluorescers should absorb light above about 300 nm, preferably about 350 nm, and more preferably above about 400 nm, usually emitting at wavelengths greater than about 10 nm higher than the wavelength of the light absorbed. It should be noted that the absorption and emission characteristics of the bound dye can differ from the unbound dye. Therefore, when referring to the various wavelength ranges and characteristics of the dyes, it is intended to indicate the dyes as employed and not the dye which is unconjugated and characterized in an arbitrary solvent.
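The wavelength guidance above can be restated as a small Python check. This is only a sketch: the function name and tier labels are hypothetical, the "about" in the text is treated as a hard cutoff, and, as the passage notes, the wavelengths supplied should be those of the dye as employed (bound), not of the unconjugated dye in an arbitrary solvent.

```python
def classify_fluorescer(absorb_nm, emit_nm):
    """Apply the absorption tiers and emission-shift rule from the text.

    Returns (tier, shift_ok) where `tier` reflects the absorption
    wavelength and `shift_ok` is True when emission lies more than
    10 nm above the absorbed wavelength.
    """
    if absorb_nm > 400:
        tier = "more preferred"
    elif absorb_nm > 350:
        tier = "preferred"
    elif absorb_nm > 300:
        tier = "acceptable"
    else:
        tier = "below range"
    shift_ok = (emit_nm - absorb_nm) > 10
    return tier, shift_ok
```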
  • Fluorescers are generally preferred because by irradiating a fluorescer with light, one can obtain a plurality of emissions. Thus, a single label can provide for a plurality of measurable events.
  • Detectable signal can also be provided by chemiluminescent and bioluminescent sources.
  • Chemiluminescent sources include a compound which becomes electronically excited by a chemical reaction and can then emit light which serves as the detectable signal or donates energy to a fluorescent acceptor.
  • luciferins can be used in conjunction with luciferase or lucigenins to provide bioluminescence.
  • Spin labels are provided by reporter molecules with an unpaired electron spin which can be detected by electron spin resonance (ESR) spectroscopy.
  • Exemplary spin labels include organic free radicals, transition metal complexes, particularly vanadium, copper, iron, and manganese, and the like.
  • Exemplary spin labels include nitroxide free radicals.
  • the label may be added to the target (sample) nucleic acid(s) prior to, or after the hybridization.
  • direct labels are detectable labels that are directly attached to or incorporated into the target (sample) nucleic acid prior to hybridization.
  • indirect labels are joined to the hybrid duplex after hybridization.
  • the indirect label is attached to a binding moiety that has been attached to the target nucleic acid prior to the hybridization.
  • the target nucleic acid may be biotinylated before the hybridization. After hybridization, an avidin-conjugated fluorophore will bind the biotin bearing hybrid duplexes providing a label that is easily detected.
  • Fluorescent labels are easily added during an in vitro transcription reaction.
  • fluorescein labeled UTP and CTP can be incorporated into the RNA produced in an in vitro transcription.
  • the labels can be attached directly or through a linker moiety.
  • the site of label or linker-label attachment is not limited to any specific position.
  • a label may be attached to a nucleoside, nucleotide, or analogue thereof at any position that does not interfere with detection or hybridization as desired.
  • certain Label-ON Reagents from Clontech (Palo Alto, CA) provide for labeling interspersed throughout the phosphate backbone of an oligonucleotide and for terminal labeling at the 3' and 5' ends.
  • labels can be attached at positions on the ribose ring or the ribose can be modified and even eliminated as desired.
  • the base moieties of useful labeling reagents can include those that are naturally occurring or modified in a manner that does not interfere with the purpose to which they are put.
  • Modified bases include but are not limited to 7-deaza A and G, 7-deaza-8-aza A and G, and other heterocyclic moieties.
  • fluorescent labels are not to be limited to single species organic molecules, but include inorganic molecules, multi-molecular mixtures of organic and/or inorganic molecules, crystals, heteropolymers, and the like.
  • CdSe-CdS core-shell nanocrystals enclosed in a silica shell can be easily derivatized for coupling to a biological molecule (Bruchez et al (1998) Science, 281: 2013-2016).
  • highly fluorescent quantum dots (zinc sulfide-capped cadmium selenide) have been covalently coupled to biomolecules for use in ultrasensitive biological detection (Warren and Nie (1998) Science, 281: 2016-2018).
  • amplification-based assays can be used to detect nucleic acids.
  • the nucleic acid sequences act as a template in an amplification reaction (e.g., Polymerase Chain Reaction (PCR)). Detailed protocols for quantitative PCR are provided in Innis et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc. N.Y.
  • ligase chain reaction (LCR) (see Genomics 4: 560; Landegren et al. (1988) Science 241: 1077; and Barringer et al. (1990) Gene 89: 117)
  • transcription amplification (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173)
  • self-sustained sequence replication (Guatelli et al. (1990) Proc. Nat. Acad. Sci. USA 87: 1874).
  • the nucleic acids of the invention can also be used to detect C. pneumoniae gene transcripts.
  • Methods of detecting and/or quantifying gene transcripts using nucleic acid hybridization techniques are known to those of skill in the art (see Sambrook et al. supra).
  • a Northern transfer may be used for the detection of the desired mRNA directly.
  • the mRNA is isolated from a given cell sample using, for example, an acid guanidinium-phenol-chloroform extraction method. The mRNA is then electrophoresed to separate the mRNA species and the mRNA is transferred from the gel to a nitrocellulose membrane.
  • labeled probes are used to identify and/or quantify the target mRNA.
  • the gene transcript can be measured using amplification (e.g., PCR) based methods as described above for directly assessing copy number of the target sequences.
  • nucleic acids disclosed here can be used for recombinant expression of the proteins.
  • the nucleic acids encoding the proteins of interest are introduced into suitable host cells, followed by induction of the cells to produce large amounts of the protein.
  • the invention relies on routine techniques in the field of recombinant genetics, well known to those of ordinary skill in the art. A basic text disclosing the general methods of use in this invention is Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd ed. 1989).
  • Standard transfection methods are used to produce prokaryotic, mammalian, yeast or insect cell lines which express large quantities of the desired polypeptide, which is then purified using standard techniques (see, e.g., Colley et al., J. Biol. Chem. 264: 17619-17622, 1989; Guide to Protein Purification, supra).
  • the nucleotide sequences used to transfect the host cells can be modified to yield Chlamydia polypeptides with a variety of desired properties.
  • the polypeptides can vary from the naturally-occurring sequence at the primary structure level by amino acid insertions, substitutions, deletions, and the like. These modifications can be used in a number of combinations to produce the final modified protein chain.
  • the amino acid sequence variants can be prepared with various objectives in mind, including facilitating purification and preparation of the recombinant polypeptide.
  • the modified polypeptides are also useful for modifying plasma half life, improving therapeutic efficacy, and lessening the severity or occurrence of side effects during therapeutic use.
  • the amino acid sequence variants are usually predetermined variants not found in nature but exhibit the same immunogenic activity as naturally occurring protein.
  • modifications of the sequences encoding the polypeptides may be readily accomplished by a variety of well-known techniques, such as site-directed mutagenesis (see Gillman & Smith, Gene 8:81-97 (1979); Roberts et al., Nature 328:731-734 (1987)).
  • the effect of many mutations is difficult to predict. Thus, most modifications are evaluated by routine screening in a suitable assay for the desired characteristic. For instance, the effect of various modifications on the ability of the polypeptide to elicit a protective immune response can be easily determined using in vitro assays. For instance, the polypeptides can be tested for their ability to induce lymphoproliferation, T cell cytotoxicity, or cytokine production using standard techniques.
  • the particular procedure used to introduce the genetic material into the host cell for expression of the polypeptide is not particularly critical. Any of the well known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see Sambrook et al., supra). It is only necessary that the particular procedure utilized be capable of successfully introducing at least one gene into the host cell which is capable of expressing the gene.
  • prokaryotic cells such as E. coli can be used.
  • Eukaryotic cells include yeast, Chinese hamster ovary (CHO) cells, COS cells, and insect cells.
  • the particular vector used to transport the genetic information into the cell is also not particularly critical.
  • Any of the conventional vectors used for expression of recombinant proteins in prokaryotic and eukaryotic cells may be used.
  • Expression vectors for mammalian cells typically contain regulatory elements from eukaryotic viruses.
  • the expression vector typically contains a transcription unit or expression cassette that contains all the elements required for the expression of the polypeptide DNA in the host cells.
  • a typical expression cassette contains a promoter operably linked to the DNA sequence encoding a polypeptide and signals required for efficient polyadenylation of the transcript.
  • the term "operably linked" as used herein refers to linkage of a promoter upstream from a DNA sequence such that the promoter mediates transcription of the DNA sequence.
  • the promoter is preferably positioned about the same distance from the heterologous transcription start site as it is from the transcription start site in its natural setting. As is known in the art, however, some variation in this distance can be accommodated without loss of promoter function. Following the growth of the recombinant cells and expression of the polypeptide, the culture medium is harvested for purification of the secreted protein.
  • the media are typically clarified by centrifugation or filtration to remove cells and cell debris and the proteins are concentrated by adsorption to any suitable resin or by use of ammonium sulfate fractionation, polyethylene glycol precipitation, or by ultrafiltration. Other routine means known in the art may be equally suitable.
  • Further purification of the polypeptide can be accomplished by standard techniques, for example, affinity chromatography, ion exchange chromatography, sizing chromatography, His6 tagging and Ni-agarose chromatography (as described in Dobeli et al., Mol. and Biochem. Parasit. 41:259-268 (1990)), or other protein purification techniques to obtain homogeneity.
  • the purified proteins are then used to produce pharmaceutical compositions, as described below.
  • vaccinia virus is grown in suitable cultured mammalian cells such as the HeLa S3 spinner cells, as described by Mackett et al., in DNA cloning Vol. II: A practical approach, pp. 191-211 (Glover, ed.).
  • the proteins of the present invention can be used to produce antibodies specifically reactive with C. pneumoniae antigens. If isolated proteins are used, they may be recombinantly produced or isolated from Chlamydia cultures. Synthetic peptides made using the protein sequences may also be used.
  • an immunogen preferably a purified protein
  • an adjuvant preferably an adjuvant
  • animals are immunized.
  • blood is collected from the animal and antisera are prepared.
  • Polyclonal antisera are used to identify and characterize Chlamydia in the tissues of patients using, for instance, in situ techniques and immunoperoxidase test procedures described in Anderson et al. JAVMA 198:241 (1991) and Barr et al. Vet.
  • Monoclonal antibodies may be obtained by various techniques familiar to those skilled in the art. Briefly, spleen cells from an animal immunized with a desired antigen are immortalized, commonly by fusion with a myeloma cell (see Kohler &
  • Monoclonal antibodies produced in such a manner are used, for instance, in ELISA diagnostic tests, immunoperoxidase tests, immunohistochemical tests, for the in vitro evaluation of spirochete invasion, to select candidate antigens for vaccine development, protein isolation, and for screening genomic and cDNA libraries to select appropriate gene sequences.
  • Immunodiagnostic detection of C. pneumoniae infections
  • the present invention also provides methods for detecting the presence or absence of C. pneumoniae, or antibodies reactive with it, in a biological sample.
  • antibodies specifically reactive with Chlamydia can be detected using either Chlamydia proteins or the isolates described here.
  • the proteins and isolates can also be used to raise specific antibodies (either monoclonal or polyclonal) to detect the antigen in a sample.
  • nucleic acids disclosed and claimed here can be used to detect Chlamydia-specific sequences using standard hybridization techniques.
  • the immunoassays of the present invention can be performed in any of several configurations, which are reviewed extensively in Enzyme Immunoassay (Maggio, ed., 1980) and Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology (1985).
  • the proteins and antibodies disclosed here are conveniently used in ELISA, immunoblot analysis and agglutination assays.
  • immunoassays to measure anti-Chlamydia antibodies or antigens can be either competitive or noncompetitive binding assays.
  • the sample analyte (e.g., anti-Chlamydia antibodies)
  • a labeled analyte e.g., anti-Chlamydia monoclonal antibody
  • a capture agent e.g., isolated Chlamydia protein
  • Noncompetitive assays are typically sandwich assays, in which the sample analyte is bound between two analyte-specific binding reagents.
  • One of the binding agents is used as a capture agent and is bound to a solid surface.
  • the second binding agent is labelled and is used to measure or detect the resultant complex by visual or instrument means.
  • a number of combinations of capture agent and labelled binding agent can be used.
  • an isolated Chlamydia protein or culture can be used as the capture agent and labelled anti-human antibodies specific for the constant region of human antibodies can be used as the labelled binding agent.
  • Goat, sheep and other non-human antibodies specific for human immunoglobulin constant regions (e.g., γ or μ)
  • the anti-human antibodies can be the capture agent and the antigen can be labelled.
  • the assay may be bound to a solid surface.
  • a solid surface may be a membrane (e.g., nitrocellulose), a microtiter dish (e.g., PVC or polystyrene) or a bead.
  • the desired component may be covalently bound or noncovalently attached through nonspecific bonding.
  • the immunoassay may be carried out in liquid phase and a variety of separation methods may be employed to separate the bound labeled component from the unbound labelled components. These methods are known to those of skill in the art and include immunoprecipitation, column chromatography, adsorption, addition of magnetizable particles coated with a binding agent and other similar procedures. An immunoassay may also be carried out in liquid phase without a separation procedure. Various homogeneous immunoassay methods are now being applied to immunoassays for protein analytes. In these methods, the binding of the binding agent to the analyte causes a change in the signal emitted by the label, so that binding may be measured without separating the bound from the unbound labelled component.
  • Western blot (immunoblot) analysis can also be used to detect the presence of antibodies to Chlamydia in the sample.
  • This technique is a reliable method for confirming the presence of antibodies against a particular protein in the sample.
  • the technique generally comprises separating proteins by gel electrophoresis on the basis of molecular weight, transferring the separated proteins to a suitable solid support (such as a nitrocellulose filter, a nylon filter, or derivatized nylon filter), and incubating the sample with the separated proteins. This causes specific target antibodies present in the sample to bind their respective proteins. Target antibodies are then detected using labeled anti-human antibodies.
  • the immunoassay formats described above employ labelled assay components.
  • the label may be coupled directly or indirectly to the desired component of the assay according to methods well known in the art.
  • a wide variety of labels may be used.
  • the component may be labelled by any one of several methods. Traditionally a radioactive label incorporating ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P was used.
  • Non-radioactive labels include ligands which bind to labelled antibodies, fluorophores, chemiluminescent agents, enzymes, and antibodies which can serve as specific binding pair members for a labelled ligand.
  • the choice of label depends on sensitivity required, ease of conjugation with the compound, stability requirements, and available instrumentation.
  • Enzymes of interest as labels will primarily be hydrolases, particularly phosphatases, esterases and glycosidases, or oxidoreductases, particularly peroxidases.
  • Fluorescent compounds include fluorescein and its derivatives, rhodamine and its derivatives, dansyl, umbelliferone, etc.
  • Chemiluminescent compounds include luciferin, and 2,3-dihydrophthalazinediones, e.g., luminol.
  • Non-radioactive labels are often attached by indirect means.
  • a ligand molecule e.g., biotin
  • the ligand then binds to an anti-ligand (e.g., streptavidin) molecule which is either inherently detectable or covalently bound to a signal system, such as a detectable enzyme, a fluorescent compound, or a chemiluminescent compound.
  • a number of ligands and anti-ligands can be used.
  • where a ligand has a natural anti-ligand, for example, biotin, thyroxine, and cortisol, it can be used in conjunction with the labelled, naturally occurring anti-ligands.
  • any haptenic or antigenic compound can be used in combination with an antibody.
  • agglutination assays can be used to detect the presence of the target antibodies.
  • antigen-coated particles are agglutinated by samples comprising the target antibodies.
  • none of the components need be labelled and the presence of the target antibody is detected by simple visual inspection.
  • the peptides or antibodies (typically monoclonal antibodies) of the present invention and pharmaceutical compositions thereof are useful for administration to mammals, particularly humans, to treat and/or prevent Chlamydia infections. Suitable formulations are found in Remington's Pharmaceutical Sciences, Mack Publishing Company, Philadelphia, PA, 17th ed. (1985).
  • the immunogenic peptides or antibodies of the invention are administered prophylactically or to an individual already suffering from the disease.
  • the peptide compositions are administered to a patient in an amount sufficient to elicit an effective immune response to Chlamydia. An effective immune response is one that inhibits infection.
  • An amount adequate to accomplish this is defined as a "therapeutically effective dose" or "immunogenically effective dose." Amounts effective for this use will depend on, e.g., the peptide composition, the manner of administration, the stage and severity of the disease being treated, the weight and general state of health of the patient, and the judgment of the prescribing physician, but generally range for the initial immunization (that is, for therapeutic or prophylactic administration) from about 0.1 mg to about 1.0 mg per 70 kilogram patient, more commonly from about 0.5 mg to about 0.75 mg per 70 kg of body weight.
  • Boosting dosages are typically from about 0.1 mg to about 0.5 mg of peptide using a boosting regimen over weeks to months depending upon the patient's response and condition. A suitable protocol would include injection at time 0, 4, 2, 6, 10 and 14 weeks, followed by further booster injections at 24 and 28 weeks.
  • Vaccine compositions containing the peptides are administered prophylactically to a patient susceptible to or otherwise at risk of the infection.
  • compositions are intended for parenteral or oral administration.
  • the pharmaceutical compositions are administered parenterally, e.g., subcutaneously, intradermally, or intramuscularly.
  • the invention provides compositions for parenteral administration which comprise a solution of the immunogenic polypeptides dissolved or suspended in an acceptable carrier, preferably an aqueous carrier.
  • an aqueous carrier e.g., water, buffered water, 0.4% saline, 0.3% glycine, hyaluronic acid and the like.
  • These compositions may be sterilized by conventional, well known sterilization techniques, or may be sterile filtered.
  • compositions may be packaged for use as is, or lyophilized, the lyophilized preparation being combined with a sterile solution prior to administration.
  • the compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions, such as buffering agents, tonicity adjusting agents, wetting agents and the like, for example, sodium acetate, sodium lactate, sodium chloride, potassium chloride, calcium chloride, sorbitan monolaurate, triethanolamine oleate, etc.
  • the compositions may also comprise carriers to enhance the immune response.
  • Useful carriers are well known in the art, and include, e.g., KLH, thyroglobulin, albumins such as human serum albumin, tetanus toxoid, polyamino acids such as poly(lysine:glutamic acid), influenza, hepatitis B virus core protein, hepatitis B virus recombinant vaccine and the like.
  • conventional nontoxic solid carriers may be used which include, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharin, talcum, cellulose, glucose, sucrose, magnesium carbonate, and the like.
  • a pharmaceutically acceptable nontoxic composition is formed by incorporating any of the normally employed excipients, such as those carriers previously listed, and generally 10-95% of active ingredient, that is, one or more peptides of the invention, and more preferably at a concentration of 25%-75%.
  • compositions and methods of administration suitable for maximizing the immune response are preferred.
  • peptides may be introduced into a host, including humans, linked to a carrier or as a homopolymer or heteropolymer of active peptide units from various Chlamydia proteins disclosed here.
  • a "cocktail" of polypeptides can be used.
  • a mixture of more than one polypeptide has the advantage of increased immunological reaction and, where different peptides are used to make up the polymer, the additional ability to induce antibodies to a number of epitopes.
  • compositions also include an adjuvant.
  • adjuvants include incomplete Freund's adjuvant, alum, aluminum phosphate, aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (M
  • the concentration of immunogenic peptides of the invention in the pharmaceutical formulations can vary widely, i.e. from less than about 0.1%, usually at or at least about 2% to as much as 20% to 50% or more by weight, and will be selected primarily by fluid volumes, viscosities, etc., in accordance with the particular mode of administration selected.
  • the peptides of the invention can also be expressed by attenuated viral hosts, such as vaccinia or fowlpox.
  • This approach involves the use of vaccinia virus as a vector to express nucleotide sequences that encode the peptides of the invention.
  • Upon introduction into a host, the recombinant vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response.
  • Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S. Patent No. 4,722,848.
  • Another vector is BCG (Bacille Calmette Guerin). BCG vectors are described in Stover et al. (Nature 351:456-460 (1991)).
  • the peptides may also be encapsulated, introduced into the lumen of liposomes, prepared as a colloid, or other conventional techniques may be employed which provide an extended serum half-life of the peptides.
  • liposomes as described in, e.g., Szoka et al., Ann. Rev. Biophys. Bioeng. 9:467 (1980), U.S. Pat. Nos. 4,235,871, 4,501,728 and 4,837,028.
  • Example 1 This example describes a comparison of the C. pneumoniae genome disclosed here and the previously sequenced C. trachomatis genome (Stephens et al. Science 282:754-759 (1998)).
  • the previously sequenced C. trachomatis genome contains 1,042,519 nucleotides and 875 likely protein-coding genes. Similarity searching permitted the inferred functional assignment of 636 (60%) of the genes disclosed here, and 251 (23%) are similar to hypothetical genes for other bacterial organisms, including those for C. trachomatis. The remaining 186 (17%) genes are not homologous to sequences deposited in GenBank. Seventy C. trachomatis genes are not represented in the C. pneumoniae genome. These are contained within blocks consisting of 2-17 genes and 19 single genes. Of the 70 C. trachomatis genes without homologs in C. pneumoniae, 60 are classified as encoding hypothetical proteins. The remaining genes not represented in C. pneumoniae consist of the tryptophan operon (trpA,B,R), trpC, two predicted thiol protease genes, and 4 genes assigned to the phospholipase-D superfamily. It is evident that there is a high level of functional conservation between C. pneumoniae and C. trachomatis, as orthologs to C. trachomatis genes were identified for 859 (80%) of the predicted coding sequences for C. pneumoniae. The level of similarity for individual encoded proteins spans a wide spectrum (22-95% amino acid identity) with an average of 62% amino acid identity between orthologs from the two species.
  • The percent amino acid identity between orthologous chlamydial proteins is similar among functional groups, with the highest for proteins associated with translation and the lowest for proteins whose function in chlamydiae is uncharacterized and not related to proteins encoded by other organisms.
  • The gene order of the homologous set of genes in C. pneumoniae shows reorganization relative to the genome of C. trachomatis; however, there is a high level of synteny for the gene organization of the two genomes.
  • Genome reorganization is not evenly distributed on the chromosome, as the region between C. pneumoniae coding sequences 0130-0300 contains substantially more reorganization than other areas of the genome. This region coincides with the predicted chromosome replication terminus.
  • C. trachomatis proteins containing SET and SWIB domains, and a SWIB domain fused to the C-terminus of the chlamydial topoisomerase I, are found in C. pneumoniae, supporting their possible role in the chromatin condensation-decondensation characteristic of the biologically unique chlamydial developmental cycle.
  • C. pneumoniae has a glycolytic pathway and a linked tricarboxylic acid cycle that, although likely functional, is incomplete, as genes for citrate synthase, aconitase, and isocitrate dehydrogenase were not identified.
  • C. pneumoniae has a complete glycogen synthesis and degradation system supporting a role for glycogen synthesis and utilization of glucose-derivatives in chlamydial metabolism.
  • C. pneumoniae also contains the V (vacuolar)-type ATPase operon and the two ATP translocases found in C. trachomatis.
  • The type-III secretion virulence system required for invasion by several pathogenic bacteria and found in the C. trachomatis genome in three chromosomal locations is also present in the C. pneumoniae genome.
  • Each of the components is conserved and their relative genomic contexts are conserved.
  • Genes such as a predicted serine/threonine protein kinase and other genes physically linked to genes encoding structural components of the type-III secretion apparatus, but without identified homologs, are also highly similar between the two species, suggesting the functional roles in modifying cellular biology are fundamentally conserved.
  • Chlamydia-encoded proteins that are not found in chlamydial organisms but localized to the intracellular chlamydial inclusion membrane are likely essential for the unique intracellular biology and perhaps differences in inclusion morphology observed between species of Chlamydia.
  • Several such proteins, termed IncA, IncB and IncC, have been characterized for a C. psittaci strain (Rockey et al., Mol. Microbiol. 15:617-626 (1995); Rockey et al., Infect. Immun. 62:106-112 (1994)).
  • C. pneumoniae and C. trachomatis encode orthologs to C. psittaci IncB and IncC, and C. trachomatis also contains an ortholog to IncA.
  • C. pneumoniae contains two genes that encode proteins with similarity to IncA (CPn0186 and CPn0585), although the level of homology is low, suggesting analogous but possibly altered functions.
  • The tryptophan biosynthesis operon (trpA, trpB, trpR) and trpC identified in C. trachomatis are conspicuously missing in the C. pneumoniae genome. These represent the entire repertoire of genes associated with tryptophan biosynthesis identified in C. trachomatis. Seventeen genes adjacent to the C. trachomatis tryptophan operon also were not found in the C. pneumoniae genome. This region is the single largest loss of a contiguous genomic segment and includes 4 HKD superfamily encoding genes that encompass a family of proteins related to endonuclease and phospholipase D. These findings may be important for the ability of Chlamydia to persist in their hosts and cause disease by eliciting potent, focal and persistent inflammatory responses thought to be essential for pathogenesis.
  • The C. pneumoniae genome contains 187,711 additional nucleotides compared to the C. trachomatis genome, and the 214 coding sequences not found in C. trachomatis account for most of the increased genome size. Eighty-eight of these genes are found in blocks of >10 genes (11-30 genes/block), 41 are single genes, and the remainder are partnered with at least one other gene. Based upon the observation that ~70% of all the C. pneumoniae genes have an identifiable homolog in GenBank, exclusive of C. trachomatis, it would be expected that over 150 of the 214 genes should have a homolog in GenBank, many associated with a function. However, only 28 coding sequences have similarity to genes from other organisms.
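The gene counts quoted in this example can be cross-checked with simple arithmetic. The sketch below assumes that the three similarity classes (636, 251 and 186 genes) together make up the full set of predicted C. pneumoniae coding sequences; that total is derived here rather than stated in the text, and all variable names are illustrative.

```python
# Figures quoted in Example 1; names are illustrative, not from the source.
ASSIGNED = 636            # genes with an inferred functional assignment
HYPOTHETICAL = 251        # similar to hypothetical genes of other bacteria
NO_HOMOLOG = 186          # not homologous to sequences in GenBank
ORTHOLOGS_IN_CT = 859     # C. pneumoniae genes with a C. trachomatis ortholog
CT_GENOME_NT = 1_042_519  # C. trachomatis genome size in nucleotides
EXTRA_NT = 187_711        # additional nucleotides in C. pneumoniae

# Assumption: the three classes partition all predicted coding sequences.
total_cds = ASSIGNED + HYPOTHETICAL + NO_HOMOLOG


def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


unique_to_cpn = total_cds - ORTHOLOGS_IN_CT

summary = {
    "total_cds": total_cds,
    "assigned_pct": percent(ASSIGNED, total_cds),
    "hypothetical_pct": percent(HYPOTHETICAL, total_cds),
    "no_homolog_pct": percent(NO_HOMOLOG, total_cds),
    "ortholog_pct": percent(ORTHOLOGS_IN_CT, total_cds),
    "unique_to_cpn": unique_to_cpn,
    "cpn_genome_nt": CT_GENOME_NT + EXTRA_NT,
    # ~70% of the genes lacking a C. trachomatis ortholog
    "expected_homologs": round(0.70 * unique_to_cpn),
}

for key, value in summary.items():
    print(f"{key}: {value}")
```

Under that assumption the derived total is 1,073 coding sequences, the number without a C. trachomatis ortholog is 214 (matching the text), and the implied C. pneumoniae genome size is 1,230,230 nucleotides. The computed share of functionally assigned genes comes out at about 59%, which the text rounds to 60%.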
  • The major functionally identifiable addition to the C. pneumoniae genome is a large expansion of genes encoding a new family of chlamydial polymorphic membrane proteins (Pmp), alone representing 22% of the increased coding capacity. While the C. trachomatis genome has 9 pmp genes, remarkably the C. pneumoniae genome contains 21 pmp genes. Most of these genes appear to be amplified in two regions of the genome with three stand-alone genes. Interestingly, one of the stand-alone genes is most closely related to the C. trachomatis pmpD, which is the only stand-alone pmp gene in the C. trachomatis genome, and it is located with the same relative genomic context, suggesting an essential and conserved function for this paralog.
  • Pmp: chlamydial polymorphic membrane proteins
  • Aromatic amino acid hydroxylases include three distinct enzymes that function to respectively oxidize phenylalanine to tyrosine, tyrosine to Dopa, and tryptophan to 5-hydroxytryptophan and serotonin.
  • Although the chlamydial protein is similar to proteins of this family and incrementally more closely related to tryptophan hydroxylase, its specific function could not be confidently predicted. We hypothesize that it may be involved in C. pneumoniae virulence.
  • Tryptophan hydroxylase has not been previously identified in bacteria and the origin of the chlamydial gene appears to be from eukaryotes.
  • The functional role of an aromatic amino acid hydroxylase for C. pneumoniae is linked to the unique intracellular biology of this organism and may represent a key contribution to C. pneumoniae persistence and pathogenesis.
  • Table 1 provides functional assignments of C. pneumoniae nonprotein- encoding genomic sequences.
  • Table 2 provides functional assignments of protein coding sequences.
  • Table 3 provides the amino acid sequences of the proteins corresponding to the coding sequences.

TABLE 1
Type     Start position    End position      Gene
         (SEQ ID NO:1)     (SEQ ID NO:1)
Ori      841664            841396 (R)        Putative Origin of Repli
tmRNA    138493            138074 (R)        tmRNA
pRNA     607342            607649            Ribonuclease P RNA
rRNA     1000564           1002115           16S rRNA
rRNA     1002415           1005278           23S rRNA
rRNA     1005393           1005509           5S rRNA
tRNA     269070            269142            Ala tRNA_1
tRNA     164318            164389            Asn tRNA
tRNA     296224            296151 (R)        Asp tRNA
tRNA     836191            835119 (R)        Ala tRNA_2
tRNA     1030533           1030603           Cys tRNA
tRNA     784896            784822 (R)        Glu tRNA
tRNA     781680            781610 (R)        Gly tRNA_1
tRNA     961536            961607            Gly tRNA_2
tRNA     999949            1000023           His tRNA
tRNA
  • CT016 hypothetical protein CPn0106 135091 136374 F phoH-ATPase (CT015)
  • CPn0625 718485 718060 R rl17-L17 Ribosomal Protein (CT506)
  • CT601 hypothetical protein CPn0780 879205 878591
  • CT601 hypothetical protein

Abstract

C. pneumoniae genome sequence and analysis of the encoded polypeptides and RNAs are provided. The C. pneumoniae gene nucleic acid compositions find use in identifying homologous or related proteins and the DNA sequences encoding such proteins; in producing compositions that modulate the expression or function of the protein; and in studying associated physiological pathways. In addition, modulation of the gene activity in vivo is used for prophylactic and therapeutic purposes, such as identification of cell type based on expression, and the like.

Description

CHLAMYDIA PNEUMONIAE GENOME SEQUENCE
CROSS-REFERENCES TO RELATED APPLICATIONS
The present application is related to 60/128,606, filed April 8, 1999 and 60/108,279, filed November 12, 1998, which are incorporated herein by reference.
STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT
FIELD OF THE INVENTION
This invention relates to nucleic acids and polypeptides from Chlamydia pneumoniae and to their use in the diagnosis, prevention and treatment of diseases associated with C. pneumoniae.
BACKGROUND OF THE INVENTION
Chlamydiaceae is a family of obligate intracellular parasites with a tropism for epithelial cells lining the mucous membranes. The bacteria have two morphologically distinct forms, "elementary body" and "reticulate body". The elementary body is the infectious form, and has a rigid cell wall, primarily of cross-linked outer membrane proteins. The reticulate body is the intracellular, metabolically active form. A unique developmental cycle between these two forms characterizes Chlamydia growth. C. pneumoniae is a human respiratory pathogen that causes acute respiratory disease, and approximately 10% of community-acquired pneumonia. Antibody prevalence studies have shown that virtually everyone is infected with C. pneumoniae at some time, and that reinfection is common. In addition to respiratory disease, studies have shown an association of this organism with coronary artery disease. It has been demonstrated in atherosclerotic lesions of the aorta and coronary arteries by immunocytochemistry and by polymerase chain reaction (Kuo et al. (1993) J Infect Dis 167(4):841-849). Recent reports have further demonstrated the presence of C. pneumoniae in the walls of abdominal aortic aneurysms (Juvonen et al. (1997) J Vasc Surg 25(3):499-505). Abdominal aortic aneurysms are frequently associated with atherosclerosis, and inflammation may be an important factor in aneurysmal dilatation. C. pneumoniae may play a role in maintaining inflammation and triggering the development of aortic aneurysms.
Muhlestein et al. (1996) JACC 27:1555-61, reported a differential incidence of Chlamydia species within the coronary artery wall of patients with atherosclerosis versus those with other forms of cardiovascular disease. The extremely high rate of possible infection in patients with symptomatic atherosclerotic disease compared to the very low rate in patients with normal coronary arteries or coronary artery disease from chronic transplant rejection provides evidence for a direct link between the atherosclerotic process and Chlamydia infection. Because a history of chlamydial infection is so prevalent in the population, the issue of causality remains. On a physiologic and pathologic level, abnormal interactions among endothelial cells, platelets, macrophages and lymphocytes may lead to a cascade of events resulting in acute endothelial damage, thrombosis and repair, chronically leading to the development of atheroma in blood vessels.
C. pneumoniae is related to other Chlamydia species, but the level of sequence similarity is relatively low. Very little is known about the biology of this organism, although it appears to be an important human pathogen. Allelic diversity and structural relationships between specific genes of Chlamydial species are described in Kaltenboeck et al. (1993) J Bacteriol 175(2):487-502; Gaydos et al. (1992) Infect Immun 60(12):5319-5323; Everett et al. (1997) Int J Syst Bacteriol 47(2):461-473; and Pudjiatmoko et al. (1997) Int J Syst Bacteriol 47(2):425-431.
A number of studies have been published describing methods for detection of C. pneumoniae, and for distinguishing between Chlamydial species. Such methods include PCR detection (Rasmussen et al. (1992) Mol Cell Probes 6(5):389-394; Holland et al. (1990) J Infect Dis 162(4):984-987); a simplified polymerase chain reaction-enzyme immunoassay (Wilson et al. (1996) J Appl Bacteriol 80(4):431-438); and sequence determination and restriction endonuclease cleavage (Herrmann et al. (1996) J Clin Microbiol 34(8):1897-1902).
Antigenic and molecular analyses of different C. pneumoniae strains are described in Jantos et al. (1997) J Clin Microbiol 35(3):620-623. Some genes of C. pneumoniae have been isolated and sequenced. These include the GroE operon (Kikuta et al. (1991) Infect Immun 59(12):4665-4669); the major outer membrane protein (Perez et al. (1991) Infect Immun 59(6):2195-2199); the DnaK protein homolog (Kornak et al. (1991) Infect Immun 59(2):721-725); as well as a number of ribosomal and other genes.
SUMMARY OF THE INVENTION
This invention provides the genomic sequence of Chlamydia pneumoniae. The sequence information is useful for a variety of diagnostic and analytical methods. The genomic sequence may be embodied in a variety of media, including computer readable forms, or as a nucleic acid comprising a selected fragment of the sequence.
Such fragments generally consist of an open reading frame, transcriptional or translational control elements, or fragments derived therefrom. Proteins encoded by the open reading frames are useful for diagnostic purposes, as well as for their enzymatic or structural activity.
DEFINITIONS
The term "amino acid" refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
"Amplification" primers are oligonucleotides comprising either natural or analogue nucleotides that can serve as the basis for the amplification of a select nucleic acid sequence. They include, e.g., polymerase chain reaction primers and ligase chain reaction oligonucleotides.
"Antibody" refers to an immunoglobulin molecule able to bind to a specific epitope on an antigen. Antibodies can be a polyclonal mixture or monoclonal. Antibodies can be intact immunoglobulins derived from natural sources or from recombinant sources and can be immunoreactive portions of intact immunoglobulins. Antibodies may exist in a variety of forms including, for example, Fv, Fab, and F(ab)2, as well as in single chains. Single-chain antibodies, in which genes for a heavy chain and a light chain are combined into a single coding sequence, may also be used.
An "antigen" is a molecule that is recognized and bound by an antibody, e.g., peptides, carbohydrates, organic molecules, or more complex molecules such as glycolipids and glycoproteins. The part of the antigen that is the target of antibody binding is an antigenic determinant and a small functional group that corresponds to a single antigenic determinant is called a hapten.
"Biological sample" refers to any sample obtained from a living or dead organism. Examples of biological samples include biological fluids and tissue specimens. Such biological samples can be prepared for analysis of the presence of C. pneumoniae nucleic acids, proteins, or antibodies specifically reactive with the proteins.
The term "C. pneumoniae gene" shall be intended to mean the open reading frame encoding specific C. pneumoniae polypeptides, as well as adjacent 5' and 3' non-coding nucleotide sequences involved in the regulation of expression, up to about 2 kb beyond the coding region, but possibly further in either direction. The gene may be introduced into an appropriate vector for extrachromosomal maintenance or for integration into a host genome.
"Conservatively modified variants" applies to both amino acid and nucleic acid sequences. With respect to particular nucleic acid sequences, conservatively modified variants refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given protein. For instance, the codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every position where an alanine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are "silent variations," which are one species of conservatively modified variations. Every nucleic acid sequence herein which encodes a polypeptide also describes every possible silent variation of the nucleic acid. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and TGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
As to amino acid sequences, one of skill will recognize that individual substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein sequence which alters, adds or deletes a single amino acid or a small percentage of amino acids in the encoded sequence is a "conservatively modified variant" where the alteration results in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art. Such conservatively modified variants are in addition to and do not exclude polymorphic variants, interspecies homologs, and alleles of the invention.
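The silent variations described above for nucleic acid sequences can be illustrated with a short sketch. It builds the standard genetic code as an RNA codon table, lists the synonymous codons for an amino acid, and counts how many coding sequences specify the same peptide. The function names are hypothetical, and the sketch assumes the standard genetic code with the input given as RNA (U rather than T).

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons enumerated with U, C, A, G order at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_silent_variants(rna):
    """Count the coding sequences (including rna itself) that encode
    the same peptide as rna; trailing partial codons are ignored."""
    total = 1
    usable = len(rna) - len(rna) % 3
    for i in range(0, usable, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        total *= len(synonymous_codons(amino_acid))
    return total


print(synonymous_codons("A"))              # the four alanine codons
print(count_silent_variants("AUGGCAUGG"))  # Met-Ala-Trp
```

For the Met-Ala-Trp example only the alanine codon can vary, since methionine and tryptophan each have a single codon, so the sequence has four synonymous forms in total.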
The following groups each contain amino acids that are conservative substitutions for one another:
1) Alanine (A), Glycine (G);
2) Serine (S), Threonine (T);
3) Aspartic acid (D), Glutamic acid (E);
4) Asparagine (N), Glutamine (Q);
5) Cysteine (C), Methionine (M);
6) Arginine (R), Lysine (K), Histidine (H);
7) Isoleucine (I), Leucine (L), Valine (V); and
8) Phenylalanine (F), Tyrosine (Y), Tryptophan (W).
(see, e.g., Creighton, Proteins (1984)).
The terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a comparison window, as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. This definition also refers to the complement of a test sequence, which has a designated percent sequence or subsequence complementarity when the test sequence has a designated or substantial identity to a reference sequence. For example, a designated amino acid percent identity of 95% refers to sequences or subsequences that have at least about 95% amino acid identity when aligned for maximum correspondence over a comparison window as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. Such sequences would then be said to have substantial identity, or to be substantially identical to each other. Preferably, sequences have at least about 70% identity, more preferably 80% identity, more preferably 90-95% identity and above. Preferably, the percent identity exists over a region of the sequence that is at least about 25 amino acids in length, more preferably over a region that is 50-100 amino acids in length.
When percentage of sequence identity is used in reference to proteins or peptides, it is recognized that residue positions that are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated according to, e.g., the algorithm of Meyers & Miller, Computer Applic. Biol. Sci. 4:11-17 (1988), e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, California, USA).
For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated or default program parameters.
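The partial-credit scoring described above can be sketched as follows. The example takes two pre-aligned protein sequences of equal length, scores identical residues as 1, and gives a conservative substitution (a pair from the same one of the eight groups listed earlier) a score between zero and 1. The value 0.5 and the function name are assumptions chosen for illustration; this is not the Meyers & Miller algorithm or the PC/GENE implementation.

```python
# The eight conservative substitution groups, by one-letter code.
CONSERVATIVE_GROUPS = ["AG", "ST", "DE", "NQ", "CM", "RKH", "ILV", "FYW"]
GROUP_OF = {
    residue: index
    for index, group in enumerate(CONSERVATIVE_GROUPS)
    for residue in group
}


def percent_identity(test, reference, conservative_score=0.0):
    """Percent identity of two pre-aligned, equal-length sequences.

    conservative_score is the partial credit (between 0 and 1) given to a
    conservative substitution; 0.0 yields the unadjusted percent identity.
    Gap characters ('-') always score zero.
    """
    if len(test) != len(reference) or not test:
        raise ValueError("sequences must be aligned, equal-length, non-empty")
    score = 0.0
    for a, b in zip(test, reference):
        if a == b and a != "-":
            score += 1.0
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            score += conservative_score
    return 100.0 * score / len(test)


reference = "ACDEFGHIKL"
test = "ACEEFGHLKV"   # D->E, I->L and L->V are conservative substitutions
print(percent_identity(test, reference))                          # unadjusted
print(percent_identity(test, reference, conservative_score=0.5))  # adjusted
```

With seven identical positions and three conservative substitutions out of ten, the unadjusted value is 70% and the adjusted value with half credit is 85%.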
A comparison window includes reference to a segment of any one of the number of contiguous positions selected from the group consisting of from 25 to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and visual inspection (see, e.g., Ausubel et al., supra).
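As an illustration of the global alignment approach of Needleman & Wunsch cited above, the following minimal sketch computes the optimal global alignment score of two sequences by dynamic programming. The scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name are assumptions made for the example, not parameters taken from this document or from the GAP program.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b with a linear gap penalty.

    Only the previous row of the dynamic-programming matrix is kept,
    so memory use is proportional to len(b).
    """
    # Row 0: aligning the empty prefix of a against prefixes of b costs gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j, y in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if x == y else mismatch)
            up = prev[j] + gap        # gap inserted in b
            left = curr[j - 1] + gap  # gap inserted in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


print(needleman_wunsch_score("GATTACA", "GATTACA"))  # identical sequences
print(needleman_wunsch_score("GATTACA", "GATACA"))   # one residue deleted
```

A local alignment in the style of Smith & Waterman uses the same recurrence but clamps every cell at zero and reports the maximum cell rather than the final one.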
One example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol. Evol. 35:351-360 (1987). The method used is similar to the method described by Higgins & Sharp, CABIOS 5:151-153 (1989). The program can align up to 300 sequences, each of a maximum length of 5,000 nucleotides or amino acids. The multiple alignment procedure begins with the pairwise alignment of the two most similar sequences, producing a cluster of two aligned sequences. This cluster is then aligned to the next most related sequence or cluster of aligned sequences. Two clusters of sequences are aligned by a simple extension of the pairwise alignment of two individual sequences. The final alignment is achieved by a series of progressive, pairwise alignments. The program is run by designating specific sequences and their amino acid or nucleotide coordinates for regions of sequence comparison and by designating the program parameters. Using PILEUP, a reference sequence is compared to other test sequences to determine the percent sequence identity relationship using the following parameters: default gap weight (3.00), default gap length weight (0.10), and weighted end gaps. PILEUP can be obtained from the GCG sequence analysis software package, e.g., version 7.0 (Devereaux et al., Nuc. Acids Res. 12:387-395 (1984)).
Another example of an algorithm that is suitable for determining percent sequence identity (i.e., substantial similarity or identity) is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
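The seed-and-extend idea described above can be sketched in a few lines. This is a simplified illustration, not the BLAST implementation: it seeds only on exact word matches of length W (the neighborhood threshold T is ignored), performs ungapped extension, and uses hypothetical function names. The reward M=5 and penalty N=-4 follow the nucleotide defaults quoted above; the drop-off X=20 and the short word length in the usage example are assumptions chosen to keep the example small.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def extend(query, subject, qi, sj, step, start_score, m, n, x):
    """Extend without gaps in one direction (step = +1 or -1).

    Stops when the score drops by x from its maximum, when it reaches
    zero or below, or when either sequence ends.
    Returns (best_score, residues_kept).
    """
    score = best = start_score
    kept = taken = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        taken += 1
        if score > best:
            best, kept = score, taken
        if best - score >= x or score <= 0:
            break
        qi += step
        sj += step
    return best, kept


def ungapped_hsp(query, subject, seed, w, m=5, n=-4, x=20):
    """Extend a seed in both directions; return (score, query_start, query_end)."""
    qi, sj = seed
    score = w * m  # every residue in an exact word hit is a match
    score, right = extend(query, subject, qi + w, sj + w, +1, score, m, n, x)
    score, left = extend(query, subject, qi - 1, sj - 1, -1, score, m, n, x)
    return score, qi - left, qi + w + right


query = "TTACGGTCATAA"
subject = "CCCACGGTCATGG"
seeds = find_seeds(query, subject, w=4)
score, start, end = ungapped_hsp(query, subject, seeds[0], w=4)
print(seeds[0], score, query[start:end])
```

In this example the first seed sits at query position 2 and subject position 3, and extension recovers the shared eight-residue segment ACGGTCAT with a score of 40 (eight matches at M=5).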
For amino acid sequences, the BLASTP program uses as default parameters a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1989)). The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5787 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001. An indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross-reactive with the antibodies raised against the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules or their complements hybridize to each other under stringent conditions, as described below.
Another indication that polynucleotide sequences are substantially identical is if two molecules hybridize to each other under stringent conditions. Stringent conditions are sequence dependent and will be different in different circumstances. Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Typically stringent conditions for a Southern blot protocol involve hybridizing in a buffer comprising 5x SSC, 1% SDS at 65°C or hybridizing in a buffer containing 5x SSC and 1% SDS at 42°C and washing at 65°C with a 0.2x SSC, 0.1% SDS wash.
A "label" is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, or chemical means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins for which antisera or monoclonal antibodies are available.
The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form. The term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs). Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences, as well as the sequence explicitly indicated. The term nucleic acid is used interchangeably with gene, cDNA, mRNA, oligonucleotide, and polynucleotide.
As used herein a "nucleic acid probe or oligonucleotide" is defined as a nucleic acid capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C, or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere with hybridization. Thus, for example, probes may be peptide nucleic acids in which the constituent bases are joined by peptide bonds rather than phosphodiester linkages. It will be understood by one of skill in the art that probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. The probes are preferably directly labeled as with isotopes, chromophores, lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin complex may later bind. By assaying for the presence or absence of the probe, one can detect the presence or absence of the select sequence or subsequence.
A labeled nucleic acid probe or oligonucleotide is one that is bound, either covalently, through a linker, or through ionic, van der Waals or hydrogen bonds to a label such that the presence of the probe may be detected by detecting the presence of the label bound to the probe.
"Pharmaceutically acceptable" means a material that is not biologically or otherwise undesirable, i.e., the material can be administered to an individual along with a Chlamydia antigen without causing any undesirable biological effects or interacting in a deleterious manner with any of the other components of the pharmaceutical composition.
The terms "polypeptide," "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are analogs or mimetics of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
The phrase "specifically or selectively hybridizing to" refers to hybridization between a probe and a target sequence in which the probe binds substantially only to the target sequence, forming a hybridization complex, when the target is in a heterogeneous mixture of polynucleotides and other compounds. Such hybridization is determinative of the presence of the target sequence. Although the probe may bind other unrelated sequences, at least 90%, preferably 95% or more of the hybridization complexes formed are with the target sequence.
The term "recombinant" when used with reference to a cell, or nucleic acid, or vector, indicates that the cell, or nucleic acid, or vector, has been modified by the introduction of a heterologous nucleic acid or the alteration of a native nucleic acid, or that the cell is derived from a cell so modified. Thus, for example, recombinant cells express genes that are not found within the native (non-recombinant) form of the cell or express native genes that are otherwise abnormally expressed, underexpressed or not expressed at all.
The phrase "specifically immunoreactive with", when referring to a protein or peptide, refers to a binding reaction between the protein and an antibody which is determinative of the presence of the protein in the presence of a heterogeneous population of proteins and other compounds. Thus, under designated immunoassay conditions, the specified antibodies bind to a particular protein and do not bind in a significant amount to other proteins present in the sample. Specific binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein and are described in detail below.
The phrase "substantially pure" or "isolated" when referring to a Chlamydia peptide or protein, means a chemical composition which is free of other subcellular components of the Chlamydia organism. Typically, a monomeric protein is substantially pure when at least about 85% or more of a sample exhibits a single polypeptide backbone. Minor variants or chemical modifications may typically share the same polypeptide sequence. Depending on the purification procedure, purities of 85%, and preferably over 95% pure are possible. Protein purity or homogeneity may be indicated by a number of means well known in the art, such as polyacrylamide gel electrophoresis of a protein sample, followed by visualizing a single polypeptide band on a polyacrylamide gel upon silver staining. For certain purposes high resolution will be needed and HPLC or a similar means for purification utilized.
DETAILED DESCRIPTION
The present invention provides the nucleotide sequence of the C. pneumoniae genome SEQ ID NO: 1 or a representative fragment thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. As used herein, a "representative fragment" of the nucleotide sequence depicted in SEQ ID NO: 1 refers to any portion which is not presently represented within a publicly available database. Preferred representative fragments of the present invention are open reading frames, expression modulating fragments, uptake modulating fragments, and fragments which can be used to diagnose the presence of C. pneumoniae in a sample. Using the information provided in the present application, together with routine cloning and sequencing methods, one of ordinary skill in the art will be able to clone and sequence all "representative fragments" of interest including open reading frames (ORFs) encoding a large variety of C. pneumoniae proteins. A non-limiting identification of such preferred representative fragments is provided in Tables 2 and 3.
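By way of illustration only, candidate open reading frames in a nucleotide sequence can be located computationally. The following Python sketch is an assumption-laden example, not the method used to prepare Tables 2 and 3: it assumes ATG as the only start codon, the three standard stop codons, and a hypothetical minimum length parameter, and it scans all six reading frames.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard stop codons (assumed)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq, min_codons=3):
    """Return a list of (strand, start, end) tuples for ATG...stop ORFs.

    Coordinates are 0-based and end-exclusive, measured on the strand that
    was scanned ('+' is the input sequence, '-' is its reverse complement).
    min_codons counts the start and stop codons as well.
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG opens the ORF in this frame
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical 16-nucleotide sequence, for illustration only.
print(find_orfs("CCATGAAATTTTAGGG"))
```

In practice a much larger minimum length would be used for a bacterial genome, and candidate ORFs would be confirmed by comparison with known proteins.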
Diagnostic use of C. pneumoniae nucleic acids
Hybridization-based assays
Using the nucleic acids disclosed here, one of skill can design nucleic acid hybridization-based assays for the detection of C. pneumoniae. Any of a number of well known techniques for the specific detection of target nucleic acids can be used.
Exemplary hybridization-based assays include, but are not limited to, traditional "direct probe" methods such as Southern blots, dot blots, in situ hybridization (e.g., FISH), PCR, and the like. The methods can be used in a wide variety of formats including, but not limited to, substrate-bound (e.g., membrane or glass) methods or array-based approaches as described below. As noted above, this invention also embraces methods for detecting the presence of Chlamydia DNA or RNA in biological samples. These sequences can be used to detect Chlamydia in biological samples from patients suspected of being infected. A variety of methods of specific DNA and RNA measurement using nucleic acid hybridization techniques are known to those of skill in the art (see Sambrook et al., supra).
In situ hybridization assays are well known (e.g., Angerer (1987) Meth. Enzymol. 152: 649). Generally, in situ hybridization comprises the following major steps: (1) fixation of the tissue or biological structure to be analyzed; (2) prehybridization treatment of the biological structure to increase accessibility of target DNA and to reduce nonspecific binding; (3) hybridization of the mixture of nucleic acids to the nucleic acid in the biological structure or tissue; (4) post-hybridization washes to remove nucleic acid fragments not bound in the hybridization; and (5) detection of the hybridized nucleic acid fragments. The reagents used in each of these steps and the conditions for use vary depending on the particular application.
In a typical in situ hybridization assay, cells are fixed to a solid support, typically a glass slide. If a nucleic acid is to be probed, the cells are typically denatured with heat or alkali. The cells are then contacted with a hybridization solution at a moderate temperature to permit annealing of labeled probes specific to the nucleic acid sequence encoding the protein. The targets (e.g., cells) are then typically washed at a predetermined stringency or at an increasing stringency until an appropriate signal to noise ratio is obtained.
The nucleic acids of this invention are particularly well suited to array-based hybridization formats. Arrays are a multiplicity of different "probe" or "target" nucleic acids (or other compounds) attached to one or more surfaces (e.g., solid, membrane, or gel). In a preferred embodiment, the multiplicity of nucleic acids (or other moieties) is attached to a single contiguous surface or to a multiplicity of surfaces juxtaposed to each other.
In an array format a large number of different hybridization reactions can be run essentially "in parallel." This provides rapid, essentially simultaneous, evaluation of a number of hybridizations in a single "experiment". Methods of performing hybridization reactions in array-based formats are well known to those of skill in the art (see, e.g., Pastinen (1997) Genome Res. 7: 606-614; Jackson (1996) Nature Biotechnology 14: 1685; Chee (1995) Science 274: 610; WO 96/17958). Arrays, particularly nucleic acid arrays, can be produced according to a wide variety of methods well known to those of skill in the art. For example, in a simple embodiment, "low density" arrays can simply be produced by spotting (e.g., by hand using a pipette) different nucleic acids at different locations on a solid support (e.g., a glass surface, a membrane, etc.). This simple spotting approach has been automated to produce high density spotted arrays (see, e.g., U.S. Patent No. 5,807,522). This patent describes the use of an automated system that taps a microcapillary against a surface to deposit a small volume of a biological sample. The process is repeated to generate high density arrays. Arrays can also be produced using oligonucleotide synthesis technology. Thus, for example, U.S. Patent No. 5,143,854 and PCT patent publication Nos. WO 90/15070 and 92/10092 teach the use of light-directed combinatorial synthesis of high density oligonucleotide arrays.
Many methods for immobilizing nucleic acids on a variety of solid surfaces are known in the art. A wide variety of organic and inorganic polymers, as well as other materials, both natural and synthetic, can be employed as the material for the solid surface. Illustrative solid surfaces include, e.g., nitrocellulose, nylon, glass, quartz, diazotized membranes (paper or nylon), silicones, polyformaldehyde, cellulose, and cellulose acetate. In addition, plastics such as polyethylene, polypropylene, polystyrene, and the like can be used. Other materials which may be employed include paper, ceramics, metals, metalloids, semiconductive materials, cermets or the like. In addition, substances that form gels can be used. Such materials include, e.g., proteins (e.g., gelatins), lipopolysaccharides, silicates, agarose and polyacrylamides. Where the solid surface is porous, various pore sizes may be employed depending upon the nature of the system. In preparing the surface, a plurality of different materials may be employed, particularly as laminates, to obtain various properties. For example, proteins (e.g., bovine serum albumin) or mixtures of macromolecules (e.g., Denhardt's solution) can be employed to avoid non-specific binding, simplify covalent conjugation, enhance signal detection or the like. If covalent bonding between a compound and the surface is desired, the surface will usually be polyfunctional or be capable of being polyfunctionalized. Functional groups which may be present on the surface and used for linking can include carboxylic acids, aldehydes, amino groups, cyano groups, ethylenic groups, hydroxyl groups, mercapto groups and the like. The manner of linking a wide variety of compounds to various surfaces is well known and is amply illustrated in the literature.
For example, methods for immobilizing nucleic acids by introduction of various functional groups to the molecules are known (see, e.g., Bischoff (1987) Anal. Biochem., 164: 336-344; Kremsky (1987) Nucl. Acids Res. 15: 2891-2910). Modified nucleotides can be placed on the target using PCR primers containing the modified nucleotide, or by enzymatic end labeling with modified nucleotides. Use of glass or membrane supports (e.g., nitrocellulose, nylon, polypropylene) for the nucleic acid arrays of the invention is advantageous because of well developed technology employing manual and robotic methods of arraying targets at relatively high element densities. Such membranes are generally available, and protocols and equipment for hybridization to membranes are well known.
Target elements of various sizes, ranging from 1 mm diameter down to 1 μm can be used. Smaller target elements containing low amounts of concentrated, fixed probe DNA are used for high complexity comparative hybridizations since the total amount of sample available for binding to each target element will be limited. Thus it is advantageous to have small array target elements that contain a small amount of concentrated probe DNA so that the signal that is obtained is highly localized and bright. Such small array target elements are typically used in arrays with densities greater than 10 /cm2. Relatively simple approaches capable of quantitative fluorescent imaging of 1 cm2 areas have been described that permit acquisition of data from a large number of target elements in a single image (see, e.g., Wittrup (1994) Cytometry 16:206-213). If fluorescently labeled nucleic acid samples are used, arrays on solid surface substrates with much lower fluorescence than membranes, such as glass, quartz, or small beads, can achieve much better sensitivity. Substrates such as glass or fused silica are advantageous in that they provide a very low fluorescence substrate, and a highly efficient hybridization environment. Covalent attachment of the target nucleic acids to glass or synthetic fused silica can be accomplished according to a number of known techniques (described above). Nucleic acids can be conveniently coupled to glass using commercially available reagents. For instance, materials for preparation of silanized glass with a number of functional groups are commercially available or can be prepared using standard techniques (see, e.g., Gait (1984) Oligonucleotide Synthesis: A Practical Approach, IRL Press, Wash., D.C.). Quartz cover slips, which have at least 10-fold lower autofluorescence than glass, can also be silanized.
Alternatively, probes can also be immobilized on commercially available coated beads or other surfaces. For instance, biotin end-labeled nucleic acids can be bound to commercially available avidin-coated beads. Streptavidin or anti-digoxigenin antibody can also be attached to silanized glass slides by protein-mediated coupling using, e.g., protein A following standard protocols (see, e.g., Smith (1992) Science 258: 1122-1126). Biotin or digoxigenin end-labeled nucleic acids can be prepared according to standard techniques. Hybridization to nucleic acids attached to beads is accomplished by suspending them in the hybridization mix, and then depositing them on the glass substrate for analysis after washing. Alternatively, paramagnetic particles, such as ferric oxide particles, with or without avidin coating, can be used.
A variety of other nucleic acid hybridization formats are known to those skilled in the art. For example, common formats include sandwich assays and competition or displacement assays. Hybridization techniques are generally described in Hames and Higgins (1985) Nucleic Acid Hybridization, A Practical Approach, IRL Press; Gall and Pardue (1969) Proc. Natl. Acad. Sci. USA 63: 378-383; and John et al. (1969) Nature 223: 582-587.
Sandwich assays are commercially useful hybridization assays for detecting or isolating nucleic acid sequences. Such assays utilize a "capture" nucleic acid covalently immobilized to a solid support and a labeled "signal" nucleic acid in solution. The sample will provide the target nucleic acid. The "capture" nucleic acid and "signal" nucleic acid probe hybridize with the target nucleic acid to form a "sandwich" hybridization complex. To be most effective, the signal nucleic acid should not hybridize with the capture nucleic acid. Detection of a hybridization complex may require the binding of a signal generating complex to a duplex of target and probe polynucleotides or nucleic acids. Typically, such binding occurs through ligand and anti-ligand interactions as between a ligand-conjugated probe and an anti-ligand conjugated with a signal. The sensitivity of the hybridization assays may be enhanced through use of a nucleic acid amplification system that multiplies the target nucleic acid being detected. Examples of such systems include the polymerase chain reaction (PCR) system and the ligase chain reaction (LCR) system. Other methods recently described in the art are the nucleic acid sequence based amplification (NASBA, Cangene, Mississauga, Ontario) and Q Beta Replicase systems.
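The requirement that the signal nucleic acid not hybridize with the capture nucleic acid can be screened for in silico before probes are synthesized. The following Python sketch is one possible, simplified check: it reports the longest contiguous stretch over which two probes could base-pair in antiparallel orientation. The function names are hypothetical, and any cutoff applied to the result would be an assumption chosen by the user rather than a value taken from this disclosure.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_complementary_run(capture, signal):
    """Length of the longest contiguous stretch of `capture` that is perfectly
    complementary (antiparallel) to a stretch of `signal`.

    Implemented as a longest-common-substring search between `capture` and
    the reverse complement of `signal`.
    """
    a = capture.upper()
    b = reverse_complement(signal)
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the run ending at i-1, j-1
                best = max(best, cur[j])
        prev = cur
    return best


# Hypothetical probe sequences, for illustration only.
print(longest_complementary_run("AAAACCCC", "GGGGTTTT"))  # fully complementary pair
print(longest_complementary_run("AAAAAAAA", "AAAAAAAA"))  # no complementarity
```

A short run does not guarantee the absence of cross-hybridization at low stringency, so a check of this kind complements, and does not replace, an experimental control.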
Nucleic acid hybridization simply involves providing a denatured probe and target nucleic acid under conditions where the probe and its complementary target can form stable hybrid duplexes through complementary base pairing. The nucleic acids that do not form hybrid duplexes are then washed away leaving the hybridized nucleic acids to be detected, typically through detection of an attached detectable label. It is generally recognized that nucleic acids are denatured by increasing the temperature or decreasing the salt concentration of the buffer containing the nucleic acids, or by the addition of chemical agents, or by raising the pH. Under low stringency conditions (e.g., low temperature and/or high salt and/or high target concentration) hybrid duplexes (e.g., DNA:DNA, RNA:RNA, or RNA:DNA) will form even where the annealed sequences are not perfectly complementary. Thus specificity of hybridization is reduced at lower stringency. Conversely, at higher stringency (e.g., higher temperature or lower salt) successful hybridization requires fewer mismatches. One of skill in the art will appreciate that hybridization conditions may be selected to provide any degree of stringency. In a preferred embodiment, hybridization is performed at low stringency to ensure hybridization and then subsequent washes are performed at higher stringency to eliminate mismatched hybrid duplexes. Successive washes may be performed at increasingly higher stringency (e.g., down to as low as 0.25× SSPE-T at 37°C to 70°C) until a desired level of hybridization specificity is obtained. Stringency can also be increased by addition of agents such as formamide. Hybridization specificity may be evaluated by comparison of hybridization to the test probes with hybridization to the various controls that can be present.
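The qualitative effect of salt, temperature and formamide on stringency can be illustrated numerically. The sketch below uses a widely quoted empirical approximation for the melting temperature of long DNA:DNA hybrids (the Meinkoth and Wahl form); this formula is not taken from the present disclosure, it is only an approximation, and the function name and example sequence are hypothetical.

```python
import math


def estimate_tm(seq, na_molar, formamide_pct=0.0):
    """Approximate melting temperature (deg C) of a long DNA:DNA duplex.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(% formamide) - 500/L

    na_molar is the monovalent cation concentration in mol/L, and L is the
    duplex length in base pairs. The approximation is not intended for
    short oligonucleotides.
    """
    seq = seq.upper()
    length = len(seq)
    gc_pct = 100.0 * (seq.count("G") + seq.count("C")) / length
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_pct
            - 0.61 * formamide_pct
            - 500.0 / length)


# Hypothetical 100-nucleotide probe with 50% GC content.
probe = "ATGC" * 25
for salt in (1.0, 0.1):
    print(salt, "M Na+:", round(estimate_tm(probe, salt), 1), "deg C")
print("0.1 M Na+, 50% formamide:", round(estimate_tm(probe, 0.1, 50.0), 1), "deg C")
```

Lowering the salt concentration or adding formamide lowers the estimated melting temperature, so a wash at a fixed temperature becomes more stringent, which is consistent with the description above.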
In general, there is a tradeoff between hybridization specificity (stringency) and signal intensity. Thus, in a preferred embodiment, the wash is performed at the highest stringency that produces consistent results and that provides a signal intensity greater than approximately 10% of the background intensity. Thus, in a preferred embodiment, the hybridized array may be washed at successively higher stringency solutions and read between each wash. Analysis of the data sets thus produced will reveal a wash stringency above which the hybridization pattern is not appreciably altered and which provides adequate signal for the particular probes of interest.
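The selection rule described above, i.e., choosing the most stringent wash that still gives consistent results and a signal above a fraction of background, can be sketched as follows. The data structure, field names and intensity values are hypothetical and serve only to illustrate the selection logic; the 10% figure is the approximate threshold stated in the text.

```python
def pick_wash(washes, min_ratio=0.10):
    """Return the label of the most stringent acceptable wash, or None.

    `washes` is a list of dicts ordered from lowest to highest stringency.
    Each dict has hypothetical keys: 'label', 'signal', 'background' and
    'consistent' (True if replicate reads agreed). A wash is acceptable if
    it is consistent and its signal exceeds min_ratio * background.
    """
    chosen = None
    for wash in washes:
        if wash["consistent"] and wash["signal"] > min_ratio * wash["background"]:
            chosen = wash["label"]  # later entries are more stringent
    return chosen


# Made-up intensity readings, for illustration only.
series = [
    {"label": "1x SSPE-T", "signal": 900.0, "background": 100.0, "consistent": True},
    {"label": "0.5x SSPE-T", "signal": 300.0, "background": 100.0, "consistent": True},
    {"label": "0.25x SSPE-T", "signal": 5.0, "background": 100.0, "consistent": True},
]
print(pick_wash(series))
```

With these made-up readings the intermediate wash is selected, because the most stringent wash leaves a signal below the threshold.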
Methods of optimizing hybridization conditions are well known to those of skill in the art (see, e.g., Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, Elsevier, N.Y.).
Labeling and detection of nucleic acids.
In a preferred embodiment, the hybridized nucleic acids are detected by detecting one or more labels attached to the sample or probe nucleic acids. The labels may be incorporated by any of a number of means well known to those of skill in the art. Means of attaching labels to nucleic acids include, for example, nick translation or end-labeling (e.g. with a labeled RNA) by kinasing of the nucleic acid and subsequent attachment (ligation) of a nucleic acid linker joining the sample nucleic acid to a label (e.g., a fluorophore). A wide variety of linkers for the attachment of labels to nucleic acids are also known. In addition, intercalating dyes and fluorescent nucleotides can also be used.
Detectable labels suitable for use in the present invention include any composition detectable by spectroscopic, photochemical, biochemical, immunochemical, electrical, optical or chemical means. Useful labels in the present invention include biotin for staining with labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads™), fluorescent dyes (e.g., fluorescein, texas red, rhodamine, green fluorescent protein, and the like, see, e.g., Molecular Probes, Eugene, Oregon, USA), radiolabels (e.g., 3H, 125I, 35S, 14C, or 32P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase and others commonly used in an ELISA), and colorimetric labels such as colloidal gold (e.g., gold particles in the 40-80 nm diameter size range scatter green light with high efficiency) or colored glass or plastic (e.g., polystyrene, polypropylene, latex, etc.) beads. Patents teaching the use of such labels include U.S. Patent Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241.
A fluorescent label is preferred because it provides a very strong signal with low background. It is also optically detectable at high resolution and sensitivity through a quick scanning procedure. The nucleic acid samples can all be labeled with a single label, e.g., a single fluorescent label. Alternatively, in another embodiment, different nucleic acid samples can be simultaneously hybridized where each nucleic acid sample has a different label. For instance, one target could have a green fluorescent label and a second target could have a red fluorescent label. The scanning step will distinguish sites of binding of the red label from those binding the green fluorescent label. Each nucleic acid sample (target nucleic acid) can be analyzed independently from one another. Suitable chromogens which can be employed include those molecules and compounds which absorb light in a distinctive range of wavelengths so that a color can be observed or, alternatively, which emit light when irradiated with radiation of a particular wavelength or wavelength range, e.g., fluorescers.
Desirably, fluorescers should absorb light above about 300 nm, preferably above about 350 nm, and more preferably above about 400 nm, usually emitting at wavelengths greater than about 10 nm higher than the wavelength of the light absorbed. It should be noted that the absorption and emission characteristics of the bound dye can differ from the unbound dye. Therefore, when referring to the various wavelength ranges and characteristics of the dyes, it is intended to indicate the dyes as employed and not the dye which is unconjugated and characterized in an arbitrary solvent.
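The wavelength criteria above can be expressed as a simple screening check. In the Python sketch below, the function name and the example dye values are hypothetical; the default thresholds correspond to the broadest criteria stated in the text (absorption above about 300 nm and emission more than about 10 nm above the absorption wavelength), and the values supplied should be those of the dye as conjugated.

```python
def meets_fluorescer_criteria(absorb_nm, emit_nm,
                              min_absorb_nm=300.0, min_shift_nm=10.0):
    """Return True if a conjugated dye absorbs above min_absorb_nm and emits
    more than min_shift_nm above its absorption wavelength."""
    return absorb_nm > min_absorb_nm and (emit_nm - absorb_nm) > min_shift_nm


# Hypothetical absorption/emission maxima in nm, for illustration only.
candidates = {
    "dye A": (495.0, 520.0),
    "dye B": (280.0, 340.0),
    "dye C": (500.0, 505.0),
}
for name, (absorb, emit) in candidates.items():
    print(name, meets_fluorescer_criteria(absorb, emit))

# The more preferred criterion (absorption above about 400 nm) can be
# applied by passing min_absorb_nm=400.0.
```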
Fluorescers are generally preferred because by irradiating a fluorescer with light, one can obtain a plurality of emissions. Thus, a single label can provide for a plurality of measurable events.
A detectable signal can also be provided by chemiluminescent and bioluminescent sources. Chemiluminescent sources include a compound which becomes electronically excited by a chemical reaction and can then emit light which serves as the detectable signal or donates energy to a fluorescent acceptor. Alternatively, luciferins can be used in conjunction with luciferase or lucigenins to provide bioluminescence. Spin labels are provided by reporter molecules with an unpaired electron spin which can be detected by electron spin resonance (ESR) spectroscopy. Exemplary spin labels include organic free radicals, transition metal complexes, particularly vanadium, copper, iron, and manganese, and the like. Exemplary spin labels also include nitroxide free radicals.
The label may be added to the target (sample) nucleic acid(s) prior to, or after the hybridization. So called "direct labels" are detectable labels that are directly attached to or incorporated into the target (sample) nucleic acid prior to hybridization. In contrast, so called "indirect labels" are joined to the hybrid duplex after hybridization. Often, the indirect label is attached to a binding moiety that has been attached to the target nucleic acid prior to the hybridization. Thus, for example, the target nucleic acid may be biotinylated before the hybridization. After hybridization, an avidin-conjugated fluorophore will bind the biotin bearing hybrid duplexes providing a label that is easily detected. For a detailed review of methods of labeling nucleic acids and detecting labeled hybridized nucleic acids see Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed., Elsevier, N.Y. (1993).
Fluorescent labels are easily added during an in vitro transcription reaction. Thus, for example, fluorescein labeled UTP and CTP can be incorporated into the RNA produced in an in vitro transcription.
The labels can be attached directly or through a linker moiety. In general, the site of label or linker-label attachment is not limited to any specific position. For example, a label may be attached to a nucleoside, nucleotide, or analogue thereof at any position that does not interfere with detection or hybridization as desired. For example, certain Label-ON Reagents from Clontech (Palo Alto, CA) provide for labeling interspersed throughout the phosphate backbone of an oligonucleotide and for terminal labeling at the 3' and 5' ends. As shown for example herein, labels can be attached at positions on the ribose ring or the ribose can be modified and even eliminated as desired. The base moieties of useful labeling reagents can include those that are naturally occurring or modified in a manner that does not interfere with the purpose to which they are put. Modified bases include but are not limited to 7-deaza A and G, 7-deaza-8-aza A and G, and other heterocyclic moieties.
It will be recognized that fluorescent labels are not to be limited to single species organic molecules, but include inorganic molecules, multi-molecular mixtures of organic and/or inorganic molecules, crystals, heteropolymers, and the like. Thus, for example, CdSe-CdS core-shell nanocrystals enclosed in a silica shell can be easily derivatized for coupling to a biological molecule (Bruchez et al (1998) Science, 281: 2013-2016). Similarly, highly fluorescent quantum dots (zinc sulfide-capped cadmium selenide) have been covalently coupled to biomolecules for use in ultrasensitive biological detection (Warren and Nie (1998) Science, 281: 2016-2018).
Amplification-based assays.
In another embodiment, amplification-based assays can be used to detect nucleic acids. In such amplification-based assays, the nucleic acid sequences act as a template in an amplification reaction (e.g., Polymerase Chain Reaction (PCR)). Detailed protocols for quantitative PCR are provided in Innis et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc. N.Y.
Other suitable amplification methods include, but are not limited to, ligase chain reaction (LCR) (see Wu and Wallace (1989) Genomics 4: 560, Landegren et al. (1988) Science 241: 1077, and Barringer et al. (1990) Gene 89: 117), transcription amplification (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173), and self-sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874).
Detection of C. pneumoniae gene expression
The nucleic acids of the invention can also be used to detect C. pneumoniae gene transcripts. Methods of detecting and/or quantifying gene transcripts using nucleic acid hybridization techniques are known to those of skill in the art (see Sambrook et al., supra). For example, a Northern transfer may be used for the detection of the desired mRNA directly. In brief, the mRNA is isolated from a given cell sample using, for example, an acid guanidinium-phenol-chloroform extraction method. The mRNA is then electrophoresed to separate the mRNA species and the mRNA is transferred from the gel to a nitrocellulose membrane. As with the Southern blots, labeled probes are used to identify and/or quantify the target mRNA. In another preferred embodiment, the gene transcript can be measured using amplification-based (e.g., PCR) methods as described above for directly assessing copy number of the target sequences.
Expression of C. pneumoniae proteins
The nucleic acids disclosed here can be used for recombinant expression of the proteins. In these methods, the nucleic acids encoding the proteins of interest are introduced into suitable host cells, followed by induction of the cells to produce large amounts of the protein. The invention relies on routine techniques in the field of recombinant genetics, well known to those of ordinary skill in the art. A basic text disclosing the general methods of use in this invention is Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd ed. 1989).
Standard transfection methods are used to produce prokaryotic, mammalian, yeast or insect cell lines which express large quantities of the desired polypeptide, which is then purified using standard techniques (see, e.g., Colley et al, J. Biol Chem. 264: 17619-17622, 1989; Guide to Protein Purification, supra).
The nucleotide sequences used to transfect the host cells can be modified to yield Chlamydia polypeptides with a variety of desired properties. For example, the polypeptides can vary from the naturally-occurring sequence at the primary structure level by amino acid insertions, substitutions, deletions, and the like. These modifications can be used in a number of combinations to produce the final modified protein chain.
The amino acid sequence variants can be prepared with various objectives in mind, including facilitating purification and preparation of the recombinant polypeptide. The modified polypeptides are also useful for modifying plasma half life, improving therapeutic efficacy, and lessening the severity or occurrence of side effects during therapeutic use. The amino acid sequence variants are usually predetermined variants not found in nature but exhibit the same immunogenic activity as the naturally occurring protein. In general, modifications of the sequences encoding the polypeptides may be readily accomplished by a variety of well-known techniques, such as site-directed mutagenesis (see Gillman & Smith, Gene 8:81-97 (1979); Roberts et al., Nature 328:731-734 (1987)). One of ordinary skill will appreciate that the effect of many mutations is difficult to predict. Thus, most modifications are evaluated by routine screening in a suitable assay for the desired characteristic. For instance, the effect of various modifications on the ability of the polypeptide to elicit a protective immune response can be easily determined using in vitro assays. For instance, the polypeptides can be tested for their ability to induce lymphoproliferation, T cell cytotoxicity, or cytokine production using standard techniques.
The particular procedure used to introduce the genetic material into the host cell for expression of the polypeptide is not particularly critical. Any of the well known procedures for introducing foreign nucleotide sequences into host cells may be used. These include the use of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection, plasmid vectors, viral vectors and any of the other well known methods for introducing cloned genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see Sambrook et al., supra). It is only necessary that the particular procedure utilized be capable of successfully introducing at least one gene into the host cell which is capable of expressing the gene. Any of a number of well known cells and cell lines can be used to express the polypeptides of the invention. For instance, prokaryotic cells such as E. coli can be used. Eukaryotic cells include yeast, Chinese hamster ovary (CHO) cells, COS cells, and insect cells. The particular vector used to transport the genetic information into the cell is also not particularly critical. Any of the conventional vectors used for expression of recombinant proteins in prokaryotic and eukaryotic cells may be used. Expression vectors for mammalian cells typically contain regulatory elements from eukaryotic viruses. The expression vector typically contains a transcription unit or expression cassette that contains all the elements required for the expression of the polypeptide DNA in the host cells. A typical expression cassette contains a promoter operably linked to the DNA sequence encoding a polypeptide and signals required for efficient polyadenylation of the transcript. The term "operably linked" as used herein refers to linkage of a promoter upstream from a DNA sequence such that the promoter mediates transcription of the DNA sequence.
The promoter is preferably positioned about the same distance from the heterologous transcription start site as it is from the transcription start site in its natural setting. As is known in the art, however, some variation in this distance can be accommodated without loss of promoter function. Following the growth of the recombinant cells and expression of the polypeptide, the culture medium is harvested for purification of the secreted protein. The media are typically clarified by centrifugation or filtration to remove cells and cell debris and the proteins are concentrated by adsorption to any suitable resin or by use of ammonium sulfate fractionation, polyethylene glycol precipitation, or by ultrafiltration. Other routine means known in the art may be equally suitable. Further purification of the polypeptide can be accomplished by standard techniques, for example, affinity chromatography, ion exchange chromatography, sizing chromatography, His6 tagging and Ni-agarose chromatography (as described in Dobeli et al., Mol. and Biochem. Parasit. 41:259-268 (1990)), or other protein purification techniques to obtain homogeneity. The purified proteins are then used to produce pharmaceutical compositions, as described below.
An alternative method of preparing recombinant polypeptides useful as vaccines involves the use of recombinant viruses (e.g., vaccinia). Vaccinia virus is grown in suitable cultured mammalian cells such as the HeLa S3 spinner cells, as described by Mackett et al, in DNA cloning Vol. II: A practical approach, pp. 191-211 (Glover, ed.).
Antibody Production
The proteins of the present invention can be used to produce antibodies specifically reactive with C. pneumoniae antigens. If isolated proteins are used, they may be recombinantly produced or isolated from Chlamydia cultures. Synthetic peptides made using the protein sequences may also be used.
Methods of production of polyclonal antibodies are known to those of skill in the art. In brief, an immunogen, preferably a purified protein, is mixed with an adjuvant and animals are immunized. When appropriately high titers of antibody to the immunogen are obtained, blood is collected from the animal and antisera are prepared. Further fractionation of the antisera to enrich for antibodies reactive to Chlamydia proteins can be done if desired (see Harlow & Lane, Antibodies: A Laboratory Manual (1988)). Polyclonal antisera are used to identify and characterize Chlamydia in the tissues of patients using, for instance, in situ techniques and immunoperoxidase test procedures described in Anderson et al., JAVMA 198:241 (1991) and Barr et al., Vet. Pathol. 28:110-116 (1991).
Monoclonal antibodies may be obtained by various techniques familiar to those skilled in the art. Briefly, spleen cells from an animal immunized with a desired antigen are immortalized, commonly by fusion with a myeloma cell (see Kohler & Milstein, Eur. J. Immunol. 6:511-519 (1976)). Alternative methods of immortalization include transformation with Epstein Barr Virus, oncogenes, or retroviruses, or other methods well known in the art. Colonies arising from single immortalized cells are screened for production of antibodies of the desired specificity and affinity for the antigen, and yield of the monoclonal antibodies produced by such cells may be enhanced by various techniques, including injection into the peritoneal cavity of a vertebrate host. Monoclonal antibodies produced in such a manner are used, for instance, in ELISA diagnostic tests, immunoperoxidase tests, immunohistochemical tests, for the in vitro evaluation of spirochete invasion, to select candidate antigens for vaccine development, protein isolation, and for screening genomic and cDNA libraries to select appropriate gene sequences.
Immunodiagnostic detection of C. pneumoniae infections
The present invention also provides methods for detecting the presence or absence of C. pneumoniae, or antibodies reactive with it, in a biological sample. For instance, antibodies specifically reactive with Chlamydia can be detected using either Chlamydia proteins or the isolates described here. The proteins and isolates can also be used to raise specific antibodies (either monoclonal or polyclonal) to detect the antigen in a sample. In addition, the nucleic acids disclosed and claimed here can be used to detect Chlamydia-specific sequences using standard hybridization techniques.
For a review of immunological and immunoassay procedures in general, see Basic and Clinical Immunology (Stites & Terr ed., 7th ed. 1991). The immunoassays of the present invention can be performed in any of several configurations, which are reviewed extensively in Enzyme Immunoassay (Maggio, ed., 1980); Tijssen, Laboratory Techniques in Biochemistry and Molecular Biology (1985). For instance, the proteins and antibodies disclosed here are conveniently used in ELISA, immunoblot analysis and agglutination assays.
In brief, immunoassays to measure anti-Chlamydia antibodies or antigens can be either competitive or noncompetitive binding assays. In competitive binding assays, the sample analyte (e.g., anti-Chlamydia antibodies) competes with a labeled analyte (e.g., anti-Chlamydia monoclonal antibody) for specific binding sites on a capture agent (e.g., isolated Chlamydia protein) bound to a solid surface. The concentration of labeled analyte bound to the capture agent is inversely proportional to the amount of free analyte present in the sample.
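The inverse relationship in the competitive format can be illustrated with a toy calculation. The sketch below is not taken from the specification: it assumes an idealized model in which labeled and sample analyte bind the capture agent with equal affinity, so the labeled analyte occupies capture sites in proportion to its share of the total analyte. The function name and all numbers are hypothetical.

```python
def labeled_bound(capture_sites: float, labeled: float, sample: float) -> float:
    """Labeled analyte bound to the capture agent in an idealized competitive assay.

    Assumptions (illustrative only): equal affinity for labeled and sample
    analyte, and binding limited by the smaller of the number of capture
    sites and the total amount of analyte present.
    """
    if capture_sites < 0 or labeled <= 0 or sample < 0:
        raise ValueError("amounts must be non-negative and labeled must be positive")
    total = labeled + sample
    occupied = min(capture_sites, total)  # cannot bind more analyte than is present
    return occupied * labeled / total     # labeled share of the occupied sites


if __name__ == "__main__":
    # The bound labeled signal falls as the amount of sample analyte rises.
    for sample_amount in (0.0, 10.0, 50.0, 200.0):
        print(sample_amount, round(labeled_bound(5.0, 10.0, sample_amount), 3))
```

In this simplified model the signal approaches strict inverse proportionality only when the sample analyte is in large excess over the labeled analyte; a real assay would be read against a standard curve.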
Noncompetitive assays are typically sandwich assays, in which the sample analyte is bound between two analyte-specific binding reagents. One of the binding agents is used as a capture agent and is bound to a solid surface. The second binding agent is labelled and is used to measure or detect the resultant complex by visual or instrument means.
A number of combinations of capture agent and labelled binding agent can be used. For instance, an isolated Chlamydia protein or culture can be used as the capture agent and labelled anti-human antibodies specific for the constant region of human antibodies can be used as the labelled binding agent. Goat, sheep and other non-human antibodies specific for human immunoglobulin constant regions (e.g., γ or μ) are well known in the art. Alternatively, the anti-human antibodies can be the capture agent and the antigen can be labelled.
Various components of the assay, including the antigen, anti-Chlamydia antibody, or anti-human antibody, may be bound to a solid surface. Many methods for immobilizing biomolecules to a variety of solid surfaces are known in the art. For instance, the solid surface may be a membrane (e.g., nitrocellulose), a microtiter dish (e.g., PVC or polystyrene) or a bead. The desired component may be covalently bound or noncovalently attached through nonspecific bonding.
Alternatively, the immunoassay may be carried out in liquid phase and a variety of separation methods may be employed to separate the bound labelled component from the unbound labelled components. These methods are known to those of skill in the art and include immunoprecipitation, column chromatography, adsorption, addition of magnetizable particles coated with a binding agent and other similar procedures.
An immunoassay may also be carried out in liquid phase without a separation procedure. Various homogeneous immunoassay methods are now being applied to immunoassays for protein analytes. In these methods, the binding of the binding agent to the analyte causes a change in the signal emitted by the label, so that binding may be measured without separating the bound from the unbound labelled component.
Western blot (immunoblot) analysis can also be used to detect the presence of antibodies to Chlamydia in the sample. This technique is a reliable method for confirming the presence of antibodies against a particular protein in the sample. The technique generally comprises separating proteins by gel electrophoresis on the basis of molecular weight, transferring the separated proteins to a suitable solid support (such as a nitrocellulose filter, a nylon filter, or derivatized nylon filter), and incubating the sample with the separated proteins. This causes specific target antibodies present in the sample to bind their respective proteins. Target antibodies are then detected using labelled anti-human antibodies.
The immunoassay formats described above employ labelled assay components. The label may be coupled directly or indirectly to the desired component of the assay according to methods well known in the art. A wide variety of labels may be used. The component may be labelled by any one of several methods. Traditionally a radioactive label incorporating 3H, 125I, 35S, 14C, or 32P was used. Non-radioactive labels include ligands which bind to labelled antibodies, fluorophores, chemiluminescent agents, enzymes, and antibodies which can serve as specific binding pair members for a labelled ligand. The choice of label depends on sensitivity required, ease of conjugation with the compound, stability requirements, and available instrumentation. Enzymes of interest as labels will primarily be hydrolases, particularly phosphatases, esterases and glycosidases, or oxidoreductases, particularly peroxidases. Fluorescent compounds include fluorescein and its derivatives, rhodamine and its derivatives, dansyl, umbelliferone, etc. Chemiluminescent compounds include luciferin, and 2,3-dihydrophthalazinediones, e.g., luminol. For a review of various labelling or signal producing systems which may be used, see U.S. Patent No. 4,391,904, which is incorporated herein by reference.
Non-radioactive labels are often attached by indirect means. Generally, a ligand molecule (e.g., biotin) is covalently bound to the molecule. The ligand then binds to an anti-ligand (e.g., streptavidin) molecule which is either inherently detectable or covalently bound to a signal system, such as a detectable enzyme, a fluorescent compound, or a chemiluminescent compound. A number of ligands and anti-ligands can be used. Where a ligand has a natural anti-ligand, for example, biotin, thyroxine, and cortisol, it can be used in conjunction with the labelled, naturally occurring anti-ligands. Alternatively, any haptenic or antigenic compound can be used in combination with an antibody.
Some assay formats do not require the use of labelled components. For instance, agglutination assays can be used to detect the presence of the target antibodies. In this case, antigen-coated particles are agglutinated by samples comprising the target antibodies. In this format, none of the components need be labelled and the presence of the target antibody is detected by simple visual inspection.
Pharmaceutical Compositions
The peptides or antibodies (typically monoclonal antibodies) of the present invention and pharmaceutical compositions thereof are useful for administration to mammals, particularly humans, to treat and/or prevent Chlamydia infections. Suitable formulations are found in Remington's Pharmaceutical Sciences, Mack Publishing Company, Philadelphia, PA, 17th ed. (1985). The immunogenic peptides or antibodies of the invention are administered prophylactically or to an individual already suffering from the disease. The peptide compositions are administered to a patient in an amount sufficient to elicit an effective immune response to Chlamydia. An effective immune response is one that inhibits infection. An amount adequate to accomplish this is defined as "therapeutically effective dose" or "immunogenically effective dose." Amounts effective for this use will depend on, e.g., the peptide composition, the manner of administration, the stage and severity of the disease being treated, the weight and general state of health of the patient, and the judgment of the prescribing physician, but generally range for the initial immunization (that is for therapeutic or prophylactic administration) from about 0.1 mg to about 1.0 mg per 70 kilogram patient, more commonly from about 0.5 mg to about 0.75 mg per 70 kg of body weight. Boosting dosages are typically from about 0.1 mg to about 0.5 mg of peptide using a boosting regimen over weeks to months depending upon the patient's response and condition. A suitable protocol would include injection at time 0, 4, 2, 6, 10 and 14 weeks, followed by further booster injections at 24 and 28 weeks.
For therapeutic use, administration should begin at the first sign of infection. This is followed by boosting doses until at least symptoms are substantially abated and for a period thereafter. In some circumstances, loading doses followed by boosting doses may be required. The resulting immune response helps to cure or at least partially arrest symptoms and/or complications. Vaccine compositions containing the peptides are administered prophylactically to a patient susceptible to or otherwise at risk of the infection.
The pharmaceutical compositions (containing either peptides or antibodies) are intended for parenteral or oral administration. Preferably, the pharmaceutical compositions are administered parenterally, e.g., subcutaneously, intradermally, or intramuscularly. Thus, the invention provides compositions for parenteral administration which comprise a solution of the immunogenic polypeptides dissolved or suspended in an acceptable carrier, preferably an aqueous carrier. A variety of aqueous carriers may be used, e.g., water, buffered water, 0.4% saline, 0.3% glycine, hyaluronic acid and the like. These compositions may be sterilized by conventional, well known sterilization techniques, or may be sterile filtered. The resulting aqueous solutions may be packaged for use as is, or lyophilized, the lyophilized preparation being combined with a sterile solution prior to administration. The compositions may contain pharmaceutically acceptable auxiliary substances as required to approximate physiological conditions, such as buffering agents, tonicity adjusting agents, wetting agents and the like, for example, sodium acetate, sodium lactate, sodium chloride, potassium chloride, calcium chloride, sorbitan monolaurate, triethanolamine oleate, etc. The compositions may also comprise carriers to enhance the immune response. Useful carriers are well known in the art, and include, e.g., KLH, thyroglobulin, albumins such as human serum albumin, tetanus toxoid, polyamino acids such as poly(lysine:glutamic acid), influenza, hepatitis B virus core protein, hepatitis B virus recombinant vaccine and the like. For solid compositions, conventional nontoxic solid carriers may be used which include, for example, pharmaceutical grades of mannitol, lactose, starch, magnesium stearate, sodium saccharin, talcum, cellulose, glucose, sucrose, magnesium carbonate, and the like. 
For oral administration, a pharmaceutically acceptable nontoxic composition is formed by incorporating any of the normally employed excipients, such as those carriers previously listed, and generally 10-95% of active ingredient, that is, one or more peptides of the invention, and more preferably at a concentration of 25%-75%.
As noted above, the peptide compositions are intended to induce an immune response to Chlamydia. Thus, compositions and methods of administration suitable for maximizing the immune response are preferred. For instance, peptides may be introduced into a host, including humans, linked to a carrier or as a homopolymer or heteropolymer of active peptide units from various Chlamydia proteins disclosed here. Alternatively, a "cocktail" of polypeptides can be used. A mixture of more than one polypeptide has the advantage of increased immunological reaction and, where different peptides are used to make up the polymer, the additional ability to induce antibodies to a number of epitopes.
The compositions also include an adjuvant. A number of adjuvants are well known to one skilled in the art. Suitable adjuvants include incomplete Freund's adjuvant, alum, aluminum phosphate, aluminum hydroxide, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-nor-muramyl-L-alanyl-D-isoglutamine (CGP 11637, referred to as nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (CGP 19835A, referred to as MTP-PE), and RIBI, which contains three components extracted from bacteria, monophosphoryl lipid A, trehalose dimycolate and cell wall skeleton (MPL+TDM+CWS) in a 2% squalene/Tween 80 emulsion. The effectiveness of an adjuvant may be determined by measuring the amount of antibodies directed against the immunogenic peptide.
The concentration of immunogenic peptides of the invention in the pharmaceutical formulations can vary widely, i.e. from less than about 0.1%, usually at or at least about 2% to as much as 20% to 50% or more by weight, and will be selected primarily by fluid volumes, viscosities, etc., in accordance with the particular mode of administration selected.
The peptides of the invention can also be expressed by attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of vaccinia virus as a vector to express nucleotide sequences that encode the peptides of the invention. Upon introduction into a host, the recombinant vaccinia virus expresses the immunogenic peptide, and thereby elicits an immune response. Vaccinia vectors and methods useful in immunization protocols are described in, e.g., U.S. Patent No. 4,722,848. Another vector is BCG (Bacille Calmette Guerin). BCG vectors are described in Stover et al. (Nature 351:456-460 (1991)). A wide variety of other vectors useful for therapeutic administration or immunization of the peptides of the invention, e.g., Salmonella typhi vectors and the like, will be apparent to those skilled in the art from the description herein. The DNA encoding one or more of the peptides of the invention can also be administered to the patient. This approach is described, for instance, in Wolff et al., Science 247:1465-1468 (1990) as well as U.S. Patent Nos. 5,580,859 and 5,589,466.
In order to enhance serum half-life, the peptides may also be encapsulated, introduced into the lumen of liposomes, prepared as a colloid, or other conventional techniques may be employed which provide an extended serum half-life of the peptides. A variety of methods are available for preparing liposomes, as described in, e.g., Szoka et al., Ann. Rev. Biophys. Bioeng. 9:467 (1980), U.S. Pat. Nos. 4,235,871, 4,501,728 and 4,837,028.
EXAMPLES
The following examples are offered to illustrate, but not to limit, the claimed invention.
Example 1: This example describes a comparison of the C. pneumoniae genome disclosed here and the previously sequenced C. trachomatis genome (Stephens et al., Science 282:754-759 (1998)).
The apparent low level of DNA homology between C. trachomatis and C. pneumoniae (Campbell et al., J. Clin. Microbiol. 25:1911-1916 (1987)), together with their analogous cell structures and developmental cycles, predicts that comparative analysis of the two genomes will significantly enhance the understanding of both pathogens. Identification of genes that are present in one species but not the other is of particular importance for understanding the mutually exclusive biological, virulence and pathogenesis capabilities of each. Identification of genes shared between the two species strongly supports the requirement for these capabilities in a biological system that has, over its long-term association with mammalian host cells, evolved to reduce its metabolic capacities while optimizing survival, growth and transmission of these unique pathogens.
The previously sequenced C. trachomatis genome contains 1,042,519 nucleotides and 875 likely protein-coding genes. Similarity searching permitted the inferred functional assignment of 636 (60%) of the genes disclosed here; a further 251 (23%) are similar to hypothetical genes from other bacterial organisms, including C. trachomatis. The remaining 186 (17%) genes are not homologous to sequences deposited in GenBank. Seventy C. trachomatis genes are not represented in the C. pneumoniae genome. These occur as blocks of 2-17 genes and as 19 single genes. Of the 70 C. trachomatis genes without homologs in C. pneumoniae, 60 are classified as encoding hypothetical proteins. The remaining genes not represented in C. pneumoniae consist of the tryptophan operon (trpA,B,R), trpC, two predicted thiol protease genes, and 4 genes assigned to the phospholipase-D superfamily. It is evident that there is a high level of functional conservation between C. pneumoniae and C. trachomatis, as orthologs to C. trachomatis genes were identified for 859 (80%) of the predicted coding sequences for C. pneumoniae. The level of similarity for individual encoded proteins spans a wide spectrum (22-95% amino acid identity) with an average of 62% amino acid identity between orthologs from the two species. The percent amino acid identity between orthologous chlamydial proteins is similar among functional groups, with the highest for proteins associated with translation and the lowest for proteins whose function in chlamydiae is uncharacterized and not related to proteins encoded by other organisms. The gene order of the homologous set of genes in C. pneumoniae shows reorganization relative to the genome of C. trachomatis; however, there is a high level of synteny for the gene organization of the two genomes. We identified thirty-nine blocks of 2 or more genes whose gene organization is colinear with their C. trachomatis homologs, although some of these are inverted.
Genome reorganization is not evenly distributed along the chromosome: the region between C. pneumoniae coding sequences 0130-0300 contains substantially more reorganization than other areas of the genome. This region coincides with the predicted chromosome replication terminus.
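One simple way to make the notion of a colinear block concrete is sketched below. The specification does not state how the blocks were computed, so this is only an assumed definition: each C. pneumoniae gene, taken in chromosomal order, is mapped to the chromosomal rank of its C. trachomatis ortholog (or `None` if it has none), and a block is a maximal run of adjacent genes whose ortholog ranks step by +1 (same orientation) or by -1 (inverted). The function name and the example ranks are hypothetical.

```python
from typing import List, Optional, Tuple


def colinear_blocks(
    ortholog_rank: List[Optional[int]], min_len: int = 2
) -> List[Tuple[int, int, bool]]:
    """Find maximal runs of adjacent genes whose ortholog ranks are consecutive.

    Returns (first_index, last_index, inverted) for every run of at least
    `min_len` genes. `inverted` is True when the ranks decrease along the run.
    """
    blocks: List[Tuple[int, int, bool]] = []
    n = len(ortholog_rank)
    i = 0
    while i < n:
        if ortholog_rank[i] is None:      # gene without an ortholog breaks any run
            i += 1
            continue
        j, step = i, 0
        while j + 1 < n and ortholog_rank[j + 1] is not None:
            delta = ortholog_rank[j + 1] - ortholog_rank[j]
            if delta not in (1, -1):      # not adjacent in the other genome
                break
            if step == 0:
                step = delta              # orientation is fixed by the first pair
            elif delta != step:           # orientation flips, so the run ends
                break
            j += 1
        if j - i + 1 >= min_len:
            blocks.append((i, j, step == -1))
        i = j + 1
    return blocks


if __name__ == "__main__":
    # Hypothetical ranks: a forward block, a gene without ortholog,
    # an inverted block, then another forward block.
    ranks = [0, 1, 2, None, 9, 8, 7, 4, 5]
    print(colinear_blocks(ranks))
```

A greedy left-to-right scan like this is enough for illustration; a real analysis would also have to handle paralogs and small insertions within otherwise colinear regions.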
We identified orthologs of enzymes characterized in other bacteria that account for the essential requirements for DNA replication, repair, transcription and translation, including two predicted DNA helicases of the Swi2/Snf2 family found in C. trachomatis. Similar to C. trachomatis, alternative sigma subunits for RNA polymerase, σ28 and σ54, were identified in addition to anti-σ regulatory system factors RsbV, an RsbW-like single-domain histidine kinase, and an RsbU-like protein phosphatase. These findings suggest that the fundamental mechanisms of transcriptional regulation are conserved among Chlamydia. The C. trachomatis proteins containing SET and SWIB domains, and a SWIB domain fused to the C-terminus of the chlamydial topoisomerase I, not identified outside eukaryotes, are found in C. pneumoniae, supporting their possible role in the chromatin condensation-decondensation characteristic of the biologically unique chlamydial developmental cycle.
The central metabolic pathways inferred from the C. pneumoniae genome sequence are the same as those identified for C. trachomatis. C. pneumoniae has a glycolytic pathway and a linked tricarboxylic acid cycle that, although likely functional, is incomplete, as genes for citrate synthase, aconitase, and isocitrate dehydrogenase were not identified. C. pneumoniae has a complete glycogen synthesis and degradation system, supporting a role for glycogen synthesis and utilization of glucose-derivatives in chlamydial metabolism. Genes encoding essential functions in aerobic respiration are present and electron flux may be supported by pyruvate, succinate, glycerol-3-phosphate, and NADH dehydrogenases, NADH-ubiquinone oxidoreductase and cytochrome oxidase. C. pneumoniae also contains the V (vacuolar)-type ATPase operon and the two ATP translocases found in C. trachomatis.
The type-III secretion virulence system required for invasion by several pathogenic bacteria and found in the C. trachomatis genome in three chromosomal locations is also present in the C. pneumoniae genome. Each of the components is conserved and their relative genomic contexts are conserved. Genes such as a predicted serine/threonine protein kinase and other genes physically linked to genes encoding structural components of the type-III secretion apparatus, but without identified homologs, are also highly similar between the two species, suggesting the functional roles in modifying cellular biology are fundamentally conserved.
Chlamydia-encoded proteins that are not found in the chlamydial organisms themselves but are localized to the intracellular chlamydial inclusion membrane are likely essential for the unique intracellular biology and perhaps differences in inclusion morphology observed between species of Chlamydia. Several such proteins, termed IncA, IncB and IncC, have been characterized for a C. psittaci strain (Rockey et al., Mol. Microbiol. 15:617-626 (1995); Rockey et al., Infect. Immun. 62:106-112 (1994)). C. pneumoniae and C. trachomatis encode orthologs to C. psittaci IncB and IncC, and C. trachomatis also contains an ortholog to IncA. C. pneumoniae contains two genes that encode proteins with similarity to IncA (CPn0186 and CPn0585), although the level of homology is low, suggesting analogous but possibly altered functions.
The tryptophan biosynthesis operon (trpA, trpB, trpR) and trpC identified in C. trachomatis are conspicuously missing in the C. pneumoniae genome. These represent the entire repertoire of genes associated with tryptophan biosynthesis identified in C. trachomatis. Seventeen genes adjacent to the C. trachomatis tryptophan operon also were not found in the C. pneumoniae genome. This region is the single largest loss of a contiguous genomic segment and includes 4 HKD superfamily encoding genes that encompass a family of proteins related to endonuclease and phospholipase D. These findings may be important for the ability of Chlamydia to persist in their hosts and cause disease by eliciting potent, focal and persistent inflammatory responses thought to be essential for pathogenesis.
The C. pneumoniae genome contains 187,711 additional nucleotides compared to the C. trachomatis genome, and the 214 coding sequences not found in C. trachomatis account for most of the increased genome size. Eighty-eight of these genes are found in blocks of >10 genes (11-30 genes/block), 41 are single genes, and the remainder are partnered with at least one other gene. Based upon the observation that ~70% of all the C. pneumoniae genes have an identifiable homolog in GenBank, exclusive of C. trachomatis, it would be expected that over 150 of the 214 genes should have a homolog in GenBank, many associated with a function. However, only 28 coding sequences have similarity to genes from other organisms. Thus the majority of the genes that are mutually exclusive of C. trachomatis (186 of 214), and the 60 of 70 C. trachomatis genes that lacked an identifiable homolog in C. pneumoniae, do not have detectable homologs to genes from other organisms. We predict that most of the unique genes are essential for specific attributes that define the differential biology, tropism and pathogenesis of C. trachomatis and C. pneumoniae. Moreover, this suggests that C. pneumoniae has more unique biological (i.e., virulence) capacity than C. trachomatis. The ability of C. pneumoniae to be more invasive and survive in a broader range of host cell types than C. trachomatis is consistent with this hypothesis. Not all of the differences in biological capacity may be associated with mutually exclusive genes. One explanation for the significantly lower level of homology between protein sequences assigned as having C. pneumoniae and C. trachomatis orthologs but no identifiable orthologs in other organisms is that this set of proteins is not only associated with biological requirements specific for Chlamydia but that this polymorphism may account for differential biology between the two species. The determination of the genome sequence from a representative of the C. psittaci group will precisely delineate those genes that are mutually exclusive and specific for each species.
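The gene counts and genome sizes quoted in this example can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; treating the sum of the three similarity classes (636 + 251 + 186) as the total number of predicted C. pneumoniae coding sequences is an assumption, since the text does not state that total explicitly. The function and key names are hypothetical.

```python
def summarize() -> dict:
    """Cross-check the counts quoted in Example 1 (all inputs are from the text)."""
    assigned, similar_to_hypothetical, no_homolog = 636, 251, 186
    # Assumption: the three classes partition all predicted C. pneumoniae genes.
    total_cds = assigned + similar_to_hypothetical + no_homolog

    not_in_ctr = 214                      # coding sequences not found in C. trachomatis
    with_ctr_ortholog = total_cds - not_in_ctr

    ctr_genome_nt = 1_042_519             # C. trachomatis genome size
    extra_nt = 187_711                    # additional nucleotides in C. pneumoniae

    return {
        "total_cds": total_cds,
        "with_ctr_ortholog": with_ctr_ortholog,
        "ortholog_percent": round(100 * with_ctr_ortholog / total_cds, 1),
        # About 70% of 214 genes would be expected to have a GenBank homolog.
        "expected_genbank_homologs": round(0.70 * not_in_ctr),
        # Only 28 of the 214 actually resemble genes from other organisms.
        "without_other_homolog": not_in_ctr - 28,
        "cpn_genome_nt": ctr_genome_nt + extra_nt,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Under that assumption the figures are mutually consistent: the implied total less the 214 unique coding sequences gives the 859 orthologs cited above, and the implied C. pneumoniae genome length is compatible with the largest coordinate listed in Table 1.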
The major functionally identifiable addition to the C. pneumoniae genome is a large expansion of genes encoding a new family of chlamydial polymorphic membrane proteins (Pmp), alone representing 22% of the increased coding capacity. While the C. trachomatis genome has 9 pmp genes, remarkably the C. pneumoniae genome contains 21 pmp genes. Most of these genes appear to be amplified in two regions of the genome, with three stand-alone genes. Interestingly, one of the stand-alone genes is most closely related to the C. trachomatis pmpD, which is the only stand-alone pmp gene in the C. trachomatis genome, and it is located in the same relative genomic context, suggesting an essential and conserved function for this paralog. Six Pmp-coding genes are presumably not functional, as five contain predicted coding frame-shifts and one is truncated. The amplification of this gene family and the confidently predicted frame-shifts suggest a specific molecular mechanism to promote functional or antigenic diversity. The biological role of this protein family remains enigmatic, although at least one of the proteins in C. psittaci related to this family is exposed on the chlamydial surface. While a function could not be assigned for most of the unique C. pneumoniae genes, several have significant similarity to genes from other organisms. Functional assignments could be made for genes encoding GMP synthetase, IMP dehydrogenase, UMP synthase, uridine kinase, biotin synthase pathway proteins, methylthioadenosine nucleosidase, a DNA glycosylase and aromatic amino acid hydroxylase. Thus a complete pathway was identified for biotin biosynthesis. The additional purine and pyrimidine salvage pathway genes presumably reflect metabolic limitations in one of the cell types that C. pneumoniae infects or differences in the ability of C. pneumoniae to transport precursor nucleosides or nucleotides. The addition of aromatic amino acid hydroxylase in C. pneumoniae is intriguing, especially in light of the loss of tryptophan biosynthetic genes and the inability to synthesize other amino acids including phenylalanine. Aromatic amino acid hydroxylases include three distinct enzymes that function, respectively, to oxidize phenylalanine to tyrosine, tyrosine to Dopa, and tryptophan to 5-hydroxytryptophan and serotonin. Although the chlamydial protein is similar to proteins of this family and incrementally more closely related to tryptophan hydroxylase, its specific function could not be confidently predicted. We hypothesize that it may be involved in C. pneumoniae virulence. Tryptophan hydroxylase has not been previously identified in bacteria and the origin of the chlamydial gene appears to be from eukaryotes. The functional role of an aromatic amino acid hydroxylase for C. pneumoniae is linked to the unique intracellular biology of this organism and may represent a key contribution to C. pneumoniae persistence and pathogenesis.
It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference in their entirety for all purposes.
Table 1 provides functional assignments of C. pneumoniae nonprotein-encoding genomic sequences. Table 2 provides functional assignments of protein coding sequences. Table 3 provides the amino acid sequences of the proteins corresponding to the coding sequences.
TABLE 1
Type  Start position (SEQ ID NO:1)  End position (SEQ ID NO:1)  Gene
Ori  841664  841396 (R)  Putative Origin of Repli
tmRNA  138493  138074 (R)  tmRNA
pRNA  607342  607649  Ribonuclease P RNA
rRNA  1000564  1002115  16S rRNA
rRNA  1002415  1005278  23S rRNA
rRNA  1005393  1005509  5S rRNA
tRNA  269070  269142  Ala tRNA_1
tRNA  164318  164389  Asn tRNA
tRNA  296224  296151 (R)  Asp tRNA
tRNA  836191  835119 (R)  Ala tRNA_2
tRNA  1030533  1030603  Cys tRNA
tRNA  784896  784822 (R)  Glu tRNA
tRNA  781680  781610 (R)  Gly tRNA_1
tRNA  961536  961607  Gly tRNA_2
tRNA  999949  1000023  His tRNA
tRNA  268992  269065  Ile tRNA
tRNA  672236  672318  Leu tRNA_1
tRNA  680178  680257  Leu tRNA_2
tRNA  715889  715971  Leu tRNA_3
tRNA  739403  739486  Leu tRNA_4
tRNA  1175863  1175944  Leu tRNA_5
tRNA  784994  784922 (R)  Lys tRNA
tRNA  843926  843999  Pro tRNA_2
tRNA  409922  409848 (R)  Pro tRNA_1
tRNA  631373  631445  Phe tRNA
tRNA  677337  677264 (R)  Arg tRNA_2
tRNA  807413  807341 (R)  Arg tRNA_3
tRNA  877473  877400 (R)  Arg tRNA_4
tRNA  462141  462214  Arg tRNA_1
tRNA  1085605  1085676  Gln tRNA
tRNA  786780  786708 (R)  Thr tRNA_3
tRNA  89728  89657 (R)  Thr tRNA_1
tRNA  293477  293405 (R)  Thr tRNA_2
tRNA  87522  87450 (R)  Met tRNA_1
tRNA  199301  199229 (R)  Met tRNA_2
tRNA  199390  199317 (R)  Met tRNA_3
tRNA  626904  626987  Ser tRNA_1
tRNA  708359  708440  Ser tRNA_2
tRNA  1142034  1142117  Ser tRNA_3
tRNA  1230028  1229945 (R)  Ser tRNA_4
tRNA  91070  90999 (R)  Trp tRNA
tRNA  293399  293317 (R)  Tyr tRNA
tRNA  296147  296075 (R)  Val tRNA_1
tRNA  1137389  1137462  Val tRNA_2
TABLE 2
Gene ID  Start  End  Strand  Description
CPn0001 282 4 R CT001 hypothetical protein
CPn0002 573 875 F gatC-Glu-tRNA Gln Amidotransferase (C subunit) - (CT002)
CPn0003 895 2370 F gatA-Glu tRNA Gln Amidotransferase- (CT003)
CPn0004 [illegible] [illegible] F gatB-Glu-tRNA Gln Amidotransferase (B subunit) - (CT004)
CPn0005 4127 [illegible] F pmp_1-Polymorphic Outer Membrane Protein G Family
CPn0006 7293 7141 R
CPn0007 7805 10496 F
CPn0008 10975 11685 F
CPn0009 11815 13119 F
CPn0010 13435 14325 F
CPn0010 14379 15746 F frame-shift with 0010
CPn0011 15892 16614 F
CPn0012 16644 18212 F
CPn0013 18584 21106 F pmp_2-Polymorphic Outer Membrane Protein G Family
CPn0014 21392 21922 F pmp_3-Polymorphic Outer Membrane Protein G Family
CPn0015 21835 24174 F pmp_3-PMP_3 (frame-shift with 0014)
CPn0016 24416 26188 F pmp_4-Polymorphic Outer Membrane Protein G Family
CPn0017 26094 27170 F pmp_4-PMP_4 (frame-shift with 0016)
CPn0018 27522 29003 F pmp_5-Polymorphic Outer Membrane Protein G Family
CPn0019 29007 30356 F pmp_5-PMP_5 (frame-shift with 0018)
CPn0020 32687 30603 R Predicted OMP [leader (14) peptide: outer membrane] - (CT351)
CPn0021 34410 32707 R Predicted OMP [leader (19) peptide] - (CT350)
CPn0022 34982 34395 R maf- (CT349)
CPn0023 36603 35014 R yjjK/alr-ABC Transporter Protein ATPase- (CT348)
CPn0024 37596 36661 R xerC-Integrase/recombinase- (CT347)
CPn0025 38604 37684 R elaC/atsA-Sulphohydrolase/Glycosulfatase- (CT346)
CPn0026 39625 38762 R CT345 hypothetical protein- (CT345)
CPn0027 42234 39778 R lon-Lon ATP-dependent Protease-(CT344)
CPn0028 43325 42543 R
CPn0029 43755 43390 R
CPn0030 43891 44529 F gcp_1-O-Sialoglycoprotein Endopeptidase_1- (CT343)
CPn0031 44711 44884 F rs21-S21 Ribosomal Protein- (CT342)
CPn0032 44923 46098 F dnaJ-Heat Shock Protein J-(CT341)
CPn0033 46138 48171 F pdhA&B/odbA&odbB-(pyruvate) oxoisovalerate Dehydrogenase Alpha & Beta Fusion- (CT340)
CPn0034 49457 48210 R
CPn0035 51029 49569 R CT339 hypothetical protein
CPn0036 51002 51796 F CT338 hypothetical protein
CPn0037 51792 52115 F ptsH-PTS Phosphocarrier Protein Hpr-(CT337)
CPn0038 52119 53831 F ptsI-PTS PEP Phosphotransferase-(CT336)
CPn0039 54250 53963 R ybaB-(CT335)
CPn0040 55643 54318 R dnaX_l-DNA Pol III Gamma and Tau_l- (CT334)
CPn0041 55996 57342 F
CPn0042 57403 58182 F
CPn0043 58447 60372 F
CPn0044 60419 60778 F
CPn0045 61069 62790 F
CPn0046 62790 63263 F
CPn0047 63455 63652 F
CPn0048 63687 65801 F *yqfF-Bs conserved hypothetical IM protein
CPn0049 66296 65817 R
CPn0050 66813 66499 R
CPn0051 66833 67111 F
CPn0052 68005 67304 R hemC-Porphobilinogen Deaminase- (CT299)
CPn0053 69344 67986 R sms-Sms Protein- (CT298)
CPn0054 70023 69313 R rnc-Ribonuclease III- (CT297)
CPn0055 70129 70590 F CT296 hypothetical protein
CPn0056 70953 72746 F mrsA-Phosphomannomutase- (CT295)
CPn0057 72934 73554 F sodM-Superoxide Dismutase (Mn) - (CT294)
CPn0058 73639 74562 F accD-AcCoA Carboxylase/Transferase Beta- (CT293)
CPn0059 [illegible] 75050 F dut-dUTP Nucleotidohydrolase- (CT292)
CPn0060 75055 75528 F ptsN_1-PTS IIA Protein- (CT291)
CPn0061 75534 76208 F ptsN_2-PTS IIA Protein + HTH DNA-Binding Domain- (CT290)
CPn0062 7630[illegible] 77690 F CT289 hypothetical protein
CPn0063 78346 7857[illegible] F
CPn006[illegible] [illegible] [illegible] F CT288 hypothetical protein
CPn0066 [illegible] [illegible] F
CPn0067 82953 84053 F
CPn0068 84903 84331 R CT360 hypothetical protein
CPn0069 85236 87086 F
CPn0070 87378 87208 R
CPn0071 88045 87599 R CT325 hypothetical protein
CPn0072 89061 88057 R CT324 hypothetical protein
CPn0073 89356 89574 F infA-Initiation Factor IF-1-(CT323)
CPn0074 89774 90955 F tufA-Elongation Factor Tu-(CT322)
CPn0075 91102 91350 F secE-preprotein translocase- (CT321)
CPn0076 91358 91903 F nusG-Transcriptional Antitermination- (CT320)
CPn0077 92013 92435 F rl11-L11 Ribosomal Protein- (CT319)
CPn0078 92465 93160 F rl1-L1 Ribosomal Protein- (CT318)
CPn0079 93179 93688 F rl10-L10 Ribosomal Protein- (CT317)
CPn0080 93735 94121 F rl7-L7/L12 Ribosomal Protein- (CT316)
CPn0081 94261 98016 F rpoB-RNA Polymerase Beta-(CT315)
CPn0082 98043 102221 F rpoC-RNA Polymerase Beta'-(CT314)
CPn0083 102332 103312 F tal-Transaldolase- (CT313)
CPn0084 103362 103751 F predicted ferredoxin- (CT312)
CPn0085 104506 103766 R CT311 hypothetical protein
CPn0086 104904 105527 F atpE-ATP Synthase Subunit E-(CT310)
CPn0087 105579 106376 F CT309 hypothetical protein
CPn0088 106373 108145 F atpA-ATP Synthase Subunit A-(CT308)
CPn0089 108153 109466 F atpB-ATP Synthase Subunit B-(CT307)
CPn0090 109454 110080 F atpD-ATP Synthase Subunit D-(CT306)
CPn0091 110074 112053 F atpI-ATP Synthase Subunit I-(CT305)
CPn0092 112151 112573 F atpK-ATP Synthase Subunit K-(CT304)
CPn0093 112509 113015 F CT303 hypothetical protein
CPn0094 113152 115971 F valS-Valyl tRNA Synthetase- (CT302)
CPn0095 116037 118790 F pknD-S/T Protein Kinase- (CT301)
CPn0096 124314 118837 R uvrA-Excinuclease ABC Subunit A-(CT333)
CPn0097 124555 126006 F pyk-Pyruvate Kinase- (CT332)
CPn0098 127491 126091 R htrB-Acyltransferase- (CT010)
CPn0099 127593 127865 F
CPn0100 129141 127882 R CT011 hypothetical protein
CPn0101 129932 129141 R ybbP family hypothetical protein- (CT012)
CPn0102 130123 131466 F cydA-Cytochrome Oxidase Subunit I-(CT013)
CPn0103 131480 132511 F cydB-Cytochrome Oxidase Subunit II-(CT014)
CPn0104 133875 132676 R CT017 hypothetical protein
CPn0105 134847 134029 R CT016 hypothetical protein
CPn0106 135091 136374 F phoH-ATPase- (CT015)
CPn0107 137162 136392 R CT058 hypothetical protein_1
CPn0108 137857 137303 R CT018
CPn0109 138655 141783 F ileS-Isoleucyl-tRNA Synthetase- (CT019)
CPn0110 143734 141827 R lepB-Signal Peptidase I-(CT020)
CPn0111 144686 143934 R CT021 hypothetical protein
CPn0112 144767 145093 F rl31-L31 Ribosomal Protein- (CT022)
CPn0113 145335 146405 F prfA-Peptide Chain Releasing Factor (RF-1) - (CT023)
CPn0114 146398 147261 F hemK-A/G specific methylase- (CT024)
CPn0115 147279 148622 F ffh-Signal Recognition Particle GTPase- (CT025)
CPn0116 148616 148972 F rs16-S16 Ribosomal Protein- (CT026)
CPn0117 148989 150071 F trmD-tRNA (guanine N-1)-Methyltransferase- (CT027)
CPn0118 150102 150464 F rl19-L19 Ribosomal Protein- (CT028)
CPn0119 150523 151164 F rnhB_1-Ribonuclease HII_1- (CT029)
CPn0120 151164 151778 F gmk-GMP Kinase- (CT030)
CPn0121 151778 152068 F CT031 hypothetical protein
CPn0122 152071 153723 F metG-Methionyl-tRNA Synthetase- (CT032)
CPn0123 155969 153774 R recD_1-Exodeoxyribonuclease V (Alpha Subunit)_1- (CT033)
CPn0124 156614 158068 F
CPn0125 158096 158605 F
CPn0126 158809 161085 F
CPn0127 162143 161130 R ytfF-Cationic Amino Acid Transporter- (CT034)
CPn0128 162277 163053 F bpll-Biotin Protein Ligase- (CT035)
CPn0129 163717 163064 R similarity to CT036
CPn0130 164245 163751 R
CPn0131 164549 165580 F
CPn0132 165587 166561 F
CPn0133 167334 166564 R CHLPS hypothetical protein- (CT109)
CPn0134 169098 167467 R groEL_1-HSP-60_1-(CT110)
CPn0135 169448 169143 R groES-10KDa Chaperonin- (CT111)
CPn0136 171401 169569 R pepF-Oligopeptidase- (CT112)
CPn0137 172254 171502 R ybgI-ACR family- (CT108)
CPn0138 174019 172700 R hemL-Glutamate-1-Semialdehyde-2,1-aminomutase- (CT210)
CPn0139 174656 174093 R yqgE-(CT210)
CPn0140 175110 174673 R yqdE-(CT212)
CPn0141 175802 175110 R rpiA-Ribose-5-P Isomerase A-(CT213)
CPn0142 176091 175816 R
CPn0143 177335 176214 R *yxjG_Bs_1 Hypothetical Protein
CPn0144 177963 180560 F clpB-Clp Protease ATPase- (CT113)
CPn0145 180777 182369 F CT114 hypothetical protein
CPn0146 182613 183095 F
CPn0147 183225 183671 F
CPn0148 183846 185702 F pkn1-S/T Protein Kinase- (CT145)
CPn0149 185715 187700 F dnlJ-DNA Ligase- (CT146)
CPn0150 187834 192444 F CT147 hypothetical protein
CPn0151 194142 192625 R mhpA-Monooxygenase- (CT148)
CPn0152 195265 194318 R CT149 hypothetical protein
CPn0153 195433 197892 F leuS-Leucyl tRNA Synthetase- (CT209)
CPn0154 197892 199202 F gseA-KDO Transferase- (CT208)
CPn0155 199691 199488 R
CPn0156 200117 199770 R
CPn0157 200723 200298 R
CPn0158 201430 200894 R
CPn0159 201772 201467 R
CPn0160 203791 202127 R pfkA_1-Fructose-6-P Phosphotransferase_1-(CT207)
CPn0161 204622 203798 R predicted acyltransferase family- (CT206)
CPn0162 205828 204803 R
CPn0163 206026 206394 F
CPn0164 206498 206998 F
CPn0165 206998 207582 F
CPn0166 207630 207962 F
CPn0167 208306 207977 R
CPn0168 208641 208417 R
CPn0169 209501 208710 R
CPn0170 211026 210025 R
CPn0171 212435 211149 R guaA-GMP Synthase
CPn0172 213177 212440 R *guaB/impD-Inosine monophosphate dehydrogenase (COOH-terminal region only)
CPn0173 213987 213715 R
CPn0174 214257 214724 F
CPn0175 214898 215275 F
CPn0176 215286 216518 F CT153 hypothetical protein
CPn0177 217459 216608 R
CPn0178 218052 217789 R
CPn0179 218403 218056 R
CPn0180 218851 218355 R
CPn0181 219175 218777 R
CPn0182 220695 219334 R accC-Biotin Carboxylase-(CT124)
CPn0183 221195 220695 R accB-Biotin Carboxyl Carrier Protein- (CT123)
CPn0184 221775 221221 R efp_1-Elongation Factor P_1-(CT122)
CPn0185 222451 221765 R rpe/araD-Ribulose-P Epimerase- (CT121)
CPn0186 222899 224068 F *similarity to Cps IncA_1-(CT119)
CPn0187 224248 225045 F predicted methylase-(CT133)
CPn0188 225111 226400 F CT132 hypothetical protein
CPn0189 226400 229825 F CT131 homolog-(Possible Transmembrane Protein)
CPn0190 229919 231274 F
CPn0191 231991 231314 R glnQ-ABC Amino Acid Transporter ATPase- (CT130)
CPn0192 232634 231984 R glnP-ABC Amino Acid Transporter Permease-(CT129)
CPn0193 233126 232686 R *argR-Arginine Repressor
CPn0194 233210 234241 F gcp_2-O-Sialoglycoprotein Endopeptidase_2- (CT197)
CPn0195 234190 235785 F oppA_1-Oligopeptide Binding Protein_1
CPn0196 235939 237519 F oppA_2-Oligopeptide Binding Protein_2-(CT198)
CPn0197 237578 238882 F oppA_3-Oligopeptide Binding Protein_3
CPn0198 239169 240746 F oppA_4-Oligopeptide Binding Protein_4
CPn0199 241042 241983 F oppB_1-Oligopeptide Permease_1- (CT199)
CPn0200 242017 242868 F oppC_1-Oligopeptide Permease_1- (CT200)
CPn0201 242864 243715 F oppD-Oligopeptide Transport ATPase- (CT201)
CPn0202 243715 244500 F oppF-Oligopeptide Transport ATPase- (CT202)
CPn0203 245008 245802 F
CPn0204 245817 246002 F
CPn0205 246133 246327 F
CPn0206 246409 247161 F CT203 hypothetical protein
CPn0207 247208 248617 F ybhI/sodiT1-Oxoglutarate/Malate Translocator-(CT204)
CPn0208 248953 250602 F pfkA_2-Fructose-6-P Phosphotransferase_2- (CT205)
CPn0209 251036 251272 F
CPn0210 252384 251440 R
CPn0211 252756 252463 R
CPn0212 254066 252888 R
CPn0213 254342 254190 R
CPn0214 255657 254446 R
CPn0215 257015 255759 R
CPn0216 257608 257174 R
CPn0217 257896 258579 F ypdP-(CT140)
CPn0218 259058 258582 R
CPn0219 259357 260472 F tgt-Queuine tRNA Ribosyl Transferase- (CT193)
CPn0220 260696 261238 F
CPn0221 261657 262064 F
CPn0222 262504 262842 F weak similarity to Bacteriophage CHP1 (Orf4)
CPn0223 262956 263333 F
CPn0224 263435 263674 F
CPn0225 263873 264541 F
CPn0226 264566 264967 F
CPn0227 265416 265009 R dsbB-Disulfide bond Oxidoreductase- (CT176)
CPn0228 266110 265412 R dsbG-Disulfide Bond Chaperone- (CT177)
CPn0229 266328 267560 F CT178 hypothetical protein
CPn0230 268253 267576 R CT179 hypothetical protein
CPn0231 268957 268253 R tauB-ABC Transport ATPase (Nitrate/Fe) - (CT180)
CPn0232 270122 269232 R similarity to 5'-Methylthioadenosine/S-Adenosylhomocysteine Nucleosidase
CPn0233 270424 270248 R
CPn0234 271240 270548 R CT181 hypothetical protein
CPn0235 271416 272177 F kdsB-deoxyoctulonosic Acid Synthetase- (CT182)
CPn0236 272156 273766 F pyrG-CTP Synthetase- (CT183)
CPn0237 273762 274214 F yggF Family- (CT184)
CPn0238 274303 275838 F zwf-Glucose-6-P Dehydrogenase- (CT185)
CPn0239 275899 276672 F devB-Glucose-6-P Dehydrogenase (DevB family) - (CT186)
CPn0240 277861 276698 R
CPn0241 279354 278203 R
CPn0242 279918 279487 R
CPn0243 280555 280133 R
CPn0244 280918 281556 F adk-Adenylate Kinase- (CT128)
CPn0245 281645 282499 F ydhO-Polysaccharide Hydrolase-Invasin Repeat Family- (CT127)
CPn0246 282952 282551 R rs9-S9 Ribosomal Protein- (CT126)
CPn0247 283415 282969 R rl13-L13 Ribosomal Protein- (CT125)
CPn0248 284327 283650 R ycfV/ybbA-ABC Transporter ATPase- (CT152)
CPn0249 285841 284333 R CT151 hypothetical protein
CPn0250 286057 285902 R rl33-L33 Ribosomal Protein- (CT150)
CPn0251 286060 287559 F conserved hypothetical protein
CPn0252 288112 287576 R CT144 hypothetical protein (frame-shift with 0253?)
CPn0253 288456 287950 R CT144 hypothetical protein_1
CPn0254 289262 288459 R CT143 hypothetical protein_1
CPn0255 290165 289329 R CT142 hypothetical protein_1
CPn0256 291264 290398 R CT144 hypothetical protein_2
CPn0257 292127 291267 R CT143 hypothetical protein_2
CPn0258 292534 292133 R CT142 hypothetical protein (frame-shift with 0259?)
CPn0259 292986 292441 R CT142 hypothetical protein_2
CPn0260 294045 293548 R secA_1-Protein Translocase Subunit_1- (CT141)
CPn0261 294302 295033 F ydaO-PP-Loop Superfamily ATPase- (CT217)
CPn0262 295091 295933 F surE-SurE-like Acid Phosphatase- (CT218)
CPn0263 296249 297136 F yqfU hypothetical protein- (CT221)
CPn0264 297730 297155 R ubiD-Phenylacrylate Decarboxylase- (CT220)
CPn0265 298620 297730 R ubiA-Benzoate Octaphenyltransferase- (CT219)
CPn0266 299184 299876 F
CPn0267 300122 300910 F
CPn0268 300935 301318 F
CPn0269 302450 301476 R Dipeptidase- (CT138)
CPn0270 303325 302468 R ywlC-SuA5 Superfamily-related Protein- (CT137)
CPn0271 303634 304362 F Lysophospholipase esterase- (CT136)
CPn0272 305233 304340 R dnaX_2-DNA Pol III Gamma and Tau_2- (CT187)
CPn0273 305844 305227 R tdk-Thymidylate Kinase- (CT188)
CPn0274 308353 305852 R gyrA_1-DNA Gyrase Subunit A_1-(CT189)
CPn0275 310786 308372 R gyrB_1-DNA Gyrase Subunit B_1-(CT190)
CPn0276 311137 310793 R CT191 hypothetical protein
CPn0277 311910 311404 R
CPn0278 312875 312060 R conserved outer membrane lipoprotein protein
CPn0279 313537 312875 R Possible ABC Transporter Permease Protein
CPn0280 314572 313550 R dppF-Dipeptide Transporter ATPase- (CT689)
CPn0281 315057 316103 F dhnA-Predicted 1,6-Fructose Bisphosphate Aldolase (dehydrin family)-(CT215)
CPn0282 316126 317529 F xasA/gadC-Amino Acid Transporter- (CT216)
CPn0283 318497 317532 R
CPn0284 319045 318551 R
CPn0285 320595 319051 R
CPn0286 322059 320650 R mgtE-Mg++ Transporter (CBS Domain) - (CT194)
CPn0287 324221 322089 R
CPn0288 325716 324571 R CT195 Hypothetical protein
CPn0289 325812 326996 F aaaT-Neutral Amino Acid (Glutamate) Transporter- (CT230)
CPn0290 327042 328523 F Na-dependent Transporter- (CT231 )
CPn0291 328667 329194 F incB-Inclusion Membrane Protein B-(CT232)
CPn0292 329228 329836 F incC-Inclusion Membrane Protein C-(CT233)
CPn0293 329949 332723 F CT234 hypothetical protein
CPn0294 333092 333502 F cAMP-Dependent Protein Kinase Regulatory Subunit- (CT235)
CPn0295 333863 333627 R acpP-Acyl Carrier Protein- (CT236)
CPn0296 334765 334022 R fabG-Oxoacyl (Carrier Protein) Reductase-(CT237)
CPn0297 335697 334774 R fabD-Malonyl Acyl Carrier Transacylase-(CT238)
CPn0298 336721 335717 R fabH-Oxoacyl Carrier Protein Synthase III-(CT239)
CPn0299 336816 337415 F recR-Recombination Protein- (CT240)
CPn0300 337783 340152 F yaeT-Omp85 Analog- (CT241)
CPn0301 340250 340762 F (OmpH-Like Outer Membrane Protein) -(CT242)
CPn0302 340787 341866 F lpxD-UDP Glucosamine N-Acyltransferase- (CT243)
CPn0303 342958 341921 R CT244 hypothetical protein
CPn0304 343133 344158 F pdhA/odpA-Pyruvate Dehydrogenase Alpha- (CT245)
CPn0305 344154 345137 F pdhB/odpB-Pyruvate Dehydrogenase Beta-(CT246)
CPn0306 345145 346431 F pdhC-Dihydrolipoamide Acetyltransferase- (CT247)
CPn0307 348986 346515 R glgP-Glycogen Phosphorylase- (CT248)
CPn0308 349234 349596 F similarity to CT249
CPn0309 350974 349595 R dnaA_1-Replication Initiation Protein_1- (CT250)
CPn0310 353433 351049 R 60IM-60kDa Inner Membrane Protein- (CT251)
CPn0311 354438 353575 R lgt-Prolipoprotein Diacylglycerol Transferase- (CT252)
CPn0312 354524 354976 F CT101 hypothetical protein
CPn0313 354990 355355 F acpS-Acyl-carrier Protein Synthase-(CT100)
CPn0314 356285 355353 R trxB-Thioredoxin Reductase-(CT099)
CPn0315 356977 358716 F rs1-S1 Ribosomal Protein- (CT098)
CPn0316 358820 360121 F nusA-N Utilization Protein A-(CT097)
CPn0317 360081 362750 F infB-Initiation Factor-2-(CT096)
CPn0318 362767 363126 F rbfA-Ribosome Binding Factor A-(CT095)
CPn0319 363175 363879 F truB-tRNA Pseudouridine Synthase- (CT094)
CPn0320 363860 364783 F ribF-FAD Synthase- (CT093)
CPn0321 365858 364767 R ychF-GTP Binding Protein- (CT092)
CPn0322 366249 367328 F yscU-YopS Translocation Protein U -(CT091)
CPn0323 367331 369460 F lcrD- Low Calcium Response D-(CT090)
CPn0324 369492 370688 F lcrE- Low Calcium Response E-(CT089)
CPn0325 370708 371148 F sycE-Secretion Chaperone- (CT088)
CPn0326 371148 372725 F malQ-Glucanotransferase- (CT087 )
CPn0327 372945 373211 F rl28-L28 Ribosomal Protein- (CT086)
CPn0328 373241 374992 F CT085 hypothetical protein
CPn0329 375088 376146 F Phospholipase D Superfamily [leader (33) peptide] -(CT084)
CPn0330 376675 376202 R CT083 hypothetical protein
CPn0331 378437 376701 R CT082 hypothetical protein
CPn0332 378655 378536 R CHLTR T2 Protein- (CT081)
CPn0333 379090 378800 R ltuB-(CT080)
CPn0334 379311 379823 F CT079 similarity
CPn0335 379817 380674 F folD-Methylene Tetrahydrofolate Dehydrogenase- (CT078)
CPn0336 380650 381591 F yojL-(CT077)
CPn0337 382027 381575 R smpB- Small Protein B-(CT076)
CPn0338 382278 383375 F dnaN-DNA Pol III (beta chain) - (CT075)
CPn0339 383420 384034 F recF-ABC superfamily ATPase- (CT074)
CPn0340 383842 384156 F (frame-shift with 0339)
CPn0341 384160 384495 F (frame-shift with 0340)
CPn0342 384622 385062 F predicted OMP [leader (19) peptide] -(CT073)
CPn0343 384999 385595 F (frame-shift with 0342?)
CPn0344 387420 385558 R yaeL-Metalloprotease- (CT072)
CPn0345 388572 387436 R yaeM-(CT071)
CPn0346 389675 388704 R troD/ytgD-Integral Membrane Protein- (CT070)
CPn0347 391021 389678 R troC/ytgC-Integral Membrane Protein- (CT069)
CPn0348 391803 391027 R troB/ytgB-ABC transporter ATPase- (CT068)
CPn0349 392770 391790 R troA/ytgA-Solute Protein Binding Family- (CT067)
CPn0350 393181 393684 F CT066 hypothetical protein
CPn0351 393888 395432 F adt_1-ADP/ATP Translocase_1- (CT065)
CPn0352 395574 396830 F
CPn0353 396893 397135 F
CPn0354 397167 398507 F
CPn0355 399889 398591 R
CPn0356 400459 400109 R
CPn0357 401317 400469 R
CPn0358 401751 401578 R
CPn0359 402012 403817 F lepA-GTPase- (CT064)
CPn0360 405358 403922 R gnd-6-Phosphogluconate Dehydrogenase- (CT063)
CPn0361 406647 405382 R tyrS-tyrosyl tRNA Synthetase- (CT062)
CPn0362 407825 407055 R fliA/rpsD-Sigma-28/WhiG Family- (CT061)
CPn0363 409688 407943 R flhA-Flagellar Secretion Protein- (CT060)
CPn0364 409966 410238 F fer4-Ferredoxin IV-(CT059)
CPn0365 410528 411544 F
CPn0366 411976 412440 F
CPn0367 413102 413836 F
CPn0368 413790 414107 F
CPn0369 414351 415562 F CT058 hypothetical protein_2
CPn0370 415800 416912 F CT058 hypothetical protein_3
CPn0371 417147 417503 F
CPn0372 417687 418001 F
CPn0373 418380 420218 F gcpE-(CT057)
CPn0374 420218 420961 F CT056 hypothetical protein
CPn0375 421121 421615 F
CPn0376 421854 422294 F
CPn0377 423438 422347 R sucB_1-Dihydrolipoamide Succinyltransferase_1- (CT055)
CPn0378 426168 423445 R sucA-Oxoglutarate Dehydrogenase- (CT054)
CPn0379 426322 426765 F CT053 hypothetical protein
CPn0380 426758 427876 F hemN_1-coproporphyrinogen III Oxidase_1- (CT052)
CPn0381 429809 428037 R CT326 similarity
CPn0382 430749 430036 R yabC/yraL-SAM-Dependent Methyltransferase- (CT048)
CPn0383 431693 430749 R CT047 hypothetical protein
CPn0384 432377 431862 R hctB-Histone-like Protein 2-(CT046)
CPn0385 434018 432522 R pepA-Leucyl Aminopeptidase A-(CT045)
CPn0386 434525 434046 R ssb-SS DNA Binding Protein- (CT044)
CPn0387 435196 434699 R CT043 hypothetical protein
CPn0388 435329 437320 F glgX-Glycogen Hydrolase (debranching) - (CT042)
CPn0389 438134 437319 R CT041 hypothetical protein
CPn0390 439144 438134 R ruvB-Holliday Junction Helicase- (CT040)
CPn0391 439692 439510 R
CPn0392 439814 440383 F dcd-dCTP Deaminase-(CT039)
CPn0393 440379 440723 F CT038 hypothetical protein
CPn0394 440736 441968 F tlyC_1-CBS Domain protein (Hemolysin Homolog)_1- (CT256)
CPn0395 441964 443175 F CT257 hypothetical protein
CPn0396 444353 443241 R yhfO-NifS-related protein- (CT258)
CPn0397 445115 444381 R PP2C phosphatase family- (CT259)
CPn0398 445533 445700 F
CPn0399 445879 446523 F CT253 hypothetical protein
CPn0400 446536 447306 F CT254 hypothetical protein
CPn0401 447884 447495 R CT255 hypothetical protein
CPn0402 448994 447888 R mutY-Adenine Glycosylase- (CT107)
CPn0403 449015 449710 F yceC-predicted pseudouridine synthetase family- (CT106)
CPn0404 450887 449871 R
CPn0405 451739 450966 R CT105 hypothetical protein
CPn0406 451969 452865 F fabI-Enoyl-Acyl-Carrier Protein Reductase- (CT104)
CPn0407 453742 452858 R HAD superfamily hydrolase/phosphatase- (CT103)
CPn0408 454105 454581 F CT102 hypothetical protein
CPn0409 454645 455127 F CT260 hypothetical protein
CPn0410 455123 455833 F dnaQ_1-DNA Pol III Epsilon Chain_1- (CT261)
CPn0411 455833 456609 F CT262 hypothetical protein
CPn0412 456590 457246 F CT263 hypothetical protein
CPn0413 459203 457227 R msbA-Transport ATP Binding Protein- (CT264)
CPn0414 460143 459172 R accA-AcCoA Carboxylase/Transferase Alpha- (CT265)
CPn0415 461498 460221 R CT266 hypothetical protein
CPn0416 461856 461557 R himD/ihfA-Integration Host Factor Alpha- (CT267)
CPn0417 463035 462244 R amiA-N-Acetylmuramoyl Alanine Amidase- (CT268)
CPn0418 464401 462953 R murE-N-Acetylmuramoylalanylglutamyl DAP Ligase- (CT269)
CPn0419 466834 464876 R pbp3- transglycolase/transpeptidase- (CT270)
CPn0420 467108 466824 R CT271 hypothetical protein
CPn0421 467998 467108 R yabC-PSP2B Family methyltransferase- (CT272)
CPn0422 468242 468784 F CT273 hypothetical protein
CPn0423 468791 469216 F CT274 hypothetical protein
CPn0424 469612 470961 F dnaA_2-Replication Initiation Factor_2- (CT275)
CPn0425 470980 471564 F CT276 hypothetical protein
CPn0426 472111 471536 R CT277 similarity
CPn0427 472207 473715 F nqr2-NADH (Ubiquinone) Dehydrogenase- (CT278)
CPn0428 473722 474681 F nqr3-NADH (Ubiquinone) Oxidoreductase, Gamma- (CT279)
CPn0429 474681 475319 F nqr4-NADH (Ubiquinone) Reductase 4-(CT280)
CPn0430 475326 476093 F nqr5-NADH (Ubiquinone) Reductase 5-(CT281)
CPn0431 476483 476151 R
CPn0432 476816 476514 R
CPn0433 477273 476929 R gcsH-Glycine Cleavage System H Protein- (CT282)
CPn0434 479462 477276 R CT283 hypothetical protein
CPn0435 480902 479475 R Phospholipase D superfamily [uncleavable leader peptide] - (CT284)
CPn0436 481618 480902 R lplA-Lipoate Protein Ligase-Like Protein- (CT285)
CPn0437 481816 484350 F clpC-ClpC Protease-(CT286)
CPn0438 485416 484334 R ycbF-PP-loop superfamily ATPase- (CT287)
CPn0439 485553 486077 F
CPn0440 486105 486740 F
CPn0441 486891 487838 F CT007 hypothetical protein
CPn0442 488013 488528 F CT006 hypothetical protein
CPn0443 488729 489979 F CT005 hypothetical protein
CPn0444 490287 494507 F pmp_6-Polymorphic Outer Membrane Protein G/I Family
CPn0445 494772 497579 F pmp_7-Polymorphic Outer Membrane Protein C Family
CPn0446 497626 500415 F pmp_8-Polymorphic Outer Membrane Protein G Family
CPn0447 500566 503351 F pmp_9-Polymorphic Outer Membrane Protein G/I Family
CPn0448 504810 503698 R *yxjG_Bs_2 Hypothetical Protein
CPn0449 507231 505330 R pmp_10-PMP_10 (Frame-shift with 0451)
CPn0450 508112 507180 R pmp_10-Polymorphic Outer Membrane Protein G Family
CPn0451 508275 511058 F pmp_11-polymorphic Outer Membrane Protein G Family
CPn0452 511319 512860 F pmp_12-Polymorphic Outer Membrane Protein A/I Family (truncated)
CPn0453 513234 516152 F pmp_13-Polymorphic Outer Membrane Protein G Family
CPn0454 516182 519115 F pmp_14-Polymorphic Outer Membrane Protein H Family
CPn0455 520348 519458 R
CPn0456 521532 520327 R
CPn0457 523865 522120 R
CPn0458 526320 524236 R
CPn0459 527005 526619 R
CPn0460 527840 526992 R
CPn0461 528638 527844 R
CPn0462 531052 529037 R
CPn0463 532357 531191 R
CPn0464 532842 532366 R
CPn0465 533212 532871 R
CPn0466 533724 536537 F pmp_15-Polymorphic Outer Membrane Protein E Family
CPn0467 536633 539434 F pmp_16-Polymorphic Outer Membrane Protein E Family
CPn0468 539632 540432 F pmp_17-Polymorphic Outer Membrane Protein E Family
CPn0469 540399 541460 F pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0469)
CPn0470 541357 542532 F pmp_17-Polymorphic Outer Membrane Protein (Frame-shift with 0470)
CPn0471 542564 545401 F pmp_18-Polymorphic Outer Membrane Protein E/F Family
CPn0472 547905 545581 R
CPn0473 549593 548070 R
CPn0474 551573 549807 R CT365 hypothetical protein
CPn0475 553844 551685 R glgB-Glucan Branching Enzyme- (CT866)
CPn0476 554844 553858 R CT865 hypothetical protein
CPn0477 556106 554844 R *yqeV_Bs Hypothetical Protein
CPn0478 557625 556210 R hflX-GTP Binding Protein- (CT379)
CPn0479 558425 557616 R phnP-Metal Dependent Hydrolase- (CT380)
CPn0480 559303 558650 R CT383 hypothetical protein
CPn0481 560946 559339 R
CPn0482 561737 560961 R artJ-Arginine Periplasmic Binding Protein- (CT381)
CPn0483 561836 564964 F
CPn0484 564970 565824 F aroG-Deoxyheptonate Aldolase- (CT382)
CPn0485 566038 566229 F CT382.1 hypothetical protein
CPn0486 567784 566405 R hypothetical proline permease
CPn0487 569740 568112 R CT384 hypothetical protein
CPn0488 570096 569767 R hitA-HIT Family Hydrolase- (CT385)
CPn0489 570965 570096 R CT386 hypothetical protein
CPn0490 571279 573333 F CT387 hypothetical protein
CPn0491 574352 573336 R CT389 hypothetical protein
CPn0492 574652 574804 F
CPn0493 575004 574855 R
CPn0494 575364 575146 R
CPn0495 575603 576793 F aspC-Aspartate Aminotransferase- (CT390)
CPn0496 576793 577812 F CT391 hypothetical protein
CPn0497 578089 577820 R CT388 hypothetical protein
proS-Prolyl tRNA Synthetase- (CT393 )
CPn0501 582457 583650 F hrcA-HTH Transcriptional Repressor- (CT394)
CPn0502 583650 584201 F grpE-HSP-70 Cofactor- (CT395)
CPn0503 584234 586213 F dnaK-HSP-70- (CT396)
CPn0504 586487 588514 F vacB-ribonuclease family- (CT397)
CPn0505 588519 589106 F *3-methyladenine DNA glycosylase
CPn0506 589172 589840 F CT421 hypothetical protein
CPn0507 589961 590122 F CT421.1 hypothetical protein
CPn0508 590142 590300 F CT421.2 hypothetical protein
CPn0509 590335 590808 F (predicted Metalloenzyme) -(CT422)
CPn0510 590813 591973 F tlyC_2-CBS Domains (Hemolysin homolog)_2- (CT423)
CPn0511 592141 592488 F rsbV_1-Sigma Regulatory Factor_1-(CT424)
CPn0512 592553 594412 F CT425 hypothetical protein
CPn0513 594647 595753 F Fe-S oxidoreductase_1-(CT426)
CPn0514 595729 596520 F CT427 hypothetical protein
CPn0515 596492 597181 F ubiE-Ubiquinone Methyltransferase- (CT428)
CPn0516 598814 597255 R
CPn0517 599631 598795 R
CPn0518 600803 599832 R CT429 hypothetical protein
CPn0519 601674 600904 R dapF-Diaminopimelate Epimerase-(CT430)
CPn0520 602218 601646 R clpP-CLP Protease- (CT431)
CPn0521 603797 602241 R glyA-Serine Hydroxymethyltransferase- (CT432)
CPn0522 603987 604655 F CT433 hypothetical protein
CPn0523 604723 605052 F
CPn0524 605103 606179 F
CPn0525 606522 607283 F CT398 hypothetical protein
CPn0526 608696 607710 R yrbH-GutQ/KpsF Family Sugar-P Isomerase-(CT399)
CPn0527 609904 608726 R sucB_2-Dihydrolipoamide Succinyltransferase_2- (CT400)
CPn0528 611162 609921 R gltT-Glutamate Symport- (CT401)
CPn0529 612259 611165 R ycaH-ATPase- (CT402)
CPn0530 613254 612460 R spoU_1-rRNA Methylase_1-(CT403)
CPn0531 614069 613245 R SAM dependent methyltransferase- (CT404)
CPn0532 614674 614075 R ribC/risA-Riboflavin Synthase- (CT405)
CPn0533 614930 615385 F CT406 hypothetical protein
CPn0534 615413 615784 F dksA-DnaK Suppressor- (CT407)
CPn0535 615793 616296 F lspA-Lipoprotein Signal Peptidase-(CT408)
CPn0536 616345 617691 F dagA_1-D-Ala/Gly Permease_1-(CT409)
CPn0537 617833 618189 F CT814.1 hypothetical protein
CPn0538 618212 618511 F CT814 hypothetical protein
CPn0539 618705 621545 F pmp_19-polymorphic outer membrane protein A Family -(CT412)
CPn0540 621694 626862 F pmp_20-polymorphic outer membrane protein B Family- (CT413)
CPn0541 627170 628003 F Solute binding protein (-yebL-Synechocystis Adhesin Homolog) -(CT415)
CPn0542 628003 628737 F ABC Transporter ATPase- (CT416)
CPn0543 628725 629603 F (Metal Transport Protein) -(CT417)
CPn0544 630529 629525 R yhbZ-GTP binding protein- (CT418)
CPn0545 630884 630633 R rl27-L27 ribosomal protein- (CT419)
CPn0546 631229 630912 R rl21-L21 Ribosomal Protein- (CT420)
CPn0547 631661 632188 F ygbB family- (CT434)
CPn0548 633231 632191 R cysJ-Sulfite Reductase- (CT435)
CPn0549 633569 633255 R rs10-S10 Ribosomal Protein- (CT436)
CPn0550 635661 633580 R fusA-Elongation Factor G-(CT437)
CPn0551 636168 635696 R rs7-S7 Ribosomal Protein- (CT438)
CPn0552 636587 636219 R rs12-S12 Ribosomal Protein- (CT439)
CPn0553 637747 636812 R
CPn0554 637854 638141 F CT440 hypothetical protein
CPn0555 638298 640241 F tsp-Tail-Specific Protease- (CT441)
CPn0556 640912 640325 R crpA-15kDa Cysteine-Rich Protein- (CT442)
CPn0557 642861 641194 R omcB-60kDa Cysteine-Rich Outer Membrane Complex Protein- (CT443)
CPn0558 643300 643031 R omcA-9kDa-Cysteine-Rich Outer Membrane Complex Lipoprotein- (CT444)
CPn0559 643742 643927 F CT441.1 hypothetical protein
CPn0560 645612 644098 R gltX-Glutamyl-tRNA Synthetase- (CT445)
CPn0561 646404 645871 R euo-CHLPS Euo Protein- (CT446)
CPn0562 648036 646918 R CHLPS 43 kDa protein homolog_1
CPn0563 650056 648293 R recJ-ssDNA Exonuclease-(CT447)
CPn0564 654350 650145 R secD&secF-Protein Export Proteins SecD/SecF (fusion) -(CT448)
CPn0565 655630 654533 R CT449 hypothetical protein
CPn0566 656141 656890 F yaeS family- (CT450)
CPn0567 656894 657817 F cdsA-Phosphatidate Cytidylyltransferase- (CT451)
CPn0568 657817 658464 F cdsA-Phosphatidate Cytidylyltransferase- (CT452)
CPn0569 658464 659099 F plsC-Glycerol-3-P Acyltransferase- (CT453)
CPn0570 659107 660789 F argS-Arginyl tRNA Transferase- (CT454)
CPn0571 662122 660749 R murA-UDP-N-Acetylglucosamine Transferase- (CT455)
CPn0572 662352 664616 F CT456 hypothetical protein
CPn0573 665404 664691 R yebC family- (CT457)
CPn0574 665945 665394 R
CPn0575 666494 665982 R yhhY-Amino Group Acetyl Transferase- (CT458)
CPn0576 667543 666494 R prfB-Peptide Chain Release Factor 2 (natural UGA frame-shift)-(CT459)
CPn0576 667598 667530 R prfB-(natural UGA frame-shift)
CPn0577 667895 668155 F SWIB (YM74) complex protein- (CT460)
CPn0578 668406 669365 F yaeI-phosphohydrolase- (CT461)
CPn0579 669361 669993 F ygbP/yacM-Sugar Nucleotide Phosphorylase-(CT462)
CPn0580 669993 670793 F truA-Pseudouridylate Synthase I-(CT463)
CPn0581 671434 670745 R Phosphoglycolate Phosphatase- (CT464)
CPn0582 671503 672177 F CT465 hypothetical protein
CPn0583 672400 672717 F CT466 hypothetical protein
CPn0584 672707 673798 F atoS/ntrB-2-Component Sensor- (CT467)
CPn0585 675817 673865 R similarity to Cps IncA_2
CPn0586 676026 677183 F atoC/ntrC-2-Component Regulator- (CT468)
CPn0587 677441 678124 F *yvyD_Bs conserved hypothetical protein
CPn0588 678084 678626 F CT469 hypothetical protein
CPn0589 678640 679395 F CT470 hypothetical protein
CPn0590 680112 679516 R CT471 hypothetical protein
CPn0591 680373 681020 F yagE family- (CT472)
CPn0592 681153 681461 F yidD family- (CT473)
CPn0593 682476 681391 R CT474 hypothetical protein
CPn0594 682583 684958 F pheT-phenylalanyl tRNA Synthetase Beta-(CT475)
CPn0595 684958 685926 F CT476 hypothetical protein
CPn0596 685939 686457 F ada-methyltransferase- (CT477)
CPn0597 688215 686479 R oppC_2-Oligopeptide Permease_2- (CT478)
CPn0598 689697 688219 R oppB_2-Oligopeptide Permease_2- (CT479)
CPn0599 691802 689682 R oppA_5-oligopeptide Binding Lipoprotein_5- (CT480)
CPn0600 692147 691827 R
CPn0601 693053 692736 R CT483 hypothetical protein
CPn0602 694105 693104 R CT484 hypothetical protein
CPn0603 694205 695185 F hemZ-Ferrochelatase- (CT485)
CPn0604 695945 695196 R fliY-Glutamine Binding Protein- (CT486)
CPn0605 696707 696150 R yhhF-Methylase -(CT487)
CPn0606 697444 696707 R CT488 hypothetical protein
CPn0607 698895 697573 R glgC-Glucose-1-P Adenyltransferase- (CT489)
CPn0608 699645 699016 R pyrF-Uridine 5'-Monophosphate Synthase (Ump Synthase) -truncated?
CPn0609 699705 699986 F CT490 hypothetical protein
CPn0610 701420 700029 R rho-Transcription Termination Factor-(CT491)
CPn0611 702025 701420 R yacE-predicted phosphatase/kinase- (CT492)
CPn0612 704631 702022 R polA-DNA Polymerase I-(CT493)
CPn0613 705656 704658 R sohB-Protease- (CT494)
CPn0614 707402 705783 R adt_2-ADP/ATP Translocase_2- (CT495)
CPn0615 708137 707634 R pgsA_1-Glycerol-3-P Phosphatidyltransferase_1-(CT496)
CPn0616 708791 710137 F dnaB-Replicative DNA Helicase-(CT497)
CPn0617 710484 712316 F gidA-FAD-dependent oxidoreductase-(CT498)
CPn0618 712306 713010 F lplA-Lipoate-Protein Ligase A-(CT499)
CPn0619 713444 713013 R ndk-Nucleoside-2-P Kinase-(CT500)
CPn0620 714139 713519 R ruvA-Holliday Junction Helicase-(CT501)
CPn0621 714647 714144 R ruvC-Crossover Junction Endonuclease-(CT502)
CPn0622 715752 714793 R CT503 hypothetical protein
CPn0623 716993 716163 R CT504 hypothetical protein
CPn0624 718015 717011 R gapA-Glyceraldehyde-3-P Dehydrogenase-(CT505)
CPn0625 718485 718060 R rl17-L17 Ribosomal Protein-(CT506)
CPn0626 719616 718495 R rpoA-RNA Polymerase Alpha-(CT507)
CPn0627 720038 719640 R rs11-S11 Ribosomal Protein-(CT508)
CPn0628 720428 720063 R rs13-S13 Ribosomal Protein-(CT509)
CPn0629 721857 720487 R secY-Translocase-(CT510)
CPn0630 722316 721885 R rl15-L15 Ribosomal Protein-(CT511)
CPn0631 722806 722312 R rs5-S5 Ribosomal Protein-(CT512)
CPn0632 723195 722827 R rl18-L18 Ribosomal Protein-(CT513)
CPn0633 723757 723209 R rl6-L6 Ribosomal Protein-(CT514)
CPn0634 724185 723787 R rs8-S8 Ribosomal Protein-(CT515)
CPn0635 724745 724206 R rl5-L5 Ribosomal Protein-(CT516)
CPn0636 725082 724750 R rl24-L24 Ribosomal Protein-(CT517)
CPn0637 725464 725099 R rl14-L14 Ribosomal Protein-(CT518)
CPn0638 725747 725490 R rs17-S17 Ribosomal Protein-(CT519)
CPn0639 725958 725743 R rl29-L29 Ribosomal Protein-(CT520)
CPn0640 726377 725964 R rl16-L16 Ribosomal Protein-(CT521)
CPn0641 727077 726409 R rs3-S3 Ribosomal Protein- (CT522)
CPn0642 727428 727096 R rl22-L22 Ribosomal Protein- (CT523)
CPn0643 727713 727450 R rs19-S19 Ribosomal Protein-(CT524)
CPn0644 728573 727722 R rl2-L2 Ribosomal Protein- (CT525)
CPn0645 728930 728598 R rl23-L23 Ribosomal Protein- (CT526)
CPn0646 729621 728950 R rl4-L4 Ribosomal Protein- (CT527)
CPn0647 730331 729657 R rl3-L3 Ribosomal Protein- (CT528)
CPn0648 731603 730605 R CT529 hypothetical protein
CPn0649 732672 731710 R fmt-Methionyl tRNA Formyltransferase-(CT530)
CPn0650 733501 732665 R lpxA-Acyl-Carrier UDP-GlcNAc O-Acyltransferase-(CT531)
CPn0651 733975 733517 R fabZ-Myristoyl-Acyl Carrier Dehydratase-(CT532)
CPn0652 734835 733990 R lpxC-Myristoyl GlcNac Deacetylase-(CT533)
CPn0653 736490 734868 R cutE-Apolipoprotein N-Acetyltransferase-(CT534)
CPn0654 736967 736503 R vdlD/yciA-acyl-CoA Thioesterase-(CT535)
CPn0655 737847 737101 R dnaQ_2-DNA Pol III Epsilon Chain_2-(CT536)
CPn0656 737872 738048 F
CPn0657 738473 738051 R yjeE (ATPase or Kinase) - (CT537)
CPn0658 739168 738455 R CT538 hypothetical protein
CPn0659 739533 739838 F trxA-Thioredoxin-(CT539)
CPn0660 740327 739860 R spoU_2-rRNA Methylase_2-(CT540)
CPn0661 741100 740327 R mip-FKBP-type peptidyl-prolyl cis-trans isomerase-(CT541)
CPn0662 742923 741172 R aspS-Aspartyl tRNA Synthetase-(CT542)
CPn0663 744190 742901 R hisS-Histidyl tRNA Synthetase-(CT543)
CPn0664 744757 744557 R
CPn0665 745001 746365 F uhpC-Hexosephosphate Transport-(CT544)
CPn0666 746388 750107 F dnaE-DNA Pol III Alpha- (CT545)
CPn0667 751058 750177 R predicted OMP [leader (17) peptide]-(CT546)
CPn0668 751209 752162 F CT547 hypothetical protein
CPn0669 752179 752775 F CT548 hypothetical protein
CPn0670 752765 753196 F rsbW-sigma regulatory factor-histidine kinase-(CT549)
CPn0671 753630 753205 R CT550 hypothetical protein
CPn0672 753741 755048 F dacF(pbp5)-D-Ala-D-Ala Carboxypeptidase-(CT551)
CPn0673 755287 755463 F CT552 hypothetical protein
CPn0674 756668 755577 R fmu-RNA Methyltransferase-(CT553)
CPn0675 757919 756768 R CT696 hypothetical protein
CPn0676 759217 758051 R homologous to CT695
CPn0677 760401 759256 R
CPn0678 761320 760682 R
CPn0679 762930 761725 R pgk-Phosphoglycerate Kinase- (CT693)
CPn0680 764248 762971 R ygo4-Phosphate Permease-(CT692)
CPn0681 764929 764258 R CT691 hypothetical protein
CPn0682 764984 765955 F dppD-ABC ATPase Dipeptide Transport- (CT690)
CPn0683 765948 766919 F dppF-ABC ATPase Dipeptide Transport- (CT689)
CPn0684 768038 767181 R spoJ/parB-Chromosome Partitioning Protein- (CT688)
CPn0685 768068 768217 F
CPn0686 768361 768176 R
CPn0687 768564 769214 F CT482 hypothetical protein
CPn0688 769382 770137 F CT481 hypothetical protein
CPn0689 771404 770187 R yfhO_1-NifS-related Aminotransferase_1-(CT687)
CPn0690 772680 771436 R ABC Transporter Membrane Protein- (CT686)
CPn0691 773452 772685 R abcX-ABC Transporter ATPase- (CT685)
CPn0692 774912 773461 R ABC Transporter- (CT684)
CPn0693 776256 775240 R TPR Repeats (O-Linked GlcNAc Transferase similarity)-(CT683)
CPn0694 779599 776330 R pbp2-PBP2-transglycolase/transpeptidase-(CT682)
CPn0695 780216 781382 F ompA-Major Outer Membrane Protein- (CT681)
CPn0696 781769 782599 F rs2-S2 Ribosomal Protein- (CT680)
CPn0697 782602 783447 F tsf-Elongation Factor TS-(CT679)
CPn0698 783458 784201 F pyrH-UMP Kinase-(CT678)
CPn0699 784182 784721 F rrf-Ribosome Releasing Factor- (CT677)
CPn0700 785097 785609 F CT676 hypothetical protein
CPn0701 785599 786672 F karG-Arginine Kinase-(CT675)
CPn0702 789685 786929 R yscC/gspD-Yop C/Gen Secretion Protein D-(CT674)
CPn0703 791190 789685 R pkn5-S/T Protein Kinase-(CT673)
CPn0704 792321 791209 R fliN-Flagellar Motor Switch Domain/YscQ family-(CT672)
CPn0705 793173 792334 R CT671 hypothetical protein
CPn0706 793683 793180 R CT670 hypothetical protein
CPn0707 795029 793704 R yscN-Yop N (Flagellar-Type ATPase) - (CT669)
CPn0708 795705 795034 R CT668 hypothetical protein
CPn0709 796188 795742 R CT667 hypothetical protein
CPn0710 796461 796210 R CT666 hypothetical protein
CPn0711 796731 796486 R CT665 hypothetical protein
CPn0712 799315 796781 R (FHA domain; homology to adenylate cyclase)-(CT664)
CPn0713 799721 799332 R CT663 hypothetical protein
CPn0714 801107 800091 R hemA-Glutamyl tRNA Reductase- (CT662)
CPn0715 801657 803462 F gyrB_2-DNA Gyrase Subunit B_2-(CT661)
CPn0716 803469 804902 F gyrA_2-DNA Gyrase Subunit A_2-(CT660)
CPn0717 805010 805306 F CT656 hypothetical protein
CPn0718 805309 805626 F CT657 hypothetical protein
CPn0719 805916 806890 F sfhB-(Pseudouridine Synthase)-(CT658)
CPn0720 807003 807236 F CT659 hypothetical protein
CPn0721 807683 808489 F kdsA-KDO Synthetase- (CT655)
CPn0722 808489 808974 F CT654 hypothetical protein
CPn0723 808984 809703 F yhbG-ABC Transporter ATPase- (CT653)
CPn0724 810527 809706 R
CPn0725 810811 810587 R CT652.1 hypothetical protein
CPn0726 813372 810880 R CT620 hypothetical protein
CPn0727 813577 816192 F CT619 hypothetical protein
CPn0728 818477 816525 R CHLPN 76kDa Homolog_l (CT622)
CPn0729 819857 818592 R CHLPN 76kDa Homolog_2 (CT623)
CPn0730 821603 819963 R mviN-Integral Membrane Protein-(CT624)
CPn0731 821587 821760 F
CPn0732 822098 822976 F nfo-Endonuclease IV-(CT625)
CPn0733 823727 823101 R rs4-S4 Ribosomal Protein-(CT626)
CPn0734 823944 824915 F yceA-(CT627)
CPn0735 825668 825003 R pyrH/udk-Uridine Kinase (Uridine Monophosphokinase) (Pyrimidine Ribonucleoside Kinase)
CPn0736 827686 825992 R ygeD-Efflux Protein-(CT641)
CPn0737 827685 830756 F recC-Exodeoxyribonuclease V, Gamma-(CT640)
CPn0738 830746 833895 F recB-Exodeoxyribonuclease V, Beta-(CT639)
CPn0739 834871 833861 R CT638 hypothetical protein
CPn0740 836048 834864 R tyrB-Aromatic AA Aminotransferase-(CT637)
CPn0741 838350 836185 R greA-Transcription Elongation Factor-(CT636)
CPn0742 838463 838888 F CT635 hypothetical protein
CPn0743 838962 840362 F nqrA-Ubiquinone Oxidoreductase, Alpha-(CT634)
CPn0744 841384 840389 R hemB-Porphobilinogen Synthase-(CT633)
CPn0745 841903 841742 R
CPn0746 841975 843567 F CT632 hypothetical protein
CPn0747 843675 843740 F CT631 hypothetical protein
CPn0747 843725 843910 F CT631 hypothetical protein (frame-shift)
CPn0748 844987 844121 R ispA-Geranyl Transtransferase-(CT628)
CPn0749 845629 845006 R glmU-UDP-GlcNAc Pyrophosphorylase-(CT629)
CPn0750 846411 845707 R tctD/cpxR-HTH Transcriptional Regulatory Protein + Receiver Domain-(CT630)
CPn0751 846608 848434 F CT651 hypothetical protein
CPn0752 848604 850082 F recD_2-Exodeoxyribonuclease V, Alpha_2-(CT652)
CPn0753 851006 850161 R
CPn0754 851336 851040 R rs20-S20 Ribosomal Protein- (CT617)
CPn0755 851597 852799 F CT616 hypothetical protein
CPn0756 852961 854676 F rpoD-RNA Polymerase Sigma-66 -(CT615)
CPn0757 854733 855134 F folX-Dihydroneopterin Aldolase-(CT614)
CPn0758 855110 856459 F folP/dhpS-Dihydropteroate Synthase-(CT613)
CPn0759 856488 856997 F folA-Dihydrofolate Reductase- (CT612)
CPn0760 856957 857694 F CT611 hypothetical protein
CPn0761 857704 858375 F CT610 hypothetical protein
CPn0762 859597 858539 R recA-RecA recombination protein-(CT650)
CPn0763 860511 859972 R ygfA-Formyltetrahydrofolate Cycloligase-(CT649)
CPn0764 861807 860524 R CT648 hypothetical protein
CPn0765 862382 861801 R CT647 hypothetical protein
CPn0766 863782 862394 R CT646 hypothetical protein
CPn0767 863884 864177 F CT645 hypothetical protein
CPn0768 864159 865163 F yohI/nιr -predicted oxidoreductase -(CT644)
CPn0769 867733 865121 R topA-DNA Topoisomerase I-Fused to SWIB Domain-(CT643)
CPn0770 868340 869131 F CT642 hypothetical protein
CPn0771 870463 869144 R rpoN-RNA Polymerase Sigma-54-(CT609)
CPn0772 872385 870469 R uvrD-DNA Helicase- (CT608)
CPn0773 872488 873195 F ung-Uracil DNA Glycosylase-(CT607)
CPn0774 873195 873425 F CT606.1 hypothetical protein
CPn0775 874031 873414 R yggV family- (CT606)
CPn0776 874246 875487 F CT605 hypothetical protein
CPn0777 875601 877178 F groEL_2-heat shock protein-60-(CT604)
CPn0778 877505 878092 F tsa/ahpC-Thio-specific Antioxidant (TSA) Peroxidase-(CT603)
CPn0779 878481 878095 R CT602 hypothetical protein
CPn0780 879205 878591 R papQ/amiB-N-Acetylmuramoyl-L-Ala Amidase-(CT601)
CPn0781 879773 879198 R pal-Peptidoglycan-Associated Lipoprotein-(CT600)
CPn0782 881065 879773 R tolB-polysaccharide transporter-(CT599)
CPn0783 881885 881100 R CT598 hypothetical protein
CPn0784 882296 881892 R exbD-Biopolymer Transport Protein-(CT597)
CPn0785 882991 882296 R exbB/tolQ-polysaccharide transporter-(CT596)
CPn0786 883185 885293 F dsbD/xprA-Thio:disulfide Interchange Protein-(CT595)
CPn0787 885619 886401 F yabD/ycfH-PHP superfamily (urease/pyrimidinase) hydrolase-(CT594)
CPn0788 886542 887432 F sdhC-Succinate Dehydrogenase-(CT593)
CPn0789 887439 889316 F sdhA-Succinate Dehydrogenase-(CT592)
CPn0790 889330 890103 F sdhB-Succinate Dehydrogenase-(CT591)
CPn0791 893050 890111 R CT590 hypothetical protein
CPn0792 894919 893108 R CT589 hypothetical protein
CPn0793 896823 894919 R rsbU-sigma regulatory family protein-PP2C phosphatase (RsbW antagonist)-(CT588)
CPn0794 897174 898004 F
CPn0795 898128 899195 F
CPn0796 899301 901340 F
CPn0797 901600 902694 F
CPn0799 904986 903940 R
CPn0800 906532 905249 R eno-Enolase-(CT587)
CPn0801 908697 906727 R uvrB-Excinuclease ABC Subunit B-(CT586)
CPn0802 909740 908709 R trpS-Tryptophanyl tRNA Synthetase-(CT585)
CPn0803 910303 909752 R CT584 hypothetical protein
CPn0804 911059 910310 R gp6D-CHLTR Plasmid Paralog-(CT583)
CPn0805 911821 911067 R minD-chromosome partitioning ATPase-CHLTR plasmid protein GP5D-(CT582)
CPn0806 913771 911867 R thrS-Threonyl tRNA Synthetase-(CT581)
CPn0807 913971 914879 F CT580 hypothetical protein
CPn0808 916287 914956 R CT579 hypothetical protein
CPn0809 917785 916307 R CT578 hypothetical protein
CPn0810 918184 917825 R CT577 hypothetical protein
CPn0811 918900 918208 R lcrH_1-Low Ca Response Protein H_1-(CT576)
CPn0812 919123 920862 F mutL-DNA Mismatch Repair-(CT575)
CPn0813 920870 921934 F pepP-Aminopeptidase P-(CT574)
CPn0814 922107 923357 F CT573 hypothetical protein
CPn0815 923361 925622 F gspD/pilQ-Gen. Secretion Protein D-(CT572)
CPn0816 925615 927102 F gspE-Gen. Secretion Protein E-(CT571)
CPn0817 927115 928287 F gspF-Gen. Secretion Protein F-(CT570)
CPn0818 928314 928682 F predicted OMP [leader (16) peptide]-(CT569)
CPn0819 928689 929132 F CT568 hypothetical protein
CPn0820 929120 929659 F CT567 hypothetical protein
CPn0821 929667 930668 F CT566 hypothetical protein
CPn0822 930756 931229 F CT565 hypothetical protein
CPn0823 932367 931501 R yscT/spaR-YopT Translocation T-(CT564)
CPn0824 932662 932378 R yscS/fliQ-YopS/fliQ Translocation Protein-(CT563)
CPn0825 933594 932677 R yscR-Yop Translocation R-(CT562)
CPn0826 934310 933612 R yscL-Yop Translocation L-(CT561)
CPn0827 935264 934434 R CT560 hypothetical protein
CPn0828 936271 935267 R yscJ-Yop Translocation J-(CT559)
CPn0829 936744 937298 F
CPn0830 937444 937959 F
CPn0831 938267 938434 F
CPn0832 939747 938827 R lipA-Lipoate Synthetase-(CT558)
CPn0833 941129 939747 R lpdA-Lipoamide Dehydrogenase-(CT557)
CPn0834 941553 942014 F CT556 hypothetical protein
CPn0835 945689 942045 R mot1_1-SWI/SNF family helicase_1-(CT555)
CPn0836 946879 945722 R braQ-Amino Acid (Branched) Transport-(CT554)
CPn0837 947771 947145 R nth-Endonuclease III-(CT697)
CPn0838 949106 947781 R thdF-Thiophene/Furan Oxidation Protein-(CT698)
CPn0839 949257 950159 F psdD-Phosphatidylserine Decarboxylase-(CT699)
CPn0840 950222 951544 F CT700 hypothetical protein
CPn0841 951731 954640 F secA_2-Translocase SecA_2- (CT701)
CPn0842 954883 954710 R CT702 hypothetical protein (frame-shift with 0843)
CPn0843 955191 954994 R CT702 hypothetical protein
CPn0844 956730 955270 R yphC-GTPase/GTP-binding protein-(CT703)
CPn0845 958079 956550 R pcnB_l-Poly A Polymerase_l- (CT704)
CPn0846 959374 958112 R clpX-CLP Protease ATPase- (CT705)
CPn0847 959995 959387 R clpP-CLP Protease Subunit-(CT706)
CPn0848 961502 960177 R tig/murl-Trigger Factor-peptidyl-prolyl Isomerase-(CT707)
CPn0849 961788 965285 F mot1_2-SWI/SNF family helicase_2-(CT708)
CPn0850 965293 966390 F mreB-Rod Shape Protein-Sugar Kinase-(CT709)
CPn0851 966396 968195 F pckA-Phosphoenolpyruvate Carboxykinase-(CT710)
CPn0852 968316 970613 F CT711 hypothetical protein
CPn0853 970637 971803 F CT712 hypothetical protein
CPn0854 972837 971806 R ompB-Outer Membrane Protein B-(CT713)
CPn0855 973995 972994 R gpdA-Glycerol-3-P Dehydrogenase-(CT714)
CPn0856 975377 973995 R AgX-1 Homolog-UDP-Glucose Pyrophosphorylase-(CT715)
CPn0857 975757 975392 R CT716 hypothetical protein
CPn0858 977055 975757 R fliI-Flagellum-specific ATP Synthase-(CT717)
CPn0859 977588 977055 R CT718 hypothetical protein
CPn0860 978630 977608 R fliF-Flagellar M-Ring Protein-(CT719)
CPn0861 979722 978925 R nifU-NifU-related protein-(CT720)
CPn0862 980873 979722 R yfhO_2-NifS-related protein_2-(CT721)
CPn0863 981514 980831 R pgmA-Phosphoglycerate Mutase- (CT722)
CPn0864 981670 982374 F yjbC-predicted pseudouridine synthase-(CT723)
CPn0865 982418 982942 F CT724 hypothetical protein
CPn0866 983491 982916 R birA-Biotin Synthetase-(CT725)
CPn0867 983423 984667 F rodA-Rod Shape Protein-(CT726)
CPn0868 986643 984670 R zntA/cadA-Metal Transport P-type ATPase-(CT727)
CPn0869 987401 986658 R CT728 hypothetical protein
CPn0870 988728 987448 R serS-Seryl tRNA Synthetase_2-(CT729)
CPn0871 988772 989899 F ribD-Riboflavin Deaminase-(CT730)
CPn0872 989963 991216 F ribA&ribB-GTP Cyclohydratase & DHBP Synthase-(CT731)
CPn0873 991233 991694 F ribE-Ribityllumazine Synthase-(CT732)
CPn0874 993107 991749 R CT733 hypothetical protein
CPn0875 993372 994022 F CT734 hypothetical protein
CPn0876 994144 995517 F dagA_2-D-Alanine/Glycine Permease_2-(CT735)
CPn0877 995533 995982 F ybcL family-(CT736)
CPn0878 996654 995992 R SET Domain protein-(CT737)
CPn0879 997439 996645 R yycJ-metal dependent hydrolase-(CT738)
CPn0880 999861 997444 R ftsK-Cell Division Protein FtsK-(CT739)
CPn0881 1005667 1006209 F
CPn0882 1006268 1007404 F
CPn0883 1008865 1007573 R dmpP/nqr6-Phenolhydrolase/NADH ubiquinone oxidoreductase-(CT740)
CPn0884 1009359 1009009 R CT741 hypothetical protein
CPn0885 1010635 1009433 R ygcA-rRNA Methyltransferase-(CT742)
CPn0886 1011276 1010908 R hctA-Histone-Like Developmental Protein-(CT743)
CPn0887 1011692 1014157 F CHLTR possible phosphoprotein-(CT744)
CPn0888 1015423 1014119 R hemG-protoporphyrinogen Oxidase-(CT745)
CPn0889 1016835 1015462 R hemN_2-Coproporphyrinogen III Oxidase_2-(CT746)
CPn0890 1017805 1016819 R hemE-Uroporphyrinogen Decarboxylase-(CT747)
CPn0891 1021073 1017819 R mfd-Transcription-Repair Coupling-(CT748)
CPn0892 1023661 1021046 R alaS-Alanyl tRNA Synthetase-(CT749)
CPn0893 1023894 1025888 F tktB-Transketolase-(CT750)
CPn0894 1026766 1025888 R amn-AMP Nucleosidase-(CT751)
CPn0895 1026988 1027557 F efp_2-Elongation Factor P_2-(CT752)
CPn0896 1027595 1027822 F CT753 hypothetical protein
CPn0897 1028737 1027853 R (possible phosphohydrolase)-(CT754)
CPn0898 1030460 1028904 R Mitochondrial HSP60 Chaperonin Homolog-(CT755)
CPn0899 1030875 1032215 F murF-Muramoyl-DAP Ligase-(CT756)
CPn0900 1032235 1033281 F mraY-Muramoyl-Pentapeptide Transferase-(CT757)
CPn0901 1033287 1034537 F murD-Muramoylalanine-Glutamate Ligase-(CT758)
CPn0902 1034543 1035241 F nlpD-Muramidase (invasin repeat family)-(CT759)
CPn0903 1035263 1036417 F ftsW-Cell Division Protein FtsW-(CT760)
CPn0904 1036326 1037396 F murG-Peptidoglycan Transferase-(CT761)
CPn0905 1037409 1039835 F murC&ddlA-Muramate-Ala Ligase & D-Ala-D-Ala Ligase-(CT762)
CPn0906 1040340 1039915 R CT763 hypothetical protein
CPn0907 1040780 1040445 R *cutA Periplasmic Divalent Cation Tolerance Protein CutA (C-Type Cytochrome Biogenesis Protein)
CPn0908 1041589 1040780 R CT764 hypothetical protein
CPn0909 1041637 1041966 F rsbV_2-Sigma Factor Regulator_2-(CT765)
CPn0910 1041979 1043004 F miaA-tRNA Pyrophosphate Transferase-(CT766)
CPn0911 1044043 1042985 R Fe-S cluster oxidoreductase_2-(CT767)
CPn0912 1044129 1045760 F CT768 hypothetical protein
CPn0913 1045760 1045945 F
CPn0914 1045999 1046397 F
CPn0915 1046461 1046817 F ybeB-iojap superfamily ortholog-(CT769)
CPn0916 1046837 1048084 F fabF-Acyl Carrier Protein Synthase-(CT770)
CPn0917 1048090 1048539 F hydrolase/phosphatase homolog-(CT771)
CPn0918 1049223 1048579 R ppa-Inorganic Pyrophosphatase-(CT772)
CPn0919 1049378 1050430 F ldh-Leucine Dehydrogenase-(CT773)
CPn0920 1051405 1050431 R cysQ-Sulfite Synthesis/biphosphate phosphatase-(CT774)
CPn0921 1051535 1052293 F snGlycerol-3-P Acyltransferase-(CT775)
CPn0922 1052314 1053927 F aas-Acylglycerophosphoethanolamine Acyltransferase-(CT776)
CPn0923 1053984 1055093 F bioF_1-Oxononanoate Synthase_1-(CT777)
CPn0924 1057274 1055028 R priA-Primosomal Protein N'-(CT778)
CPn0925 1057900 1057226 R CT779 hypothetical protein
CPn0926 1058060 1058557 F Thioredoxin Disulfide Isomerase-(CT780)
CPn0927 1059809 1058670 R CHLPS 43 kDa protein homolog_2
CPn0928 1061008 1059884 R CHLPS 43 kDa protein homolog_3
CPn0929 1062292 1061186 R CHLPS 43 kDa protein homolog_4
CPn0930 1062857 1063330 F
CPn0931 1064138 1065718 F lysS-Lysyl tRNA Synthetase- (CT781)
CPn0932 1067142 1065721 R cysS-Cysteinyl tRNA Synthetase- (CT782)
CPn0933 1067535 1068578 F predicted disulfide bond isomerase- (CT783)
CPn0934 1068942 1068526 R rnpA-Ribonuclease P Protein Component-(CT784)
CPn0935 1069091 1068957 R rl34-L34 Ribosomal Protein-(CT785)
CPn0936 1069336 1069470 F rl36-L36 Ribosomal Protein-(CT786)
CPn0937 1069496 1069798 F rs14-S14 Ribosomal Protein-(CT787)
CPn0938 1070322 1069849 R CT788 hypothetical protein-[leader (60) peptide-periplasmic]
CPn0939 1070728 1071195 F CT790 hypothetical protein
CPn0940 1073012 1071204 R uvrC-Excinuclease ABC, Subunit C-(CT791)
CPn0941 1075501 1073018 R mutS-DNA Mismatch Repair-(CT792)
CPn0942 1075985 1077754 F dnaG/priM-DNA Primase-(CT794)
CPn0943 1077978 1078238 F CT794.1 hypothetical protein
CPn0944 1078512 1078997 F
CPn0945 1079070 1079660 F CT795 hypothetical protein
CPn0946 1082786 1079745 R glyQ-Glycyl tRNA Synthetase-(CT796)
CPn0947 1083442 1084059 F pgsA_2-Glycerol-3-P-Phosphatydyltransferase_2-(CT797)
CPn0948 1085474 1084047 R glgA-Glycogen Synthase-(CT798)
CPn0949 1085929 1086483 F ctc-General Stress Protein-(CT799)
CPn0950 1086488 1087027 F pth-Peptidyl tRNA Hydrolase-(CT800)
CPn0951 1087122 1087457 F rs6-S6 Ribosomal Protein-(CT801)
CPn0952 1087478 1087723 F rs18-S18 Ribosomal Protein-(CT802)
CPn0953 1087742 1088248 F rl9-L9 Ribosomal Protein- (CT803)
CPn0954 1088286 1088708 F ychB-Predicted Kinase- (CT804)
CPn0955 1088612 1089175 F (frame-shift with 0954)
CPn0956 1089560 1090909 F CT805 hypothetical protein
CPn0957 1093788 1090963 R ide/ptr-Insulinase family/Protease III-(CT806)
CPn0958 1094785 1093793 R plsB-Glycerol-3-P Acyltransferase- (CT807)
CPn0959 1096343 1094799 R cafE-Axial Filament Protein- (CT808)
CPn0960 1096764 1097102 F CT809 hypothetical protein
CPn0961 1097118 1097297 F rl32-L32 Ribosomal Protein- (CT810)
CPn0962 1097316 1098275 F plsX-FA/Phospholipid Synthesis Protein- (CT811)
CPn0963 1098398 1103224 F pmp_21-Polymorphic Outer Membrane Protein D Family- (CT812)
CPn0964 1104758 1103301 R
CPn0965 1106736 1104925 R lpxB-Lipid A Disaccharide Synthase-(CT411)
CPn0966 1108037 1106748 R pcnB_2-PolyA Polymerase_2-(CT410)
CPn0967 1108512 1109885 F mrsA/pgm-Phosphoglucomutase-(CT815)
CPn0968 1109895 1111721 F glmS-Glucosamine-Fructose-6-P Aminotransferase-(CT816)
CPn0969 1111812 1112999 F tyrP_1-Tyrosine Transport_1-(CT817)
CPn0970 1113461 1114648 F tyrP_2-Tyrosine Transport_2-(CT818)
CPn0971 1114702 1115415 F yccA-Transport Permease-(CT819)
CPn0972 1116299 1115430 R ftsY-Cell Division Protein FtsY-(CT820)
CPn0973 1116370 1117527 F sucC-Succinyl-CoA Synthetase, Beta-(CT821)
CPn0974 1117544 1118422 F sucD-Succinyl-CoA Synthetase, Alpha-(CT822)
CPn0975 1119104 1119637 F
CPn0976 1120082 1121185 F
CPn0977 1121371 1122402 F
CPn0978 1122665 1123693 F
CPn0979 1123980 1125443 F htrA-DO Serine Protease-(CT823)
CPn0980 1126982 1125504 R *similarity to Saccharomyces cerevisiae hypothetical 52.9KD protein
CPn0981 1127031 1129952 F Zinc Metalloprotease (insulinase family)-(CT824)
CPn0982 1131194 1129962 R yigN family-(CT825)
CPn0983 1132000 1131206 R pssA-Glycerol-Serine Phosphatidyltransferase-(CT826)
CPn0984 1132379 1135510 F nrdA-Ribonucleoside Reductase, Large Chain-(CT827)
CPn0985 1135534 1136571 F nrdB-Ribonucleoside Reductase, Small Chain-(CT828)
CPn0986 1136724 1137395 F yggH-predicted rRNA Methylase-(CT829)
CPn0987 1137516 1138115 F ycgB-like predicted rRNA methylase-(CT830)
CPn0988 1138986 1138075 R murB-UDP-N-Acetylenolpyruvoylglucosamine Reductase-(CT831)
CPn0989 1139495 1139016 R CT832 hypothetical protein
CPn0990 1139883 1140440 F infC-Initiation Factor 3-(CT833)
CPn0991 1140421 1140612 F rl35-L35 Ribosomal Protein-(CT834)
CPn0992 1140634 1140996 F rl20-L20 Ribosomal Protein-(CT835)
CPn0993 1141014 1142030 F pheS-Phenylalanyl tRNA Synthetase, Alpha-(CT836)
CPn0994 1142398 1144440 F CT837 hypothetical protein
CPn0995 1145512 1144415 R CT838 hypothetical protein
CPn0996 1146589 1145519 R CT839 hypothetical protein
CPn0997 1146708 1147664 F mesJ-PP-loop superfamily ATPase- (CT840)
CPn0998 1147855 1150584 F ftsH-ATP-dependent zinc protease- (CT841)
CPn0999 1152847 1150766 R pnp-Polyribonucleotide Nucleotidyltransferase- (CT842)
CPn1000 1153157 1152891 R rs15-S15 Ribosomal Protein-(CT843)
CPn1001 1153405 1153869 F yfhC-cytosine deaminase-(CT844)
CPn1002 1153862 1154089 F CT845 hypothetical protein
CPn1003 1154796 1154092 R CT846 hypothetical protein
CPn1004 1155397 1154879 R CT847 hypothetical protein
CPn1005 1155933 1155415 R CT848 hypothetical protein
CPn1006 1156472 1155990 R CT849 hypothetical protein
CPn1007 1156689 1156907 F CT849.1 hypothetical protein
CPn1008 1156928 1158223 F CT850 hypothetical protein
CPn1009 1159058 1158186 R map-Methionine Aminopeptidase-(CT851)
CPn1010 1159672 1159067 R CT852 hypothetical protein
CPn1011 1160306 1159902 R CT853 hypothetical protein
CPn1012 1162193 1160421 R yzeB-ABC transporter permease-(CT854)
CPn1013 1162245 1163624 F fumC-Fumarate Hydratase-(CT855)
CPn1014 1165426 1163732 R ychM-Sulfate Transporter-(CT856)
CPn1015 1165634 1166893 F CT857 hypothetical protein (possible IM protein)
CPn1016 1167042 1168898 F CT858 hypothetical protein
CPn1017 1169006 1169935 F lytB-Metalloprotease-(CT859)
CPn1018 1169898 1170629 F
CPn1019 1172128 1170638 R CT860 hypothetical protein
CPn1020 1173679 1172150 R CT861 hypothetical protein
CPn1021 1174213 1173698 R lcrH_2-Low Calcium Response_2-(CT862)
CPn1022 1175673 1174216 R CT863 hypothetical protein
CPn1023 1176035 1176331 F
CPn1024 1177236 1176334 R xerD-Integrase/recombinase-(CT864)
CPn1025 1177302 1178879 F pgi-Glucose-6-P Isomerase-(CT378)
CPn1026 1178997 1179137 F ltuA-(CT377)
CPn1027 1179175 1180755 F
CPn1028 1181016 1181999 F mdhC-Malate Dehydrogenase-(CT376)
CPn1029 1182008 1182844 F
CPn1030 1183886 1182843 R predicted D-amino acid dehydrogenase-(CT375)
CPn1031 1185552 1184098 R arcD-Arginine/Ornithine Antiporter-(CT374)
CPn1032 1186150 1185566 R CT373 hypothetical protein
CPn1033 1187500 1186187 R CT372 hypothetical protein
CPn1034 1188517 1187732 R Predicted OMP_1 (CT371) [leader (18) peptide]
CPn1035 1190000 1188570 R aroE-Shikimate 5-Dehydrogenase-(CT370)
CPn1036 1191135 1189984 R aroB-Dehydroquinate Synthase-(CT369)
CPn1037 1192199 1191123 R aroC-Chorismate Synthase-(CT368)
CPn1038 1192726 1192199 R aroL-Shikimate Kinase II-(CT367)
CPn1039 1193999 1192665 R aroA-Phosphoshikimate Vinyltransferase-(CT366)
CPn1040 1194741 1194073 R
CPn1041 1195994 1194726 R *bioA-Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase
CPn1042 1196590 1195934 R *bioD-dethiobiotin synthetase
CPn1043 1197717 1196572 R bioF_2-Oxononanoate Synthase_2
CPn1044 1198691 1197699 R *bioB-Biotin Synthase
CPn1045 1199590 1198901 R conserved hypothetical bacterial membrane protein
CPn1046 1200675 1199590 R Tryptophan Hydroxylase
CPn1047 1200552 1201343 F dapB-Dihydrodipicolinate Reductase-(CT364)
CPn1048 1201606 1202604 F asd-Aspartate Dehydrogenase-(CT363)
CPn1049 1202595 1203914 F lysC-Aspartokinase III-(CT362)
CPn1050 1203926 1204798 F dapA-Dihydrodipicolinate Synthase-(CT361)
CPn1051 1204962 1205270 F
CPn1052 1205417 1206169 F
CPn1053 1206153 1206701 F
CPn1054 1207034 1209466 F
CPn1055 1209694 1210521 F
CPn1056 1210527 1211228 F
CPn1057 1211497 1213596 F CT356 hypothetical protein
CPn1058 1213748 1214836 F CT355 hypothetical protein
CPn1059 1214848 1215678 F kgsA-Dimethyladenosine Transferase-(CT354)
CPn1060 1217658 1215727 R dxs/tkt-Transketolase-(CT331)
CPn1061 1217920 1217666 R CT330 hypothetical protein
CPn1062 1219820 1218159 R xseA-Exodeoxyribonuclease VII-(CT329)
CPn1063 1219951 1220712 F tpiS-Triosephosphate Isomerase-(CT328)
CPn1064 1220719 1220895 F
CPn1065 1221095 1220928 R
CPn1066 1221135 1221488 F
CPn1067 1221735 1222292 F def-Polypeptide Deformylase-(CT353)
CPn1068 1223258 1222365 R rnhB_2-Ribonuclease HII_2-(CT008)
CPn1069 1223513 1223941 F yfgA-HTH Transcriptional Regulator-(CT009)
CPn1070 1225511 1224144 R
CPn1071 1227324 1225885 R
CPn1072 1227969 1228835 F
CPn1073 1229011 1229832 F Predicted OMP_2-(CT371)
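The rows of the listing above share one layout: a CPn identifier, two coordinates, a strand letter (F or R), and an optional description that usually ends with the C. trachomatis ortholog in parentheses. Rows marked R list the larger coordinate first. A minimal Python sketch for reading one such row follows, assuming that layout holds; the function name `parse_row` and the returned field names are illustrative and are not part of the source. Rows whose identifier or strand letter does not match the assumed layout yield `None` rather than a guess.

```python
import re

# Assumed row layout, e.g. "CPn0612 704631 702022 R polA-DNA Polymerase I-(CT493)"
ROW = re.compile(r"^(CPn\d{4})\s+(\d+)\s+(\d+)\s+([FR])(?:\s+(.*))?$")
# Ortholog tag in parentheses, e.g. "(CT493)" or "(CT652.1)"
ORTHOLOG = re.compile(r"\((CT\d+(?:\.\d+)?)\)")

def parse_row(line):
    """Split one table row into fields; return None if the row does not match."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    gene_id, first, second, strand, desc = m.groups()
    first, second = int(first), int(second)
    desc = desc or ""
    hit = ORTHOLOG.search(desc)
    return {
        "id": gene_id,
        "strand": strand,
        "left": min(first, second),           # smaller coordinate, whatever the strand
        "right": max(first, second),
        "length": abs(first - second) + 1,    # coordinates treated as inclusive
        "ct_ortholog": hit.group(1) if hit else None,
        "description": desc,
    }
```

Treating the coordinates as inclusive is itself an assumption; it is consistent with rows such as CPn0656 (737872 to 738048), which then spans 177 bases, a multiple of three.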
Table 2 (Supplemental Data). Functional Assignments of C. pneumoniae Coding Sequences. C. trachomatis genes are shown in parentheses.
Amino Acid Biosynthesis
Aromatic Family
1039 (CT366) aroA Phosphoshikimate Vinyltransferase
1036 (CT369) aroB Dehydroquinate Synthase
1037 (CT368) aroC Chorismate Synthase
1035 (CT370) aroE Shikimate 5-Dehydrogenase
0484 (CT382) aroG Deoxyheptonate Aldolase
1038 (CT367) aroL Shikimate Kinase II
0740 (CT637) tyrB Aromatic AA Aminotransferase
Aspartate Family (lysine)
1048 (CT363) asd Aspartate Dehydrogenase
1050 (CT361) dapA Dihydrodipicolinate Synthase
1047 (CT364) dapB Dihydrodipicolinate Reductase
0519 (CT430) dapF Diaminopimelate Epimerase
1049 (CT362) lysC Aspartokinase III
Serine Family
0433 (CT282) gcsH Glycine Cleavage System H Protein
0521 (CT432) glyA Serine Hydroxymethyltransferase
Base Nucleotide Metabolism
0171 guaA GMP Synthase
0172 guaB Inosine 5'-Monophosphate Dehydrogenase
0608 Uridine 5'-Monophosphate Synthase
0735 Uridine Kinase
0244 (CT128) adk Adenylate Kinase
0894 (CT751) amn AMP Nucleosidase
0568 (CT452) cmk CMP Kinase
0392 (CT039) dcd dCTP Deaminase
0059 (CT292) dut dUTP Nucleotidohydrolase
0120 (CT030) gmk GMP Kinase
0619 (CT500) ndk Nucleoside-2-P Kinase
0984 (CT827) nrdA Ribonucleoside Reductase, Large Chain
0985 (CT828) nrdB Ribonucleoside Reductase, Small Chain
0236 (CT183) pyrG CTP Synthetase
0698 (CT678) pyrH UMP Kinase
0273 (CT188) tdk Thymidylate Kinase
0659 (CT539) trxA Thioredoxin
0314 (CT099) trxB Thioredoxin Reductase
1001 (CT844) yfhC Cytosine Deaminase
Biosynthesis of Cofactors
Biotin, Lipoate & Ubiquinone
1041 bioA Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase
1044 bioB Biotin Synthase
1042 bioD Dethiobiotin Synthetase
0923 (CT777) bioF_1 Oxononanoate Synthase_1
1043 (CT777) bioF_2 Oxononanoate Synthase_2
0866 (CT725) birA Biotin Synthetase
0748 (CT628) ispA Geranyl Transtransferase
0832 (CT558) lipA Lipoate Synthetase
0265 (CT219) ubiA Benzoate Octaphenyltransferase
0264 (CT220) ubiD Phenylacrylate Decarboxylase
0515 (CT428) ubiE Ubiquinone Methyltransferase
Folic Acid
0759 (CT612) folA Dihydrofolate Reductase
0335 (CT078) folD Methylene Tetrahydrofolate Dehydrogenase
0758 (CT613) folP Dihydropteroate Synthase
0757 (CT614) folX Dihydroneopterin Aldolase
0763 (CT649) ygfA Formyltetrahydrofolate Cycloligase
Porphyrin
0714 (CT662) hemA Glutamyl tRNA Reductase
0744 (CT633) hemB Porphobilinogen Synthase
0052 (CT299) hemC Porphobilinogen Deaminase
0890 (CT747) hemE Uroporphyrinogen Decarboxylase
0888 (CT745) hemG protoporphyrinogen Oxidase
0138 (CT210) hemL Glutamate-1-Semialdehyde-2,1-Aminomutase
0380 (CT052) hemN_1 Coproporphyrinogen III Oxidase_1
0889 (CT746) hemN_2 Coproporphyrinogen III Oxidase_2
0603 (CT485) hemZ Ferrochelatase
Riboflavin
0872 (CT731) ribA&ribB GTP Cyclohydratase & DHBP Synthase
0532 (CT405) ribC Riboflavin Synthase
0871 (CT730) ribD Riboflavin Deaminase
0873 (CT732) ribE Ribityllumazine Synthase
0320 (CT093) ribF FAD Synthase
Cell Envelope
Fatty Acid & Phospholipid Metabolism
0161 (CT206) (predicted acyltransferase family)
0922 (CT776) aas Acylglycerophosphoethanolamine Acyltransferase
0414 (CT265) accA AcCoA Carboxylase Transferase Alpha
0183 (CT123) accB Biotin Carboxyl Carrier Protein
0182 (CT124) accC Biotin Carboxylase
0058 (CT293) accD AcCoA Carboxylase Transferase Beta
0295 (CT236) acpP Acyl Carrier Protein
0313 (CT100) acpS Acyl-carrier Protein Synthase
0567 (CT451) cdsA Phosphatidate Cytidylyltransferase
0297 (CT238) fabD Malonyl Acyl Carrier Transcyclase
0916 (CT770) fabF Acyl Carrier Protein Synthase
0296 (CT237) fabG Oxoacyl (Carrier Protein) Reductase
0298 (CT239) fabH Oxoacyl Carrier Protein Synthase III
0406 (CT104) fabI Enoyl-Acyl-Carrier Protein Reductase
0651 (CT532) fabZ Myristoyl-Acyl Carrier Dehydratase
0098 (CT010) htrB Acyltransferase
0271 (CT136) Lysophospholipase Esterase
0615 (CT496) pgsA_1 Glycerol-3-P Phosphatidyltransferase_1
0947 (CT797) pgsA_2 Glycerol-3-P Phosphatydyltransferase_2
0958 (CT807) plsB Glycerol-3-P Acyltransferase
0569 (CT453) plsC Glycerol-3-P Acyltransferase
0962 (CT811) plsX FA/Phospholipid Synthesis Protein
0839 (CT699) psdD Phosphatidylserine Decarboxylase
0983 (CT826) pssA Glycerol-Serine Phosphatidyltransferase
0921 (CT775) snGlycerol-3-P Acyltransferase
0654 (CT535) yciA Acyl-CoA Thioesterase
0877 (CT736) ybcL CT736 Hypothetical Protein
0154 (CT208) gseA KDO Transferase
0721 (CT655) kdsA KDO Synthetase
0235 (CT182) kdsB Deoxyoctulonosic Acid Synthetase
0650 (CT531) lpxA Acyl-Carrier UDP-GlcNAc O-Acyltransferase
0965 (CT411) lpxB Lipid A Disaccharide Synthase
0652 (CT533) lpxC Myristoyl GlcNac Deacetylase
0302 (CT243) lpxD UDP Glucosamine N-Acyltransferase
Membrane Proteins, Lipoproteins & Porins
0310 (CT251) 60IM 60kDa Inner Membrane Protein
0556 (CT442) crpA 15kDa Cysteine-Rich Protein
0653 (CT534) cutE Apolipoprotein N-Acetyltransferase
0311 (CT252) lgt Prolipoprotein Diacylglycerol Transferase
0558 (CT444) omcA 9kDa-Cysteine-Rich Lipoprotein
0557 (CT443) omcB 60kDa Cysteine-Rich OMP
0695 (CT681) ompA Major Outer Membrane Protein
0854 (CT713) ompB Outer Membrane Protein B
0781 (CT600) pal Peptidoglycan-Associated Lipoprotein
0300 (CT241) yaeT Omp85 Homolog
Peptidoglycan
0417 (CT268) amiA N-Acetylmuramoyl Alanine Amidase
0780 (CT601) amiB N-Acetylmuramoyl-L-Ala Amidase
0672 (CT551) dacF D-Ala-D-Ala Carboxypeptidase
0968 (CT816) glmS Glucosamine-Fructose-6-P Aminotransferase
0749 (CT629) glmU UDP-GlcNAc Pyrophosphorylase
0900 (CT757) mraY Muramoyl-Pentapeptide Transferase
0571 (CT455) murA UDP-N-Acetylglucosamine Transferase
0988 (CT831) murB UDP-N-Acetylenolpyruvoylglucosamine Reductase
0905 (CT762) murC&ddlA Muramate-Ala Ligase & D-Ala-D-Ala Ligase
0901 (CT758) murD Muramoylalanine-Glutamate Ligase
0418 (CT269) murE N-Acetylmuramoylalanylglutamyl DAP Ligase
0899 (CT756) murF Muramoyl-DAP Ligase
0904 (CT761) murG Peptidoglycan Transferase
0902 (CT759) nlpD Muramidase (invasin repeat family)
0694 (CT682) pbp2 PBP2-Transglycolase/Transpeptidase
0419 (CT270) pbp3 Transglycolase Transpeptidase
0421 (CT272) yabC PBP2B Family Methyltransferase
Cellular Processes
Cell Division
0959 (CT808) cafE Axial Filament Protein
0880 (CT739) ftsK Cell Division Protein FtsK
0903 (CT760) ftsW Cell Division Protein FtsW
0972 (CT820) ftsY Cell Division Protein FtsY
0617 (CT498) gidA FAD-dependent Oxidoreductase
0805 (CT582) minD Chromosome Partitioning ATPase
0850 (CT709) mreB Rod Shape Protein-Sugar Kinase
0867 (CT726) rodA Rod Shape Protein
0684 (CT688) parB Chromosome Partitioning Protein
Detoxification
0057 (CT294) sodM Superoxide Dismutase (Mn)
0778 (CT603) ahpC Thio-specific Antioxidant (TSA) Peroxidase
Signal Transduction
0148 (CT145) S/T Protein Kinase
0584 (CT467) atoS Two-Component Sensor
0294 (CT235) cAMP-Dependent Protein Kinase Regulatory Subunit
0712 (CT664) (FHA domain)
0478 (CT379) hflX GTP Binding Protein
0703 (CT673) S/T Protein Kinase
0095 (CT301) S/T Protein Kinase
0397 (CT259) PP2C Phosphatase Family
0037 (CT337) ptsH PTS Phosphocarrier Protein HPr
0038 (CT336) ptsI PTS PEP Phosphotransferase
0060 (CT291) ptsN_1 PTS IIA Protein
0061 (CT290) ptsN_2 PTS IIA Protein + HTH DNA-Binding Domain
0262 (CT218) surE SurE-like Acid Phosphatase
0838 (CT698) thdF Thiophene Furan Oxidation Protein
0693 (CT683) TPR Repeats-CT683 Hypothetical Protein
0321 (CT092) ychF GTP Binding Protein
0544 (CT418) yhbZ GTP binding protein
0844 (CT703) yphC GTPase GTP-binding protein
Standard Protein Secretion
0115 (CT025) ffh Signal Recognition Particle GTPase
0363 (CT060) flhA Flagellar Secretion Protein
0858 (CT71 ) fliI Flagellum-specific ATP Synthase
0704 (CT672) fliN Flagellar Motor Switch Domain/YscQ family
0815 (CT572) gspD Gen Secretion Protein D
0816 (CT571) gspE Gen Secretion Protein E
0817 (CT570) gspF Gen Secretion Protein F
0359 (CT064) lepA GTPase
0110 (CT020) lepB Signal Peptidase I
0535 (CT408) lspA Lipoprotein Signal Peptidase
0260 (CT141) secA_1 Protein Translocase Subunit_1
0841 (CT701) secA_2 Translocase SecA_2
0564 (CT448) secD&secF Protein Export Proteins SecD/SecF (fusion)
0075 (CT321) secE Preprotein Translocase
0629 (CT510) secY Translocase
0848 (CT707) tig Trigger Factor-Peptidyl-prolyl Isomerase
Transport-Related Proteins
0486 Hypothetical Proline Permease
0289 (CT230) aaaT Neutral Amino Acid (Glutamate) Transporter
0691 (CT685) abcX ABC Transporter ATPase
1031 (CT374) arcD Arginine/Ornithine Antiporter
0482 (CT381) artJ Arginine Periplasmic Binding Protein
0836 (CT554) brnQ Amino Acid (Branched) Transport
0536 (CT409) dagA_1 D-Ala/Gly Permease
0876 (CT735) dagA_2 D-Alanine/Glycine Permease_2
0682 (CT690) dppD ABC ATPase Dipeptide Transport
0683 (CT689) dppF ABC ATPase Dipeptide Transport
0280 (CT689) dppF Dipeptide Transporter ATPase
0785 (CT596) exbB Macromolecule Transporter
0784 (CT597) exbD Biopolymer Transport Protein
0604 (CT486) fliY Glutamine Binding Protein
0192 (CT129) glnP ABC Amino Acid Transporter Permease
0191 (CT130) glnQ ABC Amino Acid Transporter ATPase
0528 (CT401) gltT Glutamate Symport
0286 (CT194) mgtE Mg++ Transporter (CBS Domain)
0413 (CT264) msbA Transport ATP Binding Protein
0290 (CT231) Na+-dependent Transporter
0195 (CT198) oppA_1 Oligopeptide Binding Protein_1
0196 (CT198) oppA_2 Oligopeptide Binding Protein_2
0197 (CT139) oppA_3 Oligopeptide Binding Protein_3
0198 (CT175) oppA_4 Oligopeptide Binding Protein
0599 (CT480) oppA_5 Oligopeptide Binding Lipoprotein_5
0199 (CT199) oppB_1 Oligopeptide Permease_1
0598 (CT479) oppB_2 Oligopeptide Permease_2
0200 (CT200) oppC_1 Oligopeptide Permease_1
0597 (CT478) oppC_2 Oligopeptide Permease_2
0201 (CT201) oppD Oligopeptide Transport ATPase
0202 (CT202) oppF Oligopeptide Transport ATPase
0231 (CT180) tauB ABC Transport ATPase (Nitrate/Fe)
0782 (CT599) tolB Macromolecule Transporter
0969 (CT817) tyrP_1 Tyrosine Transport_1
0970 (CT818) tyrP_2 Tyrosine Transport_2
0665 (CT544) uhpC Hexosephosphate Transport
0282 (CT216) xasA Amino Acid Transporter
0207 (CT204) ybhI Dicarboxylate Translocator
0971 (CT81 ) yccA Transport Permease
0248 (CT152) ycfV ABC Transporter ATPase
1014 (CT856) ychM Sulfate Transporter
0736 (CT641) ygeD Efflux Protein
0680 (CT692) ygo4 Phosphate Permease
0723 (CT653) yhbG ABC Transporter ATPase
0023 (CT348) yjjK ABC Transporter Protein ATPase
0127 (CT034) ytfF Cationic Amino Acid Transporter
0349 (CT067) ytgA Solute Protein Binding Family
0348 (CT068) ytgB ABC Transporter ATPase
0347 (CT069) ytgC Integral Membrane Protein
0346 (CT070) ytgD Integral Membrane Protein
1012 (CT854) yzeB ABC Transporter Permease
0868 (CT727) zntA Metal Transport P-type ATPase
0279 Possible ABC Transporter Permease Protein
0543 (CT417) (Metal Transport Protein)
0692 (CT684) ABC Transporter
0542 (CT416) ABC Transporter ATPase
0690 (CT686) ABC Transporter Membrane Protein
0541 (CT415) solute binding protein
Type-III Secretion
0323 (CT090) lcrD Low Calcium Response D
0324 (CT089) lcrE Low Calcium Response E
0811 (CT576) lcrH_1 Low Ca Response Protein H_1
1021 (CT862) lcrH_2 Low Calcium Response_2
0325 (CT088) sycE Secretion Chaperone
0702 (CT674) yscC Yop C/Gen Secretion Protein D
0828 (CT559) yscJ Yop Translocation J
0826 (CT561) yscL Yop Translocation L
0707 (CT669) yscN Yop N (Flagellar-Type ATPase)
0825 (CT562) yscR Yop Translocation R
0824 (CT563) yscS YopS Translocation Protein
0823 (CT564) yscT YopT Translocation T
0322 (CT091) yscU Yop Translocation Protein U
Central Intermediary Metabolism
Glycogen Metabolism
0856 (CT715) UDP-Glucose Pyrophosphorylase
0948 (CT798) glgA Glycogen Synthase
0475 (CT866) glgB Glucan Branching Enzyme
0607 (CT489) glgC Glucose-1-P Adenyltransferase
0307 (CT248) glgP Glycogen Phosphorylase
0388 (CT042) glgX Glycogen Hydrolase (debranching)
0326 (CT087) malQ Glucanotransferase
0851 (CT710) pckA Phosphoenolpyruvate Carboxykinase
Phosphorous & Sulfur
0548 (CT435) cysJ Sulfite Reductase
0920 (CT774) cysQ Sulfite Synthesis/Biphosphate Phosphatase
0025 (CT346) atsA Sulphohydrolase
0918 (CT772) ppa Inorganic Pyrophosphatase
DNA Replication, Modification, Repair & Recombination
DNA Mismatch Repair
0505 3-Methyladenine DNA Glycosylase
0812 (CT575) mutL DNA Mismatch Repair
0941 (CT792) mutS DNA Mismatch Repair
0402 (CT107) mutY Adenine Glycosylase
0732 (CT625) nfo Endonuclease IV
0837 (CT697) nth Endonuclease III
DNA Modification
0596 (CT477) ada Methyltransferase
0114 (CT024) hemK A/G-specific Methylase
0891 (CT748) mfd Transcription-Repair Coupling
0620 (CT501) ruvA Holliday Junction Helicase
0390 (CT040) ruvB Holliday Junction Helicase
0621 (CT502) ruvC Crossover Junction Endonuclease
0053 (CT298) sms Sms Protein
0773 (CT607) ung Uracil DNA Glycosylase
1062 (CT329) xseA Exodeoxyribonuclease VII
DNA Recombination
0762 (CT650) recA RecA Recombination Protein
0738 (CT639) recB Exodeoxyribonuclease V, Beta
0737 (CT640) recC Exodeoxyribonuclease V, Gamma
0123 (CT033) recD_1 Exodeoxyribonuclease V (Alpha Subunit)_1
0752 (CT652) recD_2 Exodeoxyribonuclease V Alpha_2
0339 (CT074) recF ABC Superfamily ATPase
0340 (CT074) (frame-shift with 0339)
0563 (CT447) recJ ssDNA Exonuclease
0299 (CT240) recR Recombination Protein
DNA Replication
0309 (CT250) dnaA_1 Replication Initiation Protein_1
0424 (CT275) dnaA_2 Replication Initiation Factor_2
0616 (CT497) dnaB Replicative DNA Helicase
0666 (CT545) dnaE DNA Pol III Alpha
0942 (CT794) dnaG DNA Primase
0338 (CT075) dnaN DNA Pol III (Beta)
0410 (CT261) dnaQ_1 DNA Pol III Epsilon Chain
0655 (CT536) dnaQ_2 DNA Pol III Epsilon Chain_2
0040 (CT334) dnaX_1 DNA Pol III Gamma and Tau_1
0272 (CT187) dnaX_2 DNA Pol III Gamma and Tau_2
0149 (CT146) dnlJ DNA Ligase
0274 (CT189) gyrA_1 DNA Gyrase Subunit A_1
0716 (CT660) gyrA_2 DNA Gyrase Subunit A_2
0275 (CT190) gyrB_1 DNA Gyrase Subunit B_1
0715 (CT661) gyrB_2 DNA Gyrase Subunit B_2
0416 (CT267) himD Integration Host Factor Alpha
0612 (CT493) polA DNA Polymerase I
0924 (CT778) priA Primosomal Protein N'
0386 (CT044) ssb SS DNA Binding Protein
0835 (CT555) SWI/SNF family helicase_1
0849 (CT708) SWI/SNF family helicase_2
0769 (CT643) topA DNA Topoisomerase I-Fused to SWI Domain
0024 (CT347) xerC Integrase recombinase
1024 (CT864) xerD Integrase recombinase
Eukaryotic-Type Chromatin Factors
0886 (CT743) hctA Histone-Like Developmental Protein
0384 (CT046) hctB Histone-like Protein 2
0878 (CT737) SET Domain protein
0577 (CT460) SWIB (YM74) Complex Protein
UVR Excinuclease Repair System
0096 (CT333) uvrA Excinuclease ABC Subunit A
0801 (CT586) uvrB Excinuclease ABC Subunit B
0940 (CT791) uvrC Excinuclease ABC, Subunit C
0772 (CT608) uvrD DNA Helicase
Energy Metabolism
Aerobic
0855 (CT714) gpdA Glycerol-3-P Dehydrogenase
0743 (CT634) nqrA Ubiquinone Oxidoreductase, Alpha
0427 (CT278) nqr2 NADH (Ubiquinone) Dehydrogenase
0428 (CT279) nqr3 NADH (Ubiquinone) Oxidoreductase, Gamma
0429 (CT280) nqr4 NADH (Ubiquinone) Reductase 4
0430 (CT281) nqr5 NADH (Ubiquinone) Reductase 5
0883 (CT740) nqr6 Phenolhydrolase/NADH (Ubiquinone) Oxidoreductase 6
ATP Biogenesis and Metabolism
0351 (CT065) adt_1 ADP/ATP Translocase_1
0614 (CT495) adt_2 ADP/ATP Translocase_2
0088 (CT308) atpA ATP Synthase Subunit A
0089 (CT307) atpB ATP Synthase Subunit B
0090 (CT306) atpD ATP Synthase Subunit D
0086 (CT310) atpE ATP Synthase Subunit E
0091 (CT305) atpI ATP Synthase Subunit I
0092 (CT304) atpK ATP Synthase Subunit K
0860 (CT719) fliF Flagellar M-Ring Protein
Electron Transport Chain
0102 (CT013) cydA Cytochrome Oxidase Subunit I
0103 (CT014) cydB Cytochrome Oxidase Subunit II
0364 (CT059) Ferredoxin
0084 (CT312) Predicted Ferredoxin
Glycolysis & Gluconeogenesis
0281 (CT215) dhnA Predicted 1,6-Fructose Biphosphate Aldolase
0800 (CT587) eno Enolase
0624 (CT505) gapA Glyceraldehyde-3-P Dehydrogenase
0056 (CT295) mrsA Phosphomannomutase
0967 (CT815) pgm Phosphoglucomutase
0160 (CT207) pfkA_1 Fructose-6-P Phosphotransferase_1
0208 (CT205) pfkA_2 Fructose-6-P Phosphotransferase_2
1025 (CT378) pgi Glucose-6-P Isomerase
0679 (CT693) pgk Phosphoglycerate Kinase
0863 (CT722) pgmA Phosphoglycerate Mutase
0097 (CT332) pyk Pyruvate Kinase
1063 (CT328) tpiS Triosephosphate Isomerase
Pentose Phosphate Pathway
0239 (CT186) devB Glucose-6-P Dehydrogenase (DevB family)
1060 (CT331) dxs Transketolase
0360 (CT063) gnd 6-Phosphogluconate Dehydrogenase
0185 (CT121) rpe Ribulose-P Epimerase
0141 (CT213) rpiA Ribose-5-P Isomerase A
0083 (CT313) tal Transaldolase
0893 (CT750) tktB Transketolase
0238 (CT185) zwf Glucose-6-P Dehydrogenase
Pyruvate Dehydrogenase
0833 (CT557) lpdA Lipoamide Dehydrogenase
0436 (CT285) lplA_1 Lipoate Protein Ligase-Like Protein
0618 (CT499) lplA_2 Lipoate-Protein Ligase A
0033 (CT340) pdhA&B Oxoisovalerate Dehydrogenase α/β Fusion
0304 (CT245) pdhA Pyruvate Dehydrogenase Alpha
0305 (CT246) pdhB Pyruvate Dehydrogenase Beta
0306 (CT247) pdhC Dihydrolipoamide Acetyltransferase
TCA Cycle
0495 (CT390) aspC Aspartate Aminotransferase
1013 (CT855) fumC Fumarate Hydratase
1028 (CT376) mdhC Malate Dehydrogenase
0789 (CT592) sdhA Succinate Dehydrogenase
0790 (CT591) sdhB Succinate Dehydrogenase
0788 (CT593) sdhC Succinate Dehydrogenase
0378 (CT054) sucA Oxoglutarate Dehydrogenase
0377 (CT055) sucB_1 Dihydrolipoamide Succinyltransferase_1
0527 (CT400) sucB_2 Dihydrolipoamide Succinyltransferase_2
0973 (CT821) sucC Succinyl-CoA Synthetase, Beta
0974 (CT822) sucD Succinyl-CoA Synthetase, Alpha
Protein Folding, Assembly & Modification
Chaperones
0949 (CT799) ctc General Stress Protein
0534 (CT407) dksA DnaK Suppressor
0032 (CT341) dnaJ Heat Shock Protein J
0503 (CT396) dnaK Hsp-70
0134 (CT110) groEL_1 Hsp-60_1
0777 (CT604) groEL_2 Hsp-60_2
0898 (CT755) groEL_3 Hsp-60_3
0135 (CT111) groES 10KDa Chaperonin
0502 (CT395) grpE HSP-70 Cofactor
0661 (CT541) mip FKBP-type Peptidyl-prolyl Cis-Trans Isomerase
Proteases
0144 (CT113) clpB Clp Protease ATPase
0437 (CT286) clpC ClpC Protease
0520 (CT431) clpP CLP Protease
0847 (CT706) clpP CLP Protease Subunit
0846 (CT705) clpX CLP Protease ATPase
0269 (CT138) Dipeptidase
0998 (CT841) ftsH ATP-dependent Zinc Protease
0030 (CT343) gcp_1 O-Sialoglycoprotein Endopeptidase_1
0194 (CT197) gcp_2 O-Sialoglycoprotein Endopeptidase_2
0979 (CT823) htrA DO Serine Protease
0957 (CT806) ide Insulinase family Protease III
0027 (CT344) lon Lon ATP-dependent Protease
1017 (CT859) lytB Metalloprotease
1009 (CT851) map Methionine Aminopeptidase
0385 (CT045) pepA Leucyl Aminopeptidase A
0136 (CT112) pepF Oligopeptidase
0813 (CT574) pepP Aminopeptidase P
0613 (CT494) sohB Protease
0555 (CT441) tsp Tail-Specific Protease
0344 (CT072) yaeL Metalloprotease
0981 (CT824) Zinc Metalloprotease (insulinase family)
Protein Isomerases
0227 (CT176) dsbB Disulfide bond Oxidoreductase
0786 (CT595) dsbD Thio disulfide Interchange Protein
0228 (CT177) dsbG Disulfide Bond Chaperone
0933 (CT783) Predicted Disulfide Bond isomerase
0926 (CT780) Thioredoxin Disulfide Isomerase
Transcription
RNA Degradation
0999 (CT842) pnp Polyribonucleotide Nucleotidyltransferase
0054 (CT297) rnc Ribonuclease III
0119 (CT029) rnhB_1 Ribonuclease HII_1
1068 (CT008) rnhB_2 Ribonuclease HII_2
0934 (CT784) rnpA Ribonuclease P Protein Component
0504 (CT397) vacB Ribonuclease Family
RNA Elongation & Termination Factors
0741 (CT636) greA Transcription Elongation Factor
0316 (CT097) nusA N Utilization Protein A
0076 (CT320) nusG Transcriptional Antitermination
0845 (CT704) pcnB_1 PolyA Polymerase_1
0966 (CT410) pcnB_2 PolyA Polymerase_2
0610 (CT491) rho Transcription Termination Factor
RNA Methylases
0674 (CT553) fmu RNA Methyltransferase
1059 (CT354) ksgA Dimethyladenosine Transferase
0187 (CT133) Predicted Methylase
0530 (CT403) spoU_1 rRNA Methylase_1
0660 (CT540) spoU_2 rRNA Methylase_2
0117 (CT027) trmD tRNA (Guanine N-1)-Methyltransferase
0885 (CT742) ygcA rRNA Methyltransferase
0986 (CT829) yggH Predicted rRNA Methylase
0987 (CT830) ytgB Predicted rRNA Methylase
RNA Modification
0649 (CT530) fmt Methionyl tRNA Formyltransferase
0910 (CT766) miaA tRNA Pyrophosphate Transferase
0719 (CT658) sfhB Predicted Pseudouridine Synthase
0219 (CT193) tgt Queuine tRNA Ribosyl Transferase
0580 (CT463) truA Pseudouridylate Synthase I
0319 (CT094) truB tRNA Pseudouridine Synthase
0403 (CT106) yceC Predicted Pseudouridine Synthetase Family
0864 (CT723) yjbC Predicted Pseudouridine Synthase
RNA Polymerase & Transcription Regulators
0586 (CT468) atoC Two-Component Regulator
0362 (CT061) rpsD Sigma-28/WhiG Family
0501 (CT394) hrcA HTH Transcriptional Repressor
0793 (CT588) rsbU Sigma Regulatory Family Protein-PP2C Phosphatase (RsbW Antagonist)
0626 (CT507) rpoA RNA Polymerase Alpha
0081 (CT315) rpoB RNA Polymerase Beta
0082 (CT314) rpoC RNA Polymerase Beta'
0756 (CT615) rpoD RNA Polymerase Sigma-66
0771 (CT609) rpoN RNA Polymerase Sigma-54
0511 (CT424) rsbV_1 Sigma Regulatory Factor
0909 (CT765) rsbV_2 Sigma Factor Regulator_2
0670 (CT549) rsbW Sigma Regulatory Factor-Histidine Kinase
0750 (CT630) tctD HTH Transcriptional Regulatory Protein + Receiver Domain
1069 (CT009) yfgA HTH Transcriptional Regulator
Translation
Amino Acyl tRNA Synthesis
0892 (CT749) alaS Alanyl tRNA Synthetase
0570 (CT454) argS Arginyl tRNA Transferase
0662 (CT542) aspS Aspartyl tRNA Synthetase
0932 (CT782) cysS Cysteinyl tRNA Synthetase
0003 (CT003) gatA Glu tRNA Gln Amidotransferase (A subunit)
0004 (CT004) gatB Glu tRNA Gln Amidotransferase (B Subunit)
0002 (CT002) gatC Glu tRNA Gln Amidotransferase (C subunit)
0560 (CT445) gltX Glutamyl-tRNA Synthetase
0946 (CT796) glyQ Glycyl tRNA Synthetase
0663 (CT543) hisS Histidyl tRNA Synthetase
0109 (CT019) ileS Isoleucyl-tRNA Synthetase
0153 (CT209) leuS Leucyl tRNA Synthetase
0931 (CT781) lysS Lysyl tRNA Synthetase
0122 (CT032) metG Methionyl-tRNA Synthetase
0993 (CT836) pheS Phenylalanyl tRNA Synthetase, Alpha
0594 (CT475) pheT Phenylalanyl tRNA Synthetase Beta
0500 (CT393) proS Prolyl tRNA Synthetase
0870 (CT729) serS Seryl tRNA Synthetase_2
0806 (CT581) thrS Threonyl tRNA Synthetase
0802 (CT585) trpS Tryptophanyl tRNA Synthetase
0361 (CT062) tyrS Tyrosyl tRNA Synthetase
0094 (CT302) valS Valyl tRNA Synthetase
Peptide Chain Initiation, Elongation & Termination
1067 (CT353) def Polypeptide Deformylase
0184 (CT122) efp_1 Elongation Factor P_1
0895 (CT752) efp_2 Elongation Factor P_2
0550 (CT437) fusA Elongation Factor G
0073 (CT323) infA Initiation Factor IF-1
0317 (CT096) infB Initiation Factor-2
0990 (CT833) infC Initiation Factor 3
0113 (CT023) prfA Peptide Chain Releasing Factor 1
0576 (CT459) prfB Peptide Chain Release Factor 2
0950 (CT800) pth Peptidyl tRNA Hydrolase
0318 (CT095) rbfA Ribosome Binding Factor A
0699 (CT677) rrf Ribosome Releasing Factor
0697 (CT679) tsf Elongation Factor TS
0074 (CT322) tufA Elongation Factor Tu
Ribosomal Proteins
0078 (CT318) rl1 L1 Ribosomal Protein
0644 (CT525) rl2 L2 Ribosomal Protein
0647 (CT528) rl3 L3 Ribosomal Protein
0646 (CT527) rl4 L4 Ribosomal Protein
0635 (CT516) rl5 L5 Ribosomal Protein
0633 (CT514) rl6 L6 Ribosomal Protein
0080 (CT316) rl7 L7/L12 Ribosomal Protein
0953 (CT803) rl9 L9 Ribosomal Protein
0079 (CT317) rl10 L10 Ribosomal Protein
0077 (CT319) rl11 L11 Ribosomal Protein
0247 (CT125) rl13 L13 Ribosomal Protein
0637 (CT518) rl14 L14 Ribosomal Protein
0630 (CT511) rl15 L15 Ribosomal Protein
0640 (CT521) rl16 L16 Ribosomal Protein
0625 (CT506) rl17 L17 Ribosomal Protein
0632 (CT513) rl18 L18 Ribosomal Protein
0118 (CT028) rl19 L19 Ribosomal Protein
0992 (CT835) rl20 L20 Ribosomal Protein
0546 (CT420) rl21 L21 Ribosomal Protein
0642 (CT523) rl22 L22 Ribosomal Protein
0645 (CT526) rl23 L23 Ribosomal Protein
0636 (CT517) rl24 L24 Ribosomal Protein
0545 (CT419) rl27 L27 Ribosomal Protein
0327 (CT086) rl28 L28 Ribosomal Protein
0639 (CT520) rl29 L29 Ribosomal Protein
0112 (CT022) rl31 L31 Ribosomal Protein
0961 (CT810) rl32 L32 Ribosomal Protein
0250 (CT150) rl33 L33 Ribosomal Protein
0935 (CT785) rl34 L34 Ribosomal Protein
0991 (CT834) rl35 L35 Ribosomal Protein
0936 (CT786) rl36 L36 Ribosomal Protein
0315 (CT098) rs1 S1 Ribosomal Protein
0696 (CT680) rs2 S2 Ribosomal Protein
0641 (CT522) rs3 S3 Ribosomal Protein
0733 (CT626) rs4 S4 Ribosomal Protein
0631 (CT512) rs5 S5 Ribosomal Protein
0951 (CT801) rs6 S6 Ribosomal Protein
0551 (CT438) rs7 S7 Ribosomal Protein
0634 (CT515) rs8 S8 Ribosomal Protein
0246 (CT126) rs9 S9 Ribosomal Protein
0549 (CT436) rs10 S10 Ribosomal Protein
0627 (CT508) rs11 S11 Ribosomal Protein
0552 (CT439) rs12 S12 Ribosomal Protein
0628 (CT509) rs13 S13 Ribosomal Protein
0937 (CT787) rs14 S14 Ribosomal Protein
1000 (CT843) rs15 S15 Ribosomal Protein
0116 (CT026) rs16 S16 Ribosomal Protein
0638 (CT519) rs17 S17 Ribosomal Protein
0952 (CT802) rs18 S18 Ribosomal Protein
0643 (CT524) rs19 S19 Ribosomal Protein
0754 (CT617) rs20 S20 Ribosomal Protein
0031 (CT342) rs21 S21 Ribosomal Protein
Other Categories
Chlamydia-Specific Proteins
0561 (CT446) Euo CHLPS Euo Protein
0804 (CT583) Gp6D CHLTR Plasmid Paralog
0186 (CT119) Similarity to incA_1
0291 (CT232) incB Inclusion Membrane Protein B
0292 (CT233) incC Inclusion Membrane Protein C
1026 (CT377) LtuA Protein
0333 (CT080) LtuB Protein
0005 (CT871) pmp_1 Polymorphic Outer Membrane Protein G Family
0013 (CT871) pmp_2 Polymorphic Outer Membrane Protein G Family
0014 (CT871) pmp_3 Polymorphic Outer Membrane Protein G Family
0015 (CT871) pmp_3 PMP_3 (frame-shift with 0014)
0016 (CT874) pmp_4 Polymorphic Outer Membrane Protein G Family
0017 (CT871) pmp_4 PMP_4 (frame-shift with 0016)
0018 (CT874) pmp_5 Polymorphic Outer Membrane Protein G Family
0019 (CT871) pmp_5 PMP_5 (frame-shift with 0018)
0444 (CT871) pmp_6 Polymorphic Outer Membrane Protein G/I Family
0445 (CT871) pmp_7 Polymorphic Outer Membrane Protein G Family
0446 (CT871) pmp_8 Polymorphic Outer Membrane Protein G Family
0447 (CT871) pmp_9 Polymorphic Outer Membrane Protein G/I Family
0450 (CT871) pmp_10 Polymorphic Outer Membrane Protein G Family
0449 (CT871) pmp_10 PMP_10 (Frame-shift with 0450)
0451 (CT871) pmp_11 Polymorphic Outer Membrane Protein G Family
0452 (CT874) pmp_1 Polymorphic Outer Membrane Protein (truncated) A/I Family
0453 (CT871) pmp_ 1 Polymorphic Outer Membrane Protein G Family
0454 (CT872) pmp_14 Polymorphic Outer Membrane Protein H Family
0466 (CT869) pmp_15 Polymorphic Outer Membrane Protein E Family
0467 (CT869) pmp_16 Polymorphic Outer Membrane Protein E Family
0468 (CT869) pmp_17 Polymorphic Outer Membrane Protein E Family
0469 (CT869) pmp_17 PMP_17 (Frame-shift with 0468)
0470 (CT869) pmp_17 PMP_17 (Frame-shift with 0469)
0471 (CT870) pmp_18 Polymorphic Outer Membrane Protein E/F Family
0539 (CT412) pmp_19 Polymorphic Membrane Protein A Family
0540 (CT413) pmp_20 Polymorphic Membrane Protein B Family
0963 (CT812) pmp_21 Polymorphic Membrane Protein D Family
0562 CHLPS 43 kDa Protein HomologJ
0927 CHLPS 43 kDa Protein HomologJ
0928 CHLPS 43 kDa Protein HomologJ
0929 CHLPS 43 kDa Protein HomologJ
0728 (CT622) CHLPN 76kDa Homolog (CT622)
0729 (CT623) CHLPN 76kDa Homolog (CT623)
0133 (CT109) CHLPS Hypothetical Protein
0332 (CT081) CHLTR T2 Protein
Miscellaneous Enzymes/Conserved Proteins
0193 argR Possible Arginine Repressor
1046 Aromatic Amino Acid Hydroxylase
0232 Similarity to 5'-Methylthioadenosine Nucleosidase
0128 (CT035) Biotin Protein Ligase
0513 (CT426) Fe-S Oxidoreductase 1
0911 (CT767) Fe-S Oxidoreductase 2
0373 (CT057) gcpE GcpE Protein
0407 (CT103) HAD Superfamily Hydrolase Phosphatase
0917 (CT771) Hydrolase Phosphatase Homolog
0488 (CT385) ycfF HIT Family Hydrolase
0701 (CT675) karG Arginine Kinase
0526 (CT399) kpsF GutQ KpsF Family Sugar-P Isomerase
0919 (CT773) ldh Leucine Dehydrogenase
0022 (CT349) maf Maf protein
0997 (CT840) mesJ PP-loop superfamily ATPase
0151 (CT148) mhpA Monooxygenase
0730 (CT624) mviN Integral Membrane Protein
0861 (CT720) NifU-Related Protein
0479 (CT380) phnP Metal Dependent Hydrolase
0106 (CT015) phoH ATPase
0329 (CT084) Phospholipase D Superfamily
0435 (CT284) Phospholipase D Superfamily
0581 (CT464) Phosphoglycolate Phosphatase
0897 (CT754) Predicted Phosphohydrolase
0509 (CT422) Predicted Metalloenzyme
1030 (CT375) Predicted D-Amino Acid Dehydrogenase
0531 (CT404) SAM Dependent Methyltransferase
0337 (CT076) smpB Small Protein B
0394 (CT256) tlyC_1 CBS Domain Protein (Hemolysin Homolog)_1
0510 (CT423) tlyC_2 CBS Domains (Hemolysin Homolog)_2
0382 (CT048) yabC SAM-Dependent Methyltransferase
0787 (CT594) yabD PHP Superfamily (Urease/Pyrimidinase) Hydrolase
0611 (CT492) yacE Predicted Phosphatase/Kinase
0579 (CT462) yacM Sugar Nucleotide Phosphorylase
0578 (CT461) yaeI Phosphohydrolase
0345 (CT071) yaeM CT071 Hypothetical Protein
0566 (CT450) yaeS YaeS family Hypothetical Protein
0591 (CT472) yagE YagE family
0039 (CT335) ybaB YbaB family Hypothetical Protein
0101 (CT012) ybbP YbbP family Hypothetical Protein
0915 (CT769) ybeB Iojap Superfamily Ortholog
0137 (CT108) ybgI ACR family
0529 (CT402) ycaH ATPase
0438 (CT287) ycbF PP-loop Superfamily ATPase
0734 (CT627) yceA YceA Hypothetical Protein
0954 (CT804) ychB Predicted Kinase
0261 (CT217) ydaO PP-Loop Superfamily ATPase
0245 (CT127) ydhO Polysaccharide Hydrolase-Invasin Repeat Family
0573 (CT457) yebC YebC Family Hypothetical Protein
0689 (CT687) yfhO_1 NifS-related Aminotransferase_1
0862 (CT721) yfhO_2 NifS-related Aminotransferase_2
0547 (CT434) ygbB YgbB Family Hypothetical Protein
0237 (CT184) yggF YggF Family Hypothetical Protein
0775 (CT606) yggV YggV Family Hypothetical Protein
0396 (CT258) yhfOJ NifS-related Aminotransferase J
0605 (CT487) yhhF Predicted Methylase
0575 (CT458) yhhY Amino Group Acetyl Transferase
0592 (CT473) yidD YidD Family
0982 (CT825) yigN YigN Family Hypothetical Protein
0657 (CT537) yjeE YjeE Hypothetical Protein
0768 (CT644) yohI YohI Predicted Oxidoreductase
0336 (CT077) yojL YojL Hypothetical Protein
0217 (CT140) ypdP YpdP Hypothetical Protein
0140 (CT212) yqdE YqdE Hypothetical Protein
0263 (CT221) yqfU YqfU Hypothetical Protein
0139 (CT211) yqgE YqgE Hypothetical Protein
0270 (CT137) ywlC Sua5 Superfamily-related Protein
0879 (CT738) yycJ Metal Dependent Hydrolase
Homologs to CHLTR Hypothetical Coding Genes
0001 (CT001) CT001 Hypothetical Protein
0020 (CT351) CT351 Hypothetical Protein
0021 (CT350) CT350 Hypothetical Protein
0026 (CT345) CT345 Hypothetical Protein
0035 (CT339) CT339 Hypothetical Protein
0036 (CT338) CT338 Hypothetical Protein
0055 (CT296) CT296 Hypothetical Protein
0062 (CT289) CT289 Hypothetical Protein
0065 (CT288) CT288 Hypothetical Protein
0068 (CT360) CT360 Hypothetical Protein
0071 (CT325) CT325 Hypothetical Protein
0072 (CT324) CT324 Hypothetical Protein
0085 (CT311) CT311 Hypothetical Protein
0087 (CT309) CT309 Hypothetical Protein
0093 (CT303) CT303 Hypothetical Protein
0100 (CT011) CT011 Hypothetical Protein
0104 (CT017) CT017 Hypothetical Protein
0105 (CT016) CT016 Hypothetical Protein
0107 (CT058) CT058 Hypothetical ProteinJ
0108 (CT018) CT018 Similarity
0111 (CT021) CT021 Hypothetical Protein
0121 (CT031) CT031 Hypothetical Protein
0129 (CT036) CT036 Similarity
0145 (CT114) CT114 Hypothetical Protein
0150 (CT147) CT147 Hypothetical Protein
0152 (CT149) CT149 Hypothetical Protein
0176 (CT153) CT153 Hypothetical Protein
0188 (CT132) CT132 Hypothetical Protein
0189 (CT131) CT131 Hypothetical Protein
0206 (CT203) CT203 Hypothetical Protein
0229 (CT178) CT178 Hypothetical Protein
0230 (CT179) CT179 Hypothetical Protein
0234 (CT181) CT181 Hypothetical Protein
0249 (CT151) CT151 Hypothetical Protein
0253 (CT144) CT144 Hypothetical ProteinJ
0254 (CT143) CT143 Hypothetical ProteinJ
0255 (CT142) CT142 Hypothetical ProteinJ
0256 (CT144) CT144 Hypothetical ProteinJ
0257 (CT143) CT143 Hypothetical ProteinJ
0259 (CT142) CT142 Hypothetical ProteinJ
0276 (CT191) CT191 Hypothetical Protein
0288 (CT195) CT195 Hypothetical Protein
0293 (CT234) CT234 Hypothetical Protein
0301 (CT242) CT368 Hypothetical Protein
0303 (CT244) CT244 Hypothetical Protein
0308 (CT249) CT249 Similarity
0312 (CT101) CT101 Hypothetical Protein
0328 (CT085) CT085 Hypothetical Protein
0330 (CT083) CT083 Hypothetical Protein
0331 (CT082) CT082 Hypothetical Protein
0334 (CT079) CT079 Similarity
0342 (CT073) CT073 Hypothetical Protein
0343 (CT073) (frame-shift with 0342)
0350 (CT066) CT066 Hypothetical Protein
0369 (CT058) CT058 Hypothetical ProteinJ
0370 (CT058) CT058 Hypothetical ProteinJ
0374 (CT056) CT056 Hypothetical Protein
0379 (CT053) CT053 Hypothetical Protein
0381 (CT326) CT326 Similarity
0383 (CT047) CT047 Hypothetical Protein
0387 (CT043) CT043 Hypothetical Protein
0389 (CT041) CT041 Hypothetical Protein
0393 (CT038) CT038 Hypothetical Protein
0395 (CT257) CT257 Hypothetical Protein
0399 (CT253) CT253 Hypothetical Protein
0400 (CT254) CT254 Hypothetical Protein
0401 (CT255) CT255 Hypothetical Protein
0405 (CT105) CT105 Hypothetical Protein
0408 (CT102) CT102 Hypothetical Protein
0409 (CT260) CT260 Hypothetical Protein
0411 (CT262) CT262 Hypothetical Protein
0412 (CT263) CT263 Hypothetical Protein
0415 (CT266) CT266 Hypothetical Protein
0420 (CT271) CT271 Hypothetical Protein
0422 (CT273) CT273 Hypothetical Protein
0423 (CT274) CT274 Hypothetical Protein
0425 (CT276) CT276 Hypothetical Proteins
0426 (CT277) CT277 Similarity
0434 (CT283) CT283 Hypothetical Protein
CT007 Hypothetical Protein
CT006 Hypothetical Protein
CT005 Hypothetical Protein
CT365 Hypothetical Protein
CT865 Hypothetical Protein
CT383 Hypothetical Protein
CT382.1 Hypothetical Protein
CT384 Hypothetical Protein
CT386 Hypothetical Protein
CT387 Hypothetical Protein
CT389 Hypothetical Protein
CT391 Hypothetical Protein
CT388 Hypothetical Protein
CT421 Hypothetical Protein
CT421.1 Hypothetical Protein
CT421.2 Hypothetical Protein
CT425 Hypothetical Protein
CT427 Hypothetical Protein
CT429 Hypothetical Protein
CT433 Hypothetical Protein
CT398 Hypothetical Protein
CT406 Hypothetical Protein
CT814.1 Hypothetical Protein
CT814 Hypothetical Protein
CT440 Hypothetical Protein
CT441.1 Hypothetical Protein
CT449 Hypothetical Protein
CT456 Hypothetical Protein
CT465 Hypothetical Protein
CT466 Hypothetical Protein
CT469 Hypothetical Protein
CT470 Hypothetical Protein
CT471 Hypothetical Protein
CT474 Hypothetical Protein
CT476 Hypothetical Protein
CT483 Hypothetical Protein
CT484 Hypothetical Protein
CT488 Hypothetical Protein
CT490 Hypothetical Protein
CT503 Hypothetical Protein
CT504 Hypothetical Protein
CT529 Hypothetical Protein
CT538 Hypothetical Protein
CT546 Hypothetical Protein
CT547 Hypothetical Protein
CT548 Hypothetical Protein
CT550 Hypothetical Protein
CT552 Hypothetical Protein
CT696 Hypothetical Protein
CT695 Similarity
CT6 1 Hypothetical Protein
CT482 Hypothetical Protein
CT481 Hypothetical Protein
CT676 Hypothetical Protein
CT671 Hypothetical Protein
CT670 Hypothetical Protein
CT668 Hypotheπcal Protein 0709 tCT667) CT667 Hypothetical Protein
0710 1CT666) CT666 Hypothencal Protein
0711 ιCT665) CT665 Hypothetical Protein
0713 (CT663) CT663 Hypothencal Protein
0717 (CT656) CT656 Hypothencal Protein
0718 (CT657) CT657 Hypodiencal Protein
0720 (CT659) CT659 Hypothencal Protein
0722 (CT654) CT654 Hypothencal Protein
0725 (CT652) CT652.1 Hypothencal Protein
0726 ICT620) CT620 Hypothencal Protein
0727 (CT619) CT619 Hypothencal Protein
0739 (CT638) CT368 HypotheOcal Protein
0742 (CT635) CT635 Hypothencal Protein
0746 (CT632) CT632 Hypothencal Protein
0747 (CT631) CT631 Hypothencal Protein
0751 (CT651) CT65 I Hypothencal Protein
0755 (CT616) CT616 HypotheOcal Protein
0760 (CT611) CT61 1 Hypothetical Protein
0761 (CT610) CT610 Hypothencal Protein
0764 (CT648) CT648 Hypothen .al Protein
0765 (CT647) CT647 Hvpotheti al Protein
0766 (CT646) CT646 Hypothen. al Protein
0767 (CT645) CT645 Hypothen al Protein
0770 (CT642) CT642 Hypothen :al Protein
0774 (CT606) CT606 1 Hypotheπcal Protein
0776 (CT605) CT605 Hypothencal Protein
0779 (CT602) CT602 Hypothencal Protein
0783 (CT598) CT598 Hypothencal Protein
0791 (CT590) CT590 Hypothencal Protein
0792 (CT589) CT589 Hypothetical Protein
0803 (CT584) CT584 Hypothencal Protein
0807 (CT580) CT580 Hypothencal Protein
0808 (CT579) CT579 Hypothencal Protein
0809 (CT578) CT578 Hypothencal Protein
0810 (CT577) CT577 Hypothetical Protein
0814 (CT573) CT573 Hypothencal Protein
0818 (CT569) CT569 Hypothetical Protein
0819 (CT568) CT568 Hypothencal Protein
0820 (CT567) CT567 Hypothencal Protein
0821 (CT566) CT566 Hypothencal Protein
0822 (CT565) CT565 Hypothencal Protein
0827 (CT560) CT560 HypotheOcal Protein
0834 (CT556) CT556 HypotheOcal Protein
0840 (CT700) CT700 Hypothencal Protein
0842 (CT702) CT702 Hypothencal Protein
0843 (CT702) CT702 Hypothencal Protein
0852 (CT71I) CT711 HypotheOcal Protein
0853 (CT712) CT712 Hypothencal Protein
0857 (CT716) CT716 Hypothencal Protein
0859 (CT71 ) CT718 HypotheOcal Protein
0865 (CT724) CT724 HypotheOcal Protein
0869 (CT728) CT728 HypotheOcal Protein
0874 (CT733) CT733 Hypothencal Protein
0875 (CT734) CT734 Hypothencal Protein
0884 (CT74I) CT741 Hypothetical Protein
0887 (CT744) CHLTR Possible Phosphoprotein
0896 (CT753) CT753 Hypothetical Protein 0906 CT763) CT763 Hypothetical Protein
0908 CT764) CT764 Hypothetical Protein
0912 CT768) CT768 Hypothetical Protein
0925 CT779) CT779 Hypothetical Protein
0938 (CT788) CT788 Hypothencal Protein
0939 CT790) CT790 Hypothencal Protein
0943 CT794) CT794 1 Hypothencal Protein
0945 (CT795) CT795 Hypothetical Protein
0956 (CT805) CT805 Hypothetical Protein
0960 (CT809) CT809 Hypothetical Protein
0989 (CT832) CT832 Hypothetical Protein
0994 (CT837) CT837 Hypothetical Protein
0995 (CT838) CT838 Hypothetical Protein
0996 (CT839) CT839 Hypothetical Protein
1002 (CT845) CT845 Hypothetical Protein
1003 (CT846) CT846 Hypothetical Protein
1004 (CT847) CT847 Hypothetical Protein
1005 (CT848) CT848 Hypothetical Protein
1006 (CT849) CT849 Hypothetical Protein
1007 (CT849) CT849.1 Hypothetical Protein
1008 (CT850) CT850 Hypothetical Protein
1010 (CT852) CT852 Hypothetical Protein
1011 (CT853) CT853 Hypothetical Protein
1015 (CT857) CT857 Hypothetical Protein
1016 (CT858) CT858 Hypothetical Protein
1019 (CT860) CT860 Hypothetical Protein
1020 (CT861) CT861 Hypothetical Protein
1022 (CT863) CT863 Hypothetical Protein
1032 (CT373) CT373 Hypothetical Protein
1033 (CT372) CT372 Hypothetical Protein
1034 (CT371) CT371 Hypothetical Protein
1057 (CT356) CT356 Hypothetical Protein
1058 (CT355) CT355 Hypothetical Protein
1061 (CT330) CT330 Hypothetical Protein
1073 (CT371) CT371 Hypothetical Protein
Coding Genes Not in C. trachomatis
0486 Hypothetical Proline Permease
0279 Possible ABC Transporter Permease Protein
0505 3-Methyladenine DNA Glycosylase
0193 argR Similarity to Arginine Repressor
1041 bioA Adenosylmethionine-8-Amino-7-Oxononanoate Aminotransferase
1044 bioB Biotin Synthase
1042 bioD Dethiobiotin Synthetase
0585 Similarity to Cps IncA J
0562 CHLPS 43 kDa Protein Homolog J
0927 CHLPS 43 kDa Protein Homolog J
0928 CHLPS 43 kDa Protein Homolog J
0929 CHLPS 43 kDa Protein Homolog
1045 Conserved Hypothetical Membrane Protein
0251 Conserved Hypothetical Protein
0278 Conserved Outer Membrane Lipoprotein Protein
0907 CutA-like Periplasmic Divalent Cation Tolerance Protein
0171 guaA GMP Synthase
0172 guaB Inosine 5'-Monophosphate Dehydrogenase
0608 Uridine 5'-Monophosphate Synthase
0735 Uridine Kinase
0980 Similar to Saccharomyces cerevisiae 52.9 kDa Protein
0232 Similarity to 5'-Methylthioadenosine Nucleosidase
1046 Tryptophan Hydroxylase
0477 yqeV Bs Conserved Hypothetical Protein
0048 yqfF-Bs Conserved Hypothetical IM Protein
0587 yvyD Bs Conserved Hypothetical Protein
0143 yxjG Bs J Conserved Hypothetical Protein
0448 yxjG Bs J Conserved Hypothetical Protein
0006 0180 0440 0977
0007 0181 0455 0978
0008 0190 0456 1018
0009 0203 0457 1023
0010 0204 0458 1027
0011 0205 0459 1029
0012 0209 0460 1040
0028 0210 0461 1051
0029 0211 0462 1052
0034 0212 0463 1053
0041 0213 0464 1054
0042 0214 0465 1055
0043 0215 0472 1056
0044 0216 0473 1064
0045 0218 0481 1065
0046 0220 0483 1066
0047 0221 0492 1070
0049 0222 0493 1071
0050 0223 0494 1072
0051 0224 0498
0063 0225 0499
0064 0226 0516
0066 0233 0517
0067 0240 0523
0069 0241 0524
0070 0242 0553
0099 0243 0574
0124 0266 0600
0125 0267 0656
0126 0268 0664
0130 0277 0677
0131 0283 0678
0132 0284 0685
0142 0285 0686
0146 0287 0724
0147 0352 0731
0155 0353 0745
0156 0354 0753
0157 0355 0794
0158 0356 0795
0159 0357 0796
0162 0358 0797
0163 0365 0798
0164 0366 0799
0165 0367 0829
0166 0368 0830
0167 0371 0831
0168 0372 0881
0169 0375 0882
0170 0376 0913
0173 0391 0914
0174 0398 0930
0175 0404 0944
0177 0431 0964
0178 0432 0975
0179 0439 0976
Chlamydia poawaonlu Oanona cncodt. rocalna rKYFYLRSYPPPPCHSVOS . JK .RVLArτF .VFOM..-.-.; JGA F'-TLGIPGLSAAta" FCrΛtCL^ALα^rtΛISCLL LWREirTWPEEIPεC όLΛPSEEPAlΛAACKTlAOL. pn.oπm no 4 PKELOOLOTD lOETW.r.Rfll-ΛSKrøHKFUIDAiαβSltlV^^ AQEGWDf-NF riGGRSUOTttSESLBLFHVSKRLGYLP-CDV^^
TTOOl hypT-ff. seal procein
KRLKDEIrTfTSL. RKAHIGKIIRGL£SLIV:LCAIJWGLIC:TO KLNIIAJLCGGVSTP SLHCEIHICVAVAFDRNSYAMAEKAFAKA[«TAI.EF,TVYRSLTQ3YRDKFt.ESERA IP G
AT0πΥIII-iΛ3VrC LSFCPFCSKK3RHSHCD3C3SCO HSHHSDiai H tTWLRDDAKSG AEKKLGMPRNVCRNLGKOSFG
15377 16614 Petll2) Glu tRNA Gln Amidotransferase (B Subunit)
16596 18212 ) Glu tRNA Gln Amidotransferase (B Subunit)
18509 21106 Outer Membrane Protein
21922
21335 24174 (frame-shift with 0014)
I FTGEKLSETEAADSKNI.TSKL OPVTLS
Figure imgf000075_0001
Figure imgf000076_0001
CPn_0019 29007 30356 pmp_5-PMP_5 (frame-shift with 0018) CPn_0028 43328 42543
ASTEDIVITNLS INAPT IYGKNPINIVASAANKNITLTGTTALVTlArXaFYENHTLQDSQ No robust homolog present in Genebank/EMBL as of 11/7/98
DYSFVTtt£PGAGGTIIT0DAS0KPLEVAP_niPHYGYOGHWNVO iPGTGTQPSQAN E V RMFf-OFFHPrvTSαjSLSFLPYLGKSSGIIEKCSNIVEHYLHLGGDTSVIITGVSGATπ. TGYLPNPERQGSLVPNSLWSFVDORAIOEIMVNSSOI CQERGVHGAGIANFLHRDKI SVDHALPISKSEKI IKILSYILILPLILALFIKIVLRI ILFFKYRGLILDVKKEDUCKTL
NEHGYPJfSGVGYLVrjVrjntAFSDATINAAFCQLFSRDKDYVVSramGT-rY£CVVFI-EDTL TPXE.JLSLPLPSPTTLKKIHAt-KILVRSGICTYNELIQEGFSFTKITDr.GOAPSPKQOIG EFRSPTCFYTDSSSEACCNO Λ/TIDM0--WSHPJINIMπ'KYTTYPEAQGS ANDWG-JιF FSYNSLLPNFYFHSLVSVPNISGEERALNraKECXJEEϊΛWIJ TTOACSFVFTlSLHLPSM GATrYYYPNSTFLFDYYSPF JU^riTAHOEDFKETGGEVRHFTSGO-fTJLAVPIGVKFE QTKDKKAGFGLLTFFPWKIYPL RFSrXKRGSYεLTLAYVPDVIRKDPKSTATIASσATWSTHσNNr-SRO^XO--l-ΛNHCLIN PGIEVFSHGAIELRGSSRNYNINLGGKYRF CPn_0029 43839 43390
No robust homolog present in Genebank/EMBL as of 11/7/98
CPn_0020 32717 30603 SNWJEROTNIYCFNLFRYIRFFAALNIRMNDGLRFCYSYIUΛPMIXDSStjyUCr^
Predicted OMP ( leader ( 14 ) pept ide : outer membrane 1 KFOIKI^TTSIKSSLISLROOLGKREATOSOILYGTSRFOYLNSFEIEDPRIPPTMAAO i W£MPNUttJ<KRCFLFr \5FVLMGSSADALTH0EAvyKKMSYI^HFK-rVSGrVTI LQEIT SRSVMELKIKFYVYLNSERN TKP NIHrølΛIQA KVrnrE WWSUαVAHGWMVWRAICTLV^
RFAMYPWFr-GGSMITLTPETr/lRKGYISTSEGPKKDI I_3GDYIJKSSDSI- SIGKTTI- CPn_0030 43840 44529
RVCRIPirjLPPFSIMPMEIPKPPINFRβGTGGFT-GSYLGMSYSPISRKHFSSTFFLDSF gcp-O-Sialoglycoprotein Endopeptidase
FKHGM-MGFmΛCSQKOvτεNVFNMXSYYAHRIAIDMλ-ΛHDRYR^ U GWTΛ«SLFFYI NPΛMYFYKYVIIDTSGYYPFIACVDNOTVXE1MSLPVDPD-Clv^
GEYHI^DSWEWADIFPrWFMI_0>rTGPTRVIXriWJDf«FEGYLK^
YLTLRQYPISIYWIY-VYLENrVECGYIJ^AFSDHIVGENFSSLRIAARPKIJK VPLPIG AUflLPl_GKRC»/LTI^SEIPEEGLNEKRJ»GVGPGAIASYEEASr»OTM
TLSSTI-SSSLIYYSDVPεiSSRHSQLSAKMI-OYTtF.J^KSYIORiWIIEPFVrFITETR LFASSFSDKITVEEVAPSVΕXJIRRHVISOFMTvΕYDKQLSPDYRSYSCIF
PLJUajEDHYIFSIODAFHSLNr.IJζAGIOTSVW TNPP_?PRIHAKLWITHILSN0r^
FPKTACELSLPFGKMnVSLDAEWIW KHOmHMNIR EWIGNDNVAMTr-ESI-HRSKlfSL CPn_0031 44708 44884
IKCDR£NFILDVSRPIMLr.DSPLSDHRjn,ILGKI-FVRPHPCWr4YRL^ rs21-S21 Ribosomal Protein
YLEYQMILGTKIFEHWQLYGVYERREADSRFFFFLK DKPKKPPF CMPSVKVRVGEPVDRAIΛIUζKKIDKEGI -^AKSHPJΥDKPSVKKRAKSKAAAKYKSR
Figure imgf000076_0002
Figure imgf000077_0001
EΛTYΛIDRKAHKKP rENiε CHO I tKH'.- INCKSLNAL IEINR NCTCPATANLLAS i -pn III).. I 7S-.II I 7S20R UtLNI^OPMPYCF-MPECG jSYtJLNtirJ.irCDI IARADQC IMTLS T aOtKKEPORI or.-,N PT-; H A Prote in . HTII DNA-Binαmg Domain _._.. ,
R3HEC IC.ICVKMDWUJEVA LI-DVSEHTVWWLKECAI PSYSMNNEYR REEIENWLL [E3NH
HNQMJt.0EKi.M»«D 3U«^^
LDESVXFEM ΗPQIIΛSTGIGECIALPHAKDFLINAYYDIVVPMFLAEPIErø^ICKP CPn_0073 89353 89574
WILFFLFA -0DF3HωLvWIVHLCMS «RSFFKNYPM«D0tXAYVXE ESfJTH infA-Initiation Factor IF-1
SWAKKEDTLVLEGK' E-liPCMHFRVILENGMPVTAHr GKMRMSNlRLLVCDRVTVEMS AYDLTKARWYRHR tufA-Elongation Factor Tu
EDFEMS«TF0RW?HlNICTIGHvOHGKTTLTAAITRAL5GDGt-WFRIWSSin»TPEE
KARGrriNASHVΕYErPNRHYAHVTCPGHADYVXNMITGAAOMtXlAILWSATOGAHPvT
VT<VSDKvT3LV13IΛETKETrvTOVEMFRKELPEGRAGEϊn 3LLrjtG∑σ»IDvΕ^^
Figure imgf000078_0001
NSVKPrmtFlCSAVYVIΛKEEGGRHKPFFSGYRPQFFFRTrDVTαVVT^ VELDVELIGTVALEEGMRFAIREGGRT IGAGT ISKΣNA
CPn_0063 78109 78267
No robust homolog present in Genebank/EMBL as of 11/7/98 CPn_0075 91037 91350
PMYANCKHNCLCLYDFSRHRSPPGLPLTFTPPYSFTLGIFLGRCLSTSNIVLL secE-preprotein translocase
SRSWFMK∞HNRKAI^RKIGTVKKQAKFAGSFU5EIKKΣEVWS HDLKKYIKW-.ISIFG
CPn_0064 78340 78576 FσFAIYFVDLVLRXSΣTCLDGITTFLFG
No robust homolog present in Genebank/EMBL as of 11/7/98
LWTKICCSAOYYRSRPAERAQTPPQPFI-ARDRADF ERHPRFSACCRVUiVAWVVXAL CPn_0076 91334 91903
LFLFVMLLPLAAGSYLLAF nusG-Transcriptional Antitermination
QPFCSVTfCMYKWY VOVFTAOEKKvTC ALEEFKESSGMTDFIOEIILPIENVlfEvTWGEH
CPn_0065 78882 80651 K ΛTEirYIWPGYLLVKMHLTDESVΛYvTCSTAGΣVEFtΛXWVWAΣ^^ T238 hypothetical protein SGVVOKHQFEWGSRWINIXΛrFVNFIGMVSEVFHDKGRI^v lVSIFGRETRVDDL-aF QV
YOYYKYrWFFKKrmfrDFPTHFKGPKIΛPIKWPtnTFERNPKVARVIΛITAWIΛIlALL εεvAPGOEsε
SG-VLIIGTPIi3APISMILGGCUASGGJUJVασrlATIL0ARNSY KAVN0KKLSEPLM
ERPELKAIΛYSLDLKEVWDLmSTWKHI ΪΛUttSrø CPn_0077 91956 92435
MISEhm^I-ΪMΣAYREtαiJKEOTQYQETRFNOIΛTHRNKVI^^ rl11-L11 Ribosomal Protein
SLKFSTLSSRMSRIHTTTTVIt^AI^AVVSVM AALIPtKIIAIJIUAVAISACWrvTG FFVSYPLFVEVSO KVRFSMSVKKVIKI IKLO I PGGKANPAPP ΣGPALGAAGVNIMGFCK
LSY VRQILSNrKRNRODFYKDFVrarølEΣJ-NOτvT-ORFlF-ML^^ EFrWATODKPGDUJ nπ vΥADKTFTFITKOPPVSSLIKlCrLNIJεSGSKIPrΛWK^^
Qn^YTOYITNAPIEKPXIEEIRVTYKEIMOTIOOWTDLEFLENEvTtSGRI-SVASPSEDP TQΛCTvtAIAEOKMiααHDrΛ.LESAKRMVEGTARSHGIDVE
3CTPIFTCX3KEFλKI-RRO SONISTITOPDNENIDP-TSLPVrøKKEEEIDHSI_ePVrKL
EPGSREEIXLVErivTffTLREIΛMRIAIJΛOOI^Sv κWRHPRGEHYGlWrYSDTELDRIQ CPn_0078 92453 93160
MLEX3AFYNHLREAQEEITQSLGDLVDI0NRILGIIV.CDSDSRTEEEPOE rl1-L1 Ribosomal Protein
SCIUlfTKHGiσilRGIUOWJFSKSYSIΛ.ΛIDIlJCOCPPVT^DOTΛriJvϊIKLGIDPKK^
CPn_0066 80916 82655 MlRGAvTLPN3TGCTIΛILVFASGNKWEAVEAGADFMGSDDLVEKIKSQ«l FOVAVA
No robust homolog present in Genebank/EMBL as of 11/7/98 TPr-MMREΛrGK∑jGKvTCPRNUlPTPRTt?r\TTDVAKAISEXRKGKIEFKADRAGVrjW^
CWYMANPTOSRPPSPEISIEEI^IΛEI-VSSSNTETISNTPPPSCAATAEEVSLFIEGGRR KLSFESSQIKENIEALSSALIKAKPPAAKGQYLVSFTISSTMGPGISIDTRELMAS
NSESEEGPLGSCEVYDVVCITN XSDP.VRDHETOvllYINGSGRTQHEGILDAMNICnLRG
EPVTtFIHNSGYGIΛSCFLGIRNRIPPRIJNv^SOAIQARWNEFFIFA.-WrøDYrvXFSβN CPn_0079 93170 93688 rl10-L10 Ribosomal Protein
RGao^QEKTI-I- EVEDKISAAOGFIU tYLRrrAAYSRFJT^SLSGVSAEFEVLiaWIF
FKAIEAAGLEVTCSDTIXmrjGVVFSCGDPVSAAKQVTJJFT^rKOHKDSLVr AGRMDNASLS
GAEVEAVAKLPSI CEUtQOVVGLFAAPMSOVVtJIMNSv SGVISCVDQ AGKN
CPn_0080 93720 94121
GCVLAIGGTItff OtlLLVIrπ'FTFVT^SVTWNLHRRPHR rl7-L7/L12 Ribosomal Protein
VRVTtrπTESr-ETLVEKI^tΛTVLEl^OLKKI EΪKWDVτASAP AVAAGGGG
CPn_0067 82920 84053 EPTEFAv LETArPADia IGvUCWREVTGU XEAKEHrEGLPl IVXEKTSKSt^^
No robust homolog present in Genebank/EMBL as of 11/7/98 KLODAGAKASFKGL
KGSGYSYRGPPMAVΕGRVNSSOλlJ^IXrOEVlJWKQSKGU UπilU-VVAVrrFIAGVV
LIAI.TlASILTSvTYt.U PffTirVT KIIFAUISEKIKKVPPTPISHKEEIIAWFEER CPn_0081 94219 98016
KNlrXtEKEKIDPεHFGRTATDIPMRSALDQFNHSCHHIHεSPλLTεriΥRSHQDVIJ FKDW rpoB-RNA Polymerase Beta
CPVΎLPDVΓSEEEVLIPJTV Λ3SYUΛEACVTKVΪMLIDEIJOIKIJ SPSERECLFIDKKΓL FRε∑r^HONSRRTW-^CPESVΞVKKKEDIPDLPNLIEIOIK-ΥKOFLOΣG LAEERENI QRKASFIJΓTOKDLATFFIAYTRVN∞HLAPFRAGAK ILIHYVRLRROHNONDFFTPGHS GIΛEv REIFPrKSYNEAWLEYI-SYNIΛVTICYSPEECIRRGITYS CvTtFRLTDE G CYYAPIAFNO^ORLYHOI-^NVΕKIJ^IYANMDKDPLCHPWAFIPIYDLΣJΠ'EDHGDGFLE IKEE-VYMGTIPLMTDKGTFIINGAERVVVSOVHRSPGINFEQEKHSKGNILFSFRIIPY QQEDREYPSRAAODQFWG RGSWLEAIFDINDLIYIHIDRKKRRRKIIAITFIRALGYSSDADIIEEFFTIGESSLRSE DFAliVCRII-ADNIIDEASSLVYGKAGEKIJTA>riJCP^LXIAGIA-n^IAVt)ADENHPII
CPn_0068 84909 84331 KMIJUODPTDSYEAAl CDFΥRRI-RPGEPATI-ANARSTIMRLFFOPKRYVLGP ΛGRYi NRK
CT360 hypothet ical protein LGFSIDDEALSQVTU?KEDVIGALKYLIRLK <;DEKACVDDIDHL \NRRVRSVGEL10NQ
SFMIKKFFIYSLIFSCSFSAPLKGICNEDVSSOSRIEEDPEVLITOLNELIETPIEEGKE CRSGTIARMEKrvT«ERMr^DFSSrm.TPGKWSAKGIASVTJ DFFGRSQLMF^
IR πaΛAISaWKSSEEIEESCCTSDSEσI^εKTDKESS EYVLDFFDSMVORLEGISKM AELTOKP U^AIjGPGGI-TOεRAGFEVRDVHASHYGRICPIETPEGPNIGLITSLSSFAKI
CQSGQVAQ I TTY-FMBFFnτPNBPτ.rτ.ιnjιιrτ FT.ppypr F"^" ILDWNKEKVSRELAFQR røFGFIETPYRIvTU3GrVTDEIEYMTAI)vΕU^CVIAOASASLDEYNMFTEPVα«RYAGE
EQDIKQTLMLLKK AFEADTSTVTHMDVSPKOLVSIVTGLIPFLEHDr-ANRALMGSNMOROAVPLLKTEAPVVG
TGLECRAAKDSGAΣWAEEαSv Λ3FVTXrYKVVVAAKHNPTIKRTYHLiαtFLRSNSOTCIN
CPn_0069 85191 87086 OOPLCAvrjrWITKGDVΣAΣX3PATDRGEI UiGKNvT.VAFMPWYGYNFEDAIIISEKLIRED
No robust homolog present in Genebank/EMBL as of 11 /7/98 AYTSIYIEεFELTARDTKIΛKEEITRDIPNVSDEVUu\ILGEDGIIRΣGAEVKPGDΣLVGK
ITPKSETεiAPεERU.RAΣFGEKAADv DASLTvTPGTEGWMI7vT<VFSRKDRLSKSODE
LVEr_AVH KDIΛKGYKWVATLKTEΥREKLGALI NE PAAI IHRRTAEIVVHEGLLFD
OETIERΣEOEDLVCLLMPNCEHYEVTJCGLl-SDYETALORl.EΣNYrπ'EVEHIR-lGOADLDH
OTIROVKv^VASKRKLOVGDK^- GRHGNKσV SKI PEADMPY S σET QMI NP σ P
SRMrJLG0vT.ETHLGYAAKTACIYvT TPVFEGFPE0RIWDtmlE GLPE∞KSFLYDGKTG
ERFDNKVVICΥIYMtJCL≤HLIAD IHARSΣGPYSLV MPLGGKAOWXSORFσEMEVWAL εAYσVAHMLQE LT ^SD^WSCR RIYESΣ ^( L-^SG PεSF^rvXIKEMOG GU) R
PMWDA
<~P-l_θn ;i) 87399 87208
No robust homolog present in Genebank/EMBL as of 11/7/98
YKVCLFHLKNONFFSNOSRTYEORFP VSPHFESILRLQSVGFSSOGTLLISFRrjTELKR
'pn_()l) / l H80h6 87599 liypo tωr ica l protein
I I L F tCFLOHΛRCLKKQHK I lεεLFrEPFCiKDHLYLKLMEN≤iJSRDΛFDKKRML Ki:HI.VΛX;υ::DLYI.YEVYODG ILFFFTYTKALM33GIA3LFTEvY3CETPSTtLTCKPIF l''jHLTr-YL :π:RIJJC> iεjLYMRMKOtΛVOYLKPPCT
' f-ri 'if > .* H-U51 HB057
' T J. hyfxir ht-r ir.i l prote in
I" ;Yμ:,TKT::VΪKCKVl. I L[YCLLFYFπiYRM:rrPL.-."GG ISPMXJYVPOεLFCDRLSSΞR N::pU::NA:;r;Li:;r [V:PPt:;ALVALTDLKLVPYNOII3F.TWTTRLKNAVEKIGLFLORN K / I I.LY I IΛWΛLII.VOIIIITVΛLTLTtWLt' L tCV/F rFTΛTCLDKENKHRHVNSL NL INΪKJ I UJUiPNTfRO I LLΛTM IΛS t:IAL [GNOLJ INTVYCARLCD
Figure imgf000078_0002
Figure imgf000079_0001
144743 145093 Glu tRNA Gln Amidotransferase (B Subunit)
Figure imgf000080_0001
1*5.584 ltt»5βl noBolβg pn-senc in OenebaBk/EKBL as. &s IWKBi
— PK- 'T' ^ TO"1
169131 167467
169448 169143
171419 169569 idase
ITKTLDTIFATLFRQ FFA
172263 171502 family
174094 172700 _ 1-am nomutase" I IKFIGGYHGHADTL
IIDSLIKIFDSSAORFF
175140 174673 I LCTεεKKFA n GHVSMGOA 175110 tbose-5 P Isomerase A IOTESLAVHA IEEΣRHLG EGE RLODTGDLFITDS KKYSV ir I I 7/lfl rn
Figure imgf000081_0001
Figure imgf000082_0001
195274 194313 protein
195430 197892 tRNA Synthetase
APEHPDLDS I VSEEORDEVTAYVOESLRKSERDRISSVKTK
IGGAEHAVLHU.YSRFWHR
197874 199202 Transferase
I PEHKIΛVTGNIKTΥVAAOTALHLERET
199697 199488 homolog present in Genebank/EMBL as of 11/7/98
as of 11/7/98
200753 200298 homolog present in Genebank/EMBL as of 11/7/98
201463 200894 homolog present in Genebank/EMBL as of 11/7/98 IDTVAKLEKNNPGEEF IGFLIV εENYGSV 201811 201467 homolog present in Genebank/EMBL as of 11/7/98 ILIALAΣRYFLHRK 203794 202127 Phosphor. ranst erase IOETSSPP≤PPPELOKHΣPNL RIPE IGCrOISN IGVPKTID^DLKN(^*rET3LGFIIT:'CRTYSEMICNLAKDAL IΛτRKt:;LKOI-SεθlΛLGLVRRY t r i t LitKLπrεTLKTFHLFPK tΛTF.εLLΛVM%-KKπrEKIKPHMεFli::v::ilFFrr/EΛRAGFP IAlCII Λl.FLVROKTrr/MrTINNIΛυ YTE QtilΛTPI.YKMMHLENRCGTE ;.7r.nι.γi'rrGPi. γrι;κcFLinoRPi.TLrwENθτ lti'iH o 1 " Lei m.-.ri-i.j .<■ i.iuu ly) ivv i.KHiirjf'T i ^i.π.uiNrprr'ii.uiTπ.iiYNpr-YMvtLUic iΛΛLf'vi)i.ijiiιι;rxΕi;ι:iΛPi-
Figure imgf000082_0002
Ff-NKtKΛ[ \VWΛI I.U:κiΛΛΛI-7Vjl'NΛPεVt
Figure imgf000083_0001
CPn_0177 217513 216608
CPn_0163 205331 206394
No robust homolog present in Genebank/EMBL as of 11/7/98 No robust homolog present in Genebank/EMBL as of 11/7/98
FEKA1VYCΣKCKOΣ ΣKCΣS I IHTPTPATPLCTEGEIFPGPVDSAIQNDLERLLTVKKRPD DKREστKSKFIFLISEεSMOPωLIFSS ΛπΛIΛLGSLSSCt*3KPS NYHNτSTSEEFF
I ΣREYUtAGGSLv rΥPKEGQPXRSPEQLRVLDDLVOSYPNHLHAiεLDCGAΣPODLIGA VHGNKSVSO PhΥPSAFRTTOIFSEEHrTOPYvVAKτDEESRKI REIHKNLKIKGSYIPI rYΣΣTFADFSTYILSLRSYOANSPSDσrVΛΣVrFGSIDDPVOAVISFLKDHGFALPSTLAQ ε YGSLMHPKSAALTLKTYRPHPI INGYεRSFNIDTGKYLIC∞SRPJWSinXSPIOn vL
DPLLCTNK NLIKSSGRROlAIGLEKrEEDFVIARRREGWSLYPVTVCSYPOGNPFVIAYAWIADESA
CSKEVLPΛWGYYSLVWESVSSSDSLNAFGDSFAεirYLRSTFLANGTSIi WESY αCVPP
CPn_0164 206444 206998 QP
No robust homolog present in Genebank/EMBL as of 11/7/98
I FKCIYΣKΣΣFSFUCQLMTRSTiεSSDSl SRSFSOKI^VOTLKNLCεSRLMKITSLVI CPn_0178 218052 217789
AFLTLI VGGALIALAGGGVLSFPLGLILGSVLVLFSS I YLVSCCKFFTLKEMTMTCSVKS No robust homolog present in Genebank/EMBL as of 11/7/98
KI WFEKOR KDIεKAU3 PDLFσENK ^r GNP3AR^K3 m∑LHETDσII KRYMKGAK VT EYIJ3FLvOPJ<vΕRDPOTKRHCWS0rF∞ESIDAl rTTGQLFHI^
MYFYL ESILKQLLALGIITGYENREREV VYLD
CPn_0165 206983 207582 CPn_0179 218550 218056
No robust homolog present in Genebank/EMBL as of 11/7/98 No robust homolog present in Genebank/EMBL as of 11/7/98
NVTIFMNWVTKTIDHVDPESEIDIPJ RVSCYKLIKECOPEFMLISEL T/IRMLRLLK PKIWOTHFCTRIEATSVPKFNRRIJUCSFHKSGRSSRPSKACVAriFFNFTLQAGRSGIIPG
RSKYQΕ0ARTVSDET)APLFCLTPTYΥ0A ATPL UU3PROLINHYIHIΛRRENPKHFFSP KKAILLNVNrjAKTPOTSCIFESIGFFr E0DI AQHrWAALVTWIUCVVT>HHFUu3LIAK
KHP YYAMAFNESVCT^REIJDIERLTKMYVEGDYSKEQEKNLQAIIJFVKΓLDEGKDF LPRS CXDRKFMΞSLIFTKLSYALDLSAPMKLEGKPNLSYεEKI.D
LIEHKDTDLΣGRGFTDVFCT
CPn_0180 218963 218355
CPn_0166 207594 207962 No robust homolog present in Genebank/EMBL as of 11/7/98
No robust homolog present in Genebank/EMBL as of 11/7/98 TSUKIIJXKYKPWIO^rVASETYPSOIIJtAORiΛ IuWYFNOADCHPAJtAWIlJSAiαtl
NCLKOYNKSDSIMSESINRSIHI ASTPFFIlαTNLCESPXVKITSLVIStiALVGAGVT _XIJW YHTNHYSVF^FCV^J^flfP^!LRFTFVSS10Λ E^TC^
LVVLFVAGILPIiPVLIIillLITVLVIJ.FCLVLEPYLIEKPSKΣKELPKVDεLSVVΕTD LAACKIWIEVPRVVT3LDLRSGILISKI ITOP0F0SLTEDFVNHSTN0EEARVK0KHVL
STL LISLILLCKOAVLESFOEKKRSS
CPn_0167 208309 207977 CPn_0181 219175 218777
No robust homolog present in Genebank/EMBL as of 11/7/98 No robust homolog present in Genebank/EMBL as of 11/7/98
NLWSHFPRGFFMLPFCPTILLAKPFΣΛSENYGUailAATvOSYFDL∞SQΣVFLSKODQG VHEI^FKIDGVYYFFKKFMKLFY lOTSLNSHHEKPSSIjα AvOALDSYFYMSGDI DVI^
I-rVEELSAKDRKFKPGSMNCTLYTEDPΣLPAKNSFSNCSDIOMRTPISPIH DDISMIY VRW,YIPJWrVSISOSLSRIPvπUJ«IUJlYCrLRGKYVMPILIKRIAlLL
GLIRFSPXRKSVY
CPn_0168 208716 208417
No robust homolog present in Genebank/EMBL as of 11/7/98 CPn_0182 220704 219334
SΥIOTJWRENPεHFFNPGHPCYYARIAFNεSVRrYWOLFOTAEUCQ^^ accC-Biotin Carboxylase
LKSILSFVQILDEKDGFDDFLATHRl/l l IGRGGAPIFCS RC-MKKVXIANRGEIAVTlIIRACHDUH TVAWSIAIX)εALHvTJ^
SYLKISNILAACEITGADAVHPGYGFLSENANFASICESCGLTFIGPSSESIAMMGDKIA
CPn_0169 209537 208710 AKSIAKKIKCPVIPGSEGriEDESEGLKIAEKIGFPIVIKAVAGGGGRGIRIVKEKDEfY
No robust homolog present in Genebank/EMBL as of 11/7/98 RAFSAARAEAEAGFtJNPrJVYIEKFIENPRHL£Iθ iσσTΗG>rYVHLGERrX IQRRROKL
SFHIEFTIGEWIMrørvΩSECSQPLvΗEIΛTOPLRNΣΛESRLVKITSFVIAUALVGGITL IEETPSPIIJHAEIR\/KvT5Kv7tvOI U«AGΥFSVGrrvΕFt£Λ^^
TAΣAGAGIΣ^FLPVfcVIΛΣVLVVT-CAIΛIJSYKFCPIKiα n^ ITEEVTGIDLVKEOIHVAMCaffiLPWKOKNIEFSGHI ICCRINAEDPTNNFSPSPGRLDYY l03 ΛI«IΕ3»ELFGENRAEIMπ^A SQVKETΣΛIX:r !NVU KI ER ΣΛ^^ LPPAGPSIRVTCACYSGYAIPPYYDSMIAKVIAKGiaroEEAIAI lKRALKEFHIGGVQST
TMDDVDPVSEDSIRWISCYKLIKACKPEFRSLISEIiRAMQSGLGLLSRCSRYOεRAKT IPFHQFMLJ3NPKFLESNYDINYIDNLLA0GNSFFKEF
VSHKTAPIJCPTHSYYRIIGYLTPLRAGPRYI INRAI
CPn_0183 221207 220695
CPn_0170 211098 210025 accB-Biotin Carboxyl Carrier Protein
No robust homolog present in Genebank/EMBL as of 11/7/98 RRLGMDI-XOI-^ IIAJ«;RfCMKRFAIKREGLEXr^E-raTREramθEPVFYDSPXFSGFS
NVRKNHIIRGEKYr4TCTVTAFVIJ3MSYI7πjKNLEKEDSVHK-C^IFAL QERPIPTOPKKr/riK£.ri l NSETSTTTSSGDFISSPLVCTFYGSPAPDSPSFVKPGDIV
EAIIKNLPKADIHVHLPGTITTOLAWILGVKNQFLKWSYNSWTNHRLLSPKNPHKQYSNI SεDTIVCIVEAMKvTINεVKA Ϊ<SGRvX4?/LITrraDPV0FGSKLFRIAKDAS
FRNFQDICHEKDPDLSVLQYNIIΛYDFNSFDRVllATVOσHRFPPσσiONεEDLLLIFNNY 3^LDDTIVYTεVQQNIRLAHVLYPSLPεKHARMKFYQ ILYRASOTFSKHG ITLRFLNC CPn_0184 221814 221221
FTIKTFAPOINTOεPAOεAVQWLOETOSTFPGLFVGΣOSAGSεSAPσACPKRLASGYRNAY efp-Elongation Factor P
DSGFGCεAHAGεGIETRTΣFSSAKVNPεGLIEΣTRVTFSSLKRKQPSSLPIRVTCOIiG OΛΛIKFCKEEKIMVXSSOLSVGMFISTKreLYKVTSVSKVAGPKGESFIKVALOAADSD
VVIERNFKATOEv EAOFεTP I YLYLεDεSYLFLDLGNYEKLFITOEIMKDNFLFLKA
CPn_0171 212444 211149 GVTVSAMVYI»π v SVεLPHFLEIiWSKTDFPGDSI^LSσGVKl«ALLETGIEVMVPPFVE
-guaA-GMP Synthase IGDVIKIDTRTCEYIORV _ lIKIΛSARRHIiWIFILDFGSOYTYVLAKOVRKLFVYCEVLPWNISVOCLKERAPLGIΣL
SGGPHSvYE^rKAPH DPEΣYKI ;IP∑ω∑CTGMOLMARDFα-TVSPGVσEFσ TPΣHLYP CPn_0185 222457 221765
CELFKHI'/DCεSLDτεiRMSHRDHVTTIPEGFNVΣAΞTSOCSΣSGIENTKORLYGLOFHP rpe/araD-Ribulose-P εpimerase εVSDSTPrGNKILETFVOEICSAPTLWNPLYΣQODLVSKΣODTV∑εVFDεVAQSLDVOWL AEVKKOESvXVGPSIMGADLTCLGVεAKKLEOAGSDFIHIDIMDGHFVPNLTFGPσllAA
AOCTIYSDViεSSRSGHASEVIKSHHrJVGGLPKNLKLKLVεPLRYLFKDEVRILGEALGL INRSTDLFLεVHAMIYNPFEFiεSFVRSGADRIIVHFεASEDIKELLSYIKKCGVOAGLA
SSYIADRHPFPCPCLTIRVΣGεiLPEYLAΣLRRADLΣFΣE-LRKAKLYDKΣSOAFALFLP FSPDTSIEFLPSFLPFCTJvVVXhlSvYPGFTGOSFLPNTiεKIAFARHAIKTLGLKDSCLI
IKSVSVKGDCRSYGYTIALRAVESTDFMTGRWAYLPCDVLSSCSSRIINε∑PEVSRWYD εvrχaiDMSAPLCRDAGADILVTASYIJεADSIJ«EDKIIiLP.GENYGVK isDKPPATiεwε
CPn_0186 222878 224063
CPn_0172 213237 212440 -similarity to Cps IncA
impD-Inosine 5 ' -monophosphase dehydrogenase (COOH-terminal P I KDKILMSSPVNNTPSAPNIPI PAPTTPGI PTTKPRSSFIEKVI I AKYILFAIAATSG region only) ALGTILGLSGALTPGIGIALLVIFFVSMVLLGLILKDSISGGEERRLREEVSRFTSENQR
APIGAAΣGΣGPLGΣSRAHHLVEAGANVLVΣDTAHAHSKσVFQTVLε∑KSOFPQISLWGN LTVITTTLETE TtDLKAAKDOLTLEIEAFRNENGNLKTTAEDLEEOVSKLSEOLEALERI
LVTAEAAVSLAεiσ/DAVKVσ∑GPGSICTTRIVSGVGYPOITAITNVAKALKNSAVTVIA WLIOANAGDAOεiSSεLKKLISGVTOSIWVεOIOTSIOALICVLLGOE VOEAO HVKAMQ
TCRIRYSGDVVKAIΛAGAx^^LCSLLAGTDεAPGDIVSIDEKLFKRYRGMσSLGAMKOσ EOIQALOAEILGMHt«£TAL KS\^NLLVOD0ALτRVVr:εLLεSENXLSOACSALR0Eiε
3ADRYF0T G0KKLVPGGVEGLVAYKGSVHDVLYQILGGIRSGMGYVGAETLKDLKTKAS KLAOHεTSLOORIDAMLAOEONLAEOVTALεKMKθεAOKAεSεFIACVRDRTFGRRETPP
F7RITε3GRAESHIHNIYKVQPTLNY PTTPWEGDεSOεεDεGGTPPVSnpSSPVDRATGDGO
Figure imgf000083_0002
241018 241983 ide Permease ΣAKLDFGNSLVYKDRKVTNI ΣS AFPISAΣ aLCSLFLSIG
241996 242868 Permease
ISTLIFTIPNAΣYTEAFISFLGLGIQPPOAS 243715 Transport ATPase
243682 244500 Transport ATPase
245802 present in Genebank/EMBL as of 11/7/98
245691 246002 as of 11/7/98
246327 present in Genebank/EMBL as of 11/7/98
246346 247161
247208 248617 -Oxoglutarate/Malate Translocator
WJ .-.I.'/.' i'ii -rii in iMhk/KMllt. .... ul 1 I, / •IH
Figure imgf000084_0001
iMi: ι.i/::γF!; viκι»,rnviΛiτfΛi;iΛtΛYBQNt IILJMTVK IL 7L3FPR3LLRTT3LWYRP
'.frι_!)2lll 252151 25144U
No robust homolog present in Genebank/EMBL as of 11/7/98
Figure imgf000085_0001
YQKLWEREP.r/FKTtRεKεHATr3TMLVELεAL|.RεFAHLKD0KPTSDCEiτSLYCCLDH
LEFVI_ :LG-;0KFLIΛTEDEσ/LFεS0KAIDAWIIALLTKARDVLGLCBΣGAIYOT:EFLG '"Pn_022 263402 263674
AYLSIONRRΛFCIΛ EIHF KTAIRDLNA-rfLLDFRWPLCKiεE rΛrjNtJCVε∑AKRKL No robust homolog present in Genebdnn/εMBL as of 11 '7/99
'-'FEKETKEUlESLLREEHAMEKCSrODUJRKtSC∑∑IE HDVSLPrFSCTPSQEEYOID YTFKNPKKNKWKFNSIIFLεrrKHYPOIFRεGFVRDRHGLMEASDWL Sτε∑TIIRSIL
..::» ::-< vii ;.' wri: s\—r- ι: π_0--'j 2ojej c _o4}41
CPn_0211 252765 252463 No robust homolog present in Genebank/EMBL as of 11/7/98
No robust homolog present in Genebank/EMBL as of 11/7/98 NSFTΣKFLLMTKNAΣNSC/I TP0PNLTDAεPΣASRAOCKSIAVΣΣSLFALGHIXLCl<a:i ECVMSYPDΣSNVQASS ΣOSALLHKTSDQIQQKRCFKOSTFVΣLAVSLVI ΣGSLFLLAGVA LISIPΣPσLAAOVALGLGIVSLΣLGIALANIGFLCLLLRCKOVPOKPDTLPSESSKOPSE I LTVFSHGVLSLVFGVLG ΣVLGLLLLAGGVGLLVEEAKSLL GSTPTALPWAGEFLEKVOVSATP ΣLLPKNKDEεLSAKVMKεGAEAASS IKOAVLESTEK
L∑rauWOEESRRεAiWKIVAεεAEASRKRΣi^MMAADOεALRKRKEEVAKRK
Figure imgf000085_0002
Figure imgf000086_0001
Figure imgf000087_0001
CPn_0266 299181 299876 CPn_0277 312003 311404
No robust homolog present in Genebank/EMBL as of 11/7/98 No robust homolog present in Genebank/EMBL as of 11/7/98
IMALDEIrWONNPSQQIASSTSCTSKIWDRKTFACTVTLLVVATLMILSσiVLLFTiσS NISIFYPITYFIEGKEVLΣ1OΛPPLIFYGVILMIINVRAPAFGITSVQ0FSTNFOAAIPIL
LGLSVPLSGΣLGTFAVTVGAV FITσLTILVPCSLGIEOKNEDLNFLKIKTPTPPARPLM NΣVTGCSRISSTYAZDIEEVAOEKΣ KSTHSKSSTSvTCWAHRVTtσVVEILGGGIVlLAL
SKFSVTCSTTSIVTGMALLIGAVVSVFFXTGYM GI AGLVGtffALFVAGLARMSPRS EITALVU!VIIKLΣKCLID\rti?/CtjrσLGVOΛ/AIIGAΣAFCV\ΛnΛn YlβFCSQGEEL£
LADOEGSGSADSQSNIVGIGEPKAAOEOKWYKMAVVRGEDGIPTAIRLTPEK PIEVKTLISPDKPYPT YV
CPn_0267 300122 300910 CPn_0278 312884 312060
No robust homolog present in Genebank/EMBL as of 11/7/98 •conserved outer membrane lipoprotein
VSΣMSLNKTNALLNOPEPAVCLNAWDPKYINQDRKTFA IVΓLLVΣATIΛILTTGVΓVLL RΣ3SMKKKLSI VGLIFVI^SCHKεDAO^IRIVASPTPHAε∑ εSLθεεAKDLGIKL IL
AMGSPGLSVLVSTΣIGTSVTTLGTALFIIGLVKLΣKKSLAWΣOYOKYFOΕVVKOKYΕPFS PVDDYRIPNRIXIJ3KOvOANYTQHQAFLDDεCERYI)CKGεLVVIAKVHL£POAIYSKXHS
:PKNT*IVHKLTSCLPSPIJ3lεSPSPεASTPVSiαΛΣACSGVAIVIβVTLLIGAVVSVFFC SLERIJCSQKKLTΣAIPVORTNAORALHΣJLiϊCGLrVCKGPANIiΦfrAKΣMXSKENRSINI TGYUJIALCVGFACLGTALFVGGLAGLRTHSLIAQG IMYLYLTYYLSSALεERNETVKDQ LE SAPIiVGS PD I)AAVIP a^FAIAA^IΣ^PKKDSIrLεDI^5VSKY^I^LWI SE^Λ CS RNε∑NTYLTεECRQOKREKALLε PICMIKLQi .FQSPSVOHFFiyrKYHGNILTMTQDW;
CPn_0268 300914 301318 CPn_0279 313546 312875
No robust homolog present in Genebank/EMBL as of 11/7/98 * Possible ABC Transporter Permease Protein
KOWA1J3IMSOC0SSSTSTWE MKSFVPNWKNPTPPLSPIPSEDEFILAYEPFVLPKTDPE ) KIMOSDLIOIUJ ETVNTLYMVSTAFFFSCAIGσMLGLGLFCTSPKSLNPICKSLYATΣS
NAQANPPGTΞTPNVENG IDDLNPLLGQPNEONNANNPGTSGSNPTSLPAPERLPETEENS MIlSFLTAΣPFAΣI.ΣVIΣ FPITRWrvσrSlΛWASIVPLTIGAΣPFVVTIV n FiaJSAL
QEEEQGSQNNEDLIG
QYGWYPJ∞SViτSVLVITLVLIESVRILGDFMGRRVLKYRGIL
CPn_0269 302468 301476
Dipeptidase CPn_0280 314593 313550
VAFT«CVMTIDMHCDLLSHPHFCPjωPATOCSPEQLLSGGVPUMVωiFVPHSRGEPNCDK dppP-Dipeptide Transporter ATPase
QNSIjrFSLPNQYPDIGLLSYEEEENGSSSOKKSI^LIP SIENASALGDDTAPI^GTLLAKL IKGEAWLVSEOHSPI I£ ^DVSlα r DHII SKVSFSVYreεVFGIVGHSGSGKTTLLRC
IHLTTOGPIAY GIV KGDNRTGGGTEAPKPiSNMKvTiDIMYELGVPIDLSHCSDKLA LDfTJWPTSGSISVAGFIΪNSLPTrjKFSRRNFSKKVAYISONYGLFSSKTVFENIAYPLRI εDΣLDYTADIΛPNLAVΣASHSNFRSVXDHRRNLVDAHAKErVTlRKGViσLNLVRSYVGDS HHSEMSKEVEEOyYDTLNFΣJ^YHRHDAYPGNωGGOKOKVAIARAIVCOPEWLCOEI l^DI.EKHVLHAENLGILSSIVLGSDFFYANEDENFFFNECSSAEAHPVLNQLIHRIFSKG TSALDPKSTENI iεRIAQLNQERGΣTLVLVSHEir)WKKΣCSHVLVTmC ΛVTELC?ITEE
KAESILSSRAEKFLKOVIVEOVNPKITDVKL LFLNSENSITNELFHEDINrAALSSCYFAEDREEVLRLNFSKELAICGI ISKVIQTGLVS
INIIΛGNΣNLFTtKSPMGFL∑rvXEGEVEOP KKAKEΣiΣELGWIKEFY
CPn_0270 303343 302468 ywlC-SuA5 Superfamily-related Protein CPn_0281 315033 316103
31FσVIVPDKKAOΣTFSLPEWSAΣHCGKrVALPTDTVYGFVLSLYASEAEεRLYALKDR dhnA-Predicted 1,6-Fructose Biphosphate Aldolase (dehydrin
EPSKAFALYVNSΣEDIENISGYPLSPTAKKIAQIJPGAITLVVKHRNPRFPKETLAFRIV family) -
DHSWRεiVDHCσTLΣGTSANLSεFPSALTAQε∑FADFADHDLCIFD3PCSHGLεSTWA ISLRRHTIΛLNIHDΣLCOT)DENIiSYCCKHITKDKLTLPSHDFVDKVFGLSDRNNRVLRS
3DPLYIYREGLΣSRSVΣENΣAGTεAKΣFHRTSHAFSKHΣKΣYTVKNQε0LVSFLSGSLDF LCThlFSHGRI U^SGYL5ILPVDOGΣEHSAGASFAΣNPΣYFDPENΣVKLAIESGCSAVAST
KGVVCεHPKPKNFYτRLRεALKKKTPSIVFΣYDINTSDYPεLFPFLSPYYiε YGTLSIiSRKYAHKIPFMLIUΛHNI I^SYPTKYHQIFFTOVεAAYSMGAVAvOATVYFσS εTS>iεεiVAVSlJAFAKAP^IΛIATVLW3YI WPAFVANσVDYHTAADLTGΩAI)HLGATLG
CPn_0271 303628 304362 ADΣVKOf PTCOGGFI AΣNFGKTDERVYSELSSNHPIDLCRYOVLNSYCGKVGLΣNSGGP
Lysophospholipase esterase SGIO DFTEAARTAVINKRAGGMGLILGRKAFORPLSEGIQLLNLVQDIYLDPNITIA
KLMTDYSFFRRKIGN∑εAiεCPGNPODPΣ ΣΣLCHGYGSLADNLTFFPSICSFSKLRPTWI
FPrWILPLENDFRGSRACFPLNVLLLOELSRLYANGVGNLOεKYDεLFDVDLETPKEALE CPn_0282 316084 317529
ELILNLNRPYNEII∑σσFSOGAΣLATHLVLTSONPYAGALIFAGARLFNOGWεEGLKOCA xasA/gadC-Amino Acid Transporter
QVPFLOSHGYεDε∑LPYHLGAHLNDLLLTKLNGOFVSFHGGHεiPSWFQKMQVTVPN Σ ΣLΣLQSIΛFSKKvTMHSHSKPTKPLGTFTvTJMLSLAVVISLRNLPLTAKHGLSTLFFYGL
DPARG AVICFMΣPYALΣSAELASFKPCGIYI ARDALGK GFFAI Mg FHNMT YPAVXAFIA
STIVYKΣNPELAHNIΛΥΣATVΣLAGF ILTFFNFLσ∑TSSALFSSICVI ΣCTLIPGVILV
CPn_0272 305272 304340 SLALFWIFSGNPΣAISI-SWGNLLPNFSWSSLVLI-AGMLLALCCLEANANLASDMVNPRK dnaX-DNA Pol III Gamma and Tau NYPKAVFICAIATLTΣL'.XGSLSΣAΣVΣPKεEΣSLVSGL'/KTFTLFFDKYNLS MTGrW
FNROSDArATWVMHLεεεNOGWεALLRKVYHOεVPPAΣLLHGFTLPVLODKAEQLASEI VMTΣAG3LGELNA*mFAGTKGLFΣST0NlX:LPRLFKI VT[3KNVPTNLMLFQGIVVTIFTL
LLSSSPGSEHKVSOKΣHPDΣYQFFPεGKGRLHSΣDLPRGIKKQIYΣSPFEANYKIYIIHE LFLCLOSADLVYWΣ LTALS VOMYLAMY ΣCLFLAGP Σ LP. Σ KεPRAQRLYSVPGKFLC ΣCTM
ADRMTLAAISAFLKVFEεPPKHAVIΣLTTAKVORLPKTIISRSLSIFiεRGεKΣLCSKET SILσiLSCAFALWVSFLPPRELAQISεGSKΣGYTTFLLLAFSLNCLΣPFGΣYFTHKRLSK
F YLFRYA0Cε∑PVTEVS0IΣKESSεTDKQVLRDKV0RFMEVLLELYRDRYTLNLGLKAS
ΛLNYPεHVKε∑LOLPLLPLDKVLLΣVεSACRSLNNSSSAASVLEWVAIQLVSLQYKεKεL
V3VSPG0DLSN CPn_0273 305853 305227 tdk-Thymidylate Kinase
- iVFI7iεGCεσ3σKSSLAKALGD0LVA0DRKVLLTREPGGCLIGERLRDLΣLEPPHLE -Pf.-CELFLFLG3PΛQHI0EVI IPALRDGYΣVICERFHD3Tr'/-/,θσiAεGLGADFVADLC
, '/VGPTPFLPNFVLLLDIPADICL0RKHROKVFDKFEKKPL3YHNRIREGFL LASADP :PYLVLDΛPE::LΛ.-;LΓDKVMLHTOLGLCT
Figure imgf000087_0002
I lι_0274 J tHΛS 305852 <"r-n_ 2'<4 U-O'.4 I 1 H 5 1 ly i A DNA ι;y r i-.i- .>utιun ιt A Nu I 'Jii .l hoiwi l.vi pi .-.'.i lil i n ι ;.-nι.tj.j|lk/l-.'1bL .ι. ul l l / / /"H I. .T i eMPIIKDC t tVPKNLEEEMKElIYLPYSM.Wt ir,PALPDtRDCI.Kl':;QRRVLYAMKQL i i*iNiPΛr vr-vιιn-ι-vvNNT:;.-;γι;ι.';Lκ.:.':Lr'MTYi.rLΛ[iΛiΛτt.M-:vι,YFιMπ:;
.l.::if;AKHr'Kl-Λ Iι.-'JDTΪGDYHPIlt;εSVIYPTLVPMΛON AMRYPLVDCXjι;NFGS IDCD vιrrιvι :MLirt..:vr.'viΛ'VΛYi.rγι j';.';iEi'TFvrr;ιτ':r-:;vπ-:nr.pijjι.ι.u: EED:; I l-/\ΛMPTrπΛI<L'πi:;/\MYLMCDLDKDTVDIVPNYDETKHEPV/FPS-KFPNLLCNGSSGIA v;ΛiDπLi.κNFPΛi)iii'HRpκMi.rγ':nFi.DEijι;r't-NK!;i't.ED.';HT':κu. ViMΛTM I I'l-HNUJKI.I EΛTLLLLΛNPOASVDEILOVMPGPDFPTGG I ICG3EGIRSAYTT
IRi;K I KVPΛHLIIvmjEDKHRE31 1 ITEMPYNVNKIIPL I EQ TANLVNEKTLAG ISDVRDE Cll WWi l.'UI.IM 1 1 -111', I
IbKU ir. l'WLC I KKGES.'JC r. I tNRLYKrrDVQVTFTJAJIMLALLKNLPRTMS IHRMISA I Nu rι,r,ιιsr homo l. |.ι .-., -ι,ι , ,, ι ;, .M<-t»ιιιk/l- Mr'l . ι-. m i l / / /'iH l'IIKFE'/ I I'P -rRYELHKAETRAHVLCGYLKAL:;CI.DALVKT tPE3GNKEHAKERi rESFC κι:LHHLFFrr-iNκcrr\.:ιιι:ι.iγι.κιι -:F-:ι..:ι.- '[Li;ι.iΛi.;vι,ι.ι.u:vvFΛi.vc--ιιvι.
Figure imgf000088_0001
CPn_0288 325785 324571 CPn_0299 336726 337415
CT288 hypothetical protein recR-Recombination Protein
ISITIREFLFFGFεCRAIFYNVIMSCFNLTSTNESΣJIPISPKASFPKQGWJSYFRSALRK RWαLVYΥSESLYSMJΛGPRPECKNKIHITOTRYPDYI^iαiFFIΛKΣJ^IGFKTAEltlA
HRSDTLSVSVCKWKYDANLFVRLTVIAIAVvσVLILFSIMΣ^ASICπiVITSWPLVTAA FεLISWMEO∑^I aiAFHNVASERSHCPLCFTLKESKEADCHFCRεERENOSLCIVASP
ILIPTΣLLTGX3MYIIΛRLGKKVDVISGVCIPPFSRRCWVPΣSSSHTLεKFDEKHVSACSY KITv F ERSK FKσRYH l 3S I^PΣ GKH∑ ^rERI^IL SRIETLCPIEIILAID T E
^ISTLSAtXSSGIMVYKPPLLFRAFPCFGIPCAMPFVALLRMΣYNLIRFLVVPFYIΪF GMTAIJLJCOE∑ΛHFSVNISRI- GLPΣGLSFDYVDSGTLARAFSGRHSΥ
RMΣYEHFFCKHLPEDDRFIYKITv7iREMσRSΣAAFIKAPFYASA∞IGAFYSIXDPIΛGRV
LMGSVεP»NrjrWILAPWSIJU4BWSIJRFEGGGσPJ<GΣ MHAFYΣJ<I CQP0SVFLFD CPn_0300 337768 340152
KGεrVSGAHPSIOLPERRGLDTSGRYPHISVIPDSGNDSAKNFIV yaeT-Omp85 Analog
SR a<LIMRMKVII }ISIΣALICrPLTLFSTεKv1CEGHVVVDSITIITεGεNλSMKHPL
CPn_0289 325797 326996 PKIjπTl-^yUJSOLDFDEDLRILAKEYDSvΕPKvΕFSEGKTNΣALHLΣAKPSΣRNΣHISG
CT289 hypothetical protein WVVPEHKI CTLOIYPJroLFEREKFlJCGLDDLRTYYIJ<RGYFASSVrrYSLE»»3EKGHI
NFNRΣΛKKORSHYKKNNLLLli^ILΩIΛΣΛSVQSPWΣVYSAECΣANTFLKFIΛLLSIPL tJVLIKINEGPCσKIKOLTFSGISRSEKSDIQEFIOTKOHSTTTSWFTOΛGLYHPOIvΕOO
\ffrιτ/;ιrrττgτnπnjτι«rrτnιπττvvn-rr.--^fτ..gτπr.r.r.rπ.τ.pryy(T ηnΛr.»ττ SLAITNYIJt»rjYADArVNSHYDLDDKGNILiYMDIDRGSRYT 3HVHI0GFEVLPKRLI
TCCNPLGγ∑JWLSrm,PENΣFKPFU3CWISAACIAVLΣOTASUTQEKErø Eκosore rot'Y PDKIwrλHKIκo YAKYG^IN ^ιvDVIJ PHA RPrγ^τvTYεvsEG
FFSIFUJΣARGG OiPIAMLGFSVILFKEIJCMSNLTMFAEYLLCVIGANI-AC^ SPYlTΛrGLIKITC5rmrriCSDWL«CTSΣJ,PCOTFNRΣJαjiOT^
ILUINKVSPIJCVAKAMSPALVTAFFSISSAATLPLTMEIAEDDIJCINKNLSRFSFPLCS SQIJ3PMC3JA ϊYTOIFVEVKETrTGr4IΛLFIΛFSSUlNUra^ ^Λ rJCΛAFI Σ VLFVATSNσMIISPIiffi ^πFIA Σ««ΣG^IAσVPMσCYFLTLSLL
TSMNVPI^IIΛLILPFYTVIDMIETSLNVWSDCCVVSLAN
CPn_0290 327027 328523
Na-dependent Transporter
RSALTMNIOXHASFSSPJ FIFSMiσiAvrjAσNI*mFPRVAAQNGGGAFLΣLWLCFLFL S
IPLΣIiεi^IGKLTiαCAPIGALIKTAσ!«FA AGGFΣTLV rCΣIAYYSTIVrΛIGLJ5YFY
YAVSGKΣHLGNDFAKL TSHYQSSΣPL AHLTSLGLAYLVIRKGIVHG∑εKCNKILΣPAF CPn_0301 340163 340762
FLCTIAIJ RAVTLPGAVOTIK0LFSα)KSCFSNYKVWIEALTQNA im;AG 3LLLVYA (OmpH-Like Outer Membrane Protein)
GFASKKTGVVSΪKSALTAICNtΛVSLIMGIIIFSTCASΣΛILGTTOLODGAGASSIGITFI IKDLSKEIFWFRKGFWYPFSΣ PKLVO ∑MKKLLFSTFLLVLGSTSAAHANΣΛYVNLKRC
YLPEΣJTRLPGGIYLTTLFSSIFFLAFSMAALSSMISMLFLLSOTLAεFGIKPYISETLA LEESDLGKKETErOJAMKOrFVKNAEKIEEELTSIYNKI^DEDYMESI^DSASEELUrKF
TIIAFVIΛIPSAI^LTFFSWDTVWσVALIVNGLIFΣYAALVYGFPKLKKEVINAAPGDL E»I^SGEYNAYOSOYY0SINOSrΛrrøΣQKLIOE\KΣAAESVRSKEKLεAILNEEAVLAIAP
RLNKAFDYIIKYLLPIEGILIXG YFYEGLFPENGOWWNPΣSLYSLGSLVLQWSLGLIIL GTDKTTε∑ΣAILNεSFKKON
WKFNKQLYLRFSRYNHεiL
CPn_0302 340766 341866
CPn_0291 328658 329194 lpxD-UDP Glucosamine N-Acyltransferase incB-Inclusion Membrane Protein B SKFKEFS14SεAPVYTLK0LAεLLQvEV0GNlεTPΣSGVεDΣS0AQPHHIAFLDNEKYSSF
EKHMSAPΣPTPOELSDOITCLNVOYQQVSELARENKGDIEGLKTLTAALTADAGΣOPSAD UOmCAGAIILSRSOAMQHAHLKKNFLΣTNESPSLTFOKCIELFiεPVTSGFPGIHPTAV
EIvSLQTAAALILSASEKPGSGPSGSTEGSVTVQSPCKFKKVLAWL'":iALIAIAVLIA ΣHPTARΣEKNVTΣEPYVVISOHAHΣGSDTY∑σAGSVIGAHSVLσANCLIHPKVVIRERVL
CIΣAACGGFPLLLSALNLYTIGACVSLPIΣASTSVALICLCTFVANSLΣKPVΣTVRTTR MGrøVvTJPGAVIΛSCGFGYITNAFGHHKPLKHLσYVIVGDDvΕIGANTTIDRGRFKNTV
IHEGTKIDNOVOVAHHVε∑GKHSIΣVAOAGΣAGSTKΣGEHVIIGGOTGITGHISIAOHVI
CPn_0292 329201 329836 MIACTπVTKSITSPσiYGGAPARPYOεTHRLΣAKΣRNLPI rεεRLSKLEKCVRDLSTPSL incC-Inclusion Membrane Protein C AEΣPSEΣ VKNTKlSDFMTSPIPFQSSCDASFLAEQPQQLPSTSεSQLOTQLLTMMKHTQALSεTVLQ
OORDRLPTASIΣLOVGGAPτGGAGAPFOPGPADDHHHPIPPPWPAQIETEΣTTIRSεLQ CPn_0303 342982 341021 LMRSTLCQSTKGARTGVLWTAΣLMTΣSLLAI 11 IΣLAVLGFTGVLPQVALLMOGETNLI CT303 hypothetical protein AMVSGSIICFIALIGTLGLILTNKNTPLPAS REOKCLHHMDVSRKΣNRHTOFYVDSΣDCVΣKNFDHKPSEDKSRDHεεLεεKLLTΣTKRIV
ASAQεFONRKTDSKNYYLKKTOWLPFKNεεLECTKELFAMLTSMDKKΣAOLFFYSPGCSS
CPn_0293 329940 332723 D vΕFTr/ΣCHLNDSIGLGGvT.LCCGLFεC<!CEHVvTvTIKKLDLPLLU3TTVVNSLRYYL
T234 hypothetical protein TYWI3LLNC0≤MSELσκεLCDvXK0HGVAFTLIFKEIOTID ΛYVKLIOGU(RSGNI0
ARIYDNDVPTLPSVSSSPΣALRYSLANTΣRCLALHVDFSSLKFISPSΣLSNTεHTAKALN
^GσECFIFSNLDεFNLCMKIVMOLLRTCKfpεiLNKNIMKILMΣKRRVRSLYΣ
nEΛ-Λ/llKTt IΛLDI FVKDLLMTT OLKNT..RKYΛI.ΛM k l-LDKEVAPAFLOVLTDE
Figure imgf000088_0002
I I I IIY tOKk ,LLA 'Pn IHi 11414. II I /
NTI/l M//ΛI-VNFMI. LIλ,lLCSMEH CVL[RΛLTGKNQK IKΛ Al ESLEKIF D3HLF3L fjιlh_t/'*lr>n iviuvir, ϋt.hyirι ιι ti i i, r ,
I tl PΛirl< H Y EK'kYFKrcvt LTLKεLLNMMCN'ΪP LNKl rΛQOLKEEL3YCDPDF HKE3MI K1IK 11 I tKLΛt Hl-AtOFEM PDI NV 11/ rπi n/I A, KVTKϊ.I [.PK r.PKRV o vrm/NoeiiFDi-RrtE.,r-TLi3!-L.'.t lUlxVl ΛJSΛt * ti [I ΛΛI I I'l [ [1CM - ill r/ΛII«jlI IIAAKMIIIKri. KFΛ/Pt vmr-rtr i on ,ιu tn
111077 l 11SD2 I I EYNI I" /I VPI. KΛIIPV IT HUI 1 ( 11 / I'MV I IKlAl IΛKKPWT L..ICI
Figure imgf000089_0001
CPn_0313 354957 355355 acpS-Acyl-carrier Protein Synthase
HKILKεiSANSMEIIHIGTDΣΣEΣSRΣRεAΣATHGNRLLNRΣFTEΛECKYCLεKTDPΣPS
FAσRFAGKEAVAKALGTGΣGSWAWKDIEVFKVSHGPEVLLPSHVYAKICISKVILSΣSH
CKεYATATAIALA CPn_0314 356285 355353 trxB-Thioredoxin Reductase
M IHSRLI I IGSGPSCYTΛA IYASRALLHPLLFECFF3GISGCOLMTTTEVENFPσFPεG I
LCPKLMNNMKεOAVRFCTKTLAODIISVDFSVRPFILKSKEETYSCDACIIATGASAKRL
CIPlJAGNDεF OKGvTACAVCDCASPΣFKNKDLYVKyMDSALEEALYLTRYGGHVYWH
PPDKLftAiKΛMEΛRΛONNεKITFLWNSεiVKI.SGDπiVRΛvOIKNVCTCEITTREAACVF
L-ΛIGIIKPNTDRUXIOLTLDESCYIVΤΕKGTΞKTSVPOVFAΛCDV DKYYROA'TKAGSCC IΛΛLDΛΓRFLC
Figure imgf000089_0002
f ι_e> 11 -, ιSiιi77 ιr,H71ι> |-PM_ I24 ln-1463 J /OhO i I ..I Kit*). t*n il Ptoti.-m ιTJ.4 hypothor l .il pror in
Ml'K0Λπ-rTWι:-:KKI(.DNtCCLTεDVAFFKDLLYTAIIPtT*:-;CCε:TNEI Pi'!Λ£LKI7rW \VVΛIIRRHMΛA3Gθ GGLCGTyrΛILAΛVI \/'u\KΛDΛ.E,/VΛ.';θεt;:;DMNMtOOSODLT IHNKDI'VWIiVHLK.-EOVtPMIIEFlDn^EGLV ^lEVnVYt.DOAIIlFFAlK LGREKATR NI-AAATRTKKKEEKrOTLEIlRK'GEAOIAl K r.THXKPDTD[ΛOKYA:-TN.';CIflG0EL ' HljWI YII.MICI-.Kll-IIVKllOrrRKVKail.lVDIiMEAFLPIinOII'NKKIKNL DYVGKVr l&IU'UAltiULVtfPEDILAl.VXjErlK.il-A i .i I K ILK IHVKRR I WlIRRELLEΛER t.lKKAELIE 13 ∑f ICYRKiiWKN IT FOVFLDLD αιt-T|RTΛIι:AKNtLF,VΩEYΛDGUr/';t--':i P ./I.r TrjDTIrTCDOLt-IMLODRYTYQD i.irx-,LI.III'ri«TWKIHHNP:KMVFIWELEVIIL;vr;KEKGRVA UK KEIlMr*IEDIEK MΛiv.';.'.τι*ικGMΛ-rELKnoι;pr/r--:Λ jvι«-ιι-ι-ι,tιι/jΛVL.'r:;γDYFE:;RVPtLLR:LK 383405 384034 protein
384160 384495 ENAQVSEQI I 385062 OMP (leader (19) peptide)
Figure imgf000090_0001
Figure imgf000091_0001
CPn_0352 395478 396830 CPn_0363 409700 407943
No robust homolog present in Genebank/EMBL as of 11/7/98 flhA-Flagellar Secretion Protein
VΛra∑FFINSHFTNSYAFFNQIVIΣ'IVPΛSC<riιWCSPLTLVPHIFLINDCECHSSCSLKI EAVTVSGKKrxWRGMIFVTLSILVLΣFLPLPOΣLLDFGΣΛΣSFAωLLTVCWv TLNSSN
RTIARLIIΛLVlALVSAI^ri/FLAAPISYAΣGC ΣAIJ^ LIITLWAΣiAKSKvXPΣ SAKLFPPFFLYXCLΣ UΛΣΛΣASTR WSSGTASSLIVSLGSFFStΛSLWMTFACLIXF
PNEΣΛK∑rnmYPKEvTYFVKTHSL'Tv^EΣJCIFINCTKSGTDLPPNΣJflCKAliAFGIDILK FVπTLMVSKGSERIAEVMRFFrjJ^PAXC^ALDSDLVSGPΛSYKAVKKOKNALΣEEGDF
SIDLTLFPEFEEILIΛNCPLYWLSHFIDKTεSVAGEIGIΛKTOKVYGrLIΛPLAFHKGyrT FSA^lECVFRFVKσMIISCIΣi V^JVVSVTCLYYτS-^YALEQ WFTV^
TSCAAATLISKIDKlϊESIXim.FEYYKOLTOHFRWSLLIFSLCCIPSSPKFPIVLLASL
QFLFΣJFSHGITWEQAQMIQLINPDtJWKMl QFDKAGGHCSMΛTF∞FljπΕ^^ LWLAYRKEEPASEDSC IERAFSYVEGACPKEQESQFY0 YRAASEEvTEDIΛV1UΛ»vT.TS
S^YEPT r^FMT^«E Kvl EKV^{ESPMHPASALV^3KICVNI HHONΣ KR^ro UlIElWPWIJtVFGONVYI IEWrPEAVLPFIJWIAHEAIΛAEVVOKYIjeESER
WTSSLPOYAFHAQTYKI.EKKlεSSLPIRSSL IVPlOζlSLSSLWl^RIXVPjSVSUαjPKILEAVAVYONSσDSLEΣIAEKVRKSLGY I σRSI TOKC/l TIDFHVEELINSS SKS PVMOErV RRv^SLIiRSVFI03FM
CPn_0353 396893 397135 SCETRFiaiKMfflOPHFPDLLVLSHDELPKEIPISFLGIVSDEVLVP
No robust homolog present in Genebank/EMBL as of 11/7/98
U^FRNIKKSLIFIKRΣRYSQSGKEQKGARPFFKKSΣTSSLVILLLEAIFNENFSSIIONN CPn_0364 409954 410238
F^^KNFK K ISI RIFVKFTI fer4-Ferredoxin IV
KE2<SMAi vITSDDEOOEFlaE∞SEIAEKESH:∑PFACTEG«rrVIEVXH3lENLS
CPn_0354 397062 398507 εi^ISEYDFLGEPεDSNERI ΛCCRIKGGCVKVTF
No robust homolog present in Genebank/EMBL as of 11/7/98
YCTISIKILKIKTFI1.IGFLIΛΣΛYNT0IDEPRJCQISNITSPVIO1WMCNYYTEIJWST CPn_0365 410498 411544
TIHIWSAIIiCGALIAl^CVAAPVSYILSσAIJ .IΛlilALΣGVILGIKKITPMISSKE No robust homolog present in Genebank/EMBL as of 11/7/98
QVTPQELVNRIRAHYPKFVSDFVSEAKPNΣJCDLISFIDLLNOIΛSEVGSSTMlfNVSεELQ FKGTOVNSLIMATISPΣSLTVDHPLVτriiαKSCSIffDKIQSRIΣXITAIFAv,vTIθrrL
QKIDTTEGIARLKNEVRTASURΣ SAASSRPΣJPSLPKIΣΛKVFPFF LGEFISAGSKV ΣGΣAIΛIPVrYFLTCISFIAVVTSNFILYirjWTr∑iKPRA∞KHKEIKPlO^
VELHRVKKIGGSLEmL5DYIKPEMLPTYWLIPLDFRPTNSSIIJΛHTLVIAR\fLTRl7VF ISIAJNIUKENVπεHOPKI)I«NLPAPSALLT17NPYEIMKAKHSLFSLVSlXPβrjNPεHI.I
QHUCYAAU«EWrrtimSDLNTMl<∞LFλKYHAAYQ_rYCTI QPSIΛm^ SASEtffΛKTI IEETSCNAPΣSSYVDTTPSPKSLI^EAIQETRVEim'ELPAGDSσERLY
RYSVnWMSLIKTVPADLϊmjLCCLTIΛHTGP?Ql»IEFASLIGTLYTCπ,IHKESIiAFLSS WPDFTiσRVFLPOIPTTPEAIYOYYYALYvTYICTAINTrmjIIOΣP YSIΛEHLYSREL
LTLULTOFKTIRRQSTNIAMFΣjaiΣATHNSTFRSLPPITVHPLKRSVFSQPεEDESSLL PPQSPJI∞SI WΣTAVKYMAELHPEYPLTIACVERSLAOLPOES∑εDLS
CPn_0366 411976 412440
CPn_0355 399955 398591 No robust homolog present in Genebank/EMBL as of 11/7/98
No robust homolog present in Genebank/EMBL as of 11/7/98 MGλ PVSA'T7VLFESPAAPLINSANTONOKLIEIJCGKO<3AESSPRTITεVIlα5V ιVIGC
CLIVXSΣ-LAIRPALOFTLETGHPAAΣAVLAVSGTILLVAVΣ ILFCFLAAVPFAAKKTYKY
WTVDDYASWHSHQCTPTLGTΣFSGIVYAESQAOL
I KlW PIFEIIΣJLSCTCPLYWlΛ3KFISAσDroVCPJ3tΛvT>RECYGYYVΛΛPI/ΪY CPn_0367 413078 413836
ΣFCKCTHHΣUrøLTKεDT.LI αjKALQ-.røDTDEVIttlVER No robust homolog present in Genebank/EMBL as of 11/7/98
KεTISKEΣiLLSLHGYSFDOLQLITQLPRIlAWIWΣΛFVDNSTAYNLOLCALVGALSSONL SFPUrRYFMTKTTSIP17WENQSHLSVDERLISεSP\rLTKJαWIAKIIKLTALILALA.A
LDESSIDFDVNIΛLYVI0DLKεAVX3AFSASDEPICKELCKTlIJUiIΛSvSKlU.ESVLRQGL VCTAWAGVTiT.MPLMAΣATGAAΣJ-WVVWCIJ-ΣΛRRεPSKPrΕELΣΛ
HRIAIJMGNARARVYIMαFVTGARΣHRKTSIFFKD V13PSWLIJYOKLIΛNEϊm-VNTt^EΣNISWrLODPNORYΥV EHOGAPITLVATTGDIAK
PRLCTSGRVMIVNAANS»10SGr»GTNAAI^AATHPTWNNTRTSGGKINTGKGLSVGEC
CPn_0356 400465 400109 RSAP INRDWTNK
No robust homolog present in Genebank/EMBL as of 11/7/98
KQ\Λ3LFQYrøεSGWr ru:DFDSθσEGFQLSRLVGLLHSS ALYEAKE0FYLPEVSLLT E CPn_0368 413766 414107
ELIEWUώKPTKHCΛ/AlffiLCNVFεKHFQRFRQYLσSLDLlJQRFεNTFLNYPKYHLDRε No robust homolog present in Genebank/EMBL as of 11/7/98
TI uXrjY WVNAAOHPGSIETGRINDTNPGEAHFlΛOLLGPKYEGεLKAHPεiI^NVIiαCA
CPn_0357 401341 400469 YLrCFOEALNNOATVVOVPLISSSIYSPσGKLELεPVNOTKPNSSAYKLYHIRT
No robust homolog present in Genebank/EMBL as of 11/7/98
YSSHNGASMVN∑OPWRNTOVNYSOATQFSVCOPALSLIIVSVVAAVLAΣVALVCSOSLL CPn_0369 414345 415562
3IELOTALVLVSLΣLFA3AMFMΣYKMROEPKELLΣPKKΣMεLΣOEHYPSΣWDFΣRDQEV CT058 hypothetical protein_2
3IYEIHHLΣ3∑ωKTNVFDI<APWLθεKLL0FGiεKFKDVHPSKLPNFεεiLL0HCPLH NIMTDSNPLPSYTD.SLYRTPAiωEΥPΣRLPLNRTDR∑εKΣLKIVTLTLALACALGFSΣA
IΛRLV-/PMVSDVTPστYσYY CσPLCLYENAPSLFERRSLLLLKKISFGεFALLεDGLKK AGΣLAMPΣFSAWVITLAΣAAVSLYSLLKKPKLYεiLPOIEPESEOSSLSPSPOPPEOOD
NTWSSSELVQIRONLFTRYYADKEεVDεAεLNADYεQFDSLLHLIFSHKLS LPLOΣDPLPDPES PεVSLADLTTPPεεLTAΣTVTPGYεALLEONWDLLPSLAAVDPSFT
TETP∞PCFIWIΛDSKLIFΣSTSGDΣAVPRΣIWQσRVMrVNAANENΣSREGAΪGTNKALS
CPn_0358 401757 401578 l_ATSLO INASRLFRAHSRSGSOWPGECRSAKWENSDHTSNDHVPGKAHFLAOLLGPEA
No robust homolog present in Genebank/EMBL as of 11/7/98 AKCNNDPKOAFEVSKKAFHNLFOEAεiiσVDV∑OLPLΣGCNLFAPSRLLNLGKTRAE ∑ε εε'/LSV3MKLlPT0DSΣERεTDSKRDKKΣFTIYΣCSSKVLAGHFFSHLDKHNKIHεSIGV AΣKLALΣTSLODFGWεOONOεεOKΣΣILTDKDQPPIIPPRFDLTTP
CPn_0359 401994 403817 CPn_0370 415755 416912 lepA-GTPase CT058 hypothetical protein_3
ITLOYILK.εYK∑εNΣRNFSIΣAHΣDHGKSTΣADRLLESTSTVEEREMRεOLLDSMDLERE KRΣFFKLFVFYLKSFMSTTEPNLTNVNLTMLI3SESMPT0LASHKLKGLDLVAFILIIGΣ
RGΣTΣKΛHPVTMTYLYEGEVYQLNLΣDTPGHVDFSYE'/SRSLSACεGALLΣVDAAQGVQA AVSSCTAAΣΣLGΣFLLFILTALAVLAFSILLYFLLRεPKSPΣSVTHOPTPΣΣKDTDLPPV
QSUVNVr/LΛLεRDLεiΣPVLNKΣDLPAADPWΣA∞∑εDYΣGLDTTNIIACSAKTGOCΣP PP ALTP PτEA\XE PP PSPRTHC^LU3ε^lWDRΣPD 0A^rTDMPFΣAAD CTGYA H
AILKAΣΣDLVPPPIAPAεTELKALVFDSHYDPYVσ∑MVYVRΣΣSGELKKGDRΣTFMAAKG K S LT IΞT CFIE RYKTCCI MIvTJAATP^IMA NVKGTS A AKATS RCWENSKK
.';3Fr/LGICAFLPKATFΣE3SLRPCQvT3FFΣANLKK'/KDv ∑GDTVTKTKHPAKTPLEGF SPDPUtSKOPLCLGECRSAKWENUirrrTNAGKAGLPOFLGOLLGPKASDYNYNPNDAFTF
KεtNPV FAπ∑YPIDSSDFDTLKDALGRLOLNDSALTΣEQεSSHSLGFGFRCGFLGLLHL CROAYLNCLNεAKRRKTTV\-CLPLL33MFPG.';pKDEεTTSLRLO IlXTVKLALIDAL0TF
KUFEP.ΣIRπFD ΛIIΛTΛP.WIYKVVLKNGKVLDIDNPSGYPDPAI∑εHVεεPWVHVNΣ GSEΛENONOP VIILTTLΛRHPLITP
[TPOEYL3N IMNLCLDKRG tt.-VKTEMLDQHRLVLΛYεLPLNεiVSDFNDKLKSVTKGYGS
FDYRLGL-YPKC:; 11 KLEVL INEEP ΣDΛFSCLVHRDFAεSRGRS ICεKLVDVl PQQLFK Σ P CPn_0371 417141 17--, :
I'jΛΛIMI'KVIΛRETIRΛI.ilKUVTAKCYGGDITRKRKLWEKQKKGKKRMKEFCKVSΣPNTA No robust homolog present in Genebank/EMBL as of 11/7/98
''IKVLKIXi KTMPVS3APLPT-:ilRPS:XlNlι';LMEPtl.-:KΛI.I'ΛKII DrrrκTrK!.l,VKILVA[LVtEVLG itAAFFrpi7Tppu ii *;t.n.Tr/i -vi.i.i.7iKL,\i.viiKTnTrrAi-A,vtK K ;sK3i
'|ιι_OS'.i| l(lr. Ii..| 41)1922
'Ti i) yimi ti.-r lu.il rotein
VΛ j-|'fl| ;i.Iι;LΛVM(.;KNLVIΛMIDHGFSV;.-VNRTPEKTRDFLKEYPNIIRCLVGFε3LE ι.i'vM.:ι.ιτ<ιuκtMiΛiuΛι;κι-VD :;HiΛLLPFi.ει jr.vιtrjr ;N.';γFKDSERp κεL0EK .uι.i "Λ-,i:j»; w;ΛRiiι; :;rMπ'r.NPFJtw Λ rF :UΛΛK oι;RP cr,- oτGt;AG
I IYV Λ7I im : I RYi IP IOI . Iι AY'" 11.RDFLKL3ΛTA ΛT t LKE NTLELESYL I P. IΛ3EVL
Figure imgf000091_0002
rSCSCSLAJLFLRALAST [ VAVETLV I RMVNL - .κβ t-rotein .
Aminopeptidase A
Hydrolase (debranching)
ΣVAESQQGFVP0NVATPTVSLOPHTTI.IA1S 438254 437319 protein
439701 439510 homolog present in Genebank/EMBL as of 11/7/98
CPn_0393 440229 440723 i i l protein rF': iLrθGFv-('DRAI0ELRTεεLRL03KV3';LC0DΣLSA0EK0R rKGYKKl rVΪPKOO'εNKD
Figure imgf000092_0001
Figure imgf000093_0001
CPn_0405 451814 450966 CPn_0417 463047 462244 T105 hypothetical protein amiA-N-Acetylmuramoyl Alanine Amidase
REKGMKLTKYLNTKOLRSMΣSRLFVR/3LFMSKOLSFFALCVLGSHPIFAOTPNPPORVR
R,EVIFΣDPGHGGKDOCTASIEIilYεεKSLTLSLALTV03YLKRMGYKP0LTRSSnVYVD
LGKRVAL3NRCQGDVF IS IHCNHSSNAAAFGTεVYFYNrjrGSPTRNRMSεVLGKNtLAA
ACOAASVLLNLLSΛTGSAAANPLσTAASLAθriYAA'TSPGAKKTSεFCYNYCσεTCGGN MEKNGILK.iRGLKTΛNFWIRDTSMPAVLVETGFLSNEPεRAALODARYRMHVAKGIAEG LGCPTCCTPDGOCGCGGFCRFFCGV KNCCGΣGεG'jOEPAΣPL VHNFL.TP-FOKPKO IAK IRKPQIQAN 451 >ι,0 tS28 ,i 11.4401 4n2'53 t i t-Enoyt Acyl t irrier erorein K ui-rase murfc. N Λι ι»ry limn ir iv l my I'jlucimy 1 DAP Liqi i c^.r-MLKIDLT;KVΛFVΛI ICDDQCYGWGIΛKLLΛEAGATI IVGT VPIYKlr303WELCK MUIKH um/UAKIYi KVI!rLEVRNLTRD-5RCV3VGDrFIAIIhn P/ :.Nr.l-AVnALANG ITCTTt ^CVΛEQVKKDF ΛI \IΛ tYNI I. WOI (TPNI EELEΛEL-5ΛKιΥEYP"KLHT^/rτrrW,l'TTVTCLt
IIIDILVIIMΛN IT! LLLT'-RKGYl.AAL.. A. /„FV I L.,l(t\ „ IMHRl TTTI .IT KAIIIi .iM l I "l II II ILi-LI IKDrrTTPTPALL F LΛTM'/l'iNRnΛ -/ME ^TI MΛΛRΛVPl <<U M ΛKAΛLE'jnTKTLΛ EΛi.-RPWritRVNTI ,V I K I'VΛ. I HI IT I TN 1 N Dill DFIIITrFCT/VΛAKAFLFJLVl f l MWttlTDSPYA tKMVDY jFWΛr II EΛMNAEOV'.AVΛΛri.ΛIFLλ. AITTCTI YvnilCANVMT IGPEMFPK <J< IF. ΛKΛlVll-u II. AALnT'ATDIOL . l.TKYT /YGDOK [ A , FtGKYNVYNL u IΛΛI "I /IIΛ I m HI I IH I I I- IC u n x RLDPVLMGP'-pVYID/Λlrl PDΛIJΛVtTGL III I I 11'/ lit IWI . I'UI* KPKLMAi.WFRYCFA -/T ,DNI I- .1 Ptl M /NFll DG ln_04ll/ I 1/1/ l'. H5H I / KN/I irilιl'M.lΛI ISA!. IΛ. l;pr,IVI IΛC.KOHfcA/ lFKIIO'r/ΛI IX/ Vr/i'FVLA
HΛIJ .ιιρι rt imily y liol.i n/plιυ ph it i f- /v 20 ,;KHOPS lόKAPTORR.iL Pn_0 1-ι 46Λ397 41 8 H pbpi- transgLy oIjse 1ranspeptidase CPn_0431 *;, jj""- ^*48-* *| f|4jT»»lSl '.'" f sf "li -' ' ,. „ . j "
QLFPWrNtMIIPOKKVSVFYPMSYRKRSTLIVIXΛrFALYAIXVLRYYK∑OΣCECDHWAAε No robust homolo<»i'Prβs«ncι»ιn.'<ileneb.ιήk/BMBL ϋsT. ϊiS TTVlEXoβ S
ALGOHεFCVRDPFRRσTFFANTTVRKGDKDLOQPFAVDΣTKFHLCADPLAIPECHRDEi:
'.r;iL0FIEC TYECL3 LDKKSRYCKLYPLLDVSVHDRLSLWWKGYATKHRLPTNALFF LFSCεCSOLRLCALYΣGIALAICVLLTIV.ΥCIAJKtΛTACKKPPSISRΣE∑V
ITDYQRSΥPFGKLLCOVUITLREIKDεKTGKAFPTCΛMEAYFNHΣLEGDVσERKLLRSPL
NRLimmVIKLPKIΛGDtYLTΣNPVIOTΣAEεELERCΛεAKAOGGRLΣLMNSOTGEILA Pπ_04 ? 47<5337 47b5U
A.v -r,:;ι --.— :• • • r '" •••/ :•• ~r ' " ••'! rcrrt:-. "'.ι-r - , ::." ; i'.-. ' - ' .' ' . • ">• ' • " - --vv - -\:.-;.v.v-
VA YO3IU ALGFCRKTGIELP3EASGLVPSPHRFHINGSLEWSLSTPY3LAMGYNILAT GLFLTGSAtLvL- IW I AA.,-..."_, ML« CWnh i n. l.,NΛL_.h. I'nVAHES GΣOWOAYAILANGGYAVRPTLVICKIVSλSGEEYHLPTKEKTRLFSεEITFXVVRΛMRFT TLPCGSGFRASPKHHSSAGKTCTTEKMΣHGrfYDICRRHIASFIGFτPVESSεGNFPPLVML CPn_0433 477327 476929 VSΣDDPEYGLRArX7rrøYMGGRCAAPΣFSRVADRTtiYl _ILPDrtKLRNCDEEAAALKRL gcsH-Glycine Cleavage System H Procein YEεWNRSPKQGGTR RTFRILYCTLYRTGSRKVMWYSDYHV ΣLPV'HERWRLGLTEKMOKiπ/^ΣLHVDLPSVG
SLCKEGEΛVΣLESSKSAΣEVLSPVSGEVΣDΣNΣ LVIJNPOKΣNEAPEGECMIAVvRLDQ
CPn_0420 467120 466824 DMDPSNLSLMDEE
CT271 hypothetical protein
KSFPMNK3RFLPXCCCI^FO5SIJΥFYINK0NSLTKΣΛΣ IPCΣJ3VR ?0t CX:NΣSLRF CPn_0434 479471 477276 IDKIERPDHLMEIAALPEYOYLEYPSEESISLLSYELP CT283 hypothetical protein
RP VRIYOODLFCRLCRDPAWFFSl^FTLRFYCLGRGWTLLSFFYKHOKKFIGrVIAW
CPn_0421 468007 467108 CVSGΣGVSCRFSRKσSAESTSRRTv TTASGrRWεKDFMAMKKFFAHEAYPFTGNPRA yabC-PBP2B Family methyltransferase WNFINEGLLTDYFLTTRVTJEι^FLKVYHP:εK:FSKElAYOPYRRFrAPFISSEEVWKSS
EIlΛSERAHΣPVXXεCIALFAORPPOTFRrjVTΣΛAGGHAYAFLεAYPSLTCYOGSDRDL APQLI II CVFOOIENPISKEGFIJUtAKLFLEERRFPHYV-LROMLEYRRQMFALPPDEAL
OAIAIAEKRLETFQDRVSFSHASFEDΣANOPTPRLYDGVLADLGVSSMOLDTLSRGFSFQ SRGiωLJ^FGYσri0D FGDAYL5M\rtlLIRFIDE0iαCVrLPRPSK0EAilI)DFYDKAKHA
GεKEELDMRMIXJTQEL5AS10v7ΛSLKEEEΣΛRIFREYGEεPOWISAAl«VVHFRKHKKIL
SΣODVKEAIiGVFPHYRFHRKΣHPLTLIFQALRVYVNGEDRO KSLLTSAIS ΣAPOGRΣ.
VI ISFCSSEDRPVKWF FICEAEASGIΛKVITKKVIQPTYQEVRRNPRSRSAKLRCFEKASQ
CPn_0422 468233 468784 0Σ UCVΣiα3reVLI)LYS0DAETYYTI
CT273 hypothetical protein MERLESAIJtTRYrcEEGASI WRPXHKVvΕ^RI^RHI^GSFSWSLDRSLKTFSRCtaCEL
GLAMVEIFNYSTSIYEQh^SNNRn/SDF10i2SICt--GISIPXIVAKHAOI I»INPKPSALTS POEFDRIFSMKVXa)YSS\rFMSPNEGP<rYYC<: SHIiYDRPASVDILFIΛlCS0ωEEIJ^
LLOTNOKSHWACFSPPNNTrTCORFSTPYΣAPSΣΛSPrWDroiEKISSFLKVLTRGKFSY YMERFIE0GWR
RSOΣTPFΣ^YKDKEEEEDEDPEEDDDDPRVO GKllSAIIlUrvTUTNWIDYVl
FVOG CPn_0435 480908 479475
Phospholipase D superfamily [uncleavable leader peptide]
CPn_0423 468788 469216 CΛif«SRIΛFRIAAI/πFFIt£VPN-rVSAlCTIVASDKEKVCΛ/L\YDNS
CT274 hypothetical protein ANFYVΕΣΛPCKTGGRTLiriaivOHIJEASMDLVPELCSYI I IQPTFTDAEDQKLΣJUULKERH
CMΣjaNE KAILG∞MElirε∑ΛΣSσYSFΣJUKHYSKAIΣJ-FEALVILDPLώlYrøO ∑^ P«IιlTΥ\ffTGCPPSTSΣLAPWIEMHIKI^IIΣX;KYCΣLC<7ITΛFEEFMCTPrjDEW LYLQIGENSOAl-AVLΣX3AIJ«4CGDHLPTΣiNKπCALFCLGRIEEATAΣATYLSSCPIPAΣ NPRIJVSσVTWPΣAFPJXJDIMΣJl^nAFG∑ΛIΛEEYHKQFAMWDYYAHHMWFIiΛ^ ANDAEALLMSYSKATKKNAALVR ACPPLTIXQAEETWPGFDKHEDLVLVDSSKIRIVI^WPHDKOPtlPVTOEYUO QGARS
SVKΣAHMYTIPr^ELLNALVWSHNHGWLSLlπ«κ:Hεi^PArTσPYAWσro«ΣNYFALL
CPn_0424 469528 470961 YGKRYPt*nαc FCEiακp ERvsi EFAiwΕrroLHκκcMiιCDEiFviσsnmaα«r.^ dnaA-Replication Initiation Factor DYESIVVIESPEVAAWWKVFNira∑GLSIPVSHGDIFSWYFHSvΗHTΣΛH∑ΛLTYMPA SRCireiFSPSii4GVi ιx:rwESFi rκ st3mτr^ECTrwE0FtJW^
IQVΣJEETQEKIRLEVPNIFvrørYlJrΛrrKRDIΛSFvTLrΛ/HG^ CPn_0436 481633 480902
ASQ|l2ESN∞ISEVFEETKnFEΣJCIΛΣΛYRFI)OTIEσPSNOFVKSJUVσiAGKPG lplA-Lipoate Protein Ligase-Like Protein
FIHGβVGLGKTHLωAvOHYVREHHXNΣΛΣHCITT-»F-NDLV^ FrVOMKVRrVOSGKSSAASHMAKDRDLLESLQDGELILHLYEWENPCSLTYGHFMRPEK
LDLI VDDIQFTΛNRQNFEEEFαπTETLINΣ^KOrVITSDKPPSOLKLSERIIARMEWG FIASOTADΣΛLIΛAvTJPTGGβFVTHKGDYAFSVXMSATHPSYSSSVIJEtJY ^
LVAHVGIPDLETRV74lWH AEQrøLLIPNEMAFΥIADHIYGNVR0LEGAINKLTAYCPX LEJCvTRIQGMLAPEDENSSSRDSOffCMACTSKYDVLFGDKKIGGAAQRlvYJQ^^
FGKSLTETTvTΛETIJCELFRSPTKQKISVCTIIJCSVAT CWΣ^LKGϊK^ I^LSGSSSEFYOPT-KPFa EIIEOΣOIHAFFPLGLEAADEVIΛEAROCoYJΛFΣKLFC
ΣAMY AKTLITDSLVAIGAAFGKTHSTvXYACKTIEHKWNDETLKRO/N∑riamiTC GEGL
CPn_0425 470965 471564 CPn_0437 481810 484350
CT276 hypothetical protein clpC-ClpC Protease
FRGCPffRRTGKGPFXt)VOTLYEEETSSPSSYSPYSRSERPETPPSLFDNPKASEARPLN \Λ?MFEιOTNRAKCΛαKΣ UαffiAQRΣΛHNYtΛTmiL^
HNLTEESSLPQWSSTPRTESΣiPLEEPETTΣΛEr?TrFKGEIAFEPXJΛΣDGTFEGILVSK AR0EVEPXIGYGPEI0VYGDPALTC«VKKSFESANEEASLiEHNYVGT8HLLLGILH0SD
GKI I IGPKGWKADIQ LOεAI IEGWEGNI TVSGKVELBGGAI IKGDΣOANTLCVDEGVR SVAMvTSωlDPREWKEILΛELETFNIΛLPPSSSSSSSSSPJSNPSSSKSPLCMSLGS
ILGYLAIAGITDHSERERDL D^OΛE]C^AI AYGYt)LTEM S^U^P IGMSEVER ILII^πϊiWK^WPVLIσEAGV3
TArvΕGΣ^KIILNEVPTAIJ∞atLITΣΛlALMIAGTKYRGQFEERIICAV^
CPn_0426 472111 471536 LiFIDE HTrVGAGAAEGAIDASNILKPAIJ^GεiCΛIGATTIDEYRlO^ΣEXDAALεRRF
CT277 similarity OKΣVVHPPSvOETΣEII ^LKKKYEEHHNVFITEEA lυ^TLSrXJYVHGRFLPDKAΣDL
MVLFSLLFPKLCYGCOAPGAYFCSNCLEKLLVEDREGRCLHCFRYLGSSεTRLCSQCSPS I εAGASVRVtmtraPTDLMKLEAEIEOTKIAKEOAICTOEYEKAAGLRDEεKKLRεRLO
SQLQAFSLYLPSOTALSVYARACεGKRPALOFFSKSIAFELASLDETPSCIAYΣTSTISR SMKQ-WErøKEEHOW'DEEAVAQWSLCrciPSARLTEAESEKΣAKim^
KIVvTVAi εKLLRIPLWPVILPlKROIEKLPKGEGICFLSAYPLSOKWMOTΣVGGSASPL DAVTSΣCRAΣRBSRTGIKDPMRPTGSFLFLGPTGVGKSLIAOΛIAIEMF∞EDALIO^Di
VSΣSLFLSQNDQ SEYMEKFAATXMfCSPPGY GHEE∞HLTEOΛ'IUWPYCVVLFDEIEKAHPDΣMDLMLOΣL
ECCR TDSFGRKVDFRHAIIIMTSNI yωLIRKSσEIGFGIΛSHMDYKVΣOEKIEHAMKK
CPn_0427 472153 473715 HLKPEFΣNRIJDESVIFRPLεKESLSεilHLEINKIJOSPXKNYOMAIΛIPDSVISFLVTKG nqr2-NADH (Ubiquinone) Dehydrogenase HSPEMGARPLPJJVIEQYΣXDPIAEI IJεSCRQεARKLRATLVENRVAFεREEEEOEAAL
AVCYVFεRVFASTFLSITMLKKFINSLWKLCQQDKYQRFTPIVDAΣDTFCYEPlεTPSKP PSPHLES
PFΣRDSvOvTtRWMMLWIALFPATFVAIWNSGIΛSIVYSSGNPVLMEOFLHISσFGSYLS
F-VYKEIHIVPΣLWEGLKIFIPLLTISYVVGCTCEVLFAVVRσHKΣAEGLLVTGΣLYPLTL CPn_0438 485455 484334
PPTΣP'n^MAAIΛIAFGIVVSKELFσσTGMNΣLNPALSGRAFLFFTFPAKMSGDV VGSNP ycbF-PP-loop superfamily ATPase
GVIKDSLMKMNSSTGKVLI∞FSQSTCLQTLNSTPPSVKRLHVDAΣAANMLHΣPHVPTOD NLTLPMPP0VRεlMCXyTVΣVAMSO3VDSSvVAYLreKFTNYKVΣGLFMl<N εEDSEGGLC
VIHSOFSLWrETHPσWVLDNLTLTOLOTFVTAPVAEGGLGLLPTQFDSAYAITDVΣYGΣG SSTKDYEDVERVCLQLDIPYYTVSFAKEYRERVFARFLKEYSLGYTPNPDILCNREIKFD
KFSAGNLFWGNΣΣGSLGεTSTFACLLGAΣFLΣVTGΣASWRTMAAFGΣCAFLTGWLFKFΣS LL0W<V εLGGDYIATCHYCRUITεL0εTQLΣJtGCDPOK 3SYFLSCTPKSALHNVLFPL
'LΣVGONGA APARFFΣPAYROLFLGCLAFGLVFMATDPVSSPTMKLGKWIYGFFIGFMT GEMNICTε'/RAΣMOAALPTAεKKDSTσ∑CFIGKRPFKεFLεKFLPNiπrjNVIDWDTKEIV
IVΣRLΣNPAYPEGVMLAILLGNVFAPLIDYFAVRKYRKRGV ιXHW3AHYYTΣCORRGLDLGGSεKKYVvτjraj∑EENSIYIVRσEDHPOLYLRELTARELN FTPPK3GCHCSAI<VRYRSPDεACTIDYSSGDEVKVRFS0PVI<ΛVTPG0TIAFYQGDTCL
CPn_0428 473719 474681 G3GVΣDVPMΣPSEC
"nqr3-NADH (Ubiquinone) Oxidoreductase. Gamma"
NMSKGSSKHTVP.INO WYIVSFΣLCLSLFAGVIiSTI- YVLSPIQεOAATFDRNKOMLLA CPn_0439 485523 486077
AHILDFKGRFOI0εKKεWVPATFDKKT0LLEVATKKVSEVSYPELεLYAεRFVRPLLTDA No rob'ist homolog present in Genebank/EMBL as of 11/7/98
OGiWFSFEεKNLNP∑εFFεKYOεSPPCOOSPLPFY'/ΣLENTSRTENMSCADVAKDLSTVO IΣSSrΛIPYLFVSSTLNGVFPSSLPεεSADLFΣTNKε∑VA-CEKGNVFLTHSΣPMHIΛΛIT
ALIFPISGFGLWPIHσYLσVKNrΛDTV-LσTAWYC^GETPGLCANITNPEWOεQFYGKKΣ I LV I VALAG Σ Al ICLCCYSOS I LL Σ AVG Σ VLT I LTLLCLQALVGF I KFI ROLPQQLHTTV
FLODSGCTTNFATTDLCLεWKGSVRTTLGDS PKAL3A I DC Σ SGATLTCNGVTEAYVQSL OF ΣREK IRPεSSLOLVTNAORKTTODTLKLYεεLCDLSOKEFKLOSTLYOKRFELSHKNE
Λ YRQLLΣNFSMLTHEKKTCE KTNON
Pn_042') 474ι',6h -5319 Pn_044j 43«0*!l 481.740 ιi' -NADH πiliiijuinone) Reductase 4 No roc. r homolo.| pι.?Einr in ι;ι-cι(.-b.ιnk/EMEL .11 ot II '7/'1H l-F. P T KKir .ΪYFFDr W.'I^OtLIΛII^I -ALAVTrπwrΛITMGrΛV iVTGCS LΛ'rtPO.'IiMAT.IVAP^r-VPE.I.^PL.-IIΛTEVLULPMΛYITOPIIPrPΛΛPWETFRSKLSTKH
:rP/-;i.LRKFTPπ;ΛmMITOLIΣt: FV-IVIDCiFLKΛFFFDI:;KTI..';VFVGLtITNriVM Tl.lTΛLr LLT CTI.:ΛGYV;YTr;NW[ [( ι;rr:l ;l IVI.TI.ILΛLLLΛIPLKNKOTriTKL 'l .lli' ΛRIIVTr'irAFLDIΪFΛSG JYiT VLLVIC'/rPELrGriTTLMIIFRIIPOFV'/ΛilCr Iϋr.l:; :.:33IGΛ;FV RYι:LMF.-7ri :;VIII -CLTTιjN εKTPILNEIEΛKKE3rQNLEL HPr/r/yrit„;i.Mvl.ΛP3ΛrFLU;iMtWLVNtRD:-KKPr'R ιτEt'0:.F A K0PKRκ:it>κ.':F P:: ικιιι„:κMP7tι.rrx:
Figure imgf000094_0001
I PS PSOFFHEMI FAPNLKNTRKFY
505330 10 (Frame-shi ft with 0451)
508121 507180 Outer Membrane Protein
511304 512860 Membrane Protein (truncated)
Figure imgf000095_0001
W ;γι.iΛΛ_,.Mirπ,DHTTL L.;F'J0LYσKTNANP 3E0MYLL3FFO0FPIVTQKSEA Pn_04b4 5 '.!-)■<•
^:.:WKAAYrr:KNHLNTT/LRPDKAPK30X>WtW3YYVLISAEHPFLNV LLTRPLAQA No robust no otoα Dr«»«it»nt :r. ueneMiiR EMR r t .. •' α8 WOI^ GFI.-ΪΛKI-'LGCWOSKFTETGDLORGFSRGKGYNVSLPΣCCSSO FTPFKKAPSTLTΣ 3LETRCRFTE ICtOLLFFD-tΩSLKSlΛLFSEGTΛfMLf * I FΛPLrøSTTipWSRIWgPOL KIAYKPDtYRVNPHNfTWSNOESTSΣSGANLRRHGLFVOIHDV/DLTεDTOAFLNYTF HRIAΣVYΣCVlJ5-ϊ33X[LEPLlSYMSCIY3ε3CWYLRPFMGKHVN0^V'M|i!t-HVB«JI rRCGFFSEDAVPESEPFDLSr'ΛΛrrDRSCPt.PTKKRSόS εLCTV-ELPESΣYPOSEF LM DGKNCFTNHRV-rrGLK.TTF
RPRML3
\, . I— ....i .. uii_.ilrt_.KVuDlLKROR._JLi
536528 539434 Outer Membrane Protein
ΣNNLAΣNLPSΣLAKGKAPTLWIR
Outer Membrane Protein (Frame-shift with
541357 542532 -Polymorphic Outer Membrane Protein (Frame-shift with )
Figure imgf000096_0001
No robust homolog present in Genebank/EMBL as of 11/7/98 a-i
EVFMΛ:, Ϊ( ?λ-7; -βKΣPPKr-r«DRSR3PSPKG_UJ_SHεi3LPP0EHσεεCASGSSHΣHS
333FLPEDOE:;03S3SAAS3PσFF3RVRSσVDPALKSFGNFF3AEST3OARεTR0AFVRL CPn 148 t ■ , 5-ι.W.Λ I- ■ .SV-.i l'> ,,.,., fn ^ — ..
.;KTITΛDεRRWDSSSAAATεARVAEDA3VSGεNPSθσVPεTSSGPεP0RLFSLPSVKKQ No robust nor»loα..pr*s*nc-> ιh".ι:feneb.ιnk/-Ma- ■&,.:«- rwllSSO
JCLGRLVGT RDRIVLPSGAPPTDSEPLόLYELNLRLSSLROεLSDΣOSNDOLTPEEKAE 3CLR tεGILMAτ3VPV S373- CεANSSNεRFTεRT3RMYYAΛLVLGΛ 3CL:FIAMΣV:
ΛTVTΣQOLIOITEFXσYMεATOSSVSLAEARFKGVεTSDεiNSLCSELTDPELOELMSD FTOVGLWAVVLσFALOCLLL3LArVFAV3σLVLGKTLεPSRεΛTPPε:VΛ0KE Tr3CS .-
1D3L0N I.D τΛDDLεΛALSKTRLSFSLDDNPTP Σ DNNPTLΣ SOEEP Σ YEE ΣGGAADPOR ! NEYWRSELΣ3L-1^3εLHε3 :'.TSKDR3LC:r .7LON: KLEr_3rrL3LLI KE VH
TREf«.'rrRL WtREΛLV3LLCMΣL3ΣLGSILHRLRΣARHAAAεAVGRCCTCRGεεCTSS INI ΣLHLVPCWNLLG,.'_L3pε-<TAHAεεLLLFLIEEςYY3FD :LKLΣRYGDAL0ATSPLM
:-r » :• • : • • ■ : • ■ -• ■:• •T.vrwi: ." ' " . — -
■ r ' • ■ -., ■■ -v. .•" :. :-.■ ■• • rvMiax i 1 * : :."-pwτΛP-r.':-37 ι "- :.ιi!- -- . ' . ' -' """• '"•'■•'- -'.'■. . "' : . - - - - :iΛK.-:.:A/r.-: \v rs :
~GDYEVPIT3AEPSKDKNIYMTPRLATPAIΪDL 3RP ;33GS3R3P33DRVRS≤iPNRRG (E ANϋKLLNLf iΛr - JJVr AMrinLi ui .. .VήVtiKKv IH jlLSNTEILENE F .. VPLPPVPSPAMSεεGSIYEDMSGASGAGεSDYεDMSRSPSPRGDLDEPΣYANTPEDNPFT LYεYPLSYL:WAV- L3CVP.GTϊISLεiX3ADYr.rLC<;U:SMLSOFASRLOSG01ΛXNPR QRNΣDRΣLOεRSGGASASPVεPIYDεiPWIHGRPPATLPRPεNTLTNVSLRVSPGFGPEV DVLSE0 VT4Lv GLAACC 3FC<;Ll Ali«LTAVP0RMWLGALPLFεSFPVFNRMKEaTΛ. RAAIX3E^rVSA\^vΕAESIVPPTEPGDGESεYLEPLCX5LVATTKILIΛKG PRGESNA ESLGD
CPn_0473 549602 548070 CPn_0482 561764 560961
No robust homolog present in Genebank/EMBL as of 11/7/98 artJ-Arginine Periplasmic Binding Protein
GSIMAVTiGVGGSR^PSPΣPPNRRNSEraKVSPKDNΣΛE-tTVSSSDSSLASOGPTIEERKA NIAYlWGTFMIKOIGRFFRAFΣFΣMPLSLTSCεSKΣDRNRΣWΣl'rTrNA-rYPPFEYVDAOG
OIXSGTDKIPLPSVKEPGDSO SGRSGVMRI KGVKGVFKKTPOARPEVSSPRLPSHVOH εVvrjFOIDIΛKAISεKΣΛK0LEΛιT«EFAFDALIlΛLKKHRIDAILAC«SITPSR0I EIAI_.
GORLPGLEGFTORΣOKRSENPεADLGKMKRSYS∞DLDRVGHDSNEXISTEDSRSEGGEPS FYYGDEVOELMVVSKRSLETP/LPLTQYSSVAvrjKTTFOEHYLLSOrciCVTtSFDSTlJW
SKSSSFt-∞VRGAVSKVHGAIΛDIKGiαrQRSASEDDLCTWεDSAGrW/KERRSEEAEAS ΣMEVRYG!CSPVAVLεPSvGR' Vl <DFPNLVATP ^LPPECΛ' 3CGLσVAKDRPEEI0τ:
3KSS3FLSGVRGATSTvWAIΛDAKEKVSAFGεθAAGAIRSAPGINIRTRF !RSSSEGDLS QQAITDLKSEGVIQSLTKKWJLSεVAYε
N\ KAAKHLRKALENLEKVAPεCVSPεVASRVOSLLARMεθLTHOEPPTVEDLITFVESN
'/GSDSVI ASIVPODGSQAPAETAEAPETGGVEGSAAOGA KALRDFVVSIFQAVASFFR CPn_0483 561830 564964
AΣASRLSSARRεSAVDDLASεSNTCWVεQεGVSNPSMPSLSFAεεiAPJlAAEMSNRNA No robust homolog present in Genebank/EMBL as of 11/7/98
OSLEKLESGNVTDPVIQOGLGLARSFAPEGO IILIKKRAΣFEBMFPΣPPPHCPPNNiαWFYHLTTΣ7rKDPIXLRILRTIGΥ LωiITLGL
LLLIHYYIWHRv KEGLWPPTLPKσPEPKTIEIAKQPPK∞EDKKPr.VPKPGTPPPED
CPn_0474 551600 549807 TPPPPPKAPSPASP!CvTKOPADKKPTPPPEAPPPPVP.VATPMPLRPSSC<rYWO _UIRMVS
CT365 hypothetical protein MvXRPAPLPLPAWVDPIΣΛDFNPHFVASYPrøΣDNEPMYFOΣKOFKKΣAONPDLPOOHR
LKΣΣISΣSFMSTSPISNDPRYLSLSNATtvKTSLLANSPUSI-SPVPNSLVPSNPEΣTIGLRKS MJU3r^ΣJ50ALYL Tl fYLVWrer a«FYRAYAVr ttI -A YEESS m
ΣFTHSVTLFAGLWΣiVAVSVVVVALTVLAIiGVPOAIIJ^IAISGVσiσGFSΣMKSLVYM DLPFASSSPANANI£A£2«£IJ 3irsrrcSFIDLYΣX-VII^OI«HTAT^ΣAFΣJUOLSAYAΣ
V7lI)YMSPRMQESSRIKSAIAVGTGFTvT«SLVMKVGANFVPGGYGGLVGSLGSSAYSRGSQ RC<31AASSNEETAr IJISrX«DDIiPSVLEFIAANRPYSELF0^iraiSALPYlCSlU.K
TTΣ «FSHYIYTKFFRSEKVAKGEKLTEAETIKEAKiα^HYITΣ_. TIC raiAVΣ_3ILLA LFLIAEHIJ?AΣ_FLT.lAEΣΛ10>tSPEtW∑-WQYEREIREAFAJ^
ΣAGTVL∑ .GAPATIAIILAPPLISIGLTTVl ^ILHSSIGICWRAFIiTQEKKDLFVrΛ'SL VTCTXttPEAIRCOYSRFLATIENRRSGDLPWSPALSFFAFLCTCPSVTu^KI£ATFYKSLE
KDIRIja PPSEVEESETSQSVIEVPDSEGIAETRISAEEir-TRI--LTTRQKVIFAIATL DIIIASAPPOP--∑QE∑ω∑SNASI_rYLNEDLDSSW3REVISSNΣMTILTTHESLTLESSM
LLLASΣAAFΣVTGFGGLTVWvliVASvΩSAVASVrLPMVSSGFSYVAYOIJWRΣΛISKL PQlJϊϊIΛJCRIANIJ OWISTSFETPPI^NQPDI ^rLVNK^^
R KEAKMαRVROFLIESGVIASDREFTOMWKTVTflWQIOKTOMIREEWPΛFEKGGEVN ARSXΛL'TMEσSGLSQEODIiYTOAVQIiFFIIΛHPCΛ/NNRPETKDAVKELKMI ^
SALWGΣΣJ^GVrTIY.rMΣiALVPAFAPIVKILAl-GGSTΣΛ∑AGSIΣΛPJCFVN^ YAFKKrølΣCKKWHiΛSΣΣ-ISLVLKPPARYPSτPSNKDr^
LYεRRRI^εLLYGPεSKMRSΣATDLVVεAΣAASHDHIjrDI KPVDFirjVI^ EKNCMFIJWTFPNYOLETEAIIiEKEIESτFR GWNVFLTRLNΣjraSKIΛSPSSPTALS
DQFSKSFLIFCFLNNYPKI ΛKKTPΣAARLrMFOREASHRFTOΛrirDK∑JJ^∑ rTO
CPn_0475 553850 551685 ATI^IOYS AJUX} IC L K V AS∞FCRSGFRQSLIσYIJ^S SS^^EI^∞IΣ^ glgB-Glucan Branching Enzyme EANDVAAMTTVPLQPFAVCLIMSDRIJWSEENIENFVAMHGFLNTΣSPERDARIFLIRFP
?SMVDiαiHPWDLDIiVSGROKDPHKΣJΛIΣASEI)SSDHIVIFRPσAHTVAIt_ GELHH NHYGCLLPRNPRTEDOMSKPDSSNP
AVAYRSGI_ FLSVPKGIGHGDYRVΎHQNGIJ_AHΌPYAFPPLWEIDSFLFHRG HYRIYE RMGAIPMEVOGISGVLFVXWAPHAQRVSWGDFNF HGLVNPULKISRXXJIWELFVPRJLG CPn_0484 564931 565824 EGΣRYKVRØLVTQSGNVIVKTDPYGKSFDPPPOGTΛRVADSΕSYS SDHRWMERRSKOSEG aroG-Deoxyhepconace Aldolase PVT-YEVHMSWCWEGRPI--YS-MAHRLASYCKEMHYTHVXLLPITEHP R5EΣ K GOΣJCSI.VlΛEVLILTFTYPLPRTΣJTOHPDEvΗTVPΣSPNLSFGEGSPILIAGPC
GYYAPTSRYfiTLQEFQYFVτTY KENIGIIIJJWvTβHFPVτiAFAL^ . TLESYEHTVSSALTλ^EAGAC rFRGSΣRKPPrSPFSFCGWEKECTvXWHKEAOSIHGtl'TE
QAIΛPHV^FTFBYSRHEVTNFIΛGSAIjrWLDKMHI-CLRVDAVASMLY TEVIIJVRrπ EITAEHVDILRIGAKtΦfrøπ-piXQEVSKSHRPI ILKRSPAATLEE LCAAE
PNΣ.TXSKENXεSIEFLKHΣΛSVΣHKEFSGVLTFAEESTAFPGVTiα.VIXXWIΛFIJYK^ YIU_5SPSCPGVΣΣ_:εRGIRTFEHSTRYTLDLNTVAIiKεiSSLPVrVDPSHAAGKHSLV σW^β^I)TFH^FMKDPMYRKYH0KDLTFSIΛ«AF0ESFILPI 3HDEv ^HGKGSl,V^a PGDT LPJJ-SAGI^vTJA∞LMIEVHAHPEKAIΛDAKGO TPEi ^^
OTRFAQMRVΣXSYQICLPGKKLLFMGGEFGOYGEMSPDRPUΛ<EΣXNJ«YHK-IjaiCVSA
LNJΛYIHOPYLWMQESSOECFHWVDFHDIENNVIAYYPJAGSNRSSAIXCVHHFSASTFP CPn_0485 565993 566229
SYVΣ-KEGVKHCEliΣi TODESFGGSGKGNPAPVVCQΣXXWAWSΣ lΣELPPLA'ΣVIYLVT CT382.1 hypothetical protein
FF OPIGRTPTRVFLWRFMIKOACKFYIJ^CCLI^ALYWLΣJ YCRKΣXKGTIΛHSEΕTtYOAI.. SSLIDLLYQLKQLFAPTNE
CPn_0476 554877 553858
CT865 hypothetical protein CPn_0486 567799 566405
GRG«RAIW!-OiIDΣMQHFKPYTMvT>GQKLPIPGSLLYAQ FPTLWPXFSSKHEILNEQT hypothetical proline permease
LQVrχ5prj∞FAVF0DLHRGGΣAvτSERYl YYLLPS SIXT0SΣKGKLPSAA0AGPLLSLσV A0HRSLL«3NIFHIJ3CGVLYFMNFSLFLFFLIAI0G ΣCLYVGRRGSKKVEDRESYFIΛGR
HKHAI*IOKVRCRPJ.LKEILPLWFPjrAAMAPlα.SYPJ:LETTAIGSLVKTAHORvIΛRETTE St_CIFPΣJΦrrFΣATQΣGGGVLLGAAE-AFCYGYGσ∑LYPLGVALGLIFLGMGPGWLAEG
IAPAI SIAΣJ«FSECFLPRSYDεεFCCΣLPθr«DPEGGVPFεLLSYSFGMIODΣFLRHQ SLTTΛ /SIFEΛTFYGSKKIJU ΣAFLI -AGSLFFILVAOVΣALDRLFSSFPr^3KYVTVAF Σ
GQLVΕILPALPPεFPCGRLIHVALPNLGTWIVWreKTIROVI HAEYSGEVFLKFCSSL VLASYTSTr-GFRGWRTDVI0AσFLLΣAVLVCGVSVVrLSVPKSLSViα5PF0SLPCAKI_nJ
CSARIΛEWSERRIΛGSlGU_3LGεTtJ.IKA rTTYL IX:FHK IFMPMLFMLVEQDMVORCV SSPKRI/ IAAVGAGLVTJ FNFIPLFLGSLCAKAGLKA GCPLIDTIAYFCNPSLAAvHAAAIGVAΣLOTADSLMNAVSQLΣAEεYPrLKAPYYRYLVL
CPn_0477 556112 554844 GIAVAAPLVAIGFTNIvOVLIΣ^/SLST/CCI^VPVGFYLLAPKGRRVSGAAAWAGVLVGA
*yqev_8s Hypothetical Protein LGYGWVO∑VSLCΛFGεLLAϊΛ σSLVAFSFVGFIEITWKNKVKTQT
RYMrVAεVKGTFKLVCLσCRVNOYεVOAYRDOLTΣLGYOεVLDSεlPADLCIINTCAVTA
SAESSGRHAVROLCRONPTAHΣWTGCLGεSDKεFFASLDRQCTLVSNKεKSRL∑εKIFS CPn_0487 569833 568112
YDTTFPεFKIHSFεGKSRAFΣKVODGCNSFCSYCΣΣPYLRGRSVSRPAεKΣLAεiAσVVD CT384 hypothetical protein
OGYREWIAGINVσDYCDGERSLASL∑εOVDRIPG∑εRIRΣSSΣDPDDΣTεDLHRAΣTSS RRTCGΣSLTYSSFR ASFRCYSLΣFFCFCGSLFGSESLRYOLLIODFAKVSEεσiGLLES
RHTCPSSHLVLOSGSNSILKRfOmi YSRGDFLrXVεKFRASDPRYAFTTDVIVσFPGεSD KEYSL∑ΛAKLVLRALAONSSFDDWFRSFKKCOΣSYPεLAHDRDVLεεFGΣOVLREGIENP
ODFεDTLRΣ∑εDVGFΣKVHSFPFSARRRTKAYTFDNQIPNOVΣYεRI YLAEVAKRVGOK SΛ/TVRAVSVlAΣGLλRDFRLVPLLI-OSOTDDSAIVRSLALQVAVNYGSεSLKKAIVELAR
-MMKRLGETTEvXVEKVTGQVATGHSPYFEWSFPWGTVAIOTLVSVRLDRVEEEGLIG NDDSΣHVRITAYQWALLOiεELLPFLRεRAEfJKLVDSVERREAWKACLELSSOFLETGV
EΣV AKDD∑rX3ALFTCεVLRNGML?εTTεIFTεLLSVEHPEVθεSLLLSALA SH0L0NHKεFL
SltvTtHVMCTSPFAKV-RFOAAALLHLHCDPLGRDSLVεσLRSPOPLVCεAASAALCSLGIH
CPn_0478 557640 556210 GVPLAKεHLεSL3SRl<AAANLS ILL VSRεDiεP «DVIARYLSNPεMCWAlεYFL DAQ hflX-GTP Binding Protein lιmLRGDTFPLYSr»IIKRεiGRKLΣRLLAVARY3QAl<AvTATFLSσQ0A0σWSFFSGMF ε HGGPLDTIDTPGεOGSOSFGNSLGARFDLPRKεODPSOALAVASYONKTDSOWεεHLD ECDVKrSεDLWDACF.AAKL£GALA3LC0KKDOASL0RVS0LYNDSRWDi AILεSVAF
ELΣSLADSCGISVLεTRSWILKTPSASTYINVGKLεεiεεiLKEFPSIGTLI IDEEITPS SENLDAVPFLLDCCHHEAPS RSAAAGALFS 1 FK
QORNLEKRLGLWLDRTELILEΣFSSRALTAεAN∑OVQLAQARYLLPRLKRLWCHLSROK
SGGσSGGFVKσEGEKOΣELDRRMVRERΣHKLSAOLKAVΣKORAERRKVKSRRGΣPTFALI CPn_0488 570147 569767
GYTNSGKSTLLNLLTAADTYVεDKLFATLDPKTRKCVLPGGRHVLLTDTVGF I RKLPHTL hitA-HIT Family Hydrolase
VAAFKSTLεAAFHEDVLLHWDASHPLALEHVOTTYDLFOεLK∑εKPRI ITVLNKVDRLP RKLPTCFAVNVTRSRDHMTV KOI ΣDGLIDCεi /FεNENFΣAIKDRFPOAPVHLLΣ ΣPKK iJI-S Σ PMKLRLLS PLPVL ΣS AKTGEG ΣONLLSLMTEI ∑OEKSLHvTLNFPYTεYGKFTELC PI PRF DΣPGDFΛΣLMAEAGK IVQELMεFGΣAIKYRWrNNCAECKOAVFHLHIHLLGσ
DASWASSRYOEDFLWEAYLPKELQKKFRPFΣSYVFPεDCGDDEGRGPVLESSFGD RPLGAIA
Figure imgf000097_0001
588514
588471 589106 DNA glycosylase PHAVLIRAILP
590299 590808 Metalloenzyme)
ΣEEIΣGEIADEHDVO
1 '.MM', '.-ι',7', -, i'lι.r'-.tιιι-l .!:-..» ivιιr:;vιιθM,i"rct.i'θPPiα,.';PLYS
Figure imgf000098_0001
IKTPtTE?/H[vα«;FP.'X'NliUYY.'jr-LF 26 ,29) hypothetical protein FE3 ΣKK ΣNεECKALLEORTELKHATNPEL,
../- /Ll,LK--.-ι..tNKMr-; MlJTDi JC. -kuK-Av'OFFFQAFwPKEAM
609910 608726 Succinyltransferase
609921 Symport
611165
IVTJGGGKIvAriVV RVSNA
613323 612460 Methylase
614918 615385 hypothetical protein
bl5763 616296
Figure imgf000099_0001
T-Η.. vrorhpr. -.--ii o".r .un -Pn_ns4n -> UIU
_ i T-rvL .; t iLKNtHR'.ττ:LNFEAK∑ rl21-L21 Ribosomal Protein
I^KORLTLSIERFtniWtΛE YΛVIirmsieQYQV tiOW.W^
ΓPHKΠ FVFrXrriASLGSP^'IATJAOvXAEYLSWrti.EKWAYItYKKBKNY™
EΣLΣ
Figure imgf000100_0001
_ . ;cr-ι r . v 'ΠTENTHVY VLNSNKFKSKTGAYGDLF P . .SLVLrFLVLLL/JJLFFt,- Λ Λ- FΣTΛT-GAVσrγ EYSS»»I \KMH-/PL3τF3A:G3FLFLΛLSF JIRϊ^H-LPt;FFDΛ:.P TLL:'Λn/VWSIF RVRKS IGAL0L3BMTIT3 ΣtΫVCrl H IRllELHVL 3K JftliwEnflfJSIIIBWFe- IHStt A jmi.-A- ikDi ' ytcine-Ri- Lipoprotein DΣ FGYFFGKAF-NKK I AP0 ISPRWTVVCFVACCLGATL I SFI FFKJ. FTRFASTPWPA I
KLMKKAVL[A,V FC-V/-'_J-;CTRIVDC FEDPCAPSSCNPCEVRRKKEPSCGCNACGSY LΣPLGLALGI^FFGDΣIE3∑FKRDAHLKNSNKLKAV∞MLC LD';LLLSTPIAYLFLL: /P<;C3NP:G3TE'N';G3POVKGCT3PCGRCKO TQSKEFΣC
I
Figure imgf000101_0001
LKLI tNTrLTLLHNTTHVIPNTUJIVGLCDLFAR. -cΕOAFKNYDPSLPGLLLSHNPDG :TRIΛXJYr<:DFVLiCHSHσP0VTL3WPKFARKFFεRLSGLENPYLARG-/r rκεGK0LYV NRGLGGLKR tPFCSPPεiCYΣTCSYD
Figure imgf000102_0001
Figure imgf000102_0002
CPn_0580 669936 670793 truA-Pseudouridylate Synthase I CPn_0592 681132 661461
ASSNQNF PRRS^rDCPSPPM lC AΣJ-IAYOG AY3CΛ<J^^PND^-SI0I^IESStJ^ yidD family RTPLΣASCR'TWU.λ HAYGQVAHFRAPrΛPLFANAhrLTKKAUlAΣLPiω∑VΣRtΛ/ALFD∞ LYSKMFSMSFKRFLOOIPVRΣCLLΣIYLYO LΣSPLLGSCCRFFPSCSHYAEOALKSHGF FHARYIΛIAKEYRYSLSRLAKPLPWRHFCrTPPΛPFSTELMQEGANLLIGTHDFASFAN LMGCWLSIKRIGKCGP HPGGIDMVPKTALQEVLεPYOEIDGGDSSHFSE HGRDYNSTVRTΣYTLDIVDKGOSLSΣ ICTG-«FLYKMVRNLVGALLIJVGKGAYPPEHLLD ΣLEOKNPΛEGPSAAPAYGLSLHHVCYSSPγNNFCCεOCSVSTSNEG CPn_0593 682494 631391
CT474 hypothetical protein
CPn_0581 671533 670745 VLGAKCMAF1<PJα WLW0VLII^VGrJN>B^Fli -FYSAΣFRjωiYKLHI FSGPLI
Phosphoglycolate Phosphatase VYLSEDFLNEISOASLDDLΣSLF DERYMYGRPIKLWALSVAΣASHHΣDITPVLSKPLTY
EGIΛ RS o SFLROCWIYSMLVSDEFOUr∑ΛSGMYI-eiJYDVFFFDL∞IXVDTEPCFYRA TELKGSSVTl t^PNIDLKDFPVIUJYUtCHKYPYTSKGLFLLIEKMVOEG VDEDCLYHF
FLQACAEFSLEVH DFSTYYSI rTΣΛTEIFSKKFIEQYPOAQEYMAEIFAKRLOIYYKSL CS PEFLYLRTLLvrjADVQASSVASLARMVIRCGSERFFHFCNEεSRTSMISATOROKVL
EHAGPAIΛPGVΕΛFIELVΣ^IΛKTFGVVTNSPRDATHTIΛTMYPΣΣJ^KFΣJ'WVTP NYA^ KSYΣ XEESLAAIXLLVHDSDVVIJfEFCDEOLEKVIPXMPOESPYSONFFSRLOHSPRRE
PKPYGDSYDYAYRTFAPJErjMKVΣGFEDSVKGLRALSKIPATLVCINSMAEΣTPEDYPELK LACMSTQRVEAPRVOEMDEEYV\Λ:CX3DSLWLIAKHFGirΛmκiΣQKNGLNHHPXFPGlW
GKεFFSYPSFDVLTEHCSQQKLL LKLPAKOS
Figure imgf000102_0003
i I-ll IVlll i.HO l li, b7'ι5H l-n.'ir.oi) I.".! I '. £,<? /* "T ι -l'4 l hyjn.t hi -r w.i l pror* rι
Figure imgf000103_0001
CPn_0605 696737 696150 yhhF-Methylase CPn_0616 708704 710137
LRKΣΛSSRGDVRΣLAGKYKGKSLKTFSNPHIRPTSGLVKεAFFSICREDIEGAAFLDLFA dnaB-Replicative DNA Helicase
GMGAIGFEALSRGAASWFVDISIKAIQLΣHTNSALLGEQLPWIFRODAOSAΣQRLIKQ TLTOYESSr MDKSTl.WLPSPPHSKESEMrVτiGeML'rGVHYΣO AArø
I^RSFDLIYIDPPYEΣ aJCYVεTIiOK-VSGNILNPEGTLFI-EllASDεEIACKLπjtRRR KIIFTt\rLODAFKODKPirΛfHΣJWEEΣJCRW«ΣWIGGPSYLITLAEFAGrrAAYtJEEYVDI
KLGKTYLAEYIVEKDP ΣRSKSILRKMISTAKEIEKRALEQPKNVAE^«iiεAONSFFKIS0STSVSCrYTLVADI(LRG
L'I ^TD PYL 00εR0εIJlΛ^-W 3rΛKSFF σIPTHFID D0LIHGFSPS LMILA λ
CPn_0606 697492 696707 PAMGKTAUWIλENΣ^FONIU.PΣGIFSLεMTV∞LIHRMΣCSRSεVDSKXISIGOLSGH
CT488 hypothetical protein DFQRIVSVINEMOEHTIilD∞PGI-KVSDLPARARRMKESYOIOFLΣIDYIΛIiSσSGτL
SSYSRROLRFYTGSLQMHIYGI-VJLHWLGVPEKTMEVFGDP IGYHQKICSEHOA HP ωTESROTEISEISr<MT-KTLARELNIPILCL^QLSRKVEDRANHRPMMSDLRESGSiεQD
EDrVLLPGDIS AMNLSEAHKDFAFrGDLPGTKYMIRGNHDYWSSASTSKILOALPPSLY SDLVMFLi-^REYYDPNDKPCTAELIIAKNRHGSIGSVPLVFEKELARFRNYSAFECΣS
YIΛWFAIATPHIAVVrΛRΣΛDSCTICVKKεNI^TPSTQEQSYTrø^
.AFAALPKEVT-WIVMTHYPPISSrx?rPGPISEFLEAIX3RVSI^ FGHIHKVQRPIDGFGN CPn_0617 710481 712316
IRGΣHYILVAADYVNFVPQEVM gidA-FAD-dependent oxidoreductase L^r^HP∑Aγrw∑vvr3AGHAGCEMYCSAK^ισvsvIJ^LTs^^I rτIAKIΛCNPAV^
CPn_0607 698910 697573 IVR£IDA X:iMAEVTDOSGIOFRIIJICTKσPAVRAPPΛ(3VDKOLYHIHMKP_CXENTPGL glgC-Glucose-1-P Adenyltransferase HΣMQA'TVESLLDKEGVISσVTTKEGWMFSGlr-vT^SσTFMRGLIHΣCTRNFSGCJu DP
NRRIQMIENDFPEASNFESSHFYRDKVGV1 ILCGGEGKRLSPLTNCRCKPTVSFGGRYKL SSCπ^EDLKKRGFPISF J rGTPPrUiASSINFSCMEEOPσDUJVGFVHRTEPFOPPLP
IDIPISω∑SAGFSKIFVIGQYL'TYTI-∞WJKTYFYllGVLQMIHLt-^EAROGtXjrMY QLSCFITHTMHOπAIISAtπΛRSALYG^KIEGVσPRYCPSiεDKΣVlrFStaCεRHHVTlX
QCTArAI K Σ Y ED EIEYFLILSσIX3LY^ DF SrrDTAIR'Σ1^VIW LVAQPIPEra PEGLhτOEIYANGLSTSMPFτπnjYCMIP^VUSI-ENAIITRPAYAIEYlrYΣhYanmiPTLE
AYRMGVLDIDSECKLIDFYEXPQEKEVLKRFOLSSEDRRIHKLTεDSGDFLGSMGIYLFR SIΛIEGIJ1 ∞IM- rσYεEAAACCI-IAGΣNAVNKVTTrePPFIPSR0ESYIGVMU)rΛ.T
RDSLFSUΛEEECOT3FGKHLIQAOlKR∞V13 IiYNGYWADIGTIESYYEANIAL'roKPH TOILDEPYRMFrGP^PXXXΛQrΛIACARΣΛHYGYEijIJ SEERYELv^^
AEKR ICYDDNGMΣYSKNHHLPGAΣΣTDSMISSSLLCEGCVINTSHVSRSVLGIRSKIG RIΛKTFROΥGO-W/SLAKAI^RPEVSY^WIΛEAFPNDIRDIΛAVLNASLEMEIKYSCYID
ENSWDQSΣ ΣMGNARYGSPSMPSLGIGKDCEIRKAΣ IDENCCΣGNGVKLQNLKGYIKYDS RQKILIQSLEKA-3 .IPEDLDYKOITALSLEAOEKLAKFTPRTLGSASRΣSGIASADIQ
PDKKLFVRDNI I IVPQGTHIPDNYI F VLMIALKKHAHH
CPn_0608 699690 699016 CPn_0618 712300 713010
Uridine 5'-Monophosphate Synthase (Ump Synthase)-truncated? lplA-Lipoate-Protein Ligase A
VSFLYFVKNGRP.L RMMNYEDAIOarø3AVAILY0iσAIKFGKHILASGεεTPLYVBMRLV KNMPTrNCIFI^LRGHSILHOLOlεX-UXAVANONFCΣΣNSGAKDSIvtGISRNLNODVH
ISSPEVU3TVATLΣIΛLRPSFNSSLLCGVPYTALTLATSISLKYNIPMVLRRKELQNVDP ISRAOADHIFIIrøYSCβπVFIDSNTΣJWSWΣMNSSEASAQPOEΣ AWTYGrYSPLLPN
SDAIKVEGLFTPGQTCLVINDMVSSGKSΣIETAVALEεNGLVVREALVFLDRRKEACOPL TFSI E^roYvIΛHIαIαalAQYI0RHR HHT FLWDID DKI^YY PIPOOTPTYRNQR
GPQGIKVSSVFTVPTLΣKALΣAYGKLSSGDLTLANKISεiLεiεS SHEEFL'TTΣJlPWFPSRDDFIjaiKASGSIXFTVreEFrΛNεLεEILAOPHPJA'rTVLN
CPn_0609 699672 699986 CPn_0619 713462 713013
CT490 hypothetical protein ndk-Nucleoside-2-P Kinase
ONTKNSLIRε^LIRLFLGISLPKGFPLYLεPPLVLATFOστQFVGTYSεATNPLYΣDNL RRYVYTMEOTLSIΣKPDSVSKAHIGEILSIFEOSGLRIAAMKMMHLSQTEAEGFYFVHRε NLNYHYTQεLLYKAVPCNYKSΣYREΣPLΣIFPεVLIGSTPTQSTε RPFFOεLvOFMVSGPWVLvXεGANAVSPJTOELMGATNPAεAASGTIRAKFGESΣGVNAV HGSDTLENAAVEIAYFFSKIEWNASKPLV —
CPn_0610 701450 700029 rho-Transcription Termination Factor CPn_0620 714145 713519 o Σ FLRFKGS IMKEεRSSε I PRVKETKKH AYVSMOEKSCVGECAVVASESεEAESVTVTK ruvA-Holliday Junction Helicase
FAKLORMσiEELNΣLARCYσVKNΣGSLTKSOWFEΣ'/ AKSERPDELLΣGEGVLEVLPDσ DKMYDYIRGTLTYVHTGAIV∑εCOG∑σHΣAΣTERWAIECΣRALHODFLVFTHVIFRETE rGFLRSPTYNYLPSAEDΣYVSPAOΣRRFDLKKGDTΣΣGTΣRSPKEKεKYFALLKVDKING HLLYGFHSRεεRεCFRILISFSGICPKLALAILNALPLKVLCSWRSEDΣRALASVSGIG
STPDKAKERVLFεNLTPLYPNORIVMEMGKDHLAεRVLDLTAPIGKGςRGLΣVAPPRSGK KKTΛEKLM\.ΕLKOKLPDLLPLDSRVETSθτHTTSSCLEεGIOALAALGYSKΣAAERMIAE
TVΣLOSΣAHAΣAVNNPDΣVLIVLLIDεRPεEvTDMΣRQVRGEWASTFDEOPεRH∑OVAE A I KDLPEGSSLTD Σ LP I ALKKNFSGVNKD
WIEl<ARRLVEHσNDWILLDSΣTRLARAYNTVOPHSσKILTθσVDASALHKPKRFFGAA
RN I εGGGSLT ΣLATAL I DTGSRMDEVΣ FEεFKGTGNMεLVLDRRLSDRRTΥP AΣDL Σ KSG CPn_062l 714707 714144 τRKεεLLYHPSELεRVYLFRQAΣADLTTΣDAMHLLLGRLKKTNSNAεFLLSLKE ruv -Crossover Junccion εndonuclease
L3RLGSSFKDNKFKVFθεSIVSELIiσVDPCTIVAGYAIIAVE0RY0LRPYSYCAIRLSS
CPn_061l 702133 701420 ∞PLPMRYKTLFεQLSCv^DDT0PNAMVLεT0FVNI<NPQ3TMKLAMΛRGΣVLLAAA0ROI yar.E-pr dicted phosphatase/ k mase rFF/AP^r AKKAWO CII.-;KR0VQV^IVSKt[-WPEVLHPS EDtADAFAI ^ICHTH R'NPRDAKTSERεD3lSYDFΣRSYSCEYLNWKKLGRMLKLLKV3ΣTGDLS3GKTεAC0VF RSPLCCVR jELGAYWSΛDε∑SHSFLIPHTRΣCRRVIDLLGSDW/DGAFDAOAIAAKVFYNSVLLOG
LεAILHPEVCRIIEEOYHOSΣQDCNYPLFVAEVPLLYεiHYAKWFDSVILVMANεDΣRRε CPn_0f,22 -15761 714/'
RFMFKTGRS EDFDOR SRFLNVEEKLAOADVVV'EhrNGTKKELHOKΣEεYr/ALKGAL (T'.OJ hypothpr ic.il protein
I<Y'JVPI.LUI KLHLF::L!i.-l.-tS.' L.';i-H-r/Hrι-::R:;MLIILLCRWKDΛDIMEW0OICNΣLSσV hRR ,Y2l)_2 ' ':PM' )KI.V-l I'L-TOD:':i-|10EHERUILIJYRr t,.';ALEEF/PRRCEAKIIODLEKL(30ENT
1Λ.IA UNA Polym.n.u.e I ι ιi'iΛπκ ιo[ iι ''nι i IirT.' LGT ERPRREYAMKKLrVLDΛJjrtFRΛYFΛLPEMKNIIOGOΛT AVFGFIRCL 11: i : r Y I rTYIWF. l:Λ/FDι;r-NNKQ:;ROA IYADYKSNP KKFED I PO IΛLVKEYCSLIGLΛ I- : :i?f::tCKLTIWTDtΛl-kKKΛ[ΛLLCΛilVrl>jY( OI.I'DI.IIKFKCΪIΛMPSNTKUJHLK
YI.F.KC:VCArj|)VIA.JlAKKAREENYK\n-\' TADKDLL0LVNDHWAWNPWADijGWGt3E i.l.UIKI.I'K.'FΛ'tivVF.tlϋK.ll.i.: v l I pr< ;rι i t
:i.TM :CR0ETLKL;:Klt \L[ )3NirtPVPIESLTFP HPVDEEKLIHFYr00GFKTLVP ' i-iij)-.:' i i i /ι., 11.1 IK TCΛΛTVDVy 11 KPΛIVLTN IljLVl.Vi'.DIAFAVAYT'lNHLI.'II.KLECLALTOGSrrVF l--IΛI.EEEtrrKtLI-II.Kt)|-!--[.HCDI.-rrY..;YNLKRDCIIΛLLNΛi;tVIRCI:;YDLΛLAEHLTN i tπ-v i r -ι<ιτ ;γFt.-.7iiN-rκιιr ; m.|-7ivι--Λ t ι:v t::υ:tι n
ϊϊ:κι-:ro::ι.L7Niιι,-rrr-r,MiRFΛKEwι';Nτ:LriGr'LPEijpr.θYrGcrvΛYLPtιKDΛiL II.. !-π.7l-Λι:lιYr:/M|-VR-|-KIMI. Λ7'-.I.I)l..-,VKtΛθ;i'E/.LIKLrK.'7π't.rVtDEKrLA ij.i'FJTivrTT :κκrκκι ι;κκwκι:κκκι.':ι<pp|i!iKEtΛFvτι;Λ:;ι.>κi[.nτvκ FELWEE. :ι.,r.NC ΣYEOKKF.-LLPPPAKL I3EVΣ3C . JPWTSADLNεSLOALVRεSSDL rz l'1 .i l ' R lbosom.i l I * !" „—-,.,— INΛLL:,ΑDUAIHFPETEEEPT3A3FEE33AMFFPετSSATEεε NKKEKVKSMΛ3EPPG3RKVK IGVWSAKMEKTVWRVER I F3HP YLK'/YR.>3KKYYAHT εLκvsεGDκvκIC-W W«∑^BWι^«ει(yr s: j
CPn_0639 725979 725743 rl29-L29 Ribosomal Protein
A3GKGIWlAAJ KεLLT0LRGK3DDDLDΛYV iεNKKALFALRAENL 0. KV/KVHMFSTHK KN I ARALTVKOEPFGKVHG
r l lβ -Llβ Ribosomal Protein t IMLMPKRTKFRKQOKGOFAGLSKGATFVDFGEYAMGTLERG VTSRQ∑εACRVAINRYL
CPn_0625 718488 718060 KRRiSKVWΣRΣFPDK.S'.'TKKPAETRMGKGKGAPDHWVAVVRPGRILFEVANVSKEDAODAL RRAAAKLG Σ KTRFVKRVERV r l l7 -H7 Ribosomal Protein „ .„„,-
W0HARKKFRVrjRTSSHNRCMΣΛNM tSLΣHYERIETTLPItAKE∑aR«ADKMITLAKIO S
LAARRtAΣGRUfvTtYi .TSKEARQAKGGr/rSVYNvORLWMαJ^ CPn_0641 727092 726409 R I LKLON IGDNAOKC 11 EFLAS rs3-S3 Ribosomal Protein
KGPJtlMCXJKGCPΣGFR'rσVTKRWR^L YG^OEFGKFLΣEtΛrRIRQFLRXKPSσjGAAGF
CPn_0626 719670 718495 VvTWMSGKiεVTΣCrλRPGLVIC40<GAεVDLU<εiIΛALTCKεVVπ,εiAEΣKRPεiJIAKL rpoA-RNA Polymerase Alpha VADNI ARQΣ EP WSFRRAMKKAMQSVMDAGAVr^ICΛ/SGRΣJ-3AEIARSEWY»CRVPL
!ΛPAW l<AQSvVLσ!<εKG«SrΛ)AHNLLYDKFELPEAVKMLPVEGLPIDKHARFΣAεPLεR HTLRADIDYATACAETTYGI IGIKVWINLGENSSSTTPNNPAAPSAAA
GMGHTLC^AIJUlAU,IGLEAPAΣΣSFAMTGVIΛεYMA∑εGViεDVTNΣIIJΛK »LLKKY
PMQDSSIvGRTTQVLKASISΣDASDLAAANGOK-VTLQDLΣΛεGDFEAVNP QVΣFTVTOP CPn_0642 727440 727096
I0LEV\ΛΛIAFGRGYTPSERrVLεDKGWεrVLDAAFSPvTLvTrfFV-3rrRVGODTDFDR rl22-L22 Ribosomal Protein
L\rt.IVεTI«RVTPKεAIAFSTQILTIWFSIFENMDεKKΣvFEEAΣSΣEKENKDDΣLHKLΣ RRHSWT ATARYΣRVQPRKARI-AAGLMRNIΛW3EAEIβlΛFS0LKAGRCUCKVLNSAVAN
WΣNEIELSVRSTNCLSNANIETIGELVIMPEPRLLOFRNFGKKSLCEIKNKLKEM LΕL AEIΛENIraiENI^VTεVRVDAσPVYKRSKSKSRGGRSPILKRTSHLTVIVGE ER GMDLΓQFGTCLΓJNVKEKMK YAEKIRAK TKG
CPn_0643 727725 727450
CPn_0627 720059 719640 rs19-S19 Ribosomal Protein rs11-S11 Ribosomal Protein EIRIMGRSLRK3PFVDHHLLKKVRAMNiεE!αCTPIKTWSPJWMITPEMIGHTFEVH GKK
FLlRSRVXVK-TOAOAKKSVKRKQUWIPSC Λ/TIVIWTFNNTIVSITDPAGNVISWASAGK FL'rVFVSETMVGHKLGEFSPTRIFKSHPVKKG vTTYSGSRKSSAFAATVAAODAAKTAMNSGIJ-eVEVCLKGTGAGRESAVRALISAGLWSV
IRDETPVPHNGCRPRi RRRV CPn_0644 728594 727722 rl2-L2 Ribosomal Protein
CPn_0628 720461 720063 FIR£INSMF10XFTCPVTr<?rRQLVLPAFDELTTRGICΣJ*r TKSKRSIJlPJ^^ rs l3-S13 Ribosomal Procein Rra4rjGHΣSCP-WGCαU OLYRVvOFKRNKrX3ITAKΛ nVEYDPNRSAYIALLSYεrχjEKR
DAYTILREAQRMPRΣIGΣDΣPAKKKLKISLTYlTσiσSARSDEIIKKLKLDPEARASELT YILAPKGI0PΛ-rWVSGEGSPFKPGCCMTLKSIPIΛL5vTtNΣEMRPSSGGKLVRSAGLAA εεε TJRIΛSIΛQSEYTVEGDIΛPΛVQSDIKRLIAIHSYRGQRHRLSLPVR∞R'rKTNSRT θ ∑ΛKSPt_YVTIJ0rPSGεFRMIΛ-r-CRATIGEVSNADHNtΛVIX5Kω
RKGKRKTVAGKKK TAMNPVTIHPKCXXSEGRHNGYIPRTPVCI VTKGΣJCrωKNKSNKWIVKimRK
CPn_0629 721881 720487 CPn_0645 728933 728598 secY-Translocase rl23-L23 Ribosomal Protein
KIPXFRPYMTTLRQFFLITEIJϊQKI-rYTFAΣiTACRVl.VFIPVPσiNGELAVAYFKQLLG DMKDPYWIKP W TEKAKMIiHWAGTGreKKKGSFCKDPKF IVSHIWrK^
SGQNΣ-FOLADΣFSGGAFAOMTVIAIΛWPYISASirTO FLVFMPALOREMRESSDOGKR εAIYVBKNVK\n SViπ'INVKroPARMFPJ3RW GKTSGFKKAΣVTFYOOHSVG
RIGRLTRIJTVAIAVIQSlJ^AKFAΣJ-φn^TireiVLP ΣXSSKLFσVPWIFYITTVVV^
TTGfTl-LLMWIGEOISDKGIGHSISLIIAΣΛiωSFPSVIΛSrWKΣJUCSQDSSDLGLIS CPn_0646 729636 728950
ΣLILALVT rtVLΣ'I ILIIECW«IFVOYARRVIGRPJVPGX-GSrY ,PLKvireAGVIPVIF rl4-L4 Ribosomal Procein
ASSΣJUlFPATIGOFIASESStΦOCRIAAI APGSLvΥSICYVLLΣIFFTYF TATOFHPEQ YRεDLMVIXSKFDFSGNKiσEVεVADSI-FADEG∞L0LIKF/rVAIRANKROWSACTRNR
IASEMKIOWAFIPGIRCCKPTQHYtJSYTMNRVTIXGALFΣJΛIAΣLPSLI^^LLRVDSNV SEVSHSTKXPFKOKGTCΛARO^r-ASPOFRGGGIvTσPKPKFNQHVRΣNRKERKAAIRLL
SYFLT3CTAMLrvΛΛ3V\n^I)TMKOΛΦAFΣXMRRYDSVLKTDR'rKGRH LAOJCIO^NKLTVTOi rFVr^TAPKTQSAUlFUCrXW
LSLRNLTAWGFVYGININGYDLASAHNIVISIOCALOELVEPXVSETKD
CPn_0630 722316 721885 rl15-L15 Ribosomal Protein CPn_0647 730490 729657
MIlOJKLI'DISEP-SRP ααXΛPΛPSSGHGXTSGRGHKC.∞SRSGYKRRFGYB-GGVPLYR rl3-L3 Ribosomal Procein
RVPTRGFSHKRFDKCVεεiTTGRLAε∑-FOEGEAITLDAIJCAKKAIARQAVRVKVILKGDL YLEYTSYCKNLPPLΣTCPFIFIΛ-MFtJFLENSISKIIJRFVSLF OEESTCStlXMDIOFM εKTFV QDTAVVLSOGVONLLGIT RSHΣSVlCKKECMIHIFDK∞SLVACSVIRVEPr TOIKTKεSDGYFSLOIGAEEKNAP
A ,riT P /SKPKLGHUU<A∞RVTRFIJ(EVRσSEEAI-^GVSLGDAFσLεVFErjVSSVt)v^
CPn_0631 722812 722312 GISKGKGFC^GVMKKFGFRGGPGSHGSGFHRHAGSIGMRSTPGRCFPGSKRPSHMGAENVT rs5-S5 Ribosomal Protein VK TLEVΣKVDLEKKVLLVKσAIPσARσSIVIVKHSSRτ
SEMSI^KNSHKEDQLEEKVXVV mCSKVVKCGPJ<FSFSALILvra«KGRUrYGFAKANEL
TDAIRKGGEAAKKNLMKΣΣiAI^E∞SIPHEVLVHHDGAQLLLKPAKPGTGIVAGSRIRLIL CPn_0648 731636 730605
EMMIKDIVAKSFGSNNPMNQVKAAFKALTGLSPRKDLLRRGAAIND CT529 hypothetical protein
CPn_0632 723354 722827 rl18-L18 Ribosomal Protein
KGLISSVΛVMX,OVFAPNVLLNLII VT«EFVMKMNMSVvT<LVKLRKOAKNRSRVMESSLCK
KSLMKRRRAIΛ\^U<VLKσSPTKPRI23VVKTNKHΣYVQLΣDDSIGKTLASVSTLSKLNKSQ
GLTKiαJOεVAKVLσTQΣAεLGKNLOLDRVVFDRGPFKYHGΣVSMVADGARεGGLOF
' l-nj)ι. . / / I ',4'in rf-n_'x.M / iι..l ,'ι / ',4 i)»,H
CPn_0656 737842 738048
No robust homolog present in Genebank/EMBL as of 11/7/98 CPn_0667 751097 750177
THNFIiLPLSLFDΣLLT\reCFLCLTLYFASVORMPCEQKRVPGNLYYYYIAAHSSLCLSV No robust homolog present in Genebank/EMBL as of 11/7/98
CKDTMENKD NISIXCKΣOKRYFM)Qα,ILYFAAFVASLFα?rttWDRVP<:AOKΣMRLAADHSSEVFSKSC
RFVRKISGFEiαO rFERHVSPEOAIAIJPEYRIXSKSFVEIΛFIPHTIΛHvRFSirEEPVKK
CPn_0657 738476 738051 HIISQEGEILWSLVrJGEMVT inYJTWTCSTrøFREαX JIAGKO MRVIOT yjeE (ATPase or Kinase) SIJ«AIjωCNIRAERVIKEroraααiFASra«ΣGTHFCQFQPΣRG(rTTri2ββlPV^
PMGRYRRVSHSS0I^TΣXI^7TiαGCΛΛVPGAVΣXI_FGDYGAGKTεFVRσiVSσYIΛDTIAE RHAAWPAOYSEDR\ iΛvTmIFrΛ)NFLrvTtSSM\^rVPVYKΣSLVSADNSVRVEYiωVT
EVASPSFSΣIΛVYGNεPKRLCHYDLYRΣDQKNOεYIFQDAεEDDVLCIEWADRLPKPRFC GKSFODL
DTINIYΣTMQTNMEREIIIEKR
CPn_0668 751176 752162
CPn_0658 739180 7384S5 CT547 hypothetical protein
CT538 hypothetical procein TOFW /SPPXIMKFΣiYVPΣJ-LVLVSTGCD PVΪFEPFSGKtSTORFEroHSAEEYFSQ roVTJMDISGAv^OKΣ.-OFlΛKQKKPε∑ ^'IYLFYI OALSIΛPVVFVWlKIIFKTPEDAV GQEFΣJOCGNFRKALLCFGI ΣTHHFPRDΣΣ tNQAQYLlσVCYFTODHPDLADKAFASYLOL
RΣIΛODKKI RETEIOISSEKPO ^Emi RrYICPFTGKVFAENVYANTO-ΛrYIWLSSC PDAEYSEELFQMKYMAORFACCKRKRICRia∞FPKLMNADEDALRIYDEILTAFPSKDL
PQ^I EKC/-GVRIK FLVSIα)PDVIKEYAVPPKEPIIK FASAITσK FHSLPP EDFI GAOALYSKAAU.rvT(NDLTEATiαTjααTIΛFPΣΛΣLSSEAFVRLSEIY∑ χ3AKKEPKNL
SSYIΛPMTLEEVQNQTICFQLεSSFtJjLLQriALvΕDKIAAFiεSI.ωrrAFHvΥΣSC Λ'DT QYΣΛFAIO-NEEAM!<XOHP*mPI-MEVVSAhΛ 3AMREHYARGLYATGRFYEKiαC|fAEAANIY
YRTAITNYPDTLLVAKCOKRLDRISKHTS
CPn_0659 739482 739838 CPn_0669 752140 752775 trxA-Thioredoxin CT548 hypothetical protein
LQEMlRDSNSIFREGKLMVKIISSENFDSFIASσLVLVDFFAE3CGtΩ?MLTPIΣ-- LAA ΣEYT^IΣJ'KIEINMPXJSl-GTIYLFFSLALSSCCσYSiωSPYHLSSliGKSLLOERIFIA εLPHVTIGK∞IDεNSKPAETYEVSSIWLILFKTCNEVARVVGLKDKEFL'rNLΣNKHA
CPn_0660 740327 739860 spoU-rRNA Methylase
MRVvXHCPDΣPONTGNIGRTCVAΣΛAELΣLVRPIΛFSΣ -aKFVKRAGMDY DICΣΛLTVVD CPn_0670 752738 753196
SIEEAΣJ vTε∑ra∑FCLSTKGSASYTεFSLPSSGTYV GSεSKGiPKεiΣJC YYKNCLRΣ rsbw-sigma regulatory factor-histidine kinase
PMOΛDIRSLNIATSVGIVLYEV TtQKTVALQKNPTV PP U- ΛRYTMTFFEGETVFPA T^EΣΛSMIΛLIKRAGKOSKCPQEKLLKIJE tt^EIXΛW
IISYAYCCENSPGTIAΣSCISHRGDLEVVIKDHGPSFNPLAVSINI0EDLPLE0RKLOGL
CPn_0661 741139 740327 GIFI<AXSSVDEFLYAREDHCNIVKLKMLNGQHS mip-FKBP-cype pepc dyl-prolyl cis-trans isomerase
HSRCΣJ IKDRRRKMNP tWNLVIΛTVALALSVASCIWRSKDiωKΣX SLVEYiωNK∑rnro CPn_0671 753660 753205 εi^DN0IOJRTFGHIXAR0LRKSEΣ»-FFDIAEVAKσ∑ΛAaVCI SAPLTETEYEEKMAEV CT550 hypochetical protein
QiαVFεKKSI -3<WLAEKFLJCIΛSI0MAGVVEV0PSKΣΛYKIΣlCIiσAGKAISGKPSALLHY RITI(«RKYTMSΣJ3FFεEFYH0SIΣΛTCTSFPEσYLNIAEΣLSYPHt7TDANTDFLCS0SD
KGSFINGOVFSSSEGN rεPIΣXPLGCriPGFALGMQGMKEGETRVLYΣHPDLAYGTAGQL NDFIIAESωKLTLFNADFAΣVΛVP∑α\^3∞AVτRGYIAVSθσεGNYEPEMAFEASGOYN
PPNSLLIFEΣNL∑OASADεVAAVPOEGNOGE QSSLILIiAIΛLYLIΦIKrΛΕNALRSFRFNNDH
TPn.'li. /i, 75' 20 758051 -LMLIVLAFRQVFF-JHSRSO^ ..KNYLRLLKCNFA IT rKERT 'K IH "IM TFEFASFS
FYTNirPFLEE0KIPAVVrrVA3RY ir3NΛΛuD IP3IIRLKr ;ETLΛF;EE:F ^r tPFCC
Homologous to CT695
JMrTrPISGNDGDRNTΣSDPLEεSAAεECDSDLεDRVSESATOV∑εTIADTσiPEATPSEG 0N8 IEMAK3PYIC!^-CF IPJ!ll ICVK«rttLTTEIi^ ^JSDtΛSDLVDRVEYεΛRGSLLTTMIJWIRKAVSOΣ MHVKTKRHPKEO rVRSLσDIPCD 3DPTSRKLAADHYPY3FIiGlllTtNRKLinTmiYRLD rKPM0«'C!,SLP1!;53RYf-«WIKE LLKATRLPKεTAεPPYFYALεTALA3CR3FFFHVFLRLFTLLRR0HPEAPLDLCGTDPIS KSKOLYLKKQLPKR PEAAVAFALΣLRSCCKWVATDATOECLPLEVrEEAGMYNAFSLεATTTVEEVSKRLSεLL YTDKR IDG ANVRGΣTK I ΣTSPYLGAGOO/SVVDNLKTYDLCRNYτQVLACASOΣDEFAD TPn_068O 771407 770137 ^FNFALVMKDπ.Yr.7R0DR'-FπΛDFI n(WSεεHASεv rYOVVI_AΣLE'ΛJLPΣLεEDYR •/f h -N ι f " - r» l«ert Am innr r.ιn-, fβr i'!»
•:• T i ,v>vc - ••• ■-- * /.." ' .. :...ι. > . -. τ ι . ' "'- .v.™\"
CPn_0677 760410 759256 HH ANVL3 E I ACPRRCSLVKK Σ RVHDSGL IDLOCLEKLLNεGAύFVG I PHVSNVTGCVQP
No robust homolog presenc in Genebank/EMBL as of 11/7/98 LOXJVAεLVHRYDAYLAvICAOGAPHLPIDVOLWrΛrDFWFSSHKΣYGPTσiGVLYσiαrBL
RlAMOΣNPSGNRSPDDVVWRGACCDSSSTOGTGATNSNΣΛAHW TSTSOPOVASKAKOL LDOLPPVEGGGDMVAΣYDHONPEYLPAPMKFEΛCTPNIAGVLGI lAAIJΪYlOGl-iAKFIY
W3TVREFFIΛKKSPDSSC iASGPAMOSPSGPTIRPTRPAPPPPTrGX5ANAKRPATHGKGR DKEIALTTYLHKELLε∑Pσvε∑LGPS∑εεPRCALΣGMTIDGAHPLDLGFLLDLRGIAVRT
APOPPTAGSSSGSεOPTAMSSEVAiαVSEIjαjAVHSHAESOKVliα VSOEICTrølTD EN GHQCAQPAMεRWNVGHVLRVSLGIYNDεDDΣDQFILVLςDSLDKIRR
NRGPDYLLHGYRVIARAI^2TYTCQSMLIEGTSSTGPVPOAVTVAKrAVTQTVRGAIKN^
ENPKPGNDPr STLMOVVISI GIEGPTI-SPGESIONFL£TRv^DFGGDDSDIDYTSDIARL CPn_0690 772704 771436
33AIΛRVRENHPNεMPRIWIAI «EI/aΛVHSHATS\ iANAGKNHTRDVVRMANESSRL ABC Transporter Membrane Protein 3GMWLSVGA ANTMTVLIGDLFE ωVUK3DK\^VSΣETFSSIASGSPVTJI<AAεACYT0YSKQPSSKEVLSSFS XQELSLFPD
RY tA^raSELIKOHW^-W HS^-^FECILI^lσ Y PSLSQ PErJ ΣVCσ DEARGSLSSF
C?n_0678 761329 760682 MCCFΕWIKHPLAFLNAVCSεDRGΛA/IYIPEEMCTSDPrFvRHISFPTVSDHDVIFSPRrV o robust homolog present in Genebank/EMBL as of 11 /7/98 VIl«PJ^AQI0ΣSHDVDLεMTCSSKTΣvTK7VTεLFVGEGADLTVFMVPGYSEEI? SWS
KΣ ΣMSVNPSGNSl<NDLWITσAHDOHPrΛ^εSσVTSANLGSHRVTASGGROGLLARIKEAV TIATVΕI0»ICR»froNXIJ-3CQGFG FDNTSYIvT:KKGHAεSLVLV0SPRKTWWe^L«SH
TGFFSR>ISFFRSGAPRGSQQPSAPSADTVRSPLPGGDARATEGAGRNLIKKGYOPGMKvT DAE- VSRQNΣKSrLYSGHFLFEGTISΣSSCCDLSDANQKHrπ'LLLSSEARVSTFPRLEI
Σ POVP<X3r^ORSSGSTTIJ PTRPAPPPPin^3GTNAKRPATHGKGPAPQPP!aYJGTNAKRA ETDEVI ASHGATVGPLDPOOIFYMP^RG4OTεAεAQEKLΣHGFLK0GLVSΣ)TFLGSSF0I-N
ATHaKGPAPOPPKGILKQPGOSGTSGKKRVSWSDεD OTS
CPn_0679 762936 761725 CPn_0691 773467 772685 pgk-Phosphoglycerate Kinase CT691 hypothetical protein
GYMDKLTVODLSPεεKKVLVRVDFWPMQKKIΣΛDIRΣRSAMOTItmjyCKHMVIΣΛS RGIΛSMLKIKHLHASCNΓΛTCIIΛDFNLNIQPC MHVIMGP∞AGKSTLAKΣI-AGDESVLV
HLGRPKGO3F0EEYSΣΛPVVrΛ/LEGYLGHHVPLAPt»Λ3EVAR0AVA0I-SPGRV7J_-E L SSGΕ∑AMEONI -^^PEERSRAGI.FVGFOMPPEIPC TOIKMFLRDAYNARRRANOEGDΣ
RFHIGEEHPεKDPTFAAEWSYGDFYVirDAFCTSHRKHASVYVvTΩAFPCRAAAGI-LMEK SIDEFNTΣ STVLETYEYNA'TTDLFLDRNVNEGFSΣGEPJ RNEICOMLVXEPEMVLLDEP
E ETI>GRHI TSPKKPFTAILGGAKISSKiσVIE^uVXJ«VFlXI-IΛα3MGFTFLOALGKS DSSLtAΦλl ^ICRVIf YRεLHnSSUtVTKNPKU.NLΣRPDVVHΣX DGRVλLA^
LGNSLVEJCSALDLAPjnΛJCIAKSPΛVTrVLPSDVKAAENLOSK∑ySVISIDC jIPPHliQG SLMHELEAKSYQEVTKRVλWR
FDIGPRTTEEFIRIINOSATVFWNGPVr/vYEVPPFDSGSIAIANALGlWPSAvT GGGD
AAAVVAI^AGCSTI VSHVSTCGGAStJSFI CGFLPGTEVLSPSKS CPn_0692 774945 773461
ABC Transporter
CPn_0680 764254 762971 ΣOEFCATGUCWKεSVKVFLEEP^r^PYGFVTPIESCGLTRGLSεETIEEIAALRNEPOF ygo4- Phosphate Permease IIDFRΣ JAYRYWXO IEPAWAPXHYGPIAYDDΣVYFSSPKOKKPIΛRIilϊVDPElLDTFK
YSMLPLIIFVIJΛGFY SWNIGANIJVAhlAvTJPS GSGv TLROAV\ AAIFEFTCAUXG !<ΣΛIPIΛE0KPXIΛVErWAVDLV DSVSΣGTTFKEAΣJ3CAG ΣFCSIΛEAI0Er tα.v^
DRVAGTΣESSrΛ/SVTNPMΣASGDYMYGMTAAΣXATGVWLOI-^FFGWPVSTTHSIVGAVI Y∑ΛSVVSI RDNFFAAΣΛAAVFSDGSFVYVPKGVKCPMDI STYFRINNKEAGOFERTLIW
GFGLVΣΛKGTIIY røSVGIILISWILSPFMTJGCVAYLIFSFIRRHIFYKJIDPVΣΛMVRVA EΣXSCTASYΣ^SCTAPAYSSTOlJttAVViaVAHEHAVIRYSWQNWYAGDiα ^
PFLAALVIMTΣZrr MISGGVILI<VSSTPWAVSGVL\/CX;i_--rYIITFYYVHTKHCSYISOT vTKRGLCACTYRSKISWSC ^^AAIT KYPSCΣU GDESVGEFY-VALTSσKMJAimrrK
PKKGSLTYRI-K-»GrafYσRKYLVVεRIFAYΣΛirVACFMAFAHGS14DVANAIAPVAGVLR MΣJtvGlWπSTVISKGISSDESraπFRSLVSU:K3 AEHSS rYTOCI)SW.ICKASGAYTOP
QAYPASYTSYTLIRLMAF∞IGLVIGLAΣWTJWPΛ^εTVGCKITΕLTPSRGFSVGMGSALT KΣWENSTSSIEHEATTSKLREDOLLYLRSRGLSPEEAVSLVIHGFCRεi∑εOLPLεFAQ
IAΣ-UILGI,PISTTHV\nπAVI JΣGΣ-ARGΣRAINr-NIIKθrVI^WFiπ,PAG^LUILFF EASKLLLIKLENSVG
FALRALFH
CPn_0693 776292 775240
CPn_0681 765001 764258 TPR Repeats (O-Linked GlcNAc Transferase homolog)
CT691 hypochetical protein IJtSTNHVIΛEISMEEAAJ^LAKEFI^SGINLFI^GEYEQAEICRI-KεTLεi-aSTAALAYCY
NGIRSHKSFT/RSFT(θ iIAKIfAIΣ-4O LARIjra0SPFAPΣΛAHΣ--MvVSCvEYMLPIFTA LGIIALETGRVSEAUnCSKGIASEPGDSYUYCYGVAURGNOYEAAIEOTSAYVALHP
LRTXniYεEΣXEMAKLVSDKEYQλIX:iKNDMRNHLPAGLFMPISRAGILεiISIODSIΛDT DrjVΕCWSΣ/:SVYHRI- RWEAΣ-XFDKILAΣJ}P NP0SLYNKAv^L5εH∞Eλ--;iRU
AtiWAIΣXTIRRUOFYPSMεTLFFRFLEKNΣJ-AFEL'ΣwrLIΛεi^NOIi.SSF∞RKADKA EVAVAKNPLYWKAWWLGFLLSRSKRWDKATEAYERVVOLRPDLSDGKYNUπrYLTLDK
RLLVGRVAKSEHESDVXORELMQIFFSDDFIIPEKEFYLWLQVIRRTAGISDSSEKLAHR TPXALKAFOEALFLNAEI)AnAHFYVGI-AHI^LKC<«EAYEAFNSALSINI^HERAHYI-
INMTLEEK YIΛlMO ΪETDI ATraiXFLOKKDSTFAPΣ QKTVVSDPSSMOFERRLDTIS
CPn_0682 764912 765955 CPn_0694 779635 776330 dppD-ABC ATPase Dipeptide Transport pbp2-PBP2-transglycolase/transpeptidase
TSKGI-31CNSlJ«NNNLPKR5CKRI-4AShn?IL0IEDLSITLAK0R0OYPIVQSI^FTINEG FSDεSεAHNΣHSMKRPl CFPrYLSIAQKTW ,L-MIVIAFAVIAI ?L YLAVVEHE0KLE
O I-AIIGESG.raKSVSAIttllΛΣXrcPPFεVSGO ΛJFCCHNLLTASRSIQKKIIGTEISM EAYKPOIRVLPOYVEPATIOΛFGKTIAvTTOLOYIWSVAYGAΣRDLPTRAWRVDEHGHKQ
IFONPQASLNPVFTIEQQFREI IHTOIALTAEVAKEKMLYALEETGFHDPRLCLNLYPHQ LIP\/RKHYΣMCLSEtX^εiJUAR£AI-SAΣHA ASvTASVPYLVAANVSεRnLHUO^ SGGMΣXJRICIAhlALΣΛSPKLLIADEPTTAIJWSvOYQILQLX-CTLOKKTGMStillTHN SICD PGLHVEAVVPjafYPOESVASDILGYVGPISWEWRVTOεLSOLRECVRAYEEGED
MGVVAεTADDVLVLYAGPΛvTEAPAVO»σHNPSHPYTPJ3LXASRPSLQPQOLσSFNPIPG Pi PεGI-ASIDOvlWI ESVESNAYSIJ^ALVσKMCΛrtΛCWDSKIJlCKIGKKPΣLVDRRG^
QPPHYTAFPSGCRYHPRCSKILNRCSAEAPEΣYPVREGHKVRCWLYDD FIOEMEGAWEAP TTIO^LTI^AELOAYADAI I.EYEKTETFRSAKSLKKREKLPPLFPW
ΣKGX»IIAlJ3PNNGεiLAMASSPRYRNNDFWAKVAEDSl<AVRSSIYRWIfinCEHTAEΣY
CPn_0683 765936 766919 DRKVTLIRERRNPLTC∑rYEEILPLTFrX:FLDFLFPENSVIKΣΛLKRNSFVOTAIEVDNL dppF-ABC ATPase Dipept ide Transport VTRLLSLFPYEεGTCPCSAΣFDAVFPNEEGHΣLIOEVISLOEOKWIMεCLNOHKADIEεL
GVGC3frTNFPOPLI0ATSLTKHYYKRSF FOGKTΣASRPVDDVSFSLYSRRAvσL∑σESG KεALTXJVFNεLPAWYDKILYTDΣI-RLIVDPεRFSPVLPSεVHRI_5LSEFTEIΛCHYVVLR
SGKSTLAIALAGLLPLTSGFLTFNGTPIKLHSKHGRHOLRSOVRLVFONPOASLNPRKTΣ SAFSTILεDAF∑εv«FKSVmKSEFL0YLAAl«0EεAI-RKORYPTPYVDYLεEEKTROYKM
LDSLGHSIXYHKLVPKEKVLATTOEYLELVGLSEEYFYRYPHQLSGGQQORVSIARALLG FCθεHLDTFLAYLFSKTPYKΣvGLEPYYDILDLWINεLt»IGAHRALSWHεHYLFLKERVSH
7PQLIΣCDEIVSALDLSΣQAQΣLNMLAEL0KKLSLTYLFΣSHDLAVVRSFCTEVFΣMYKG LSεHLPALFSTFREFNεLORPLIΛKYPISΣ WIKROTEODLAASFYPVYGYGYLRPHAYG
OΣVEKGNTKRΣFSDPOHPYTPΛLLNAOLPETPDOROSKPΣFOEYHKDSEεSCSTGCYFYN QAATLGSΣFKLVSAYSVLSORΣLWGHNEEPANPLVI IDKNSFGYRSSKPHVσFFKΣXrrPI
RCPQKQEACKSEIΣPNOGDAHHTYRCΣH PTFFRGGSLPGNDFMGRGFIDLVSALEMSSNPYFSLLVGEGLGDPEDLADAASLFGFGEK
TGLGLPGEYAGRVPHDUVYNRSGLYATAΣGOHTL TPLOTAVMLASLVrβσVVYVPKLL
CPn_0684 768056 767181 IjGEWεGεHVSYLSSKKKRTΣFMPDAVVεVLKTGMRNVΣ GOYσTARA∑ SOFPPOLLSRΣ spoJ/parB-Chromosome Partitioning Protein IGCTSTAεSIMRVrjLDRεYGTMKMKDI FAAVGFSDODLSLPTΣVVΣVYLRLGεFGRDAA
EκSGDlVTεε∑SKDTΣ∑εVAΣDDIRVSPFOPRRVFSNεεLOεLIASΣKAVGLΣHPPWRE PMAVKMΣ DM εK ∑ ORESFLRG
ICTσDRVLYYELΣAGεRR RAMOLAGATTΣPVΣLKHVΣADGTAAεATLiεNΣORVNLNPI
EMAEAFKRLΣHVFGLTODKVAYKVσKKRSTVANYLRLLALSKTIOESLLOGQΣTLGHAKV CPn_0695 730201 781382
ILTLEDPΣLRεKLNε∑ΣIOεHLAVREAELIAKOLISEEGSSΣELKPTPLDMAESSKQHEε homologous to CT695
LCΛRt-3DLCGYICvOΣKTRσSKA'rVSFHLQOTQDLOKLεAWL3SHσTLSεSLS SLEVSMKKLLKSALLSAAFAGSVGSL0ALPVGNPSDP3LLΣDσTΣWEGAAGDPCDPCATW
CDA t SLRACFYGtΥVFDR Σ LKVDAPKTFSMGAKPTG3AAANYTTAVDRPNPAYNKHLHDA Pn_0685 76802b 763217 FWFTNAGFΣALNIITORFDVFCTLCΛSNGYΣRCNSTAFNLVσLFσVKσTTVNANELPNVSL
No robust homolog present in Genebank/EMBL as of 11/7/98 SNCnΛ/εLYTDTSF-WSVGARGALWεCGCATLGAεFOYAOSKPKVEεLNVICNVSOFSVNK
FPOSQYLLIFPNRI DL0AFEΣLDVOGMLTDQRKH∑OMLHKHNS∑εΣFLSNMWEVKLFF PKCYKCTVAFPLFTDA rVATATσTKSΛTΣNYHεW0VσA3LSYRLNSLVPYiσV0WSRATFD
KTLK ADNΣRIA0PKLr^AVLrlLTAWNPSLLGNATALSTTDSF3DFM0tVSC0INKFKSRKACσV
TVGATLVDADKWSLTAEARL ΣNεRAAl IVDGQFRF
CPn_0b-'b 7111703 7R25'>'1
CT-,9b hypot her j l
N'y;rrVRMPLLTY.'NFE IEVO.:Lr-';o-:ι 'KL'rtKDLM.':Λι';ΛIIF';ilOTRIr NPKMKLYIFEε KNTJLY t I LΛKT VLMIΛI.I H I UKV IODNKTVLrVi rrKKOAFI V I l'AAIEAGEFFIAE ι-Pn_')637 763501 7r, J214 R l ΛMLTNMTT tRN'; iKTI.IιK rrKDI ^INOΛYLTKKI-ΛΛLLΛKHH KLLKNIjnfRYMK
' Tin- hypor iiL-r lc.i l prot e in KΛf iLLWVOP:!YEK IAVΛI.\KKI/ i I I VI .ΛI.VI/rNCDl'IT IllllV I ll'NIJI)::i.K::tRLI IN
»κ tH N[.nιiΛYRFτr N(-r'-;rMcκ vnN ic V IFENI iFAKHK ur i-/ :ιvκ':ι.ιvι ι--:-υi) ..':iji.tϊJHFi-|i|,iΛKKκr»ii-ΛN
IIDOΛKPM-J I LL'i'RΛΛEVΛV'^l.MFLPSKSΛLSJLEOAYHl/ GCSMKPYΛ FtΛjCrY tllN I l-LPi;AYYAi ;tAYNNr. AU)LPIIP I KLLKEISCA ADOLYDVALSKJY LL TAN.'33PE i T-nJJt. » / 'i "..' / />ι 4 1 / ι-|TI.-:FI.TI.l.HV iπLKεLUIODV:-ςιDFAΛLK3SPLFHOFEPMY:;DGE TLSKRFGKKG r ι: t i.-n I ι< r ut -|-
• Iii .Hn It. )37r. /70117
T4, hy| ,ι hι-l ι> i l pror t' in
CPn_0699 784179 734721 rrf-Ribosome Releasing Factor CPn_0710 796482 79β210
TMSVX0DTεWOMAAALDFFHKEvTC3FRTσKAHPALVErvVvrjVYσiTMRLSDlASΣSVAD CTβbβ hypothet -cal protein LROLVΣSPYDCJTOASAIAKσ∑ΣAANΣΛIΛPEVEGSIΣRΣKVPEPTADYROEMIKOLRRKC RSRGEKs ATNKScτAFDFN»αjχrvcrΥwσvo^YLτεLr rsτr rr D ^ εεAKINVTtNIRRεANDIO-KKDSALTεDVVKσNεKKΣQELTDKFCK0LDεLτK0KEAEIAS QΣLSOYMESVSNILTAVNTEMITMARAVKGS
CPn_0700 785094 785609
CT676 hypothetical protein BHDRAOEΣ «SVODK
UWISPTHOCYHCWPATICYTEΣDKOKVIRSYVCATCPCPSHYYNNεHLSLSKGVσVLT
I^CGNCKTVWHSKQDDεOLLGCHOCYTNFKNOITSKLKSERWSSSFTMEKGOGSLHΣGR
APGEASNTNPLIj iAωεALQDTLERεDYεOAAVIRDQΣNHLKTKNPDDPS CPn_0712 799315 796781
FHA domain (homology to adenylate cyclase)
CPn_0701 785534 736672 MΛVRL1 VDEGPLSGVΣ FVLEDG IS S ΣGRDSS ANDΣ PΣ EDPKLGASOAΣ ΣNKTDGSYYIT karG-Argnine Kinase ra^Dr/TΣPIVVTXWAIOεTTOUCNEiπ'ILLGSNOYSFLSDεFDPQDLVYDFDIPEENFSND
KPKIOOTLPNDLΣ-CTL'/KRKεSroANKVWPVTTFSΣARNLSVSKFLPCLSKεθKLεlLOF S<5DLSDSNEC<;iωi^PRCrSεTr4HSPKPKEKLTKDOGSSDP ΣTSGDQELADAFIASAKAε
ΣTSHFNHiεGFGEFIVLPUαTrPLWJKEFLLEHFIiPYDLvTJNPEGEALWSRSGDF AA KNOPRAI V^CKGLKESSNESI^Pr^QNAKDSPKGEERTNKPQNAIMEONGASPRODPQPK
INFQDHLVLHGIDFQGNVΕKTLtΛLVQLDSYIΛSKLSFAFSSETOFLTTNPWfαϊrGLKS SAEPS^!CNTAROETPΣΛEr«PVEEKANXKATPDSPEKKDQPEEσSKKEGSKIEATPLDSQ
XFIΛΣPALLYSKEFTNXIDεEvΕIITSSLlXΛVTαFPGNr IΛNRCSLGLTEiαXLSS KESEDKEA£EAFVθεEEENLTEDNKEX)SDSAAI ArmDTASDHTAEDrøETPKICVENεKSA
LRΣTASKl^VAεVAAKKPiSEENSGDΣ ONLΣLRSIΛIiTHSCOI-ϊIJSErLDALSWIQLGI VX5PFHV0DIJRFIXπlFPAEIDDIAKICNISVDLTOPSRFU-K\ -W3ANIGAEFHIΛSGK
DLGLIKVTENHPLϊmPLF O∑RRAHIJUΛKOAEDSPΛWKOTΣSHUtASVLKELTKGLSP TYIΣj WDPTTCDrVFNDLSVSHOHAKITVσ^IXSGlLIEDLDSraKVrVEGRKIDKTSr^ εSF SNQ Λ A 3TTLr X.IDHHAPADTrVASLSPDDYSLFGRMDA£AI-ER0EAQεεEEK0rRA
TLPAKFIL'TΣ-FVrjCl ULFGIGτASU'HTKEVvT'LENItJYOEDLA^
CPn_0702 789700 786929 KTNSQI_FLIGMVT«STDKSEU.YI vOALSFVT«SVDDNVIDDεAVWOEMNILI^KRPEFKG yscC/gspD-Yop C/Gen Secretion Protein D ISMHSPEreKFIITGYVKTEEOAACLVDYΣΛIHFNYI_ΪIXεNKV\f\^T0MI CAIAGHIJ-Q
LKKNPvTOVILNIGRKΣΣ^XSΣKKKKKKIGILSGLFFLDLVTX TVSSORI'TETSANVKHNL GGFANIIWAFVIKEVTLTGYVNNDDAElCFRA OE S<:lPGVlU,vTOIFAvlI.PAEEGIID DEKtAACPKNSAASLSAKKSHTKKrTPGSIPSKVFSKFDATODKTFOKTSGSAFPAKPT LNIΛYPhmYPnπYJYSRYGEΣSIN V∞RΣLTRGDVIK3MTVrSIOPΪ«IFIiKI«LlCYK
TLKil£ERKKPRPEWTTADVKRSPPJ^ITO-WEEPWAASKiΩLDSIOV EEKQNYARR IDYNK
AVNAIhrLSIKKOLεεffrSTVTεKrΛ^PKTQATPHASraθWASPSTSMPGIEKAATTVAVP
ODKSEEEKvKEPiTKRεLTC^LKDNGYTVNFεDIS ILε∑XOFVSKΣSGTNFVFDSNDLQ CPn_0713 799817 799332
F^IvT^VSHDWSV^DLS IL Qv^JO>IHυLKVVEOαN VLIYPΛPHLSKΣJ3TVVro CT663 hypothetical protein
TCEAWVTRVFRLYSVSPSAAVNI IQPLLSHDAIVSASEATRHVI Σ SDI AGNVDKVSDLL LDIJSEI«AGFRNEIVSII<XrrirTrlAALENTSMLEiα∑KNFATrMGITSTt ΣΛUMAYV
AAI_XPGTSVr>rrEYEVIWANPAALVSYCQrΛrLσTΣA£BDAFO IFΣQPGTNKIFVVSSPR LPISEVVr^niArnNArXΕΣVLSASLGλLPPSAOTAKLYΣ^MMIGNLFGRεTOGSALGLDS
LANKA£QLIJ SIJlvTEI4Ai j3DPASTAIΛI«3TGrITSPKSLRFFMYiα^ εG WMVTUtfSCTτTYDDFVRHVrESFMNFSε∑VLSDLGLGKQ
LQDIGYULYVTTAMDEDFΣNTLNS IOWLEVNNSIVI IGNCGNVDRVΣGLLNGLDLPPKQV
YIEVLILOTSLεKSϊroFC!VOWVALGDEOSKv*YASCLIΛNTσ∑ATPT ATVPPσTP lPGS CPn_0714 801125 800091
ΣPLPTPGQLTGFSOMLNSSSAFGLGΣ IGNVI^HKGKSFLTTXWΣiSAΣXODGDTVrVLNP hemA-Glutamyl tRNA Reductase
RIMAQDTCCASFFVGCTVPYCTIΪ)TIIQEIGTVTQNIDYεDiσV n^VVrSTVM NYRIVTJ<Vl^rVVt;lSYKEAALKEREr IQYIΛSFE»IIJLAORFIΛKGGAFIPLLTCHRA
QΣEOTISEΣJiSASGSLTPVTDlCIΥAATRIΛIPIXSCFLVMSGHIRDKTTKVVSσVPLLNSI ELYYYSESPEIAOAAIJ SELTSOfllRPYRHRGLSCFTHLFQVTSGIDSLIFGETEIQGQV
PLIRG^FSRTIDQRQKPΛΣMMFIKPKVISSFεBGTRVT lKXGYRYNWεADEGSMO APRH KPAYLKGSKn^LPFDLHFLFQKA OXIKEYRSRIGFPDHOWIESVvTJEILLSYtlKSIY
APECQGPPSLOAESDFKIIEIEAQ TOFIJVrΛSDIrJRKVAAYXYQHσYHRΣTFCSRC<3 rrAPYRTLSRETωFROPYDVIFFOS
SESASOFSDΣ^CESΣJ^IPKRΣV DFNVPRTFLWCεTPTGFVYLDIDFISECVOICRLQCT
CPn_0703 791205 789685 KεtJWKAKIXLTCAAKKCWEΣYEKKSSHITORQISSPRIPSVLSY pkn5-S/T Protein Kinase
RKΣGFMIX:Rα3IPLPETOVIGGYHVXKΣLSKKIJ«RVVHGLHPεTRHSTVIKvTSPSPSF CPn_0715 801636 803462
TSRSVYϊn7LKεAQSLH0ITHPNIVKFHRYGKW0DCLYIAMEYIEσ∑SLRεYILAQFISLP gyrB-DNA Gyrase Subunit B
QAIDIΣFDIAQALεHLHSRNΣ lKDIKPεNΣLITPCflKΣKLΣDFGLADWDTEIORAHPSV KFNXISIΦIAAYTελSILSI^^LDHIRIJlλGMYΣGRΣAnsQKI^XIYTLFKEvvτMGIDE
∑σTPYYMSPεQRQσεSHSPASDIYALG AYELILGHLSLGRVFLSLVPεRISKILAKAL FIMGhOKSLKISASDKQISICr KRGIPI^KLIDCVSKINTGAKYTQI3VFHFSVGLfCVG
QPSPNNRYSSTRεFΣQDIHHYRMSGDMOEDLRIKDHTVALYEOWTQRF LAPεTLRFPD LKAWAI^EIFSVRSVTiiαCrCYHIΛTFHRCrvXOεSKCGSTKDPirjTFVSFTPDPSIFPEFT
FΣSGVXYHOGYPLYPHAYDTLI_εGDVFNLW 3YSPISlIATIAI^VVKSLVCOQDLORPLL FNHDFLKDKIROYTYIΛSGLEIPJNDEVFISHrøtΛDLFDAEITEPPLYSPLFFQNEDLT
DRVCεiNECLIRMKIPIDEMGISIirtilSKENKELSWΣACGKTVFWΣKROGRVVODFεS FIFSHUr-rrERYFSFvNTOETΣZXΪCTHLTAFKEAΣVKGΛ/NEFFG
FSPGIΛKΣTSΣΛIRETKVAWEIGDεAVVCTLεLEεSVASUCTLSLAELQDRRQKAIFCPΣ IAIKIASPIFESCTKNKLGNTO∑RSSLΣKtΛ/KεAIVOALRKDIO/APEtXI«EKIKFNEKrR εSIHGGIQSRQHGSNSPSTLISLKRIR KNIOFIKODU SKOKKVHYKIPKLRDCKFHYNDRSLYGEASSΣFLTECESASASΣLASRN
PLTOAWSLRGKP«Nv SLEETKMY»IDεLFYtATAIΛITONε∑OHIΛYNKVILA'rDADV
CPn_0704 792330 791209 DGMHIPΛLLITFFIJCTLLPL\ΕNNHLFILεTPLFKVRNKTTTLYYYSEOEK«OArΛQFGK fliN- Flagellar Motor Switch Domain/YscQ family KDSSLEITRFKGLGε∑SPKεFAAFΣGPEΣRLTPVTΣTSLESΣSSILOFYMGKNTKERKQF RYFMAVAADSSAS LKSRNNFLSSLGKTεεθVAAPεFPKELCOHKΣRεKFRLεDVQVSΣK IMDHLITDF FRGSITAVEA,reEFGVHLLIOPMVVOPWεVENLLFLTSεEDLθεLMVAVFDDASLASYFv EIΦKUΛFHYYFVAEACKLFEεUJWPSLSAjrΛΛKDAIFTATSLCflSFOVVυiSLRLOσK CPn_0716 803466 804902 NVRCRIXiPεDTFQSCOKFFSGIΛDεSDLHNirXyTC<JISLSVεVGYSOLTθεEWHOVVPG gyrA-DNA Gyrase Subunit A SFΣMLDSCLYDPεTEESαALLTVOKHOFFGGRFLTPSSGEFKITSYPNLTHEDPPLPENP FMRWSεLFRTHFMHYASYVILεr^AΣPHΣΣJXLKPVORRLL TLFLMDDGKMHKVANIAG ASAAPLPGYSRLWEVARYSLAVSEFIKLNLGSILSLGNHPAYGVDIILDGAKVGRGEI R'TMAIJIPHσDAPIVεALVVLANKGYL∑rjrOσNFσNPLTGDPHAAARYiεARLSPLARETL ALGDVLGtRVLEV FNTDLΣAFHDSYrXiRεKEPDΣLPAKLPvT,LLHGVDGΣAVCMTTKΣFPHNFAELLKAO∑AΣ
LNDKKFTVFPDFPSGGLMDPSεYODGLGSITLRASΣDIINDKTLWKOΣCPOSTTETLIR
CPn_07()5 793176 792334 SIEHAAKRσTΣKΣDT∑0DFSτDVPHIEΣKLPKGSRAKEMLPLLFεHTECOVΣLYSKPTVΣ
CT671 hypothetical protein YεNKPVεCSΣSε∑LkLHTTALOGYLεKεLLLLOEQLTLDH /HKTLEYIFIKHKLYDSVRε mELKKTAESL SAKTDNH'TVYQNSPEPRDSRDVKVFSLEGKOTROEKTTSSKGNTRTES VLAΣNKKISADDLHOAVLHALEPVΛHELATPVTKODTSOLASLTIKKILCFNEEACTKEL
RKFADEEKRVDDεiAεVGSKεεεOεSOEFCLAεNAFAGMSLIDIAAAGSAεAWEVAPIA I \ΣEFKOAAΣOKDI^RtKε^TVK/LKGLLεRHGHLGεPKTOITNFKTAKTSΣ CθA7TLΣ
/SS∑rTIO iεNΣΣLSTVESMVΣSEΣNGεOLVεLVLDASSSVPEAFVGANLTLVOSGODLS
VKFSSFVDATOMAεAADLVTNNPSOLSSLVSALKGHQLTLKεFSV GNLLVOLPK Σ εEVOT CPn_07 17 804163 30530b
PLHMIASTI RHREEKDORDQNQKQKQDDKεODSYK I EεARL CT656 hypor hec .ca l pt otem
IP ΣFFΣDTΣTΣVVlWεPRH I\ IRKPETPKAPDVEKPrrVPεYMTMANTrTFεGPVKτLDQL RRALΣEORCAEEC kMYDNF∑OSΣLISTFGLVHKDMDRAOKASKRMRSVYKEQ
CT670 hypothetical protein
/AVAr/PLEPVLAIKKDRVDRAEKWKεKRRLLεiεQεKLREKEAεRDKVKNHYMOKΣQO CPnJ) lx 30*300 30562b
-RDLLDεGTTCDAVLOIKSYΣKWAVQLSEEEEKVNKOKEWLAAiKELEKAEVNLAKRR |"Tf c7 hyporhct ici l ρrot*» ιn
KCEEKTRLHKEEWMKεALKEEARAεεKEODEMG0LLFQLROKKKRεSGGS RAΛ4'-rTYFLALPVrRLM0εRFLCGPKRWAPFINSPL LTLIADHDTP\ LAKNLDKFPLP VEOWEKTVt IW-LLΛjIFL--D .ot RLLΛ TKFEI LTUIDLΛ ΛON l f rit ( i it l nn i l i no ynr h i ι ) rOIF hKV IK I MkTVT FTVf YIWV RI UK /LTε HIT Y /„RΛFYOI II (L.:r,LVOINOQ i rrrr '\~ intot kFri i t i rr/Fr/ M i ι v iNM rMWHPΛPG ιιn» rt i i i i jam ' 1 1 t rΛκ ι noΛ κvr :εLFS hi i ' i ru iiivii I P ' i i r/s i i i ι*ιι< i i ι i _,
11 _ / m / 1 / 'SO J4
< ,v I F I vi : : L Γ'JC-.NGFFΛLN:;:;ED t PNFNPKΛ IOF /H
:PnJ)7)'l HJ43/2 -U33bι
T TbH hypornoci ri . protein
:K7LFKLM3Y,LRNKKTKtC-«Υ::∑ALσ∑L3FR3ΣPCε'/YDKIR33FV3LHVKFFPKIKQ AP33IILA,LεLENL,.'LKER7A3LEεKLKLYεV3NHTPPLFPε∑LTPYFHKLVεGKWYRD
-.Dr 3WWrKH-LRELtPOVE0r3HAYILEKDKYEKΣSOLθεLD3LrθGεσεN0ALLRGIL .-,;*, . ■ • :rτι.:.ι- — iL--"/r-t'.' /,„-',— --.T/E-C -r . -■ • ' ^gy-^s
I." -. .. „F:.'«L: .... :..: ir. I :''..- '-τ:"'."i>
MFRCILFGIFLLTCF SGG'. .'YLrCJHDFJIGPKEKSRSVVIEEEKiFTOSVLHHLPSO
CPnJ0740 836054 334864 HOHI-HIIΛFCCFLLOKMK.-SOAεKΣFSIΛ-YOEAODGPFLFKεEIΣΛSRLINSFFLEXttI cyrB-Aromatic AA Aminotransferase VMETII^LLNORCPNSPYYHLFKALVCYKOILYREΛ'∑εOLAYΪ^εεKTRALAPLLNIS∑ε
CYMSFFNHIPTFSPDAΣLGLONVFFADKRPεKVNLViσVΥEHPOKRYGGLSCΣRKAOTVΣ O τDFLLDYISAHS IεOKMFPεGRVIL^nW ^mL KHεCεW AKTYD IAI S SYF
LEεεONKTfLPΣSGLQΣFΣΛIMRεLVFGAVDPSAΣVGFOSΣΛπGALHLσARLLSVAKGS LELVεSKSADΣYFDYYEMVLFYLKKΣYILEOCPYAELLPEEELVSLIMεHVFΣLPKDKLY σKVWPεOT SNHΣRΣFSOεGLεVΣRYPYYSKEQKOLLFEPLΣAFLKEVEKNSVILLHGC PL∑OLLEMWOIWYVHPNSSL OΣLV'DRFSTHMEGAΣRFCEALVSFSGLEELHQQII'TTF
CHNPTσVDFTEDMWKELAILMKERtαiPFFDTAYOGFAHσ∑ELDPCPΣEIFISεσNTVLV EεlXΛrKV∞ΣiWεEAKOC'/ALLHΣLDPSISΣSεKIALSSDTLONΣVSGODEOHTKLRNY
AASSSKNFALYGεRVG'/FAVHSTFTDεLVKΣHSFLεεKIRGεYSSPORWσVEIVSTILSN LDL EAI0SYD DRC3LVHHLVYσAKDL KKσCΛTEKA ^rL ς \'LRFTSYDIKESVVF
PYLKεε OSεLNFΣRESLGKMRTRFVOALRKVAGHTFDFLLSQHGFFAYPGFSDKQVLFL LFIKQAYKOALSSHAIARLLKI-ΣKFISεANIPSrVISεAεKANFLADAEYLFAHEDYDKC
REOHAVYTTAGGRMNLNG∑τεKNIDHWOSFIQAYεL YLYSMWLTKVAPSPQSYRIΛGLCLMENKRYDEAΣJ^I^MLSPNDS∑roYKTOKALAFCQK HOSKDRAAS
CPn_0741 838383 336185 greA-Transcription Elongation Factor CPn_0752 848595 850082
EΥIFRLKIODI'CYLEK O^ΣEEGQSANFLSLWEEYCFNDVVRσRεLVεiLεKVKSSSL "recD-Exodeoxynbonuclease V, Alpha"
ASLFGKΣVDT'Λ/PL εKΣPεGKDKDRVLQLILDLOTSNSQMFFDΣATEYVNKKYSGεεNF GWALHTEFAPFLεDLVHQOVΣSPLDIAFASKHΣSSDFεESFVFLAVSSAL RYGHPFLSL
NEALRVVCLRDGRDFOFSLSRFDFLMHMHKGNrFHCWSWGVGεVMGVSFLQOKVLIEFε εENRIRPSΣ/WISETOLYRGFHNLPKEARDKLFVVVSGRLYLRSLYTΣRSKLLDKLSLLC
GIMSAKDISFETAFKSLTPLSGDHFLSRRFGDPDGFεAFAKENPIEWεiLLRDLGPKTA SATPNYFPPSIDSSII^EεθNFIFNKITC<XFSIVSGσPGTGKTFLAA0LILSLVKQOPK
Kε∑KDELVOLVIPεArΛ«RV*«SAKTKIKKGTRIΣSPDNPKEPYvLSDAGCSHMGQLERK LRΣAΣVSPTGKATSHΣROΣΣ 0YNΣFDr»tVΣΛCTVHHFLθεYAYRRYNSΣtJVlXVDEGSM
IΛLSIΛSAEKISLIYHFIRDωsε CNiεiRKSLVKALODLDVEEGNKSLILORεLLLSε VTFDΣXYSLvTjr∑/yTYεKDKKLYTSSLΣIΣΛrΛTIOLPPIG∑σVGNPLCDLIGYFHENTFF
YLGIKDASIDKεYITSΣ^εDOTSRLLENMPΣVALOKSFLSLVRKYSSFWQOVFMOILLYT Σ_WSHRACTGVVDOLT0SVLRGEMISFSPLPS ISSAlεVLKNRFVKSLROSEARLCVLTP
TSPTMPΛFVΥKTIK JPSSVEVLKKRΣiDSAHOPMMFPELFVWFFLKLGNHEDGLFDPεD MRW5P 3VLNI24TMIH0RLARSDPDLRIPIMVTSRYεTWLF)GDTGLI£UTQKIJJFPQ
KIvVIJtLFLESALNFMYOVASTPHKEΣΛKKIΛlfYLVGORYΣAVRO4ΣEGASLPFΣjαXLLL HεPIDSRALSσYVYNYvΗSVΗKSGGSεYDEVIVΣΣPKGSεVFGVSΣLYTAΣTRAKYRVSV
STKCTOFSSSDI-^VIΛSLAEVVOPTLKKHISNVΕ_giroLWSTSESFSPJ«AKLQSLVτjKE GDPETLHKIIKKSNY
MVDNAKEΣI^ARSLGDIJtENSEYKFALεKRARΣΛεEΣRvT.SεεiNRARILTKDLv TDKV
GVGCKVTΣJCGDAGEVVEYTILGP DADPDSCILSΣΛSKLAONMLσKKΣjrovVILQGKEYK CPn_0753 851009 850161
ΣSRΣQSI εεHGA No robust homolog present in Genebank/EMBL as of 11/7/98
IMATAHΣΛROALI-HΣΛSOTPAIPASGrLFRC<3SMSLHNNVLFAGDIVGAIKNSTAΣSRHA
CPn_0742 938442 S38888 SSSHYAHAAIΛKT∞FLGAAtX^WrAVAGAMIΛTOLL∞SMIFεTDEETGEΣJUWaiEAD
CT635 hypothetical procein AECJCMTOKLQRRSALTI'rciCVARΣJ^KTΣΛTATFLHεMIJ SLGANANKIGCKVTSCXΛL
TKMMVIVMNSKSAQKΣIDS:KGILTΣYNIDFDPSFGSSLSSDSDADYεYLITKTOεKIθε VATGCSLTεSSISLYRIt^TRPETΣSDPENRNKPSAεFAARSKAΣRNAFIAlOiGDVVDLV
LDKPAQεiLTCTGMSKεOMεVFANNPrjNFSPεEHLALEKVRSSCDEYRKETENLINEITL CDALGTLSLFLPAILGVHAVLIMAILGLISCVINFVKDYAKIG
DIΛPTKεSKRPKOKLSSTKKNKKKN IPL
CPn_0754 851381 851040
CPn_0743 838956 840362 rs20-S20 Ribosomal Protein
"ngrA-Ubiquinone Oxidoreduccase, Alpha" OFILNLKVLVΣ^GDΣMAPKKPNKKNVIORRPSAΕKRILTAOKP LINHSFICSIFVKTRVKK
IFMKI'ΣΛWGΣΛΣ^Σ^SPKεSGFYNKIDPεFVSID ^FQPLSLKLKVEQGDAVCSGAP FEASΣJ-C-JΓ/TOATLSNIΛSVYSVVDI AVKRGΣFKΓΛI^AARΣKSKATUVNARAS
ΣAEYKHFPNTYΣTSHVSσV TAΣRJlσ^rlC SIZ»IIKKTPGPTSTεY YD OT SRSDI^ ε∑FKENGLFALΣKQRPFDIPAΣPTOTPRIWFINΣJUJNRPFTPSPEKHLALFSSREEGFYV CPn_0755 351579 352799
FVVGVRAIAIOJGIJlPHIVFRDRL'ΣtPTQi jCTIAHLHTVSGPFPSσSPSIHIHSVAPIT CT616 hypothetical protein
NεKEVVFTLSFOD^TIGHLFLKGRIIΛεθ\π'AIAGTAIJt^SLRRYVITTKσASFSSLIN YKDLFFMΣiVRKWΣOTCFKYWIYFLPVVTLΣX.PLVCYPFΣJISQKIYGYFWTriSSLGW
I-NDISDrJΣTTLISGDPLTGRI^lCKEεEPFΣΛFRDHSΣSVIJINPTKRELFSFLRIGFNKPTF FFAIΛRRENO TAAVOLLOTKIRKLTENNεGLRQIRεSLKεHOOεSAOLQIOSOKtltNS
TKTYΣΛGFFKKKRTY NPimJΣJMETRPIΣ∑rrDΣYDKVMPMRIP PLΣKAVI'rKNFDLA LFHI^IXVKTKGEGOKISTIXiHRTEENRCLKMOVOSLIOECGEICTEEVO UJRELAET
NεLGFLεVCGEDFALPTLIDPSKTεMLTrVKεSLIEYAKESσiLTPHQD LAY∞AIJTOEYOATFSEORNMΣ-OKROIYIGKLENKVQDLMYEIRNLLOLESDIAEMIPSO
ESNAVTGNISIΛI^SEUα IAFKAENΣEAASSLTASRYLirrOTSVHWfSIiCROIJOS R
CPn_0744 341387 840389 EENΣΛMΣJVYARQSORAVFANAIJTTWrGYCAεDFLKFσSDIVΣS∞KOVIMEDLHSSREE hemB-Porphobilinogen Synthase CSGRLVIKTKSRGHLPFRYCΣΛAIΛKGPLCYHVLGVLYPLHKEVLOS
EMSSLTLSRRPRRNRKTAAΣRDLI _CTHLSPI<DLIAPFFVKYσNNIKEεiPSLPGVFR S
LDLΣyE∑εRIΛTYGLRAVMLFPIΣPDDLKDAYGSYSSNPKNΣLCHSIHε∑KNAFPHLCL CPn_0756 852889 854676
ISDIALDPYTIΗGHIXJIFLIKεVLNDεSVRIFGNIATLHAEMGADIVAPSDMMDGRIGYI rpoD-RNA Polymerase Sigma-66
RSiααj-^rYSKTSIMSYSvKYASCLYSPFRDALSSHVTSGDKKOYOMNPKNVLεALLεSS ISrY PLTT SSItAPΛPLVLFOVRKI.FMNT0NSQATEVSSEεεS0KKLεεLVALAKE0GFI
LDEEEGADILMVKPAGLYLDVΣYRΣRONTCLPLMYQVSGEYAMΣLSAFC<3GWLDKETLF TYEEINEILPMSFDTPεO∑rλTVLIFL'ITJMD∑OvTΛOΣDVEROKEICKKEAKELEGlARRTE
HESLIAΣKRAGADMIISYSAPFILεLLHQGFεF GTPDDPVRMYLKEMGTVPLLTREEEVEISKRIEKAOVOiεRIILRFRYSAKEAISIAHYL
ISG3ERFDKΣΣSEKEVεDKTHFI IXPi iTLLKEEtnΥLεNLIiSLKOPDLSKOEAAKL
CPn_0745 841903 841742 ^rDS EKCRIRTOAYLRCFHCRHNVTEDFσε^nΛFKAYDSFLHLεQO∑NDLKV AER KFAA
No robust homolog present in Genebank/EMBL as of 11/7/98 AKIAAAICRKLYKRεVAAGRTLεεFKKDVTOtLOR MDKSOεAKKEMVεSNLRLVIS I AKKY
VDSCFDD RASSLOσSTTYNVAYDPKHTLAYGFCNQVSVKKFHLKPPKSOεKFL TNRGLSFIΛLΣOεGNMGLMKAVΕKFεYRRCYKFSTYATWWIROAVTRAIADQARTΣRΣPV
HM∑εTINKVLRGAKKLMMETGKEPTPEεLAEELGLTPDRVREΣYKΣAOHPΣSLOAEVGEG
CPn_0746 841939 843567 SεSSFGDFLεDTAVΕSPAEATGYSMLIOJKMKEvXKTLTDRERFVLIHRFGLLDGKPKTLE
CT632 hypothetical protein EVGSAFWTRERΣRQΣEA ALRrørPΛPΣRSKOLΛAFLDLLεεEKTGTSKVKSLKSK rSGRCPFSF^ MLGKEEErrCKOKC^LSHFVTrLTSDVFALKNLPEVVKGALFSKYSRS ιLGLRALLLKEFLSNεεDG7JVCDεAYDFεTDVQKAADFY0RVLDNFGDDSVσεLGGAHLA CPnJJ757 354709 855134
MENVSILAAI<VLEDARIGGSPLεKSTRYVYFDOKVRGEYLYYRDPILMTSAFKDMFLσTC folX-Dihydroneopterin Aldolase
DFLFDTYSALIPQVRAYFEKLYPKDSKTPASAYATSLRAKVLDCIRGLLPAATLTNLGFF PCΣKNΣALVΣA∑εRYOLIΣSKFRM LFLGCSVEERHFKOPVLΣSVTFSYNEVPSACLSDK
GNGRFWCNLIHKLOGHNLAELRRLGDεSLTεLMKVΣPSFVSRAEPHHHHHOAMMQYRRAL LSDACCYLεVTSL∑εε∑ANTKPYALlεHLANεLFDSLVΣSFGDKASKΣDLεVεKεRPPVP
KEOLKGLAEOATFSEεMSSSPSVOLVYσDPDGIYKVAAGFLFPYSNRSLTDLΣDYCKKMP NLLNPΣKFTΣSKεLCPSPVLSA
HEDLVQΣLESSVSARεNRRHKSPRGLεCVεFσrDΣ ADFGAYRDLORHRTLTOEROLLST
HHGYNFPVELLDTPMEKSYREAMERANETYNεiVOEFPεεAOYMVPMAYNIR FFHVNAR CPnJ3758 55104 356459
ALOWICELRSOPQGHONY'RTIATGLVREWKFNPM-.'ΕLFFKFVDYSDΣDLGRLNOEMRKE folP/dhpS-Dihydropteroate Synthase
PTT RAMSεPRFVCLSLGSNLGNRFKNLOΣAPTLLGEOAVLGLRSSVΣLETEALLLPGSPPεWD
LPYFNSVLVGεTTL3LRεLLvTIK0iεKWCRAεESPP SPRTΣDVDILLYCDESFCCDH Pn_0747 S43949 844053 TEΣTΣPLSNLLSRPFLIALΣASLCPYRRFCTOCSPYHNFTFGELAHHLPSPPGMΣRRSLS i.'Thll hypothecical procein prjTMLMGWNVTND3MSDCGMFLDPεKAVA0AEKLFTεCAAVΣDFG 0ATNPKVK0FLSV RTCMGCKGAEVQ I LSSRSLSCMKΣ LS3SLFYKKFC DθεwεRLεPVLRLLKεT SNRKOYPΣ ΣSLDTFYPε∑ ΣLRAMDΣYPΣO ΣNDVSGGSOSMA εVARDCεLSLVMNHS33LPVOPKNΣLSF3VPΣGε0LLS GεK0LKMFSDVGLNAN0VIFD
"PnJ)74R 344-196 844121 PGΣGFσKGAA0SLATLYε∑.\KFKRLCCPΣLICHSRKSFL3LFGNHDPKDRDWεTVGLSΣL
LS A- f jny 1 Transcraπsferase LOQOGVDYLRVHNVAAHOKALSVAACεACAP I TLVUIALDTYPP^IESAIEKALεCFGPΣGHPIRSPVEYALOGGCKRLRPGLVCMMAOCL 'JLNIIDVMDSΛLΛVCFVHT TLΪΛDDLPCMDNDDERPGRPTVHKAFDEATALLΛSYALΣPA Pn_0759 'i'>o4 (4 H56«97 ΛY.'lHLRLNAKKLKECGCDPRε∑DIAYNIIGDΣTDKIIIGCSGVLGCOYDDMFFSNRGOEHV folA-Dlhydrot l.irt. Red ^r a*:*; O.IMIKKTC.SLFEIA I.-iGWLFGGGDrOFAPIITCF-NNFGLLrOIKDDFSDLOKDSOOI LLVKPVIIPSNFENr-L/JVαiCKMP' prjtVΛ DI'nGVKILCCKLrWIIYrεDLOFFnET∑OK
.NYALL.rcnA,\LCLLAR NNrLCLLDRL3AGGLKH33EFETr tS3LG3F FP I'MGPKTWETt.PPKYF ,PRΛ'-/VF':ilFKRι.x ;VHr;E IWVTL-LrCFLLI.DL-;:iPTFLirK '■)ELY3LFLEWtVPLrFI.-ll!F εYΛ';iyrFFPI..:LL_TWTKTVI,πP, 'KrTT(ΥYENIIH.<; l'li_U74"' i.l'.blH R450l)i. UWΓKHIGL
■ limn U I' . tHΛ.. Ivr π|.ιho'-phory las.'
VCYMTYIΛ:-. ιr-:ι Γ.ΠΠ.YI EI I:-KΛIIY,I- DII.DLML/JMLΓ.NIIVF- ;t;[iι<τrvc;cΛ/TLKN 1 PnJ)7f.() >'■• .'.- --./..11
IEKIEIAE AYVΠ J ΛYI 'ir t .^υTLVRIIGX.- P^mTl -RrWim Tεi NSYLC '-ri.il hyporni r I' il i-ri.r.-u.
IIIITKΛAIIΓΛY .C, vi-- 'i vNU;Ar,vπι7u\iFRi.tx;r-NtYvr<:;τ IDK^KKIDT^PRKLGAF RHGPKi.CLEiPKr'-.oi'VTMκιrr/κτι-FiγrYϋDLY.':ιι.c:,-.':ι.Pκι.NFr'-:ιw!τ.':κιv.ι
I I IKI .VΛ Ii - NW I Nil i ll I t.PHTR I ri lOV I I/.-IXIΛWELΓKV.'IKL LIKOLΛLAΛ Γ/ΓKVIIYLTI'KWGILIΓ :,V.FU:::NVΠ<:YFVLY
I'PLFLL VNTLHLW HNFMII.r.ll".! I I :|, irrrr-I.KPHTMCU. 'WNl.πTI.YNYVliKP
I 11 Jl/'. -Ii. It.'. ..Γ./II- D/~FI;RΛLKMTY:;N LD<;L.;ΛΛΛ7I.I w.i';rιαlrι-iΛiιcωrκtτπι-;:r'π-u.ιiJM:rr-LΛ IAF:UCDLYI,PLLO.';MA ETPAI -I NLRKREOTLLQVMETLLPKC -LGK t PAPYrLG I KCLAEDL F1IE3T I FI A I ENKAVA
-rnJl/M "57.. IH 358375 AP IG IFPLKHLFPRG IHODόόHSKENVLOW I ROW I ATECTFLGCTV I3DR ITAKG I PI*AR
-Ti To nyι ιrn'.t ic il prorein RTVΛKYRAOLKΣtRAIirPRβLFYirCSNSpFRDRCTi ">• —
"IIMTTWtELLDKOI EDOHMLKH EFYORWSEGKLEKOOICAYA DYYLH I KAFPCYLSALH
ΛRCDDLO I R J I ENLMDEEACNPNH IDLWROFALSLGVSεεELANHEFSOAAO-MVATF CPn_0772 872400 3"04b9
RRL DMr'JI-AV"JIΛALYT/ε∑3IP;VC/εKΣRαLKεYFσVSARσYAYFTVH0εADΣKHAS uvrD-DNA Helicase ϊEKEMLOτLV^RrNPDΛVLOGSOEVLDTL NFLSSFΣNSTEPCSCK KLGLΣMTCΣSEUIEAςRKAVTAPLNPVLVLAGAGAGKTRVV-rΥRΣLHLINOGΣAPREILA VTFTNKAARεLKεPr.TIC ASTNEFDVPMvr^FIIS-CΛFILRRSΣNLLNRENNFTIYDOS
, v,-\ .' ■" ~ .. i VJ" * .- .. - . "ll'ΛO"— .-»C .-K;u=
-ELLYFSRFSεKQNYSLGNMεTKRSΣYMNLPDRKKALEAAVAY∑εKOFGACSΣMSLCRHS ^FAV DPTOSI/ΛήOANiHtULtlFENDYPNAΛVI^^EEKϊliSYGNILNAANALIKNNA ATHEΣ ST I KTGALSLDLALG IHGVPKGRVΣ EI FGPESSGKTTLATH Σ VANAQKMGGVAAY SRLεKELRSVKGPσεKIRLFLiGSτDRεεADFVAAEILOLHRV'GNΣKLRDΣCIFYRTNSQS IDAEHALDPSYASLIGVNIDDLMΣSQPDCGEDALSIAELLARSσAVDVIVIDSVAALVPK RTFEtJAIXΛRRIPYε∑ΣGGLSFYKRKε∑ODΣLAFLRΣFΣSKSDΣVAFDRTVNLPKRGΣGS 3εLECDIGDVH\Λ3IΛARMMS0ALRi TATLSRSσTCAVFIN0ΣREKIC/SFOTPETTTGG πlFALTOYAΣAOσLPΣLKACMALDTKDVKΣ^KKQOECLOεYLALFPQIEIWYNTLSLR DFIESVVRΣTOTIJεiLKEDAITrFKDRKSNLεεLYHKALεSεCONPirTHLELFLDDLAIJG -CILDLAVεYNIΣEKKGSVreT^QEWCΣΛCGRErRεεl WNRKLFεEIEKRIYDVIAANK SD∞Λ TADRV M ΣJ^iσκσ EF^^VSF VC EEO PHANSIΛrr EIJIεEE R CYV GΣTRAQDLLYLTAAQVRSLWTrVRMMKPSRFLKEIPKOYMIQVR
3Pπ_0763 860520 359972 CPn_0773 872485 873195 ygfA-Formyltetrahydrofolate Cycloligase ung-Uracil DNA Glycosylase
NFPMTDPK∑εKSALRKLFΣSΣRRDLSεεRKHEASSAVASFVRSFSKESWLSFVSFNHEΣ FMOIATIDOLPVSϊ εθLPI WεθLKεEWSKPYMQOU.IFLKOEYKEHTVYPεENCVFS
CWOEANRΣLΣOKCTIΛLPKIDQENLYPVLΣPSIDDLISVVHPKDPFSKCTPISSDKITHV ALRSTPFtXJVRWILGODPYPGKGOAHGLSFSVPεσQRLPPSLINΣFRεLKTDLGIENHK
LVPGLAFtJQQGYRLGYGHGFYDRWLAOHPYPSIRTIGIGYCEOKΣDRLPOESHDIPLSQI GCIΛSWANC<3ΣI- -^rVLTvTWσEPFSHAσKG«EI.FTDAΣVTKLI0εRTHIIFVΣ*IGAAA
YLC RKKCEU-rNSKHOHAVLSSPHPSPLAAHRGFFGCSHFSKINYLLNKLNKPMIN KLP
ΛKT tRP-vΛT0POK0AKC3PPOENV KΛLOKP t PI- .TEPPKPSPAPTVAKKTTATEKP :,. PPirr^KKNT0t-"KTQL0TL3EVA0AL3LHVDKtEK3ET3LKNISWP3TA0LTMHSELKAT TFlSQHLRP OL CHFNnwr/OKWBT LGriGLAGr: ".L." ," /AL.,-AR .LFLAYASSD QEDELCELFRTII IALPSKCYVR IKLVLSPNCE∑OECSFLSεVSAADKOLLTORΣOALPFO v-:«*i5κσrΛ05rwτrcα_EAW LPLETHOALOP«PLtrL-r^EDr i FπLr:εEL DP .'IPtJΛ fc:CSt'TM«ϊ^ KFLEKYKV3KNt3FHIKL7-Nε3 ETEHSADGTLTΣL3F7
protein
975757 ATP Synthase
ΣHLGAYTPGODEELOKAV 977597 977055 hypothetical protein
VFLVTTP SPGSLSOSHLPHPHDPWirrEPTSLPEDPrøKASOEIΛSLWLFRKLSIHLLS
9--963 977608 M-Ring Protein
ptotein I ! LYH FVI DALDTAVεOCLεi P LEDGSLPLONSPMNLDFEDAN I\ PELOVKVDESGLNLGHP
CfSrfrjO . • HDH I I "ivoyx, i
F.IIMALL ILLPHC0SVWNEKNLF3CWVDIPL300C .FSAGRAΣ0NLPIDCIFT3TLVR HLLIOGSROSYMCLGOILP:;. ■KrFKCF'TrΛIIKυL rFLN. K-F-NTLR r E AΣ L
;LMTALLΛMTHHHSKK IPY Σ VIIEDPKAKεMSR I YSAεεεNNMΣ PLYς33ALNεRMYGεL0 RHVGC3AI<A'^rTFKPYπD3CTO3r/,\KALH'/LRTFPE ."r: -YARLJFECgEV,LL3LRRL -KNKKO AE0FGεεRVKLWRR3YKTAPPW3ESL'rOTK0RTLPYFεKNΣLP0L0NGKNVFV GNYDSL ^LTF/p^Λgr^'Λlιm JWGlIlA;t∑;sLY ^ SAHGNSLRSLIMDLεKI^εεEvX3LELPTGKPVVY0WKNHK∑εKHPEFFG QOHAT Σ EEΛF3P YFTYWWRLOFEOTSRTβMTL'.- EATBHSL J FSEASTLAWSrWW.'PSD εAENLVNSrtT/l?3εHIPLTFRCLP3LVAGL3VATHC3T'/3PENRLRCLYSTMLSU.VKS LRSHREMLNK0LLP CTVLDFSETTL3SGGLDVFAE3IAVRIHLNCAV3INL
CPn_0R75 194022 r.K i Ki'V i V '. - :.. '. • '. . ' : .... ' :: '■■ ' ?: . ". ••.::.:':7:rt- -:.v-
RL3RR≤RRLFA P.IΛ 'KL'r ;V 'ANFkTYAEK: EvCERCLJbV/SSAAEΛij:jLAi-3 OGEIKDALYRIREVHPIΛL∑εALAENPAL∑εGMKKMOGRDWIWNLFLTOLSEVFSOAWSC
CPn_0865 982412 982942 GVISEEO Σ AAFASTLCLDSGTVAS Σ VQGERWPELVDIVIT
CT865 hypothetical protein
SPMGYVFYVIAGSΣFLGISLGAYCOLYYSVKSVXFSWYIXTVΥALεKRHALLALSOLVGE CPn_0876 994123 995517 EDAOSOKEIDFLSOCDIOSWRAFLKNSYEIIPTFKEMEDLLSεRVOGFLεSiεTIAEHDR dagA-D-Alamne/Glycine Permease AILCIENFWASKNLFDFEΣAAYεEAVEKYΣ LRORAPLRIASKLFRFLDVPSIRFSS SIA«rETMLYFIEOLNKΣ^TSFCVFPMILI-LGGFLTWKI «LOFHGLKLGFNLMΣΛNi i^
DSSSKANEVSSYεAVAGILAGNFGTCTIAG4«VALA XiPGALVWVWLAALLGAIVOYAG
CPn_0866 983494 982916 SYLGSKYP CPEGNTGεFIGGPIACIAFGMRKKILAGFFALFTIMTAFCAGWTvTJVSCIVP birA-Biotin Synthetase LCAEGTP aO--VGILLALWIPVT-AGGNNRILRFSARVIPFIAGFYCISCGIILFOHASA
NMKVIYYεiεεlPSTNTMAKSYMHLWDPYALWISTKCO AGTGKFGKSWKSSKGDLLNT ΣLPAIICLICSSAFGΣKAGΣJCIGGYrτ^OVISTσiNRAVMAτiX^SGMVSILOANτKSKN
FCFFITDLHIt)VSRLFRI TrEA\Λ AI :KOΣΛITEAKIKWPNDVLVHGEKLCGVLPεTLPV PVVTXaVTLVPPVrVMVVCSI'niLVLΣVSGAYSSσACCTΣΛVMSAFKNSLGSLGSVrVIL
EGlAC ΛΛΛIGΣJ«NTTK0A CDvTJQPATSΣΛEILσHPIDLεTTRεLLIHHI 3vTΛENL AMAIJFYJYTTILTWFACAεiCSUJYMireRJIANLWLKAIYVLIIPLGσVlrΛIRMIWAIΛDTG
PDSLATKSNRGNI FSC4WIΣJKIALIAIJjrflvT^τt4RI)VALLKERεCSVADPVRNLDA
CPn_0867 983405 984667 CPn_0877 995521 995982 rodA-Rod Shape Protein ybcL family
CIRIPOMHIGFCHCVRGGNFFYFVINNFHILEIYSLLNSNTIMRYHKYFRYVNSWVFLVV RRRIMO ^PAFAYGAPIPrarYTCCCAGΣSPPLTFVrΛ/PGAAOSLALIVEDPDVPKEIRS
LTLMtXSVWISSMDPTAMLVTSSKGIiTOl«IMOLRHFAIΛWVVFFICAYFITYHIJ TCRW ΣX3Σ«IIWIVι7ΛSTTITOΣΛEGAEΣFAVC3LrJTSGKPWεGPCPFDKOHRYFFTlJ,AIJ3V
AWVLYFFMICALVGLFFVPSVONWRWYRIPFIHMSTvOPSEYGKLVIVIMLSYILESRKA \/LPE£EWTR∞LYEΛMEFHIIEQAElΛGTYEKS
DITSKTTAFLACL ALPFFLILKEPDIiCTALVT PVTLTIFYLSNVHSLLVKFCTvVAT
IGIIGSIXIFSGΣVSHQiαrKPYAΣJCVIKEYQYEPXSPSNHHORASLISIGLGGIRGRGWK CPn_0878 996660 995992
TσεFAGRGWLPYGYTDSVFSAΣΛεEFGΣiGLLFTΣΛIJrΥCLICFGCRTVAVA'roDFTJKLL SET Domain protein
AAGI'IVπ WHVLΣNΣSMMCGΣXJ>πrJVPLILISYOGSSVISTMASLrjVLOSIYSHRFAK GOBTv TEPCSSIHISt-mCWRDSQPYST-DRASEI-^FRFLPSLWSTIWKTCO IETLC
HKSEKRRLISFLAKWUa LHKODΣ XPPAPPVSVrWΣNAHvlSYGvFARDEIAPWTYIGEY
TCI MROAWMDENDYCFRYPMPI-FTΣΛYFTIDSGKCCNvTRFINHSEOPNAEAIGVTS
CPn_0868 986733 984670 EGLFHVIIRlVAPIYAGXjεiCY^YGPLYWKHRKKREEFIPεεε zntA/cadλ-Metal Transport P-cype ATPase
NFRtrjLGVRBLHHFREYYLΣ INEI IITGRYVFSRLFFTSFSAEWNTFFESGMSEDTSPL CPn_0879 997463 996645
I^KO RKωH LPΣ^AYLS _rrYLIALI_5FWΣJlAKNI_nri^^ yyc -metal dependent hydrolase
NΣCQ10AWIDΣΣjπ'SAAFGSΣFIG<ar,mAr,T,r,VIJAISεALGC«VSGKAKSTLVSLKQL YRIUn VSMOGFFPIΛ∞SKGNSAYI ?TDSCKILIDΣiGVSKOVVTRεLLSMNIDPEDIOA
APTTrΛΛV -l a^LOKVAΣ^^KIEW3NI^ ?Σi(SGEVVPLr 3EI ΛGSSSI ΣΛHLTOEKVP IFVTHEHSDHISGIKSFVICAYNTPrVCNΣjπ'ARAΣiCHLLDSHPεFKIFSTGSSFCFODLE
KSCHPGSI /PAGAHNMEσSFDLRVLRTGSDSTIAHIINLVIOAQNSKPRLQORLDKYSSV VCTF»lvTiroAVOPVAFIFHYREEKΣiGFCTtπi_WvTSWΣTHεLYirDYU.IESNHSPEI,VR
YAI^IFAIACGIALLVT>LFTSIPIXΛroSAFYRAIAFLIAASPCALIΣAIPIAYLSAINA QSORPTΛrYKrøVXSKΣiGHΣSNOECGQΣ OKIITPKI CKLYIJUlIjπ'ECOTAELA^
CAlWσVLLKGGVIΣΛRLVSCNSVVMDK'rG'ΣT.TTGELTCICCTTY GSKN^ SIASITSIAPEIALACGΣTSPIYFSRUεVACPR
SSSHPIAEAIVSYT-MEOKVSSLPADRYLWreEGA GYFNEOEAFvratVTTCICKVPSEY
LεDIEQKIYQAKOHGεiCSIAY TΛSFALFYFRDIPRPOAKEIIODIjωLGYPVSMLTGD CPn_0880 . 999864 997444
HICVSAE^π,AEILGISEVFFDLTPEDKΣ-AKIREΣATORQI MVrJDσI^II»PA J«A VGIA ftsK-Cell Division Protein FtsK
MGEAGSATAΣEAADIVΣXHDSLSSLPtreiQKAKOTKKWSONΣJU^ALAIILLVSWPASLG PMΣRεRiOSRHPRIJ LPI-AAKASLYΣJi ACFSGLSLWSFHRDOECTONWIGLLGWSFSS
11 PLWLAVI LHEGSTVI VGLNALRLLKS FΣiYFFTJAAAFFΣPLYFT IIJ3FLYFP lTPRPLFFYKAAAFI^LPFCSAIi ^KlSPVOTL
PAIXI7rW.PKFILGNNPPVSYvGGIPFYτFYEG0SFCLKHLIGSvT7rALIFGFV>lIJSVL
CPn_0869 987479 986658 YUX3GIAIZJCKCTFQDrM KAFCSFFO^FKrJΣJt <LINRRNYLPKPSVPFVSI0^
CT728 hypothetical protein SOPSPRRVSETIΣLDGSΣSPLPOεεiPGSKKESFFLTPHPCKRFLTKFVEPOENKAKEGK
EGMRFFFPICrSEm'SrX»QHOIΣJU IMTQDPHDHreSRTPEDHIKHVRDKHT«VCKσεPHT TΣALSSTPTVVT<ESl Gl«ERAALPKIJCSωvPENDLPOYHI SK rPX «PESIΛAEΣ-ERKA
TFRBFTYHLANNAI_STC FIFFIRTLFFLIPTORA∑ΛJVKSLISU?Λ3^ LITJOQTLTSFGIDADLGNICSGPTIAAFEV PHSGvT VOKIKSLENDIAIJfXOASSIRII
WAYMEIΛHRSMLεEKNEIEENFEOEKIELRIIJΣCN∞FKDPUΛEMVEYVCSDSTTilΛT APIPGKAAVGIEIPTPFPQAVNFRDΣ_C_EDYOK'rNRlUΛΣPIZXiGiα(^
MIREELYIRKEDLPHPLΣOGGSRΣL∞LCGIAIFLPLVIΛΣSY ΣAGΛTSALMVLVLSFL IDPKKVELTrjYSO-*HMLSPVITE
KAKILKNDKISEMVWVLGΣFITSASΣISSLMKLL SREVYNALVWLVKEMεSRYEILRYLGLWJIOA™SRTF<NKTIEASYDR£ΣRεTMPFMVGΣ
IDELSDILLLSSSODirπ'PIIRLAC+lARAVGΣHLΣLATORPSREVΣTCLIKANFPSRISFK
CPnJ0870 988881 987448 VSWKVNSOIΣΣDEPGAENΣΛGNGTΛILVT PSVFσriP XAYΣCDEDnnΛrlODrXSRFR serS-Seryl tRNA Synthetase-2 TOWIPSFHAFDDSDS∞SGEiωPLFAOAKΣtlLCrCJIASTTFLQRKLKΣGYARAASLID
T THPTOGFGGAVILPFSPISIAPΛIKKSCCSEItSSrYSHFCTΣXlJJNETSMLDΣKIIRK QLEEARΣΣGPSEGAKPRQΣLIQNPLEG
TPεεcεTRΣ UαωPKΣSLεi /LSIΛKEVROU ΣOSETLOAQRRIXSODIHKAr rtjσVDAT
^^LIOEVεTIAADLεKIεOHI-WKNAOLHεLΣ^H P YPADDIPVSεDKAG OVIKSVGDL CPn_0881 1005646 1006209
PIFSFPPKHHLεLNOEIΛIΣJJFQAAAKTTGSGWPAYKNRGVLLεWALLTYMLOKOAAHGF No robust homolog present in Genebank/EMBL as of 11/7/98
OLWLPPLLVKKεiLFGSOTIPKFDGOYYRVEDGεθYLYLIPTAEVVLNGFRSODΣLTεκε NKKFAVHMPVPIDNSSRNLQEVPESLEDLEOHAEESPTHOSAESSSLOLSLASSAISSRV
LPLYYAACTPCFRRεAGAAGAQERGL,VRVHOFHKVEMFAFTTPNODDΣAYEKMLSIVEEM εQIΛSLVIΛMεNSDFSSLRDVPIFSAIYESSTHTPVCTPLVσvGYi SQSGYYDTQRES
LTELKLPffllJjLLSτGrΛdSFτASKTIDAEVWLPσOKAFYεVSSΣSOCTDFOSRRSGTRYK LHΣΛOLLGSRRVΕVVYNOGNFMεASLIJ^LCPRRPRRDPSPISLALLεLWεAFFLEHPPGS
DSOGKLOFVHTLNGSGLATPRIXVAILENN∞ArXSVVΣPEVLRPYLGGLε∑LLPKDQ TFNPIFFW
CPn_0871 988766 989899 CPn_0882 1006169 1007404 ribD-Riboflavin Deaminase No robust homolog present in Genebank/EMBL as of 11/7/98
EYMεDFSεOT FFMRRAIEΣCEKGRΣTAPPNPWΛGCVWOENRΣΣGEσFHAYAGGPHAεε NTPOVALLIOYFFGNGAFYVRεALRLTPHAONΣVLVGΣCPΞLYPεHPRSEYYRVSGDIGS ωiONASMPΣSGSDVYVSLEPCSHFGSCPPCANLLIKHKVSRVFVALVDPDPKVAGOGIA RFDDRGFVNSGVεTLPYSSGSFGIFWΣSFTDPTFNFAΣVNTFMRTAGΣNEVSRPMTODTE
MLROAGIQVYVGIGεSεAOASLOPYLYORTHNFPWTILKSAASVDGOVADSOGKSOWΣTC TSLΣEMRDLSEC EANNTDSLEOEESL∞ΣVσHTVGσVSMTVTSSPNIFYRIO LLGLPε
PεARHDVGKLRAεSOAΣLVGSRTVLSDDPWLTAROPOGMLYPKOPLRWLDSRGSVPPTS TLAEAεεNPTFPNSTΣDSLAε∑MMNLVRΣSDAVSΣFWΣFPIVDTTYNGVLLAVCIGFFGI
KVFDKTSPTL^-VTTERCPεNYΣKVLDSLDVPVLLTεSTPSσVDLHKlΛ-εYLAOKKILQVL NGΣCSTFLMLTNPRSRRDRWRNLRIMvLCYRSLGSGMNLFDLSNNVRMAARRHVTSCTVA
VEGGTTLHTSLLKERFVNSLVLYSGPMΣLGDQKRPLVGVLGNLLεSASPLTLKSSOILGN LYAMVTLFC4<TVAI0DAL0YGFPSVRDAFYRYCLRHRYCLT0RNεDSLO rGTRF0VTRT
SLKWWε∑SPQVFEPΣRN HLεDCΛMVASILNl^WFGLFFGFVGLMTTFGCLε∑SPSCRWDAANNRTVGΣF
CPn_0872 989903 991216 CPn_0883 1008904 1007573 ribA&ribB-GTP Cyclohydratase & DHBP Synthase dmpP/ngrb-Phenolhydrolase/NADH ubiquinone oxidoreductase
KEP.lFRVACLASεSVNARεSM∑εTRεεVσSANFVSLERAΣEDLRAσKFVIWDEASREDE LYεLFΣKSσiFIVMTWL3CLYFΣCΣASLIFCAIGVΣLAGVILLSRKLFIKVHPCKLKΣND
GDLΣ ΣAGEKΣ'TVεKMTFLLOHTTGVVCAALSQERLLSLDLPPMVKDNRCRFKTPFTVSVD NεεLTKTVεSGOTLLVELLSSGΣPΣPSPCGCKATCKOCKVRVVKNADEPLεTDRSTFSKR
AAHGVTT TVSAADRTKVVQLLADPKSKPEDFISPGHFFPLASSPGCiVLKRAGHTESTVDL OLEEσWRLSCCCKVOHDMSLE∑εεRYLtlASSWEσTVI≤NDNVATFIKεLWAVDPNKPIP
MELAGLOPCGVLAELVNEDYSMMRLPO∑LEFARKHNΣAVΣPVTSIIAHRMLSDRLVSKIS FKP∞YL0ΣTVPSYKTNS3DWKOTMAPE-/YSDWεHFHLFD0VtDNS0LPADSANKAYSLA
SAP.LPTIYGDFTTHVYESLLEGMQHLALVKCNVAGKSN'ΛVRVHSεCVTGDILGSKRCDC 3YPAεLPTTKFNIRIATPPFINσKPN3EIPWGVC3SYVF3LKPGDK ITVSσPYGεSFMKD
GεOLSSAMSY IAεKGTGVLVYLRGQEGRG IGLGHKVPAYALODNGYCTVDANLAMGFPVD DDRPLIFLIGGA :;::FGRt;H ILDLLLHKHSKRεiDLWYi-iAR3LKεNtY0EEYBILεR0FP
SPE-/GIGA0ILVDLKLTTIKLITHNPOKYFGL0GFGL3ΣTERVPLPVR ISεDNE0YLRTK NFirYHLVLlIErLrED^-lAllWDKDDrTrrtlFLFRAFNUlOL RLDNPEnYLYr CGPPLHN EPMGHWLDLP CNNRVO :::; [LKLLGPYirVER:::: I I I.PDF .1 PnJIHH.i I iiti'i i oo-mn-i T74 1 hypor ht'r ic. i l pt oro i n
' :r/-.-ML.';R tVT FI.KLt.::::[.l'LFΛEEEΛΛO::Klrπ -/OI-ΛVMIA IΛ ! l.|-|ΥFII.WI<|.EOKRR
((rrvr)rii PEnτv ι i [Λ:;ι ; 7iΛ-ι.κι :Λ i ::El ι.κr-NnNK.';
CPrιJlHHrι l n lili.'.l) t 'll)->4 1 ; ι.'l-rιJIH74 -I' 164 'I') I 4'» y|ι.Λ rHNΛ Mi't hy l r i .ιιι:.l .-ι :.ι.
' T7 ': ; hypι.r hι -r u.-.t l pt or i.- i n Λ::π.τM:τrM Nt:ι,ιiFi ;vι ι.-.-;:ι.-iv::ιiY::i)::ι.κκκEi-:ι.ι.ιι ι.ι--Λi-ι.vi'.':ιiMiΛi't ι ιx:::i' l.l rilΛI.K ILThORIinF.I'ΛϋMLK ILK rKVLVFPLΛLU /.-M.' riYΛlirVM^LOTN^OTKVK :;ι,Ri:RNKHEF:;FFC»iΥi-y-.ι:κ::ii;ι.-i;:;:-ι- ι κκι:iF'V'π''-LLiιιιvrMi)iι,κι.τi(i-SΛΛjκιι
(i;.':ρ: ι iϊjκι.Rυγ i-:ι.twι,TEπ(ϊ;Λ T.' rp tDMΛY."EK FMκκvrΛ rjiΛir<:;M iιiL F-Ei-4ΛYFPi'KNK.:::i;rrι:i ,'iw:::ιv^ιικMvrι.rι,:»rri'F.YhVNi--Λι-ιi)Hιiκκιι_.^;::ι.
CPn_0906 1040514 1039915 CPn_0919 1049375 1050430
CT763 hypothetical protein ldh-Leucine Dehydrogenase in^SEvTJϊLVNDSOLSREASAFRLDIDFFILNIYPFFRNFKNIELCFFLSISOFNLDFMε FMKYSIΛFKEIKIDr^εRVIEVTCSKVT «AIIAIHOTAvTJPALGGVRASLYSSFEI»CT εFVAYIVKNLVTNPEAVεiRSIEDEDNESIKLEIRVAAεDIGKΣIGRRGNTIHALRTILR OAIΛLARGMTYKAI ISWTGTGGGKSVI ILPODAPSLTEIJMLRAFGOAVNALEGTYICAED
RVCSRLIO^KVQIDLvl3PEhKϊr∑3VIADQDYΣCDNDSSNSTεDTFσεSt)TCCSGHCHYDεDL Ii3VSrNDISIVAEiαTYVCGIAΣΛrsGDPSΣYTAHGGFT CIKETAKYLHGSSSΣΛCaαCIAΣ
NQEEOEEGNMHHSCECSNHH CCIGSVGPJu ΛSΣ_?FErJAELYVADVΣ-EPAVOIJAARLYGATIVTτEEIHAX_-CDIFSPCA
R SWIRiri»n \DΣJKKAIVGv7UIN0LεDΞSAGMMU<εRGΣLYGPDYLVNAGGLLNVAAλΣ
. CPn_0907 1040816 1040445 EGRVYAPKEVΣXJ VI^lPIVΣ^KLTOQSKTrGKDLVAI^DSFvτDKI AYTS cutA Periplasmic Divalent cation Tolerance protein CutA (C- Type Cytochrome Biogenesis Protein) CPn_0920 1051423 1050431
FAFSKFLIIKSSMTAVLILTSFPSEESARSLARHLΣTIRΣJ^CVHVFPKGTSTYLWEGKL cysO-Sulf ite Synchesis/biphosphate phosphatase CESEEHHIQΣKSΣDIRFSEICLAIQεFSGYEVPεVLLFPIENGDPRYLNWLTILSYPEKP ILEENSMHSELPNYONIVΕSVvTEITTOLΣΛYRSiaRLVPFWΩ SDGSFITAADYGSOYY PLSD LK∞LAKAFPNIPFIGεCTLYPDODNEKIPEIIJCπRLLTSSVSRDOLISTLVPPPSPTS
LFWLvOPirXTTAGFIRHRAFAVAISLIYEYRPILSrVMACPAYr^OTFKLYSAAKGHGLSrV
CPn_0908 1041607 1040780 HSCl«-3PΛFvΥADRKO K0FCεASIAAIΛ∞HHATRKLSLGLPNTPSPRRVESOYKYALV
CT764 hypothetical protein AEGAVDFFIRYPFIDSPARAWDHVPGAFLVEEAGGRVTDAI atfr-EYRKESlV NNHAVI
ILAILFMI I IKNNEΣΛIRRFFKTIJPPGPOYSLCΥASILrvXSSLVr ^^FCTrtjrLPELS LASGDOETHETTLAALOWLNVVPTDKLIAL LSKFNPSPIRNLFLVSSTt^KVPPTAΣAEHUu^ADAPTYLHEFSIKEAESSLHALGIFS SLViεKSPDNKGITIFY ∑OTPIAYVraiP-SNTLαiLEGSCFI/κjPYFPSLNLPQΣFFSQε CPn_0921 1051526 1052293 DIΛMOKLPKEKMLFTKIΣJJ IOAMESPKIIDLΛWDAYPGEirvTΣ^SGSLΣΛLPIKTLI) snGlycerol-3-P Acyltransferase RAΣJ3LYKHMKi SPVIKεKQYVYDLRFPNFLLLKAL GELMLIKIΛPA'ΣΥEraKTFLVGAIZj RYRrøVEσWOTΣΛINPKCXOTJ
ILEYXFWSRFHVRPMAVEYLFHSRVVO»rFΣΛSVRSΣP∑roLVPGItESKRSUa»MNVCYEE
CPn_0909 1041592 1041966 ASRAIJJRGεSI YPSGRLSRTGKI÷εrVNOYΞAYVI HRVMEαnA^VRVSGLWαSλFSR rsbV-Sigma Factor Regulator «O1*STPKΣΛPAFItEAFllAΣJ_RRGIFFMPKPJVKITtΛ0VDHI_rUC0FPTlC0DLWrr^
IISLIFTRFL FRT.Ttnil-SAKEYGDirvIYLOGSLDAVSVPSVQEYLεOFIQKKHLKIAL WFNOGDDNLPIEVPYA
^rFTIWSYISSAGI ΣXIώNFK V0S^ ∞KMCIΛCrVKεSVTIvVMRIAGUX3 I LCQSEQε
CLSKL CPn_0922 1052266 1053927 aas-Acylglycerophosphoethanolamine Acyltransferase
CPn_0910 1041970 1043004 QFAHRSSLRITRKUUtMH∞RNRGHNNHNI u tPGSTl εAFLII^SEHEεGIACFDεHL miaA-tRNA Pyrophosphace Transferase GSLSYREΣ WIIAVAIKVSItFSEDRTCVrøPASΣGAFΣAYFGΣΣXAGirτPVMKrWSαSL
FLYMLPFEFETNTTSSPECrΛr-aPQKLFVKLFKRTrvXΣ^GPTGSGKTWSl-ALAPMID RELRACTKTvΕvTtRVLTSOTFIKHLTEVrχ3FVΕΥPFDLMYMEDVT<KRLSWWEKCRIGL^ σεiVSiroSMOVYrXWDIGTAKVSΣJ AROεlPHHLIDIRHVOεPFNVVDFYYEAIQACONI KCSVPWIXJtlFGVSGVESDDTAVILFTSGTεiαPKAWLTHKNIJIENOEACLKFFOPNTQ LSROTrVPILV∞SGFYFHAFLSGPPKGPAADPQIREQΣ AΣAEεHC /SALYICDLLLKOPε DVMUtfLPPFHAYGFNSCGLFPIXMGVHVVFASNPIΛPKi VEFIDDKKVTFFGSTPVFF YAOTITKNDKNKΣIRGLEIIOLTCKKVSDHEWDIVPKASRεYCCRAWFLSPεTεFLKNNI DYIΣJ TAKKQNSCΣiSLRLWΣGGIΛI ClrrLYEETKKIflPOIALYOOTGATεCSPVISIT OMRCεAMLOEGLLεεVRGLLNOGΣRεNPSAFKAlσYRεW∑εFLDNσεKLεEYεεTKRKFV TKESPRKSECTvrjMPΣEGMrJvLΣΣSKETHIPVSSGεOGLIVVRσNSVFSGYLGNHEHOSFV SNSWHYTKKOKTWFKRYSIFPXLPTLGLSSDAIAOKIAKDYLLYS St»3DCAm.TGDLGHIGPSσDLFLEαRLSRFVKIGσεMVSLEALESILHEHFTEN0NEOA
GSLWCGIPGDKVRI^LFTTΣA'ΣTΣHEVNDILKSAETSSIVKISYVHOVESIPILGIGKP
CPn_0911 1044079 1042985 DYVSLNALAVSLFG
Fe-S cluster oxidoreductase
SLLLAIFNVNYFMNI^KRISFεεGLεLFVSSPiεRLOεRADAIRKεRYPSNEVTYVLDAN CPn_0923 1053966 1055093
PNYTNICKΣΣXrrFCAFYRKPKSPDAYLI^FDεVRSLLORWSSCπζTVLLGGGVHPGLGI bioF_1-Oxononanoate Synthase_1
DYLεεLVRΣTVQεFPSΣHPHFFSAVε∑εHACRVSGΣ3ΣEQGLORLWDAGOR IPGGGAε∑ vtKESFLTTSWΣDFV-TNDFLσFARSPTIYCEVSKRFO∑HCOOFPHεKLCΣRGSRLMVGP
LSεRVRKΣΣSPKKMOPGσWINLHKLAHLMGFRTTATMMFGHVENPEDΣLΣHLOTLRDAQD SSVIDDLεSKIASYHGAPNAFΣVNSGYMANLGLCHHVSRSTDVLLWDεεVKMSWHSLSA
SCPGFYSFI iεTYYRILALGRΣFLDNFDHVAASWFσεGKS ISG0HHTFHHNNLεHLεSLLOCYRΣSSKGRΣFΣFVSSVYSFRGTLAPLεθΣΣAI_»KKYHA
LGAKALHYGADDFGGVILDεSVHKATGWSΣQSSEEEICNIΣRSεGFIPVERNTFYQHΣSC HLIVDεAHAMGΣFGD∞KGLCHAUrYεNFYAVLVTYGrALσTMσASLLTSSEVKYDLMQN
TVSSL SPPLRYSTSLSPHTLISΣGTAYDFLASEGEIARKOVFKLKEHFHECFDSHAPGCVQPΣFL
PHTCLεεATSvT,ετTGΣHWr AFAKHPFLRvTJLlUYNTVDεVNLLAOVMKPYLεKSSHR
CPn_0912 1044120 1045760 '/HΣNHεFHLWRεLCCH
CT768 hypothetical protein
WIMDNSDNSFHTLεTεθGSFLNDELAVEεVASTεSTεiSDATLCFAεKKVAFILNKMRε Pn_0924 1057301 1055028
.ALTGSSOG3DLRLFWDLRK0CLPLFNε∑εDTAKRADHWRCYΣELTKεσRHLKGLODεEGS priA-Primosomal Protein N'
FWG0∑DLAΣTCLεKDΣLKF0εGTεDKΣFKDRεDNFLεS0ALDKH0AFYKOHHTSLLWLS KRFTAKTKSMGY∑εS≤TFRLYAεVΣVGSNΣNKVLDYCVPENLEHΣTKGTAVTΣSLRGGKK
SFSSKΣ ΣDLRKεLΣNVCMRMRLKSKFFORLSNLGNOVFPKRKELIEKVSOTFAεDVDAFV GVIYOIKTTTOCKKILPILGLΞDSEΣVLPODLLDLLFWISOYYFAPLGKTLKLFLPAΣS
AKYFlGSDKETLKKTVFFLRKEIKNLOHAAKRLP.'SSHVFAErrRLKLSKCWDOLKGMEKE SHVIOPKOHYRWLKCSKAKTKEΣLAKLεVLHPSOGAVLKΣLLOHASPPGLSSLMETAKV rRQEOGRLR SAENSKEVROMLAEVSSLLΣEGNDLSKVRKDLEσ∑SKKΣRALDLTHDDV SQSPIHSLEKLGILDIVDAAOLELOεDLLTFFPPAPKDLHPεQOSAΣDKΣFSSLKTSOFH
ISLKKεMOOLFDOLRεKODAAEHSYOEOLAKDKOVKKEAARSLAERΣTTFSKTCSEGNΣT THLLFGITGSCKTEIYLrtATSEALKO ;KGTILLVPε∑ALr Cm SLFKARFGKrΛGVLHH
.'iEBRεεWOTLKεLLGKMSFLPPPEKΣSLDNOLNLALOTIVNFFEεOLLSSPDSRεKLVNM K.LSDSDKSRTWROASEGSLRILIGPRSALFCPMKNLGLΣ ΣVDEεHDPAYKOTεSPPCYHA
RCVLKOPRεRROELKDKLEODKKLLGSSGLDFDRAMOYSALVEEDKRALEELDASΣLELK PDVAVMPσKLAHAT\VU;.,;,\τP.':LεSYTNALSGK'/VLSRLS3RAAAAHPAKΣSLΣNMNLE
UO tOQLL lAεRLεvGEOVL IFFNRRGYHTtlVSCTVCKHTLKCPHCDMVLT
FHKY-vr/LLCHLCNSJPKDLrxj. CPKCLCTTMTLOYRσSCTEKΣEKΣLOOΣFrOtRTtP∑D t.-PnJJ'J l ! 1114570'! 1045945 r;DTTKFKG3liεTLLR0FΛTt;κΛDVLIGTC)M IΛKlTMNFSA< τLΛVILIIGD3GLYIPDFRAS
Nu it.onϋr rioinolow present in iltne αnk/EMBL v. ot L l /7."is E0VF0LtT'.)i;RΛ:R::ilI.lx:F. I I ::FI.rPHPTIHSΛMP0DY3ΛFYf;0ErTGRELCEYP ll |. :ι-KγFR I EΛTD:;ΛIΛMRRNC IYΛFDLDGTLLKGN:;.'ΛI.'ΪFYCYr;L .X;LF.';YKTLPP(.- ! PFIRL IF' IFMCKI K 'rWEI'TVIIIiVHN tLKEςll.EGTNPLMI-'/TPCGHFK IKDTFRYOFLΣ ϊKI- PFKFFFiI I FIIPS t IR i-HAYV t PVNKKUIHΛLML,\KL:;i-KVKF M IDVDl-MTTFF
"PnJI'lJ' I0r,300 > 1053557 CPnJ)9 J-> 10 • 1071 1 15
Thioredoxin Disulfide Isomerase CT790 hypothetical protein
CHHTOTTYLTPFFKD3∞κrTΛΛGCAFVrXLLLTLPCCAARRRASGENL0OTRPtAAANL HI WTtR 3L L.I3TVL flTSKia«^l«n^r l(WvT CTB PB^ QWε_ΥΛεΛLεHSK0DHKPICLFFTC3rΛ 1WCtKMQD0IL0SSEF HFAσVHLHMVEVDF VtQy∑ ILHCLAKINCVSL Gf3NLΣDALFσRDIERMKGIYVεθDSra<Ht1lrXVf /eV ΛT)YG PQKNHOPEEOPOKNOELI AQYK' TGFPELVFIDAEGKOLAPJIGFEPGGGAAYVSI VKSAL VSIPEITTEEIQCCΣVSEISεYTGLHVAAVHVIIKGLTOPKDR rDEEΣEEEVSVOOLPSPE KLR DFLLENSEG
--n nn" -\C* - ' n5 «lH 105R670 rpn_n<>40 1073019 10^ 1204
•i i - -.' „.. : :. " ι . : \ -_ r Mir r. ' -- r-.ftirv -— • ; : -" - ••. -.— ^ v , :r: \ -"lEKTE ? rI I3I ILFLPLALLWVLKKTC0FFILPSSIISQSMSKTAVAIRRMTFLSHIKQLLSLKEI εRΣPFLMKIffASiεTIWSNETεAIiLENNLIKCHHPKYrΛ-LLKDDKTFFCLAISLSHSW SAADRVVI0YDDLVVDSIAΣKΣPHALPHRWILYSOGNSGLMENLFDRGDSSLH0LA ATG PI reAIRΩCAITSSQRQLΣFGPYVSAεACHTI-C-εVIS ΛffPLRTCSDREFALRKRPCILY S LLVFNYPGIMSSKGEAKRE^rL SYOA(?/RYLRDEE GPKA OI IAFGYSLGTSVQAΛ BMOlClAPCΛΛTYCTPEEYOGTLDKAILFLKGKiεEVVKDLEKVIQI«Sr«JL£FEOAA ^ ALDREVTDG3DGTSWI WKDRGPRSLADVANO ΣCKP IASAI IKLVGWNIDSVKPSERLRC RTLS^IKOAMAK∞vTKFHFWIDALGLYPΛKORTILTIi'rVRSGKIXGARHFSFTENAQ PEIFIYNSNHDOεLISDGLFεRεNCVATPFlJ-lJPEVKTSriTKIPIPERDLLHLNPLSPNV ErX3DLUSFILQYΥWOPYIPKEILTPLPLEFPTΣ_5YVX lAESPPRLRSPKTσYGaCELLD VDRLAAVISNYLDSENRKSOQPD tAYRNAKAYAA'ITLPSSTLPYODFWIIΛMSQYPYRIECYrjNAHMOGAHA'IYJVYIV ^
GFtJPrøYRTFSIDSEKTO^LAΣiEEVLLRRFHSLTrALPratΣWTGr^
CPnJ)92S 1061035 1059884 'TUtt.TOIOA/VTIAKEKSNHSRσLNKEKIFCETFPEGFSLPFTSN'ΣXΛFFOILRDEAHRFA
CHLPS 43 kDa protein homolog_3 ISKHRKKRGKALFEQεKIPGΣGEVT<RKRLU3KFKSWK0VMI^S0EEI-εAir?GLTKKDΣAV
RRKDFAFTLLNLSNRSDILSGΣFSNPHPVSYFSSTHAKQLSDFSKKHPΣLTKrVTIIVKI LXAROKDFNKSD
FKLLIGLIIPPΣΛIYWLCQLVCSLALFPRSSMLYSVUCrCFirXYRLEOEIQtrYFVICNLDP
SFωPAVSESKRITIOGDHLTItπ'LAΣHFSTARPKRWIAISUϊSGDFI CMIGLKDSLFL CPn_0941 1075504 1073018
SWKELAKΣiGANΣLΣYNYPGWSSTGIONΣ^NΣATAHNΣ^ΛI rtβDKICCPGANEIITYG mutS-DNA Mismatch Repair
YS XSA/OSAALOI NPFTNSεTSWVAVKDRAPHSLPAAANSFFσPiσKLΣAVLARWKMDA VMTEl«PTPMMEσWHOCKEI AGDSVLI.FRMGDFYεAFYDDAVIZ^OHtjaTLTOR(»IPM
EKNSP LPCPEILVΥSADRFRPSEVGDDTAΣipεFTIAHAIKRTPFAR5KXFIGEVNIXH
3SPLKHPTIQKLAEAILESL3RKN
FYMrCTAIWWLQQHU TI^rYAWAFEHKFASQKLT-rHFQ^
CPn_0929 1062301 1061186 AGGLLSYΣODKUXJTKHΣAΣPCrTrøKOOlO ΣDTASOWrLεi APLhroPOGKNSLLRΣM
CHLPS 43 kDa protein homolog_4 DHTS PMGSRΣXROILISPFYOTirEILWQrΛvΕFF JtQVTΣJuWIl TYt GVRDI εKFMAPIHGSNAF\rtOIIΛSHPSPQATYFSSTRA0iαΛεF103RHPVLTRIASVIIKIFKV TKVTTGLAGPRDΣCTI-∞SFSAGAQΣYIiOLASA'rLPEFFIDKCSLDTlα VSLIALLSKSL
LIGLIILPIΛIYWΣjCOTIXTNSI 'SKNLUCIFKKQPNTKTIj TIYΣΛALQDYSSK RVA CDIJ'LRVSDC^IFVOEFHNDLKPXJUINOεHSQWIWEYQERIRKETGΣKKLKICFAOAL
SMRRVPΣLθrjlΛrLIOTLεiCLSOAPTNRWMLISΣiGSrX:SLEέlACKEIFDSWORFAIO,IG GYYIEVSSεFAPOLPKDFIRROSRIΛAERFTTiεUXJFODDMSNISεKI-OTLεTOFFKDL
ANILVYNYPGVMSSTGSSSΣ CBLASAHNICreYIj ir£O.K3AKEIITYOΥS CSHILOLRTεlLALSOSLADLDYI ΣSLADIJUHAOGYCRPHVTJMSDTLCΣ YRGCHPVAKTL
ALR«3KIVANDr/rτWIAVKDRCPLFISPEGFHSCRRIGaαVARLFrΛ«rrKAVΕRS0rΛPC VΣW31<r∑Ptπ)TEMRGSOTRMILLTGPNMArΛ STYIR0IAIXVIMAOMGSYΣPAKSAH∑σV εIF YPτDS Pa^STv^^0^rκUAPE 'IXAH KNSPYVQNKEFIEvIΛSSDIDPIDSlC^R IDItIFraiGa«3αJLSlCG»!STFMVΕMAETANIIJlNATDP_3LVIIΛEVTiRβT^^
VALATPILK10-S AVVEYlI-riW lOXAKTIJA'IWYKELTTlCTHCPHVENFHA^
QICSFQIHVλRUGFPLC SRAOQIUO∑-εGPESITRPAQDKMOQLTLF
CPn_0930 1062851 1063330
No robust homolog present in Genebank/EMBL as of 11/7/98 CPn_0942 1075955 1077754
NKMSEΣAPCSTCIΛMVPHTCAmHAΣJWP tVILTIAACLSLIAG-VLvσLGAAAILPSLFG dnaG/priM-DNA Primase
VΣGGMILILFSSIALIYLYKKTREΛTlQΣALEPLPEMISKIXJSIIDFvTζTRBYASLEKKAT NCSITKΣΛrAMYTEESΣΛNUWSIDrvrΛO^EHIHI WSCATYKACCPFH^
FAYTHTHYYKSMVFYREIPRFMIiGSYLALRKDMDROALF PAGAHYHCFC 3AMDAIGFI-«hXσYSFTEAΣLVX^KKF0\^LVLQPKI)SσYTPPCCLK
EEIJtHINSεAETFFRYCLYHLPEARHAL0YLYHRGFSPDTIDPJHLGYGPE0SΣJ?LOAME
CPn_0931 1064078 1065718 ERKΣSOEOLHTAGFFGNKWFLFARR IFPVHDAIΛHTΣGFSARKFLENSOGGKYVNTPET lysS-Lysyl tRNA Synthetase PIFKKSRILFGIJJFSRPJlIAKi-αCΛ^I VE iOAr : QMΣDSGF^^C VAArX^^AFTEEHVKE
IDFRVIΛWKSDIVTNIΣ_SεROTARAEYIJ3πΕDFLYRSiπ«IΛElSlαιGVvLYPYEFP^Ϊ^ LSICI^TVUC^TXFDSDEAGNKAALRVGDLCOTAGMSvTVCKLr^XWDro
CεDIKKTFASQELrjNSεAAMSRSTPRVRFAGRLVLFPΛMGICNAFGOIUJKNOTIO nOTlR IAIJfQSQDYXTFLISEia^SYPKFaPREKAI V EAIROIKHWGSPILVYEHUtO S
EFTSrVHGL≤EDAEITPIKFIEKKLDLGDILGirX«T.FFTHSGELTvT,VET miC1C^^ MMVTεtM I IΛNPO πλEPO IIPIK0Kv¥KIHPHrλ/METDIΣ tC^^
LPDI HAGI-SDKEWYPKRWU3LISSREVSOTFVKRSYΣΣKLIRNYMDAHGFLεVΕTPΣLO FYFVTEDFKHPECRKI^AFMISYYEKYI<K>JVPFDEACCΛrt^DSOΣLOI-LTKRRLNTEALD
NIYGGAEAKPFTTTMEALJISEMFI lISLEIALKKILVGGAPRrYELGKVFRNEGIDRTHN TΣFVOSLQKMADRRWRECCKPLSLt«NΣQDKKLEILEDYVQLr<KDPΛΣΣTi^rΛPESELIP
PEFtMIEAYAAYMrΛKεVMVFVENLVElΛVRAVNHDrffSLVYSYWKHGPO-WDFTCAJ^
MTMXES lATYAG irΛnJVHSDOKΣJCEIΣ KJrTTFPεTAFATASRGMLIAALFDELVSIWLI CPn_0943 1077972 1078238
APHHITDHPV TTPtiCKTΣJΪ∞OTAFvraFESFCMKEIΛllAYSEI-NDPIRORELIΛOΛ CT794.1 hypothetical protein
TKMlLPDSECHPIDEEFLEAI^CS!PPA∞FσϊCΛ RLVMILTOAASIRrjVLYFPVMRR PFMKSF CFLLPFOOTILCCCJlLLSSPP^r«ΣSVTεSΣCΛSAVKTLv ^EKAHεFLEGIGY
FDAEKTN GV13ASS∑rJu WIOCWLEiεSLLAQNEVM
CPn_0932 1067160 1065721 CPn_0944 1078503 1078997 cysS-Cysteinyl tRNA Synthetase No robust homolog present in Genebank/EMBL as of 11/7/98
VKSD'T MAFSHIEG YFYW^ASQKKE FFP^mTPv^U./TCσF IKΣMMHRYFΣPI^LAIilFSPSLVT -XOPSEOTKσσWT^QLSCAEGSOLFCKFεAAYNNA
ILKRTLVFFGYS\πΗVMNI'TlJvΕDKTIAGASKKNIP∑ΛEYTQPYTiαAFFEDLr/rLNIARA lEEXSKPGILVTFSERPTPEFADLTNGSFSLSTPIAKGFNVVVΣΛPσLISPIΛFFHIOIDPV
DFYPIIA'rHYIPOMIQAITKLLEMIAYIGQtlASrvΥFSU FP^maαSHLDLSSLRCCSR ILYMGSFLEMFPE\rtΛVSGPRlCYILIDεθGGACCOAVLPLETKN
Σ3ADEYDKENPSDFvX KAYNPEK03VIYWESPFGKσRPGWHLIXSIMAMEΣ-U3DSLDIH
AGGVCNIFPHHεNεlAQSεAI23GKPFARYWΣJ{SεHΣ IDGKKMSKSΣ 3NFLTΣJ DLU<OE CPn_0945 1079001 1079660
FTG0EWYMIJXSHYRT0LNFTEEAIJJiCRHAUUu mFvSRt_Er^n5LKESPLPRTr-DS CT795 hypothetical protein
SSOFiεAFSRALANDLNVSTGFA5LFDFvKεiNTLΣ∑Λ HFSKADSLYΣΣJWUCKVrrVL SΣFTQ^ΣLPSYFGHNF∞LRRHYrølAi-St SI MIFPIFGEESRPσSEDGNSNTOErVσ
GV^PLTTSVCΣPETWIOLVAεREεARKTKNWAMADTLRDεlLAAGFLVεDSKSσPKVKPL SQDTflVCLYHSYECC∑ΛASR∑εrjKPLVΣVvT NSGDDWACTIGLSETCEE\Λ^SvXSGSI
FSEUNFVVLvτSΪWNPLIYPPIEDPIΣJ^IVT FKELFKDεSFPTGLSIIVVσvTPEGPG
CPn_0933 1067532 1068578 DIIEVSPVSLTVεεEETLPSEO rEVESTSELOSεDPAIA predicted disulfide bond isomerase
PVILΣΛNΣKRCSLKOLKVLATLLLSLSLPTLεAAENRDSDSΣVWHLDYQEALOKSKEAEL CPn_0946 1082816 1079745
PLLVIFSGSrΛINGPCMKIRKEVLESPεFII<RVC<3KFVCTE\reYLKHRPQVENIR03NLAL glyQ-Glycyl tRNA Synthetase
KSKreiNELPCMILLSHEεRεiYRIGSFGNETGSNLGDSLCHIVESDSLLRRΛFPMMTSL σεC0KKKCYTLεSFVSεHPLTL0SMIATILRFWSEC<KVIHθσYDLEW3AGTFNPATFLR
SLSELORYYRLAEELSHKεFLKHALεLCVRSDDYFFLSεKFRLLVεVGKMDSεεCORIKK ALOP8PYKAAYVEPSRRPODGRYGVHPNRLONYHOU3VILKPVPENFLSLYTESLRAIGL
RLLNKDPIOΛεKQTHFτVALiεFOεLAKRSRAGVRODASOVΣAPLεSYΣSOFGQODKDNLW DLRJWDΣRFIHDr*mjPTIGAWGLGWEVWLJCMEITOL'rΥFQAIGSKPLDTISGEITYσi
RVEMMΣAOFY'LDSDQWHHALOHAεVAFεAAPNEVRSHΣSRSLEYΣRHOS εRIAMYLOKKISΣYTΛrtΛNDTL'IΥ∞ΣTOASεKAWS-ΥNFDYAOTEIMFKHFEDFAEEAL
RTLKNGLSVPAYDFVΣKASHAFNILDARGTISVTERTRY Σ AR ΣRQLTRLVADSYVΕWRAS
CPn_0934 10b8948 106852b LNYPLL3LSSTSεPKETSESVVPMΣSSTεDLLLε∑GSεεLPATFVPΣGΣ00t.εSLARCVL ripA-Ribonuclease P Proce in Component TOHNIWεGIJEVIiσSPRRLALLVK VAPE QKAFEKKGPMLTSLFSPDCtn SPOGQQFF rVHPLTLPKOSRVLKRKOFLYtTRSGFCCRGSOATFr VPSRHPCTCRMGITVSKKFGK ASOGVDISHYODLSRHASLAIR'rVNOSεYLFLLHPεiRLRTADΣLMOELPLLIORMKFPK AHERNSFKRWRεVFRHVRHOLPNCOIWFPKGHKQPPVFSKLLODFΣNOIPECLHRLGK KJΛrttrDNS 3VEYARPIRWLVALYGεHΣLPΣTLGτ∑ ΣASRNSF HROLDPRKΣSΣSSPODYV
TKATTGσεcTPKsεKCVTAPR εTtΛOACVVVSOKERRMΣiεOGLRAHSSDTΣSAIPLPRLiεεATFLSεHPFVSCGOFSEQ
FCALPKεLLIAεMVNH0KYFPTHETSSGAΣSNFFΣWCDN5PNπTΣΣEGNεKALTPRLTD
"Pn_0935 I0c°100 1068957 GEFLFKODLOTPL'π'FIEKLKSVTYFεALGSLYDKVERLKAHORvTSTFSSLAASεDLDΣ c l 34 -L34 R ibosomal Protein A∑OYCKADLVSAVVNEFPεLOGΣMσ-rr LKHANLPTASAVAVGEHLRHtTMGQKLSTIGT
EDTVKPTYOPSKRKRRNSVCFRTRMATRNGRKLLNPPPRHGRHSLVDL LLSLLDRLDNLLACFΣLGLKrTSSHDPYALRROSLεVLTLV^ASRLPΣDLAJLLDRLADH
FPSTiεεKVWDKSKTIHεiLεFIWGRLKTFMGSLEFRKDεlAAVLIDSATKNPiεiLDTA
".MlJJ 'Jb l t.9330 1069470 εALOLLKεεHTεKLAVITTTHNRLKKΣ L33LKLSMT3SP I EVLGDRESNFKOVLDAFPGF c l lii -L j'. R iDouon l Pt ocein PKET'VΛΛFLεYFL.-LADLSNDIODFUITVIIIANDDGAIRHLR ISLLLTΛMDKFSLCHWε
/IΛKV3Ϊ IVKADP IK^DKLVRRKGRLYVINKKDPNPK PQACPARKK TVAV
CPn_09S6 1089545 1090909 CPn_0964 1104812 1103301
CT805 hypothetical protein No robust homolog present in Genebank/EMBL as of 11/7/98
LVWFSMILPPYSYSXJCIGAAVLFFCSILHTFLTP LYTUrOSYEHKKLV PEC KRYAPX OSIΣXSΣIKYT^YLΣHNSKMHMSNPΣSLFSPAELIAIYNLIPl rSPΣYPRRTεLIILEENA
SEIJrTlIΣΛRvΕlvTT^WAVPLFFWFLY EσYRISMAYFNSRNYGFAVFIMVILILLESRP CCIT^TNVAOVT^PSSIJSMSKKXLNPCKSGGPI WΣLNΣLAFΣΣTSVIJIIIX-PVNL
IVYFAELI/ωSIAKΣΛlCTSPISWWWTΣΛIAPPIX^CLUETOΛMIIGATIiMRHFYVFSP ΣVAGI- iΗPLPPKKΣVEDLSEPTTεεTNεVIOPFΣFALOAIΛfε∑WKXJISFKΣVEOSVσ
SRRFAYA'ΣtWLLFSNISIGGLT-rrVSSRAI^LIFPAΣJCWEHSFFΣ^HFAWKArVAII.IST KAPLPNPFUJT<LVAΣSP0εSOlWMPCΣPDr S0LKICVLKSLσvLTPEWlOiMΣJCYFEl3L^
TIYYFIFRKEFllKFPDXPSDKDPS^KVPWWXICVNIIFVGSIIΣ^P-π'PtiTΛGAI IjrY EHDSNPDKKTFPILIKLLΣEALTC*SSLPITPSTKEKMOAALFΣASSαrTαWT>«3EVrT
LGFQKTTIFY0DPINI^KVCrxTΛ:∑JYAGLVVFGDIΛEV«<VLΛ-4OGLSDFGYMWSYT15 P^Ll^PiYSIANErjrΛJO-LIWVOεFKεP LMSIODGDDAEEYRFAACflHGERYTεMEQVL
IFΣΛΛLVNYLVHNΣ^VA'TBCYΗYLVVAGCMAAGGLTLVSNIPNIVGYLILRSAFPSCT RNESAAKLOIHVIOTMKFTHGra-riGLvTEHΛiniC!^^
HMG LFLGALGPS I ISU7VFWLLKNVPEFLYCFFR FLNKYUISGNOLVNSVTKSΗ0l<ADPETKALIREFAIΛΣLYASLPiPCreAHTEWSTΣAM
OPETYEPNKACIAYΣXYVT-KIIEL
CPn_0957 1093812 1090963 ide/ptr-Insulinase family/Protease III CPn_0965 1106769 1104925
KIYTRNO<MFWKΣX£PΣLΣCTSΣ^ITSCEOΛFKVVPW3CPLOVSTPAAADOKIEKIICSN lpxB-Lipid A Disaccharide Synthase
GLPIillSDPNLPTSGAALLVKTGllNADPεEYPGMAHFTEHCVFLGNεi YPlWSGFPGFL KGFSFSKVσ∑/MIPSGLVYΣiYPLGFTASL FGSAFSIQWWIΛKKRKEvΥΛPRSFWILSS
SENl«vrøIAFTYPNKTVFVFSvΕΗSAFSDAUXFVHLFINPKFRQεDtΛRElYAVHOEFA IGA'TXMIΛmGTIQSOFPVTVLHVINLI rYIΛWIΛITSSRPISFRATLVLMΛIΛVvTvTLP
AHPWTClwWRIOOLVAP∞HPCARFGCCJIASTLTPVTrtSKMAE FLYVNMEWMASPNIFHLPLPPAOLSWHLItKU:iAIFSrjRFLXC44FYXESNNnri)FPLLF
YTSAPI^l<AKKOFSKIFS0IPRSKNYiπ»QEPFLPSGrrSSlJCNLYIN0AIQPTSNLEIYW WKIGU^:LIJ .VYFIRIGDPΣNILCYC^σi^PSΣANLRLFYKEORSTPYLDTHCFLSAG
H Σ YESSHPΣ PLGCYKALAεVLRl^εSKNSLVSLΣ-NEOLXTOΣΛVEFFRSSLNTGEFYISY EASCDΣLGGi ∑0SΣKSLYPNΣRFM3VσGPAMR0EX}WPIIJ)MEEFθ SGFAEVLσSLFR
ELTεKGDIWYSOVIDSTFOYLRYΣQεHGIPTJYTLffilSTINALNYCYSSKSPLFDLLCKQ LYROTRKIIjπ'ILKHKPATLΣFIDFPDFHΣlilKKIJWHσYRGKIII fVCPSΣ AWRPKR
IVSLGNεDLSTYPYHSLVYPKYSSEDESALΣΛLVSDPEOARFVLSSKNSεHWεεATOLHD RILEOHLOKIXLILPFEEGLFKNTSLETVYLGHPLVEEΣSDYKEOASWKEKFLNSDRPI
P Σ FDMTYYvTALDGVODYGKVOSLKP I ALPKPNLFΣ PKεVTLPGVHLLKKOEFPFAPALS VAAFPGSRRGDISRNLRIOVOAFLNSSLSOTHOFWSSSSAKYDEIIEDTLKAEGCOHSQ
YQDDI<LTLYHCεDHYYTAPIΣ_3SOIRIRSPOISRSSPOFLVATεLYCLA»IIX3LLREYYP I Σ PMNFRYEΣ IRSCDCAΣΛK03TΣVLεTALMrrPTIVMClWJtPFσrF AKYΣFKILLPAY
ATQAGLSFTSALCG∞IDLRVSσY Tn/PALLNSILTSLPNLεiSYETFLVYKKOLLεLY SLPNΣIMNSVΣFPεFΣCCKKDFHPεεiATAΣJ3Lti^HGSKEKQrtElX3Uα ;κVMTTGOIA
0GALLNCPVRSGLDεLASOVMKETYSlWPTW_!AI_eKLSFSεFOAFAShrLFNSVHLEVMVL SEεFLKRIFDTLPAV
GNLSεCOKKDYLmWVFTASRSSHATKPFYYεLQSOεiSε∑HHDYPLTANGMLLLLODK
33PSIKIWCAεMLFEW IHITFEELRT0Q0LGYMVGARYREFASRPFσFLYΣRSDAYSP CPn_0966 1108055 1106748
EELLAKTSLFLNKVSASPEKFGΣSOεKFANΣRKAYΣNKΣLεPεHSLDMMNSALFSLAFεR pcnB_2-PolyA Polymerase
PFVEFSTPDLKIAΣAETLTYEεFLlWCOCFLSNεLGTOTSVYΣRGTQICTS LLΣTΣΣMVCElWILMRGLELLiαtKSNITLTFTΣYSVSNHNΣKLKDFSPHALSVIICTLRK
AGYIAYIVTJGCΣRDLLLlvTTPKDFDΣ-TSAKPEEΣI AIFKNCILVCKRFRLAHIRFSKOI
CPn_0953 1094803 1093793 IEVSTFRSCSTDεDVLITKDNLWGTPEEDV'LRRDFτil«LFYDPεHεEIIDYTCGVNDLR plsβ-Glycerol-3-p Acyltransferase NRYLRTΣCDPFTRFKODFVRMLRLLKILSRSPFTVETOTQEALIACROELIKSSQΛRVFE tYRAlYMOFSRYtΛYAFDNOYLPεPLYOKFSVFHQNYΣDAATIOCAAADOAEVLCLOWVKV ELIKMLNSGAAKNFFOLLIENHLLEILFPYMDKAFRLNPALEEOTATYLKALDDKILKKE tlεDLKHPFIFPPYHKKΣRAPIDLFRLSIDFFSLVΣDDKNSRILNLHRLKεiεεYΣARGD AεYDRHOLMAIFLFPL NFNVRYKHOKHPYLSLTSVFσ/ΣKNFLEQFFADSFTSCSKKNF tlVVXLANHOTECDPOLlTirYALGKTHPELMElWIFVAGDRVTSDPLARPFSMGCDLLCIYS ILTALILOJørYRLTPLΣPTKKALFFNKKLLHHTRFLEALSLLε∑RSIVYPKIΛKVYVAWI RHIATPPELREεKLLHNOKSMOΣLKTLLNεGGKFir/APAGGRDRKNAEGRLYPSEFSP RHHOTLKCKKDSHSOK
ESIEVFPLt UASNOTTHFΥPFALKTYDΣLPPPPKΣE.'IAΣGεORAIFFAPVFFNFGAεLF rDΛLC."KεELIIICDKHAQRTLRAεKVFSΣVKNLYEEL CPnJJ967 1103431 1109935 mrr.A/pgm-Plιoι hrκιlιιcomuc.ιr-.ι»
'I t li'i i KI'lbJ76 1094799 FTAYKFAF ΣCA R.';EK IRR in I DFRRig^V,VF'FLFl7Tiy;VRC A EPtm,r^rTvTAGK
..irε-Λκi.il Filament Protein AVΛRVLREGR:*;KIIR\Λ'\\1KOTPL:X7/MFFΛΛI.rΛG l:,W;lCTI.VWPtrTPGVAFITR
ΛυHYGI'7rRKVMENEILLN∑ε3KEΣRYΛIILKNG0LFDLTIERKKVP.0LKCN∑γRGRVTNI ΛYRΛDΛ. I IM I .-WlllNrYRPNi: [ K IFIILEGFK I .ili'/t.EQP I ETMV.'IEADFl IPLPEDHAVGK I .UN rθ::ΛF INI IF I H [3D I LεNCKKFεOMFDMDVDΛLPEEASεΛPLLSSEEAP i ε NKRV t DΛMCRYVEFVKΛ TFI-Kl IPTI Y.< ILK I VLl/TΛI ► ;A.'"/K VΛll FVFEEI J1AEV ΣCYGCE FIl.KLL.jrvl.VOWKEI-IlJ.IKGARLTSNtllΣpπRYLVLLPNGPIIRGVSRK∑εDPllMREOL pτciNtNEiι ι:Λt.ιιι 'κ,v/ιraκ>Λiιu;iΛi.u;Di;Dr'iιι>(VDrat(:ιιιvr<;wιiL.'!tCΛ Ki.'L I l-::h EMI riMUL II -R'rATTAIiTEAL [NEAHDLLLTWKT I CKFYSTEOPCLLYGET GDlJ<KR:;ALPiiNi<WΛ-πwi K''nι.κγι.Eιu»;ι jVFT':ι-,πDr(iιvι.iΛMuaiEvτwy3E0 l,It.KFΛ7H-t:iDKNYKHLLIDϋYΛrι'QKι:KIIMLKKY3PDA^IKiε,/YRDSIPMFεRFNtε 'OTIIMtFLDYrrrn:iJi:l . \li.ι7I.RlMIFj:li:MI.-:UI.TΛF'IVK.:rx.lTI.INVΛVREKIPLET E!DFATI(l<KIW : i:i:YLrFnKTEAMllTIDVN.<k3P.'7rOLC3CVEETLVOINLEAAEEIA IPLIERTI.KLiVθnΛli;i.-,:illl.[J,γ.»rrE Il l'-MVF/;ilKKII VI»-IΛKΛIAϋVinΛE ; l''.l|.f-IJ'rWiχ;i.VIIDFIDMK.;RKNORRVLERLKεHMKYDAΛRCTIL3MSεFGLVεMTROR NI(E.-:l«0TI.I"I'LI-PYl-.X:NΛIIKτPE2WtE∑εRDLKKVtNHKEH.';HLCLWHPEIASYM KijUILiririllllMLΛKOI.KΛKtfllNT.'JDr.VIIIJIHYOFFSLITCEStDL i.T'llJI'li-H I I l)-»HH'l I I 1 I /.; I .llinS-tlliiiiiK.iii I iii. ro |. Amm l t.m tt-r.ini- DRM irrGYLGNODGVSΣVLECLAKLEYRGYDSA'* . VEOELFΣRKTVCRVOELSNLF QεREirrΛ3VtCHTRWA'rHGVPTErNAHPHimECRsVAVVHl«IΣENFKELRRELTAOGI rpn i)'>79 U2J-'7l 1125443 3FASDTD3EIIV0LFSLYY0E3QDLVF3FCO I-A0U!CSVACALIHKDHPHTILCAS0ES htrA-DO 3erirtBJiR1fottSI.se.' Ii |C '. •!' , „ Λ ,_. 
., PLΣlΛlVIKEETFIASDSRAFFItΥTRHSOAIASGEFAIVSOG EPEWYNLELKKΣHKDVRQ ∑∞ι κou«s tΛv Vt!r33i A PE-c^AVGκκεsRvsεtTOr rcΛέ tTCSε0ASDKSGYCYYMLKε∑YD0PEvXl«a.I0ICHMDEEGHI SEFL3DVPI SFKEITΣ ATPAVVYiεSFPKSQAVτHPSPGRRCPYENPFOYFNCεFFNRFFGLPSOREKPOS EAVR VACCΩSYHΛGYlΛYIΣESLVSTPVHIEVASEFRγRRPY∑σKDTLGΣLΣSOSGETADTLA GTGFLVSPDCTIVTNNHWεDTGKΣHVTLHDGOI YPAr/ΣCLDPI CLAVIKΣKSONLPY ALKELRRRN Σ AYLLG ICNVPESAI ALGVDHCLFIXACΛ«IGVATrKAFTSG-.ΣXI,VFLGL LSl WSWLI<VσDWAIAICNPFGL0AT,"rΛVΣ3AKσPJJC IΣADFEDFΣ0TDAAINPGN KLANVIIGΛLTHΛεC^SP-OOLOSLPDfjCOKLLAfiεSLHSWAOPYSYεDKFLFLGRRLMYP 3G PLL I DGOV IGVNTA IV3TiSGGY IC ΣGFA I PSLMΛNR Σ Σ DOL IRDGOVTRGFtiGvTL
."!.v.ι.κι.pι.ι,.ι ti Λt.,\t-ι-,;ι;<κιi';uΛ.. rrivr "- iNMMEVr '"..*" ".:.-:■ • • - ■•■■..• ■ . . . .: \.-»:.- .-_.;.. MFRr_\7.; . .lii v/.i'v. - "' ",, r ':.,"- r-'-r.'-VM.V'/''-' \* re'' .-.I- ;— .". . '-"" -•••'.: .— -,>>-_. It if
PRNLAKSVTVε TKGILIISVEPGSVAASSGIAPGOLILAVNRQKVSStEDLNRT CDSNNENILLMVSOGD VIRFIALKPεE
PD0PRNLVLεKTFKS εP3Pι FTFLF ' ΣLVLLFVYLVFIIROMRCHSCSAHSFCKS
CPnJ)'IH7 I H743 ! U 3H I I 5 PARMLUCCONrtVTFADVAG:εt/^εELΣEΣVOFLIONr)t«FTSL3GRΣPKCrVT^:aPPGTG ytgϊ- l i ke predicted rRNA mer.ny lase CTLIAKAVSGEADI"IIRroiW3DFVEMF*ICT-3A3"rt||UI*^
LEl«tFAtGFFMFAYRTLLTlrNVV0V3HElF!n*r/VreDTVIDA'rcGNCl'IDSLFLARLLQ RHRCΛGIC∞HDE-IEOTlWriLvlWIWFITπilEITVtt GEGRLvVYDI0Kl*AL31^ALLLFETHΣ^ε0ERSVΣEMKE0SHεHΣLεi<r*VKLΣHYι^ΣΛrYLP VMNL DIKGRFEILMVHAKRIKLDrrrVOr.MAVARSTPGASGADLENU-JEAAIJ UUU SR KCNKEΣTTLAR'TTEΣSLEYAωiVRPrjCLΣ'rVVr PσHPECEKETHSVεSLAORLHPKEW TAVTA\/T3VAEARDI<VL-rtKERRSLEMDAEERKTTAYHESGHAVvT3Σ OICDPVI)KvTI Σ V33FYVANRCRAPRLFΣF5R0GSES3VDKG PRGLSI£ATHFLPEKNKL3-r»rKELYrXLAV'U1CCRAA8EIFLCDI3SC««MDIS0A'tltL
VR3M\ CEWGMSP0LCW.*rΥDεRSKL'rCΥσσYHεKSYSεεTAKTΣDTELR«IXDAAY0RA
. -.- IIP. S-LFKKGJIZL
. " ' '. ', ' ' / , . i ,- • .-- / , i ' i iπ i n. - '■ l' i - i ' I ΓΓ KFFΣNLINLDWILKMKEAAPMHFPFPVRRSVWLNRYSTFRIGGPANYFKAIHTiεεARε VIRFIΛSIOTPFLIIGKGSNCΣJDDRGFTΛFVLYNAITOKQFLEDARΣKAYSCl-SFAALG CPn_0999 1152859 1150766 I ATAYNGYSG!^FAAGΣPσSvα»ΣFMNAGTNεSDISSVVPJ*vΕTINSEX;EΣ£SYSVEEL pnp-Polyribonucleotide Nucleotidyltransf erase εL3YRSSRFHRQOEFIΣ_3ATF0ωKKQVSADHSKSILOHRLMTOPYTQPSAGCΣFRNPεG OETFMNFOTI3INLTεGKILVFETGKIAR0ANCAVLVRSGεTCVFASACAVOU)I»CvDFL TS - -KLIDAAGLKCΣ.-.ΣGGAOISPLHANFΣINTGKATSDEVKOLIAII0STLKTGGXDLE P l\h*IY0EKFSSTGin'I 3GFIKREGRPSEKEILVSRLIDRSLRPSFPYP_i«CDvT vl5YV HEΣRIIPYOPKIHSPVSεK WSYrΛ0VLPDPωiCAASAALAISDIPQSNIVAGVTtIGCIDNC Λ/INPrKTEtASST*J5L
VIΛSTENAILMIEGHCDFFTEεOVLDAIEFCTKHΣVτ∑CKRWt-^EEVraBICNIΛAVYP
CPn_0989 1139552 1139016 LPAEVLTAVKECAQDKFTεLFNIKDKl VHAATAHεiEENIIXKLORErørXFSSFNXICAA
CT832 hypothetical protein CKTXKSrrtMRALIMREIRADGRSLTTVTlPITIETSYLPRTHGSCLFTRGET^^
LRTSLAWCVU.TIFWLLVMA'ItJ-PEKFSGSPISISKEFPCflKMREIIΣΛMLYALDMAPS GSEAMAQRYEDΣ^EGLSKFYI£YFFPPFSVCEVGRIGSPCRREIGI«KIΛE1 ALSHAIP AEDSLVPLΣΛSOTAVSOlϊHVLVALWTKSΣΣJ^OEΣJOLIlσNALJCNKSFDSLDLVEKNV DSATFPYTIRIEaNITES∞SSSMASVCGGCΣAI ωAGVPΣSSPΣAGIA>CLXLDDO(SAI LRLπjFEHFYSPPINKAΣLIAEAIRLVKKFSYSEACPFIQAΣLNDΣFTDSSLNENSLSΣ ∑ωDISGΣ^DH Gr*IDFKΣAGSGKGITAFC«DIKVεσrrPAIMKr<ALSOAKθrχ*NDIL I
MNEAI^APICADLSOYAPRIETMOIKPTKIASVIGPGGKOIROIIEεTGvOIDVNDLGvVS
CPn_0990 1139880 1140440 ISASSASAΣNKAKEI I εGLvT5εVEVσKTYRGRVTSVVAFGAFVEVLPσκεGLCHISECSR infC-Initiation Factor 3 ORIENISDWKEGDIIDVKLLSINEKGOLKLSHKATLE
SVAIi4FKINROIRAPKvPXIGSAGEOLGIΣΛΣKDAΣ_3ΣJ«EAGΣJ3LVEVASNSEPPVCKI
MDYOJtYRYGLTKKEKDSWWOHOΛ iKEVTO- rølDElWFSTKUCQA^ CPn_lO00 1153193 1152891
CMF∞REIAYPElWFlWvTJKMSC HjmiGFVEAEPKΣΛGRSLICVVAP^ rsl5-S15 Ribosomal Protein
HAQDENQ SAFAAIIlΛRHPMSωKGTKEEITταCFQLHEKΣ7IGSArΛOIAILTεHIAEΣJ EHLlCRSPK DQNSRIj XjαvrJORRKLLEYXNSTiπ'ERYKNLITRLNLRK
CPn_0991 1140394 1140612 rl35-L35 Ribosomal Protein CPn_1001 1153369 1153869 røt-iCr*RKSLMPKMαmSVSAPJKLTASGQI-WTRrc yfhC-cytosine deaminase KGQVGMYKRMMLV YYI£L∞EKLn«EKDIFFl«QAFK£ARKAYt»DEVPΛΛ3OTIVrø
DATAHA£ILCIG-ΛAQOΣJ»WRΣ2JYrVLYCΛEPCΣ eAGAIOIJUUPRm
CPn_0992 1140622 1140996 AGGS VNXFTEEHPFHTrøCTGC -2SEελεHUααCFFvςKRREKSεK rl20-L20 Ribosomal Protein
OaVMVRATCSVASRRRRl<RIIJCQAKGF GDRKGHIRQSRSSVMRAMAFNYMHRKDRKGD CPn_1002 1153844 1154089
FRSWIARΣJWASRIHSI^YSRLINGΣJCC^ISlΛRKMI^EIAIHNPεσFAEIANgAKKA CT845 hypothetical protein
LEATV KSAERKVKIWIVTIZBO YITOESRIΛKIjGEEIVPNLTPEDΣAQPWDFPQLE εσVLSGIGEV AAILAALSOEN
CPn_0993 1140975 1142030
pheS-Phenylalanyl tRNA Synthetase, Alpha CPn_1003 1154862 1154092
KSFGSHSLGIRISMEl<KEEIEAVTCQQFHSEl !OVNSSOAIjωi_WRYLGraα;iFRSFSEK CT846 hypothetical protein
LKQCTDI Ai ΛSLIrølT ΥVEDLWεKSLVLLASEOAEAFSKEKΣDSSLPGDSQPSGGR τSNκriHPL∑*cpDRθiAGKASM viFPDir«roπ"Piι∑£κ∑χ ααps i wsciAPFrsγ
H∑rjCSXΣJOlTVVDIFVHIΛFCVREAPNIESEA-INFTΣXr^ IINKFFGIPGΣiεitJU-SVKGIOKHHFWQFL'rYPLITADSΣ_3IΛKΣX3SFEITQRLiαJuW
RTHTSWrQAREI-SIlTΛPPIKVVAPraΛFRNEDISARSIWLfllO /EAFYvT^^ ωFFI_r««I0HLIRKIΛAFSVLVVISGOALIIGAvT*CFMALIHSS0SFTGPESII<*GV
ILSAFYHSFFOP tTELBI^RHSYFPFVEPGIEVirVSCECCGKGCAIΛKIITr^ L'rVOIFIΛPEKRFTIGPTPΣ^^SΣKWGFLFVΣΛFYCCILIFSCAFIXIJ-ASHLAIVIAIt
HP ^^PJJG^I DPεIYSσYAVr3M3IEWAMI- YσVSDIRLFSE DLRF QQFS FCKXEKIPNPYTTSLRF
CPn_0994 1142371 1144440 CPn_1004 1155418 1154879
CT837 hypothetical protein CT847 hypothetical protein
IJFWFHR∞PΛKRSRRNFEQALEIΛEKΣJffilSIATSlfflSYUJ^^
F^Λ»TvΕ>rι LEISCVSKSHADKAU ESDFLIAGVONv SFLENOEDLYKSLLDEYSεvτ σKSILAV ErΛ*^Σ OOORV £ Σ r P l VTDWKK∞SDDEYKNO^re OAYOSS ∞ΣS
KAVDEVTOONΣJCEVI*rYDIJ-TDEETEEHKEPECFΣJNNLVWK^ ANROMΣQOEI^SACΩP «AWKSvNSTTIESMOΣLOATSSMLSTLKELTΣKANLTNSPSD
NDALVOIIYKONKIilETVNECTPLTKTLLWNSEl!vTCNIASSLVrVNr IPLRLFYORALSH
LDIEAVVKVHlWVMAΣJFSRYl»TMv KSPKK13NIWYlTroFLIJ?I-iεAW,^ CPn_1005 1155957 1155415 εRKCTKI>LASALSΣiGIFεSl<L\/FεεASRYLYFNIO KΣ NλrøKKPLSP<nYLTΣlAYEεL CT848 hypothetical protein
HRLISKYP∞PLFKAMDRVliHESRPYT)PMIU:lI_?SLEGTU -røl«IDIIRSPSPVTQ NRKPVRIJ<MWIIDPΣ-3AKKPLOAAINVPGTPI'Σ*3σPNTATADDΣΣAiσSiα3SNPLIVT ^
SSILYAW-NEEi CFTΛAKAHRSEvTLVLNIOlTOISRKERARSRVIl-IiAIJaEεHAPYVH YVΥOSVXVAθr*<Σ^IΣA0εL0ANSSAOTYLNNOEALYOYVSΣPIO KΣ/π5NSSSYW XQS
AFSreεPEεiωi^Σ_εSIHGDiεTFADFFSIWEEFHlWIiASSFFL'ITt£ΣJCEFvTJSFLKE DNOΛXGASROAΣONQISSLGNAAQVISSNUπNt-MIIQαSLOVGQALIOTFSOrVSUAN
KLTAU DIFFAKKKII^FPjmKLLtXΛΣXSYXIVFiαiERTOPNSΣVVVSl<ΣX3IJDWSW I
AGFAFFSIlltAFWDEHSLKΣXLTlWI^PTLVARDRLVFVSHiεiXSKFVirc∑J KNROGFSS
LKSFTKDD∑εCB^FTGYΣΛεLTΣWSHKHNL CPn_1006 1156493 1155990
CT849 hypothetical protein
CPn_0995 1145515 1144415 TKvT^FlMSΣTTLGTLPTVOTΣNSSRPPLεPLNTPKΣGAVLFSΣYεL∑ωAlεiROQTVL
CT838 hypothetical protein TQS∞∑ΛrΛπ'NiQCxX oε'iTO∑KYAivsAGAKεDEi'∑røONorørra^
RMLIWKRHLLraFWFALTSLLVLALIFYASIHHSLJITLKGA-rrAASGASVKLSILYYLAQ TRQNGQΣ ΣLSHASTNΣNΣ ΣOCXJSSODSSFIKTTNSIGSTVNQLNKPLG
ISLKAEFU1PQLVAVA'I SrTLFAMC*lrøε∑ILIΛASGLSLKSl>mPLlX£σAVΣMMVLYA
NFO ITJ)PΣCεKISITKεi^lΦRσiTDKECGKIPALYΣ IX3^r rLLYSSiεPiπ .TLlΦ^ CPn_1007 1156689 1156907
KDPKTIYTMEI<IAF TLSLPΣGL^rv^x3FFAlrøSEM.εLKεFFDMKEFPεIεF FYεNPFS CT849.1 hypothetical protein
KLFSAGNKNRLSεFFKAIPWNATCIΛLSTQVPORILSLLAQFYYVLISPLACMAAIΣLSA F rYKSLAGEεKrΛ SGNεCNDYPεVFKDrΛ/SAYVLVTCGOMSSEGKIQVεMTYEGDPAVIS
YLCLRFSRTPTvTLAYLΣPLGTVNΣFFWLKAGIVLASSSVLPTLPVMAFPLIVLFLLTN YLLTKARDSLDεS
YAYAKLQ
CPn_1008 1156904 1158223
CPn_0996 1146592 1145519 CT84850 hypothetical protein
CT839 hypothetical protein vTJfYSFIGMLKPMYVLSKRLYRWVrøLIKLCDLVKNSRSFSVEVTvTΣSALLLΣFGClΛCA
AMPILWltvLIFRYLCTAAFCl-LSLΣCΣSΣ ΣSSWε∑VAYΣΛKDVPYDτVLRLMAYOΣPYL 9vVKVSLVPFLLLFSFLAFPLILCFRGKCYALLIΛVr TLWAKYv Λ3ETLWSF LSGL
LPFΣLPGSCFVSAFSLFRKLSDNNHMTFLRASGASQSΣ ΣMFPVLMVSGAΣCCLNFYTCSε GVSFLLAFGLFLGGVWLAOEEεMVKGKεOLRLSεDLDAORSAYεDLLLTKSOεKEFUJAR
1-ASICRYOTCKεiANMAMTSPALLLOTLOKKENNRΣFΣAVDHCAKSKFDNVΣVALKCNNE AOGLDRεLTεCOELU-WYOKOεYLTIDLKILADOKllGWLεDYAεLHNKYiεLVSICNGOV
ΣSHVGIIKSΣΣPDTTK17TVKAKDVVFΣSKLPDSLTESSSPSS0RFYΣETLDεLLIPKITS VFPWVAEPSVσESθσSεRVDVSRWVSALθεKεεSLεP.LRNεiLvεKORCSDYEHRCOELG
TLFACKSYLKTRTDYLPWKOLVKQSLKHSHLPεTLRRVAΣGFLCΣTLTYAGMΣLGΣHKPR LLIΛNFTALεRRCEεL NLLNOKETOINELHOLVCKSεEKVSVEPSAHAεTSCVEEKOYK
FRICSIALYFIFPΣLDLΣLLΣVtϊKMrKNLPLAFMLFVFPQLVSWVVFAARAYRESRGYA GLYSOLOEOFLEKSεTLSLVRKKLFAVOEKYLTLKKKEεLTKODΣSFDDΣSM∑OGUJERΣ ε∑LεEE-ZSHLEεLVSRSLSL
CPn_0997 1146699 U476b4 meβ -PP- loop superta i ly ATPase CPn.lOO'l 1159085 1 158 18b
AYKflVLSSDLLRDDKQLDLFFASLDVKKRYLLALSCGSDSLFLFYLLKERGVSFTAVHID map-Miithioninβ Aininopept idase HrrWRSTGAOlϊAKεLεεLCΛRεCVPFVLYTLTAεεCGDKDLεNOARKKRYAFLYεSYROLD YRLUIP'/ IIΛKRNDPCl CCSilRKWKOCHYPOPPKMSPεALKOHYASOYNILLiπ'PEOKAK AGG IFL,\HHANrχ3AεTVLKRLLESAHLTNLKAMAεR3YVEDVLLLRPLLH Σ PKS3LKEAL I YNΛCO [TAR I LDELCKAόOKCVTTNεLDεLGOεLHFFYDA I AAPFIIYGSPPFPKTICTS DAR(;tL-YLODP:;NεDεRYLRΛRMRKKLFPWLεεVFCKIlITFPLLTLGεεSAεL:;εYLεKQ ( jiεv i i I VDGYYGDCCPM' M FGCVPε∑KKK COΛALECL ΛυPFFGMTIiυDIWεLPCPDCL∑ςx FLCKWVMKKFFNNΛC IΛV.'IRHFLOMVYDHLSRS rl 'l lΛ l t.KriJ t r .-EKIEAIEΛRΛD'rYf lF' /DOF'/'-lll'r/II I EFIIENP'rVPHYRNRSMΣP .rΛTLRMRNK [VI tKPGWVΣD IAI<-3 I iTI EI'M lrΛ/ι:KKEι;WDPKW.ιWEΛr<TCDU :;Λ WElrri.\ [TETC7YEILTLLND
' I n JJVJH 1 1473 1 1 1 f,05R4 ' I-π I 'l 1 1 l '.-lι. /r, 1 I mi. t t -.ll Λ'π*-ιli'pι.nιlι.ιιt -.l ine uiott se 'T'"..' I.yin it lii r i l |.| (.r i-i n
LI.';!.' VK[-M.-,,KUKKMKPEPKKNFπVFFFLLr',-WFir/'/ΛI ONriΛt;KKΛnV F:niOIEII /MI.I I. II-":i.l.!"'VI.Fn 'Ii: H I VKVΛLLKNr :RKrV./PVII.REri.rΛIv;ΛLlLFVTFGR I .VN tLIVI-En IIK IALNDNLVllBWRFRDVOTCEGOLP.YIIYI.ELlIXXIHRLDLDLO-TS .-FF FI.r, I.:i.YAF I KHπ.l .FT/.'I l KMMIJVI -MPEKAKDr/r.' KTrP I FFPIAFPVtTGPA κ.':r.rrι/;κEV'rN:; ιiJ«iF.iΛ i:awp[PEθGYAt:iγr';E-/r;G::vι.τι:PLVV-τr;pΛTPθLiNi. 7(TΛi.ι..;YM i . ; i ι ι \M i iΛWΛFr:ι.ι /:.';.'.TFnpi.ι-γ;Ni-τ;LLΛLEPLF0lΛL ιι:;ti.'κr'Yi-ι- :R:;pF-Λ[.RTYι::;DLYCLiGKYL:;rvι/; ιι;.';κτLKREi <DLYθOVEVSLτo ι.lΛ.-:vιιi .κ. : ι ;
ITr/rFAΛYrl.Yi:0VI.:TLNR [:5J3LW3EσGERF:;0LI':;vni.YREFMIKYIIKLVEΛRDLH i.iAOI.KKl.l<i ;r. ;oTVWYFNNOEl-;SRSLEKODPEVF';ilWF,V;,\KEIwr,\FKFNII::L:';FKA I l i.tl tli'l I I '.-i-m.'
' )L I IΛVL.-;.';WL3WTTΣVAE [PF3AAKNGTFPEΣF .εK3PSVSLYΣT33VM0LAMLL RFFLERCVLLRPLGNTLYV. ICεEDLR I ΣYSIILODALCLJP
VYF3SN,\Win-ML3T'TGVMVt.PAYLA_iAAFLFKL3k.,v rYPKKGSΣI APLAMΣTCTL 7vVY :;LWLIYΛGGLKYLFMALVLI-ALGΣPFYIDAGKKKKNAI TFFAKKε∑VGMTFIGLLALTAΣ CPn_1042 ;!: ;. - ΪI'6629 » <•?« I*-, FLFLTOR tK I •b oD-dethiobidt'th synchetassir
NRSPFTYFRANFFM0RII∑vT*irmΛ*3rπ'ΣV3AΣlARALλIAEYWKPI0A _ΛENSDSNI^
CPn_1032 1186153 1135566 HELSGAYCHPEAYRLHKPLSPHKAAO IDNVSlεESH ICAPKTTSNL: lETSOGFLSPCTS
CTJ73 hypothetical protein KRLOGDVFSSWSCSWILVSOAYLGS NHTCLTVEAMRSRNLN Σ LGMWNGYPEDEEHWLT
I .MΛY''TPYPT:ΛF!rτ ^r F.':DDr™PPOPF-?rF~/D.-ALLOAKΣF»IRITV YTS PKε OE IKLPI tGTLAKEKEΣTKTΣ ΣSCYAEOWKEVWTSNHCG ΣOGVSGTPSLNLH .. ;;, ; ', ! '. /?'. 'i- !'i.:' ;.'.'. i .; '.' ':*,'.■ Ii- ;A,\;..:i ' .T iΛ ..-V" ■ ' .' - i' W IrTJKI.- lc .1 V,"rt,V.
ALGFLNFENAEPAKVN oιoF_ -Oxononanoate 3ynchase_J
PMΣΛMFLΣEAI-AIUtKSKHTYRSLSLNSHLIDFTSNirπΛFASSPεLRKεYΣTKLHAiεS
CPn_1033 1187656 1186187 LGATCSRU.TCHSQLCORIEEOlΛAYHNFεSCLIFTπY YTANΣΛ∑iYAlA'rM
CT372 hypothetical protein YIHASIYTJGΣRLSI<AOSFPFhmNDL^LεKRLASSHI RTFVC\^^WSI-HσSVAPLOAI
N^π KI<DYS<3εFLTτrπVDSΣAFLPSEENFCYI TILFFR-KKlα^YAFFYGEFMISF F SεLCεRYSAYLIVDEAHAVC TGIX ECLVSAX^lΛDKV'I^TVYTFGKALσTHGAAIAGS
LSGITAIiGISSYACTPKETTGHYHRYKARIOKKHPESIKESAPSETPHHNSIiSPVrNIF SIIJ03YXIlJFa«PFIYTTA0PPHALTAIEIAYEHN0RAFN0REHIJjALIHHFREKA0tn<G
CSHPWKrKISVSNΣ TSVτKA'n j∑SΣΛFSILPWFYPHIfAIiGCrOA εiPSWOF^ LQΣΛK)NTτTPIOSICVSGSHRAROAAIΛIQNSGYlr«PIVSPTVKOREEI^
TΛ NEIDHIXIfrLEQIFLCNVSSL sEIIa QtiGFODSY Irx^I^lFSIY^^LTκs γ^n?γσY SW^p ps(*∞GQYSV ^ CPn_1044 1198700 1197699
E0NSO πYJWSLNAAOHIHEiaYLFσRIN5A'ΣtϊTALPΣNRSYV GLVSENPI24R«S0DΣX^ •bioB-Biotin Synchase IGFATNKVNAIiAΣSNVNKLRRYESVMEAFATIGFGPYISLTPDFQLYIHPALRPERRTSQ AKHMRIXrVSWSL£DIREIYhTPVFELIHKANAIΣASNFΣΛSEΣ/π'C^ VYGLRANLSL CAYCAQSSRYHTW TPEPMMKIVTrV BRAl'-ttVEIiαATRvC'-CAAt^
MVKSlTDLGAEVCCAIiG»u^EEOAKiaYIW3LYAYNrorLDSSPEFYETIITTRSYED
CPn_1034 1188589 1187732 TLDVVNK-BΣ-T^CGGT^ΛΛCεSEEDRIKIXHVLATRDHIPESVI^NlLWPIDσTPLQDQ
Predicted OMP (CT371) [leader (18) peptide] PPISFWt-Λ*ffIATAR-*vFPRSMVRIΛΛGRΛFLTVεOTI£FIΛGA∞
KTSWOKYKKYΣ^YSILVQKΣARYVMKTVΛFFTFΣJFSCSSFYASCRYAEVRSIHEVAGDIL NDIDEDAEMΣKLLGLIPRPSFGIERGNPCYANNS
YDEENFWLIΣJM-OOTΣXCOTεAΣ-JHSIWKSKAIOCΣΛKC*^
GTVQPIESAIFΣilEKIOKOGKTTFVY ERPKTAKDLTIJCOlJiMUWSLEI rAPXJTO CPn_1045 1199602 1198901
PIO«XYTSσiLFSGDYHKGPGΣJ3IJTΛICTPLPAKIIYirΛ«3KEW-LRIGDLCOKYGIAY 'conserved hypothetical bacterial membrane protein
FσiTΥKAOεLHPPIYFDNIAQVQYOTSKiα SNEAAAΣJ UlHQMHE GTLI»-WS>IRKTLVTSYI^STFTIiL\(TJNI,VWSKLIPTrFFNFIIPrMLILYPLTFLI
SDVVNEIFGPIC ARvΗIFSAFΣANLLASSΣVOΣFMFFPVASPEMOTAWHCLFDLSPLRFL
CPn_1035 1190081 1188570 ASt-ΛFrra∞ωrvXY-fFFMIRTPNSSlAJΣΛSrJGST ISOIPOTFIV^ aroE-Shikimate 5-Dehydrogenase FPOTIΛr«i fSYIYl I I^nL'∑τprjn(lA\ΛOTIRIiϊ a
' VQLPLMVPΣ' HLQIWRFSMΣYYGVSVMi Λ'rvSGPSFCEAK∞IIΛ-ωuWDIIELRI-O LINELDDOELHTLI'rTAONPILTFRQHKEMSTAΣiπOiαYSIJUCIiPllWMDI∑rv^LPlCTA CPn_1046 1200675 1199590 . «3^IRKSHPKΣKlIΣΛYHTDKNiα3IΛAIYNEMΣATPAEr«rVΣ^PlMSSEAΣJffIKKAR •Tryptophan Hyroxylase LLPKPSTVLCMGTHGLPSRVLSPLISNAMOTAAGISAPWAKSQPiajSEIXJnrNYSI T-SE VHYCERTΣJJPIOflΣJUAIJ'XΛQSIΛLFroNSQSLORAYSTW KSHΣYGLIGDPVDRSΣSHΣJIOIFIJ^H^I-W'IYIKFPVTiσiϊVVTFFSAIRDLPFSGLS RHKCISII£FFrøT-Σ-FVHI^I£KrøPJSCCSTr*4AW '^T4PLKTAIFDHVΣ)AI_3ASAQLCεSINTLVFΗNQKIΣΛY mX:εGΛ'AKLΣJ^ YCPRI lJ5YIJ-AFGIXSDFIiWrΛVIKFFEI^rroFSYYPVSGFVAPHOTL5IXQDRYFPI HΣAIVGAGGAAKAIAATΣAMOGANΣΛIFNRTLSSAAAIA'Σ KKGKAYPΣΛSI-aiFKTIDI ASVMR'Z*IJ,)KIWFSL'r?I^IHDΣ-U3HVPWL-ΛPSFSE ΣINCLPPBrrFPWRFPPΣVMDI-m PHPSPYΣ-EPJWIOKSSLIIHσYEMFIEQALLOFALW RΣC/rLOSNLIMvPCFVrFTVESGLΣEhmEGRKAYGAV^ISSPOELGHAF∑riΛπXVLPLEL FPDFLTPεSCDSFRNYVKNFMAKV DQΣIRLPFOTSTPOETI_?SΣIWFυεLVELTSKLISV ΛDML--ESΣPLYNOEKYΣώGFεVL
CQ
tpiS-Triosephosphate Isomerase
Pn_I OVI 1207010 t209466 FCRESMRIKFPENKERKMTPC . L^^IWKMHK ICεΛKεnC -AJ ^GεF 3CT 3ΣA
No rouuiit homolog presenc in Genebank/EMBL as of 11 /7/98 3PFT3LRAIHW|t4T-p-AI*1i» »(3l^rHPBLSGAF-S ;RWIIHRFtMOVLLSP0LPPPI?0HSVG3IS3PSIttΛiαAΣTFLvF(3MΣJ .Σ3GALFLTLGΣ H I FGESDAF I ASK Tt^AQAG PWIVBεS ε'/R€miCANQV'-l-KCr^^ PGL3AΛ I SFGLG IGLSAIiGGVLMISGLLCLLVKREI PTVRPEEI PεGVSLAPSEEPA QA FLIAYEPVWAΣGTCrWAεASr^raDIHMFCRE AERFSEATAEεiJILϊCCSvTCVDNAQR lOCTLAOLPKεLMΣJTTDIOεVFACIΛItl ωSlWESRSFLNDAKKEΣΛVFOFVVEDTXSE FGO rSrΛΛMr- VCGASLEGOSFFEVAKNFNV [Fr ROIVAOIXWDLNFLΣNCGRSLMOTAESESLDLFHVSl«LσYLPSGrΛrRGEGLjαtSA 'r r' PIΛII.H EIHKVAVAFDRN.TrAMAεKΛFAKALGALεεSVYRSLTOSYRDKFLESE CPn 10-.4 ι22-)7 l6 1220395 i v, - .„„„ . ., .„..„-,., t-r-Ljf - M . 'TFP-r K I.V. T:r.ArlA»PLF
/VRrΛWDOεFOItAGERLEKIJHALYPEVSVSIRENKΣQETRSNLIiKAYEAIEENYRCCVRE
JEσΛVKEEEKREAEFRERCNKILSPEELESSLEOFDIIG <NFSEKLMIiEGHILKLOKεA CPn_1065 1221140 1220923
TAEVENKILSMESRLEIVFiajvTffiMKRiεEIEKI RflAELPIXPTKXAFEIf^ No robust homolog present in Genebank/EMBL as of 11/7/98
MI-ML£KVTtr^YCKESLAYVTSI*ERLVSI^ED-JuWYTETOraFOCDS(ILESE RHI«X3RHP lTSr3PCFLFYF3IPεεSLPPDSCP -NWPI«εHLPSlΣiKKPIir«I CITSI
RεRIQεFErøω VEraj VSSRIΛOTEαXTvϊGVKKI*^ YEKAIFNTGLP
OSRWMTOSERLREIΪvTJACNKMΣJCAσWEEDKVΣJIXEEYWLYREERKNKEKRLV^
QORVAAFESΣEVPEIPEAPEεKPSLLDKARSLFTREDHT CPn_1066 1221132 1221488
No robust homolog present in Genebank/EMBL as of 11/7/98
CPn_1055 1209583 1210521 SMSLNKEIC_MTvT-FYAFLFIFΣJΣ£VIΣ :GLΣLVOESKSMGΣΛSSFO^ΛSGDSV* *VSTP
No robust homolog present in Genebank/EMBL as of 11/7/98 DIUαvτ"*ΛAVAFCIGCIJ JFSTtΛUΪKKI_3AKEFLLPAAEESΣ*ro
CKYLYHHSYPPPPm_W-AFFCI2-KFRVΣAITFLVL(3VΣJlISGAIJ TLGISGLSAAIS
FGLGICΣ AΣitX*v WSe CΣJJWREvTTVRPEEIPEGVSVAPSE£PA^ CPn_1067 1221675 1222292
PKEΣMLDRYΣQEVVSCIΛKΣJ-IiωCEIXMUJfrAKEX O^ def-Polypeptide Deformylase
MεiSWYLKCLIOEMRDIGSTLFMSOTSLFKΣΛJEWLGYΣ^SGrΛnΛGεRI αSAREWDR^ IQVL- RDFFTEΣΛ»HVO/IMIRRI YYGSPILRKKSSPIAEITDEIRNLVSr»COTMI^
RRICrαrRKVAMTFDRNAYC* AKTAFEKAFGALETCVYl«MTESYRεAFCEYXKTKILRDE HRGVGLAAPCΛ*3KNVSLFVMCVDRETEDGELΣFSεSPRVFΣNPVLSDPSETPIΣGKEOCL
EKΣLRΣCYLELRR SireLRGEVFRTOKrTVTAMDIJCKIFTEHLEGFTARIΣMHεTDHIΛGVLYΣDLMEEPKD
PKKFKASLEKΣKRRYNTHLSKEELVS
CPn_1056 1210482 1211228
No robust homolog present in Genebank/EMBL as of 11/7/98 CPn_1068 1223267 1222365
GεDIlω^α^RvEEIEMM RVIE PIiPIKOA EKAFVQTOSYKAKL'^Kv^PCF ESPAYI rnhB-Ribonuclease HII
TSΣ2Iu^SlJX7riJERAYKEY0KRF0EPSRΣ-εSEVSGCREHΣΛEQWQFETαGlΛLIKEEL MscϊI PPFvvτ ^τsAO ^r RIx K!^ ^lFIFSOP 5^^^woARSi*^^
IFVSrΛ-LFP WSCLVSTVHWFMEFYYεYFEiyUUJΪΣ-RAr^ KGSEEFIE FLEPEILHTT^THARVEODLRPRLGVDESGKGDFFGPLCΣAAVYASNAEΣLK κ r^εκλKAPRE E-fwχχ£ εpj sκεκ}u^ιumxεAA∞RvκDi^EPPPXκ∑^ lα,YE K 0DSK^^LKI7rKIASLARI I^LCVCDVlΣLYPEKYNE Yσ|ιJ^ LLAWAHA
YSFFIRLKS TOIilNIAPKPAGD^AISOOFAASEYTIXKALOKKETDITLIQKPRAEODvVV^^
RDAFVQSΣQKI-EIXJYQVOLPliϊlAGFNVKAAGω∑AKQRGlCELLAKISIi HFKTFDEΣCSσ
CPn_1057 1211467 1213596 K
CT356 hypothetical protein
IIHFYFFNFAMPεPLYTNKLITEKSPYIXiYAHTPVNWYPWGAEAFHIAAIENiαVFLSI CPn_1069 1223507 1223941 α:KHSRWCOVMIΛESYTNPEIAAMLJrtΥFVNVKvOI<EELrjγVAi YGDLACMLAVSσDH0 yfgA-HTH Transcriptional Regulator
ETV^WPLNVFLTPDLVPFFSVNYLGNEGKI^PSFPQIIDKIJ-FMWIiΣlAEERB^ VIMOEHXHKiα HIΛEIFRSSRESOSLSUΦvΕAATSΣRYSCI AΣEOGCLGKLISPVYA i θ£IASFLEGπ-Ri EIiJ3ESSI ΛWAALY0DIDPHYGχ*VKAFPKRIJGΣiω CGFXKKYA'TYTjGLIX*DSIWEKPYVTKIFKEFSDHNMI2u DLESMrX»NSPERAXHSWS
LEYQεSR FFVORSI^MVλΣ^JGVROKXGGGVYSYTXDDiα&IFAFεKRLIDNλLHλl.NY NLWWAGLΣΣIGGIMVWWLGSLFSΣF
LEAHACIΛKEEYtfflIGKQIL5YΣLSELYSPEVGAFYSSE0AENWCΛC<WtJrYTWSVεEIS
NAΣ£EΣλ-£IFCrjYYGISRεGFFN3^NILHIPVHRεXεε∑^εKYHP^XEλ CPn_1070 1225523 1224144
LKGIRAORSHRSrøDΣΛLTFNNrailMIWFAYAGRΣΛGEVTΥIEIGKK∞EFvTtNSLYjm No robust homolog present in Genebank/EMBL as of 11/7/98
RRSLMTFP«a«2Nr2YY tETPPPI4I-»EDXPI«EG(30Sr»CGGRVX-I^
GSr*JVU3MvΕ0AGSLLNNLLDSARM0RLGHYCYRTGTPWCREHCPGFl IΣWC«rauXL
ETVDDPDNPSAOFL∞LIMYGPΣCVGMSFCCLPHCTOKIECGEPLCπXaXOEVENGα ,
HRE*--KAAQPRCMGESLVKΣΛQNt-3ΣΛEDM(3OTP^
WI QPE0ΛPCPPPPTOEEQIΛCJλ\-3GAPAP-χKKHPAQECRV^^
CPn_1058 1213742 1214836 I.SLESGYKGPI^ΪCAAKOΣVDLΣIOCSLKRLVASDLATFIΛPGIGr^LεSO iTεVLVIJXIX
CT355 hypothetical protein SKGYLPLDPIΛPεQTVLDPRVτχ3PWQRΣIJU ^vTTrAGENΣWRCTWεAPROAPPPPDP
EVMKLYO ΣJrøI\/LVSTCX:iFΣ/3MHGGYAAEVFVrSSGYENTΛI« WDDDEIEP GrvTC«3FGΣPCOCLRCWRKLPTEKRPNRWL
FB/DEENVVτALTΛ/IHKIΛΣJLFTOSYPHLIDSFPARSOYYTAMI^PVVIJa^
AWUΛIATDPTAVWEIEEMFGPJJωPLYAHFEMSPNDIFNVIDRTLTAQRVMraOWRSK CPn_1071 1227336 1225885
VMΣ -VTPGKIREYYRKI£EEASRKVIWKYRVLTI- ANTXSLAS0IAD1CVRARUJEAKTTO No robust homolog present in Genebank/EMBL as of 11/7/98
KDRLTALVISOGGQL\TCSEEFSRENSεi_30SHKθεi_3LIC-YPKEI£σLPICAHKSGYiα.YM KC^r*rMVCPNNSWFRMCOI4Ft*:EWVEV'lTl't.LTl'H0SASDISEεAGSSG»'^
LXDKTSGSIEPLOVMESKIKQHLFALEAESVEKQYIOJRLRKRYGYDASMIAKLLSEEAPP TICVΕKRVOFOTAGGDESTIHMIQEAGεLVDSILSHRRTOOCTεYCT^
LFSLL GTU,ΣCGTYKACCTJIΪEI3-0VAGLVHECECr™σPΣAVALAAKTM3LNI lELVεKNTIΣ-^E
QKNEFROHCSEAKTOLYGT ^SWONFFLEW SIREMLDDSLVOAVLSFΣATRSWEKT
CPn_1059 1214848 1215678 ΣESεεASGTSSASNSTRΣPA 'ILNTSPLTTSRLSCGSRDARRPSSVGAεPOYVAKI YND kgsA-Dimethyladenosine Transferase )GMAR0LσKIC T^ttJ TσDFSAWPFGI ΣVKM SFI SAS0STSSΣLKHTα3ε∑C TC
VTRSSPAQLSRFLSEIONKPKKSLSONFLVTJONIVKKΣVATSEVXroDWVLEIGPGFGAL PNFRDIVVΣiMΣAIGYCPANTDCTSVvOIHMIDDPIMTIFYRIΛY-FYRTGKTSASFtJα K
TεεLIAAGAOVIA∑εKDPMFAPSΣ^ELPIRLEIIDACKYPLDOLQEYKTLGKGRVVANLP PSLVRQεSLIXPTPAESVPLMSSLεεεDεriEI)DDEDG rLAYC^RILεCSC4ILQτLFLGIK
YHΣTTPΣXTIOFLεAPDFWlπVIVMVODEVARRIVAOPGGRDYσSLTIFLOFFADIHYAF INKE
! VSASCFYPKPQVOSAVIHMI vTIΕTLPLSDεεiPv FTLTRTAFQORRirVΣAtTrLKGLYP
|χOyεQALK£ GLL ^^ RPE I^LNt)Y A FHKMQAG CPn_1072 1227924 1228835 No robust homolog present in Genebank/EMBL as of 11/7/98
CPn_1060 1217694 1215727 TVGA IVGFFNSADAAPKKKKΣP∑OILYSFT dxs/tkt -Transketolase WSSY CNEnASTIFCVrΛTORGLIΛHRYΣΛSPσWεTRRROLFKSLENOSYσNERLGEET
YKRFLYIHΣTKVMTSSSCPΣ BLΣLSPADLKKΣ^ΣSOLPGLAεε∑RYRΣΣSVLSOTGGHL LAIDIFRNKECLESεiPEOMEAIl U SSALVLGΣSSFGITGΣPATLHSLLRQNLSFQKRS
SSNXGIVΕLTΣALHYVFSSPI DKFΣFDVσHO YPHKLLTGRNNεGFDHΣRNDNGLSGFTN ΣASESFLLKΣDSAPSDASVFYKGVLFRGεTAΣVDALSOI.FAOLDLSPKKIΣFLGEDPEVV
PTεSDHDLFFSGHAσTALSLALGMAOTTPLESRTHVΣ P Σ LGDAAFSCGLTLεALNNISTD 0AvrJSACiσWGMNFLGLVYYPA0εSLFSYVHPYSTATεL0εA0GLQVΣSDEVAOt.TrjlAL
LSKFVVIU4IΪ«lMSISKNvX»MSRIFSRWLHHPATNKLTKOVεKWLAKIPRYGDSLAKHS PKMN
RRlJ!OCVK LFCP PLFE0FσIΛYWPIDGH^^VKKLΣPΣ 0SV ^rLPFPΣLVHVCTTKGK
GLDOAQWPAKYHGVRANFNKRεSAKHLPAΣKPKPSFPDΣFGO LCεLGεVSSRL'røvTP CPn_1073 1229011 1229832
AMSΣGSRLEGFKOKFPεRFFDVGΣAεGHAVTFSAGΣAKAGNPVΣCSΣYSTFLHRALDNVF Predicted OMP (CT37D
HDVTΛ10DLPVIFAIDRAGLAYGDGRSHHGIYDMSFLRAMPOMIΣC0PRSOVVFQQI-t.YSS NWRYLF fVLALCLYRAAPLεAWΣKΣ'roA0AVLKFARεi rLVCFN∑εirrVVTPKCWΛ30S
LHWSSPSAΣRYPNΣPAPHGDPLTCDPNFLRSPCMAεTI^QGεDVLΣ ΣALGTLCFTALSΣK AWLYNRεLDLK'T LSεεOARεOAFLεWMGISFLVDYεLVGANLRNVLTCLSLKRSWVLGI
HQLLAYCΣSATVVOPΣFIKPFDNDLFSLLLMSHSKVITiεεHSIRGGLASεFNNFVATFN SORPVHLΣIO'TLRΣLR≤FNΣDFτSCPAΣCεDCWLSHPTKDTTFDOAMA∑εKNΣLFVGSLK
FKvO∑ωFAΣPOTFLSHGSKεALTKSΣCLDESSMTNRILTHFNFRSKKQTVGDVRV NCOPMDAALεVLLSGΣSSPPSQΣ ΣYVDODAεRLRSΣGAFCKKANΣYFΣGMLYTPAKQRVε
3YNPKLTΛIOW≤0∑RKNLSDεYYεSLLSYVKSK
CPn_1061 1217932 1217666
CT330 hypothetical protein
FGSLMVε∑HHKDPSLJ KLFALQOSLεTLNSLSDΣVATΥεAMFSLΣYεGLNKALRKD0LCY LL≤VNSKGεLLKSPSGDPΣVQTFPΣHPHH RNA GεCTΣON Pn_10S2 1219835 1213159 xseA -Ex doxyribonuclease VΣ I
R(;FPVM:;3PP0AVΛSLTεR tKTLLε3NFC0Σ IVKGEL3NV3L0Pr;GHLYFGΣKDS0AFLN πnRNΛ I lrt vn | 19074 .ΛFFIIFF';KY"ι'DP PKDr,DAVt IHC:KLAVYAPRαθYQIVΛIIΛLVYAGEGDLL KFεETKR i πYTU-IirΛ.;UVI'APTP-:ΛΛACtVl-K33EE0V0VFEGYLRIILLiJIIJR0LLTGKK0 .LLPW I'I'F IXiRΛεFYTTΛCΛXJLD.'; ICtΛIQKUVOCK IliεSKORYDN t:,'RWLIJI.;DLV3PMTCRL0S i
' YFIIKIITI'I.KIIΛIItlVLE(."jLR'JIIVOKI,ELLCPRL:;r'i- CIΛLON'JK IAYΛNVKETLΛTIL I HRYEN.','/ΛHY' ALKE UI.:LNPKNVLKRGYAMLFDrtlEN:;ΛMI:JVD:;L ENARVR I LQ KIEAILTTTNIEfKLIKi; 51
tRNA Begin End Type Codon
1 89657 89728 Thr GGT
2 90998 91070 Trp CCA
5 296075 296147 Val TAC
6 296151 296224 Asp GTC
7 409848 409922 Pro TCG
8 462141 462214 Arg CCT
9 672236 672318 Leu CAA
10 677264 677337 Arg TCG
11 739403 739486 Leu CAG
12 781610 781680 Gly TCC
13 784822 784896 Glu TTC
14 784922 784994 Lys TTT
15 836119 836191 Ala GGC
16 843926 843999 Pro GGG
17 877400 877473 Arg ACG
18 1085605 1085676 Gln TTG
19 1142034 1142118 Ser TGA
20 1175863 1175944 Leu TAG
21 1230028 1229942 Ser CGA
22 1137462 1137389 Val GAC
23 1030603 1030533 Cys GCA
24 1000022 999949 His GTG
25 961607 961536 Gly GCC
26 807413 807341 Arg CT
27 786780 786708 Thr CGT
28 715971 715889 Leu TAA
29 708441 708354 Ser GCT
30 680259 680178 Leu GAG
31 631445 631373 Phe GAA
32 626987 626901 Ser GGA
33 293477 293405 Thr TGT
34 293399 293317 Tyr GTA
35 269142 269070 Ala TGC
36 269065 268992 Ile GAT
37 164389 164318 Asn GTT
38 87522 87450 Met CAT
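The Begin and End columns of the tRNA table can be turned into a strand and a gene length with a few lines of Python. The sketch below is illustrative only. It assumes that a Begin value greater than End marks a gene on the complementary strand, and that coordinates are 1-based and inclusive; the table itself states neither convention. The names TrnaGene and parse_row are hypothetical, and the two sample rows are taken from the table.

```python
from typing import NamedTuple


class TrnaGene(NamedTuple):
    """One row of the tRNA table (hypothetical helper type)."""
    begin: int
    end: int
    amino_acid: str
    codon: str

    @property
    def strand(self) -> str:
        # Assumption: Begin < End is the forward strand, Begin > End the reverse.
        return "+" if self.begin < self.end else "-"

    @property
    def length(self) -> int:
        # Assumption: coordinates are 1-based and inclusive.
        return abs(self.end - self.begin) + 1


def parse_row(row: str) -> TrnaGene:
    """Parse a whitespace-separated row: index, begin, end, type, codon."""
    _index, begin, end, amino_acid, codon = row.split()
    return TrnaGene(int(begin), int(end), amino_acid, codon)


rows = [
    "7 409848 409922 Pro TCG",
    "27 786780 786708 Thr CGT",
]
genes = [parse_row(r) for r in rows]
for gene in genes:
    print(gene.amino_acid, gene.codon, gene.strand, gene.length)
```

Under these assumptions the Pro gene is reported on the forward strand and the Thr gene on the reverse strand.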

Claims

What is Claimed is:
1. An isolated nucleic acid encoding a C. pneumoniae protein as set forth in Table 3.
2. The isolated nucleic acid of Claim 1, wherein said nucleic acid has a nucleotide sequence of an open reading frame in SEQ ID NO:1.
3. A probe comprising a hybridizing fragment of an isolated nucleic acid according to Claim 2.
5. An isolated nucleic acid that hybridizes under stringent conditions to the nucleic acid sequence of Claim 2.
6. An expression cassette comprising a transcriptional initiation region functional in an expression host, a nucleic acid having a sequence of the isolated nucleic acid according to Claim 1 under the transcriptional regulation of said transcriptional initiation region, and a transcriptional termination region functional in said expression host.
7. A cell comprising an expression cassette according to Claim 6 as part of an extrachromosomal element or integrated into the genome of a host cell as a result of introduction of said expression cassette into said host cell, and the cellular progeny of said host cell.
8. A method for producing a C. pneumoniae protein, said method comprising: growing a cell according to Claim 7, whereby said C. pneumoniae protein is expressed; and isolating said C. pneumoniae protein free of other proteins.
9. A purified polypeptide composition comprising at least 50 weight % of the protein present as a C. pneumoniae protein comprising an amino acid sequence of Claim 1.
10. A monoclonal antibody binding specifically to the polypeptide of Claim 9.
PCT/US1999/026923 1998-11-12 1999-11-12 Chlamydia pneumoniae genome sequence WO2000027994A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP99960323A EP1133572A4 (en) 1998-11-12 1999-11-12 Chlamydia pneumoniae genome sequence
JP2000581161A JP2002529069A (en) 1998-11-12 1999-11-12 Chlamydia pneumoniae genome sequence
CA002350775A CA2350775A1 (en) 1998-11-12 1999-11-12 Chlamydia pneumoniae genome sequence
AU17223/00A AU1722300A (en) 1998-11-12 1999-11-12 Chlamydia pneumoniae genome sequence

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US10827998P 1998-11-12 1998-11-12
US60/108,279 1998-11-12
US12860699P 1999-04-08 1999-04-08
US60/128,606 1999-04-08

Publications (2)

Publication Number Publication Date
WO2000027994A2 true WO2000027994A2 (en) 2000-05-18
WO2000027994A3 WO2000027994A3 (en) 2000-11-23

Family

ID=26805735

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/026923 WO2000027994A2 (en) 1998-11-12 1999-11-12 Chlamydia pneumoniae genome sequence

Country Status (5)

Country Link
EP (1) EP1133572A4 (en)
JP (1) JP2002529069A (en)
AU (1) AU1722300A (en)
CA (1) CA2350775A1 (en)
WO (1) WO2000027994A2 (en)

Cited By (74)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1105490A1 (en) 1998-08-20 2001-06-13 Aventis Pasteur Limited Nucleic acid molecules encoding inclusion membrane protein c of chlamydia
EP1105489A1 (en) 1998-08-20 2001-06-13 Aventis Pasteur Limited Nucleic acid molecules encoding pomp91a protein of chlamydia
EP1124965A1 (en) 1998-10-28 2001-08-22 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
EP1127065A2 (en) 1998-11-02 2001-08-29 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
EP1140997A2 (en) 1998-12-18 2001-10-10 CHIRON S.p.A. Chlamydia trachomatis antigens
EP1140998A1 (en) 1998-12-28 2001-10-10 Aventis Pasteur Limited CHLAMYDIA ANTIGENS AND CORRESPONDING DNA FRAGMENTS AND USES THEREOF
EP1144642A2 (en) 1998-12-08 2001-10-17 Corixa Corporation Compounds and methods for treatment and diagnosis of chlamydial infection
EP1149162A2 (en) 1999-02-05 2001-10-31 Neutec Pharma PLC Medicament
EP1163342A1 (en) 1999-03-12 2001-12-19 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
EP1165591A1 (en) * 1999-03-26 2002-01-02 Human Genome Sciences, Inc. 47 human secreted proteins
EP1165828A1 (en) * 1999-03-26 2002-01-02 Human Genome Sciences, Inc. 50 human secreted proteins
EP1177301A2 (en) 1999-05-03 2002-02-06 Aventis Pasteur Limited CHLAMYDIA ANTIGENS AND CORRESPONDING DNA FRAGMENTS AND USES THEREOF
EP1219635A2 (en) * 2000-12-21 2002-07-03 Shire Biochem Inc. Chlamydia pneumoniae antigenes
EP1220925A1 (en) 1999-09-20 2002-07-10 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
EP1222283A1 (en) 1999-09-22 2002-07-17 The University of Manitoba DNA IMMUNIZATION AGAINST CHLAMYDIA INFECTION
EP1240331A2 (en) 1999-12-22 2002-09-18 Aventis Pasteur Limited CHLAMYDIA ANTIGENS AND CORRESPONDING DNA FRAGMENTS AND USES THEREOF
EP1278855A2 (en) 2000-04-21 2003-01-29 Corixa Corporation Compounds and methods for treatment and diagnosis of chlamydial infection
EP1282718A2 (en) 2000-05-08 2003-02-12 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
EP1297005A2 (en) 2000-07-03 2003-04-02 CHIRON S.p.A. Immunisation against chlamydia pneumoniae
EP1307564A2 (en) 2000-07-20 2003-05-07 Corixa Corporation Compounds and methods for treatment and diagnosis of chlamydial infection
WO2003070909A2 (en) 2002-02-20 2003-08-28 Chiron Corporation Microparticles with adsorbed polypeptide-containing molecules
US7166289B2 (en) 1998-08-20 2007-01-23 Sanofi Pasteur Limited Nucleic acid molecules encoding inclusion membrane protein C of Chlamydia
US7297341B1 (en) 1998-12-23 2007-11-20 Sanofi Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
US7326545B2 (en) 1998-12-01 2008-02-05 Sanofi Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
US7335370B2 (en) 1998-12-28 2008-02-26 Aventis Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
US7361353B2 (en) 2001-12-12 2008-04-22 Novartis Vaccines And Diagnostics, Inc. Immunisation against Chlamydia trachomatis
WO2009034473A2 (en) 2007-09-12 2009-03-19 Novartis Ag Gas57 mutant antigens and gas57 antibodies
US7537772B1 (en) 2000-10-02 2009-05-26 Emergent Product Development Gaithersburg Inc. Chlamydia protein, gene sequence and the uses thereof
US7553493B2 (en) 1998-12-23 2009-06-30 Sanofi Pasteur Limited Chlamydia flagellar protein antigen
EP2179729A1 (en) 2003-06-02 2010-04-28 Novartis Vaccines and Diagnostics, Inc. Immunogenic compositions based on microparticles comprising adsorbed toxoid and a polysaccharide-containing antigen
WO2010049806A1 (en) 2008-10-27 2010-05-06 Novartis Ag Purification method
US7731980B2 (en) 2000-10-02 2010-06-08 Emergent Product Development Gaithersburg Inc. Chlamydia PMP proteins, gene sequences and uses thereof
WO2010078556A1 (en) 2009-01-05 2010-07-08 Epitogenesis Inc. Adjuvant compositions and methods of use
WO2010078027A1 (en) * 2008-12-17 2010-07-08 Genocea Biosciences, Inc. Chlamydia antigens and uses thereof
US7754228B2 (en) 2002-02-13 2010-07-13 Novartis Vaccines And Diagnostics, Srl Cytotoxic T-cell epitopes from Chlamydia
WO2010079464A1 (en) 2009-01-12 2010-07-15 Novartis Ag Cna_b domain antigens in vaccines against gram positive bacteria
EP2255827A1 (en) 2001-07-26 2010-12-01 Novartis Vaccines and Diagnostics S.r.l. Vaccines comprising aluminium adjuvants and histidine
EP2258390A1 (en) 2002-08-30 2010-12-08 Novartis Vaccines and Diagnostics S.r.l. Improved bacterial outer membrane vesicles
EP2258716A2 (en) 2002-11-22 2010-12-08 Novartis Vaccines and Diagnostics S.r.l. Multiple variants of meningococcal protein NMB1870
EP2258365A1 (en) 2003-03-28 2010-12-08 Novartis Vaccines and Diagnostics, Inc. Use of organic compounds for immunopotentiation
EP2263688A1 (en) 2001-06-20 2010-12-22 Novartis AG Neisseria meningitidis combination vaccines
EP2267005A1 (en) 2003-04-09 2010-12-29 Novartis Vaccines and Diagnostics S.r.l. ADP-ribosylating toxin from Listeria monocytogenes
EP2270176A1 (en) 2001-03-27 2011-01-05 Novartis Vaccines and Diagnostics S.r.l. Streptococcus pneumoniae proteins and nucleic acids
EP2277895A1 (en) 2000-10-27 2011-01-26 Novartis Vaccines and Diagnostics S.r.l. Nucleic acids and proteins from streptococcus groups A & B
EP2277595A2 (en) 2004-06-24 2011-01-26 Novartis Vaccines and Diagnostics, Inc. Compounds for immunopotentiation
EP2279746A2 (en) 2002-11-15 2011-02-02 Novartis Vaccines and Diagnostics S.r.l. Surface proteins in neisseria meningitidis
EP2279747A1 (en) 2004-10-29 2011-02-02 Novartis Vaccines and Diagnostics S.r.l. Immunogenic bacterial vesicles with outer membrane proteins
US7901907B2 (en) 1996-01-04 2011-03-08 The Provost Fellows And Scholars Of The College Of The Holy And Undivided Trinity Of Queen Elizabeth Near Dublin Process for production of Helicobacter pylori bacterioferritin
EP2298795A1 (en) 2005-02-18 2011-03-23 Novartis Vaccines and Diagnostics, Inc. Immunogens from uropathogenic escherichia coli
EP2298796A2 (en) 2001-03-27 2011-03-23 Novartis Vaccines and Diagnostics S.r.l. Staphylococcus aureus proteins and nucleic acids
WO2011051917A1 (en) 2009-10-30 2011-05-05 Novartis Ag Purification of staphylococcus aureus type 5 and type 8 capsular saccharides
EP2327719A1 (en) 2001-09-06 2011-06-01 Novartis Vaccines and Diagnostics S.r.l. Hybrid and tandem expression of neisserial proteins
EP2351772A1 (en) 2005-02-18 2011-08-03 Novartis Vaccines and Diagnostics, Inc. Proteins and nucleic acids from meningitis/sepsis-associated Escherichia coli
EP2357184A1 (en) 2006-03-23 2011-08-17 Novartis AG Imidazoquinoxaline compounds as immunomodulators
EP2357000A1 (en) 2005-10-18 2011-08-17 Novartis Vaccines and Diagnostics, Inc. Mucosal and systemic immunizations with alphavirus replicon particles
EP2360175A2 (en) 2005-11-22 2011-08-24 Novartis Vaccines and Diagnostics, Inc. Norovirus and Sapovirus virus-like particles (VLPs)
WO2011138636A1 (en) 2009-09-30 2011-11-10 Novartis Ag Conjugation of staphylococcus aureus type 5 and type 8 capsular polysaccharides
WO2011149564A1 (en) 2010-05-28 2011-12-01 Tetris Online, Inc. Interactive hybrid asynchronous computer game infrastructure
WO2012035519A1 (en) 2010-09-16 2012-03-22 Novartis Ag Immunogenic compositions
WO2012085668A2 (en) 2010-12-24 2012-06-28 Novartis Ag Compounds
EP2537857A2 (en) 2007-12-21 2012-12-26 Novartis AG Mutant forms of streptolysin O
EP2548895A1 (en) 2007-01-11 2013-01-23 Novartis AG Modified saccharides
WO2013038375A2 (en) 2011-09-14 2013-03-21 Novartis Ag Methods for making saccharide-protein glycoconjugates
US8409587B2 (en) 2002-11-01 2013-04-02 Glaxosmithkline Biologicals S.A. Immunogenic composition
EP2583678A2 (en) 2004-06-24 2013-04-24 Novartis Vaccines and Diagnostics, Inc. Small molecule immunopotentiators and assays for their detection
EP2586790A2 (en) 2006-08-16 2013-05-01 Novartis AG Immunogens from uropathogenic Escherichia coli
WO2013068949A1 (en) 2011-11-07 2013-05-16 Novartis Ag Carrier molecule comprising a spr0096 and a spr2021 antigen
EP2612679A1 (en) 2004-07-29 2013-07-10 Novartis Vaccines and Diagnostics, Inc. Immunogenic compositions for gram positive bacteria such as streptococcus agalactiae
US8568732B2 (en) 2009-03-06 2013-10-29 Novartis Ag Chlamydia antigens
WO2013174832A1 (en) 2012-05-22 2013-11-28 Novartis Ag Meningococcus serogroup x conjugate
JP2016106117A (en) * 2010-10-20 2016-06-16 Genocea Biosciences, Inc. Chlamydia antigens and uses thereof
WO2017175082A1 (en) 2016-04-05 2017-10-12 Gsk Vaccines S.R.L. Immunogenic compositions
EP3498302A1 (en) 2005-02-01 2019-06-19 Novartis Vaccines and Diagnostics S.r.l. Conjugation of streptococcal capsular saccharides to carrier proteins
US10561720B2 (en) 2011-06-24 2020-02-18 EpitoGenesis, Inc. Pharmaceutical compositions, comprising a combination of select carriers, vitamins, tannins and flavonoids as antigen-specific immuno-modulators

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108514870B (en) * 2018-04-27 2020-02-28 Hunan University Hydrotalcite-poly(m-phenylenediamine) composite material and preparation method and application thereof

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5374718A (en) * 1992-08-26 1994-12-20 Gen-Probe Incorporated Nucleic acid probes to chlamydia pneumoniae

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08294400A (en) * 1995-04-28 1996-11-12 Hitachi Chem Co Ltd Probe and primer for detecting and measuring chlamydia pneumoniae gene, detection and measurement of chlamydia pneumoniae gene using the same probe or primer and reagent for detecting and measuring chlamydia pneumoniae gene containing the same probe or primer
JPH10210978A (en) * 1997-01-31 1998-08-11 Hitachi Chem Co Ltd Recombinant vector and transformant containing the same, and recombinant baculovirus and its production, and production of chlamydia pneumoniae antigen polypeptide

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5374718A (en) * 1992-08-26 1994-12-20 Gen-Probe Incorporated Nucleic acid probes to chlamydia pneumoniae
US5683870A (en) * 1992-08-26 1997-11-04 Gen-Probe Incorporated Nucleic acid probes to Chlamydia pneumoniae

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP1133572A2 *

Cited By (123)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7901907B2 (en) 1996-01-04 2011-03-08 The Provost Fellows And Scholars Of The College Of The Holy And Undivided Trinity Of Queen Elizabeth Near Dublin Process for production of Helicobacter pylori bacterioferritin
EP1105490A1 (en) 1998-08-20 2001-06-13 Aventis Pasteur Limited Nucleic acid molecules encoding inclusion membrane protein c of chlamydia
EP1105489A1 (en) 1998-08-20 2001-06-13 Aventis Pasteur Limited Nucleic acid molecules encoding pomp91a protein of chlamydia
US7166289B2 (en) 1998-08-20 2007-01-23 Sanofi Pasteur Limited Nucleic acid molecules encoding inclusion membrane protein C of Chlamydia
EP1124965A1 (en) 1998-10-28 2001-08-22 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
EP1127065A2 (en) 1998-11-02 2001-08-29 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
US7326545B2 (en) 1998-12-01 2008-02-05 Sanofi Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
US7736873B2 (en) 1998-12-01 2010-06-15 Sanofi Pasteur Limited Chlamydia polypeptides and corresponding DNA fragments and uses thereof
US8263089B2 (en) 1998-12-08 2012-09-11 Corixa Corporation Compounds and methods for treatment and diagnosis of chlamydial infection
EP1144642A2 (en) 1998-12-08 2001-10-17 Corixa Corporation Compounds and methods for treatment and diagnosis of chlamydial infection
US8052975B2 (en) 1998-12-08 2011-11-08 Corixa Corporation Compounds and methods for treatment and diagnosis of chlamydial infection
EP1140997A2 (en) 1998-12-18 2001-10-10 CHIRON S.p.A. Chlamydia trachomatis antigens
US7297341B1 (en) 1998-12-23 2007-11-20 Sanofi Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
US7850980B2 (en) 1998-12-23 2010-12-14 Sanofi Pasteur Limited Chlamydia OMP antigen
US7553493B2 (en) 1998-12-23 2009-06-30 Sanofi Pasteur Limited Chlamydia flagellar protein antigen
US7335370B2 (en) 1998-12-28 2008-02-26 Aventis Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
EP1140998A1 (en) 1998-12-28 2001-10-10 Aventis Pasteur Limited CHLAMYDIA ANTIGENS AND CORRESPONDING DNA FRAGMENTS AND USES THEREOF
EP1149162A2 (en) 1999-02-05 2001-10-31 Neutec Pharma PLC Medicament
US7629327B2 (en) 1999-03-12 2009-12-08 Sanofi Pasteur Limited Chlamydia 60 Kda CRMP antigens and vaccine uses
US7183402B2 (en) 1999-03-12 2007-02-27 Sanofi Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
US7285276B2 (en) 1999-03-12 2007-10-23 Sanofi Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
EP1163342A1 (en) 1999-03-12 2001-12-19 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
EP1165828A4 (en) * 1999-03-26 2002-09-25 Human Genome Sciences Inc 50 human secreted proteins
EP1165591A4 (en) * 1999-03-26 2002-09-25 Human Genome Sciences Inc 47 human secreted proteins
EP1165828A1 (en) * 1999-03-26 2002-01-02 Human Genome Sciences, Inc. 50 human secreted proteins
EP1165591A1 (en) * 1999-03-26 2002-01-02 Human Genome Sciences, Inc. 47 human secreted proteins
US7658934B2 (en) 1999-05-03 2010-02-09 Sanofi Pasteur Limited Chlamydia antigens and protein vaccine
US7595058B2 (en) 1999-05-03 2009-09-29 Sanofi Pasteur Limited Chlamydia antigens and vaccine uses of the protein
EP1177301A2 (en) 1999-05-03 2002-02-06 Aventis Pasteur Limited CHLAMYDIA ANTIGENS AND CORRESPONDING DNA FRAGMENTS AND USES THEREOF
EP1220925A1 (en) 1999-09-20 2002-07-10 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
US7662391B2 (en) 1999-09-20 2010-02-16 Sanofi Pasteur Limited Chlamydia outer membrane protein (OMP) and vaccine uses of the protein
US7314869B2 (en) 1999-09-20 2008-01-01 Sanofi Pasteur Limited Chlamydia antigens and corresponding DNA fragments and uses thereof
EP1222283A1 (en) 1999-09-22 2002-07-17 The University of Manitoba DNA IMMUNIZATION AGAINST CHLAMYDIA INFECTION
EP1240331A2 (en) 1999-12-22 2002-09-18 Aventis Pasteur Limited CHLAMYDIA ANTIGENS AND CORRESPONDING DNA FRAGMENTS AND USES THEREOF
EP1278855A2 (en) 2000-04-21 2003-01-29 Corixa Corporation Compounds and methods for treatment and diagnosis of chlamydial infection
EP1282718A2 (en) 2000-05-08 2003-02-12 Aventis Pasteur Limited Chlamydia antigens and corresponding dna fragments and uses thereof
JP2004502415A (en) * 2000-07-03 2004-01-29 Chiron S.p.A. Immunization against Chlamydia pneumoniae
EP1297005A2 (en) 2000-07-03 2003-04-02 CHIRON S.p.A. Immunisation against chlamydia pneumoniae
EP1307564A2 (en) 2000-07-20 2003-05-07 Corixa Corporation Compounds and methods for treatment and diagnosis of chlamydial infection
US7537772B1 (en) 2000-10-02 2009-05-26 Emergent Product Development Gaithersburg Inc. Chlamydia protein, gene sequence and the uses thereof
US7803388B2 (en) 2000-10-02 2010-09-28 Emergent Product Development Gaithersburg, Inc. Chlamydia PMP proteins, gene sequences and uses thereof
US7851609B2 (en) 2000-10-02 2010-12-14 Emergent Product Development Gaithersburg Inc. Chlamydia PMP proteins, gene sequences and uses thereof
US7731980B2 (en) 2000-10-02 2010-06-08 Emergent Product Development Gaithersburg Inc. Chlamydia PMP proteins, gene sequences and uses thereof
EP2284183A1 (en) 2000-10-27 2011-02-16 Novartis Vaccines and Diagnostics S.r.l. Nucleic acids and proteins from streptococcus groups A and B
EP2277894A1 (en) 2000-10-27 2011-01-26 Novartis Vaccines and Diagnostics S.r.l. Nucleic acids and proteins from streptococcus groups A & B
EP2277896A1 (en) 2000-10-27 2011-01-26 Novartis Vaccines and Diagnostics S.r.l. Nucleic acids and proteins from streptococcus groups A & B
EP2277895A1 (en) 2000-10-27 2011-01-26 Novartis Vaccines and Diagnostics S.r.l. Nucleic acids and proteins from streptococcus groups A & B
EP2284181A1 (en) 2000-10-27 2011-02-16 Novartis Vaccines and Diagnostics S.r.l. Nucleic acids and proteins from streptococcus groups A and B
EP2284182A1 (en) 2000-10-27 2011-02-16 Novartis Vaccines and Diagnostics S.r.l. Nucleic acids and proteins from streptococcus groups A and B
EP1219635A3 (en) * 2000-12-21 2003-10-08 Shire Biochem Inc. Chlamydia pneumoniae antigenes
EP1219635A2 (en) * 2000-12-21 2002-07-03 Shire Biochem Inc. Chlamydia pneumoniae antigenes
EP2314697A1 (en) 2001-03-27 2011-04-27 Novartis Vaccines and Diagnostics S.r.l. Streptococcus pneumoniae proteins and nucleic acids
EP2298796A2 (en) 2001-03-27 2011-03-23 Novartis Vaccines and Diagnostics S.r.l. Staphylococcus aureus proteins and nucleic acids
EP2278009A1 (en) 2001-03-27 2011-01-26 Novartis Vaccines and Diagnostics S.r.l. Streptococcus pneumoniae proteins and nucleic acids
EP2270177A1 (en) 2001-03-27 2011-01-05 Novartis Vaccines and Diagnostics S.r.l. Streptococcus pneumoniae proteins and nucleic acids
EP2278008A2 (en) 2001-03-27 2011-01-26 Novartis Vaccines and Diagnostics S.r.l. Streptococcus pneumoniae proteins and nucleic acids
EP2270175A1 (en) 2001-03-27 2011-01-05 Novartis Vaccines and Diagnostics S.r.l. Streptococcus pneumoniae proteins and nucleic acids
EP2278010A1 (en) 2001-03-27 2011-01-26 Novartis Vaccines and Diagnostics S.r.l. Streptococcus pneumoniae proteins and nucleic acids
EP2270176A1 (en) 2001-03-27 2011-01-05 Novartis Vaccines and Diagnostics S.r.l. Streptococcus pneumoniae proteins and nucleic acids
EP2263688A1 (en) 2001-06-20 2010-12-22 Novartis AG Neisseria meningitidis combination vaccines
EP2277536A2 (en) 2001-06-20 2011-01-26 Novartis AG Purification of bacterial capsular polysaccharides
EP2277537A2 (en) 2001-06-20 2011-01-26 Novartis AG Neisseria meningitidis conjugate combination vaccine
EP2277539A2 (en) 2001-06-20 2011-01-26 Novartis AG Neisseria meningitidis conjugate combination vaccine
EP2266605A1 (en) 2001-07-26 2010-12-29 Novartis Vaccines and Diagnostics S.r.l. Vaccines comprising aluminium adjuvants and histidine
EP2255827A1 (en) 2001-07-26 2010-12-01 Novartis Vaccines and Diagnostics S.r.l. Vaccines comprising aluminium adjuvants and histidine
EP2360176A2 (en) 2001-09-06 2011-08-24 Novartis Vaccines and Diagnostics S.r.l. Hybrid and tandem expression of neisserial derived proteins
EP2327719A1 (en) 2001-09-06 2011-06-01 Novartis Vaccines and Diagnostics S.r.l. Hybrid and tandem expression of neisserial proteins
EP2829549A2 (en) 2001-09-06 2015-01-28 Novartis Vaccines and Diagnostics S.r.l. Hybrid and tandem expression of neisserial derived proteins
US7842297B2 (en) 2001-12-12 2010-11-30 Novartis Vaccines And Diagnostics Srl Immunisation against chlamydia trachomatis
EP2335724A1 (en) 2001-12-12 2011-06-22 Novartis Vaccines and Diagnostics S.r.l. Immunisation against chlamydia trachomatis
EP2335723A1 (en) 2001-12-12 2011-06-22 Novartis Vaccines and Diagnostics S.r.l. Immunisation against chlamydia trachomatis
US7361353B2 (en) 2001-12-12 2008-04-22 Novartis Vaccines And Diagnostics, Inc. Immunisation against Chlamydia trachomatis
US7754228B2 (en) 2002-02-13 2010-07-13 Novartis Vaccines And Diagnostics, Srl Cytotoxic T-cell epitopes from Chlamydia
EP2572707A2 (en) 2002-02-20 2013-03-27 Novartis Vaccines and Diagnostics, Inc. Microparticles with adsorbed polypeptide-containing molecules
WO2003070909A2 (en) 2002-02-20 2003-08-28 Chiron Corporation Microparticles with adsorbed polypeptide-containing molecules
EP2258390A1 (en) 2002-08-30 2010-12-08 Novartis Vaccines and Diagnostics S.r.l. Improved bacterial outer membrane vesicles
EP2258389A1 (en) 2002-08-30 2010-12-08 Novartis Vaccines and Diagnostics S.r.l. Improved bacterial outer membrane vesicles
EP2258388A1 (en) 2002-08-30 2010-12-08 Novartis Vaccines and Diagnostics S.r.l. Improved bacterial outer membrane vesicles
US8409587B2 (en) 2002-11-01 2013-04-02 Glaxosmithkline Biologicals S.A. Immunogenic composition
EP2279746A2 (en) 2002-11-15 2011-02-02 Novartis Vaccines and Diagnostics S.r.l. Surface proteins in neisseria meningitidis
EP2261239A2 (en) 2002-11-22 2010-12-15 Novartis Vaccines and Diagnostics S.r.l. Multiple variants of meningococcal protein NMB1870
EP2258717A2 (en) 2002-11-22 2010-12-08 Novartis Vaccines and Diagnostics S.r.l. Variant form of meningococcal NadA
EP2258716A2 (en) 2002-11-22 2010-12-08 Novartis Vaccines and Diagnostics S.r.l. Multiple variants of meningococcal protein NMB1870
EP2258365A1 (en) 2003-03-28 2010-12-08 Novartis Vaccines and Diagnostics, Inc. Use of organic compounds for immunopotentiation
EP2267005A1 (en) 2003-04-09 2010-12-29 Novartis Vaccines and Diagnostics S.r.l. ADP-ribosylating toxin from Listeria monocytogenes
EP2179729A1 (en) 2003-06-02 2010-04-28 Novartis Vaccines and Diagnostics, Inc. Immunogenic compositions based on microparticles comprising adsorbed toxoid and a polysaccharide-containing antigen
EP2277595A2 (en) 2004-06-24 2011-01-26 Novartis Vaccines and Diagnostics, Inc. Compounds for immunopotentiation
EP2583678A2 (en) 2004-06-24 2013-04-24 Novartis Vaccines and Diagnostics, Inc. Small molecule immunopotentiators and assays for their detection
EP2612679A1 (en) 2004-07-29 2013-07-10 Novartis Vaccines and Diagnostics, Inc. Immunogenic compositions for gram positive bacteria such as streptococcus agalactiae
EP2279747A1 (en) 2004-10-29 2011-02-02 Novartis Vaccines and Diagnostics S.r.l. Immunogenic bacterial vesicles with outer membrane proteins
EP3498302A1 (en) 2005-02-01 2019-06-19 Novartis Vaccines and Diagnostics S.r.l. Conjugation of streptococcal capsular saccharides to carrier proteins
EP2351772A1 (en) 2005-02-18 2011-08-03 Novartis Vaccines and Diagnostics, Inc. Proteins and nucleic acids from meningitis/sepsis-associated Escherichia coli
EP2298795A1 (en) 2005-02-18 2011-03-23 Novartis Vaccines and Diagnostics, Inc. Immunogens from uropathogenic escherichia coli
EP2357000A1 (en) 2005-10-18 2011-08-17 Novartis Vaccines and Diagnostics, Inc. Mucosal and systemic immunizations with alphavirus replicon particles
EP2360175A2 (en) 2005-11-22 2011-08-24 Novartis Vaccines and Diagnostics, Inc. Norovirus and Sapovirus virus-like particles (VLPs)
EP2357184A1 (en) 2006-03-23 2011-08-17 Novartis AG Imidazoquinoxaline compounds as immunomodulators
EP2586790A2 (en) 2006-08-16 2013-05-01 Novartis AG Immunogens from uropathogenic Escherichia coli
EP2548895A1 (en) 2007-01-11 2013-01-23 Novartis AG Modified saccharides
WO2009034473A2 (en) 2007-09-12 2009-03-19 Novartis Ag Gas57 mutant antigens and gas57 antibodies
EP2537857A2 (en) 2007-12-21 2012-12-26 Novartis AG Mutant forms of streptolysin O
WO2010049806A1 (en) 2008-10-27 2010-05-06 Novartis Ag Purification method
WO2010078027A1 (en) * 2008-12-17 2010-07-08 Genocea Biosciences, Inc. Chlamydia antigens and uses thereof
WO2010078556A1 (en) 2009-01-05 2010-07-08 Epitogenesis Inc. Adjuvant compositions and methods of use
US9180184B2 (en) 2009-01-05 2015-11-10 EpitoGenesis, Inc. Adjuvant compositions and methods of use
US8425922B2 (en) 2009-01-05 2013-04-23 EpitoGenesis, Inc. Adjuvant compositions and methods of use
WO2010079464A1 (en) 2009-01-12 2010-07-15 Novartis Ag Cna_b domain antigens in vaccines against gram positive bacteria
US9675683B2 (en) 2009-03-06 2017-06-13 Glaxosmithkline Biologicals S.A. Chlamydia antigens
US9151756B2 (en) 2009-03-06 2015-10-06 Glaxosmithkline Biologicals Sa Chlamydia antigens
US10716842B2 (en) 2009-03-06 2020-07-21 Glaxosmithkline Biologicals Sa Chlamydia antigens
US8568732B2 (en) 2009-03-06 2013-10-29 Novartis Ag Chlamydia antigens
WO2011138636A1 (en) 2009-09-30 2011-11-10 Novartis Ag Conjugation of staphylococcus aureus type 5 and type 8 capsular polysaccharides
WO2011051917A1 (en) 2009-10-30 2011-05-05 Novartis Ag Purification of staphylococcus aureus type 5 and type 8 capsular saccharides
EP3199177A1 (en) 2009-10-30 2017-08-02 GlaxoSmithKline Biologicals S.A. Purification of staphylococcus aureus type 5 and type 8 capsular saccharides
WO2011149564A1 (en) 2010-05-28 2011-12-01 Tetris Online, Inc. Interactive hybrid asynchronous computer game infrastructure
WO2012035519A1 (en) 2010-09-16 2012-03-22 Novartis Ag Immunogenic compositions
JP2016106117A (en) * 2010-10-20 2016-06-16 Genocea Biosciences, Inc. Chlamydia antigens and uses thereof
WO2012085668A2 (en) 2010-12-24 2012-06-28 Novartis Ag Compounds
US10561720B2 (en) 2011-06-24 2020-02-18 EpitoGenesis, Inc. Pharmaceutical compositions, comprising a combination of select carriers, vitamins, tannins and flavonoids as antigen-specific immuno-modulators
WO2013038375A2 (en) 2011-09-14 2013-03-21 Novartis Ag Methods for making saccharide-protein glycoconjugates
WO2013068949A1 (en) 2011-11-07 2013-05-16 Novartis Ag Carrier molecule comprising a spr0096 and a spr2021 antigen
WO2013174832A1 (en) 2012-05-22 2013-11-28 Novartis Ag Meningococcus serogroup x conjugate
US10124051B2 (en) 2012-05-22 2018-11-13 Glaxosmithkline Biologicals Sa Meningococcus serogroup X conjugate
WO2017175082A1 (en) 2016-04-05 2017-10-12 Gsk Vaccines S.R.L. Immunogenic compositions

Also Published As

Publication number Publication date
JP2002529069A (en) 2002-09-10
EP1133572A2 (en) 2001-09-19
WO2000027994A3 (en) 2000-11-23
AU1722300A (en) 2000-05-29
CA2350775A1 (en) 2000-05-18
EP1133572A4 (en) 2005-06-15

Similar Documents

Publication Publication Date Title
EP1133572A2 (en) Chlamydia pneumoniae genome sequence
US6822071B1 (en) Polypeptides from Chlamydia pneumoniae and their use in the diagnosis, prevention and treatment of disease
US9034642B2 (en) Genes of an otitis media isolate of nontypeable Haemophilus influenzae
US10035826B2 (en) Proteins and nucleic acids from meningitis/sepsis-associated Escherichia coli
US7749518B2 (en) Polypeptides from non-typeable Haemophilus influenzae
US7090973B1 (en) Nucleic acid sequences relating to Bacteroides fragilis for diagnostics and therapeutics
US8628917B2 (en) Genes of an otitis media isolate of nontypeable Haemophilus influenzae
US20120093868A1 (en) Haemophilus influenzae type b
CN101203529A (en) Proteins and nucleic acids from meningitis/sepsis-associated escherichia coli
SA99191283B1 (en) Chlamydia protein, gene sequence and uses thereof
KR20010012236A (en) Enterococcus faecalis polynucleotides and polypeptides
US20040067554A1 (en) Nucleotide sequences of moraxella catarrhalis genome
US6902893B1 (en) Lyme disease vaccines
EP1341810B1 (en) Secreted chlamydia polypeptides and method for identifying such polypeptides by their secretion by a type iii secretion pathway of a gram-negative bacteria.
AU2012207041A1 (en) Proteins and nucleic acids from meningitis/sepsis-associated Escherichia Coli
Viratyosin Genetic variation of chlamydial Inc proteins
Bina Analysis of the resistance-nodulation-division and HOP families of cell envelope proteins in helicobacter pylori
Average The genome sequence of the plant pathogen Xylella fastidiosa

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref country code: AU

Ref document number: 2000 17223

Kind code of ref document: A

Format of ref document f/p: F

AK Designated states

Kind code of ref document: A2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: the EPO has been informed by WIPO that EP was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CR CU CZ DE DK DM EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
ENP Entry into the national phase

Ref document number: 2350775

Country of ref document: CA

Ref country code: CA

Ref document number: 2350775

Kind code of ref document: A

Format of ref document f/p: F

ENP Entry into the national phase

Ref country code: JP

Ref document number: 2000 581161

Kind code of ref document: A

Format of ref document f/p: F

WWE WIPO information: entry into national phase

Ref document number: 1999960323

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWP WIPO information: published in national office

Ref document number: 1999960323

Country of ref document: EP

WWW WIPO information: withdrawn in national office

Ref document number: 1999960323

Country of ref document: EP