WO1999054474A2 - Bacterial yihA polypeptide family - Google Patents

Bacterial yihA polypeptide family

Info

Publication number
WO1999054474A2
WO1999054474A2 PCT/EP1999/002640 EP9902640W WO9954474A2 WO 1999054474 A2 WO1999054474 A2 WO 1999054474A2 EP 9902640 W EP9902640 W EP 9902640W WO 9954474 A2 WO9954474 A2 WO 9954474A2
Authority
WO
WIPO (PCT)
Prior art keywords
polypeptide
iii
antagonist
gene
sequences
Prior art date
Application number
PCT/EP1999/002640
Other languages
French (fr)
Other versions
WO1999054474A3 (en
Inventor
Fabrizio Arigoni
Michael David Edgerton
Hannes Loferer
Manuel C. Peitsch
Original Assignee
Glaxo Group Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Glaxo Group Limited
Priority to AU37090/99A
Publication of WO1999054474A2
Publication of WO1999054474A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/37 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi
    • C07K14/39 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi from yeasts
    • C07K14/395 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi from yeasts from Saccharomyces
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Definitions

  • This invention relates to a family of bacterial polypeptides which are required for growth of both gram negative and gram positive bacteria, the genes which encode them and the use of such polypeptides and genes as tools for identifying novel broad spectrum antibiotics.
  • the invention therefore provides an isolated polypeptide of the yihA family as defined below particularly for use in the identification of novel antibiotic agents.
  • the polypeptides of the present invention are believed to be essential to the viability of a wide range of bacteria including both gram positive and gram negative bacteria.
  • BLAST searches (J. Mol. Biol. (1990) 215:403-410 and Meth. Enzymol. (1996) 266:131-141, 227-258, both incorporated herein by reference)
  • Such searches involve using in succession as query sequences, each of the existing yihA protein family member sequences to identify other full length members of the yihA family of proteins.
  • HSP high-scoring segment pairs
  • Motif based searches may be carried out using PROSITE patterns defined for the yihA family members. These searches involve the representation as patterns, of the conserved sequence elements identified in the profile searches.
  • HSP score of greater than or equal to 100 when compared with one of the sequences of Figure 1 when the BLAST algorithm is used with a BLOSUM62 scoring matrix;
  • ii) containing a set of amino acid sequences which are positively identified when position dependent scoring matrices according to Tables 1-4 are used with MAST to yield a p-value of less than 1 x 10^-30; or iii) comprising at least one of the following amino acid sequences:
  • X is any one amino acid residue, and the numbers in the curved brackets refer to the number of residues at that position.
  • both of the amino acid sequences listed under iii) are present.
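The criteria i) to iii) above can be read as a simple decision rule. The following is a minimal Python sketch under stated assumptions: the names `Evidence` and `is_yiha_candidate` are hypothetical, the criteria are treated as alternatives (as the "or" before iii) suggests), the BLAST HSP score and MAST p-value are supplied by the caller rather than computed, and motif matching is reduced to a count because the motif sequences of criterion iii) are not reproduced in this text.

```python
from dataclasses import dataclass

HSP_SCORE_MIN = 100   # criterion i): BLAST HSP score >= 100 (BLOSUM62)
MAST_P_MAX = 1e-30    # criterion ii): MAST p-value < 1 x 10^-30


@dataclass
class Evidence:
    """Search results for one candidate sequence (hypothetical container)."""
    best_hsp_score: float   # best HSP score against the Figure 1 sequences
    mast_p_value: float     # p-value reported by MAST for the Table 1-4 matrices
    motifs_matched: int     # number of criterion iii) motifs found in the sequence


def is_yiha_candidate(ev: Evidence) -> bool:
    """Return True if at least one of criteria i), ii) or iii) is satisfied."""
    by_blast = ev.best_hsp_score >= HSP_SCORE_MIN
    by_mast = ev.mast_p_value < MAST_P_MAX
    by_motif = ev.motifs_matched >= 1
    return by_blast or by_mast or by_motif
```

The preferred case in which both motifs of iii) are present would correspond to requiring `motifs_matched >= 2` in this sketch.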
  • the invention also provides an isolated polypeptide sequence as set out in any of Figures 2a-d.
  • polypeptides are preferably recombinant and ideally purified to homogeneity.
  • polypeptides according to the invention are variants, analogues and derivatives, particularly those in which a number of amino acids have been substituted, deleted or added.
  • Polypeptides which have at least 70% identity to any of the polypeptide sequences according to the invention, in particular the sequences of Figures 2a-d are encompassed within the invention.
  • the identity is at least 80%, more preferably at least 90% and still more preferably at least 95%, for example 97%, 98% or even 99% identity to any of the sequences according to the invention, in particular the sequences of Figures 2a-d.
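The identity thresholds above (70%, 80%, 90%, 95%) presuppose some way of computing percent identity. The text does not say which definition is used, so the following is only an illustrative Python sketch: it assumes the two sequences have already been aligned to equal length with `-` as the gap character, and it divides the number of identical residue pairs by the number of alignment columns. The function names are hypothetical.

```python
from typing import Optional

# Thresholds named in the text, highest first.
IDENTITY_TIERS = (95.0, 90.0, 80.0, 70.0)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: columns where both sequences have a gap are ignored;
    a residue opposite a gap counts as a non-identical column.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no residue columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def identity_tier(pid: float) -> Optional[float]:
    """Return the highest threshold met by pid, or None if below 70%."""
    for tier in IDENTITY_TIERS:
        if pid >= tier:
            return tier
    return None
```

Other conventions (for example dividing by the length of the shorter sequence) give different values, so any threshold comparison should state which convention was used.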
  • Such polypeptides may also be fragments.
  • a fragment is a part of a polypeptide according to the invention which retains sufficient identity of the original polypeptide to be effective for example in a screen.
  • Such fragments may be fused to other amino acids or polypeptides or may be comprised within a larger polypeptide.
  • Such a fragment may be comprised within a precursor polypeptide designed for expression in a host. Therefore in one aspect the term fragment means a portion or portions of a fusion polypeptide or polypeptide derived from a polypeptide according to the invention.
  • Fragments also include portions of a polypeptide according to the invention characterised by structural or functional attributes of a polypeptide according to the invention. These may have similar or improved chemical or biological activity or reduced side-effect activity.
  • fragments may comprise an alpha helix or alpha-helix forming region, beta sheet and beta-sheet forming region, turn and turn forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic regions, amphipathic regions (alpha or beta), flexible regions, surface-forming regions, substrate binding regions and regions of high antigenic index.
  • Fragments or portions may be used for producing the corresponding full length polypeptide by peptide synthesis.
  • polypeptides according to the invention include the polypeptides of Helicobacter pylori, Haemophilus influenzae, Mycoplasma genitalium, Mycoplasma pneumoniae, Streptococcus pneumoniae, Streptococcus pyogenes, Pseudomonas aeruginosa, Saccharomyces cerevisiae, Methanococcus jannaschii, Neisseria gonorrhoeae, Neisseria meningitidis, Staphylococcus epidermidis, Aquifex aeolicus, Bacillus subtilis and Escherichia coli.
  • the present invention further provides isolated polynucleotides which encode the polypeptides as defined herein, polynucleotides complementary thereto, or polynucleotides hybridising to any of the aforesaid polynucleotides.
  • Isolated polynucleotides have been removed by separation from their natural environment and those materials with which they are naturally associated.
  • these polynucleotide molecules are provided in recombinant form (i.e. combined with one or more heterologous sequences).
  • One example of stringent hybridisation conditions which is sometimes used is where hybridisation is attempted at a temperature of from about 35°C to about 65°C using a salt solution which is about 0.9 molar.
  • the skilled person will be able to vary such conditions as appropriate in order to take into account variables such as probe length, base composition, type of ions present, etc.
  • the invention also provides polynucleotide variants, analogues, derivatives and fragments which encode polypeptides according to the invention.
  • Polynucleotides are included which preferably have at least 70% identity over their entire length to a polynucleotide encoding a polypeptide according to the invention, most preferably those set out in Figures 2a-d. More preferred are those sequences which have at least 80% identity over their entire length to a polynucleotide encoding a polypeptide according to the invention. Even more preferred are polynucleotides which demonstrate at least 90% for example 95%, 97%, 98% or 99% identity over their entire length to a polynucleotide encoding a polypeptide according to the invention.
  • Polynucleotide molecules of the present invention may be used as probes for other members of the gene family or in anti-sense therapy to block or to reduce the expression of one or more of the polypeptides of the invention. Since these substances are believed to be essential to the bacteria expressing them, blocking or reducing their expression can provide an effective way of treating bacterially mediated diseases or disorders. Polynucleotides may also be used directly in screening and in generating whole cell screens by expression of a polypeptide of the invention. As part of the isolation process or thereafter the polynucleotides may be joined to other polynucleotides, for example to form fusions, or to regulatory elements for expression.
  • Isolated polynucleotides, alone or joined to other polynucleotides, can be introduced into a vector which itself will contain other elements of DNA or RNA for expression in a host cell.
  • the invention therefore comprises a vector containing a polynucleotide generally operatively linked to appropriate expression control sequences.
  • Vectors for use in the invention include plasmid vectors, phage vectors and DNA or RNA viral vectors. These vectors may include gene sequences which render them inducible under certain conditions, such as manipulation of the environmental conditions under which the host cells are maintained, for example by temperature alteration or nutrient additives. Regulatory sequences include for example a promoter to direct mRNA transcription. Such promoters include for example the E. coli lac, trp, tac and araBAD promoters as well as the SV40 early and late promoters. Such systems and sequences would be well known to those skilled in the art.
  • Host cells expressing a polynucleotide of the present invention can be generated by any of the traditional routes such as transfection or electroporation; see for example Davis et al, Basic Methods in Molecular Biology (1986) and Sambrook et al, Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Lab. Press, Cold Spring Harbor, N.Y. (1989).
  • This invention also provides a method for identification of molecules such as antagonists, that bind to the polypeptide or a polynucleotide encoding a polypeptide of the present invention.
  • Biochemical assays for inhibition of polypeptide activity with purified polypeptides or bacterial extracts can be more sensitive than whole cell killing assays and provide direct evidence for a compound's mode of action.
  • this approach requires that the target polypeptide is known and that the activity of the polypeptide be amenable to in vitro assays. Nor does it address other factors, such as membrane permeability or compound stability, which can limit a compound's effectiveness as an antibiotic.
  • Whole cell screening of compounds for killing activity will identify molecules which kill cells at the concentrations tested, but provide no information on the mode of action of the compound and may not have the sensitivity needed to detect less potent compounds.
  • Bacterial strains which contain surrogate markers whose activity is linked to that of the target gene or which have been engineered to over-express or under-express the target polypeptide can be used for selective whole-cell screens.
  • the invention further provides a host cell comprising a vector as defined herein and a reporter gene encoding a reporter molecule whose activity is linked to that of the polypeptide encoded by the vector.
  • a reporter gene encoding a reporter molecule whose activity is linked to that of the polypeptide encoded by the vector. Examples of such systems include a transcriptional fusion of the E. coli lacZ gene to vanH promoter in a B. subtilis strain expressing VanS and R as a reporter for inhibition of cell wall biosynthesis (J. Bacteriol.
  • surrogate markers for the activity of the gene can be identified using at least two approaches. Two dimensional electrophoresis coupled with mass spectrometry analysis of isolated polypeptides (proteome mapping) has been used to identify specific polypeptides which increase in abundance in response to polypeptide or RNA synthesis inhibitors (Microbial & Comparative Genomics (1996) 1:375). Tightly regulated promoters used to demonstrate that the E. coli and B. subtilis conserved polypeptides are essential can also be used to reduce the concentrations of these polypeptides.
  • proteome maps generated from bacteria depleted of the conserved essential genes can be used to detect polypeptides which change in abundance as compared to wild-type bacteria. Transcriptional or translational fusions to these polypeptides can be used as reporter molecules to screen for antagonists of members of the conserved essential gene family.
  • transposons or other mobile genetic elements containing reporter genes can be used to search for reporter molecules. Such an approach has been used to identify vancomycin responsive genes in S. aureus (J. Antibiot. (Tokyo) (1991) 44:210-217).
  • bacteria in which conserved essential genes are controlled by tightly regulated promoters can be used to screen for transposon carrying strains in which expression of promoterless reporter genes is induced upon depletion of the polypeptides.
  • Standard broth or plate assays can be used in many different formats. Such assays will detect molecules which antagonise the response which couples the activity of the conserved, target polypeptide to the reporter molecule. Thus, the compounds identified may act directly upon the target polypeptide or on another stage in the pathway which leads to activation of the reporter.
  • Screens for inhibitors of the target which do not require the use of surrogate markers may be designed by manipulating expression levels of the target polypeptide.
  • quinolone resistant strains of E. coli have been made by over-expression of gyrA (FEMS Microbiol. Lett. (1997) 154:271-276)
  • over-expression of alanine racemase has been shown to increase resistance to cycloserine in M. smegmatis
  • multicopy plasmids carrying murZ have been shown to increase phosphomycin resistance in both E. coli (J. Bacteriol. (1992)
  • strains more sensitive to antibiotics may be made by reducing expression levels of the polypeptide targeted by the antibiotic.
  • Over- or under-expression of members of the conserved, essential gene family may be used to screen for antibiotics which act either directly on the gene or gene product or indirectly on the pathway in which it is involved.
  • Another example of an assay for antagonists is a competitive assay that combines the polypeptide of the present invention and a potential antagonist with membrane-bound binding molecules, recombinant binding molecules, natural substrates or ligands, or substrate or ligand mimetics, under appropriate conditions for a competitive inhibition assay.
  • the polypeptide can be labelled, such as by radioactivity or a colorimetric compound, such that the number of polypeptide molecules bound to a binding molecule or converted to product can be determined accurately to assess the effectiveness of the potential antagonist.
  • the present invention therefore provides a method of assaying compounds for activity against bacteria comprising:
  • the present invention also provides a method of assaying compounds for activity against bacteria comprising:
  • the present invention further provides a method of screening for an antibiotic which method comprises:
  • the method may be carried out as above but the level of expression of the polypeptide is decreased and the cells are assayed for increased sensitivity to an inhibitor.
  • the present invention also provides a method of assaying compounds for activity against bacteria comprising:
  • Potential antagonists include small organic molecules and ions which interact specifically with a polypeptide or polynucleotide, for example a substrate, cell membrane component, receptor, a fragment thereof or a peptide.
  • Such molecules may include antibodies, antibody-derived reagents or chimaeric molecules.
  • Potential antagonists also may be small organic molecules, a peptide, a polypeptide such as a closely related protein or antibody that binds to the same sites on a binding molecule without inducing functional activity of the polypeptide of the invention.
  • the antibodies may be monoclonal or polyclonal. Techniques for producing monoclonal and polyclonal antibodies which bind to a particular polypeptide are now well developed in the art. They are discussed in standard immunology textbooks, for example in Roitt et al (Immunology, Churchill Livingstone, 2nd Edition (1989)).
  • the present invention covers variants thereof which are capable of binding to an epitope present on a substance of the present invention.
  • the variants may be antibody fragments or synthetic constructs. Examples of antibody fragments and synthetic constructs are given by Dougall et al in Tibtech 12:372-379 (September 1994).
  • Antibody fragments include Fab and Fv fragments.
  • Other synthetic constructs include CDR peptides. These are synthetic peptides comprising antigen binding determinants. Peptide mimetics may also be used. These molecules are usually conformationally restricted organic rings which mimic the structure of a CDR loop and which include antigen-interactive side chains. Synthetic constructs include chimaeric molecules.
  • humanised antibodies or derivatives thereof are within the scope of the present invention.
  • An example of a humanised antibody is an antibody having human framework regions, but rodent or other non-human hypervariable regions.
  • Synthetic constructs also include molecules comprising a covalently linked moiety which provides the molecule with some desirable property in addition to antigen binding.
  • the moiety may be a label (e.g. a fluorescent or radioactive label) or a pharmaceutically active agent.
  • antisense molecules (see Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides As Antisense Inhibitors Of Gene Expression, CRC Press, Boca Raton, FL (1988), for a description of these molecules).
  • the invention provides the use of the polypeptide, polynucleotide or antagonist of the invention to interfere with the initial physical interaction between a pathogen and mammalian host responsible for sequelae of infection.
  • the invention further includes molecules which block the function of the polypeptides according to the invention or a polynucleotide encoding the same, identifiable by any of the above described methods.
  • An antagonist of the invention may be provided in pharmaceutical compositions which may include a carrier. They may be provided in unit dosage form. Such agents and pharmaceutical compositions are within the scope of the present invention. In order to prepare such pharmaceutical compositions the inhibitors will normally be provided in substantially pure form. They can then be combined with a carrier under sterile conditions.
  • the present invention also provides a method of treatment which comprises administering to a patient an effective amount of an antagonist of the expression or function of a polypeptide as defined herein.
  • the present invention further provides the use of an antagonist of a polypeptide as defined herein or a polynucleotide encoding the same for the manufacture of a medicament for the treatment of a bacterial infection.
  • Figure 1 shows the multiple sequence alignment of the yihA family members which may be used for BLAST based identification.
  • Figures 2a-d show both the position-dependent scoring matrices used for profile-based identification of yihA family members and examples of the motifs recognised by each matrix in the family members.
  • Figure 2a shows examples of motif 1 in the yihA family.
  • Figure 2b shows examples of motif 2 in the yihA family.
  • Figure 2c shows examples of motif 3 in the yihA family.
  • Figure 2d shows examples of motif 4 in the yihA family.
  • Figure 3 shows the PROSITE patterns which may be used to recognize yihA family members.
  • Figure 4 shows the outline cloning strategy for a gene disruption plasmid.
  • the black box represents the adapter sequence.
  • Figure 5 shows growth dependence on arabinose of a conditional mutant in the E. coli gene yihA.
  • An E. coli MG1655 derivative in which the chromosomal araBAD genes have been replaced with yihA and the native yihA gene has been deleted is shown on the upper half of each plate and a wild-type control is shown on the lower half of each plate.
  • Figure 6 is a diagram of the vector used to create conditional mutants in B. subtilis.
  • Figure 7 shows growth dependence on xylose of a conditional mutant in the B. subtilis yihA orthologue ysxC.
  • Example 1 Identification of conserved bacterial open reading frames.
  • the SIM score was then divided by a "selfSIM" score, a value obtained when the query protein is compared to itself using the SIM algorithm with the PAM200 matrix, to yield a similarity value of between 1.0 and 0. Proteins for which this similarity value was greater than 0.2 when the E. coli protein was compared to either the B. subtilis or M. genitalium genome were then compiled into a list and manually screened to identify proteins of unknown function. Those open reading frames which also had high similarity values in other bacteria were then considered as candidate genes and targets for gene disruption.
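The normalisation described above (alignment score divided by the query's self-alignment score, with a cut-off of 0.2) can be illustrated as follows. This is a sketch only: it does not implement the SIM algorithm or the PAM200 matrix, and instead substitutes a plain Smith-Waterman local alignment with made-up match, mismatch and gap values. The function names and scoring parameters are assumptions.

```python
SIMILARITY_CUTOFF = 0.2  # threshold stated in the text


def local_alignment_score(a: str, b: str,
                          match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score (stand-in for a SIM score)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def similarity_value(query: str, target: str) -> float:
    """Score of query vs target divided by the query's self score (0 to 1)."""
    self_score = local_alignment_score(query, query)
    if self_score == 0:
        raise ValueError("query sequence must not be empty")
    return local_alignment_score(query, target) / self_score


def is_conserved(query: str, target: str) -> bool:
    """True if the similarity value exceeds the 0.2 cut-off."""
    return similarity_value(query, target) > SIMILARITY_CUTOFF
```

Dividing by the self score makes values comparable across queries of different lengths, which is presumably why a single cut-off could be applied to every E. coli protein.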
  • a disruption plasmid was constructed using DNA containing an in-frame deletion of the gene of interest plus approximately 900 base pairs of 5' and 3' flanking DNA for homologous recombination.
  • the plasmid was cloned into the gene-replacement vector pKO3 as follows: Two separate PCR reactions were used to amplify fragments of approximately 900 base pairs of 5' and 3' sequence flanking the gene of interest. Chromosomal DNA from E. coli strain MG1655 was used as the template.
  • Primers 2 and 3 carry a 5' extension of a 33 bp adapter sequence
  • the 2 PCR products were purified using the High Pure™ PCR Product Purification Kit (Boehringer Mannheim Inc., Mannheim, GE). Using the adapter sequence, the 2 PCR products were assembled in a second PCR reaction to give a single product. Following restriction enzyme digestion, preparative agarose gel electrophoresis and purification using the Jetsorb™ Gel Extraction Kit (Genomed Inc.), the final product was cloned into pKO3 using standard techniques. This clone is referred to as the disruption plasmid. All PCR reactions described in this section were performed with Pwo™ DNA Polymerase (Boehringer Mannheim Inc., Mannheim, GE). In the final product the gene of interest was deleted from the start to the stop codon and replaced by the 33 bp adapter sequence [e.g. 5'-
  • the disruption vector pKO3 (A.J. Link et al., J. Bacteriol. 179:6228-6237, 1997) is a derivative of pMAK700 (C.A. Hamilton et al, J. Bacteriol. 171:4617-4622). It features the repA (Ts) replication origin derived from pSC101 [permissive at 30°C but inactive at 42 to 44°C], the cat gene encoding chloramphenicol resistance and the sacB gene for counter selection against vector sequences in the presence of 5% sucrose.
  • chromosomal integrants (cointegrates produced by a single homologous recombination event) of the plasmid were isolated by selecting clones on chloramphenicol at 44°C. Following two rounds of purification under the same conditions, the cointegrates were grown at 30°C in the presence of 5% sucrose to force resolution of the cointegrate and elimination of the plasmid from the cell. At this step, a preliminary assignment of whether a given gene is essential or non-essential for growth of E. coli in complex media was made.
  • the genotype of the chloramphenicol-sensitive clones obtained following cointegration and resolution of the disruption plasmid was determined by colony-PCR using primers c1 and c2 (see Fig. 4).
  • the second recombination event can result in either a wild-type or a mutant genotype.
  • the testing of 20 independent clones routinely showed an approximately 1:1 distribution of wild-type versus mutant genotypes in the case of a non-essential gene. Recovery of only the wild-type genotype in 50 independent clones was considered preliminary evidence for a gene's essentiality.
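The reasoning behind the 50-clone criterion can be made explicit with a short calculation. The sketch below assumes, purely for illustration, that clones are independent and that resolution of the cointegrate yields wild-type and mutant genotypes with exactly equal probability when the gene is non-essential. The function name is hypothetical.

```python
def prob_all_wild_type(n_clones: int, p_wild_type: float = 0.5) -> float:
    """Probability that every one of n independent clones is wild-type.

    p_wild_type is the assumed chance that a single resolved clone is
    wild-type; 0.5 corresponds to the approximately 1:1 distribution
    reported for non-essential genes.
    """
    if n_clones < 0:
        raise ValueError("n_clones must be non-negative")
    if not 0.0 <= p_wild_type <= 1.0:
        raise ValueError("p_wild_type must be between 0 and 1")
    return p_wild_type ** n_clones
```

Under these assumptions, `prob_all_wild_type(50)` is 0.5 to the power 50, on the order of 10^-15, so recovering only wild-type clones would be very unlikely for a non-essential gene. It remains preliminary evidence because a deletion that merely slows growth could also bias recovery.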
  • a vector, pRDC15, was designed which allows a copy of a putative essential gene to be placed in ectopic position on the chromosome under the control of a tightly regulated promoter.
  • the plasmid is a derivative of pKO3.
  • pRDC15 carries a DNA fragment consisting of the araC gene, the arabinose promoter, a cloning site [BamHI-NheI-SfiI-XhoI-SphI-Sftl] and the polB gene.
  • the wild-type copy of a putative essential gene was amplified by PCR and cloned into the vector pRDC15 using restriction sites NheI and XhoI.
  • the resulting construct was used for gene replacement in a manner identical to the disruption plasmids described above.
  • the araC and polB genes of pRDC15 represent the homologous DNA for recombination at the araCBADpolB locus of the E. coli chromosome.
  • the araBAD genes in the E. coli chromosome are replaced by the wild-type copy of the gene of interest, which is now under the control of the arabinose promoter.
  • This merodiploid strain is then used to construct an in frame deletion of the wild-type target gene using the disruption plasmid described above in the presence of 0.2% arabinose.
  • the deletion mutant can be obtained since a wild-type copy is expressed in trans from the arabinose locus.
  • the resulting strain is a conditional mutant as expression of the target gene is now dependent on the presence of arabinose.
  • the inability of such a strain to grow in the absence of arabinose is a final proof that a given gene is essential for growth of E. coli.
  • Figure 5 shows that the gene yihA is essential in E. coli.
  • Example 3 ysxC is an essential gene in Bacillus subtilis.
  • An integrative plasmid allowing the expression of genes under the control of a xylose inducible promoter was constructed as follows: A DNA fragment carrying the repressor gene xylR and the xylA promoter was PCR amplified from B. subtilis genomic DNA with the following primers:
  • pxyl-4 5'-atcgctcgagAGATGCACCTTCTATACCCG-3'
  • pxyl-7 5'-atcgaagcttAGCGATCCTACACAATCATG-3'
  • the primers were designed such that they introduced a unique EcoRI site at the 5' end of the PCR product and a unique BamHI site at the 3' end of the product.
  • the PCR fragment was then cloned as an EcoRI-BamHI fragment into the B. subtilis integrative vector pDG648 to yield pRDC9 (Figure 6).
  • a DNA fragment containing approximately 100 bp sequence from the 5' region of ysxC was amplified by PCR from B. subtilis genomic DNA.
  • the PCR primers were designed such that the resulting PCR product contains unique restriction sites at both the 5' and 3' ends of the PCR product. Subsequently, the PCR product was cloned into pRDC9.
  • N. meningitidis - contig GNMCY55F. Sequence data for N. meningitidis was obtained from The Institute for Genomic Research website at http://www.tigr.org. H. pylori - HP1567, GenBank accession number g2314750
  • MAST motif alignment and search tool
  • yihA family members which are positively identified when p-values of less than 1 x 10^-30 are obtained.
  • p-values are based on a random sequence model that assumes each position in a random sequence is generated according to the average letter frequencies of all sequences in the peptide non-redundant database (ftp://ncbi.nlm.nih.gov/blast/db/) on September 22, 1996.
  • Tables 1 to 4 show the position dependent scoring used to define the yihA family. Values in the position-dependent scoring matrix are calculated by taking the log (base 2) of the ratio p/f at each position in the motif where p is the probability of a particular letter at that position in the motif, and f is the average frequency of that letter in the training set. Columns correspond to 1 letter amino acid codes and rows correspond to the position in the motif.
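The log-odds calculation described above, log2(p/f) per letter and position, can be sketched in a few lines of Python. The four-letter alphabet and all frequencies below are invented for illustration and are not taken from Tables 1 to 4; the function names are hypothetical. Letters with zero probability are mapped to negative infinity here, whereas motif tools typically add pseudocounts to avoid this.

```python
import math
from typing import Dict, List


def log_odds_row(position_probs: Dict[str, float],
                 background: Dict[str, float]) -> Dict[str, float]:
    """One matrix row: log2(p / f) for every letter in the background."""
    row = {}
    for letter, f in background.items():
        p = position_probs.get(letter, 0.0)
        # Assumption: no pseudocounts, so an unseen letter scores -infinity.
        row[letter] = math.log2(p / f) if p > 0 else float("-inf")
    return row


def score_window(matrix: List[Dict[str, float]], window: str) -> float:
    """Sum of per-position log-odds scores for a window of motif length."""
    if len(window) != len(matrix):
        raise ValueError("window length must equal motif width")
    return sum(row[letter] for row, letter in zip(matrix, window))


# Toy example with an invented 4-letter alphabet and uniform background.
background = {"A": 0.25, "G": 0.25, "K": 0.25, "S": 0.25}
row = log_odds_row({"G": 0.5, "A": 0.25, "K": 0.125, "S": 0.125}, background)
matrix = [row, row]
```

A positive entry means the letter is more frequent at that motif position than in the background, a negative entry means it is depleted, and the window score is the sum over positions.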
  • log-odds matrix: alength= 20 w= 55 n= 7288 bayes= 7.56739
  • PROSITE patterns using the conventions outlined in PROSITE: A dictionary of protein sites and patterns (http://www.expasy.ch/sprot/prosite.html) and Bairoch A., Bucher P., Hofmann K. The PROSITE database, its status in 1995. Nucleic Acids Res. 24:189-196 (1995). YihA family members are positively identified when exact matches to either of the two PROSITE patterns, pattern 1 or pattern 2, described in Figure 3 are obtained.
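Exact matching against a PROSITE-style pattern can be sketched by translating the pattern into a regular expression. The example below is a minimal Python illustration covering only the basic conventions: `x` or `X` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, and `(n)` or `(n,m)` for repeat counts. The pattern used in the test is invented; it is not one of the yihA patterns of Figure 3, and the function names are hypothetical.

```python
import re

# One pattern element: a residue, x, [allowed] or {excluded},
# optionally followed by a repeat count (n) or (n,m).
_ELEMENT = re.compile(
    r"^(?P<core>[A-Za-z]|\[[A-Z]+\]|\{[A-Z]+\})"
    r"(?:\((?P<lo>\d+)(?:,(?P<hi>\d+))?\))?$"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE pattern into a Python regular expression."""
    body = pattern.strip().rstrip(".")
    parts = []
    for element in body.split("-"):
        m = _ELEMENT.match(element)
        if m is None:
            raise ValueError(f"unsupported PROSITE element: {element!r}")
        core = m.group("core")
        if core in ("x", "X"):
            core = "."                        # any one residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except these
        else:
            core = core.upper()
        if m.group("lo") is not None:
            hi = m.group("hi")
            core += "{" + m.group("lo") + ("," + hi if hi else "") + "}"
        parts.append(core)
    return "".join(parts)


def matches_pattern(pattern: str, sequence: str) -> bool:
    """True if the sequence contains an exact match to the pattern."""
    return re.search(prosite_to_regex(pattern), sequence.upper()) is not None
```

Anchors (`<`, `>`) and other less common PROSITE syntax are deliberately left out of this sketch and raise a `ValueError`.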

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Mycology (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Microbiology (AREA)
  • Peptides Or Proteins (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

This invention relates to a family of bacterial polypeptides which are considered essential for growth of both gram negative and gram positive bacteria. The family has been identified by a number of methods including computer based algorithms. The use of such polypeptides and the genes which encode them as tools for identifying novel broad spectrum antibiotics is described.

Description

Bacterial Polypeptide Family
This invention relates to a family of bacterial polypeptides which are required for growth of both gram negative and gram positive bacteria, the genes which encode them and the use of such polypeptides and genes as tools for identifying novel broad spectrum antibiotics.
New antibiotics are urgently needed in current medical practice as both serious bacterial infections and multiply antibiotic resistant strains are becoming increasingly prevalent (Proc. Natl. Acad. Sci USA (1994) 91:2420-2427; New England J. Med. (1994) 330:1247-1251). The increase in number of serious infections has been ascribed to a variety of causes, including: 1) increasing age of the general population, 2) increasingly long and complex surgeries and 3) a growing immuno-suppressed population associated with cancer therapies, organ transplants and HIV infection. Overuse of antibiotics in both medical and agricultural settings, improper sanitation and a general lack of concern about antibiotic resistant organisms have all contributed to the increasing frequency of multiply antibiotic resistant bacteria. Taken together, these two trends suggest that we will soon be faced with bacterial infections which are resistant to all therapies. Indeed, the first report of vancomycin-resistant S. aureus has just been published (Lancet (1997) 350:1670-1673).
Identification of conserved essential proteins is a key step in the development of broad-spectrum antibiotics. If a target protein is conserved across taxonomic lines, the possibility that antibiotics acting on that protein will be effective on a wide range of bacteria is maximized. As examples, DNA gyrase and RNA polymerase are found in all bacteria, which helps to explain why quinolones and rifampicin are good broad-spectrum antibiotics. However, not all bacteria synthesize peptidoglycan, which explains why β-lactam antibiotics are ineffective against Chlamydia, Rickettsia and Legionella species. The recent publication of several complete eubacterial genomic sequences (Science (1995) 270:397-403; Science (1997) 277:1453-1474; Nature (1997) 390:249-256) allows the identification of bacterial proteins which have orthologues in all of the sequenced genomes. This approach has led to the identification of many conserved protein families (Science (1997) 278:631-637). In some cases a biochemical function for the conserved family may be deduced from their predicted amino acid sequence. In other cases no function can be predicted for the protein family. However, it is impossible to predict the physiological role of a protein or protein family without detailed characterisation of at least one family member.
Following identification of a conserved bacterial protein family, the protein must be shown to be essential for bacterial viability if it is to serve as an antibiotic target. Genetic systems have been developed to demonstrate a gene's essentiality in both E. coli (J. Bacteriol. (1997) 179:6228-6237) and B. subtilis (Genes Dev. (1991) 177:4194-4197). In some instances these systems suffer either from a reliance on negative data (failure to disrupt a given gene) or from insufficient repression of the candidate gene, either of which can lead to misidentification of a gene's essentiality. Clean data from taxonomically diverse bacteria, such as gram negative and gram positive strains, offers the best evidence that a conserved bacterial protein family is essential for viability and will make a good broad-spectrum antibiotic target.
We have identified a family of conserved bacterial genes which we have designated the yihA gene family, after the name given to the E. coli gene family member. These genes have not previously been isolated, nor the polypeptides expressed, as no function has been ascribed to these genes. It has now been discovered that this family of genes encodes a family of polypeptides which are essential for the survival of their host bacteria.
The invention therefore provides an isolated polypeptide of the yihA family as defined below particularly for use in the identification of novel antibiotic agents. The polypeptides of the present invention are believed to be essential to the viability of a wide range of bacteria including both gram positive and gram negative bacteria.
Any one of the following three methods may be used to identify members of the yihA family as claimed herein. BLAST searches (J. Mol. Biol. (1990) 215:403-10 and Meth. Enzymol. (1996) 266:131-141, 227-258, both incorporated herein by reference) may be carried out using the yihA family member sequences as described in Figure 1. Such searches involve using each of the existing yihA protein family member sequences in succession as a query sequence to identify other full length members of the yihA family of proteins. Such family members yield high-scoring segment pair (HSP) scores of greater than 100 in comparison to at least one member of the yihA family when the BLAST algorithm described in the reference above is used with a particular scoring matrix (a BLOSUM62 matrix - Proteins (1993) 17:49-61, incorporated herein by reference).
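The successive-query procedure described above can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the search actually performed: the `search` callable, the sequence identifiers and the scores in the toy hit table are hypothetical stand-ins for parsed BLAST output, and re-using newly found members as further queries is an assumption about how the succession is carried out. Only the HSP score threshold of 100 is taken from the text.

```python
from collections import deque

HSP_THRESHOLD = 100  # from the text: family members score greater than 100


def expand_family(seed_ids, search, threshold=HSP_THRESHOLD):
    """Use each known member in succession as a query and collect new members.

    `search(query_id)` is a hypothetical callable returning an iterable of
    (subject_id, hsp_score) pairs, for example parsed from BLAST output.
    """
    members = set(seed_ids)
    queue = deque(seed_ids)
    while queue:
        query = queue.popleft()
        for subject, score in search(query):
            if score > threshold and subject not in members:
                members.add(subject)
                queue.append(subject)  # assumption: new members are queried too
    return members


# Toy hit table with invented scores, standing in for real BLAST results.
TOY_HITS = {
    "seqA": [("seqB", 250), ("seqX", 60)],
    "seqB": [("seqA", 250), ("seqC", 130)],
    "seqC": [("seqB", 130)],
}

family = expand_family(["seqA"], lambda query: TOY_HITS.get(query, []))
print(sorted(family))
```

In the toy data, seqX is rejected because its score falls below the threshold, while seqC is found only through the second query, which is the reason for querying with every member in turn.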
Profile based searches (Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994, incorporated herein by reference) may be carried out using position-dependent scoring matrices defined for the yihA family members. These searches use a table compiled from a multiple sequence alignment which describes distinctive sequences of amino acids as probability values for each residue at each position in the gene family to identify other proteins which contain similar sequences of amino acids.
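As a rough illustration of how a position-dependent scoring matrix is applied, the following Python sketch slides a small matrix along a sequence and reports the best-scoring window. The three-column matrix, its scores and the default penalty are invented for illustration; they are not the matrices of Tables 1-4, and the sketch does not reproduce the p-value calculation performed by MAST.

```python
def best_window(sequence, pssm, default=-2.0):
    """Slide a position-dependent scoring matrix along `sequence`.

    `pssm` is a list with one dict per motif position mapping residue -> score.
    Residues absent from a column receive `default` (an assumed penalty).
    Returns (best_score, start_index), or None if the sequence is too short.
    """
    width = len(pssm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(col.get(res, default) for col, res in zip(pssm, window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


# Hypothetical three-position matrix with invented scores.
TOY_PSSM = [
    {"G": 3.0, "A": 1.0},
    {"K": 4.0, "R": 2.0},
    {"S": 3.0, "T": 2.0},
]

print(best_window("MAGKSLL", TOY_PSSM))
```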
Motif based searches (Nucleic Acids Res. (1995) 24:189-196 incorporated herein by reference) may be carried out using PROSITE patterns defined for the yihA family members. These searches involve the representation as patterns, of the conserved sequence elements identified in the profile searches.
The isolated polypeptides of the invention may therefore be characterised by:
i) an HSP score of greater than or equal to 100 when compared with one of the sequences of Figure 1 when the BLAST algorithm is used with a BLOSUM62 scoring matrix; or
ii) containing a set of amino acid sequences which are positively identified when position dependent scoring matrices according to Tables 1-4 are used with MAST to yield a p-value of less than 1 × 10⁻³⁰; or
iii) comprising at least one of the following amino acid sequences:
E-X(4)-G-[GR]-[STAG]-N-X-G-K-S-[STAG];
[VILM]-A-X(2)-S-X(2)-[PT]-G-X-T-[RKQN]-X(2)-N-X-[FY];
where the letters denote an amino acid in one letter code, the square brackets denote a single amino acid, the amino acids within the square brackets are alternatives, X is any one amino acid residue, and the numbers in the curved brackets refer to the number of residues at that position.
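The pattern notation explained above maps directly onto regular expressions, which the following Python sketch illustrates for the two patterns listed under iii). The function names and the two test sequences are hypothetical: the sequences were constructed by hand to satisfy the patterns and are not taken from any yihA family member. The converter handles only the subset of PROSITE syntax used in these two patterns.

```python
import re

PATTERN_1 = "E-X(4)-G-[GR]-[STAG]-N-X-G-K-S-[STAG]"
PATTERN_2 = "[VILM]-A-X(2)-S-X(2)-[PT]-G-X-T-[RKQN]-X(2)-N-X-[FY]"

# One pattern element: X, a single residue, or a bracketed set of
# alternatives, optionally followed by a repeat count in parentheses.
ELEMENT = re.compile(r"(X|[A-Z]|\[[A-Z]+\])(?:\((\d+)\))?")


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a compiled regular expression."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError("unsupported pattern element: " + element)
        residue, count = match.groups()
        token = "." if residue == "X" else residue  # X is any one residue
        if count:
            token += "{" + count + "}"
        parts.append(token)
    return re.compile("".join(parts))


def matching_patterns(sequence):
    """Return the numbers (1, 2) of the patterns found in `sequence`."""
    found = []
    for number, pattern in enumerate((PATTERN_1, PATTERN_2), start=1):
        if prosite_to_regex(pattern).search(sequence):
            found.append(number)
    return found


# Hand-built sequences that satisfy pattern 1 and pattern 2 respectively.
toy_sequence = "MK" + "EAAAAGRSNVGKSS" + "LLQ" + "VALLSDKPGKTRLINFF"
print(prosite_to_regex(PATTERN_1).pattern)
print(matching_patterns(toy_sequence))
```

A sequence for which `matching_patterns` returns both numbers would correspond to the preferred case in which both amino acid sequences are present.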
In a preferred aspect of the invention both of the amino acid sequences listed under iii) are present.
The invention also provides an isolated polypeptide sequence as set out in any of Figures 2a-d.
The polypeptides are preferably recombinant and ideally purified to homogeneity.
Also included as polypeptides according to the invention are variants, analogues and derivatives, particularly those in which a number of amino acids have been substituted, deleted or added. Polypeptides which have at least 70% identity to any of the polypeptide sequences according to the invention, in particular the sequences of Figures 2a-d, are encompassed within the invention. Preferably the identity is at least 80%, more preferably at least 90% and still more preferably at least 95%, for example 97%, 98% or even 99% identity to any of the sequences according to the invention, in particular the sequences of Figures 2a-d. Such polypeptides may also be fragments. In this regard a fragment is a part of a polypeptide according to the invention which retains sufficient identity to the original polypeptide to be effective, for example in a screen. Such fragments may be fused to other amino acids or polypeptides or may be comprised within a larger polypeptide. Such a fragment may be comprised within a precursor polypeptide designed for expression in a host. Therefore in one aspect the term fragment means a portion or portions of a fusion polypeptide or polypeptide derived from a polypeptide according to the invention.
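The identity thresholds above can be made concrete with a small Python sketch. The text does not state how percentage identity is to be calculated, so the convention used here (identical residues divided by the number of alignment columns, ignoring columns that are gaps in both sequences) is an assumption, as are the function name and the short example sequences. The two sequences are assumed to have been aligned beforehand.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of alignment columns with identical residues.

    Assumed convention: columns that are gaps in both sequences are ignored;
    a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared


# Invented ten-residue example differing at a single position.
print(percent_identity("MKTAYIAKQR", "MKTAYLAKQR"))
```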
Fragments also include portions of a polypeptide according to the invention characterised by structural or functional attributes of a polypeptide according to the invention. These may have similar or improved chemical or biological activity or reduced side-effect activity. For example fragments may comprise an alpha helix or alpha-helix forming region, beta sheet and beta-sheet forming region, turn and turn forming regions, coil and coil-forming regions, hydrophilic regions, hydrophobic regions, amphipathic regions (alpha or beta), flexible regions, surface-forming regions, substrate binding regions and regions of high antigenic index.
Fragments or portions may be used for producing the corresponding full length polypeptide by peptide synthesis.
Specific polypeptides according to the invention include the polypeptides of Helicobacter pylori, Haemophilus influenzae, Mycoplasma genitalium, Mycoplasma pneumoniae, Streptococcus pneumoniae, Streptococcus pyogenes, Pseudomonas aeruginosa, Saccharomyces cerevisiae, Methanococcus jannaschii, Neisseria gonorrhoeae, Neisseria meningitidis, Staphylococcus epidermidis, Aquifex aeolicus, Bacillus subtilis and Escherichia coli.
The present invention further provides isolated polynucleotides which encode the polypeptides as defined herein, polynucleotides complementary thereto, or polynucleotides hybridising to any of the aforesaid polynucleotides. Isolated polynucleotides have been removed by separation from their natural environment and those materials with which they are naturally associated. Preferably these polynucleotide molecules are provided in recombinant form (i.e. combined with one or more heterologous sequences).
Polynucleotide molecules which hybridise to polynucleotides encoding substances of the present invention, or to complementary polynucleotides thereto, preferably do so under stringent hybridisation conditions. One example of stringent hybridisation conditions which is sometimes used is where attempted hybridisation is carried out at a temperature of from about 35°C to about 65°C using a salt solution which is about 0.9 molar. However, the skilled person will be able to vary such conditions as appropriate in order to take into account variables such as probe length, base composition, type of ions present, etc.
The invention also provides polynucleotide variants, analogues, derivatives and fragments which encode polypeptides according to the invention. Polynucleotides are included which preferably have at least 70% identity over their entire length to a polynucleotide encoding a polypeptide according to the invention, most preferably those set out in Figures 2a-d. More preferred are those sequences which have at least 80% identity over their entire length to a polynucleotide encoding a polypeptide according to the invention. Even more preferred are polynucleotides which demonstrate at least 90% for example 95%, 97%, 98% or 99% identity over their entire length to a polynucleotide encoding a polypeptide according to the invention.
Polynucleotide molecules of the present invention may be used as probes for other members of the gene family or in anti-sense therapy to block or to reduce the expression of one or more of the polypeptides of the invention. Since these substances are believed to be essential to the bacteria expressing them, blocking or reducing their expression can provide an effective way of treating bacterial mediated diseases or disorders. Polynucleotides may also be used directly in screening and in generating whole cell screens by expression of a polypeptide of the invention. As part of the isolation process or thereafter the polynucleotides may be joined to other polynucleotides, such as to form fusions, or to regulatory elements for expression. Isolated polynucleotides alone or joined to other polynucleotides can be introduced into a vector which itself will contain other elements of DNA or RNA for expression in a host cell. The invention therefore comprises a vector containing a polynucleotide generally operatively linked to appropriate expression control sequences.
Vectors for use in the invention include plasmid vectors, phage vectors and DNA or RNA viral vectors. These vectors may include gene sequences which render them inducible under certain conditions, such as manipulation of the environmental conditions under which the host cells are maintained, for example by temperature alteration or nutrient additives. Regulatory sequences include for example a promoter to direct mRNA transcription. Such promoters include for example E. coli lac, trp, tac and araBAD as well as the SV40 early and late promoters. Such systems and sequences would be well known to those skilled in the art.
Host cells expressing a polynucleotide of the present invention can be generated by any of the traditional routes such as transfection or electroporation; see for example Davis et al., Basic Methods in Molecular Biology (1986) and Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Lab. Press, Cold Spring Harbor, N.Y. (1989).
This invention also provides a method for identification of molecules such as antagonists, that bind to the polypeptide or a polynucleotide encoding a polypeptide of the present invention.
Selective whole-cell screens combine the sensitivity and specificity of in vitro biochemical assays with the direct demonstration of in vivo activity seen in whole cell screens. Biochemical assays for inhibition of polypeptide activity with purified polypeptides or bacterial extracts can be more sensitive than whole cell killing assays and provide direct evidence for a compound's mode of action. However, this approach requires that the target polypeptide is known and that the activity of the polypeptide is amenable to in vitro assays. Nor does it address other factors, such as membrane permeability or compound stability, which can limit a compound's effectiveness as an antibiotic.
Whole cell screening of compounds for killing activity will identify molecules which kill cells at the concentrations tested, but provides no information on the mode of action of the compound and may not have the sensitivity needed to detect less potent compounds. Bacterial strains which contain surrogate markers whose activity is linked to that of the target gene, or which have been engineered to over-express or under-express the target polypeptide, can be used for selective whole-cell screens.
Surrogate markers, easily assayed reporter molecules whose activity is tightly coupled to the activity of the polypeptide being studied, may be used as a means of assaying antibiotics. The invention further provides a host cell comprising a vector as defined herein and a reporter gene encoding a reporter molecule whose activity is linked to that of the polypeptide encoded by the vector. Examples of such systems include a transcriptional fusion of the E. coli lacZ gene to vanH promoter in a B. subtilis strain expressing VanS and R as a reporter for inhibition of cell wall biosynthesis (J. Bacteriol. (1996) 178:6305-6309), the use of lacZ transcriptional and translational fusions to rpoB and rpoC to monitor RNA polymerase activity (Mol. Microbiol. (1996) 19:483-493) and the use of a secA-lacZ gene fusion as a reporter for inhibition of secA activity (Genetics (1988) 118:571-579).
When the function of a gene is unknown, surrogate markers for the activity of the gene can be identified using at least two approaches. Two dimensional electrophoresis coupled with mass spectrometry analysis of isolated polypeptides, proteome mapping, has been used to identify specific polypeptides which increase in abundance in response to polypeptide or RNA synthesis inhibitors (Microbial & Comparative Genomics (1996) 1:375). Tightly regulated promoters used to demonstrate that the E. coli and B. subtilis conserved, essential polypeptides are essential can also be used to reduce the concentrations of these polypeptides. In a manner similar to that described above, proteome maps generated from bacteria depleted of the conserved essential genes can be used to detect polypeptides which change in abundance as compared to wild-type bacteria. Transcriptional or translational fusions to these polypeptides can be used as reporter molecules to screen for antagonists of members of the conserved essential gene family. As an alternative to proteome mapping, transposons or other mobile genetic elements containing reporter genes can be used to search for reporter molecules. Such an approach has been used to identify vancomycin responsive genes in S. aureus (Antibiot. (Tokyo) (1991) 44:210-217). As with proteome mapping, bacteria in which conserved essential genes are controlled by tightly regulated promoters can be used to screen for transposon carrying strains in which expression of promoterless reporter genes is induced upon depletion of the polypeptides.
Once a reporter gene has been identified, screening of compounds for induction or inhibition of the marker can be undertaken. Standard broth or plate assays can be used in many different formats. Such assays will detect molecules which antagonise the response which couples the activity of the conserved, target polypeptide to the reporter molecule. Thus, the compounds identified may act directly upon the target polypeptide or on another stage in the pathway which leads to activation of the reporter.
Screens for inhibitors of the target which do not require the use of surrogate markers may be designed by manipulating expression levels of the target polypeptide. For example, quinolone resistant strains of E. coli have been made by over-expression of gyrA (FEMS Microbiol. Lett. (1997) 154:271-276), over-expression of alanine racemase has been shown to increase resistance to cycloserine in M. smegmatis (J. Bacteriol. (1997) 179:5046-5055), and multicopy plasmids carrying murZ have been shown to increase phosphomycin resistance in both E. coli (J. Bacteriol. (1992) 174:5748-5752) and A. calcoaceticus (FEMS Microbiol. Lett. (1994) 117:137-142). Similarly, strains more sensitive to antibiotics may be made by reducing expression levels of the polypeptide targeted by the antibiotic. Over- or under-expression of members of the conserved, essential gene family may be used to screen for antibiotics which act either directly on the gene or gene product or indirectly on the pathway in which it is involved.
Another example of an assay for antagonists is a competitive assay that combines the polypeptide of the present invention and a potential antagonist with membrane-bound binding molecules, recombinant binding molecules, natural substrates or ligands, or substrate or ligand mimetics, under appropriate conditions for a competitive inhibition assay. The polypeptide can be labelled, such as by radioactivity or a colorimetric compound, such that the number of polypeptide molecules bound to a binding molecule or converted to product can be determined accurately to assess the effectiveness of the potential antagonist.
The present invention therefore provides a method of assaying compounds for activity against bacteria comprising:
i) providing a polypeptide according to the invention;
ii) contacting said polypeptide with candidate inhibitory compounds; and
iii) measuring for binding to said polypeptide or fragment.
The present invention also provides a method of assaying compounds for activity against bacteria comprising:
i) expressing a polypeptide according to the invention in a host cell;
ii) contacting said cell with candidate inhibitory compounds; and
iii) measuring cell death.
The present invention further provides a method of screening for an antibiotic which method comprises:
i) transfecting a host cell with a vector comprising a polynucleotide encoding a polypeptide as defined herein;
ii) allowing the host cell to express the polynucleotide;
iii) increasing the level of expression of the polypeptide as defined herein; and
iv) assaying for increased resistance.
Alternatively the method may be carried out as above but the level of expression of the polypeptide is decreased and the cells are assayed for increased sensitivity to an inhibitor.
The present invention also provides a method of assaying compounds for activity against bacteria comprising:
i) generating a bacterial strain containing a reporter gene linked to the gene encoding a polypeptide according to the invention;
ii) contacting said strain with candidate inhibitory compounds; and
iii) measuring for induction or inhibition of said marker.
Potential antagonists include small organic molecules and ions which interact specifically with a polypeptide or polynucleotide, for example a substrate, cell membrane component, receptor, a fragment thereof or a peptide. Such molecules may include antibodies, antibody-derived reagents or chimaeric molecules.
Potential antagonists also may be small organic molecules, a peptide, a polypeptide such as a closely related protein or antibody that binds to the same sites on a binding molecule without inducing functional activity of the polypeptide of the invention.
The antibodies may be monoclonal or polyclonal. Techniques for producing monoclonal and polyclonal antibodies which bind to a particular polypeptide are now well developed in the art. They are discussed in standard immunology textbooks, for example in Roitt et al. (Immunology, Churchill Livingstone, 2nd Edition (1989)).
In addition to whole antibodies, the present invention covers variants thereof which are capable of binding to an epitope present on a substance of the present invention. The variants may be antibody fragments or synthetic constructs. Examples of antibody fragments and synthetic constructs are given by Dougall et al. in Tibtech 12:372-379 (September 1994). Antibody fragments include Fab and Fv fragments. Other synthetic constructs include CDR peptides. These are synthetic peptides comprising antigen binding determinants. Peptide mimetics may also be used. These molecules are usually conformationally restricted organic rings which mimic the structure of a CDR loop and which include antigen-interactive side chains. Synthetic constructs include chimaeric molecules. Thus, for example, humanised antibodies or derivatives thereof are within the scope of the present invention. An example of a humanised antibody is an antibody having human framework regions but rodent or other non-human hypervariable regions. Synthetic constructs also include molecules comprising a covalently linked moiety which provides the molecule with some desirable property in addition to antigen binding. For example the moiety may be a label (e.g. a fluorescent or radioactive label) or a pharmaceutically active agent.
Other potential antagonists include antisense molecules (see Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides As Antisense Inhibitors Of Gene Expression, CRC Press, Boca Raton, FL (1988), for a description of these molecules).
In a particular aspect the invention provides the use of the polypeptide, polynucleotide or antagonist of the invention to interfere with the initial physical interaction between a pathogen and mammalian host responsible for sequelae of infection.
The invention further includes molecules which block the function of the polypeptides according to the invention or a polynucleotide encoding the same, identifiable by any of the above described methods.
An antagonist of the invention may be provided in pharmaceutical compositions which may include a carrier. They may be provided in unit dosage form. Such agents and pharmaceutical compositions are within the scope of the present invention. In order to prepare such pharmaceutical compositions the inhibitors will normally be provided in substantially pure form. They can then be combined with a carrier under sterile conditions. The present invention also provides a method of treatment which comprises administering to a patient an effective amount of an antagonist of the expression or function of a polypeptide as defined herein.
The present invention further provides the use of an antagonist of a polypeptide as defined herein or a polynucleotide encoding the same for the manufacture of a medicament for the treatment of a bacterial infection.
Figures
Figure 1 shows the multiple sequence alignment of the yihA family members which may be used for BLAST based identification.
Figures 2a-d show both the position-dependent scoring matrices used for profile-based identification of yihA family members and examples of the motifs recognised by each matrix in the family members. Figure 2a shows examples of motif 1 in the yihA family. Figure 2b shows examples of motif 2 in the yihA family. Figure 2c shows examples of motif 3 in the yihA family. Figure 2d shows examples of motif 4 in the yihA family.
Figure 3 shows the PROSITE patterns which may be used to recognize yihA family members.
Figure 4 shows the outline cloning strategy for a gene disruption plasmid. The black box represents the adapter sequence.
Figure 5 shows growth dependence on arabinose of a conditional mutant in the E. coli gene yihA. An E. coli MG1655 derivative in which the chromosomal araBAD genes have been replaced with yihA and the native yihA gene has been deleted is shown on the upper half of each plate and a wild-type control is shown on the lower half of each plate.
Figure 6 is a diagram of the vector used to create conditional mutants in B. subtilis.
Figure 7 shows growth dependence on xylose of a conditional mutant in the B. subtilis yihA orthologue ysxC.
Examples
Example 1. Identification of conserved bacterial open reading frames.
The predicted open reading frames obtained from the complete E. coli genomic sequence (Science (1997) 277:1453-1474) were compared in a serial manner to the predicted open reading frames of the H. influenzae (Science (1995) 270:397-403), M. genitalium (Science (1995) 270:397-403), Synechocystis (Nuc. Acids Res. (1998) 26:63-67) and B. subtilis (Nature (1997) 390:249-256) complete genome sequences using the BLAST algorithm (J. Mol. Biol. (1990) 215:403-10). All matches with a BLAST score of greater than 75 were then analysed in a pair-wise fashion using the SIM algorithm (Advances in Applied Mathematics (1991) 12:337-357). The SIM score was then divided by a "selfSIM" score, a value obtained when the query protein is compared to itself using the SIM algorithm with the PAM200 matrix, to yield a similarity value of between 0 and 1.0. Proteins for which this similarity value was greater than 0.2 when the E. coli protein was compared to either the B. subtilis or M. genitalium genome were then compiled into a list and manually screened to identify proteins of unknown function. Those open reading frames which also had high similarity values in other bacteria were then considered as candidate genes and targets for gene disruption.
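The normalisation and filtering step of this example can be sketched in Python as follows. The record layout, the open reading frame names and all scores are hypothetical; only the division of the SIM score by the selfSIM score and the 0.2 cut-off against either the B. subtilis or the M. genitalium genome are taken from the text.

```python
SIMILARITY_CUTOFF = 0.2  # from the text
REFERENCE_GENOMES = ("B. subtilis", "M. genitalium")  # from the text


def similarity_value(sim_score, self_sim_score):
    """SIM score of the pair divided by the query's self-comparison score."""
    if self_sim_score <= 0:
        raise ValueError("selfSIM score must be positive")
    return sim_score / self_sim_score


def candidate_orfs(records, cutoff=SIMILARITY_CUTOFF):
    """Return ORFs whose similarity value exceeds the cutoff in either genome."""
    kept = []
    for record in records:
        values = [
            similarity_value(record["sim"][genome], record["self_sim"])
            for genome in REFERENCE_GENOMES
            if genome in record["sim"]
        ]
        if values and max(values) > cutoff:
            kept.append(record["orf"])
    return kept


# Invented records: one SIM score per compared genome plus the selfSIM score.
TOY_RECORDS = [
    {"orf": "orf1", "self_sim": 1000.0,
     "sim": {"B. subtilis": 350.0, "M. genitalium": 150.0}},
    {"orf": "orf2", "self_sim": 800.0,
     "sim": {"B. subtilis": 80.0, "M. genitalium": 120.0}},
]

print(candidate_orfs(TOY_RECORDS))
```

Dividing by the selfSIM score puts proteins of different lengths on a common scale, since the raw SIM score of a long protein would otherwise tend to exceed that of a short one at the same degree of conservation.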
Example 2. Demonstration of essentiality of yihA genes in E. coli.
2A - In-frame deletion of selected genes in E. coli.
A disruption plasmid was constructed using DNA containing an in-frame deletion of the gene of interest plus ~900 base pairs of 5' and 3' flanking DNA for homologous recombination. The plasmid was cloned into the gene-replacement vector pKO3 as follows: Two separate PCR reactions were used to amplify fragments of approximately 900 base pairs of 5' and 3' sequence flanking the gene of interest. Chromosomal DNA from E. coli strain MG1655 was used as the template. Primers 2 and 3 carry a 5' extension of a 33 bp adapter sequence:
adapter sequence forward direction 5'-gttataaatttggagtgtgaaggttattgcgtg;
adapter sequence reverse direction 5'-cacgcaataaccttcacactccaaatttataac.
Subsequently, the 2 PCR products were purified using High Pure™ PCR Product Purification Kit (Boehringer Mannheim Inc., Mannheim, GE). Using the adapter sequence, the 2 PCR products were assembled in a second PCR reaction to give a single product. Following restriction enzyme digestion, preparative agarose gel electrophoresis and purification using Jetsorb™ Gel Extraction Kit (Genomed Inc.) the final product was cloned into pKO3 using standard techniques. This clone is referred to as the disruption plasmid. All PCR reactions described in this section were performed with PWO™ DNA Polymerase (Boehringer Mannheim Inc., Mannheim, GE). In the final product the gene of interest was deleted from the start to the stop codon and replaced by the 33 bp adapter sequence [e.g. 5'-ATGgttataaatttggagtgtgaaggttattgcgtgTAA-3']. As a consequence the reading frame is maintained.
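The properties of the adapter described above can be checked with a few lines of Python. The sequences are those given in the text; the helper names are hypothetical, and the check is only a sanity test of the design (complementarity of the two adapter strands and preservation of the reading frame), not part of the cloning procedure itself.

```python
ADAPTER_FWD = "gttataaatttggagtgtgaaggttattgcgtg"
ADAPTER_REV = "cacgcaataaccttcacactccaaatttataac"
STOP_CODONS = {"taa", "tag", "tga"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned in lower case."""
    return seq.lower().translate(str.maketrans("acgt", "tgca"))[::-1]


def codons(seq):
    """Split a sequence into consecutive triplets starting at position 0."""
    seq = seq.lower()
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


# Start codon + 33 bp adapter + stop codon, as in the example in the text.
deletion_allele = "ATG" + ADAPTER_FWD + "TAA"
triplets = codons(deletion_allele)

print(reverse_complement(ADAPTER_FWD) == ADAPTER_REV)    # strands pair
print(len(ADAPTER_FWD), len(deletion_allele) % 3)         # length and frame
print([c for c in triplets[:-1] if c in STOP_CODONS])     # internal stops
```

Because the adapter is 33 bp long, a multiple of three, and contains no stop codon in the frame set by the ATG, translation runs from the original start codon to the original stop codon.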
2B - Construction of an in-frame deletion mutant of Escherichia coli
The disruption vector pKO3 (A.J. Link et al., J. Bacteriol. 179:6228-6237, 1997) is a derivative of pMAK700 (C.A. Hamilton et al., J. Bacteriol. 171:4617-4622). It features the repA (Ts) replication origin derived from pSC101 [permissive at 30°C but inactive at 42 to 44°C], the cat gene encoding chloramphenicol resistance and the sacB gene for counter selection against vector sequences in the presence of 5% sucrose.
The disruption plasmid described above was transformed into MG1655. Subsequently, chromosomal integrates (cointegrates produced by a single homologous recombination event) of the plasmid were isolated by selecting clones on chloramphenicol at 44°C. Following two rounds of purification under the same conditions, the cointegrates were grown at 30°C in the presence of 5% sucrose to force resolution of the cointegrate and elimination of the plasmid from the cell. At this step, a preliminary assignment of whether a given gene is essential or non-essential for growth of E. coli in complex media was made. The genotype of the chloramphenicol-sensitive clones obtained following cointegration and resolution of the disruption plasmid was determined by colony-PCR using primers c1 and c2 (see Fig. 4). In the case of a non-essential gene, the second recombination event can result in either a wild-type or a mutant genotype. Testing of 20 independent clones routinely showed a ~1:1 distribution of wild-type versus mutant genotypes in the case of a non-essential gene. Recovery of only the wild-type genotype in 50 independent clones was considered preliminary evidence for a gene's essentiality.
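The weight of the 50-clone criterion can be seen from a short calculation, sketched here in Python. The calculation assumes that clones are independent and that resolution of a non-essential gene yields the wild-type genotype with a probability of exactly 0.5 (the ~1:1 distribution reported above); neither the function name nor this statistical reading appears in the original text.

```python
def prob_all_wild_type(n_clones, p_wild_type=0.5):
    """Probability that every one of n independent clones is wild-type."""
    if not 0.0 <= p_wild_type <= 1.0:
        raise ValueError("p_wild_type must be a probability")
    return p_wild_type ** n_clones


# Chance of seeing only wild-type clones if the gene were non-essential.
print(prob_all_wild_type(20))
print(prob_all_wild_type(50))
```

Under these assumptions the chance of recovering 50 wild-type clones in a row for a non-essential gene is below 1 in 10^15, which is why such a result points to essentiality, although a low recombination frequency on one flank could bias the ratio and is one reason the evidence is treated as preliminary.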
2C - Construction of a conditional mutant and final proof that a given gene is essential for growth of E. coli
A vector, pRDC15, was designed which allows a copy of a putative essential gene to be placed in an ectopic position on the chromosome under the control of a tightly regulated promoter. The plasmid is a derivative of pKO3. In addition to the attributes of pKO3, pRDC15 carries a DNA fragment consisting of the araC gene, the arabinose promoter, a cloning site [BamHI-NheI-SfiI-XhoI-SphI-Sftl] and the polB gene. The wild-type copy of a putative essential gene was amplified by PCR and cloned into the vector pRDC15 using restriction sites NheI and XhoI. The resulting construct was used for gene replacement in a manner identical to the disruption plasmids described above. In this case the araC and polB genes of pRDC15 represent the homologous DNA for recombination at the araCBADpolB locus of the E. coli chromosome. Following cointegration and resolution, the araBAD genes in the E. coli chromosome are replaced by the wild-type copy of the gene of interest, which is now under the control of the arabinose promoter. This merodiploid strain is then used to construct an in-frame deletion of the wild-type target gene using the disruption plasmid described above in the presence of 0.2% arabinose. In this case, the deletion mutant can be obtained since a wild-type copy is expressed in trans from the arabinose locus. The resulting strain is a conditional mutant as expression of the target gene is now dependent on the presence of arabinose. The inability of such a strain to grow in the absence of arabinose is a final proof that a given gene is essential for growth of E. coli. Figure 5 shows that the gene yihA is essential in E. coli.
Example 3. ysxC is an essential gene in Bacillus subtilis.
3A - Construction of a B. subtilis integrative plasmid for xylose controlled gene expression.
An integrative plasmid allowing the expression of genes under the control of a xylose inducible promoter was constructed as follows: A DNA fragment carrying the repressor gene xylR and the xylA promoter was PCR amplified from B. subtilis genomic DNA with the following primers:
pxyl-4: 5'-atcgctcgagAGATGCACCTTCTATACCCG-3'
pxyl-7: 5'-atcgaagcttAGCGATCCTACACAATCATG-3'
The primers were designed such that they introduced a unique EcoRI site at the 5' end of the PCR product and a unique BamHI site at the 3' end of the product. The PCR fragment was then cloned as an EcoRI-BamHI fragment into the B. subtilis integrative vector pDG648 to yield pRDC9 (Figure 6).
3B - Construction of the disruption plasmid.
A DNA fragment containing approximately 100 bp of sequence from the 5' region of ysxC was amplified by PCR from B. subtilis genomic DNA. The PCR primers were designed such that the resulting PCR product contained unique restriction sites at both the 5' and 3' ends. Subsequently, the PCR product was cloned into pRDC9.
3C - Construction of a conditional mutant.
The disruption plasmid was inserted into B. subtilis strain JH642. Chromosomal integration of the plasmid via single-reciprocal Campbell-like recombination at the ysxC locus was driven by selection on LB plates containing erythromycin (1 μg/ml), lincomycin (25 μg/ml) and 10 mM xylose. The resulting strain is a conditional mutant in which expression of ysxC is dependent on the presence of xylose in the growth medium.
3D - Confirmation that ysxC is an essential gene.
Confirmation that ysxC is essential for growth was obtained by streaking the ysxC conditional mutant on LB plates containing erythromycin (1 μg/ml) and lincomycin (25 μg/ml), with or without 10 mM xylose. The strain formed single colonies only on xylose-containing plates, thereby indicating that expression of ysxC is indispensable for growth (Figure 7).
Example 4 - Characterisation of the yihA polypeptide family
4A - Repetitive BLAST searches
Repetitive BLAST searches (Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-10) were performed in which each of the yihA protein family members described below was used in succession as a query sequence. Other members of the yihA family were identified as proteins which yield high-scoring segment pair (HSP) scores of greater than 100 in comparison to at least one member of the yihA polypeptide sequences shown in Figure 1 when a BLOSUM62 scoring matrix is used.
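The repetitive search described above amounts to growing the family until no new sequence passes the cut-off against any current member. The following is a minimal sketch of that loop under stated assumptions: `hsp_score` is a hypothetical callable standing in for a real BLAST comparison with the BLOSUM62 matrix, and the sequence identifiers and scores in the usage example are made up for illustration.

```python
from typing import Callable, Iterable, List, Set

HSP_THRESHOLD = 100.0  # cut-off quoted in the text (scores greater than 100)


def expand_family(
    seeds: Iterable[str],
    database: Iterable[str],
    hsp_score: Callable[[str, str], float],
    threshold: float = HSP_THRESHOLD,
) -> Set[str]:
    """Use each family member in turn as a query and add every database
    entry whose score against that query exceeds the threshold."""
    candidates: List[str] = list(database)
    family: Set[str] = set(seeds)
    pending: List[str] = list(family)  # members not yet used as a query
    while pending:
        query = pending.pop()
        for candidate in candidates:
            if candidate in family:
                continue
            if hsp_score(query, candidate) > threshold:
                family.add(candidate)
                pending.append(candidate)  # new member becomes a query too
    return family


if __name__ == "__main__":
    # Made-up pairwise scores; a real run would call BLAST here.
    toy_scores = {
        frozenset(("seqA", "seqB")): 150.0,
        frozenset(("seqB", "seqC")): 120.0,
        frozenset(("seqC", "seqD")): 99.0,
    }

    def toy_hsp_score(a: str, b: str) -> float:
        return toy_scores.get(frozenset((a, b)), 0.0)

    members = expand_family(["seqA"], ["seqA", "seqB", "seqC", "seqD"], toy_hsp_score)
    print(sorted(members))
```

In the toy run, seqC is reached only through seqB, which is the point of using each newly found member as a query in succession.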
Sources for each of the sequences set out in Figure 1 are given below:
H. influenzae - yihA, Swissprot accession number P46453
E. coli - yihA, Swissprot accession number P24253
S. epidermidis - Glaxo Wellcome S. epidermidis genomic sequencing project ORF Z0304001 (B. Kimmerly, unpublished data)
B. subtilis - ysxC, Swissprot accession number P38424
S. pyogenes - gnl|OUACGT|Contig301 from S. pyogenes genome sequencing project, B.A. Roe, S. Clifton, Mike McShan and Joseph Ferretti (http://www.genome.ou.edu/strep.html)
S. pneumoniae - Glaxo Wellcome S. pneumoniae genomic sequencing project contig SP07_00013 (G. Feger, unpublished data)
M. jannaschii - Y320, Swissprot accession number Q57768
M. genitalium - Y335, Swissprot accession number P47577
M. pneumoniae - Y335, Swissprot accession number P75303
S. cerevisiae - D9651.4, Swissprot accession number Q05473
N. gonorrhoea - GenBank accession number g1914833
A. aeolicus - GenBank accession number g2984110
P. aeruginosa - gnl|PAGP|Contig639, the Pseudomonas Genome Project (http://www.pseudomonas.com)
N. meningitidis - contig GNMCY55F; sequence data for N. meningitidis was obtained from The Institute for Genomic Research website at http://www.tigr.org
H. pylori - HP1567, GenBank accession number g2314750
4B - Profile based searches
Multiple sequence alignments of the yihA family members have been used to identify short patterns of amino acid sequences which are common to all of the family members. Four motifs have been identified in the yihA gene family using the motif discovery tool MEME (Bailey, T. L. and Elkan, C., Fitting a mixture model by expectation maximization to discover motifs in biopolymers, Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology, pp. 28-36, AAAI Press, Menlo Park, California, 1994). Each of the four motifs is shown as it exists in each of the family members and is explicitly described as a position-dependent scoring matrix, or profile. Together these profiles can be used by the motif alignment and search tool MAST, described in the same reference, to search databases for yihA family members, which are positively identified when p-values of less than 1 x 10^-30 are obtained. The p-values are based on a random sequence model that assumes each position in a random sequence is generated according to the average letter frequencies of all sequences in the peptide non-redundant database (ftp://ncbi.nlm.nih.gov/blast/db/) on September 22, 1996.
Tables 1 to 4 show the position-dependent scoring matrices used to define the yihA family. Values in each position-dependent scoring matrix are calculated by taking the log (base 2) of the ratio p/f at each position in the motif, where p is the probability of a particular letter at that position in the motif and f is the average frequency of that letter in the training set. Columns correspond to 1 letter amino acid codes and rows correspond to the position in the motif.
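The log2(p/f) calculation can be illustrated in a few lines. The sketch below uses a made-up four-letter alphabet and made-up probabilities rather than values from Tables 1 to 4, and it assumes every p is strictly positive (for example after pseudocounts). Summing one entry per position along a window is the usual way such a profile is applied to a sequence; the conversion of those sums into MAST p-values is not reproduced here.

```python
import math
from typing import Dict, Sequence

# Made-up background frequencies f for a toy four-letter alphabet.
BACKGROUND: Dict[str, float] = {"G": 0.125, "K": 0.125, "S": 0.25, "A": 0.5}

# Made-up per-position letter probabilities p for a motif of width 2.
MOTIF_PROBS = [
    {"G": 0.5, "K": 0.125, "S": 0.25, "A": 0.125},
    {"G": 0.125, "K": 0.5, "S": 0.125, "A": 0.25},
]


def log_odds_row(position_probs: Dict[str, float],
                 background: Dict[str, float]) -> Dict[str, float]:
    """Return log2(p/f) for every letter at one motif position."""
    return {
        letter: math.log2(p / background[letter])
        for letter, p in position_probs.items()
    }


def score_window(window: str, matrix: Sequence[Dict[str, float]]) -> float:
    """Sum the log-odds entries selected by each letter of the window."""
    if len(window) != len(matrix):
        raise ValueError("window length must equal the motif width")
    return sum(row[letter] for letter, row in zip(window, matrix))


if __name__ == "__main__":
    matrix = [log_odds_row(row, BACKGROUND) for row in MOTIF_PROBS]
    for position, row in enumerate(matrix, start=1):
        print(position, row)
    print("GK", score_window("GK", matrix))
    print("AS", score_window("AS", matrix))
```

A letter that is enriched at a position relative to the background receives a positive entry, and a depleted letter a negative one, which matches the pattern of a few large positive values per row in the tables that follow.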
Table 1 - Position-dependent scoring matrix.
Values in the position-dependent scoring matrix are calculated by taking the log (base 2) of the ratio p/f at each position in the motif where p is the probability of a particular letter at that position in the motif, and f is the average frequency of that letter in the training set. Columns correspond to 1 letter amino acid codes and rows correspond to the position in the motif.
log-odds matrix: alength= 20 w= 19 n= 2828 bayes= 7.551
A C D E F G H I K L M N P Q R S T V W Y
1 -4.627 -5.566 -1.994 3.934 -5.992 -5.13? -4.319 -5.263 -2.733 -6.028 -5.261 -4.113 -5.943 -3.377 -5.195 -5.180 -5.125 -5.356 -5.664 -5.723
2 -3.632 -3.194 -5.898 -5.772 -3.519 -5.982 -5.786 3.052 -5.545 -2.032 1.337 -5.403 -5.893 -5.608 -5.850 -5.283 -3.559 2.720 -5.293 -4.633
3 3.001 2.369 -4.617 -4.272 -3.291 -2.960 -4.078 1.008 -4.231 -2.612 -2.179 -4.156 -4.831 -4.133 -4.190 -2.080 -2.662 0.958 -4.024 -3.972
4 -4.441 -3.810 -6.755 -6.354 3.779 -6.364 -4.859 -1.516 -6.175 1.552 -1.063 -6.052 -5.833 -5.098 -5.765 -5.706 -4.489 0.962 -3.440 -2.903
5 2.260 -1.636 -4.165 -3.572 -1.822 -3.155 -2.660 -0.854 -3.287 0.502 2.231 -3.197 -3.684 -2.961 -3.158 -0.234 -1.826 1.831 -2.536 -2.261
6 -3.229 -4.353 -3.776 -4.543 -5.442 3.784 -4.184 -5.595 -4.364 -5.904 -4.855 -3.514 -4.954 -4.740 -4.265 -3.737 -4.690 -5.096 -4.672 -5.030
7 -3.596 -2.891 -4.449 -4.574 -4.716 -1.839 -2.006 -4.160 -1.434 -4.147 -3.764 -3.344 -3.774 -2.453 4.113 -3.831 -3.889 -4.786 -3.066 -4.173
8 -1.748 -1.931 -3.399 -3.981 -3.613 -3.218 -3.189 -3.780 -3.158 -3.970 -2.935 -1.905 -3.410 -3.351 -3.068 3.411 0.897 -3.748 -3.757 -3.487
9 -4.577 -4.085 -3.088 -4.952 -4.646 -4.342 -1.904 -4.290 -4.121 -5.108 -4.290 4.350 -4.739 -3.657 -4.372 -2.917 -3.619 -4.710 -4.165 -4.371
10 2.063 -2.724 -5.672 -5.589 -4.330 -3.735 -5.152 -1.555 -5.640 -3.411 -3.208 -4.913 -4.771 -5.121 -5.328 -3.060 -3.101 3.284 -5.428 -5.211
11 -3.229 -4.353 -3.776 -4.543 -5.442 3.784 -4.184 -5.595 -4.364 -5.904 -4.855 -3.514 -4.954 -4.740 -4.265 -3.737 -4.690 -5.096 -4.672 -5.030
12 -3.867 -3.862 -4.925 -4.583 -5.500 -4.863 -3.812 -4.287 3.990 -5.023 -4.091 -3.799 -4.740 -4.012 -0.799 -4.651 -4.128 -4.881 -4.315 -4.890
13 -4.038 -6.288 -6.639 -6.087 -5.193 -5.958 3.726 -3.427 -3.021 -3.852 -2.811 -1.774 -3.320 -3.?43 -2.936 3.310 1.414
15 1.994 -3.319 -6.193 -5.572 3.131 -5.434 -4.351 0.501 -5.279 1.257 -0.928 -5.197 -5.317 -4.498 -4.980 -4.710 -3.609 -1.791 -3.464 -3.333
16 -3.717 -3.345 -5.557 -5.469 -2.811 -5.755 -5.203 3.366 -5.150 1.590 -1.382 -5.084 -5.616 -5.013 -5.350 -4.988 -3.578 0.649 -4.414 -4.095
Table 2 - Position-dependent scoring matrix.
Values in the position-dependent scoring matrix are calculated by taking the log (base 2) of the ratio p/f at each position in the motif where p is the probability of a particular letter at that position in the motif, and f is the average frequency of that letter in the training set. Columns correspond to 1 letter amino acid codes and rows correspond to the position in the motif.
6 1.315 -3.131 -1.890 -1.433 -3.636 -0.123 -1.525 -3.408 2.806 -3.268 -2.398 -1.520 -3.090 0.825 -1.109 0.796 -1.936 -2.924 -3.543 -2.911
Table 3 - Position-dependent scoring matrix.
Values in the position-dependent scoring matrix are calculated by taking the log (base 2) of the ratio p/f at each position in the motif where p is the probability of a particular letter at that position in the motif, and f is the average frequency of that letter in the training set. Columns correspond to 1 letter amino acid codes and rows correspond to the position in the motif.
log-odds matrix: alength= 20 w= 55 n= 7288 bayes= 7.56739
A C D E F G H I K L M N P Q R S T V W Y
1 -3.934 -3.203 -3.997 -4.110 -0.772 -4.162 -3.343 -3.909 -3.898 -2.533 -2.718 -3.774 -4.478 -2.019 -2.222 -4.098 -3.943 -3.4?6 6.039 -1.474
2 -1.730 -3.370 -1.305 0.733 -3.473 2.151 -1.392 -3.123 0.731 -3.034 -2.181 -1.447 -2.907 2.255 -1.013 -1.775 -1.855 0.945 -3.411 -2.764
3 -2.000 1.756 -2.177 -1.473 -3.631 -0.022 -1.342 -3.203 2.339 -3.046 1.484 1.539 -3.146 -0.829 2.065 -1.981 -1.986 -2.845 -3.338 -2.801
4 0.197 -1.385 -3.750 -3.109 0.849 -3.085 1.460 -0.433 -2.791 0.816 3.712 -2.735 -3.240 -2.463 -2.685 -0.022 -1.413 0.719 -2.049 -1.725
5 -2.336 -2.185 -4.401 -0.157 -1.693 -4.023 -2.996 2.136 -3.501 2.448 -0.465 -3.6?2 -3.907 -3.072 -3.337 -3.164 0.088 -0.762 -2.772 -2.568
6 -1.558 -2.953 -1.447 2.354 -2.931 1.805 -1.265 -2.579 -0.536 -0.373 -1.813 -1.378 -2.788 -0.781 0.664 -1.621 1.??3 -2.265 2.109 -2.391
7 0.111 -4.114 1.524 2.820 -4.021 -3.034 -1.861 -3.596 -1.272 -3.485 -2.611 -1.804 -2.991 0.991 -2.033 -2.083 0.274 -3.042 -4.072 0.924
8 -3.624 -3.259 -4.426 -4.323 1.439 -4.726 -1.099 -3.333 -4.143 -0.268 -2.691 -3.697 -4.537 -3.692 -3.822 -3.70? -3.959 -3.401 -1.033 4.503
9 -3.202 -2.837 -4.756 -4.018 -1.598 -4.753 -3.330 0.694 -0.719 3.054 -0.114 -4.149 -3.973 -3.054 -3.345 -3.896 -3.091 -1.498 -3.001 -3.026
10 -1.176 -2.763 -1.190 1.190 -2.792 -2.302 -0.893 -2.593 -0.410 -0.469 1.608 -0.??5 -2.377 1.676 0.588 0.878 2.017 -2.179 -2.857 -2.159
Figure imgf000026_0001
12 -3.448 -4.066 -1.819 -3.977 -0.660 -3.566 -3.158 -3.616 -2.231
13 0.111 -4.176 1.526 2.380 -4.066 -3.094 -1.900 -3.614 1.778 -3.498 -2.623 -1.909 -3.001 0.996 -2.024 -2.130 -2.202 0.046 -4.104 -3.322
14 -1.881 1.779 -0.942 1.251 -3.569 -2.232 -1.293 -3.561 -1.197 -3.447 -2.593 2.570 -2.950 1.769 -1.757 1.497 -1.825 -3.126 -3.663 -2.793
37 [values illegible in the source]
38 -5.310 -4.123 -5.717 -5.836 3.300 -5.661 -1.848 -4.275 -5.286 -3.732 -3.628 0.710 -5.577 -4.311 -4.677 -4.909 -4.984 -4.358 4.483 2.548
39 1.?28 -2.??? -?.484 4.848 2.411 -4.750 -3.726 1.?41 -4.545 1.844 -0.609 -4.470 -4.698 ?.904 4.299 -3.970 -2.946 -1.304 -3.043 -2.928
40 -1.951 -3.441 -1.932 0.508 -3.651 -2.975 3.060 -3.254 2.310 -3.077 -2.240 -1.645 -3.094 1.802 -0.526 -1.957 -1.971 0.986 -3.375 -2.811
41 0.737 -2.472 0.374 0.375 -0.515 -2.336 1.464 -2.468 -0.710 -2.359 -1.570 -1.029 -2.435 -0.708 -1.174 0.058 0.938 -2.139 -1.961 3.008
42 -0.547 -1.644 -2.512 -2.296 1.711 -3.019 1.293 -1.842 -2.275 -0.816 -1.258 -2.268 -2.787 -2.246 -2.177 -0.050 0.106 -1.749 -0.550 3.741
43 -2.070 -3.618 1.532 0.624 -3.858 2.269 -1.490 -3.840 -1.439 -0.442 -2.881 1.704 -3.104 0.917 -2.059 -1.649 -2.048 -3.378 -3.935 -3.042
44 -2.732 -2.530 -4.827 -4.444 -2.206 -4.543 -3.727 3.099 -3.939 0.766 -0.935 -4.055 -4.579 -3.830 1.720 -3.707 -2.657 0.836 -3.376 -3.060
45 -0.210 -3.173 -3.137 -2.991 -3.775 -3.343 -2.837 -3.571 -2.788 -3.408 -3.176 -3.264 4.034 -2.590 -2.998 -2.275 -2.559 -3.123 -4.296 -4.141
46 -1.096 -1.575 -0.912 -3.271 0.410 -3.733 -2.671 0.487 -3.279 -1.847 -1.461 -3.387 -3.329 -3.219 -2.973 -3.017 -0.475 3.375 -3.325 -3.318
47 -1.334 -1.801 -2.194 0.276 -1.754 -2.731 2.946 1.772 -1.351 1.128 -0.721 -1.788 -2.844 0.793 -1.683 -1.717 0.385 -0.863 -2.318 -1.878
48 -0.207 -2.002 -4.133 -3.894 -2.913 -4.368 -3.392 1.652 -3.925 -0.559 -1.822 -4.015 -3.935 -3.913 -3.650 -3.694 -2.109 3.289 -4.065 -3.997
49 -2.847 -2.579 -5.366 -4.797 -2.054 -4.756 -3.855 1.613 -4.518 2.072 -0.948 -4.427 -4.725 -4.043 -4.362 -3.969 -2.823 2.239 -3.352 1.093
50 1.318 1.712 -4.308 -3.647 0.781 -3.626 -2.645 0.687 -3.335 2.154 1.658 -3.294 -3.646 -2.883 -3.133 -2.761 -1.929 -0.756 -2.394 -2.169
51 -2.241 -2.252 -3.594 -4.028 -3.774 -3.825 -3.207 -2.770 -3.223 -3.897 -2.266 -2.041 -3.867 -2.922 -3.251 1.317 3.584 -2.395 -3.813 -4.069
52 -3.540 -3.578 -4.644 -4.201 -5.184 -4.616 -3.450 -3.932 3.944 -2.689 -3.715 -3.458 -4.474 -3.554 -0.458 -4.311 -3.789 -4.508 -4.028 -4.556
53 3.217 -0.823 -3.447 -0.897 -0.292 -1.809 -2.875 -0.486 -3.013 -2.266 0.560 -3.076 -3.999 -2.947 -2.983 -0.892 -1.855 -0.986 -2.896 -2.998
54 -3.227 -3.646 4.064 -1.034 -4.141 -3.671 -2.470 -4.063 -3.784 -4.397 -3.716 -0.857 -4.576 -3.351 -3.793 -3.229 -1.684 -3.849 -3.918 -3.682
55 -3.722 -3.725 -4.788 -4.431 -5.370 -4.738 -3.668 -4.141 3.979 -4.880 -3.943 -3.651 -4.610 -3.853 -0.650 -4.506 -3.983 -4.735 -4.181 -4.751
Table 4 - Position-dependent scoring matrix.
Values in the position-dependent scoring matrix are calculated by taking the log (base 2) of the ratio p/f at each position in the motif where p is the probability of a particular letter at that position in the motif, and f is the average frequency of that letter in the training set. Columns correspond to 1 letter amino acid codes and rows correspond to the position in the motif.
log-odds matrix: alength= 20 w= 11 n= 2948 bayes= 8.42265
A C D E F G H I K L M N P Q R S T V W Y
1 -3.251 -2.238 -4.266 -4.273 4.412 -4.520 -3.589 -2.262 -4.336 -1.531 -1.815 -4.133 -3.973 -4.350 -4.429 -2.348 -3.957 -2.565 -1.706 -0.484
2 -1.492 -1.656 -3.075 -3.660 -3.319 -2.959 -2.883 -3.481 -2.845 -3.673 -2.640 -1.597 -3.127 -3.047 -2.764 3.350 1.017 -3.475 -3.467 -3.189
3 1.357 -1.700 -3.628 -4.024 -3.973 -2.362 -3.467 -4.035 -3.291 -4.205 -3.211 -2.436 -3.343 -3.451 -3.508 3.217 -0.385 -3.124 -4.097 -3.973
4 -1.177 -1.778 -1.669 0.660 -1.660 -2.557 -1.342 -0.886 -1.111 1.591 -0.557 -1.589 1.218 1.041 -1.493 -1.542 -1.246 1.374 -2.258 -1.823
5 -1.178 -2.769 -0.870 0.822 -2.840 -2.048 -0.762 -2.695 2.664 -2.568 -1.681 1.078 -2.327 -0.349 -0.720 0.472 0.645 -2.274 -2.856 -2.124
6 -3.300 -3.580 -4.306 -3.397 -5.045 -4.242 -2.601 -3.834 3.788 -4.167 -3.321 -2.986 -4.249 -?.218 1.205 -3.746 -3.350 -4.104 -3.840 -4.086
7 0.309 -1.990 -1.651 -0.858 0.045 -2.442 -0.735 1.287 0.696 -1.620 -0.853 -1.265 -2.463 3.158 -1.010 -1.366 -1.221 -1.206 -2.321 -1.888
8 -1.982 -3.297 -2.620 -3.364 -4.418 3.699 -3.109 -4.447 -3.195 -4.879 -3.686 -2.322 -3.970 -3.610 -3.132 -2.548 -3.557 -3.938 -3.660 -3.973
9 -2.759 -2.728 -3.818 -3.828 -2.304 -4.251 -3.613 3.?32 -1.540 -0.408 0.447 -3.395 -4.271 -3.502 -3.681 -3.302 -2.474 1.261 -3.271 -2.772
10 -3.199 -3.601 4.081 -0.980 -4.085 -3.653 -2.431 -4.009 -3.759 -4.341 -3.662 -0.818 -4.529 -3.320 -3.751 -3.218 -3.612 -3.800 -3.857 -3.631
11 -1.899 -4.815 1.200 3.072 -4.647 -3.178 -2.177 -3.839 1.599 -0.901 -2.898 -1.886 -3.017 -0.736 -2.292 -2.380 -2.441 -3.183 -4.622 -3.780
Figure imgf000029_0001
4C - PROSITE based searches
The conserved sequence elements identified with MEME can also be represented as PROSITE patterns using the conventions outlined in PROSITE: A dictionary of protein sites and patterns (http://www.expasy.ch/sprot/prosite.html) and Bairoch A., Bucher P., Hofmann K. The PROSITE database, its status in 1995. Nucleic Acids Res. 24:189-196 (1995). YihA family members are positively identified when an exact match to either of the two PROSITE patterns (pattern 1 or pattern 2) described in Figure 3 is obtained.
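A pattern written in this notation can be tested against a sequence by translating it into a regular expression. The sketch below is a minimal translator covering only the elements used by the two patterns recited in claim 1(iii): single letters, bracketed alternatives, x or X for any residue, and a repeat count in round brackets. It is not a full PROSITE parser, and the sequences in the usage example are made-up strings, not real proteins.

```python
import re

# The two patterns as recited in claim 1(iii).
PATTERN_1 = "E-X(4)-G-[GR]-[STAG]-N-X-G-K-S-[STAG]"
PATTERN_2 = "[VILM]-A-X(2)-S-X(2)-[PT]-G-X-T-[RKQN]-X(2)-N-X-[FY]"

# One element: a bracketed set or a single letter, optionally followed by (n).
_ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)\))?")


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, count = match.groups()
        if residue in ("x", "X"):
            residue = "."  # any one amino acid residue
        parts.append(residue + (f"{{{count}}}" if count else ""))
    return "".join(parts)


def contains_pattern(sequence: str, pattern: str) -> bool:
    """True if the sequence contains an exact match to the pattern."""
    return re.search(prosite_to_regex(pattern), sequence) is not None


if __name__ == "__main__":
    print(prosite_to_regex(PATTERN_1))
    print(prosite_to_regex(PATTERN_2))
    # Made-up test strings constructed to satisfy each pattern.
    print(contains_pattern("MMEAAAAGRSNVGKSSLL", PATTERN_1))
    print(contains_pattern("QQVAGGSGGPGGTRGGNGFQQ", PATTERN_2))
```

Because a match must be exact, a single substitution outside the bracketed alternatives is enough to make `contains_pattern` return False, which is the sense in which the text speaks of positive identification by exact match.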

Claims

1. An isolated polypeptide of the yihA family as defined by:
i) an HSP score of greater than or equal to 100 when compared with one of the sequences of Figure 1 when the BLAST algorithm is used with a BLOSUM62 scoring matrix; or
ii) containing a set of amino acid sequences which are positively identified when position dependent scoring matrices according to Tables 1-4 are used with MAST to yield a p-value of less than 1 x 10^-30; or
iii) comprising at any one of the following amino acid sequences:
E-X(4)-G-[GR]-[STAG]-N-X-G-K-S-[STAG];
[VILM]-A-X(2)-S-X(2)-[PT]-G-X-T-[RKQN]-X(2)-N-X-[FY]
where, the letters denote an amino acid in one letter code, the square brackets denote a single amino acid, the amino acids within the square brackets are alternatives, X is any one amino acid residue, and the numbers in the curved brackets refer to the number of residues at that position.
2. A polypeptide or fragment according to claim 1 comprising both of the sequences listed in iii).
3. A polypeptide containing any of the sequences set out in Figures 2a-2d.
4. A polypeptide according to any of claims 1-3 wherein said polypeptide is from Helicobacter pylori, Haemophilus influenza, Mycoplasma genitalium, Mycoplasma pneumoniae, Streptococcus pneumoniae, Streptococcus pyogenes, Pseudomonas aeruginosa, Saccharomyces cerevisiae, Methanobacterium jannaschii, Neisseria gonorrhoea, Neisseria meningitidis, Staphylococcus epidermidis, Aquifex aeolicus, Bacillus subtilis and Escherichia coli.
5. A polypeptide according to any of claims 1-4 for use in a method of screening for agents with antibiotic activity.
6. An isolated polynucleotide encoding a polypeptide as defined in any of claims 1-4.
7. A vector comprising a transcriptional regulatory sequence and a nucleotide sequence encoding a polypeptide as defined in any of claims 1-4.
8. A host cell comprising a vector as claimed in claim 7 and a reporter gene whose activity is linked to the expression of the polypeptide according to any of claims 1-4.
9. A method of assaying compounds for activity against bacteria comprising:
i) providing a polypeptide according to the invention;
ii) contacting said polypeptide with an antagonist; and
iii) measuring for binding to said polypeptide.
10. A method of assaying compounds for activity against bacteria comprising:
i) expressing a polypeptide or fragment thereof according to any of claims 1-4 in a host cell;
ii) contacting said polypeptide with an antagonist; and
iii) measuring for inactivation of said polypeptide.
11. A method of assaying compounds for activity against bacteria comprising:
i) providing a polypeptide according to the invention;
ii) contacting said polypeptide with an antagonist; and
iii) measuring for cell death.
12. A method of assaying compounds for activity against bacteria comprising:
i) transfecting a host cell with a vector comprising a polynucleotide encoding a polypeptide as defined herein;
ii) allowing the host cell to express the polynucleotide;
iii) increasing the level of expression of the polypeptide as defined herein; measuring for binding to said polypeptide; and
iv) assaying for increased resistance.
13. A method of assaying compounds for activity against bacteria comprising:
i) transfecting a host cell with a vector comprising a polynucleotide encoding a polypeptide as defined herein;
ii) allowing the host cell to express the polynucleotide;
iii) decreasing the level of expression of the polypeptide as defined herein; measuring for binding to said polypeptide; and
iv) assaying for increased sensitivity to an inhibitor.
14. A method of assaying compounds for activity against bacteria comprising:
i) generating a bacterial strain containing a reporter gene linked to the gene encoding a polypeptide according to the invention;
ii) contacting said strain with an antagonist; and
iii) measuring for induction or inhibition of said marker.
15. An antagonist of a polypeptide as defined in any of claims 1-4 identifiable by a method according to any of claims 9-14 for use in therapy.
16. Use of an antagonist of a polypeptide as defined in any of claims 1-4 identifiable by a method according to any of claims 9-14 for the manufacture of a medicament for the treatment of a bacterial infection.
17. A method of treatment which comprises administering to a patient an effective amount of an antagonist of a polypeptide as defined in any of claims 1-4 identifiable by any of the methods according to claims 9-14.
PCT/EP1999/002640 1998-04-22 1999-04-20 Bacterial yiha polypeptide family WO1999054474A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU37090/99A AU3709099A (en) 1998-04-22 1999-04-20 Bacterial polypeptide family

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GBGB9808363.7A GB9808363D0 (en) 1998-04-22 1998-04-22 Bacterial polypeptide family
GB9808363.7 1998-04-22

Publications (2)

Publication Number Publication Date
WO1999054474A2 true WO1999054474A2 (en) 1999-10-28
WO1999054474A3 WO1999054474A3 (en) 2000-05-04

Family

ID=10830639

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1999/002640 WO1999054474A2 (en) 1998-04-22 1999-04-20 Bacterial yiha polypeptide family

Country Status (3)

Country Link
AU (1) AU3709099A (en)
GB (1) GB9808363D0 (en)
WO (1) WO1999054474A2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1996033276A1 (en) * 1995-04-21 1996-10-24 Human Genome Sciences, Inc. NUCLEOTIDE SEQUENCE OF THE HAEMOPHILUS INFLUENZAE Rd GENOME, FRAGMENTS THEREOF, AND USES THEREOF
EP0756006A2 (en) * 1995-06-07 1997-01-29 The Institute For Genomic Research Nucleotide sequence of the mycoplasma genitalium genome, fragments thereof, and uses thereof
WO1998018931A2 (en) * 1996-10-31 1998-05-07 Human Genome Sciences, Inc. Streptococcus pneumoniae polynucleotides and sequences

Non-Patent Citations (18)

* Cited by examiner, † Cited by third party
Title
ARIGONI F ET AL.: "A genome-based approach for the identification of essential bacterial genes." NATURE BIOTECHNOLOGY, vol. 16, September 1998 (1998-09), pages 851-856, XP002124132 *
BAIROCH A ET AL.: "The PROSITE database, its status in 1997" NUCLEIC ACIDS RESEARCH, vol. 25, no. 1, 1 January 1997 (1997-01-01), pages 217-221, XP002126094 *
BAIROCH A: "UPF (Uncharacterized Protein Families) list and index of members" July 1999 (1999-07), page 1-15 XP002124133 *
BALTZ ET AL: "DNA sequence sampling of the Streptococcus pneumoniae genome to identify novel targets for antibiotic development" MICROBIAL DRUG RESISTANCE,US,LIEBERT, vol. 4, no. 1, 21 March 1998 (1998-03-21), page 1-9 XP002112153 ISSN: 1076-6294 *
DATABASE EMBL [Online] ID NGU72876, AC U72876, 31 March 1997 (1997-03-31) ROPP P A ET AL.: "Neisseria gonorrhoeae PilO (pilO), PilN (pilN), PilM (pilM), penicillin-binding protein 1 (ponA) and putative GTPase genes complete cds and PilP (pilP) gene, partial cds" XP002124958 -& ROPP P A ET AL.: "Cloning and characterization of the ponA gene encoding penicillin-binding protein 1 from Neisseria gonorrhoeae and Neisseria meningitidis" JOURNAL OF BACTERIOLOGY, vol. 179, no. 8, April 1997 (1997-04), pages 2783-2787, XP002124949 *
DATABASE PIR [Online] Accession: E70456, 8 May 1998 (1998-05-08) DECKERT G ET AL.: "Conserved hypothetical protein aq_1815 - Aquifex aeolicus" XP002124959 -& DECKERT G ET AL.: "The complete genome of the hyperthermophilic bacterium Aquifex aeolicus" NATURE, vol. 392, no. 6674, 26 March 1998 (1998-03-26), pages 353-358, XP002124950 *
DATABASE PIR [Online] Accession: G64715, 9 August 1997 (1997-08-09) "Conserved hypothetical ATP-binding protein HP1567 Helicobacter pylori." XP002124134 cited in the application *
DATABASE SWISSPROT [Online] ID Q05473, AC Q05473, JOHNSTON M ET AL.: "Similar to E. coli hypothetical 22.1 kD protein in polA 3' region" XP002124956 cited in the application *
DATABASE SWISSPROT [Online] ID Y320_METJA, AC Q57768, 1 November 1997 (1997-11-01) BULT C J ET AL.: "Hypothetical GTP-binding protein MJ0320" XP002124957 cited in the application *
DATABASE SWISSPROT [Online] ID Y335_MYCGE, AC P47577, 1 February 1996 (1996-02-01) FRASER C M ET AL.: "Hypothetical GTP-binding protein MG335" XP002124954 cited in the application *
DATABASE SWISSPROT [Online] ID Y335_MYCPN, AC P75303, 1 November 1997 (1997-11-01) HIMMELREICH R ET AL.: "Hypothetical GTP-binding protein MG335 homolog" XP002124955 cited in the application *
DATABASE SWISSPROT [Online] ID YIHA_ECOLI, AC P24253, 1 March 1992 (1992-03-01) "Hypothetical GTP-binding protein in polA-hemN intergenic region" XP002124135 cited in the application *
DATABASE SWISSPROT [Online] ID YIHA_HAEIN, AC P46453, 1 November 1995 (1995-11-01) "Hypothetical GTP-binding protein HI1118" XP002124137 cited in the application *
DATABASE SWISSPROT [Online] ID YSXC_BACSU, AC P38424, 1 October 1994 (1994-10-01) "Hypothetical GTP-binding protein in lonA-hemA intergenic region (orfX)." XP002124136 cited in the application -& RIETHDORF S ET AL.: "Cloning, nucleotide sequence, and expression of the Bacillus subtilis lon gene." J. BACTERIOL., vol. 176, no. 21, November 1994 (1994-11), pages 6518-6527, XP000857246 *
DATABASE TIGR [Online] gnl TIGR_487 N.meningitidis_GNMCY55F, 28 January 1999 (1999-01-28) "Neisseria meningitidis MC58" XP002125850 *
HENIKOFF S: "Scores for sequence searches and alignments" CURRENT OPINION IN STRUCTURAL BIOLOGY, vol. 6, June 1996 (1996-06), pages 353-360, XP002126095 *
MUSHEGIAN A R ET AL.: "A minimal gene set for cellular life derived by comparison of complete bacterial genomes" PROC. NATL. ACAD. SCI. USA, vol. 93, September 1996 (1996-09), pages 10268-10273, XP002126189 -& DATABASE COMPLETE_GENOMES [Online] ncbi Minimal gene set, September 1996 (1996-09) XP002126431 *
TATUSOV R L ET AL.: "Metabolism and evolution of Haemophilus influenzae deduced from a whole-genome comparison with Escherichia coli" CURRENT BIOLOGY, vol. 6, no. 3, 1 March 1996 (1996-03-01), pages 279-291, XP000857871 -& DATABASE COMPLETE_GENOMES [Online] NCBI HIN0530 (HI1118), yihA, 1 March 1996 (1996-03-01) XP002126430 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000061792A1 (en) * 1999-04-10 2000-10-19 Bayer Aktiengesellschaft Novel essential bacterial genes and their proteins
EP2168591A1 (en) * 2008-09-24 2010-03-31 Helmholtz Zentrum München Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH) Antimicrobial peptides
WO2010034783A1 (en) * 2008-09-24 2010-04-01 Helmholtz Zentrum München - Deutsches Forschungszentrum für Gesundheit und Umwelt (GmbH) Antimicrobial peptides

Also Published As

Publication number Publication date
WO1999054474A3 (en) 2000-05-04
GB9808363D0 (en) 1998-06-17
AU3709099A (en) 1999-11-08

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

NENP Non-entry into the national phase

Ref country code: KR

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: CA