WO2004018638A2 - Iron transport and metabolism proteins - Google Patents

Publication number
WO2004018638A2
Authority
WO
WIPO (PCT)
Prior art keywords
iron
polynucleotide
genes
expression
sequence
Prior art date
Application number
PCT/US2003/026488
Other languages
French (fr)
Other versions
WO2004018638A8 (en)
WO2004018638A3 (en)
WO2004018638A9 (en)
Inventor
Vivek Kapur
Mugdha Gadgil
Original Assignee
Regents Of The University Of Minnesota
Priority date
Filing date
Publication date
Application filed by Regents Of The University Of Minnesota
Priority to AU2003260043A
Publication of WO2004018638A2
Publication of WO2004018638A9
Publication of WO2004018638A3
Publication of WO2004018638A8

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705: Receptors; Cell surface antigens; Cell surface determinants
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals

Definitions

  • Iron is an essential element for the survival of nearly all organisms, including pathogenic bacteria. Although adequate iron is present in the body fluids of humans and animals, the amount of iron readily available to bacteria is extremely low. This is due, in part, to the fact that a majority of the iron in an animal is intracellular, in the form of ferritin, haemosiderin or haem. In addition, iron present in body fluids is complexed with high affinity iron binding proteins like transferrin and lactoferrin. Hence, the amount of free iron in equilibrium with iron binding proteins is at an approximate concentration of 10^-18 M. Even outside the host, free Fe+++ in an aerobic, aqueous environment is limited to an equilibrium value of approximately 10 " M, a value far below that required for optimal bacterial growth.
  • pathogenic bacteria have evolved specialized transport and metabolic systems to acquire a sufficient iron supply.
  • high affinity iron transport systems have been developed that include specific ferric iron chelators, "siderophores," and iron-regulated outer membrane proteins (IROMPs) and/or siderophore receptor proteins (SRPs) that are receptors for siderophores on the outer membrane of the bacterial cell.
  • IROMPs iron-regulated outer membrane proteins
  • SRPs siderophore receptor proteins
  • Fur negatively regulates the genes involved in iron uptake and the biosynthesis of siderophores in response to the iron level in the cell.
  • Fur mutants constitutively express the siderophore biosynthesis enzymes and iron transport proteins.
  • the present invention provides proteins involved in iron transport and/or metabolism and polynucleotides encoding those proteins.
  • the invention provides an isolated and purified polypeptide comprising at least one of SEQ ID NOs 65-128 or 130.
  • the present invention also provides an isolated and purified polynucleotide comprising a nucleic acid sequence encoding at least one of SEQ ID NOs 65-128 or 130.
  • the present invention further provides an isolated and purified polynucleotide comprising at least one of SEQ ID NOs 1-64 or 129.
  • the present invention also provides an expression cassette, comprising a nucleic acid sequence encoding a promoter operably linked to at least one of the polynucleotides of the invention.
  • a cell, e.g., a host cell, comprising an expression cassette, polynucleotide, and/or polypeptide of the invention.
  • the present invention also provides a method of identifying a gene, including: a) contacting a probe including nucleic acid obtained from a cell grown in an iron-limited environment with a solid substrate including one or more nucleotide sequences to provide a profile of gene expression; b) contacting a probe including nucleic acid obtained from a cell grown in a non-iron-limited environment with a solid substrate including one or more nucleotide sequences to provide a profile of gene expression; and c) comparing the profile of (a) to the profile of (b) so as to identify a gene having altered expression.
  • Figure 1 Growth curve of the wild-type (•) and the fur ( ⁇ ). Wild-type and fur were cultured in defined, iron-limited media, and at time 0, FeSO4·7H2O was added to the culture to a final concentration of 10 µM.
  • Figure 3 Expression of some of the genes involved in iron transport and having an operonic organization. Wild-type ( ⁇ ), fur (--). The Y-axis shows fold-change in expression on addition of iron to an iron-limited culture. X-axis shows the time points at which samples were taken.
  • Figure 4 Expression profiles of b1973, b0597, b0805 and b1452. Wild-type (•), fur mutant (m). The Y-axis shows fold-change in expression on addition of iron to an iron-limited culture. X-axis shows the time points at which samples were taken.
  • Figure 5 Growth curves, sequence analysis and location of b1973. 5a is a comparison of growth curves of wild-type and b1973 mutant. 5b depicts the putative fur-box in the b1973 sequence.
  • Figure 5c b1968: putative 2 component sensor protein, b1969: putative 2 component transcriptional regulator, b1970: putative periplasmic or exported protein, b1971: putative reductase, b1972: putative inner membrane protein, b1973: putative metal ABC transporter substrate binding protein.
  • Figure 6a Comparison of array data and real time PCR data. Wild-type or fur (KO) (-), Real time PCR data for the same experimental condition (--). Y axis is the fold change (log base 2 of the ratio), maximum 6 and minimum -8; X axis has three points at which samples were taken: 20 min, 60 min and 90 min.
  • Figure 6b Scatter plot of ratio from cDNA microarray vs. ratio from Real time PCR.
  • the transcriptional response of Escherichia coli MG1655 and defined mutants to growth in an iron-limited environment was characterized by cDNA-based microarray analysis of the whole genome. Samples taken at six different time points after addition of iron to an iron-starved culture were analyzed and showed that the expression of a large proportion (~20%) of genes is altered during iron restriction. Applicants have discovered that the pathways that were most dramatically altered were nucleotide biosynthesis and metabolism (~34%), amino acid biosynthesis and metabolism (~28%), and transport and binding proteins (~19%). These proteins can serve as targets in the development of vaccines and antimicrobial agents.
  • the present invention provides an isolated and purified polypeptide comprising at least one of SEQ ID NOs 65-128 or 130.
  • the polypeptide comprises SEQ ID NO:86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 130.
  • the present invention also provides an isolated and purified polynucleotide comprising a nucleic acid sequence encoding at least one of SEQ ID NOs 65-128 or 130.
  • the polynucleotide encodes SEQ ID NO:86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 130.
  • the present invention also provides an isolated and purified polynucleotide comprising at least one of SEQ ID NOs 1-64 or 129.
  • the polynucleotide comprises SEQ ID NO: 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 129.
  • the present invention also provides an expression cassette, comprising a nucleic acid sequence encoding a promoter operably linked to at least one of the polynucleotides of the invention.
  • the nucleic acid sequence comprises SEQ ID NO: 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 129.
  • the nucleic acid sequence encodes SEQ ID NO:86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 130.
  • a cell, e.g., a host cell, comprising an expression cassette, polynucleotide, and/or polypeptide of the invention.
  • the present invention also provides polynucleotides and polypeptides having substantial similarity to at least one of the polynucleotides or polypeptides of the invention.
  • the present invention also provides fragments of the polynucleotides and polypeptides of the invention.
  • the present invention also provides a method of identifying a gene, including: a) contacting a probe including nucleic acid obtained from a cell grown in an iron-limited environment with a solid substrate including one or more nucleotide sequences to provide a profile of gene expression; b) contacting a probe including nucleic acid obtained from a cell grown in a non-iron-limited environment with a solid substrate including one or more nucleotide sequences to provide a profile of gene expression; and c) comparing the profile of (a) to the profile of (b) so as to identify a gene having altered expression.
  • the cells are bacterial cells.
  • the cells are prokaryotic cells. In some embodiments, the cells are eukaryotic cells. In some embodiments, the gene having the altered expression encodes an outer membrane protein. In some embodiments, the gene is identified by comparing the expression of genes in a wild-type host cell to the expression of genes in a mutant host cell. In some embodiments, the mutant cell comprises a mutated fur gene. The present invention also provides genes identified by such methods, and proteins encoded by those genes.
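The comparison step (c) of the method can be illustrated with a minimal sketch. It assumes each expression profile has already been reduced to one positive signal intensity per gene; the gene names, intensities, and the log2 cutoff of 1.0 are hypothetical and are not taken from the specification.

```python
import math


def log2_ratios(limited, replete):
    """Return log2(limited / replete) for every gene present in both profiles."""
    shared = limited.keys() & replete.keys()
    return {gene: math.log2(limited[gene] / replete[gene]) for gene in shared}


def altered_genes(limited, replete, min_abs_log2=1.0):
    """List genes whose expression differs by at least the given log2 fold change."""
    ratios = log2_ratios(limited, replete)
    return sorted(gene for gene, ratio in ratios.items() if abs(ratio) >= min_abs_log2)


# Hypothetical signal intensities (arbitrary units), not data from the source.
iron_limited = {"geneA": 800.0, "geneB": 100.0, "geneC": 50.0}
iron_replete = {"geneA": 100.0, "geneB": 110.0, "geneC": 200.0}

print(altered_genes(iron_limited, iron_replete))  # ['geneA', 'geneC']
```

The same comparison could be run between a wild-type profile and a mutant profile; only the two input dictionaries change.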
  • an iron transport and metabolism system is meant to refer to a bacterial system for the uptake, acquisition, transport, metabolism and/or regulation of iron under low iron conditions.
  • low iron conditions is meant an iron-limited environment, i.e., an environment wherein the availability of iron, e.g., free Fe+++, is at a lower concentration than that required for optimal bacterial growth. This concentration may vary depending on the particular requirements, e.g., nutritional needs, of a bacterial cell.
  • iron transport and metabolism systems includes related systems, i.e., systems that facilitate iron transport and metabolism systems.
  • nucleic acid encoding an iron transport and metabolism protein includes a gene encoding a protein involved in iron uptake as well as the gene controlling expression of the uptake component.
  • chimeric refers to any gene or DNA that contains 1) DNA sequences, including regulatory and coding sequences, that are not found together in nature, or 2) sequences encoding parts of proteins not naturally adjoined, or 3) parts of promoters that are not naturally adjoined. Accordingly, a chimeric gene may include regulatory sequences and coding sequences that are derived from different sources, or include regulatory sequences and coding sequences derived from the same source, but arranged in a manner different from that found in nature.
  • “Expression” refers to the transcription and translation of an endogenous gene or a transgene in a host cell.
  • expression may refer to the transcription of the antisense DNA only.
  • expression refers to the transcription and stable accumulation of sense (mRNA) or functional RNA. Expression may also refer to the production of protein.
  • genes include coding sequences and/or the regulatory sequences required for their expression.
  • gene refers to a nucleic acid fragment that expresses mRNA, or specific protein, including regulatory sequences. Genes also include nonexpressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters.
  • a “transgene” refers to a gene that has been introduced into the genome by transformation and is stably maintained.
  • Transgenes may include, for example, DNA that is either heterologous or homologous to the DNA of a particular cell to be transformed. Additionally, transgenes may include native genes inserted into a non-native organism, or chimeric genes.
  • endogenous gene refers to a native gene in its natural location in the genome of an organism.
  • a “foreign” gene refers to a gene not normally found in the host organism but that is introduced by gene transfer.
  • a “mutation” refers to an insertion, deletion or substitution of one or more nucleotide bases of a nucleic acid sequence, so that the nucleic acid sequence differs from the wild-type sequence.
  • a 'point' mutation refers to an alteration in the sequence of a nucleotide at a single base position from the wild type sequence.
  • nucleic acid refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form, composed of monomers (nucleotides) containing a sugar, phosphate and a base which is either a purine or pyrimidine. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides.
  • nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences as well as the sequence explicitly indicated.
  • degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., 1991; Ohtsuka et al., 1985; Rossolini et al., 1994).
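The third-position substitution described above can be sketched as follows. This is an illustration only: the function name is hypothetical, "N" is assumed here to stand for a mixed-base position and "I" for deoxyinosine, and the input is assumed to be an in-frame coding sequence.

```python
def degenerate_third_positions(coding_seq, codon_indices=None, symbol="N"):
    """Replace the third base of selected codons (or all codons) with a degenerate symbol."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
    # With no explicit selection, every codon is substituted.
    selected = set(range(len(codons))) if codon_indices is None else set(codon_indices)
    return "".join(
        codon[:2] + symbol if index in selected else codon
        for index, codon in enumerate(codons)
    )


print(degenerate_third_positions("ATGGCTAAA"))                   # ATNGCNAAN
print(degenerate_third_positions("ATGGCTAAA", codon_indices=[1]))  # ATGGCNAAA
print(degenerate_third_positions("ATGGCTAAA", symbol="I"))        # ATIGCIAAI
```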
  • a "nucleic acid fragment" is a fraction of a given nucleic acid molecule.
  • nucleotide sequence refers to a polymer of DNA or RNA that can be single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases capable of incorporation into DNA or RNA polymers.
  • nucleic acid may also be used interchangeably with gene, cDNA, DNA and RNA encoded by a gene (Batzer et al., 1991; Ohtsuka et al., 1985; Rossolini et al., 1994).
  • operably linked when used with respect to nucleic acid, means joined as part of the same nucleic acid molecule, suitably positioned and oriented for transcription to be initiated from the promoter.
  • DNA operably linked to a promoter is under transcriptional initiation regulation of the promoter.
  • Coding sequences can be operably-linked to regulatory sequences in sense or antisense orientation.
  • Promoter refers to a nucleotide sequence, usually upstream (5') to its coding sequence, which controls the expression of the coding sequence by providing the recognition for RNA polymerase and other factors required for proper transcription.
  • Promoter includes a minimal promoter that is a short DNA sequence including a TATA-box and other sequences that serve to specify the site of transcription initiation, to which regulatory elements are added for control of expression. "Promoter" also refers to a nucleotide sequence that includes a minimal promoter plus regulatory elements that is capable of controlling the expression of a coding sequence or functional RNA. This type of promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence that can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue specificity of a promoter.
  • Both enhancers and other upstream promoter elements bind sequence-specific DNA-binding proteins that mediate their effects. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even include synthetic DNA segments. A promoter may also contain DNA sequences that are involved in the binding of protein factors that control the effectiveness of transcription initiation in response to physiological or developmental conditions.
  • the "initiation site” is the position surrounding the first nucleotide that is part of the transcribed sequence, which is also defined as position +1. With respect to this site all other sequences of the gene and its controlling regions are numbered. Downstream sequences (i.e. further protein encoding sequences in the 3' direction) are denominated positive, while upstream sequences (mostly of the controlling regions in the 5' direction) are denominated negative.
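The numbering convention relative to the initiation site can be sketched as a small helper. The function name is hypothetical, and the sketch assumes the common convention that there is no position 0, so the base immediately upstream of +1 is -1; the text above does not state this explicitly.

```python
def relative_position(index, initiation_index):
    """Map a 0-based sequence index to a position relative to the initiation site (+1)."""
    offset = index - initiation_index
    # Downstream positions start at +1; upstream positions start at -1 (no zero).
    return offset + 1 if offset >= 0 else offset


print(relative_position(10, 10))  # 1
print(relative_position(11, 10))  # 2
print(relative_position(9, 10))   # -1
```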
  • Promoter elements, particularly a TATA element, that are inactive or that have greatly reduced promoter activity in the absence of upstream activation are referred to as "minimal or core promoters."
  • In the presence of a suitable transcription factor, the minimal promoter functions to permit transcription.
  • a minimal or core promoter thus consists only of all basal elements needed for transcription initiation, e.g., a TATA box and/or an initiator.
  • Constitutive expression refers to expression using a constitutive or regulated promoter.
  • Conditional and regulated expression refer to expression controlled by a regulated promoter.
  • An "inducible promoter” is a regulated promoter that can be turned on in a cell by an external stimulus, such as a chemical, light, hormone, stress, or a pathogen.
  • The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: (a) “reference sequence”, (b) “comparison window”, (c) “sequence identity”, (d) “percentage of sequence identity”, and (e) “substantial identity.”
  • reference sequence is a defined sequence used as a basis for sequence comparison.
  • a reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
  • comparison window makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may include additions or deletions (i.e., gaps) compared to the reference sequence (which does not include additions or deletions) for optimal alignment of the two sequences.
  • the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer.
  • Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, California); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wisconsin, USA). Alignments using these programs can be performed using the default parameters.
  • the CLUSTAL program is well described by Higgins et al. (1988), Higgins et al. (1989), Corpet et al. (1988), Huang et al. (1992), and Pearson et al. (1994).
  • the ALIGN program is based on the algorithm of Myers and Miller, supra.
  • the BLAST programs of Altschul et al. (1990; 1997) are based on the algorithm of Karlin and Altschul, supra.
  • Software for performing BLAST analyses is publicly available through the
  • HSPs high scoring sequence pairs
  • BLAST algorithm performs a statistical analysis of the similarity between two sequences.
  • One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance.
  • a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence can be less than about 0.1, less than about 0.01, or less than about 0.001.
  • Gapped BLAST (in BLAST 2.0) can be utilized as described in Altschul et al. (1997).
  • PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al., supra.
  • the default parameters of the respective programs (e.g., BLASTN for nucleotide sequences, BLASTX for proteins)
  • the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See the world wide web at ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection.
  • comparison of nucleotide sequences for determination of percent sequence identity to the promoter sequences disclosed herein can be made using the BlastN program (version 1.4.7 or later) with its default parameters or any equivalent program.
  • By “equivalent program” is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by the alternative program.
  • sequence identity or “identity” in the context of two nucleic acid or polypeptide sequences makes reference to a specified percentage of residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window, as measured by sequence comparison algorithms or by visual inspection.
  • When percentage of sequence identity is used in reference to proteins, it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule.
  • When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution.
  • Sequences that differ by such conservative substitutions are said to have "sequence similarity" or "similarity." Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, California).
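The partial-mismatch scoring described above can be sketched as follows. This is not the PC/GENE implementation: the residue groupings and the partial score of 0.5 are illustrative assumptions, and the two peptides are assumed to be already aligned, ungapped, and of equal length.

```python
# Illustrative groupings of chemically similar residues (an assumption, not from the source).
CONSERVATIVE_GROUPS = [set("ILVM"), set("FWY"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def similarity_adjusted_identity(seq_a, seq_b, conservative_score=0.5):
    """Score identity as 1, conservative substitution as a partial score, other mismatches as 0."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    total = 0.0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == res_b:
            total += 1.0
        elif any(res_a in group and res_b in group for group in CONSERVATIVE_GROUPS):
            total += conservative_score
    return 100.0 * total / len(seq_a)


# Hypothetical peptides: K/R and D/E are conservative substitutions under the groups above.
print(similarity_adjusted_identity("MKVLD", "MRVLE"))                          # 80.0
print(similarity_adjusted_identity("MKVLD", "MRVLE", conservative_score=0.0))  # 60.0
```

Setting the partial score to zero recovers the unadjusted percent identity, which makes the size of the upward adjustment easy to see.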
  • percentage of sequence identity means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may include additions or deletions (i.e., gaps) as compared to the reference sequence (which does not include additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
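The calculation described above can be sketched in a few lines. The sketch assumes the two sequences have already been optimally aligned and padded with "-" for gaps, so that the comparison window is the full aligned length; the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Matched positions divided by total positions in the window, times 100."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    # A position counts as matched only when both sequences carry the same residue.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    return 100.0 * matches / len(aligned_a)


# 8 matched positions in a 10-position window.
print(percent_identity("ACGT-ACGTA", "ACGTTACCTA"))  # 80.0
```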
  • substantial identity of polynucleotide sequences means that a polynucleotide includes a sequence that has at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, at least 90%, 91%, 92%, 93%, or 94%, or at least 95%, 96%, 97%, 98%, or 99% sequence identity, compared to a reference sequence using one of the alignment programs described using standard parameters.
  • nucleotide sequences are substantially identical if two molecules hybridize to each other under stringent conditions (see below).
  • stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH.
  • Tm thermal melting point
  • stringent conditions encompass temperatures in the range of about 1°C to about 20°C, depending upon the desired degree of stringency as otherwise qualified herein.
  • Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides they encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.
  • One indication that two nucleic acid sequences are substantially identical is when the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.
  • substantially identical in the context of a peptide indicates that a peptide includes a sequence with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, at least 90%, 91%, 92%, 93%, or 94%, or 95%, 96%, 97%, 98% or 99%, sequence identity to the reference sequence over a specified comparison window.
  • Optimal alignment may be conducted using the homology alignment algorithm of Needleman and Wunsch (1970).
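A minimal sketch of Needleman-Wunsch global alignment is given below for orientation. The scoring values (match +1, mismatch -1, linear gap penalty -1) are illustrative assumptions rather than parameters from the specification, and the traceback returns one optimal alignment when several exist.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment with a linear gap penalty."""
    n, m = len(seq_a), len(seq_b)
    # score[i][j] holds the best score for aligning seq_a[:i] with seq_b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two residues
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                out_a.append(seq_a[i - 1])
                out_b.append(seq_b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # (2, 'ACGT', 'A-GT')
```

The aligned strings returned here can be passed directly to a percent-identity calculation over the aligned window.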
  • a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution.
  • For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared.
  • test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated.
  • sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
  • hybridizing specifically to refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.
  • Bind(s) substantially refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleic acid sequence.
  • Tm The thermal melting point
  • Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution.
  • Tm can be approximated from the equation of Meinkoth and Wahl (1984): Tm = 81.5°C + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L, where M is the molarity of monovalent cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs.
  • Tm is reduced by about 1°C for each 1% of mismatching; thus, Tm, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with >90% identity are sought, the Tm can be decreased 10°C. Generally, stringent conditions are selected to be about 5°C lower than the Tm for the specific sequence and its complement at a defined ionic strength and pH.
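The melting-point arithmetic above can be sketched as follows. The formamide and length terms follow the commonly cited form of the Meinkoth and Wahl approximation and should be treated as an assumption here; the function names and the example input values are hypothetical.

```python
import math


def melting_temperature(monovalent_molar, percent_gc, percent_formamide, hybrid_length_bp):
    """Approximate Tm in degrees C: 81.5 + 16.6 log10(M) + 0.41 (%GC) - 0.61 (% form) - 500/L."""
    return (
        81.5
        + 16.6 * math.log10(monovalent_molar)
        + 0.41 * percent_gc
        - 0.61 * percent_formamide
        - 500.0 / hybrid_length_bp
    )


def mismatch_adjusted_tm(tm, percent_mismatch):
    """Lower Tm by about 1 degree C for each 1% of mismatching."""
    return tm - 1.0 * percent_mismatch


def stringent_temperature(tm, offset=5.0):
    """Stringent conditions are taken as about 5 degrees C below Tm."""
    return tm - offset


# Hypothetical inputs: 1 M monovalent cations, 50% GC, no formamide, 500 bp hybrid.
tm = melting_temperature(1.0, 50.0, 0.0, 500)
print(round(tm, 1))                                  # 101.0
print(round(mismatch_adjusted_tm(tm, 10.0), 1))      # 91.0 when >90% identity is sought
print(round(stringent_temperature(tm), 1))           # 96.0
```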
  • An example of highly stringent wash conditions is 0.15 M NaCl at 72 °C for about 15 minutes.
  • An example of stringent wash conditions is a 0.2X SSC wash at 65 °C for 15 minutes (see, Sambrook, infra, for a description of SSC buffer).
  • a high stringency wash is preceded by a low stringency wash to remove background probe signal.
  • An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides is 1X SSC at 45°C for 15 minutes.
  • An example low stringency wash for a duplex of, e.g., more than 100 nucleotides is 4-6X SSC at 40°C for 15 minutes.
  • stringent conditions typically involve salt concentrations of less than about 1.5 M, or about 0.01 to 1.0 M, Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30°C for short probes (e.g., 10 to 50 nucleotides) and at least about 60°C for long probes (e.g., >50 nucleotides).
  • Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
  • a signal to noise ratio of 2X (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the proteins that they encode are substantially identical.
  • Very stringent conditions are selected to be equal to the Tm for a particular probe.
  • An example of stringent conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide, e.g., hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60 to 65°C.
  • Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37°C, and a wash in 0.5X to 1X SSC at 55 to 60°C.
  • the invention described herein includes polynucleotides and polypeptides that are substantially identical to any one of SEQ ID NOs 1-130.
  • a “transgenic”, “transformed”, or “recombinant” cell refers to a genetically modified or genetically altered cell, the genome of which includes a recombinant DNA molecule or sequence ("transgene”).
  • a “transgenic cell” can be a cell transformed with a "vector.”
  • a “transgenic”, “transformed”, or “recombinant” cell thus refers to a host cell such as a bacterial or yeast cell into which a heterologous nucleic acid molecule has been introduced.
  • the nucleic acid molecule can be stably integrated into the genome by methods generally known in the art (e.g., disclosed in Sambrook et al., 2001).
  • “transformed,” “transformant,” and “transgenic” cells have been through the transformation process and contain a foreign or exogenous gene.
  • the term “untransformed” refers to cells that have not been through the transformation process.
  • transformation refers to the transfer of a nucleic acid fragment into the genome of a host cell, or the transfer into a host cell of a nucleic acid fragment that is maintained extrachromosomally.
  • a “transgene” refers to a gene that has been introduced into the genome by transformation.
  • Transgenes may include, for example, genes that are heterologous or endogenous to the genes of a particular cell to be transformed. Additionally, transgenes may include native genes inserted into a non-native organism, or chimeric genes.
  • endogenous gene refers to a native gene in its natural location in the genome of an organism. Such genes can be hyperactivated in some cases by the introduction of an exogenous strong promoter into operable association with the gene of interest.
  • a “foreign” or an “exogenous” gene refers to a gene not normally found in the host cell but that is introduced by gene transfer.
  • Vector is defined to include, inter alia, any plasmid, cosmid, phage or other construct in double or single stranded linear or circular form that may or may not be self transmissible or mobilizable, and that can transform prokaryotic or eukaryotic host either by integration into the cellular genome or exist extrachromosomally, e.g., autonomous replicating plasmid with an origin of replication.
  • a vector can include a construct such as an expression cassette having a DNA sequence capable of directing expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operably linked to the nucleotide sequence of interest that also is operably linked to termination signals.
  • An expression cassette also typically includes sequences required for proper translation of the nucleotide sequence.
  • the expression cassette comprising the nucleotide sequence of interest may be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components.
  • the expression cassette may also be one that is naturally occurring but has been obtained in a recombinant form useful for heterologous expression.
  • the expression of the nucleotide sequence in the expression cassette may be under the control of a constitutive promoter or of an inducible promoter that initiates transcription only when the host cell is exposed to some particular external stimulus.
  • wild type refers to an untransformed cell, i.e., one where the genome has not been altered by the presence of the recombinant DNA molecule or sequence or by other means of mutagenesis.
  • a "corresponding" untransformed cell is a typical control cell, i.e., one that has been subjected to transformation conditions, but has not been exposed to exogenous DNA.
  • Sources of nucleotide sequences from which the present nucleic acid molecules encoding iron transport and metabolism system proteins, or the nucleic acid complements thereof, may be prepared include, for example, total or polyA+ RNA from any prokaryotic, e.g., pathogenic bacterial, cellular source from which cDNAs can be derived by methods known in the art.
  • Sources include those Gram-negative bacteria that are frequent pathogens of animals, such as Escherichia coli, Salmonella spp. and Pasteurella spp.
  • Other sources of the DNA molecules of the invention include genomic libraries derived from any prokaryotic cellular source.
  • Genes encoding proteins involved in iron transport or metabolism can be isolated, for example, using gene chip technology. Such methods are disclosed herein.
  • Nucleic acid molecules encoding the amino acid sequence of an iron transport and metabolism system protein are prepared by a variety of methods known in the art.
  • B. Polypeptides of the Invention The isolated and purified iron transport and metabolism system proteins, or portions thereof, or derivatives thereof, can be synthesized in vitro, e.g., by the solid phase peptide synthetic method or by recombinant DNA approaches (see above).
  • the solid phase peptide synthetic method is an established and widely used method, which is described in the following references: Stewart et al. (1969); Merrifield (1963); Meienhofer (1973); Barany and Merrifield (1980); and Clark-Lewis et al. (1997).
  • peptides can be further purified by fractionation on immunoaffinity or ion-exchange columns; ethanol precipitation; reverse phase HPLC; chromatography on silica or on an anion-exchange resin such as DEAE; chromatofocusing; SDS-PAGE; ammonium sulfate precipitation; gel filtration using, for example, Sephadex G-75; or ligand affinity chromatography.
  • derivatives e.g., chemically derived derivatives
  • amides of the iron transport systems protein of the present invention may also be prepared by techniques well known in the art for converting a carboxylic acid group or precursor, to an amide.
  • One method for amide formation at the C-terminal carboxyl group is to cleave the polypeptide from a solid support with an appropriate amine, or to cleave in the presence of an alcohol, yielding an ester, followed by aminolysis with the desired amine.
  • Salts of carboxyl groups of a polypeptide of the invention may be prepared in the usual manner by contacting the polypeptide with one or more equivalents of a desired base such as, for example, a metallic hydroxide base, e.g., sodium hydroxide; a metal carbonate or bicarbonate base such as, for example, sodium carbonate or sodium bicarbonate; or an amine base such as, for example, triethylamine, triethanolamine, and the like.
  • N-acyl derivatives of an amino group of the iron transport systems protein may be prepared by utilizing an N-acyl protected amino acid for the final condensation, or by acylating a protected or unprotected polypeptide.
  • O-acyl derivatives may be prepared, for example, by acylation of a free hydroxy polypeptide or polypeptide resin. Either acylation may be carried out using standard acylating reagents such as acyl halides, anhydrides, acyl imidazoles, and the like. Formyl-methionine, pyroglutamine and trimethyl-alanine may be substituted at the N-terminal residue of the polypeptide.
  • Other amino-terminal modifications include aminooxypentane modifications (see Simmons et al. (1997)).
  • the iron transport and metabolism system proteins of the invention include proteins having substitutions of at least one amino acid residue in the polypeptide.
  • Amino acid substitutions falling within the scope of the invention include those that do not differ significantly in their effect on maintaining (a) the structure of the peptide backbone in the area of the substitution, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain.
  • Naturally occurring residues are divided into groups based on common side-chain properties:
  • hydrophobic norleucine, met, ala, val, leu, ile
  • Substitution of like amino acids can also be made on the basis of hydrophilicity.
  • the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); proline (-0.5 ± 1); threonine (-0.4); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4).
  • the substitution of amino acids whose hydrophilicity values can be within ±2, within ±1, or within ±0.5.
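  • The hydrophilicity criterion above can be sketched as a simple lookup. The table reproduces the values listed in the text; the one-letter codes, the function name, and the decision to ignore the stated ±1 uncertainty on aspartate, glutamate and proline are assumptions made for illustration.

```python
# Hydrophilicity values as listed in the text (one-letter amino acid codes).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": -0.5, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}

def is_similar(residue_a: str, residue_b: str, tolerance: float = 2.0) -> bool:
    """Return True if the two residues differ by no more than `tolerance`
    in hydrophilicity (use 2.0, 1.0 or 0.5 for the three windows)."""
    difference = abs(HYDROPHILICITY[residue_a] - HYDROPHILICITY[residue_b])
    # Small epsilon guards against floating-point rounding at the boundary.
    return difference <= tolerance + 1e-9
```

  • For example, `is_similar("L", "I", 0.5)` holds because leucine and isoleucine share the same value, whereas lysine and tryptophan differ by 6.4 and fail even the widest window.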
  • the iron transport and metabolism system protein has a conservative amino acid substitution, for example, aspartic/glutamic as acidic amino acids; lysine/arginine/histidine as basic amino acids; leucine/isoleucine, methionine/valine, alanine/valine as hydrophobic amino acids; serine/glycine/alanine/threonine as hydrophilic amino acids.
  • Conservative amino acid substitutions also include groupings based on side chains.
  • a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine.
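  • The side-chain groupings above lend themselves to a small membership check. This is a minimal sketch; the group labels, one-letter codes, and function name are chosen here for illustration only.

```python
from typing import Optional

# Side-chain groups as enumerated in the text (one-letter codes).
SIDE_CHAIN_GROUPS = {
    "aliphatic": {"G", "A", "V", "L", "I"},
    "aliphatic-hydroxyl": {"S", "T"},
    "amide-containing": {"N", "Q"},
    "aromatic": {"F", "Y", "W"},
    "basic": {"K", "R", "H"},
    "sulfur-containing": {"C", "M"},
}

def shared_side_chain_group(residue_a: str, residue_b: str) -> Optional[str]:
    """Return the name of the group containing both residues, or None
    if the substitution is not conservative under this grouping."""
    for name, members in SIDE_CHAIN_GROUPS.items():
        if residue_a in members and residue_b in members:
            return name
    return None
```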
  • Acid addition salts of the polypeptide or of amino residues of the polypeptide may be prepared by contacting the polypeptide or amine with one or more equivalents of the desired inorganic or organic acid, such as, for example, hydrochloric acid.
  • Esters of carboxyl groups of the polypeptides may also be prepared by any of the usual methods known in the art.
  • the present invention contemplates an isolated iron transport systems protein.
  • the iron transport systems protein of the invention is a recombinant polypeptide.
  • Amino acid residues can be added to or deleted from a full-length iron transport systems protein through the use of standard molecular biological techniques without altering the functionality of the receptor. For example, portions of the iron transport systems protein can be removed to create truncated iron transport systems proteins. The truncated protein retains the properties of the full-length iron transport systems protein.
  • the recombinant DNA sequence or segment may be circular or linear, double-stranded or single-stranded.
  • a recombinant DNA sequence which encodes an RNA sequence that is substantially complementary to an mRNA sequence encoding an iron transport or metabolism system protein is typically a "sense" DNA sequence cloned into a cassette in the opposite orientation (i.e., 3' to 5' rather than 5' to 3').
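  • Why cloning a "sense" sequence in the opposite orientation yields a complementary transcript can be seen in a short sketch. The 10-nucleotide sequence and the function names are hypothetical; transcription is modeled simply as reading the coding strand with U in place of T.

```python
# Translation table for complementing DNA bases (case preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def transcribe(coding_strand: str) -> str:
    """Return the RNA that has the same sequence as the coding strand."""
    return coding_strand.upper().replace("T", "U")

sense_dna = "ATGGCTAAGT"                  # hypothetical sense (coding) sequence
mrna = transcribe(sense_dna)              # normal orientation
# In the inverted orientation the promoter reads the opposite strand, so the
# transcript matches the reverse complement of the sense sequence.
antisense_rna = transcribe(reverse_complement(sense_dna))
```

  • Read 5' to 3', `antisense_rna` base-pairs with `mrna` along its full length, which is the property the inverted cassette is designed to provide.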
  • the recombinant DNA sequence or segment is in the form of chimeric DNA, such as plasmid DNA, that can also contain coding regions flanked by control sequences which promote the expression of the recombinant DNA present in the resultant cell.
  • a portion of the recombinant DNA may be untranscribed, serving a regulatory or a structural function.
  • the recombinant DNA may itself comprise a promoter that is active in poultry cells, or may utilize a promoter already present in the genome that is the transformation target. Such promoters are well known to the art.
  • Other elements functional in the cells such as introns, enhancers, polyadenylation sequences and the like, may also be a part of the recombinant DNA. Such elements may or may not be necessary for the function of the DNA, but may provide improved expression of the DNA by affecting transcription, stability of the mRNA, or the like. Such elements may be included in the DNA as desired to obtain the optimal performance of the transforming DNA in the cell.
  • a coding sequence of an expression cassette may also be operatively linked to a transcription terminating region.
  • RNA polymerase transcribes an encoding DNA sequence through a site where polyadenylation occurs. Typically, DNA sequences located a few hundred base pairs downstream of the polyadenylation site serve to terminate transcription. Those DNA sequences are referred to herein as transcription-termination regions. Those regions are required for efficient polyadenylation of transcribed RNA. Transcription-terminating regions are well-known in the art.
  • the recombinant DNA to be introduced into the cells may contain either a selectable marker gene or a reporter gene or both to facilitate identification and selection of transformed cells from the population of cells sought to be transformed.
  • the selectable marker may be carried on a separate piece of DNA and used in a co-transformation procedure.
  • Both selectable markers and reporter genes may be flanked with appropriate regulatory sequences to enable expression in the host cells. Useful selectable markers are well known in the art.
  • Reporter genes are used for identifying potentially transformed cells and for evaluating the functionality of regulatory sequences. Reporter genes which encode for easily assayable proteins are well known in the art. In general, a reporter gene is a gene which is not present in or expressed by the recipient organism or tissue and which encodes a protein whose expression is manifested by some easily detectable property, e.g., enzymatic activity. Examples of reporter genes include the luciferase gene from firefly Photinus pyralis. Expression of the reporter gene is assayed at a suitable time after the DNA has been introduced into the recipient cells.
  • the present invention thus provides an expression cassette or vector comprising a polynucleotide of the invention, i.e., one that encodes an iron transport or metabolism system protein, or a portion thereof with substantially the same activity as the full-length iron transport systems protein.
  • expression cassettes and vectors comprise a promoter, or optionally an enhancer-promoter, operably linked to the polynucleotide.
  • An enhancer-promoter used in an expression cassette of the present invention can be any enhancer-promoter that drives expression in a cell to be transfected.
  • expression cassettes of the invention comprise a polynucleotide operatively linked to a tissue- or cell-specific promoter.
  • exemplary vectors for the expression cassette include viral vectors, e.g., adenovirus or lentivirus vectors.
  • An expression cassette of the present invention is useful both as a means for preparing quantities of the iron transport systems protein encoding DNA itself, and as a means for preparing the encoded polypeptides. It is contemplated that where iron transport systems proteins of the invention are made by recombinant means, one can employ either prokaryotic or eukaryotic expression vectors as shuttle systems.
  • the recombinant DNA can be readily introduced into the cells, e.g., mammalian, bacterial, yeast or insect cells, by transfection with an expression cassette or vector comprising DNA encoding an iron transport systems protein or its complement, by any procedure useful for the introduction into a particular cell, e.g., physical or biological methods, to yield a transformed cell having the recombinant DNA optionally stably integrated into its genome, so that the DNA molecules, sequences, or segments, of the present invention are expressed by the cell.
  • Physical methods to introduce a recombinant DNA into a cell include calcium phosphate- or DEAE-dextran-mediated transfection, lipofection, particle bombardment, protoplast fusion, microinjection, electroporation, and the like.
  • a widely used method is transfection mediated by either calcium phosphate or DEAE-dextran. Depending on the cell type, up to 90% of a population of cultured cells can be transfected at any one time. Because of its high efficiency, transfection mediated by calcium phosphate or DEAE-dextran may be the method of choice for experiments that require transient expression of the foreign DNA in large numbers of cells. Calcium phosphate-mediated transfection is also used to establish cell lines that integrate copies of the foreign DNA, which are usually arranged in head-to-tail tandem arrays, into the host cell genome.
  • Electroporation can be extremely efficient and can be used both for transient expression of cloned genes and for establishment of cell lines that carry integrated copies of the gene of interest. Electroporation, in contrast to calcium phosphate-mediated transfection and protoplast fusion, frequently gives rise to cell lines that carry one, or at most a few, integrated copies of the foreign DNA.
  • Liposome transfection involves encapsulation of DNA and RNA within liposomes, followed by fusion of the liposomes with the cell membrane.
  • the mechanism of how DNA is delivered into the cell is unclear, but transfection efficiencies can be as high as 90%.
  • Direct microinjection of a DNA molecule into nuclei has the advantage of not exposing DNA to cellular compartments such as low-pH endosomes. Microinjection is therefore used primarily as a method to establish lines of cells that carry integrated copies of the DNA of interest.
  • Biological methods to introduce the DNA of interest into a cell include the use of DNA and RNA viral vectors.
  • the main advantage of physical methods is that they are not associated with pathological or oncogenic processes of viruses. However, they are less precise, often resulting in multiple copy insertions, random integration, disruption of foreign and endogenous gene sequences, and unpredictable expression.
  • the recombinant cells of the present invention are prokaryotic cells.
  • the recombinant cells of the invention are bacterial cells of the DH5α strain of Escherichia coli, as well as E. coli W3110 (F-, λ-, prototrophic, ATCC No. 273325), bacilli such as Bacillus subtilis, or other Enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species.
  • prokaryotes are used for the initial cloning of DNA sequences and constructing the vectors useful in the invention.
  • E. coli K12 strains can be particularly useful.
  • Other microbial strains which can be used include E. coli B, and E. coli X1776 (ATCC No. 31537). These examples are, of course, intended to be illustrative rather than limiting.
  • plasmid vectors containing replicon and control sequences which are derived from species compatible with the cell are used in connection with these cells.
  • the vector ordinarily carries a replication site, as well as marking sequences which are capable of providing phenotypic selection in transformed cells.
  • E. coli can be transformed using pBR322, a plasmid derived from an E. coli species.
  • pBR322 contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying transformed cells.
  • the pBR plasmid, or other microbial plasmid or phage must also contain, or be modified to contain, promoters which can be used by the microbial organism for expression of its own polypeptides.
  • promoters most commonly used in recombinant DNA construction include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., 1978; Itakura et al., 1977; Goeddel et al., 1979; Goeddel et al., 1980) and a tryptophan (TRP) promoter system (EPO Appl. Publ. No. 0036776; Siebenlist et al., 1980).
  • eukaryotic microbes such as yeast can also be used. Saccharomyces cerevisiae or common baker's yeast is the most commonly used among eukaryotic microorganisms, although Schizosaccharomyces and Pichia are commonly available.
  • the plasmid YRp7 for example, is commonly used (Stinchcomb et al., 1979; Kingsman et al, 1979;
  • This plasmid already contains the trp1 gene which provides a selection marker for a mutant strain of yeast lacking the ability to grow in the absence of tryptophan, for example ATCC No. 44076.
  • the presence of the trp1 lesion as a characteristic of the yeast cell genome then provides an effective environment for detecting transformation by growth in the absence of tryptophan.
  • Suitable promoter sequences in yeast vectors include the promoters for 3-phosphoglycerate kinase (Hitzeman et al., 1980) or other glycolytic enzymes (Hess et al., 1968; Holland et al., 1978) such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase.
  • the termination sequences associated with these genes are also introduced into the expression vector downstream from the sequences to be expressed to provide polyadenylation of the mRNA and termination.
  • Other promoters which have the additional advantage of transcription controlled by growth conditions are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, and the aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization.
  • Any plasmid vector containing a yeast-compatible promoter, origin of replication and termination sequences is suitable.
  • assays include, for example, "molecular biological” assays well known to those of skill in the art, such as Southern and Northern blotting, RT-PCR and PCR; "biochemical” assays, such as detecting the presence or absence of a particular iron transport systems protein, e.g., by immunological means (ELISAs and Western blots) or by additional assays known to the art.
  • To detect RNA produced from introduced recombinant DNA segments, RT-PCR may be employed.
  • In this application of PCR, it is first necessary to reverse transcribe RNA into DNA, using enzymes such as reverse transcriptase, and then through the use of conventional PCR techniques amplify the DNA.
  • PCR techniques, while useful, will not demonstrate integrity of the RNA product.
  • Further information about the nature of the RNA product may be obtained by Northern blotting. This technique demonstrates the presence of an RNA species and gives information about the integrity of that RNA. The presence or absence of an RNA species can also be determined using dot or slot blot Northern hybridizations. These techniques are modifications of Northern blotting and only demonstrate the presence or absence of an RNA species.
  • While Southern blotting and PCR may be used to detect the recombinant DNA segment in question, they do not provide information as to whether the preselected DNA segment is being expressed. Expression may be evaluated by specifically identifying the peptide products of the introduced recombinant DNA sequences or evaluating the phenotypic changes brought about by the expression of the introduced recombinant DNA segment in the cell.
  • a recombinant iron transport systems protein may be recovered or collected either from the transfected or infected cells or the medium in which those cells are cultured. Recovery comprises isolating and purifying the recombinant polypeptide. Isolation and purification techniques for polypeptides are well-known in the art and include such procedures as precipitation, filtration, chromatography, electrophoresis and the like.
  • the present invention also provides a method for the identification of a bacterial outer-membrane protein with a role in iron acquisition and/or metabolism.
  • Genes encoding the outer membrane receptor proteins are transcriptionally altered during iron limitation, as opposed to genes encoding proteins that are periplasmic, in the inner membrane, or cytoplasmic.
  • outer-membrane components can serve as suitable targets for the development of vaccines, antimicrobial peptides, or antimicrobial agents.
  • the strategy of the present method reduces the time to identify genes that encode proteins that are both involved in iron acquisition and present on the outer membrane of the bacterium.
  • the present method provides a three-step strategy for the identification of iron regulated outer membrane proteins.
  • the first step involves the comparative transcriptional profiling of bacterial cells grown in the presence or absence of iron by microarray or alternate methods known to the art for the identification of iron regulated genes.
  • in the second step, proteins encoded by all of the genes in the microbe of interest are examined for the presence of signal sequences, membrane anchor domains, or surface probability using computational and bioinformatics tools known to the art.
  • the third step involves the identification of genes that are included in both of the subsets from steps 1 and 2 that will likely represent genes encoding outer-membrane receptors for iron or iron containing proteins.
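  • The third step amounts to intersecting the two gene subsets. The sketch below shows only that logic; the gene identifiers are placeholders, and how each subset is produced (microarray analysis, signal-sequence prediction) is outside its scope.

```python
from typing import Iterable, List

def candidate_outer_membrane_iron_genes(iron_regulated: Iterable[str],
                                        predicted_surface: Iterable[str]) -> List[str]:
    """Step 3: keep genes found in both the step 1 subset (iron regulated)
    and the step 2 subset (predicted surface or membrane localization)."""
    return sorted(set(iron_regulated) & set(predicted_surface))

# Placeholder identifiers, not results from this document.
step1_iron_regulated = {"geneA", "geneB", "geneC", "geneD"}
step2_predicted_surface = {"geneB", "geneD", "geneE"}

candidates = candidate_outer_membrane_iron_genes(step1_iron_regulated,
                                                 step2_predicted_surface)
```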
  • Iron is an essential nutrient for the survival and proliferation of E. coli.
  • Figure 1 presents data demonstrating that the bacterium grows at a slower rate in iron-depleted as compared with iron-rich media. It was interesting, however, to note that the fur mutant also grows slower than the wild-type in conditions of iron limitation. While this would not be surprising in iron-rich media, since the fur mutant is likely to expend metabolic resources in needlessly making products such as the enterobactin siderophore and iron transport proteins which are not produced by the wild type, the slower growth rate of the E. coli fur mutant under iron-starvation conditions suggests that Fur, either directly or through some mediator, positively regulates processes required for cellular growth and adaptation during iron limitation, a hypothesis that remains to be rigorously tested.
  • cDNA microarrays including 3,866 genes, which make up ~90% of the genes of E. coli MG1655, were used to identify genes whose expression changed in response to iron-limitation. 675 genes showed a greater than 2-fold change in expression for at least one of the time points sampled in the wild type cells.
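  • The selection criterion (a greater than 2-fold change at one or more time points) can be sketched as follows. The data layout, the gene names, and the treatment of a ratio below 0.5 as a 2-fold decrease are assumptions for illustration, not the authors' analysis pipeline.

```python
from typing import Dict, List

def genes_with_fold_change(ratios: Dict[str, List[float]],
                           threshold: float = 2.0) -> List[str]:
    """Return genes whose expression ratio exceeds `threshold` (up) or falls
    below 1/threshold (down) at one or more time points."""
    selected = []
    for gene, series in ratios.items():
        if any(r > threshold or r < 1.0 / threshold for r in series):
            selected.append(gene)
    return selected

# Hypothetical expression ratios (sample / reference) over three time points.
example_ratios = {
    "geneA": [1.1, 2.5, 1.0],   # induced more than 2-fold at one time point
    "geneB": [0.9, 1.2, 1.1],   # never crosses either threshold
    "geneC": [0.4, 0.8, 1.0],   # repressed more than 2-fold at one time point
}
changed = genes_with_fold_change(example_ratios)
```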
  • Figure 2 shows a distribution of the differentially expressed genes among the various classes with a breakdown of the magnitudes of change within each class. The largest number of genes changing was in the "Hypotheticals" class, followed by transport and binding proteins. Among the 79 transport-related genes that were differentially expressed, 50 were upregulated, while 29 were downregulated. The largest changes in magnitude were in the genes encoding amino acid biosynthesis and metabolism and in transport and binding related proteins, whereas the genes encoding the energy metabolism class of proteins did not show as much change.
  • the primary genes involved in iron transport are entABCDEFS, fepABCDEFG, fecABCDEIR, fhuABCDEF, cirA, tonB and exbBD (Earhart (in) Neidhardt, 1996) and are known to be negatively regulated by Fur. These genes showed either a decrease in expression after addition of iron or no appreciable change. In addition, the results show that the genes which were downregulated at the first time point, sampled at 5 minutes, remained so at all six time points sampled up to 90 minutes after iron addition.
  • the ferric citrate transport system consisting of the fec genes is induced by citrate, which was absent in the media used in the current investigation.
  • Previous studies have shown that the fecBCDE genes are positively regulated by fecIR (Braun, 1997) while fecAIR are negatively regulated by Fur.
  • fecA is also induced by citrate
  • fecBCDE are induced by conformational changes in fecA, fecI and fecR in the presence of ferric citrate (Enz et al., 1995).
  • the results presented herein show that while fecAIR genes are downregulated in the wild-type, fecBCDE show no change upon the addition of iron to the medium.
  • 1185 genes showed a greater than 2-fold change in expression for at least one of the time points sampled in the fur mutant.
  • the maximum number of genes changing in a particular class was again the Transport and binding proteins.
  • Table 2: Genes upregulated after iron addition in the fur mutant with the maximum change seen between all time points
  • the functional classes having the maximum number of genes with different profiles in the wild type and the mutant are the Transport and binding followed by Hypotheticals and Translational, post translational modification. 11 out of the 14 transport related genes identified had a known function in iron transport. Among the Amino acid biosynthesis and metabolism genes, a majority is related to arginine synthesis, which show a more than 16-fold upregulation at the 60 minute time-point in the wild-type. Very few genes in Central intermediary metabolism (4) had dissimilar profiles in the wild-type and mutant experiments. This indicates Fur does not regulate a large number of metabolic genes.
  • a summary of the expression patterns and the sequence analysis for four of the 13 hypotheticals having the highest Euclidean distance between the expression profiles in the wild-type and the fur mutant is presented below:
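  • Ranking genes by the Euclidean distance between their wild-type and mutant time-course profiles can be sketched as below. The profile values and gene names are hypothetical, and treating each profile as a plain list of log ratios sampled at the same time points is an assumption about the data layout.

```python
import math
from typing import Dict, List, Sequence, Tuple

def euclidean_distance(profile_a: Sequence[float],
                       profile_b: Sequence[float]) -> float:
    """Euclidean distance between two expression profiles of equal length."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same time points")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))

def rank_by_distance(wild_type: Dict[str, List[float]],
                     mutant: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Return (gene, distance) pairs sorted from most to least dissimilar."""
    distances = [(gene, euclidean_distance(wild_type[gene], mutant[gene]))
                 for gene in wild_type if gene in mutant]
    return sorted(distances, key=lambda item: item[1], reverse=True)

# Hypothetical log-ratio profiles over four time points.
wt = {"geneA": [0.0, 0.0, 0.0, 0.0], "geneB": [1.0, 1.0, 1.0, 1.0]}
mut = {"geneA": [3.0, 4.0, 0.0, 0.0], "geneB": [1.0, 1.0, 1.0, 2.0]}
ranking = rank_by_distance(wt, mut)
```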
  • b1973 shows very slight downregulation in the wild-type, whereas in the fur mutant it shows a dramatic downregulation at 5 min, continues to be further downregulated till 10 min and then the expression increases (Fig. 4).
  • Its nucleotide sequence is similar to a metal ABC transporter substrate-binding protein. Its nucleotide sequence also shows the presence of a putative fur-box with 13 out of the 19 nucleotides matching the fur-box consensus sequence (Fig. 5b).
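  • A match count such as "13 out of the 19 nucleotides" can be obtained by sliding the 19-nucleotide consensus along a sequence and counting identical positions. The consensus string below is the Fur box commonly cited in the literature and is an assumption here, since this section does not spell it out; the function name and test sequence are likewise hypothetical.

```python
from typing import Tuple

# Commonly cited 19-nt Fur box consensus (assumed; not given in this section).
FUR_BOX_CONSENSUS = "GATAATGATAATCATTATC"

def best_fur_box_match(sequence: str,
                       consensus: str = FUR_BOX_CONSENSUS) -> Tuple[int, int]:
    """Slide the consensus along `sequence` and return (matches, start) for
    the first window with the most identical positions.
    Returns (0, -1) if the sequence is shorter than the consensus."""
    sequence = sequence.upper()
    best_score, best_start = 0, -1
    for start in range(len(sequence) - len(consensus) + 1):
        window = sequence[start:start + len(consensus)]
        score = sum(1 for a, b in zip(window, consensus) if a == b)
        if score > best_score:
            best_score, best_start = score, start
    return best_score, best_start

# Hypothetical upstream region with a perfect site embedded at offset 3.
score, position = best_fur_box_match("CCC" + FUR_BOX_CONSENSUS + "GGG")
```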
  • The expression profile of b1973 suggests that it may be involved in the low affinity transport of iron.
  • the flux of iron through the low affinity system might signal the presence of adequate amounts of iron in the external media.
  • the initial downregulation of low affinity transporters like b1973 may be required to maintain homeostasis.
  • b1973 may be dramatically downregulated till the high affinity system is regulated through an alternate mechanism. Since the E. coli fur mutant is not lethal, there may be some other regulatory system controlling the expression of the high affinity iron transport system (or an iron export system) in the absence of which the cells would die of iron overload under abundant iron conditions.
  • b1973 expression starts increasing after 20 to 30 minutes.
  • nucleotide sequence is similar to a putative protein possibly involved in aromatic compounds catabolism.
  • the position of b0597 on the genome, along with its nucleotide sequence and expression profiles, indicates that it might be involved in the adaptation of metabolism to the biosynthesis of enterobactin.
  • b1452 is upregulated in the mutant and downregulated in the wild-type (Fig. 4). Its protein sequence shows the presence of an ATP/GTP-binding site. It is likely that b1452 is involved in iron transport.
  • E. coli K12 was transformed with the pKD46 Helper plasmid by electroporation at 1.6 kV, 50 µF and 200 ohms.
  • 56 bp long primers were designed to amplify the chloramphenicol resistance gene from the pKD3 plasmid having the required homology regions to the fur gene [left primer: aac gct tcc tcg ttt aaa aat cct gga agt tct tca gtg tag gct gga gct gct tc (36 bp homology region) (SEQ ID NO:131); right primer: agt gac acg taa aga tag aga ctg tgg tta gtc agg cat atg aat atc ctc ctt ag (36 bp homology region) (SEQ ID NO:132)].
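  • The stated primer layout (56 nt total, of which 36 nt are homologous to the fur region) can be checked with a few lines of code. The sketch assumes that each "e" in the printed sequences is an extraction error for "c", and that the 3' ends are the 20-nt pKD3 priming sites used in the lambda Red method; neither assumption is stated in this section, and the function name is hypothetical.

```python
from typing import Tuple

# Primer sequences from the text, with "e" read as "c" (assumption).
LEFT_PRIMER = ("aac gct tcc tcg ttt aaa aat cct gga agt tct tca "
               "gtg tag gct gga gct gct tc").replace(" ", "")
RIGHT_PRIMER = ("agt gac acg taa aga tag aga ctg tgg tta gtc agg "
                "cat atg aat atc ctc ctt ag").replace(" ", "")

# Assumed 20-nt plasmid priming sites at the 3' end of each primer.
PRIMING_SITE_LEFT = "gtgtaggctggagctgcttc"
PRIMING_SITE_RIGHT = "catatgaatatcctccttag"

def split_primer(primer: str, priming_site: str) -> Tuple[str, str]:
    """Split a primer into (homology arm, priming site); the priming site
    must sit at the 3' end of the primer."""
    if not primer.endswith(priming_site):
        raise ValueError("primer does not end with the expected priming site")
    return primer[:-len(priming_site)], priming_site

left_arm, _ = split_primer(LEFT_PRIMER, PRIMING_SITE_LEFT)
right_arm, _ = split_primer(RIGHT_PRIMER, PRIMING_SITE_RIGHT)
```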
  • the PCR product was gel-purified using Gel Extraction kits (Qiagen) according to the manufacturer's protocol, and then DpnI digested to remove the genomic DNA. 10 µl of the PCR product, 2 µl DpnI, 5 µl of the 10X buffer (supplied with the enzyme) and 33 µl of dH2O were mixed together and incubated at 37°C for an hour. DpnI was then deactivated by heating the solution to 80°C for 20 minutes on a heat block.
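  • As a quick arithmetic check of the digestion mix, the listed volumes add up to a 50 µl reaction in which the 10X buffer is diluted to 1X. The variable names below are illustrative only.

```python
# Component volumes in microliters, as listed in the text.
digest_components_ul = {
    "PCR product": 10,
    "DpnI": 2,
    "10X buffer": 5,
    "dH2O": 33,
}

total_volume_ul = sum(digest_components_ul.values())

# Final buffer strength: stock concentration times its volume fraction.
final_buffer_strength = 10 * digest_components_ul["10X buffer"] / total_volume_ul
```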
  • E. coli with pKD46 was grown to an OD600 of 0.5-0.6 (6 ml) in SOB containing ampicillin (50 µg/ml) and 10 mM arabinose, washed three times with ice-cold 10% glycerol and resuspended in 50 µl cold 10% glycerol. 100 ng of the gel-purified and DpnI-digested PCR product was mixed with 25 µl cells and the other 25 µl was used as a control. The cells were allowed to sit on ice for 10 min and then electroporated at 1.6 kV, 25 µF and 400 ohms.
  • primers were designed such that they were in the region just left and right of where the insertion was expected such that if there were no insertion a band corresponding to the fur gene would be seen (447 bp) and if there were an insertion a 1194 bp band would be seen. Without colony purification, two bands of the two predicted sizes were observed. However, after colony purification, there was only one band at the correct location. Another PCR was carried out with primers within the chloramphenicol resistance gene sequence. In case there was an insertion, a 729 bp band was expected to be seen, which was observed.
  • RNA extraction was done using RNeasy Mini Kits (Qiagen) according to the manufacturers protocol. DNase digestion for removal of genomic DNA contamination was done on the Qiagen columns according to the manufacturers protocol. RNA was quantified by absorbance at 260 nm.
  • Thermocycling conditions were: a) 94°C for 5 min; b) Cycle conditions: 94°C for 30 sj?c ⁇ nds,.,55°C_for 30 soilseeonds,,&nd72°.C for 1 minute; c) 72°C for 10 minutes-and--.. stored at 4°C.
  • a total of 0.5 ⁇ L of the PCR product was then used as template for a second 100 ⁇ L PCR reaction (same conditions) to minimize genomic DNA contamination.
  • the PCR reactions that failed under these conditions were reamplified at different annealing temperatures.
  • PCR product was run on agarose gels, and the gels were analysed using BioRad Ql software. 90% of the genome was successfully amplified this way.
  • Amplified PCR products were cleaned with MultiScreen PCR plates (Millipore) as per the manufacturer's instructions and resuspended in 50 µL of 3X SSC, 0.01% SDS.
  • Three Arabidopsis thaliana genes (RUBISCO activase (RCA), photosystem I chlorophyll a/b-binding protein (Cab), root cap 1 (RCP1)) were printed in triplicate on the array as controls (PCR product purchased from Stratagene).
  • the array (all genes spotted in triplicate) was printed on poly-L-lysine slides using the Total Array System Robot (BioRobotics) as described by the manufacturer. After printing was completed, the slides were post processed. Post processing involves Rehydration, Blocking and Denaturation. The spotting process does not, in general, leave DNA evenly distributed throughout the spot. To distribute the DNA more evenly, the spots (which dry rapidly during spotting) were rehydrated and snap dried. During the Blocking step, the remaining free lysine groups are modified to minimize their ability to bind labeled probe DNA. If these groups are not blocked labeled probe DNA will bind indiscriminately and nonspecifically to the surface and will produce excessively high background. Blocking is done by acylation with succinic anhydride. Denaturation is carried out to make the probe accessible to the target. All the protocols are available on the website at wwwl.umn.edu/agac/microrray.
  • Probe preparation: Control RNA [RUBISCO activase (RCA), photosystem I chlorophyll a/b-binding protein (Cab), root cap 1 (RCP1) (Stratagene)] at different concentrations (between 20 pg/µl and 800 pg/µl) of each was added to 20 µg of each sample RNA.
  • the RNA was primed with 60 µg of random hexamers (Amersham), incubated at 70°C for 10 minutes and then allowed to sit on ice for 10 min.
  • RNA mixture was then added to the reverse transcription reaction, which included the reverse transcriptase Superscript II (GibcoBRL Life Technologies), DTT (Amersham), and a 4:1 ratio of amino-allyl dUTP with dNTPs (Amersham).
  • the mixture was incubated for 2 hours at 42°C. After incubation, the reaction was stopped by addition of 20 µL NaOH and 20 µL 0.5 M EDTA, and incubation at 65°C for 15 min. The samples were neutralized with the addition of 50 µL of 1 M Tris-HCl (pH 7.4).
  • the amino labeled cDNA was then stripped of all extra amine groups before the fluorescent dye was coupled to the amino group using Microcon YM-30 filters as per the manufacturer's instructions (Millipore).
  • the sample was dried down to concentrate cDNA.
  • the cDNA was resuspended in 18 µl sodium bicarbonate buffer (pH 9.0) and allowed to sit 10-15 minutes at room temperature to ensure resuspension. The entire volume was transferred into a tube containing a dried Cy-dye aliquot and incubated in the dark for an hour. In order to prevent cross coupling between the probes, 9 µL of 4 M hydroxylamine was added and the solution incubated for 15 minutes at room temperature in the dark.
  • the Qia-Quick PCR purification kit was used to clean up the probe product as per the manufacturer's instructions (Qiagen) with the addition of one Buffer PE wash.
  • the samples were dried down and then eluted in 33.5 µL dH2O, 6.68 µl 20x SSC, 1.01 µL 10% SDS, 3.38 µL of salmon sperm DNA and 3.38 µl polyA as a blocker.
  • Hybridization: Slides were placed in hybridization chambers. 10 µL of 3x SSC was added to the ends of each slide to prevent dehydration of the hybridization solution. The hybridization solution was denatured for 2 minutes and then allowed to cool for 10 minutes at room temperature.
  • the fluorescent intensities were quantified using the QuantArray Software.
  • the adaptive measurement protocol was used.
  • the local background was subtracted from the spot intensity to get the final signal intensity.
  • Signal intensities thus obtained which were greater than at least one standard deviation of the local background were used for further analysis.
  • the spot intensities were normalized using the nonlinear normalization program by Tseng et al. (2001).
  • the standard deviation of the three measurements for each gene was calculated and genes having a standard deviation more than 1.5 were discarded. After discarding the 'invalid' data points, the average ratio of the intensities for the two dyes for each gene was calculated. For each slide, any gene which had only one valid data point was discarded.
  • the ratio of the difference between the ratios from the two spots to the minimum (absolute) of the two ratios had to be less than 2 for the gene to pass the quality filtering.
  • the data from the two replicate experiments is shown separately instead of combining all the 6 data points into one.
  • RNA was treated for DNA removal using the DNA-free kit from Ambion according to the manufacturer's protocol. 1 µg of the 'cleaned' RNA was reverse transcribed using random hexamers as primers, similar to the reverse transcription reaction for the generation of the probe for the array, as described above. Amplification, detection, and real-time analysis were performed using the ABI Prism 7700 Sequence Detection System (Applied Biosystems). SYBR Green I (Applied Biosystems) was used for detection of the amplified product. Primers were designed to produce amplicons of the same length (approximately 100 bp) using the Primer3 software.
  • the cDNA was diluted 13 fold and 1 µl of this diluted cDNA was used for subsequent PCR amplification using the appropriate specific primers and the SYBR Green 2X PCR Master Mix kit (Applied Biosystems). The 'cleaned' RNA was also used as a template to check for any amplification from genomic DNA contamination.
  • YodA from Escherichia coli is a metal-binding, lipocalin-like protein.
  • J Biol Chem. Earhart in) Neidhardt, F.C.a.C, R. (1996) Escherichia and Salmonella: cellular and molecular biology. Washington, D.C.: ASM Press. Euz, S., Braun, V., and Crosa, J.H.
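
By way of illustration only, the spot-level and gene-level filtering described in the data analysis steps above may be sketched in Python as follows. The function names, the way the two-spot and three-spot cases are combined, and the order in which the filters are applied are assumptions made for this sketch; they are not taken from the analysis software actually used.

```python
from statistics import mean, stdev

def spot_signal(intensity, bg_mean, bg_sd):
    """Subtract the local background; keep the spot only if the
    resulting signal exceeds one standard deviation of the background."""
    signal = intensity - bg_mean
    return signal if signal > bg_sd else None

def gene_ratio(ratios, max_sd=1.5, max_rel_diff=2.0):
    """Average the dye ratios of the replicate spots for one gene.

    `ratios` holds one entry per replicate spot; None marks an invalid spot.
    Returns None when the gene fails the quality filtering.
    """
    valid = [r for r in ratios if r is not None]
    if len(valid) < 2:
        # a gene with only one valid data point is discarded
        return None
    if len(valid) == 2:
        # difference between the two ratios relative to the smaller
        # (absolute) ratio must be less than max_rel_diff
        smaller = min(abs(valid[0]), abs(valid[1]))
        if smaller == 0 or abs(valid[0] - valid[1]) / smaller >= max_rel_diff:
            return None
    elif stdev(valid) > max_sd:
        # replicate measurements that scatter too widely are discarded
        return None
    return mean(valid)
```

In this sketch a gene with three consistent spots is averaged directly, whereas a gene with two valid spots must additionally pass the relative-difference check before its average is reported.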

Abstract

Applicants have discovered previously uncharacterized genes and proteins that are involved in iron transport and/or metabolism. Accordingly, the present invention provides proteins involved in iron transport and/or metabolism and polynucleotides encoding those proteins. The present invention also provides expression cassettes containing the polynucleotides. The present invention also provides cells containing the expression cassettes, polypeptides, and/or polynucleotides of the invention.

Description

IRON TRANSPORT AND METABOLISM PROTEINS
Cross-Reference to Related Application
This patent application claims the benefit of priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application Serial Number 60/405,331, filed on August 21, 2002, which is herein incorporated by reference.
Background of the Invention
Iron is an essential element for the survival of nearly all organisms, including pathogenic bacteria. Although there is adequate iron present in the body fluids of humans and animals, the amount of iron readily available to bacteria is extremely low. This is due, in part, to the fact that a majority of the iron in an animal is intracellular, in the form of ferritin, haemosiderin or haem. In addition, iron present in body fluids is complexed with high affinity iron binding proteins like transferrin and lactoferrin. Hence, the amount of free iron in equilibrium with iron binding proteins is at an approximate concentration of 10^-18 M. Even outside the host, free Fe+++ in an aerobic, aqueous environment is limited to an equilibrium value of approximately 10" M, a value far below that required for optimal bacterial growth.
To circumvent these restrictive conditions, pathogenic bacteria have evolved specialized transport and metabolic systems to acquire a sufficient iron supply. For example, high affinity iron transport systems have been developed that include specific ferric iron chelators, "siderophores," and iron-regulated outer membrane proteins (IROMPs) and/or siderophore receptor proteins (SRPs) that are receptors for siderophores on the outer membrane of the bacterial cell.
Some genes involved in iron transport under conditions of iron limitation have been identified, such as fur. Fur negatively regulates the genes involved in iron uptake and the biosynthesis of siderophores in response to the iron level in the cell. Fur mutants constitutively express the siderophore biosynthesis enzymes and iron transport proteins. Despite the above knowledge, there is a need in the art to identify additional genes and proteins that are involved in bacterial iron transport and metabolism. Such genes and/or proteins could be used, for example, as potential targets for agents active against pathogenic bacteria.
Summary of the Invention
Applicants have discovered previously uncharacterized genes and proteins that are involved in iron transport and/or metabolism. Accordingly, the present invention provides proteins involved in iron transport and/or metabolism and polynucleotides encoding those proteins.
The invention provides an isolated and purified polypeptide comprising at least one of SEQ ID NOs 65-128 or 130.
The present invention also provides an isolated and purified polynucleotide comprising a nucleic acid sequence encoding at least one of SEQ ID NOs 65-128 or 130.
The present invention further provides an isolated and purified polynucleotide comprising at least one of SEQ ID NOs 1-64 or 129.
The present invention also provides an expression cassette, comprising a nucleic acid sequence encoding a promoter operably linked to at least one of the polynucleotides of the invention.
Also provided is a cell, e.g., comprising an expression cassette, polynucleotide, and/or polypeptide of the invention.
The present invention also provides a method of identifying a gene, including: a) contacting a probe including nucleic acid obtained from a cell grown in an iron-limited environment with a solid substrate including one or more nucleotide sequences to provide a profile of gene expression; b) contacting a probe including nucleic acid obtained from a cell grown in a non-iron-limited environment with a solid substrate including one or more nucleotide sequences to provide a profile of gene expression; and c) comparing the profile of (a) to the profile of (b) so as to identify a gene having altered expression.
Brief Description of the Figures
Figure 1. Growth curve of the wild-type (•) and the fur mutant (■). Wild-type and fur were cultured in defined, iron-limited media, and at time 0, FeSO4·7H2O was added to the culture to a final concentration of 10 μM.
Figure 2. Distribution of genes (number of genes changed and magnitude of change) differentially expressed in the wild type among all functional classes: 1: Amino acid biosynthesis and metabolism; 2: Biosynthesis of cofactors, prosthetic groups and carriers; 3: Carbon compound catabolism; 4: Cell processes (incl. adaptation, protection); 5: Cell structure; 6: Central intermediary metabolism; 7: DNA replication, recombination, modification and repair; 8: Energy metabolism; 9: Fatty acid and phospholipid metabolism; 10: Hypothetical, unclassified, unknown; 11: Membrane proteins; 12: Nucleotide biosynthesis and metabolism; 13: Other genes; 14: Phage, transposon, or plasmid; 15: Putative chaperones; 16: Putative enzymes; 17: Putative regulatory proteins; 18: Putative transport proteins; 19: Regulatory function; 20: Structural proteins; 21: Transcription, RNA processing and degradation; 22: Translation, post-translational modification; and 23: Transport and binding proteins.
Figure 3. Expression of some of the genes involved in iron transport and having an operonic organization. Wild-type (~), fur (--). The Y-axis shows fold-change in expression on addition of iron to an iron-limited culture. The X-axis shows the time points at which samples were taken.
Figure 4. Expression profiles of b1973, b0597, b0805 and b1452. Wild-type (•), fur⁻ (■). The Y-axis shows fold-change in expression on addition of iron to an iron-limited culture. The X-axis shows the time points at which samples were taken.
Figure 5. Growth curves, sequence analysis and location of b1973. 5a is a comparison of growth curves of wild-type and the b1973⁻ mutant. 5b depicts the putative fur-box in the b1973 sequence. 5c: b1968: putative 2 component sensor protein, b1969: putative 2 component transcriptional regulator, b1970: putative periplasmic or exported protein, b1971: putative reductase, b1972: putative inner membrane protein, b1973: putative metal ABC transporter substrate binding protein.
Figure 6a. Comparison of array data and real time PCR data. Wild-type or fur (KO) (-), real time PCR data for the same experimental condition (--). The Y axis is the fold change (log to the base 2 of ratio), maximum 6 and minimum -8; the X axis has three points at which samples were taken: 20 min, 60 min and 90 minutes.
Figure 6b. Scatter plot of ratio from cDNA microarray vs. ratio from real time PCR.
Detailed Description of the Invention
The transcriptional response of Escherichia coli MG1655 and defined mutants to growth in an iron-limited environment was characterized by cDNA-based microarray analysis of the whole genome. Samples taken at six different time points after addition of iron to an iron-starved culture were analyzed and showed that the expression of a large proportion (~20%) of genes is altered during iron restriction. Applicants have discovered that the pathways that were most dramatically altered were nucleotide biosynthesis and metabolism (~34%), amino acid biosynthesis and metabolism (~28%), and transport and binding proteins (~19%). These proteins can serve as targets in the development of vaccines and antimicrobial agents.
To identify genes with a specific role in iron transport, the transcriptional response of E. coli with a deleted fur gene, a well-characterized transcriptional regulator of iron transport related genes, was analyzed. The analysis revealed altered transcriptional profiles in the wild-type vs. the fur mutant for 64 genes, including 13 previously uncharacterized genes with a role in iron transport or metabolism. These proteins can serve as targets in the development of vaccines and antimicrobial agents. One of these genes, b1973, was down-regulated by up to 7-fold in the fur mutant as compared with the wild-type, indicating that its expression is either directly or indirectly regulated by Fur. Insertional inactivation of b1973 resulted in a considerably longer doubling time when grown either in minimal media or during iron restriction in rich media.
Accordingly, the present invention provides an isolated and purified polypeptide comprising at least one of SEQ ID NOs 65-128 or 130. In some embodiments of the invention, the polypeptide comprises SEQ ID NO:86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 130.
The present invention also provides an isolated and purified polynucleotide comprising a nucleic acid sequence encoding at least one of SEQ ID NOs 65-128 or 130. In some embodiments of the invention, the polynucleotide encodes SEQ ID NO:86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 130.
The present invention also provides an isolated and purified polynucleotide comprising at least one of SEQ ID NOs 1-64 or 129. In some embodiments of the invention, the polynucleotide comprises SEQ ID NO: 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 129.
The present invention also provides an expression cassette, comprising a nucleic acid sequence encoding a promoter operably linked to at least one of the polynucleotides of the invention. In some embodiments, the nucleic acid sequence comprises SEQ ID NO: 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, or 129. In some embodiments of the invention, the nucleic acid sequence encodes SEQ ID NO:86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 130.
Also provided is a cell, e.g., a host cell, comprising an expression cassette, polynucleotide, and/or polypeptide of the invention. The present invention also provides polynucleotides and polypeptides having substantial similarity to at least one of the polynucleotides or polypeptides of the invention.
The present invention also provides fragments of the polynucleotides and polypeptides of the invention. The present invention also provides a method of identifying a gene, including: a) contacting a probe including nucleic acid obtained from a cell grown in an iron-limited environment with a solid substrate including one or more nucleotide sequences to provide a profile of gene expression; b) contacting a probe including nucleic acid obtained from a cell grown in a non-iron-limited environment with a solid substrate including one or more nucleotide sequences to provide a profile of gene expression; and c) comparing the profile of (a) to the profile of (b) so as to identify a gene having altered expression. In some embodiments, the cells are bacterial cells. In some embodiments, the cells are prokaryotic cells. In some embodiments, the cells are eukaryotic cells. In some embodiments, the gene having the altered expression encodes an outer membrane protein. In some embodiments, the gene is identified by comparing the expression of genes in a wild-type host cell to the expression of genes in a mutant host cell. In some embodiments, the mutant cell comprises a mutated fur gene. The present invention also provides genes identified by such methods, and proteins encoded by those genes.
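
By way of illustration only, the comparison in step (c) of the method above may be sketched in Python as follows. The gene names, the intensity values, and the twofold cutoff (an absolute log2 ratio of at least 1) are hypothetical and are chosen solely for this sketch; the method itself does not prescribe any particular threshold.

```python
import math

def altered_genes(limited, replete, min_abs_log2=1.0):
    """Compare two expression profiles (gene -> signal intensity).

    Returns {gene: log2(limited / replete)} for genes whose expression
    differs between the iron-limited and non-iron-limited profiles by
    at least the chosen cutoff. The cutoff is an assumption of this sketch.
    """
    altered = {}
    for gene in limited.keys() & replete.keys():
        a, b = limited[gene], replete[gene]
        if a <= 0 or b <= 0:
            # a log ratio is undefined for non-positive signals
            continue
        log_ratio = math.log2(a / b)
        if abs(log_ratio) >= min_abs_log2:
            altered[gene] = log_ratio
    return altered

# Hypothetical profiles for three genes
iron_limited = {"geneA": 800.0, "geneB": 100.0, "geneC": 50.0}
iron_replete = {"geneA": 100.0, "geneB": 110.0, "geneC": 200.0}
result = altered_genes(iron_limited, iron_replete)
```

Here `geneA` and `geneC` are reported as having altered expression, while `geneB`, whose signal is nearly unchanged, is not.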
I. Definitions
As used herein, the phrase "iron transport and metabolism systems" is meant to refer to a bacterial system for the uptake, acquisition, transport, metabolism and/or regulation of iron under low iron conditions. By "low iron conditions" is meant an iron-limited environment, i.e., an environment wherein the availability of iron, e.g., free Fe+++, is at a lower concentration than that required for optimal bacterial growth. This concentration may vary depending on the particular requirements, e.g., nutritional needs, of a bacterial cell. For example, in a mammalian host, little iron is available as a nutrient for microbial growth because it is tightly bound to hemoglobin, myoglobin, cytochromes and ferritin within a cell, as well as bound to transferrin in the blood. For example, low iron conditions refer to the iron-limited environment that exists in vivo in a mammalian or avian host, as well as an environment wherein the availability of iron, e.g., free Fe+++, is in the range of about 10^-8 M to 10^-20 M or less. In addition, "iron transport and metabolism systems" includes related systems, i.e., systems that facilitate iron transport and metabolism systems. For example, the phrase "nucleic acid encoding an iron transport and metabolism protein" includes a gene encoding a protein involved in iron uptake as well as the gene controlling expression of the uptake component.
The term "chimeric" refers to any gene or DNA that contains 1) DNA sequences, including regulatory and coding sequences, that are not found together in nature, or 2) sequences encoding parts of proteins not naturally adjoined, or 3) parts of promoters that are not naturally adjoined. Accordingly, a chimeric gene may include regulatory sequences and coding sequences that are derived from different sources, or include regulatory sequences and coding sequences derived from the same source, but arranged in a manner different from that found in nature.
"Expression" refers to the transcription and translation of an endogenous gene or a transgene in a host cell. For example, in the case of antisense constructs, expression may refer to the transcription of the antisense DNA only. In addition, expression refers to the transcription and stable accumulation of sense (mRNA) or functional RNA. Expression may also refer to the production of protein.
The term "gene" is used broadly to refer to any segment of nucleic acid associated with a biological function. Thus, genes include coding sequences and/or the regulatory sequences required for their expression. For example, gene refers to a nucleic acid fragment that expresses mRNA, or specific protein, including regulatory sequences. Genes also include nonexpressed DNA segments that, for example, form recognition sequences for other proteins. Genes can be obtained from a variety of sources, including cloning from a source of interest or synthesizing from known or predicted sequence information, and may include sequences designed to have desired parameters. A "transgene" refers to a gene that has been introduced into the genome by transformation and is stably maintained. Transgenes may include, for example, DNA that is either heterologous or homologous to the DNA of a particular cell to be transformed. Additionally, transgenes may include native genes inserted into a non-native organism, or chimeric genes. The term "endogenous gene" refers to a native gene in its natural location in the genome of an organism. A "foreign" gene refers to a gene not normally found in the host organism but that is introduced by gene transfer.
A "mutation" refers to an insertion, deletion or substitution of one or more nucleotide bases of a nucleic acid sequence, so that the nucleic acid sequence differs from the wild-type sequence. For example, a 'point' mutation refers to an alteration in the sequence of a nucleotide at a single base position from the wild type sequence.
The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and polymers thereof in either single- or double-stranded form, composed of monomers (nucleotides) containing a sugar, phosphate and a base which is either a purine or pyrimidine. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions) and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., 1991; Ohtsuka et al., 1985; Rossolini et al., 1994). A "nucleic acid fragment" is a fraction of a given nucleic acid molecule. Deoxyribonucleic acid (DNA) in the majority of organisms is the genetic material while ribonucleic acid (RNA) is involved in the transfer of information contained within DNA into proteins. The term "nucleotide sequence" refers to a polymer of DNA or RNA that can be single- or double-stranded, optionally containing synthetic, non-natural or altered nucleotide bases capable of incorporation into DNA or RNA polymers. The terms "nucleic acid," "nucleic acid molecule," "nucleic acid fragment," "nucleic acid sequence," or "polynucleotide" may also be used interchangeably with gene, cDNA, DNA and RNA encoded by a gene (Batzer et al., 1991; Ohtsuka et al., 1985; Rossolini et al., 1994).
"Operably linked" when used with respect to nucleic acid, means joined as part of the same nucleic acid molecule, suitably positioned and oriented for transcription to be initiated from the promoter. DNA operably linked to a promoter is under transcriptional initiation regulation of the promoter. Coding sequences can be operably-linked to regulatory sequences in sense or antisense orientation. "Promoter" refers to a nucleotide sequence, usually upstream (5') to its coding sequence, which controls the expression of the coding sequence by providing the recognition for RNA polymerase and other factors required for proper transcription. "Promoter" includes a minimal promoter that is a short DNA sequence including a TATA- box and other sequences that serve to specify the site of transcription initiation, to which regulatory elements are added for control of expression. "Promoter" also refers to a nucleotide sequence that includes a minimal promoter plus regulatory elements that is capable of controlling the expression of a coding sequence or functional RNA. This type of promoter sequence consists of proximal and more distal upstream elements, the latter elements often referred to as enhancers. Accordingly, an "enhancer" is a DNA sequence that can stimulate promoter activity and may be an innate element of the promoter or a heterologous element inserted to enhance the level or tissue specificity of a promoter. It is capable of operating in both orientations (normal or flipped), and is capable of functioning even when moved either upstream or downstream from the promoter. Both enhancers and other upstream promoter elements bind sequence-specific DNA-binding proteins that mediate their effects. Promoters may be derived in their entirety from a native gene, or be composed of different elements derived from different promoters found in nature, or even include synthetic DNA segments. 
A promoter may also contain DNA sequences that are involved in the binding of protein factors that control the effectiveness of transcription initiation in response to physiological or developmental conditions. The "initiation site" is the position surrounding the first nucleotide that is part of the transcribed sequence, which is also defined as position +1. With respect to this site all other sequences of the gene and its controlling regions are numbered. Downstream sequences (i.e. further protein encoding sequences in the 3' direction) are denominated positive, while upstream sequences (mostly of the controlling regions in the 5' direction) are denominated negative.
Promoter elements, particularly a TATA element, that are inactive or that have greatly reduced promoter activity in the absence of upstream activation are referred to as "minimal or core promoters." In the presence of a suitable transcription factor, the minimal promoter functions to permit transcription. A "minimal or core promoter" thus consists only of all basal elements needed for transcription initiation, e.g., a TATA box and/or an initiator.
"Constitutive expression" refers to expression using a constitutive or regulated promoter. "Conditional" and "regulated expression" refer to expression controlled by a regulated promoter. An "inducible promoter" is a regulated promoter that can be turned on in a cell by an external stimulus, such as a chemical, light, hormone, stress, or a pathogen.
The following terms are used to describe the sequence relationships between two or more nucleic acids or polynucleotides: (a) "reference sequence", (b) "comparison window", (c) "sequence identity", (d) "percentage of sequence identity", and (e) "substantial identity."
(a) As used herein, "reference sequence" is a defined sequence used as a basis for sequence comparison. A reference sequence may be a subset or the entirety of a specified sequence; for example, as a segment of a full-length cDNA or gene sequence, or the complete cDNA or gene sequence.
(b) As used herein, "comparison window" makes reference to a contiguous and specified segment of a polynucleotide sequence, wherein the polynucleotide sequence in the comparison window may include additions or deletions (i.e., gaps) compared to the reference sequence (which does not include additions or deletions) for optimal alignment of the two sequences. Generally, the comparison window is at least 20 contiguous nucleotides in length, and optionally can be 30, 40, 50, 100, or longer. Those of skill in the art understand that to avoid a high similarity to a reference sequence due to inclusion of gaps in the polynucleotide sequence a gap penalty is typically introduced and is subtracted from the number of matches.
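
By way of illustration only, a percentage of identity over a comparison window, with a gap penalty subtracted from the number of matches, may be sketched in Python as follows. The function name, the use of `-` as the gap character, the penalty of one match per gapped column, and the choice of the window length as the denominator are all assumptions made for this sketch.

```python
def window_identity(ref_aln, query_aln, gap_penalty=1.0, gap_char="-"):
    """Percent identity over an aligned comparison window.

    Both strings are rows of the same alignment and must have equal length.
    Each gapped column costs `gap_penalty` matches, so that inserting gaps
    cannot by itself inflate the apparent similarity.
    """
    if len(ref_aln) != len(query_aln) or not ref_aln:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = 0
    gaps = 0
    for r, q in zip(ref_aln, query_aln):
        if r == gap_char or q == gap_char:
            gaps += 1
        elif r == q:
            matches += 1
    # subtract the gap penalty from the number of matches, never below zero
    score = max(matches - gap_penalty * gaps, 0.0)
    return 100.0 * score / len(ref_aln)
```

For a ten-column window with nine matching columns and one gap, this sketch reports 80% with the default penalty and 90% when the penalty is set to zero.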
Methods of alignment of sequences for comparison are well known in the art. Thus, the determination of percent identity between any two sequences can be accomplished using a mathematical algorithm. Non-limiting examples of such mathematical algorithms are the algorithm of Myers and Miller (1988); the local homology algorithm of Smith et al. (1981); the homology alignment algorithm of Needleman and Wunsch (1970); the search-for-similarity-method of Pearson and Lipman (1988); the algorithm of Karlin and Altschul (1990), modified as in Karlin and Altschul (1993).
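
By way of illustration only, a global alignment score in the manner of Needleman and Wunsch may be computed by dynamic programming as sketched below in Python. The scoring values (match +1, mismatch -1, gap -2) and the function name are assumptions made for this sketch, and the linear gap cost is a simplification; the programs named herein use their own default parameters.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b.

    Only the previous row of the dynamic programming matrix is retained,
    so memory use grows with len(b) rather than len(a) * len(b).
    """
    # row 0: aligning the empty prefix of a against prefixes of b
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against the empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]
```

With these illustrative values, aligning `ACGT` against `AGT` scores 1, corresponding to three matches and one gap.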
Computer implementations of these mathematical algorithms can be utilized for comparison of sequences to determine sequence identity. Such implementations include, but are not limited to: CLUSTAL in the PC/Gene program (available from Intelligenetics, Mountain View, California); the ALIGN program (Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Version 8 (available from Genetics Computer Group (GCG), 575 Science Drive, Madison, Wisconsin, USA). Alignments using these programs can be performed using the default parameters. The CLUSTAL program is well described by Higgins et al. (1988), Higgins et al. (1989), Corpet et al. (1988), Huang et al. (1992), and Pearson et al. (1994). The ALIGN program is based on the algorithm of Myers and Miller, supra. The BLAST programs of Altschul et al. ,(1990); (1997), are based on the algorithm of Karlin and Altschul supra. Software for performing BLAST analyses is publicly available through the
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N
(penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction are halted when the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached. In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences. One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a test nucleic acid sequence is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid sequence to the reference nucleic acid sequence can be less than about 0.1, less than about 0.01, or less than about 0.001. To obtain gapped alignments for comparison purposes, Gapped BLAST (in
BLAST 2.0) can be utilized as described in Altschul et al. (1997). Alternatively, PSI-BLAST (in BLAST 2.0) can be used to perform an iterated search that detects distant relationships between molecules. See Altschul et al., supra. When utilizing BLAST, Gapped BLAST, PSI-BLAST, the default parameters of the respective programs (e.g. BLASTN for nucleotide sequences, BLASTX for proteins) can be used. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix. See the world wide web at ncbi.nlm.nih.gov. Alignment may also be performed manually by inspection.
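The seed-extension step described above can be illustrated with a minimal Python sketch. It extends one word hit to the right without gaps, using the BLASTN defaults M=5 and N=-4 given above. The function name `extend_right`, its argument names, and the drop-off value `x_drop=20` are assumptions chosen for illustration and are not taken from the BLAST programs themselves.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed word hit to the right without gaps (illustrative only).

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the seed.
    """
    # Score the seed word itself.
    score = 0
    for k in range(word_len):
        score += match if query[q_start + k] == subject[s_start + k] else mismatch
    best_score, best_len = score, word_len

    i, j = q_start + word_len, s_start + word_len
    length = word_len
    # Stop when the end of either sequence is reached.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # Stop when the score falls X below its maximum, or reaches zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len
```

A full search would extend in both directions and then evaluate the statistical significance of the resulting HSP; the sketch covers only the halting rules for one direction.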
For purposes of the present invention, comparison of nucleotide sequences for determination of percent sequence identity to the promoter sequences disclosed herein can be made using the BlastN program (version 1.4.7 or later) with its default parameters or any equivalent program. By "equivalent program" is intended any sequence comparison program that, for any two sequences in question, generates an alignment having identical nucleotide or amino acid residue matches and an identical percent sequence identity when compared to the corresponding alignment generated by the alternative program.
(c) As used herein, "sequence identity" or "identity" in the context of two nucleic acid or polypeptide sequences makes reference to a specified percentage of residues in the two sequences that are the same when aligned for maximum correspondence over a specified comparison window, as measured by sequence comparison algorithms or by visual inspection. When percentage of sequence identity is used in reference to proteins it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. When sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences that differ by such conservative substitutions are said to have "sequence similarity" or "similarity." Means for making this adjustment are well known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity. Thus, for example, where an identical amino acid is given a score of 1 and a non-conservative substitution is given a score of zero, a conservative substitution is given a score between zero and 1. The scoring of conservative substitutions is calculated, e.g., as implemented in the program PC/GENE (Intelligenetics, Mountain View, California).
(d) As used herein, "percentage of sequence identity" means the value determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may include additions or deletions (i.e., gaps) as compared to the reference sequence (which does not include additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison, and multiplying the result by 100 to yield the percentage of sequence identity.
(e)(i) The term "substantial identity" of polynucleotide sequences means that a polynucleotide includes a sequence that has at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, or 79%, at least 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, or 89%, at least 90%, 91%, 92%, 93%, or 94%, or at least 95%, 96%, 97%, 98%, or 99% sequence identity, compared to a reference sequence using one of the alignment programs described using standard parameters. One of skill in the art will recognize that these values can be appropriately adjusted to determine corresponding identity of proteins encoded by two nucleotide sequences by taking into account codon degeneracy, amino acid similarity, reading frame positioning, and the like.
Substantial identity of amino acid sequences for these purposes normally means sequence identity of at least 70%, at least 80%, 90%, or at least 95%.
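The percentage-of-sequence-identity calculation in definition (d) can be sketched in Python as follows. The sketch assumes the two sequences have already been optimally aligned and padded with a gap character so that they have equal length. The function name `percent_identity` and the use of "-" as the gap symbol are illustrative assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over a comparison window of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Matched positions: identical residues in both sequences (gaps never match).
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    # Divide by the total number of positions in the window, gaps included.
    return 100.0 * matches / len(aligned_a)
```

For the aligned pair "ACGT-ACG" and "ACGTTACC", six of eight window positions match, giving 75% identity, which would satisfy the "at least 70%" threshold but not the "at least 80%" threshold.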
Another indication that nucleotide sequences are substantially identical is if two molecules hybridize to each other under stringent conditions (see below). Generally, stringent conditions are selected to be about 5°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. However, stringent conditions encompass temperatures in the range of about 1°C to about 20°C, depending upon the desired degree of stringency as otherwise qualified herein. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides they encode are substantially identical. This may occur, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. One indication that two nucleic acid sequences are substantially identical is when the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid.
(e)(ii) The term "substantial identity" in the context of a peptide indicates that a peptide includes a sequence with at least 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, at least 90%, 91%, 92%, 93%, or 94%, or 95%, 96%, 97%, 98% or 99%, sequence identity to the reference sequence over a specified comparison window. Optimal alignment may be conducted using the homology alignment algorithm of Needleman and Wunsch (1970). An indication that two peptide sequences are substantially identical is that one peptide is immunologically reactive with antibodies raised against the second peptide. Thus, a peptide is substantially identical to a second peptide, for example, where the two peptides differ only by a conservative substitution.
For sequence comparison, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
As noted above, another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions. The phrase "hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA. "Bind(s) substantially" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target nucleic acid sequence.
"Stringent hybridization conditions" and "stringent hybridization wash conditions" in the context of nucleic acid hybridization experiments such as Southern and Northern hybridizations are sequence dependent, and are different under different environmental parameters. Longer sequences hybridize specifically at higher temperatures. The thermal melting point (Tm) is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Specificity is typically the function of post-hybridization washes, the critical factors being the ionic strength and temperature of the final wash solution. For DNA-DNA hybrids, the Tm can be approximated from the equation of Meinkoth and Wahl (1984): Tm = 81.5°C + 16.6 (log M) + 0.41 (%GC) - 0.61 (% form) - 500/L; where M is the molarity of monovalent cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % form is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs. Tm is reduced by about 1°C for each 1% of mismatching; thus, Tm, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with >90% identity are sought, the Tm can be decreased 10°C. Generally, stringent conditions are selected to be about 5°C lower than the Tm for the specific sequence and its complement at a defined ionic strength and pH. However, severely stringent conditions can utilize a hybridization and/or wash at 1, 2, 3, or 4°C lower than the Tm; moderately stringent conditions can utilize a hybridization and/or wash at 6, 7, 8, 9, or 10°C lower than the Tm; low stringency conditions can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or 20°C lower than the Tm. Using the equation, hybridization and wash compositions, and desired temperature, those of ordinary skill will understand that variations in the stringency of hybridization and/or wash solutions are inherently described. If the desired degree of mismatching results in a temperature of less than 45°C (aqueous solution) or 32°C (formamide solution), the SSC concentration can be increased so that a higher temperature can be used. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993). Generally, highly stringent hybridization and wash conditions are selected to be about 5°C lower than the Tm for the specific sequence at a defined ionic strength and pH.
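As a worked illustration of the Meinkoth and Wahl approximation, the following Python sketch evaluates the Tm equation and applies the rule of about 1°C per 1% of mismatching. It assumes that "log M" denotes the base-10 logarithm, as is conventional for this equation. The function and parameter names are hypothetical and chosen only for this example.

```python
import math


def melting_temp_c(monovalent_molar, percent_gc, percent_formamide, length_bp):
    """Approximate Tm (°C) of a DNA-DNA hybrid from the equation given above."""
    return (81.5
            + 16.6 * math.log10(monovalent_molar)   # M: molarity of monovalent cations
            + 0.41 * percent_gc                      # %GC
            - 0.61 * percent_formamide               # % form
            - 500.0 / length_bp)                     # L: hybrid length in base pairs


def adjusted_tm_c(tm_c, min_percent_identity):
    """Lower Tm by about 1°C for each 1% of mismatching that is to be tolerated."""
    return tm_c - (100.0 - min_percent_identity)
```

For 1 M monovalent cation, 50% GC, 50% formamide and a 500 base pair hybrid, the sketch gives a Tm of about 70.5°C. Seeking sequences with 90% identity lowers this by 10°C, after which the stringency offsets described above (for example, about 5°C below Tm) can be applied.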
An example of highly stringent wash conditions is 0.15 M NaCl at 72°C for about 15 minutes. An example of stringent wash conditions is a 0.2X SSC wash at 65°C for 15 minutes (see, Sambrook, infra, for a description of SSC buffer). Often, a high stringency wash is preceded by a low stringency wash to remove background probe signal. An example medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1X SSC at 45°C for 15 minutes. An example low stringency wash for a duplex of, e.g., more than 100 nucleotides, is 4-6X SSC at 40°C for 15 minutes. For short probes (e.g., about 10 to 50 nucleotides), stringent conditions typically involve salt concentrations of less than about 1.5 M, or about 0.01 to 1.0 M, Na ion concentration (or other salts) at pH 7.0 to 8.3, and the temperature is typically at least about 30°C for short probes and at least about 60°C for long probes (e.g., >50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. In general, a signal to noise ratio of 2X (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization. Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the proteins that they encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code. Very stringent conditions are selected to be equal to the Tm for a particular probe. An example of stringent conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formamide, e.g., hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 0.1X SSC at 60 to 65°C.
Exemplary low stringency conditions include hybridization with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulphate) at 37°C, and a wash in 1X to 2X SSC (20X SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50 to 55°C. Exemplary moderate stringency conditions include hybridization in 40 to 45% formamide, 1.0 M NaCl, 1% SDS at 37°C, and a wash in 0.5X to 1X SSC at 55 to 60°C.
Thus, the invention described herein includes polynucleotides and polypeptides that are substantially identical to any one of SEQ ID NOs 1-130.
The terms "protein," "peptide" and "polypeptide" are used interchangeably herein. As used herein, a "transgenic", "transformed", or "recombinant" cell refers to a genetically modified or genetically altered cell, the genome of which includes a recombinant DNA molecule or sequence ("transgene"). For example, a "transgenic cell" can be a cell transformed with a "vector." A "transgenic", "transformed", or "recombinant" cell thus refers to a host cell such as a bacterial or yeast cell into which a heterologous nucleic acid molecule has been introduced. The nucleic acid molecule can be stably integrated into the genome by methods generally known in the art (e.g., disclosed in Sambrook et al., 2001). For example, "transformed," "transformant," and "transgenic" cells have been through the transformation process and contain a foreign or exogenous gene. The term "untransformed" refers to cells that have not been through the transformation process. The term "transformation" refers to the transfer of a nucleic acid fragment into the genome of a host cell, or the transfer into a host cell of a nucleic acid fragment that is maintained extrachromosomally. A "transgene" refers to a gene that has been introduced into the genome by transformation. Transgenes may include, for example, genes that are heterologous or endogenous to the genes of a particular cell to be transformed. Additionally, transgenes may include native genes inserted into a non-native organism, or chimeric genes. The term "endogenous gene" refers to a native gene in its natural location in the genome of an organism. Such genes can be hyperactivated in some cases by the introduction of an exogenous strong promoter into operable association with the gene of interest. A "foreign" or an "exogenous" gene refers to a gene not normally found in the host cell but that is introduced by gene transfer.
"Vector" is defined to include, inter alia, any plasmid, cosmid, phage or other construct in double or single stranded linear or circular form that may or may not be self transmissible or mobilizable, and that can transform prokaryotic or eukaryotic host either by integration into the cellular genome or exist extrachromosomally, e.g., autonomous replicating plasmid with an origin of replication. A vector can include a construct such as an expression cassette having a DNA sequence capable of directing expression of a particular nucleotide sequence in an appropriate host cell, comprising a promoter operably linked to the nucleotide sequence of interest that also is operably linked to termination signals. An expression cassette also typically includes sequences required for proper translation of the nucleotide sequence. The expression cassette comprising the nucleotide sequence of interest may be chimeric, meaning that at least one of its components is heterologous with respect to at least one of its other components. The expression cassette may also be one that is naturally occurring but has been obtained in a recombinant form useful for heterologous expression. The expression of the nucleotide sequence in the expression cassette may be under the control of a constitutive promoter or of an inducible promoter that initiates transcription only when the host cell is exposed to some particular external stimulus.
The term "wild type" refers to an untransformed cell, i.e., one where the genome has not been altered by the presence of the recombinant DNA molecule or sequence or by other means of mutagenesis. A "corresponding" untransformed cell is a typical control cell, i.e., one that has been subjected to transformation conditions, but has not been exposed to exogenous DNA.
II. Polynucleotides and Polypeptides of the Invention
A. Polynucleotides encoding iron transport and metabolism system proteins
1. Sources of nucleic acid molecules
The present nucleic acid molecules encoding iron transport and metabolism system proteins, or the nucleic acid complements thereof, may be prepared, for example, from total or polyA+ RNA from any prokaryotic, e.g., pathogenic bacterial, cellular source from which cDNAs can be derived by methods known in the art. Sources include those gram-negative bacteria that are frequent pathogens of animals, such as Escherichia coli, Salmonella spp. and Pasteurella spp. Other sources of the DNA molecules of the invention include genomic libraries derived from any prokaryotic cellular source.
2. Isolation of a gene encoding an iron transport or metabolism system protein
Genes encoding proteins involved in iron transport or metabolism can be isolated, for example, using gene chip technology. Such methods are disclosed herein.
3. Preparation of the Polynucleotides of the Invention
Nucleic acid molecules encoding the amino acid sequence of an iron transport and metabolism system protein are prepared by a variety of methods known in the art.
B. Polypeptides of the Invention
The isolated and purified iron transport and metabolism system proteins, or portions thereof, or derivatives thereof, can be synthesized in vitro, e.g., by the solid phase peptide synthetic method or by recombinant DNA approaches (see above). The solid phase peptide synthetic method is an established and widely used method, which is described in the following references: Stewart et al. (1969); Merrifield (1963); Meienhofer (1973); Bavaay and Merrifield (1980); and Clark-Lewis et al. (1997). These peptides can be further purified by fractionation on immunoaffinity or ion-exchange columns; ethanol precipitation; reverse phase HPLC; chromatography on silica or on an anion-exchange resin such as DEAE; chromatofocusing; SDS-PAGE; ammonium sulfate precipitation; gel filtration using, for example, Sephadex G-75; or ligand affinity chromatography.
Once isolated and characterized, derivatives, e.g., chemically derived derivatives, of a given iron transport systems protein can be readily prepared. For example, amides of the iron transport systems protein of the present invention may also be prepared by techniques well known in the art for converting a carboxylic acid group or precursor, to an amide. One method for amide formation at the C-terminal carboxyl group is to cleave the polypeptide from a solid support with an appropriate amine, or to cleave in the presence of an alcohol, yielding an ester, followed by aminolysis with the desired amine. Salts of carboxyl groups of a polypeptide of the invention may be prepared in the usual manner by contacting the polypeptide with one or more equivalents of a desired base such as, for example, a metallic hydroxide base, e.g., sodium hydroxide; a metal carbonate or bicarbonate base such as, for example, sodium carbonate or sodium bicarbonate; or an amine base such as, for example, triethylamine, triethanolamine, and the like.
N-acyl derivatives of an amino group of the iron transport systems protein may be prepared by utilizing an N-acyl protected amino acid for the final condensation, or by acylating a protected or unprotected polypeptide. O-acyl derivatives may be prepared, for example, by acylation of a free hydroxy polypeptide or polypeptide resin. Either acylation may be carried out using standard acylating reagents such as acyl halides, anhydrides, acyl imidazoles, and the like. Formyl-methionine, pyroglutamine and trimethyl-alanine may be substituted at the N-terminal residue of the polypeptide. Other amino-terminal modifications include aminooxypentane modifications (see Simmons et al. (1997)).
The iron transport and metabolism system proteins of the invention include proteins having substitutions of at least one amino acid residue in the polypeptide. Amino acid substitutions falling within the scope of the invention include those that do not differ significantly in their effect on maintaining (a) the structure of the peptide backbone in the area of the substitution, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. Naturally occurring residues are divided into groups based on common side-chain properties:
(1) hydrophobic: norleucine, met, ala, val, leu, ile;
(2) neutral hydrophilic: cys, ser, thr;
(3) acidic: asp, glu;
(4) basic: asn, gln, his, lys, arg;
(5) residues that influence chain orientation: gly, pro; and
(6) aromatic: trp, tyr, phe.
Substitution of like amino acids can also be made on the basis of hydrophilicity. As detailed in U.S. Patent No. 4,554,101, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); proline (-0.5 ± 1); threonine (-0.4); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4). In such changes, amino acids can be substituted whose hydrophilicity values are within ± 2, within ± 1, or within ± 0.5 of each other.
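The hydrophilicity criterion can be illustrated with a short Python sketch that compares the values listed above for two residues. The names `HYDROPHILICITY` and `within_hydrophilicity` are hypothetical, and the sketch uses only the central values, ignoring the ± 1 ranges given for aspartate, glutamate and proline, which is a simplifying assumption.

```python
# Central hydrophilicity values as listed above (three-letter residue codes).
HYDROPHILICITY = {
    "Arg": 3.0, "Lys": 3.0, "Asp": 3.0, "Glu": 3.0, "Ser": 0.3,
    "Asn": 0.2, "Gln": 0.2, "Gly": 0.0, "Pro": -0.5, "Thr": -0.4,
    "Ala": -0.5, "His": -0.5, "Cys": -1.0, "Met": -1.3, "Val": -1.5,
    "Leu": -1.8, "Ile": -1.8, "Tyr": -2.3, "Phe": -2.5, "Trp": -3.4,
}


def within_hydrophilicity(original, replacement, tolerance=2.0):
    """True if the two residues' hydrophilicity values differ by at most tolerance."""
    diff = abs(HYDROPHILICITY[original] - HYDROPHILICITY[replacement])
    # Small epsilon guards against floating-point rounding at the boundary.
    return diff <= tolerance + 1e-9
```

Under these values, a leucine to isoleucine change passes even the ± 0.5 window, whereas a serine to threonine change (difference of 0.7) passes the ± 1 window but not the ± 0.5 window.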
In one embodiment of the invention, the iron transport and metabolism system protein has a conservative amino acid substitution, for example, aspartic/glutamic as acidic amino acids; lysine/arginine/histidine as basic amino acids; leucine/isoleucine, methionine/valine, alanine/valine as hydrophobic amino acids; serine/glycine/alanine/threonine as hydrophilic amino acids. Conservative amino acid substitutions also include groupings based on side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine.
Exemplary substitutions include those in Table 1.
Table 1
Original Residue    Exemplary Substitutions
Ala                 Gly; Ser
Arg                 Lys
Asn                 Gln; His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Ala
His                 Asn; Gln
Ile                 Leu; Val
Leu                 Ile; Val
Lys                 Arg
Met                 Met; Leu; Tyr
Ser                 Thr; Ala; Leu
Thr                 Ser; Ala
Trp                 Tyr
Tyr                 Trp; Phe
Val                 Ile; Leu
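Table 1 can be represented as a simple lookup, sketched below in Python, reading the residue codes as the standard three-letter abbreviations (for example Gln and Ile). The names `EXEMPLARY_SUBSTITUTIONS` and `is_exemplary_substitution` are hypothetical and serve only to illustrate how a proposed substitution might be checked against the table.

```python
# Table 1 as a mapping from original residue to its exemplary substitutions.
EXEMPLARY_SUBSTITUTIONS = {
    "Ala": ("Gly", "Ser"),
    "Arg": ("Lys",),
    "Asn": ("Gln", "His"),
    "Asp": ("Glu",),
    "Cys": ("Ser",),
    "Gln": ("Asn",),
    "Glu": ("Asp",),
    "Gly": ("Ala",),
    "His": ("Asn", "Gln"),
    "Ile": ("Leu", "Val"),
    "Leu": ("Ile", "Val"),
    "Lys": ("Arg",),
    "Met": ("Met", "Leu", "Tyr"),
    "Ser": ("Thr", "Ala", "Leu"),
    "Thr": ("Ser", "Ala"),
    "Trp": ("Tyr",),
    "Tyr": ("Trp", "Phe"),
    "Val": ("Ile", "Leu"),
}


def is_exemplary_substitution(original, replacement):
    """True if Table 1 lists `replacement` for `original`; False otherwise."""
    # Residues absent from Table 1 (e.g. Phe, Pro) have no listed substitutions.
    return replacement in EXEMPLARY_SUBSTITUTIONS.get(original, ())
```

A lookup of this kind only identifies candidate substitutions; as stated in the text, the resulting protein is still screened for activity.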
After the substitutions are introduced, the resulting iron transport systems protein is screened for activity. Acid addition salts of the polypeptide or of amino residues of the polypeptide may be prepared by contacting the polypeptide or amine with one or more equivalents of the desired inorganic or organic acid, such as, for example, hydrochloric acid. Esters of carboxyl groups of the polypeptides may also be prepared by any of the usual methods known in the art.
Accordingly, the present invention contemplates an isolated iron transport systems protein. In one embodiment, the iron transport systems protein of the invention is a recombinant polypeptide.
Amino acid residues can be added to or deleted from a full-length iron transport systems protein through the use of standard molecular biological techniques without altering the functionality of the receptor. For example, portions of the iron transport systems protein can be removed to create truncated iron transport systems proteins. The truncated protein retains the properties of the full-length iron transport systems protein.
C. Expression Cassettes, Vectors and Cells of the Invention
1. Expression Cassettes
To prepare expression cassettes for transformation, the recombinant DNA sequence or segment may be circular or linear, double-stranded or single-stranded. A recombinant DNA sequence which encodes an RNA sequence that is substantially complementary to an mRNA sequence encoding an iron transport or metabolism system protein is typically a "sense" DNA sequence cloned into a cassette in the opposite orientation (i.e., 3' to 5' rather than 5' to 3'). Generally, the recombinant DNA sequence or segment is in the form of chimeric DNA, such as plasmid DNA, that can also contain coding regions flanked by control sequences which promote the expression of the recombinant DNA present in the resultant cell.
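The effect of cloning a "sense" sequence in the opposite orientation can be illustrated by computing its reverse complement, which is the strand that then reads 5' to 3' in the cassette. The following Python sketch is illustrative only; the function name `reverse_complement` and the short example sequence are assumptions and do not correspond to any sequence of the invention.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]
```

For example, the sense fragment ATGGCC has the reverse complement GGCCAT; an RNA transcribed from the inverted insert is therefore complementary to the mRNA transcribed from the sense orientation.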
Aside from recombinant DNA sequences that serve as transcription units for an iron transport systems protein, or portions thereof, a portion of the recombinant DNA may be untranscribed, serving a regulatory or a structural function. For example, the recombinant DNA may itself comprise a promoter that is active in poultry cells, or may utilize a promoter already present in the genome that is the transformation target. Such promoters are well known to the art. Other elements functional in the cells, such as introns, enhancers, polyadenylation sequences and the like, may also be a part of the recombinant DNA. Such elements may or may not be necessary for the function of the DNA, but may provide improved expression of the DNA by affecting transcription, stability of the mRNA, or the like. Such elements may be included in the DNA as desired to obtain the optimal performance of the transforming DNA in the cell.
A coding sequence of an expression cassette may also be operatively linked to a transcription terminating region. RNA polymerase transcribes an encoding DNA sequence through a site where polyadenylation occurs. Typically, DNA sequences located a few hundred base pairs downstream of the polyadenylation site serve to terminate transcription. Those DNA sequences are referred to herein as transcription-termination regions. Those regions are required for efficient polyadenylation of transcribed RNA. Transcription-terminating regions are well known in the art. The recombinant DNA to be introduced into the cells may contain either a selectable marker gene or a reporter gene or both to facilitate identification and selection of transformed cells from the population of cells sought to be transformed. Alternatively, the selectable marker may be carried on a separate piece of DNA and used in a co-transformation procedure. Both selectable markers and reporter genes may be flanked with appropriate regulatory sequences to enable expression in the host cells. Useful selectable markers are well known in the art.
Reporter genes are used for identifying potentially transformed cells and for evaluating the functionality of regulatory sequences. Reporter genes which encode easily assayable proteins are well known in the art. In general, a reporter gene is a gene which is not present in or expressed by the recipient organism or tissue and which encodes a protein whose expression is manifested by some easily detectable property, e.g., enzymatic activity. Examples of reporter genes include the luciferase gene from the firefly Photinus pyralis. Expression of the reporter gene is assayed at a suitable time after the DNA has been introduced into the recipient cells. The general methods for constructing recombinant DNA which can transform target cells are well known to those skilled in the art, and the same compositions and methods of construction may be utilized to produce the DNA useful herein. For example, Sambrook et al. (2001) provides suitable methods of construction.
The present invention thus provides an expression cassette or vector comprising a polynucleotide of the invention, i.e., one that encodes an iron transport or metabolism system protein, or a portion thereof with substantially the same activity as the full-length iron transport systems protein. In one embodiment of the invention, expression cassettes and vectors comprise a promoter, or optionally an enhancer-promoter, operably linked to the polynucleotide. An enhancer-promoter used in an expression cassette of the present invention can be any enhancer-promoter that drives expression in a cell to be transfected. By employing an enhancer-promoter with well-known properties, the level and pattern of gene product expression can be optimized. In one embodiment, an expression cassette of the invention comprises a polynucleotide operatively linked to a tissue- or cell-specific promoter. Exemplary vectors for the expression cassette include viral vectors, e.g., adenovirus or lentivirus vectors.
An expression cassette of the present invention is useful both as a means for preparing quantities of the iron transport systems protein encoding DNA itself, and as a means for preparing the encoded polypeptides. It is contemplated that where iron transport systems proteins of the invention are made by recombinant means, one can employ either prokaryotic or eukaryotic expression vectors as shuttle systems.
2. Introduction into Cells
The recombinant DNA can be readily introduced into the cells, e.g., mammalian, bacterial, yeast or insect cells, by transfection with an expression cassette or vector comprising DNA encoding an iron transport systems protein or its complement, by any procedure useful for the introduction into a particular cell, e.g., physical or biological methods, to yield a transformed cell having the recombinant DNA optionally stably integrated into its genome, so that the DNA molecules, sequences, or segments, of the present invention are expressed by the cell. Physical methods to introduce a recombinant DNA into a cell include calcium phosphate, DEAE-dextran, lipofection, particle bombardment, protoplast fusion, microinjection, electroporation, and the like. A widely used method is transfection mediated by either calcium phosphate or DEAE-dextran. Depending on the cell type, up to 90% of a population of cultured cells can be transfected at any one time. Because of its high efficiency, transfection mediated by calcium phosphate or DEAE-dextran may be the method of choice for experiments that require transient expression of the foreign DNA in large numbers of cells. Calcium phosphate-mediated transfection is also used to establish cell lines that integrate copies of the foreign DNA, which are usually arranged in head-to-tail tandem arrays into the host cell genome.
The application of brief, high-voltage electric pulses to a variety of cells leads to the formation of nanometer-sized pores in the plasma membrane. DNA is taken directly into the cell cytoplasm either through these pores or as a consequence of the redistribution of membrane components that accompanies closure of the pores. Electroporation can be extremely efficient and can be used both for transient expression of cloned genes and for establishment of cell lines that carry integrated copies of the gene of interest. Electroporation, in contrast to calcium phosphate-mediated transfection and protoplast fusion, frequently gives rise to cell lines that carry one, or at most a few, integrated copies of the foreign DNA.
Liposome transfection involves encapsulation of DNA and RNA within liposomes, followed by fusion of the liposomes with the cell membrane. The mechanism of how DNA is delivered into the cell is unclear but transfection efficiencies can be as high as 90%. Direct microinjection of a DNA molecule into nuclei has the advantage of not exposing DNA to cellular compartments such as low-pH endosomes. Microinjection is therefore used primarily as a method to establish lines of cells that carry integrated copies of the DNA of interest.
Biological methods to introduce the DNA of interest into a cell include the use of DNA and RNA viral vectors. The main advantage of physical methods is that they are not associated with pathological or oncogenic processes of viruses. However, they are less precise, often resulting in multiple copy insertions, random integration, disruption of foreign and endogenous gene sequences, and unpredictable expression.
In one embodiment, the recombinant cells of the present invention are prokaryotic cells. In one case, the recombinant cells of the invention are bacterial cells of the DH5α strain of Escherichia coli, as well as E. coli W3110 (F, λ, prototrophic, ATCC No. 273325), bacilli such as Bacillus subtilis, or other enterobacteriaceae such as Salmonella typhimurium or Serratia marcescens, and various Pseudomonas species. In general, prokaryotes are used for the initial cloning of DNA sequences and constructing the vectors useful in the invention. For example, E. coli K12 strains can be particularly useful. Other microbial strains which can be used include E. coli B and E. coli X1776 (ATCC No. 31537). These examples are, of course, intended to be illustrative rather than limiting.
In general, plasmid vectors containing replicon and control sequences which are derived from species compatible with the cell are used in connection with these cells. The vector ordinarily carries a replication site, as well as marking sequences which are capable of providing phenotypic selection in transformed cells. For example, E. coli can be transformed using pBR322, a plasmid derived from an E. coli species. pBR322 contains genes for ampicillin and tetracycline resistance and thus provides easy means for identifying transformed cells. The pBR plasmid, or other microbial plasmid or phage must also contain, or be modified to contain, promoters which can be used by the microbial organism for expression of its own polypeptides.
Those promoters most commonly used in recombinant DNA construction include the β-lactamase (penicillinase) and lactose promoter systems (Chang et al., 1978; Itakura et al., 1977; Goeddel et al., 1979; Goeddel et al., 1980) and a tryptophan (TRP) promoter system (EPO Appl. Publ. No. 0036776; Siebwenlist et al., 1980). While these are the most commonly used, other microbial promoters have been discovered and utilized, and details concerning their nucleotide sequences have been published, enabling a skilled worker to introduce functional promoters into plasmid vectors (Siebwenlist et al., 1980).
In addition to prokaryotes, eukaryotic microbes, such as yeast, can also be used. Saccharomyces cerevisiae, or common baker's yeast, is the most commonly used among eukaryotic microorganisms, although Schizosaccharomyces and Pichia are commonly available. For expression in Saccharomyces, the plasmid YRp7, for example, is commonly used (Stinchcomb et al., 1979; Kingsman et al., 1979; Tschemper et al., 1980). This plasmid already contains the trp1 gene, which provides a selection marker for a mutant strain of yeast lacking the ability to grow in the absence of tryptophan, for example ATCC No. 44076. The presence of the trp1 lesion as a characteristic of the yeast cell genome then provides an effective environment for detecting transformation by growth in the absence of tryptophan.
Suitable promoter sequences in yeast vectors include the promoters for 3-phosphoglycerate kinase (Hitzeman et al., 1980) or other glycolytic enzymes (Hess et al., 1968; Holland et al., 1978) such as enolase, glyceraldehyde-3-phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase, glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase, triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. In constructing suitable expression plasmids, the termination sequences associated with these genes are also introduced into the expression vector downstream from the sequences to be expressed to provide polyadenylation of the mRNA and termination. Other promoters, which have the additional advantage of transcription controlled by growth conditions, are the promoter regions for alcohol dehydrogenase 2, isocytochrome C, acid phosphatase, degradative enzymes associated with nitrogen metabolism, the aforementioned glyceraldehyde-3-phosphate dehydrogenase, and enzymes responsible for maltose and galactose utilization. Any plasmid vector containing a yeast-compatible promoter, origin of replication and termination sequences is suitable.
To confirm the presence of the recombinant DNA sequence in the cell, a variety of assays may be performed. Such assays include, for example, "molecular biological" assays well known to those of skill in the art, such as Southern and Northern blotting, RT-PCR and PCR; "biochemical" assays, such as detecting the presence or absence of a particular iron transport systems protein, e.g., by immunological means (ELISAs and Western blots) or by additional assays known to the art.
To detect and quantitate RNA produced from introduced recombinant DNA segments, RT-PCR may be employed. In this application of PCR, it is first necessary to reverse transcribe RNA into DNA, using enzymes such as reverse transcriptase, and then through the use of conventional PCR techniques amplify the DNA. In most instances PCR techniques, while useful, will not demonstrate integrity of the RNA product. Further information about the nature of the RNA product may be obtained by Northern blotting. This technique demonstrates the presence of an RNA species and gives information about the integrity of that RNA. The presence or absence of an RNA species can also be determined using dot or slot blot Northern hybridizations. These techniques are modifications of Northern blotting and only demonstrate the presence or absence of an RNA species.
While Southern blotting and PCR may be used to detect the recombinant DNA segment in question, they do not provide information as to whether the preselected DNA segment is being expressed. Expression may be evaluated by specifically identifying the peptide products of the introduced recombinant DNA sequences or evaluating the phenotypic changes brought about by the expression of the introduced recombinant DNA segment in the cell. A recombinant iron transport systems protein may be recovered or collected either from the transfected or infected cells or the medium in which those cells are cultured. Recovery comprises isolating and purifying the recombinant polypeptide. Isolation and purification techniques for polypeptides are well-known in the art and include such procedures as precipitation, filtration, chromatography, electrophoresis and the like.
III. Methods of the invention
The present invention also provides a method for the identification of a bacterial outer-membrane protein with a role in iron acquisition and/or metabolism. Genes encoding the outer membrane receptor proteins are transcriptionally altered during iron limitation, as opposed to genes encoding proteins that are periplasmic, in the inner membrane, or cytoplasmic. Thus, outer-membrane components can serve as suitable targets for the development of vaccines, antimicrobial peptides, or antimicrobial agents. Hence, the strategy of the present method reduces the time needed to identify genes that encode proteins that are both involved in iron acquisition and present on the outer membrane of the bacterium.
As shown by the results herein, the greatest change in expression amongst the genes represented in known iron transport pathways was found in the genes encoding receptor proteins. For instance, fepA was down-regulated 16-fold after adding iron, cir was down-regulated 16-fold after adding iron, fhuA was down-regulated 7-fold after adding iron, and fecA was down-regulated 2-fold after adding iron. These changes were observed relatively soon after the addition of iron; for example, fepA reached a 16-fold repression between 5 and 10 minutes, while cir reached a 16-fold repression between 10 and 20 minutes following the addition of iron. In contrast to the substantial changes in expression level of genes encoding the outer-membrane receptor proteins, a lower magnitude of change was observed in genes that encoded the periplasmic binding protein members of iron transport pathways. For instance, fepB was down-regulated 4-fold, while fhuD and fepB did not show significant change. The least magnitude of change was noted in genes encoding inner membrane proteins, which typically showed little or no change in expression level.
To determine whether this was a generalized phenomenon for other ABC transporters in E. coli, the alteration in expression profiles of 15 non-iron related transport systems was examined. In only a few cases, such as the dipeptide transporter (dpp), the gene encoding the periplasmic binding protein had a more dramatic transcriptional response than the inner-membrane partners. However, this was not the case for the vast majority of other ABC transporters, where 12 of 15 (80%) did not show a similar trend.
In one embodiment, the present method provides a three-step strategy for the identification of iron regulated outer membrane proteins. The first step involves the comparative transcriptional profiling of bacterial cells grown in the presence or absence of iron by microarray or alternate methods known to the art for the identification of iron regulated genes. In the second step, proteins encoded by all of the genes in the microbe of interest are examined for the presence of signal sequences, membrane anchor domains, or surface probability using computational and bioinformatics tools known to the art. The third step involves the identification of genes that are included in both of the subsets from steps 1 and 2 that will likely represent genes encoding outer-membrane receptors for iron or iron containing proteins.
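The three-step strategy above can be sketched as a simple set intersection. This is only an illustration under stated assumptions: the function name, the 2-fold cutoff used for step 1, and the toy inputs are hypothetical. The fold changes for fepA, cir, fhuA and fepB follow the values reported herein, while the fepC and dppA values and the list of predicted surface proteins are invented for the example.

```python
def candidate_outer_membrane_iron_genes(fold_change, surface_predicted, min_fold=2.0):
    """Return genes that are both iron regulated and predicted to be surface exposed.

    fold_change: gene -> expression ratio (after iron / before iron).
    surface_predicted: genes whose products are predicted (by any tool)
        to carry a signal sequence, membrane anchor, or surface exposure.
    """
    # Step 1: iron-regulated genes (at least min_fold up or down).
    iron_regulated = {
        gene
        for gene, ratio in fold_change.items()
        if ratio >= min_fold or ratio <= 1.0 / min_fold
    }
    # Step 2 is represented by the precomputed surface_predicted collection.
    # Step 3: genes present in both subsets.
    return sorted(iron_regulated & set(surface_predicted))


# Hypothetical toy inputs (ratios below 1 mean down-regulation after adding iron).
fold_change = {
    "fepA": 1 / 16,
    "cir": 1 / 16,
    "fhuA": 1 / 7,
    "fepB": 1 / 4,
    "fepC": 1.0,   # invented value: no change
    "dppA": 3.0,   # invented value: regulated but not surface predicted
}
surface_predicted = {"fepA", "cir", "fhuA", "ompX"}

candidates = candidate_outer_membrane_iron_genes(fold_change, surface_predicted)
```

In this sketch a periplasmic protein such as fepB passes step 1 but is dropped at step 3, which is the filtering effect the strategy relies on.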
The invention will now be described by the following non-limiting example.
Example: Iron transport and Metabolism Proteins
Slower growth rate of E. coli fur⁻ mutant in iron depleted media
Iron is an essential nutrient for the survival and proliferation of E. coli. Figure 1 presents data demonstrating that the bacterium grows at a slower rate in iron depleted as compared with iron rich media. It was interesting, however, to note that the fur⁻ mutant also grows slower than the wild-type in conditions of iron limitation. While this would not be surprising in iron rich media, since the fur⁻ mutant is likely to expend metabolic resources in needlessly making products such as the enterobactin siderophore and iron transport proteins which are not produced by the wild type, the slower growth rate of the E. coli fur⁻ mutant under iron-starvation conditions suggests that Fur, either directly or through some mediator, positively regulates processes required for cellular growth and adaptation during iron limitation, a hypothesis that remains to be rigorously tested.
Transcriptional changes on addition of iron to an iron limited culture of wild-type E. coli
cDNA microarrays including 3,866 genes, which make up ~90% of the genes of E. coli MG1655, were used to identify genes whose expression changed in response to iron limitation. 675 genes showed a greater than 2-fold change in expression for at least one of the time points sampled in the wild type cells. Figure 2 shows a distribution of the differentially expressed genes among the various classes with a break-up of the magnitudes of change within each class. The maximum number of genes changing was in the "Hypotheticals" class, followed by the transport and binding proteins class. Among the 79 transport-related genes that were differentially expressed, 50 were upregulated, while 29 were downregulated. The maximum magnitudes of change were in the genes encoding amino acid biosynthesis and metabolism and in transport and binding related proteins, whereas the genes encoding the energy metabolism class of proteins did not show as much change. It was interesting to note a striking change in expression of genes encoding transport related proteins as the growth rate changes. The magnitude of change in these genes exceeded that of genes encoding proteins in central carbon metabolism, suggesting that the expression levels of genes in the transport and binding protein class respond dramatically to environmental cues as well as growth rate.
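The selection criterion used above (a greater than 2-fold change for at least one sampled time point) can be illustrated with a short sketch. The function name, the use of log2 ratios relative to time 0, the rule of calling a gene "up" or "down" from its largest-magnitude change, and the toy profiles are all assumptions made for illustration; they are not taken from the analysis described herein.

```python
import math


def changed_genes(log2_ratios, fold=2.0):
    """Return gene -> 'up' or 'down' for genes exceeding the fold cutoff.

    log2_ratios: gene -> list of log2(expression at t / expression at time 0),
        one value per sampled time point.
    A gene is selected if any time point exceeds the cutoff in magnitude.
    """
    cutoff = math.log2(fold)
    calls = {}
    for gene, profile in log2_ratios.items():
        # Largest-magnitude change across the time course (assumed rule).
        peak = max(profile, key=abs)
        if abs(peak) > cutoff:
            calls[gene] = "up" if peak > 0 else "down"
    return calls


# Hypothetical profiles at 5, 10, 20, 30, 60 and 90 minutes after iron addition.
profiles = {
    "geneA": [-1.0, -4.0, -4.0, -3.5, -3.0, -3.0],  # strongly repressed
    "geneB": [0.2, 0.5, 1.5, 1.2, 0.8, 0.3],        # transiently induced
    "geneC": [0.1, -0.3, 0.4, 0.2, -0.1, 0.0],      # essentially unchanged
}

calls = changed_genes(profiles)
```

Working in log2 space makes up- and down-regulation symmetric, so a single cutoff of log2(2) = 1 covers both directions.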
The effect of iron limitation on the expression of genes encoding proteins with an iron-sulfur binding domain was also investigated, since the function of these proteins will be directly affected by iron availability. There are 40 such genes listed in the InterPro database for E. coli K12. Of these, signals for 38 were detected during the course of our investigations. The results show that only one gene in this class (nuoI) was upregulated, and one (fdnH) downregulated, in the wild-type for at least one of the six time-points, while in the mutant, 2 genes (napH, fdoH) were upregulated and none downregulated. In addition, it is noteworthy that none of these genes had a dramatic (greater than 4-fold) change in expression level. Together, these results are consistent with the relative importance of these proteins in providing critical cellular functions, e.g., in electron transport, and suggest that these genes are therefore very tightly regulated.
Expression profiles of genes directly involved in iron transport in wild-type E. coli:
The primary genes involved in iron transport are entABCDEFS, fepABCDEFG, fecABCDEIR, fhuABCDEF, cirA, tonB and exbBD (Earhart (in) Neidhardt, 1996) and are known to be negatively regulated by Fur. These genes showed either a decrease in expression after addition of iron or no appreciable change. In addition, the results show that the genes which were downregulated at the first time point sampled at 5 minutes remained so at all six time points sampled until 90 minutes after iron addition.
The ferric citrate transport system consisting of the fec genes is induced by citrate, which was absent in the media used in the current investigation. Previous studies have shown that the fecBCDE genes are positively regulated by fecIR (Braun, 1997) while fecAIR are negatively regulated by Fur. In addition, fecA is also induced by citrate, while fecBCDE are induced by conformational changes in fecA, fecI and fecR in the presence of ferric citrate (Enz et al., 1995). The results presented herein show that while fecAIR genes are downregulated in the wild-type, fecBCDE show no change upon the addition of iron to the medium.
These results indicate that not all the genes previously identified to be involved in iron transport are altered in expression in response to iron deprivation. However, an interesting trend in gene expression patterns emerged. Among the genes represented in each of the known iron transport pathways, the gene that showed the maximum change in expression was almost always the one encoding the receptor protein. For instance, after adding iron, fepA was down-regulated 16-fold, cir 16-fold, fhuA 7-fold, and fecA 2-fold. In addition, these changes were observed relatively soon after the addition of iron: fepA reached a 16-fold repression between 5 and 10 minutes, while cir reached a 16-fold repression between 10 and 20 minutes. These data indicate that the expression of the outer-membrane receptors is rather dramatically modulated over short time frames in response to the extracellular iron levels. In contrast to the substantial changes in expression level of genes encoding the outer-membrane receptor proteins, a lower magnitude of change was observed in genes that encoded the periplasmic binding protein members of iron transport pathways. For instance, fepB was downregulated 4-fold, while fhuD and fepB did not show significant change. The least magnitude of change was noted in genes encoding inner membrane proteins, which typically showed little or no change in expression level. These results indicate that the process of iron transport across the outer membrane is the rate-limiting step (Sprencel et al., 2000), and provide an explanation for why the cell needs to respond to iron deprivation with a large increase in the relevant outer membrane receptors.
To determine whether this was a generalized phenomenon for other ABC transporters in E. coli, the alteration in expression profiles of 15 non-iron related transport systems was evaluated. In only a few cases, such as the dipeptide transporter (dpp), the gene encoding the periplasmic binding protein had a more dramatic transcriptional response than its inner-membrane partners. However, this was not the case for the vast majority of other ABC transporters, where 12 of 15 (80%) did not show a similar trend. Thus, these data indicate that outer membrane receptors and periplasmic binding proteins that show a dramatic regulation are likely to be involved in iron transport.
Alterations in transcriptional profiles from the point of view of the known operon structure of these genes were examined. The profiles for some of the operons, fepA-entD, fes-entF-fepE, fecIR, and entCEBA-b0597, in the wild-type and the fur⁻ mutant are shown (Fig. 3). With the exception of fepE, the alterations in expression patterns of all genes within an operon were generally found to be similar in direction of change. In all cases, either the first gene in the operon was downregulated more than the others (e.g. fepA, fes, entF, fecA, fhuA) or there was no detectable difference between all the genes of the operon.
Expression profiling in the fur⁻ mutant:
1185 genes showed a greater than 2-fold change in expression for at least one of the time points sampled in the fur⁻ mutant. The class with the maximum number of genes changing was again Transport and binding proteins. There are 249 genes which are upregulated in both wild-type and fur⁻ on iron addition. Out of these, 42 belong to the functional class of Translational and post translational modification, 29 are Energy metabolism related and 28 are Hypotheticals. There are 94 genes downregulated in both wild-type and fur⁻, out of which 20 are Hypotheticals, 12 are Transport and Binding related and 11 are Central intermediary metabolism related.
In the fur⁻ mutant, none of the genes involved in iron uptake was downregulated. It is noteworthy that the relative expression levels of some iron-acquisition and transport related genes increased in the fur⁻ mutant. For example, an increase in expression levels was noted for fecAE, exbD, cirA, fhuE, entBECD, fepA, and fhuA (Table 2). Nearly all the genes which were thus upregulated are either outer membrane receptors or siderophore synthesis genes. (Note: entCE and the outer membrane receptors are among the most highly downregulated genes in the wild-type.) The expression of these genes is thus subject to regulation in addition to that by the Fur protein.
Table 2: Genes upregulated after iron addition in the fur⁻ mutant with the maximum change seen between all time points
Figure imgf000036_0001
Comparing profiles in the wild-type and fur⁻ mutant to identify new genes involved in iron transport:
The profiles of genes that are regulated directly or indirectly by Fur were examined in the wild-type and the fur⁻ mutant. The difference in the expression profiles between the wild-type and fur⁻ mutant was judged by the Euclidean distance between the profiles. The mean of the Euclidean distances between the profiles of all genes was 2 with a standard deviation of 1. All genes where the Euclidean distance was more than 2 standard deviations above the mean were selected. Hence, genes with dissimilar profiles in the wild type and the mutant experiments were identified as those in which the Euclidean distance between the two expression profiles was greater than 4. Table 3 lists the 64 genes which passed the above mentioned criterion. The functional classes having the maximum number of genes with different profiles in the wild type and the mutant are Transport and binding, followed by Hypotheticals and Translational, post translational modification. 11 out of the 14 transport related genes identified had a known function in iron transport. Among the Amino acid biosynthesis and metabolism genes, a majority are related to arginine synthesis and show a more than 16-fold upregulation at the 60 minute time-point in the wild-type. Very few genes in Central intermediary metabolism (4) had dissimilar profiles in the wild-type and mutant experiments. This indicates that Fur does not regulate a large number of metabolic genes.
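The distance criterion described above can be sketched as follows. The mean (2), the standard deviation (1) and the resulting cutoff of 4 are the values given above; the function names, the assumption that profiles are equal-length lists of log2 expression ratios, and the toy profiles are hypothetical and are shown only to make the calculation concrete.

```python
import math


def euclidean_distance(profile_a, profile_b):
    """Euclidean distance between two expression profiles of equal length."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same time points")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


def dissimilar_genes(wild_type, mutant, mean=2.0, sd=1.0, n_sd=2.0):
    """Return gene -> distance for genes whose distance exceeds mean + n_sd * sd."""
    cutoff = mean + n_sd * sd  # 2 + 2 * 1 = 4 with the values reported above
    selected = {}
    for gene in wild_type.keys() & mutant.keys():
        distance = euclidean_distance(wild_type[gene], mutant[gene])
        if distance > cutoff:
            selected[gene] = distance
    return selected


# Hypothetical six-point profiles (5 to 90 minutes after iron addition).
wild_type = {
    "g1": [-3.0, -3.0, -3.0, -3.0, -3.0, -3.0],  # repressed only in the wild-type
    "g2": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],        # same response in both strains
    "g3": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],        # modest difference
}
mutant = {
    "g1": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "g2": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "g3": [1.0, 1.0, 1.0, 1.0, 1.0, 1.5],
}

selected = dissimilar_genes(wild_type, mutant)
```

Only g1 passes in this toy data set; g3 differs between strains but its distance stays below the cutoff, as would any gene whose distance falls below 4.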
Table 3. Genes with different profiles in the wild type and fur mutant
Figure imgf000037_0001
Figure imgf000038_0001
Figure imgf000039_0001
Figure imgf000040_0001
It has been hypothesized that Fur regulates iron demand under conditions of iron limitation by down-regulating some iron-containing proteins (Masse and Gottesman, 2002). However, using the criterion of difference in expression profiles in the wild-type and the mutant, little change in expression of the genes for iron-sulfur domain containing proteins was detected. Also, among sdhBACD, acnA, fumA, sodB and bfr, which were reported by Masse et al. to be indirectly regulated by Fur through ryhB, only sodB passed the present criterion of differential expression. sdhA is upregulated in the wild-type at 60 minutes, while it remains unchanged in the mutant, but the difference is not sufficient to pass the present criterion. However, from another experiment comparing expression in a fur⁻ mutant vs the wild-type (data not shown), sdhABC, sodB and ftn were seen to be downregulated in the fur⁻ mutant. Thus, it is possible that Fur downregulates expression of a few iron containing proteins under iron limitation, but not of a majority of these proteins. The fur⁻ mutant might have an active efflux protein to pump out the excess iron that is transported inside the cell, enabling it to maintain intracellular iron concentrations below lethal levels. zntA, a P-type ATPase shown to be involved in the efflux of Pb(II), Cd(II), and Zn(II) (Binet and Poole, 2000; Rensing et al., 1997), is up-regulated on iron addition in the fur⁻ mutant, while it is unchanged in expression in the wild-type (Euclidean distance between the two profiles = 3.5), suggesting it might be involved in iron efflux too. A summary of the expression patterns and the sequence analysis for four of the 13 hypotheticals having the highest Euclidean distance between the expression profiles in the wild-type and the fur⁻ mutant is presented below:
b1973
b1973 shows very slight downregulation in the wild-type, whereas in the fur⁻ mutant it shows a dramatic downregulation at 5 min, continues to be further downregulated until 10 min, and then the expression increases (Fig. 4). Its nucleotide sequence is similar to a metal ABC transporter substrate-binding protein. Its nucleotide sequence also shows the presence of a putative fur-box with 13 out of the 19 nucleotides matching the fur-box consensus sequence (Fig. 5b).
The expression profile of b1973 suggests that it may be involved in the low affinity transport of iron. When iron is added to the media in which the iron limited cells are growing, the flux of iron through the low affinity system might signal the presence of adequate amounts of iron in the external media. Hence, during change from low iron to iron replete conditions, the initial downregulation of low affinity transporters like b1973 may be required to maintain homeostasis. In the fur⁻ mutant, since the high affinity system is no longer controlled by Fur, b1973 may be dramatically downregulated until the high affinity system is regulated through an alternate mechanism. Since the E. coli fur⁻ mutant is not lethal, there may be some other regulatory system controlling the expression of the high affinity iron transport system (or an iron export system) in the absence of which the cells would die of iron overload under abundant iron conditions. In both the wild type and the mutant, b1973 expression starts increasing after 20 to 30 minutes.
In order to identify the role of b1973 in iron transport, the gene was knocked out in the wild-type K12 MG1655 strain. The deletion of b1973 retards the growth of the bacteria in defined rich media without iron (Fig. 5a). Also, in M9 minimal media, the b1973⁻ mutant grows slower than the wild-type in media with and without iron (data not shown). These data indicate that b1973 plays a non-redundant role in iron transport.
b0597
Another gene of particular interest is ybdB (b0597). It is located next to entA on the E. coli genome and shows the same order of change as entA (Fig. 4). It is downregulated in the wild type and upregulated in the fur⁻ mutant. Its nucleotide sequence is similar to a putative protein possibly involved in aromatic compounds catabolism. The position of b0597 on the genome, along with its nucleotide sequence and expression profiles, indicates that it might be involved in the adaptation of metabolism to the biosynthesis of enterobactin.
b0805
A motif search on the protein sequence of b0805 carried out with the PANAL protein motif classification tool (http://web.ahc.umn.edu/panal/rim panal.html) showed the presence of a TonB-box, indicating that it might be a TonB dependent outer membrane receptor.
b1452
b1452 is upregulated in the fur⁻ mutant and downregulated in the wild-type (Fig. 4). Its protein sequence shows the presence of an ATP/GTP-binding site. It is likely that b1452 is involved in iron transport.
Validation of Results by Real time PCR
Three levels of quality control for the microarray results were present. Each gene was spotted in triplicate on the array, so data from each hybridization could be checked for consistency. The whole experiment was repeated, and microarray analysis was carried out on RNA from the replicate cultures. Also, in order to validate the results of the microarray analyses, the entire experiment was repeated and RNA from the new cell samples was used for the real time PCR assay. Real Time PCR was performed on samples at the 0, 20, 60 and 90 minute time points for 12 ORFs (Fig. 6a). Figure 6 shows the comparison of the profile obtained from the array and the profile obtained from the Real Time PCR experiments for all 12 genes in both the wild type and the fur⁻ mutant. Qualitatively, the expression profiles from the cDNA microarray and the Real Time PCR assays agree. The difference between the ratios obtained by the two methods is likely to result from either a difference in sensitivity of the two assays or the inherent biological variability in the two different cultures, or both. A scatter plot of some of the ratios obtained from the cDNA microarray and from the Real Time PCR is shown in Figure 6b (correlation coefficient = 0.76).
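A correlation coefficient of the kind quoted for Figure 6b can be computed as sketched below. The text does not state which coefficient was used; the Pearson coefficient is assumed here, and the function name and the example ratios are hypothetical rather than data from the experiments described.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log2 ratios for the same genes and time points from the two assays.
array_log2 = [-4.0, -3.1, -0.2, 1.4, 0.9, -1.8]
rt_pcr_log2 = [-5.2, -2.5, 0.3, 2.1, 0.4, -2.9]

r = pearson_r(array_log2, rt_pcr_log2)
```

Comparing log ratios rather than raw ratios keeps large fold changes from dominating the coefficient, which is one reasonable choice when the two assays differ in dynamic range.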
MATERIALS AND METHODS
Construction of the fur⁻ mutant
Construction of the mutant was performed using the protocol of Datsenko and Wanner (Datsenko and Wanner, 2000). E. coli K12 was transformed with the pKD46 helper plasmid by electroporation at 1.6 kV, 50 μF and 200 ohms. 56 bp long primers were designed to amplify the chloramphenicol resistance gene from the pKD3 plasmid with the required homology regions to the fur gene [left primer: aac gct tcc tcg ttt aaa aat cct gga agt tct tca gtg tag gct gga gct gct tc (36 bp homology region) (SEQ ID NO:131); right primer: agt gac acg taa aga tag aga ctg tgg tta gtc agg cat atg aat atc ctc ctt ag (36 bp homology region) (SEQ ID NO:132)]. The PCR product was gel-purified using Gel Extraction kits (Qiagen) according to the manufacturer's protocol, and then DpnI digested to remove the genomic DNA. 10 μl of the PCR product, 2 μl DpnI, 5 μl of the 10X buffer (supplied with the enzyme) and 33 μl of dH2O were mixed together and incubated at 37°C for an hour. DpnI was then deactivated by heating the solution to 80°C for 20 minutes on a heat block.
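The primer design described above (a 36 nt homology region followed by a template-priming region, 56 nt in total) can be checked with a small sketch. The sequences below are transcribed from the text on the assumption that triplets such as "get", "tec", "teg", "tea", "ate" and "etc" are character-recognition substitutions for gct, tcc, tcg, tca, atc and ctc; that reading, the function name and the interpretation of the last 20 nt as the pKD3-priming portion are assumptions and should be confirmed against SEQ ID NO:131 and SEQ ID NO:132.

```python
# Primer sequences as read from the text (assumed OCR corrections applied).
LEFT_PRIMER = (
    "aac gct tcc tcg ttt aaa aat cct gga agt tct tca "
    "gtg tag gct gga gct gct tc"
)
RIGHT_PRIMER = (
    "agt gac acg taa aga tag aga ctg tgg tta gtc agg "
    "cat atg aat atc ctc ctt ag"
)


def split_primer(primer, homology_len=36):
    """Split a primer into its 5' homology arm and 3' template-priming region."""
    seq = primer.replace(" ", "").lower()
    invalid = set(seq) - set("acgt")
    if invalid:
        raise ValueError(f"unexpected characters in primer: {sorted(invalid)}")
    if len(seq) <= homology_len:
        raise ValueError("primer is not longer than the homology region")
    return seq[:homology_len], seq[homology_len:]


left_arm, left_priming = split_primer(LEFT_PRIMER)
right_arm, right_priming = split_primer(RIGHT_PRIMER)

# Total length should equal the 56 nt stated in the text.
left_total = len(left_arm) + len(left_priming)
right_total = len(right_arm) + len(right_priming)
```

A check of this kind catches transcription errors early, since any non-ACGT character or a length other than 56 nt indicates that the sequence was not copied correctly.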
E. coli with pKD46 was grown to an OD600 of 0.5-0.6 (6 ml) in SOB containing ampicillin (50 μg/ml) and 10 mM arabinose, washed three times with ice-cold 10% glycerol and resuspended in 50 μl cold 10% glycerol. 100 ng of the gel purified and DpnI digested PCR product was mixed with 25 μl cells, and the other 25 μl was used as a control. The cells were allowed to sit on ice for 10 min and then electroporated at 1.6 kV, 25 μF and 400 ohms. After electroporation, 1 ml of ice-cold SOC medium was immediately added to the cells. The cells were allowed to recover by shaking at 37°C for 1 hour and then allowed to stand overnight at room temperature. They were then plated on LB plates with chloramphenicol (20 μg/ml). The cells were colony-purified once on LB + chloramphenicol plates to obtain pure colonies.
For verification of the construct, primers were designed in the regions just left and right of where the insertion was expected, such that if there were no insertion a band corresponding to the fur gene would be seen (447 bp) and if there were an insertion a 1194 bp band would be seen. Without colony purification, two bands of the two predicted sizes were observed. However, after colony purification, there was only one band at the correct location. Another PCR was carried out with primers within the chloramphenicol resistance gene sequence. If there was an insertion, a 729 bp band was expected, and this band was observed.
Culture conditions
All cultures were grown in chemically defined media containing MOPS 0.04 M, tricine 0.004 M, NH4Cl 0.01 M, K2SO4 0.276 mM, CaCl2.H2O 0.5 μM, MgCl2 0.528 mM, NaCl 0.05 M, Ammonium molybdate 0.003 μM, Boric acid 0.0004 mM, Cobalt chloride 0.03 μM, Cupric sulfate 0.01 μM, Manganese chloride 7.98 mM, Zinc sulfate 0.01 μM, Glucose 0.011 M, Adenine 0.2 mM, Cytosine 0.198 mM, Uracil 0.2 mM, Guanine 0.198 mM, Cysteine 0.102 mM, Alanine 0.815 mM, Arginine (HCl) 0.409 mM, Asparagine 0.463 mM, Aspartic acid (K salt) 0.408 mM, Glutamic acid (K salt) 0.612 mM, Glutamine 0.063 mM, Glycine 0.815 mM, Histidine (HCl.H2O) 0.204 mM, Isoleucine 0.404 mM, Leucine 0.821 mM, Lysine 0.409 mM, Methionine 0.205 mM, Phenylalanine (free) 0.410 mM, Proline (free) 0.408 mM, Serine (free) 10.19 mM, Threonine (free) 0.408 mM, Tryptophan (free) 0.104 mM, Tyrosine (free) 0.203 mM, Valine (free) 0.613 mM, Thiamine HCl 10 μM, Ca pantothenate 10 μM, p-Aminobenzoic acid 10 μM, p-Hydroxybenzoic acid 10 μM, 2,3-Dihydroxybenzoic acid 10 μM. The media was stirred with Chelex resin (BioRad) for 1 hour to remove iron. All cations in the media were added after this step. For cultures supplemented with iron, FeSO4.7H2O was added to a final concentration of 10 μM.
Ten milliliters of seed culture (either wild type or fur⁻ mutant) was grown overnight in defined media with iron in a 50 ml tube at 37°C with shaking at 250 rpm. A 1% (v/v) inoculum in stationary phase was added to a 150 ml culture, and the culture was allowed to grow in a 500 ml shaker flask at 250 rpm until it was just entering stationary phase. At this point a sample was taken for RNA extraction (time 0). FeSO4 solution was then added to the culture to a final concentration of 10 μM. Samples were taken at 5, 10, 20, 30, 60 and 90 minutes after addition of iron for RNA extraction.
RNA Extraction
Immediately after sampling, the cell suspension was centrifuged at 5500 rpm for 5 minutes. The cell pellet was stored at -80°C. RNA extraction was done using RNeasy Mini Kits (Qiagen) according to the manufacturer's protocol. DNase digestion for removal of genomic DNA contamination was done on the Qiagen columns according to the manufacturer's protocol. RNA was quantified by absorbance at 260 nm.
cDNA Microarray Preparation
Ninety percent of the E. coli genome (3866 successful PCRs) was printed onto a poly-lysine coated slide based on a modified protocol originally designed by the DeRisi lab at the University of California, San Francisco using the MicroGridPro Spotting Robot (BioRobotics). PCR products were obtained for a majority of these genes with the following conditions: 0.3 μL Taq polymerase, 1X Buffer, 20 mM dNTPs, 1.5 mM MgCl2, 800 nM primers, genomic DNA, and water to 25 μL. Thermocycling conditions were: a) 94°C for 5 min; b) Cycle conditions: 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 1 minute; c) 72°C for 10 minutes and stored at 4°C. A total of 0.5 μL of the PCR product was then used as template for a second 100 μL PCR reaction (same conditions) to minimize genomic DNA contamination. The PCR reactions that failed under these conditions (either no product or multiple bands) were reamplified at different annealing temperatures. After each round of amplification, the PCR product was run on agarose gels, and the gels were analysed using BioRad Ql software. 90% of the genome was successfully amplified this way. Amplified PCR products were cleaned with MultiScreen PCR plates (Millipore) as per the manufacturer's instructions and resuspended in 50 μL of 3X SSC, 0.01% SDS. Three Arabidopsis thaliana genes (RUBISCO activase (RCA), photosystem I chlorophyll a/b-binding protein (Cab), root cap 1 (RCP1)) were printed in triplicate on the array as controls (PCR product purchased from Stratagene).
The array (all genes spotted in triplicate) was printed on poly-L-lysine slides using the Total Array System Robot (BioRobotics) as described by the manufacturer. After printing was completed, the slides were post-processed. Post-processing involves rehydration, blocking and denaturation. The spotting process does not, in general, leave DNA evenly distributed throughout the spot. To distribute the DNA more evenly, the spots (which dry rapidly during spotting) were rehydrated and snap dried. During the blocking step, the remaining free lysine groups are modified to minimize their ability to bind labeled probe DNA. If these groups are not blocked, labeled probe DNA will bind indiscriminately and nonspecifically to the surface and will produce excessively high background. Blocking is done by acylation with succinic anhydride. Denaturation is carried out to make the probe accessible to the target. All the protocols are available on the website at wwwl.umn.edu/agac/microrray.
Probe Preparation
Control RNA [RUBISCO activase (RCA), photosystem I chlorophyll a/b-binding protein (Cab), root cap 1 (RCP1) (Stratagene)] at different concentrations (between 20 pg/μL and 800 pg/μL of each) was added to 20 μg of each sample RNA. The RNA was primed with 60 μg of random hexamers (Amersham), incubated at 70°C for 10 minutes and then allowed to sit on ice for 10 minutes. The RNA mixture was then added to the reverse transcription reaction, which included the reverse transcriptase Superscript II (GibcoBRL Life Technologies), DTT (Amersham), and a 4:1 ratio of amino-allyl dUTP with dNTPs (Amersham). The mixture was incubated for 2 hours at 42°C. After incubation, the reaction was stopped by adding 20 μL NaOH and 20 μL 0.5 M EDTA and incubating at 65°C for 15 minutes. The samples were neutralized with the addition of 50 μL of 1 M Tris-HCl (pH 7.4). Before the fluorescent dye was coupled to the amino groups, the amino-labeled cDNA was stripped of all extraneous amine groups using Microcon YM-30 filters (Millipore) as per the manufacturer's instructions. The sample was dried down to concentrate the cDNA. The cDNA was resuspended in 18 μL sodium bicarbonate buffer (pH 9.0) and allowed to sit for 10-15 minutes at room temperature to ensure resuspension. The entire volume was transferred into a tube containing a dried Cy-dye aliquot and incubated in the dark for an hour. To prevent cross coupling between the probes, 9 μL of 4 M hydroxylamine was added and the solution was incubated for 15 minutes at room temperature in the dark. The Qia-Quick PCR purification kit (Qiagen) was used to clean up the probe product as per the manufacturer's instructions, with the addition of one Buffer PE wash. To prepare the probe for hybridization, the samples were dried down and then resuspended in 33.5 μL dH2O, 6.68 μL 20X SSC, 1.01 μL 10% SDS, 3.38 μL of salmon sperm DNA and 3.38 μL polyA as a blocker.
Hybridization
Slides were placed in hybridization chambers. 10 μL of 3X SSC was added to the ends of each slide to prevent dehydration of the hybridization solution. The hybridization solution was denatured for 2 minutes and then allowed to cool for 10 minutes at room temperature. It was then added to the post-processed printed slides, covered with a cover slip, and incubated at 63°C for 6-8 hours. Fluorescence images were acquired with the ScanArray 5000 microarray scanner (GSI Lumonics). The entire experiment was repeated, and each sample was labeled with the dye other than the one with which it was labeled the first time.
Image Quantification
The fluorescent intensities were quantified using the QuantArray software with the adaptive measurement protocol. The local background was subtracted from the spot intensity to obtain the final signal intensity. Only signal intensities greater than one standard deviation of the local background were used for further analysis. The spot intensities were normalized using the nonlinear normalization program of Tseng et al. (2001). The standard deviation of the three measurements for each gene was calculated, and genes having a standard deviation of more than 1.5 were discarded. After the 'invalid' data points were discarded, the average ratio of the intensities for the two dyes was calculated for each gene. For each slide, any gene that had only one valid data point was discarded. For genes with two valid data points, the difference between the ratios from the two spots, divided by the smaller (in absolute value) of the two ratios, had to be less than 2 for the gene to pass the quality filtering. The data from the two replicate experiments are shown separately rather than being combined into a single set of six data points.
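The per-gene quality filter described above can be sketched in Python. This is an illustrative sketch only, not the software used in the study: the function names `background_corrected` and `gene_ratio` are hypothetical, the standard deviation is assumed to be the sample standard deviation of the dye ratios themselves, and the background test is assumed to compare the background-subtracted signal against one standard deviation of the local background.

```python
from statistics import mean, stdev


def background_corrected(spot, background, background_sd):
    """Subtract local background; return None if the signal is not usable.

    Assumption: a spot is valid when the background-subtracted signal
    exceeds one standard deviation of the local background.
    """
    signal = spot - background
    return signal if signal > background_sd else None


def gene_ratio(ratios, max_sd=1.5, max_spread=2.0):
    """Return the average dye ratio for one gene, or None if it fails filtering.

    `ratios` holds up to three replicate-spot ratios; invalid spots are None.
    """
    valid = [r for r in ratios if r is not None]
    if len(valid) < 2:
        # A gene with only one (or no) valid data point is discarded.
        return None
    if len(valid) == 2:
        a, b = valid
        smallest = min(abs(a), abs(b))
        if smallest == 0:
            return None
        # Difference between the two ratios relative to the smaller ratio.
        spread = abs(a - b) / smallest
        return mean(valid) if spread < max_spread else None
    # Three valid points: discard genes whose spread exceeds the SD cutoff.
    # Assumption: sample standard deviation (n - 1) of the raw ratios.
    return mean(valid) if stdev(valid) <= max_sd else None
```

A caller would apply `background_corrected` to each channel of each spot, form the dye ratio for spots where both channels are valid, and pass the three replicate ratios for a gene to `gene_ratio`.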
Real Time PCR
Ten micrograms of total RNA was treated for DNA removal using the DNA-free kit from Ambion according to the manufacturer's protocol. One microgram of the 'cleaned' RNA was reverse transcribed using random hexamers as primers, similar to the reverse transcription reaction used to generate the probe for the array, as described above. Amplification, detection, and real-time analysis were performed using the ABI Prism 7700 Sequence Detection System (Applied Biosystems). SYBR Green I (Applied Biosystems) was used for detection of the amplified product. Primers were designed to produce amplicons of the same length (approximately 100 bp) using the Primer3 software. The cDNA was diluted 13-fold and 1 μL of this diluted cDNA was used for subsequent PCR amplification using the appropriate specific primers and the SYBR Green 2X PCR Master Mix kit (Applied Biosystems). The 'cleaned' RNA was also used as a template to check for any amplification from genomic DNA contamination.
The Real Time PCR assay was performed for the wild type and the fur⁻ mutant at 0, 20, 60 and 90 minutes. Twelve genes were assayed by this method. The n-fold difference in expression of each gene between time 0 and each of the time points 20, 60 and 90 minutes was calculated as 2^-x, where x = (C_T in the 20 (or 60 or 90) minute sample - C_T in the 0 minute sample). Because comparisons were made between products of PCRs using identical primers, and not between PCR products derived from different genes, the amplification efficiencies of different primer pairs were eliminated as a variable in the interpretation of the data.
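The fold-difference calculation above can be illustrated with a short Python sketch. It assumes the conventional comparative threshold-cycle form, in which the fold difference is 2 raised to the negative of x, so that a lower threshold cycle (earlier detection) corresponds to higher expression. The function name `fold_change` and the threshold-cycle values are hypothetical and are not data from the study.

```python
def fold_change(ct_time_point, ct_time_zero):
    """n-fold difference in expression relative to time 0.

    x is the threshold cycle at the later time point minus the
    threshold cycle at time 0; the fold difference is 2 ** (-x).
    Assumes perfect doubling per cycle, which is reasonable here only
    because the same primer pair is used for both samples.
    """
    x = ct_time_point - ct_time_zero
    return 2.0 ** (-x)


# Hypothetical threshold cycles for one gene (illustration only).
ct_by_minute = {0: 24.0, 20: 22.0, 60: 25.0, 90: 24.0}

for minute in (20, 60, 90):
    ratio = fold_change(ct_by_minute[minute], ct_by_minute[0])
    print(f"{minute} min: {ratio:.2f}-fold relative to time 0")
```

With these made-up values, a threshold cycle two cycles lower than at time 0 corresponds to a 4-fold increase, and one cycle higher corresponds to a 2-fold decrease.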
Cited Documents
Abdul-Tehrani, H., Hudson, A.J., Chang, Y.S., Timms, A.R., Hawkins, C., Williams, J.M., Harrison, P.M., Guest, J.R., and Andrews, S.C. (1999) Ferritin mutants of Escherichia coli are iron deficient and growth impaired, and fur mutants are iron deficient. J Bacteriol 181: 1415-1428.
Arfin, S.M., Long, A.D., Ito, E.T., Tolleri, L., Riehle, M.M., Paegle, E.S., and Hatfield, G.W. (2000) Global gene expression profiling in Escherichia coli K12. The effects of integration host factor. J Biol Chem 275: 29672-29684.
Bagg, A., and Neilands, J.B. (1987) Molecular mechanism of regulation of siderophore-mediated iron assimilation. Microbiol Rev 51: 509-518.
Bernstein, J.A., Khodursky, A.B., Lin, P.H., Lin-Chao, S., and Cohen, S.N. (2002) Global analysis of mRNA decay and abundance in Escherichia coli at single-gene resolution using two-color fluorescent DNA microarrays. Proc Natl Acad Sci U S A 99: 9697-9702.
Binet, M.R., and Poole, R.K. (2000) Cd(II), Pb(II) and Zn(II) ions regulate expression of the metal-transporting P-type ATPase ZntA in Escherichia coli. FEBS Lett 473: 67-70.
Birch, R.M., O'Byrne, C., Booth, I.R., and Cash, P. (2003) Enrichment of Escherichia coli proteins by column chromatography on reactive dye columns. Proteomics 3: 764-776.
Braun, V. (1997) Surface signaling: novel transcription initiation mechanism starting from the cell surface. Arch Microbiol 167: 325-331.
Braun, V., and Killmann, H. (1999) Bacterial solutions to the iron-supply problem. Trends Biochem Sci 24: 104-109.
Braun, V., and Braun, M. (2002) Iron transport and signaling in Escherichia coli. FEBS Lett 529: 78.
Bullen, J.J., Rogers, H.J., and Griffiths, E. (1978) Role of iron in bacterial infection. Curr Top Microbiol Immunol 80: 1-35.
Courcelle, J., Khodursky, A., Peter, B., Brown, P.O., and Hanawalt, P.C. (2001) Comparative gene expression profiles following UV exposure in wild-type and SOS-deficient Escherichia coli. Genetics 158: 41-64.
Crosa, J.H. (1997) Signal transduction and transcriptional and posttranscriptional control of iron-regulated genes in bacteria. Microbiol Mol Biol Rev 61: 319-336.
Datsenko, K.A., and Wanner, B.L. (2000) One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc Natl Acad Sci U S A 97: 6640-6645.
David, G., Blondeau, K., Renouard, M., and Lewit-Bentley, A. (2002) Crystallization and preliminary analysis of Escherichia coli YodA. Acta Crystallogr D Biol Crystallogr 58: 1243-1245.
David, G., Blondeau, K., Schiltz, M., Penel, S., and Lewit-Bentley, A. (2003) YodA from Escherichia coli is a metal-binding, lipocalin-like protein. J Biol Chem.
Earhart (in) Neidhardt, F.C.a.C, R. (1996) Escherichia and Salmonella: cellular and molecular biology. Washington, D.C.: ASM Press.
Enz, S., Braun, V., and Crosa, J.H. (1995) Transcription of the region encoding the ferric dicitrate-transport system in Escherichia coli: similarity between promoters for fecA and for extracytoplasmic function sigma factors. Gene 163: 13-18.
Ferianc, P., Farewell, A., and Nystrom, T. (1998) The cadmium-stress stimulon of Escherichia coli K-12. Microbiology 144 (Pt 4): 1045-1050.
Furrer, J.L., Sanders, D.N., Hook-Barnard, I.G., and McIntosh, M.A. (2002) Export of the siderophore enterobactin in Escherichia coli: involvement of a 43 kDa membrane exporter. Mol Microbiol 44: 1225-1234.
Guerinot, M.L. (1994) Microbial iron transport. Annu Rev Microbiol 48: 743-772.
Hantke, K. (2001) Iron and metal regulation in bacteria. Curr Opin Microbiol 4: 172-177.
Khodursky, A.B., Peter, B.J., Cozzarelli, N.R., Botstein, D., Brown, P.O., and Yanofsky, C. (2000) DNA microarray analysis of gene expression in response to physiological and genetic changes that affect tryptophan metabolism in Escherichia coli. Proc Natl Acad Sci U S A 97: 12170-12175.
Klebba, P.E., McIntosh, M.A., and Neilands, J.B. (1982) Kinetics of biosynthesis of iron-regulated membrane proteins in Escherichia coli. J Bacteriol 149: 880-888.
Leong, J., and Neilands, J.B. (1976) Mechanisms of siderophore iron transport in enteric bacteria. J Bacteriol 126: 823-830.
Masse, E., and Gottesman, S. (2002) A small RNA regulates the expression of genes involved in iron metabolism in Escherichia coli. Proc Natl Acad Sci U S A 99: 4620-4625.
McHugh, J.P., Rodriguez-Quinones, F., Abdul-Tehrani, H., Svistunenko, D.A., Poole, R.K., Cooper, C.E., and Andrews, S.C. (2003) Global iron-dependent gene regulation in Escherichia coli: A new mechanism for iron homeostasis. J Biol Chem.
Molloy, M.P., Herbert, B.R., Slade, M.B., Rabilloud, T., Nouwens, A.S., Williams, K.L., and Gooley, A.A. (2000) Proteomic analysis of the Escherichia coli outer membrane. Eur J Biochem 267: 2871-2881.
Polen, T., Rittmann, D., Wendisch, V.F., and Sahm, H. (2003) DNA microarray analyses of the long-term adaptive response of Escherichia coli to acetate and propionate. Appl Environ Microbiol 69: 1759-1774.
Pomposiello, P.J., Bennik, M.H., and Demple, B. (2001) Genome-wide transcriptional profiling of the Escherichia coli responses to superoxide stress and sodium salicylate. J Bacteriol 183: 3890-3902.
Puskarova, A., Ferianc, P., Kormanec, J., Homerova, D., Farewell, A., and Nystrom, T. (2002) Regulation of yodA encoding a novel cadmium-induced protein in Escherichia coli. Microbiology 148: 3801-3811.
Rensing, C., Mitra, B., and Rosen, B.P. (1997) The zntA gene of Escherichia coli encodes a Zn(II)-translocating P-type ATPase. Proc Natl Acad Sci U S A 94: 14326-14331.
Richmond, C.S., Glasner, J.D., Mau, R., Jin, H., and Blattner, F.R. (1999) Genome-wide expression profiling in Escherichia coli K-12. Nucleic Acids Res 27: 3821-3835.
Salmon, K., Hung, S.P., Mekjian, K., Baldi, P., Hatfield, G.W., and Gunsalus, R.P. (2003) Global gene expression profiling in Escherichia coli K12: The effects of oxygen availability and FNR. J Biol Chem.
Sprencel, C., Cao, Z., Qi, Z., Scott, D.C., Montague, M.A., Ivanoff, N., Xu, J., Raymond, K.M., Newton, S.M., and Klebba, P.E. (2000) Binding of ferric enterobactin by the Escherichia coli periplasmic protein FepB. J Bacteriol 182: 5359-5364.
Tseng, G.C., Oh, M.K., Rohlin, L., Liao, J.C., and Wong, W.H. (2001) Issues in cDNA microarray analysis: quality filtering, channel normalization, models of variations and assessment of gene effects. Nucleic Acids Res 29: 2549-2557.
Yan, J.X., Devenish, A.T., Wait, R., Stone, T., Lewis, S., and Fowler, S. (2002) Fluorescence two-dimensional difference gel electrophoresis and mass spectrometry based proteomic analysis of Escherichia coli. Proteomics 2: 1682-1698.
Zheng, M., Wang, X., Templeton, L.J., Smulski, D.R., LaRossa, R.A., and Storz, G. (2001) DNA microarray-mediated transcriptional profiling of the Escherichia coli response to hydrogen peroxide. J Bacteriol 183: 4562-4570.
All publications, patents and patent applications referred to are incorporated herein by reference. While in the foregoing specification this invention has been described in relation to certain preferred embodiments thereof, and many details have been set forth for purposes of illustration, it will be apparent to those skilled in the art that the invention is susceptible to additional embodiments and that certain of the details described herein may be varied considerably without departing from the basic principles of the invention.

Claims

WHAT IS CLAIMED IS:
1. An isolated and purified polypeptide comprising at least one of SEQ ID NOs 65-128 or 130.
2. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:86.
3. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:87.
4. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:88.
5. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:89.
6. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:90.
7. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:91.
8. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:92.
9. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:93.
10. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:94.
11. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:95.
12. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:96.
13. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:97.
14. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:98.
15. The polypeptide of claim 1, wherein the polypeptide comprises SEQ ID NO:130.
16. An isolated and purified polynucleotide comprising a nucleic acid sequence encoding at least one of SEQ ID NOs 65-128 or 130.
17. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:86.
18. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:87.
19. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:88.
20. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:89.
21. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:90.
22. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:91.
23. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:92.
24. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:93.
25. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:94.
26. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:95.
27. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:96.
28. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:97.
29. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:98.
30. The polynucleotide of claim 16, wherein the polynucleotide comprises a sequence encoding SEQ ID NO:130.
31. An isolated and purified polynucleotide comprising at least one of SEQ ID NOs 1-64 or 129.
32. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:22.
33. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:23.
34. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:24.
35. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:25.
36. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:26.
37. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:27.
38. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:28.
39. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:29.
40. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:30.
41. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:31.
42. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:32.
43. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:33.
44. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:34.
45. The polynucleotide of claim 31, wherein the polynucleotide comprises SEQ ID NO:129.
46. An expression cassette, comprising a nucleic acid sequence encoding a promoter operably linked to at least one of the polynucleotides of any of claims 16-45.
47. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:22.
48. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:23.
49. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:24.
50. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:25.
51. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:26.
52. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:27.
53. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:28.
54. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:29.
55. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:30.
56. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:31.
57. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:32.
58. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:33.
59. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:34.
60. The expression cassette of claim 46, wherein the nucleic acid sequence comprises SEQ ID NO:129.
PCT/US2003/026488 2002-08-21 2003-08-21 Iron transport and metabolism proteins WO2004018638A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003260043A AU2003260043A1 (en) 2002-08-21 2003-08-21 Iron transport and metabolism proteins

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US40533102P 2002-08-21 2002-08-21
US60/405,331 2002-08-21

Publications (4)

Publication Number Publication Date
WO2004018638A2 true WO2004018638A2 (en) 2004-03-04
WO2004018638A9 WO2004018638A9 (en) 2004-04-22
WO2004018638A3 WO2004018638A3 (en) 2004-06-03
WO2004018638A8 WO2004018638A8 (en) 2004-10-28

Family

ID=31946858

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2003/026488 WO2004018638A2 (en) 2002-08-21 2003-08-21 Iron transport and metabolism proteins

Country Status (2)

Country Link
AU (1) AU2003260043A1 (en)
WO (1) WO2004018638A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018076049A1 (en) * 2016-10-24 2018-05-03 The University Of Queensland Immunogenic protein and method of use

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DATABASE GENBANK [Online] 21 September 1996 SCHRAMM ET AL., XP002975549 Retrieved from STN Database accession no. U70214 *
VAN VLIET ET AL.: 'Evolutionary divergence of genes for ornithine and aspartate ... comparison of argI an pyrB' NUCLEIC ACIDS RESEARCH vol. 12, no. 15, August 1984, pages 6277 - 6289, XP002975548 *

Also Published As

Publication number Publication date
WO2004018638A8 (en) 2004-10-28
WO2004018638A3 (en) 2004-06-03
AU2003260043A1 (en) 2004-03-11
AU2003260043A8 (en) 2004-03-11
WO2004018638A9 (en) 2004-04-22

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

COP Corrected version of pamphlet

Free format text: PAGES 1/6-6/6, DRAWINGS, REPLACED BY NEW PAGES 1/8-8/8; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

121 Ep: the epo has been informed by wipo that ep was designated in this application
WR Later publication of a revised version of an international search report
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase in:

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP