Screening method for proline-rich proteins

Info

Publication number
EP0973942A2
EP0973942A2 EP98910914A EP98910914A EP0973942A2 EP 0973942 A2 EP0973942 A2 EP 0973942A2 EP 98910914 A EP98910914 A EP 98910914A EP 98910914 A EP98910914 A EP 98910914A EP 0973942 A2 EP0973942 A2 EP 0973942A2
Authority
EP
European Patent Office
Prior art keywords
proline
leu
lys
rich
protein
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP98910914A
Other languages
German (de)
French (fr)
Inventor
Vega Masignani
Guido Grandi
Rino Rappuoli
Beatrice Arico
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GSK Vaccines SRL
Original Assignee
Chiron SRL
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chiron SRL
Publication of EP0973942A2

Classifications

    • C - CHEMISTRY; METALLURGY
    • C12 - BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q - MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 - Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 - Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P - SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P1/00 - Drugs for disorders of the alimentary tract or the digestive system
    • A61P1/04 - Drugs for disorders of the alimentary tract or the digestive system for ulcers, gastritis or reflux esophagitis, e.g. antacids, inhibitors of acid secretion, mucosal protectants
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P - SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 - Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04 - Antibacterial agents

Definitions

  • Muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE), etc.
  • the immunogenic compositions typically will contain diluents, such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles.
  • the immunogenic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared.
  • the preparation also may be emulsified or encapsulated in liposomes for enhanced adjuvant effect, as discussed above under pharmaceutically acceptable carriers.
  • Immunogenic compositions used as vaccines comprise an immunologically effective amount of the antigenic polypeptides, as well as any other of the above-mentioned components, as needed.
  • By "immunologically effective amount", it is meant that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for treatment or prevention. This amount varies depending upon the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated (eg. nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical situation, and other relevant factors. The amount will fall in a relatively broad range that can be determined through routine trials.
  • the immunogenic compositions are conventionally administered parenterally eg. by injection, either subcutaneously or intramuscularly. Additional formulations suitable for other modes of administration include oral and pulmonary formulations, suppositories, and transdermal applications. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other immunoregulatory agents.
  • DNA vaccination may be employed [eg. Robinson & Torres (1997) Seminars in Immunology 9:271-283; Donnelly et al. (1997) Annu Rev Immunol 15:617-648].
  • an antibody which is specific for a protein according to the invention.
  • a vector including nucleic acid according to the invention.
  • a host organism transformed with a vector according to the invention.
  • Figure 1 shows schematically the PCR techniques used to identify and prepare a proline-rich H.pylori protein.
  • the upper panel (A) describes the first PCR experiment using a degenerate proline primer and the universal primer of the pBluescript™ vector.
  • the lower panel (B) describes the second PCR experiment, using an internal specific primer and the reverse primer of the vector.
  • Figure 2 shows the nucleotide <SEQ ID 1> and amino acid <SEQ ID 2> sequences of the proline-rich H.pylori protein.
  • the putative leader peptide and proline-rich consensus sequence are underlined, and "SD” denotes the possible Shine-Dalgarno sequence.
  • Figure 3 compares the sequences <SEQ IDs 2-4> of the proline-rich proteins from H.pylori type I and type II strains. " — " denotes amino acids not found in proteins from type II strains.
  • Figure 4 shows Western analysis of H.pylori extracts. The antiserum used was raised against the proline-rich protein of Figure 2.
  • the average length of the PRRs is approximately 32 amino acids.
  • flanking regions are also rich in proline.
  • Whilst the distributions shown in the final three rows in the table are the same for PRRs as for the genome in general, hydrophobic and aromatic residues are under-represented in PRRs, whereas proline is obviously over-represented. This analysis suggests that PRRs will usually be in hydrophilic regions of a protein.
  • PRR-containing proteins identified in genomic sequences
  • H.pylori was chosen to demonstrate the utility of the screening methods.
  • the experiments were aimed at the identification and characterisation of surface-exposed proline-rich proteins common to type I and type II strains of H.pylori to use as candidate antigens in a new vaccine against H.pylori.
  • PCR was used to identify and amplify DNA sequences encoding proline-rich proteins, from a genomic library of H.pylori type I strain CCUG 17874 (Culture Collection of the University of Göteborg).
  • the library comprised HindIII-digested genomic DNA in pBluescript(SK+) [Censini et al. & Covacci et al., supra].
  • the library was screened for six different proline-rich pentapeptide sequences. According to a codon preference table compiled for H.pylori using the GCG Wisconsin package, these six sequences were translated into the following degenerate 15mer primers:
  • more than one amplification product was obtained, indicating the presence of more than one proline-rich sequence.
  • Each product obtained from this first PCR reaction was completely sequenced, giving a sequence with the T3 primer sequence at one end and the proline-primer sequence at the other. From within this, a sequence was used to design a specific internal primer.
  • This internal primer was used for a second PCR reaction, this time coupled to the downstream T7 primer from pBluescript (Figure 1B). This amplifies a region with the T7 primer sequence at one end, and the internal primer sequence at the other, and which spans the proline-rich sequence.
  • the two sequences obtained in the two PCR reactions can be combined to give the complete DNA sequence between the pBluescript primers. This sequence was searched for open reading frames (ORFs). Where an ORF was not complete within a single plasmid, the terminal sequence was used to screen further clones.
  • This ORF encodes a 223 amino acid protein (25.5kDa) with the sequence PASAP near the C-terminus (Figure 2 <SEQ ID 2>).
  • the 15 residue N-terminus signal peptide suggests that the protein is a secreted or cell-surface protein, although no canonical promoter region was located.
  • Secondary structure prediction analysis results in a continuous α-helical structure, except for the proline-rich motif, which appears to be folded as a loop. No hydrophobic segments (suggesting transmembrane regions) were detected.
  • This protein can be formulated as an immunogenic composition and can, for instance, be used to produce antibodies.
  • PCR primers were designed against the ORF and these were used to amplify DNA from the G21 and G50 chromosomes:
  • the amplified DNA (925bp) was sequenced, and the results are shown in Figure 3.
  • the ORF is well conserved between type I and II strains, except for a four amino acid deletion at positions 197-200 and a few other point mutations.
  • DNA encoding amino acids 19-183 of the ORF was amplified using the following PCR primers:
  • the antiserum also reacted with protein extracts from H.pylori strains G27, 60190, 326, 639, 646, 6120 & 686 (all type I strains) and 621, 650, ZU+ & ZU- (all type II strains), although the apparent MW was 30kDa (Figure 4A, lanes 2, 4, 5, 6, 7). The difference in MW may be due to the 4 residue deletion described above. Protein fractions from G27 strain culture supernatant, whole cells, and periplasm (polymyxin B) preparations were tested by Western analysis (Figure 4B).
  • the protein was detected in the whole cells (lane 1) and periplasmic fractions (lane 3), but not in the culture supernatant (lane 4), suggesting a periplasmic location.
  • the band at lower molecular weight is also found in the isogenic mutant, and appears to be unrelated to the protein of interest.
  • MOLECULE TYPE genomic DNA
  • xi SEQUENCE DESCRIPTION FOR SEQ ID NO: 1
  • MOLECULE TYPE protein
  • xi SEQUENCE DESCRIPTION FOR SEQ ID NO: 2
  • Met Arg Lys Ile Leu Leu Met Gly Leu Ile Leu Gln Ala Leu Phe Gly
  • Gln Ser Leu Arg Ile Leu Gln Thr Glu Asn Ala Arg Leu Leu Asp Glu 50 55 60
  • Arg Glu Ile Lys Gln Ala Lys Asp Ser Lys Ile Gly Glu Thr Tyr Ser 115 120 125
  • Gln Asn Ala Leu Glu Ile Leu Met Ala Leu Lys Pro Gln Glu Leu Gly 145 150 155 160
  • Lys Ile Leu Ala Lys Met Asp Pro Lys Lys Ala Ala Ala Leu Thr Glu
  • Glu Pro Met Ile Lys Asp Pro Asn Thr Lys Glu Pro Ala Gly Val
  • MOLECULE TYPE protein
  • xi SEQUENCE DESCRIPTION FOR SEQ ID NO: 3
  • Met Arg Lys Ile Leu Leu Met Gly Leu Ile Leu Gln Ala Leu Phe Ser
  • Arg Glu Ile Lys Gln Ala Lys Asp Ser Lys Ile Gly Glu Thr Tyr Ser 115 120 125
  • Lys Met Lys Asp Ser Lys Ser Ala Leu Ile Leu Glu Asn Leu Pro Thr 130 135 140

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Chemical & Material Sciences (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Medicinal Chemistry (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Biotechnology (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Molecular Biology (AREA)
  • Physics & Mathematics (AREA)
  • Communicable Diseases (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • General Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Medicines Containing Antibodies Or Antigens For Use As Internal Diagnostic Agents (AREA)
  • Peptides Or Proteins (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A method for screening a pathogenic microorganism for a secreted or cell-surface protein potentially useful as a vaccine antigen involves the step of identifying proline-rich proteins, either at the amino acid level, or preferably at the DNA level by employing PCR. Suitable consensus sequences are described as (PXm)nP or (PPXq)nPP wherein: P is proline; X is any amino acid; m is from 1 to 5; q is 1 or 2; and n is 2 or more.

Description

SCREENING METHOD
FIELD OF THE INVENTION
This invention relates to the identification of proteins either secreted from or on the cell surface of microorganisms, and of the genes encoding them. These proteins may be useful as antigens for the preparation of vaccines against the microorganisms.
BACKGROUND TO THE INVENTION
The characterisation of proteins produced by pathogenic microorganisms is, of course, a well known and active field of research.
One specific area of interest is the identification of secreted, exported and cell-surface proteins, which might be suitable for the design and development of vaccines, especially subunit vaccines - compositions which comprise one or more antigens which are capable of eliciting an immune response.
It would be useful if such proteins could be easily identified from the vast amount of uncharacterised sequence information now available from the genomes of microorganisms. The identification of sequence motifs which are characteristic of exported or surface-located proteins in bacteria facilitates rapid screening of genomes for potential virulence factors and vaccine components.
It has now been found that the presence of a proline-rich region in a protein is a good indicator of its cellular location. This information can be used both to screen sequence information in databases and to amplify DNA from microorganisms, even in the absence of sequence information, by designing primers based on degenerate proline-rich sequences.
DESCRIPTION OF THE INVENTION
According to the present invention, there is provided a method for screening a pathogenic microorganism for a secreted or cell-surface protein comprising the step of identifying proline-rich proteins. Preferably, the method further comprises the step of selecting the proline-rich proteins which have been identified.
As used herein, the term "proline-rich" protein denotes a protein having substantially more than the mean proline content of the proteins of the particular microorganism undergoing screening. The proline-rich protein preferably comprises one or more amino acid motifs having the following general sequences:
(PXm)nP or (PPXq)nPP
wherein: P is proline; X is any amino acid; m is from 1 to 5 (preferably m is from 1 to 3); q is 1 or 2; and n is 2 or more (preferably n=6). It will be appreciated that, for each of the n repeats, m and q may be the same or different. It will also be appreciated that the various X residues may be the same or different.
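The two general motifs, (PXm)nP and (PPXq)nPP, can be written as regular expressions. The following is a minimal sketch in Python, not part of the original disclosure; the function names are hypothetical, and X is assumed to be any of the 20 standard amino acids in one-letter code (so X may itself be proline).

```python
import re

# X: any of the 20 standard amino acids (one-letter codes); an assumption of this sketch
X = "[ACDEFGHIKLMNPQRSTVWY]"

def single_proline_motif(n_min=2, m_min=1, m_max=5):
    """Compile a regex for (PXm)nP with n >= n_min and m_min <= m <= m_max.

    m is allowed to differ between repeats, as the text permits.
    """
    return re.compile(rf"(?:P{X}{{{m_min},{m_max}}}){{{n_min},}}P")

def double_proline_motif(n_min=2, q_min=1, q_max=2):
    """Compile a regex for (PPXq)nPP with n >= n_min and q_min <= q <= q_max."""
    return re.compile(rf"(?:PP{X}{{{q_min},{q_max}}}){{{n_min},}}PP")

# Toy sequences, invented for illustration only
hit = single_proline_motif().search("AAPAPAPAA")
print(hit.group() if hit else None)
print(bool(double_proline_motif().fullmatch("PPAPPAPP")))
```

Raising `n_min` to 6 gives the preferred value of n; narrowing `m_max` to 3 gives the preferred range of m.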
A preferred motif, referred to hereafter as the proline-rich region (PRR), is represented by the sequence (PXm)6P. For each of the 6 repeats of PXm, the value of m may be the same or different. The PRR thus represents a stretch of seven non-consecutive proline residues, spaced by at least one but no more than five other amino acids. The probability of finding the PRR by chance in a sequence in which the 20 amino acids are randomly distributed is extremely low (7.8x10^-10).
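The text does not state how the quoted probability was derived. One reading that reproduces the figure, offered here as an assumption, is that each of the seven proline positions is filled independently with probability 1/20, ignoring the number of allowed spacer lengths:

```python
# Assumption: 20 equiprobable amino acids, seven fixed proline positions,
# spacer-length choices not counted.
p_proline = 1 / 20
n_prolines = 7

prr_chance = p_proline ** n_prolines
print(f"{prr_chance:.2e}")  # 7.81e-10
```

Counting the 5^6 possible spacer combinations would give a larger value, so this number is best read as the chance of one fixed arrangement of the seven prolines.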
It will be understood that these sequences represent motifs in which proline residues repeat relatively closely together, either singly (eg. PXPXP or PXXPXXPXXP) or doubly (eg. PPXPPXPPXPP or PPXXPPXXPPXXPP), or as a mixture of both (eg. PPXXPXXPP). In other words, the motifs are of the general sequence (PsXm)nPs, wherein each s is independently 1 or 2.
The X residues in these general sequences, which may be any amino acid, may be the same as each other, but are generally different. The upper limit on n is, of course, dependent upon the protein and the microorganism in question, and may be large (eg. greater than 10) for large proline-rich proteins.
The screening method may be applied to any microorganism, but is preferably applied to pathogenic microorganisms such as a bacterium or parasite. The bacterium may be Gram positive or Gram negative and may be, for example, Helicobacter pylori, Neisseria meningitidis, Mycobacterium tuberculosis, Escherichia coli, Neisseria gonorrhoeae, Haemophilus influenzae, Bordetella pertussis, Vibrio cholerae and Bacillus subtilis. The parasite may be a protozoan, such as Leishmania, the malarial parasite Plasmodium, or the Legionella. Other suitable microorganisms include Saccharomyces, Chlamydia, and Borrelia burgdorferi.
The screening method may be conducted at the amino acid level, by identifying proteins of known sequence, but unknown cellular localisation. This can conveniently be carried out by computer, using the FINDPATTERN program available in the GCG Wisconsin Package (version 8.1 or later), for instance. A suitable database for source information on amino acid sequences of microorganisms is the public database SWISSPROT.
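A screen of this kind can also be approximated without the GCG package. The sketch below is a stand-in for FINDPATTERN, not a reproduction of it: it reads FASTA-formatted protein sequences and reports those containing the PRR, (PXm)6P with m from 1 to 5. The function names and the two toy sequences are invented for illustration; in practice the input would be a file exported from a database such as SWISSPROT.

```python
import re
from io import StringIO

# PRR: six repeats of P followed by 1-5 residues, then a closing P
PRR = re.compile(r"(?:P[A-Z]{1,5}){6}P")

def read_fasta(handle):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)

def screen_for_prr(handle):
    """Return (identifier, 1-based start, matched region) for each PRR-containing protein."""
    hits = []
    for name, seq in read_fasta(handle):
        match = PRR.search(seq)
        if match:
            hits.append((name, match.start() + 1, match.group()))
    return hits

# Made-up sequences: one with seven closely spaced prolines, one with none
demo = StringIO(
    ">toy_hit invented example\nMKKLAPAPEPTPSPKPAPLLK\n"
    ">toy_miss invented example\nMKKLLAGGSTEEKLLAGV\n"
)
hits = screen_for_prr(demo)
print(hits)
```

Only the first match per protein is reported, which is enough to flag a candidate for closer inspection.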
Alternatively, and preferably, the screening method is conducted at the DNA level, by identifying and amplifying DNA from a library which contains a consensus sequence encoding a proline-rich amino acid motif.
Where the screening method is conducted at the DNA level, the method of the present invention preferably comprises the step of screening a DNA library of a microorganism for DNA encoding a proline-rich amino acid sequence. This first step is preferably followed by selecting DNA which encodes a proline-rich amino acid sequence.
The library is suitably a plasmid genomic library, prepared using known techniques. Suitable libraries for screening Helicobacter pylori, for instance, are described in Censini et al. (1996) PNAS USA 93:14648-14653 and Covacci et al. (1993) PNAS USA 90:5791-5795.
The step of screening the DNA library may comprise any suitable screening technique, but preferably involves the screening and amplification of specific DNA sequences containing a DNA sequence encoding a proline-rich amino acid sequence. Suitably the screening step comprises at least one polymerase chain reaction (PCR) step.
In a preferred method, the screening involves the preparation of a degenerate DNA primer encoding a proline-rich amino acid sequence, using known methods. The proline-rich amino acid sequence may be determined by elucidation from known proteins in general, or more specifically by determining consensus sequences from known secreted or cell-surface proteins either generally or from a specific microorganism of interest.
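Back-translating a short proline-rich peptide into a degenerate primer can be sketched as follows. This is an illustration under stated assumptions, not the procedure used by the inventors: it uses the standard genetic code and, for each codon position, the IUPAC code covering every base seen at that position, without any organism-specific codon preference. The function names are hypothetical. The pentapeptide PASAP is used as input only because it is the motif reported in the identified H.pylori protein.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide codes keyed by the sorted set of bases they represent
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
N_BASES = {code: len(bases) for bases, code in IUPAC.items()}

def degenerate_codon(aa):
    """Per-position union of all codons for one amino acid, as IUPAC codes."""
    codons = [c for c, a in CODON_TABLE.items() if a == aa]
    return "".join(
        IUPAC["".join(sorted({c[i] for c in codons}))] for i in range(3)
    )

def degenerate_primer(peptide):
    """Sense-strand degenerate oligonucleotide encoding the peptide."""
    return "".join(degenerate_codon(aa) for aa in peptide.upper())

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for code in primer:
        total *= N_BASES[code]
    return total

primer = degenerate_primer("PASAP")
print(primer, degeneracy(primer))  # CCNGCNWSNGCNCCN 4096
```

For amino acids with split codon families (serine, leucine, arginine) the per-position union also covers codons for other residues, for example WSN for serine. Restricting the choice with a codon preference table for the target organism reduces the degeneracy.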
The degenerate primer is used to amplify specific sequences containing one of the degenerate primer sequences (or sufficiently similar to hybridise thereto) to produce a population of amplified DNA. The sequences thus obtained may then be sequenced and, using a reverse PCR primer and an internal primer, the DNA encompassing the sequence encoding the proline-rich amino acid sequence can be obtained.
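Where template sequence is available, candidate annealing sites for a degenerate primer can be located in silico. The sketch below converts an IUPAC-coded primer into a regular expression and reports every position on the given strand where it matches. It is a simplification: real hybridisation tolerates some mismatches, whereas this search accepts only sequences covered by the degenerate codes. The primer, the template and the function names are invented for illustration.

```python
import re

# Bases represented by each IUPAC nucleotide code
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def primer_pattern(primer):
    """Translate an IUPAC-coded primer into a regex pattern string."""
    parts = []
    for code in primer.upper():
        bases = IUPAC_BASES[code]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

def find_sites(primer, template):
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A lookahead is used so that overlapping sites are not skipped
    pattern = re.compile(f"(?=({primer_pattern(primer)}))")
    return [m.start() for m in pattern.finditer(template.upper())]

# Toy template: ATG AAA | CCA GCT TCT GCA CCG | TAA (the middle part encodes P-A-S-A-P)
template = "ATGAAACCAGCTTCTGCACCGTAA"
sites = find_sites("CCNGCNWSNGCNCCN", template)
print(sites)  # [6]
```

To search the opposite strand, the same function can be applied to the reverse complement of the template.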
The screening process may further comprise the step of expressing the DNA so obtained in a suitable host organism, such as a prokaryotic or eukaryotic host cell. According to a further aspect of the invention, there is provided a proline-rich protein identified by the screening method of the invention. Such proteins may be formulated with suitable pharmaceutical excipients, such as carriers, and may include other antigens and/or one or more adjuvants.
In particular, there is provided a protein comprising an amino acid sequence depicted in Figure 2 or Figure 3 <SEQ IDs 2-4>, or a functionally active fragment or derivative thereof. As used herein, the term "functionally active" means retaining a substantial biological activity, preferably antigenicity. This protein was identified as a proline-rich protein in H.pylori.
According to a further aspect of the invention, there is provided nucleic acid (eg. RNA or DNA) encoding a protein of the invention (or their functionally active fragments or derivatives). In particular, there is provided a DNA sequence encoding the protein of Figure 2 or a functionally active fragment or derivative thereof. This DNA sequence preferably comprises <SEQ ID 1> or a fragment thereof (eg. the coding sequence of <SEQ ID 1>).
According to a further aspect of the invention, there is provided an immunogenic composition comprising a proline-rich protein of the invention or nucleic acid encoding such a protein, and a method for producing such a composition comprising the step of bringing a proline-rich protein of the invention (or nucleic acid encoding such a protein) into association with a pharmaceutically acceptable excipient.
The immunogenic compositions of the invention preferably also comprise one or more further antigenic H.pylori proteins. These further proteins may be proline-rich proteins themselves, but this is not necessarily so, and may be, for example, VacA, CagA, or NAP. In particular, there is provided an immunogenic composition comprising either the protein of Figure 2 or nucleic acid encoding the protein of Figure 2, and a method for producing such a composition comprising the step of bringing the protein or nucleic acid into association with a pharmaceutically acceptable excipient.
The immunogenic compositions of the invention are preferably formulated as vaccines. These vaccines may be prophylactic or therapeutic, and preferably comprise one or more adjuvants.
Such vaccines typically comprise antigen or antigens in combination with a "pharmaceutically acceptable excipient" ie. any excipient that does not itself induce the production of antibodies harmful to the individual receiving the composition. Suitable carriers are typically large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, lipid aggregates (such as oil droplets or liposomes), and inactive virus particles. Such carriers are well known to those of ordinary skill in the art. Additionally, these carriers may function as immunostimulating agents ("adjuvants"). Furthermore, the antigen may be conjugated to a bacterial toxoid, such as a toxoid from diphtheria, tetanus, cholera, H. pylori, etc. pathogens.
Preferred adjuvants to enhance efficacy of the composition include, but are not limited to: (1) aluminium salts (alum), such as aluminium hydroxide, aluminium phosphate, aluminium sulphate etc.; (2) oil-in-water emulsion formulations (with or without other specific immunostimulating agents such as muramyl peptides or bacterial cell wall components), such as for example (a) MF59 (WO90/14837), containing 5% Squalene, 0.5% Tween 80, and 0.5% Span 85 (optionally containing various amounts of MTP-PE (see below), although not required) formulated into submicron particles using a microfluidizer, (b) SAF, containing 10% Squalane, 0.4% Tween 80, 5% pluronic-blocked polymer L121, and thr-MDP either microfluidized into a submicron emulsion or vortexed to generate a larger particle size emulsion, and (c) Ribi™ adjuvant system (RAS), (Ribi Immunochem, Hamilton, MT) containing 2% Squalene, 0.2% Tween 80, and one or more bacterial cell wall components from the group consisting of monophosphoryl lipid A (MPL), trehalose dimycolate (TDM), and cell wall skeleton (CWS), preferably MPL + CWS (Detox™); (3) saponin adjuvants, such as Stimulon™ (Cambridge Bioscience, Worcester, MA), or particles generated therefrom such as ISCOMs (immunostimulating complexes); (4) Complete Freund's Adjuvant (CFA) and Incomplete Freund's Adjuvant (IFA); (5) cytokines, such as interleukins (eg. IL-1, IL-2, IL-4, IL-5, IL-6, IL-7, IL-12, etc.), interferons (eg. IFN-γ), macrophage colony stimulating factor (M-CSF), tumor necrosis factor (TNF), etc; and (6) other substances that act as immunostimulating agents to enhance the effectiveness of the composition. Alum and MF59 are preferred.
Muramyl peptides include, but are not limited to, N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP), N-acetyl-normuramyl-L-alanyl-D-isoglutamine (nor-MDP), N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1'-2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine (MTP-PE), etc.
The immunogenic compositions (eg. the antigen, pharmaceutically acceptable carrier, and adjuvant) typically will contain diluents, such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles. Typically, the immunogenic compositions are prepared as injectables, either as liquid solutions or suspensions; solid forms suitable for solution in, or suspension in, liquid vehicles prior to injection may also be prepared. The preparation also may be emulsified or encapsulated in liposomes for enhanced adjuvant effect, as discussed above under pharmaceutically acceptable carriers.
Immunogenic compositions used as vaccines comprise an immunologically effective amount of the antigenic polypeptides, as well as any other of the above-mentioned components, as needed. By "immunologically effective amount", it is meant that the administration of that amount to an individual, either in a single dose or as part of a series, is effective for treatment or prevention. This amount varies depending upon the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated (eg. nonhuman primate, primate, etc.), the capacity of the individual's immune system to synthesize antibodies, the degree of protection desired, the formulation of the vaccine, the treating doctor's assessment of the medical situation, and other relevant factors. The amount will fall in a relatively broad range that can be determined through routine trials.
The immunogenic compositions are conventionally administered parenterally eg. by injection, either subcutaneously or intramuscularly. Additional formulations suitable for other modes of administration include oral and pulmonary formulations, suppositories, and transdermal applications. Dosage treatment may be a single dose schedule or a multiple dose schedule. The vaccine may be administered in conjunction with other immunoregulatory agents.
As an alternative to protein-based vaccines, DNA vaccination may be employed [eg. Robinson & Torres (1997) Seminars in Immunology 9:271-283; Donnelly et al. (1997) Annu Rev Immunol 15:617-648].
According to a further aspect of the invention, there is provided an antibody which is specific for a protein according to the invention.
According to a further aspect of the invention, there is provided a vector including nucleic acid according to the invention. There is also provided a host organism transformed with a vector according to the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows schematically the PCR techniques used to identify and prepare a proline-rich H.pylori protein. The upper panel (A) describes the first PCR experiment using a degenerate proline primer and the universal primer of the pBluescript™ vector. The lower panel (B) describes the second PCR experiment, using an internal specific primer and the reverse primer of the vector.
Figure 2 shows the nucleotide <SEQ ID 1> and amino acid <SEQ ID 2> sequences of the proline-rich H.pylori protein. The putative leader peptide and proline-rich consensus sequence are underlined, and "SD" denotes the possible Shine-Dalgarno sequence.
Figure 3 compares the sequences <SEQ IDs 2-4> of the proline-rich proteins from H.pylori type I and type II strains. " — " denotes amino acids not found in proteins from type II strains.
Figure 4 shows Western analysis of H.pylori extracts. The antiserum used was raised against the proline-rich protein of Figure 2.
DETAILED DESCRIPTION OF EMBODIMENTS
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of computer searching, molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature eg. Molecular Cloning; A Laboratory Manual, Second Edition (Sambrook, 1989); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985); Oligonucleotide Synthesis (M.J. Gait ed, 1984); Nucleic Acid Hybridization (B.D. Hames & S.J. Higgins eds. 1984); Transcription and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal Cell Culture (R.I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide to Molecular Cloning (1984); the Methods in Enzymology series (Academic Press, Inc.), especially volumes 154 & 155; Gene Transfer Vectors for Mammalian Cells (ed. Miller and Calos 1987, Cold Spring Harbor Laboratory); Mayer and Walker, eds. (1987), Immunochemical Methods in Cell and Molecular Biology (Academic Press, London); Scopes, (1987) Protein Purification: Principles and Practice, Second Edition (Springer-Verlag, N.Y.), and Handbook of Experimental Immunology, Volumes I-IV (ed. Weir and Blackwell 1986).
1) Computer analysis of protein sequences
It was considered that proline-rich proteins might have specific functions in microorganisms and, specifically, might be correlated with secreted/exported or cell-surface proteins. On this basis, the FINDPATTERN program (GCG Wisconsin package, version 8.1) was used to screen the SWISSPROT database for sequences containing the following motifs:
(PXm)nP or (PPXq)nPP
wherein: P is proline; X is any amino acid; m is from 1 to 3; q is 1 or 2; and n is 2 or more. These sequences were designed to detect any kind of proline-rich motif in proteins which had already been characterised.
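The two motif families above translate naturally into regular expressions. The following is a minimal sketch in Python rather than the FINDPATTERN syntax actually used, which is not given here; the function names `build_motif_regex` and `find_motifs` and the test peptides are hypothetical, and sequences are assumed to be in one-letter code.

```python
import re

def build_motif_regex(m_max=3, q_max=2, n_min=2):
    """Compile a regex for (PXm)nP or (PPXq)nPP.

    m runs from 1 to m_max, q from 1 to q_max, and n is n_min or more.
    X is any residue, written in one-letter code.
    """
    single = rf"(?:P[A-Z]{{1,{m_max}}}){{{n_min},}}P"
    double = rf"(?:PP[A-Z]{{1,{q_max}}}){{{n_min},}}PP"
    return re.compile(f"{double}|{single}")

def find_motifs(sequence, pattern=None):
    """Return (start, end, text) for each hit, with 1-based inclusive positions."""
    pattern = pattern or build_motif_regex()
    return [(m.start() + 1, m.end(), m.group())
            for m in pattern.finditer(sequence.upper())]

# Hypothetical peptides, used only to exercise the two motif families.
print(find_motifs("MKPAPAPLL"))
print(find_motifs("APPAPPGPPK"))
```

Because `finditer` reports non-overlapping hits, a long proline-rich stretch is returned as a single extended match rather than as every possible sub-motif.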
The search was limited at the outset to three microorganism classes: protozoa, Gram positive bacteria, and Gram negative bacteria. The results were as follows, classified by cellular location and/or biological function:
These results clearly demonstrate that screening for proline-rich motifs produces a highly relevant percentage of membrane-associated proteins. Indeed, amongst the proteins identified in this way are two of the major components of the acellular vaccine developed against Bordetella pertussis, namely filamentous haemagglutinin (FHA) and pertactin [Rappuoli et al. (1991) TIBTECH 9:232-238].
These results clearly support the general principle that analysis for the presence of proline-rich motifs in an amino acid sequence is useful for the identification of secreted or cell-surface proteins. In addition, since the invention was initially conceived, additional studies have been carried out which further validate this approach.
Firstly, the 17567 prokaryotic proteins in release 34 of SWISSPROT were searched against the PRR motif [(PX1-5)6P], using the FINDPATTERN program in version 9.0 of the GCG Wisconsin package. 1536 motifs were identified in 190 bacterial proteins (ie. just over 1% of all proteins analysed). Of these, 98 were from Gram-negative bacteria, and the remaining 92 were from Gram-positive bacteria. On the basis of either known function or predicted cellular location [PSORT algorithm - http://psort.nibb.ac.jp], it was found that 68 of the 98 (69.4%) and 64 of the 92 (69.5%) sequences were surface-located or exported proteins.
Secondly, the published genomes of Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Methanobacterium thermoautotrophicum, Methanococcus jannaschii and Mycoplasma pneumoniae were searched for the presence of PRR motifs. Of the 53 proteins identified in this way, 39 (73.6%) were predicted to be membrane-associated or exported, thus confirming that PRR motifs are preferentially found in proteins localised in the superficial compartments of the bacterial cell. In H.influenzae, all five PRR-containing proteins are predicted to be surface-located or exported.
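The proportions quoted in the two studies above can be recomputed from the stated counts. The short sketch below does only that; the helper name `percent` and the dictionary labels are illustrative. Rounded to one decimal place, 64 of 92 comes out at 69.6%, so the 69.5% given above appears to be truncated rather than rounded.

```python
def percent(hits, total):
    """Percentage of hits in total, rounded to one decimal place."""
    return round(100.0 * hits / total, 1)

# Counts as stated in the text: (surface-located or exported, total).
counts = {
    "PRR-containing proteins among all proteins searched": (190, 17567),
    "Gram-negative, surface-located or exported": (68, 98),
    "Gram-positive, surface-located or exported": (64, 92),
    "Six genomes, membrane-associated or exported": (39, 53),
}

for label, (hits, total) in counts.items():
    print(f"{label}: {hits}/{total} = {percent(hits, total)}%")
```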
2) Detailed analysis of PRR-containing proteins
The 29 PRR-containing proteins found in the genomes of E.coli, H.pylori, and H.influenzae were analysed in more detail (see page 11 - PRRs are shown underlined).
The average length of the PRRs is approximately 32 amino acids.
The shortest PRR was found in the RhsA transmembrane protein of E.coli (14 residues):
1235-PLNPVTNTDPLGLEVFPRPFPLPIPWPKSP-1264
The longest was in the H.pylori TonB homologue Hp1341 (56 residues):
73-PKEEPKEKPKKEEPKKEEPKKEVTKPKPKPKPKPKPKPKPKPEPKPEPKPEPKPEPKVEEVKKEEPKEEP-142
It is apparent that, in addition to the PRRs themselves, the flanking regions are also rich in proline.
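The proline bias of such a region can be quantified by a simple residue count. The sketch below computes the amino acid composition of the 70-residue Hp1341 segment (residues 73-142) quoted above; the function name `composition` is an assumption for illustration, and no genome-wide averages are included because the comparison table is not reproduced here.

```python
from collections import Counter

# Hp1341 residues 73-142 as quoted above (PRR plus flanking residues).
HP1341_SEGMENT = (
    "PKEEPKEKPKKEEPKKEEPKKEVTKPK"
    "PKPKPKPKPKPKPKPEPKPEPKPEPKPEPKVEEVKKEEPKEEP"
)

def composition(sequence):
    """Return the fractional composition of a one-letter protein sequence."""
    seq = sequence.replace(" ", "").upper()
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(counts)}

for residue, fraction in composition(HP1341_SEGMENT).items():
    print(f"{residue}: {fraction:.1%}")
```

The same function can be applied to a whole proteome to obtain the background frequencies against which a PRR would be compared.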
The following table shows the bias in the PRRs towards proline residues. The amino acid composition of PRRs is clearly different from the average across the genome:
Whilst the distributions shown in the final three rows in the table are the same for PRRs as for the genome in general, hydrophobic and aromatic residues are under-represented in PRRs, whereas proline is obviously over-represented. This analysis suggests that PRRs will usually be in hydrophilic regions of a protein.
PRR-containing proteins identified in genomic sequences
3) Analysis of H.pylori genome library
In order to extend the principles demonstrated above, the microorganism H.pylori was chosen to demonstrate the utility of the screening methods. The experiments were aimed at the identification and characterisation of surface-exposed proline-rich proteins common to type I and type II strains of H.pylori to use as candidate antigens in a new vaccine against H.pylori.
PCR was used to identify and amplify DNA sequences encoding proline-rich proteins, from a genomic library of H.pylori type I strain CCUG 17874 (Culture Collection of the University of Göteborg). The library comprised HindIII-digested genomic DNA in pBluescript(SK+) [Censini et al. & Covacci et al., supra].
The library was screened for six different proline-rich pentapeptide sequences. According to a codon preference table compiled for H.pylori using the GCG Wisconsin package, these six sequences were translated into the following degenerate 15mer primers:
In six separate reactions, 100ng plasmid DNA and 50pmol each of degenerate primer and the upstream universal pBluescript T3 primer were used for PCR amplification (Figure 1A).
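The back-translation of a proline-rich pentapeptide into a degenerate 15mer can be sketched as follows. This is an illustration only: it uses fully degenerate codons from the standard genetic code in IUPAC notation, whereas the primers described above were designed with an H.pylori codon preference table, which would reduce the degeneracy. The names `DEGENERATE_CODON`, `back_translate` and `degeneracy` are hypothetical.

```python
from math import prod

# Fully degenerate codons (standard genetic code) for residues whose
# codons form a single four-fold degenerate family.
DEGENERATE_CODON = {"P": "CCN", "A": "GCN", "T": "ACN", "G": "GGN", "V": "GTN"}

# Number of bases represented by each IUPAC symbol used here.
IUPAC_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 4}

def back_translate(peptide):
    """Back-translate a peptide into a degenerate oligonucleotide."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())

def degeneracy(oligo):
    """Number of distinct sequences encoded by a degenerate oligonucleotide."""
    return prod(IUPAC_SIZE[base] for base in oligo)

primer = back_translate("PAPAP")
print(primer, degeneracy(primer))
```

A codon preference table would replace some `N` positions with one or two favoured bases, which is why a species-specific table is attractive for primer design.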
In some of the reactions, more than one amplification product was obtained, indicating the presence of more than one proline-rich sequence. Each product obtained from this first PCR reaction was completely sequenced, giving a sequence with the T3 primer sequence at one end and the proline-primer sequence at the other. From within this, a sequence was used to design a specific internal primer.
This internal primer was used for a second PCR reaction, this time coupled to the downstream T7 primer from pBluescript (Figure 1B). This amplifies a region with the T7 primer sequence at one end, and the internal primer sequence at the other, and which spans the proline-rich sequence.
The two sequences obtained in the two PCR reactions can be combined to give the complete DNA sequence between the pBluescript primers. This sequence was searched for open reading frames (ORFs). Where an ORF was not complete within a single plasmid, the terminal sequence was used to screen further clones.
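The combination of the two PCR-derived sequences and the subsequent ORF search can be sketched in a few lines. The code below is an illustrative assumption, not the software actually used: it assumes both reads are given on the same strand, merges them on their longest exact overlap, and reports ATG-to-stop ORFs in all six frames. The function names and the toy sequences are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]

def merge_by_overlap(left, right, min_overlap=10):
    """Join two same-strand reads on the longest suffix/prefix overlap."""
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no overlap of the required length")

def find_orfs(dna, min_codons=30):
    """Return (strand, start, end, sequence) for ORFs in all six frames.

    Positions are 0-based offsets on the strand that was scanned; an ORF
    runs from the first ATG in a frame to the next in-frame stop codon.
    """
    orfs = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            start = None
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, seq[start:i + 3]))
                    start = None
    return orfs

# Toy example: two overlapping reads containing one short ORF.
contig = merge_by_overlap("CCATGAAATT", "AAATTTTAGCC", min_overlap=5)
print(contig, find_orfs(contig, min_codons=2))
```

An ORF that is still open at the end of the contig is not reported, which corresponds to the case above where further clones had to be screened.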
4) Further analysis of a specific H.pylori proline-rich protein
One of the several proline-rich ORFs identified in this way was singled out for further analysis. This ORF encodes a 223 amino acid protein (25.5kDa) with the sequence PASAP near the C-terminus (Figure 2 <SEQ ID 2>). The 15 residue N-terminal signal peptide suggests that the protein is a secreted or cell-surface protein, although no canonical promoter region was located. Secondary structure prediction analysis results in a continuous α-helical structure, except for the proline-rich motif, which appears to be folded as a loop. No hydrophobic segments (suggesting transmembrane regions) were detected.
This protein can be formulated as an immunogenic composition and can, for instance, be used to produce antibodies.
Although no significant homologies were detected when this sequence was first identified, subsequent sequencing of the H.pylori genome has identified this protein as ORF257 [Tomb et al. (1997) Nature 388:539-547]. No function has been described for this protein, although it has homology to conserved secreted proteins. Using the TBLASTN algorithm, weak homology (27-31% identity) can be seen with ORF6 of the flaA locus of B.subtilis, ORF6 of the flagellar operon of Agrobacterium tumefaciens, and ORFB found in a similar locus of Rhizobium meliloti (GenBank L76929).
The fact that the PCR reaction identified a protein containing a PASAP motif, whereas the degenerate oligonucleotide primer was designed for the PAPAP motif, indicates that, depending upon the stringency used for amplification, additional proline-rich proteins can be selected beyond those expected.
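The closeness of the PAPAP and PASAP targets at the nucleotide level can be illustrated with a position-by-position comparison. In the sketch below, the PASAP coding sequence is taken from nucleotides 759-773 of SEQ ID 1; the primer is an assumption (a fully degenerate back-translation of PAPAP, compared on the coding strand), since the actual primer sequences are not given here. The name `mismatch_positions` is hypothetical.

```python
# Bases represented by each IUPAC nucleotide symbol.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def mismatch_positions(primer, target):
    """Return 0-based positions where the target base is not allowed by the primer."""
    if len(primer) != len(target):
        raise ValueError("primer and target must have the same length")
    return [i for i, (p, t) in enumerate(zip(primer, target)) if t not in IUPAC[p]]

PAPAP_PRIMER = "CCNGCNCCNGCNCCN"   # assumed fully degenerate primer for PAPAP
PASAP_CODING = "CCAGCATCCGCGCCC"   # SEQ ID 1, nucleotides 759-773 (Pro-Ala-Ser-Ala-Pro)

print(mismatch_positions(PAPAP_PRIMER, PASAP_CODING))
```

Under these assumptions the two sequences differ at a single position, the first base of the serine codon (TCC in place of CCN), which is consistent with amplification at reduced stringency.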
5) Presence of the protein in type II strains
The presence of this CCUG 17874 (type I strain) ORF in H.pylori type II strains was investigated using the G21 and G50 strains [Censini et al. (1996) supra]. The PCR fragment containing the complete ORF was used as a probe in Southern analysis of HindIII-digested chromosomal DNA from CCUG 17874, G21 and G50 strains under highly stringent conditions, and a similar band was detected for all three strains.
PCR primers were designed against the ORF and these were used to amplify DNA from the G21 and G50 chromosomes:
5'-CCACCCAGAGAGCGAGAATTT-3' & 5'-CTAATTCATGGACAAAGATC-3'
The amplified DNA (925bp) was sequenced, and the results are shown in Figure 3. The ORF is well conserved between type I and II strains, except for a four amino acid deletion at positions 197-200 and a few other point mutations. DNA encoding amino acids 19-183 of the ORF was amplified using the following PCR primers:
5'-TGTGTGGGAATTCGCCAAGAATTGTTGCAATGCTCTGCG-3'
5'-GTGTGAAGCTTTTTTTGCCACAACTCTGTCAAAGC-3'
which also introduced EcoRI (GAATTC) and HindIII (AAGCTT) sites. This fragment was inserted into the pGEX-2T vector [Guan et al. (1991) Anal Biochem 192:262-267]. The truncated protein lacks the signal sequence and the 40 C-terminal residues. The protein was expressed as a GST-fusion in E.coli and was purified using glutathione affinity chromatography. Rabbit antiserum was raised against this recombinant protein, in the form of an immunogenic composition, and was used in Western analysis. Against total protein extracts from H.pylori, a 34kDa protein species was detected (Figure 4A, lane 1). In an isogenic knock-out strain (produced by allelic exchange), the protein was not detected (lane 3). The discrepancy between the theoretical MW (25kDa) and observed MW (34kDa) can probably be attributed to the high proline-content of the protein, which is known to cause abnormal migration in SDS gels.
The antiserum also reacted with protein extracts from H.pylori strains G27, 60190, 326, 639, 646, 6120 & 686 (all type I strains) and G21, G50, ZU+ & ZU- (all type II strains), although the apparent MW was 30kDa (Figure 4A, lanes 2, 4, 5, 6, 7). The difference in MW may be due to the 4 residue deletion described above. Protein fractions from G27 strain culture supernatant, whole cells, and periplasm (polymyxin B) preparations were tested by Western analysis (Figure 4B). The protein was detected in the whole cells (lane 1) and periplasmic fractions (lane 3), but not in the culture supernatant (lane 4), suggesting a periplasmic location. The band at lower molecular weight is also found in the isogenic mutant, and appears to be unrelated to the protein of interest.
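The two cloning primers used for the GST fusion above can be checked for their restriction sites and for the region of SEQ ID 1 to which the reverse primer anneals. The sketch below is illustrative only; the recognition sequences GAATTC (EcoRI) and AAGCTT (HindIII) are standard, the SEQ ID 1 excerpt is copied from the sequence listing (it ends at nucleotide 720), and the names `site_positions` and `reverse_complement` are hypothetical.

```python
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT"}

FORWARD = "TGTGTGGGAATTCGCCAAGAATTGTTGCAATGCTCTGCG"
REVERSE = "GTGTGAAGCTTTTTTTGCCACAACTCTGTCAAAGC"

# Excerpt of SEQ ID 1 ending at nucleotide 720.
SEQ1_EXCERPT = "GCTTTGACAGAGTTGTGGCAAAAACCCCCAA"

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]

def site_positions(primer):
    """Map each restriction site found in the primer to its 0-based offset."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}

print(site_positions(FORWARD))
print(site_positions(REVERSE))

# Portion of the reverse primer 3' of the HindIII site, read on the coding strand.
annealing = reverse_complement(REVERSE[site_positions(REVERSE)["HindIII"] + 6:])
print(annealing, annealing in SEQ1_EXCERPT)
```

Each primer thus carries a short 5' extension, one restriction site, and a gene-specific 3' portion; the gene-specific portion of the reverse primer matches the coding strand of SEQ ID 1 when reverse-complemented.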
It will be appreciated that the invention has been illustrated by means of example only, and that modifications to these may be made whilst remaining within the scope and spirit of the invention.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: Chiron SpA
(B) STREET: Via Fiorentina 1
(C) CITY: Siena
(E) COUNTRY: ITALY
(F) POSTAL CODE (ZIP) : 53100 Siena
(ii) TITLE OF INVENTION: Screening Method
(iii) NUMBER OF SEQUENCES: 4
(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk 720K
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Ver 2.0/Microsoft Word 97
(v) CURRENT APPLICATION DATA:
APPLICATION NUMBER: PCT/IB98/
(2) INFORMATION FOR SEQ ID NO: 1
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 845 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: genomic DNA
(xi) SEQUENCE DESCRIPTION FOR SEQ ID NO: 1
ACTAAGGATT AAAAATTTAG AAAGAGATTA TCTTTTGGCT AACCAGGAAT TAGAAAAAGC 60
TAAAATCATT TTAGAAAAGG AAAAGCAAAA AGAACAGGAA ATTTTGGGAA AAAAAGAGCA 120
GGCTCTTTTG GACGAAAATG CCATGATTTT ACACTGGCAA AAAGAGGGCT TGCATGCGTA 180
AAATCTTGTT AATGGGTCTG ATTTTACAAG CGCTCTTTGG CGAAGAAGCC GCGCAAGAAT 240
TGTTGCAATG CTCTGCGATT TTTGAATCTA AAAAAGCCGA ATTGAAAGAC GATTTGCGCC 300
AATTGAGCGA AAAAGAGCAG TCTTTAAGGA TCTTGCAAAC CGAAAACGCC CGCCTTTTAG 360
ATGAAAAAAC CGATCTGTTG AACCAAAAAG AAAAAGAAGT GGAAGAAAAA CTGAAAAATT 420
TAGCCGCTAA AGAAGAAGCC TTTAAAACCT TACAAACGGA AGAAAAGAAA CGCCTTAAAA 480
ATTTGATAGA AGAAAATGAA GAAATTTTAA GAGAAATCAA GCAGGCTAAA GATAGCAAGA 540
TTGGCGAGAC TTATTCTAAA ATGAAAGATT CTAAATCGGC TCTGATTTTA GAAAATTTAC 600
CCACTCAAAA CGCCCTAGAA ATTTTAATGG CGTTAAAACC CCAAGAATTA GGTAAAATTT 660
TAGCCAAAAT GGATCCTAAA AAAGCGGCGG CTTTGACAGA GTTGTGGCAA AAACCCCCAA 720
AAGAAAATAA AGAAAACCAA AAAACCACAG AGCCTACACC AGCATCCGCG CCCCCCATAG 780
CACCCACGCC TCCTAAAGAG CCGATGATAA AAGATCCTAA CACCAAAGAG CCTGCAGGGG 840
TATGA 845
(2) INFORMATION FOR SEQ ID NO: 2
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 223 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION FOR SEQ ID NO: 2
Met Arg Lys Ile Leu Leu Met Gly Leu Ile Leu Gln Ala Leu Phe Gly
5 10 15
Glu Glu Ala Ala Gln Glu Leu Leu Gln Cys Ser Ala Ile Phe Glu Ser
20 25 30
Lys Lys Ala Glu Leu Lys Asp Asp Leu Arg Gln Leu Ser Glu Lys Glu
35 40 45
Gln Ser Leu Arg Ile Leu Gln Thr Glu Asn Ala Arg Leu Leu Asp Glu
50 55 60
Lys Thr Asp Leu Leu Asn Gln Lys Glu Lys Glu Val Glu Glu Lys Leu
65 70 75 80
Lys Asn Leu Ala Ala Lys Glu Glu Ala Phe Lys Thr Leu Gln Thr Glu
85 90 95
Glu Lys Lys Arg Leu Lys Asn Leu Ile Glu Glu Asn Glu Glu Ile Leu
100 105 110
Arg Glu Ile Lys Gln Ala Lys Asp Ser Lys Ile Gly Glu Thr Tyr Ser
115 120 125
Lys Met Lys Asp Ser Lys Ser Ala Leu Ile Leu Glu Asn Leu Pro Thr
130 135 140
Gln Asn Ala Leu Glu Ile Leu Met Ala Leu Lys Pro Gln Glu Leu Gly
145 150 155 160
Lys Ile Leu Ala Lys Met Asp Pro Lys Lys Ala Ala Ala Leu Thr Glu
165 170 175
Leu Trp Gln Lys Pro Pro Lys Glu Asn Lys Glu Asn Gln Lys Thr Thr
180 185 190
Glu Pro Thr Pro Ala Ser Ala Pro Pro Ile Ala Pro Thr Pro Pro Lys
195 200 205
Glu Pro Met Ile Lys Asp Pro Asn Thr Lys Glu Pro Ala Gly Val
210 215 220
(2) INFORMATION FOR SEQ ID NO: 3
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 219 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION FOR SEQ ID NO: 3
Met Arg Lys Ile Leu Leu Met Gly Leu Ile Leu Gln Ala Leu Phe Ser
5 10 15
Glu Glu Ala Ala Gln Glu Leu Leu Gln Cys Ser Ala Ile Phe Glu Ser
20 25 30
Lys Lys Ala Glu Leu Lys Asp Asp Leu Arg Gln Leu Ser Glu Lys Glu
35 40 45
Gln Ser Leu Arg Ile Leu Gln Thr Glu Asn Ala Arg Leu Leu Asp Glu
50 55 60
Lys Thr Asp Leu Leu Asn Gln Lys Glu Lys Glu Val Glu Glu Lys Leu
65 70 75 80
Lys Asn Leu Ala Val Lys Glu Glu Ala Phe Lys Thr Leu Gln Thr Glu
85 90 95
Glu Lys Lys Arg Leu Lys Asn Leu Ile Glu Glu Asn Glu Gly Ile Leu
100 105 110
Arg Glu Ile Lys Gln Ala Lys Asp Ser Lys Ile Gly Glu Thr Tyr Ser
115 120 125
Lys Met Lys Asp Ser Lys Ser Ala Leu Ile Leu Glu Asn Leu Pro Thr
130 135 140
Gln Asn Ala Leu Glu Ile Leu Met Ala Leu Lys Pro Gln Glu Leu Gly
145 150 155 160
Lys Ile Leu Ala Lys Met Asp Pro Lys Lys Ala Ala Ala Leu Thr Glu
165 170 175
Leu Trp Gln Lys Pro Pro Lys Glu Asn Lys Glu Ser Gln Lys Thr Ile
180 185 190
Pro Pro Thr Pro Pro Ile Ala Pro Thr Pro Leu Lys Glu Pro Met Ile
195 200 205
Lys Asp Pro Asn Thr Lys Glu Pro Ala Gly Val
210 215
(2) INFORMATION FOR SEQ ID NO: 4
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 219 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION FOR SEQ ID NO: 4
Met Arg Lys Ile Leu Leu Ile Gly Leu Ile Leu Gln Ala Leu Phe Ser
5 10 15
Glu Glu Ala Ala Gln Glu Leu Leu Gln Cys Ser Ala Ile Phe Glu Ser
20 25 30
Lys Lys Ala Glu Leu Lys Asp Asp Leu Arg Gln Leu Ser Glu Lys Glu
35 40 45
Gln Ser Leu Arg Ile Leu Gln Thr Glu Asn Ala Arg Leu Leu Asp Glu
50 55 60
Lys Thr Asp Leu Leu Asn Gln Lys Glu Lys Glu Val Glu Glu Lys Leu
65 70 75 80
Lys Asn Leu Ala Ala Lys Glu Glu Ala Phe Lys Thr Leu Gln Thr Glu
85 90 95
Glu Lys Lys Arg Leu Lys Asn Leu Ile Glu Glu Asn Glu Gly Ile Leu
100 105 110
Arg Glu Ile Lys Gln Ala Lys Asp Ser Lys Ile Gly Glu Thr Tyr Ser
115 120 125
Lys Met Lys Asp Ser Lys Ser Ala Leu Ile Leu Glu Asn Leu Pro Thr
130 135 140
Gln Asn Ala Leu Glu Ile Leu Met Ala Leu Lys Pro Gln Glu Leu Gly
145 150 155 160
Lys Ile Leu Ala Lys Met Asp Pro Lys Lys Ala Ala Ala Leu Thr Glu
165 170 175
Leu Trp Gln Lys Pro Pro Lys Glu Asn Lys Glu Ser Gln Lys Thr Ile
180 185 190
Pro Pro Thr Pro Pro Ile Ala Pro Thr Pro Leu Lys Glu Pro Met Ile
195 200 205
Lys Asp Pro Asn Thr Thr Glu Pro Ala Gly Val
210 215

Claims

1. A method for screening a pathogenic microorganism for a secreted or cell-surface protein, comprising the step of identifying and selecting proline-rich proteins.
2. A method according to claim 1, wherein the protein comprises one or more amino acid motifs having the following general sequences:
(PXm)nP or (PPXq)nPP wherein:
P is proline; X is any amino acid; m is from 1 to 5; q is 1 or 2; and n is 2 or more.
3. A method according to claim 2, wherein m is from 1 to 3.
4. A method according to claim 2, wherein n is 6.
5. A method according to any preceding claim, wherein the method comprises the step of screening a DNA library of a microorganism for DNA encoding a proline-rich amino acid sequence.
6. A method according to any preceding claim, comprising the step of screening and amplification of specific DNA sequences containing a DNA sequence encoding a proline-rich amino acid sequence by PCR.
7. A method according to claim 6, wherein the screening involves the preparation of a degenerate DNA primer encoding a proline-rich amino acid sequence.
8. A proline-rich protein identified by a screening method according to any preceding claim.
9. A proline-rich protein according to claim 8, comprising the sequence <SEQ ID 1>, or a functionally active fragment or derivative thereof.
10. Nucleic acid encoding a proline-rich protein according to claim 8 or claim 9.
11. A vaccine composition comprising a protein according to claim 8 or claim 9, or nucleic acid according to claim 10.
12. A method for producing a vaccine according to claim 11, comprising the step of bringing a proline-rich protein according to claim 8 or claim 9, or nucleic acid according to claim 10, into association with a pharmaceutically acceptable excipient.
EP98910914A 1997-03-24 1998-03-24 Screening method for proline-rich proteins Withdrawn EP0973942A2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
GB9706078 1997-03-24
GBGB9706078.4A GB9706078D0 (en) 1997-03-24 1997-03-24 Screening method and proteins and dna identified thereby
PCT/IB1998/000525 WO1998042868A2 (en) 1997-03-24 1998-03-24 Screening method for proline-rich proteins

Publications (1)

Publication Number Publication Date
EP0973942A2 true EP0973942A2 (en) 2000-01-26

Family

ID=10809774

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98910914A Withdrawn EP0973942A2 (en) 1997-03-24 1998-03-24 Screening method for proline-rich proteins

Country Status (5)

Country Link
EP (1) EP0973942A2 (en)
JP (1) JP2001514527A (en)
CA (1) CA2285480A1 (en)
GB (1) GB9706078D0 (en)
WO (1) WO1998042868A2 (en)

Also Published As

Publication number Publication date
GB9706078D0 (en) 1997-05-14
CA2285480A1 (en) 1998-10-01
JP2001514527A (en) 2001-09-11
WO1998042868A3 (en) 1998-12-23
WO1998042868A2 (en) 1998-10-01

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 19990923

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

17Q First examination report despatched

Effective date: 20020510

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20030722