AU773484B2 - Fimbrial proteins - Google Patents

Publication number
AU773484B2
Authority
AU
Australia
Legal status
Ceased
Application number
AU52628/00A
Other versions
AU5262800A (en
Inventor
Anders Folkesson
Sven Lofdahl
Staffan Normark
Current Assignee
SBL Vaccin AB
Original Assignee
SBL Vaccin AB

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/24 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Enterobacteriaceae (F), e.g. Citrobacter, Serratia, Proteus, Providencia, Morganella, Yersinia
    • C07K14/255 Salmonella (G)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P1/00 Drugs for disorders of the alimentary tract or the digestive system
    • A61P1/04 Drugs for disorders of the alimentary tract or the digestive system for ulcers, gastritis or reflux esophagitis, e.g. antacids, inhibitors of acid secretion, mucosal protectants
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04 Antibacterial agents
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K2039/505 Medicinal preparations containing antigens or antibodies comprising antibodies
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K2039/51 Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
    • A61K2039/52 Bacterial cells; Fungal cells; Protozoal cells
    • A61K2039/523 Bacterial cells; Fungal cells; Protozoal cells expressing foreign proteins
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • A61K2039/51 Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
    • A61K2039/53 DNA (RNA) vaccination
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00 Medicinal preparations containing antigens or antibodies
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A50/00 TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
    • Y02A50/30 Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change

Description

WO 00/73336 PCT/SE00/01079
FIMBRIAL PROTEINS
Field of the invention
The present invention is based on the finding that two fimbrial operons, the saf operon and the tcf operon, are specific for Salmonella enterica subspecies I bacteria and therefore have therapeutic use. Due to their specificity they can be used to provide vaccines against Salmonella enterica subspecies I as well as for detection of Salmonella enterica subspecies I. The saf operon is specific for all Salmonella enterica subspecies I bacteria and the tcf operon is specific for the serovar Typhi of Salmonella enterica subspecies I.
All or part of the DNA-sequences of the genes encoding these proteins can be used as active agents in a vaccine against diseases caused by the Salmonella enterica subspecies I bacterial strains or for detection of said bacterial strains.
The present invention also relates to methods of isolating these fimbrial proteins, to antibodies directed against these proteins, and to a vaccine composition comprising these proteins or antibodies directed against these proteins for use in the treatment of infections caused by the Salmonella spp.
The fimbrial proteins according to the invention or antibodies directed against them can be used for detection of Salmonella spp. bacteria.
Background of the invention
The members of the genus Salmonella colonize and infect a wide range of different organisms. Many cause gastroenteritis and enteric fever in humans and domesticated animals, while others are not associated with human disease (Salyers et al, 1994). The genus has been divided into two species, Salmonella bongori and Salmonella enterica, where enterica can be further subdivided into seven subspecies, designated I, II, IIIa, IIIb, IV, VI, and VII (Reeves et al, 1989).
Salmonella enterica subspecies I consists of over 1300 different serovars and is preferentially associated with warm-blooded animals (Bäumler, 1997). Over 99% of all clinical Salmonella isolates are strains belonging to this subspecies, including serovars Typhimurium and Enteritidis, which are the major causes of Salmonella-induced gastroenteritis in humans, and Typhi, the human-specific causative organism of typhoid fever, the most severe form of human salmonellosis (Popoff and Le Minor, 1992).
Today gastroenteritis and enteric fever can neither be prevented nor treated with good results. Typhoid fever is a substantial public health problem in developing countries.
Each year 33 million people become ill and over 500 000 people die from this infection (American Institute of Medicine, 1986). Typhoid fever can be prevented by vaccination with attenuated bacteria, such as Ty21 and Vi vaccines and whole cell vaccines. Whole cell vaccines show a high incidence of side effects (Ashcroft et al, 1964; Yugoslav Typhoid Commission, 1964). The vaccines consisting of attenuated strains of Salmonella typhi suffer from serious drawbacks. They must be administered as three or four spaced doses in order to stimulate protective immune responses (Levine et al, 1989). The treatment of Salmonella typhi infections with antibiotics is jeopardized since there are strains of Salmonella typhi that are resistant to chloramphenicol, ampicillin, and trimethoprim as well as ciprofloxacin (i.e. multidrug-resistant strains) (Rowe et al, 1997).
Accurate detection of Salmonella enterica subspecies I is today not possible.
Salmonella enterica subspecies I can today only be detected by antibodies directed against surface proteins of Salmonella enterica subspecies I. The use of the sequences according to the invention makes it for the first time possible to rapidly and accurately determine the presence of Salmonella enterica subspecies I.
For many pathogenic bacteria, there is evidence that the filamentous surface protein structures called pili (fimbriae) are connected to the adhesion of the bacteria to the host cells. Pili proteins are very antigenic and are easily purified. Therefore pili preparations have been used as antigens for vaccination.
Summary of the invention
The present invention is based on the identification of two fimbrial proteins that are specific for Salmonella enterica subspecies I bacterial strains. The nucleotide sequences encoding said proteins, as well as the corresponding amino acid sequences, have therapeutic and diagnostic use. Further, recombinant microorganisms may be prepared, in which nucleotide sequences encoding the two fimbrial proteins have been inserted.
Vaccine compositions for use in the treatment of Salmonella enterica infective strains may be prepared using essentially pure Saf and Tcf pili protein of Salmonella enterica subspecies I and Salmonella enterica subspecies I serovar Typhi, respectively, as well as antibodies directed to these pili proteins.
The DNA sequences of the genes encoding the Saf and Tcf proteins can be used for recombinant production of the proteins and for the preparation of vector vaccines against Salmonella enterica subspecies I and Salmonella enterica subspecies I serovar Typhi, respectively, as well as for diagnostic purposes.
Purified Saf and Tcf protein from Salmonella enterica subspecies I bacteria may be used for active or passive immunization of mammals, i.e. these proteins can be comprised in a vaccine composition or be used to raise antibodies which can be comprised in a vaccine composition.
According to the invention there is thus provided an isolated peptide encoded by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 6 for use in medicine.
The invention also provides:
- an antibody directed against a protein of the invention for use in medicine;
- an isolated polynucleotide comprising a nucleotide sequence selected from SEQ ID NO: 1 or SEQ ID NO: 6 for use in medicine;
- a vaccine for protection against diseases caused by Salmonella enterica subspecies I, comprising a protein or an antibody of the invention and a pharmaceutically acceptable carrier;
- a nucleic acid vaccine for protection against diseases caused by Salmonella enterica subspecies I, comprising a polynucleotide of the invention and a pharmaceutically acceptable carrier;
- a vector vaccine for protection against diseases caused by Salmonella enterica subspecies I, comprising a host harbouring a recombinant vector which incorporates a polynucleotide of the invention and a pharmaceutically acceptable carrier;
- a method for protection against diseases caused by Salmonella enterica subspecies I, comprising administering to a patient in need thereof a vaccine of the invention;
- use of a vaccine of the invention in the manufacture of a medicament for protection against diseases caused by Salmonella enterica subspecies I;
- an antibody directed against a peptide of the invention for use in a diagnostic method;
- an isolated peptide of the invention for use in a diagnostic method; and
- primers for, or probes that hybridize with, the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 6, for use in a diagnostic method for the purpose of detecting Salmonella enterica subspecies I.
The invention may be more fully understood by reference to the following drawings and detailed description.
Throughout this specification, the word "comprise", or variations such as "comprised" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
References to any prior art in this specification are not, and should not be taken as, an acknowledgement of any form or suggestion that such prior art forms part of the common general knowledge in Australia.
Brief description of the drawings
Figure 1: Schematic representation of phage clones (named N10, D1, B1, F11) covering the entire cs7 insert of Salmonella enterica serovar Typhimurium strain SR-11 χ3181, i.e. comprising the saf fimbrial operon, i.e. safA, B, C and D (SEQ ID NO: 1).
The clones were selected from partial EcoRI and BamHI libraries in the Lambda Dash II vector. The cs7 insert is represented by a bold line. The extent of each phage insert is represented by horizontal bars. Name and size of the phage inserts are indicated on the left side of the figure.
Figure 2: Schematic representation of the pTY52 cosmid comprising the tcf operon (SEQ ID NO: 6).
A tcf-specific PCR fragment of 11105 bp was cloned into the Expand vector I cosmid (Roche). The insert is represented with a thick black line while vector sequences are represented with thin lines. Relevant restriction site sequences are indicated. The position of the tcf operon, i.e. tcfA, B, C and D (SEQ ID NO: 6), is represented by a shaded arrow.
Figure 3: The phylogenetic distribution of the identified genes on the cs7 insert was investigated using the well defined SARC collection, see Example 1.
Figure 4: A 2 kb internal EcoRI fragment was used as a probe in a Southern blot of the SARC collection, see Example 2.
Sequence listing
SEQ ID NO: 1 - DNA sequence of the genes encoding the precursor of the saf fimbrial unit of Salmonella enterica subspecies I.
SEQ ID NO: 6 - DNA sequence of the genes which encode the precursor of the tcf fimbrial unit of Salmonella enterica subspecies I serovar Typhi.
Deposit information
The phages carrying the inserted SEQ ID NO: 1, i.e. phage clones B1, D1, F11 and N10 (see Figure 1), have been given the ECACC Accession numbers 99051922, 99051923, 99051924, and 99051925, respectively.
The cosmid carrying the inserted SEQ ID NO: 6, i.e. cosmid pTY52 (see Figure 2), has been given the ECACC Accession number 99051926.
The deposits were made on May 19, 1999.
Detailed description of the invention
The present invention is based on the finding that two fimbrial operons, the saf operon and the tcf operon, are specific for Salmonella enterica subspecies I bacteria.
Due to their specificity they can be used to provide vaccines against Salmonella enterica subspecies I as well as detection methods for Salmonella enterica subspecies I. The saf operon is specific for all Salmonella enterica subspecies I bacteria and the tcf operon is specific for the serovar Typhi of Salmonella enterica subspecies I, see Examples 1 and 2.
The main object of the invention relates to two fimbrial operons, the saf operon and the tcf operon, that are specific for Salmonella enterica subspecies I bacteria, for therapeutic use.
Another object of the present invention is to provide vaccines against Salmonella enterica subspecies I induced gastroenteritis, enteric fever and typhoid fever.
A further object of the present invention is to provide methods to detect Salmonella enterica subspecies I. The nucleotide sequences according to the invention are useful for constructing vectors for use as vaccines: for insertion into attenuated bacteria in constructing a recombinant vaccine, for insertion into a viral vector in constructing a recombinant viral vaccine, or for direct inoculation as a nucleic acid vaccine. The pili proteins according to the invention, or antigenic fragments thereof, can be used for active immunization, and antibodies directed against them can be used for passive immunization. All these applications of the sequences according to the invention are obtained by applying standard techniques known to a person of ordinary skill in the art.
Vaccines against Salmonella enterica subspecies I.
The genes encoding the saf and tcf fimbrial structures, or fragments thereof, may be incorporated into a bacterial or viral vaccine comprising recombinant bacteria, virus or fungi which are engineered to produce one or more immunogenic epitopes of the saf or tcf fimbrial structures. In addition, the genes encoding the saf and tcf fimbrial structures, or part thereof, operatively linked to regulatory elements, can be introduced directly as a nucleic acid vaccine, to elicit a protective immune response.
The proteins, or antigenic fragments thereof, deduced from the nucleic acid sequences of the present invention are useful alone or in conventional vaccine mixtures in the vaccine compositions according to the invention. The proteins could be produced by chemical synthesis or recombinant expression according to conventional methods.
The proteins and peptides according to the invention can be obtained by using a host organism transformed or transfected with an expression vector obtained by insertion of a gene according to the invention, or part thereof, into a vector in a conventional manner. The vector which is used to construct the expression vector is not particularly limited, but specific examples include plasmids such as pET (Stratagene) and the like; and phages such as M13 (NEB), phage display libraries and the like. T7 promoters and lac promoters, among others, can be used as expression regulatory sequences.
An appropriate host to be transformed or transfected with the expression vector can be chosen from, for example, E. coli, Salmonella or Bacillus subtilis. The transformed or transfected host is cultured and proliferated under suitable conditions.
After culturing, the peptides of the present invention may be purified by, for example, chromatography, precipitation, and/or density gradient centrifugation. The thus obtained peptides can be used as a vaccine or for the production of antibodies directed against said peptides, which can be used for passive immunization.
The purified preparation containing one or several proteins according to the invention, or parts thereof, is then formulated as a pharmaceutical composition, as for example a vaccine, or in a mixture with adjuvants. If desired the proteins are fragmented by standard chemical or enzymatic techniques to produce antigenic segments.
In formulating the vaccine compositions with the peptide or protein, alone or in various combinations, the immunogen is adjusted to an appropriate concentration and formulated with any suitable vaccine adjuvant. The immunogen may also be incorporated into liposomes, or conjugated to polysaccharides and/or other polymers for use in a vaccine formulation.
The different vaccines according to the present invention can be administered to mammals in many different ways. These include intradermal, intramuscular, intraperitoneal, intravenous, subcutaneous, oral, and intranasal routes of administration. The vaccine doses will differ depending on circumstances such as body weight, interferences with other administered medicaments etc. The upper limit is not critical unless the dose shows toxicity.
The peptides and proteins of the present invention are also useful to produce monoclonal or polyclonal antibodies for use in passive immunotherapy against Salmonella enterica subspecies I. Human immunoglobulin is preferred.
Antiserum is obtained from individuals immunized with proteins or peptides according to the invention. The immunoglobulin fraction is then enriched, for example by immunoaffinity or affinity chromatography. Antibodies raised in a suitable mammal, or in the patient to be treated, can subsequently be administered locally or topically, e.g. orally, to the patient.
Detection of Salmonella enterica subspecies I in general.
The sequences according to the invention, or part thereof, or fragments hybridizing therewith, as well as the proteins according to the invention, or part thereof, and antibodies directed to said proteins, or antigenic fragments thereof, can be used in molecular diagnostic assays for the detection of Salmonella enterica subspecies I.
Nucleic acids having the nucleotide sequence according to the invention, or any nucleotide sequence hybridizing therewith, can be used as a probe in nucleic acid hybridization assays for the detection of Salmonella spp. in various tissues and body fluids of patients. The hybridization assay may be of any type, including Southern blots, Northern blots and colony blots.
PCR technology is the most preferred technology for detection of Salmonella enterica subspecies I according to the invention. At least one primer selected from the 5' end and one from the 3' end of the sequence can be used in PCR and other known tests to rapidly identify the presence of Salmonella enterica subspecies I. This is done according to conventional techniques.
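The primer-pair logic described above can be illustrated with a small in-silico sketch in Python. It is not part of the specification: the function names, the toy template and the toy primers are hypothetical, and only exact matches are considered, whereas a real assay tolerates some mismatch and depends on reaction conditions.

```python
# Minimal in-silico PCR sketch (illustrative only; all names and
# sequences here are hypothetical, not taken from SEQ ID NO: 1 or 6).

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def amplicon_length(template: str, forward: str, reverse: str):
    """Length of the product if both primers match exactly, else None.

    The forward primer is searched on the given strand. The reverse
    primer anneals to the opposite strand, so its reverse complement
    is searched downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


if __name__ == "__main__":
    fwd, rev = "ATGCCGTA", "TTGACCGA"
    toy_template = "GG" + fwd + "CCCCC" + reverse_complement(rev) + "GG"
    print(amplicon_length(toy_template, fwd, rev))  # product spans both primer sites
    print(amplicon_length("GGGGGGGGGG", fwd, rev))  # no primer site, no product
```

A target is scored as present only when both primer sites are found in the expected orientation, which mirrors the requirement for one primer from each end of the sequence.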
The isolated and purified proteins and peptides of the invention can be used as diagnostics to measure an increase in serum titer of Salmonella enterica subspecies I-specific antibody since they bind strongly to these antibodies. A serum test sample can be screened for Salmonella enterica subspecies I by methods such as for example ELISA.
The invention further comprises the use of antibodies directed against the saf and tcf fimbrial structures for quantitative or qualitative determinations of the pili proteins of the invention, or fractions thereof, in cells, tissues or body fluids.
Detection of Salmonella enterica subspecies I by using nucleic acid hybridization technology
Nucleic acid hybridization technology can also be used to detect Salmonella enterica subspecies I according to the invention. The nucleic acid probes chosen from parts of the sequences according to the invention can be either DNA or RNA.
DNA sequences complementary to the sequences according to the invention can also be used. The binding of the probe to the target sequence, i.e. the hybridization, need not be perfect. Variations and mutations of the sequences according to the invention can be used as long as they hybridize well enough to detect Salmonella enterica subspecies I. The preferred length of the nucleic acid probes is about 10 to 400 nucleotides, most preferably not longer than 100 nucleotides.
The nucleotide probe is preferably chosen from the parts of the sequences that have the least variation. In the most preferred embodiments, when screening for SEQ ID NO: 1 (the saf operon, specific for Salmonella enterica subspecies I), a nucleotide probe or PCR primer selected from nucleotides 37368-37868 should be avoided since this region is hypervariable.
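The selection rules above (a length of about 10 to 400 nucleotides, preferably at most 100, and avoidance of nucleotides 37368-37868 of SEQ ID NO: 1) can be written as a simple screening function. The Python sketch below is illustrative only: the function names and return labels are hypothetical, coordinates are assumed to be 1-based and inclusive, and the "about" in the length range is treated as a hard bound for simplicity.

```python
# Illustrative probe screening under the rules stated in the text.
# Assumptions: 1-based inclusive coordinates on SEQ ID NO: 1; the
# approximate length limits are applied as hard cut-offs.

HYPERVARIABLE = (37368, 37868)  # region the text says to avoid
MIN_LEN, MAX_LEN, PREFERRED_MAX = 10, 400, 100


def overlaps(start: int, end: int, region=HYPERVARIABLE) -> bool:
    """True if the interval [start, end] shares any position with region."""
    return start <= region[1] and end >= region[0]


def assess_probe(start: int, end: int) -> str:
    """Classify a candidate probe by length and position."""
    length = end - start + 1
    if length < MIN_LEN or length > MAX_LEN:
        return "rejected: length"
    if overlaps(start, end):
        return "rejected: hypervariable region"
    return "preferred" if length <= PREFERRED_MAX else "acceptable"


if __name__ == "__main__":
    for candidate in [(38000, 38059), (38000, 38199), (37800, 37900), (38000, 38004)]:
        print(candidate, assess_probe(*candidate))
```

A fuller screen would also score sequence conservation across serovars, which the text names as the primary selection criterion but for which no data are given here.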
The nucleic acid probes according to the invention are prepared by any conventional method such as organic synthesis, recombinant techniques, or isolation from genomic DNA.
The nucleic acid probes of the invention are labeled in a conventional manner to signal hybridization to target nucleic acid from Salmonella enterica subspecies I. The labeling may comprise a radiolabel, an enzyme, a bacterial label, a fluorescent label, an antibody, an antigen, a latex particle, an electron dense compound, or a light scattering particle.
The probes may be provided in a lyophilized form, to be reconstituted in a buffer appropriate for hybridization, or the probes may already be present in such a buffer. The buffer may contain a suitable hybridization enhancer, detergent, carrier DNA, and a compound to increase the specificity.
Any conventional hybridization assay technique, such as dot blot hybridization, Southern blotting, sandwich hybridization, displacement hybridization and the like, can be used.
The target analyte polynucleotide of a microorganism may be in various media, most often in a biological, or physiological specimen. In most cases it is preferred to subject the specimen containing the target polynucleotide to any conventional extraction, purification, and/or isolation before conducting the analysis.
The sample containing the target analyte nucleotide sequence must often be treated to convert the DNA to a single-stranded form, which may be accomplished by a variety of conventional techniques, such as thermal or chemical techniques.
The following examples describe the isolation and specificity of the sequences according to the invention.
EXAMPLE 1
Identification and characterization of the saf operon.
The present inventors found, upon investigation of a 7 kb chromosomal region on centisome 7 originally isolated from the S. typhimurium strain SR-11 χ3181, a region that exhibits many of the traits that define a pathogenicity island. It has a lower G+C composition than the average composition of the Salmonella genome and includes many sequences related to different mobile genetic elements. The region is not present in E. coli K12, and the Salmonella-specific DNA is inserted between the tRNA gene aspV and the stop codon of yafV, a hypothetical protein upstream of the yafHI gene at 5 min in the E. coli chromosome. This Salmonella-specific insert encodes proteins creating adhesive structures and other virulence factors. Sequencing revealed genes encoding a new fimbrial operon that they designated Salmonella Atypical Fimbriae (saf), due to its relatedness to a subgroup of adhesive structures forming thin atypical fimbriae or non-fimbrial adhesins.
The saf operon consists of four contiguous genes, safA, safB, safC and safD, that encode a fimbrial subunit, a periplasmic chaperone, an outer membrane usher protein and an alternative fimbrial subunit, respectively. The genes safA, safB, safC and safD encode putative proteins of 166, 244, 836 and 156 amino acids, respectively. Analyses of clinical Salmonella isolates showed that DNA of 195 out of 198 clinical isolates belonging to S. enterica subspecies I hybridized with safB and safC, i.e. these sequences are present in about 98% of the clinical Salmonella enterica subspecies I isolates tested. The inventors showed that 58% of these clinical isolates carry safA, see Table 1.
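The figures in this paragraph can be cross-checked with simple arithmetic. The Python sketch below is illustrative only: it takes the CDS coordinates for safA, safB and safC as read from the sequence listing for SEQ ID NO: 1, and assumes that each range is 1-based, inclusive and ends with a single stop codon. The function names are hypothetical.

```python
# Cross-check of protein lengths and hybridization frequency.
# Assumption: each CDS range includes one terminal stop codon.

CDS_COORDS = {
    "safA": (37368, 37868),
    "safB": (37952, 38689),
    "safC": (38713, 41223),
}


def protein_length(start: int, end: int) -> int:
    """Number of amino acids encoded by a CDS that ends in a stop codon."""
    n_nt = end - start + 1
    if n_nt % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_nt // 3 - 1  # subtract the stop codon


def percent(part: int, total: int) -> float:
    """Share of `part` in `total`, in percent."""
    return 100.0 * part / total


if __name__ == "__main__":
    for gene, (start, end) in CDS_COORDS.items():
        print(gene, protein_length(start, end))
    print(round(percent(195, 198), 1))
```

Under these assumptions the arithmetic gives 166 and 836 residues for safA and safC, as stated above, and 245 for safB, one more than the stated 244; 195 of 198 isolates corresponds to roughly 98.5%.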
Table 1. The prevalence of the saf genes in clinical Salmonella isolates.
Serovar                       safA  safB  safC  Isolates
S. adelaide                                     1
S. agona                                        6
S. anatum                                       3
S. bareilly                                     3
S. blockley                                     3
S. bovismorbificans
S. braenderup                                   4
S. brandenburg                                  1
S. bredeney
S. chester                                      1
S. colindale                                    1
S. derby                                        1
S. dublin                                       1
S. eastbourne                                   2
S. emek                                         1
S. enteritidis                                  8
S. give                                         1
S. goettingen                                   1
S. haardt
S. hadar                                        16
S. heidelberg                                   1
S. hvittingfoss
S. infantis                                     6
S. java                                         1
S. javiana                                      1
S. kottbus                                      1
S. livingstone
S. london                                       1
S. maastricht                                   2
S. mbandaka                                     3
S. montevideo                                   1
S. muenster                                     1
S. newport                                      2
S. ohio                                         1
S. oranienburg                                  2
S. panama                                       3
S. potsdam                                      1
S. rissen                                       1
S. saarbrucken                                  1
S. saintpaul                                    3
S. schwarzengrund                               1
S. singapore                                    1
S. stanley
S. subsp                                        2
S. subsp I 4,5,12:b:-                           1
S. subsp I 4,5,12:i:-                           1
S. subsp I spont                                1
S. tennessee                                    2
S. thompson                                     1
S. typhi                                        1
S. typhimurium                                  27
S. virchow                                      7
S. weltevreden                                  1
S. worthington                                  2
S. subsp I                                      1
The phylogenetic distribution of the identified genes on the cs7 insert was investigated using the well-defined SARC collection, which showed that the presence of the safA, safB, safC and safD genes is restricted to S. enterica subspecies I (Fig. 3). This region is hence the first subspecies I specific genetic region to be identified with a broad distribution within the subspecies. Since the serovars of subspecies I account for over 99% of human salmonellosis and are preferentially associated with warm-blooded animals, this implicates a role for the saf adhesive organelle in the colonization of these organisms.
EXAMPLE 2
Identification and characterization of the tcf operon.
The present inventors found that Salmonella enterica subspecies I serovar Typhi contains DNA encoding an additional fimbrial operon, the tcf operon, in the sinR-pagN intergenic region. Southern blot analysis revealed a markedly different restriction pattern in S. enterica serovar Typhi than the other subspecies I isolates, suggesting that the saf-sin region in serovar Typhi might carry additional DNA relative to serovar Typhimurium strains. A PCR reaction (using a kit from Roche) was therefore performed using a sinR (5'-GTA AAT CGC TTA GTC GCC-3') specific forward primer and a pagN (5'-TCA ACT CAA CCT TCA GCC-3') specific reverse primer.
This primer pair produced, as expected, a product of 2 kb in serovar Typhimurium from the SARC collection, while from serovar Typhi the product was 10 kb. Thus, the sinR and pagN genes, which are neighbors in serovar Typhimurium strains, are separated by approximately 8 kb of additional DNA in serovar Typhi.
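The primer pair and the size comparison in this example lend themselves to a short worked calculation. The Python sketch below is illustrative only: the primer sequences and product sizes are taken from the text, while the melting-temperature estimate uses the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), which is an added assumption and not a method described in the specification.

```python
# Basic properties of the sinR/pagN primer pair and the size of the
# additional DNA in serovar Typhi. The Wallace rule is a rough
# estimate for short oligonucleotides and is an assumption here.

SINR_FORWARD = "GTAAATCGCTTAGTCGCC"  # 5'-GTA AAT CGC TTA GTC GCC-3'
PAGN_REVERSE = "TCAACTCAACCTTCAGCC"  # 5'-TCA ACT CAA CCT TCA GCC-3'


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C (Wallace rule)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def extra_dna_bp(product_with_insert_bp: int, product_without_bp: int) -> int:
    """Size difference between the two PCR products."""
    return product_with_insert_bp - product_without_bp


if __name__ == "__main__":
    for name, primer in [("sinR", SINR_FORWARD), ("pagN", PAGN_REVERSE)]:
        print(name, len(primer), gc_fraction(primer), wallace_tm(primer))
    print(extra_dna_bp(10_000, 2_000))  # Typhi product minus Typhimurium product
```

Both primers are 18 nucleotides long with equal G+C content, so the two rule-of-thumb estimates coincide, and the difference between the 10 kb and 2 kb products accounts for the approximately 8 kb of additional DNA.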
The Typhi specific PCR product was purified, digested partially with EcoRI and sub-cloned into pUC18 forming a set of overlapping clones. Sequencing of the clones revealed a putative fimbrial operon designated tcffor Typhi Colonizing Factor. Four ORFs, tcfA,B,C,D, have been identified with putative proteins having significant homology to CooB (38% identical over 192 aa), CooA (37% identical over 170 aa), CooC (34% identical over 872 aa) and CooD (31% identical over 272 aa), respectively. The Coo proteins are involved in the biosynthesis of the CS1 colonizing factor antigens of enterotoxigenic E.coli (Fig.
4) (Froehlich et al., 1994). The peptide of the tcfB ORF is also homologous to the CblA major fimbrial subunit protein (45% identical over 154 aa) of the cable WO 00/73336 PCT/SE00/01079 13 type II pili of the cystic fibrosis-associated Burkholderia cepacia(Sajjan et al., 1995). Down-stream of the tcf-operon two ORFs were identified with the same transcriptional orientation as the tcfgenes. The first was designated tinR for Typhi insert regulator because it is homologous (33% identical over 144 aa) to AzlB of Bacillus subtilis, a member of the Lrp/AsnC family of transcriptional regulators (Belitsky et al., 1997). tinR is followed by an ORF (tioA for Typhi insert orf) encoding a putative protein of 205 amino acids with no significant homologies to anything in the DDBJ/EMBL/GenBank databases. The above sequence from Salmonella enterica serovar Typhi strain RKS 3333 and the tcf region of the incomplete genome sequence from serovar Typhi strain CT18 http:// www.sanger.ac.uk) are 99% identical over the total length of the investigated region in concordance with the clonal nature of the serovar.
A 2 kb internal EcoRI fragment was used as a probe in a Southern blot of the SARC collection. The blot shows that Salmonella enterica subspecies I serovar Typhi (SARC2) is the only strain in the collection possessing DNA that hybridizes to this fragment (Fig. 4).
References:
American Institute of Medicine. (1986) New vaccine development: establishing priorities. Washington, DC: National Academy Press.
Ashcroft, M. Ritchie, J. Nicholson, C. C. (1964) Am. J. Hyg. 79:196-206.
Levine, M. Taylor, D. Ferreccio, C. (1989) Pediat. Infect. Dis. 8:374.
Popoff, M. Y. Le Minor, L. (1992) Antigenic formulas of the Salmonella serovars (WHO Collaborating Center for Reference and Research on Salmonella, Institute Pasteur, Paris).
Reeves, M. Evins, G. Heiba, A. Plikaytis, B. D. Farmer III, J. J. (1989) J. Clin. Microbiol. 27:313-320.
Rowe, Ward, and Threlfall, E.J. (1997) Clinical Infectious Disease 24:(Suppl 1) S106-9.
Salyers, A. A. Whitt, D. D. (1994) Bacterial Pathogenesis: A molecular approach. (ASM Press, Washington).
Yugoslav Typhoid Commission. (1964) Bull. WHO 30:623-30.
EDITORIAL NOTE
APPLICATION NUMBER 52628/00
The following Sequence Listing pages 1 to 47 are part of the description. The claims pages follow on pages 15 to 17.
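The Sequence Listing annotates four CDS features on SEQ ID NO: 1 (safA, safB, safC and safD) by start and end coordinates. As a minimal sketch, assuming those coordinates are 1-based and inclusive, the following Python snippet derives each feature's length in nucleotides, checks that it is a whole number of codons, and reports the gaps between adjacent features. The dictionary `CDS_FEATURES` and the function names are hypothetical; whether a codon count includes a stop codon depends on how the feature was annotated, which is not stated here.

```python
# CDS coordinates as annotated for SEQ ID NO: 1 in the Sequence Listing.
# Assumption: coordinates are 1-based and inclusive at both ends.
CDS_FEATURES = {
    "safA": (37368, 37868),
    "safB": (37952, 38689),
    "safC": (38713, 41223),
    "safD": (41245, 41715),
}


def cds_length(start, end):
    """Length in nucleotides of an inclusive coordinate range."""
    return end - start + 1


def codon_count(start, end):
    """Number of codons in the range; raises if the length is not a multiple of 3."""
    length = cds_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"length {length} is not a multiple of 3")
    return length // 3


def intergenic_gaps(features):
    """Nucleotides lying between each pair of adjacent features, in map order."""
    ordered = sorted(features.items(), key=lambda item: item[1][0])
    gaps = {}
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        gaps[f"{name_a}-{name_b}"] = start_b - end_a - 1
    return gaps


if __name__ == "__main__":
    for name, (start, end) in CDS_FEATURES.items():
        print(name, cds_length(start, end), "nt,", codon_count(start, end), "codons")
    print(intergenic_gaps(CDS_FEATURES))
```

A check of this kind is a quick way to catch transcription errors in feature coordinates, since a CDS whose length is not divisible by three would raise an error.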
SEQUENCE LISTING
<110> FOLKESSON, Anders; NORMARK, Staffan; LÖFDAHL, Sven
<120> Fimbrial proteins
<130> ABR 022 US
<140> PCT/SE00/01079
<141> 2000-05-26
<150> SE/9901961-4
<151> 1999-05-28
<160> 11
<170> PatentIn Ver. 2.1
<210> 1
<211> 46870
<212> DNA
<213> Salmonella typhimurium
<220>
<221> CDS
<222> (37368)..(37868)
<223> safA putative fimbrial subunit
<220>
<221> CDS
<222> (37952)..(38689)
<223> safB putative periplasmic chaperone
<220>
<221> CDS
<222> (38713)..(41223)
<223> safC putative outer membrane usher
<220>
<221> CDS
<222> (41245)..(41715)
<223> safD putative fimbrial subunit
<400> 1
gatacaaatc aaagggatac caaatgggcc gctattttga ggacatgttc aaccgccaga gatctcattg tgggatggtc ttccagcgca accgggtggc aaccgtgcct ctgtaccacg tcagggtgtt ggtttttttt tacgcctgga tcccactgac agcgcagaag catgcgcatc atcacctccg gggctactg tcattacgat gcgttgacag gcggggtctg tcaggcagca tttatacatc cgtcttcaag tgacgaacag atttaaggcg cagactgcgt atcactccag agaaccgttt gttcactgct cgcgaattgg gcggcacatc gcactggcgc gcgctaacaa ctgtgaagta aagttccacc gatattaccg cggcctcatg aatgttgata
ggcatcccac tcccaccagt ctcaaaccag ccccggttca ctctgcccgt tggcgacgga ctgccgcagg aaaaaaaccg gtctatcgtg ccacttcttt gcggtgctta tcactcagat ttctccagca
ctttcagcct cgacgaatgc gcattgtgtg gccggaacgg gggatttccg cgtgaaaaat tatcactgta gaatctggcg cactgtcatg accgggatcg aattacggag actccaccg ggcgtaacag accgcaggcg ggagggatat cattatccat gtgtggttgt ctggagccag
atcgcctaat aaacgccgcc ttcgggggcc ctgtaggcgc ttcgtgatcc ttctgcggat gctcagttct gcgactgcgc cgcatcacgt ttgcgcctcc cggatcgggg attactctca catgcatggc caggcggtgc tgaacgtacg ggtaatcctc cggcctggcc gggtgatttc gtacgctgaa ccagatgggc gtagatcctg ccagcccgcc ggtggtgaaa tcagtgccag tgtacaggct caaaggtcat cttgcctgcc gtaacgccag ccggttctga gagaaggtat taaggactga ccatagacca atcaggctgc gtctcacgcg agcggtaatg agcgcatcag agggcggcac agcagcgccg gcgccttcgc gccagttgtg agcagcagga tcaatttccg gcctgggggg ccgcagggtt gtgggttaag tttacctgcc ggcatgaggc caccagttct cagcgtgagc gataagcgcc atcgttcgcc atagtgcgcc ggaggctgcc cccgctgaga accagcggac aatattggaa ctcggtatag caggtgcagg cgtcggcgtc ggagtatcac ggggttttct ccggccaggt taatggaaac taccggaaaa acaggccgga gcacctggcg cccacacttg ccccacgggc ccgggcgggg tgtttgacag ccagggcatt cggtaagcag tgagcatttc tcaacacccg gccagttcac cggcgcggct gttccgcgct ccgtgttcag caggccgtcc ttcaactgca gtcagcacgg gggccgataa ccgttgccca accgttatcc agcgtggcgg aggcgaacat ctggcgatat tgggcgctgc aaatgaacca tgcagcggca cgt cca tcc t tgtcccagac agagc tgggc gcgcctggtt catgacatgc atggcgggca tccttcttcg gccgggcagg cgactgcggg caggcgttct gcgctttcct ttcgaggtcg gccagttgct agcggataac gcgcgctcca cgtaataccg cgaatgtccg actgcgcgcc agcgacgcat ctgtggatgg agtgcatcag cgccagtccc tgtgcgccag gatatcctta ctgcgcgtga ggcttccggc gtactgacaa gaacagcagc aggtattccg taccggttta gtcagggaca gttcaaaaat catgccggtc agccaagccg gctcaatttc gcagattgtt ttagccggaa gcaccgaaga agtgcagggc ccagcccgtc gatctgacag aaaacgcaaa gccatgcctg gcgtcggatc acgggccgtt gccagctgac ggaatttttc atccacgcgc gagagcgccg ccgcgttgca atatagtgtt tccacggtca cggcgggtca ctgtcgtgag ctgcgcgccg ccagcaatat gccgccgggc tttgtactgc ctgcgcgggc gaattctttc aggattgcgc cgggctttcg acggtcgtgt ccagaactgc gtgacgactc tggcccccat ctgggcgatt catccgtggc cgcatcaggt gtcatacgct t cgagtgcag ccagcaccag gttcgcaact ccggggtggt ccgggataaa gcgcttcgcg ggcggacgga catcccgtac ccatcacgcc cttcaataga accagattac cctgctggat gcagacggcg aatcaccata gatcgtactc tgtccatatc accggacggt ccgatcaccg tcaaaaccga agaaaacgca atatccggta 
gtgcccagac accgcaaaag gattcgcgta gcgctttccc tccggtttgt gtcagccagc tctgagcggt agtccgaggc tcccgtggcg ttgccagata tcaggaatcc cccccgttgc cgtcatctgc accgaacagg gacgccccgg tt tgcagcca cagattcagc 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 .9 9 9 9 .9 99 9 9.
9.9.99 .9 agatcgcgca aacgacaact gggcgcggcg tcatcccggc acggtgatat aagact tccg cggcgcggtt gtgtgataca tcctgaacag tccgtcacgc ggcgtacaga aggcggttaa gacaatccgg agattatggc ccttcgtgcg tcaccggcca cgttcttcgc agttccccga agatagcggt gacagcggcc tccgggatgg ccttcctgcg ttggggctga agctgagtgc gcaatatcgg tcagcaccgg tccatgaata ctgcgtacca tgtagcgtgt gcggcgttca tcccacttat gacgccgccg agcaaaaagg tcccggtcaa gcgcctggcc gccggatcag gtctgataag cattacgtgg ggcgcagatt agccggtata cacgacgcag gcggacgaaa aaaacacctc tgtgagtgac acaggctgaa gcagtatgac tcggggtgaa cgtgaaacac ccaccgggtg gtgtggcgac cgcagaggta aggttcgcag gcagggcggg acagcgtgac gggaaacgaa tatcgggata cgacttccag gggcgctcag ttccctgcat tctcccgcag atgccctgta gaaactccgt tatagctacc gccggggttc gctcatacat cctgcggata agagatcccg ttttctgata gccggtgcgg ccgccaggcc acccacgccc aataaggcag ttccgggtag gggggttcgg tgaaaaataa aatcatcttc gtaatccagc ttcaatccgg ctgtgcagcg aatttccgcc gaagtaaaac gttccaggcg ttgaagattc ggcgctggta aaagggcagc ggtgatgcgc catatccggc gtcctggctg cgcggtatcg cagctttacc taaccgctgt aaagctgaac ccccagtcgg gtagctcagt ttaaaagaac gggatacggc ctgccgggtt gaaacggata aaagctgccc gcgccgggCg gcgaatgatc cggagcattg tgatccagat atttcgcgtt gccaccggaa ggcagatcgc ggcgcttcat gtgccatagc cgcccgtggt cgtgtggttt ggacgggtac gtggtggtgc tccgtctgat acattgccct cgttccggac agcggtagca acattcagtt tgtagcagct cgtgccggac aatgccccgg ggcgcggcgg ctgcgaaact cgcggcacgg actgccatcg gtgaagcggg gcctcgatca gcggcaattt tcgcggttgt gtaatgcggc tgcgtcagaa t tatcgagca atggcgcgcc gccagcggcg tcgatgtcac tccttcagtt tcacacagcc cggccagcgg cagccagcgg tcgccgcatc ggttagtcac gctgatcaac ggcgggcgtt tgccttcgtc ccgcttccag gatccaccac gaggaagcag gaatcagcca gcaccttttg aggcgaaaaa gcccctgccc cgccgtcaaa caaacaggtg cggccagctc cgacatgaat tcaggcgtac ggcaggcggt tgaccccttt atggcgtggg gaaactcggc tgcgt tccac tgggatggag aatactccag tcagctccat tttgtccgcg agggcgtgac ggatcgcctc gcaggccata cctcgtggct gtaccggcgt gatcaaacag cagataatta cggctgcggc tacggtcaga cattgccgtg cagcgagagg ttctgacgag attattacgg gccttcgact cagatgctgt gttgatcacc 
gtccggcggc taacccggca ttcatgcagc tggctccagc gtgaccgggt tgacgcaatg gctgaaagtc attaggcggc ctcctcgatg attttcgcct cgccaggtcg ggtgacgtaa gtcaatt tt t atacgggtcg ggtggcgaac tagccgtgga atccagcgcg aatttcaaac cctgagtgtg gctgatatcg gttgagcact aatggtattg gacgctgata cgtgggcagc 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 S* Se
0* aggtggttag aggtacaatg gccccagcgc gacgcagggc gcgtatcgcc tcggttgcgg cggtgactag cgccgccgtt tatcaacctc cgggctgcgg tttcgcagcg tggcgcatag gaaacaggct tctgaatccc tgtcagtttt ctatcgatgc gcggttttcc gccagccagt ccgggcgcgc gacgtcaaat ttttcctgcc ctatccgctt gggttctgcg gattcagatg cgggcaaggc cggtgatgag acaatccacc tgctcaaaaa tactgtgcgt ttcattatgt cataaataat ttcgtcgtga ttctgcaatt cgggtgtaaa ggccatgtcc ccgcacggtg gtcggatgca attcaccagg cgat tccagc ctcaagtacc ccactgcccg accggtatcg cagaagaaaa aaccagctcc ctgaagctgg gtggcggcag ctccgcaata tttcatgggc caataatacg cccccagttc gcatttcacc ctgccaccat cctgggtgac cggagggccg catggccagt gattgagctg actgaaccaa accaggatga gttccatcac acgcctcttc gcggaatgta tatccgttat cccttaaatg tttaactaaa atgcattgtt gcgattttta acgagacgtg agcaggctga gtttcgccgg ctttccgaac gtgatattaa gagcgtattt gggccaaggc cccgtgtctg tcatgacagg ccgggtcgtt cggtacagcc tgtagcgccc agcgctttcg gagtattcct ggtatttcca atgctcgcgg cggcgagagc ctcaccggtc atcctgtaac agcgtcttca caatgcctgc gtaagaggat ttgttgttta ctgaggaaac cagaccccat ctgctcattg gtggcgaaca gggggcggcg accctgcgac tatttatttt tttataggga gtatgcatag tcttgctgtt ctcatgcgcc tatcgccgtg ggccgtcctg cgctgtaacg cgggtttcca gcgaaaacgg gggaatcgct ttatggcttc cgagtgccgc gttcaccctg gggcctcctg gcgaccagtc ccggattttc gtatcagcgc tttttctgtg cgggtcagat agttgcagga gtgtccagcg gtgtcggtgt tttcggggaa tgatattcct gcggtttccc aaccagtcca gaaacatcgg tcaatcgtat acggcacagc agactgagcg gcagtaaata tgtattttcc tctggaggaa gtcattattg atgtcttttt cattgattat atcgttttcc gctggtcagc ccaggcggt t ggtaaagagc cagcagatca cagccagata gtcgctcgcc cagtgcggta cagcagtgac aaaaacgg tg ggtataattg t ccggcca ca ccgaacctgc ggacagggta tcggagtgat cgggcagtat tatcgggcat tatccagaat cgcccttttt acggtttagc ggtaaagctg ggacgggaat gatccaggta gacgggact c ctccatcatt acagttcatc cgatatcgtc cgatgtatgt gtaattccca acgt ttaggg atgacaccct tgaaatat ta ttaatattaa tgagcatgaa cacactttct tccctgcaca caggcgccgt gtcaggcgcg tacacgccgc cagtcaaatg ttacggtgtt tccacccaaa tggcggaaca gcctccatcc cacagcaact tgctccgcca gcaggaagcg tcggtagtgg 
ctcgdtatgt tgattccatc ggcatcaata atcgtataac acggaggggc gtgtagtgtg aatatcgaac ct cagcaacg atcattggtg cagccgcata agagtgatta tttcccatgt ccctgcggga catgtcttgt agttttaatt ttttattatt tttcttttaa cggcttagtt 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060 6120 6180 6240 6300 6360 6420 6480 6540 6600 6660 6720 0 0 0 gctaattagt aaagaattgt aaggcataac ccactattcc gttgagctga gttcttcggc gatatgctgc gttgaaaagg tggttgctgc gcgccgctgg acatcgccgg ggtgaaagca tactgtcagg gagcatgaaa ctgactggtg gcgcaggggg gctctgttgg gaagaggccg gtgggcgcgg gcgcgcggca gagaaagatc atccccgcaa ctgattctgg cggcagttgc acgctgcaca atggaacggt gatgcgctga tggcagcggg gqtgatgaca gcgccggtgg accggtattc gcccgactgg attcagaccg gccgggccgt ccctgatctg ttatatattt acatggaaac ggtcgctgga ctcactggct attaccagat ccgccggggc cctggatgct tggccttgtt ccacgcttcc aagcgcagga gtcaggcgat acatgacggc tccgcaccat aggcgggcgt aagtgccgcc ccggagccag ggcgctcgcc gcggcgcatc ccctgcggac cggcgctgac tggaaatggt atgaggcggt cggataaggc cgccgcctgc cgctgttgca tggcgcgaat agctggaact aaaccacgct tgttcccgga ctgctgggcg cgcaacgcgt ccagggcggg ccggtgtcgg tatcattgtt atatgcattg tcctgtttca atcggcaacg gcaccagtta ccctctttct cagcgccatt ggcgagcgtc gaccacgcca ggttgatgaa gcgcccttac tcccaacggc acaggcgcgc gacggatatt cgggaaaacg cgcgctgcgg catgaaaggc gcagccggtt cggcacgggc tatcggcgcc ccgtcgtttt gcgtggtctg acgtgcggcg catcagcctg cagcgtacag gaagcaggaa tttctcgctc ggtacatacg gcaacaggcc agtcagcgcg catggtgaaa taccgggcaa actgggcgat taaaaccgaa ttgtttcgat atgcattatt cgcagtgcgt gcattttgca acacagcagc gatgtggaga agtgattttt cgttacggcg gaactgcgtc ctgacggaaa gacggctccg gggcaggacg gacggcaaaa ctgctgcgcc gcggtcgtcg gaagtacggc gagtttgaat attctgtttg gatgccgcta accacctgga caggtgttgc gtggatacgc gtacagcttt ctggataccg ttcctgcgcc aaaatgggga aacaatgaac ttgcaggaac gaaacggcgc gcggttgtcg gatgaggcca gacggcgcgc ccacgcaaac accgcgctgg attttttcga tttatgaatt tgtatggaaa aactacgctc ccgataacga aagcgttact ctcaccatat ataacaaaat gggtactgag tactgccctc gcctggcatc gtaaatccgc 
tcgacccggt gtcgccagaa aaggttttgc tactggcgct cgcgtctgaa ttgatgaagt acctgctgaa gcgaatacaa agattgccga tggaaaaaca ctcaccgcta ccgcggcccg agcagctaaa ttcagtcaga tgactgcatc tgcgtctcgc taagggagtg cggcgattgt gccaggtgct tggcgcagat cggtgggcgt cgctggcgga ggctatcaat tttatgtcac actggccggc taatccctgg tattctccac ccggcaactg cgatctcagc tcgcagcggc cagtatctgc gttgatcgaa agccattccc gctggcaaaa gacggggcgt taatccacta cctcgcgatt ggacgttggc agggttactg tcacactctg accggcgctg gcgccatatt accggaagag ccataacgta cattcccgcc cgtggcgctg agcggcggaa tgagcggcgc cgaatcccgc agagtctgat gcagggcgac cgccgactgg ggaactgcct tggtgaacgt gtttatgctg ggctatctac 6780 6840 6900 6960 7020 7080 7140 7200 7260 7320 7380 7440 7500 7560 7620 7680 7740 7800 7860 7920 7980 8040 8100 8160 8220 8280 8340 8400 8460 8520 8580 8640 8700 8760 -:Do *Do** 0.
050*0 ggcggcgagc tccacgctga gaagcggtgc catgacgtcc ggtacacatg ct cat cagecc gcgctaatgc atcccttaec eggctggtgg gtcgcacata tacatcgaac aaagccgcga atcgtttgct tatacegete aaeaatageg gaegtagaga gccgatetgg gatattgata getgtgeega atggacgact gaagcccgca aatetggtea aaatcggccg atttttaagg ataccgcctc ctacccaggc tcagegatga ccgaaeagat geggtatgga tgaatctgte ateaaageee ettatggetg ttggctctat agaacetggt aaggegcgcc gtcgccaeccc aegaatett tcgatttcaa agatgtgtga cggaat tgcg tgeegetgga egcggatgag ttgtggegtg ageacattet teaggeagat gaaaccggcc cccttgcggc cgcagaaatt tttacggttc ccgggaaacc ttgataaett atacgctgac tttcgccggc ctcaaetggc ataaaetgtt ctaeccagca aattttcatg egeateeggt ggcaaaagcc egcctataaa caaactgatc geatetggtt aaaagatgaa gatgtteaag tatcattgcg cgccaaagtc aaccatcaat gcccggctat ctggagcgta etateaggtg aaacaecacg agatccggec caageatttc cgaaaegteg tgaacagcac ctgteeaatg gccacgactg tgatatcggc gttcgaagtg aaccagttga catcgcgcgc cgagaaaaaa gegtgaagaa eaatgagege gggtgaaggt acagattgec gaaeeteeag gcaggacccg agatgtgtea gcaaacagta gaatttgatg gtggaagccg agcatcagcg ctgcaacatc tacaacaceg t tgeggcgca aaactgtatg gattactact gccgcgtcgg atgagegagt gtgggctatg gtgetgeteg t ttgaeaagg ctattactca ttaatgcccg ccggcggcat cgtggcgtga ggegtgaege eatgaaacgg tcgcgctact gttaatggtg tecgtagtgc etaaaaagga aaccgcgcgc atcgagetgc etgccgeegg atgaaageca eagttgatgg egeaaggtgg aectaeatgg actctgctga geggataatg atatgcaggc cgttgetgaa eggtgcagae ctattattgc eegactggea agaccgacga acatgaagcg aagcegaata tegaceatac cccatgcgcc tccaggaggc gcgagggtgg acgagatcga gtgggatgga ccaceaacgt atgctaeggg ttctgggccg ttgeecgtct tgacgtatag gcgegeggtt gg ttgcaggc atgagcagat gattttaaaa aatgaaggat cgcgcgtgca egttcgtgat tgacggat cg ttgegcegcg tcgatatcac acgecctgaa atggcaagge aaaegetggc aatcagcgga aaccgacgcg tcaggecttc gctggegaac gcagatcgac gaagetggaa aaagetgaaa ttacaagggc cggccagtta accgcccgat gtttattgcc teacaccgtt tgtgctgaeg aaaagcgcac ggacggcgag gggttccgac get taaagag cgtgaeggtg gcacettgae egaggaaetg gctgattggc catgaeggaa tgtttttgag actgtaccgg tatggetatc gattgaatat 
ggcggtgetg eaaattectc egtggegttc gctggaaaat eeagttaetg gggggeggaa gaatgegccg ataaegtcga gttgeteagg egacccaaga acgateaccg tttaaactga tcetcgtgge attcgcttea atcgeetggg ggtggcgaae gtggatctgc ggggcttccc 8820 8880 8940 9000 9060 9120 9180 9240 9300 9360 9420 9480 9540 9600 9660 9720 9780 9840 9900 9960 10020 10080 10140 10200 10260 10320 10380 10440 10500 10560 10620 10680 10740 0 .0 0.000 0 *.00* 0
0 0 0 .b 0 :0 *.*v4 0 0lb 1 00 .t -b *o *f o o ft 0 0000 cctcggtact tcgtcaccca gttatattgg acccggtgga tctggagcaa gctggtgtac ataccttccc ctgaccgccg actcagacta atccggacgc cgcgcttcgc gtgaggatat tgaactcctc aagaggtcga agcttgaagg gcaatgcctg tataaagagg acgtaatgat gtaggaaata gctcttttca gtcttctgct tatgtcgcgt ggatctcaaa acgtggtagc gatatggttg gggatcgcca atttcagtac aaataatgca ctctgttctc aaaagatata cttatgacat aaaatgaaat ccggtagcgg tcgaccgcgc gcaaatggac gaacc tggaa cctgacgatg cgagtttgat cgcggcctac gttgattcgc gactgacgat cgaggc tgaa tgccgccttt gacggccaac tcacttcctc gcagcgctgg gcaagaaact aggcaatcca gctgacggga atatatattt gatatgaaga gacacatatc catattatga caatatgctg gtgagcctga ttatggttaa ccagagacta gataaagaaa acgcaaccaa gaagaccagt aggttatggt aattcataaa ctgaagttat gataaaatca ttttttgaaa tgaagtactg cctcggctcc cagcccgaac tcctggcagg tatgcgccgt ccgcgttttc t ttgaagaag gcgatgggcg ggtgtggaat ggcggcgtgg ctggcgaaaa atcggcgcac gctaacctgt aaatgtatcg ctaaatgaat aaagcccgtc ggttattacg tcgctgcgcc tgtgaatgtt aaagaatttc cgaatgccag ccataagccc tagcctggga agtctgatgt gtagtattgg aaaaagtaaa ccgatcgcta tacctgcttc ctgtatcttt ttttattatc ataactaaat tggagaagga ggtttatttt attgacggca agctggcgct ggtaaggtct ctgttcaaat aactggcgaa ggaactcgct ttgcccgcct atgcggatgg taaacatcaa caggcggtgc acatgaaatg acggttttat agtcgctgca ctgcccgtct tccgcgacaa ggattatgaa gtccgctggc acgcgaaatt tggtgacaaa taagcgagtg gtctcgccca taacaatgcc aacttttcat agtcagttgc cccagcggcg ggtaacgcag cggtgaaaat ttgggtttcg ttctactcgc tatctgtaat ggaagttatg acattatcgg ttctgtacgg aagtaaaact ttgatggcga ggaatattca ccgtcaccaa actgcgcctc tccccgcgac gcgggctagc gccgtatggc ttctgaccat Ccgttccttc ggtggaaaat cccgaccgaa cccgt tgatc aaaaccacag accgtacctg aatcggttcc ttatgtcgac tgccgctgaa cttcctgcgt actgccgtca aagtcagaga cggtctcgta gaagcttttt ctgttacctg ccggctgtta tggcagcttt ggtaaacctg tggtctgtat tgtatttatg tgtaagactc tagcgatttg aagcat tat t taccggaaaa caatgattta taataaggat gtcaatggat tcaggaatcc cctggatttt cggcaagcac ctgaccaaaa gaagactccc gcaaaaacca accaaatacg aaacactacg cttccctgcc 
atcgccatct caccgtaaaa gaatactacg ttcgcctgct tttaaagagc gccgatccgg gtagtggtgg ccgcatttcc gtgaagcagg agatagagaa aaggtggggt atatcattga gtattgttct ttgatactca ctccccgata aaaacctgat gggaaacaga gtcatgaaca gtaattttga agacgtgaaa tatatgcatt a tat acagt c tctataaaca ataaaaatgg gacaaacaca accatgcacg gatcactata attccgcagg 10800 10860 10920 10980 11040 11100 11160 11220 11280 11340 11400 11460 11520 11580 11640 11700 11760 11820 11880 11940 12000 12060 12120 12180 12240 12300 12360 12420 12480 12540 12600 12660 12720 12780 0 0.
0 00 0 0 0. 0 0 O* *~z ccattctggt ccgacctgat gtgaaacggt agggcggcag aacggctgtt gaatttacag tttgcactgc cagcggcctg gcggcgtcca agcaaaatga tccgtcgctg tattttcaaa atagcccgta cgcgtgaatg aataatccga ggctggagca tgccacttat atactgccgt agccatcctt attgggcgct ccagagctac aaattgatga attttaatac aaaggcgtgt gaatttatgg ataaatcagg ctgaaaattg gtactgagct ggttccggta ccgaacctgt cgtaaggctg gcaatggtat ctctccttca tatgcgtaag tgtcgccgtg ggagctctcc cggcggcacc tttccggcca tgaataaatg tcctgctttc gcctgagcgt tgaagatacg acagaccttc atgtgggtag atgcctgccc actctccgta atatgattga aacagagtga atgccagagg taaatgaccc gaaatggtta tgcacaagaa gagtcattgt ggccagtgct gattgcgaaa catgaagtgc cgagaagtaa gtaagaaaga gcttaattta acggcattga ggcgctggaa aggtctccgt tcaaatactg gcggcaatcc cgcccagcgg gcaccgtgaa gctggcggca gtttccccga ttcagcaccg atcaccgcag gatgttatgt gcgtaacccc cggatgcggc gaaagcgtt t tgttgagaat attcaatgaa cattattggt tattcgaatg tgcaaaggtt ttatcttact ctttatcggg acacgttaca agataatgga atattagtta gcattgacca ctggcattag taccttgaat aaatattcag atagatttta aattcaaaga gctgtacagg ggtagttaaa cggcgagtca tattcatcag cactaatctt ctcttccggt gctggagtac aagccaggga gcaggaatac atccgctgga gcggcagcca tgaagcagga gctacgactt ctggctggtt actgggtggt agtagcgatt tacaaggtga taatgaccta gcgtggttag ggaaacgt tg agctatgttt agtggtgccg catactatgg aagaaaggaa ttatggaatg ccatttgttc ctttcagcat cacaatattc tatacaaaga atggtaaaca ggttgaaata tccatgacag tattaaaagc aatagttaat aggatagtag atggatgaca gaatccacca tcatttgaac aagcacattc ctcaagtaca ggggaaattg gtggtgcaga gtacctcaag cgatgg tgaa atacgtggtg caaggccaac ttattgtttt tatgtgcggt cgctaettga attctgacaa cacagaattt cttttaggaa ggaaaaatat tgaatgcgac ataataaatt gcaagcctga t tat cgtagt gcagtatctg ctgaagttgg aagtggaatg acagt cggaa tgatgtcgtt atctgtggag taacggttcg ggaat taaat atacgttctc ctgttcacct atatgtctta aacacaaaaa tgcacgccgg attacatcga cgcaggccat ct t tcacaga cgtctcgcga accagcaggg tataccttca atcgcctccc cagaaccagc aaagaaattt gattttaaag agctatgcct cccctaatcg tcagaagaaa ttagaggtta ggtgaatcat aactggtggt agggt tccca ctatatttat 
tcttattgtt aaaagggcat ttcagatcag gacactgtgg ctggtatggc ctcttaaaaa aaaaacgatg atttaccatg atatcatcag gaattaatta ttgctctgat aataaagcag tgacattttt tgaaattgaa tagcggtctc tcgcgccagc tctggttatg tctgattatt atcaattgaa tggcagcggc 12840 12900 12960 13020 13080 13140 13200 13260 13320 13380 13440 13500 13560 13620 13680 13740 13800 13860 13920 13980 14040 14100 14160 14220 14280 14340 14400 14460 14520 14580 14640 14700 14760 a a a a a a a *aaa a. a a a a.
a a a a a ggcaccatca cggccagat t gaatggcgta ctttccggat agcgtgaaag atacgtgttt catgacaacg cgtcccggcg gtactggccg gaagccccgg gaggcggaac tggaatgacc caagagcgt t tggggattca aaagaggcat ctgccgccgc ccggtacgcg cgtttctcgg ttattacagc tggattggcc gaccacgacc ctgtcgtgga ggcagcgaca ctcaatcgct ctgtaccgct cgacggcccg ctggttgatg gagctgaaag gatttcagca cattttgccg catctgccgg gccggataca tacggcggga ggagtgcgcg ccgcaggcta tatatctggc accccactcg gcggcagtag cctccgataa atgaactgaa acaaatccgt aagagaaaaa gataccgtaa aaaaagcctg tggaacaaag gcgtagtctg atctggaaca gccactacaa cagggatttt cactgaccat cgccgaacag tacatgaaca tcagtcatct tgccgttgac tgatcccgcc tcaacgatct accacggcca ttgagccgct acctgtccga ctgaatacaa aggtgcagtt aggggactta cgctggtact cccagaccaa ggctggcttt tctattacga tggccatgca ataaatgaca cgacttcaag cggatttatt gtggttatgt cgatgcgtta ggtgaatcct aaatgacgcc ccttaccgac actgcgtcgc cctggccaaa gtacagcagc cgccattgta gagtgaagga cgtcatgcat tattgatggc tcctgacggc tctgccggag cgaagaaacc cgacatccgc gcgcctgcgg ccgcatcacc catcattaat catccggatg tgaagcagcg gctgactcac actggccggg agagtacaaa cctgctgaac cggcatcctg ggcgataaag aatcgggcca gcaggttctg catccgccgc taccgccggg gacagtaccc gccaacaaag attttgattt gcggtagcta cctgacctcg gacaatcaga gctttcacga gatttagtgc ccgctgaatg tcggtctggc ttcatatcg attacggaac caatttttac taccgcagcc gaagcgctga acgccgttta cacctgaacc acgtttgaca gacgccaact ctgctgccgg gggttgaacc tatcaggcca cgggccgat t gaggtctccg ctggcgaaaa gaactctcca cacctgacgc gcggtactga aatgcggtgg gcttcaatgc tccgatcgcc cctgtaccac gagggagcat gaatttccgg tgacgccgcc aaatttaacg taaaggaatt tgccttttgc aatcacagcg agaaggccgc cagctgat ta gtcgcgacag cgcagaccac gggtaaccta ggaaaggaaa gggataaatg tgccgcagat tgccgctgac acatcggtaa acgcaccgga agcaga tt tg ataacccgga cgctgggacg aaaaggcggt ctgacgggcg gttcactgat cgctggcgga attacctgct ccccgctggc cctatgtgcg cctatgccgg tccggggcgc tggccccttc cgaccgatgt tgccggaact cgcgccaaat tgtgggaaca ggctggagac agcggcggat gctgtttttc tacagtgaat actgctcctg actcgacctg gcccattgag ctggtcgctc ct ttat t ttg ggcaatcggc caaaatcccg agtgcagttg 
aattatgagc gtttcagcag cccctttttc actgatactg ccacaccccg tctggcggta atcattggcg tggcgcgcag gacgggcgcc gatagatatc gtgtacctgg acggctgac gctgcaaatt ccggaggtg tccacaaacg gttgaaatcg gcagcgcatc cgatcttgcc gctactgcaagatccgctcg cccgtttcag cattgcccgt agaactgtgg atgatgtcct 14820 14880 14940 15000 15060 15120 15180 15240 15300 15360 15420 15480 15540 15600 15660 15720 15780 15840 15900 15960 16020 16080 16140 16200 16260 16320 16380 16440 16500 16560 16620 16680 16740 16800 a a a.
a a a *aaa a a ttttgtccac agcgcacgga aaatcagcgc cagccatgcc gcgagatgca tggcggtacg gacggcgcgg gcgggataaa acgtactgga agccagacgg agcgtcgcga gactgcggcg tgctgacgct agcatattga ggctgaaaat accagcacag tgagtgacgc gcgcagtcac ccaacctggt gcgtacctgc acggcagtaa tgaatgatga ggggattatt atcatcttat ttcatttccc gacaacgtac gaaggtaaag ctgacctggt gaggaagcat tcccggtcat t tt t ttaaat ctcggcgtgg ctaacattaa cacgccggaa actcaatgtc ggcggctaac cgctaaactg tctgtaccag ttactgcctg cgtctgggcc acttttccag ggttatctac gcgtaagcaa tccggttatg gatgcgccgg gtttggcctt tgcgattggt cctgctggca tagggtggtc aatccggcca tgtaacgggt actgtcggaa cggacgggta agccgggcgg aaaaatcaac atcggatctc gttgcctgtt ccgataatgt agaaaaaaga cctggtgtgg ttgaacagca tagttaaaaa ctcccatttt tttgggtgtc aggatgcgac atgaattagt cataaggaca atcgttgaag ccgttgctcg gatgcggccc acat tatgcg tgtacggcgc ggaaaaagcc atcatcgggc cacctgctgg ctggacaata cccgcgctct gtgccggtct tacagccacc ataaaactgc aacgaaatcg ttccgtggcg gtgattaaca cacactgaca aaacgggcgg catatcgtcg gcgaaaaacc ctatgatgtg agcgacatat atccggtggc caacctcagt caatgcactg atatggtcag cgcagtgaaa ttttccctgt atccctgacg tggtaatcaa agagaacaaa ctatgtttct gtgaatatga acggtccgga ccgc tgcccg tggtagagcc atcaggcgaa ttgatgaagc tgctggtaac gtctggcggc ggttgggatt t tcgccagca cgcctgactt ggctgagcgc ggatggatgt cgccgccgcc cccgtggcct acgccatgt t aagcggcgcg gccagcccat cggaagttgc gcaagggcga gtcgggtgga tctcatcatt cacataacac gcttttgccg actcaggat t atgtatatgc gttgacagta aagcctgatg cagaggacag ccggatgcgc cgggataaac ctgtggtgtg cttaaaaata aacgccggta cagcaaactc gcctttattg ttaccgtaat cctgcggcgc cgccaataac atttcatggt cagcttccag tgaaggccgc actgctgaca tcagggggcg cggga tagcc gcagaccgtc tgtgccggtt gctgaccgtg tgtgccggga ggaaatcgcc tcat tcggct ggcgt tgctg tacggtgccg aattctggta cggcagtatg gatctgttcg gatccccgat tcctgaaatt tgggggtgga taacaataaa taagggcttc actcctccat ttaatctttc tcagggcggg gatacgcttt aaaccaatga cacaccagcc cggctggctg tgcgctctcg ctgctggtac gagcacgtac acaacctggg gaaagcgaag gagcatggca tacagcgtgc cagctttcac ataagcggac ctgttggcga accgtacaac cataagctgc 
gacgaagatg cagaaaacgg cgcgtgggcg gaattcccat acctccggcg gtggcggata gtggagtgaa tggcgtgacg tgtttttttg taacgcagga ttatgccact ggatgcgaca ccatactgtg aatactaata aaaaattgct aggtaatgac tgtctatctg atttaagacg ggaactgaat 16860 16920 16980 17040 17100 17160 17220 17280 17340 17400 17460 17520 17580 17640 17700 17760 17820 17880 17940 18000 18060 18120 18180 18240 18300 18360 18420 18480 18540 18600 18660 18720 18780 *.aa 0O a a. *e a. a. a..
0 0*0* 0**0 0**0 *0 0* t0* 00 000 0 000 .0 0* *00.0 0 000 0 .000. 0 0*00 00 ~00* 0 0*t* ~000 0000 so.. 0 000000* 5e* 00e .ao 0 0000 tctcgcgcgg cattcaagtt actggggtga tgataatgtt tattgatatg gcacgattat aaaaaatgac tcttgatatc aaaagataag ctatgaaaat aaaagatatt tggttatatt tataatgcaa tctggttctg cccgctggcc actgtggctg tggcttcgtc ccgcctgacc gtggcgtgcg agaggtaccg gcaggccatc cgagggaaaa cggcaaaacc gcagacttcg taccaacgaa tgaagcgagc taaacatcgc aaccgcacag agaactgcgc tttgttgccg ctggggcttc cgccgcctgc ccggttacag gttcgCCgcc ctgaacttat gcatcaaggc gcgaggaaag atcgctgcca tttaaggtaa gggtgtgcta ctgaatatag tcacctgtat gccaaaaata caggttaaat ttcattgcat actttatcca aaatttctta gcgctgctgg tccgtgggta gtcaactggt acaccgctgc ctgattggtt ctgcgtatgg gtggcaggcg cggcagctgc cgctttctgt acggccctgc cgcatcctga gcggtgttga gccgcgcagc cccggcgcgc tcaccggcgg gagaccctgg gggttcagcg acgttgccgt gcgcaggagc gaagagtacg ctcggcgagc cataaataaa ggcaagggag ccaacgcaca gttcagataa tggtctgtct cggttggagc atgtcgctac ctaaggtcta aagtggcaat ctatcgtggc ccagcttcat gggaatttta gtctgctttt tctggtttgt gccgggtagt cgatgagtat tggccctggg tcatcctgct atgaacaact agatcaaagc ggcagttgcg atgagctgcc tgaacaccgg cagtaccggg ttgataccgc gtaacgccgg cgcttaacgg aacgcctggc ggattcgctt aatattttcg acagccgcag tggcgcgcct acc ttaaaag cgttgctgga ttaatagagt tgaatccccg tgcagcctga agtcttggtg gt tggtgggt atcaatggag tat tataaag tgcagaatct attagataaa aaaatatacc gaatgctgat aaatat tgat ttcccggcgc cgggccgctg gaccattgcc catcggcatc cgatgtccat gatgtacgcg gctgcgtcgc cgacctgcgc ggtggatatg gtggttcatg attgcagttc tggcggcacg cggacgctac agagtggcag cgtgatcctg ggcctgcgcc tccggtctat cacgctgac gcgaaaagcg gacgctgCgg ccgccagcgg ggctattgaa atccctgtaa ggagcgtaca agtatgacag gtaaagtctt gttcctgcca tcatcattat gataaaacaa ctggccagga aaatcctatt tatattaaca gagtgt tctg tattgggtgg gcgctggcag gtgtcatttg ctgttgctga agtgtcctgt ccgtttgcgc ctgtacggcc ttcctgcatc accgtcaacc cctggctggc gtggtcggca ccgctggcgg ctacactgcg gcccgccacg ggctttctcg acgctaaacg gctctgcggg ctggtggtca agccatcttc ggcgacccgc ttggatcagg ctgtatacct 
cagatcttcc tatataaaat ttagttcgtg gtataccctg ggtggtaaga tttcttatgc ttgatgccat aggtagagat tggattatga tcgatagtta aagataaaga taagatttaa cgagtcgaaa t tgtgggcgt ataccctgcg tgctgctggt gcctggcgat cgctgtgggt tgtaccggct cgcgcgggga atattgtcac gtaaaatctt gtcccggcga agcaaatgga actggtggtt atgacggtgg gtctgctgcg tggcggattt cgcgactggc ccaaaatgga gtgcacaaat aggcgctgca gactggatac tcccgcgtga tcgattcaaa 18840 18900 18960 19020 19080 19140 19200 19260 19320 19380 19440 19500 19560 19620 19680 19740 19800 19860 19920 19980 20040 20100 20160 20220 20280 20340 20400 20460 20520 20580 20640 20700 20760 20820 S S S S 55 5 S S
555555 5 55 S S SSS S e. S attcgatgcc ggcgcaggcc aaaaaccgcc cagctacttc gccaaacctc actggtgctg ctatctgaat cggtaaaccg ctggccggaa accgccggta gccgccgctg taaagcggcc agataaatac cagcgtggcc cggcagccgg ggcattcctc ggagagcgaa ggtctttgtg tgaaggatac gaacgatggc cagccagata ggaatatgcc ggaagagggc ggactcaccg gccggacccg agtggtacag gctggtggat acagagcggc cacccaactg tgccgcagac gctggatctg cacccagatg gttcgccgac acgcaactga gacgccgtgg cgtggcgaat ctgcatgacc cagtgggcct gcattcctgt gaaatcagcg gcgatggctc ctggacccgg accgacagcg gtgaaacgga tacgatgccc aacgcggcgg gggttcggcg gtggtgcatt gacggtcaca gcgccgcagg cgtagcaacg cgggagctgt tgggtgatgg ccggggcagg cgccgctggc agttccggcc ctgatgcggc caggccagac acggcaaaac gatcgtttcg ggtgggacga accattgctg aagttgcaac acgaagcagg gaggcgatga agtccgcagg ataacacgct ccgaccagt t cctccgcctc tgctgacaca ggcgttaccg tgtggcaggg cccgcgcgac ccgtcccggc acgcgccgcc tggcgtcgct tggagtatgt tgcgcatcta agatccagtc ggcgcgccgc caccgtatga ccagtaccga agttcacgct gcgcgccgct tcgacaaacg gccgggagag agcagtctgt aggattttct tggcctatga tgggaaaagc agaaacaac t tgttccagga ccgcgctgcg tgcagatcgc atagcgcgct tggaggcggc gaacgcgcaa tgggcgacga aggtcagcgc gcgcggggtg gagtatctgg tctcccacac gtttattttt cctgctgcgc gatgcagacc ccggc tggac actgctggac gctggcctgg gtacaaccgt gctggcggac tctgctactg gtgggtgatt cgtgctgacg gaaagatgag gcgtatctac ggtacgcgcc ggatcggggc at taccggaa tacgccaaaa cgcccgcgaa ggacagtatc tttacaggtg ggtggtggag ggcgcagcgc tattcacccg cgaggtcatt ttcgctgctg ggcggcgggg gaaactgccc aatcaacgcc ctgccgtgac cgaggacttt tttttcacca cagcgctttg gctctgccgg cgtgaagcgc ctcggcgggc agccagcaga ggtgatgtga agcgcaaggg cgctacggtc ctgctggatc gccattgccc aatctggata aacgatctgg catatcgaag gcgctgatcc gcgcgggcgc gtcggcgcgg gtgccgggta tttgtggcgg aagctgactg gtccgccgtt catagtatca ctgcgcaccc cagaccacgc gcatccggta gaagaacggc gccgggcgga accatgctca acgttgccag gcgccgctga gggaccggcg gccatcgacg aaccgcatct gcgccgcgca tccgggcgat acggcaaccg acctggtgga acctgctggt ccaacggcga aagcctacac aactgtccgc tgtacagcgt aactgctgct gtcaggatag aagatcacga ggaacagcga cgctgtttga 
gccaggcgcg tggcggcaat atgcgggaac tttttacccg cggcgacggc acagcctgcg tgtacctgac acagtgccgg tggcgtcgcc tggtgccgcc acgcggggaa tggaaaaaac cggacggcgg acgagtatta cgcgcattac aaaacatcct acgtgctgaa ggcgctatcc tcgccagcgg 20880 20940 21000 21060 21120 21180 21240 21300 21360 21420 21480 21540 21600 21660 21720 21780 21840 21900 21960 22020 22080 22140 22200 22260 22320 22380 22440 22500 22560 22620 22680 22740 22800 S S S 5* S. S
55** cggcgtac tg cccgtggcgc gtttcagcag ttcctggtcg tgatatcgac gtggccggga ggacacctca aatggtacag ccgggtggtg gcagaacttc ccggcgatgt gaagagggac catatcgtgt tcgccgcgtg gtgatgtcgc atggcgtcac cgggagtggc ctggcggcgc gaccagctat cgacgtacat gaactgccgg gcaacaccgg ctgcggttgg ttctgtttat atgtaaacat aaatggataa gtggctgttt acggacgcgg gacgttat~c tggcgctaat gatggcagta taaggcgac ggcgataccg aaatgctgcg gatgctttct tacaaaccga gcgaagcaga atgcagatta ggccaggtgc ccgcgcaacg acgctactga gaaacggcgg ctggaaatca agttgcccgg tcggcaggct gaatatggca tgccgggagg gcgaacatga cgggat tt tt gggatagcgt tggacgagag agcatatcac gggcgatgtg cgcaacgcag gcgtgcgtta ggcaaggctg tggatgaaaa gtctgcctgt tgataacccg ccttaccatt ctcaacatta aagtgcatat tttcagcttt gatacgttgt agactgactt acggggttac gaagggaatt gattgtgatc ggagcaaaca ccgaaggcaa tccgcagcgt gcgtggtgga tgcgctacgc gctcgatggc ccggcgggcc tgcgggggcg ccgccgggcg cgagggcgct gccggaccag ggactggctg ccagacgacc tatgcccgcg gccgatgggt gggacgcccg tttgcaggag gccggacacg gcagccgggg ccgggaactg tctgccgtgg gttctggcag aaatagaggt gcaggaatgt ctggcggagt ctgaaaacaa atcaactata tcccttcagg gtattatcgg acaccaccgc ttgacgggca cggatttaca ataaattgtt tgggaccatc actggctccg catgacattg attttttaac tatggacccg ccatggtccg ggaaatcacc gtgggcgctg tcaactggtg ggatt ttaac gtaatgcgcc cgcgactatg aaccgccaga gatgaagagc ccggaacacc gggacgactg tggccgctgg acgcagggct atgcggcgcg ccgtggtggg accgggctgc ccgggctggc cagaacagtg cagatgctgt tcctgcccgt gtgtggatat cggtcacact ccagcttact aggggaatat tggataacca tgtaatacac aaagctgaca gtggccagat tattcagttt atggggctgg ctggcggaca caggggccgg agcgagggcg gccatcacgg gaccgtcccc gccagcccgc tttcatctgc gaatatgatt ccggtcagcc ggccagcagt tgcgctggcg cgtgggtggg gtgacggctg atccgctgcc actggctgac tgatttacca ggctgtactg ggcggctgac cgcagtggct cggatgaagc cgggtaagac aagggcggta gaaaaagtac atttgcctgg cca t cccgt t gaaaaaaagc ggcgcaggat ctcactggcg gtctgtacgg agaa tag tca tggcctggta aaacagtgtg cagggggagg agtaccattc ccgccagcga atctgacgcc ggaaaaaatt aactggtgat tgaaagtaac gtattcgcca tggacgccgg ttgacggtcg gggagctgtt ggtatcagcc 
agtgggcgct cagcgggcgg gatgcacctg gtggagcttt cggagtgctg gcgctgcggt gctggcgagg cgagcaggtg gcgtgggtta accggtagta gctggggcag tgtggatgcg gctgtagagg ggcggaacag catcgtcagg accggtgaat gttgagggtt aagatgcagg gatcagaagc gggagaagat tcgggatatt tgccggatgc caccgataag cgcgaggcca 22860 22920 22980 23040 23100 23160 23220 23280 23340 23400 23460 23520 23580 23640 23700 23760 23820 23880 23940 24000 24060 24120 24180 24240 24300 24360 24420 24480 24540 24600 24660 24720 24780 24840 *9 9 99 9 9* 9* 9 9 9 9 9 99 9 9* 9.
9 *99 999 9~e.999 9 99 ggctgccgga atcagcggat tgattccacc tctgaaacag tgtttctggt attatcgtgt gagcaccaat aatgcaatgc tccttattta ggctcttgtc gcattataaa taaaggagtc gacgacaacc cggaagcgtg cgcattgccg cgccgcaggt gggggcgcat cagcctggac gcccgacacc ggtgggcaaa cgggctggtc gcgtatggag taaaaccgtg gcggctggtg tgattttctt ggacagccac gccgctggtc cacggcaaac gccacgttca cgagcattat cat tcgtatg cacactcatg ggaatatctg acatgtgaga gagaagaccc aaaggttaca gagacggaaa cagcaaacga ctggacgtgc gataaagtat gaatctggta tgtaattcag agcaatatac gtgcggggca actaccatga acgccgccga gcagatgcgg gagccatcag atggcgcagg aatgtgacgg ggcggcgaaa ctgaatctgg gatctgtgcg acggcggcgc ccgtgggtaa gtggatattc gaaagctacc cagcgactga acgctggtgc gagtggcacc gagagcctgc ttgctggcaa gagtggccgg gaagcgcagc accggttata ctgatgcaga tatactgggc gaaaagcctg gtcatggttg aagaaaatgg atggtggaac tgtggttgac atgactggcg agcgtaataa tatttatgtt cctgttcttc ctgtttcctg gttttgtatc taaccggaga cggaatccgc gactggtgaa atgcggtatc tcagcggcag cat taagtga gctatgtctc tcaacatcga gggtggtggg aactgctgac tggatgaggt cggtacgcac tgcaggagtg tggcggatgc aggaagggct ggactggaca acaccgtggc gagactactt gcagccccgg ccttcacgct ccttgctgtt gaactgggga tgggggcaag tattgaagtg tgaaaagaca aaaacaataa aatgattaca gtgtgctggt aaataaaatc atgtgctaat tggtaaatat acggtgagtc cacaaataat aagtggcggt cgtggaacag agccgcggta ggccatcgtc cgccgtaccg actgttcagc cccggcggcc actggatggt ccatgaaggg ccataccagc tctggcggaa ctggcaggtg gggcatctac catcagcgcc gaagctggac gtgggtgCtg aaacccgcgt cgacaagagt cagtcgggtg ggaaaactat tgtgcaggac tataatcgta cggggtggtt gaaccggtgt tttacggtta taccgttaaa atggtggcca tat taatcaa ataatattgt ttttgtgttt ttaaatataa tatttatttt aaatccggta gtcaccgcag gctgcgggat gcagcggcgc tctgctgttg ccgggcgcat tatgtggtac aacctgccgc ggcggtaaac cgttcggtta gactacaaag tatccctacc cagtacggtg tggtggtttg cacaaagcat aaggagttta gatgatttcg gaaaccggtc gaaggcgaga ctggggggag cccaccgccg aacgcgcagc tccggctgga tttatatcca ttttccgtat atgttaagta accggatgaa gataagtaat atttctataa tatgttcatt ttattttcat caggctggtt aatccggtat tgggagggct attcagtcgc cgctatttgg 
aggctgccgc caggcgggcc tactgttcgc agctaaaaac tcaaaccgat gacatatcag cctatgagct cattccagaa cggtggaaaa aaactgattt agcacagcga gtccggactc tccacactat attttacgaa atgccaccta tgctgacgcg ggaatatccg aagtcaatca acagcgggca 24900 24960 25020 25080 25140 25200 25260 25320 25380 25440 25500 25560 25620 25680 25740 25800 25860 25920 25980 26040 26100 26160 26220 26280 26340 26400 26460 26520 26580 26640 26700 26760 26820 IbO I 0 0"9.
0 ,b a 000 4b* 'a 0 0 IbIbI ggaccagcac cccgcagcgg cccggcgggc ggatcgctac ggcgggcaaa tttcaaaaac catgccgccg gtatggcggg agtgaagttc ggttatggtg cagaaacacc tatcggcacc ggcgcttgtt taccagcgat agccgtaacc ggagttctca tttacggatc ctgaatattt cagattgcac caggcaggaa gggaaaacgg gtcttcaccc tggctggcag gaagcagccc ggcgctatca tttcttgccg ctgggcgagg gagaatgtca gataaccata cagcccgccg ccgaatgtgt gactggctgc 9CC9999cgt aagtttatcg tttacctttt acggtgagca caggaaatct ggcaaaatgg ggcttcggga ggcgatccgg tggggactgc ccaacgaacg cacgcggaaa gatcgcacta acggtaacga acttatcgtt ttacataagg gtcatgcaac gccgatgata tggatcgacc gcagcgtcaa cccgcgatac tgatgaaaaa cgggcaatga aagtgtacca tgaccagtcc gctttcagcc gtgtggatga tcggcatcgc ggctgattct ccatcgggcg gcaccaacag tcgcagaaaa cccgtaagga ttctcggcgg gcaaggtggt ggcggcaggc gcggggagpt ccacccgttt aaccccacac ggacggatca atgaaaacag tgatccagat atctgccgat cgggaatggc gcaacatgct aagatctcaa aaaccattat agaatgacgg tagatgttgg acggctccat tgattggtaa ttgcccccct ataccgcata tatttttatc gctaaaacct aaatctcggt tgcccttatg gcgtcaggcc ccgtcctttt ggataaaaac tcctatctac cattatcgcg gggttttatg ctccatccac tcgcccggcg acgcatcgcc tgaccacacc cggcacacag ggatgtattg ggcaaagctg tgtcgggatg tgaactgcac caaagggccg gtacgggcgg cacctgctgg cccgcgtatc catcgtgggg gtcgcagagc gcgttttgac caccacggtg taaaaatgaa cctttccgta cgatcaattc tgagttttgt aggtattgat tctcacctct caggaagggt ctggagggca gatgaagacc cagcaccggg ggggaacaaa gggtttattg gatgataaag gaataatcac cacaccagcg cttgccgcct gccgatcaaa cacacggcag gcgcgcgcgg caagggtcgg gaatgcgacg acggtactgg tttgtcgtgg gggacgaaat gccgtgggtg cccacccgcg cagagcgcca gtaaaggtac atacgcgtca ggccaggaag cgtacctaca gggatcttca gacaaaacgg aagaataatg accaacagta aaactggcgc acqcttcgct ggcaagcaac atgaacccgg gagtgatctg gttttgtcct atgaacgaac tgcccgccta tattgtcgcg ttgccgccac caacccctgg cagacctact acggaggtgt cgctcgccgg ttgccttctt tagcctccgg gaaaaatcct tactgagtac aaaatatcta cggtgattga aaatcagttc cgagtctgct ttggcactaa aggctatcag aggtgttccg tcgtcaccgg agtttggttg gctacccgtg tgctggtgga accaggacac gccactcgct gcgcggagga aaacgcatac 
ttggtgagga agacgatcaa gcggcaatgc tgatgttaca atggcggcac aattaaacct gcctgaaaca atcgcccagc tattgaccgc agcgcctgca ccataaatcc caaggtactg ctggaacacc gaccatgtat gtttcttatc tagctgcggt ggtattgcaa caccggttcg ggtgaaatgc catcaacagc agacggttcg tgaaattccg cggcgggctg atgtgccgct cgggctgttc 26880 26940 27000 27060 27120 27180 27240 27300 27360 27420 27480 27540 27600 27660 27720 27780 27840 27900 27960 28020 28080 28140 28200 28260 28320 28380 28440 28500 28560 28620 28680 28740 28800 28860 *t agcaatccgg ctgcccggtc ggactgttgg cacatcacgc cccggccacc cgttacgtgc ggcatggcat gaaggaggca acacagaccc ggcaatctgg gcgggaacgc gcgacggggt gagcacgaca accaccgtca atcaccgcgc ttccctgtca aaccgggtgg gagttcgacg aacgcccacg tacgatgaac gtgcggcggg cacggccagc cccacggagg gaccgcagcg gacgcgctgg tattacggga gagcgcgaca ggctatgacg gttcttcctg aacaaccacg tatgacagaa tattacgata tggatgtgac gcctgccggt gacggggc tg tgaccggcgt agatatttga tgcattacac acctgctgtt gactggtacg cggcagggga tggagtaccg aggtgcgtca tcacctgccg ccagtgacgg ccggcaggca accggacgcc acattgagct tgaagacgac aaatcacgcg gccagccagt aggggcagat agacgcagca ggacgttcag cggggcaaca gctggctgac gcaacccgac gcgggcatct gcctgcaccg acgacgggct gccagcgccc gcgaggtggg gcggttacct aggcgggcaa caccgggcag cacctgctcg gcggctgaac acaggggcgg cccggaagaa cgatcgcagc tatggagacg gatagcctcc gcggctgtcg gtatgacgat gtttgcttat ctaccgctgg cgaacattac gggggagaca gggcggtgga gcccggcggt agatccggaa cacggcgctg gcaggagacg gtgcagccgg gcgggatgcg ctacaaccgg gcagcaccgg ggcggagcac ggacatcacg gttacagacg tcagataatg gctgagctgg ggcgcggcag cacgattgac gaccgggcgc cctgctggat aaaatcctgc cgtttttacg tgggaaacca gaactgcgtt cag tt atacc tattacgtat ccgcaccgcc agcagcgggc cgaattgaac aacggtcaac gaaaacgggc caggaactcg cgctttgact tggcagtggt atgtaccgct cgcacggtgg ggccgggtga gacgatgacg gacccggaag acggatgcgg ctgggccgtc ctggaccaga atgcaggccg gcggggaacg ctgccggacg gcgctggacg cgcacgcagg cagcgcagtc ggctgcgtga gacggcctgc tcaggtcaga aacgaagggc tgccggaaac ccagccacct gcctgcgcga acccgaaaac tcagccgcct ttggtgattt agcgcattgt atcacctgtt tggtgcaggg 
tgaccggcgt tgatgacggc acggcgcgcc atgattttgc ggtacgacag tcacgtacaa cgtatgaata cgcagacgca ctgtctggaa ggcgggtgac cgggcggcac tgttacggac taacggcagt acacggtgcg gtagcatatg ggcagcacct gcctgacggt ggcagcttgc tggcgtccgg cgtcgaggga gtggcagcgt tgtatgacca agggagcggt ggacttcacc ggaaactgtg tgacgatgaa gatgctgacg gcatgacggg tgacagtgac cttcgggcac actgcaccgc cggcacccgt ggtgaaccgg gcacagcaat gcgcgtgacg cgcaggcacc ggaaacgtat cgaagaccac tgacatccag gtggaacggc aacgcagtac gcagtacgct gcagggcggc ggagaatgaa gacgctcacg ttttgagtat ttatcagcgc gacgcatctg gagcgagtat gacgtacagc cagtgcccct ctattactgg ggtgtacagc tgaccgttat gatgagcaac 28920 28980 29040 29100 29160 29220 29280 29340 29400 29460 29520 29580 29640 29700 29760 29820 29880 29940 30000 30060 30120 30180 30240 30300 30360 30420 30480 30540 30600 30660 30720 30780 cggctgccgg gctgtggtcg tgaccgttac ggctataacg agtggggcga gctgaccacg 30840 If .gg. cggcgcgacc aacacggaga gggcggcaca ctgttgcag ccgtacacgc cacacggatg gcggggtata tttcaccagc aatctgttca ctgaggggcg cttggtttaa gatactggga atagatgctt gaaaaatcat gaagggcgat ataactgtaa aattgtgggt acagtattat acctaagctg gaaacgaat t ttaaataata ccatcaatcc cgtacgggcg tcaggctgtt aacagccgta attaccacac tgtgggcggg cgtactttca attacaatct ttgggctggc gatcctttag ggatttgaaa aaaactggta cctcatgcag agcaac tgga cgcactacgg cgggccatac agaacgtgca cggtggcgag tgacgggcac tcagggggtt CgctgCggCt gatattatgc ggttaaacct ccgcgactgt cagtagtaca ttggtaagca tagtacctac tagctaaacg ggggcgagaa gagcaaacat tatagatttt tatgtcagaa ttctacattt gttggccggg ggagacgcgc gcacacgggc gcaggagaac cacgccggtg ggacgtgacg gtatatcagg ccagccgctg gttcagatat ggggggggct gacttgctat atgcgggtat ctgttgttga atatggatgt gtggaacgcg ctacgatgcg ggcgcggagc gcagcagggc cgtgacgggg gccgcaggag tggagagaat gccggggcag accggagtgt ttatcagtat tgggcgatgg aagttcaaca agcaaaaaat aaatgaagga caaaggtttg aattaatggg actgattata gaagatgacc gtaatggatg gatgaacttc taagaagtta taccggtacg catacggcgc gtgcagcagc gcgagcgtga ggcacgccgc gggtttggag cggctgccgg tatgcaccgg gaatctttac cctggagcat gacaaaccct gtttaaaggt gacagcaggg 
caggggcagc c tggggaggc cggacggact tggcggacct cggggagaaa gtgacggcgg gcggcggaca tattttgacg ggacggtttg gcgccaaatc atggggcctg gggacaactc ggtgctatgt tgggcaaaaa cctgttcctg gaagttgaag cggtaattca ggtatatatc tggatactgg tggatatatt actcttcccg atgcgctggg ggagccggac agggctggcg cgggaaaggg aggaggtgac aaaatgcggc ggcagtattt agtgtggacg cagtatgcgc caatctaatt gaggatgtca ccaaatgggg catgataaac tgacgcgggt gaacccgcaa t tgtgtggga atctgtacga gcaggcaggt cggacggaac tcagcaacag acgagacagg tcagtcagga ctctcaaata cggaatatca atgttgccta atgttgaatt tagtagggcc aaatgccaac caaaatgcta t aatgaaaat aagatttact tttatataaa tgatgatttt gctgttttat eaggcgggtg ggactttgtg gacatatctg agaaagcagg ggcggcggac ggacatcagc tgacgacgag gttcgtcagt ctaatccgat ttgatgcggc ctttctcgaa ctaaagttgc cacatgttgg catcagcggc ggcgacgtac ggggttcagg tgcggaacag gtggtattac gctggtgtgg cggggcgtac gctgcattac tccgatcggg tatagaccca gcaaatgctt ccctgctgat tgatgtgcct agattctatc agcagaaaac aataaattta gatttatctt gtatgggatg ttaaacaaga atgataagta tatctaaccc agcaaggcga tgggaggggt tacgatgcgg caggtgtggt ggaacgctgg aacagcgggg acagggctgc caggatccga tagatggatc aaggagaacc agtcgatccc ttatgatgca ttggcaatcc 30900 30960 31020 31080 31140 31200 31260 31320 31380 31440 31500 31560 31620 31680 31740 31800 31860 31920 31980 32040 32100 32160 32220 32280 32340 32400 32460 32520 32580 32640 32700 32760 32820 32880 0, 0 *l 6 0 0.*W* 0'* 0 .0.0 0 gcaggaaaaa catccgcatc ttagaatcga aacccgaaca ggtggttatc aaaacatatc tatttttgat ccctgcctga tatcttggct aaagcgcgag ggcgcgattg taaccgggat gttatttgct acgaagggcc at ttatgaa atataaaacc atataattca aaatagtaag ggctgtgata tttaaattag catctttctg agtaatgcaa aaattagaaa aatttagatg tacggaaatg aaagggggtt gggaaagatc cccccgcctt tcacttcgtc cgtgcccggt tttgtgagta aggcattcat tggagctcag gaggttccgg gctctgactc actggagaat aatcgtttct attaaaacat acaaatagta tgctgatgaa aat tatagaa tatatgtgaa tgctctttag tctggcaggg agatccgcta cctggagcaa gagcattaaa aatgtaatct agaaccggaa tttgatcatg ggagttggtt agaagtcgag atgaagatag aattatttga ggcgaattat atggaagtga gtatctattt attttctcat 
ttgatatcaa ttttgtcttc aatatgcttc cgtctctgca ctcaaatccc acaggcggtc atttgctgac ggtgagtgag aggagctaat taagggagat gcgattaaaa gagataaaag agacaatatg agtaaacaac gataatgagc aattgcagat aatgatcatg cgatatagtt gaaacagcaa gacctgaatc ataaacctta gcactggcag aatggaaagt ccataggtga ggcgaacaga ccggtggagg aacaatgctg atattgtttt tataattaaa tgaaccaatt ctggagaatt tgcacttgga ttcagaaagt aaaacggctt ccggatttta atcgactttt atcaggatca gcttcttcct atgcaggttt taaataaatt tatgggcgcg agaggtaata gataaatgtt atctcggtat agatatatgt tcttttctta tcaatgaaag aaatatatgt attttgaata gtgatttgat gcccatattt ttctgaggtc cagttgatgc ttacattggt gttatcaaaa cagaggttac aaaaattccc tcccagcgca aaaatgcgga cgttttgtga gggcgaatta aaaccccctg atagtggata attggtcatc ataggtgatt gagtggaaaa gaaattgcct ttatttaatc tcacctgata aacacccctc gccagtcacc cgctgttgat cttatttatc acatcctggc ttacttatga aaattcaaat tcatgattat taatggtaat taccgataat caatgtaatc atataacgtt ttatgttgca tgtatgctca aggcgttact aggaagatag gacaggttat attactaatg gaaaatggaa gagaaatatt tcaacaaata caagcattta tgagtgatta agcccggaga taacactaat gaataacaga catattcttt aggttaatta cctgtaaaaa cacttcctaa gaaaatgaaa cccgttcacc aagctccttc ggagatcttc cttaaggtgc atggcggacg cgccggatgc accgcgcctc tggcccacaa atgtctgaac cgtgtcgata cctagaacct tctggatata aacaaacata cctcttaact gatcatgaac accattaagt agccgaagat cataacccat agggtttatg atatggtttg ggatgctgcc atatagagaa gagagaataa aagactctta agaattttgg catattttgt gactgtcgga gttagaaatt atttgataag caatccaaaa gaaagactgt attatctcct ataaaaagcc acattattta cgtagatccc acggtgacgc tggctgggga ctgacgccgg tggctgattg cctctccccg 32940 33000 33060 33120 33180 33240 33300 33360 33420 33480 33540 33600 33660 33720 33780 33840 33900 33960 34020 34080 34140 34200 34260 34320 34380 34440 34500 34560 34620 34680 34740 34800 34860 S S
5.55 5 *5 gccagccccg gcctgcatta atccaatagg atagtgatcc taagatgaaa caaaaagtaa gttgatgttt agtaatttaa cgttgtgatg atgcttatta ttgaaacaaa cccggctttt cggaatatct cgtttattaa tgcggcgtga aaccccgcct tatgtaagta ctgccaacgc gtgattagat tctgaagcca acaaaaggct aacggctgcc ggtctgtggg gtcgattgaa tacagcaacc ctggcactgg aagccgaaga aattcttatc gaatataact caataacgtt ggggcacctg attaatgtgc aaggaagggg tagtgttgag gccggtgatg aaatctgttc gctgaaaggg tcttggcctt aatggttgaa cggatttgct ttactacagg tgttgagatt aggatgaatt aaataatgaa acatctggaa tattatcaat ccggtattgt tgcgggttac tcacgatgca gcggcatcca tcccgcataa aggagatcgc cagtcgaatc ccagggacaa ccaccgtatt ggtacgcaat cgtcgttttc atcgatctga gcggttatct ctgaatgggc aaaacgtttt tgttcagaac gcgaacgtcc atctggccgg cgatgaattt catgttcacg agagctt tt t actctcgttg agcagccggc agatattatg ggatggaacc attacttgtg aataattgcg tgctggaatt aataagtatg cggtaaggag taatgtatat tccattatat ccggatgaat tactcattaa tcagcgcccc gcccatccct atccctcatt ttcaccgtac tcgagccat t tatgcgtaaa cggacggact ctggaggtct tactgtctgc ccctcgccac gcacgttcaa atctgccagc ggccatgctg aaagaaacat catcacgcgc gctgaatgag acatgaatcg aatcttaaaa cgctgcactg gggaggttgt catgcctgta gagcgttata tgccggggcc taccggagtg gatatcattc gtgctgatag ttagtgaatg gtgt taacga gttaacgata cttgatgaat cgagaaacgg gaaaaacatc aatgtgtaaa ctcctgttcc ggaaatgttt tcaatacagc acccgcacct aaacgaactg cacatttaga gcccgtatta gttaaagatg ggatacggcg tcaagctgaa tggtcacgcc tgttgttgat tctgcgagtg cgtatggata gcagtaaagc tttaaccgga gtgtgggaaa cggaacaata atgcatggaa aaaagcgata gcgacgtttg taatcagtct attgcttttc gtattttgac tggctggttc tccgctgaat aggtgattct ctcatttatg atgcaaataa aggatacagc ctgttgctgt ttggttttat cagaaataaa agccggaggg gttcttttgc ttaaccactg caaagagtcc tgacgggcat aaagt ctCt C gatcatccga ctgcgcacca tctaccggga gcagggatac ttttcgccgt ggaagcgctg gac tgtaatc gtccgtgtac agggaccgga tggcgtttat cataccgtac ttacggataa tgataccgaa ctaaaacggg ccggatgaga cataatccag ggcctgtgtc tgtttcggaa gatgaaacgg gtcagtcagg cctattacag ggcaagttat cctttaccta aatcaatgct tatattaatt tgttcagtcc catgggtgaa accaaaagga 
gttatctttt gtttaatcac ttctgcactc gtgggtatgc cccgttaata aacgcgtgag cataatcaat gatcatcgct gcccggtatt gcgtggaatc aagggtaaac aaccagagct gtgaagcgtt tcgacaggat atttatctcg ccagccgggt agaaatactc agggttatca ggaataccgc tctatttaca gctgcttcaa caagaactga agtcagctct aacaagattt 34920 34980 35040 35100 35160 35220 35280 35340 35400 35460 35520 35580 35640 35700 35760 35820 35880 35940 36000 36060 36120 36180 36240 36300 36360 36420 36480 36540 36600 36660 36720 36780 36840 36900 tccattaa gatgtatg gataagtt tggaaggt atcaacac tatagatt ttattacc gatgtgct ata aaa Ile Lys tgt tat Cys Tyr ag atcttccctg c~ tt taaatataaa c tc ccacacccag tc at aattgctttt t~ tg tttatattgg c( tt tatgacatta ci ag ccacggattt ti gg attattttaa ti aaa. ttg att atc Lys Leu Ile Ile gct ggc tca ttt Ala Giy Ser Phe gaggaaaag aatgttgca gtagtaggt 3actggcat aataacgca ttatttgaa ttacatacg ttggtttaa ttaactaata acgatgcctg gtcatggtaa tctattccac atttttttca t tggtaacaa gtaagatttg aaaaggtggt atcttaccgt ataattatcc tgttatcact cctgacaaca gattaagagg ataccaataa gtatggcgtt cgagt tagga tctcttcgaa tgaatgtaaa cgatgttaac tgctctgata gtacaagctg atgtattctg 9* 9 *9 9* 9 9 *9*9 9* 9. 9* *9 *9 .9.
9* 999 9 9 9 9 9*9* 9* .9 99 900** 0e@9 See. *99* S *9*9 *9*9 *9 5.90 9 *9 99 *9*9 *0 9**9 tattcaa atg aaa agc Met Lys Ser 1 atg atg gct gct agt Met Met Ala Ala Ser agt gcg ttg agc Ser Ala Leu Ser ccg aac tca Pro Asn Ser caa caa aaa tca Gin Gin Lys Ser att gtg ttt Ile Val Phe ccc caa gat Pro Gin Asp acc gta tcg ctt Thr Val Ser Leu att cca Ile Pro gtg tcg ggc Vai Ser Gly aag ctt gta Lys Leu Vai att tct aac Ile Ser Asn gct ggg aaa Ala Giy Lys cct agc gcg Pro Ser Ala aaa att gcg Lys Ile Ala gtc agg ggg Vai Arg Giy 36960 37020 37080 37140 37200 37260 37320 37376 37424 37472 37520 37568 37616 37664 37712 37760 37808 37856 37908 37963 gtt aat tct act Vai Asn Ser Thr ctt aaa gaa ttc Leu Lys Giu Phe aac gtg gta.
Asn Val Val act ggc act Thr Gly Thr tgg cgt gta gct Trp Arg Val Ala ggt aaa Giy Lys aat act ggt Asn Thr Gly aaa Lys 105 agc Ser gag atc ggt gtg Giu Ile Gly Val tta tca agt gac Leu Ser Ser Asp aga aga. tct Arg Arg Ser acg gaa aaa Thr Giu Lys aat ggg gtg aac Asn Giy Vai Asn tgg atg Trp Met 130 acc ttt aat Thr Phe Asn cag aat gtc Gin Asn Val 150 tat caa cct Tyr Gin Pro 165 gac aca ctt Asp Thr Leu gtc ctg aca Val Leu Thr gga ccg gcg Giy Pro Ala 145 gta gtg gga Val Val Gly gct gac acg Ala Asp Thr cca ata act tta Pro Ile Thr Leu taa tagtaaacaa ctattagtgt attgtgcctt gtttaaggcg caatacacat caaatcatct atttttcttt tacaattttt gat atg aaa ata gtt Met Lys Ile Val 170 a.
gcg gta Ala Val aac cag Asn Gin gct aca Ala Thr 210 gtg agc Val Ser 225 gca gca Ala Ala ttt cgt Phe Arg ggt ggt Gly Gly aag gcg Lys Ala 290 gcg acc Ala Thr 305 ttc cgC Phe Arg tta aga Leu Arg ccg ttt Pro Phe aca ggg Thr Gly 370 cca ggt Pro Gly 385 ggt ggt.
Gly Gly gct Ala caa Gin 195 cga Arg aac Asn gac Asp tta Leu gac Asp 275 gta Val ctt Leu ccg Pro tgg Trp tac Tyr 355 ctt Leu agt Ser gaa Glu ttg ttc Leu Phe 180 tta aat Leu Asn gtg att Val Ile ccg cag Pro Gin aaa agt Lys Ser 245 gaa gca Glu Ala 260 atg cca Met Pro cca ccc Pro Pro gac ctc Asp Leu gat gcc Asp Ala 325 gtg gag Val Glu 340 atg aat Met Asn gag tat Glu Tyr gcc cat Ala His agt cat Ser His gcc Ala tca Ser tat Tyr aat Asn 230 tcg Ser aac Asn acg Thr gaa Glu aat Asn 310 gtg Val acg Thr tta Leu gtc Val ggt Gly 390 ccg act aat tct atg gtt Met Val tta ttc Leu Phe acg gct Thr Ala ttg gtt Leu Val 235 ttt ttg Phe Leu 250 caa ctg Gin Leu act tta Thr Leu tcg gat Ser Asp aac gcc Asn Ala 315 ccg gaa Pro Glu 330 ctt aag Leu Lys aca gta Thr Val gct gac Ala Asp tgg aga Trp Arg 395 gtt ctt Val Leu 38011 38059 38107 38155 38203 38251 38299 38347 38395 38443 38491 38539 38587 38635 38683 Thr Asp Phe Pro Phe His Tyr taa atccaggggc ttagcggcag aaa atg aag ttc aaa caa cct gcc ttg 38736 Met Lys Phe Lys Gin Pro Ala Leu 415 420 ctg ttc atc gcg gga gtg gtt cat tgc gca aat gcg cac act tac 38784 Leu Phe Ile Ala Gly Val Val His Cys Ala Asn Ala His Thr Tyr 425 430 435 006 too.S 0 soS 0 .0 100 a aca Thr tcg Ser gtg Val 470 t tg Leu gtC Val t tg Leu att Ile ctg Leu 550 cca Pro agt Ser gac Asp tgg Trp, ggg Gly 630 cta Leu ttt Phe atg Met t tg Leu tta Leu 460 cgt Arg gga Gly tac Tyr ccc Pro gta Val 540 gcg Ala gat Asp acg Thr caa Gin acc Thr 620 tat Tyr ctg Leu acc Thr cgg Arg gcg Aia ggg Giy cgt Arg 480 ctg Leu acg Thr tgt Cys aat Asn gaa Giu 560 gcg Ala atg Met gga Giy Arg gag Giu 640 act Thr t tg Leu ccg Pro ggg gtt Gly Vai 450 tat cgc Tyr Arg gtg gtg Val Val ct tgt Pro Cys gat tac Asp Tyr 515 gat ctg Asp Leu 530 cag caa Gin Gin aag ggg Lys Giy ctg atg Leu Met atg gtg Met Val 595 aat ata Asn Ilie 610 agt caa Ser Gin gga ctg Giy Leu cag ggg Gin Gly tcg gat Ser Asp 675 gtg cgt Vai Arg 690 38832 38880 38928 38976 39024 39072 39120 39168 39216 39264 39312 39360 39408 39456 39504 39552 39600 
gcc cgc acg cag gct cgg gtg gag gto Aia Arg Thr Gin Aia Arg Vai Giu Val aaa cag aat Lys Gin Asn tao acc att Tyr Thr Ile 0 6S 0**G 0* *00* d* *00*
*0.009 *0*0 695 tac aac acc act Tyr Asn Thr Thr 710 gta. aca. gac agt Val Thr Asp Ser ggc agt aca. caa Gly Ser Thr Gin 745 ctg cac cag gga Leu His Gin Gly 760 tcg tca. gac tct Ser Ser Asp Ser 775 atg tat ggt ctg Met Tyr Gly Leu 790 gca acg cat aat Ala Thr His Asn cgg tgg ggg agt Arg Trp Gly Ser 825 cag ggg gag gcg Gin Gly Giu Ala 840 aac cag ctg act Asn Gin Leu Thr 855 tat gcc tcg cag Tyr Ala Ser Gin 870 cga cat aat ggc Arg His Asn Gly agc tcg cgt act Ser Ser Arg Thr 905 ggc aat ctg agt Gly Asn Leu Ser 920 ggt cat gat gac Gly His Asp Asp 935 ggc tcg ctg tca Gly Ser Leu Ser 950 gtg gcg Val Ala Pro 715 ggt gat Gly Asp ttt gtg Phe Val ttg aag Leu Lys acg gat Thr Asp 780 tgg aat Trp Asn 795 gct gca Ala Ala tct gtc Ser Val cag caa Gin Gin acg ggg Thr Gly 860 tat aac Tyr Asn 875 cgt cta Arg Leu ctg atg Leu Met acc ggt Thr Gly tac gga Tyr Gly 940 aac tg Asn Trp Phe gtc Val 735 tat Tyr ctg Leu cag Gin gca Al a 99 t Gly 815 agc Ser tcc Se r ttt Phe tcc Ser tgg Trp 895 cag Gin acc Thr tgg Trp aac Asn ccc gga ccg ttt 705 ctg Leu gtg Val acc Thr gcg Ala gcg Ala 785 ggc Gly 999 Gly aca Thr cga Arg c tg Leu 865 gtg Val gaa Giu tgg Trp tgg Trp acc Thr 945 acg Thr cgg gat Arg Asp tgg gag Trp Glu ccg gcg Pro Ala 755 ggc cga Gly Arg 770 cag gct Gin Ala ggt ata Gly Ile gga tct Gly Ser cac agt His Ser 835 ctg cgt Leu Arg 850 acg aga Thr Arg ctc gac Leu Asp aat ttg Asn Leu 999 agg Gly Arg 915 cgt aat Arg Asn 930 tct atc Ser Ile ctg tgg Leu Trp 39648 39696 39744 39792 39840 39888 39936 39984 40032 40080 40128 40176 40224 40272 40320 40368 40416 955 ggc gcg cac cgt aaa gag aac ata acc agc ctg tgg ttc agt atg cca.
Gly Ala His Arg Lys Glu Asn Ile Thr Ser Leu Trp Phe Ser Met Pro 970 975 980 tta agc cgc tgg acg ggg aat aat gta agt gct agt tgg cag atg act 40464 Leu Ser Arg Trp Thr Gly Asn Asn Vai Ser Ala Ser Trp Gin Met Thr 985 990 995 tca cca tca cac ggt ggt cag acg caa caa gtg ggg gtc aac gga gag 40512 Ser Pro Ser His Giy Gly Gin Thr Gin Gin Val Giy Val Asn Giy Giu 1000 i005 i010 gca ttc agt cag caa ctg gat tgg gag gtg cgt cag agt tac cgt gcc 40560 Ala Phe Ser Gin Gin Leu Asp Trp Giu Vai Arg Gin Ser Tyr Arg Ala 1015 1020 1025 gat gcc ccg cca ggt ggt ggt aat aac agc gca ttg cac ttg gca tgg 40608 1030 1035 1040 1045 aat ggg gat tac ggc ctg tta ggt ggt gac tat agc tac agc cgg gcg 40656 Asn Gly Asp Tyr Gly Leu Leu Gly Gly Asp Tyr Ser Tyr Ser Arg Ala A1050 1055 1060 atg cgc cag atg gga gtc aat at c gcg gga ggt ata gtt atc cac cat 40704 Met Arg Gin Met Gly Val Asn Ile Ala Gly Gly Ile Val Ile His His 1065 1070 1075 cat ggt gtg acg ctg ggg caa cct ttg caa ggc tca gtg gcg ctg gtt 40752 His Gly Val Thr Leu Gly Gin Pro Leu Gin Gly Ser Val Ala Leu Val 1080 1085 1090 gaa gcg cca ggg gcc tcg ggg gtg cca gtt ggc ggc tgg cct ggc gtt 40800 Glu Ala Pro Gly Ala Ser Gly Val Pro Val Gly Gly Trp Pro Gly Val 1095 1100 1105 aag acg gat ttt cgt ggc gac acc aca gtg ggc aac ctg aac gtc tat 40848 Lys Thr Asp Phe Arg Gly Asp Thr Thr Val Gly Asn Leu Asn Val Tyr 1110 1115 1120 1125 cag gag aat aca gtc agc ctc gat ccg tcg cga cta ccg gat gac gca 40896 Gin Glu Asn Thr Val Ser Leu Asp Pro Ser Arg Leu Pro Asp Asp Ala 1130 1135 1140 gag gtc aca caa acc gat gtg cgc gtg gtg cca acc gaa ggg gcg gtg 40944 Glu Val Thr Gin Thr Asp Val Arg Vai Val Pro Thr Glu Gly Ala Val 1145 1150 1155 gtg gaa gcg aag ttt cac act cgc atc ggg gcc agg gca ctg atg acg 40992 0*r0 Val Glu Ala Lys Phe His Thr Arg Ile Gly Ala Arg Ala Leu Met Thr 0e&:1160 1165 1170 ctg aaa cgg gaa gat ggt agc gcc att cct ttc ggg gcg cag gtt aca 41040 S Leu Lys Arg Glu Asp Gly Ser Ala Ile Pro Phe Gly Ala Gin Val Thr 1175 1180 1185 4 gtc aat ggg cag gat ggc agt 
gct gct ctg gtg gat act gat agc cag 41088 Val Asn Gly Gin Asp Gly Ser Ala Ala Leu Val Asp Thr Asp Ser Gin 1190 1195 1200 1205 gtt tat ctc act ggt ttg gcg gat aag ggc gaa ctg acg gtg aaa tgg 41136 Val Tyr Leu Thr Gly Leu Ala Asp Lys Gly Glu Leu Thr Val Lys Trp 1210 1215 1220 gga gca cag caa tgt cgg gtt aac tac cgc cta cct gcc cac aag gga 41184 Gly Ala Gin Gin Cys Arg Val Asn Tyr Arg Leu Pro Ala His Lys Gly 1225 1230 1235 ate gcg ggc ttg tat caa atg agc ggt ctc tgc aga tag ccgattctga Ilie Ala Giy Leu Tyr Gin Met Ser Gly Leu Cys Arg 1240 1245 1250 aggagagaat. a atg tgg atg aaa. ata eag ega gtg aaa. acg gtt ate tat Met Trp Met Lys Ile Gin Arg Val Lys Thr Val Ile Tyr 1255 1260 agc gta Ser Val 1265 gcc gca Aia Ala 1280 age tta ctg gte get gcc Ser Leu Leu Val Ala Ala 1270 gaa. aaa ctt cag aca. acg Glu Lys Leu Gin Thr Thr agt agc ttg gtg Ser Ser Leu Val 1275 eta cgt gta. ggt Leu Arg Val Gly ceg ata gcg aac Pro Ile Ala Asn act tac ttt Thr Tyr Phe 1285 *9 9* *9 9 9 9.
9** 9*9 1290 atg gtg ctt geg Met Val Leu Ala egg Arg 1295 gc t Ala tat Tyr gcg Ala ggg cac gtg cca Gly His Val Pro 1300 eac ggc agt cac His Gly Ser His 1315 ggt aac acg cct Glv Asn Thr Pro gat ggg Asp Gly caa ggc tgg Gin Gly Trp age ggg ttt Ser Gly Phe eg9 Arg 1320 1305 gta Val gtg aet Val Thr 1310 tgg age gat Trp Ser Asp ace gta Thr Val 1330 ege cat eae Arg His His 1345 aeg gtg agt Thr Val Ser 1360 t tg Leu 1335 etg Leu etg etg age gg Leu Leu Ser Gl) eaa.
rGin 1340 tgg Trp gag eaa aag Giu Gin Lys 1325 eag gat eet.
Gin Asp Pro eaa. eea gat Gin Pro Asp 41233 41283 41331 41379 41427 41475 41523 41571 41619 41667 41715 att eag gtt Ile Gin Val ege Arg L350 gag gge gag Glu Gly Giu gg Gl) 1355 ggt egt gge gee Gly Arg Gly Ala 1365 att tta aga ace get gea gat aae gee Ile Leu Arg Thr Ala Ala Asp Asn Ala 1370 1375 agt tte agt gtg gte gtt gat gge aat eag gaa Ser Phe Ser Val Val Val Asp Gly Asn Gin Giu 1380 1385 tgg aeg etg gat ttt aag gee tgt gea ttg geg Trp Thr Leu Asp Phe Lys Ala Cys Ala Leu Ala gtg ect geg gae ace Val Pro Ala Asp Thr 1390 eag gag gat aeg tag Gin Giu Asp Thr 1405 1395 1400 eegtetgtte egaggatgta ateaattget atgtttegta.
gtttgattea tgteaggtat teaggegtaa.
eatttaeegg ceeegtaegt geetgeagag gaeetttgat taatetgcat eacteatact ttgaaaattt gattgatttt aaataattta etettttgtg teatatgtga aaataattta tattaatgta teegggegca gtggaagett ggeggetggc gegeatetet teetgtatea gtgaataagg egaataetgt atateaggtt 41775 aeteaaggca aaaatetgaa aeggaeaetg ageggggett tatttaagat ttttttaeat teaeeatgte gatgaaatgg tttateatgg tggataactg ttcttgtgac aatatggtta.
attcatcttt atgctttat tataagcccc tttatgttta aaaggaataa agtgataaac ctggccgaat tgcaagattg gttgcaggtt cagtttgatc gpaatcttca tttgtaggga tatttctgtt tggtataagg gggggctttt agcatatgtc ccggacagat ctggctggaa cctcgtaaaa
tttccggtgc agtgacggac ttgcagaaaa gaggaatatt ttgtggattt tttgtttatc tcttgtcccc ttatgcacga
aaccttatct aaccgttact gcgtcatgct tgcaggagtt cggtccgtat
41835 41895 41955 42015 42075 42135 42195 42255 42315 42375 42435
tcctgcaggc ggctgatgag tgagtttcac tccgtcggat gggttattgc tgtggctgaa agtcatcggt gctgaaacgt t aagggggca aatcctcccc ttttgagcag aggcagcagg aatgtacaga tatcttcccc gcacataggt ctttcacaaa tgcttgtttc gtcacagaaa tggtatctgg tggtaaacag tattttgcag acgtatcagt taatgatgga cggaagtagg gtaaatcgct atcacttaga cttcctttca taaaattttc ttacagctag ttagtaaatc atctcgtgta tcagaaaata t tcacaacca gaaccggtgt gtcatgctgc tcgcacacgc ttgcttcgtg agtcagcatc gagttggggt tcacagcgta cgtctgtttt cgtcggatag catcatcggc aaataagata tcactggcgc ggcccagcgg tggcgccagt ttgtgttttg taatgtcgag cacgggatac aggagaatgc ttcggcgaca g ttcaat tag ccaatagatt aagtcgttga actgatcaat gaatctttcc tagtcgcctt actgactgga gtccttgctg ggctccggct ctatccggat cctggagttt t cgaaagata tggcaggatc tggtctttcc actctcatga gctggtcaga acacccaccg tcgatattct tgtgctggcc tcacatacct t tggtcgtat atcacaccac ctgactgagc gggcaggggt acaccggagg agggcttcta gagtggttgt aaagctggat agtaagtgtc ccgggtgggg t ttgt taggt ctttgcagaa accttgacga taaccggaga t t ttat taag tagctcattt aatcgtggta cgggctgcgc gagaattcta gctggtcaga gatgaattgc attgtctgtg acaaaaatct gatgagctgg agccct t taa tcacaccctg ggacctctta tgagtgtcaa ggtccgggag acgctgggac tcttagtcgt tgagggctgg gtataccaca caacgcaaag gcccggattt cagagaccag tgtaggctat ccgtgataga cccgctggct tgttgcgccg atcagcgacg ttatttaaac gactcaagta gaacgataca cgggcgcatg tcacgtctga tggatgcaaa atgatatcaa gatataagaa tgcgagactg gggaat taga ttgaggttac cagcctatgc gaaatccgga agtttcttgc acctggat tc tttttaaaag agttggttct gcgatttgga ctcttttccg atgctggtta atgcacctca cagaagcctg aagcggatga tattgttctg gaaaggcgta gagcgaaaga tcttcgctgc gatgaaagtg atcccaaatg cccggttgcg tttggttatg gaattattgc tgacgaaatg cgtctgttct agaataatct attaatgcgc acaatgaacc gatagggtat atcatgatcg ggacattgag tttcttttat gatgatattt tattagcatt tttacttcgg aagcatgaaa taaaattatc caataagtgg acgagagcgt tggcataatc t tgtgcgagt aaatcacatt tcaggatgaa aacaaggccg gtggccttgt tgtcccgtaa gggagatgct actatattca tgaacaatcc atgtgggctg tggcccggca ctgcataccg atggtgctga cagaatctgg ggagattttg cttgcgcaga agaggagcct gtttcctccg 
ggcgatgttt tccagactgc actttattat t tacactgag attcagatgc gcacacaatg caacggaaga attaaagtgg tctgctgtca cgcgattcac aggataacat agagggagta atatgggaat agcgattttt gaaagtgagg ccgaaatata attgtgggtc tcatacacta 42495 42555 42615 42675 42735 42795 42855 42915 42975 43035 43095 43155 43215 43275 43335 43395 43455 43515 43575 43635 43695 43755 43815 43875 43935 43995 44055 44115 44175 44235 44295 44355 44415 9. 9 9.
999 9 9* 9 999 99 9.99 9 9999 *999 999~ *99. 9 999999 9 99 9 9 9999 9.99.9 9.99 ttagtggcgc ttttagaagg aaaataatga ttgtatggcg aagataaatg atggttttga tgcgactggt cggaaatgat cataccagaa ttctgaaggc ggaccagcta tggaggcagc ttgaaatgct ctgttacagc acagagtttc cgaggccaac agccattaac caccaaatgc ccagttctaa cgccgccaaa cctggctgaa tgatatagat tgcagactgc gtttcacggt tgtaacgctg cgataatcgt tctacgttat aatttgtacc caagtgcctg ttaaatcaaa ctgcaagctg gcgcaccgtc gaagtt t ttt tgttaatgtt aaagggtatc acttgaaatt tcatcgtcag gaataatcgc tacgt tcctg cagccagtcg aatacgcgga acgttgcagt gcaggagtat ttttaccgat ctttgtgtct tactttgcca atattttgca attaatacca gcctgcgctg catgtaagtg aataatatcc acgtactgga aacgcctttg ggttgagttg cccttctttt gaaaaagttt atagggtgtg ggttttctgt cggtatggaa gtaacgggta cgacctggat ggcgagatct ataatattta ctgttttacg tttaattc gatccgggta catttatctt aacctcatga atacttcctg tattattctc ccacaaattg acttaaaccc tcttttgacg aacaggacgg gtactgaaaa tcattactga agtgtttaaa gcatcataag gcat taatgt ccgatacctg aaaccaacag atatagggag gtcattcgga tgcccgcctt agctggaatg gtacggtcag attccataga gctgatgcag ttcatttcag atataacgat aagcgggtaa atccatctcc gtgtgagatg aaagcccgtt tgttgtacct acccgataat gcggcgtcat ttcatagaaa atctgatcgt ccaataatct ctccggcctg aatggagggt ctttatttca attttctgaa acatgatgac atgcccaata ggctgttttt atggcgccag tatcctccat aggcgtaagt cggtatgttc atttataact cgccccaggc gaatggtgtt taaatgccgt cctgattttt tagcatccgt gatcataaaa gtaacgttgc cattgactac tagcggacca aacctgcctt tacataaacg aaaatgagat tcgccaaatt gagcgatgct atcccggaat gttgacgctg agcgagcctg cccatttacc tctgactgac ttgtcagctt tttgagtgtt gcttgccacc tccagatctc acgctttctg tgatgattaa tgaat tgagg caaaacatga gggcagccgg gttgcacctg tgcgccttcg aatgccgagc atcaccagca ggcgtcaatc aaagttattt attacttagc actattgtga tacatttatg ctcacctctg gtcataaccg atgaccatta ggatgtcccc ggctaccaca aatattgggc aagcccaaaa gaagatttta gccccacgta gtaagaaaaa aacgggcaaa ttctggtgtt ttgttctatg ggatttaatc gtcggtttcc cagatgctgg cttaatttgg aaatacttaa cccatttatc tct tt tat tg cccgtttgga catcgagata tgccgactga aagttaagcc ttcaaagatt 
ggaacccaca atgaagtcat tagtgatttt ataatattat tttgaagcag ttcacatgag aaatcataat tgtactggat aaagtggtat atagcaaccc actatctcat gctttcccgg aggggaatga taaaagacaa aacggtctat aataacaata cggtttcact gatgaagatg aatatttact aagactttgc ttactgaagg acctctatca agttgttggt acaataggat 44475 44535 44595 44655 44715 44775 44835 44895 44955 45015 45075 45135 45195 45255 45315 45375 45435 45495 45555 45615 45675 45735 45795 45855 45915 45975 46035 46095 46155 46215 46275 46335 46395 cctgggcggg caggggagga ttggggacag cggtggcgaa agcgccaaaa gaaacgcccg 46455 ccagagtcgc ggagtgatta agtagggcta ttttgaacgc acgtgtacgc tggtgggata tttacggaga tgccagtaaa accgttttta ttttacttgc cgctagcggc tccggttttt ggctcttagt gttacggagt gttgtgcgta ttacccacga cattttggac ggcccgaagg gcgcgctgtc cagaatacgt gaaacaatcc caaagttttt atggcgagca ctgggcagtg gcgagcgtag cgtgtccaaa tgcctgccac cgccgcggtg catgaagata attatcttag ctcgccaaaa cgagtcaaac ctggctgcgc attacgccac agcggcaggt tcctgataag agcctatccc cgcgttagcg ctcacgtact caataacgcc gcgaatttgt tgctt 46515 46575 46635 46695 46755 46815 46870 0 0, 000* 0.0 0 0.
p0.* *so* 0 0 "a 0, 0*.0 <210> 2 <211> 166 <212> PRT <213> Salmonella typhimurium <400> 2 Met Lys Ser Ilie Lys Lys Leu Ile Ile Ser Ala Leu Ser Met Met Ala Ala Ser Lys Ser Val Tyr Ala Gly Ser Pro Asn Ser Thr Val Ser Asp Ile Val Phe Pro Gin Asp Lys Ilie Pro Val Ser Gly Ala Gly Lys Pro Ser Ala Ala Lys Leu Asn Ser Thr Lys Giu Phe Arg Gly Ilie Asn Asn Val Val Thr Gly Thr Ala Trp, Arg Val Ala Ser Asp Ser 115 Asn Tru Met Asn Thr Gly Ile Gly Val Arg Arg Ser Thr Glu Lys Gly Leu Ser 110 Asn Gly Val Val Leu Thr Thr Phe Asn Asp Thin Leu 130 Gly Pro Ala Gin Asn Ala Asp Thr Ile Thr Leu Val Gly Tyr Gin Pro 165 <210> 3 <211> 245 <212> PRT e* p p.
<213> Salmonella typhimurium
<400> 3
Met Lys Ile Val Asn Phe Ala Val Met Ala Val Ala Leu Phe Ala Thr
1 5 10
Asn Ser Met Val Ser Val Tyr Ala Val Asn Gln Gln Leu Asn Ser Ala
20 25
Thr Lys Leu Phe Ser Val Lys Leu Gly Ala Thr Arg Val Ile Tyr His
40
Ala Gly Thr Ala Gly Ala Thr Leu Ser Val Ser Asn Pro Gln Asn Tyr
55
Pro Ile Leu Val Gln Ser Ser Val Lys Ala Ala Asp Lys Ser Ser Pro
70 75
Ala Pro Phe Leu Val Met Pro Pro Leu Phe Arg Leu Glu Ala Asn Gln
90
Gln Ser Gln Leu Arg Ile Val Arg Thr Gly Gly Asp Met Pro Thr Asp
100 105 110
Arg Glu Thr Leu Gln Trp Val Cys Ile Lys Ala Val Pro Pro Glu Asn
115 120 125
Glu Pro Ser Asp Thr Gln Ala Lys Gly Ala Thr Leu Asp Leu Asn Leu
130 135 140
Ser Ile Asn Ala Cys Asp Lys Leu Ile Phe Arg Pro Asp Ala Val Lys
Gly Thr Pro Glu Asp Val Ala Gly Asn Leu Arg Trp Val Glu Thr Gly
165 170 175
Asn Lys Leu Lys Val Glu Asn Pro Thr Pro Phe Tyr Met Asn Leu Ala
180 185 190
Ser Val Thr Val Gly Gly Lys Pro Ile Thr Gly Leu Glu Tyr Val Pro
195 200 205
Pro Phe Ala Asp Lys Thr Leu Asn Met Pro Gly Ser Ala His Gly Asp
210 215 220
Ile Glu Trp Arg Val Ile Thr Asp Phe Gly Gly Glu Ser His Pro Phe
225 230 235 240
His Tyr Val Leu Lys
245
<210> 4
<211> 836
<213> Salmonella typhimurium
<400> 4
Met Lys Phe Lys Gln Pro Ala Leu Leu Leu Phe Ile Ala Gly Val Val
1 5 10
His Cys Ala Asn Ala His Thr Tyr Thr Phe Asp Ala Ser Met Leu Gly
25
Asp Ala Ala Lys Gly Val Asp Met Ser Leu Phe Asn Gln Gly Leu Gln
35 40 45
Gln Pro Gly Thr Tyr Arg Val Asp Val Met Val Asn Gly Lys Arg Val
55
Asp Thr Arg Asp Val Val Phe Lys Leu Glu Lys Asp Gly Gln Gly Thr
70 75
Pro Val Leu Ala Pro Cys Leu Thr Val Ser Gln Leu Ser Arg Tyr Gly
85 90
f 00 lo 1 *:s *00 Lys Thr Giu Cys 115 Ile Asn 130 Pro Giu Pro Ala Lys Met Pro Gly 195 Gin Arg 210 Ala Glu Lys Thr Met Leu Ala Pro 275 Lys Gin 290 Phe Ala Val Thr Tyr Gin Leu Leu 355 Gin Ile 370 Ala Tyr Gly Leu Ser Asp Ser Trp 435 Giu Asp 100 Ala Asp Asn Gin Phe Lys Phe Leu 165 Asp Met 180 Ile Asn Ser Ser Arg Gly Ser Gin 245 Ala Ser 260 Val Val Asn Gly Leu Arg Val Trp 325 Thr Pro 340 Ala Gly Ala Gin Gly Giy Gly Giy 405 Thr His 420 Arg Leu Pro Gin Thr Ala 120 Leu Gin 135 Ile Ala Asn Tyr Gly Arg Gly Ala 200 Leu Ser 215 Tyr Ser Glu Ile Asp Asn Gly Ile 280 Thr Ile 295 Leu Ser Ala Asp Ile Ala Tyr Arg 360 Thr Leu 375 Gin Ser Leu Gly Gin Arg Tyr Ser 440 Trp Pro Ser Giu Aia 170 Asn Arg Lys Lys Asp 250 Val Arg Asn Thr Ser 330 His Ser Tyr Thr Trp 410 Gly Gin Lys 110 Ala Leu Asp Gin Val 190 Ala Ala Thr Phe Giu 270 Arg Ala Gly Phe Leu 350 Thr Trp, Ala Ser Gin 430 Thr Pro Val Ala Asp Thr 175 Gin Thr Tyr Leu Thr 255 Arg Val Pro Asp Val 335 Lys Asp Asn Ala Val 415 Gin Gly Pro Leu Leu Gly 160 Asp Leu Ser Thr Gly 240 Gly Gin Giu Gly Leu 320 Val Tyr Lys Leu Leu 400 Asp Gly Thr 9.
J*9 09 f#6 0% Asn Leu 465 Ser Ser Arg Ser Gin 545 Thr Val Gin Glu Asn 625 Giy Al a Leu Pro Thr 705 Pro Val Ilie Ilie Ala 785 Leu Val Giu Trp 500 Trp Thr Thr Trp Ser 580 Gly Gin Leu Ser Ile 660 Ser Gly Asn Leu Thr 740 Arg Gly Asp Thr Leu Asn 485 Giy Arg Ser Leu Phe 565 Trp Val Ser His Tyr 645 Val Val Trp Leu Pro 725 Giu Al a Al a Thr Arg Trp Gin 455 Asp Ser Tyr 470 Leu Gin Pro Arg His Leu Asn Arg Pro 520 Ile Gly Gly 535 Trp Arg Asn 550 Ser Met Pro Gin Met Thr Asn Gly Giu 600 Tyr Arg Ala 615 Leu Aia Trp 630 Ser Arg Ala Ile His His Ala Leu Vai 680 Pro Gly Val 695 Asn Val Tyr 710 Asp Asp Ala Gly Ala Vai Leu Met Thr 760 Gin Val Thr 775 Asp Ser Gin 790 460 Asn Gly Asn 475 A~rg Thr Thr Leu Ser Leu A.sp Asp Ser 525 Leu Ser Leu 540 H-is Arg Lys 555 Arg Trp Thr Ser His Gly Ser Gin Gin 605 Pro Pro Gly 620 Asp Tyr Gly 635 Gin Met Gly Val Thr Leu Pro Gly Ala 685 Asp Phe Arg 700 Asn Thr Val 715 Thr Gin Thr Ala Lys Phe Arg Giu Asp 765 Gly Gin Asp 780 Leu Thr Gly 795 Arg Leu Thr 510 Tyr Asn Glu Gly Gly 590 Leu Gly Leu Val Gly 670 Ser Gly Ser Asp His 750 Gly Giy Leu Leu Met 495 Gly Gly Trp Asn Asn 575 Gin Asp Gly Leu Asn 655 Gin Gly Asp Leu Val 735 Thr Ser Ser Ala Trp, 480 Leu Ser Leu Asn Ile 560 Asn Thr Trp Asn Gly 640 Ile Pro Val Thr Asp 720 Arg Arg Ala Ala Asp 800 Tyr Ala Ser Gin Gly Tyr Asn Thr Lys Gly Giu Leu Thr Val Lys Trp Gly Ala Gin Gin Cys Arg Val Asn 805 810 815 Tyr Arg Leu Pro Ala His Lys Gly Ile Ala Gly Leu Tyr Gin Met Ser 820 825 830 Gly Leu Cys Arg 835 <210> <211> 156 <212> PRT <213> Salmonella typhimurium <400> Met Trp Met Lys Ile Gin Arg Val Lys Thr Val Ile Tyr Ser Val Ser *1 5 10 Leu. 
Leu Val Ala Ala Ser Ser Leu Val Pro Ile Ala Asn Ala Ala Glu
20 25
Lys Leu Gln Thr Thr Leu Arg Val Gly Thr Tyr Phe Arg Ala Gly His
40
Val Pro Asp Gly Met Val Leu Ala Gln Gly Trp Val Thr Tyr His Gly
55
Ser His Ser Gly Phe Arg Val Trp Ser Asp Glu Gln Lys Ala Gly Asn
70 75
Thr Pro Thr Val Leu Leu Leu Ser Gly Gln Gln Asp Pro Arg His His
90
Ile Gln Val Arg Leu Glu Gly Glu Gly Trp Gln Pro Asp Thr Val Ser
100 105 110
Gly Arg Gly Ala Ile Leu Arg Thr Ala Ala Asp Asn Ala Ser Phe Ser
115 120 125
Val Val Val Asp Gly Asn Gln Glu Val Pro Ala Asp Thr Trp Thr Leu
130 135 140
Asp Phe Lys Ala Cys Ala Leu Ala Gln Glu Asp Thr
145 150 155
<210> 6
<211> 9253
<212> DNA
<213> Salmonella typhi
<220>
<221> CDS
<222> (1898)..(2608)
<223> tcfA putative fimbrial subunit
<220>
<221> CDS
<222> (2659)..(3234)
<223> tcfB putative fimbrial subunit
<220>
<221> CDS
<222> (3360)..(6029)
<223> tcfC putative fimbrial subunit
<220>
<221> CDS
<222> (6052)..(7131)
<223> tcfD putative fimbrial subunit
<220>
<221> CDS
<222> (7264)..(7719)
<223> tinR putative transcriptional regulator
LO 16* tgtaagtatc gccaacgcag gattagatca tgaagccacc atcttgagga tttgatcact ttggtcagcg agtgtggttg aacatctctc taccgtcggt cctatttcgt cggttttcca ccgatggccc tccgggggcc tcccggacga ttgactcatg ttc tattcct attggtcgtt gtaaacaaca atgttttaaa tggaatattt gagaggtgta atatttttaa ttaattactt tttgtggctt gggctttctt gagggggaaa tgaagaattt ccgcataatc gagatcgctg gttgaatccg agggacaact tgtcaacgcc tctgctgtcc ccccgacttt agtgatgcag cccagtctgc tcagcagtcg ccaggatcag gcttcgcttt acatctgccg ccagcaggat ccttacgatc gcagacgagc gctgcagcgc aagacttgta t tgacaggta aggtgaattg agctaatatt aacatcaagc cagatattga tgcttacttt tgtttaaggt tagattctct tacagttcca ccaaaatttt gagccattca tgcgtaaagc gacggactgt ggaagtctgg aggatttatg gctttccgcc gcgtttttcc tagccgatcc gaagcctt tg gaagaacaga cccccgcgca catcagcgtc ctttcacacc cacattctcg gatgcctggc ctg tttcagc catgcacagg tattcatgtt agggactggc tattttcatg taatctaaaa gtaaccatat gcaagtaaat tt t ttacc tg ttatttggtt gtctcctacg tttatctgag tactgcggcg catttagaga ccgtattact taaagatgtc atacggcggc gtcgtttttc atttcatctt ttcagttggt aggatcgctg tttgacgtca Ctggcttcct tagctcagtt taccaacctg gggggcagcg cagcgctcca tggaagctga cgggattcca aatgtcgact gcttactcgc atcaatttga gtgagttgt t atagtgtatt gatgtatata ctgatctaac gttttgtaaa ttgtagtctg ttgttatgtt taagtcaggt ttattaattg tcatccggca gcgcaccaga taccgggagg atggaagctt gctattttta cactgattgg aactttctcc ttgccagcac ggataatgct cactggtcat gttgtagctg tcctgcggca gccaggtggg cgaacgccag agtcgaactg ttccgcgctg tattttgggc acaaaagggg atggtatttt ctgttttata atattgatta agt t tttgt t tgtaacttta acctcatgat aattgtatct acgtatttgt acacagtaac ttcagcgatt taatcaatct tcatcgctgt ccggtatttc ctgatattaa atatccgctg cgttgcgctt tttgatattc gttatcaccg cgctttttca tggcaggtaa gcgctccagg tgaacaacac ttttctccac accggccaac ctccagcgtt atgtctgccg tgacagaggc gaaaggaacc ttataacatt ttgttgtttg cactttgctg tgctgatatc gttttctaaa gatcagtcta ccctcctgga tgctttgaag aactttctta cttacagatc 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 tgtcgttcgc 
ttttggtgaa tgaaatccgt ggacttttat ttactaattt tttctttcct 1740 gaaaaaaaca gaggtattga gcgaaaaatt ttattccgta tgatgccctc cacacaaaat 1800 gtattaacac tgaatcgtaa tttgcttctt tatgctgata actttctgtc tatgctaata 1860 ctaaaattta gatgactttt atacggtaaa atctggt atg aat ttt aaa gat act 1915 Met Asn Phe Lye Asp Thr 1 ctt ccc ggg gtg ttt ctc tgt gtc gct atg ttt gca tgt ggt cat gcc 1963 Leu Pro Gly Val Phe Leu Cys Val Ala Met Phe Ala Cys Gly His Ala 15 agg gcg aat atg ctc gtt tat ccc atg gcg gca gaa att aat agt agc 2011 Arg Ala Asn Met Leu Val Tyr Pro Met Ala Ala Giu Ile Asn Ser Ser 30 cgc gaa gag gcc acc tcg ctg ttc gtc tat tct aaa tca gat cat gtg 2059 Arg Glu Giu Ala Thr Ser Leu Phe Val Tyr Ser Lys Ser Asp His Val 45 caa tat att cga aca aga atc atg cgt att gaa cac ccc ggt atg cca 2107 Gin Tyr Ile Arg Thr Arg Ile Met Arg Ile Giu His Pro Gly Met Pro 60 65 cag gag aag gag gta cca gca ggg aat gat ata gag aca gga ctt gtt 2155 Gin Giu Lye Giu Val Pro Ala Gly Asn Asp Ile Giu Thr Gly Leu Val 80 gtc tcc ccg gag aaa ttt gct ctt tcc ccg gga aca aaa aaa aca ata 2203 Val Ser Pro Giu Lys Phe Ala Leu Ser Pro Gly Thr Lys Lye Thr Ile 95 100 cgt gtt atc agt act cag gca ccg gaa aga gag gaa gcc tgg cgg gta 2251 Arg Val Ile Ser Thr Gin Ala Pro Giu Arg Giu Giu Ala Trp Arg Val 105 110 115 tac ttc gag gct gtt cct gaa ctg gaa gat gat cca cag gca ggc gga 2299 ,0 Tyr Phe Glu Ala Val Pro Glu Leu Giu Asp Asp Pro Gin Ala Gly Gly 120 125 130 0%0 .0 .0 .0 4 aag caa aat tca tcc gta agt gtg aat ctt gtc tgg ggg gtg ttg ctg 2347 N Lye Gin Asn Ser Ser Val Ser Val Aen Leu Val Trp Gly Val Leu Leu 135 140 145 150 cgt gtt tct ccg tca gac ccc agg cct gcg ctg gta acg gac ggt cac 2395 0 Arg Val Ser Pro Ser Asp Pro Arg Pro Ala Leu Val Thr Asp Gly His 155 160 165 cac ctg ctg aat ag gga aac aca cgg ctt tct ctt att cgg gct ggc 2443 His Leu Leu Asn Thr Gly Asn Thr Arg Leu Ser Leu Ile Arg Ala Gly 170 175 180 aac tgc gac acc aca tgc cac tgg cag aat ata ggc aaa agt att tat 2491 Mer Asn Cys Asp Thr Thr Cys His Trp Gin Asn Ile Gly Lye Ser 
Ile Tyr 185 190 195 9 ccc ggc ggg agt gct gat att ccg gcc gga ata aaa agt aat gca ttt 2539 Pro Gly Gly Ser Ala Asp Ile Pro Ala Gly Ile Lys Ser Asn Ala Phe 200 205 210 *cgt gtg gaa tat cgt acg ggt gca aat tca ccg gta atc tct gct gat 2587 Arg Val Glu Tyr Arg Thr Gly Ala Asn Ser Pro Val Ile Ser Ala Asp 215 220 225 230 tta aca gca gcc gga aag taa aaacacacgg agcgtacgct ataccctaca 2638 Leu Thr Ala Ala Gly Lys 235 tttattctca gggggagcgg atg tat acc gag tgt aca tat atc act gta ata 2691 Met Tyr Thr Glu Cys Thr Tyr Ile Thr Val Ile *t.
aac aac aaa Asn Asn Lys 250 gcc gca gct Ala Ala Ala 265 gtt cag aag Val Gin Lys ctg ctg cag Leu Leu Gin ttc atg ccg Phe Met Pro 315 tac agc aac Tyr Ser Asn 330 cca caa ctt Pro Gin Leu 345 gtg act ctg Val Thr Leu gct aaa acc Ala Lys Thr ctg aac ctg Leu Asn Leu 395 CCt 9CC ggt Pro Ala Giy 410 gtc act gcc Val Thr Ala 425 Leu Phe 255 ttg 9CC Leu Ala 270 acc gtc Thr Val ggt tca Gly Ser ggc ctg Gly Leu acc aag Thr Lys 335 gtc ctg Val Leu 350 cgg tca Arg Ser ccg gac Pro Asp ggt cag Gly Gin agc gga Ser Gly 415 Phe Met acc gtt Thr Vai act gcc Thr Ala tcc ctc Ser Leu 305 gtc cat Val His 320 tcg gtt Ser Val gat ccc Asp Pro ctg ac Leu Thr gga aaa Gly Lys 385 aag gct Lys Ala 400 ttg gtc Leu Val Asn Met Lys 260 tat tct ttt Tyr Ser Phe 275 aat att gac Asn Ile Asp 290 ccg tcg act Pro Ser Thr aaa tca ctc Lys Ser Leu aat gta aaa Asn Val Lys 340 acc aaa acc Thr Lys Thr 355 acc acc aat Thr Thr Asn 370 act ggc gat Thr Gly Asp gga gca gcc Gly Ala Ala agt ctg gtg Ser Leu Val Thr Ser Phe tct gtt tct Ser Val Ser agt aca ctt Ser Thr Leu 295 atg aag ctg Met Lys Leu 1310 cag acc cgc Gin Thr Arg 325 ctg ttg aat Leu Leu Asn att gat atg Ile Asp Met tct gta ctg Ser Val Leu 375 gct tca gct Ala Ser Ala 390 tta caa aac Leu Gin Asn 405 att tca cag Ile Ser Gin tta ttt ttt atg aac atg aaa aca tct ttt att 2739 2787 2835 2883 2931 2979 3027 3075 3123 3171 3219 3274 3334 3386 3434 taactggtta ttagctcttc atctgatccg gttttggggg gcaccgt tcg tacctgaacc ggatccggta ttgatcttat tctttacgtg agtcgttatt tctgg atg tat tat tta Met Tyr Tyr Leu 430 ttt acc agc cag gca act ctt att ccc cct cct Phe Thr Ser Gin Ala Thr Leu Ile Pro Pro Pro 440 445 tattcattgc aattcaggtc ctg gga ttg tgc agt Leu Gly Leu Cys Ser 435 gga ttt gaa tct ctg Gly Phe Glu Ser Leu 450 ctg gaa gga cag act gag caa att gaa gtg ttg Leu Giu Gly Gin Thr Glu Gin Ile Giu Vai Leu 0 00 go 0 dm C alf a 0:*C.0 Sos 0:90 S 0.600 ctg Leu tcc Ser gca Ala aac Asn tgt Cys 535 gag Giu gat Asp gtc Vai cac His gac Asp 615 gaa Giu cag Gin Arg ccc Pro gtt Vai 695 gga Giy cca Pro 
gaa Giu agc Ser 520 ggt Gly aac Asn gaa Giu agc Ser agt Ser 600 ggt Gly cat His gat Asp aat Asn c tg Leu 680 aac Asn 460 gtg Val ctt Leu gCg Ala tgt Cys aca Thr 540 tta Leu gat Asp cac His aca Thr gga Gly 620 agt Ser cag Gin gcc Ala gat Asp gac Asp 700 gtt Vai agc Ser gct Aia 510 gtc Val aaa Lys ttg Leu cgc Arg cag Gin 590 aat Asn gac Asp gcc Ala tat Tyr ggg Gly 670 t ta Leu aat Asn ccg Pro 480 ggg Gly ctc Leu gaa Giu gat Asp ctt Leu 560 ctg Leu ctg Leu agc Ser gcg Aia ttt Phe 640 cag Gin gat Asp acc Thr act Thr 465 gac Asp ctt.
Leu agc Ser gca Ala gt t Val 545 aac Asn act Thr tat Tyr ggt Giy gct Aia 625 gac Asp gct Ala ttt Phe ggg Gly ccg Pro 705 cta Leu acc Thr gcc Ala cgt Arg aaa Lys 530 gcg Ala cgg Arg ccg Pro ctg Leu gcc Ala 610 atc Ile aat Asn ggt Gly ggg Giy acc Thr 690 gtt Val cca Pro gtg Val gcg Ala ccg Pro 515 gac Asp gtt Val gac Asp acc Thr agt Ser 595 ctg Leu tgg Trp ctg Leu cgg Arg ttc Phe 675 acc Thr atg Met 999 Giy cag Gln t tg Leu 500 t tg Leu agc Ser att Ile tgg Trp ccg Pro 580 gat Asp ggg Gly aat Asn ttt Phe atg Met 660 agt Ser caa Gin gtt Val cat His ttc Phe 485 ccg Pro cta Leu agc Ser ttt Phe ttg Leu 565 gag Giu gat Asp ctt.
Leu cag Gin gtc Val 645 gat Asp ctg Leu gct.
Ala cag Gin tca Ser 470 atg Met gcc Ala cgt Arg gag Giu gat Asp 550 ccg Pro ggt Gly ctc Leu ggt Giy tca Ser 630 cgt Arg cag Gin ctt Leu tat Tyr gtt Val 710 3482 3530 3578 3626 3674 3722 3770 3818 3866 3914 3962 4010 4058 4106 4154 4202 4250 acc cga aat gcc cgt att. gat Thr Arg Asn Ala Arg Ile Asp 715 att tat cgt Ile Tyr Arg 720 ggc agc gag ttg ctg ggg Gly Ser Giu Leu Leu Gly agt cag ttc ctg acc ccg gga atg cat acc ctg gat act cat tct ctt 4298 Ser Gin Phe Leu Thr Pro Gly Met His Thr Leu Asp Thr His Ser Leu 730 735 740 cca ccg gga agc tat cct ctg gcg ttg cgg gtg tat gag gat ggg att 4346 Pro Pro Giy Ser Tyr Pro Leu Ala Leu Arg Vai Tyr Giu Asp Gly Ile 745 750 755 ctg cgg cga acg gag acc cag ccc ttc agt aag ggg ggc aat agc ttc 4394 Leu Arg Arg Thr Giu Thr Gin Pro Phe Ser Lys Gly Giy Asn Ser Phe 760 765 770 agt gca cag acc cag tgg ttt att. cag ggc ggg ctg gaa gat acc ggg 4442 775 780 785 790 gat aaa gcc agc cat tat gac ggt gag act gtc atg gct gcc gga ttc 4490 Asp Lys Ala Ser His Tyr Asp Gly Glu Thr Val Met Aia Ala Giy Phe ,795 800 805 caa act ggg ctg cgg aaa aat atc agt ctg acc gaa ggt atc tct ctg 4538 *9 Gin Thr Gly Leu Arg Lys Asn Ilie Ser Leu Thr Giu Gly Ile Ser Leu 810 815 820 gca cat gag gcc tgg tac agt gaa acc cga ctg aat tca cag cat gca 4586 Ala His Glu Ala Trp Tyr Ser Glu Thr Arg Leu Asn Ser Gin His Ala 825 830 835 gtg ctg gat ggc acg ctg gac ctt tct gcc ggg ata ctg cat ggg aca 4634 Vai Leu Asp Gly Thr Leu Asp Leu Ser Ala Giy Ile Leu His Gly Thr 840 845 850 'See** gac agc acg agc ggt aac act gag cag gtg aca tac aac gac gga ttt 4682 Asp Ser Thr Ser Giy Asn Thr Glu Gin Val Thr Tyr Asn Asp Giy Phe 0:669: 855 860 865 870 off. 
tcc gcg agt ctg tgg cgt aac cat acg gaa agt gat gcc tgt agt ggt 4730 Ser Ala Ser Leu Trp Arg Asn His Thr Glu Ser Asp Ala Cys Ser Gly 875 880 885 o a: cgt cat cca cag tca gtg cat gcc agt atg acc tgc cag act tcg atg 4778 Arg His Pro Gin Ser Vai His Ala Ser Met Thr Cys Gin Thr Ser Met .0.890 895 900 aac gcc tcc ctg tcg gtt tcg gtg ggg aac tgg tat gcc cta ctg gga 4826 Off.. Asn Ala Ser Leu Ser Vai Ser Val Gly Asn Trp Tyr Ala Leu Leu Gly 910 915 *0 tac agt acc agc agg aca gaa ggt cgg ccg gtt tac cgg gga tat gat 4874 Tyr Ser Thr Ser Arg Thr Glu Gly Arg Pro Vai Tyr Arg Gly Tyr Asp 920 925 930 gat aac agt gac aaa gaa aat gtg ttc tgg cga cag gca tac atc cct 4922 Asp Asn Ser Asp Lys Glu Asn Val Phe Trp Arg Gin Ala Tyr Ile Pro 935 940 945 950 gcc tct cac cgc gaa tct gct cag gct agt gca acg tac agc ctt aat 4970 Ala Ser His Arg Giu Ser Ala Gin Ala Ser Aia Thr Tyr Ser Leu Asn 955 960 965 atg gct ggc atg aat att aat acc cat ggg gga gta tgg cga acc cga 5018 Met Ala Gly Met Asn Ile Asn Thr His Gly Giy Vai Trp Arg Thr Arg *970 975 980 f **ft aat gac gga gtg aat gat gat ggc ttg ttt atg agt gtc agt gtg tca 5066 Asn Asp Gly Val Asn Asp Asp Gly Leu Phe Met Ser Val Ser Val Ser 985 990 995 tat gcc tct caa cca ccg aca atg act ggc agt aat agg tat acc tca 5114 Tyr Ala Ser Gin Pro Pro Thr Met Thr Gly Ser Asn Arg Tyr Thr Ser 1000 1005 1010 gcc ggg acc gat att cac agt agc cgg aat caa aaa aca cag acg tcc 5162 Ala Gly Thr Asp Ile His Ser Ser Arg Asn Gin Lys Thr Gin Thr Ser 1015 1020 1025 1030 tgg aat gtg aac cat gtg aga tcc tgg cag cag gat ctg tat cgt gaa 5210 Trp Asn Val Asn His Val Arg Ser Trp Gin Gin Asp Leu Tyr Arg Giu 1035 1040 1045 ctg tcg gtg ggt ttc tcc ggt tat aac gac gac agc tgg agc ggg agt 5258 Leu Ser Val Gly Phe Ser Gly Tyr Asn Asp Asp Ser Trp Ser Gly Ser 1050 1055 1060 000.: ctc ggc gga cgc atg agc ggc cgt atg ggt gaa ctg agc gcc act atc 5306 0 0 0 Leu Gly Gly Arg Met Ser Gly Arg Met Gly Giu Leu Ser Ala Thr Ile 1065 1070 1075 agt aac tcc cat caa cgt aat gcg ggc age gcc agt tca ctc 
acc gct 5354 Ser Asn Ser His Gin Arg Asn Ala Gly Ser Ala Ser Ser Leu Thr Ala 0:01080 1085 1090 0000 ggc tac agc tcg tct ctg gcg tta tcc cgt aat gga ctg ttc tgg gga 5402 Gly Tyr Ser Ser Ser Leu Ala Leu Ser Arg Asn Gly Leu Phe Trp Gly ooe. ggt ggt cag gac ggt gaa ceg gcc tct ggc atg gcg gtg aac gtg gag 5450 Gly Gly Gin Asp Gly Glu Pro Ala Ser Gly Met Ala Vai Asn Val Glu ilo.:l11 1120 1125 *000.* tca gag ggg gac gag ggc agt agc ggg aaa gta gtc agc gtt cgt ggc 5498 Ser Glu Gly Asp Giu Gly Ser Ser Gly Lys Val Val Ser Val Arg Gly 1130 1135 1140 agc agc cag ccg ttc agt ctc ggt ttt ggt cag cag tcg ctg ttg ctg 5546 Ser Ser Gin Pro Phe Ser Leu Gly Phe Gly Gin Gin Ser Leu Leu Leu .1145 1150 1155 atg gaa gge tat aac gcc acg gag gtg ace att gag gat gca ggg gtt 5594 0 Met Glu Gly Tyr Asn Ala Thr Glu Val Thr Ile Glu Asp Ala Gly Val 0 1160 1165 1170 agt t ca cag ggt atg gca ggc gta aaa gcg gga ggg gga agc agg tgt 5642 0 Ser Ser Gin Gly Met Ala Gly Val Lys Ala Gly Gly Gly Ser Arg Cys 1175 1180 1185 1190 :o I.*o 4%,010 tac ttc ctg aca ccc ggg cat ctg ctg gtt cac aac atc age gcc agt 5690 .0 Tyr Phe Leu Thr Pro Gly His Leu Leu Val His Asn Ile Ser Ala Ser 1195 1200 1205 0:0 atg agc cga ctg tac gtt ggc cgc gta ctg gac aag gat ggc aga ccg 5738 00 Met Ser Arg Leu Tyr Val Gly Arg Val Leu Asp Lys Asp Gly Arg Pro **-1210 1215 1220 ctg ctg gac gca cag cca ctg aac tat cca ttt ttg teg ttg gga cct 5786 Leu Leu Asp Ala Gin Pro Leu Asn Tyr Pro Phe Leu Ser Leu Gly Pro 1225 1230 1235 tee ggg cga ttt age ctg cag agc gag cat aaa gaa tcc age ctg tgg 5834 Ser Gly Arq Phe Ser Leu Gin Ser Giu His Lys Giu Ser Ser Leu Trp 1240 1245 1250 ctg ctg tct aaa aac agg atc ctg cgt tgt ccg atg tca gta cat aaa 5882 Leu Leu Ser Lys Asn Arg Ile Leu Arg Cys Pro Met Ser Val His Lys 1255 1260 1265 1270 egt egg gat gtt atg cag gta gtg ggt gat gtg egg tgt gaa tta agt 5930 Arg Arg Asp Val Met Gin Val Val Gly Asp Val Arg Cys Glu Leu Ser 1275 1280 1285 gac gtg gat gcc ctg cca cag gcg ttg caa ata teg ccg egg gtc atc 5978 Asp Val 
Asp Ala Leu Pro Gin Ala Leu Gin Ile Ser Pro Arg Val Ile 1290 1295 1300 cgt ttg ctg aac gtg gca ggt ttg etg cgc cat tee gtt cag gaa gee 6026 Arg Leu Leu Asn Val Ala Gly Leu Leu Arg His Ser Val Gin Giu Ala 1305 1310 1315 tga egtagagata aaggcgttaa ct atg agt aat aaa atg aag tgg acg agt 6078 S.i Met Ser Asn Lys Met Lys Trp Thr Ser 1320 1325 atg aca gee eat tgg tea gea att att aat ttc ate ega aaa tat gtt 6126 Met Thr Ala His Trp Ser Ala Ile Ile Asn Phe Ile Arg Lys Tyr Val 1330 1335 1340 tat cea gea agg ata att gee ate etg etg atg get gge get aca etg 6174 Tyr Pro Ala Arg Ile Ile Ala Ile Leu Leu Met Ala Gly Ala Thr Leu 1345 1350 1355 1360 cca eaa gte gee gat geg att acc gte gac ctg aat tae gae aag aae 6222 Pro Gin Val Ala Asp Ala Ile Thr Val Asp Leu Asn Tyr Asp Lys Asn 1365 1370 1375 aat gta geg gte ate aet cct gte tgg tee eaa gaa tgg agt gta gea 6270 Asn Val Ala Val Ile Thr Pro Val Trp Ser Gin Giu Trp Ser Val Ala 1380 1385 1390 aat gtg ttg ggg gga tgg gta tgt egt tea aac agg aat gaa aat gag 6318 Asn Val Leu Gly Gly Trp Val Cys Arg Ser Asn Arg Asn Glu Asn Glu 1395 1400 1405 ggg geg tgt gaa gaa aea eat ttg gta tgg tgg tat get ttt gga get 6366 Gly Ala Cys Giu Glu Thr His Leu Val Trp Trp Tyr Ala Phe Gly Ala 1410 1415 1420 tat tea aaa att egt etg egt tte aga gaa caa ate age eat gee gaa 6414 Tyr Ser Lys Ile Arg Leu Arg Phe Arg Giu Gin Ile Ser His Ala Glu 1425 1430 1435 1440 e g. 
att aeg ete ata etg ete gge agt gtt egt gat gee tgt tat act ggt 6462 I l: le Thr Leu Ile Leu Leu Gly Ser Val Arg Asp Ala Cys Tyr Thr Gly 1445 1450 1455 gte ate aae atg aac get get gea tgt caa tgg ggt agg teg etg aaa 6510 Val Ile Asn Met Asn Ala Ala Ala Cys Gin Trp Gly Arg Ser Leu Lys 1460 1465 1470 ew ett agg ata cet tea gaa gag ett geg aag ata ect aca age gga aca 6558 Leu Arg Ile Pro Ser Giu Giu Leu Ala Lys Ile Pro Thr Ser Gly Thr 1475 1480 1485 tgg aaa gea aeg tta gtc ctg gat tat tta caa tgg ggc gga gac gat 6606 Trp Lys Ala Thr Leu Val Leu Asp Tyr Leu Gin Trp Gly Gly Asp Asp 1490 1495 1500 cct tta ggc aca tca act aca gat atc acg ctg aat gta aca gac cac 6654 Pro Leu Gly Thr Ser Thr Thr Asp Ile Thr Leu Asn Val Thr Asp His 1505 1510 1515 1520 ttt gct gaa aat gcg gct att tac ttt ccg caa ttt ggt aca gca acg 6702 Phe Ala Giu Asn Ala Ala Ile Tyr Phe Pro Gin Phe Giy Thr Ala Thr 1525 1530 1535 ccc cgg gtg gac ctg aat ctt cac cgg atg aat gcc tca caa atg tcg 6750 Pro Arg Val Asp Leu Asn Leu His Arg Met Asn Ala Ser Gin Met Ser 1540 1545 1550 ggc agg gct aat ctg gat atg tgt ctg tat gac gga ggt gtg aaa gcc 6798 Gly Arg Ala Asn Leu Asp Met Cys Leu Tyr Asp Gly Gly Val Lys Ala 1555 1560 1565 cgt tca tta cag atg aag ata gaa gga agc aat aag tca ggt acg gga 6846 1570 1575 1580 ttt cag gtt ata aag agc gat tct gct gat acg att gat tat gcg gtc 6894 Phe Gin Val Ile Lys Ser Asp Ser Ala Asp Thr Ile Asp Tyr Ala Val 1585 1590 1595 1600 agt atg aat tat ggg gga cga agt att cct gtc acc cgt ggc gtg gag 6942 Ser Met Asn Tyr Gly Gly Arg Ser Ile Pro Val Thr Arg Gly Val Glu 1605 1610 1615 ttc agt ctg gat aac gtg gat aaa gca gca acg cgt ccg gtg gta ctt 6990 Phe Ser Leu Asp Asn Val Asp Lys Ala Ala Thr Arg Pro Val Val Leu 1620 1625 1630 ccc ggg caa cgg cag gcg gta cgt tgt gtg cca gtg ccc ctt acc ctg 7038 Pro Giy Gin Arg Gin Ala Val Arg Cys Val Pro Val Pro Leu Thr Leu So: 1635 1640 1645 aca aca caa ccc ttt aac atc aga gag aag cgt tct ggt gag tat cag 7086 Thr Thr Gin Pro Phe Asn Ile Arg Giu Lys Arg Ser 
Gly Giu Tyr Gin 1650 1655 1660 gga acg ctg aca gtg aca atg ctg atg gga aca caa acc ccc tga 7131 Gly Thr Leu Thr Val Thr Met Leu Met Gly Thr Gin Thr Pro 1665 1670 1675 cagtaattat ttattttatt gatatctttc ttatatggtt ttttaaatca gagttctctt 7191 tatatacttg ttttatttaa taaagagaat ctattcactt atgaaaatca atgcgtgagg 7251 ttctgctttc ct atg act gtg tat tta gat gat aaa gat aaa gaa tta ttg 7302 Met Thr Val Tyr Leu Asp Asp Lys Asp Lys Giu Leu Leu A: 1680 1685 1690 aaa gaa atc caa aaa gat tgt gca caa act tta tgg caa ctt gca tat 7350 Lys Giu Ile Gin Lys Asp Cys Ala Gin Thr Leu Trp Gin Leu Ala Tyr 1695 1700 1705 aaa gtg gga ctt acg ccc aca cca tgt ttc aaa cgt tta aaa aaa ctt 7398 "O Lys Val Gly Leu Thr Pro Thr Pro Cys Phe Lys Arg Leu Lys Lys Leu 1710 1715 1720 aaa gac agg ggg gtt atc att ggt cag ttc gct tta ttg gat aag gaa 7446 Lys Asp Arg Gly Val lie Ile Gly Gin Phe Ala Leu Leu Asp Lys Glu 1725 1730 1735 1740 aaa cta ggt ctt tca ctt aat gtc ttt att atg att aac ata tct gag 7494 Lys Leu Gly Leu Ser Leu Asn Val Phe Ile Met Ile Asn Ile Ser Glu 1745 1750 1755 gag caa tac gct Giu Gin Tyr Ala 1760 att gcc ttt tat Ile Ala Phe Tyr 1775 gta ttt aca gat Val Phe Thr Asp 1790 tta act aat tct Leu Thr Asn Ser 1805 caa ata aag gaa Gin Ile Lys Giu gatttaatac attat aggattttaa tcagc tggttaaata tcacc atatttattt cgttc tgttttttga agagt aatatatact cagga taaatagcgt cgggt gcgagtacat cagga aacattctta ttcca ctcgtgatcc taatt tcggggatat aaat atggaatgca atggc aaacttggtg tatat ttgtttaatt attgt cggtggggag gctgS atacgaatgg tttat acactcacat tactz tgcatgtgac gtta gttttttgtC tcggt tcatgctgaa ggcg gatgctactg tagac cctccacgtc ctga~ cga att tct gga Ara Ile Ser Giv 1780 atg aac gat tac Met Asn Asp Tyr 1795 tca att agt gga Ser Ile Ser Gly 1810 aca aac gaa ctg Thr Asn Glu Leu .825 tca ttt aat tat tta Ser Phe Asn Tyr Leu 1785 tat agt ttt tat gag Tyr Ser Phe Tyr Giu 1800 tct gca tcg agc ttt Ser Ala Ser Ser Phe 1815 tca gtg tga aagtgtg Ser Val 1830 atg cat aca Met His Thr aaa ata ata Lys Ile Ile gtt ctt gag Val Leu Giu 1820 agt 
att tct gag aaa ata aag tca atg cct gag gtt Ser Ile Ser Giu Lys Ile Lys Ser Met Pro Giu Val 1765 1770 7542 7590 7638 7686 atg tgtacttact 7739 g~ 8 .1 t.
S :tatcc ~agtgg ~ccgtg ;aaccc :tacaa LtCCCC :tattt Lgtcgc lgtgaa :ttatt jatgta ~catcg :aacag :t ttgt ;aagtt :atatg ~tgttc ~atcat iattta :agttg ~tttca ;ataat 1gggcg ttcttacgga tgaaattaag catgtaacaa ttctggaaaa aagtcattta tgggaatttg gctgtttctg agagattatg attaaaaacc agtagtgttt gaacagaggg tcatatgacg aaggagtgaa tttttattac taggggatga cactagatgt cgttacatta tattaatatg aatgagctga ttatattatt gatctggagg aaattctgct tgggt tcctg acaacaacgg cggcacagaa aaaaccgcat aaggcgaaaa atttattcaa tgctcataca ttgactttaa gtatggaagt ttggattctt atcattatca ccgtgtacca gttttttctt aatttgaatc gattaaatat ccgtttatca attattttag aggaatatac agtgtggttc ctgtatgaat attttatttg cattgggtta aaagattccc ctccaacggg cagattgcgg taacacagcg taaaacagat ccacataatt ccataaatat tatggaaagg taacaaccac ttggtttgat ttttcaggct taagttccaa gaacgaagtt tcgggatcat aaaaatatct aaagaacatc acaattttat tttaatatat atctgtatct ttgttttacg tctaaatact gtgttatttg tttatgacgg caattcttgt ctgcctgact ctgttgaaca gaatatcaca gatgt tactg gagtcattga gggttaaata atcagtaaat aaatttctga ttttgtggta tgcgtagtgc aagattcttg gattacatgt aaaaaaaatg tatttatttt attgttcgtg tacagccacc cgatggttgc cgttatacgc catgcatggt ttagagaggt cagccagtgc gaacaagaga gcattgacat gcacgctcct 7799 7859 7919 7979 8039 8099 8159 8219 8279 8339 8399 8459 8519 8579 8639 8699 8759 8819 8879 8939 8999 9059 9119 tccacaggca agcacggcgt gtcccgctct aaaatgttac gcgcgccgtt tacatcggcg 9179 ttcgcagtat atcttcatac cagacacttg taagtatctc gcataatcgt gccattcaca 9239 tttagagatc atac 9253 <210> 7 <211> 236 <212> PRT <213> Salmonella typhi <400> 7 Met Asn Phe Lys Asp Thr Leu Pro Gly Val Phe Leu Cys Val Ala Met S1 5 10 Phe Ala Cys Gly His Ala Arg Ala Asn Met Leu Val Tyr Pro Met Ala 20 25 Ala Glu Ile Asn Ser Ser Arg Glu Glu Ala Thr Ser Leu Phe Val Tyr 35 40 Ser Lys Ser Asp His Val Gin Tyr Ile Arg Thr Arg Ile Met Arg Ile 55 Glu His Pro Gly Met Pro Gin Glu Lys Glu Val Pro Ala Gly Asn Asp 65 70 75 Ile Glu Thr Gly Leu Val Val Ser Pro Glu Lys Phe Ala Leu Ser Pro 85 90 **ee Gly Thr Lys Lys Thr Ile Arg Val Ile Ser Thr Gin Ala Pro 
Glu Arg 100 105 110 Glu Glu Ala Trp Arg Val Tyr Phe Glu Ala Val Pro Glu Leu Glu Asp 115 120 125 Asp Pro Gin Ala Gly Gly Lys Gin Asn Ser Ser Val Ser Val Asn Leu 130 135 140 Val Trp Gly Val Leu Leu Arg Val Ser Pro Ser Asp Pro Arg Pro Ala 145 150 155 160 i!i! Leu Val Thr Asp Gly His His Leu Leu Asn Thr Gly Asn Thr Arg Leu 165 170 175 t Ser Leu Ile Arg Ala Gly Asn Cys Asp Thr Thr Cys His Trp Gin Asn 180 185 190 *5l Ile Gly Lys Ser Ile Tyr Pro Gly Gly Ser Ala Asp Ile Pro Ala Gly 195 200 205 Ile Lys Ser Asn Ala Phe Arg Val Glu Tyr Arg Thr Gly Ala Asn Ser 210 215 220 Pro Val Ile Ser Ala Asp Leu Thr Ala Ala Gly Lys 225 230 235 :;it: S <210> 8 <211> 191 <212> PRT S <213> Salmonella typhi fooo <400> 8 Met Tyr Thr Glu Cys Thr Tyr Ile Thr Val Ile Asn Asn Lys Ala Arg S1 5 10 Leu Phe Phe Met Asn Met Lys Thr Ser Phe Ile Ala Ala Ala Val Ala 25 Leu Ala Thr Val Tyr Ser Phe Ser Val Ser Ala Val Gin Lys Asp Ile 40 Thr Val Thr Ala Asn Ile Asp Ser Thr Leu Giu Leu Leu Gin Ala Asp 55 Gly Ser Ser Leu Pro Ser Thr Met Lys Leu Asp Phe Met Pro Gly Lys 70 75 Gly Leu Val His Lys Ser Leu Gin Thr Arg Leu Tyr Ser Asn Asp Gin 90 Thr Lys Ser Val Asn Vai Lys Leu Leu Asn Ala Pro Gin Leu Ile Asn *..100 105 110 *Val Leu Asp Pro Thr Lys Thr Ile Asp Met Glu Val Thr Leu Gly Gly 115 120 125 Arg Ser Leu Thr Thr Thr Asn Ser Val Leu Giu Ala Lys Thr Leu Phe 130 135 140 Pro Asp Giy Lys Thr Giy Asp Ala Ser Aia Leu Leu Asn Leu Asp Ile 145 150 155 160 s* Gly Gin Lys Ala Gly Ala Ala Leu Gin Asn Leu Pro Ala Gly Giu Tyr *165 170 175 Ser Gly Leu Val Ser Leu Val Ile Ser Gin Ala Val Thr Ala Gly 180 185 190 <210> 9 <211> 889 <212> PRT <213> Salmonella typhi <400> 9 Met Tyr Tyr Leu Leu Gly Leu Cys Ser Phe Thr Ser Gin Ala Thr Leu 1 5 10 Ile Pro Pro Pro Gly Phe Glu Ser Leu Leu Giu Gly Gin Thr Giu Gin 25 3 Ile Giu Val Leu Leu Pro Giy His Ser Leu Gly Leu Phe Pro Val Val 010 035 40 Val Lys Pro Asp Thr Val Gin Phe Met Ser Pro Leu Met Val Leu Glu *t 50 55 Ser Ser Gly Leu Ala Ala Leu Pro Ala Ala Glu Arg Gin Lys Ala Leu 70 75 Ala Ala 
Leu Ser Arg Pro Leu Leu Arg Asn Ser Asn Leu Val Cys Gly 90 Val Ser Giu Ala Lys Asp Ser Ser Giu Cys Gly Tyr Val Ala Thr Asp .s%100 105 110 o Lys Glu Asp Val Ala Val Ile Phe Asp Giu Asn Asn Ala Gin Leu Ser 115 120 125 Leu Phe Leu Asn Arg Asp Trp Leu Pro Asp Giu Giu Arg Arg Asp Lys 9.
*~p9e. 9*9* 9% 9**9 9 "4' *99* 0* Ile His Met Thr 175 Leu Gly 190 Asn Ser Asn Gin Ser Aia Phe Asp 255 Val Asp 270 Arg Ile Thr Pro Tyr Pro Giu Thr 335 Gin Trp, 350 His Tyr Arg Lys Trp, Tyr Thr Leu 415 Giy Asn 430 Trp Arg Ser Vai Ser Val 465 Vai Giy Asn Trp, Tyr Aia 485 Leu Leu Giy Ser Thr Ser Arg Thr Giu 495 *.o a *00 Gly Val Gin Thr 545 Gly Met Ser Ser Tyr 625 Arg Ala Leu Ala Ser 705 Gly Glu Val1 Leu Arg 785 Asn Ser Leu Ser Asp His Arg 525 Gly Met 540 Gly Val Ser Gin Thr Asp Val Asn 605 Val Gly 620 Gly Arg Ser His Ser Ser Gin Asp 685 Gly Asp 700 Gin Pro Gly Tyr Gin Gly Leu Thr 765 Arg Leu 780 Asp Ala Arg Phe Ser Lys Asp Val 845 Lys Giu 510 Giu Ser Asn Ile Asn Asp Pro Pro 575 Ile His 590 His Val Phe Ser Met Ser Gin Arg 655 Ser Leu 670 Gly Giu Giu Gly Phe Ser Asn Ala 735 Met Ala 750 Pro Gly Tyr Vai Gin Pro Ser Leu 815 Asn Arg 830 Met Gin Val Gly Asp Val Arg Cys Glu Leu Ser Asp Val Asp Ala Leu Pro Gin 850 855 860 Ala Leu Gin Ile Ser Pro Arg Val Ile Arg Leu Leu Asn Val Ala Gly 865 870 875 880 Leu Leu Arg His Ser Val Gin Glu Ala 885 <210> <211> 359 <212> PRT <213> Salmonella typhi <400> *Met Ser Asn Lys Met Lys Trp, Thr Ser Met Thr Ala His Trp Ser Ala 1 5 10 I le Ile Asn Phe Ile Arg Lys Tyr Val Tyr Pro Ala Arg Ile Ile Ala *20 25 30 Ile Leu Leu Met Ala Gly Ala Thr Leu Pro Gin Val Ala Asp Ala Ile 40 Thr Val Asp Leu Asn Tyr Asp Lys Asn Asn Val Ala Val Ile Thr Pro 55 Val Trp Ser Gin Giu Trp Ser Val Ala Asn Val Leu Gly Gly Trp Val 70 75 Cys Arg Ser Asn Arg Asn Giu Asn Giu Gly Ala Cys Giu Giu Thr His 90 Leu Val Trp Trp, Tyr Ala Phe Gly Ala Tyr Ser Lys Ile Arg Leu Arg 5100 105 110 Phe Arg Giu Gin Ile Ser His Ala Giu Ile Thr Leu Ile Leu Leu Gly 115 120 125 Ser Val Arg Asp Ala Cys Tyr Thr Gly Val Ile Asn Met Asn Ala Ala 130 135 140 Ala Cys Gin Trp Gly Arg Ser Leu Lys Leu Arg Ile Pro Ser Giu Giu 145 150 155 160 6* Leu Ala Lys Ile Pro Thr Ser Gly Thr Trp Lys Ala Thr Leu Val Leu *0165 170 175 Asp Tyr Leu Gin Trp Gly Gly Asp Asp Pro Leu Gly Thr Ser Thr Thr *180 185 190 Asp Ile Thr 
Leu Asn Val Thr Asp His Phe Ala Glu Asn Ala Ala Ile 195 200 205 Tyr Phe Pro Gin Phe Gly Thr Ala Thr Pro Arg Val Asp Leu Asn Leu 210 215 220 00.0 0 His Arg Met Asn Ala Ser Gin Met Ser Gly Arg Ala Asn Leu Asp Met 00.0 225 230 235 240 Cys Leu Tyr Asp Gly Gly Val Lys Ala Arg Ser Leu Gin Met Lys Ile 245 250 255 Giu Gly Ser Asn Lys Ser Gly Thr Gly Phe Gin Val Ile Lys Ser Asp 260 265 270 Thr Ile Val Thr Thr Arg Pro Vai 325 Arg Ser 340 Thr Gin Tyr Aia Vai Ser Met 280 Gly Vai Glu Phe Ser 295 Vai Vai Leu Pro Giy 315 Leu Thr Leu Thr Thr 330 Giu Tyr Gin Gly Thr 345 Pro Asn Tyr Giy Gly Arg 285 Leu Asp Asn Vai Asp 300 Gin Arg Gin Ala Val 320 Gin Pro Phe Asn Ile 335 Leu Thr Val Thr Met 350 0000 00 *0000 *0 0 00 0 <210> 11 <211> i51 <212> PRT <213> Salmonella typhi <400> 11 Met Thr Val Tyr Leu Asp 1 5 Gin Lys Asp Cys Ala Gin 20 Leu Thr Pro Thr Pro Cys Gly Val Ile Ile Gly Gin Leu Ser Leu Asn Val Phe Ala Ser Ile Ser Glu Lys Tyr Arg Ile Ser Gly Ser 100 Asp Met Asn Asp Tyr Tyr 115 Ser Ser Ile Ser Gly Ser 130 Giu Thr Asn Giu Leu Ser 145 150 Asp Lys Trp Gin Arg Leu Leu Leu Ile Asn Ser Met Tyr Leu 105 Tyr Glu Ser Phe 0005 0000 0000 5000 0500 0000 0000 0000
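The CDS features annotated for SEQ ID NO: 6 can be cross-checked against the lengths given for the protein listings SEQ ID NO: 7 to 11. The following Python sketch is illustrative only. It assumes that each <222> range is 1-based and inclusive and contains the stop codon, that the five features correspond to SEQ ID NO: 7 to 11 in order, that the tcfC range is taken as (3360)..(6029), and that the length of SEQ ID NO: 11 is taken as 151. The names FEATURES, protein_length and check_features are hypothetical and do not come from the specification.

```python
# CDS ranges copied from the <222> fields of SEQ ID NO: 6 (assumed 1-based,
# inclusive, stop codon included), each paired with the <211> length of the
# protein listing assumed to correspond to that feature.
FEATURES = {
    "tcfA": (1898, 2608, 236),  # SEQ ID NO: 7
    "tcfB": (2659, 3234, 191),  # SEQ ID NO: 8
    "tcfC": (3360, 6029, 889),  # SEQ ID NO: 9 (range as assumed above)
    "tcfD": (6052, 7131, 359),  # SEQ ID NO: 10
    "tinR": (7264, 7719, 151),  # SEQ ID NO: 11 (length as assumed above)
}


def protein_length(start: int, end: int) -> int:
    """Residues encoded by a CDS spanning start..end, excluding the stop codon."""
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError(f"CDS length {nucleotides} is not a multiple of 3")
    return nucleotides // 3 - 1


def check_features(features):
    """Map each feature name to whether its CDS length matches the listed length."""
    return {
        name: protein_length(start, end) == listed
        for name, (start, end, listed) in features.items()
    }


if __name__ == "__main__":
    for name, ok in check_features(FEATURES).items():
        print(name, "consistent" if ok else "MISMATCH")
```

Under these assumptions, a mismatch for any feature would point to a transcription error in either the coordinates or the listed protein length.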
S
@00050 0 5050 5050

Claims (15)

1. An isolated peptide encoded by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 6 for use in medicine.
2. An antibody directed against a peptide according to claim 1 for use in medicine.
3. An isolated polynucleotide comprising a nucleotide sequence selected from SEQ ID NO: 1 or SEQ ID NO: 6 for use in medicine.
4. A vaccine for protection against diseases caused by Salmonella enterica subspecies I, comprising a peptide according to claim 1 or an antibody according to claim 2 and a pharmaceutically acceptable carrier.
5. A nucleic acid vaccine for protection against diseases caused by Salmonella enterica subspecies I, comprising a polynucleotide according to claim 3.
6. A vector vaccine for protection against diseases caused by Salmonella enterica subspecies I, comprising a host harbouring a recombinant vector which incorporates a polynucleotide according to claim 3 and a pharmaceutically acceptable carrier.
7. A method for protection against diseases caused by Salmonella enterica subspecies I, comprising administering to a patient in need thereof a vaccine according to any one of claims 4 to 6.
8. Use of a vaccine according to any one of claims 4 to 6 in the manufacture of a medicament for protection against diseases caused by Salmonella enterica subspecies I.
9. An antibody directed against a peptide according to claim 1 for use in a diagnostic method.
10. An isolated peptide according to claim 1 for use in a diagnostic method.
11. Primers for, or probes that hybridize with, the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 6, for use in a diagnostic method for the purpose of detecting Salmonella enterica subspecies I.
12. An isolated peptide according to claim 1 substantially as hereinbefore described or exemplified.
13. An antibody according to claim 2 substantially as hereinbefore described.
14. An isolated polynucleotide according to claim 3 substantially as hereinbefore described.
15. A vaccine according to claim 4 substantially as hereinbefore described.
16. A nucleic acid vaccine according to claim 5 substantially as hereinbefore described.
17. A vector vaccine according to claim 6 substantially as hereinbefore described.
18. A method according to claim 7 substantially as hereinbefore described.
19. Use according to claim 8 substantially as hereinbefore described.
20. An antibody according to claim 9 substantially as hereinbefore described.
21. An isolated peptide according to claim 10 substantially as hereinbefore described.
Dated March 2004. SBL Vaccin AB, by its Patent Attorneys, Davies Collison Cave.
AU52628/00A 1999-05-28 2000-05-26 Fimbrial proteins Ceased AU773484B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
SE9901961 1999-05-28
SE9901961A SE9901961D0 (en) 1999-05-28 1999-05-28 Fimbrial proteins
PCT/SE2000/001079 WO2000073336A1 (en) 1999-05-28 2000-05-26 Fimbrial proteins

Publications (2)

Publication Number Publication Date
AU5262800A (en) 2000-12-18
AU773484B2 (en) 2004-05-27

Family

ID=20415789

Family Applications (1)

Application Number Title Priority Date Filing Date
AU52628/00A Ceased AU773484B2 (en) 1999-05-28 2000-05-26 Fimbrial proteins

Country Status (7)

Country Link
EP (1) EP1180118A1 (en)
JP (1) JP2003502291A (en)
AU (1) AU773484B2 (en)
CA (1) CA2372250A1 (en)
NZ (1) NZ515912A (en)
SE (1) SE9901961D0 (en)
WO (1) WO2000073336A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE60216099D1 (en) * 2001-05-17 2006-12-28 Creatogen Ag METHOD FOR DETECTING WEAKENED OR VIRULENT-DEFECTIVE MICROBES
AU2009254932A1 (en) 2008-06-03 2009-12-10 Health Protection Agency Salmonella detection assay

Also Published As

Publication number Publication date
SE9901961D0 (en) 1999-05-28
WO2000073336A1 (en) 2000-12-07
AU5262800A (en) 2000-12-18
JP2003502291A (en) 2003-01-21
CA2372250A1 (en) 2000-12-07
NZ515912A (en) 2004-02-27
EP1180118A1 (en) 2002-02-20

Similar Documents

Publication Publication Date Title
KR102321388B1 (en) Nucleic Acid Guide Nuclease
AU2016219667B2 (en) Antibacterial phage, phage peptides and methods of use thereof
US8198430B2 (en) Immunogenic sequences
KR100886095B1 (en) Novel Streptococcus pneumoniae open reading frames encoding polypeptide antigens and a composition comprising the same
AU2021201338B2 (en) Complete genome sequence of the methanogen methanobrevibacter ruminantium
KR102532707B1 (en) Bioconjugates of Escherichia coli O-antigen polysaccharides, methods of making the same, and methods of using the same
KR20220097391A (en) Development of a novel live attenuated African swine fever vaccine based on the deletion of gene I177L
KR20230098825A (en) Seed. Phage-Derived Particles for In Situ Delivery of DNA Payloads to the Acnes Population
KR102574882B1 (en) Methods for producing bioconjugates of E. coli O-antigen polysaccharides, compositions thereof and methods of use thereof
JPH09322781A (en) Staphylococcus aureus polynucleotide and sequence
PT2753364T (en) A conditional replicating cytomegalovirus as a vaccine for cmv
Smith-Vaughan et al. Nonencapsulated Haemophilus influenzae in Aboriginal infants with otitis media: prolonged carriage of P2 porin variants and evidence for horizontal P2 gene transfer
KR20230012583A (en) Synthetic Modified Vaccinia Ankara (sMVA) Based Coronavirus Vaccine
KR20150056540A (en) Clostridium difficile polypeptides as vaccine
JPH09252787A (en) Mycoplasma genitalium genome or nucleotide sequence of its fragment and use thereof
TW202227128A (en) Multivalent vaccine compositions and uses thereof
AU773484B2 (en) Fimbrial proteins
Han et al. DNA-based adaptive immunity protect host from infection-associated periodontal bone resorption via recognition of Porphyromonas gingivalis virulence component
KR20230136600A (en) Genomic deletion of African swine fever vaccine enables efficient growth in stable cell lines
AU2021240230B2 (en) Vaccines and vaccine components for inhibition of microbial cells
AU777190B2 (en) Streptococcus pneumoniae polynucleotides and sequences
US6432669B1 (en) Protective recombinant Haemophilus influenzae high molecular weight proteins
KR20230173188A (en) Bacteriophage therapy for adherent-invasive E. coli
KR20240021274A (en) Bacteriophages against vancomycin-resistant enterococci
CA2345208C (en) Protective recombinant haemophilus influenzae high molecular weight proteins

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)