HEPATITIS C ASSAY UTILIZING RECOMBINANT ANTIGENS TO NS1
This is a continuation-in-part application of U.S. Serial No. 07/572,822, filed August 24, 1990 and U.S. Serial No. 07/614,069, filed November 7, 1990, which enjoy common ownership and are incorporated herein by reference. This application is also related to co-filed patent applications entitled "HEPATITIS C ASSAY UTILIZING RECOMBINANT ANTIGENS FROM NS5 REGION" (U.S. Serial No. 748,565) and "HEPATITIS C ASSAY UTILIZING RECOMBINANT ANTIGENS TO C-100 REGION" (U.S. Serial No. 748,566), which enjoy common ownership and are incorporated herein by reference.
This invention relates generally to an assay for identifying the presence in a sample of an antibody which is immunologically reactive with a hepatitis C virus antigen and specifically to an assay for detecting a complex of an antibody and recombinant antigens representing distinct regions of the HCV genome. Recombinant antigens derived from the molecular cloning and expression in a heterologous expression system of the synthetic DNA sequences representing distinct antigenic regions of the HCV genome can be used as reagents for the detection of antibodies and antigen in body fluids from individuals exposed to hepatitis C virus (HCV).
BACKGROUND OF THE INVENTION
Acute viral hepatitis is clinically diagnosed by a well-defined set of patient symptoms, including jaundice, hepatic tenderness, and an increase in the serum levels of alanine aminotransferase (ALT) and aspartate aminotransferase. Additional serologic immunoassays are generally performed to diagnose the specific type of viral causative agent. Historically, patients presenting clinical hepatitis symptoms and not otherwise infected by hepatitis A, hepatitis B, Epstein-Barr or cytomegalovirus were clinically diagnosed as having non-A non-B hepatitis (NANBH) by default. The disease may result in chronic liver damage.
Each of the well-known, immunologically characterized hepatitis-inducing viruses, hepatitis A virus (HAV), hepatitis B virus (HBV), and hepatitis D virus (HDV), belongs to a separate family of viruses and has a distinctive viral organization, protein structure, and mode of replication.
Attempts to identify the NANBH virus by virtue of genomic similarity to one of the known hepatitis viruses have failed, suggesting that NANBH has a distinct organization and structure. [Fowler, et al., J. Med. Virol., 12:205-213 (1983) and Weiner, et al., J. Med. Virol., 21:239-247 (1987)].
Progress in developing assays to detect antibodies specific for NANBH has been particularly hampered by difficulties in correctly identifying antigens associated with NANBH. See, for example, Wands, J., et al., U.S. Patent 4,870,076, Wands, et al., Proc. Nat'l. Acad. Sci., 83:6608-6612 (1986), Ohori, et al., J. Med. Virol., 12:161-178 (1983), Bradley, et al., Proc. Nat'l. Acad. Sci., 84:6277-6281 (1987), Akatsuka, T., et al., J. Med. Virol., 20:43-56 (1986), Seto, B., et al., U.S. Patent Application Number 07/234,641 (available from U.S. Department of Commerce National Technical Information Service, Springfield, Virginia, No. 89138168), Takahashi, K., et al., European Patent Application No. 0 293 274, published November 30, 1988, and Seelig, R., et al., in PCT Application PCT/EP88/00123.
Recently, another hepatitis-inducing virus has been unequivocally identified as hepatitis C virus (HCV) by Houghton, M., et al., European Patent Application publication number 0 318 216, May 31, 1989. Related papers describing this virus include Kuo, G., et al., Science, 244:359-361 (1989) and Choo, Q., et al., Science, 244:362-364 (1989). Houghton, M., et al. reported isolating cDNA sequences from HCV which encode antigens which react immunologically with antibodies present in patients infected with NANBH, thus establishing that HCV is one of the viral agents causing NANBH. The cDNA sequences associated with HCV were isolated from a cDNA library prepared from the RNA obtained from pooled serum from a chimpanzee with chronic HCV infection. The cDNA library contained cDNA sequences with a mean size of about 200 base pairs. The cDNA library was screened for encoded epitopes expressed in clones that could bind to antibodies in sera from patients who had previously experienced NANBH.
In the European Patent Application, Houghton, M., et al. also described the preparation of several superoxide dismutase (SOD) fusion polypeptides and the use of these SOD fusion polypeptides to develop an HCV screening assay. The most complex SOD fusion polypeptide described in the European Patent Application, designated c100-3, was described as containing 154 amino acids of human SOD at the amino terminus, 5 amino acid residues derived from the expression of a synthetic DNA adapter containing a restriction site, EcoRI, 363 amino acids derived from the expression of a cloned HCV cDNA fragment, and 5 carboxyl terminal amino acids derived from an MS2 cloning vector nucleotide sequence. The DNA sequence encoding this polypeptide was transformed into yeast cells using a plasmid. The transformed cells were cultured and expressed a 54,000 molecular weight polypeptide which was purified to about 80% purity by differential extraction. Other SOD fusion polypeptides designated SOD-NANB5-1-1 and SOD-NANB81 were expressed in recombinant bacteria. The E.coli fusion polypeptides were purified by differential extraction and by chromatography using anion and cation exchange columns. The purification procedures were able to produce SOD-NANB5-1-1 as about 80% pure and SOD-NANB81 as about 50% pure.
The recombinant SOD fusion polypeptides described by Houghton, M., et al. were coated on microtiter wells or polystyrene beads and used to assay serum samples. Briefly, coated microtiter wells were incubated with a sample in a diluent. After incubation, the microtiter wells were washed and then developed using either a radioactively labelled sheep anti-human antibody or a mouse anti-human IgG-HRP (horseradish peroxidase) conjugate. These assays were used to detect both post acute phase and chronic phase HCV infection.
Due to the preparative methods, assay specificity required adding yeast or E.coli extracts to the samples in order to prevent undesired immunological reactions with any yeast or E.coli antibodies present in samples.
Ortho Diagnostic Systems Inc. has developed an immunoenzyme assay to detect antibodies to HCV antigens. The Ortho assay procedure is a three-stage test for serum/plasma carried out in a microwell coated with the recombinant yeast hepatitis C virus SOD fusion polypeptide c100-3.
In the first stage, a test specimen is diluted directly in the test well and incubated for a specified length of time. If antibodies to HCV antigens are present in the specimen, antigen-antibody complexes will be formed on the microwell surface.
If no antibodies are present, complexes will not be formed and the unbound serum or plasma proteins will be removed in a washing step.
In the second stage, anti-human IgG murine monoclonal antibody horseradish peroxidase conjugate is added to the microwell. The conjugate binds specifically to the antibody portion of the antigen-antibody complexes. If antigen-antibody complexes are not present, the unbound conjugate will also be removed by a washing step.
In the third stage, an enzyme detection system composed of o-phenylenediamine 2HCl (OPD) and hydrogen peroxide is added to the test well. If bound conjugate is present, the OPD will be oxidized, resulting in a colored end product. After formation of the colored end product, dilute sulfuric acid is added to the microwell to stop the color-forming detection reaction.
The intensity of the colored end product is measured with a microwell reader. The assay may be used to screen patient serum and plasma. It is established that HCV may be transmitted by contaminated blood and blood products. In transfused patients, as many as 10% will suffer from post-transfusion hepatitis. Of these cases, approximately 90% are the result of infections diagnosed as HCV. The prevention of transmission of HCV by blood and blood products requires reliable, sensitive and specific diagnostic and prognostic tools to identify HCV carriers as well as contaminated blood and blood products. Thus, there exists a need for an HCV assay which uses reliable and efficient reagents and methods to accurately detect the presence of HCV antibodies in samples.
SUMMARY OF THE INVENTION
The present invention provides an improved assay for detecting the presence of an antibody to an HCV antigen in a sample by contacting the sample with at least one recombinant protein representing a distinct antigenic region of the HCV genome.
Recombinant antigens which are derived from the molecular cloning and expression of synthetic DNA sequences in heterologous hosts are provided. Briefly, synthetic DNA sequences which encode the desired proteins representing distinct antigenic regions of the HCV genome are optimized for expression in E.coli by specific codon selection. Specifically, recombinant proteins representing five distinct antigenic regions of NS1 of the HCV genome are described. The proteins are expressed as chimeric fusions with the E.coli CMP-KDO synthetase (CKS) gene. The first protein, expressed by plasmid pHCV-77 (identified as SEQ. ID. NO. 1), represents amino acids 365-579 of the HCV sequence of NS1 and, based on analogy to the genomic organization of other flaviviruses, has been named HCV CKS-NS1S1.
Note that the term pHCV-77 will also refer to the fusion protein itself and that pHCV-77' will be the designation for a polypeptide representing the NS1 region from about amino acids 365-579 of the HCV sequence prepared using other recombinant or synthetic methodologies. Other recombinant methodologies would include the preparation of pHCV-77' utilizing different expression systems. The methodology for the preparation of synthetic peptides of HCV is described in U.S. Serial No. 456,162, filed December 22, 1989, and U.S. Serial No. 610,180, filed November 7, 1990, which enjoy common ownership and are incorporated herein by reference. The next protein is expressed by plasmid pHCV-65 (identified as SEQ. ID. NO. 2) and represents amino acids 565-731 of the NS1 region of the HCV genome. It has been named HCV CKS-NS1S2. The fusion protein itself will also be referred to as pHCV-65, and pHCV-65' shall be the designation for a polypeptide from the NS1 region representing from about amino acids 565-731 of the HCV sequence prepared using other recombinant or synthetic methodologies. The next recombinant antigen represents amino acids 717-847 of the NS1 region of the HCV sequence, and is expressed by the plasmid pHCV-78 (identified as SEQ. ID. NO. 3). The fusion protein will be referred to as pHCV-78, and pHCV-78' shall be the designation for a polypeptide from the NS1 region representing from about amino acids 717-847 of the HCV sequence prepared using other recombinant or synthetic methodologies. It has been designated clone HCV CKS-NS1S3 based on the strategy used in its construction. Figure 44 illustrates the position of pHCV-77, pHCV-65 and pHCV-78 in the NS1 region of the HCV genome. The recombinant antigen produced by pHCV-80 is identified as SEQ. ID. NO. 4 and is designated HCV CKS-NS1S1-NS1S2. The fusion protein is also designated pHCV-80, and pHCV-80' refers to the polypeptide located in the NS1 region of HCV, representing amino acids 365-731 of the HCV genome, prepared using different recombinant methodologies. Figure 45 illustrates the position of pHCV-80 within the HCV genome. HCV CKS-Full Length NS1 is the designation for the recombinant protein pHCV-92 (SEQ. ID. NO. 5). It represents amino acids 365-847 of the HCV genome. The fusion protein will be referred to as pHCV-92, and pHCV-92' shall be the designation for the polypeptide from the NS1 region representing amino acids 365-847 of the HCV sequence prepared using other recombinant or synthetic methodologies. Figure 46 illustrates the position of pHCV-92 in the HCV genome. These antigens are used in the inventive immunoassays to detect the presence of HCV antibodies in samples.
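The amino acid ranges stated for the five NS1 antigens can be cross-checked with a short script. The following is a minimal sketch and not part of the disclosed method: the ranges are those given in the text, while the dictionary layout and the helper names `overlap` and `span` are illustrative assumptions. It indicates that adjacent single-region antigens share 15 residues and that their combined extents match the ranges stated for pHCV-80 and pHCV-92.

```python
# Inclusive amino acid ranges as stated in the text.
# The data layout and helper names are illustrative assumptions.
NS1_FRAGMENTS = {
    "pHCV-77": (365, 579),  # HCV CKS-NS1S1
    "pHCV-65": (565, 731),  # HCV CKS-NS1S2
    "pHCV-78": (717, 847),  # HCV CKS-NS1S3
}
NS1_COMPOSITES = {
    "pHCV-80": (365, 731),  # HCV CKS-NS1S1-NS1S2
    "pHCV-92": (365, 847),  # HCV CKS-Full Length NS1
}


def overlap(a, b):
    """Number of residues shared by two inclusive ranges."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def span(regions):
    """Smallest inclusive range that covers every region in the list."""
    return (min(r[0] for r in regions), max(r[1] for r in regions))


if __name__ == "__main__":
    f = NS1_FRAGMENTS
    print("77/65 overlap:", overlap(f["pHCV-77"], f["pHCV-65"]))
    print("65/78 overlap:", overlap(f["pHCV-65"], f["pHCV-78"]))
    print("77+65 span:", span([f["pHCV-77"], f["pHCV-65"]]))
    print("77+65+78 span:", span(list(f.values())))
```

Under these ranges the extent of pHCV-77 plus pHCV-65 equals the range stated for pHCV-80, and the extent of all three fragments equals the range stated for pHCV-92.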
One assay format according to the invention provides a screening assay for identifying the presence of an antibody that is immunologically reactive with an HCV antigen. Briefly, a fluid sample is incubated with a solid support containing the commonly bound recombinant proteins. Finally, the antibody-antigen complex is detected. In a modification of the screening assay the solid support additionally contains recombinant polypeptide c100-3.
Another assay format provides a confirmatory assay for unequivocally identifying the presence of an antibody that is immunologically reactive with an HCV antigen. The confirmatory assay includes synthetic peptides or recombinant antigens representing the epitopes contained within the NS1 region of the HCV genome, which are the same regions represented by the recombinant proteins described in the screening assay. These are pHCV-77, pHCV-65, pHCV-78, pHCV-80 and pHCV-92. Recombinant proteins used in the confirmatory assay should have a heterologous source of antigen to that used in the primary screening assay (i.e., should not be an E.coli-derived recombinant antigen nor a recombinant antigen composed, in part, of CKS sequences). Briefly, specimens repeatedly reactive in the primary screening assay are retested in the confirmatory assay. Aliquots containing identical amounts of specimen are contacted with a synthetic peptide or recombinant antigen individually coated onto a solid support. Finally, the antibody-antigen complex is detected. The polypeptides or recombinant proteins can be utilized as indicated or combined with other polypeptides and recombinant proteins as described herein and also described in U.S. Serial No. 456,162 entitled "Hepatitis C Assay", filed December 22, 1989, which enjoys common ownership and is incorporated herein by reference.
Another assay format provides a competition assay or neutralization assay directed to the confirmation that positive results are not false by identifying the presence of an antibody that is immunologically reactive with an HCV antigen in a fluid sample where the sample is used to prepare first and second immunologically equivalent aliquots. The first aliquot is contacted with solid support containing a bound polypeptide which contains at least one epitope of an HCV antigen under conditions suitable for complexing with the antibody to form a detectable antibody-polypeptide complex and the second aliquot is first contacted with the same solid support containing bound polypeptide. The preferred recombinant polypeptides include pHCV-77, pHCV-65, pHCV-78, pHCV-80 and pHCV-92.
Another assay format provides an immunodot assay for identifying the presence of an antibody that is immunologically reactive with an HCV antigen by concurrently contacting a sample with recombinant polypeptides each containing distinct epitopes of an HCV antigen under conditions suitable for complexing the antibody with at least one of the polypeptides and detecting the antibody-polypeptide complex by reacting the complex with color-producing reagents. The preferred recombinant polypeptides employed include those recombinant polypeptides derived from pHCV-77, pHCV-65, pHCV-78, pHCV-80, as well as pHCV-92.
In all of the assays, the sample is preferably diluted before contacting the polypeptide absorbed on a solid support. Samples may be obtained from different biological sources such as whole blood, serum, plasma, cerebrospinal fluid, and lymphocyte or cell culture supernatants. Solid support materials may include cellulose materials, such as paper and nitrocellulose, natural and synthetic polymeric materials, such as polyacrylamide, polystyrene, and cotton, porous gels such as silica gel, agarose, dextran and gelatin, and inorganic materials such as deactivated alumina, magnesium sulfate and glass. Suitable solid support materials may be used in assays in a variety of well known physical configurations, including microtiter wells, test tubes, beads, strips, membranes, and microparticles. A preferred solid support for a non-immunodot assay is a polystyrene bead. A preferred solid support for an immunodot assay is nitrocellulose.
Suitable methods and reagents for detecting an antibody-antigen complex in an assay of the present invention are commercially available or known in the
relevant art. Representative methods may employ detection reagents such as enzymatic, radioisotopic, fluorescent, luminescent, or chemiluminescent reagents. These reagents may be used to prepare hapten-labelled antihapten detection systems according to known procedures, for example, a biotin-labelled antibiotin system may be used to detect an antibody-antigen complex.
The present invention also encompasses assay kits including polypeptides which contain at least one epitope of an HCV antigen bound to a solid support as well as needed sample preparation reagents, wash reagents, detection reagents and signal producing reagents. Other aspects and advantages of the invention will be apparent to those skilled in the art upon consideration of the following detailed description which provides illustrations of the invention in its presently preferred embodiments.
E.coli strains containing plasmids useful for constructs of the invention were deposited at the American Type Culture Collection (A.T.C.C.), Rockville, Maryland on August 10, 1990, under the accession Nos. ATCC 68380 (pHCV-23), ATCC 68381 (pHCV-29), ATCC 68382 (pHCV-31) and ATCC 68383 (pHCV-34), and on November 6, 1990, under the accession Nos. ATCC 68458 (pHCV-50), ATCC 68459 (pHCV-57), ATCC 68460 (pHCV-103), ATCC 68461 (pHCV-102), ATCC 68462 (pHCV-51), ATCC 68463 (pHCV-105), ATCC 68464 (pHCV-107), ATCC 68465 (pHCV-104), ATCC 68466 (pHCV-45), ATCC 68467 (pHCV-48), ATCC 68468 (pHCV-49), ATCC 68469 (pHCV-58) and ATCC 68470 (pHCV-101). E.coli strains containing plasmids useful for constructs of the invention were also deposited at the A.T.C.C. on September 26, 1991 under deposit numbers ATCC 68690 (pHCV-77), ATCC 68696 (pHCV-65), ATCC 68689 (pHCV-78), ATCC 68688 (pHCV-80) and ATCC 68695 (pHCV-92).
BRIEF DESCRIPTION OF THE DRAWINGS
FIGURE 1 illustrates the HCV genome.
FIGURE 2 illustrates the use of recombinant polypeptides to identify the presence of antibodies in a chimpanzee inoculated with HCV.
FIGURE 3 illustrates the sensitivity and specificity increase in using the screening assay using pHCV-34 and pHCV-31 antigens.
FIGURE 4 illustrates the construction of plasmid pHCV-34.
FIGURE 5 illustrates fusion protein pHCV-34.
FIGURE 6 illustrates the expression of pHCV-34 proteins in E.coli.
FIGURE 7 illustrates the construction of plasmid pHCV-23.
FIGURE 8 illustrates the construction of plasmid pHCV-29.
FIGURE 9 illustrates the construction of plasmid pHCV-31.
FIGURE 10 illustrates the fusion protein pHCV-31.
FIGURE 11 illustrates the expression of pHCV-29 in E.coli.
FIGURE 12 illustrates the expression of pHCV-23 in E.coli.
FIGURE 13 illustrates the expression of pHCV-31 in E.coli.
FIGURE 14 illustrates the increased sensitivity using the screening assay utilizing the pHCV-34.
FIGURE 15 illustrates the increased specificity with the screening assay utilizing pHCV-34 and pHCV-31.
FIGURE 16 illustrates the results in hemodialysis patients using the screening and confirmatory assays.
FIGURE 17 illustrates earlier detection of HCV in a hemodialysis patient using the screening assay.
FIGURE 18 illustrates the results of the screening assay utilizing pHCV-34 and pHCV-31 on samples from individuals with acute NANBH.
FIGURE 19 illustrates the results of the confirmatory assay of the same population group as in Figure 18.
FIGURE 20 illustrates the results of the screening and confirmatory assays on individuals infected with chronic NANBH.
FIGURE 21 illustrates preferred buffers, pH conditions, and spotting concentrations for the HCV immunodot assay.
FIGURE 22 illustrates the results of the HCV immunodot assay.
FIGURE 23 illustrates the fusion protein pHCV-45.
FIGURE 24 illustrates the expression of pHCV-45 in E.coli.
FIGURE 25 illustrates the fusion protein pHCV-48.
FIGURE 26 illustrates the expression of pHCV-48 in E.coli.
FIGURE 27 illustrates the fusion protein pHCV-51.
FIGURE 28 illustrates the expression of pHCV-51 in E.coli.
FIGURE 29 illustrates the fusion protein pHCV-50.
FIGURE 30 illustrates the expression of pHCV-50 in E.coli.
FIGURE 31 illustrates the fusion protein pHCV-49.
FIGURE 32 illustrates the expression of pHCV-49 in E.coli.
FIGURE 33 illustrates an immunoblot of pHCV-23, pHCV-45, pHCV-48, pHCV-51, pHCV-50 and pHCV-49.
FIGURE 34 illustrates the fusion proteins pHCV-24, pHCV-57, pHCV-58.
FIGURE 35 illustrates the expression of pHCV-24, pHCV-57, and pHCV-58 in E.coli.
FIGURE 36 illustrates the fusion protein pHCV-105.
FIGURE 37 illustrates the expression of pHCV-105 in E.coli.
FIGURE 38 illustrates the fusion protein pHCV-103.
FIGURE 39 illustrates the fusion protein pHCV-101.
FIGURE 40 illustrates the fusion protein pHCV-102.
FIGURE 41 illustrates the expression of pHCV-102 in E.coli.
FIGURE 42 illustrates the fusion protein pHCV-107.
FIGURE 43 illustrates the fusion protein pHCV-104.
FIGURE 44 illustrates the NS1 region of the HCV genome, and in particular, the locations of pHCV-77, pHCV-65 and pHCV-78.
FIGURE 45 illustrates the NS1 region of the HCV genome, and in particular, the location of pHCV-80.
FIGURE 46 illustrates the NS1 region of the HCV genome, and in particular, the location of pHCV-92.
FIGURE 47A illustrates the expression of pHCV-77 in E. coli and FIGURE 47B illustrates an immunoblot of pHCV-77 in E. coli.
FIGURE 48A illustrates the expression of pHCV-65 in E. coli and FIGURE 48B illustrates an immunoblot of pHCV-65 in E. coli.
FIGURE 49A illustrates the expression of pHCV-80 in E. coli and FIGURE 49B illustrates an immunoblot of pHCV-80 in E. coli.
DETAILED DESCRIPTION OF THE INVENTION
The present invention is directed to an assay to detect an antibody to an HCV antigen in a sample. Human serum or plasma is preferably diluted in a sample diluent and incubated with a polystyrene bead coated with a recombinant polypeptide that represents a distinct antigenic region of the HCV genome. If antibodies are present in the sample they will form a complex with the antigenic polypeptide and become affixed to the polystyrene bead. After the complex has formed, unbound materials and reagents are removed by washing the bead and the bead-antigen-antibody complex is reacted with a solution containing horseradish peroxidase labeled goat antibodies directed against human antibodies. This peroxidase-labeled antibody then binds to the antigen-antibody complex already fixed to the bead. In a final reaction the horseradish peroxidase is contacted with o-phenylenediamine and hydrogen peroxide, which results in a yellow-orange color. The intensity of the color is proportional to the amount of antibody which initially binds to the antigen fixed to the bead.
The preferred recombinant polypeptides having HCV antigenic epitopes were selected from portions of the HCV genome which encoded polypeptides which possessed amino acid sequences similar to other known immunologically reactive agents and which were identified as having some immunological reactivity. (The immunological reactivity of a polypeptide was initially identified by reacting the cellular extract of E.coli clones which had been transformed with cDNA fragments of the HCV genome with HCV infected serum. Polypeptides expressed by clones containing the incorporated cDNA were immunologically reactive with serum known to contain antibody to HCV antigens.) An analysis of a given amino acid sequence, however, provides only a rough guide to predicting immunological reactivity. There is no invariably predictable way to ensure immunological activity short of preparing a given amino acid sequence and testing the suspected sequence in an assay.
The use of recombinant polypeptides representing distinct antigenic regions of the HCV genome to detect the presence of an antibody to an HCV antigen is illustrated in Figure 2. The course of HCV infection in the chimpanzee, Pan, was followed with one assay using recombinant c100-3 polypeptide and with another improved assay, using the two recombinant antigens CKS-Core (pHCV-34) (SEQ.ID.NO 6 and 7) and pHCV-33c-BCD (pHCV-31) (SEQ.ID.NO 8 and 9) expressed by the plasmids pHCV-34 and pHCV-31, respectively. The assay utilizing the recombinant pHCV-34 and pHCV-31 proteins detected plasma antibody three weeks prior to detection of antibody by the assay using c100-3.
The results of a study which followed the course of HCV infection in Pan and six other chimpanzees using the two assays described above are summarized in Figure 3. Both assays gave negative results before inoculation and both assays detected the presence of antibodies after the animal had been infected with HCV. However, in the comparison of the two assays, the improved screening assay using pHCV-34 and pHCV-31 detected seroconversion to HCV antigens at an earlier or equivalent bleed date in six of the seven chimpanzees. Data from these chimpanzee studies demonstrate that overall detection of HCV antibodies is increased with the assay utilizing the pHCV-34 and pHCV-31 proteins. This test is sufficiently sensitive to detect seroconversion during the acute phase of this disease, defined as an elevation in ALT levels, in most animals. Equally important is the high degree of specificity of the test, as no pre-inoculation specimens were reactive.
The polypeptides useful in the practice of this invention are produced using recombinant technologies. The DNA sequences which encode the desired polypeptides
are preferably assembled from fragments of the total desired sequence. Synthetic DNA fragments of the HCV genome can be synthesized based on their corresponding amino acid sequences. Once the amino acid sequence is chosen, this is then reverse translated to determine the complementary DNA sequence using codons optimized to facilitate expression in the chosen system. The fragments are generally prepared using well known automated processes and apparatus. After the complete sequence has been prepared the desired sequence is incorporated into an expression vector which is transformed into a host cell. The DNA sequence is then expressed by the host cell to give the desired polypeptide which is harvested from the host cell or from the medium in which the host cell is cultured. When smaller peptides are to be made using recombinant technologies it may be advantageous to prepare a single DNA sequence which encodes several copies of the desired polypeptide in a connected chain. The long chain is then isolated and the chain is cleaved into the shorter, desired sequences. The methodology of polymerase chain reaction (PCR) may also be employed to develop PCR amplified genes from any portion of the HCV genome, which in turn may then be cloned and expressed in a manner similar to the synthetic genes.
Vector systems which can be used include plant, bacterial, yeast, insect, and mammalian expression systems. It is preferred that the codons are optimized for expression in the system used.
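The reverse translation step described above can be sketched as follows. This is a minimal illustration only: the one-codon-per-residue table `PREFERRED_CODON` is a hypothetical preference table assumed for the example, not the codon selection actually used for the constructs of the invention, and the peptide in the usage note is an arbitrary placeholder rather than an HCV sequence.

```python
# Hypothetical preferred-codon table (one codon per amino acid).
# Each codon does encode the listed residue in the standard genetic code,
# but the choice among synonymous codons is an assumption for illustration.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def reverse_translate(protein, table=PREFERRED_CODON):
    """Return a DNA coding sequence for a one-letter amino acid string."""
    codons = []
    for position, residue in enumerate(protein.upper(), start=1):
        if residue not in table:
            raise ValueError(f"unknown residue {residue!r} at position {position}")
        codons.append(table[residue])
    return "".join(codons)


if __name__ == "__main__":
    # Arbitrary placeholder peptide, not taken from the HCV sequence.
    print(reverse_translate("MKV"))
```

With this table, `reverse_translate("MKV")` returns `ATGAAAGTG`. A table for a different host would be substituted through the `table` argument, in keeping with the preference that codons be optimized for the expression system used.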
A preferred expression system utilizes a carrier gene for a fusion system where the recombinant HCV proteins are expressed as a fusion protein of an E.coli enzyme, CKS (CTP:CMP-3-deoxy-manno-octulosonate cytidylyl transferase or CMP-KDO synthetase). The CKS method of protein synthesis is disclosed in U.S. Patent Applications Serial Nos. 167,067 and 276,263, filed March 11, 1988 and November 23, 1988, respectively, by Bolling (EPO 891029282), which enjoy common ownership and are incorporated herein by reference.
Other expression systems may be utilized including the lambda pL vector system whose features include a strong lambda pL promoter, a strong three-frame translation terminator rrnBt1, and translation starting at an ATG codon.
In the present invention, the amino acid sequences of the recombinant HCV antigens of interest were reverse translated using codons optimized to facilitate high level expression in E.coli. Individual oligonucleotides were synthesized by the method of oligonucleotide directed double-stranded break repair disclosed in U.S. Patent Application Serial No. 883,242, filed July 8, 1986 by Mandecki (EPO 87109357.1), which enjoys common ownership and is incorporated herein by reference. Alternatively, the individual oligonucleotides may be synthesized on the Applied Biosystems 380A DNA synthesizer using methods and reagents recommended by the manufacturer. The DNA sequences of the individual oligonucleotides were confirmed using the Sanger dideoxy chain termination method (Sanger et al., J. Mol. Biol., 162:729 (1982)). These individual gene fragments were then annealed and ligated together and cloned as EcoRI-BamHI subfragments in the CKS fusion vector pJO200. After subsequent DNA sequence confirmation by the Sanger dideoxy chain termination method, the subfragments were digested with appropriate restriction enzymes, gel purified, ligated and cloned again as an EcoRI-BamHI fragment in the CKS fusion vector pJO200. The resulting clones were mapped to identify a hybrid gene consisting of the EcoRI-BamHI HCV fragment inserted at the 3' end of the CKS (CMP-KDO synthetase) gene. The resultant fusion proteins, under control of the lac promoter, consist of 239 amino acids of the CKS protein fused to the various regions of HCV.
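As a rough check on the expected size of each fusion product, the 239 CKS residues can be added to the number of HCV residues in each construct. The sketch below assumes that the fusion consists only of those two parts; any residues contributed by linker or cloning-site sequences are not stated in this section and are ignored, so the figures are lower-bound estimates rather than reported values. The function name `fusion_length` is illustrative.

```python
CKS_RESIDUES = 239  # CKS portion of each fusion protein, as stated in the text

# Inclusive HCV amino acid ranges stated in the Summary for the NS1 constructs.
HCV_RANGES = {
    "pHCV-77": (365, 579),
    "pHCV-65": (565, 731),
    "pHCV-78": (717, 847),
    "pHCV-80": (365, 731),
    "pHCV-92": (365, 847),
}


def fusion_length(start, end, cks=CKS_RESIDUES):
    """Estimated residue count: CKS portion plus an inclusive HCV range.

    Linker-derived residues are not counted (assumption of this sketch).
    """
    return cks + (end - start + 1)


if __name__ == "__main__":
    for name, (start, end) in HCV_RANGES.items():
        print(name, fusion_length(start, end))
```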
The synthesis, cloning, and characterization of the recombinant polypeptides as well as the preferred formats for assays using these polypeptides are provided in the following examples. Examples 1 and 2 describe the synthesis and cloning of CKS-Core and CKS-33-BCD, respectively. Example 3 describes a screening assay. Example 4 describes a confirmatory assay. Example 5 describes a competition assay. Example 6 describes an immunodot assay. Example 7 describes the synthesis and cloning of HCV CKS-NS5E, CKS-NS5F, CKS-NS5G, CKS-NS5H and CKS-NS5I. Example 8 describes the preparation of HCV CKS-C100 vectors. Example 9 describes the preparation of HCV PCR derived expression vectors. Example 10 describes the synthesis and characterization of pHCV-77 of NS1. Example 11 describes the synthesis and characterization of pHCV-65 of NS1. Example 12 describes the synthesis and characterization of pHCV-78 of NS1.
Example 13 describes the synthesis and characterization of pHCV-80 of NS1. Example 14 describes the synthesis and characterization of pHCV-92 of NS1.
REAGENTS AND ENZYMES
Media such as Luria-Bertani (LB) and Superbroth II (Dri Form) were obtained from Gibco Laboratories Life Technologies, Inc., Madison, Wisconsin. Restriction enzymes, Klenow fragment of DNA polymerase I, T4 DNA ligase, T4 polynucleotide kinase, nucleic acid molecular weight standards, M13 sequencing system, X-gal (5-bromo-4-chloro-3-indolyl-β-D-galactoside), IPTG (isopropyl-β-D-thiogalactoside), glycerol, dithiothreitol, and 4-chloro-1-naphthol were purchased from Boehringer Mannheim Biochemicals, Indianapolis, Indiana; or New England Biolabs, Inc., Beverly, Massachusetts; or Bethesda Research Laboratories Life Technologies, Inc., Gaithersburg, Maryland. Prestained protein molecular weight standards, acrylamide (crystallized, electrophoretic grade >99%), N,N'-Methylene-bis-acrylamide (BIS), N,N,N',N'-Tetramethylethylenediamine (TEMED) and sodium dodecylsulfate (SDS) were purchased from BioRad Laboratories, Richmond, California. Lysozyme and ampicillin were obtained from Sigma Chemical Co., St. Louis, Missouri. Horseradish peroxidase (HRPO) labeled secondary antibodies were obtained from Kirkegaard & Perry Laboratories, Inc., Gaithersburg, Maryland. Seaplaque® agarose (low melting agarose) was purchased from FMC Bioproducts, Rockland, Maine.
T50E10 contained 50 mM Tris, pH 8.0, 10 mM EDTA; 1X TG contained 100 mM Tris, pH 7.5 and 10% glycerol; 2X SDS/PAGE loading buffer consisted of 15% glycerol, 5% SDS, 100 mM Tris base, 1 M β-mercaptoethanol and 0.8% Bromophenol blue dye; TBS contained 50 mM Tris, pH 8.0, and 150 mM sodium chloride; Blocking solution consisted of 5% Carnation nonfat dry milk in TBS.
HOST CELL CULTURES, DNA SOURCES AND VECTORS
E.coli JM103 cells, pUC8, pUC18, pUC19 and M13 cloning vectors were purchased from Pharmacia LKB Biotechnology, Inc., Piscataway, New Jersey; Competent Epicurean™ coli strains XL1-Blue and JM109 were purchased from Stratagene Cloning Systems, La Jolla, California. RR1 cells were obtained from Coli Genetic Stock Center, Yale University, New Haven, Connecticut; and E.coli CAG456 cells from Dr. Carol Gross, University of Wisconsin, Madison, Wisconsin. Vector pRK248.cIts was obtained from Dr. Donald R. Helinski, University of California, San Diego, California.
GENERAL METHODS
All restriction enzyme digestions were performed according to suppliers' instructions. At least 5 units of enzyme were used per microgram of DNA, and sufficient incubation was allowed to complete digestion of DNA. Standard procedures were used for minicell lysate DNA preparation, phenol-chloroform extraction, ethanol precipitation of DNA, restriction analysis of DNA on agarose, and low melting agarose gel purification of DNA fragments (Maniatis et al., Molecular Cloning. A Laboratory Manual [New York: Cold Spring Harbor, 1982]). Plasmid isolations from E.coli strains used the alkali lysis procedure and cesium chloride-ethidium bromide density gradient method (Maniatis et al., supra). Standard buffers were used for T4 DNA ligase and T4 polynucleotide kinase (Maniatis et al., supra).
EXAMPLE 1. CKS-CORE
A. Construction of the Plasmid pJO200
The cloning vector pJO200 allows the fusion of recombinant proteins to the CKS protein. The plasmid consists of the plasmid pBR322 with a modified lac promoter fused to a KdsB gene fragment (encoding the first 239 of the entire 248 amino acids of the E.coli CMP-KDO synthetase, or CKS protein), and a synthetic linker fused to the end of the KdsB gene fragment. The cloning vector pJO200 is a modification of vector pTB210. The synthetic linker includes multiple restriction sites for insertion of genes, translational stop signals, and the trpA rho-independent transcriptional terminator. The CKS method of protein synthesis as well as CKS vectors including pTB210 are disclosed in U.S. Patent Application Serial Nos. 167,067 and 276,263, filed March 11, 1988 and November 23, 1988, respectively, by Bolling (EPO 891029282) which enjoy common ownership, and are herein incorporated by reference.
B. Preparation of HCV CKS-Core Expression Vector
Six individual oligonucleotides representing amino acids 1-150 of the HCV genome were ligated together and cloned as a 466 base pair EcoRI-BamHI fragment into the CKS fusion vector pJO200 as presented in Figure 4. The complete DNA sequence of this plasmid, designated pHCV-34, and the entire amino acid sequence of the pHCV-34 recombinant antigen produced are presented in SEQ.ID.NO 6 and 7. The resultant fusion protein, HCV CKS-Core, consists of 239 amino acids of CKS, seven amino acids contributed by linker DNA sequences, and the first 150 amino acids of HCV as illustrated in Figure 5.
The pHCV-34 plasmid and the CKS plasmid pTB210 were transformed into E.coli K-12 strain XL-1 (recA1, endA1, gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells made competent by the calcium chloride method. In these constructions the expression of the CKS fusion proteins was under the control of the lac promoter and was induced by the addition of IPTG. These plasmids replicated as independent elements, were nonmobilizable and were maintained at approximately 10-30 copies per cell.
C. Characterization of Recombinant HCV-Core
In order to establish that clone pHCV-34 expressed the unique HCV-CKS Core protein, the pHCV-34/XL-1 culture was grown overnight at 37°C in growth media consisting of yeast extract, tryptone, phosphate salts, glucose, and ampicillin. When the culture reached an OD600 of 1.0, IPTG was added to a final concentration of 1 mM to induce expression. Samples (1.5 ml) were removed at 1-hour intervals, and cells were pelleted and resuspended to an OD600 of 1.0 in 2X SDS/PAGE loading buffer. Aliquots (15 ul) of the prepared samples were separated on duplicate 12.5% SDS/PAGE gels.
One gel was fixed in a solution of 50% methanol and 10% acetic acid for 20 minutes at room temperature, and then stained with 0.25% Coomassie blue dye in a solution of 50% methanol and 10% acetic acid for 30 minutes. Destaining was carried out using a solution of 10% methanol and 7% acetic acid for 3-4 hours, or until a clear background was obtained.
Figure 6 presents the expression of pHCV-34 proteins in E.coli. Molecular weight standards were run in Lane M. Lane 1 contains the plasmid pJO200, the CKS vector without the HCV sequence. The arrows on the left indicate the mobilities of the molecular weight markers from top to bottom: 110,000; 84,000; 47,000; 33,000; 24,000; and 16,000 daltons. The arrows on the right indicate the mobilities of the recombinant HCV proteins. Lane 2 contains the E.coli lysate containing pHCV-34 expressing CKS-Core (amino acids 1 to 150) prior to induction; and Lane 3 after 3 hours of induction. The results show that the recombinant protein pHCV-34 has an apparent mobility corresponding to a molecular size of 48,000 daltons. This compares acceptably with the predicted molecular mass of 43,750 daltons.
Proteins from the second 12.5% SDS/PAGE gel were electrophoretically transferred to nitrocellulose for immunoblotting. The nitrocellulose sheet containing the transferred proteins was incubated with Blocking Solution for one hour and incubated overnight at 4°C with HCV patients' sera diluted in TBS containing E.coli K-12 strain XL-1 lysate. The nitrocellulose sheet was washed three times in TBS, then incubated with HRPO-labeled goat anti-human IgG, diluted in TBS containing 10% fetal calf serum. The nitrocellulose was washed three times with TBS and the color was developed in TBS containing 2 mg/ml 4-chloro-1-naphthol, 0.02% hydrogen peroxide and 17% methanol. Clone pHCV-34 demonstrated a strong immunoreactive band at 48,000 daltons with the HCV patients' sera. Thus, the major protein in the Coomassie stained protein gel was immunoreactive. Normal human serum did not react with any component of pHCV-34.
EXAMPLE 2. HCV CKS-33C-BCD
A. Preparation of HCV CKS-33c-BCD Expression Vector
The construction of this recombinant clone expressing the HCV CKS-33-BCD antigen was carried out in three steps described below. First, a clone expressing the HCV CKS-BCD antigen was constructed, designated pHCV-23. Second, a clone expressing the HCV CKS-33 antigen was constructed, designated pHCV-29. Lastly, the HCV BCD region was excised from pHCV-23 and inserted into pHCV-29 to construct a clone expressing the HCV CKS-33-BCD antigen, designated pHCV-31 (SEQ.ID.NO. 8 and 9).
To construct the plasmid pHCV-23, thirteen individual oligonucleotides representing amino acids 1676-1931 of the HCV genome were ligated together and cloned as three separate EcoRI-BamHI subfragments into the CKS fusion vector pJO200. After subsequent DNA sequence confirmation, the three subfragments, designated B, C, and D respectively, were digested with the appropriate restriction enzymes, gel purified, ligated together, and cloned as a 781 base pair EcoRI-BamHI fragment in the CKS fusion vector pJO200, as illustrated in Figure 7. The resulting plasmid, designated pHCV-23, expresses the HCV CKS-BCD antigen under control of the lac promoter. The HCV CKS-BCD antigen consists of 239 amino acids of CKS, seven amino acids contributed by linker DNA sequences, 256 amino acids from the HCV NS4 region (amino acids 1676-1931), and 10 additional amino acids contributed by linker DNA sequences.
To construct the plasmid pHCV-29, twelve individual oligonucleotides representing amino acids 1192-1457 of the HCV genome were ligated together and cloned as two separate EcoRI-BamHI subfragments in the CKS fusion vector pJO200. After subsequent DNA sequence confirmation, the two subfragments were digested with the appropriate restriction enzymes, gel purified, ligated together and cloned again as an 816 base pair EcoRI-BamHI fragment in the CKS fusion vector pJO200, as illustrated in Figure 8. The resulting plasmid, designated pHCV-29, expresses the CKS-33 antigen under control of the lac promoter. The HCV CKS-33 antigen consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, and 266 amino acids from the HCV NS3 region (amino acids 1192-1457).
To construct the plasmid pHCV-31, the 781 base pair EcoRI-BamHI fragment from pHCV-23 representing the HCV-BCD region was linker-adapted to produce a ClaI-BamHI fragment which was then gel purified and ligated into pHCV-29 at the ClaI-BamHI sites as illustrated in Figure 9. The resulting plasmid, designated pHCV-31, expresses the pHCV-31 antigen under control of the lac promoter. The complete DNA sequence of pHCV-31 and the entire amino acid sequence of the HCV CKS-33-BCD recombinant antigen produced are presented in SEQ.ID.NO. 8 and 9. The HCV CKS-33-BCD antigen consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, 266 amino acids of the HCV NS3 region (amino acids 1192-1457), 2 amino acids contributed by linker DNA sequences, 256 amino acids of the HCV NS4 region (amino acids 1676-1931), and 10 additional amino acids contributed by linker DNA sequences. Figure 12 presents a schematic representation of the pHCV-31 antigen.
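The segment compositions stated for the fusion antigens can be tallied to check that the residue ranges and segment counts agree. The following is only an illustrative sketch: the dictionary layout and the function names (span_length, total_length) are assumptions introduced here, while the segment lengths are those given in the text.

```python
# Segment lengths (amino acids) as stated in the text for each CKS fusion antigen.
# The data layout and helper names are illustrative assumptions.
FUSIONS = {
    "pHCV-34": [("CKS", 239), ("linker", 7), ("HCV core", 150)],
    "pHCV-23": [("CKS", 239), ("linker", 7), ("HCV NS4", 256), ("linker", 10)],
    "pHCV-29": [("CKS", 239), ("linker", 8), ("HCV NS3", 266)],
    "pHCV-31": [("CKS", 239), ("linker", 8), ("HCV NS3", 266),
                ("linker", 2), ("HCV NS4", 256), ("linker", 10)],
}


def span_length(first_residue, last_residue):
    """Number of residues in an inclusive range such as 1676-1931."""
    return last_residue - first_residue + 1


def total_length(segments):
    """Total number of amino acids in a fusion protein."""
    return sum(length for _, length in segments)


if __name__ == "__main__":
    # The stated HCV ranges should match the stated segment lengths.
    print("NS3 1192-1457:", span_length(1192, 1457))
    print("NS4 1676-1931:", span_length(1676, 1931))
    for plasmid, segments in FUSIONS.items():
        print(plasmid, total_length(segments), "amino acids")
```

Such a tally only verifies internal consistency of the stated lengths; it does not predict molecular mass, which depends on the actual sequence.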
The pHCV-31 plasmid was transformed into E.coli K-12 strain XL-1 in a manner similar to the pHCV-34 and CKS-pTB210 plasmids of Example 1.
B. Characterization of Recombinant HCV CKS-33-BCD
Characterization of pHCV CKS-33-BCD was carried out in a manner similar to pHCV CKS-Core of Example 1. SDS/PAGE gels were run for E.coli lysates containing the plasmids pHCV-29 (Figure 11), pHCV-23 (Figure 12), and pHCV-31 (Figure 13) expressing the recombinant fusion proteins CKS-33c, CKS-BCD, and CKS-33-BCD, respectively. For all three figures, molecular weight standards were run in Lane M, with the arrows on the left indicating mobilities of the molecular weight markers from top to bottom: 110,000; 84,000; 47,000; 33,000; 24,000; and 16,000 daltons. In Figure 11, Lane 1 contained the E.coli lysate containing pHCV-29 expressing HCV CKS-33c (amino acids 1192 to 1457) prior to induction and Lane 2 after 4 hours of induction. These results show that the recombinant pHCV-29 fusion protein has an apparent mobility corresponding to a molecular size of 60,000 daltons. This compares acceptably to the predicted molecular mass of 54,911 daltons.
In Figure 12, Lane 1 contained the E.coli lysate containing pJO200, the CKS vector without the HCV sequence. Lane 2 contained pHCV-20 expressing the HCV CKS-B (amino acids 1676 to 1790). Lane 3 contained the fusion protein pHCV-23 (amino acids 1676-1931). These results show that the recombinant pHCV-23 fusion protein has an apparent mobility corresponding to a molecular size of 55,000 daltons. This compares acceptably to the predicted molecular mass of 55,070 daltons.
In Figure 13, Lane 1 contained the E.coli lysate containing pJO200, the CKS vector without the HCV sequences. Lane 2 contained pHCV-31 expressing the CKS-33c-BCD fusion protein (amino acids 1192 to 1457 and 1676 to 1931) prior to induction and Lane 3 after 2 hours of induction. These results show that the recombinant pHCV-31 (CKS-33c-BCD) fusion protein has an apparent mobility corresponding to a molecular size of 90,000 daltons. This compares acceptably to the predicted molecular mass of 82,995 daltons.
An immunoblot was also run on one of the SDS/PAGE gels derived from the pHCV-31/XL-1 culture. Human serum from an HCV exposed individual reacted strongly with the major pHCV-31 band at 90,000 daltons. Normal human serum did not react with any component of the pHCV-31 (CKS-33-BCD) preparations.
EXAMPLE 3. SCREENING ASSAY
The use of recombinant polypeptides which contain epitopes within c100-3 as well as epitopes from other antigenic regions of the HCV genome provides immunological assays which have increased sensitivity and may be more specific than HCV immunological assays using epitopes within c100-3 alone.
In the presently preferred screening assay, the procedure uses two E.coli expressed recombinant proteins, CKS-Core (pHCV-34) and CKS-33-BCD (pHCV-31), representing three distinct regions of the HCV genome. These recombinant polypeptides were prepared following procedures described above. In the screening assay, both recombinant antigens are coated onto the same polystyrene bead. In a modification of the screening assay the polystyrene bead may also be coated with the SOD-fusion polypeptide c100-3.
The polystyrene beads are first washed with distilled water and propanol and then incubated with a solution containing recombinant pHCV-31 diluted to 0.5 to 2.0 ug/ml and pHCV-34 diluted to 0.1 to 0.5 ug/ml in 0.1 M NaH2PO4·H2O with 0.4 M NaCl and 0.0022% Triton X-100, pH 6.5. The beads are incubated in the antigen solution for 2 hours (plus or minus 10 minutes) at 38-42°C, washed in PBS and soaked in 0.1% (w/v) Triton X-100 in PBS for 60 minutes at 38-42°C. The beads are then washed two times in phosphate buffered saline (PBS), overcoated with a solution of 5.0% (w/v) bovine serum albumin (BSA) in PBS for 60 minutes at 38-42°C and washed one time in PBS. Finally, the beads are overcoated with 5% (w/v) sucrose in PBS, and dried under nitrogen or air.
The polystyrene beads coated with pHCV-31 and pHCV-34 are used in an antibody capture format. Ten microliters of sample are added to the wells of the reaction tray along with 400 ul of a sample diluent and the recombinant coated bead.
The sample diluent consists of 10% (v/v) bovine serum and 20% (v/v) goat serum in 20 mM Tris phosphate buffer containing 0.15% (v/v) Triton X-100, 1% (w/v) BSA, 1% E.coli lysate and 500 ug/ml or less CKS lysate. When the recombinant yeast c100-3 polypeptide is used, antibodies to yeast antigens which may be present in a sample are reacted with yeast extracts which are added to the sample diluent (typically about 200 ug/ml). The addition of yeast extracts to the sample diluent is used to prevent false positive results. The final material is sterile filtered and filled in plastic bottles, and preserved with 0.1% sodium azide.
After one hour of incubation at 40°C, the beads are washed and 200 ul of conjugate is added to the wells of the reaction tray.
The preferred conjugate is goat anti-human IgG horseradish peroxidase conjugate. Concentrated conjugate is titered to determine a working concentration. A twenty-fold concentrate of the working conjugate solution is then prepared by diluting the concentrate in diluent. The 20X concentrate is sterile filtered and stored in plastic bottles.
The conjugate diluent includes 10% (v/v) bovine serum, 10% (v/v) goat serum and 0.15% Triton X-100 in 20 mM Tris buffer, pH 7.5 with 0.01% gentamicin sulfate, 0.01% thimerosal and red dye. The conjugate is sterile filtered and filled in plastic bottles.
Anti-HCV positive control is prepared from plasma units positive for antibodies to HCV. The pool of units used includes plasma with antibodies reactive to pHCV-31 and pHCV-34. The units are recalcified and heat inactivated at 59-61°C for 12 hours with constant stirring. The pool is aliquoted and stored at -20°C or at 2-8°C. For each lot of positive control, the stock solution is diluted with negative control containing 0.1% sodium azide as a preservative. The final material is sterile filtered and filled in plastic bottles.
Anti-HCV negative control is prepared from recalcified human plasma, negative for antibodies to pHCV-31 and pHCV-34 proteins of HCV. The plasma is also negative for antibodies to human immunodeficiency virus (HIV) and negative for hepatitis B surface antigen (HBsAg). The units are pooled, and 0.1% sodium azide is added as a preservative. The final material is sterile filtered and filled in plastic bottles.
After one hour of incubation with the conjugate at 40°C, the beads are washed, exposed to the OPD substrate for thirty minutes at room temperature and the reaction terminated by the addition of 1 N H2SO4. The absorbance is read at 492 nm.
In order to maintain acceptable specificity, the cutoff for the assay should be at least 5-7 standard deviations above the absorbance value of the normal population mean. In addition, it has generally been observed that acceptable specificity is obtained when the population mean runs at a sample to cutoff (S/CO) value of 0.25 or less. Consistent with these criteria, a "preclinical" cutoff for the screening assay was selected which clearly separated most of the presumed "true negative" from "true positive" specimens. The cutoff value was calculated as the sum of the positive control mean absorbance value multiplied by 0.25 and the negative control mean absorbance value. The cutoff may be expressed algebraically as:
Cutoff value = 0.25 PCx + NCx
where PCx and NCx are the positive control and negative control mean absorbance values, respectively.
Testing may be performed by two methods which differ primarily in the degree of automation and the mechanism for reading the resulting color development in the assay. One method is referred to as the manual or Quantum™ method because Quantum or Quantumatic is used to read absorbance at 492 nm. It is also called the manual method because sample pipetting, washing and reagent additions are generally done manually by the technician, using appropriately calibrated pipettes, dispensers and wash instruments. The second method is referred to as the PPC method and utilizes the automated Abbott Commander® system. This system employs a pipetting device referred to as the Sample Management Center (SMC) and a wash/dispense/read device referred to as the Parallel Processing Center (PPC) disclosed in E.P.O. Publication No. 91114072.1. The optical reader used in the PPC has dual wavelength capabilities that can measure differential absorbances (peak band and side band) from the sample wells. These readings are converted into results by the processor's Control Center.
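The screening cutoff calculation described above (0.25 times the positive control mean absorbance plus the negative control mean absorbance) may be sketched as follows. This is a minimal illustration only: the function names and the absorbance readings are hypothetical, and the helper that expresses the cutoff in standard deviations above a normal population mean is an assumed way of checking the 5-7 standard deviation criterion, not a procedure stated in the text.

```python
from statistics import mean, stdev


def screening_cutoff(pc_abs, nc_abs, pc_factor=0.25):
    """Cutoff = pc_factor * (positive control mean) + (negative control mean)."""
    return pc_factor * mean(pc_abs) + mean(nc_abs)


def sample_to_cutoff(sample_abs, cutoff):
    """Sample to cutoff (S/CO) ratio for one absorbance reading at 492 nm."""
    return sample_abs / cutoff


def sds_above_population_mean(cutoff, population_abs):
    """How many standard deviations the cutoff lies above the normal
    population mean (assumed check of the 5-7 SD criterion)."""
    return (cutoff - mean(population_abs)) / stdev(population_abs)


if __name__ == "__main__":
    # Hypothetical absorbance readings, for illustration only.
    positive_controls = [1.20, 1.00]
    negative_controls = [0.05, 0.07]
    normal_population = [0.04, 0.05, 0.06]

    cutoff = screening_cutoff(positive_controls, negative_controls)
    print("cutoff:", round(cutoff, 3))
    print("S/CO of a 0.67 reading:", round(sample_to_cutoff(0.67, cutoff), 2))
    print("SDs above population mean:",
          round(sds_above_population_mean(cutoff, normal_population), 1))
```

A specimen would then be compared against the cutoff through its S/CO value, with the population mean expected to run at an S/CO of 0.25 or less.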
Screening Assay Performance
1. Serum/Plasma From Inoculated Chimpanzees
As previously described, Table I summarizes the results of a study which followed the course of HCV infection in seven chimpanzees using a screening assay which utilized the c100-3 polypeptide, and the screening assay which utilized pHCV-31 and pHCV-34. Both assays gave negative results before inoculation and both assays detected the presence of antibodies after the animal had been infected with HCV. However, in the comparison of the two assays, the assay utilizing pHCV-31 and pHCV-34 detected seroconversion to HCV antigens at an earlier or equivalent bleed date in six of the seven chimpanzees. Data from these chimpanzee studies clearly demonstrate that overall detection of HCV antibodies is greatly increased with the assay utilizing the pHCV-31 and pHCV-34 proteins. This test is sufficiently sensitive to detect seroconversion during the acute phase of this disease, defined as an elevation in ALT levels, in most animals. Equally important is the high degree of specificity of the test, as no pre-inoculation specimens were reactive.
2. Non-A, Non-B Panel II (H. Alter, NIH)
A panel of highly pedigreed human sera from Dr. H. Alter, NIH, Bethesda, MD., containing infectious HCV sera, negative sera and other disease controls was tested. A total of 44 specimens were present in the panel.
Six of seven sera which were "proven infectious" in chimpanzees were positive in both the screening assay using c100-3 as well as in the screening assay utilizing the recombinant proteins pHCV-31 and pHCV-34. These six reactive specimens were obtained from individuals with chronic hepatitis. All six of the reactive specimens were confirmed positive using synthetic peptide sp67. One specimen obtained during the acute phase of NANB post-transfusion hepatitis was non-reactive in both screening assays. In the group labeled "probable infectious" were three samples taken from the same post-transfusion hepatitis patient. The first two acute phase samples were negative in both assays, but the third sample was reactive in both assays. The disease control samples and pedigreed negative controls were uniformly negative. All sixteen specimens detected as positive by both screening assays were confirmed by the sp117 confirmatory assay (Figure 14). In addition, specimens 10 and 29 were newly detected in the screening assay utilizing the recombinant pHCV-31 and pHCV-34 antigens and were reactive by the sp75 confirmatory assay. Specimen 39 was initially reactive in the screening test utilizing pHCV-34 and pHCV-31, but upon retesting was negative and could not be confirmed by the confirmatory assays.
In summary, both screening tests identified 6 of 6 chronic NANBH carriers and 1 of 4 acute NANBH samples. Paired specimens from an implicated donor were non-reactive in the screening test utilizing c100-3 but were reactive in the screening test with pHCV-31 and pHCV-34. Thus, the screening test utilizing the recombinant antigens pHCV-31 and pHCV-34 appears to be more sensitive than the screening assay utilizing c100-3. None of the disease control specimens or pedigreed negative control specimens were reactive in either screening assay.
3. CBER Reference Panel
A reference panel for antibody to Hepatitis C was received from the Center for Biologics Evaluation and Research (CBER). This 10 member panel consists of eight reactive samples diluted in normal human sera negative for antibody to HCV and two sera that contain no detectable antibody to HCV. This panel was run on the Ortho first generation HCV EIA assay, the screening assay utilizing c100-3 and the screening assay utilizing pHCV-31 and pHCV-34. The assay results are presented in Figure 15.
The screening assay utilizing pHCV-31 and pHCV-34 detected all six of the HCV positive or borderline sample dilutions. The two non-reactive sample dilutions (709 and 710) appear to be diluted well beyond the endpoint of antibody detectability for both screening assays. A marked increase was observed in the sample to cutoff values for three of the members on the screening assay utilizing pHCV-31 and pHCV-34 compared to the screening assay utilizing c100-3 or the Ortho first generation test. All repeatably reactive specimens were confirmed.
EXAMPLE 4. CONFIRMATORY ASSAY
The confirmatory assay provides a means for unequivocally identifying the presence of an antibody that is immunologically reactive with an HCV antigen. The confirmatory assay includes synthetic peptides or recombinant antigens representing major epitopes contained within the three distinct regions of the HCV genome, which are the same regions represented by the two recombinant antigens described in the screening assay. Recombinant proteins used in the confirmatory assay should have a heterologous source of antigen to that used in the primary screening assay (i.e., should not be an E.coli-derived recombinant antigen nor a recombinant antigen composed, in part, of CKS sequences). Specimens repeatedly reactive in the primary screening assay are retested in the confirmatory assay. Aliquots containing identical amounts of specimen are contacted with a synthetic peptide or recombinant antigen individually coated onto a polystyrene bead. Seroreactivity for epitopes within the c100-3 region of the HCV genome is confirmed by use of the synthetic peptides sp67 and sp65. The synthetic peptide sp117 can also be used to confirm seroreactivity with the c100-3 region. Seroreactivity for HCV epitopes within the putative core region of HCV is confirmed by the use of the synthetic peptide sp75. In order to confirm seroreactivity for HCV epitopes within the 33c region of HCV, a recombinant antigen expressed as a chimeric protein with superoxide dismutase (SOD) in yeast is used. Finally, the antibody-antigen complex is detected.
The assay protocols were similar to those described in Example 3 above. The peptides are each individually coated onto polystyrene beads and used in an antibody capture format similar to that described for the screening assay. Ten microliters of specimen are added to the wells of a reaction tray along with 400 ul of a specimen diluent and a peptide coated bead. After one hour of incubation at 40°C, the beads are washed and 200 ul of conjugate (identical to that described in Example 3) is added to the wells of the reaction tray. After one hour of incubation at 40°C, the beads are washed, exposed to the OPD substrate for 30 minutes at room temperature and the reaction terminated by the addition of 1 N H2SO4. The absorbance is read at 492 nm. The cutoff value for the peptide assay is 4 times the mean of the negative control absorbance value.
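The peptide assay cutoff stated above (4 times the mean negative control absorbance) can be applied per bead, since each peptide or antigen is coated individually. The sketch below is illustrative only: the function names and readings are hypothetical, and treating a reading at or above the cutoff as reactive is an assumption made here.

```python
from statistics import mean


def peptide_cutoff(nc_abs, multiplier=4.0):
    """Cutoff for one peptide-coated bead: multiplier * negative control mean."""
    return multiplier * mean(nc_abs)


def reactive_beads(sample_abs_by_bead, nc_abs_by_bead):
    """Return the beads on which the specimen reads at or above the cutoff.

    Assumption: a reading equal to or above the cutoff counts as reactive.
    """
    return sorted(
        bead
        for bead, absorbance in sample_abs_by_bead.items()
        if absorbance >= peptide_cutoff(nc_abs_by_bead[bead])
    )


if __name__ == "__main__":
    # Hypothetical negative control and specimen readings at 492 nm.
    negative_controls = {"sp67": [0.05, 0.05], "sp75": [0.04, 0.06]}
    specimen = {"sp67": 0.15, "sp75": 0.90}
    print(reactive_beads(specimen, negative_controls))
```

Keeping a separate cutoff per bead reflects that each synthetic peptide or recombinant antigen is run with its own negative control readings.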
1. Panels containing Specimens "At Risk" for HCV Infection
A group of 233 specimens representing 23 hemodialysis patients, all with clinically diagnosed NANBH, were supplied by Gary Gitnick, M.D. at the University of California, Los Angeles Center for the Health Sciences. These samples, which were tested by the screening assay utilizing c100-3, were subsequently tested in the screening assay which uses pHCV-31 and pHCV-34. A total of 7/23 patients (30.43%) were reactive in the c100-3 screening assay, with a total of 36 repeat reactive specimens. Ten of 23 patients (43.48%) were reactive by the screening assay utilizing pHCV-31 and pHCV-34, with a total of 70 repeatable reactives among the available specimens (Figure 16). Two specimens were unavailable for testing. All of the 36 repeatedly reactive specimens detected in the c100-3 screening assay were confirmed by synthetic peptide confirmatory assays. A total of 34 of these 36 were repeatedly reactive on HCV EIA utilizing pHCV-34 and pHCV-31; two specimens were not available for testing. Of the 36 specimens additionally detected by the screening assay utilizing pHCV-34 and pHCV-31, 9 were confirmed by the core peptide confirmatory assay (sp75) and 27 were confirmed by the SOD-33C confirmatory assay.
In summary, these data indicate that detection of anti-HCV by the screening assay utilizing pHCV-31 and pHCV-34 may occur at an equivalent bleed date or as many as 9 months earlier, when compared to the c100-3 screening assay. Figure 17 depicts earlier detection by the screening assay utilizing pHCV-34 and pHCV-31 in a hemodialysis patient.
5. Acute/Chronic Non-A, Non-B Hepatitis
A population of specimens was identified from individuals diagnosed as having acute or chronic NANBH. Specimens from individuals with acute cases of NANBH were received from Gary Gitnick, M.D. at the University of California, Los
Angeles Center for Health Sciences. The diagnosis of acute hepatitis was based on the presence of a cytolytic syndrome (ALT levels greater than 2X the upper normal limit) on at least 2 serum samples for a duration of less than 6 months with or without other biological abnormalities and clinical symptoms. All specimens were also negative for IgM antibodies to Hepatitis A Virus (HAV) and were negative for Hepatitis B surface Ag when tested with commercially available tests. Specimens from cases of chronic NANBH were obtained from two clinical sites. Individuals
were diagnosed as having chronic NANBH based on the following criteria: persistently elevated ALT levels, liver biopsy results, and/or the absence of detectable HBsAg. Specimens with biopsy results were further categorized as either chronic active NANBH, chronic persistent NANBH, or chronic NANBH with cirrhosis.
These specimens were tested by both the c100-3 screening assay and the screening assay utilizing pHCV-34 and pHCV-31. The latter testing was performed in replicates of two by both the Quantum and PPC methods.
Community Acquired NANBH (Acute)
The c100-3 screening assay detected 2 of 10 specimens (20.00%) as repeatedly reactive, both of which were confirmed. The screening assay utilizing pHCV-34 and pHCV-31 detected both of these specimens plus an additional 2 specimens (Figure 18). These 2 specimens were confirmed by sp75 (see Figure 19).
Acute Post-Transfusion NANBH
The c100-3 assay detected 4 of 32 specimens (12.50%) as repeatedly reactive, all of which were confirmed. The screening assay utilizing pHCV-34 and pHCV-31 detected 3 out of these 4 specimens (75%) as reactive. The one sample that was missed had an S/CO of 0.95 by the latter screening test. This sample was confirmed by the sp67 peptide (Figure 18). In addition, the screening assay utilizing pHCV-34 and pHCV-31 detected 11 specimens not reactive in the c100-3 screening assay. Of the 9 specimens available for confirmation, 8 were confirmed by sp75 and 1 could not be confirmed but had an S/CO of 0.90 in the sp65 confirmatory test (see Figure 19).
Chronic NANBH
A summary of the results on these populations is shown in Figure 20. Overall, 155 of 164 (94.5%) chronic NANBH samples were detected by the screening test utilizing pHCV-31 and pHCV-34 using either Quantum or PPC. The 155 reactive samples were all confirmed in alternate assays using synthetic peptides based on sequences from either the c100, 33c or core regions of the HCV genome. In contrast, only 138 of 164 (84.1%) specimens were positive by the c100-3 assay. All but one of the 138 c100-3 samples were detected as positive by the screening assay utilizing pHCV-31 and pHCV-34. The one discordant specimen was not confirmed by either synthetic peptide or neutralization assays. Conversely, there were 17 confirmed specimens which were positive only by the screening assay utilizing pHCV-34 and pHCV-31.
The results indicate that the screening assay utilizing pHCV-34 and pHCV-31 is more sensitive than the current test in detecting HCV positive individuals within chronically infected NANBH populations.
EXAMPLE 5. COMPETITION ASSAY
The recombinant polypeptides containing antigenic HCV epitopes are useful for competition assays. To perform a neutralization assay, a recombinant polypeptide representing epitopes within the c100-3 region such as CKS-BCD (pHCV-23) is solubilized and mixed with a sample diluent to a final concentration of 0.5-50 ug/ml. Ten microliters of specimen or diluted specimen is added to a reaction well followed by 400 ul of the sample diluent containing the recombinant polypeptide and, if desired, the mixture may be preincubated for about fifteen minutes to two hours. A bead coated with c100-3 antigen is then added to the reaction well and incubated for one hour at 40°C. After washing, 200 ul of a peroxidase labeled goat anti-human IgG in conjugate diluent is added and incubated for one hour at 40°C. After washing, OPD substrate is added and incubated at room temperature for thirty minutes. The reaction is terminated by the addition of 1 N sulfuric acid and the absorbance read at 492 nm.
Samples containing antibodies to the c100-3 antigen generate a reduced signal caused by the competitive binding of the peptides to these antibodies in solution. The percentage of competitive binding may be calculated by comparing the absorbance value of the sample in the presence of a recombinant polypeptide to the absorbance value of the sample assayed in the absence of a recombinant polypeptide at the same dilution.
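The text does not give an explicit formula for the percentage of competitive binding. One common way to express the comparison of the two absorbance values is as a percent reduction in signal, sketched below; the formula, the function name and the example readings are assumptions made for illustration and are not taken from the specification.

```python
def percent_competition(abs_with_polypeptide, abs_without_polypeptide):
    """Percent reduction in absorbance caused by the competing polypeptide.

    Assumed formula: (1 - A_with / A_without) * 100, with both readings
    taken at 492 nm on the same specimen at the same dilution.
    """
    if abs_without_polypeptide <= 0:
        raise ValueError("absorbance without polypeptide must be positive")
    return (1.0 - abs_with_polypeptide / abs_without_polypeptide) * 100.0


if __name__ == "__main__":
    # Hypothetical readings: 1.2 without competitor, 0.3 with competitor.
    print(round(percent_competition(0.3, 1.2), 1), "% competition")
```

Under this convention a value near 0% indicates no competition, and values approaching 100% indicate that nearly all of the signal was blocked by the polypeptide in solution.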
EXAMPLE 6. IMMUNODOT ASSAY
The immunodot assay system uses a panel of purified recombinant polypeptides placed in an array on a nitrocellulose solid support. The prepared solid support is contacted with a sample and captures specific antibodies to HCV antigens. The captured antibodies are detected by a conjugate-specific reaction. Preferably, the conjugate specific reaction is quantified using a reflectance optics assembly within an instrument which has been described in U.S. Patent Application Serial No. 07/227,408 filed August 2, 1988. The related U.S. Patent Applications Serial Nos. 07/227,272, 07/227,586 and 07/227,590 further describe specific methods and apparatus useful to perform an immunodot assay. The assay has also been described in U.S. Application Serial No. 07/532,489 filed June 6, 1990.
Briefly, a nitrocellulose-based test cartridge is treated with multiple antigenic polypeptides. Each polypeptide is contained within a specific reaction zone on the test cartridge. After all the antigenic polypeptides have been placed on the nitrocellulose, excess binding sites on the nitrocellulose are blocked. The test cartridge is then contacted with a sample such that each antigenic polypeptide in each reaction zone will react if the sample contains the appropriate antibody. After reaction, the test cartridge is washed and any antigen-antibody reactions are identified using suitable well known reagents.
As described in the patent applications listed above, the entire process is amenable to automation. The specifications of these applications related to the method and apparatus for performing an immunodot assay are incorporated by reference herein.
In a preferred immunodot assay, the recombinant polypeptides pHCV-23, pHCV-29, pHCV-34, and c100-3 were diluted in the preferred buffers, pH conditions, and spotting concentrations as summarized in Figure 21 and applied to a preassembled nitrocellulose test cartridge. After drying the cartridge overnight at room temperature 37°C, the non-specific binding capacity of the nitrocellulose phase was blocked. The blocking solution contained 1% porcine gelatin, 1% casein enzymatic hydrolysate, 5% Tween-20, 0.1% sodium azide, 0.5 M sodium chloride and 20 mM Tris, pH 7.5.
Forty normal donors were assayed by following the method described above. The mean reflectance density value then was determined for each of the recombinant proteins. A cutoff value was calculated as the negative mean plus six standard deviations. Test cartridges were incubated with samples A00642 and 423 (see Figure 22). Sample A00642 was from a convalescent non-A, non-B hepatitis patient, diluted in negative human plasma from 1:100 to 1:12800. The other sample, 423, was from a paid plasma donor who tested positive in an assay using a recombinant c100-3 polypeptide, diluted in negative human plasma from 1:40 to 1:2560. After sample incubation, sequential incubations with a biotin-conjugated goat anti-human immunoglobulin-specific antibody, an alkaline phosphatase-conjugated rabbit anti-biotin specific antibody, and 5-bromo-4-chloro-3-indolyl phosphate produced a colored product at the site of the reaction. Sample to cutoff values (S/CO) were determined for all HCV recombinant proteins. Those S/CO values greater than or equal to 1.0 were considered reactive. The limiting dilution was defined as the lowest dilution at which the S/CO was greater than or equal to 1.0. As seen in Figure 22, each sample tested positive for all HCV recombinant proteins. The data demonstrate that reactivity for sample A00642 was greatest with pHCV-29, and decreased for the remaining antigens pHCV-23, c100-3, and pHCV-34. Sample 423 most strongly reacted with the recombinant proteins expressing pHCV-29 and pHCV-34, and to a lesser extent with pHCV-23 and c100-3.
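The cutoff and S/CO rules stated in this example can be sketched as follows. This is an illustration under stated assumptions: the sample standard deviation is used (the specification does not say whether the sample or population form was applied), and the function names and reflectance density values are hypothetical.

```python
from statistics import mean, stdev


def cutoff(negative_values: list[float], n_sd: float = 6.0) -> float:
    """Cutoff = mean of the negative (normal donor) values plus n_sd standard deviations.

    Assumption: sample standard deviation (statistics.stdev).
    """
    return mean(negative_values) + n_sd * stdev(negative_values)


def s_co(sample_value: float, cutoff_value: float) -> float:
    """Sample to cutoff ratio (S/CO)."""
    return sample_value / cutoff_value


def is_reactive(sample_value: float, cutoff_value: float) -> bool:
    """A sample is considered reactive when S/CO is greater than or equal to 1.0."""
    return s_co(sample_value, cutoff_value) >= 1.0


# Hypothetical reflectance density values for one antigen, for illustration only.
negatives = [0.010, 0.012, 0.008, 0.011, 0.009]
co = cutoff(negatives)
for reading in (0.039, 0.010):
    print(f"reading={reading:.3f} S/CO={s_co(reading, co):.2f} reactive={is_reactive(reading, co)}")
```

In practice a separate cutoff would be computed for each recombinant protein, since the mean reflectance density was determined per protein.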
EXAMPLE 7. HCV CKS-NS5 EXPRESSION VECTORS
A. Preparation of HCV CKS-NS5E
Eight individual oligonucleotides representing amino acids 1932-2191 of the HCV genome were ligated together and cloned as a 793 base pair EcoRI-BamHI fragment into the CKS fusion vector pJO200. The resulting plasmid, designated pHCV-45 (SEQ.ID.NO. 8), expresses the HCV CKS-NS5E antigen under control of the lac promoter. The HCV CKS-NS5E antigen consists of 239 amino acids of CKS, nine amino acids contributed by linker DNA sequences, and 260 amino acids from the HCV NS4/NS5 region (amino acids 1932-2191). Figure 23 presents a schematic representation of the recombinant antigen expressed by pHCV-45. SEQ.ID.NO. 10 and 11 present the DNA and amino acid sequence of the HCV CKS-NS5E recombinant antigen produced by pHCV-45. Figure 24 presents the expression of pHCV-45 proteins in E.coli. Lane 1 contained the E.coli lysate containing pHCV-45 expressing the HCV CKS-NS5E antigen (amino acids 1932-2191) prior to induction and lanes 2 and 3 at 2 and 4 hours post induction, respectively. These results show that the pHCV-45 fusion protein has an apparent mobility corresponding to a molecular size of 55,000 daltons. This compares acceptably to the predicted molecular mass of 57,597 daltons.
B. Preparation of HCV CKS-NS5F
Eleven individual oligonucleotides representing amino acids 2188-2481 of the HCV genome were ligated together and cloned as an 895 base pair EcoRI-BamHI fragment into the CKS fusion vector pJO200. The resulting plasmid, designated pHCV-48, expresses the HCV CKS-NS5F antigen under control of the lac promoter. The HCV CKS-NS5F antigen consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, and 294 amino acids from the HCV NS5 region (amino acids 2188-2481). Figure 25 presents a schematic representation of the recombinant antigen expressed by pHCV-48. SEQ.ID.NO. 12 and 13 present the DNA and amino acid sequence of the HCV CKS-NS5F recombinant antigen produced by pHCV-48. Figure 26 presents the expression of pHCV-48 proteins in E.coli. Lane 1 contained the E.coli lysate containing pHCV-48 expressing the HCV CKS-NS5F antigen (amino acids 2188-2481) prior to induction and lanes 2 and 3 at 2 and 4 hours post induction, respectively. These results show that the pHCV-48 fusion protein has an apparent mobility corresponding to a molecular size of 65,000 daltons. This compares acceptably to the predicted molecular mass of 58,985 daltons.
C. Preparation of HCV CKS-NS5G
Seven individual oligonucleotides representing amino acids 2480-2729 of the HCV genome were ligated together and cloned as a 769 base pair EcoRI-BamHI fragment into the CKS fusion vector pJO200. The resulting plasmid, designated pHCV-51 (SEQ.ID.NO. 10), expresses the HCV CKS-NS5G antigen under control of the lac promoter. The HCV CKS-NS5G antigen consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, and 250 amino acids from the HCV NS5 region (amino acids 2480-2729). Figure 27 presents a schematic representation of the recombinant antigen expressed by pHCV-51. SEQ.ID.NO. 14 and 15 present the DNA and amino acid sequence of the HCV CKS-NS5G recombinant antigen produced by pHCV-51. Figure 28 presents the expression of pHCV-51 proteins in E.coli. Lane 1 contained the E.coli lysate containing pHCV-51 expressing the HCV CKS-NS5G antigen (amino acids 2480-2729) prior to induction and lanes 2 and 3 at 2 and 4 hours post induction, respectively. These results show that the pHCV-51 fusion protein has an apparent mobility corresponding to a molecular size of 55,000 daltons. This compares acceptably to the predicted molecular mass of 54,720 daltons.
D. Preparation of HCV CKS-NS5H
Six individual oligonucleotides representing amino acids 2728-2867 of the HCV genome were ligated together and cloned as a 439 base pair EcoRI-BamHI fragment into the CKS fusion vector pJO200. The resulting plasmid, designated pHCV-50 (SEQ.ID.NO. 11), expresses the HCV CKS-NS5H antigen under control of the lac promoter. The HCV CKS-NS5H antigen consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, and 140 amino acids from the HCV NS5 region (amino acids 2728-2867). Figure 29 presents a schematic representation of the recombinant antigen expressed by pHCV-50. SEQ.ID.NO. 16 and 17 present the DNA and amino acid sequence of the HCV CKS-NS5H recombinant antigen produced by pHCV-50. Figure 30 presents the expression of pHCV-50 proteins in E.coli. Lane 1 contained the E.coli lysate containing pHCV-50 expressing the HCV CKS-NS5H antigen (amino acids 2728-2867) prior to induction and lanes 2 and 3 at 2 and 4 hours post induction, respectively. These results show that the pHCV-50 fusion protein has an apparent mobility corresponding to a molecular size of 45,000 daltons. This compares acceptably to the predicted molecular mass of 42,783 daltons.
E. Preparation of HCV CKS-NS5I
Six individual oligonucleotides representing amino acids 2866-3011 of the HCV genome were ligated together and cloned as a 460 base pair EcoRI-BamHI fragment into the CKS fusion vector pJO200. The resulting plasmid, designated pHCV-49 (SEQ.ID.NO. 12), expresses the HCV CKS-NS5I antigen under control of the lac promoter. The HCV CKS-NS5I antigen consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, and 146 amino acids from the HCV NS5 region (amino acids 2866-3011). Figure 31 presents a schematic representation of the recombinant antigen expressed by pHCV-49. SEQ.ID.NO. 18 and 19 present the DNA and amino acid sequence of the HCV CKS-NS5I recombinant antigen produced by pHCV-49. Figure 32 presents the expression of pHCV-49 proteins in E.coli. Lane 1 contained the E.coli lysate containing pHCV-49 expressing the HCV CKS-NS5I antigen (amino acids 2866-3011) prior to induction and lanes 2 and 3 at 2 and 4 hours post induction, respectively. These results show that the pHCV-49 fusion protein has an apparent mobility corresponding to a molecular size of 42,000 daltons. This compares acceptably to the predicted molecular mass of 43,497 daltons.
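The composition of the five CKS-NS5 fusion antigens described in parts A through E can be tabulated and cross-checked. The sketch below uses only the residue counts, amino acid ranges, and molecular sizes stated in this example; the class and field names are hypothetical, and the percent deviation between apparent and predicted size is an added illustration rather than a figure reported in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionAntigen:
    plasmid: str
    cks_residues: int
    linker_residues: int
    hcv_first: int     # first HCV amino acid position (inclusive)
    hcv_last: int      # last HCV amino acid position (inclusive)
    apparent_da: int   # apparent size on the gel, as stated in the text
    predicted_da: int  # predicted molecular mass, as stated in the text

    @property
    def hcv_residues(self) -> int:
        return self.hcv_last - self.hcv_first + 1

    @property
    def total_residues(self) -> int:
        return self.cks_residues + self.linker_residues + self.hcv_residues

    @property
    def percent_deviation(self) -> float:
        """Apparent size relative to predicted mass, in percent."""
        return 100.0 * (self.apparent_da - self.predicted_da) / self.predicted_da


ANTIGENS = [
    FusionAntigen("pHCV-45", 239, 9, 1932, 2191, 55_000, 57_597),
    FusionAntigen("pHCV-48", 239, 8, 2188, 2481, 65_000, 58_985),
    FusionAntigen("pHCV-51", 239, 8, 2480, 2729, 55_000, 54_720),
    FusionAntigen("pHCV-50", 239, 8, 2728, 2867, 45_000, 42_783),
    FusionAntigen("pHCV-49", 239, 8, 2866, 3011, 42_000, 43_497),
]

for a in ANTIGENS:
    print(
        f"{a.plasmid}: {a.hcv_residues} HCV residues, "
        f"{a.total_residues} total residues, "
        f"{a.percent_deviation:+.1f}% apparent vs predicted"
    )
```

The HCV residue counts derived from the stated ranges agree with the counts given in the text for all five constructs.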
F. Immunoblot of HCV CKS-NS5 Antigens
Induced E.coli lysates containing pHCV-23, pHCV-45, pHCV-48, pHCV-51, pHCV-50, or pHCV-49 were individually run on preparative SDS/PAGE gels to separate the various HCV CKS-NS5 or HCV CKS-BCD recombinant antigens from the majority of other E.coli proteins. Gel slices containing the separated individual HCV CKS-NS5 or HCV CKS-BCD recombinant antigens were then electrophoretically transferred to nitrocellulose, and the nitrocellulose sheet cut into strips. Figure 40 presents the results of a Western Blot analysis of various serum or plasma samples using these nitrocellulose strips. The arrows on the right indicate the position of each HCV CKS-BCD or HCV CKS-NS5 recombinant antigen, from top to bottom pHCV-23 (HCV CKS-BCD), pHCV-45 (HCV CKS-NS5E), pHCV-48 (HCV CKS-NS5F), pHCV-51 (HCV CKS-NS5G), pHCV-50 (HCV CKS-NS5H), pHCV-49 (HCV CKS-NS5I), and pJO200 (CKS). Panel A contained five normal human plasma samples, panel B contained five normal human sera, panel C contained twenty human sera positive in the Abbott HCV EIA test, panel D contained two mouse sera directed against CKS, and panel E contained two normal mouse sera. Both the HCV CKS-NS5E antigen expressed by pHCV-45 and the HCV CKS-NS5F antigen expressed by pHCV-48 were immunoreactive when screened with human serum samples containing HCV antibodies.
EXAMPLE 8. HCV CKS-C100
A. Preparation of HCV CKS-C100 Vectors
Eighteen individual oligonucleotides representing amino acids 1569-1931 of the HCV genome were ligated together and cloned as four separate EcoRI-BamHI subfragments into the CKS fusion vector pJO200. After subsequent DNA sequence confirmation, the four subfragments were digested with the appropriate restriction enzymes, gel purified, ligated together, and cloned as a 1102 base pair EcoRI-BamHI fragment in the CKS fusion vector pJO200. The resulting plasmid, designated pHCV-24, expresses the HCV CKS-C100 antigen under control of the lac promoter. The HCV CKS-c100 antigen consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, 363 amino acids from the HCV NS4 region (amino acids 1569-1931) and 10 additional amino acids contributed by linker DNA sequences. The HCV CKS-c100 antigen was expressed at very low levels by pHCV-24.
Poor expression levels of this HCV CKS-c100 recombinant antigen were overcome by constructing two additional clones containing deletions in the extreme amino terminal portion of the HCV c100 region. The first of these clones, designated pHCV-57 (SEQ.ID.NO. 20 and 21), contains a 23 amino acid deletion (HCV amino acids 1575-1597) and was constructed by deleting a 69 base pair DdeI restriction fragment. The second of these clones, designated pHCV-58 (SEQ.ID.NO. 22 and 23), contains a 21 amino acid deletion (HCV amino acids 1600-1620) and was constructed by deleting a 63 base pair NlaIV-HaeIII restriction fragment. Figure 34 presents a schematic representation of the recombinant antigens expressed by pHCV-24, pHCV-57, and pHCV-58. SEQ.ID.NO. 13 presents the DNA and amino acid sequence of the HCV-C100D1 recombinant antigen produced by pHCV-57. SEQ.ID.NO. 14 presents the DNA and amino acid sequence of the HCV-C100D2 recombinant antigen produced by pHCV-58. Figure 35 presents the expression of pHCV-24, pHCV-57, and pHCV-58 proteins in E.coli. Lane 1 contained the E.coli lysate containing pHCV-24 expressing the HCV CKS-c100 antigen (amino acids 1569-1931) prior to induction and lanes 2 and 3 at 2 and 4 hours post induction, respectively. Lane 4 contained the E.coli lysate containing pHCV-57 expressing the HCV CKS-C100D1 antigen (amino acids 1569-1574 and 1598-1931) prior to induction and lanes 5 and 6 after 2 and 4 hours induction, respectively. Lane 7 contained the E.coli lysate containing pHCV-58 expressing the HCV CKS-C100D2 antigen (amino acids 1569-1599 and 1621-1931) prior to induction, and lanes 8 and 9 after 2 and 4 hours induction, respectively. These results show that both the pHCV-57 and pHCV-58 fusion proteins express at significantly higher levels than the pHCV-24 fusion protein and that both the pHCV-57 and pHCV-58 fusion proteins have an apparent mobility corresponding to a molecular size of 65,000 daltons. This compares acceptably to the predicted molecular mass of 64,450 daltons for pHCV-57 and 64,458 daltons for pHCV-58.
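The two deletion clones can be checked against the parent c100 range with simple interval arithmetic. The following sketch assumes inclusive amino acid numbering and three base pairs per deleted residue; the helper name is hypothetical, and the ranges are those stated in this example.

```python
def retained_segments(parent: tuple[int, int], deletion: tuple[int, int]) -> list[tuple[int, int]]:
    """Return the inclusive residue ranges left after removing `deletion` from `parent`."""
    p_start, p_end = parent
    d_start, d_end = deletion
    if not (p_start <= d_start <= d_end <= p_end):
        raise ValueError("deletion must lie within the parent range")
    segments = []
    if d_start > p_start:
        segments.append((p_start, d_start - 1))
    if d_end < p_end:
        segments.append((d_end + 1, p_end))
    return segments


C100_PARENT = (1569, 1931)  # HCV amino acids carried by pHCV-24
DELETIONS = {
    "pHCV-57": (1575, 1597),
    "pHCV-58": (1600, 1620),
}

for plasmid, (start, end) in DELETIONS.items():
    deleted_residues = end - start + 1
    deleted_bp = 3 * deleted_residues  # one codon per residue
    kept = retained_segments(C100_PARENT, (start, end))
    print(f"{plasmid}: {deleted_residues} residues ({deleted_bp} bp) deleted, retained {kept}")
```

The retained ranges computed this way match the amino acid ranges listed for lanes 4 and 7 of Figure 35, and the deleted lengths match the 69 and 63 base pair restriction fragments.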
EXAMPLE 9. HCV PCR DERIVED EXPRESSION VECTORS
A. Preparation of HCV DNA Fragments
RNA was extracted from the serum of various chimpanzees or humans infected with HCV by first subjecting the samples to digestion with Proteinase K and SDS for 1 hour at 37°C followed by numerous phenol:chloroform extractions. The RNA was then concentrated by several ethanol precipitations and resuspended in water. RNA samples were then reverse transcribed according to supplier's instructions using a specific primer. A second primer was then added and PCR amplification was performed according to supplier's instructions. An aliquot of this PCR reaction was then subjected to an additional round of PCR using nested primers located internal to the first set of primers. In general, these primers also contained restriction endonuclease recognition sequences to be used for subsequent cloning. An aliquot of this second round nested PCR reaction was then subjected to agarose gel electrophoresis and Southern blot analysis to confirm the specificity of the PCR reaction. The remainder of the PCR reaction was then digested with the appropriate restriction enzymes, the HCV DNA fragment of interest gel purified, and ligated to an appropriate cloning vector. This ligation was then transformed into E.coli and single colonies were isolated and plasmid DNA prepared for DNA sequence analysis. The DNA sequence was then evaluated to confirm that the specific HCV coding region of interest was intact. HCV DNA fragments obtained in this manner were then cloned into appropriate vectors for expression analysis.
B. Preparation of HCV CKS-NS3
Using the methods detailed above, a 474 base pair DNA fragment from the putative NS3 region of HCV was generated by PCR. This fragment represents HCV amino acids 1473-1629 and was cloned into the CKS expression vector pJO201 by blunt-end ligation. The resulting clone, designated pHCV-105, expresses the HCV CKS-NS3 antigen under control of the lac promoter. The HCV CKS-NS3 antigen consists of 239 amino acids of CKS, 12 amino acids contributed by linker DNA sequences, 157 amino acids from the HCV NS3 region (amino acids 1473-1629), and 9 additional amino acids contributed by linker DNA sequences. Figure 36 presents a schematic representation of the pHCV-105 antigen. SEQ.ID.NO. 24 and 25 present the DNA and amino acid sequence of the HCV CKS-NS3 recombinant antigen produced by pHCV-105. Figure 37 presents the expression of pHCV-105 proteins in E.coli. Lane 1 contained the E.coli lysate containing pHCV-105 expressing the HCV CKS-NS3 antigen (amino acids 1472-1629) prior to induction and lanes 2 and 3 after 2 and 4 hours induction, respectively. These results show that the pHCV-105 fusion protein has an apparent mobility corresponding to a molecular mass of 43,000 daltons. This compares acceptably to the predicted molecular mass of 46,454 daltons.
C. Preparation of HCV CKS-5'ENV
Using the methods detailed above, a 489 base pair DNA fragment from the putative envelope region of HCV was generated by PCR. This fragment represents the HCV amino acids 114-276 and was cloned into the CKS expression vector pJO202 using EcoRI-BamHI restriction sites. The resulting clone, designated pHCV-103 (SEQ.ID.NO. 26 and 27), expresses the HCV CKS-5'ENV antigen under control of the lac promoter. The HCV CKS-5'ENV antigen consists of 239 amino acids of CKS, 7 amino acids contributed by linker DNA sequences, 163 amino acids from the HCV envelope region (amino acids 114-276), and 16 additional amino acids contributed by linker DNA sequences. Figure 38 presents a schematic representation of the pHCV-103 antigen. SEQ.ID.NO. 26 and 27 present the DNA and amino acid sequence of the HCV CKS-5'ENV recombinant antigen produced by pHCV-103. Figure 37 presents the expression of pHCV-103 proteins in E.coli. Lane 4 contained the E.coli lysate containing pHCV-103 expressing the HCV CKS-5'ENV antigen (amino acids 114-276) prior to induction and lanes 5 and 6 after 2 and 4 hours induction, respectively. These results show that the pHCV-103 fusion protein has an apparent mobility corresponding to a molecular mass of 47,000 daltons. This compares acceptably to the predicted molecular mass of 46,091 daltons.
D. Preparation of HCV CKS-3'ENV
Using the methods detailed above, a 621 base pair DNA fragment from the putative envelope region of HCV was generated by PCR. This fragment represents HCV amino acids 263-469 and was cloned into the CKS expression vector pJO202 using EcoRI restriction sites. The resulting clone, designated pHCV-101 (SEQ.ID.NO. 17), expresses the HCV CKS-3'ENV antigen under control of the lac promoter. The HCV CKS-3'ENV antigen consists of 239 amino acids of CKS, 7 amino acids contributed by linker DNA sequences, 207 amino acids from the HCV envelope region (amino acids 263-469), and 15 additional amino acids contributed by linker DNA sequences. Figure 39 presents a schematic representation of the pHCV-101 antigen. SEQ.ID.NO. 28 and 29 present the DNA and amino acid sequence of the HCV CKS-3'ENV recombinant antigen produced by pHCV-101. Figure 37 presents the expression of pHCV-101 proteins in E.coli. Lane 7 contained the E.coli lysate containing pHCV-101 expressing the HCV CKS-3'ENV antigen (amino acids 263-469) prior to induction and lanes 8 and 9 after 2 and 4 hours induction, respectively. These results show that the pHCV-101 fusion protein has an apparent mobility corresponding to a molecular mass of 47,000 daltons. This compares acceptably to the predicted molecular mass of 51,181 daltons.
E. Preparation of HCV CKS-NS2
Using the methods detailed above, a 636 base pair DNA fragment from the putative NS2 region of HCV was generated by PCR. This fragment represents the HCV amino acids 994-1205 and was cloned into the CKS expression vector pJO201 using EcoRI restriction sites. The resulting clone, designated pHCV-102, expresses the HCV CKS-NS2 antigen under control of the lac promoter. The HCV CKS-NS2 antigen consists of 239 amino acids of CKS, 7 amino acids contributed by linker DNA sequences, 212 amino acids from the HCV NS2 region (amino acids 994-1205), and 16 additional amino acids contributed by linker DNA sequences. Figure 40 presents a schematic representation of the pHCV-102 antigen. SEQ.ID.NO. 30 and 31 present the DNA and amino acid sequence of the HCV CKS-NS2 recombinant antigen produced by pHCV-102. Figure 41 presents the expression of pHCV-102 proteins in E.coli. Lane 1 contained the E.coli lysate containing pHCV-102 expressing the HCV CKS-NS2 antigen (amino acids 994-1205) prior to induction and lanes 2 and 3 after 2 and 4 hours induction, respectively. These results show that the pHCV-102 fusion protein has an apparent mobility corresponding to a molecular mass of 53,000 daltons. This compares acceptably to the predicted molecular mass of 51,213 daltons.
F. Preparation of HCV CKS-NS1
Using the methods detailed above, a 654 base pair DNA fragment from the putative NS1 region of HCV was generated by PCR. This fragment represents HCV amino acids 617-834 and was cloned into the CKS expression vector pJO200 using EcoRI-BamHI restriction sites. The resulting clone, designated pHCV-107, expresses the HCV CKS-NS1 antigen under control of the lac promoter. The HCV CKS-NS1 antigen consists of 239 amino acids of CKS, 10 amino acids contributed by linker DNA sequences, and 218 amino acids from the HCV NS1 region (amino acids 617-834). Figure 42 presents a schematic representation of the pHCV-107 antigen. SEQ.ID.NO. 32 and 33 present the DNA and amino acid sequence of the HCV CKS-NS1 recombinant antigen produced by pHCV-107.
G. Preparation of HCV CKS-ENV
Using the methods detailed above, a 1068 base pair DNA fragment from the putative envelope region of HCV was generated by PCR. This fragment represents HCV amino acids 114-469 and was cloned into the CKS expression vector pJO202 using EcoRI restriction sites. The resulting clone, designated pHCV-104, expresses the HCV CKS-ENV antigen under control of the lac promoter. The HCV CKS-ENV antigen consists of 239 amino acids of CKS, 7 amino acids contributed by linker DNA sequences, 356 amino acids from the HCV envelope region (amino acids 114-469), and 15 additional amino acids contributed by linker DNA sequences. Figure 43 presents a schematic representation of the pHCV-104 antigen. SEQ.ID.NO. 34 and 35 present the DNA and amino acid sequence of the HCV CKS-ENV recombinant antigen produced by pHCV-104.
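Each PCR fragment in this example is described both by its length in base pairs and by the HCV amino acids it represents, so the two can be compared at three base pairs per residue. The sketch below is an illustrative consistency check only; the dictionary and function names are hypothetical, and the values are those stated in parts B through G. A nonzero difference is reported but not interpreted, since the specification does not say whether a stated fragment length includes bases beyond the coding range.

```python
# plasmid -> (stated fragment length in bp, first HCV residue, last HCV residue)
PCR_FRAGMENTS = {
    "pHCV-105": (474, 1473, 1629),  # CKS-NS3
    "pHCV-103": (489, 114, 276),    # CKS-5'ENV
    "pHCV-101": (621, 263, 469),    # CKS-3'ENV
    "pHCV-102": (636, 994, 1205),   # CKS-NS2
    "pHCV-107": (654, 617, 834),    # CKS-NS1
    "pHCV-104": (1068, 114, 469),   # CKS-ENV
}


def codon_excess(fragment_bp: int, first: int, last: int) -> int:
    """Stated fragment length minus three base pairs per residue in the inclusive range."""
    residues = last - first + 1
    return fragment_bp - 3 * residues


for plasmid, (bp, first, last) in PCR_FRAGMENTS.items():
    residues = last - first + 1
    print(f"{plasmid}: {residues} residues, {bp} bp stated, difference {codon_excess(bp, first, last)} bp")
```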
EXAMPLE 10. HCV CKS-NS1S1
A. Construction of the HCV CKS-NS1S1 Expression Vector
Eight individual oligonucleotides representing amino acids 365-579 of the HCV genome were ligated together and cloned as a 645 base pair EcoRI/BamHI fragment into the CKS fusion vector pJO200. The amino acid sequence of this antigen is designated as pHCV-77 (SEQ. ID. NO. 1). The resultant fusion protein HCV CKS-NS1S1 consists of 239 amino acids of CKS, seven amino acids contributed by linker DNA sequences, and 215 amino acids from the NS1 region of the HCV genome.
B. Production and Characterization of the Recombinant Antigen HCV-NS1S1
pHCV-77 was transformed into E.coli K-12 strain XL-1 (recA1, endA1, gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells. Expression analysis and characterization of the recombinant protein was done using polyacrylamide gel electrophoresis as described in Example 1. The apparent molecular weight of the pHCV-77 antigen was the same as the expected molecular weight of 50,228 as visualized on a Coomassie stained gel. The immunoreactivity as determined by Western blot analysis using human sera indicated that this recombinant antigen was indeed immunoreactive. FIGURE 47A presents the expression of pHCV-77 in E. coli. FIGURE 47B presents an immunoblot of the pHCV-77 antigen expressed in E. coli. Lane 1 contained the E. coli lysate containing pHCV-77 expressing the HCV CKS-NS1S1 antigen prior to induction and Lanes 2 and 3 are 2 and 4 hours post-induction, respectively.
EXAMPLE 11. HCV CKS-NS1S2
A. Construction of the HCV CKS-NS1S2 Expression Vector
Six individual oligonucleotides representing amino acids 565-731 of the HCV genome were ligated together and cloned as a 501 base pair EcoRI/BamHI fragment into the CKS fusion vector pJO200. The complete amino acid sequence of this antigen is designated as pHCV-65 (SEQ. ID. NO. 2). The resultant fusion protein HCV CKS-NS1S2 consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, and 167 amino acids from the NS1 region of the HCV genome.
B. Production and Characterization of the Recombinant Antigen HCV-NS1S2
pHCV-65 was transformed into E.coli K-12 strain XL-1 (recA1, endA1, gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells. Expression analysis and characterization of the recombinant protein was done using polyacrylamide gel electrophoresis as described in Example 1. The apparent molecular weight of the pHCV-65 antigen was the same as the expected molecular weight of 46,223 as visualized on a Coomassie stained gel. The immunoreactivity as determined by Western blot analysis using human sera indicated that this recombinant antigen was indeed immunoreactive. FIGURE 48A presents the expression of pHCV-65 in E. coli. FIGURE 48B presents an immunoblot of the pHCV-65 antigen expressed in E. coli. Lane 1 contained the E. coli lysate containing pHCV-65 expressing the HCV CKS-NS1S2 antigen prior to induction and Lanes 2 and 3 are 2 and 4 hours post-induction, respectively.
EXAMPLE 12. CKS-NS1S3
A. Construction of the HCV CKS-NS1S3 Expression Vector
Six individual oligonucleotides representing amino acids 717-847 of the HCV genome were ligated together and cloned as a 393 base pair EcoRI/BamHI fragment into the CKS fusion vector pJO200. The complete amino acid sequence of this antigen is designated as pHCV-78 (SEQ. ID. NO. 3). The resultant fusion protein HCV CKS-NS1S3 consists of 239 amino acids of CKS, eight amino acids contributed by linker DNA sequences, and 131 amino acids from the NS1 region of the HCV genome.
B. Production and Characterization of the Recombinant Antigen HCV-NS1S3
pHCV-78 was transformed into E.coli K-12 strain XL-1 (recA1, endA1, gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells. Expression analysis and characterization of the recombinant protein was done using polyacrylamide gel electrophoresis as described in Example 1. Analysis of the Coomassie stained gel indicated very low levels of expression of the protein with an expected molecular weight of 42,1141. Western blot analysis also failed to show any immunoreactivity and we are continuing to identify human sera that are specific to this region of NS1.
EXAMPLE 13. CKS-NS1S1-NS1S2
A. Construction of the HCV CKS-NS1S1-NS1S2 Expression Vector
The construction of pHCV-80 (NS1S1-NS1S2) involved using the SacI/BamHI insert from pHCV-65 and ligating that into the SacI/BamHI vector backbone of pHCV-77. The resultant HCV gene represents amino acids 365-731 of the HCV genome. This resulted in a 1101 base pair EcoRI/BamHI fragment of HCV cloned into the CKS fusion vector pJO200. The complete amino acid sequence of this antigen is designated as pHCV-80 (SEQ. ID. NO. 4). The resultant fusion protein HCV CKS-NS1S1-NS1S2 consists of 239 amino acids of CKS, seven amino acids contributed by linker DNA sequences, and 367 amino acids from the NS1 region of the HCV genome.
B. Production and Characterization of the Recombinant Antigen HCV-NS1S1-NS1S2
pHCV-80 was transformed into E.coli K-12 strain XL-1 (recA1, endA1, gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells. Expression analysis and characterization of the recombinant protein was done using polyacrylamide gel electrophoresis as described in Example 1. The apparent molecular weight of the pHCV-80 antigen was the same as the expected molecular weight of 68,454 as visualized on a Coomassie stained gel. The immunoreactivity as determined by Western blot analysis using human sera indicated that this recombinant antigen was very immunoreactive. FIGURE 49A presents the expression of pHCV-80 in E. coli. FIGURE 49B presents an immunoblot of pHCV-80 antigen expressed in E. coli. Lane 1 contained the E. coli lysate containing pHCV-80 expressing the HCV CKS-NS1S1-NS1S2 antigen prior to induction and Lanes 2 and 3 are 2 and 4 hours post-induction, respectively.
EXAMPLE 14. HCV CKS-FULL LENGTH NS1
A. Construction of the HCV CKS-full length NS1 Expression Vector
The construction of pHCV-92 (SEQ. ID. NO. 5; full length NS1) involved using the XhoI/BamHI insert from pHCV-78 (SEQ. ID. NO. 3) and ligating that into the XhoI/BamHI vector backbone of pHCV-80 (SEQ. ID. NO. 4). The resultant HCV gene represents amino acids 365-847 of the HCV genome. This resulted in a 1449 base pair EcoRI/BamHI fragment of HCV cloned into CKS fusion vector pJO200. The complete amino acid sequence of this antigen is designated as pHCV-92 (SEQ. ID. NO. 5). The resultant fusion protein HCV CKS-full length NS1 consists of 239 amino acids of CKS, seven amino acids contributed by linker DNA sequences, and 483 amino acids from the NS1 region of the HCV genome.
B. Production and Characterization of the Recombinant Antigen pHCV-92
pHCV-92 was transformed into E.coli K-12 strain XL-1 (recA1, endA1, gyrA96, thi-1, hsdR17, supE44, relA1, lac/F', proAB, lacIqZΔM15, Tn10) cells. Expression analysis and characterization of the recombinant protein was done using polyacrylamide gel electrophoresis as described in Example 1. The expression levels as seen by Coomassie stained gel were virtually undetectable and the Western blot indicated no immunoreactivity. We are still in the process of identifying sera that will recognize this region of HCV NS1.
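The NS1 constructs of Examples 10 through 14 are built from three overlapping segments, and the combined ranges follow from merging them. The following is a minimal sketch of that arithmetic, assuming inclusive amino acid numbering and three base pairs per residue; the function names are hypothetical, and the ranges are those stated in the examples.

```python
def span_length(span: tuple[int, int]) -> int:
    """Number of residues in an inclusive amino acid range."""
    return span[1] - span[0] + 1


def merge(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Merge two inclusive ranges that overlap or abut; raise if they leave a gap."""
    if b[0] > a[1] + 1 or a[0] > b[1] + 1:
        raise ValueError("ranges do not overlap or abut")
    return (min(a[0], b[0]), max(a[1], b[1]))


NS1S1 = (365, 579)  # pHCV-77
NS1S2 = (565, 731)  # pHCV-65
NS1S3 = (717, 847)  # pHCV-78

ns1s1_ns1s2 = merge(NS1S1, NS1S2)      # pHCV-80
full_length = merge(ns1s1_ns1s2, NS1S3)  # pHCV-92

for name, span in [("pHCV-80", ns1s1_ns1s2), ("pHCV-92", full_length)]:
    residues = span_length(span)
    print(f"{name}: amino acids {span[0]}-{span[1]}, {residues} residues, {3 * residues} bp")
```

The merged ranges and lengths computed this way agree with the 367 and 483 amino acids and the 1101 and 1449 base pair fragments stated for pHCV-80 and pHCV-92.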
The present invention thus provides unique recombinant antigens representing distinct antigenic regions of the HCV genome which can be used as reagents for the detection and/or confirmation of antibodies and antigens in test samples from individuals exposed to HCV. The NS1 protein is considered to be a non-structural membrane glycoprotein and to be able to elicit a protective immune response of the host against lethal viral infection.
The recombinant antigens, either alone or in combination, can be used in the assay formats provided herein and exemplified in the Examples. It also is contemplated that these recombinant antigens can be used to develop specific inhibitors of viral replication and used for therapeutic purposes, such as for vaccines. Other applications and modifications of the use of these antigens and the specific embodiments of this invention, as set forth herein, will be apparent to those skilled in the art. Accordingly, the invention is intended to be limited only in accordance with the appended claims.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: DEVARE, S. DESAI, S. DAILEY, S.
(ii) TITLE OF INVENTION: HCV SYNTHETIC PEPTIDE FROM NS1 REGION
(iii) NUMBER OF SEQUENCES: 35
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: ABBOTT LABORATORIES
(B) STREET: ONE ABBOTT PARK ROAD
(C) CITY: ABBOTT PARK
(D) STATE: ILLINOIS
(E) COUNTRY: U.S.
(F) ZIP: 60065-3500
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: POREMBSKI, PRISCILLA E.
(B) REGISTRATION NUMBER: 33,207
(C) REFERENCE DOCKET NUMBER: 4834PC.02
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 708-937-6365
(B) TELEFAX: 708-937-9556
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 463 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
Met Ser Phe Val Val lie lie Pro Ala Arg Tyr Ala Ser Thr Arg Leu
1 5 10 15
Pro Gly Lys Pro Leu Val Asp He Asn Gly Lys Pro Met lie Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg lie He Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Thr Met Val Gly Asn Trp Ala Lys Val Leu 245 250 255
Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr His Val Thr 260 265 270
Gly Gly Ser Ala Gly His Thr Val Ser Gly Phe Val Ser Leu Leu Ala 275 280 285
Pro Gly Ala Lys Gln Asn Val Gln Leu Ile Asn Thr Asn Gly Ser Trp 290 295 300
His Leu Asn Ser Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn Thr Gly 305 310 315 320
Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser Ser Gly Cys 325 330 335
Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu Thr Asp Phe Asp Gln Gly 340 345 350
Trp Gly Gln Ile Ser Tyr Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro 355 360 365
Tyr Cys Trp His Tyr Pro Pro Lys Pro Cys Gly Ile Val Pro Ala Lys 370 375 380
Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val 385 390 395 400
Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Glu Asn 405 410 415
Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn 420 425 430
Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys 435 440 445
Gly Ala Pro Pro Cys Val Ile Gly Gly Ala Gly Asn Asn Thr Leu 450 455 460
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 414 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Met Gly Ala Pro Pro Cys Val Ile Gly Gly 245 250 255
Ala Gly Asn Asn Thr Leu His Cys Pro Thr Asp Cys Phe Arg Lys His 260 265 270
Pro Asp Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro 275 280 285
Arg Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp His Thr Pro Cys Thr 290 295 300
Ile Asn Thr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu 305 310 315 320
His Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp 325 330 335
Leu Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Thr Thr 340 345 350
Thr Gln Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu 355 360 365
Ser Thr Gly Leu Ile His Leu Gly Gln Asn Ile Val Asp Val Gln Tyr 370 375 380
Leu Tyr Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu 385 390 395 400
Tyr Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val 405 410
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 378 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
Met Ser Phe Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu Pro 1 5 10 15
Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His Val 20 25 30
Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala Thr 35 40 45
Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu Val 50 55 60
Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala Glu 65 70 75 80
Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn Val 85 90 95
Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val Ala 100 105 110
Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val Pro 115 120 125
Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val Val 130 135 140
Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile Pro 145 150 155 160
Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp Asn 165 170 175
Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile Arg 180 185 190
Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met Leu 195 200 205
Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala Val 210 215 220
Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu Asp 225 230 235 240
Pro Ser Thr Asn Ser Thr Met Glu Tyr Val Val Leu Leu Phe Leu Leu 245 250 255
Leu Ala Asp Ala Arg Val Cys Ser Cys Leu Trp Met Met Leu Leu Ile 260 265 270
Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu Val Ile Leu Asn Ala Ala 275 280 285
Ser Leu Ala Gly Thr His Gly Leu Val Ser Phe Leu Val Phe Phe Cys 290 295 300
Phe Ala Trp Tyr Leu Lys Gly Lys Trp Val Pro Gly Ala Val Tyr Thr 305 310 315 320
Phe Tyr Gly Met Trp Pro Leu Leu Leu Leu Leu Leu Ala Leu Pro Gln 325 330 335
Arg Ala Tyr Ala Leu Asp Thr Glu Val Ala Ala Ser Cys Gly Gly Val 340 345 350
Val Leu Val Gly Leu Met Ala Leu Thr Leu Ser Pro Tyr Tyr Lys Arg 355 360 365
Tyr Ile Ser Trp Cys Leu Trp Trp Leu Gln 370 375
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 622 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Thr Met Val Gly Asn Trp Ala Lys Val Leu 245 250 255
Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr His Val Thr 260 265 270
Gly Gly Ser Ala Gly His Thr Val Ser Gly Phe Val Ser Leu Leu Ala 275 280 285
Pro Gly Ala Lys Gln Asn Val Gln Leu Ile Asn Thr Asn Gly Ser Trp 290 295 300
His Leu Asn Ser Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn Thr Gly 305 310 315 320
Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser Ser Gly Cys 325 330 335
Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu Thr Asp Phe Asp Gln Gly 340 345 350
Trp Gly Gln Ile Ser Tyr Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro 355 360 365
Tyr Cys Trp His Tyr Pro Pro Lys Pro Cys Gly Ile Val Pro Ala Lys 370 375 380
Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val 385 390 395 400
Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Glu Asn 405 410 415
Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn 420 425 430
Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys 435 440 445
Gly Ala Pro Pro Cys Val Ile Gly Pro Pro Cys Val Ile Gly Gly Ala 450 455 460
Gly Asn Asn Thr Leu His Cys Pro Thr Asp Cys Phe Arg Lys His Pro 465 470 475 480
Asp Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg 485 490 495
Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile 500 505 510
Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His 515 520 525
Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu 530 535 540
Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Thr Thr Thr 545 550 555 560
Gln Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser 565 570 575
Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu 580 585 590
Tyr Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr 595 600 605
Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Xaa 610 615 620
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 738 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Thr Met Val Gly Asn Trp Ala Lys Val Leu 245 250 255
Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr His Val Thr 260 265 270
Gly Gly Ser Ala Gly His Thr Val Ser Gly Phe Val Ser Leu Leu Ala 275 280 285
Pro Gly Ala Lys Gln Asn Val Gln Leu Ile Asn Thr Asn Gly Ser Trp 290 295 300
His Leu Asn Ser Thr Ala Leu Asn Cys Asn Asp Ser Leu Asn Thr Gly 305 310 315 320
Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser Ser Gly Cys 325 330 335
Pro Glu Arg Leu Ala Ser Cys Arg Pro Leu Thr Asp Phe Asp Gln Gly 340 345 350
Trp Gly Gln Ile Ser Tyr Ala Asn Gly Ser Gly Pro Asp Gln Arg Pro 355 360 365
Tyr Cys Trp His Tyr Pro Pro Lys Pro Cys Gly Ile Val Pro Ala Lys 370 375 380
Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro Ser Pro Val Val Val 385 390 395 400
Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr Ser Trp Gly Glu Asn 405 410 415
Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg Pro Pro Leu Gly Asn 420 425 430
Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly Phe Thr Lys Val Cys 435 440 445
Gly Ala Pro Pro Cys Val Ile Gly Pro Pro Cys Val Ile Gly Gly Ala 450 455 460
Gly Asn Asn Thr Leu His Cys Pro Thr Asp Cys Phe Arg Lys His Pro 465 470 475 480
Asp Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg 485 490 495
Cys Leu Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile 500 505 510
Asn Tyr Thr Ile Phe Lys Ile Arg Met Tyr Val Gly Gly Val Glu His 515 520 525
Arg Leu Glu Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu 530 535 540
Glu Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Thr Thr Thr 545 550 555 560
Gln Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser 565 570 575
Thr Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu 580 585 590
Tyr Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr 595 600 605
Val Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ser Cys 610 615 620
Leu Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn 625 630 635 640
Leu Val Ile Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val 645 650 655
Ser Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Lys Trp 660 665 670
Val Pro Gly Ala Val Tyr Thr Phe Tyr Gly Met Trp Pro Leu Leu Leu 675 680 685
Leu Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Val 690 695 700
Ala Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr 705 710 715 720
Leu Ser Pro Tyr Tyr Lys Arg Tyr Ile Ser Trp Cys Leu Trp Trp Leu 725 730 735
Gln Xaa
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4481 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 130..1317
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
G- \TTMTTCCCATTMTGTGAGfπAC3CTCACTCATTAGGCACCCCAGGCT^^ 60
ATGπCXX3GCTCGTATTTTGTGTGGAATTGTGAGCGGATAACWVπ 120
GAGGTπAAATGAGTTTTGTGGTCATTAπCCCGCGCGCTACGCGTCG 168 Met Ser Phe Val Val He lie Pro Ala Arg Tyr Ala Ser 1 5 10
ACGCGTCTGCCCGGT/ CCAπGGTTGATATTAACGGCAAACCCATG 216 Thr Arg Leu Pro Gly Lys Pro Leu Val Asp He Asn Gly Lys Pro Met 15 20 25
ATT GTT CAT GTT CTT GAA CGC GCG CGT GAATCA GGT GCC GAG CGC ATC 264 He Val His Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg He 30 35 40 45
ATCGTGGCAACCGATCATGAGGATGTTGCCCGCGCCGTTGAAGCCGCT 312 He Val Ala Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala 50 55 60
GGCGGTGAAGTATGTATGACGCGCGCCGATCATCAGTCAGGAACAGAA 360 Gly Gly Glu Val Cys Met Thr Arg Ala Asp His Gin Ser Gly Thr Glu 65 70 75
CGTCTGGCGGAAGTTGTCGAAAAATGCGCATTCAGCGACGACACGGTG 408 Arg Leu Ala Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val 80 85 90
ATCGTTAATGTGCAGGGTGATGAACCGATGATCCCTGCGACAATCATT 456 He Val Asn Val Gin Gly Asp Glu Pro Met He Pro Ala Thr He lie 95 100 105
CGT CAG GTT GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT 504 Arg Gin Val Ala Asp Asn Leu Ala Gin Arg Gin Val Gly Met Ala Thr 110 115 120 125
CTGGCGGTGCCAATCCACAATGCGGAAGAAGCGTTTAACCCGAATGCG 552 Leu Ala Val Pro He His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala 130 135 140
GTG AM GTG GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC 600 Val Lys Val Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg 145 150 155
GCCACC ATTCCTTGGGATCGTGATCGTTTTGCAGAAGGCCTTGAAACC 648 Ala Thr lie Pro Tφ Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr 160 165 170
GTT GGC GAT AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA 696 Val Gly Asp Asn Phe Leu Arg His Leu Gly lie Tyr Gly Tyr Arg Ala 175 180 185
GGCTTTATCCGTCGTTACGTCAACTGGCAGCCAAGTCCGTTAGAACAC 744 Gly Phe He Arg Arg Tyr Val Asn Tφ Gin Pro Ser Pro Leu Glu His 190 195 200 205
ATC GAA ATG TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC 792 He Glu Met Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys He 210 215 220
CATGTTGCTGTTGCTCAGGAAGTTCCTGGCACAGGTGTGGATACCCCT 840 His Val Ala Val Ala Gin Glu Val Pro Gly Thr Gly Val Asp Thr Pro 225 230 235
GA GATCTCGAC CCGTCGACGAATTCCATGTCTACCAACCCGAAACCG 888 Glu Asp Leu Asp Pro Ser Thr Asn Ser Met Ser Thr Asn Pro Lys Pro 240 245 250
CAGAAAAAAAACAAACGTAACACCAACCGTCGTCCGCAGGACGTTAAA 936 Gin Lys Lys Asn Lys Arg Asn Thr Asn Arg Arg Pro Gin Asp Val Lys 255 260 265
TTC CCG GGT GGT GGT CAG ATC GTT GGT GGT GTT TAC CTG CTG CCG CGT 984 Phe Pro Gly Gly Gly Gin He Val Gly Gly Val Tyr Leu Leu Pro Arg 270 275 280 285
CGTGGTCCGCGTCTGGGTGTTCX3TGCTACGCGT .ACCTCTGAACGT 1032 Arg Gly Pro Arg Leu Gly Val Arg Aia Thr Arg Lys Thr Ser Glu Arg 290 295 300
TCTCAGCCGCGTGGGCGTCGTCAGCCGATCCCGAAAGCTCGTCGTCCG 1080 Ser Gin Pro Arg Gly Arg Arg Gin Pro lie Pro Lys Ala Arg Arg Pro 305 310 315
GAAGGTCGTACCTGGGCTCAGCCGGGTTACCCGTGGCCGCTGTACGGT 1128 Glu Gly Arg Thr Trp Ala Gin Pro Gly Tyr Pro Trp Pro Leu Tyr Gly 320 325 330
AACGAAGGTTGCGGTTGGGCTGGTTGGCTGCTGTCTCCGCGTGGATCT 1176 Asn Glu Gly Cys Gly Tφ Ala Gly Tφ Leu Leu Ser Pro Arg Gly Ser 335 340 345
CGTCCGTCTTGG GGT CCG ACC GAC CCG CGT CGT CGT TCT CGT AAC CTT 1224 Arg Pro Ser Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu 350 355 360 365
GGTAAAGTTATCGATACC CTGACCTGCGGTTTCGCTGACCTGATGGGT 1272 Gly Lys Val lie Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly
370 375 380
TACATACCGCTGGTTGGAGCTCCG CTGGGTGGTGCTGCTCGTGCT 1317 Tyr He Pro Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala 385 390 395
TM∞CATGGATCCTCTAGACTGCAGGCATGC^^ 1377
GCTGAMTGCGCTMmCACTCACGACACTCAG^ 1437
CGTTACGATT TTCCTC AATT TTTCTTTTCA ACAATTGATC TCATTCAGGT GAC ATCTTTT 1497
ATATTGGCGCTCATTATGMAGCAGTAGCTTTTATGAGGGTMTCTGMTGGAAC^ 1557
∞TGCCGMTTMGCCATTTACTGGGCGAAAMCTCAGTCGTATTGAGTGC^TCMTGM 1617
AAAG(-OGV\TACGK_CGπGTGGGCTTTGTATGACAGCCAGG 1677
GC GAAGCTTAGCCCGCCTAATGΛGCGGGUI 1 1 1 1 1 1 ICGACGCGAGGCTGGATGGCCT 1737
TGT<XAGGCAGGTAGATGACGA<X TCAGGG^ 1857
< AGCCTMCTTCGATCV\CTGGACC_^^ 1917
GOVCATGGMCGGGπGGCATGGAπGTAGGC^^ 1977
CGπGCGTCGCGGTGCATGGAGCCGGGC^ 2037
CGCTMCGGAπCACCACTCCMGMπGGAGCX_^^ 2097
TGCGCAAACCAACCCTTGGCAGMCATATC^ 2157
GCGGCGCATCTCGGGCAGCGπGGGTCCTGGCCApCGGCT 2217
GΠTGAGG/VXCGGCTAGGCTGGOJGKΞGTO 2277
ACGCGAGCGAACGTGAAGCX3/-CrGCTGCTG 2337
MTGGTCTTCGGTTTC03TGTTTCGTAMGTCTGGAAACX3CGGM 2397
CAπATGTTCCX3GATCTGCATCX3CAGGATGCTGCTGGCTACCCTGTGGAA 2457
TGTAπMCGMGCGCTTCTTCCGCπCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTC 2517 GGCTGCGGCX3AGCGGTATCAGCT<_ACTCM 2577
GGGATAACGCAGGAMGAACATGTGAG(_MAAGGCC \GC^/V^ 2637
GACGCTCAAGTCAGAGGTGGCGAA _-C03ACAGGA 2757
CTGGMGCTCCCTCGTGCGCTCTCCTGπCOBACXX GCCGCTTACCGGΛTACCT 2817
CCTTTCTCCC TTCGGGAAGC GTGGCGCTTT CTCMTGCTC ACGCTGTAGG TATCTCAGTT .2877
GCTGCGCXϊπATCCX_GTMCTATCX3TCπGAGTCCAACCCGGTMGAC^ 2997
CACTGGCAGCAGCCACTGGTAACAG^ 3057
AGTTCTTGMGTGGTGGCCTMCTA BGCTACACTAGMGGACAGTATπGGTATCT 3117
CTCTGCTGMGCCAGTTACCTTCGGA,A,AMGr\GπTGGTAGCTCπGAT∞ 3177
CCACCGCTGGTAGCGGTGGT 1 1 1 1 1 IGI 1 1 GCMGCAGCAGATT.ACGCGCAGAAAAAAAG 3237 GATCTCMGAAGATC-ClTrGATCTmCTACG^ 3297
CACGTTAAGG GATTTTGGTC ATGAGATTAT CAAAAAGGAT CTTCACCTAG ATCCTTTTAA 3357 ATTAAAAATG MGTTTT W. TCMTCTAAA GTATATATGA GTAAACTTGG TCTGACAGTT 3417 ACCMTGCTTMTCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAG 3477 TTGCCTGACTOXCGTCXaTGTAGAT^^ 3537
GTGCTGCMTGATACCGCGAGACX>CACGCTCACC^^ 3597
AGCCAGCCGGMGGGCCG GCG(_AG GTGGTCCTGC CT^^ 3657
CTATTMπGTTGCCGGGMGCTAGAGTMGTAGTTCGCCAGTTMTAGTTTGCGCAACG 3717 TTGTTTGCCATTGCTACAGGC ATCGTGGTGT CACGCTCGTC GTTTGGTATG GCTTCATTCA 3777 GCTOC _TTCC(_MCGATCAAGGCGAGTTACATGATCC C^ 3837
TTAGCTCCTTCGGTCCTCCGATCGTTGTCAGW\GTMGTTGGCCGCAGTGT 3897
TGK3TTATGGCAGCACTGCATMTTCTCπACTGTCATGCCATCXX3TMGATGCTTTTCTG 3957 TGACTGGTGA GTACTCAACC AAGTCATTCT GAGAATAGTG TATGCGGCGACCGAGTTGCT 4017 CTTGCCCGGC TCW\CA(_GGGATMTA^ 4077
TCATTGGA ACOπCTTCGGGGCGAA CTCTCMGGATCTTACCGCTGTTGAGATC^ 4137 GTTCGATGTAACCC_ACT(-OTGCAαXMCTGATCTTCAGCATCTTTTACTT^^ 4197
TTTCTGGGTGAGCAAAMCAGGMGGCPAMATG -COCAMAAA 4257
GGAMTGTTGMTACTCATACTCTTCCTTTTTCAATATTAπGAAGCATTTATCAGGGTT 4317 ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA ATAGGGGTTC 4377 CGCGCACAπTCCCCGAAAAGTGCCACCTG ACGTCTAAGAAACCATTAπ ATCATGACAT 4437
TAACCTATAA AAATAGGCGT ATCACGAGGC CCTTTCGTCT TCM 4481
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 396 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Met Ser Thr Asn Pro Lys Pro Gln Lys Lys 245 250 255
Asn Lys Arg Asn Thr Asn Arg Arg Pro Gln Asp Val Lys Phe Pro Gly 260 265 270
Gly Gly Gln Ile Val Gly Gly Val Tyr Leu Leu Pro Arg Arg Gly Pro 275 280 285
Arg Leu Gly Val Arg Ala Thr Arg Lys Thr Ser Glu Arg Ser Gln Pro 290 295 300
Arg Gly Arg Arg Gln Pro Ile Pro Lys Ala Arg Arg Pro Glu Gly Arg 305 310 315 320
Thr Trp Ala Gln Pro Gly Tyr Pro Trp Pro Leu Tyr Gly Asn Glu Gly 325 330 335
Cys Gly Trp Ala Gly Trp Leu Leu Ser Pro Arg Gly Ser Arg Pro Ser 340 345 350
Trp Gly Pro Thr Asp Pro Arg Arg Arg Ser Arg Asn Leu Gly Lys Val 355 360 365
Ile Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro 370 375 380
Leu Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala 385 390 395
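The CDS feature given for SEQ ID NO:6 (LOCATION: 130..1317) can be cross-checked arithmetically against the 396-residue length stated for SEQ ID NO:7. The following Python sketch is offered only as an illustrative consistency check. It assumes that the coordinates are 1-based and inclusive and that the stated range contains only sense codons (no stop codon). The function name `cds_codon_count` is hypothetical and does not come from the listing.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of codons in a CDS spanning start..end.

    Assumption: coordinates are 1-based and inclusive, as in the
    LOCATION fields of the listing, and the span holds whole codons.
    """
    length = end - start + 1  # inclusive span in nucleotides
    if length <= 0:
        raise ValueError("end must not precede start")
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a whole number of codons")
    return length // 3


# (start, end, stated protein length) taken from the listing
records = {
    "SEQ ID NO:6 / SEQ ID NO:7": (130, 1317, 396),
    "SEQ ID NO:8 / SEQ ID NO:9": (130, 2472, 781),
}

for name, (start, end, stated) in records.items():
    codons = cds_codon_count(start, end)
    status = "consistent" if codons == stated else "MISMATCH"
    print(f"{name}: {codons} codons vs {stated} residues stated ({status})")
```

Under these assumptions, the 1188-nucleotide span of SEQ ID NO:6 corresponds to 396 codons, and the 2343-nucleotide span of SEQ ID NO:8 (LOCATION: 130..2472) corresponds to 781 codons, matching the lengths stated for SEQ ID NO:7 and SEQ ID NO:9 respectively.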
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 5600 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 130..2472
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GAATTAATTC CCATTAATGT GAGTTAGCTC ACTCATTAGG CACCCCAGGC TTTACACTTT 60
ATCflTCCCaGCTCGrTATTTTC 120
GAGGTTTAAATGAGTTTTGTGGTCATTATTCCCGCGCGCTACGCGTCG 168 Met Ser Phe Val Val He He Pro Ala Arg Tyr Ala Ser 1 5 10
ACGCX3TCTGCO C_GT .CCATTGGTTGATATTAACGGCAAACCCATG 216 Thr Arg Leu Pro Gly Lys Pro Leu Val Asp He Asn Gly Lys Pro Met 15 20 25
AπGπCATGπCTTGAACGCGCGCGTGMTCAGGTGCCGAGCGCATC 264 He Val His Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg He 30 35 40 45
ATCGTGGCAACCGATCATGAGGATGTTGCCCGCGCCGTTGMGCCGCT 312 He Val Ala Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala 50 55 60
GGCGGTGAAGTATGTATGACGCGCGCCGATCATCAGTCAGGAACAGAA 360 Gly Gly Glu Val Cys Met Thr Arg Ala Asp His Gin Ser Gly Thr Glu 65 70 75
CGTCTGGCGGAAGTTGTCGAAAAATGCGCATTCAGCGACGACACGGTG 408 Arg Leu Ala Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val 80 85 90
ATCGTTAATGTGCAGGGTGATGAACCGATGATCCCTGCGACAATCATT 456 He Val Asn Val Gin Gly Asp Glu Pro Met He Pro Ala Thr lie He 95 100 105
CGTCAG'GTTGCTGATAACCTCGCTCAGCGTCAGGTGGGTATGGCGACT 504 Arg Gin Val Ala Asp Asn Leu Ala Gin Arg Gin Val Gly Met Ala Thr 110 115 120 125
CTGGCGGTGOΪAATCCACMTGCGGMGMG∞TTTAACCCGAATGCG 552 Leu Ala Val Pro He His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala 130 135 140
GTG AAA GTG GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC 600 Val Lys Val Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg 145 150 155
GCCACCATTCCTTGGGATCGTGATCGTTTTGCAGAAGGCCTTGAAACC 648 Ala Thr He Pro Tφ Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr 160 165 170
GTT GGC GAT AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA 696 Val Gly Asp Asn Phe Leu Arg His Leu Gly He Tyr Gly Tyr Arg Ala 175 180 185
GGC TTT ATC CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC 744 Gly Phe He Arg Arg Tyr Val Asn Trp Gin Pro Ser Pro Leu Glu His 190 195 200 205
ATC GAA ATG TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC 792 He Glu Met Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys He 210 215 220
CAT GTT GCT GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT 840 His Val Ala Val Ala Gin Glu Val Pro Gly Thr Gly Val Asp Thr Pro 225 230 235
GMGATCTCGACCCGTC GACX_MTTCCATGGCTGTTGACTTTATCCCG 888 Glu Asp Leu Asp Pro Ser Thr Asn Ser Met Ala Val Asp Phe He Pro 240 245 250
GTT GAA AAT CTC GAG ACT ACT ATG CGT TCT CCG GTT TTC ACT GAC AAC 936 Val Glu Asn Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn 255 260 265
TCT TCT CCG CCG GTT GTT CCG CAG TCT TTC CAG GTT GCT CAC CTG CAT 984 Ser Ser Pro Pro Val Val Pro Gin Ser Phe Gin Val Ala His Leu His 270 275 280 285
GCT CCG ACT GGT TCT GGT AAA TCT ACT AAA GTT CCA GCT GCT TAC GCT 1032 Ala Pro Thr Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala 290 295 300
GCT CAG GGTTAC AAA GTT CTG GTT CTG MC CCG TCT GTT GCT GCT ACT 1080 Ala Gin Gly Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr 305 310 315
CTGGGTTTCGGCGCCTACATGTCTAMGCTCACGGTATCGACCCGMC 1128 Leu Gly Phe Gly Ala Tyr Met Ser Lys Ala His Gly He Asp Pro Asn 320 325 330
ATT CGT ACT GGT GTA CGT ACT ATC ACT ACT GGT TCT CCG ATC ACT TAC 1176 He Arg Thr Gly Val Arg Thr lie Thr Thr Gly Ser Pro He Thr Tyr 335 340 345
TCT ACTTAC GGT AM TTC CTG GCT GAC GGT GGTTGC TCT GGT GGT GCT 1224 Ser Thr Tyr Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala 350 355 360 365
TAC GAT ATC ATC ATC TGC GAC GM TGC CAC TCT ACT GAC GCT ACT TCT 1272 Tyr Asp lie He lie Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser 370 375 380
ATC CTG GGT ATC GGT ACC GTT CTG GAC CAG GCT GM ACT GCA GGT GCT 1320 He Leu Gly He Gly Thr Val Leu Asp Gin Ala Glu Thr Ala Gly Ala 385 390 395
CGT CTG GTT GTT CTG GCT ACT GCT ACT CCG CCG GGTTCT GTT ACT GTT 1368 Arg Leu Val Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val 400 405 410
CCG CAC CCG MC ATC GMGM GTTGCT CTG TCG ACT ACT GGT GM ATC 1416 Pro His Pro Asn He Glu Glu Val Ala Leu Ser Thr Thr Gly Glu He 415 420 425
CCG πc TAC GGT AM GCT ATC CCG CTC GAG GTT ATC AM GGT GGT CGT 1464 Pro Phe Tyr Gly Lys Ala He Pro Leu Glu Val He Lys Gly Gly Arg
430 435 440 445
CAC CTG ATTTTC TGC CAC TCT AMMA MA TGC GAC GM CTG GCT GCT 1512 His Leu He Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala 450 455 460
MG CTT GTT GCT CTG GGT ATC MC GCT GTT GCTTACTAC CGT GGT CTG 1560 Lys Leu Val Ala Leu Gly He Asn Ala Val Aia Tyr Tyr Arg Gly Leu 465 470 475
GACGTTTCTGTTATCCCGACTTCTGGTGACGTTGTTGTTGTGGCCACT 1608 Asp Val Ser Val He Pro Thr Ser Gly Asp Val Val Val Val Ala Thr 480 485 490
GAC GCT CTG ATG ACT GGTTAC ACT GGT GAC TTC GAC TCT GTT ATC GAT 1656 Asp Ala Leu Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val He Asp 495 500 505
TGC MC ACT TGC MTTCG TCG ACC GGTTGC GTT GTT ATC GTT GGT CGT 1704 Cys Asn Thr Cys Asn Ser Ser Thr Gly Cys Val Val lie Val Gly Arg 510 515 520 525
GπGTTCTGTCTGGTAMCCGGCCATTATCCCGGACCGTGMGTTCTG 1752 Val Val Leu Ser Gly Lys Pro Ala lie He Pro Asp Arg Glu Val Leu 530 535 540
TACCGTGAGTTCGACGMATGGMGMTGCTCTCAGCACCTGCCGTAC 1800 Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser Gin His Leu Pro Tyr 545 550 555
ATCGMCAGGGTATGATGCTGGCTGMCAGTTCAMCAGAMGCTCTG 1848 He Glu Gin Gly Met Met Leu Ala Glu Gin Phe Lys Gin Lys Ala Leu 560 565 570
GGTCTGCTGCAGACCGCTTCTCGTCAGGCTGMGTTATCGCTCCGGCT 1896 Gly Leu Leu Gin Thr Ala Ser Arg Gin Ala Glu Val He Ala Pro Ala 575 580 585
GTT CAG ACC MC TGG CAG AM CTC GAG ACC TTC TGG GCT AM CAC ATG 1944 Val Gin Thr Asn Trp Gin Lys Leu Glu Thr Phe Tφ Ala Lys His Met 590 595 600 605
TGGMCTTCATCTCTGGTATCCAGTACCTG GCT GGT CTG TCT ACC CTG 1992 Tφ Asn Phe He Ser Gly He Gin Tyr Leu Ala Gly Leu Ser Thr Leu 610 615 620
CCGGGTMCCCGGCTATCGCAAGCTTGATGGCTTTCACCGCTGCTGTT 2040 Pro Gly Asn Pro Ala He Ala Ser Leu Met Ala Phe Thr Ala Ala Val 625 630 635
ACC TCT CCG CTG ACC ACC TCT CAG ACC CTG CTG TTC MC ATT CTG GGT 2088 Thr Ser Pro Leu Thr Thr Ser Gin Thr Leu Leu Phe Asn He Leu Gly 640 645 650
GGTTGG GTT GCT GCT CAG CTG GCT GCT CCG GGT GCT GCT ACC GCTTT C 2136
Gly Tφ Val Ala Aia Gin Leu Aia Ala Pro Gly Ala Ala Thr Ala Phe 655 660 665
GTTGGTGCTGGTCTG GCTGGTGCTGCTATC GGTTCTGTAGGC CTG GGT 2184 Val Gly Ala Gly Leu Ala Gly Ala Ala lie Gly Ser Val Gly Leu Gly 670 675 680 685
AM GTT CTG ATC GAC ATT CTG GCT GGT TAC GGT GCT GGT GTT GCT GGA 2232 Lys Val Leu He Asp He Leu Aia Gly Tyr Gly Ala Gly Val Ala Gly 690 695 700
GCT CTG GTT GCT TTC AM ATC ATG TCT GGT GM GTT CCG TCT ACC GM 2280 Ala Leu Val Ala Phe Lys He Met Ser Gly Glu Val Pro Ser Thr Glu 705 710 715
GAT CTG GTT MC CTG CTG CCG GCT ATC CTG TCT CCG GGT GCT CTG GTT 2328 Asp Leu Val Asn Leu Leu Pro Ala lie Leu Ser Pro Gly Ala Leu Val 720 725 730
GTTGGTGTTGTTTGCGCTGCTATCCTG CGTCGTCACGTTGGCCCGGGT 2376 Val Gly Val Val Cys Ala Ala lie Leu Arg Arg His Val Gly Pro Gly 735 740 745
GM GGT GCT GTT CAG TGG ATG MC CGT CTG ATC GCT TTC GCT TCT CGT 2424 Glu Gly Ala Val Gin Trp Met Asn Arg Leu lie Ala Phe Ala Ser Arg 750 755 760 765
GGTMCCACGTTTCTCCATGG GAT CCT CTA GAC TGC AGGCATGCT G 2472 Gly Asn His Val Ser Pro Tφ Asp Pro Leu Asp Cys Arg His Ala Lys 770 775 780
TMGTAGATC TTGAGCGCGTTCGCGCTGM ATGCGCTMTTTCACTTCAC GACACTTCAG 2532 CCMTTTTGG GAGGAGTGTC GTACCGTTAC GATTTTCCTC MTTTTTCTTTTCMCMTT 2592 GATCTCATTCAGGTGACATCTTTTATATTG GCGCTCATTATGAMGCAGTAGCTTTTATG 2652 AGGGTMTCTGMTGGMCAGCTGCGTGCCGMTTMGCC ATTTACTGGGCG AAMCTC 2712 AGTCGTATTGAGTGCXSTCMTGMAMGCG GATACG^ 2772
CAGGGAMCCCMTG_XGΠMTGC3CMGAAGCT^^ 2832
TTTCGACGCGAGGCTGGATG GCCTTCCCCATTATGATTCTTCTC^ 2892
GGATGCCCGCGflTGfDAGGCCATGCrGT^ 2952
TTCAAGGATCG-CTOaCXi^ 3012
CGGCGATTTATGCCGXXTOGGOi GCACATGG^ 3072
CCCTATACCTTGTCTGCCTCCCCGCGπGCGnOGCGGTGCATG^ 3132
CCTGMTGGAAGCCGGCGGCACCTCG_-TM 3192
TCMTTCTTGCGGAGMCTGTGMTGCGCAMCX_MCXXΪTTGGCAGMCATATCC^^ 3252
GGTGCGCATGATCGTGCTCCTGTOC^ 3372
TGGπTAGCAGMTGMTCACCGATACGCGAGCGAACGTGAAG∞A^ 3432
GTCTGCGACCTGAGCAACW.CATGMTG^ 3492
AACGα_3MGTCAGCX_CCCTGCVVCX_ T^^^ 3552
GCTACCCTGTGGMCACCTACATCTGTATTAACGAtt 3612
CTGACT(DGCTG(_GCTCGGTCGrrrCGGCTGCGGCGΛ _^^ 3672
TMTACGGπATCCACAGMTCAGGGGATAACXSC^^ 3732
AGCAΛMGGCCAGGMCCGTAAAMGGCX_G(-Gπ'GCTGGCGI 1 1 1 ICCATAGGCTCCGCC 3792
TGO>GCrrTACCGGATACCTGTCX-X3C4CTTTCTC^ 3972
GCTCACGCTGTAGGTATCTCAGTTα_GTGrTAGGrrCGTTCGCT 4032
ACGAACCCCCCGπ<DAGCCCGACCGCTG∞ 4092
ACCCGGTAAGACA(_GACTTATOGCCACTGGCAGCAGC^ 4152
CGAGGTATGTAGGCGGTGCTACAGAGTTCTTGMGTGGTGGCXn"MCT^ 4212
GMGGACAGTATTTGGTATCTGCGCTCTGCTGMGCCV.GTTACCTTCGGAAAMGAGTTG 4272 GTAGCTCTTGATCCGGCAMCMACX ΛCCGCTGGTAGCGGTGGI 1 1 1 1 1 1 GTTTGCMGC 4332 AGCAGAπACGCGCAGAAMMAGGATCTCMGMGAT∞TπGATCTπTCTACpOGGGT 4392 CTGA∞CTCAGTGGMCGMMCTCACGTTMGGGATm 4452
GGATCTTCAC CTAGATCCTT TTAMTTAM MTGMGTTT TMATCMTC TAMGTATAT 4512 ATGAGTAMC TTGGTCTGAC AGTTACCMT GCTTMTCAG TGAGGCACCT ATCTCAGCGA 4572 TCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATAC 4632 GGGAGGGCTT.ACCATCTGGC(XX^GTGCTGCM^ 4692
CTCCAGΛTπAT(-V* _CAATAMC(_A^ 4752
CMCTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGMGCTAGAGTMGTAGTT 4812 CGCCAGTTMTAGTTTGCGCMCGTTGTTG CCATTGCTAC AGGCATCGTG GTGTCACGCT 4872
CGTCGmGGTATGGCrπCAπCAGCTCCGG 4932
CCCCCATGπCTGCAMAMGCGGπAGCTCCTTC^ 4992
AGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATMTTCTCTTACTGTCA 5052 TGCCAT(XGTMGATGCTTTTCTGTGACTGGTGAGTACTCMCCMGT(-V TTCTGAGMT 5112 AG GTATGCGGCGACCGAGTTGCTCT^^ 5172
ATAGCAGAACTTTAAMGTGCTCATCATTGGAMACGπCTTC)GGGGCGAAMCTCT^ 5232 GGATCTTACCGCTGπGAGATCCAGπCGATGTMCCCACTCX_TGCACCCAACT_^^ 5292 CAGCATCTTTTACTTTCACC AGCGTTTCTGGGTGAGCAMMCAGGMGGCAAMTGCCG 5352 CMMMGGG MTMGGGCG ACACGGAMT GTTGMTACT CATACTCTTC CTTTTTCMT 5412 ATTATTGMG C ATTTATC AG GGTTATTGTC TC ATGAGCGG ATAC ATATTT GMTGTATTT 5472 AGAΛAMTMACVVλATAGGGGTTCCGCGCACATπCCCCGAAMGTGCCACCTG^ 5532
MGAMCCATTATTATCATG ACATTMCCT ATAAAMTAG GCGTATCACG AGGCCCTTTC 5592 GTCTTCAA 5600
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 781 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Met Ala Val Asp Phe Ile Pro Val Glu Asn 245 250 255
Leu Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro 260 265 270
Pro Val Val Pro Gin Ser Phe Gin Val Ala His Leu His Ala Pro Thr 275 280 285
Gly Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gin Gly 290 295 300
Tyr Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe 305 310 315 320
Gly Ala Tyr Met Ser Lys Ala His Gly He Asp Pro Asn He Arg Thr 325 330 335
Gly Val Arg Thr lie Thr Thr Gly Ser Pro He Thr Tyr Ser Thr Tyr 340 345 350
Gly Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp He 355 360 365
He He Cys Asp Glu Cys His Ser Thr Asp Ala Thr Ser He Leu Gly 370 375 380
He Gly Thr Val Leu Asp Gin Ala Glu Thr Ala Gly Ala Arg Leu Val 385 390 395 400
Val Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro 405 410 415
Asn He Glu Glu Val Ala Leu Ser Thr Thr Gly Glu He Pro Phe Tyr 420 425 430
Gly Lys Ala lie Pro Leu Glu Val lie Lys Gly Gly Arg His Leu lie 435 440 445
Phe Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Val 450 455 460
Ala Leu Gly lie Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser 465 470 475 480
Val lie Pro Thr Ser Gly Asp Val Val Val Val Aia Thr Asp Ala Leu 485 490 495
Met Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val He Asp Cys Asn Thr 500 505 510
Cys Asn Ser Ser Thr Giy Cys Val Val lie Val Gly Arg Val Val Leu 515 520 525
Ser Gly Lys Pro Ala lie lie Pro Asp Arg Glu Val Leu Tyr Arg Glu 530 535 540
Phe Asp Glu Met Glu Glu Cys Ser Gin His Leu Pro Tyr lie Glu Gin 545 550 555 560
Gly Met Met Leu Ala Glu Gin Phe Lys Gin Lys Ala Leu Gly Leu Leu 565 570 575
Gin Thr Ala Ser Arg Gin Ala Glu Val lie Aia Pro Ala Val Gin Thr 580 585 590
Asn Trp Gin Lys Leu Glu Thr Phe Tφ Ala Lys His Met Tφ Asn Phe 595 600 605
He Ser Gly lie Gin Tyr Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn 610 615 620
Pro Ala lie Ala Ser Leu Met Ala Phe Thr Ala Ala Val Thr Ser Pro 625 630 635 640
Leu Thr Thr Ser Gin Thr Leu Leu Phe Asn lie Leu Gly Giy Tφ Val 645 650 655
Ala Ala Gin Leu Ala Ala Pro Gly Ala Ala Thr Ala Phe Val Gly Ala 660 665 670
Gly Leu Ala Gly Ala Ala He Gly Ser Val Gly Leu Gly Lys Val Leu 675 680 685
He Asp He Leu Ala Gly Tyr Gly Ala Gly Val Ala Gly Ala Leu Val 690 695 700
Ala Phe Lys He Met Ser Gly Glu Val Pro Ser Thr Glu Asp Leu Val 705 710 715 720
Asn Leu Leu Pro Ala lie Leu Ser Pro Gly Ala Leu Val Val Gly Val 725 730 735
Val Cys Ala Ala He Leu Arg Arg His Val Gly Pro Gly Glu Gly Ala 740 745 750
Val Gin Trp Met Asn Arg Leu He Ala Phe Ala Ser Arg Gly Asn His 755 760 765
Val Ser Pro Trp Asp Pro Leu Asp Cys Arg His Ala Lys 770 775 780
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1548 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1548
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACG AAT TCC CCA TGG ACC CAC TAC GTT CCG GAA TCT GAC 768 Asp Pro Ser Thr Asn Ser Pro Trp Thr His Tyr Val Pro Glu Ser Asp 245 250 255
GCT GCT GCT CGA GTT ACC GCT ATC CTG TCT TCT CTG ACC GTT ACC CAG 816 Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln 260 265 270
CTT CTG CGT CGT CTG CAC CAG TGG ATC TCT TCT GAA TGC ACC ACC CCG 864 Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro 275 280 285
TGC TCT GGT TCT TGG CTG CGT GAC ATC TGG GAC TGG ATC TGC GAA GTT 912 Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val 290 295 300
CTGTCTGACTTCAMACCTGGCTG W.GCT>W\CTGATGCCGCAGCTG 960 Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu 305 310 315 320
CCG GGT ATC CCG TTC GTT TCT TGC CAG CGT GGT TAC AAA GGT GTT TGG 1008 Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp 325 330 335
CGT GTT GAC GGT ATC ATG CAC ACC CGT TGC CAC TGC GGT GCT GAA ATC 1056 Arg Val Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile 340 345 350
ACC GGT CAC GTT AAA AAC GGT ACC ATG CGT ATC GTT GGT CCG CGT ACC 1104 Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr 355 360 365
TGC CGT AAC ATG TGG TCT GGC ACC TTC CCG ATC AAC GCT TAC ACC ACC 1152 Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr 370 375 380
GGT CCG TGC ACC CCG CTG CCG GCT CCG AAC TAC ACC TTC GCT CTG TGG 1200 Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp 385 390 395 400
CGT GTT TCT GCT GAA GAA TAC GTT GAA ATC CGT CAG GTT GGT GAC TTC 1248 Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe 405 410 415
CAC TAC GTT ACC GGT ATG ACC ACC GAC AAC CTG AAA TGC CCG TGC CAG 1296 His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln 420 425 430
GTT CCG TCT CCG GAG TTC TTC ACC GAA CTG GAC GGT GTT CGT CTG CAC 1344 Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His 435 440 445
CGT TTC GCT CCG CCG TGC AAA CCG CTG CTG CGT GAA GAA GTT TCT TTC 1392 Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe 450 455 460
CGT GTT GGT CTG CAC GAA TAC CCG GTT GGT TCT CAG CTG CCG TGC GAA 1440 Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu 465 470 475 480
CCG GAA CCG GAC GTT GCT GTT CTG ACC TCT ATG CTG ACC GAC CCG TCT 1488 Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser 485 490 495
CAC ATC ACC GCT GAA GCT GCT GGT CGT CGA CTG GAT CCT CTA GAC TGC 1536 His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Asp Pro Leu Asp Cys 500 505 510
AGG CAT GCT AAG 1548 Arg His Ala Lys 515
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 516 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Pro Trp Thr His Tyr Val Pro Glu Ser Asp 245 250 255
Ala Ala Ala Arg Val Thr Ala Ile Leu Ser Ser Leu Thr Val Thr Gln 260 265 270
Leu Leu Arg Arg Leu His Gln Trp Ile Ser Ser Glu Cys Thr Thr Pro 275 280 285
Cys Ser Gly Ser Trp Leu Arg Asp Ile Trp Asp Trp Ile Cys Glu Val 290 295 300
Leu Ser Asp Phe Lys Thr Trp Leu Lys Ala Lys Leu Met Pro Gln Leu 305 310 315 320
Pro Gly Ile Pro Phe Val Ser Cys Gln Arg Gly Tyr Lys Gly Val Trp 325 330 335
Arg Val Asp Gly Ile Met His Thr Arg Cys His Cys Gly Ala Glu Ile 340 345 350
Thr Gly His Val Lys Asn Gly Thr Met Arg Ile Val Gly Pro Arg Thr 355 360 365
Cys Arg Asn Met Trp Ser Gly Thr Phe Pro Ile Asn Ala Tyr Thr Thr 370 375 380
Gly Pro Cys Thr Pro Leu Pro Ala Pro Asn Tyr Thr Phe Ala Leu Trp 385 390 395 400
Arg Val Ser Ala Glu Glu Tyr Val Glu Ile Arg Gln Val Gly Asp Phe 405 410 415
His Tyr Val Thr Gly Met Thr Thr Asp Asn Leu Lys Cys Pro Cys Gln 420 425 430
Val Pro Ser Pro Glu Phe Phe Thr Glu Leu Asp Gly Val Arg Leu His 435 440 445
Arg Phe Ala Pro Pro Cys Lys Pro Leu Leu Arg Glu Glu Val Ser Phe 450 455 460
Arg Val Gly Leu His Glu Tyr Pro Val Gly Ser Gln Leu Pro Cys Glu 465 470 475 480
Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro Ser 485 490 495
His Ile Thr Ala Glu Ala Ala Gly Arg Arg Leu Asp Pro Leu Asp Cys 500 505 510
Arg His Ala Lys 515
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1623 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1623
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACG AAT TCT ATG CGT CGA CTG GCT CGT GGT TCT CCG CCG 768 Asp Pro Ser Thr Asn Ser Met Arg Arg Leu Ala Arg Gly Ser Pro Pro 245 250 255
TCT GTT GCT TCT TCT TCT GCT TCT CAA CTG TCT GCT CCG TCT CTG AAA 816 Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys 260 265 270
GCT ACC TGC ACC GCT AAC CAC GAC TCT CCG GAC GCT GAA CTG ATC GAA 864 Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu 275 280 285
GCT AAC CTG CTG TGG CGT CAG GAA ATG GGT GGT AAC ATC ACC CGT GTT 912 Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val 290 295 300
GAA TCT GAA AAC AAA GTT GTT ATC CTG GAC TCT TTC GAC CCG CTG GTT 960 Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val 305 310 315 320
GCT GAA GAA GAC GAA CGT GAG ATC TCT GTT CCG GCT GAA ATC CTG CGT 1008 Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg 325 330 335
AAA TCT CGT CGT TTC GCT CAG GCT CTG CCG GTT TGG GCT CGT CCG GAC 1056 Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp 340 345 350
TAC AAC CCG CCG CTG GTT GAA ACC TGG AAA AAA CCG GAC TAC GAA CCG 1104 Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro 355 360 365
CCGGπGπCACGCTTGCCCGCTGCCGCCGCCGpMATCTO^ 1152 Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val 370 375 380
CXX3 CCG CPOG CPOT . VMCGTACC GTTGTTCTGACC GMTCTACC CTG 1200 Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu 385 390 395 400
TCT ACC GCT CTG GCT GAA CTG GCT ACC CGT TCT TTC GGT TCT TCT TCT 1248 Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser 405 410 415
ACC TCG GGT ATC ACC GGT GAC AAC ACC ACC ACC TCT TCT GAA CCG GCT 1296 Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala 420 425 430
CCG TCT GGT TGC CCG CCG GAC TCT GAC GCT GAA TCT TAC TCT TCT ATG 1344 Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met 435 440 445
CCG CCG CTG GAA GGT GAA CCG GGT GAC CCG GAT CTG TCT GAC GGT TCT 1392 Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser 450 455 460
TGG TCT ACC GTT TCT TCT GAA GCT AAC GCT GAA GAC GTT GTT TGC TGC 1440 Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys 465 470 475 480
TCT ATG TCT TAC TCT TGG ACC GGT GCT CTG GTT ACT CCG TGC GCT GCT 1488 Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala 485 490 495
GAA GAA CAG AAA CTG CCG ATC AAC GCT CTG TCT AAC TCT CTG CTG CGT 1536 Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg 500 505 510
CAC CAC AAC CTG GTT TAC TCT ACC ACC TCT CGT TCT GCT TGC CAG CGT 1584 His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg 515 520 525
CAG AAA AAA GTT ACC TTC GAC CGT CTG CAA GTT CTA GAC 1623 Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp 530 535 540
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 541 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Met Arg Arg Leu Ala Arg Gly Ser Pro Pro 245 250 255
Ser Val Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser Leu Lys 260 265 270
Ala Thr Cys Thr Ala Asn His Asp Ser Pro Asp Ala Glu Leu Ile Glu 275 280 285
Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr Arg Val 290 295 300
Glu Ser Glu Asn Lys Val Val Ile Leu Asp Ser Phe Asp Pro Leu Val 305 310 315 320
Ala Glu Glu Asp Glu Arg Glu Ile Ser Val Pro Ala Glu Ile Leu Arg 325 330 335
Lys Ser Arg Arg Phe Ala Gln Ala Leu Pro Val Trp Ala Arg Pro Asp 340 345 350
Tyr Asn Pro Pro Leu Val Glu Thr Trp Lys Lys Pro Asp Tyr Glu Pro 355 360 365
Pro Val Val His Gly Cys Pro Leu Pro Pro Pro Lys Ser Pro Pro Val 370 375 380
Pro Pro Pro Arg Lys Lys Arg Thr Val Val Leu Thr Glu Ser Thr Leu 385 390 395 400
Ser Thr Ala Leu Ala Glu Leu Ala Thr Arg Ser Phe Gly Ser Ser Ser 405 410 415
Thr Ser Gly Ile Thr Gly Asp Asn Thr Thr Thr Ser Ser Glu Pro Ala 420 425 430
Pro Ser Gly Cys Pro Pro Asp Ser Asp Ala Glu Ser Tyr Ser Ser Met 435 440 445
Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp Gly Ser 450 455 460
Trp Ser Thr Val Ser Ser Glu Ala Asn Ala Glu Asp Val Val Cys Cys 465 470 475 480
Ser Met Ser Tyr Ser Trp Thr Gly Ala Leu Val Thr Pro Cys Ala Ala 485 490 495
Glu Glu Gln Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu Leu Arg 500 505 510
His His Asn Leu Val Tyr Ser Thr Thr Ser Arg Ser Ala Cys Gln Arg 515 520 525
Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp 530 535 540
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1488 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1488
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACG AAT TCT CTA GAC TCC CAC TAC CAG GAC GTT CTG AAA 768 Asp Pro Ser Thr Asn Ser Leu Asp Ser His Tyr Gln Asp Val Leu Lys 245 250 255
GAA GTT AAA GCT GCT GCT TCT AAA GTT AAA GCT AAC CTG CTG TCT GTT 816 Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val 260 265 270
GAA GAA GCA TGC TCT CTG ACC CCG CCG CAC TCT GCT AAA TCT AAA TTC 864 Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe 275 280 285
GGT TAC GGT GCT AAA GAC GTT CGT TGC CAC GCT CGT AAA GCT GTT ACC 912 Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Thr 290 295 300
CAC ATC AAC TCT GTT TGG AAA GAT CTG CTG GAA GAC AAC GTT ACC CCG 960 His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr Pro 305 310 315 320
ATC GAC ACC ACC ATC ATG GCT AAA AAC GAA GTT TTC TGC GTT CAG CCG 1008 Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro 325 330 335
GAA AAA GGT GGT CGT AAA CCG GCT CGT CTG ATC GTT TTC CCG GAC CTG 1056 Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu 340 345 350
GGT GTT CGT GTT TGC GAA AAA ATG GCT CTG TAC GAC GTT GTT ACC AAA 1104 Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr Lys 355 360 365
CTG CCG CTG GCT GTT ATG GGT TCT TCT TAC GGT TTC CAG TAC TCT CCG 1152 Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro 370 375 380
GGT CAG CGT GTT GAG TTC CTG GTT CAG GCT TGG AAA TCT AAA AAA ACC 1200 Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr 385 390 395 400
CCG ATG GGT TTC TCT TAC GAC ACC CGT TGC TTC GAC TCT ACC GTT ACC 1248 Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr 405 410 415
GAA TCT GAC ATT CGT ACC GAA GAA GCT ATC TAC CAG TGC TGC GAC CTG 1296 Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu 420 425 430
GAC CCG CAG GCT CGT GTT GCT ATC AAA TCT CTG ACC GAA CGT CTG TAC 1344 Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr 435 440 445
GTT GGT GGT CCG CTG ACC AAC TCT CGG GGT GAA AAC TGC GGT TAC CGT 1392 Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg 450 455 460
CGT TGC CGT GCT TCT GGT GTT CTG ACC ACC TCT TGC GGT AAC ACC CTG 1440 Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu 465 470 475 480
ACC TGC TAC ATC AAA GCT CGT GCT GCT TGC CGT GCT GCT GGT CTG CAG 1488 Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln 485 490 495
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 496 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Leu Asp Ser His Tyr Gln Asp Val Leu Lys 245 250 255
Glu Val Lys Ala Ala Ala Ser Lys Val Lys Ala Asn Leu Leu Ser Val 260 265 270
Glu Glu Ala Cys Ser Leu Thr Pro Pro His Ser Ala Lys Ser Lys Phe 275 280 285
Gly Tyr Gly Ala Lys Asp Val Arg Cys His Ala Arg Lys Ala Val Thr 290 295 300
His Ile Asn Ser Val Trp Lys Asp Leu Leu Glu Asp Asn Val Thr Pro 305 310 315 320
Ile Asp Thr Thr Ile Met Ala Lys Asn Glu Val Phe Cys Val Gln Pro 325 330 335
Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu Ile Val Phe Pro Asp Leu 340 345 350
Gly Val Arg Val Cys Glu Lys Met Ala Leu Tyr Asp Val Val Thr Lys 355 360 365
Leu Pro Leu Ala Val Met Gly Ser Ser Tyr Gly Phe Gln Tyr Ser Pro 370 375 380
Gly Gln Arg Val Glu Phe Leu Val Gln Ala Trp Lys Ser Lys Lys Thr 385 390 395 400
Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys Phe Asp Ser Thr Val Thr 405 410 415
Glu Ser Asp Ile Arg Thr Glu Glu Ala Ile Tyr Gln Cys Cys Asp Leu 420 425 430
Asp Pro Gln Ala Arg Val Ala Ile Lys Ser Leu Thr Glu Arg Leu Tyr 435 440 445
Val Gly Gly Pro Leu Thr Asn Ser Arg Gly Glu Asn Cys Gly Tyr Arg 450 455 460
Arg Cys Arg Ala Ser Gly Val Leu Thr Thr Ser Cys Gly Asn Thr Leu 465 470 475 480
Thr Cys Tyr Ile Lys Ala Arg Ala Ala Cys Arg Ala Ala Gly Leu Gln 485 490 495
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1161 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1161
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACG AAT TGC ATG CTG CAG GAC TGC ACC ATG CTG GTT TGC 768 Asp Pro Ser Thr Asn Cys Met Leu Gln Asp Cys Thr Met Leu Val Cys 245 250 255
GGT GAC GAC CTG GTT GTT ATC TGC GAA TCT GCT GGT GTT CAG GAA GAC 816 Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp 260 265 270
GCT GCT TCT CTG CGT GCT c ACC GAA GCT ATG ACC CGT TAC TCT GCT 864 Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 275 280 285
CPCCCCGGGTGACCCGCCGCAG ∞GGMTACGAC CTGGMCTGATCAC 912 Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr 290 295 300
TCT TGC TCT TCT AAC GTT TCT GTT GCT CAC GAC GGT GCT GGT AAA CGT 960 Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg 305 310 315 320
GTT TAC TAC CTG ACC CGT GAC CCG ACC ACC CCG CTG GCT CGT GCT GCT 1008 Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala 325 330 335
TGG GAA ACC GCT CGT CAC ACC CCG GTA AAC TCT TGG CTG GGT AAC ATC 1056 Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile 340 345 350
ATC ATG TTC GCT CCG ACC CTG TGG GCC CGT ATG ATC CTG ATG ACC CAC 1104 Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His 355 360 365
TTC TTC TCT GTT CTG ATC GCT CGT GAC CAG CTG GAA CAG GCT CTG GAC 1152 Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp 370 375 380
TGC GAG ATC 1161 Cys Glu Ile 385
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 387 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Cys Met Leu Gln Asp Cys Thr Met Leu Val Cys 245 250 255
Gly Asp Asp Leu Val Val Ile Cys Glu Ser Ala Gly Val Gln Glu Asp 260 265 270
Ala Ala Ser Leu Arg Ala Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala 275 280 285
Pro Pro Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr 290 295 300
Ser Cys Ser Ser Asn Val Ser Val Ala His Asp Gly Ala Gly Lys Arg 305 310 315 320
Val Tyr Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala 325 330 335
Trp Glu Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile 340 345 350
Ile Met Phe Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His 355 360 365
Phe Phe Ser Val Leu Ile Ala Arg Asp Gln Leu Glu Gln Ala Leu Asp 370 375 380
Cys Glu Ile 385
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1179 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1179
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACG MT TCC ATG GAG ATC TAC GGT GCT TGC TAC TCT ATC 768 Asp Pro Ser Thr Asn Ser Met Glu lie Tyr Gly Ala Cys Tyr Ser He 245 250 255
GMCCGCTGGACCTGCCGCCGATCAπCAGCGTCTGCACGGTCTGTCT 816 Glu Pro Leu Asp Leu Pro Pro lie He Gin Arg Leu His Gly Leu Ser 260 265 270
GCT πC TCT CTG CAC TCT TAC TCC CCG GGT GM ATC MC CGT Gπ GCT 864 Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu lie Asn Arg Val Ala 275 280 285
GCTTGC CTG CGT AM CTG GGT Gπ CCG CCG CTG CGTGCTTGG CGT CAC 912 Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Tφ Arg His 290 295 300
CGTGCTCGTTCTGπCGTGCTCGTCTGCTGGCTCGTGGTGGCCGTGCT 960 Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly Arg Ala 305 310 315 320
GCT ATC TGC GGT AM TAC CTG πCMC TGG GCT Gπ CGT ACC AM CTG 1008 Ala He Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu 325 330 335
AM CTG ACC CCG ATC GCT GCT GCT GGT CAG CTG GAC CTG TCTGGTTGG 1056
Lys Leu Thr Pro He Ala Ala Ala Gly Gin Leu Asp Leu Ser Gly Tφ 340 345 350 πC ACC GCT GGT TAC TCT GGT GGT GAC ATC TAC CAC TCT Gπ TCT CAC 1104 Phe Thr Ala Gly Tyr Ser Gly Gly Asp He Tyr His Ser Val Ser His 355 360 365
GCT CGT CCG CGTTGG ATC TGG TTC TGC CTG CTG CTG CTG GCT GCT GGT 1152 Ala Arg Pro Arg Trp He Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly 370 375 380
G GGT ATC TAC CTG CTG CCG AAC CGT 1179
Val Gly He Tyr Leu Leu Pro Asn Arg 385 390
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 393 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Met Glu Ile Tyr Gly Ala Cys Tyr Ser Ile 245 250 255
Glu Pro Leu Asp Leu Pro Pro Ile Ile Gln Arg Leu His Gly Leu Ser 260 265 270
Ala Phe Ser Leu His Ser Tyr Ser Pro Gly Glu Ile Asn Arg Val Ala 275 280 285
Ala Cys Leu Arg Lys Leu Gly Val Pro Pro Leu Arg Ala Trp Arg His 290 295 300
Arg Ala Arg Ser Val Arg Ala Arg Leu Leu Ala Arg Gly Gly Arg Ala 305 310 315 320
Ala Ile Cys Gly Lys Tyr Leu Phe Asn Trp Ala Val Arg Thr Lys Leu 325 330 335
Lys Leu Thr Pro Ile Ala Ala Ala Gly Gln Leu Asp Leu Ser Gly Trp 340 345 350
Phe Thr Ala Gly Tyr Ser Gly Gly Asp Ile Tyr His Ser Val Ser His 355 360 365
Ala Arg Pro Arg Trp Ile Trp Phe Cys Leu Leu Leu Leu Ala Ala Gly 370 375 380
Val Gly Ile Tyr Leu Leu Pro Asn Arg 385 390
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1791 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS (B) LOCATION: 1..1791
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
ATGAGTTπGTGGTCAπAπCCCGCGCGCTACGCGTCGACGCGTCTG 48 Met Ser Phe Val Val He He Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCCGGTAMCCAπGGπGATAπMCGGCMACCCATGAπGπCAT 96 Pro Gly Lys Pro Leu Val Asp lie Asn Gly Lys Pro Met He Val His 20 25 30
GπCπGMCGCGCGCGTGMTCAGGTGCCGAGCGCATCATCGTGGCA 144 Val Leu Glu Arg Aia Arg Glu Ser Giy Ala Glu Arg He He Val Ala 35 40 45
ACCGATCATGAGGATGπGCCCGCGCCGπGMGCCGCTGGCGGTGM 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTATGTATGACGCGCGCCGATCATCAGTCAGGAACAGMCGTCTGGCG 240 Val Cys Met Thr Arg Ala Asp His Gin Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GMGπGTCGMMATGCGCAπCAGCGACGACACGGTGATCGπMT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val He Val Asn 85 90 95
GTG CAG GGT GAT GM CCG ATG ATC CCT GCG ACA ATC Aπ CGT CAG Gπ 336 Val Gin Gly Asp Glu Pro Met He Pro Ala Thr He He Arg Gin Val 100 105 110
GCTGATMCCTCGCTCAGCGTCAGGTGGGTATGGCGACTCTGGCGGTG 384 Ala Asp Asn Leu Ala Gin Arg Gin Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCAATCCACMTGCGGMGMGCGπTMCCCGMTGCGGTGMAGTG 432 Pro He His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Gπ CTC GAC GCT GM GGG TAT GCA CTG TAC πC TCT CGC GCC ACC Aπ 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr lie 145 150 155 160
CCTTGGGATCGTGATCGTTπGCAGMGGCCπGMACCGπGGCGAT 528 Pro Tφ Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
MC πC CTG CGT CAT CTT GGT Aπ TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly He Tyr Gly Tyr Arg Ala Gly Phe He
180 185 190
CGT CGT TAC GTC MC TGG CAG CCA AGT CCG πA GM CAC ATC GM ATG 624 Arg Arg Tyr Val Asn Trp Gin Pro Ser Pro Leu Glu His He Glu Met 195 200 205 πA GAG CAG Cπ CGT Gπ CTG TGG TAC GGC GM AM ATC CAT Gπ GCT 672 Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys He His Val Ala 210 215 220
GπGCTCAGGMGπCCTGGCACAGGTGTGGATACCCCTGMGATCTC 720 Val Ala Gin Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACG MTTCC ATG GAC GCT CAC πC CTG TCT CAG GCG CCG 768 Asp Pro Ser Thr Asn Ser Met Asp Ala His Phe Leu Ser Gin Ala Pro 245 250 255
CCGCCGTCTTGGGATCAGATGTGGAMTGCCTGATCCGTCTGAMCCG 816 Pro Pro Ser Tφ Asp Gin Met Trp Lys Cys Leu He Arg Leu Lys Pro 260 265 270
ACCCTGCACGGCCCGACCCCGCTGCTGTACCGTCTGGGTGCTGπCAG 864 Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Giy Ala Val Gin 275 280 285
MC GM ATC ACC CTG ACC CAC CCG Gπ ACC AM TAC ATC ATG ACC TGC 912 Asn Glu He Thr Leu Thr His Pro Val Thr Lys Tyr lie Met Thr Cys 290 295 300
ATG TCT GCT GAT CTA GM Gπ Gπ ACC TCT ACC TGG Gπ CTG Gπ GGT 960 Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Tφ Val Leu Val Gly 305 310 315 320
GCT Gπ CTG GCT GCT CTG GCT GCTTAC TGC CTG TCG ACC GGT TGC Gπ 1008 Gly Val Leu Ala Aia Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val 325 330 335
Gπ ATC Gπ GGT CGT Gπ Gπ CTG TCT GGT AM CCG GCC Aπ ATC CCG 1056 Val He Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala He He Pro 340 345 350
GACCGTGMGπCTGTAC CGTGAGπC AC GMATGGMGMTGCTCT 1104 Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser 355 360 365
CAG CAC CTG CCG TAC ATC GM CAG GGT ATG ATG CTG GCT GM CAG πC 1152 Gin His Leu Pro Tyr lie Glu Gin Gly Met Met Leu Ala Glu Gin Phe 370 375 380
AM CAG AM GCT CTG GGT CTG CTG CAG ACC GCTTCT CGT CAG GCTGM 1200 Lys Gin Lys Ala Leu Gly Leu Leu Gin Thr Ala Ser Arg Gin Ala Glu 385 390 395 400
Gπ ATC GCT CCG GCT Gπ CAG ACC MC TGG CAG AM CTC GAG ACC πC 1248
Val He Ala Pro Ala Val Gin Thr Asn Trp Gin Lys Leu Glu Thr Phe 405 410 415
TGG GCT AM CAC ATG TGG MC πC ATC TCT GGT ATC CAG TAC CTG GCT 1296 Trp Ala Lys His Met Trp Asn Phe He Ser Gly lie Gin Tyr Leu Ala 420 425 430
GGT CTG TCT ACC CTG CCG GGTMC CCG GCT ATC GCA AGC πG ATG GCT 1344 Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala He Ala Ser Leu Met Ala 435 440 445 πCACCGCTGCTGπACCTCTCCGCTGACCACCTCTCAGACCCTGCTG 1392 Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gin Thr Leu Leu 450 455 460 πCMCAπCTGGGTGGTTGGGπGCTGCTCAGCTGGCTGCTCCGGGT 1440 Phe Asn He Leu Gly Gly Tφ Val Ala Ala Gin Leu Ala Ala Pro Gly 465 470 475 480
GCT GCT ACC GCTπC Gπ GGT GCT GGT CTG GCT GGT GCT GCT ATC GGT 1488 Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala He Gly 485 490 495
TCT GTA GGC CTG GGT AM Gπ CTG ATC GAC Aπ CTG GCT GGT TAC GGT 1536 Ser Val Gly Leu Gly Lys Val Leu He Asp He Leu Ala Gly Tyr Gly 500 505 510
GCTGGTGπGCTGGAGCTCTGGπGCTπCAMATCATGTCTGGTGM 1584 Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys He Met Ser Gly Glu 515 520 525
GπCCGTCTACCGMGATCTGGπMCCTGCTGCCGGCTATCCTGTCT 1632 Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala He Leu Ser 530 535 540
CCG GGT GCT CTG GπGπ GGTGπGπTGC GCT GCT ATC CTG CGT CGT 1680 Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala lie Leu Arg Arg 545 550 555 560
CACGπGGCCCGGGTGMGGTGCTGπCAGTGGATGMCCGTCTGATC 1728 His Val Gly Pro Gly Glu Gly Ala Val Gin Trp Met Asn Arg Leu He 565 570 575
GCTπCGCTTCT CGT GGTMC CAC GπTCTCCATGG GAT CCT CTA GAC 1776 Ala Phe Ala Ser Arg Giy Asn His Val Ser Pro Tφ Asp Pro Leu Asp 580 585 590
TGC AGG CAT GCT MG 1791
Cys Arg His Ala Lys 595
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 597 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Met Asp Ala His Phe Leu Ser Gln Ala Pro 245 250 255
Pro Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro 260 265 270
Thr Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln 275 280 285
Asn Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met Thr Cys 290 295 300
Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly 305 310 315 320
Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly Cys Val 325 330 335
Val Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile Ile Pro 340 345 350
Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu Cys Ser 355 360 365
Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu Gln Phe 370 375 380
Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln Ala Glu 385 390 395 400
Val Ile Ala Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu Thr Phe 405 410 415
Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala 420 425 430
Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala 435 440 445
Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr Leu Leu 450 455 460
Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala Pro Gly 465 470 475 480
Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala Ile Gly 485 490 495
Ser Val Gly Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly Tyr Gly 500 505 510
Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser Gly Glu 515 520 525
Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser 530 535 540
Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg 545 550 555 560
His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile 565 570 575
Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Trp Asp Pro Leu Asp 580 585 590
Cys Arg His Ala Lys 595
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1797 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS (B) LOCATION: 1..1797
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
ATG AGTTπ CTG GTC AπAπ CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val lie lie Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGTAMCCAπGGπGATAπMC GGCMACCCATGAπGπCAT 96 Pro Gly Lys Pro Leu Val Asp He Asn Gly Lys Pro Met lie Val His 20 25 30
GπcπGMCGCGCGCGTGMTCAGCTGCCGAGCGCATCATCGTGGCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg lie lie Val Ala 35 40 45
ACCGATCATGAGGATGπGCCCGCGCCGπGMGCCGCTGGCGGTGM 192 Thr Asp His Glu Asp Val Ala Arg Aia Val Glu Ala Ala Gly Gly Glu 50 55 60
GTATGT ATG ACG CGC GCCGATCATCAGTCAGG ACAGMCGTCTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gin Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GMGπGTC GMMA TGC GCAπC AGC GAC GAC ACG CTG ATC GπMT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val He Val Asn 85 90 95
GTG CAG GGT GAT GM CCG ATG ATC CCT GCG ACA ATC Aπ CGT CAG Gπ 336 Val Gin Gly Asp Glu Pro Met lie Pro Ala Thr lie lie Arg Gin Val
100 105 110
GCT GAT MC CTC GCT CAG CGT CAG CTG GCT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gin Arg Gin Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCAATCCACMTGCGGMGMGCGTπMCCCGMTGCGGTGMACTG 432 Pro He His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Gπ CTC GAC GCT GM GGG TAT GCA CTG TAC πc TCT CGC GCC ACC Aπ 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr He 145 150 155 160
CCTTGGGATCGTGATCGTπTGCAGMGGCCπGMACCGπGGCGAT 528 Pro Tφ Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
MCπCCTGCGTCATCπGGTAπTATGGCTACCGTGCAGGCπTATC 576 Asn Phe Leu Arg His Leu Gly He Tyr Gly Tyr Arg Ala Gly Phe He 180 185 190
CGT CGT TAC GTC MC TGG CAG CCA AGT CCG πA GM CAC ATC GM ATG 624 Arg Arg Tyr Val Asn Trp Gin Pro Ser Pro Leu Glu His He Glu Met 195 200 205 πAGAGCAGCπCGTGπCTGTGGTACGGCGMMAATCCATGπGCT 672 Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys lie His Val Ala 210 215 220
GπGCTCAGGMGπCCTGGCACAGGTGTGGATACCCCTGMGATCTC 720 Val Ala Gin Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACG MT TCC ATG GAC GCT CAC πC CTG TCT CAG ACC AM 768 Asp Pro Ser Thr Asn Ser Met Asp Ala His Phe Leu Ser Gin Thr Lys 245 250 255
CAGTCTGGTGMMCCπCCGTACCTGGπGCTTACCAGGCTACCGπ 816 Gin Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gin Ala Thr Val 260 265 270
TGCGCTCGTGCTCAGGCCCCGACCCCGCTGCTGTACCGTCTGGGTGCT 864 Cys Ala Arg Ala Gin Ala Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala 275 280 285
GπCAGMCGMATCACCCTGACCCACCCGGπACC/\MTACATCATG 912 Val Gin Asn Glu He Thr Leu Thr His Pro Val Thr Lys Tyr He Met 290 295 300
ACC TGC ATG TCT GCT GAT CTA GM Gπ Gπ ACC TCT ACC TGG Gπ CTG 960 Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Tφ Val Leu 305 310 315 320
GπGGTGGTGπCTGGCTGCTCTGGCTGCTTACTGCCTGTCGACCGGT 1008
Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly 325 330 335
TGCGπGπATCGπGGTCGTGπGπCTGTCTGGTAMCCG GCCAπ 1056 Cys Val Val He Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala He 340 345 350
ATCCCGGACCGTGMGπCTGTACCGTGAGπCGACGMATGGMGM 1104 He Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu 355 360 365
TGC TCT CAG CAC CTG CCG TAC ATC GM CAG GGT ATG ATG CTG GCT GM 1152 Cys Ser Gin His Leu Pro Tyr He Glu Gin Gly Met Met Leu Ala Glu 370 375 380
CAG πC AM CAG AM GCT CTG GGT CTG CTG CAG ACC GCT TCT CGT CAG 1200 Gin Phe Lys Gin Lys Ala Leu Gly Leu Leu Gin Thr Ala Ser Arg Gin 385 390 395 400
GCTGMGπATCGCTCCGGCTGπCAGACCMCTGGCAGAMCTCGAG 1248 Ala Glu Val lie Ala Pro Ala Val Gin Thr Asn Trp Gin Lys Leu Glu 405 410 415
ACC πC TGG GCT AM CAC ATG TGG MC πC ATC TCT GGT ATC CAG TAC 1296 Thr Phe Tφ Ala Lys His Met Trp Asn Phe He Ser Gly lie Gin Tyr 420 425 430
CTG GCT GGT CTG TCT ACC CTG CCG GGTMC CCG GCT ATC GCAAGCπG 1344 Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala He Ala Ser Leu 435 440 445
ATG GCT πC ACC GCT GCT Gπ ACC TCT CCG CTG ACC ACC TCT CAG ACC 1392 Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gin Thr 450 455 460
CTG CTG πC MC Aπ CTG GGT GGT TGG Gπ GCT GCT CAG CTG GCT GCT 1440 Leu Leu Phe Asn lie Leu Gly Gly Tφ Val Ala Ala Gin Leu Ala Ala 465 470 475 480
CCGGCTGCTGCTACCGCTπCGπGGTGCTGGTCTGGCTGGTGCTGCT 1488 Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala 485 490 495
ATC GCT TCT GTA GGC CTG GCT AM Gπ CTG ATC GAC Aπ CTG GCT GGT 1536 He Gly Ser Val Gly Leu Gly Lys Val Leu lie Asp lie Leu Ala Gly 500 505 510
TAC GGT GCT GGT Gπ GCT GGA GCT CTG Gπ GCT πC AM ATC ATG TCT 1584 Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys He Met Ser 515 520 525
GGT GM Gπ CCG TCT ACC GM GAT CTG GTTMC CTG CTG CCG GCT ATC 1632 Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala lie 530 535 540
CTG TCT CCG GGT GCT CTG GπGπ GGT GπGπ TGC GCT GCT ATC CTG 1680 Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala He Leu 545 550 555 560
CGTCGTCACGπGGCCCGGGTGMGGTGCTGπCAGTGGATGMCCGT 1728 Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gin Trp Met Asn Arg 565 570 575
CTG ATC GCT πC GCT TCT CGT GGT MC CAC Gπ TCT CCA TGG GAT CCT 1776 Leu He Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Trp Asp Pro 580 585 590
CTA GAC TGC AGG CAT GCT MG 1797
Leu Asp Cys Arg His Ala Lys 595
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 599 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Met Asp Ala His Phe Leu Ser Gln Thr Lys 245 250 255
Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr Gln Ala Thr Val 260 265 270
Cys Ala Arg Ala Gln Ala Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala 275 280 285
Val Gln Asn Glu Ile Thr Leu Thr His Pro Val Thr Lys Tyr Ile Met 290 295 300
Thr Cys Met Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu 305 310 315 320
Val Gly Gly Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Ser Thr Gly 325 330 335
Cys Val Val Ile Val Gly Arg Val Val Leu Ser Gly Lys Pro Ala Ile 340 345 350
Ile Pro Asp Arg Glu Val Leu Tyr Arg Glu Phe Asp Glu Met Glu Glu 355 360 365
Cys Ser Gln His Leu Pro Tyr Ile Glu Gln Gly Met Met Leu Ala Glu 370 375 380
Gln Phe Lys Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Ser Arg Gln 385 390 395 400
Ala Glu Val Ile Ala Pro Ala Val Gln Thr Asn Trp Gln Lys Leu Glu 405 410 415
Thr Phe Trp Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr 420 425 430
Leu Ala Gly Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu 435 440 445
Met Ala Phe Thr Ala Ala Val Thr Ser Pro Leu Thr Thr Ser Gln Thr 450 455 460
Leu Leu Phe Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Ala 465 470 475 480
Pro Gly Ala Ala Thr Ala Phe Val Gly Ala Gly Leu Ala Gly Ala Ala 485 490 495
Ile Gly Ser Val Gly Leu Gly Lys Val Leu Ile Asp Ile Leu Ala Gly 500 505 510
Tyr Gly Ala Gly Val Ala Gly Ala Leu Val Ala Phe Lys Ile Met Ser 515 520 525
Gly Glu Val Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile 530 535 540
Leu Ser Pro Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu 545 550 555 560
Arg Arg His Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg 565 570 575
Leu Ile Ala Phe Ala Ser Arg Gly Asn His Val Ser Pro Trp Asp Pro 580 585 590
Leu Asp Cys Arg His Ala Lys 595
(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1251 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS (B) LOCATION: 1..1251
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
ATG AGT Tπ GTG GTC Aπ Aπ CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val He He Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCCGGT VMCCAπGGπGATAπMCGGCMACCCATGAπGπCAT 96 Pro Gly Lys Pro Leu Val Asp He Asn Gly Lys Pro Met He Val His
20 25 30
GπCπGMCGCGCGCGTGMTCAGGTGCCGAGCGCATCATCCTGGCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg He He Val Ala 35 40 45
ACCGATCATGAGGATGπGCCCGCGCCGπGMGCCGCTGGCGCTGM 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
CTATCTATGACGCGCGCCGATCATCAGTCAGGAACAGMCGTCTGGCG 240 Val Cys Met Thr Arg Ala Asp His Gin Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GMGπGTCGMMATGC GCAπCAGCGACGACACGGTGATCGπMT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val He Val Asn 85 90 95
GTG CAG GGT GAT GM CCG ATG ATC CCT GCG ACA ATC Aπ CGT CAG Gπ 336 Val Gin Giy Asp Glu Pro Met He Pro Ala Thr He He Arg Gin Val 100 105 110
GCT GATMC CTC GCT CAG CGT CAG CTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gin Arg Gin Val Giy Met Ala Thr Leu Ala Val 115 120 125
CCAATCCACMTGCGGMGMGCGπTMCCCGMTGCGCTGMACTG 432 Pro He His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Gπ CTC GAC GCT GM GGG TAT GCA CTG TAC πC TCT CGC GCC ACC Aπ 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr He 145 150 155 160
CCT TGG GAT CGT GAT CCT TTT GCA GM GGC CπGM ACC Gπ GGC GAT 528 Pro Tφ Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
MC πC CTG CGT CAT CTT GGT Aπ TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly He Tyr Gly Tyr Arg Ala Gly Phe He 180 185 190
CGT CGTTAC GTC MC TGG CAG CCA AGT CCG πA GM CAC ATC GM ATG 624 Arg Arg Tyr Val Asn Trp Gin Pro Ser Pro Leu Glu His He Glu Met 195 200 205 πA GAG CAG Cπ CGT Gπ CTG TGG TAC GGC GM MA ATC CAT Gπ GCT 672 Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys He His Val Ala 210 215 220
Gπ GCT CAG GM Gπ CCT GGC ACA GGT GTG GAT ACC CCT GM GAT CTC 720 Val Ala Gin Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GACCCGTCGACTCGAAπCGAGCTCGGTACCCTGAGACAATCACGCπ 768
Asp Pro Ser Thr Arg He Arg Ala Arg Tyr Pro Glu Thr He Thr Leu 245 250 255
CCCCAGGATGCTGTCTCCCGCACCCAGCGTCGGGGCAGGACTGGCAGG 816 Pro Gin Asp Aia Val Ser Arg Thr Gin Arg Arg Gly Arg Thr Gly Arg 260 265 270
GGGMGCCAGGCATCTACAGAπTGTGGCACCGGGGGAGCGCCCTTCC 864 Gly Lys Pro Gly He Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser 275 280 285
GGCATGπCGACTCGTCCGTCCTCTGCGAGTGCTATGACGCGGGCTGG 912 Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Trp 290 295 300
CCTTGGTATGAGCTCACACCCGCCGAGACC CAGπAGGCTACGAGCG 960 Pro Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 305 310 315 320
TAC ATG MC ACC CCG GGA CTC CCC GTG TGC CM GAC CAT C GAATTT 1008 Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gin Asp His Leu Glu Phe 325 330 335
TGG GAG GGC GTC πC ACG GCT CTC ACC CAT ATA GAC GCC CAC TTT CTA 1056 Tφ Glu Gly Val Phe Thr Gly Leu Thr His He Asp Ala His Phe Leu 340 345 350
TCCCAGACAMGCAGAGTGGGGMMCCπCCTTACCTGGTAGCGTAC 1104 Ser Gin Thr Lys Gin Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr 355 360 365
CMGCCACCGTGTGCGCTAGAGCTCMGCCCCTCCCCCATCGTGGGAC 1152 Gin Ala Thr Val Cys Ala Arg Ala Gin Ala Pro Pro Pro Ser Trp Asp 370 375 380
CAG ATG TGG MG TGC πG ATC CGC CTC MG CCT ACC CTT CAT GGG CCG 1200 Gin Met Tφ Lys Cys Leu He Arg Leu Lys Pro Thr Leu His Gly Pro 385 390 395 400
ACCCCCCTGCTATACAGACTGGGCGGGGGATCCTCTAGACTGCAGGCA 1248 Thr Pro Leu Leu Tyr Arg Leu Gly Gly Gly Ser Ser Arg Leu Gin Ala 405 410 415
TGC 1251
Cys
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 417 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Arg Ile Arg Ala Arg Tyr Pro Glu Thr Ile Thr Leu 245 250 255
Pro Gln Asp Ala Val Ser Arg Thr Gln Arg Arg Gly Arg Thr Gly Arg 260 265 270
Gly Lys Pro Gly Ile Tyr Arg Phe Val Ala Pro Gly Glu Arg Pro Ser 275 280 285
Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp Ala Gly Trp 290 295 300
Pro Trp Tyr Glu Leu Thr Pro Ala Glu Thr Thr Val Arg Leu Arg Ala 305 310 315 320
Tyr Met Asn Thr Pro Gly Leu Pro Val Cys Gln Asp His Leu Glu Phe 325 330 335
Trp Glu Gly Val Phe Thr Gly Leu Thr His Ile Asp Ala His Phe Leu 340 345 350
Ser Gln Thr Lys Gln Ser Gly Glu Asn Leu Pro Tyr Leu Val Ala Tyr 355 360 365
Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro Pro Ser Trp Asp 370 375 380
Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr Leu His Gly Pro 385 390 395 400
Thr Pro Leu Leu Tyr Arg Leu Gly Gly Gly Ser Ser Arg Leu Gln Ala 405 410 415
Cys
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1275 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single (D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS (B) LOCATION: 1..1275
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:
ATGAGTTTTGTGGTCAπAπCCCGCGCGCTACGCGTCGACGCGTCTG 48 Met Ser Phe Val Val He He Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCCGGTAMCCAπGGπGATAπMCGGCMACCCATGAπGπCAT 96 Pro Gly Lys Pro Leu Val Asp lie Asn Gly Lys Pro Met He Val His 20 25 30
GπCπGMCGCGCGCGTGMTCAGGTGCCGAGCGCATCATCGTGGCA 144
Val Leu Glu Arg Ala Arg Glu Ser Gly Aia Glu Arg lie lie Val Ala 35 40 45
ACCGATCATGAGGATGπGCCCGCGCCGπGMGCCGCTGGCGGTGM 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTATGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GM CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gin Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GM Gπ GTC GM MA TGC GCA πC AGC GAC GAC ACG CTG ATC GπMT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val He Val Asn 85 90 95
GTG CAG GGT GAT GM CCG ATG ATC CCT GCG ACA ATC Aπ CGT CAG Gπ 336 Val Gin Gly Asp Glu Pro Met He Pro Ala Thr He He Arg Gin Val 100 105 110
GCT GAT MC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gin Arg Gin Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCAATCCACMTGCGGMGMGCGπTMCCCGMTGCGGTGMAGTG 432 Pro He His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Gπ CTC GAC GCT GM GGG TAT GCA CTG TAC πC TCT CGC GCC ACC Aπ 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr lie 145 150 155 160
CCT TGG GAT CGT GAT CCT Tπ GCA GM GGC CπGM ACC Gπ GGC GAT 528 Pro Tφ Asp Arg Asp Arg Phe Ala Glu Gly Leu Giu Thr Val Gly Asp 165 170 175
MC πC CTG CGT CAT CTT GGT A TAT GGC TAC CGT GCA GGC T ATC 576 Asn Phe Leu Arg His Leu Gly lie Tyr Gly Tyr Arg Ala Gly Phe He 180 185 190
CGTCGTTAC GTC MC TGG CAG CCA AGT CCG πAGM CAC ATC GM ATG 624 Arg Arg Tyr Val Asn Trp Gin Pro Ser Pro Leu Glu His He Glu Met 195 200 205 πA GAG CAG Cπ CGT Gπ CTG TGG TAC GGC GM MA ATC CAT Gπ GCT 672 Leu Glu Gin Leu Arg Val Leu Trp Tyr Gly Glu Lys lie His Val Ala 210 215 220
Gπ GCT CAG GM Gπ CCT GGC ACA GGT GTG GAT ACC CCT GM GAT CTC 720 Val Ala Gin Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GACCCGTCGACTCGAAπCGTAGGTCGCGCMTπGGGTMGGTCATC 768 Asp Pro Ser Thr Arg He Arg Arg Ser Arg Asn Leu Gly Lys Val He 245 250 255
GAC ACC CTC ACG TGC GGC πC GCC GAC CTC ATG GGG TAT Aπ CCG CTC 816 Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr He Pro Leu 260 265 270
GTCGGCGCCCCTCπGGAGGCGCTGCCAGGGCCCTGGGCCATGGCGTC 864 Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Gly His Gly Val 275 280 285
CGG GπCTGGMGAC GGC GTG MC TAT GCG ACA GGG MTCTT CCT GCT 912 Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly 290 295 300
TGC TCTπC TCT ATC πC Cπ CTG GCC CTG CTC TCTTGC CTG ACC GTG 960 Cys Ser Phe Ser He Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val 305 310 315 320
CCCGCATCAGCCTACCMGTACGCMCTCCTCGGGCCTTTACCATGTC 1008 Pro Ala Ser Ala Tyr Gin Val Arg Asn Ser Ser Gly Leu Tyr His Val 325 330 335
ACCMTGATTGCCCCMCTCGAGTAπGTGTACGAGACGGCCGATGCC 1056 Thr Asn Asp Cys Pro Asn Ser Ser He Val Tyr Glu Thr Ala Asp Ala 340 345 350
ATCCTGCACACTCCGGGGTGCGTCCCTTGCGπCGTGAGGGCAACGCC 1104 He Leu His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala 355 360 365
TCX3/VGATGTTGGCTGGCGGTGGC.CCCCACyVCTGGCC VCC _3GGATGGA 1152 Ser Arg Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly 370 375 380
AM CTC CCC GCA ACG CAG CTT CGA CGT CAC Aπ GAT CTG CTT GTC GGG 1200 Lys Leu Pro Ala Thr Gin Leu Arg Arg His He Asp Leu Leu Val Gly 385 390 395 400
AGC GCC ACC CTCTGTTCG GCCCTCTACπA AGG AGCTCG GTACCCGGG 1248 Ser Ala Thr Leu Cys Ser Ala Leu Tyr Leu Arg Ser Ser Val Pro Gly 405 410 415
GAT CCT CTA GAC TGC AGG CAT GCT MG 1275
Asp Pro Leu Asp Cys Arg His Ala Lys 420 425
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 425 amino acids
(B) TYPE: amino acid (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Arg Ile Arg Arg Ser Arg Asn Leu Gly Lys Val Ile 245 250 255
Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu 260 265 270
Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Gly His Gly Val 275 280 285
Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly 290 295 300
Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val 305 310 315 320
Pro Ala Ser Ala Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val 325 330 335
Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala 340 345 350
Ile Leu His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala 355 360 365
Ser Arg Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly 370 375 380
Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly 385 390 395 400
Ser Ala Thr Leu Cys Ser Ala Leu Tyr Leu Arg Ser Ser Val Pro Gly 405 410 415
Asp Pro Leu Asp Cys Arg His Ala Lys 420 425
(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1401 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1401
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACT CGA ATT CTG CTT GTC GGG AGC GCC ACC CTC TGC TCG 768 Asp Pro Ser Thr Arg Ile Leu Leu Val Gly Ser Ala Thr Leu Cys Ser 245 250 255
GCC CTC TAT GTG GGG GAC TTG TGC GGG TCT GTC TTT CTT GTC GGT CAA 816 Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly Gln 260 265 270
CTG TTC ACT TTC TCC CCC AGG CAG CAC TGG ACA ACG CAA GAC TGC AAC 864 Leu Phe Thr Phe Ser Pro Arg Gln His Trp Thr Thr Gln Asp Cys Asn 275 280 285
TGT TCT ATC TAC CCC GGC CAC GTA ACG GGT CAC CGC ATG GCA TGG GAT 912 Cys Ser Ile Tyr Pro Gly His Val Thr Gly His Arg Met Ala Trp Asp 290 295 300
ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA GTA GCT CAG CTG 960 Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu 305 310 315 320
CTC AGG GTC CCG CAA GCC ATC TTG GAC ATG ATC GCT GGT GCC CAC TGG 1008 Leu Arg Val Pro Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp 325 330 335
GGA GTC CTA GCG GGC ATA GCG TAT TTC TCC ATG GTG GGG AAC TGG GCG 1056 Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala 340 345 350
AAG GTC CTG GTA GTG CTG CTG CTA TTT GCC GGC GTT GAC GCG GAA ACC 1104 Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr 355 360 365
CAC GTC ACC GGG GGA AGT GCC GGC CAC ATT ACG GCT GGG CTT GTT CGT 1152 His Val Thr Gly Gly Ser Ala Gly His Ile Thr Ala Gly Leu Val Arg 370 375 380
CTC CTT TCA CCA GGC GCC AAG CAG AAC ATC CAA CTG ATC AAC ACC AAC 1200 Leu Leu Ser Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn Thr Asn 385 390 395 400
GGC AGT TGG CAC ATC AAT AGC ACG GCC TTG AAC TGC AAT GAA AGC CTT 1248 Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu Ser Leu 405 410 415
AAC ACC GGC TGG TTA GCA GGG CTC TTC TAT CAC CAC AAA TTC AAC TCT 1296 Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser 420 425 430
TCA GGC TGT CCT GAG AGG GTT GCC AGC TGC CGT CGC CTT ACC GAT TTT 1344 Ser Gly Cys Pro Glu Arg Val Ala Ser Cys Arg Arg Leu Thr Asp Phe 435 440 445
GAC CAG GGC TGG GAA TTC GAG CTC GGT ACC CGG GGA TCC TCT AGA CTG 1392 Asp Gln Gly Trp Glu Phe Glu Leu Gly Thr Arg Gly Ser Ser Arg Leu 450 455 460
CAG GCA TGC 1401 Gln Ala Cys 465
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 467 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Arg Ile Leu Leu Val Gly Ser Ala Thr Leu Cys Ser 245 250 255
Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val Gly Gln 260 265 270
Leu Phe Thr Phe Ser Pro Arg Gln His Trp Thr Thr Gln Asp Cys Asn 275 280 285
Cys Ser Ile Tyr Pro Gly His Val Thr Gly His Arg Met Ala Trp Asp 290 295 300
Met Met Met Asn Trp Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu 305 310 315 320
Leu Arg Val Pro Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp 325 330 335
Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala 340 345 350
Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala Glu Thr 355 360 365
His Val Thr Gly Gly Ser Ala Gly His Ile Thr Ala Gly Leu Val Arg 370 375 380
Leu Leu Ser Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn Thr Asn 385 390 395 400
Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu Ser Leu 405 410 415
Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr His His Lys Phe Asn Ser 420 425 430
Ser Gly Cys Pro Glu Arg Val Ala Ser Cys Arg Arg Leu Thr Asp Phe 435 440 445
Asp Gln Gly Trp Glu Phe Glu Leu Gly Thr Arg Gly Ser Ser Arg Leu 450 455 460
Gln Ala Cys 465
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1422
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACC GAA TTC GGT GAC ATC ATC AAC GGC TTG CCC GTC TCC 768 Asp Pro Ser Thr Glu Phe Gly Asp Ile Ile Asn Gly Leu Pro Val Ser 245 250 255
GCC CGT AGG GGC CAG GAG ATA CTG CTC GGA CCA GCC GAC GGA ATG GTC 816 Ala Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val 260 265 270
TCC AAG GGG TGG AGG TTG CTG GCG CCC ATC ACG GCG TAC GCC CAG CAG 864 Ser Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln 275 280 285
ACA AGG GGC CTC CTA GGG TGT ATA ATC ACC AGC CTG ACT GGC CGG GAC 912 Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp 290 295 300
AAA AAC CAA GCG GAG GGT GAG GTC CAG ATT GTG TCA ACT GCT GCC CAA 960 Lys Asn Gln Ala Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln 305 310 315 320
ACT TTC CTG GCA ACG TGC ATC AAT GGG GTA TGC TGG ACT GTC TAC CAT 1008 Thr Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His 325 330 335
GGG GCC GGA ACG AGG ACC CTC GCA TCA CCC AAG GGT CCT GTT ATC CAG 1056 Gly Ala Gly Thr Arg Thr Leu Ala Ser Pro Lys Gly Pro Val Ile Gln 340 345 350
ATG TAT ACC AAT GTA GAC CAA GAC CTT GTG GGC TGG CCC GCT CCT CAA 1104 Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln 355 360 365
GGT GCC CGC TCA TTG ACA CCC TGC ACC TGC GGC TCC TCG GAC CTT TAC 1152 Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr 370 375 380
CTG GTT ACG AGG CAC GCC GAT GTC ATT CCC GTG CGC CGG CGG GGT GAT 1200 Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp 385 390 395 400
AGC AGG GGC AGC CTG CTT TCG CCC CGG CCC ATT TCT TAT TTG AAA GGC 1248 Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly 405 410 415
TCC TCG GGG GGT CCG CTG TTG TGC CCC GCG GGA CAC GCC GTG GGC ATA 1296 Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Ile 420 425 430
TTC AGG GCX. GCG GTG TCT/~X CGT GGA GTG GCT AAG GCG GTG GAC TTT 1344 Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe 435 440 445
GTC CCC GTG GAG AAC CTC GAG ACA ACC ATG AAT TCG AGC TCG GTA CCC 1392 Val Pro Val Glu Asn Leu Glu Thr Thr Met Asn Ser Ser Ser Val Pro 450 455 460
GGG GAT CCT CTA GAC TGC AGG CAT GCT AAG 1422 Gly Asp Pro Leu Asp Cys Arg His Ala Lys 465 470
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 474 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Glu Phe Gly Asp Ile Ile Asn Gly Leu Pro Val Ser 245 250 255
Ala Arg Arg Gly Gln Glu Ile Leu Leu Gly Pro Ala Asp Gly Met Val 260 265 270
Ser Lys Gly Trp Arg Leu Leu Ala Pro Ile Thr Ala Tyr Ala Gln Gln 275 280 285
Thr Arg Gly Leu Leu Gly Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp 290 295 300
Lys Asn Gln Ala Glu Gly Glu Val Gln Ile Val Ser Thr Ala Ala Gln 305 310 315 320
Thr Phe Leu Ala Thr Cys Ile Asn Gly Val Cys Trp Thr Val Tyr His 325 330 335
Gly Ala Gly Thr Arg Thr Leu Ala Ser Pro Lys Gly Pro Val Ile Gln 340 345 350
Met Tyr Thr Asn Val Asp Gln Asp Leu Val Gly Trp Pro Ala Pro Gln 355 360 365
Gly Ala Arg Ser Leu Thr Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr 370 375 380
Leu Val Thr Arg His Ala Asp Val Ile Pro Val Arg Arg Arg Gly Asp 385 390 395 400
Ser Arg Gly Ser Leu Leu Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly 405 410 415
Ser Ser Gly Gly Pro Leu Leu Cys Pro Ala Gly His Ala Val Gly Ile 420 425 430
Phe Arg Ala Ala Val Cys Thr Arg Gly Val Ala Lys Ala Val Asp Phe 435 440 445
Val Pro Val Glu Asn Leu Glu Thr Thr Met Asn Ser Ser Ser Val Pro 450 455 460
Gly Asp Pro Leu Asp Cys Arg His Ala Lys 465 470
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1401 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1401
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG ACG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACG AAT TCC ACC ATG GGG CAT TAT CCT TGT ACC ATC AAC 768 Asp Pro Ser Thr Asn Ser Thr Met Gly His Tyr Pro Cys Thr Ile Asn 245 250 255
TAC ACC CTG TTC AAA GTC AGG ATG TAC GTG GGA GGG GTC GAG CAC AGG 816 Tyr Thr Leu Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg 260 265 270
CTG GAA GTT GCT TGC AAC TGG ACG CGG GGC GAA CGT TGT GAT CTG GAC 864 Leu Glu Val Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Asp 275 280 285
GAC AGG GAC AGG TCC GAG CTC AGC CCG CTG CTG CTG TCC ACC ACT CAG 912 Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln 290 295 300
TGG CAG GTC CTT CCG TGT TCC TTC ACG ACC TTG CCA GCC TTG ACC ACC 960 Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Thr Thr 305 310 315 320
GGC CTC ATC CAC CTC CAC CAG AAC ATC GTG GAC GTG CAA TAC TTG TAC 1008 Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr 325 330 335
GGG GTG GGG TCA AGC ATT GTG TCC TGG GCC ATC AAG TGG GAG TAC GTC 1056 Gly Val Gly Ser Ser Ile Val Ser Trp Ala Ile Lys Trp Glu Tyr Val 340 345 350
ATC CTC TTG TTT CTC CTG CTT GCA GAC GCG CGC ATC TGC TCC TGC TTG 1104 Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ser Cys Leu 355 360 365
TGG ATG ATG TTA CTC ATA TCC CAA GCG GAG GCA GCC TTG GAA AAC CTT 1152 Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu 370 375 380
GTG TTA CTC AAT GCG GCG TCT CTG GCC GGG ACG CAC GGT CTT GTG TCC 1200 Val Leu Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val Ser 385 390 395 400
TTC CTC GTG TTT TTC TGC TTT GCA TGG TAT CTG AAG GGT AAG TGG GTG 1248 Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Lys Trp Val 405 410 415
CCC GGA GTG GCC TAC GCC TTC TAC GGG ATG TGG CCT TTC CTC CTG CTC 1296 Pro Gly Val Ala Tyr Ala Phe Tyr Gly Met Trp Pro Phe Leu Leu Leu 420 425 430
CTG TTA GCG TTG CCC CAA CGG GCA TAC GCG CTG GAC ACG GAG ATG GCC 1344 Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Met Ala 435 440 445
GCG TCG TGT GGC GGC GTT GTT CTT GTC GGG TTA ATG GCG CTG ACT CTG 1392 Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr Leu 450 455 460
TCA CCA TAT 1401 Ser Pro Tyr 465
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 467 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Thr Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Asn Ser Thr Met Gly His Tyr Pro Cys Thr Ile Asn 245 250 255
Tyr Thr Leu Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg 260 265 270
Leu Glu Val Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Asp 275 280 285
Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln 290 295 300
Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Thr Thr 305 310 315 320
Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr 325 330 335
Gly Val Gly Ser Ser Ile Val Ser Trp Ala Ile Lys Trp Glu Tyr Val 340 345 350
Ile Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Ile Cys Ser Cys Leu 355 360 365
Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala Ala Leu Glu Asn Leu 370 375 380
Val Leu Leu Asn Ala Ala Ser Leu Ala Gly Thr His Gly Leu Val Ser 385 390 395 400
Phe Leu Val Phe Phe Cys Phe Ala Trp Tyr Leu Lys Gly Lys Trp Val 405 410 415
Pro Gly Val Ala Tyr Ala Phe Tyr Gly Met Trp Pro Phe Leu Leu Leu 420 425 430
Leu Leu Ala Leu Pro Gln Arg Ala Tyr Ala Leu Asp Thr Glu Met Ala 435 440 445
Ala Ser Cys Gly Gly Val Val Leu Val Gly Leu Met Ala Leu Thr Leu 450 455 460
Ser Pro Tyr 465
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1851 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: circular
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1851
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
ATG AGT TTT GTG GTC ATT ATT CCC GCG CGC TAC GCG TCG ACG CGT CTG 48 Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
CCC GGT AAA CCA TTG GTT GAT ATT AAC GGC AAA CCC ATG ATT GTT CAT 96 Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
GTT CTT GAA CGC GCG CGT GAA TCA GGT GCC GAG CGC ATC ATC GTG GCA 144 Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
ACC GAT CAT GAG GAT GTT GCC CGC GCC GTT GAA GCC GCT GGC GGT GAA 192 Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
GTA TGT ATG ACG CGC GCC GAT CAT CAG TCA GGA ACA GAA CGT CTG GCG 240 Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
GAA GTT GTC GAA AAA TGC GCA TTC AGC GAC GAC ACG GTG ATC GTT AAT 288 Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
GTG CAG GGT GAT GAA CCG ATG ATC CCT GCG ACA ATC ATT CGT CAG GTT 336 Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
GCT GAT AAC CTC GCT CAG CGT CAG GTG GGT ATG GCG ACT CTG GCG GTG 384 Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
CCA ATC CAC AAT GCG GAA GAA GCG TTT AAC CCG AAT GCG GTG AAA GTG 432 Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
GTT CTC GAC GCT GAA GGG TAT GCA CTG TAC TTC TCT CGC GCC ACC ATT 480 Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
CCT TGG GAT CGT GAT CGT TTT GCA GAA GGC CTT GAA ACC GTT GGC GAT 528 Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
AAC TTC CTG CGT CAT CTT GGT ATT TAT GGC TAC CGT GCA GGC TTT ATC 576 Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
CGT CGT TAC GTC AAC TGG CAG CCA AGT CCG TTA GAA CAC ATC GAA ATG 624 Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
TTA GAG CAG CTT CGT GTT CTG TGG TAC GGC GAA AAA ATC CAT GTT GCT 672 Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
GTT GCT CAG GAA GTT CCT GGC ACA GGT GTG GAT ACC CCT GAA GAT CTC 720 Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
GAC CCG TCG ACT CGA ATT CGT AGG TCG CGC AAT TTG GGT AAG GTC ATC 768 Asp Pro Ser Thr Arg Ile Arg Arg Ser Arg Asn Leu Gly Lys Val Ile 245 250 255
GAT ACC CTC ACG TGC GGC TTC GCC GAC CTC ATG GGG TAC ATT CCG CTC 816 Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu 260 265 270
GTC GGC GCC CCT CTT GGA GGC GCT GCC AGG GCC CTG GCG CAT GGC GTC 864 Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val 275 280 285
CGG GTT CTG GAA GAC GGC GTG AAC TAT GCA ACA GGG AAC CTT CCC GGT 912 Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly 290 295 300
TGC TCT TTC TCT ATC TTC CTT CTG GCC CTG CTC TCT TGC CTG ACT GTG 960 Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val 305 310 315 320
CCC GCG TCA TCC TAC CAA GTA CGC AAC TCC TCG GGC CTT TAT CAT GTC 1008 Pro Ala Ser Ser Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val 325 330 335
ACC AAT GAT TGC CCC AAC TCG AGC ATT GTG TAC GAG ACG GCC GAT ACC 1056 Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Thr 340 345 350
ATC CTA CAC TCT CCG GGG TGC GTC CCT TGC GTT CGC GAG GGC AAC ACC 1104 Ile Leu His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr 355 360 365
TCG AAA TGT TGG GTG GCG GTG GCC CCC ACA GTG GCC ACC AGG GAC GGC 1152 Ser Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly 370 375 380
AAA CTC CCC TCA ACG CAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG 1200 Lys Leu Pro Ser Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly 385 390 395 400
AGC GCC ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC TTG TGC GGG TCT 1248 Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser 405 410 415
GTC TTT CTT GTC AGT CAA CTG TTC ACC TTC TCC CCT AGG CGC CAT TGG 1296 Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp 420 425 430
ACA ACG CAA GAC TGC AAC TGT TCT ATC TAC CCC GGC CAT ATA ACG GGT 1344 Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly 435 440 445
CAC CGC ATG GCA TGG GAT ATG ATG ATG AAC TGG TCC CCT ACA ACG GCG 1392 His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala 450 455 460
CTG GTA GTA GCT CAG CTG CTC AGG GTC CCA CAA GCC ATC TTG GAC ATG 1440 Leu Val Val Ala Gln Leu Leu Arg Val Pro Gln Ala Ile Leu Asp Met 465 470 475 480
ATC GCA GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG TAT TTC TCC 1488 Ile Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser 485 490 495
ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA GTG CTG TTG CTG TTT TCC 1536 Met Val Gly Asn Trp Ala Lys Val Leu Val Val Leu Leu Leu Phe Ser 500 505 510
GGC GTC GAT GCG GCA ACC TAC ACXD CXϊ GGG GGG AGC GTT GCT AGG ACC 1584 Gly Val Asp Ala Ala Thr Tyr Thr Thr Gly Gly Ser Val Ala Arg Thr 515 520 525
ACG CAT GGA TTC TCC AGC TTA TTC AGT CAA GGC GCC AAG CAG AAC ATC 1632 Thr His Gly Phe Ser Ser Leu Phe Ser Gln Gly Ala Lys Gln Asn Ile 530 535 540
CAG CTG ATT AAC ACC AAC GGC AGT TGG CAC ATC AAT CGC ACG GCC TTG 1680 Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 545 550 555 560
AAC TGT AAT GCG AGC CTC GAC ACT GGC TGG GTA GCG GGG CTC TTC TAT 1728 Asn Cys Asn Ala Ser Leu Asp Thr Gly Trp Val Ala Gly Leu Phe Tyr 565 570 575
TAC CAC AAA TTC AAC TCT TCA GGC TGC CCT GAG AGG ATG GCC AGC TGT 1776 Tyr His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Met Ala Ser Cys 580 585 590
AGA CCC CTT GCC GAT TTT GAC CAG GGC TGG GAA TTC GAG CTC GGT ACC 1824 Arg Pro Leu Ala Asp Phe Asp Gln Gly Trp Glu Phe Glu Leu Gly Thr 595 600 605
CGG GGA TCC TCT AGA CTG CAG GCA TGC 1851 Arg Gly Ser Ser Arg Leu Gln Ala Cys 610 615
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 617 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
Met Ser Phe Val Val Ile Ile Pro Ala Arg Tyr Ala Ser Thr Arg Leu 1 5 10 15
Pro Gly Lys Pro Leu Val Asp Ile Asn Gly Lys Pro Met Ile Val His 20 25 30
Val Leu Glu Arg Ala Arg Glu Ser Gly Ala Glu Arg Ile Ile Val Ala 35 40 45
Thr Asp His Glu Asp Val Ala Arg Ala Val Glu Ala Ala Gly Gly Glu 50 55 60
Val Cys Met Thr Arg Ala Asp His Gln Ser Gly Thr Glu Arg Leu Ala 65 70 75 80
Glu Val Val Glu Lys Cys Ala Phe Ser Asp Asp Thr Val Ile Val Asn 85 90 95
Val Gln Gly Asp Glu Pro Met Ile Pro Ala Thr Ile Ile Arg Gln Val 100 105 110
Ala Asp Asn Leu Ala Gln Arg Gln Val Gly Met Ala Thr Leu Ala Val 115 120 125
Pro Ile His Asn Ala Glu Glu Ala Phe Asn Pro Asn Ala Val Lys Val 130 135 140
Val Leu Asp Ala Glu Gly Tyr Ala Leu Tyr Phe Ser Arg Ala Thr Ile 145 150 155 160
Pro Trp Asp Arg Asp Arg Phe Ala Glu Gly Leu Glu Thr Val Gly Asp 165 170 175
Asn Phe Leu Arg His Leu Gly Ile Tyr Gly Tyr Arg Ala Gly Phe Ile 180 185 190
Arg Arg Tyr Val Asn Trp Gln Pro Ser Pro Leu Glu His Ile Glu Met 195 200 205
Leu Glu Gln Leu Arg Val Leu Trp Tyr Gly Glu Lys Ile His Val Ala 210 215 220
Val Ala Gln Glu Val Pro Gly Thr Gly Val Asp Thr Pro Glu Asp Leu 225 230 235 240
Asp Pro Ser Thr Arg Ile Arg Arg Ser Arg Asn Leu Gly Lys Val Ile 245 250 255
Asp Thr Leu Thr Cys Gly Phe Ala Asp Leu Met Gly Tyr Ile Pro Leu 260 265 270
Val Gly Ala Pro Leu Gly Gly Ala Ala Arg Ala Leu Ala His Gly Val 275 280 285
Arg Val Leu Glu Asp Gly Val Asn Tyr Ala Thr Gly Asn Leu Pro Gly 290 295 300
Cys Ser Phe Ser Ile Phe Leu Leu Ala Leu Leu Ser Cys Leu Thr Val 305 310 315 320
Pro Ala Ser Ser Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val 325 330 335
Thr Asn Asp Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Thr 340 345 350
Ile Leu His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr 355 360 365
Ser Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly 370 375 380
Lys Leu Pro Ser Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly 385 390 395 400
Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser 405 410 415
Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp 420 425 430
Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly 435 440 445
His Arg Met Ala Trp Asp Met Met Met Asn Trp Ser Pro Thr Thr Ala 450 455 460
Leu Val Val Ala Gln Leu Leu Arg Val Pro Gln Ala Ile Leu Asp Met 465 470 475 480
Ile Ala Gly Ala His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser 485 490 495
Met Val Gly Asn Trp Ala Lys Val Leu Val Val Leu Leu Leu Phe Ser 500 505 510
Gly Val Asp Ala Ala Thr Tyr Thr Thr Gly Gly Ser Val Ala Arg Thr 515 520 525
Thr His Gly Phe Ser Ser Leu Phe Ser Gln Gly Ala Lys Gln Asn Ile 530 535 540
Gln Leu Ile Asn Thr Asn Gly Ser Trp His Ile Asn Arg Thr Ala Leu 545 550 555 560
Asn Cys Asn Ala Ser Leu Asp Thr Gly Trp Val Ala Gly Leu Phe Tyr 565 570 575
Tyr His Lys Phe Asn Ser Ser Gly Cys Pro Glu Arg Met Ala Ser Cys 580 585 590
Arg Pro Leu Ala Asp Phe Asp Gln Gly Trp Glu Phe Glu Leu Gly Thr 595 600 605
Arg Gly Ser Ser Arg Leu Gln Ala Cys 610 615